1: Neural Networks

Each neuron receives inputs, computes a weighted sum, transforms it with a nonlinear activation function, and passes the result on to the next layer or returns it as the final output.
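
As a minimal sketch of this computation (with assumed toy values), a single neuron in NumPy:

import numpy as np

x = np.array([0.5, -1.2, 3.0])   # inputs from the previous layer
w = np.array([0.4, 0.7, -0.2])   # one weight per input
b = 0.1                          # bias term

z = np.dot(x, w) + b             # weighted sum
a = 1 / (1 + np.exp(-z))         # sigmoid activation
print(a)                         # the neuron's output, between 0 and 1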

2: Building the Network Components

We define two key classes: Dense (the fully connected layer) and Activation (the activation function).

Dense Layer (the Dense class): handles the linear transformation, including weight initialization, forward propagation, and backward propagation with weight updates.

Activation Function (the Activation class): introduces nonlinearity; the Sigmoid function is used as the example here.

3: Backpropagation

Backpropagation computes the gradient of the loss function with respect to each weight; the weights are then updated to reduce the loss.
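
As a quick sanity check on this idea, a finite-difference estimate of the gradient should match the hand-derived one. A minimal sketch with an assumed one-weight model and squared-error loss:

import numpy as np

def loss(w, x, y):
    return (x * w - y) ** 2      # squared error of a one-weight "model"

w, x, y, eps = 0.5, 2.0, 3.0, 1e-6
analytic = 2 * (x * w - y) * x   # chain rule by hand
numeric = (loss(w + eps, x, y) - loss(w - eps, x, y)) / (2 * eps)
print(analytic, numeric)         # both are approximately -8.0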

4: Model Training and Evaluation

Model Class: ties the components above together and defines the model structure, including adding layers, forward-propagation prediction, and training (optimizing the loss function via gradient descent).

Loss Function: Mean Squared Error (MSE) is used as the loss function; it measures the gap between the model's predictions and the true values.
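
For concreteness, a tiny worked example of MSE with assumed values:

import numpy as np

y_true = np.array([[1.0], [0.0], [1.0]])
y_pred = np.array([[0.8], [0.3], [0.6]])

mse = np.mean(np.square(y_true - y_pred))  # (0.2**2 + 0.3**2 + 0.4**2) / 3
print(mse)                                 # ~0.0967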

5: Model Persistence
  • Save & Load: implements saving and loading of the model parameters, using NumPy's .npz format to store the weight and bias matrices. A minimal round-trip sketch follows below.
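
A minimal sketch of the .npz round trip (using a hypothetical file name demo.npz):

import numpy as np

w = np.random.randn(4, 8)
b = np.zeros((1, 8))
np.savez('demo.npz', w, b)       # arrays are stored as arr_0, arr_1

with np.load('demo.npz') as data:
    w_loaded, b_loaded = data['arr_0'], data['arr_1']
print(np.allclose(w, w_loaded))  # True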

Flowchart

[Figure: flowchart, "Building a Deep Learning Framework: Neural Networks"]

import numpy as np

class Dense:
    def __init__(self, input_units, output_units):
        # Weights start from a standard normal draw; biases start at zero.
        self.weights = np.random.randn(input_units, output_units)
        self.bias = np.zeros((1, output_units))
    
    def forward(self, inputs):
        # Cache the inputs; backward() needs them to compute the weight gradient.
        self.inputs = inputs
        return np.dot(inputs, self.weights) + self.bias
    
    def backward(self, grad_output, learning_rate):
        # Chain rule for the linear layer, followed by a gradient-descent step.
        grad_inputs = np.dot(grad_output, self.weights.T)
        grad_weights = np.dot(self.inputs.T, grad_output)
        grad_bias = np.sum(grad_output, axis=0, keepdims=True)
        
        self.weights -= learning_rate * grad_weights
        self.bias -= learning_rate * grad_bias
        
        return grad_inputs

class Activation:
    @staticmethod
    def sigmoid(x):
        return 1 / (1 + np.exp(-x))
    
    @staticmethod
    def sigmoid_derivative(output):
        # Derivative written in terms of the sigmoid's output: s'(x) = s(x) * (1 - s(x))
        return output * (1 - output)

class SigmoidActivation:
    # Sigmoid wrapped as a layer so it fits the Model's forward/backward loop.
    def forward(self, x):
        self.output = Activation.sigmoid(x)
        return self.output
    
    def backward(self, grad_output, learning_rate):
        # Chain rule: scale the incoming gradient by the local derivative.
        # learning_rate is unused: activation layers have no trainable parameters.
        return grad_output * Activation.sigmoid_derivative(self.output)

class Loss:
    @staticmethod
    def mse(y_true, y_pred):
        return np.mean(np.square(y_true - y_pred))
    
    @staticmethod
    def mse_derivative(y_true, y_pred):
        # Gradient of MSE up to a constant factor; the 2/n is absorbed into the learning rate.
        return y_pred - y_true

class Model:
    def __init__(self):
        self.layers = []
        self.loss_function = Loss.mse
        self.loss_derivative = Loss.mse_derivative
    
    def add(self, layer):
        self.layers.append(layer)
    
    def predict(self, inputs):
        for layer in self.layers:
            inputs = layer.forward(inputs)
        return inputs
    
    def fit(self, X_train, y_train, epochs=100, learning_rate=0.001):
        for epoch in range(epochs):
            predictions = self.predict(X_train)
            loss = self.loss_function(y_train, predictions)
            
            if epoch % 10 == 0:
                print(f"Epoch {epoch}, Loss: {loss}")
            
            # Backpropagate: start from the loss gradient and walk the layers in reverse.
            grad = self.loss_derivative(y_train, predictions)
            for layer in reversed(self.layers):
                grad = layer.backward(grad, learning_rate)
    
    def save(self, filename):
        # Collect only the layers that carry trainable parameters (the Dense layers).
        dense_layers = [layer for layer in self.layers if hasattr(layer, 'weights')]
        if not dense_layers:
            print("No layers with weights to save.")
            return
        # Interleave weights and biases (arr_0 = W0, arr_1 = b0, arr_2 = W1, ...)
        # so that load() can rebuild each Dense layer in order.
        arrays = []
        for layer in dense_layers:
            arrays.append(layer.weights)
            arrays.append(layer.bias)
        with open(filename, 'wb') as f:
            np.savez(f, *arrays)
        
    @staticmethod
    def load(filename):
        # Rebuild the model from the interleaved arrays written by save().
        # Note: activation layers are not persisted and must be re-added by the caller.
        with np.load(filename) as data:
            weights = [data[f'arr_{i}'] for i in range(0, len(data), 2)]
            biases = [data[f'arr_{i}'] for i in range(1, len(data), 2)]
            model = Model()
            for w, b in zip(weights, biases):
                model.add(Dense(w.shape[0], w.shape[1]))
                model.layers[-1].weights = w
                model.layers[-1].bias = b
            return model

# Example usage:
# Assume we have a binary classification problem with 4 features
X_train = np.random.rand(100, 4)
y_train = np.random.randint(0, 2, size=(100, 1))

model = Model()
model.add(Dense(4, 8))  # Input to hidden layer
model.add(SigmoidActivation())  # Activation function (instantiate the layer)
model.add(Dense(8, 1))  # Hidden to output layer
model.fit(X_train, y_train)

# Saving and loading the model
model.save('model.npz')
loaded_model = Model.load('model.npz')

# Predict with the loaded model (note: only the Dense layers were persisted;
# the activation layers are not restored -- see Model.load above)
predictions = loaded_model.predict(X_train[:1])
print(predictions)
self.weights = np.random.randn(input_units, output_units)

This line initializes the weights of the neural network layer (its neurons):

  • np.random.randn: a NumPy function that draws samples from the standard normal distribution (mean 0, standard deviation 1). It is well suited to weight initialization because it gives the weights random, spread-out starting values before training begins, which breaks the symmetry between neurons and helps keep the network from settling into poor solutions early on.
  • input_units: the number of input units (neurons). In a neural network this usually corresponds to the number of neurons in the previous layer (or the input layer).
  • output_units: the number of output units (neurons). This corresponds to the number of neurons in the current (target) layer.
import numpy as np

# Assumed numbers of input and output units
input_units = 3
output_units = 2

# Initialize the weights
weights = np.random.randn(input_units, output_units)

# Print the weight matrix
print(weights)

[Figure: sample output of the weight matrix]

self.bias = np.zeros((1, output_units))

In a neural network, the line self.bias = np.zeros((1, output_units)) initializes the current layer's bias terms as a 2D array of shape (1, output_units) in which every element is 0. Here output_units is the number of neurons in the layer, so each bias term corresponds to one output neuron.

Bias terms play an important role in a neural network: they give each neuron's activation function an offset. Even with no input signal (i.e., all inputs are zero), a neuron can still produce a nonzero output because of its bias, which increases the network's flexibility and expressive power. A short demonstration follows below.
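
A small sketch of this point, with assumed toy shapes: when every input is zero, the pre-activation output is exactly the bias.

import numpy as np

inputs = np.zeros((1, 3))
weights = np.random.randn(3, 2)
bias = np.array([[0.5, -1.0]])

z = np.dot(inputs, weights) + bias
print(z)  # [[ 0.5 -1. ]] -- the output comes entirely from the bias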

def forward(self, inputs):
    self.inputs = inputs
    return np.dot(inputs, self.weights) + self.bias

  • The forward method: performs the forward pass, i.e. the matrix product of the inputs and the weights, plus the bias.
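
A minimal sketch of the shapes involved, assuming a batch of 5 samples:

import numpy as np

batch = np.random.rand(5, 4)     # (batch_size, input_units)
weights = np.random.randn(4, 8)  # (input_units, output_units)
bias = np.zeros((1, 8))          # broadcast across the batch

out = np.dot(batch, weights) + bias
print(out.shape)                 # (5, 8)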