Backpropagation in a deep neural network is essentially differentiating the loss function. This post summarizes several ways to compute derivatives in Python.

1. Differentiation with SciPy

Since SciPy is a set of high-level wrappers built on top of NumPy, it interoperates with the NumPy ecosystem.
For example, when writing a custom loss for xgboost, scipy.misc.derivative can compute the first- and second-order gradients the booster needs (see the sketch at the end of this section).

1.1 A differentiation example

# numerical differentiation with scipy.misc.derivative
# (note: scipy.misc.derivative was deprecated in SciPy 1.10 and removed in 1.12)
import numpy as np
from scipy.misc import derivative

def logistic_model(x, torch_flag=False):
    if torch_flag:
        return 1 / (1 + t.exp(-x))  # `t` is torch, imported in section 2
    return 1 / (1 + np.exp(-x))

def logit_derivative(y):
    # analytic derivative of the logistic function: y' = y * (1 - y)
    return y * (1 - y)

x = np.arange(10)
y = logistic_model(x)
# derivative(func, x0, n=order of the derivative, dx=step size)
derivative(logistic_model, x, n=1, dx=1e-6)
logit_derivative(y)

"""
>>> derivative(logistic_model, x, n=1, dx=1e-6)
array([2.50000000e-01, 1.96611933e-01, 1.04993585e-01, 4.51766597e-02,
       1.76627062e-02, 6.64805666e-03, 2.46650922e-03, 9.10221232e-04,
       3.35237726e-04, 1.23379307e-04])
>>> logit_derivative(y)
array([2.50000000e-01, 1.96611933e-01, 1.04993585e-01, 4.51766597e-02,
       1.76627062e-02, 6.64805667e-03, 2.46650929e-03, 9.10221180e-04,
       3.35237671e-04, 1.23379350e-04])
>>>
"""

2. Differentiation with the Torch computation graph

For details, see the book's GitHub repo: 《深度学习框架PyTorch:入门与实践》 (Deep Learning Framework PyTorch: Introduction and Practice).

import torch as t

x = t.randn(10, requires_grad=True)
y = logistic_model(x, torch_flag=True)
# backward on a non-scalar output needs an explicit gradient argument;
# ones(10) makes x.grad the element-wise derivative dy_i/dx_i
y.backward(t.ones(10))
x.grad, logit_derivative(y)

"""
x.grad, logit_derivative(y)
(tensor([0.2497, 0.1806, 0.2476, 0.2488, 0.2456, 0.2499, 0.1452, 0.2369, 0.2305,
        0.2320]), 
 tensor([0.2497, 0.1806, 0.2476, 0.2488, 0.2456, 0.2499, 0.1452, 0.2369, 0.2305,
        0.2320], grad_fn=<MulBackward0>))
>>>
"""

3. Parameter estimation with SciPy

Often we need to estimate parameters from data (via gradient descent, Newton's method, and the like) to find the parameters of the best-fitting model.

import numpy as np
from scipy import optimize
from sklearn.linear_model import LinearRegression

def mse_loss(y, y_hat):
    return np.mean((y - y_hat) ** 2)

def linear_model_fn(params, *args):
    w, b = params
    y, x = args
    y_hat = x * w + b
    return mse_loss(y, y_hat)


x = np.arange(10)
y = x * 20 + 9 + np.random.randn(10)
# opt_result: (estimated parameters, minimal loss, info dict)
opt_result = optimize.fmin_l_bfgs_b(linear_model_fn, x0=np.array([0.001, 0.001]), args=(y, x), approx_grad=True)

lr = LinearRegression(fit_intercept=True)
lr.fit(x.reshape(-1,1), y)
lr.coef_, lr.intercept_, opt_result[0]
"""显然估算结果与sklearn是一致的
>>> lr.coef_, lr.intercept_, opt_result[0]
(array([20.00363708]), 8.784787632129081, array([20.00363714,  8.78478726]))
"""