Backpropagation in a deep neural network is, at its core, just taking derivatives of the loss function. This post summarizes several ways to compute derivatives in Python.
1. Differentiation with Scipy
Since scipy is a high-level wrapper built on top of numpy, it interoperates with the rest of the numpy ecosystem.
For example, when writing a custom loss for xgboost, scipy.misc.derivative can be used to compute the first- and second-order derivatives the booster needs.
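As a rough sketch of that xgboost use case (squared_loss and custom_obj are illustrative names rather than xgboost APIs, and note that scipy.misc.derivative is deprecated in recent SciPy releases), the first and second derivatives a custom objective must return could be obtained numerically like this:

import numpy as np
from scipy.misc import derivative

def squared_loss(preds, labels):
    return (preds - labels) ** 2           # per-sample loss

def custom_obj(preds, labels):
    # first- and second-order derivatives w.r.t. preds, element-wise
    grad = derivative(squared_loss, preds, dx=1e-6, n=1, args=(labels,))
    hess = derivative(squared_loss, preds, dx=1e-3, n=2, args=(labels,))  # larger step for the 2nd difference
    return grad, hess

preds = np.array([0.3, 1.2, 2.5])
labels = np.array([0.0, 1.0, 3.0])
custom_obj(preds, labels)   # grad ≈ 2 * (preds - labels), hess ≈ 2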
1.1 Derivative example
# numerical differentiation with scipy
import numpy as np
from scipy.misc import derivative

def logistic_model(x, torch_flag=False):
    # sigmoid; the torch branch is reused in the next section
    if torch_flag:
        return 1 / (1 + t.exp(-x))
    return 1 / (1 + np.exp(-x))

def logit_derivative(y):
    # analytical derivative of the sigmoid, written in terms of its output y
    return y * (1 - y)

x = np.arange(10)
y = logistic_model(x)
# derivative(func, x0, n=order of the derivative, dx=finite-difference step)
derivative(logistic_model, x, n=1, dx=1e-6)
logit_derivative(y)
"""
>>> derivative(logistic_model, x, n=1, dx=1e-6)
array([2.50000000e-01, 1.96611933e-01, 1.04993585e-01, 4.51766597e-02,
1.76627062e-02, 6.64805666e-03, 2.46650922e-03, 9.10221232e-04,
3.35237726e-04, 1.23379307e-04])
>>> logit_derivative(y)
array([2.50000000e-01, 1.96611933e-01, 1.04993585e-01, 4.51766597e-02,
1.76627062e-02, 6.64805667e-03, 2.46650929e-03, 9.10221180e-04,
3.35237671e-04, 1.23379350e-04])
>>>
"""
2. Differentiation with the Torch computational graph
import torch as t

x = t.randn(10, requires_grad=True)
y = logistic_model(x, torch_flag=True)
# y is a vector, so backward() needs an upstream gradient; ones(10) yields dy_i/dx_i
y.backward(t.ones(10))
x.grad, logit_derivative(y)
"""
>>> x.grad, logit_derivative(y)
(tensor([0.2497, 0.1806, 0.2476, 0.2488, 0.2456, 0.2499, 0.1452, 0.2369, 0.2305,
0.2320]),
tensor([0.2497, 0.1806, 0.2476, 0.2488, 0.2456, 0.2499, 0.1452, 0.2369, 0.2305,
0.2320], grad_fn=<MulBackward0>))
>>>
"""
3. Parameter estimation with Scipy
In many situations we need to estimate the optimal parameters from data (e.g. by gradient descent or Newton's method) in order to obtain the best-fitting model.
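Before handing things over to scipy, here is a minimal hand-rolled gradient-descent sketch for the same linear model (the learning rate, iteration count, and seed are illustrative choices, not from the original example), just to show what such an estimator does internally:

import numpy as np

np.random.seed(0)                      # reproducibility for this sketch only
x = np.arange(10)
y = x * 20 + 9 + np.random.randn(10)   # synthetic data, same form as below

w, b, lr_rate = 0.0, 0.0, 0.01         # initial parameters and step size (illustrative)
for _ in range(20000):
    y_hat = x * w + b
    # gradients of the MSE loss mean((y_hat - y)^2) w.r.t. w and b
    grad_w = 2 * np.mean((y_hat - y) * x)
    grad_b = 2 * np.mean(y_hat - y)
    w -= lr_rate * grad_w
    b -= lr_rate * grad_b

w, b   # should land close to the true (20, 9)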
from scipy import optimize
from sklearn.linear_model import LinearRegression

def mse_loss(y, y_hat):
    return np.mean((y - y_hat) ** 2)

def linear_model_fn(params, *args):
    w, b = params
    y, x = args
    y_hat = x * w + b
    return mse_loss(y, y_hat)

x = np.arange(10)
y = x * 20 + 9 + np.random.randn(10)
# opt_result: (estimated parameters, minimum loss, info dict)
opt_result = optimize.fmin_l_bfgs_b(linear_model_fn, x0=np.array([0.001, 0.001]),
                                    args=(y, x), approx_grad=True)
lr = LinearRegression(fit_intercept=True)
lr.fit(x.reshape(-1, 1), y)
lr.coef_, lr.intercept_, opt_result[0]
"""显然估算结果与sklearn是一致的
>>> lr.coef_, lr.intercept_, opt_result[0]
(array([20.00363708]), 8.784787632129081, array([20.00363714, 8.78478726]))
"""