I. Lasso
II. Forward stagewise regression
Forward stagewise regression achieves results comparable to the lasso, but is much simpler. It is a greedy algorithm that, at each step, makes the single change that reduces the error the most.
(Forward stagewise regression procedure: initialize all weights to zero; then, for a fixed number of iterations, try increasing and decreasing each weight by a small step eps, keep the one change that yields the lowest squared error, and record the weight vector at every iteration.)
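The greedy procedure described above can be sketched in Python/NumPy on synthetic data (a minimal illustration; the function name `stage_wise` and the toy data are ours, not part of the Matlab code below):

```python
import numpy as np

def stage_wise(x, y, eps=0.01, n_iter=200):
    """Greedy forward stagewise regression: at each iteration, nudge the
    single weight (by +/-eps) that most reduces the residual sum of squares."""
    m, n = x.shape
    w = np.zeros(n)
    history = np.zeros((n_iter, n))
    for it in range(n_iter):
        lowest_error = np.inf
        w_best = w
        for j in range(n):            # try every coordinate ...
            for sign in (-1, 1):      # ... in both directions
                w_test = w.copy()
                w_test[j] += eps * sign
                rss = np.sum((y - x @ w_test) ** 2)
                if rss < lowest_error:
                    lowest_error = rss
                    w_best = w_test
        w = w_best
        history[it] = w               # record the weights at every step
    return history

# Toy data: the target depends only on the first feature
rng = np.random.default_rng(0)
x = rng.standard_normal((100, 3))
y = 2.0 * x[:, 0]
hist = stage_wise(x, y, eps=0.05, n_iter=100)
print(hist[-1])  # the final weights approach [2, 0, 0]
```

Plotting each column of `history` against the iteration index gives exactly the kind of convergence curves produced by the Matlab main program below.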
III. Experiments
1. Matlab implementation
Main program
clear all;
clc;
%% Load the data
data = load('abalone.txt');
x = data(:,1:8);
y = data(:,9);
%% Preprocess the data
yMean = mean(y);
yDeal = y - yMean;
xMean = mean(x);
xVar = var(x,1);  % population variance; each feature is scaled by its variance
[m,n] = size(x);
xDeal = zeros(m,n);
for i = 1:m
    for j = 1:n
        xDeal(i,j) = (x(i,j)-xMean(j))/xVar(j);
    end
end

%% Train
runtime = 5000;  % number of iterations
eps = 0.001;     % step size
wResult = stageWise(xDeal, yDeal, eps, runtime);

%% Plot the convergence curves from wResult
hold on
xAxis = 1:runtime;
for i = 1:n
    plot(xAxis, wResult(:,i));
end
Forward stagewise regression function
function [ wResult ] = stageWise( x, y, eps, runtime )
[m,n] = size(x);              % size of the data set
wResult = zeros(runtime, n);  % weights recorded at every iteration
w = zeros(n,1);
wMax = zeros(n,1);
for i = 1:runtime
    ws = w'  % display the current weights (no semicolon, so Matlab prints them)
    lowestError = inf;        % smallest error seen in this iteration
    for j = 1:n
        for sign = -1:2:1
            wTest = w;                       % start from the current weights
            wTest(j) = wTest(j) + eps*sign;  % change only one coordinate
            yTest = x*wTest;
            % compute the error
            rssE = rssError(y, yTest);
            if rssE < lowestError            % keep the best change so far
                lowestError = rssE;
                wMax = wTest;
            end
        end
    end
    w = wMax;
    wResult(i,:) = w;
end
end
Error function
%% rssError computes the residual sum of squares (not the mean squared error)
function [ error ] = rssError( y, yTest )
yDis = y - yTest;   % residuals
[m,n] = size(yDis);
% square each residual
for i = 1:m
    yDis(i) = yDis(i)^2;
end
error = sum(yDis);  % sum over the column
end
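The element-wise squaring loop can be collapsed into one vectorized line (in Matlab, `sum((y - yTest).^2)` does the same). A NumPy version of the error function, with a hypothetical name of our choosing:

```python
import numpy as np

def rss_error(y, y_test):
    # residual sum of squares: sum of squared differences
    return np.sum((np.asarray(y) - np.asarray(y_test)) ** 2)

print(rss_error([1.0, 2.0, 3.0], [1.0, 2.5, 2.0]))  # 0.5**2 + 1.0**2 = 1.25
```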
2. Convergence curves