时间序列预测
转化成时间格式后画预测图
残差注意相减长度
- 简单指数平滑预测模型
#模型
cpiforcast<-HoltWinters(table[,5],beta=FALSE,gamma=FALSE)
#拟合
cpiforcast$fitted
#预测
forecast(cpiforcast,h=month)
#残差
cpiforcast-fitted
- Holt指数平滑
#模型
price2<-HoltWinters(table[,2],gamma=FALSE)
#预测
price2$fitted
forecast(price2,h=month)
#残差
price2-fitted
- winter指数平滑
#模型
saleforecast<-HoltWinters(df.ts)
#拟合
df.ts$fitted
#预测
forecast(df.ts,h=month)
#残差
residualss2<-df.ts-saleforecast$fitted[,1]
- 一元线性回归
#模型
fit<-lm(收盘价格~时间,data=table)
#预测
predata<-predict(fit,data.frame(时间=1:35))
#残差
res<-fit$residuals
- 指数曲线
#模型
> y<-log(table[,2])
> x<-1:35
> fit1<-lm(y~x)
> fit1
#预测
predata<-exp(predict(fit1,data.frame(x=1:35)))
#残差
residuals<-table[,2]-predata
- 二阶多项式
#模型
> y<-table[,2]
> x<-1:35
> fit2<-lm(y~x+I(x^2))
> fit2
#预测
predata1<-predict(fit2,data.frame(x=1:36))
#残差
residual1<-fit2$residuals
- 分解预测
co2compose<-decompose(co2,type='multiplicative')
#调整后
seasonaladjust<-co2/co2compose$seasonal
#模型
> x<-1:468
> fitt<-lm(seasonaladjust~x)
#预测
predata_c<-predict(fitt,data.frame(x=1:492))*rep(co2compose$seasonal[1:12],41)
#残差
residuals_c<-co2-predict(fitt,data.frame(x=1:468))*co2compose$seasonal
文章目录
- 时间序列预测
- 11.1 时间序列的成分和预测方法
- 11.1.1 时间序列的成分
- 11.1.2 预测方法的选择与评估
- 预测方法选择
- 预测的好坏
- 11.2 指数平滑预测
- 11.2.1 四种模型
- 11.2.2 简单指数平滑预测
- 11.2.3 Holt指数平滑预测
- 11.2.3 Winters指数平滑预测
- 11.3趋势外推预测
- 11.3.1 线性趋势预测
- 11.3.2 非线性趋势预测
- 1. 指数曲线
- 2.多阶曲线
- 11.4 分解预测
- 分解预测步骤
- 11.5 时间序列平滑
11.1 时间序列的成分和预测方法
11.1.1 时间序列的成分
时间序列的变化受到一种或几种因素影响
,导致它在不同时间上的取值差异,这些因素就是时间序列的成分
- 趋势
时间序列在一段较长时期
内呈现出来的持续向上或持续向下
的变动 - 季节变动
时间序列呈现出的以年为周期、长度固定
的变动模式,以年为单位重复出现
- 循环波动
周期波动,时间序列呈现出的非固定长度的周期性变动
- 不规则波动
随机波动,除去趋势、季节变动、循环波动外的剩余波动
> table<-read.csv("/Users/zhourui/Documents/example11_1.csv")
> example11_1<-ts(table,start=2000)
#变为时间序列
> par(mfrow=c(2,2),mai=c(0.6,0.6,0.12,0.1),cex=0.7)
> plot(example11_1[,2],type="o",xlab="(a)liangshichanliangxulie",ylab="liangshichanliang")
> plot(example11_1[,3],type="o",xlab="(b)xiaofeishuipingxulie",ylab="xiaofeishuiping")
> plot(example11_1[,4],type="o",xlab="(c)yuanmeichanliangxulie",ylab="yuanmeichanliang")
> plot(example11_1[,5],type="o",xlab="(d)CPIxulie",ylab="CPI")
粮食产量呈现一定线性趋势
居民消费水平呈现一定的指数变化趋势
原煤产量呈现出一定的多阶曲线变化
CPI没有明显变化模式,随机波动
11.1.2 预测方法的选择与评估
预测方法选择
首先取决于历史数据的变化模式
其次取决于所能获得的历史数据的多少
预测的好坏
取决于预测误差的大小
,预测误差是预测值与实际值的差值,度量方法有平均误差、平均绝对误差、均方误差、平均百分比误差、平均绝对百分比误差
11.2 指数平滑预测
利用时间序列的平滑值
进行预测,根据时间序列所包含的成分不容有不同的模型。
11.2.1 四种模型
- Winters加法模型
预测含有:趋势成分、季节成分、随机成分
的序列
参数意义:
- Holt模型
预测含有:趋势成分
的序列
- 简单指数平滑预测模型
序列波动由随机因素
导致
得到t+1期
预测值
- Holt-Winters乘法模型
11.2.2 简单指数平滑预测
缺点:预测值滞后于实际值、无法考虑趋势和季节成分,仅考虑随机因素
- 确定模型参数阿尔法和系数a
> table<-read.csv("/Users/zhourui/Documents/example11_1.csv")
> table<-ts(table,start=2000)
> cpiforcast<-HoltWinters(table[,5],beta=FALSE,gamma=FALSE)
> cpiforcast
Holt-Winters exponential smoothing without trend and without seasonal component.
Call:
HoltWinters(x = table[, 5], beta = FALSE, gamma = FALSE)
Smoothing parameters:
alpha: 0.271798
beta : FALSE
gamma: FALSE
Coefficients:
[,1]
a 102.0851
beta:平滑参数(beta = FALSE,简单指数平滑)
gamma:季节调整(gamma = FALSE,不考虑季节调整)
- 历史数据拟合
> cpiforcast$fitted
Time Series:
Start = 2001
End = 2018
Frequency = 1
xhat level
2001 100.4000 100.4000
2002 100.4815 100.4815
2003 100.1332 100.1332
2004 100.4232 100.4232
2005 101.3682 101.3682
2006 101.4855 101.4855
2007 101.4895 101.4895
2008 102.3893 102.3893
2009 103.3435 103.3435
2010 102.2445 102.2445
2011 102.5314 102.5314
2012 103.3110 103.3110
2013 103.1178 103.1178
2014 102.9771 102.9771
2015 102.7115 102.7115
2016 102.3550 102.3550
2017 102.2585 102.2585
2018 102.0795 102.0795
xhat为计算值,是各年份的拟合值
- 绘制观测值和拟合值图
> par(mai=c(.6,0.6,0.1,0.1),cex=0.7,lab=c(19,5,1))
> plot(table[,5],type='o',xlab='time',ylab='CPI')
> lines(table[,1][-1],cpiforcast$fitted[,1],type='o',lty=2,col='blue')
> legend(x='topleft',legend = c('guance','nihe'),lty=1:2,col=c(1,4),fill=c(1,4),cex=0.8)
> table[,1][-1]
[1] 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017
[18] 2018
- 获得样本外的预测值
> library(forecast)
Registered S3 method overwritten by 'quantmod':
method from
as.zoo.data.frame zoo
> cpiforcast1<-forecast(cpiforcast,h=1)
> cpiforcast1
Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
2019 102.0851 99.56005 104.6102 98.22336 105.9468
h=1,预测下一年的值
- 实际值和预测值(2019年)图
> plot(cpiforcast1,type='o',xlab='time',ylab='CPI',main = '')
预测的80%、95%置信区间在图中阴影标出
11.2.3 Holt指数平滑预测
改进简单指数平滑模型,将趋势成分
考虑进来,用平滑值对序列的线性趋势进行修正
适用于含有趋势成分,不含季节成分
- 确定模型参数阿尔法和贝塔和系数a和b
> powerforecast<-HoltWinters(table[,2],gamma=FALSE)
> powerforecast
Holt-Winters exponential smoothing with trend and without seasonal component.
Call:
HoltWinters(x = table[, 2], gamma = FALSE)
Smoothing parameters:
alpha: 1
beta : 0.3369203
gamma: FALSE
Coefficients:
[,1]
a 71117.730
b 3985.466
- 绘制观测值和拟合值图
> par(mai=c(.6,0.6,0.1,0.1),cex=0.7,lab=c(19,5,1))
> plot(table[,2],type='o',xlab='time',ylab='power')
> lines(table[,1][c(-1,-2)],powerforecast$fitted[,1],type='o',lty=2,col='blue')
> legend(x='topleft',legend = c('guance','nihe'),lty=1:2,col=c(1,4),fill=c(1,4),cex=0.8)
> table[,1][c(-1,-2)]
[1] 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018
- 2019年发电量的预测
> library(forecast)
> powerforecast1<-forecast(powerforecast,h=1)
> powerforecast1
Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
2019 75103.2 73222.15 76984.24 72226.38 77980.01
若要预测2020年,h=2
- 实际值和预测值图
> plot(powerforecast1,type='o',xlab='time',ylab='power',main = '')
11.2.3 Winters指数平滑预测
适用于含有趋势成分、季节成分
> table<-read.csv("/Users/zhourui/Documents/example11_4.csv")
> library(reshape2)
> library(forecast)
> df.ts<-ts(df[,3],start = 2016,frequency = 12)
>
> layout(matrix(c(1,1,2,3),nrow=2,ncol=2,byrow=TRUE))
> par(mai=c(0.7,0.7,0.3,0.1),las=1,mgp=c(3.5,1,0),lab=c(6,6,1),cex=0.7,cex.lab=1,font.main=1)
> plot(df.ts,type='o',col='red2',xlab='time',ylab='xiaoshouliang',main='a.guancezhi') #绘制a.观测值图
> abline(v=c(2016:2021),lty=6,col='grey70') #添加垂直线
> par(las=3) #设置坐标轴标签类型
> seasonplot(df.ts,xlab='month',ylab='xiaoshouliang',year.labels = TRUE,col=1:5,cex=0.6,main='b.nian zhedie') #绘制b.按年折叠图
> par(las=1)
> monthplot(df.ts,xlab='month',ylab='xiaoshouliang',lty.base=6,col.base='red',cex=0.8,main='c.tongyuefen') #绘制c.同月份图
- 确定模型参数阿尔法、贝塔和伽马和系数a、b和c
> saleforecast<-HoltWinters(df.ts)
> saleforecast
Holt-Winters exponential smoothing with trend and additive seasonal component.
Call:
HoltWinters(x = df.ts)
Smoothing parameters:
alpha: 0.4199832
beta : 0
gamma: 1
Coefficients:
[,1]
a 221.9538913
b 1.8831002
s1 -18.1002119
s2 -26.7294825
s3 0.9141249
s4 -5.9548353
s5 1.6493707
s6 26.6478895
s7 56.1023000
s8 54.1165315
s9 3.5690673
s10 -23.9075859
s11 -45.8990862
s12 -27.5538913
- 绘制观测值和拟合值图
> par(mai=c(.6,0.6,0.1,0.1),cex=0.7,lab=c(19,5,1))
> plot(df.ts,type='o',col='grey70',xlab='time',ylab='sale')
> lines(saleforecast$fitted[,1],type='o',lty=5,col='blue')
> legend(x='topleft',legend = c('guance','nihe'),lty=1:2,col=c(1,4),fill=c(1,4),cex=0.8)
- Winter模型的2021年预测
> saleforecast1<-forecast(saleforecast,h=12)
> saleforecast1
Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
Jan 2021 205.7368 196.2069 215.2666 191.1621 220.3114
Feb 2021 198.9906 188.6544 209.3268 183.1827 214.7985
Mar 2021 228.5173 217.4333 239.6014 211.5657 245.4689
Apr 2021 223.5315 211.7469 235.3160 205.5086 241.5544
May 2021 233.0188 220.5731 245.4644 213.9848 252.0528
Jun 2021 259.9004 246.8270 272.9738 239.9064 279.8944
Jul 2021 291.2379 277.5656 304.9102 270.3279 312.1479
Aug 2021 291.1352 276.8891 305.3813 269.3477 312.9228
Sep 2021 242.4709 227.6732 257.2685 219.8398 265.1019
Oct 2021 216.8773 201.5479 232.2067 193.4330 240.3216
Nov 2021 196.7689 180.9256 212.6122 172.5387 220.9991
Dec 2021 216.9972 200.6562 233.3382 192.0058 241.9886
11.3趋势外推预测
将观测值与时间的关系用模型表达出来
当序列存在明显线性趋势,可以使用线性趋势模型进行预测
当序列存在明显非线性趋势,可以使用非线性趋势模型进行预测
11.3.1 线性趋势预测
时间序列按一个固定的常数增长或下降
- 方法:
Holt指数平滑模型、一元线性回归
- 拟合一元线性回归模型
> table<-read.csv("/Users/zhourui/Documents/example11_1.csv")
> fit<-lm(table$发电量~table$年份,data=table)
> summary(fit)
Call:
lm(formula = table$发电量 ~ table$年份, data = table)
Residuals:
Min 1Q Median 3Q Max
-2448.8 -1148.2 -210.6 1227.6 3522.2
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -6.559e+06 1.344e+05 -48.82 <2e-16 ***
table$年份 3.285e+03 6.688e+01 49.11 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1597 on 17 degrees of freedom
Multiple R-squared: 0.993, Adjusted R-squared: 0.9926
F-statistic: 2412 on 1 and 17 DF, p-value: < 2.2e-16
- 各年预测值
> predata<-predict(fit,data.frame(年份=2000:2019))
Warning message:
'newdata'必需有20行 但变量里有19行
> predata
1 2 3 4 5 6 7 8 9 10
10033.85 13318.46 16603.07 19887.68 23172.29 26456.90 29741.50 33026.11 36310.72 39595.33
11 12 13 14 15 16 17 18 19
42879.94 46164.55 49449.16 52733.77 56018.38 59302.99 62587.60 65872.21 69156.82
- 各年的预测残差
> res<-fit$residuals
> res
1 2 3 4 5 6 7
3522.15358 1489.56382 -63.06593 -781.92568 -1139.19544 -1454.29519 -1084.24495
8 9 10 11 12 13 14
-210.58470 -1641.90446 -2448.82421 -808.34396 965.63628 426.36653 1582.57677
15 16 17 18 19
1926.18702 -1157.26274 -1256.00249 172.25775 1960.90800
- 各年观测值和预测值比较图
> predata<-predict(fit,data.frame(年份=2000:2018))
> plot(2000:2018,predata,type='o',lty=2,col='blue',xlab='time',ylab='power')
> lines(table[,1],table[,2],type='o',pch=19)
> legend(x='topleft',legend = c('guance','nihe'),lty=1:2,col=c(1,4),fill=c(1,4),cex=0.8)
> abline(v=2016,lty=6,col=2)
- 一元线性回归预测与Holt指数平滑模型预测
11.3.2 非线性趋势预测
1. 指数曲线
- 指数曲线拟合
> y<-log(table[,3])
> x<-1:19
> fit<-lm(y~x)
> fit
Call:
lm(formula = y ~ x)
Coefficients:
(Intercept) x
8.0140 0.1141
取8.0140
> exp(8.0140)
[1] 3022.985
y=3022.985 exp(0.1141t)
- 历史数据以及2019年居民消费水平预测
> predata<-exp(predict(fit,data.frame(x=1:20)))
> predata
1 2 3 4 5 6 7 8 9
3388.284 3797.735 4256.666 4771.056 5347.607 5993.829 6718.144 7529.987 8439.936
10 11 12 13 14 15 16 17 18
9459.846 10603.005 11884.308 13320.448 14930.136 16734.343 18756.578 21023.185 23563.698
19 20
26411.214 29602.834
- 各年预测残差
> predata<-exp(predict(fit,data.frame(x=1:19)))
> perdata<-ts(predata,start =2000)
> residuals<-table[,3]-predata
> residuals
Time Series:
Start = 2000
End = 2018
Frequency = 1
1 2 3 4 5 6
309.7163886 156.2648160 -0.6662599 -229.0561085 -291.6065542 -322.8292924
7 8 9 10 11 12
-416.1437569 -95.9868141 43.0642866 -233.8458940 -53.0054467 761.6917605
13 14 15 16 17 18
754.5520345 684.8643600 536.6566202 172.4223569 -146.1854872 -493.6978367
19
-1033.2142317
- 实际值和预测值的比较图
predata<-exp(predict(fit,data.frame(x=1:19)))
plot(2000:2018,predata,type='o',lty=2,col='blue',xlab='time',ylab='xiaofeishuiping')
points(table[,1],table[,3],type='o',pch=19)
legend(x='topleft',legend = c('guance','nihe'),lty=1:2,col=c(1,4),fill=c(1,4),cex=0.8)
abline(v=2018,lty=6,col=2)
- 残差图
> plot(2000:2018,residuals,type='o',lty=2,xlab='time',ylab='residuals')
> abline(h=0,lty=2,col=2)
2.多阶曲线
- 拟合二阶曲线模型
> table<-read.csv("/Users/zhourui/Documents/example11_1.csv")
> y<-table[,4]
> x<-1:19
> fit1<-lm(y~x+I(x^2))
> fit1
Call:
lm(formula = y ~ x + I(x^2))
Coefficients:
(Intercept) x I(x^2)
6.9877 3.6193 -0.1078
- 二阶曲线预测值
> predata1<-predict(fit1,data.frame(x=1:20))
> predata1
1 2 3 4 5 6 7 8 9 10
10.49925 13.79529 16.87581 19.74081 22.39029 24.82425 27.04269 29.04562 30.83302 32.40490
11 12 13 14 15 16 17 18 19 20
33.76126 34.90211 35.82743 36.53723 37.03152 37.31028 37.37353 37.22125 36.85346 36.27014
- 二阶曲线残差值
> residual1<-fit1$residuals
> residual1
1 2 3 4 5 6 7
3.34075188 0.92471178 -1.37580864 -1.39080938 -1.16029043 -1.17425181 -1.34269350
8 9 10 11 12 13 14
-1.44561551 -1.80301784 -1.25490049 0.51873655 2.73789326 3.62256966 3.20276574
15 16 17 18 19
1.70848150 0.15971694 -3.26352794 -1.98125313 -0.02345865
- 拟合三阶曲线模型
> fit2<-lm(y~x+I(x^2)+I(x^3))
> fit2
Call:
lm(formula = y ~ x + I(x^2) + I(x^3))
Coefficients:
(Intercept) x I(x^2) I(x^3)
12.17141 0.85691 0.22885 -0.01122
- 三阶曲线预测值
> predata2<-predict(fit2,data.frame(x=1:20))
> predata2
1 2 3 4 5 6 7 8 9 10
13.24595 14.71085 16.49881 18.54249 20.77459 23.12776 25.53470 27.92809 30.24059 32.40490
11 12 13 14 15 16 17 18 19 20
34.35369 36.01964 37.33542 38.23372 38.64722 38.50860 37.75053 36.30569 34.10676 31.08642
- 三阶曲线残差值
> residual2<-fit2$residuals
> residual2
1 2 3 4 5 6 7
0.594053315 0.009145591 -0.998810797 -0.192494807 0.455414606 0.522238484 0.165297870
8 9 10 11 12 13 14
-0.328086191 -1.210592658 -1.254900487 -0.073688633 1.620363945 2.114578291 1.506275448
15 16 17 18 19
0.092776460 -1.038597630 -3.640525780 -1.065686945 2.723239918
- 实际值和预测曲线
> predata1<-predict(fit1,data.frame(x=1:17))
> predata2<-predict(fit2,data.frame(x=1:17))
> plot(2000:2016,predata2,type='o',tty=2,col='red',xlab='time',ylim=c(10,45),ylab='chanliang')
Warning messages:
1: In plot.window(...) : "tty"不是图形参数
2: In plot.xy(xy, type, ...) : "tty"不是图形参数
3: In axis(side = side, at = at, labels = labels, ...) : "tty"不是图形参数
4: In axis(side = side, at = at, labels = labels, ...) : "tty"不是图形参数
5: In box(...) : "tty"不是图形参数
6: In title(...) : "tty"不是图形参数
> lines(2000:2016,predata1,type='o',lty=3,col='blue')
There were 12 warnings (use warnings() to see them)
> points(table[,1],table[,4],type='o',pch=19)
There were 12 warnings (use warnings() to see them)
> legend(x='bottom',legend = c('guance','2','3'),lty=1:3,col=c('black','blue','red'),fill=c('black','blue','red'),cex=0.8)
There were 12 warnings (use warnings() to see them)
> abline(v=2015,lty=6,col=2)
There were 12 warnings (use warnings() to see them)
- 二阶曲线和三阶曲线预测残差的比较
> plot(fit1$residuals,ylab='yuce cancha',xlab='time',pch=0)
> points(fit2$residuals,pch=19,col='red')
> abline(h=0,lty=2,col=4)
> legend(x='bottomleft',legend = c('2','3'),pch=c(0,19),col=c('black','red'))
三阶比二阶残差小,故本题三阶更好
11.4 分解预测
时间序列同时包含趋势、季节、随机波动
多种成分
- 方法:
winter指数平滑预测、分解法预测
分解预测步骤
- 序列分解
> table<-read.csv("/Users/zhourui/Documents/example11_4.csv")
> df<-reshape2::melt(table,var='年份',value.name = '销售量')
Using 月份 as id variables
> df.ts<-ts(df[,3],start = 2016,frequency = 12)
> salecompose<-decompose(df.ts,type='multiplicative') #分解序列成分
> names(salecompose) #显示成分名称
[1] "x" "seasonal" "trend" "random" "figure" "type"
type=‘multiplicative’:指定分解使用乘法模型
type=‘additive’:指定分解使用乘法模型
- 输出季节成分
> salecompose$seasonal
Jan Feb Mar Apr May Jun Jul Aug
2016 0.9061343 0.8546989 0.9836370 0.9514711 0.9773915 1.1371688 1.2690723 1.2498771
2017 0.9061343 0.8546989 0.9836370 0.9514711 0.9773915 1.1371688 1.2690723 1.2498771
2018 0.9061343 0.8546989 0.9836370 0.9514711 0.9773915 1.1371688 1.2690723 1.2498771
2019 0.9061343 0.8546989 0.9836370 0.9514711 0.9773915 1.1371688 1.2690723 1.2498771
2020 0.9061343 0.8546989 0.9836370 0.9514711 0.9773915 1.1371688 1.2690723 1.2498771
Sep Oct Nov Dec
2016 1.0650963 0.9209406 0.7975754 0.8869367
2017 1.0650963 0.9209406 0.7975754 0.8869367
2018 1.0650963 0.9209406 0.7975754 0.8869367
2019 1.0650963 0.9209406 0.7975754 0.8869367
2020 1.0650963 0.9209406 0.7975754 0.8869367
- 绘制成分分解图
> plot(salecompose,type='o',col='red4')
由上到下:obversed、trend、seasonal、random
- 季节调整后序列图
> seasonaladjust<-df.ts/salecompose$seasonal
> plot(df.ts,xlab="时间",ylab="销售量")
> lines(seasonaladjust,lty=2,col='blue')
> legend(x='topleft',legend=c('销售量','销售量的季节调整'),lty=1:2,fill=c(1,4))
- 拟合季节调整后的序列的线性模型
> fit<-lm(seasonaladjust~x)
> fit
Call:
lm(formula = seasonaladjust ~ x)
Coefficients:
(Intercept) x
130.281 1.362
> summary(fit)
Call:
lm(formula = seasonaladjust ~ x)
Residuals:
Min 1Q Median 3Q Max
-13.2106 -3.5719 0.3778 3.7939 8.1324
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 130.28107 1.34523 96.85 <2e-16 ***
x 1.36157 0.03835 35.50 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 5.145 on 58 degrees of freedom
Multiple R-squared: 0.956, Adjusted R-squared: 0.9552
F-statistic: 1260 on 1 and 58 DF, p-value: < 2.2e-16
- 计算最终预测值(以季节 周期格式输出)
> predatas<-predict(fit,data.frame(x=1:72))*rep(salecompose$seasonal[1:12],6)
> predatas<-ts(predatas,start=2016,frequency = 12)
> predatas
Jan Feb Mar Apr May Jun Jul Aug Sep
2016 119.2859 113.6785 132.1671 129.1406 133.9895 157.4415 177.4316 176.4496 151.8137
2017 134.0910 127.6433 148.2386 144.6865 149.9589 176.0215 198.1667 196.8711 169.2161
2018 148.8962 141.6080 164.3100 160.2324 165.9283 194.6015 218.9018 217.2926 186.6184
2019 163.7013 155.5728 180.3815 175.7783 181.8977 213.1814 239.6369 237.7141 204.0208
2020 178.5065 169.5375 196.4529 191.3242 197.8671 231.7614 260.3720 258.1356 221.4232
2021 193.3116 183.5023 212.5243 206.8701 213.8365 250.3414 281.1071 278.5570 238.8256
Oct Nov Dec
2016 132.5203 115.8544 130.0425
2017 147.5674 128.8859 144.5340
2018 162.6144 141.9173 159.0255
2019 177.6615 154.9487 173.5169
2020 192.7086 167.9801 188.0084
2021 207.7556 181.0115 202.4999
- 计算预测的残差
> residualss<-df.ts-predict(fit,data.frame(x=1:60))*salecompose$seasonal
> residualss<-ts(residualss,start = 2016,frequency = 12)
> round(residualss,4)
Jan Feb Mar Apr May Jun Jul Aug Sep
2016 -3.0859 -1.8785 -3.9671 -0.0406 -4.3895 -6.2415 -2.7316 -9.8496 -2.0137
2017 2.2090 5.3567 3.9614 5.5135 2.6411 3.4785 0.0333 -2.4711 1.1839
2018 2.3038 2.8920 6.5900 6.7676 4.4717 7.9985 4.2982 6.9074 7.2816
2019 -0.5013 -2.9728 -6.5815 -8.7783 -7.6977 -4.3814 -3.9369 4.6859 -10.1208
2020 -5.7065 -5.3375 -1.5529 -1.2242 3.7329 -5.1614 2.6280 10.1644 0.7768
Oct Nov Dec
2016 -1.0203 -2.0544 3.3575
2017 -0.6674 1.2141 2.3660
2018 3.9856 4.4827 2.2745
2019 -5.3615 -6.1487 -11.7169
2020 2.6914 5.8199 6.3916
- 观测值和预测值图
> plot(predatas,xlab="时间",ylab="销售量",col='blue')
> lines(df.ts)
> legend(x='topleft',legend=c('实际销售量','预测销售量'),lty=1:2,col=c('black','blue'),fill=c('black','blue'))
> abline(v=2021,lty=6,col=2)
- 分解预测和winter模型预测残差图
> plot(as.numeric(residualss),pch=1,xlab="时间",ylab="残差")
> abline(h=0,lty=2,col=2)
> points(as.numeric(residualss2),pch=16,col=blues9)
> points(as.numeric(residualss2),pch=16,col='blue')
> legend(x='topleft',legend=c('分解预测残差','winter模型预测残差量'),lty=1:2,pch=c(1,16),col=c(1,4))
11.5 时间序列平滑
利用短期移动平均
对时间序列进行平滑,可以去除时间序列中随机波动
,从而有利于观察其变化趋势或形态
- 使用MoveAvg函数做平均移动
> table<-read.csv("/Users/zhourui/Documents/example11_1.csv")
> table<-ts(table,start = 2000)
> library(DescTools)
载入程辑包:‘DescTools’
The following object is masked from ‘package:forecast’:
BoxCox
> ma3<-MoveAvg(table[,5],order=3,align='center',endrule='keep')
> ma5<-MoveAvg(table[,5],order=5,align='center',endrule='NA')
> data.frame(年份=table[,1],'CPI'=table[,5],ma3,ma5)
年份 CPI ma3 ma5
1 2000 100.4 100.4000 NA
2 2001 100.7 100.1000 NA
3 2002 99.2 100.3667 101.08
4 2003 101.2 101.4333 101.36
5 2004 103.9 102.3000 101.52
6 2005 101.8 102.4000 102.64
7 2006 101.5 102.7000 103.58
8 2007 104.8 104.0667 102.66
9 2008 105.9 103.3333 102.96
10 2009 99.3 102.8333 103.74
11 2010 103.3 102.6667 103.30
12 2011 105.4 103.7667 102.64
13 2012 102.6 103.5333 103.18
14 2013 102.6 102.4000 102.80
15 2014 102.0 102.0000 102.12
16 2015 101.4 101.8000 101.92
17 2016 102.0 101.6667 101.82
18 2017 101.6 101.9000 NA
19 2018 102.1 102.1000 NA
order=3:移动平均间隔长度
align=‘center’:排列方式,放在中心位置
align=‘left’:放在向量左侧
align=‘right’:放在向量右侧
endrule=‘keep’:没有移动平均值的位置用相应的实际值
填充
endrule=‘constant’:没有移动平均值的位置用相应的预测值
填充
endrule=‘NA’:没有移动平均值的位置用相应的NA
填充
- 绘制实际值和平滑值的比较图形
> plot(table[,5],type='o',xlab='year',ylab='CPI')
> lines(ma3,type='o',lty=2,col='red')
> lines(ma5,type='o',lty=3,col='blue')
> legend(x='topleft',legend = c('CPI','ma3','ma5'),lty=c(1,2,6),col=c('black','red','blue'),fill=c(1,2,4),cex=0.8)