[Tips & Tricks การใช้เครื่องมือในงานวิจัย ] ARIMA Selection with R interface การวิเคราะห์ข้อมูลอนุกรมเวลา วัตถุประสงค์หนึ่งคือเพื่อต้องการทำนายหรือพยากรณ์ค่าของอนุกรมเวลาในอนาคต โดยวิธีของ Box และ Jenkins เป็นวิธีที่มีความแม่นยําของการพยากรณ์ค

สำรวจ
ลงทุน
คำถาม

มีบัญชีอยู่แล้ว?หรือ

Tips & Tricks การใช้เครื่องมือในงานวิจัย

•

15 ต.ค. 2023 เวลา 07:33 • การศึกษา

ARIMA Selection with R interface

การวิเคราะห์ข้อมูลอนุกรมเวลา วัตถุประสงค์หนึ่งคือเพื่อต้องการทำนายหรือพยากรณ์ค่าของอนุกรมเวลาในอนาคต โดยวิธีของ Box และ Jenkins เป็นวิธีที่มีความแม่นยําของการพยากรณ์ค่อนข้างสูง สามารถใช้กับข้อมูลที่มีการเคลื่อนไหวทุกประเภท อย่างไรก็ดีการกำหนดโมเดลให้สอดคล้องกับข้อมูลเพื่อการพยากรณ์นับว่ามีความยุ่งยากอยู่พอสมควร การเลือกโมเดลที่ให้ค่าสถิติที่เหมาะสมจึงเป็นวิธีการที่ใช้กันค่อนข้างแพร่หลายและเป็นทางเลือกในการตัดสินใจใช้โมเดลในการพยากรณ์ของใครหลายๆคน

การเตรียมข้อมูลสำหรับนำเข้าโปรแกรม R เป็นข้อมูลราคามันสำปะหลังโรงงานเฉลี่ยรายเดือน (ตัวเลขสมมุติ) ตั้งแต่เดือนมกราคม 2007 ถึงเดือน ธันวาคม 2020 จำนวน 168 เดือน ชื่อไฟล์ cassava.csv เก็บไว้ใน Folder C:\Users\**name**\Documents

cassava.csv

นำเข้าข้อมูลจากไฟล์ cassava.csv เข้าสู่โปรแกรม R

cassava <- read.csv("cassava.csv", header = TRUE)

ใช้คำสั่ง head(cassava,12) เพื่อแสดงข้อมูลตัวแปรที่นำเข้าสู่โปรแกรม R จำนวน 12 ค่าสังเกต

ราคามันสำปะหลังโรงงาน

Load Packages เพื่อใช้งาน

library(timeSeries)

library(timeDate)

library(uroot)

library(forecast)

library(fUnitRoots)

library(lmtest)

library(FinTS)

library(xts)

สร้าง Data Frame ให้เป็นแบบ Time Series

cassava.ts <- ts(data = cassava, start = c(2007), freq = 12)

cassava.ts <- data.frame(cassava.ts)

attach(cassava.ts)

price <- ts(price, frequency = 12, start = c(2007, 1),end = c(2020,12))

Plot Graph เพื่อดูรูปแบบการเคลื่อนไหวของข้อมูล

dev.new(width = 6,height = 5)

tsdisplay(price,lag = 36,col="darkgreen")

Graph แสดงรูปแบบการเคลื่อนไหวของข้อมูล

ทดสอบความนิ่งของข้อมูล (Stationary) ตามแบบ Augmented Dickey-Fuller เป็นการทดสอบว่าข้อมูลนั้นมีค่าเฉลี่ย ค่าความแปรปรวน และค่าความแปรปรวนร่วมของตัวมันเองคงที่หรือไม่ โดยกำหนด Lag order = 1 เลือก test without constant, with constant, with constant and trend

fUnitRoots::adfTest(price, lags = 1, type = "nc")

fUnitRoots::adfTest(price, lags = 1, type = "c")

fUnitRoots::adfTest(price, lags = 1, type = "ct")

ทดสอบความนิ่งของอนุกรมเวลาแบบฤดูกาล (รายเดือน) ตามแบบ HEGY โดยกำหนดให้มีค่าคงที่ และใช้ค่าสถิติ Akaike info criterion

summary(test <- hegy.test(price, deterministic = c(1,0,0),lag.method = "AIC", maxlag = 0))

กำหนดรูปแบบ (Model) โดย Log differences ตัวแปร price จะได้ตัวแปร dlprice ที่ stationary

dlprice <- diff(log(price))

dev.new(width = 6,height = 5)

tsdisplay(dlprice,lag =36,col="darkgreen",main='dlprice (Stationary)')

Graph แสดงข้อมูล Stationary

ซึ่งจะเห็นได้ว่า Autocorrelation (rk) ลดลงที่ k = 12 Partial Autocorrelation (rkk) ลดลงที่ k=2 และเข้าใกล้ 0 ที่ k=4

การพิจารณารูปแบบ Seasonal (ยังคงต้อง diff แม้ว่าผลการทดสอบ Seasonal จะ Stationary เนื่องจาก Seasonal เป็นคุณสมบัติของ Nonstationary) โดยกำหนดให้ diff = 1 และ seasonal lag = 1(12) ใช้คำสั่งดังนี้

fit <- auto.arima(log(price),d = 1,D = 1,max.P = 1,max.Q = 1,seasonal = TRUE, trace=TRUE)

Best model: ARIMA(0,1,2)(1,1,1)[12]

จะได้ Best model: ARIMA(0,1,2)(1,1,1)[12]

และเลือก model อันดับรองลงไป ARIMA(0,1,1)(1,1,1)[12] เพื่อเปรียบเทียบ

สร้าง model ARIMA (0 1 2)(1 1 1)[12]

summary(fitarima <- arima(log(price),order = c(0, 1, 2),seasonal = list(order = c(1, 1, 1), period = 12), method = "ML"))

lmtest::coeftest(fitarima)

model ARIMA (0 1 2)(1 1 1)[12]

สร้าง model ARIMA (0 1 1)(1 1 1)[12]

summary(fitARIMA <- arima(log(price),order = c(0, 1, 1),seasonal = list(order = c(1, 1, 1), period = 12), method = "ML"))

lmtest::coeftest(fitARIMA)

model ARIMA (0 1 1)(1 1 1)[12]

ทดสอบ Autocorrelation model (0 1 2)(1 1 1)[12]

checkresiduals(fitarima,lag=12,plot = FALSE)

ทดสอบ Autocorrelation model (0 1 1)(1 1 1)[12]

checkresiduals(fitARIMA,lag=12,plot = FALSE)

ทดสอบ Heteroskedasticity model (0 1 2)(1 1 1)[12]

FinTS::ArchTest(fitarima$residuals,lags=12)

ทดสอบ Heteroskedasticity model (0 1 1)(1 1 1)[12]

FinTS::ArchTest(fitARIMA$residuals,lags=12)

เปรียบเทียบความแม่นยำในการพยากรณ์ 2 model

round(accuracy(fitarima),4)

round(accuracy(fitARIMA),4)

พยากรณ์ Model fitarima (0 1 2)(1 1 1)[12] ออกไปอีก 12 เดือน

price_f <- predict(fitarima,n.ahead = 12)

plot(forecast(fitarima))

ผลการพยากรณ์ Model (0 1 2)(1 1 1)[12]

พยากรณ์ Model fitARIMA (0 1 1)(1 1 1)[12] ออกไปอีก 12 เดือน

price_ff <- predict(fitARIMA,n.ahead = 12)

plot(forecast(fitARIMA))

ผลการพยากรณ์ Model (0 1 1)(1 1 1)[12]

ผลพยากรณ์เป็นค่าที่ได้จากการ Take log ดังนั้นจึงต้อง Inverse กลับด้วย Exponential

price_f1 <- exp(price_f$pred)

price_f2 <- exp(price_ff$pred)

เปรียบเทียบผลพยากรณ์ 2 model ดังนี้

y2020 <- price[c(157:168)]

y2020 <- ts(y2020, frequency = 12, start = c(2020, 1))

Fcast <- ts(cbind(y2020,price_f1,price_f2), start = c(2020,1), freq = 12)

Fcast.ts <- data.frame(Fcast)

Fcast.ts <- ts(data = Fcast.ts, start = c(2020), freq = 12)

Fcast

ผลพยากรณ์ 2 model

Plot Graph เปรียบเทียบ 2 Model

library(xts)

dev.new(width = 7,height = 6)

plot(as.xts(Fcast.ts),type = "b",col=c("black","red", "green"),lwd=c(1,1,1), pch = 8, ylim = c(0.5,4), main='2 Model Arima Forecasting')

addLegend("bottomleft", on=NA,legend.names = c("actual 2020","predict_arima(0,1,2)(1,1,1) 2021", "predict_arima(0,1,1)(1,1,1) 2021"), lty=c(1,1,1), pch = 8, lwd=c(1,1,1),col=c("black","red", "green"))

Graph เปรียบเทียบผลการพยากรณ์ 2 Model

ต้องการข้อมูลหรือสคริปต์ไฟล์

cchantajirawat@gmail.com

share link

https://drive.google.com/drive/folders/1u5OqKIQlLOm4P3ccnfJh7jZjHvr3e68x?usp=sharing

โฆษณา

ดาวน์โหลดแอปพลิเคชัน

ARIMA Selection with R interface

ดาวน์โหลดแอปพลิเคชัน