2.0c MMA 867 Day 2_timeseries _MSTS, Rolling Horizon Holdout and VARS

School

Queen's University

Course

867

Subject

Electrical Engineering

Date

Oct 30, 2023

Pages

15

4/30/2020

MMA 867 Predictive Modelling (aka Predictive Analytics)
Day 2: Time Series Models: MSTS, Rolling Horizon Holdout and VARS
Queen's Master in Management Analytics
Session 2-3
Prof. Anton Ovchinnikov

Models for Time Series [from the reading]

  • Moving average
  • Exponential smoothing
      New Forecast ("level") = α * Actual + (1 − α) * Old Forecast ("level")
  • Holt's model: smoothing with [additive] trend
      New Forecast = New Level + New Trend
      New Level = α * Actual + (1 − α) * Old Forecast
      New Trend = β * (New Level − Old Level) + (1 − β) * Old Trend
  • Winters' model: smoothing with [additive] trend and seasonality
  • Multiplicative smoothing methods
  • Decompositions: TBATS (trigonometric Fourier transforms)
  • Auto-regressive methods: "Classical": ARMA, ARIMA (ARCH, GARCH, etc. for variance)
  • Above with regressors/covariates/features: "dynamic regressions"
  • Multiple seasonalities
  • Above for multiple correlated time series: vectorized auto-regressions
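The Holt update rules above can be sketched by hand in base R. This is only an illustration: the function name `holt_smooth`, the initial level and trend, and the example values of `alpha` and `beta` are assumptions, not part of the course materials.

```r
# Holt's additive-trend smoothing, written out step by step (illustrative sketch).
# Assumed initialisation: level = first observation, trend = first difference.
holt_smooth <- function(y, alpha, beta) {
  level <- y[1]
  trend <- y[2] - y[1]
  fitted <- rep(NA_real_, length(y))
  for (t in 2:length(y)) {
    old_forecast <- level + trend                                # forecast for t made at t-1
    fitted[t] <- old_forecast
    new_level <- alpha * y[t] + (1 - alpha) * old_forecast       # New Level
    trend <- beta * (new_level - level) + (1 - beta) * trend     # New Trend
    level <- new_level
  }
  list(fitted = fitted, next_forecast = level + trend)           # New Forecast
}

# A perfectly linear series: the forecast should continue the line
result <- holt_smooth(y = 1:10, alpha = 0.3, beta = 0.1)
print(result$next_forecast)
```

Setting `beta = 0` and the initial trend to zero would reduce this to simple exponential smoothing of the level.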
Motivation: Forecasting Weekly Beverage Sales

[Figure: weekly beverage sales forecasting over ~4 yrs: TBATS w/ dummies/regressors, MAPE ~1%]

Multiple Seasonalities
  • yearly (frequency=52)
  • quarterly (frequency=13)
Multiple Seasonalities

Context: back to Wells Fargo. In the IoT world we have data on energy consumption and production in 15-min intervals.

File: "02 CSV data ‐‐ Solar Output and Power Consumption.csv" [15 mins, individually]

Explore the data: what do you see?
Multiple Seasonalities: msts

    # Define a multiple-seasonality time series: time of day (15 mins) and day of week
    WFdemand_msts <- msts(WFdata$Electricity.Demand.for.the.Branch..kw., seasonal.periods=c(96,672))

Dealing with msts:
  • TBATS: easy ("natural")
  • ETS: cannot handle
  • ARIMA: yes, but with some "twists" (lm + ARIMA on residuals)
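The seasonal periods 96 and 672 are simply counts of 15-minute observations per day and per week. The sketch below spells out that arithmetic and, if the forecast package is installed, builds an msts object from a synthetic series. The synthetic `demand` vector and all variable names are assumptions standing in for the course data.

```r
# Seasonal periods for 15-minute data, counted in observations
obs_per_hour <- 60 / 15
daily_period <- 24 * obs_per_hour       # observations per day
weekly_period <- 7 * daily_period       # observations per week
two_weeks_ahead <- 2 * weekly_period    # horizon used as h in the forecasts

# Synthetic stand-in for the demand column (an assumption, not the course data)
set.seed(1)
t_idx <- seq_len(4 * weekly_period)
demand <- 10 + 3 * sin(2 * pi * t_idx / daily_period) +
  2 * sin(2 * pi * t_idx / weekly_period) + rnorm(length(t_idx), sd = 0.5)

# msts() comes from the forecast package; this step is skipped if it is missing
if (requireNamespace("forecast", quietly = TRUE)) {
  demand_msts <- forecast::msts(demand, seasonal.periods = c(daily_period, weekly_period))
  print(frequency(demand_msts))
}
```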
TBATS msts: decomposition

TBATS msts: training and predicting

    WFdemand_tbats <- tbats(WFdemand_msts)
    plot(WFdemand_tbats) # plot decomposition
    WFdemand_tbats_pred <- forecast(WFdemand_tbats, h=1344, level=c(0.8, 0.95)) # predict 2 weeks out
    plot(WFdemand_tbats_pred, xlab="Time", ylab="Predicted Electricity Demand, kW")
ETS msts (cannot really handle it)

    # AAA cannot handle this: "Error in ets(WFdemand_msts, model = "AAA") : Frequency too high"
    WFdemand_AAN <- ets(WFdemand_msts, model="AAN")
    plot(WFdemand_AAN)
    WFdemand_AAN_pred <- forecast(WFdemand_AAN, h=1344, level=c(0.8, 0.95))
    plot(WFdemand_AAN_pred, xlab="Time", ylab="Predicted Electricity Demand, kW")

ARIMA msts: Approach I, "Plain Vanilla" (cannot really handle it either)

    WFdemand_arima <- auto.arima(WFdemand_msts, seasonal=TRUE)
    WFdemand_arima_pred <- forecast(WFdemand_arima, h=1344, level=c(0.8, 0.95))
    plot(WFdemand_arima_pred, xlab="Time", ylab="Predicted Electricity Demand, kW")
ARIMA msts: Approach II, with Regressors (better, but… still cannot REALLY handle it)

    weekdayMatrix <- cbind(Weekday=model.matrix(~as.factor(WFdata$DOW))) # Create dummies for each day-of-week
    weekdayMatrix <- weekdayMatrix[,-1] # Remove "intercept" (7th day) dummy
    colnames(weekdayMatrix) <- c("Mon","Tue","Wed","Thu","Fri","Sat") # Rename columns
    matrix_of_regressors <- weekdayMatrix

    WFdemand_arima <- auto.arima(WFdata$Electricity.Demand.for.the.Branch..kw., xreg=matrix_of_regressors) # Train a model
    WFdemand_arima # See what it is

    # Build a 2-weeks-out prediction matrix. Dropping rows 1345:5664 keeps rows 1:1344,
    # so this assumes the forecast period starts on the same day of week as row 1.
    xreg.pred <- matrix_of_regressors[-c(1345:5664),]
    WFdemand_arima_pred <- forecast(WFdemand_arima, h=1344, xreg=xreg.pred, level=c(0.8, 0.95))
    plot(WFdemand_arima_pred, xlab="Time", ylab="Predicted Electricity Demand, kW", ylim=c(0,20))
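To see why one column is dropped from the dummy matrix, here is a small base R sketch on made-up day-of-week labels. The `dow` vector and its level order are hypothetical; in the course data the baseline day depends on how `WFdata$DOW` is coded.

```r
# Hypothetical day-of-week labels: one full week plus one repeated Sunday
dow <- factor(c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"),
              levels = c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"))

design <- model.matrix(~ dow)         # intercept + 6 dummies; the first level is the baseline
dummies <- design[, -1]               # drop the intercept column
colnames(dummies) <- levels(dow)[-1]  # rename columns after the non-baseline days

print(dummies)                        # baseline-day rows (Sun here) are all zeros
```

Keeping all seven dummies alongside an intercept or a non-zero mean would make the regressors perfectly collinear, which is why six columns are enough.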
ARIMA msts: Approach III, lm + ARIMA on Residuals

    WFlm_msts <- tslm(WFdemand_msts ~ trend + season) # Build a linear model for trend and seasonality
    summary(WFlm_msts)

    residarima1 <- auto.arima(WFlm_msts$residuals) # Build ARIMA on its residuals
    residarima1

    residualsArimaForecast <- forecast(residarima1, h=1344) # forecast from ARIMA
    residualsF <- as.numeric(residualsArimaForecast$mean)

    regressionForecast <- forecast(WFlm_msts, h=1344) # forecast from lm
    regressionF <- as.numeric(regressionForecast$mean)

    forecastR <- regressionF + residualsF # Total prediction

Compared to Approach II, which one is better?
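The same two-stage idea can be tried without the forecast package. The sketch below uses base R `lm()` plus a fixed AR(1) fitted with `arima()` in place of `tslm()` and `auto.arima()`. The synthetic series, the period of 4, and the AR(1) order are all assumptions chosen to keep the example small.

```r
set.seed(1)
n <- 200
period <- 4                                   # toy seasonal period (assumption)
t_idx <- 1:n
season <- factor((t_idx - 1) %% period)
noise <- as.numeric(arima.sim(model = list(ar = 0.6), n = n, sd = 0.2))
y <- 0.05 * t_idx + c(1, -1, 2, -2)[as.integer(season)] + noise

# Stage 1: linear model for trend and seasonality
fit_lm <- lm(y ~ t_idx + season)

# Stage 2: AR(1) on the regression residuals (fixed order instead of auto.arima)
fit_res <- arima(residuals(fit_lm), order = c(1, 0, 0), include.mean = FALSE)

# Forecast both parts h steps ahead and add them up
h <- 8
new_t <- (n + 1):(n + h)
new_data <- data.frame(t_idx = new_t,
                       season = factor((new_t - 1) %% period, levels = levels(season)))
regression_f <- as.numeric(predict(fit_lm, newdata = new_data))
residuals_f <- as.numeric(predict(fit_res, n.ahead = h)$pred)
total_f <- regression_f + residuals_f         # total prediction
print(round(total_f, 2))
```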
ARIMA msts: Approach III, lm + ARIMA on Residuals

Compared to TBATS, which one is better? How to check?

ARIMA III or TBATS? Rolling Horizon Holdout

    accuracy.tbats = 0 # we will check average 1-day-out accuracy for 7 days
    for (i in 1:7) {
      nTest <- 96*i # Redefine test and train data by shifting the "window"
      nTrain <- length(WFdemand_msts) - nTest - 1
      train <- window(WFdemand_msts, start=1, end=1+(nTrain)/(7*24*4))
      test <- window(WFdemand_msts, start=1+(nTrain+1)/(7*24*4), end=1+(nTrain+96)/(7*24*4))

      s <- tbats(train)
      sp <- predict(s, h=96)

      cat("----------------------------------\nData Partition", i, "\nTraining Set includes", nTrain, "time periods. Observations 1 to", nTrain, "\nTest Set includes 96 time periods. Observations", nTrain+1, "to", nTrain+96, "\n")
      print(accuracy(sp, test))

      accuracy.tbats <- rbind(accuracy.tbats, accuracy(sp, test)[2,5]) # row 2 = test set, column 5 = MAPE
    }
    accuracy.tbats <- accuracy.tbats[-1] # drop the initial 0 placeholder
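The window-shifting logic of the rolling horizon holdout can be isolated from the model being tested. The base R sketch below swaps TBATS for a seasonal-naive forecast (repeat the last observed period) so that it runs without extra packages. The function name `rolling_holdout_mape` and the toy series are assumptions.

```r
# Rolling horizon holdout: for fold i, hold out a block of `horizon` points that
# ends horizon * (i - 1) points before the end of the series, train on everything before it.
rolling_holdout_mape <- function(y, horizon, n_folds, period) {
  mape <- numeric(n_folds)
  for (i in seq_len(n_folds)) {
    n_train <- length(y) - horizon * i
    train <- y[seq_len(n_train)]
    test <- y[(n_train + 1):(n_train + horizon)]

    # Seasonal-naive stand-in for the real model: repeat the last full period
    last_period <- train[(n_train - period + 1):n_train]
    pred <- rep(last_period, length.out = horizon)

    mape[i] <- mean(abs((test - pred) / test)) * 100
  }
  mape
}

# Toy series with an exact period of 4, so the seasonal-naive forecast is perfect
y_toy <- rep(c(10, 12, 14, 16), times = 50)
fold_mape <- rolling_holdout_mape(y_toy, horizon = 4, n_folds = 3, period = 4)
print(mean(fold_mape))
```

Averaging the per-fold MAPE, as in the last line, gives a single number per model that can be compared across methods.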
ARIMA III or TBATS? Rolling Horizon Holdout

    accuracy.arima = 0 # we will check average 1-day-out accuracy for 7 days
    for (i in 1:7) {
      nTest <- 96*i
      nTrain <- length(WFdemand_msts) - nTest - 1
      train <- window(WFdemand_msts, start=1, end=1+(nTrain)/(7*24*4))
      test <- window(WFdemand_msts, start=1+(nTrain+1)/(7*24*4), end=1+(nTrain+96)/(7*24*4))

      trainlm <- tslm(train ~ trend + season)
      trainlmf <- forecast(trainlm, h=96)

      residauto <- auto.arima(trainlm$residuals)
      residf <- forecast(residauto, h=96)

      y <- as.numeric(trainlmf$mean)
      x <- as.numeric(residf$mean)
      sp <- x+y

      print(accuracy(sp, test))

      # sp is a plain numeric vector, so accuracy() returns a single test-set row; column 5 = MAPE
      accuracy.arima <- rbind(accuracy.arima, accuracy(sp, test)[1,5])
    }
    accuracy.arima <- accuracy.arima[-1] # drop the initial 0 placeholder

How does this compare with tsCV function?
  • tsCV does one-point-at-a-time (one point, k-steps ahead)
  • Our approach implemented "day at a time" (24*4=96 points): rolling window, k-steps ahead
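For contrast with the day-at-a-time loop, the one-point-at-a-time scheme can be mimicked in base R as below. This is not the tsCV implementation, only a sketch of the same evaluation pattern with a naive forecast (the last observed value). The function name `one_step_cv_errors` and the `initial` argument are assumptions.

```r
# One-point-at-a-time rolling origin: at each origin t, "train" on y[1..t],
# forecast one step ahead, and record the error on y[t + 1].
one_step_cv_errors <- function(y, initial) {
  origins <- initial:(length(y) - 1)
  sapply(origins, function(t) {
    naive_forecast <- y[t]          # stand-in model: repeat the last observation
    y[t + 1] - naive_forecast       # one-step-ahead forecast error
  })
}

cv_errors <- one_step_cv_errors(y = 1:10, initial = 3)
print(cv_errors)
print(sqrt(mean(cv_errors^2)))      # RMSE across all origins
```

Each origin contributes one error here, whereas each fold of the day-at-a-time loop contributes a block of 96 errors from a single fitted model, so the loop refits far fewer models.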
ARIMA III or TBATS? Rolling Horizon Holdout

SOO?

[Figure: rolling horizon holdout results, panels "TBATS" and "ARIMA III"]

Predicting Correlated Time Series: VARS

    #install.packages("vars")
    #install.packages("strucchange")
    library(vars) # Load package

    cor(WFdata[2], WFdata[3])
    series <- ts(cbind(WFdata[2], WFdata[3]))
    plot(series)

    # Estimate and summarize the model
    model.VAR <- VAR(series, 96, type="none")
    summary(model.VAR)

    # Impulse responses (if one series changes, what happens to the other?)
    impulse.response <- irf(model.VAR, impulse="Solar.System.Output..kWh.", response="Electricity.Demand.for.the.Branch..kw.", n.ahead=96, ortho=FALSE, cumulative=FALSE)
    plot(impulse.response)

    # Predict next week and plot the results
    predicted.values.VAR <- predict(model.VAR, n.ahead=672, ci=0.8)
    plot(predicted.values.VAR, xlim=c(5000,6500))
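What a vector auto-regression estimates can be seen on a toy example in base R: simulate two series from a known VAR(1), then recover the coefficient matrix by least squares with no intercept, mirroring type="none". This sketch uses one lag rather than 96 and does not call the vars package. The coefficient matrix `A` and all names are made up.

```r
set.seed(42)
n <- 500
# Hypothetical VAR(1) coefficients: series 1 depends on both lags, series 2 only on its own
A <- matrix(c(0.5, 0.2,
              0.0, 0.3), nrow = 2, byrow = TRUE)

y <- matrix(0, nrow = n, ncol = 2)
for (t in 2:n) {
  y[t, ] <- A %*% y[t - 1, ] + rnorm(2, sd = 0.1)
}

# Least squares, equation by equation: regress y_t on y_{t-1} without an intercept
Y <- y[2:n, ]
X <- y[1:(n - 1), ]
A_hat <- t(solve(t(X) %*% X, t(X) %*% Y))   # Y = X A' + E, so transpose the solution
print(round(A_hat, 2))

# One-step-ahead forecast of both series from the last observation
next_values <- as.numeric(A_hat %*% y[n, ])
print(next_values)
```

In a VAR(1) the non-orthogonalised response of the series h steps after a unit shock is given by the h-th power of the coefficient matrix, which is the quantity an impulse response plot traces out.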
Summary of Day 2

  • On many occasions data are indexed by time: time series data
  • Such data require special analytical tools, which explicitly account for the fact that prediction errors are heteroskedastic, i.e., increase over time
  • We discussed concepts and implementations of three main families of models + some "tricks":
      1. Exponential smoothing (ets)
      2. Trigonometric decompositions (tbats)
      3. Auto-regressive moving averages (ARIMA)
      Dynamic regressions (we saw ARIMA-based examples, but the concept applies to all methods)
      Multiple seasonalities (msts), multiple correlated time series (vars)
      Rolling horizon holdout
  • Many more models/packages/"tricks" out there. Check resources online: e.g., FPP book https://www.otexts.org/fpp2 and blog: http://robjhyndman.com/hyndsight and learn more
[Optional/Time Permitting]: IATA Case Study
[Optional/Time Permitting]: M4 Competition

  • World's leading time series forecasting competition (100,000 time series dataset): https://www.m4.unic.ac.cy/
  • Roots in "M1…" competitions; see "A brief history of time series forecasting competitions": https://robjhyndman.com/hyndsight/forecasting-competitions/
  • Rob Hyndman (who is this?) and his team often come out on top
  • This year's winner, however: Uber Engineering with combined ETS*RNN model: https://eng.uber.com/m4-forecasting-competition/

[Figure labels: *AM ("Holt-Winters"); New Model; Timeseries-based part; Feature-based part ("transfer learning")]
[Optional/Time Permitting]: Timeseries in Python