PP05_regression_and_expo

Rmd

School

Mount Holyoke College *

Course

344

Subject

Statistics

Date

Apr 3, 2024

Type

Rmd

Pages

5

---
title: "PP 05: Regression cont., plus exponential smoothing!"
author: "Prof. Tupper"
output: pdf_document
editor_options:
  markdown:
    wrap: 72
---

## Instructions

New practice problems! You can see more info about practice problems in the syllabus online (section A.6.5).

**Writeups:** For these problems, you'll need R/RStudio. You can work directly in the .Rmd file, typing your answers as usual and inserting R chunks to do any necessary calculations. If you need to write any math, R Markdown accepts LaTeX code; or use whatever notation makes sense to you, or knit a PDF with a big space there and hand-write over it. The Assessment is hard copy, so you won't have to worry about typesetting math :)

**Engagement credit:** To earn engagement credit for your work, submit your work to the matching assignment on Gradescope by 11:59 pm Eastern time on the due date. Remember that you don't get "content-based" feedback on your practice problems, but that solutions will be available after the engagement deadline -- you can check your work, then ask at office hours or on a Topic Conversation board about anything sticky :)

**The point of all this:** Remember, PPs are a tool *for your learning*. They're not graded for correctness, just whether you made an honest, thoughtful effort. So if you're stuck, don't just take a shortcut and get an answer from somewhere else! Writing down where exactly you're stuck and what you think *might* be the answer will earn you just as much credit...and will be vastly more useful to your learning process. Then you can go back later and look at the solutions to fill in the gaps.

### Collaboration and resources

You can work with whatever people and resources you like on practice problems, but as usual, your writeup must be your own -- you cannot copy someone else's work (or use an AI source) without citation. If you write up your work together with someone else, both (or all) of you should **clearly cite each other** in your submissions.
But I recommend that you write up your answers on your own instead, even if you work on the problems with a buddy, because that's how you know you personally understand the material. It's important to develop your own individual understanding as early as possible, rather than doing everything together and then having to go solo on the Assessments :)

## What's in here?

This set of practice problems includes content from Module 03 (Regression and forecasting) and Module 04 (Exponential smoothing and state space models); Module 04 material continues on PP06.

Concept-type learning goals associated with Module 03 include:


- Write and interpret an appropriate regression model in the TS context (including discussing residuals and forecast errors)
    - This includes the train/test stuff, and evaluating forecast distributions as well as point forecasts.
- Describe baseline forecasting methods, including their setup, use, and differences (this includes mean, naive, seasonal naive, and drift methods)
    - While you won't have to hand-calculate forecasts, you should be able to talk about what you'd expect them to look like and why!

You can write things like the formulas for prediction intervals/forecasting evaluation metrics/forecast SDs on your notes sheet, but the most important thing is to be able to explain *why* those formulas are the way they are. What does each piece represent? Why is it there?

Concept-type learning goals associated with Module 04 include:

- Write and explain exponential smoothing models (including explaining the equations, interpreting plots, and relating the models to specific TS/forecast behavior)
- Write and explain state-space equations and how they relate to exponential smoothing models and TS behavior

As usual, I won't ask you to live-code during Assessments, but I may ask you to read code, tell me what it's doing, and/or interpret the results. And of course you'll need to do coding for your projects!

## And now, the questions!

```{r}
library(lubridate)
library(generics)
library(tsibble)
library(dplyr)
library(fpp3)
library(forecast)
library(latex2exp)
library(seasonal)
```

### Q1

HA exercise 8.6 (previously 7.6: from the "Time series regression models" chapter)

The annual population of Afghanistan is available in the global_economy data set.

a.  Plot the data and comment on its features. Can you observe the effect of the Soviet-Afghan war?

```{r}
# global_economy is already a tsibble keyed by Country, so filtering is
# enough; there is no need to rebuild it with tsibble().
global_economy %>%
  filter(Country == "Afghanistan") %>%
  autoplot(Population, show.legend = FALSE)
```
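
For reference when working on part (b) of this question, here is one way to write a piecewise linear trend with knots at 1980 and 1989. This is a sketch rather than the handout's own notation: the coefficients $\beta_j$, the error $\varepsilon_t$, and the shorthand $(u)_+$ are assumed symbols, and $t$ is measured in calendar years.

$$
y_t = \beta_0 + \beta_1 t + \beta_2 (t - 1980)_+ + \beta_3 (t - 1989)_+ + \varepsilon_t,
\qquad (u)_+ = \max(0, u)
$$

Under this form the slope is $\beta_1$ before 1980, $\beta_1 + \beta_2$ between 1980 and 1989, and $\beta_1 + \beta_2 + \beta_3$ afterwards, so forecasts beyond the end of the data continue along the last segment. Setting $\beta_2 = \beta_3 = 0$ recovers the ordinary linear trend.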
b.  Fit a linear trend model and compare this to a piecewise linear trend model with knots at 1980 and 1989.

```{r}
# Linear trend: Year is the index, so this fits a single straight line.
global_economy %>%
  filter(Country == "Afghanistan") %>%
  model(TSLM(Population ~ Year)) %>%
  report()
```

```{r}
# Piecewise linear trend: the slope may change at each knot.
global_economy %>%
  filter(Country == "Afghanistan") %>%
  model(TSLM(Population ~ trend(knots = c(1980, 1989)))) %>%
  report()
```

c.  Generate forecasts from these two models for the five years after the end of the data, and comment on the results.

```{r}
# Refit both models on the full series so the forecasts are comparable.
model.fc1 <- global_economy %>%
  filter(Country == "Afghanistan") %>%
  model(TSLM(Population ~ Year))
model.fc2 <- global_economy %>%
  filter(Country == "Afghanistan") %>%
  model(TSLM(Population ~ trend(knots = c(1980, 1989))))
forecast(model.fc1, h = 5)
```

```{r}
forecast(model.fc2, h = 5)
```

### Q2

HA exercise 9.1 (from the "Exponential smoothing" chapter). Note that section 9.1 has example code for using the `ETS()` function.

Consider the number of pigs slaughtered in Victoria, available in the aus_livestock dataset.

a.  Use the ETS() function to estimate the equivalent model for simple exponential smoothing. Find the optimal values of $\alpha$ and $l_0$, and generate forecasts for the next four months.

```{r}
p <- aus_livestock %>%
  filter(Animal == 'Pigs' & State == 'Victoria')
pigs <- p %>%
  autoplot(Count) +
  labs(title = 'Timeseries')
pigs
```
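
For reference, the simple exponential smoothing model that part (a) asks for can be sketched in two equivalent ways. The first is the component form, and the second is the state-space form with additive errors. The level is written as $l$ to match this handout (HA use $\ell$), and $\varepsilon_t$ is an assumed symbol for the one-step error.

$$
\begin{aligned}
\text{Forecast:}\quad & \hat{y}_{t+h|t} = l_t \\
\text{Level:}\quad & l_t = \alpha y_t + (1 - \alpha)\, l_{t-1} \\[6pt]
\text{Observation:}\quad & y_t = l_{t-1} + \varepsilon_t \\
\text{State:}\quad & l_t = l_{t-1} + \alpha\, \varepsilon_t
\end{aligned}
$$

Substituting $\varepsilon_t = y_t - l_{t-1}$ into the state equation gives back the level equation, which is why the two forms agree. The two parameters to be estimated are $\alpha$ and $l_0$, and the forecast is flat in $h$ because there is no trend or seasonal term.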
```{r}
fit <- p %>%
  model(ses = ETS(Count ~ error('A') + trend('N') + season('N')))
opt_val <- fit %>% report()
```

```{r}
pigsforecast <- fit %>% forecast(h = 4)
pigsforecast
```

```{r}
plot <- fit %>%
  forecast(h = 4) %>%
  autoplot(filter(p, Month >= yearmonth('2016 Jan'))) +
  labs(title = 'Four Month Forecast')
plot
```

b.  Compute a 95% prediction interval for the first forecast using $\hat{y} \pm 1.96s$ where $s$ is the standard deviation of the residuals. Compare your interval with the interval produced by R.

```{r}
# Point forecast for the first month ahead. Use .mean (a number) rather
# than Count, which holds the whole forecast distribution.
y <- pigsforecast %>% pull(.mean) %>% head(1)
# Standard deviation of the residuals.
sD <- augment(fit) %>% pull(.resid) %>% sd()
# Lower and upper bounds of the 95% prediction interval.
lowerCi <- y - 1.96 * sD
upperCi <- y + 1.96 * sD
z <- c(lowerCi, upperCi)
names(z) <- c('Lower', 'Upper')
z
```

```{r}
# Intervals produced by R; the first element is the one to compare with z.
hilo(pigsforecast$Count, 95)
```

### Q3

Fit a linear trend model to the data from exercise 9.1 and plot the forecasts; then repeat with a damped trend model. Briefly discuss the differences between the three approaches (SES, linear trend, damped trend) in terms of their formulas, connecting these differences to the results they produce.

```{r}
# Holt's linear trend ('A') and damped trend ('Ad'), both without seasonality.
fit_trend <- p %>%
  model(
    holt = ETS(Count ~ error('A') + trend('A') + season('N')),
    damped = ETS(Count ~ error('A') + trend('Ad') + season('N'))
  )
plot <- fit_trend %>%
  forecast(h = 4) %>%
  autoplot(filter(p, Month >= yearmonth('2016 Jan'))) +
  labs(title = 'Four Month Forecast')
plot
```

### Q4

HA exercise 9.2. Optional but recommended: continue with exercises 9.3 and 9.4!

Write your own function to implement simple exponential smoothing. The function should take arguments y (the time series), alpha (the smoothing parameter $\alpha$) and level (the initial level $l_0$). It should return the forecast of the next observation in the series. Does it give the same forecast as ETS()?

```{r}
# Named my_ses so that it does not mask fable's ETS().
# y: numeric vector; alpha: smoothing parameter; level: initial level l_0;
# h: number of steps ahead (SES forecasts are flat, so all h values match).
my_ses <- function(y, alpha, level, h = 1) {
  for (obs in y) {
    # l_t = alpha * y_t + (1 - alpha) * l_{t-1}
    level <- alpha * obs + (1 - alpha) * level
  }
  rep(level, h)
}
```
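
One way to sanity-check a hand-written SES function is to run the recursion on a tiny made-up series where the answer can be worked out by hand. The sketch below uses base R only. The helper `ses_levels()` and the toy numbers are hypothetical, chosen for illustration; they are not part of the exercise or of the fable API.

```{r}
# Hypothetical helper for illustration; not part of fable or the exercise.
# Returns the full level path l_0, l_1, ..., l_T for simple exponential smoothing.
ses_levels <- function(y, alpha, level) {
  # One smoothing step: l_t = alpha * y_t + (1 - alpha) * l_{t-1}
  step <- function(prev_level, obs) alpha * obs + (1 - alpha) * prev_level
  # Left fold over y, starting from the initial level and keeping every l_t.
  Reduce(step, y, init = level, accumulate = TRUE)
}

toy <- c(10, 20, 30)                       # made-up data
levels_half <- ses_levels(toy, alpha = 0.5, level = 10)
levels_half                                # level path l_0, ..., l_3
tail(levels_half, 1)                       # one-step-ahead forecast

# Limiting cases: alpha = 1 uses only the last observation,
# alpha = 0 never moves off the initial level.
tail(ses_levels(toy, alpha = 1, level = 10), 1)
tail(ses_levels(toy, alpha = 0, level = 10), 1)
```

Working the $\alpha = 0.5$ case by hand gives $l_1 = 10$, $l_2 = 15$, and $l_3 = 22.5$, so the next-observation forecast is 22.5. With $\alpha = 1$ the forecast equals the last observation (30), which matches the naive method, and with $\alpha = 0$ it stays at $l_0 = 10$.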