HW4

Question 7.1
Describe a situation or problem from your job, everyday life, current events, etc., for which exponential smoothing would be appropriate. What data would you need? Would you expect the value of α (the first smoothing parameter) to be closer to 0 or 1, and why?

Answer: I work for a company that manufactures laptops, and there are many situations where exponential smoothing would be an appropriate forecasting approach. One of them is forecasting the quantity of customer returns. Some customers return or exchange laptops, either at retail stores or online, and the business needs to track these returns so it can have replacement units ready and understand why products come back. We can apply exponential smoothing to the historical return quantities over a product's lifetime to see the underlying trend and any seasonality, for example spikes after Labor Day or Black Friday, when people buy during promotional sales and later return products they do not like or exchange defective units. Anticipating these returns in advance lets the company keep inventory optimally stocked so that returning customers can leave with a replacement device. Using exponential smoothing, we can predict how many returns to expect in the short range and how well our stores and warehouses need to be stocked to avoid a stock-out situation.

The data needed is the historical return quantity of the product over its lifetime to date. I would expect the value of α to be closer to 1, because there is not much randomness in this system: return quantities do not fluctuate wildly and the series is mostly stable, since a mass return of products from customers is very unlikely, so the latest observations can be weighted heavily.
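As a minimal sketch in R of how such a forecast could be set up (weekly_returns is a hypothetical series of weekly return counts, invented here purely for illustration):

#simple exponential smoothing: disable the trend (beta) and seasonal (gamma) components
weekly_returns <- c(120, 115, 130, 125, 128, 122, 119, 131, 127, 124) #hypothetical data
ret_ts <- ts(weekly_returns)
fit <- HoltWinters(ret_ts, beta = FALSE, gamma = FALSE)
fit$alpha                  #fitted smoothing parameter
predict(fit, n.ahead = 4)  #short-range forecast of upcoming return quantities

A fitted alpha near 1 would confirm the expectation above that the most recent observations deserve most of the weight.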
HW7.2 2022-09-16
Question 7.2
Using the 20 years of daily high temperature data for Atlanta (July through October) from Question 6.2 (file temps.txt), build and use an exponential smoothing model to help make a judgment of whether the unofficial end of summer has gotten later over the 20 years. (Part of the point of this assignment is for you to think about how you might use exponential smoothing to answer this question. Feel free to combine it with other models if you'd like to. There's certainly more than one reasonable approach.)

Answer: This problem may be approached in various ways; below is my approach based on what we learned. First, I imported the data into a dataframe df_1.

df_1 <- read.delim("temps.txt", header = TRUE)
names(df_1) <- gsub('X', '', names(df_1)) #remove the 'X' prefix added to the numeric year headers
head(df_1)

##     DAY 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009
## 1 1-Jul   98   86   91   84   89   84   90   73   82   91   93   95   85   95
## 2 2-Jul   97   90   88   82   91   87   90   81   81   89   93   85   87   90
## 3 3-Jul   97   93   91   87   93   87   87   87   86   86   93   82   91   89
## 4 4-Jul   90   91   91   88   95   84   89   86   88   86   91   86   90   91
## 5 5-Jul   89   84   91   90   96   86   93   80   90   89   90   88   88   80
## 6 6-Jul   93   84   89   91   96   87   93   84   90   82   81   87   82   87
##   2010 2011 2012 2013 2014 2015
## 1   87   92  105   82   90   85
## 2   84   94   93   85   93   87
## 3   83   95   99   76   87   79
## 4   85   92   98   77   84   85
## 5   88   90  100   83   86   84
## 6   89   90   98   83   87   84

df <- as.vector(unlist(df_1[, 2:21])) #unlist the 20 year columns into one long vector
#converting the data into a time series object; each 123-day July-October block is one period
df_ts <- ts(df, frequency = 123, start = 1996)
head(df_ts)

## [1] 98 97 97 90 89 93

plot.ts(df_ts, col = "grey")

Above is the plot of the original time series data. There appears to be some seasonality: peaks and valleys occur at regular intervals over the years. Let us decompose this data to see individually whether there are any trend, random, or seasonal components; the decompose function estimates the underlying components of a time series.

dc_ts <- decompose(df_ts)
plot(dc_ts)
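For an additive model, decompose estimates components such that

x_t = T_t + S_t + e_t

where T_t is the trend (estimated with a centered moving average), S_t the seasonal component (the averaged per-period deviations, repeated each cycle), and e_t the random remainder.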
From the decomposition plot we can observe that there is definite seasonality in the data, and it is constant over the years, which makes sense: temperatures from July to October generally fall. We can also see some randomness and a trend component. The decomposition also tells us the time series is additive. A seasonal time series that can be described by an additive model can be seasonally adjusted by estimating the seasonal component and subtracting it from the original series. I did try removing the seasonal and random components, and the resulting plot suggested summers have been getting longer.

dc_ts$type

## [1] "additive"

df_adj <- df_ts - dc_ts$seasonal #removing the seasonal component
plot.ts(df_adj, col = "grey")
df_adj <- df_adj - dc_ts$random  #removing the random component
plot.ts(df_adj, col = "grey")

However, Holt-Winters (triple) smoothing cannot be fitted without the seasonal and trend components present, so I applied the Holt-Winters approach to the original time series.

op_ts <- HoltWinters(df_ts)
#op_ts
plot(op_ts$fitted)
Holt-Winters exponential smoothing with trend and additive seasonal component.

Call:
HoltWinters(x = df_ts)

Smoothing parameters:
 alpha: 0.6610618
 beta : 0
 gamma: 0.6248076

Coefficients:
           [,1]
a 71.477236414
b -0.004362918

Here alpha is about 0.66, which shows there is some randomness in the data: the smoothed value weights the current observation heavily but still draws on previous data points. beta is 0, so the fitted model finds no trend in the data at all. In the plot of the fitted components below, the level is the moving average, and the seasonal component shows the cyclic seasonality in the data.
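For reference, alpha, beta, and gamma enter the standard additive Holt-Winters recursions (the form R's HoltWinters fits), with period m = 123 here:

level:    L_t = alpha*(x_t - S_{t-m}) + (1 - alpha)*(L_{t-1} + b_{t-1})
trend:    b_t = beta*(L_t - L_{t-1}) + (1 - beta)*b_{t-1}
seasonal: S_t = gamma*(x_t - L_t) + (1 - gamma)*S_{t-m}

So alpha controls how fast the level reacts to new observations, beta the trend, and gamma the seasonal indices; with beta = 0 the trend slope b stays fixed at its initial estimate.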
#trend is constant
df_fitted <- as.data.frame(op_ts$fitted)
plot(df_ts, col = 'grey')
lines(df_fitted$xhat, col = 'red')
lines(df_fitted$level, col = 'blue')

Here I plotted the original data (grey) together with the Holt-Winters output xhat (red), which is the smoothed observation, and the level (blue), to look for any indication of longer summers. The period from 2010 to 2015 shows a slight increase, but the abrupt fall after it may be because we have no data points after 2015, so it is difficult to draw a firm conclusion from this graph.

plot(df_fitted$season, col = 'yellow', type = 'l') #plot of the seasonal component from HW
plot(op_ts)

Above is the graph of the Holt-Winters smoothed curve, which sits well on the original data; the smoothing gets better as more data points accumulate. The initial year is missing because no fitted values are produced for 1996. Since the visual evidence is inconclusive, let us apply the CUSUM approach we learned last week and check whether it helps identify any indication that summers have gotten longer in Atlanta. First, I apply CUSUM to the seasonal values we got from Holt-Winters (HW) to check whether we can identify any change.

length(df_fitted$season)

## [1] 2337

#converting the season values to a matrix
mat <- matrix(data = df_fitted$season, ncol = 123)
dim(mat)

## [1] 19 123

#assigning row names and column names
colnames(mat) <- as.vector(t(df_1[, 1]))
rownames(mat) <- c(1997:2015)
#head(mat)
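The loop below implements a one-sided CUSUM recursion for detecting a decrease below the July baseline mean Mu, with slack C and threshold T:

S_t = max(0, S_{t-1} + (Mu - x_t - C)),   flag a change when S_t >= T

where x_t is the across-year average for day t, Mu and C are the mean and standard deviation of the July averages, and T = 5*C.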
#transposing the matrix to have days in rows and years in columns
df_mat <- t(mat)
#converting it to a data frame for easy operations
df_mat <- as.data.frame(df_mat)
#adding column avg, which has the mean of each day across all years
df_mat$avg <- rowMeans(df_mat, na.rm = TRUE)
df_mat$cusum <- 0   #column for the cusum statistic S_t
df_mat$change <- NA #column for the change-detection trigger

Mu <- mean(df_mat[1:30, 20]) #baseline mean of the July daily averages (avg is column 20)
C <- sd(df_mat[1:30, 20])    #std dev of the same July daily averages
T <- 5*C                     #threshold set to 5 times the std dev

#implementation of CUSUM: S_t = max(0, S_{t-1} + Mu - x_t - C), change when S_t >= T
for (i in 2:nrow(df_mat)){
  df_mat$cusum[i] <- max(0, df_mat$cusum[i-1] + Mu - df_mat$avg[i] - C)
  if (df_mat$cusum[i] >= T){
    df_mat$change[i] = 'Change detected'
  } else {
    df_mat$change[i] = 'No Change'
  }
}
head(df_mat)

             1997        1998        1999       2000      2001       2002
1-Jul    4.303159   8.2381188  11.0917774   9.042997  2.067387   2.116168
2-Jul    7.896655   6.7665741   1.6771432   4.766574  5.799094   6.815355
3-Jul   -2.493588  -9.5748893  -2.6561901   3.238119  5.156818   5.099907
4-Jul    8.132428   8.3438099   9.4901514   9.620233  2.770639  -3.152125
5-Jul    3.425111  -0.5748893  -0.5179787  -4.469198 -3.412288  -3.339117
6-Jul  -12.192775 -16.1765153 -18.1358649 -22.103345 -4.046434 -12.013914
             2003       2004        2005      2006      2007      2008
1-Jul   -6.826922   5.197468   2.2055985  5.262509  2.295029  6.327550
2-Jul    6.986086   9.099907   9.2218587  7.278769  6.238119 -9.940743
3-Jul    4.189338   5.335680   8.5226717  8.652753  6.669013  6.620233
4-Jul  -10.095214   3.937306   0.9291758  3.969826  6.042997  6.189338
5-Jul   -4.322857  -1.233426   1.7990945  1.839745  4.799094  1.758444
6-Jul   -9.875702 -12.802532 -12.7862714 -8.802532 -2.835052 -0.843182
            2009       2010       2011       2012         2013       2014
1-Jul   4.343810  8.2706392   9.205599   9.189338   7.14055787   7.075517
2-Jul  -2.013914  1.9779562   6.051127   6.986086   5.86413510   1.717794
3-Jul   5.603973 -0.3228568  -3.168385  -1.119605  -0.03017384   1.953566
4-Jul   6.221859  8.2950294   1.343810   3.392590   5.41698063  -4.526109
5-Jul  -3.314727 -7.4122877 -10.404158 -18.379767 -16.25781611 -10.209036
6-Jul  -1.835052  0.1568180 -15.843182 -18.826922 -13.81879172  -2.810662
            2015       avg    cusum    change
1-Jul   6.994216  5.449501 0.000000         0
2-Jul   1.538932  4.459343 0.000000 No Change
3-Jul   4.986086  2.455919 0.000000 No Change
4-Jul  -3.501719  3.306369 0.000000 No Change
5-Jul   1.839745 -3.599280 0.000000 No Change
6-Jul  -0.745621 -9.807238 5.121384 No Change

df_mat$change

## [1] NA          "No Change" "No Change" "No Change" "No Change" "No Change"
## (entries [7] through [123] are likewise all "No Change")
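The same CUSUM routine is reused below on the xhat and level components. As a sketch of how this repetition could be factored out (run_cusum is a hypothetical helper name, not part of the original analysis):

#reshape one fitted Holt-Winters component into a day-by-year table and run CUSUM on it
run_cusum <- function(values, day_names, years = 1997:2015, freq = 123) {
  mat <- matrix(values, ncol = freq)
  colnames(mat) <- day_names
  rownames(mat) <- years
  df_mat <- as.data.frame(t(mat))              #days in rows, years in columns
  df_mat$avg <- rowMeans(df_mat, na.rm = TRUE) #mean of each day across years
  Mu <- mean(df_mat$avg[1:30])                 #July baseline mean
  C <- sd(df_mat$avg[1:30])                    #July baseline std dev
  Thr <- 5 * C                                 #detection threshold
  df_mat$cusum <- 0
  df_mat$change <- NA
  for (i in 2:nrow(df_mat)) {
    df_mat$cusum[i] <- max(0, df_mat$cusum[i - 1] + Mu - df_mat$avg[i] - C)
    df_mat$change[i] <- if (df_mat$cusum[i] >= Thr) 'Change detected' else 'No Change'
  }
  df_mat
}
#e.g. res <- run_cusum(df_fitted$xhat, as.vector(t(df_1[, 1])))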
Next, I apply the same CUSUM approach to the xhat values we got from Holt-Winters (HW), to check whether we can identify any change using the predicted xhat values.

length(df_fitted$xhat)

## [1] 2337

#converting the xhat values to a matrix
mat <- matrix(data = df_fitted$xhat, ncol = 123)
dim(mat)

## [1] 19 123

#assigning row names and column names
colnames(mat) <- as.vector(t(df_1[, 1]))
rownames(mat) <- c(1997:2015)
#head(mat)

#transposing the matrix to have days in rows and years in columns
df_mat <- t(mat)
#converting it to a data frame for easy operations
df_mat <- as.data.frame(df_mat)
#adding column avg, which has the mean of each day across all years
df_mat$avg <- rowMeans(df_mat, na.rm = TRUE)
df_mat$cusum <- 0   #column for the cusum statistic S_t
df_mat$change <- NA #column for the change-detection trigger

Mu <- mean(df_mat[1:30, 20]) #baseline mean of the July daily averages (avg is column 20)
C <- sd(df_mat[1:30, 20])    #std dev of the same July daily averages
T <- 5*C                     #threshold set to 5 times the std dev

#implementation of CUSUM: S_t = max(0, S_{t-1} + Mu - x_t - C), change when S_t >= T
for (i in 2:nrow(df_mat)){
  df_mat$cusum[i] <- max(0, df_mat$cusum[i-1] + Mu - df_mat$avg[i] - C)
  if (df_mat$cusum[i] >= T){
    df_mat$change[i] = 'Change detected'
  } else {
    df_mat$change[i] = 'No Change'
  }
}
head(df_mat)

          1997     1998     1999     2000     2001     2002     2003      2004
1-Jul 87.17619 90.32925 92.96089 90.93360 83.99752 84.04358 75.06732  87.04284
2-Jul 89.85831 88.81753 83.84436 87.03232 88.03911 89.02515 89.17489  91.16874
3-Jul 79.87808 72.87301 79.87130 85.84632 87.86225 87.89204 87.04848  88.15841
4-Jul 90.15381 90.25915 91.22982 91.20361 84.21506 80.79001 78.60881 100.82831
5-Jul 92.92855 85.66611 87.26150 85.11624 88.07513 90.07746 92.34346  93.21829
6-Jul 65.51275 69.77949 75.23323 75.73464 97.92898 79.42710 81.93967  79.04836
          2005     2006     2007     2008     2009     2010     2011     2012
1-Jul 84.01829 87.05875 84.04807 88.04445 86.02696 89.93161 90.90741 90.94800
2-Jul 91.17478 89.11179 87.99288 71.81436 79.85955 83.93990 88.04844 88.94701
3-Jul 91.23632 91.20582 89.08166 88.97453 87.97075 82.05889 79.17007 81.10206
4-Jul 92.64082 87.98181 85.43521 83.30630 85.11516 89.09103 78.76996 81.62751
5-Jul 80.89773 80.34056 84.39254 69.18892 64.64756 60.11756 62.33211 64.04852
6-Jul 81.01147 83.66114 87.20402 76.49648 66.57824 62.23394 48.71918 55.83264
          2013     2014     2015      avg    cusum    change
1-Jul 88.92923 88.90661 88.88268 87.32912 0.000000         0
2-Jul 87.85573 83.80040 83.74912 86.48707 0.000000 No Change
3-Jul 82.11966 84.01993 87.03491 84.91603 0.000000 No Change
4-Jul 83.89378 75.33867 84.06756 86.02928 0.000000 No Change
5-Jul 80.67739 82.96870 88.42319 80.66955 0.000000 No Change
6-Jul 69.54085 80.84814 75.73753 74.34041 3.399691 No Change

unique(df_mat$change)

[1] "0"         "No Change"

Here we can see that no change is detected in the df_mat$change column. Finally, I also check whether the level component of the HW output can help identify any change.

length(df_fitted$level)

## [1] 2337

#converting the level values to a matrix
mat <- matrix(data = df_fitted$level, ncol = 123)
dim(mat)

## [1] 19 123

#assigning row names and column names
colnames(mat) <- as.vector(t(df_1[, 1]))
rownames(mat) <- c(1997:2015)
#head(mat)

#transposing the matrix to have days in rows and years in columns
df_mat <- t(mat)
#converting it to a data frame for easy operations
df_mat <- as.data.frame(df_mat)
#adding column avg, which has the mean of each day across all years
df_mat$avg <- rowMeans(df_mat, na.rm = TRUE)
df_mat$cusum <- 0   #column for the cusum statistic S_t
df_mat$change <- NA #column for the change-detection trigger

Mu <- mean(df_mat[1:30, 20]) #baseline mean of the July daily averages (avg is column 20)
C <- sd(df_mat[1:30, 20])    #std dev of the same July daily averages
T <- 5*C                     #threshold set to 5 times the std dev

#implementation of CUSUM: S_t = max(0, S_{t-1} + Mu - x_t - C), change when S_t >= T
for (i in 2:nrow(df_mat)){
  df_mat$cusum[i] <- max(0, df_mat$cusum[i-1] + Mu - df_mat$avg[i] - C)
  if (df_mat$cusum[i] >= T){
    df_mat$change[i] = 'Change detected'
  } else {
    df_mat$change[i] = 'No Change'
  }
}
head(df_mat)

          1997     1998     1999     2000      2001     2002     2003     2004
1-Jul 82.87739 82.09550 81.87348 81.89497  81.93450 81.93177 81.89860 81.84974
2-Jul 81.96602 82.05532 82.17158 82.27010  82.24438 82.21416 82.19317 82.07319
3-Jul 82.37604 82.45227 82.53185 82.61257  82.70979 82.79649 82.86350 82.82709
4-Jul 82.02575 81.91971 81.74403 81.58774  81.44878 83.94650 88.70838 96.89537
5-Jul 89.50780 86.24536 87.78384 89.58980  91.49178 93.42094 96.67068 94.45608
6-Jul 77.70989 85.96037 93.37346 97.84235 101.97977 91.44537 91.81973 91.85525
          2005     2006     2007     2008     2009     2010     2011     2012
1-Jul 81.81705 81.80060 81.75740 81.72126 81.68752 81.66533 81.70618 81.76302
2-Jul 81.95728 81.83738 81.75912 81.75947 81.87782 81.96631 82.00167 81.96529
3-Jul 82.71801 82.55743 82.41701 82.35866 82.37114 82.38611 82.34282 82.22603
4-Jul 91.71601 84.01634 79.39658 77.12133 78.89766 80.80036 77.43051 78.23928
5-Jul 79.10299 78.50518 79.59781 67.43484 67.96665 67.53421 72.74063 82.43265
6-Jul 93.80211 92.46804 90.04343 77.34403 68.41766 62.08148 64.56672 74.66393
          2013     2014     2015      avg cusum    change
1-Jul 81.79304 81.83546 81.89283 81.88398     0         0
2-Jul 81.99595 82.08697 82.21455 82.03209     0 No Change
3-Jul 82.15420 82.07073 82.05319 82.46447     0 No Change
4-Jul 78.48116 79.86914 87.57364 82.72728     0 No Change
5-Jul 96.93957 93.18210 86.58781 84.27320     0 No Change
6-Jul 83.36400 83.66317 76.48751 84.15201     0 No Change

unique(df_mat$change)

[1] "0"         "No Change"

Here also we observe that no change is detected. A change was detected (on 8-9 Aug, 3-4 Sep, and 17-18 Sep) when the threshold was lowered to 2*C, but such a low threshold did not make sense to me for this use case, so I went with 5*C.

Conclusion: As per my analysis, there is no evidence that summers have gotten longer lately, given the data available to us, and the dataset is too small to conclude anything definitively. We checked for a change using the original data, the xhat values, the seasonal component, and the level component, and found no change in any of them, which leads me to believe that the length of summer has not changed.