HW8

11.1 Using the crime data set uscrime.txt from Questions 8.2, 9.1, and 10.1, build a regression model using:
1. Stepwise regression
2. Lasso
3. Elastic net

11.1.1 Stepwise Regression

Below is the implementation of stepwise regression, using the olsrr package in R. I first built a linear regression model with all the predictors to check how it performs; as shown below, it has an adjusted R2 of 0.7078. To improve the model, I then applied backward stepwise regression based on p-value: any predictor with a p-value above 0.05 was excluded from the model, and the adjusted R2 rose to 0.731. The various metrics for the model are plotted and shown in the table output below.

#removing all environment variables and importing libraries
rm(list = ls())
library(olsrr)
##
## Attaching package: 'olsrr'
## The following object is masked from 'package:datasets':
##
##     rivers

#importing data into a df
df = read.table("USCrime.txt", header = TRUE)

#apply linear regression model on all the variables in the dataset
model = lm(Crime ~ ., data = df)

#model summary
summary(model)
##
## Call:
## lm(formula = Crime ~ ., data = df)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -395.74  -98.09   -6.69  112.99  512.67
##
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept) -5.984e+03  1.628e+03  -3.675 0.000893 ***
## M            8.783e+01  4.171e+01   2.106 0.043443 *
## So          -3.803e+00  1.488e+02  -0.026 0.979765
## Ed           1.883e+02  6.209e+01   3.033 0.004861 **
## Po1          1.928e+02  1.061e+02   1.817 0.078892 .
## Po2         -1.094e+02  1.175e+02  -0.931 0.358830
## LF          -6.638e+02  1.470e+03  -0.452 0.654654
## M.F          1.741e+01  2.035e+01   0.855 0.398995
## Pop         -7.330e-01  1.290e+00  -0.568 0.573845
## NW           4.204e+00  6.481e+00   0.649 0.521279
## U1          -5.827e+03  4.210e+03  -1.384 0.176238
## U2           1.678e+02  8.234e+01   2.038 0.050161 .
## Wealth       9.617e-02  1.037e-01   0.928 0.360754
## Ineq         7.067e+01  2.272e+01   3.111 0.003983 **
## Prob        -4.855e+03  2.272e+03  -2.137 0.040627 *
## Time        -3.479e+00  7.165e+00  -0.486 0.630708
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 209.1 on 31 degrees of freedom
## Multiple R-squared:  0.8031, Adjusted R-squared:  0.7078
## F-statistic: 8.429 on 15 and 31 DF,  p-value: 3.539e-07

#step wise backward regression using p value
back <- ols_step_backward_p(model, pent = 0.05, prem = 0.05, progress = TRUE)
## Backward Elimination Method
## ---------------------------
##
## Candidate Terms:
##
## 1. M
## 2. So
## 3. Ed
## 4. Po1
## 5. Po2
## 6. LF
## 7. M.F
## 8. Pop
## 9. NW
## 10. U1
## 11. U2
## 12. Wealth
## 13. Ineq
## 14. Prob
## 15. Time
##
## We are eliminating variables based on p value...
##
## Variables Removed:
##
## - So
## - Time
## - LF
## - NW
## - Po2
## - Pop
## - Wealth
## - M.F
## - U1
##
## No more variables satisfy the condition of p value = 0.05
##
##
## Final Model Output
## ------------------
##
##                         Model Summary
## -----------------------------------------------------------------
## R                  0.875       RMSE                   200.690
## R-Squared          0.766       Coef. Var               22.174
## Adj. R-Squared     0.731       MSE                  40276.421
## Pred R-Squared     0.666       MAE                    138.674
## -----------------------------------------------------------------
##  RMSE: Root Mean Square Error
##  MSE: Mean Square Error
##  MAE: Mean Absolute Error
##
##                                ANOVA
## -----------------------------------------------------------------------
##                  Sum of
##                 Squares    DF    Mean Square       F       Sig.
## -----------------------------------------------------------------------
## Regression  5269870.803     6     878311.801    21.807    0.0000
## Residual    1611056.856    40      40276.421
## Total       6880927.660    46
## -----------------------------------------------------------------------
##
##                               Parameter Estimates
## ---------------------------------------------------------------------------------------
##       model         Beta    Std. Error    Std. Beta        t      Sig        lower        upper
## ---------------------------------------------------------------------------------------
## (Intercept)    -5040.505       899.843                -5.602    0.000    -6859.156    -3221.854
##           M      105.020        33.299        0.341    3.154    0.003       37.719      172.320
##          Ed      196.471        44.754        0.568    4.390    0.000      106.019      286.923
##         Po1      115.024        13.754        0.884    8.363    0.000       87.227      142.821
##          U2       89.366        40.906        0.195    2.185    0.035        6.693      172.039
##        Ineq       67.653        13.936        0.698    4.855    0.000       39.488       95.818
##        Prob    -3801.836      1528.097       -0.224   -2.488    0.017    -6890.236     -713.436
## ---------------------------------------------------------------------------------------

plot(back)

Below is the linear model built by the backward elimination technique. The important predictors are M, Ed, Po1, U2, Ineq and Prob.

back$model
##
## Call:
## lm(formula = paste(response, "~", paste(preds, collapse = " + ")),
##     data = l)
##
## Coefficients:
## (Intercept)            M           Ed          Po1           U2         Ineq
##    -5040.50       105.02       196.47       115.02        89.37        67.65
##        Prob
##    -3801.84
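To double-check that the nine dropped predictors add little beyond the six retained ones, the reduced model can be compared against the full model with base R. This is only a minimal sketch (it assumes df and the full model object from above are still in the workspace); anova() runs a partial F-test between the nested models and AIC() compares their information criteria.

#sketch: compare the reduced backward-selection model against the full model
reduced <- lm(Crime ~ M + Ed + Po1 + U2 + Ineq + Prob, data = df)
anova(reduced, model)   #partial F-test; a large p-value means the dropped terms add little
AIC(model)              #AIC of the full 15-predictor model
AIC(reduced)            #AIC of the 6-predictor model (lower is better)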
Similar to the backward elimination above, I also implemented forward selection based on p-value, so that only attributes with p < 0.05 were entered into the model. As shown below, forward selection gives the same adjusted R2 of 0.731. I have also plotted the corresponding metrics (AIC, R2, Adj. R2, Cp, SBC, SBIC).

#step wise forward regression using p value
forward <- ols_step_forward_p(model, pent = 0.05, prem = 0.05, progress = TRUE)
## Forward Selection Method
## ---------------------------
##
## Candidate Terms:
##
## 1. M
## 2. So
## 3. Ed
## 4. Po1
## 5. Po2
## 6. LF
## 7. M.F
## 8. Pop
## 9. NW
## 10. U1
## 11. U2
## 12. Wealth
## 13. Ineq
## 14. Prob
## 15. Time
##
## We are selecting variables based on p value...
##
## Variables Entered:
##
## - Po1
## - Ineq
## - Ed
## - M
## - Prob
## - U2
##
## No more variables to be added.
##
## Final Model Output
## ------------------
##
##                         Model Summary
## -----------------------------------------------------------------
## R                  0.875       RMSE                   200.690
## R-Squared          0.766       Coef. Var               22.174
## Adj. R-Squared     0.731       MSE                  40276.421
## Pred R-Squared     0.666       MAE                    138.674
## -----------------------------------------------------------------
##  RMSE: Root Mean Square Error
##  MSE: Mean Square Error
##  MAE: Mean Absolute Error
##
##                                ANOVA
## -----------------------------------------------------------------------
##                  Sum of
##                 Squares    DF    Mean Square       F       Sig.
## -----------------------------------------------------------------------
## Regression  5269870.803     6     878311.801    21.807    0.0000
## Residual    1611056.856    40      40276.421
## Total       6880927.660    46
## -----------------------------------------------------------------------
##
##                               Parameter Estimates
## ---------------------------------------------------------------------------------------
##       model         Beta    Std. Error    Std. Beta        t      Sig        lower        upper
## ---------------------------------------------------------------------------------------
## (Intercept)    -5040.505       899.843                -5.602    0.000    -6859.156    -3221.854
##         Po1      115.024        13.754        0.884    8.363    0.000       87.227      142.821
##        Ineq       67.653        13.936        0.698    4.855    0.000       39.488       95.818
##          Ed      196.471        44.754        0.568    4.390    0.000      106.019      286.923
##           M      105.020        33.299        0.341    3.154    0.003       37.719      172.320
##        Prob    -3801.836      1528.097       -0.224   -2.488    0.017    -6890.236     -713.436
##          U2       89.366        40.906        0.195    2.185    0.035        6.693      172.039
## ---------------------------------------------------------------------------------------

#plotting the different measures like aic,r2,adjr2,cp,sbc,sbic
plot(forward)
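As an optional cross-check on the olsrr results, forward selection can also be run with base R's step() function, which selects on AIC rather than p-values. This is only a sketch, assuming df and the full model object are still in memory; it is not part of the output above.

#sketch: forward selection with base R step(), driven by AIC instead of p values
null_model <- lm(Crime ~ 1, data = df)                   #intercept-only starting point
step(null_model, scope = formula(model),                 #upper scope = all 15 predictors
     direction = "forward", trace = FALSE)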
Below is the model and the attributes used after applying forward stepwise regression.

forward$model
##
## Call:
## lm(formula = paste(response, "~", paste(preds, collapse = " + ")),
##     data = l)
##
## Coefficients:
## (Intercept)          Po1         Ineq           Ed            M         Prob
##    -5040.50       115.02        67.65       196.47       105.02     -3801.84
##          U2
##       89.37

Next, to see whether anything changes when forward and backward selection are used together, I implemented the combined (both-direction) stepwise technique based on p-value, again entering or removing attributes at the p = 0.05 threshold. As shown below, this gives the same adjusted R2 of 0.731, so no change. I have also plotted the corresponding metrics (AIC, R2, Adj. R2, Cp, SBC, SBIC).
#step wise both regression using p value
k <- ols_step_both_p(model, pent = 0.05, prem = 0.05, progress = TRUE)
## Stepwise Selection Method
## ---------------------------
##
## Candidate Terms:
##
## 1. M
## 2. So
## 3. Ed
## 4. Po1
## 5. Po2
## 6. LF
## 7. M.F
## 8. Pop
## 9. NW
## 10. U1
## 11. U2
## 12. Wealth
## 13. Ineq
## 14. Prob
## 15. Time
##
## We are selecting variables based on p value...
##
## Variables Entered/Removed:
##
## - Po1 added
## - Ineq added
## - Ed added
## - M added
## - Prob added
## - U2 added
##
## No more variables to be added/removed.
##
##
## Final Model Output
## ------------------
##
##                         Model Summary
## -----------------------------------------------------------------
## R                  0.875       RMSE                   200.690
## R-Squared          0.766       Coef. Var               22.174
## Adj. R-Squared     0.731       MSE                  40276.421
## Pred R-Squared     0.666       MAE                    138.674
## -----------------------------------------------------------------
##  RMSE: Root Mean Square Error
##  MSE: Mean Square Error
##  MAE: Mean Absolute Error
##
##                                ANOVA
## -----------------------------------------------------------------------
##                  Sum of
##                 Squares    DF    Mean Square       F       Sig.
## -----------------------------------------------------------------------
## Regression  5269870.803     6     878311.801    21.807    0.0000
## Residual    1611056.856    40      40276.421
## Total       6880927.660    46
## -----------------------------------------------------------------------
##
##                               Parameter Estimates
## ---------------------------------------------------------------------------------------
##       model         Beta    Std. Error    Std. Beta        t      Sig        lower        upper
## ---------------------------------------------------------------------------------------
## (Intercept)    -5040.505       899.843                -5.602    0.000    -6859.156    -3221.854
##         Po1      115.024        13.754        0.884    8.363    0.000       87.227      142.821
##        Ineq       67.653        13.936        0.698    4.855    0.000       39.488       95.818
##          Ed      196.471        44.754        0.568    4.390    0.000      106.019      286.923
##           M      105.020        33.299        0.341    3.154    0.003       37.719      172.320
##        Prob    -3801.836      1528.097       -0.224   -2.488    0.017    -6890.236     -713.436
##          U2       89.366        40.906        0.195    2.185    0.035        6.693      172.039
## ---------------------------------------------------------------------------------------

#plotting the different measures like aic,r2,adjr2,cp,sbc,sbic
plot(k)
Below is the final model using the suggested variables.

#building final model using only the variables as suggested by stepwise regression
k$model
##
## Call:
## lm(formula = paste(response, "~", paste(preds, collapse = " + ")),
##     data = l)
##
## Coefficients:
## (Intercept)          Po1         Ineq           Ed            M         Prob
##    -5040.50       115.02        67.65       196.47       105.02     -3801.84
##          U2
##       89.37

To check whether using another metric such as AIC, instead of p-values, can improve the model, I implemented the combined forward/backward stepwise procedure again, this time based on AIC, and plotted the variables it suggests below. As shown below, refitting the model with the suggested variables again gives the same adjusted R2 of 0.7307.
#step wise both regression using aic value
k1 <- ols_step_both_aic(model)
plot(k1)

#both the p value and aic approaches suggest similar variables for prediction
final_model = lm(Crime ~ Po1 + Ineq + Ed + M + Prob + U2, data = df)
summary(final_model)
##
## Call:
## lm(formula = Crime ~ Po1 + Ineq + Ed + M + Prob + U2, data = df)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -470.68  -78.41  -19.68  133.12  556.23
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) -5040.50     899.84  -5.602 1.72e-06 ***
## Po1           115.02      13.75   8.363 2.56e-10 ***
## Ineq           67.65      13.94   4.855 1.88e-05 ***
## Ed            196.47      44.75   4.390 8.07e-05 ***
## M             105.02      33.30   3.154  0.00305 **
## Prob        -3801.84    1528.10  -2.488  0.01711 *
## U2             89.37      40.91   2.185  0.03483 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 200.7 on 40 degrees of freedom
## Multiple R-squared:  0.7659, Adjusted R-squared:  0.7307
## F-statistic: 21.81 on 6 and 40 DF,  p-value: 3.418e-11

As we can observe, all the stepwise techniques give the same result, and the variance explained by the model (adjusted R2) increased by roughly 2-3 percentage points while using a reduced number of variables.

11.1.2 Lasso Regression (alpha=1)

Lasso is also a technique for variable selection. I have used the glmnet package as suggested, and split the data into train and test sets to check model performance. I first ran cv.glmnet to find the optimal lambda value for the lasso (cv.glmnet fits glmnet internally), then predicted the test data using that optimal lambda. The resulting R2 on the test data was only about 0.077. The cross-validation plot suggests that around 6 attributes yield the lowest mean squared error for this model.

#removing all environment variables and importing libraries
rm(list = ls())
library(glmnet)
## Loading required package: Matrix
## Loaded glmnet 4.1-4

#importing data into a df
df = read.table("USCrime.txt", header = TRUE)
vars <- model.matrix(Crime ~ ., data = df)[, -1]
response <- df$Crime

# Splitting the data into test and train
set.seed(101)
train = sample(1:nrow(vars), nrow(vars)/2)
x_test = (-train)
y_test = response[x_test]

#For Lasso alpha=1, to find the optimal value of lambda
cv_lasso <- cv.glmnet(vars[train,], response[train], family = 'gaussian',
                      alpha = 1, nlambda = 25, type.measure = 'mse', nfolds = 3)
plot(cv_lasso)
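For reference, glmnet fits the penalised least-squares objective (1/2n)*RSS + lambda*[(1-alpha)/2*||beta||2^2 + alpha*||beta||1], so with alpha = 1 the pure L1 penalty is what shrinks some coefficients exactly to zero. A minimal sketch (assuming the cv_lasso object from the chunk above) for visualising which coefficients survive as lambda grows:

#sketch: coefficient paths of the underlying glmnet fit stored inside cv_lasso
plot(cv_lasso$glmnet.fit, xvar = "lambda", label = TRUE)  #each curve is one predictor
abline(v = log(cv_lasso$lambda.min), lty = 2)             #lambda chosen by cross-validation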
Looking at the coefficients of the model, we can see that the variables that are less relevant for explaining the variance have been shrunk to 0, and the model predicts using only the remaining coefficients.

# identifying best lambda
best_lam <- cv_lasso$lambda.min
best_lam
## [1] 22.84079

# predicting with the glmnet model at the best lambda value identified
pred <- predict(cv_lasso, s = best_lam, newx = vars[x_test,])
coef(cv_lasso, s = 'lambda.min')
## 16 x 1 sparse Matrix of class "dgCMatrix"
##                        s1
## (Intercept) -1203.2315772
## M              62.3809880
## So             62.1505537
## Ed             40.5885099
## Po1            56.3101897
## Po2             .
## LF              .
## M.F             .
## Pop             .
## NW              0.7876006
## U1              .
## U2              7.0318148
## Wealth          .
## Ineq            .
## Prob            .
## Time            5.8429018

final <- cbind(y_test, pred)
# Checking the first six obs
head(final)
##    y_test        s1
## 2    1635  908.6346
## 4    1969 1183.7273
## 5    1234  922.1215
## 7     963  722.0461
## 8    1555  948.5296
## 10    705  681.4241

#calculating the R2 value
actual <- y_test
rss <- sum((pred - actual) ^ 2)
tss <- sum((actual - mean(actual)) ^ 2)
rsq <- 1 - rss/tss
rsq
## [1] 0.07654224

11.1.3 Elastic Net (alpha=0.5)

Similar to the Lasso, for Elastic Net I used the glmnet package. After the usual train/test split, I ran cv.glmnet to get the optimal lambda value. The plot below shows that Elastic Net needs 12 variables to reach the lowest mean squared error. Using this model to predict the test data gives an R2 of about 0.216 on the test set.

#removing all environment variables and importing libraries
rm(list = ls())
library(glmnet)

#importing data into a df
df = read.table("USCrime.txt", header = TRUE)
inputs <- model.matrix(Crime ~ ., data = df)[, -1]
response <- df$Crime

# Splitting the data into test and train
set.seed(51)
train = sample(1:nrow(inputs), nrow(inputs)/2)
x_test = (-train)
y_test = response[x_test]
#For Elastic Net we need to find the optimal value of lambda
cv_elastic <- cv.glmnet(inputs[train,], response[train], family = 'gaussian',
                        alpha = 0.5, nlambda = 25, type.measure = 'mse', nfolds = 3)
plot(cv_elastic)

Looking at the coefficients of this model, we again see that the variables that are less relevant for explaining the variance have been shrunk to 0, and the model predicts using only the remaining coefficients.

# identifying best lambda
best_lam <- cv_elastic$lambda.min
best_lam
## [1] 19.99938

pred <- predict(cv_elastic, s = best_lam, newx = inputs[x_test,])
coef(cv_elastic, s = "lambda.min")
## 16 x 1 sparse Matrix of class "dgCMatrix"
##                       s1
## (Intercept) -6337.443614
## M              12.465075
## So              .
## Ed            152.864071
## Po1            45.910346
## Po2            26.195996
## LF            -62.382724
## M.F            39.254080
## Pop             .
## NW             15.773117
## U1           -176.699076
## U2            137.505163
## Wealth          .
## Ineq           43.290013
## Prob        -6669.213248
## Time           -3.926477

final <- cbind(y_test, pred)
# Checking the first six obs
head(final)
##    y_test        s1
## 3     578  505.5841
## 5    1234  982.3946
## 6     682  776.0849
## 7     963 1013.2904
## 9     856  671.9770
## 11   1674 1136.7451

#calculating the R2 value
actual <- y_test
rss <- sum((pred - actual) ^ 2)
tss <- sum((actual - mean(actual)) ^ 2)
rsq <- 1 - rss/tss
rsq
## [1] 0.2159214

In summary, stepwise regression and the Lasso did a better job of variable selection on this data set than Elastic Net. This is likely because of the highly correlated variables in the data set: Elastic Net is a combination of Lasso and Ridge, so it tends to keep groups of correlated predictors, which is why it retained most of the variables (12 of 15) to reach its lowest mean squared error.
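The claim about correlated predictors can be checked directly. This is an optional sketch, assuming the data frame is reloaded and the car package is installed; it inspects pairwise correlations among the predictors and the variance inflation factors of the full linear model.

#sketch: quantify multicollinearity among the predictors
df <- read.table("USCrime.txt", header = TRUE)
round(cor(df[, names(df) != "Crime"]), 2)  #pairwise correlations; look for pairs such as
                                           #Po1/Po2 with very high correlation
library(car)                               #assumption: car package installed
vif(lm(Crime ~ ., data = df))              #VIF well above 10 flags problematic collinearity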