439HW5Sol_F23

Washington University in St Louis, Course 439 (Mathematics), Jan 9, 2024
MATH 439. Solutions HW 5.

Problem 1.

a) The reduced model forces expend and ratio to share a single coefficient, so H0 states that the coefficients of expend and ratio are equal.

> data(sat)
> reduced<-lm(math~I(expend+ratio)+salary,data=sat)
> full<-lm(math~expend+ratio+salary,data=sat)
> anova(reduced,full)
Analysis of Variance Table

Model 1: math ~ I(expend + ratio) + salary
Model 2: math ~ expend + ratio + salary
  Res.Df   RSS Df Sum of Sq      F Pr(>F)
1     47 65343
2     46 64834  1    508.19 0.3606 0.5511

Since the p-value is not small enough, we fail to reject H0 and conclude that the full model is not significantly better than the reduced model.

b) The reduced model forces the coefficients of ratio and salary to be equal in magnitude and opposite in sign, so H0 states that these two coefficients sum to zero.

> data(sat)
> reduced<-lm(math~expend+I(ratio-salary),data=sat)
> full<-lm(math~expend+ratio+salary,data=sat)
> anova(reduced,full)
Analysis of Variance Table

Model 1: math ~ expend + I(ratio - salary)
Model 2: math ~ expend + ratio + salary
  Res.Df   RSS Df Sum of Sq      F Pr(>F)
1     47 64976
2     46 64834  1    141.59 0.1005 0.7527

Since the p-value is not small enough, we fail to reject H0 and conclude that the full model is not significantly better than the reduced model.
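As a cross-check on the anova() output in part a), the F statistic can be recomputed by hand from the printed table. This is only a sketch: the numbers are copied from the rounded output above, and the variable names are illustrative, not part of the original solution.

```r
# Values copied from the anova() table in Problem 1 a)
rss_full <- 64834    # residual sum of squares of the full model
df_full  <- 46       # residual degrees of freedom of the full model
ss_extra <- 508.19   # RSS(reduced) - RSS(full), the "Sum of Sq" column
df_extra <- 1        # number of restrictions being tested

# Extra-sum-of-squares F statistic and its upper-tail p-value
f_stat  <- (ss_extra / df_extra) / (rss_full / df_full)
p_value <- pf(f_stat, df_extra, df_full, lower.tail = FALSE)

round(c(F = f_stat, p = p_value), 4)
```

The same computation applies to part b) after replacing ss_extra with 141.59.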
Problem 2.

a) We can then write the ellipse explicitly:

$$C = \{(\beta_0, \beta_2) \in \mathbb{R}^2 : 7.9727(5.62 - \beta_0)^2 + 91.1162(-1.327 - \beta_2)^2 + 2(12.5285)(5.62 - \beta_0)(-1.327 - \beta_2) \le 2(2.82)(2.49)\}$$

b) The Bonferroni adjustment replaces alpha with alpha/L, where L is the number of parameters we want to estimate jointly. In this case L = 2, and for two-sided intervals we need the critical value at upper-tail level 0.1/(2*2):

> qt(1-0.1/4,31)
[1] 2.039513

Then the intervals are:

$0.82 \pm 2.0395\sqrt{2.82(0.170)} \rightarrow (-0.592133,\ 2.232133)$
$-1.327 \pm 2.0395\sqrt{2.82(0.014)} \rightarrow (-1.732243,\ -0.9217574)$
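The Bonferroni intervals in part b) can be reproduced with a few lines of base R. This is a minimal sketch: the estimates, 2.82, 0.170, and 0.014 are copied from the solution, and reading the last two as the matching diagonal entries of (X'X)^{-1} is an assumption, as are the variable names.

```r
alpha <- 0.1
L     <- 2       # number of intervals that must hold jointly
df    <- 31      # residual degrees of freedom
mse   <- 2.82    # estimate of sigma^2 used in the solution

est <- c(0.82, -1.327)   # point estimates from part b)
cjj <- c(0.170, 0.014)   # assumed: matching diagonal entries of (X'X)^{-1}

# Bonferroni: split alpha over L two-sided intervals
t_crit <- qt(1 - alpha / (2 * L), df)
half   <- t_crit * sqrt(mse * cjj)

ci <- cbind(lower = est - half, upper = est + half)
ci
```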
c)

> qt(1-.1/8,31)
[1] 2.355568
> qchisq(1-.1/8,31)
[1] 51.25556
> qchisq(.1/8,31)
[1] 16.07876
Problem 3.

(a)

> X<-model.matrix(g)
> C<-solve(t(X)%*%X)
> D<-C[c(2,7),c(2,7)]
> solve(D)
           x1        x6
x1 3902.45440 68.177148
x6   68.17715  7.457435
> (f<-qf(.95,2,18))
[1] 3.554557
> sigma(g)^2
[1] 10.41115
> summary(g)

Call:
lm(formula = y ~ ., data = table.b3)

Residuals:
    Min      1Q  Median      3Q     Max
-5.3441 -1.6711 -0.4486  1.4906  5.2508

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 17.339838  30.355375   0.571   0.5749
x1          -0.075588   0.056347  -1.341   0.1964
x2          -0.069163   0.087791  -0.788   0.4411
x3           0.115117   0.088113   1.306   0.2078
x4           1.494737   3.101464   0.482   0.6357
x5           5.843495   3.148438   1.856   0.0799 .
x6           0.317583   1.288967   0.246   0.8082
x7          -3.205390   3.109185  -1.031   0.3162
x8           0.180811   0.130301   1.388   0.1822
x9          -0.397945   0.323456  -1.230   0.2344
x10         -0.005115   0.005896  -0.868   0.3971
x11          0.638483   3.021680   0.211   0.8350
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.227 on 18 degrees of freedom
  (2 observations deleted due to missingness)
Multiple R-squared: 0.8355, Adjusted R-squared: 0.7349
F-statistic: 8.31 on 11 and 18 DF, p-value: 5.231e-05

(b)

> g<-lm(y~.,data=table.b3)
> confint(g,level=1-0.05/2)
                  1.25 %      98.75 %
(Intercept) -56.87922409 91.558899681
x1           -0.21335587  0.062179526
x2           -0.28381101  0.145485669
x3           -0.10032074  0.330554961
x4           -6.08836036  9.077833615
x5           -1.85445496 13.541444016
x6           -2.83394765  3.469114242
x7          -10.80736493  4.396585230
x8           -0.13777439  0.499397229
x9           -1.18879757  0.392907146
x10          -0.01953174  0.009301153
x11          -6.74954265  8.026508114

(c)

> alpha=0.05
> delta=sqrt(2*qf(1-alpha,2,18))
> alpha.eff=2*(1-pt(delta,18))
> confint(g,level=1-alpha.eff)
                 0.787 %    99.213 %
(Intercept) -63.59646238 98.27613797
x1           -0.22582462  0.07464827
x2           -0.30323788  0.16491254
x3           -0.11981907  0.35005329
x4           -6.77467285  9.76414611
x5           -2.55116224 14.23815130
x6           -3.11917874  3.75434534
x7          -11.49538600  5.08460630
x8           -0.16660818  0.52823102
x9           -1.26037411  0.46448369
x10          -0.02083651  0.01060592
x11          -7.41820008  8.69516554
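The intervals in (b) and (c) differ only in the multiplier applied to each standard error. The sketch below computes both multipliers directly; it uses only base R, and the variable names (k, bonf, scheffe, alpha_eff) are illustrative rather than taken from the solution.

```r
alpha <- 0.05
k     <- 2    # parameters covered jointly (x1 and x6)
df    <- 18   # residual degrees of freedom from summary(g)

# Bonferroni multiplier: t quantile with alpha split over k two-sided intervals
bonf <- qt(1 - alpha / (2 * k), df)

# Scheffe multiplier: this is "delta" in part (c)
scheffe <- sqrt(k * qf(1 - alpha, k, df))

# Two-sided level at which an ordinary t interval uses the Scheffe multiplier
alpha_eff <- 2 * (1 - pt(scheffe, df))

round(c(bonferroni = bonf, scheffe = scheffe, alpha_eff = alpha_eff), 4)
```

Multiplying each value by the standard error of x1 (0.056347) gives half-widths of about 0.1378 and 0.1502, which match the x1 rows of the two confint() tables. The Scheffe multiplier is the larger one here, so the Bonferroni intervals are narrower.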
d) Using the R code below, we generate the plot:

library(ellipse)
plot(ellipse(g, c(2, 7)), type = "l", main = 'Confidence Region')
points(coef(g)[2], coef(g)[7], pch = 18)
# Regular CIs
abline(v = confint(g)[2, ], lty = 2)
abline(h = confint(g)[7, ], lty = 2)
# Bonferroni CIs
abline(v = confint(g, level = 1 - 0.05 / 2)[2, ], lty = 3, col = 'red')
abline(h = confint(g, level = 1 - 0.05 / 2)[7, ], lty = 3, col = 'red')
# Scheffe CIs
abline(v = confint(g, level = 1 - alpha.eff)[2, ], lty = 4, col = 'blue')
abline(h = confint(g, level = 1 - alpha.eff)[7, ], lty = 4, col = 'blue')
Problem 4.

Constant Variance:

Strong NonConstant Variance:

Comments: The residuals plot shows the variability around the x-axis increasing as the fitted value gets larger. The standardized residuals plot (3rd plot) also indicates that sigma may depend linearly on x. Interestingly, the R2 is relatively small even though the p-value for significance of regression is quite small. The QQ plot also shows some issues with normality, but this can be disregarded since, as we know, the errors are truly normally distributed.
Mild NonConstant Variance:

Comments: The residuals plot (first plot) shows the variability increasing as the fitted value gets larger. The standardized residuals plot (3rd plot) also shows that sigma depends on x. Interestingly, the R2 is now large, as opposed to the previous case. The QQ plot does not show significant issues with normality.

Nonlinearity:

Comments: The residuals plot shows the nonlinear pattern of the regression. The standardized residuals and QQ plots are fine, as they should be, since there are no issues with the error model assumptions here. As expected, the R2 and p-values are quite poor.
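The nonconstant variance pattern described in these comments can be illustrated with a small simulation. This is a hypothetical setup, not the data-generating code used in the assignment: the seed, sample size, coefficients, and the choice sd = x are all assumptions made for illustration.

```r
set.seed(439)   # arbitrary seed, for reproducibility only
n <- 500
x <- runif(n, 1, 10)

# Normal errors whose standard deviation grows linearly in x
y <- 1 + 2 * x + rnorm(n, mean = 0, sd = x)

fit <- lm(y ~ x)

# If the spread grows with the fitted value, |residual| and the
# fitted value are positively correlated
spread_cor <- cor(abs(resid(fit)), fitted(fit))

# Residual standard deviation in the lower and upper halves of x
sd_low  <- sd(resid(fit)[x <= median(x)])
sd_high <- sd(resid(fit)[x >  median(x)])

c(spread_cor = spread_cor, sd_low = sd_low, sd_high = sd_high)
```

Calling plot(fit) on such a fit shows the funnel shape in the residuals-versus-fitted and scale-location panels, even though the errors are exactly normal.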
Problem 5. Ex. 4.1 from MPV: This is because there seems to be some linear correlation between the errors and variable x2.
Note: For the last conclusion, the correlation between the errors and x2 seems mild, so concluding that it won’t improve significantly is fine as well.

Problem 6.

a)

b) The code below finds the hat matrix and the leverage of all the points:

> data(table.b3)
> mymodel<-lm(y~x1+x8,data=table.b3)
> X<-model.matrix(mymodel)
> H<-X%*%vcov(mymodel)%*%t(X)/sigma(mymodel)^2
> diag(H)
         1          2          3          4          5          6
0.04170355 0.04239733 0.06255061 0.04253231 0.07502672 0.35532964
         7          8          9         10         11         12
0.04400666 0.05657843 0.13464435 0.11491736 0.05241265 0.12545435
        13         14         15         16         17         18
0.06743573 0.10948015 0.08119224 0.04022419 0.13969058 0.15875219
        19         20         21         22         23         24
0.04824053 0.03393987 0.04400666 0.07890021 0.11171762 0.11491736
        25         26         27         28         29         30
0.09149888 0.13050830 0.08650701 0.13012246 0.09450602 0.09926926
        31         32
0.05504747 0.13648931

The point with the largest leverage is point 6: (x1, x8, y) = (440.0, 184.5, 11.20), with hmax = 0.3553.
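The identity behind the computation in b) is that vcov() returns sigma^2 (X'X)^{-1}, so dividing X vcov X' by sigma^2 leaves the hat matrix. The sketch below checks this on the built-in mtcars data, which is only a stand-in for table.b3 (the model mpg ~ wt + hp is an arbitrary choice); the 2p/n cutoff is a common rule of thumb and is an assumption here, since the solution does not state which cutoff it uses.

```r
# Built-in data, used only as a stand-in for table.b3
fit <- lm(mpg ~ wt + hp, data = mtcars)
X   <- model.matrix(fit)

# vcov(fit) = sigma^2 (X'X)^{-1}, so dividing by sigma^2 gives the hat matrix
H <- X %*% vcov(fit) %*% t(X) / sigma(fit)^2
h <- diag(H)

p <- ncol(X)   # number of coefficients, intercept included
n <- nrow(X)   # number of observations

all.equal(unname(h), unname(hatvalues(fit)))   # agrees with the built-in
sum(h)                                         # trace of H equals p
names(h)[h > 2 * p / n]                        # points above the 2p/n cutoff
```

For the model in b), p = 3 and n = 32, so this cutoff would be 2p/n = 0.1875, and point 6 (0.3553) is the only leverage above it.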
c)

d)

e) When applying the Bonferroni correction, we compare R-student to the t critical value at level alpha/(2n), still with n-p-1 degrees of freedom. This value is larger than the critical value in c). Since the largest R-student does not exceed the t-value of c), it does not exceed the t-value here either. Therefore, we conclude that there is no atypical observation.

f) The R output for the leverage is:

> # Verification
> lm.influence(mymodel)$hat
         1          2          3          4          5          6
0.04170355 0.04239733 0.06255061 0.04253231 0.07502672 0.35532964
         7          8          9         10         11         12
0.04400666 0.05657843 0.13464435 0.11491736 0.05241265 0.12545435
        13         14         15         16         17         18
0.06743573 0.10948015 0.08119224 0.04022419 0.13969058 0.15875219
        19         20         21         22         23         24
0.04824053 0.03393987 0.04400666 0.07890021 0.11171762 0.11491736
        25         26         27         28         29         30
0.09149888 0.13050830 0.08650701 0.13012246 0.09450602 0.09926926
        31         32
0.05504747 0.13648931

This is the R output for the studentized residuals:

> rstandard(g)
          1           2           3           4           5           6           7           8           9
 0.57292819 -0.05055391 -0.61214921  0.37496156 -0.98650013 -0.71628401 -0.22268216  0.04064724  1.79358391
         10          11          12          13          14          15          16          17          18
 0.43360011 -0.22048348  2.33686180 -1.37529946 -0.63292011 -2.27493766 -0.52427197  1.50345414  0.72099394
         19          20          21          22          23          24          25          26          27
 0.20899578 -0.73433532  0.24323034  1.61637231  0.60065010  0.94509677  0.78404364  0.47043307 -1.17059787
         28          29          30          31          32
 0.38957467 -1.06239901 -1.27744319 -0.95100714 -0.08171735

This is the R output for the R-student residuals:

> rstudent(g)
          1           2           3           4           5           6           7           8           9
 0.56617682 -0.04967684 -0.60542659  0.36933638 -0.98602805 -0.71013577 -0.21899644  0.03994141  1.86910420
         10          11          12          13          14          15          16          17          18
 0.42744650 -0.21683051  2.54869186 -1.39772951 -0.62625234 -2.46623906 -0.51761230  1.53847888  0.71489022
         19          20          21          22          23          24          25          26          27
 0.20551563 -0.72836698  0.23924408  1.66503209  0.59390910  0.94329960  0.77870451  0.46402495 -1.17841607
         28          29          30          31          32
 0.38380456 -1.06484892 -1.29210551 -0.94938803 -0.08030532
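The Bonferroni comparison described in e) can be sketched using the two largest R-student values from the output above. Two assumptions are made here: alpha = 0.05, which the solution does not state, and that the residuals belong to the 32-observation model y ~ x1 + x8 of part b), so that p = 3. Variable names are illustrative.

```r
# R-student values with the largest magnitude in the rstudent() output
t_obs <- c(`12` = 2.54869186, `15` = -2.46623906)

n     <- 32     # observations
p     <- 3      # assumed: coefficients in y ~ x1 + x8, intercept included
alpha <- 0.05   # assumed level; not stated in the solution

# Bonferroni critical value: alpha split over n two-sided tests
crit_bonf <- qt(1 - alpha / (2 * n), df = n - p - 1)

# TRUE would flag an observation as atypical
flagged <- abs(t_obs) > crit_bonf

crit_bonf
flagged
```

Under these assumptions neither value is flagged, in line with the conclusion in e).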
Problem 7. We can also look at the qq-normal plot.