ISyE 4031 Homework 6 Solution
Spring 2021
5.9
I. II. III.
##
## Call:
## lm(formula = ServTime ~ Desktops)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -83.505 -10.953  -3.453  12.043  56.510
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   14.431     17.631   0.818    0.428
## Desktops      22.007      3.114   7.068 8.44e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 34.91 on 13 degrees of freedom
## Multiple R-squared:  0.7935, Adjusted R-squared:  0.7776
## F-statistic: 49.96 on 1 and 13 DF,  p-value: 8.439e-06
[Figure: a. Scatter Plot (ServTime vs. Desktops); b. Residual Plot (Residuals vs. Desktops)]
In the scatter plot, the points fan out as the number of desktops increases: the service time varies more when more desktops are being serviced. The residual plot shows the same fan shape, so the variation of the residuals is greater for larger numbers of desktops.
IV.
##
## Call:
## lm(formula = ServTime ~ Desktops)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -83.505 -10.953  -3.453  12.043  56.510
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   14.431     17.631   0.818    0.428
## Desktops      22.007      3.114   7.068 8.44e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 34.91 on 13 degrees of freedom
## Multiple R-squared:  0.7935, Adjusted R-squared:  0.7776
## F-statistic: 49.96 on 1 and 13 DF,  p-value: 8.439e-06
##
## Call:
## lm(formula = ST1 ~ Desktops)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -3.2652 -0.6267  0.1362  0.7809  2.1009
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   5.8630     0.7169   8.178 1.76e-06 ***
## Desktops      0.9690     0.1266   7.654 3.61e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.42 on 13 degrees of freedom
## Multiple R-squared:  0.8184, Adjusted R-squared:  0.8044
## F-statistic: 58.59 on 1 and 13 DF,  p-value: 3.615e-06
##
## Call:
## lm(formula = ST2 ~ Desktops)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.46930 -0.07589  0.02233  0.11031  0.31801
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  2.50004    0.11238  22.247 9.87e-12 ***
## Desktops     0.14747    0.01985   7.431 4.97e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2225 on 13 degrees of freedom
## Multiple R-squared:  0.8094, Adjusted R-squared:  0.7948
## F-statistic: 55.22 on 1 and 13 DF,  p-value: 4.967e-06
##
## Call:
## lm(formula = ST3 ~ Desktops)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.62770 -0.07817  0.03676  0.17841  0.39872
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  3.74070    0.15145  24.699 2.61e-12 ***
## Desktops     0.18284    0.02675   6.836 1.19e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2999 on 13 degrees of freedom
## Multiple R-squared:  0.7824, Adjusted R-squared:  0.7656
## F-statistic: 46.73 on 1 and 13 DF,  p-value: 1.195e-05
IV.1 Normal Probability Plots
[Figure: normal probability plots of the residuals (sample quantiles vs. Theoretical Quantiles) for the responses $y$, $y^{0.5}$, $y^{0.25}$, and $\ln(y)$]
## AD Test for Residuals - y as response
##
##  Anderson-Darling normality test
##
## data:  resid(model_Service)
## A = 0.45165, p-value = 0.2352
## AD Test for Residuals - sqrt(y) as response
##
##  Anderson-Darling normality test
##
## data:  resid(model_Service1)
## A = 0.26974, p-value = 0.6253
## AD Test for Residuals - y^0.25 as response
##
##  Anderson-Darling normality test
##
## data:  resid(model_Service2)
## A = 0.42469, p-value = 0.2758
## AD Test for Residuals - ln(y) as response
##
##  Anderson-Darling normality test
##
## data:  resid(model_Service3)
## A = 0.54869, p-value = 0.1303
In the normal probability plots above, the residuals of all four models appear approximately normal. However, the plot for the square root transformation follows the line most closely at both ends, while each of the other three plots has a few points that deviate noticeably from the line.
Results of Anderson-Darling Test on Residuals
Transformation    p-value
y                 0.2352
y^0.5             0.6253
y^0.25            0.2758
ln(y)             0.1303
All of the p-values are greater than 0.1, so we fail to reject normality of the residuals for any of the 4 models. However, the square root transformation has the largest p-value, while the natural log transformation has the smallest p-value, which is only slightly greater than 0.1.
IV.2 Residual Plots
[Figure: residuals (e) vs. Desktops for the responses $y$, $y^{0.5}$, $y^{0.25}$, and $\ln(y)$]
In the residual plots above, the constant variance assumption is still not fully satisfied, but the transformations of ServTime do make the residuals more evenly spread, whereas the residual plot for the original model shows a strong fan-out pattern.
[Note] $R^2$ Comparisons
We list all the $R^2$ values in the following table. The model with the square root transformation on ServTime has the highest $R^2$. This is also an indication that the square root transformation works better than the original model as well as the other transformation models.
Transformation    R^2       Max
y                 0.7935
y^0.5             0.8184    *
y^0.25            0.8094
ln(y)             0.7824
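The comparison in part IV can be sketched in R roughly as follows. The homework data set is not reproduced in this solution, so the `Desktops` and `ServTime` values below are simulated placeholders (an assumption made only so the sketch runs), and `shapiro.test()` from base R stands in for the Anderson-Darling test reported above. The names `responses`, `fits`, and `comparison` are illustrative.

```r
# Simulated stand-in data: NOT the homework data set.
set.seed(1)
Desktops <- rep(1:10, length.out = 15)
ServTime <- (6 + Desktops + rnorm(15, sd = 1.4))^2  # spread grows with Desktops

# The four responses compared in part IV.
responses <- list(
  y         = ServTime,
  sqrt_y    = sqrt(ServTime),
  y_quarter = ServTime^0.25,
  ln_y      = log(ServTime)
)

# Fit one simple linear regression per response.
fits <- lapply(responses, function(resp) lm(resp ~ Desktops))

# Collect R-squared and a residual normality p-value for each fit.
# shapiro.test() is a swapped-in substitute for the Anderson-Darling test.
comparison <- data.frame(
  R2          = sapply(fits, function(m) summary(m)$r.squared),
  normality_p = sapply(fits, function(m) shapiro.test(resid(m))$p.value)
)
print(comparison)
```

With the real data, replacing the two simulated vectors should reproduce the $R^2$ column of the table above; the normality p-values would differ because a different test is used.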
5.15
a.
[Figure: Normal Q-Q Plot of the residuals (Sample Quantiles vs. Theoretical Quantiles); Residuals vs. Fitted; Residuals vs. BedDays; Residuals vs. Length]
b.
## Hat Values
##          1          2          3          4          5          6          7
## 0.12075307 0.23505170 0.12968114 0.15876352 0.08688092 0.11440248 0.08606574
##          8          9         10         11         12         13         15
## 0.08353573 0.08762352 0.13502742 0.08333981 0.17799893 0.06633064 0.71443718
##         16         17
## 0.78675138 0.93335684
## 2*(k+1)/n
## [1] 0.5
## Hat Values > 0.5
##        15        16        17
## 0.7144372 0.7867514 0.9333568
$\frac{2 \times (k+1)}{n} = \frac{8}{16} = 0.5 \rightarrow$ Hospitals 15, 16, and 17 are outliers with respect to their $x$ values.
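The leverage check can be reproduced from the printed hat values alone. The sketch below copies those values from the output above and assumes $k = 3$ predictors (Xray, BedDays, Length) and $n = 16$ hospitals, as implied by $8/16$; the variable names are illustrative.

```r
# Hat values copied from the output above (hospital 14 is not in this fit).
hat_values <- c(
  `1` = 0.12075307, `2` = 0.23505170, `3` = 0.12968114, `4` = 0.15876352,
  `5` = 0.08688092, `6` = 0.11440248, `7` = 0.08606574, `8` = 0.08353573,
  `9` = 0.08762352, `10` = 0.13502742, `11` = 0.08333981, `12` = 0.17799893,
  `13` = 0.06633064, `15` = 0.71443718, `16` = 0.78675138, `17` = 0.93335684
)

k <- 3                    # assumed number of predictors
n <- length(hat_values)   # 16 hospitals

# Rule of thumb used in the solution: leverage above 2(k+1)/n is large.
threshold <- 2 * (k + 1) / n
high_leverage <- hat_values[hat_values > threshold]

print(threshold)
print(high_leverage)

# Sanity check: hat values always sum to the number of coefficients, k + 1.
print(sum(hat_values))
```

The sum of the hat values comes out at essentially 4, which is consistent with a model that has an intercept and three predictors.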
c.
## Studentized Deleted Residuals
##          1          2          3          4          5          6          7
## -0.3329753  0.4035826  0.1607065  1.2335524  0.4249297 -0.7952567  0.6766342
##          8          9         10         11         12         13         15
##  1.1170641 -1.0782642 -1.3590574  1.4611929 -2.2241117 -0.6851192 -0.1374642
##         16         17
##  1.2537188  0.5966190
## t_{0.025,16-(3+2)}
## [1] 2.200985
## |SDR|>2.200985
##        12
## -2.224112
Hospital 12 is an outlier with respect to its $y$ value since its Studentized Deleted Residual (-2.224112) is less than $-t^{(11)}_{[0.025]} = -2.200985$.
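The cutoff and the flagged hospital can be recomputed from the printed residuals. This is a minimal sketch that copies the studentized deleted residuals from the output above and takes $n = 16$ and $k = 3$ from the label `t_{0.025,16-(3+2)}`; the names `sdr`, `cutoff`, and `outliers` are illustrative.

```r
# Studentized deleted residuals copied from the output above.
sdr <- c(
  `1` = -0.3329753, `2` = 0.4035826, `3` = 0.1607065, `4` = 1.2335524,
  `5` = 0.4249297, `6` = -0.7952567, `7` = 0.6766342, `8` = 1.1170641,
  `9` = -1.0782642, `10` = -1.3590574, `11` = 1.4611929, `12` = -2.2241117,
  `13` = -0.6851192, `15` = -0.1374642, `16` = 1.2537188, `17` = 0.5966190
)

n <- length(sdr)  # 16 hospitals
k <- 3            # predictors in the model

# Two-sided 5% cutoff with n - k - 2 = 11 degrees of freedom.
cutoff <- qt(1 - 0.025, df = n - k - 2)

# Observations whose |SDR| exceeds the cutoff are flagged as y-outliers.
outliers <- sdr[abs(sdr) > cutoff]

print(cutoff)
print(outliers)
```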
d.
## Cooks Distance
##           1           2           3           4           5           6
## 0.004111350 0.013450584 0.001047070 0.068803012 0.004609869 0.021069992
##           7           8           9          10          11          12
## 0.011288649 0.027859603 0.027541648 0.067330758 0.044335089 0.201515928
##          13          15          16          17
## 0.008722368 0.012871368 1.383805243 1.316994321
## F0.5
## [1] 0.8884783
## F0.8
## [1] 0.4073066
## CooksD>F0.5
##       16       17
## 1.383805 1.316994
Hospitals 16 and 17 are influential since their Cook's Distances are greater than $F^{(4,12)}_{[0.5]} = 0.8884783$.
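The same comparison can be sketched directly from the printed Cook's distances. The values are copied from the output above; the degrees of freedom $(k+1,\ n-(k+1)) = (4, 12)$ follow the text, and the variable names are illustrative.

```r
# Cook's distances copied from the output above (model without hospital 14).
cooks_d <- c(
  `1` = 0.004111350, `2` = 0.013450584, `3` = 0.001047070, `4` = 0.068803012,
  `5` = 0.004609869, `6` = 0.021069992, `7` = 0.011288649, `8` = 0.027859603,
  `9` = 0.027541648, `10` = 0.067330758, `11` = 0.044335089, `12` = 0.201515928,
  `13` = 0.008722368, `15` = 0.012871368, `16` = 1.383805243, `17` = 1.316994321
)

n <- length(cooks_d)  # 16 hospitals
k <- 3                # predictors in the model

# Median of the F distribution with (k + 1, n - (k + 1)) degrees of freedom.
f_median <- qf(0.5, df1 = k + 1, df2 = n - (k + 1))

# Observations with Cook's D above the F median are treated as influential.
influential <- cooks_d[cooks_d > f_median]

print(f_median)
print(influential)
```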
e.
## Cooks Distance of Hospital 16 for model without Hospital 14
##       16
## 1.383805
## Cooks Distance of Hospital 17 for model with Hospital 14 (Full model)
##      17
## 5.03294
Yes. Cook's D for hospital 16 when hospital 14 is removed is 1.383805, which is considerably less than 5.033 for hospital 17 when hospital 14 is included. We compare the most influential point in each of the two models and find that, after removing hospital 14, the most influential point (Hospital 16) becomes less influential than the most influential point (Hospital 17) in the original model.
5.16
a.
##
## Call:
## lm(formula = Hours ~ Xray + BedDays + Length + DL)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -485.91 -204.70   68.77  183.74  727.16
##
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2462.21640  501.98970   4.905 0.000363 ***
## Xray           0.04816    0.01193   4.037 0.001649 **
## BedDays        0.78432    0.07331  10.699 1.72e-07 ***
## Length      -432.40947   93.35426  -4.632 0.000578 ***
## DL          2871.78284  573.06176   5.011 0.000304 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 363.9 on 12 degrees of freedom
## Multiple R-squared:  0.9968, Adjusted R-squared:  0.9957
## F-statistic: 931.2 on 4 and 12 DF,  p-value: 7.656e-15
$$D_L = \begin{cases} 1, & \text{Large Hospital} \\ 0, & \text{Non-Large Hospital} \end{cases}$$
The mean monthly labor hours for a large hospital exceed those for a small (not large) hospital by an estimated 2871.7828 hours when the values of the other variables remain the same. Since the p-value = 0.0003 < 0.001, we have very strong evidence that $\beta_4$, the coefficient of $D_L$, is different from 0.
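To see why the coefficient of $D_L$ is read as a difference in means, the two mean functions can be written side by side. The labels $\beta_0, \dots, \beta_4$ and $b_4$ below are notation assumed for this sketch, with $\beta_4$ multiplying $D_L$ and $b_4$ its estimate from the output above.

```latex
\begin{aligned}
E[y \mid D_L = 1] &= \beta_0 + \beta_1 \cdot \text{Xray} + \beta_2 \cdot \text{BedDays} + \beta_3 \cdot \text{Length} + \beta_4 \\
E[y \mid D_L = 0] &= \beta_0 + \beta_1 \cdot \text{Xray} + \beta_2 \cdot \text{BedDays} + \beta_3 \cdot \text{Length} \\
E[y \mid D_L = 1] - E[y \mid D_L = 0] &= \beta_4, \qquad b_4 = 2871.78284 \\
t &= \frac{b_4}{s_{b_4}} = \frac{2871.78284}{573.06176} \approx 5.011
\end{aligned}
```

Holding Xray, BedDays, and Length fixed, every other term cancels in the difference, which leaves only $\beta_4$.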
b.
## SDR for Hospital 14
##       14
## 1.405802
## t_(0.025,11)
## [1] 2.200985
Because $|1.405802| < 2.200985$, we don't have evidence that Hospital 14 is an outlier with respect to its $y$ value.
c.
## Cooks Distance with Dummy variable - DL
##            1            2            3            4            5            6
## 0.0699061232 0.0034984938 0.0223768125 0.0022035895 0.0009130147 0.0525707595
##            7            8            9           10           11           12
## 0.0074281942 0.0184548837 0.0049568588 0.0092883630 0.1152212446 0.0245065598
##           13           14           15           16           17
## 0.0074872707 0.8768968657 1.4216976841 0.8978682961 0.7377364159
## Cooks Distance of Hospital 17 in the original model
##      17
## 5.03294
Hospital 15 now has the largest Cook's D, 1.4216976841. Yes, this is much smaller than 5.03294, the Cook's D of Hospital 17 in the original model, so Hospital 15 is less influential. In addition, after adding the dummy variable, Hospital 17 is no longer influential (its Cook's D is reduced from 5.03294 to 0.7377364159).
d.
h_df = data.frame(Xray=56194, BedDays=14077.88, Length=6.89, DL=1)
PI1 = predict(model_hospital, newdata=h_df, interval="prediction")
print(PI1)
##        fit      lwr      upr
## 1 16064.55 14510.96 17618.15
PI1[3] - PI1[2]
## [1] 3107.196
PI2 = predict(model_hospital_14, newdata=h_df, interval="prediction")
print(PI2)
##        fit      lwr      upr
## 1 15896.25 14906.24 16886.26
PI2[3] - PI2[2]
## [1] 1980.022
PI3 = predict(model_hospital_dummy, newdata=h_df, interval="prediction")
print(PI3)
##        fit      lwr      upr
## 1 16102.53 15175.04 17030.01
PI3[3] - PI3[2]
## [1] 1854.973
• n = 17, No Dummy (original model): 17618.15 - 14510.96 = 3107.19
• n = 16, No Dummy (removed Hospital 14): 16886.26 - 14906.24 = 1980.02
• n = 17, Dummy: 17030.01 - 15175.04 = 1854.97
The model with the dummy variable, fitted to all 17 hospitals, gives the shortest PI.
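The width comparison can be tabulated in one step. The sketch below only reuses the `lwr` and `upr` values printed above; the data frame `intervals` and its labels are illustrative names, not objects from the original script.

```r
# Prediction interval limits copied from the predict() output above.
intervals <- data.frame(
  model = c("original (n = 17)",
            "hospital 14 removed (n = 16)",
            "dummy DL (n = 17)"),
  lwr = c(14510.96, 14906.24, 15175.04),
  upr = c(17618.15, 16886.26, 17030.01),
  stringsAsFactors = FALSE
)

# Width of each 95% prediction interval.
intervals$width <- intervals$upr - intervals$lwr

# Model with the narrowest interval.
shortest <- intervals$model[which.min(intervals$width)]

print(intervals)
print(shortest)
```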
f.
summary(model_hospital)
##
## Call:
## lm(formula = Hours ~ Xray + BedDays + Length, data = input)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -687.40 -380.60  -25.03  281.91 1630.50
##
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1523.38924  786.89772   1.936   0.0749 .
## Xray           0.05299    0.02009   2.637   0.0205 *
## BedDays        0.97848    0.10515   9.305 4.12e-07 ***
## Length      -320.95083  153.19222  -2.095   0.0563 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 614.8 on 13 degrees of freedom
## Multiple R-squared:  0.9901, Adjusted R-squared:  0.9878
## F-statistic:   432 on 3 and 13 DF,  p-value: 2.894e-13
summary(model_hospital_14)
##
## Call:
## lm(formula = Hours ~ Xray + BedDays + Length, data = input_minus_14)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -677.23 -270.19   60.93  228.32  517.70
##
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1946.80204  504.18193   3.861  0.00226 **
## Xray           0.03858    0.01304   2.958  0.01197 *
## BedDays        1.03939    0.06756  15.386 2.91e-09 ***
## Length      -413.75780   98.59828  -4.196  0.00124 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 387.2 on 12 degrees of freedom
## Multiple R-squared:  0.9961, Adjusted R-squared:  0.9952
## F-statistic:  1028 on 3 and 12 DF,  p-value: 9.919e-15
summary(model_hospital_dummy)
##
## Call:
## lm(formula = Hours ~ Xray + BedDays + Length + DL)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -485.91 -204.70   68.77  183.74  727.16
##
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2462.21640  501.98970   4.905 0.000363 ***
## Xray           0.04816    0.01193   4.037 0.001649 **
## BedDays        0.78432    0.07331  10.699 1.72e-07 ***
## Length      -432.40947   93.35426  -4.632 0.000578 ***
## DL          2871.78284  573.06176   5.011 0.000304 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 363.9 on 12 degrees of freedom
## Multiple R-squared:  0.9968, Adjusted R-squared:  0.9957
## F-statistic: 931.2 on 4 and 12 DF,  p-value: 7.656e-15
The best model for evaluating hospitals appears to be:
$$y = \beta_0 + \beta_1 \cdot \text{Xray} + \beta_2 \cdot \text{BedDays} + \beta_3 \cdot \text{Length} + \beta_4 \cdot D_L + \varepsilon$$
using estimation from all 17 hospitals.
• It has small p-values (<0.01) for all independent variables.
• It has the largest adjusted $R^2$ (0.9957).
• It has the shortest Prediction Interval and the smallest $s$.
• The influence of the individual hospitals on the estimates is relatively low.
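As a quick consistency check on the chosen model, the point prediction from part d can be recomputed by hand from the printed coefficients. The sketch below uses the rounded estimates from the summary and the predictor values from `h_df`; the names `coefs`, `x_new`, and `fit_by_hand` are illustrative.

```r
# Rounded coefficient estimates copied from summary(model_hospital_dummy).
coefs <- c(
  `(Intercept)` = 2462.21640,
  Xray          = 0.04816,
  BedDays       = 0.78432,
  Length        = -432.40947,
  DL            = 2871.78284
)

# Predictor values from h_df in part d, with a leading 1 for the intercept.
x_new <- c(1, 56194, 14077.88, 6.89, 1)

# Point prediction: sum of coefficient times predictor value.
fit_by_hand <- sum(coefs * x_new)
print(fit_by_hand)
```

The result lands within a few hundredths of the `fit` value 16102.53 reported by `predict()`; the small gap comes from using coefficients rounded to five decimal places.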