Midterm Review Q2-3
University of Calgary
Department of Mathematics and Statistics
Statistics 429
Midterm Examination
Instructor: Dr. B. Cindy Sun
Nov. 1, 2022
Time: 2:00-3:15pm
Total Marks: 37
I.D. NUMBER
GIVEN NAME(S)
LAST NAME
EXAMINATION RULES
1. This is a closed book exam.
2. Students arriving late will not normally be admitted after one-half hour of the examination time has passed.
3. No candidate will be permitted to leave the examination room until one-half hour has elapsed after the opening of the examination, nor during the last 15 minutes of the examination. All candidates remaining during the last 15 minutes of the examination period must remain at their desks until their papers have been collected by an invigilator.
4. All enquiries and requests must be addressed to supervisors only.
5. Candidates are strictly cautioned against:
(a) speaking to other candidates or communicating with them under any circumstances whatsoever;
(b) bringing into the examination room any textbook, notebook or memorandum not authorized by the examiner;
(c) making use of calculators and/or portable computing machines not authorized by the instructor;
(d) leaving answer papers exposed to view;
(e) attempting to read other students’ examination papers.
The penalty for violation of these rules is suspension or expulsion or such other penalty as may be determined.
6. Candidates are requested to write on both sides of the page, unless the examiner has asked that the left half page be reserved for rough drafts or calculations.
7. Discarded matter is to be struck out and not removed by mutilation of the examination answer book.
8. Candidates are cautioned against writing in their answer books any matter extraneous to the actual answering of the question set.
9. A candidate must report to a supervisor before leaving the examination room.
10. Answer books must be handed to the supervisor-in-charge promptly when the signal is given. Failure to comply with this regulation will be cause for rejection of an answer paper.
11. If a student becomes ill or receives word of domestic affliction during the course of an examination, he/she should report at once to the Supervisor, hand in the unfinished paper and request that it be canceled. Thereafter, if illness is the cause, the student must go directly to University Health Services so that any subsequent application for a deferred examination may be supported by a medical certificate. An application for Deferred Final Examinations must be submitted to the Registrar by the date specified in the University Calendar. Should a student write an examination, hand in the paper for marking, and later report extenuating circumstances to support a request for cancellation of the paper and for another examination, such request will be denied.
12. Attempt all questions.
2. Many different interest groups, such as the lumber industry, ecologists, and foresters, benefit from being able to predict the volume of a tree just by knowing its diameter. One classic data set (shortleaf.txt), reported by C. Bruce and F. X. Schumacher in 1935, concerned the diameter (x, in inches) and volume (y, in cubic feet) of n = 70 shortleaf pines. https://online.stat.psu.edu/stat462/node/154/ (Total 7 marks)
leaf <- read.table("shortleaf.txt", header=T)
attach(leaf)
range(Diam)
## [1]  4.4 23.4
range(Vol)
## [1]   2.0 163.5
m1 <- lm(Vol ~ Diam)
summary(m1)
##
## Call:
## lm(formula = Vol ~ Diam)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -18.899  -4.768  -1.438   6.740  45.089
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) -41.5681     3.4269  -12.13   <2e-16 ***
## Diam          6.8367     0.2877   23.77   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.875 on 68 degrees of freedom
## Multiple R-squared:  0.8926, Adjusted R-squared:  0.891
## F-statistic: 564.9 on 1 and 68 DF,  p-value: < 2.2e-16
nDiam <- log(Diam)
nVol <- log(Vol)
m2 <- lm(nVol ~ nDiam)
summary(m2)
##
## Call:
## lm(formula = nVol ~ nDiam)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -0.3323 -0.1131  0.0267  0.1177  0.4280
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  -2.8718     0.1216  -23.63   <2e-16 ***
## nDiam         2.5644     0.0512   50.09   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1703 on 68 degrees of freedom
## Multiple R-squared:  0.9736, Adjusted R-squared:  0.9732
## F-statistic:  2509 on 1 and 68 DF,  p-value: < 2.2e-16
par(mfrow=c(2,2))
plot(m1$fitted.values, rstandard(m1), xlab="fitted values", ylab="Standardized residuals", main="m1")
plot(m2$fitted.values, rstandard(m2), xlab="fitted values", ylab="Standardized residuals", main="m2")
detach(leaf)
[Figure: two residual plots. Panel "m1": standardized residuals (axis from about -2 to 5) against fitted values (0 to 120). Panel "m2": standardized residuals (axis from about -2 to 2) against fitted values (1 to 5).]
(a) Write down models ‘m1’ and ‘m2’. (2 marks)
(b) Decide which model is better for inference, and explain how you decided. (2 marks)
(c) Interpret the slope estimate in the better model. (2 marks)
(d) Use the better model to predict the volume of a tree with a diameter of 10 inches. (1 mark)
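As a rough check on part (d), either fitted equation can be evaluated by hand from the coefficients printed in the question. The sketch below assumes, purely for illustration, that the log-log fit is the model chosen in part (b). The variable names (`b0_log`, `pred_log_model`, and so on) are not from the exam, and the plain `exp()` back-transformation applies no bias correction.

```r
# Coefficients copied from the two summary() outputs printed in the question.
b0_lin <- -41.5681; b1_lin <- 6.8367   # Vol ~ Diam
b0_log <- -2.8718;  b1_log <- 2.5644   # log(Vol) ~ log(Diam)

diam <- 10  # diameter in inches

# Straight-line model: the prediction is already in cubic feet.
pred_lin_model <- b0_lin + b1_lin * diam

# Log-log model: predict log(volume), then back-transform with exp().
# Assumption: no bias correction is applied to the back-transformed value.
pred_log_scale <- b0_log + b1_log * log(diam)
pred_log_model <- exp(pred_log_scale)

print(round(c(linear = pred_lin_model, loglog = pred_log_model), 2))
```

With the data file available, `predict()` on the fitted model object should give the same log-scale value up to the rounding of the printed coefficients.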
3. The data used below record the height of a shrub (height), the amount of bacteria in the soil (bacteria), and whether the shrub is located in partial or full sun (sun). Height is measured in cm. Bacteria=1 if there are 100 thousand bacteria per ml of soil, and bacteria=0 if there are no bacteria in the soil. Sun=0 if the plant is in partial sun, and sun=1 if the plant is in full sun. (Total 11 marks)
shrub <- read.csv("shrub.csv", header=T)
attach(shrub)
range(height[bacteria==1])
## [1] 2.4 6.1
range(height[bacteria==0])
## [1] 1.9 5.3
range(height[sun==1])
## [1] 4.0 6.1
range(height[sun==0])
## [1] 1.9 3.8
bacteria = as.factor(bacteria)
sun = as.factor(sun)
m1 <- lm(height ~ bacteria * sun)
summary(m1)
##
## Call:
## lm(formula = height ~ bacteria * sun)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.75000 -0.38333  0.04167  0.43750  0.76667
##
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)
## (Intercept)      2.4333     0.1957  12.435 7.23e-11 ***
## bacteria1        0.6000     0.2767   2.168   0.0424 *
## sun1             2.3167     0.2767   8.371 5.74e-08 ***
## bacteria1:sun1   0.1167     0.3914   0.298   0.7687
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4793 on 20 degrees of freedom
## Multiple R-squared:  0.8881, Adjusted R-squared:  0.8713
## F-statistic:  52.9 on 3 and 20 DF,  p-value: 1.08e-09
m2 <- lm(height ~ bacteria + sun)
summary(m2)
##
## Call:
## lm(formula = height ~ bacteria + sun)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.77917 -0.36250  0.02917  0.42500  0.73750
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   2.4042     0.1657   14.51 2.05e-12 ***
## bacteria1     0.6583     0.1914    3.44  0.00246 **
## sun1          2.3750     0.1914   12.41 3.91e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4688 on 21 degrees of freedom
## Multiple R-squared:  0.8876, Adjusted R-squared:  0.8769
## F-statistic: 82.91 on 2 and 21 DF,  p-value: 1.08e-10
par(mfrow=c(2,2))
plot(m1$fitted.values, rstandard(m1), xlab="fitted values", ylab="Standardized residuals", main="m1")
plot(m2$fitted.values, rstandard(m2), xlab="fitted values", ylab="Standardized residuals", main="m2")
anova(m1,m2)
## Analysis of Variance Table
##
## Model 1: height ~ bacteria * sun
## Model 2: height ~ bacteria + sun
##   Res.Df    RSS Df Sum of Sq      F Pr(>F)
## 1     20 4.5950
## 2     21 4.6154 -1 -0.020417 0.0889 0.7687
detach(shrub)
[Figure: two residual plots. Panel "m1": standardized residuals (axis from about -1.5 to 1.5) against fitted values (2.5 to 5.5). Panel "m2": standardized residuals (axis from about -1.5 to 1.5) against fitted values (2.5 to 5.5).]
(a) Write down models ‘m1’ and ‘m2’. (2 marks)
(b) Write down the null and alternative hypotheses involved in the ‘anova(m1,m2)’ output, and state a conclusion for this test. (3 marks)
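The F statistic in the ‘anova(m1,m2)’ table can be reproduced from the two residual sums of squares, which may help in seeing what the test in part (b) compares. The sketch below uses only numbers printed in the table; the variable names are illustrative and not part of the exam. Because the inputs are rounded, the result agrees with the table only to about three decimal places.

```r
# Values copied from the anova(m1, m2) table in the question.
rss_full    <- 4.5950   # height ~ bacteria * sun, 20 residual df
rss_reduced <- 4.6154   # height ~ bacteria + sun, 21 residual df
df_full     <- 20
df_diff     <- 1        # one parameter (the interaction) is dropped

# Partial F statistic: extra sum of squares per dropped parameter,
# divided by the residual mean square of the larger model.
f_stat  <- ((rss_reduced - rss_full) / df_diff) / (rss_full / df_full)

# Upper-tail probability of the F distribution with (1, 20) df.
p_value <- pf(f_stat, df_diff, df_full, lower.tail = FALSE)

print(round(c(F = f_stat, p = p_value), 4))
```

Since only one parameter is dropped, this F value should also be close to the square of the t value reported for `bacteria1:sun1` in `summary(m1)`.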
(c) Decide which model is better for inference, and explain how you decided. (2 marks)
(d) Based on the better model, explain the influence of the predictors on the response. (2 marks)
(e) Use the better model to predict the height of a shrub in full sun with 100 thousand bacteria per ml of soil, and the height of a shrub in partial sun with no bacteria in the soil. (2 marks)
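For part (e), a fitted value is the sum of the coefficients that are switched on by the 0/1 indicators. The sketch below assumes, for illustration only, that the additive fit `height ~ bacteria + sun` is the model chosen in part (c). The helper `predict_height` and the vector `coef_add` are hypothetical names, not part of the exam.

```r
# Coefficients copied from the summary() output of the additive fit
# height ~ bacteria + sun printed in the question.
coef_add <- c(intercept = 2.4042, bacteria1 = 0.6583, sun1 = 2.3750)

# Hypothetical helper: fitted mean height (cm) for 0/1 indicator values.
predict_height <- function(bacteria, sun) {
  unname(coef_add["intercept"] +
         coef_add["bacteria1"] * bacteria +
         coef_add["sun1"] * sun)
}

# Full sun with 100 thousand bacteria per ml: all three terms contribute.
full_sun_bacteria <- predict_height(bacteria = 1, sun = 1)

# Partial sun with no bacteria: only the intercept remains.
partial_sun_no_bacteria <- predict_height(bacteria = 0, sun = 0)

print(c(full_sun_bacteria, partial_sun_no_bacteria))
```

If the interaction model were chosen instead, the `bacteria1:sun1` estimate would be added to the first prediction as well.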