School: University of Ottawa
Course: 2379
Subject: Mathematics
Date: Jan 9, 2024
MAT 2379A
Sample Final Exam
(with Solutions)
Professor Hai Yan Liu
Time: 3 hours
Student Number:
Seat Number:
Family Name:
First Name:
Cellular phones, smart watches, unauthorized electronic devices, or course notes are not allowed during this
exam. Phones and devices must be turned off and put away in your bag. Do not keep them in your possession,
such as in your pockets. If caught with such a device or document, the following may occur: academic fraud
allegations will be filed which may result in you obtaining a 0 (zero) for the exam.
By signing below, you
acknowledge that you have ensured that you are complying with the above statement.
Signature:
********************************************************************************************
This is a closed book examination.
A formula sheet and some statistical tables will be distributed with your exam.
Only Faculty standard calculators are permitted: TI30, TI34, Casio fx-260, Casio fx-300.
The exam consists of 13 multiple choice questions and 6 long answer questions.
Each multiple choice question is worth 5 marks and each long answer question is worth 10 marks.
The total number of marks is 125.
NOTE: At the end of the examination, hand in the entire booklet.
You can keep the
formula sheet and the tables.
********************************************************************************************
For professor’s use:
Number of marks
Total for all MC Questions
Long Answer Question 1
Long Answer Question 2
Long Answer Question 3
Long Answer Question 4
Long Answer Question 5
Long Answer Question 6
Total
Part 1: Multiple Choice Questions
Record your answer to the multiple choice questions in the table below:
Question | Answer | Question | Answer
1 |  | 8 |
2 |  | 9 |
3 |  | 10 |
4 |  | 11 |
5 |  | 12 |
6 |  | 13 |
7 |  |  |
1. The Bacillus Calmette-Guérin (BCG) vaccine for tuberculosis (TB) is mandatory for school-age
children in many European countries. In Canada, before BCG vaccination, the patient is tested for
TB using a tuberculin skin test, called the Mantoux test. People who have been BCG vaccinated
will often have a positive Mantoux test result, although they may not have TB. Therefore, the
Mantoux test is not a very efficient tool for detecting TB. In a recent study, 12% of the subjects
had a positive Mantoux test result. Among those with a positive test result, only 10% had TB.
On the other hand, 1% of the patients with a negative test result also had TB. What was the
percentage of patients with TB in this study?
A) 1.10%
B) 2.08%
C) 0.88%
D) 1.20%
E) 13.03%
Solution (Sections 3.2-3.3) We denote by TB the event that a randomly selected person in this
group has tuberculosis. By the total probability rule,
$$P(\text{TB}) = P(\text{TB} \mid \text{Test}+)\,P(\text{Test}+) + P(\text{TB} \mid \text{Test}-)\,P(\text{Test}-) = (0.10)(0.12) + (0.01)(0.88) = 0.0208$$
The answer is B.
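The total probability calculation can be checked numerically. The following is a minimal sketch in Python (the exam itself uses R); the variable names are illustrative and not part of the original solution.

```python
# Law of total probability for the Mantoux example (figures from the question).
p_pos = 0.12            # P(Test+)
p_tb_given_pos = 0.10   # P(TB | Test+)
p_tb_given_neg = 0.01   # P(TB | Test-)

# P(TB) = P(TB | Test+) P(Test+) + P(TB | Test-) P(Test-)
p_tb = p_tb_given_pos * p_pos + p_tb_given_neg * (1 - p_pos)
print(f"P(TB) = {p_tb:.4f}")
```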
2. The intraocular pressure is the fluid pressure inside the eye. Glaucoma is an eye disease that is
manifested by high intraocular pressure.
The distribution of intraocular pressure in the general
population is approximately normal with mean 16 mm Hg and standard deviation 3 mm Hg. The
normal range for intraocular pressure is considered to be between 12 mm Hg and 20 mm Hg
(including these values). Which one of the following commands in R gives the probability that a
randomly chosen person has normal intraocular pressure? (Only one answer is correct.)
A) qnorm(20,16,3)-qnorm(12,16,3)
B) pnorm(20,3,16)-pnorm(12,3,16)
C) pnorm(20,16,3)-pnorm(12,16,3)
D) pnorm(20,16,3)-pnorm(11,16,3)
E) pnorm(20,16,9)-pnorm(12,16,9)
Solution (Section 5.2) We wish to calculate $P(12 \le X \le 20)$, where $X$ has a normal distribution
with mean $\mu = 16$ and standard deviation $\sigma = 3$. This probability is:
$$P(12 \le X \le 20) = P(X \le 20) - P(X < 12) = P(X \le 20) - P(X \le 12) = \texttt{pnorm}(20, 16, 3) - \texttt{pnorm}(12, 16, 3)$$
We used the fact that $P(X < 12) = P(X \le 12)$, since $X$ is a continuous random variable. The
answer is C. (The incorrect answer D is obtained using $P(X < 12) = P(X \le 11)$, which would
be true if $X$ was a discrete random variable.)
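For readers who want the numerical value, here is a small sketch using Python's standard library in place of R. `NormalDist(mu, sigma)` takes the standard deviation as its second parameter, just as `pnorm` does; the variable names are illustrative.

```python
from statistics import NormalDist

# Intraocular pressure: normal with mean 16 mm Hg and standard deviation 3 mm Hg.
pressure = NormalDist(mu=16, sigma=3)

# P(12 <= X <= 20) = F(20) - F(12), the analogue of pnorm(20,16,3) - pnorm(12,16,3).
p_normal_range = pressure.cdf(20) - pressure.cdf(12)
print(round(p_normal_range, 4))
```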
3. Aboriginal people in Canada have a higher risk of developing many chronic diseases compared with
the rest of the population. In a particular Aboriginal community, 16% of the population has
tuberculosis, 20% have diabetes and 8% have both diseases. What is the probability that a randomly
selected individual in this community does not have either one of the two diseases?
A) 0.72
B) 0.28
C) 0.64
D) 0.85
E) 0.90
Solution (Section 2.2) Let $A$ be the event that the person has tuberculosis and $B$ the event that
the person has diabetes. We know that $P(A) = 0.16$, $P(B) = 0.20$ and $P(A \cap B) = 0.08$. By
the addition rule,
$$P(A \cup B) = P(A) + P(B) - P(A \cap B) = 0.16 + 0.20 - 0.08 = 0.28.$$
The probability that the person does not have either one of the two diseases is:
$$P(A' \cap B') = 1 - P(A \cup B) = 1 - 0.28 = 0.72$$
The answer is A.
4. In biochemistry and pharmacology, a receptor is a protein molecule usually found embedded within
the plasma membrane surface of a cell that receives chemical signals from outside the cell. A
sample of 109 cells was found to contain an average of 1203 fmol receptors per milligram of membrane
protein, with standard deviation 192 fmol. (An fmol is equal to $10^{-15}$ moles.) Using this data,
give a 95% confidence interval for the average amount (in fmols) of receptors per milligram found
in the membrane protein of these cells.
A) [1077.31; 1329.72]
B) [1153.83; 1252.21]
C) [0; 1322.82]
D) [1166.96; 1239.05]
E) [1098.13; 1308.95]
Solution (Section 8.1) We denote by $\mu$ the average amount of receptors per milligram of membrane
protein. This is a large sample. The 95% confidence interval for $\mu$ is:
$$1203 \pm 1.96\,\frac{192}{\sqrt{109}} = 1203 \pm 36.04 = [1166.96; 1239.05]$$
The answer is D.
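The large-sample interval can be reproduced with a short Python sketch. It obtains the normal quantile from `NormalDist().inv_cdf(0.975)` (about 1.96) rather than reading it from a table; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, xbar, s = 109, 1203.0, 192.0

# Two-sided 95% interval: the critical value is the 0.975 quantile of N(0, 1).
z = NormalDist().inv_cdf(0.975)
half_width = z * s / sqrt(n)
ci_low, ci_high = xbar - half_width, xbar + half_width
print(f"[{ci_low:.2f}; {ci_high:.2f}]")
```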
5. The following data gives the birth weights (in ounces) for 6 consecutive deliveries at the Civic
Hospital. Assuming that the birth weights follow a normal distribution, find a 90% confidence
interval for the average birth weight $\mu$.
97  117  140  78  99  148
A) [91.0; 135.4]
B) [84.8; 141.5]
C) [91.6; 134.8]
D) [95.0; 131.3]
E) [92.3; 133.6]
Solution (Section 8.2) The sample mean and sample standard deviation for this sample are:
$$\bar{x} = \frac{1}{6}\sum_{i=1}^{6} x_i = 113.1667, \qquad s = \sqrt{\frac{1}{5}\sum_{i=1}^{6}(x_i - \bar{x})^2} = 27.00679.$$
This is a small sample. A 90% confidence interval for $\mu$ is based on the $T$ distribution with
$6 - 1 = 5$ degrees of freedom. For this level of confidence, the probability at the right of the point
$t$ is $0.05$. Table 18.4 gives the value $t = 2.015$. Therefore, the 90% confidence interval for $\mu$ is
$$113.1667 \pm 2.015\,\frac{27.00679}{\sqrt{6}} = [90.950; 135.383]$$
The answer is A. The incorrect answers B, C and D are obtained using the wrong values
$t = 2.571$, $t = 1.96$ and $t = 1.645$, respectively.
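A quick numerical check of this interval, sketched in Python. The critical value 2.015 is hard-coded from the table value quoted in the solution, because the standard library has no t quantile function; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

weights = [97, 117, 140, 78, 99, 148]   # birth weights in ounces

# Critical value for 5 degrees of freedom and right-tail probability 0.05,
# copied from the solution's table lookup (an input here, not computed).
t_crit = 2.015

xbar = mean(weights)
s = stdev(weights)                      # sample standard deviation (divisor n - 1)
half_width = t_crit * s / sqrt(len(weights))
ci = (xbar - half_width, xbar + half_width)
print(f"[{ci[0]:.3f}; {ci[1]:.3f}]")
```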
6. The Younger Dryas Cold Event (or the “Big Freeze”) was an abrupt cooling event of the Northern
Hemisphere which occurred approximately 12,000 years ago, and might have resulted from a slowing
of the Atlantic meridional overturning circulation (AMOC). The most common means of slowing
the AMOC involves the reduction of oceanic surface water density via an increase in freshwater
discharge to the North Atlantic. To predict if such an event might happen again, the density of
the ocean water near surface is closely monitored. We collected 79 measurements of the density
of the Atlantic ocean water near surface (in kg/m$^3$), at a latitude of 45 degrees north. For this
data, the mean is 1026, the median 1006, the first quartile is 948.1, the third quartile is 1122, and
the standard deviation 109.61. The picture below gives the QQ-plot for this data, together with
the line of best fit, produced using R:
Which one of the following statements is correct? (Only one statement is correct.)
A) The fitted line for the QQ plot is $y = 1006 + 109.61z$
B) The fitted line for the QQ plot is $y = 109.61 + 1006z$
C) The fitted line for the QQ plot is $y = 1026 + 109.61z$
D) The fitted line for the QQ plot is $y = 109.61 + 1122z$
E) The distribution of the water density does not appear to be normally distributed, so we cannot
find a fitted line for the normal QQ plot.
Solution (Section 7.3) There is a clear linear tendency in the plot, so the data appears to be
normally distributed. The line of best fit has equation $y = \hat{\mu} + \hat{\sigma} z$, where
$\hat{\mu} = \bar{x} = 1026$ and $\hat{\sigma} = s = 109.61$. The answer is C.
7. The following data gives the number of deadly bear attacks in North America per decade, for the
9 decades between 1900 and 1989:
2, 1, 4, 8, 6, 9, 9, 19, 20.
Calculate the mean and standard deviation for the number of deadly bear attacks in North America
per decade.
A) The mean is 8.667 and the standard deviation is 5.6505.
B) The mean is 8.0 and the standard deviation is 19.0.
C) The mean is 8.0 and the standard deviation is 5.0.
D) The mean is 8.667 and the standard deviation is 46.0.
E) The mean is 8.667 and the standard deviation is 6.7823.
Solution (Section 7.1) The mean is
$$\bar{x} = \frac{1}{9}\sum_{i=1}^{9} x_i = \frac{78}{9} = 8.6667$$
and the standard deviation is:
$$s = \sqrt{\frac{\left(\sum_{i=1}^{9} x_i^2\right) - \left(\sum_{i=1}^{9} x_i\right)^2/9}{8}} = \sqrt{\frac{1044 - (78)^2/9}{9 - 1}} = \sqrt{46} = 6.7823.$$
The answer is E.
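The same two summaries can be obtained directly from the data. This is a minimal Python sketch; `statistics.stdev` uses the divisor $n - 1$, which matches the formula in the solution.

```python
from statistics import mean, stdev

attacks = [2, 1, 4, 8, 6, 9, 9, 19, 20]   # deadly bear attacks per decade

xbar = mean(attacks)
s = stdev(attacks)                        # sample standard deviation, divisor n - 1
print(round(xbar, 4), round(s, 4))
```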
8. 20% of the trees in a certain forest are maple trees. In this forest, 15% of the maple trees are
mature trees, with age between 10 and 15 years. We select randomly a tree in this forest. What
is the probability that this is a maple tree with age between 10 and 15 years?
A) 0.03
B) 0.15
C) 0.20
D) 0.75
E) 0.175
Solution (Section 3.3) We denote by $A$ the event that the tree is a maple tree and $B$ the event
that the tree has an age between 10 and 15 years. We know that $P(A) = 0.2$ and $P(B \mid A) = 0.15$.
By the multiplication rule,
$$P(A \cap B) = P(A)\,P(B \mid A) = (0.2)(0.15) = 0.03$$
The answer is A.
9. The boxplots below show the effects of different sugars on the growth of pea sections grown in
tissue culture, measured in ocular units. (An ocular unit is 0.114 cm.) In experiment A, 2% of
glucose was added to the culture. In experiment B, 2% of sucrose was added to the culture. In
experiment C, 1% of glucose and 2% of fructose was added to the culture. Finally, in experiment
D, 1% of fructose was added to the culture.
[Figure: side-by-side boxplots for experiments A, B, C and D; horizontal axis "Experiment", vertical axis "Growth in ocular units" (scale from 56 to 66)]
Which one of the following statements is correct? (Only one statement is correct.)
A) The median growth in experiments C and D is the same.
B) The data in experiments A and C have the same inter-quartile range.
C) There are outliers in the data of experiments A, C and D, but not in experiment B.
D) The distribution of the data in experiment B is approximately symmetric.
E) Experiment B has produced the smallest growth.
Solution (Section 7.1) The answer is A.
10. One of the objectives of a study is to describe the distribution of the body mass index (BMI) for
women whose age is between 20 and 29 years. Suppose that women in this age group have an
average BMI of 26.8 with a standard deviation of 7.42. Consider a random sample of 50 women
in this age group. Give an approximation for the probability that the average BMI for these 50
women is greater than 29.
A) 0.0179
B) 0.9821
C) 0.6179
D) 0.3821
E) 0.0375
Solution (Section 7.2) Let $\bar{X}$ be the mean of this sample. By the central limit theorem, we know
that the random variable
$$\frac{\bar{X} - 26.8}{7.42/\sqrt{50}}$$
has approximately a standard normal distribution. Hence,
$$P(\bar{X} > 29) = P\left(\frac{\bar{X} - 26.8}{7.42/\sqrt{50}} > \frac{29 - 26.8}{7.42/\sqrt{50}}\right) \approx P(Z > 2.10) = 1 - P(Z < 2.10) = 1 - 0.9821 = 0.0179.$$
The answer is A.
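The central limit theorem approximation can be evaluated without rounding the z-score. In this Python sketch the sample mean is modelled as normal with standard deviation $7.42/\sqrt{50}$; the result differs slightly from the table-based 0.0179 because the z-score is not rounded to 2.10. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 26.8, 7.42, 50

# Approximate sampling distribution of the mean: N(mu, sigma / sqrt(n)).
sampling_dist = NormalDist(mu, sigma / sqrt(n))
p_exceeds_29 = 1 - sampling_dist.cdf(29)
print(round(p_exceeds_29, 4))
```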
11. A pharmaceutical company is testing a new analgesic (medication for pain relief) on a sample of
6 patients suffering from migraine. Among these, 4 patients reported that their migraines
disappeared after using the drug. However, it is known that 20% of migraines disappear anyway without
any treatment. What is the probability that in a sample of 6 patients suffering from migraine, the
migraines will disappear without any treatment for exactly 4 of them?
A) 0.0016
B) 0.2534
C) 0.3523
D) 0.0154
E) 0.9992
Solution (Section 4.2) Let $X$ be the number of patients for whom the migraine will disappear
without any treatment, in a sample of 6 patients. Then $X$ has a binomial distribution with $n = 6$
trials and probability $p = 0.2$ of success. The desired probability is
$$P(X = 4) = \binom{6}{4}(0.2)^4(0.8)^2 = 0.01536$$
The answer is D.
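The binomial probability is easy to verify. A minimal Python sketch, using `math.comb` for the binomial coefficient:

```python
from math import comb

n, k, p = 6, 4, 0.2

# P(X = k) for X ~ Binomial(n, p)
prob = comb(n, k) * p**k * (1 - p) ** (n - k)
print(round(prob, 5))
```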
12. The plant-water relation plays an important role in plant physiology. We consider an experiment
in which 16 seedlings of birch tree were flooded with water for one day and 13 other seedlings were
kept as controls. At the end of the experiment, the roots of all plants were analyzed for the level
of adenosine triphosphate (ATP), as a measure for the intracellular energy transfer. Below is the
summary of the data:
 | flooded plants | control plants
sample size | $n_1 = 16$ | $n_2 = 13$
sample mean | $\bar{x}_1 = 1.17$ | $\bar{x}_2 = 1.91$
sample standard deviation | $s_1 = 0.16$ | $s_2 = 0.23$
Give a 90% confidence interval for the difference $\mu_1 - \mu_2$, where $\mu_1$ is the average ATP level for
the flooded plants and $\mu_2$ is the average ATP level for the controls. Based on this interval, can we
conclude that flooding causes a decrease or an increase in the ATP level? (Assume that the ATP
levels for flooded plants and controls are normally distributed with equal variances.)
A) [0.5673; 0.7614]; flooding causes an increase in the mean ATP level
B) [0.4532; 0.6719]; flooding causes an increase in the mean ATP level
C) [-0.6182; -0.4820]; flooding causes a decrease in the mean ATP level
D) [-0.8635; -0.6165]; flooding causes a decrease in the mean ATP level
E) [-0.0346; 0.3471]; we cannot conclude that flooding causes a decrease or an increase in the
mean ATP level
Solution (Section 10.3) This is a small sample test, for normal populations with equal variances.
The pooled sample variance is:
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(15)(0.16)^2 + (12)(0.23)^2}{16 + 13 - 2} = 0.03773$$
The 90% confidence interval for $\mu_1 - \mu_2$ is
$$\bar{x}_1 - \bar{x}_2 \pm t\sqrt{s_p^2\,(1/n_1 + 1/n_2)}$$
The value $t$ is found in Table 18.4 such that $P(-t \le T \le t) = 0.90$, where $T$ has a $T$ distribution
with 27 degrees of freedom. This means that $P(T \le t) = 0.95$. In Table 18.4 (row 27, column
0.95) we find the value $t = 1.703$. The 90% confidence interval for $\mu_1 - \mu_2$ is:
$$1.17 - 1.91 \pm (1.703)\sqrt{(0.03773)(1/16 + 1/13)} = -0.74 \pm 0.1235 = [-0.8635; -0.6165]$$
Since the interval contains only negative values, we infer that $\mu_1 < \mu_2$. We conclude that flooding
causes a decrease in the ATP level. The answer is D.
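The pooled-variance interval can be recomputed step by step. In this Python sketch the critical value 1.703 is taken from the table lookup in the solution (27 degrees of freedom), since the standard library has no t quantile function; the variable names are illustrative.

```python
from math import sqrt

n1, xbar1, s1 = 16, 1.17, 0.16   # flooded plants
n2, xbar2, s2 = 13, 1.91, 0.23   # control plants
t_crit = 1.703                   # table value quoted in the solution, 27 d.f.

# Pooled variance under the equal-variance assumption.
sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)

diff = xbar1 - xbar2
half_width = t_crit * sqrt(sp2 * (1 / n1 + 1 / n2))
ci = (diff - half_width, diff + half_width)
print(f"sp2 = {sp2:.5f}, CI = [{ci[0]:.4f}; {ci[1]:.4f}]")
```

Both endpoints are negative, which is what supports the conclusion $\mu_1 < \mu_2$.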
13. The systolic blood pressure level in a certain population is approximately equal to the value 125
mm Hg. A topic of recent clinical interest is the fact that extensive use of oral contraceptive (OC)
may cause a reduction in the systolic blood pressure under the value 125. A study is organized
to test this hypothesis. The $n$ women who participated in this study used OC for a period of 3
months. At the end of the study, their systolic blood pressure was measured. This data has a
sample mean 120.4 and sample standard deviation 13.23.
What was the number $n$ of participants in this study?
A) 12
B) 40
C) 10
D) 32
E) 25
Solution (Section 9.2) This is a left-tailed small sample test. We would like to test
$H_0: \mu = 125$ against $H_1: \mu < 125$. The observed value of test statistic is
$$t_0 = \frac{\bar{x} - 125}{s/\sqrt{n}} = \frac{120.4 - 125}{13.23/\sqrt{n}}.$$
From the R output we know that $t_0 = -1.0998$. We infer that
$$\frac{120.4 - 125}{13.23/\sqrt{n}} = -1.0998.$$
Therefore
$$n = \left(\frac{-1.0998 \times 13.23}{120.4 - 125}\right)^2 = 10.005$$
We conclude that the sample size was $n = 10$. The answer is C.
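Solving the test-statistic equation for $n$ takes one line of arithmetic. A minimal Python sketch, taking $t_0 = -1.0998$ as given in the solution (the R output it refers to is not reproduced here); variable names are illustrative.

```python
xbar, mu0, s = 120.4, 125.0, 13.23
t0 = -1.0998                      # observed statistic quoted in the solution

# t0 = (xbar - mu0) / (s / sqrt(n))  =>  sqrt(n) = t0 * s / (xbar - mu0)
n_exact = (t0 * s / (xbar - mu0)) ** 2
n = round(n_exact)
print(round(n_exact, 3), n)
```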
Long answer questions are included on the following pages.
Part 2: Long Answer Questions
Record your answer to the long answer questions in the space provided below, specifying clearly your
notation and including a proper justification. Show the details of your calculations.
1. The average length of human gestation is approximately 40.5 weeks. It is thought that maternal
diabetes may influence the length of the gestation. In a study consisting of 20 diabetic pregnant
women, it was found that the mean gestation period was 38.8 weeks with a standard deviation
of 5 weeks.
We would like to gain evidence that the length of gestation in diabetic women is
significantly different than the value of 40.5 weeks, using a test of hypotheses.
a) (2 marks) Set-up the test hypotheses to gain evidence for this claim.
b) (4 marks) Calculate the observed value of the test.
c) (4 marks) Report the range of the $p$-value.
d) (2 marks) Give the conclusion of the test at level $\alpha = 0.05$.
Solution (Section 9.2) This is a two-sided small sample test.
a) We denote by $\mu$ the mean length of gestation for diabetic women. We would like to test
$H_0: \mu = 40.5$ against $H_1: \mu \neq 40.5$.
b) We know that $n = 20$, $\bar{x} = 38.8$ and $s = 5$. The observed value of the test statistic is:
$$t_0 = \frac{\bar{x} - 40.5}{s/\sqrt{n}} = \frac{38.8 - 40.5}{5/\sqrt{20}} = -1.52.$$
c) The $p$-value of the test is:
$$p\text{-value} = 2P(T_{19} > 1.52)$$
From Table 18.4 (row 19) we see that 1.52 is between the values 1.328 and 1.729, whose
corresponding probabilities to the right are 0.10 and 0.05. Hence $P(T_{19} > 1.52)$ is between 0.05 and
0.10 and
$$0.10 < p\text{-value} < 0.20$$
d) Since $p$-value $> 0.05$, we fail to reject $H_0$. There is not enough evidence that the mean length
of gestation for diabetic women is significantly different than 40.5.
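The steps of this test can be traced in a short Python sketch. The two critical values are the table entries quoted in part c) and are treated as inputs, so the code brackets the p-value exactly as the table lookup does rather than computing it; variable names are illustrative.

```python
from math import sqrt

n, xbar, s, mu0, alpha = 20, 38.8, 5.0, 40.5, 0.05
t0 = (xbar - mu0) / (s / sqrt(n))

# Row 19 of the t table as quoted in the solution:
# right-tail probability 0.10 at 1.328 and 0.05 at 1.729.
crit_010, crit_005 = 1.328, 1.729
between = crit_010 < abs(t0) < crit_005

# Two-sided test: double the bracketing right-tail probabilities.
p_low, p_high = 2 * 0.05, 2 * 0.10
fail_to_reject = between and p_low > alpha
print(round(t0, 2), (p_low, p_high), fail_to_reject)
```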
2. A study was conducted to estimate the sensitivity and specificity of a new procedure for detecting
the presence of a kidney disease among patients suffering from hypertension. Among the 54
hypertensive patients who had the kidney disease, the procedure identified the disease for 45 subjects.
Among the 83 hypertensive patients who did not have the kidney disease, the procedure identified
the disease for 24 subjects. Consider a patient chosen from a certain hypertensive population in
which the prevalence of this kidney disease is 8%. Assume that the sensitivity and specificity of
the procedure remain the same as in the study mentioned above.
a) (5 marks) What is the probability of obtaining a positive test result?
b) (5 marks) If the new procedure identifies the presence of the kidney disease for this patient,
what is the probability that patient truly has the disease?
Solution (Sections 3.2 and 3.4) Let $T+$ be the event that the new procedure identifies the presence
of the disease and $D$ the event that the patient has the disease. The fact that the prevalence of
the disease is 8% means that $P(D) = 0.08$. The fact that the sensitivity and specificity of the
procedure remain the same as in the study means that:
$$P(T+ \mid D) = 45/54 \quad \text{and} \quad P(T+ \mid D') = 24/83.$$
a)
$$P(T+) = P(T+ \mid D)\,P(D) + P(T+ \mid D')\,P(D') = (45/54)(0.08) + (24/83)(1 - 0.08) = 0.3327.$$
b) By Bayes' rule,
$$P(D \mid T+) = \frac{P(D \cap T+)}{P(T+)} = \frac{P(T+ \mid D)\,P(D)}{P(T+)} = \frac{(45/54)(0.08)}{0.3327} = 0.2004$$
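Both parts can be checked with a few lines of Python. This is a minimal sketch; the variable names (`sensitivity`, `false_positive_rate`, `ppv`) are illustrative labels for the quantities in the solution.

```python
sensitivity = 45 / 54            # P(T+ | D)
false_positive_rate = 24 / 83    # P(T+ | D'), i.e. 1 - specificity
prevalence = 0.08                # P(D)

# a) Total probability of a positive result.
p_positive = sensitivity * prevalence + false_positive_rate * (1 - prevalence)

# b) Bayes' rule: probability of disease given a positive result.
ppv = sensitivity * prevalence / p_positive
print(round(p_positive, 4), round(ppv, 4))
```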
3. Ebola virus disease (EVD), formerly known as Ebola haemorrhagic fever, is a severe, often fatal
illness in humans. It is thought that fruit bats of the Pteropodidae family are natural Ebola virus
hosts. The virus is introduced into the human population through close contact with bodily fluids
of infected animals. The incubation period (the time interval from infection with the virus to onset
of symptoms) is between 2 and 21 days. The following data gives the incubation period (in days)
for 16 patients infected with the Ebola virus:
4  5  6  6  7  8  9  9  11  12  13  15  15  17  20  21
a) (5 marks) Calculate the median ($\tilde{x}$), first quartile ($q_1$) and third quartile ($q_3$) for this data set.
b) (5 marks) Give the values of the outliers (if they exist).
Solution (Section 7.1) a) Note that the data is already arranged in increasing order. Hence
$y_1 = 4$, $y_2 = 5$, $y_3 = 6$, $y_4 = 6$, $y_5 = 7$, $y_6 = 8$, $y_7 = 9$, $y_8 = 9$,
$y_9 = 11$, $y_{10} = 12$, $y_{11} = 13$, $y_{12} = 15$, $y_{13} = 15$, $y_{14} = 17$, $y_{15} = 20$, $y_{16} = 21$.
For this dataset, $n = 16$ is even. Hence, the median is:
$$\tilde{x} = \frac{y_8 + y_9}{2} = \frac{9 + 11}{2} = 10$$
To compute the first quartile, we note that $(n + 1)/4 = 17/4 = 4.25$, which is between 4 and 5
(closer to 4). The first quartile is:
$$q_1 = (0.75)y_4 + (0.25)y_5 = (0.75)(6) + (0.25)(7) = 6.25$$
To compute the third quartile, we note that $3(n + 1)/4 = 51/4 = 12.75$, which is between 12 and
13 (closer to 13). The third quartile is:
$$q_3 = (0.25)y_{12} + (0.75)y_{13} = (0.25)(15) + (0.75)(15) = 15$$
b) To find the outliers, we need to find the location of the two fences. The inter-quartile range is
$\text{IQR} = q_3 - q_1 = 15 - 6.25 = 8.75$. Hence
$$\text{Fence1} = q_1 - (1.5)\,\text{IQR} = 6.25 - (1.5)(8.75) = 6.25 - 13.125 = -6.875$$
$$\text{Fence2} = q_3 + (1.5)\,\text{IQR} = 15 + (1.5)(8.75) = 15 + 13.125 = 28.125$$
Since there are no data points outside the two fences, we conclude that there are no outliers.
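The quartile rule used above, with positions $(n+1)/4$ and $3(n+1)/4$ and linear interpolation, is the same rule Python's `statistics.quantiles` applies with `method="exclusive"`. The sketch below uses it to recompute the quartiles and fences; R's default `quantile` uses a different interpolation rule and may give different quartiles. Variable names are illustrative.

```python
from statistics import quantiles

days = [4, 5, 6, 6, 7, 8, 9, 9, 11, 12, 13, 15, 15, 17, 20, 21]

# "exclusive" places the i-th quartile at position i * (n + 1) / 4
# and interpolates linearly between neighbouring order statistics.
q1, med, q3 = quantiles(days, n=4, method="exclusive")

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [d for d in days if d < lower_fence or d > upper_fence]
print(q1, med, q3, lower_fence, upper_fence, outliers)
```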
4. In the United States, the blood types have the following distribution: 41% O, 31% A, 22% B and
6% AB. It is known that O is a universal donor, A can donate only to A and AB, B can donate only
to B and AB, and AB can donate only to AB. If a patient who needs a blood transfusion receives
blood from a randomly selected donor, and the two persons are independent of each other, what
is the probability that the transfusion is successful?
Solution (Section 3.5) Let $A_1, A_2, A_3, A_4$ be the events that the donor's blood type is O, A, B
and AB, respectively. Let $B_1, B_2, B_3, B_4$ be the events that the blood type of the receiving individual
is O, A, B and AB, respectively. Donor and recipient come from the same population, so
$P(B_j) = P(A_j)$. The event $A_i$ is independent of $B_j$, for any $i = 1, 2, 3, 4$ and
$j = 1, 2, 3, 4$. The event that the transfusion is successful can be written as the following union of
disjoint events:
$$C = (A_1 \cap B_1) \cup (A_1 \cap B_2) \cup (A_1 \cap B_3) \cup (A_1 \cap B_4) \cup (A_2 \cap B_2) \cup (A_2 \cap B_4) \cup (A_3 \cap B_3) \cup (A_3 \cap B_4) \cup (A_4 \cap B_4).$$
Hence,
$$P(C) = P(A_1)P(B_1) + P(A_1)P(B_2) + P(A_1)P(B_3) + P(A_1)P(B_4) + P(A_2)P(B_2) + P(A_2)P(B_4) + P(A_3)P(B_3) + P(A_3)P(B_4) + P(A_4)P(B_4)$$
$$= (0.41)(0.41) + (0.41)(0.31) + (0.41)(0.22) + (0.41)(0.06) + (0.31)(0.31) + (0.31)(0.06) + (0.22)(0.22) + (0.22)(0.06) + (0.06)(0.06)$$
$$= 0.5899$$
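The nine-term sum can be generated from the compatibility rules instead of being written out by hand. A minimal Python sketch, assuming (as the solution does) that donor and recipient types are independent and follow the same distribution; the dictionary names are illustrative.

```python
# Blood type frequencies from the question.
freq = {"O": 0.41, "A": 0.31, "B": 0.22, "AB": 0.06}

# Donor type -> recipient types that can receive from it.
can_donate_to = {
    "O": ["O", "A", "B", "AB"],
    "A": ["A", "AB"],
    "B": ["B", "AB"],
    "AB": ["AB"],
}

# Sum P(donor type) * P(recipient type) over all compatible pairs.
p_success = sum(
    freq[donor] * freq[recipient]
    for donor, recipients in can_donate_to.items()
    for recipient in recipients
)
print(round(p_success, 4))
```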
5. Approximately 4% of men with age between 40 and 55 years will have a heart attack in a 5-year
period. A new drug was developed to reduce the probability of having a heart attack for men in
this age group.
A 5-year study was conducted involving men in this age group who have been
treated with the new drug. Among the 2046 participants in the study, 56 had a heart attack within
the 5-year period. Let $p$ be the proportion of men in the age group 40-55 using this drug who will
have a heart attack.
a) (5 marks) Give a 95% confidence interval (c.i.) for $p$. Using this interval, can we conclude that
the new drug is efficient in reducing the risk of having a heart attack for men in this age group?
b) (5 marks) Formulate a null hypothesis $H_0$ and an alternative hypothesis $H_1$ which could be
used for testing that the new drug is efficient in reducing the risk of having a heart attack for men
in this age group. Calculate the $p$-value of this test and report the conclusion at level $\alpha = 0.05$.
Solution
a) (Section 8.3) An estimate for $p$ is $\hat{p} = 56/2046 = 0.02737$. The 95% confidence
interval for $p$ is:
$$0.02737 \pm 1.96\sqrt{\frac{(0.02737)(1 - 0.02737)}{2046}} = [0.020; 0.034].$$
Because all the values in the interval are smaller than 0.04, we are confident that $p$ is smaller than
0.04. We conclude that the new drug is efficient in reducing the risk of a heart attack.
b) (Section 9.3) We would like to test $H_0: p = 0.04$ against $H_1: p < 0.04$. The observed value
of the test statistic is:
$$z_0 = \frac{\hat{p} - 0.04}{\sqrt{(0.04)(0.96)/2046}} = -2.92,$$
where $\hat{p} = 56/2046 = 0.02737$. This is a left-tailed test. Using Table 18.2, we see that the $p$-value
of the test is given by
$$p\text{-value} = P(Z < -2.92) = 0.0018.$$
Since $p$-value $< \alpha = 0.05$, we reject $H_0$ and conclude that the new drug is efficient in reducing the
risk of a heart attack.
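Both parts can be reproduced numerically. In this Python sketch the confidence interval uses $\hat{p}$ in the standard error, while the test statistic uses the null value $p_0 = 0.04$, as in the solution; the p-value is computed from the unrounded $z_0$, so it differs slightly from the table value 0.0018. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, events, p0, alpha = 2046, 56, 0.04, 0.05
p_hat = events / n

# a) 95% confidence interval, standard error based on p_hat.
half_width = 1.96 * sqrt(p_hat * (1 - p_hat) / n)
ci = (p_hat - half_width, p_hat + half_width)

# b) Left-tailed z test, standard error based on the null value p0.
z0 = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_value = NormalDist().cdf(z0)
reject_h0 = p_value < alpha
print(f"CI = [{ci[0]:.3f}; {ci[1]:.3f}], z0 = {z0:.2f}, p-value = {p_value:.4f}")
```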
6. A study is conducted to investigate the relationship between the number $X$ of hours of exercise
per week and the systolic blood pressure $Y$ for men of age 50. The following data was obtained
on 10 individuals:
Individual | Number of hours $x_i$ | Systolic blood pressure $y_i$ | $x_i^2$ | $y_i^2$ | $x_i y_i$
1 | 4 | 120 | 16 | 14400 | 480
2 | 10 | 110 | 100 | 12100 | 1100
3 | 2 | 120 | 4 | 14400 | 240
4 | 3 | 135 | 9 | 18225 | 405
5 | 3 | 140 | 9 | 19600 | 420
6 | 5 | 115 | 25 | 13225 | 575
7 | 1 | 150 | 1 | 22500 | 150
8 | 2 | 165 | 4 | 27225 | 330
9 | 2 | 160 | 4 | 25600 | 320
10 | 0 | 180 | 0 | 32400 | 0
Total | 32 | 1395 | 172 | 199675 | 4020
a) (4 marks) Calculate the sample covariance and sample correlation between the number of hours
of exercise and the systolic blood pressure.
b) (4 marks) Give the equation of the estimated regression line for this data.
c) (2 marks) Give a prediction for the systolic blood pressure of an individual who exercises 6 hours
per week.
Solution
a) (Section 13.1) For this data, we have:
$$\sum_{i=1}^{10}(x_i - \bar{x})^2 = \sum_{i=1}^{10} x_i^2 - \frac{1}{10}\left(\sum_{i=1}^{10} x_i\right)^2 = 69.6$$
$$\sum_{i=1}^{10}(y_i - \bar{y})^2 = \sum_{i=1}^{10} y_i^2 - \frac{1}{10}\left(\sum_{i=1}^{10} y_i\right)^2 = 5072.5$$
$$\sum_{i=1}^{10}(x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{10} x_i y_i - \frac{1}{10}\left(\sum_{i=1}^{10} x_i\right)\left(\sum_{i=1}^{10} y_i\right) = -444$$
The sample covariance is
$$\widehat{\text{cov}}_{xy} = \frac{1}{9}\sum_{i=1}^{10}(x_i - \bar{x})(y_i - \bar{y}) = \frac{1}{9} \times (-444) = -49.33$$
We have:
$$s_x = \sqrt{\frac{69.6}{9}} = 2.78 \quad \text{and} \quad s_y = \sqrt{\frac{5072.5}{9}} = 23.74$$
The sample correlation is:
$$r_{xy} = \frac{\widehat{\text{cov}}_{xy}}{s_x s_y} = \frac{-49.33}{(2.78)(23.74)} = -0.747$$
b) (Section 13.2) For calculating $\hat{\beta}$ and $\hat{\alpha}$, we use the formulas:
$$\hat{\beta} = \frac{\sum_{i=1}^{10}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{10}(x_i - \bar{x})^2} = \frac{-444}{69.6} = -6.38,$$
$$\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x} = 139.5 - (-6.38)(3.2) = 159.91$$
The estimated regression line is $\hat{y} = \hat{\alpha} + \hat{\beta}x$, which in our case becomes:
$$\hat{y} = 159.91 - (6.38)x$$
c) (Section 13.2) A prediction for the systolic blood pressure of an individual who exercises 6 hours
per week is $159.91 - (6.38)(6) = 121.6$.
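The three parts of this question can be recomputed from the raw data. The following Python sketch uses the same shortcut formulas as the solution; it carries unrounded values, so the intercept and prediction differ from the hand calculation in the second decimal place. Variable names are illustrative.

```python
from math import sqrt

hours = [4, 10, 2, 3, 3, 5, 1, 2, 2, 0]
pressure = [120, 110, 120, 135, 140, 115, 150, 165, 160, 180]
n = len(hours)

# Corrected sums of squares and cross-products (shortcut formulas).
sxx = sum(x * x for x in hours) - sum(hours) ** 2 / n
syy = sum(y * y for y in pressure) - sum(pressure) ** 2 / n
sxy = sum(x * y for x, y in zip(hours, pressure)) - sum(hours) * sum(pressure) / n

# a) Sample covariance and correlation.
cov_xy = sxy / (n - 1)
r_xy = sxy / sqrt(sxx * syy)

# b) Least-squares slope and intercept.
beta_hat = sxy / sxx
alpha_hat = sum(pressure) / n - beta_hat * sum(hours) / n

# c) Prediction at 6 hours of exercise per week.
prediction = alpha_hat + beta_hat * 6
print(round(cov_xy, 2), round(r_xy, 3), round(beta_hat, 2),
      round(alpha_hat, 2), round(prediction, 1))
```

The correlation is negative, in line with the negative covariance and the downward-sloping regression line.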