University of Ottawa
MAT 2379B
Midterm Examination
November 15, 2023
Professor Raluca Balan
Time: 80 minutes
Student Number:
Family Name:
First Name:
This is a closed book examination.
You can bring your own formula sheet (one page, one-sided).
Some statistical tables are included on the last page of the booklet.
Only Faculty standard calculators are permitted: TI30, TI34, Casio fx-260, Casio fx-300.
You are not allowed to use any electronic device during the exam. Cell phones should be put away.
The exam consists of 6 multiple choice questions and 4 long answer questions.
Each multiple choice question is worth 5 marks and each long answer question is worth 10 marks.
The total number of marks is 70.
NOTE: At the end of the examination, hand in the entire booklet.
.*****************************************************.
For professor’s use:
Number of marks
Total for all MC Questions
Long Answer Question 1
Long Answer Question 2
Long Answer Question 3
Long Answer Question 4
Total
Part 1: Multiple Choice Questions
Record your answer to the multiple choice questions in the table below:
Question | 1 | 2 | 3 | 4 | 5 | 6
Answer   |   |   |   |   |   |
1. Hydrochloric acid (HCl) is a highly acidic substance found in the human stomach, where it
aids in the digestion of food. Measurements of the pH level of HCl for 125 patients have been
recorded in R in the variable x. Below are the histogram and the QQ plot for this data.
Which of the following statements is correct? (Only one statement is correct.)
A) The histogram is approximately symmetric and the QQ plot has a strong linear tendency.
It is reasonable to assume that this data is normally distributed.
B) The QQ plot has a strong curvilinear tendency. It is not reasonable to assume that this data is
normally distributed.
C) The distribution of the pH level is highly skewed to the right. It is not reasonable to assume
that this data is normally distributed.
D) The distribution of the pH level is highly skewed to the left. It is not reasonable to assume
that this data is normally distributed.
E) We cannot draw any conclusion about the distribution of the pH level.
Solution:
The distribution is symmetric and the QQ plot is linear, so it is reasonable to assume
that the data is normally distributed. The answer is A.
2. Assume that the length of a blue whale has a normal distribution with mean 33 m and standard
deviation 4 m. Use the R output below to find a value $x_0$ such that 80% of blue whales have a
length smaller than $x_0$. (In this output, * denotes multiplication.)
A) qnorm(0.2, 33, 4)
B) 33 - 4 * qnorm(0.8, 33, 4)
C) pnorm(0.8, 33, 4)
D) 1 - pnorm(0.8, 33, 4)
E) 33 + 4 * qnorm(0.8, 0, 1)
Solution:
Let $X$ be the length of a randomly chosen blue whale. Then $X$ has a normal distribution
with mean $\mu = 33$ and standard deviation $\sigma = 4$. We have to find a value $x_0$ such that
$P(X < x_0) = 0.80$. The value is given by the R command qnorm(0.8, 33, 4), but this command is not
included among the listed answers. By standardization,
$$0.8 = P(X < x_0) = P\left(\frac{X - 33}{4} < \frac{x_0 - 33}{4}\right) = P(Z < z_0),$$
where $z_0 = \frac{x_0 - 33}{4} = \text{qnorm}(0.8, 0, 1)$. Solving for $x_0$ we obtain
$x_0 = 33 + 4 z_0$. The answer is E.
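The equivalence between the direct quantile and the standardized form can be checked numerically. The sketch below is in Python rather than R and assumes that `statistics.NormalDist.inv_cdf` plays the role of `qnorm`; the variable names are illustrative only.

```python
from statistics import NormalDist

# Parameters from the question: whale length ~ N(33, 4^2), in metres.
mu, sigma = 33.0, 4.0

# Direct quantile, the analogue of R's qnorm(0.8, 33, 4).
x0_direct = NormalDist(mu, sigma).inv_cdf(0.8)

# Standardized route of answer E: 33 + 4 * qnorm(0.8, 0, 1).
z0 = NormalDist(0.0, 1.0).inv_cdf(0.8)
x0_standardized = mu + sigma * z0

print(x0_direct, x0_standardized)  # the two values should agree
```

Both routes give the same $x_0$, which is why answer E is equivalent to the unlisted command qnorm(0.8, 33, 4).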
3. Measurements on the length have been recorded for 3 species of bees: red, yellow and green. For
each species, we selected a sample of 100 bees, and measured their lengths. This data was saved
in R in variables “red”, “yellow” and “green”. Below are the histograms and boxplots for these
3 data sets.
The labels of the variables are missing from the boxplots, but are included in the
histograms. Our task is to identify the missing labels.
[Figure: histograms (a) green, (b) yellow, (c) red; boxplots (d), (e), (f)]
Which one of the following statements is correct? (Only one statement is correct.)
A) boxplot (d) is for red, (e) is for yellow, and (f) is for green
B) boxplot (f) is for red, (e) is for yellow, and (d) is for green
C) boxplot (e) is for red, (f) is for yellow, and (d) is for green
D) boxplot (f) is for red, (d) is for yellow, and (e) is for green
E) boxplot (d) is for red, (f) is for yellow, and (e) is for green
Solution:
We have to match each histogram with the corresponding boxplot.
Histogram (c) is
symmetric, and so is boxplot (d). So (c) is matched with (d), i.e. (d) is for red. Histogram (b)
is skewed to the left, and this corresponds to boxplot (f) which has an asymmetry in the same
direction (of larger values), so (b) is matched with (f), i.e. (f) is for yellow. Finally (a) is matched
with (e), i.e. (e) is for green. The answer is E.
4. Suppose that the body size of a snowy owl is normally distributed with a mean of 61.5 cm and
a standard deviation of 4.75 cm. What is the probability that a randomly chosen snowy owl has a
body that is larger than 64 cm?
A) 0.2981
B) 0.9918
C) 0.9082
D) 0.8296
E) 0.2875
Solution:
We first calculate $P(X \ge 64)$, where $X$ has a normal distribution with mean $\mu = 61.5$
and standard deviation $\sigma = 4.75$. Using standardization, we have:
$$P(X \ge 64) = P\left(\frac{X - 61.5}{4.75} \ge \frac{64 - 61.5}{4.75}\right) = P(Z \ge 0.53) = 1 - P(Z < 0.53) = 1 - 0.7019 = 0.2981.$$
The answer is A.
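As a rough numerical check of this tail probability, the following Python sketch computes it twice: once with $z$ rounded to two decimals, as a table lookup requires, and once without rounding. It assumes `statistics.NormalDist` as a stand-in for the printed table; the names are illustrative.

```python
from statistics import NormalDist

# Parameters from the question: body size ~ N(61.5, 4.75^2), in cm.
mu, sigma = 61.5, 4.75

# Table-style calculation: round z to two decimals before the lookup.
z = round((64 - mu) / sigma, 2)
p_table = 1 - NormalDist().cdf(z)

# Unrounded calculation, for comparison.
p_exact = 1 - NormalDist(mu, sigma).cdf(64)

print(z, round(p_table, 4), round(p_exact, 4))
```

The unrounded probability differs slightly from the table-based one because $z$ is not rounded, but it still points to answer A.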
5. A study on the polar bears in the Beaufort Sea shows that the bears are fasting. Because of this,
their cubs have smaller weights at birth. We measure the weight at birth (in grams) for a sample
of 50 cubs, yielding a sample mean $\bar{x} = 715$ g and a standard deviation $s = 123$ g. Calculate a
99% confidence interval for the average cub weight $\mu$ at birth.
A) [629.84;800.16]
B) [680.91; 749.09]
C) [689.62;740.38]
D) [670.21;759.79]
E) [702.12;727.88]
Solution:
This is a large sample interval. We need to find $z$ such that $P(-z < Z < z) = 0.99$. This
means that $P(Z < -z) = P(Z > z) = (1 - 0.99)/2 = 0.005$ and $P(Z < z) = 0.99 + 0.005 = 0.995$.
In Table 18.3, we find $P(Z < 2.57) = 0.9949$ and $P(Z < 2.58) = 0.9951$, so we choose
$z = 2.575$. A 99% confidence interval for $\mu$ is
$$715 \pm 2.575 \left(\frac{123}{\sqrt{50}}\right) = 715 \pm 44.79 = [670.21; 759.79].$$
The answer is D. The wrong answer B is obtained using $z = 1.96$.
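The interval arithmetic can be reproduced with a few lines of Python. This is a minimal sketch that takes the table value $z = 2.575$ from the solution as given; the helper name `z_interval` is hypothetical.

```python
from math import sqrt

# Sample summary from the question.
n, xbar, s = 50, 715.0, 123.0

def z_interval(z):
    """Large-sample confidence interval xbar +/- z * s / sqrt(n)."""
    margin = z * s / sqrt(n)
    return xbar - margin, xbar + margin

# z = 2.575 is the table value chosen in the solution (99% level).
lower, upper = z_interval(2.575)

# z = 1.96 (the 95% value) reproduces the wrong answer B.
lower_b, upper_b = z_interval(1.96)

print(f"[{lower:.2f}; {upper:.2f}]", f"[{lower_b:.2f}; {upper_b:.2f}]")
```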
6. The following data gives the weight for 8 corn cobs which were produced using an organic corn
fertilizer:
212 234 259 189 245 176 203 215
For this data, the sample mean is $\bar{x} = 216.625$, and the sample standard deviation is $s = 28.09645$.
Find a 90% confidence interval for the average cob weight. Assume that the data is normally
distributed.
A) [197.801; 235.449]
B) [193.132; 240.118 ]
C) [200.284; 232.966]
D) [197.155; 236.095]
E) [195.811; 237.439]
Solution:
Since this is a small sample and the data is normally distributed, we use the interval
based on the $T$ distribution. We need to find the value $t$ such that $P(-t < T < t) = 0.90$. This
means that $P(T > t) = (1 - 0.90)/2 = 0.05$ and hence $P(T < t) = 0.95$. From Table 18.4 (row
7) we find $t = t_{0.05, 7} = 1.895$. The confidence interval is:
$$216.625 \pm 1.895 \left(\frac{28.09645}{\sqrt{8}}\right) = 216.625 \pm 18.8242 = [197.8008; 235.4492].$$
The answer is A. (Note that the interval in D is obtained using the incorrect value $z = 1.96$ instead
of $t = 1.895$.)
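The sample statistics and the interval can be recomputed from the raw data. The Python sketch below uses only the standard library, so the critical value $t = 1.895$ is taken from the table as in the solution rather than computed; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Cob weights from the question.
weights = [212, 234, 259, 189, 245, 176, 203, 215]

n = len(weights)
xbar = mean(weights)   # sample mean
s = stdev(weights)     # sample standard deviation (divides by n - 1)

# Table value t_{0.05, 7} quoted in the solution (assumed, not computed here).
t = 1.895
margin = t * s / sqrt(n)
lower, upper = xbar - margin, xbar + margin

print(xbar, round(s, 5), round(margin, 4), round(lower, 4), round(upper, 4))
```

Replacing `t` by 1.96 in this sketch gives the interval listed as option D.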
Long answer questions are included on the following pages.
Part 2: Long Answer Questions
Record your answer to the long answer questions in the space provided below, specifying clearly your
notation and including a proper justification. Show the details of your calculations.
1. Platelets, also known as thrombocytes, are small, irregularly shaped cell fragments that play a
crucial role in blood clotting. The normal range for platelet counts in adults is typically between
150 thousand and 450 thousand platelets per microliter of blood. Below is some data on the
number of platelets (in thousands) per microliter of blood, which has been recorded for 12 patients:
159 132 160 165 163 197 176 160 169 164 161 183
a) (5 marks) Find the median ($\tilde{x}$), and the quartiles $q_1$ and $q_3$.
b) (5 marks) Find the outliers, if they exist. Justify your answer.
Solution:
a) We arrange the data in increasing order:
$y_1 = 132$, $y_2 = 159$, $y_3 = y_4 = 160$, $y_5 = 161$, $y_6 = 163$, $y_7 = 164$, $y_8 = 165$,
$y_9 = 169$, $y_{10} = 176$, $y_{11} = 183$, $y_{12} = 197$.
Because $n = 12$ is even and $\frac{n}{2} = 6$, the median is
$$\tilde{x} = \frac{y_6 + y_7}{2} = \frac{163 + 164}{2} = 163.5.$$
To find the quartiles, we note that $\frac{n+1}{4} = \frac{13}{4} = 3.25$ and $\frac{3(n+1)}{4} = 9.75$.
The first quartile is
$$q_1 = 0.75\, y_3 + 0.25\, y_4 = (0.75)(160) + (0.25)(160) = 160.$$
The third quartile is
$$q_3 = (0.25)\, y_9 + (0.75)\, y_{10} = (0.25)(169) + (0.75)(176) = 174.25.$$
b) $IQR = 174.25 - 160 = 14.25$. We calculate the two fences:
$$\text{Fence1} = q_1 - 1.5\, IQR = 160 - 1.5 \cdot 14.25 = 138.625$$
$$\text{Fence2} = q_3 + 1.5\, IQR = 174.25 + 1.5 \cdot 14.25 = 195.625$$
The outliers are the values outside the two fences: 132 (below Fence1) and 197 (above Fence2).
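The quartile and fence calculations can be cross-checked in Python. The sketch assumes that `statistics.quantiles` with `method="exclusive"` matches the $(n+1)p$ interpolation rule used in the solution, which it does for this data set; the variable names are illustrative.

```python
from statistics import median, quantiles

# Platelet counts (in thousands per microliter) from the question.
platelets = [159, 132, 160, 165, 163, 197, 176, 160, 169, 164, 161, 183]

med = median(platelets)

# "exclusive" places the quartiles at positions (n + 1) / 4 and 3 (n + 1) / 4.
q1, _, q3 = quantiles(platelets, n=4, method="exclusive")

iqr = q3 - q1
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr

# Values strictly outside the fences are flagged as outliers.
outliers = sorted(v for v in platelets if v < low_fence or v > high_fence)

print(med, q1, q3, low_fence, high_fence, outliers)
```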
2. Suppose that the height of an 8-year-old girl has a normal distribution with a mean of 128 cm and
a standard deviation of 10 cm.
a) (5 marks) What is the probability that a randomly selected 8-year-old girl has a height
between 124 cm and 132 cm?
b) (5 marks) We select a random sample of 55 girls of age 8. What is the approximate probability
that the average height for this sample is between 124 cm and 132 cm?
Solution:
a) Let $X$ be the height of a randomly chosen girl. The desired probability is:
$$P(124 < X < 132) = P\left(\frac{124 - 128}{10} < \frac{X - 128}{10} < \frac{132 - 128}{10}\right) = P(-0.4 < Z < 0.4)$$
$$= P(Z < 0.4) - P(Z < -0.4) = 0.6554 - 0.3446 = 0.3108.$$
We used the fact that $P(Z < 0.4) = 0.6554$ (from Table 18.3) and $P(Z < -0.4) = 0.3446$ (from
Table 18.2).
b) Let $\bar{X}$ be the mean of the sample of size 55. By the central limit theorem, $\bar{X}$ has approximately
a normal distribution with mean 128 and standard deviation $10/\sqrt{55} = 1.3484$. The desired
probability is
$$P(124 < \bar{X} < 132) = P\left(\frac{124 - 128}{1.3484} < \frac{\bar{X} - 128}{1.3484} < \frac{132 - 128}{1.3484}\right)$$
$$= P(-2.97 < Z < 2.97) = P(Z < 2.97) - P(Z < -2.97) = 0.9985 - 0.0015 = 0.9970,$$
where for the last line we used again Tables 18.2 and 18.3.
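Parts a) and b) differ only in the standard deviation used, which the following Python sketch makes explicit. It assumes `statistics.NormalDist` in place of the printed tables, so the results are unrounded and may differ from the table-based values in the last digit.

```python
from math import sqrt
from statistics import NormalDist

# Parameters from the question: height ~ N(128, 10^2), sample of n = 55 girls.
mu, sigma, n = 128.0, 10.0, 55

# a) One girl: standard deviation sigma.
single = NormalDist(mu, sigma)
p_single = single.cdf(132) - single.cdf(124)

# b) Sample mean: standard deviation sigma / sqrt(n).
sample_mean = NormalDist(mu, sigma / sqrt(n))
p_mean = sample_mean.cdf(132) - sample_mean.cdf(124)

print(round(p_single, 4), round(p_mean, 4))
```

Averaging 55 heights shrinks the standard deviation from 10 to about 1.35, which is why the same interval captures almost all of the probability in part b).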
3. The maximum speed at which a female deer can run depends on various factors, including the
species of deer, age, health, and environmental conditions. Some species of deer, such as the
white-tailed deer (Odocoileus virginianus), can reach speeds of up to 48 to 56 kilometers per hour
for short distances when they are sprinting. However, their sustained running speed is typically
lower. Below is the data for the running speed for a sample of $n = 6$ white-tailed female deer:
$x_1 = 45.5$, $x_2 = 37.5$, $x_3 = 42.1$, $x_4 = 34.8$, $x_5 = 34.0$, $x_6 = 32.9$
a) (5 marks) What is the geometric mean for this data set?
b) (5 marks) We transform the data using the linear transformation $X' = -5X + 3$. What is the
median of the transformed measurements $x'_1, x'_2, x'_3, x'_4, x'_5, x'_6$?
Solution:
a) Method 1. We apply a logarithmic transformation to this data set: $Y = \ln(X)$. We
obtain the following data:
$y_1 = \ln(45.5)$, $y_2 = \ln(37.5)$, $y_3 = \ln(42.1)$, $y_4 = \ln(34.8)$, $y_5 = \ln(34)$, $y_6 = \ln(32.9)$
The mean of the log-transformed data is:
$$\bar{y} = \frac{y_1 + y_2 + y_3 + y_4 + y_5 + y_6}{6} = \frac{\ln(45.5) + \ln(37.5) + \ln(42.1) + \ln(34.8) + \ln(34) + \ln(32.9)}{6} = 3.625$$
The geometric mean of the original data is:
$$g = e^{\bar{y}} = e^{3.625} = 37.53.$$
Method 2. The geometric mean is:
$$g = \left(\prod_{i=1}^{6} x_i\right)^{1/6} = (45.5 \times 37.5 \times 42.1 \times 34.8 \times 34 \times 32.9)^{1/6} = 37.53.$$
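Both methods can be run side by side to confirm they agree. The Python sketch below also calls `statistics.geometric_mean` as a third check; the variable names are illustrative.

```python
from math import exp, log, prod
from statistics import geometric_mean

# Running speeds (km/h) from the question.
speeds = [45.5, 37.5, 42.1, 34.8, 34.0, 32.9]
n = len(speeds)

# Method 1: exponentiate the mean of the natural logarithms.
g_log = exp(sum(log(x) for x in speeds) / n)

# Method 2: n-th root of the product.
g_prod = prod(speeds) ** (1 / n)

# Library function, as a third check.
g_lib = geometric_mean(speeds)

print(round(g_log, 2), round(g_prod, 2), round(g_lib, 2))
```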
b) (Method 1) The transformed measurements are:
$x'_1 = -224.5$, $x'_2 = -184.5$, $x'_3 = -207.5$, $x'_4 = -171$, $x'_5 = -167$, $x'_6 = -161.5$
We arrange this data in increasing order. We obtain:
$y'_1 = -224.5$, $y'_2 = -207.5$, $y'_3 = -184.5$, $y'_4 = -171$, $y'_5 = -167$, $y'_6 = -161.5$
The median of the transformed data is
$$m = \frac{y'_3 + y'_4}{2} = \frac{-184.5 - 171}{2} = -177.75$$
(Method 2) We first find the median of the original data set $x_1, x_2, x_3, x_4, x_5, x_6$. For this, we
arrange the data in increasing order:
$y_1 = 32.9$, $y_2 = 34$, $y_3 = 34.8$, $y_4 = 37.5$, $y_5 = 42.1$, $y_6 = 45.5$
The median of the original data set is:
$$\tilde{x} = \frac{y_3 + y_4}{2} = \frac{34.8 + 37.5}{2} = 36.15.$$
The median of the transformed data is:
$$m = -5 \times 36.15 + 3 = -177.75$$
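The two methods can be compared directly in Python. This is a small sketch with illustrative variable names; it transforms every observation first, then checks the result against the shortcut of transforming the original median.

```python
from statistics import median

# Running speeds (km/h) from the question.
speeds = [45.5, 37.5, 42.1, 34.8, 34.0, 32.9]

# Method 1: transform each value with x' = -5x + 3, then take the median.
transformed = [-5 * x + 3 for x in speeds]
m_direct = median(transformed)

# Method 2: take the median first, then apply the same transformation.
m_shortcut = -5 * median(speeds) + 3

print(m_direct, m_shortcut)
```

The negative slope reverses the order of the data, but the two middle values stay in the middle, so both routes lead to the same median.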
4. The amount of potassium in a triple cheeseburger is a random variable with a normal distribution
with mean $\mu = 460$ mg and standard deviation $\sigma = 64$ mg. We select at random $n$ cheeseburgers
and we denote by $\bar{X}$ the average amount of potassium for this sample. Find the sample size $n$
such that $P(\bar{X} > 470) = 0.1056$.
Solution:
Let $X$ be the amount of potassium in one cheeseburger and $\bar{X}$ the average amount of
potassium in a sample of size $n$. Because $X$ is normally distributed, $\bar{X}$ is also normally distributed
with mean $\mu = 460$ and standard deviation $64/\sqrt{n}$. By standardization,
$$Z = \frac{\bar{X} - 460}{64/\sqrt{n}} \sim N(0, 1).$$
We have to find $n$ such that $P(\bar{X} > 470) = 0.1056$, i.e. $P(\bar{X} < 470) = 0.8944$. By standardization,
$$0.8944 = P(\bar{X} < 470) = P\left(Z < \frac{470 - 460}{64/\sqrt{n}}\right) = P\left(Z < \frac{10\sqrt{n}}{64}\right).$$
In Table 18.3, we look for a value $z$ such that $P(Z < z) = 0.8944$. We find $z = 1.25$. Hence,
$$\frac{10\sqrt{n}}{64} = 1.25, \quad \text{and} \quad \sqrt{n} = \frac{(64)(1.25)}{10} = 8.$$
Therefore, $n = (8)^2 = 64$.
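The same steps can be followed numerically. The Python sketch below assumes `statistics.NormalDist.inv_cdf` in place of the table lookup, so the quantile comes out slightly different from the rounded value 1.25 and the result is rounded to the nearest integer; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Parameters from the question.
mu, sigma = 460.0, 64.0
threshold, tail = 470.0, 0.1056

# Quantile z with P(Z < z) = 1 - 0.1056 = 0.8944 (the table gives 1.25).
z = NormalDist().inv_cdf(1 - tail)

# Solve (threshold - mu) * sqrt(n) / sigma = z for n.
n_exact = (z * sigma / (threshold - mu)) ** 2
n = round(n_exact)

# Check: tail probability of the sample mean for the rounded n.
p_check = 1 - NormalDist(mu, sigma / sqrt(n)).cdf(threshold)

print(round(z, 4), n, round(p_check, 4))
```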