MAT 2379
Sample Midterm 2 (with solutions)
Based on Sections 5.2, 7.1, 7.2, 7.3, 8.1 and 8.2
Date:
Instructor: Xiao Liang
Time: 80 minutes
Student Number:
Family Name:
First Name:
• This is a closed book examination.
• You can bring your own formula sheet (one page, one-sided).
• Some statistical tables are included at the end of the exam.
• Only Faculty standard calculators are permitted: TI30, TI34, Casio fx-260, Casio fx-300.
• You are not allowed to use any electronic device during the exam. Cell phones should be put away.
• The exam consists of 6 multiple choice questions and 4 long answer questions.
• Each multiple choice question is worth 5 marks and each long answer question is worth 10 marks. The total number of marks is 70.
NOTE: At the end of the examination, hand in the entire booklet.
.*****************************************************.
For professor’s use:
Number of marks
Total for all MC Questions
Long Answer Question 1
Long Answer Question 2
Long Answer Question 3
Long Answer Question 4
Total
Part 1: Multiple Choice Questions
Record your answer to the multiple choice questions in the table below:
Question | 1 | 2 | 3 | 4 | 5 | 6
Answer   | C | B | C | A | D | A
1. Some biology students were interested in analyzing the amount of time that the bees spend gathering nectar. 39 bees visited a high-density flower patch and the time (in seconds) that each one of them spent gathering nectar was recorded. Below is the normal QQ-plot and the histogram for this data set ($x$).
Which one of the following statements is correct? (Only one statement is correct.)
A) It is reasonable to assume that the time gathering nectar is normally distributed.
B) It is reasonable to assume that the time gathering nectar has a $T$ distribution with 38 degrees of freedom.
C) The distribution of the time gathering nectar is highly skewed to the right. It is not reasonable to assume that the time gathering nectar is normally distributed.
D) The distribution of the time gathering nectar is highly skewed to the left. It is not reasonable to assume that the time gathering nectar is normally distributed.
E) The distribution of the time gathering nectar is approximately symmetric.
Solution:
(Sections 7.1 and 7.3) The distribution is highly skewed to the right. There is a curvilinear tendency in the QQ-plot, so it is not reasonable to assume that the times are normally distributed. The normal QQ plot should not be used for the $T$ distribution. The answer is C.
2. The width of the shell of a burgundy snail (Helix pomatia) has a normal distribution with mean 40 mm and standard deviation 10 mm. Use the R output below to find a value $x_0$ such that 70% of burgundy snails have a width larger than $x_0$.
A) pnorm(0.3, 40, 10)
B) qnorm(0.3, 40, 10)
C) pnorm(0.7, 40, 10)
D) qnorm(0.7, 40, 10)
E) 1-pnorm(0.7, 40, 10)
Solution:
(Section 5.2) Let $X$ be the width of a randomly chosen snail. Then $X$ has a normal distribution with mean $\mu = 40$ and standard deviation $\sigma = 10$. We have to find a value $x_0$ such that $P(X > x_0) = 0.70$. This means that $P(X < x_0) = 0.30$. The value is given by the R command qnorm(0.3, 40, 10). The answer is B.
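As a cross-check outside of R, the same quantile can be sketched in Python. This assumes Python 3.8 or later and uses the standard-library `statistics.NormalDist`, whose `inv_cdf` method plays the role of `qnorm`; the variable names are illustrative and not part of the exam.

```python
from statistics import NormalDist

# Shell width X ~ N(mean 40 mm, sd 10 mm), as stated in the question.
width = NormalDist(mu=40, sigma=10)

# P(X > x0) = 0.70 is the same as P(X < x0) = 0.30,
# so x0 is the 0.30 quantile (the analogue of qnorm(0.3, 40, 10)).
x0 = width.inv_cdf(0.30)

# Sanity check: the upper-tail probability at x0 should be 0.70.
upper_tail = 1 - width.cdf(x0)

print(round(x0, 2), round(upper_tail, 2))
```

The value lies below the mean of 40 mm, as expected when 70% of the snails must be wider than $x_0$.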
3. Average levels of carbon monoxide (CO) in homes vary between 0 and 2.00 parts per million (ppm). We collected CO information for three cities: Ottawa, Montreal and Toronto. For each city, we selected a sample of 100 houses and recorded their CO level. We then created 3 data sets of 100 observations each, called “Ottawa”, “Montreal” and “Toronto”. Below are the boxplots and histograms for these data sets. The labels of the variables are missing from the histograms, but are included in the boxplots. Our task is to identify the missing labels.
[Figure: boxplots (a) Ottawa, (b) Montreal, (c) Toronto; unlabelled histograms (d), (e), (f)]
Which one of the following statements is correct?
A) Histogram (d) is for Ottawa, (e) is for Montreal, and (f) is for Toronto.
B) Histogram (f) is for Ottawa, (e) is for Montreal, and (d) is for Toronto.
C) Histogram (e) is for Ottawa, (f) is for Montreal, and (d) is for Toronto.
D) Histogram (f) is for Ottawa, (d) is for Montreal, and (e) is for Toronto.
E) Histogram (d) is for Ottawa, (f) is for Montreal, and (e) is for Toronto.
Solution:
(Section 7.1) We have to match each boxplot with the corresponding histogram. All samples have the same range, so the whiskers cannot be used for the matching. Notice that one of the histograms is skewed to the right, while the other two histograms are approximately symmetric. For the skewed distribution, we look for the boxplot whose median is away from the center of the box. Therefore b matches with f. For the symmetric distributions, we can compare the dispersion. Notice that the values in histogram e are less dispersed (i.e. more concentrated in the center) than the values in histogram d. Comparing boxplots a and c, we notice that the box in a is smaller (i.e. less dispersed). Therefore, a is matched with e and c is matched with d. The answer is C.
4. Glaucoma is a disease of the eye that is manifested by high intraocular pressure. Assume that in the general population, the intraocular pressure has approximately a normal distribution with mean 16 mm Hg and standard deviation 3 mm Hg. The usual range for intraocular pressure is considered to be between 12 mm Hg and 20 mm Hg. What proportion of the general population has intraocular pressure in the usual range?
A) 0.8164
B) 0.0918
C) 0.9082
D) 0.1426
E) 0.2875
Solution:
(Section 5.2) We wish to calculate $P(12 \le X \le 20)$, where $X$ has a normal distribution with mean $\mu = 16$ and standard deviation $\sigma = 3$. Using standardization, we have:
$$P(12 \le X \le 20) = P\left(\frac{12 - 16}{3} \le \frac{X - 16}{3} \le \frac{20 - 16}{3}\right) = P(-1.33 \le Z \le 1.33)$$
$$= P(Z \le 1.33) - P(Z < -1.33) = 0.9082 - 0.0918 = 0.8164.$$
The answer is A.
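The standardization above can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist` (an assumption of this example, since the exam itself relies on printed tables). It computes the probability both exactly and with the z-score rounded to 1.33 as in the table lookup, so small differences in the last decimal places are expected.

```python
from statistics import NormalDist

pressure = NormalDist(mu=16, sigma=3)  # intraocular pressure in mm Hg

# Exact probability, without rounding the z-scores.
p_exact = pressure.cdf(20) - pressure.cdf(12)

# Table-style calculation: round z = 4/3 to 1.33 first, as in the solution.
z = round((20 - 16) / 3, 2)
std_normal = NormalDist()
p_table = std_normal.cdf(z) - std_normal.cdf(-z)

print(round(p_exact, 4), round(p_table, 4))
```

Both values are close to option A and far from the other options.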
5. A scientist studies the acquisition of rainfall data in the Guinea Savanna part of Nigeria. One of the major data acquisition problems in Sub-Saharan Africa is instrumental error, which is associated with the functioning of the instruments. An error encountered frequently with rain gauges (instruments used by hydrologists) occurs during the siphoning cycle, when rain continues to enter the rain gauge. In a sample of 64 observations, it was found that the mean measurement error was $\bar{x} = 2.85$ mm with a standard deviation $s = 3.5$ mm. Calculate a 95% confidence interval for the average measurement error $\mu$.
A) $2.85 \pm 1.645$
B) $2.85 \pm 1.96$
C) $2.85 \pm 2.262$
D) $2.85 \pm 0.8575$
E) $2.85 \pm 0.7197$
Solution:
(Section 8.1) This is a large sample interval. A 95% confidence interval for $\mu$ is
$$2.85 \pm 1.96 \left(\frac{3.5}{\sqrt{64}}\right) = 2.85 \pm 0.8575.$$
The answer is D.
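The margin of error can be reproduced with a short Python sketch. It assumes the large-sample formula $\bar{x} \pm z_{\alpha/2}\, s/\sqrt{n}$ used in the solution and obtains the critical value from the standard-library `statistics.NormalDist` instead of a table; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, x_bar, s = 64, 2.85, 3.5   # sample size, mean and sd from the question
confidence = 0.95

# Two-sided critical value z_{alpha/2}; about 1.96 for 95% confidence.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

margin = z * s / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(round(margin, 4), round(lower, 4), round(upper, 4))
```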
6. Data on the amount of rainfall per year were collected at 15 locations in the equatorial rainforest in the Amazon Basin of South America. For these locations, the observed average rainfall was $\bar{x} = 80$ inches per year, with a standard deviation $s = 34$ inches. Give a 98% confidence interval for the average amount $\mu$ of rainfall per year in the Amazon Basin. Assume that the data are normally distributed.
A) [56.96; 103.04]
B) [62.79; 97.21]
C) [50.56; 107.34]
D) [75.61; 84.38]
E) [64.71; 95.29]
Solution:
(Section 8.2) Since this is a small sample and the data is normally distributed, we use the interval based on the $T$ distribution. We need to find the value $t$ such that $P(-t < T < t) = 0.98$. This means that $P(T > t) = (1 - 0.98)/2 = 0.01$ and hence $P(T < t) = 0.99$. From Table 18.4 (row 14) we find $t = t_{0.01,14} = 2.624$. The confidence interval is:
$$80 \pm 2.624 \left(\frac{34}{\sqrt{15}}\right) = 80 \pm 23.04 = [56.96;\ 103.04]$$
The answer is A. (Note that the interval in B is obtained using the incorrect value $z = 1.96$ instead of $t = 2.624$.)
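The arithmetic for both the correct interval and the distractor in option B can be sketched in Python. The standard library has no $T$ quantile function, so the critical value 2.624 is taken from the table as in the solution; the helper name `interval` is hypothetical.

```python
from math import sqrt

n, x_bar, s = 15, 80.0, 34.0   # values given in the question
t_crit = 2.624                 # t_{0.01,14}, read from the exam's table
z_wrong = 1.96                 # normal value that leads to option B

def interval(critical_value):
    """Return (lower, upper) for x_bar +/- critical_value * s / sqrt(n)."""
    margin = critical_value * s / sqrt(n)
    return x_bar - margin, x_bar + margin

t_interval = interval(t_crit)
z_interval = interval(z_wrong)

print([round(v, 2) for v in t_interval])
print([round(v, 2) for v in z_interval])
```

The interval based on $t$ is wider than the one based on $z$, which reflects the extra uncertainty from estimating $\sigma$ with only 15 observations.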
Part 2: Long Answer Questions
Record your answer to the long answer questions in the space provided below, specifying clearly your
notation and including a proper justification. Show the details of your calculations.
1. The following data gives the blood glucose level (in mmol/L) for 13 persons who suffer from hypoglycemia (low blood glucose levels), before the first meal of the day:
2.8  4.2  4.6  4.7  4.5  4.3  4.2  5.1  4.9  4.4  4.6  4.9  5.6
a) (5 marks) Find the median ($\tilde{x}$), and the two quartiles ($q_1$, $q_3$).
b) (5 marks) Give the values of the outliers (if they exist).
Solution:
(Section 7.1) a) We arrange the data in increasing order:
$y_1 = 2.8$, $y_2 = y_3 = 4.2$, $y_4 = 4.3$, $y_5 = 4.4$, $y_6 = 4.5$, $y_7 = y_8 = 4.6$, $y_9 = 4.7$, $y_{10} = y_{11} = 4.9$, $y_{12} = 5.1$, $y_{13} = 5.6$
Since 13 is an odd number, the median is $y_{(n+1)/2} = y_7 = 4.6$.
Note that $\frac{n+1}{4} = \frac{14}{4} = 3.5$ and $\frac{3(n+1)}{4} = 10.5$. The first quartile is
$$q_1 = (0.5)\,y_3 + (0.5)\,y_4 = (0.5)(4.2) + (0.5)(4.3) = 4.25.$$
The third quartile is
$$q_3 = (0.5)\,y_{10} + (0.5)\,y_{11} = (0.5)(4.9) + (0.5)(4.9) = 4.9$$
b) $IQR = 4.9 - 4.25 = 0.65$. We calculate the two fences:
Fence 1 $= q_1 - 1.5(IQR) = 4.25 - 0.975 = 3.275$
Fence 2 $= q_3 + 1.5(IQR) = 4.9 + 0.975 = 5.875$
The outliers are the values located outside the fences (i.e. smaller than Fence 1, or larger than Fence 2). The only outlier is 2.8.
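The quartiles, fences and outliers can be recomputed with a small Python sketch. It assumes the standard-library `statistics.quantiles` with `method="exclusive"`, which places the $p$-th quantile at position $(n+1)p$ in the sorted data and therefore matches the rule used in part a); other software may use a different quartile convention and give slightly different values.

```python
from statistics import quantiles

glucose = [2.8, 4.2, 4.6, 4.7, 4.5, 4.3, 4.2, 5.1, 4.9, 4.4, 4.6, 4.9, 5.6]

# method="exclusive" interpolates at position (n + 1) * p in the sorted data.
q1, median, q3 = quantiles(glucose, n=4, method="exclusive")

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Outliers are the observations strictly outside the two fences.
outliers = [x for x in glucose if x < lower_fence or x > upper_fence]

print(q1, median, q3, outliers)
```

Note that the largest value, 5.6, lies inside Fence 2 and is therefore not flagged.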
2. Let $X$ be the cholesterol level for teenagers aged between 13 and 16. Suppose that $X$ has a normal distribution with mean 160 mg/dl and standard deviation 32 mg/dl.
a) (5 marks) What is the probability that a randomly selected teenager aged between 13 and 16 has a cholesterol level between 152 mg/dl and 168 mg/dl?
b) (5 marks) We select a random sample of 16 teenagers aged between 13 and 16. What is the probability that the average cholesterol level for this sample is between 152 mg/dl and 168 mg/dl?
Solution:
a) (Section 5.2) Let $X$ be the cholesterol level of a randomly chosen person. The desired probability is:
$$P(152 < X < 168) = P\left(\frac{152 - 160}{32} < \frac{X - 160}{32} < \frac{168 - 160}{32}\right) = P(-0.25 < Z < 0.25)$$
$$= P(Z < 0.25) - P(Z < -0.25) = 0.5987 - 0.4013 = 0.1974.$$
We used the fact that $P(Z < 0.25) = 0.5987$ (from Table 18.3) and $P(Z < -0.25) = 0.4013$ (from Table 18.2).
b) (Section 7.2) Let $\bar{X}$ be the sample mean. Then $\bar{X}$ has a normal distribution with mean 160 and standard deviation $32/\sqrt{16} = 8$. The desired probability is
$$P(152 < \bar{X} < 168) = P\left(\frac{152 - 160}{8} < \frac{\bar{X} - 160}{8} < \frac{168 - 160}{8}\right)$$
$$= P(-1.00 < Z < 1.00) = P(Z < 1.00) - P(Z < -1.00) = 0.8413 - 0.1587 = 0.6826,$$
where for the last line we used again Tables 18.2 and 18.3.
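To see how averaging narrows the distribution, both probabilities can be computed side by side. This is a minimal Python sketch using the standard-library `statistics.NormalDist` rather than the printed tables, so the last digit may differ slightly from the table-based answers.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 160, 32, 16

# Part a): one teenager, X ~ N(160, 32).
single = NormalDist(mu, sigma)
p_single = single.cdf(168) - single.cdf(152)

# Part b): mean of n = 16 teenagers, X-bar ~ N(160, 32 / sqrt(16)) = N(160, 8).
sample_mean = NormalDist(mu, sigma / sqrt(n))
p_mean = sample_mean.cdf(168) - sample_mean.cdf(152)

print(round(p_single, 4), round(p_mean, 4))
```

The same interval captures far more probability for the sample mean because its standard deviation is 8 instead of 32.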
3. The water in a certain lake has a salinity of around 70 mg/L. The salinity is measured for 5 water samples taken from this lake. Below are the data.
$x_1 = 59.15$, $x_2 = 72.24$, $x_3 = 68.03$, $x_4 = 104.58$, $x_5 = 79.04$
a) (5 marks) What is the geometric mean for this data set?
b) (5 marks) We transform the data using the linear transformation $X' = 2X + 3$. What is the median of the transformed measurements $x'_1, x'_2, x'_3, x'_4, x'_5$?
Solution:
(Section 7.1) a) Method 1. We apply a logarithmic transformation to this data: $Y = \ln(X)$. We obtain the following new data:
$y_1 = 4.08$, $y_2 = 4.28$, $y_3 = 4.22$, $y_4 = 4.65$, $y_5 = 4.37$.
The mean of the log-transformed data is:
$$\bar{y} = \frac{y_1 + y_2 + y_3 + y_4 + y_5}{5} = \frac{4.08 + 4.28 + 4.22 + 4.65 + 4.37}{5} = 4.32.$$
The geometric mean of the original data is $g = e^{\bar{y}} = e^{4.32} = 75.19$.
Method 2. The geometric mean is:
$$g = \left(\prod_{i=1}^{5} x_i\right)^{1/5} = (59.15 \times 72.24 \times 68.03 \times 104.58 \times 79.04)^{1/5} = 75.19.$$
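The two methods are algebraically equivalent, which a short Python sketch can confirm. It assumes Python 3.8 or later for `math.prod` and `statistics.geometric_mean`; the third computation is included only as an independent comparison.

```python
from math import exp, log, prod
from statistics import geometric_mean

salinity = [59.15, 72.24, 68.03, 104.58, 79.04]  # mg/L

# Method 1: exponentiate the mean of the natural logarithms.
g_log = exp(sum(log(x) for x in salinity) / len(salinity))

# Method 2: n-th root of the product.
g_root = prod(salinity) ** (1 / len(salinity))

# Standard-library helper, for comparison.
g_lib = geometric_mean(salinity)

print(round(g_log, 2), round(g_root, 2), round(g_lib, 2))
```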
b) (Method 1) The transformed measurements are:
$x'_1 = 121.3$, $x'_2 = 147.48$, $x'_3 = 139.06$, $x'_4 = 212.16$, $x'_5 = 161.08$
We arrange the transformed data in increasing order. We obtain:
$y'_1 = 121.3$, $y'_2 = 139.06$, $y'_3 = 147.48$, $y'_4 = 161.08$, $y'_5 = 212.16$
The median of the transformed data set is $y'_3 = 147.48$.
(Method 2) We first find the median of the original data set $x_1, x_2, x_3, x_4, x_5$. For this, we arrange the original data in increasing order and obtain:
59.15, 68.03, 72.24, 79.04, 104.58
The median of the original data is 72.24. The median of the transformed data is $2 \times 72.24 + 3 = 147.48$.
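Both methods agree because an increasing linear transformation preserves the order of the observations. A minimal Python sketch of the two routes follows; the function name `transform` is hypothetical and simply encodes $X' = 2X + 3$.

```python
from statistics import median

salinity = [59.15, 72.24, 68.03, 104.58, 79.04]

def transform(x):
    """Linear transformation X' = 2X + 3 from the question."""
    return 2 * x + 3

# Method 1: transform every value, then take the median.
median_of_transformed = median([transform(x) for x in salinity])

# Method 2: take the median first, then transform it.
transformed_median = transform(median(salinity))

print(round(median_of_transformed, 2), round(transformed_median, 2))
```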
4. The seed weight of the princess bean Phaseolus vulgaris has a normal distribution with mean $\mu = 500$ mg and standard deviation $\sigma = 119$ mg. We select a random sample of size $n$ from the seeds of Phaseolus vulgaris. Let $\bar{X}$ denote the mean weight of the seeds in this sample. Find the sample size $n$ such that $P(\bar{X} > 550) = 0.2$.
Solution:
(Section 7.2) Let $X$ be the weight of a randomly chosen seed and $\bar{X}$ be the mean weight of the seeds in a sample of size $n$. Since $X$ is normally distributed,
$$Z = \frac{\bar{X} - 500}{119/\sqrt{n}} \sim N(0, 1).$$
We want to find $n$ such that $P(\bar{X} > 550) = 0.2$ or equivalently $P(\bar{X} < 550) = 0.8$. By standardization, it follows that:
$$0.8 = P(\bar{X} < 550) = P\left(Z < \frac{550 - 500}{119/\sqrt{n}}\right) = P\left(Z < \frac{50\sqrt{n}}{119}\right).$$
In Table 18.3, we look for a value $z$ such that $P(Z < z) = 0.8$. We find $z = 0.845$. Hence
$$\frac{50\sqrt{n}}{119} = 0.845, \quad \text{and} \quad \sqrt{n} = \frac{(119)(0.845)}{50} = 2.0111.$$
Hence $n = (2.0111)^2 = 4.045 \approx 4$.
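The same equation can be solved numerically as a check. This sketch assumes the standard-library `statistics.NormalDist` and uses the exact quantile (about 0.8416) instead of the table value 0.845, so the unrounded $n$ differs slightly from 4.045 while the rounded answer is unchanged.

```python
from statistics import NormalDist

mu, sigma, threshold, upper_tail = 500, 119, 550, 0.2

# z such that P(Z < z) = 0.8; the table lookup in the solution gives 0.845.
z = NormalDist().inv_cdf(1 - upper_tail)

# Solve (threshold - mu) * sqrt(n) / sigma = z for n.
n_exact = (sigma * z / (threshold - mu)) ** 2
n = round(n_exact)

# Check the tail probability actually achieved with the rounded n.
achieved = 1 - NormalDist(mu, sigma / n ** 0.5).cdf(threshold)

print(round(n_exact, 3), n, round(achieved, 3))
```

With $n = 4$ the achieved tail probability is very close to, though not exactly, 0.2, because $n$ must be a whole number.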