2379-sample-midterm2-sol

MAT 2379 Sample Midterm 2 (with solutions)

Based on Sections 5.2, 7.1, 7.2, 7.3, 8.1 and 8.2

Date:
Instructor: Xiao Liang
Time: 80 minutes
Student Number:
Family Name:
First Name:

This is a closed book examination. You can bring your own formula sheet (one page, one-sided). Some statistical tables are included at the end of the exam. Only Faculty standard calculators are permitted: TI30, TI34, Casio fx-260, Casio fx-300. You are not allowed to use any electronic device during the exam. Cell phones should be put away.

The exam consists of 6 multiple choice questions and 4 long answer questions. Each multiple choice question is worth 5 marks and each long answer question is worth 10 marks. The total number of marks is 70.

NOTE: At the end of the examination, hand in the entire booklet.

For professor's use:

                              Number of marks
Total for all MC Questions
Long Answer Question 1
Long Answer Question 2
Long Answer Question 3
Long Answer Question 4
Total
Part 1: Multiple Choice Questions

Record your answer to the multiple choice questions in the table below:

Question   1  2  3  4  5  6
Answer     C  B  C  A  D  A

1. Some biology students were interested in analyzing the amount of time that bees spend gathering nectar. 39 bees visited a high-density flower patch and the time (in seconds) that each one of them spent gathering nectar was recorded. Below are the normal QQ-plot and the histogram for this data set ($x$).

[Figure: normal QQ-plot and histogram of x]

Which one of the following statements is correct? (Only one statement is correct.)

A) It is reasonable to assume that the time gathering nectar is normally distributed.
B) It is reasonable to assume that the time gathering nectar has a T distribution with 38 degrees of freedom.
C) The distribution of the time gathering nectar is highly skewed to the right. It is not reasonable to assume that the time gathering nectar is normally distributed.
D) The distribution of the time gathering nectar is highly skewed to the left. It is not reasonable to assume that the time gathering nectar is normally distributed.
E) The distribution of the time gathering nectar is approximately symmetric.

Solution: (Sections 7.1 and 7.3) The distribution is highly skewed to the right. There is a curvilinear tendency in the QQ-plot, so it is not reasonable to assume that the times are normally distributed. The normal QQ-plot should not be used for the T distribution. The answer is C.
2. The width of the shell of a burgundy snail (Helix pomatia) has a normal distribution with mean 40 mm and standard deviation 10 mm. Which of the R commands below gives a value $x_0$ such that 70% of burgundy snails have a width larger than $x_0$?

A) pnorm(0.3, 40, 10)
B) qnorm(0.3, 40, 10)
C) pnorm(0.7, 40, 10)
D) qnorm(0.7, 40, 10)
E) 1 - pnorm(0.7, 40, 10)

Solution: (Section 5.2) Let $X$ be the width of a randomly chosen snail. Then $X$ has a normal distribution with mean $\mu = 40$ and standard deviation $\sigma = 10$. We have to find a value $x_0$ such that $P(X > x_0) = 0.70$. This means that $P(X < x_0) = 0.30$, so $x_0$ is the 0.30 quantile of the distribution. The value is given by the R command qnorm(0.3, 40, 10). The answer is B.

3. Average levels of carbon monoxide (CO) in homes vary between 0 and 2.00 parts per million (ppm). We collected CO information for three cities: Ottawa, Montreal and Toronto. For each city, we selected a sample of 100 houses and recorded their CO level. We then created 3 data sets of 100 observations each, called "Ottawa", "Montreal" and "Toronto". Below are the boxplots and histograms for these data sets. The labels of the variables are missing from the histograms, but are included in the boxplots. Our task is to identify the missing labels.

[Figure: boxplots (a) Ottawa, (b) Montreal, (c) Toronto; unlabeled histograms (d), (e), (f)]

Which one of the following statements is correct?

A) Histogram (d) is for Ottawa, (e) is for Montreal, and (f) is for Toronto.
B) Histogram (f) is for Ottawa, (e) is for Montreal, and (d) is for Toronto.
C) Histogram (e) is for Ottawa, (f) is for Montreal, and (d) is for Toronto.
D) Histogram (f) is for Ottawa, (d) is for Montreal, and (e) is for Toronto.
E) Histogram (d) is for Ottawa, (f) is for Montreal, and (e) is for Toronto.

Solution: (Section 7.1) We have to match each boxplot with the corresponding histogram. All samples have the same range, so the whiskers cannot be used for the matching. Notice that one of the histograms is skewed to the right, while the other two histograms are approximately symmetric. The boxplot of the skewed distribution must have its median away from the center. Therefore (b) matches (f). For the symmetric distributions, we can compare the dispersion. The values in histogram (e) are less dispersed (i.e. more concentrated in the center) than the values in histogram (d). Comparing boxplots (a) and (c), the box in (a) is smaller (i.e. less dispersed). Therefore (a) matches (e) and (c) matches (d). The answer is C.

4. Glaucoma is a disease of the eye that is manifested by high intraocular pressure. Assume that in the general population, the intraocular pressure has approximately a normal distribution with mean 16 mm Hg and standard deviation 3 mm Hg. The usual range for intraocular pressure is considered to be between 12 mm Hg and 20 mm Hg. What proportion of the general population has intraocular pressure in the usual range?

A) 0.8164
B) 0.0918
C) 0.9082
D) 0.1426
E) 0.2875

Solution: (Section 5.2) We wish to calculate $P(12 \le X \le 20)$, where $X$ has a normal distribution with mean $\mu = 16$ and standard deviation $\sigma = 3$. Using standardization, we have:

$$P(12 \le X \le 20) = P\left(\frac{12 - 16}{3} \le \frac{X - 16}{3} \le \frac{20 - 16}{3}\right) = P(-1.33 \le Z \le 1.33)$$

$$= P(Z \le 1.33) - P(Z < -1.33) = 0.9082 - 0.0918 = 0.8164.$$

The answer is A.

5. A scientist studies the acquisition of rainfall data in the Guinea Savanna part of Nigeria. One of the major data acquisition problems in Sub-Saharan Africa is instrumental error, which is associated with the functioning of the instruments. An error encountered frequently with rain gauges (instruments used by hydrologists) occurs during the siphoning cycle, when rain continues to enter the rain gauge. In a sample of 64 observations, it was found that the mean measurement error was $\bar{x} = 2.85$ mm with a standard deviation $s = 3.5$ mm. Calculate a 95% confidence interval for the average measurement error $\mu$.

A) $2.85 \pm 1.645$
B) $2.85 \pm 1.96$
C) $2.85 \pm 2.262$
D) $2.85 \pm 0.8575$
E) $2.85 \pm 0.7197$

Solution: (Section 8.1) This is a large sample interval. A 95% confidence interval for $\mu$ is

$$2.85 \pm 1.96\left(\frac{3.5}{\sqrt{64}}\right) = 2.85 \pm 0.8575.$$

The answer is D.
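The normal-distribution calculations in Questions 2, 4 and 5 can be checked numerically. The following is a minimal sketch, assuming Python's standard `statistics.NormalDist` as a stand-in for R's `pnorm`/`qnorm` and for the printed tables; the variable names are illustrative and not part of the exam. Because it uses exact quantiles rather than z-values rounded to two decimals, the Question 4 result differs slightly from the table-based 0.8164.

```python
from math import sqrt
from statistics import NormalDist

# Question 2: x0 with P(X > x0) = 0.70 is the 0.30 quantile of N(40, 10).
# inv_cdf plays the role of qnorm(0.3, 40, 10) here.
snail_width = NormalDist(mu=40, sigma=10)
x0 = snail_width.inv_cdf(0.30)

# Question 4: P(12 <= X <= 20) for X ~ N(16, 3), without rounding z to 1.33.
pressure = NormalDist(mu=16, sigma=3)
p_usual = pressure.cdf(20) - pressure.cdf(12)

# Question 5: large-sample 95% interval, xbar +/- z * s / sqrt(n).
xbar, s, n = 2.85, 3.5, 64
z = NormalDist().inv_cdf(0.975)  # close to the tabulated 1.96
margin = z * s / sqrt(n)

print(f"Q2: x0 = {x0:.2f} mm")
print(f"Q4: P(12 <= X <= 20) = {p_usual:.4f}")
print(f"Q5: {xbar} +/- {margin:.4f}")
```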
6. Data on the amount of rainfall per year was collected in 15 locations in the equatorial rainforest in the Amazon Basin of South America. For these locations, an average rainfall of $\bar{x} = 80$ inches per year was observed, with a standard deviation $s = 34$ inches. Give a 98% confidence interval for the average amount $\mu$ of rainfall per year in the Amazon Basin. Assume that the data is normally distributed.

A) [56.96; 103.04]
B) [62.79; 97.21]
C) [50.56; 107.34]
D) [75.61; 84.38]
E) [64.71; 95.29]

Solution: (Section 8.2) Since this is a small sample and the data is normally distributed, we use the interval based on the T distribution with $n - 1 = 14$ degrees of freedom. We need to find the value $t$ such that $P(-t < T < t) = 0.98$. This means that $P(T > t) = (1 - 0.98)/2 = 0.01$ and hence $P(T < t) = 0.99$. From Table 18.4 (row 14) we find $t = t_{0.01,14} = 2.624$. The confidence interval is:

$$80 \pm 2.624\left(\frac{34}{\sqrt{15}}\right) = 80 \pm 23.04 = [56.96; 103.04].$$

The answer is A. (Note that the interval in B is obtained using the incorrect value $z = 1.96$ instead of $t = 2.624$.)

Part 2: Long Answer Questions

Record your answer to the long answer questions in the space provided below, specifying clearly your notation and including a proper justification. Show the details of your calculations.

1. The following data gives the blood glucose level (in mmol/L) for 13 persons who suffer from hypoglycemia (low blood glucose levels), before the first meal of the day:

2.8  4.2  4.6  4.7  4.5  4.3  4.2  5.1  4.9  4.4  4.6  4.9  5.6

a) (5 marks) Find the median ($\tilde{x}$) and the two quartiles ($q_1$, $q_3$).
b) (5 marks) Give the values of the outliers (if they exist).

Solution: (Section 7.1)

a) We arrange the data in increasing order:

$y_1 = 2.8$, $y_2 = y_3 = 4.2$, $y_4 = 4.3$, $y_5 = 4.4$, $y_6 = 4.5$, $y_7 = y_8 = 4.6$, $y_9 = 4.7$, $y_{10} = y_{11} = 4.9$, $y_{12} = 5.1$, $y_{13} = 5.6$

Since $n = 13$ is an odd number, the median is $\tilde{x} = y_{(n+1)/2} = y_7 = 4.6$.

Note that $\frac{n+1}{4} = \frac{14}{4} = 3.5$ and $\frac{3(n+1)}{4} = 10.5$, so each quartile lies halfway between two consecutive ordered values.

The first quartile is $q_1 = (0.5)y_3 + (0.5)y_4 = (0.5)(4.2) + (0.5)(4.3) = 4.25$.

The third quartile is $q_3 = (0.5)y_{10} + (0.5)y_{11} = (0.5)(4.9) + (0.5)(4.9) = 4.9$.
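The quartile rule used in part a), which locates each quantile at position $(n+1)p$ in the ordered data and interpolates linearly between neighbours, can be sketched as follows. The helper name `quantile_n_plus_1` is hypothetical and chosen only for illustration; other textbooks and software use different quantile conventions that may give different values.

```python
from math import floor

def quantile_n_plus_1(sorted_data, p):
    """Quantile at 1-based position (n + 1) * p, with linear interpolation."""
    n = len(sorted_data)
    position = (n + 1) * p
    lower = floor(position)
    fraction = position - lower
    # Keep both indices inside the data for extreme values of p.
    lower = min(max(lower, 1), n)
    upper = min(lower + 1, n)
    return (1 - fraction) * sorted_data[lower - 1] + fraction * sorted_data[upper - 1]

# Blood glucose data from Question 1, sorted in increasing order.
glucose = sorted([2.8, 4.2, 4.6, 4.7, 4.5, 4.3, 4.2, 5.1, 4.9, 4.4, 4.6, 4.9, 5.6])

q1 = quantile_n_plus_1(glucose, 0.25)      # position 3.5
median = quantile_n_plus_1(glucose, 0.50)  # position 7
q3 = quantile_n_plus_1(glucose, 0.75)      # position 10.5

print(f"q1 = {q1:.2f}, median = {median:.2f}, q3 = {q3:.2f}")
```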
b) $IQR = q_3 - q_1 = 4.9 - 4.25 = 0.65$. We calculate the two fences:

Fence 1 $= q_1 - 1.5(IQR) = 4.25 - 0.975 = 3.275$
Fence 2 $= q_3 + 1.5(IQR) = 4.9 + 0.975 = 5.875$

The outliers are the values located outside the fences (i.e. smaller than Fence 1, or larger than Fence 2). The only outlier is 2.8.

2. Let $X$ be the cholesterol level for teenagers with age between 13 and 16. Suppose that $X$ has a normal distribution with mean 160 mg/dl and standard deviation 32 mg/dl.

a) (5 marks) What is the probability that a randomly selected teenager with age between 13 and 16 has a cholesterol level between 152 mg/dl and 168 mg/dl?
b) (5 marks) We select a random sample of 16 teenagers with age between 13 and 16. What is the probability that the average cholesterol level for this sample is between 152 mg/dl and 168 mg/dl?

Solution:

a) (Section 5.2) Let $X$ be the cholesterol level of a randomly chosen teenager. The desired probability is:

$$P(152 < X < 168) = P\left(\frac{152 - 160}{32} < \frac{X - 160}{32} < \frac{168 - 160}{32}\right) = P(-0.25 < Z < 0.25)$$

$$= P(Z < 0.25) - P(Z < -0.25) = 0.5987 - 0.4013 = 0.1974.$$

We used the fact that $P(Z < 0.25) = 0.5987$ (from Table 18.3) and $P(Z < -0.25) = 0.4013$ (from Table 18.2).

b) (Section 7.2) Let $\bar{X}$ be the sample mean. Then $\bar{X}$ has a normal distribution with mean 160 and standard deviation $32/\sqrt{16} = 8$. The desired probability is

$$P(152 < \bar{X} < 168) = P\left(\frac{152 - 160}{8} < \frac{\bar{X} - 160}{8} < \frac{168 - 160}{8}\right) = P(-1.00 < Z < 1.00)$$

$$= P(Z < 1.00) - P(Z < -1.00) = 0.8413 - 0.1587 = 0.6826,$$

where for the last line we used again Tables 18.2 and 18.3.

3. The water in a certain lake has a salinity of around 70 mg/L. The salinity is measured for 5 water samples taken from this lake. Below are the data.

$x_1 = 59.15$, $x_2 = 72.24$, $x_3 = 68.03$, $x_4 = 104.58$, $x_5 = 79.04$

a) (5 marks) What is the geometric mean for this data set?
b) (5 marks) We transform the data using the linear transformation $X' = 2X + 3$. What is the median of the transformed measurements $x'_1, x'_2, x'_3, x'_4, x'_5$?

Solution: (Section 7.1)

a) Method 1. We apply a logarithmic transformation to this data: $Y = \ln(X)$. We obtain the following new data:

$y_1 = 4.08$, $y_2 = 4.28$, $y_3 = 4.22$, $y_4 = 4.65$, $y_5 = 4.37$.
The mean of the log-transformed data is:

$$\bar{y} = \frac{y_1 + y_2 + y_3 + y_4 + y_5}{5} = \frac{4.08 + 4.28 + 4.22 + 4.65 + 4.37}{5} = 4.32.$$

The geometric mean of the original data is $g = e^{\bar{y}} = e^{4.32} = 75.19$.

Method 2. The geometric mean is:

$$g = \left(\prod_{i=1}^{5} x_i\right)^{1/5} = (59.15 \times 72.24 \times 68.03 \times 104.58 \times 79.04)^{1/5} = 75.19.$$

b) (Method 1) The transformed measurements are:

$x'_1 = 121.3$, $x'_2 = 147.48$, $x'_3 = 139.06$, $x'_4 = 212.16$, $x'_5 = 161.08$

We arrange the transformed data in increasing order. We obtain:

$y'_1 = 121.3$, $y'_2 = 139.06$, $y'_3 = 147.48$, $y'_4 = 161.08$, $y'_5 = 212.16$

The median of the transformed data set is $y'_3 = 147.48$.

(Method 2) We first find the median of the original data set $x_1, x_2, x_3, x_4, x_5$. For this, we arrange the original data in increasing order and obtain:

59.15, 68.03, 72.24, 79.04, 104.58

The median of the original data is 72.24. Since the transformation $X' = 2X + 3$ is increasing, it preserves the order of the data, so the median of the transformed data is $2 \times 72.24 + 3 = 147.48$.

4. The seed weight of the princess bean Phaseolus vulgaris has a normal distribution with mean $\mu = 500$ mg and standard deviation $\sigma = 119$ mg. We select a random sample of size $n$ from the seeds of Phaseolus vulgaris. Let $\bar{X}$ denote the mean weight of the seeds in this sample. Find the sample size $n$ such that $P(\bar{X} > 550) = 0.2$.

Solution: (Section 7.2) Let $X$ be the weight of a randomly chosen seed and $\bar{X}$ be the mean weight of the seeds in a sample of size $n$. Since $X$ is normally distributed,

$$Z = \frac{\bar{X} - 500}{119/\sqrt{n}} \sim N(0, 1).$$

We want to find $n$ such that $P(\bar{X} > 550) = 0.2$ or equivalently $P(\bar{X} < 550) = 0.8$. By standardization, it follows that:

$$0.8 = P(\bar{X} < 550) = P\left(Z < \frac{550 - 500}{119/\sqrt{n}}\right) = P\left(Z < \frac{50\sqrt{n}}{119}\right).$$

In Table 18.3, we look for a value $z$ such that $P(Z < z) = 0.8$. We find $z = 0.845$. Hence

$$\frac{50\sqrt{n}}{119} = 0.845 \quad\text{and}\quad \sqrt{n} = \frac{(119)(0.845)}{50} = 2.0111.$$

Hence $n = (2.0111)^2 = 4.045 \approx 4$.
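The numerical answers to Long Answer Questions 2, 3 and 4 can be reproduced with a short script. This is a sketch only, assuming Python's standard library in place of the statistical tables; the variable names are illustrative. It uses the exact normal quantile (about 0.8416) where the solution reads 0.845 from Table 18.3, so the unrounded sample size comes out slightly below 4.045 but still rounds to 4.

```python
from math import exp, log, sqrt
from statistics import NormalDist, fmean, median

# Question 2a: one teenager, X ~ N(160, 32).
single = NormalDist(mu=160, sigma=32)
p_single = single.cdf(168) - single.cdf(152)

# Question 2b: mean of 16 teenagers, standard deviation 32 / sqrt(16) = 8.
sample_mean = NormalDist(mu=160, sigma=32 / sqrt(16))
p_mean = sample_mean.cdf(168) - sample_mean.cdf(152)

# Question 3a: geometric mean as exp of the mean of the logs (Method 1).
salinity = [59.15, 72.24, 68.03, 104.58, 79.04]
geo_mean = exp(fmean([log(x) for x in salinity]))

# Question 3b: median of the transformed data 2x + 3, computed both ways.
median_direct = median([2 * x + 3 for x in salinity])
median_via_original = 2 * median(salinity) + 3

# Question 4: solve 50 * sqrt(n) / 119 = z, where P(Z < z) = 0.8.
z = NormalDist().inv_cdf(0.8)
n_exact = (119 * z / 50) ** 2

print(f"Q2a: {p_single:.4f}  Q2b: {p_mean:.4f}")
print(f"Q3a: {geo_mean:.2f}  Q3b: {median_direct:.2f}")
print(f"Q4: n = {n_exact:.3f}, rounded to {round(n_exact)}")
```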