2379-midterm2-23A-sol (1)

School

University of Ottawa

Course

2379

Subject

Mathematics

Date

Jan 9, 2024

MAT 2379B Midterm Examination
November 15, 2023
Professor Raluca Balan
Time: 80 minutes

Student Number:
Family Name:
First Name:

This is a closed book examination. You can bring your own formula sheet (one page, one-sided). Some statistical tables are included on the last page of the booklet. Only Faculty standard calculators are permitted: TI30, TI34, Casio fx-260, Casio fx-300. You are not allowed to use any electronic device during the exam. Cell phones should be put away.

The exam consists of 6 multiple choice questions and 4 long answer questions. Each multiple choice question is worth 5 marks and each long answer question is worth 10 marks. The total number of marks is 70.

NOTE: At the end of the examination, hand in the entire booklet.

For professor's use (number of marks):
Total for all MC Questions:
Long Answer Question 1:
Long Answer Question 2:
Long Answer Question 3:
Long Answer Question 4:
Total:
Part 1: Multiple Choice Questions

Record your answer to the multiple choice questions in the table below:

Question: 1 2 3 4 5 6
Answer:

1. Hydrochloric acid (HCl) is a highly acidic substance found in the human stomach, where it aids in the digestion of food. Measurements of the pH level of HCl for 125 patients have been recorded in R in the variable x. Below are the histogram and the QQ plot for this data. Which of the following statements is correct? (Only one statement is correct.)

A) The histogram is approximately symmetric and the QQ plot has a strong linear tendency. It is reasonable to assume that this data is normally distributed.
B) The QQ plot has a strong curvilinear tendency. It is not reasonable to assume that this data is normally distributed.
C) The distribution of the pH level is highly skewed to the right. It is not reasonable to assume that this data is normally distributed.
D) The distribution of the pH level is highly skewed to the left. It is not reasonable to assume that this data is normally distributed.
E) We cannot draw any conclusion about the distribution of the pH level.

Solution: The distribution is symmetric and the QQ plot is linear, so it is reasonable to assume that the data is normally distributed. The answer is A.
2. Assume that the length of a blue whale has a normal distribution with mean 33 m and standard deviation 4 m. Use the R output below to find a value $x_0$ such that 80% of blue whales have a length smaller than $x_0$. (In this output, * denotes multiplication.)

A) qnorm(0.2, 33, 4)
B) 33 - 4 * qnorm(0.8, 33, 4)
C) pnorm(0.8, 33, 4)
D) 1 - pnorm(0.8, 33, 4)
E) 33 + 4 * qnorm(0.8, 0, 1)

Solution: Let $X$ be the length of a randomly chosen blue whale. Then $X$ has a normal distribution with mean $\mu = 33$ and standard deviation $\sigma = 4$. We have to find a value $x_0$ such that $P(X < x_0) = 0.80$. The value is given by the R command qnorm(0.8, 33, 4), but this command is not included among the listed answers. By standardization,

$0.8 = P(X < x_0) = P\left(\frac{X - 33}{4} < \frac{x_0 - 33}{4}\right) = P(Z < z_0)$,

where $z_0 = \frac{x_0 - 33}{4}$ = qnorm(0.8, 0, 1). Solving for $x_0$ we obtain $x_0 = 33 + 4 z_0$. The answer is E.

3. Measurements of the length have been recorded for 3 species of bees: red, yellow and green. For each species, we selected a sample of 100 bees and measured their lengths. This data was saved in R in the variables "red", "yellow" and "green". Below are the histograms and boxplots for these 3 data sets. The labels of the variables are missing from the boxplots, but are included in the histograms. Our task is to identify the missing labels.

Histograms: (a) green, (b) yellow, (c) red. Boxplots: (d), (e), (f).

Which one of the following statements is correct? (Only one statement is correct.)

A) boxplot (d) is for red, (e) is for yellow, and (f) is for green
B) boxplot (f) is for red, (e) is for yellow, and (d) is for green
C) boxplot (e) is for red, (f) is for yellow, and (d) is for green
D) boxplot (f) is for red, (d) is for yellow, and (e) is for green
E) boxplot (d) is for red, (f) is for yellow, and (e) is for green

Solution: We have to match each histogram with the corresponding boxplot. Histogram (c) is symmetric, and so is boxplot (d). So (c) is matched with (d), i.e. (d) is for red. Histogram (b) is skewed to the left, and this corresponds to boxplot (f), which has an asymmetry in the same direction (of larger values), so (b) is matched with (f), i.e. (f) is for yellow. Finally, (a) is matched with (e), i.e. (e) is for green. The answer is E.

4. Suppose that the body size of a snowy owl is normally distributed with a mean of 61.5 cm and a standard deviation of 4.75 cm. What is the probability that a randomly chosen snowy owl has a body that is larger than 64 cm?

A) 0.2981
B) 0.9918
C) 0.9082
D) 0.8296
E) 0.2875

Solution: We calculate $P(X \ge 64)$, where $X$ has a normal distribution with mean $\mu = 61.5$ and standard deviation $\sigma = 4.75$. Using standardization, we have:

$P(X \ge 64) = P\left(\frac{X - 61.5}{4.75} \ge \frac{64 - 61.5}{4.75}\right) = P(Z \ge 0.53) = 1 - P(Z < 0.53) = 1 - 0.7019 = 0.2981$.

The answer is A.

5. A study on the polar bears in the Beaufort Sea shows that the bears are fasting. Because of this, their cubs have smaller weights at birth. We measure the weight at birth (in grams) for a sample of 50 cubs, yielding a sample mean $\bar{x} = 715$ g and a standard deviation $s = 123$ g. Calculate a 99% confidence interval for the average cub weight $\mu$ at birth.

A) [629.84; 800.16]
B) [680.91; 749.09]
C) [689.62; 740.38]
D) [670.21; 759.79]
E) [702.12; 727.88]

Solution: This is a large sample interval. We need to find $z$ such that $P(-z < Z < z) = 0.99$. This means that $P(Z < -z) = P(Z > z) = (1 - 0.99)/2 = 0.005$ and $P(Z < z) = 0.99 + 0.005 = 0.995$. In Table 18.3, we find $P(Z < 2.57) = 0.9949$ and $P(Z < 2.58) = 0.9951$, so we choose $z = 2.575$. A 99% confidence interval for $\mu$ is

$715 \pm 2.575 \left(\frac{123}{\sqrt{50}}\right) = 715 \pm 44.79 = [670.21;\ 759.79]$.

The answer is D. The wrong answer B is obtained using $z = 1.96$.

6. The following data gives the weight of 8 corn cobs which were produced using an organic corn fertilizer:

212 234 259 189 245 176 203 215
For this data, the sample mean is $\bar{x} = 216.625$ and the sample standard deviation is $s = 28.09645$. Find a 90% confidence interval for the average cob weight. Assume that the data is normally distributed.

A) [197.801; 235.449]
B) [193.132; 240.118]
C) [200.284; 232.966]
D) [197.155; 236.095]
E) [195.811; 237.439]

Solution: Since this is a small sample and the data is normally distributed, we use the interval based on the T distribution. We need to find the value $t$ such that $P(-t < T < t) = 0.90$. This means that $P(T > t) = (1 - 0.90)/2 = 0.05$ and hence $P(T < t) = 0.95$. From Table 18.4 (row 7) we find $t = t_{0.05, 7} = 1.895$. The confidence interval is:

$216.625 \pm 1.895 \left(\frac{28.09645}{\sqrt{8}}\right) = 216.625 \pm 18.8242 = [197.8008;\ 235.4492]$.

The answer is A. (Note that the interval in D is obtained using the incorrect value $z = 1.96$ instead of $t = 1.895$.)

Long answer questions are included on the following pages.

Part 2: Long Answer Questions

Record your answer to the long answer questions in the space provided below, specifying clearly your notation and including a proper justification. Show the details of your calculations.

1. Platelets, also known as thrombocytes, are small, irregularly shaped cell fragments that play a crucial role in blood clotting. The normal range for platelet counts in adults is typically between 150 thousand and 450 thousand platelets per microliter of blood. Below is some data on the number of platelets (in thousands) per microliter of blood, which has been recorded for 12 patients:

159 132 160 165 163 197 176 160 169 164 161 183

a) (5 marks) Find the median ($\tilde{x}$) and the quartiles $q_1$ and $q_3$.
b) (5 marks) Find the outliers, if they exist. Justify your answer.
Solution: a) We arrange the data in increasing order:

$y_1 = 132$, $y_2 = 159$, $y_3 = y_4 = 160$, $y_5 = 161$, $y_6 = 163$, $y_7 = 164$, $y_8 = 165$, $y_9 = 169$, $y_{10} = 176$, $y_{11} = 183$, $y_{12} = 197$

Because $n = 12$ is even and $\frac{n}{2} = 6$, the median is

$\tilde{x} = \frac{y_6 + y_7}{2} = \frac{163 + 164}{2} = 163.5$.

To find the quartiles, we note that $\frac{n+1}{4} = \frac{13}{4} = 3.25$ and $\frac{3(n+1)}{4} = 9.75$. The first quartile is

$q_1 = 0.75\, y_3 + 0.25\, y_4 = (0.75)(160) + (0.25)(160) = 160$.
The third quartile is

$q_3 = 0.25\, y_9 + 0.75\, y_{10} = (0.25)(169) + (0.75)(176) = 174.25$.

b) $\text{IQR} = q_3 - q_1 = 174.25 - 160 = 14.25$. We calculate the two fences:

$\text{Fence}_1 = q_1 - 1.5\,\text{IQR} = 160 - 1.5 \cdot 14.25 = 138.625$
$\text{Fence}_2 = q_3 + 1.5\,\text{IQR} = 174.25 + 1.5 \cdot 14.25 = 195.625$

The outliers are the values outside the two fences: 132 and 197.

2. Suppose that the height of an 8-year-old girl has a normal distribution with a mean of 128 cm and a standard deviation of 10 cm.

a) (5 marks) What is the probability that a randomly selected 8-year-old girl has a height between 124 cm and 132 cm?
b) (5 marks) We select a random sample of 55 girls of age 8. What is the approximate probability that the average height for this sample is between 124 cm and 132 cm?

Solution: a) Let $X$ be the height of a randomly chosen girl. The desired probability is:

$P(124 < X < 132) = P\left(\frac{124 - 128}{10} < \frac{X - 128}{10} < \frac{132 - 128}{10}\right) = P(-0.4 < Z < 0.4) = P(Z < 0.4) - P(Z < -0.4) = 0.6554 - 0.3446 = 0.3108$.

We used the fact that $P(Z < 0.4) = 0.6554$ (from Table 18.3) and $P(Z < -0.4) = 0.3446$ (from Table 18.2).

b) Let $\bar{X}$ be the mean of the sample of size 55. By the central limit theorem, $\bar{X}$ has approximately a normal distribution with mean 128 and standard deviation $10/\sqrt{55} = 1.3484$. The desired probability is

$P(124 < \bar{X} < 132) = P\left(\frac{124 - 128}{1.3484} < \frac{\bar{X} - 128}{1.3484} < \frac{132 - 128}{1.3484}\right) = P(-2.97 < Z < 2.97) = P(Z < 2.97) - P(Z < -2.97) = 0.9985 - 0.0015 = 0.9970$,

where for the last line we used again Tables 18.2 and 18.3.

3. The maximum speed at which a female deer can run depends on various factors, including the species of deer, age, health, and environmental conditions. Some species of deer, such as the white-tailed deer (Odocoileus virginianus), can reach speeds of 48 to 56 kilometers per hour for short distances when they are sprinting. However, their sustained running speed is typically lower.
Below is the data for the running speed for a sample of $n = 6$ white-tailed female deer:

$x_1 = 45.5$, $x_2 = 37.5$, $x_3 = 42.1$, $x_4 = 34.8$, $x_5 = 34.0$, $x_6 = 32.9$

a) (5 marks) What is the geometric mean for this data set?
b) (5 marks) We transform the data using the linear transformation $X' = 5X - 3$. What is the median of the transformed measurements $x_1', x_2', x_3', x_4', x_5', x_6'$?

Solution: a) Method 1. We apply a logarithmic transformation to this data set: $Y = \ln(X)$. We obtain the following data:

$y_1 = \ln(45.5)$, $y_2 = \ln(37.5)$, $y_3 = \ln(42.1)$, $y_4 = \ln(34.8)$, $y_5 = \ln(34)$, $y_6 = \ln(32.9)$

The mean of the log-transformed data is:

$\bar{y} = \frac{y_1 + y_2 + y_3 + y_4 + y_5 + y_6}{6} = \frac{\ln(45.5) + \ln(37.5) + \ln(42.1) + \ln(34.8) + \ln(34) + \ln(32.9)}{6} = 3.625$

The geometric mean of the original data is: $g = e^{\bar{y}} = e^{3.625} = 37.53$.

Method 2. The geometric mean is:

$g = \left(\prod_{i=1}^{6} x_i\right)^{1/6} = (45.5 \times 37.5 \times 42.1 \times 34.8 \times 34 \times 32.9)^{1/6} = 37.53$.

b) (Method 1) The transformed measurements are:

$x_1' = 224.5$, $x_2' = 184.5$, $x_3' = 207.5$, $x_4' = 171$, $x_5' = 167$, $x_6' = 161.5$

We arrange this data in increasing order. We obtain:

$y_1 = 161.5$, $y_2 = 167$, $y_3 = 171$, $y_4 = 184.5$, $y_5 = 207.5$, $y_6 = 224.5$

The median of the transformed data is

$m = \frac{y_3 + y_4}{2} = \frac{171 + 184.5}{2} = 177.75$.

(Method 2) We first find the median of the original data set $x_1, x_2, x_3, x_4, x_5, x_6$. For this, we arrange the data in increasing order:

$y_1 = 32.9$, $y_2 = 34$, $y_3 = 34.8$, $y_4 = 37.5$, $y_5 = 42.1$, $y_6 = 45.5$

The median of the original data set is:

$\tilde{x} = \frac{y_3 + y_4}{2} = \frac{34.8 + 37.5}{2} = 36.15$.

The median of the transformed data is: $m = 5 \times 36.15 - 3 = 177.75$.
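The results of Question 3 can be re-checked numerically. The following is a minimal Python sketch and not part of the original solution. It assumes Python 3.8 or later (for `statistics.geometric_mean`), and it takes the transformation to be X' = 5X - 3, which is the form that reproduces the transformed values 224.5, 184.5, 207.5, 171, 167 and 161.5 listed in the solution. All variable names are illustrative.

```python
import math
from statistics import geometric_mean, median

# Running speeds for the six deer, as given in Question 3.
speeds = [45.5, 37.5, 42.1, 34.8, 34.0, 32.9]

# Part a), Method 1: exponentiate the mean of the natural logs.
log_mean = sum(math.log(x) for x in speeds) / len(speeds)
g_from_logs = math.exp(log_mean)

# Part a), Method 2: sixth root of the product (library helper).
g_direct = geometric_mean(speeds)

# Part b): coefficients assumed from the worked values (X' = 5X - 3).
a, b = 5, -3
transformed = [a * x + b for x in speeds]

# Method 1: median of the transformed data.
median_direct = median(transformed)

# Method 2: transform the median of the original data.
median_via_original = a * median(speeds) + b

print(round(g_from_logs, 2), round(g_direct, 2))
print(round(median_direct, 2), round(median_via_original, 2))
```

Both routes to the median agree because a linear map with positive slope preserves the ordering of the data, so the two middle values stay in the middle.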
4. The amount of potassium in a triple cheeseburger is a random variable with a normal distribution with mean $\mu = 460$ mg and standard deviation $\sigma = 64$ mg. We select at random $n$ cheeseburgers and we denote by $\bar{X}$ the average amount of potassium for this sample. Find the sample size $n$ such that $P(\bar{X} > 470) = 0.1056$.

Solution: Let $X$ be the amount of potassium in one cheeseburger and $\bar{X}$ the average amount of potassium in a sample of size $n$. Because $X$ is normally distributed, $\bar{X}$ is also normally distributed, with mean $\mu = 460$ and standard deviation $64/\sqrt{n}$. By standardization,

$Z = \frac{\bar{X} - 460}{64/\sqrt{n}} \sim N(0, 1)$.

We have to find $n$ such that $P(\bar{X} > 470) = 0.1056$, i.e. $P(\bar{X} < 470) = 0.8944$. By standardization,

$0.8944 = P(\bar{X} < 470) = P\left(Z < \frac{470 - 460}{64/\sqrt{n}}\right) = P\left(Z < \frac{10\sqrt{n}}{64}\right)$.

In Table 18.3, we look for a value $z$ such that $P(Z < z) = 0.8944$. We find $z = 1.25$. Hence,

$\frac{10\sqrt{n}}{64} = 1.25$, and $\sqrt{n} = \frac{(64)(1.25)}{10} = 8$.

Therefore, $n = 8^2 = 64$.
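The sample-size calculation above can be reproduced without the printed table. The sketch below is illustrative only: it replaces the Table 18.3 lookup with the inverse normal CDF from Python's standard library (`statistics.NormalDist`, Python 3.8 or later), so the quantile it finds is slightly more precise than the tabulated 1.25. Variable names are assumptions made for this example.

```python
from statistics import NormalDist

mu, sigma = 460.0, 64.0   # population mean and standard deviation (mg)
threshold = 470.0         # cut-off for the sample mean
tail_prob = 0.1056        # required P(sample mean > threshold)

# Quantile z with P(Z < z) = 1 - tail_prob; the exam reads 1.25 from the table.
z = NormalDist().inv_cdf(1 - tail_prob)

# Solve (threshold - mu) * sqrt(n) / sigma = z for sqrt(n), then square.
sqrt_n = sigma * z / (threshold - mu)
n = round(sqrt_n ** 2)

# Sanity check: tail probability of the sample mean for the rounded n.
check = 1 - NormalDist(mu, sigma / n ** 0.5).cdf(threshold)

print(round(z, 2), n, round(check, 4))
```

Rounding to the nearest integer is a choice made for this sketch; it works here because the target probability corresponds almost exactly to z = 1.25.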