2379-sample-final-solution

School

University of Ottawa

Course

2379

Subject

Mathematics

Date

Jan 9, 2024

MAT 2379A Sample Final Exam (with Solutions)

Professor Hai Yan Liu
Time: 3 hours

Student Number:
Seat Number:
Family Name:
First Name:

Cellular phones, smart watches, unauthorized electronic devices, or course notes are not allowed during this exam. Phones and devices must be turned off and put away in your bag. Do not keep them in your possession, such as in your pockets. If caught with such a device or document, the following may occur: academic fraud allegations will be filed which may result in you obtaining a 0 (zero) for the exam. By signing below, you acknowledge that you have ensured that you are complying with the above statement.

Signature:

This is a closed book examination. A formula sheet and some statistical tables will be distributed with your exam. Only Faculty standard calculators are permitted: TI30, TI34, Casio fx-260, Casio fx-300.

The exam consists of 13 multiple choice questions and 6 long answer questions. Each multiple choice question is worth 5 marks and each long answer question is worth 10 marks. The total number of marks is 125.

NOTE: At the end of the examination, hand in the entire booklet. You can keep the formula sheet and the tables.

For professor's use (number of marks):
Total for all MC Questions:
Long Answer Question 1:
Long Answer Question 2:
Long Answer Question 3:
Long Answer Question 4:
Long Answer Question 5:
Long Answer Question 6:
Total:
Part 1: Multiple Choice Questions

Record your answer to the multiple choice questions in the table below:

Question   Answer      Question   Answer
1                      8
2                      9
3                      10
4                      11
5                      12
6                      13
7

1. The Bacillus Calmette-Guérin (BCG) vaccine for tuberculosis (TB) is mandatory for school-age children in many European countries. In Canada, before BCG vaccination, the patient is tested for TB using a tuberculin skin test, called the Mantoux test. People who have been BCG vaccinated will often have a positive Mantoux test result, although they may not have TB. Therefore, the Mantoux test is not a very efficient tool for detecting TB. In a recent study, 12% of the subjects had a positive Mantoux test result. Among those with a positive test result, only 10% had TB. On the other hand, 1% of the patients with a negative test result also had TB. What was the percentage of patients with TB in this study?

A) 1.10%
B) 2.08%
C) 0.88%
D) 1.20%
E) 13.03%

Solution (Sections 3.2-3.3) We denote by TB the event that a randomly selected person in this group has tuberculosis. By the total probability rule,

$$P(\text{TB}) = P(\text{TB} \mid \text{Test}+)\,P(\text{Test}+) + P(\text{TB} \mid \text{Test}-)\,P(\text{Test}-) = (0.10)(0.12) + (0.01)(0.88) = 0.0208.$$

The answer is B.

2. The intraocular pressure is the fluid pressure inside the eye. Glaucoma is an eye disease that is manifested by high intraocular pressure. The distribution of intraocular pressure in the general population is approximately normal with mean 16 mm Hg and standard deviation 3 mm Hg. The normal range for intraocular pressure is considered to be between 12 mm Hg and 20 mm Hg (including these values). Which one of the following commands in R gives the probability that a randomly chosen person has normal intraocular pressure? (Only one answer is correct.)

A) qnorm(20,16,3)-qnorm(12,16,3)
B) pnorm(20,3,16)-pnorm(12,3,16)
C) pnorm(20,16,3)-pnorm(12,16,3)
D) pnorm(20,16,3)-pnorm(11,16,3)
E) pnorm(20,16,9)-pnorm(12,16,9)
Solution (Section 5.2) We wish to calculate $P(12 \le X \le 20)$, where $X$ has a normal distribution with mean $\mu = 16$ and standard deviation $\sigma = 3$. This probability is:

$$P(12 \le X \le 20) = P(X \le 20) - P(X < 12) = P(X \le 20) - P(X \le 12) = \texttt{pnorm(20,16,3)} - \texttt{pnorm(12,16,3)}.$$

We used the fact that $P(X < 12) = P(X \le 12)$, since $X$ is a continuous random variable. The answer is C. (The incorrect answer D is obtained using $P(X < 12) = P(X \le 11)$, which would be true if $X$ was a discrete random variable.)

3. Aboriginal people in Canada have a higher risk of developing many chronic diseases compared with the rest of the population. In a particular Aboriginal community, 16% of the population has tuberculosis, 20% have diabetes and 8% have both diseases. What is the probability that a randomly selected individual in this community does not have either one of the two diseases?

A) 0.72
B) 0.28
C) 0.64
D) 0.85
E) 0.90

Solution (Section 2.2) Let $A$ be the event that the person has tuberculosis and $B$ the event that the person has diabetes. We know that $P(A) = 0.16$, $P(B) = 0.20$ and $P(A \cap B) = 0.08$. By the addition rule,

$$P(A \cup B) = P(A) + P(B) - P(A \cap B) = 0.16 + 0.20 - 0.08 = 0.28.$$

The probability that the person does not have either one of the two diseases is:

$$P(\overline{A} \cap \overline{B}) = 1 - P(A \cup B) = 1 - 0.28 = 0.72.$$

The answer is A.

4. In biochemistry and pharmacology, a receptor is a protein molecule usually found embedded within the plasma membrane surface of a cell that receives chemical signals from outside the cell. A sample of 109 cells was found to contain an average of 1203 fmol receptors per milligram of membrane protein, with standard deviation 192 fmol. (An fmol is equal to $10^{-15}$ moles.) Using this data, give a 95% confidence interval for the average amount (in fmols) of receptors per milligram found in the membrane protein of these cells.

A) [1077.31; 1329.72]
B) [1153.83; 1252.21]
C) [0; 1322.82]
D) [1166.96; 1239.05]
E) [1098.13; 1308.95]

Solution (Section 8.1) We denote by $\mu$ the average amount of receptors per milligram of membrane protein. This is a large sample. The 95% confidence interval for $\mu$ is:

$$1203 \pm 1.96\,\frac{192}{\sqrt{109}} = 1203 \pm 36.04 = [1166.96;\ 1239.05].$$

The answer is D.
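The normal probability in Question 2 and the large-sample interval in Question 4 can be checked numerically. The following is a minimal sketch in Python rather than R, assuming only the standard library; `NormalDist(...).cdf` stands in for R's `pnorm`, and all variable names are illustrative rather than part of the exam.

```python
from math import sqrt
from statistics import NormalDist

# Question 2: P(12 <= X <= 20) for X ~ Normal(mean 16, standard deviation 3).
# NormalDist(mu, sigma).cdf(x) plays the role of R's pnorm(x, mu, sigma).
pressure = NormalDist(mu=16, sigma=3)
prob_normal_range = pressure.cdf(20) - pressure.cdf(12)

# Question 4: large-sample 95% confidence interval, x_bar +/- z * s / sqrt(n).
n, x_bar, s = 109, 1203, 192
z = NormalDist().inv_cdf(0.975)  # upper 2.5% point of N(0, 1), about 1.96
margin = z * s / sqrt(n)
ci_lower, ci_upper = x_bar - margin, x_bar + margin

print(f"P(12 <= X <= 20) = {prob_normal_range:.4f}")
print(f"95% CI for mu: [{ci_lower:.2f}; {ci_upper:.2f}]")
```

The interval agrees with option D up to rounding of the margin of error. Swapping the mean and standard deviation arguments, as in option B, would describe a different distribution altogether, which is why the argument order matters.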
5. The following data gives the birth weights (in ounces) for 6 consecutive deliveries at the Civic Hospital. Assuming that the birth weights follow a normal distribution, find a 90% confidence interval for the average birth weight $\mu$.

97 117 140 78 99 148

A) [91.0; 135.4]
B) [84.8; 141.5]
C) [91.6; 134.8]
D) [95.0; 131.3]
E) [92.3; 133.6]

Solution (Section 8.2) The sample mean and sample standard deviation for this sample are:

$$\bar{x} = \frac{1}{6}\sum_{i=1}^{6} x_i = 113.1667, \qquad s = \sqrt{\frac{1}{5}\sum_{i=1}^{6} (x_i - \bar{x})^2} = 27.00679.$$

This is a small sample. A 90% confidence interval for $\mu$ is based on the T distribution with $6 - 1 = 5$ degrees of freedom. For this level of confidence, the probability to the right of the point $t$ is 0.05. Table 18.4 gives the value $t = 2.015$. Therefore, the 90% confidence interval for $\mu$ is

$$113.1667 \pm 2.015\,\frac{27.00679}{\sqrt{6}} = [90.950;\ 135.383].$$

The answer is A. The incorrect answers B, C and D are obtained using the wrong values $t = 2.571$, $t = 1.96$ and $t = 1.645$, respectively.

6. The Younger Dryas Cold Event (or the "Big Freeze") was an abrupt cooling event of the Northern Hemisphere which occurred approximately 12,000 years ago, and might have resulted from a slowing of the Atlantic meridional overturning circulation (AMOC). The most common means of slowing the AMOC involves the reduction of oceanic surface water density via an increase in freshwater discharge to the North Atlantic. To predict if such an event might happen again, the density of the ocean water near surface is closely monitored. We collected 79 measurements of the density of the Atlantic ocean water near surface (in kg/m$^3$), at a latitude of 45 degrees north. For this data, the mean is 1026, the median is 1006, the first quartile is 948.1, the third quartile is 1122, and the standard deviation is 109.61. The picture below gives the QQ-plot for this data, together with the line of best fit, produced using R:

[Figure: normal QQ-plot of the density data with the line of best fit; not reproduced here]
Which one of the following statements is correct? (Only one statement is correct.)

A) The fitted line for the QQ plot is $y = 1006 + 109.61z$
B) The fitted line for the QQ plot is $y = 109.61 + 1006z$
C) The fitted line for the QQ plot is $y = 1026 + 109.61z$
D) The fitted line for the QQ plot is $y = 109.61 + 1122z$
E) The distribution of the water density does not appear to be normally distributed, so we cannot find a fitted line for the normal QQ plot.

Solution (Section 7.3) There is a clear linear tendency in the plot, so the data appears to be normally distributed. The line of best fit has equation $y = \hat{\mu} + \hat{\sigma} z$, where $\hat{\mu} = \bar{x} = 1026$ and $\hat{\sigma} = s = 109.61$. The answer is C.

7. The following data gives the number of deadly bear attacks in North America per decade, for the 9 decades between 1900 and 1989:

2, 1, 4, 8, 6, 9, 9, 19, 20.

Calculate the mean and standard deviation for the number of deadly bear attacks in North America per decade.

A) The mean is 8.667 and the standard deviation is 5.6505.
B) The mean is 8.0 and the standard deviation is 19.0.
C) The mean is 8.0 and the standard deviation is 5.0.
D) The mean is 8.667 and the standard deviation is 46.0.
E) The mean is 8.667 and the standard deviation is 6.7823.

Solution (Section 7.1) The mean is

$$\bar{x} = \frac{1}{9}\sum_{i=1}^{9} x_i = \frac{78}{9} = 8.6667$$

and the standard deviation is:

$$s = \sqrt{\frac{\sum_{i=1}^{9} x_i^2 - \left(\sum_{i=1}^{9} x_i\right)^2 / 9}{8}} = \sqrt{\frac{1044 - (78)^2/9}{9 - 1}} = \sqrt{46} = 6.7823.$$

The answer is E.

8. 20% of the trees in a certain forest are maple trees. In this forest, 15% of the maple trees are mature trees, with age between 10 and 15 years. We randomly select a tree in this forest. What is the probability that this is a maple tree with age between 10 and 15 years?

A) 0.03
B) 0.15
C) 0.20
D) 0.75
E) 0.175

Solution (Section 3.3) We denote by $A$ the event that the tree is a maple tree and $B$ the event that the tree has an age between 10 and 15 years. We know that $P(A) = 0.2$ and $P(B \mid A) = 0.15$. By the multiplication rule,

$$P(A \cap B) = P(A)\,P(B \mid A) = (0.2)(0.15) = 0.03.$$

The answer is A.

9. The boxplots below show the effects of different sugars on the growth of pea sections grown in tissue culture, measured in ocular units. (An ocular unit is 0.114 cm.) In experiment A, 2% of glucose was added to the culture. In experiment B, 2% of sucrose was added to the culture. In experiment C, 1% of glucose and 2% of fructose was added to the culture. Finally, in experiment D, 1% of fructose was added to the culture.

[Figure: side-by-side boxplots for experiments A, B, C and D; horizontal axis: Experiment; vertical axis: Growth in ocular units, with ticks from 56 to 66]

Which one of the following statements is correct? (Only one statement is correct.)

A) The median growth in experiments C and D is the same.
B) The data in experiments A and C have the same inter-quartile range.
C) There are outliers in the data of experiments A, C and D, but not in experiment B.
D) The distribution of the data in experiment B is approximately symmetric.
E) Experiment B has produced the smallest growth.

Solution (Section 7.1) The answer is A.

10. One of the objectives of a study is to describe the distribution of the body mass index (BMI) for women whose age is between 20 and 29 years. Suppose that women in this age group have an average BMI of 26.8 with a standard deviation of 7.42. Consider a random sample of 50 women in this age group. Give an approximation for the probability that the average BMI for these 50 women is greater than 29.

A) 0.0179
B) 0.9821
C) 0.6179
D) 0.3821
E) 0.0375

Solution (Section 7.2) Let $\bar{X}$ be the mean of this sample. By the central limit theorem, we know that the random variable

$$\frac{\bar{X} - 26.8}{7.42/\sqrt{50}}$$

has approximately a standard normal distribution. Hence,

$$P(\bar{X} > 29) = P\left(\frac{\bar{X} - 26.8}{7.42/\sqrt{50}} > \frac{29 - 26.8}{7.42/\sqrt{50}}\right) \approx P(Z > 2.10) = 1 - P(Z < 2.10) = 1 - 0.9821 = 0.0179.$$
The answer is A.

11. A pharmaceutical company is testing a new analgesic (medication for pain relief) on a sample of 6 patients suffering from migraine. Among these, 4 patients reported that their migraines disappeared after using the drug. However, it is known that 20% of migraines disappear anyway without any treatment. What is the probability that in a sample of 6 patients suffering from migraine, the migraines will disappear without any treatment for exactly 4 of them?

A) 0.0016
B) 0.2534
C) 0.3523
D) 0.0154
E) 0.9992

Solution (Section 4.2) Let $X$ be the number of patients for whom the migraine will disappear without any treatment, in a sample of 6 patients. Then $X$ has a binomial distribution with $n = 6$ trials and probability $p = 0.2$ of success. The desired probability is

$$P(X = 4) = \binom{6}{4} (0.2)^4 (0.8)^2 = 0.01536.$$

The answer is D.

12. The plant-water relation plays an important role in plant physiology. We consider an experiment in which 16 seedlings of birch tree were flooded with water for one day and 13 other seedlings were kept as controls. At the end of the experiment, the roots of all plants were analyzed for the level of adenosine triphosphate (ATP), as a measure for the intracellular energy transfer. Below is the summary of the data:

                            flooded plants        control plants
sample size                 $n_1 = 16$            $n_2 = 13$
sample mean                 $\bar{x}_1 = 1.17$    $\bar{x}_2 = 1.91$
sample standard deviation   $s_1 = 0.16$          $s_2 = 0.23$

Give a 90% confidence interval for the difference $\mu_1 - \mu_2$, where $\mu_1$ is the average ATP level for the flooded plants and $\mu_2$ is the average ATP level for the controls. Based on this interval, can we conclude that flooding causes a decrease or an increase in the ATP level? (Assume that the ATP levels for flooded plants and controls are normally distributed with equal variances.)

A) [0.5673; 0.7614]; flooding causes an increase in the mean ATP level
B) [0.4532; 0.6719]; flooding causes an increase in the mean ATP level
C) [-0.6182; -0.4820]; flooding causes a decrease in the mean ATP level
D) [-0.8635; -0.6165]; flooding causes a decrease in the mean ATP level
E) [-0.0346; 0.3471]; we cannot conclude that flooding causes a decrease or an increase in the mean ATP level

Solution (Section 10.3) This is a small sample problem, for normal populations with equal variances. The pooled sample variance is:

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(15)(0.16)^2 + (12)(0.23)^2}{16 + 13 - 2} = 0.03773.$$

The 90% confidence interval for $\mu_1 - \mu_2$ is

$$\bar{x}_1 - \bar{x}_2 \pm t\sqrt{s_p^2\,(1/n_1 + 1/n_2)}.$$

The value $t$ is found in Table 18.4 such that $P(-t \le T \le t) = 0.90$, where $T$ has a T distribution with 27 degrees of freedom. This means that $P(T \le t) = 0.95$. In Table 18.4 (row 27, column 0.95) we find the value $t = 1.703$. The 90% confidence interval for $\mu_1 - \mu_2$ is:

$$1.17 - 1.91 \pm (1.703)\sqrt{(0.03773)(1/16 + 1/13)} = -0.74 \pm 0.1235 = [-0.8635;\ -0.6165].$$

Since the interval contains only negative values, we infer that $\mu_1 < \mu_2$. We conclude that flooding causes a decrease in the ATP level. The answer is D.

13. The systolic blood pressure level in a certain population is approximately equal to the value 125 mm Hg. A topic of recent clinical interest is the fact that extensive use of oral contraceptive (OC) may cause a reduction in the systolic blood pressure under the value 125. A study is organized to test this hypothesis. The $n$ women who participated in this study used OC for a period of 3 months. At the end of the study, their systolic blood pressure was measured. This data has a sample mean 120.4 and sample standard deviation 13.23. What was the number $n$ of participants in this study?

A) 12
B) 40
C) 10
D) 32
E) 25

Solution (Section 9.2) This is a left-tailed small sample test. We would like to test $H_0: \mu = 125$ against $H_1: \mu < 125$. The observed value of the test statistic is

$$t_0 = \frac{\bar{x} - 125}{s/\sqrt{n}} = \frac{120.4 - 125}{13.23/\sqrt{n}}.$$

From the R output we know that $t_0 = -1.0998$. We infer that

$$\frac{120.4 - 125}{13.23/\sqrt{n}} = -1.0998.$$

Therefore

$$n = \left(\frac{-1.0998 \times 13.23}{120.4 - 125}\right)^2 = 10.005.$$

We conclude that the sample size was $n = 10$. The answer is C.

Long answer questions are included on the following pages.

Part 2: Long Answer Questions
Record your answer to the long answer questions in the space provided below, specifying clearly your notation and including a proper justification. Show the details of your calculations.

1. The average length of human gestation is approximately 40.5 weeks. It is thought that maternal diabetes may influence the length of the gestation. In a study consisting of 20 diabetic pregnant women, it was found that the mean gestation period was 38.8 weeks with a standard deviation of 5 weeks. We would like to gain evidence that the length of gestation in diabetic women is significantly different from the value of 40.5 weeks, using a test of hypotheses.

a) (2 marks) Set up the test hypotheses to gain evidence for this claim.
b) (4 marks) Calculate the observed value of the test statistic.
c) (4 marks) Report the range of the p-value.
d) (2 marks) Give the conclusion of the test at level $\alpha = 0.05$.

Solution (Section 9.2) This is a two-sided small sample test.

a) We denote by $\mu$ the mean length of gestation for diabetic women. We would like to test $H_0: \mu = 40.5$ against $H_1: \mu \neq 40.5$.

b) We know that $n = 20$, $\bar{x} = 38.8$ and $s = 5$. The observed value of the test statistic is:

$$t_0 = \frac{\bar{x} - 40.5}{s/\sqrt{n}} = \frac{38.8 - 40.5}{5/\sqrt{20}} = -1.52.$$

c) The p-value of the test is:

$$p\text{-value} = 2P(T_{19} > 1.52).$$

From Table 18.4 (row 19) we see that 1.52 is between the values 1.328 and 1.729, whose corresponding probabilities to the right are 0.10 and 0.05. Hence $P(T_{19} > 1.52)$ is between 0.05 and 0.10, and

$$0.10 < p\text{-value} < 0.20.$$

d) Since the p-value is greater than 0.05, we fail to reject $H_0$. There is not enough evidence that the mean length of gestation for diabetic women is significantly different from 40.5 weeks.

2. A study was conducted to estimate the sensitivity and specificity of a new procedure for detecting the presence of a kidney disease among patients suffering from hypertension. Among the 54 hypertensive patients who had the kidney disease, the procedure identified the disease for 45 subjects. Among the 83 hypertensive patients who did not have the kidney disease, the procedure identified the disease for 24 subjects. Consider a patient chosen from a certain hypertensive population in which the prevalence of this kidney disease is 8%. Assume that the sensitivity and specificity of the procedure remain the same as in the study mentioned above.

a) (5 marks) What is the probability of obtaining a positive test result?
b) (5 marks) If the new procedure identifies the presence of the kidney disease for this patient, what is the probability that the patient truly has the disease?
Solution (Sections 3.2 and 3.4) Let $T+$ be the event that the new procedure identifies the presence of the disease and $D$ the event that the patient has the disease, with complement $\overline{D}$. The fact that the prevalence of the disease is 8% means that $P(D) = 0.08$. The fact that the sensitivity and specificity of the procedure remain the same as in the study means that:

$$P(T+ \mid D) = 45/54 \quad \text{and} \quad P(T+ \mid \overline{D}) = 24/83.$$

a) By the total probability rule,

$$P(T+) = P(T+ \mid D)\,P(D) + P(T+ \mid \overline{D})\,P(\overline{D}) = (45/54)(0.08) + (24/83)(1 - 0.08) = 0.3327.$$

b) By Bayes' rule,

$$P(D \mid T+) = \frac{P(D \cap T+)}{P(T+)} = \frac{P(T+ \mid D)\,P(D)}{P(T+)} = \frac{(45/54)(0.08)}{0.3327} = 0.2004.$$

3. Ebola virus disease (EVD), formerly known as Ebola haemorrhagic fever, is a severe, often fatal illness in humans. It is thought that fruit bats of the Pteropodidae family are natural Ebola virus hosts. The virus is introduced into the human population through close contact with bodily fluids of infected animals. The incubation period (the time interval from infection with the virus to onset of symptoms) is between 2 and 21 days. The following data gives the incubation period (in days) for 16 patients infected with the Ebola virus:

4 5 6 6 7 8 9 9 11 12 13 15 15 17 20 21

a) (5 marks) Calculate the median ($\tilde{x}$), first quartile ($q_1$) and third quartile ($q_3$) for this data set.
b) (5 marks) Give the values of the outliers (if they exist).

Solution (Section 7.1)

a) Note that the data is already arranged in increasing order. Hence

$y_1 = 4$, $y_2 = 5$, $y_3 = 6$, $y_4 = 6$, $y_5 = 7$, $y_6 = 8$, $y_7 = 9$, $y_8 = 9$,
$y_9 = 11$, $y_{10} = 12$, $y_{11} = 13$, $y_{12} = 15$, $y_{13} = 15$, $y_{14} = 17$, $y_{15} = 20$, $y_{16} = 21$.

For this dataset, $n = 16$ is even. Hence, the median is:

$$\tilde{x} = \frac{y_8 + y_9}{2} = \frac{9 + 11}{2} = 10.$$

To compute the first quartile, we note that $(n + 1)/4 = 17/4 = 4.25$, which is between 4 and 5 (closer to 4). The first quartile is:

$$q_1 = (0.75)y_4 + (0.25)y_5 = (0.75)(6) + (0.25)(7) = 6.25.$$

To compute the third quartile, we note that $3(n + 1)/4 = 51/4 = 12.75$, which is between 12 and 13 (closer to 13). The third quartile is:

$$q_3 = (0.25)y_{12} + (0.75)y_{13} = (0.25)(15) + (0.75)(15) = 15.$$

b) To find the outliers, we need to find the location of the two fences. The inter-quartile range is $\text{IQR} = q_3 - q_1 = 15 - 6.25 = 8.75$. Hence

$$\text{Fence1} = q_1 - (1.5)\,\text{IQR} = 6.25 - (1.5)(8.75) = 6.25 - 13.125 = -6.875,$$
$$\text{Fence2} = q_3 + (1.5)\,\text{IQR} = 15 + (1.5)(8.75) = 15 + 13.125 = 28.125.$$

Since there are no data points outside the two fences, we conclude that there are no outliers.

4. In the United States, the blood types have the following distribution: 41% O, 31% A, 22% B and 6% AB. It is known that O is a universal donor, A can donate only to A and AB, B can donate only to B and AB, and AB can donate only to AB. If a patient who needs a blood transfusion receives blood from a randomly selected donor, and the two persons are independent of each other, what is the probability that the transfusion is successful?

Solution (Section 3.5) Let $A_1, A_2, A_3, A_4$ be the events that the donor's blood type is O, A, B and AB, respectively. Let $B_1, B_2, B_3, B_4$ be the events that the blood type of the receiving individual is O, A, B and AB, respectively. The event $A_i$ is independent of $B_j$, for any $i = 1, 2, 3, 4$ and $j = 1, 2, 3, 4$. The event that the transfusion is successful can be written as the following union of disjoint events:

$$C = (A_1 \cap B_1) \cup (A_1 \cap B_2) \cup (A_1 \cap B_3) \cup (A_1 \cap B_4) \cup (A_2 \cap B_2) \cup (A_2 \cap B_4) \cup (A_3 \cap B_3) \cup (A_3 \cap B_4) \cup (A_4 \cap B_4).$$

Hence,

$$\begin{aligned}
P(C) &= P(A_1)P(B_1) + P(A_1)P(B_2) + P(A_1)P(B_3) + P(A_1)P(B_4) + P(A_2)P(B_2) \\
&\quad + P(A_2)P(B_4) + P(A_3)P(B_3) + P(A_3)P(B_4) + P(A_4)P(B_4) \\
&= (0.41)(0.41) + (0.41)(0.31) + (0.41)(0.22) + (0.41)(0.06) + (0.31)(0.31) \\
&\quad + (0.31)(0.06) + (0.22)(0.22) + (0.22)(0.06) + (0.06)(0.06) \\
&= 0.5899.
\end{aligned}$$

5. Approximately 4% of men with age between 40 and 55 years will have a heart attack in a 5-year period. A new drug was developed to reduce the probability of having a heart attack for men in this age group. A 5-year study was conducted involving men in this age group who have been treated with the new drug. Among the 2046 participants in the study, 56 had a heart attack within the 5-year period. Let $p$ be the proportion of men in the age group 40-55 using this drug who will have a heart attack.

a) (5 marks) Give a 95% confidence interval (c.i.) for $p$. Using this interval, can we conclude that the new drug is effective in reducing the risk of having a heart attack for men in this age group?
b) (5 marks) Formulate a null hypothesis $H_0$ and an alternative hypothesis $H_1$ which could be used for testing that the new drug is effective in reducing the risk of having a heart attack for men in this age group. Calculate the p-value of this test and report the conclusion at level $\alpha = 0.05$.

Solution

a) (Section 8.3) An estimate for $p$ is $\hat{p} = 56/2046 = 0.02737$. The 95% confidence interval for $p$ is:

$$0.02737 \pm 1.96\sqrt{\frac{(0.02737)(1 - 0.02737)}{2046}} = [0.020;\ 0.034].$$

Because all the values in the interval are smaller than 0.04, we are confident that $p$ is smaller than 0.04. We conclude that the new drug is effective in reducing the risk of a heart attack.

b) (Section 9.3) We would like to test $H_0: p = 0.04$ against $H_1: p < 0.04$. The observed value of the test statistic is:

$$z_0 = \frac{\hat{p} - 0.04}{\sqrt{(0.04)(0.96)/2046}} = -2.92,$$

where $\hat{p} = 56/2046 = 0.02737$. This is a left-tailed test. Using Table 18.2, we see that the p-value of the test is given by

$$p\text{-value} = P(Z < -2.92) = 0.0018.$$

Since the p-value is smaller than $\alpha = 0.05$, we reject $H_0$ and conclude that the new drug is effective in reducing the risk of a heart attack.

6. A study is conducted to investigate the relationship between the number $X$ of hours of exercise per week and the systolic blood pressure $Y$ for men of age 50. The following data was obtained on 10 individuals:

Individual   Hours $x_i$   Systolic blood pressure $y_i$   $x_i^2$   $y_i^2$   $x_i y_i$
1            4             120                             16        14400     480
2            10            110                             100       12100     1100
3            2             120                             4         14400     240
4            3             135                             9         18225     405
5            3             140                             9         19600     420
6            5             115                             25        13225     575
7            1             150                             1         22500     150
8            2             165                             4         27225     330
9            2             160                             4         25600     320
10           0             180                             0         32400     0
Total        32            1395                            172       199675    4020

a) (4 marks) Calculate the sample covariance and sample correlation between the number of hours of exercise and the systolic blood pressure.
b) (4 marks) Give the equation of the estimated regression line for this data.
c) (2 marks) Give a prediction for the systolic blood pressure of an individual who exercises 6 hours per week.

Solution

a) (Section 13.1) For this data, we have:

$$\sum_{i=1}^{10} (x_i - \bar{x})^2 = \sum_{i=1}^{10} x_i^2 - \frac{1}{10}\left(\sum_{i=1}^{10} x_i\right)^2 = 69.6,$$

$$\sum_{i=1}^{10} (y_i - \bar{y})^2 = \sum_{i=1}^{10} y_i^2 - \frac{1}{10}\left(\sum_{i=1}^{10} y_i\right)^2 = 5072.5,$$

$$\sum_{i=1}^{10} (x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{10} x_i y_i - \frac{1}{10}\left(\sum_{i=1}^{10} x_i\right)\left(\sum_{i=1}^{10} y_i\right) = -444.$$

The sample covariance is

$$\widehat{\text{cov}}_{xy} = \frac{1}{9}\sum_{i=1}^{10} (x_i - \bar{x})(y_i - \bar{y}) = \frac{1}{9} \times (-444) = -49.33.$$

We have:

$$s_x = \sqrt{\frac{69.6}{9}} = 2.78 \quad \text{and} \quad s_y = \sqrt{\frac{5072.5}{9}} = 23.74.$$

The sample correlation is:

$$r_{xy} = \frac{\widehat{\text{cov}}_{xy}}{s_x s_y} = \frac{-49.33}{(2.78)(23.74)} = -0.747.$$

b) (Section 13.2) For calculating $\hat{\beta}$ and $\hat{\alpha}$, we use the formulas:

$$\hat{\beta} = \frac{\sum_{i=1}^{10} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{10} (x_i - \bar{x})^2} = \frac{-444}{69.6} = -6.38, \qquad \hat{\alpha} = \bar{y} - \hat{\beta}\bar{x} = 139.5 - (-6.38)(3.2) = 159.91.$$

The estimated regression line is $\hat{y} = \hat{\alpha} + \hat{\beta}x$, which in our case becomes:

$$\hat{y} = 159.91 - (6.38)x.$$

c) (Section 13.2) A prediction for the systolic blood pressure of an individual who exercises 6 hours per week is

$$159.91 - (6.38)(6) = 121.6.$$
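The covariance, correlation and fitted line in Long Answer Question 6 can be reproduced directly from the ten data pairs. The following is a minimal sketch in Python, assuming only the standard library; the variable names are illustrative and the data are copied from the table in the question.

```python
from math import sqrt

# Hours of exercise per week (x) and systolic blood pressure (y) for the
# 10 individuals in Long Answer Question 6.
hours = [4, 10, 2, 3, 3, 5, 1, 2, 2, 0]
pressure = [120, 110, 120, 135, 140, 115, 150, 165, 160, 180]
n = len(hours)

x_bar = sum(hours) / n
y_bar = sum(pressure) / n

# Centered sums of squares and cross-products.
s_xx = sum((x - x_bar) ** 2 for x in hours)
s_yy = sum((y - y_bar) ** 2 for y in pressure)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, pressure))

# Part a): sample covariance and sample correlation.
cov_xy = s_xy / (n - 1)
r_xy = s_xy / sqrt(s_xx * s_yy)

# Part b): least-squares slope and intercept.
beta_hat = s_xy / s_xx
alpha_hat = y_bar - beta_hat * x_bar

# Part c): predicted systolic blood pressure at 6 hours of exercise per week.
prediction = alpha_hat + beta_hat * 6

print(f"cov = {cov_xy:.2f}, r = {r_xy:.3f}")
print(f"y_hat = {alpha_hat:.2f} + ({beta_hat:.2f}) x")
print(f"prediction at x = 6: {prediction:.1f}")
```

Because the sketch carries full precision through every step, its intermediate values can differ from the hand calculation in the last displayed digit, where the solution rounds the slope to -6.38 before computing the intercept.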