University of Ottawa, Winter 2023
PROB. & STATS. FOR ENGINEERS, MAT 2377-A
Final Exam Solutions
Prof. Radu Cebanu

• This is a 180-minute closed-book exam; no notes are allowed.
• This exam contains 14 pages and 22 questions.
• This is a closed-book exam, but a formula sheet (both sides of two 8.5" x 11" sheets) is allowed.
• For rough work or additional work space, you may use the back pages. You may also ask the proctors for scrap paper. Do not use scrap paper of your own.
• Partial marks can be obtained for questions 19 to 22, in which you must justify your answers completely.
• For questions 1-18, only the final answer to each question or sub-question will be graded; no partial grades are possible. Be sure to write the answer in the space provided below each question.
• Unauthorized electronic devices (such as cellular phones) are not permitted during this exam. Such devices must be turned off completely and stored out of students' reach (not in a pocket). Students found in possession of such a device during the exam will be asked to leave immediately and academic fraud allegations may be filed.

Last name: First name: Student number: Signature:
Grade Table (for professor use only)

Question:   1   2   3   4   5   6
Points:     2   2   2   2   2   2
Score:

Question:   7   8   9  10  11  12
Points:     2   2   2   2   2   2
Score:

Question:  13  14  15  16  17  18
Points:     2   2   2   2   2   2
Score:

Question:  19  20  21  22  Total
Points:     6   4   4   5   55
Score:

1. (2 points) Consider an ordinary deck of 52 playing cards (13 cards in each suit: 2 to 10, jack, queen, king, ace; 2 suits in each colour: diamonds and hearts are red, clubs and spades are black). The deck is shuffled and a card is picked at random. Consider the following events:
   A: the card is red;
   B: the card is a jack, queen, or king of diamonds;
   C: the card is an ace.
What is $P((A \cap B^c) \cup C)$?
A. 23/26 B. 23/52 C. 27/52 D. 25/52 E. 29/52 F. N/A

Solution: The probabilities of the three events are $P(A) = 1/2$, $P(B) = 3/52$ and $P(C) = 1/13$. The event $A \cap B^c$ consists of the card being red but not being the jack, queen or king of diamonds: $P(A \cap B^c) = 23/52$. The event $A \cap B^c \cap C$ consists of the card being a red
ace (which guarantees it is not a jack, queen or king of diamonds): $P(A \cap B^c \cap C) = 1/26$. By the additivity rule,
$$P((A \cap B^c) \cup C) = P(A \cap B^c) + P(C) - P(A \cap B^c \cap C) = \frac{23}{52} + \frac{1}{13} - \frac{1}{26} = \frac{25}{52}.$$
The answer is D.

2. (2 points) In a factory, machines 1, 2, and 3 produce screws of the same length, with 2%, 1%, and 3% defective screws, respectively. Of the total production of screws in the factory, the machines produce 35%, 25%, and 40%, respectively. If a screw is selected at random from the total screws produced in a day and is found to be defective, what is the probability that it was produced by machine 3?
A. 0.0070 B. 0.0025 C. 0.5581 D. 0.0215 E. 0.0120 F. N/A

Solution: If one screw is selected at random, the probability that it is defective is
$$P(D) = P(1)P(D \mid 1) + P(2)P(D \mid 2) + P(3)P(D \mid 3) = (0.35)(0.02) + (0.25)(0.01) + (0.40)(0.03) = 0.0215.$$
If the selected screw is defective, the conditional probability that it was produced by machine 3 is
$$P(3 \mid D) = \frac{P(3)P(D \mid 3)}{P(D)} = \frac{(0.40)(0.03)}{0.0215} = 0.55814.$$
The answer is C.

3. (2 points) There are two tall students and two short students in a classroom. We randomly select two students from this classroom without replacement. Let $X$ be the number of tall students among the two selected students. Let $F(x)$ be the cumulative distribution function of $X$ and $f(x)$ be the probability mass function of $X$. How many of the following statements are incorrect?
• $F(x) > 0$ when $x = 0$;
• $f(x) > 0$ when $x = 0$;
• $F(x) = 0$ when $x = 3$;
• $f(x) = 0$ when $x = 3$;
• $f(x) = F(x)$ when $x < 0$.
A. 0 B. 1 C. 2 D. 3 E. 4 F. 5

Solution: $X$ is a discrete random variable, which takes values in $\{0, 1, 2\}$ with positive probabilities. By definition of the functions $F$ and $f$, we have $F(0) = P(X \le 0) = P(X = 0) = f(0) > 0$, $F(3) = P(X \le 3) = 1$, and $f(3) = P(X = 3) = 0$. Since $X \ge 0$, we have $F(x) = P(X \le x) = 0$ and $f(x) = P(X = x) = 0$ for $x < 0$. So statements A, B, D and E (the first, second, fourth and fifth in the list) are correct. The only incorrect statement is C (the third), since $F(3) = P(X \le 3) = 1$, so the answer is B.

4. (2 points) Suppose that the probability of germination of a beet seed is 0.7. If we plant 12 seeds and can assume that the germination of one seed is independent of another seed, what is the probability that 10 or fewer seeds germinate?
A. 0.0282 B. 0.0059 C. 0.0712 D. 0.0850 E. 0.9150 F. N/A

Solution: If $X$ is the number of seeds that germinate in $n$ trials, with probability of success $p$, then
$$P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}, \quad k = 0, \ldots, n.$$
Thus, with $n = 12$ and $p = 0.7$,
$$P(X \le n - 2) = 1 - P(X = 11) - P(X = 12) = 1 - 12(0.7)^{11}(0.3) - (0.7)^{12} = 0.91497.$$
The answer is E.

5. (2 points) Let $X$ be a random variable following a normal distribution with mean 14 and variance 4. Determine the value $c$ such that $P(X - 2 < c) = 0.95$.
A. 13.29 B. 15.29 C. 17.29 D. 1.64 E. 1.96 F. N/A

Solution: Since $X \sim N(14, 2)$ (mean 14, standard deviation 2), $Z = \frac{X - 14}{2} \sim N(0, 1)$. Note that $P(X - 2 < c) = P(X < c + 2) = 0.95$; we have
$$0.95 = P(X < c + 2) = P\left(\frac{X - 14}{2} < \frac{c + 2 - 14}{2}\right) = P\left(Z < \frac{c - 12}{2}\right) = \Phi\left(\frac{c - 12}{2}\right).$$
This means that $\frac{c - 12}{2} = \Phi^{-1}(0.95) \approx 1.65 \implies c = 15.3$. The answer is B (close enough to 15.29).
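
The answers to Questions 4 and 5 can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper `binom_pmf` and the variable names are illustrative and not part of the exam. It uses the exact normal quantile rather than the table value 1.65, so the result for Question 5 lands on 15.29 rather than 15.3.

```python
from math import comb
from statistics import NormalDist


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Question 4: P(X <= 10) = 1 - P(X = 11) - P(X = 12) with n = 12, p = 0.7.
p_at_most_10 = 1 - binom_pmf(11, 12, 0.7) - binom_pmf(12, 12, 0.7)

# Question 5: P(X < c + 2) = 0.95, so c + 2 is the 0.95-quantile
# of a normal distribution with mean 14 and standard deviation 2.
c = NormalDist(mu=14, sigma=2).inv_cdf(0.95) - 2

print(f"Q4: P(X <= 10) = {p_at_most_10:.5f}")
print(f"Q5: c = {c:.2f}")
```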
6. (2 points) Ten individuals have participated in a diet modification program to stimulate weight loss. Their weight both before and after participation in the program is shown below:
Before: 195, 213, 247, 201, 187, 210, 215, 246, 294, 310
After:  187, 195, 221, 190, 175, 197, 199, 221, 278, 285
Is there evidence to support the claim that this particular diet-modification program is effective in producing mean weight reduction? Assume the weights are normally distributed and the significance level is $\alpha = 0.05$. Define $D_i$ = before − after.
A. We reject $H_0$ (at $\alpha = 0.05$), i.e. that the diet does not reduce weight.
B. We reject $H_0$ (at $\alpha = 0.05$), i.e. that the diet does reduce weight.
C. We fail to reject $H_0$ (at $\alpha = 0.05$), i.e. that the diet does not reduce weight.
D. We fail to reject $H_0$ (at $\alpha = 0.05$), i.e. that the diet does reduce weight.
E. N/A

Solution: The differences $D_i$ are 8, 18, 26, 11, 12, 13, 16, 25, 16, 25. The mean is $\bar{d} = 17$ and the sample variance is $S_d^2 = 370/9 = 41.1111$. Taking the square root, we find $S_d = 6.4118$. The null hypothesis is $H_0: \mu_d = 0$ and the alternative hypothesis is $H_1: \mu_d > 0$. The test statistic is
$$\frac{17 - 0}{6.4118/\sqrt{10}} = 8.384.$$
From the t-table, for 9 degrees of freedom, the P-value is smaller than 0.005; in particular $P < \alpha$, and therefore we reject $H_0$. The correct answer is A, but B is also accepted, since the formulation is ambiguous.

7. (2 points) A new type of electronic flash for cameras lasts on average $\mu = 5000$ hours, with a standard deviation of $\sigma = 500$ hours. A quality control engineer selects a random sample of 100 flashes. What is the probability that the mean lifetime of these 100 flashes will be greater than 4928 hours?
A. 0.0749 B. 0.9251 C. 0.4532 D. 0.7575 E. 0.1587 F. N/A

Solution: Let $\bar{X}$ be the mean lifetime of the 100 flashes. Then, according to the CLT, $\bar{X} \sim N(5000, 500/10)$ approximately, where the second parameter is the standard deviation. Hence
$$P(\bar{X} > 4928) = P\left(\frac{\bar{X} - 5000}{50} > \frac{4928 - 5000}{50}\right) = P\left(\frac{\bar{X} - 5000}{50} > -1.44\right) = 1 - \Phi(-1.44) = 0.9251.$$
The answer is B.
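
As a check on Questions 6 and 7, the paired differences, the t statistic, and the normal probability can be recomputed directly from the data given in the questions. This is a sketch using the Python standard library only; the variable names are illustrative. The standard library has no t distribution, so only the test statistic is computed here, and the P-value still has to be read from a t-table with 9 degrees of freedom.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Question 6: paired differences D_i = before - after.
before = [195, 213, 247, 201, 187, 210, 215, 246, 294, 310]
after = [187, 195, 221, 190, 175, 197, 199, 221, 278, 285]
diffs = [b - a for b, a in zip(before, after)]

d_bar = mean(diffs)   # sample mean of the differences
s_d = stdev(diffs)    # sample standard deviation (divisor n - 1)
t_stat = (d_bar - 0) / (s_d / sqrt(len(diffs)))

# Question 7: by the CLT the sample mean is approximately normal with
# mean 5000 and standard deviation 500 / sqrt(100) = 50.
x_bar_dist = NormalDist(mu=5000, sigma=500 / sqrt(100))
p_greater = 1 - x_bar_dist.cdf(4928)

print(f"Q6: d_bar = {d_bar}, s_d = {s_d:.4f}, t = {t_stat:.3f}")
print(f"Q7: P(X_bar > 4928) = {p_greater:.4f}")
```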
8. (2 points) Two candidates (A and B) are running for an officer position. A poll is conducted: 150 voters are selected randomly and asked for their preference. Among the selected voters, 70 support A and 80 support B. Provide a 99% confidence interval for the true support rate of candidate A in the population.
A. (0.428, 0.638) B. (0.362, 0.572) C. (0.453, 0.613) D. (0.387, 0.547) E. N/A

Solution: When $\alpha = 0.01$, $z_{\alpha/2} = 2.57$. We estimate the true support rate by $\hat{p} = 70/150 = 0.46667$. Since $n\hat{p} > 5$ and $n(1 - \hat{p}) > 5$, we can apply the CLT. Then
$$E = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \cdot z_{\alpha/2} = 0.104685.$$
The 99% C.I. for $p$ is $(\hat{p} - E, \hat{p} + E) \approx (0.36198, 0.57135)$. The answer is B.

9. (2 points) A hockey puck manufacturer claims that its process produces pucks with a mean weight of 163 grams and a standard deviation of 8 grams. A random sample of $n$ pucks is going to be collected. We plan to use the sample mean $\bar{X}$ to estimate the population mean. Determine the minimal sample size $n$ so that $P(|\bar{X} - 163| < 3) = 0.95$. (Assume $n$ is large.)
A. 6 B. 8 C. 24 D. 26 E. 28 F. N/A

Solution: Since $n$ is assumed to be large, the CLT tells us that $\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$ approximately. We want $E = 3$ and $\alpha = 0.05$, therefore $z_{\alpha/2} = 1.96$. This implies that
$$n \ge \frac{8^2}{3^2} \cdot 1.96^2 = 27.318.$$
Because we must round up, the answer is E.

10. (2 points) The tensile strength of manila ropes follows a normal distribution. A random sample of 16 manila ropes has a sample mean strength of 4450 kg and a sample standard deviation of 115 kg. Suppose that we want to test whether the mean strength of manila rope is less than 4500 kg. At a significance level $\alpha = 0.05$, the value of the test statistic and the conclusion for this test are:
A. −1.739; do not reject $H_0$
B. −1.739; reject $H_0$
C. −1.753; do not reject $H_0$
D. −1.753; reject $H_0$
E. 1.739; reject $H_0$
F. N/A

Solution: Since $\sigma$ is unknown and the sample size is small, we should use a t-test. We want to test $H_0: \mu = 4500$ vs $H_1: \mu < 4500$. The test statistic is
$$t = \frac{4450 - 4500}{115/4} = -1.739.$$
There are 15 degrees of freedom and the critical value is $-1.753$, since it is a left-tailed test. The statistic does not fall below the critical value, so the P-value is between 0.05 and 0.1. In particular $P > \alpha$. Thus, we fail to reject $H_0$. The answer is A.

11. (2 points) According to a nationwide survey conducted by Statistics Canada, the mean birth weight in Canada is 3.4 kg. A doctor would like to gain evidence for the hypothesis that urban mothers deliver babies whose birth weights are greater than 3.4 kg. She conducted a statistical test based on 125 Canadian urban newborns with a sample standard deviation of 0.78 kg. Suppose that the p-value of this test is 0.0158. What is the mean weight (in kg) for those 125 Canadian urban newborns?
A. 3.25 B. 3.4 C. 3.15 D. 3.55 E. 3.67 F. N/A

Solution: Since the sample size $n = 125 > 30$, we can use the Z-table and assume $\sigma = s$. The doctor tested $H_0: \mu = 3.4$ versus $H_1: \mu > 3.4$. The observed value of the test statistic is
$$z = \frac{\bar{x} - 3.4}{0.78/\sqrt{125}} = \frac{\bar{x} - 3.4}{0.0698}.$$
Since this is a right-tailed test, p-value $= P(Z > z) = 0.0158$. Hence, $P(Z < z) = 0.9842$. Using the standard normal table, we find $z = 2.15$. Solving $\frac{\bar{x} - 3.4}{0.0698} = 2.15$, we get $\bar{x} = 3.55$. The answer is D.

12. (2 points) The mean weight of a newborn baby in North America is 120 ounces (oz). We want to test the hypothesis that mothers with low socioeconomic status have babies whose weight at birth is lower than 120 oz. Let $\mu$ be the mean weight of a newborn baby whose mother has a low socioeconomic status. Set up a test of hypotheses and explain when a type I error or a type II error occurs by choosing the correct statement from the list below.
A. $H_0: \mu = 120$ versus $H_1: \mu < 120$. A Type II error occurs when we conclude that the mean weight of a newborn baby whose mother has a low socioeconomic status is lower than 120 oz, when in fact it is not true.
B. $H_0: \mu = 120$ versus $H_1: \mu < 120$. A Type I error occurs when we conclude that the mean weight of a newborn baby whose mother has a low socioeconomic status is lower than 120 oz, when in fact it is not true.
C. $H_0: \mu \ge 120$ versus $H_1: \mu < 120$. A Type I error occurs when we conclude that the mean weight of a newborn baby whose mother has a low socioeconomic status is lower than 120 oz, when in fact it is not true.
D. $H_0: \mu \ge 120$ versus $H_1: \mu < 120$. A Type II error occurs when we conclude that the mean weight of a newborn baby whose mother has a low socioeconomic status is 120 oz, but in fact this weight is lower than 120 oz.
E. $H_0: \mu = 120$ versus $H_1: \mu < 120$. A Type I error occurs when we conclude that the mean weight of a newborn baby whose mother has a low socioeconomic status is 120 oz, but in fact this weight is lower than 120 oz.
F. N/A

Solution: We want to test $H_0: \mu = 120$ versus $H_1: \mu < 120$. A Type I error occurs when we reject $H_0$ and $H_0$ is true, i.e. when we conclude that the mean weight of a newborn baby whose mother has a low socioeconomic status is lower than 120 oz, when in fact it is not true. A Type II error occurs when we fail to reject $H_0$ and $H_0$ is false, i.e. we could not conclude that mothers with low socioeconomic status have babies with lower birth weight, when in fact this is true. The answer is B.

13. (2 points) The following circuit operates if and only if there is a path of functional devices from left to right. Assume that the devices fail independently and that the probability of failure of each device is as shown. What is the probability that the circuit does not operate?
[Figure: circuit diagram with device failure probabilities 0.01, 0.02, 0.02, 0.02, 0.02, 0.01]

Solution: Since the 4 rows are connected in parallel, for the system to fail, we need all of the rows to fail. The probability that the 2nd row fails is $0.02 + 0.02 - 0.02 \cdot 0.02 = 0.0396$, because the row fails when the first device or the second one fails. Therefore, the probability that the whole system fails is
$$P = 0.01 \cdot 0.0396 \cdot 0.01 \cdot 0.0396 \approx 0.00000016 = 1.6 \times 10^{-7}.$$

14. (2 points) In a certain manufacturing process, it is known that 1% of products are defective. Assume that products are manufactured one by one, independently. What is the probability that the 3rd product will be the first defective one?

Solution: Let $X$ be the number of products coming out of the process until the first defective item. Then $X \sim \text{Geometric}(p = 0.01)$. Therefore,
$$P(X = 3) = (1 - 0.01)^{3 - 1}(0.01) = 0.0098.$$

15. (2 points) If the probability density function of a random variable $X$ is given by $f(x) = kx^3$, $0 < x < 1$, compute the probability that $X$ will be between 1/4 and 3/4.

Solution:
$$1 = \int_0^1 kx^3 \, dx = \frac{k}{4} \implies k = 4,$$
$$P(1/4 < X < 3/4) = \int_{1/4}^{3/4} 4x^3 \, dx = (3/4)^4 - (1/4)^4 = 5/16 = 0.3125.$$

16. (2 points) A company produces orange juice bottles with a volume of approximately 2 liters each. One machine fills half of each bottle with concentrate, and another machine fills the other half with water. Assume the two machines work independently. The volume (in liters) of concentrate poured by the first machine follows a normal distribution with mean 0.98 and variance 0.0009. The volume of water (in liters) poured by the second
machine follows a normal distribution with mean 1.02 and variance 0.0016. A bottle of orange juice produced by this company is therefore a mixture of water and concentrate. What is the probability that a bottle contains more than 1.98 liters of juice?

Solution: Let $X$ and $Y$ be the volumes of concentrate and water poured by the first and second machines, respectively. The volume of orange juice is $X + Y$. We know that $X \sim N(0.98, 0.03)$ and $Y \sim N(1.02, 0.04)$, where the second parameter is the standard deviation. Since $X$ and $Y$ are independent,
$$X + Y \sim N\left(0.98 + 1.02, \sqrt{0.03^2 + 0.04^2}\right) = N(2, 0.05).$$
Using standardization and the Z table, the probability that a bottle contains more than 1.98 liters of juice is
$$P(X + Y > 1.98) = P\left(Z > \frac{1.98 - 2}{0.05}\right) = P(Z > -0.4) = 1 - P(Z < -0.4) = 1 - 0.3446 = 0.6554.$$

17. (2 points) It takes a Christmas tree about 10 years to grow from seed to a size ready for cutting. We want to estimate the average height $\mu$ of a 4-year Christmas tree which has been grown from a seed. Assume that the height of a 4-year tree is normally distributed. A sample of 20 trees has a mean height of 25.25 cm and a sample standard deviation of 4.5 cm. This sample produces a confidence interval (CI) for $\mu$ of length 2.673. Determine the confidence level of this CI.

Solution: The interval is based on a t distribution since the sample size is small and the population is normal. The length of the interval is twice the maximal error, so $E = 2.673/2 = 1.3365$. We also know that
$$E = \frac{S}{\sqrt{n}} \cdot t_{\alpha/2} = \frac{4.5}{\sqrt{20}} \cdot t_{\alpha/2} = 1.3365 \implies t_{\alpha/2} = 1.328.$$
From the t-table (for two-tailed values), in row 19 (d.f. $= 20 - 1$) we see that 1.328 corresponds to the confidence level 80% (i.e. $\alpha = 0.2$).

18. Among 2046 cars made by Company A in 1999, 56 had a problem in the brake system. Suppose that one wants to know whether the brake system defective rate for this type of car is less than 4%.
(a) (1 point) Formulate the null and alternative hypotheses.
Solution: We would like to test $H_0: p = 0.04$ against $H_1: p < 0.04$.
(b) (1 point) Compute the p-value for this test.
Solution: Since $n\hat{p} > 5$ and $n(1 - \hat{p}) > 5$, the sample size is large enough. The observed value of the test statistic is
$$z = \frac{\hat{p} - 0.04}{\sqrt{(0.04)(0.96)/2046}} = -2.92,$$
where $\hat{p} = 56/2046 = 0.02737$. This is a left-tailed test. Hence, the p-value of the test is given by p-value $= P(Z < -2.92) = 0.0018$.

19. Suppose that the arrivals of small planes at an airport can be modeled by a Poisson random variable with an average of 1 plane per hour.
(a) (1 point) What is the probability that more than 3 planes will arrive in one hour?
Solution: Let $X$ be the number of planes which arrive at the airport in 1 hour. Then $X \sim \text{Poisson}(\lambda = 1)$, and
$$P(X > 3) = 1 - P(X \le 3) = 1 - P(X = 0) - P(X = 1) - P(X = 2) - P(X = 3) = 1 - \left(\frac{1^0}{0!}e^{-1} + \frac{1^1}{1!}e^{-1} + \frac{1^2}{2!}e^{-1} + \frac{1^3}{3!}e^{-1}\right) \approx 0.01899.$$
(b) (1 point) Consider 15 consecutive and disjoint intervals of 1 hour. What is the probability that none of these intervals will see the arrival of more than 3 planes?
Solution: "15 consecutive intervals of 1 hour" can be interpreted either as intervals of 4 minutes each ($15 \cdot 4 = 60$) or as each interval being 1 hour long. Both options are considered correct. The number of intervals which will see the arrival of more than 3 planes is a binomial random variable, $W \sim B(15, p)$, where $p$ is the probability of the arrival of more than 3 planes in an interval. Therefore,
$$P(W = 0) = \binom{15}{0}(1 - p)^{15}p^0 = (1 - p)^{15}.$$
For the interpretation with each interval lasting 1 hour, $p = 0.01899$ and $P(W = 0) = (1 - 0.01899)^{15} = 0.75$.
For the interpretation with each interval lasting 4 minutes, let $X_{1/15}$ be the number of planes arriving in 4 minutes. Then $X_{1/15} \sim \text{Poisson}(\lambda = 1/15 \approx 0.067)$ and
$$P(X_{1/15} > 3) = 1 - \left(\frac{(0.067)^0}{0!}e^{-0.067} + \frac{(0.067)^1}{1!}e^{-0.067} + \frac{(0.067)^2}{2!}e^{-0.067} + \frac{(0.067)^3}{3!}e^{-0.067}\right) \approx 0.0000008.$$
Therefore $P(W = 0) = (1 - 0.0000008)^{15} = 0.999988$.
(c) (1 point) What is the probability that exactly 3 planes will arrive in a 2-hour period?
Solution: Let $X_2$ be the number of planes arriving in an interval of 2 hours.
Then $X_2 \sim \text{Poisson}(\lambda = 2)$ and
$$P(X_2 = 3) = \frac{2^3}{3!}e^{-2} \approx 0.18.$$
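
Parts (a) to (c) of Question 19 all reduce to evaluating the Poisson probability mass function at a few points, so they are easy to verify. The sketch below uses only the Python standard library; `poisson_pmf` and `poisson_cdf` are illustrative helper names, and part (b) is computed under the 1-hour-interval interpretation only.

```python
from math import exp, factorial


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return lam**k / factorial(k) * exp(-lam)


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Poisson(lam)."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))


# (a) More than 3 arrivals in one hour, rate 1 per hour.
p_more_than_3 = 1 - poisson_cdf(3, 1.0)

# (b) None of 15 independent 1-hour intervals sees more than 3 arrivals.
p_none_of_15 = (1 - p_more_than_3) ** 15

# (c) Exactly 3 arrivals in 2 hours, so the rate is 2.
p_exactly_3_in_2h = poisson_pmf(3, 2.0)

print(f"(a) {p_more_than_3:.5f}")
print(f"(b) {p_none_of_15:.3f}")
print(f"(c) {p_exactly_3_in_2h:.3f}")
```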
(d) (1 point) What is the length of the period for which the probability of having no arrival is 0.1?
Solution: Let $L$ be the required period (in hours) and $X_L$ be the number of planes arriving in that period. Then $X_L \sim \text{Poisson}(\lambda = 1 \cdot L)$, so
$$P(X_L = 0) = \frac{L^0}{0!}e^{-L} = e^{-L} = 0.1.$$
Solving for $L$, we have $-L = \ln 0.1 \implies L = -\ln 0.1 \approx 2.30259$ hours.
(e) (1 point) What is the probability that we must wait at least 3 hours to witness the arrival of 3 planes?
Solution: If we must wait at least 3 hours for the arrival of 3 planes, it means that up to the 3-hour mark, fewer than 3 planes will arrive. Let $X_3$ be the number of planes which arrive in a period of 3 hours. Then $X_3 \sim \text{Poisson}(\lambda = 3)$. The required probability is
$$P(X_3 < 3) = P(X_3 = 0) + P(X_3 = 1) + P(X_3 = 2) = \frac{3^0}{0!}e^{-3} + \frac{3^1}{1!}e^{-3} + \frac{3^2}{2!}e^{-3} = 0.42319.$$
(f) (1 point) What are the mean and variance of the waiting time for 3 planes?
Solution: Let $Y_1$ be the time until the next airplane arrives, $Y_2$ the time from the first airplane to the second, and $Y_3$ the time from the second airplane to the third. Then $Y_1, Y_2, Y_3 \sim \text{Exp}(\lambda = 1)$ are independent, with $E(Y_1) = E(Y_2) = E(Y_3) = 1/1 = 1$ and $Var(Y_1) = Var(Y_2) = Var(Y_3) = 1/1^2 = 1$. Therefore, if $Y = Y_1 + Y_2 + Y_3$, then $E(Y) = 1 + 1 + 1 = 3$ and $Var(Y) = 1 + 1 + 1 = 3$.

20. (4 points) The concentration of nicotine was measured in a random sample of 40 cigars. The data are displayed below, from smallest to largest:
 72,  85, 110, 124, 137, 140, 147, 151, 158, 163,
164, 165, 167, 168, 169, 169, 170, 174, 175, 175,
179, 179, 182, 185, 186, 188, 190, 192, 193, 197,
203, 208, 209, 211, 217, 228, 231, 237, 246, 256.
Find the suspected outliers in the data.

Solution: If we remove 72, the sum of the remaining numbers is $\sum x_i = 7028$ and the sum of the squares is $\sum x_i^2 = 1314518$.
We construct a prediction interval with a confidence level of 99% and see whether 72 is inside. With $n = 39$ remaining observations, we have
$$\bar{x} = 7028/39 = 180.205, \qquad S^2 = \frac{39 \cdot 1314518 - (7028)^2}{39 \cdot 38} = 1264.11 \implies S = 35.55.$$
For $\alpha = 0.01$, $z_{\alpha/2} = z_{0.005} = 2.57$ (we can use $z$ instead of $t$ because $n \ge 30$). Then
$$E_0 = s \cdot \sqrt{1 + \frac{1}{n}} \cdot z_{\alpha/2} = 35.55 \cdot \sqrt{40/39} \cdot 2.57 = 92.527.$$
The prediction interval is $(\bar{x} - E_0, \bar{x} + E_0) = (87.678, 272.732)$. Since 72 is not inside, it is a suspected outlier.

21. Twenty-four girls in Grades 9 and 10 are put on a training program. Their time for a 40-yard dash is recorded before and after participating in the training program. The differences between the before-training time and the after-training time for those 24 girls are measured, so that positive difference values represent improvement in the 40-yard dash time. Suppose that the values of those differences follow a normal distribution and that they have a sample mean of 0.079 min and a sample standard deviation of 0.255 min. We conduct a statistical test to check whether this training program can reduce the mean finish time of the 40-yard dash.
(a) (1 point) What paired t test should we use in this situation?
Solution: Let $D$ = Before − After, with $E(D) = \mu_d$. Based on the question, we should conduct a paired t-test for $H_0: \mu_d = 0$ vs $H_1: \mu_d > 0$.
(b) (1 point) What is the observed value of the test statistic?
Solution: The test statistic is
$$\frac{0.079}{0.255/\sqrt{24}} = 1.518.$$
(c) (1 point) Find an interval containing the p-value of the test (use the t-table).
Solution: The p-value for the test is in $(0.05, 0.10)$, from the t-table with 23 d.f.
(d) (1 point) Does the training program improve the 40-yard dash time?
Solution: If we take $\alpha = 0.05$ (the default value if $\alpha$ is not given), the p-value is greater than 0.05, so we would not reject $H_0$: there is not enough evidence that the program improves the time.
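
The prediction-interval check in Question 20 and the test statistic in Question 21(b) can be reproduced from the numbers given in the questions. This is a sketch using the Python standard library only, with illustrative variable names. It keeps the table value $z = 2.57$ used in the solution but does not round $S$ to 35.55, so the interval endpoints can differ from the ones quoted above in the second decimal place.

```python
from math import sqrt
from statistics import mean, stdev

# Question 20: nicotine data, sorted from smallest to largest.
data = [
    72, 85, 110, 124, 137, 140, 147, 151, 158, 163,
    164, 165, 167, 168, 169, 169, 170, 174, 175, 175,
    179, 179, 182, 185, 186, 188, 190, 192, 193, 197,
    203, 208, 209, 211, 217, 228, 231, 237, 246, 256,
]
candidate = data[0]   # the value being tested, 72
rest = data[1:]       # the other 39 observations
n = len(rest)

x_bar = mean(rest)
s = stdev(rest)       # sample standard deviation (divisor n - 1)
z = 2.57              # table value for alpha = 0.01, as in the solution

half_width = s * sqrt(1 + 1 / n) * z
lower, upper = x_bar - half_width, x_bar + half_width
is_suspected_outlier = not (lower <= candidate <= upper)

# Question 21(b): paired t statistic from the summary values.
t_stat_q21 = 0.079 / (0.255 / sqrt(24))

print(f"Q20: interval = ({lower:.2f}, {upper:.2f}), "
      f"outlier: {is_suspected_outlier}")
print(f"Q21: t = {t_stat_q21:.3f}")
```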
22. Consider the following data, consisting of $n = 20$ paired measurements $(x_i, y_i)$ of hydrocarbon levels ($x$) and pure oxygen levels ($y$) in fuels:
x:  0.99   1.02   1.15   1.29   1.46   1.36   0.87   1.23   1.55   1.40
y: 90.01  89.05  91.43  93.74  96.73  94.45  87.59  91.77  99.42  93.65
x:  1.19   1.15   0.98   1.01   1.11   1.20   1.26   1.32   1.43   0.95
y: 93.54  92.52  90.56  89.54  89.85  90.39  93.25  93.41  94.98  87.33
Note that
$$\sum_{i=1}^{20} x_i = 23.92, \quad \sum_{i=1}^{20} y_i = 1843.21, \quad \sum_{i=1}^{20} x_i^2 = 29.29, \quad \sum_{i=1}^{20} x_i y_i = 2214.66, \quad \sum_{i=1}^{20} y_i^2 = 170044.5.$$
(a) (1 point) What is the correlation coefficient between $x$ and $y$?
Solution: We compute $\bar{x} = 23.92/20 = 1.196$ and $\bar{y} = 1843.21/20 = 92.16$. We get
$$S_{xy} = 2214.66 - 20 \cdot 1.196 \cdot 92.16 = 10.18,$$
$$S_{xx} = 29.29 - 20 \cdot (1.196)^2 = 0.682,$$
$$S_{yy} = 170044.5 - 20 \cdot (92.16)^2 = 173.345.$$
Therefore
$$\rho = \frac{S_{xy}}{\sqrt{S_{xx} \cdot S_{yy}}} = 0.937.$$
(b) (1 point) Assume that the simple linear regression model $y = \beta_0 + \beta_1 x + \epsilon$ is valid. What assumptions are we making about the error term $\epsilon$?
Solution: It is assumed that $E(\epsilon) = 0$ and that the error terms are independent.
(c) (1 point) Find the least squares estimators $b_0$ and $b_1$ of the line of best fit.
Solution:
$$b_1 = \frac{S_{xy}}{S_{xx}} = \frac{10.18}{0.682} \approx 14.93, \qquad b_0 = \bar{y} - b_1 \cdot \bar{x} = 92.16 - 14.93 \cdot 1.196 = 74.3.$$
(d) (1 point) Provide an estimate of $\sigma^2$.
Solution:
$$s^2 = \frac{S_{yy} - b_1 \cdot S_{xy}}{n - 2} = \frac{173.345 - 14.93 \cdot 10.18}{18} \approx 1.18.$$
(e) (1 point) What is the coefficient of determination of the line of best fit?
Solution: $R^2 = \rho^2 = (0.937)^2 \approx 0.877$.
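
The quantities in Question 22 follow mechanically from the five sums given in the question, so they can be recomputed in a few lines. The sketch below is plain Python with illustrative variable names. It carries full precision through every step, so the values it prints agree with the hand computation only up to rounding: the displayed intermediate values above are rounded, while the quoted results are consistent with unrounded ones.

```python
from math import sqrt

# Summary statistics given in the question.
n = 20
sum_x, sum_y = 23.92, 1843.21
sum_xx, sum_xy, sum_yy = 29.29, 2214.66, 170044.5

x_bar, y_bar = sum_x / n, sum_y / n

# Corrected sums of squares and cross-products.
s_xy = sum_xy - n * x_bar * y_bar
s_xx = sum_xx - n * x_bar**2
s_yy = sum_yy - n * y_bar**2

r = s_xy / sqrt(s_xx * s_yy)         # (a) correlation coefficient
b1 = s_xy / s_xx                     # (c) slope
b0 = y_bar - b1 * x_bar              # (c) intercept
s2 = (s_yy - b1 * s_xy) / (n - 2)    # (d) estimate of sigma^2
r2 = r**2                            # (e) coefficient of determination

print(f"r = {r:.3f}, b1 = {b1:.2f}, b0 = {b0:.1f}, "
      f"s^2 = {s2:.2f}, R^2 = {r2:.3f}")
```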