Final exam Solutions

School

Binghamton University

Course

147

Subject

Mathematics

Date

Feb 20, 2024

Pages

12

Uploaded by MegaMagpiePerson102

Math 147 B - Final Review Problems - Fall 2023

Instructions: This is not a sample exam! After you have reviewed your lecture notes and the examples therein, and redone all the exercises from the discussions and homework problem sets, you may spend some time on these problems.

1. After keeping track of his heating expenses for several winters, a homeowner believes he can estimate the monthly cost from the average daily Fahrenheit temperature by using the model \( \widehat{\text{Cost}} = 133 - 2.13\,\text{Temp} \). Here is the residuals plot for his data:

(a) Interpret the slope of the line in this context.
(b) Interpret the y-intercept of the line in this context.
(c) During months when the temperature stays around freezing, would you expect cost predictions based on this model to be accurate, too low, or too high? Explain.
(d) What heating cost does the model predict for a month that averages 10°F?
(e) During one of the months on which the model was based, the temperature did average 10°F. What were the actual heating costs for that month?
(f) Should the homeowner use this model? Explain.
(g) Would this model be more successful if the temperature were expressed in degrees Celsius? Explain.

Solution:
(a) The model predicts a decrease of $2.13 in monthly heating cost for each increase of 1°F in average daily temperature. Generally, warmer months are associated with lower heating costs.
(b) When the temperature is 0°F, the model predicts a monthly heating cost of $133.
(c) When the temperature is around 32°F, the predictions are generally too high. The residuals are negative, indicating that the actual values are lower than the predicted values.
(d) \( \widehat{\text{Cost}} = 133 - 2.13\,\text{Temp} = 133 - 2.13(10) = 111.70 \). According to the model, the heating cost in a month with an average daily temperature of 10°F is expected to be $111.70.
(e) The residual for a 10°F month is approximately -$6, meaning that the actual cost was $6 less than predicted: $111.70 - $6 = $105.70.
(f) The model is not appropriate. The residuals plot shows a definite curved pattern, so the association between monthly heating cost and average daily temperature is not linear.
(g) A change of scale from Fahrenheit to Celsius would not affect the relationship. Associations between quantitative variables are the same, no matter what the units.

2. Highway planners investigated the relationship between traffic Density (number of automobiles per mile) and the average Speed of the traffic on a moderately large city thoroughfare. The data were collected at the same location at 10 different times over a span of 3 months. They found a mean traffic Density of 68.6 cars per mile (cpm) with a standard deviation of 27.07 cpm. Overall, the cars' average Speed was 26.38 mph, with a standard deviation of 9.68 mph. These researchers found the regression line for these data to be \( \widehat{\text{Speed}} = 50.55 - 0.352\,\text{Density} \).

(a) What is the value of the correlation coefficient between Speed and Density?
(b) What percent of the variation in average Speed is explained by traffic Density?
(c) Predict the average Speed of traffic on the thoroughfare when the traffic Density is 50 cpm.
(d) What is the value of the residual for a traffic Density of 56 cpm with an observed Speed of 32.5 mph?
(e) The dataset initially included the point Density = 125 cpm, Speed = 55 mph. This point was considered an outlier and was not included in the analysis. Will the slope increase, decrease, or remain the same if we redo the analysis and include this point?
(f) Will the correlation become stronger, weaker, or remain the same if we redo the analysis and include this point (125, 55)?
(g) A European member of the research team measured the Speed of the cars in kilometers per hour (1 km ≈ 0.62 miles) and the traffic Density in cars per kilometer.
Find the value of his calculated correlation between speed and density.

Solution:
(a) The slope satisfies \( b_1 = r\,\frac{s_y}{s_x} \), i.e. \( -0.352 = r\,\frac{9.68}{27.07} \), and hence \( r = -0.984 \). The correlation between traffic density and speed is \( r = -0.984 \).
(b) \( R^2 = r^2 = 0.969 \) (computed from the unrounded value of \( r \)). The variation in traffic density accounts for 96.9% of the variation in speed.
(c) \( \widehat{\text{Speed}} = 50.55 - 0.352\,\text{Density} = 50.55 - 0.352(50) = 32.95 \). According to the linear model, when traffic density is 50 cars per mile, the average speed of traffic on a moderately large city thoroughfare is expected to be 32.95 miles per hour.
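Parts (a) to (c) can be checked numerically. The following is a minimal Python sketch and not part of the original solutions; the variable and function names are illustrative, and the numbers are the summary statistics quoted in the problem.

```python
# Summary statistics quoted in problem 2 (names are illustrative).
slope = -0.352        # b1, in mph per (car per mile)
intercept = 50.55     # b0, in mph
sd_density = 27.07    # s_x, standard deviation of Density
sd_speed = 9.68       # s_y, standard deviation of Speed

# b1 = r * s_y / s_x, so r = b1 * s_x / s_y.
r = slope * sd_density / sd_speed
r_squared = r ** 2


def predict_speed(density):
    """Predicted average speed (mph) from the fitted line."""
    return intercept + slope * density


print(round(r, 3), round(r_squared, 3), round(predict_speed(50), 2))
```

Working from the unrounded value of r is what gives 0.969 for R-squared; squaring the rounded -0.984 would give about 0.968 instead.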
(d) \( \widehat{\text{Speed}} = 50.55 - 0.352\,\text{Density} = 50.55 - 0.352(56) = 30.84 \). According to the linear model, when traffic density is 56 cars per mile, the average speed of traffic on a moderately large city thoroughfare is expected to be 30.84 miles per hour. If traffic is actually moving at 32.5 mph, the residual is \( 32.5 - 30.84 = 1.66 \) miles per hour.
(e) \( \widehat{\text{Speed}} = 50.55 - 0.352\,\text{Density} = 50.55 - 0.352(125) = 6.55 \). According to the linear model, when traffic density is 125 cars per mile, the average speed of traffic on a moderately large city thoroughfare is expected to be 6.55 miles per hour. The point with traffic density 125 cars per mile and average speed 55 miles per hour is considerably higher than the model would predict. If this point were included in the analysis, the slope would increase.
(f) The correlation between traffic density and average speed would become weaker. The influential point (125, 55) is a departure from the pattern established by the other data points.
(g) The correlation would not change if kilometers were used instead of miles in the calculations. Correlation is a "unitless" measure of the degree of linear association based on z-scores, and is not affected by changes in scale. The correlation would remain the same, \( r = -0.984 \).

3. A survey of families revealed that 58% of all families eat turkey at holiday meals, 44% eat ham, and 16% have both turkey and ham to eat at holiday meals.

(a) What is the probability that a family selected at random had neither turkey nor ham at their holiday meal?
(b) What is the probability that a family selected at random had only ham without having turkey at their holiday meal?
(c) What is the probability that a randomly selected family having turkey had ham at their holiday meal?
(d) Are having turkey and having ham disjoint events? Explain.

Solution:
[Venn diagram]
(a) \( P(\text{neither ham nor turkey}) = 1 - P(\text{ham or turkey}) = 1 - [P(\text{ham}) + P(\text{turkey}) - P(\text{ham and turkey})] = 1 - [0.44 + 0.58 - 0.16] = 1 - 0.86 = 0.14 \)
Or, using the Venn diagram, 14%.
(b) \( P(\text{ham only}) = P(\text{ham}) - P(\text{ham and turkey}) = 0.44 - 0.16 = 0.28 \). Or, using the Venn diagram, 28%.
(c) \( P(\text{ham} \mid \text{turkey}) = \frac{P(\text{ham and turkey})}{P(\text{turkey})} = \frac{0.16}{0.58} = 0.2759 \)
(d) No, the events are not disjoint, since some families (16%) have both ham and turkey at their holiday meals.

4. Many school administrators watch enrollment numbers for answers to questions parents ask. Some parents wondered if preferring a particular science course is related to the student's preference in foreign language. Students were surveyed to establish their preference in science and foreign language courses. Does it appear that preferences in science and foreign language are independent? Explain.

Solution:
Overall, 102 of 136, or 75%, preferred Spanish. 35 of 51, or 68.6%, of students in Chemistry preferred Spanish; 23 of 33, or 69.7%, of students in Physics preferred Spanish; and 44 of 52, or 84.6%, of students in Biology preferred Spanish. Chemistry and Physics students were much less likely to prefer Spanish than Biology students. It appears that there is an association between preference in science and preference in foreign language, so the two are not independent.

5. For purposes of making on-campus housing assignments, a college classifies its students as Priority A (seniors), Priority B (juniors), and Priority C (freshmen and sophomores). Of the students who choose to live on campus, 10% are seniors, 20% are juniors, and the rest are underclassmen. The most desirable dorm is the newly constructed Gold dorm, and 60% of the seniors elect to live there. 15% of the juniors also live there, along with only 5% of the freshmen and sophomores. What is the probability that a randomly selected resident of the Gold dorm is a senior? Show your work clearly.

Solution:
From the tree diagram
we get
\[ P(A \mid G) = \frac{P(A \text{ and } G)}{P(G)} = \frac{0.060}{0.060 + 0.030 + 0.035} = \frac{0.060}{0.125} = 0.48 \]

6. Your company bids for two contracts. You believe the probability you get contract #1 is 0.8. If you get contract #1, the probability you also get contract #2 will be 0.2, and if you do not get #1, the probability you get #2 will be 0.3.

(a) Are the two contracts independent? Explain.
(b) Let X be the number of contracts you get. Find the probability model for X.
(c) Find the expected value and standard deviation of X.

Solution:
(a) The contracts are not independent. The probability that your company wins the second contract depends on whether or not your company wins the first contract.
(b) We have

x           0      1      2
P(X = x)    0.14   0.70   0.16

where, for example,
\( P(\text{getting 2 contracts}) = P(\text{getting \#1})\,P(\text{getting \#2} \mid \text{got \#1}) = (0.8)(0.2) = 0.16 \),
\( P(\text{getting 0 contracts}) = P(\text{not getting \#1})\,P(\text{not getting \#2} \mid \text{didn't get \#1}) = (0.2)(0.7) = 0.14 \),
and \( P(X = 1) = 1 - 0.14 - 0.16 = 0.70 \).
(c) \( E(X) = 0(0.14) + 1(0.70) + 2(0.16) = 1.02 \) contracts.
\( \sigma^2 = \text{Var}(X) = (0 - 1.02)^2(0.14) + (1 - 1.02)^2(0.70) + (2 - 1.02)^2(0.16) = 0.2996 \)
\( \sigma = SD(X) = \sqrt{\text{Var}(X)} = \sqrt{0.2996} \approx 0.55 \) contracts.

7. The amount of cereal that can be poured into a small bowl varies with a mean of 1.5 ounces and a standard deviation of 0.3 ounces. A large bowl holds a mean of 2.5 ounces with a standard deviation of 0.4 ounces. You open a new box of cereal and pour one large and one small bowl.
(a) How much more cereal do you expect to be in the large bowl?
(b) What's the standard deviation of this difference?
(c) If the difference follows a Normal model, what's the probability the small bowl contains more cereal than the large one?
(d) What are the mean and standard deviation of the total amount of cereal in the two bowls?
(e) If the total follows a Normal model, what's the probability you poured out more than 4.5 ounces of cereal in the two bowls together?
(f) The amount of cereal the manufacturer puts in the boxes is a random variable with a mean of 16.3 ounces and a standard deviation of 0.2 ounces. Find the expected amount of cereal left in the box and the standard deviation.

Solution:
(a) \( E(\text{large} - \text{small}) = E(\text{large}) - E(\text{small}) = 2.5 - 1.5 = 1 \) ounce.
(b) Assuming the two amounts are independent, \( SD(\text{large} - \text{small}) = \sqrt{\text{Var}(\text{large}) + \text{Var}(\text{small})} = \sqrt{0.4^2 + 0.3^2} = 0.5 \) ounces.
(c) The small bowl will contain more cereal than the large bowl when the difference between the amounts is less than 0. The z-score is \( z = \frac{0 - 1}{0.5} = -2 \), and according to the Normal model, the probability of this occurring is \( P(Z < -2) = 0.0228 \).
(d) \( E(\text{large} + \text{small}) = E(\text{large}) + E(\text{small}) = 2.5 + 1.5 = 4 \) ounces, and \( SD(\text{large} + \text{small}) = \sqrt{\text{Var}(\text{large}) + \text{Var}(\text{small})} = \sqrt{0.4^2 + 0.3^2} = 0.5 \) ounces.
(e) According to the Normal model, the probability that the total weight of cereal in the two bowls is more than 4.5 ounces is approximately \( P\left(Z > \frac{4.5 - 4}{0.5}\right) = 1 - P(Z < 1) = 1 - 0.8413 = 0.1587 \).
(f) \( \mu = E(\text{box} - \text{large} - \text{small}) = E(\text{box}) - E(\text{large}) - E(\text{small}) = 16.3 - 2.5 - 1.5 = 12.3 \) ounces, and \( \sigma = SD(\text{box} - \text{large} - \text{small}) = \sqrt{\text{Var}(\text{box}) + \text{Var}(\text{large}) + \text{Var}(\text{small})} = \sqrt{0.2^2 + 0.4^2 + 0.3^2} \approx 0.54 \) ounces.

8. I am the only bank teller on duty at my local bank. I need to run out for 10 minutes, but I don't want to miss any customers. Suppose the arrival of customers can be modeled by a Poisson distribution with mean 2 customers per hour.
(a) What's the probability that no one will arrive in the next 10 minutes?
(b) What's the probability that 2 or more people arrive in the next 10 minutes?
(c) You've just served 2 customers who came in one after the other. Is this a better time to run out?
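The Poisson probabilities asked for in parts (a) and (b) of problem 8 can be computed directly. The following is a minimal Python sketch and not part of the original solutions; the helper `poisson_pmf` and the variable names are illustrative, and the only modeling step assumed is rescaling the rate of 2 customers per hour to a 10-minute window.

```python
import math


def poisson_pmf(k, mean):
    """P(X = k) for a Poisson random variable with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


rate_per_hour = 2.0
window_minutes = 10

# The Poisson mean scales with the length of the interval: 2 * 10/60 = 1/3.
mean_in_window = rate_per_hour * window_minutes / 60

p_none = poisson_pmf(0, mean_in_window)
p_two_or_more = 1 - (poisson_pmf(0, mean_in_window) + poisson_pmf(1, mean_in_window))

print(round(p_none, 4), round(p_two_or_more, 4))
```

Using the complement for "2 or more" avoids summing an infinite tail.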
Solution:
(a) Because the mean of a Poisson model scales with the length of the time interval, we can calculate the mean for 10 minutes. A Poisson model with mean 2 customers per hour (60 minutes) is equivalent to a Poisson model with mean \( \frac{2}{6} = \frac{1}{3} \) per \( \frac{60}{6} = 10 \) minutes. Let X be the number of customers arriving in 10 minutes. Then \( P(X = 0) = \frac{e^{-1/3}\,(1/3)^0}{0!} = 0.7165 \).
(b) \( P(X \ge 2) = 1 - [P(X = 0) + P(X = 1)] = 1 - \left[ \frac{e^{-1/3}\,(1/3)^0}{0!} + \frac{e^{-1/3}\,(1/3)^1}{1!} \right] \approx 0.0446 \).
(c) No. The probabilities do not change based on what has just happened. Even though two customers just came in, the probability of no customers in the next 10 minutes will not change. This is neither a better nor a worse time.

9. Vitamin D is essential for strong, healthy bones. Our bodies produce vitamin D naturally when sunlight falls upon the skin, or it can be taken as a dietary supplement. Although the bone disease rickets was largely eliminated in England during the 1950s, some people there are concerned that this generation of children is at increased risk because they are more likely to watch TV or play computer games than spend time outdoors. Recent research indicated that about 20% of British children are deficient in vitamin D. Suppose doctors test a group of elementary school children.

(a) What's the probability that the first vitamin D-deficient child is the eighth one tested?
(b) What's the probability that the first 10 children tested are all okay?
(c) How many kids do they expect to test before finding one who has this vitamin deficiency?
(d) They will test 50 students at the third-grade level. Find the mean and standard deviation of the number who may be deficient in vitamin D.
(e) If they test 320 children at this school, what's the probability that no more than 50 of them have the vitamin deficiency?

Solution:
The selection of these children may be considered Bernoulli trials. There are only two possible outcomes, vitamin D deficient or not vitamin D deficient.
Recent research indicates that 20% of British children are vitamin D deficient. (The probability of not being vitamin D deficient is therefore 80%.) Provided the students at this school are representative of all British children, we can consider the probability constant. The trials are not strictly independent, since the population of British children is finite, but the children at this school represent fewer than 10% of all British children, so we can proceed as if they were.
(a) Let X be the number of students tested up to and including the first student who is vitamin D deficient. Use Geom(0.2) to model the situation. \( P(\text{first vitamin D-deficient child is the eighth one tested}) = P(X = 8) = (0.8)^7(0.2) = 0.042 \)
(b) \( P(\text{the first ten children tested are okay}) = (0.8)^{10} = 0.107 \)
(c) \( E(X) = \frac{1}{p} = \frac{1}{0.2} = 5 \) kids.
(d) Let Y be the number of children who are vitamin D deficient out of 50 children. Use Binom(50, 0.2). Then \( E(Y) = np = 50(0.2) = 10 \) kids and \( SD(Y) = \sqrt{npq} = \sqrt{50(0.2)(0.8)} = 2.83 \) kids.
(e) Now let Y be the number of children who are vitamin D deficient out of 320 children. Using Binom(320, 0.2), we have \( E(Y) = np = 320(0.2) = 64 \) kids and \( SD(Y) = \sqrt{npq} = \sqrt{320(0.2)(0.8)} = 7.16 \) kids. Since \( np = 64 \) and \( nq = 256 \) are both greater than 10, Binom(320, 0.2) may be approximated by the Normal model N(64, 7.16). We are looking for
\[ P(Y \le 50) \approx P\left(Z \le \frac{50 - 64}{7.16}\right) = P(Z < -1.96) = 0.0250 \]
According to the Normal model, the probability that no more than 50 out of 320 children have the vitamin D deficiency is approximately 0.0250.

10. Hoping to lure more shoppers downtown, a city builds a new public parking garage in the central business district. The city plans to pay for the structure through parking fees. During a two-month period (44 weekdays), daily fees collected averaged $126, with a standard deviation of $15.

(a) What assumptions must you make in order to use these statistics for inference?
(b) Write a 90% confidence interval for the mean daily income this parking garage will generate.
(c) Interpret this confidence interval in context.
(d) Explain what "90% confidence" means in this context.
(e) The consultant who advised the city on this project predicted that parking revenues would average $130 per day. Based on your confidence interval, do you think the consultant was correct? Why?
(f) Someone suggests that the city use its data to create a 95% confidence interval instead of the 90% interval first created. How would this interval be better for the city? (You need not actually create the new interval.)
(g) How would the 95% interval be worse for the planners?
(h) How could they achieve an interval estimate that would better serve their planning needs?

Solution:
(a) Randomization condition: The weekdays were not randomly selected. We will assume that the weekdays in our sample are representative of all weekdays.
Nearly Normal condition: We don't have the actual data, but since the sample of 44 weekdays is fairly large, it is okay to proceed.
The weekdays in the sample had a mean revenue of $126 and a standard deviation in revenue of $15. The sampling distribution of the mean can be modeled by a Student's t-model, with \( 44 - 1 = 43 \) degrees of freedom. We will use a one-sample t-interval with 90% confidence for the mean daily income of the parking garage. (By hand, use \( t^*_{40} = 1.684 \).)
(b) \( \bar{y} \pm t^*_{n-1}\,\frac{s}{\sqrt{n}} = 126 \pm t^*_{43}\,\frac{15}{\sqrt{44}} = (122.2,\ 129.8) \)
(c) We are 90% confident that the interval $122.20 to $129.80 contains the true mean daily income of the parking garage.
(d) About 90% of all random samples of size 44 will produce intervals that contain the true mean daily income of the parking garage.
(e) Since the interval is completely below the $130 predicted by the consultant, there is evidence that the average daily parking revenue is lower than $130.
(f) The 95% confidence interval would be wider than the 90% confidence interval. We can be more confident that our interval contains the mean parking revenue when we are less precise. This would be better for the city because the 95% confidence interval is more likely to contain the true mean parking revenue.
(g) The 95% confidence interval is wider than the 90% confidence interval, and therefore less precise. It would be difficult for budget planners to use this wider interval, since they need precise figures for the budget.
(h) By collecting a larger sample of parking revenue on weekdays, they could create a more precise interval without sacrificing confidence.

11. A researcher measured the body temperatures of a randomly selected group of adults. Here are summaries of the data he collected. We wish to estimate the average (or "normal") temperature among the adult population.

(a) Check the conditions for creating a t-interval.
(b) Find a 98% confidence interval for mean body temperature.
(c) Explain the meaning of that interval.
(d) Explain what "98% confidence" means in this context.
(e) 98.6°F is commonly assumed to be "normal." Set up the null and alternative hypotheses for testing this.
(f) Check the conditions for performing the test and perform the test.
(g) What do the data say about the test in part a?

Solution:
(a) Randomization condition: The adults were randomly selected.
Nearly Normal condition: The sample of 52 adults is large, and the histogram shows no serious skewness, outliers, or multiple modes.
The people in the sample had a mean temperature of 98.2846°F and a standard deviation in temperature of 0.682379°F. Since the conditions are satisfied, the sampling distribution of the mean can be modeled by a Student's t-model, with \( 52 - 1 = 51 \) degrees of freedom. We will use a one-sample t-interval with 98% confidence for the mean body temperature. (By hand, use \( t^*_{50} = 2.403 \) from the table.)
(b) \( \bar{y} \pm t^*_{n-1}\,\frac{s}{\sqrt{n}} = 98.2846 \pm t^*_{51}\,\frac{0.682379}{\sqrt{52}} = (98.06,\ 98.51) \)
(c) We are 98% confident that the interval 98.06°F to 98.51°F contains the true mean body temperature for adults. (If you calculated the interval by hand, using \( t^*_{50} = 2.403 \) from the table, your interval may be slightly different from intervals calculated using technology. With the rounding used here, they are identical. Even if they aren't, it's not a big deal.)
(d) About 98% of all random samples of size 52 will produce intervals that contain the true mean body temperature of adults.
(e) \( H_0 \): Mean body temperature is 98.6°F, as commonly assumed. (\( \mu = 98.6 \))
\( H_A \): Mean body temperature is not 98.6°F. (\( \mu \ne 98.6 \))
(f) Randomization condition: The adults were randomly selected.
Nearly Normal condition: The sample of 52 adults is large, and the histogram shows no serious skewness, outliers, or multiple modes.
The people in the sample had a mean temperature of 98.285°F and a standard deviation in temperature of 0.6824°F.
Since the conditions are satisfied, the sampling distribution of the mean can be modeled by a Student's t-model, with \( 52 - 1 = 51 \) degrees of freedom. We will use a one-sample t-test:
\[ t = \frac{\bar{y} - \mu_0}{SE(\bar{y})} = \frac{98.285 - 98.6}{0.6824 / \sqrt{52}} = -3.33 \]
(g) Since the P-value = 0.0016 is low, we reject the null hypothesis. There is strong evidence that the true mean body temperature of adults is not 98.6°F. This sample would suggest that it is significantly lower.
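The interval in part (b) and the test statistic in part (f) of problem 11 can be reproduced with a few lines of arithmetic. This is a minimal Python sketch and not part of the original solutions; the variable names are illustrative, and the critical value 2.403 is the table value for 50 degrees of freedom quoted above rather than the exact value for 51.

```python
import math

# Sample summaries quoted in problem 11 (names are illustrative).
n = 52
sample_mean = 98.2846
sample_sd = 0.682379
mu_0 = 98.6          # hypothesized "normal" body temperature

# Table value for 98% confidence with 50 df, as used in the solution above.
t_star = 2.403

standard_error = sample_sd / math.sqrt(n)
margin = t_star * standard_error

interval = (sample_mean - margin, sample_mean + margin)
t_stat = (sample_mean - mu_0) / standard_error

print(round(interval[0], 2), round(interval[1], 2), round(t_stat, 2))
```

The P-value itself needs the t distribution with 51 degrees of freedom, which the Python standard library does not provide, so this sketch stops at the test statistic.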
12. The National Center for Education Statistics monitors many aspects of elementary and secondary education nationwide. Their 1996 numbers are often used as a baseline to assess changes. In 1996, 31% of students reported that their mothers had graduated from college. In 2000, responses from 8368 students found that this figure had grown to 32%. Is this evidence of a change in education level among mothers?

(a) Write appropriate hypotheses.
(b) Check the assumptions and conditions.
(c) Perform the test and find the P-value.
(d) State your conclusion.
(e) Do you think this difference is meaningful? Explain.

Solution:
(a) \( H_0 \): The percentage of students in 2000 whose mothers had graduated from college is 31%. (\( p = 0.31 \))
\( H_A \): The percentage of students is different from 31%. (\( p \ne 0.31 \))
(b) Randomization condition: Although not specifically stated, we can assume that the National Center for Education Statistics used random sampling.
10% condition: The 8368 students are less than 10% of all students.
Success/Failure condition: \( np_0 = (8368)(0.31) = 2594.08 \) and \( nq_0 = (8368)(0.69) = 5773.92 \) are both greater than 10, so the sample is large enough.
(c) Since the conditions for inference are met, a Normal model can be used to model the sampling distribution of the proportion, with \( p_0 = 0.31 \) and
\[ SD(\hat{p}) = \sqrt{\frac{p_0 q_0}{n}} = \sqrt{\frac{(0.31)(0.69)}{8368}} = 0.0051 \]
The observed proportion of students whose mothers are college graduates is \( \hat{p} = 0.32 \). We can perform a one-proportion two-tailed z-test:
\[ z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0 q_0}{n}}} = \frac{0.32 - 0.31}{\sqrt{\frac{(0.31)(0.69)}{8368}}} \approx 1.98 \]
The P-value is 0.0478.
(d) With a P-value of 0.0478, we reject the null hypothesis at the 5% level. There is evidence to suggest that the percentage of students whose mothers are college graduates has changed since 1996. In fact, the evidence suggests that the percentage has increased.
(e) This result is not meaningful. A difference this small, although statistically significant, is of little practical significance.

13. A clean air standard requires that vehicle exhaust emissions not exceed specified limits for various pollutants. Many states require that cars be tested annually to be sure they meet these standards. Suppose state regulators double-check a random sample of cars that a suspect repair shop has certified as okay. They will revoke the shop's license if they find significant evidence that the shop is certifying vehicles that do not meet standards.

(a) In this context, what is a Type I error?
(b) In this context, what is a Type II error?
(c) Which type of error would the shop's owner consider more serious?
(d) Which type of error might environmentalists consider more serious?

Solution:
\( H_0 \): The shop is meeting the emissions standards.
\( H_A \): The shop is not meeting the emissions standards.
(a) A Type I error is when the regulators decide that the shop is not meeting standards when it actually is meeting the standards.
(b) A Type II error is when the regulators certify the shop when it is not meeting the standards.
(c) Type I would be more serious to the shop owner. The shop would lose its certification, even though it is meeting the standards.
(d) Type II would be more serious to environmentalists. The shop is allowed to keep operating, even though it is allowing polluting cars on the road.

14. As in the question above, state regulators are checking up on repair shops to see if they are certifying vehicles that do not meet pollution standards.

(a) In this context, what is meant by the power of the test the regulators are conducting?
(b) Will the power be greater if they test 20 or 40 cars? Why?
(c) Will the power be greater if they use a 5% or a 10% level of significance? Why?
(d) Will the power be greater if the repair shop's inspectors are only a little out of compliance or a lot? Why?

Solution:
(a) The power of the test is the probability of detecting that the shop is not meeting standards when it is in fact not meeting them.
(b) The power of the test will be greater when 40 cars are tested. A larger sample size increases the power of the test.
(c) The power of the test will be greater when the level of significance is 10%. With a higher level of significance, it is more likely that the null hypothesis will be rejected.
(d) The power of the test will be greater when the shop is out of compliance "a lot". Larger problems are easier to detect.
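The three claims in problem 14 (more cars, a higher significance level, and a larger violation each raise the power) can be illustrated numerically. The following Python sketch is not part of the original solutions and rests on assumptions the problem does not state: a one-sided one-proportion z-test, a hypothetical null failure rate of 10% among certified cars, hypothetical true failure rates of 15% ("a little") and 30% ("a lot"), and a Normal approximation that is rough for samples this small.

```python
import math
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def power(n, alpha, p_true, p_null=0.10):
    """Approximate power of a one-sided test of H0: p = p_null vs HA: p > p_null.

    All rates here are hypothetical illustration values, not from the problem.
    """
    z_alpha = STANDARD_NORMAL.inv_cdf(1 - alpha)
    # Smallest sample proportion that leads to rejecting H0.
    cutoff = p_null + z_alpha * math.sqrt(p_null * (1 - p_null) / n)
    # Probability of exceeding that cutoff when the true rate is p_true.
    se_true = math.sqrt(p_true * (1 - p_true) / n)
    return 1 - STANDARD_NORMAL.cdf((cutoff - p_true) / se_true)


baseline = power(n=20, alpha=0.05, p_true=0.30)
more_cars = power(n=40, alpha=0.05, p_true=0.30)
higher_alpha = power(n=20, alpha=0.10, p_true=0.30)
small_violation = power(n=20, alpha=0.05, p_true=0.15)

print(baseline, more_cars, higher_alpha, small_violation)
```

Under these assumed values, the 40-car and 10%-level cases both come out above the baseline, and the "only a little out of compliance" case comes out well below it, matching answers (b) through (d).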