SS23 Final Exam Study Guide PAGE 1 of 23
SS23 - STT 200 Final Exam Study Guide

The final exam is comprehensive, with more emphasis on the material covered since the second midterm. Part 1 of this study guide provides example problems from the new material. Part 2 of the study guide provides example problems for the older material. We also encourage you to review again the study guides and practice exams provided for the two midterms as you prepare for the final.

Part 1 New material since Midterm 2

You can expect that approximately 50% of the points for the final exam will come from the material covered after Midterm 2. The learning objectives for this portion of the course and some example exam questions are provided in this section of the study guide.

Part 5 Learning Objectives

1. Given a real-life decision situation, identify when and be able to conduct a parametric hypothesis test for
   a. a single population mean
   b. the difference in two population means for independent samples
   by
   a. formulating a null and alternative hypothesis appropriate to the research question.
   b. identifying and checking necessary conditions for the test.
   c. calculating a test statistic and p-value (including degrees of freedom, where applicable).
   d. interpreting the p-value and evaluating whether the null model is consistent with observed sample results.
   e. calculating and interpreting the estimated effect size.
2. Given a real-life estimation procedure, be able to construct and interpret parametric confidence intervals for the difference in two population means for independent samples.
3. Given a description of a hypothesis testing situation for means, be able to identify and contextually interpret Type I and Type II errors.
4. Given a hypothesis test scenario for a single population mean and a decision rule, be able to calculate the chance of making a Type I error and the power of a test for a given alternate mean value.

1. Fast-food waiting times ~ You are the manager of a restaurant for a fast-food franchise. Last month the mean waiting time at the drive-through window, as measured from the time a customer places an order until the time the customer receives the order, was 3.7 minutes. The franchise helped you institute a new process intended to reduce waiting time. You conduct a study to determine if the mean waiting time at your restaurant has decreased.

a. What hypotheses would be appropriate to test whether the new process has, in fact, reduced waiting time?
   H₀: ________________ vs. Hₐ: _____________________

b. You select a random sample of 64 orders. The sample mean waiting time is 3.57 minutes with a sample standard deviation of 0.8 minutes. Conduct a test of these two hypotheses.
   Test statistic: _______________________ p-value: ____________________________

c. Which of the following is a correct interpretation of the p-value computed in (b)? Choose all that apply.
   i. The p-value represents the probability that H₀ is true, assuming that our data was collected using proper sampling technique.
   ii. The p-value represents the probability of observing our sample results under the null model.
   iii. The p-value represents the probability of observing our sample results or something more in favor of the alternate hypothesis under the null model.
   iv. The p-value represents the size of the difference between what we observed in the sample and the null model, relative to the expected error.

d. Calculate the effect size for this hypothesis test.

e. Based on your result in part (d), the effect size for this problem would be considered:
   i. small
   ii. small to moderate
   iii. moderate to large
   iv. large

f. For this problem, which of the following conditions must be met for the hypothesis test to be valid? Choose all that apply.
   i. _______ np̂ ≥ 10 and n(1 − p̂) ≥ 10
   ii. _______ Observations must be independent.
   iii. _______ The population data must be nearly normal, or the sample size must be sufficiently large, n ≥ 30.
   iv. _______ The sample data must be nearly normal.
   v. _______ There must be an expected count of at least 5 on each day.

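For checking hand or calculator work on a one-sample test from summary statistics, the arithmetic can be sketched as below. This is only an illustration: the function name is hypothetical, and the effect size is assumed to be the standardized difference d̂ = (x̄ − μ₀)/s, which may differ from the definition used in your course notes. The p-value itself (a lower-tail area of the t distribution with n − 1 degrees of freedom for this question) still has to come from a calculator or table.

```python
import math

def one_sample_t(sample_mean, null_mean, sample_sd, n):
    """Return (t statistic, degrees of freedom, estimated effect size).

    Assumes the effect size is (sample_mean - null_mean) / sample_sd,
    which is an assumption about the course's definition.
    """
    standard_error = sample_sd / math.sqrt(n)        # SE of the sample mean
    t_stat = (sample_mean - null_mean) / standard_error
    effect_size = (sample_mean - null_mean) / sample_sd
    return t_stat, n - 1, effect_size

# Summary statistics from the fast-food waiting time question
t_stat, df, d_hat = one_sample_t(sample_mean=3.57, null_mean=3.7, sample_sd=0.8, n=64)
print(f"t = {t_stat:.3f}, df = {df}, d-hat = {d_hat:.4f}")
```

A negative t statistic here simply reflects that the sample mean is below the null value, which is the direction the alternative hypothesis is looking for.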
2. Anorexia therapy ~ Anorexia is an eating disorder that can cause a person to become dangerously underweight. In a recent study, 29 girls who were suffering from anorexia received a psychotherapy treatment called cognitive behavioral therapy, which stresses identifying the thinking that causes the undesirable behavior and replacing it with thoughts designed to help improve this behavior. Over the course of the study, the mean weight gain for the patients was x̄ = 3.0 pounds with a standard deviation of s = 7.3 pounds. Researchers want to know if the therapy had an effect, that is, did the weight of the participants change (either increase or decrease) over the course of the study?

a. [3] Express this research question in terms of a null and alternative hypothesis. Use appropriate notation.
   H₀: ___________________________ vs. Hₐ: ___________________________

b. [4] Conduct the hypothesis test. Calculate the test statistic and the p-value. Show all work using formulas and/or calculator input/output.
   Observed test statistic: _____________________ and p-value: __________________________

c. [3] Calculate the estimated effect size, d̂. Show all work.

d. [3] One year later, another researcher repeats this study with a group of 45 teenagers, male and female, who are also suffering from anorexia. The new study reports a p-value of p = 0.053 and an estimated effect size of d̂ = 0.396. The results of the new study (circle one)
   confirm / conflict with
   the results of the previous study because… (explain your choice in 1 or 2 sentences)

3. Air pollution ~ A chemical plant is required to maintain sulfur dioxide levels in the working environment atmosphere at an average level of no more than 0.125 parts per million (ppm). Safety engineers measure the levels at 10 randomly chosen intervals each week. If the sample mean sulfur dioxide level is more than 0.15 ppm, the safety protocols say that the plant will be evacuated while the air is scrubbed and machines are adjusted. The standard deviation of the measurements is known to be 0.04 ppm, and the testing scenario tests the hypotheses H₀: μ = 0.125 vs. Hₐ: μ > 0.125.

a. Below are 4 possible outcomes from the testing scenario, H₀: μ = 0.125 vs. Hₐ: μ > 0.125. Place a letter in each blank below to identify which outcome is the consequence of a Type I error and which is a consequence of a Type II error.
   A. The plant will not be evacuated when it needs to be.
   B. The plant will be evacuated when it needs to be.
   C. The plant will not be evacuated when it does not need to be.
   D. The plant will be evacuated when it does not need to be.
   Type I Error: ___________________ Type II Error: __________________________

b. Using this decision rule, what is the chance the safety engineers will make a Type I error?

c. Using the same decision rule, what is the power of this test if the mean pollution level is 0.14 ppm?

4. Battery Testing ~ At a manufacturing plant that produces 12-volt batteries, a quality assurance technician regularly performs a standard test that records how long each battery will produce 400 amperes. The time durations for the batteries are known to be normally distributed with a population mean time duration of 10.5 minutes and a population standard deviation of 1.96 minutes. Every half-hour during a production run, a random sample of 3 batteries is selected and tested. If the sample mean time duration for the three batteries is less than 8.5 minutes, the production of the batteries will be stopped and the machinery will be inspected for problems. The hypotheses for this testing situation are H₀: μ = 10.5 vs. Hₐ: μ < 10.5, where μ denotes the population mean time a battery produces 400 amperes.

a. Below are 4 possible outcomes from the testing scenario. Place a letter in each blank below to identify which outcome is the consequence of a Type I error and which is a consequence of a Type II error.
   A. The production of the batteries is not stopped when it is not necessary.
   B. The production of the batteries is stopped when it is necessary.
   C. The production of the batteries is not stopped when it is necessary.
   D. The production of the batteries is stopped when it is not necessary.
   Type I Error: ___________________ Type II Error: __________________________

b. What is the chance the manufacturer will make a Type I error with the decision rule given above? Show all work and/or calculator input/output.

c. Using the same decision rule, what is the power of the hypothesis test if the population mean time duration for the batteries is 9 minutes? Show all work and/or calculator input/output.

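For decision rules of the form "reject H₀ when the sample mean crosses a cutoff", both the Type I error rate and the power are the same calculation: the probability that the sample mean lands in the rejection region, evaluated at the null mean (Type I error) or at an alternative mean (power). A minimal sketch follows, assuming the sample mean is normally distributed with standard deviation σ/√n. That is stated for the batteries; for the air pollution measurements it is an assumption. The function name and the `tail` argument are illustrative, not part of the course materials.

```python
from math import sqrt
from statistics import NormalDist

def rejection_probability(true_mean, cutoff, sigma, n, tail):
    """P(sample mean falls in the rejection region) when the population mean is true_mean.

    tail="lower" rejects when the sample mean is below the cutoff,
    tail="upper" rejects when it is above the cutoff.
    """
    sampling_dist = NormalDist(mu=true_mean, sigma=sigma / sqrt(n))
    below_cutoff = sampling_dist.cdf(cutoff)
    if tail == "lower":
        return below_cutoff
    if tail == "upper":
        return 1.0 - below_cutoff
    raise ValueError("tail must be 'lower' or 'upper'")

# Battery testing: stop production if the mean of 3 batteries is below 8.5 minutes
alpha_battery = rejection_probability(10.5, 8.5, sigma=1.96, n=3, tail="lower")
power_battery = rejection_probability(9.0, 8.5, sigma=1.96, n=3, tail="lower")

# Air pollution: evacuate if the mean of 10 readings is above 0.15 ppm
alpha_air = rejection_probability(0.125, 0.15, sigma=0.04, n=10, tail="upper")
power_air = rejection_probability(0.14, 0.15, sigma=0.04, n=10, tail="upper")

print(f"battery: alpha = {alpha_battery:.4f}, power = {power_battery:.4f}")
print(f"air:     alpha = {alpha_air:.4f}, power = {power_air:.4f}")
```

The only thing that changes between the Type I error and the power calculation is which mean the sampling distribution is centered on; the cutoff and standard error stay fixed.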
5. Railways & Housing Values ~ In the 1800s, an extensive system of railroads connected towns in New England but as automobile use spread most of the train tracks were disassembled. In recent years, many cities have converted the unused railroad beds into “rail trails” for citizens to use for walking and biking. In one such town, researchers collected information on 104 homes and classified them as either “Closer” or “Farther Away” from the rail trail and then calculated the percentage change in estimated sale price for each home between the years 1998 and 2014.

                 Closer    Farther Away
   Mean           48.0         38.6
   SD             25.4         32.5
   Sample size    40           64

a. Conduct a hypothesis test to determine if the mean percent change in estimated sale price is different for the “Closer” and “Farther Away” groups of homes.
   H₀: ___________________________ vs. Hₐ: ___________________________
   Observed test statistic: _____________________ and p-value: __________________________

b. In order for this test to be valid, certain conditions must be met. Which of the following is not one of these conditions? Select all the incorrect statements.
   ______ Independence between population groups.
   ______ Independence within each sample.
   ______ There must be an expected value of at least 5 for each cell in the table.
   ______ We must be able to expect at least 10 successes and 10 failures in each sample.
   ______ Each population data distribution must be nearly normal, or sample sizes must be at least 30.

c. Calculate the estimated effect size for this hypothesis testing scenario.

6. Traveling distances between cities ~ The U.S. Department of Transportation provides the number of miles that residents of the 75 largest metropolitan areas travel per day in a car. Suppose that for a simple random sample of 50 Buffalo residents the mean is 22.5 miles a day and the standard deviation is 8.4 miles a day, and for an independent simple random sample of 50 Boston residents the mean is 18.6 miles a day and the standard deviation is 7.4 miles a day.

a. A researcher wants to test to see if there is a difference in mean daily travel distance between the two cities. Express the research question in terms of two competing hypotheses:
   H₀: ___________________________ vs. Hₐ: ___________________________

b. Conduct a test of these two hypotheses. Calculate a test statistic and a p-value. Show all work and/or calculator input/output.
   Observed test statistic: _____________________ and p-value: __________________________

c. Calculate the estimated effect size for this hypothesis testing scenario.

d. Given the p-value calculated in part (b) and the effect size in part (c), we have (circle one)
   Little / Some / Strong / Very Strong / Extremely Strong
   evidence that the null model (circle one)
   is / is not
   compatible with the observed sample data and an effect size that is (circle one)
   small / small-to-moderate / moderate-to-large / large

e. Construct a 95% confidence interval for the difference between the two population means. Show all work, including any calculator input.

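The two-sample calculations can be sketched from summary statistics as follows. Several choices here are assumptions rather than the course's stated method: the standard error uses the unpooled formula √(s₁²/n₁ + s₂²/n₂), the effect size divides the difference in means by a pooled standard deviation, and the critical value `t_star` is supplied by hand (2.01 is roughly the 97.5th percentile of a t distribution with 49 degrees of freedom, the conservative min(n₁ − 1, n₂ − 1) convention). If your calculator reports a different degrees-of-freedom value, the critical value and interval will shift slightly.

```python
import math

def two_sample_summary(mean1, sd1, n1, mean2, sd2, n2, t_star):
    """Return (t statistic, estimated effect size, confidence interval) for mean1 - mean2.

    t_star is the critical value for the desired confidence level; it is an
    input because the degrees-of-freedom convention varies between courses.
    """
    diff = mean1 - mean2
    standard_error = math.sqrt(sd1**2 / n1 + sd2**2 / n2)   # unpooled SE
    t_stat = diff / standard_error
    # Pooled SD used only for the effect size (an assumed definition)
    pooled_sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    effect_size = diff / pooled_sd
    margin = t_star * standard_error
    return t_stat, effect_size, (diff - margin, diff + margin)

# Buffalo vs. Boston daily miles traveled
t_stat, d_hat, ci = two_sample_summary(22.5, 8.4, 50, 18.6, 7.4, 50, t_star=2.01)
print(f"t = {t_stat:.3f}, d-hat = {d_hat:.3f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

The same function applies to the rail-trail summary table by swapping in the two groups' means, standard deviations, and sample sizes.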
Part 6 Learning Objectives

1. Given a scatter plot, be able to roughly estimate the correlation coefficient.
2. Given the description of a linear association of two variables, be able to interpret the sign and strength of the correlation coefficient.
3. Given a scatter plot, be able to distinguish between a
   a. Positive and negative association.
   b. Linear, non-linear, and no association.
   c. Strong and weak association (linear or non-linear)
4. Given the means x̄ and ȳ, the standard deviations s_x and s_y, and the correlation coefficient r, for a set of data and/or computer output, be able to calculate and/or find the equation of the regression line and R-squared.
5. Given a linear regression equation, be able to
   a. interpret the slope and intercept of the estimated regression line.
   b. use the regression line to compute point estimates.
6. Given a linear regression equation and information about an observed data point, be able to calculate and interpret the residual for that point.
7. Given a scatter plot and residual plots, be able to evaluate whether the conditions have been met for creating a linear regression model for the data.
8. Given computer output of a linear regression model, be able to conduct a hypothesis test for the significance of the slope.

7. Estimate r ~ Write the correlation coefficient beneath the scatter plot. Choose from the following values:
   A. -1.25   B. -0.93   C. -0.77   D. -0.42   E. -0.16   F. 0.42   G. 0.77   H. 0.93

   ____________ a) Plot #1
   ____________ b) Plot #2
   ____________ c) Plot #3
   ____________ d) Plot #4

8. Mall sales ~ A national chain of women’s clothing stores with locations in large shopping malls wants to understand the variables that impact sales. For each of 24 randomly selected stores, management records the number of competing stores in the mall where the store is located (Competitors) and the monthly sales for that store (Sales). The summary statistics and scatter plot are shown below.

   Competitors: x̄ = 14.2, s_x = 2.69
   Sales: ȳ = 4535.48, s_y = 536.22
   The correlation coefficient between Competitors and Sales is r = −0.377.

a. Use the summary statistics to find the equation of the estimated regression line for predicting Sales from Competitors. Use appropriate notation and show your work.
   Equation: __________________________________________

b. A store in the data set had 16 competitors and sales of $4379. Use your equation from part (a) to calculate the predicted sales for this store, along with its residual. Show your work.
   Predicted value: _______________________ and residual: ______________________

c. Based on your answer to part (b), did the model overestimate or underestimate the sales for this store? (Circle one)
   Overestimate / Underestimate

d. What is the R² value for the regression of monthly sales on number of competing stores? (No work required.)
   Final answer: _____________________________

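As a way to check work on regression-from-summary-statistics questions, the sketch below uses the standard least-squares identities b₁ = r·s_y/s_x and b₀ = ȳ − b₁·x̄, with residual = observed − predicted and R² = r² for simple linear regression. The function and variable names are illustrative only.

```python
def regression_from_summary(x_bar, s_x, y_bar, s_y, r):
    """Return (intercept, slope) of the least-squares line from summary statistics."""
    slope = r * s_y / s_x
    intercept = y_bar - slope * x_bar   # the line passes through (x_bar, y_bar)
    return intercept, slope

# Mall sales: Competitors is x, Sales is y
b0, b1 = regression_from_summary(x_bar=14.2, s_x=2.69, y_bar=4535.48, s_y=536.22, r=-0.377)

predicted = b0 + b1 * 16          # predicted sales for a store with 16 competitors
residual = 4379 - predicted       # observed minus predicted
r_squared = (-0.377) ** 2         # R-squared equals r squared in simple regression

print(f"y-hat = {b0:.2f} + ({b1:.2f}) * x")
print(f"predicted = {predicted:.2f}, residual = {residual:.2f}, R^2 = {r_squared:.4f}")
```

A negative residual means the observed value sits below the line, so the model's prediction was too high for that store.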
9. Seatbelt compliance ~ In a study on mandatory seat-belt laws, a group of lobbyists wanted to determine whether there was a relationship between the percentage of motorists who comply with the law (Percent) and the amount of time that the law has been in effect (Years). They decided to look at data on 48 locations with mandatory seatbelt laws and used statistical software to create a linear regression model using Years as the explanatory variable. The computer output for the linear regression model is shown below.

   Coefficients:
                Estimate  Std. Error  t value  Pr(>|t|)
   (Intercept)   46.2028      4.4426   10.400  1.15e-13
   Years          1.8846      0.4673    4.033  0.000205
   ---
   Residual standard error: 9.806 on 46 degrees of freedom
   Multiple R-squared: 0.2612, Adjusted R-squared: 0.2452

Match the number from the computer output above to the interpretation of the value. Not all numbers will be used.
   A. 0.2452   B. 4.033   C. 4.4426   D. 46.2028   E. 0.4673
   F. 1.8846   G. 9.806   H. 0.000205   I. 0.5111   J. 0.2612

   ___________ a) The test statistic for testing the hypotheses: H₀: β₁ = 0 vs. Hₐ: β₁ ≠ 0.
   ___________ b) The proportion of the variation in Percent that can be explained by the linear relationship to Years.
   ___________ c) Each additional year the seatbelt law has been in effect is associated with a change of this percentage of motorists who comply with the law.
   ___________ d) The correlation coefficient between Percent and Years.
   ___________ e) The percentage of motorists that the model would predict to comply with a seatbelt law as soon as it was implemented, i.e., when Years = 0.
   ___________ f) The standard error for the regression model.
   ___________ g) The standard error for the slope of the regression model.

h) How much evidence is there of a linear relationship between Percent and Years?
   A. Little
   B. Some
   C. Strong
   D. Very Strong
   E. Extremely Strong
   F. It is impossible to tell from the information given.

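Two of the numbers in regression output like this can be recomputed from the others, which is a useful sanity check when reading it. The sketch below assumes the usual relationships for simple linear regression: the t value for a coefficient is its estimate divided by its standard error, the correlation coefficient is the square root of R-squared carrying the sign of the slope, and the residual degrees of freedom are n − 2. Variable names are illustrative.

```python
import math

# Values read from the seatbelt compliance output
slope_estimate = 1.8846
slope_std_error = 0.4673
r_squared = 0.2612
n_locations = 48

t_value = slope_estimate / slope_std_error                 # test statistic for H0: beta1 = 0
r = math.copysign(math.sqrt(r_squared), slope_estimate)    # r takes the sign of the slope
residual_df = n_locations - 2                              # one slope and one intercept estimated

print(f"t = {t_value:.3f}, r = {r:.4f}, df = {residual_df}")
```

Both recomputed values agree with numbers that already appear in the output or in the answer choices, which is what one would expect if the relationships above hold.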
10. Regression Assumptions ~ Below are the scatter plots and plots for residuals for the Mall Sales regression model and the Seatbelt Compliance regression model from the previous two questions. Select True or False for each statement about the scatter plots.

a) Stores with fewer competitors tend to have higher sales.
   A. TRUE   B. FALSE

b) Compliance with seatbelt laws tends to decrease the longer the laws are in effect.
   A. TRUE   B. FALSE

c) The relationship in the mall sales residual plot has more constant variance than the relationship in the seatbelt compliance residual plot.
   A. TRUE   B. FALSE

11. Estimate r ~ Write the correlation coefficient for each scatter diagram on the line beneath the graph. Choose your answers from the following list (not all numbers will be used):
   0.1, -0.85, -0.6, 0.75, 1, 0.9, -0.3, -1.25

   a. ________   b. ________   c. ________   d. ________

12. Regression assumptions ~ Place an X beneath the scatter plot for which linear regression would be most appropriate.

   __________   __________   __________   __________

13. Basketball Salaries vs. Team Revenues ~ College basketball is big business, with coaches’ salaries, revenues, and expenses in millions of dollars. A sample of 45 college basketball programs was selected. The data is summarized below. Both revenues and salaries are reported in millions of dollars.

   Revenues: x̄ = 6.74, s_x = 4.52
   Coach Salary: ȳ = 0.77, s_y = 0.44
   r = 0.61

a. Using the scatter plot, describe the relationship between revenues and coach salary for these 45 colleges.

b. Use the summary statistics above to calculate the regression line ŷ = b₀ + b₁x between Revenues and Coach Salary using Revenues as the independent variable.

c. Interpret the slope in the context of the problem. Be specific.

d. Interpret the intercept in the context of the problem. Be specific. Is the model valid all the way to x = 0?

e. Use the regression line to predict the coach salary for a college with revenues of $11 million.

f. In the data set, Michigan State has a recorded revenue of $11 million and a coach salary of $1.6 million. Calculate the residual for this college and explain its meaning.

14. Boomburbs ~ Many suburbs in California have grown into large cities, but are often not thought of as cities because they are in the shadow of a major city. Urban experts recognize suburbs having either a large population or having a substantial growth spurt as suburban cities. These suburban cities often have more homes than jobs and are sometimes called boomburbs. Fourteen suburban cities were sampled, and their age as a suburban city, the initial population of the suburban city when it was first recognized as a city, and the population size of the city in the year 2000 were recorded. To answer the question, “Can the size of the city when it was first recognized as a city predict the population of the city in 2000?” I ran a linear regression command in R using initial population as the explanatory variable and population in 2000 as the response variable. Summary output from R is shown below.

a. What is the reported standard error for the regression model?
   Final answer: ________________________

b. How much evidence do the data provide that there is a linear relationship between initial population and population in 2000? Conduct a hypothesis test using the information given in the R output.
   H₀: ___________________________ vs. Hₐ: ___________________________
   test statistic: ________________________ and p-value: ________________________

c. What percentage of the variation in the population of a city in 2000 can be explained by the linear relationship with the size of the city when it was first recognized as a city?
   Final answer: ________________________

Part 2 Older material (Questions from Midterm 1)

You can expect that approximately 50% of the points for the final exam will come from the material covered in HW01, HW02, and HW03. The learning objectives and example problems below are from the Midterm 1 Study Guide.

Part 1 Learning Objectives

1. Given a described research question and/or results, be able to identify the population of interest and the sample.
   a. Identify observational units and the variables.
   b. Classify variables as categorical (ordinal or nominal) or numerical.
2. Given a description of a research scenario, be able to
   a. Identify and evaluate the sampling methods used
      i. Simple random sampling
      ii. Cluster sampling
      iii. Stratified sampling
      iv. Systematic sampling
      v. Convenience sampling
3. Given the description of conducted research, be able to
   a. Classify the research as a designed experiment or an observational study.
   b. Identify explanatory and response variables.
4. For Designed Experiments, given a description of the research, be able to
   a. Identify the treatment and control groups.
   b. Identify and evaluate the use of randomization.
   c. Explain why randomization in the assignment of subjects to treatment and control groups is important.
   d. Identify the use/non-use of a placebo and explain its role in the experiment.
   e. Distinguish between blind and double-blind experiments and explain the reasons for each.
   f. Explain when and why randomized, controlled, double-blind experiments allow the cautious inference of causation.
5. For Observational Studies, given a description of the research, be able to
   a. Identify and evaluate the use of randomization.
   b. Identify and explain possible confounding factors.
   c. Explain why observational studies do not allow the inference of causation.

Part 2 Learning Objectives

1. Interpret data visualizations (bar chart, mosaic plot, dot plot, histogram, box plot)
2. Given a described research question and/or results, be able to identify and distinguish between a population, a parameter, a sample, and a statistic.
3. Calculate basic descriptive statistics for small data sets using statistical functions of a calculator (mean, median, variance, SD, quantiles, percentiles, range, IQR).
4. Use histograms to identify the shape of a data distribution (symmetric, right/left skew).
5. Estimate descriptive statistics from visualizations.

15. Protection from the flu ~ A study seeks to compare the effectiveness of two types of respiratory protection during influenza outbreaks. A total of 440 nurses who were caring for patients with the flu were randomly divided into two groups. The 240 nurses assigned to Group 1 received surgical masks to wear while caring for their flu patients, and the remaining 200 nurses received fitted N95 respirators to wear while caring for their flu patients. The researcher who evaluated the nurses over the study period (to assess whether the nurse got influenza) did not know which respiratory protection each nurse actually wore when treating their flu patients. The table below summarizes some of the study results. Use these results to answer the following questions.

                               Yes, got influenza   No, did not get influenza   Total
   Group 1 (Surgical Mask)             72                     168                240
   Group 2 (N95 Respirator)            38                     162                200
   Total                              110                     330                440

a. The study described above is an…. (circle one)
   observational study / experiment
   Provide a brief explanation of your choice below.

b. Which of the following terms applies to the variable ‘influenza status’? Select all that apply.
   Categorical   Numerical   Response   Explanatory   Confounding

c. Consider the mosaic plot to the right, which summarizes the study results. Are the events ‘wore N95 respirator’ and ‘get influenza’ independent? Clearly circle ‘yes’ or ‘no’ and provide a brief written support.
   Circle one: Yes / No
   Because…

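One common way to examine independence in a two-way table is to compare a conditional proportion with the corresponding overall proportion: if two events were independent in the sample, the proportion who got influenza would be the same within each group as it is overall. The sketch below computes those proportions from the table above; the dictionary layout and names are illustrative, and it describes the sample only, without any formal test.

```python
# Counts from the study results table
counts = {
    "surgical_mask": {"flu": 72, "no_flu": 168},
    "n95": {"flu": 38, "no_flu": 162},
}

total_nurses = sum(sum(group.values()) for group in counts.values())
total_flu = sum(group["flu"] for group in counts.values())

p_flu_overall = total_flu / total_nurses
# Conditional proportion of influenza within each protection group
p_flu_given_group = {
    name: group["flu"] / (group["flu"] + group["no_flu"])
    for name, group in counts.items()
}

print(f"P(flu) = {p_flu_overall:.2f}")
for name, proportion in p_flu_given_group.items():
    print(f"P(flu | {name}) = {proportion:.2f}")
```

Comparing the printed conditional proportions with the overall proportion gives the numerical support the question asks for alongside the mosaic plot.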
16. Aspirin and women’s health ~ The following is part of an article from Newsweek, April 24, 2006. This question concerns the Women's Health Study described in the second paragraph.

   Take an Aspirin and ...
   BY JULIE BURING, SC.D., AND NANCY FERRARI

   Aspirin is a wonder drug, plain and simple. At high doses, it quells inflammation; at medium doses, it provides effective pain relief; at low doses, it reduces the blood's ability to clot by inhibiting the action of tiny blood cells called platelets. It makes sense, then, that aspirin might help prevent clot-related cardiovascular events such as heart attack and stroke, even in healthy people. In 1988, the Physicians' Health Study showed exactly that. In healthy men, 325 mg of aspirin taken every other day for five years reduced the risk of a first heart attack by 44 percent. That was great news. For men.

   It wasn't until March 2005 that the Women's Health Study addressed aspirin's benefits for women. Healthy women (who were at least 45 years old at the start of the study) who participated in the study took either 100 mg of aspirin or a placebo every other day for 10 years. Surprisingly, the women taking aspirin experienced no reduction in heart-attack risk. However, aspirin takers were 17 percent less likely to have a stroke.

a. As described above, the Women’s Health Study is an (circle one):
   observational study / experiment
   Provide a brief explanation of your choice below.

b. Name one categorical and one numeric variable mentioned in the article.
   Categorical: ________________________________________
   Numeric: __________________________________________

c. Was the Women’s Health Study conducted blind? Circle YES or NO and explain how you know.

d. The article does not say how the women were assigned to the aspirin and placebo groups. What is the best way to do this, and why?

17. Two-mile walk ~ Here is the beginning of an article from the Yahoo! Health News website on January 7, 1998:

   Daily Two-Mile Walk Halves Death Risk

   NEW YORK (Reuters) Walking two miles or more per day can cut the overall risk of dying in half, according to a new study. It also reduces the risk of dying from cancer and appears to cut the risk of death due to cardiovascular diseases, US researchers report. Between 1980 and 1982, multicenter researchers in the Honolulu Heart Program studied 707 nonsmoking, retired men, aged 61 to 81 years, and collected mortality data on these men over the following 12 years. During the study, 208 of the men died. The study results show that while 43.1% of men who walked less than one mile per day died, only half this figure (21.5%) of the men who walked more than two miles per day died.

a. This article describes an (circle one)
   experiment / observational study

b. What is the research question?

c. Briefly identify each element of the study below:
   i. Population of interest
   ii. Sample
   iii. Explanatory variable
   iv. Response variable

d. Clearly explain why a person’s general health is a confounding factor in the relationship between walking and the risk of dying.

e. Based only on this study, should men who want to reduce their risk of dying walk 2 miles each day?
   i. Yes, definitely, because the men in the study who walked 2 miles each day lived longer.
   ii. Not necessarily, because this relationship was found using an observational study.
   iii. Definitely not, because no women were included in the study.
   iv. Probably, because this was a controlled experiment so we can infer a causal relationship.

18. Longleaf pine trees ~ A study of longleaf pine trees in a particular geographical area was conducted. The area was divided into a north tract and a south tract and tree diameter was measured. The graph below visualizes this data. Use this plot to make decisions about the relationship between Quantity A and Quantity B for each row in the table. In each case, select the most appropriate statement from the following choices (you may use each choice more than once or not at all):
   a. Quantity A is greater
   b. Quantity B is greater
   c. The quantities are the same
   d. The relationship cannot be determined without more information

   Statement (a, b, c, or d) | Quantity A | Quantity B
   ______ | The median tree diameter for trees in the north tract | The median tree diameter for trees in the south tract
   ______ | The mean tree diameter for trees in the north tract | The median tree diameter for trees in the north tract
   ______ | The standard deviation of tree diameter for trees in the north tract | The standard deviation of tree diameter for trees in the south tract
   ______ | The 75th percentile tree diameter for trees in the north tract | The median of tree diameter for trees in the south tract
   ______ | The standard deviation of tree diameter for all tracts | A standard deviation of 5 cm

The plot at right shows the histogram of tree diameter for the two geographical tracts. Which of Histogram 1 and Histogram 2 is the histogram for the north tract?
   a. Histogram 1
   b. Histogram 2

19. Sampling methods ~ Read each scenario below. Identify the sampling method used in each scenario. Choose from
   A. Simple random sample (SRS)
   B. Stratified sample
   C. Cluster sample
   D. Systematic sample
   E. Convenience sample

   __________ A marketing expert at MTV is planning a survey in which 500 people will be randomly selected from each age group of 10-19, 20-29, 30-39, 40-49, and 50-59.
   __________ An inspector for the US Food and Drug Administration obtains all vitamin pills produced in an hour at the Health Supply Company. She thoroughly mixes them, then scoops a sample of 10 pills that are to be tested for the exact amount of vitamin content.
   __________ Motivated by a student who died from binge drinking, the administration at a university conducts a study of student drinking by randomly selecting 10 different classes being taught this semester and interviewing all of the students in each of those classes.
   __________ A finance professor surveyed all of his students to obtain sample data consisting of the number of credit cards students possess.

20. Parameter or Statistic? ~ For each of the following, determine whether the quantity described is a parameter or statistic. Circle one.
   a. The average wait time of 15 patients in a hospital’s emergency room on a busy Friday night.   Parameter / Statistic
   b. The proportion of all US adults who are employed full-time.   Parameter / Statistic
   c. The average price of all candy bars for sale at Kroger grocery stores.   Parameter / Statistic
   d. The proportion of 2000 randomly selected Ingham County citizens who voted in the last election.   Parameter / Statistic

21. Antibiotics in infancy and obesity in adults ~ A recent headline claiming “Antibiotics in infancy may cause obesity in adults” summarizes the results of two separate studies (Studies A and B) investigating the relationship between exposure to antibiotics early in life and the risk of obesity later on. Researchers suspect that antibiotics might have an effect on risk of obesity because they cause changes in the microbiome of the gastrointestinal tract.

In Study A, researchers randomly split 88 infant lab mice into two groups. The first group received daily dosages of antibiotics until they reached maturity (38 days after birth) and the second group was held as a control. Of the mice that had received antibiotics, 25% were obese when they reached adulthood 38 days later. Only 18% of the mice in the control group were obese.

Study B randomly sampled records of adults ages 18 to 50 from a local hospital. Each record was reviewed for ‘completeness,’ i.e., that the file included records dating all the way back to the individual’s birth. Incomplete files were discarded. Analysis of the remaining data found that individuals who had been given antibiotics before they were a year old (for example, for an ear infection) were more likely to be obese as adults.

a. Study A is an (circle one): observational study / experiment
b. Study B is an (circle one): observational study / experiment
c. Consider just the first study (Study A), conducted on mice. Match each statistical term in the left column to the appropriate aspect of the study in the right column (note: you won’t use all available phrases from the right column).

Left column (terms):
i. __________ Population of interest
ii. __________ Response variable
iii. __________ Cases
iv. __________ Explanatory variable

Right column (phrases):
a. All lab mice given antibiotics.
b. All lab mice.
c. Whether an infant mouse is given antibiotics.
d. Whether an adult mouse is classified as obese.
e. The 44 mice given antibiotics.
f. The 88 mice in the study.

d. Now consider Study B (on hospital records) and recall that incomplete files were discarded before the data was analyzed. This action would most likely lead the results to have (circle one): selection bias / non-response bias / response bias
because (circle one):
i. not every person with a hospital record has an equal chance to be in the sample
ii. researchers are more likely to choose obese patients if they took antibiotics
iii. most patients will not consent to have their records examined
e. As stated earlier, news coverage of these studies claimed “antibiotics in infancy may cause obesity in adults.”
i. Is this an appropriate claim to make for mice? (circle one) YES / NO
Justify your answer in no more than two complete sentences.
ii. Is the claim appropriate for humans? (circle one) YES / NO
Justify your answer in no more than two complete sentences.

22. How fast is the winner? ~ The Kentucky Derby is a 1.25 mile horse race held each year at the Churchill Downs race track in Louisville, Kentucky. Race enthusiasts collected data on the race for the years 1896 through 2017 and shared it on Wikipedia. For each of the listed variables, choose only one answer to indicate whether it is
A. Numerical
B. Ordinal
C. Categorical

______________ the average speed of the winning horse in feet per second
______________ the trainer of the winning horse
______________ the condition of the track (fast, good, or slow)
______________ the jockey of the winning horse
______________ the number of horses who raced
23. Sampling methods ~ Read each scenario below. Identify the type of sample used in each scenario. Choose from:
A. Simple random sample (SRS)
B. Stratified sample
C. Cluster sample
D. Systematic sample
E. Convenience sample

______________ A third grade teacher has a bag of craft sticks. Each stick has the name of one student in the class written on it. To select three students to help with a school assembly, the teacher reaches into the bag without looking and chooses 3 sticks.
______________ A researcher studying short-term memory believes that age plays a role. From a large group of people who have agreed to be part of a research study, the researcher randomly selects 40 people over the age of 50, 40 people between the ages of 30 and 50, and 40 people who are between the ages of 15 and 30.
______________ Marcus and Caroline are collecting data about the price of soda and candy bars at service stations. They collect the information from the service stations that are close enough to their house to ride their bikes to.
______________ MSU housing officials want to collect in-depth information about student life on campus. They create a list of every floor of every dorm on campus, and then randomly select 4 floors from the list and survey every resident on those 4 floors.
______________ To select a sample of 12 Kentucky Derby winning horses, we arrange the winners by year, select a random number between 1 and 10, and then include that horse and every 10th horse on the list.
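The selection procedure in the last scenario (random start, then every 10th entry) can be sketched in Python as below. This is an illustrative sketch only: the function name `every_kth_sample` is hypothetical, and the list of race years 1896 through 2017 stands in for the list of winners arranged by year.

```python
import random

def every_kth_sample(items, k, rng=random):
    """Pick a random start among the first k items, then take every k-th item."""
    start = rng.randrange(k)  # 0-based position of the first selected item
    return items[start::k]

# Race years 1896 through 2017, standing in for the winners arranged by year.
years = list(range(1896, 2018))

sample = every_kth_sample(years, 10, random.Random(1))
print(len(years), len(sample), sample[:3])
```

Because the list has 122 entries, this procedure returns 12 or 13 entries depending on the random start, so obtaining exactly 12 would require a rule for dropping any extra entry.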
24. Track conditions ~ The Kentucky Derby is a 1.25 mile horse race held each year at the Churchill Downs race track in Louisville, Kentucky. Race enthusiasts collected data on the race for the years 1896 through 2017 and shared it on Wikipedia. Two of the variables, the average speed of the winning horse and the condition of the track, are summarized using the graphics below. Use the plot to make quantitative comparisons. In each case, select the most appropriate statement from the following choices (you may use each choice more than once or not at all):
a. Quantity A is greater
b. Quantity B is greater
c. The quantities are the same
d. The relationship cannot be determined without more information

Statement (a, b, c, or d) | Quantity A | Quantity B
______ | The range of average speed of the winning horse under “fast” track conditions | The range of average speed of the winning horse under “good” track conditions
______ | The minimum average speed of the winning horse under “fast” track conditions | The minimum average speed of the winning horse under “good” track conditions
______ | The proportion of average speed of the winning horse under “good” conditions that is higher than 53 ft/sec. | The proportion of average speed of the winning horse under “slow” conditions that is lower than 53 ft/sec.
______ | The mean average speed of the winning horse for all track conditions. | The median average speed of the winning horse for all track conditions.
______ | The standard deviation of the average speed of the winning horse for all track conditions. | 5
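For reference, the summary quantities compared in the table (minimum, range, mean, median, standard deviation, and a proportion above a cutoff) can be computed as in the Python sketch below. The speeds are made-up placeholder values, not the Kentucky Derby data, and the helper name `summarize` is hypothetical; the sketch shows only how each quantity is defined, not the answers to the rows above.

```python
import statistics

# Made-up winning speeds (ft/sec) by track condition; NOT the real race data.
speeds = {
    "fast": [52.1, 53.4, 54.0, 54.8, 55.2],
    "good": [51.0, 52.6, 53.3, 53.9],
    "slow": [49.8, 51.2, 52.0, 53.5],
}

def summarize(values, cutoff=53.0):
    """Return the summary quantities for one group of speeds."""
    return {
        "min": min(values),
        "range": max(values) - min(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "prop_above": sum(v > cutoff for v in values) / len(values),
    }

for condition, values in speeds.items():
    print(condition, summarize(values))

# Pool every group to summarize "all track conditions" together.
all_speeds = [v for values in speeds.values() for v in values]
print("all", summarize(all_speeds))
```

On the exam these quantities are read from the plot rather than computed, so the sketch is mainly a reminder of what each row is comparing.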