Hormone replacement therapy (HRT) is thought to increase the risk of breast cancer. The accompanying data on x = percent of women using HRT and y = breast cancer incidence (cases per 100,000 women) for a region in Germany for 5 years appeared in the paper “Decline in Breast Cancer Incidence after Decrease in Utilisation of Hormone Replacement Therapy” (Epidemiology [2008]: 427–430). The authors of the paper used a simple linear regression model to describe the relationship between HRT use and breast cancer incidence.
HRT use (%)    Breast cancer incidence (cases per 100,000 women)
46.30          103.30
40.60          105.00
39.50          100.00
36.60           93.80
30.00           83.50
a. What is the equation of the estimated regression line?
b. What is the estimated average change in breast cancer incidence associated with a 1 percentage point increase in HRT use?
c. What would you predict the breast cancer incidence to be in a year when HRT use was 40%?
d. Should you use this regression model to predict breast cancer incidence for a year when HRT use was 20%? Explain.
e. Calculate and interpret the value of r².
f. Calculate and interpret the value of s_e.
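Parts a through f can be checked numerically. The following is a minimal sketch, not the textbook's own solution: it assumes the five data pairs quoted for this exercise (HRT use 46.30, 40.60, 39.50, 36.60, 30.00 and incidence 103.30, 105.00, 100.00, 93.80, 83.50) and uses the usual least-squares formulas. The variable names (`hrt_use`, `incidence`, `predict`, and so on) are illustrative choices, not from the source.

```python
import math

# Assumed data for this exercise: x = HRT use (%), y = cases per 100,000 women.
hrt_use = [46.30, 40.60, 39.50, 36.60, 30.00]
incidence = [103.30, 105.00, 100.00, 93.80, 83.50]

n = len(hrt_use)
x_bar = sum(hrt_use) / n
y_bar = sum(incidence) / n

# Sums of squares and cross-products about the means.
s_xx = sum((x - x_bar) ** 2 for x in hrt_use)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(hrt_use, incidence))
s_yy = sum((y - y_bar) ** 2 for y in incidence)

# (a), (b): least-squares slope and intercept; the slope is the estimated
# average change in incidence per 1 percentage point increase in HRT use.
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar


def predict(x):
    """Predicted incidence from the estimated regression line."""
    return intercept + slope * x


# (c): prediction at 40% HRT use.
pred_40 = predict(40.0)

# (d): 20% is only safe to predict for if it lies inside the observed x range.
x_20_in_range = min(hrt_use) <= 20.0 <= max(hrt_use)

# (e), (f): r^2 = 1 - SSResid/SSTo and s_e = sqrt(SSResid / (n - 2)).
ss_resid = sum((y - predict(x)) ** 2 for x, y in zip(hrt_use, incidence))
r_squared = 1 - ss_resid / s_yy
s_e = math.sqrt(ss_resid / (n - 2))

print(f"y-hat = {intercept:.4f} + {slope:.4f} x")
print(f"prediction at x = 40: {pred_40:.2f}")
print(f"x = 20 inside observed range: {x_20_in_range}")
print(f"r^2 = {r_squared:.4f}, s_e = {s_e:.4f}")
```

Under these assumptions the line comes out at roughly ŷ = 45.57 + 1.335x, so the prediction at 40% is about 99 cases per 100,000 women. About 83% of the variation in incidence is explained by the linear relationship (r² ≈ 0.83), and a typical deviation from the line is about 4.15 cases per 100,000 women. The range check shows that 20% lies below the smallest observed HRT use (30%), so using the line there would be extrapolation.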
Chapter 13 Solutions
Introduction to Statistics and Data Analysis
- What does the y-intercept on the graph of a logistic equation correspond to for a population modeled by that equation?
- Find the equation of the regression line for the following data set. x: 1, 2, 3; y: 0, 3, 4
- Olympic Pole Vault. The graph in Figure 7 indicates that in recent years the winning Olympic men’s pole vault height has fallen below the value predicted by the regression line in Example 2. This might have occurred because when the pole vault was a new event there was much room for improvement in vaulters’ performances, whereas now even the best training can produce only incremental advances. Let’s see whether concentrating on more recent results gives a better predictor of future records. (a) Use the data in Table 2 (page 176) to complete the table of winning pole vault heights shown in the margin. (Note that we are using x = 0 to correspond to the year 1972, where this restricted data set begins.) (b) Find the regression line for the data in part (a). (c) Plot the data and the regression line on the same axes. Does the regression line seem to provide a good model for the data? (d) What does the regression line predict as the winning pole vault height for the 2012 Olympics? Compare this predicted value to the actual 2012 winning height of 5.97 m, as described on page 177. Has this new regression line provided a better prediction than the line in Example 2?
- On May 25, 2010, LiveScience published an article titled "Simple Memory Test Predicts Intelligence." The article summarized a study that found that the score on a test of working memory capacity was correlated with a number of different measures of intelligence. One measure was cognitive functioning score. Suppose that one such linear regression for working memory capacity and cognitive functioning score from a sample of 12 people is ŷ = 6.2 + 26.7x, where y is the cognitive functioning score and x is the working memory score. (a) In a hypothesis test for the slope of the linear regression, which of the following would be the correct alternative hypothesis? Ha: β = 0; Ha: β > 26.7; Ha: β ≠ 0; Ha: β < 26.7. (b) Given the standard error of the slope, 5.2, calculate the test statistic. (Round your answer to four decimal places.) (c) Calculate the P-value and determine which conclusion is correct for the hypothesis test based on α = 0.05. (Use technology.) We…
- Acrylamide is a chemical that is sometimes found in cooked starchy foods and which is thought to increase the risk of certain kinds of cancer. The paper "A Statistical Regression Model for the Estimation of Acrylamide Concentrations in French Fries for Excess Lifetime Cancer Risk Assessment" describes a study to investigate the effect of frying time (in seconds) and acrylamide concentration (in micrograms per kilogram) in french fries. The data in the accompanying table are approximate values read from a graph that appeared in the paper. Frying time: 150, 240, 240, 270, 300, 300. Acrylamide concentration: 150, 115, 190, 180, 145, 275. (a) Find the equation of the least-squares line for predicting acrylamide concentration using frying time. (Round your answers to four decimal places.) ŷ = (b) Does the equation of the least-squares line support the conclusion that longer frying times tend to be paired with higher acrylamide concentrations? Explain. No, the least squares regression line equation…
- Acrylamide is a chemical that is sometimes found in cooked starchy foods and which is thought to increase the risk of certain kinds of cancer. The paper "A Statistical Regression Model for the Estimation of Acrylamide Concentrations in French Fries for Excess Lifetime Cancer Risk Assessment" describes a study to investigate the effect of frying time (in seconds) and acrylamide concentration (in micrograms per kilogram) in french fries. The data in the accompanying table are approximate values read from a graph that appeared in the paper. Frying time: 150, 240, 240, 270, 300, 300. Acrylamide concentration: 150, 125, 195, 185, 135, 275. (a) Find the equation of the least-squares line for predicting acrylamide concentration using frying time. (Round your answers to four decimal places.) ŷ = (b) Does the equation of the least-squares line support the conclusion that longer frying times tend to be paired with higher acrylamide concentrations? Explain. No, the least squares regression line…
- A study was conducted on 64 female college athletes. The researcher collected data on a number of variables including percent body fat, total body weight, height, and age of athlete. The researcher wondered if % body fat (%BF), height (HGT), and/or age are significant predictors of total body weight. All conditions have been checked and are met and no transformations were needed. The technology output from the multiple regression analysis is given below. What percent of the variation in total body weight is being explained by the regression model with these three explanatory variables?
- A 10-year study conducted by the American Heart Association provided data on how age, blood pressure, and smoking relate to the risk of strokes (Dataset "Stroke"). Risk is interpreted as the probability (times 100) that a person will have a stroke over the next 10-year period. For the smoker variable, 1 indicates a smoker and 0 indicates a nonsmoker. a. Develop an estimated regression equation that can be used to predict the risk of stroke given the age and blood-pressure level. b. Consider adding two independent variables to the model developed in part (a), one for the interaction between age and blood-pressure level and the other for whether the person is a smoker. Develop an estimated regression equation using these four independent variables. c. At a 0.05 level of significance, test to see whether the addition of the interaction term and the smoker variable contributes significantly to the estimated regression equation developed in part (a). d. Refer to the model developed in part…
- Hormone replacement therapy (HRT) is thought to increase the risk of breast cancer. The accompanying data on x = percent of women using HRT and y = breast cancer incidence (cases per 100,000 women) for a region in Germany for 5 years appeared in the paper "Decline in Breast Cancer Incidence after Decrease in Utilization of Hormone Replacement Therapy." The authors of the paper used a simple linear regression model to describe the relationship between HRT use and breast cancer incidence. HRT use: 46.30, 40.60, 39.50, 36.60, 30.00. Breast cancer incidence: 103.30, 105.00, 100.00, 93.80, 83.50. (a) What is the equation of the estimated regression line? (Round your numerical values to four decimal places.) ŷ = (b) What is the estimated average change in breast cancer incidence (in cases per 100,000 women) associated with a 1 percentage point increase in HRT use? (Round your answer to four decimal places.) cases per 100,000 women (c) What breast cancer…
- In an attempt to develop a model of wine quality as judged by wine experts, data on alcohol content and wine quality was collected from variants of a particular wine. From a sample of 17 wines, a model was created using the percentages of alcohol to predict wine quality. From the results of that regression, b1 = 0.4386 and S_b1 = 0.1141. a. At the 0.05 level of significance, is there evidence of a linear relationship between the percentage of alcohol and wine quality? b. Construct a 95% confidence interval estimate of the population slope, β1. The 95% confidence interval is __ ≤ β1 ≤ __ (Round to three decimal places as needed.)
- The administration of a midwestern university commissioned a salary equity study to help establish benchmarks for faculty salaries. The administration utilized the following regression model for annual salary, y: E(y) = β0 + β1x, where x = 0 if lecturer, 1 if assistant professor, 2 if associate professor, and 3 if full professor. The administration wanted to use the model to compare the mean salaries of professors in the different ranks. a) Explain the flaw in the model. b) Propose an alternative model that will achieve the administration’s objective. c) If the global F-test for the model you proposed in part b) is conducted, what would be the value of the numerator degrees of freedom?
- A company randomly samples 48 months of monthly output and monthly total cost data. The sampled data will be used to develop a total cost curve for the company. The company believes that its monthly total cost depends, to a large extent, on its monthly output, and hopes that a simple linear regression model will be useful in analyzing how total costs vary as monthly output varies. The company proposes the following model: Total Cost = Fixed Cost + Variable Cost per Unit × Monthly Output. Recall that fixed costs do not vary with the level of monthly output, while the variable cost per unit describes the change in total costs when monthly output changes by one unit.
  Regression Statistics
  Standard Error    64.252
  Observations      48
  ANOVA
                    df    SS         MS
  Regression         1    3097160    3097160
  Residual          46     189904       4128
  Total             47    3287064
                    Coefficients    Standard Error    t Stat
  Intercept         38.25           24.28              1.6
  Monthly Output    19.69            0.72             27.4
  A statistician has been asked by the company to conduct a statistical test to determine if…