Credit card spending. An analysis of spending by a sample of credit card bank cardholders shows that spending by cardholders in January (Jan) is related to their spending in December (Dec):
The assumptions and conditions of the linear regression seemed to be satisfied and an analyst was about to predict January spending using the model
Another analyst worried that different types of cardholders might behave differently. She examined the spending patterns of the cardholders and placed them into five market Segments. When she plotted the data using different colors and symbols for the five different segments, she found the following:
Look at this plot carefully and discuss why she might be worried about the predictions from the model.
To explain: Why the analyst might be worried about making predictions from the model.
Explanation of Solution
Given info:
A scatterplot of spending for a sample of credit card bank cardholders in January and in December is given. The corresponding regression model to predict January spending from December spending is
A second scatterplot of January spending against December spending for the same sample is given, with the cardholders grouped into five market segments.
Justification:
The conditions for fitting a linear regression model to the data are as follows:
- Straight enough condition: The relationship between y and x is straight enough to proceed with a linear regression model.
- Outlier condition: There are no outliers that influence the fit of the least squares line.
- Thickness condition: The spread of the data around the generally straight relationship seems to be consistent for all values of x.
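The thickness condition in the list above can be checked roughly by comparing the spread of the residuals at low and high values of x. The sketch below is only an illustration on made-up data, not the cardholder data from the exercise; the helper names `fit_line` and `residual_spread_by_half` are hypothetical.

```python
from statistics import pstdev


def fit_line(points):
    """Least squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residual_spread_by_half(points):
    """Residual standard deviation for the lower and upper halves of x."""
    b0, b1 = fit_line(points)
    ordered = sorted(points)
    residuals = [y - (b0 + b1 * x) for x, y in ordered]
    half = len(residuals) // 2
    return pstdev(residuals[:half]), pstdev(residuals[half:])


# Made-up data: scatter of constant size around the line y = 2x.
even = [(x, 2 * x + (-1) ** x * 1.0) for x in range(1, 21)]
# Made-up data: scatter that widens as x grows (violates the condition).
fan = [(x, 2 * x + (-1) ** x * 0.1 * x) for x in range(1, 21)]

even_low, even_high = residual_spread_by_half(even)
fan_low, fan_high = residual_spread_by_half(fan)
print("even spread:", even_low, even_high)
print("fanning spread:", fan_low, fan_high)
```

A clearly larger spread in one half than in the other suggests the condition is not met. Splitting at the median of x is an arbitrary choice made here for simplicity; a residual plot is the usual way to judge this.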
The different segments are not scattered at random throughout the scatterplot.
Thus, the spread of the data is not consistent across all values of December spending, and each segment may have its own relationship between December and January spending, which could reduce the accuracy of the model's predictions.
Taken together, the relationship between January and December spending of the credit card bank cardholders does not appear straight enough for a single linear regression model to be relied on for prediction.
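To illustrate the concern, the following sketch fits one pooled least squares line and then a separate line for each segment. The data are invented for illustration: two hypothetical segments with assumed intercepts and slopes, not the five segments or the actual spending values from the exercise.

```python
def fit_line(points):
    """Least squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical segments: assumed (intercept, slope) of Jan on Dec spending.
true_lines = {"segment_1": (200.0, 0.3), "segment_2": (-500.0, 1.1)}
# Hypothetical December spending values for each segment.
dec_values = {
    "segment_1": [500, 1000, 1500, 2000],
    "segment_2": [3000, 4000, 5000, 6000],
}

# Build (Dec, Jan) pairs that lie exactly on each segment's own line.
segments = {
    name: [(dec, b0 + b1 * dec) for dec in dec_values[name]]
    for name, (b0, b1) in true_lines.items()
}

# One line for everyone versus one line per segment.
pooled_points = [p for pts in segments.values() for p in pts]
pooled_intercept, pooled_slope = fit_line(pooled_points)
segment_fits = {name: fit_line(pts) for name, pts in segments.items()}

print("pooled slope:", pooled_slope)
for name, (b0, b1) in segment_fits.items():
    print(name, "intercept:", b0, "slope:", b1)
```

In this made-up case the pooled slope matches neither segment's slope, so a prediction from the pooled line could be systematically off for cardholders in either segment.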
Chapter 8 Solutions
Intro Stats, Books a la Carte Edition (5th Edition)