Stat1250_SGTA10

School: Macquarie University
Course: 1250
Subject: Statistics
Date: Apr 3, 2024

© Copyright Macquarie University

Stat1250 SGTA 10: Linear Regression

You are expected to read this material and think about the problems before attempting to solve them.

In this SGTA we will:
  • Discuss scatter plots showing the relationship between two numerical variables.
  • Use regression analysis to investigate the relationship between two numerical variables.

The principle of regression analysis was described and explained by Francis Galton in the 1870s and 1880s. Initially he investigated geniuses in various fields and noted that their children, while typically gifted, were almost always closer to the average than their exceptional parents. He later described the same effect numerically by comparing fathers' heights to their sons' heights. Again, a son's height was typically closer to the average than the height of his father.

Galton hypothesised that the taller the father, the taller the son would be. He plotted heights of fathers and heights of their sons for a number of father-son pairs, then attempted to fit a straight line through the data. Galton used the equation of the line to predict a son's adult height based on the height of his father.

Figure 1: Sir Francis Galton; Wikipedia (Public domain image)

Scatter plots

1. Draw a curve that might depict the relation between the following variables. (Note: use the first variable given as the independent variable.)
   a. Maximum daily temperature and soft drink sales of a retailer.
   b. Odometer reading and sale price of a used car.
   c. Annual income and credit card balance of bank clients.

2. Consider the four scatter plots below. For each plot, with your group, discuss any relation that you can see between the variables.

Testing the slope of a regression line

As part of a study on poverty, an economist was interested in knowing if there was a relationship between household income and expenditure for households with an income between $40,000 and $80,000 per annum. She obtained a random sample of 43 households with annual incomes within this range and, for each household, recorded the annual income ($'000) and the amount spent on food in one week.

Reference: Selvanathan et al., Business Statistics Abridged, 6th Edition, Cengage Learning, 2014

Research Question: Is there a relationship between household income and expenditure?

[Figures not reproduced: scatter plot of Weekly Expenditure Food against Annual Income ($'000); plot of Residuals against Annual Income ($'000); histogram of Residuals.]

SUMMARY OUTPUT

Regression Statistics
R Square        0.165299635
Observations    43

                        Coefficients    Standard Error    t Stat      P value
Intercept               161.27613       37.54015762       4.296096    0.000104
Annual Income($'000)    1.817687813     0.637906051       *           *

Excel code                          Result
=T.DIST.2T(4.2961,41)               0.0001
=T.DIST.2T(1.8177,41)               0.0764
=T.DIST.2T(2.8495,41)               0.0068
=2*(1-NORM.DIST(1.96,0,1,TRUE))     0.0500
=T.DIST.2T(0.6379,41)               0.5271
=T.INV(0.975,41)                    2.0195
=NORM.INV(0.975,0,1)                1.9600
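
The starred t Stat and P value entries for the slope follow from the other numbers in the summary output: the test statistic is the coefficient divided by its standard error, on n − 2 = 41 degrees of freedom. The sketch below is one way to check this in Python rather than Excel. The helper names `t_pdf` and `two_sided_p` are illustrative, and Simpson's rule is used here as a stand-in for Excel's T.DIST.2T, so the result should agree with the matching row of the Excel table only up to rounding.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, n=2000):
    """Two-sided p-value: 1 minus the area between -|t| and |t|.

    The area on [0, |t|] is approximated with Simpson's rule (n must be even)
    and doubled, using the symmetry of the t distribution.
    """
    b = abs(t)
    h = b / n
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half_area = total * h / 3
    return 1 - 2 * half_area

# Values copied from the summary output.
slope = 1.817687813
se_slope = 0.637906051
n_obs = 43

df = n_obs - 2                 # simple linear regression: n - 2
t_stat = slope / se_slope      # test statistic for H0: slope = 0
p_value = two_sided_p(t_stat, df)

print(f"t = {t_stat:.4f}, df = {df}, two-sided p = {p_value:.4f}")
```

Comparing the printed t value with the arguments in the Excel table identifies which T.DIST.2T row belongs to the slope test.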

Testing the slope of a line

1. Use the information provided to answer the research question.

   Hypotheses: H0: β1 = 0 versus H1: β1 ≠ 0
   Assumptions: The scatter plot suggests a positive linear relationship. The histogram suggests that the residuals are normally distributed.

2. Interpret the equation of the regression line.

3. Calculate a 95% confidence interval for β1, the population slope.
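
For questions 2 and 3, a minimal sketch of the arithmetic, assuming the usual interval form: estimate ± critical value × standard error. The coefficients and standard error are copied from the summary output, and the critical value 2.0195 is the worksheet's =T.INV(0.975,41) result. The function name `predicted_food_spend` and the income of 60 ($'000) are hypothetical choices for illustration only.

```python
# Values copied from the worksheet's summary output and Excel table.
intercept = 161.27613
slope = 1.817687813
se_slope = 0.637906051
t_crit = 2.0195  # =T.INV(0.975,41): 97.5th percentile of t with 41 df

# 95% confidence interval for the population slope.
margin = t_crit * se_slope
lower, upper = slope - margin, slope + margin
print(f"95% CI for the slope: ({lower:.3f}, {upper:.3f})")

def predicted_food_spend(income_thousands):
    """Fitted weekly food expenditure for a given annual income in $'000.

    Only meaningful for incomes inside the sampled range (40 to 80).
    """
    return intercept + slope * income_thousands

# Hypothetical household with an annual income of $60,000.
print(f"Predicted weekly food expenditure at 60: {predicted_food_spend(60):.2f}")
```

The slope is the estimated change in weekly food expenditure for each additional $1,000 of annual income. If the interval excludes zero, that agrees with rejecting H0 at the 5% level.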