MAT 240 Project One

School

Southern New Hampshire University

Course

240

Subject

Economics

Date

Apr 3, 2024

Median Housing Price Prediction Model for D. M. Pan National Real Estate Company

Report: Housing Price Prediction Model for D. M. Pan National Real Estate Company
Tyler Schmidt
Southern New Hampshire University
Introduction

How does square footage affect the listing prices of homes in 2019? Linear regression is most appropriate when predicting a continuous dependent variable from an independent variable. For linear regression to be suitable, the scatterplot should show a linear relationship between the two variables. Outliers are not unheard of but should be few in number. The predictor variable is the independent variable, which is used to explain the variability of the response (dependent) variable. The variables were selected using prior knowledge of the two variables to determine which is independent and which is dependent.

Data Collection

I obtained a random sample of 50 houses by assigning each house a random number with the =RAND() function in Excel. Once each house had a random number, I sorted the houses from smallest to largest by the random-number column and selected the first 50. The predictor variable is square feet, and the response variable is listing price.
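The sampling procedure described above (random number per row, sort, keep the first 50) can be sketched in Python. This is only an illustration: the function name `sample_houses`, the seed, and the stand-in population of 1,000 row identifiers are assumptions, not part of the original Excel workbook.

```python
import random

def sample_houses(houses, k=50, seed=None):
    """Mimic the Excel approach: tag each row with a uniform random number,
    sort by that number, and keep the first k rows."""
    rng = random.Random(seed)
    tagged = [(rng.random(), house) for house in houses]
    tagged.sort(key=lambda pair: pair[0])  # smallest random number first
    return [house for _, house in tagged[:k]]

# Hypothetical stand-in for the full data set: 1,000 row identifiers.
population = list(range(1000))
sample = sample_houses(population, k=50, seed=240)
print(len(sample), len(set(sample)))
```

Sorting by independent random keys and taking the first k rows gives a simple random sample without replacement, which is also what `random.sample(population, 50)` would produce directly.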
[Figure: Scatterplot of listing price vs. square feet. Horizontal axis: Square Feet of Homes. Vertical axis: Listing Price ($). Trend line: f(x) = 106.09x + 102145.7]
Data Analysis
Summary statistics for the sample:

            Listing Price    Square Feet
Mean        $332,504         2,171
Median      $329,300         1,890
Std Dev     $133,240         1,089

The scatterplot shows a positive linear pattern with few outliers. Most of the data lies below 3,000 square feet and $500,000. The only potential outliers lie above both of those values, but they are extremes that still follow the trend line.

The national summary statistics are very similar to those of my random sample of 50 houses. The mean, median, and standard deviation of both listing price and square feet are very close in my sample and the national population.

Develop Regression Model

A regression model is appropriate for this data because there are very few outliers and the independent variable accounts for most of the variation in the dependent variable.
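The mean, median, and standard deviation reported in the summary table can be computed with Python's standard `statistics` module. The sketch below uses a few made-up square-footage values rather than the report's data, and it assumes the sample standard deviation (the equivalent of Excel's STDEV.S); the report does not say which formula was used.

```python
import statistics

def summarize(values):
    """Return the three summary statistics used in the report's table."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # Assumption: sample standard deviation (n - 1 in the denominator).
        "std_dev": statistics.stdev(values),
    }

# Hypothetical square-footage values, not taken from the report's sample.
square_feet = [1200, 1500, 1890, 2400, 3100]
stats = summarize(square_feet)
print(stats)
```

The same function could be applied to the listing-price column to fill in the other half of the table.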
The direction of the scatterplot indicates that the relationship between the two variables is positive. The strength of the relationship is moderately strong to strong, since most of the points cluster around the line of best fit. Lastly, the form of the plot is linear.

The only possible outlier in this data is the point in the far upper-right corner, which is very far from every other data point. However, this point is consistent with the regression model and lies almost directly on the line of best fit. It is simply an extreme of the data represented. I would keep this point because it still reflects the relationship between listing price and square feet.

r = 0.86707233

The correlation coefficient supports what was observed in the scatterplot: an r of about 0.87 is high, meaning the positive linear correlation between listing price (y) and square feet (x) is strong.

Determine the Line of Best Fit

y = 106.09x + 102146
y = listing price ($)
x = square feet

The slope of 106.09 means that the predicted listing price increases by about $106.09 for each additional square foot. The intercept of 102146 is the predicted listing price of a home with 0 square feet. The intercept has no practical meaning because no home has 0 square feet.

R-squared = 0.751814426

R-squared is the proportion of the variation in listing price that is explained by square footage.
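As a rough illustration of where the slope, intercept, r, and R-squared come from, the sketch below fits a least-squares line by hand. The function `fit_line` and the five data points are hypothetical; the report's own values were produced in Excel from the 50-house sample, which is not reproduced here. The last lines check that the reported r and R-squared agree with each other, since R-squared equals r squared in simple linear regression.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept, plus Pearson's r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

# Hypothetical (square feet, listing price) pairs, not the report's sample.
xs = [1000, 1500, 2000, 2500, 3000]
ys = [210000, 255000, 320000, 360000, 425000]
slope, intercept, r = fit_line(xs, ys)
r_squared = r ** 2  # share of variation in y explained by x

# Consistency check on the values reported above.
reported_r = 0.86707233
reported_r_squared = 0.751814426
print(abs(reported_r ** 2 - reported_r_squared) < 1e-6)
```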
The linear regression model fits well: roughly 75% of the variation in listing price is explained by square footage. Using the regression equation, a home with 1,500 square feet has a predicted listing price of 106.09(1,500) + 102,146 = $261,281.

Conclusions

The results and findings for this data set were mostly expected. The only thing I expected to see more of among the 50 random homes was outliers. Most of the data points remained consistent with the line of best fit on the scatterplot. A sample larger than 50 could yield different results and might be more accurate. I believe I got quite lucky with the sample, given how similar it was to the national statistics. Lastly, I am curious how far this model could be stretched and remain accurate with extreme data, such as much higher listing prices or square footage.
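The 1,500-square-foot prediction can be reproduced with a short Python sketch. The coefficients are the ones reported in this report; the function name `predict_price` is only illustrative.

```python
SLOPE = 106.09      # dollars per additional square foot (from the report)
INTERCEPT = 102146  # dollars (from the report)

def predict_price(square_feet):
    """Predicted listing price from the fitted line y = 106.09x + 102146."""
    return SLOPE * square_feet + INTERCEPT

print(round(predict_price(1500)))  # prints 261281
```

Predictions for square footage well outside the range of the sampled homes are extrapolations, so they should be treated with more caution than predictions inside that range.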