MAT 240 Project One

School

Southern New Hampshire University

Course

MAT240

Subject

Economics

Date

Feb 20, 2024

Type

docx

Pages

9

Uploaded by MinisterSwanMaster994

Median Housing Price Prediction Model for D. M. Pan National Real Estate Company

Report: Housing Price Prediction Model for D. M. Pan National Real Estate Company
Dianna Sheely
Southern New Hampshire University
Introduction

As requested by D.M. Pan National Real Estate Company, this project develops a model to predict median housing prices for homes sold in 2019. The report is intended to help the company's associates better understand the relationship between square footage and listing price, and to give them a tool for predicting listing prices more accurately from square footage. Because we want to measure the strength of the relationship between square footage and listing price, simple linear regression is an appropriate method for this analysis.

The report uses charts and graphs, including a scatterplot and histograms, to visualize the two variables. If a linear model is appropriate, the histogram of the residuals should look approximately normal, and the scatterplot of the residuals should show random scatter.

Our predictor variable (x) is "square feet," and our response variable (y) is "listing price." We expect a roughly straight-line pattern and a strong correlation: as square feet increase, listing price should also increase, so the scatterplot should show a positive slope unless strong outliers are present.

The analysis also includes the regression line (or line of best fit) and its equation, which can be used to estimate (or forecast) the response variable. The slope of this line tells us how much the listing price is expected to change for each one-unit increase in square feet.
Data Collection

The data are a random sample of 50 U.S. counties, selected from the 999 counties in the provided Real Estate County Data spreadsheet for 2019. The sample was drawn with Excel's random function (=RAND()): each row was assigned a random number, the rows were sorted from the lowest to the highest random number, and the first 50 were selected. Using the predictor variable (x), Median Square Feet, and the response variable (y), Median Listing Price, the scatterplot below was created to test whether the variables are related and to build a linear model for predicting median housing prices in 2019.

Figure 1. Scatterplot of y vs. x (x-axis: Median Square Feet, 0 to 6,000; y-axis: Median Listing Price, $0 to $900,000)
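The sampling procedure described above (attach a random number to each row, sort by it, keep the first 50) can be sketched in Python. This is only an illustration of the idea, not the Excel workflow itself: the county names are hypothetical placeholders standing in for the 999 spreadsheet rows, and the seed is an assumption added so that the run is repeatable.

```python
import random

def sample_by_random_key(rows, n, seed=None):
    """Mimic the Excel approach: give each row a random key,
    sort by that key, and keep the first n rows."""
    rng = random.Random(seed)
    keyed = [(rng.random(), row) for row in rows]
    keyed.sort(key=lambda pair: pair[0])
    return [row for _, row in keyed[:n]]

# Hypothetical stand-in for the 999 county rows in the spreadsheet.
counties = [f"county_{i}" for i in range(1, 1000)]

sample = sample_by_random_key(counties, 50, seed=240)
print(len(sample), "counties selected, all distinct:", len(set(sample)) == 50)
```

Because every row receives its own key and the rows are only reordered, no county can be selected twice, which matches sampling without replacement.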
Data Analysis

Our data set needs to meet certain conditions for linear regression to be appropriate. The chosen sample must represent the population, there should be a linear relationship between the independent and dependent variables, and the residuals should be approximately normally distributed. We can check the last condition by creating a histogram of the residuals.

Figure 2. Histogram of median square feet
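The residual check mentioned above can be sketched as follows. The report's 50-county data set is not reproduced here, so the eight points below are hypothetical toy values used only to show the mechanics: fit a least-squares line, subtract the predicted prices from the observed ones, and count the residuals in equal-width bins (the bar heights of a histogram). The function names are illustrative, not part of any course tool.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def residuals(xs, ys, slope, intercept):
    """Observed y minus predicted y for every point."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]

def bin_counts(values, bins=4):
    """Count values in equal-width bins (histogram bar heights)."""
    low, high = min(values), max(values)
    width = (high - low) / bins
    if width == 0:
        return [len(values)]
    counts = [0] * bins
    for v in values:
        # The maximum value is placed in the last bin.
        counts[min(int((v - low) / width), bins - 1)] += 1
    return counts

# Hypothetical toy data, NOT the report's 50-county sample.
square_feet = [1200, 1500, 1700, 1850, 2100, 2400, 3000, 3600]
listing_price = [250000, 262000, 301000, 295000, 340000, 352000, 431000, 470000]

slope, intercept = fit_line(square_feet, listing_price)
res = residuals(square_feet, listing_price, slope, intercept)
print("residual bin counts:", bin_counts(res))
```

With a real sample, bin counts that pile up near zero and thin out evenly on both sides would be consistent with the normality condition, while a lopsided pattern would call the linear model into question.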
Figure 3. Histogram of median listing price

Figure 4. Summary Statistics

                      Listing Price    Square Foot Area
Mean                  $327,010.00      2038.74
Median                $303,750.00      1842
Standard Deviation    $120,817.33      918.60

In the scatterplot above (Figure 1), the listing price, our response variable (y), increases as the square footage, our predictor variable (x), increases. Here x is treated as the fixed variable and y as the random variable. The sample points also form a roughly linear shape. Together, these observations indicate a positive correlation between our variables.
Looking at the next item in the report (Figure 2), the histogram of median square feet, our data are skewed right: most of the data points fall on the left side of the chart and tail off to the right, over a range of 1162 to 5922 square feet. The chart shows a defined peak at 1442 to 1722 square feet. Because the data are skewed right, the median, 1842 square feet, is the best measure of center. The histogram shows a couple of gaps, one between 2549 and 3408 square feet and another between 4666 and 5315 square feet. However, the points beyond these gaps still follow the positive relationship between median square footage and median listing price, so they do not appear to be outliers.

In the following histogram (Figure 3), for median listing price, the data are again skewed right, with most data points on the left side of the chart and decreasing toward the right, over a range of $145,000 to $745,100. The peak is at $265,100. The histogram also shows a gap, with no data points between $475,000 and $695,000. We again use the median, $303,750, as our center.

The summary statistics table (Figure 4) provides the mean, median, and standard deviation of our 50-county random sample. When we compare these with the National Summary Statistics and Graphs Real Estate Data, seen below, our sample's median listing price aligns closely with the national population, differing by only 4% to 4.5%, and median square footage aligns within 2% to 3%.
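The right skew described above is consistent with the reported summary statistics, where the mean exceeds the median for both variables. The short sketch below applies that rule of thumb to the reported values and shows how the same three statistics could be computed from raw data. The helper names are illustrative assumptions, and comparing mean with median is only a rough indicator of skew, not a formal test.

```python
import statistics

def summarize(values):
    """Mean, median, and sample standard deviation of a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }

def skew_hint(mean, median):
    """Rule of thumb: a mean above the median suggests right skew."""
    if mean > median:
        return "right"
    if mean < median:
        return "left"
    return "none"

# (mean, median) pairs as reported in the summary statistics table.
reported = {
    "listing price": (327010.00, 303750.00),
    "square feet": (2038.74, 1842),
}

for name, (mean, median) in reported.items():
    print(f"{name}: skew hint = {skew_hint(mean, median)}")
```

When a distribution is skewed, a few large values pull the mean upward, which is why the median is the preferred measure of center here.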
Therefore, our random sample represents the national population well, especially given how closely our histograms align with the National Summary Statistics and Graphs Real Estate Data.

Develop Regression Model

Figure: Scatterplot of y vs. x with fitted trendline f(x) = 99.63x + 123890.07 (x-axis: Median Square Feet, 0 to 6,000; y-axis: Median Listing Price, $0 to $900,000)

In the scatterplot above, the data points form a linear shape and show a positive correlation between median square feet (x) and median listing price (y). As size increases, so does the listing price of a home.
Since we are trying to estimate the effect of median square feet on median listing price, and the scatterplot shows a linear pattern, it is appropriate to use a regression model for this analysis. The correlation is moderately high and will help D.M. Pan National Real Estate Company agents determine listing prices based on square footage. The sample used in this report produced a correlation coefficient of r = 0.75. This indicates a positive correlation between the two variables: as one increases, the other increases. Values of r in the range 0.40 < r < 0.80 are defined as moderate, which supports this description of the strength of our correlation.

Determine the Line of Best Fit

Regression equation:

Expected Listing Price = 99.63(Square Feet) + 123890
or
ŷ = 99.63x + 123890

The slope of our regression model is 99.63, and the intercept is 123890. The slope means that each additional square foot is associated with a $99.63 increase in the predicted listing price. The intercept means that the model predicts a listing price of $123,890 at zero square feet, which can be read as the value of the land alone when no house is present. Because no home in the sample is close to zero square feet, this figure should be treated as a rough estimate.

In a regression model, R² represents the proportion of the variance of the dependent (response) variable that is explained by the independent (predictor) variable. The closer R² is to 1, the better the regression equation fits the data. Our model's R² is 0.573816, indicating a moderately strong fit: the model explains about 57% of the variation in listing prices.
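In simple linear regression with one predictor, R² is the square of the correlation coefficient r, so the two figures reported above can be checked against each other. The sketch below takes the square root of the reported R² and also shows how r itself is computed from paired data. The `pearson_r` helper is an illustrative name, and the short lists passed to it are hypothetical values used only to exercise the function.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# R-squared as reported in the text above.
reported_r_squared = 0.573816

# The slope is positive, so the positive square root is the matching r.
implied_r = math.sqrt(reported_r_squared)
print(f"implied r = {implied_r:.4f}")
print(f"share of variation explained = {reported_r_squared:.0%}")

# Hypothetical perfectly linear data: r should be 1 (up to rounding error).
print(pearson_r([1, 2, 3], [2, 4, 6]))
```

The implied value is close to the reported r = 0.75, and it falls inside the 0.40 to 0.80 band, so both statistics describe the same moderate, positive relationship.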
The regression equation can be used to predict listing prices. Below is an example that uses the equation to calculate the listing price for a home of 1500 square feet:

ŷ = 99.63x + 123890
ŷ = 99.63(1500) + 123890
ŷ = 149445 + 123890
ŷ = $273,335

The predicted listing price for a 1500-square-foot house is $273,335.

Conclusions

In conclusion, the results presented here are dependable and will help agents with D.M. Pan National Real Estate Company set listing prices in 2019. This report's random sample of 50 counties represents the national population well. The graphs, charts, and calculations used to build this model support the expected correlation and the use of a home's square footage in calculating its listing price. The regression equation, ŷ = 99.63x + 123890, will help estimate listing prices based on the square footage of a home.

My recommended next step would be to ask the same questions about a specific territory targeted by D.M. Pan National Real Estate Company. Are there particular areas where the regression model may need to be adjusted because home prices are higher than the national median?
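The worked prediction for a 1500-square-foot home can also be reproduced in a few lines of code. The slope and intercept come from the report's regression equation; the function and constant names are illustrative assumptions rather than part of any existing tool.

```python
SLOPE = 99.63        # dollars per additional square foot (from the report)
INTERCEPT = 123890   # dollars, predicted price at zero square feet (from the report)

def predict_listing_price(square_feet):
    """Predicted median listing price from the report's regression line."""
    return SLOPE * square_feet + INTERCEPT

price_1500 = predict_listing_price(1500)
print(f"Predicted listing price for 1500 sq ft: ${price_1500:,.0f}")

# Each extra square foot adds the slope to the prediction.
step = predict_listing_price(1501) - predict_listing_price(1500)
print(f"Increase per additional square foot: ${step:.2f}")
```

Predictions are most defensible for homes within the sampled range of 1162 to 5922 square feet reported in the data analysis; outside that range the line is an extrapolation.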