Date: Feb 20, 2024

Brittany Curran
Southern New Hampshire University
MAT-240: Applied Statistics
Median Housing Price Prediction Model for D.M. Pan National Real Estate Company

Introduction

To make well-informed decisions, the real estate sector depends on thorough data analysis and market research. This report supports that work by examining how median home listing price relates to square footage. Linear regression is a suitable method for modeling the relationship between two quantitative variables because it fits a linear equation to the observed data. In this report, we use linear regression to relate our two variables: median listing price and median square footage. A scatterplot complements the regression, since it shows the data alongside the line of best fit and visually represents the regression equation. The predictor variable is square footage, and the response variable is median listing price. With the linear regression model, we can give clients better-grounded guidance and use it as a tool to estimate listing prices from square footage.

Data Collection

The sample data came from the real estate county data spreadsheet, and the national statistics and graphs document was used for national comparisons. To obtain the sample, 50 counties were selected at random using the =RAND() function in Excel. The response variable is the median listing price, and the predictor variable is the median square footage. To describe the data, the mean, median, and standard deviation were calculated from the sample.
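The sampling and summary steps described above can be sketched in Python as a rough equivalent of the Excel workflow. The report does not spell out the exact Excel steps, so this sketch assumes a simple random sample without replacement. The county records below are made-up placeholder data, and the names sample_rows, describe, median_sqft, and median_price are hypothetical.

```python
import random
import statistics


def sample_rows(rows, k=50, seed=None):
    """Draw a simple random sample of k rows without replacement."""
    rng = random.Random(seed)
    return rng.sample(rows, k)


def describe(values):
    """Return the mean, median, and sample standard deviation of values."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample (n - 1) standard deviation
    }


# Placeholder data only: these are made-up county records, not the
# values from the real estate county data spreadsheet.
_rng = random.Random(0)
counties = []
for i in range(1, 501):
    sqft = _rng.randint(1200, 2100)
    price = 100 * sqft + 130_000 + _rng.randint(-40_000, 40_000)
    counties.append(
        {"county": f"County {i}", "median_sqft": sqft, "median_price": price}
    )

# Select 50 counties at random, then summarize each variable.
sample = sample_rows(counties, k=50, seed=1)
sqft_stats = describe([row["median_sqft"] for row in sample])
price_stats = describe([row["median_price"] for row in sample])

print(sqft_stats)
print(price_stats)
```

Fixing the seed makes the sample reproducible, which Excel's RAND() does not do on its own because it recalculates whenever the sheet changes.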
(Figure: scatterplot of median listing price versus median square feet)

Data Analysis

(Figure: histogram of frequencies of median square feet)
(Figure: histogram of frequencies of median listing prices)

Summary Statistics

Table 1: Summary statistics

Histograms are valuable for understanding the center, spread, and shape of a distribution. In the histogram of median square feet, most values are not tightly concentrated around the center; values farther from the center are relatively common. The distribution is asymmetric, with values spanning approximately 1,200 to 2,081 square feet. The data points are therefore dispersed around the center, which indicates relatively high variability.
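As an illustration of how a histogram's shape can be read numerically, the sketch below counts values per bin and compares the mean with the median. The square-footage values are placeholders rather than the report's sample, and histogram_counts is a hypothetical helper that assumes no value falls below the chosen start.

```python
import statistics


def histogram_counts(values, bin_width, start):
    """Count values in consecutive bins [lo, lo + bin_width), beginning at start."""
    counts = {}
    for v in values:
        index = int((v - start) // bin_width)
        counts[index] = counts.get(index, 0) + 1
    n_bins = max(counts) + 1
    # Empty bins are reported with a count of 0, so gaps stay visible.
    return [
        (start + i * bin_width, start + (i + 1) * bin_width, counts.get(i, 0))
        for i in range(n_bins)
    ]


# Placeholder square-footage values, not the report's sample.
sqft = [1200, 1250, 1310, 1500, 1540, 1620, 1650, 1980, 2081]

for lo, hi, count in histogram_counts(sqft, bin_width=200, start=1200):
    print(f"{lo}-{hi}: {'#' * count}")

# A mean well above the median hints at right skew; well below, left skew.
print(statistics.mean(sqft), statistics.median(sqft))
```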
Low variability is generally preferred because it allows more precise inferences about the population from sample data. Histograms are also useful for identifying outliers and gaps, and this histogram shows both. An outlier is an observation that lies far from the other values in a sample; here, an outlier may correspond to, for example, suburban homes in exceptional school districts.

The histogram of median listing price is right-skewed: most values cluster on the left while the tail extends to the right. This histogram also shows outliers.

The sample was drawn at random, and its key statistics were compared with the national data. The national mean and median listing prices were lower than those of my sample, while the national standard deviation was higher. Despite these differences, the sample shows the same pattern as the national data: as square footage increases, listing prices tend to increase. Because this trend holds in both the sample and the national dataset, the sample appears to be a reasonable reflection of national trends.

The scatterplot with a trend line shows how changes in one variable are associated with changes in the other. The trend line, generated by regression analysis, captures the overall pattern in the data, so the scatterplot and trend line together support the development of a regression model. The line of best fit serves as a visual representation of the relationship between the two variables.
Specifically, it reveals a positive correlation between median listing price and median square feet: the line of best fit slopes upward, so the median listing price tends to increase as square footage increases.

Correlation Coefficient

R-squared = 0.6356, which corresponds to a correlation coefficient of R ≈ 0.797 (the square root of R-squared, taking the positive root because the slope is positive). This indicates a positive correlation between median listing price and median square footage: as the predictor variable, square footage, increases, the response variable, median listing price, tends to increase as well. The strength of a correlation is judged by how close the absolute value of R is to 1, not merely by whether it is below 1. A value near 0.8 indicates a moderate to fairly strong linear relationship rather than a weak one.

Line of Best Fit

Regression equation: y = 102.09x + 129705, where x is median square feet and y is the predicted median listing price in dollars.

The slope indicates that, on average, each additional square foot is associated with an increase of about $102.09 in median listing price. An R-squared value of 0.635562377 means that approximately 63.56% of the variability in median listing price is explained by square footage. This is a moderate fit: roughly 36% of the variability is due to factors other than square footage, and an R-squared value closer to 1 would indicate a stronger relationship.

In practical terms, if a county had a median home size of 1,360 square feet, the regression equation predicts a median listing price of about $268,500 (102.09 × 1,360 + 129,705). This is a simplified prediction that does not account for all the factors that influence house prices.

Conclusion
Based on the analysis of my sample dataset, I conclude that there is a positive, moderate linear relationship between median square footage and median listing price. With the exception of a few outliers, listing price tends to increase as square footage increases. The outliers suggest that some unusual cases depart from this trend, but the overall pattern supports a positive correlation between square footage and listing price in the dataset.
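For readers who want to check the figures reported under Line of Best Fit, the following is a minimal Python sketch. The slope, intercept, and R-squared are the values stated in the report. The 1,360 square foot input is an assumption, chosen because it is the input consistent with the reported prediction of roughly $268,500. The function names fit_line and predict_price are hypothetical, and fit_line is a generic least-squares routine included only to show how such coefficients are obtained from data.

```python
import math


def fit_line(xs, ys):
    """Least-squares slope, intercept, and correlation coefficient r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


def predict_price(sqft, slope=102.09, intercept=129705):
    """Apply the report's regression equation y = 102.09x + 129705."""
    return slope * sqft + intercept


# R-squared as stated in the report; R is its positive square root
# because the fitted slope is positive.
r_squared = 0.635562377
r = math.sqrt(r_squared)

print(round(r, 3))                    # 0.797
print(round(predict_price(1360), 2))  # 268547.4
```

The sample data itself is not reproduced in the report, so fit_line is not applied to it here; given the 50 sampled pairs of median square feet and median listing price, it would return the slope, intercept, and R for that sample.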