MAT 303 Project One Summary Report
CHINH DOAN
chinh.doan@snhu.edu
Southern New Hampshire University
1. Introduction
As a data analyst employed by a real estate firm, I am tasked with analyzing a substantial
historical data set pertaining to residential properties. The objective of this analysis is to examine the
relationships among various attributes of homes. The findings from this analysis will be utilized to assist
the real estate company in establishing more accurate pricing for their clients' property listings. The
analytical methods employed in this project will encompass first order and second order regression
models, involving both quantitative and qualitative variables, as well as a nested second order
regression model.
2. Data Preparation
The key variables included in this data set are price, age, square footage of the living area,
number of bathrooms, view, square footage of the upper level, school rating, and crime rate. The data
set consists of 2,692 individual records (rows) and encompasses 23 columns.
3. Model #1 - First Order Regression Model with Quantitative and Qualitative Variables
The scatterplot presented above illustrates a positive correlation between the price of a home
and the square footage of its living area. Specifically, as the living area in square footage increases, there
is a corresponding increase in the price of the home.
The scatterplot of price against the age of the home shows no clear trend,
indicating little or no association between the two variables.
The correlation coefficient between the price and the living area is 0.6895, while the correlation
coefficient between the price and the age of the home is -0.0746. These values indicate a fairly strong
positive correlation between price and living area, and a very weak negative correlation (close to none)
between price and the age of the home.
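How strong a correlation is can be judged from the magnitude of the coefficient, and its direction from the sign. The following is a minimal Python sketch of that reading (the report's own analysis was run in R). The cutoffs of 0.3 and 0.6 are a common rule of thumb assumed here for illustration; they do not come from the report.

```python
def describe_correlation(r, weak_cutoff=0.3, strong_cutoff=0.6):
    """Label a Pearson correlation coefficient by strength and direction.

    The cutoffs are illustrative assumptions, not values from the report.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")

    # Strength depends only on the absolute value of r.
    magnitude = abs(r)
    if magnitude < weak_cutoff:
        strength = "weak"
    elif magnitude < strong_cutoff:
        strength = "moderate"
    else:
        strength = "strong"

    # Direction depends only on the sign of r.
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "zero"

    return f"{strength} {direction}"


# The two coefficients reported above.
print(describe_correlation(0.6895))   # price vs. living area
print(describe_correlation(-0.0746))  # price vs. age of the home
```

Under these assumed cutoffs, the living-area coefficient falls in the strong band and the age coefficient in the weak band.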
Reporting Results:
The general form and prediction equation of the multiple regression model are as follows:
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5$
R script: $\hat{y} = 7709 + 129.3 x_1 + 19.51 x_2 + 1451 x_3 + 43970 x_4 + 2.49 \times 10^5 x_5$
The multiple regression model is as follows:
$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \hat{\beta}_3 x_3 + \hat{\beta}_4 x_4 + \hat{\beta}_5 x_5$
R script: $\hat{y} = 7709 + 129.3 x_1 + 19.51 x_2 + 1451 x_3 + 43970 x_4 + 2.49 \times 10^5 x_5$
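To show how the prediction equation is applied, the following Python sketch evaluates it for a pair of homes that differ only in x5 (the report's own analysis was run in R). The report identifies x1 as living area and x5 as the lake-view indicator. This excerpt does not say which variables x2 through x4 represent, so the input values below are hypothetical placeholders rather than data from the report.

```python
# Fitted coefficients as reported in the prediction equation above.
INTERCEPT = 7709.0
COEFFICIENTS = (129.3, 19.51, 1451.0, 43970.0, 2.49e5)  # for x1 .. x5


def predict_price(x):
    """Return the predicted price y-hat for inputs (x1, x2, x3, x4, x5)."""
    if len(x) != len(COEFFICIENTS):
        raise ValueError("expected exactly five predictor values")
    return INTERCEPT + sum(b * xi for b, xi in zip(COEFFICIENTS, x))


# Hypothetical homes, identical except for the lake-view indicator x5.
home_without_view = (2000, 1500, 20, 2, 0)
home_with_view = (2000, 1500, 20, 2, 1)

difference = predict_price(home_with_view) - predict_price(home_without_view)
print(round(difference))
```

Because the two inputs differ only in x5, the difference in predicted price equals the x5 coefficient, whatever values the other four predictors take.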
The multiple regression model yields an R-squared value of 0.6029 and an adjusted R-squared
value of 0.602. That is, the model explains about 60.29% of the variation in home price, or about 60.2%
after adjusting for the number of predictors. The beta estimate for living area is 1.293e+02, and for lake
view is 2.490e+05. Holding the other variables constant, a lake view increases the predicted price by
about 249,000 (2.490e+05), and each additional square foot of living area increases the predicted price
by about 129.30 (1.293e+02).
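The reported adjusted R-squared can be cross-checked against the R-squared using the standard adjustment formula. The Python sketch below does so, assuming that all 2,692 records were used to fit the model and that the model has five predictors; the report does not state either point explicitly.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Standard adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    if n_obs - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# R-squared from the report; n and p are assumptions (see the text above).
adj = adjusted_r_squared(r_squared=0.6029, n_obs=2692, n_predictors=5)
print(round(adj, 3))
```

With a sample this large relative to the number of predictors, the adjustment is small, which is why the two values are nearly identical.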