HW3
pdf
keyboard_arrow_up
School
Washington State University *
*We aren’t endorsed by this school
Course
MISC
Subject
Statistics
Date
Feb 20, 2024
Type
Pages
13
Uploaded by GrandDeerMaster1061
1.
Using a little bit of algebra, prove that (4.2) is equivalent to
(4.3). In other words, the logistic function representation and logit
representation for the logistic regression model are equivalent.
4.2: p(X) = (eˆ(B0 + B1X)) / (1 + eˆ(B0 + B1X)) 4.3: (p(X) / (1- p(X))) = eˆ(B0 + B1X)
To prove 4.2 is equivalent to 4.3, we can use the equation
p(X) = (eˆ(B0 + B1X)) / (1 + eˆ(B0 + B1X)) (1 + eˆ(B0 + B1X)) p(X) = eˆ(B0 + B1X) p(X) + p(X)eˆ(B0
+ B1X) - p(X)eˆ(B0 + B1X) p(X) = eˆ(B0 + B1X) - p(X)eˆ(B0 + B1X) p(X) = eˆ(B0 + B1X) (1 - p(X))
(p(X) / (1- p(X))) = eˆ(B0 + B1X)
Therefore 4.2 is equivalent to 4.3
2. It was stated in the text that classifying an observation to the
class for which (4.12) is largest is equivalent to classifying an ob-
servation to the class for which (4.13) is largest. Prove that this is
the case. In other words, under the assumption that the observa-
tions in the kth class are drawn from a N(uk, o2) distribution, the
Bayes’ classifier assigns an observation to the class for which the
discriminant function is maximized.
4.12: pk(x) = ((pi)k 1/sqrt((2(pi)))o exp(- 1/2o2 (x - uk)ˆ2) / ((o K, l = 1) (pi)l (1/(sqrt(2(pi)))o exp(-
1/(2oˆ2) (x - ul)ˆ2)))
By removing all terms independent of the class & from the Bayes classifier and also removing the denominator
since it will retain the same value for any class k, we find
log Pk(X) = log(pi)k - (1/2oˆ2) * (X - uk)ˆ2 = log(pi)k - ((xˆ2 / 2oˆ2)- (x * uk/ oˆ2) +(uˆ2_k / 2oˆ2))
4.13: (delta)k(x) = x * uk/oˆ2 - u
2_k/2o
2 + log((pi)k)
The above shows that the discriminant function contains the same terms as the Bayes classifier that are
conditional with respect to k So we can deduce that the class that maximizes 4.12 will also be the class that
maximizes 4.13
5. We now examine the differences between LDA and QDA.
#(a) If the Bayes decision boundary is linear, do we expect LDA or QDA to perform better on the training
set? On the test set?
LDA is expected to perform better than QDA on both the training set and test set.
When the decision
boundary is linear, this assumption of having a common variance-covariance matrix simplifies the model,
and LDA’s linear decision boundary should fit the training data well.
#(b) If the Bayes decision boundary is non-linear, do we expect LDA or QDA to perform better on the
training set? On the test set?
QDA is expected to perform better than LDA on both the training set and test set. In QDA class-conditional
densities have unequal covariance matrices. This captures the non-linear decision boundaries more effectively
than the LDA.
1
#(c) In general, as the sample size n increases, do we expect the test prediction accuracy of QDA relative
to LDA to improve, decline, or be unchanged? Why?
As n increases, the accuracy of QDA improves more than that of LDA. This is because QDA has more data
to estimate the means and covariance matrices accurately.
(d) True or False: Even if the Bayes decision boundary for a given
problem is linear, we will probably achieve a superior test error
rate using QDA rather than LDA because QDA is flexible enough
to model a linear decision boundary. Justify your answer.
False. If the Bayes decision boundary for a problem is linear, using QDA is not likely to achieve a better
test error rate compared to LDA.This only leads to a worse performance since QDA is a more flexible model
that can capture non-linear decision boundaries, but not necessarily the linear decision boundary.
6.
Suppose we collect data for a group of students in a statistics
class with variables X1 = hours studied, X2 = undergrad GPA,
and Y = receive an A. We fit a logistic regression and produce
estimated coefficient, B0 = -6, B1 = 0.05, B2 = 1.
(a) Estimate the probability that a student who studies for 40 h
and has an undergrad GPA of 3.5 gets an A in the class.
P(Y = 1) = 1 / (1 + eˆ-(B0 + B1X1 + B2X2))
B0 = -6 B1 = 0.05 B2 = 1 X1 = 40 X2 = 3.5
P(Y = 1) = 1 / (1 + eˆ-(-6 + (0.05)(40) + 1(3.5)))
Which, when entered into a calculator gives us 0.6225. Therefore, there is a 37.75% chance that the student
in question gets an A.
(b) How many hours would the student in part (a) need to study
to have a 50 % chance of getting an A in the class?
P(Y = 1) = 0.5 1 / 1 + eˆ-z = 0.5 1 + eˆ-z = 2 eˆ-z = 1 -z = 0 z = 0
B0 + B1X1 + B2X2 = z
B0 + B1X1 + B2X2 = 0 -6 + (0.05)(X1) - 1(3.5) = 0
Solving for X1 (time in hours), we get 50 hours for the student in question to have a 50% chance of getting
an A
2
7. Suppose that we wish to predict whether a given stock will issue
a dividend this year (“Yes” or “No”) based on X, last year’s percent
profit. We examine a large number of companies and discover that
the mean value of X for companies that issued a dividend was X =
10, while the mean for those that didn’t was X = 0. In addition, the
variance of X for these two sets of companies was o2 = 36. Finally,
80 % of companies issued dividends.
Assuming that X follows a
normal distribution, predict the probability that a company will
issue a dividend this year given that its percentage profit was X =
4 last year.
#Hint:
Recall that the density function for a normal random variable is f(x) = 1 / sqrt((2(pi)oˆ2))
e
(-(x-u)
2/2o2) - You will need to use Bayes’ theorem.
X1 = 10 X2 = 0 o2 = 36 (pi)1 = 0.8 (pi)2 = 0.2
f1(4) = 0.0433 f2(4) = 0.05324
p1(x=4)=(pi)1(f1(4)) / ((pi)1(f1(4)) + (pi)2(f2(x))) p1(x=4)=0.8(0.0433) / (0.8(0.0433) + 0.2(0.05324))
if the percentage of profit was 4%. the chance of a dividend being offered is 0.75185, or 75.2%
10. This question should be answered using the Weekly data set,
which is part of the ISLR package. This data is similar in nature
to the Smarket data from this chapter’s lab, except that it contains
1,089 weekly returns for 21 years, from the beginning of 1990 to
the end of 2010.
library
(ISLR)
## Warning: package ’ISLR’ was built under R version 4.2.3
data
(
"Weekly"
)
library
(corrplot)
## corrplot 0.92 loaded
#(a) Produce some numerical and graphical summaries of the Weekly data.
Do there appear to be any
patterns?
No real patterns I can see. And nothing really appears to have any correlation.
summary
(Weekly)
3
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
- Access to all documents
- Unlimited textbook solutions
- 24/7 expert homework help
##
Year
Lag1
Lag2
Lag3
##
Min.
:1990
Min.
:-18.1950
Min.
:-18.1950
Min.
:-18.1950
##
1st Qu.:1995
1st Qu.: -1.1540
1st Qu.: -1.1540
1st Qu.: -1.1580
##
Median :2000
Median :
0.2410
Median :
0.2410
Median :
0.2410
##
Mean
:2000
Mean
:
0.1506
Mean
:
0.1511
Mean
:
0.1472
##
3rd Qu.:2005
3rd Qu.:
1.4050
3rd Qu.:
1.4090
3rd Qu.:
1.4090
##
Max.
:2010
Max.
: 12.0260
Max.
: 12.0260
Max.
: 12.0260
##
Lag4
Lag5
Volume
Today
##
Min.
:-18.1950
Min.
:-18.1950
Min.
:0.08747
Min.
:-18.1950
##
1st Qu.: -1.1580
1st Qu.: -1.1660
1st Qu.:0.33202
1st Qu.: -1.1540
##
Median :
0.2380
Median :
0.2340
Median :1.00268
Median :
0.2410
##
Mean
:
0.1458
Mean
:
0.1399
Mean
:1.57462
Mean
:
0.1499
##
3rd Qu.:
1.4090
3rd Qu.:
1.4050
3rd Qu.:2.05373
3rd Qu.:
1.4050
##
Max.
: 12.0260
Max.
: 12.0260
Max.
:9.32821
Max.
: 12.0260
##
Direction
##
Down:484
##
Up
:605
##
##
##
##
pairs
(Weekly)
Year
-15
-15
-15
-15
1990
2010
-15
5
Lag1
Lag2
-15
5
-15
5
Lag3
Lag4
-15
5
-15
5
Lag5
Volume
0
4
8
-15
5
Today
1990
-15
-15
0
8
1.0
1.6
1.0
Direction
4
plot
(Weekly
$
Year, Weekly
$
Volume,
type =
"l"
,
xlab =
"Year"
,
ylab =
"Volume"
)
1990
1995
2000
2005
2010
0
2
4
6
8
Year
Volume
corrplot
(
cor
(Weekly[,
-
9
]),
method=
"square"
)
5
-1
-0.8
-0.6
-0.4
-0.2
0
0.2
0.4
0.6
0.8
1
Year
Lag1
Lag2
Lag3
Lag4
Lag5
Volume
Today
Year
Lag1
Lag2
Lag3
Lag4
Lag5
Volume
Today
#(b) Use the full data set to perform a logistic regression with Direction as the response and the five lag
variables plus Volume as predictors. Use the summary function to print the results. Do any of the predictors
appear to be statistically significant? If so, which ones?
Lag2 appears to be significant at the 0.05 significance level.
model
<-
glm
(Direction
~
Lag1
+
Lag2
+
Lag3
+
Lag4
+
Lag5
+
Volume,
data =
Weekly,
family =
binomial)
summary
(model)
##
## Call:
## glm(formula = Direction ~ Lag1 + Lag2 + Lag3 + Lag4 + Lag5 +
##
Volume, family = binomial, data = Weekly)
##
## Deviance Residuals:
##
Min
1Q
Median
3Q
Max
## -1.6949
-1.2565
0.9913
1.0849
1.4579
##
## Coefficients:
##
Estimate Std. Error z value Pr(>|z|)
## (Intercept)
0.26686
0.08593
3.106
0.0019 **
## Lag1
-0.04127
0.02641
-1.563
0.1181
## Lag2
0.05844
0.02686
2.175
0.0296 *
## Lag3
-0.01606
0.02666
-0.602
0.5469
## Lag4
-0.02779
0.02646
-1.050
0.2937
## Lag5
-0.01447
0.02638
-0.549
0.5833
6
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
- Access to all documents
- Unlimited textbook solutions
- 24/7 expert homework help
## Volume
-0.02274
0.03690
-0.616
0.5377
## ---
## Signif. codes:
0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##
Null deviance: 1496.2
on 1088
degrees of freedom
## Residual deviance: 1486.4
on 1082
degrees of freedom
## AIC: 1500.4
##
## Number of Fisher Scoring iterations: 4
#(c) Compute the confusion matrix and overall fraction of correct predictions. Explain what the confusion
matrix is telling you about the types of mistakes made by logistic regression.
Up weekly trends are correctly predicted much more accurately than down weekly trends.
As demonstrated by
557 / 48 + 557 = 0.9207
54 / 430 + 54 = 0.1115
predicted_direction
<-
ifelse
(
predict
(model,
type =
"response"
)
>
0.5
,
"Up"
,
"Down"
)
confusion_matrix
<-
table
(
Actual =
Weekly
$
Direction,
Predicted =
predicted_direction)
correct_predictions
<-
sum
(
diag
(confusion_matrix))
/
sum
(confusion_matrix)
confusion_matrix
##
Predicted
## Actual Down
Up
##
Down
54 430
##
Up
48 557
correct_predictions
## [1] 0.5610652
#(d) Now fit the logistic regression model using a training data period from 1990 to 2008, with Lag2 as the
only predictor. Compute the confusion matrix and the overall fraction of correct predictions for the held out
data (that is, the data from 2009 and 2010).
training_data
<-
Weekly[Weekly
$
Year
<
2009
, ]
testing_data
<-
Weekly[Weekly
$
Year
>=
2009
, ]
model2
<-
glm
(Direction
~
Lag2,
data =
training_data,
family =
binomial)
predicted_direction2
<-
ifelse
(
predict
(model2,
newdata =
testing_data,
type =
"response"
)
>
0.5
,
"Up"
,
"
confusion_matrix2
<-
table
(
Actual =
testing_data
$
Direction,
Predicted =
predicted_direction2)
correct_predictions2
<-
sum
(
diag
(confusion_matrix2))
/
sum
(confusion_matrix2)
confusion_matrix2
7
##
Predicted
## Actual Down Up
##
Down
9 34
##
Up
5 56
correct_predictions2
## [1] 0.625
#(e) Repeat (d) using LDA.
library
(MASS)
training_data
<-
Weekly[Weekly
$
Year
<
2009
, ]
testing_data
<-
Weekly[Weekly
$
Year
>=
2009
, ]
lda_model
<-
lda
(Direction
~
Lag2,
data =
training_data)
lda_predictions
<-
predict
(lda_model, testing_data)
predicted_direction_lda
<-
lda_predictions
$
class
confusion_matrix_lda
<-
table
(
Actual =
testing_data
$
Direction,
Predicted =
predicted_direction_lda)
correct_predictions_lda
<-
sum
(
diag
(confusion_matrix_lda))
/
sum
(confusion_matrix_lda)
confusion_matrix_lda
##
Predicted
## Actual Down Up
##
Down
9 34
##
Up
5 56
correct_predictions_lda
## [1] 0.625
#(f) Repeat (d) using QDA.
training_data
<-
Weekly[Weekly
$
Year
<
2009
, ]
testing_data
<-
Weekly[Weekly
$
Year
>=
2009
, ]
qda_model
<-
qda
(Direction
~
Lag2,
data =
training_data)
qda_predictions
<-
predict
(qda_model, testing_data)
predicted_direction_qda
<-
qda_predictions
$
class
confusion_matrix_qda
<-
table
(
Actual =
testing_data
$
Direction,
Predicted =
predicted_direction_qda)
correct_predictions_qda
<-
sum
(
diag
(confusion_matrix_qda))
/
sum
(confusion_matrix_qda)
confusion_matrix_qda
##
Predicted
## Actual Down Up
##
Down
0 43
##
Up
0 61
8
correct_predictions_qda
## [1] 0.5865385
#(g) Repeat (d) using KNN with K = 1.
library
(class)
training_data
<-
Weekly[Weekly
$
Year
<
2009
, ]
testing_data
<-
Weekly[Weekly
$
Year
>=
2009
, ]
training_data_predictor
<-
data.frame
(
Lag2 =
training_data
$
Lag2)
testing_data_predictor
<-
data.frame
(
Lag2 =
testing_data
$
Lag2)
knn_model
<-
knn
(
train =
training_data_predictor,
test =
testing_data_predictor,
cl =
training_data
$
Dire
confusion_matrix_knn
<-
table
(
Actual =
testing_data
$
Direction,
Predicted =
knn_model)
correct_predictions_knn
<-
sum
(
diag
(confusion_matrix_knn))
/
sum
(confusion_matrix_knn)
confusion_matrix_knn
##
Predicted
## Actual Down Up
##
Down
21 22
##
Up
30 31
correct_predictions_knn
## [1] 0.5
#(h) Which of these methods appears to provide the best results on this data?
LDA with a score of 0.625 or 62.5%
13. Using the Boston data set, fit classification models in order to
predict whether a given suburb has a crime rate above or below
the median.
Explore logistic regression, LDA, and KNN models
using various subsets of the predictors. Describe your findings.
library
(MASS)
library
(ISLR)
data
(
"Boston"
)
Boston
$
crim01
<-
ifelse
(Boston
$
crim
>
median
(Boston
$
crim),
1
,
0
)
set.seed
(
1
)
n
<-
nrow
(Boston)
train_indices
<-
sample
(
1
:
n, n
/
2
)
train_data
<-
Boston[train_indices, ]
test_data
<-
Boston[
-
train_indices, ]
9
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
- Access to all documents
- Unlimited textbook solutions
- 24/7 expert homework help
glm.fit
<-
glm
(crim01
~
.
-
crim,
data =
train_data,
family =
binomial)
summary
(glm.fit)
##
## Call:
## glm(formula = crim01 ~ . - crim, family = binomial, data = train_data)
##
## Deviance Residuals:
##
Min
1Q
Median
3Q
Max
## -1.6596
-0.1582
-0.0002
0.0022
3.7586
##
## Coefficients:
##
Estimate Std. Error z value Pr(>|z|)
## (Intercept) -40.760938
9.177609
-4.441 8.94e-06 ***
## zn
-0.119686
0.054901
-2.180 0.029255 *
## indus
-0.001286
0.061838
-0.021 0.983414
## chas
0.799622
1.100037
0.727 0.467284
## nox
46.304574
10.423302
4.442 8.90e-06 ***
## rm
-0.870265
1.082365
-0.804 0.421374
## age
0.041053
0.017749
2.313 0.020725 *
## dis
1.206786
0.349609
3.452 0.000557 ***
## rad
0.650511
0.225016
2.891 0.003841 **
## tax
-0.003316
0.003640
-0.911 0.362317
## ptratio
0.476177
0.193469
2.461 0.013845 *
## black
-0.007975
0.005461
-1.460 0.144249
## lstat
0.043148
0.074055
0.583 0.560135
## medv
0.242090
0.109941
2.202 0.027665 *
## ---
## Signif. codes:
0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##
Null deviance: 350.73
on 252
degrees of freedom
## Residual deviance: 105.48
on 239
degrees of freedom
## AIC: 133.48
##
## Number of Fisher Scoring iterations: 9
glm.probs
<-
predict
(glm.fit,
newdata =
test_data,
type =
"response"
)
glm.pred
<-
ifelse
(glm.probs
>
0.5
,
1
,
0
)
table
(glm.pred, test_data
$
crim01)
##
## glm.pred
0
1
##
0 116
15
##
1
10 112
mean
(glm.pred
!=
test_data
$
crim01)
## [1] 0.09881423
10
lda.fit
<-
lda
(crim01
~
.
-
crim,
data =
train_data)
lda.pred
<-
predict
(lda.fit,
newdata =
test_data)
table
(lda.pred
$
class, test_data
$
crim01)
##
##
0
1
##
0 118
33
##
1
8
94
mean
(lda.pred
$
class
!=
test_data
$
crim01)
## [1] 0.1620553
library
(class)
train_data_X
<-
train_data[,
-
14
]
test_data_X
<-
test_data[,
-
14
]
train_data_crim01
<-
train_data
$
crim01
knn.pred1
<-
knn
(train_data_X, test_data_X, train_data_crim01,
k =
1
)
table
(knn.pred1, test_data
$
crim01)
##
## knn.pred1
0
1
##
0 110
13
##
1
16 114
knn.pred10
<-
knn
(train_data_X, test_data_X, train_data_crim01,
k =
10
)
table
(knn.pred10, test_data
$
crim01)
##
## knn.pred10
0
1
##
0 105
18
##
1
21 109
knn.pred100
<-
knn
(train_data_X, test_data_X, train_data_crim01,
k =
100
)
table
(knn.pred100, test_data
$
crim01)
##
## knn.pred100
0
1
##
0 123
60
##
1
3
67
Consider the data set provided with this homework assignment.
Implement LDA and QDA classifiers on this data and compare the
two classifiers using a ROC curve.
11
library
(MASS)
library
(ROCR)
## Warning: package ’ROCR’ was built under R version 4.2.3
data
<-
read.csv
(
"Hw3data-1.csv"
)
set.seed
(
123
)
train_indices
<-
sample
(
1
:
nrow
(data),
0.7
*
nrow
(data))
train_data
<-
data[train_indices, ]
test_data
<-
data[
-
train_indices, ]
lda_model
<-
lda
(response
~
predictor
.1
+
predictor
.2
+
predictor
.3
+
predictor
.4
+
predictor
.5
,
data =
qda_model
<-
qda
(response
~
predictor
.1
+
predictor
.2
+
predictor
.3
+
predictor
.4
+
predictor
.5
,
data =
lda_predictions
<-
predict
(lda_model,
newdata =
test_data)
qda_predictions
<-
predict
(qda_model,
newdata =
test_data)
prediction_lda
<-
prediction
(lda_predictions
$
posterior[,
2
], test_data
$
response)
prediction_qda
<-
prediction
(qda_predictions
$
posterior[,
2
], test_data
$
response)
roc_lda
<-
performance
(prediction_lda,
"tpr"
,
"fpr"
)
roc_qda
<-
performance
(prediction_qda,
"tpr"
,
"fpr"
)
plot
(roc_lda,
col =
"blue"
,
main =
"ROC Curve"
)
plot
(roc_qda,
col =
"red"
,
add =
TRUE
)
legend
(
"bottomright"
,
legend =
c
(
"LDA"
,
"QDA"
),
col =
c
(
"blue"
,
"red"
),
lty =
1
)
12
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
- Access to all documents
- Unlimited textbook solutions
- 24/7 expert homework help
ROC Curve
False positive rate
True positive rate
0.0
0.2
0.4
0.6
0.8
1.0
0.0
0.2
0.4
0.6
0.8
1.0
LDA
QDA
13
Related Documents
Related Questions
Log(Wagei) = α + β1Educationi + β2Femalei + β3Non Whitei + β4Educationi × Femalei + εi For males, and holding other variables constant, the regression output from the linear regression model above implies that an extra year of education is associated with what percent change in wages?
arrow_forward
Model the following data using an exponential function of the
form f(x) = Ab". You need to do this rigorously, that is, set up a system of
equations, and solve it to get A and b. Getting the right answer by guessing
will earn no credit.
(a) f(x) is exponential and goes through the points (1, 2) and (4,6).
(b) f(x) is exponential and goes through the points (2, 75) and (5, 9375).
arrow_forward
A multiple regression model has the form y with hat on top= 12 – 4 x1 + 5 x2. As x1 increases by 7 units (holding x2 constant), y with hat on top is expected to (enter increase, decrease) by units.
arrow_forward
Data were collected from a group of senior citizens, ages 65 and older, about their age and pet ownership. The results were used to calculate the model ŷ = –0.24x + 18.48, where x represents the age in years and ŷ represents the predicted number of pets owned. If a 25-year-old is predicted to have more than 12 pets, would that be a reasonable use of the model?
This is reasonable because the prediction is based on someone within the original population.
This is not reasonable because the prediction is based on someone outside the original population.
This is reasonable because the model is based on real data.
This is not reasonable because it is impossible to own more than 12 pets.
arrow_forward
You have obtained measurements of height in inches of 29 female and 81 male students (Studenth) at your university. A regression of the height on a constant and a binary variable (BFemme), which takes a value of one for females and is zero otherwise, yields the following result:
= 71.0 – 4.84×BFemme , R2 = 0.40, SER = 2.0
(0.3) (0.57)
Values in parenthesis are standard errors.
(a) What is the interpretation of the intercept? What is the interpretation of the coefficient of the dummy variable? How tall are females, on average?
(b) Test the hypothesis that females, on average, are shorter than males, at the 1% significance level.
arrow_forward
Data were collected from a group of senior citizens, ages 65 and older, about their age and pet ownership. The results were used to calculate the model ŷ = –0.24x + 18.48, where x represents the age in years and ŷ represents the predicted number of pets owned. If a 45-year-old is predicted to have more than 7 pets, would that be a reasonable use of the model?
This is not reasonable because the prediction is based on someone outside the original population.
This is reasonable because the model is based on real data.
This is not reasonable because it is impossible to own more than 7 pets.
This is reasonable because the prediction is based on someone within the original population.
arrow_forward
1.4
(A) Let x represent the number of years since 1960 and find a cubic regression polynomial for the marriage rate.
(B) Use the polynomial model from part (A) to estimate the marriage rate (to one decimal place) for 2025 .
arrow_forward
2. Are rent rates influenced by the student population in a college town? Let rent be the average monthly rent paid on rental
units in a college town in the United States. Let pop denote the total city population, avginc the average city income, and
pctstu the student population as a percentage of the total population. One model to test for a relationship is
log (rent) = Bo + B₁ log(pop) + ₂log (avginc) + B3pctstu + u,
(i) State the null hypothesis that size of the student body relative to the population has no ceteris paribus effect on monthly
rents. State the alternative that there is an effect.
(ii) What signs do you expect for 3₁ and 3₂. Explain.
(iii) The equation estimated using 1990 data for 64 college town is
log (rent) = 0.043 + 0.066 log(pop) + 0.507 log (avginc) + 0.0056 pctstu
(0.844) (0.039)
(0.081)
(0.0017)
n = 64, R² = 0.458
What is wrong with the statement "A 10% increase in population is associated with about a 6.6% increase in rent"?
(iv) Test the hypothesis stated in…
arrow_forward
A professor is interested in understanding the factors on which the grades a student obtains in her final exam depends. She chooses the student's intelligence quotient (X₁), the grades obtained in
the mid-term exam (X₂), the number of days the student was present in the class in the whole semester (X3) as regressors.
She randomly selects 120 students and estimates the following regression function:
9-65-123X₁,-0.45X₂ +2.13X+U
where YX₁ X₂ and X₂ denote the predicted value of grades obtained by the student, her intellegence quotient, the grades obtained by her in the mid-term exam, and the number of days she
was present in the class in the whole semester, respectively
She calculates the value of the regression to be 0.78
The value of the adjusted R² (²) will be
(Round your answer to two decimal places)
Which of the following statements is not true about the A? (Check all that apply)
A The value of R² is always greater than the given that the number of regressors remains the same in the…
arrow_forward
Please help
arrow_forward
1.4
(A) Let x represent the number of years since 1980 and find a cubic regression polynomial for the per capita consumption of ice cream.
(B) Use the polynomial model from part (A) to estimate (to the nearest tenth of a pound) the per capita consumption of ice cream in 2025.
arrow_forward
SEE MORE QUESTIONS
Recommended textbooks for you
Algebra & Trigonometry with Analytic Geometry
Algebra
ISBN:9781133382119
Author:Swokowski
Publisher:Cengage
Related Questions
- Log(Wagei) = α + β1Educationi + β2Femalei + β3Non Whitei + β4Educationi × Femalei + εi For males, and holding other variables constant, the regression output from the linear regression model above implies that an extra year of education is associated with what percent change in wages?arrow_forwardModel the following data using an exponential function of the form f(x) = Ab". You need to do this rigorously, that is, set up a system of equations, and solve it to get A and b. Getting the right answer by guessing will earn no credit. (a) f(x) is exponential and goes through the points (1, 2) and (4,6). (b) f(x) is exponential and goes through the points (2, 75) and (5, 9375).arrow_forwardA multiple regression model has the form y with hat on top= 12 – 4 x1 + 5 x2. As x1 increases by 7 units (holding x2 constant), y with hat on top is expected to (enter increase, decrease) by units.arrow_forward
- Data were collected from a group of senior citizens, ages 65 and older, about their age and pet ownership. The results were used to calculate the model ŷ = –0.24x + 18.48, where x represents the age in years and ŷ represents the predicted number of pets owned. If a 25-year-old is predicted to have more than 12 pets, would that be a reasonable use of the model? This is reasonable because the prediction is based on someone within the original population. This is not reasonable because the prediction is based on someone outside the original population. This is reasonable because the model is based on real data. This is not reasonable because it is impossible to own more than 12 pets.arrow_forwardYou have obtained measurements of height in inches of 29 female and 81 male students (Studenth) at your university. A regression of the height on a constant and a binary variable (BFemme), which takes a value of one for females and is zero otherwise, yields the following result: = 71.0 – 4.84×BFemme , R2 = 0.40, SER = 2.0 (0.3) (0.57) Values in parenthesis are standard errors. (a) What is the interpretation of the intercept? What is the interpretation of the coefficient of the dummy variable? How tall are females, on average? (b) Test the hypothesis that females, on average, are shorter than males, at the 1% significance level.arrow_forwardData were collected from a group of senior citizens, ages 65 and older, about their age and pet ownership. The results were used to calculate the model ŷ = –0.24x + 18.48, where x represents the age in years and ŷ represents the predicted number of pets owned. If a 45-year-old is predicted to have more than 7 pets, would that be a reasonable use of the model? This is not reasonable because the prediction is based on someone outside the original population. This is reasonable because the model is based on real data. This is not reasonable because it is impossible to own more than 7 pets. This is reasonable because the prediction is based on someone within the original population.arrow_forward
- 1.4 (A) Let x represent the number of years since 1960 and find a cubic regression polynomial for the marriage rate. (B) Use the polynomial model from part (A) to estimate the marriage rate (to one decimal place) for 2025 .arrow_forward2. Are rent rates influenced by the student population in a college town? Let rent be the average monthly rent paid on rental units in a college town in the United States. Let pop denote the total city population, avginc the average city income, and pctstu the student population as a percentage of the total population. One model to test for a relationship is log (rent) = Bo + B₁ log(pop) + ₂log (avginc) + B3pctstu + u, (i) State the null hypothesis that size of the student body relative to the population has no ceteris paribus effect on monthly rents. State the alternative that there is an effect. (ii) What signs do you expect for 3₁ and 3₂. Explain. (iii) The equation estimated using 1990 data for 64 college town is log (rent) = 0.043 + 0.066 log(pop) + 0.507 log (avginc) + 0.0056 pctstu (0.844) (0.039) (0.081) (0.0017) n = 64, R² = 0.458 What is wrong with the statement "A 10% increase in population is associated with about a 6.6% increase in rent"? (iv) Test the hypothesis stated in…arrow_forwardA professor is interested in understanding the factors on which the grades a student obtains in her final exam depends. She chooses the student's intelligence quotient (X₁), the grades obtained in the mid-term exam (X₂), the number of days the student was present in the class in the whole semester (X3) as regressors. She randomly selects 120 students and estimates the following regression function: 9-65-123X₁,-0.45X₂ +2.13X+U where YX₁ X₂ and X₂ denote the predicted value of grades obtained by the student, her intellegence quotient, the grades obtained by her in the mid-term exam, and the number of days she was present in the class in the whole semester, respectively She calculates the value of the regression to be 0.78 The value of the adjusted R² (²) will be (Round your answer to two decimal places) Which of the following statements is not true about the A? (Check all that apply) A The value of R² is always greater than the given that the number of regressors remains the same in the…arrow_forward
- Please helparrow_forward1.4 (A) Let x represent the number of years since 1980 and find a cubic regression polynomial for the per capita consumption of ice cream. (B) Use the polynomial model from part (A) to estimate (to the nearest tenth of a pound) the per capita consumption of ice cream in 2025.arrow_forward
arrow_back_ios
arrow_forward_ios
Recommended textbooks for you
- Algebra & Trigonometry with Analytic GeometryAlgebraISBN:9781133382119Author:SwokowskiPublisher:Cengage
Algebra & Trigonometry with Analytic Geometry
Algebra
ISBN:9781133382119
Author:Swokowski
Publisher:Cengage