2023F_322_Midterm2_Solutions
Rutgers University
Department of Economics
Econometrics: 01:220:322:01
Midterm 2
Fall 2023
Instructor: Hector Blanco
Exam Version: A
Instructions - PLEASE READ ALL THESE INSTRUCTIONS FIRST
• Write your name on the front page. Do it now!
• This is a closed book exam: you may have one 3x5 notecard “cheatsheet” (both sides ok)
• Calculators may be used (no cell phones as calculators!)
• You have the full class period to complete the exam (approximately 1 hour and 20 minutes)
• There are a total of 80 points
• Use a 5% significance level for tests unless otherwise stated (critical value: 1.96)
• Unless stated otherwise, please round all answers to 2 decimal places
• Please fill all questions on this exam sheet. Do not unstaple it. If you need more paper there are extra sheets up front. Please clearly label all short/long answer responses with the number and/or letter of the question you are answering. You can write on the front and back of the sheet.
Name:
Section 1. Warm-up. 3 Points
Same as last time, we start with some warm-up questions, no wrong answers in this
section (but do answer!)
1. [2 points] What is a TV show that I should be watching right now and why?
2. [1 point] Choose one of the options below:
A) Taylor ham
B) Pork roll
C) I don’t eat meat
D) What?
Section 2. Multiple Choice Questions. 20 points in total, 2 points each
1. Assume that $Y$ is distributed like a standard normal, $N(0, 1)$. Then, the probability that $Y$ is between -1.96 and 1.96 is:
(a) 0.90
(b) 0.925
(c) 0.95
(d) 0.975
(e) None of the above
2. New Brunswick’s daily temperature has an expected value of 52F and a standard deviation of 11F. The formula to convert degrees Fahrenheit (F) to degrees Celsius (C) is:
$$C = \frac{5}{9}(F - 32)$$
What is the expected value of New Brunswick’s daily temperature in Celsius (rounded to the nearest integer)?
(a) 9C
(b) 10C
(c) 11C
(d) 12C
(e) None of the above
3. New Brunswick’s daily temperature has an expected value of 52F and a standard deviation of 11F. The formula to convert degrees Fahrenheit (F) to degrees Celsius (C) is:
$$C = \frac{5}{9}(F - 32)$$
What is the variance of New Brunswick’s daily temperature in Celsius (rounded to the nearest integer)?
(a) 37C
(b) 42C
(c) 47C
(d) 49C
(e) None of the above
4. What is the difference between an estimator and an estimate?
(a) Both an estimator and an estimate are functions of a sample of data to be drawn randomly from a population.
(b) An estimator is a function of a sample of data to be drawn randomly from a population whereas an estimate is the numerical value of the estimator when it is actually computed using data from a specific sample.
(c) An estimate is a function of a sample of data to be drawn randomly from a population whereas an estimator is the numerical value of the estimator when it is actually computed using data from a specific sample.
(d) Both an estimator and an estimate are numerical values computed using data from a specific sample.
(e) None of the above
5. Consider the multivariate regression model
$$Y_i = \theta_0 + \theta_1 X_{1i} + \theta_2 X_{2i} + \dots + \theta_k X_{ki} + \varepsilon_i$$
Ordinary Least Squares (OLS) estimates the parameters $\{\theta_0, \theta_1, \dots, \theta_k\}$ by minimizing the following function:
(a) $\sum_{i=1}^{n} (Y_i - \theta_0 - \theta_1 X_{1i} - \theta_2 X_{2i} - \dots - \theta_k X_{ki} - \varepsilon_i)^2$
(b) $\sum_{i=1}^{n} (Y_i + \theta_0 + \theta_1 X_{1i} + \theta_2 X_{2i} + \dots + \theta_k X_{ki} + \varepsilon_i)^2$
(c) $\sum_{i=1}^{n} (Y_i - \theta_0 - \theta_1 X_{1i} - \theta_2 X_{2i} - \dots - \theta_k X_{ki})^2$
(d) $\sum_{i=1}^{n} (Y_i + \theta_0 + \theta_1 X_{1i} + \theta_2 X_{2i} + \dots + \theta_k X_{ki})^2$
(e) $\sum_{i=1}^{n} (Y_i^2 - \theta_0 - \theta_1 X_{1i}^2 - \theta_2 X_{2i}^2 - \dots - \theta_k X_{ki}^2)$
6. Imagine that you were told that the t-statistic for the slope coefficient of the regression line $\widehat{TestScore} = 698.8 + 2.28 \times StudentTeacherRatio$ was 4.38. What are the units of measurement for the t-statistic?
(a) Points of the test score
(b) Number of students per teacher
(c) Points of the test score / Number of students per teacher
(d) Standard deviations (estimated by the corresponding standard error)
(e) Dollars
7. You estimate $Y_i = \alpha + \beta_1 X_i + \beta_2 X_i^2 + \epsilon_i$. How can you test for whether the relationship between $Y_i$ and $X_i$ is linear or quadratic?
(a) Compute the t-statistic for $\beta_2$ to test the null hypothesis that $\beta_2 = 0$
(b) Compute the t-statistic for $\beta_1$ to test the null hypothesis that $\beta_1 = 0$
(c) Compute the F-statistic to test the null hypothesis that both $\beta_1 = 0$ and $\beta_2 = 0$
(d) Compare the t-statistic for $\beta_1$ when including $X^2$ in the regression to the t-statistic for $\beta_1$ when not including $X^2$ in the regression
(e) None of the above
8. Which of the following statements is true?
(a) The $R^2$ will always be greater than the adjusted $R^2$ as long as there is at least one independent variable in the regression
(b) The $R^2$ will always be greater than the adjusted $R^2$ even if there is no independent variable in the regression
(c) The adjusted $R^2$ cannot be negative
(d) The adjusted $R^2$ accounts for omitted variable bias
(e) None of the above
9. I am interested in estimating the relationship between $X$ and $Y$. I know there is a non-linear relationship between them. I am considering using a log-log specification, a log-linear specification, or a linear-log specification. Which of the following should help me pick among the three specifications?
(a) I can compare the $R^2$ in the three specifications and pick the specification with the highest $R^2$
(b) I can compare the adjusted $R^2$ in the three specifications and pick the specification with the highest adjusted $R^2$
(c) I can compare the t-statistic for $\beta$ in the three specifications and pick the specification with the highest t-statistic
(d) I can compare the standard error of $\beta$ in the three specifications and pick the specification with the lowest standard error
(e) None of the above
10. If you had a two-regressor regression model, then omitting one of the regressors:
(a) Will bias the coefficient of the included regressor upward
(b) Will bias the coefficient of the included regressor downward
(c) May not bias the coefficient of the included regressor
(d) Will have no effect on the coefficient of the included regressor if the correlation between the excluded and the included regressor is negative
(e) None of the above
Section 3. Short Answer Questions. 12 points in total
11. [6 points] The company Fashion Icon LLC is planning to increase its spending on the advertisement of their clothing items with the objective of increasing the visibility of the brand and, ultimately, increasing their profits. Fashion Icon LLC has a bunch of data from previous years that contain information on two variables: $revenue_t$, which denotes the total revenues from their sales in a given year $t$ (in thousand \$), and $ad\_costs_t$, which denotes the total costs incurred by the company in the advertising campaign for year $t$ (in thousand \$).
(a) [4 points] Using the two variables, write down a univariate regression equation that the company could estimate where the slope coefficient $\beta$ can be interpreted as the percentage change in total revenues that is associated with a 1 percent increase in advertising spending.
(b) [2 points] If you estimate your regression equation in (a) using OLS, what would the distribution of $\hat{\beta}$ be if your sample is large enough and i.i.d.? Why?
12. [6 points] Answer the following questions (round to two decimal places):
(a) [3 points] In Metrics’ High School, the probability that a student takes a
Statistics course is 0.27 and the probability that a student takes a Literature
course is 0.12. Given that the joint probability that a student takes Literature
but does not take Statistics is 0.62, what is the probability that a student takes
Literature conditional on not taking a Statistics course?
(b) [3 points] What is the probability that, when I roll a die twice, I get a two in
the first roll and a five on the second roll? What is the probability that, when
I roll a die twice, I get a five in each of the two rolls? Assume that the die is
not rigged.
Hint: does the second roll depend on what the realization of the first
roll is?
Section 4. Long Problem. 45 points in total
13. [21 points] Climate change is one of the biggest challenges of this century. Given
this, suppose that we are interested in understanding to what extent economic
progress has contributed to climate change.
To examine this question, we collected data in 2023 for all counties in the
United States. Counties are geographical areas similar to the size of an average
metropolitan area. We have data for three variables:
• $AQI$: Annual average of the Air Quality Index (AQI), which goes from 0 to 500 degrees (the lower, the better air quality). Air quality is affected by pollution, which is an important driver of climate change.
• $income\_capita$: Income per capita in the county, in thousands of dollars, which is a proxy for economic progress.
• $urban$: Dummy variable that takes value 1 if the county is mostly urban and takes value 0 if the county is mostly rural.
The table below shows the results of estimating several regression equations by OLS, using $AQI$ as the dependent variable:
Variables                  (1)        (2)        (3)
income_capita              2.281      1.537      1.343
                           (0.493)    (0.354)    (0.295)
urban                                 14.199     12.304
                                      (3.769)    (3.514)
income_capita × urban                            0.189
                                                 (0.059)
constant                   39.5       28.1       26.1
                           (4.26)     (4.09)     (3.78)
Obs                        3,143      3,143      3,143
Note: Standard error in parentheses. This table is made up.
Answer the questions below (round your answers to two decimal places):
(a) [4 points] Column (1) regresses AQI on income per capita. Interpret the slope coefficient. Interpret the intercept (does it make sense in this case? why?).
(b) [6 points] The regression in Column (1) omits variables that may bias the estimate of the effect of income per capita on AQI. Argue why the variable $urban$ may introduce omitted variable bias (OVB). Clearly state the two necessary conditions for OVB and how they apply to this case. Column (2) adds $urban$ to the regression. Was the coefficient in Column (1) upward or downward biased?
(c) [6 points] Interpret the estimated coefficient of $urban$ in Column (2). Is this coefficient different from zero at the 5% significance level?
(d) [2 points] Column (3) adds an interaction term between the two main regressors. Interpret the coefficient on $income\_capita \times urban$.
(e) [3 points] Using the estimates from Column (3) compute the expected value of the AQI for an urban county with an income per capita of \$25,000.
14. [24 points] Rutgers-New Brunswick is interested in studying the differences in academic achievement across its five campuses: Busch, College Ave, Cook, Douglass, and Livingston. We collected data for a random sample of 1,500 Rutgers students. The data contains information about their GPA ($gpa_i$) and the campus where they take classes. To simplify the problem, assume that each student can only take classes in one campus. For example, if student $i$ takes classes in College Ave, student $i$ cannot take classes in other campuses.
Consider the following regression equation (Equation (1)):
$$gpa_i = \alpha + \beta \, campus_i + u_i \quad (1)$$
where $campus_i$ is a variable that can take 5 values depending on where the student is taking classes: 1 for Busch, 2 for College Ave, 3 for Cook, 4 for Douglass, and 5 for Livingston.
Now consider the alternative regression equation below (Equation (2)):
$$gpa_i = \alpha + \beta_1 busch_i + \beta_2 cook_i + \beta_3 douglass_i + \beta_4 livingston_i + \varepsilon_i \quad (2)$$
where each of the independent variables is a dummy variable that takes value 1 if student $i$ lives in that campus and takes value 0 otherwise. The table below shows the results of estimating Equation (2) by OLS:
Variables      (1)
busch          0.221 (0.054)
cook           -0.017 (0.021)
douglass       -0.189 (0.045)
livingston     -0.010 (0.030)
constant       3.730 (0.034)
Obs            1,500
$R^2$          0.34
Note: Standard error in parentheses. This table is made up.
Answer the questions below (round your answers to two decimal places):
(a) [4 points] Can we estimate Equation (1)? If your answer is no, why? If your answer is yes, does the interpretation of $\beta$ make sense?
All the remaining questions refer to Equation (2):
(b) [4 points] In Equation (2), what is the omitted group? Why is it omitted?
(c) [6 points] How do we interpret $\hat{\alpha}$? How do we interpret $\hat{\beta}_4$?
(d) [6 points] We want to test the null hypothesis that the campus where students take classes does not matter. Write down the joint null hypothesis and the alternative hypothesis. What statistic do we need to compute to test this hypothesis (no need to compute it)? Explain how you would reject or fail to reject the null hypothesis.
(e) [4 points] Suppose that instead of $busch_i$, we included $college\_ave_i$ in the regression. In this hypothetical regression, can we know what would be the estimated $\hat{\beta}$ associated with $college\_ave_i$? If your answer is no, explain why. If your answer is yes, provide a number.
Answer Key for Exam A
Section 1. Warm-up. 3 Points
Same as last time, we start with some warm-up questions, no wrong answers in this
section (but do answer!)
1. [2 points] What is a TV show that I should be watching right now and why?
Answer: Any answer is sufficient
2. [1 point] Choose one of the options below:
A) Taylor ham
B) Pork roll
C) I don’t eat meat
D) What?
Answer: I do not dare give my opinion on the Taylor ham/Pork roll NJ debate, so I accepted any answer
Section 3. Short Answer Questions. 12 points in total
11. [6 points] The company Fashion Icon LLC is planning to increase its spending on the advertisement of their clothing items with the objective of increasing the visibility of the brand and, ultimately, increasing their profits. Fashion Icon LLC has a bunch of data from previous years that contain information on two variables: $revenue_t$, which denotes the total revenues from their sales in a given year $t$ (in thousand \$), and $ad\_costs_t$, which denotes the total costs incurred by the company in the advertising campaign for year $t$ (in thousand \$).
(a) [4 points] Using the two variables, write down a univariate regression equation that the company could estimate where the slope coefficient $\beta$ can be interpreted as the percentage change in total revenues that is associated with a 1 percent increase in advertising spending.
(b) [2 points] If you estimate your regression equation in (a) using OLS, what would the distribution of $\hat{\beta}$ be if your sample is large enough and i.i.d.? Why?
Answer: (a) $\beta$ can be interpreted as the percentage change in total revenues associated with a 1 percent increase in advertising costs in a log-log regression:
$$\ln(revenue_t) = \alpha + \beta \ln(ad\_costs_t) + \epsilon_t$$
(b) $\hat{\beta}$ will be approximately normally distributed when $n$ is large enough because of the Central Limit Theorem. More specifically,
$$\hat{\beta} \sim N(\beta, Var(\hat{\beta}))$$
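As a rough illustration of why the log-log form delivers an elasticity, the sketch below simulates hypothetical data with a known elasticity of 0.5 and recovers it by OLS on the logged variables. The data-generating process, the sample size, the noise level and the helper name `ols_intercept_slope` are all assumptions made for illustration; none of them come from the exam.

```python
import math
import random

def ols_intercept_slope(x, y):
    """Closed-form simple OLS; returns (intercept, slope)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

random.seed(0)
true_elasticity = 0.5  # assumed value, chosen only for this simulation

# Hypothetical advertising costs and revenues (in thousand $).
ad_costs = [random.uniform(10, 200) for _ in range(5000)]
revenue = [
    math.exp(3.0 + true_elasticity * math.log(a) + random.gauss(0, 0.1))
    for a in ad_costs
]

# Regress ln(revenue) on ln(ad_costs): the slope estimates the elasticity.
alpha_hat, beta_hat = ols_intercept_slope(
    [math.log(a) for a in ad_costs],
    [math.log(r) for r in revenue],
)
print(round(beta_hat, 2))
```

With this setup the estimated slope should land close to the assumed 0.5, which reads as "a 1 percent increase in advertising costs is associated with roughly a 0.5 percent increase in revenue".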
12. [6 points] Answer the following questions (round to two decimal places):
(a) [3 points] In Metrics’ High School, the probability that a student takes a
Statistics course is 0.27 and the probability that a student takes a Literature
course is 0.12. Given that the joint probability that a student takes Literature
but does not take Statistics is 0.62, what is the probability that a student takes
Literature conditional on not taking a Statistics course?
(b) [3 points] What is the probability that, when I roll a die twice, I get a two in
the first roll and a five on the second roll? What is the probability that, when
I roll a die twice, I get a five in each of the two rolls? Assume that the die is
not rigged.
Hint: does the second roll depend on what the realization of the first
roll is?
Answer: (a) Using the conditional probability formula:
$$P(\text{Lit} \mid \text{No Stats}) = \frac{P(\text{Lit}, \text{No Stats})}{P(\text{No Stats})} = \frac{P(\text{Lit}, \text{No Stats})}{1 - P(\text{Stats})} = \frac{0.62}{1 - 0.27} = 0.85$$
(b) The two rolls are independent events: the second roll does not depend on the first roll. Thus, the joint probability is equal to the product of the probabilities of the two outcomes. Let $X_1$ and $X_2$ be the first and second rolls, respectively:
$$P(X_1 = 2, X_2 = 5) = P(X_1 = 2) \times P(X_2 = 5) = \frac{1}{6} \times \frac{1}{6} = 0.0278 \approx 0.03$$
The same applies to the probability of getting two fives: $P(X_1 = 5, X_2 = 5) = \frac{1}{6} \times \frac{1}{6} \approx 0.03$.
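The arithmetic in both parts can be checked with a few lines of Python. This is only a sketch: part (a) plugs in the probabilities given in the question, and part (b) enumerates the 36 equally likely outcomes of two fair rolls instead of multiplying probabilities. The variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Part (a): P(Lit | No Stats) = P(Lit, No Stats) / (1 - P(Stats)),
# using the numbers stated in the question.
p_stats = 0.27
p_lit_and_no_stats = 0.62
p_lit_given_no_stats = p_lit_and_no_stats / (1 - p_stats)

# Part (b): list every (first roll, second roll) pair of a fair die.
outcomes = list(product(range(1, 7), repeat=2))
p_two_then_five = Fraction(
    sum(1 for pair in outcomes if pair == (2, 5)), len(outcomes)
)
p_five_twice = Fraction(
    sum(1 for pair in outcomes if pair == (5, 5)), len(outcomes)
)

print(round(p_lit_given_no_stats, 2))                      # 0.85
print(p_two_then_five, round(float(p_two_then_five), 2))   # 1/36 0.03
print(p_five_twice, round(float(p_five_twice), 2))         # 1/36 0.03
```

Counting outcomes gives the same 1/36 as the product rule, which is what independence of the two rolls implies.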
Section 4. Long Problem. 45 points in total
13. [21 points] Climate change is one of the biggest challenges of this century. Given
this, suppose that we are interested in understanding to what extent economic
progress has contributed to climate change.
To examine this question, we collected data in 2023 for all counties in the
United States. Counties are geographical areas similar to the size of an average
metropolitan area. We have data for three variables:
• $AQI$: Annual average of the Air Quality Index (AQI), which goes from 0 to 500 degrees (the lower, the better air quality). Air quality is affected by pollution, which is an important driver of climate change.
• $income\_capita$: Income per capita in the county, in thousands of dollars, which is a proxy for economic progress.
• $urban$: Dummy variable that takes value 1 if the county is mostly urban and takes value 0 if the county is mostly rural.
The table below shows the results of estimating several regression equations by OLS, using $AQI$ as the dependent variable:
Variables                  (1)        (2)        (3)
income_capita              2.281      1.537      1.343
                           (0.493)    (0.354)    (0.295)
urban                                 14.199     12.304
                                      (3.769)    (3.514)
income_capita × urban                            0.189
                                                 (0.059)
constant                   39.5       28.1       26.1
                           (4.26)     (4.09)     (3.78)
Obs                        3,143      3,143      3,143
Note: Standard error in parentheses. This table is made up.
Answer the questions below (round your answers to two decimal places):
(a) [4 points] Column (1) regresses AQI on income per capita. Interpret the slope coefficient. Interpret the intercept (does it make sense in this case? why?).
(b) [6 points] The regression in Column (1) omits variables that may bias the estimate of the effect of income per capita on AQI. Argue why the variable $urban$ may introduce omitted variable bias (OVB). Clearly state the two necessary conditions for OVB and how they apply to this case. Column (2) adds $urban$ to the regression. Was the coefficient in Column (1) upward or downward biased?
(c) [6 points] Interpret the estimated coefficient of $urban$ in Column (2). Is this coefficient different from zero at the 5% significance level?
(d) [2 points] Column (3) adds an interaction term between the two main regressors. Interpret the coefficient on $income\_capita \times urban$.
(e) [3 points] Using the estimates from Column (3) compute the expected value of the AQI for an urban county with an income per capita of \$25,000.
Answer: (a) Slope: a \$1,000 increase in income per capita is associated with an increase in AQI of 2.281 degrees. That is, the greater the economic progress, the more polluted the air is.
Intercept: the AQI would be 39.5 degrees if income per capita were 0. In this case, it does not make sense to interpret the coefficient because income per capita will never be zero for any county.
(b) There are two conditions that should be met in order for $urban$ to cause OVB: (1) urban areas have a different income per capita than rural areas ($cov(income\_capita, urban)$ is not zero), and (2) being urban has its own effect on air quality (the coefficient on $urban$, $\beta_2$, is not zero). In this exercise, we may think that urban areas have (1) higher income per capita and also have (2) higher pollution levels. Both relationships are positive, so the coefficient in Column (1) should be upward biased.
Since the coefficient on income per capita went down from 2.281 to 1.537 after we included the variable $urban$, we can confirm that the previously estimated coefficient on income per capita was upward biased.
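A small simulation can make the direction of the bias concrete. The sketch below is not based on the exam's data: it generates hypothetical counties in which urban counties are richer and more polluted, then compares the income slope with and without controlling for `urban`. All parameter values are assumptions. The "long" slope is obtained by demeaning both variables within each group, which is equivalent to including the dummy (the Frisch-Waugh result).

```python
import random

def slope(x, y):
    """Slope from a simple OLS regression of y on x (with a constant)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    return sxy / sxx

def demean_by_group(values, groups):
    """Subtract each group's mean: the residual from regressing on a dummy plus a constant."""
    means = {}
    for g in set(groups):
        members = [v for v, gg in zip(values, groups) if gg == g]
        means[g] = sum(members) / len(members)
    return [v - means[g] for v, g in zip(values, groups)]

random.seed(1)
n = 5000
# Assumed data-generating process: urban counties have higher income (+10)
# and higher AQI (+14); the true income effect on AQI is 1.5.
urban = [1 if random.random() < 0.5 else 0 for _ in range(n)]
income = [30 + 10 * u + random.gauss(0, 5) for u in urban]
aqi = [25 + 1.5 * inc + 14 * u + random.gauss(0, 10) for inc, u in zip(income, urban)]

short_slope = slope(income, aqi)  # omits urban
long_slope = slope(demean_by_group(income, urban), demean_by_group(aqi, urban))  # controls for urban

print(round(short_slope, 2), round(long_slope, 2))
```

Under these assumptions the short regression overstates the income effect, mirroring the drop in the coefficient between Columns (1) and (2).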
(c) The coefficient on $urban$ can be interpreted as the difference in AQI between urban and rural counties holding income per capita constant: on average, urban counties have an AQI 14.199 degrees higher than rural counties with the same income per capita.
We can test the hypothesis using two alternative approaches (one is enough): (1) check whether $2 \, SE(\hat{\beta}) < |\hat{\beta} - 0|$; in this case, $7.538 < 14.199$. (2) Compute $t = (14.199 - 0)/3.769 = 3.77 > 1.96$. Using both methods, we can reject the null hypothesis that the difference in AQI between urban and rural areas is zero.
(d) The coefficient on $income\_capita \times urban$ can be interpreted as the additional impact that income per capita has on AQI in urban counties relative to the impact of income per capita on AQI in rural counties. To be very specific: it is the difference in the change in AQI that is associated with a one thousand dollar increase in income per capita in urban counties compared to the change in AQI that is associated with a one thousand dollar increase in income per capita in rural counties.
(e) The expected value can be expressed as:
$$E[aqi \mid income\_capita = 25, urban = 1]$$
$$= E[\alpha + \beta_1 income\_capita + \beta_2 urban + \beta_3 income\_capita \times urban \mid income\_capita = 25, urban = 1]$$
$$= \hat{\alpha} + \hat{\beta}_1 \times 25 + \hat{\beta}_2 + \hat{\beta}_3 \times 25$$
$$= 26.1 + 1.343 \times 25 + 12.304 + 0.189 \times 25 = 76.70$$
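The numbers in parts (c) to (e) can be reproduced directly from the table. The sketch below only re-does that arithmetic; the variable names are illustrative and the estimates are copied from Columns (2) and (3).

```python
# Estimates copied from the regression table in the question.
urban_coef_col2, urban_se_col2 = 14.199, 3.769               # Column (2)
const, b_income, b_urban, b_inter = 26.1, 1.343, 12.304, 0.189  # Column (3)

# (c) t-statistic for H0: coefficient on urban = 0, compared with 1.96.
t_urban = (urban_coef_col2 - 0) / urban_se_col2
reject_at_5pct = abs(t_urban) > 1.96

# (d) Income slopes implied by Column (3) for each county type.
slope_rural = b_income            # urban = 0
slope_urban = b_income + b_inter  # urban = 1

# (e) Predicted AQI for an urban county with income per capita of $25,000.
# Income is measured in thousands of dollars, so income = 25.
income = 25
predicted_aqi = const + b_income * income + b_urban * 1 + b_inter * income * 1

print(round(t_urban, 2), reject_at_5pct)              # 3.77 True
print(round(slope_rural, 3), round(slope_urban, 3))   # 1.343 1.532
print(round(predicted_aqi, 2))                        # 76.7
```

The interaction coefficient is exactly the gap between the two slopes (1.532 versus 1.343), which is the interpretation given in part (d).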
14. [24 points] Rutgers-New Brunswick is interested in studying the differences in academic achievement across its five campuses: Busch, College Ave, Cook, Douglass, and Livingston. We collected data for a random sample of 1,500 Rutgers students. The data contains information about their GPA ($gpa_i$) and the campus where they take classes. To simplify the problem, assume that each student can only take classes in one campus. For example, if student $i$ takes classes in College Ave, student $i$ cannot take classes in other campuses.
Consider the following regression equation (Equation (1)):
$$gpa_i = \alpha + \beta \, campus_i + u_i \quad (1)$$
where $campus_i$ is a variable that can take 5 values depending on where the student is taking classes: 1 for Busch, 2 for College Ave, 3 for Cook, 4 for Douglass, and 5 for Livingston.
Now consider the alternative regression equation below (Equation (2)):
$$gpa_i = \alpha + \beta_1 busch_i + \beta_2 cook_i + \beta_3 douglass_i + \beta_4 livingston_i + \varepsilon_i \quad (2)$$
where each of the independent variables is a dummy variable that takes value 1 if student $i$ lives in that campus and takes value 0 otherwise. The table below shows the results of estimating Equation (2) by OLS:
Variables      (1)
busch          0.221 (0.054)
cook           -0.017 (0.021)
douglass       -0.189 (0.045)
livingston     -0.010 (0.030)
constant       3.730 (0.034)
Obs            1,500
$R^2$          0.34
Note: Standard error in parentheses. This table is made up.
Answer the questions below (round your answers to two decimal places):
(a) [4 points] Can we estimate Equation (1)? If your answer is no, why? If your answer is yes, does the interpretation of $\beta$ make sense?
All the remaining questions refer to Equation (2):
(b) [4 points] In Equation (2), what is the omitted group? Why is it omitted?
(c) [6 points] How do we interpret $\hat{\alpha}$? How do we interpret $\hat{\beta}_4$?
(d) [6 points] We want to test the null hypothesis that the campus where students take classes does not matter. Write down the joint null hypothesis and the alternative hypothesis. What statistic do we need to compute to test this hypothesis (no need to compute it)? Explain how you would reject or fail to reject the null hypothesis.
(e) [4 points] Suppose that instead of $busch_i$, we included $college\_ave_i$ in the regression. In this hypothetical regression, can we know what would be the estimated $\hat{\beta}$ associated with $college\_ave_i$? If your answer is no, explain why. If your answer is yes, provide a number.
Answer: (a) We can estimate Equation (1). However, the interpretation of $\beta$ does not make sense. It would be the change in GPA associated with a one-unit increase in the campus code, for example moving from Busch to College Ave, or from Cook to Douglass, but the numbering of the campuses is arbitrary. It is not readily interpretable.
(b) The omitted group is students taking classes in College Ave. It is omitted because adding a dummy variable for each of the five campuses would result in perfect multicollinearity. That is, we would be able to write one of the dummies as a perfect linear combination of the others and the constant, because the campuses are mutually exclusive and every student belongs to exactly one of them, which violates the fourth assumption of multivariate regression.
(c) $\hat{\alpha}$ is the average GPA of students taking classes in College Ave. $\hat{\beta}_4$ is the difference in average GPAs between students taking classes in Livingston and students taking classes in College Ave.
(d) The joint null hypothesis and the alternative hypothesis are:
$$H_0: \beta_1 = 0 \text{ and } \beta_2 = 0 \text{ and } \beta_3 = 0 \text{ and } \beta_4 = 0$$
$$H_A: \beta_1 \neq 0 \text{ and/or } \beta_2 \neq 0 \text{ and/or } \beta_3 \neq 0 \text{ and/or } \beta_4 \neq 0$$
You can also express $H_A$ as: at least one of the restrictions in the joint null hypothesis does not hold. To test this, we would need to compute the F-statistic with 4 restrictions (degrees of freedom). To reject/fail to reject the null hypothesis, there are two alternatives: (1) compare the F-statistic to the critical value in the F-distribution table at the end of the textbook (if the F-statistic is higher than the critical value, we can reject the null), or (2) the statistical software will spit out a p-value associated with the F-statistic: if it is lower than 0.05, we can reject the null at the 5% significance level.
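For intuition, the sketch below computes an F-statistic for this kind of joint test on simulated data. Several things here are assumptions rather than part of the exam: the data are generated from hypothetical campus means loosely based on the table, the noise level and equal group sizes are made up, and the formula used is the homoskedasticity-only F-statistic built from restricted and unrestricted sums of squared residuals (a course may instead use the heteroskedasticity-robust version reported by software).

```python
import random

random.seed(2)
# Hypothetical average GPA per campus (constant plus coefficient from the table).
campus_means = {
    "college_ave": 3.730,
    "busch": 3.951,
    "cook": 3.713,
    "douglass": 3.541,
    "livingston": 3.720,
}
students_per_campus = 300  # assumed equal group sizes, 1,500 students in total
data = [
    (campus, random.gauss(mean, 0.3))
    for campus, mean in campus_means.items()
    for _ in range(students_per_campus)
]

n = len(data)
q = 4  # number of restrictions under H0 (all four dummy coefficients are zero)
k = 4  # number of regressors in the unrestricted model

# Restricted model: constant only, so fitted values equal the overall mean.
overall_mean = sum(gpa for _, gpa in data) / n
ssr_restricted = sum((gpa - overall_mean) ** 2 for _, gpa in data)

# Unrestricted model: campus dummies, so fitted values equal the campus means.
group_mean = {
    c: sum(gpa for cc, gpa in data if cc == c) / students_per_campus
    for c in campus_means
}
ssr_unrestricted = sum((gpa - group_mean[c]) ** 2 for c, gpa in data)

f_stat = ((ssr_restricted - ssr_unrestricted) / q) / (ssr_unrestricted / (n - k - 1))
critical_value_5pct = 2.37  # F(4, infinity) critical value at the 5% level
reject_null = f_stat > critical_value_5pct

print(round(f_stat, 2), reject_null)
```

With campus means this far apart the statistic far exceeds the critical value, so the null that campus does not matter would be rejected in the simulated sample.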
(e) Yes. Since the $\beta$ coefficients indicate the difference in the average GPA between the included groups and the omitted group, if we omit Busch and include College Ave, the coefficient for College Ave should be the negative of the coefficient for Busch, -0.221.
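The change of omitted group can be traced through the implied campus averages. The sketch below relies on the fact that, with one dummy per group except the base, the fitted values equal the group means; it then re-expresses those means relative to Busch. The dictionary and variable names are illustrative, and the coefficients are taken from the table.

```python
# Coefficients from the table; College Ave is the omitted group.
constant = 3.730
coef = {"busch": 0.221, "cook": -0.017, "douglass": -0.189, "livingston": -0.010}

# Implied average GPA by campus: the constant for the omitted group,
# constant plus coefficient for every other group.
mean_gpa = {"college_ave": constant}
mean_gpa.update({campus: constant + b for campus, b in coef.items()})

# Re-express the same averages with Busch as the omitted group.
new_base = "busch"
new_constant = mean_gpa[new_base]
new_coef = {
    campus: mean - new_constant
    for campus, mean in mean_gpa.items()
    if campus != new_base
}

print(round(new_constant, 3))              # 3.951
print(round(new_coef["college_ave"], 3))   # -0.221
print(round(new_coef["cook"], 3))          # -0.238
```

Note that the constant and the other coefficients shift as well, since every coefficient is now measured relative to Busch instead of College Ave.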