Suppose we have a random sample of 1000 married couples. Some of the couples had been randomly chosen (by a coin flip) to receive a gift of free access to a very large library of electronic books for the whole year 2015. (Having received the gift, husbands and wives can access and read the books from the library independently of each other.) We are interested in estimating the effect of the free access to the library on the total number of books (Yᵢ) the individual i has read in 2015. Let Xᵢ = 1 if the individual i has received the gift and Xᵢ = 0 otherwise, i = 1,..., 2000. (a) First consider the subsample of wives only (i.e., 1000 observations). James says that because Xᵢ has been randomly assigned we can estimate the effect by the simple OLS regression of Yᵢ on Xᵢ. Bob disagrees and says that the number of books read in 2015 will be closely related to the number of books the person reads in a typical year, regardless of having received the gift. Therefore, argues Bob, to estimate the effect we must include the number of books read by the person in a typical year into the regression. Who is right and why? (b) Now consider the full sample with 2000 observations. Suppose we are not concerned about the omitted variable bias, and we estimate the effect of the gift on the number of books an individual read in 2015 by running an OLS regression of Yᵢ on Xᵢ. Amy says that we should use clustered standard errors for this regression, even though we do not use fixed effects regression here. Helen disagrees, saying that the usual heteroskedasticity-robust standard errors are most appropriate, especially given that Xᵢ has been randomly assigned. Who is right and why?

Question
Note: The question should not be hand written.
Answer the following questions. The questions can be answered independently of each other. The questions
do not require more than 100 words for a complete answer but please be precise.
1. Suppose we have a random sample of 1000 married couples. Some of the couples had been
randomly chosen (by a coin flip) to receive a gift of free access to a very large library of electronic
books for the whole year 2015. (Having received the gift, husbands and wives can access and read the
books from the library independently of each other.) We are interested in estimating the effect of the
free access to the library on the total number of books (Yᵢ) the individual i has read in 2015. Let
Xᵢ = 1 if the individual i has received the gift and Xᵢ = 0 otherwise, i = 1,..., 2000.
(a) First consider the subsample of wives only (i.e., 1000 observations). James says that because Xᵢ
has been randomly assigned we can estimate the effect by the simple OLS regression of Yᵢ on Xᵢ. Bob
disagrees and says that the number of books read in 2015 will be closely related to the number of
books the person reads in a typical year, regardless of having received the gift. Therefore, argues
Bob, to estimate the effect we must include the number of books read by the person in a typical
year in the regression. Who is right and why?
(b) Now consider the full sample with 2000 observations. Suppose we are not concerned about
omitted variable bias, and we estimate the effect of the gift on the number of books an individual
read in 2015 by running an OLS regression of Yᵢ on Xᵢ. Amy says that we should use clustered
standard errors for this regression, even though we do not use fixed effects regression here. Helen
disagrees, saying that the usual heteroskedasticity-robust standard errors are most appropriate,
especially given that Xᵢ has been randomly assigned. Who is right and why?
2. Briefly explain the advantages of Logit and Probit regressions over the Linear Probability
Model regression.
3. Remember that in the CA school dataset we have used, the variable testscr is the average school
district test score, and str is the student-to-teacher ratio. Suppose we try to run the command "regress
testscr testscr str". Does this regression suffer from perfect multicollinearity?
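The setup in the married-couples question can be explored with a small simulation. The sketch below is only an illustration on made-up data, not the expert solution: the true effect of 3 books, the Poisson "typical year" reading habit, the shared couple-level shock, and the helper names `ols`, `sandwich_se` and `simulate` are all assumptions introduced here. It compares the wives-only regression with and without the typical-year control, and then compares heteroskedasticity-robust and couple-clustered standard errors on the full sample.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients and residuals for design matrix X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta

def sandwich_se(X, resid, groups=None):
    """HC0 robust SEs, or cluster-robust SEs when groups is given.
    No finite-sample correction is applied (a simplification)."""
    bread = np.linalg.inv(X.T @ X)
    scores = X * resid[:, None]              # per-observation x_i * u_i
    if groups is not None:
        # Sum the scores within each cluster before forming the "meat".
        summed = np.zeros((groups.max() + 1, X.shape[1]))
        np.add.at(summed, groups, scores)
        scores = summed
    meat = scores.T @ scores
    return np.sqrt(np.diag(bread @ meat @ bread))

def simulate(rng, n_couples=1000, effect=3.0):
    """Hypothetical data-generating process (all numbers are assumptions)."""
    n = 2 * n_couples
    couple = np.repeat(np.arange(n_couples), 2)          # couple id per person
    is_wife = np.tile([True, False], n_couples)
    # One coin flip per couple: both spouses share the same X.
    gift = np.repeat(rng.integers(0, 2, n_couples), 2).astype(float)
    typical = rng.poisson(10, n).astype(float)           # books in a typical year
    shared = np.repeat(rng.normal(0, 3, n_couples), 2)   # couple-level shock
    y = 2 + effect * gift + 0.8 * typical + shared + rng.normal(0, 2, n)
    return y, gift, typical, couple, is_wife

rng = np.random.default_rng(0)
y, gift, typical, couple, is_wife = simulate(rng)

# (a) Wives only: Y on X, with and without the typical-year control.
ones = np.ones(is_wife.sum())
b_short, _ = ols(np.column_stack([ones, gift[is_wife]]), y[is_wife])
b_long, _ = ols(np.column_stack([ones, gift[is_wife], typical[is_wife]]), y[is_wife])

# (b) Full sample: robust versus couple-clustered standard errors.
X = np.column_stack([np.ones_like(y), gift])
b_full, resid = ols(X, y)
se_robust = sandwich_se(X, resid)
se_cluster = sandwich_se(X, resid, groups=couple)

print("wives, no control:  ", b_short[1])
print("wives, with control:", b_long[1])
print("full sample SE (robust, clustered):", se_robust[1], se_cluster[1])
```

Under these assumptions both wives-only estimates should land near the simulated effect, since the coin flip makes the gift independent of typical reading habits; the control mainly affects precision. Because the gift is assigned per couple and the simulated spouses share a common shock, the clustered standard error is expected to come out larger than the robust one.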