Question 2
You are interested in whether smoking potentially influences birth weight of babies.
Suppose you estimate the following regression model 1:
$$bweight_i = \alpha_0 + \alpha_1 cigs_i + \alpha_2 faminc_i + \varepsilon_i \qquad (1)$$
where "bweight" represents a child's weight at birth (in ounces), "cigs" is the number
of cigarettes smoked per day by the mother while she was pregnant, and "faminc" is
the total family income (in thousands of US dollars).
a) Provide an intuition of the model above. How would you justify the selection of
variables? Would you expect the estimated coefficients to be positive/negative, and
why?
b) The correlation between the two independent variables is -0.173. Provide an
intuition about this statistic.
c) The output of your regression is displayed in Table 3. Interpret the estimated
coefficients, discuss their significance level and goodness of fit in the regression.
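Table 3: output of regression model 1

reg bwght cigs faminc

      Source |       SS           df       MS      Number of obs   =     1,388
-------------+----------------------------------   F(2, 1385)      =     21.27
       Model |  17126.2088         2  8563.10442   Prob > F        =    0.0000
    Residual |  557485.511     1,385  402.516614   R-squared       =    0.0298
-------------+----------------------------------   Adj R-squared   =    0.0284
       Total |   574611.72     1,387  414.283864   Root MSE        =    20.063

------------------------------------------------------------------------------
       bwght |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        cigs |  -.4634075   .0915768    -5.06   0.000    -.6430518   -.2837633
      faminc |   .0927647   .0291879     3.18   0.002     .0355075    .1500219
       _cons |   116.9741   1.048984   111.51   0.000     114.9164    119.0319
------------------------------------------------------------------------------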
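
As a reading aid for part c), the summary statistics in Table 3 can be recomputed from the other reported numbers. The sketch below only copies values from Table 3 and applies the textbook formulas; nothing is re-estimated, and the variable names are illustrative.

```python
# Values copied from Table 3 (Stata output); nothing here is re-estimated.
coef = {"cigs": -0.4634075, "faminc": 0.0927647, "_cons": 116.9741}
se = {"cigs": 0.0915768, "faminc": 0.0291879, "_cons": 1.048984}

ss_model, ss_resid, ss_total = 17126.2088, 557485.511, 574611.72
df_model, df_resid = 2, 1385

# t statistic = coefficient / standard error
t_stats = {name: coef[name] / se[name] for name in coef}

# R-squared = explained sum of squares / total sum of squares
r_squared = ss_model / ss_total

# Adjusted R-squared compares residual and total variance per degree of freedom
adj_r_squared = 1 - (ss_resid / df_resid) / (ss_total / (df_model + df_resid))

# F statistic = model mean square / residual mean square
f_stat = (ss_model / df_model) / (ss_resid / df_resid)

# Root MSE = square root of the residual mean square
root_mse = (ss_resid / df_resid) ** 0.5

for name, t in t_stats.items():
    print(f"{name:>7}: t = {t:.2f}")
print(f"R2 = {r_squared:.4f}, adj R2 = {adj_r_squared:.4f}, "
      f"F = {f_stat:.2f}, Root MSE = {root_mse:.3f}")
```

The recomputed figures should agree with the printed ones up to rounding, which is also a quick check that the table was transcribed correctly.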
d) Variable "cigs" contains 122 non-zero values (representing the number of
cigarettes a mother smokes) and 1266 zero values. You decided that you want to
test the hypothesis that the average birthweight of children born to smoking mothers
is lower than the average birthweight of children born to non-smoking mothers. Explain
how you would carry out such a test.
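
One possible way to set up the test in part d) is a one-sided comparison of two group means. The sketch below is an illustration under stated assumptions, not the intended solution: the helper `one_sided_mean_test` is a hypothetical name, the data are synthetic (only the group sizes 122 and 1266 come from the question), and the p-value uses a large-sample normal approximation.

```python
import math
import random
from statistics import NormalDist, mean, variance


def one_sided_mean_test(smokers, nonsmokers):
    """Welch t statistic for H0: mu_s = mu_n against H1: mu_s < mu_n."""
    diff = mean(smokers) - mean(nonsmokers)
    std_err = math.sqrt(variance(smokers) / len(smokers)
                        + variance(nonsmokers) / len(nonsmokers))
    t_stat = diff / std_err
    # Large-sample approximation: lower-tail p-value from the standard normal.
    p_value = NormalDist().cdf(t_stat)
    return t_stat, p_value


# Synthetic data only: the group sizes follow part d); the means (110 and
# 119 ounces) and the standard deviation (20 ounces) are made-up values.
random.seed(0)
smokers = [random.gauss(110, 20) for _ in range(122)]
nonsmokers = [random.gauss(119, 20) for _ in range(1266)]

t_stat, p_value = one_sided_mean_test(smokers, nonsmokers)
print(f"t = {t_stat:.2f}, one-sided p = {p_value:.4f}")
```

A closely related route is to build a dummy variable equal to 1 when cigs > 0 and regress bwght on it; the t statistic on the dummy then tests the same difference in means (with a pooled variance unless robust standard errors are used).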
e) Do you think that an important variable (or variables) that could bias the estimated
coefficient on cigs has been left out of the regression? Explain the mechanism of omitted
variable bias (i.e., when does an omitted variable cause bias and when does it
not)?
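
For reference, the standard omitted variable bias result can be written in the notation of model (1), with its coefficients denoted α0, α1, α2. The omitted variable z and the coefficients α3 and δ are assumptions introduced only for this illustration; they do not appear in the question.

```latex
\begin{align*}
% Assumed "long" model that includes a hypothetical omitted variable z_i
bweight_i &= \alpha_0 + \alpha_1 cigs_i + \alpha_2 faminc_i + \alpha_3 z_i + \varepsilon_i \\
% Auxiliary regression of the omitted variable on the included regressors
z_i &= \delta_0 + \delta_1 cigs_i + \delta_2 faminc_i + v_i \\
% Expected value of the cigs coefficient when z_i is left out of the regression
E[\tilde{\alpha}_1] &= \alpha_1 + \alpha_3 \delta_1
\end{align*}
```

The bias term α3·δ1 is zero if the omitted variable has no effect on birth weight (α3 = 0) or if it is unrelated to cigs once faminc is controlled for (δ1 = 0); otherwise its sign is the sign of the product.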
f) Table 4 presents another regression output with a slight adjustment. Identify what
we did and explain the reason why we would do so. Which regression output
(presented in Table 3 or Table 4) would you use if you were writing a report?
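Table 4: output of regression model 2

. reg bwght cigs faminc, r

Linear regression                               Number of obs     =      1,388
                                                F(2, 1385)        =      22.11
                                                Prob > F          =     0.0000
                                                R-squared         =     0.0298
                                                Root MSE          =     20.063

------------------------------------------------------------------------------
             |               Robust
       bwght |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        cigs |  -.4634075   .0887594    -5.22   0.000     -.637525   -.2892901
      faminc |   .0927647   .0285864     3.25   0.001     .0366875     .148842
       _cons |   116.9741   1.037207   112.78   0.000     114.9395    119.0088
------------------------------------------------------------------------------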
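
To see what differs between Table 3 and Table 4, the two sets of standard errors can be placed side by side. The sketch below only copies the reported coefficients and standard errors from the two tables; the dictionary names are illustrative.

```python
# Coefficients are identical in Table 3 and Table 4.
coef = {"cigs": -0.4634075, "faminc": 0.0927647, "_cons": 116.9741}

# Standard errors as reported in Table 3 (conventional) and Table 4 (robust).
conventional_se = {"cigs": 0.0915768, "faminc": 0.0291879, "_cons": 1.048984}
robust_se = {"cigs": 0.0887594, "faminc": 0.0285864, "_cons": 1.037207}

# Ratio of robust to conventional standard error, and the implied t statistics.
se_ratio = {name: robust_se[name] / conventional_se[name] for name in coef}
t_conventional = {name: coef[name] / conventional_se[name] for name in coef}
t_robust = {name: coef[name] / robust_se[name] for name in coef}

for name in coef:
    print(f"{name:>7}: SE ratio = {se_ratio[name]:.3f}, "
          f"t (Table 3) = {t_conventional[name]:.2f}, "
          f"t (Table 4) = {t_robust[name]:.2f}")
```

Only the standard errors, and therefore the t statistics, p-values, confidence intervals and the F statistic, change between the two tables; the point estimates, R-squared and Root MSE are the same.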