Department of Economics, Columbia University
UN3412 Introduction to Econometrics (Erden, Section 1)
Fall 2023

SOLUTIONS to Problem Set 4
______________________________________________________________________________
Please make sure to select the page number for each question when you upload your solutions to Gradescope. Otherwise it is difficult to grade your answers, and you may lose points.

1.
(12p) A “Cobb-Douglas” production function relates production ($Q$) to factors of production, capital ($K$), labor ($L$), and raw materials ($M$), and an error term $u$ using the equation
$$Q = \lambda K^{\beta_1} L^{\beta_2} M^{\beta_3} e^{u},$$
where $\lambda$, $\beta_1$, $\beta_2$, and $\beta_3$ are production parameters. Suppose that you have data on production and the factors of production from a random sample of firms with the same Cobb-Douglas production function.
(a) (5p) How would you use regression analysis to estimate the production parameters?
(b) (4p) Suppose that you would like to test that there are constant returns to scale in this industry. How would you do that?
(c) (3p) Is there a way to impose the constant returns to scale in estimating the production parameters?

Solution:
(a) Take logs of both sides: $\ln Q = \ln\lambda + \beta_1 \ln K + \beta_2 \ln L + \beta_3 \ln M + u$. This equation is linear in the parameters, so regress $\ln Q$ on $\ln K$, $\ln L$, and $\ln M$ by OLS. The slope estimates are $\hat\beta_1$, $\hat\beta_2$, and $\hat\beta_3$, and the intercept estimates $\ln\lambda$.
(b) In the log regression, test $H_0: \beta_1 + \beta_2 + \beta_3 = 1$ against $H_1: \beta_1 + \beta_2 + \beta_3 \neq 1$.
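One way to carry out the test in (b) with a single t-statistic is to rewrite the log regression so that the restriction becomes a zero restriction on one coefficient. The derivation below is only a sketch; the symbol $\theta$ is introduced here for illustration and is not part of the original solution.

```latex
\begin{align*}
\ln Q &= \ln\lambda + \beta_1 \ln K + \beta_2 \ln L + \beta_3 \ln M + u \\
      &= \ln\lambda + \beta_1 (\ln K - \ln M) + \beta_2 (\ln L - \ln M)
         + (\beta_1 + \beta_2 + \beta_3) \ln M + u \\
\ln(Q/M) &= \ln\lambda + \beta_1 \ln(K/M) + \beta_2 \ln(L/M) + \theta \ln M + u,
\qquad \theta \equiv \beta_1 + \beta_2 + \beta_3 - 1
\end{align*}
```

In this form the null hypothesis is $\theta = 0$, which can be checked with the usual t-statistic on $\ln M$. Dropping $\ln M$ from the last regression sets $\theta = 0$, that is, it estimates the model with constant returns to scale imposed.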
(c) Yes. Constrained linear regression (cnsreg in Stata, ConsReg in R) makes it possible to impose constant returns to scale on the Cobb-Douglas model.

2.
(20p) Consider the following results for a wage regression, where lwage is the natural log of average hourly earnings in US dollars, age is in years, female is a binary variable for gender, bachelor equals one for someone with a bachelor's degree and zero otherwise, and femxbac and femxage are self-explanatory interaction variables.

Regression 1:
. reg lwage age female bachelor femxbac, r

Linear regression                               Number of obs =   15316
                                                F(  4, 15311) =  861.64
                                                Prob > F      =  0.0000
                                                R-squared     =  0.1852
                                                Root MSE      =  .50507
------------------------------------------------------------------------------
             |               Robust
       lwage |      Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0259277   .0014453    17.94   0.000    .0230947    .0287606
      female |  -.2257049    .010984   -20.55   0.000   -.2472348   -.2041749
    bachelor |   .3980052   .0114889    34.64   0.000    .3754857    .4205248
     femxbac |   .1082764   .0165844     6.53   0.000    .0757691    .1407838
       _cons |   1.708492   .0434411    39.33   0.000    1.623342    1.793642
------------------------------------------------------------------------------

Regression 2:
. reg lwage female age bachelor femxage, r

Linear regression                               Number of obs =   15316
                                                F(  4, 15311) =  840.78
                                                Prob > F      =  0.0000
                                                R-squared     =  0.1848
                                                Root MSE      =   .5052
------------------------------------------------------------------------------
             |               Robust
       lwage |      Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
      female |    .331927   .0859364     3.86   0.000    .1634814    .5003726
         age |    .032957   .0019547    16.86   0.000    .0291256    .0367884
    bachelor |   .4437149   .0083212    53.32   0.000    .4274044    .4600254
     femxage |  -.0171776    .002895    -5.93   0.000   -.0228521   -.0115032
       _cons |   1.481794   .0583113    25.41   0.000    1.367497    1.596092
------------------------------------------------------------------------------

Using the appropriate regression, please answer the following questions.
(a)
(3p) What is the estimated average wage difference between females with bachelor's degrees and males with bachelor's degrees? Explain your answer.
Using Regression 1, females with a bachelor's degree are expected to earn about 11.74% less (state it as a percentage, not just 11.74) than males with a bachelor's degree, keeping age unchanged: -.2257049 + .1082764 = -.1174285.
(b) (3p) What is the estimated average wage difference between females with bachelor's degrees and females without bachelor's degrees? Explain your answer.
Using Regression 1, females with a bachelor's degree are expected to earn about 50.6% (= .3980052 + .1082764 = .5062816) more than females without a bachelor's degree, keeping age unchanged.
(c) (4p) How would you test whether there is a significant difference in wages between females with bachelor's degrees and males with bachelor's degrees? Please write the necessary Stata commands. You can use any method you want, but you must write the null hypothesis first.
$H_0: \beta_{female} + \beta_{femxbac} = 0$
There are two possible answers.
(1) “Fooling Stata”:
(i) gen new = femxbac - female
(ii) reg lwage age female bachelor new, r
(iii) Check the coefficient on female, which in this regression equals $\beta_{female} + \beta_{femxbac}$. If its p-value < 0.01 (or 0.05), reject $H_0$; otherwise do not reject $H_0$.
Rejecting the null hypothesis here means that there is a significant difference between the average expected wages of females with bachelor's degrees and those of males with bachelor's degrees.
(2) Directly use the Stata command after Regression 1: test female + femxbac = 0
(d)
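(4p) How would you test whether there is an intercept difference and whether there is a slope difference between the two estimated regression lines for males and females in Regression 2? Write the null hypothesis for each test.
To test the intercept difference: $H_0: \beta_{female} = 0$
To test the slope difference: $H_0: \beta_{femxage} = 0$
(e) (3p) As people get older, does the wage gender gap widen? Draw a sketch graph (with wage on the vertical axis and age on the horizontal axis) of what this result tells you.
The female regression line starts from a higher intercept (.331927 higher) but has a flatter slope (.032957 − .0171776 = .0157794, versus .032957 for males), so the initial gap narrows and the two lines cross at about age .331927/.0171776 ≈ 19.3. Beyond that age females are predicted to earn less than males, and that gap widens with age.
[Sketch: wage on the vertical axis, age on the horizontal axis; the female line starts higher and is flatter than the male line.]
(f) (3p) Is the coefficient on female in Regression 2 statistically significant at the 1% significance level? Please write the null hypothesis and calculate the test statistic before you answer this question. Also make sure to give the reason for your answer.
$H_0: \beta_{female} = 0$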
$t = \hat\beta_{female} / SE(\hat\beta_{female}) = .331927 / .0859364 = 3.86 > 2.58$
Therefore we reject the null hypothesis. The coefficient on female in Regression 2 is statistically significant at the 1% significance level.

3.
(30p) hprice1.dta is a data set collected from the real estate pages of the Boston Globe during 1990. These are homes that sold in the Boston, MA area. Variables are explained in Table 1. In this problem you will look at some empirical evidence on 1990 housing prices in the Boston, MA area. Note that you will need to create (generate) some new variables, which are functions of the variables in hprice1.dta.

Preliminary data analysis:
(a) (2p) Produce the scatterplot of price vs. lotsize.
[Figure: scatterplot of price (vertical axis) against size of lot in square feet (horizontal axis).]
(b) (2p) Produce the scatterplot of lprice vs. llotsize.
[Figure: scatterplot of lprice (vertical axis) against log(lotsize) (horizontal axis).]
(c) (2p) Produce the scatterplot of price vs. sqrft.
[Figure: scatterplot of price (vertical axis) against size of house in square feet (horizontal axis).]
(d) (2p) Produce the scatterplot of price vs. lsqrft.
[Figure: scatterplot of price (vertical axis) against log(sqrft) (horizontal axis).]
(e) (3p) Using the scatterplots from (a) and (b), would you suggest using the variables (i) price and lotsize or (ii) lprice and llotsize for modeling using linear regression?
The relation between price and lotsize looks nonlinear. Taking logs of both variables makes the scatter look much closer to a linear relation, the sort of thing that can be handled well by conventional multiple linear regression methods.
(f) (3p) Using the scatterplots from (c) and (d), does the relation between price and sqrft appear to be linear or nonlinear? If nonlinear, what sort of nonlinear curve might you want to explore (briefly explain)?
From (c), ignoring a couple of outliers, the relation looks linear.
(g) (4p) Regress lprice on llotsize, lsqrft, bdrms and colonial. Interpret the coefficients on (i) llotsize, (ii) lsqrft and (iii) bdrms.

. reg lprice llotsize lsqrft bdrms colonial, r

Linear regression                               Number of obs =      88
                                                F(  4,    83) =   34.50
                                                Prob > F      =  0.0000
                                                R-squared     =  0.6491
                                                Root MSE      =  .18412
------------------------------------------------------------------------------
             |               Robust
      lprice |      Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
    llotsize |   .1678189   .0440356     3.81   0.000    .0802338    .2554041
      lsqrft |   .7071931   .1090447     6.49   0.000    .4903076    .9240787
       bdrms |   .0268305    .032718     0.82   0.415   -.0382444    .0919053
    colonial |   .0537962   .0489041     1.10   0.274   -.0434721    .1510645
       _cons |  -1.349589   .8115795    -1.66   0.100   -2.963788    .2646099
------------------------------------------------------------------------------

(i) A 10% increase in lot size is predicted to increase price by 1.68%, keeping other variables constant. (The elasticity of price with respect to lot size is 0.168.)
(ii) A 10% increase in the size of the house is predicted to increase price by 7.07%, keeping other variables constant. (The elasticity of price with respect to the size of the house is 0.707.)
(iii) An additional bedroom is predicted to increase the price of the house by 2.68%, keeping other variables constant, although this coefficient is not statistically significant (t = 0.82).
(h)
(4p) Now regress lprice on llotsize, llotsize², lsqrft, lsqrft², bdrms and colonial. Interpret the coefficients on (i) llotsize and (ii) lsqrft.

. reg lprice llotsize llotsize2 lsqrft lsqrft2 bdrms colonial, r

Linear regression                               Number of obs =      88
                                                F(  6,    81) =   27.46
                                                Prob > F      =  0.0000
                                                R-squared     =  0.6756
                                                Root MSE      =  .17919
------------------------------------------------------------------------------
             |               Robust
      lprice |      Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
    llotsize |    .049371   .4969611     0.10   0.921   -.9394256    1.038168
   llotsize2 |   .0053273   .0269443     0.20   0.844   -.0482834    .0589381
      lsqrft |    -8.4956   3.052306    -2.78   0.007   -14.56873   -2.422469
     lsqrft2 |   .6025779   .2002404     3.01   0.003    .2041623    1.000994
       bdrms |   .0104218   .0323804     0.32   0.748   -.0540051    .0748486
    colonial |   .0911863      .0481     1.90   0.062   -.0045176    .1868901
       _cons |   34.40863   12.08688     2.85   0.006    10.35953    58.45772
------------------------------------------------------------------------------

The interpretation of the coefficients on llotsize and lsqrft is now complicated by the inclusion of the quadratic terms. One can say that the effect of proportional increases in lot size is increasing at an increasing rate (both coefficients are positive), except that neither term is remotely significant. Similarly, one can say that the effect of proportional increases in square footage is initially negative and shrinking in magnitude before switching sign (negative linear term, positive quadratic term). Alternatively, one can find the square footage level that minimizes lprice, holding other variables constant: lsqrft = 8.4956/(2 × .6025779) ≈ 7.05, which corresponds to roughly 1,150 square feet.
(i)
(4p) Compare the model specification in part (g) to the one in part (h).
Both regressions have problems. In (g), bdrms and colonial are insignificant; in (h), llotsize and llotsize² are also insignificant. Overall fit measures are better in part (h) (R-squared of 0.6756 versus 0.6491, Root MSE of .17919 versus .18412), but it nevertheless seems that a better model specification could be found.
(j)
(4p) Regress price on lotsize, sqrft, bdrms and bdrms². Is there an optimum number of bedrooms that maximizes (or minimizes) the price of a house? (Hint: check the sign of the quadratic term.)

. reg price lotsize sqrft bdrms bdrms2, r

Linear regression                               Number of obs =      88
                                                F(  4,    83) =   16.55
                                                Prob > F      =  0.0000
                                                R-squared     =  0.6794
                                                Root MSE      =  59.544
------------------------------------------------------------------------------
             |               Robust
       price |      Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
     lotsize |   .0020749   .0012125     1.71   0.091   -.0003367    .0044866
       sqrft |   .1221177     .01709     7.15   0.000    .0881265    .1561089
       bdrms |   -40.2742   48.31517    -0.83   0.407   -136.3711    55.82273
      bdrms2 |   6.771337   6.532258     1.04   0.303   -6.221062    19.76374
       _cons |   81.67705   91.26611     0.89   0.373   -99.84758    263.2017
------------------------------------------------------------------------------

The coefficient on bdrms² is positive, so the estimated relationship is U-shaped and has a minimum rather than a maximum. Setting the derivative −40.2742 + 2 × 6.771337 × bdrms equal to zero gives bdrms = 40.2742/(2 × 6.771337) ≈ 2.97, so approximately 3 bedrooms minimize the price of a house. There is no price-maximizing value!

4.
(18p) Estimate the regressions in Table 2 and fill in the empty entries. You may write in the entries by hand or type them using the .doc electronic version of the table on the course Web site.

Problem Set 4, Table 2
Determinants of Housing Prices
Dependent variable: lprice in all columns

Regressor                 (1)        (2)        (3)        (4)        (5)
lsqrft                    0.873**    0.762**    -6.796*    0.749**    0.752**
                          (0.098)    (0.077)    (2.974)    (0.081)    (0.083)
(lsqrft)²                 __         __         0.494*     __         __
                                                (0.194)
llotsize                  __         0.168**    0.185      0.056      0.163
                                     (0.038)    (0.382)    (0.562)    (0.544)
(llotsize)²               __         __         -0.002     0.006      0.001
                                                (0.021)    (0.031)    (0.030)
colonial                  __         __         __         0.068      -0.008
                                                           (0.047)    (0.085)
victorian                 __         __         __         __         -0.117
                                                                      (0.092)
Intercept                 -0.975     -1.640*    27.242*    -1.070     -1.553
                          (0.745)    (0.681)    (11.730)   (2.671)    (2.593)

F-statistics testing the hypothesis that the population coefficients on the indicated regressors are all zero:
lsqrft, (lsqrft)²         __         __         55.68      __         __
                                                (0.000)
llotsize, (llotsize)²     __         __         9.94       7.89       9.56
                                                (0.0001)   (0.001)    (0.000)
Style dummies             __         __         __         __         2.51
(colonial and victorian)                                              (0.088)

Regression summary statistics
Adjusted R²               0.548      0.627      0.639      0.629      0.635
R²                        0.553      0.635      0.655      0.646      0.656
SER                       0.204      0.185      0.183      0.185      0.183
n                         88         88         88         88         88

SER = (RMSE^2*(88/86))^(1/2)
Notes: Heteroskedasticity-robust standard errors are given in parentheses under estimated coefficients, and p-values are given in parentheses under F-statistics. The F-statistics are heteroskedasticity-robust. Coefficients are significant at the +10%, *5%, **1% significance level.
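The significance markers in Table 2 can be checked directly from the reported coefficients and standard errors. The short Python sketch below recomputes a few t-statistics and assigns the marker from the table notes. It assumes two-sided large-sample normal critical values (1.645, 1.96, 2.58); the names `TABLE2`, `t_stat`, and `stars` are illustrative and not part of the original solutions.

```python
# Coefficient and robust standard error pairs copied from Table 2.
TABLE2 = {
    ("(1)", "lsqrft"): (0.873, 0.098),
    ("(3)", "lsqrft"): (-6.796, 2.974),
    ("(3)", "lsqrft^2"): (0.494, 0.194),
    ("(3)", "llotsize^2"): (-0.002, 0.021),
}

# Assumed two-sided large-sample normal critical values, strictest first.
CRITICAL = [(2.58, "**"), (1.96, "*"), (1.645, "+")]


def t_stat(coef, se):
    """t-statistic for the null hypothesis that the coefficient is zero."""
    return coef / se


def stars(t):
    """Return the significance marker used in the table notes ('' if none)."""
    for cutoff, mark in CRITICAL:
        if abs(t) > cutoff:
            return mark
    return ""


if __name__ == "__main__":
    for (column, name), (coef, se) in TABLE2.items():
        t = t_stat(coef, se)
        print(f"{column} {name:<11} t = {t:7.3f} {stars(t)}")
```

Because the table entries are rounded to three decimals, t-statistics computed this way can differ slightly from those based on the unrounded Stata output.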
5.
(20p) Use the results in Table 2 to answer the following questions.
(a) (4p) Using regression (1), test the hypothesis that the coefficient on lsqrft is zero, against the alternative that it is nonzero, at the 5% significance level. Explain in words what the coefficient means.
t = 0.873/0.098 = 8.908 > 1.96, so the hypothesis is rejected at the 5% (and also the 1%) significance level. The estimated coefficient, 0.873, means that a 1% increase in sqrft is associated with a 0.873% increase in house prices.
(b) (4p) Using regression (3), test the hypothesis that the coefficients on lsqrft and lsqrft² are both zero, against the alternative that one or the other coefficient is nonzero, at the 5% significance level.
F = 55.68 with a p-value of 0.000, so the hypothesis that both coefficients are zero (holding constant llotsize and llotsize²) is rejected at the 5% (and also the 1%) significance level.
(c) (4p) Using regression (3), is there evidence that the relationship between lprice and llotsize is nonlinear?
No. The t-statistic testing the hypothesis that llotsize² has a zero coefficient is -0.002/0.021 = -0.095, so the coefficient is not significant even at the 10% significance level.
(d) (4p) Using regression (3), is there evidence that the relationship between lprice and lsqrft is nonlinear?
Yes. The t-statistic testing the hypothesis that lsqrft² has a zero coefficient is 0.494/0.194 = 2.546 > 1.96, so the coefficient is significant at the 5% significance level.
(e) (4p) Using regression (5), test the null hypothesis (at the 5% significance level) that the coefficients on the “style dummies” (colonial and victorian) are all zero, against the alternative hypothesis that at least one is nonzero. What is the number of restrictions q in your test? What is the critical value of your test?
F = 2.51 with p-value = 0.088, so the hypothesis is not rejected at the 5% significance level. The number of restrictions is the number of coefficients that are zero under the null, here q = 2 (the coefficients on colonial and victorian). The 5% critical value of the $F_{2,82}$ distribution is approximately 3.10.

Table 1
DATA DESCRIPTION, FILE: hprice1.dta

Variable      Definition
price         House price, in $1000.
assess        Assessed value, in $1000.
bdrms         Number of bedrooms.
lotsize       Size of lot in square feet.
sqrft         Size of house in square feet.
victorian     = 1 if house is in Victorian style, = 0 otherwise.
colonial      = 1 if house is in Colonial style, = 0 otherwise.
lprice        Log(price)
lassess       Log(assess)
llotsize      Log(lotsize)
lsqrft        Log(sqrft)

The following questions will not be graded. They are for you to practice and will be discussed at the recitation:
1. SW Empirical Exercise 8.1
2. SW Empirical Exercise 8.2

Solutions:
(1)
clear
*************************************************************
* Empirical Exercise 8.1
*************************************************************
* (Note: Change path name so that it is appropriate for your computer)
use "/Users/seyhanerden/Documents/COLUMBIA ECONOMETRICS/Problem Sets/Problem Sets Fall 2023/Problem Set 4 - Nonlinear Regression/Solutions to Practice questions 8.1 and 8.2/lead_mortality.dta"
* Interaction of lead with pH, and log of population
gen lead_ph = lead*ph
gen ln_pop = ln(population)
* (a) Difference in mean infant mortality, lead vs. no-lead cities
ttest infrate, by(lead) unequal unpaired
* (b) Regression with the lead x pH interaction
regress infrate lead ph lead_ph, r
dis "Adjusted Rsquared = " e(r2_a)
test lead lead_ph
* (b)(vi) Recenter pH at 6.5 so the coefficient on lead is the effect at pH = 6.5
gen lead_ph_65 = lead*(ph-6.5)
regress infrate lead ph lead_ph_65, r
* (c) Add demographic controls
regress infrate lead ph lead_ph ln_pop, r
dis "Adjusted Rsquared = " e(r2_a)
test lead lead_ph
regress infrate lead ph lead_ph ln_pop age typhoid_rate np_tub_rate foreign_share precipitation temperature, r
dis "Adjusted Rsquared = " e(r2_a)
test lead lead_ph

(a) The table shows the sample mean ($\bar{Y}$) and its standard error for lead and no-lead cities. The difference in the sample means is 0.022 with a standard error of 0.024. The estimate implies that cities with lead pipes have a larger infant mortality rate (by 0.02 deaths per 100 people in the population), but the standard error is large (0.024) and the difference is not statistically significant (t = 0.022/0.024 ≈ 0.9).

              n      Mean     SE(Mean)
Lead          117    0.403    0.014
No Lead       55     0.381    0.020
Difference           0.022    0.024

(b) The regression is
$$\widehat{\text{Infrate}} = \underset{(0.150)}{0.919} + \underset{(0.208)}{0.462} \times \text{lead} - \underset{(0.021)}{0.075} \times \text{pH} - \underset{(0.028)}{0.057} \times \text{lead} \times \text{pH}$$
(i) The first coefficient is the intercept, which shows the level of Infrate when lead = 0 and pH = 0. It dictates the level of the regression line. The second and fourth coefficients measure the effect of lead on the infant mortality rate. Comparing 2 cities, one with lead pipes (lead = 1) and one without lead pipes (lead = 0), but with the same pH, the difference in predicted infant mortality rate is 0.462 − 0.057 × pH. The third and fourth coefficients measure the effect of pH on the infant mortality rate. Comparing 2 cities, one with pH = 6 and the other with pH = 5, but with the same value of lead, the difference in predicted infant mortality rate is −0.075 − 0.057 × lead, so the difference is −0.075 for cities without lead pipes and −0.132 for cities with lead pipes.
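The two partial effects described in (i) can be evaluated at any pH or lead value. The Python sketch below does this with the rounded point estimates from the estimated regression; the function names and the pH values in the loop are illustrative assumptions, not values taken from the data.

```python
# Rounded point estimates from the regression of Infrate on lead, pH, lead x pH.
B_LEAD = 0.462      # coefficient on lead
B_PH = -0.075       # coefficient on pH
B_LEAD_PH = -0.057  # coefficient on the lead x pH interaction


def lead_effect(ph):
    """Predicted Infrate difference (lead city minus no-lead city) at a given pH."""
    return B_LEAD + B_LEAD_PH * ph


def ph_effect(lead):
    """Predicted change in Infrate for a one-unit increase in pH (lead is 0 or 1)."""
    return B_PH + B_LEAD_PH * lead


if __name__ == "__main__":
    print(f"pH effect without lead pipes: {ph_effect(0):.3f}")
    print(f"pH effect with lead pipes:    {ph_effect(1):.3f}")
    for ph in (6.0, 7.0, 8.0):  # illustrative pH values only
        print(f"lead effect at pH {ph:.1f}: {lead_effect(ph):.3f}")
```

The lead effect shrinks as pH rises because the interaction coefficient is negative, and it changes sign at pH = 0.462/0.057, about 8.1.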
(ii) [Figure: predicted infant mortality rate against pH. Solid: cities with lead pipes. Dashed: cities without lead pipes.]
The infant mortality rate is higher for cities with lead pipes, but the difference declines as the pH level increases. For example:
The 10th percentile of pH is 6.4. At this level, the difference in infant mortality rates is 0.462 − 0.057 × 6.4 = 0.097.
The 50th percentile of pH is 7.5. At this level, the difference in infant mortality rates is 0.462 − 0.057 × 7.5 ≈ 0.035.
The 90th percentile of pH is 8.2. At this level, the difference in infant mortality rates is 0.462 − 0.057 × 8.2 = −0.005.
(iii) The F-statistic for the coefficient on lead and the interaction term is F = 3.94, which has a p-value of 0.02, so the coefficients are jointly statistically significantly different from zero at the 5% but not the 1% significance level.
(iv) The interaction term has a t-statistic of t = −2.02, so the coefficient is significant at the 5% but not the 1% significance level.
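(v) The mean of pH is 7.5. At this level, the difference in infant mortality rates is 0.462 − 0.057 × 7.5 ≈ 0.035. The standard deviation of pH is 0.69, so that the mean plus 1 standard deviation is 8.19 and the mean minus 1 standard deviation is 6.81. The differences in infant mortality rates at these pH levels are 0.462 − 0.057 × 6.81 = 0.074 and 0.462 − 0.057 × 8.19 = −0.005.
(vi) Write the regression as
$$\text{Infrate} = \beta_0 + \beta_1 \text{lead} + \beta_2 \text{pH} + \beta_3 \text{lead} \times \text{pH} + u,$$
so the effect of lead on Infrate is $\beta_1 + \beta_3 \text{pH}$. Thus, we want to construct a 95% confidence interval for $\beta_1 + 6.5\beta_3$. Using method 2 of Section 7.3, add and subtract $6.5\beta_3 \text{lead}$ to the regression to obtain
$$\text{Infrate} = \beta_0 + (\beta_1 + 6.5\beta_3)\,\text{lead} + \beta_2 \text{pH} + \beta_3 (\text{lead} \times \text{pH} - 6.5\,\text{lead}) + u,$$
or
$$\text{Infrate} = \beta_0 + (\beta_1 + 6.5\beta_3)\,\text{lead} + \beta_2 \text{pH} + \beta_3\,\text{lead} \times (\text{pH} - 6.5) + u.$$
The estimated regression is
$$\widehat{\text{Infrate}} = \underset{(0.150)}{0.919} + \underset{(0.033)}{0.092} \times \text{lead} - \underset{(0.021)}{0.075} \times \text{pH} - \underset{(0.028)}{0.057} \times \text{lead} \times (\text{pH} - 6.5)$$
and the 95% confidence interval for the coefficient on lead (which is $\beta_1 + 6.5\beta_3$) is 0.092 ± 1.96 × 0.033, that is, 0.027 to 0.157.
(c) There are several demographic variables in the dataset. You should add these and see whether the conclusions from (b) change in an important way.

(2)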
clear
*************************************************************
* Empirical Exercise 8.2
*************************************************************
* (Note: Change path name so that it is appropriate for your computer)
use "/Users/seyhanerden/Documents/COLUMBIA ECONOMETRICS/Problem Sets/Problem Sets Fall 2023/Problem Set 4 - Nonlinear Regression/Solutions to Practice questions 8.1 and 8.2/CPS2015.dta"
* (a) Linear specification
reg ahe age female bachelor, r
* (b) Log-linear specification
gen lahe = log(ahe)
reg lahe age female bachelor, r
predict yhatb
* (c) Log-log specification in age
gen lage = log(age)
reg lahe lage female bachelor, r
predict yhatc
* (d) Quadratic in age
gen age2 = age^2
reg lahe age age2 female bachelor, r
predict yhatd
* (h) Plot the fitted values from (b), (c), and (d)
* men with a high school diploma
gen malehighscool = (!female & !bachelor)
twoway scatter yhatb age if malehighscool || scatter yhatc age if malehighscool || scatter yhatd age if malehighscool
* women with a bachelor's degree
gen femalebachelor = female*bachelor
twoway scatter yhatb age if femalebachelor || scatter yhatc age if femalebachelor || scatter yhatd age if femalebachelor
* (i) Add the female x bachelor interaction
reg lahe age age2 female bachelor femalebachelor, r

This table contains the results from the regressions that are referenced in these answers. Entries shown as [?] are illegible in the source.

Data from 2015
Dependent variable: AHE in column (1); ln(AHE) in columns (2) to (8)

Regressor           (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)
Age                 0.531      0.024                 0.134      0.135      0.139      0.160      0.160
                    (0.045)    (0.002)               (0.046)    (0.046)    (0.06)     (0.063)    (0.072)
Age²                                                 -0.0019    -0.0019    -0.0018    -0.0023    -0.0023
                                                     (0.00077)  (0.00077)  (0.0010)   (0.0011)   (0.0012)
ln(Age)                                   0.72
                                          (0.06)
Female × Age                                                               -[?]                  [?]
                                                                           (0.091)               ([?])
Female × Age²                                                              -0.0001               -0.0002
                                                                           (0.0015)              (0.0016)
Bachelor × Age                                                                        -[?]       -[?]
                                                                                      (0.091)    (0.093)
Bachelor × Age²                                                                       0.0009     0.0008
                                                                                      (0.0015)   (0.0016)
Female              -4.14      -0.18      -0.18      -0.18      -0.19      -0.03      -0.19      -0.04
                    (0.26)     (0.01)     (0.01)     (0.01)     (0.02)     (1.33)     (0.02)     (1.36)
Bachelor            9.85       0.46       0.46       0.46       0.45       0.45       1.09       0.94
                    (0.26)     (0.01)     (0.01)     (0.01)     (0.02)     (0.02)     (1.34)     (1.36)
Female × Bachelor                                               0.023      0.023      0.024      0.024
                                                                (0.023)    (0.023)    (0.023)    (0.023)

F-statistics and p-values on joint hypotheses
Terms involving Age                                  76.6       76.8       38.8       38.52      26.0
                                                     (0.00)     (0.00)     (0.00)     (0.00)     (0.00)
Female × Age, Age²                                                         2.64                  3.18
                                                                           (0.07)                (0.04)
Bachelor × Age, Age²                                                                  1.03       1.57
                                                                                      (0.36)     (0.21)

SER                 10.92      0.48       0.48       0.48       0.48       0.48       0.48       0.48
Adjusted R²         0.19       0.21       0.21       0.21       0.21       0.21       0.21       0.21

Note: An intercept is included in all regressions. Sample size is n = 7098.
(a) The regression results for this question are shown in column (1) of the table. If Age increases from 25 to 26, earnings are predicted to increase by $0.531 per hour. If Age increases from 33 to 34, earnings are predicted to increase by $0.531 per hour. These values are the same because the regression is a linear function relating AHE and Age.
(b) The regression results for this question are shown in column (2) of the table. If Age increases from 25 to 26, ln(AHE) is predicted to increase by 0.024, so earnings are predicted to increase by 2.4%. If Age increases from 34 to 35, ln(AHE) is predicted to increase by 0.024, so earnings are predicted to increase by 2.4%. These values, in percentage terms, are the same because the regression is a linear function relating ln(AHE) and Age.
(c) The regression results for this question are shown in column (3) of the table. If Age increases from 25 to 26, then ln(Age) has increased by ln(26) − ln(25) = 0.0392 (or 3.92%). The predicted increase in ln(AHE) is 0.72 × 0.0392 = 0.028. This means that earnings are predicted to increase by 2.8%. If Age increases from 34 to 35, then ln(Age) has increased by ln(35) − ln(34) = 0.0290 (or 2.90%). The predicted increase in ln(AHE) is 0.72 × 0.0290 = 0.021. This means that earnings are predicted to increase by 2.1%.
(d) The regression results for this question are shown in column (4) of the table. When Age increases from 25 to 26, the predicted change in ln(AHE) is
(0.134 × 26 − 0.0019 × 26²) − (0.134 × 25 − 0.0019 × 25²) = 0.037.
This means that earnings are predicted to increase by 3.7%. When Age increases from 33 to 34, the predicted change in ln(AHE) is
(0.134 × 34 − 0.0019 × 34²) − (0.134 × 33 − 0.0019 × 33²) = 0.007.
This means that earnings are predicted to increase by 0.7%.
(e) The regressions differ in their choice of one of the regressors. They can be compared on the basis of the adjusted R². The regression in (3) has a (marginally) higher adjusted R², so it is preferred.
(f) The regression in (4) adds the variable Age² to regression (2). The coefficient on Age² is statistically significant (t = −2.41). This suggests that (4) is preferred to (2).
(g) The regressions differ in their choice of the regressors (ln(Age) in (3) and Age and Age² in (4)). They can be compared on the basis of the R². The regression in (4) has a (marginally) higher R², so it is preferred.
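The predicted changes in ln(AHE) reported in (b), (c), and (d) follow mechanically from the age coefficients. The Python sketch below reproduces the arithmetic for the three specifications, using the rounded coefficients quoted in those answers; the function names are illustrative and not part of the original solutions.

```python
import math

# Rounded age coefficients as quoted in answers (b), (c), and (d).
B_LINEAR = 0.024                 # column (2): ln(AHE) on Age
B_LOG = 0.72                     # column (3): ln(AHE) on ln(Age)
B_AGE, B_AGE2 = 0.134, -0.0019   # column (4): ln(AHE) on Age and Age^2


def change_log_linear(age_from, age_to):
    """Change in predicted ln(AHE) when Age enters linearly."""
    return B_LINEAR * (age_to - age_from)


def change_log_log(age_from, age_to):
    """Change in predicted ln(AHE) when Age enters as ln(Age)."""
    return B_LOG * (math.log(age_to) - math.log(age_from))


def change_quadratic(age_from, age_to):
    """Change in predicted ln(AHE) when Age and Age^2 both enter."""
    def age_part(age):
        return B_AGE * age + B_AGE2 * age ** 2
    return age_part(age_to) - age_part(age_from)


if __name__ == "__main__":
    for a0, a1 in [(25, 26), (33, 34), (34, 35)]:
        print(
            f"Age {a0} -> {a1}: "
            f"linear {change_log_linear(a0, a1):.3f}, "
            f"log {change_log_log(a0, a1):.3f}, "
            f"quadratic {change_quadratic(a0, a1):.3f}"
        )
```

Only the age terms are needed because Female, Bachelor, and the intercept cancel when the same person is compared at two ages.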
(h) [Figure: fitted regression functions against Age. (2) Black line; (3) Blue dashed; (4) Red dots.]
The regression functions are very similar. The quadratic regression shows somewhat more curvature than the log-log regression, but the difference is small. The regression functions for a female with a high school diploma will look just like these, but they will be shifted by the amount of the coefficient on the binary regressor Female. The regression functions for workers with a bachelor's degree will also look just like these, but they would be shifted by the amount of the coefficient on the binary variable Bachelor.
(i) This regression is shown in column (5). The coefficient on the interaction term Female × Bachelor shows the “extra effect” of Bachelor on ln(AHE) for women relative to the effect for men.
Predicted values of ln(AHE), with the Female, Bachelor, and Female × Bachelor terms set to 1 or 0 for each person:
Alexis (female, bachelor's degree): 0.104 × 30 − 0.0013 × 30² − 0.24 × 1 + 0.40 × 1 + 0.090 × 1 + 0.80 = 3.00
Jane (female, high school): 0.104 × 30 − 0.0013 × 30² − 0.24 × 1 + 0.40 × 0 + 0.090 × 0 + 0.80 = 2.51
Bob (male, bachelor's degree): 0.104 × 30 − 0.0013 × 30² − 0.24 × 0 + 0.40 × 1 + 0.090 × 0 + 0.80 = 3.15
Jim (male, high school): 0.104 × 30 − 0.0013 × 30² − 0.24 × 0 + 0.40 × 0 + 0.090 × 0 + 0.80 = 2.75
Difference in ln(AHE): Alexis − Jane = 3.00 − 2.51 = 0.49
Difference in ln(AHE): Bob − Jim = 3.15 − 2.75 = 0.40
Notice that the difference between these two differences is 0.49 − 0.40 = 0.09, which is the value of the coefficient on the interaction term.
(j) This regression is shown in (6), which includes two additional regressors: the interactions of Female
and the age variables, Age and Age². The F-statistic testing the restriction that the coefficients on these interaction terms are equal to zero is F = 4.14 with a p-value of 0.02. This implies that there is statistically significant evidence (at the 5% but not the 1% level) that the effect of Age on ln(AHE) differs for men and women.
(k) This regression is shown in (7), which includes two additional regressors that are interactions of Bachelor and the age variables, Age and Age². The F-statistic testing the restriction that the coefficients on these interaction terms are zero is 1.30 with a p-value of 0.27. This implies that there is not statistically significant evidence (at the 10% level) that the effect of Age on ln(AHE) differs for high school and college graduates.
(l) The estimated regressions suggest that earnings increase as workers age from 25–35, the range of age studied in this sample. Gender and education are significant predictors of earnings, and there are statistically significant interaction effects between age and gender and between gender and education. The figure below shows the regression's predicted value of ln(AHE) for males and females with high school and college degrees from (6).
[Figure: predicted ln(AHE) against Age. Green (line): male with college degree. Red (dots): female with college degree. Blue (dashed): male without college degree. Black (line): female without college degree.]
The table below summarizes the regression's predictions for increases in earnings as a person ages from 25 to 32 and from 32 to 34.

                          Predicted ln(AHE) at Age     Predicted increase in ln(AHE)
                                                       (percentage points per year)
Gender, Education         25      32      34           25 to 32     32 to 34
Females, High School      2.36    2.52    2.53         2.3          0.4
Males, High School        2.60    2.78    2.84         2.5          3.3
Females, BA               2.81    3.03    3.02         3.1          -0.4
Males, BA                 2.96    3.19    3.24         3.3          2.6

Earnings for those with a college education are higher than those with a high school degree, and earnings of the college educated increase more rapidly early in their careers (age 25–34). Earnings for men are higher than those of women, and earnings of men increase more rapidly early in their careers (age 25–34). For all categories of workers (men/women, high school/college) earnings increase more rapidly from age 25–32 than from 32–34. While the percentage increase in women's earnings is similar to the percentage increase for men from age 25–32, women's earnings tend to stagnate from age 32–34, while men's continue to increase.