11c

pdf

School

The Chinese University of Hong Kong

Course

3420

Subject

Statistics

Date

Nov 24, 2024

Type

pdf

Pages

Uploaded by nkc.alanng529

MGEC11 Week 11 & 12, R Codes
Yue Yu
Fall 2023

library(wooldridge)
library(lmtest)
library(sandwich)
library(margins)
library(car)

Estimate Logit/Probit Models

inlf: =1 if in labor force in 1975
educ: years of schooling
exper: actual labor market experience
age: woman’s age
nwifeinc: family income minus wife’s expected income
kidslt6: number of kids less than 6 years old
kidsge6: number of kids between 6 and 18 years old

Linear model

Note that when the dependent variable is binary, the errors of the linear model are heteroskedastic by construction, so we need to use heteroskedasticity-robust standard errors.

data(mroz)
model.ols <- lm(inlf ~ nwifeinc + educ + exper + expersq + age + kidslt6 + kidsge6, data = mroz)
coeftest(model.ols, vcov = vcovHC(model.ols, "HC0"))

##
## t test of coefficients:
##
##                Estimate  Std. Error t value  Pr(>|t|)
## (Intercept)  0.58551922  0.15144889  3.8661 0.0001202 ***
## nwifeinc    -0.00340517  0.00151681 -2.2450 0.0250635 *
## educ         0.03799530  0.00722734  5.2572 1.913e-07 ***
## exper        0.03949239  0.00577907  6.8337 1.722e-11 ***
## expersq     -0.00059631  0.00018899 -3.1552 0.0016683 **
## age         -0.01609081  0.00238623 -6.7432 3.108e-11 ***
## kidslt6     -0.26181047  0.03161391 -8.2815 5.626e-16 ***
## kidsge6      0.01301223  0.01346085  0.9667 0.3340215
## ---

## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Logit model

model.logit <- glm(inlf ~ nwifeinc + educ + exper + expersq + age + kidslt6 + kidsge6,
                   data = mroz, family = binomial(link = "logit"))
summary(model.logit)

##
## Call:
## glm(formula = inlf ~ nwifeinc + educ + exper + expersq + age +
##     kidslt6 + kidsge6, family = binomial(link = "logit"), data = mroz)
##
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)
## (Intercept)  0.425452   0.860365   0.495  0.62095
## nwifeinc    -0.021345   0.008421  -2.535  0.01126 *
## educ         0.221170   0.043439   5.091 3.55e-07 ***
## exper        0.205870   0.032057   6.422 1.34e-10 ***
## expersq     -0.003154   0.001016  -3.104  0.00191 **
## age         -0.088024   0.014573  -6.040 1.54e-09 ***
## kidslt6     -1.443354   0.203583  -7.090 1.34e-12 ***
## kidsge6      0.060112   0.074789   0.804  0.42154
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##     Null deviance: 1029.75  on 752  degrees of freedom
## Residual deviance:  803.53  on 745  degrees of freedom
## AIC: 819.53
##
## Number of Fisher Scoring iterations: 4

Probit model

model.probit <- glm(inlf ~ nwifeinc + educ + exper + expersq + age + kidslt6 + kidsge6,
                    data = mroz, family = binomial(link = "probit"))
summary(model.probit)

##
## Call:
## glm(formula = inlf ~ nwifeinc + educ + exper + expersq + age +
##     kidslt6 + kidsge6, family = binomial(link = "probit"), data = mroz)
##
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)
## (Intercept)  0.2700736  0.5080782   0.532  0.59503
## nwifeinc    -0.0120236  0.0049392  -2.434  0.01492 *
## educ         0.1309040  0.0253987   5.154 2.55e-07 ***
## exper        0.1233472  0.0187587   6.575 4.85e-11 ***
## expersq     -0.0018871  0.0005999  -3.145  0.00166 **

## age         -0.0528524  0.0084624  -6.246 4.22e-10 ***
## kidslt6     -0.8683247  0.1183773  -7.335 2.21e-13 ***
## kidsge6      0.0360056  0.0440303   0.818  0.41350
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##     Null deviance: 1029.7  on 752  degrees of freedom
## Residual deviance:  802.6  on 745  degrees of freedom
## AIC: 818.6
##
## Number of Fisher Scoring iterations: 4

• The signs of the coefficients are the same across the three models.
• The set of statistically significant variables is the same across the three models.
• The magnitudes of the coefficients are not directly comparable across models.
• The SE, z value (the analogue of the t value) and p-value are interpreted the same way as before.

The Average Partial/Marginal/Treatment Effect

The APE/ATE/AME of the Linear Model

In the linear model the estimated coefficients are themselves the average partial effects, so we reprint the robust coefficient table.

coeftest(model.ols, vcov = vcovHC(model.ols, "HC0"))

##
## t test of coefficients:
##
##                Estimate  Std. Error t value  Pr(>|t|)
## (Intercept)  0.58551922  0.15144889  3.8661 0.0001202 ***
## nwifeinc    -0.00340517  0.00151681 -2.2450 0.0250635 *
## educ         0.03799530  0.00722734  5.2572 1.913e-07 ***
## exper        0.03949239  0.00577907  6.8337 1.722e-11 ***
## expersq     -0.00059631  0.00018899 -3.1552 0.0016683 **
## age         -0.01609081  0.00238623 -6.7432 3.108e-11 ***
## kidslt6     -0.26181047  0.03161391 -8.2815 5.626e-16 ***
## kidsge6      0.01301223  0.01346085  0.9667 0.3340215
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The APE/ATE/AME of the Logit Model

AME_logit = margins(model.logit)
summary(AME_logit)

##    factor     AME     SE       z      p   lower   upper
##       age -0.0157 0.0024 -6.6027 0.0000 -0.0204 -0.0111
##      educ  0.0395 0.0073  5.4145 0.0000  0.0252  0.0538
##     exper  0.0368 0.0052  7.1386 0.0000  0.0267  0.0469
##   expersq -0.0006 0.0002 -3.1759 0.0015 -0.0009 -0.0002
##   kidsge6  0.0107 0.0133  0.8051 0.4207 -0.0154  0.0369
##   kidslt6 -0.2578 0.0319 -8.0696 0.0000 -0.3204 -0.1951


##  nwifeinc -0.0038 0.0015 -2.5714 0.0101 -0.0067 -0.0009

The APE/ATE/AME of the Probit Model

AME_probit = margins(model.probit)
summary(AME_probit)

##    factor     AME     SE       z      p   lower   upper
##       age -0.0159 0.0024 -6.7392 0.0000 -0.0205 -0.0113
##      educ  0.0394 0.0073  5.4186 0.0000  0.0251  0.0536
##     exper  0.0371 0.0052  7.1779 0.0000  0.0270  0.0472
##   expersq -0.0006 0.0002 -3.2050 0.0014 -0.0009 -0.0002
##   kidsge6  0.0108 0.0132  0.8189 0.4129 -0.0151  0.0367
##   kidslt6 -0.2612 0.0319 -8.1860 0.0000 -0.3237 -0.1986
##  nwifeinc -0.0036 0.0015 -2.4604 0.0139 -0.0065 -0.0007

The APEs for each variable are similar across the three models.

Using Logit/Probit Model for Prediction

Question 1: What is the predicted probability of joining the labor force for a woman with no kids less than 6 years old and all other characteristics at the sample median?

Question 2: What if she has 1 kid less than 6 years old?

We first construct a new dataset with 7 columns (one for each explanatory variable) and 2 rows. The first observation has 0 kids less than 6 years old and all other characteristics at the sample median. The second observation has 1 kid less than 6 years old and all other characteristics at the sample median.

kidslt6 = c(0, 1)  # kids less than 6 years old: 0 for the first person, 1 for the second
nwifeinc = c(median(mroz$nwifeinc), median(mroz$nwifeinc))
educ = c(median(mroz$educ), median(mroz$educ))
exper = c(median(mroz$exper), median(mroz$exper))
expersq = c(median(mroz$exper)^2, median(mroz$exper)^2)
age = c(median(mroz$age), median(mroz$age))
kidsge6 = c(median(mroz$kidsge6), median(mroz$kidsge6))
new_observation <- data.frame(kidslt6, nwifeinc, educ, exper, expersq, age, kidsge6)
outcome <- predict(model.probit, new_observation, type = "response")
outcome

##         1         2
## 0.6363523 0.3016715

outcome[1]

##         1
## 0.6363523

$\hat{P}(\text{working} \mid X_{50},\ \text{kidslt6} = 0) = 0.636$, where $X_{50}$ denotes the other regressors at their sample medians.

outcome[2]

##         2
## 0.3016715

$\hat{P}(\text{working} \mid X_{50},\ \text{kidslt6} = 1) = 0.302$
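As a rough cross-check on the two predicted probabilities above, the sketch below uses only numbers printed in these notes. It assumes the probit prediction has the form pnorm(index), so applying qnorm() to each probability recovers the linear index. Because the two rows differ only in kidslt6, the gap between the two indexes should match the probit coefficient on kidslt6. The variable names are illustrative and are not part of the original code.

```r
# Predicted probabilities copied from the predict() output above
p_kids0 <- 0.6363523   # kidslt6 = 0, other regressors at the median
p_kids1 <- 0.3016715   # kidslt6 = 1, other regressors at the median

# Change in the predicted probability when going from 0 to 1 young kid
effect <- p_kids1 - p_kids0
effect

# Probit: P = pnorm(index), so qnorm() recovers the linear index
index_kids0 <- qnorm(p_kids0)
index_kids1 <- qnorm(p_kids1)

# The rows differ only in kidslt6, so the index gap should match the coefficient
index_gap <- index_kids1 - index_kids0
beta_kidslt6 <- -0.8683247   # copied from summary(model.probit)
c(index_gap = index_gap, beta_kidslt6 = beta_kidslt6)
```

The drop of about 0.33 in predicted probability differs from the average partial effect of kidslt6 reported by margins() for the probit model (-0.2612), partly because it is a discrete change evaluated at one specific point (the median woman) rather than an effect averaged over the sample.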

Hypothesis Testing (For Week 12)

summary(model.probit)

##
## Call:
## glm(formula = inlf ~ nwifeinc + educ + exper + expersq + age +
##     kidslt6 + kidsge6, family = binomial(link = "probit"), data = mroz)
##
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)
## (Intercept)  0.2700736  0.5080782   0.532  0.59503
## nwifeinc    -0.0120236  0.0049392  -2.434  0.01492 *
## educ         0.1309040  0.0253987   5.154 2.55e-07 ***
## exper        0.1233472  0.0187587   6.575 4.85e-11 ***
## expersq     -0.0018871  0.0005999  -3.145  0.00166 **
## age         -0.0528524  0.0084624  -6.246 4.22e-10 ***
## kidslt6     -0.8683247  0.1183773  -7.335 2.21e-13 ***
## kidsge6      0.0360056  0.0440303   0.818  0.41350
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##     Null deviance: 1029.7  on 752  degrees of freedom
## Residual deviance:  802.6  on 745  degrees of freedom
## AIC: 818.6
##
## Number of Fisher Scoring iterations: 4

Example 1

Test the null hypothesis that years of schooling has no effect on one’s choice of labor force participation. Use the 5% significance level.

Null hypothesis: $\beta_{educ} = 0$
Alternative hypothesis: $\beta_{educ} \neq 0$

The z-value, which is essentially a t-statistic, is asymptotically standard normal under the null hypothesis. Therefore, we conduct a t-test using normal critical values. We can compare the absolute z-value with the critical value (about 1.96 for a two-sided test at the 5% level). Alternatively, we can compare the p-value with the significance level. Since 2.55e-07 is smaller than 0.05, we reject the null hypothesis. The conclusion is that education has a statistically significant effect on one’s labor force participation decision.

Example 2

Test the null hypothesis that the number of older kids has no effect on one’s choice of labor force participation. Use the 5% significance level.

Null hypothesis: $\beta_{kidsge6} = 0$
Alternative hypothesis: $\beta_{kidsge6} \neq 0$

We conduct a t-test. Since 0.41350 is greater than 0.05, we do not reject the null hypothesis. The conclusion is that we find no statistically significant evidence that the number of older kids affects one’s labor force participation decision.
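The decision rule used in Examples 1 and 2 can be reproduced by hand. This is a minimal sketch that takes the estimates and standard errors copied from the probit summary above; the helper name z_test is hypothetical and is not part of the original code.

```r
# Two-sided z-test of H0: beta = 0, using the asymptotic normal distribution
z_test <- function(estimate, std_error, alpha = 0.05) {
  z <- estimate / std_error
  p_value <- 2 * pnorm(-abs(z))        # two-sided p-value
  critical <- qnorm(1 - alpha / 2)     # about 1.96 when alpha = 0.05
  list(z = z, p_value = p_value, critical = critical,
       reject = abs(z) > critical)
}

# Estimates and standard errors copied from summary(model.probit)
test_educ    <- z_test(0.1309040, 0.0253987)
test_kidsge6 <- z_test(0.0360056, 0.0440303)

test_educ$reject      # decision for Example 1
test_kidsge6$reject   # decision for Example 2
```

Comparing |z| with the critical value and comparing the p-value with 0.05 always give the same decision, so either rule can be used.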

Example 3

Test the null hypothesis that working experience has no effect on one’s choice of labor force participation. Use the 5% significance level.

Null hypothesis: $\beta_{exper} = 0,\ \beta_{expersq} = 0$
Alternative hypothesis: the null is not true (at least one of the two coefficients is nonzero)

Option 1. Use a likelihood-ratio test

Use the following code to carry out a likelihood-ratio test:

# Estimate the unrestricted model first:
q1.probit <- glm(inlf ~ nwifeinc + educ + exper + expersq + age + kidslt6 + kidsge6,
                 data = mroz, family = binomial(link = "probit"))
# Estimate the restricted model:
q1.probit_ristrict <- glm(inlf ~ nwifeinc + educ + age + kidslt6 + kidsge6,
                          data = mroz, family = binomial(link = "probit"))
# Conduct LR test:
lrtest(q1.probit, q1.probit_ristrict)

## Likelihood ratio test
##
## Model 1: inlf ~ nwifeinc + educ + exper + expersq + age + kidslt6 + kidsge6
## Model 2: inlf ~ nwifeinc + educ + age + kidslt6 + kidsge6
##   #Df  LogLik Df  Chisq Pr(>Chisq)
## 1   8 -401.30
## 2   6 -454.23 -2 105.85  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The LR statistic is 105.85, with a p-value less than 2.2e-16. We reject the null hypothesis that working experience has no effect on one’s choice of labor force participation at the 5% significance level.

Option 2. Use a Wald test

Alternatively, we can conduct a Wald test:

library(car)
linearHypothesis(model.probit, c("exper=0", "expersq=0"))

## Linear hypothesis test
##
## Hypothesis:
## exper = 0
## expersq = 0
##
## Model 1: restricted model
## Model 2: inlf ~ nwifeinc + educ + exper + expersq + age + kidslt6 + kidsge6
##
##   Res.Df Df  Chisq Pr(>Chisq)
## 1    747
## 2    745  2 95.862  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


The Wald statistic is 95.862, with a p-value less than 2.2e-16. We reject the null hypothesis that working experience has no effect on one’s choice of labor force participation at the 5% significance level.
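To see where the likelihood-ratio statistic in Option 1 comes from, the sketch below recomputes it from the two log-likelihoods printed by lrtest(). Because those log-likelihoods are rounded to two decimals, the result (about 105.86) differs slightly from the reported 105.85. The reference distribution is assumed to be chi-squared with 2 degrees of freedom, one per restriction; the variable names are illustrative.

```r
# Log-likelihoods copied from the lrtest() output (rounded to two decimals)
loglik_unrestricted <- -401.30
loglik_restricted   <- -454.23
n_restrictions      <- 2      # exper and expersq are dropped

# LR = 2 * (logLik of unrestricted model - logLik of restricted model)
lr_stat <- 2 * (loglik_unrestricted - loglik_restricted)

# Compare with the 5% critical value of a chi-squared(2) distribution
critical <- qchisq(0.95, df = n_restrictions)
p_value  <- pchisq(lr_stat, df = n_restrictions, lower.tail = FALSE)

c(lr_stat = lr_stat, critical = critical, p_value = p_value)
lr_stat > critical   # TRUE means we reject the null hypothesis
```

The same chi-squared(2) critical value applies to the Wald statistic of 95.862, which is why both tests lead to the same decision here.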
