MGEC11 Week 11 & 12, R Codes
Yue Yu
Fall 2023

library(wooldridge)
library(lmtest)
library(sandwich)
library(margins)
library(car)

Estimate Logit/Probit Models

inlf: =1 if in labor force in 1975
educ: years of schooling
exper: actual labor market experience
age: woman’s age
nwifeinc: family income minus wife’s expected income
kidslt6: number of kids less than 6 years old
kidsge6: number of kids between 6 and 18 years old

Linear model

Note that when the dependent variable is binary, the errors of the linear model are heteroskedastic by construction, so we need to use heteroskedasticity-robust standard errors.

data(mroz)
model.ols <- lm(inlf ~ nwifeinc + educ + exper + expersq + age + kidslt6 + kidsge6,
                data = mroz)
coeftest(model.ols, vcov = vcovHC(model.ols, "HC0"))

##
## t test of coefficients:
##
##                Estimate  Std. Error t value  Pr(>|t|)
## (Intercept)  0.58551922  0.15144889  3.8661 0.0001202 ***
## nwifeinc    -0.00340517  0.00151681 -2.2450 0.0250635 *
## educ         0.03799530  0.00722734  5.2572 1.913e-07 ***
## exper        0.03949239  0.00577907  6.8337 1.722e-11 ***
## expersq     -0.00059631  0.00018899 -3.1552 0.0016683 **
## age         -0.01609081  0.00238623 -6.7432 3.108e-11 ***
## kidslt6     -0.26181047  0.03161391 -8.2815 5.626e-16 ***
## kidsge6      0.01301223  0.01346085  0.9667 0.3340215
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Logit model

model.logit <- glm(inlf ~ nwifeinc + educ + exper + expersq + age + kidslt6 + kidsge6,
                   data = mroz, family = binomial(link = "logit"))
summary(model.logit)

##
## Call:
## glm(formula = inlf ~ nwifeinc + educ + exper + expersq + age +
##     kidslt6 + kidsge6, family = binomial(link = "logit"), data = mroz)
##
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)
## (Intercept)  0.425452   0.860365   0.495  0.62095
## nwifeinc    -0.021345   0.008421  -2.535  0.01126 *
## educ         0.221170   0.043439   5.091 3.55e-07 ***
## exper        0.205870   0.032057   6.422 1.34e-10 ***
## expersq     -0.003154   0.001016  -3.104  0.00191 **
## age         -0.088024   0.014573  -6.040 1.54e-09 ***
## kidslt6     -1.443354   0.203583  -7.090 1.34e-12 ***
## kidsge6      0.060112   0.074789   0.804  0.42154
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##     Null deviance: 1029.75  on 752  degrees of freedom
## Residual deviance:  803.53  on 745  degrees of freedom
## AIC: 819.53
##
## Number of Fisher Scoring iterations: 4

Probit model

model.probit <- glm(inlf ~ nwifeinc + educ + exper + expersq + age + kidslt6 + kidsge6,
                    data = mroz, family = binomial(link = "probit"))
summary(model.probit)

##
## Call:
## glm(formula = inlf ~ nwifeinc + educ + exper + expersq + age +
##     kidslt6 + kidsge6, family = binomial(link = "probit"), data = mroz)
##
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)
## (Intercept)  0.2700736  0.5080782   0.532  0.59503
## nwifeinc    -0.0120236  0.0049392  -2.434  0.01492 *
## educ         0.1309040  0.0253987   5.154 2.55e-07 ***
## exper        0.1233472  0.0187587   6.575 4.85e-11 ***
## expersq     -0.0018871  0.0005999  -3.145  0.00166 **
## age         -0.0528524  0.0084624  -6.246 4.22e-10 ***
## kidslt6     -0.8683247  0.1183773  -7.335 2.21e-13 ***
## kidsge6      0.0360056  0.0440303   0.818  0.41350
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##     Null deviance: 1029.7  on 752  degrees of freedom
## Residual deviance:  802.6  on 745  degrees of freedom
## AIC: 818.6
##
## Number of Fisher Scoring iterations: 4

The signs of the coefficients are the same across the three models.
The set of statistically significant variables is the same across models.
The magnitudes of the coefficients are not comparable across models.
The SE, z value (the analogue of the t value), and p-value are interpreted as before.

The Average Partial/Marginal/Treatment Effect

The APE/ATE/AME of the Linear Model

coeftest(model.ols, vcov = vcovHC(model.ols, "HC0"))

##
## t test of coefficients:
##
##                Estimate  Std. Error t value  Pr(>|t|)
## (Intercept)  0.58551922  0.15144889  3.8661 0.0001202 ***
## nwifeinc    -0.00340517  0.00151681 -2.2450 0.0250635 *
## educ         0.03799530  0.00722734  5.2572 1.913e-07 ***
## exper        0.03949239  0.00577907  6.8337 1.722e-11 ***
## expersq     -0.00059631  0.00018899 -3.1552 0.0016683 **
## age         -0.01609081  0.00238623 -6.7432 3.108e-11 ***
## kidslt6     -0.26181047  0.03161391 -8.2815 5.626e-16 ***
## kidsge6      0.01301223  0.01346085  0.9667 0.3340215
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The APE/ATE/AME of the Logit Model

AME_logit <- margins(model.logit)
summary(AME_logit)

##    factor     AME     SE       z      p   lower   upper
##       age -0.0157 0.0024 -6.6027 0.0000 -0.0204 -0.0111
##      educ  0.0395 0.0073  5.4145 0.0000  0.0252  0.0538
##     exper  0.0368 0.0052  7.1386 0.0000  0.0267  0.0469
##   expersq -0.0006 0.0002 -3.1759 0.0015 -0.0009 -0.0002
##   kidsge6  0.0107 0.0133  0.8051 0.4207 -0.0154  0.0369
##   kidslt6 -0.2578 0.0319 -8.0696 0.0000 -0.3204 -0.1951
##  nwifeinc -0.0038 0.0015 -2.5714 0.0101 -0.0067 -0.0009

The APE/ATE/AME of the Probit Model

AME_probit <- margins(model.probit)
summary(AME_probit)

##    factor     AME     SE       z      p   lower   upper
##       age -0.0159 0.0024 -6.7392 0.0000 -0.0205 -0.0113
##      educ  0.0394 0.0073  5.4186 0.0000  0.0251  0.0536
##     exper  0.0371 0.0052  7.1779 0.0000  0.0270  0.0472
##   expersq -0.0006 0.0002 -3.2050 0.0014 -0.0009 -0.0002
##   kidsge6  0.0108 0.0132  0.8189 0.4129 -0.0151  0.0367
##   kidslt6 -0.2612 0.0319 -8.1860 0.0000 -0.3237 -0.1986
##  nwifeinc -0.0036 0.0015 -2.4604 0.0139 -0.0065 -0.0007

The APEs for each variable are similar across the three models.

Using Logit/Probit Model for Prediction

Question 1: What is the predicted probability of being in the labor force for a woman with no kids less than 6 years old and all other characteristics at the sample median?

Question 2: What if she has 1 kid less than 6 years old?

We first construct a new dataset with 7 columns (one per explanatory variable) and 2 rows. The first observation has 0 kids less than 6 years old and all other characteristics at the sample median. The second observation has 1 kid less than 6 years old and all other characteristics at the sample median.

kidslt6 <- c(0, 1)  # 0 kids under 6 for the first person, 1 for the second
nwifeinc <- c(median(mroz$nwifeinc), median(mroz$nwifeinc))
educ <- c(median(mroz$educ), median(mroz$educ))
exper <- c(median(mroz$exper), median(mroz$exper))
expersq <- c(median(mroz$exper)^2, median(mroz$exper)^2)
age <- c(median(mroz$age), median(mroz$age))
kidsge6 <- c(median(mroz$kidsge6), median(mroz$kidsge6))
new_observation <- data.frame(kidslt6, nwifeinc, educ, exper, expersq, age, kidsge6)
outcome <- predict(model.probit, new_observation, type = "response")
outcome

##         1         2
## 0.6363523 0.3016715

outcome[1]

##         1
## 0.6363523

$\hat{P}(\text{working} \mid X_{50}, \text{kidslt6} = 0) = 0.636$

outcome[2]

##         2
## 0.3016715

$\hat{P}(\text{working} \mid X_{50}, \text{kidslt6} = 1) = 0.302$
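The prediction step can be checked by hand: for a probit model, `predict(..., type = "response")` should equal the standard normal CDF evaluated at the linear index. The following is a minimal sketch on simulated data (not the mroz sample); the variable names `x`, `kids`, `fit`, and `new_obs` and the data-generating coefficients are assumptions made up for illustration.

```r
# Simulated data for illustration only (NOT the mroz sample).
set.seed(1)
n <- 500
x <- rnorm(n)                 # hypothetical continuous regressor
kids <- rbinom(n, 1, 0.3)     # hypothetical binary regressor
y <- as.integer(0.5 + 0.8 * x - 1.0 * kids + rnorm(n) > 0)

fit <- glm(y ~ x + kids, family = binomial(link = "probit"))

# Two hypothetical individuals: x at the sample median, kids = 0 or 1.
new_obs <- data.frame(x = median(x), kids = c(0, 1))

# Fitted probabilities from predict().
p_predict <- predict(fit, new_obs, type = "response")

# The same probabilities by hand: standard normal CDF of the linear index.
X_new <- cbind(1, new_obs$x, new_obs$kids)
p_manual <- pnorm(drop(X_new %*% coef(fit)))

# Discrete change in the predicted probability from kids = 0 to kids = 1.
effect <- p_predict[2] - p_predict[1]
print(cbind(p_predict, p_manual))
print(effect)
```

Applied to the mroz predictions reported above, the same difference is 0.3016715 - 0.6363523 = -0.3347. This is a change evaluated at the median characteristics, so it need not coincide with the probit AME of kidslt6 (-0.2612), which averages partial effects over the sample.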
Hypothesis Testing (For Week 12)

summary(model.probit)

##
## Call:
## glm(formula = inlf ~ nwifeinc + educ + exper + expersq + age +
##     kidslt6 + kidsge6, family = binomial(link = "probit"), data = mroz)
##
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)
## (Intercept)  0.2700736  0.5080782   0.532  0.59503
## nwifeinc    -0.0120236  0.0049392  -2.434  0.01492 *
## educ         0.1309040  0.0253987   5.154 2.55e-07 ***
## exper        0.1233472  0.0187587   6.575 4.85e-11 ***
## expersq     -0.0018871  0.0005999  -3.145  0.00166 **
## age         -0.0528524  0.0084624  -6.246 4.22e-10 ***
## kidslt6     -0.8683247  0.1183773  -7.335 2.21e-13 ***
## kidsge6      0.0360056  0.0440303   0.818  0.41350
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##     Null deviance: 1029.7  on 752  degrees of freedom
## Residual deviance:  802.6  on 745  degrees of freedom
## AIC: 818.6
##
## Number of Fisher Scoring iterations: 4

Example 1

Test the null hypothesis that years of schooling has no effect on one’s choice of labor force participation. Use the 5% significance level.

Null hypothesis: $\beta_{\text{educ}} = 0$
Alternative hypothesis: $\beta_{\text{educ}} \neq 0$

The z-value, which plays the role of a t-statistic, is asymptotically standard normal under the null hypothesis. We therefore conduct the test exactly as we would a t-test, using normal critical values. We can compare the z-value with the critical value. Alternatively, we can compare the p-value with the significance level.

Since 2.55e-07 is smaller than 0.05, we reject the null hypothesis.

The conclusion is that education has an impact on one’s labor force participation decision.

Example 2

Test the null hypothesis that the number of older kids has no effect on one’s choice of labor force participation. Use the 5% significance level.

Null hypothesis: $\beta_{\text{kidsge6}} = 0$
Alternative hypothesis: $\beta_{\text{kidsge6}} \neq 0$

We conduct the same test. Since 0.41350 is greater than 0.05, we do not reject the null hypothesis.

The conclusion is that there is no statistically significant evidence that the number of older kids affects one’s labor force participation decision.
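The two single-coefficient tests can be reproduced from the reported estimates and standard errors alone. The sketch below copies those four numbers from the probit summary above; the object names (`est`, `se`, `z`, `p`, `crit`, `reject`) are chosen here for illustration.

```r
# Estimates and standard errors copied from the probit summary above.
est <- c(educ = 0.1309040, kidsge6 = 0.0360056)
se  <- c(educ = 0.0253987, kidsge6 = 0.0440303)

# z statistic for H0: beta = 0.
z <- est / se

# Two-sided p-value from the standard normal distribution.
p <- 2 * pnorm(-abs(z))

# Two-sided critical value at the 5% level (about 1.96).
crit <- qnorm(0.975)

# Reject H0 when |z| exceeds the critical value (equivalently, p < 0.05).
reject <- abs(z) > crit

print(round(z, 3))
print(signif(p, 3))
print(reject)
```

Comparing |z| with the critical value and comparing the p-value with 0.05 always lead to the same decision, so either route can be used.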
Example 3

Test the null hypothesis that working experience has no effect on one’s choice of labor force participation. Use the 5% significance level.

Null hypothesis: $\beta_{\text{exper}} = 0,\ \beta_{\text{expersq}} = 0$
Alternative hypothesis: the null is not true

Option 1. Use a likelihood-ratio test

Use the following code to carry out a likelihood-ratio test:

# Estimate the unrestricted model first:
q1.probit <- glm(inlf ~ nwifeinc + educ + exper + expersq + age + kidslt6 + kidsge6,
                 data = mroz, family = binomial(link = "probit"))
# Estimate the restricted model (exper and expersq dropped):
q1.probit_restrict <- glm(inlf ~ nwifeinc + educ + age + kidslt6 + kidsge6,
                          data = mroz, family = binomial(link = "probit"))
# Conduct LR test:
lrtest(q1.probit, q1.probit_restrict)

## Likelihood ratio test
##
## Model 1: inlf ~ nwifeinc + educ + exper + expersq + age + kidslt6 + kidsge6
## Model 2: inlf ~ nwifeinc + educ + age + kidslt6 + kidsge6
##   #Df  LogLik Df  Chisq Pr(>Chisq)
## 1   8 -401.30
## 2   6 -454.23 -2 105.85  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The LR statistic is 105.85, with a p-value below 2.2e-16. We reject the null hypothesis that working experience has no effect on one’s choice of labor force participation at the 5% significance level.

Option 2. Use a Wald test

Alternatively, we can conduct a Wald test:

library(car)
linearHypothesis(model.probit, c("exper=0", "expersq=0"))

## Linear hypothesis test
##
## Hypothesis:
## exper = 0
## expersq = 0
##
## Model 1: restricted model
## Model 2: inlf ~ nwifeinc + educ + exper + expersq + age + kidslt6 + kidsge6
##
##   Res.Df Df  Chisq Pr(>Chisq)
## 1    747
## 2    745  2 95.862  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The Wald statistic is 95.862, with a p-value below 2.2e-16. We reject the null hypothesis that working experience has no effect on one’s choice of labor force participation at the 5% significance level.
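Both joint tests compare a statistic with a chi-squared distribution whose degrees of freedom equal the number of restrictions (here 2). As a rough check, the sketch below rebuilds the LR statistic from the log-likelihoods printed in the `lrtest` output; because those are rounded to two decimals, the result (105.86) matches the reported 105.85 only up to rounding. The object names are chosen here for illustration.

```r
# Log-likelihoods as printed (rounded) in the lrtest output above.
loglik_unrestricted <- -401.30
loglik_restricted   <- -454.23
q <- 2  # number of restrictions: exper and expersq

# LR statistic: twice the gap between the two log-likelihoods.
lr_stat <- 2 * (loglik_unrestricted - loglik_restricted)

# Wald statistic as reported by linearHypothesis above.
wald_stat <- 95.862

# 5% critical value of the chi-squared distribution with q degrees of freedom.
crit <- qchisq(0.95, df = q)

# Upper-tail p-values for both statistics.
p_lr   <- pchisq(lr_stat, df = q, lower.tail = FALSE)
p_wald <- pchisq(wald_stat, df = q, lower.tail = FALSE)

print(c(LR = lr_stat, Wald = wald_stat, critical = crit))
print(c(p_LR = p_lr, p_Wald = p_wald))
```

The two statistics differ in value but are asymptotically equivalent under the null, and here both exceed the critical value by a wide margin.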