SA3_RSolution_Econ140

School

University of California, Berkeley

Course

140

Subject

Economics

Date

Feb 20, 2024

Type

Rmd

Pages

3

Uploaded by JusticeHawk19074

---
title: "ECON 140: Section 3 (OLS)"
output:
  pdf_document: default
  html_notebook: default
  html_document: default
  word_document: default
header-includes:
  - \usepackage{amsmath}
  - \usepackage{amssymb}
  - \usepackage{pifont}
  - \usepackage{mathpazo}
  - \newcommand{\xmark}{\ding{55}}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

# Question 3: Dummy variables regression

## Load required libraries

```{r warning=FALSE, include=FALSE, message=FALSE, results='hide', cache=FALSE}
# install.packages("dplyr")
# install.packages("margins")
library(dplyr)
library(stargazer)
library(knitr)
library(ivreg)
library(margins)
```

## Load dataset

```{r set wd, echo=TRUE}
# Set working directory
getwd()
setwd("/Users/jonathanold/Library/CloudStorage/GoogleDrive-jonathan_old@berkeley.edu/My Drive/_Berkeley Teaching/Spring 2024/Code")

# Load dataset
mexico_data = read.csv("../Datasets/Mexico.csv")
head(mexico_data)
```

```{r summarytable, echo=TRUE, results='hide'}
# Create summary statistics table, export as LaTeX
stargazer(data=mexico_data,
          type="latex",
          title="Summary Statistics",
          summary.stat = c("n", "mean", "sd", "min", "median", "max"),
          out="Summary Table.tex")
# To display table, use \input{"Summary Table.tex"}
```

\input{"Summary Table.tex"}

## Get means and difference in means

```{r means, echo=TRUE}
# How to create a dummy variable using an ifelse condition
mexico_data = mexico_data %>% mutate(rich = ifelse(inc_m>3000, 1, 0))

# Using dplyr (tidyverse) syntax to generate mean of income by indigenous language
mexico_data %>%
  group_by(ind_lang) %>%
  summarize(mean(inc_m))

# Using base R syntax to generate mean of income by indigenous language
mean1 = mean(mexico_data$inc_m[mexico_data$ind_lang==1])
mean0 = mean(mexico_data$inc_m[mexico_data$ind_lang==0])
mean_diff = mean1 - mean0
mean_diff
mean1
mean0
```

Average monthly income is 2537 among indigenous language speakers and 5309 among non-indigenous language speakers. The difference between the two is -2772.

## Run OLS regression

```{r ols, echo=TRUE}
# Running and outputting linear regression
ols_results = summary(lm(inc_m ~ ind_lang , data=mexico_data))
ols_results

# Compare mean difference to OLS coefficient
ols_results$coefficients[2,1]
mean_diff
```

```{r regtable, echo=TRUE, results='hide'}
# Create regression table. Include with command "\input{reg1.tex}", written in text
reg1 = lm(inc_m ~ ind_lang , data=mexico_data)
stargazer(reg1,
          title="Indigenous language and wages",
          omit.stat=c("LL","ser","f","adj.rsq"),
          no.space=TRUE,
          header=TRUE,
          align=FALSE,
          type="latex",
          out="reg1.tex")
```

\input{reg1.tex}

The OLS regression reproduces the group means exactly, which makes it a compact way to summarize the data. The constant (intercept) is the average monthly income in the group where ind_lang is zero, and the coefficient on ind_lang is the difference in means between ind_lang==1 and ind_lang==0. This always holds in a regression with a single dummy variable on the right-hand side. It also holds in a regression with multiple dummy variables, as long as we include enough regressors to describe every category present in the data (a saturated model).

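
The equivalence between dummy-variable regression and group means can be checked on data where we know what to expect. The chunk below is a minimal sketch on simulated data, not the Mexico dataset: the names `sim_y`, `sim_d`, and `sim_g` and all numbers in the data-generating step are made up for illustration. It verifies that $\hat\beta_0 = \bar y_0$ and $\hat\beta_1 = \bar y_1 - \bar y_0$ with one dummy, and that the same logic carries over to a three-category variable once every category is described by the regressors.

```{r dummy-sketch, echo=TRUE}
# Simulated data (hypothetical; chosen only to illustrate the algebra)
set.seed(140)
sim_n = 600
sim_d = rep(c(0, 1), length.out = sim_n)           # a single dummy variable
sim_y = 5000 - 2500 * sim_d + rnorm(sim_n, sd = 1000)

# Case 1: one dummy on the right-hand side
sim_fit_one = lm(sim_y ~ sim_d)
sim_mean0 = mean(sim_y[sim_d == 0])
sim_mean1 = mean(sim_y[sim_d == 1])

# Intercept equals the mean of the d == 0 group
all.equal(unname(coef(sim_fit_one)["(Intercept)"]), sim_mean0)
# Slope equals the difference in group means
all.equal(unname(coef(sim_fit_one)["sim_d"]), sim_mean1 - sim_mean0)

# Case 2: a three-category variable, entered as a factor so that
# R creates one dummy per category except the omitted one ("a")
sim_g = factor(rep(c("a", "b", "c"), length.out = sim_n))
sim_fit_sat = lm(sim_y ~ sim_g)
sim_group_means = tapply(sim_y, sim_g, mean)

# Intercept equals the mean of the omitted category
all.equal(unname(coef(sim_fit_sat)["(Intercept)"]), unname(sim_group_means["a"]))
# Each coefficient equals that category's mean minus the omitted category's mean
all.equal(unname(coef(sim_fit_sat)["sim_gb"]),
          unname(sim_group_means["b"] - sim_group_means["a"]))
all.equal(unname(coef(sim_fit_sat)["sim_gc"]),
          unname(sim_group_means["c"] - sim_group_means["a"]))
```

Each `all.equal()` call compares an OLS coefficient with the corresponding mean or difference in means, up to floating-point tolerance. The identity does not depend on how `sim_y` was generated; it is a property of OLS with dummy regressors.
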
# Question 4: Wages over the life-cycle

```{r wages, echo=TRUE}
dataset = read.csv(file="../Datasets/wages.csv")
plot(dataset$age, dataset$wage_yearly)

linear_regression = summary(lm(wage_yearly ~ age , data=dataset))
print(linear_regression)

# We use the function I() to create quadratic terms and interactions.
quadratic_regression = lm(wage_yearly ~ age + I(age^2) , data=dataset)
summary(quadratic_regression)

# We can use cplot to create two types of plots:
# 1. Predict wages by age (the estimated regression function)
cplot(quadratic_regression, "age", what = "pred", main = "Predicted yearly wages, by age")

# 2. The marginal "effect" of age (the derivative of the estimated regression function):
cplot(quadratic_regression, "age", what = "effect", main = "Average Marginal Effect of age")
```
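
The marginal "effect" of age in the quadratic specification can also be computed by hand, which may help in reading the second plot. With $\widehat{wage} = \hat\beta_0 + \hat\beta_1\, age + \hat\beta_2\, age^2$, the derivative with respect to age is $\hat\beta_1 + 2\hat\beta_2\, age$, and it equals zero at $age^* = -\hat\beta_1 / (2\hat\beta_2)$. The chunk below is a sketch on simulated data, not `wages.csv`: the data frame `sim_wages`, the helper `sim_marginal_effect()`, and the coefficients used to generate the data are all assumptions made for illustration.

```{r quadratic-sketch, echo=TRUE}
# Simulated hump-shaped wage profile (hypothetical numbers)
set.seed(140)
sim_age = runif(1000, min = 20, max = 65)
sim_wage = -20000 + 3000 * sim_age - 30 * sim_age^2 + rnorm(1000, sd = 5000)
sim_wages = data.frame(age = sim_age, wage_yearly = sim_wage)

# Same specification as above: I() protects age^2 inside the formula
sim_fit = lm(wage_yearly ~ age + I(age^2), data = sim_wages)
sim_b = coef(sim_fit)

# Marginal effect of age: derivative of the fitted regression function
sim_marginal_effect = function(a) {
  unname(sim_b["age"] + 2 * sim_b["I(age^2)"] * a)
}
sim_marginal_effect(c(25, 45, 60))

# Age at which predicted wages peak (only meaningful if the age^2 coefficient is negative)
sim_peak_age = unname(-sim_b["age"] / (2 * sim_b["I(age^2)"]))
sim_peak_age

# Cross-check the formula against a numerical derivative of predict()
sim_h = 1e-4
sim_num_deriv = (predict(sim_fit, data.frame(age = 45 + sim_h)) -
                 predict(sim_fit, data.frame(age = 45 - sim_h))) / (2 * sim_h)
all.equal(unname(sim_num_deriv), sim_marginal_effect(45), tolerance = 1e-6)
```

Because the marginal effect is linear in age, it changes sign exactly once, at `sim_peak_age`. Before that age an extra year is associated with higher predicted wages, and after it with lower predicted wages.
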