ECON_3900_-_A1_Part_B_-_Group_E

ECON 3900A
May 26 2023
Assignment 1 Part B

Ryan Fleming 101147525
Wilder Noble 101150812
Zakira Ahmadi 100948068
Zheng Wang 101123076

4. Use the Labour Force Survey data from both December 2019 and December 2022 cohorts to answer the following questions.

a. Create a folder named assignment3900. Download the two datasets in Stata format and save them in the assignment3900 folder. Rename the datasets as dec2019 and dec2022 respectively. Then, create a do file named datacleando and save it in the same folder.

A screenshot of the folder is provided in the Appendix.

b. Create a folder and name it Assignment3900. Download the two datasets in Stata format and save them in the Assignment3900 folder. Rename the datasets dec2019 and dec2022 respectively. Create a do file and name it datacleando. Save it in the same folder.

A screenshot of the folder is provided in the Appendix.

c. Open the do file and add a command to clear any objects saved from previous sessions using the clear command. Also, add commands to create a log file using the log using and log close commands. Call the log file datacleanlog. Make sure to add an option to overwrite the latest saved log file with the most recent log file using the replace option.

Stata code:

* Clear memory and close any log left open by a previous session
clear all
capture log close
* Write the log to datacleanlog; pointing log using at dec2019.dta would overwrite the dataset
log using "C:\Users\wilde\Downloads\assignment3900\datacleanlog", replace
log close

d. Open the dec2019 dataset (from within the do file) and keep only observations pertaining to individuals who are employed in the private sector, working at least 30 hours per week
with an hourly wage greater than or equal to $10, between 20 and 65 years old, and with a permanent job. For information on the variable names, values, and definitions, please consult the documentation files you downloaded with the datasets.

use "C:\Users\wilde\Downloads\assignment3900\dec2019.dta"
* Private-sector employees only
keep if COWMAIN == 2
* At least 30 hours per week and an hourly wage of at least $10
drop if ATOTHRS < 30
drop if HRLYEARN < 10
* Age restriction, expressed in AGE_12 group codes
drop if AGE_12 < 2
drop if AGE_12 > 9
* Permanent jobs only
keep if PERMTEMP == 1

e. Keep only the variables HRLYEARN, SEX, TENURE, and ATOTHRS. Rename them hwage, female, tenure, and hours_worked, respectively.

keep HRLYEARN SEX TENURE ATOTHRS
rename HRLYEARN hwage
rename SEX female
rename TENURE tenure
rename ATOTHRS hrs_worked

f. Remove any missing values from the dataset.

drop if hwage == .
drop if female == .
drop if tenure == .
drop if hrs_worked == .

g. Save the resulting "cleaned" dataset for dec2019 under the name dec2019clean.

h. In the same do file, repeat the steps in parts (d.) to (g.) for the dec2022 dataset. Name the resulting dataset dec2022clean.

A screenshot of the folder is provided in the Appendix.

i. Create a new do file and call it analysisdo. Add a command to clear memory and add commands to create a log file. Call this file analysislog and make sure to add an option to overwrite the latest saved log file with the most recent log file.
clear all
capture log close
* Write the log to analysislog rather than to the dataset file
log using "C:\Users\wilde\Downloads\assignment3900\analysislog", replace
log close

j. Open the dataset dec2019clean and provide summary statistics for the variables. Check whether the minimum and maximum values of the variables are reasonable given the nature of the variables. For instance, it is not reasonable to have a negative minimum for the wage variable.

sum hwage female tenure hrs_worked

k. Regress log(hwage) on female and interpret the slope coefficient and the R2.

2019: The slope coefficient was -.2025819. Because the dependent variable is in logs, this means that, on average, being female is associated with an hourly wage approximately 20% lower than that of males, not 0.20 dollars lower. The R-squared value is 0.0535, which means that only approximately 5.35% of the variability of the dependent variable can be explained by the independent variable.

2022: The slope coefficient was -.1743415, which means that, on average, being female is associated with an hourly wage approximately 17% lower than that of males. The R-squared value is 0.0389, which means that only approximately 3.89% of the variability of the dependent variable can be explained by the independent variable.

l. Regress log(hwage) on female, tenure, and hours_worked. Interpret the slope coefficient and the R2. How does the gender wage gap in this model compare to that in the previous part?

2019: The slope coefficient for the female variable is -.1780789, indicating that, on average and holding tenure and hours worked constant, females earn a wage approximately 18% lower than men. The R-squared value is 0.1172, indicating that the model explains only approximately 11.72% of the variation in the dependent variable. The gender wage gap in this model is smaller than in the previous model by about 2 percentage points. More importantly, this model has a greater R-squared, indicating that more of the variation in the dependent variable can be explained by the independent variables.
Therefore, it appears that factors other than gender also contribute to the wage gap between men and women.
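Because the dependent variable in parts k and l is log(hwage), the coefficient on female is measured in log points. A minimal sketch of converting such a coefficient b into a percentage wage difference, 100*(exp(b) - 1), using the two 2019 coefficients reported above. It uses only Stata's display command and exp() function and assumes nothing beyond those reported estimates:

```stata
* Percent wage difference implied by a log-wage coefficient b: 100*(exp(b) - 1)
* 2019, female only (part k)
display 100*(exp(-.2025819) - 1)
* 2019, controlling for tenure and hours worked (part l)
display 100*(exp(-.1780789) - 1)
```

For negative coefficients of this size, the exact figure is somewhat smaller in magnitude than the log-point approximation 100*b, so either can be reported as long as the unit is stated as a percentage.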
2022: The slope coefficient for the female variable is -.1580732, indicating that, on average and holding tenure and hours worked constant, females earn a wage approximately 16% lower than men. The R-squared value is 0.0798, indicating that the model explains only approximately 7.98% of the variation in the dependent variable. The gender wage gap in this model is smaller than in the previous model by about 1.6 percentage points. More importantly, this model has a greater R-squared, indicating that more of the variation in the dependent variable can be explained by the independent variables. Therefore, it appears that factors other than gender also contribute to the wage gap between men and women.

p. Based on all the results obtained above, comment on the direction and the magnitude of the change in the gender wage gap in Canada before and after Covid-19.

Comparing the linear regressions of log_wage on female, tenure, and hours_worked for 2019 and 2022, the slope coefficient for the female variable was -.1780789 before Covid-19 and -.1580732 after Covid-19. Because the dependent variable is in logs, these coefficients indicate that in 2019 females earned about 18% less per hour than men, and in 2022 about 16% less. On this basis, the wage gap in Canada narrowed by about 2 percentage points. The regression of log_wage (dependent) on female (independent) alone gives a similar result. In 2019 the slope coefficient for female was -.2025819 and in 2022 it was -.1743415. The difference between the two slope coefficients indicates that the wage gap in Canada narrowed by about 3 percentage points.

5. The following question is based on your research project.
Please follow the recommendations mentioned in the "Writing Tips" document when answering it.

a. Provide a clear and concise title for your research project that accurately reflects the main focus and scope of your study.

Analyzing the Net Tax Contribution Disparity between Education Levels: A Comparative Study of Regular Citizens and Immigrants

b. Write a research statement that clearly defines the research problem, research question(s), and objectives of your study.

The objective of this study is to examine how the level of education of Canadian and immigrant workers in Canada influences their net fiscal contributions. How does the net fiscal contribution vary across different levels of education for immigrants and non-immigrants?
c. Describe the dataset(s) that you plan to use in your study. Include information about the source of the data, and any relevant limitations or issues that may affect your analysis.

For our dataset, we will be using the 2006 and 2016 Canadian Census of Population Public Use Microdata Files. We chose to compare the 2006 and 2016 censuses to analyze whether the 2008 financial crisis affected the net tax contribution of immigrants and non-immigrants. The Public Use Microdata Files were obtained through ODESI (Ontario Data Documentation, Extraction Service and Infrastructure).

We will only be using data on permanent residents, because the tax structure becomes more complex when income taxes and benefits are administered by more than one government body.

Certain data is not included in the Census, which will limit the scope of our analysis. For some variables there is an undefined or not-applicable category where participants did not disclose information; we will omit these observations, which further limits the scope of our analysis. Many trades workers become small-business owners and are therefore allowed to write off many expenses, which decreases the income tax they pay. Because our net contribution variable is the income tax paid minus government benefits received, this may result, on average, in a lower net contribution for people within the trades classification. Some individuals, such as medical doctors, may also own their own practices or small businesses, which will also decrease the amount of personal income tax they pay; again, on average, this could decrease the total net contributions for those individuals. Our independent variable "highest education level achieved" does not account for individuals who have achieved a bachelor's degree or higher after achieving a college or trades certificate.
There could be individuals who achieved a higher level of education but have opted to return to the trades; their net contributions will show up in the bachelor's category despite their working in the trades. This could affect the results and reduce the accuracy of our analysis.

d. Describe your dependent variable and the main independent variables that you will use to test your research hypothesis.

The dependent variable in our study is the net contribution of each person in the sample. We will create that variable by subtracting the total government transfers a person receives from the income tax that person pays. The main independent variables that we are using are education level and immigrant status. Education level will be divided into college, trade school, bachelor's degree holders, and
graduate degree holders. Immigration status will distinguish permanent residents of immigrant and non-immigrant status. We are omitting non-permanent residents, as tracking their tax and government transfer information adds an unnecessary degree of complexity. We will use interaction terms between the different levels of education and immigration status to examine how immigrants of a certain level of study compare to non-immigrants of that same level in terms of net fiscal contributions.

e. Write down the equation that you plan to estimate and the estimation method that you plan to use.

\[
\text{netContrib}_i = \beta_0 + \beta_1 \text{immig}_i + \beta_2 \text{trades}_i + \beta_3 \text{college}_i + \beta_4 \text{BA}_i + \beta_5 \text{MA}_i + \beta_6 (\text{immig} \times \text{trades})_i + \beta_7 (\text{immig} \times \text{college})_i + \beta_8 (\text{immig} \times \text{BA})_i + \beta_9 (\text{immig} \times \text{MA})_i + \varepsilon
\]

netContrib = net contributions
immig = immigration status
trades = dummy variable for whether a person is a tradesperson
college = dummy variable for whether a person completed a program of 3 months to less than 1 year, a program of 1 to 2 years, a program of more than 2 years, or a university certificate or diploma below bachelor level
BA = dummy variable for whether a person completed a Bachelor's degree
MA = dummy variable for whether a person completed a university certificate or diploma above bachelor level, a degree in medicine, dentistry, veterinary medicine or optometry, a Master's degree, or an earned doctorate
immig*trades = interaction term of immigration status with whether a person's highest level of education is trade school
immig*college = interaction term of immigration status with whether a person's highest level of education falls under the college dummy variable
immig*BA = interaction term of immigration status with whether a person's highest level of education is a Bachelor's degree
immig*MA = interaction term of immigration status with whether a person's highest level of education is post-graduate studies

We will use OLS (Ordinary Least Squares) regression to estimate the coefficients of the variables in our linear regression.

Appendix
This is the file folder containing all of the Stata files in question 4.