Stata Lab 3
Columbia University
Mailman School of Public Health
Dept of Health Policy & Management
P8502 RM I: Empirical Analysis for Health Policy
Spring 2023
Prof. Jamie Daw
STATA LAB 3
Introduction to Regression
Part 1: Open and Review the Data
Open and review Session2.dta.
This dataset (also used in Lab 2) contains data on a random sample of births from one hospital in 2018. Be sure to set a working directory and start a new .do file and log file. Your group should annotate answers to each question in the .do file using comments and email the .do file to the TA and all group members at the end of the lab.
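The setup steps above might look like the following at the top of the .do file. This is a minimal sketch: the folder path and the log file name are placeholders, not values given in the lab.

```stata
* Lab 3 setup sketch. The path and log name below are placeholders.
cd "C:/path/to/your/lab3/folder"

* Close any log left open by an earlier run, then start a fresh text log
capture log close
log using lab3.log, replace text

* Load the lab data and review it
use Session2.dta, clear
describe
summarize

* ... lab commands and commented answers go here ...

log close
```

Running describe and summarize first shows the actual variable names and how many non-missing values each variable has, which helps before creating new variables.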
Part 2: Inspect the Relationship Between Variables
You are interested in the relationship between first trimester prenatal care visits and Apgar score at delivery. The Apgar score is a method (developed at NYP in 1952) to quickly summarize the health of newborn children. The score ranges from zero to 10. Scores 7 and above are generally normal; 4 to 6, fairly low; and 3 and below are generally regarded as critically low and cause for immediate resuscitative efforts.
i. Create a binary variable indicating a normal Apgar score.
ii. What percentage of infants have an abnormal Apgar score?
iii. Create a cross tabulation of Apgar score and prenatal care visits. What % of infants with no prenatal care have a normal Apgar score? What % of those with 1 or more visits?
iv. Create a bar graph showing the mean Apgar score by the number of prenatal care visits.
Part 3: Run and Interpret Regression Results
First, run a bivariate regression with Apgar score as the dependent variable and the number of prenatal care visits as the independent variable.
i. Interpret the coefficient on ftv.
ii. What is the predicted Apgar score for an infant with 3 prenatal care visits?
iii. Is the coefficient on ftv statistically significant?
Instead of examining the number of prenatal care visits as a continuous independent variable, you decide to examine the difference in Apgar score for infants with and without any prenatal care visits.
i. Create an indicator variable indicating the receipt of any prenatal care visits.
ii. Use regression to test the null hypothesis that the difference in mean Apgar score between infants with and without any prenatal care visits is zero.
iii. What is the statistical conclusion of your test?
iv. Using the regression results, interpret the coefficient for prenatal care.
v. What does the constant represent?
A colleague suggests that you need to adjust for age and smoking status when examining the relationship between prenatal care and Apgar scores. Re-run the regression controlling for these two variables.
i. Interpret the coefficient for prenatal care.
ii. What happened to R² after you added these variables? Why?
Commands Needed For This Lab:
Tabulate a variable (or two)
tab
e.g. tab bwt
e.g. tab bwt low
Add the row or col option after the comma to report row or column percentages, or add an if qualifier before the comma to tabulate a specific subset
e.g. tab bwt low, row
tab bwt low, col
tab low if bwt<2500
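The options and the if qualifier can be combined, as in this short sketch. It assumes Stata's example dataset lbw as a stand-in (loaded with webuse, which needs an internet connection); that dataset should contain variables named low, smoke and age, but the variable names in Session2.dta may differ.

```stata
* Stand-in data: Stata's lbw example dataset (an assumption, not the lab file)
webuse lbw, clear

* Row percentages only (nofreq suppresses the raw counts)
tab low smoke, row nofreq

* The if qualifier goes before the comma, options go after it
tab low smoke if age < 25, row

* The missing option adds a row for missing values, if there are any
tab ftv, missing
```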
Create a new binary variable (0/1) equal to 1 if a condition is met
gen
e.g. gen low_bwt=(bwt<2500)
Create a new variable equal to a certain value
gen
e.g. gen low_bwt=0
Edit values of an existing variable
replace
e.g. replace low_bwt=1 if bwt<2500
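The one-step and two-step approaches above should produce the same variable, with one caveat about missing values. Stata treats a missing numeric value as larger than any number, so a condition such as bwt<2500 is false for missing observations (they are coded 0) and a condition such as x>=7 is true for them (they are coded 1). The sketch below guards against both cases. It assumes Stata's lbw example dataset as a stand-in for the lab data; low_bwt2 is a name made up for the comparison.

```stata
* Stand-in data: Stata's lbw example dataset (an assumption, not the lab file)
webuse lbw, clear

* One step: the logical expression evaluates to 1 or 0;
* the if qualifier leaves the new variable missing where bwt is missing
gen low_bwt = (bwt < 2500) if !missing(bwt)

* Two steps: start at 0, then switch to 1 where the condition holds
gen low_bwt2 = 0 if !missing(bwt)
replace low_bwt2 = 1 if bwt < 2500

* Check that both approaches agree
tab low_bwt low_bwt2, missing
assert low_bwt == low_bwt2
```

If the source variable has no missing values the guard changes nothing, but it costs little to include.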
Create a bar graph of means (single or over another variable)
graph bar (mean)
e.g. graph bar (mean) bwt
graph bar (mean) bwt, over(low)
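A bar graph of means is easier to read with an axis title and bar labels, and it helps to check how many observations sit behind each bar. The sketch below is one way to do that, assuming Stata's lbw example dataset as a stand-in for the lab data; the axis title text is illustrative.

```stata
* Stand-in data: Stata's lbw example dataset (an assumption, not the lab file)
webuse lbw, clear

* Mean of bwt within each value of ftv, with labelled bars
graph bar (mean) bwt, over(ftv) ///
    ytitle("Mean birth weight") ///
    blabel(bar, format(%9.0f))

* Group means and group sizes behind each bar
tabstat bwt, by(ftv) statistics(mean n)
```

Bars based on only a handful of observations can look dramatic while being very imprecise, so the tabstat output is worth reading alongside the graph.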
Simple regression (Y = outcome/dependent variable; X = independent variable)
regress Y X, robust
Multiple regression (Y = outcome/dependent variable; X = independent variable; Z = covariate)
regress Y X Z, robust
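After regress, the stored coefficients can be used to compute a predicted value and to compare R² across models. The following is a sketch under the assumption that Stata's lbw example dataset stands in for the lab data, with bwt as Y, ftv as X, and age and smoke as covariates; r2_simple is a made-up scalar name.

```stata
* Stand-in data: Stata's lbw example dataset (an assumption, not the lab file)
webuse lbw, clear

* Simple regression with robust standard errors
regress bwt ftv, robust
scalar r2_simple = e(r2)

* Predicted outcome at ftv = 3, computed by hand from the coefficients
display "Predicted bwt at ftv = 3: " _b[_cons] + 3*_b[ftv]

* Same prediction with a standard error and confidence interval
lincom _cons + 3*ftv

* Multiple regression adding two covariates
regress bwt ftv age smoke, robust
display "R-squared, simple: " r2_simple "   multiple: " e(r2)
```

lincom reports the same point estimate as the display line and adds a confidence interval, which is useful when the precision of the prediction matters.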