Columbia University
Mailman School of Public Health
Dept of Health Policy & Management
P8502 Research Methods
Spring 2023
Prof. Jamie Daw
LAB 2
Descriptive Statistics and Hypothesis Testing
Part 1: Open and Review the Data
Open and review the dataset Session2.dta.
This dataset contains data on a random sample of births from one hospital in 2018. Be sure to set a working directory, start a new .do file and log file. Take a look at the data using the data browser.
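The setup steps above can be sketched as follows. This is a minimal example, not the required approach: the folder path and the log file name are hypothetical and should be replaced with your own.

```stata
* Hypothetical folder; replace with the path where Session2.dta is saved
cd "C:/P8502/lab2"

* Open a plain-text log; replace overwrites an earlier log of the same name
log using lab2.log, text replace

* Load the dataset, clearing any data already in memory
use Session2.dta, clear

* Open the data browser to look at the data
browse

* ... lab commands go here ...

* Close the log at the end of the session
log close
```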
Part 2: Inspect Variables and Descriptive Statistics
Inspect the dataset:
i. What is the unit of analysis, i.e. what does each observation represent?
ii. How many observations are there? How many variables?
iii. What general information is contained in this dataset?
iv. Are there missing values for any of the variables?
Inspect key variables:
i. What is the average maternal age? What is the confidence interval for this estimate?
ii. Create a histogram of prenatal care visits. Add titles to the axes.
iii. What are the mean and median number of prenatal visits?
iv. What are the mean and median number of visits among births with >0 visits?
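One possible way to approach these questions is sketched below. It assumes the prenatal care visits variable is ftv and that maternal age is stored in age; check the variable names with describe before running it. The axis titles are illustrative.

```stata
* Mean of maternal age with its 95% confidence interval
mean age

* Histogram of visits; discrete gives each visit count its own bar,
* frequency puts counts (not densities) on the y-axis
histogram ftv, discrete frequency ytitle(Number of births) xtitle(Prenatal care visits)

* Mean and median (50th percentile) for all births
summarize ftv, detail

* Same statistics restricted to births with at least one visit
summarize ftv if ftv>0 & !missing(ftv), detail
```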
Rename and create new variables:
i. The variable ftv contains information on the number of prenatal care visits in the first trimester. Rename this variable to something more intuitive.
ii. Create a categorical age variable from the continuous variable age. You can choose the age cut-offs. Label the categories of this variable.
iii. Create an indicator variable (binary 0/1) for low birthweight (<2500g).
iv. What proportion of the total sample is low birthweight?
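A minimal sketch of these steps is shown below. The new names (visits_tri1, age_cat, age_cat_lbl, lbw) and the age cut-offs are hypothetical choices made for illustration; the lab leaves both up to you. It assumes birthweight in grams is stored in bwt.

```stata
* Rename ftv; visits_tri1 is a hypothetical name
rename ftv visits_tri1

* Start with an all-missing variable, then fill in the categories
gen age_cat = .
replace age_cat = 1 if age < 20
replace age_cat = 2 if age >= 20 & age < 30
replace age_cat = 3 if age >= 30 & !missing(age)

* Define value labels and attach them to the new variable
label define age_cat_lbl 1 "Under 20" 2 "20-29" 3 "30 and over"
label values age_cat age_cat_lbl

* Low birthweight indicator; stays missing when bwt is missing
gen lbw = (bwt < 2500) if !missing(bwt)

* Share of the sample that is low birthweight
tab lbw
```

The !missing() conditions matter because Stata treats missing values as larger than any number, so a bare age >= 30 would also pick up observations with missing age.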
Part 3: Inspect the Relationship Between Two Variables
You are interested in the relationship between first trimester prenatal care visits and low birthweight.
i. Create a cross tabulation of low birthweight and prenatal care visits. What % of infants with no prenatal care have low birthweight?
ii. Create a bar graph showing the mean birthweight by the number of prenatal care visits.
iii. Based on this analysis, how would you describe the relationship between birthweight and prenatal care visits?
Part 4: Conduct Hypothesis Tests
You are asked by the hospital performance manager to examine a number of questions relating to maternal age and birthweight. You can use the ttest command to address each question. For each test state: (1) the null and alternative hypotheses, (2) sample estimate, (3) test statistic, (4) confidence interval, (5) p-value. Also include a one-sentence interpretation of the result.
1. Is the maternal age for this hospital the same as the U.S. average (27 years)?
2. What is the difference in mean birthweight for women over and under age 25? You will first need to create an indicator variable indicating age over 25.
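The two tests could be set up roughly as follows. The indicator name age_o25 is hypothetical, and this sketch places women aged exactly 25 in the "not over 25" group, which is one reading of the question.

```stata
* Question 1: one-sample t-test, H0: mean maternal age equals 27
ttest age == 27

* Question 2: indicator for age over 25; stays missing when age is missing
gen age_o25 = (age > 25) if !missing(age)

* Two-sample t-test of mean birthweight, allowing unequal variances
ttest bwt, by(age_o25) unequal
```

The ttest output reports the sample means, the t statistic, the 95% confidence interval and the p-values for the two-sided and one-sided alternatives, which covers the items listed above.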
Finished early?
Further explore the relationship between two other variables of your choice in this dataset. Create graphical displays showing this relationship and conduct a related hypothesis test.
Stata Commands Needed For This Lab:
EXAMINE DATASET
Count the number of observations
count
Describe the variables in the dataset
describe
Show detailed information on how each variable is defined, summary statistics and missing values
codebook
CALCULATE SUMMARY STATISTICS
Examine summary statistics for all variables
summarize
Examine summary statistics for a specific variable (e.g. age) with additional details (medians etc.)
summarize age, detail
Calculate the mean and 95% CI for a variable
mean
e.g. mean age
Tabulate a variable (or two)
tab
e.g. tab age
tab age smoke
Add the row or col option after tabulate to report row or column percentages, or add an if condition to tabulate a specific subset
e.g. tab age smoke, row
tab age smoke, col
tab smoke if age<20
Limit any command to a subgroup
if
e.g. tab age if age_u20==1
histogram age if bwt>2500
Note: Stata requires two equal signs (==) to test equality in “if” clauses
CREATING AND LABELING VARIABLES
Create a new binary variable (0/1)
gen
e.g. gen age_u20=(age<20)
Create a new variable equal to a certain value for all observations
gen
e.g. gen age_cat=.
This creates an “empty” variable where all values are missing (Stata denotes missing values with a period).
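Edit values of an existing variable
replace
e.g. replace age_cat=1 if age<20
replace age_cat=2 if age>=20 & age<.
Note: age>=20 includes women aged exactly 20, and age<. excludes missing ages, which Stata treats as larger than any number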
Rename a variable
rename oldname newname
e.g. rename age ageinyears
GRAPHS
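Create a histogram (e.g. for age) with axis labels, a title and red bars
histogram age, frequency ytitle(Frequency) xtitle(Age) title(Distribution of Age) color(red)
The frequency option plots counts on the y-axis; without it, histogram plots densities and the Frequency title would be misleading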
Create a bar graph of means (single or over another variable)
graph bar (mean)
e.g.
graph bar (mean) bwt
graph bar (mean) bwt, over(low)
T-TESTS
T-test for a single mean
ttest variable == null_value
e.g.
ttest bwt == 2500
T-test for difference in means between two groups
ttest variable, by(group) unequal
e.g.
ttest bwt, by(age_u20) unequal