HW1
School: University of Illinois, Springfield
Course: 503
Subject: Health Science
Date: Dec 6, 2023
CHLH 573 - Homework Assignment 1
1. (15 pts) Determine whether each of the following variables is categorical or quantitative. If the variable is quantitative, determine whether it is discrete or continuous. If the variable is categorical, determine whether it is nominal or ordinal.
Answer 1:
(a) Cancer stage (I, II and III) – Categorical-Ordinal
(b) Percent body fat – Quantitative-Continuous
(c) Hair color – Categorical-Nominal
(d) Number of new COVID-19 cases in IL on 08/01/2020 – Quantitative-Discrete
2. (30 pts) Suppose a nutritionist is studying the caloric content of a new low-fat ice cream that is under development. A random sample of 11 observations yields the following data on the caloric content of this ice cream.
Table 1
Observation   Caloric content   Observation   Caloric content   Observation   Caloric content
1             127               5             126               9             125
2             129               6             134               10            131
3             130               7             127               11            135
4             116               8             117
(a) Using the data given in Table 1, estimate the sample (1) mean, (2) mode, (3) standard deviation, (4) range, and (5) interquartile range (15 pts). Answer:
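The quantities asked for in (a) can be checked with a short script. This is a minimal Python sketch rather than the assignment's own solution. It assumes the sample standard deviation uses the n - 1 denominator and that quartiles follow the (n + 1)p position rule (Python's default "exclusive" method); other quartile conventions can give a slightly different IQR.

```python
import statistics

# Caloric content for observations 1-11, in the order given in the table.
calories = [127, 129, 130, 116, 126, 134, 127, 117, 125, 131, 135]

mean = statistics.mean(calories)
mode = statistics.mode(calories)
sd = statistics.stdev(calories)            # sample SD, n - 1 denominator
data_range = max(calories) - min(calories)

# Quartiles by the (n + 1)p position rule (the default "exclusive" method).
q1, median, q3 = statistics.quantiles(calories, n=4)
iqr = q3 - q1

print(f"mean = {mean}, mode = {mode}, sd = {sd:.2f}")
print(f"range = {data_range}, Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
```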
(b) Based on the IQR you calculated above, find outliers in the dataset if there are any (5 pts). Answer:
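One common way to approach (b) is the 1.5 × IQR rule: a value is flagged as an outlier if it lies below Q1 - 1.5·IQR or above Q3 + 1.5·IQR. The sketch below assumes that rule, a strict inequality at the fences, and the same (n + 1)p quartile convention as Python's default; a course using a different convention may get different fences.

```python
import statistics

# Caloric content for observations 1-11, in the order given in the table.
calories = [127, 129, 130, 116, 126, 134, 127, 117, 125, 131, 135]

q1, _, q3 = statistics.quantiles(calories, n=4)   # default "exclusive" method
iqr = q3 - q1

# Fences of the 1.5 * IQR rule (an assumed convention, see the lead-in).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# A value is an outlier only if it falls strictly outside the fences.
outliers = [x for x in calories if x < lower_fence or x > upper_fence]

print(f"fences: [{lower_fence}, {upper_fence}], outliers: {outliers}")
```

Under these assumptions the smallest observation sits exactly on the lower fence, so whether it counts as an outlier depends on whether the comparison is strict.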
(c) Draw the boxplot of caloric content by hand (10 pts) Answer:
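A hand-drawn boxplot needs the five-number summary and the whisker ends. The following Python sketch computes them under the same assumptions as above: quartiles from the (n + 1)p rule, and whiskers that extend to the most extreme observations lying within 1.5 × IQR of the box.

```python
import statistics

# Caloric content for observations 1-11, in the order given in the table.
calories = [127, 129, 130, 116, 126, 134, 127, 117, 125, 131, 135]

q1, median, q3 = statistics.quantiles(calories, n=4)
iqr = q3 - q1

# Whiskers reach the most extreme points that are not beyond the fences.
inside = [x for x in calories if q1 - 1.5 * iqr <= x <= q3 + 1.5 * iqr]
whisker_low, whisker_high = min(inside), max(inside)

five_number = (min(calories), q1, median, q3, max(calories))
print("min, Q1, median, Q3, max:", five_number)
print("whiskers:", whisker_low, "to", whisker_high)
```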
3. (30 pts) Some studies of Alzheimer’s disease (AD) have shown an increase in CO2 production in patients with the disease. In one such study the following CO2 values were obtained from 16 neocortical biopsy samples from AD patients.
1009 1280 1180 1255 1547 2352 1956 1080 1776 1767 1680 2050 1452 2857 3100 1621
Assume the population of such values is normally distributed. Answer the following questions:
(a) Suppose the population standard deviation is 350. Provide a 95% Z-statistic confidence interval for the mean level of CO2. (10 pts) Answer:
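With a known population standard deviation, the interval in (a) has the form x̄ ± z·σ/√n. The sketch below is one way to evaluate it in Python using only the standard library; the variable names are illustrative, and the critical value comes from the standard normal quantile function rather than a table.

```python
import math
import statistics

# CO2 values from the 16 biopsy samples listed in the question.
co2 = [1009, 1280, 1180, 1255, 1547, 2352, 1956, 1080,
       1776, 1767, 1680, 2050, 1452, 2857, 3100, 1621]

sigma = 350                      # population SD given in part (a)
n = len(co2)
xbar = statistics.mean(co2)

# Two-sided 95% interval: use the 97.5th percentile of the standard normal.
z = statistics.NormalDist().inv_cdf(0.975)
margin = z * sigma / math.sqrt(n)
z_lower, z_upper = xbar - margin, xbar + margin

print(f"mean = {xbar}, z = {z:.3f}")
print(f"95% z interval: ({z_lower:.2f}, {z_upper:.2f})")
```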
(b) Now suppose we do not know the population standard deviation. Provide a 95% confidence interval for the mean level of CO2. (10 pts) Answer:
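When the population standard deviation is unknown, the interval becomes x̄ ± t·s/√n with n - 1 = 15 degrees of freedom. The Python standard library has no t quantile function, so this sketch hard-codes the critical value 2.131 from a standard t table (15 degrees of freedom, upper-tail area 0.025); that constant is an assumption of the example, and in Stata the corresponding value would come from display invttail(15, 0.025).

```python
import math
import statistics

# CO2 values from the 16 biopsy samples listed in the question.
co2 = [1009, 1280, 1180, 1255, 1547, 2352, 1956, 1080,
       1776, 1767, 1680, 2050, 1452, 2857, 3100, 1621]

n = len(co2)
xbar = statistics.mean(co2)
s = statistics.stdev(co2)        # sample SD, n - 1 denominator

# Assumed constant: t critical value for 15 df and a two-sided 95% interval,
# taken from a standard t table (not computed here).
t_crit = 2.131
margin = t_crit * s / math.sqrt(n)
t_lower, t_upper = xbar - margin, xbar + margin

print(f"mean = {xbar}, s = {s:.2f}")
print(f"95% t interval: ({t_lower:.2f}, {t_upper:.2f})")
```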
(c) Compare the intervals in (a) and (b). Which one is wider? Why? (10 pts) Answer:
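The t-based interval in (b) is wider than the z-based interval in (a), for two reasons. First, the sample standard deviation used in (b) is larger than the assumed population standard deviation of 350 used in (a). Second, the t critical value with n - 1 = 15 degrees of freedom is larger than the z critical value of 1.96, which reflects the extra uncertainty from estimating the standard deviation.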
Hint: You will need Stata to calculate the critical values in answering (b) and (c). Please directly copy and paste the Stata commands. No need to put them in the do file to submit.
4. [Stata] (25 pts) Data from a hypothetical sample of 32 white males over 40 years of age, taken from the text Applied Regression Analysis by Kleinbaum et al.
(a) Read sbp.csv into Stata. Summarize systolic blood pressure for all observations. What are the 25th, 50th and 75th percentiles? (5 pts) Answer:
(Stata command: summarize sbp, detail)
                            sbp
      Percentiles      Smallest
 1%          120            120
 5%          122            122
10%          129            126       Obs                  32
25%        134.5            129       Sum of wgt.          32

50%          143                      Mean           144.5313
                        Largest       Std. dev.      14.39755
75%          152            164
90%          164            166       Variance       207.2893
95%          170            170       Skewness       .5020894
99%          180            180       Kurtosis        2.72561
Here,
25% is 134.5
50% is 143
75% is 152
(b) Summarize systolic blood pressure by smoking status. What are the 25th, 50th and 75th percentiles of systolic blood pressure for non-smokers and for current or former smokers, respectively? Draw a side-by-side boxplot of systolic blood pressure for the two smoking groups. (15 pts) Answer:
(Stata command: by smk, sort: summarize sbp, detail)
-> smk = 0 (non-smokers)
                            sbp
      Percentiles      Smallest
 1%          120            120
 5%          120            122
10%          122            130       Obs                  15
25%          132            132       Sum of wgt.          15

50%          138                      Mean              140.8
                        Largest       Std. dev.      12.90183
75%          152            152
90%          161            152       Variance       166.4571
95%          164            161       Skewness       .2029599
99%          164            164       Kurtosis       2.277845
-> smk = 1 (smokers)
                            sbp
      Percentiles      Smallest
 1%          126            126
 5%          126            129
10%          129            132       Obs                  17
25%          138            134       Sum of wgt.          17

50%          145                      Mean           147.8235
                        Largest       Std. dev.      15.21198
75%          160            162
90%          170            166       Variance       231.4044
95%          180            170       Skewness       .5401733
99%          180            180       Kurtosis       2.419973
Systolic blood pressure for non-smokers:
25% is 132
50% is 138
75% is 152
Systolic blood pressure for current or former smokers:
25% is 138
50% is 145
75% is 160
Boxplot: [Stata command: graph box sbp, over(smk)]
(c) What can you conclude (5pts)? Answer:
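In this sample, non-smokers have lower systolic blood pressure than current or former smokers: the median is 138 versus 145 and the mean is 140.8 versus 147.8. This suggests an association between smoking and higher systolic blood pressure, and therefore possibly a higher risk of hypertension among smokers. With only 32 observations, no formal test, and no adjustment for age or body size, it does not prove a direct relationship.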
Variables: systolic blood pressure (SBP), body size (QUET = 100(weight/height^2)), age (AGE) and smoking history (SMK=0 if non-smoker, SMK=1 if current or former smoker). (You need to record all commands used for this question in the do file and submit it.)