Homework 4 Solutions_Fall 2020

School

University of Texas

Course

PH 1830

Subject

Statistics

Date

Apr 3, 2024

PH 1830 Categorical Data Analysis Homework 4 Solutions

Homework 4: Agresti 4.4, 4.5, 4.9, 4.20

4.4

a) The positive sign of the coefficient of x shows that snoring level has a positive effect on heart disease: as the snoring level increases, the odds of heart disease increase. (5 points)

b) logit(p) = -3.866 + 0.397x

At x = 0, p = exp(-3.866)/(1 + exp(-3.866)) = 0.0209/1.0209 = 0.0205 (5 points)

At x = 5, p = exp(-3.866 + 0.397*5)/(1 + exp(-3.866 + 0.397*5)) = 0.132 (5 points)

The estimated probability of heart disease is 0.0205 at snoring level 0 and 0.132 at snoring level 5.

c) b = 0.397 and exp(0.397) = 1.487, which means the odds of heart disease are multiplied by 1.487 (an increase of about 49%) for every one-unit increase in snoring score. (5 points)

4.5

a) (9 points -- 3pts output, 3pts equation, 3pts interpret) The fitted model is logit(p) = 15.04 - 0.232x, where x is the temperature, and exp(-0.232) = 0.793. With a 1-degree increase in temperature, the odds that at least one primary O-ring suffers thermal distress are expected to decrease by about 21%.

b) At x = 31, p = exp(15.04 - 0.232*31)/(1 + exp(15.04 - 0.232*31)) = 0.9996 (6 points)
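The probability and odds calculations in 4.4 b), 4.4 c), and 4.5 b) can be checked with a few lines of Python. This is only a sketch for verifying the arithmetic: the function names are made up for illustration, and the coefficients are the rounded values quoted in the solutions, not a refit of the data.

```python
import math

def inv_logit(eta):
    """Convert a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def snoring_prob(x, intercept=-3.866, slope=0.397):
    """4.4: fitted model logit(p) = -3.866 + 0.397 * x."""
    return inv_logit(intercept + slope * x)

def oring_prob(temp, intercept=15.04, slope=-0.232):
    """4.5: fitted model logit(p) = 15.04 - 0.232 * temp."""
    return inv_logit(intercept + slope * temp)

p_snore_0 = snoring_prob(0)          # 4.4 b), snoring level 0
p_snore_5 = snoring_prob(5)          # 4.4 b), snoring level 5
odds_ratio_snore = math.exp(0.397)   # 4.4 c), odds multiplier per unit of x
p_oring_31 = oring_prob(31)          # 4.5 b), temperature 31

print(round(p_snore_0, 4), round(p_snore_5, 3))
print(round(odds_ratio_snore, 3))
print(round(p_oring_31, 4))
```

Writing the inverse logit as 1/(1 + exp(-eta)) is algebraically the same as exp(eta)/(1 + exp(eta)) used in the hand calculation.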
c) At p = 0.5, x = (logit(0.5) - 15.04)/(-0.232) = 15.04/0.232 = 64.83. At a temperature of about 64.8 °F, the estimated probability changes by -0.232/4 = -0.058 per degree increase; that is, it decreases by about 0.058 for each 1-degree rise. (6 points)

d) exp(-0.232) = 0.793. With a 1-degree increase in temperature, the odds that at least one primary O-ring suffers thermal distress are expected to decrease by about 21%. (6 points)

e)
i) Wald test. As shown in the STATA output, the p-value for the Wald test is 0.032. We reject the null hypothesis that temperature has no effect at the 0.05 significance level. (5 points)
ii) Likelihood ratio test. As shown in the STATA output, the p-value for the LRT is 0.0048. We reject the null hypothesis that temperature has no effect at the 0.05 significance level. (5 points)

4.9

a) The model is logit(p) = b0 + b1*col1 + b2*col2 + b3*col3. The color variable is dummy coded as col1 = I{color == 2}, col2 = I{color == 3}, col3 = I{color == 4}, with color = 5 as the baseline. The fitted model is logit(p) = -0.76214 + 1.86*col1 + 1.74*col2 + 1.13*col3. col1 = 1 indicates a crab of color 2, so the estimated odds that a color 2 crab has a satellite are exp(1.86) = 6.42 times the odds for a color 5 crab. (9 points--3 pts equation, 3 pts output, 3 pts interpret)
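The temperature at which the fitted probability equals 0.5, the slope of the probability curve at that point, and the color odds ratios in 4.9 a) all follow directly from the fitted coefficients. A minimal sketch using the rounded coefficients from the solutions (variable names are illustrative only):

```python
import math

# 4.5 c): fitted O-ring model, logit(p) = 15.04 - 0.232 * temperature
b0, b1 = 15.04, -0.232

# p = 0.5 means logit(p) = 0, so temperature = -b0 / b1
temp_at_half = -b0 / b1

# The slope of the probability curve is b1 * p * (1 - p), which is b1 / 4 at p = 0.5
slope_at_half = b1 * 0.5 * (1 - 0.5)

# 4.9 a): dummy-coded color coefficients, with color 5 as the baseline
color_coefs = {2: 1.86, 3: 1.74, 4: 1.13}
odds_ratios_vs_color5 = {color: math.exp(b) for color, b in color_coefs.items()}

print(round(temp_at_half, 2), round(slope_at_half, 3))
for color, ratio in odds_ratios_vs_color5.items():
    print(f"color {color} vs color 5: odds ratio {ratio:.2f}")
```

Each odds ratio compares one color category with the baseline (color 5), so the three values are not comparisons between adjacent colors.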
b) Test the hypotheses H0: b1 = b2 = b3 = 0 vs. Ha: b1, b2, and b3 are not all 0. The likelihood ratio test statistic is 13.7 (df = 3) with a p-value of 0.0033. So we reject the null hypothesis and conclude that color has a significant effect in the model. (6 points)
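The p-value for the likelihood ratio statistic in 4.9 b) can be reproduced without a statistics package, because the chi-square upper-tail probability has a closed form for 3 degrees of freedom (one for each dummy coefficient being tested). The helper below is a hypothetical convenience function for this check and deliberately supports only df = 1 and df = 3.

```python
import math

def chi2_sf(stat, df):
    """Upper-tail chi-square probability, closed forms for df = 1 and df = 3 only."""
    if stat < 0:
        raise ValueError("the test statistic must be non-negative")
    root = math.sqrt(stat / 2.0)
    if df == 1:
        return math.erfc(root)
    if df == 3:
        return math.erfc(root) + math.sqrt(2.0 * stat / math.pi) * math.exp(-stat / 2.0)
    raise ValueError("only df = 1 and df = 3 are implemented in this sketch")

# 4.9 b): LR statistic 13.7 comparing the dummy-coded model with the null model
lrt_p_value = chi2_sf(13.7, df=3)
print(round(lrt_p_value, 4))
```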
c) When color is treated as a quantitative variable, it enters the model directly rather than through dummy variables. To be specific, we generate a new color variable = color - 1, so that the new variable ranges from 1 to 4. The fitted equation is logit(p) = 2.363 - 0.715*color, and exp(-0.715) = 0.489. With each 1-unit increase in color, the odds of having a satellite are expected to decrease by about 51%. (6 points)

d) Test the hypotheses H0: b1 = 0 vs. Ha: b1 is not 0. The Wald test statistic is 11.64 with df = 1 and a p-value of 0.0006. So we reject the null hypothesis and conclude that color has a significant effect in the model. (5 points)
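The numbers in 4.9 c) and d) can be checked as follows. A Wald chi-square statistic with 1 degree of freedom is the square of a standard normal z statistic, so its p-value is a two-sided normal tail probability. This is a sketch with illustrative variable names, using the rounded coefficients quoted above.

```python
import math

# 4.9 c): linear-score model, logit(p) = 2.363 - 0.715 * score, score = color - 1
b0, b1 = 2.363, -0.715

odds_multiplier = math.exp(b1)                  # odds multiplier per 1-unit increase
percent_change = (odds_multiplier - 1) * 100    # negative value means a decrease

# Fitted probability of having a satellite at each score 1..4
fitted = {s: 1 / (1 + math.exp(-(b0 + b1 * s))) for s in range(1, 5)}

# 4.9 d): Wald chi-square statistic with df = 1
wald_chi2 = 11.64
z = math.sqrt(wald_chi2)
p_value = math.erfc(z / math.sqrt(2))           # two-sided normal tail probability

print(round(odds_multiplier, 3), round(percent_change, 1))
print({s: round(p, 3) for s, p in fitted.items()})
print(round(p_value, 4))
```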
e) When color is treated as a quantitative variable, the hypothesis test for color compares a one-predictor model with the no-predictor model on 1 degree of freedom, which makes this model simpler than the dummy-coded model. However, the quantitative coding does not reflect the fact that levels 2-5 only label different color categories and do not necessarily differ by one unit from one level to the next. This can result in a lack of fit. (5 points)

4.20

a) The p-value is 0.011, less than 0.05, so the treatment effect is significant. Controlling for center, the estimated log odds of success (vs. failure) are 0.777 higher with the drug than with placebo; equivalently, the odds of success are multiplied by exp(0.777) = 2.17. (6 points)

b) H0: the odds ratios are the same across centers. H1: the odds ratios differ across centers. Since the p-value is 0.0115, less than 0.05, we reject the null hypothesis. (6 points)
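The equal-spacing concern in 4.9 e) and the log odds in 4.20 a) can both be made concrete with the coefficients already reported. The sketch below puts the fitted probabilities from the dummy-coded model and the linear-score model side by side for each color, then converts the treatment log odds ratio to an odds ratio. Variable names are illustrative and the coefficients are the rounded values from the solutions.

```python
import math

def inv_logit(eta):
    """Convert a log-odds value to a probability."""
    return 1 / (1 + math.exp(-eta))

# Dummy-coded fit from 4.9 a); color 5 is the baseline (effect 0)
dummy_intercept = -0.76214
dummy_effects = {2: 1.86, 3: 1.74, 4: 1.13, 5: 0.0}

# Linear-score fit from 4.9 c), with score = color - 1
lin_b0, lin_b1 = 2.363, -0.715

p_dummy = {c: inv_logit(dummy_intercept + e) for c, e in dummy_effects.items()}
p_linear = {c: inv_logit(lin_b0 + lin_b1 * (c - 1)) for c in dummy_effects}

for color in sorted(dummy_effects):
    print(color, round(p_dummy[color], 3), round(p_linear[color], 3))

# 4.20 a): treatment log odds ratio, controlling for center
treatment_log_or = 0.777
treatment_or = math.exp(treatment_log_or)
print(round(treatment_or, 2))
```

In the dummy-coded fit the steps between adjacent colors on the logit scale are unequal (1.86 to 1.74 to 1.13 to 0), whereas the linear-score model forces a constant step of -0.715. Whether that difference amounts to a real lack of fit would need a formal comparison of the two models, which these coefficients alone do not provide.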
Extra questions: Apply forward, backward, and stepwise variable selection in STATA to the naltrexone drinking data provided in the module, with drinking coded as binary and all other variables coded as we have coded them in class. Compare the resulting 3 models. Are they different or the same? Is this to be expected?

From the output, the backward selection model and the backward stepwise selection model include the same covariates, and these differ from the covariates chosen by forward selection. This is to be expected, since forward and backward selection start from different models and apply different entry and removal criteria.
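A toy example shows why forward and backward selection need not arrive at the same model. Everything in this sketch is hypothetical: the three covariates a, b, c and their AIC values are invented and have nothing to do with the naltrexone data, and AIC is used as a stand-in criterion (STATA's stepwise command works with significance-level thresholds instead). The values are chosen so that b and c are useful only jointly.

```python
# HYPOTHETICAL AIC for every subset of three made-up covariates (lower is better).
AIC = {
    frozenset(): 100.0,
    frozenset("a"): 95.0,
    frozenset("b"): 99.0,
    frozenset("c"): 99.0,
    frozenset("ab"): 96.0,
    frozenset("ac"): 96.0,
    frozenset("bc"): 90.0,
    frozenset("abc"): 91.5,
}

def forward_select(candidates, score):
    """Start empty; repeatedly add the variable that lowers the score most."""
    current = frozenset()
    while True:
        options = [current | {v} for v in candidates if v not in current]
        if not options:
            return current
        best = min(options, key=score)
        if score(best) >= score(current):
            return current          # no addition improves the score
        current = best

def backward_select(candidates, score):
    """Start full; repeatedly drop the variable whose removal lowers the score most."""
    current = frozenset(candidates)
    while True:
        options = [current - {v} for v in current]
        if not options:
            return current
        best = min(options, key=score)
        if score(best) >= score(current):
            return current          # no removal improves the score
        current = best

forward_model = forward_select("abc", AIC.__getitem__)
backward_model = backward_select("abc", AIC.__getitem__)
print(sorted(forward_model), sorted(backward_model))
```

With these invented values, forward selection adds a first and then stops because neither b nor c helps on its own, while backward selection drops a and keeps the pair b, c. Each procedure only explores models one step away from its current model, so the two paths can end in different places.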
Forward selection:

Backward selection:

Stepwise selection (this is backward stepwise selection):