BAS 320 - Assignment 9 - Logistic Regression
Matthew Zook

Analysis of whether a beer is highly rated

I'm using a logistic regression model to predict the probability that a beer is highly rated from characteristics such as its sweetness and sourness. I'm building this model because I'm curious whether a beer's rating can be explained by its characteristics. The data come from BAS320datasetsYCategoricalFall23.RData and contain 3197 rows with 17 predictors.

Task 1 - Investigation of the relationship between the probability of a beer being highly rated and its sourness (a numeric X)

As the sourness of a beer increases, the probability of a low score decreases. The predicted probability crosses 50% at a sourness score of roughly 142.3.

M.task1 <- glm(BEER$newY ~ BEER$Sour, family = binomial)  # Fit a simple logistic regression predicting Y from X (numeric)
summary(M.task1)

## 
## Call:
## glm(formula = BEER$newY ~ BEER$Sour, family = binomial)
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  1.281000   0.055368  23.136   <2e-16 ***
## BEER$Sour   -0.009033   0.001049  -8.613   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 3768.1  on 3196  degrees of freedom
## Residual deviance: 3693.2  on 3195  degrees of freedom
## AIC: 3697.2
## 
## Number of Fisher Scoring iterations: 4

visualize_model(M.task1)
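The 50% crossover below is worked out by hand from the rounded estimates in the summary. As a sketch (using only the fitted object M.task1, nothing new assumed), the same quantity can be pulled straight from the model:

b <- coef(M.task1)       # intercept and slope of the fitted logistic regression
unname(-b[1] / b[2])     # sourness at which the predicted probability equals 0.5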
-1.281 / -0.009  # sourness at the 50% crossover, using the rounded estimates from the summary above

## [1] 142.3333

Task 2 - Investigation of the relationship between the probability of a beer being highly rated, its alcohol by volume (numeric X1), and a categorical X2

Beers with a higher alcohol by volume have a lower probability of being low quality.
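The R code for this task does not appear above. A minimal sketch of the kind of model described, assuming the alcohol-by-volume column is named ABV and using Style as a hypothetical categorical predictor (neither column name is confirmed by the output shown here):

M.task2 <- glm(newY ~ ABV + Style, data = BEER, family = binomial)  # numeric X1 (ABV) plus categorical X2 (Style); both names are assumptions
summary(M.task2)  # check the ABV coefficient to verify the direction of the relationship claimed above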
Task 4 - Assessing the full model

M.full2 <- glm(newY ~ .^2, data = BEER, family = binomial)  # Fit a regression predicting Y from all predictors and all interactions

## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred

confusion_matrix(M.full2)

## Warning in predict.lm(object, newdata, se.fit, scale = 1, type = if (type == :
## prediction from rank-deficient fit; attr(*, "non-estim") has doubtful cases

##             Predicted High Predicted Low Total
## Actual High            316           567   883
## Actual Low             196          2118  2314
## Total                  512          2685  3197

table(BEER$newY)  # You need a frequency table of the column containing Y and a calculation of the naive model's accuracy

## 
## High  Low 
##  883 2314

check_regression(M.full2)

## Method 1 (comparing each observation with simulated results given model is correct; not very sensitive)
## p-value of goodness of fit test is approximately 1
## 
## Method 2 (Hosmer-Lemeshow test with 10 categories; overly sensitive for large sample sizes)
## p-value of goodness of fit test is approximately 0
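The comment on table(BEER$newY) asks for the naive model's accuracy. A short calculation from the counts in the output above (the naive model always predicts the majority class, "Low"):

2314 / 3197           # naive accuracy: always predict "Low" (about 0.724)
(316 + 2118) / 3197   # full model accuracy: diagonal of the confusion matrix (about 0.761)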