
Spam filters are built on principles similar to those used in logistic regression. We fit a model for the probability that each message is spam. We have several variables for each email. Here are a few: to_multiple=1 if there are multiple recipients, winner=1 if the word 'winner' appears in the subject line, format=1 if the email is poorly formatted, and re_subj=1 if "re" appears in the subject line. A logistic model was fit to a dataset, with the following output:

              Estimate   SE       Z         Pr(>|Z|)
(Intercept)   -0.8161    0.086    -9.4895   0
to_multiple   -2.5651    0.3052   -8.4047   0
winner         1.5801    0.3156    5.0067   0
format        -0.1528    0.1136   -1.3451   0.1786
re_subj       -2.8401    0.363    -7.824    0
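
As a reading aid for the table, here is a minimal sketch of how the Z and Pr(>|Z|) columns relate to the first two. It assumes the usual convention that Z is the estimate divided by its standard error and that the p-value is two-sided under a standard normal; the function name `z_and_p` is hypothetical.

```python
import math

# (estimate, standard error) pairs copied from the table above.
ROWS = {
    "(Intercept)": (-0.8161, 0.086),
    "to_multiple": (-2.5651, 0.3052),
    "winner": (1.5801, 0.3156),
    "format": (-0.1528, 0.1136),
    "re_subj": (-2.8401, 0.363),
}


def z_and_p(estimate, se):
    """Return the z statistic and a two-sided normal p-value (assumed convention)."""
    z = estimate / se
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


for name, (estimate, se) in ROWS.items():
    z, p = z_and_p(estimate, se)
    print(f"{name:12s} z = {z:8.4f}  p = {p:.4f}")
```

Under that assumption, format is the only predictor whose p-value is not close to zero.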


(a) Write down the model using the coefficients from the model fit.
log_odds(spam) = -0.8161 - 2.5651 to_multiple + 1.5801 winner - 0.1528 format - 2.8401 re_subj
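
The model in (a) can be sketched as a small function. This is only an illustration of the linear predictor, using the coefficients from the table; the names `COEF` and `log_odds` are hypothetical.

```python
# Coefficients copied from the fitted-model output.
COEF = {
    "intercept": -0.8161,
    "to_multiple": -2.5651,
    "winner": 1.5801,
    "format": -0.1528,
    "re_subj": -2.8401,
}


def log_odds(to_multiple, winner, format_, re_subj):
    """Log-odds of spam: the intercept plus each coefficient times its 0/1 indicator."""
    return (
        COEF["intercept"]
        + COEF["to_multiple"] * to_multiple
        + COEF["winner"] * winner
        + COEF["format"] * format_
        + COEF["re_subj"] * re_subj
    )


# With every indicator set to 0, only the intercept remains.
print(log_odds(0, 0, 0, 0))
```

Each coefficient multiplies its indicator, so a variable equal to 0 contributes nothing to the log-odds.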

(b) Suppose we have an observation where to_multiple=0, winner=1, format=0, and re_subj=0. What is the predicted probability that this message is spam?
My answer: 0.682

My answer for the predicted probability was wrong. Can you help me figure out why?
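
For reference, here is a worked version of the calculation in (b), as a sketch that uses only the coefficients in the table. With to_multiple, format, and re_subj equal to 0, only the intercept and the winner term remain, and the probability comes from inverting the logit. The symbol \hat{p} for the predicted probability is notation introduced here.

```latex
\[
\log\frac{\hat{p}}{1-\hat{p}} = -0.8161 + 1.5801(1) = 0.7640,
\qquad
\hat{p} = \frac{e^{0.7640}}{1 + e^{0.7640}} \approx \frac{2.1468}{3.1468} \approx 0.6822
\]
```

This agrees with 0.682 to three decimal places, so under these coefficients the method itself looks sound. The discrepancy may lie elsewhere, for example in the rounding or entry format expected, which the question does not state.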