


Question


Spam filters are built on principles similar to those used in logistic regression. We fit a probability that each message is spam or not spam. We have several variables for each email. Here are a few: to_multiple=1 if there are multiple recipients, winner=1 if the word 'winner' appears in the subject line, format=1 if the email is poorly formatted, re_subj=1 if "re" appears in the subject line. A logistic model was fit to a dataset with the following output:

Term	Estimate	SE	Z	Pr(>|Z|)
(Intercept)	-0.8161	0.0860	-9.4895	0
to_multiple	-2.5651	0.3052	-8.4047	0
winner	1.5801	0.3156	5.0067	0
format	-0.1528	0.1136	-1.3451	0.1786
re_subj	-2.8401	0.3630	-7.8240	0

(a) Write down the model using the coefficients from the model fit.
log_odds(spam) = -0.8161 - 2.5651 to_multiple + 1.5801 winner - 0.1528 format - 2.8401 re_subj

(b) Suppose we have an observation where to_multiple=0, winner=1, format=0, and re_subj=0. What is the predicted probability that this message is spam?

My answer: 0.682

My answer for the predicted probability was wrong. Can you help me figure out why?
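
For reference, the arithmetic in part (b) can be checked with a short script. This is only a sketch: the coefficients are copied from the table in the question, and the function name `predicted_probability` and its argument names are made up for illustration. It computes the log-odds for the given observation and then applies the inverse logit, p = 1 / (1 + exp(-log_odds)).

```python
import math

# Coefficients copied from the model output in the question.
COEFFICIENTS = {
    "intercept": -0.8161,
    "to_multiple": -2.5651,
    "winner": 1.5801,
    "format": -0.1528,
    "re_subj": -2.8401,
}


def predicted_probability(to_multiple, winner, fmt, re_subj):
    """Hypothetical helper: return P(spam) for one email under the fitted model."""
    # Linear predictor: intercept plus each coefficient times its 0/1 indicator.
    log_odds = (
        COEFFICIENTS["intercept"]
        + COEFFICIENTS["to_multiple"] * to_multiple
        + COEFFICIENTS["winner"] * winner
        + COEFFICIENTS["format"] * fmt
        + COEFFICIENTS["re_subj"] * re_subj
    )
    # Inverse logit converts log-odds to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


# Observation from part (b): to_multiple=0, winner=1, format=0, re_subj=0.
p = predicted_probability(to_multiple=0, winner=1, fmt=0, re_subj=0)
print(round(p, 3))
```

With only the intercept and the winner term active, the log-odds are -0.8161 + 1.5801 = 0.7640, and the inverse logit of 0.7640 rounds to 0.682, so the arithmetic agrees with the value given above. If an automated grader rejected it, the required number of decimal places or the expected format (for example a percentage) may be worth checking.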

Expert Solution