(a) Write down the model using the coefficients from the model fit. log_odds(spam) = ___ + ___ × to_multiple + ___ × winner + ___ × format + ___ × re_subj (b) Suppose we have an observation where to_multiple=0, winner=1, format=0, and re_subj=0. What is the predicted probability that this message is spam?

Question
Spam filters are built on principles similar to those used in logistic regression: we model the probability that each message is spam. Several variables are recorded for each email. Here are a few:
to_multiple=1 if there are multiple recipients,
winner=1 if the word 'winner' appears in the subject line,
format=1 if the email is poorly formatted,
re_subj=1 if "re" appears in the subject line.
A logistic model was fit to a dataset with the following output:
              Estimate   SE       Z         Pr(>|Z|)
(Intercept)   -0.8166    0.0883   -9.248
to_multiple   -2.6583    0.3023   -8.7936
winner         1.5705    0.3178    4.9418
format        -0.0799    0.1232   -0.6485   0.5166
re_subj       -2.807     0.3652   -7.6862
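The third number in each row equals the estimate divided by its standard error, which suggests it is the Wald Z statistic, and the one surviving p-value (0.5166, for format) matches a two-sided normal test. The sketch below checks this; the function name `wald_test` and the assignment of values to rows are assumptions made from reading the flattened output, not something the source states.

```python
import math

def wald_test(estimate, se):
    """Return (z, two-sided normal p-value) for a coefficient estimate."""
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p

# (estimate, SE) pairs as read from the output; the row assignment is an assumption
rows = {
    "(Intercept)": (-0.8166, 0.0883),
    "to_multiple": (-2.6583, 0.3023),
    "winner": (1.5705, 0.3178),
    "format": (-0.0799, 0.1232),
    "re_subj": (-2.807, 0.3652),
}

for name, (est, se) in rows.items():
    z, p = wald_test(est, se)
    print(f"{name:12s} z = {z:8.4f}   p = {p:.4g}")
```

Under this reading, format is the only predictor whose coefficient is not clearly distinguishable from zero.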
(a) Write down the model using the coefficients from the model fit.
log_odds(spam) = ___ + ___ × to_multiple + ___ × winner + ___ × format + ___ × re_subj
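One way to fill in part (a), assuming the Estimate column is read so that the intercept is -0.8166 and to_multiple is -2.6583 (the flattened output lists those two row labels together, so this pairing is an inference), with p denoting the probability that a message is spam:

```latex
\[
\log\!\left(\frac{p}{1-p}\right)
  = -0.8166
  - 2.6583\,\texttt{to\_multiple}
  + 1.5705\,\texttt{winner}
  - 0.0799\,\texttt{format}
  - 2.807\,\texttt{re\_subj}
\]
```

Each coefficient is the change in the log-odds of spam when the corresponding indicator switches from 0 to 1, holding the others fixed.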
(b) Suppose we have an observation where to_multiple=0, winner=1, format=0, and re_subj=0. What is the
predicted probability that this message is spam?
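For part (b), the log-odds are converted to a probability with the inverse logit, p = 1 / (1 + e^(-log_odds)). With only winner=1, the log-odds reduce to the intercept plus the winner coefficient: -0.8166 + 1.5705 = 0.7539. The sketch below does the arithmetic; the helper name `predict_spam_probability` is illustrative, and the coefficients assume the same reading of the output table as in part (a).

```python
import math

# Coefficients as read from the model output (row assignment is an assumption)
coef = {
    "intercept": -0.8166,
    "to_multiple": -2.6583,
    "winner": 1.5705,
    "format": -0.0799,
    "re_subj": -2.807,
}

def predict_spam_probability(x):
    """Return P(spam) for a dict mapping predictor names to 0/1 values."""
    log_odds = coef["intercept"] + sum(coef[name] * value for name, value in x.items())
    return 1.0 / (1.0 + math.exp(-log_odds))  # inverse logit

message = {"to_multiple": 0, "winner": 1, "format": 0, "re_subj": 0}
p_spam = predict_spam_probability(message)
print(f"P(spam) = {p_spam:.3f}")  # prints "P(spam) = 0.680"
```

Under these assumptions, the predicted probability that the message is spam is about 0.68.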