This case study provides a structured activity that introduces students to basic data preparation, modeling, and analysis, motivated by an engaging topic. The students are provided background information and data for point spreads of National Football League games. Using these point spreads, the students model the relationship between point spread and the probability of winning the game using linear and logistic regression. We often teach it during the fall term, when the NFL is in full swing, so it is easy for students to relate to the data.

Essentially, if the point spread is less than double digits, the model states that the probability that a team wins a game equals 50% plus 3% for every unit of point spread. Stern states that the linear approximation for the probability of winning is 0.50 + 0.03p, where p is the point spread. At a certain point, the linear model breaks down as the probability estimate exceeds 100%.

[...] +700, meaning that a $100 bet would win $700 if the underdog wins the game. A heavy favorite to win might be listed as 1/9 or -900, meaning a $100 bet would only win about $11 if the favorite wins the game. Consider a football game where the underdog payoff is 7/1 and the favorite payoff is 1/9. To assess the underdog's probability of winning, divide the denominator by the sum of the numerator and the denominator to get 1/(7 + 1) = 1/8 = 12.5%. Similarly, an estimate for the favorite is 9/(1 + 9) = 9/10 = 90%, which leaves 10% for the underdog. So, the best estimate for the underdog's probability of winning is likely somewhere between 10% and 12.5%.

Namely, the probability that an event will occur is the fraction of times you expect to see that event in many trials. Probabilities range between zero and one. The odds are defined as the probability that the event will occur [...]

[...] like StatTools could be used, eliminating the need to use Solver for the logistic regression and providing output like the Receiver Operating Characteristic curve and lift chart. Or, if the students are proficient at coding, the analysis could be done in Python or R, allowing the case to take on a heavier emphasis on data science.

This case study allows students to build both linear and logistic models of real sports data. In our experience, students enjoy either the sports aspect of it or fitting the logistic curve, and some enjoy both. There are several highlights to point out for this case as well as an abundance of teachable moments. We describe these below in order of occurrence, assuming students first do the general assignment and then work on the additional options.

In our experience, the more practice students get with Pivot Tables, the better. Simple things, like each column of data needing a header, are easily forgotten. This can lead to a great classroom discussion of statistical sampling error.

When the students fit the first line to the scatter diagram, they notice that it crosses 100% when the point spread exceeds 17. Even though the business students in the class are not necessarily mathematically oriented, this rubs them the wrong way and they realize something is off. Both of these games are right at the point where the linear model breaks down, and it is evident that no outcome is guaranteed, whatever the point spread may be.

Although most of the students have used the trendline function in Excel, many of them are not familiar with its options, such as forcing the y-intercept, or with why this might be appropriate. This is another opportunity for classroom discussion.

Fitting the logistic curves in the standard and nonstandard ways tends to be a highlight of the case study. The students want a model that fits, that is, one that doesn't predict probabilities that exceed 100% as the linear model does. In our class, we follow standard logistic regression by fitting the general lo-
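For readers who take the Python route mentioned above, the arithmetic in this excerpt can be sketched in a few lines. This is a minimal illustration, not the authors' spreadsheet or fitted model: the function names are made up for this example, and the logistic slope of 0.12 is an assumption chosen only so that the curve rises by about 3% per point near a spread of zero, matching the linear rule there. It is not a value estimated from the NFL data.

```python
import math


def linear_win_prob(spread):
    """Linear approximation quoted from Stern: 0.50 + 0.03 * spread.

    Deliberately not clipped, so the breakdown above 100% stays visible.
    """
    return 0.50 + 0.03 * spread


def implied_prob(numerator, denominator):
    """Win probability implied by fractional odds numerator/denominator.

    Follows the rule in the text: denominator / (numerator + denominator).
    """
    return denominator / (numerator + denominator)


def logistic_win_prob(spread, slope=0.12):
    """Logistic curve through 50% at a spread of zero.

    The slope is a hypothetical value (4 * 0.03), not a fitted coefficient.
    """
    return 1.0 / (1.0 + math.exp(-slope * spread))


if __name__ == "__main__":
    # Underdog at 7/1 and favorite at 1/9, as in the worked example.
    underdog_direct = implied_prob(7, 1)            # 1/8
    underdog_from_favorite = 1.0 - implied_prob(1, 9)  # 1 - 9/10
    print(f"underdog estimate: {underdog_from_favorite:.3f} to {underdog_direct:.3f}")

    # Compare the two models at a few point spreads.
    for spread in (0, 3, 10, 17, 20):
        print(spread,
              round(linear_win_prob(spread), 3),
              round(logistic_win_prob(spread), 3))
```

At a spread of 17 the linear rule returns a value above 1, while the logistic curve stays strictly between 0 and 1 for any spread, which is the property the students are looking for.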
Question
Read completely and identify the crux of the problem