|Midterm 2|, |Fall 2023|: |Punt, Kick, or Go for it?|¶ <#Midterm-2,-Fall-2023:-
Punt,-Kick,-or-Go-for-it?>
Revision history:
/Version 1.0/: Initial release
/All of the header information is important. Please read it./
*Topics, number of exercises:* This problem builds on your knowledge of
mostly Pandas with a little NumPy. It has *9* exercises, numbered 0 to
*8*. There are *21* available points. However, to earn 100% the
threshold is *13* points. (Therefore, once you hit *13* points, you can
stop. There is no extra credit for exceeding this threshold.)
*Exercise ordering:* Each exercise builds logically on previous
exercises, but you may solve them in any order. That is, if you can't
solve an exercise, you can still move on and try the next one. Use this
to your advantage, as the exercises are *not* necessarily ordered in
terms of difficulty. Higher point values generally indicate more
difficult exercises.
*Demo cells:* Code cells starting with the comment |### define demo inputs| load results from prior exercises applied to the entire data set and
use those to build demo inputs. These must be run for subsequent demos
to work properly, but they do not affect the test cells. The data loaded
in these cells may be rather large (at least in terms of human
readability). You are free to print or otherwise use Python to explore
them, but we did not print them in the starter code.
*Debugging your code:* Right before each exercise test cell, there is a
block of text explaining the variables available to you for debugging.
You may use these to test your code and can print/display them as needed
(be careful when printing large objects; you may want to print the head or
chunks of rows at a time).
*Exercise point breakdown:*
* Exercise 0: *2* point(s)
* Exercise 1: *2* point(s)
* Exercise 2: *2* point(s)
* Exercise 3: *2* point(s)
* Exercise 4: *3* point(s)
* Exercise 5: *2* point(s)
* Exercise 6: *3* point(s)
* Exercise 7: *2* point(s)
* Exercise 8: *3* point(s) - Depends on Exercise 7
*Final reminders:*
* Submit after *every exercise*
* Review the generated grade report after you submit to see what
errors were returned
* Stay calm, skip problems as needed, and take short breaks at your
leisure
Topic Introduction¶ <#Topic-Introduction>
(Brief) Primer on American football¶
<#(Brief)-Primer-on-American-football>
While this analysis is based on American football, having a deep
knowledge of the intricacies of the sport *is not necessary to complete
this notebook.* Do not dwell on this primer information.
* A football game is a contest between two teams. One team is
designated the "home" team and one team is designated the "away"
team. The home/away designation will not change during the game.
* Football games are timed. When the game clock reaches zero, the game
ends, and the team with the higher score wins.
* One team possesses the ball at a time. This team is designated as
the offense. The team designated as the offense will change during
the course of the game.
* When the offense takes possession of the ball it gets 4 attempts to
advance the ball to the line to gain. Each attempt is referred to as
a "down". After the 4th down, if the offense has not advanced the
ball to the line to gain the other team will take possession.
* If the offense advances the ball past the line to gain, they get
another 4 downs and the line to gain is reset to 10 yards beyond where
progress stopped on the previous attempt.
* On each down the offense has three options:
o Run a play to attempt to advance the ball. There are many
potential outcomes; generally (but not always) the offense will
either score a touchdown or retain possession of the ball.
o Punt the ball to the other team. The ball advances, but the
other team takes possession.
o Attempt to kick a field goal. A successful attempt scores
points, but an unsuccessful attempt will move the ball backwards
and turn possession over to the other team.
Our analysis¶ <#Our-analysis>
This framework makes the offense's decision on what to do on fourth
down an interesting question. We will provide data-driven guidance on
which option (punting, kicking a field goal attempt, or running a play)
will give the offense the best chance of winning the game.
In [ ]:
### Global Imports
### BEGIN HIDDEN TESTS
if False: # set to True to set up
    import dill
    import hashlib
    def hash_check(f1, f2, verbose=True):
        with open(f1, 'rb') as f:
            h1 = hashlib.md5(f.read()).hexdigest()
        with open(f2, 'rb') as f:
            h2 = hashlib.md5(f.read()).hexdigest()
        if verbose:
            print(h1)
            print(h2)
        assert h1 == h2, f'The file "{f1}" has been modified'
    with open('resource/asnlib/public/hash_check.pkl', 'wb') as f:
        dill.dump(hash_check, f)
    del hash_check
with open('resource/asnlib/public/hash_check.pkl', 'rb') as f:
    hash_check = dill.load(f)
for fname in ['testers.py', '__init__.py', 'test_utils.py']:
    hash_check(f'tester_fw/{fname}', f'resource/asnlib/public/{fname}')
del hash_check
### END HIDDEN TESTS
# Import required modules
# Feel free to import anything else you find useful
import pandas as pd
import dill as pickle
from matplotlib import pyplot as plt
import numpy as np
from scipy.stats import norm
from football_utils import *
# loading the raw data
with open('resource/asnlib/publicdata/all_events_df.pkl', 'rb') as f:
all_events_df = pickle.load(f)
A look at our data¶ <#A-look-at-our-data>
We have sourced play level data from ESPN for over 2000 games from the
2000-2022 NFL seasons and loaded it into a Pandas DataFrame. The
meanings of key columns are as follows:
* |play_id|, |drive_id|, |event_id| - Unique identifiers for a play, possession, and game respectively.
* |type| - the type of play which was run.
* |scoringPlay| - |True| if the result of a play was a score. |False| otherwise.
* |awayScore|, |homeScore| - the scores of the away and home teams, respectively, *after the play
occurred*.
* |period| - the quarter in which the play started.
* |clock| - the time remaining in the quarter when the play started.
* |homeTeamPoss| - |True| if the home team is on offense when the play started.
* |down| - The down when the play started.
* |distance| - the distance from the current position of the ball to the line
to gain.
* |yardsToEndzone| - the distance from the current position of the ball to the endzone.
In [ ]:
all_events_df.sample(5, random_state=6040)
In [ ]:
all_events_df.dtypes
Exercise 0 - (*2* Points):¶ <#Exercise-0---(2-Points):>
To get a meaningful input for a model we need to convert the |period| and |clock| fields into a numerical measure of the time remaining in the game (in
seconds). To do so we can apply the following formula:
$$\text{timeLeft} = (15 \times 60)(4 - \text{period}) + 60 \cdot \text{clockMinutes} + \text{clockSeconds}$$
Define |calc_time_left(all_events_df: pd.DataFrame) -> pd.DataFrame|
Input: |all_events_df| (DataFrame) - will contain |period| and |clock| fields.
* The |clock| field is a |str| with the format |'{clockMinutes}:{clockSeconds}'|.
For example: |'9:48'| → clockMinutes = 9 and clockSeconds = 48.
Your solution should parse the |clock| and |period| fields and return a /copy of/ |
all_events_df| with a new field |timeLeft| (|dtype = 'int64'|) calculated per the formula above.
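If you want a quick sanity check of the formula before writing your solution, here is a minimal standalone sketch (the variable names here are ours, not part of the starter code) that reproduces the first demo row shown below:
```
# Sanity check of the timeLeft formula on a single made-up row
# (period 3, clock '9:48'); this should match the demo output below (1488).
period = 3
clock = '9:48'
clock_minutes, clock_seconds = (int(x) for x in clock.split(':'))
time_left = (15 * 60) * (4 - period) + 60 * clock_minutes + clock_seconds
print(time_left)  # 1488
```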
In [ ]:
### Define demo inputs
demo_all_events_df_ex0 = all_events_df\
.sample(5, random_state=6040)\
.copy()\
.loc[:, ['event_id', 'period', 'clock']]
The demo included in the solution cell below should display the
following output:

|     | event_id  | period | clock | timeLeft |
|----:|----------:|-------:|:------|---------:|
| 99  | 400874638 | 3      | 9:48  | 1488     |
| 154 | 400874535 | 4      | 9:48  | 588      |
| 27  | 401030721 | 1      | 5:29  | 3029     |
| 83  | 401127970 | 2      | 0:02  | 1802     |
| 26  | 400791570 | 1      | 4:13  | 2953     |
In [ ]:
### Exercise 0 solution
def calc_time_left(all_events_df: pd.DataFrame) -> pd.DataFrame:
    ### BEGIN SOLUTION
    def str_to_sec(s):
        min_str, sec_str = s.split(':')
        return 60 * int(min_str) + int(sec_str)
    _all_events_df = all_events_df.copy()
    _all_events_df['timeLeft'] = \
        (4 - _all_events_df['period']) * 15 * 60 + _all_events_df['clock'].apply(str_to_sec)
    return _all_events_df
    ### END SOLUTION
### demo function call
calc_time_left(demo_all_events_df_ex0)
The cell below will test your solution for Exercise 0. The testing
variables will be available for debugging under the following names in a
dictionary format.
* |input_vars| - Input variables for your solution.
* |original_input_vars| - Copy of input variables from prior to running your solution.
These /should/ be the same as |input_vars| - otherwise the inputs were modified
by your solution.
* |returned_output_vars| - Outputs returned by your solution.
* |true_output_vars| - The expected output. This /should/ "match" |
returned_output_vars| based on the question requirements - otherwise, your solution
is
not returning the correct output.
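For example, one way to poke at these variables after a failed run might look like the sketch below. (This is illustrative only; it assumes the output is a DataFrame stored under the 'output_0' key used in the test configuration, and that you are running in a notebook where |display| is available.)
```
# Illustrative debugging sketch -- adjust the key names to match the test config.
display(returned_output_vars['output_0'].head())   # what your solution returned
display(true_output_vars['output_0'].head())       # what the test expected
# If the two DataFrames share the same shape and index, pandas can show
# cell-level differences:
# returned_output_vars['output_0'].compare(true_output_vars['output_0'])
```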
In [ ]:
### test_cell_ex0
### BEGIN HIDDEN TESTS
import dill
import hashlib
with open('resource/asnlib/public/hash_check.pkl', 'rb') as f:
hash_check = dill.load(f)
for fname in ['testers.py', '__init__.py', 'test_utils.py']:
hash_check(f'tester_fw/{fname}', f'resource/asnlib/public/{fname}')
del hash_check
del dill
del hashlib
### END HIDDEN TESTS
from tester_fw.testers import Tester
conf = {
'case_file':'tc_0', 'func': calc_time_left, # replace this with the function defined above
'inputs':{ # input config dict. keys are parameter names
'all_events_df':{
'dtype':'df', # data type of param.
'check_modified':True,
}
},
'outputs':{
'output_0':{
'index':0,
'dtype':'pd.DataFrame',
'check_dtype': True,
'check_col_dtypes': True, # Ignored if dtype is not df
'check_col_order': False, # Ignored if dtype is not df
'check_row_order': False, # Ignored if dtype is not df
'check_column_type': True, # Ignored if dtype is not df
'float_tolerance': 10 ** (-6)
}
}
}
tester = Tester(conf, key=b'qni4-JKoB2OXw7cdPu6VxK1dNkBTJmEW6jYuJjRdBEg=', path='resource/asnlib/publicdata/')
for _ in range(70):
    try:
        tester.run_test()
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
    except:
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
        raise
### BEGIN HIDDEN TESTS
tester = Tester(conf, key=b'yApq-b21ZcdMzXYB0JMmFYoBUUR6xEVU-u_l0EE5jWI=', path='resource/asnlib/publicdata/encrypted/')
for _ in range(20):
    try:
        tester.run_test()
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
    except:
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
        raise
### END HIDDEN TESTS
print('Passed! Please submit.')
The cell below will load the data with the |timeLeft| field added. If your solution
is correct this data is equivalent to
running the following code. (Don't run it in your exam as it takes some
time to run on the full data).
|time_all_events_df = calc_time_left(all_events_df)|
In [ ]:
with open('resource/asnlib/publicdata/time_all_events_df.pkl', 'rb') as f:
time_all_events_df = pickle.load(f).reset_index(drop=True)
Exercise 1 - (*2* Points):¶ <#Exercise-1---(2-Points):>
Our data includes some records which are not actually plays and should
not be considered in our analysis. Additionally, some of the games went
into an extra period due to being tied when the time left was zero. Any
game which went into the extra period also will not be considered in our
analysis.
Define |filter_non_plays_and_ot(df: pd.DataFrame) -> pd.DataFrame|
* Input: |df| (DataFrame) - will contain |type|, |timeLeft|, and |event_id| fields.
Your solution should do the following:
* Identify all rows with a |type| value that is in |non_play_types| (supplied in starter code). (Call these |non_play_rows|)
* Identify all unique |event_id| values occurring in rows where |timeLeft| is less than zero. (Call these |ot_event_ids|).
* Identify all rows where the |event_id| value is in |ot_event_ids|. (Call these |ot_event_rows|)
* Return /a copy/ of |df| with |non_play_rows| and |ot_event_rows| filtered out.
*Note* - you do not need to worry about the case where your solution
filters out all of the rows in the input. This is not expected when
applying this filter on the real data and will not be tested.
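As an illustration of the filtering steps above, here is a minimal sketch on a tiny made-up DataFrame (toy values of ours, not the exam data):
```
import pandas as pd

toy = pd.DataFrame({
    'type':     ['Rush', 'Timeout', 'Punt', 'Rush'],
    'timeLeft': [100,     50,       -5,     200],
    'event_id': ['A',    'A',       'B',    'C'],
})
bad_types = ['Timeout']                                  # stand-in for non_play_types
non_play_rows = toy['type'].isin(bad_types)              # rows to drop by type
ot_event_ids = toy.loc[toy['timeLeft'] < 0, 'event_id'].unique()
ot_event_rows = toy['event_id'].isin(ot_event_ids)       # every row of an OT game
print(toy.loc[~(non_play_rows | ot_event_rows)])         # keeps rows 0 and 3
```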
In [ ]:
### Define demo inputs
with open('resource/asnlib/publicdata/demo_df_ex1.pkl', 'rb') as f:
demo_df_ex1 = pickle.load(f)
demo_df_ex1
The demo included in the solution cell below should display the
following output:
type timeLeft event_id
3 Rush 976 401220231
7 Rushing Touchdown 3165 400554399
13 Rush 792 400874553
17 Pass Reception 926 401437633
Notice that row 0 has a negative |timeLeft| and has an |event_id| of |'401127999'|.
Since row 5 also has |event_id| of |'401127999'| it is also excluded.
In [ ]:
### Exercise 1 solution
def filter_non_plays_and_ot(df: pd.DataFrame) -> pd.DataFrame:
    non_play_types = ['Penalty', 'End Period', 'Two-minute warning', 'Timeout', 'End of Half',
                      'End of Game', 'Official Timeout', 'Defensive 2pt Conversion', 'Two Point Rush', 'Extra Point Good']
    ### BEGIN SOLUTION
    plays = ~df['type'].isin(non_play_types)
    ot_events = df.loc[df['timeLeft'] < 0, 'event_id'].unique()
    ends_in_regulation = ~df['event_id'].isin(ot_events)
    return df.loc[plays & ends_in_regulation, :]
    ### END SOLUTION
### demo function call
filter_non_plays_and_ot(demo_df_ex1)
The cell below will test your solution for Exercise 1. The testing
variables will be available for debugging under the following names in a
dictionary format.
* |input_vars| - Input variables for your solution.
* |original_input_vars| - Copy of input variables from prior to running your solution.
These /should/ be the same as |input_vars| - otherwise the inputs were modified
by your solution.
* |returned_output_vars| - Outputs returned by your solution.
* |true_output_vars| - The expected output. This /should/ "match" |
returned_output_vars| based on the question requirements - otherwise, your solution
is
not returning the correct output.
In [ ]:
### test_cell_ex1
### BEGIN HIDDEN TESTS
import dill
import hashlib
with open('resource/asnlib/public/hash_check.pkl', 'rb') as f:
hash_check = dill.load(f)
for fname in ['testers.py', '__init__.py', 'test_utils.py']:
hash_check(f'tester_fw/{fname}', f'resource/asnlib/public/{fname}')
del hash_check
del dill
del hashlib
### END HIDDEN TESTS
from tester_fw.testers import Tester
conf = {
'case_file':'tc_1', 'func': filter_non_plays_and_ot, # replace this with the function defined above
'inputs':{ # input config dict. keys are parameter names
'df':{
'dtype':'df', # data type of param.
'check_modified':True,
}
},
'outputs':{
'output_0':{
'index':0,
'dtype':'df',
'check_dtype': True,
'check_col_dtypes': True, # Ignored if dtype is not df
'check_col_order': False, # Ignored if dtype is not df
'check_row_order': False, # Ignored if dtype is not df
'check_column_type': True, # Ignored if dtype is not df
'float_tolerance': 10 ** (-6)
}
}
}
tester = Tester(conf, key=b'qni4-JKoB2OXw7cdPu6VxK1dNkBTJmEW6jYuJjRdBEg=', path='resource/asnlib/publicdata/')
for _ in range(70):
    try:
        tester.run_test()
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
    except:
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
        raise
### BEGIN HIDDEN TESTS
tester = Tester(conf, key=b'yApq-b21ZcdMzXYB0JMmFYoBUUR6xEVU-u_l0EE5jWI=', path='resource/asnlib/publicdata/encrypted/')
for _ in range(20):
    try:
        tester.run_test()
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
    except:
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
        raise
### END HIDDEN TESTS
print('Passed! Please submit.')
In [ ]:
with open('resource/asnlib/publicdata/filtered_all_events_df.pkl', 'rb') as f:
filtered_all_events_df = pickle.load(f).reset_index(drop=True)
Exercise 2 - (*2* Points):¶ <#Exercise-2---(2-Points):>
One thing which we will have to model in our analysis is the probability
that the offense will advance the ball to the line to gain (referred to
as converting) based on the distance to that line at the start of a
play. In the data, a row which was converted will meet these criteria:
1. The next row (based on descending |timeLeft|) shares the same |drive_id|
2. The next row (based on descending |timeLeft|) either
* has |down| equal to |1| or
* has |scoringPlay==True| and a |type| value from a specific set
To simplify this calculation, we have provided a helper function |
converted_by_drive(group: pd.DataFrame) -> pd.DataFrame|. This function expects the
input |group| to be a DataFrame containing */only records from the same drive/ (i.e.
sharing a common |drive_id| value)*. It will return a DataFrame with the new column
|converted| added as specified above.
*Your task*: Define |converted(event_df: pd.DataFrame) -> pd.DataFrame|.
* Input: |event_df| (DataFrame) - will contain the fields |drive_id|, |type|, |
scoringPlay|, |down|, and |timeLeft| fields.
Your solution should do the following:
* Partition |event_df| by |drive_id|.
* Apply |converted_by_drive| to each partition.
* Concatenate the results back into a single DataFrame
* Return the result.
o Your result should have the same columns and |dtypes| attributes as |
event_df| with the addition of the |converted| field.
*Hint:* |pd.DataFrame.GroupBy.apply|
<https://pandas.pydata.org/pandas-
docs/version/1.4/reference/api/pandas.core.groupby.GroupBy.apply.html> will be /very useful/ in solving this exercise. It can take care of the partition, apply, and concatenate steps.
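If the partition-apply-concatenate pattern is new to you, here is a minimal sketch on made-up data (the toy DataFrame and helper function are ours, not part of the starter code) showing how |groupby(...).apply(...)| handles all three steps:
```
import pandas as pd

toy = pd.DataFrame({'drive_id': [1, 1, 2, 2, 2],
                    'yards':    [4, 7, 0, 12, 3]})

def add_cumulative_yards(group: pd.DataFrame) -> pd.DataFrame:
    # `group` holds only the rows sharing one drive_id, just like
    # converted_by_drive expects.
    group = group.copy()
    group['cumYards'] = group['yards'].cumsum()
    return group

result = toy.groupby('drive_id', as_index=False).apply(add_cumulative_yards)
print(result.reset_index(drop=True))
```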
In [ ]:
### Define demo inputs
with open('resource/asnlib/publicdata/demo_event_df_ex2.pkl', 'rb') as f:
demo_event_df_ex2 = pickle.load(f)
The demo data has 3 unique |drive_id| values. After partitioning, there are 3 DataFrames:
------------------------------------------------------------------------
|drive_id|: 4014356411
drive_id type scoringPlay down timeLeft
0 4014356411 Kickoff Return (Offense) False 0 3600
1 4014356411 Pass Incompletion False 1 3594
2 4014356411 Rush False 2 3589
3 4014356411 Sack False 3 3547
4 4014356411 Punt False 4 3515
|drive_id|: 4014356414
drive_id type scoringPlay down timeLeft
12 4014356414 Pass Reception False 1 3336
13 4014356414 Rush False 2 3304
14 4014356414 Rush False 2 3288
15 4014356414 Rush False 1 3248
16 4014356414 Pass Reception False 2 3207
17 4014356414 Passing Touchdown True 3 3160
|drive_id|: 4014356418
drive_id type scoringPlay down timeLeft
44 4014356418 Pass Incompletion False 1 2452
45 4014356418 Rush False 2 2448
46 4014356418 Rush False 1 2412
47 4014356418 Pass Incompletion False 2 2370
48 4014356418 Pass Reception False 3 2362
49 4014356418 Field Goal Good True 4 2323
------------------------------------------------------------------------
After calling |converted_by_drive| on each partition and concatenating the results, your solution should
produce the following output:
drive_id type scoringPlay down timeLeft converted
0 4014356411 Kickoff Return (Offense) False 0 3600 True
1 4014356411 Pass Incompletion False 1 3594 False
2 4014356411 Rush False 2 3589 False
3 4014356411 Sack False 3 3547 False
4 4014356411 Punt False 4 3515 False
5 4014356414 Pass Reception False 1 3336 False
6 4014356414 Rush False 2 3304 False
7 4014356414 Rush False 2 3288 True
8 4014356414 Rush False 1 3248 False
9 4014356414 Pass Reception False 2 3207 False
10 4014356414 Passing Touchdown True 3 3160 True
11 4014356418 Pass Incompletion False 1 2452 False
12 4014356418 Rush False 2 2448 True
13 4014356418 Rush False 1 2412 False
14 4014356418 Pass Incompletion False 2 2370 False
15 4014356418 Pass Reception False 3 2362 False
16 4014356418 Field Goal Good True 4 2323 False
*Note* some of you who /are/ familiar with the sport will notice that
this solution incorrectly treats a kickoff return as a conversion. That
observation /is correct/, but this will not affect our modeling or analysis because
of some filtering that happens later on in the notebook.
In [ ]:
### Exercise 2 solution
### Helper function provided as part of the starter code
def converted_by_drive(group: pd.DataFrame) -> pd.DataFrame:
    group = group.sort_values('timeLeft', ascending=False)\
        .reset_index(drop=True)
    offensive_touchdown_types = ['Passing Touchdown', 'Rushing Touchdown', 'Fumble Recovery (Own)', 'Rush', 'Pass Reception']
    # `pd.DataFrame.shift` might be useful later...
    first_downs = (group['down'] == 1).shift(-1, fill_value=False)
    scores = (group['scoringPlay'] == True)&(group['type'].isin(offensive_touchdown_types))
    group['converted'] = (first_downs|scores)
    return group
### Your solution
def converted(event_df: pd.DataFrame) -> pd.DataFrame:
    ### BEGIN SOLUTION
    return event_df\
        .groupby('drive_id', as_index=False)\
        .apply(converted_by_drive)\
        .reset_index(drop=True)
    ### END SOLUTION
converted(demo_event_df_ex2)
The cell below will test your solution for Exercise 2. The testing
variables will be available for debugging under the following names in a
dictionary format.
* |input_vars| - Input variables for your solution.
* |original_input_vars| - Copy of input variables from prior to running your solution.
These /should/ be the same as |input_vars| - otherwise the inputs were modified
by your solution.
* |returned_output_vars| - Outputs returned by your solution.
* |true_output_vars| - The expected output. This /should/ "match" |
returned_output_vars| based on the question requirements - otherwise, your solution
is
not returning the correct output.
In [ ]:
### test_cell_ex2
### BEGIN HIDDEN TESTS
import dill
import hashlib
with open('resource/asnlib/public/hash_check.pkl', 'rb') as f:
hash_check = dill.load(f)
for fname in ['testers.py', '__init__.py', 'test_utils.py']:
hash_check(f'tester_fw/{fname}', f'resource/asnlib/public/{fname}')
del hash_check
del dill
del hashlib
### END HIDDEN TESTS
from tester_fw.testers import Tester
conf = {
'case_file':'tc_2', 'func': converted, # replace this with the function defined above
'inputs':{ # input config dict. keys are parameter names
'event_df':{
'dtype':'df', # data type of param.
'check_modified':True,
}
},
'outputs':{
'output_0':{
'index':0,
'dtype':'df',
'check_dtype': True,
'check_col_dtypes': True, # Ignored if dtype is not df
'check_col_order': False, # Ignored if dtype is not df
'check_row_order': False, # Ignored if dtype is not df
'check_column_type': True, # Ignored if dtype is not df
'float_tolerance': 10 ** (-6)
}
}
}
tester = Tester(conf, key=b'qni4-JKoB2OXw7cdPu6VxK1dNkBTJmEW6jYuJjRdBEg=', path='resource/asnlib/publicdata/')
for _ in range(70):
    try:
        tester.run_test()
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
    except:
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
        raise
### BEGIN HIDDEN TESTS
tester = Tester(conf, key=b'yApq-b21ZcdMzXYB0JMmFYoBUUR6xEVU-u_l0EE5jWI=', path='resource/asnlib/publicdata/encrypted/')
for _ in range(20):
    try:
        tester.run_test()
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
    except:
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
        raise
### END HIDDEN TESTS
print('Passed! Please submit.')
In [ ]:
with open('resource/asnlib/publicdata/converted_all_events_df.pkl', 'rb') as f:
converted_all_events_df = pickle.load(f)
Exercise 3 - (*2* Points):¶ <#Exercise-3---(2-Points):>
We are interested in modeling win probability. As such, it makes sense
to create a *column indicating whether the team on offense (i.e. the
home team when* |homeTeamPoss| *is* |True| *and the away team when* |homeTeamPoss| *is* |False|*) for a particular play eventually won the game.*
Define the function |who_won(event_df: pd.DataFrame) -> pd.DataFrame|.
* Input: event_df (DataFrame) - will contain the fields |awayScore|, |homeScore|,
|homeTeamPoss|, and |timeLeft|.
o You can assume that all records from |event_df| are from the same game.
Your solution should do the following:
* Identify the home and away scores at the end of the game. You can do
this either by sorting based on |timeLeft| or by finding the maximum score for
each team.
* Determine which team (home team or away team) had the higher score
at the end of the game.
* Your solution should return a /copy of/ |event_df| with a new column |won| that
is set to |True| where the team on offense won the game and |False| where the
other team won the game (see the sketch below).
o If the home team won then |won| will have the same values as |homeTeamPoss|.
o If the away team won then |won| will have the opposite values of |homeTeamPoss|.
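Here is a minimal sketch of the boolean logic in the bullets above, on made-up values (not the exam data):
```
import pandas as pd

home_team_poss = pd.Series([True, False, True, False])  # toy possession flags
home_won = True                                          # pretend the home team won

# won is True exactly where the offense's identity matches the winner.
won = home_team_poss == home_won
print(won.tolist())                        # [True, False, True, False]

# If the away team had won instead, the values flip:
print((home_team_poss == False).tolist())  # [False, True, False, True]
```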
In [ ]:
### Define demo inputs
with open('resource/asnlib/publicdata/demo_event_df_ex3.pkl', 'rb') as f:
demo_event_df_ex3 = pickle.load(f).drop(columns='index')
with open('resource/asnlib/publicdata/demo_soln_ex3.pkl', 'rb') as f:
true_demo_soln_ex3 = pickle.load(f)
display(demo_event_df_ex3.head())
display(demo_event_df_ex3.tail())
The demo included in the solution cell below should display the
following output:
awayScore homeScore homeTeamPoss timeLeft won
0 0 0 True 3600 True
1 0 0 False 3600 False
2 0 0 False 3559 False
3 0 0 False 3523 False
4 0 0 False 3490 False
------------------------------------------------------------------------
awayScore homeScore homeTeamPoss timeLeft won
139 6 26 False 149 False
140 6 26 False 121 False
141 6 26 True 111 True
142 6 26 True 66 True
143 6 26 True 31 True
Since the |homeScore| at the end of the game is more than the |awayScore| the home team won the game. Thus the |won| column is the same as the |homeTeamPoss| column. *It is also worth mentioning that while this demo data is
pre-sorted, that may not be the case when we test your solution.*
*Note* that this is just the |head| and |tail| of the full demo result. We have loaded the true result into |true_demo_soln_ex3| if you want to do a full comparison.
In [ ]:
### Exercise 3 solution
def who_won(event_df: pd.DataFrame) -> pd.DataFrame:
    ### BEGIN SOLUTION
    event_df = event_df.sort_values('timeLeft', ascending=False)
    final_score = event_df.loc[:, ['awayScore', 'homeScore']].values[-1,:]
    home_won = final_score[0] < final_score[1]
    _event_df = event_df.copy()
    _event_df.loc[:, 'won'] = event_df.loc[:, 'homeTeamPoss']*home_won + \
        ((~event_df.loc[:, 'homeTeamPoss'])*(~home_won))
    return _event_df
    ### END SOLUTION
### demo function call
demo_soln_ex3 = who_won(demo_event_df_ex3)
display(demo_soln_ex3.head())
display(demo_soln_ex3.tail())
The cell below will test your solution for Exercise 3. The testing
variables will be available for debugging under the following names in a
dictionary format.
* |input_vars| - Input variables for your solution.
* |original_input_vars| - Copy of input variables from prior to running your solution.
These /should/ be the same as |input_vars| - otherwise the inputs were modified
by your solution.
* |returned_output_vars| - Outputs returned by your solution.
* |true_output_vars| - The expected output. This /should/ "match" |
returned_output_vars| based on the question requirements - otherwise, your solution
is
not returning the correct output.
In [ ]:
### test_cell_ex3
### BEGIN HIDDEN TESTS
import dill
import hashlib
with open('resource/asnlib/public/hash_check.pkl', 'rb') as f:
hash_check = dill.load(f)
for fname in ['testers.py', '__init__.py', 'test_utils.py']:
hash_check(f'tester_fw/{fname}', f'resource/asnlib/public/{fname}')
del hash_check
del dill
del hashlib
### END HIDDEN TESTS
from tester_fw.testers import Tester
conf = {
'case_file':'tc_3', 'func': who_won, # replace this with the function defined above
'inputs':{ # input config dict. keys are parameter names
'event_df':{
'dtype':'df', # data type of param.
'check_modified':True,
}
},
'outputs':{
'output_0':{
'index':0,
'dtype':'df',
'check_dtype': True,
'check_col_dtypes': True, # Ignored if dtype is not df
'check_col_order': False, # Ignored if dtype is not df
'check_row_order': False, # Ignored if dtype is not df
'check_column_type': True, # Ignored if dtype is not df
'float_tolerance': 10 ** (-6)
}
}
}
tester = Tester(conf, key=b'qni4-JKoB2OXw7cdPu6VxK1dNkBTJmEW6jYuJjRdBEg=', path='resource/asnlib/publicdata/')
for _ in range(70):
    try:
        tester.run_test()
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
    except:
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
        raise
### BEGIN HIDDEN TESTS
tester = Tester(conf, key=b'yApq-b21ZcdMzXYB0JMmFYoBUUR6xEVU-u_l0EE5jWI=', path='resource/asnlib/publicdata/encrypted/')
for _ in range(20):
    try:
        tester.run_test()
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
    except:
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
        raise
### END HIDDEN TESTS
print('Passed! Please submit.')
In [ ]:
with open('resource/asnlib/publicdata/winners_all_events_df.pkl', 'rb') as f:
winners_all_events_df = pickle.load(f)
Exercise 4 - (*3* Points):¶ <#Exercise-4---(3-Points):>
Yet another thing we have to model is the "expected points". As a
pre-requisite to that we need to know what the next score after each
play is.
We have provided |get_update_list(df: pd.DataFrame) -> list| to help you.
* This function takes a DataFrame input. It is expected that |df| is the play data for a single game /and/ that |df| is sorted by |timeLeft| in descending order.
* It returns a list of tuples. *You should interpret each tuple |(a, b)| to mean "on the |(a-1)|-th play, the score changed |b| points in favor of the /home team/."*
* If |(a,x)| and |(b,y)| are a pair of consecutive tuples in the output, then *rows |a| through |b-1|* should have the |nextScore| value of *|y| when |
homeTeamPoss| is |True|* and *|-y| when |homeTeamPoss| is |False|*
*Your task* Define the function |add_next_score(event_df: pd.DataFrame) -> pd.DataFrame|
* Input: |event_df| (DataFrame) - will have the fields |awayScore|, |homeScore|, |scoringPlay|, |timeLeft|, and |homeTeamPoss|.
Your solution should do the following:
* Copy |event_df|, sort the copy by |timeLeft| in descending order, and reset the
index. We will call this copy |df|.
* Call |get_update_list| on the |df|. We will call the output |update_list|.
* Create a new column |nextScore| in |df|. Set the values of |nextScore| based on
the |update_list|. *Hint: clever use of the* |zip| *function and slicing of* |update_list| *will make this an easy task* (see the sketch after this list).
* Negate |df['nextScore']| in all rows where |df['homeTeamPoss']| is |False|, so that |nextScore| reflects the offense's perspective (consistent with the tuple interpretation above and the demo output below).
* Return |df|
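Here is a minimal sketch of the zip-and-slice idea from the hint, using a made-up update list (hypothetical values, not a real game):
```
# Pair each tuple with its successor to get (start, end) boundaries.
update_list = [(0, 0), (11, -3), (18, -7), (31, 0)]   # hypothetical values
for start, end in zip(update_list[:-1], update_list[1:]):
    start_row, _ = start
    end_row, score = end
    print(f"rows {start_row} through {end_row - 1} get home-perspective next score {score}")
```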
In [ ]:
### Define demo inputs
def get_update_list(df: pd.DataFrame) -> list:
    scoring = df.loc[df['scoringPlay'], ['awayScore', 'homeScore']]
    scoring[['previousAway', 'previousHome']] = scoring.shift(1, fill_value=0)
    scoring['next_score'] = (scoring['homeScore'] - scoring['previousHome']) - \
        (scoring['awayScore'] - scoring['previousAway'])
    return [(0, 0), *[(k+1, v) for k, v in scoring['next_score'].to_dict().items()],
            (len(df), 0)]
with open('resource/asnlib/publicdata/demo_output_ex4.pkl', 'rb') as f:
true_demo_output_ex4 = pickle.load(f)
with open('resource/asnlib/publicdata/demo_event_df_ex4.pkl', 'rb') as f:
demo_event_df_ex4 = pickle.load(f)
The demo data in |demo_event_df_ex4| has been pre-sorted for demonstration. Your solution will need to take
care of the sorting as data given by the test cell will not be sorted.
If we call |get_update_list(demo_event_df_ex4)|, we will get this output.
|[(0, 0), (11, -3), (18, -7), (20, -3), (23, 7), (25, -3), (30, 7), (31, 0)]|
* Rows 0 through 10 should reflect a next score of -3 when |homeTeamPoss| is |
True| and 3 otherwise
o Note: The first tuple |(0,0)| indicates the start of the game.
* Rows 11 through 17 should reflect a next score of -7 when |homeTeamPoss| is |
True| and 7 otherwise
* Rows 18 through 19 should reflect a next score of -3 when |homeTeamPoss| is |
True| and 3 otherwise
* Rows 20 through 22 should reflect a next score of 7 when |homeTeamPoss| is |
True| and -7 otherwise
* Rows 23 through 24 should reflect a next score of -3 when |homeTeamPoss| is |
True| and 3 otherwise
* Rows 25 through 29 should reflect a next score of 7 when |homeTeamPoss| is |
True| and -7 otherwise
* Rows 30 through 31 should reflect a next score of 0 when |homeTeamPoss| is |
True| and 0 otherwise
o Note: The last tuple |(x, 0)| indicates the end of the game.
Applying this logic to our input we get the following demo output:
awayScore homeScore timeLeft scoringPlay homeTeamPoss nextScore
0 0 0 3477 False False 3
1 0 0 3358 False True -3
2 0 0 3239 False True -3
3 0 0 3115 False True -3
4 0 0 3002 False False 3
5 0 0 2885 False False 3
6 0 0 2767 False False 3
7 0 0 2650 False True -3
8 0 0 2534 False False 3
9 0 0 2413 False False 3
*10* *3* *0* *2299* *True* *False* *3*
11 3 0 2182 False False 7
12 3 0 2065 False True -7
13 3 0 1949 False True -7
14 3 0 1826 False True -7
15 3 0 1707 False True -7
16 3 0 1594 False False 7
*17* *10* *0* *1471* *True* *True* *-7*
18 10 0 1348 False True -3
*19* *13* *0* *1230* *True* *False* *3*
20 13 0 1106 False True 7
21 13 0 987 False True 7
*22* *13* *7* *867* *True* *False* *-7*
23 13 7 746 False True -3
*24* *16* *7* *625* *True* *True* *-3*
25 16 7 503 False False -7
26 16 7 383 False False -7
27 16 7 265 False True 7
28 16 7 147 False False -7
*29* *16* *14* *26* *True* *True* *7*
30 16 14 0 False True 0
Note: The demo will not actually display your output. Rather it gets
loaded into the variable |demo_output_ex4|. We have loaded the expected result into
|true_demo_output_ex4| for you to compare on your own if you desire.
In [ ]:
### Exercise 4 solution
def add_next_score(event_df: pd.DataFrame) -> pd.DataFrame:
    ### BEGIN SOLUTION
    df = event_df.sort_values('timeLeft', ascending=False).reset_index(drop=True)
    update_list = get_update_list(df)
    for start, end in zip(update_list[:-1], update_list[1:]):
        df.loc[(start[0]):(end[0]), 'nextScore'] = end[1]
    df.loc[~df['homeTeamPoss'], 'nextScore'] = -1*df.loc[~df['homeTeamPoss'], 'nextScore']
    df['nextScore'] = df['nextScore'].astype(int)
    return df
    ### END SOLUTION
### demo function call
demo_output_ex4 = add_next_score(demo_event_df_ex4)
The cell below will test your solution for Exercise 4. The testing
variables will be available for debugging under the following names in a
dictionary format.
* |input_vars| - Input variables for your solution.
* |original_input_vars| - Copy of input variables from prior to running your solution.
These /should/ be the same as |input_vars| - otherwise the inputs were modified
by your solution.
* |returned_output_vars| - Outputs returned by your solution.
* |true_output_vars| - The expected output. This /should/ "match" |
returned_output_vars| based on the question requirements - otherwise, your solution
is
not returning the correct output.
In [ ]:
### test_cell_ex4
### BEGIN HIDDEN TESTS
import dill
import hashlib
with open('resource/asnlib/public/hash_check.pkl', 'rb') as f:
hash_check = dill.load(f)
for fname in ['testers.py', '__init__.py', 'test_utils.py']:
hash_check(f'tester_fw/{fname}', f'resource/asnlib/public/{fname}')
del hash_check
del dill
del hashlib
### END HIDDEN TESTS
from tester_fw.testers import Tester
conf = {
'case_file':'tc_4', 'func': add_next_score, # replace this with the function defined above
'inputs':{ # input config dict. keys are parameter names
'event_df':{
'dtype':'', # data type of param.
'check_modified':True,
}
},
'outputs':{
'output_0':{
'index':0,
'dtype':'',
'check_dtype': True,
'check_col_dtypes': True, # Ignored if dtype is not df
'check_col_order': False, # Ignored if dtype is not df
'check_row_order': False, # Ignored if dtype is not df
'check_column_type': True, # Ignored if dtype is not df
'float_tolerance': 10 ** (-6)
}
}
}
tester = Tester(conf, key=b'qni4-JKoB2OXw7cdPu6VxK1dNkBTJmEW6jYuJjRdBEg=', path='resource/asnlib/publicdata/')
for _ in range(70):
    try:
        tester.run_test()
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
    except:
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
        raise
### BEGIN HIDDEN TESTS
tester = Tester(conf, key=b'yApq-b21ZcdMzXYB0JMmFYoBUUR6xEVU-u_l0EE5jWI=', path='resource/asnlib/publicdata/encrypted/')
for _ in range(20):
    try:
        tester.run_test()
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
    except:
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
        raise
### END HIDDEN TESTS
print('Passed! Please submit.')
Exercise 5 - (*2* Points):¶ <#Exercise-5---(2-Points):>
The score fields (|homeScore| and |awayScore|) indicate the score *after the play occurred*. In order to use the
scores as inputs to a model we need to add a lag so that they indicate
the score /before the play occurred/.
Define |lag_score(all_events_df: pd.DataFrame) -> pd.DataFrame|
* Input: all_events_df (DataFrame) - will contain fields |event_id|, |timeLeft|, |awayScore|, |homeScore|
* |all_events_df['event_id']| is expected to have multiple distinct values. /You cannot assume
that all of the records are from the same game./
Your solution should do the following:
* Copy |all_events_df| and sort it by |event_id| and |timeLeft|. The order of |
event_id| doesn't matter (sorting helps things run faster). However, sorting
the records /for each event/ by |timeLeft| /in descending order/ is critical (which is why we must sort by
both columns). We will call the sorted copy |df|.
* Partition |df| by |event_id|.
* For each partition, set the |homeScore| and |awayScore| values to their values from one row prior. Since the first row has
no prior row, set both of its score values to 0.
o *Hint* - The pd.DataFrame.shift
<https://pandas.pydata.org/pandas-
docs/version/1.4/reference/api/pandas.DataFrame.shift.html> method will be helpful in accomplishing this step.
* Concatenate the partitions together.
* Return the result.
*Hint* - The pattern of making a helper function to introduce the lag
for a single game and using it with |pd.DataFrame.GroupBy.apply|
<https://pandas.pydata.org/pandas-
docs/version/1.4/reference/api/pandas.core.groupby.GroupBy.apply.html> will be very
helpful in solving this exercise.
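If |pd.DataFrame.shift| is unfamiliar, here is a minimal sketch of the lag behavior on a made-up DataFrame (not the exam data):
```
import pandas as pd

toy = pd.DataFrame({'homeScore': [0, 7, 7, 10], 'awayScore': [0, 0, 3, 3]})
# Lag both score columns by one row; the first row has no prior row, so fill with 0.
print(toy[['homeScore', 'awayScore']].shift(1, fill_value=0))
```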
In [ ]:
### Define demo inputs
with open('resource/asnlib/publicdata/demo_all_events_df_ex5.pkl', 'rb') as f:
demo_all_events_df_ex5 = pickle.load(f)
demo_all_events_df_ex5
*Note*: The demo input is pre-sorted. The inputs in the test will not be.
The demo included in the solution cell below should display the
following output:
awayScore homeScore timeLeft event_id
0 0 0 3001 401030893
1 0 0 2400 401030893
2 3 0 1803 401030893
3 3 7 1206 401030893
4 3 7 608 401030893
5 3 7 4 401030893
6 3 7 0 401030893
7 0 0 3002 401030897
8 3 0 2396 401030897
9 3 0 1794 401030897
10 3 7 1194 401030897
11 3 7 590 401030897
12 3 10 0 401030897
Notice that the scores have moved down one row _for each |event_id|_.
In [ ]:
### Exercise 5 solution
def lag_score(all_events_df: pd.DataFrame) -> pd.DataFrame:
    ### BEGIN SOLUTION
    def _lag_score(event_df: pd.DataFrame) -> pd.DataFrame:
        event_df[['homeScore', 'awayScore']] = event_df[['homeScore', 'awayScore']].shift(1, fill_value=0)
        return event_df
    _all_events_df = all_events_df.sort_values(['event_id', 'timeLeft'], ascending=[True, False])
    return _all_events_df.groupby('event_id', as_index=False).apply(_lag_score).reset_index(drop=True)
    ### END SOLUTION
### demo function call
# call the function defined above using the demo inputs and display the result
lag_score(demo_all_events_df_ex5)
The cell below will test your solution for Exercise 5. The testing
variables will be available for debugging under the following names in a
dictionary format.
* |input_vars| - Input variables for your solution.
* |original_input_vars| - Copy of input variables from prior to running your solution.
These /should/ be the same as |input_vars| - otherwise the inputs were modified
by your solution.
* |returned_output_vars| - Outputs returned by your solution.
* |true_output_vars| - The expected output. This /should/ "match" |
returned_output_vars| based on the question requirements - otherwise, your solution
is
not returning the correct output.
In [ ]:
### test_cell_ex5
### BEGIN HIDDEN TESTS
import dill
import hashlib
with open('resource/asnlib/public/hash_check.pkl', 'rb') as f:
hash_check = dill.load(f)
for fname in ['testers.py', '__init__.py', 'test_utils.py']:
hash_check(f'tester_fw/{fname}', f'resource/asnlib/public/{fname}')
del hash_check
del dill
del hashlib
### END HIDDEN TESTS
from tester_fw.testers import Tester
conf = {
'case_file':'tc_5', 'func': lag_score, # replace this with the function defined above
'inputs':{ # input config dict. keys are parameter names
'all_events_df':{
'dtype':'df', # data type of param.
'check_modified':True,
}
},
'outputs':{
'output_0':{
'index':0,
'dtype':'df',
'check_dtype': True,
'check_col_dtypes': True, # Ignored if dtype is not df
'check_col_order': False, # Ignored if dtype is not df
'check_row_order': False, # Ignored if dtype is not df
'check_column_type': True, # Ignored if dtype is not df
'float_tolerance': 10 ** (-6)
}
}
}
tester = Tester(conf, key=b'qni4-JKoB2OXw7cdPu6VxK1dNkBTJmEW6jYuJjRdBEg=', path='resource/asnlib/publicdata/')
for _ in range(70):
    try:
        tester.run_test()
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
    except:
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
        raise
### BEGIN HIDDEN TESTS
tester = Tester(conf, key=b'yApq-b21ZcdMzXYB0JMmFYoBUUR6xEVU-u_l0EE5jWI=', path='resource/asnlib/publicdata/encrypted/')
for _ in range(20):
    try:
        tester.run_test()
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
    except:
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
        raise
### END HIDDEN TESTS
print('Passed! Please submit.')
In [ ]:
with open('resource/asnlib/publicdata/lag_all_events.pkl', 'rb') as f:
lag_all_events = pickle.load(f)
Exercise 6 - (*3* Points):¶ <#Exercise-6---(3-Points):>
We're almost done preparing the data for our modeling steps. We need to
do some final filtering to create three DataFrames: |conversion_data|, |ep_data|, and |wp_data|.
Define the function |build_model_inputs(all_events_df: pd.DataFrame) -> pd.DataFrame, pd.DataFrame, pd.DataFrame|
* Input: |all_events_df| - will include these fields: |'distance', 'nextScore', 'timeLeft', 'play_id', 'won', 'homeTeamPoss', 'awayScore', 'converted', 'down', 'yardsToEndzone', 'event_id', 'homeScore'|
Your solution should do the following:
* Copy |all_events_df|. We will call this |df|.
* Identify all rows where |df['down']| is 0. Filter them out of |df|.
* Calculate |conversion_data| from |df|.
o should include only fields in |conversion_fields| (part of the starter code)
o should only include rows where |df['down']| is 3 or 4
o should not include rows where "field goal" appears in |df['type'].lower()|
o should not include rows where "punt" appears in |df['type'].lower()|
* Calculate |ep_data| from |df|
o |ep_data| should include only fields in |ep_fields| (part of the starter code)
* Calculate |wp_data| from |df|
o |wp_data| should include only fields in |wp_fields| (part of the starter code)
* Return |conversion_data|, |ep_data|, and |wp_data|
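As an illustration of the row filters described above, here is a minimal sketch on made-up data. It uses vectorized string methods, which is one of several equivalent ways to check for "field goal" and "punt" in the lowercased |type| values:
```
import pandas as pd

toy = pd.DataFrame({
    'type': ['Rush', 'Field Goal Good', 'Punt', 'Pass Reception'],
    'down': [3, 4, 4, 2],
})
is_third_or_fourth = toy['down'].isin([3, 4])
mentions_fg = toy['type'].str.lower().str.contains('field goal')
mentions_punt = toy['type'].str.lower().str.contains('punt')
keep = is_third_or_fourth & ~mentions_fg & ~mentions_punt
print(toy.loc[keep])   # only the 'Rush' row (down 3) survives
```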
In [ ]:
### Define demo inputs
with open('resource/asnlib/publicdata/demo_all_events_df_ex6', 'rb') as f:
demo_all_events_df_ex6 = pickle.load(f)
demo_all_events_df_ex6
The demo included in the solution cell below should display the
following output for `conversion_data`, `ep_data`, and `wp_data`
respectively:

|    | event_id  | play_id       | down | distance | converted |
|---:|----------:|--------------:|-----:|---------:|:----------|
| 0  | 401030904 | 4008747201193 | 3    | 11       | False     |
| 2  | 400554237 | 4009516971438 | 3    | 1        | True      |
| 9  | 401030703 | 4009517172741 | 4    | 1        | True      |

|    | event_id  | play_id       | down | distance | yardsToEndzone | nextScore |
|---:|----------:|--------------:|-----:|---------:|---------------:|----------:|
| 0  | 401030904 | 4008747201193 | 3    | 11       | 91             | -3        |
| 1  | 401326450 | 4010307031458 | 1    | 10       | 70             | -7        |
| 2  | 400554237 | 4009516971438 | 3    | 1        | 51             | 3         |
| 3  | 400951717 | 4005543092779 | 2    | 5        | 59             | 7         |
| 4  | 400951697 | 4010309042694 | 4    | 6        | 33             | 3         |
| 5  | 401326489 | 401326450587  | 4    | 11       | 21             | -3        |
| 7  | 400874720 | 4013264892550 | 1    | 10       | 75             | 7         |
| 8  | 401030695 | 4010306953412 | 4    | 10       | 43             | -3        |
| 9  | 401030703 | 4009517172741 | 4    | 1        | 1              | 7         |

|    | event_id  | play_id       | down | distance | yardsToEndzone | timeLeft | awayScore | homeScore | homeTeamPoss | won   |
|---:|----------:|--------------:|-----:|---------:|---------------:|---------:|----------:|----------:|:-------------|:------|
| 0  | 401030904 | 4008747201193 | 3    | 11       | 91             | 954      | 6         | 14        | False        | False |
| 1  | 401326450 | 4010307031458 | 1    | 10       | 70             | 3072     | 0         | 0         | False        | False |
| 2  | 400554237 | 4009516971438 | 3    | 1        | 51             | 1300     | 13        | 10        | True         | False |
| 3  | 400951717 | 4005543092779 | 2    | 5        | 59             | 1984     | 0         | 17        | False        | True  |
| 4  | 400951697 | 4010309042694 | 4    | 6        | 33             | 797      | 13        | 6         | False        | True  |
| 5  | 401326489 | 401326450587  | 4    | 11       | 21             | 3231     | 0         | 0         | False        | False |
| 7  | 400874720 | 4013264892550 | 1    | 10       | 75             | 380      | 22        | 31        | False        | False |
| 8  | 401030695 | 4010306953412 | 4    | 10       | 43             | 2257     | 7         | 14        | True         | True  |
| 9  | 401030703 | 4009517172741 | 4    | 1        | 1              | 1144     | 21        | 7         | False        | True  |
In [ ]:
### Exercise 6 solution
def build_model_inputs(all_events_df: pd.DataFrame) -> (pd.DataFrame, pd.DataFrame, pd.DataFrame):
    conversion_fields = ['event_id', 'play_id', 'down', 'distance', 'converted']
    ep_fields = ['event_id', 'play_id', 'down', 'distance', 'yardsToEndzone', 'nextScore']
    win_prob_fields = ['event_id', 'play_id', 'down', 'distance', 'yardsToEndzone',
                       'timeLeft', 'awayScore', 'homeScore', 'homeTeamPoss', 'won']
    ### BEGIN SOLUTION
    not_zero_down = all_events_df[~(all_events_df['down'] == 0)]
    df = not_zero_down.copy()
    not_fga = df['type'].apply(lambda s: 'field goal' not in s.lower())
    not_punt = df['type'].apply(lambda s: 'punt' not in s.lower())
    third_or_fourth = df['down'].isin((3,4))
    include_conversion = not_fga & third_or_fourth & not_punt
    return df.loc[include_conversion, conversion_fields], df[ep_fields], df[win_prob_fields]
    ### END SOLUTION
### demo function call
(demo_conversion_data_ex6, demo_ep_data_ex6, demo_wp_data_ex6) = build_model_inputs(demo_all_events_df_ex6)
for df in (demo_conversion_data_ex6, demo_ep_data_ex6, demo_wp_data_ex6):
    display(df)
The cell below will test your solution for Exercise 6. The testing
variables will be available for debugging under the following names in a
dictionary format.
* |input_vars| - Input variables for your solution.
* |original_input_vars| - Copy of input variables from prior to running your solution.
These /should/ be the same as |input_vars| - otherwise the inputs were modified
by your solution.
* |returned_output_vars| - Outputs returned by your solution.
* |true_output_vars| - The expected output. This /should/ "match" |
returned_output_vars| based on the question requirements - otherwise, your solution
is
not returning the correct output.
In [ ]:
### test_cell_ex6
### BEGIN HIDDEN TESTS
import dill
import hashlib
with open('resource/asnlib/public/hash_check.pkl', 'rb') as f:
hash_check = dill.load(f)
for fname in ['testers.py', '__init__.py', 'test_utils.py']:
hash_check(f'tester_fw/{fname}', f'resource/asnlib/public/{fname}')
del hash_check
del dill
del hashlib
### END HIDDEN TESTS
from tester_fw.testers import Tester
conf = {
'case_file':'tc_6', 'func': build_model_inputs, # replace this with the function defined above
'inputs':{ # input config dict. keys are parameter names
'all_events_df':{
'dtype':'df', # data type of param.
'check_modified':True,
}
},
'outputs':{
'conversion_data':{
'index':0,
'dtype':'df',
'check_dtype': True,
'check_col_dtypes': True, # Ignored if dtype is not df
'check_col_order': False, # Ignored if dtype is not df
'check_row_order': False, # Ignored if dtype is not df
'check_column_type': True, # Ignored if dtype is not df
'float_tolerance': 10 ** (-6)
},'ep_data':{
'index':1,
'dtype':'df',
'check_dtype': True,
'check_col_dtypes': True, # Ignored if dtype is not df
'check_col_order': False, # Ignored if dtype is not df
'check_row_order': False, # Ignored if dtype is not df
'check_column_type': True, # Ignored if dtype is not df
'float_tolerance': 10 ** (-6)
},'wp_data':{
'index':2,
'dtype':'df',
'check_dtype': True,
'check_col_dtypes': True, # Ignored if dtype is not df
'check_col_order': False, # Ignored if dtype is not df
'check_row_order': False, # Ignored if dtype is not df
'check_column_type': True, # Ignored if dtype is not df
'float_tolerance': 10 ** (-6)
}
}
}
tester = Tester(conf, key=b'qni4-JKoB2OXw7cdPu6VxK1dNkBTJmEW6jYuJjRdBEg=', path='resource/asnlib/publicdata/')
for _ in range(70):
    try:
        tester.run_test()
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
    except:
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
        raise
### BEGIN HIDDEN TESTS
tester = Tester(conf, key=b'yApq-b21ZcdMzXYB0JMmFYoBUUR6xEVU-u_l0EE5jWI=', path='resource/asnlib/publicdata/encrypted/')
for _ in range(20):
    try:
        tester.run_test()
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
    except:
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
        raise
### END HIDDEN TESTS
print('Passed! Please submit.')
Intermediate models¶ <#Intermediate-models>
To make the fourth down decision calculation we need to estimate these
things:
* At a given distance from the endzone, how likely is a field goal
attempt to be successful?
* At a given distance from the line to gain, how likely is it that the
offense will reach it in a single play?
* At a given down, distance to the line to gain, and distance to the
endzone, what is the expected value of the next scoring play?
* At a given score, down, distance (to line to gain and to endzone),
and remaining time, how likely is the offense to win the game?
We built the first three. You will have to build the fourth.
Field goal probability¶ <#Field-goal-probability>
*You can skip reading this section if you are pressed for time. Run the
code cell to import the model.*
We fit the model below to estimate the probability that a field goal
attempt from a point on the field will be successful. Notice that the
probability decays non-linearly as the distance to the endzone
increases. From beyond 45 yards away from the endzone (a 63+ yard field
goal) the model estimates the probability as zero. This is very close to
the maximum distance a human can place-kick a football, and only a
handful of kicks have been made from that distance in NFL history.
In [ ]:
def fg_model(yds):
    exp = 2.2
    y_end = .67
    in_range = yds <= 45
    return in_range + yds**(exp)*((y_end-1)/2000)*in_range
yds = pd.Series(np.arange(1,101))
plt.scatter(yds, fg_model(yds))
Fourth down conversion probability¶
<#Fourth-down-conversion-probability>
*You can skip reading this section if you are pressed for time. Run the
code cell to import the model.*
We used the |conversion_data| calculated above to fit a Scikit-learn RandomForest classification
model to estimate the probability that the offense converts on fourth
down at a given distance from the line to gain.
The probability decays quickly as the distance increases.
In [ ]:
with open('resource/asnlib/publicdata/convert_clf.pkl', 'rb') as f:
convert_clf = pickle.load(f)
distance = pd.Series(np.linspace(1, 46, 46, endpoint=True))
def conversion_model(distance):
    d = distance.values.reshape((-1,1))
    return convert_clf.predict_proba(d)[:,1]
plt.plot(distance, conversion_model(distance))
Expected points¶ <#Expected-points>
*You can skip reading this section if you are pressed for time. Run the
code cell to import the model.*
This is probably the least intuitive metric we're estimating. The idea
is that the state of the game (down, distance, yards to the endzone) has
an unrealized effect on what the final score will be. We attempt to
quantify this effect by estimating the point value of the next scoring
play based on the state of the game.
We fit a Scikit-learn Linear Regression model based on the |down|, |distance|, and |yardsToEndzone| fields from the |ep_data| calculated above. In the plot, the colors (blue, orange, green, and
red) represent first, second, third, and fourth down, respectively. The
distance is frozen at the minimum of 10 and the distance to the endzone
(hence the "bend" on the left). We see that expected points step down
with each down and increase as the ball is closer to the endzone.
In [ ]:
with open('resource/asnlib/publicdata/ep_est.pkl', 'rb') as f:
ep_est = pickle.load(f)
def ep_model(df):
    _df = pd.get_dummies(df, columns=['down'])
    for i in range(1,5):
        check_col = f'down_{i}'
        if check_col not in _df.columns:
            _df[check_col] = 0
    _df = _df.loc[:, ['down_1', 'down_2', 'down_3',
                      'down_4', 'distance', 'yardsToEndzone']]
    return ep_est.predict(_df)
base_data = pd.DataFrame({
    'down': [0]*100,
    'distance': [10]*100,
    'yardsToEndzone': np.linspace(1, 100, 100, endpoint=True)
})
base_data.loc[base_data['yardsToEndzone'] < base_data['distance'], 'distance'] = base_data['yardsToEndzone']
for i in range(1,5):
    df = base_data.copy()
    df['down'] = i
    plt.plot(df['yardsToEndzone'], ep_model(df))
Win probability¶ <#Win-probability>
*The important takeaway here is the two formulas for μμ and σσ. You will
use those later. Don't worry about the integral or the explanation if
you're pressed for time.*
We're not using a fancy machine learning model to estimate win
probability. Instead, we're going to adapt what Pro-football Reference
uses <https://www.pro-football-reference.com/about/win_prob.htm>. It
works pretty well. Teams with the lead get higher probabilities as the
game goes on. Here's the premise:
Assuming evenly matched teams the point differential at the start of an
NFL game is modeled by a random variable X
N(μ,σ)X
N(μ,σ) with μ=0μ=0
∼
∼
and σ=13.85σ=13.85.
As the game is played, the score becomes less variable. We adjust the
model for time by scaling down the standard deviation. The in-game
formula for the standard deviation is
$$\sigma = \frac{13.85}{\sqrt{3600 / (1 + \text{timeLeft})}}$$
We also move the mean to account for the current point differential as
well as the expected points from the field position. The in-game formula
for the mean is as follows (EPoffenseEPoffense is the expected points
for the offense).
$$\mu = \text{homeTeamPoss}\left[\text{homeScore} + EP_{\text{offense}} - \text{awayScore}\right] + (1 - \text{homeTeamPoss})\left[\text{awayScore} + EP_{\text{offense}} - \text{homeScore}\right]$$
The following improper integral can be used to calculate the win
probability for the team on offense by plugging in for μ and σ.
$$1 - \int_{-\infty}^{0.5} \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right)\,dx$$
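To make the formulas concrete, here is a small worked sketch with made-up numbers (not drawn from the data set). It uses |scipy.stats.norm| to evaluate the normal CDF numerically, which is how the integral gets computed in the next exercise.
```python
import numpy as np
from scipy.stats import norm

# Hypothetical game state: home team leads 17-14, has the ball,
# expects about 2 points from its current field position, and
# 900 seconds (15 minutes) remain on the clock.
home_score, away_score = 17, 14
home_team_poss = True
ep_offense = 2.0
time_left = 900

mu = (home_team_poss * (home_score + ep_offense - away_score)
      + (1 - home_team_poss) * (away_score + ep_offense - home_score))
sigma = 13.85 / np.sqrt(3600 / (1 + time_left))

offense_win_prob = 1 - norm.cdf(0.5, loc=mu, scale=sigma)
print(mu, sigma, offense_win_prob)  # roughly 5.0, 6.93, 0.74
```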
Exercise 7 - (*2* Points):¶ <#Exercise-7---(2-Points):>
First off - you're not going to have to compute that integral! There is
no closed-form solution, and almost nobody likes doing a Taylor series
expansion by hand. Instead we're going to let the |scipy.stats.norm| module do the heavy lifting. We have already imported it under the name |norm|.
|norm.cdf(t, mu, sigma)| computes
$$F(t, \mu, \sigma) = \int_{-\infty}^{t} \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right)\,dx$$
Define the function |win_prob_model(df: pd.DataFrame) -> np.ndarray|
* Input: |df| (DataFrame) - will include fields |homeScore|, |awayScore|, |down|,
|distance|, |homeTeamPoss|, |yardsToEndzone|, |timeLeft|
Your solution should do the following:
* Compute EP_offense by using the |ep_model| function provided. It takes a DataFrame (including at least |down|, |distance|, and |yardsToEndzone|; extra fields are fine) and returns an array.
* Compute μ using the formula from the cell above. /Recall that
  boolean values of True equate to integer values of 1 and boolean
  values of False equate to integer values of 0./
* Compute σ using the formula from the cell above.
* Compute 1 − F(0.5, μ, σ) using |norm.cdf| as described above and return
  the result.
*Pro-tip* Don't forget the "one minus" part of the final result.
In [ ]:
### Define demo inputs
with open('resource/asnlib/publicdata/demo_df_ex7', 'rb') as f:
    demo_df_ex7 = pickle.load(f)
demo_df_ex7
For your intermediate calculations of `mu` and `sigma` you should get
the following approximate values:

|    | mu         | sigma    |
|---:|-----------:|---------:|
|  0 | -0.0429587 | 11.2802  |
|  1 | -8.45817   | 9.5147   |
|  2 | -54.6385   | 3.48551  |
|  3 | -5.17341   | 3.50836  |
|  4 | -8.35904   | 11.3977  |
|  5 | -18.5881   | 7.63148  |
|  7 | 7.53041    | 9.57054  |
|  8 | -8.23825   | 12.7419  |
|  9 | -3.12201   | 3.5686   |

The demo included in the solution cell below should display the following output:
```
array([0.48080476, 0.17322218, 0.        , 0.05292715, 0.21850005,
       0.00618805, 0.76870511, 0.24642358, 0.15506048])
```
In [ ]:
### Exercise 7 solution
def win_prob_model(df: pd.DataFrame) -> np.ndarray:
    ### BEGIN SOLUTION
    offense_ep = ep_model(df)
    mu = (offense_ep+df['homeScore']-df['awayScore']) * df['homeTeamPoss'] + \
         (offense_ep+df['awayScore']-df['homeScore']) * (1 - df['homeTeamPoss'])
    sigma = 13.85 / np.sqrt((3600)/(1 + df['timeLeft']))
    return 1 - norm.cdf(0.5, loc=mu, scale=sigma)
    ### END SOLUTION
### demo function call
win_prob_model(demo_df_ex7)
The cell below will test your solution for Exercise 7. The testing
variables will be available for debugging under the following names in a
dictionary format.
* |input_vars| - Input variables for your solution.
* |original_input_vars| - Copy of input variables from prior to running your solution.
These /should/ be the same as |input_vars| - otherwise the inputs were modified
by your solution.
* |returned_output_vars| - Outputs returned by your solution.
* |true_output_vars| - The expected output. This /should/ "match" |
returned_output_vars| based on the question requirements - otherwise, your solution
is
not returning the correct output.
In [ ]:
### test_cell_ex7
### BEGIN HIDDEN TESTS
import dill
import hashlib
with open('resource/asnlib/public/hash_check.pkl', 'rb') as f:
    hash_check = dill.load(f)
for fname in ['testers.py', '__init__.py', 'test_utils.py']:
    hash_check(f'tester_fw/{fname}', f'resource/asnlib/public/{fname}')
del hash_check
del dill
del hashlib
### END HIDDEN TESTS
from tester_fw.testers import Tester
conf = {
    'case_file': 'tc_7',
    'func': win_prob_model, # replace this with the function defined above
    'inputs': { # input config dict. keys are parameter names
        'df': {
            'dtype': 'df', # data type of param.
            'check_modified': True,
        }
    },
    'outputs': {
        'output_0': {
            'index': 0,
            'dtype': 'np.ndarray',
            'check_dtype': True,
            'check_col_dtypes': True, # Ignored if dtype is not df
            'check_col_order': True, # Ignored if dtype is not df
            'check_row_order': True, # Ignored if dtype is not df
            'check_column_type': True, # Ignored if dtype is not df
            'float_tolerance': 10 ** (-6)
        }
    }
}
tester = Tester(conf, key=b'qni4-JKoB2OXw7cdPu6VxK1dNkBTJmEW6jYuJjRdBEg=', path='resource/asnlib/publicdata/')
for _ in range(70):
    try:
        tester.run_test()
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
    except:
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
        raise
### BEGIN HIDDEN TESTS
tester = Tester(conf, key=b'yApq-b21ZcdMzXYB0JMmFYoBUUR6xEVU-u_l0EE5jWI=', path='resource/asnlib/publicdata/encrypted/')
for _ in range(20):
    try:
        tester.run_test()
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
    except:
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
        raise
### END HIDDEN TESTS
print('Passed! Please submit.')
Putting it all together¶ <#Putting-it-all-together>
*You can skip this if you are pressed for time*
Expected outcomes from the decisions.¶
<#Expected-outcomes-from-the-decisions.>
At the start of the notebook our stated goal was to build a tool to
provide data-driven guidance. There are actually two parts to providing
that guidance. One is being able to model how likely certain events are
to occur in a given game situation; that we have accomplished. The
second piece is to simulate all of the possible outcomes of a decision
so that we can feed them into our win probability model. That we haven't
touched, and it involves a fair bit of football knowledge. For the sake
of sparing the less football-savvy among us and keeping this notebook
from being /even longer/, we have implemented simulations for all 5
potential outcomes and imported them:
* |simulate_punt(df) -> df|: returns the expected outcome from a *punt* for all game
situations described by |df|.
* |simulate_fg_make(df) -> df|: returns the expected outcome from a *made field goal* for all
game situations described by |df|.
* |simulate_fg_miss(df) -> df|: returns the expected outcome from a *missed field
goal* for all game situations described by |df|.
* |simulate_fourth_down_succeed(df) -> df|: returns the expected outcome from a *successful play attempt* for
all game situations described by |df|.
* |simulate_fourth_down_fail(df) -> df|: returns the expected outcome from a *failed play attempt* for all
game situations described by |df|.
All functions take an input |df| and return an output |df| with the fields |'event_id', 'play_id', 'down', 'distance', 'yardsToEndzone', 'timeLeft', 'awayScore', 'homeScore', 'homeTeamPoss', 'won'|. The returned output will have the
|'down', 'distance', 'yardsToEndzone', 'timeLeft', 'awayScore', 'homeScore', 'homeTeamPoss'| columns modified to reflect the changes in field position, game time,
score, and possession expected from each outcome.
If you want to look at the implementation details they are available in |football_utils.py|.
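As a quick illustration of how the simulators listed above plug into the win probability work from Exercise 7, here is a hedged sketch. The |game_states| DataFrame is hypothetical (any DataFrame with the fields listed above); |simulate_punt| is assumed to be imported as described, and |win_prob_model| is your Exercise 7 function.
```python
# `game_states` is a hypothetical DataFrame with the fields listed above.
after_punt = simulate_punt(game_states)   # game state after the punt resolves
# After the punt the punting team is on defense, so its win probability
# is the complement of the new offense's win probability.
punt_wp = 1 - win_prob_model(after_punt)
```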
Exercise 8 - (*3* Points):¶ <#Exercise-8---(3-Points):>
Last exercise. Buckle up!
*Note: This exercise depends on Exercise 7*
We want to make a decision model that will evaluate whether a team has
the best chance of winning the game given the choice of three options:
* *Run a play* - Has two outcomes. Succeed or fail.
* *Attempt a field goal* - Has two outcomes. Make or miss.
* *Punt* - Has only one outcome.
Complete |sim_outcomes(df: pd.DataFrame, models: dict) -> pd.DataFrame|
*There is a /lot/ that goes into this decision model. We have done much
of the heavy lifting for you by completing all but the final steps. In
the starter code we processed the inputs and produced the following
variables for you to use*
* |go_succeed_prob| (array) - Pr(succeed) - the *probability of the
  "succeed"* outcome for the *"run a play"* choice for each row in |df|.
* |fg_make_prob| (array) - Pr(make) - the *probability of the "make"*
  outcome of the *"attempt a field goal"* choice for each row in |df|.
* |df_punt| (DataFrame) - The state of the game after the *"punt"* decision
  for each row in |df|.
* |def_win_prob_model| (function) - Given a DataFrame, like |df_punt|, returns an
  array of win probabilities for the team on defense in each row.
* |df| - a copy of the input with the following new columns:
  o |succeed_wp|: Pr(win | succeed) - The win probability if the
    "succeed" outcome of the "run a play" choice occurs, for each row in |df|.
  o |fail_wp|: Pr(win | ¬succeed) - The win probability if the
    "fail" outcome of the "run a play" choice occurs, for each row in |df|.
  o |fg_make_wp|: Pr(win | make) - The win probability if the "make"
    outcome of the "attempt a field goal" choice occurs, for each row in |df|.
  o |fg_miss_wp|: Pr(win | ¬make) - The win probability if the
    "miss" outcome of the "attempt a field goal" choice occurs, for
    each row in |df|.
*Your Task:* Finish off the function by *adding these columns* to |df| and *rounding* numerical columns to *4 decimal* places:
* |punt_wp|: Win probability for each row in |df_punt|
* |kick_wp|: Pr(win | (make OR ¬make))
* |go_wp|: Pr(win | (succeed OR ¬succeed))
Recall the following from statistics concerning binary events (a small
numeric check follows below):
* If Pr(A) is the probability of event A occurring, then
  Pr(¬A) = 1 − Pr(A) is the probability of event A not occurring.
* If event B depends on event A occurring, then
  Pr(B | (A OR ¬A)) = Pr(B | A)Pr(A) + Pr(B | ¬A)Pr(¬A)
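A small numeric check of the second identity, with made-up numbers (a hypothetical 4th-down decision where A = "the conversion succeeds" and B = "the offense wins the game"):
```python
# Hypothetical probabilities, not taken from the data set.
p_a = 0.4              # Pr(A): the conversion succeeds
p_b_given_a = 0.6      # Pr(B | A): win probability if it succeeds
p_b_given_not_a = 0.1  # Pr(B | not A): win probability if it fails

# Total probability over the two outcomes of the decision.
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)
print(p_b)  # approximately 0.30 -- the same shape as the go_wp calculation
```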
In [ ]:
### Define demo inputs
with open('resource/asnlib/publicdata/demo_df_ex8', 'rb') as f:
    demo_df_ex8 = pickle.load(f)
The demo included in the solution cell below should display the
following output (it's wide, you may have to scroll):

|    | homeScore | awayScore | down | distance | homeTeamPoss | yardsToEndzone | timeLeft | go_succeed_prob | fg_make_prob | fg_make_wp | fg_miss_wp | fail_wp | succeed_wp | punt_wp | kick_wp | go_wp |
|---:|----------:|----------:|-----:|---------:|:-------------|---------------:|---------:|----------------:|-------------:|-----------:|-----------:|--------:|-----------:|--------:|--------:|------:|
|  0 |         7 |         0 |    4 |        1 | False        |              1 |     1904 |          0.6785 |       0.9998 |     0.3358 |     0.2691 |  0.2838 |     0.4895 |  0.2546 |  0.3358 | 0.4233 |
|  1 |         7 |         0 |    4 |        7 | True         |             26 |     2566 |          0.3898 |       0.786  |     0.7976 |     0.703  |  0.7214 |     0.8222 |  0.7359 |  0.7773 | 0.7607 |
|  2 |        21 |         0 |    4 |        8 | False        |             69 |     1965 |          0.3542 |       0      |     0.037  |     0.0083 |  0.01   |     0.0261 |  0.0181 |  0.0083 | 0.0157 |
|  3 |        19 |        31 |    4 |        1 | False        |             66 |      335 |          0.6785 |       0      |     0.9998 |     0.9808 |  0.9852 |     0.9989 |  0.9976 |  0.9808 | 0.9945 |
|  5 |        19 |        23 |    4 |        7 | True         |             27 |     1085 |          0.3898 |       0.7675 |     0.4338 |     0.2622 |  0.29   |     0.4859 |  0.3157 |  0.3939 | 0.3664 |
|  6 |        17 |         7 |    4 |        3 | False        |             41 |     1982 |          0.5075 |       0.4171 |     0.2396 |     0.127  |  0.1375 |     0.2355 |  0.1737 |  0.1739 | 0.1872 |
|  7 |         7 |        22 |    4 |        2 | False        |              2 |      737 |          0.5727 |       0.9992 |     0.9977 |     0.9941 |  0.9953 |     0.9997 |  0.9932 |  0.9977 | 0.9979 |
|  9 |         7 |         6 |    4 |       10 | True         |             46 |     1851 |          0.3028 |       0      |     0.6467 |     0.4586 |  0.489  |     0.6463 |  0.5823 |  0.4586 | 0.5366 |
In [ ]:
### Exercise 8 solution
def sim_outcomes(df: pd.DataFrame, models: dict) -> pd.DataFrame:
    conversion_model = models['convert']
    fg_model = models['fg']
    win_prob_model = models['win_prob']
    def_win_prob_model = lambda d: 1 - win_prob_model(d)
    df = df.copy()
    goal_to_go = df['distance'] == df['yardsToEndzone']
    fg_make_prob = fg_model(df['yardsToEndzone'])
    go_succeed_prob = conversion_model(df['distance'])
    df['go_succeed_prob'] = go_succeed_prob
    df['fg_make_prob'] = fg_make_prob
    # Make columns for win probability if a particular outcome occurs
    df['fg_make_wp'] = def_win_prob_model(simulate_fg_make(df))
    df['fg_miss_wp'] = def_win_prob_model(simulate_fg_miss(df))
    df['fail_wp'] = def_win_prob_model(simulate_fourth_down_fail(df))
    df.loc[goal_to_go, 'succeed_wp'] = def_win_prob_model(simulate_fourth_down_succeed(df))[goal_to_go]
    df.loc[~goal_to_go, 'succeed_wp'] = win_prob_model(simulate_fourth_down_succeed(df))[~goal_to_go]
    # Make df_punt
    df_punt = simulate_punt(df)
    ### BEGIN SOLUTION
    fg_miss_prob = 1 - fg_make_prob
    go_fail_prob = 1 - go_succeed_prob
    df['punt_wp'] = def_win_prob_model(df_punt)
    df['kick_wp'] = fg_make_prob*df['fg_make_wp'] + fg_miss_prob*df['fg_miss_wp']
    df['go_wp'] = go_succeed_prob*df['succeed_wp'] + go_fail_prob*df['fail_wp']
    return df.round(4)
    ### END SOLUTION
### demo function call
models = {
    'convert': conversion_model,
    'fg': fg_model,
    'win_prob': win_prob_model
}
sim_outcomes(demo_df_ex8, models)
The cell below will test your solution for Exercise 8. The testing
variables will be available for debugging under the following names in a
dictionary format.
* |input_vars| - Input variables for your solution.
* |original_input_vars| - Copy of input variables from prior to running your solution.
These /should/ be the same as |input_vars| - otherwise the inputs were modified
by your solution.
* |returned_output_vars| - Outputs returned by your solution.
* |true_output_vars| - The expected output. This /should/ "match" |
returned_output_vars| based on the question requirements - otherwise, your solution
is
not returning the correct output.
In [ ]:
### test_cell_ex8
### BEGIN HIDDEN TESTS
import dill
import hashlib
with open('resource/asnlib/public/hash_check.pkl', 'rb') as f:
    hash_check = dill.load(f)
for fname in ['testers.py', '__init__.py', 'test_utils.py']:
    hash_check(f'tester_fw/{fname}', f'resource/asnlib/public/{fname}')
del hash_check
del dill
del hashlib
### END HIDDEN TESTS
from tester_fw.testers import Tester
conf = {
    'case_file': 'tc_9',
    'func': sim_outcomes, # replace this with the function defined above
    'inputs': { # input config dict. keys are parameter names
        'df': {
            'dtype': 'df', # data type of param.
            'check_modified': True,
        },
        'models': {
            'dtype': 'dict', # data type of param.
            'check_modified': True,
        }
    },
    'outputs': {
        'output_0': {
            'index': 0,
            'dtype': 'df',
            'check_dtype': True,
            'check_col_dtypes': True, # Ignored if dtype is not df
            'check_col_order': False, # Ignored if dtype is not df
            'check_row_order': False, # Ignored if dtype is not df
            'check_column_type': True, # Ignored if dtype is not df
            'float_tolerance': 10 ** (-6)
        }
    }
}
tester = Tester(conf, key=b'qni4-JKoB2OXw7cdPu6VxK1dNkBTJmEW6jYuJjRdBEg=', path='resource/asnlib/publicdata/')
for _ in range(70):
    try:
        tester.run_test()
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
    except:
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
        raise
### BEGIN HIDDEN TESTS
tester = Tester(conf, key=b'yApq-b21ZcdMzXYB0JMmFYoBUUR6xEVU-u_l0EE5jWI=', path='resource/asnlib/publicdata/encrypted/')
for _ in range(20):
    try:
        tester.run_test()
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
    except:
        (input_vars, original_input_vars, returned_output_vars, true_output_vars) = tester.get_test_vars()
        raise
### END HIDDEN TESTS
print('Passed! Please submit.')
*Fin.* If you have made it this far, congratulations on completing the
exam. *Don't forget to submit!*