CHarris - MAT 243 Project Three Summary Report

School

Southern New Hampshire University

Course

APPLIED ST

Subject

Mathematics

Date

Feb 20, 2024

MAT 243 Project Three Summary Report
Chanelle Harris
Chanelle.harris@snhu.edu
Southern New Hampshire University
As we continue to improve our team, we have been asked to extend our analysis by building regression models that predict the number of wins in a regular season from the performance metrics included in the data set. These models relate the total number of wins to the average points scored, the average relative skill, and the average point differential between our team and our comparison team, so that we can make key decisions to improve the team's performance. The average points differential compares our team's and the opponent's average points in a regular season; it is found by subtracting the points the Cavs scored from those the Chicago Bulls scored. The average relative skill is a team's average skill level in a regular season, calculated from the team's final score, the game location, and the outcome of the game relative to the probability of that outcome.
To study the correlation between our variables, we built a scatterplot of the total number of wins versus the average relative skill. The correlation coefficient describes the strength and direction of the linear association between the two variables. As the scatterplot shows, our correlation coefficient is 0.9072, which indicates a strong positive correlation. We are testing at a 1% level of significance, and the reported p-value rounds to 0; because it is less than 0.01, the correlation is statistically significant.
We use simple linear regression to predict the value of an output variable, known as the response, from the value of an input variable, called the predictor; it models the relationship between two continuous variables. The model equation for the total number of wins is y = α + βx, where y is the total number of wins and x is the average relative skill.
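As a rough illustration of how the correlation coefficient and the line y = α + βx are obtained from paired observations, the following minimal sketch computes them by ordinary least squares. The function name and the five data points are made up for illustration only; they are not the project data set, so the printed values will not match the figures in this report.

```python
from math import sqrt


def fit_simple_regression(x, y):
    """Return (alpha, beta, r) for the least-squares line y = alpha + beta * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx                 # slope
    alpha = mean_y - beta * mean_x   # intercept
    r = sxy / sqrt(sxx * syy)        # Pearson correlation coefficient
    return alpha, beta, r


# Hypothetical toy values (assumption): average relative skill and total wins.
toy_skill = [1400, 1450, 1500, 1550, 1600]
toy_wins = [30, 34, 41, 45, 52]

alpha, beta, r = fit_simple_regression(toy_skill, toy_wins)
print(f"wins = {alpha:.4f} + {beta:.4f} * skill, r = {r:.4f}")
```

A value of r close to +1, as in this toy example, corresponds to the kind of strong positive association described for the scatterplot.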
The null hypothesis is H₀: β = 0, which states that there is no linear relationship between the number of wins and the average relative skill in a season. The alternative hypothesis is Hₐ: β ≠ 0, which states that the number of wins in a season is linearly related to the average relative skill. We use a level of significance of 1%, or 0.01.
Table 1: Hypothesis Test for the Overall F-Test
Statistic        Value
Test Statistic   2865.00
P-value          0.00
Because the p-value is less than 0.01, there is sufficient evidence to reject the null hypothesis and conclude that our team's number of wins is related to the average relative skill. We use this model to predict the total number of wins in a regular season for a team based on its relative skill level. For example, if the relative skill level is 1550, the equation reads total_wins = -128.2475 + 0.1121(1550) ≈ 45.5, and if the relative skill level is 1450, it reads total_wins = -128.2475 + 0.1121(1450) ≈ 34.3.
To show the linear relationship between a response variable and two or more predictor variables, we use the multiple linear regression model. The standard equation for a model with two predictors is Y = β₀ + β₁X₁ + β₂X₂. Using the data for our team, the fitted equation is Y = -152.5736 + 0.3497(avg_pts) + 0.1055(avg_elo_n). The null hypothesis (H₀: β₁ = β₂ = 0) states that there is no relationship between the response variable and any of the predictor variables. The alternative hypothesis (Hₐ: at least one βᵢ ≠ 0 for i = 1, 2) states that a relationship exists with at least one predictor variable. The level of significance is again 1%, or 0.01.
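The simple linear regression predictions quoted above can be checked with a few lines of Python. This is only a sketch: the intercept and slope are copied from the fitted equation in the report, while the function and constant names are chosen here for illustration.

```python
# Coefficients copied from the fitted simple regression equation in the report.
INTERCEPT = -128.2475
SLOPE_AVG_ELO_N = 0.1121


def predict_total_wins(avg_elo_n):
    """Predicted regular-season wins for a given average relative skill."""
    return INTERCEPT + SLOPE_AVG_ELO_N * avg_elo_n


# The two relative skill levels used as examples in the report.
for skill in (1550, 1450):
    print(skill, round(predict_total_wins(skill), 1))
```

Because the coefficients are rounded to four decimal places, results may differ slightly from values computed with the unrounded regression output.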
Table 2: Hypothesis Test for the Overall F-Test
Statistic        Value
Test Statistic   0.4777
P-value          0.0
A p-value below 1% allows us to reject the null hypothesis: there is significant evidence that the coefficients are not all zero. Because this is the overall F-test, it tells us that at least one of our predictors, average points scored or average relative skill, is statistically significant in predicting total wins in a season. The individual tests that follow examine each predictor separately to determine the role each one plays.
For the individual t-tests, we look at the estimated coefficients of average points scored (avg_pts, 0.3497) and average relative skill (avg_elo_n, 0.1055). The p-value for both individual tests is reported as 0, so both predictor variables are statistically significant at the 1% level of significance. The coefficient of determination is 0.837, meaning that 83.7% of the variation in total wins is explained by average points scored and average relative skill. Using this model, we can predict the total number of wins from the points scored per game and a known relative skill level. For example, a team averaging 75 points per game with a relative skill level of 1350 would be predicted to win about 16 games, and a team averaging 100 points per game with a relative skill level of 1600 would be predicted to win about 51 games.
Generally, we use multiple linear regression when we have several predictor variables and want to predict a response variable. The model combines the population regression function with the regression error term. With four predictors the equation is Y = β₀ + β₁X₁ + β₂X₂ + β₃X₃ + β₄X₄ + ε, and with our data the fitted equation reads Y = 34.5753 - 0.0134(avg_elo_n) + 0.2597(avg_pts) + 1.6206(avg_pts_differential) + 0.0525(avg_elo_differential). The null hypothesis, H₀: β₁ = β₂ = β₃ = β₄ = 0, states that there is no relationship between the response variable and any of the predictor variables listed. The alternative, Hₐ: at least one βᵢ ≠ 0 for i = 1, 2, 3, 4, states that a relationship exists with at least one predictor variable. We again use a level of significance of 1%, or 0.01.
Table 3: Hypothesis Test for the Overall F-Test
Statistic        Value
Test Statistic   1102
P-value          0.0
Because the p-value is below 0.01, the null hypothesis should be rejected in favor of the alternative hypothesis.
This implies that there is a statistically significant relationship between at least one of the predictor variables and the response variable, so the average points scored, the average points differential, and the average relative skill together help explain the number of games won in the regular season. Breaking this down into individual tests, the estimated coefficients are 0.2597 for average points scored, 1.6206 for the average points differential, and -0.0134 for average relative skill. Each predictor has a p-value below 0.01 except for average relative skill. This tells us that the average points scored and the average points differential are significant in predicting the number of wins in this model, but the average relative skill is not.
Our coefficient of determination is 0.878, meaning that about 88% of the variation in total wins is explained by this regression model. For a team averaging 75 points per game with a relative skill level of 1350, an average point differential of -5, and an average relative skill differential of -30, the model predicts 26 games won in the regular season. For a team averaging 100 points per game with a relative skill level of 1600, an average point differential of +5, and an average relative skill differential of +95, it predicts 52 games won.
Based on our hypothesis tests and linear regression models, we can say that the number of games won is related to the average relative skill level, the average points per game, and the average point differential. We can use this information to improve our performance: the models indicate that teams win more games when they score more points and outscore their opponents. We run tests like this to help our coaches and management predict outcomes and plan our team's development.
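The multiple regression predictions reported in this summary can be reproduced with a short sketch. The coefficients are copied from the two fitted equations in the report; the helper function, the dictionary layout, and the team labels are assumptions made here for illustration, not part of the original analysis.

```python
def predict(intercept, coefficients, values):
    """Evaluate intercept + sum(coefficient * value) over the model's predictors."""
    return intercept + sum(coefficients[name] * values[name] for name in coefficients)


# Fitted equations as quoted in the report (rounded coefficients).
TWO_PREDICTOR = (-152.5736, {"avg_pts": 0.3497, "avg_elo_n": 0.1055})
FOUR_PREDICTOR = (34.5753, {
    "avg_elo_n": -0.0134,
    "avg_pts": 0.2597,
    "avg_pts_differential": 1.6206,
    "avg_elo_differential": 0.0525,
})

# The two example teams described in the report (labels are hypothetical).
teams = {
    "75 pts, skill 1350": {"avg_pts": 75, "avg_elo_n": 1350,
                           "avg_pts_differential": -5, "avg_elo_differential": -30},
    "100 pts, skill 1600": {"avg_pts": 100, "avg_elo_n": 1600,
                            "avg_pts_differential": 5, "avg_elo_differential": 95},
}

for label, values in teams.items():
    two = predict(*TWO_PREDICTOR, values)    # uses avg_pts and avg_elo_n only
    four = predict(*FOUR_PREDICTOR, values)  # uses all four predictors
    print(f"{label}: two-predictor {round(two)}, four-predictor {round(four)}")
```

Keeping each model as an intercept plus a dictionary of named coefficients makes it easy to evaluate either equation on the same team profile without rewriting the arithmetic.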