MAT 243 Project One Summary Report Josh Rose - 11-9-23
MAT 243 Project One Summary Report
Joshua Rose
Southern New Hampshire University
11/9/2023
1. Introduction: Problem Statement
In this project, I conduct a comparative analysis of two basketball teams: a team of my own choosing ("My Team") and a team that was assigned to me ("Assigned Team"). The goal is to gain better insight into their performance using statistical analysis and data visualization. I have chosen the Suns as my team and will focus on the years 2013 – 2015. The assigned team is the Bulls, using their data from the years 1996 – 1998.
The primary data set for this project is the points scored by each team in its respective range of years. The report uses statistical analysis and data visualization charts to compare the performance of the two teams.
The analysis uses descriptive statistics (mean, median, variance, and standard deviation) to summarize and compare home and away games, and confidence intervals to estimate the average relative skill level of teams over each range of years.
2. Introduction: Your Team and the Assigned Team
For this study, I have chosen the Suns as my designated team, focusing on the years 2013 – 2015. I will examine several aspects of the Suns' performance to gain insight and make comparisons against the assigned team. The team I was assigned is the Chicago Bulls, and I will be looking at this team's statistics covering the years 1996 – 1998.
Table 1. Information on the Teams
              Name of Team    Assigned Years
1. Yours      Suns            2013 – 2015
2. Assigned   Bulls           1996 – 1998
3. Data Visualization: Points Scored by Your Team
Data visualization is a useful tool for studying the distributions and trends in complex datasets in an intuitive manner. Visualizations give readers a deeper understanding of the patterns, comparisons, and characteristics in the data. They can show the central tendency, spread, and shape of the data, and reveal potential outliers or anomalies. For this project, I have chosen a histogram to represent the distribution of points scored by my chosen team, the Suns. I went with this option because a histogram works well for displaying the frequency, or count, of data within set intervals. When looking at a team's points scored, it is important to understand how often the team scores within each range of points. The histogram shows whether the Suns tend to score in a specific range or whether their scoring is spread evenly across ranges. This visual also helps in assessing the central tendency and spread of the data, giving a good view of scoring trends.
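The interval counting that a histogram performs can be sketched in a few lines of Python. This is only an illustration: the scores in `sample_points` are hypothetical values and not the Suns data, and the bin width of 10 points is an assumed choice.

```python
from collections import Counter

BIN_WIDTH = 10  # assumed interval width, in points


def histogram_counts(points, bin_width=BIN_WIDTH):
    """Count how many games fall into each fixed-width scoring interval."""
    counts = Counter((p // bin_width) * bin_width for p in points)
    return dict(sorted(counts.items()))


# Hypothetical points-scored values, for illustration only (not the Suns data).
sample_points = [88, 95, 97, 101, 102, 104, 106, 110, 113, 121]

bins = histogram_counts(sample_points)
for start, count in bins.items():
    # Each '#' stands for one game in that scoring interval.
    print(f"{start:3d}-{start + BIN_WIDTH - 1:3d} | {'#' * count}")
```

The tallest bar marks the interval where scores are most frequent, which is the pattern a histogram makes visible.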
4. Data Visualization: Points Scored by the Assigned Team
For the assigned team, the Bulls, I have again used a histogram to show the distribution of points scored, covering the years 1996 – 1998. A histogram displays the count of games within each interval of points, which makes it well suited to understanding the distribution of points scored.
I chose the histogram because it clearly shows how frequently the Bulls scored within different point ranges over these years. It also helps show the central tendency and overall spread of the data, and it helps identify any scoring patterns.
5. Data Visualization: Comparing the Two Teams
Data visualization can also be used to compare two data distributions side by side. This allows us to visually compare similarities, differences, and patterns between the two. In this section, I have created a boxplot to compare the distributions of points scored by the team I selected (Suns) and my assigned team (Bulls). I went with this visualization because a boxplot summarizes key statistics of a dataset, such as the median, the quartiles, and potential outliers. The boxplot shows clearly where the typical (median) score lies for each team.
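The statistics that a boxplot draws can be computed directly. The sketch below uses hypothetical scores rather than the actual Suns or Bulls data, and assumes the "inclusive" quartile method from Python's `statistics` module; other quartile conventions give slightly different values.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum), the values a boxplot is built from."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)


# Hypothetical points-scored samples, for illustration only.
team_a = [90, 95, 100, 105, 110]
team_b = [85, 98, 104, 109, 120]

for name, scores in (("Team A", team_a), ("Team B", team_b)):
    low, q1, med, q3, high = five_number_summary(scores)
    print(f"{name}: min={low}, Q1={q1}, median={med}, Q3={q3}, max={high}")
```

Placing the two summaries side by side is the numerical equivalent of comparing the two boxes in the plot.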
6. Descriptive Statistics: Points Scored By Your Team in Home Games
Table 2. Descriptive Statistics for Points Scored by Suns in Home Games
Statistic Name        Value
Mean                  101.8
Median                101.0
Variance              173.79
Standard Deviation    13.18
We can use measures of central tendency and variability to describe the distribution of a data set. The measures of central tendency are the mean (the average of the full data set), the median (the middle value of the data set), and the mode (the most frequently occurring value). The measures of variability
(variance and standard deviation) quantify the spread of data points around a central value. They indicate how much individual data points deviate from the center, which answers the question of how far the data typically vary from it.
Mean (101.8) – The mean is the average number of points scored by the team in home games.
Median (101.0) – The median is the middle value of all the points scored in home games.
Variance (173.79) – The variance measures how far data points deviate from the mean, in squared units. A higher variance indicates greater variability in the data.
Standard Deviation (13.18) – The standard deviation is the square root of the variance and measures how spread out the data points are around the mean, in the same units as the data (points). A lower standard deviation indicates that the data points lie closer to the mean.
As for the skew of this distribution, the mean (101.8) and median (101.0) are close, which suggests the distribution is nearly symmetrical or bell-shaped. In this situation, either measure of central tendency could be used to represent the center of the distribution.
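The four statistics in Table 2 can be computed as in the following sketch. The game scores here are hypothetical, since the underlying data set is not reproduced in this report, and the sample (n - 1) form of the variance is an assumption. The last lines check that the reported standard deviation is consistent with the reported variance.

```python
import math
import statistics

# Hypothetical home-game scores, for illustration only (not the Suns data).
sample_points = [90, 100, 101, 102, 116]

sample_mean = statistics.mean(sample_points)
sample_median = statistics.median(sample_points)
sample_variance = statistics.variance(sample_points)  # sample variance (divides by n - 1)
sample_std = statistics.stdev(sample_points)          # square root of the sample variance

print(f"mean={sample_mean}, median={sample_median}")
print(f"variance={sample_variance:.2f}, std dev={sample_std:.2f}")

# Consistency check on the values reported in Table 2:
# the standard deviation should equal the square root of the variance.
reported_variance = 173.79
implied_std = math.sqrt(reported_variance)
print(f"sqrt({reported_variance}) = {implied_std:.2f}")
```

A mean and median that sit close together, as they do in this sample, are the same signal of symmetry discussed above.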
7. Descriptive Statistics: Points Scored By Your Team in Away Games
Table 3. Descriptive Statistics for Points Scored by Your Team in Away Games
Statistic Name        Value
Mean                  100.08
Median                99.0
Variance              132.6
Standard Deviation    11.52
Mean (100.08) – This is the average number of points scored by my team in away games.
Median (99.0) – This is the middle value of all points scored in away games.
Variance (132.6) – The variance measures how far data points deviate from the mean. This value is lower than the home-game variance (173.79), which indicates less variability in away games.
Standard Deviation (11.52) – This is the square root of the variance and measures the spread of data points around the mean.
The mean (100.08) and median (99.0) are close, which suggests a nearly symmetrical, bell-shaped distribution. Both the mean and the median are therefore well-suited measures of central tendency for representing the center of the distribution.
Home and away performance can be compared using the mean and the standard deviation. The mean for home games (101.8) is slightly higher than the mean for away games (100.08), which shows slightly better scoring at home. The standard deviation of points scored in home games (13.18) is also slightly higher than in away games (11.52), which shows that points scored in away games are more consistent.
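The home and away comparison can be worked through with the values from Tables 2 and 3. The relative spread (standard deviation divided by mean) is an extra illustration that is not part of the original analysis; it is included only to show that the consistency conclusion holds after accounting for the small difference in means.

```python
# Values reported in Table 2 (home games) and Table 3 (away games).
home = {"mean": 101.8, "std": 13.18}
away = {"mean": 100.08, "std": 11.52}

# Difference in average points scored, home minus away.
mean_diff = home["mean"] - away["mean"]

# Relative spread: standard deviation as a fraction of the mean.
home_cv = home["std"] / home["mean"]
away_cv = away["std"] / away["mean"]

print(f"Home mean exceeds away mean by {mean_diff:.2f} points")
print(f"Home relative spread: {home_cv:.1%}")
print(f"Away relative spread: {away_cv:.1%}")
```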
8. Confidence Intervals for the Average Relative Skill of All Teams in Your Team’s Years
Table 4. Confidence Interval for Average Relative Skill of Teams in Your Team’s Years
Confidence Level (%)    Confidence Interval
95%                     (1502.02, 1507.18)
Confidence intervals quantify the uncertainty in an estimate by providing a lower and an upper limit that together give a range of plausible values. They are based on sample data, help quantify precision, support inferences, and attach a level of confidence to the estimated value. The following is a detailed interpretation of the confidence interval for the average relative skill of all teams in my team's year range of 2013-2015:
The 95% confidence interval suggests that we can be 95% confident that the average relative skill level of teams in the 2013-2015 range falls between 1502.02 and 1507.18. In general terms, if we repeatedly drew samples of relative skill levels from the same population and calculated a 95% confidence interval from each sample, we would expect around 95% of those intervals to contain the true average skill level.
If we were to use a different confidence level, say 99%, the confidence interval would be wider. A higher confidence level requires a wider interval to account for the increased level of certainty.
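The effect of the confidence level on the interval width can be checked numerically. This is a minimal sketch that assumes the reported interval is a normal-based (z) interval, which the report does not state; the helper `rescale_interval` is a hypothetical name introduced here for illustration.

```python
from statistics import NormalDist


def rescale_interval(lower, upper, old_level, new_level):
    """Convert a normal-based confidence interval from one confidence level to another."""
    center = (lower + upper) / 2
    old_z = NormalDist().inv_cdf(0.5 + old_level / 2)  # critical value at the old level
    new_z = NormalDist().inv_cdf(0.5 + new_level / 2)  # critical value at the new level
    std_error = (upper - center) / old_z               # standard error implied by the old interval
    margin = new_z * std_error
    return center - margin, center + margin


# 95% interval reported in Table 4.
ci_95 = (1502.02, 1507.18)
ci_99 = rescale_interval(*ci_95, old_level=0.95, new_level=0.99)
ci_90 = rescale_interval(*ci_95, old_level=0.95, new_level=0.90)

for label, (low, high) in (("90%", ci_90), ("95%", ci_95), ("99%", ci_99)):
    print(f"{label}: ({low:.2f}, {high:.2f}), width {high - low:.2f}")
```

Under this assumption the 99% interval encloses the 95% interval and the 90% interval sits inside it, while all three share the same center.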
The probability of a team that is in the same league having a lower relative skill level would depend on the distribution of relative skill levels in the population. If the skill levels follow a
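normal distribution, the probability can be estimated from that distribution's mean and standard deviation. The confidence interval alone is not enough, because it describes where the population mean lies rather than how individual teams are spread around it. A team with a skill level below the lower bound of 1502.02 would be below average, which is not in itself unusual.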
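If normality is assumed, a probability of this kind comes from the normal cumulative distribution function. The sketch below uses hypothetical values throughout: the population mean, the standard deviation, and the team's skill level are placeholders chosen for illustration and are not taken from the data.

```python
from statistics import NormalDist

# Hypothetical population of relative skill levels (assumed values, not from the data).
population = NormalDist(mu=1505.0, sigma=100.0)

# Hypothetical relative skill level of the team of interest.
team_skill = 1420.0

# Probability that a randomly chosen team has a lower skill level than this team.
p_lower = population.cdf(team_skill)

# For comparison: the probability of falling below the population mean itself.
p_below_mean = population.cdf(population.mean)

print(f"P(skill < {team_skill}) = {p_lower:.4f}")
print(f"P(skill < mean) = {p_below_mean:.2f}")
```

The second probability is one half by symmetry, which shows why being below a value near the mean is not a rare event.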
9. Confidence Intervals for the Average Relative Skill of All Teams in the Assigned Team’s Years
Table 5. Confidence Interval for Average Relative Skill of Teams in Assigned Team’s Years
Confidence Level (%)    Confidence Interval
95%                     (1487.66, 1493.65)
The 95% confidence interval for the average relative skill of teams in the assigned team's years is (1487.66, 1493.65). This shows that we can be 95% confident that the average relative skill of teams during the period falls within this range. If we were to choose a different confidence level, such as 90% or 99%, the width of the interval would change. A higher confidence level of 99% would result in a wider interval. A lower confidence level such as 90% would produce a narrower interval, which would give higher precision but run a greater risk of not capturing the true parameter.
Comparing this interval to the one for my team's years, (1502.02, 1507.18), we see that the two intervals do not overlap. This indicates that, at the 95% confidence level, the average relative skill of teams in the assigned team's years (1996 – 1998) is statistically lower than the average relative skill of teams in my team's years (2013 – 2015).
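The overlap check can be written out explicitly with the two reported intervals. The function name `intervals_overlap` is introduced here for illustration.

```python
def intervals_overlap(a, b):
    """Return True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# 95% confidence intervals reported in Tables 4 and 5.
my_team_years = (1502.02, 1507.18)   # 2013 – 2015
assigned_years = (1487.66, 1493.65)  # 1996 – 1998

overlap = intervals_overlap(my_team_years, assigned_years)

# Distance between the upper bound of the lower interval and the lower bound of the higher one.
gap = my_team_years[0] - assigned_years[1]

print(f"Intervals overlap: {overlap}")
print(f"Gap between intervals: {gap:.2f}")
```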
10. Conclusion
This analysis helps in evaluating how teams in the NBA perform by providing information that supports decision making, team evaluation, future planning, and coaching strategies. Having this data helps identify a team's strengths and weaknesses and where to focus on improvement. Data visualization also made the numerical comparisons simpler to understand. In the end, this analysis provided a sound framework for assessing and understanding the statistics and trends of the two teams.