Pravallika Dharanikota Statistics Data Analysis Word

School

University of Scranton *

Course

509

Subject

Statistics

Date

Apr 3, 2024

Type

docx

Pages

16

Uploaded by ColonelScienceMagpie8855

Divya Nimmala, Pravallika Dharanikota, Sai Balaji NagiReddy
DATA ANALYSIS PROJECT
1. Purpose: The individuals listed in the Excel file are basketball players. Each player is identified by name and described by Position (POS), such as Point guard (PG), Shooting guard (SG), Small forward (SF), Power forward (PF), or Center (C), along with College, Age, Height in inches, Weight in LBS, and Salary.
a. Who?
Divya Nimmala (R01403110) - Presentation and Visualization
Pravallika Dharanikota (R01404120) - Analysis and Content Creation
Sai Balaji NagiReddy (R01401716) - Collection and Cleaning of Data
b. What? The data includes both qualitative and quantitative variables.
Qualitative variables include:
Name: Names of the basketball players.
POS: Positions played by the players.
College: The colleges from which the players graduated or which they attended.
Quantitative variables include:
AGE: Age of the players.
Height: Height of the players, measured in inches.
Weight: Weight of the players, measured in pounds.
Salary: Annual salary of the players, expressed in dollars.
c. When? The age of each player is provided, indicating the player's age at the time the data was collected.
d. Why? The purpose of this data is to analyze and compare attributes of basketball players, such as their positions, physical characteristics (age, height, weight), college affiliations, and salaries. This analysis could be useful for team management, scouting, contract negotiations, and other purposes related to basketball operations.
2. Understanding the Data:
Who?
Content: Introduction to the basketball players listed in the dataset.
Utilization: Identifying individual players and their respective positions.
Value Add: Helps in forming team compositions and understanding player roles.
What?
Content: Explanation of qualitative and quantitative variables.
Utilization: Analyzing player characteristics such as age, height, weight, college affiliations, and positions played.
Value Add: Provides insights into player diversity and skill distribution across various attributes.
When?
Content: Discussion of the age distribution of players.
Utilization: Understanding the age demographics of the roster.
Value Add: Helps in assessing the team's youthfulness or experience level and in planning player development or recruitment strategies accordingly.
Why?
Content: Highlighting the significance of the dataset for basketball operations.
Utilization: Supporting decision-making processes related to team management, scouting, contract negotiations, and strategic planning.
Value Add: Empowers teams to make informed decisions based on player attributes, performance, and market value.
3. VISUALIZATION TOOL: MINITAB
Minitab is primarily statistical analysis software rather than a dedicated visualization tool. It does, however, provide graphs and charts for visualizing data distributions, trends, relationships, and quality control analysis, including histograms, scatterplots, and boxplots. These features are useful for exploratory data analysis, but they may not be as extensive or customizable as those found in tools such as Tableau or Excel. Its main value lies in its ability to provide powerful statistical tools in a user-friendly interface, allowing users to:
Name                 POS  College         AGE  Height in inches  Weight in LBS  Salary
Keita Bates-Diop     SF   Ohio State      28   80                229            $2,364,614
Mikal Bridges        SF   Villanova       27   78                209            $21,700,000
Nic Claxton          C    Georgia         24   83                215            $9,625,000
Noah Clowney         F    Alabama         19   81                210            $3,089,520
Dorian Finney-Smith  PF   Florida         30   79                220            $13,932,008
Cameron Johnson      SF   North Carolina  28   80                210            $25,679,348
Keon Johnson         G    Tennessee       22   77                185            $2,808,720
Day'Ron Sharpe       C    North Carolina  22   81                265            $2,210,040
Ben Simmons          PG   LSU             27   82                240            $37,893,408
Dennis Smith Jr.     PG   NC State        26   75                205            $2,019,706
Cam Thomas           SG   LSU             22   75                210            $2,240,160
Lonnie Walker IV     G    Miami           25   76                204            $2,019,706
Trendon Watford      PF   LSU             23   80                237            $2,019,706
Dariq Whitehead      F    Duke            19   78                220            $2,966,040
Jalen Wilson         F    Kansas          23   78                220            $850,000
Steven Adams         C    Pittsburgh      30   83                265            $12,600,000
Dillon Brooks        SF   Oregon          28   78                225            $22,627,671
Reggie Bullock Jr.   SF   North Carolina  33   78                205            $2,019,706
Tari Eason           F    LSU             22   80                215            $3,527,160
Jeff Green           PF   Georgetown      37   80                235            $9,600,000
Aaron Holiday        G    UCLA            27   72                185            $2,019,706
Jock Landale         C    Saint Mary's    28   83                255            $8,000,000
Jabari Smith Jr.     PF   Auburn          20   83                220            $9,326,520
Jae'Sean Tate        SF   Ohio State      28   76                230            $6,500,000
Fred VanVleet        PG   Wichita State   30   72                197            $40,806,300
Cam Whitmore         F    Villanova       19   79                230            $3,218,160
Source: official ESPN sports website: https://www.espn.com/nba/team/roster/_/name/bkn/brooklyn-nets
4. Type of Data Being Analyzed
Qualitative Data:
Name: Nominal qualitative data representing the names of the players.
POS: Nominal qualitative data representing the position played by each basketball player.
College: Nominal qualitative data representing the colleges from which the players graduated or which they attended.
Quantitative Data:
Age: Quantitative discrete data representing the age of the players.
Height: Quantitative continuous data representing the height of the players.
Weight: Quantitative continuous data representing the weight of the players.
Salary: Quantitative continuous data representing the annual salary of the players.
5. Histogram of Salary
The histogram of salary provides a visual representation of the overall salary distribution. It displays the number of players whose salaries fall within each salary range.
Descriptive Statistics: Salary
Skewness: 1.71. The positive value indicates that the distribution is skewed to the right, meaning the more extreme values lie at the higher end of the salary range.
Resistant statistics: median = $3,372,660; IQR = $10,770,546
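The resistant statistics above can be checked with a short script. This is a minimal sketch, not the Minitab procedure itself: it assumes the salaries are transcribed correctly from the roster table in this report and that the quartiles are interpolated at position p(n + 1) in the sorted data, which is what Python's `statistics.quantiles` does with `method="exclusive"`. The variable names are illustrative only.

```python
import statistics

# Salaries in USD, transcribed from the roster table in this report
# (26 players, listed in table order).
salaries = [
    2364614, 21700000, 9625000, 3089520, 13932008, 25679348, 2808720,
    2210040, 37893408, 2019706, 2240160, 2019706, 2019706, 2966040,
    850000, 12600000, 22627671, 2019706, 3527160, 9600000, 2019706,
    8000000, 9326520, 6500000, 40806300, 3218160,
]

# Assumption: quartiles are interpolated at position p * (n + 1),
# which is the "exclusive" method of statistics.quantiles.
q1, median, q3 = statistics.quantiles(salaries, n=4, method="exclusive")
iqr = q3 - q1

print(f"Q1     = {q1:,.1f}")
print(f"Median = {median:,.1f}")
print(f"Q3     = {q3:,.1f}")
print(f"IQR    = {iqr:,.1f}")
```

Under these assumptions the script gives a median of 3,372,660 and an IQR of 10,770,545.5, which rounds to the value reported above.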
To identify potential outlier values in the salary data, we can use a common method based on the interquartile range (IQR). Outliers are often defined as values that fall below Q1 - 1.5 * IQR or above Q3 + 1.5 * IQR.
From the statistics provided:
Q1 (first quartile): $2,162,457
Q3 (third quartile): $12,933,002
IQR (interquartile range): $10,770,546
Using these values, we can calculate the lower and upper bounds for identifying potential outliers:
Lower bound: Q1 - 1.5 * IQR = $2,162,457 - 1.5 * $10,770,546 = $2,162,457 - $16,155,819 = -$13,993,362
Upper bound: Q3 + 1.5 * IQR = $12,933,002 + 1.5 * $10,770,546 = $12,933,002 + $16,155,819 = $29,088,821
Since negative salaries do not make sense in this context, we consider only the upper bound as a threshold for potential outliers. Any salary above $29,088,821 is a potential outlier. In this dataset, two salaries exceed that threshold: Ben Simmons ($37,893,408) and Fred VanVleet ($40,806,300).
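The 1.5 * IQR rule described above can be sketched in a few lines of Python. The quartile values are the rounded Q1, Q3, and IQR quoted in this section, and the salary list is transcribed from the roster table; the variable names are illustrative assumptions rather than part of the original analysis.

```python
# Rounded quartiles quoted in this section (USD).
q1 = 2_162_457
q3 = 12_933_002
iqr = 10_770_546

# Fences for the 1.5 * IQR outlier rule.
lower_bound = q1 - 1.5 * iqr
upper_bound = q3 + 1.5 * iqr

# Salaries in USD, transcribed from the roster table in this report.
salaries = [
    2364614, 21700000, 9625000, 3089520, 13932008, 25679348, 2808720,
    2210040, 37893408, 2019706, 2240160, 2019706, 2019706, 2966040,
    850000, 12600000, 22627671, 2019706, 3527160, 9600000, 2019706,
    8000000, 9326520, 6500000, 40806300, 3218160,
]

# A salary is a potential outlier if it falls outside either fence.
outliers = sorted(s for s in salaries if s < lower_bound or s > upper_bound)

print(f"Lower bound: {lower_bound:,.0f}")
print(f"Upper bound: {upper_bound:,.0f}")
print("Potential outliers:", [f"${s:,}" for s in outliers])
```

Because the lower fence is negative, no salary can fall below it, so only the upper fence matters for this dataset.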
6. Boxplot of Salary and POS:
The boxplot of Salary by POS shows the salary range, from lowest to highest, for each position.
Descriptive Statistics: Salary
Position SG has only one player in the dataset (Cam Thomas), so its mean and median are both $2,240,160 and no IQR can be computed for it.
RESISTANT DATA SETS:
POS  Median       IQR
C    8,812,500    8,198,720
F    3,089,520    1,464,640
G    2,019,706    789,014
PF   9,463,260    9,002,597
PG   37,893,408   38,786,594
SF   14,100,000   21,112,203
SG   2,240,160    (not available)
7. Scatterplot of AGE vs Weight in LBS
The resistant statistics are the median and the IQR from the above statistics.
The correlation coefficient r is 0.095642, which indicates a very weak positive linear relationship between age and weight.
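As a rough check on the correlation coefficient, the Pearson formula can be applied directly to the AGE and Weight columns. This sketch assumes the values are transcribed correctly from the roster table in this report; the function name `pearson_r` is a hypothetical helper, not something taken from Minitab.

```python
import math

# AGE and Weight in LBS, transcribed from the roster table (same row order).
ages = [28, 27, 24, 19, 30, 28, 22, 22, 27, 26, 22, 25, 23,
        19, 23, 30, 28, 33, 22, 37, 27, 28, 20, 28, 30, 19]
weights = [229, 209, 215, 210, 220, 210, 185, 265, 240, 205, 210, 204, 237,
           220, 220, 265, 225, 205, 215, 235, 185, 255, 220, 230, 197, 230]


def pearson_r(x, y):
    """Pearson correlation: S_xy / sqrt(S_xx * S_yy)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


r = pearson_r(ages, weights)
print(f"r = {r:.6f}")
```

A value of r this close to zero suggests that age explains almost none of the variation in weight for these players.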