S24 - Data Analysis Exercise

School

University of California, Berkeley

Course

132

Subject

Statistics

Date

Feb 20, 2024

Pages

4

Uploaded by ChancellorWaterJackal38

Data Analysis Assignment: Instructions

The Excel file accompanying this assignment contains two data sets.

The first data set (the “movie ratings” tab in the Excel file) consists of movie ratings of 100 movies released in 2012, alongside ratings of the next movie released by the same director. The ratings of each movie are taken from two websites. The “Metascore” columns contain ratings from metacritic.com, a website that generates a final rating based on a weighted average of dozens and sometimes hundreds of movie reviews. The Metacritic ratings range from 0 to 100. The “AVRating” columns contain ratings from a single website: avclub.com. The AV Club rates movies on a grade scale ranging from A to F. In this spreadsheet, these grades have been converted to numbers ranging from 1 to 12, where 1 is an F and 12 is an A. The “Improvement” columns simply subtract the ratings of the director’s 2012 movie from the ratings of their next movie. Thus, positive numbers mean that, in the eyes of the critics, the director improved on their next movie, whereas negative numbers mean that the director performed worse on their next movie.

The second data set (the “baseball” tab in the Excel file) contains data on major league baseball team performance in 2015. You do not need to know anything about baseball to do this assignment. This is what you need to know: There are 30 major league baseball teams. Each team plays 162 games in a season. The data set shows team performance during two 20-game stretches (Games 61-80 and Games 81-100), and also by season half (the first 81 games vs. the second 81 games). Specifically, the data set shows each team’s winning percentage for each of these time periods. The data set also provides a measurement of improvement from one 20-game stretch to the next, and of improvement from the season’s first half to the season’s second half.

For more information about the variables, click on the “Key” tab in the Excel file.
Please perform the following set of analyses using all of the data (i.e., the data from all three seasons). For correlations, you can use the CORREL function in Excel. Report all correlations to two decimal places.

1. In the “Movie Ratings” data set, correlate the 2012 Metacritic rating with the Metacritic rating of the director’s next movie. Report it here: 0.57
2. In the “Movie Ratings” data set, correlate the 2012 AV Club’s rating with the AV Club’s rating of the director’s next movie. Report it here: 0.37
3. In the “Movie Ratings” data set, correlate the 2012 Metacritic rating with the improvement in the Metacritic rating from the 2012 movie to the director’s next movie. Report it here: -0.27
4. In the “Movie Ratings” data set, correlate the 2012 AV Club’s rating with the improvement in the AV Club’s rating from the 2012 movie to the director’s next movie. Report it here: -0.71
5. In the “Baseball” data set, correlate winning percentages in Games 61-80 with winning percentages in Games 81-100. Report it here: 0.11
6. In the “Baseball” data set, correlate first-half winning percentages with second-half winning percentages. Report it here: 0.55
7. In the “Baseball” data set, correlate winning percentages in Games 61-80 with improvement in Games 81-100. Report it here: -0.62
8. In the “Baseball” data set, correlate first-half winning percentage with second-half improvement. Report it here: -0.26
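For readers working outside Excel, the calculation behind these eight answers can be sketched in Python. This is a minimal illustration, assuming CORREL returns the standard Pearson correlation coefficient. The function name `pearson` and the five pairs of ratings are made up for the example and are not taken from the assignment's spreadsheet.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 values each")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a column is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical ratings (NOT the assignment's data).
first_rating = [60, 75, 40, 82, 55]
next_rating = [65, 70, 50, 71, 58]

# "Improvement" as the assignment defines it: next rating minus first rating.
improvement = [nxt - first for first, nxt in zip(first_rating, next_rating)]

# Report to two decimal places, as the assignment asks.
print(f"first vs. next:        {pearson(first_rating, next_rating):.2f}")
print(f"first vs. improvement: {pearson(first_rating, improvement):.2f}")
```

The same function applies to every question in the list: only the two columns passed in change.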
I now want you to interpret what you found. Please answer each of the questions below while making no reference to anything specific about movies or baseball. You do NOT need to know anything about movies or baseball to answer these questions, nor will that knowledge help you to answer them. Your answers should be clear, concise, and without jargon. Imagine you are explaining your answers to someone who is not knowledgeable about statistics.

1. Compare the correlation in Question 1 to the correlation in Question 2, and the correlation in Question 5 to the correlation in Question 6. What do these comparisons reveal? Explain what you think this means and why you think this happened. Use fewer than 150 words to answer this question.

For case one, when we compare the correlation in Question 1 to the correlation in Question 2, we see that 0.57 and 0.37 (respectively) are both positive. This tells us that the 2012 Metacritic rating is correlated with the Metacritic rating of the director’s next movie, and the 2012 AV Club rating is correlated with the AV Club rating of the director’s next movie. It is simply telling us that the movie scores are correlated at 0.57 and 0.37 respectively.

For case two, when we compare the correlation in Question 5 to the correlation in Question 6, we see that 0.11 and 0.55 (respectively) are both positive, although 0.11 is much weaker than 0.55. This tells us that winning percentages in Games 61-80 are correlated with winning percentages in Games 81-100, and first-half winning percentages are correlated with second-half winning percentages. It is simply telling us that winning percentages are correlated at 0.11 and 0.55 respectively.

2. Compare the correlations in Questions 3, 4, 7, and 8. What do they show? Explain why you think this happened. Use fewer than 150 words to answer this question.

For question 2, the correlations in Questions 3, 4, 7, and 8 all have negative values: -0.27, -0.71, -0.62, and -0.26 respectively. The negative numbers mean that, for both the directors and the baseball teams, the variables in each pair have an inverse relationship (correlations between -1 and 0): the higher the first score, the lower the improvement tends to be.
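One way to see why a first score and the later improvement can be negatively correlated is a small simulation. The sketch below rests on an assumption that is not stated in the assignment: each observed score is a stable underlying level plus independent random noise of the same size on both occasions. All names and parameter values are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


random.seed(0)  # fixed seed so repeated runs give the same numbers

# ASSUMED model: observed score = stable level + independent noise.
n_units = 5000
level = [random.gauss(0, 1) for _ in range(n_units)]
first = [lv + random.gauss(0, 1) for lv in level]
second = [lv + random.gauss(0, 1) for lv in level]

# Improvement is defined as in the assignment: second score minus first score.
improvement = [s - f for f, s in zip(first, second)]

r_repeat = pearson(first, second)
r_improve = pearson(first, improvement)

print(f"first vs. second:      {r_repeat:.2f}")
print(f"first vs. improvement: {r_improve:.2f}")
```

Under this assumed model the first-versus-second correlation comes out positive and the first-versus-improvement correlation comes out negative, even though nothing about the underlying levels changes between the two occasions. An unusually high first score tends to include lucky noise that does not repeat, so the next score tends to be lower.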