Investigation 2: Songs from Spotify – Genre and Mode (revisited)
From our first Data Analysis Assignment, a sample of 3439 songs was selected from the Spotify
database and twelve variables were measured for each song. In Data Analysis 1 Investigation 1,
we were interested in analyzing the Genre and Mode of each song. Remember, the Mode
variable indicates the type of scale the song’s melodic content is derived from (Major or Minor).
Let us now investigate whether the proportion of Rock songs differs between Major songs and
Minor songs. From the random sample of songs, there were 1964 Major songs,
and of those 365 were Rock songs. In addition, there were 1475 Minor songs, and of those 147
were Rock songs.
a) Calculate and label the two sample proportions separately and round each value to four
decimal places. Next, calculate the difference between these sample proportions of
Major and Minor songs by subtracting (Major – Minor). Type these calculations, label
each of them, and present each of these values in your solutions document.
b) What is the parameter of interest? Use words and symbol(s) in context in your answer.
Define any subscripts that you use.
c) Create a bootstrap distribution by following these instructions. In StatKey under the
middle pane labeled ‘Bootstrap Confidence Intervals’, click CI for Difference in
Proportions. Click ‘Edit Data’, then enter the count and sample size for each group.
Make Group 1 Major and Group 2 Minor. Next, click ‘Generate 1000 Samples.’ Take a
screenshot of your bootstrap distribution including the mean and standard error and paste
it in your solutions document.
d) Describe the shape of the bootstrap distribution in a complete sentence.
e) Construct a 95% bootstrap confidence interval using the original sample statistic and the
± 2SE method. Show all work and present your answer as (lower value, upper value).
f) Interpret the meaning of the confidence interval you obtained in part (e) in context.
g) Does your 95% confidence interval capture 0? Based on your answer, what can we infer
about whether a difference exists between the Modes? Answer these questions in the
context of the problem in one or two sentences.
h) Using your bootstrap distribution from part (c), construct a 90% confidence interval using
the percentile method. Go to the top left corner of the distribution and click ‘Two-Tail’
and then enter the percentile values needed based on the confidence level. Present a
screenshot of your bootstrap distribution (with all five blue boxes visible) and write your
answer as (lower value, upper value).
i) Does your 90% confidence interval capture 0? Based on your answer, what can we infer
about whether a difference exists between the Modes? Answer these questions in the
context of the problem in one or two sentences.
j) If the analyst were testing the hypothesis that there is a difference between the
proportion of Major songs that are Rock and the proportion of Minor songs that are
Rock, state the null and alternative hypotheses using correct notation. Consider Major
as Population 1 and Minor as Population 2.
k) Create a randomization distribution by following these instructions. In StatKey, go to the
right pane labeled ‘Randomization Hypothesis Tests’ and click Test for Difference in
Proportions. Edit the data in ‘Edit Data’ by entering in the count and sample size for each
group and click ‘Generate 1000 Samples.’ Screenshot your distribution and paste it in
your solutions document.
l) Why is your randomization distribution centered at zero? Answer in one sentence.
m) Calculate the p-value from your randomization distribution using your observed statistic
calculated in part 2(a). First, click the ‘Right Tail’ button and enter the value of your
observed statistic in the blue box below the x-axis. Next, click the ‘Left Tail’ button and
enter the negative value of your observed statistic in the blue box below the x-axis (to the
left of zero). Then, if necessary, readjust your bottom blue box to the right of zero to
correctly display the value of the observed statistic. Finally, add the values of the two
blue boxes above their corresponding red x’s to obtain the p-value.
n) Is this p-value significant at the 10% significance level? Is it significant at the 5%
significance level? Compare the answers to these questions to your answers to parts (g)
and (i) in two complete sentences.
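
The arithmetic in part (a) can be checked with a few lines of Python. This is only a sketch: the counts come from the problem statement, while the variable names are illustrative choices.

```python
# Counts taken directly from the problem statement.
rock_major, n_major = 365, 1964   # Rock songs among Major songs
rock_minor, n_minor = 147, 1475   # Rock songs among Minor songs

# Sample proportion of Rock songs within each Mode.
p_hat_major = rock_major / n_major
p_hat_minor = rock_minor / n_minor

# Observed statistic: difference in sample proportions (Major - Minor).
diff = p_hat_major - p_hat_minor

print(f"p-hat (Major) = {p_hat_major:.4f}")
print(f"p-hat (Minor) = {p_hat_minor:.4f}")
print(f"difference    = {diff:.4f}")
```

Note that the difference here is computed from the unrounded proportions. Subtracting the two values after rounding each to four decimal places can differ in the last digit.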
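
Parts (c), (e), and (h) are meant to be done in StatKey, but the same idea can be sketched in Python using only the standard library. The code below is an approximation under stated assumptions, not StatKey's own implementation: each group is resampled with replacement, the seed is arbitrary, and `statistics.quantiles` stands in for StatKey's percentile boxes, so the numbers will not match a StatKey screenshot exactly.

```python
import random
import statistics

rng = random.Random(0)  # arbitrary seed, used only so the run is repeatable

# Rebuild each group's sample as 1 (Rock) / 0 (not Rock) from the stated counts.
major = [1] * 365 + [0] * (1964 - 365)
minor = [1] * 147 + [0] * (1475 - 147)
observed_diff = sum(major) / len(major) - sum(minor) / len(minor)


def resampled_proportion(group):
    """Proportion of Rock songs in one resample drawn with replacement."""
    return sum(rng.choices(group, k=len(group))) / len(group)


# Part (c): 1000 bootstrap differences in proportions (Major - Minor).
boot_diffs = [resampled_proportion(major) - resampled_proportion(minor)
              for _ in range(1000)]
boot_mean = statistics.mean(boot_diffs)
boot_se = statistics.stdev(boot_diffs)  # standard error = SD of bootstrap statistics

# Part (e): 95% interval as (sample statistic) +/- 2 * SE.
ci_95 = (observed_diff - 2 * boot_se, observed_diff + 2 * boot_se)

# Part (h): 90% percentile interval, cutting 5% from each tail.
cuts = statistics.quantiles(boot_diffs, n=20)  # cut points at 5%, 10%, ..., 95%
ci_90 = (cuts[0], cuts[-1])

print(f"bootstrap mean = {boot_mean:.4f}, SE = {boot_se:.4f}")
print(f"95% CI (stat +/- 2SE): ({ci_95[0]:.4f}, {ci_95[1]:.4f})")
print(f"90% CI (percentile):   ({ci_90[0]:.4f}, {ci_90[1]:.4f})")
```

Because bootstrap samples are random, the mean, SE, and interval endpoints change slightly from run to run, which is also why two students' StatKey screenshots will rarely agree to every decimal place.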
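
The randomization test in parts (k) and (m) can be sketched the same way. The version below assumes a reallocation scheme: all 3439 songs are pooled and the Major and Minor labels are shuffled, which forces the null hypothesis of no difference to hold. Whether StatKey uses exactly this scheme is an assumption here, and the seed and variable names are illustrative.

```python
import random
import statistics

rng = random.Random(0)  # arbitrary seed, used only so the run is repeatable

n_major, n_minor = 1964, 1475
rock_major, rock_minor = 365, 147
observed_diff = rock_major / n_major - rock_minor / n_minor

# Pool every song as 1 (Rock) / 0 (not Rock), ignoring Mode.
total_rock = rock_major + rock_minor
pooled = [1] * total_rock + [0] * (n_major + n_minor - total_rock)

# Part (k): shuffle the pool, deal the first n_major songs to "Major" and
# the rest to "Minor", and record the difference in proportions each time.
rand_diffs = []
for _ in range(1000):
    rng.shuffle(pooled)
    p_major = sum(pooled[:n_major]) / n_major
    p_minor = sum(pooled[n_major:]) / n_minor
    rand_diffs.append(p_major - p_minor)

# Part (m): two-tailed p-value = right-tail area at the observed statistic
# plus left-tail area at its negative.
right_tail = sum(d >= observed_diff for d in rand_diffs) / len(rand_diffs)
left_tail = sum(d <= -observed_diff for d in rand_diffs) / len(rand_diffs)
p_value = right_tail + left_tail

print(f"center of randomization distribution = {statistics.mean(rand_diffs):.4f}")
print(f"two-tailed p-value = {p_value:.3f}")
```

Shuffling the labels makes Mode unrelated to Genre by construction, so the simulated differences scatter around zero, which is the point of part (l).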