Project1STAT243Z

School

University of Oregon *

Course

243

Subject

Statistics

Date

Feb 20, 2024

Type

pdf

Pages

4

Project 1 – STAT 243Z – Fall 2023

Instructions: This project is designed to have you interact with a larger, more realistic data set. You will apply various graphical and quantitative tools to study the data. When you have completed your work, you will need to convert your pages into a single pdf and upload that pdf to Canvas by 11:59pm on Friday, October 27.

This project requires you to study a large data set and do calculations with it. You will need to use Excel or Google Sheets to do this work. (Yes, it is possible to use a calculator, but you would need to type hundreds of numbers into it. Cutting and pasting in a spreadsheet is much simpler.)

You have two options for doing your work:
(1) You can print the pdf, write all of your answers on the paper, convert your project back into a single pdf, and upload it to Canvas by the due date, or
(2) You can edit the pdf digitally, typing in the pdf and embedding your images, and upload it to Canvas by the due date.

Note that in general you will need to spend some time navigating multiple pieces of software and learning new things about each one. If you work on a printed copy, you will need to convert the physical document into a pdf, which could take a little extra time. All of this means that you should allow enough time before the due date to make sure you can use the technology appropriately. Specifically, if you wait until 11pm to convert something to a pdf and fail, you most likely will not be able to get help that late at night.

This project will be graded based on the rubric you can see under Modules on Canvas. Please read through the project and rubric carefully, and be sure to get any questions answered well in advance of the due date.

1. Visit the website https://www.kaggle.com. Find the dataset titled “Most Streamed Spotify Songs 2023.” Download the CSV file that contains this data. I suggest you immediately convert it to an xlsx or xls file; otherwise you risk losing your work every time you close the file.

2. Choose one quantitative variable from the data set, write which variable you chose in the space below, and describe what that variable is measuring (include units if applicable). Choose from the variables in columns J, K, L, R, S or T.

I chose the variable in column R. It measures how suitable a song is for dancing, expressed as a percentage.
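
For anyone who wants to cross-check the spreadsheet work in Python, the sketch below shows one way to turn a spreadsheet column letter such as R into a position in a CSV row. The helper name `column_letter_to_index` and the three-column `sample_csv` text are hypothetical stand-ins chosen for illustration; the real Kaggle file has its own headers and many more rows.

```python
import csv
import io


def column_letter_to_index(letter: str) -> int:
    """Convert a spreadsheet column letter such as 'R' to a zero-based index."""
    index = 0
    for ch in letter.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index - 1


# Hypothetical stand-in for the downloaded CSV (assumption: the real file
# has different headers and far more columns and rows).
sample_csv = "col_a,col_b,col_c\n10,20,30\n40,50,60\n"

rows = list(csv.reader(io.StringIO(sample_csv)))
header, data = rows[0], rows[1:]

# Pick spreadsheet column C and pull its values out as numbers.
col = column_letter_to_index("C")
values = [float(row[col]) for row in data]
print(header[col], values)
```

With the real file, the same helper called with "R" gives index 17, the eighteenth column.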
3. Use technology to make a histogram for the variable you chose and draw it in the space provided. Be sure both axes are labeled appropriately. (Or you can embed the image in the pdf if you know how to do that, which I recommend.)
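
A histogram is just a count of how many values fall into each equal-width bin. The following is a minimal sketch of that counting step, assuming a bin width of 10 percentage points; the function name `bin_counts` and the short list of values are illustrative only and are not the full column from the data set.

```python
from collections import Counter


def bin_counts(values, bin_width):
    """Count how many values fall in each bin [start, start + bin_width)."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


# Illustrative values only (assumption), not the full column.
values = [52, 71, 64, 77, 71, 84, 59, 91, 49, 64]
width = 10

counts = bin_counts(values, width)
for start, count in counts.items():
    # One '#' per observation gives a quick text histogram.
    print(f"{start:>3}-{start + width - 1:<3} {'#' * count}")
```

The bin starts belong on the horizontal axis and the counts on the vertical axis, which is the same information a spreadsheet histogram chart displays.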
4. Now we’ll create a simple random sample from the data. Determine how many individuals there are in the set. Next, go to random.org, choose the Integer Set Generator, and generate 100 random integers from one to the number of individuals in the set. For each song corresponding to a number in this set, write the value of the variable you chose in the space provided. (Again, editing the pdf with typed numbers is better.)

52, 71, 64, 77, 71, 84, 59, 91, 49, 64, 42, 75, 81, 68, 60, 34, 54, 65, 70, 78, 80, 49, 65, 50, 80, 52, 80, 82, 73, 59, 60, 56, 70, 82, 68, 40, 87, 81, 95, 39, 34, 70, 49, 69, 66, 64, 60, 76, 70, 31, 56, 77, 86, 45, 75, 75, 90, 63, 91, 84, 77, 91, 86, 74, 56, 76, 81, 59, 75, 68, 81, 78, 50, 51, 91, 82, 86, 53, 37, 70, 88, 60, 80, 71, 63, 51, 56, 52, 56, 52, 70, 51, 88, 68, 77, 90, 37, 64, 70, 92, 34, 48

5. Calculate the five-number summary, the mean, and the sample standard deviation for your sample. Write your answers in the blanks provided.
Minimum: 31
Q1: 56
Median: 72
Q3: 80
Maximum: 95
Mean: 42.63
Standard Deviation: 15.76

6. Write the Excel function used for each of those calculations in the blanks provided.
Minimum: =MIN(A1:A100)
Q1: =QUARTILE(A1:A100,1)
Median: =MEDIAN(A1:A100)
Q3: =QUARTILE(A1:A100,3)
Maximum: =MAX(A1:A100)
Mean: =AVERAGE(A1:A100)
Standard Deviation: =STDEV(A1:A100)

7. It’s almost certainly untrue, but let’s assume that the population mean and standard deviation of your chosen variable, for all songs, are equal to the sample mean and standard deviation from your sample. Let’s also assume that the variable follows a Normal distribution with that mean
and standard deviation. Draw a nice picture of that Normal distribution with the minimum and maximum values of your sample indicated in the correct locations on the horizontal axis.

8. Assuming the distribution is Normal, calculate the percent of songs that have a variable value between the minimum and maximum of your data and write it below. Also write the Excel function needed to calculate this percent.
Percent: 76.9%
Excel function: =NORM.DIST(Max, Mean, StDev, TRUE) - NORM.DIST(Min, Mean, StDev, TRUE)
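
The spreadsheet calculations in steps 5, 6, and 8 can be cross-checked with Python's standard `statistics` module. This is only a sketch: the helper names `summarize` and `normal_percent_between` are made up for illustration, the five-element list is a toy input rather than the project sample, and the Normal calculation simply plugs in the minimum, maximum, mean, and standard deviation as they are reported in step 5.

```python
import statistics
from statistics import NormalDist


def summarize(values):
    """Five-number summary, mean, and sample standard deviation."""
    # method="inclusive" should interpolate the same way as Excel's QUARTILE.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),  # n - 1 denominator, like STDEV
    }


def normal_percent_between(low, high, mean, stdev):
    """Percent of a Normal(mean, stdev) population between low and high."""
    dist = NormalDist(mu=mean, sigma=stdev)
    return 100 * (dist.cdf(high) - dist.cdf(low))


# Toy input (assumption), not the project sample.
summary = summarize([1, 2, 3, 4, 5])
print(summary)

# Inputs taken as reported in step 5: min 31, max 95, mean 42.63, sd 15.76.
percent = normal_percent_between(low=31, high=95, mean=42.63, stdev=15.76)
print(f"{percent:.1f}%")
```

The subtraction of two cumulative probabilities mirrors the structure of the NORM.DIST formula in step 8: the area to the left of the maximum minus the area to the left of the minimum.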