Project_1_AlexisWilliams

docx

School

Wilmington University

Course

MISC

Subject

Statistics

Date

Feb 20, 2024

Type

docx

Pages

11

Project 1: Descriptive Statistics & Correlation
Student name: Alexis Williams

Project Goal: Using collected data, assess comprehension of data collection, create graphical displays, interpret descriptive statistics, and analyze data using correlation and regression.

Project Overview: This project is broken down into three checkpoints, which you will complete according to the course schedule. All written work should be completed within this document, and you should use it like a step-by-step checklist of questions/objectives to complete. All short answer questions should be written in full sentences, with consideration given to grammar and spelling. Before you begin working on this project, and while you are in Microsoft Word, click on File, then Save As, and change the saved file name to include your name. For example, someone named Risco Goodman would save the file as Project_1_RiscoGoodman.

Data Collection Checkpoint (Problems #1 to #4 ONLY):
Checkpoint Overview: The Data Collection Checkpoint is designed to walk you through determining a research question and data collection plan that will allow you to adequately collect the data needed to complete your projects this semester.

1. Brainstorm a topic of interest to you either personally or professionally. Write your topic below; be specific and provide details as to why this interests you. As you think of a topic, keep in mind that you will conduct a survey on your own and collect your own original data. You will not use data from the internet or elsewhere that others have collected. All your projects this semester will be on this topic. The data you collect for Project 1 will be the data that you will use for Project 2 and Project 3. (Need help brainstorming? Refer to the Overview & Instructions: Descriptive Statistics & Correlation Project on D2L under Course Projects.)
If you are having trouble deciding on a topic, here is a website that might trigger your creativity: https://www.pewresearch.org/topics-categorized/ There are many interesting topics there. Now, write your topic and why it interests you.

Project Topic: Coffee lovers
Why it interests you: This topic interests me because I am personally a huge coffee drinker. I drink coffee at least twice a day. I prefer my coffee from specific stores and brands. I wonder if anyone around me feels the same way.

Using the topic of interest that you came up with, write at least three good statistical questions about your topic you hope to answer this semester. Later in the semester, you will try to answer these questions using the data you eventually collect. Remember, good statistical questions are investigative questions which anticipate variability ©APN 2021-2025 Page 1 of 11
and need data in order to answer, and do not have exact answers.
Example: What percentage of US adults use Facebook?
Refer to Preview 1D and In-Class Activity 1D for more details about good statistical questions.

Now, write at least three good statistical questions about your topic:
a. First question: What percentage of corporate workers rely on coffee?
b. Second question: Are people spending over 40 dollars a week on coffee?
c. Third question: Do young adults drink more coffee than older adults?
d. (Optional) Additional questions may be written here.

2. Write at least two quantitative survey questions pertaining to your topic of interest that could help you learn more about your statistical questions. (These are questions that you will ask when you conduct your survey to help you learn more about your statistical question.) A quantitative question is a question whose answer is a number, not a category. Survey questions usually contain the word “you” or “your” and are about the person you are asking. Keep it simple. Do not ask a survey question that requires the person to do research to find an answer for you.
Example: If your statistical question is “What percentage of U.S. adults use Facebook?” then your survey question could be “How many times a week do you use Facebook?” or you might ask “How many hours a day do you spend on Facebook?”
Remember: In a survey, you will NOT ask “What percentage of U.S. adults use Facebook?” That is a good statistical question but not a suitable survey question.

Now, write your own quantitative survey questions:
a. First question: How often do you consume at least one cup of coffee a week?
b. Second question: When did you start drinking coffee?
c. (Optional) Additional questions may be written here. How much do you spend a week on coffee?

3.
Write at least two categorical survey questions pertaining to your topic of interest that could help you learn more about your statistical questions. One of the categorical questions must be a question with a YES or NO answer. (These categorical survey questions are again questions that you will ask when you conduct your survey to help you learn more about your statistical question.)
Important Note: In Project 2, you will compare the categorical data from your survey to a previously published percentage about your topic. At least one of the categorical questions you ask below should be the same as (or similar to) a question asked in a previously published survey where the published percentage was obtained. (Hint: You will need to include the link to this previously published survey in Project 2.)
Example: If your statistical question is “What percentage of U.S. adults use Facebook?” then for a categorical survey question, you could ask “Do you use Facebook (Yes or No)?” or you might ask “How many hours a day do you spend on Facebook (less than 2 hours, between 2 and 5 hours, or more than 5 hours)?”
Please do not provide too many options to a question, since this is a small class project where the sample size will not be in the thousands. Use no more than five options; in fact, two or three options will suffice.

Now, write your own categorical survey questions:
a. First question (Yes or No question): Do you feel like you need coffee every day to function? (Yes or No)
b. Second question: How old are you? (in your 20s, 30s, or 40s or higher)
c. (Optional) Additional questions may be written here.

4. Using any type of sampling technique, design a method for collecting sample data using the survey questions you wrote in Question 2 & Question 3 above. You should aim to collect a sample which is as representative as possible of the population; you will have a chance to discuss the potential for bias in your sampling method later in this project.
a.
Identify the population you are studying and explain (in detail) how you would acquire sample data from this population. Be specific.
The population I will be studying is the people at my corporate job. I will be doing an anonymous survey by sharing a link with everyone via email for them to answer the questions. When a person clicks on my link, they will be able to answer the questions. Once the person saves their survey, it will go to an Excel sheet where I will keep my data.
b. Explain why your data collection method would be an observational study or an experiment.
This will be an observational study because I am collecting information from people at my job who want to participate.

Once your instructor has confirmed your data collection plan with you, proceed to conduct your survey. You will need to survey at least 35 people but preferably not more than 100. If your survey yields more than 35 responses, you will use all responses. The data you collect must be provided to your instructor via the Survey & Spreadsheet Submission Assignment in D2L, so be sure to keep this data organized in a Google Sheet. It is strongly recommended that you use a Google Form to collect your data, since it will automatically populate the survey responses into a Google Sheet. Be sure that your Google Sheet security settings allow view-only access for Delaware Technical Community College, so your instructor can access the file. Include the link here.
The link to your Google Form is _____________

Visualization and Descriptive Statistics Checkpoint (Problems #5 to #9):
Checkpoint Overview: The Visualization and Descriptive Statistics Checkpoint is designed to walk you through the process of discussing and interpreting descriptive statistics, as well as creating graphical displays using technology and describing what we learn from graphical displays. The remainder of this project will utilize the two quantitative questions you collected survey data on. So that we can get started with our analysis, let us begin by answering a few general questions about our data.

5.
For each of your quantitative survey questions (see question 2), DO THREE THINGS: (i) write the name of the variable, (ii) classify the variable as quantitative discrete or quantitative continuous, and (iii) justify the classification by giving reasons. Answers are to be written in full sentences.
a. First variable:
b. Second variable:
c. (Optional) Additional variables may be written here.

6. Using the appropriate terminology, explain at least one way in which this data might be biased.
For the next group of questions, choose two quantitative variables from question 5. We will be investigating the distributions using summary statistics and graphical displays. It is recommended that you consider using the Describing and Exploring Quantitative Variables app at https://dcmpdatatools.utdanacenter.org/eda_quantitative/ .

7. We will begin by analyzing your first quantitative variable. Use technology to answer the following questions.
a. Write the name of your first quantitative question and the corresponding first quantitative variable.
First Quantitative Question = ____________
First Quantitative Variable = ____________
b. Use technology (the app Describing and Exploring Quantitative Variables at https://dcmpdatatools.utdanacenter.org/eda_quantitative/ ) to calculate the mean, median, and standard deviation of the dataset. Include a screen clip (screenshot) of your results below. Remember to include a full-page screenshot showing everything on the page, including how you used the app as well as the descriptive statistics.
NOTE: Examples of good and bad screenshots are shown at the end of this document.
SCREENSHOT HERE:
c. Write a sentence that identifies the values of the mean and median, including correct units.
d. Compare the value of the mean to the median for the dataset. What does this suggest to you about the shape of the distribution? Use only the mean/median to answer this question.
e. Write a sentence that identifies the value of the standard deviation, including correct units.
f. Use technology ( https://dcmpdatatools.utdanacenter.org/eda_quantitative/ ) to construct a boxplot. Remember to include a full-page screenshot of your results whenever you use technology. Be sure to modify the name of the variable to match your dataset.
g. Use technology ( https://dcmpdatatools.utdanacenter.org/eda_quantitative/ ) to construct a histogram. Remember to include a full-page screenshot of your results whenever you use technology.
Be sure to modify the name of the variable to match your dataset.
h. Using the graphs only, describe the shape of the distribution and identify any outliers. In your explanation, make note if this is different from the predictions you made about the shape of the distribution in part d.
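As an optional cross-check on the app output for parts b through e, the same summary statistics can be computed in a few lines of Python. This is only a minimal sketch: the data values are hypothetical, the helper names `summarize` and `shape_hint` are illustrative, the 5% tolerance is an arbitrary threshold, and it is an assumption that the app reports the sample standard deviation.

```python
import statistics

# Hypothetical answers (in dollars) to a survey question such as
# "How much do you spend a week on coffee?". These are not real data.
weekly_spend = [5, 8, 10, 12, 12, 15, 15, 18, 20, 22, 25, 30, 60]


def summarize(values):
    """Return the mean, median, and sample standard deviation."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # Sample standard deviation (divides by n - 1); assumed to match the app.
        "stdev": statistics.stdev(values),
    }


def shape_hint(mean, median, tolerance=0.05):
    """Suggest a shape using only the mean/median comparison.

    The 5% tolerance is an arbitrary illustrative threshold, not a course rule.
    """
    if abs(mean - median) <= tolerance * abs(median):
        return "roughly symmetric"
    return "possibly right-skewed" if mean > median else "possibly left-skewed"


summary = summarize(weekly_spend)
hint = shape_hint(summary["mean"], summary["median"])
print(f"mean = {summary['mean']:.2f} dollars, median = {summary['median']} dollars")
print(f"standard deviation = {summary['stdev']:.2f} dollars")
print(f"shape suggested by mean vs. median: {hint}")
```

A mean noticeably larger than the median hints at a right-skewed distribution, and a mean noticeably smaller hints at a left-skewed one. The boxplot and histogram in parts f and g remain the place to confirm or revise that prediction.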
8. We will now analyze your second quantitative variable. Use technology to answer the following questions.
a. Write the name of your second quantitative question and the corresponding second quantitative variable.
Second Quantitative Question = ____________
Second Quantitative Variable = ____________
b. Use technology (the app Describing and Exploring Quantitative Variables at https://dcmpdatatools.utdanacenter.org/eda_quantitative/ ) to calculate the mean, median, and standard deviation of the dataset. Include a screen clip (screenshot) of your results below. Remember to include a full-page screenshot showing everything on the page, including how you used the app as well as the descriptive statistics.
NOTE: Examples of good and bad screenshots are shown at the end of this document.
SCREENSHOT HERE:
c. Write a sentence that identifies the values of the mean and median, including correct units.
d. Compare the value of the mean to the median for the dataset. What does this suggest to you about the shape of the distribution? Use only the mean/median to answer this question.
e. Write a sentence that identifies the value of the standard deviation, including correct units.
f. Use technology ( https://dcmpdatatools.utdanacenter.org/eda_quantitative/ ) to construct a boxplot. Remember to include a full-page screenshot of your results whenever you use technology. Be sure to modify the name of the variable to match your dataset.
g. Use technology ( https://dcmpdatatools.utdanacenter.org/eda_quantitative/ ) to construct a histogram. Remember to include a full-page screenshot of your results whenever you use technology. Be sure to modify the name of the variable to match your dataset.
h. Using the graphs only, describe the shape of the distribution and identify any outliers. In your explanation, make note if this is different from the predictions you made about the shape of the distribution in part d.

Analysis & Reflection:
9. Write 1-2 paragraphs summarizing what you have learned so far about the data you have been exploring in questions 7-8. Explain any contradictions and/or alignments between the summary statistics and the graphs as you discuss the shape, center, spread, and outliers for each distribution. Add any additional observations or questions that you have as you move through this project.

Correlation and Regression Checkpoint (Problems #10 to #16):
Checkpoint Overview: The Correlation and Regression Checkpoint is designed to walk you through the process of determining if a correlation exists between two quantitative variables, as well as creating graphical displays using technology. The remainder of this project will utilize the two quantitative questions you collected survey data on.

10. Write a research question by filling in the blanks with your quantitative variables (variables from questions 7 & 8), to state the research question as follows.
Is there a linear relationship between the <first variable> and the <second variable>?
HINT: All you have to do here is copy this research question and replace the highlighted “first variable” and “second variable” with your own variables. That is it!

11. Which variable is the explanatory variable and which variable is the response variable? Explain your answer in 1-2 sentences.

For the following questions, use the Linear Regression app at https://dcmpdatatools.utdanacenter.org/linear_regression/ .

12. Use technology to create a graphical display appropriate for the relationship between two quantitative variables. Remember to include a full-page screenshot of your results whenever you use technology. Be sure to modify the name of the variables to match your datasets.
SCREENSHOT HERE:
13. Use a linear regression model to answer the following questions in the context of your variables.
a. Provide a full-page screenshot of your results from https://dcmpdatatools.utdanacenter.org/linear_regression/ that must include the Linear Regression Equation and Model Summary from the technology output. In the Plot Options, remember to select “Regression Line.”
b. Write the linear regression equation.
c. Write and interpret the slope.
d. Write and interpret the y-intercept.
e. Write and interpret the correlation coefficient (R).
f. Write and interpret the coefficient of determination (R²).

14. Is a linear regression model appropriate? Answer YES or NO, and explain. Use the information from the previous question to explain why or why not.

Analysis & Reflection:
15. Write a paragraph summarizing what you have learned so far about the data you have been exploring in questions 10-14. Explain any contradictions and/or alignments between the individual data and the correlation analysis. Add any additional observations or questions that you have as you move through this project.

16. Personal Reflection: At the conclusion of your project, write a detailed paragraph with your personal thoughts on your project and the process. You may wish to include anything which surprised you or things you found to be challenging along the way.
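The quantities requested in question 13 (slope, y-intercept, R, and R²) can also be cross-checked by hand with the least-squares formulas. The sketch below is one possible way to do that, not the app's own implementation. The paired values are hypothetical, and the variable names `cups_per_week` and `weekly_spend` and the helper `linear_regression` are illustrative assumptions.

```python
import math


def linear_regression(x, y):
    """Least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r, r_squared).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and cross-products.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain at least two distinct values")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r, r ** 2


# Hypothetical paired survey responses; these are not real data.
cups_per_week = [2, 4, 6, 8, 10]    # explanatory variable (x)
weekly_spend = [5, 9, 14, 16, 21]   # response variable (y), in dollars

slope, intercept, r, r_squared = linear_regression(cups_per_week, weekly_spend)
print(f"predicted spend = {intercept:.2f} + {slope:.2f} * (cups per week)")
print(f"R = {r:.3f}, R^2 = {r_squared:.3f}")
```

In context, the slope is the predicted change in the response variable for each one-unit increase in the explanatory variable, the y-intercept is the predicted response when the explanatory variable equals zero, and R² is the proportion of the variation in the response that the linear model accounts for.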
Samples of good screenshots and bad screenshots are shown below:

Samples: Good & Bad Screenshots

Sample 1: This sample is a good screenshot because it is a full-page screenshot showing the output and how the app was used to get it. It is large and legible.

Sample 2: This sample below is not a good screenshot because it is NOT a full-page screenshot, since it does NOT show how the app was used to produce the displayed info. It looks polished enough for a professional presentation but is not the best for evaluation purposes in a project like this.
Sample 3 : This sample below is a bad screenshot because it is TINY. Sample 4 : This sample below is a bad screenshot because it shows more than the screen – it shows the keyboard and stuff outside the screen. It is also tiny. Also, two images should not be on the same line.
Sample 5: The sample below is as bad as Sample 2. It does not show details of how the app was used, and it is not a full-page screenshot.

Descriptive Statistics: Insufficient Sleep

      n   Mean  Std. Dev.  Min.  Q1    Median  Q3    Max.  IQR
2017  51  33.9  3.61       26.1  31.4  33.9    36.6  42.8  5.20
2019  51  35.0  3.35       28.7  32.6  34.5    37.5  42.9  4.88

Sample 6: The sample below is as bad as Sample 4.
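When identifying outliers from a boxplot, a common convention flags any value more than 1.5 × IQR below Q1 or above Q3. The sketch below applies that convention to the 2017 row of the summary table in Sample 5. Whether the course app draws its boxplots with exactly this rule is an assumption, and the helper name `outlier_fences` is illustrative.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return the (lower, upper) fences using the k * IQR convention."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Values copied from the 2017 row of the sample summary table.
minimum, q1, q3, maximum = 26.1, 31.4, 36.6, 42.8

lower, upper = outlier_fences(q1, q3)
# If even the minimum and maximum sit inside the fences, no value is flagged.
has_low_outlier = minimum < lower
has_high_outlier = maximum > upper

print(f"fences: {lower:.1f} to {upper:.1f}")
print(f"low outliers: {has_low_outlier}, high outliers: {has_high_outlier}")
```

Checking only the minimum and maximum is enough here because every other observation lies between them.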
Best wishes with the project!