Mission 1

School

Harvard University *

Course

E150

Subject

Statistics

Date

Feb 20, 2024

Uploaded by mzaid0001

STAT E-100 SP 2024
Mission 1: Mental Health in Technology

Study:
https://www.kaggle.com/datasets/osmi/mental-health-in-tech-survey
https://osmihelp.org/research.html

Study Context

1) (3 points) In under 250 words, provide three potential motivations (for example, medical, financial, legal, etc.) behind why this study could have been conducted and who could benefit from the findings.

A company whose lease on an office building is ending may use the data collected from a study such as this one to decide whether converting to a permanent remote workplace is a healthy and viable option.

Another potential reason for a company to conduct this type of study is to find out what types of clauses to add to a company handbook to avoid legal repercussions. Knowing whether mental health issues differ between remote workers and office workers can help with setting protocols specific to remote work that may not previously have been part of the company's work culture or operations. For example, a company could add mandatory biweekly video breakout sessions to share ideas and concerns and to foster inclusion.

Finally, this type of study could help determine whether where one lives has any effect on the mental health of remote versus in-office workers. For example, categorical variables such as country or state could be used to determine whether mental health issues among remote workers are more common in certain areas, in which case in-office work may be the better employment type there.

Study Design

2) (2 points) Was the study design experimental or observational? How do you know?

This study used an observational design. It was conducted using a survey, which is a common method for observational studies. In addition, the researchers did not assign or manipulate any condition; they only collected data on circumstances that were already in place.

Data Collection
Questions 3 to 5 refer to the Mental Health and Technology dataset.

3) (1 point) What type of variable is Country (categorical or quantitative)? What is the level of measurement (nominal, ordinal, interval, or ratio)? In a few sentences, justify your reasoning for the above choices.

Country is a categorical variable, and its level of measurement is nominal. Countries can be grouped into categories by name, but there is no basis for putting the names in any order.

4) (1 point) What type of variable is Age (categorical or quantitative)? What is the level of measurement (nominal, ordinal, interval, or ratio)? In a few sentences, justify your reasoning for the above choices.

In this study, Age is a quantitative variable because it takes numerical values. Age is measured on a ratio scale because it has a true zero point: a value of zero means an absence of age. Arithmetic such as addition and subtraction can also be performed meaningfully on age values.

5) (1 point) What type of variable is Work_Interference (categorical or quantitative)? What is the level of measurement (nominal, ordinal, interval, or ratio)? Ignore the value ‘NA’. In a few sentences, justify your reasoning for the above choices.

Work_Interference is a categorical variable measured at the ordinal level, because its values (Never, Rarely, Sometimes, Often) are categories that can be placed in a rank order.

Results and Discussion

*Please include the R code used to generate the output for the questions below (example on next page) and submit this assignment as a Word document or PDF file.

6) (2 points) Choose one categorical variable and generate a table that shows the count, proportion, or percentage. Write a sentence or two as though you are explaining the findings from the table to a novice audience.
tally(~Country, format = "count", data = survey_xlsx_cleaned, margins = TRUE)

Of the survey participants, 45 were from Germany.

7) (2 points) Generate a bar graph using the one categorical variable from question 6). Write a sentence or two as though you are explaining the finding(s) from the visualization to a novice audience.

Default bar graph with my categorical variable from question 6 (Country):

bargraph(~Country,
         data = survey_xlsx_cleaned,
         main = "Survey Participants by Country",
         col = c("green"),
         # horiz = TRUE,
         xlab = "Country",
         ylab = "Count of Persons")

The United States has the largest number of participants. The bar graph shows only how many respondents came from each country, so by itself it says nothing about work interference. Because the number of participants differs so much between countries, it is also difficult to infer whether any differences between countries are meaningful.

8) (2 points) Generate a pie chart using the one categorical variable from question 6). Write a sentence or two as though you are explaining the findings from the visualization to a novice audience.

piedata <- tally(~Country, format = "percent", data = survey_xlsx_cleaned)
pie(piedata)
The pie chart shows only a slight difference between the share of participants who sought mental health treatment and the share who did not.

9) (2 points) Generate a contingency table using the one categorical variable from question 6), as well as one other categorical variable. The contingency table can show counts, proportions, or percentages. Share one or two findings from the table.

tally(~treatment + work_interfere, format = "count", data = survey_xlsx_cleaned, margins = TRUE)

         work_interfere
treatment  NA Never Often Rarely Sometimes Total
    No    259   183    21     51       107   621
    Yes     4    30   120    122       357   633
    Total 263   213   141    173       464  1254

The contingency table examines the association between mental health treatment and work interference. Of the 633 respondents who had sought treatment, only 30 reported that their condition never interfered with work, while 357 reported that it sometimes did. Of the 621 respondents who had not sought treatment, 183 reported Never and 107 reported Sometimes. These counts suggest that having sought treatment is associated with a higher likelihood of reporting work interference.
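Because the two treatment groups are nearly the same size but not identical, row percentages make the comparison in question 9 easier to read than raw counts. The following is a minimal base R sketch, not part of the original submission: it re-enters the counts from the table above by hand (the object names `counts` and `row_pct` are illustrative) and uses `prop.table()` to express each row as percentages of its own total.

```r
# Counts copied by hand from the question 9 contingency table
# (the Total row and column are left out).
counts <- matrix(
  c(259, 183,  21,  51, 107,
      4,  30, 120, 122, 357),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    treatment      = c("No", "Yes"),
    work_interfere = c("NA", "Never", "Often", "Rarely", "Sometimes")
  )
)

# margin = 1 divides every cell by its row total, so each row sums to 100%.
row_pct <- round(100 * prop.table(counts, margin = 1), 1)
print(row_pct)
```

Read this way, roughly 56% of respondents who sought treatment answered Sometimes, compared with roughly 17% of those who did not.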
10) (2 points) Generate a Mosaic plot using the two categorical variables from question 9. Write a sentence or two as though you are explaining the findings from the visualization to a novice audience.

The right side of the mosaic plot represents remote workers, and the left side represents office workers. The plot shows that more in-office workers than remote workers participated in the survey. It also suggests that remote workers are more likely than in-office workers to have a history of mental health treatment.

11) (2 points) Generate a bar graph using the two categorical variables from question 9. Write a sentence or two as though you are explaining the findings from the visualization to a novice audience.

bargraph(~work_interfere,
         groups = treatment,
         auto.key = TRUE,
         stack = TRUE,
         data = survey_xlsx_cleaned,
         main = "Mental Health Utilization and \nWork Interference",
         xlab = "Work Interference",
         ylab = "Participant Count")
Respondents who have previously had mental health treatment report work interference (Rarely, Sometimes, or Often) far more frequently than respondents who have not.
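A stacked bar graph of raw counts can hide this pattern because the interference levels have very different totals. As a hedged illustration only, the base R sketch below re-enters the counts from the question 9 contingency table by hand and plots, for each interference level, the percentage of respondents with and without treatment. It uses base `barplot()` rather than the `bargraph()` function used above, and the names `counts` and `col_pct` are illustrative.

```r
# Counts copied by hand from the question 9 contingency table
# (the Total row and column are left out).
counts <- matrix(
  c(259, 183,  21,  51, 107,
      4,  30, 120, 122, 357),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    treatment      = c("No", "Yes"),
    work_interfere = c("NA", "Never", "Often", "Rarely", "Sometimes")
  )
)

# margin = 2 divides every cell by its column total, so each bar sums to 100%.
col_pct <- prop.table(counts, margin = 2)

# One stacked bar per interference level; segments are the treatment groups.
barplot(100 * col_pct,
        legend.text = TRUE,
        main = "Treatment Share within Each\nWork Interference Level",
        xlab = "Work Interference",
        ylab = "Percent of Participants")
```

With these counts, about 85% of respondents who answered Often had sought treatment, compared with about 14% of those who answered Never.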