Screener exercise for Research Assistant, September 2023
The purpose of this exercise is to understand your level of comfort with, and your approach to, handling raw survey data. Please use the attached Excel spreadsheet to complete the following tasks using the sample data for the Treatment and Control groups of a hypothetical pilot. Please produce a .do file or R script which performs all modifications to the data. Please do not adjust the data manually in Excel. This will help us simulate how you will handle much larger data sets than the sample.
Task 1 – Data Cleaning
Please create commented code in Stata (preferred) or R which performs the following tasks:
Clean the Date of Birth column
Create a Participant Age variable
Create a Number of Children variable
Create dummy variables for Race (e.g., Race_White: 1 = yes, 0 = no)
Create an Annual Income variable using the last 4 months of self-reported income.
Recode the column “What is the highest level of education you have completed?” to numeric. Bonus points if you add value labels in Stata 😊
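As an illustration of the logic behind the date, age, and income tasks, here is a minimal sketch. It is written in Python purely to show the steps; your submission should still be a .do file or R script. The date formats, the reference date, and the annualization rule (mean of the reported months times 12) are assumptions for illustration, not requirements taken from the spreadsheet.

```python
from datetime import date, datetime

# Hypothetical date formats; the real spreadsheet may use others.
DOB_FORMATS = ("%m/%d/%Y", "%Y-%m-%d", "%m-%d-%Y")


def parse_dob(raw):
    """Return a date for a raw DOB string, or None if no format matches."""
    text = str(raw).strip()
    for fmt in DOB_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None


def age_in_years(dob, as_of):
    """Completed years between dob and the reference date as_of."""
    if dob is None:
        return None
    had_birthday = (as_of.month, as_of.day) >= (dob.month, dob.day)
    return as_of.year - dob.year - (0 if had_birthday else 1)


def annual_income(last_four_months):
    """Annualize monthly income: mean of the non-missing months times 12.

    This rule is an assumption; summing the four months and multiplying
    by 3 gives the same result when no month is missing.
    """
    reported = [m for m in last_four_months if m is not None]
    if not reported:
        return None
    return sum(reported) / len(reported) * 12
```

Values that fail to parse are returned as missing rather than guessed, so they can be reviewed separately.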
Task 2 – Scoring the SF-36 Scale
Recode and score the SF-36
Present descriptive statistics for the Treatment and Control groups (Column B) for each subdomain
Survey instrument (you will need to locate the same questions in the data):
https://www.rand.org/health-care/surveys_tools/mos/36-item-short-form/survey-instrument.html
Guide for recoding the SF-36:
https://www.rand.org/health-care/surveys_tools/mos/36-item-short-form/scoring.html
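The two-step scoring described in the RAND guide (recode each item to a 0 to 100 value, then average the items in each scale) can be sketched as follows. Again, this is Python only to show the logic; please submit Stata or R. The recode values and scale groupings reflect our reading of the RAND guide linked above and should be checked against it. Keying responses by item number (1 to 36) and the function names are assumptions for illustration, since the spreadsheet columns will need to be matched to the instrument first.

```python
from statistics import mean

# Step 1: recode maps (raw response -> 0-100 value), grouped by item number.
RECODE = {
    (1, 2, 20, 22, 34, 36): {1: 100, 2: 75, 3: 50, 4: 25, 5: 0},
    tuple(range(3, 13)): {1: 0, 2: 50, 3: 100},
    tuple(range(13, 20)): {1: 0, 2: 100},
    (21, 23, 26, 27, 30): {1: 100, 2: 80, 3: 60, 4: 40, 5: 20, 6: 0},
    (24, 25, 28, 29, 31): {1: 0, 2: 20, 3: 40, 4: 60, 5: 80, 6: 100},
    (32, 33, 35): {1: 0, 2: 25, 3: 50, 4: 75, 5: 100},
}
ITEM_MAP = {item: mapping for items, mapping in RECODE.items() for item in items}

# Step 2: items averaged together to form each of the eight scales.
SCALES = {
    "physical_functioning": list(range(3, 13)),
    "role_physical": [13, 14, 15, 16],
    "role_emotional": [17, 18, 19],
    "energy_fatigue": [23, 27, 29, 31],
    "emotional_wellbeing": [24, 25, 26, 28, 30],
    "social_functioning": [20, 32],
    "pain": [21, 22],
    "general_health": [1, 33, 34, 35, 36],
}


def score_sf36(responses):
    """responses: dict of item number -> raw response (None if missing)."""
    recoded = {i: ITEM_MAP[i].get(r) for i, r in responses.items() if i in ITEM_MAP}
    scores = {}
    for name, items in SCALES.items():
        # Missing or out-of-range items are left out of the average.
        answered = [recoded[i] for i in items if recoded.get(i) is not None]
        scores[name] = mean(answered) if answered else None
    return scores


def group_means(rows):
    """rows: list of (group, scores dict). Returns group -> scale -> mean."""
    out = {}
    for group in sorted({g for g, _ in rows}):
        out[group] = {}
        for name in SCALES:
            vals = [s[name] for g, s in rows if g == group and s[name] is not None]
            out[group][name] = mean(vals) if vals else None
    return out
```

Higher scores indicate a more favorable health state on every scale. Item 2 (reported health transition) is recoded but does not belong to any of the eight scales.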