Kaelyn_Murphy_5-1_Assignment_Data_Timeline

docx

School

Southern New Hampshire University *

Course

223

Subject

Industrial Engineering

Date

Apr 3, 2024

Type

docx

Pages

2

Kaelyn Murphy DAT 223 November 26, 2023 Module 5-1 Data Timeline Assignment

Characteristics of the data: The CDC collected the data through phone survey responses from Alabama residents via BRFSS (the largest health survey system in the world). Heart Matters then obtained the data from the CDC. The survey collects information about health-related risk behaviors, chronic health conditions, and use of preventive services. The information collected covers who lives in the respondent's home, the respondent's height and weight, health insurance status, capacity to cover out-of-pocket expenses, and the date of the last checkup. In addition, personal information about physical health was collected, such as general health, any existing health conditions, any medications being taken, and whether the respondent is a smoker.

If the data is insufficient or does not come from a large enough base, we may come to incorrect conclusions about Alabama residents' access to health care and the conditions they have. If we do not have responses from a variety of respondents with different backgrounds, the information may be skewed. For example, if only smokers responded to the survey, we might wrongly conclude that smoking is the only cause of certain conditions. If the data provided isn't sufficient or our respondent pool isn't large enough, we should follow up with additional surveys.

Data Provenance: The data has been handled by at least five different teams since the survey was conducted in 2017, which has resulted in data being lost. The timeline is as follows:

2017 – Data was retrieved from the BRFSS database and stored in CSV format on a secured drive in a securely locked room. During this time, an initial analysis of the data was done. During the same year, the IT system administrators physically transferred the data to an SQL database via a flash drive.
2018 – Heart Matters' new data analyst reviewed the master data set and discovered that the CSV file contained characters that the SQL database could not read during migration. Because the database could not read some rows, those rows were not imported properly and their data was lost.

Handling by multiple parties has resulted in data being lost or not transferred properly. The quality and validity of the data have been degraded by the number of hands it has passed through and the different systems it has been housed in.

Data Management: The data is currently housed in two formats: a CSV file and an SQL database. The data did not transfer properly, which resulted in data loss. The original file should be retrieved from the CDC or directly from BRFSS. With the original file, we could make the corrections needed to ensure it transfers properly to the SQL database. The data should be organized, stored, and managed in the SQL database, arranged in columns and rows so that it transfers properly between the file and the database.

We may run into legal or ethical issues if HIPAA is not followed when reviewing the information in the original dataset. We need to ensure that respondents' privacy is protected and that any disclosure agreements are in place as needed.
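
The migration problem described in the timeline (CSV rows containing characters the SQL database could not read) could be caught with a simple load-and-audit step. The following is only a minimal sketch under stated assumptions, not the process Heart Matters actually used: it assumes the file should be UTF-8, that each record sits on one line, and it uses SQLite as a stand-in for the SQL database. The function name `load_csv_with_audit`, the table name `survey`, and the sample column names are hypothetical and are not real BRFSS variable names.

```python
import csv
import sqlite3


def load_csv_with_audit(raw_bytes, conn, table="survey"):
    """Load CSV bytes into SQLite and set aside rows that cannot be read.

    Assumes one record per line and UTF-8 as the expected encoding.
    Returns (number of rows loaded, list of (line number, raw bytes) rejected).
    """
    lines = raw_bytes.splitlines()
    columns = next(csv.reader([lines[0].decode("utf-8")]))

    # Store every column as TEXT; typing can be refined after a clean load.
    col_defs = ", ".join(f'"{c}" TEXT' for c in columns)
    conn.execute(f'CREATE TABLE "{table}" ({col_defs})')
    placeholders = ", ".join("?" for _ in columns)

    loaded = 0
    rejected = []
    for line_no, raw in enumerate(lines[1:], start=2):
        if not raw.strip():
            continue  # skip blank lines
        try:
            text = raw.decode("utf-8")
        except UnicodeDecodeError:
            # Unreadable characters: keep the row for review instead of losing it.
            rejected.append((line_no, raw))
            continue
        row = next(csv.reader([text]))
        if len(row) != len(columns):
            # Wrong number of fields: also keep for review.
            rejected.append((line_no, raw))
            continue
        conn.execute(f'INSERT INTO "{table}" VALUES ({placeholders})', row)
        loaded += 1

    conn.commit()
    return loaded, rejected


# Hypothetical sample: the second data row contains a byte that is not valid UTF-8.
sample = b"id,smoker,last_checkup\n1,yes,2016\n2,n\xff,2015\n3,no,2017\n"
conn = sqlite3.connect(":memory:")
loaded, rejected = load_csv_with_audit(sample, conn)
db_count = conn.execute('SELECT COUNT(*) FROM "survey"').fetchone()[0]

print("rows loaded:", loaded)
print("rows in database:", db_count)
print("rows set aside for review:", [line_no for line_no, _ in rejected])
```

Comparing the number of rows loaded plus the number set aside against the number of rows in the source file shows whether anything was silently dropped, and the rejected rows can be corrected against the original file before being imported again.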