DAT430 Project One
Project One: Part One
Tiffany Rudman Quinn
Southern New Hampshire University
DAT 430: Leverage Data for Org Results
Dr. Arash Kamari
November 20, 2023
Clarify and Define the Question
A rapidly growing organization wants to determine which negative factors are
contributing to employees leaving the organization. They have asked you to
analyze the HR Attrition data set to find these factors. Within the data set is a field for
Attrition as a Yes or No indicator. This will be the variable that we compare our other data points
against to determine whether they may influence an employee's decision to leave. For example, I could use
the following line of code to determine the effect WorkLifeBalance has on attrition. We can
determine that those who ranked their work-life balance on the low end (on the 1-4 scale used in
our dataset) were more likely to leave than those who ranked their work-life balance closer to a 3.
The code would be as follows:
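One way this comparison could be written is sketched here, assuming the data sits in a pandas DataFrame with the Attrition and WorkLifeBalance columns described above. The helper name attrition_rate_by and the small toy frame are hypothetical and only illustrate the call; the toy values are made up and are not taken from the HR Attrition data.

```python
import pandas as pd


def attrition_rate_by(df: pd.DataFrame, column: str) -> pd.Series:
    """Share of employees with Attrition == 'Yes' for each value of `column`."""
    left = df["Attrition"].eq("Yes")          # True where the employee left
    return left.groupby(df[column]).mean()    # mean of True/False = attrition rate


# Made-up rows, only to show how the helper is called (not real results).
toy = pd.DataFrame({
    "Attrition": ["Yes", "No", "Yes", "No", "No", "No"],
    "WorkLifeBalance": [1, 1, 2, 3, 3, 4],
})

rates = attrition_rate_by(toy, "WorkLifeBalance")
print(rates)
```

With the real data set, the same call would return one attrition rate per WorkLifeBalance rating, so the low and high ends of the scale can be compared side by side.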
Source and Prepare the Data
The company has provided you with the data that they would like analyzed. The data set
has 35 columns of variables and 1,470 rows. Since this is a large data set, we would want to
remove the noise, meaning variables that are not needed or that may have little to no impact on the
attrition of employees, such as the training variable. We could do this through feature selection.
As part of this step, I would also check whether there is any missing data and whether that data
should be filled in or eliminated. I would also change data types as needed to help determine if,
for example, gender is a factor in attrition. To get a count of null values by variable, I would use
df.isnull().sum(). This shows which variables have missing values. I can then use the
aforementioned code to compare each of those variables with the attrition variable. That
will allow me to determine if the missing values should be filled in or if the variable can be dropped.
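The preparation steps above could be sketched as follows, again assuming a pandas DataFrame. The helper names null_counts and prepare, the SameForEveryone column, and the toy rows are hypothetical. Dropping columns that hold a single value is used here only as a simple stand-in for feature selection, not as a full method.

```python
import pandas as pd


def null_counts(df: pd.DataFrame) -> pd.Series:
    """Count of null values per variable, keeping only variables that have any."""
    counts = df.isnull().sum()
    return counts[counts > 0]


def prepare(df: pd.DataFrame, categorical: list[str]) -> pd.DataFrame:
    """Drop single-valued columns and convert the named columns to categories."""
    out = df.copy()
    # A column with one distinct value cannot explain differences in attrition.
    constant = [c for c in out.columns if out[c].nunique(dropna=False) <= 1]
    out = out.drop(columns=constant)
    for col in categorical:
        out[col] = out[col].astype("category")
    return out


# Made-up rows, only to show the calls (not the HR Attrition data).
toy = pd.DataFrame({
    "Attrition": ["Yes", "No", "No", "Yes"],
    "Gender": ["Female", "Male", None, "Male"],
    "SameForEveryone": [1, 1, 1, 1],
})

missing = null_counts(toy)
clean = prepare(toy, categorical=["Gender"])
print(missing)
print(clean.dtypes)
```

Whether a variable with missing values is filled in or dropped remains a judgment call; the counts only show where that decision has to be made.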
Monitor Reports for Data Updates