DAT430 Project One

Project One: Part One
Tiffany Rudman Quinn
Southern New Hampshire University
DAT 430: Leverage Data for Org Results
Dr. Arash Kamari
November 20, 2023

Clarify and Define the Question

A rapidly growing organization wants to determine which negative factors are contributing to employees leaving the organization. They have asked you to analyze the HR Attrition data set to find these factors. The data set contains an Attrition field, a Yes or No indicator. This is the variable we will compare the other data points against to determine whether they may influence an employee's decision to leave. For example, I could use a line of code to determine the effect WorkLifeBalance has on attrition. We can determine that those who ranked their work-life balance on the low end (on a scale of 1-4 according to our data set) were more likely to leave than those who ranked their work-life balance closer to a 3.

Source and Prepare the Data

The company has provided the data that they would like analyzed. The data set has 35 columns of variables and 1,470 rows. Since this is a large data set, we would want to remove noise and any variables that are not needed or that may have little to no impact on employee attrition, such as the training variables. We could do this through feature selection. Reviewing each variable during feature selection will also help to determine whether any data is missing and whether that data should be filled in or eliminated. I would also change data types as needed to help determine whether, for example, gender is a factor in attrition.

To get a count of null values by variable, I would use df.isnull().sum(). Doing this shows whether a variable is complete enough to be assessed as a factor. I can then use the aforementioned code to see how the variable relates to the Attrition variable. That will allow me to determine whether the missing values should be filled in or whether the variable can be dropped.

Monitor Reports for Data Updates
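
Looking back at the steps described under Clarify and Define the Question and Source and Prepare the Data, a minimal sketch of how they might look in pandas is given here. It is an illustration only, not the code from the original paper. The column names Attrition, WorkLifeBalance, and Gender come from the text, while the sample rows, the helper column AttritionFlag, and the variable names are assumptions made up for this example, so the printed values say nothing about the real data set.

```python
import pandas as pd

# Hypothetical sample rows standing in for the HR Attrition data set.
# The real data set has 1,470 rows and 35 columns; these values are invented.
df = pd.DataFrame({
    "Attrition": ["Yes", "No", "No", "Yes", "No", "No"],
    "WorkLifeBalance": [1, 3, 3, 1, 2, 4],
    "Gender": ["Female", "Male", "Male", "Female", "Male", None],
})

# Encode the Yes/No indicator as 1/0 so that a group mean is an attrition rate.
# AttritionFlag is a helper column assumed for this sketch.
df["AttritionFlag"] = (df["Attrition"] == "Yes").astype(int)

# Attrition rate for each WorkLifeBalance rating (scale of 1-4).
attrition_by_wlb = df.groupby("WorkLifeBalance")["AttritionFlag"].mean()

# Count of null values by variable, as mentioned in the text.
null_counts = df.isnull().sum()

# Change a text variable to a categorical type, then compare it with attrition.
df["Gender"] = df["Gender"].astype("category")
attrition_by_gender = df.groupby("Gender", observed=True)["AttritionFlag"].mean()

print(attrition_by_wlb)
print(null_counts)
print(attrition_by_gender)
```

With the real file loaded into df in place of the sample rows, the same three summaries would show the attrition rate at each work-life balance rating, which variables contain nulls, and how a recoded variable such as Gender compares with attrition.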