IBM Attrition Analysis

Description:


• IBM is an American multinational corporation (MNC) operating in around 170 countries, with computing, software, and hardware as its major business verticals.
• Attrition is a major risk for service-providing organizations, where trained and experienced people are the company's main assets.
• The organization would like to identify the factors that influence employee attrition.


Dataset Description:

• Age: Age of employee
• Attrition: Employee attrition status
• Department: Department of work
• DistanceFromHome
• Education: 1-Below College; 2-College; 3-Bachelor; 4-Master; 5-Doctor;
• EducationField
• EnvironmentSatisfaction: 1-Low; 2-Medium; 3-High; 4-Very High;
• JobSatisfaction: 1-Low; 2-Medium; 3-High; 4-Very High;
• MaritalStatus
• MonthlyIncome
• NumCompaniesWorked: Number of companies the employee worked at prior to IBM
• WorkLifeBalance: 1-Bad; 2-Good; 3-Better; 4-Best;
• YearsAtCompany: Years of service at IBM to date


Analysis Task:


•Import attrition dataset and import libraries such as pandas, matplotlib.pyplot, numpy, and seaborn.
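The import step above can be sketched as follows. This is a minimal sketch, not the project's actual notebook: the helper name `load_attrition` is hypothetical, and the dataset is assumed to be a CSV file whose column names match the dataset description.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns


def load_attrition(path):
    """Read the attrition CSV (assumed format) and return a DataFrame."""
    df = pd.read_csv(path)
    # Strip stray whitespace from column names so later lookups are predictable.
    df.columns = df.columns.str.strip()
    return df
```

A call such as `df = load_attrition("attrition.csv")` would then load the data; the file name here is a placeholder, since the source does not state it.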


Exploratory data analysis:


•Find the age distribution of employees in IBM


Age Distribution of Employees
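One way to produce an age distribution plot like this is a histogram over the `Age` column. The sketch below uses a small made-up DataFrame as a stand-in for the real dataset, and the 5-year bin width is an assumption rather than the original chart's setting.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Toy stand-in for the real dataset (values are made up for illustration).
df = pd.DataFrame({"Age": [24, 29, 31, 35, 35, 38, 42, 47, 51, 58]})

fig, ax = plt.subplots(figsize=(8, 4))
# Histogram with 5-year bins whose edges run from 20 to 60.
counts, edges, _ = ax.hist(df["Age"], bins=list(range(20, 65, 5)), edgecolor="black")
ax.set_title("Age Distribution of Employees")
ax.set_xlabel("Age")
ax.set_ylabel("Number of employees")
fig.tight_layout()
```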


• Explore attrition by age
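Attrition by age can be summarised as the share of leavers in each age band. This is a sketch under assumptions: the `Attrition` column is assumed to hold "Yes"/"No" strings, the band edges are arbitrary, and the rows are made-up illustration data.

```python
import pandas as pd

# Toy stand-in for the real dataset (values are made up for illustration).
df = pd.DataFrame({
    "Age": [22, 27, 29, 33, 36, 41, 44, 52],
    "Attrition": ["Yes", "Yes", "No", "No", "Yes", "No", "No", "No"],
})

# Bucket ages into bands; right=False makes each band include its lower edge.
bands = pd.cut(
    df["Age"],
    bins=[18, 30, 40, 50, 61],
    right=False,
    labels=["18-29", "30-39", "40-49", "50-60"],
)

# Share of employees in each band whose Attrition value is "Yes".
left = df["Attrition"].eq("Yes")
attrition_rate = left.groupby(bands, observed=False).mean()
print(attrition_rate)
```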



• Explore the data for employees who left


Attrition Breakdown
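A breakdown of the employees who left could start by filtering on `Attrition` and then counting or aggregating other columns. The sketch below assumes "Yes" marks an employee who left; the rows are made up, and the choice of `Department` and `MonthlyIncome` as breakdown columns is illustrative only.

```python
import pandas as pd

# Toy stand-in for the real dataset (values are made up for illustration).
df = pd.DataFrame({
    "Attrition": ["Yes", "No", "Yes", "No", "Yes", "No"],
    "Department": ["Sales", "Sales", "Research & Development",
                   "Research & Development", "Sales", "Human Resources"],
    "MonthlyIncome": [2500, 6000, 3100, 7200, 2900, 5400],
})

# Keep only the employees who left.
left_df = df[df["Attrition"] == "Yes"]

# How many leavers each department had.
left_by_department = left_df["Department"].value_counts()

# Median income of leavers versus stayers, for comparison.
median_income_by_status = df.groupby("Attrition")["MonthlyIncome"].median()
```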


•Find out the distribution of employees by the education field


Education Distribution
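The education field distribution comes down to counting rows per `EducationField` value and plotting the counts. A minimal sketch with made-up rows; the horizontal bar chart is one reasonable choice, not necessarily the chart type used in the original.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Toy stand-in for the real dataset (values are made up for illustration).
df = pd.DataFrame({
    "EducationField": ["Life Sciences", "Medical", "Life Sciences",
                       "Marketing", "Medical", "Life Sciences"],
})

# Count employees per education field, most common first.
field_counts = df["EducationField"].value_counts()

fig, ax = plt.subplots(figsize=(8, 4))
field_counts.plot(kind="barh", ax=ax)
ax.set_title("Education Distribution")
ax.set_xlabel("Number of employees")
fig.tight_layout()
```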


•Give a bar chart for the number of married and unmarried employees


Marital Status
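For the married versus unmarried bar chart, one option is to collapse `MaritalStatus` into two groups before counting. The sketch assumes the column contains the label "Married" plus other labels such as "Single" or "Divorced"; those labels and the rows are illustrative assumptions.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Toy stand-in for the real dataset (values are made up for illustration).
df = pd.DataFrame({
    "MaritalStatus": ["Married", "Single", "Divorced",
                      "Married", "Single", "Married"],
})

# Keep "Married" as is and relabel every other status as "Unmarried".
is_married = df["MaritalStatus"].eq("Married")
status = df["MaritalStatus"].where(is_married, "Unmarried")

# Fixed order so the bars always appear as Married, then Unmarried.
status_counts = status.value_counts().reindex(["Married", "Unmarried"], fill_value=0)

fig, ax = plt.subplots()
ax.bar(list(status_counts.index), status_counts.to_numpy())
ax.set_title("Marital Status")
ax.set_ylabel("Number of employees")
fig.tight_layout()
```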


• Build a logistic regression model to predict which employees are likely to leave.
• The logistic regression model reaches 84% accuracy and an ROC score of 65%.


Confusion Matrix
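The modelling step could look roughly like the sketch below, which uses scikit-learn. Everything specific here is an assumption: the data is synthetic (generated with a fixed seed), the three feature columns are an arbitrary subset, and the ROC AUC is computed from predicted probabilities, which may differ from how the ROC score reported above was obtained. The metrics it prints will not match the 84% and 65% figures.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real dataset (not real IBM data).
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "Age": rng.integers(18, 61, size=n),
    "MonthlyIncome": rng.integers(1000, 20000, size=n),
    "JobSatisfaction": rng.integers(1, 5, size=n),
})
# Synthetic label: lower income and lower satisfaction make leaving more likely.
logit = 1.5 - 0.0002 * df["MonthlyIncome"] - 0.5 * df["JobSatisfaction"]
prob_leave = 1.0 / (1.0 + np.exp(-logit))
df["Attrition"] = np.where(rng.random(n) < prob_leave, "Yes", "No")

features = ["Age", "MonthlyIncome", "JobSatisfaction"]
X = df[features]
y = (df["Attrition"] == "Yes").astype(int)  # 1 = left, 0 = stayed

# Stratify so both classes appear in the train and test splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale the features, then fit a logistic regression classifier.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
y_prob = model.predict_proba(X_test)[:, 1]  # probability of leaving

accuracy = accuracy_score(y_test, y_pred)
roc_auc = roc_auc_score(y_test, y_prob)
cm = confusion_matrix(y_test, y_pred)  # rows: actual, columns: predicted

print(f"accuracy={accuracy:.2f}, roc_auc={roc_auc:.2f}")
print(cm)
```

Because attrition datasets tend to be imbalanced, accuracy alone can look good even when few leavers are identified, so reading it alongside the ROC score and the confusion matrix gives a fuller picture.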


