This R Markdown document is part of DS 6306, “Doing Data Science,” in SMU’s Master’s in Data Science program. Students are given a data set and asked to make predictions using the data science methods and techniques learned in the course. For this case study, we assume that we have been hired by DDSAnalytics, a company that specializes in talent management. The company wants to gain a competitive edge by providing its customers with accurate predictions of attrition (employee turnover) and monthly salary.
We will start by importing the following data for analysis:
CaseStudy2-Data.csv:
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
## -- Attaching packages --------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.3.3 v purrr 0.3.4
## v tibble 3.1.0 v stringr 1.4.0
## v tidyr 1.1.3 v forcats 0.5.1
## v readr 1.4.0
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
## Registered S3 method overwritten by 'GGally':
## method from
## +.gg ggplot2
## corrplot 0.84 loaded
## Loading required package: grid
## Loading required package: rpart
##
## Attaching package: 'BBmisc'
## The following object is masked from 'package:grid':
##
## explode
## The following objects are masked from 'package:dplyr':
##
## coalesce, collapse
## The following object is masked from 'package:base':
##
## isFALSE
Load Theme for plots
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # Data Preparation #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## ID Age Attrition BusinessTravel DailyRate Department
## 1 1 32 No Travel_Rarely 117 Sales
## 2 2 40 No Travel_Rarely 1308 Research & Development
## 3 3 35 No Travel_Frequently 200 Research & Development
## 4 4 32 No Travel_Rarely 801 Sales
## 5 5 24 No Travel_Frequently 567 Research & Development
## 6 6 27 No Travel_Frequently 294 Research & Development
## DistanceFromHome Education EducationField EmployeeCount EmployeeNumber
## 1 13 4 Life Sciences 1 859
## 2 14 3 Medical 1 1128
## 3 18 2 Life Sciences 1 1412
## 4 1 4 Marketing 1 2016
## 5 2 1 Technical Degree 1 1646
## 6 10 2 Life Sciences 1 733
## EnvironmentSatisfaction Gender HourlyRate JobInvolvement JobLevel
## 1 2 Male 73 3 2
## 2 3 Male 44 2 5
## 3 3 Male 60 3 3
## 4 3 Female 48 3 3
## 5 1 Female 32 3 1
## 6 4 Male 32 3 3
## JobRole JobSatisfaction MaritalStatus MonthlyIncome
## 1 Sales Executive 4 Divorced 4403
## 2 Research Director 3 Single 19626
## 3 Manufacturing Director 4 Single 9362
## 4 Sales Executive 4 Married 10422
## 5 Research Scientist 4 Single 3760
## 6 Manufacturing Director 1 Divorced 8793
## MonthlyRate NumCompaniesWorked Over18 OverTime PercentSalaryHike
## 1 9250 2 Y No 11
## 2 17544 1 Y No 14
## 3 19944 2 Y No 11
## 4 24032 1 Y No 19
## 5 17218 1 Y Yes 13
## 6 4809 1 Y No 21
## PerformanceRating RelationshipSatisfaction StandardHours StockOptionLevel
## 1 3 3 80 1
## 2 3 1 80 0
## 3 3 3 80 0
## 4 3 3 80 2
## 5 3 3 80 0
## 6 4 3 80 2
## TotalWorkingYears TrainingTimesLastYear WorkLifeBalance YearsAtCompany
## 1 8 3 2 5
## 2 21 2 4 20
## 3 10 2 3 2
## 4 14 3 3 14
## 5 6 2 3 6
## 6 9 4 2 9
## YearsInCurrentRole YearsSinceLastPromotion YearsWithCurrManager
## 1 2 0 3
## 2 7 4 9
## 3 2 2 2
## 4 10 5 7
## 5 3 1 3
## 6 7 1 7
## The following object is masked from package:vcd:
##
## JobSatisfaction
## 'data.frame': 870 obs. of 36 variables:
## $ ID : int 1 2 3 4 5 6 7 8 9 10 ...
## $ Age : int 32 40 35 32 24 27 41 37 34 34 ...
## $ Attrition : chr "No" "No" "No" "No" ...
## $ BusinessTravel : chr "Travel_Rarely" "Travel_Rarely" "Travel_Frequently" "Travel_Rarely" ...
## $ DailyRate : int 117 1308 200 801 567 294 1283 309 1333 653 ...
## $ Department : chr "Sales" "Research & Development" "Research & Development" "Sales" ...
## $ DistanceFromHome : int 13 14 18 1 2 10 5 10 10 10 ...
## $ Education : int 4 3 2 4 1 2 5 4 4 4 ...
## $ EducationField : chr "Life Sciences" "Medical" "Life Sciences" "Marketing" ...
## $ EmployeeCount : int 1 1 1 1 1 1 1 1 1 1 ...
## $ EmployeeNumber : int 859 1128 1412 2016 1646 733 1448 1105 1055 1597 ...
## $ EnvironmentSatisfaction : int 2 3 3 3 1 4 2 4 3 4 ...
## $ Gender : chr "Male" "Male" "Male" "Female" ...
## $ HourlyRate : int 73 44 60 48 32 32 90 88 87 92 ...
## $ JobInvolvement : int 3 2 3 3 3 3 4 2 3 2 ...
## $ JobLevel : int 2 5 3 3 1 3 1 2 1 2 ...
## $ JobRole : chr "Sales Executive" "Research Director" "Manufacturing Director" "Sales Executive" ...
## $ JobSatisfaction : int 4 3 4 4 4 1 3 4 3 3 ...
## $ MaritalStatus : chr "Divorced" "Single" "Single" "Married" ...
## $ MonthlyIncome : int 4403 19626 9362 10422 3760 8793 2127 6694 2220 5063 ...
## $ MonthlyRate : int 9250 17544 19944 24032 17218 4809 5561 24223 18410 15332 ...
## $ NumCompaniesWorked : int 2 1 2 1 1 1 2 2 1 1 ...
## $ Over18 : chr "Y" "Y" "Y" "Y" ...
## $ OverTime : chr "No" "No" "No" "No" ...
## $ PercentSalaryHike : int 11 14 11 19 13 21 12 14 19 14 ...
## $ PerformanceRating : int 3 3 3 3 3 4 3 3 3 3 ...
## $ RelationshipSatisfaction: int 3 1 3 3 3 3 1 3 4 2 ...
## $ StandardHours : int 80 80 80 80 80 80 80 80 80 80 ...
## $ StockOptionLevel : int 1 0 0 2 0 2 0 3 1 1 ...
## $ TotalWorkingYears : int 8 21 10 14 6 9 7 8 1 8 ...
## $ TrainingTimesLastYear : int 3 2 2 3 2 4 5 5 2 3 ...
## $ WorkLifeBalance : int 2 4 3 3 3 2 2 3 3 2 ...
## $ YearsAtCompany : int 5 20 2 14 6 9 4 1 1 8 ...
## $ YearsInCurrentRole : int 2 7 2 10 3 7 2 0 1 2 ...
## $ YearsSinceLastPromotion : int 0 4 2 5 1 1 0 0 0 7 ...
## $ YearsWithCurrManager : int 3 9 2 7 3 7 3 0 0 7 ...
## [1] 870 36
## integer(0)
## ID Age Attrition
## 0 0 0
## BusinessTravel DailyRate Department
## 0 0 0
## DistanceFromHome Education EducationField
## 0 0 0
## EmployeeCount EmployeeNumber EnvironmentSatisfaction
## 0 0 0
## Gender HourlyRate JobInvolvement
## 0 0 0
## JobLevel JobRole JobSatisfaction
## 0 0 0
## MaritalStatus MonthlyIncome MonthlyRate
## 0 0 0
## NumCompaniesWorked Over18 OverTime
## 0 0 0
## PercentSalaryHike PerformanceRating RelationshipSatisfaction
## 0 0 0
## StandardHours StockOptionLevel TotalWorkingYears
## 0 0 0
## TrainingTimesLastYear WorkLifeBalance YearsAtCompany
## 0 0 0
## YearsInCurrentRole YearsSinceLastPromotion YearsWithCurrManager
## 0 0 0
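The three results above (`870 36`, `integer(0)`, and a zero under every column name) are what base R returns for a dimension check, a duplicate-row check, and a per-column count of missing values. The following is a minimal sketch on a toy data frame built from a few columns of the first six rows printed earlier. The object name `toy` is hypothetical, and the exact calls used in the original analysis are not shown in this document.

```r
# Toy data frame using a few columns from the first six rows printed above.
toy <- data.frame(
  ID = 1:6,
  Age = c(32, 40, 35, 32, 24, 27),
  Attrition = rep("No", 6),
  MonthlyIncome = c(4403, 19626, 9362, 10422, 3760, 8793)
)

dims <- dim(toy)                    # number of rows, number of columns
dup_rows <- which(duplicated(toy))  # indices of duplicated rows; integer(0) if none
na_counts <- colSums(is.na(toy))    # number of missing values in each column

print(dims)
print(dup_rows)
print(na_counts)
```

An empty `dup_rows` and all-zero `na_counts` mean no rows need to be dropped or imputed before modeling.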
## ID Age Attrition BusinessTravel
## Min. : 1.0 Min. :18.00 Length:870 Length:870
## 1st Qu.:218.2 1st Qu.:30.00 Class :character Class :character
## Median :435.5 Median :35.00 Mode :character Mode :character
## Mean :435.5 Mean :36.83
## 3rd Qu.:652.8 3rd Qu.:43.00
## Max. :870.0 Max. :60.00
## DailyRate Department DistanceFromHome Education
## Min. : 103.0 Length:870 Min. : 1.000 Min. :1.000
## 1st Qu.: 472.5 Class :character 1st Qu.: 2.000 1st Qu.:2.000
## Median : 817.5 Mode :character Median : 7.000 Median :3.000
## Mean : 815.2 Mean : 9.339 Mean :2.901
## 3rd Qu.:1165.8 3rd Qu.:14.000 3rd Qu.:4.000
## Max. :1499.0 Max. :29.000 Max. :5.000
## EducationField EmployeeCount EmployeeNumber EnvironmentSatisfaction
## Length:870 Min. :1 Min. : 1.0 Min. :1.000
## Class :character 1st Qu.:1 1st Qu.: 477.2 1st Qu.:2.000
## Mode :character Median :1 Median :1039.0 Median :3.000
## Mean :1 Mean :1029.8 Mean :2.701
## 3rd Qu.:1 3rd Qu.:1561.5 3rd Qu.:4.000
## Max. :1 Max. :2064.0 Max. :4.000
## Gender HourlyRate JobInvolvement JobLevel
## Length:870 Min. : 30.00 Min. :1.000 Min. :1.000
## Class :character 1st Qu.: 48.00 1st Qu.:2.000 1st Qu.:1.000
## Mode :character Median : 66.00 Median :3.000 Median :2.000
## Mean : 65.61 Mean :2.723 Mean :2.039
## 3rd Qu.: 83.00 3rd Qu.:3.000 3rd Qu.:3.000
## Max. :100.00 Max. :4.000 Max. :5.000
## JobRole JobSatisfaction MaritalStatus MonthlyIncome
## Length:870 Min. :1.000 Length:870 Min. : 1081
## Class :character 1st Qu.:2.000 Class :character 1st Qu.: 2840
## Mode :character Median :3.000 Mode :character Median : 4946
## Mean :2.709 Mean : 6390
## 3rd Qu.:4.000 3rd Qu.: 8182
## Max. :4.000 Max. :19999
## MonthlyRate NumCompaniesWorked Over18 OverTime
## Min. : 2094 Min. :0.000 Length:870 Length:870
## 1st Qu.: 8092 1st Qu.:1.000 Class :character Class :character
## Median :14074 Median :2.000 Mode :character Mode :character
## Mean :14326 Mean :2.728
## 3rd Qu.:20456 3rd Qu.:4.000
## Max. :26997 Max. :9.000
## PercentSalaryHike PerformanceRating RelationshipSatisfaction StandardHours
## Min. :11.0 Min. :3.000 Min. :1.000 Min. :80
## 1st Qu.:12.0 1st Qu.:3.000 1st Qu.:2.000 1st Qu.:80
## Median :14.0 Median :3.000 Median :3.000 Median :80
## Mean :15.2 Mean :3.152 Mean :2.707 Mean :80
## 3rd Qu.:18.0 3rd Qu.:3.000 3rd Qu.:4.000 3rd Qu.:80
## Max. :25.0 Max. :4.000 Max. :4.000 Max. :80
## StockOptionLevel TotalWorkingYears TrainingTimesLastYear WorkLifeBalance
## Min. :0.0000 Min. : 0.00 Min. :0.000 Min. :1.000
## 1st Qu.:0.0000 1st Qu.: 6.00 1st Qu.:2.000 1st Qu.:2.000
## Median :1.0000 Median :10.00 Median :3.000 Median :3.000
## Mean :0.7839 Mean :11.05 Mean :2.832 Mean :2.782
## 3rd Qu.:1.0000 3rd Qu.:15.00 3rd Qu.:3.000 3rd Qu.:3.000
## Max. :3.0000 Max. :40.00 Max. :6.000 Max. :4.000
## YearsAtCompany YearsInCurrentRole YearsSinceLastPromotion
## Min. : 0.000 Min. : 0.000 Min. : 0.000
## 1st Qu.: 3.000 1st Qu.: 2.000 1st Qu.: 0.000
## Median : 5.000 Median : 3.000 Median : 1.000
## Mean : 6.962 Mean : 4.205 Mean : 2.169
## 3rd Qu.:10.000 3rd Qu.: 7.000 3rd Qu.: 3.000
## Max. :40.000 Max. :18.000 Max. :15.000
## YearsWithCurrManager
## Min. : 0.00
## 1st Qu.: 2.00
## Median : 3.00
## Mean : 4.14
## 3rd Qu.: 7.00
## Max. :17.00
Property | Value |
---|---|
Name | df |
Number of rows | 870 |
Number of columns | 36 |
Column type frequency: character | 9 |
Column type frequency: numeric | 27 |
Group variables | None |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
---|---|---|---|---|---|---|---|
Attrition | 0 | 1 | 2 | 3 | 0 | 2 | 0 |
BusinessTravel | 0 | 1 | 10 | 17 | 0 | 3 | 0 |
Department | 0 | 1 | 5 | 22 | 0 | 3 | 0 |
EducationField | 0 | 1 | 5 | 16 | 0 | 6 | 0 |
Gender | 0 | 1 | 4 | 6 | 0 | 2 | 0 |
JobRole | 0 | 1 | 7 | 25 | 0 | 9 | 0 |
MaritalStatus | 0 | 1 | 6 | 8 | 0 | 3 | 0 |
Over18 | 0 | 1 | 1 | 1 | 0 | 1 | 0 |
OverTime | 0 | 1 | 2 | 3 | 0 | 2 | 0 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
---|---|---|---|---|---|---|---|---|---|---|
ID | 0 | 1 | 435.50 | 251.29 | 1 | 218.25 | 435.5 | 652.75 | 870 | ▇▇▇▇▇ |
Age | 0 | 1 | 36.83 | 8.93 | 18 | 30.00 | 35.0 | 43.00 | 60 | ▂▇▇▃▂ |
DailyRate | 0 | 1 | 815.23 | 401.12 | 103 | 472.50 | 817.5 | 1165.75 | 1499 | ▇▇▇▇▇ |
DistanceFromHome | 0 | 1 | 9.34 | 8.14 | 1 | 2.00 | 7.0 | 14.00 | 29 | ▇▅▂▂▂ |
Education | 0 | 1 | 2.90 | 1.02 | 1 | 2.00 | 3.0 | 4.00 | 5 | ▂▅▇▆▁ |
EmployeeCount | 0 | 1 | 1.00 | 0.00 | 1 | 1.00 | 1.0 | 1.00 | 1 | ▁▁▇▁▁ |
EmployeeNumber | 0 | 1 | 1029.83 | 604.79 | 1 | 477.25 | 1039.0 | 1561.50 | 2064 | ▇▇▇▇▇ |
EnvironmentSatisfaction | 0 | 1 | 2.70 | 1.10 | 1 | 2.00 | 3.0 | 4.00 | 4 | ▅▆▁▇▇ |
HourlyRate | 0 | 1 | 65.61 | 20.13 | 30 | 48.00 | 66.0 | 83.00 | 100 | ▇▇▆▇▇ |
JobInvolvement | 0 | 1 | 2.72 | 0.70 | 1 | 2.00 | 3.0 | 3.00 | 4 | ▁▃▁▇▁ |
JobLevel | 0 | 1 | 2.04 | 1.09 | 1 | 1.00 | 2.0 | 3.00 | 5 | ▇▇▃▂▁ |
JobSatisfaction | 0 | 1 | 2.71 | 1.11 | 1 | 2.00 | 3.0 | 4.00 | 4 | ▅▅▁▇▇ |
MonthlyIncome | 0 | 1 | 6390.26 | 4597.70 | 1081 | 2839.50 | 4945.5 | 8182.00 | 19999 | ▇▅▂▁▁ |
MonthlyRate | 0 | 1 | 14325.62 | 7108.38 | 2094 | 8092.00 | 14074.5 | 20456.25 | 26997 | ▇▇▇▇▇ |
NumCompaniesWorked | 0 | 1 | 2.73 | 2.52 | 0 | 1.00 | 2.0 | 4.00 | 9 | ▇▃▂▂▁ |
PercentSalaryHike | 0 | 1 | 15.20 | 3.68 | 11 | 12.00 | 14.0 | 18.00 | 25 | ▇▅▃▂▁ |
PerformanceRating | 0 | 1 | 3.15 | 0.36 | 3 | 3.00 | 3.0 | 3.00 | 4 | ▇▁▁▁▂ |
RelationshipSatisfaction | 0 | 1 | 2.71 | 1.10 | 1 | 2.00 | 3.0 | 4.00 | 4 | ▅▅▁▇▇ |
StandardHours | 0 | 1 | 80.00 | 0.00 | 80 | 80.00 | 80.0 | 80.00 | 80 | ▁▁▇▁▁ |
StockOptionLevel | 0 | 1 | 0.78 | 0.86 | 0 | 0.00 | 1.0 | 1.00 | 3 | ▇▇▁▂▁ |
TotalWorkingYears | 0 | 1 | 11.05 | 7.51 | 0 | 6.00 | 10.0 | 15.00 | 40 | ▇▇▂▁▁ |
TrainingTimesLastYear | 0 | 1 | 2.83 | 1.27 | 0 | 2.00 | 3.0 | 3.00 | 6 | ▂▇▇▂▃ |
WorkLifeBalance | 0 | 1 | 2.78 | 0.71 | 1 | 2.00 | 3.0 | 3.00 | 4 | ▁▃▁▇▂ |
YearsAtCompany | 0 | 1 | 6.96 | 6.02 | 0 | 3.00 | 5.0 | 10.00 | 40 | ▇▃▁▁▁ |
YearsInCurrentRole | 0 | 1 | 4.20 | 3.64 | 0 | 2.00 | 3.0 | 7.00 | 18 | ▇▃▂▁▁ |
YearsSinceLastPromotion | 0 | 1 | 2.17 | 3.19 | 0 | 0.00 | 1.0 | 3.00 | 15 | ▇▁▁▁▁ |
YearsWithCurrManager | 0 | 1 | 4.14 | 3.57 | 0 | 2.00 | 3.0 | 7.00 | 17 | ▇▂▅▁▁ |
## 'data.frame': 870 obs. of 38 variables:
## $ ID : int 1 2 3 4 5 6 7 8 9 10 ...
## $ Age : int 32 40 35 32 24 27 41 37 34 34 ...
## $ Attrition : Factor w/ 2 levels "No","Yes": 1 1 1 1 1 1 1 1 1 1 ...
## $ BusinessTravel : Factor w/ 3 levels "Non-Travel","Travel_Frequently",..: 3 3 2 3 2 2 3 3 3 2 ...
## $ DailyRate : int 117 1308 200 801 567 294 1283 309 1333 653 ...
## $ Department : Factor w/ 3 levels "Human Resources",..: 3 2 2 3 2 2 2 3 3 2 ...
## $ DistanceFromHome : int 13 14 18 1 2 10 5 10 10 10 ...
## $ Education : int 4 3 2 4 1 2 5 4 4 4 ...
## $ EducationField : Factor w/ 6 levels "Human Resources",..: 2 4 2 3 6 2 4 2 2 6 ...
## $ EnvironmentSatisfaction : int 2 3 3 3 1 4 2 4 3 4 ...
## $ Gender : Factor w/ 2 levels "Female","Male": 2 2 2 1 1 2 2 1 1 2 ...
## $ HourlyRate : int 73 44 60 48 32 32 90 88 87 92 ...
## $ JobInvolvement : int 3 2 3 3 3 3 4 2 3 2 ...
## $ JobLevel : int 2 5 3 3 1 3 1 2 1 2 ...
## $ JobRole : Factor w/ 9 levels "Healthcare Representative",..: 8 6 5 8 7 5 7 8 9 1 ...
## $ JobSatisfaction : int 4 3 4 4 4 1 3 4 3 3 ...
## $ MaritalStatus : Factor w/ 3 levels "Divorced","Married",..: 1 3 3 2 3 1 2 1 2 2 ...
## $ MonthlyIncome : int 4403 19626 9362 10422 3760 8793 2127 6694 2220 5063 ...
## $ MonthlyRate : int 9250 17544 19944 24032 17218 4809 5561 24223 18410 15332 ...
## $ NumCompaniesWorked : int 2 1 2 1 1 1 2 2 1 1 ...
## $ OverTime : Factor w/ 2 levels "No","Yes": 1 1 1 1 2 1 2 2 2 1 ...
## $ PercentSalaryHike : int 11 14 11 19 13 21 12 14 19 14 ...
## $ PerformanceRating : int 3 3 3 3 3 4 3 3 3 3 ...
## $ RelationshipSatisfaction: int 3 1 3 3 3 3 1 3 4 2 ...
## $ StockOptionLevel : int 1 0 0 2 0 2 0 3 1 1 ...
## $ TotalWorkingYears : int 8 21 10 14 6 9 7 8 1 8 ...
## $ TrainingTimesLastYear : int 3 2 2 3 2 4 5 5 2 3 ...
## $ WorkLifeBalance : int 2 4 3 3 3 2 2 3 3 2 ...
## $ YearsAtCompany : int 5 20 2 14 6 9 4 1 1 8 ...
## $ YearsInCurrentRole : int 2 7 2 10 3 7 2 0 1 2 ...
## $ YearsSinceLastPromotion : int 0 4 2 5 1 1 0 0 0 7 ...
## $ YearsWithCurrManager : int 3 9 2 7 3 7 3 0 0 7 ...
## $ iJobRole : int 8 6 5 8 7 5 7 8 9 1 ...
## $ iDepartment : int 3 2 2 3 2 2 2 3 3 2 ...
## $ iMaritalStatus : int 1 3 3 2 3 1 2 1 2 2 ...
## $ iBusinessTravel : int 3 3 2 3 2 2 3 3 3 2 ...
## $ iEducation : int 4 3 2 4 1 2 5 4 4 4 ...
## $ iAttrition : int 1 1 1 1 1 1 1 1 1 1 ...
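Compared with the first `str()` listing, the one above drops `EmployeeCount`, `EmployeeNumber`, `Over18` and `StandardHours`, turns the character columns into factors, and adds integer-coded copies prefixed with `i`. The sketch below shows one way to get the same kind of result on a toy data frame. It is an illustration under assumptions, not the original code: the rule "drop columns with a single value" is our guess at the motivation and would not by itself remove an identifier such as `EmployeeNumber`.

```r
toy <- data.frame(
  Attrition = c("No", "No", "Yes"),
  MaritalStatus = c("Divorced", "Single", "Married"),
  Over18 = c("Y", "Y", "Y"),
  StandardHours = c(80L, 80L, 80L),
  stringsAsFactors = FALSE
)

# Drop columns that take a single value and therefore carry no information.
constant <- vapply(toy, function(col) length(unique(col)) == 1, logical(1))
toy <- toy[, !constant, drop = FALSE]

# Convert the remaining character columns to factors (levels sort alphabetically).
is_chr <- vapply(toy, is.character, logical(1))
toy[is_chr] <- lapply(toy[is_chr], factor)

# Integer codes: the position of each value among the sorted factor levels.
toy$iMaritalStatus <- as.integer(toy$MaritalStatus)
toy$iAttrition <- as.integer(toy$Attrition)

str(toy)
```

With alphabetical levels, `Divorced`, `Married` and `Single` map to 1, 2 and 3, which agrees with the `iMaritalStatus` values listed above.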
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # Exploration of the Data #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## 0% 25% 50% 75% 100%
## 18 30 35 43 60
## No Yes
## 37.41233 33.78571
## 0% 25% 50% 75% 100%
## 0 1 2 4 9
## No Yes
## 2.660274 3.078571
## 0% 25% 50% 75% 100%
## 11 12 14 18 25
## No Yes
## 15.17534 15.32857
## 0% 25% 50% 75% 100%
## 0 6 10 15 40
## No Yes
## 11.602740 8.185714
## 0% 25% 50% 75% 100%
## 1081.0 2839.5 4945.5 8182.0 19999.0
## No Yes
## 6702.000 4764.786
## 0% 25% 50% 75% 100%
## 0 2 3 7 18
## No Yes
## 4.453425 2.907143
## 0% 25% 50% 75% 100%
## 0 2 3 7 17
## No Yes
## 4.369863 2.942857
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## 0% 25% 50% 75% 100%
## 1081.0 2839.5 4945.5 8182.0 19999.0
## 0% 25% 50% 75% 100%
## 0 3 5 10 40
## No Yes
## 7.301370 5.192857
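Each pair of results in this section is a five-number quantile summary followed by the mean of the same variable for employees who stayed (`No`) and who left (`Yes`). The variable names are not printed, but matching the quantiles against the summary tables earlier suggests the first pair is `Age` (18, 30, 35, 43, 60), followed by `NumCompaniesWorked`, `PercentSalaryHike`, `TotalWorkingYears`, `MonthlyIncome`, `YearsInCurrentRole`, `YearsWithCurrManager` and `YearsAtCompany`. A minimal sketch of how such a pair can be produced in base R, using the first ten ages from the data and made-up attrition labels:

```r
toy <- data.frame(
  Age = c(32, 40, 35, 32, 24, 27, 41, 37, 34, 34),
  # Made-up labels for illustration; the first ten real rows are all "No".
  Attrition = c("No", "No", "No", "Yes", "Yes", "No", "No", "No", "Yes", "No")
)

age_quantiles <- quantile(toy$Age)                         # 0%, 25%, 50%, 75%, 100%
mean_by_attrition <- tapply(toy$Age, toy$Attrition, mean)  # one mean per group

print(age_quantiles)
print(mean_by_attrition)
```

In the real data, the group means are lower for leavers on age, tenure and income (for example 33.8 versus 37.4 years of age), which motivates using those variables as predictors.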
## integer(0)
## 'data.frame': 870 obs. of 43 variables:
## $ ID : int 1 2 3 4 5 6 7 8 9 10 ...
## $ Age : int 32 40 35 32 24 27 41 37 34 34 ...
## $ Attrition : Factor w/ 2 levels "No","Yes": 1 1 1 1 1 1 1 1 1 1 ...
## $ BusinessTravel : Factor w/ 3 levels "Non-Travel","Travel_Frequently",..: 3 3 2 3 2 2 3 3 3 2 ...
## $ DailyRate : int 117 1308 200 801 567 294 1283 309 1333 653 ...
## $ Department : Factor w/ 3 levels "Human Resources",..: 3 2 2 3 2 2 2 3 3 2 ...
## $ DistanceFromHome : int 13 14 18 1 2 10 5 10 10 10 ...
## $ Education : int 4 3 2 4 1 2 5 4 4 4 ...
## $ EducationField : Factor w/ 6 levels "Human Resources",..: 2 4 2 3 6 2 4 2 2 6 ...
## $ EnvironmentSatisfaction : int 2 3 3 3 1 4 2 4 3 4 ...
## $ Gender : Factor w/ 2 levels "Female","Male": 2 2 2 1 1 2 2 1 1 2 ...
## $ HourlyRate : int 73 44 60 48 32 32 90 88 87 92 ...
## $ JobInvolvement : int 3 2 3 3 3 3 4 2 3 2 ...
## $ JobLevel : int 2 5 3 3 1 3 1 2 1 2 ...
## $ JobRole : Factor w/ 9 levels "Healthcare Representative",..: 8 6 5 8 7 5 7 8 9 1 ...
## $ JobSatisfaction : int 4 3 4 4 4 1 3 4 3 3 ...
## $ MaritalStatus : Factor w/ 3 levels "Divorced","Married",..: 1 3 3 2 3 1 2 1 2 2 ...
## $ MonthlyIncome : int 4403 19626 9362 10422 3760 8793 2127 6694 2220 5063 ...
## $ MonthlyRate : int 9250 17544 19944 24032 17218 4809 5561 24223 18410 15332 ...
## $ NumCompaniesWorked : int 2 1 2 1 1 1 2 2 1 1 ...
## $ OverTime : Factor w/ 2 levels "No","Yes": 1 1 1 1 2 1 2 2 2 1 ...
## $ PercentSalaryHike : int 11 14 11 19 13 21 12 14 19 14 ...
## $ PerformanceRating : int 3 3 3 3 3 4 3 3 3 3 ...
## $ RelationshipSatisfaction : int 3 1 3 3 3 3 1 3 4 2 ...
## $ StockOptionLevel : int 1 0 0 2 0 2 0 3 1 1 ...
## $ TotalWorkingYears : int 8 21 10 14 6 9 7 8 1 8 ...
## $ TrainingTimesLastYear : int 3 2 2 3 2 4 5 5 2 3 ...
## $ WorkLifeBalance : int 2 4 3 3 3 2 2 3 3 2 ...
## $ YearsAtCompany : int 5 20 2 14 6 9 4 1 1 8 ...
## $ YearsInCurrentRole : int 2 7 2 10 3 7 2 0 1 2 ...
## $ YearsSinceLastPromotion : int 0 4 2 5 1 1 0 0 0 7 ...
## $ YearsWithCurrManager : int 3 9 2 7 3 7 3 0 0 7 ...
## $ iJobRole : int 8 6 5 8 7 5 7 8 9 1 ...
## $ iDepartment : int 3 2 2 3 2 2 2 3 3 2 ...
## $ iMaritalStatus : int 1 3 3 2 3 1 2 1 2 2 ...
## $ iBusinessTravel : int 3 3 2 3 2 2 3 3 3 2 ...
## $ iEducation : int 4 3 2 4 1 2 5 4 4 4 ...
## $ iAttrition : int 1 1 1 1 1 1 1 1 1 1 ...
## $ Age.Group : Factor w/ 4 levels "Senior","Undergrad",..: 4 3 3 4 4 4 3 3 4 4 ...
## $ MonthlyIncome.Group : Factor w/ 4 levels "Above.Avg","Avg",..: 2 3 3 3 2 3 4 1 4 1 ...
## $ YearsWithCurrManager.Group: Factor w/ 4 levels "2thru4","4thru6",..: 1 3 4 3 1 3 1 4 4 3 ...
## $ YearsInCurrentRole.Group : Factor w/ 4 levels "5&above","Lessthan2",..: 2 1 2 1 3 1 2 2 2 2 ...
## $ YearsAtCompany.Group : Factor w/ 4 levels "10&above","3thru5",..: 2 1 4 1 3 3 2 4 4 3 ...
## [1] 43
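The five `.Group` columns added above are binned versions of numeric variables. The sketch below reproduces `YearsAtCompany.Group` for the first ten rows with `cut()`. Only the four labels come from the listing; the cut points are an assumption chosen to be consistent with the label names and with the ten codes shown, and the original code may have built the groups differently.

```r
years_at_company <- c(5, 20, 2, 14, 6, 9, 4, 1, 1, 8)  # first ten rows of the data

# Assumed intervals: (-Inf, 2], (2, 5], (5, 9], (9, Inf).
grp <- cut(
  years_at_company,
  breaks = c(-Inf, 2, 5, 9, Inf),
  labels = c("LessThan3", "3thru5", "5thru10", "10&above")
)

# Re-level in the order shown by str() so the integer codes line up with the listing.
grp <- factor(as.character(grp),
              levels = c("10&above", "3thru5", "5thru10", "LessThan3"))

print(table(grp))
print(as.integer(grp))
```

The integer codes come out as 2 1 4 1 3 3 2 4 4 3, the same sequence shown for `YearsAtCompany.Group` above.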
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # Prepare Data for Modeling: Train/Test Split #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## [1] 870 17
## [1] 609 17
## [1] 261 17
## 'data.frame': 609 obs. of 17 variables:
## $ Attrition : Factor w/ 2 levels "No","Yes": 1 1 1 1 1 1 1 1 1 1 ...
## $ Age.Group : Factor w/ 4 levels "Senior","Undergrad",..: 3 3 4 4 4 3 1 3 4 4 ...
## $ DistanceFromHome : int 9 11 8 24 5 2 29 1 10 3 ...
## $ MonthlyIncome.Group : Factor w/ 4 levels "Above.Avg","Avg",..: 3 4 4 2 2 3 2 1 2 1 ...
## $ TotalWorkingYears : int 24 5 8 6 7 20 9 10 5 6 ...
## $ OverTime : Factor w/ 2 levels "No","Yes": 1 2 1 1 2 1 1 1 2 1 ...
## $ YearsAtCompany : int 1 2 3 4 6 19 6 9 5 2 ...
## $ StockOptionLevel : int 0 1 0 0 2 0 0 0 1 0 ...
## $ JobRole : Factor w/ 9 levels "Healthcare Representative",..: 4 3 2 3 7 1 7 1 7 8 ...
## $ JobLevel : int 5 1 1 1 1 3 1 2 1 2 ...
## $ JobInvolvement : int 2 3 4 3 4 3 3 4 3 1 ...
## $ Education : int 2 4 2 3 2 4 3 4 4 2 ...
## $ EnvironmentSatisfaction: int 4 4 4 4 1 3 3 2 4 4 ...
## $ WorkLifeBalance : int 3 3 3 3 2 3 2 2 4 3 ...
## $ YearsInCurrentRole : int 0 2 2 3 2 6 5 7 3 2 ...
## $ YearsAtCompany.Group : Factor w/ 4 levels "10&above","3thru5",..: 4 4 4 2 3 1 3 3 2 4 ...
## $ YearsWithCurrManager : int 1 2 2 2 5 8 3 8 0 2 ...
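The dimensions above show 17 selected columns and a 70/30 split of the 870 rows into 609 training and 261 test observations. A minimal sketch of such a split with base R follows. The seed, the placeholder data frame and the use of `sample()` are assumptions; the original analysis may have used a different function or a stratified split.

```r
set.seed(123)  # arbitrary seed, only so the sketch is reproducible
n <- 870       # number of rows in the full data set

# Placeholder data frame standing in for the 17 selected columns.
toy <- data.frame(ID = seq_len(n), x = rnorm(n))

train_idx <- sample(seq_len(n), size = round(0.7 * n))  # 70% of the row indices
training <- toy[train_idx, ]
testing <- toy[-train_idx, ]

print(dim(training))
print(dim(testing))
```

Because the rows are drawn without replacement and the test set is the complement, no employee appears in both sets.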
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # Find Important Variables #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Type 'citation("pROC")' for a citation.
##
## Attaching package: 'pROC'
## The following objects are masked from 'package:stats':
##
## cov, smooth, var
## Loading required package: lattice
##
## Attaching package: 'caret'
## The following object is masked from 'package:purrr':
##
## lift
## No Yes
## Age.Group 0.5103284 0.5103284
## DistanceFromHome 0.5786201 0.5786201
## MonthlyIncome.Group 0.6164635 0.6164635
## TotalWorkingYears 0.6683950 0.6683950
## OverTime 0.6629622 0.6629622
## YearsAtCompany 0.6513220 0.6513220
## No Yes
## Age.Group 0.5103284 0.5103284
## DistanceFromHome 0.5786201 0.5786201
## MonthlyIncome.Group 0.6164635 0.6164635
## TotalWorkingYears 0.6683950 0.6683950
## OverTime 0.6629622 0.6629622
## YearsAtCompany 0.6513220 0.6513220
## StockOptionLevel 0.6624561 0.6624561
## JobRole 0.5880397 0.5880397
## JobLevel 0.6551126 0.6551126
## JobInvolvement 0.6287647 0.6287647
## Education 0.5766267 0.5766267
## EnvironmentSatisfaction 0.5530366 0.5530366
## WorkLifeBalance 0.5292605 0.5292605
## YearsInCurrentRole 0.6472423 0.6472423
## YearsAtCompany.Group 0.6029952 0.6029952
## YearsWithCurrManager 0.6325759 0.6325759
## [1] 0.6683950 0.6629622 0.6624561 0.6551126 0.6513220 0.6472423 0.6325759
## [8] 0.6287647 0.6164635 0.6029952 0.5880397 0.5786201 0.5766267 0.5530366
## [15] 0.5292605 0.5103284
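The importance scores above lie between 0.5 and 1 and are identical in the `No` and `Yes` columns, which is consistent with a per-predictor ROC AUC filter for a two-class outcome. That reading is an inference from the output, not something the document states. Under that assumption, the score for one numeric predictor can be sketched with the rank-sum identity; the function name `auc_one` and the toy vectors are hypothetical.

```r
# AUC of a single numeric predictor via the rank-sum (Mann-Whitney) identity.
auc_one <- function(x, y, positive) {
  r <- rank(x)  # average ranks, so ties are handled
  n_pos <- sum(y == positive)
  n_neg <- sum(y != positive)
  auc <- (sum(r[y == positive]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
  max(auc, 1 - auc)  # ignore direction, so the score is never below 0.5
}

y <- c("No", "No", "No", "No", "Yes", "Yes")
separating <- c(10, 12, 9, 15, 2, 1)  # leavers have the lowest values
uninformative <- c(1, 2, 1, 2, 1, 2)  # same mix of values in both groups

print(auc_one(separating, y, "Yes"))
print(auc_one(uninformative, y, "Yes"))
```

A score near 0.5, such as `Age.Group` at 0.51, means the variable alone barely separates leavers from stayers, while `TotalWorkingYears` and `OverTime` at about 0.66 to 0.67 separate them best.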
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # Begin Modeling #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # 1. Support Vector Model #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Confusion Matrix and Statistics
##
## Reference
## Prediction No Yes
## No 215 45
## Yes 0 1
##
## Accuracy : 0.8276
## 95% CI : (0.7762, 0.8714)
## No Information Rate : 0.8238
## P-Value [Acc > NIR] : 0.4746
##
## Kappa : 0.0353
##
## Mcnemar's Test P-Value : 5.412e-11
##
## Sensitivity : 1.00000
## Specificity : 0.02174
## Pos Pred Value : 0.82692
## Neg Pred Value : 1.00000
## Prevalence : 0.82375
## Detection Rate : 0.82375
## Detection Prevalence : 0.99617
## Balanced Accuracy : 0.51087
##
## 'Positive' Class : No
##
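Because the positive class is `No`, sensitivity here measures how well the model identifies employees who stay, and specificity measures how well it identifies those who leave. The sketch below recomputes the headline statistics from the four counts in the matrix above, to show why an accuracy of 0.83 hides a model that flags only 1 of 46 leavers.

```r
# Rows are predictions, columns are the reference (true) labels.
cm <- matrix(c(215, 0, 45, 1), nrow = 2,
             dimnames = list(Prediction = c("No", "Yes"),
                             Reference = c("No", "Yes")))

accuracy <- sum(diag(cm)) / sum(cm)                    # correct / all
no_info_rate <- max(colSums(cm)) / sum(cm)             # always predict the majority class
sensitivity <- cm["No", "No"] / sum(cm[, "No"])        # stayers correctly predicted
specificity <- cm["Yes", "Yes"] / sum(cm[, "Yes"])     # leavers correctly predicted
balanced_accuracy <- (sensitivity + specificity) / 2

print(round(c(accuracy = accuracy, no_info_rate = no_info_rate,
              sensitivity = sensitivity, specificity = specificity,
              balanced_accuracy = balanced_accuracy), 5))
```

The accuracy is almost equal to the no-information rate, so this model is barely better than predicting `No` for everyone.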
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # 2. Decision Tree Model #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Attrition is 0.00 when
## OverTime is Yes
## StockOptionLevel < 1
## MonthlyIncome.Group is Above.Avg or Avg or High
## JobRole is Healthcare Representative or Human Resources or Laboratory Technician or Manufacturing Director or Research Director or Research Scientist
## YearsAtCompany.Group is 10&above or 3thru5
##
## Attrition is 0.06 when
## OverTime is Yes
## StockOptionLevel >= 1
## JobLevel >= 2
##
## Attrition is 0.07 when
## OverTime is No
## TotalWorkingYears >= 3
##
## Attrition is 0.08 when
## OverTime is No
## StockOptionLevel >= 1
## TotalWorkingYears < 3
##
## Attrition is 0.15 when
## OverTime is Yes
## StockOptionLevel >= 1
## DistanceFromHome < 13
## JobLevel < 2
##
## Attrition is 0.19 when
## OverTime is Yes
## StockOptionLevel < 1
## MonthlyIncome.Group is Above.Avg or Avg or High
## JobRole is Healthcare Representative or Human Resources or Laboratory Technician or Manufacturing Director or Research Director or Research Scientist
## YearsAtCompany.Group is 5thru10 or LessThan3
## JobInvolvement >= 3
##
## Attrition is 0.38 when
## OverTime is Yes
## StockOptionLevel < 1
## MonthlyIncome.Group is Above.Avg or Avg or High
## JobRole is Manager or Sales Executive or Sales Representative
## DistanceFromHome < 8
##
## Attrition is 0.38 when
## OverTime is Yes
## StockOptionLevel < 1
## MonthlyIncome.Group is Low
## Age.Group is Veteran
##
## Attrition is 0.62 when
## OverTime is No
## StockOptionLevel < 1
## TotalWorkingYears < 3
##
## Attrition is 0.67 when
## OverTime is Yes
## StockOptionLevel < 1
## MonthlyIncome.Group is Above.Avg or Avg or High
## JobRole is Healthcare Representative or Human Resources or Laboratory Technician or Manufacturing Director or Research Director or Research Scientist
## YearsAtCompany.Group is 5thru10 or LessThan3
## JobInvolvement < 3
##
## Attrition is 0.75 when
## OverTime is Yes
## StockOptionLevel >= 1
## DistanceFromHome >= 13
## JobLevel < 2
##
## Attrition is 0.88 when
## OverTime is Yes
## StockOptionLevel < 1
## MonthlyIncome.Group is Above.Avg or Avg or High
## JobRole is Manager or Sales Executive or Sales Representative
## DistanceFromHome >= 8
##
## Attrition is 0.89 when
## OverTime is Yes
## StockOptionLevel < 1
## MonthlyIncome.Group is Low
## Age.Group is Undergrad or Young-Professional
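Each block in the rule listing above describes one leaf of the tree: the number is the fitted probability of attrition, and the conditions are the splits leading to that leaf. To make the reading concrete, the sketch below encodes the three leaves of the `OverTime is No` branch as a plain function. The function name is hypothetical and the probabilities are copied from the rules; this is a restatement of the printed output, not the fitted model itself.

```r
# Leaf probabilities for employees with OverTime == "No", copied from the rules.
p_attrition_no_overtime <- function(total_working_years, stock_option_level) {
  if (total_working_years >= 3) {
    0.07  # three or more working years
  } else if (stock_option_level >= 1) {
    0.08  # fewer than three working years, but holds stock options
  } else {
    0.62  # fewer than three working years and no stock options
  }
}

print(p_attrition_no_overtime(total_working_years = 10, stock_option_level = 0))
print(p_attrition_no_overtime(total_working_years = 1, stock_option_level = 2))
print(p_attrition_no_overtime(total_working_years = 1, stock_option_level = 0))
```

Read this way, the tree suggests that new entrants to the workforce without stock options are the high-risk group among employees who do not work overtime.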
## Confusion Matrix and Statistics
##
## Reference
## Prediction No Yes
## No 201 30
## Yes 14 16
##
## Accuracy : 0.8314
## 95% CI : (0.7804, 0.8748)
## No Information Rate : 0.8238
## P-Value [Acc > NIR] : 0.41022
##
## Kappa : 0.3275
##
## Mcnemar's Test P-Value : 0.02374
##
## Sensitivity : 0.9349
## Specificity : 0.3478
## Pos Pred Value : 0.8701
## Neg Pred Value : 0.5333
## Prevalence : 0.8238
## Detection Rate : 0.7701
## Detection Prevalence : 0.8851
## Balanced Accuracy : 0.6414
##
## 'Positive' Class : No
##
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # 3. KNN Model #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Confusion Matrix and Statistics
##
## Reference
## Prediction No Yes
## No 209 43
## Yes 6 3
##
## Accuracy : 0.8123
## 95% CI : (0.7595, 0.8578)
## No Information Rate : 0.8238
## P-Value [Acc > NIR] : 0.7192
##
## Kappa : 0.0546
##
## Mcnemar's Test P-Value : 2.706e-07
##
## Sensitivity : 0.97209
## Specificity : 0.06522
## Pos Pred Value : 0.82937
## Neg Pred Value : 0.33333
## Prevalence : 0.82375
## Detection Rate : 0.80077
## Detection Prevalence : 0.96552
## Balanced Accuracy : 0.51866
##
## 'Positive' Class : No
##
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # Hyperparameter Tuning #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## [1] 9
## [1] 0.8528736
## [1] 20
## [1] 0.9955048
## [1] 1
## [1] 0.2394917
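The six numbers above are not labeled. One plausible reading is three pairs, each giving the value of k that maximizes a metric and the metric's value (for example k = 9 with an accuracy of about 0.853), but that is an assumption. The sketch below shows the general tuning idea on simulated data with a small base R nearest-neighbor classifier: try several values of k, score each on held-out data, and keep the best. All names, the data and the grid of k values are hypothetical.

```r
set.seed(42)  # arbitrary seed

# Simulated two-class data with two numeric predictors (not the attrition data).
n <- 120
x <- rbind(matrix(rnorm(n, mean = 0), ncol = 2),
           matrix(rnorm(n, mean = 1.5), ncol = 2))
y <- factor(rep(c("No", "Yes"), each = n / 2))
train_idx <- sample(nrow(x), size = 84)  # 70/30 split

# Classify each test row by majority vote among its k nearest training rows.
knn_predict <- function(train_x, train_y, test_x, k) {
  apply(test_x, 1, function(row) {
    d <- colSums((t(train_x) - row)^2)  # squared Euclidean distances
    votes <- table(train_y[order(d)[seq_len(k)]])
    names(votes)[which.max(votes)]      # ties go to the first level
  })
}

ks <- seq(1, 21, by = 2)
acc <- vapply(ks, function(k) {
  pred <- knn_predict(x[train_idx, ], y[train_idx], x[-train_idx, ], k)
  mean(pred == y[-train_idx])           # held-out accuracy for this k
}, numeric(1))

best_k <- ks[which.max(acc)]
print(data.frame(k = ks, accuracy = acc))
print(best_k)
```

With imbalanced classes such as attrition, choosing k by accuracy alone tends to favor models that rarely predict `Yes`, so it is worth tracking specificity for each k as well.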
## classifications
## No Yes
## No 218 5
## Yes 33 5
## Confusion Matrix and Statistics
##
## classifications
## No Yes
## No 218 5
## Yes 33 5
##
## Accuracy : 0.8544
## 95% CI : (0.8057, 0.8949)
## No Information Rate : 0.9617
## P-Value [Acc > NIR] : 1
##
## Kappa : 0.1572
##
## Mcnemar's Test P-Value : 1.187e-05
##
## Sensitivity : 0.8685
## Specificity : 0.5000
## Pos Pred Value : 0.9776
## Neg Pred Value : 0.1316
## Prevalence : 0.9617
## Detection Rate : 0.8352
## Detection Prevalence : 0.8544
## Balanced Accuracy : 0.6843
##
## 'Positive' Class : No
##
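Unlike the other models, this table has its columns headed `classifications`, and its No Information Rate of 0.9617 equals 251/261, the share of the first column. If those columns hold the model's predictions rather than the true labels (an assumption suggested by the header, not confirmed by the document), then the statistics are computed with prediction and reference swapped. The sketch below computes the metrics for both orientations so the two readings can be compared; it does not establish which one was intended.

```r
# Table exactly as printed above.
tab <- matrix(c(218, 33, 5, 5), nrow = 2,
              dimnames = list(row = c("No", "Yes"),
                              classifications = c("No", "Yes")))

# m: rows = predicted class, columns = reference class, positive class first.
metrics <- function(m) {
  c(sensitivity = m[1, 1] / sum(m[, 1]),
    specificity = m[2, 2] / sum(m[, 2]),
    pos_pred_value = m[1, 1] / sum(m[1, ]),
    neg_pred_value = m[2, 2] / sum(m[2, ]))
}

as_printed <- metrics(tab)   # columns treated as the reference
swapped <- metrics(t(tab))   # rows treated as the reference instead

print(round(rbind(as_printed, swapped), 4))
```

Transposing the table exchanges sensitivity with positive predictive value and specificity with negative predictive value. Under the swapped reading, specificity is 0.1316 rather than 0.5000, which matters when this model is compared with the others.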
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # 4. GLM #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Confusion Matrix and Statistics
##
## Reference
## Prediction No Yes
## No 210 29
## Yes 5 17
##
## Accuracy : 0.8697
## 95% CI : (0.8227, 0.9081)
## No Information Rate : 0.8238
## P-Value [Acc > NIR] : 0.02751
##
## Kappa : 0.4356
##
## Mcnemar's Test P-Value : 7.998e-05
##
## Sensitivity : 0.9767
## Specificity : 0.3696
## Pos Pred Value : 0.8787
## Neg Pred Value : 0.7727
## Prevalence : 0.8238
## Detection Rate : 0.8046
## Detection Prevalence : 0.9157
## Balanced Accuracy : 0.6732
##
## 'Positive' Class : No
##
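The section heading says only "GLM". Assuming this means a logistic regression with a 0.5 probability cut-off, neither of which the document states, the fit-and-classify step can be sketched as follows. The data are simulated, and the predictors and coefficients are chosen only for illustration.

```r
set.seed(7)  # arbitrary seed

# Simulated data (not the attrition data): overtime raises the odds of leaving.
n <- 400
overtime <- rbinom(n, 1, 0.3)
years <- rpois(n, 6)
left <- rbinom(n, 1, plogis(-2 + 1.5 * overtime - 0.1 * years))
toy <- data.frame(
  Attrition = factor(ifelse(left == 1, "Yes", "No"), levels = c("No", "Yes")),
  OverTime = factor(ifelse(overtime == 1, "Yes", "No")),
  TotalWorkingYears = years
)

fit <- glm(Attrition ~ OverTime + TotalWorkingYears, data = toy, family = binomial)

# predict() returns P(Attrition == "Yes") because "Yes" is the second factor level.
p_yes <- predict(fit, newdata = toy, type = "response")
pred <- factor(ifelse(p_yes > 0.5, "Yes", "No"), levels = c("No", "Yes"))

print(table(Prediction = pred, Reference = toy$Attrition))
```

Lowering the cut-off below 0.5 trades sensitivity for specificity, which is one way to catch more leavers when they make up a small share of the data.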
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # 5. Naive Bayes #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Confusion Matrix and Statistics
##
## Reference
## Prediction No Yes
## No 176 25
## Yes 40 20
##
## Accuracy : 0.751
## 95% CI : (0.6939, 0.8022)
## No Information Rate : 0.8276
## P-Value [Acc > NIR] : 0.99933
##
## Kappa : 0.229
##
## Mcnemar's Test P-Value : 0.08248
##
## Sensitivity : 0.8148
## Specificity : 0.4444
## Pos Pred Value : 0.8756
## Neg Pred Value : 0.3333
## Prevalence : 0.8276
## Detection Rate : 0.6743
## Detection Prevalence : 0.7701
## Balanced Accuracy : 0.6296
##
## 'Positive' Class : No
##
## No Yes
## Age.Group 0.5571882 0.5571882
## DistanceFromHome 0.5706430 0.5706430
## MonthlyIncome.Group 0.6353778 0.6353778
## TotalWorkingYears 0.6650113 0.6650113
## OverTime 0.6592976 0.6592976
## YearsAtCompany 0.6807905 0.6807905
## No Yes
## Age.Group 0.5571882 0.5571882
## DistanceFromHome 0.5706430 0.5706430
## MonthlyIncome.Group 0.6353778 0.6353778
## TotalWorkingYears 0.6650113 0.6650113
## OverTime 0.6592976 0.6592976
## YearsAtCompany 0.6807905 0.6807905
## StockOptionLevel 0.6536760 0.6536760
## JobRole 0.5343334 0.5343334
## JobLevel 0.6477166 0.6477166
## JobInvolvement 0.6168442 0.6168442
## Education 0.5277186 0.5277186
## NumCompaniesWorked 0.5525599 0.5525599
## EnvironmentSatisfaction 0.5535019 0.5535019
## WorkLifeBalance 0.5546283 0.5546283
## YearsInCurrentRole 0.6633934 0.6633934
## YearsAtCompany.Group 0.6339955 0.6339955
## YearsWithCurrManager 0.6328384 0.6328384
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # 6. Random Forest #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## randomForest 4.6-14
## Type rfNews() to see new features/changes/bug fixes.
##
## Attaching package: 'randomForest'
## The following object is masked from 'package:dplyr':
##
## combine
## The following object is masked from 'package:ggplot2':
##
## margin
##
## Call:
## randomForest(formula = Attrition ~ ., data = training, ntree = 50, nodesize = 1, importance = TRUE)
## Type of random forest: classification
## Number of trees: 50
## No. of variables tried at each split: 4
##
## OOB estimate of error rate: 14.29%
## Confusion matrix:
## No Yes class.error
## No 499 15 0.02918288
## Yes 72 23 0.75789474
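The out-of-bag (OOB) error and the per-class errors in the summary above follow directly from the four counts in its confusion matrix, where rows are the actual classes and columns the OOB predictions. A short check of the arithmetic:

```r
# OOB confusion matrix from the summary above (rows = actual, columns = predicted).
oob <- matrix(c(499, 72, 15, 23), nrow = 2,
              dimnames = list(Actual = c("No", "Yes"),
                              Predicted = c("No", "Yes")))

class_error <- 1 - diag(oob) / rowSums(oob)  # share of each actual class misclassified
oob_error <- 1 - sum(diag(oob)) / sum(oob)   # overall share misclassified

print(round(class_error, 8))
print(round(100 * oob_error, 2))
```

The overall error of 14.29% is dominated by the majority class: about 76% of the actual leavers in the training data are misclassified out of bag.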
## Confusion Matrix and Statistics
##
## Reference
## Prediction No Yes
## No 207 36
## Yes 9 9
##
## Accuracy : 0.8276
## 95% CI : (0.7762, 0.8714)
## No Information Rate : 0.8276
## P-Value [Acc > NIR] : 0.5397121
##
## Kappa : 0.2077
##
## Mcnemar's Test P-Value : 0.0001063
##
## Sensitivity : 0.9583
## Specificity : 0.2000
## Pos Pred Value : 0.8519
## Neg Pred Value : 0.5000
## Prevalence : 0.8276
## Detection Rate : 0.7931
## Detection Prevalence : 0.9310
## Balanced Accuracy : 0.5792
##
## 'Positive' Class : No
##
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # Exploratory Analysis of the Models #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##   Models Sensitivities Specificities Precisions   Recalls F1_Scores Accuracies Balanced_Accuracies
## 1    DTM     0.9348837    0.34782609  0.8701299 0.9348837 0.9013453  0.8314176                  NA
## 2    GLM     0.9767442    0.36956522  0.8786611 0.9767442 0.9251101  0.8697318           0.6731547
## 3    NBM     0.8148148    0.44444444  0.8756219 0.8148148 0.8441247  0.7509579           0.6296296
## 4    KNN     0.8685259    0.50000000  0.9775785 0.8685259 0.9198312  0.8544061           0.6842629
## 5    RFM     0.9583333    0.20000000         NA 0.9583333 0.9019608  0.8275862           0.5791667
## 6    SVM     1.0000000    0.02173913         NA 1.0000000 0.9052632  0.8275862           0.5108696
## Warning: Removed 2 rows containing missing values (position_stack).
## Warning: Removed 2 rows containing missing values (geom_text).
## Warning: Removed 1 rows containing missing values (position_stack).
## Warning: Removed 1 rows containing missing values (geom_text).
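The warnings above come from the NA entries in the comparison table, which ggplot2 drops before plotting. Balanced accuracy is the mean of sensitivity and specificity, so that column can be filled in for every model. The sketch below does this in base R with the values copied from the table. The data frame name `models` is an assumption made for illustration. The missing Precisions cannot be recovered this way, because precision also depends on the class counts.

```r
# Sensitivity and specificity copied from the comparison table above.
models <- data.frame(
  Models        = c("DTM", "GLM", "NBM", "KNN", "RFM", "SVM"),
  Sensitivities = c(0.9348837, 0.9767442, 0.8148148, 0.8685259, 0.9583333, 1.0000000),
  Specificities = c(0.34782609, 0.36956522, 0.44444444, 0.50000000, 0.20000000, 0.02173913)
)

# Balanced accuracy is the mean of sensitivity and specificity,
# so it can be derived for every row, including the one printed as NA.
models$Balanced_Accuracies <- (models$Sensitivities + models$Specificities) / 2

# Rank the models from best to worst on this metric.
ranked <- models[order(-models$Balanced_Accuracies), ]
print(ranked)
```

Ranking on balanced accuracy rather than raw accuracy is a reasonable choice here because roughly 83% of employees are in the "No" class, so a model can score well on accuracy while rarely detecting attrition.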
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # Begin Attrition Competition #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## ID Age BusinessTravel DailyRate Department DistanceFromHome
## 1 1171 35 Travel_Rarely 750 Research & Development 28
## 2 1172 33 Travel_Rarely 147 Human Resources 2
## 3 1173 26 Travel_Rarely 1330 Research & Development 21
## 4 1174 55 Travel_Rarely 1311 Research & Development 2
## 5 1175 29 Travel_Rarely 1246 Sales 19
## 6 1176 51 Travel_Frequently 1456 Research & Development 1
## Education EducationField EmployeeCount EmployeeNumber
## 1 3 Life Sciences 1 1596
## 2 3 Human Resources 1 1207
## 3 3 Medical 1 1107
## 4 3 Life Sciences 1 505
## 5 3 Life Sciences 1 1497
## 6 4 Medical 1 145
## EnvironmentSatisfaction Gender HourlyRate JobInvolvement JobLevel
## 1 2 Male 46 4 2
## 2 2 Male 99 3 1
## 3 1 Male 37 3 1
## 4 3 Female 97 3 4
## 5 3 Male 77 2 2
## 6 1 Female 30 2 3
## JobRole JobSatisfaction MaritalStatus MonthlyIncome
## 1 Laboratory Technician 3 Married 3407
## 2 Human Resources 3 Married 3600
## 3 Laboratory Technician 3 Divorced 2377
## 4 Manager 4 Single 16659
## 5 Sales Executive 3 Divorced 8620
## 6 Healthcare Representative 1 Single 7484
## MonthlyRate NumCompaniesWorked Over18 OverTime PercentSalaryHike
## 1 25348 1 Y No 17
## 2 8429 1 Y No 13
## 3 19373 1 Y No 20
## 4 23258 2 Y Yes 13
## 5 23757 1 Y No 14
## 6 25796 3 Y No 20
## PerformanceRating RelationshipSatisfaction StandardHours StockOptionLevel
## 1 3 4 80 2
## 2 3 4 80 1
## 3 4 3 80 1
## 4 3 3 80 0
## 5 3 3 80 2
## 6 4 3 80 0
## TotalWorkingYears TrainingTimesLastYear WorkLifeBalance YearsAtCompany
## 1 10 3 2 10
## 2 5 2 3 5
## 3 1 0 2 1
## 4 30 2 3 5
## 5 10 3 3 10
## 6 23 1 2 13
## YearsInCurrentRole YearsSinceLastPromotion YearsWithCurrManager
## 1 9 6 8
## 2 4 1 4
## 3 1 0 0
## 4 4 1 2
## 5 7 0 4
## 6 12 12 8
## The following objects are masked from df:
##
## Age, BusinessTravel, DailyRate, Department, DistanceFromHome,
## Education, EducationField, EmployeeCount, EmployeeNumber,
## EnvironmentSatisfaction, Gender, HourlyRate, ID, JobInvolvement,
## JobLevel, JobRole, JobSatisfaction, MaritalStatus, MonthlyIncome,
## MonthlyRate, NumCompaniesWorked, Over18, OverTime,
## PercentSalaryHike, PerformanceRating, RelationshipSatisfaction,
## StandardHours, StockOptionLevel, TotalWorkingYears,
## TrainingTimesLastYear, WorkLifeBalance, YearsAtCompany,
## YearsInCurrentRole, YearsSinceLastPromotion, YearsWithCurrManager
## The following object is masked from package:vcd:
##
## JobSatisfaction
## 'data.frame': 300 obs. of 35 variables:
## $ ID : int 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 ...
## $ Age : int 35 33 26 55 29 51 52 39 31 31 ...
## $ BusinessTravel : chr "Travel_Rarely" "Travel_Rarely" "Travel_Rarely" "Travel_Rarely" ...
## $ DailyRate : int 750 147 1330 1311 1246 1456 585 1387 1062 534 ...
## $ Department : chr "Research & Development" "Human Resources" "Research & Development" "Research & Development" ...
## $ DistanceFromHome : int 28 2 21 2 19 1 29 10 24 20 ...
## $ Education : int 3 3 3 3 3 4 4 5 3 3 ...
## $ EducationField : chr "Life Sciences" "Human Resources" "Medical" "Life Sciences" ...
## $ EmployeeCount : int 1 1 1 1 1 1 1 1 1 1 ...
## $ EmployeeNumber : int 1596 1207 1107 505 1497 145 2019 1618 1252 587 ...
## $ EnvironmentSatisfaction : int 2 2 1 3 3 1 1 2 3 1 ...
## $ Gender : chr "Male" "Male" "Male" "Female" ...
## $ HourlyRate : int 46 99 37 97 77 30 40 76 96 66 ...
## $ JobInvolvement : int 4 3 3 3 2 2 3 3 2 3 ...
## $ JobLevel : int 2 1 1 4 2 3 1 2 2 3 ...
## $ JobRole : chr "Laboratory Technician" "Human Resources" "Laboratory Technician" "Manager" ...
## $ JobSatisfaction : int 3 3 3 4 3 1 4 1 1 3 ...
## $ MaritalStatus : chr "Married" "Married" "Divorced" "Single" ...
## $ MonthlyIncome : int 3407 3600 2377 16659 8620 7484 3482 5377 6812 9824 ...
## $ MonthlyRate : int 25348 8429 19373 23258 23757 25796 19788 3835 17198 22908 ...
## $ NumCompaniesWorked : int 1 1 1 2 1 3 2 2 1 3 ...
## $ Over18 : chr "Y" "Y" "Y" "Y" ...
## $ OverTime : chr "No" "No" "No" "Yes" ...
## $ PercentSalaryHike : int 17 13 20 13 14 20 15 13 19 12 ...
## $ PerformanceRating : int 3 3 4 3 3 4 3 3 3 3 ...
## $ RelationshipSatisfaction: int 4 4 3 3 3 3 2 4 2 1 ...
## $ StandardHours : int 80 80 80 80 80 80 80 80 80 80 ...
## $ StockOptionLevel : int 2 1 1 0 2 0 2 3 0 0 ...
## $ TotalWorkingYears : int 10 5 1 30 10 23 16 10 10 12 ...
## $ TrainingTimesLastYear : int 3 2 0 2 3 1 3 3 2 2 ...
## $ WorkLifeBalance : int 2 3 2 3 3 2 2 3 3 3 ...
## $ YearsAtCompany : int 10 5 1 5 10 13 9 7 10 1 ...
## $ YearsInCurrentRole : int 9 4 1 4 7 12 8 7 9 0 ...
## $ YearsSinceLastPromotion : int 6 1 0 1 0 12 0 7 1 0 ...
## $ YearsWithCurrManager : int 8 4 0 2 4 8 0 7 8 0 ...
## [1] 300 35
## integer(0)
## 'data.frame': 300 obs. of 36 variables:
## $ ID : int 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 ...
## $ Age : int 35 33 26 55 29 51 52 39 31 31 ...
## $ BusinessTravel : Factor w/ 3 levels "Non-Travel","Travel_Frequently",..: 3 3 3 3 3 2 1 3 3 2 ...
## $ DailyRate : int 750 147 1330 1311 1246 1456 585 1387 1062 534 ...
## $ Department : Factor w/ 3 levels "Human Resources",..: 2 1 2 2 3 2 3 2 2 2 ...
## $ DistanceFromHome : int 28 2 21 2 19 1 29 10 24 20 ...
## $ Education : int 3 3 3 3 3 4 4 5 3 3 ...
## $ EducationField : Factor w/ 6 levels "Human Resources",..: 2 1 4 2 2 4 2 4 4 2 ...
## $ EnvironmentSatisfaction : int 2 2 1 3 3 1 1 2 3 1 ...
## $ Gender : Factor w/ 2 levels "Female","Male": 2 2 2 1 2 1 2 2 1 2 ...
## $ HourlyRate : int 46 99 37 97 77 30 40 76 96 66 ...
## $ JobInvolvement : int 4 3 3 3 2 2 3 3 2 3 ...
## $ JobLevel : int 2 1 1 4 2 3 1 2 2 3 ...
## $ JobRole : Factor w/ 9 levels "Healthcare Representative",..: 3 2 3 4 8 1 9 5 1 1 ...
## $ JobSatisfaction : int 3 3 3 4 3 1 4 1 1 3 ...
## $ MaritalStatus : Factor w/ 3 levels "Divorced","Married",..: 2 2 1 3 1 3 1 2 3 2 ...
## $ MonthlyIncome : int 3407 3600 2377 16659 8620 7484 3482 5377 6812 9824 ...
## $ MonthlyRate : int 25348 8429 19373 23258 23757 25796 19788 3835 17198 22908 ...
## $ NumCompaniesWorked : int 1 1 1 2 1 3 2 2 1 3 ...
## $ OverTime : Factor w/ 2 levels "No","Yes": 1 1 1 2 1 1 1 1 1 1 ...
## $ PercentSalaryHike : int 17 13 20 13 14 20 15 13 19 12 ...
## $ PerformanceRating : int 3 3 4 3 3 4 3 3 3 3 ...
## $ RelationshipSatisfaction: int 4 4 3 3 3 3 2 4 2 1 ...
## $ StockOptionLevel : int 2 1 1 0 2 0 2 3 0 0 ...
## $ TotalWorkingYears : int 10 5 1 30 10 23 16 10 10 12 ...
## $ TrainingTimesLastYear : int 3 2 0 2 3 1 3 3 2 2 ...
## $ WorkLifeBalance : int 2 3 2 3 3 2 2 3 3 3 ...
## $ YearsAtCompany : int 10 5 1 5 10 13 9 7 10 1 ...
## $ YearsInCurrentRole : int 9 4 1 4 7 12 8 7 9 0 ...
## $ YearsSinceLastPromotion : int 6 1 0 1 0 12 0 7 1 0 ...
## $ YearsWithCurrManager : int 8 4 0 2 4 8 0 7 8 0 ...
## $ iJobRole : int 3 2 3 4 8 1 9 5 1 1 ...
## $ iDepartment : int 2 1 2 2 3 2 3 2 2 2 ...
## $ iMaritalStatus : int 2 2 1 3 1 3 1 2 3 2 ...
## $ iBusinessTravel : int 3 3 3 3 3 2 1 3 3 2 ...
## $ iEducation : int 3 3 3 3 3 4 4 5 3 3 ...
## integer(0)
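In the structure listing above, the `iJobRole`, `iDepartment`, `iMaritalStatus`, and `iBusinessTravel` columns carry the same integer codes as the corresponding factors. A minimal sketch of that kind of encoding follows, using a toy data frame whose values are illustrative. The exact code used in the original analysis is not shown in this report, so treat this as one plausible way to obtain the same result.

```r
# Toy data frame standing in for the prediction set (values are illustrative).
df <- data.frame(
  JobRole    = c("Laboratory Technician", "Human Resources", "Manager"),
  Department = c("Research & Development", "Human Resources", "Research & Development"),
  stringsAsFactors = FALSE
)

# Convert character columns to factors; levels are sorted alphabetically by default.
df$JobRole    <- factor(df$JobRole)
df$Department <- factor(df$Department)

# The integer code of a factor is the position of its level.
df$iJobRole    <- as.integer(df$JobRole)
df$iDepartment <- as.integer(df$Department)

str(df)
```

The codes depend on which levels are present, so when the training and prediction sets are encoded separately it is safer to pass the same explicit `levels =` vector to `factor()` for both.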
## [1] "ID" "Age.Group"
## [3] "DistanceFromHome" "MonthlyIncome.Group"
## [5] "TotalWorkingYears" "OverTime"
## [7] "YearsAtCompany" "StockOptionLevel"
## [9] "JobRole" "JobLevel"
## [11] "JobInvolvement" "Education"
## [13] "EnvironmentSatisfaction" "WorkLifeBalance"
## [15] "YearsInCurrentRole" "YearsAtCompany.Group"
## [17] "YearsWithCurrManager"
## ID Age.Group DistanceFromHome
## 0 0 0
## MonthlyIncome.Group TotalWorkingYears OverTime
## 0 0 0
## YearsAtCompany StockOptionLevel JobRole
## 0 0 0
## JobLevel JobInvolvement Education
## 0 0 0
## EnvironmentSatisfaction WorkLifeBalance YearsInCurrentRole
## 0 0 0
## YearsAtCompany.Group YearsWithCurrManager
## 0 0
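The per-column zero counts above are the shape of output produced by a missing-value check. A minimal base R sketch of such a check is shown below on a toy data frame that deliberately contains one NA. The column values and the object names are illustrative assumptions.

```r
# Toy data frame with one missing value (illustrative only).
df <- data.frame(
  ID       = 1:4,
  OverTime = c("No", "Yes", NA, "No"),
  JobLevel = c(2, 1, 1, 4)
)

# Number of missing values per column, in the same shape as the output above.
na_counts <- colSums(is.na(df))
print(na_counts)

# Names of the columns that would need imputation or removal before predicting.
cols_with_na <- names(na_counts)[na_counts > 0]
print(cols_with_na)
```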
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # Make Attrition Predictions #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Warning in predict.naiveBayes(NB.Model2, newdata = attritionPredictionData):
## Type mismatch between training and new data for variable 'NumCompaniesWorked'.
## Did you use factors with numeric labels for training, and numeric values for new
## data?
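The warning above says that `NumCompaniesWorked` has a different type in the training data than in the new data. One way to avoid this is to coerce the new data to the training types before calling `predict()`. The helper `align_types()` below is hypothetical (it is not part of e1071 or of the original analysis), and the toy data frames are illustrative.

```r
# Hypothetical helper: coerce each shared column of `new` to the type used in `train`.
align_types <- function(train, new) {
  shared <- intersect(names(train), names(new))
  for (col in shared) {
    if (is.factor(train[[col]])) {
      # Reuse the training levels so the factor codes mean the same thing.
      new[[col]] <- factor(as.character(new[[col]]), levels = levels(train[[col]]))
    } else if (is.numeric(train[[col]]) && !is.numeric(new[[col]])) {
      new[[col]] <- as.numeric(as.character(new[[col]]))
    }
  }
  new
}

# Training data stores NumCompaniesWorked as a factor; the new data has it as numeric.
train <- data.frame(NumCompaniesWorked = factor(c(0, 1, 3, 1)), Age = c(30, 41, 25, 38))
new   <- data.frame(NumCompaniesWorked = c(1, 3, 0), Age = c(29, 52, 33))

aligned <- align_types(train, new)
str(aligned)
```

Values in the new data that were never seen in training become NA under this approach, which makes the mismatch visible instead of silently mislabelling it.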
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # Begin Salary Modeling #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## `geom_smooth()` using formula 'y ~ x'
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # Prepare Data for Modeling: Train/Test Split #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## [1] 870 45
## [1] 609 10
## [1] 261 10
## 'data.frame': 609 obs. of 10 variables:
## $ MonthlyIncome : int 3730 6877 15379 16184 3452 2321 4424 6804 6799 2220 ...
## $ Age : int 32 30 49 33 45 31 31 49 34 34 ...
## $ YearsAtCompany : int 3 0 8 6 6 3 11 7 10 1 ...
## $ YearsInCurrentRole: int 2 0 7 1 5 2 7 7 8 1 ...
## $ TotalWorkingYears : int 4 12 23 10 9 4 11 7 10 1 ...
## $ JobLevel : int 1 2 4 4 1 1 2 2 2 1 ...
## $ Attrition : Factor w/ 2 levels "No","Yes": 2 1 1 1 1 2 1 1 1 1 ...
## $ BusinessTravel : Factor w/ 3 levels "Non-Travel","Travel_Frequently",..: 3 3 3 3 3 1 3 1 3 3 ...
## $ JobRole : Factor w/ 9 levels "Healthcare Representative",..: 3 5 4 6 7 7 5 5 8 9 ...
## $ Department : Factor w/ 3 levels "Human Resources",..: 2 2 2 2 2 2 2 2 3 3 ...
## MonthlyIncome Age YearsAtCompany YearsInCurrentRole
## Min. : 1081 Min. :18.00 Min. : 0.000 Min. : 0.000
## 1st Qu.: 2840 1st Qu.:30.00 1st Qu.: 3.000 1st Qu.: 2.000
## Median : 4946 Median :35.00 Median : 5.000 Median : 3.000
## Mean : 6390 Mean :36.83 Mean : 6.962 Mean : 4.205
## 3rd Qu.: 8182 3rd Qu.:43.00 3rd Qu.:10.000 3rd Qu.: 7.000
## Max. :19999 Max. :60.00 Max. :40.000 Max. :18.000
##
## TotalWorkingYears JobLevel Attrition BusinessTravel
## Min. : 0.00 Min. :1.000 No :730 Non-Travel : 94
## 1st Qu.: 6.00 1st Qu.:1.000 Yes:140 Travel_Frequently:158
## Median :10.00 Median :2.000 Travel_Rarely :618
## Mean :11.05 Mean :2.039
## 3rd Qu.:15.00 3rd Qu.:3.000
## Max. :40.00 Max. :5.000
##
## JobRole Department
## Sales Executive :200 Human Resources : 35
## Research Scientist :172 Research & Development:562
## Laboratory Technician :153 Sales :273
## Manufacturing Director : 87
## Healthcare Representative: 76
## Sales Representative : 53
## (Other) :129
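The dimensions printed at the start of this section (870 rows overall, 609 for training, 261 for testing) correspond to a 70/30 split. The sketch below reproduces a split of that size in base R on stand-in data. The seed, the object names, and the use of `sample()` are assumptions, since the report does not show how the original split was drawn.

```r
set.seed(42)  # arbitrary seed, chosen here only for reproducibility

# Stand-in data frame with the same number of rows as the salary data (870).
salary <- data.frame(
  MonthlyIncome = round(runif(870, 1000, 20000)),
  JobLevel      = sample(1:5, 870, replace = TRUE)
)

# Draw 70% of the row indices for training; the remainder becomes the test set.
n         <- nrow(salary)
train_idx <- sample(seq_len(n), size = round(0.7 * n))
training  <- salary[train_idx, ]
testing   <- salary[-train_idx, ]

dim(training)
dim(testing)
```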
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # Find Important Variables #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##
## Attaching package: 'MASS'
## The following object is masked from 'package:dplyr':
##
## select
## Overall
## Age 13.530904
## YearsAtCompany 13.971729
## YearsInCurrentRole 9.689207
## TotalWorkingYears 29.492006
## JobLevel 71.368261
## Attrition 3.936266
## Overall
## Age 13.530904
## YearsAtCompany 13.971729
## YearsInCurrentRole 9.689207
## TotalWorkingYears 29.492006
## JobLevel 71.368261
## Attrition 3.936266
## BusinessTravel 1.094123
## JobRole 1.529198
## Department 1.691547
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # Bayesian Lasso Regression (average of the posterior estimates of the regression coefficients) #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Loading required package: pls
##
## Attaching package: 'pls'
## The following object is masked from 'package:caret':
##
## R2
## The following object is masked from 'package:corrplot':
##
## corrplot
## The following object is masked from 'package:stats':
##
## loadings
## Loading required package: lars
## Loaded lars 1.2
## t=100, m=12
## t=200, m=11
## t=300, m=11
## t=400, m=10
## t=500, m=9
## t=600, m=11
## t=700, m=11
## t=800, m=12
## t=900, m=10
##
## Call:
## lm(formula = MonthlyIncome ~ ., data = training)
##
## Coefficients:
## (Intercept) Age
## -528.68466 -0.02604
## YearsAtCompany YearsInCurrentRole
## -9.68087 -1.81881
## TotalWorkingYears JobLevel
## 53.51056 2712.27911
## AttritionYes BusinessTravelTravel_Frequently
## -2.24141 233.98698
## BusinessTravelTravel_Rarely JobRoleHuman Resources
## 375.53380 -26.22935
## JobRoleLaboratory Technician JobRoleManager
## -833.91648 4383.10401
## JobRoleManufacturing Director JobRoleResearch Director
## -7.04692 3991.67235
## JobRoleResearch Scientist JobRoleSales Executive
## -580.73143 114.84677
## JobRoleSales Representative DepartmentResearch & Development
## -389.44354 505.69635
## DepartmentSales
## 194.96933
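To read the coefficient table above, it can help to work one prediction by hand. The sketch below applies the printed coefficients to a hypothetical employee (35 years old, 5 years at the company, 3 years in the current role, 10 total working years, job level 2, travels rarely, Sales Executive in the Sales department, no attrition). The employee profile is invented for illustration. Baseline levels such as Attrition "No" contribute nothing, so their dummy variables are omitted.

```r
# Coefficients copied from the lm() printout above (only the ones needed here).
coefs <- c(
  "(Intercept)"                 = -528.68466,
  "Age"                         = -0.02604,
  "YearsAtCompany"              = -9.68087,
  "YearsInCurrentRole"          = -1.81881,
  "TotalWorkingYears"           = 53.51056,
  "JobLevel"                    = 2712.27911,
  "BusinessTravelTravel_Rarely" = 375.53380,
  "JobRoleSales Executive"      = 114.84677,
  "DepartmentSales"             = 194.96933
)

# Hypothetical employee: numeric predictors take their values,
# dummy variables are 1 when the level applies.
x <- c(
  "(Intercept)"                 = 1,
  "Age"                         = 35,
  "YearsAtCompany"              = 5,
  "YearsInCurrentRole"          = 3,
  "TotalWorkingYears"           = 10,
  "JobLevel"                    = 2,
  "BusinessTravelTravel_Rarely" = 1,
  "JobRoleSales Executive"      = 1,
  "DepartmentSales"             = 1
)

# The fitted value is the sum of coefficient times predictor value.
predicted_income <- sum(coefs * x[names(coefs)])
print(predicted_income)
```

Almost all of the predicted value comes from the `JobLevel` term (2 x 2712.28), which is consistent with `JobLevel` having by far the largest importance score in the previous section.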
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2036 2751 5788 6816 8788 19751
## [1] 1023.376
## RMSE Rsquared MAE
## 1023.3763611 0.9586828 804.2372799
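The three figures above summarise how far the predicted incomes fall from the observed ones on the test set. The sketch below shows how each can be computed in base R. The helper `regression_metrics()` and the toy vectors are hypothetical, and the R-squared here is the squared correlation between predictions and observations, which is an assumption about how the reported value was obtained.

```r
# Hypothetical helper mirroring the three metrics printed above.
regression_metrics <- function(pred, obs) {
  c(
    RMSE     = sqrt(mean((pred - obs)^2)),  # root mean squared error
    Rsquared = cor(pred, obs)^2,            # squared correlation of pred and obs
    MAE      = mean(abs(pred - obs))        # mean absolute error
  )
}

# Toy observed and predicted monthly incomes (illustrative values).
obs  <- c(3000, 5000, 8000, 12000)
pred <- c(3500, 4500, 8500, 11000)

m <- regression_metrics(pred, obs)
print(m)
```

RMSE penalises large misses more heavily than MAE, which is why the reported RMSE (about 1023) is larger than the reported MAE (about 804).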
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # Begin Salary Predictions Prep #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## 'data.frame': 300 obs. of 35 variables:
## $ ï..ID : int 871 872 873 874 875 876 877 878 879 880 ...
## $ Age : int 43 33 55 36 27 39 33 21 30 51 ...
## $ Attrition : chr "No" "No" "Yes" "No" ...
## $ BusinessTravel : chr "Travel_Frequently" "Travel_Rarely" "Travel_Rarely" "Non-Travel" ...
## $ DailyRate : int 1422 461 267 1351 1302 895 750 251 1312 1405 ...
## $ Department : chr "Sales" "Research & Development" "Sales" "Research & Development" ...
## $ DistanceFromHome : int 2 13 13 9 19 5 22 10 23 11 ...
## $ Education : int 4 1 4 4 3 3 2 2 3 2 ...
## $ EducationField : chr "Life Sciences" "Life Sciences" "Marketing" "Life Sciences" ...
## $ EmployeeCount : int 1 1 1 1 1 1 1 1 1 1 ...
## $ EmployeeNumber : int 1849 995 1372 1949 1619 42 160 1279 159 1367 ...
## $ EnvironmentSatisfaction : int 1 2 1 1 4 4 3 1 1 4 ...
## $ Gender : chr "Male" "Female" "Male" "Male" ...
## $ HourlyRate : int 92 53 85 66 67 56 95 45 96 82 ...
## $ JobInvolvement : int 3 3 4 4 2 3 3 2 1 2 ...
## $ JobLevel : int 2 1 4 1 1 2 2 1 1 4 ...
## $ JobRole : chr "Sales Executive" "Research Scientist" "Sales Executive" "Laboratory Technician" ...
## $ JobSatisfaction : int 4 4 3 2 1 4 2 3 3 2 ...
## $ MaritalStatus : chr "Married" "Single" "Single" "Married" ...
## $ MonthlyRate : int 19246 17241 9277 9238 16290 3335 15480 25308 22310 24439 ...
## $ NumCompaniesWorked : int 1 3 6 1 1 3 0 1 1 3 ...
## $ Over18 : chr "Y" "Y" "Y" "Y" ...
## $ OverTime : chr "No" "No" "Yes" "No" ...
## $ PercentSalaryHike : int 20 18 17 22 11 14 13 20 25 16 ...
## $ PerformanceRating : int 4 3 3 4 3 3 3 4 4 3 ...
## $ RelationshipSatisfaction: int 3 1 3 2 1 3 1 3 3 2 ...
## $ StandardHours : int 80 80 80 80 80 80 80 80 80 80 ...
## $ StockOptionLevel : int 1 0 0 0 2 1 1 0 3 0 ...
## $ TotalWorkingYears : int 7 5 24 5 7 19 8 2 10 29 ...
## $ TrainingTimesLastYear : int 5 4 2 3 3 6 2 2 2 1 ...
## $ WorkLifeBalance : int 3 3 2 3 3 4 4 1 2 2 ...
## $ YearsAtCompany : int 7 3 19 5 7 1 7 2 10 5 ...
## $ YearsInCurrentRole : int 7 2 7 4 7 0 7 2 7 2 ...
## $ YearsSinceLastPromotion : int 7 0 3 0 0 0 0 2 0 0 ...
## $ YearsWithCurrManager : int 7 2 8 2 7 0 7 2 9 3 ...
## [1] "ï..ID" "Age"
## [3] "Attrition" "BusinessTravel"
## [5] "DailyRate" "Department"
## [7] "DistanceFromHome" "Education"
## [9] "EducationField" "EmployeeCount"
## [11] "EmployeeNumber" "EnvironmentSatisfaction"
## [13] "Gender" "HourlyRate"
## [15] "JobInvolvement" "JobLevel"
## [17] "JobRole" "JobSatisfaction"
## [19] "MaritalStatus" "MonthlyRate"
## [21] "NumCompaniesWorked" "Over18"
## [23] "OverTime" "PercentSalaryHike"
## [25] "PerformanceRating" "RelationshipSatisfaction"
## [27] "StandardHours" "StockOptionLevel"
## [29] "TotalWorkingYears" "TrainingTimesLastYear"
## [31] "WorkLifeBalance" "YearsAtCompany"
## [33] "YearsInCurrentRole" "YearsSinceLastPromotion"
## [35] "YearsWithCurrManager"
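The first column name above, `ï..ID`, is what R produces when a CSV file begins with a UTF-8 byte-order mark and the encoding is not declared on import. The sketch below builds a small file with such a mark and reads it with `fileEncoding = "UTF-8-BOM"`, which strips the mark. The temporary file and its contents are illustrative and are not the competition file.

```r
# Write a small CSV that starts with a UTF-8 byte-order mark (EF BB BF).
# The contents are illustrative.
tmp <- tempfile(fileext = ".csv")
writeBin(c(as.raw(c(0xEF, 0xBB, 0xBF)), charToRaw("ID,Age\n871,43\n872,33\n")), tmp)

# Declaring the encoding lets R strip the BOM, so the first column is named "ID".
salary_new <- read.csv(tmp, fileEncoding = "UTF-8-BOM")
print(names(salary_new))

unlink(tmp)
```

Renaming the column after import, for example with `names(df)[1] <- "ID"`, gives the same column name, but declaring the encoding keeps the fix at the point where the file is read.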
## ID Age Attrition BusinessTravel
## Min. : 871.0 Min. :18.00 Length:300 Length:300
## 1st Qu.: 945.8 1st Qu.:29.00 Class :character Class :character
## Median :1020.5 Median :36.00 Mode :character Mode :character
## Mean :1020.5 Mean :36.27
## 3rd Qu.:1095.2 3rd Qu.:42.00
## Max. :1170.0 Max. :60.00
## DailyRate Department DistanceFromHome Education
## Min. : 105.0 Length:300 Min. : 1.00 Min. :1.000
## 1st Qu.: 429.2 Class :character 1st Qu.: 2.00 1st Qu.:2.000
## Median : 693.0 Mode :character Median : 7.00 Median :3.000
## Mean : 783.2 Mean : 8.70 Mean :2.887
## 3rd Qu.:1171.2 3rd Qu.:11.25 3rd Qu.:4.000
## Max. :1492.0 Max. :29.00 Max. :5.000
## EducationField EmployeeCount EmployeeNumber EnvironmentSatisfaction
## Length:300 Min. :1 Min. : 7 Min. :1.00
## Class :character 1st Qu.:1 1st Qu.: 477 1st Qu.:2.00
## Mode :character Median :1 Median :1008 Median :3.00
## Mean :1 Mean :1014 Mean :2.77
## 3rd Qu.:1 3rd Qu.:1569 3rd Qu.:4.00
## Max. :1 Max. :2068 Max. :4.00
## Gender HourlyRate JobInvolvement JobLevel
## Length:300 Min. : 30.00 Min. :1.000 Min. :1
## Class :character 1st Qu.: 48.00 1st Qu.:2.000 1st Qu.:1
## Mode :character Median : 66.00 Median :3.000 Median :2
## Mean : 66.52 Mean :2.737 Mean :2
## 3rd Qu.: 85.25 3rd Qu.:3.000 3rd Qu.:2
## Max. :100.00 Max. :4.000 Max. :5
## JobRole JobSatisfaction MaritalStatus MonthlyRate
## Length:300 Min. :1.000 Length:300 Min. : 2122
## Class :character 1st Qu.:2.000 Class :character 1st Qu.: 7778
## Mode :character Median :3.000 Mode :character Median :13508
## Mean :2.747 Mean :14091
## 3rd Qu.:4.000 3rd Qu.:20464
## Max. :4.000 Max. :26999
## NumCompaniesWorked Over18 OverTime PercentSalaryHike
## Min. :0.00 Length:300 Length:300 Min. :11.00
## 1st Qu.:1.00 Class :character Class :character 1st Qu.:12.75
## Median :2.00 Mode :character Mode :character Median :14.00
## Mean :2.74 Mean :15.28
## 3rd Qu.:4.00 3rd Qu.:18.00
## Max. :9.00 Max. :25.00
## PerformanceRating RelationshipSatisfaction StandardHours StockOptionLevel
## Min. :3.00 Min. :1.000 Min. :80 Min. :0.0000
## 1st Qu.:3.00 1st Qu.:2.000 1st Qu.:80 1st Qu.:0.0000
## Median :3.00 Median :3.000 Median :80 Median :1.0000
## Mean :3.16 Mean :2.637 Mean :80 Mean :0.8333
## 3rd Qu.:3.00 3rd Qu.:4.000 3rd Qu.:80 3rd Qu.:1.0000
## Max. :4.00 Max. :4.000 Max. :80 Max. :3.0000
## TotalWorkingYears TrainingTimesLastYear WorkLifeBalance YearsAtCompany
## Min. : 0.00 Min. :0.00 Min. :1.000 Min. : 0.000
## 1st Qu.: 6.00 1st Qu.:2.00 1st Qu.:2.000 1st Qu.: 3.000
## Median : 9.00 Median :3.00 Median :3.000 Median : 5.000
## Mean :10.78 Mean :2.82 Mean :2.717 Mean : 6.623
## 3rd Qu.:14.00 3rd Qu.:3.00 3rd Qu.:3.000 3rd Qu.: 9.000
## Max. :40.00 Max. :6.00 Max. :4.000 Max. :33.000
## YearsInCurrentRole YearsSinceLastPromotion YearsWithCurrManager
## Min. : 0.0 Min. : 0.00 Min. : 0.000
## 1st Qu.: 2.0 1st Qu.: 0.00 1st Qu.: 2.000
## Median : 3.0 Median : 1.00 Median : 3.000
## Mean : 4.2 Mean : 2.14 Mean : 3.817
## 3rd Qu.: 7.0 3rd Qu.: 3.00 3rd Qu.: 7.000
## Max. :16.0 Max. :15.00 Max. :15.000
## [1] 300 35
## integer(0)
## ID Age Attrition
## 0 0 0
## BusinessTravel DailyRate Department
## 0 0 0
## DistanceFromHome Education EducationField
## 0 0 0
## EmployeeCount EmployeeNumber EnvironmentSatisfaction
## 0 0 0
## Gender HourlyRate JobInvolvement
## 0 0 0
## JobLevel JobRole JobSatisfaction
## 0 0 0
## MaritalStatus MonthlyRate NumCompaniesWorked
## 0 0 0
## Over18 OverTime PercentSalaryHike
## 0 0 0
## PerformanceRating RelationshipSatisfaction StandardHours
## 0 0 0
## StockOptionLevel TotalWorkingYears TrainingTimesLastYear
## 0 0 0
## WorkLifeBalance YearsAtCompany YearsInCurrentRole
## 0 0 0
## YearsSinceLastPromotion YearsWithCurrManager
## 0 0
## [1] 300 31
## integer(0)
## ID Age Attrition
## 0 0 0
## BusinessTravel DailyRate Department
## 0 0 0
## DistanceFromHome Education EducationField
## 0 0 0
## EnvironmentSatisfaction Gender HourlyRate
## 0 0 0
## JobInvolvement JobLevel JobRole
## 0 0 0
## JobSatisfaction MaritalStatus MonthlyRate
## 0 0 0
## NumCompaniesWorked OverTime PercentSalaryHike
## 0 0 0
## PerformanceRating RelationshipSatisfaction StockOptionLevel
## 0 0 0
## TotalWorkingYears TrainingTimesLastYear WorkLifeBalance
## 0 0 0
## YearsAtCompany YearsInCurrentRole YearsSinceLastPromotion
## 0 0 0
## YearsWithCurrManager
## 0
## 'data.frame': 300 obs. of 36 variables:
## $ ID : int 871 872 873 874 875 876 877 878 879 880 ...
## $ Age : int 43 33 55 36 27 39 33 21 30 51 ...
## $ Attrition : chr "No" "No" "Yes" "No" ...
## $ BusinessTravel : Factor w/ 3 levels "Non-Travel","Travel_Frequently",..: 2 3 3 1 3 3 1 2 2 3 ...
## $ DailyRate : int 1422 461 267 1351 1302 895 750 251 1312 1405 ...
## $ Department : Factor w/ 3 levels "Human Resources",..: 3 2 3 2 2 3 3 2 2 2 ...
## $ DistanceFromHome : int 2 13 13 9 19 5 22 10 23 11 ...
## $ Education : int 4 1 4 4 3 3 2 2 3 2 ...
## $ EducationField : Factor w/ 6 levels "Human Resources",..: 2 2 3 2 5 6 3 2 2 6 ...
## $ EnvironmentSatisfaction : int 1 2 1 1 4 4 3 1 1 4 ...
## $ Gender : Factor w/ 2 levels "Female","Male": 2 1 2 2 2 2 2 1 2 1 ...
## $ HourlyRate : int 92 53 85 66 67 56 95 45 96 82 ...
## $ JobInvolvement : int 3 3 4 4 2 3 3 2 1 2 ...
## $ JobLevel : int 2 1 4 1 1 2 2 1 1 4 ...
## $ JobRole : Factor w/ 9 levels "Healthcare Representative",..: 8 7 8 3 3 9 8 3 7 5 ...
## $ JobSatisfaction : int 4 4 3 2 1 4 2 3 3 2 ...
## $ MaritalStatus : Factor w/ 3 levels "Divorced","Married",..: 2 3 3 2 1 2 2 3 1 3 ...
## $ MonthlyRate : int 19246 17241 9277 9238 16290 3335 15480 25308 22310 24439 ...
## $ NumCompaniesWorked : int 1 3 6 1 1 3 0 1 1 3 ...
## $ OverTime : Factor w/ 2 levels "No","Yes": 1 1 2 1 1 1 1 1 1 1 ...
## $ PercentSalaryHike : int 20 18 17 22 11 14 13 20 25 16 ...
## $ PerformanceRating : int 4 3 3 4 3 3 3 4 4 3 ...
## $ RelationshipSatisfaction: int 3 1 3 2 1 3 1 3 3 2 ...
## $ StockOptionLevel : int 1 0 0 0 2 1 1 0 3 0 ...
## $ TotalWorkingYears : int 7 5 24 5 7 19 8 2 10 29 ...
## $ TrainingTimesLastYear : int 5 4 2 3 3 6 2 2 2 1 ...
## $ WorkLifeBalance : int 3 3 2 3 3 4 4 1 2 2 ...
## $ YearsAtCompany : int 7 3 19 5 7 1 7 2 10 5 ...
## $ YearsInCurrentRole : int 7 2 7 4 7 0 7 2 7 2 ...
## $ YearsSinceLastPromotion : int 7 0 3 0 0 0 0 2 0 0 ...
## $ YearsWithCurrManager : int 7 2 8 2 7 0 7 2 9 3 ...
## $ iJobRole : int 8 7 8 3 3 9 8 3 7 5 ...
## $ iDepartment : int 3 2 3 2 2 3 3 2 2 2 ...
## $ iMaritalStatus : int 2 3 3 2 1 2 2 3 1 3 ...
## $ iBusinessTravel : int 2 3 3 1 3 3 1 2 2 3 ...
## $ iEducation : int 4 1 4 4 3 3 2 2 3 2 ...
## ID Age Attrition
## 0 0 0
## BusinessTravel DailyRate Department
## 0 0 0
## DistanceFromHome Education EducationField
## 0 0 0
## EnvironmentSatisfaction Gender HourlyRate
## 0 0 0
## JobInvolvement JobLevel JobRole
## 0 0 0
## JobSatisfaction MaritalStatus MonthlyRate
## 0 0 0
## NumCompaniesWorked OverTime PercentSalaryHike
## 0 0 0
## PerformanceRating RelationshipSatisfaction StockOptionLevel
## 0 0 0
## TotalWorkingYears TrainingTimesLastYear WorkLifeBalance
## 0 0 0
## YearsAtCompany YearsInCurrentRole YearsSinceLastPromotion
## 0 0 0
## YearsWithCurrManager iJobRole iDepartment
## 0 0 0
## iMaritalStatus iBusinessTravel iEducation
## 0 0 0
## [1] "ID" "Age"
## [3] "Attrition" "BusinessTravel"
## [5] "DailyRate" "Department"
## [7] "DistanceFromHome" "Education"
## [9] "EducationField" "EnvironmentSatisfaction"
## [11] "Gender" "HourlyRate"
## [13] "JobInvolvement" "JobLevel"
## [15] "JobRole" "JobSatisfaction"
## [17] "MaritalStatus" "MonthlyRate"
## [19] "NumCompaniesWorked" "OverTime"
## [21] "PercentSalaryHike" "PerformanceRating"
## [23] "RelationshipSatisfaction" "StockOptionLevel"
## [25] "TotalWorkingYears" "TrainingTimesLastYear"
## [27] "WorkLifeBalance" "YearsAtCompany"
## [29] "YearsInCurrentRole" "YearsSinceLastPromotion"
## [31] "YearsWithCurrManager" "iJobRole"
## [33] "iDepartment" "iMaritalStatus"
## [35] "iBusinessTravel" "iEducation"
## [37] "time.at.past.job" "ntime.at.past.job"
## [1] "ID" "Age" "Attrition"
## [4] "BusinessTravel" "YearsAtCompany" "YearsInCurrentRole"
## [7] "TotalWorkingYears" "JobLevel" "JobRole"
## [10] "Department"
## ID Age Attrition BusinessTravel
## Min. : 871.0 Min. :18.00 Length:300 Non-Travel : 24
## 1st Qu.: 945.8 1st Qu.:29.00 Class :character Travel_Frequently: 62
## Median :1020.5 Median :36.00 Mode :character Travel_Rarely :214
## Mean :1020.5 Mean :36.27
## 3rd Qu.:1095.2 3rd Qu.:42.00
## Max. :1170.0 Max. :60.00
##
## YearsAtCompany YearsInCurrentRole TotalWorkingYears JobLevel
## Min. : 0.000 Min. : 0.0 Min. : 0.00 Min. :1
## 1st Qu.: 3.000 1st Qu.: 2.0 1st Qu.: 6.00 1st Qu.:1
## Median : 5.000 Median : 3.0 Median : 9.00 Median :2
## Mean : 6.623 Mean : 4.2 Mean :10.78 Mean :2
## 3rd Qu.: 9.000 3rd Qu.: 7.0 3rd Qu.:14.00 3rd Qu.:2
## Max. :33.000 Max. :16.0 Max. :40.00 Max. :5
##
## JobRole Department
## Sales Executive :69 Human Resources : 17
## Research Scientist :59 Research & Development:190
## Laboratory Technician :51 Sales : 93
## Manufacturing Director :27
## Healthcare Representative:26
## Manager :21
## (Other) :47
## ID Age Attrition BusinessTravel
## 0 0 0 0
## YearsAtCompany YearsInCurrentRole TotalWorkingYears JobLevel
## 0 0 0 0
## JobRole Department
## 0 0
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # Make Salary Predictions #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## [progress trace condensed: 26 runs, each printing t=100 through t=900 in steps of 100, with m ranging from 5 to 16]
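The heading of this section names a salary-prediction step, but the rendered output shows only a `t=..., m=...` progress trace and not the modeling code. As a rough illustration of what such a step can look like, here is a minimal sketch that fits a linear model on simulated data and scores it on held-out rows with RMSE. The data frame `toy`, the columns `YearsExperience` and `MonthlySalary`, and all coefficients are hypothetical; this is not the model or the data used in this report.

```r
# Minimal sketch on SIMULATED data. Column names and values are hypothetical
# and are not taken from CaseStudy2-Data.csv.
set.seed(42)
n <- 200
toy <- data.frame(
  Age             = sample(18:60, n, replace = TRUE),
  YearsExperience = sample(0:30, n, replace = TRUE)
)
toy$MonthlySalary <- 2000 + 50 * toy$Age + 300 * toy$YearsExperience +
  rnorm(n, sd = 500)

# 70/30 train/test split
train_idx <- sample(seq_len(n), size = 0.7 * n)
train <- toy[train_idx, ]
test  <- toy[-train_idx, ]

# Fit on the training rows, then predict the held-out rows
fit  <- lm(MonthlySalary ~ Age + YearsExperience, data = train)
pred <- predict(fit, newdata = test)

# Root mean squared error on the held-out rows
rmse <- sqrt(mean((test$MonthlySalary - pred)^2))
rmse
```

If the trace comes from `monomvn::blasso` (an assumption, based only on `monomvn` appearing in the session info), that function's `verb` argument controls how much progress output is printed, so setting it to `0` should keep the knitted report free of these lines.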
## R version 4.0.4 (2021-02-15)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 19042)
##
## Matrix products: default
##
## locale:
## [1] LC_COLLATE=English_United States.1252
## [2] LC_CTYPE=English_United States.1252
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_United States.1252
##
## attached base packages:
## [1] grid stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] monomvn_1.9-13 lars_1.2 pls_2.7-3 MASS_7.3-53
## [5] e1071_1.7-5 caret_6.0-86 dplyr_1.0.5 lattice_0.20-41
## [9] pROC_1.17.0.1 rpart.plot_3.0.9 rpart_4.1-15 vcd_1.4-8
## [13] ggthemes_4.2.4 corrplot_0.84 skimr_2.1.3 GGally_2.1.1
## [17] visdat_0.5.3 forcats_0.5.1 purrr_0.3.4 readr_1.4.0
## [21] tidyr_1.1.3 tibble_3.1.0 ggplot2_3.3.3
##
## loaded via a namespace (and not attached):
## [1] colorspace_2.0-0 ellipsis_0.3.1 class_7.3-18
## [4] base64enc_0.1-3 fs_1.5.0 rstudioapi_0.13
## [7] proxy_0.4-25 farver_2.1.0 prodlim_2019.11.13
## [10] fansi_0.4.2 mvtnorm_1.1-1 lubridate_1.7.10
## [13] xml2_1.3.2 codetools_0.2-18 splines_4.0.4
## [16] knitr_1.31 jsonlite_1.7.2 broom_0.7.5
## [19] kernlab_0.9-29 dbplyr_2.1.0 compiler_4.0.4
## [22] httr_1.4.2 backports_1.2.1 assertthat_0.2.1
## [25] Matrix_1.3-2 cli_2.3.1 htmltools_0.5.1.1
## [28] tools_4.0.4 gtable_0.3.0 glue_1.4.2
## [31] reshape2_1.4.4 Rcpp_1.0.6 cellranger_1.1.0
## [34] jquerylib_0.1.3 vctrs_0.3.6 nlme_3.1-152
## [37] iterators_1.0.13 lmtest_0.9-38 timeDate_3043.102
## [40] gower_0.2.2 xfun_0.22 stringr_1.4.0
## [43] rvest_1.0.0 lifecycle_1.0.0 zoo_1.8-9
## [46] scales_1.1.1 ipred_0.9-11 hms_1.0.0
## [49] tidyverse_1.3.0 RColorBrewer_1.1-2 BBmisc_1.11
## [52] yaml_2.2.1 sass_0.3.1 reshape_0.8.8
## [55] stringi_1.5.3 highr_0.8 foreach_1.5.1
## [58] randomForest_4.6-14 checkmate_2.0.0 lava_1.6.9
## [61] repr_1.1.3 rlang_0.4.10 pkgconfig_2.0.3
## [64] evaluate_0.14 recipes_0.1.15 labeling_0.4.2
## [67] tidyselect_1.1.0 plyr_1.8.6 magrittr_2.0.1
## [70] R6_2.5.0 generics_0.1.0 DBI_1.1.1
## [73] pillar_1.5.1 haven_2.3.1 withr_2.4.1
## [76] mgcv_1.8-33 survival_3.2-7 nnet_7.3-15
## [79] modelr_0.1.8 crayon_1.4.1 utf8_1.2.1
## [82] rmarkdown_2.7 readxl_1.3.1 data.table_1.14.0
## [85] ModelMetrics_1.2.2.2 reprex_1.0.0 digest_0.6.27
## [88] stats4_4.0.4 munsell_0.5.0 bslib_0.2.4
## [91] quadprog_1.5-8