Guidelines for Regression Project

This project is designed to help you gain experience and build skills in the diagnosis and prediction of employee turnover. The dataset we will use (see the Excel file named “Regression Project”) contains a wide variety of workforce data (employee demographics and attitudes) on approximately 1000 employees. The primary dependent variables are “Attrition” and “Probability of Turnover.”

Your CEO wants to better understand the factors driving employee turnover, and she has asked you to take the lead in conducting the analyses.

You should begin with data cleaning and range checks. Expect to find problems here, as with almost any large dataset! Please address any problems that you find, and document any changes that you make in your memo to me.
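
As one possible starting point, the cleaning and range checks might be sketched as follows. This is a minimal illustration on made-up rows, not the project data: the column names (EmployeeID, Age, JobSatisfaction), the valid ranges, and the 1-5 satisfaction scale are all assumptions that you would replace with the actual fields and limits from the Variable Descriptions.

```python
import pandas as pd

# Hypothetical sample rows; column names and values are assumptions,
# not the actual contents of the "Regression Project" workbook.
df = pd.DataFrame({
    "EmployeeID": [1, 2, 2, 3, 4],
    "Age": [34, 29, 29, 212, 45],
    "JobSatisfaction": [3, 5, 5, 2, 9],   # assumed 1-5 scale
    "Attrition": ["Yes", "No", "No", "no ", None],
})

# Assumed plausible ranges for each numeric field.
valid_ranges = {"Age": (18, 75), "JobSatisfaction": (1, 5)}

# 1. Count and drop exact duplicate rows.
n_duplicates = int(df.duplicated().sum())
clean = df.drop_duplicates().copy()

# 2. Standardize inconsistent text coding of the outcome ("no " -> "No").
clean["Attrition"] = clean["Attrition"].str.strip().str.title()

# 3. Range checks: count out-of-range values, then set them to missing
#    rather than guessing a corrected value.
change_log = {}
for col, (lo, hi) in valid_ranges.items():
    out_of_range = ~clean[col].between(lo, hi) & clean[col].notna()
    change_log[col] = int(out_of_range.sum())
    clean[col] = clean[col].where(~out_of_range)

# 4. Summarize what was changed, for documentation in the memo.
print("Duplicate rows dropped:", n_duplicates)
print("Out-of-range values set to missing:", change_log)
print("Missing values per column:")
print(clean.isna().sum())
```

Setting implausible values to missing, rather than deleting the whole row, is only one defensible choice; whichever rule you adopt, the counts in a log like `change_log` give you what you need to document the changes.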

Then, move on to the basics: for example, are departing employees older or younger, and do they have higher education levels or lower job satisfaction? You should also seek to determine whether attrition differs across departments and, if so, why.
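
A rough sketch of these basic comparisons is shown below, again on a handful of invented rows. The column names (Department, Age, JobSatisfaction) and the "Yes"/"No" coding of Attrition are assumptions for illustration; with the real data you would also want to check whether any differences are larger than chance would explain.

```python
import pandas as pd

# Hypothetical rows; names and values are illustrative assumptions only.
df = pd.DataFrame({
    "Department": ["Sales", "Sales", "Sales", "R&D", "R&D", "R&D", "HR", "HR"],
    "Age": [28, 45, 31, 39, 26, 50, 33, 41],
    "JobSatisfaction": [2, 4, 1, 4, 2, 5, 3, 4],
    "Attrition": ["Yes", "No", "Yes", "No", "Yes", "No", "No", "No"],
})

# Code the outcome as 0/1 so that a mean is an attrition rate.
df["Left"] = (df["Attrition"] == "Yes").astype(int)

# Compare average characteristics of leavers ("Yes") and stayers ("No").
profile = df.groupby("Attrition")[["Age", "JobSatisfaction"]].mean()

# Attrition rate and headcount by department.
dept_rates = (
    df.groupby("Department")["Left"]
    .agg(["mean", "count"])
    .rename(columns={"mean": "attrition_rate", "count": "n"})
)

print(profile)
print(dept_rates)
```

Reporting the headcount next to each rate matters because a high rate in a very small department may reflect only one or two departures.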

Once you have outlined the basics, develop a multivariate regression model to determine which factors appear to be the most important predictors of Probability of Turnover and/or Turnover (be sure to use a logistic regression model if you focus on the latter). Note that you have a lot of discretion in how you approach this problem, so I am intentionally not providing step-by-step details on what you should do with this project. I want you to show me how you would approach the problem.
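
To make the logistic regression idea concrete, here is a minimal sketch that fits a logistic model by Newton-Raphson using only NumPy. Everything in it is an assumption for illustration: the data are simulated, the two predictors (`age`, `satisfaction`) are hypothetical standardized stand-ins, and `fit_logistic` is a helper written for this example. In practice a statistical package would also report standard errors and p-values, and for the continuous Probability of Turnover outcome an ordinary least squares regression would be the analogous model.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Simulated, standardized predictors (hypothetical stand-ins for real fields).
age = rng.normal(size=n)
satisfaction = rng.normal(size=n)
X = np.column_stack([np.ones(n), age, satisfaction])  # first column = intercept

# Assumed "true" effects, used only to simulate a 0/1 attrition outcome.
true_beta = np.array([-1.0, -0.5, -1.0])
p = 1 / (1 + np.exp(-X @ true_beta))
left = rng.binomial(1, p)


def fit_logistic(X, y, n_iter=25, tol=1e-8):
    """Maximum-likelihood logistic regression via Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1 / (1 + np.exp(-X @ beta))      # predicted probabilities
        w = mu * (1 - mu)                     # variance weights
        grad = X.T @ (y - mu)                 # score (gradient of log-likelihood)
        hess = X.T @ (X * w[:, None])         # observed information matrix
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta


beta_hat = fit_logistic(X, left)

# Odds ratios: multiplicative change in the odds of leaving
# for a one-standard-deviation increase in each predictor.
odds_ratios = np.exp(beta_hat[1:])
print("Coefficients (intercept, age, satisfaction):", beta_hat.round(3))
print("Odds ratios (age, satisfaction):", odds_ratios.round(3))
```

An odds ratio below 1 means that higher values of the predictor are associated with lower odds of leaving. Because the predictors here are standardized, the coefficients can be compared in size, which is one way to judge which factors matter most.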

Please summarize your findings in a double-spaced memo to the CEO of no more than five pages. Any tables, figures, etc., can be placed in an Appendix to your memo.

Appendix

Variable Descriptions