Work in Excel

BUA 6315: Business Analytics for Decision Making

Milestone 2 Regression Analysis Handout: Dataset 1

BUA 6315: Business Analytics for Decision Making Milestone 2: Regression Analysis Guidelines and Rubric

PART 1 Overview: The objective of this milestone is to fit and estimate a regression model to predict the response variable for the same dataset that you selected in Milestone 1. You will document the key and relevant steps or plans and include them in your report and appendices.

Submission Guidelines: Your final submission must be submitted as a 2- to 3-page Microsoft Word document with double spacing, 12-point Times New Roman font, 1-inch margins, and should include the tables in an appendix.

Instructor Feedback: This activity uses an integrated rubric in Blackboard. Students can view instructor feedback in the Grade Center.

Please also watch the following videos for this assignment:

Sample Case Milestone 2 Part 1 Multivariate Regression: https://www.youtube.com/watch?v=KLZtS9BHJJs

Sample Case Milestone 2 Part 2 Logistic Regression: https://www.youtube.com/watch?v=auBwuaHdM64

If you are using the college admission data, follow the instructions below to complete Milestone 2.

Part I: Multivariate Regression Analysis

You will begin by subsetting the three colleges and selecting the Business and Economics college. You will use the sample of enrolled students to best predict a student’s college grade point average. You will use a multivariate regression model for Steps 1, 2, 3, 4, and 5.

Step 1: Estimate Models and Report Your Findings in a Table

First, estimate the following three models for Business and Economics, each regressing college grade point average on a different set of predictors.

● Model 1: Use predictors HSGPA, SAT/ACT, Gender
● Model 2: Use predictors HSGPA, SAT/ACT, Gender, Race (Asian and White)
● Model 3: Use predictors HSGPA, SAT/ACT, Gender, Race (Asian and White), Parent’s education (Edu_parent_1 and Edu_Parent_2)

Hint: Create a new variable Gender_T that equals 1 if “Gender” is F, and 0 otherwise.
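The Gender_T hint can be sketched in a few lines of Python. This is only an illustration: the rows below are made up, the helper name add_dummy is hypothetical, and the real dataset's column layout may differ.

```python
# Hypothetical rows; the actual admission dataset is not reproduced here.
students = [
    {"Gender": "F", "HSGPA": 3.6},
    {"Gender": "M", "HSGPA": 3.1},
    {"Gender": "F", "HSGPA": 3.9},
]

def add_dummy(rows, source, target, positive):
    """Return copies of rows with target = 1 when row[source] == positive, else 0."""
    return [{**row, target: 1 if row[source] == positive else 0} for row in rows]

coded = add_dummy(students, "Gender", "Gender_T", "F")
gender_t = [row["Gender_T"] for row in coded]
print(gender_t)
```

In Excel, the same coding could be done with a formula such as =IF(B2="F",1,0), assuming Gender sits in column B.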

Second, report your results in user-friendly tables, which will be included in the appendix of the Microsoft Word document for your Milestone 2 written report submission. You can find an example in your textbook, Section 6.3 (Table 6.11, 1st edition). For each model, your table should include the parameter estimates and the p-value of each estimate, the standard error of the estimate (Se), R-squared, adjusted R-squared, and the p-value of the F-test. The tables should include a footnote for significance level(s) and any additional information that a reader needs to understand the table. You are encouraged (but not required) to copy and paste the following table template into your report for this step:

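Business and Economics

|                    | Model 1 | Model 2 | Model 3 |
| Intercept          |         |         |         |
| HSGPA              |         |         |         |
| SAT/ACT            |         |         |         |
| Gender             |         |         |         |
| White              | NA      |         |         |
| Asian              | NA      |         |         |
| Edu1               | NA      | NA      |         |
| Edu2               | NA      | NA      |         |
| Standard Error     |         |         |         |
| R-squared          |         |         |         |
| Adjusted R-squared |         |         |         |
| F-test (p-value)   |         |         |         |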

Notes: Parameter estimates are in the top half of the table with p-values in parentheses; * represents significance at the 5% level. NA denotes not applicable. The lower part of the table includes goodness-of-fit measures.
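To see where the goodness-of-fit measures in the lower part of the table come from, here is a minimal Python sketch of an ordinary least squares fit with n observations and k predictors. The data are made up, the function names solve and ols are hypothetical, and p-values are left out because they need t and F distribution functions that this sketch does not implement. Excel's regression output reports them directly.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(xs, y):
    """Fit y = b0 + b1*x1 + ... + bk*xk and return estimates plus fit measures."""
    n, k = len(y), len(xs[0])
    design = [[1.0] + list(row) for row in xs]  # prepend the intercept column
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k + 1)] for i in range(k + 1)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k + 1)]
    beta = solve(xtx, xty)  # normal equations: (X'X) beta = X'y
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in design]
    mean_y = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # unexplained variation
    sst = sum((yi - mean_y) ** 2 for yi in y)               # total variation
    r2 = 1 - sse / sst
    return {
        "beta": beta,                                    # [intercept, b1, ..., bk]
        "se": math.sqrt(sse / (n - k - 1)),              # standard error of the estimate
        "r2": r2,
        "adj_r2": 1 - (1 - r2) * (n - 1) / (n - k - 1),  # penalizes extra predictors
        "f_stat": (r2 / k) / ((1 - r2) / (n - k - 1)),   # undefined for a perfect fit
    }

# Made-up rows for illustration only: (HSGPA, SAT in hundreds, Gender_T)
predictors = [
    (3.2, 11.0, 1), (3.8, 12.5, 0), (2.9, 10.2, 0), (3.5, 13.0, 1),
    (3.9, 14.1, 1), (3.0, 11.8, 0), (3.6, 12.0, 1), (3.3, 10.9, 0),
]
college_gpa = [3.0, 3.5, 2.6, 3.4, 3.8, 2.9, 3.3, 3.0]
model_1 = ols(predictors, college_gpa)
print(model_1)
```

Adjusted R-squared is the measure to compare across Models 1 to 3, because plain R-squared never decreases when predictors are added.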

Step 2: Select the Best Model

Choose the best model for the chosen college based on your results in Step 1. Explain why the model you have chosen is the best in a Microsoft Word document for your Milestone 2 written report.

Step 3: Check for Violations of Model Assumptions

Using the models you chose for each college in Step 2, plot the residuals against the predicted college grade point average and determine whether any of the assumptions of the linear regression model are violated. If any of the assumptions are violated, describe ways to solve this problem in your written report (see Section 6.4: Model Assumptions and Common Violations). If there are no violations, be sure to state that in your written report.

Note: You do not need to submit your residual graph for this step as part of your milestone submission.

Step 4: Check for Multicollinearity

Using the same models for each college, check the R-squared and F statistics, and determine whether multicollinearity may be an issue. Additionally, examine the correlations between the predictor variables. If there is multicollinearity, drop one of the collinear variables. Explain in your report whether multicollinearity is an issue in either college subset, and if not, be sure to explain why.

Step 5: Test the Significance of Coefficients

Using the same models for each college, determine which predictor variables are significant at the 5% significance level, and interpret R-squared and coefficient estimates. Be sure to explain these in your written report.
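For the correlation check in Step 4, the following Python sketch computes pairwise Pearson correlations between predictors and flags strongly correlated pairs. The data are made up, the helper names are hypothetical, and the 0.8 cutoff is an illustrative assumption, not a value given in this handout.

```python
import math

def pearson(a, b):
    """Sample Pearson correlation between two equal-length numeric lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)  # fails if either column is constant

def flag_collinear(columns, cutoff=0.8):
    """Return predictor pairs whose absolute correlation exceeds the cutoff."""
    names = list(columns)
    flagged = {}
    for i, p in enumerate(names):
        for q in names[i + 1:]:
            r = pearson(columns[p], columns[q])
            if abs(r) > cutoff:
                flagged[(p, q)] = r
    return flagged

# Made-up predictor columns for illustration only.
predictors = {
    "HSGPA": [3.2, 3.8, 2.9, 3.5, 3.9, 3.0],
    "SAT": [1100, 1250, 1020, 1300, 1410, 1180],
    "Gender_T": [1, 0, 0, 1, 1, 0],
}
print(flag_collinear(predictors))
```

A flagged pair is only a prompt to look closer. The decision to drop a variable should still rest on the R-squared, F statistic, and coefficient significance described in the steps above.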

Part II: Logistic Regression

Next, you will develop a logistic regression model for predicting the probability of admission for the whole dataset. You need to transform “Admitted” from categorical to numerical.

Hint: Create a new variable Admitted_T that equals 1 if “Admitted” is Yes, and 0 otherwise.

Step 6: Select the Best Model

Estimate the following three logistic regression models with admitted as the response variable.

● Model 1: Use predictors HSGPA, SAT/ACT, Gender
● Model 2: Use predictors HSGPA, SAT/ACT, Gender, Race (Asian and White)
● Model 3: Use predictors HSGPA, SAT/ACT, Gender, Race (Asian and White), Parent’s education (Edu_parent_1 and Edu_Parent_2)
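The hold-out comparison and accuracy rate in Steps 6 and 7 can be sketched in Python as follows. Everything here is an assumption for illustration: the data are simulated rather than taken from the admission dataset, HSGPA and SAT are assumed to be standardized already, the model is fitted by plain gradient ascent rather than by whatever routine your software uses, and the 0.5 classification cutoff is a common default rather than a requirement of this handout.

```python
import math
import random

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(xs, y, lr=0.5, epochs=500):
    """Gradient ascent on the mean log-likelihood; returns [intercept, b1, ..., bk]."""
    n, k = len(y), len(xs[0])
    beta = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for row, yi in zip(xs, y):
            err = yi - sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], row)))
            grad[0] += err
            for j, v in enumerate(row):
                grad[j + 1] += err * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

def accuracy(beta, xs, y, cutoff=0.5):
    """Share of rows whose predicted class (probability >= cutoff) matches y."""
    hits = 0
    for row, yi in zip(xs, y):
        p = sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], row)))
        hits += int((1 if p >= cutoff else 0) == yi)
    return hits / len(y)

def holdout_split(xs, y, train_share=0.7, seed=1):
    """Shuffle row indices, then cut them into training and validation sets."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    cut = round(train_share * len(idx))
    def pick(ids):
        return [xs[i] for i in ids], [y[i] for i in ids]
    return pick(idx[:cut]), pick(idx[cut:])

def columns(rows, cols):
    """Keep only the listed predictor columns."""
    return [[row[c] for c in cols] for row in rows]

# Simulated stand-in data (NOT the admission dataset). Column order:
# HSGPA_z, SAT_z, Gender_T, White, Asian, Edu_Parent_1, Edu_Parent_2
rng = random.Random(0)
xs, admitted = [], []
for _ in range(200):
    row = [rng.gauss(0, 1), rng.gauss(0, 1)] + [rng.randint(0, 1) for _ in range(5)]
    p = sigmoid(-0.2 + 1.5 * row[0] + 1.0 * row[1])
    xs.append(row)
    admitted.append("Yes" if rng.random() < p else "No")

admitted_t = [1 if a == "Yes" else 0 for a in admitted]  # the Admitted_T hint
(train_x, train_y), (valid_x, valid_y) = holdout_split(xs, admitted_t)

models = {"Model 1": [0, 1, 2], "Model 2": [0, 1, 2, 3, 4], "Model 3": [0, 1, 2, 3, 4, 5, 6]}
valid_accuracy = {}
for name, cols in models.items():
    beta = fit_logistic(columns(train_x, cols), train_y)  # fit on the 70% training set
    valid_accuracy[name] = accuracy(beta, columns(valid_x, cols), valid_y)  # score on the 30%

best = max(valid_accuracy, key=valid_accuracy.get)
print(valid_accuracy, "best:", best)
```

Each model is fitted on the training rows only and scored on the validation rows, so the accuracy rate reflects performance on data the model has not seen.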

Choose the best model for predicting the probability of admission using the hold-out method for the whole dataset (70% training set, 30% validation set). Explain why the model you have chosen is the best in your written report.

Step 7: Determine the Accuracy Rate of Your Best Model

Next, report the accuracy rate of your best model in your written report.

Step 8: Recommendations and Suggestions

College admission can be stressful for both students and parents as there is no magic formula when it comes to admission decisions. Just as prospective students are anxious about receiving an acceptance letter, most colleges are concerned about meeting their enrollment targets. For this step, address the following in your written report to make recommendations and suggestions for colleges making admission decisions:

● Consider your regression analysis where you regress student’s college GPA on several predictor variables. Based on your findings, explain how the university can use this information to make decisions about which students will be successful if admitted to each college. Should universities focus more on high school GPA or SAT scores or both to admit students?

● Consider the probability model for admission. What factors should the university take into account to predict the probability of admission? How can the university use this information to make decisions about its admission target?

Step 9: Finalize Your Written Report

To complete this milestone, finalize your milestone written report. Your report must be written in essay format (with an introduction, body, and conclusion), and summarize your findings in a way that a non-technical person can understand. You can find examples of a well-written report in your textbook in the “Writing with Big Data” section at the end of each chapter. The content of your written report will be assessed based on the following criteria:

● Your written report must be well-presented and argued
● Your ideas should be detailed, developed, and supported with evidence, data analysis, tables, and figures as appropriate
● A non-technical audience must be able to easily understand the content.

Submission Guidelines: Your final submission must be submitted as a 2- to 3-page Microsoft Word document with double spacing, 12-point Times New Roman font, 1-inch margins, and should include the tables in an appendix. See the Milestone 2 Guidelines and Rubric document, available in Blackboard for more information.