Big Data project

For the big data project, you can work alone or in a group of no more than 5. The purpose of the project is for you to gain experience in applying the statistical method of multiple regression.

Submission Start Date – December 6, 2019

Submission Due Date – December 12, 2019

Questions and lab hours: normal class meetings and email communications.

Project Description

The general description of this project stems from research on self-efficacy and its relationship with academic performance and persistence.

The standard project is to use multiple regression analysis to analyze a data set. The data set comes from a study of student persistence (enrolling in the next semester) and records Gender, Age, GPA, responses to a 22-question self-efficacy questionnaire, and student enrollment status.

The educational researcher wants to study how student enrollment status relates to gender, age, GPA, and the total response to the 22-question survey.
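The relationship described above has the general form Enrolled = b0 + b1*Gender + b2*Age + b3*GPA + b4*TotalQ. As a rough cross-check outside Excel, the following is a minimal sketch of fitting such a model by ordinary least squares and producing the residuals. It is not the required method for the project. The function names (solve_linear_system, fit_ols) and the eight data rows are hypothetical and made up for illustration; they are not taken from the project PDF.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(x_rows, y):
    """Fit y = b0 + b1*x1 + ... by least squares (normal equations).

    Returns (coefficients, fitted values, residuals); coefficients[0]
    is the intercept.
    """
    design = [[1.0] + list(row) for row in x_rows]  # add intercept column
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    coefs = solve_linear_system(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coefs, r)) for r in design]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    return coefs, fitted, residuals


# Made-up rows for illustration only: (Gender, Age, GPA, TotalQ).
predictors = [
    (0, 0, 3.4, 90), (1, 0, 2.1, 60), (0, 1, 3.8, 101), (1, 1, 2.9, 75),
    (0, 0, 1.9, 52), (1, 0, 3.1, 83), (0, 1, 2.4, 66), (1, 1, 3.6, 95),
]
enrolled = [1, 0, 1, 1, 0, 1, 0, 1]  # made-up enrollment status

coefs, fitted, residuals = fit_ols(predictors, enrolled)
print("coefficients (intercept first):", coefs)
print("residuals:", residuals)
```

The residuals are the observed enrollment status minus the fitted value for each student, which is the same quantity Excel lists in its residual output.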

Data Sets

The data set covers 132 students at a 2-year college. The data is in PDF format, so it must be typed into Excel or converted to Excel.

Use the following definitions to prepare the data. The qualitative variables are coded as 0 or 1.

Gender – 0 represents female, 1 represents male

Age – 0 represents student age (18 – 25), 1 represents student age (26 or older)

GPA – student’s current grade point average

Total Q – total of the 22 questions (all question responses come from a scale of 1 – 5, where 5 is high)

Enrollment status – 0 represents not enrolled, 1 represents enrolled.
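The coding above can be sketched in a few lines. This is a minimal illustration, assuming the raw values arrive as letters, ages in years, and yes/no answers; the actual layout of the PDF may differ. The field names, the function names (encode_row, find_problems), the sample rows, and the 4.0 GPA ceiling are all assumptions made for the example. The Total Q limits follow from the scale itself: 22 questions scored 1 to 5 give totals between 22 and 110.

```python
# Hypothetical raw rows, made up for illustration (not project data).
raw_rows = [
    {"gender": "F", "age": 19, "gpa": 3.2, "total_q": 88, "enrolled": "yes"},
    {"gender": "M", "age": 31, "gpa": 2.7, "total_q": 64, "enrolled": "no"},
    # Deliberately wrong inputs to show the checks below.
    {"gender": "F", "age": 22, "gpa": 5.6, "total_q": 130, "enrolled": "yes"},
]


def encode_row(row):
    """Apply the 0/1 coding to one raw row."""
    return {
        "Gender": 1 if row["gender"] == "M" else 0,    # 0 = female, 1 = male
        "Age": 1 if row["age"] >= 26 else 0,           # 0 = 18-25, 1 = 26 or older
        "GPA": float(row["gpa"]),
        "TotalQ": int(row["total_q"]),
        "Enrolled": 1 if row["enrolled"] == "yes" else 0,  # 1 = enrolled
    }


def find_problems(coded, max_gpa=4.0):
    """Return a list of range problems for one coded row.

    max_gpa=4.0 is an assumption about the grading scale.
    """
    problems = []
    if not 0.0 <= coded["GPA"] <= max_gpa:
        problems.append("GPA out of range")
    if not 22 <= coded["TotalQ"] <= 110:  # 22 questions, each scored 1-5
        problems.append("TotalQ out of range")
    return problems


coded_rows = [encode_row(r) for r in raw_rows]
for number, coded in enumerate(coded_rows, start=1):
    print(number, coded, find_problems(coded))
```

Rows flagged by a check like this still need to be compared against the PDF by hand, since a range check cannot tell what the correct value should be.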

Items to turn in

1. BigDataProjectTemplate Excel Sheet with the following:

a. Group Names in tab 1: list the group member names if you are not working alone.

b. Capture the data from the PDF; the data should be in the Data tab. Review and correct the data, since it contains some wrong inputs.

c. The data setup for the multiple regression should be included in the tab named Data Setup for Regression.

d. Include a residual output. This may require research (see examples of how to use Multiple Linear Regression using Microsoft Excel).

2. The following questions are included in the BigDataProjectTemplate sheet in the tab named Questions and Answers.

a. The estimated multiple regression analysis equation.

b. Does the model work? Research this question: use the Significance F value, which is the p-value of the overall F-test, and compare it with your chosen significance level.

c. How well does the model work? Research this question using p-values and R Square.

d. Which variables contribute to the model? Research this question using p-values.

e. General interpretation of the data and the data analysis.

3. Upload the BigDataProjectTemplate with the file name BigDataProject followed by your name.

4. Each group member should submit an individual upload.
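For the Questions and Answers tab in item 2, the quantities behind questions a through d can be sketched as follows. This is a minimal illustration under stated assumptions, not a substitute for the Excel output. The function names, the tiny data set, the placeholder p-values, and the 0.05 significance level are all hypothetical. The identity SS_regression = SS_total - SS_residual used here holds only when the fitted values come from a least-squares fit that includes an intercept.

```python
def format_equation(names, coefs, response="Enrolled_hat"):
    """Write the estimated equation; coefs[0] is the intercept."""
    terms = [f"{coefs[0]:.4f}"]
    for name, c in zip(names, coefs[1:]):
        sign = "+" if c >= 0 else "-"
        terms.append(f"{sign} {abs(c):.4f}*{name}")
    return f"{response} = " + " ".join(terms)


def fit_summary(y, fitted, num_predictors):
    """Return (R Square, Adjusted R Square, F statistic)."""
    n = len(y)
    mean_y = sum(y) / n
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_resid = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_reg = ss_total - ss_resid          # valid for OLS with an intercept
    df_reg = num_predictors
    df_resid = n - num_predictors - 1
    r2 = ss_reg / ss_total
    adj_r2 = 1 - (1 - r2) * (n - 1) / df_resid
    f_stat = (ss_reg / df_reg) / (ss_resid / df_resid)
    return r2, adj_r2, f_stat


def significant_terms(p_values, alpha=0.05):
    """Names whose p-value is below alpha (alpha=0.05 is an assumption)."""
    return [name for name, p in p_values.items() if p < alpha]


# Made-up one-predictor example: y observed, fitted from a least-squares line.
y = [1, 3, 2, 4]
fitted = [1.3, 2.1, 2.9, 3.7]
r2, adj_r2, f_stat = fit_summary(y, fitted, num_predictors=1)
print("R Square:", r2, "Adjusted R Square:", adj_r2, "F:", f_stat)

# Placeholder coefficients and p-values, NOT results from the project data.
print(format_equation(["Gender", "GPA"], [0.5, -0.1, 0.25]))
print(significant_terms({"Gender": 0.40, "GPA": 0.01}))
```

Significance F is the upper-tail probability of the F statistic with (number of predictors, n - number of predictors - 1) degrees of freedom. Excel reports it directly, so the sketch stops at the F statistic rather than computing that probability.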