Big Data project
For the big data project, you can work alone or in a group of no more than 5. The purpose of the project is for you to gain experience in applying the statistical method of multiple regression.
Submission Start Date – December 6, 2019
Submission Due Date – December 12, 2019
Questions and lab hours: normal class meetings and email communications.
Project Description
The general description of this project stems from research associated with self-efficacy and its relationship with academic performance and persistence.
The standard project is to use multiple regression analysis to analyze a data set. The data set comes from a study of student persistence (enrolling in the next semester) based on gender, age, GPA, a 22-question questionnaire on self-efficacy, and student enrollment status.
The educational researcher wants to study how student enrollment status relates to gender, age, GPA, and the total response to the 22-question survey.
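One way to write the model described above, assuming the predictors enter linearly. The coefficient symbols b_0 through b_4 are illustrative and do not appear in the assignment:

```latex
\[
\widehat{\text{Enrollment}} = b_0 + b_1\,\text{Gender} + b_2\,\text{Age} + b_3\,\text{GPA} + b_4\,\text{Total Q}
\]
```

Here b_0 is the intercept and b_1 through b_4 are the slopes estimated from the data. Because enrollment status is coded 0 or 1, the fitted value can be read loosely as a predicted tendency to enroll.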
Data Sets
The data consist of 132 students at a 2-year college. The data are in PDF format, so they must be entered into Excel or converted to Excel.
Use the following variable definitions to prepare the data; the qualitative variables are coded as 0 or 1.
Gender – 0 represents female, 1 represents male
Age – 0 represents student age (18 – 25), 1 represents student age (26 or older)
GPA – student’s current grade point average
Total Q – total of the 22 questions (all question responses come from a scale of 1 – 5, where 5 is high)
Enrollment status – 0 represents not enrolled, 1 represents enrolled.
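The coding scheme above can be sketched in Python as follows. This is only an illustration: the raw field names (`gender`, `age`, `answers`, `enrolled`) and the two sample records are hypothetical, not taken from the project's data set, and the project itself expects the setup to be done in Excel.

```python
# Hypothetical raw records; field names and values are illustrative only,
# not taken from the project's data set.
raw_students = [
    {"gender": "F", "age": 19, "gpa": 3.2, "answers": [4] * 22, "enrolled": "yes"},
    {"gender": "M", "age": 31, "gpa": 2.5, "answers": [3] * 22, "enrolled": "no"},
]


def encode_student(record):
    """Apply the 0/1 coding scheme listed above to one raw record."""
    answers = record["answers"]
    if len(answers) != 22 or any(a < 1 or a > 5 for a in answers):
        raise ValueError("expected 22 responses, each on the 1-5 scale")
    return {
        "Gender": 0 if record["gender"] == "F" else 1,   # 0 = female, 1 = male
        "Age": 1 if record["age"] >= 26 else 0,          # 0 = 18-25, 1 = 26 or older
        "GPA": record["gpa"],                            # kept as a number
        "Total Q": sum(answers),                         # sum of the 22 responses
        "Enrollment": 1 if record["enrolled"] == "yes" else 0,
    }


encoded = [encode_student(r) for r in raw_students]
for row in encoded:
    print(row)
```

Since each of the 22 responses lies between 1 and 5, Total Q always falls between 22 and 110, which is a quick sanity check for data entry errors.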
Items to turn in
1. BigDataProjectTemplate Excel Sheet with the following:
a. Group Names in tab 1: list the group members' names if you are not working alone.
c. The data setup for the multiple regression should be included in the tab named Data Setup for Regression.
d. Include the residual output; this may require some research (see examples of how to run multiple linear regression in Microsoft Excel).
2. The following questions are included in the BigDataProjectTemplate sheet in the tab named Questions and Answers.
a. The estimated multiple regression analysis equation.
b. Does the model work? Research this question; use the Significance F value, which is the p-value of the overall F test, and compare it with a chosen significance level (such as 0.05).
c. How well does the model work? Research this question using p-values and R Square.
d. Which variables contribute to the model? Research this question using p-values.
e. General interpretation of the data and the data analysis.
3. Upload the BigDataProjectTemplate as a file named BigDataProject followed by your name.
4. Each group member should upload the file individually.
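As a cross-check on the Excel regression output, the quantities asked for above (the estimated equation, the residuals, R Square, and the F statistic) can be sketched in plain Python. The eight rows below are made-up illustrative values, not the project's data, and the helper names `solve` and `fit_ols` are hypothetical. The sketch stops at the F statistic: Significance F and the coefficient p-values need the F and t distributions, which the Python standard library does not provide.

```python
# Illustrative data only (not the project's data set).
# Columns: Gender, Age, GPA, Total Q; response: Enrollment (0/1).
X = [
    [0, 0, 3.4, 92], [1, 0, 2.1, 60], [0, 1, 3.0, 85], [1, 1, 1.9, 55],
    [0, 0, 2.8, 70], [1, 0, 3.6, 99], [0, 1, 2.2, 64], [1, 1, 3.1, 88],
]
y = [1, 0, 1, 0, 1, 1, 0, 1]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def fit_ols(X, y):
    """Least-squares fit with an intercept via the normal equations."""
    rows = [[1.0] + [float(v) for v in r] for r in X]  # prepend intercept column
    n, p = len(rows), len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    coef = solve(XtX, Xty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]  # observed minus predicted
    y_mean = sum(y) / n
    sse = sum(e * e for e in residuals)                 # residual sum of squares
    sst = sum((yi - y_mean) ** 2 for yi in y)           # total sum of squares
    r_square = 1 - sse / sst
    k = p - 1                                           # number of predictors
    f_stat = ((sst - sse) / k) / (sse / (n - k - 1))    # overall F statistic
    return coef, residuals, r_square, f_stat


coef, residuals, r_square, f_stat = fit_ols(X, y)
names = ["Intercept", "Gender", "Age", "GPA", "Total Q"]
for name, value in zip(names, coef):
    print(f"{name}: {value:.4f}")
print(f"R Square: {r_square:.4f}")
print(f"F statistic: {f_stat:.4f}")
print("Residuals:", [round(e, 4) for e in residuals])
```

With the real data, the printed coefficients should agree with the Coefficients column in Excel's regression output, and the residual list with its residual output, up to rounding.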