Final Project

FinalProjectReq.docx

For your final course project, you will conduct a data analysis on a problem of your choice. This gives you the opportunity to dive deeper into an analytical topic that you would like to learn more about.

Note: Depending on class size, I may choose to use small groups for this project. Group rules (such as group size, speaking roles for the presentation, etc.) will be defined in class and in an announcement.

Preferably, pick a data set from your place of work. While there are no defined rules as to the size of the data set, it should be large enough to support meaningful analyses. As a guideline, your analysis should address each of the following components:

Introduction to Data Set

· Quantity and types of variables (continuous, discrete)

· Quantity of rows/observations

· Meaning of variables

· Where you found the data
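
As a rough illustration of the introduction items above, the following sketch counts rows and assigns a coarse type label to each variable. It uses only the Python standard library. The sample data, the column names, and the helper names `infer_type` and `describe_dataset` are all hypothetical, and the type rule (whole numbers are "discrete", other numbers "continuous", everything else "categorical") is a simplifying assumption rather than a definitive classification.

```python
import csv
import io


def infer_type(values):
    """Label a column as categorical, discrete, or continuous (rough heuristic)."""
    try:
        numbers = [float(v) for v in values]
    except ValueError:
        return "categorical"
    # Assumption: numeric columns holding only whole numbers are treated as discrete.
    if all(n.is_integer() for n in numbers):
        return "discrete"
    return "continuous"


def describe_dataset(csv_text):
    """Return the number of observations and a type label for each variable."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    types = {}
    for name in rows[0].keys():
        # Skip empty cells so missing values do not affect the type label.
        values = [r[name] for r in rows if r[name] != ""]
        types[name] = infer_type(values)
    return len(rows), types


# Hypothetical sample data, for illustration only.
sample = (
    "region,units_sold,revenue\n"
    "North,12,340.50\n"
    "South,7,199.99\n"
    "East,,410.00\n"
)

n_rows, column_types = describe_dataset(sample)
print(n_rows, column_types)
```

A summary like this can go straight into the introduction section of the report, alongside a plain-language description of what each variable means.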

Data Preparation

· Did you examine the data initially (e.g., for missing values, outliers, etc.)?

· Did you create initial plots and graphs to examine distributional shapes, etc.?
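
The initial checks described above can be sketched as follows. This is a minimal example, assuming missing values are stored as `None` and using the common 1.5 × IQR rule to flag outliers; other outlier rules are equally valid. The column `monthly_sales` and the helper `summarize_column` are hypothetical names used only for illustration.

```python
import statistics


def summarize_column(values):
    """Count missing entries and flag outliers with the 1.5 * IQR rule."""
    present = [v for v in values if v is not None]
    missing = len(values) - len(present)

    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

    outliers = [v for v in present if v < low or v > high]
    return {"missing": missing, "outliers": outliers}


# Hypothetical column with two missing entries and one suspicious value.
monthly_sales = [12.0, 15.5, None, 14.2, 13.8, 98.0, 16.1, None, 15.0]

report = summarize_column(monthly_sales)
print(report)
```

Flagged values are candidates for a closer look, not automatic deletions. Your report should state how you handled them and why.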

Main research question you are trying to answer

Main Data Analysis

· Consider multiple regression, time-series analysis, or data mining techniques (clustering, classification, etc.)

· Other techniques are OK if approved by the instructor

· You may also consider techniques not covered in class. For example, you might consider logistic regression for a binary outcome variable.

· Show output from analysis
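
To make the logistic regression example above concrete, here is a minimal sketch that fits a one-predictor logistic model by plain batch gradient descent, using only the Python standard library. The data (`hours`, `passed`), the function names, and the learning-rate and epoch settings are assumptions chosen for illustration. In practice a statistics package would handle the fitting and also report standard errors and fit diagnostics.

```python
import math


def sigmoid(z):
    """Map a real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit P(y = 1 | x) = sigmoid(b0 + b1 * x) by batch gradient descent."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(b0 + b1 * x) - y  # gradient of the log-loss w.r.t. the linear term
            g0 += err
            g1 += err * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1


# Hypothetical data: hours of training vs. pass (1) / fail (0).
hours = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
passed = [0, 0, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(hours, passed)
p_low = sigmoid(b0 + b1 * 1.0)   # predicted pass probability at 1 hour
p_high = sigmoid(b0 + b1 * 4.0)  # predicted pass probability at 4 hours
print(f"intercept={b0:.3f}, slope={b1:.3f}, p(1h)={p_low:.2f}, p(4h)={p_high:.2f}")
```

When you show output from an analysis like this, include the fitted coefficients and a sentence interpreting each one (here, a positive slope means more hours are associated with a higher probability of passing).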

Conclusions/Findings

· What does the output mean?

· How can it benefit the business?

· Future considerations for research with this data set (or in this general business area)

Note: If you cannot find an applicable data set, plenty are available on the web. One good place to look is the UCI Machine Learning Repository.