Data Analytics Project


PROJECT INSTRUCTIONS

THE PROCESS

1. SELECT a domain area of research (COVID-19 related).
2. FORMULATE a problem statement and hypothesis. Describe in detail the problem you wish to explore.
3. FRAME the question(s) according to your domain:
   a. Understand the business.
   b. Understand the stakeholders.
4. OBTAIN data for your project:
   a. Describe the data: information about the dataset itself, e.g., the attributes and attribute types, the number of instances, and your target variable.
5. SCRUB the data. This includes cleaning and preparing the data for analytic purposes.
6. ANALYZE the data, looking for patterns and insights (EDA and analytics). Use Jupyter Notebook.
7. SUMMARIZE your findings.
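As a rough illustration of steps 4 to 6, the sketch below describes, scrubs, and summarizes a tiny dataset using only the Python standard library. The sample rows, column names (`region`, `date`, `new_cases`, `new_deaths`), and helper functions are hypothetical and exist only for this example. A real project would load its own dataset and might prefer a dataframe library.

```python
import csv
import io
import statistics

# Hypothetical sample standing in for a real COVID-19 dataset; all values are made up.
RAW_CSV = """region,date,new_cases,new_deaths
North,2020-04-01,120,3
North,2020-04-02,,4
South,2020-04-01,80,1
South,2020-04-02,95,
South,2020-04-02,95,
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, one per instance."""
    return list(csv.DictReader(io.StringIO(text)))


def describe(rows):
    """Report instance count, attribute names, and missing values per attribute."""
    attributes = list(rows[0].keys()) if rows else []
    missing = {a: sum(1 for r in rows if r[a] == "") for a in attributes}
    return {"instances": len(rows), "attributes": attributes, "missing": missing}


def scrub(rows, numeric_fields):
    """Drop exact duplicate rows and rows with missing numeric values, then cast to int."""
    seen = set()
    cleaned = []
    for r in rows:
        key = tuple(r.items())
        if key in seen:
            continue  # exact duplicate of an earlier row
        seen.add(key)
        if any(r[f] == "" for f in numeric_fields):
            continue  # incomplete row; dropping is only one possible strategy
        cleaned.append({**r, **{f: int(r[f]) for f in numeric_fields}})
    return cleaned


def summarize(rows, field):
    """Compute basic summary statistics for one numeric attribute."""
    values = [r[field] for r in rows]
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
    }


rows = load_rows(RAW_CSV)
info = describe(rows)
clean = scrub(rows, ["new_cases", "new_deaths"])
summary = summarize(clean, "new_cases")
print(info)
print(summary)
```

Dropping incomplete rows is a choice made here for brevity. Depending on the dataset, imputing missing values or keeping partial rows may be more appropriate, and the report should state which approach was taken and why.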

THE DELIVERABLES

● A project report document (APA formatted), 5-8 pages in length, not including the title page, contents page, images/graphics, or references. The report should have the following sections:

○ Introduction: this is where you briefly describe your personal motivation for the project and the framing question. Tell the reader why they should care about the results you are about to present and why the question you will be answering is important. Include a description of your dataset: what type of data it contains, how many attributes, and how many instances. Note any additional challenges, such as messy or missing values.

○ Data Analysis: this is where you describe your data (summary statistics, EDA) and explain the methods you used to analyze it. Discuss how each method works, why it was well suited to your data, and how you applied it.

○ Results: this is where you describe and explain your findings. Why do you think you found the results you did and what do you think they mean?

○ Conclusion: this should provide a concise answer to the analytical question posed in the introduction, along with a brief explanation of why the analysis produced that answer, which should be consistent with your results section. Additionally, you may wish to pose questions raised by your analysis for future study.

○ Reflection: conduct a self-reflection for each of the phases in The Process section above to uncover key learnings that you can apply to future projects or share with your colleagues. Document at least 4 key learnings from your reflection. Describe the observations, challenges, lucky breaks, emotions, etc. that you experienced during a specific phase.

● Do not just provide diagrams and statistics. Each table and figure included must have a caption (e.g., a figure number and textual description) and must be referenced from the text (e.g., “Figure 2 shows a frequency diagram for ...”).

● You should also provide your source code as a well-documented and formatted Jupyter Notebook, along with the dataset files.

DATA

You could look for data on the following sites:

● Data.gov
● kaggle.com
● Google Dataset Search

GRADING

This assignment is worth 100 points, which is 10% of your final grade. Your assignment will be evaluated based on a successful compilation and adherence to the project requirements. Grading criteria:

● 50 pts for project report
● 50 pts for Python implementation (Use Jupyter Notebook)

SUBMISSION

You should submit a well-documented Jupyter Notebook and the dataset files. Submit both the .ipynb and .docx files, and name them First_Lastname_FinalProject.xxx, where xxx is the file extension.
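Before submitting, the naming convention can be checked with a small script. The sketch below is only an illustration: the regular expression is one assumed reading of First_Lastname_FinalProject.xxx (letters and hyphens in each name part), and the function names are hypothetical. Dataset files are not covered by this check.

```python
import re

# Assumed reading of the convention: First_Lastname_FinalProject.<ipynb|docx>
NAME_PATTERN = re.compile(r"[A-Za-z-]+_[A-Za-z-]+_FinalProject\.(ipynb|docx)")

REQUIRED_EXTENSIONS = {"ipynb", "docx"}


def is_valid_submission_name(filename):
    """Return True if the whole filename matches the assumed naming pattern."""
    return NAME_PATTERN.fullmatch(filename) is not None


def missing_extensions(filenames):
    """Return the required extensions that have no correctly named file."""
    found = {
        name.rsplit(".", 1)[1]
        for name in filenames
        if is_valid_submission_name(name)
    }
    return sorted(REQUIRED_EXTENSIONS - found)


# Example with a hypothetical student name.
print(is_valid_submission_name("Jane_Doe_FinalProject.ipynb"))
print(missing_extensions(["Jane_Doe_FinalProject.ipynb"]))
```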