Info Project 2

danya.alsinan
Note: Items in red are updates from Project 1 document.

The most common error people make when starting data analysis is coming up with questions to ask before they are intimately familiar with the data!!!

· Remember: Getting the answer is the easy part. The hard part is trying to figure out the right questions to ask.

· Data analysis is an iterative and incremental task.

· You will not be able to do it well if you just make one pass at the data.

1. Explore and get to know the data

· Make sure you understand what each column stands for.

· Identify the data type and unit ($, lbs., etc.) for each column.

· Choose a few records (rows) and go through the data and see how the columns may or may not be related to each other.
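The exploration step above can also be sketched outside Excel. The following is a minimal Python illustration, not part of the assignment: the records, the column names (`price_usd`, `weight_lbs`, `units_sold`), and the helpers `guess_type` and `profile_columns` are all hypothetical. It guesses a data type for every cell and tallies the guesses per column, so mixed or blank columns stand out before any questions are asked.

```python
from collections import Counter

# Hypothetical sample records; column names and values are invented for illustration.
records = [
    {"product": "Widget", "price_usd": "19.99", "weight_lbs": "2.5", "units_sold": "14"},
    {"product": "Gadget", "price_usd": "5.00", "weight_lbs": "0.8", "units_sold": "120"},
    {"product": "Gizmo", "price_usd": "12.50", "weight_lbs": "", "units_sold": "33"},
]


def guess_type(value):
    """Classify one raw cell as 'blank', 'integer', 'decimal', or 'text'."""
    text = value.strip()
    if text == "":
        return "blank"
    try:
        int(text)
        return "integer"
    except ValueError:
        pass
    try:
        float(text)
        return "decimal"
    except ValueError:
        return "text"


def profile_columns(rows):
    """Count, for each column, how many cells fall into each guessed type."""
    profile = {}
    for row in rows:
        for column, value in row.items():
            profile.setdefault(column, Counter())[guess_type(value)] += 1
    return profile


column_profile = profile_columns(records)
for column, type_counts in column_profile.items():
    print(column, dict(type_counts))
```

A column whose tally shows more than one type (here `weight_lbs` has a blank cell) is worth a closer look before moving on.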

2. CHECK DATA QUALITY!!!!!

· Use Filters, Sort, and Remove Duplicates to check data quality.

· Look for misspellings and synonyms to consolidate data.

· Use Error Checking to find formula errors.

· Look for values that just don’t make sense.

You should constantly be checking for data quality issues.

· You may have to make changes as you are doing the analysis.
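As a rough sketch of the same quality checks in Python (an illustration only; the handout itself uses Excel's Filters, Sort, and Remove Duplicates), the example below consolidates spellings and synonyms, removes duplicate rows, and flags a value that does not make sense. The rows, the `STATE_SYNONYMS` map, and the helper names are assumptions made up for this example.

```python
# Hypothetical rows: (customer, state, order_total_usd). All values are invented.
rows = [
    ("Acme Corp", "CO", 250.0),
    ("Acme Corp", "CO", 250.0),        # exact duplicate of the row above
    ("acme corp ", "Colorado", 75.5),  # same customer and state, written differently
    ("Birch Ltd", "WY", -40.0),        # a negative order total does not make sense
    ("Cedar Inc", "CO", 310.0),
]

# Assumed synonym map; in practice you would build it after sorting and filtering the column.
STATE_SYNONYMS = {"colorado": "CO", "wyoming": "WY"}


def clean_row(row):
    """Trim and standardize the text fields so equivalent values match."""
    customer, state, total = row
    customer = " ".join(customer.split()).title()
    state = STATE_SYNONYMS.get(state.strip().lower(), state.strip().upper())
    return (customer, state, total)


def remove_duplicates(items):
    """Keep the first occurrence of each row, preserving the original order."""
    seen = set()
    unique = []
    for item in items:
        if item not in seen:
            seen.add(item)
            unique.append(item)
    return unique


cleaned = remove_duplicates([clean_row(r) for r in rows])
suspicious = [r for r in cleaned if r[2] < 0]  # totals below zero need a second look

print(len(rows), "rows before cleaning,", len(cleaned), "after")
print("Rows to investigate:", suspicious)
```

Note that the suspicious row is flagged, not deleted. Whether a negative total is a refund or a typo is a judgment you make after looking at the record.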

3. Do your BASIC analysis

For all types of data, use:

· Pivot tables/charts to examine both individual variables and relationships between variables

· Frequency Distributions and Cross Tabulations (single and multiple variables)

· Both Relative (%) and Absolute (#)
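To make the absolute versus relative distinction concrete, here is a small Python sketch (an illustration, not the Excel workflow the project uses). The survey rows and the helpers `frequency_table` and `cross_tab` are hypothetical. The first helper reports each value's count (#) and share (%); the second counts combinations of two variables, which is what a cross tabulation holds.

```python
from collections import Counter

# Hypothetical survey rows: (region, plan). Values are invented for illustration.
responses = [
    ("East", "Basic"), ("East", "Premium"), ("East", "Basic"),
    ("West", "Premium"), ("West", "Premium"), ("West", "Basic"),
    ("East", "Basic"), ("West", "Premium"),
]


def frequency_table(values):
    """Return {value: (count, percent of total)} for a single variable."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: (count, 100.0 * count / total) for value, count in counts.items()}


def cross_tab(pairs):
    """Return {(row_value, column_value): count} for two variables."""
    return Counter(pairs)


plan_freq = frequency_table(plan for _, plan in responses)
region_by_plan = cross_tab(responses)

for plan, (count, percent) in plan_freq.items():
    print(f"{plan}: {count} ({percent:.1f}%)")
for (region, plan), count in sorted(region_by_plan.items()):
    print(f"{region} / {plan}: {count}")
```

In this made-up sample the two plans split evenly overall, yet the cross tabulation shows the split differs by region. That is the kind of pattern the single-variable table hides.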

For categorical data, use:

· Pie Charts

· Other types of charts (if they better illustrate the point being made)

· Bar Charts

· Stacked and Side-By-Side

· Both Relative (%, using 100% Stacked) and Absolute (#) for multiple-variable situations

For quantitative data, use:

INFO 1010

Possible Data Analysis Process for Class Project 2


· Average (Mean)

· Standard Deviation

· Variance

· Median

· Mode

· Range

· Coefficient of Variation

· Interquartile Range (IQR)

· Histograms

· Skew

· Box Plot
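The statistics in this list can be computed in a few lines. The sketch below uses Python's standard `statistics` module on a made-up sample; the data and variable names are hypothetical. It assumes the sample (n - 1) versions of variance and standard deviation, the "inclusive" quartile method, the adjusted sample skewness formula, and the common 1.5 x IQR rule that box plots use to mark outliers. Excel functions may use different conventions depending on which variant you pick (for example VAR.S versus VAR.P), so treat the numbers as a cross-check, not a reference.

```python
import statistics

# Hypothetical sample of order totals in dollars (invented values).
order_totals = [12.0, 15.0, 15.0, 18.0, 20.0, 22.0, 25.0, 95.0]

n = len(order_totals)
mean = statistics.mean(order_totals)
median = statistics.median(order_totals)
mode = statistics.mode(order_totals)
variance = statistics.variance(order_totals)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(order_totals)      # sample standard deviation
value_range = max(order_totals) - min(order_totals)
coeff_of_variation = std_dev / mean           # spread relative to the mean, unitless

# Quartiles by linear interpolation that includes the minimum and maximum.
q1, _, q3 = statistics.quantiles(order_totals, n=4, method="inclusive")
iqr = q3 - q1

# Adjusted sample skewness: positive means a longer tail on the high side.
skew = n / ((n - 1) * (n - 2)) * sum(((x - mean) / std_dev) ** 3 for x in order_totals)

# Box-plot style outlier fences: 1.5 * IQR beyond the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in order_totals if x < lower_fence or x > upper_fence]

print(f"mean={mean} median={median} mode={mode} range={value_range}")
print(f"variance={variance:.2f} std_dev={std_dev:.2f} cv={coeff_of_variation:.2f}")
print(f"q1={q1} q3={q3} iqr={iqr} skew={skew:.2f} outliers={outliers}")
```

In this invented sample a single large order pulls the mean well above the median and produces positive skew, which is why it helps to read the mean, median, and a histogram or box plot together.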

For relationships between data, use:

· Correlation Matrix (Quantitative only)

· Trendlines with Equations and R Squared values (Quantitative only)

· Multi-variable Frequency (Cross Tabulation) Tables

· Stacked and 100% Stacked Bar Charts (for absolute and relative comparisons)

· Other types of charts (if they better illustrate the point being made)
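For the quantitative tools in this list, the arithmetic behind a correlation value and a linear trendline with its R squared can be sketched as follows. This is a plain least-squares illustration in Python with invented data; the function name `linear_trendline` and the variables are hypothetical, and it covers only the straight-line (linear) trendline case.

```python
# Hypothetical paired observations (invented): ad spend and sales, both in $ thousands.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.0, 5.0, 7.0, 10.0, 10.0]


def linear_trendline(xs, ys):
    """Least-squares line y = slope * x + intercept, plus correlation r and R squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5  # Pearson correlation coefficient
    return slope, intercept, r, r * r


slope, intercept, r, r_squared = linear_trendline(ad_spend, sales)
print(f"y = {slope:.2f}x + {intercept:.2f}, r = {r:.3f}, R squared = {r_squared:.3f}")
```

For a straight-line fit with one predictor, R squared is simply the square of the correlation coefficient, so the two numbers tell a consistent story. Neither one says anything about cause and effect.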

What do these tables, charts, and statistics indicate may be interesting about the data?

4. Do your ADVANCED analysis

Investigate further the questions and findings that the data indicate are interesting.

· Examine those questions using multiple tools/methods

· Pivot Tables/Charts

· Vary different aspects of both the tables and charts (and type of charts)

· Filters, Rows, Columns, Values

· Sum, Average, Count, Percent, Subtotals and Totals

· Grouping size for qualitative variables

· Hierarchy of Rows and Columns
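Varying a pivot table amounts to changing which fields define the groups and which summary is applied to each group. The Python sketch below is only an illustration of that idea, with invented sales rows and a hypothetical `pivot` helper; it is not how Excel implements pivot tables. It groups by a row field and a column field, then summarizes the same values as Sum, Average, Count, and Percent of the grand total.

```python
from collections import defaultdict

# Hypothetical sales rows: (region, product, revenue_usd). Values are invented.
sales_rows = [
    ("East", "Widget", 100.0), ("East", "Gadget", 40.0),
    ("West", "Widget", 60.0), ("West", "Widget", 80.0),
    ("East", "Widget", 50.0), ("West", "Gadget", 30.0),
]

AGGREGATIONS = {
    "sum": sum,
    "count": len,
    "average": lambda values: sum(values) / len(values),
}


def pivot(rows, row_index, column_index, value_index, how="sum"):
    """Group values by (row field, column field) and aggregate each group."""
    groups = defaultdict(list)
    for record in rows:
        key = (record[row_index], record[column_index])
        groups[key].append(record[value_index])
    aggregate = AGGREGATIONS[how]
    return {key: aggregate(values) for key, values in groups.items()}


revenue_sum = pivot(sales_rows, 0, 1, 2, how="sum")
revenue_avg = pivot(sales_rows, 0, 1, 2, how="average")
order_count = pivot(sales_rows, 0, 1, 2, how="count")

# Percent of grand total, the relative view of the same sums.
grand_total = sum(revenue_sum.values())
revenue_pct = {key: 100.0 * value / grand_total for key, value in revenue_sum.items()}

for key in sorted(revenue_sum):
    print(key, revenue_sum[key], revenue_avg[key], order_count[key], f"{revenue_pct[key]:.1f}%")
```

Swapping `row_index` and `column_index`, or switching `how`, gives a different view of the same data. Trying several such views is the point of this step.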

5. Explain what results you found interesting and why they are interesting.

· The memo should contain:

· Executive Summary

· Purpose of Memo/Analysis

· Data Quality Issues/How Addressed

· Interesting/Useful Analysis/Findings

· Recommendations of Possible Actions/Next Steps

· In your memo, use tables and charts to support and illustrate your findings!

· Format the reports, tables, and charts so that the reader focuses on what you are trying to say and is not distracted by lack of formatting, poor formatting, over-formatting, etc.

· Remember that most of the time your manager is going to be looking only at your report and not at your actual Excel files.

· Put your name on the report.

· Make sure that the report is professional looking.

· You can create your own memo format, but Microsoft Word has many Memo templates from which to choose.