Final Project
Data analysis (today’s lecture)
20 Excel skills (same as before)
Memo (same as before)
Appropriate numbers
Count, vs. sum, vs. average, vs. percent of…
Is this unfair?
There are 4 female top executives and 10 male top executives
What if there are only 4 women in the company?
What if 92% of the company's employees are women?
What would be a more reasonable number? What else would you need to know?
Proportion: Percent of column total (number of execs/total number)
4 female executives out of a total of 4 female employees = 100%
10 male executives out of 203 male employees = 5%
53 female and 67 male – leads to distortion using count
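The count-versus-proportion point can be checked with a few lines of Python. This is only a sketch using the numbers from this slide; the dictionary and function names are made up for illustration.

```python
# Counts taken from the slide: top executives and total employees by gender.
executives = {"female": 4, "male": 10}
employees = {"female": 4, "male": 203}

def exec_share(group):
    """Executives as a fraction of all employees in that group."""
    return executives[group] / employees[group]

for group in executives:
    # The raw count favors men (10 vs. 4); the proportion tells a different story.
    print(f"{group}: {executives[group]} executives, "
          f"{exec_share(group):.0%} of all {group} employees")
```

In Excel, the same comparison comes from showing pivot table values as a percent of column total instead of as a count.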
Appropriate charts
Picking the right type of chart (read the PDF provided in the assignment)
Example: Pie chart is used to compare parts in relation to each other in terms of a whole. It is not effective for comparing two different things. In that case a column chart would be best.
Are women in this organization discriminated against?
This scale provides a more useful perspective
An alternative explanation to "discrimination": Are salary differences due to age, i.e., are women in this organization younger?
Are salary differences due to years of experience?
Are there systematic differences within level in the organization? Are differences in average salary due to more men at higher positions?
Do not mix separate analyses together (apples and oranges)
Comparing data on different scales – examples:
Comparing GPA and working hours of male vs. female students in the same table
One possible solution: Use percent of column total
Poor use of chart: Comparing data with different scales
Poor use of chart: Comparing far too many variables
[Chart: students by gender (Female, Male, Other), class year (Freshman, Sophomore, Junior, Senior), and enrollment type (Montclair only, Transfer)]
Review: Proportions
Average salary of men/women: $56,789 vs. $59,992
Difference: $3,203
M/F – What is M in relation to F? "M average salary is 95% of F"
F/M – What is F in relation to M?
"F average salary is 106% of M" or "…6% more than M"
Percent difference |M-F|/Average(M,F) is about 5.5%
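The ratios on this slide can be recomputed directly from the two averages. The sketch below uses only the two salary figures given above; the variable names are illustrative.

```python
# Average salaries from the slide.
male_avg = 56_789
female_avg = 59_992

difference = female_avg - male_avg            # dollar gap between the two averages
m_over_f = male_avg / female_avg              # M in relation to F
f_over_m = female_avg / male_avg              # F in relation to M
midpoint = (male_avg + female_avg) / 2        # Average(M, F)
pct_difference = abs(male_avg - female_avg) / midpoint

print(f"Difference: ${difference:,}")
print(f"M is {m_over_f:.0%} of F")
print(f"F is {f_over_m:.0%} of M")
print(f"Percent difference: {pct_difference:.1%}")
```

Note that the three percentages answer three different questions, so always state which number is the base of the comparison.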
Measures of central tendency
Mean (average)
Median (pick middle number)
Better if there are outliers, extreme numbers, skew in the data
Average salary of those in this room gets skewed if Bezos walks in… pick the median to get a better measure of the group
Example – Income in a group can vary widely, so median is frequently used
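A quick way to see the outlier effect is to compute both measures before and after one extreme value joins the group. The salaries below are hypothetical, invented only for this illustration.

```python
from statistics import mean, median

# Hypothetical salaries of five people in a room (not real data).
room = [48_000, 52_000, 55_000, 61_000, 64_000]

# One extreme earner walks in (hypothetical figure).
room_with_outlier = room + [1_000_000_000]

print("Before:", mean(room), median(room))
print("After: ", round(mean(room_with_outlier)), median(room_with_outlier))
# The mean jumps by orders of magnitude; the median barely moves.
```

In Excel the equivalent functions are AVERAGE and MEDIAN.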
Why do we care about mean/median?
What are some common ways we think about the world using that statistic?
GPA, Earned Run Average, Batting Average
Provides information about a set or group – how they tend as a whole
What can you do with it?
The difference in GPA (numeric variable) by category
Male/Female (categorical variable)
Hours worked: Average GPA by hours worked
Chunking a numeric variable to use it as a categorical
Chunks of hours worked: 0-9, 10-19, 20-29, 30-39, 40+
Pivot table – use the Grouping function on the ribbon
Alternative: Scatterplot
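Chunking can also be sketched outside Excel. The example below groups hours worked into the same chunks listed above and averages GPA within each chunk; the student records and the helper name hours_bucket are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (hours worked per week, GPA) pairs, invented for illustration.
students = [(0, 3.6), (5, 3.4), (12, 3.5), (18, 3.1),
            (25, 3.0), (32, 2.9), (40, 2.8), (45, 3.2)]

def hours_bucket(hours):
    """Map a number of hours to one of the chunks 0-9, 10-19, 20-29, 30-39, 40+."""
    if hours >= 40:
        return "40+"
    low = (hours // 10) * 10
    return f"{low}-{low + 9}"

# Collect the GPAs that fall into each chunk.
gpas_by_bucket = defaultdict(list)
for hours, gpa in students:
    gpas_by_bucket[hours_bucket(hours)].append(gpa)

# Average GPA per chunk: the numeric variable now acts as a category.
avg_gpa_by_bucket = {b: round(mean(g), 2) for b, g in gpas_by_bucket.items()}
for bucket, avg in avg_gpa_by_bucket.items():
    print(bucket, avg)
```

This mirrors what pivot table grouping does: the numeric field goes in the rows, gets grouped, and the average is computed per group.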
Measures of variance
Range, min, max
Standard deviation (from the mean)
How scattered the data is around the mean
Higher numbers mean more variance
Whether variance is high or low: SD/Mean
Higher percent means higher variance
Why do we care about variance?
Compare categories by variance
Men have more variance, and greater range of salaries – what does that mean?
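The SD/Mean ratio can be used to compare two categories on the same footing. The sketch below uses hypothetical salary lists and assumes the sample standard deviation (what Excel's STDEV.S computes); the function name is illustrative.

```python
from statistics import mean, stdev

# Hypothetical salaries, invented for illustration only.
men = [30_000, 45_000, 60_000, 90_000, 125_000]
women = [52_000, 55_000, 58_000, 61_000, 64_000]

def relative_spread(values):
    """Sample standard deviation divided by the mean (SD/Mean)."""
    return stdev(values) / mean(values)

for label, salaries in (("Men", men), ("Women", women)):
    print(f"{label}: range {max(salaries) - min(salaries):,}, "
          f"SD/Mean {relative_spread(salaries):.0%}")
```

A higher SD/Mean for one group says its members are spread further from their own average, which is a prompt to look at what differs within that group (level, experience, and so on).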
Distribution
How data falls
Normal distribution as benchmark: the natural distribution when there is no bias, i.e., nothing skewing the data
Analysis of distribution: Box and Whiskers chart, 5 number summary: Min, 1st quartile, median, 3rd quartile, Max
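The five-number summary behind a box-and-whisker chart can be computed as follows. This sketch uses the 20 "Salary demanded" values listed at the end of these notes and assumes the inclusive quartile method, which to my understanding corresponds to Excel's QUARTILE.INC; other quartile methods give slightly different Q1 and Q3.

```python
from statistics import quantiles

# The 20 "Salary demanded" values from the end of these notes.
salaries = [26041, 26459, 26900, 27116, 27382, 27736, 27999, 28670, 28961, 29089,
            30647, 30863, 31695, 31903, 32906, 34084, 34232, 34508, 34619, 35019]

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using inclusive quartiles."""
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    return (min(values), q1, med, q3, max(values))

labels = ("Min", "1st quartile", "Median", "3rd quartile", "Max")
for label, value in zip(labels, five_number_summary(salaries)):
    print(f"{label}: {value:,}")
```

The box in the chart spans the 1st to 3rd quartile, the line inside it is the median, and the whiskers reach toward the min and max.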
PEWRESEARCH.ORG
Excellent source of data analysis examples, tables, charts
Excellent example of how to write with and about data analysis
Source of data: I have already provided some of their data for you to analyze yourself.
Question: Did Mayor Giuliani’s policies lead to decline in crime in NYC?
Question: Who are our students and how can we help them? What should we focus on? Are there many students living in poverty? How many students are living in poverty? Is that increasing or decreasing? Is that related to whether they are still with their parents?
Young college grads are complaining about their financial situation – is it true that they have greater debt? What is that trend like? Does it look like there is a problem that needs to be addressed?
Has anything changed with immigration given the recent news of families being separated? Or is that simply a Trump policy?
Salary demanded
$26,041
$26,459
$26,900
$27,116
$27,382
$27,736
$27,999
$28,670
$28,961
$29,089
$30,647
$30,863
$31,695
$31,903
$32,906
$34,084
$34,232
$34,508
$34,619
$35,019