Final Project

Final project

Data analysis (today’s lecture)

20 Excel skills (same as before)

Memo (same as before)

Appropriate numbers

Count vs. sum vs. average vs. percent of…

Is this unfair?

Among the top executives there are 4 women and 10 men

What if there are only 4 women in the company?

What if 92% of the company's employees are women?

What would be a more reasonable number? What else would you need to know?

Proportion: Percent of column total (number of executives in a group / total number in that group)

4 female executives out of a total of 4 female employees = 100%

10 male executives out of 203 male employees = 5%

Unequal group sizes (53 female and 67 male) – comparing raw counts leads to distortion
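
The arithmetic behind the executive example can be sketched in a few lines of Python. This is only an illustration of the proportion idea, not part of the assignment; the function name percent_of_group is an assumption made for this sketch, and the counts are the ones from the example above.

```python
def percent_of_group(part, total):
    """Return part as a percentage of total."""
    return 100 * part / total


# Counts taken from the executive example above
female_exec_rate = percent_of_group(4, 4)     # 4 executives among 4 women
male_exec_rate = percent_of_group(10, 203)    # 10 executives among 203 men

print(f"Female executives: {female_exec_rate:.0f}% of female employees")
print(f"Male executives: {male_exec_rate:.0f}% of male employees")
```

The raw counts (4 vs. 10) suggest men dominate, while the proportions tell the opposite story, which is why the denominator matters.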

Appropriate charts

Picking the right type of chart (read the PDF provided in the assignment)

Example: A pie chart compares parts in relation to each other as shares of a whole. It is not effective for comparing two different things. In that case a column chart would be best.

Are women in this organization discriminated against?

This scale provides a more useful perspective

An alternative explanation to ‘discrimination’: Are salary differences due to age, i.e., are women in this organization younger?

Are salary differences due to years of experience?

Are there systematic differences within level in the organization? Are differences in average salary due to more men at higher positions?

Do not mix separate analyses together (apples and oranges)

Comparing data on different scales – examples:

Comparing GPA and working hours of male vs. female students in the same table

One possible solution: Use percent of column total

Poor use of chart: Comparing data with different scales

Poor use of chart: Comparing far too many variables

[Chart: number of students by gender (Female, Male, Other), class year (Freshman, Sophomore, Junior, Senior), and admission status (Montclair only, Transfer)]

Review: Proportions

Average salary of men/women: $56,789 vs. $59,992

Difference $3,203

M/F – What is M in relation to F? “M average salary is 95% of F”

F/M – What is F in relation to M?

“F average salary is 106% of M” or “F is 6% more than M”

Percent difference |M-F|/Average(M,F) is about 5.5%
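
As a check on the proportion review, the same ratios can be computed directly. This is a minimal sketch using the two average salaries quoted above; the variable names are illustrative assumptions.

```python
# Average salaries from the slide: men $56,789, women $59,992
male_avg = 56_789
female_avg = 59_992

difference = female_avg - male_avg          # dollar gap between the averages
m_over_f = male_avg / female_avg            # M in relation to F
f_over_m = female_avg / male_avg            # F in relation to M

# Percent difference: the gap relative to the average of the two values
midpoint = (male_avg + female_avg) / 2
pct_difference = abs(male_avg - female_avg) / midpoint

print(f"Difference: ${difference:,}")
print(f"M is {m_over_f:.0%} of F")
print(f"F is {f_over_m:.0%} of M")
print(f"Percent difference: {pct_difference:.1%}")
```

Note that the three percentages answer different questions, so a memo should state which base (M, F, or their average) is being used.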

Measures of central tendency

Mean (average)

Median (pick middle number)

Better if there are outliers, extreme numbers, skew in the data

Average salary of those in this room gets skewed if Bezos walks in… pick the median to get a better measure of the group

Example – Income in a group can vary widely, so median is frequently used
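
The effect of a single extreme value can be seen in a short sketch. The salaries below are made-up numbers for illustration only (they are not course data), and the billion-dollar figure simply stands in for a very wealthy visitor.

```python
import statistics

# Hypothetical salaries of five people in a room (illustrative values)
room = [40_000, 45_000, 50_000, 55_000, 60_000]

# The same room after one extremely high earner walks in (hypothetical value)
room_with_outlier = room + [1_000_000_000]

print("Mean before:  ", statistics.mean(room))
print("Median before:", statistics.median(room))
print("Mean after:   ", statistics.mean(room_with_outlier))
print("Median after: ", statistics.median(room_with_outlier))
```

The mean jumps by several orders of magnitude while the median barely moves, which is why the median is the better description of the typical person in a skewed group.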

Why do we care about mean/median?

What are some common ways we think about the world using that statistic?

GPA, Earned Run Average, Batting Average

Provides information about a set or group – how they tend as a whole

What can you do with it?

The difference in GPA (numeric variable) by category

Male/Female (categorical variable)

Hours worked: Average GPA by hours worked

Chunking a numeric variable to use it as a categorical variable

Chunks of hours worked: 0-9, 10-19, 20-29, 30-39, 40+

Pivot table – use the Grouping function on the ribbon

Alternative: Scatterplot
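
For readers who want to see what the pivot table grouping does behind the scenes, here is a minimal sketch of the same chunking in Python. The function name hours_bucket and the student records are assumptions made up for this illustration; only the bucket boundaries come from the slide.

```python
from collections import defaultdict


def hours_bucket(hours):
    """Map hours worked to one of the chunks 0-9, 10-19, 20-29, 30-39, 40+."""
    if hours >= 40:
        return "40+"
    low = int(hours // 10) * 10
    return f"{low}-{low + 9}"


# Hypothetical (hours worked, GPA) records, for illustration only
students = [(5, 3.6), (8, 3.4), (15, 3.2), (25, 3.0), (45, 2.8)]

# Collect the GPAs that fall into each chunk
gpas_by_bucket = defaultdict(list)
for hours, gpa in students:
    gpas_by_bucket[hours_bucket(hours)].append(gpa)

# Average GPA per chunk, like a pivot table with Average as the summary
average_gpa = {bucket: sum(g) / len(g) for bucket, g in gpas_by_bucket.items()}

for bucket, avg in average_gpa.items():
    print(f"{bucket}: {avg:.2f}")
```

Chunks with no students simply do not appear in the result, much as an empty group may not appear in a pivot table.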

Measures of variance

Range, min, max

Standard deviation (from the mean)

How scattered the data is around the mean

Higher numbers mean more variance

To judge whether variance is high or low: SD/Mean (the coefficient of variation)

Higher percent means higher variance
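
The SD/Mean comparison can be sketched as follows. The two salary groups are hypothetical values chosen so that both have the same mean, and the sketch assumes the sample standard deviation (statistics.stdev); a population standard deviation would give slightly smaller numbers.

```python
import statistics


def sd_over_mean(values):
    """Standard deviation divided by the mean (sample SD is an assumption)."""
    return statistics.stdev(values) / statistics.mean(values)


# Hypothetical salary groups with the same mean of 50,000
group_a = [48_000, 49_000, 50_000, 51_000, 52_000]   # tightly clustered
group_b = [30_000, 45_000, 50_000, 55_000, 70_000]   # widely scattered

ratio_a = sd_over_mean(group_a)
ratio_b = sd_over_mean(group_b)

print(f"Group A: SD/Mean = {ratio_a:.1%}")
print(f"Group B: SD/Mean = {ratio_b:.1%}")
```

Because the ratio is expressed relative to the mean, it allows groups measured on different scales to be compared, which the raw standard deviation does not.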

Why do we care about variance?

Compare categories by variance

Men have more variance, and greater range of salaries – what does that mean?

Distribution

How data falls

Normal distribution as benchmark: the natural distribution when there is no bias, i.e., nothing skewing the data

Analysis of distribution: Box and Whisker chart, five-number summary: Min, 1st quartile, median, 3rd quartile, Max
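
The five numbers behind a box and whisker chart can be computed with a short sketch. The data are hypothetical, the function name five_number_summary is an assumption for this illustration, and quartile conventions differ between tools, so other software may report slightly different quartiles for the same data.

```python
import statistics


def five_number_summary(values):
    """Return (min, 1st quartile, median, 3rd quartile, max)."""
    # method="inclusive" treats the data as the whole population of interest
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median, q3, max(values))


# Hypothetical data, for illustration only
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

summary = five_number_summary(data)
print("Min, Q1, Median, Q3, Max:", summary)
```

The box in the chart spans the 1st to 3rd quartile, the line inside it marks the median, and the whiskers reach toward the minimum and maximum.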

PEWRESEARCH.ORG

Excellent source of data analysis examples, tables, charts

Excellent example of how to write with and about data analysis

Source of data: I have already provided some of their data for you to analyze yourself.

Question: Did Mayor Giuliani’s policies lead to decline in crime in NYC?

Question: Who are our students and how can we help them? What should we focus on? Are there many students living in poverty? How many students are living in poverty? Is that increasing or decreasing? Is that related to whether they are still with their parents?

Young college grads are complaining about their financial situation – is it true that they have greater debt? What is that trend like? Does it look like there is a problem that needs to be addressed?

Has anything changed with immigration given the recent news of families being separated? Or is that simply a Trump policy?

Salary demanded

$26,041

$26,459

$26,900

$27,116

$27,382

$27,736

$27,999

$28,670

$28,961

$29,089

$30,647

$30,863

$31,695

$31,903

$32,906

$34,084

$34,232

$34,508

$34,619

$35,019