assignment week 5

week_4_reference.xlsx

Data

The ongoing question that the weekly assignments will focus on is: Are males and females paid the same for equal work (under the Equal Pay Act)?
Note: to simplify the analysis, we will assume that jobs within each grade comprise equal work.

The column labels in the table mean:
ID – Employee sample number
Salary – Salary in thousands
Compa – Salary divided by midpoint
Midpoint – Salary grade midpoint
Age – Age in years
Performance Rating – Appraisal rating (employee evaluation score)
Service – Years of service (rounded)
Gender – 0 = male, 1 = female
Raise – Percent of last raise
Degree – 0 = BS/BA, 1 = MS
Gender1 – Male or Female (M or F)
Grade – Job/pay grade

ID Salary Compa Midpoint Age Performance Rating Service Gender Raise Degree Gender1 Grade
8 23 1.000 23 32 90 9 1 5.8 0 F A
10 22 0.956 23 30 80 7 1 4.7 0 F A
11 23 1.000 23 41 100 19 1 4.8 0 F A
14 24 1.043 23 32 90 12 1 6 0 F A
15 24 1.043 23 32 80 8 1 4.9 0 F A
23 23 1.000 23 36 65 6 1 3.3 1 F A
26 24 1.043 23 22 95 2 1 6.2 1 F A
31 24 1.043 23 29 60 4 1 3.9 0 F A
35 24 1.043 23 23 90 4 1 5.3 1 F A
36 23 1.000 23 27 75 3 1 4.3 1 F A
37 22 0.956 23 22 95 2 1 6.2 1 F A
42 24 1.043 23 32 100 8 1 5.7 0 F A
3 34 1.096 31 30 75 5 1 3.6 0 F B
18 36 1.161 31 31 80 11 1 5.6 1 F B
20 34 1.096 31 44 70 16 1 4.8 1 F B
39 35 1.129 31 27 90 6 1 5.5 1 F B
7 41 1.025 40 32 100 8 1 5.7 0 F C
13 42 1.050 40 30 100 2 1 4.7 1 F C
22 57 1.187 48 48 65 6 1 3.8 0 F D
24 50 1.041 48 30 75 9 1 3.8 1 F D
45 55 1.145 48 36 95 8 1 5.2 0 F D
17 69 1.210 57 27 55 3 1 3 0 F E
48 65 1.140 57 34 90 11 1 5.3 1 F E
28 75 1.119 67 44 95 9 1 4.4 1 F F
43 77 1.149 67 42 95 20 1 5.5 1 F F
19 24 1.043 23 32 85 1 0 4.6 1 M A
25 24 1.043 23 41 70 4 0 4 0 M A
40 25 1.086 23 24 90 2 0 6.3 0 M A
2 27 0.870 31 52 80 7 0 3.9 0 M B
32 28 0.903 31 25 95 4 0 5.6 0 M B
34 28 0.903 31 26 80 2 0 4.9 1 M B
16 47 1.175 40 44 90 4 0 5.7 0 M C
27 40 1.000 40 35 80 7 0 3.9 1 M C
41 43 1.075 40 25 80 5 0 4.3 0 M C
5 47 0.979 48 36 90 16 0 5.7 1 M D
30 49 1.020 48 45 90 18 0 4.3 0 M D
1 58 1.017 57 34 85 8 0 5.7 0 M E
4 66 1.157 57 42 100 16 0 5.5 1 M E
12 60 1.052 57 52 95 22 0 4.5 0 M E
33 64 1.122 57 35 90 9 0 5.5 1 M E
38 56 0.982 57 45 95 11 0 4.5 0 M E
44 60 1.052 57 45 90 16 0 5.2 1 M E
46 65 1.140 57 39 75 20 0 3.9 1 M E
47 62 1.087 57 37 95 5 0 5.5 1 M E
49 60 1.052 57 41 95 21 0 6.6 0 M E
50 66 1.157 57 38 80 12 0 4.6 0 M E
6 76 1.134 67 36 70 12 0 4.5 1 M F
9 77 1.149 67 49 100 10 0 4 1 M F
21 76 1.134 67 43 95 13 0 6.3 1 M F
29 72 1.074 67 52 95 5 0 5.4 0 M F

Week 4

Week 4 Confidence Intervals and Chi Square (Chs 11 - 12)
For questions 3 and 4 below, be sure to list the null and alternate hypothesis statements. Use .05 for your significance level in making your decisions.
For full credit, you need to also show the statistical outcomes - either the Excel test result or the calculations you performed.
1 Using our sample data, construct a 95% confidence interval for the population's mean salary for each gender.
Interpret the results. How do they compare with the findings in the week 2 one sample t-test outcomes (Question 1)?
Mean St error t value Low to High
Males 52 3.5552777669 2.0638985473 44.66 59.34
Females 38 3.6587793957 2.0638985473 30.45 45.55
<Reminder: standard error is the sample standard deviation divided by the square root of the sample size.>
Interpretation: We are 95% confident that the population mean salary of male employees lies between 44.66 and 59.34 thousand. In repeated sampling, about 95% of intervals constructed this way would contain the true mean.
We are 95% confident that the population mean salary of female employees lies between 30.45 and 45.55 thousand.
In the week 2 one-sample t-tests, the mean salary of the male employees (52 thousand) was not significantly different from the population mean salary of 45 thousand, and neither was the mean salary of the female employees (38 thousand).
The 95% confidence intervals for the male and female salaries both contain the population mean salary of 45 thousand.
This agrees with our previous findings that the mean salaries of male and female employees are not significantly different from the population mean salary.
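The intervals in question 1 can be reproduced from the Data sheet. The following is a minimal sketch, assuming the salaries are split by the Gender1 column and that the critical t value (24 degrees of freedom) is taken from the worksheet rather than computed; the function name `mean_ci` is illustrative only.

```python
import math
import statistics

# Salaries (in thousands) copied from the Data sheet, grouped by Gender1.
male = [24, 24, 25, 27, 28, 28, 47, 40, 43, 47, 49, 58, 66, 60, 64,
        56, 60, 65, 62, 60, 66, 76, 77, 76, 72]
female = [23, 22, 23, 24, 24, 23, 24, 24, 24, 23, 22, 24, 34, 36, 34,
          35, 41, 42, 57, 50, 55, 69, 65, 75, 77]

# Critical t value for 95% confidence with 24 degrees of freedom,
# taken from the worksheet (assumed to come from Excel's T.INV.2T(0.05, 24)).
T_CRIT_DF24 = 2.0638985473


def mean_ci(sample, t_crit):
    """Return (mean, standard error, low, high) for one sample."""
    mean = statistics.mean(sample)
    # Standard error = sample standard deviation / sqrt(sample size).
    std_err = statistics.stdev(sample) / math.sqrt(len(sample))
    margin = t_crit * std_err
    return mean, std_err, mean - margin, mean + margin


for label, sample in (("Males", male), ("Females", female)):
    mean, se, low, high = mean_ci(sample, T_CRIT_DF24)
    print(f"{label}: mean={mean:.2f} se={se:.4f} CI=({low:.2f}, {high:.2f})")
```

Because both groups have 25 observations, the same critical value applies to each interval.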
2 Using our sample data, construct a 95% confidence interval for the mean salary difference between the genders in the population.
How does this compare to the findings in week 2, question 2?
Difference St Err. T value Low to High
14 5.1016337253 2.0106347219 3.7424780935 24.2575219065
Can the means be equal? (Yes/No) No
Why? The confidence interval for the difference of means does not contain 0.
How does this compare to the week 2, question 2 result (2 sample t-test)?
Because the 95% confidence interval for the difference between the male and female mean salaries does not include 0, we conclude that the means differ significantly at the 0.05 significance level.
This agrees with the week 2 two-sample t-test outcome, which found that the mean salary of male employees is significantly different from the mean salary of female employees.
a. Why is using a two sample tool (t-test, confidence interval) a better choice than using 2 one-sample techniques when comparing two samples?
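A two-sample tool compares the two groups directly in a single test, so the significance level stays at 0.05 and the standard error reflects the variability of both samples. Two one-sample techniques only compare each group with a fixed reference value, and running two separate tests increases the chance of a Type I error.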
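The interval in question 2 can be checked from the question 1 summary values. This is a sketch under stated assumptions: the two samples are treated as independent, and the critical t value (48 degrees of freedom) is copied from the worksheet rather than computed.

```python
import math

# Summary values from question 1 (salaries in thousands, n = 25 per group).
mean_m, se_m = 52.0, 3.5552777669
mean_f, se_f = 38.0, 3.6587793957

# Critical t value for 95% confidence with 48 degrees of freedom,
# taken from the worksheet.
T_CRIT_DF48 = 2.0106347219

diff = mean_m - mean_f
# Standard error of a difference between two independent means.
# With equal group sizes the pooled and unpooled forms give the same value.
se_diff = math.sqrt(se_m ** 2 + se_f ** 2)
low = diff - T_CRIT_DF48 * se_diff
high = diff + T_CRIT_DF48 * se_diff

print(f"difference={diff:.0f} se={se_diff:.4f} CI=({low:.4f}, {high:.4f})")
# The means can be equal only if 0 lies inside the interval.
print("interval contains 0:", low <= 0 <= high)
```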
3 We found last week that degree values within the population do not impact compa rates.
This does not mean that degrees are distributed evenly across the grades and genders.
Do males and females have the same distribution of degrees by grade?
(Note: while technically the sample size might not be large enough to perform this test, ignore this limitation for this exercise.)
What are the hypothesis statements:
Ho: Males and females have the same distribution of degrees by grade.
Ha: Males and females do not have the same distribution of degrees by grade.
Note: You can either use the Excel Chi-related functions or do the calculations manually.
For this exercise - ignore the requirement for a correction for expected values less than 5.

Count of Degree by grade - Females
Row Labels A B C D E F Grand Total
0 7 1 1 2 1 0 12
1 5 3 1 1 1 2 13
Grand Total 12 4 2 3 2 2 25

Count of Degree by grade - Males
Row Labels A B C D E F Grand Total
0 2 2 2 1 5 1 13
1 1 1 1 1 5 3 12
Grand Total 3 3 3 2 10 4 25

Data input tables - graduate degrees by gender and grade level
OBSERVED A B C D E F Total
M Grad 1 1 1 1 5 3 12
Fem Grad 5 3 1 1 1 2 13
Male Und 2 2 2 1 5 1 13
Female Und 7 1 1 2 1 0 12
Total 15 7 5 5 12 6 50

EXPECTED A B C D E F
M Grad 3.6 1.68 1.2 1.2 2.88 1.44
Fem Grad 3.9 1.82 1.3 1.3 3.12 1.56
Male Und 3.9 1.82 1.3 1.3 3.12 1.56
Female Und 3.6 1.68 1.2 1.2 2.88 1.44

Manual calculations per cell, (observed - expected)^2 / expected:
A B C D E F
M Grad 1.8778 0.2752 0.0333 0.0333 1.5606 1.6900
Fem Grad 0.3103 0.7651 0.0692 0.0692 1.4405 0.1241
Male Und 0.9256 0.0178 0.3769 0.0692 1.1328 0.2010
Female Und 3.2111 0.2752 0.0333 0.5333 1.2272 1.4400
Sum = 17.6923076923
Interpretation:
What is the value of the chi square statistic: 17.6923076923
What is the p-value associated with this value: 0.2791871758
Is the p-value <0.05? No
Do you reject or not reject the null hypothesis: We do not reject the null hypothesis.
If you rejected the null, what is the Cramer's V correlation: Not Applicable
What does this correlation mean? Not Applicable
What does this decision mean for our equal pay question: There is no evidence, at the 0.05 significance level, that male and female employees have different distributions of degrees by grade.
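The expected counts and the chi square statistic for question 3 can be reproduced from the observed table alone. The sketch below assumes the four gender/degree rows and six grade columns shown in the worksheet; the variable names are illustrative.

```python
# Observed counts of degree by gender and grade (columns A-F), from the worksheet.
observed = {
    "M Grad":     [1, 1, 1, 1, 5, 3],
    "Fem Grad":   [5, 3, 1, 1, 1, 2],
    "Male Und":   [2, 2, 2, 1, 5, 1],
    "Female Und": [7, 1, 1, 2, 1, 0],
}

row_totals = {name: sum(counts) for name, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
n = sum(row_totals.values())

chi_square = 0.0
for name, counts in observed.items():
    # Expected count = row total * column total / grand total.
    expected = [row_totals[name] * c / n for c in col_totals]
    # Each cell contributes (observed - expected)^2 / expected.
    contributions = [(o - e) ** 2 / e for o, e in zip(counts, expected)]
    chi_square += sum(contributions)
    print(name, [round(e, 2) for e in expected],
          [round(x, 4) for x in contributions])

# Degrees of freedom = (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(col_totals) - 1)
print(f"chi-square = {chi_square:.4f}, df = {df}")
```

The p-value reported in the worksheet corresponds to this statistic evaluated with 15 degrees of freedom.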
4 Based on our sample data, can we conclude that males and females are distributed across grades in a similar pattern within the population?
What are the hypothesis statements:
Ho: Males and females have the same distribution across grades.
Ha: Males and females have different distributions across grades.

Count of Gender1 by grade
Row Labels A B C D E F Grand Total
F 12 4 2 3 2 2 25
M 3 3 3 2 10 4 25
Grand Total 15 7 5 5 12 6 50

OBSERVED A B C D E F Total
OBS COUNT - m 3 3 3 2 10 4 25
OBS COUNT - f 12 4 2 3 2 2 25
Total 15 7 5 5 12 6 50

EXPECTED A B C D E F
m 7.5 3.5 2.5 2.5 6 3
f 7.5 3.5 2.5 2.5 6 3

Manual calculations per cell, (observed - expected)^2 / expected:
A B C D E F
M 2.7000 0.0714 0.1000 0.1000 2.6667 0.3333
F 2.7000 0.0714 0.1000 0.1000 2.6667 0.3333
Sum = 11.9429

What is the value of the chi square statistic: 11.9429
What is the p-value associated with this value: 0.0356
Is the p-value <0.05? Yes
Do you reject or not reject the null hypothesis: We reject the null hypothesis.
If you rejected the null, what is the Phi correlation: 0.4887 (square root of 11.9429 / 50)
What does this correlation mean? The relationship between gender and grade is moderately strong.
What does this decision mean for our equal pay question: Males and females are distributed differently across grades, and this might account for the difference in the mean salaries of males and females.
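The question 4 test can be recomputed from the observed counts, summing the contributions from all six grade columns (A through F). This is a minimal sketch using only the Python standard library: the helper names are hypothetical, the p-value uses the standard closed-form series for the chi square upper tail, and the effect size is computed as Cramér's V, which for a table with two rows equals phi = sqrt(chi square / n).

```python
import math

# Observed counts of employees by gender and grade (A-F), from the worksheet.
observed = [
    [3, 3, 3, 2, 10, 4],   # males
    [12, 4, 2, 3, 2, 2],   # females
]


def chi_square_stat(table):
    """Pearson chi-square statistic for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return sum(
        (obs - r * c / n) ** 2 / (r * c / n)
        for row, r in zip(table, row_totals)
        for obs, c in zip(row, col_totals)
    )


def chi_square_p_value(x, df):
    """Upper-tail probability of a chi-square variable (closed-form series)."""
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2-1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2) / k
            total += term
        return math.exp(-x / 2) * total
    # Odd df: normal tail plus a finite series in sqrt(x).
    root = math.sqrt(x)
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    density = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    return math.erfc(root / math.sqrt(2)) + 2 * density * total


chi2 = chi_square_stat(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)
p_value = chi_square_p_value(chi2, df)
n = sum(map(sum, observed))
# Cramér's V; with two rows the divisor min(rows, cols) - 1 is 1.
cramers_v = math.sqrt(chi2 / (n * (min(len(observed), len(observed[0])) - 1)))

print(f"chi-square = {chi2:.4f}, df = {df}, p = {p_value:.4f}, V = {cramers_v:.4f}")
```

Dividing by the total sample size (50), not by a single cell or column count, keeps the effect size between 0 and 1.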
5.      How do you interpret these results in light of our question about equal pay for equal work?
The mean salary of males is significantly higher than the mean salary of females at the 0.05 significance level.
The distributions of males and females across grades are also significantly different at the 0.05 significance level.
This difference in grade distribution might be the underlying factor behind the difference in the mean salaries of males and females.