Statics Problem Set - Week 3

profileAEIOUsometimesY
Week3ProblemSet.xlsx

Data

ID Salary Compa-ratio Midpoint Age Performance Rating Service Gender Raise Degree Gender1 Grade Do not manipuilate Data set on this page, copy to another page to make changes
1 58 1.018 57 34 85 8 0 5.7 0 M E The ongoing question that the weekly assignments will focus on is: Are males and females paid the same for equal work (under the Equal Pay Act)?
2 28 0.902 31 52 80 7 0 3.9 0 M B Note: to simplfy the analysis, we will assume that jobs within each grade comprise equal work.
3 33.8 1.089 31 30 75 5 1 3.6 1 F B
4 64.8 1.137 57 42 100 16 0 5.5 1 M E The column labels in the table mean:
5 48.7 1.014 48 36 90 16 0 5.7 1 M D ID – Employee sample number Salary – Salary in thousands
6 75.4 1.126 67 36 70 12 0 4.5 1 M F Age – Age in years Performance Rating - Appraisal rating (employee evaluation score)
7 41.6 1.039 40 32 100 8 1 5.7 1 F C Service – Years of service (rounded) Gender – 0 = male, 1 = female
8 23.6 1.028 23 32 90 9 1 5.8 1 F A Midpoint – salary grade midpoint Raise – percent of last raise
9 73.8 1.102 67 49 100 10 0 4 1 M F Grade – job/pay grade Degree (0= BS\BA 1 = MS)
10 22.5 0.980 23 30 80 7 1 4.7 1 F A Gender1 (Male or Female) Compa-ratio - salary divided by midpoint
11 24.1 1.046 23 41 100 19 1 4.8 1 F A
12 58.3 1.023 57 52 95 22 0 4.5 0 M E
13 41.4 1.035 40 30 100 2 1 4.7 0 F C
14 23.2 1.008 23 32 90 12 1 6 1 F A
15 22.5 0.978 23 32 80 8 1 4.9 1 F A
16 44.7 1.117 40 44 90 4 0 5.7 0 M C
17 63.8 1.120 57 27 55 3 1 3 1 F E
18 34.9 1.127 31 31 80 11 1 5.6 0 F B
19 24.9 1.083 23 32 85 1 0 4.6 1 M A
20 33.4 1.076 31 44 70 16 1 4.8 0 F B
21 73.5 1.097 67 43 95 13 0 6.3 1 M F
22 51.4 1.071 48 48 65 6 1 3.8 1 F D
23 25.1 1.090 23 36 65 6 1 3.3 0 F A
24 48.6 1.013 48 30 75 9 1 3.8 0 F D
25 24.4 1.062 23 41 70 4 0 4 0 M A
26 24.3 1.058 23 22 95 2 1 6.2 0 F A
27 39.2 0.980 40 35 80 7 0 3.9 1 M C
28 75.8 1.131 67 44 95 9 1 4.4 0 F F
29 76.3 1.139 67 52 95 5 0 5.4 0 M F
30 48.6 1.012 48 45 90 18 0 4.3 0 M D
31 23.4 1.018 23 29 60 4 1 3.9 1 F A
32 27.5 0.886 31 25 95 4 0 5.6 0 M B
33 58 1.018 57 35 90 9 0 5.5 1 M E
34 27 0.871 31 26 80 2 0 4.9 1 M B
35 23 0.998 23 23 90 4 1 5.3 0 F A
36 23 1.001 23 27 75 3 1 4.3 0 F A
37 23.6 1.026 23 22 95 2 1 6.2 0 F A
38 60.8 1.066 57 45 95 11 0 4.5 0 M E
39 35 1.129 31 27 90 6 1 5.5 0 F B
40 24.3 1.056 23 24 90 2 0 6.3 0 M A
41 40.5 1.013 40 25 80 5 0 4.3 0 M C
42 22.6 0.982 23 32 100 8 1 5.7 1 F A
43 76 1.134 67 42 95 20 1 5.5 0 F F
44 63.1 1.107 57 45 90 16 0 5.2 1 M E
45 52.9 1.103 48 36 95 8 1 5.2 1 F D
46 59.1 1.036 57 39 75 20 0 3.9 1 M E
47 62.6 1.099 57 37 95 5 0 5.5 1 M E
48 67.4 1.183 57 34 90 11 1 5.3 1 F E
49 58.4 1.025 57 41 95 21 0 6.6 0 M E
50 61.4 1.077 57 38 80 12 0 4.6 0 M E

Week 1

Week 1: Descriptive Statistics, including Probability Salary Gender1
While the lectures will examine our equal pay question from the compa-ratio viewpoint, our weekly assignments will focus on 33.8 F
examining the issue using the salary measure. 41.6 F
23.6 F
The purpose of this assignmnent is two fold: 22.5 F
1. Demonstrate mastery with Excel tools. 24.1 F
2. Develop descriptive statistics to help examine the question. 41.4 F
3. Interpret descriptive outcomes 23.2 F
22.5 F
The first issue in examining salary data to determine if we - as a company - are paying males and females equally for doing equal work is to develop some 63.8 F
descriptive statistics to give us something to make a preliminary decision on whether we have an issue or not. 34.9 F
33.4 F
1 Descriptive Statistics: Develop basic descriptive statistics for Salary 51.4 F
The first step in analyzing data sets is to find some summary descriptive statistics for key variables. 25.1 F
Suggestion: Copy the gender1 and salary columns from the Data tab to columns T and U at the right. 48.6 F
Then use Data Sort (by gender1) to get all the male and female salary values grouped together. 24.3 F
75.8 F
a. Use the Descriptive Statistics function in the Data Analysis tab Place Excel outcome in Cell K19 23.4 F
to develop the descriptive statistics summary for the overall Column1 23 F
group's overall salary. (Place K19 in output range.) 23 F
Highlight the mean, sample standard deviation, and range. Mean 44.364 23.6 F
Standard Error 2.6490133088 35 F
Median 41.5 22.6 F
Mode 23.6 76 F
b. Using Fx (or formula) functions find the following (be sure to show the formula Standard Deviation 18.7313527411 52.9 F
and not just the value in each cell) asked for salary statistics for each gender: Sample Variance 350.8635755102 67.4 F
Male Female Male Female Kurtosis -1.3765917226 58 M
Mean: 51.252 37.476 24.300 76.300 =AVERAGE(T27:T51) =AVERAGE(T2:T26) Skewness 0.2827691781 28 M
Sample Standard Deviation: 17.4001465511 17.7408727707 22.500 76 =STDEV.S(T27:T51) =STDEV.S(T2:T26) Range 53.8 64.8 M
Range: 52.000 53.500 =H28-G28 =H29-G29 Minimum 22.5 48.7 M
Maximum 76.3 75.4 M
Sum 2218.2 73.8 M
Count 50 58.3 M
44.7 M
2 Develop a 5-number summary for the overall, male, and female SALARY variable. 24.9 M
For full credit, use the excel formulas in each cell rather than simply the numerical answer. 73.5 M
Overall Males Females Overall Males Females 24.4 M
Max 76.300 76.300 76.000 =QUARTILE(T$2:T$51,4) =QUARTILE($T$27:$T$51,4) =QUARTILE($T$2:$T$26,4) 39.2 M
3rd Q 60.375 62.600 48.600 =QUARTILE(T$2:T$51,3) =QUARTILE($T$27:$T$51,3) =QUARTILE($T$2:$T$26,3) 76.3 M
Midpoint 41.500 58.000 33.400 =QUARTILE(T$2:T$51,2) =QUARTILE($T$27:$T$51,2) =QUARTILE($T$2:$T$26,2) 48.6 M
1st Q 24.525 39.200 23.400 =QUARTILE(T$2:T$51,1) =QUARTILE($T$27:$T$51,1) =QUARTILE($T$2:$T$26,1) 27.5 M
Min 22.500 24.300 22.500 =QUARTILE(T$2:T$51,0) =QUARTILE($T$27:$T$51,0) =QUARTILE($T$2:$T$26,0) 58 M
27 M
3 Location Measures: comparing Male and Female midpoints to the overall Salary data range. 60.8 M
For full credit, show the excel formulas in each cell rather than simply the numerical answer. 24.3 M
Using the entire Salary range and the M and F midpoints found in Q2 Male Female Male Female 40.5 M
a. What would each midpoint's percentile rank be in the overall range? 0.645 0.42 Use Excel's =PERCENTRANK.EXC function =PERCENTRANK.EXC(E38:E42,F40) =PERCENTRANK.EXC(E38:E42,G40) 63.1 M
b. What is the normal curve z value for each midpoint within overall range? 0.7280 -0.5853 Use Excel's =STANDARDIZE function =STANDARDIZE(F40,L21,L25) =STANDARDIZE(G40,L21,L25) 59.1 M
62.6 M
4 Probability Measures: comparing Male and Female midpoints to the overall Salary data range 58.4 M
For full credit, show the excel formulas in each cell rather than simply the numerical answer. 61.4 M
Using the entire Salary range and the M and F midpoints found in Q2, find Male Female Male Female
a. The Empirical Probability of equaling or exceeding (=>) that value for 0.36 0.64 Show the calculation formula = value/50 or =countif(range,">="&cell)/50 =COUNTIF(T2:T51,">="&F40)/50 =COUNTIF(T2:T51,">="&G40)/50
b. The Normal curve Prob of => that value for each group 0.233313734430809 0.720836669337808 Use "=1-NORM.S.DIST" function =1-NORM.S.DIST(I48, TRUE) =1-NORM.S.DIST(J48, TRUE)
Note: be sure to use the ENTIRE salary range for part a when finding the probability.
5 Conclusions: What do you make of these results? Be sure to include findings from this week's lectures as well.
In comparing the overall, male, and female outcomes, what relationship(s) see, to exist between the data sets?
Your findings: Although with a lower mean, females end up making more in the long run, with a higher empiical probability as well.
The lecture's related findings: From the lecture, the found the same, females tend to have a higher average of pay than males.
Overall conclusion: Males tend to start off with a higher salary, but females end up with a higher salary in the peak of their careers.
What does this suggest about our equal pay for equal work question? It suggests, that equal pay is given for both genders.
Looking at the mean, standard deviation, and the range the numbers are close, we can say that the pay is pretty much equal.

Week 2

Week 2: Identifying Significant Differences - part 1 Salary Gender male salaries female salaries Male salaries female salaries Salary Degree Salaries with advanced degree salary without advanced degree Salaries with advanced degree Salaries without advanced degrees
58 0 58 58 33.8 58 0 58 33.8 58
To Ensure full credit for each question, you need to show how you got your results. This involves either showing where the data you used is located 28 0 28 28 41.6 28 0 28 64.8 28
or showing the excel formula in each cell. Be sure to copy the appropriate data columns from the data tab to the right for your use this week. 33.8 1 33.8 64.8 23.6 33.8 1 33.8 48.7 58.3
64.8 0 64.8 48.7 22.5 64.8 1 64.8 75.4 41.4
As with our examination of compa-ratio in the lecture, the first question we have about salary between the genders involves equality - are they the same or different? 48.7 0 48.7 75.4 24.1 48.7 1 48.7 41.6 44.7
What we do, depends upon our findings. 75.4 0 75.4 73.8 41.4 75.4 1 75.4 23.6 34.9
41.6 1 41.6 58.3 23.2 41.6 1 41.6 73.8 33.4
1 As with the compa-ratio lecture example, we want to examine salary variation within the groups - are they equal? . Use Cell K10 for the Excel test outcome location. 23.6 1 23.6 44.7 22.5 23.6 1 23.6 22.5 25.1
a What is the data input ranged used for this question: F-Test Two-Sample for Variances 73.8 0 73.8 24.9 63.8 73.8 1 73.8 24.1 48.6
V2:W26 on this sheet 22.5 1 22.5 73.5 34.9 22.5 1 22.5 23.2 24.4
b Which is needed for this question: a one- or two-tail hypothesis statement and test ? Variable 1 Variable 2 24.1 1 24.1 24.4 33.4 24.1 1 24.1 22.5 24.3
Answer: One-tailed test Mean 51.252 37.476 58.3 0 58.3 39.2 51.4 58.3 0 58.3 63.8 75.8
Why: It uses F-test which is a one-tailed test. Variance 302.7651 314.7385666667 41.4 1 41.4 76.3 25.1 41.4 0 41.4 24.9 76.3
Observations 25 25 23.2 1 23.2 48.6 48.6 23.2 1 23.2 73.5 48.6
c. Step 1: Ho: male salary variance=female salary variance df 24 24 22.5 1 22.5 27.5 24.3 22.5 1 22.5 51.4 27.5
Ha: male salary variance≠female salary variance F 0.9619574214 44.7 0 44.7 58 75.8 44.7 0 44.7 39.2 23
Step 2: Significance (Alpha): 0.05 P(F<=f) one-tail 0.4625480001 63.8 1 63.8 27 23.4 63.8 1 63.8 23.4 23
Step 3: Test Statistic and test: Test statistic is the F and test is the F-test F Critical one-tail 0.5040933467 34.9 1 34.9 60.8 23 34.9 0 34.9 58 23.6
Why this test? Because it tests for the equality of variances In this case, we test if male population salary variance is equal to female population salary variance 24.9 0 24.9 24.3 23 24.9 1 24.9 27 60.8
Step 4: Decision rule: Reject the null hypothesis if pvalue is less than 0.05 33.4 1 33.4 40.5 23.6 33.4 0 33.4 22.6 35
Step 5: Conduct the test - place test function in cell k10 73.5 0 73.5 63.1 35 73.5 1 73.5 63.1 24.3
51.4 1 51.4 59.1 22.6 51.4 1 51.4 52.9 40.5
Step 6: Conclusion and Interpretation 25.1 1 25.1 62.6 76 25.1 0 25.1 59.1 76
What is the p-value: 0.4625 48.6 1 48.6 58.4 52.9 48.6 0 48.6 62.6 58.4
What is your decision: REJ or NOT reject the null? NOT reject the null 24.4 0 24.4 61.4 67.4 24.4 0 24.4 67.4 61.4
Why? p-value>0.05 24.3 1 24.3 24.3 0 24.3
What is your conclusion about the variance in the population for male and female salaries? 39.2 0 39.2 39.2 1 39.2
the variance in population and female variances are equal 75.8 1 75.8 75.8 0 75.8
76.3 0 76.3 76.3 0 76.3
2 Once we know about variance quality, we can move on to means: Are male and female average salaries equal? Use Cell K35 for the Excel test outcome location. 48.6 0 48.6 48.6 0 48.6
(Regardless of the outcome of the above F-test, assume equal variances for this test.) 23.4 1 23.4 23.4 1 23.4
27.5 0 27.5 27.5 0 27.5
a What is the data input ranged used for this question: 58 0 58 58 1 58
V2:W26 on this sheet t-Test: Two-Sample Assuming Equal Variances 27 0 27 27 1 27
b Does this question need a one or two-tail hypothesis statement and test? Two-tailed test 23 1 23.0 23 0 23
Why: we are testing for the possibility of the relationship in both directions; either less than or greater than Variable 1 Variable 2 23 1 23.0 23 0 23
c. Step 1: Ho: population mean male salary=population mean female salary Mean 51.252 37.476 23.6 1 23.6 23.6 0 23.6
Ha: population mean male salary≠population mean female salary Variance 302.7651 314.7385666667 60.8 0 60.8 60.8 0 60.8
Step 2: Significance (Alpha): 0.05 Observations 25 25 35 1 35.0 35 0 35
Step 3: Test Statistic and test: t-statistic and t-test Pooled Variance 308.7518333333 24.3 0 24.3 24.3 0 24.3
Why this test? We are testing for the differences between group means Hypothesized Mean Difference 0 40.5 0 40.5 40.5 0 40.5
Step 4: Decision rule: Reject the null hypothesis if p-value is less than 0.05 df 48 22.6 1 22.6 22.6 1 22.6
Step 5: Conduct the test - place test function in cell K35 t Stat 2.7718732548 76 1 76.0 76 0 76
P(T<=t) one-tail 0.003954294 63.1 0 63.1 63.1 1 63.1
Step 6: Conclusion and Interpretation t Critical one-tail 1.6772241961 52.9 1 52.9 52.9 1 52.9
What is the p-value: 0.0079 P(T<=t) two-tail 0.0079085881 59.1 0 59.1 59.1 1 59.1
What is your decision: REJ or NOT reject the null? Reject the null hypothesis t Critical two-tail 2.0106347576 62.6 0 62.6 62.6 1 62.6
Why? p-value<0.05 67.4 1 67.4 67.4 1 67.4
What is your conclusion about the means in the population for male and female salaries? 58.4 0 58.4 58.4 0 58.4
61.4 0 61.4 61.4 0 61.4
The population means for male and female salaries are not equal
3 Education is often a factor in pay differences.
Do employees with an advanced degree (degree = 1) have higher average salaries? Use Cell K60 for the Excel test outcome location.
Note: assume equal variance for the salaries in each degree for this question.
a What is the data input ranged used for this question:
AC2:AD26 in this sheet
b Does this question need a one or two-tail hypothesis statement and test? one-tailed
Why: We are only tesing for the possibility in one direction (ie higher average salaries) t-Test: Two-Sample Assuming Equal Variances
c. Step 1: Ho: Salaries for those with advanced degrees is equal to salaries for those without advanced degrees
Ha: Salaries for those with advanced degrees is greater than salaries for those without advanced degrees Variable 1 Variable 2
Step 2: Significance (Alpha): 0.05 Mean 45.716 43.012
Step 3: Test Statistic and test: t-statistic and t-test Variance 382.1664 330.3719333333
Why this test? We are testing for the equality of means Observations 25 25
Step 4: Decision rule: Reject the null hypothesis if p-value<0.05 Pooled Variance 356.2691666667
Step 5: Conduct the test - place test function in cell K60 Hypothesized Mean Difference 0
df 48
Step 6: Conclusion and Interpretation t Stat 0.5064919822
What is the p-value: 0.3074 P(T<=t) one-tail 0.3074150127
Is the t value in the t-distribution tail indicated by the arrow in the Ha claim? t Critical one-tail 1.6772241961
No P(T<=t) two-tail 0.6148300254
What is your decision: REJ or NOT reject the null? NOT reject the null t Critical two-tail 2.0106347576
Why? p-value>0.05
What is your conclusion about the impact of education on average salaries?
Salaries for those with advanced degrees is equal to salaries for those without advanced degrees
Therefore, education does not have an impact on salaries
4 Considering both the compa-ratio information from the lectures and your salary information, what conclusions can you reach about equal pay for equal work?
Your findings: We can see that there is a significant difference between male and female salaries
The lecture's related findings: It does not appear that the equal pay act is not being violated. It seems that male and femals are being paid around the same.
Overall conclusion: Equal work for equal pay does not hold
Why - what statistical results support this conclusion?
the t-test for equality of means in number 2

Week 3

Week 3: Identifying Significant Differences - part 2 Data Input Table: Salary Range Groups
Group name: A B C D E F
To Ensure full credit for each question, you need to show how you got your results. This involves either showing where the data you used is located List salaries within each grade
or showing the excel formula in each cell. Be sure to copy the appropriate data columns from the data tab to the right for your use this week.
1 A good pay program will have different average salaries by grade. Is this the case for our company?
a What is the data input ranged used for this question: Use Cell K08 for the Excel test outcome location.
Note: assume equal variances for each grade, even though this may not be accurate, for purposes of this question.
b. Step 1: Ho:
Ha:
Step 2: Significance (Alpha):
Step 3: Test Statistic and test:
Why this test?
Step 4: Decision rule:
Step 5: Conduct the test - place test function in cell K08
Step 6: Conclusion and Interpretation
What is the p-value:
What is your decision: REJ or NOT reject the null?
Why?
What is your conclusion about the means in the population for grade salaries?
2 If the null hypothesis in question 1 was rejected, which pairs of means differ?
(Use the values from the ANOVA table to complete the follow table.)
Groups Compared Mean Diff. T value used +/- Term Low to High Difference Significant? Why?
A-B
A-C
A-D
A-E
A-F
B-C
B-D
B-E
B-E
C-D
C-E
C-F
D-E
D-F
E-F
3 One issue in salary is the grade an employee is in - higher grades have higher salaries.
This suggests that one question to ask is if males and females are distributed in a similar pattern across the salary grades?
a What is the data input ranged used for this question: Use Cell K54 for the Excel test outcome location.
b. Step 1: Ho:
Ha:
Step 2: Significance (Alpha):
Step 3: Test Statistic and test: Place the actual distribution in the table below.
Why this test? A B C D E F Sum
Step 4: Decision rule: Male 0
Step 5: Conduct the test - place test function in cell K54 Female 0
Sum: 0 0 0 0 0 0 0
Step 6: Conclusion and Interpretation Place the expected distribution in the table below.
What is the p-value: A B C D E F
What is your decision: REJ or NOT reject the null? Male 0
Why? Female 0
What is your conclusion about the means in the population for male and female salaries? Sum: 0 0 0 0 0 0 0
4 What implications do this week's analysis have for our equal pay question?
Your findings:
The lecture's related findings:
Overall conclusion:
Why - what statistical results support this conclusion?

Week 4

Week 4: Identifying relationships - correlations and regression
To Ensure full credit for each question, you need to show how you got your results. This involves either showing where the data you used is located
or showing the excel formula in each cell. Be sure to copy the appropriate data columns from the data tab to the right for your use this week.
1 What is the correlation between and among the interval/ratio level variables with salary? (Do not include compa-ratio in this question.)
a. Create the correlation table. Use Cell K08 for the Excel test outcome location.
i. What is the data input ranged used for this question:
ii. Create a correlation table in cell K08.
b. Technically, we should perform a hypothesis testing on each correlation to determine
if it is significant or not. However, we can be faithful to the process and save some
time by finding the minimum correlation that would result in a two tail rejection of the null.
We can then compare each correlation to this value, and those exceeding it (in either a
positive or negative direction) can be considered statistically significant.
i. What is the t-value we would use to cut off the two tails? T =
ii. What is the associated correlation value related to this t-value? r =
c. What variable(s) is(are) significantly correlated to salary?
d. Are there any surprises - correlations you though would be significant and are not, or non significant correlations you thought would be?
e. Why does or does not this information help answer our equal pay question?
2 Perform a regression analysis using salary as the dependent variable and all of the variables used in Q1. Add the
two dummy variables - gender and education - to your list of independent variables. Show the result, and interpret your findings by answering the following questions.
Suggestion: Add the dummy variables values to the right of the last data columns used for Q1.
What is the multiple regression equation predicting/explaining salary using all of our possible variables except compa-ratio?
a. What is the data input ranged used for this question:
b. Step 1: State the appropriate hypothesis statements: Use Cell M34 for the Excel test outcome location.
Ho:
Ha:
Step 2: Significance (Alpha):
Step 3: Test Statistic and test:
Why this test?
Step 4: Decision rule:
Step 5: Conduct the test - place test function in cell M34
Step 6: Conclusion and Interpretation
What is the p-value:
What is your decision: REJ or NOT reject the null?
Why?
What is your conclusion about the factors influencing the population salary values?
c. If we rejected the null hypothesis, we need to test the significance of each of the variable coefficients.
Step 1: State the appropriate coefficient hypothesis statements: (Write a single pair, we will use it for each variable separately.)
Ho:
Ha:
Step 2: Significance (Alpha):
Step 3: Test Statistic and test:
Why this test?
Step 4: Decision rule:
Step 5: Conduct the test
Note, in this case the test has been performed and is part of the Regression output above.
Step 6: Conclusion and Interpretation
Place the t and p-values in the following table
Identify your decision on rejecting the null for each variable. If you reject the null, place the coefficient in the table.
Midpoint Age Perf. Rat. Seniority Raise Gender Degree
t-value:
P-value:
Rejection Decision:
If Null is rejected, what is the variable's coefficient value?
Using the intercept coefficient and only the significant variables, what is the equation?
Salary =
d. Is gender a significant factor in salary?
e. Regardless of statistical significance, who gets paid more with all other things being equal?
f. How do we know?
3 After considering the compa-ratio based results in the lectures and your salary based results, what else would you like to know
before answering our question on equal pay? Why?
4 Between the lecture results and your results, what is your answer to the question
of equal pay for equal work for males and females? Why?
Your findings:
The lecture's related findings:
Overall conclusion:
5 What does regression analysis show us about analyzing complex measures?