Statistics

profileAziz12m
912BUS308Statistics.rar

Excel Files Assingments/Copy of Student_Assignment_File.11.01.2016.xlsx

Data

ID Salary Compa-ratio Midpoint Age Performance Rating Service Gender Raise Degree Gender1 Grade Copy Employee Data set to this page.
The ongoing question that the weekly assignments will focus on is: Are males and females paid the same for equal work (under the Equal Pay Act)?
Note: to simplfy the analysis, we will assume that jobs within each grade comprise equal work.
The column labels in the table mean:
ID – Employee sample number Salary – Salary in thousands
Age – Age in years Performance Rating – Appraisal rating (Employee evaluation score)
SERvice – Years of service Gender: 0 = male, 1 = female
Midpoint – salary grade midpoint Raise – percent of last raise
Grade – job/pay grade Degree (0= BS\BA 1 = MS)
Gender1 (Male or Female) Compa-ratio - salary divided by midpoint

Week 2

This assignment covers the material presented in weeks 1 and 2. Six Questions
Before starting this assignment, make sure the the assignment data from the Employee Salary Data Set file is copied over to this Assignment file.
You can do this either by a copy and paste of all the columns or by opening the data file, right clicking on the Data tab, selecting Move or Copy, and copying the entire sheet to this file
(Weekly Assignment Sheet or whatever you are calling your master assignment file).
It is highly recommended that you copy the data columns (with labels) and paste them to the right so that whatever you do will not disrupt the original data values and relationships.
To Ensure full credit for each question, you need to show how you got your results. For example, Question 1 asks for several data values. If you obtain them using descriptive statistics,
then the cells should have an "=XX" formula in them, where XX is the column and row number showing the value in the descriptive statistics table. If you choose to generate each
value using fxfunctions, then each function should be located in the cell and the location of the data values should be shown.
So, Cell D31 - as an example - shoud contain something like "=T6" or "=average(T2:T26)". Having only a numerical value will not earn full credit.
The reason for this is to allow instructors to provide feedback on Excel tools if the answers are not correct - we need to see how the results were obtained.
In starting the analysis on a research question, we focus on overall descriptive statistics and seeing if differences exist. Probing into reasons and mitigating factors is a follow-up activity.
1 The first step in analyzing data sets is to find some summary descriptive statistics for key variables. Since the assignment problems will
focus mostly on the compa-ratios, we need to find the mean, standard deviations, and range for our groups: Males, Females, and Overall.
Sorting the compa-ratios into male and females will require you copy and paste the Compa-ratio and Gender1 columns, and then sort on Gender1.
The values for age, performance rating, and service are provided for you for future use, and - if desired - to test your approach to the compa-ratio answers
(see if you can replicate the values).
You can use either the Data Analysis Descriptive Statistics tool or the Fx =average and =stdev functions.
The range can be found using the difference between the =max and =min functions with Fx functions or from Descriptive Statistics.
Suggestion: Copy and paste the compa-ratio data to the right (Column T) and gender data in column U.
If you use Descriptive statistics, Place the output table in row 1 of a column to the right.
If you did not use Descriptive Statistics, make sure your cells show the location of the data (Example: =average(T2:T51)
Compa-ratio Age Perf. Rat. Service
Overall Mean 35.7 85.9 9.0
Standard Deviation 8.2513 11.4147 5.7177 Note - remember the data is a sample from the larger company population
Range 30 45 21
Female Mean 32.5 84.2 7.9
Standard Deviation 6.9 13.6 4.9
Range 26.0 45.0 18.0
Male Mean 38.9 87.6 10.0
Standard Deviation 8.4 8.7 6.4
Range 28.0 30.0 21.0
A key issue in comparing data sets is to see if they are distributed/shaped the same. At this point we can do this
by looking at the probabilities that males and females are distributed in the same way for a grade levels.
2 Empirical Probability: What is the probability for a: Probability
a.       Randomly selected person being in grade E or above?
b.      Randomly selected person being a male in grade E or above?
c.      Randomly selected male being in grade E or above?
d. Why are the results different?
3 Normal Curve based probability: For each group (overall, females, males), what are the values for each question below?:
Make sure your answer cells show the Excel function and cell location of the data used.
A The probability of being in the top 1/3 of the compa-ratio distribution.
Note, we can find the cutoff value for the top 1/3 using the fx Large function: =large(range, value).
Value is the number that identifies the x-largest value. For the top 1/3 value would be the value that starts the top 1/3 of the range,
For the overall group, this would be the 50/3 or 17th (rounded), for the gender groups, it would be the 25/3 = 8th (rounded) value.
Overall Female Male All of the functions below are in the fx statistical list.
i. How nany salaries are in the top 1/3 (rounded to nearest whole number) for each group? Use the "=ROUND" function (found in Math or All list)
ii What Compa-ratio value starts the top 1/3 of the range for each group? Use the "=LARGE" function
iii What is the z-score for this value? Use Excel's STANDARDIZE function
iv. What is the normal curve probability of exceeding this score? Use "=1-NORM.S.DIST" function
B How do you interpret the relationship between the data sets? What does this suggest about our equal pay for equal work question?
4 Based on our sample data set, can the male and female compa-ratios in the population be equal to each other?
A First, we need to determine if these two groups have equal variances, in order to decide which t-test to use.
What is the data input ranged used for this question:
Step 1: Ho:
Ha:
Step 2: Decision Rule:
Step 3: Statistical test:
Why?
Step 4: Conduct the test - place cell B77 in the output location box.
Step 5: Conclusion and Interpretation
What is the p-value:
Is the P-value < 0.05 (for a one tail test) or 0.025 (for a two tail test)?
What is your decision: REJ or NOT reject the null?
What does this result say about our question of variance equality?
B Are male and female average compa-ratios equal?
(Regardless of the outcome of the above F-test, assume equal variances for this test.)
What is the data input ranged used for this question:
Step 1: Ho:
Ha:
Step 2: Decision Rule:
Step 3: Statistical test:
Why?
Step 4: Conduct the test - place cell B109 in the output location box.
Step 5: Conclusion and Interpretation
What is the p-value:
Is the P-value < 0.05 (for a one tail test) or 0.025 (for a two tail test)?
What is your decision: REJ or NOT reject the null?
What does your decision on rejecting the null hypothesis mean?
If the null hypothesis was rejected, calculate the effect size value:
If the effect size was calculated, what doe the result mean in terms of why the null hypothesis was rejected?
What does the result of this test tell us about our question on salary equality?
5 Is the Female average compa-ratio equal to or less than the midpoint value of 1.00?
This question is the same as: Does the company, pay its females - on average - at or below the grade midpoint (which is considered the market rate)?
Suggestion: Use the data column T to the right for your null hypothesis value.
What is the data input ranged used for this question:
Step 1: Ho:
Ha:
Step 2: Decision Rule:
Step 3: Statistical test:
Why?
Step 4: Conduct the test - place cell B162 in the output location box.
Step 5: Conclusion and Interpretation
What is the p-value:
Is the P-value < 0.05 (for a one tail test) or 0.025 (for a two tail test)?
What, besides the p-value, needs to be considered with a one tail test?
Decision: Reject or do not reject Ho?
What does your decision on rejecting the null hypothesis mean?
If the null hypothesis was rejected, calculate the effect size value:
If the effect size was calculated, what doe the result mean in terms of why the null hypothesis was rejected?
What does the result of this test tell us about our question on salary equality?
6 Considering both the salary information in the lectures and your compa-ratio information, what conclusions can you reach about equal pay for equal work?
Why - what statistical results support this conclusion?

Week 3

Week 3 ANOVA Three Questions
Remember to show how you got your results in the appropriate cells. For questions using functions, show the input range when asked.
Group name: G1 G2 G3 G4 G5 G6
1 One interesting question is are the average compa-ratios equal across salary ranges of 10K each. Salary Intervals: 22-29 30-39 40-49 50-59 60-69 70-79
While compa-ratios remove the impact of grade on salaries, are they different for different pay levels, Compa-ratio values:
that is are people at different levels paid differently relative to the midpoint? (Put data values at right.)
What is the data input ranged used for this question:
Step 1: Ho:
Ha:
Step 2: Decision Rule:
Step 3: Statistical test:
Why?
Step 4: Conduct the test - place cell b16 in the output location box.
Step 5: Conclusions and Interpretation
What is the p-value?
Is P-value < 0.05?
What is your decision: REJ or NOT reject the null?
If the null hypothesis was rejected, what is the effect size value (eta squared)?
If calculated, what does the effect size value tell us about why the null hypothesis was rejected?
What does that decision mean in terms of our equal pay question?
2 If the null hypothesis in question 1 was rejected, which pairs of means differ? Why?
Groups Compared Diff T +/- Term Low to High Difference Significant? Why?
G1 G2
G1 G3
G1 G4
G1 G5
G1 G6
G2 G3
G2 G4
G2 G5
G2 G6
G3 G4
G3 G5
G3 G6
G4 G5
G4 G6
G5 G6
3 Since compa is already a measure of pay for equal work, do these results impact
your conclusion on equal pay for equal work? Why or why not?

Week 4

Regression and Corellation Five Questions Compa-ratio Midpoint Age Performance Rating Service Raise Degree Gender
Remember to show how you got your results in the appropriate cells. For questions using functions, show the input range when asked.
1 Create a correlation table using Compa-ratio and the other interval level variables, except for Salary.
Suggestion, place data in columns T - Y.
What range was placed in the Correlation input range box:
Place C9 in output box.
b What are the statistically significant correlations related to Compa-ratio? T = Significant r =
c Are there any surprises - correlations you though would be significant and are not, or non significant correlations you thought would be?
d Why does or does not this information help answer our equal pay question?
2 Perform a regression analysis using compa as the dependent variable and the variables used in Q1 along with
including the dummy variables. Show the result, and interpret your findings by answering the following questions.
Suggestion: Place the dummy variables values to the right of column Y.
What range was placed in the Regression input range box:
Note: be sure to include the appropriate hypothesis statements.
Regression hypotheses
Ho:
Ha:
Coefficient hyhpotheses (one to stand for all the separate variables)
Ho:
Ha:
Place B36 in output box.
Interpretation:
For the Regression as a whole:
What is the value of the F statistic:
What is the p-value associated with this value:
Is the p-value < 0.05?
What is your decision: REJ or NOT reject the null?
What does this decision mean?
For each of the coefficients: Midpoint Age Perf. Rat. Service Gender Degree
What is the coefficient's p-value for each of the variables:
Is the p-value < 0.05?
Do you reject or not reject each null hypothesis:
What are the coefficients for the significant variables?
Using the intercept coefficient and only the significant variables, what is the equation? Compa-ratio =
Is gender a significant factor in compa-ratio?
Regardless of statistical significance, who gets paid more with all other things being equal?
How do we know?
3 What does regression analysis show us about analyzing complex measures?
4 Between the lecture results and your results, what else would you like to know
before answering our question on equal pay? Why?
5 Between the lecture results and your results, what is your answer to the question
of equal pay for equal work for males and females? Why?

Excel Files Assingments/Randomized BUS308 Data - 08.01.2017.xlsm

Data

ID Salary Compa Midpoint Age Performance Rating Service Gender Raise Degree Gender1 Gr
1 63.3 1.110 57 34 85 8 0 5.7 0 M E The ongoing question that the weekly assignments will focus on is: Are males and females paid the same for equal work (under the Equal Pay Act)?
2 28.3 0.914 31 52 80 7 0 3.9 0 M B Note: to simplfy the analysis, we will assume that jobs within each grade comprise equal work.
3 35.5 1.144 31 30 75 5 1 3.6 1 F B
4 64.8 1.137 57 42 100 16 0 5.5 1 M E The column labels in the table mean:
5 48.4 1.008 48 36 90 16 0 5.7 1 M D ID – Employee sample number Salary – Salary in thousands
6 78.4 1.170 67 36 70 12 0 4.5 1 M F Age – Age in years Performance Rating - Appraisal rating (employee evaluation score)
7 41.5 1.037 40 32 100 8 1 5.7 1 F C Service – Years of service (rounded) Gender – 0 = male, 1 = female
8 24.7 1.073 23 32 90 9 1 5.8 1 F A Midpoint – salary grade midpoint Raise – percent of last raise
9 77.2 1.152 67 49 100 10 0 4 1 M F Grade – job/pay grade Degree (0= BS\BA 1 = MS)
10 23.4 1.019 23 30 80 7 1 4.7 1 F A Gender1 (Male or Female) Compa - salary divided by midpoint
11 23.8 1.035 23 41 100 19 1 4.8 1 F A
12 59.7 1.048 57 52 95 22 0 4.5 0 M E
13 42.5 1.062 40 30 100 2 1 4.7 0 F C
14 24 1.042 23 32 90 12 1 6 1 F A
15 23.4 1.018 23 32 80 8 1 4.9 1 F A
16 44.2 1.106 40 44 90 4 0 5.7 0 M C
17 68.4 1.200 57 27 55 3 1 3 1 F E
18 34.9 1.126 31 31 80 11 1 5.6 0 F B
19 24.7 1.072 23 32 85 1 0 4.6 1 M A
20 33.9 1.095 31 44 70 16 1 4.8 0 F B
21 75.1 1.121 67 43 95 13 0 6.3 1 M F
22 50.2 1.046 48 48 65 6 1 3.8 1 F D
23 23.9 1.038 23 36 65 6 1 3.3 0 F A
24 59.5 1.239 48 30 75 9 1 3.8 0 F D
25 23.7 1.029 23 41 70 4 0 4 0 M A 1.08164 0.0702512633 43.896 19.2867718052
26 25.7 1.116 23 22 95 2 1 6.2 0 F A 1.06248 0.0789066537 46.7 19.4289303188
27 36.2 0.906 40 35 80 7 0 3.9 1 M C .
28 78.1 1.166 67 44 95 9 1 4.4 0 F F
29 68.7 1.026 67 52 95 5 0 5.4 0 M F
30 48.4 1.009 48 45 90 18 0 4.3 0 M D
31 23.2 1.008 23 29 60 4 1 3.9 1 F A
32 29.2 0.942 31 25 95 4 0 5.6 0 M B Compa A B C D E F
33 65.9 1.156 57 35 90 9 0 5.5 1 M E F mean 1.0141666667 1.1205 1.0375 1.1243333333 1.175 1.134
34 27.8 0.898 31 26 80 2 0 4.9 1 M B m 1.0573333333 0.892 1.0833333333 0.9995 1.0818 1.12275
35 24.4 1.062 23 23 90 4 1 5.3 0 F A
36 24.1 1.049 23 27 75 3 1 4.3 0 F A F Stdev 0.0337876614 0.0311608729 0.0176776695 0.0751620472 0.0494974747 0.0212132034
37 23.6 1.024 23 22 95 2 1 6.2 0 F A m 0.0248260616 0.0190525589 0.0877971146 0.028991378 0.0607029196 0.0332603367
38 57.9 1.015 57 45 95 11 0 4.5 0 M E
39 34.9 1.125 31 27 90 6 1 5.5 0 F B
40 24.1 1.048 23 24 90 2 0 6.3 0 M A
41 45.4 1.134 40 25 80 5 0 4.3 0 M C
42 23.9 1.037 23 32 100 8 1 5.7 1 F A
43 76.4 1.140 67 42 95 20 1 5.5 0 F F
44 64.1 1.124 57 45 90 16 0 5.2 1 M E
45 49.6 1.034 48 36 95 8 1 5.2 1 F D
46 65.7 1.153 57 39 75 20 0 3.9 1 M E
47 57.3 1.006 57 37 95 5 0 5.5 1 M E
48 67.4 1.183 57 34 90 11 1 5.3 1 F E
49 61.7 1.082 57 41 95 21 0 6.6 0 M E
50 63.8 1.119 57 38 80 12 0 4.6 0 M E

Sheet1

Sal Compa G Mid Age EES SR G Raise Deg SUMMARY OUTPUT SUMMARY OUTPUT
24 1.045 1 23 32 90 9 1 5.8 1
24.2 1.053 1 23 30 80 7 1 4.7 1 Regression Statistics Regression Statistics
23.4 1.018 1 23 41 100 19 1 4.8 1 Multiple R 0.7050179484 Multiple R 0.9931286935
23.4 1.017 1 23 32 90 12 1 6 1 R Square 0.4970503076 R Square 0.9863046018
22.6 0.983 1 23 32 80 8 1 4.9 1 Adjusted R Square 0.4132253589 Adjusted R Square 0.9840220355
22.9 0.995 1 23 36 65 6 1 3.3 0 Standard Error 0.0561252686 Standard Error 2.4352822665
23.1 1.003 1 23 22 95 2 1 6.2 0 Observations 50 Observations 50
23.3 1.011 1 23 29 60 4 1 3.9 1
22.7 0.985 1 23 23 90 4 1 5.3 0 ANOVA ANOVA
23.5 1.023 1 23 27 75 3 1 4.3 0 df SS MS F Significance F df SS MS F Significance F
23 1.002 1 23 22 95 2 1 6.2 0 Regression 7 0.1307500775 0.0186785825 5.9296225662 0.0000782906 Regression 7 17938.424611863 2562.632087409 432.1033638177 5.29906273684337E-37
24 1.042 1 23 32 100 8 1 5.7 1 Residual 42 0.1323019225 0.0031500458 Residual 42 249.085188137 5.9305997175
35.5 1.145 1 31 30 75 5 1 3.6 1 Total 49 0.263052 Total 49 18187.5098
34.7 1.119 1 31 31 80 11 1 5.6 0
35.5 1.146 1 31 44 70 16 1 4.8 0 Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0% Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0%
35.2 1.136 1 31 27 90 6 1 5.5 0 Intercept 0.9486238772 0.0817167716 11.6086803119 0 0.7837127557 1.1135349987 0.7837127557 1.1135349987 Intercept -4.8714544587 3.54570071 -1.3739045839 0.1767599037 -12.0269681853 2.2840592678 -12.0269681853 2.2840592678
40.4 1.01 1 40 32 100 8 1 5.7 1 Mid 0.0034995027 0.0006492568 5.3900133356 0.0000029767 0.0021892495 0.0048097559 0.0021892495 0.0048097559 Mid 1.2284155048 0.0281713308 43.6051641629 1.32019333894083E-36 1.1715634576 1.2852675521 1.1715634576 1.2852675521
42.7 1.068 1 40 30 100 2 1 4.7 0 Age 0.0005527738 0.0014459446 0.3822925256 0.7041721007 -0.0023652605 0.0034708081 -0.0023652605 0.0034708081 Age 0.0368279425 0.0627397124 0.5869957178 0.5603489282 -0.0897859231 0.1634418081 -0.0897859231 0.1634418081
53.4 1.112 1 48 48 65 6 1 3.8 1 EES -0.0018462553 0.0010252155 -1.8008461371 0.0789105539 -0.0039152239 0.0002227133 -0.0039152239 0.0002227133 EES -0.0821579785 0.0444842245 -1.8469014451 0.0718147225 -0.171930778 0.007614821 -0.171930778 0.007614821
51.5 1.072 1 48 30 75 9 1 3.8 0 SR -0.0004182288 0.0018278101 -0.2288141345 0.820123898 -0.004106899 0.0032704414 -0.004106899 0.0032704414 SR -0.0778484529 0.079308905 -0.9815852701 0.3319249969 -0.2379003029 0.0822033971 -0.2379003029 0.0822033971
49.8 1.037 1 48 36 95 8 1 5.2 1 G 0.0646649961 0.0183396697 3.5259629624 0.001034866 0.0276540443 0.101675948 0.0276540443 0.101675948 G 2.9145083112 0.7957605113 3.6625445343 0.000693549 1.3085985836 4.5204180389 1.3085985836 4.5204180389
68.3 1.198 1 57 27 55 3 1 3 1 Raise 0.0146549564 0.0139088976 1.0536389608 0.2980722322 -0.0134143354 0.0427242483 -0.0134143354 0.0427242483 Raise 0.6763294824 0.6035087689 1.1206622295 0.2687988764 -0.5416005215 1.8942594864 -0.5416005215 1.8942594864
65.4 1.148 1 57 34 90 11 1 5.3 1 Deg 0.0014675988 0.0161098249 0.0910996125 0.9278465471 -0.0310433441 0.0339785418 -0.0310433441 0.0339785418 Deg 0.0345044482 0.6990072742 0.0493620731 0.9608647532 -1.3761493419 1.4451582383 -1.3761493419 1.4451582383
78.4 1.17 1 67 44 95 9 1 4.4 0
75.9 1.133 1 67 42 95 20 1 5.5 0
24 1.044 0 23 32 85 1 0 4.6 1
23.3 1.012 0 23 41 70 4 0 4 0
24.1 1.049 0 23 24 90 2 0 6.3 0
27.5 0.887 0 31 52 80 7 0 3.9 0 t-Test: Two-Sample Assuming Equal Variances
27.1 0.875 0 31 25 95 4 0 5.6 0
27.7 0.895 0 31 26 80 2 0 4.9 1 Variable 1 Variable 2
40.8 1.019 0 40 44 90 4 0 5.7 0 Mean 1.06684 1.04836
43.9 1.097 0 40 35 80 7 0 3.9 1 Variance 0.00430164 0.00648099
41 1.025 0 40 25 80 5 0 4.3 0 Observations 25 25
48.7 1.014 0 48 36 90 16 0 5.7 1 Pooled Variance 0.005391315
49.4 1.029 0 48 45 90 18 0 4.3 0 Hypothesized Mean Difference 0
64.4 1.13 0 57 34 85 8 0 5.7 0 df 48
64.5 1.132 0 57 42 100 16 0 5.5 1 t Stat 0.8898352784
58.9 1.033 0 57 52 95 22 0 4.5 0 P(T<=t) one-tail 0.188996287
57.9 1.016 0 57 35 90 9 0 5.5 1 t Critical one-tail 1.6772241961
59 1.035 0 57 45 95 11 0 4.5 0 P(T<=t) two-tail 0.3779925741
63.3 1.111 0 57 45 90 16 0 5.2 1 t Critical two-tail 2.0106347576
56.8 0.996 0 57 39 75 20 0 3.9 1
58 1.017 0 57 37 95 5 0 5.5 1
62.4 1.094 0 57 41 95 21 0 6.6 0
63.8 1.12 0 57 38 80 12 0 4.6 0
79 1.179 0 67 36 70 12 0 4.5 1
77 1.149 0 67 49 100 10 0 4 1
74.8 1.116 0 67 43 95 13 0 6.3 1
76 1.135 0 67 52 95 5 0 5.4 0

Week1/Discusion 1.docx

Part One – Analysis Toolkpak

Add the “Analysis Toolpak” to Excel. Be sure you are you able to copy, sort, and find averages and sums in Excel. Use the  Load the Analysis ToolPak (Links to an external site.)Links to an external site. article for information on how to load this in Excel. (This should be completed on Day 1.)

Part Two – Data Characteristics

Read  Lecture One Preview the document View in a new window on descriptive data and review the Employee Data . Be sure to familiarize yourself with the different variables shown on the Data tab. In this course, we will be using the Employee Data and statistical tools to answer a single research question: In our BUS308 company, are the males and females paid equally for equal work? Lecture One discusses different ways data values can be classified. In our data set for the equal pay for equal work assignment, students in the past have correctly identify the variable gender (coded M and F for male and female respectively) as nominal level data, but they often see gender1 (coded 0 and 1 for male and female respectively) as interval or ratio level data. Why? What could cause this wrong classification? What data do you use in your personal or professional lives that might suffer from not being correctly labeled/understood? (This should be started on Day 1.)

Part Three –Descriptive Statistics

Read  Lecture Two Preview the document View in a new window on describing data sets and view  The Role of Data & Analytics Today  (Links to an external site.)Links to an external site.video. Lecture Two discusses several different ways of summarizing a data set--central location, variability, etc. Often, business reports provide a mean or average value for some measure (such as average number of defects per production run). Why is the average alone not enough information to make informed judgements about the result? What other descriptive statistic should be included? Why? Can you illustrate this with an example from your personal or professional lives? (This should be started on Day 3.)   

Part Four – Probability

Read  Lecture Three Preview the document View in a new window on probability. Lecture Three introduces the idea of probability—a measure of how likely it is to get a particular outcome. Looking at outcomes as resulting from probabilities (somewhat random outcomes/selections) rather than fixed constants often changes the way we see things. How does considering the salary outcomes in our sample the result of a probabilistic sample rather than a completely accurate and precise reflection of the population change how we interpret the sample statistic outcomes? What results in your personal or professional lives could be viewed this way? What differences would this cause? Why? (This should be completed by Day 5 .)

image1.png

image2.png

Week1/excel tool pak needed.txt

Load the Analysis ToolPak https://support.office.com/en-us/article/Load-the-Analysis-ToolPak-6a63e598-cd6d-42e3-9317-6b40ba1a66b4?CorrelationId=b44046dd-0bbf-472c-aaaf-1c7fd6858b56&ui=en-US&rs=en-US&ad=US&ocmsassetID=HP010021569 Microsoft. (2007). Copy Excel data or charts to Word (Links to an external site.)Links to an external site.. Retrieved from http://office.microsoft.com/en-us/word-help/copy-excel-data-or-charts-to-word-HP010198874.aspx Multimedia AnalystSoft Inc. StatPlus:mac LE (Links to an external site.)Links to an external site.. Retrieved from http://www.analystsoft.com/en/products/statplusmacle

Week1/Week 1 Lecture 1.pdf

Week 1 Lecture 1

Class Approach to Statistics

Statistics is basically a set of tools that allow us to get information out of data sets (we will get to the more formal definition below). As such, it can be taught as a math class (focusing on formulas), a logic class (If this, then that), or as a case study (here is the problem, what are we going to do). We have chosen the later – we will be examining statistical tools and approaches as they help us answer a business question.

The question we will focus on involves the Equal Pay Act, specifically the requirement that males and females be paid the same if they are performing equal or equivalent work. So, our business research question is: are males and females paid the same for equal work?

In starting out with our case, we will have a data set that provides a number of variables (measures that can assume different values with different subjects) for each of 50 employees selected randomly from our company. (The company and employee data are fictitious, of course).

For each employee (labeled 1 thru 50 in the ID column), we will have:

• Salary, the annual salary, rounded to the nearest hundred dollars; for example, a salary of 32, 650 would be rounded to 32.7.

• Compa (short for compa-ratio or Comparative ratio) – a measure of how a salary relates to the midpoint of a pay range, found by dividing the salary by the pay range midpoint.

• Midpoint – the middle of the salary range assigned to each grade. • Age – the employee’s age (rounded to the nearest birthday) • Performance rating – a value between 1 and 100 showing the manager’s rating how

good the employee performs their job • Service – the years the employee has been with the company (rounded to the nearest

hiring anniversary • Gender – a numerical code indicating the employee’s gender (1 = female, 0 = male) • Raise – the percent increase in pay of the last performance based increase in salary • Degree – the educational achievement of the employee (0 = BA/BS, 1 = Master’s or

more) • Gender1 – a letter code indicating the employee’s gender (F = female, M = male) • Grade – the employee’s pay level – grade A is the lowest (entry level) and grade E is

the highest.

During each week, we will examine some of these variables to see if they help us answer the question of males and females receiving equal pay for equal work. In the weekly lectures, we will work with the variable salary. In the homework assignments for weeks 2, 3, and 4; you will have the same questions but work with the variable compa, which – by definition – is an alternate method of looking at pay.

If you have any questions about this description of our course case, please ask them in either Ask Your Instructor or in one of the class posts.

Introduction to Statistics

Formally, we can define statistics as “the science of collecting, organizing, presenting, analyzing, and interpreting data to assist in making more effective decisions” (Lind, Marchel, & Wathen, 2008, p. 4). This makes statistics and statistical analysis a subset of both critical thinking and quantitative thinking, both skills that Ashford University has identified as critical abilities for any student graduating with a degree. H. G. Wells, the author, once said that “one day quantitative reasoning will be as necessary for effective citizenship as the ability to read.”

In this class, we will focus mostly on the analyzing and interpreting of data that we will assume has been correctly collected to allow us to use it to make decisions with. In doing this, there is a fairly well agreed upon approach to understanding what the data is trying to tell us. This approach will be followed in this class, and involves:

• Identifying what kinds of data we are working with, then • Developing summary statistics for the data • Developing appropriate statistical tests to make decisions about the population the

data came from. • Drawing conclusions from the test results to answer the initial research question(s).

Data Characteristics

We all recognized that not all data is the same. Saying we “like” something is quite a bit different than saying, the part weighs 3.7 ounces. We treat these two kinds of data in very different ways.

The first distinction we make in data types involves identifying our data as either qualitative or quantitative. Qualitative data identifies characteristics or attributes of something being studied. They are non-numeric and can often be used for grouping purposes. Some examples include nationality, gender, type of car, etc.

Quantitative data, on the other hand, tend to measure how much of what is being examined exists. Examples of these kinds of variables include, money, temperature, number of drawers in a desk, etc.

Within quantitative data, we can identify continuous and discrete data types. Continuous data variables can assume any value with limits. For example, depending upon how accurate our measuring instrument is, the temperature, in degrees Fahrenheit, could be 75, 75.3 75.32, 75.3287468…. There are no natural “breaks” in temperature even though we typically only report it in whole numbers and ignore the decimal portion. Height would be another continuous data variable. Discrete data, on the other hand, has only certain values, and shows breaks between these values. The number of drawers in a desk could be 3 or 4, but not 3.56, for example.

The second important approach in defining data is the “level” of the data. There exist four distinct levels:

• Nominal – these serve as names or labels, and could be considered qualitative. The basic use for this level is to identify distinctions between and among subjects, such as ID numbers, gender identification (Male or Female), car type (Ford, Nissan, etc.). We can basically only count how many exist within each group of a nominal data variable.

• Ordinal – these data have the same characteristics as nominal with the addition of being rankable – that is, we can place them in a descending or ascending order. One example is rating something using good, better, best (even if coded 1 = good, 2 = better, and 3 = best). We can rank this preference, but cannot say the difference between each data point is the same for everyone.

• Interval – this level of data adds the element of constant differences between sequential data points – while we did not know the difference between good and better or better and best; we do know the difference between 57 degrees and 58 degrees – and it is the same as the difference between 67 and 68 degrees.

• Ratio – this level adds a “meaningful” 0 – which means the absence of any characteristic. Temperature (at least for the Celsius and Fahrenheit scales)) does not have a 0 point meaning no heat at all. A scale with a meaningful 0, such as length, has equal ratios – the ratio of 4 feet to 2 feet has the same value as that of 8 feet to 4 feet – both are 2. This cannot be said of temperatures, for example (Tanner & Youssef-Morgan, 2013).

These are often recalled by the acronym NOIR.

Knowing what kinds of data we have is important, as it identifies what kinds of statistical analysis we can do.

Equal Pay Question

At the end of each lecture, we will apply the topics discussed to our research question of do males and females receive equal pay for equal work. In this section, we will look at identifying the data characteristics for each of our data variables.

In looking at our first classification of qualitative versus quantitative, we have

Qualitative Quantitative Continuous Discrete ID Compa Salary Gender Age Midpoint

Gender1 Raise Performance Rating

Degree Service Grade

Most of these are fairly clear – the variables in the qualitative column merely identify different groups. The continuous variable lists can all – theoretically – be carried out to many decimal points, while those in the discrete list all have distinct values within their range of available values.

The identification for the NOIR classification are shown below.

Nominal Ordinal Interval Ratio

ID Degree Performance Rating Salary

Gender Grade Midpoint Gender1 Service Compa Age Raise

While an argument can be made that Performance Ratings, being basically opinions, are really ordinal data; for this class let us assume that they are interval level as many organizations treat them as such.

An important reason for always knowing the data level for each variable is that we are limited to what can be done with different levels. With nominal scales, we can count the differences. With ordinal scales, we can do some limited analysis of differences using certain tests that are not covered in this course. Both interval and ratio scales allow us to do both inferential and descriptive analysis (Tanner & Youssef-Morgan, 2013). Most of the statistical tools we will cover in this class require data scales that are at least interval in nature. During our last two weeks, we will look at some techniques for nominal and ordinal data measures.

In Lecture 2, we will start to see what kinds of things we can do with each level of the NOIR characteristics.

If you have any questions about this material, please ask questions in either Ask Your Instructor or in the discussion area.

References

Lind, D. A., Marchel, W. G., & Wathen, S. A. (2008). Statistical Techniques in Business & Finance. (13th Ed.) Boston: McGraw-Hill Irwin.

Tanner, D. E. & Youssef-Morgan, C. M. (2013). Statistics for Managers. San Diego, CA: Bridgeport Education.

Week1/Week 1 Lecture 2.pdf

Week 1 Lecture 2

In Lecture 1, we focused on identifying the characteristics – quantitative, qualitative, discrete, continuous, NOIR – of the data. In this section, we will take a look at how we can summarize a data set with descriptive statistics, and how we can ensure that these descriptive statistics can be used as inferential statistics to make inferences and judgments about a larger population. We are moving into the second step of the analysis approach mentioned in Lecture 1.

Descriptive Statistics

Once we understand the kinds of data we have, the natural reaction is want to summarize it – reduce what may be a lot of data into a few measures to make sense of what we have. We start with summary descriptions, the principle types focus on location, variability, and likelihood. (Note, we will deal with likelihood, AKA probability, in Lecture 3 for this week.)

For nominal data, our analysis is limited to counting how many exist in each group, such as how many cars by car company (Ford, Nissan, etc.) are in the company parking lot. However, we can also use nominal data as a group name to form different groups to examine, in this case we do nothing with the actual data label, but do some analysis with the data in each group. An example related to our class case: we can group the salary data values into two groups using the nominal variable gender (or gender1).

With ordinal scales, we can do some limited analysis of differences using certain tests; most of which are not covered in this course. We can also use ordinal data as grouping labels, for example we could do some analysis of salary by educational degree.

Both interval and ratio scales allow us to do both inferential and descriptive analysis (Tanner & Youssef-Morgan, 2013). Most of the statistical tools we will cover in this class require data scales that are at least interval in nature.

Location measures. When working with interval or ratio level variables, the first measure most researchers look are indications of location – mean, median, mode. The mean is the numerical average of the data – simply add the values and divide by the total count. The median is the middle of the data set; rank order the values form low to high or high to low, and pick the value that is in the middle. This is easy if we have an odd number of values, we can find the middle exactly. If we have an even number of variables, the middle is the average of the middle two values. For example, in this data set: 2, 3, 4, 5, 6, we have five values and the median is 4. However, in this data set: 2, 3, 4, 5, we have only four values and the median is the average of the middle 2 numbers = (3 + 4)/2 = 7/2 = 3.5. Finally, the mode is the most frequently occurring value; as such, it may or may not occur. And, there may be more than one mode in any data set.

Generally, the mean is the most useful measure for a data set, as it contains information regarding all the values. It is the location measure that is used in many statistical tests. The symbol for the mean of a population is μ – called mu – while we use 𝑥𝑥 – sometimes typed as x- bar – for the sample mean.

Variation measures. After finding our mean (or other center measure), we generally want to know how consistent the data is – that is, is the data bunched around the center, or is it spread out. The more spread out a data is, the less any single measure accurately describes all of the data. Looking at the consistency (or lack of consistency) in a data set will often give us a different understanding of what is going on. A simple example, if we have two departments in a company that each averaged 3.0 on a question in a company morale survey, we might be tempted to say they were the same. However, if we looked at the actual scores and saw that one department had individual scores of 3, 3, 3, 3, 3, and 3 while the other department’s scores were 5, 5, 5, 1, 1, and1 we can now see that the groups are quite a bit different. The mean alone did not provide enough information to interpret what was going on in each group.

We have 3 general measures of variation – range, standard deviation, and variance. Range is simply the difference between the largest and smallest value (largest – smallest = range).

Standard deviation and variance are related values. The variance is a somewhat awkward measure to initially understand. To calculate it, we first take the difference between each value and mean of the entire group. This outcome will have both positive and negative values, and if we add them together we would get a result of 0. So, to eliminate the negative values, we square each outcome. Then we get the sum these squared values and divide it by the count. (Note: this is the same as the mean of the squared differences.) For example, the variance of this data set (2, 3, 4) would be:

• Mean = (2 + 3 + 4)/3 = 9/3 = 3 • Variance = ((2 - 3)^2 + (3 – 3)^2 + (4 – 3)^2)/3 = ((-1)^2 + (0)^2 + (1)^2))/3 = (1 + 0

+ 1)/3 = 2/3 = 0.667.

This gives us an awkward measure – the variance of something measured in inches, for example, would be measured in inches squared – not a measure we all use on a daily basis.

The standard deviation changes this awkward measure to one that makes more intuitive sense. It does so by taking the positive square root of the variance. This would give us, for our inches measure a result that is expressed in inches. The standard deviation is always expressed in the same units as the initial measure. For our example above with the variance of 0.667, the standard deviation would be the square root of 0.667 or 0.817. Both the variance and standard deviation require data that is at least interval in nature. The standard deviation is about 1/6 of the range, and is considered the average difference from the mean for all of the data values in the set (Tanner & Youssef-Morgan, 2013).

Technical point – both the variance and the standard deviation have two different formulas, one for populations and one for samples. The difference is that with the sample formula, the average is found with the (count -1) rather than the full count. This serves to increase the estimate, since the data in a sample will not be as spread out as in the population (unlikely to have the extreme largest and smallest value). The symbol for the population standard deviation is σ, while the sample standard deviation symbol is s. In statistics, since we deal with samples, we use the sample formulas – to be discussed below.

The nice thing about descriptive statistics is that Excel will do all of the math calculations for us, we just need to know how to interpret our results.

For a video discussion of descriptive statistics take a look at Descriptive statistics from the Kahn Academy - https://www.khanacademy.org/math/probability/descriptive-statistics.

Research Question Example

Now that we have identified the data types for each of our variables, we need to develop some descriptive statistics – particularly for those at the interval and ratio level. In our discussion and example of salary, we will be using a salary sample of 50 that does not exactly match the data that is available in your data set. It is not significantly different, and should be considered to come from a different sample of the same population. The results will be accurate enough to consider them in answering our equal pay for equal work question for the sample results provided to the class.

Equal Pay Question. The obvious first question to ask is what is the overall average salary, and what is the average for the males and females separately? This descriptive statistic should also be accompanied with the standard deviation of each group to examine group diversity. (Reminder: the salary results presented each week will not exactly match the results from this class’ data set if you choose to duplicate the results presented in this lecture. The results are statistically close enough to use to answer our assignment question on equal pay.)

The related question concerns the standard deviation of each of the three groups (entire sample, males, and females) – what is the standard deviation for each group.

In setting up the data for this, copy the salary data column (B1:B51) and paste it on a new sheet. This is a recommend practice – never do analysis on the raw data set so that relationships between various columns are not compromised. Then copy the Gender column (M and F) and paste it beside the salary data. Using Excel’s sort function, sort the two columns (at the same time) using Gender as the sort key. This will give you the salary data grouped by males and females.

The screen shot below displays the results using both the Descriptive Statistics option found in the Data Analysis list and the =Average and = Stdev.s functions found in the fx or Formulas – statistics section.

Note a couple of things about the Descriptive Statistics output. First, since for both the overall and female groups, the input range included the label Sal, this was shown at the top. The male range did not have a label, so Column 1 was automatically used. We can use Descriptive Statistics for any number of contiguous columns in the input range box. For reporting purposes, we should change the Sal and Column 1 labels to Overall, Female, and Male.

The second issue about the descriptive statistics output is that it contains much more information than we were looking for. This is a good tool for an overall look at a data set.

Looking at the fx values and those from the Descriptive Statistics output, we can see that the means and standard deviations are identical for each group – so, it does not matter which approach you use.

Now, looking at the actual statistical values, we see that the overall all salary mean (45) lies in the middle between the lower female mean (38) and the upper male mean (52) – overall means will always be flanked by sub-group means, but the differences will not also be equidistant.

The standard deviations, on the other hand are much closer together with the overall (19.2) being somewhat larger than either the female (18.3) or male (17.8). This is also somewhat common – the variation in the entire group is generally a bit larger than for the sub-groups.

While we did not specifically ask for it, we can also note that the range in each group is very close 22 – 77 for overall and females and 24 – 77 for males.

So, what can we say at this point? It appears that males and females have about the same range and standard deviations for salaries, but that females appear to average less than the males. However, at this point, we cannot say anything about our equal pay for equal work question as the Salaries have not been divided into equal work groups. So, at this point we have some interesting information, but no conclusive results yet.

References

Lind, D. A., Marchel, W. G., & Wathen, S. A. (2008). Statistical Techniques in Business & Finance. (13th Ed.) Boston: McGraw-Hill Irwin.

Tanner, D. E. & Youssef-Morgan, C. M. (2013). Statistics for Managers. San Diego, CA: Bridgeport Education.

Week1/Week 1 Lecture 3.pdf

Week 1 Lecture 3

A second way of looking at data differences or similarities is to consider how likely a given outcome is. In looking at our data set, we could ask questions such as, what is the probability (likelihood) of a male or female salary exceeding 60K, what is the probability that a person’s salary is within the range of 38K to 52K, etc. Probability questions about a data begin to help us look at distributions, a topic we will delve into in more detail in the upcoming weeks.

Probability is the likelihood that a specific outcome will occur; it is always positive and ranges from 0 (will never occur) to 1.00 (will always occur). Generally speaking, we have 3 kinds of probability – empirical (counting actual outcomes), theoretical (using theory/logic to determine what should occur) and subjective (our individual guesses and feelings). Obviously the theoretical and empirical are the best approaches for business research questions, but at times the best we can get is an expert’s guess.

Theoretical probability is just as it sounds – the theory of what the probability should be. For example, if we flip a fair coin, our theory says we should get heads 50% of the time – one outcome out of the two possible. If we flip the coin a number of times, we will get the empirical probability – the number of actual heads divided by the number of flips. While this is generally close to .5, achieving this is usually the result of a lot of flips rather than just a few (even up to 100) (Lind, Marchel, & Wathen, 2008).

While many approaches to theoretical probability exist (binominal, hypergeometric, Poisson, etc.) (Lind, Marchel, & Wathen, 2008) exist, we will look at just two particular types – the binominal and the normal curve based probabilities. The binominal requires that we have only two outcomes, such as heads and tails when flipping a coin. This is not as restrictive as it might seem, as we can always create 2 groups out of what we have. For example, if we have a single die (one of a pair of dice), we could form several two group situations – evens versus odds, 1 – 3 versus 4 – 6, etc. We will use the binominal to discuss several basic probability rules.

Four general probability (P) concerns exist. Typically, we want to know one or more of the following probabilities:

• of something happening – called P(event), • of two things happening together – called joint probability: P(A and B), • of either one or the other but not both events occurring – P(A or B), • of something occurring given that something else has occurred – conditional

probability: P(A|B) (read as probability of A given B). • Compliment rule: P(not A) = 1- p(A) (Lind, Marchel, & Wathen, 2008).

Two other issues are needed, the idea of mutually exclusive means that the elements of one data set do not belong to another – for example, males and pregnant are mutually exclusive data sets. The other term we frequently hear with probability is collectively exhaustive – this simply means that all members of the data set are listed (Lind, Marchel, & Wathen, 2008).

Some rules, which apply for both theoretical and empirical based probabilities, for dealing with these different probability situations include:

• P(event) = (number of success)/(number of attempts or possible outcomes) • P(A and B) = P(A)*P(B) for independent events or P(A)*P(B|A) for dependent events

(This last is called conditional probability the probability of B occurring given that A has occurred).

• P(A or B) = P(A) + P(B) – P(A and B); if A and B cannot occur together (such as the example of male and pregnant) then P(A and B) = 0

• P(A|B) = P(A and B)/P(B) (Lind, Marchel, & Wathen, 2008).

Binominal Probability

Binominal probabilities deal with dichotomous outcomes – those that have only 2 possible outcomes. A typical example is flipping a coin, the result can only be a head or tail. Another common example is gender, we are born as either male or female. The interesting element about binominal outcomes is that while every single trail (such as the flip of a coin) has the same probability, the outcome of a group of trails will not necessarily match that probability. For example, the probability of getting exactly 5 heads out of 10 flips of a fair coin is not .5, but rather 24.6%! This is due to the number of ways the 10 outcomes can be distributed (Lind, Marchel, & Wathen, 2008).

We can turn almost any outcome into a dichotomous outcome by creating groups. For example, we can say that when we toss a six-sided die (half of a pair of dice), we have two outcomes: getting a 1 or 2 versus getting anything else. Now we have two outcomes of interest instead of the original 6 possible outcomes.

Tables exist to determine the likelihood, but the easier way is to use the Excel functions found in the fx or Formulas lists. For example, Excel’s BINOM.DIST functions can quickly provide us with the correct probability of getting a certain number of outcomes within a given number of attempts.

Research Question Example

Understanding the distribution of the data is an important element of understanding what the data is trying to tell us. Probabilities can give us a sense of the data set and allow us to compare results across groups.

Equal Pay Example. In thinking about equal pay, we might be interested in the probability that both males and females appear to be grouped in similar ways as the overall group. This would be an example of an empirical probability, as we would be counting how many of each group fall into each of the ranges we would set up.

We noted that the overall salary mean was 45, with the female mean equaling 38 and the male mean equaling 52. This suggests one group to look at – what is the probability of someone having a salary between 38 and 52 in each group – overall, females, and males?

Translating this into “probability” terms, we want to know:

• What is p(38 <= salary <=52)? What is the probability that salaries are between 38 and 52 inclusive?

• What is p(38 <= salary <=52|Female)? What is the probability that salaries are between 38 and 52 inclusive, given a female salary? Or, if a female, what is the probability that salaries are between 38 and 52 inclusive?

• What is p(38 <= salary <=52|Male)? What is the probability that salaries are between 38 and 52 inclusive, given a Male salary? Or, if a Male, what is the probability that salaries are between 38 and 52 inclusive?

We know a couple of things right off. First the entire sample has 50 members, and we have 25 males and females. These become the denominators in the respective probabilities. Since you do not have the exact data set we are working with, the counts for salaries in these ranges are: Overall: 8, Females: 3, and Males: 5.

So, we have:

P(38 <= salary <=52) = 8/50 = .16.

P(38 <= salary <=52|Female) = 3/25 = .12.

P(38 <= salary <=52|Male) = 5/25 = .20.

We can see if gender influences being within this range by seeing if the formula for independent events is true. Above we had stated P(A and B) = P(A)*P(B) for independent events. In this case, the P(within salary range AND Female) is the same as P(within salary range|Female); (this is not always the case). So, since P(within salary range) = .16, and P(Female) = .5, we would have:

P(within salary range and Female) = P(within salary range) * P(Female)

Replacing these with the associated values, we would have:

.12 = .16 * .5 (= .08). An expression that is clearly not correct or true.

Since the two sides of the equation are clearly not the same; we can say that gender and being within this salary ranger are not independent elements. Doing this for other ranges produces similar results, so we have a clue that gender and salary interact in some ways that suggest males and females are not paid equally.

What we still do not know yet, is how to consider equal work in our examination.

The Normal Curve

The normal curve is a data distribution that is often called the bell curve, as when you plot the likelihood of outcomes occurring, the resulting graph looks like a bell – the most outcomes in the middle (where the mean = median = mode), and then smoothly decreasing on each side.

As a probability distribution, the normal has some interesting characteristics. First, the probability of any outcome equals the area under the curve for that range of outcomes. (Tables and Excel give us these values.) Second, the curve technically extends from - infinity to + infinity (although this range is rarely actually used). Third, since the normal curve is continuous data, the probability of any single outcome (for example, getting a 76 on a test) is 0, so to overcome this we develop a range of values – the 76 score outcome would be the area from 75.5 to 76.5 – the adding of +/- half a unit to a value allows us to translate discrete data into a continuous range (Lind, Marchel, & Wathen, 2008).

The normal curve is important due to its wide spread appearance in everyday situations. Some examples of data that follow the normal curve are height, weight, IQ, standardized test scores such as the college boards, many manufacturing measures (above and below the average result), etc.

To make working with different normal curves (having different means and standard deviations), we can convert them all into the Standard Normal Curve, which has a mean of 0 and a standard deviation of 1.0.

We do this using a z-score – subtract the mean from the data value, and divide the result by the standard deviation. Doing this for every value in a data set would change the mean of the new distribution to 0 (due to the subtraction), while the division changes the standard deviation to 1. The resulting data values are now z-scores, and the area between z-scores is the probability of an outcome within that range of values. One characteristic of the z-score is that it tells us, in standard deviations, how close or far from the mean any individual score is; so in some ways this is another location measure but one that focuses on individual values.

Here is an example using the normal distribution and related Excel functions found in the fx list. (See the Excel Week 1 lecture for guidance on using this function if you are unclear about it.) To find the probability of an outcome between a z-score of 1.63 and 2.0, we would need to find the area between these two scores. To do this, we would subtract the area under the curve up to the z score of 1.63 from the area under curve up to the area of the z-score for 2.0. In excel we use the fx function NORM.S.DIST (z, cumulative) this way”

=NORM.S.DIST(2.0,1)-NORM.S.DIST(1.63,1) = 0.0288.

This tells us that the probability of finding a sample value within this range is about 2.9%.

A second example with values that are above and below the mean would be done the same way. Looking to find the probability that we would find a sample value between the z scores of -1.63 and 2.0 would be:

=NORM.S.DIST(2.0,1)-NORM.S.DIST(-1.63,1) = 0.9257 or 92.6%.

The final example of finding a normal curve based probability is determining the probability of being greater than some value; for example, what is the probability of exceeding a z score of 2.0? =1 - NORM.S.DIST(2.0,1) = 0.02275 or 2.3%

Finding the area below a negative z score is found by simply using the NORM.S.DIST function. = NORM.S.DIST(-2.0,1) = 0.02275 or 2.3%

A hint on doing these kinds of problems is to draw a picture of the normal curve, and draw a vertical line at each of the z score values you are working with. Then shade the area you are interested in. There are three cases – the area below a certain value, the area above a certain value, and the area between two values. This visual guide helps in determining what we subtract from what.

Side note – the probability of exceeding a particular outcome by pure chance alone is called the p-value. We will start using this idea next week.

Research Question Example

We will be assuming that the variables we are using for our equal pay question come from a normally distributed population. This allows us to use normal curve based probabilities and statistical tests to examine the data we are using to answer our question.

Equal Pay Example. Earlier we found that the likelihood of males and females having a salary between 38 and 52K were not the same, suggesting that gender and salary interacted in some way. Let us ask if the probability of having a salary greater than the overall mean of 45 is the same for both genders. Since we are assuming that salary is normally distributed as a whole and for each gender, we can use a normal curve probability to examine this.

The first step is to find the z scores for each gender for the data value of 45. We found earlier that the female mean is 38, and the sample standard deviation is 18.3, and the male mean and standard deviation is 52 and 17.8 respectively. This gives us the information needed to determine our z scores:

Female z = (45-38)/18.3 = (7)/18.3 = 0.38.

Male z = (45-52)/17.8 = (-)7/17.8 = -0.39.

The second step is to find the probability of exceeding each of these values using the NORM.S.DIST function.

Female: =1-NORM.S.DIST(0.38,1) = 0.352

Males = 1-NORM.S.DIST(-0.39,1) = 0.652

So, it again appears that males and females have a different salary distribution as males are almost twice as likely to be above the overall average of 45 as females are. Again, we have not yet considered the equal work element.

References

Lind, D. A., Marchel, W. G., & Wathen, S. A. (2008). Statistical Techniques in Business & Finance. (13th Ed.) Boston: McGraw-Hill Irwin.

Week2/Assingment.docx

Problem Set Week Two

The assignment for this week involves developing an understanding of the problem and the data that we will be analyzing during the class. We will be using a data set of 50 employees sampled from an imaginary company to answer the question of whether males and females receive equal pay for performing equal work. The questions in the assignment follow the examples provided in the weekly guidance lectures. The first question this week focuses on the kind of data we have. Different levels of data allow us to do different kinds of analysis, so we need to understand what we have to work with. Question two involves developing the probability of randomly picking a student who has certain characteristics from the sample.  Question three involves finding the probability of randomly picked employees falling within the top one-third of different groups using Excel functions.  Question four and five involve using statistical tests to determine if the compa-ratio (an alternate measure of pay). The final question asks for an interpretation of your opinion on the question of equal pay for equal work based on the work done this week.

Both the assignment file and the data file are located in the  Course Materials section at the bottom in the  Multi-Media section.  The assignment file contains all of the weekly assignments (for Weeks 2, 3, and 4). See the labeled tabs at the bottom of the Excel assignment file. The data in the  data file needs to be copied over into the assignment file, and you will be set for the entire class. *Ask questions if it is not clear how to move the data from one file to the other.

Week2/discu1.docx

Hypothesis Testing / T-tests / F-test

Although the initial post is due on Day 5, you are encouraged to start working on it early, as it is a three-part discussion that should be completed in sequential order.

Part One – Hypothesis Testing

Read  Lecture Four Preview the document View in a new window. Lecture Four starts out with the five-step procedure for hypothesis testing. What is this? What does it do for us? Why do we need to follow these steps in making a judgement about the populations our samples came from? What are the “tricky” parts of developing appropriate hypotheses to test? What examples can you suggest where this process might be appropriate in your personal or professional lives? (This should be started on Day 1.)

Part Two – T-tests

Read  Lecture Five Preview the document View in a new window. Lecture Five illustrates several t-tests on the data set. What conclusions can you draw from these tests about our research question on equal pay for equal work? What is missing from these results to give us a complete answer to the question? Why? (This should be started on Day 3.) 

Part Three – F-test

Read  Lecture Six Preview the document View in a new window. Lecture Six introduces you to the F-test for variance equality. Last week, we discussed how adding a variation measure to reports of means was a smart thing to do. Why does variation make our analysis of the equal pay for equal work question more complicated? What causes of variation impact salary that we have not discussed yet? How can you relate this issue to measures used in your personal or professional lives? (This should be completed by Day 5 .)

Your responses should be separated in the initial post, addressing each part individually, similar to what you see here.

image1.png

image2.png

Week2/Week 2 Lecture 4-1.pdf

Lecture 4

(Sampling basics and Hypothesis test)

This week we turn from descriptive statistics to inferential statistics and making decisions about our populations based on the samples we have. For example, our class case research question is really asking if in the entire company population of employees, do males and females receive the same pay for doing equal work. However, we are not analyzing the entire population, instead we have a sample of 25 males and 25 females to work with.

This brings us to the idea of sampling – taking a small group/sample from a larger population. To paraphrase, not all samples are created equal. For example, if you wanted to study religious feelings in the United States, would you only sample those leaving a fundamentalist church on a Wednesday? While this is a legitimate element of US religions, it does not represent the entire range of religious views – it is representative of only a portion of the US population, and not the entire population.

The key to ensuring that sample descriptive statistics can be used as inferential statistics – sample results that can be used to infer the characteristics (AKA parameters) of a population – is have a random sample of the entire population. A random sample is one where, at the start, everyone in the population has the same chance of being selected. There are numerous ways to design a random sampling process, but these are more of a research class concern than a statistical class issue. For now, we just need to make certain that the samples we use are randomly selected rather than selected with an intent of ensuring desired outcomes are achieved.

The issue about using samples that students often new to statistics is that the sample statistic values/outcomes will rarely be exactly the same as the population parameters we are trying to estimate. We will have, for each sample, some sampling error, the difference between the actual and the sample result. Researchers feel that this sampling error is generally small enough to use the data to make decisions about the population (Lind, Marchel, & Wathen, 2008).

While we cannot tell for any given sample exactly what this difference is, we can estimate the maximum amount of the error. Later, we will look at doing this; for now, we just need to know that this error is incorporated into the statistical test outcomes that we will be studying.

Once we have our random sample (and we will assume that our class equal pay case sample was selected randomly), we can start with our analysis. After developing the descriptive statistics, we start to ask questions about them. In examining a data set, we need to not only identify if important differences exist or not but also to identify reasons differences might exist. For our equal pay question, it would be legal to pay males and females different salaries if, for example, one gender performed the duties better, or had more required education, or have more seniority, etc. Equal pay for equal work, as we are beginning to see, is more complex than a simple single question about salary equality. As we go thru the class, we will be able to answer

increasingly more complex questions. For this week, we will stay with questions about involving ways to sort our salary results – looking for differences might exist.

Some of these questions for this week with our equal pay case could include:

• Could the means for both males and females be the same, and the observed difference be due to sampling error only?

• Could the variances for the males and female be the same (AKA statistically equal)? • Could salaries per grade be statistically equal? • Could salaries per degree (undergraduate and graduate) be the same? • Etc.

Hypothesis Testing

As we might expect, research and statistics have a set procedure/process on how to go about answering these questions. The hypothesis testing procedure is designed to ensure that data is analyzed in a consistent and recognized fashion so everyone can accept the outcome.

Statistical tests focus on differences – is this difference large enough to be significant, that is not simply a sampling error? If so, we say the difference is statistically significant; if not, the difference is not considered statistically significant. This phrasing is important as it is easy to measure a difference from some point, it is much harder to measure “things are different.” It is that pesky sampling error that interferes with assessing differences directly.

Before starting the hypothesis test, we need to have a clear research question. The questions above are good examples, as each clearly asks if some comparison is statistically equal or not. Once we have a clear question – and a randomly drawn sample – we can start the hypothesis testing procedure. The procedure itself has five steps:

• Step 1: State the null and alternate hypothesis • Step 2: Form the decision rule • Step 3: Select the appropriate statistical test • Step 4: Perform the analysis • Step 5: Make the decision, and translate the outcome into an answer to the initial

research question.

Step 1. The null hypothesis is the “testable” claim about the relationship between the variables. It always makes the claim of no difference exists in the populations. For the question of male and female salary equality, it would be: Ho: Male mean salary = Female mean salary. If this claim is found not to be correct, then we would accept the alternate hypothesis claim: Ha: Male salary mean =/= (not equal) Female salary mean. (Note, some alternate ways of phrasing these exist, and we will cover them shortly. For now, let’s just go with this format.)

Step 2. This step involves selecting the decision rule for rejecting the null hypothesis claim. This will be constant for our class – we will reject the null hypothesis when the p-value is equal to or less than 0.05 (this probability is called alpha). Other common values are .1, and .01 – the more severe the consequences of being wrong if we reject the null, the smaller the value of

alpha we select. Recall that we defined the p-value last week as the probability of exceeding a value, the value in this case would be the statistical outcome from our test.

Step 3. Selecting the appropriate statistical test is the next step. We start with a question about mean equality, so we will be using the T-test – the most appropriate test to determine if two population means are equal based upon sample results.

Step 4. Performing the analysis comes next. Fortunately for us, we can do all the arithmetic involved with Excel. We will go over how to select and run the appropriate T-test below.

Step 5. Interpret the test results, making a decision on rejecting or not rejecting the null hypothesis, and using this outcome to answer the research question is the final step. Excel output tables provide all the information we need to make our decision in this step.

Step 1: Setting up the hypothesis statements

In setting up a hypothesis test for looking at the male and female means, there are actually three questions we could ask and associated hypothesis statements in step 1.

1. Are male and female mean salaries equal? a. Ho: Male mean salary = Female mean salary b. Ha: Male mean salary =/= Female mean salary

2. Is the male mean salary equal to or greater than the Female mean salary? a. Ho: Male mean salary => Female mean salary b. Ha: Male mean salary < Female mean salary

3. Is the male salary equal to or less than the female mean salary? a. Ho: Male mean salary <= Female mean salary b. Ha: Male mean salary > Female mean salary

While they appear similar each answers a different question. We cannot, for example, take the first question, determine the means are not equal and then say that, for example, the male mean is greater than the female mean because the sample results show this. Our statistical test did not test for this condition. If we are interested in a directional difference, we need to use a directional set of hypothesis statements as shown in statements 2 and 3 above.

Rules. There are several rules or guidelines in developing the hypothesis statements for any statistical test.

1. The variables must be listed in the same order in both claims. 2. The null hypothesis must always contain the equal (=) sign. 3. The null can contain an equal (=), equal to or less than (<=) or equal to or greater than

(=>) claim. 4. The null and alternate hypothesis statement must, between them, account for all

possible actual comparisons outcomes. So, if the null has the equal (=) claim, the alternate must contain the not equal (=/= or ≠) statement. If the null has the equal or less than (<= or ≤) claim, the alternate must contain the greater than (>) claim.

Finally, if the null has the equal to or greater (=> or ≥) claim, the null must contain the less than (<) claim.

Deciding which pair of statements to use depends on the research question being asked – which is why we always start with the question. Look at the research question being asked; does it contain words indicating a simple equality (means are equal, the same, etc.) or inequality (not equal, different, etc.), if so we have the first example Ho: variable 1 mean = variable 2 mean, Ha: variable 1 mean =/= variable 2 mean.

If the research question implies a directional difference (larger, greater, exceeds, increased, etc. or smaller, less than, reduced, etc.) then it is often easier to use the question to frame the alternate hypothesis and back into the null. For example, the question is the male mean salary greater than the female mean salary would lead to an alternate of exactly what was said (Ha: Male salary mean > Female salary mean) and the opposite null (Male salary mean <= Female salary mean).

Step 2: Decision Rule

Once we have our hypothesis statements, we move on to deciding the level of evidence that will cause us to reject the null hypothesis. Note, we always test the null hypothesis, since that is where our claim of equality lies. And, our decision is either reject the null or fail to reject the null. If the latter, we are saying that the alternate hypothesis statement is the more accurate description of the relationship between the two variable population means. We never accept the alternate.

When we perform a statistical test; we are in essence asking if, based on the evidence we have is, the difference we observe be large enough to have been caused by something other than chance or is it due to sampling error?

A statistical test gives us a statistic as a result. We know the shape of the statistical distribution for each type of test, therefore we can easily find the probability of exceeding this test value. Remember we called this the p-value.

Now all we need to decide is what is an acceptable level of chance – that is, when would the outcome be so rare that we would not expect to see it purely by chance sampling error alone? Most researchers agree that if the p-value is 5% (.05) or less than, then chance is not the cause of the observed difference, something else must be responsible. This decision point is called alpha. Other values of alpha frequently used are 10% (often used in marketing tests) and 1% (frequently used in medical studies). The smaller the chosen alpha is, the more serious the error is in rejecting the null when we should not have.

For our analysis, we will use an alpha of .05 for all our tests.

Final Point

You may have noticed that we have two basic types of hypothesis statements – those testing equality and those testing directional differences. This leads to two different types of statistical tests – the two-tail and the one-tail. In the one-tail test, the entire value of alpha is

focused on the distribution tail – either the right or left tail depending upon the phrasing of the alternate hypothesis. A neat hint, the arrow head in the alternate hypothesis shows which tail the result needs to be in to reject the null.

In the case of the two-tail test (equality), we do not care if one variable is bigger or smaller than the other, only that they differ. This means that the rejection statistic could be in either tail, the right or left. Since the reject region is split into two areas, we need to split alpha into these areas – so with a two-tail test, we use alpha/2 as the comparison with our p-value (e.g., 0.05/2 = 0.025). The example in Lecture 5 will review this in more detail.

References

Lind, D. A., Marchel, W. G., & Wathen, S. A. (2008). Statistical Techniques in Business & Finance. (13th Ed.) Boston: McGraw-Hill Irwin.

Week2/Week 2 Lecture 5-1.pdf

Lecture 5

The T-Test

In the previous lecture, we introduced the hypothesis testing procedure, and developed the first two steps of a statistical test to determine if male and female mean salaries could be equal in the population – where our differences were caused simply by sampling errors. This lecture continues with this example by completing the final three steps. It also introduces our first statistical test, the t-test for mean equality.

Last week we looked at the normal curve and noted several of its characteristics, such as mean = median = mode, symmetrical around the mean, curve height drops off the further the score gets from the mean (meaning scores further from the mean are less likely to occur). Our first statistical test, the t-test, is based on a population that is distributed normally. The t-test is used when we do not have the population variance value – this is the situation every time we use a sample to make decisions about their related populations.

While the t-test has several different versions, we will focus on the most commonly used form – the two sample test for mean equality assuming equal variance. When we are testing measures for mean equality, it is fairly rare for the variances to be much difference, and the observed difference is often merely sample error. (In Lecture 6, we will revisit this assumption.)

The logic of the test is that the difference between mean values divided by a measure of this difference’s variation will provide a t statistic that is distributed normally, with the mean equaling 0 and the standard deviation equaling 1. This outcome can then be tested to see what the likelihood is that we would get a value this large or larger purely by chance – our old friend the p-value. If this p-value exceeds our decision criteria, alpha, then we reject the null hypothesis claim of no difference (Lind, Marchel, & Wathen, 2008).

Setting up the t-test

Before selecting any test from Excel, the data needs to be set up. For the t-test, there are a couple of steps needed. First, copy the data you want to first set up the data. In our question about male and female salaries, copy the gender variable column from the data page to a new worksheet page (the recommendation is on the week 2 tab) and paste it to the right of the questions (such as in column T), then copy and paste the salary values and paste them next to the gender data. Next, sort both columns by the gender column – this will give you the salary data sorted by gender. Then, in column V place the label/word Males, and in column W place the label Females. Now copy the male salaries and paste them under the Male label, and do the same for the female salaries and the female label. The data is now set up for easy entry into the T-test data entry section.

The t-test is found in the Analysis Toolpak that was loaded into your Excel program last week. To find it, click on the Data button in the top ribbon, then on the Data Analysis link in the Analyze box at the right, then scroll down to the T-test: Two-Sample Assuming Equal Variances. For assistance in setting up the t-test, please see the discussion in the Week 2 Excel Help lecture.

Interpreting the T-test Output

The t-test output contains a lot of information, and not all of it is needed to interpret the result. The important elements of the t-test outcome will be shown with an example for our research case question.

Equal Pay Example - continued

In Lecture 4 we set up the first couple of steps for our testing of the research question: Do males and females receive equal pay for equal work? Our first examination of the data we have for answering this question involves determining if the average salaries are the same.

Here is the completed hypothesis test for the question: Is the male average salary equal to the female average salary?

Step 1. Ho: Male mean salary = female mean salary

Ha: Male mean salary ≠ female mean salary

Step 2. Reject the null if the p-value is < (less than) alpha = .05.

Step 3. The selected test is the Two-Sample T-test assuming equal variances.

Step 4. The test results are below. The screen shot shows output table.

Step 5. Interpretation and conclusions.

The first step is to ensure we have all of the correct data. We see that we have 25 males and females in the Observations row, and that the respective means are equal to what we earlier calculated.

The calculated t statistic is 2.74 (rounded). We have two ways to determine if our result rejects or fails to reject the null hypothesis; both involve the two-tail rows, as we have a two tail test (equal or not equal hypothesis statements). The first is a comparison of the t-values – if the critical t of 2.74 (rounded) is greater than the T-Critical two-tail value of 2.01, we reject the null hypothesis. The second way is to compare the p-value with our criteria of alpha = .05. Remember, since this is a two-tail test, the alpha for each tail is half of the overall alpha or .025. If the p-value (shown as P(T<=t) two -tail value of 0.0085 is less than our one tail alpha (.025) then we reject the null hypothesis. Note: at times Excel will report the p-value in an E format, such as 3.45E-04. This is called an Exponent format, and is the same as 3.45 * 10-04. This means move the decimal point 4 places to the left, making 3.45E-04 = 0.000345. Virtually any p-value reported with an E-xx form will be less than our alpha of 0.05 (which would be 5E-02).

Since we rejected the null hypothesis in both approaches (and both will always provide the same outcome), we can answer our question with: No - the male and female mean salaries are not equal.

Note that for this set of data, we would have rejected the null for a one-tail test if and only if the null hypothesis had been: Male mean salary is <= Female mean salary and the alternate was Male mean salary is > Female mean salary. The arrow in the alternate points to the positive/right tail and that is where the calculated t-statistic is. So, even if the p-value is smaller than alpha in a one tail test, we need to ensure the t-statistic is in the correct tail for rejection.

References

Lind, D. A., Marchel, W. G., & Wathen, S. A. (2008). Statistical Techniques in Business & Finance. (13th Ed.) Boston: McGraw-Hill Irwin.

Week2/Week 2 Lecture 6-1.pdf

Lecture 6

(Additional information on t-tests and hypothesis testing)

Lecture 5 focused on perhaps the most common of the t-tests, the two sample assuming equal variance. There are other versions as well; Excel lists two others, the two sample assuming unequal variance and the paired t-test. We will end with some comments about rejecting the null hypothesis.

Choosing between the t-test options

As the names imply each of the three forms of the t-test deal with different types of data sets. The simplest distinction is between the equal and unequal variance tests. Both require that the data be at least interval in nature, come from a normally distributed population, and be independent of each other – that is, collected from different subjects.

The F-test for variance.

To determine if the population variances of two groups are statistically equal – in order to correctly choose the equal variance version of the t-test – we use the F statistic, which is calculated by dividing one variance by the other variance. If the outcome is less than 1.0, the rejection region is in the left tail; if the value is greater than 1.0, the rejection region is in the right tail. In either case, Excel provides the information we need.

To perform a hypothesis test for variance equality we use Excel’s F-Test Two-Sample for Variances found in the Data Analysis section under the Data tab. The test set-up is very similar to that of the t-test, entering data ranges, checking Labels box if they are included in the data ranges, and identifying the start of the output range. The only unique element in this test is the identification of our alpha level.

Since we are testing for equality of variances, we have a two sample test and the rejection region is again in both tails. This means that our rejection region in each tail is 0.25. The F-test identifies the p-value for the tail the result is in, but does not give us a one and two tail value, only the one tail value. So, compare the calculated p-value against .025 to make the rejection decision. If the p-value is greater than this, we fail to reject the null; if smaller, we reject the null of equal variances.

Excel Example. To test for equality between the male and female salaries in the population, we set up the following hypothesis test.

Research question: Are the male and female population variances for salary equal?

Step 1: Ho: Male salary variance = Female salary variance

Ha: Male salary variance ≠ Female salary variance

Step 2: Reject Ho if p-value is less than Alpha = 0.025 for one tail.

Step 3: Selected test is the F-test for variance

Step 4: Conduct the test

Step 5: Conclusion and interpretation. The test resulted in an F-value less than 1.0, so the statistic is in the left tail. Had we put Females as the first variable we would have gotten a right tail F-value greater than 1.0. This has no bearing on the decision. The F value is larger than the critical F (which is the value for a 1-tail probability of 0.25 – as that was entered for the alpha value).

So, since our p-value (.44 rounded) is > .025 and/or our F (0.94 rounded) is greater than our F Critical, we fail to reject the null hypothesis of no differences in variance. The correct t- test would be the two-sample T-test assuming equal variances.

Other T-tests.

We mentioned that Excel has three versions of the t-test. The equal and unequal variance versions are set up in the same way and produce very similar output tables. The only difference

is that the equal variance version provides an estimate of the common variation called pooled variance while this row is missing in the unequal variance version.

A third form of the t-test is the T-Test: Paired Two Sample for Means. A key requirement for the other versions of the t-test is that the data are independent – that means the data are collected on different groups. In the paired t-test, we generally collect two measures on each subject. An example of paired data would be a pre- and post-test given to students in a statistics class. Another example, using our class case study would the comparing the salary and midpoint for each employee – both are measured in dollars and taken from each person. An example of NON-pared data, would the grades of males and females at the end of a statistics class. The paired t-test is set up in the same way as the other two versions. It provides the correlation (a measure of how closely one variable changes when another does – to be covered later in the class) coefficient as part of its output.

An Excel Trick. You may have noticed that all of the Excel t-tests are for two samples, yet at times we might want to perform a one-sample test, for example quality control might want to test a sample against a quality standard to see if things have changed or not. Excel does not expressly allow this. BUT, we can do a one-sample test using Excel.

The reason is a bit technical, but boils down to the fact that the two-sample unequal variance formula will reduce to the one-sample formula when one of the variables has a variance equal to 0. So using the unequal variance t-test, we enter the variable we are interested – such as salary – as variable one and the hypothesized value we are testing against – such as 45 for our case – as variable two, ensuring that we have the same number of variables in each column.

Here is an example of this outcome.

Research question: Is the female population salary mean = 45?

Step 1: Ho: Female salary mean = 45

Ha: Female salary mean ≠ 45

Step 2: Reject the null hypothesis is less than Alpha = 0.05

Step 3: Selected test is the two sample unequal variance t-test

Step 4: Conduct the test

Step 5: Conclusions and Interpretation. Since the two tail p-value is greater than (>) .025 and/or the absolute value of the t-statistic is less than the critical two tail t value, we fail to reject the null hypothesis. Our research question answer is that, based upon this sample, the overall female salary average could equal 45.

Miscellaneous Issues on Hypothesis Testing

Errors. Statistical tests are based on probabilities, there is a possibility that we could make the wrong decision in either rejecting or failing to reject the null hypothesis. Rejecting the null hypothesis when it is true is called a Type I error. Accepting (failing to reject) the null when it is false is called a Type II error.

Both errors are minimized somewhat by increasing the sample size we work with. A type I error is generally considered the more severe of the two (imagine saying a new medicine works when it does not), and is managed by the selection of our alpha value – the smaller the alpha, the harder it is to reject the null hypothesis (or, put another way, the more evidence is needed to convince us to reject the null). Managing the Type II error probability is slightly more complicated and is dealt with in more advanced statistics class. Choosing an alpha of .05 for most test situations has been found to provide a good balance between these two errors.

Reason for Rejection. While we are not spending time on the formulas behind our statistical outcomes, there is one general issue with virtually all statistical tests. A larger sample size makes it easier to reject the null hypothesis. What is a non-statistically significant outcome based upon a sample size of 25, could very easily be found significant with a sample size of, for example, 25,000. This is one reason to be cautious of very large sample studies – far from meaning the results are better, it could mean the rejection of the null was due to the sample size and not the variables that were being tested.

The effect size measure helps us investigate the cause of rejecting the null. The name is somewhat misleading to those just learning about it; it does NOT mean the size of the difference being tested. The significance of that difference is tested with our statistical test. What it does measure is the effect the variables had on the rejection (that is, is the outcome practically significant and one we should make decisions using) versus the impact of the sample size on the rejection (meaning the result is not particularly meaningful in the real world).

For the two-sample t-test, either equal or unequal variance, the effect size is measured by Cohen’s D. Unfortunately, Excel does not yet provide this calculation automatically, however it is fairly easy to generate.

Cohen’s D = (absolute value of the difference between the means)/the standard deviation of both samples combined.

Note: the total standard deviation is not given in the t-test outputs, and is not the same as the square root of the pooled variance estimate. To get this value, use the fx function stdev.s on the entire data set – both samples at the same time.

Interpreting the effect size outcome is fairly simple. Effect sizes are generally between 0 and 1. A large effect (a value around .8 or larger) means the variables and their interactions caused the rejection of the null, and the result has a lot of practical significance for decision making. A small effect (a value around .2 or less) means the sample size was more responsible for the rejection decision than the variable outcomes. The medium effect (values around .5) are harder to interpret and would suggest additional study (Tanner & Youssef-Morgan, 2013).

References

Lind, D. A., Marchel, W. G., & Wathen, S. A. (2008). Statistical Techniques in Business & Finance. (13th Ed.) Boston: McGraw-Hill Irwin.

Tanner, D. E. & Youssef-Morgan, C. M. (2013). Statistics for Managers. San Deigeo, CA: Bridgepoint Education.

Week3/assignemtn.docx

Problem Set Week Two

The assignment for this week involves developing an understanding of the problem and the data that we will be analyzing during the class. We will be using a data set of 50 employees sampled from an imaginary company to answer the question of whether males and females receive equal pay for performing equal work. The questions in the assignment follow the examples provided in the weekly guidance lectures. The first question this week focuses on the kind of data we have. Different levels of data allow us to do different kinds of analysis, so we need to understand what we have to work with. Question two involves developing the probability of randomly picking a student who has certain characteristics from the sample.  Question three involves finding the probability of randomly picked employees falling within the top one-third of different groups using Excel functions.  Question four and five involve using statistical tests to determine if the compa-ratio (an alternate measure of pay). The final question asks for an interpretation of your opinion on the question of equal pay for equal work based on the work done this week.

Both the assignment file and the data file are located in the  Course Materials section at the bottom in the  Multi-Media section.  The assignment file contains all of the weekly assignments (for Weeks 2, 3, and 4). See the labeled tabs at the bottom of the Excel assignment file. The data in the  data file needs to be copied over into the assignment file, and you will be set for the entire class. *Ask questions if it is not clear how to move the data from one file to the other.

Week3/Discu1.docx

Multiple Testing / ANOVA / Effect Size

Although the initial post is due on Day 5, you are encouraged to start working on it early, as it is a three-part discussion that should be completed in sequential order.

Part One – Multiple Testing

Read  Lecture Seven Preview the document View in a new window. The lectures from last week and Lecture Seven discuss issues around using a single test versus multiple uses of the same tests to answer questions about mean equality between groups. This suggests that we need to master—or at least understand—a number of statistical tests. Why can’t we just master a single statistical test—such as the t-test—and use it in situations calling for mean equality decisions? (This should be started on Day 1.)  

Part Two – ANOVA

Read  Lecture Eight Preview the document View in a new window. Lecture Eight provides an ANOVA test showing that the mean salary for each job grade significantly differed. It then shows a technique to allow us to determine which pair or pairs of means actually differ. What other factors would you be interested in knowing if means differed by grade level? Why? Can you provide an ANOVA table showing these results? (Do not bother with which means differ.) How does this help answer our research question of equal pay for equal work? What kinds of results in your personal or professional lives could use the ANOVA test? Why? (This should be started on Day 3.)

Part Three – Effect Size

Read  Lecture Nine Preview the document View in a new window. Lecture Nine introduces you to Effect size measure.  There are two reasons we reject a null hypothesis.  One is that the interaction of the variables causes significant differences to occur – our typical understanding of a rejected null hypothesis.  The other is having a large sample size – virtually any difference can be made to appear significant if the sample is large enough.  What is the Effect size measure?  How does it help us decide what caused us reject the null hypothesis?  (This should be completed by Day 5 .)

image1.png

image2.png

Week3/Week 3 lecture 7-1.pdf

Week 3 Lecture 7

We have so far seen how we can summarize data sets using descriptive statistics, showing several characteristics including mean and standard deviation. We also found that if our data comes from a random sample of a larger population, these descriptive statistics become inferential statistics, and can be used to make inferences about the population. These inferences can then be used in statistical tests to see if things have changed or not (equal to known standards or other data sets or not).

We have looked at one and two sample mean tests (with the t-test) and two sample comparisons of variance equality (with the F test). This week we will look at the Analysis of Variance (ANOVA) test for mean equality between three or more groups.

ANOVA

The first question often asked is why not just do multiple t-tests comparing three or more different group means? One answer involves efficiency. Conducting multiple t-tests can become somewhat tedious. Comparing just three groups (A, B, and C) requires us to compare A and B, B and C, and A and C (3 tests). With 4 groups (A, B, C, and D) we have A and B, A and C, A and D, B and C, B and D, C and D (6 tests)! So a single test can save us a lot of time and is much more efficient.

A second reason and much more important reason is that we lose confidence in our results when multiple tests are performed on the same data. With an alpha of 0.05, we are 95% certain we are right with each test, but being certain we are right for all the tests involves multiplying the results together, so for three tests we would be .95*.95*.95 or 86% certain; with six tests, our confidence drops to .95^6 = .74, a long way from our desired 95% confidence. So, a single test maintains our desired level of confidence in the outcome (Lind, Marchel, & Wathen, 2008).

Logic

A second question asked comes from the name itself, how can analyzing variance tell us anything about mean differences? The answer lies in how ANOVA works. The key assumptions for an ANOVA analysis are that each of the groups are normally distributed AND have equal variances. These mean that the distributions are shaped the same and, this allows for an easy comparison. Take a look at the following two sets of normal curves.

Exhibit A

Exhibit B

The means of the three sample groups in Exhibit A could clearly come from three populations that have the same mean, and the differences seen are merely sampling errors. However, we cannot say the same thing about the sample groups in Exhibit B.

ANOVA takes the variation of all of the data in the groups being tested (three in this case) and compares it with the average variation for each of the groups using the F-test (discussed last week). Since for the Exhibit A groups, the overall variation will be only slightly larger than the average of the three (which are assumed to be equal). Since the resulting F value

0

0.05

0.1

0.15

0.2

0.25

0.3

0.35

0.4

0.45

-5 -4 -3 -2 -1 0 1 2 3 4 5

0

0.05

0.1

0.15

0.2

0.25

0.3

0.35

0.4

0.45

-10 -5 0 5 10

will not be statistically significant, we can say that the groups are closely distributed and the means are statistically equal.

In Exhibit B, however, the variation of the entire group would be around three times the variation of the average. Just by looking at the average variance for the individual groups and comparing it to the variance for the entire group, we can make a judgement on how close the distributions are, and with that a judgement on mean equality.

As with the t-test, ANOVA will let us know exactly how much difference in the population locations is enough to say means differ or not, we cannot just “eyeball” it.

Hypothesis

Stating the null and alternate hypothesis for an ANOVA test is simple, as they are always the same:

Ho: All means equal.

Ha: At least one mean differs (Tanner & Youssef-Morgan, 2013).

You might recall from last week that we said the alternate always states the opposite from the null statement. If so, why isn’t our alternate: all means differ, which seems like the opposite? The reason is that the ANOVA test will reject the null hypothesis if even one mean from the groups being examined is statistically significant difference. So, the opposite of all means differ is actual at least one mean differs.

Data Set-up

Setting-up the data for an ANOVA analysis is just a bit more complicated than for a t- test. While with the T-test we just highlighted the column or portion of a column of data (sometimes after sorting it by a variable such as gender), for an ANOVA test, we need to create a table. For example, if we wanted to look at average salaries per grade (shown in the Week 3 Lecture 8 example), we would need a table looking like this.

Doing this is fairly simple. Copy the grade and salary columns (separately) and paste them onto a new Excel sheet (probably in Week 3 to the right of the questions). Then, highlight both columns – from labels to last value – and select Data Sort. Select sorting on the grade variable and click on OK. Both columns are now in grade order, and you can highlight and cut the salaries for each grade and paste them into a new table you create with the grade letter as the head. When finished, you will have the input table used in setting up an Excel ANOVA test.

References

Lind, D. A., Marchel, W. G., & Wathen, S. A. (2008). Statistical Techniques in Business & Finance. (13th Ed.) Boston: McGraw-Hill Irwin.

Tanner, D. E. & Youssef-Morgan, C. M. (2013). Statistics for Managers. San Diego, CA: Bridgeport Education.

Week3/Week 3 Lecture 8-1.pdf

Week 3 Lecture 8

Excel ANOVA Example

In our on-going investigation of whether or not males and females are paid equally for equal work, we have come up with contradicting results so far, average salaries are clearly different but average compa-ratios are not. We need to examine reasons that might impact these differences to see if we can explain what is going on. For possible factors influencing individual salaries, we need to be able to, paraphrasing what they say in TV cop shows, “rule it out as a suspect” in causing differences or keep it in as a cause of differences between the gender pay practices.

One key issue in our question that has not clearly been examined yet is the impact of grades on salaries. Clearly, grade differences have the potential to complicate the issue as the work done differs by grade. One question to ask here is, “are average salaries equal across grade levels?” This becomes our research question.

Example

For the research question of: are average salaries equal across the grades, we have the following hypothesis test.

Step 1: Ho: All salary means are equal.

Ha: At least one mean differs.

Step 2: Reject the null if the p-value < alpha = .05.

Step 3: Statistical Test: Single-factor ANOVA. (Note: salary variance in some of the grades may violate the equal variance requirement. We will ignore this for the purposes of this example.)

Step 4: Perform Test.

The input box for Excel’s Single factor ANOVA is

The input range for this example would be D1:F16; we would click on Labels in the first row, and select any output range desired (This would be given in the assignment for

consistency’s sake). Completing the input screen and clicking OK gives us an output table.

Reading the ANOVA output tables

The first thing we see is the test name in cell K-1: Anova: Single Factor. This is just a check to ensure we have the right test. Next we see a summary table. Under the Groups column we should see the data labels (in this case our grades). If not, and we see something such as a number, an input error has been made, the labels were not included but the Labels box was checked. If this happens, just redo the data set up and overwrite the output.

For each variable, we see the count, sum, average, and variance. If we had some question about having equal variance, we could perform an F-test on the variables with the extreme values. (Again, for purposed of this example, we are going to ignore the requirement for equal variances.)

The next table is the ANOVA output. While, technically for our hypothesis test, we only need to look at the p-value result, the other columns provide some useful information.

Note: this is somewhat technical, and is presented only as an explanation of the table. The source of variation column gives us our two variation measures; Between groups refers to the overall variation while Within Groups refers to the average variation for all the groups. The SS column (Sum of Squares) is an estimate of the variation (slightly different than our variance formula). This value is divided by the df (degrees of freedom) value for each group. This df is conceptually the same as that discussed with the t-test; and the total df is N-1, where N is the number of data points. Looking at this value (49 in this example) confirms we entered the right number of data points of 50.

MS stands for Mean Square and is the SS divided by the df. The F value is determined by dividing the MS for the between row by the MS of the Within groups row. The p-value and the critical F statistic complete the table.

Step 5: Conclusion and Interpretation: The F is much larger than the F critical, and the p- value is much less than 0.05 (Note: 1.04E-35 means move the decimal point 35 places to the left (0.0000000000000000000000000000000000104). If the E (for exponent) had been positive, we would have moved the decimal to the right, example 1.04E4 = 10400.)

So, according to our decision rule, since the p-value is < (less than) 0.05, we reject the null hypothesis and conclude that at least one mean differs. This suggests that grade level has an impact on salary, and that measuring pay in salary terms could be creating some issues in answering our questions.

Determining Differences

When we reject the null hypothesis, a logical follow-up question is often, which differences are meaningful? There are several approaches to answering this question; all involve a pair by pair comparison, and most require access to statistical tables not available within Excel.

One approach that we can use in our Excel worksheet involves developing confidence intervals around the difference in group means. (Note: Confidence intervals allow us to develop a range that contains the value we are looking for with a known level of confidence such as 95%. We will discuss this again in Week 5.)

All of the required information for these intervals is available from the ANVOA output. The basic approach is to

1. Find the difference between each pair of means 2. To this value, add and subtract a measure of the variation in the data (due to sample

error, we know our sample means are not exactly equal to the population parameter, so we need to take this sample error into account, our real difference might be a bit larger or smaller than the samples show).

3. Examine the ranges to see if 0 is included (alternately, do the endpoints have different signs a + and -); if so the real population difference could be 0 and the means do not significantly differ.

The formula for the interval that we will build in Excel is:

(mean1 – mean2) +/- t*sqrt(MSW * (1/n1 + 1/n2)) (Lind, Marchel, & Wathen, 2008).

Here is an example of how we work out the formula, and what each term means. The value of the means for each variable is found in the Summary table, as is the count (n) for each variable. The MSW is the MS for within that is found in the ANOVA table, and we find t with the t.inv function from Excel.

So, let’s walk thru constructing an interval for grades A and B, and then we can look at what it might look like in an Excel spreadsheet. From our example output above, we have:

Mean A = 23.5 (rounded)

Mean B = 31.7 (rounded)

n for A = 15

n for B = 7

MSW = 8.64 (rounded)

T has a df equal to that of MSW (44 in this case), and the probability is our 0.05 for a 95% interval. T.inv(0.05, 44) equals 2.015 (rounded).

So, for grades A and B, our mean difference = 31.7 – 23.5 = 8.2

The +/- term is t * sqrt(MSW * (1/n1 +1/n2)). Plugging in our values gives us

2.015* sqrt(8.64 * (1/15 + 1/7) = 2.71.

So, our interval is 8.2 +/- 2.71 = 5.49 to 10.91 (rounded).

Since 0 is not in this range, we can say that the mean salaries for grades A and B differ significantly. Setting this up in Excel (using cell references as the examples on the left show) give us the following:

So, all of the grade average salary differences are significantly different from each

other. Grade is definitely a factor in an employee’s salary, and introduces a source of variation that is not an equal work measure.

We have not yet found an answer to our question, as we have not yet figured out how to get a measure of equal work to base our comparisons on.

More to follow next week.

References

Lind, D. A., Marchel, W. G., & Wathen, S. A. (2008). Statistical Techniques in Business & Finance. (13th Ed.) Boston: McGraw-Hill Irwin.

Week3/Week 3 Lecture 9-2.pdf

Week 3 Lecture 9

Effect Size

When we reject the null hypothesis with an ANOVA test, we have two questions that arise. The first, which pair of means differs significantly, we have dealt with already. The second question, similar to what we asked with the t-test null hypothesis rejection is: what caused the rejection, the sample size, or the variable interactions? This question is again answered using an effect size measure.

Recall that the effect size measure shows how likely the variable interaction caused the null hypothesis rejection. Large values lead us to say the variables caused the outcome, while small values lead us to say the outcome has little to no practical significance as the sample size was the most likely cause of the rejection of the null.

With the single factor ANOVA, the effect size measure is eta squared, and equals the SS(between)/SS(total) (Tanner & Youssef-Morgan, 2013). For our salary example in Lecture 8, eta squared equals 17686.02 (SS(between)) / 18066 (SS(total)) = 0.979 (rounded). Eta squared effect size measures have different interpretation values than Cohen’s d (from the t- test). According to Nandy (2012), a small eta squared effect size has a value of 0.01, a medium of 0.06, and a large value of 0.14 or more. This means we have a large effect size, and the variables of salary and grade interaction are the most likely cause of our rejecting the null hypothesis rather than the sample size.

Side note: Eta squared can also be interpreted as the percent of “differences between group scores that can be explained by the independent variable” (Tanner & Youssef-Morgan, 2013, p. 123). This is consistent with our saying the variable interactions caused the outcome.

Different Forms of ANOVA

Just as the t-test has several forms, so does the ANOVA test. Excel has three versions available. While we will focus only on the single factor test, a brief description of the other two versions will be presented.

ANOVA: Two factor without replication

The ANOVA – two factor without replication tests mean differences from two different variables at the same time. If we are interested in knowing if the mean salary differs by grade and also by gender, we can perform one two-factor test rather than two separate tests. As mentioned in lecture two for this week, this is more efficient and maintains our desired alpha significance level.

Excel Example. To test the mean salaries by grade and gender at the same time, we would set up our hypothesis test as follows.

Step 1: Ho1: All salary means are equal across grades.

Ha1: At least one mean differs.

Ho2: All gender (male and female) means are equal.

Ha2: At least one mean differs.

Note that in this test, we need to have a hypothesis statement pair for each variable being tested.

Step 2: ANOVA: Two sample without replication.

Step 3: Reject the null hypothesis if the p-value is < alpha = .05.

Step 4: Perform the test. While the input screen for this test is identical to that of the one factor test, the data table used is a bit different. As seen below, it has one value for each variable pair cell. Since we have multiple values for each variable pair, this table was set up with the mean values for each group.

A B C D E F Male 24.3 27.7 43.3 48.0 61.7 75.3

Female 23.3 34.8 41.5 52.5 67.0 76.0

The data entry box would include the entire table, labels and all. The output for this test is:

Step 5: Conclusions and Interpretation. As with the single factor ANOVA, we start out with a summary table for each variable showing the sum, average, and variance for each variable label.

The ANOVA table has an extra row, and one renamed row. The Error row is what we knew as the Within row in the single factor ANOVA. The two rows dedicated to the data are Rows and Columns; these refer to how the variables are presented in the data input table.

The row line refers to our gender variable, since that is the row variable in the input. The p-value is 0.16 (rounded), so we do not reject the null hypothesis of equal means.

The Column line refers to the grade variable, as that was listed in the column position. This p-value is 3.76E-05, or 0.0000376. This is less than (<) our alpha of .05, so we reject the null hypothesis of equal salary means in each grade. We can find which pair(s) of means differ using the same technique as with the single factor ANOVA discussed in Lecture 8.

The effect size measure for a Two-factor ANOVA without replication is generally the same as with the single factor ANOVA. For each variable it would be eta squared = SS(for variable) divided by the SS(total) value (Tanner & Youssef-Morgan, 2013). The effect size for our rejected null hypothesis is 3865.341/3917.059 = .987 (rounded), a very large effect –

meaning the variable interaction caused the rejection of the null, and we have significant practical outcome; one we can make decisions with.

But, let’s go back to the other result, the failure to reject the null hypothesis claiming that the male and female average salaries are equal. What goes with this outcome? We have clear evidence from t-test done in Week 2 that the average salaries are not equal.

This brings us to the other reason for using this test. This is to reduce one cause of error or variation in the measurement of a variable. For example, if we think that grade level may be a cause of differences in the salaries by grade (a reasonable assumption), then we can remove their impact by using this approach. It will take the grade variation out of the overall analysis of salary and include it only in the grade results.

What does this mean? We have been concerned that we have not been able to measure salary for “equal work,” this approach does this for us. The salary average difference examined in this test has the impact of grade level differences removed, in essence, the salary that is analyzed is the salary impact of gender if everyone did “equal work” (at least as far as job duties). There is still some questions around the impact of performance ratings, education, seniority, etc. But for now, we have a better view of “equal pay for equal work” salary differences.

It appears that perhaps males and females are being paid equally for equal work, on average. Ah, the power of statistics to make things clearer. 

ANOVA: Two-factor with Replication (AKA Factorial ANOVA)

This form of the ANOVA test is somewhat different than the previous two forms. While it can test for mean equality (or differences), this is not its primary purpose. The main purpose is to look at the impact of interaction between variables – that is do the results show different patterns when graphed? Interaction means the variables react differently at different measurement levels (Lind, Marchel, & Wathen, 2008). An example is water and temperature, at cold temperatures water is a solid, at mid-range temperatures it is a liquid, at high temperatures it is a gas; there is a clear interaction going on.

As with the without replication test, an example will help demonstrate this test. We will continue with our gender and grade impact on salary. While our primary research question will be if an interaction between gender and grades impacts salary, we will also repeat our questions about mean salary differences by gender and grade.

Excel Example. To test the mean salaries by grade and gender at the same time, we would set up our hypothesis test as follows.

Step 1: Ho1: All salary means are equal across grades.

Ha1: At least one mean differs.

Ho2: All gender (male and female) means are equal.

Ha2: At least one mean differs.

Ho3: The interaction impact is not significant.

Ha3: The interaction is significant.

Note that in this test, we need to have a hypothesis statement pair for each variable being tested, as well as the interaction.

Step 2: ANOVA: Two sample with replication.

Step 3: Reject the null hypothesis if the p-value is < alpha = .05.

Step 4: Perform the test. The input screen for this test is similar to that of the other ANVOA forms, it asks for the number of rows for each variable, which seen below would be two. The data table used is a bit different, as seen below, it has multiple values for each cell. Since several grades have only two males or females, we can only use two values in each cell in our table. If your data has more counts per cell, you can include more values. The data entry table was set up with the minimum and maximum salary values for each cell.

A B C D E F Male 24.0 27.0 40.0 47.0 62.0 72.0

25.0 28.0 47.0 49.0 66.0 77.0 Female 22.0 34.0 41.0 50.0 65.0 75.0

24.0 36.0 42.0 55.0 69.0 77.0

The data entry box would include the entire table, labels and all. The output for this test is:

Step 5: Conclusions and Interpretation. As with the other ANOVA forms, we start out with a summary of the variables.

In the ANOVA table itself, we have added another row, this one for interaction. Whereas with the no replication output table we started with rows, we did not have sample, both refer to the variable listed in the input table row – gender in our case. We again do not reject our

gender null hypothesis as the p-value is 0.055, greater than our p-value of .05. This test also found that gender average salaries did not significantly differ.

The column, or grade, null hypothesis was rejected with a p-value of 1.07E-11 or 0.0000000000107, which is less than (<) .05. So, our grade salary averages do differ by grade.

The interaction null hypothesis is also rejected with a p-value of o.135 (rounded), meaning that the salaries do not show a differing pattern by gender-grade groupings. Males and females are treated consistently through the grades, essentially growing average salaries for each grade jump.

The effect size, eta squared, is done the same way as before – SS for the variable divided by SStotal. The calculation of differences is also done the same way – using SSwithin in the calculations.

The various ANOVA formats can provide us with a lot of information that is hidden in other tests. This is one reason why single variable statistical tests that cannot separate out distinct sources of variation within our data measurements often do not provide a complete understanding of the meaning hidden within the measures. More on this in the upcoming weeks.

Week4/Assingment.docx

Problem Set Week Four

This week we get to answer our equal pay for equal work question by looking at relationships between and among the different variables.

The first question this week looks at correlations and the creation of a correlation table for our variables.  The second question asks for a regression equation showing how the different variables impact the compa-ratio measure. The third questions asks you to discuss the benefits of using a regression equation approach over the single variable tests we have been doing.

The forth question asks for what other information you would have liked to have analyzed in our research. The fifth question asks for your answer to the equal pay for equal work question of: Is the company paying fairly or not?  If not, who benefits and why?

Week4/Discu1.docx

  Correlation / Linear Regression / Multiple Regression

Although the initial post is due on Day 5, you are encouraged to start working on it early, as it is a three-part discussion that should be completed in sequential order.  

Part One – Correlation

Read  Lecture Ten Preview the document View in a new window. Lecture Ten introduces the idea that different variables may move together—sometimes due to causation and at other times due to an unknown influence. An example involves the perfect (+1.0) correlation between annual number of rum barrels imported into the New England region of the U.S. between the years 1790 and 1820 and the number of churches built each of those years (citation lost). Discuss this correlation: What does it tell us? Does rum drinking cause church building? Does church building cause rum drinking? Or what else could it tell us? If this correlation shows a cause and effect relationship, what drives what? If not, why does it exist? What could this correlation be used for? (This should be started on Day 1.)  

Part Two – Linear Regression

Read  Lecture Eleven Preview the document View in a new window. Lecture Eleven provides information showing a strong positive correlation and a significant linear regression existed between the individual’s salary and midpoint (used as a substitute for grade). This is not an unexpected outcome in a company. How useful are these in understanding what drives salary differences? Why? What examples of a linear regression might be useful in your personal or professional lives? Why? (This should be started on Day 3.)

Part Three – Multiple Regression

Read  Lecture Twelve Preview the document View in a new window. In Lecture Twelve, a multiple-regression equation was developed that showed the factors that influenced a person’s salary and—almost as important—factors that did not influence salary. How do we interpret a multiple-regression equation? Pick one of the factors—whether statistically significant or not—used in the analysis, and describe its impact on salary, what the coefficient is and what it means, what its significance is, and whether you expected this outcome or not. (This should be completed by Day 5 .)

Your responses should be separated in the initial post, addressing each part individually, similar to what you see here.

image1.png

image2.png

Week4/Week 4 Lecture 10-1.pdf

Week 4 Lecture 10

We have been examining the question of equal pay for equal work for several weeks now; but have been somewhat frustrated with the equal work part. We suspect that salary varies with grade level, so that equal work is not done if we compare salaries across grades. We found that we could control the effect of grades with either of two techniques. The first is by choosing a variable that does not include grade level variation such as compa-ratios (the salary divided by midpoint). The second by statistically removing the impact of grade level using the ANOVA Two-factor without replication. Both of these gave us different outcomes on the question of male and female pay equality than examining salary only.

However, we still have not gotten a “clean” measure of equal work as there are still other factors that may impact work done such as performance levels (measured by the performance appraisal rating), seniority, education, etc. And, there could be gender bias (and, for real world companies, ethnic bias as well. We will not cover this, but it can be dealt with the same way as we will examine gender). We need to find a way to eliminate the impact of these variables on our pay measure as well.

This week we will look at two techniques that are very good at examining and explaining the influence of variables on outcomes. These are correlation and regression techniques.

Linear Correlation

Correlation is a measure of how variables/things relate – that is, if one variable changes does another variable change in a predictable pattern as well? One very well-known example is the correlation (or relationship) between length/height of children and weight. As children become longer/taller their weight also increases (Tanner & Youssef-Morgan, 2013). Using this relationship, we can make predictions (using the technique of regression discussed in Lecture 11 for this week) about how heavy a child should be for any given height.

For variables that are at least interval in nature, two types of correlation exist for a bi- variable (two variables only) relationship– linear and curvilinear. As they sound, linear correlations show the extent to which the data variables move in a straight line. Curvilinear correlations – which we will not cover – show the extent that variables move in curved lines.

Scatter Diagrams

An effective way to see if the data do relate in predictable ways involves generating a scatter diagram (AKA scatter chart) – a visual display of how the data points – (variable 1 value, corresponding variable 2 value) relate together (Lind, Marchel, & Wathen, 2008).

Example1. One relationship we might expect to show a positive (both values increasing) relationship would be salary and performance rating, either for the entire salary range or at least within grades. The following scatter diagram (made with the Excel Insert Graph functions) show the relationship with Performance Rating on the bottom and Salary on the on the vertical axis. It shows if we put a straight line through the data points, there is a very modest increase from the lower left to upper right.

Salary (Y-axis) and Performance Rating (X-axis)

Example2. If we look at the same variables, but include Grade as a factor, we get the second graph (below) and see the data separated by grade. Each grade seems to show (again, if we were to put a straight line thru the data points for each grade) level lines, indicating no correlation at all. Neither graph gives us much hope that Performance Rating is related to Salary , something HR would probably not be happy with.

Salary Grades (Y-axis) and Performance Appraisal Rating (X-axis)

Correlation

We will be focusing our efforts on the Pearson Correlation Coefficient – a mathematical value that shows the strength of the linear (straight line) relationship between two variables (Lind, Marchel, & Wathen, 2008). The math formula is a bit tedious, so we will not bother with

it – but, if interested, you can ask Excel to display it (either with Help or the “Tell me what you want to do.” With the latter, I typed show help on Pearson Correlation, and then selected the “show help…” line, getting a description and the math formula.).

Pearson correlation ranges from a value of -1.00 to a +1.00. Any value outside of this range indicates an error in the math or setup. A perfect negative correlation (-1.00) means that the data points all fit exactly on a line that runs from the upper left corner to the lower right on a graph, a negative slope. A perfect positive correlation (+1.00) has the line with a positive slope and runs from the lower left to the upper right (Tanner & Youssef-Morgan, 2013).

As the values move away from the perfect extremes, the data points move away from a line to a spread around the line. If we look at our first graph above, the overall Salary and Performance Rating relationship, we have a correlation of +.15, considered very low and not particularly impressive.

Pearson Correlation. Excel finds the Pearson Correlation Coefficient using either the fx function Correl or the Data Analysis function Correlation. The former is used for a single data set with two variables, while the latter can be used for a single or multiple data sets. The Correl output for the Performance Rating and Salary correlation result is:

Column

1 Column

2 Column 1 1 Column 2 0.151307 1

Note the variable names are not included, and we have three correlations. Two will always show a perfect +1.00 correlation of column 1 with column 1 and column 2 with column 2; a diagonal convention makes more sense with the Correlation table we will look at below. The third correlation is the column 1 with column 2 variable. It does not matter which variable is considered in column 1 or 2, as the result will be the same as switching the variable columns.

We can use the Correlation function to identify correlations between multiple data sets at the same time, much as Descriptive Statistics could work with multiple variables at once. In trying to identify what variables might be impacting Salary, we could generate the following table. Remember, that Pearson’s Correlation requires at least interval level data, so that not all of our variables are used. In addition, since Salary and Compa-ratio are two measures of the same thing (pay) we do not want to include them in the same table.

Sal Mid Age Perf Rat Service Raise Sal 1.000 Mid 0.989 1.000 Age 0.544 0.567 1.000

Perf Rat 0.151 0.192 0.139 1.000 Service 0.452 0.471 0.565 0.226 1.000 Raise -0.041 -0.029 -0.180 0.674 0.103 1.000

To identify all of the correlations for a single variable, find the name in the left column. Then go across until you reach the 1.00 value, then go down. For age, we find that the correlation with:

Age = 0.544,

Mid = 0.567,

Age (itself) = 1.00,

Perf Rat = 0.139,

Service = 0.565, and

Raise = -0.180.

Side note: now we can see why the correlation with itself is shown in the tables, it provides the pivot point for reading the table outcomes. The values above this diagonal of 1.00 values would be identical to those below, so they are not provided to make the table visually easier to read.

Coefficient of Determination. We will look at determining statistical significance of correlations in lecture three for this week. But, in the meantime, we can consider the Coefficient of Determination as a rough measure of usefulness (we will look at the effect size measure in lecture three as well). The coefficient of determination is the square of the correlation, and represents the percent of variation that the variables share in common; that is, the amount of variation in one variable’s changes that is explained by the variation in the other variable. So, for age and salary, the coefficient equals 0.5442 = .30 (rounded). As a rule of thumb, variable pairs with coefficients less than (<) 70% are generally not very valuable for prediction purposes.

References

Lind, D. A., Marchel, W. G., & Wathen, S. A. (2008). Statistical Techniques in Business & Finance. (13th Ed.) Boston: McGraw-Hill Irwin.

Tanner, D. E. & Youssef-Morgan, C. M. (2013). Statistics for Managers. San Diego, CA: Bridgeport Education.

Week4/Week 4 Lecture 11-1.pdf

Week 3 Lecture 11

Regression Analysis

Regression analysis is the development of an equation that shows the impact of the independent variables (the inputs we can generally control) on the output result. While the mathematical language may sound strange, most of you are quite familiar with regression like instructions and use them quite regularly.

To make a cake, we take 1 box mix, add 1¼ cups of water, ½ cup of oil, and 3 eggs. All of this is combined and cooked. The recipe is an example of a regression equation. The output (or result or dependent variable) is the cake, the inputs (or independent variables) are the inputs used. Each input is accompanied by a coefficient (AKA weight or amount) that tells us how “much” of the variable is “used” or weighted into the outcome.

So, in an equation format, this cake recipe might look like:

Y = 1X1 + 1.25X2 + .5X3 + 3X4 where:

Y = cake

X1 = box mix

X2 = cups of water

X3 = cups of oil

X4 = an egg.

Of course, for the cake, the recipe needs to go through the cooking process; while for other regression equations the outputs need to go through whatever “process” turns the inputs into the output – this is often called “life.”

Example

With a regression analysis, we can identify what factors influence an outcome. So, with our Salary issue, the natural question to help us answer our research question of do males and females get equal pay for equal work would be: what factors influence or explain an individual’s pay? This is a perfect question for a multi-variate regression. Multi-variate simply means we have multiple input variables with a single output variable (Lind, Marchel, & Wathen, 2008).

Variables. A regression analysis uses two distinct types of data. The first are variables that are at least interval level or better (the same as the other techniques we have used so far). The other is called a dummy variable, a variable that can be coded 0 or 1 indicating the presence of some characteristic. In our data set, we have two variables that can be used as dummy coded variables in a regression, Degree and Gender; both coded 0 or 1. In the case of Degree, the 0 stands for having a bachelor’s degree and the 1 stands for having an advanced degree. For Gender, 0 means a male and 1 means a female. How these are interpreted in a regression output will be discussed below. For now, the significance of dummy coding is that it allows us to include nominal or ordinal data in our analysis.

Excel Approach. For our question of what factors influence pay, we will use Excel’s Regression function found in the Data Analysis section. This function will produce two output tables of interest. The first table tests to see if the entire regression equation is statistically significant; that is, do the input variables significantly impact the output variable. If so, we would then examine the second table – the coefficients used in a regression equation for each of the variables. We would have a second set of hypothesis statements for each variable, the null would be the coefficient equals 0 versus an alternate of the coefficient is not equal to 0. Typically, we list these before we start the analysis.

Step 1: For the regression equation:

Ho: The regression equation is not significant

Ha: The regression equation is significant.

For the coefficients if the regression equation is significant:

Ho: The regression coefficient equals 0

Ha: The regression coefficient is not equal to 0.

Note: We would write one pair of statements for each variable, for space reasons, we include only one general statement that should be applied to each variable.

Step 2: Reject each null hypothesis claim if the related p-value > (is greater than) p-value = .05.

Step 3: Regression Analysis

Step 4: Perform the test. Selecting the Regression option in Data Analysis will open a familiar data entry box. The Input Y Range would be the salary range including the label. The Input X range would the labels and data for our input variables. In this case we will use Midpoint, Age, Performance Rating, Service, Raise, Degree, and Gender. Be sure to check the labels box and pick an output range upper left corner. This will result in the following output (values rounded to three decimal places):

Step 5: Conclusions and Interpretation. Let’s look at each table separately.

The Regression Statistics table shows A Multiple R and an R squared value. Multiple R is the multiple correlation value. Similar to our Pearson Coefficient it shows the relationship between the dependent (output or Salary in this case) variable with all for the independent or input variables. Multiple R is the multiple coefficient of determination, similar to the Pearson coefficient of determination, it displays the percent of variation in common between the dependent and all of the independent variables.

The adjusted R square reduces the R square by a factor that involves the number of variables and the sample size, a suggestion if the design impacted the outcome more than the

variables. We have an insignificant reduction. The standard error is a measure of variation in the outcome used for predictions. The count shows the number of cases used in the regression.

The ANOVA table, sometimes called ANOR – analysis of regression – provides us with our test of significance outcome. Similar to the ANOVA covered in Week 3, we look at the Significance of F (AKA P-value) to see if we reject or fail to reject the null hypothesis of no significance. In this case, with a p-value of 8.44E-36 (equaling 0.00000000000000000000000000000000000844) is less than .05, so we reject the null of no significance. The regression equation explains a significant proportion of the variation in our dependent variable of salary.

Now that we have a significant regression equation, we move on to the final table that presents and tests the coefficients for each variable. One of the important parts of a regression equation is that it shows us the impact of each factor if all other factors are held constant. A regression has the form:

Y = A + B1* X1 + B2*X2 + B3*X3 + …. Where Y is the output, A is the intercept (places the line up or down on the Y axis when all other values are 0), the B’s are the coefficient values, and the X’s are the variable names. Before considering whether each coefficient is statistically significant or not, our equation would be:

Salary - -4.009 + 1.22* Midpoint + 0.029*Age – 0.096*Perf Rat – 0.074*Service + 0.834*Raise + 1.002*Degree + 2.552* Gender. Whew!

What does this mean? The intercept is an adjustment factor, one that we do not need to analyze. For midpoint, it means that as midpoint goes up by a thousand dollars (remember salary and midpoint are measured in thousands), the salary goes up by 1.22 thousand – higher graded employees are paid relatively more compared to midpoint than others (all others things equal). For Performance Rating, employees lose $96 (-0.096) for every higher PR point they have – certainly not what HR would like!

Now, let’s look at our dummy variables, Degree and Gender. For Degree, an extra $1,002 is added to employees having a Deg code = 1, as if Deg = 0, the +1.002* 0 = 0; so graduate degree holders get an extra $1002 per year. The same thing applies to Gender, those coded 0 get nothing extra and those coded 1 get $2,552 more per year (all other things equal). Since females are coded 1, if this factor is significant, they would be paid $2552 more than males with all other factors equal (the definition of equal work).

So, now let’s take a look at the statistical significance of each of the variables. This is determined with the P-value column (next to the t Stat value). This is read the same way we noticed in the t-test and ANVOA tables, if the value is less than 0.05 we reject the null hypothesis of no significance.

While the intercept has a significance value, we tend to ignore it and include the intercept in all equations. For the other variables, the only significant variables are: Midpoint, Perf Rating (unrounded it was 0.0497994…), and Gender. So, the regression equation including only our statistically significant factors is Sal = -4.009 +1.22*Midpoint -).096*Perf Rat + 2.552*Gender.

So, we now have a clear answer to our question about males and females getting equal pay for equal work. Not only is the answer no (as gender is a significant factor in determining salary) but also females are paid $2552 more annually all other things equal!

This is certainly not the outcome most of us expected when we began this journey. What we see is that variation within any measure has some often unanticipated outcomes, and unless we examine the inputs into our results, we often do not understand them very well. Single measure tests such as the t and ANOVA tests are quite valuable comparing similar results, but they do not always get to the root of what causes differences.

Reference

Lind, D. A., Marchel, W. G., & Wathen, S. A. (2008). Statistical Techniques in Business & Finance. (13th Ed.) Boston: McGraw-Hill Irwin.

Week4/Week 4 Lecture 12-1.pdf

Week 4 Lecture 12

Significance

Earlier we discussed correlations without going into how we can identify statistically significant values. Our approach to this uses the t-test. Unfortunately, Excel does not automatically produce this form of the t-test, but setting it up within an Excel cell is fairly easy. And, with some slight algebra, we can determine the minimum value that is statistically significant for any table of correlations all of which have the same number of pairs (for example, a Correlation table for our data set would use 50 pairs of values, since we have 50 members in our sample).

The t-test formula for a correlation (r) is t = r * sqrt(n-2)/sqrt(1-r2); the associated degrees of freedom are n-2 (number of pairs – 2) (Lind, Marchel, & Wathen, 2008). For some this might look a bit off-putting, but remember that we can translate this into Excel cells and functions and have Excel do the arithmetic for us.

Excel Example

If we go back to our correlation table for salary, midpoint, Age, Perf Rat, Service, and Raise, we have:

Using Excel to create the formula and cell numbers for our key values allows us to quickly create a result. The T.dist.2t gives us a p-value easily.

The formula to use in finding the minimum correlation value that is statistically significant is r = sqrt(t^2/(t^2 + n-2)). We would find the appropriate t value by using the

t.inv.2T(alpha, df) with alpha = 0.05 and df = n-2 or 48. Plugging these values into the gives us a t-value of 2.0106 or 2.011(rounded).

Putting 2.011 and 48 (n-2) into our formula gives us a r value of 0.278; therefore, in a correlation table based on 50 pairs, any correlation greater or equal to 0.278 would be statistically significant.

Technical Point. If you are interested in how we obtained the formula for determining the minimum r value, the approach is shown below. If you are not interested in the math, you can safely skip this paragraph.

t = r* sqrt(n-2)/sqrt(1-r2)

Multiplying gives us t *sqrt (1- r2) = r2* (n-2)

Squaring gives us: t2 * (1- r2) = r2* (n-2)

Multiplying out gives us: t2– t2* r2 = n r2-2* r2

Adding gives us: t2= n* r2-2*r2+ t2 *r2

Factoring gives us t2= r2 *(n -2+ t2)

Dividing gives us t2 / (n -2+ t2) = r2

Taking the square root gives us r = sqrt (t2 / (n -2+ t2)

Effect Size Measures

As we have discussed, there is a difference between statistical and practical significance. Virtually any statistic can become statistically significant if the sample is large enough. In practical terms, a correlation of .30 and below is generally considered too weak to be of any practical significance. Additionally, the effect size measure for Pearson’s correlation is simply the absolute value of the correlation; the outcome has the same general interpretation as Cohen’s D for the t-test (0.8 is strong, and 0.2 is quite weak, for example) (Tanner & Youssef- Morgan, 2013).

Spearman’s Rank Correlation

Another type of correlation is the Spearman’s rank order correlation. This correlation, which is interpreted the same way as the Pearson’s Correlation, can be performed on ordinal or any ranked data. If the data used is ordinal (rankable), we use Spearman’s rank order correlation, rho (Tanner & Youssef-Morgan, 2013). Using the same data, only assuming at least one variable is ordinal would give us the following results. Note in ranking from low to high, similar values are given the average rank for all of the same values. For example, in the example below the raise of 4.7 occurs twice (the 3rd and 4th places), so it gets a rank of 3.5.

Performance Rating Raise Raise -

Rank

Difference in rank

Difference squared PR-

Rank 1 55 3 1 0 0 2 75 3.6 2 0 0 4 80 4.7 3.5 0.5 0.25 9 100 4.7 3.5 5.5 30.25 9 100 4.8 5 4 16 4 80 4.9 6 -2 4 4 80 5.6 7 -3 9 9 100 5.7 8 1 1

6.5 90 5.8 9 -2.5 6.25 6.5 90 6 10 -3.5 12.25

Sum = 79

Spearman’s rank order correlation = 1-6*sum of differences squared/(n*(n2 -1))

For this data, the sum of differences = 79, and n = 10. This gives us a value of

1-6*(79/(10 *(102 -1))79 = 1 – 6* (79/(10*99) = 1-6 * (79/990) = 1 – 6*0.08 = 0.52.

For comparison purposes, the Pearson Correlation equals 0.686.

Note that we have less information about the data when we use ranks, particularly with several ties in the data. This reduced information results in a lower correlation value with Spearman’s. This correlation is tested and interpreted the same way as Pearson’s Coefficient is (Lind, Marchel, & Wathen, 2008).

References

Lind, D. A., Marchel, W. G., & Wathen, S. A. (2008). Statistical Techniques in Business & Finance. (13th Ed.) Boston: McGraw-Hill Irwin.

Tanner, D. E. & Youssef-Morgan, C. M. (2013). Statistics for Managers. San Diego, CA: Bridgeport Education.

Week5/Discu1.docx

Confidence Intervals / Chi Square / Overall Reactions

Although the initial post is due on Day 5, you are encouraged to start working on it early, as it is a three-part discussion that should be completed in sequential order.

Part One – Confidence Intervals

Read  Lecture Thirteen Preview the document View in a new window. Lecture Thirteen introduces you to confidence intervals. What is a confidence interval, and why do some prefer them to single point estimates? Ask your manager what is preferred and why? What are the strengths and weaknesses of using confidence intervals in making decisions? (This should be started on Day 1.)

Part Two – Chi Square

Read  Lecture Fourteen Preview the document View in a new window. As Lecture Fourteen notes, the chi-square test is—in some ways—fundamentally different than the previous tests we have looked at. In what ways and why is this approach important? Examples were shown of gender-degree distributions and employees per grade. How do these tests help with understanding our equal pay for equal work question? Do they change or reinforce our decision from last week? What situations in your personal or professional lives could use a chi-square approach? (This should be started on Day 3.)

Part Three – Overall Reactions

Has your opinion about statistics changed? How can statistical analysis help your professional career? (This should be completed by Day 5 .)

image1.png

image2.png

Week5/FINAL.docx

Final Paper

The Final Paper provides you with an opportunity to integrate and reflect on what you have learned during the class.

The question to address is, what have you learned about statistics? In developing your responses, consider—at a minimum—and discuss the application of each of the course elements in analyzing and making decisions about data (counts and/or measurements).

The course elements include

· Descriptive statistics.

· Inferential statistics.

· Hypothesis development and testing.

· Selection of appropriate statistical tests.

· Evaluating statistical results.

The Final Paper

· Must be three to five double-spaced pages in length (not including title and references pages) and formatted according to APA style as outlined in the  Ashford Writing Center (Links to an external site.)Links to an external site. .

· Must include a separate title page with the following:

· Title of paper

· Student’s name

· Course name and number

· Instructor’s name

· Date submitted

· Must begin with an introductory paragraph that has a succinct thesis statement.

· Must address the topic of the paper with critical thought.

· Must end with a conclusion that reaffirms your thesis.

· Must use at least three scholarly sources in addition to the course text.

Week5/Week 5 Lecture 13-1.pdf

Week 5 Lecture 13

This week we look at two different approaches to analyzing data and making inferences about the populations they come from. The first is confidence intervals, a range of values that we expect to contain the actual population mean based on the sample results we obtained. The other is a way to use nominal and ordinal data in a statistical analysis. The Chi Square family of tests looks at patterns within samples and sees whether the underlying populations could contain the same pattern of measure distributions (Lind, Marchel, & Wathen, 2008).

Confidence Intervals

When we perform a t-test or ANOVA, we are using a single point estimate for the means of the populations we are testing. Some professionals and managers are a bit uncomfortable with this; they understand that the sample has a sampling error – and the actual population mean could be – and most likely is – a bit different. They are interested in getting an estimate of what the sampling error is and how much the population mean could differ from the sample mean.

We deal with this through the use of confidence intervals, a range of values that have a specific probability of containing the actual population mean. We have seen one example of a confidence mean already, the intervals used to determine which population means varied when we rejected the null hypothesis for the ANOVA test were confidence intervals.

Confidence intervals often provide the added information and comfort about estimates of population parameter values that the single point estimates lack. Since the one thing we do know about a statistic generated from a sample is that it will not exactly equal the population parameter, we can use a confidence interval to get a better feel for the range of values that might be the actual population parameter. They also give us an indication of how much variation exists in the data set. The larger the range (at the same confidence level), the more variation within the sample data set and the less representative the mean would be (Lind, Marchel, & Wathen, 2008). We are going to look at two different kinds of confidence intervals this week – intervals for a one sample mean and intervals for the differences between the means of two samples (Lind, Marchel, & Wathen, 2008).

One Sample Confidence Interval for the mean

A confidence interval is simply a range of values that could contain the actual population parameter of interest. It is centered on the sample mean, and uses the variation in the sample to estimate a range of possible values (Lind, Marchel, & Wathen, 2008). To construct a confidence interval, we use several pieces of information from the sample and the confidence level we want.

From the sample we use the mean, standard deviation, and size. To get the confidence level – a desired probability (usually set at 95%), that the interval does, in fact, contain the population mean.

Example. The confidence interval for the female mean salary in the population would be calculated this way. The sample mean value is 38, the standard deviation is 18., and the sample

size is 25 3 (from Week 1 material). Once we determine the confidence level we want, we use the associated 2-tail t value to achieve it. The t-value is found with the fx function t.inv.2t (Prob, df). For a 95% confidence interval, we would use t.inv.2t(0.05, 24), this equals 2.064 (rounded).

We now have all the information we need to construct a 95% confidence interval for the female salary mean:

CI = mean +/- t * stdev/sqrt(sample size) = 38 +/- 2.064*18.3/sqrt(25) = 38 +/- 7.6.

This is typically written as 30.4 to 45.6. Note: the standard deviation divided by the square root of the sample size is called the standard error of the mean, and is the variation measure of the sample used in several statistical tests, including the t-test and confidence intervals.

The associated 95% CI for males is 44.6 to 59.3. Note that the endpoints overlap – male smallest vale is 44.6 while the female largest value is 45.6. This suggests that both population average salaries could be the same and around 45. However, just as the two one-sample t-tests gave us misleading information on possible equality, using two confidence intervals to compare two populations also is not the best approach.

The Confidence Interval for mean differences.

When comparing multiple samples, it is always best to use all the possible information in a single test or procedure. The same is true for confidence intervals. If we are interested in seeing if sample means could be equal, we look to see if the difference between the averages could be 0 or not. If so, then the means could be the same; if not, then the means must be significantly different.

The formula for the mean difference confidence interval is mean difference +/- t*standard error. The standard error for the difference of two populations is found by adding the variance/sample size (which is the standard error squared) for each and taking the square root (Lind, Marchel, & Wathen, 2008). For our salary data set we have the following values:

Female mean = 38 Male mean = 52 t = t.inv.2t(0.05, 48) = 2.106

Female Stdev = 18.3 Maler Stdev = 17.8 Sample size = 50, df = 48

Standard error = sqrt(Variance (female)/25 + Variance (male)/25) =

Sqrt(334.7/25 + 316/25) = 5.10.

This gives us a 95% confidence interval for the difference equaling:

(52-38) +/- 2.106 * 5.10 = 14 +/- 10.7 = 3.3 to 24.7.

Since this confidence interval does not contain 0, we are 95% confident that the male and female salary means are not equal – which is the same result we got from our 2 sample t-test in week 2. We also now have a sense of how much variation exists in our measures.

Side note: The “+/- t* SE” term is often called the margin of error. We most often hear this phrase in conjunction with opinion polls – particularly political polls, “candidate A has 43% approval rating with a margin of error of 3.5%. While we do not deal with proportions in the class, they are calculated the same as an empirical probability – number of positive replies divided by the sample size. The construction of these margins or confidences is conceptually the same – a t-value and a standard error of the proportion based on the sample size and results (Lind, Marchel, & Wathen, 2008).

References

Lind, D. A., Marchel, W. G., & Wathen, S. A. (2008). Statistical Techniques in Business & Finance. (13th Ed.) Boston: McGraw-Hill Irwin.

Week5/Week 5 Lecture 14-1.pdf

Week 5 Lecture 14

The Chi Square Test

Quite often, patterns of responses or measures give us a lot of information. Patterns are generally the result of counting how many things fit into a particular category. Whenever we make a histogram, bar, or pie chart we are looking at the pattern of the data. Frequently, changes in these visual patterns will be our first clues that things have changed, and the first clue that we need to initiate a research study (Lind, Marchel, & Wathen, 2008).

One of the most useful test in examining patterns and relationships in data involving counts (how many fit into this category, how many into that, etc.) is the chi-square. It is extremely easy to calculate and has many more uses than we will cover. Examining patterns involves two uses of the Chi-square - the goodness of fit and the contingency table. Both of these uses have a common trait: they involve counts per group. In fact, the chi-square is the only statistic we will look at that we use when we have counts per multiple groups (Tanner & Youssef-Morgan, 2013).

Chi Square Goodness of Fit Test

The goodness of fit test checks to see if the data distribution (counts per group) matches some pattern we are interested in. Example: Are the employees in our example company distributed equal across the grades? Or, a more reasonable expectation for a company might be are the employees distributed in a pyramid fashion – most on the bottom and few at the top?

The Chi Square test compares the actual versus a proposed distribution of counts by generating a measure for each cell or count: (actual – expected)2/actual. Summing these for all of the cells or groups provides us with the Chi Square Statistic. As with our other tests, we determine the p-value of getting a result as large or larger to determine if we reject or not reject our null hypothesis. An example will show the approach using Excel.

Regardless of the Chi Square test, the chi square related functions are found in the fx Statistics window rather than the Data Analysis where we found the t and ANOVA test functions. The most important for us are:

• CHISQ.TEST (actual range, expected range) – returns the p-value for the test • CHISQ.INV.RT(p-value, df) – returns the actual Chi Square value for the p-value

or probability value used. • CHISQ.DIST.RT(X, df) – returns the p-value for a given value.

When we have a table of actual and expected results, using the =CHISQ.TEST(actual range, expected range) will provide us with the p-value of the calculated chi square value (but does not give us the actual calculated chi square value for the test). We can compare this value against our alpha criteria (generally 0.05) to make our decision about rejecting or not rejecting the null hypothesis.

If, after finding the p-value for our chi square test, we want to determine the calculated value of the chi square statistic, we can use the =CHISQ.INV.RT(probability, df) function, the value for probability is our chi square test outcome, and the degrees of freedom (df) equals the number of cells in our actual table minus 1 (6 – 1 =5 for an problem working with our 6 grade levels). Finally, if we are interested in the probability of exceeding a particular chi square value, we can use the CHIDIST or CHISQ.DIST.RT function.

Excel Example. To see if our employees are distributed in a traditional pyramid shape, we would use the Chi Square Goodness of Fit test as we are dealing both with count data and with a proposed distribution pattern. For this test, let us assume the following table shows the expected distribution of our 50 employees in a pyramid organizational structure.

Grade: A B C D E F Total Count: 15 12 10 6 4 3 50

The actual or observed distribution within our sample is shown below.

Grade: A B C D E F Total Count: 15 7 5 5 12 6 50

The research question: Are employees distributed in a pyramidal fashion?

Step 1: Ho: No difference exists between observed and expected frequency counts

Ha: Observed and Expected frequencies differ.

Step 2: Reject the null hypothesis if the p-value < alpha = .05.

Step 3: Chi Square Goodness of Fit test.

Step 4: Conduct the test. Below is a screen short of an Excel solution.

Step 5: Conclusions and Interpretation. Since our p-value of 0.00024 is < our alpha of 0.05, we reject the null hypothesis. The employees are not distributed in a pyramid pattern.

Side Note: We might think that if our sample had an equal number of employees per grade we would have a better chance of grade based differences averaging out. Doing this same test and assuming an equal distribution across grades produces a p-value of 0.063 causing us to fail to reject the null hypothesis. The student is encouraged to try this, the equal value for each grade would be 50/6.

Effect size. For a single row, goodness-of-fit test, the associated effect size measure is called effect size r, and equals the square root of: the chi square value/(N*df), where df = the number of cells – 1. A value less than .30 is considered small, between .30 and .50 is considered moderate, and more than .50 is considered large (Steinberg, 2008). Since we rejected the null in the example above, the effect size would be: r= square root (23.75/50*5) = sgrt(0.095) =0.31. This is a moderate impact, suggesting that both sample size and variable interaction had some impact. With moderate results, we generally would want to get a larger sample and repeat the test (Tanner & Youssef-Morgan, 2013).

Chi Square Contingency Table test

Contingency table tests, also known as tests of independence, are slightly more complex than goodness of fit tables. They classify the data by two or more variable labels (we will limit our discussions to two variable tables). Looking a lot like the input table for the ANOVA 2- factor without replication we looked at last week. Both variables involve the counts per category (nominal, ordinal, or interval/ratio data in ranges) of items that meet our research interest (Lind, Marchel, & Wathen, 2008).

With most contingency tables, we do not have a given expected frequency as we had with the goodness of fit situation. To find the expected value for each cell for a multiple row/column table, we use the formula: row total * column total/grand total (which suggests the expected frequency is the average of the observed frequencies per cell, not an unreasonable expectation). Once we have generated the values for the expected table, we use the same formula to perform the Chi Square test. Manually, this is the sum of ((actual – expected)2/expected) for all of the cells. The same fx Chi Square functions used for the Goodness of Fit test are used for the Contingency Table analysis.

The null hypothesis for a contingency table test is “no relationship exists between the variables.” The alternate hypothesis would be: “a relationship exists.” In general, you are testing either for similar distributions between the groups of interest or to see if a relationship ("correlation") exists (even if the data is nominal level). The df for a contingency table is (number of rows-1)*(number of columns – 1).

Excel Example. The data entry for this test is the same as with our earlier test, and the functions are found in the fx statistical list. One possible explanation for different salaries is the performance on the job, reflected in the performance rating. We might wonder if males and females are evaluated differently (either due to actual performance or to bias; if so, we have

another issue to examine). So, our research question for this issue becomes, are males and females rated the same?

Step 1: Ho: Male and Female ratings are similar (no difference in distributions)

Ha: Males and Females rating distributions differ

Step 2: Reject Ho if p-value is < alpha = 0.05.

Step 3: Chi Square Contingency Table Analysis

Step 4: Perform Test.

Step 5: Conclusions and Interpretation. Since the p-value (CHISQ.TEST result) is greater than (>) alpha = .05, we fail to reject the null hypothesis and conclude that males and females are

evaluated in a similar pattern. It does not appear that performance rating impact average salary differences.

Effect size. Now, as with the t-test and ANOVA, had we rejected the null hypothesis, we would have wanted to examine the practical impact of the outcome using an effect size measure. The effect size measure for the Chi Square is a correlation measure. Two measures are generally used with the contingency table outcomes – the Phi coefficient and Cramer’s V (Tanner & Youssef-Morgan, 2013).

The Phi coefficient (=square root of (chi square/sample size)) provides a rough estimate of the correlation between the two variables. Phi is primarily used with small tables (2x2, 2x3, or 3x2). Values below .30 are weak, .30 to about .50 are moderate, and above .50 (to 1) are strong relationships (Tanner & Youssef-Morgan, 2013).

Cramer’s V can be considered as a percent of the shared variation – or common variation between the variables. It equals the square root of (phi squared/(smaller number of rows or columns -1). It ranges from 0 (no relationship or variation in common) to 1.0 (strong relationship, all variation in common) (Tanner & Youssef-Morgan, 2013).

For our example above, it would not make sense to calculate either value since we did not reject the null; but for illustrative purposes we will.

• Phi = square root of (1.978/50) = square root of (0.03956) = 0.199 –small, no relationship • V = square root of (0.1.99^2/(2-1)) = 0.19. Note, when the smaller of the number of rows

and columns equals 1, V will equal Phi (Tanner & Youssef-Morgan, 2013).

Caution

Due to the division involved in calculating the Chi Square value, it is extremely influenced with cells that have small expected values. Most texts say simply that if the expected frequency in one or more cells is less than (<) 5 to not use the Chi Square distribution in a hypothesis test. There are some different opinions about this issue. Different texts issue different rules on what to do if we have expected frequencies of 5 or less in cells.

As a compromise, let’s use the standard that no more than 20% of the cells should have an expected value of less than 5. If they do, we need to combine rows or columns to reduce this percentage interest (Lind, Marchel, & Wathen, 2008).

References

Lind, D. A., Marchel, W. G., & Wathen, S. A. (2008). Statistical Techniques in Business & Finance. (13th Ed.) Boston: McGraw-Hill Irwin.

Steinberg, W.J. (2008). Statistics Alive! Thousand Oaks, CA: Sage Publications, Inc.

Tanner, D. E. & Youssef-Morgan, C. M. (2013). Statistics for Managers. San Diageo, CA: Bridgeport Education.

week overview (Read First).PNG