BUS 308 Week 5 Lecture 3

Excel offers few built-in functions for this week’s material. We generally need to take the output from other functions and calculate our Effect Size values ourselves.

Effect Sizes

One issue many have with statistical significance is the influence of sample size on the decision to reject the null hypothesis. If the average difference in preference for a soft drink was found to be ½ of 1%, most of us would not expect this to be statistically significant. And, indeed, with typical sample sizes (even up to 100), a statistical test is unlikely to find any significant difference. However, if the sample size were much larger, for example 100,000, we would suddenly find this minuscule difference to be significant!
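The sample size effect can be checked with a short calculation outside Excel. The sketch below is only an illustration: it assumes the ½ of 1% difference means that 50.5% of a sample prefers the drink against a null value of 50%, and it uses a two-sided z-test for a single proportion. The function name and the numbers are hypothetical, not part of the course data.

```python
import math

def one_sample_proportion_test(p_hat, p0, n):
    """Two-sided z-test of a sample proportion against a hypothesized value."""
    # Standard error of the proportion under the null hypothesis.
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Assumed values: 50.5% observed preference, 50% under the null.
for n in (100, 100_000):
    z, p = one_sample_proportion_test(0.505, 0.50, n)
    print(f"n = {n:>7,}: z = {z:.2f}, p-value = {p:.4f}")
```

With these assumed values, the p-value is far above .05 at a sample size of 100 and below .01 at 100,000, even though the observed difference is identical in both cases.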

Statistical significance is not the same as practical significance. If, for example, our sample of 100,000 was 1% more in favor of an expensive product change, would it really be worthwhile making the change? Regardless of how large the sample was, it does not seem reasonable to base a business decision on such a small difference.

Enter the idea of Effect Size. The name is descriptive but at the same time not very illuminating about what this measure does. We will get to specific measures shortly, but for now, let’s look at how an Effect Size measure can help us understand our findings. First, the name: Effect Size. What effect? What size? In very general terms, the effect we are monitoring is the effect that occurs when we change one of the variables. For example, is there an effect on the average compa-ratio when we change from male to female? Certainly, but not all that much, as we found no significant difference between the average male and female compa-ratios. Is there an effect on the average salary when we change from male to female? Definitely. And it is much larger than what we observed on the compa-ratio means. We found a significant difference in average salary between males and females, around $14,000.

The Effect Size measure looks at the impact of the variables on our outcomes; large impacts suggest that the variables are important, while small impacts suggest that the variable is not particularly important in determining changes in outcomes. We could, for example, argue that both males and females in the population had the same compa-ratio mean and that what we observed in the sample was simply the result of sampling error. Certainly, our test results and confidence intervals could support this.

Now, when do we look at an Effect Size; that is, when should we go to the effort of calculating one? The general consensus is that the Effect Size measure only adds value to our analysis if we have already rejected the null hypothesis. This makes sense: if we found no difference between the variables we were looking at, why try to see what effect changing from one to the other would have? We already know: not much.

When we reject a null hypothesis due to a significant test statistic (one having a p-value less than our chosen alpha level), we can ask a question: was this rejection due to the variable interactions or was it due to the sample size? If due to a large sample size, the practical significance of the outcome is very low. It would often not be “smart business” to make a decision based on those kinds of results. If, however, we have evidence that the null was rejected due to a significant interaction by the variables, then it makes more sense to use this information in making decisions.

Therefore, when looking at Effect Sizes, we tend to classify them as large, moderate, or small. Large effects mean that the variable interactions caused the rejection of the null, and our results have practical significance. If we have small effect size measures, it indicates that the rejection of the null was more likely to have been caused by the sample size, and thus the rejection has very little practical significance on daily activities and decisions.

OK, so far:

• Effect sizes are examined only after we reject the null hypothesis; they are meaningless when we do not reject a claim of no difference.

• Large effect size values indicate that variable interactions caused the rejection of the null hypothesis, and indicate a strong practical significance to the rejection decision.

• Small effect size values indicate that the sample size was the most likely cause of rejecting the null, and that the outcome is of very limited practical significance.

• Moderate effect sizes are more difficult to interpret. It is not clear whether the variable interactions or the sample size had more influence on the rejection decision, so they suggest only moderate practical significance. These results might call for a new sample and analysis.
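The summary above can be sketched as a small decision helper. This is a hypothetical illustration rather than a standard routine: the function name, its wording, and the example values are assumptions, and the cutoffs are passed in as parameters because they differ from one test to another.

```python
def interpret_result(p_value, alpha, effect_size, small_cutoff, large_cutoff):
    """Combine a hypothesis test decision with an effect size label."""
    # Effect sizes are examined only after the null hypothesis is rejected.
    if p_value >= alpha:
        return "Fail to reject the null; effect size is not examined."
    if effect_size >= large_cutoff:
        return "Reject the null; large effect, strong practical significance."
    if effect_size <= small_cutoff:
        return "Reject the null; small effect, likely driven by sample size."
    return "Reject the null; moderate effect, consider a new sample and analysis."

# Made-up example: a significant test with a small effect size.
print(interpret_result(p_value=0.01, alpha=0.05, effect_size=0.1,
                       small_cutoff=0.4, large_cutoff=0.8))
```

Keeping the cutoffs as arguments avoids hard-coding one test's scale into a helper that is meant to be reused across tests.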

Different statistical tests have different effect size measures and interpretations of their values. Here are some that relate to the work we have done in this course.

• T-test for independent samples. Cohen’s D is found by dividing the absolute difference between the two group means by the pooled standard deviation of the entire data set. A large effect is .8 or above, a moderate effect is around .5 to .7, and a small effect is .4 or lower. Interpretation of values between these levels is up to the researcher and/or decision maker.

• One-sample T-test. Cohen’s D is found by dividing the absolute difference between the sample mean and the hypothesized (test) value by the standard deviation of the tested variable data set. A large effect is .8 or above, a moderate effect is around .5 to .7, and a small effect is .4 or lower. Interpretation of values between these levels is up to the researcher and/or decision maker.

• Paired T-test. Effect size r = square root of (t^2/(t^2 + df)). A large effect is .4 or above, a moderate effect is around .25 to .4, and a small effect is .25 or lower.

• ANOVA. Eta squared equals the SS between/SS total. A large effect is .4 or above, a moderate effect is .25 to .40, and a small effect is .25 or lower.

• Chi Square Goodness of Fit tests (1-row actual tables). The measure is also called Effect size r = square root of (Chi Square statistic/(N * (c - 1))), where c equals the number of columns in the table. A large effect is .5 or above, a moderate effect is .3 to .5, and a small effect is .3 or lower.

• Chi Square Contingency Table tests. For a 2x2 table, use phi = square root of (chi square value/N). A large effect is .5 or above, a moderate effect is .3 to .5, and a small effect is .3 or lower.

• Chi Square Contingency Table tests. For larger than a 2x2 table, use Cramer’s V = square root of (chi square value/(N * ((smaller of R or C) - 1))), where R and C are the numbers of rows and columns. A large effect is .5 or above, a moderate effect is .3 to .5, and a small effect is .3 or lower.

• Correlation. Use the absolute value of the correlation. A large effect is .4 or above, a moderate effect is .25 to .4, and a small effect is .25 or lower.
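Since Excel does not provide these measures directly, the formulas in the list above can be sketched as small functions. This is a minimal illustration under stated assumptions: the function names are hypothetical, the pooled standard deviation uses the usual weighted combination of the two group variances, and Cramer's V is written in its usual form with N in the denominator. The inputs (t, df, SS values, chi square statistic) are assumed to come from test output already produced.

```python
import math
from statistics import mean, stdev

def cohens_d_independent(group_a, group_b):
    """Absolute mean difference divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # Assumption: pooled variance weights each group's variance by its df.
    pooled_var = ((n_a - 1) * stdev(group_a) ** 2 +
                  (n_b - 1) * stdev(group_b) ** 2) / (n_a + n_b - 2)
    return abs(mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)

def cohens_d_one_sample(sample, hypothesized_mean):
    """Absolute difference from the test value over the sample std. dev."""
    return abs(mean(sample) - hypothesized_mean) / stdev(sample)

def r_from_t(t_stat, df):
    """Effect size r for a paired t-test: sqrt(t^2 / (t^2 + df))."""
    return math.sqrt(t_stat ** 2 / (t_stat ** 2 + df))

def eta_squared(ss_between, ss_total):
    """ANOVA effect size: SS between / SS total."""
    return ss_between / ss_total

def goodness_of_fit_r(chi_square, n, columns):
    """Goodness of fit effect size: sqrt(chi square / (N * (c - 1)))."""
    return math.sqrt(chi_square / (n * (columns - 1)))

def phi_coefficient(chi_square, n):
    """Phi for a 2x2 contingency table: sqrt(chi square / N)."""
    return math.sqrt(chi_square / n)

def cramers_v(chi_square, n, rows, columns):
    """Cramer's V: sqrt(chi square / (N * (min(R, C) - 1)))."""
    return math.sqrt(chi_square / (n * (min(rows, columns) - 1)))

# Made-up numbers, used only to show the calls.
print(cohens_d_independent([2, 4, 6], [5, 7, 9]))
print(r_from_t(2.0, 12))
print(cramers_v(50, 100, rows=3, columns=4))
```

For a 2x2 table, the smaller of R or C minus 1 equals 1, so Cramer's V reduces to phi.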

Would using these measures change any of our test interpretations?

Please ask your instructor if you have any questions about this material.

When you have finished with this lecture, please respond to Discussion thread 3 for this week with your initial response and responses to others over a couple of days.