Logic of Hypothesis Testing

MODULE 12 STATISTICAL POWER

Statistical Power

Learning Objectives

· Describe the relationship between statistical power, sample size, effect size, and alpha.

Suppose you are planning an experiment involving stereotype threat. Stereotype threat is defined as a tendency to behave in a manner consistent with negative beliefs that others have about a racial or gender group. For example, if some black test takers are told that, as a group, black test takers do not perform well on math tests, their performance tends to be worse than that of black test takers for whom the stereotype is not evoked. One question you will need to answer is how many participants to include in your study to be confident of detecting the effect. In other words, how many participants do you need in order to have adequate statistical power in your study?

Statistical power is the probability of rejecting a null hypothesis if the null is false (i.e., the alternative is true). It is the degree to which the researcher is able to detect an effect if there actually is one. With low statistical power, a researcher may struggle to detect an effect (to reject the null), even if an effect actually occurs in the population.
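The definition above can be illustrated by simulation: generate many studies in which an effect really exists and count how often the null hypothesis is rejected. The sketch below is only an illustration under stated assumptions. The function name simulate_power, the default effect of 0.5, the group size of 30, normally distributed scores with a known standard deviation of 1, and the two-sided z-test are all hypothetical choices, not part of the module.

```python
import random
from statistics import NormalDist


def simulate_power(effect=0.5, n=30, alpha=0.05, sims=5000, seed=1):
    """Estimate power as the share of simulated studies that reject H0.

    Assumptions (illustrative only): two independent groups of size n,
    normal scores with known SD = 1, and a two-sided z-test.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    se = (2 / n) ** 0.5  # standard error of the mean difference when SD = 1
    rejections = 0
    for _ in range(sims):
        control = [rng.gauss(0.0, 1.0) for _ in range(n)]
        treated = [rng.gauss(effect, 1.0) for _ in range(n)]
        z = (sum(treated) / n - sum(control) / n) / se
        if abs(z) > z_crit:  # significant result: H0 is rejected
            rejections += 1
    return rejections / sims


if __name__ == "__main__":
    # A real effect exists, so the rejection rate estimates power.
    print("estimated power:", simulate_power(effect=0.5))
    # No effect exists, so the rejection rate estimates the type I error rate.
    print("rejection rate under H0:", simulate_power(effect=0.0))
```

When the effect is set to zero, the null hypothesis is true and every rejection is a false positive, so the rejection rate should land close to alpha rather than measuring power.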

Understanding how several factors affect the statistical power of a study will help you to understand and critique research findings and will also lead to greater satisfaction with your own research. When conducting your own research studies, you should do a power analysis prior to collecting data to make sure you have a good chance of demonstrating the effect you are looking for. There are three main factors that affect how much statistical power you have in your study:

· Alpha (i.e., the probability of a type I error)

· Effect size (i.e., the difference between the population means for the experimental and control groups)

· Sample size (i.e., n)

As a researcher, you have control over alpha and sample size. The effect size, however, is not under your control and is predetermined. What will be important to you is having an idea about how great the effect may be. This skill builder is concerned with how alpha, effect size, and sample size are related to statistical power.
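To see how the three factors move power, a normal-approximation formula can be evaluated directly. This is a rough sketch rather than a full power analysis. It assumes a two-sided, two-sample z-test with equal group sizes, and it expresses the effect size in standard deviation units (a standardized difference between the population means), which is an assumption added here. The function name z_test_power and the example values are hypothetical.

```python
from statistics import NormalDist


def z_test_power(effect_size, n, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    effect_size: difference between population means in SD units (assumed).
    n: number of participants in each group (equal groups assumed).
    alpha: probability of a type I error.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Expected value of the z statistic when the effect is real.
    shift = effect_size * (n / 2) ** 0.5
    # Probability that the statistic falls in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


if __name__ == "__main__":
    for n in (20, 40, 80):
        print("n per group =", n, "power =", round(z_test_power(0.5, n), 3))
    for d in (0.2, 0.5, 0.8):
        print("effect size =", d, "power =", round(z_test_power(d, 40), 3))
    for a in (0.01, 0.05, 0.10):
        print("alpha =", a, "power =", round(z_test_power(0.5, 40, a), 3))
```

Under these assumptions, power rises when the sample size increases, when the effect is larger, or when alpha is set less strictly. Only the first and last of these are choices the researcher can make.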

A Review of Type I and Type II Errors

Before discussing power, let’s review the basics of hypothesis testing:

· The null hypothesis is the statement of no effect.

· The alternative hypothesis is a statement that an effect exists in the population.

· Obtaining a significant result means that you have rejected the null hypothesis and have concluded that it’s likely that there is an effect in the population.

· A type I error happens when the null hypothesis is true but you reject it erroneously. This is referred to as a false positive.

· A type II error happens when the null hypothesis is false but you fail to reject it. This is referred to as a false negative.

Type I and type II errors and their probabilities are important concepts in hypothesis testing. These error events are called “conditional” because each can occur only under a certain condition: a type I error is possible only when the null hypothesis is true, and a type II error is possible only when the null hypothesis is false.

The following notation is used to talk about these conditional events:

· Alpha (α) = P(type I error) = P(Reject H₀ | H₀ is true), which is read as “the probability of a type I error equals the probability of rejecting the null hypothesis given that the null is true.”

· Beta (β) = P(type II error) = P(Retain H₀ | Hₐ is true), which is read as “the probability of a type II error equals the probability of retaining the null hypothesis given that the alternative hypothesis is true.”
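These two definitions also connect back to statistical power. When the alternative hypothesis is true, rejecting and retaining the null hypothesis are the only two possible decisions, so their probabilities sum to 1. A short derivation, using only the symbols already defined:

```latex
\begin{align*}
\alpha &= P(\text{reject } H_0 \mid H_0 \text{ is true}) \\
\beta &= P(\text{retain } H_0 \mid H_A \text{ is true}) \\
\text{power} &= P(\text{reject } H_0 \mid H_A \text{ is true}) = 1 - \beta
\end{align*}
```

Reducing the probability of a type II error is therefore the same as increasing power.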

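The following table shows the possible outcomes for a hypothesis test.

Decision      H₀ is true                             H₀ is false (Hₐ is true)
Reject H₀     Type I error (probability α)           Correct decision (probability 1 − β, the power)
Retain H₀     Correct decision (probability 1 − α)   Type II error (probability β)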

Learn by Doing

You are planning a study of stereotype threat and are concerned you may not be able to detect a significant result, even though you believe your experimental procedures should induce the stereotype threat effect. Which of the following errors are you concerned about?

Type II error

Type I error
