Logic of Hypothesis Testing
Learning Objectives
· Evaluate p-values in research studies.
In research, one of the most common activities is testing hypotheses. The Afrobarometer data set is a survey of African citizens’ attitudes on democracy, governance, the economy, and other related topics (www.afrobarometer.org). Using this data set, you might want to test whether rural and urban citizens differ, on average, in how much they trust government. The tables below present results from an independent samples t-test of this question using a random sample of 44 participants from the complete data set. Each respondent’s score is a value between 0 and 15, with a higher score indicating greater trust. The mean for the urban group is 7.00 (SD = 4.17) and the mean for the rural group is 7.74 (SD = 4.38). The observed value of the t-statistic is -.564 and the p-value equals .576 (see the column labeled “Sig. (2-tailed)”).
| | t | df | Sig. (2-tailed) | Mean Difference | Std. Error Difference |
|---|---|---|---|---|---|
| Trust in Government Index (higher scores = more trust) | -.564 | 41 | .576 | -.73913 | 1.30978 |
Group Statistics
| | Urban or Rural Primary Sampling Unit | N | Mean | Std. Deviation | Std. Error Mean |
|---|---|---|---|---|---|
| Trust in Government Index (higher scores = more trust) | Urban | 20 | 7.0000 | 4.16754 | .93189 |
| | Rural | 23 | 7.7391 | 4.38196 | .91370 |
The p-value is the probability of obtaining a t-statistic at least as extreme as the observed one (less than -.564 or greater than +.564) if you were to repeat the test with a new sample of data and the null hypothesis were true. You will see in this skill builder that the p-value can easily be used to make statistical decisions in hypothesis testing. However, while the p-value is important in determining statistical significance, it does not tell the whole story.
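One way to make this definition concrete is to simulate the "repeat the test with a new sample" idea directly. The sketch below is illustrative only and is not part of the original analysis. It assumes normally distributed scores, uses group sizes of 20 and 23 (the sizes implied by the 41 degrees of freedom and the reported standard errors), and picks an arbitrary common population mean and standard deviation, since no population values are given.

```python
import math
import random

random.seed(0)  # arbitrary seed so the run is repeatable

N_URBAN, N_RURAL = 20, 23      # assumed group sizes (20 + 23 - 2 = 41 df)
POP_MEAN, POP_SD = 7.4, 4.3    # hypothetical population values shared by both groups
T_OBSERVED = -0.564            # observed t-statistic from the output table
REPS = 20_000                  # number of simulated repetitions of the study


def mean_and_variance(values):
    """Return the sample mean and the sample variance (n - 1 denominator)."""
    m = sum(values) / len(values)
    var = sum((v - m) ** 2 for v in values) / (len(values) - 1)
    return m, var


def pooled_t(x, y):
    """Independent-samples t-statistic using a pooled variance estimate."""
    mx, vx = mean_and_variance(x)
    my, vy = mean_and_variance(y)
    df = len(x) + len(y) - 2
    pooled_var = ((len(x) - 1) * vx + (len(y) - 1) * vy) / df
    se_diff = math.sqrt(pooled_var * (1 / len(x) + 1 / len(y)))
    return (mx - my) / se_diff


extreme = 0
for _ in range(REPS):
    # The null hypothesis is true here: both groups share one population.
    urban = [random.gauss(POP_MEAN, POP_SD) for _ in range(N_URBAN)]
    rural = [random.gauss(POP_MEAN, POP_SD) for _ in range(N_RURAL)]
    if abs(pooled_t(urban, rural)) >= abs(T_OBSERVED):
        extreme += 1

simulated_p = extreme / REPS
print(f"Share of simulated |t| values at or beyond .564: {simulated_p:.3f}")
```

The printed share should land near the .576 in the "Sig. (2-tailed)" column, up to simulation noise. More than half of the repetitions produce a t-statistic at least this extreme even though the two populations are identical.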
Steps of Hypothesis Testing
To interpret p-values, let's review the key steps in hypothesis testing.
1. State the null and alternative hypotheses. Recall that hypotheses are statements about population parameters. For the Trust in Government example from the Afrobarometer data set, the null ($H_0$) and alternative ($H_A$) hypotheses would be
$H_0: \mu_{Urban} = \mu_{Rural}$ and
$H_A: \mu_{Urban} \neq \mu_{Rural}$.
The Greek letter, µ, indicates a population mean, and the subscripts indicate levels of the independent variable (“urban” and “rural”). Here the null is saying that the mean for the urban population on the Trust In Government variable is the same as the mean for the rural population. The alternative hypothesis states that these means are not the same.
2. Set alpha, the probability of a type I error. In the Afrobarometer example, a type I error would be to decide that the rural and urban populations have different mean levels of trust in government when, in fact, there is no difference.
Frequently, alpha is set to .05, although researchers are free to use other values. With an alpha of .05, researchers are specifying that, if the null hypothesis is true, there is a 5% chance that they will reject it anyway. Setting alpha at .05 is popular because it keeps the risk of a type I error relatively low without making alpha so small that researchers greatly increase their risk of not rejecting the null when they actually should (a type II error). In setting alpha, researchers therefore have to weigh both the risk of rejecting the null erroneously and the risk of not rejecting it when they should. For our Afrobarometer example, we will set alpha at .05.
3. Decide on a test statistic. Because the goal is to compare the means of two groups (rural and urban), a t-test for two independent samples is used.
4. Collect the data and examine the model assumptions. Before calculating the value for your test statistic, be sure you have checked assumptions, like homogeneity of variance and the absence of outliers.
5. Calculate the observed value of the test statistic. Once the data have been collected, the observed value of the test statistic will be used to make a statistical decision. In the Afrobarometer example, the observed value of the test statistic is -.564, sometimes written as $t_{observed}(41) = -.564$, where 41 is the number of degrees of freedom associated with the test.
6. Make a statistical decision using the observed value. This decision requires examining the distribution of the test statistic under the assumption that the null hypothesis is true. Practically, the area in the tails of the distribution beyond the observed value of the test statistic, called the p-value, needs to be determined (see the figure below). For a two-tailed test, this is the combined area below -.564 and above +.564. Fortunately, computer programs can calculate this area quickly and easily. If the p-value is less than alpha (e.g., .05), reject the null hypothesis; otherwise, retain the null.
In our Afrobarometer example, the p-value was .576 and was greater than our alpha of .05, so we would not reject the null hypothesis; in this case, statisticians often say that they “fail to reject” the null hypothesis. If you reject the null, you will say the result is statistically significant, and if you retain the null, you will say the result is not statistically significant.
7. Make a real-world decision. The statistical decision is focused on the abstract hypothesis test. The final step is to examine the implications of the statistical decision in the real world. You will need to consider whether your results are practically significant. It turns out that not all statistically significant results are important in the real world. We will discuss this further later in the skill builder.
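Steps 5 and 6 can be reproduced from the summary statistics alone. The following is a minimal sketch, not the procedure the statistics package itself uses. It assumes equal population variances (a pooled-variance t-test) and a rural group size of 23, which is the size implied by the reported standard error (4.38196 / √23 ≈ .91370) and the 41 degrees of freedom. The tail area is approximated by numerical integration of the t density with Simpson's rule, and the function names are illustrative.

```python
import math

# Summary statistics from the Group Statistics table.
# Assumption: rural N = 23, inferred from the standard error and df = 41.
n_urban, mean_urban, sd_urban = 20, 7.0000, 4.16754
n_rural, mean_rural, sd_rural = 23, 7.7391, 4.38196
alpha = 0.05  # step 2: probability of a type I error

# Step 5: observed value of the test statistic (pooled-variance t-test).
df = n_urban + n_rural - 2
pooled_var = ((n_urban - 1) * sd_urban**2 + (n_rural - 1) * sd_rural**2) / df
se_diff = math.sqrt(pooled_var * (1 / n_urban + 1 / n_rural))
t_observed = (mean_urban - mean_rural) / se_diff


def t_density(x, dof):
    """Density of Student's t distribution with dof degrees of freedom."""
    coef = math.gamma((dof + 1) / 2) / (math.sqrt(dof * math.pi) * math.gamma(dof / 2))
    return coef * (1 + x * x / dof) ** (-(dof + 1) / 2)


def two_tailed_p(t, dof, steps=2000):
    """Combined area in both tails beyond |t|; steps must be even."""
    upper = abs(t)
    h = upper / steps
    total = t_density(0.0, dof) + t_density(upper, dof)
    for i in range(1, steps):
        # Simpson's rule weights: 4 for odd interior points, 2 for even ones.
        total += (4 if i % 2 else 2) * t_density(i * h, dof)
    middle = 2 * (h / 3) * total  # area between -|t| and +|t|, by symmetry
    return 1 - middle


# Step 6: compare the p-value with alpha.
p_value = two_tailed_p(t_observed, df)
decision = "reject the null" if p_value < alpha else "fail to reject the null"

print(f"t({df}) = {t_observed:.3f}, SE of difference = {se_diff:.5f}")
print(f"two-tailed p = {p_value:.3f}, decision at alpha = {alpha}: {decision}")
```

Under these assumptions the computed values should agree with the output table to rounding (t about -.564, standard error of the difference about 1.30978, p about .576), and the decision is to fail to reject the null hypothesis.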