Analysis of Variance (ANOVA)
Choosing the Right Statistic
Exploring Relationships
Association: Chi-square test
Correlation: Pearson
Multiple regression
Exploring Differences between Groups:
T-tests
One-way analysis of variance: ANOVA
Two-way analysis of variance: ANOVA
Types of ANOVA
One-way ANOVA – one independent variable (with 2 or more levels) and one dependent variable
Two-way ANOVA (factorial ANOVA) – two independent variables (each with 2 or more levels) and one dependent variable
Repeated Measures ANOVA – similar to a paired t-test, i.e., looking at the same subjects at different points in time
Mixed Model ANOVA – Includes both repeated measures and factorial design
One-way ANOVA
Comparing differences between two or more groups.
Null hypothesis: Mean A = Mean B = Mean C
One-Way ANOVA Variables
Independent – Categorical (with 2 or more levels or groups)
For example: Type of therapy (groups: yoga therapy, paint therapy, no therapy)
Dependent – Equal-interval
For example: Stress scale
One-Way ANOVA Research Questions
Is there a difference between elementary, middle, and high school teacher attitudes towards using technology in the classroom?
Is there a difference in the number of students enrolled in remedial math courses at four local community colleges?
One-way ANOVA Hypotheses
Research Hypothesis: There is a difference in elementary, middle, and high school teacher attitudes towards using technology in the classroom.
Null Hypothesis: There is no difference in elementary, middle, and high school teacher attitudes towards using technology in the classroom.
Copyright © 2011 by Pearson Education, Inc. All rights reserved
When the Null Hypothesis Is Not True
If the null hypothesis is not true, the populations do not have the same mean.
The means of the samples are then spread out for two different reasons:
chance factors that cause variation within the population
the different treatments received by the groups (treatment effect)
Between Groups and Within Groups
Signal-to-noise analogy
We are interested in differences between groups (signal), but we need to account for the differences within groups (noise)
So we want to know how people in sample A differ from those in samples B and C
BUT
People within group A also differ from each other
F Ratio
$$F = \frac{\text{between-groups variance}}{\text{within-groups variance}} = \frac{\text{actual difference in means} + \text{within-groups variance}}{\text{within-groups variance}}$$
Measuring: signal-to-noise ratio!
Another example: I want to assess whether global warming is happening, so I need to compare a global-warming climate with the current climate.
Question: Is the difference between the two climate states GREATER than the variability within the current climate? For example, if the natural variation in temperature in St. Louis in the current climate is 3 degrees C (noise) and the difference between a global-warming climate and the current climate is 2 degrees C (signal), then the signal is smaller than the noise and I cannot conclude that the climate is warming due to CO2.
Between Groups and Within Groups
The “within groups” variation is not affected by whether or not the null hypothesis is true
The “between groups” variation is larger when the null hypothesis is false (because there is actual variation among the population means)
Thus our F ratio will be larger when the null hypothesis is false
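The point above can be checked with a minimal sketch that computes the F ratio by hand. The scores are made up for illustration (they do not come from any study), and the function assumes equal group sizes; `f_ratio`, `same_means`, and `shifted_means` are hypothetical names.

```python
from statistics import mean, variance

def f_ratio(groups):
    """F ratio for a one-way ANOVA, assuming every group has the same size."""
    n = len(groups[0])  # scores per group (equal sizes assumed)
    group_means = [mean(g) for g in groups]
    # Between-groups estimate: variance of the group means, scaled by group size
    between = variance(group_means) * n
    # Within-groups estimate: average of the variances inside each group
    within = mean([variance(g) for g in groups])
    return between / within

# Illustrative data: identical spread within every group in both data sets
same_means = [[4, 5, 6, 5, 5], [5, 4, 6, 5, 5], [5, 6, 4, 5, 5]]
shifted_means = [[4, 5, 6, 5, 5], [6, 7, 8, 7, 7], [8, 9, 10, 9, 9]]

print(f_ratio(same_means))     # group means are all equal
print(f_ratio(shifted_means))  # group means differ, same within-group spread
```

Both data sets have the same within-groups variance, so only the numerator changes; the second F ratio is far larger because the group means are spread apart.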
Hypothesis Testing with ANOVA
State research hypothesis and a null hypothesis.
Determine the characteristics of the comparison distribution.
Determine the cutoff score on the F table.
Determine your sample’s score (the F ratio).
Decide whether to reject the null.
F Distribution
Reading an F Table
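Between-groups degrees of freedom (numerator)
$df_{\text{between}} = N_{\text{groups}} - 1$
Within-groups degrees of freedom (denominator)
$df_{\text{within}} = df_1 + df_2 + \dots + df_{\text{last}}$, where each group's $df$ is its number of scores minus 1
Check the significance level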
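As a small sketch of the two degrees-of-freedom rules, the function below takes a list of group sizes and returns the pair used to look up the cutoff in an F table. The function name `degrees_of_freedom` is hypothetical, and the group sizes in the usage line are illustrative.

```python
def degrees_of_freedom(group_sizes):
    """Return (df_between, df_within) for a one-way ANOVA."""
    # Numerator: number of groups minus 1
    df_between = len(group_sizes) - 1
    # Denominator: sum of each group's own df (its size minus 1)
    df_within = sum(n - 1 for n in group_sizes)
    return df_between, df_within

# Three groups of five scores each
print(degrees_of_freedom([5, 5, 5]))  # (2, 12)
```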
Hypothesis Test Example
Jurors are divided into three groups: Group A is told the defendant has a criminal record, Group B is told the defendant has a clean record, and Group C is given no information about the criminal record. There are 5 jurors assigned to each group. The between-groups population variance estimate is 21.70 and the within-groups population variance estimate is 5.33.
Question: Does knowledge of a criminal record influence a juror’s rating of guilt?
Hypothesis Test Example
Population 1: Jurors told that the defendant has a criminal record.
Population 2: Jurors told that the defendant has a clean record.
Population 3: Jurors given no information about the defendant’s record.
Null Hypothesis: The three populations of jurors have the same mean.
Research Hypothesis: The population means are not the same.
Comparison Distribution: F distribution, $df_{\text{between}} = 2$, $df_{\text{within}} = 12$
Cutoff F ratio is 3.89
F ratio is 4.07
Reject the null hypothesis that the three groups come from populations with the same mean. People exposed to different kinds of information about a defendant’s criminal record will differ in ratings of defendant’s guilt.
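The decision in this example can be retraced with a few lines of arithmetic. This is only a sketch: the between-groups estimate is taken as 21.70 here, an assumption chosen because it is the value that reproduces the reported F ratio of 4.07 with a within-groups estimate of 5.33. The cutoff of 3.89 is the one stated in the example, and the variable names are illustrative.

```python
between_estimate = 21.70  # assumption: value consistent with the reported F of 4.07
within_estimate = 5.33    # within-groups population variance estimate
cutoff = 3.89             # F cutoff stated for df = (2, 12)

# Sample's score: ratio of the two variance estimates
f_ratio = between_estimate / within_estimate
# Reject the null hypothesis only if the score exceeds the cutoff
reject_null = f_ratio > cutoff

print(round(f_ratio, 2), reject_null)  # 4.07 True
```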
Comparing each group to each other group
We know that the three groups differ, but how do they differ?
Group A vs. Group B?
Group A vs. Group C?
Group B vs. Group C?
Which of these comparisons is statistically significantly different?
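The list of comparisons above grows quickly with the number of groups: k groups give k(k − 1)/2 pairs. A short sketch using the standard library's `itertools.combinations` enumerates them; the group labels are the ones used above.

```python
from itertools import combinations

groups = ["A", "B", "C"]

# Every unordered pair of groups is one pairwise comparison
pairs = list(combinations(groups, 2))

for first, second in pairs:
    print(f"Group {first} vs. Group {second}")

print(len(pairs))  # 3 comparisons for 3 groups
```

With four groups the same code would list six comparisons, and with five groups ten.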
Why not just use a bunch of t-tests?
With three t-tests at the .05 level, the chance that at least one of them is significant by chance alone is about .14 (at most .15). In other words, the probability of making at least one Type 1 error is roughly 14% rather than 5%.
You would have to lower your alpha to account for the increased risk of Type 1 error: .05/3 = .017.
Lowering alpha, in turn, increases your chance of Type 2 error.
Type 1 error: rejecting null hypothesis when true
Type 2 error: failure to reject false null hypothesis
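The arithmetic behind the multiple t-test problem can be sketched as follows. The calculation assumes the three tests are independent, which pairwise t-tests on the same groups are not exactly, so the first number is an approximation; the simple sum 3 × .05 = .15 is an upper bound. Variable names are illustrative.

```python
alpha = 0.05   # significance level of each individual t-test
n_tests = 3    # A vs. B, A vs. C, B vs. C

# Chance of at least one Type 1 error across all tests (independence assumed)
familywise_error = 1 - (1 - alpha) ** n_tests

# Per-test alpha that keeps the overall risk at or below .05 (Bonferroni adjustment)
adjusted_alpha = alpha / n_tests

print(round(familywise_error, 3))  # 0.143
print(round(adjusted_alpha, 4))    # 0.0167
```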
Comparing Each Group to Each Other Group
Instead of t-tests, “post-hoc analyses” are done.
SPSS does these for you.
Most common is Tukey’s HSD.
Assumptions in the Analysis of Variance
The populations follow a normal curve.
The populations’ variances are equal. (SPSS can test this for you)
Roughly equal group sizes
Reporting one-way ANOVA
A one-way between-groups analysis of variance was conducted to explore the impact of age on levels of optimism. Participants were divided into three groups according to their age (Group 1: 29 yrs or less; Group 2: 30-44 yrs; Group 3: 45 yrs and above). There was a statistically significant difference in total optimism scores for the three age groups: F(2, 432) = 4.6, p = .01. Post-hoc comparisons using the Tukey HSD test indicated that the mean score for Group 1 (M = 21.36, SD = 4.55) was significantly different from that of Group 3 (M = 22.96, SD = 4.49). Group 2 (M = 22.10, SD = 4.15) did not differ significantly from either Group 1 or Group 3.
Two-Way ANOVA (factorial ANOVA)
How is this different from one-way ANOVA?
2 independent variables instead of one
For example: What is the impact of age and gender on optimism?
What the two-way ANOVA tells us:
The main effect of age on optimism scores
The main effect of gender on optimism scores
The interaction effect of gender and age on optimism scores: When the effect of one independent variable depends on the level of the other independent variable
Interaction Effect and Main Effects Graphically
2 Main effects and an interaction effect for 2 independent variables:
Age vs. Optimism
Gender vs. Optimism
Interaction: Gender and age considered together, i.e., whether the effect of age on optimism differs by gender (and vice versa)
Factorial ANOVA Example
Studying two different factors that impacted mood:
Sensitivity (high, not high), Test Difficulty (easy, hard)
This is an example of a 2x2 factorial ANOVA
| | Easy Test | Hard Test |
| --- | --- | --- |
| High Sensitivity | A | B |
| Not High Sensitivity | C | D |
Interaction Effect and Main Effects
| | Easy Test | Hard Test | Marginal Means |
| --- | --- | --- | --- |
| Not high sensitivity | 2.43 | 2.56 | 2.50 |
| High sensitivity | 2.19 | 3.01 | 2.6 |
| Marginal Means | 2.31 | 2.8 | |
Note: The higher the score the more negative the mood
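One way to see the interaction in these cell means is to compare the effect of test difficulty at each level of sensitivity. The sketch below uses only the four cell means from the table; the names `cell_means`, `difficulty_effect`, and `interaction_contrast` are hypothetical. It describes the pattern in the means only, and whether the interaction is statistically significant still requires the F test.

```python
# Cell means from the table (higher score = more negative mood)
cell_means = {
    ("not high", "easy"): 2.43,
    ("not high", "hard"): 2.56,
    ("high", "easy"): 2.19,
    ("high", "hard"): 3.01,
}

def difficulty_effect(sensitivity):
    """Hard-test mean minus easy-test mean at one level of sensitivity."""
    return cell_means[(sensitivity, "hard")] - cell_means[(sensitivity, "easy")]

effect_not_high = difficulty_effect("not high")
effect_high = difficulty_effect("high")

# If the two simple effects were equal, this difference would be zero (no interaction)
interaction_contrast = effect_high - effect_not_high

print(round(effect_not_high, 2))       # 0.13
print(round(effect_high, 2))           # 0.82
print(round(interaction_contrast, 2))  # 0.69
```

A hard test worsens mood much more for highly sensitive people (0.82) than for others (0.13), which is the pattern an interaction effect describes: the effect of one independent variable depends on the level of the other.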