Exercise 33

Calculating Analysis of Variance (ANOVA) and Post Hoc Analyses Following ANOVA

Analysis of variance (ANOVA) is a statistical procedure that compares data between two or more groups or conditions to investigate the presence of differences between those groups on some continuous dependent variable (see Exercise 18). In this exercise, we will focus on the one-way ANOVA, which involves testing one independent variable and one dependent variable (as opposed to other types of ANOVAs, such as factorial ANOVAs that incorporate multiple independent variables).

Why ANOVA and not a t-test? Remember that a t-test is formulated to compare two sets of data or two groups at one time (see Exercise 23 for guidance on selecting appropriate statistics). Thus, data generated from a clinical trial that involves four experimental groups (Treatment 1, Treatment 2, Treatments 1 and 2 combined, and a Control) would require six t-tests, one for each pair of groups. Consequently, the chance of making a Type I error (alpha error) increases substantially (or is inflated) because so many comparisons are being performed. Specifically, the chance of making at least one Type I error can be as large as the number of comparisons multiplied by the alpha level; with six comparisons at α = 0.05, that upper limit is 0.30. Thus, ANOVA is the recommended statistical technique for examining differences between more than two groups (Zar, 2010).
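The inflation of the Type I error rate can be illustrated with a short sketch. The function name and return layout below are illustrative assumptions, not part of the exercise. The sketch counts the pairwise comparisons among the groups and reports two figures: the error rate that would apply if the comparisons were independent (an idealization, because pairwise t-tests on the same groups share data) and the upper limit obtained by multiplying the number of comparisons by alpha.

```python
from math import comb


def familywise_error_rate(n_groups, alpha=0.05):
    """Return (number of pairwise comparisons,
    error rate assuming independent tests,
    upper limit = comparisons * alpha)."""
    # Every pair of groups needs its own t-test.
    n_comparisons = comb(n_groups, 2)
    # Chance of at least one Type I error if the tests were independent
    # (an assumption made only for illustration).
    rate_if_independent = 1 - (1 - alpha) ** n_comparisons
    # Upper limit on the familywise error rate, capped at 1.
    upper_limit = min(1.0, n_comparisons * alpha)
    return n_comparisons, rate_if_independent, upper_limit


if __name__ == "__main__":
    k, rate, limit = familywise_error_rate(n_groups=4, alpha=0.05)
    print(f"{k} comparisons; rate if independent = {rate:.3f}; upper limit = {limit:.2f}")
```

With four groups, either figure is far above the nominal 0.05, which is the motivation for running a single ANOVA instead of a series of t-tests.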

ANOVA is a procedure that culminates in a statistic called the F statistic. It is this value that is compared against an F distribution (see Appendix C) in order to determine whether the groups significantly differ from one another on the dependent variable. The formulas for ANOVA actually compute two estimates of variance: One estimate represents differences between the groups/conditions, and the other estimate represents differences among (within) the data.

Research Designs Appropriate for the One-Way ANOVA

Research designs that may utilize the one-way ANOVA include the randomized experimental, quasi-experimental, and comparative designs (Gliner, Morgan, & Leech, 2009). The independent variable (the “grouping” variable for the ANOVA) may be active or attributional. An active independent variable refers to an intervention, treatment, or program. An attributional independent variable refers to a characteristic of the participant, such as gender, diagnosis, or ethnicity. The ANOVA can compare two or more groups. In the case of a two-group design, the researcher can select either an independent samples t-test or a one-way ANOVA to answer the research question. The results will always yield the same conclusion, regardless of which test is computed; however, when examining differences between more than two groups, the one-way ANOVA is the preferred statistical test.

Example 1: A researcher conducts a randomized experimental study wherein she randomizes participants to receive a high-dosage weight loss pill, a low-dosage weight loss pill, or a placebo. She assesses the number of pounds lost from baseline to post-treatment for the three groups. Her research question is: “Is there a difference between the three groups in weight lost?” The independent variable is the treatment condition (high-dose weight loss pill, low-dose weight loss pill, or placebo) and the dependent variable is the number of pounds lost over the treatment span.

Null hypothesis: There is no difference in weight lost among the high-dose weight loss pill, low-dose weight loss pill, and placebo groups in a population of overweight adults.

Example 2: A nurse researcher working in dermatology conducts a retrospective comparative study wherein she conducts a chart review of patients and divides them into three groups: psoriasis, psoriatic symptoms, or control. The dependent variable is health status and the independent variable is disease group (psoriasis, psoriatic symptoms, and control). Her research question is: “Is there a difference between the three groups in levels of health status?”

Null hypothesis: There is no difference between the three groups in health status.

Statistical Formula and Assumptions

Use of the ANOVA involves the following assumptions (Zar, 2010):

1. Sample means from the population are normally distributed.

2. The groups are mutually exclusive.

3. The dependent variable is measured at the interval/ratio level.

4. The groups should have equal variance, termed “homogeneity of variance.”

5. All observations within each sample are independent.

The dependent variable in an ANOVA must be scaled as interval or ratio. If the dependent variable is measured with a Likert scale and the frequency distribution is approximately normally distributed, these data are usually considered interval-level measurements and are appropriate for an ANOVA (de Winter & Dodou, 2010; Rasmussen, 1989).

The basic formula for the F without numerical symbols is:

F = Mean Square Between Groups / Mean Square Within Groups

The term “mean square” (MS) is used interchangeably with the word “variance.” The formulas for ANOVA compute two estimates of variance: the between groups variance and the within groups variance. The between groups variance represents differences between the groups/conditions being compared, and the within groups variance represents differences among (within) each group's data. Therefore, the formula is F = MS between/MS within.
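The ratio F = MS between/MS within can be traced step by step in a minimal sketch. The function name, the dictionary keys, and the three small groups of scores are hypothetical and chosen only so the arithmetic is easy to follow by hand; they are not data from this exercise. Each mean square is a sum of squares divided by its degrees of freedom: groups minus 1 for the between-groups estimate, and total observations minus groups for the within-groups estimate.

```python
def one_way_anova(groups):
    """Compute the one-way ANOVA F statistic.

    groups: a list of lists, one inner list of scores per group.
    """
    all_scores = [score for group in groups for score in group]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(group) / len(group) for group in groups]

    # Between-groups sum of squares: how far each group mean lies
    # from the grand mean, weighted by group size.
    ss_between = sum(
        len(group) * (group_mean - grand_mean) ** 2
        for group, group_mean in zip(groups, group_means)
    )
    # Within-groups sum of squares: how far each score lies
    # from its own group mean.
    ss_within = sum(
        (score - group_mean) ** 2
        for group, group_mean in zip(groups, group_means)
        for score in group
    )

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)

    ms_between = ss_between / df_between  # between groups variance
    ms_within = ss_within / df_within     # within groups variance

    return {
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


if __name__ == "__main__":
    # Hypothetical scores for three groups (illustration only).
    example = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
    print(one_way_anova(example))
```

The resulting F is then compared against the critical value of the F distribution for the group and error degrees of freedom.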

Data for Additional Computational Practice for Questions to be Graded

This example is drawn from the study by Ottomanelli and colleagues (2012), in which participants were randomized to receive Supported Employment or treatment as usual. A third group, also a treatment as usual group, consisted of a nonrandomized observational group of participants. A simulated subset was selected for this example so that the computations would be small and manageable. The independent variable in this example is treatment group (Supported Employment, Treatment as Usual–Randomized, and Treatment as Usual–Observational/Not Randomized), and the dependent variable is the number of hours worked post-treatment. Supported employment refers to a type of specialized interdisciplinary vocational rehabilitation designed to help people with disabilities obtain and maintain community-based competitive employment in their chosen occupation (Bond, 2004).

The null hypothesis is: “There is no difference between the treatment groups in post-treatment number of hours worked among veterans with spinal cord injuries.”

Compute the ANOVA on the data in Table 33-3 below.

TABLE 33-3

POST-TREATMENT HOURS WORKED BY TREATMENT GROUP

Participant #   Supported Employment   Participant #   TAU Observational   Participant #   TAU Randomized
1               8                      6               15                  11              25
2               9                      7               18                  12              28
3               15                     8               9                   13              35
4               17                     9               18                  14              30
5               24                     10              16                  15              15

“TAU” = Treatment as Usual.
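For readers working outside SPSS, a reasonable first step is to enter the table's data and inspect each group before computing the ANOVA. The sketch below transcribes the hours worked from Table 33-3; the variable and function names are illustrative assumptions. It reports each group's size, mean, and sample variance, which are the quantities needed to begin the hand calculation and to compare the spread of the three groups informally. It is not a formal test of homogeneity of variance.

```python
from statistics import mean, variance

# Post-treatment hours worked, transcribed from Table 33-3.
hours_worked = {
    "Supported Employment": [8, 9, 15, 17, 24],   # participants 1-5
    "TAU Observational": [15, 18, 9, 18, 16],     # participants 6-10
    "TAU Randomized": [25, 28, 35, 30, 15],       # participants 11-15
}


def describe_groups(groups):
    """Return n, mean, and sample variance (n - 1 denominator) per group."""
    return {
        name: {
            "n": len(scores),
            "mean": mean(scores),
            "variance": variance(scores),
        }
        for name, scores in groups.items()
    }


if __name__ == "__main__":
    for name, stats in describe_groups(hours_worked).items():
        print(f"{name}: n = {stats['n']}, "
              f"mean = {stats['mean']:.2f}, variance = {stats['variance']:.2f}")
```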

EXERCISE 33 Questions to Be Graded

Name: _______________________________________________________ Class: _____________________

Date: ___________________________________________________________________________________

Follow your instructor's directions to submit your answers to the following questions for grading. Your instructor may ask you to write your answers below and submit them as a hard copy for grading. Alternatively, your instructor may ask you to use the space below for notes and submit your answers online at http://evolve.elsevier.com/Grove/statistics/ under “Questions to Be Graded.”

1. Do the data meet criteria for homogeneity of variance? Provide a rationale for your answer.

2. If calculating by hand, draw the frequency distribution of the dependent variable, hours worked at a job. What is the shape of the distribution? If using SPSS, what is the result of the Shapiro-Wilk test of normality for the dependent variable?

3. What are the means for the three groups' hours worked on a job?

4. What are the F value and the group and error df for this set of data?

5. Is the F significant at α = 0.05? Specify how you arrived at your answer.

6. If using SPSS, what is the exact likelihood of obtaining an F value at least as extreme as or as close to the one that was actually observed, assuming that the null hypothesis is true?

7. Which group worked the most weekly job hours post-treatment? Provide a rationale for your answer.

8. Write your interpretation of the results as you would in an APA-formatted journal.

9. Is there a difference in your final interpretation when comparing the results of the LSD post hoc test versus Tukey HSD test? Provide a rationale for your answer.

10. If the researcher decided to combine the two Treatment as Usual groups to represent an overall “Control” group, then there would be two groups to compare: Supported Employment versus Control. What would be the appropriate statistic to address the difference in hours worked between the two groups? Provide a rationale for your answer.