12

Inferential Analysis

William M. Trochim

James P. Donnelly

Kanika Arora

2e

© 2016 Cengage Learning. All Rights Reserved.

The analysis procedure you choose is based on your research design

All of the procedures in this chapter are based on the General Linear Model (GLM)

A system of equations that is used as the mathematical framework for most of the statistical analyses used in applied social research

12.1 Foundations of Analysis for Research Design

Statistical analyses used to reach conclusions that extend beyond the immediate data alone

The GLM uses dummy variables

A variable that uses discrete numbers, usually 0 and 1, to represent different groups in your study

12.2 Inferential Statistics

Uses the GLM to estimate statistical significance

p value: an estimate of the probability of obtaining a result at least as extreme as yours if the null hypothesis is true

Statistical significance is not enough; we need an effect size, as well
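To make the two ideas concrete, here is a minimal Python sketch that reports both a p value and an effect size for two small groups. The scores are hypothetical, the p value comes from an exhaustive permutation test (a stand-in chosen here because it needs no statistical tables, not the GLM-based test the chapter describes), and Cohen's d is used as one common effect size that the slides do not name.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance

# Hypothetical posttest scores (not from the textbook).
treatment = [52, 55, 58, 60, 61]
control = [48, 50, 51, 53, 54]

observed = mean(treatment) - mean(control)

# Permutation test: under the null hypothesis the group labels are arbitrary,
# so every way of splitting the ten scores into two groups of five is equally likely.
pooled = treatment + control
n_t = len(treatment)
n_c = len(pooled) - n_t
total = sum(pooled)
n_splits = 0
count_extreme = 0
for idx in combinations(range(len(pooled)), n_t):
    group_sum = sum(pooled[i] for i in idx)
    diff = group_sum / n_t - (total - group_sum) / n_c
    n_splits += 1
    # Count splits whose difference is at least as extreme as the observed one.
    if abs(diff) >= abs(observed) - 1e-12:
        count_extreme += 1
p_value = count_extreme / n_splits

# Cohen's d: the mean difference in pooled standard deviation units
# (simple average of the two variances, valid here because the groups are equal in size).
pooled_sd = sqrt((variance(treatment) + variance(control)) / 2)
cohens_d = observed / pooled_sd

print(f"difference = {observed:.1f}, p = {p_value:.4f}, d = {cohens_d:.2f}")
```

The p value answers "could chance alone produce this?", while d answers "how big is the difference?"; a study needs both to judge practical significance.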

12.2 Inferential Statistics – Statistical Significance

12.2 Statistical and Practical Significance

Table 12.1 Possible outcomes of a study with regard to statistical and practical significance

Foundation for

t-test

ANOVA and ANCOVA

Regression, factor, and cluster analyses

Multidimensional scaling

Discriminant function analysis

Canonical correlation

12.3 General Linear Model

Assumptions

The relationships between variables are linear

Samples are random and independently drawn from the population

Variables have equal (homogeneous) variances

Variables have normally distributed error

12.3 General Linear Model

12.3a The Two-Variable Linear Model

Figure 12.2 A bivariate plot.

Figure 12.3 A straight-line summary of the data.

Linear model: Any statistical model that uses equations to estimate lines.

12.3a The Straight Line Model

Figure 12.4 The straight-line model.

Regression line: A line that describes the relationship between two or more variables.

Regression analysis: A general statistical analysis that enables us to model relationships in data and test for treatment effects. In regression analysis, we model relationships that can be depicted in graphic form with lines that are called regression lines.

12.3a Estimates Using the Two-Variable Linear Model

Figure 12.5 The two-variable linear model.

Figure 12.6 What the model estimates.

Error term: A term in a regression equation that captures the degree to which the line is in error (that is, the residual) in describing each point.
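As an illustration of the straight-line model and its error term, the following minimal sketch fits a line by ordinary least squares and computes the residual for each point. The data and the names `b0`, `b1`, and `residuals` are hypothetical choices made for this example.

```python
from statistics import mean

# Hypothetical bivariate data: x could be a pretest, y a posttest.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.0]

x_bar, y_bar = mean(x), mean(y)

# Least-squares slope and intercept for the line y = b0 + b1 * x.
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar

# The error term for each point is its residual: observed minus predicted.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

print(f"y = {b0:.2f} + {b1:.2f} * x")
print("residuals:", [round(e, 2) for e in residuals])
```

With an intercept in the model, the least-squares residuals always sum to zero, which is one quick check that the line was fitted correctly.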

12.3b The “General” in the General Linear Model

The GLM allows you to summarize a wide variety of research outcomes

The major problem for the researcher who uses the GLM is model specification

How to identify the equation that best summarizes the data for a study

If the model is misspecified, the estimates of the coefficients (the b-values) that you get from the analysis are likely to be biased

12.3b The “General” in the General Linear Model (cont’d.)

Enable you to use a single regression equation to represent multiple groups

Act like switches that turn various values on and off in an equation

12.3c Dummy Variables

Figure 12.7 Use of a dummy variable in a regression equation

12.3c Using Dummy Variables

Figure 12.8 Using a dummy variable to create separate equations for each dummy variable value.

Figure 12.9 Determine the difference between two groups by subtracting the equations generated through their dummy variables.
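The switch-like behavior of a dummy variable can be sketched in a few lines of Python. This example assumes the simple model y = b0 + b1·Z + e with Z = 0 for the control group and Z = 1 for the treatment group; the scores and the helper `fit_line` are hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
        (xi - x_bar) ** 2 for xi in x
    )
    return y_bar - b1 * x_bar, b1


# Hypothetical posttest scores.
control = [48, 50, 51, 53, 54]
treatment = [52, 55, 58, 60, 61]

# Dummy variable: 0 switches the treatment term off, 1 switches it on.
z = [0] * len(control) + [1] * len(treatment)
y = control + treatment

b0, b1 = fit_line(z, y)

# Z = 0 gives y = b0 (the control mean); Z = 1 gives y = b0 + b1 (the treatment mean).
# Subtracting the two equations leaves b1, the difference between the groups.
print(f"control equation:   y = {b0:.1f}")
print(f"treatment equation: y = {b0 + b1:.1f}")
print(f"group difference:   b1 = {b1:.1f}")
```

Here b0 equals the control-group mean and b1 equals the difference between the two group means, which is why one equation can stand in for both groups.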

Assesses whether the means of two groups (for example, the treatment and control groups) are statistically different from each other

12.3d The t-Test

Figure 12.10 Idealized distributions for treated and control group posttest values

12.3d Three Scenarios

Figure 12.11 Three scenarios for differences between means.

12.3d Low-, Medium-, and High-Variability Scenarios

Table 12.2 shows the low-, medium-, and high-variability scenarios represented with data that correspond to each case.

The first thing to notice about the three situations is that the difference between the means is the same in all three.
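A small sketch can show why the same mean difference reads differently under low, medium, and high variability. The scores below are hypothetical (they are not the values in Table 12.2), and `t_value` is an illustrative helper that divides the difference between the means by the standard error of that difference.

```python
from math import sqrt
from statistics import mean, variance


def t_value(treatment, control):
    """Difference between means divided by the standard error of the difference."""
    se = sqrt(variance(treatment) / len(treatment) + variance(control) / len(control))
    return (mean(treatment) - mean(control)) / se


# Same shape of scores in every scenario; only the spread multiplier changes.
pattern = [-2, -1, 0, 1, 2]
scenarios = {"low": 1, "medium": 3, "high": 6}

results = {}
for name, spread in scenarios.items():
    control = [50 + spread * d for d in pattern]    # mean is always 50
    treatment = [55 + spread * d for d in pattern]  # mean is always 55
    results[name] = t_value(treatment, control)

for name, t in results.items():
    print(f"{name:>6} variability: mean difference = 5, t = {t:.2f}")
```

The mean difference is 5 in all three scenarios, yet t shrinks as the spread grows, because the same signal is harder to distinguish from the noise.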

When you are looking at the differences between scores for two groups, you have to judge the difference between their means relative to the spread or variability of their scores

The t-test does just this: it assesses whether the difference between the means of two groups is large relative to the variability of their scores

12.3d Difference Between the Means

12.3d Formula for the t-Test

Figure 12.12 Formula for the t-test. (left)

Figure 12.13 Formula for the standard error of the difference between the means. (top right)

Figure 12.14 Final formula for the t-test. (bottom right)
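The figure images are not reproduced in this text. As a reference sketch, assuming the three figures follow the standard form of the independent-groups t-test (the subscripts T and C for the treatment and control groups are notation chosen here), the formulas can be written as:

```latex
t = \frac{\bar{x}_T - \bar{x}_C}{SE(\bar{x}_T - \bar{x}_C)},
\qquad
SE(\bar{x}_T - \bar{x}_C) = \sqrt{\frac{\mathrm{var}_T}{n_T} + \frac{\mathrm{var}_C}{n_C}},
\qquad
t = \frac{\bar{x}_T - \bar{x}_C}{\sqrt{\dfrac{\mathrm{var}_T}{n_T} + \dfrac{\mathrm{var}_C}{n_C}}}
```

The numerator is the signal (the difference between the group means) and the denominator is the noise (the variability of the groups, scaled by their sample sizes).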

t-Value

Standard error of the difference

Variance

Standard deviation (SD)

Alpha level (α)

Degrees of freedom (df)

12.3d The t-Test

The regression formula for the t-test & ANOVA

Figure 12.15 The regression formula for the t-test (and also the two-group one-way posttest-only Analysis of Variance or ANOVA model).

t-value: The estimate of the difference between the groups relative to the variability of the scores in the groups.

Standard error of the difference: A statistical estimate of the standard deviation one would obtain from the distribution of an infinite number of estimates of the difference between the means of two groups.

Variance: A statistic that describes the variability in the data for a variable. The variance is the spread of the scores around the mean of a distribution. Specifically, the variance is the sum of the squared deviations from the mean divided by the number of observations minus 1.

Standard deviation: The spread or variability of the scores around their average in a single sample. The standard deviation, often abbreviated SD, is mathematically the square root of the variance. The standard deviation and variance both measure dispersion, but because the standard deviation is measured in the same units as the original measure and the variance is measured in squared units, the standard deviation is usually more directly interpretable and meaningful.

Alpha level: The p value selected as the significance level. Specifically, alpha is the Type I error rate, or the probability of concluding that there is a treatment effect when, in reality, there is not.

Degrees of freedom (df): A statistical term that is a function of the sample size. In the t-test formula, for instance, the df is the number of persons in both groups minus 2.
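The definitions above can be traced step by step in a short sketch. The scores are hypothetical, `sample_variance` is an illustrative helper, and the critical value 2.306 is the two-tailed value for alpha = .05 and df = 8 taken from a standard t table.

```python
from math import sqrt
from statistics import mean, variance


def sample_variance(scores):
    """Sum of squared deviations from the mean, divided by n - 1."""
    m = mean(scores)
    return sum((s - m) ** 2 for s in scores) / (len(scores) - 1)


# Hypothetical scores for two groups.
treatment = [4, 6, 8, 10, 12]
control = [3, 5, 6, 7, 9]

var_t = sample_variance(treatment)
var_c = sample_variance(control)
sd_t = sqrt(var_t)  # the standard deviation is the square root of the variance

# Standard error of the difference between the two means.
se_diff = sqrt(var_t / len(treatment) + var_c / len(control))

# Degrees of freedom: number of persons in both groups minus 2.
df = len(treatment) + len(control) - 2

t = (mean(treatment) - mean(control)) / se_diff

# Two-tailed critical value for alpha = .05 and df = 8, from a standard t table.
critical_t = 2.306
significant = abs(t) > critical_t

print(f"variance = {var_t:.1f}, SD = {sd_t:.2f}, SE of difference = {se_diff:.2f}")
print(f"df = {df}, t = {t:.2f}, significant at alpha = .05: {significant}")
```

In this example the t-value falls short of the critical value, so the difference of 2 points would not be declared statistically significant at the .05 level.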

Meets the following requirements:

Has two groups

Uses a posttest-only measure

Has a distribution for each group on the response measure, each with an average and variation

Assesses treatment effect as the statistical (non-chance) difference between the groups

12.4a The Two-Group Posttest-Only Randomized Experiment

Three tests meet these requirements, and they all yield the same results

Independent t-Test

One-way ANOVA

Regression analysis
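The equivalence of the three tests can be checked numerically. The sketch below uses hypothetical scores and assumes the pooled-variance form of the independent t-test, which is the form for which the equivalence with ANOVA and regression is exact.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical posttest scores.
control = [48, 50, 51, 53, 54]
treatment = [52, 55, 58, 60, 61]
n_c, n_t = len(control), len(treatment)
df = n_c + n_t - 2

# 1) Independent t-test with pooled variance.
pooled_var = ((n_t - 1) * variance(treatment) + (n_c - 1) * variance(control)) / df
t_value = (mean(treatment) - mean(control)) / sqrt(pooled_var * (1 / n_t + 1 / n_c))

# 2) One-way ANOVA with two groups: F = mean square between / mean square within.
grand_mean = mean(control + treatment)
ss_between = n_t * (mean(treatment) - grand_mean) ** 2 + n_c * (mean(control) - grand_mean) ** 2
ss_within = sum((v - mean(treatment)) ** 2 for v in treatment) + sum(
    (v - mean(control)) ** 2 for v in control
)
f_value = (ss_between / 1) / (ss_within / df)

# 3) Regression of the posttest on a 0/1 dummy variable.
z = [0] * n_c + [1] * n_t
y = control + treatment
z_bar, y_bar = mean(z), mean(y)
sxx = sum((zi - z_bar) ** 2 for zi in z)
b1 = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y)) / sxx
b0 = y_bar - b1 * z_bar
sse = sum((yi - (b0 + b1 * zi)) ** 2 for zi, yi in zip(z, y))
t_regression = b1 / sqrt((sse / df) / sxx)  # t-value for the dummy coefficient

print(f"t-test:     t = {t_value:.3f}")
print(f"ANOVA:      F = {f_value:.3f} (t squared = {t_value ** 2:.3f})")
print(f"regression: t = {t_regression:.3f}")
```

The ANOVA F equals the square of the t-value, and the regression t for the dummy coefficient equals the t-test's t, so all three lead to the same conclusion.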

12.4a The Two-Group Posttest-Only Randomized Experiment (cont’d.)

Analysis requires results for two main effects and one interaction effect in a 2 x 2 factorial design

12.4b Factorial Design Analysis

Figure 12.17 Regression model for a 2 x 2 factorial design.

Main effect: An outcome that shows consistent differences between all levels of a factor.

Interaction effect: An effect that occurs when differences on one factor depend on the level of another factor.
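Main effects and the interaction can be read directly from the four cell means of a 2 x 2 design. The sketch below uses hypothetical cell means and assumes the regression model has the usual form with two 0/1 dummy variables and their product, y = b0 + b1·Z1 + b2·Z2 + b3·Z1·Z2 + e.

```python
# Hypothetical cell means, keyed by (z1, z2), the levels of the two factors.
cell_means = {(0, 0): 50.0, (1, 0): 55.0, (0, 1): 52.0, (1, 1): 63.0}

# Main effect of each factor: difference between its marginal means.
main_1 = (cell_means[(1, 0)] + cell_means[(1, 1)]) / 2 - (
    cell_means[(0, 0)] + cell_means[(0, 1)]
) / 2
main_2 = (cell_means[(0, 1)] + cell_means[(1, 1)]) / 2 - (
    cell_means[(0, 0)] + cell_means[(1, 0)]
) / 2

# Interaction: the effect of factor 1 at level 1 of factor 2,
# minus the effect of factor 1 at level 0 of factor 2.
interaction = (cell_means[(1, 1)] - cell_means[(0, 1)]) - (
    cell_means[(1, 0)] - cell_means[(0, 0)]
)

# Coefficients of the dummy-coded regression model that reproduce the cell means.
b0 = cell_means[(0, 0)]
b1 = cell_means[(1, 0)] - b0
b2 = cell_means[(0, 1)] - b0
b3 = interaction


def predict(z1, z2):
    """Predicted cell mean from the regression equation."""
    return b0 + b1 * z1 + b2 * z2 + b3 * z1 * z2


print(f"main effects: {main_1}, {main_2}; interaction: {interaction}")
print(f"b0 = {b0}, b1 = {b1}, b2 = {b2}, b3 = {b3}")
```

Note that with 0/1 coding, b1 and b2 are the effects of each factor at the reference level of the other; they match the marginal main effects only when the interaction is zero.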

The dummy variable Z1 represents the treatment group

The other dummy variables indicate the blocks

The beta values (β) are the coefficients for the corresponding treatment and block dummy variables

12.4c Randomized Block Analysis

Figure 12.18 Regression model for a Randomized Block design

An analysis that estimates the difference between the groups on the posttest after adjusting for differences on the pretest

12.4d Analysis of Covariance

Figure 12.19 Regression model for the ANCOVA.
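The adjustment that ANCOVA makes can be sketched with a small example. It assumes the model y = b0 + b1·X + b2·Z + e (X the pretest, Z the treatment dummy) and estimates b2 from the pooled within-group slope, which gives the same value as least squares for that model. The data are hypothetical and constructed without noise so the arithmetic is easy to check.

```python
from statistics import mean

# Hypothetical, noise-free data: posttest = 10 + 0.8 * pretest + 5 * treatment.
pre_c = [40, 45, 50, 55, 60]
post_c = [42, 46, 50, 54, 58]
pre_t = [45, 50, 55, 60, 65]
post_t = [51, 55, 59, 63, 67]


def within_sums(x, y):
    """Return (Sxy, Sxx) computed around the group's own means."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy, sxx


# Pooled within-group slope of posttest on pretest (the b1 of the ANCOVA model).
sxy_c, sxx_c = within_sums(pre_c, post_c)
sxy_t, sxx_t = within_sums(pre_t, post_t)
slope = (sxy_c + sxy_t) / (sxx_c + sxx_t)

# Unadjusted posttest difference, then the difference adjusted for the pretest.
raw_difference = mean(post_t) - mean(post_c)
adjusted_difference = raw_difference - slope * (mean(pre_t) - mean(pre_c))

print(f"raw difference = {raw_difference:.1f}")
print(f"pretest slope = {slope:.2f}")
print(f"adjusted difference (treatment effect) = {adjusted_difference:.1f}")
```

The raw posttest difference of 9 overstates the effect because the treatment group started 5 points higher on the pretest; after adjustment the estimate is 5, the value built into the data.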

Quasi-experimental designs still use the GLM, but it has to be adjusted for measurement error

Measurement error: any influence on an observed score not related to what you are attempting to measure

This adjustment for error makes the analyses more complicated

12.5 Quasi-Experimental Analysis

12.5a Nonequivalent Groups Analysis

Figure 12.21 Formula for adjusting pretest values for unreliability in the reliability-corrected ANCOVA

Figure 12.22 The regression model for the reliability corrected ANCOVA
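The figure images are not reproduced here. As a sketch only, assuming the adjustment takes the commonly used form X_adj = x̄ + r·(X − x̄), where x̄ is the pretest mean of the person's own group and r is the reliability of the pretest, the correction can be written as follows. The scores, the reliability of 0.8, and the helper `adjust_pretest` are hypothetical.

```python
from statistics import mean


def adjust_pretest(scores, reliability):
    """Shrink each pretest score toward its group mean by the reliability.

    Assumed form: x_adj = group_mean + reliability * (x - group_mean).
    """
    group_mean = mean(scores)
    return [group_mean + reliability * (x - group_mean) for x in scores]


# Hypothetical pretest scores for one group and a hypothetical reliability.
pretest = [40, 45, 50, 55, 60]
reliability = 0.8

adjusted = adjust_pretest(pretest, reliability)

print("original:", pretest)
print("adjusted:", [round(x, 1) for x in adjusted])
```

The group mean is unchanged, but each score is pulled toward it; the less reliable the pretest, the stronger the pull. Under this assumed form the adjustment is applied separately within each group, and the adjusted pretest then replaces the original pretest in the ANCOVA model.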

12.5b Regression-Discontinuity Analysis

Figure 12.23 Adjusting the pretest by subtracting the cutoff in the Regression-Discontinuity (RD) analysis model.

Figure 12.24 The regression model for the basic regression-discontinuity design.
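Subtracting the cutoff moves the zero point of the pretest to the cutoff itself, so the intercept and the treatment coefficient are both read at the point where the groups meet. The sketch below assumes the basic model y = b0 + b1·(X − cutoff) + b2·Z + e with a common slope on both sides; the data are hypothetical, noise-free, and assign treatment to those scoring below the cutoff.

```python
from statistics import mean

cutoff = 50

# Hypothetical, noise-free data: posttest = 60 + 0.5 * (pretest - cutoff) + 4 * treated.
pretest = [30, 35, 40, 45, 50, 55, 60, 65]
posttest = [54.0, 56.5, 59.0, 61.5, 60.0, 62.5, 65.0, 67.5]

# Step 1: adjust the pretest by subtracting the cutoff.
centered = [x - cutoff for x in pretest]

# Step 2: assignment is determined entirely by the cutoff (here, below it means treated).
treated = [1 if x < cutoff else 0 for x in pretest]

groups = {0: ([], []), 1: ([], [])}
for xc, y, z in zip(centered, posttest, treated):
    groups[z][0].append(xc)
    groups[z][1].append(y)

# Step 3: common slope pooled across the two sides of the cutoff.
sxy = sxx = 0.0
for xs, ys in groups.values():
    x_bar, y_bar = mean(xs), mean(ys)
    sxy += sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    sxx += sum((xi - x_bar) ** 2 for xi in xs)
b1 = sxy / sxx

# Step 4: intercept at the cutoff for the comparison group, and the jump for the treated.
b0 = mean(groups[0][1]) - b1 * mean(groups[0][0])
b2 = (mean(groups[1][1]) - b1 * mean(groups[1][0])) - b0

print(f"slope b1 = {b1:.2f}, intercept at cutoff b0 = {b0:.1f}, discontinuity b2 = {b2:.1f}")
```

Because the pretest is centered, b2 is the vertical jump between the two regression lines exactly at the cutoff, which is the treatment effect estimate in this design.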

12.5c Regression Point Displacement Analysis

Figure 12.25 The regression model for the RPD design assuming a linear pre-post relationship.

Summary

Table 12.3. Summary of the statistical models for the experimental and quasi-experimental research designs.

What is the difference between statistical significance and practical significance?

Give an example to support your answer

Discuss how the four assumptions underlying the GLM impact the data analysis process

Discuss and Debate

Statistical significance tells us how likely it is that a difference at least as large as the one observed would arise by chance alone if there were no real difference between the groups. Practical significance tells us the degree to which the results have meaning in real life. Examples will vary.

By running descriptive statistics first, researchers can check that the data conform to the four assumptions: 1) the relationships between variables are linear, 2) samples are random and independently drawn from the population, 3) variables have equal (homogeneous) variances, and 4) variables have normally distributed error. If a researcher does not test these assumptions, conclusion validity will be threatened.
