Chapter 18: Inferential Statistics

Educational Research:

Competencies for Analysis and Application

11/E

Geoffrey Mills and Lorraine Gay

© 2016, 2012, 2009, 2006 Pearson Education, Inc. All Rights Reserved

Gay & Mills

Educational Research, 11e

© 2016 Pearson Education, Inc. All rights reserved.

After reading Chapter 18, you should be able to do the following:

Explain the concepts underlying inferential statistics.

Select among tests of significance and apply them to your study.

Concepts Underlying Inferential Statistics

Inferential statistics are data analysis techniques for determining how likely it is that results obtained from a sample, or samples, are the same results that would be obtained from the entire population.

Concepts Underlying Inferential Statistics

Descriptive statistics show how often or how frequently an event or score occurred.

Inferential statistics help researchers know whether they can generalize their findings to a population based upon their sample of participants.

Inferential statistics use data to assess likelihood—or probability.

Concepts Underlying Inferential Statistics

Standard Error

Inferences about populations are based on information from samples.

There is very little chance that any sample is identical to the population.

The expected variation between sample means and the population mean is referred to as sampling error.

Sampling error is expected.

Sampling error tends to be normally distributed.

Standard Error

A distribution of sample means is approximately normally distributed and has its own mean and standard deviation.

The standard deviation of the sample means is referred to as the standard error of the mean.

Our ability to estimate the standard error of the mean is affected by the size of the sample.

As the sample size increases, the standard error of the mean decreases.

Our ability to estimate the standard error of the mean is also affected by the size of the population standard deviation.
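
A small simulation can make these points concrete. The sketch below is illustrative only: the population mean of 100, the standard deviation of 15, the two sample sizes, and the helper name `sd_of_sample_means` are all assumptions chosen for the example, not values from the chapter.

```python
import random
import statistics

def sd_of_sample_means(sample_size, n_samples=2000, mu=100.0, sigma=15.0, seed=0):
    """Draw many random samples and return the SD of their means.

    mu, sigma, and n_samples are made-up values used only for illustration.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(n_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        means.append(statistics.mean(sample))
    # The SD of the sample means is an empirical standard error of the mean.
    return statistics.stdev(means)

small = sd_of_sample_means(sample_size=4)
large = sd_of_sample_means(sample_size=100)
print(f"SD of sample means, N = 4:   {small:.2f}")
print(f"SD of sample means, N = 100: {large:.2f}")
```

The larger samples should produce sample means that cluster much more tightly around the population mean.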

Standard Error

The standard error of the mean can be calculated by:

$SE_{\bar{X}}$ = the standard error of the mean

SD = the standard deviation for a sample

N = the sample size
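
As a worked sketch, the standard error of the mean can be computed as the sample standard deviation divided by the square root of N − 1. The function name `standard_error_of_mean` and the sample values (SD = 12, N = 26 and N = 101) are hypothetical, chosen so the arithmetic is easy to follow.

```python
import math

def standard_error_of_mean(sd, n):
    """Standard error of the mean: SD divided by the square root of N - 1."""
    if n < 2:
        raise ValueError("N must be at least 2")
    return sd / math.sqrt(n - 1)

# Hypothetical samples: SD = 12 in both cases, only N changes.
print(standard_error_of_mean(12, 26))   # 12 / sqrt(25)  = 12 / 5
print(standard_error_of_mean(12, 101))  # 12 / sqrt(100) = 12 / 10
```

Holding SD constant, the larger sample gives the smaller standard error.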

Hypothesis Testing

Hypothesis testing is the process of decision making in which researchers evaluate the results of a study against their original expectations.

Null hypothesis: Predicting no difference in scores

Research hypothesis: Predicting a difference in scores

We want to ensure that the differences we observe between groups are ‘real’ differences and did not occur by chance.

Hypothesis Testing

If the groups are significantly different, we reject the null hypothesis.

We do not accept a research hypothesis, because we cannot prove our hypothesis.

We instead report that our research hypothesis was supported.

If the expected differences are not found, we report that the null hypothesis was not rejected and that our research hypothesis was not supported.

Tests of Significance

Tests of significance allow us to inferentially test if differences between scores in our sample are simply due to chance or if they are representative of the true state of affairs in the population.

To conduct a test of significance we determine a preselected probability level, known as level of significance (alpha or α).

Educational researchers usually set alpha at .05 (α = .05): a probability of 5 out of 100 that a difference as large as the one observed would occur by chance alone.

Tests of Significance

Two-tailed and one-tailed tests

Tests of significance are almost always two-tailed.

Researchers will select a one-tailed test of significance only when they are quite certain that a difference will occur in only one direction.

It is ‘easier’ to obtain a significant effect when predicting in one direction.
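
The sketch below suggests why a one-tailed test is 'easier' to pass. It uses a z statistic and the normal distribution for simplicity, which is an assumption made for illustration; the value z = 1.80 and the helper name `p_values_for_z` are made up.

```python
from statistics import NormalDist

def p_values_for_z(z):
    """Return (one_tailed_p, two_tailed_p) for a z statistic.

    The one-tailed value assumes the difference fell in the predicted direction.
    """
    upper_tail = 1.0 - NormalDist().cdf(abs(z))
    return upper_tail, 2.0 * upper_tail

alpha = 0.05
one_tailed, two_tailed = p_values_for_z(1.80)  # z = 1.80 is a made-up result
print(f"one-tailed p = {one_tailed:.3f}, significant: {one_tailed < alpha}")
print(f"two-tailed p = {two_tailed:.3f}, significant: {two_tailed < alpha}")
```

The same result falls below α = .05 in the one-tailed test but not in the two-tailed test, because the two-tailed probability is twice as large.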

Type I and Type II Errors

Based upon a test of significance, the researcher will either reject or not reject the null hypothesis.

The researcher makes a decision that the observed effect is or is not due to chance.

This decision is based upon probability, not certainty.

Type I and Type II Errors

Sometimes the researcher will erroneously reject the null hypothesis or will erroneously retain the null hypothesis.

When the researcher rejects a null hypothesis that is actually true, she has committed a Type I error.

When the researcher fails to reject the null hypothesis even though a true difference exists, she has committed a Type II error.

Type I and Type II Errors

Researcher’s decision | Null hypothesis is true (should not be rejected) | Null hypothesis is false (should be rejected)
Does not reject the null hypothesis | Correct decision | Type II error
Rejects the null hypothesis | Type I error | Correct decision
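
The four cells of the table can be expressed as a small lookup. This is only a sketch: the function name `classify_decision` is hypothetical, and in a real study the true status of the null hypothesis is unknown, so the function describes the logic of the table rather than something a researcher could run on actual results.

```python
def classify_decision(null_is_true, researcher_rejects):
    """Label the outcome of a significance-test decision."""
    if null_is_true and researcher_rejects:
        return "Type I error"       # rejected a true null hypothesis
    if not null_is_true and not researcher_rejects:
        return "Type II error"      # retained a false null hypothesis
    return "Correct decision"

for null_is_true in (True, False):
    for researcher_rejects in (False, True):
        outcome = classify_decision(null_is_true, researcher_rejects)
        print(f"null true={null_is_true}, rejects={researcher_rejects}: {outcome}")
```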

Degrees of Freedom

After determining whether the significance test will be two-tailed or one-tailed and selecting a probability level (i.e., alpha), the researcher selects an appropriate statistical test to conduct the analysis.

Degrees of freedom are the number of observations free to vary around a parameter.

Each test of significance has its own formula for determining degrees of freedom (df).

The value for the df is important in determining whether the results are statistically significant.
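
To illustrate that each test has its own df formula, the sketch below collects the formulas commonly given for four of the tests covered in this chapter. The function names are hypothetical, and the formulas are the standard textbook ones rather than quotations from this slide.

```python
def df_independent_t(n1, n2):
    """t test for independent samples: N1 + N2 - 2."""
    return n1 + n2 - 2

def df_nonindependent_t(n_pairs):
    """t test for nonindependent (matched) samples: number of pairs - 1."""
    return n_pairs - 1

def df_one_way_anova(group_sizes):
    """One-way ANOVA: (between-groups df, within-groups df)."""
    k = len(group_sizes)
    total_n = sum(group_sizes)
    return k - 1, total_n - k

def df_chi_square(n_rows, n_cols):
    """Chi square on a contingency table: (rows - 1) * (columns - 1)."""
    return (n_rows - 1) * (n_cols - 1)

print(df_independent_t(15, 15))        # two groups of 15
print(df_nonindependent_t(15))         # 15 matched pairs
print(df_one_way_anova([10, 10, 10]))  # three groups of 10
print(df_chi_square(2, 3))             # 2 x 3 table
```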

Selecting Among Tests of Significance

The use of a specific significance test is determined by several factors.

Scale of measurement represented by the data (nominal, ordinal, interval, ratio)

Participant selection

Number of groups being compared

Number of independent variables

Significance tests applied incorrectly can lead to incorrect decisions.

Selecting Among Tests of Significance

The first decision in selecting an appropriate test is to determine whether a parametric or nonparametric test will be used.

Selecting Among Tests of Significance

Parametric tests require that the data meet several assumptions.

Variable must be normally distributed

Interval or ratio scale of measurement

Selection of participants is independent

Variance of the comparison groups is equal

Most parametric tests are fairly robust.

If assumptions are seriously violated, nonparametric tests should be used.

Selecting Among Tests of Significance

The t test

The t test is used to determine whether two groups of scores are significantly different from one another.

The t test compares the observed difference between means with the difference expected by chance.

Selecting Among Tests of Significance

The t test for independent samples is a parametric test of significance used to determine if differences exist between the means of two independent samples.

Independent samples are randomly formed.

The assumption is that the means are the same at the outset of the study but there may be differences between the groups after treatment.
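
A minimal sketch of the computation follows, assuming the pooled-variance form of the t test (which relies on the equal-variance assumption). The scores, the group names, and the function name `independent_t` are invented for illustration; the critical value 2.306 is the tabled two-tailed value for α = .05 with 8 degrees of freedom.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Pooled-variance t statistic and df for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    df = n_a + n_b - 2
    # Weighted average of the two sample variances.
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / df
    # Standard error of the difference between the two means.
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return mean_diff / se_diff, df

treatment = [85, 90, 78, 92, 88]  # made-up posttest scores
control = [80, 75, 70, 82, 78]    # made-up posttest scores
t_stat, df = independent_t(treatment, control)
CRITICAL_T = 2.306  # tabled two-tailed value for alpha = .05, df = 8
print(f"t({df}) = {t_stat:.2f}, reject null: {abs(t_stat) > CRITICAL_T}")
```

The observed difference between means (the numerator) is compared with the difference expected by chance (the standard error in the denominator).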

Selecting Among Tests of Significance

The t test for nonindependent samples is a parametric test of significance used to determine if differences exist between the means of two groups that are formed through matching.

When scores are nonindependent, they are systematically related.
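
For matched groups, the test is usually computed on the difference within each pair. The sketch below assumes that standard approach; the scores and the function name `nonindependent_t` are invented, and 2.776 is the tabled two-tailed critical value for α = .05 with 4 degrees of freedom.

```python
import math
import statistics

def nonindependent_t(first_scores, second_scores):
    """t statistic and df for matched pairs, based on within-pair differences."""
    if len(first_scores) != len(second_scores):
        raise ValueError("Each score needs a matched partner")
    diffs = [b - a for a, b in zip(first_scores, second_scores)]
    n_pairs = len(diffs)
    # Standard error of the mean difference.
    se_diff = statistics.stdev(diffs) / math.sqrt(n_pairs)
    return statistics.mean(diffs) / se_diff, n_pairs - 1

group_1 = [10, 12, 9, 14, 11]   # made-up scores, one member of each pair
group_2 = [12, 15, 10, 15, 14]  # made-up scores, the matched partners
t_stat, df = nonindependent_t(group_1, group_2)
CRITICAL_T = 2.776  # tabled two-tailed value for alpha = .05, df = 4
print(f"t({df}) = {t_stat:.2f}, reject null: {abs(t_stat) > CRITICAL_T}")
```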

Selecting Among Tests of Significance

Comparisons of gain or difference scores are generally not tested with a t test.

There are other, better strategies for analyzing such data.

e.g., t test on posttest scores (if there are no differences on pretest scores).

e.g., Analysis of covariance (if there are differences in pretest scores).

Selecting Among Tests of Significance

Analysis of variance

A simple (one-way) analysis of variance (ANOVA) is a parametric test used to determine whether scores from two or more groups are significantly different at a selected probability level.

ANOVA is used to avoid the inflated Type I error rate that results from conducting multiple t tests.
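
The error-rate problem can be sketched with a short calculation. It assumes the separate tests are independent, which pairwise tests on the same groups are not exactly, so the figures are approximations; both function names are hypothetical.

```python
def pairwise_comparisons(n_groups):
    """Number of distinct pairs of groups that could be compared."""
    return n_groups * (n_groups - 1) // 2

def familywise_error_rate(alpha, n_tests):
    """Chance of at least one Type I error across n_tests independent tests."""
    return 1 - (1 - alpha) ** n_tests

for n_groups in (3, 4, 5):
    n_tests = pairwise_comparisons(n_groups)
    rate = familywise_error_rate(0.05, n_tests)
    print(f"{n_groups} groups: {n_tests} t tests, error rate about {rate:.2f}")
```

Even with three groups, the chance of at least one false rejection rises well above the nominal .05.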

Analysis of Variance

An F ratio is computed to determine if sample means are significantly different.

The F ratio is calculated as the variance between groups divided by the variance within groups.

The larger the F ratio, the more likely it is that there are differences among groups.
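
The ratio can be sketched for a one-way design as follows. The three small groups of scores and the function name `f_ratio` are invented for illustration.

```python
import statistics

def f_ratio(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_scores = [score for group in groups for score in group]
    grand_mean = statistics.mean(all_scores)
    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(
        len(group) * (statistics.mean(group) - grand_mean) ** 2
        for group in groups
    )
    # Within-groups sum of squares: scores around their own group mean.
    ss_within = sum(
        (score - statistics.mean(group)) ** 2
        for group in groups
        for score in group
    )
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]  # made-up scores
f_value, df_between, df_within = f_ratio(groups)
print(f"F({df_between}, {df_within}) = {f_value:.2f}")
```

Here the group means differ by much more than the scores vary inside each group, so the ratio is large.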

Analysis of Variance

If an ANOVA shows significant differences among groups, the researcher must then determine where these differences exist.

Multiple comparisons are used to determine where differences between groups are statistically significant.

Comparisons planned before collecting data are referred to as a priori.

Comparisons made after the data are collected are referred to as post hoc.

Analysis of Variance

When a factorial design is used and there are two or more independent variables analyzed, a factorial or multifactor analysis of variance is used to analyze the data.

MANOVA is an analytic procedure used when there is more than one dependent variable and multiple independent variables.

Analysis of Variance

Analysis of covariance (ANCOVA) is a form of ANOVA that allows for control of extraneous variables and is also used as a means of increasing the power of an analysis.

Power is increased in an ANCOVA because the within-group error variance is decreased.

When a study has two or more dependent variables and a covariate, MANCOVA is used.

Multiple Regression

Multiple regression is used to determine the amount of variance accounted for in a dependent variable by interval and ratio level independent variables.

Multiple regression combines variables that are known to predict the criterion variable into an equation.

Stepwise regression allows the researcher to enter one variable at a time.

Multiple Regression

Multiple regression is also the basis for path analysis.

Path analysis begins with a predictive model.

Path analysis determines the degree to which predictor variables interact with each other and contribute to variance in the dependent variables.

Chi Square

Chi square (χ²) is a nonparametric test used to test differences between groups when the data are frequency counts, or percentages or proportions that can be converted into frequencies.

A true category is one in which persons naturally fall.

An artificial category is one that is defined by the researcher.
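
A minimal sketch of the chi square computation on a table of frequency counts follows. The counts and the function name `chi_square` are invented; 3.841 is the tabled critical value for α = .05 with 1 degree of freedom.

```python
def chi_square(observed):
    """Return (chi square statistic, df) for a table of frequency counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Frequency expected if group and category were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (count - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df

# Made-up counts: rows are two groups, columns are two response categories.
observed = [[30, 20], [20, 30]]
statistic, df = chi_square(observed)
CRITICAL_VALUE = 3.841  # tabled value for alpha = .05, df = 1
print(f"chi square({df}) = {statistic:.2f}, reject null: {statistic > CRITICAL_VALUE}")
```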

Other Investigative Techniques

Data mining uses analytical tools to identify and predict patterns in datasets.

Factor analysis is a statistical procedure used to identify relations among variables in a correlation matrix.

Factor analysis is often used to reduce instruments to scales or subscales.

Other Statistical Procedures

Structural Equation Modeling (SEM)

Structural equation modeling is a combination of path analysis and factor analysis.

SEM is a powerful analytic tool.

Standard error of the mean:

$$SE_{\bar{X}} = \frac{SD}{\sqrt{N - 1}}$$