More and More Research

Conceptual Foundations of Statistics

In: Quantitative Research in Education: A Primer

By: Wayne K. Hoy

Pub. Date: 2012

Access Date: May 7, 2019

Publishing Company: SAGE Publications, Inc.

City: Thousand Oaks

Print ISBN: 9781412973267

Online ISBN: 9781452272061

DOI: https://dx.doi.org/10.4135/9781452272061

Print pages: 45-66

© 2010 SAGE Publications, Inc. All Rights Reserved.

Conceptual Foundations of Statistics

In this chapter, we examine the conceptual foundations of statistics. The goal is to give you an appreciation

and conceptual understanding of some basic statistical tests used in educational research. As we suggested

in the first chapter, statistics are tools that empirical researchers use for analysis of quantitative research.

Statistical tools are useful means and not ends in themselves. We focus on conceptual understanding and

not on the technical details of computing the statistics, which is most often done in statistics courses or by

using statistical packages such as SPSS or SAS. We begin with a review of some basic descriptive statistics

and then move to the conceptual underpinnings of inferential statistics, which are used to test research

hypotheses. Read the text with a pencil in hand; check the simple calculations.

Measures of Central Tendencies

There are three common measures of central tendency: mean, mode, and median. The mean is the most

widely known statistic; it is the average of a set of numbers or scores. Most students compute their average

test scores in a course without difficulty, and they understand what it means—it is their typical test score. The

arithmetic average of some set of numbers in statistics is called the mean. Summing all the scores in the

set and then dividing the sum by the number of scores is the calculation of the mean. Consider the set of

numbers (1, 2, 2, 3, 4, 6). The mean is calculated as follows:

Mean = Sum of the scores (Σ (scores)) divided by N (number of scores)

or

Mean = Σ (scores) / N = (1 + 2 + 2 + 3 + 4 + 6) / 6 = 18 / 6 = 3

The mean or average of the set is 3, which represents a typical score for this set of data points. If the scores

are reasonably consistent—that is, they don't vary wildly—then the mean is a good indication of the central

tendency. If there are a few extreme scores, however, the mean can be distorted. Consider the set of numbers

(1, 1, 1, 7, 1, 7). In this case, the mean is still 3, but it is not really typical. A few large and extreme numbers

can distort the mean, and therein lies the possible rub of using the mean to describe a set of scores as typical.

For example, in the previous set of numbers (1, 1, 1, 7, 1, 7), 1 is clearly more typical than 3.

The mode is the most frequent number in a set of scores. In the above set (1, 1, 1, 7, 1, 7), the mode is 1,

the most frequent number in the distribution, and in this case, it is a good standard to describe the typical

score of this set of numbers. But again, just as with the mean, the mode can be misleading. For example,

suppose you give a test to 30 students and most students score close to 88; in fact, when you compute the

mean you get 88. Yet there were five people who got 100 and only three who actually scored 88. Which is

the better measure of central tendency, the mean (88) or the mode (100)? Clearly, the mean is more typical

of the distribution of scores.

The median is the middle score of a distribution of numbers. To compute the median, do the following:

Rank the numbers or scores from low to high.

Find the middle number or score:

• If there is an odd number of scores, for example, 11 numbers in the set, simply add 1 to

the total number and divide by 2; the resulting number represents how far to go to find

the median. Consider the numbers in the set (1, 2, 2, 2, 3, 5, 6, 7, 7, 8, 12). Since the

set has 11 numbers (an odd number), add 1 to 11 and divide by 2: 12/2 = 6. The sixth

number in the set is 5, and it is the median or middle score.

• But if there is an even number of scores, simply average the two middle scores. For

example, consider the set (1, 2, 2, 2, 4, 5, 6, 7, 7, 8), which has 10 scores in the

distribution. You simply add the fifth and sixth scores and divide by 2; hence, in this

example the median is (4 + 5)/2 = 4.5. The median is the middle score, which is 4.5 in

this case.

The median is the middle score in the distribution of ranked numbers; it is the point at which half the numbers

are larger and half are smaller. When there are a few very high or very low scores, however, the median or

mode may represent the central tendency better than the mean does.

In sum, the mean, mode, and median are the three most common measures of central tendency; they are

indicators of how typical a given score is in a distribution of numbers, but none of these indicators gives you

a sense of how the scores are distributed, that is, how much variability there is in the set of numbers.
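
The three measures can be checked with a few lines of Python. This is a minimal sketch using only the standard library `statistics` module; the variable names are illustrative, and the number sets are the ones used in the text above.

```python
from statistics import mean, median, mode

# Number sets taken from the examples in the text.
balanced = [1, 2, 2, 3, 4, 6]
skewed = [1, 1, 1, 7, 1, 7]
odd_set = [1, 2, 2, 2, 3, 5, 6, 7, 7, 8, 12]   # 11 scores
even_set = [1, 2, 2, 2, 4, 5, 6, 7, 7, 8]      # 10 scores

print(mean(balanced))    # 3
print(mean(skewed))      # 3, even though 1 is the more typical score
print(mode(skewed))      # 1
print(median(odd_set))   # 5, the sixth of 11 ranked scores
print(median(even_set))  # 4.5, the average of the fifth and sixth scores
```

Note that `median` ranks the scores itself, so the input does not have to be sorted first.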

Measures of Variability

Let's now turn our attention to how much variability there is in a set of numbers. How are the scores

distributed? How much do they vary? We consider three measures of variability: the range, the average

deviation, and the standard deviation (SD).

The range is the difference between the highest and lowest scores in a set of numbers, but it is also given as

the span of scores beginning with the lowest score and ending with the highest score, as in the range of 89 to

144 (or, alternately, the range is 55). The range is direct and simple but a little crude because it only describes

in broad strokes the limits of the scores; it does not tell us what is happening in between the extremes.

The average deviation from the mean is just what the phrase suggests: We find the mean, then find the

deviation from the mean for each number (subtract the mean from the number), and then average all the

deviations to get a typical departure of the scores from the mean. Conceptually, that makes sense, but

unfortunately, we always get the same average deviation because the deviations above the mean are exactly

offset by the deviations below the mean; consequently, when you add the deviations you always get 0. Thus, the

average deviation is always 0 and not useful. Take an example. Consider the set of numbers (1, 2, 3, 4, 5, 3,

3). The mean is 21/7 = 3. The deviations from the mean are −2, −1, 0, 1, 2, 0, and 0, respectively, and the

sum is therefore 0. Zero divided by 7 is 0. Zero is always the average deviation from the mean because the

positive deviations cancel the negative deviations exactly, and 0 divided by

any number is 0. Try it yourself with a small set of numbers. Why bother with the average deviation from the

mean? Only to help you understand the concept of a standard deviation from the mean.
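
If you would rather let a computer do the "try it yourself" step, the following short Python sketch repeats the calculation for the set used above. The variable names are illustrative only.

```python
# The set of numbers from the text.
scores = [1, 2, 3, 4, 5, 3, 3]

mean_score = sum(scores) / len(scores)            # 21 / 7 = 3.0
deviations = [s - mean_score for s in scores]     # score minus mean
average_deviation = sum(deviations) / len(scores)

print(deviations)          # [-2.0, -1.0, 0.0, 1.0, 2.0, 0.0, 0.0]
print(average_deviation)   # 0.0
```

With other sets of numbers, floating-point rounding may give a value that is extremely close to 0 rather than exactly 0.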

The standard deviation from the mean is the extent to which scores vary from the mean—the typical deviation

from the mean for a set of scores. The standard deviation is conceptually similar to the average deviation, but

it is more useful because it is not always 0, and it has some interesting mathematical and statistical properties,

which we will discuss later. Remember, the standard deviation is always from the mean; the mean is the point

of reference. How much are the scores deviating from the mean? What is the typical or standard deviation of

scores from the mean? Let's consider the same set of numbers as before (1, 2, 3, 4, 5, 3, 3), and illustrate

the computation of its standard deviation.

• Compute the mean as we did above; it better still be 3, but check it.

• Compute the deviations from the mean; subtract the mean from each score.

• Square each deviation from the mean; check these computations below:

Deviation From the Mean (Score − Mean)     Deviation From the Mean Squared (Score − Mean)²

(1 − 3) = −2                               (1 − 3)² = (−2)² = 4

(2 − 3) = −1                               (2 − 3)² = (−1)² = 1

(3 − 3) = 0                                (3 − 3)² = 0² = 0

(4 − 3) = 1                                (4 − 3)² = 1² = 1

(5 − 3) = 2                                (5 − 3)² = 2² = 4

(3 − 3) = 0                                (3 − 3)² = 0² = 0

(3 − 3) = 0                                (3 − 3)² = 0² = 0

• Sum the squared deviations:

Σ (Score − Mean)² = (4 + 1 + 0 + 1 + 4 + 0 + 0) = 10.

• Divide the sum of squared deviations by the number of scores:

Σ (Score − Mean)² / 7 = 10 / 7 = 1.43.

• Take the square root of the quotient to obtain the standard deviation: Square root of 1.43 = 1.196.

Hence, the standard deviation of this set of numbers is 1.196, and the formula is

SD = √[ Σ (Score − Mean)² / N ]

One small note: Statisticians use the shorthand expression sum of squares to refer to the sum of the

deviations from the mean squared, which often confuses students. So remember that you square all the

deviations from the mean and then calculate the sum to get the sum of squares; then you divide by the

number of scores and take the square root of this quotient to get the standard deviation. Now you have

the formula for computing the standard deviation, but it is just as important to know what standard deviation

means—the extent to which your set of scores vary from the mean—the larger the standard deviation, the

more widely the scores vary from the mean (see Figure 3.1); when the standard deviation is small, the

variability is also small.
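
The steps for the standard deviation can be sketched in Python as follows. This is only an illustration with made-up variable names; `pstdev` from the standard library divides by the number of scores, as the steps above do, and is used here as a cross-check.

```python
import math
from statistics import pstdev

scores = [1, 2, 3, 4, 5, 3, 3]

mean_score = sum(scores) / len(scores)                     # 3.0
squared_deviations = [(s - mean_score) ** 2 for s in scores]
sum_of_squares = sum(squared_deviations)                   # 10.0
variance = sum_of_squares / len(scores)                    # 10 / 7
sd = math.sqrt(variance)

print(round(variance, 2))   # 1.43
print(round(sd, 3))         # 1.195
print(math.isclose(sd, pstdev(scores)))   # True
```

The unrounded result is about 1.195; the 1.196 in the text comes from rounding the quotient to 1.43 before taking the square root.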

Knowing the mean and the standard deviation of a group of scores gives you a better understanding of an

individual score. For example, suppose you received a score of 79 on a test. You would be pleased with the

score if the mean of the test were 70 and the SD were 4 because your score would be a little more than 2

SDs above the mean, a score well above average.

Consider the difference if the mean of the test had remained at 70 but the SD had been 16. In this case, your

score of 79 would be less than 1 SD from the mean. You would be much closer to the middle of the group,

with a score slightly above average but not high. Knowing the standard deviation tells you much more than

simply knowing the range of scores. No matter how the majority scored on the test, one or two students may

do very well or very poorly and thus make the range very large.

Figure 3.1 The Normal Distribution

Normal Distribution

Standard deviations are especially useful if the distribution of scores is normal. You have heard of a normal

distribution before; it is the bell-shaped curve that describes many naturally occurring physical and social

phenomena such as height and intelligence. Most scores in a normal distribution fall toward the middle, with

fewer and fewer scores toward the ends, or the tails, of the distribution. The mean of a normal distribution is

also its midpoint. Half the scores are above the mean, and half are below it. Furthermore, the mean, median,

and mode are identical in a normal distribution.

When the distribution of scores is normal, the percentage of scores falling within each area of the curve is

known, as you can see in Figure 3.1. Scores have a tendency toward the middle or mean. In fact, 68% of all

scores are located in the area from 1 SD below to 1 SD above the mean. About 16% of the scores are beyond

1 SD above the mean, and only about 2% of all scores are more than 2 SDs above the mean. Similarly, only

about 16% of the scores are beyond 1 SD below the mean, and only about 2% of all scores are beyond 2

SDs below the mean.
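
These percentages can be verified with the standard library class `statistics.NormalDist`. The sketch below assumes a standard normal curve (mean 0, SD 1), so the cut points 1 and 2 are distances from the mean in SD units; the variable names are illustrative.

```python
from statistics import NormalDist

curve = NormalDist()   # standard normal: mean 0, SD 1

within_1_sd = curve.cdf(1) - curve.cdf(-1)   # area from -1 SD to +1 SD
beyond_1_sd_above = 1 - curve.cdf(1)         # area above +1 SD
beyond_2_sd_above = 1 - curve.cdf(2)         # area above +2 SD

print(round(within_1_sd, 2))         # 0.68
print(round(beyond_1_sd_above, 2))   # 0.16
print(round(beyond_2_sd_above, 2))   # 0.02
```

Because the curve is symmetric, the areas below −1 SD and below −2 SD are the same as the two upper-tail values.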

Standard scores are based on the standard deviation. A z score is a standard score that tells how many

standard deviations above or below the mean a score falls. In the example described earlier, in which you

were fortunate enough to get a 79 on a test where the mean was 70 and the SD was 4, your z score would

have been greater than 2 SDs above the mean (actually 2.25 SDs above the mean), which means that your

score is higher than 98% of those who took the test. To determine your place in a normal distribution, you

need to convert your raw score to a standard score, which is a simple process—simply subtract the mean

from the raw score and divide the difference by the standard deviation. The formula is

z = (Raw score − Mean) / SD
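
The conversion just described can be sketched in a few lines of Python. The function name `z_score` is made up for this illustration, and the percentile step assumes that the test scores are normally distributed.

```python
from statistics import NormalDist

def z_score(raw, mean, sd):
    """Number of SDs that a raw score lies above (+) or below (-) the mean."""
    return (raw - mean) / sd

z_small_sd = z_score(79, 70, 4)    # the test with an SD of 4
z_large_sd = z_score(79, 70, 16)   # the test with an SD of 16

# Proportion of scores below z, assuming a normal distribution.
proportion_below = NormalDist().cdf(z_small_sd)

print(z_small_sd)                   # 2.25
print(z_large_sd)                   # 0.5625
print(round(proportion_below, 2))   # 0.99
```

The same raw score of 79 is well above average when the SD is 4 but only a little above average when the SD is 16.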

Populations and Samples

So far, all our statistics have described properties of populations. The population, or universe, contains all

the elements of the set. If you have all the elements of the set you are studying, for example, all the scores

for all students in your class, then you have the results for that universe. You can compute the exact or actual

mean, mode, median, range, and standard deviation for the population; there is no need to estimate.

For the most part, however, researchers are interested in samples of a population. A sample is a subgroup

of the population. If we want to generalize about the third-grade students in the country, the population is

all third-grade students in America. It is impractical, if not impossible, to get information on all such students,

so researchers limit the population to third-grade students in a state. Even this population may be too large

for practical purposes, so we take a subgroup of these students as a sample. We would like to get a

representative sample so that our conclusions are general to the population.

We need to add a few more refinements to our definitions because we will usually be directly concerned

with samples rather than populations. Statistics are the characteristics of samples. Parameters are the

characteristics of populations. That is, the measures of central tendency (mean, mode, median) and indicators

of variability (range and standard deviation) of a population are parameters, which are estimated from sample statistics. One

formula, the standard deviation, needs to be altered slightly to get a better estimate of the actual standard

deviation of the population. In other words, when using a sample to estimate the standard deviation of a

population, divide by n − 1 (number in the sample minus 1). This revised formula yields a better estimate

of the standard deviation for the population; this slightly altered way of calculating the variance is called the

mean square and has other mathematical and statistical properties that make it useful. Thus, the standard

deviation for a sample is best defined as

SD = √[ Σ (Score − Mean)² / (n − 1) ]

Thus far in our analysis, we have used the standard deviation as a measure of the variability. A related

concept that is more useful in statistics is the variance of a set of scores. The variance of a sample is its

standard deviation squared.
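
To see the effect of dividing by n − 1, the sketch below treats the earlier set of seven numbers as if it were a sample. It uses the Python standard library, where `pstdev` divides by n and `stdev` and `variance` divide by n − 1; the variable names are illustrative.

```python
import math
from statistics import pstdev, stdev, variance

sample = [1, 2, 3, 4, 5, 3, 3]   # sum of squares is 10, n is 7

sd_dividing_by_n = pstdev(sample)       # square root of 10 / 7
sd_dividing_by_n_minus_1 = stdev(sample)   # square root of 10 / 6
sample_variance = variance(sample)      # 10 / 6

print(round(sd_dividing_by_n, 3))           # 1.195
print(round(sd_dividing_by_n_minus_1, 3))   # 1.291
print(round(sample_variance, 3))            # 1.667
print(math.isclose(sd_dividing_by_n_minus_1 ** 2, sample_variance))   # True
```

Dividing by n − 1 always gives a slightly larger value than dividing by n, and the difference shrinks as the sample grows.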

The variance and the mean are the two key concepts used in most statistical analyses. Both are summaries

of a set of scores; the mean is a measure of central tendency and the variance a measure of variability. We

started our discussion of variability with the standard deviation because we assumed that it was more familiar,

but now the related concept of variance becomes our chief index of variability.

Much of statistical analysis is explaining the variance in the dependent variable. Does the independent

variable cause the dependent variable to vary or lean in a certain direction? That is the key problem of

inferential statistics. We ask the question “Were the results of my study a consequence of the independent

variable, or were they a result of chance?” In other words, we measure our actual findings against the chance

model. We attempt to eliminate chance as an explanation of our results in order to buttress the argument

that our independent variable, not chance, made the difference. To reiterate the central thesis of inferential

statistics, “Did the results occur by chance, or are they a function of our independent variable?” Statistics

and probability help us answer this basic question. This is not a book on statistics, but a basic conceptual

understanding of statistical tests is essential if we are to grasp the nature and meaning of quantitative

research.

Statistical Tests

One more time, the basic statistical question: In my research, is what I found significantly different from what

I would expect to find by chance? What you as a researcher need to do is to compare your actual results

with the chance model. Do the results vary enough from chance to conclude that something else is causing

the variance or variability in the dependent variable? Statistics provide critical ratios such as the t ratio, the

F ratio, or chi square, which enable us to answer the chance question with confidence (see “Elements of a

Proposal,” Appendix A).

t Test

All critical ratios work the same way, and we will illustrate a few so that you understand what is happening

and why. The t test is a good place to begin because it is a clear, straightforward statistical application. If we

are doing a study in which the independent variable has only two categories and the dependent variable is

continuous, then the appropriate statistic is a t test.

For example, suppose we want to know if college men and women are significantly different with respect to

liberal attitudes toward premarital sex. Note that the population is all students at College A. Assume that we

select a representative sample of men and women from College A, and we have all students in the sample

respond to a reliable and valid scale measuring their attitudes. Assume further that the higher the score on

the scale, the more liberal their attitudes. How would we test the results of our little research problem?

First, we would divide the sample into two groups, male and female; the independent variable has only two

categories. Then we would compute the mean scores for men and for women on the dependent variable,

liberal attitudes toward premarital sex. Finally, we would ask whether the means for the men and women were

significantly different. The t test is an appropriate statistical procedure when the independent variable has two

and only two categories and the dependent variable is continuous.

Here is how the t test works. To assess whether there is a significant difference (one not explained by

the chance model), we would compare what we found—the actual difference in scores between men and

women—with the difference expected by chance. The ratio between the actual difference and the difference

due to chance is a t ratio. A t test is defined as

t = Actual difference between the means / Difference expected by chance

The larger the ratio, the greater the probability that the difference is not a function of chance. If the t value is

1, what does that mean? The actual difference between the means is exactly what we would expect if nothing

but chance were working; chance is explaining this relationship. But if the t value were 2, it is more likely that

something other than chance is operating.

Let's continue a little further without getting bogged down in statistical calculations. The general formula for a

t test is as follows:

t = (Mean of Group 1 − Mean of Group 2) / Standard error of the difference between the means

There are several important aspects of this general formula:

• We are examining the actual difference between the means of the two groups.

• We are comparing the actual difference with what is expected by chance.

• Statisticians can determine what is expected by chance by computing the standard error of the

difference between the two means.

• A t ratio is computed that indicates the extent to which the results depart from the chance model:

The greater the t value, the greater the likelihood that chance is not explaining the relationship.

Fortunately for us, using any one of a number of statistical packages, the computer will calculate the standard

error of the difference between the means as well as the t ratio and its level of significance (p value).

A p value is a probability level that indicates the level of significance, that is, the probability of obtaining results

at least this extreme if chance alone were operating. When you read research publications, you find statements such as (t = 2.62, p < .01).

This means that a t test produced a t ratio of 2.62, which was significant beyond the .01 level of significance

(p < .01); hence, we can be quite confident that the chance model does not explain the relationship. By

convention, most researchers accept a relation as statistically significant if the p value is equal to or less than

.05. What that means is that a relation this strong would occur by chance alone only 5 times or fewer out of 100.

Let's return to the question of whether men and women at College A have different attitudes toward premarital

sex.

• First, we add up all the scores for the men and divide by the number of men (mean score for men)

and do the same for the women (mean score for women).

• Next, we subtract the scores (mean score of men minus mean score of women).

• Then we compute the standard error of the difference between the means of men and women (the

difference we would expect to get by chance).

• Finally, we compare the two by computing a t ratio (actual difference divided by the standard error of

the difference).

Fortunately, our laptop and SPSS computer program will do all this as quickly as we can hit the Execute

button. The results will include the t value and give us its level of significance.
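
For readers who want to see the arithmetic behind the four steps, here is a minimal Python sketch. The attitude scores are hypothetical, invented only for illustration. The chapter does not give a formula for the standard error of the difference, so the sketch assumes one common choice: the square root of the sum of each group's sample variance divided by its group size. The function name `t_ratio` is also made up.

```python
import math
from statistics import mean, variance

def t_ratio(group_a, group_b):
    """Actual difference between two means divided by its standard error."""
    actual_difference = mean(group_a) - mean(group_b)
    # Assumed formula (not from the chapter): unpooled standard error.
    standard_error = math.sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return actual_difference / standard_error

# Hypothetical attitude scores, five per group.
men = [12, 15, 11, 14, 13]      # mean 13
women = [16, 18, 15, 17, 19]    # mean 17

t = t_ratio(men, women)
print(t)   # -4.0
```

The sign only shows which group had the higher mean; the size of the ratio is what is compared with the chance model. A statistical package would also report the p value, which this sketch does not compute.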

What would it mean in our research project if we obtained the following: (t = 1.02, p > .05)? The answer is

that a t value of 1.02 is not statistically significant. We can tell this just by looking at the t value because a value of 1

is exactly what chance alone would be expected to produce. The p > .05 indicates that more than 5 times out of 100,

chance alone would produce a difference this large. Hence, in College A, we cannot rule out chance, and we conclude that there is

no significant difference between men and women in their attitudes toward premarital sex.

F Test

The independent variable is not always a dichotomous variable, one with only two categories. Sometimes

the independent variable has more than two categories. If so, we cannot use the t test. We need a more

general test that does essentially the same thing, that is, produces a critical ratio to check the departure from

the chance model. In the case with more than two categories in the independent variable and a continuous

dependent variable, the more general F ratio will provide our answer. An F test is done using a statistical

procedure called analysis of variance (ANOVA). There are a variety of ANOVAs, but we will focus on the

least complex; however, conceptually all ANOVAs are similar in that one or more F values are computed to

answer the question of the deviation from the chance model.

Let's illustrate a simple one-way ANOVA with an example. Suppose I want to test the effectiveness of three

teaching approaches with graduate students in education—teacher directed, student directed, and shared.

I am teaching a large group of 90 students in an introductory course in education, about a third of all the

beginning graduate education students. Does my teaching approach make any difference in the mastery of

key concepts in education? Assume that I can divide the group into three similar subgroups; probably the best

way to do this would be to assign the students to the groups at random. Assume further that the 90 students

are representative of all beginning graduate education students at my university.

What is my independent variable in the research problem? How many independent variables do I have?

Three? No, actually I have only one independent variable (teaching approach) with three variations or

categories (teacher-directed, student-directed, and shared approaches). The independent variable is a

manipulated categorical variable. I, the researcher, will manipulate the variable by teaching each group in one

of three ways. What is the dependent variable? I am interested in mastery of basic education concepts, and

I have a final exam that I have developed over the years that is reliable and valid; that is, it taps the content

that I am interested in having students master in a consistent manner. The dependent variable is measured

by my test and is continuous: The higher the test score, the greater the level of mastery of basic concepts.

The F test is an appropriate statistical procedure when the independent variable has two or more categories

and the dependent variable is continuous.

Here is how an F test is computed using ANOVA. At the end of the term, I would compute the mean score on

mastery for each of the three groups. Almost certainly, there will be differences in the means, but the question

is essentially the same here as it was for the t test: Is there a significant difference among the three mean

scores? I would proceed by doing the following:

• First, compute the mean for each of the three groups on the mastery exam.

• Next, calculate the total variance for the entire sample. That is, combine all three groups into one,

and compute the overall mean for the entire 90 students. To compute the variance for the entire

group, which is called the total variance (Vt), use the following formula described earlier:

Vt = Σ (Score − Overall mean)² / (n − 1)

• Now compute the variance between the groups. To do this, we treat each of the three means for the

groups as data points and use our variance formula. The between-group variance is the variance

caused by the independent variable; it is also called systematic or experimental variance.

• The variance due to error is commonly called the within-group variance; it is also called error

variance. This computation is a little more difficult to explain, but conceptually it is the variance “left

over” from the total variance after the between-groups or experimental variance is removed from the

total variance. The within-group variance is a measure of chance variation.

• Finally, calculate the F ratio, which is the variance produced by the independent variable divided by

the variance due to chance.
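
The steps above can be sketched in Python. The chapter describes the computation only conceptually, so this sketch assumes the standard sums-of-squares version of a one-way ANOVA found in statistics texts: the between-group mean square divided by the within-group mean square. The exam scores are hypothetical and the function name `f_ratio` is made up for illustration.

```python
from statistics import mean

def f_ratio(groups):
    """One-way ANOVA F ratio: between-group variance over within-group variance."""
    all_scores = [score for group in groups for score in group]
    overall_mean = mean(all_scores)
    k = len(groups)        # number of groups
    n = len(all_scores)    # total number of scores

    # Between-group sum of squares: group means around the overall mean,
    # weighted by group size.
    ss_between = sum(len(g) * (mean(g) - overall_mean) ** 2 for g in groups)
    # Within-group sum of squares: scores around their own group mean.
    ss_within = sum((score - mean(g)) ** 2 for g in groups for score in g)

    ms_between = ss_between / (k - 1)   # systematic (experimental) variance
    ms_within = ss_within / (n - k)     # error (chance) variance
    return ms_between / ms_within

# Hypothetical mastery scores for three small groups.
teacher_directed = [1, 2, 3]
student_directed = [2, 3, 4]
shared = [6, 7, 8]

f = f_ratio([teacher_directed, student_directed, shared])
print(f)   # 21.0
```

In this made-up example the between-group and within-group sums of squares (42 and 6) add up to the total sum of squares (48), which is the "left over" idea described above.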

We have come a long way to show that the F ratio using ANOVA is essentially the same as a t ratio in that

both compare actual findings in relation to chance and yield an index and a probability level to enable us to

make confident judgments about the nature of our relationships. A significant F ratio in this kind of problem

simply means that there is a significant difference among the three groups. To find which pairs of means are

different, we must do some further post hoc analyses, which can be found in any good statistics book. But the

idea is the same: Compare your actual results with what you would expect by chance.

Chi-Square Test

Sometimes both the independent and the dependent variables are categorical. If so, we need another

statistical tool to compute the critical ratio for such situations, called the chi-square (χ²) test. Suppose you

want to examine the relationship between gender and graduation. Is the gender of the students related

to whether or not one graduates? What are the independent and dependent variables of this research

problem? What kind of variable is each in terms of measurement? Gender is the independent variable; it is

the presumed cause and has two variations or categories: male and female. Graduation is the dependent

variable, and it also has two categories: graduate and no graduate. One might also consider graduation

as a continuous variable, that is, graduation rate, but to illustrate the chi-square, we cast graduation as a

dichotomous variable.

We decide to go back to the freshman class of 4 years ago and see how many men and women graduated at

the end of 4 years. We select a random sample of 100: 50 men and 50 women. We summarize the results of

our research in a 2 × 2 cross break or contingency table (see Table 3.1).

As we examine the results in the 2 × 2 table, we see that in our sample, women may be more likely to

graduate than men, but what is the likelihood that the results can be explained by chance? In other words, we

need to compare what we found in this analysis with what we would expect to find by chance. Do the results

here represent a major departure from the chance model? We need a critical ratio. The chi-square test is the

appropriate statistic when both variables are categorical. The chi-square is a test of frequency counts.

Table 3.1 Summary of the Results of our Analysis

What do the numbers in the cells of our 2 × 2 table represent? Yes, frequencies—the number of students in

each cell. The chi-square is an index of the actual results compared with those expected by chance. Examine

the formula for chi-square:

χ² = Σ [ (fo − fe)² / fe ]

Now we will use the formula and the results we obtained and summarized in our 2 × 2 cross break. The

chance model would predict 25 students in each cell; that is, the expected frequency for each cell (fe) is 25.

Now compare the expected with the actual for each cell by subtracting the expected frequency (fe) from the

observed frequency (fo), squaring the difference, and then dividing the squared difference by the expected frequency

(fe). Let's do the computations for each cell and sum them as the formula instructs.
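
The cell-by-cell computation can be sketched in Python. The observed counts below are an assumption: they are not copied from Table 3.1 but inferred from what the text states (50 men and 50 women, an expected frequency of 25 in every cell, a chi-square of 16, and women graduating more often than men). The variable names are illustrative.

```python
# Assumed observed frequencies (fo), inferred from the text rather than
# copied from Table 3.1.
observed = {
    ("men", "graduate"): 15,
    ("men", "not graduate"): 35,
    ("women", "graduate"): 35,
    ("women", "not graduate"): 15,
}
expected = 25   # expected frequency (fe) for every cell under the chance model

# For each cell: (fo - fe) squared, divided by fe; then sum over the cells.
cell_values = {cell: (fo - expected) ** 2 / expected for cell, fo in observed.items()}
chi_square = sum(cell_values.values())

print(cell_values[("men", "graduate")])   # 4.0
print(chi_square)                         # 16.0

# If only chance were working, every cell would hold 25 students.
chance_only = sum((25 - expected) ** 2 / expected for _ in observed)
print(chance_only)                        # 0.0
```

Each of the four cells departs from the expected 25 by 10 students, so each contributes 100 / 25 = 4 to the total.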

The χ² index is 16. What would the chi-square have been if only chance were working? Each cell would have

had the number 25, and χ² would have been 0. Run the numbers, and make sure you see why the answer is

0. Thus, our index of departure from chance in this example is 16. Using a computer program, we would have

found (χ² = 16, p < .01). The results show that a χ² of 16 is statistically significant beyond the .01 level of

significance; that is, these results would occur by chance less than 1 time out of 100. Our conclusion would be

that women are more likely to graduate from the college than are men. Note that, as always, our conclusion is

probabilistic, not certain. The point of this exercise is to demonstrate the meaning of yet another critical ratio,

one that works when both variables are categorical.

Effect Size

The three tests that we have examined thus far—the t test, the F test, and chi-square—are statistics that

help us answer the basic statistical question: Is what I found in my analysis significantly different from what I

would expect to find by chance? None of these statistics, however, tells us anything about the magnitude of

the relation. Increasingly, researchers want to know the strength of the relation, that is, its effect size. The

magnitude of the independent variable's effect on the dependent variable is the effect size. Suffice it to say

that when using t tests, analysis of variance, or chi-square analysis, we must do additional computations to

determine effect size. For example, a contingency coefficient and an omega-squared (Hays, 1994; Kerlinger

& Lee, 2000) are relatively straightforward computations that will tell us the magnitude of the effect size. The

point here is that the F and t values and chi-square tell us if there is a statistically significant relationship, but

they do not indicate the magnitude of the relation; other indices are needed.
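As an illustration of one such additional computation, Pearson's contingency coefficient is commonly defined as C = sqrt(χ² / (χ² + N)), where N is the total number of cases. The sketch below applies it to the chi-square example, taking χ² = 16 from the text and assuming N = 100 (four cells with an expected frequency of 25 each). The function name is hypothetical.

```python
import math


def contingency_coefficient(chi2, n):
    """Pearson's contingency coefficient for a chi-square value and sample size."""
    return math.sqrt(chi2 / (chi2 + n))


# chi-square of 16 from the example; N = 100 is inferred from the
# expected frequency of 25 in each of the four cells.
c = contingency_coefficient(16, 100)
print(round(c, 3))  # 0.371
```

The chi-square value says the relation is unlikely to be a chance one; an index such as C gives a rough sense of how strong it is.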

We turn next to coefficients of correlation, which not only answer the question of statistical significance

but also indicate the magnitude of the relationship between the independent and dependent variable—the

proportion of variance in the dependent variable explained by the independent variable. Correlation

coefficients, unlike the statistics explored thus far, answer both the statistical significance and the effect size

questions.

Linear Regression and Coefficient of Correlation

What if both the independent and the dependent variables are continuous? We need another statistic: A

coefficient of correlation (r) will give us the answer to whether the relation is likely a chance one or not. But

another useful feature of the correlation is that we can use it to test not only the departure from the chance

model but also the strength of the relation. A coefficient of correlation is a number that indicates the magnitude

of the relation between two continuous variables such that the higher the absolute value of the correlation, the

stronger the relation. Correlations range in value from −1 to +1. If the two variables vary together, they have

a positive correlation, which means that as the value of one increases, so does the other. If the correlation

is negative, then as the independent variable changes, the dependent variable changes in the opposite

direction. Which is stronger, a correlation of +1 or one of −1? Neither. Both are perfect correlations; both are

as high as they can get but in opposite directions. The sign of the correlation represents the direction of the

relation and has nothing to do with its strength. So r = −.85 is a stronger correlation than r = +.41 because the

sign merely indicates whether the variables are varying in the same or opposite direction.

The calculations of coefficients of correlations are a little more tedious and not as self-evident as the other

statistics that we have discussed, so we will not spend much time with the formulas and computations.

Instead, we will illustrate the correlations with some graphs. Correlations describe linear relations, which are

straight lines when graphed. The relation between two variables, x and y, is a set of ordered pairs. That

means that for every value of x there is one corresponding value of y. We can express the pairs of values in

set notation, or we can simply express them in a table or graph or both. Consider the relations between three

sets of ordered pairs (relations) as expressed in Table 3.2.

The first set of ordered pairs has a correlation coefficient of +1; the numbers vary together. For each change

in the independent variable x, there is a corresponding change in the dependent variable y of the same

magnitude and direction. In the second set of ordered pairs, sometimes a change in x produces a positive

change in y and sometimes a negative change; there is no systematic pattern in the relation; there is no

relation (r = 0). Finally, in the third set, for each change in x there is a corresponding change in y of the same

magnitude except in the opposite direction; we have a perfect negative correlation (r = −1); x and y vary

together in opposite directions. In brief, the correlation coefficient provides an index of the extent to which the

two variables vary together and the direction of the variation.
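The three patterns just described can be checked with a short Python sketch. The ordered pairs below are invented for illustration and are not the values from Table 3.2; the helper `pearson_r` is a hypothetical name for the usual product-moment formula.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)


x = [1, 2, 3, 4, 5]

# y rises by 1 whenever x rises by 1: perfect positive correlation.
r_positive = pearson_r(x, [1, 2, 3, 4, 5])
# y sometimes rises and sometimes falls: no systematic pattern.
r_none = pearson_r(x, [2, 4, 3, 4, 2])
# y falls by 1 whenever x rises by 1: perfect negative correlation.
r_negative = pearson_r(x, [5, 4, 3, 2, 1])

print(r_positive, r_none, r_negative)  # 1.0 0.0 -1.0
```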

Table 3.2 Correlations for Three Sets of Numbers

A computer program will provide you with correlation coefficients and levels of significance. Consider the

statement (r = −.52, p < .01). The correlation is negative: As x increases, y decreases. The relation is

statistically significant; that is, chance is unlikely to explain the relationship; in fact, a correlation this strong

would arise by chance fewer than 1 time in 100. The correlation coefficient also suggests how strong the relation

is between the two variables. Square the coefficient of correlation and multiply it by 100, and you have an

estimate of the percentage of the variance in the dependent variable (y) explained by the independent variable

(its effect size). For example, if r = .50, then the independent variable x explains 25% of the variance in y. If r =

0, then none of the variance in y is explained by x. If r = −.83, then about 69% of the variance in y is explained

by x. An important point: What scientists try to do with their research and statistics is to identify independent

variables that explain the variance in the dependent variable. Explaining variance in a dependent variable is

an important goal of scientific research.
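The squaring rule is simple enough to verify directly. The following sketch reruns the three examples from the paragraph above; the function name is hypothetical.

```python
def percent_variance_explained(r):
    """Square the correlation and multiply by 100."""
    return r ** 2 * 100


for r in (0.50, 0.0, -0.83):
    print(r, round(percent_variance_explained(r), 1))
# 0.5 25.0
# 0.0 0.0
# -0.83 68.9
```

Because the coefficient is squared, the sign drops out: r = −.83 and r = +.83 explain the same share of the variance.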

A final observation about a correlation coefficient—it is mathematically the coefficient of x in the formula for a

straight line, as expressed by the following equation:

y = mx + b

Think of the set of ordered pairs that represents the relation between the independent variable, x, and the

dependent variable, y, as a graph of a line that passes through those points such that the line represents

the best fit for all the points; mathematically, that means the sum of the squared vertical distances from the points to the

line (sometimes called a regression line) would be as small as possible. If we standardize x and y, then

the coefficient of x is the correlation coefficient for the regression line for the relation of x and y. In sum,

a correlation coefficient for a relation in which both variables have been standardized is the slope of its

regression line. The regression line for two variables will take the form of y = mx + b, where m is the slope

of the line (and, for standardized data, the correlation coefficient) and b is the y-intercept. Perhaps we are

getting a little too technical, so let's move on.
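For readers who want to see the claim in numbers, here is a minimal sketch, assuming a small set of invented ordered pairs and an ordinary least-squares fit. It fits the line y = mx + b twice, once to the raw scores and once to the standardized scores, and compares the second slope with r. All function names are hypothetical.

```python
import math


def mean(values):
    return sum(values) / len(values)


def standardize(values):
    """Convert raw values to z scores (mean 0, standard deviation 1)."""
    m = mean(values)
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / len(values))
    return [(v - m) / sd for v in values]


def fit_line(xs, ys):
    """Least-squares slope m and intercept b of the line y = mx + b."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


def pearson_r(xs, ys):
    """Correlation coefficient as the mean product of paired z scores."""
    zx, zy = standardize(xs), standardize(ys)
    return mean([a * b for a, b in zip(zx, zy)])


# Hypothetical ordered pairs, invented for illustration.
x = [1, 2, 3, 4, 5]
y = [4, 2, 8, 6, 10]

raw_slope, raw_intercept = fit_line(x, y)
std_slope, std_intercept = fit_line(standardize(x), standardize(y))
r = pearson_r(x, y)

print(f"raw line:          y = {raw_slope:.2f}x + {raw_intercept:.2f}")
print(f"standardized line: slope = {std_slope:.2f}, r = {r:.2f}")
# raw line:          y = 1.60x + 1.20
# standardized line: slope = 0.80, r = 0.80
```

With the raw scores the slope (1.60) depends on the units of x and y; once both variables are standardized, the slope and the correlation coefficient coincide and the intercept is essentially 0.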

Multiple Regression and Multiple Coefficient of Correlation

Thus far, all the tests that we have described are bivariate; that is, they examine the relation between one

independent and one dependent variable. In the actual world, relationships are more complicated. Typically,

dependent variables are influenced by more than one variable at a time; thus, we need multivariate statistics.

You should be beginning to realize that there are statistics for just about any relation you can imagine, but

most are designed to answer basic questions: Can I reject the chance model as a good explanation? How

strong is this relationship?

Just as a simple correlation (r) tells us whether the chance model can be rejected and how strong the relation

is between x and y, a multiple correlation (R) tells us the same thing. But in the case of the multiple R, we

have a little more information because R represents how much variance in the continuous dependent variable

y is explained by a set of continuous independent variables (x1, x2, x3 … xn). Moreover, each x variable

has a coefficient, which is sometimes called a regression coefficient or beta weight. So a multiple regression

analysis will produce a multiple R, which represents the combined influence of all the independent variables

on the dependent variable, y, as well as a regression coefficient or beta weight for each independent variable

(x). The coefficients represent the strength of the relation between that x and the dependent y, controlling

for the other xs, that is, taking out the influence of the other independent variables. Consider the following

formula for a multiple regression line:

y = m1x1 + m2x2 + m3x3 + b

Note that this equation is simply an extension of simple regression. Here, however, we have three

independent variables instead of only one. For example, we might be trying to predict student achievement

(y) based on the IQ (x1), motivation (x2), and sense of optimism (x3) of students. If we had data from

some sample of students on these variables, we could use a standard statistical package to run a multiple

regression analysis on this set of variables. The analysis would first compute an R, which would tell us

how strong the relation is between this set of variables and student achievement. For example, what is the

combined impact of IQ, motivation, and sense of optimism on student achievement? If we square the R, then

R² is a good estimate of how much of the variance in student achievement is explained by the combination of

IQ, motivation, and sense of optimism. The program also computes a t value or F ratio to gauge the likelihood

that the relation is a matter of chance. Furthermore, the analysis yields a standardized beta coefficient for

each independent variable, which tells us how much influence each independent variable has relative to

the other independent variables, and, of course, for each coefficient there will be a corresponding test to

determine its departure from chance.
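To make the output of such an analysis concrete, here is a minimal sketch under stated assumptions. It uses only two predictors (IQ and motivation) rather than the three in the example, because with two predictors the standardized beta weights and R² can be written directly in terms of the three simple correlations. This shortcut is a stand-in for illustration, not the procedure a statistical package actually runs, and the scores and function names are invented.

```python
import math


def standardize(values):
    """Convert raw values to z scores (mean 0, standard deviation 1)."""
    n = len(values)
    m = sum(values) / n
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / n)
    return [(v - m) / sd for v in values]


def pearson_r(xs, ys):
    """Correlation coefficient as the mean product of paired z scores."""
    zx, zy = standardize(xs), standardize(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(xs)


def two_predictor_regression(x1, x2, y):
    """Standardized beta weights and R squared for y regressed on x1 and x2."""
    r_y1, r_y2, r_12 = pearson_r(y, x1), pearson_r(y, x2), pearson_r(x1, x2)
    # Each beta is that predictor's correlation with y, adjusted for the
    # overlap between the two predictors (controlling for the other x).
    beta1 = (r_y1 - r_y2 * r_12) / (1 - r_12 ** 2)
    beta2 = (r_y2 - r_y1 * r_12) / (1 - r_12 ** 2)
    r_squared = beta1 * r_y1 + beta2 * r_y2
    return beta1, beta2, r_squared


# Hypothetical scores for six students, invented for illustration.
iq = [95, 100, 105, 110, 115, 120]
motivation = [3, 5, 2, 6, 4, 7]
achievement = [62, 60, 70, 72, 71, 85]

beta_iq, beta_motivation, r_squared = two_predictor_regression(
    iq, motivation, achievement
)
print(f"beta (IQ)         = {beta_iq:.2f}")
print(f"beta (motivation) = {beta_motivation:.2f}")
print(f"R = {math.sqrt(r_squared):.2f}, R squared = {r_squared:.2f}")
```

In this invented sample the beta weight for IQ is much larger than the one for motivation, so IQ carries more of the explanation once the overlap between the two predictors is taken out; R squared gives the share of the variance in achievement explained by the pair together.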

Remember that the two variables in simple correlation are both continuous; this is also the case in multiple

regression—all the variables are typically continuous. We could continue building our set of statistical

procedures. For example, what if we are interested in multiple continuous independent variables and multiple

continuous dependent variables? There is, of course, a statistical test for that circumstance, a canonical

correlation, but we have gone far enough to give you a flavor of statistics, what they do, and when and how

they are employed.

Summary

You now have a good working repertoire of statistical procedures and tests if you have carefully read and

studied this chapter. The chapter is not a substitute for a set of statistics courses, but it should

provide the conceptual understanding that you need to begin to analyze and frame quantitative research.

Let's review the key points:

• The mean, mode, and median are measures of central tendencies.

• The range and standard deviation are measures of variability.

• The basic purpose of inferential statistics is to answer the question “Were the results of my study a

consequence of the independent variable, or were they a result of chance?”

• Your inventory of statistical tests includes

  • the t test for the difference between two means,

  • ANOVA and the F test,

  • the chi-square test,

  • coefficients of correlation (r), and

  • multiple regression and multiple correlations (R).

• Which test is appropriate depends on the nature of the independent and dependent variables, that

is, whether they are continuous or categorical (see Table 3.3 for a summary).

Table 3.3 Types of Variables and Appropriate Statistical Tests

Independent Variable       Dependent Variable    Statistical Test
Dichotomous                Continuous            t test
Categorical                Continuous            F test (ANOVA)
Categorical                Categorical           Chi-square (χ²)
Continuous                 Continuous            Correlation (r)
Multiple and continuous    Continuous            Multiple correlation (R)
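The pairings in Table 3.3 amount to a simple lookup, which can be sketched as follows. The function name and the lowercase labels are hypothetical; the mapping itself restates the table.

```python
def choose_test(independent, dependent):
    """Return the test Table 3.3 pairs with the given variable types."""
    table = {
        ("dichotomous", "continuous"): "t test",
        ("categorical", "continuous"): "F test (ANOVA)",
        ("categorical", "categorical"): "chi-square",
        ("continuous", "continuous"): "correlation (r)",
        ("multiple continuous", "continuous"): "multiple correlation (R)",
    }
    # Combinations outside the table are reported rather than guessed.
    return table.get((independent, dependent), "not covered in Table 3.3")


print(choose_test("categorical", "categorical"))  # chi-square
print(choose_test("continuous", "continuous"))    # correlation (r)
```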

Check Your Understanding

1. An educational researcher has done an experiment with two groups: an experimental group (A)

and a control group (B). A was taught using “dynamic inquiry,” and B was taught in a traditional

way. At the end of the unit, a performance test was given to both groups, and their scores were

as follows:

A B
3 6
5 5
1 7
4 8
2 4

Using the formulas in this chapter, compute the mean, standard deviation, and variance for

Group A and Group B. Based on the results, develop a hypothesis relating dynamic inquiry and

effectiveness.

2. The following scores are the result of a test of reading comprehension in a fourth-grade class:

0, 2, 4, 1, 3, 5, 2, 4, 6, 6, 4, 2, 5, 3, 1, 4, 2, 0

What are the mean, mode, and median for this set of scores? What are the range, average

deviation, and standard deviation? In your own words, not in statistical terms, describe the

variance and central tendency of this distribution.

3. A student score of 600 on the SAT is the same as a standard score of 1. How does this student

compare with all those who have taken the test? What if the SAT score is 300 or a standard score

of −2? What is a standard score? (Hint: For the SAT test, the mean score is 500, and SD = 100.)

4. Compute a t value for Exercise 1 above, assuming that the standard error of the difference

between the two means is 1. Interpret what that t value means. Is the difference in the means of

the two groups statistically significant?

5. You have computed a correlation between the socioeconomic status of your students and their

math achievement scores (r = .70). Interpret what this correlation means. If this is a true

correlation, what can you as a teacher do to improve performance? How much does SES help or

hinder your task?

6. You have just read an interesting article in which the researcher shows that the multiple

regression of home background (HB), intelligence (IQ), and motivation (M) on achievement

produces an R² of .87 and the standardized beta weights are .31, .41, and .34, respectively. How

strong is the relation? Which variable is the most important in explaining achievement? What is

the relative influence of each of the independent variables? What conclusions can you draw?

Key Terms

ANOVA (analysis of variance) (p. 55)

Beta weight (p. 62)

Between-group variance (p. 56)

Chi-square (χ²) (p. 57)

Coefficient of correlation (r) (p. 59)

Effect size (p. 59)

Error variance (p. 56)

Experimental variance (p. 56)

F value (p. 55)

Level of significance (p. 54)

Mean (p. 45)

Median (p. 45)

Mode (p. 45)

Multiple correlation (R) (p. 62)

Negative correlation (p. 61)

Normal distribution (p. 50)

Population (p. 51)

p Value (p. 54)

Range (p. 47)

Research hypothesis (p. 45)

Sample (p. 51)

Standard deviation (p. 47)

Standard score (p. 51)

Systematic variance (p. 56)

t Test (p. 53)

t Value (p. 54)

Within-group variance (p. 56)

z Score (p. 51)

http://dx.doi.org/10.4135/9781452272061.n3
