Required Resources

Text

Read from the course text, Statistics for the Behavioral & Social Sciences:

· Chapter 8: Correlation

· Chapter 9: Linear Regression

Websites

The Critical Thinking Community (http://www.criticalthinking.org)

· This website teaches and promotes the main tenets of critical thinking. There are some excellent articles that can help you better understand what critical thinking is and how to utilize it in everyday life.

Recommended Resources

Article

Kirwan, J., Lounsbury, J., & Gibson, L. (2010). Self-direction in learning and personality: The Big Five and narrow personality traits in relation to learner self-direction. International Journal of Self-Directed Learning, 7(2), 21-34. Retrieved from http://sdlglobal.com/IJSDL/IJSDL7.2-2010.pdf#page=25

· This is an article about personality, self-directed learning, and scale development and the major traits that may affect them. These include: agreeableness, conscientiousness, emotional stability, and openness. It incorporates correlation and regression procedures with tables that display the statistical results.

Stark, P.B. (2013). Chapter 9: Regression. Retrieved from http://www.stat.berkeley.edu/~stark/SticiGui/Text/regression.htm

· This website contains several video lectures and examples of how regression is used.

Trochim, W. M. (2006). Correlation. In Research Methods Knowledge Base. Retrieved from http://www.socialresearchmethods.net/kb/statcorr.php

· This website contains many tutorials and tools for statistical analyses and methods used in the social sciences. This particular page is a detailed description, with examples and graphs, to help understand correlation statistics.

Websites

VassarStats: Website for Statistical Computation (http://vassarstats.net/)

· This website includes tools to calculate many of the statistical tests we cover in this course including t-tests, ANOVA, correlation, and regression. Each calculator includes a tutorial and/or walkthrough.

Web Center for Social Research Methods (http://socialresearchmethods.net/)

· This website includes links to numerous tools and tutorials relating to statistical concepts, calculations, and scale development.

Linear Regression

Chapter Learning Objectives

After reading this chapter, you should be able to do the following:

1. Explain the relationship between correlation and regression.

2. Describe the regression line in least-squares regression.

3. Estimate a predictor-based criterion value using regression.

4. Explain multiple regression.

Introduction

Regression is a powerful analytical tool that in its simplest form uses the relationship between two variables to predict one from the other. People often make regression-like predictions. The presence of clouds in the morning sky prompts us to take an umbrella to work, for example, or a phone call from unexpected guests results in extra food prepared for a meal. The concepts in this chapter follow the same thinking, except that the predictions are mathematical.

Social scientists rely on regression in virtually every advanced statistical procedure. Most of those high-end statistical techniques (such as multivariate analysis of variance, discriminant-function analysis, and structural-equations modeling) are beyond the scope of an introductory text, but regression analysis is an essential part of the preparation for each of them. In the meantime, regression has value in its own right, as a mathematical process that uses the relationship between two variables to predict the value of one from the value of the other.

9.1 Regression and Correlation

Chapter 8 made the point that when two variables are correlated, it is because they share information. For example, if intelligence correlates with reading comprehension, it is because to some degree each measures a common characteristic. The more highly they are correlated, the greater the quantity of whatever is measured that the two characteristics have in common, which is what the coefficient of determination ($r_{xy}^2$) indicates. It reveals the proportion of the variance in one variable that can be explained by the other. If intelligence (x) and reading comprehension (y) are correlated at, say, $r_{xy} = 0.8$, then $r_{xy}^2 = 0.64$: 64% of whatever reading comprehension measures can be explained by variations in intelligence.

If correlated variables share information and we have information about the value of one of those variables, we should be able to make a better-than-chance prediction of the corresponding value of the other.

· If age and height are correlated for teenagers, and we know how old a subject is, we should be able to make a better-than-chance prediction of the individual’s height. Conversely, if we know a teen’s height, we ought to be able to predict age.

· If education and income are correlated, and we know how many years of schooling a subject has had, we should be able to make a reasonable prediction of that person’s income.

· If the length of soldiers’ exposure to combat correlates with their manifestation of post-traumatic stress disorder (PTSD), we can predict the severity of PTSD from the length of combat exposure.

Regression allows us to address issues such as these mathematically. The concept is not new. Karl Gauss, the same mathematician who defined the characteristics of the normal (Gaussian) distribution, began developing the procedures behind regression in the early part of the 19th century. Many others have also contributed. Collectively, their work has allowed experts in a variety of fields to use regression procedures in their decision-making for many years.

[Photo caption: Meteorologists use regression procedures to predict the occurrence of violent storms.]

· Economists gather data on unemployment rates, wholesale inventories, and consumer spending in order to predict the rate at which the economy will grow. This approach is effective because each of those variables correlates with economic expansion.

· Meteorologists use changes in barometric pressure to predict weather. Because drops in barometric pressure are predictors of violent storms in the Great Plains states and in the southeastern part of the United States, meteorologists watch particularly for dramatic drops in air pressure.

· Sports oddsmakers rely on data such as a team’s past performance, injuries to key players, and the quality of the opponent to predict game outcomes.

· Psychologists use genetic and social factors, including the history of alcohol abuse in a family, to predict an individual’s predisposition to abuse drugs.

Try It!: #1

From the standpoint of making a prediction, why does the strength of the correlation between the two variables involved matter?

Each of these scenarios is possible because of correlations between variables. Correlations pave the way for prediction. The point is not that the people in the examples necessarily sit down with mathematical models to calculate the probability that certain results will emerge, but they could. In fact, a review of the professional literature provides ample evidence that scholars perform analyses like these frequently. They are important because prediction allows those who must act to be proactive. Rather than waiting for some important condition to emerge, affected parties can anticipate its timing with some precision, and then take appropriate action. Prediction is the basis for sound decision-making.

The Language of Regression

Many types of regression procedures are employed; although this chapter concerns itself with just one, the concepts and much of the language used here are common to the different approaches. In the Statistical Package for the Social Sciences (SPSS), one of the most popular computer programs for statistical analysis, the variable to be predicted is called the dependent variable. The variable used to make the prediction is called the independent variable.

Those terms are common enough in statistics, but in regression discussions, the words independent and dependent run the risk of suggesting a causal relationship between the antecedent variables and that dependent variable. Although we also used this language with t tests and ANOVA, the risk is greater in correlation discussions because the discussion begins by assuming a relationship between the variables. To avoid this slippery slope, we will make an adjustment in discussing regression. Rather than the terms “dependent variable” and “independent variable,” as common as those terms are elsewhere, we will refer to the variable to be predicted in a regression procedure as the criterion variable, and the variable used to make the prediction as the predictor variable. We adopt this language to minimize the risk of confusing correlations with causal relationships.

This does not mean that no causal relationship exists; just such a connection may be at work. In fact, the relationship between exposure to combat and the development of post-traumatic stress, for example, probably is causal. The point is that the correlation alone, which is the foundation for pursuing regression, is not usually sufficient by itself to establish causality.

Although this chapter uses the terms criterion and predictor here for descriptive purposes, some shorthand indicators are needed as well. The symbols used in regression are the same as those used with the Pearson correlation in Chapter 8: x and y for the correlated variables. Here, x symbolizes the predictor variable and y symbolizes the criterion variable.

Choosing the Predictor

The confusion that can occur when equating correlation with cause increases when we recognize that either variable in a significant correlation can be used to predict the other. If a correlation exists between the degree of post-traumatic stress disorder (PTSD) and the length of exposure to combat, it means that each variable is equally related to the other. A researcher might, for instance, predict the degree of PTSD from the length of combat exposure or predict the converse relationship: the length of combat exposure from the degree of PTSD.

Either variable in a statistically significant correlation can predict the other. From the point of view of the mathematics involved, which variable predicts which does not matter, although practical considerations may dictate the predictor and the criterion. Sometimes one of the variables will prove more elusive than the other because the data involved are more difficult to gather. In such cases, the difficulty involved may require that the more accessible variable become the predictor and that the less available variable be predicted rather than gathered.

If reading comprehension scores are significantly correlated with intelligence scores, and someone wishes to predict the value of one from the other, using reading scores as the predictor variable makes sense. Reading scores are more accessible than intelligence scores. Most major intelligence tests must be administered to one subject at a time by someone trained to use the instrument. This process makes gathering intelligence scores expensive and time-consuming. Reading tests, on the other hand, can be group administered and usually require little training.

Other factors must be considered when determining predictors and criteria. Perhaps the scores from the college-aptitude tests students take in high school are correlated with the grades that students earn during their first year of college study. From the standpoint of the correlation, scores can be predicted from grades quite as readily as grades can be predicted from scores, but it will generally be the students’ future performance, rather than their past performance, that will be of interest.

Picturing Regression

Table 9.1: Study data for hours studied and grade average

Subject	Hours studied (x)	Grade average (y)
1	1	1.5
2	2	1.8
3	3	2.2
4	3	2.0
5	5	2.0
6	5	2.1
7	7	2.4
8	7	2.2
9	8	2.4
10	10	2.7
11	10	2.6
12	11	2.9
13	13	3.0
14	15	3.0
15	16	3.1
16	16	2.7
17	16	3.3
18	17	3.0
19	18	3.4
20	20	4.0

Chapter 8’s scatterplot illustrated the correlation between verbal ability and intelligence. In the graph, each point represented one subject’s scores on two variables. When variables are highly correlated, the points reflect an inclining or declining line from left to right in the scatterplot, depending upon whether the correlation is positive or negative. Little “scatter” along the line indicates high correlation between variables.

When scatterplots are applied to regression, the predictor variable scores (x) are plotted on the horizontal axis, the criterion variable scores (y) on the vertical axis. Perhaps a researcher randomly selects a group of 20 students at the end of their first term of study at a large university and gathers from them two types of data: (a) the number of hours per week they typically study and (b) their recorded grade averages at the end of the first term. The researcher wants to determine how well the number of hours the student studies per week will predict the student’s grades. This means that the hours studied is the predictor variable, x, and grade average is the criterion variable, y. Table 9.1 lists the data.

These data can be used to create another scatterplot like the one in Chapter 8. If the data are placed in columns in an Excel spreadsheet just as they are here (the left-hand numbering occurs automatically in Excel and is not considered one of the data columns), the commands for creating the scatterplot are Insert and then Scatter. The resulting scatterplot (with added labels and title) is Figure 9.1.

Figure 9.1: A scatterplot for the relationship between study time and grades

A scatterplot numbered 0 to 25 along the horizontal axis to indicate the number of hours studied per week and numbered 0 to 4.5, with increments of 0.5, to indicate the grade-point averages for students in the survey. Points on the scatterplot follow a pattern from lower left to upper right.

To plot the data manually, draw the vertical and horizontal axes of the graph. Mark equal intervals on the horizontal axis for increasing hours studied and along the vertical axis for increasing grade averages. For the first subject, identify one hour studied on the horizontal axis and move vertically to GPA = 1.5. Mark the point. With points plotted for all subjects, three conclusions emerge:

· The number of hours studied and the students’ first-term grades appear correlated. If this were not the case, the dots would have no particular pattern.

· The correlation is substantial; the points vary little from an imaginary line drawn from lower left to upper right. If we plotted the number of hours students spend playing video games against their grades, the pattern might be from upper left to lower right: more play, poorer grades.

· The relationship between x and y appears to be consistent and linear.

[Photo caption: Regression can be used to help us understand the correlation between the amount of time students spend studying and the grades they earn.]

About Linearity

The third point about the linear relationship between x and y is particularly important. The type of regression discussed here assumes that the association between the predictor and criterion variables is linear. Because of the linearity assumption, we can draw a straight line through the data, attempting to remain as close to as many data points as possible while keeping it straight. Such a line might resemble the graph in Figure 9.2A.

The line through the data points is called a regression line. It is positioned so that it is as close as possible to as many of the 20 data points as it can be and still be a straight line. The regression line can be used to determine the value of y from a specified value of x. For example, someone using the graph can select any value of x along the horizontal axis, go vertically from that x value up to the line, and then move left horizontally from the line to the y axis. The value where the y axis is encountered will be the value of y for the specified x value, according to the data from these 20 students. Figure 9.2B shows how to use the regression line to determine y from x.

No one in the sample of 20 indicated 12.5 hours per week study time, but perhaps one of the researcher’s colleagues, aware of what kind of analysis is underway, asks, “I know someone who studies 12.5 hours every week. What corresponding grade average might we expect for that student?” If the regression line is positioned accurately, the researcher can locate 12.5 on the x axis, travel vertically up to the regression line and then move left to the y axis to determine that 12.5 hours per week of study time predicts a grade point average of about 2.9.

The researcher gathered data from only 20 students. The sample size is somewhat risky, but subjects were randomly selected, and sampling theory tells us that a randomly selected sample will differ from the population of all freshman students only by chance. In spite of the small sample size, perhaps the sampling error is minimal.

Figure 9.2A: Regression lines

Scatterplot similar to the one in Figure 9.1 but with a straight line drawn through the points along the graph

Figure 9.2B: Using the regression line to determine y from x

Scatterplot identical to the one in 9.2A, with added horizontal and vertical lines extending across from three on the vertical axis to about twelve point five on the horizontal axis and meeting at a right angle.

Coping with Less-Than-Perfect Correlations

Besides the fact that the graph in Figure 9.2A does not provide very detailed markings (the researcher had to guess that 12.5 hours studied will produce a grade average of “about 2.9”), other factors affect prediction accuracy. The researcher cannot be precise about what grade a specified number of hours studied will predict because the correlation between the two variables is imperfect. Although a grade average of 2.9 might be the best possible prediction given these data, it is quite likely that the prediction will not be exact. For a particular student who studies 12.5 hours per week, a GPA of 2.8 or perhaps 3.0 may be a more accurate prediction.

Without yet calculating the correlation, the evidence for $r_{xy} < 1.0$ is the scatter in the data points. For example, note that three students reported studying the same 16 hours per week but ended the term with different grade averages. This result reflects the fact that grades are affected by more than just study time. The researcher has not accounted for differences in academic ability, class rigor, teaching quality, or a host of other variables. Those problems aside, as long as the correlation between the predictor and criterion variables is statistically significant, the predicted value of y from the value of x will be more accurate over time than a number of random predictions.

Try It!: #2

What is the visual evidence in a scatterplot for a weak correlation?

Error in prediction is something we tolerate. No one sues the College Board if a student with a high SAT score performs poorly the first year in college. Viewers do not petition the television station to fire the meteorologist when the forecast high for the day is wrong by a couple of degrees. Error is inevitable when predictions must be based on imperfectly correlated variables, but we can at least have some measure of how extensive the error is likely to be. Later, the chapter will discuss how to calculate the amount of error and then use it to qualify the predicted value in a way that gauges prediction accuracy.

Understanding the Least-Squares Criterion

A scatterplot is a helpful way to introduce the idea of regression, but relying on the scattered points in a graph and a positioned line to predict one variable from the other is not practical. The regression line is a conceptual model for what a regression equation actually does. The equation meets what is called the least-squares criterion, the requirement that the regression line be positioned so that the sum of the squared prediction errors has the lowest possible value. In short, the equation for the regression solution minimizes prediction error.

Describing Prediction Error

Whenever correlations are imperfect, a regression solution will include some error. If we predict the corresponding value of y for a series of x values, and we know the actual value of y in each case, error is the difference between the criterion variable’s predicted value according to the regression equation, and its actual value according to the data. To avoid confusing the various values, we will identify them as follows:

· x is the actual value of the predictor variable,

· y is the actual value of the criterion variable, and

· y' (y prime) is the predicted value of the criterion variable.

Researchers often do not know the actual value of y (thus the value of regression procedures), but if they did, the difference between the actual and predicted values (y − y') would indicate the error in a solution. The y − y' difference is called a residual score.

Look at Figure 9.2A again. If the correlation between hours studied and end-of-term grades formed a straight line (that is, if the correlation between those variables was $r_{xy} = 1.0$), each value of x would result in just one corresponding value of y. Those three people who reported 16 hours per week study time would all have had the same grades at the end of the term.

But with scatter in the data points, establishing the regression line is not a matter of just connecting the dots. Because the regression line is a straight line and must be positioned so as to minimize the prediction errors, its placement involves some compromise. It is a “line of best fit,” not a “line of perfect fit.”

Using the Residual Scores

If someone were to rely on Figure 9.2A to make a series of predictions and then go through university records to determine what the actual grade averages were for those 20 students after their first term, any difference between predicted grades and actual grades would be a residual score.

If all of those residual scores were added up [∑(y − y')], what would they total? Some residual scores would be positive (the actual value of y is larger than the predicted value, y') and some negative (y is smaller than y'). The positive and negative residual scores would cancel each other out and sum to 0; summing up residual errors does not reveal much about the amount of error in a series of regression solutions. However, if the residual scores are squared (which eliminates the negative values), and then summed, the result better indicates the amount of error. When the regression line is positioned so that the sum of those squared values is as low as possible, the solution meets the least-squares criterion. Positioning the line so as to minimize error is the function of the regression equation and the reason that this particular form of regression is called ordinary least-squares regression.
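The two claims in this passage (residuals around the least-squares line sum to zero, and that line has the smallest sum of squared residuals) can be checked numerically. The following is a minimal sketch using the Table 9.1 data. The function names are illustrative, and the slope is computed from deviation sums, which is assumed here as an algebraically equivalent form of the chapter's slope formula; the two comparison lines are hypothetical alternatives chosen only for contrast.

```python
# Table 9.1 data: hours studied (x) and grade average (y) for 20 students.
hours = [1, 2, 3, 3, 5, 5, 7, 7, 8, 10, 10, 11, 13, 15, 16, 16, 16, 17, 18, 20]
grades = [1.5, 1.8, 2.2, 2.0, 2.0, 2.1, 2.4, 2.2, 2.4, 2.7,
          2.6, 2.9, 3.0, 3.0, 3.1, 2.7, 3.3, 3.0, 3.4, 4.0]


def least_squares_line(x, y):
    """Return (intercept, slope) of the line that minimizes squared residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares, both in deviation form.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def residuals(x, y, intercept, slope):
    """Residual score for each subject: actual y minus predicted y'."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def sum_of_squares(values):
    return sum(v ** 2 for v in values)


a, b = least_squares_line(hours, grades)
fitted = residuals(hours, grades, a, b)

sum_resid = sum(fitted)            # cancels to (approximately) zero
sse_fit = sum_of_squares(fitted)   # squared residuals for the fitted line

# Hypothetical alternative lines, for comparison only.
sse_steeper = sum_of_squares(residuals(hours, grades, a, b + 0.02))
sse_higher = sum_of_squares(residuals(hours, grades, a + 0.1, b))

print(f"sum of residuals is near zero: {abs(sum_resid) < 1e-9}")
print(f"squared error, fitted line:  {sse_fit:.3f}")
print(f"squared error, steeper line: {sse_steeper:.3f}")
print(f"squared error, higher line:  {sse_higher:.3f}")
```

Any other straight line tried in place of the two alternatives should also give a larger sum of squared residuals than the fitted line, which is what the least-squares criterion requires.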

9.2 Ordinary Least-Squares Regression with One Predictor

Theoretically, a regression problem can have any number of predictors, $x_1, x_2, x_3, \ldots$. Having more than one predictor makes the procedure “multiple” regression. Toward the end of the chapter, we will describe how multiple regression works. For now, the chapter will focus on regression with just one predictor, sometimes called simple regression or bivariate regression because there are only two variables involved: a predictor variable and a criterion variable.

To position the regression line so as to meet the least-squares criterion requires answers to two questions:

1. Where does the regression line cross the y axis in the graph?

2. How much does the criterion variable (y) change when the predictor variable (x) increases by 1.0?

The answer to the first question establishes the regression line’s intercept, called that because it indicates the value of y where the regression line intercepts the y axis. The intercept is the value of y when x = 0. Look at Figure 9.2A again, and note that the regression line appears to cross the y axis at about y = 1.2. Following is an equation to calculate the intercept value, which will indicate how close the estimate is.

The second question above concerns the slope of the line. It indicates how much the regression line inclines or declines from left to right. Using the units in Figure 9.2A, we might estimate that whenever x increases by 5.0, from left to right, y increases by a little less than 1.0. Reducing it to the units used in the second question above, if x increases by 1.0, y increases by something less than 0.2.

The Regression Equation

The simple, or bivariate, regression equation has this form:

Formula 9.1

y' = a + bx + e

where

y' = the predicted value of the criterion variable

a = the intercept

b = the slope of the regression line

x = the value of the predictor variable

e = prediction error

Formula 9.1 shows that for any value of the predictor (x), the predicted value of y is the value of the intercept (a), plus the slope times the predictor variable’s value (bx), plus error (e). Calculating the amount of error in a regression problem is a separate process, and the e will be dropped from the equation hereafter, resulting in y' = a + bx. The error symbol is in Formula 9.1 to remind us that absent a perfect correlation, prediction error is always present.

Before calculating a regression solution, we need to know the intercept value, a, and the slope, b. Each has its own equation, but the components of both are already familiar. First, the formula for the intercept is

Formula 9.2

$a = M_y - bM_x$

where

a = the intercept

b = the slope of the regression line

$M_y$ = the mean of the criterion variable, y

$M_x$ = the mean of the predictor variable, x

That is, the intercept value is the mean of the criterion variable minus the slope value times the mean of the predictor variable.

Because the intercept formula includes the slope of the regression line, or the regression coefficient, we need to start there. To determine the regression coefficient, b, use the following:

Formula 9.3

$b = r_{xy}\left(\dfrac{s_y}{s_x}\right)$

where

b = the slope of the regression line

$r_{xy}$ = the correlation coefficient for the two variables

$s_y$ = the standard deviation of the criterion variable, y

$s_x$ = the standard deviation of the predictor variable, x
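Formulas 9.2 and 9.3 translate directly into two small functions. The sketch below is illustrative only: the function names are assumptions, and the input values are the summary statistics the chapter reports for the study-time data. The slope is rounded to three decimals before the intercept is computed, on the assumption that this mirrors the hand calculation; carrying the unrounded slope forward shifts the intercept slightly in the third decimal place.

```python
def slope(r_xy, s_y, s_x):
    """Formula 9.3: b = r_xy * (s_y / s_x)."""
    return r_xy * (s_y / s_x)


def intercept(mean_y, b, mean_x):
    """Formula 9.2: a = M_y - b * M_x."""
    return mean_y - b * mean_x


# Summary statistics reported in the chapter for the Table 9.1 data.
R_XY = 0.946                  # correlation between hours studied and grades
MEAN_X, S_X = 10.15, 5.941    # hours studied
MEAN_Y, S_Y = 2.615, 0.613    # grade average

# The slope must come first because the intercept formula depends on it.
b = round(slope(R_XY, S_Y, S_X), 3)
a = round(intercept(MEAN_Y, b, MEAN_X), 3)

print(f"slope b = {b}")
print(f"intercept a = {a}")
```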

Calculating a Regression Solution

Using the study-time and term-grades data, we will calculate a regression solution for the individual who studied 12.5 hours per week. The graph suggested that such a person would probably have a term grade average of about 2.9. How accurate was that estimate? To answer, we will need to calculate the following:

· the means and standard deviations for x (hours studied) and y (grade average),

· the correlation of x and y,

· the slope of the line (the regression coefficient) b,

· the regression intercept, a, and

· the value of y'.

1. For the x and y variables, verify that $M_x = 10.15$, $s_x = 5.941$, $M_y = 2.615$, and $s_y = 0.613$.

2. Using Formula 8.2, calculate the correlation as follows:

$r_{xy} = \dfrac{n\sum xy - (\sum x)(\sum y)}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}}$

$r_{xy} = \dfrac{20(596.3) - (203)(52.3)}{\sqrt{\left[20(2{,}731) - 203^2\right]\left[20(143.91) - 52.3^2\right]}} = \dfrac{1{,}309.1}{\sqrt{(13{,}411)(142.91)}} = 0.946$

Checking this value against the critical value from Table 8.5, $r_{xy\,0.05(18)} = 0.444$, indicates that the correlation is statistically significant.

To identify the correlation using Excel, remember that having set up the data in the two columns, the commands are Data → Data Analysis → Correlation.

3. The slope of the line is

$b = r_{xy}\left(\dfrac{s_y}{s_x}\right) = 0.946\left(\dfrac{0.613}{5.941}\right) = 0.098$

This value indicates that y increases 0.098 for every 1.0 increase in x. Earlier, we guessed that the slope of the line in the Figure 9.2 graphs might be about 0.2. The estimate was not very accurate.

4. The regression line intercept is

$a = M_y - bM_x$

= 2.615 − (0.098)(10.15)

= 1.620

This value indicates that if x = 0, then y = 1.620. Based on the visual best fit that we made to the graphs in Figures 9.2A and 9.2B, we guessed that the intercept would be about y = 1.2. So that estimate was not close either. In Figure 9.3, rather than estimating where the regression line would be positioned to establish a best fit, as we did in the 9.2 figures, Excel has completed the regression calculations and positioned the line. Note that the Excel values conform more closely to our calculations.
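For readers who would rather verify steps 1 and 2 with a script than by hand or in Excel, the following sketch recomputes the means, standard deviations, and correlation from the Table 9.1 data. It uses only Python's standard library. The variable names are illustrative, and `statistics.stdev` is used on the assumption that the chapter's standard deviations are sample values (n − 1 in the denominator), which is consistent with the figures reported in step 1.

```python
import math
import statistics

# Table 9.1 data: hours studied (x) and grade average (y) for 20 students.
hours = [1, 2, 3, 3, 5, 5, 7, 7, 8, 10, 10, 11, 13, 15, 16, 16, 16, 17, 18, 20]
grades = [1.5, 1.8, 2.2, 2.0, 2.0, 2.1, 2.4, 2.2, 2.4, 2.7,
          2.6, 2.9, 3.0, 3.0, 3.1, 2.7, 3.3, 3.0, 3.4, 4.0]

n = len(hours)

# Step 1: means and sample standard deviations.
mean_x = statistics.mean(hours)
mean_y = statistics.mean(grades)
s_x = statistics.stdev(hours)
s_y = statistics.stdev(grades)

# Step 2: the sums required by the computational correlation formula.
sum_x = sum(hours)
sum_y = sum(grades)
sum_xy = sum(x * y for x, y in zip(hours, grades))
sum_x2 = sum(x * x for x in hours)
sum_y2 = sum(y * y for y in grades)

numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
r_xy = numerator / denominator

print(f"M_x = {mean_x:.2f}, s_x = {s_x:.3f}")
print(f"M_y = {mean_y:.3f}, s_y = {s_y:.3f}")
print(f"r_xy = {r_xy:.3f}")
```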

What grades would someone who studies 12.5 hours per week likely earn? From Figure 9.2A, we estimated about 2.9. Check this estimate by solving for y', comparing the two solutions, and consulting Figure 9.3.

y' = a + bx

= 1.620 + (0.098)(12.5)

= 2.845

Based on the data from the 20 students, the grade average predicted for someone who studies 12.5 hours per week is 2.845. Interestingly, that value is not far from the earlier prediction.
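The prediction step is a one-line function once a and b are known. The sketch below, with an illustrative function name and the chapter's rounded intercept and slope as defaults, reproduces the 12.5-hour prediction and compares it with the estimate read from the graph.

```python
INTERCEPT = 1.620   # a, from the chapter's worked solution
SLOPE = 0.098       # b, from the chapter's worked solution


def predict(x, a=INTERCEPT, b=SLOPE):
    """Predicted criterion value: y' = a + b * x."""
    return a + b * x


predicted = predict(12.5)
graph_estimate = 2.9            # value read from Figure 9.2A
difference = graph_estimate - predicted

print(f"predicted grade average: {predicted:.3f}")
print(f"graph estimate minus prediction: {difference:.3f}")
```

The same function can be reused for any other value of x by changing the argument.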

Practicing the Regression Solution

Now that the calculations of the slope (b) and the intercept (a) have been completed, it is a simple matter to solve for any other value of x. For example, what grades can be predicted for someone who seems to study incessantly, perhaps 30 hours a week? Based on the formula for the predicted value y' = a + bx, we can substitute in our values of a (1.620), b (0.098), and x (30) to find the following:

y' = a + bx

= 1.620 + (0.098)(30)

= 4.560

This result is interesting because grades are averaged on a four-point scale, meaning they can be no higher than 4.0, straight As. The regression procedure does not “know” there is an effective ceiling to how high grades can be. The linearity assumption, to which we referred earlier, is that the relationship continues higher and lower in either direction from the data gathered.
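One way to make this limitation visible in practice is to flag predictions that fall outside the range of the observed data or above the grade ceiling. The check below is not part of the chapter's procedure; it is a hedged sketch in which the function names and the flagging rule are assumptions, the observed range (1 to 20 hours) comes from Table 9.1, and the 4.0 ceiling comes from the paragraph above.

```python
OBSERVED_MIN_HOURS = 1    # smallest x in Table 9.1
OBSERVED_MAX_HOURS = 20   # largest x in Table 9.1
GRADE_CEILING = 4.0       # highest possible grade average


def predict_grade(hours, intercept=1.620, slope=0.098):
    """Predicted grade average from the chapter's regression solution."""
    return intercept + slope * hours


def describe_prediction(hours):
    """Return the prediction with flags for extrapolation and impossible values."""
    predicted = predict_grade(hours)
    return {
        "hours": hours,
        "predicted": round(predicted, 3),
        # True when x lies outside the data used to position the line.
        "extrapolated": not (OBSERVED_MIN_HOURS <= hours <= OBSERVED_MAX_HOURS),
        # True when the line predicts a grade the scale cannot produce.
        "above_ceiling": predicted > GRADE_CEILING,
    }


for h in (12.5, 30):
    print(describe_prediction(h))
```

A flagged result does not mean the arithmetic is wrong, only that the straight line is being trusted beyond the data that produced it.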

What grade average is predicted for someone who does not study at all? Is such a student likely to receive a grade average that is also 0?

y' = a + bx

= 1.620 + (0.098)(0)

= 1.620

Even with no time devoted to weekly study, a student is unlikely to have a GPA of zero. But this is an answer we already had. Remember that the intercept is defined as the value of y if x = 0. In terms of our analysis, this is equivalent to asking what the GPA value (the y variable) is for someone who studies zero hours (the x variable). To answer that question, we could have just reported the value of the intercept.

Determining the Error in a Regression Solution

The prediction of the grade average for the student who studies 12.5 hours per week during that first term of college was 2.845. It is the best prediction that can be made with the data that are available for the 20 students in the data set. Random sampling will keep sampling error, the degree to which the sample is unlike the population of all freshman students at the university, to a minimum.

However, even if the data set included data for every freshman student, the answer still will not necessarily be precisely accurate for one individual. The regression equation allows the best prediction, but it is a generalization based on the group, which may or may not be exactly accurate for a particular student. No matter how large the sample, and regardless of how it is selected, some prediction error is inevitable as long as the correlation between predictor and criterion is less than perfect (|rxy| < 1.0).

This reality does not mean that the regression process is flawed, simply imperfect. We made a similar point about calculating the various standard-error statistics in the chapters about t tests. The fact that the test statistic includes error does not imply that mistakes were made. Error indicates variability in the data for which the researcher cannot account. It is the same with regression procedures. The least-squares criterion explains that the equations are designed to minimize error but cannot eliminate it entirely.

For any data set from which a large number of predictions are made, some of the predictions will err by being too high, some will be too low, and a few might be correct. The fact that all these prediction errors would sum to 0, ∑(y − y') = 0, is small consolation if we are making one prediction for one individual and the outcome is important. To know how much to trust the prediction, regression procedures include a way to estimate the amount of error.
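The claim that the prediction errors sum to 0 can be checked numerically. The sketch below fits a least-squares line to a small, purely hypothetical set of study hours and grade averages (these six pairs are not the chapter's data set) and adds up the residuals; the helper name fit_line is also an assumption made for illustration.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data, for illustration only.
hours = [2, 5, 8, 10, 12, 15]
gpa = [1.9, 2.0, 2.6, 2.5, 2.9, 3.1]

a, b = fit_line(hours, gpa)

# Residual = actual y minus predicted y'.
residuals = [y - (a + b * x) for x, y in zip(hours, gpa)]
residual_sum = sum(residuals)

print([round(r, 3) for r in residuals])
print(abs(residual_sum) < 1e-9)  # individual errors are nonzero, but they cancel
```

Individual residuals are nonzero, yet they cancel in the sum, which is why a separate measure of error size is needed.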

The Standard Error of the Estimate

Recall the standard error of the mean and the standard error of the difference that are calculated for the t tests. Both those statistics measure error variance in the other tests. Regression has a similar measure of error variance called the standard error of the estimate (SEest). Theoretically, the standard error of the estimate is explained this way:

If a researcher calculates a very large number of regression solutions from a data set and for each solution determines the residual score, or the difference between the actual and predicted values of the criterion variable (y − y'), then the standard error of the estimate is the standard deviation of all those residual scores.

The only way that residual scores can be determined, of course, is if the researcher already has all the actual values of y. If the researcher had that information, what would be the point of using regression? The standard deviation of residual scores explains the standard error of the estimate, but it is not a guide to how the researcher calculates that statistic. Recall that in theory, the standard error of the mean (Chapter 4) is the standard deviation of all the sample means in the population. But that value was estimated by dividing the sample standard deviation by the square root of the number in the sample. The standard error of the estimate can be calculated in a similar way:

Formula 9.4

$SE_{est} = s_y\sqrt{1 - r_{xy}^2}$

where

SEest = the standard error of the estimate

sy = the standard deviation of the criterion (y) variable

rxy² = the square of the correlation coefficient

For the hours-studied and grade-point-average problem, the standard error of the estimate will be as follows:

Given the correlation between study time and grade averages of rxy = 0.946, and the standard deviation of the y variable (grade averages) of sy = 0.613, the standard error of the estimate is

$SE_{est} = s_y\sqrt{1 - r_{xy}^2} = 0.613\sqrt{1 - 0.946^2} = 0.199$

A large SEest value indicates substantial error in the prediction. Consider the factors affecting the size of the standard error of the estimate.

1. The sy value is the standard deviation of the variable to be predicted. Highly variable data sets result in large standard deviation values and, as a result, in large SEest values.

2. The sy value is a multiplier for the result of $\sqrt{1 - r_{xy}^2}$. The more highly x and y are correlated, the smaller this resulting value will be and, as a consequence, the smaller the SEest value.

The smallest SEest can be is 0, and the largest the value can be is the value of sy. In either case, can you see why?

· If the correlation between predictor and criterion is perfect (that is, if rxy = 1.0), the latter part of the term, $\sqrt{1 - r_{xy}^2}$, becomes $\sqrt{1 - 1}$, or $\sqrt{0}$, and $s_y \times 0 = 0$.

· At the other extreme, if the correlation between predictor and criterion has its weakest possible value (0), the latter part of the term becomes $\sqrt{1 - 0}$, or 1, and $s_y \times 1 = s_y$.
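Formula 9.4 and its two limiting cases can be tried out directly. The following is a minimal sketch; the function name standard_error_of_estimate is hypothetical, and the inputs are the sy and rxy values quoted for the study-time problem.

```python
import math

def standard_error_of_estimate(s_y, r_xy):
    """Formula 9.4: SEest = s_y * sqrt(1 - r_xy^2)."""
    return s_y * math.sqrt(1 - r_xy ** 2)

# Study-time example from the text: s_y = 0.613, r_xy = 0.946.
se_gpa = standard_error_of_estimate(0.613, 0.946)

# Limiting cases: a perfect correlation leaves no error,
# and a zero correlation leaves all of s_y as error.
se_perfect = standard_error_of_estimate(0.613, 1.0)
se_none = standard_error_of_estimate(0.613, 0.0)

print(round(se_gpa, 3), se_perfect, se_none)
```

Changing r_xy while holding s_y fixed shows how quickly the error shrinks as the correlation strengthens.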

Using the Standard Error of the Estimate

By itself, the standard error of the estimate is not that helpful. It is hard to know when the amount of error is comparatively large and when it is not. Remembering the relationship between a standard deviation and the normal distribution provides some guidance.

Recall that in a normal distribution, the area from one standard deviation below the mean to one standard deviation above the mean includes about two-thirds (68%) of the entire population. Because the SEest is like a standard deviation of all possible error scores, the range from the predicted value (y') minus 1 SEest to y' plus 1 SEest is one within which the true value of the criterion variable, y, will occur about 68% of the time. Therefore, if y' = 2.845 and SEest = 0.199, then the range from 2.646 (2.845 − 0.199) to 3.044 (2.845 + 0.199) will include the true grade average for someone who studies 12.5 hours per week 68% of the time. To put it more concisely, with p = 0.68, the value of y is between 2.646 and 3.044.

Capturing the true value only 68% of the time leaves a great deal to chance. When researchers use confidence intervals (CIs) in regression procedures, they more commonly calculate them so that they capture the true value of y 95% or 99% of the time. These are called 0.95 or 0.99 confidence intervals. The process for a 0.95 confidence interval for the grade average and number of hours studied problem is as follows:

Formula 9.5

CI = ±t(SEest) + y'

where

CI = the confidence interval for the regression solution

t = a critical value of t for n − 2 df for p = 0.05 (for a 0.99 confidence interval, it is the value for p = 0.01)

SEest = the standard error of the estimate according to Formula 9.4

y' = the predicted value for the criterion variable

So, for the problem predicting grade average from the number of hours studied, a 0.95 confidence interval will be as follows:

With

SEest = 0.199

t(df = 18) = 2.101

and

y' = 2.845

then

CI = ±t(n − 2)(SEest) + y'

= ±2.101(0.199) + 2.845

= 3.263, 2.427

Try It!: #3

What factors affect the width of a confidence interval?

To be 0.95 confident of having captured the true grade average for someone who studies 12.5 hours per week, the range for possible grades needs to encompass every value from 3.263 down to 2.427. This wide confidence interval stretches from a substantial B to what is ordinarily a C. It is a wider interval than if we were satisfied with 0.90 confidence, and not as wide as if we adopted 0.99 confidence. Several factors affect the width of the confidence interval:

· the level of confidence, as we must note with the difference between 0.90, 0.95, and 0.99

· the sample size, which affects both the amount of variability in y and the critical value of t

· the strength of the correlation
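The interval arithmetic for the study-time problem, and the effect of the confidence level on width, can be sketched as follows. The function name confidence_interval is hypothetical. The critical value 2.101 is the one used in the text; 2.878 (df = 18, p = 0.01) is taken from a standard t table and does not appear in the chapter, so treat it as an assumption.

```python
def confidence_interval(y_pred, se_est, t_crit):
    """Formula 9.5: y' plus or minus t times SEest, returned as (low, high)."""
    margin = t_crit * se_est
    return y_pred - margin, y_pred + margin

y_pred, se_est = 2.845, 0.199

# One SEest either side of y' (the "68%" range described earlier).
low68, high68 = confidence_interval(y_pred, se_est, 1.0)

# 0.95 interval with t(18) = 2.101, as in the text.
low95, high95 = confidence_interval(y_pred, se_est, 2.101)

# 0.99 interval; 2.878 is an assumed table value for df = 18.
low99, high99 = confidence_interval(y_pred, se_est, 2.878)

print(round(low68, 3), round(high68, 3))
print(round(low95, 3), round(high95, 3))
print(round(high99 - low99, 3) > round(high95 - low95, 3))
```

Raising the confidence level raises the critical t and therefore widens the interval, while a smaller SEest narrows it.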

Apply It! Using Regression to Predict Growth

A psychologist is considering whether to purchase a marriage and family therapy practice from someone who is retiring. The prospective purchaser wants to predict the practice’s growth. This sort of procedure is sometimes called trend analysis, but the work involved is regression.

Historically, the psychologist’s practice appears to have grown along with the population of the town, which is currently booming because of the growth of nearby government-research and development-defense industries. To verify the relationship between the number of clients and population growth, the buyer will calculate a correlation to start. If the two variables are significantly correlated, the psychologist can then use the town population to predict growth in the number of clients. Since data from the county office predict that within the next five years the town will grow to 260,000, the psychologist wishes to know how many clients can be expected when the population reaches that projected number.

To pursue all of this, the psychologist gathers data on the town’s population at one-year intervals over the previous 14 years along with the number of clients in the practice for the same years. Table 9.2 lists these data.

Table 9.2: Population and the number of clients

Year	Population	No. of clients
2001	25,780	27
2002	29,580	32
2003	36,500	75
2004	39,870	82
2005	43,580	102
2006	57,800	111
2007	59,000	131
2008	70,000	118
2009	82,000	152
2010	91,000	149
2011	129,000	174
2012	149,000	188
2013	176,000	209
2014	198,000	254

If population is plotted as the x (predictor) value and the number of clients as the y (criterion) value, Figure 9.3 seems to indicate a strong linear relationship between the two variables.

The line through the 14 data points is what Excel calls a “trend line.” For our purposes, it is the regression line. To proceed with a regression solution, the psychologist needs the means and standard deviations for the two variables, the strength of the correlation between x and y, and the values for a, the intercept or regression constant, and b, the slope or the regression coefficient.

Figure 9.3: Number of clients as a function of population growth

Scatterplot with horizontal axis numbered from 0 to 250,000 and vertical axis numbered from 0 to 300. Points along the graph follow a pattern from lower left to upper right, with a straight line drawn through them.

First, for the correlation, we have rxy = 0.943. In comparison, r0.05(12) = 0.532, so the correlation is statistically significant. The means and standard deviations for the two variables are shown in Table 9.3.

Table 9.3: Descriptive statistics for population and the number of clients

	Mean	Standard deviation
Population (x)	84793.571	56466.608
No. of clients (y)	128.214	63.926

The slope of the regression line, or the regression coefficient, is

$b = r_{xy}\left(\frac{s_y}{s_x}\right) = 0.943\left(\frac{63.926}{56466.608}\right) = 0.001$

The intercept, sometimes called the regression constant, is

a = My − bMx

= 128.214 − (0.001 × 84793.571)

= 43.420

Recall that the psychologist’s question was how many clients could be expected if the population of the town grew to 260,000. In the regression equation, then, the value of x is 260,000, so using y' = a + bx results in y' = 43.420 + (0.001)(260,000) = 303.420.

The data suggest that if the population grew to 260,000, the best prediction for number of clients is about 303. To have a sense of how much error there might be in this prediction, the psychologist needs a confidence interval, for which an important part is the standard error of the estimate:

$SE_{est} = s_y\sqrt{1 - r_{xy}^2} = 63.926\sqrt{1 - 0.943^2} = 21.274$

Since the significance of the correlation was tested at p = 0.05 (note the critical value for the correlation), the confidence interval should be at the same level. That will make it a 0.95 confidence interval.

CI0.95 = ±t(SEest) + y'

The value of t for 12 degrees of freedom is 2.179 at p = 0.05, so

CI0.95 = ±t0.05(12) (SEest) + y'

= ±2.179(21.274) + 303.420

= 349.776, 257.064

With 95% confidence, the psychologist can expect somewhere between 257 and 350 clients when the city’s population is 260,000. Although the correlation between the population (x) and the number of clients (y) is strong, the number of clients is so variable (sy = 63.926) that the interval remains fairly wide.
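Because the population values are so large, the slope in this example is a very small number, and rounding it to three decimals before using it affects the prediction noticeably. The sketch below, which uses only the summary statistics from Table 9.3 and a hypothetical helper named predict, compares the prediction made with the rounded slope to one that carries the slope at full precision.

```python
r_xy = 0.943
mean_x, sd_x = 84793.571, 56466.608   # population (x), Table 9.3
mean_y, sd_y = 128.214, 63.926        # number of clients (y), Table 9.3
x_new = 260_000

def predict(slope):
    """Compute a = My - b*Mx, then y' = a + b*x for the projected population."""
    intercept = mean_y - slope * mean_x
    return intercept + slope * x_new

slope_full = r_xy * sd_y / sd_x            # about 0.00107
slope_rounded = round(slope_full, 3)       # 0.001, as in the worked example

pred_rounded = predict(slope_rounded)      # reproduces the text's 303.420
pred_full = predict(slope_full)            # roughly a dozen clients higher

print(round(pred_rounded, 3), round(pred_full, 3))
```

One way to avoid the issue in hand calculations is to express the predictor in larger units (thousands of residents, say) so that the slope keeps more significant digits.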

Apply It! boxes written by Shawn Murphy

Interpreting the Regression Results

The value of the slope, $b = r_{xy}(s_y / s_x)$, the regression coefficient, is a proportion of the ratio of sy to sx. The proportion is determined by the strength of the correlation. It never happens in human-subjects research, but if the correlation between the two variables were perfect (rxy = 1.0), the value of the slope would be one times that ratio of sy to sx. As the correlation diminishes, the slope’s value is a decreasing proportion of that ratio. At rxy = 0.50, for example, the slope’s value is half the ratio of sy to sx.

Try It!: #4

If the correlation between x and y is negative, what happens to the predicted value of y as x increases?

The slope of the regression line need not be a positive value. If the correlation between the predictor and criterion variables is negative, the slope will be negative. This means that as x increases, y declines, something illustrated by a scatterplot in which the points fall from upper left to lower right.

Suppose a criminologist is using incarcerated inmates’ good behavior to predict the length of the inmate’s sentence. As the number of days of good behavior while incarcerated increases, the overall length of the inmate’s sentence decreases. The regression slope for such a problem would be negative, declining from left to right. Negative values for b and slopes that decline from left to right are not unusual.

· If the amount of time students spend on video games is used to predict their grade averages during the first year of college, the correlation between those variables will probably be negative; as time on games increases, grades probably decline.

· If the frequency of substance abuse is used to predict job productivity, the slope is probably negative.

· The number of extramarital affairs is probably negatively correlated with the length of a marriage. The regression line for predicting the length of the marriage would have a negative slope.

Using the first example above, an enterprising graduate student with an interest in predicting students’ grades gathers data on video gaming and grades from 10 randomly selected undergraduates, resulting in Table 9.4.

Table 9.4: Video gaming and grades

Student	Video gaming hours	Grades
1	0	3.9
2	1	3.8
3	1	3.6
4	3	3.6
5	5	3.4
6	5	3.0
7	7	2.9
8	6	2.7
9	4	2.9
10	8	2.5

The scatterplot and the Excel solution for the regression line are represented in Figure 9.4.

Figure 9.4: The relationship between hours in video gaming per day and grades

Scatterplot with horizontal axis numbering 0 to 10, in intervals of 2, and vertical axis numbering 0 to 4.5 in 0.5 increments. Points along the graph follow a pattern from upper left to lower right, with a straight line drawn through them.

Because the correlation is negative, the regression line here declines from left to right; according to these data, more time spent gaming is associated with lower grades.
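The negative slope can be confirmed from the Table 9.4 data. The sketch below computes the correlation, slope, and intercept from sums of squares and cross-products; the slope expression sxy / sxx is algebraically the same as rxy(sy / sx). The variable names are illustrative, and the printed values are not quoted in the chapter, so they are best treated as a check rather than as the book's reported solution.

```python
import math

# Table 9.4: video gaming hours per day (x) and grades (y).
hours = [0, 1, 1, 3, 5, 5, 7, 6, 4, 8]
grades = [3.9, 3.8, 3.6, 3.6, 3.4, 3.0, 2.9, 2.7, 2.9, 2.5]

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(grades) / n

# Sums of squared deviations and of cross-products.
sxx = sum((x - mean_x) ** 2 for x in hours)
syy = sum((y - mean_y) ** 2 for y in grades)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, grades))

r_xy = sxy / math.sqrt(sxx * syy)
slope = sxy / sxx                      # same as r_xy * (s_y / s_x)
intercept = mean_y - slope * mean_x

print(round(r_xy, 3), round(slope, 3), round(intercept, 3))
```

Both the correlation and the slope carry a negative sign, so each additional daily hour of gaming lowers the predicted grade.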

Another Regression Problem

A social worker is responsible for encouraging indigent people to advance their educations so as to improve their living conditions. To demonstrate to clients that schooling affects income, the social worker gathers the data for a group of 12 people as listed in Table 9.5.

Table 9.5: Data for study comparing education and income

Group member	Years of education (x)	Income in thousands (y)
1	10	23.3
2	12	27
3	12	30.5
4	14	34
5	14	45
6	16	55
7	16	57.5
8	16	62
9	16	68
10	18	70
11	18	85
12	18	90

One client who is a high school graduate (12 years of education) asks, “If I attend the community college and complete a two-year certification program, what is a good prediction for my income?” Regression analysis with x = 14 will answer the question.

First, the social worker calculates the correlation:

$r_{xy} = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}}$

$r_{xy} = \frac{12(10{,}319) - (180)(647.3)}{\sqrt{[12(2{,}776) - 180^2][12(40{,}407.39) - 647.3^2]}} = \frac{7{,}314}{\sqrt{(912)(65{,}891.39)}} = 0.944$

Calculating the slope and intercept requires knowing the means and standard deviations. Table 9.6 tabulates these.

Table 9.6: Means and standard deviations for study comparing education and income

	Years of education (x)	Income in thousands (y)
Means	15.0	53.942
Standard deviations	2.629	22.342

The slope (regression coefficient) is then

$b = r_{xy}\left(\frac{s_y}{s_x}\right) = 0.944\left(\frac{22.342}{2.629}\right) = 8.022$

and the intercept (regression constant) is

a = My − bMx

= 53.942 − (8.022)(15.0)

= −66.388

Solving the equation for 14 years of education (12 plus the 2 years of community college education) produces

y' = a + bx

= −66.388 + (8.022)(14)

= 45.92

The best prediction that the social worker can make with these data is that the individual is likely to make about 45.92 thousand dollars ($45,920) with the benefit of the additional schooling. If the social worker wants a confidence interval for that answer, the first step is to determine the standard error of the estimate:

$SE_{est} = s_y\sqrt{1 - r_{xy}^2} = 22.342\sqrt{1 - 0.944^2} = 7.372$

The confidence interval is

CI = ±t(SEest) + y'

= ±2.228(7.372) + 45.92

= 62.345, 29.495
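The whole solution can also be reproduced from the raw data in Table 9.5. The sketch below is one way to do so in plain Python; the function name regression_summary is hypothetical, and the critical value 2.228 (df = 10, p = 0.05) is the one used in the text. Because the code carries full precision rather than rounding r to three decimals, its standard error and interval differ slightly from the hand calculation.

```python
import math

# Table 9.5: years of education (x) and income in thousands (y).
education = [10, 12, 12, 14, 14, 16, 16, 16, 16, 18, 18, 18]
income = [23.3, 27, 30.5, 34, 45, 55, 57.5, 62, 68, 70, 85, 90]

def regression_summary(xs, ys):
    """Return r, slope, intercept, and SEest (Formula 9.4) for one predictor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sd_y = math.sqrt(syy / (n - 1))          # sample standard deviation of y
    se_est = sd_y * math.sqrt(1 - r ** 2)
    return {"r": r, "slope": slope, "intercept": intercept, "se_est": se_est}

fit = regression_summary(education, income)
y_pred = fit["intercept"] + fit["slope"] * 14   # 12 years plus 2 of college

t_crit = 2.228                                  # df = 10, p = 0.05 (from the text)
low = y_pred - t_crit * fit["se_est"]
high = y_pred + t_crit * fit["se_est"]

print(round(fit["r"], 4), round(fit["slope"], 3), round(y_pred, 2))
print(round(fit["se_est"], 2), round(low, 2), round(high, 2))
```

The predicted income agrees with the hand solution to the nearest ten dollars, and the interval endpoints move by less than a hundred dollars.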

Try It!: #5

If additional data in a regression problem produce a higher correlation between x and y, what will be the impact on a confidence interval for the solution?

Although the correlation between education and income is very high (r = 0.944), other factors affect the value of the standard error of the estimate and therefore the confidence interval. One is the variability in each of the predictor and criterion variables. The other is the probability level involved, which will be discussed later. With so much variability in income (it ranges from $23,300 to $90,000 for this sample), the standard error of the estimate is quite large, and the confidence interval is not precise. For this data set, a CI0.95 suggests that the community college program will result in an income somewhere between $29,495 and $62,345.

Apply It! Intelligence and Working Memory

Although psychometric intelligence and working memory represent separate traditions in the study of intelligence, research indicates they are linked (see, for example, Thomas, Rammsayer, Schweizer, & Troche, 2015). Working memory is often evaluated in terms of the number of separate bits of data an individual can accurately retrieve after a brief exposure. As such, even with a short-form test of psychometric intelligence, working memory data are much easier to collect than traditional intelligence scores. A psychologist reasons that if the correlation between working memory and psychometric intelligence is statistically significant and substantial, working memory data can be used to predict intelligence scores. Table 9.7 presents data on both variables for 10 subjects.

Table 9.7: Short-term memory (STM) and intelligence

No. of items retrieved (STM)	Psychometric intelligence
6	105
4	95
5	100
7	105
7	110
6	100
8	115
6	100
6	95
5	90

After collecting the data, the psychologist encounters someone who has STM = 10, a full two points better than anyone in the sample. What is the best prediction for intelligence if STM = 10?

Table 9.8 shows the calculated means and standard deviations.

Table 9.8: Descriptive statistics for short-term memory and intelligence

	No. of items (x)	Intelligence score (y)
Means	6.0	101.50
Standard deviations	1.155	7.472

Using the Pearson Correlation formula gives rxy = 0.837.

Since the critical value for r0.05(8) = 0.632, the psychologist can be confident that the relationship between short-term memory ability and intelligence scores is not random. A regression solution based on this relationship is appropriate. For a regression solution for STM = 10:

The slope (regression coefficient) is

$b = r_{xy}\left(\frac{s_y}{s_x}\right) = 0.837\left(\frac{7.472}{1.155}\right) = 5.415$

The intercept (regression constant) is

a = My − bMx

= 101.50 − 5.415(6.0)

= 69.010

Solving the equation for STM = 10 results in:

y' = a + bx

= 69.010 + 5.415(10)

= 123.160

Given the data from the 10 people in the sample, the best prediction of a psychometric intelligence score for someone who has a short-term memory capacity of 10 is about 123 points. This solution raises an important issue related to regression. Although the STM = 10 is 1.667 times the mean of 6.0 for STM (10/6 = 1.667), the predicted intelligence score is only 1.213 times the mean for all intelligence scores. Put more succinctly, the criterion value is less extreme than the predictor. This concept, called regression to the mean, is characteristic of linear regression solutions; whenever the correlation is less than perfect, an extreme predictor will predict a less extreme criterion value. Consider the nature of normal distributions, and the reason becomes clear. Most of the individuals in any normal distribution occur to the right of extremely low values and to the left of extremely high values in the distribution. Regression solutions reflect this characteristic.
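Regression to the mean is perhaps easiest to see when both scores are expressed in standard deviation units (z scores), a framing the passage above does not use but which follows from the slope formula b = rxy(sy / sx). In that form the predicted z score for y equals rxy times the z score for x, so it is always closer to the mean whenever the correlation is less than perfect. The sketch uses the Table 9.8 statistics; the variable names are illustrative.

```python
r_xy = 0.837
mean_x, sd_x = 6.0, 1.155      # items retrieved (STM), Table 9.8
mean_y, sd_y = 101.50, 7.472   # intelligence score, Table 9.8

slope = r_xy * sd_y / sd_x
intercept = mean_y - slope * mean_x

x_new = 10
y_pred = intercept + slope * x_new

# Distance from the mean in standard deviation units.
z_x = (x_new - mean_x) / sd_x        # how extreme the predictor is
z_pred = (y_pred - mean_y) / sd_y    # how extreme the prediction is

print(round(z_x, 3), round(z_pred, 3), round(r_xy * z_x, 3))
```

An STM score of 10 lies about 3.5 standard deviations above its mean, while the predicted intelligence score lies only about 2.9 standard deviations above its mean.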

To calculate a confidence interval for the solution, the psychologist first determines the standard error of the estimate:

$SE_{est} = s_y\sqrt{1 - r_{xy}^2} = 7.472\sqrt{1 - 0.837^2} = 4.089$

Then, with t = 2.306 as the critical value for n − 2 = 8 df at p = 0.05, the confidence interval is

CI = ±t(SEest) + y'

= ±2.306(4.089) + 123.160

= 132.589, 113.731

With 0.95 probability, the intelligence score for someone who has a short-term memory score of 10 is somewhere between 113.731 and 132.589.

As always with confidence intervals, the width of the interval is a function of the size of the standard error of the estimate, the amount of variability in the data, and finally, the level of probability at which it is calculated. Ordinarily the confidence interval is calculated for the same level as that at which the test was conducted. When we test at α = 0.05 we have established the probability of a type I (alpha) error; in other words, 5% of the time, what appears to be a statistically significant finding will be a random outcome. The corresponding level of probability for a confidence interval is p = 0.95, which is to suggest that 5% of the time, the actual value of the criterion variable will be either above or below the calculated interval.

9.3 Regression with Excel

The regression procedure which is part of Excel for Windows provides useful output. Using the problem predicting the income from education, the procedure is as follows:

1. Arrange the data in columns, A for years of education and B for income. Enter the labels “years” and “income” in cells A1 and B1, respectively.

2. Select the Data tab at the top of the page and then Data Analysis at the far right just below the page tabs.

3. In the Analysis Tools window, scroll down to Regression and click OK.

4. Click the Input Y Range window and drag the cursor from B2 to B13.

5. Click the Input X Range window and drag the cursor from A2 to A13.

6. Click the Output Range window and enter something like A15 so that the output does not overwrite the data. Click OK.

The result is shown in Figure 9.5.

Figure 9.5: Using Excel to predict income from years of education

Screen capture of an Excel worksheet with data from the survey of people comparing income and years of education completed

Source: Microsoft Excel. Used with permission from Microsoft.

Column A has been expanded in Figure 9.5 to make it easier to read the output. Regression statistics in Excel illustrate the following:

· The same correlation value is calculated in the manual solution, although Excel calls it “Multiple R,” which is actually the name for correlation procedures involving more than one predictor.

· The R Square value is the square of the Pearson correlation, indicating the proportion of variance in y explained by x.

· The Adjusted R Square is the R Square value adjusted downward to offset the risk of overstating the relationship in small samples (which this is).

· The Standard Error value is the standard error of the estimate, but using the Adjusted R Square value.

· The number of Observations is n = 12.

· The ANOVA table tests the probability that the relationship between x and y occurred by chance. The entry 4.12E-06 is a shorthand way of writing 4.12 × 10⁻⁶, that is, 4.12 with the decimal point moved 6 places to the left, so that p = 0.00000412 that the x − y relationship is random.

· The last table in Figure 9.5 provides the regression solution. The intercept and slope values are similar to the manual calculations, with differences attributable to differences in rounding. The standard error values for a and b were not calculated manually, nor were the significance tests or confidence intervals for those individual values. The significance tests are redundant because in bivariate least-squares regression, if rxy is statistically significant, x is also a significant predictor of y.

Excel for Mac does not include a regression procedure. It does, however, include correlation and the other descriptive statistics. It will also produce a scatterplot with a “trend line,” which is like the regression line in ordinary least-squares regression.
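The relationship among R Square, Adjusted R Square, and the Standard Error entry can be sketched with the usual adjusted R-squared formula, 1 − (1 − R²)(n − 1)/(n − k − 1), where k is the number of predictors. That formula is a standard one and is not stated in the chapter, so it is an assumption here, as is the function name. The inputs are the rounded hand-calculated values, so the results will not match the Excel output in Figure 9.5 digit for digit.

```python
import math

def adjusted_r_squared(r_squared, n, k=1):
    """Standard adjustment for sample size n and k predictors."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

r_xy, sd_y, n = 0.944, 22.342, 12     # education and income example

r_squared = r_xy ** 2
adj_r2 = adjusted_r_squared(r_squared, n)

# Standard error of the estimate based on the adjusted value.
se_from_adj = sd_y * math.sqrt(1 - adj_r2)

# Scientific notation as Excel prints it: E-06 means "times 10 to the -6".
p_value = float("4.12E-06")

print(round(r_squared, 4), round(adj_r2, 4), round(se_from_adj, 3))
print(p_value == 0.00000412)
```

The adjusted value is a little smaller than R Square, which makes the resulting standard error a little larger than the 7.372 obtained with Formula 9.4.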

Shrinkage and Overfitting the Sample

Try It!: #6

What does shrinkage mean in regression, and how can it be avoided?

When samples are small and correlations are relatively weak, bear in mind the potential for prediction error. Even when confidence intervals are narrow, regression solutions carry potential risks. By necessity, the solution is based on the available data. This is not a problem as long as the existing data set represents the population of all such data reasonably well; no sample, however, can exactly emulate a population, and small samples involve particular risks. When a regression solution fits a sample but not the population, the problem is overfitting the sample. Another term to describe this characteristic is shrinkage: the degree to which the accuracy of a regression solution is diminished when used with other data from the same population. In the earlier example where the number of hours in daily video gaming was used to predict grades, the sample of 10 students may not be a good representation of the population of all undergraduates. But the solution is (by necessity) based on those 10 students, who were all the graduate student/researcher had available. A larger sample might provide a different correlation between the two variables. A larger sample might also provide less variability in either the x or y variable, which is typical as sample sizes grow. With either change, the solution based on 10 students would be inaccurate for the population; the solution would be overfitted to the sample.
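Shrinkage can be illustrated with a small simulation. Everything in the sketch below is hypothetical: the "population" line (intercept 3.9, slope −0.15), the noise level, the sample sizes, and the helper names are assumptions chosen only to resemble the gaming example. A line is fitted to a sample of 10, and its average squared prediction error is then measured both on that sample and on fresh data from the same population.

```python
import random

def fit_line(xs, ys):
    """Least-squares intercept and slope for one predictor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def mean_squared_error(xs, ys, intercept, slope):
    """Average squared difference between actual and predicted y."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

def draw_sample(rng, n, intercept=3.9, slope=-0.15, noise_sd=0.3):
    """Draw n (x, y) pairs from a hypothetical population."""
    xs = [rng.uniform(0, 8) for _ in range(n)]
    ys = [intercept + slope * x + rng.gauss(0, noise_sd) for x in xs]
    return xs, ys

rng = random.Random(0)                      # fixed seed so the run is repeatable
sample_x, sample_y = draw_sample(rng, 10)   # the small sample used for fitting
new_x, new_y = draw_sample(rng, 1000)       # other data from the same population

a, b = fit_line(sample_x, sample_y)
mse_sample = mean_squared_error(sample_x, sample_y, a, b)
mse_new = mean_squared_error(new_x, new_y, a, b)

# Error of the true population line on the same small sample, for comparison.
mse_sample_true_line = mean_squared_error(sample_x, sample_y, 3.9, -0.15)

print(round(mse_sample, 4), round(mse_new, 4), round(mse_sample_true_line, 4))
```

By construction, the fitted line can never do worse on its own sample than any other line, including the true population line; on new data it typically does worse than it did on the sample, and that gap is the shrinkage.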

The Requirements for Ordinary Least-Squares Regression

Many different regression procedures are possible, with each adapted to a different set of circumstances. Bivariate, ordinary least-squares regression requires the following:

· The variables involved must be interval or ratio scale.

· The variables must be normally distributed in their populations.

· The predictor and criterion variables must show statistically significant correlation.

· The relationship between the variables must be linear.

· The data must have similar amounts of variability throughout their ranges.

9.4 A Conceptual Introduction to Multiple Regression

To this point, the chapter has confined its discussion to what is called “simple” or “bivariate” regression. It involves one predictor (x) and one criterion variable (y). Multiple regression, on the other hand, uses similar logic but employs more than one predictor. If multiple variables are correlated with a criterion variable, and they are not too highly correlated with each other, multiple predictors can often provide a more precise prediction of y than a single predictor. Here is the multiple regression equation with two predictors:

y' = a + b1x1 + b2x2

The intercept or constant value, a, is defined as the value of y when both x values equal zero. There are two b values, two slopes, one for each of the predictor variables, x1 and x2. Each indicates how much y changes when the particular x value increases by 1.0, and the value of the other x value is held constant.

Recall how using short-term memory data predicted psychometric intelligence. Suppose someone modifies that problem so that short-term memory (STM; now x1) and problem-solving ability (prob.solv.; x2) are both used to predict psychometric intelligence (y). The intercept would indicate the psychometric intelligence value if both STM and prob.solv. = 0. The b1 value would indicate how psychometric intelligence changes if STM increases by 1.0, and problem-solving ability is unchanged. The b2 value would indicate how much psychometric intelligence changes if prob.solv. increases by 1.0 and STM is unchanged.

Holding one predictor unchanged as the other increases controls redundancy between predictors. When two predictors are correlated with a criterion value, they also tend to be correlated with each other, and an accurate prediction requires determining what each predictor reveals about y that is unique.

As a footnote to this overview, note that not all regression procedures are based on a linear relationship between x (or the xs) and y. Bivariate and the multiple regression mentioned above are based on that requirement, but the relationship between x and y is not always linear. Indeed, the x and y variables are not always measured on an interval or ratio scale. Those instances involve still other regression procedures, procedures beyond the scope of an introductory text.
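A two-predictor solution can be sketched without any statistics library by solving the least-squares equations directly. The data below are entirely hypothetical: the intelligence scores are constructed to follow y = 60 + 5(STM) + 1(prob.solv.) exactly, so that a correct solution must recover those three numbers. The function name and the scores themselves are assumptions for illustration, not values from the chapter.

```python
def fit_two_predictors(x1, x2, y):
    """Least-squares a, b1, b2 for y' = a + b1*x1 + b2*x2."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    # Sums of squares and cross-products of deviations from the means.
    s11 = sum((u - m1) ** 2 for u in x1)
    s22 = sum((v - m2) ** 2 for v in x2)
    s12 = sum((u - m1) * (v - m2) for u, v in zip(x1, x2))
    s1y = sum((u - m1) * (w - my) for u, w in zip(x1, y))
    s2y = sum((v - m2) * (w - my) for v, w in zip(x2, y))
    # Solve the two simultaneous equations; det is 0 if the predictors are redundant.
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    a = my - b1 * m1 - b2 * m2
    return a, b1, b2

# Hypothetical scores, for illustration only.
stm = [4, 5, 6, 6, 7, 8]
prob_solv = [10, 14, 11, 15, 13, 16]
intelligence = [60 + 5 * s + 1 * p for s, p in zip(stm, prob_solv)]

a, b1, b2 = fit_two_predictors(stm, prob_solv, intelligence)
print(round(a, 3), round(b1, 3), round(b2, 3))
```

The s12 term is what adjusts each slope for the overlap between the two predictors; if the predictors were uncorrelated (s12 = 0), each slope would reduce to its bivariate value.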

Writing Up Statistics

Although this chapter has focused on simple regression with one predictor variable, human complexity makes explaining behavior with one variable very difficult. Consequently, published research more commonly uses multiple regression. Bai, Lai, Lee, Chang, and Chiou (2015) used several variables to explain fatigue among patients receiving dialysis because of failed kidneys. The research literature indicated that age, length of employment, amount of physical activity, amount of medication taken, and degree of depression were all factors in dialysis patients’ fatigue. Gathering data for 193 patients, the researchers developed a regression model that explained 64.2% of the variance in fatigue. Such a model would allow the researchers to make quite accurate predictions of the level of fatigue, given the levels of these other variables.

Summary and Resources

Chapter Summary

The correlation coefficient is an elegant statistic. Whenever separate measures have some quality in common, correlations indicate the strength of the relationship between them. Regression procedures capitalize on this by using what is contained in one measure to predict the probable level of the other measure (Objective 1). Because prediction is a part of all science and of virtually every social domain as well, regression has remarkably wide application. When variables are related, but one is more difficult to measure than the other, the more accessible variable can be used to predict the more elusive variable (Objective 3).

Many types of regression exist. Bivariate regression has one predictor variable and one variable predicted, the criterion variable. The math in least-squares regression, or ordinary least-squares regression, produces a solution that minimizes the sum of the squared errors from a series of predictions. The regression line is a visual representation of the relationship between the variables. It allows the prediction of y from x, and because the regression makes no assumption about which variable is the cause, it also predicts x from y, when that is helpful (Objective 2).

The regression line is a best fit given the available data, but because correlations between the predictor and criterion variable are never perfect in human subjects research, some prediction error is always likely. The standard error of the estimate is an average measure of that error and, when used in a confidence interval, indicates how large the interval must be around the predicted value of y, called “y prime” (y'), to capture with confidence the true value of y.

When regression solutions are based on a data set not representative of the population, they can be “overfitted” to the sample. The evidence of overfitting is a solution that predicts less well for other data sets drawn from the same population. The reduction in the utility of the regression solution is sometimes referred to as shrinkage.

Bivariate regression employs one predictor variable to estimate the value of a criterion. The same principles used here for bivariate regression can be applied to multiple regression. For that procedure, the information about a criterion contained in multiple predictor variables is used to calculate multiple regression coefficients which are combined to estimate the value of the criterion variable (Objective 4).

Key Terms

criterion variable

intercept

least-squares criterion

ordinary least-squares regression

overfitting the sample

predictor variable

regression coefficient

regression line

regression to the mean

residual scores

shrinkage

simple or bivariate regression

slope

standard error of the estimate

Review Questions

Answers to the odd-numbered questions are provided in Appendix A.

The table immediately below is an economical way to show the correlations among several variables. Using the different correlations listed across the top line and also down the first column, we can determine the correlation coefficient between any two variables. The correlation between probsolv (problem solving) and analytic (analytical ability), for example, is determined by moving down the left column for one variable, across the top for the other variable, and then finding the value at the point where the two meet. At the intersection of probsolv down the left and analytic across the top, the value is 0.726. The correlation of probsolv and analytic is r = 0.726.

Use the following information to answer questions 1–6.

Correlation Matrix*

	probsolv	analytic	comprehen	reasoning	computat	vocab
probsolv	1.000	0.726	0.833	0.598	0.919	0.714
analytic	0.726	1.000	0.767	0.857	0.734	0.894
comprehen	0.833	0.767	1.000	0.686	0.736	0.740
reasoning	0.598	0.857	0.686	1.000	0.534	0.852
computat	0.919	0.734	0.736	0.534	1.000	0.675
vocab	0.714	0.894	0.740	0.852	0.675	1.000

*All correlations are statistically significant.
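The row-and-column lookup described for the correlation matrix can be sketched in Python. The variable labels and values are copied from the matrix above; the helper name `correlation` is a hypothetical choice made for this illustration.

```python
# Variable labels in the order they appear in the correlation matrix.
labels = ["probsolv", "analytic", "comprehen", "reasoning", "computat", "vocab"]

# Each inner list is one row of the matrix, in the same variable order.
matrix = [
    [1.000, 0.726, 0.833, 0.598, 0.919, 0.714],
    [0.726, 1.000, 0.767, 0.857, 0.734, 0.894],
    [0.833, 0.767, 1.000, 0.686, 0.736, 0.740],
    [0.598, 0.857, 0.686, 1.000, 0.534, 0.852],
    [0.919, 0.734, 0.736, 0.534, 1.000, 0.675],
    [0.714, 0.894, 0.740, 0.852, 0.675, 1.000],
]


def correlation(row_var, col_var):
    """Move down the left column to row_var, then across the top to col_var."""
    return matrix[labels.index(row_var)][labels.index(col_var)]


r = correlation("probsolv", "analytic")
print(f"r(probsolv, analytic) = {r}")
```

Because the matrix is symmetric, `correlation("analytic", "probsolv")` returns the same value; it does not matter which variable is read from the rows and which from the columns.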

Descriptive Statistics

Test	Mean	Standard deviation
Problem solving	43.000	8.441
Analytic	46.500	9.317
Comprehension	46.500	8.893
Reasoning	48.000	6.144
Computation	52.750	7.502
Vocabulary	54.850	5.250

1. Noting the correlation between problem solving and the computation score, what computation score can be predicted for a student whose problem-solving score is 49?

a. How much will the computation score increase for every 1.0 increase in problem solving?

b. What value will computation have if problem solving is 0?

c. In terms of regression solutions, why is the value of computation relevant when problem solving is 0?

2. What is the standard error of the estimate for the Question 1 solution?

3. Calculate a 0.99 confidence interval for the Question 1 solution. Assume n = 52.

a. What is the confidence interval expected to contain?

b. On average, how often will the assumption referred to in Question 3a be wrong?

c. What could a researcher do to shrink the confidence interval?

4. Referring to the matrix at the beginning of the Review Questions, what variable will provide the best prediction of comprehension scores? Explain.

5. From the matrix at the beginning of the Review Questions, what vocabulary score is predicted for someone who has an analytic score of 57.5?

6. What comprehension score is predicted for someone who has a reasoning score of 60?

7. The data below indicate the number of times subjects are reinforced for solving each problem and the number of problems correctly solved in the class period.

Reinforcements	Number of problems solved
2	3
5	6
4	5
1	3
3	4
5	7
4	4
6	9

a. What is the correlation between reinforcement and response rates?

b. What is the best prediction for number of responses for someone who has been reinforced 5 times?

c. Assume n = 52 and determine a 0.95 confidence interval for the Question 7b solution.

d. What would the confidence interval be if the correlation were only rxy = 0.7?

8. If two sets of data are uncorrelated, what is the best prediction for the value of y?

9. What impact does a negative correlation between x and y have on the slope of the regression line?

10. What factors determine error in a regression prediction?

Answers to Try It! Questions

1. The size of the correlation matters because the larger it is, the more the two variables have in common, and the more accurately the value of one can be predicted from the value of the other.

2. A weak correlation is indicated by extensive scatter among the points in a scatterplot.

3. The factors in the width, or size, of a confidence interval are the strength of the correlation, the variability in the criterion variable, the sample size, and the level of confidence.

4. A negative correlation is reflected in a slope that declines from left to right in the graph. The value of b, the regression coefficient, will be negative.

5. A higher correlation between x and y results in a narrower confidence interval for the solution. A higher correlation results in more precision.

6. In regression, shrinkage means that a regression solution does not fit subsequent data sets as well as it fits the sample for which it was initially calculated. The best way to avoid shrinkage is to ensure that the sample reflects the characteristics of the population. This means large, randomly selected samples.

Nominal Data and the Chi-Square Tests

Chapter Learning Objectives

After reading this chapter, you should be able to do the following:

1. Describe nominal data.

2. Complete and explain the chi-square goodness-of-fit test.

3. Complete and explain the chi-square test of independence.

Introduction

When an important development in statistical analysis took place in the early part of the 20th century, more often than not Karl Pearson was associated with it. As the text previously noted, many of those who made important contributions were members of the department that Pearson founded at University College London. Those who gravitated to Pearson’s department included William Sealy Gosset, who developed the t tests; R. A. Fisher, who developed analysis of variance; and Charles Spearman, who did the early work on factor analysis. Although social relations among these men were not always harmonious, they were enormously productive scholars, and this was particularly true of Pearson. Besides the correlation coefficient named for him, Pearson developed an analytical approach related to Spearman’s factor analysis called principal components analysis, as well as the procedures that are the subjects of this chapter, the chi-square tests. (The Greek letter chi [χ] is pronounced “kye” and rhymes with sky. Chi is the Greek equivalent of the letter c, rather than the letter x, which it resembles.)

10.1 Nominal Data

With the exception of Spearman’s rho in Chapter 8, Chapters 1 through 9 have focused on procedures designed for interval or ratio data. Sometimes, however, the data are neither interval scale nor the ordinal-scale data that Spearman’s rho accommodates. When the data are nominal scale, researchers often use one of the chi-square tests.

Because our focus has been so much on interval- and ratio-scale data, it might be helpful to review what makes data nominal scale. Nominal data either fit a category or do not, which is why they are sometimes referred to as “categorical data.” Because of this presence-or-absence quality, analyses of nominal data are based on counting how frequently they occur, and for that reason they are also called “count data.” Compared to ratio, interval, and even ordinal data, nominal data provide relatively little information. They reveal only the presence or absence of a characteristic, not how much of the characteristic, or how the individual’s possession of the characteristic compares to others in the category. To illustrate: when people are classified according to whether they are

1. left-handed or right-handed, or

2. Buddhist, Jewish, Muslim, or

3. African American, Hispanic, or Native American, or

4. blue-eyed or brown-eyed, or

5. introverted or extroverted,

then the resulting data are nominal scale.

Parameters and Tests for Nominal Data

Because data of different scales provide different kinds of information, the statistical procedures used in their analyses are tailored accordingly. Because nominal data concern frequency, the related analytical procedures (in this instance, the chi-square tests) are based on how many individuals are in a particular category. To put it simply, the measurement procedure for chi-square is counting.

Recall from Chapter 8 that tests for nominal data are nonparametric tests. The “no parameters” element means that employing these tests does not obligate the researcher to meet most of the traditional parameters, or requirements, for statistical tests. The t tests and ANOVA, for example, require that the dependent variable be normally distributed in its population. The Pearson correlation and ordinary least-squares regression upon which it is based (Chapter 9) also require that the x and y variables be normally distributed. Like Spearman’s rho (Chapter 8), which is also a nonparametric test, the chi-square tests set normality and homogeneity requirements aside; they are “distribution free” tests. However, in the statistical equivalent of no such thing as a free lunch, all of this analytical flexibility has a cost. The chi-square’s drawback has to do with the power of the test, which the chapter will later discuss.

When working with nominal data, most of the descriptive statistics used to this point are irrelevant. As the most frequently occurring value, the mode, of course, can still be calculated, but the means and medians to which we compared the mode in order to determine skew require at least interval data. Nominal data offer no standard deviation or range values to examine to evaluate kurtosis. It is just as well that the chi-square tests are nonparametric since most of the values needed to determine normality are unavailable in any case.

10.2 The Chi-Square Tests

This chapter explains two chi-square tests. The analysis in both tests is based on comparing the frequency (count) with which something actually occurs to the frequency with which it is expected to occur.

The first test is called the 1 × k (“one by kay”), or the goodness-of-fit chi-square test. Like the one-way ANOVA with its single independent variable, this test accommodates just one variable, but that one variable can have any number of categories greater than one. For instance, a psychologist could analyze whether those participating in court-ordered group therapy sessions for drug addiction represent some vocations more than others. In that case, the variable is vocation. It can have any number of manifestations (clerical workers, laborers, the unemployed, educators, and so on), but the only variable is vocation.

The second chi-square test the chapter takes up is called the r × k (“are by kay”), or the chi-square test of independence. This test accommodates two variables. Each of the two variables can be further divided into any number of categories. A researcher might be interested in whether marital status (single never-married, married, divorced) is related to graduating on time among university students (graduated within four years, did not graduate within four years).

The Goodness-of-Fit or 1 × k Chi-Square Test

This test asks whether an outcome is different enough from an initial hypothesis that researchers should conclude that the difference is not likely to have occurred by chance. The focus on whether an outcome might be expected to have occurred by chance makes the 1 × k like all significance tests. The important difference is that it accommodates a nominal-scale dependent variable. For some illustrations of problems that might involve the 1 × k chi-square, consider the following:

[Photo: Do women and men pursue psychology majors in equal numbers? A 1 × k chi-square test will provide an answer.]

Those responsible for recruitment in the university’s college of social sciences wonder whether opting for a psychology major relates to the potential students’ gender. The variable is the gender of the student, with two categories: female and male. The research question is whether, in a randomly selected group of psychology majors, male or female students occur with significantly different frequencies.

This problem is similar to an independent groups t test in that it has two independent categories. The difference is that the chi-square test asks whether the count or frequency with which subjects occur in each category strays significantly from a predetermined hypothesis, rather than whether the groups’ means, which nominal data cannot provide, are significantly different from each other.

Try It!: #1

How many variables will the 1 × k chi-square accommodate?

In a second example, a military psychologist wants to know whether recruits represent urban, suburban, semi-rural, and rural backgrounds in similar proportions. The psychologist selects a random sample of 50 recent recruits and determines their demographic origins. The variable is the population characteristics of the recruits’ origins. In the absence of information to the contrary, the researcher’s hypothesis is probably that recruits come from different areas of the country in equal proportions. If the psychologist determines that twice as many people live in suburban areas as in semi-rural areas, however, perhaps the corresponding hypothesis is that recruits from suburban areas will be twice as numerous as those from semi-rural areas. The psychologist might also hypothesize that patriotism, which may affect the individual’s desire to join the military, runs higher in rural than in urban populations, so that the expectation is that rural recruits will occur in greater proportions than those from urban environments. With multiple groups represented in this hypothetical problem, it bears some similarity to a one-way ANOVA, but without any sums of squares to analyze.

Without wishing to belabor the point, the independent t test and analysis of variance divide subjects into two or more categories, with each category characterized by a different level, or manifestation, of the independent variable. The study analyzes how the different levels affect some other variable, the dependent variable. The chi-square similarly has two or more categories, but it analyzes the frequency with which individuals are distributed into those different categories.

Observed and Expected Frequencies

To restate our approach, then, the measurement involved in chi-square analysis is simply counting. Researchers who use this analysis are interested in the frequency with which something occurs in a category. More specifically, rather than comparing sample means to population means, or sample means to each other, chi-square examines differences between the frequency with which individuals occur in a particular category (symbolized by fo), and the frequency with which they were expected to occur (symbolized by fe).

The fe and fo values are simply the number of observations in each category; they are frequency counts. When the expected number varies sufficiently from the observed number, the result is statistically significant.

The Chi-Square Test Statistic

The test statistic for the chi-square test is as follows:

Formula 10.1

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

where

χ² = the value of the chi-square statistic

fo ($f_o$) = the frequency observed in the particular category

fe ($f_e$) = the frequency expected in the particular category

Studying the test statistic for the chi-square test is quite revealing. To calculate the value of this statistic, start with these steps:

1. Count the number in each category (fo).

2. Determine the number expected in each category (fe). When the assumption is that all categories are equal, this will be the total number of subjects divided by the number of categories.

3. As a quick check before continuing, note that the sum of the fe categories must equal the sum of the fo categories. Then, perform the following mathematical operations:

a. Subtract fe from fo.

b. Square the difference.

c. Divide the squared difference by fe.

d. Sum the squared differences divided by fe across the categories.

e. Compare the result to the critical value of chi-square for degrees of freedom equal to the number of categories minus 1. (The critical values of chi-square appear in Table 10.2.)
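The steps above translate almost line for line into code. The following Python sketch is one way to write them; the function name `chi_square_goodness_of_fit` and the example counts are hypothetical and are not taken from the chapter.

```python
def chi_square_goodness_of_fit(observed, expected=None):
    """Return (chi-square, degrees of freedom) for one nominal variable.

    observed: list of frequency-observed (fo) counts, one per category.
    expected: list of frequency-expected (fe) values; if omitted, the
              categories are assumed equal, so each fe is n / k.
    """
    n = sum(observed)
    k = len(observed)
    if expected is None:
        expected = [n / k] * k               # step 2: equal categories
    if abs(sum(expected) - n) > 1e-9:        # step 3: quick check on the sums
        raise ValueError("fe values must sum to the same total as fo values")

    # Subtract, square, divide by fe, and sum across the categories.
    chi_square = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))
    df = k - 1                               # number of categories minus 1
    return chi_square, df


# Hypothetical counts for three categories (fe = 60 / 3 = 20 in each).
chi_square, df = chi_square_goodness_of_fit([10, 20, 30])
print(f"chi-square = {chi_square:.2f}, df = {df}")
```

Because every difference is squared before it is divided by a positive fe, no term in the sum can be negative, so the statistic itself can never be negative.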

A Goodness-of-Fit (1 × k Chi-Square) Problem

Using the ethnic diversity of voters as an example, a psychologist who has examined voting patterns and ethnicity perhaps wishes to test the assumption that voting in a general election is unrelated to voters’ ethnic group membership. On election day, the psychologist journeys to a polling place in an ethnically diverse part of the city and administers a brief survey to those who have just voted. One question concerns the respondents’ ethnic group. Figure 10.1 shows the data for the 18 people who completed the survey.

Figure 10.1: Voter participation data

Bar graph showing results of the voter survey with ethnic groups A, B, C, and D across the horizontal axis and numbered 0 to 9 along the vertical axis to indicate the number of respondents.

Although the calculations are not difficult, determining the value of chi-square involves some arithmetic. An easy way to keep track of the calculations is to arrange the data into a table like Table 10.1. The rows are numbered to be consistent with the numbered steps listed after Formula 10.1 for calculating the chi-square statistic. The results from the survey are the frequency-observed values in the first line of the table. The frequency-expected values are n divided by the number of categories: 18 ÷ 4 = 4.50. That value indicates that if the ethnic group membership of the voters in this group is exactly equivalent, 4.50 of the respondents will declare for each group. Do not let the .50 value in each fe distract you. Although the fo numbers have no chance of any such value, that fe value is the same for all groups; the issue is whether the fo − fe differences are significantly different from category to category.

Table 10.1: A goodness-of-fit chi-square problem for voting patterns

Value	Ethnic group A	Ethnic group B	Ethnic group C	Ethnic group D
1. fo	5	3	2	8
2. fe	4.50	4.50	4.50	4.50
3a. fo − fe	0.50	−1.50	−2.50	3.50
3b. (fo − fe)²	0.25	2.25	6.25	12.25
3c. (fo − fe)² / fe	0.06	0.50	1.39	2.72
3d. χ² = ∑ (fo − fe)² / fe = 0.06 + 0.50 + 1.39 + 2.72 = 4.67

Determining Significance

For this problem, the value of chi-square is χ² = 4.67. Having calculated the statistic, the researcher needs something with which to compare it, a critical value, and, as with other tests, the critical value is indexed to degrees of freedom for the problem.

Try It!: #2

Why can chi-square values never be negative?

· The degrees of freedom for a goodness-of-fit problem are the number of categories in the problem, minus 1.

· With subjects in the voting participation problem divided into four different ethnic groups, there are 4 − 1 = 3 df.

The critical values for chi-square in Table 10.2 (also Table B.7 in Appendix B) are arranged by degrees of freedom down the left side, and the level at which the test is conducted across the top.

Table 10.2: The critical values of chi-squared

df	p = 0.05	p = 0.01	p = 0.001
1	3.84	6.64	10.83
2	5.99	9.21	13.82
3	7.82	11.35	16.27
4	9.49	13.28	18.47
5	11.07	15.09	20.52
6	12.59	16.81	22.46
7	14.07	18.48	24.32
8	15.51	20.09	26.13
9	16.92	21.67	27.88
10	18.31	23.21	29.59
11	19.68	24.73	31.26
12	21.03	26.22	32.91
13	22.36	27.69	34.53
14	23.69	29.14	36.12
15	25.00	30.58	37.70
16	26.30	32.00	39.25
17	27.59	33.41	40.79
18	28.87	34.81	42.31
19	30.14	36.19	43.82
20	31.41	37.57	45.32
21	32.67	38.93	46.80
22	33.92	40.29	48.27
23	35.17	41.64	49.73
24	36.42	42.98	51.18
25	37.65	44.31	52.62
26	38.89	45.64	54.05
27	40.11	46.96	55.48
28	41.34	48.28	56.89
29	42.56	49.59	58.30
30	43.77	50.89	59.70

Source: Virginia Tech, Quantitative Population Ecology. (n.d.). Table of chi-square statistics. Retrieved from https://web.archive.org/web/20150930232540/http://alexei.nfshost.com/PopEcol/tables/chisq.html

To keep the size of the table manageable, the values are carried to just two decimals. For consistency, the final values of chi-square will also be rounded to two decimal places.

The critical value for a chi-square problem with df = 3 and p = 0.05 is 7.82. To distinguish the calculated value of chi-square from the critical value, follow the same pattern adopted for the other tests. First, calculate the value from the test results:

χ² = 4.67 for the calculated value

This value is compared to the critical value, which is indicated by the subscripts for the level of probability of alpha error for the test (0.05) and its degrees of freedom.

$\chi^2_{0.05(3)} = 7.82$

With a calculated value less than the critical value from the table, the differences in the ethnicity of the voters in these four groups are not statistically significant; the researcher attributes the differences to chance. That may seem like a strange conclusion when the differences in the fo values are so substantial. The explanation goes back to the heart of what a goodness-of-fit test is designed to analyze. Pearson focused not on the differences (in this case) between ethnic groups, but on the differences between what was observed and what could be expected to occur if the initial hypothesis is valid. The comparison is not how ethnic group C compares to ethnic group D, for example, but how the fo and fe values within each category differ. The difference in ethnic group C between 2 (fo) and 4.5 (fe) is a different matter than the difference between 2 (ethnic group C) and 8 (ethnic group D). The result indicates that across the four groups, the difference between fo and fe does not vary enough for the result to be significant. This much difference could have occurred by chance.
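The decision for the voting problem can be retraced in a few lines of Python. This is only a sketch: the observed counts come from Table 10.1, the critical values are the first four p = 0.05 entries of Table 10.2, and the variable names are hypothetical.

```python
# Frequency-observed counts for ethnic groups A-D (Table 10.1).
observed = [5, 3, 2, 8]
n = sum(observed)
k = len(observed)
expected = [n / k] * k          # equal-frequency hypothesis: 18 / 4 = 4.5

chi_square = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))
df = k - 1                      # four categories, so 3 degrees of freedom

# Critical values at p = 0.05 for df 1-4, copied from Table 10.2.
CRITICAL_05 = {1: 3.84, 2: 5.99, 3: 7.82, 4: 9.49}
critical = CRITICAL_05[df]

# H0 is rejected only when the calculated value exceeds the critical value.
reject_null = chi_square > critical

print(f"chi-square = {chi_square:.2f}, df = {df}, critical value = {critical}")
print("reject H0" if reject_null else "fail to reject H0")
```

The calculated value falls short of 7.82, so the sketch reports the same decision as the text: the differences between fo and fe are attributed to chance.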

The Hypotheses in a Goodness-of-Fit Test

Consistent with the other tests of significant differences (z, t, F), the null hypothesis in the chi-square tests is the hypothesis of no difference, symbolized by H0: fo = fe. However, as the symbols indicate, the difference between what is observed (fo) and what is expected (fe) is what is at issue. When the fo and fe for a particular category show an approximate equivalence, we fail to reject the null hypothesis. The alternate hypothesis is that what is observed is significantly different from the expected, HA: fo ≠ fe. Literally, the frequency observed does not equal the frequency expected.

In the case of ethnic-group voting behavior, the statistical decision is to fail to reject H0. The differences between what was observed and what was expected across the four groups were not great enough to be statistically significant.

A Goodness-of-Fit Problem with Nonequivalent Frequencies Expected

In the ethnicity and voting problem, the researcher tested the assumption that what could be expected (fe) did not differ from group to group among the four ethnic groups. However, researchers do not always assume equivalent fe values. When the hypothesis is that the frequencies will vary from category to category, the different fe values must be calculated for the categories in order to reflect the hypothesis.

Perhaps a psychologist working with the military observes that service personnel exposed to combat situations for more than six months appear to experience post-traumatic stress disorder (PTSD) about three times more frequently than those with less than six months of combat exposure. To test this hypothesis, the fe values will need to indicate the different expectations. Gathering data for a group of service personnel, the psychologist has the following:

Of 429 service personnel, 154 were exposed to combat situations for less than six months and the other 275 had six months or more of combat exposure.

Those 154 and 275 numbers indicate the fo values for the problem. As always with chi-square problems, the fe values must sum to the same 429 value, but the fe numbers must also reflect the 3-to-1 hypothesis. To determine the fe values, follow these steps:

1. Take the ratio, 3 to 1 in this example.

2. Add the elements of the ratio together: 3 + 1 = 4.

3. Divide the total number of subjects, n, by the sum of the ratio elements: 429 ÷ 4 = 107.25.

The fe value for those exposed to combat situations for less than six months will be 1 × 107.25 = 107.25. The fe value for those exposed to combat situations for six months or more will be 3 × 107.25 = 321.75.
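The three steps for unequal expected frequencies can be sketched as a small Python helper. The name `expected_from_ratio` is hypothetical; the inputs are the 429 service personnel and the 3-to-1 hypothesis from the text, with the ratio listed in the same order as the categories (less than six months first).

```python
def expected_from_ratio(n, ratio):
    """Split n subjects into fe values proportional to the ratio elements."""
    total_parts = sum(ratio)          # step 2: add the ratio elements (3 + 1 = 4)
    unit = n / total_parts            # step 3: subjects per ratio element
    return [part * unit for part in ratio]


# Less than six months : six months or more = 1 : 3
expected = expected_from_ratio(429, [1, 3])
print(expected)
```

The two fe values sum to 429, the same total as the fo values, which is the quick check required before calculating chi-square.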

The balance of the problem involves the same procedure used in Table 10.1 except that there are only two categories. The problem is completed in Table 10.3.

Table 10.3: A goodness-of-fit chi-square with unequal frequencies

Value	Combat experience: less than 6 months	Combat experience: 6 months or more
fo	154	275
fe	107.25	321.75
fo − fe	46.75	−46.75
(fo − fe)²	2185.56	2185.56
(fo − fe)² / fe	20.38	6.79
χ² = ∑ (fo − fe)² / fe = 20.38 + 6.79 = 27.17

Note that the null hypothesis reflects the assumption that there will be no difference between what was expected and what was observed. In this particular problem, the hypothesis is that fo ≠ fe. What the psychologist expected was a PTSD rate that was three times higher among service personnel who had been exposed to combat situations for six months or more than among personnel who had less than six months of exposure. The value calculated is χ² = 27.17, and the associated critical value from the table for p = 0.05 and one degree of freedom is $\chi^2_{0.05(1)} = 3.84$. With a calculated value of chi-square higher than the critical value from the table, the result is statistically significant. What does that mean in terms of what the psychologist expected to occur? It means that among these 429 service personnel, H0: fo = fe must be rejected. The rate of PTSD is not three times higher among those with six months or more of combat exposure. Just examining the fo values indicates that the PTSD rate is about double for those with six months or more of combat experience compared to those with less than six months of exposure. The psychologist’s expectation does not hold for these personnel.

The Chi-Square and Statistical Power

Before a chi-square result is significant, the difference between what is expected and what actually occurs must be substantial. Nominal data cannot match the sophistication of ratio, interval, or even ordinal data, because the data used in a chi-square problem reflect only frequency. They do not contain the information that can indicate the subtle differences in measured qualities that data of the other scales reflect. The analytical price paid for relying exclusively on nominal data is power. Recall that in statistical terms, power refers to the probability of detecting significance.

Users of distribution-free tests like chi-square gain great flexibility. They need not make any judgments about normality or linearity, but as the chapter earlier stated, such flexibility comes at a cost. The flexibility’s ever-present companion is an increased probability of a type II error. The failure to detect significance is more likely with these distribution-free tests than with the procedures in the earlier chapters. The departures from the fo = fe assumption must be quite extreme before they can be chalked up to anything except sampling variability. That was the situation in the first problem on voter turnout, when it appeared that there were substantial differences in the voting behavior of people of different ethnic backgrounds, but they were nonsignificant nevertheless.

Remember that type I and II errors are related, however. When the likelihood of failing to detect a statistically significant difference is higher than usual (a type II error), the probability of finding a significant difference in error (a type I error) is correspondingly reduced. Although the chi-square tests have a relatively high incidence of type II error, at least the probability of type I error is lower than with many of the parametric alternatives.

These characteristics notwithstanding, the loss of power from using a chi-square test is often a nonissue. If the data are nominal scale to begin with, researchers can make no decision about what kind of test to use; their only choice is to use one that accommodates nominal data. Power becomes an issue when data are ordinal scale or higher but some requirement such as normality is suspect. In that case the analyst must make a decision about the best course: rely on a traditional parametric test in spite of suspect normality, or adopt a nonparametric test with relaxed requirements but also consequent loss of power.

Try It!: #3

The risk of which type of decision error increases with chi-square problems?

For example, suppose that the voting-survey researcher also asked respondents how many times in the last 15 years they had participated in elections. The researcher may have intended to use analysis of variance to determine whether ethnic-group differences exist in the level of participation in elections. However, 18 respondents divided among four ethnic groups is a very small sample size. The two people in ethnic group C provide little basis for completing an ANOVA; the sample is simply too small. With groups so small, just one or two extremely low or extremely high scores will skew results, making normality an issue. In such a case, a shift to a nonparametric test like the goodness-of-fit test, where neither the normality of the data nor the sample size is central, is likely to be more appropriate.

10.3 The Chi-Square Test of Independence

Both of the chi-square problems we have worked in this chapter have been goodness-of-fit (1 × k) tests. Like all goodness-of-fit tests, the first problem involved just one variable, although it was divided into four categories to reflect the ethnicity of the voter. The second problem’s one variable (the incidence of post-traumatic stress disorder among service personnel) was divided into two categories: those deployed to combat situations for less than six months and those deployed for six months or more. The goodness-of-fit test works well for any number of data categories related to a single nominal-scale variable.

Sometimes the question is more complex. Maybe the question involves the ethnicity of the respondent and whether the individual voted in the last election. Or perhaps the PTSD problem looks at the incidence among service personnel of different deployment periods and whether the service personnel were men or women. Both of those examples involve two variables. In any statistical analysis, researchers add variables to be able to explain the scoring variability more completely. Although z, t, and one-way ANOVA procedures are extremely important, they, like the goodness-of-fit test, are all restricted to a single independent variable. Relatively few outcomes, particularly related to human subjects, can be adequately explained by a single variable. People are too complicated.

Both the chi-square tests in this chapter compare what is observed to what is expected, but in the goodness-of-fit test, fo to fe differences test a hypothesis about frequencies in categories. The chi-square test of independence uses the fo to fe differences to test whether the two variables being examined, as the name suggests, operate independently of each other. This second chi-square test is also known as the r × k chi-square for reasons that will become clear below.

The Hypotheses in the Chi-Square Test of Independence

The null and alternate hypotheses look the same as they do in the 1 × k:

· H0: fo = fe

· HA: fo ≠ fe

The hypotheses are reminders that the problem seeks to resolve how the frequencies observed compare to the frequencies expected. As before, H0 is rejected for calculated values of chi-square that are larger than the table value. However, in an r × k chi-square problem, the null hypothesis also indicates that the two variables are unrelated: the frequency with which one variable occurs does not affect the frequency of the other. If the null hypothesis is rejected (indicating that the two variables are related), the analysis has another step: determining the strength of the relationship between the variables, as the following example demonstrates.

A Chi-Square Test of Independence Problem

Let us return to the ethnicity and voting behavior problem. The researcher now decides to expand the study to gain a more comprehensive view of how ethnicity and the tendency to vote might be related. With a list of registered voters in hand, the researcher sends out several questionnaires asking, among other things, the individual’s ethnicity and whether the person voted in the last national election. With 36 responses, the researcher has gathered the following data:

Ethnic Group A: Of the 12 respondents, 8 voted

Ethnic Group B: Of the 8 respondents, 2 voted

Ethnic Group C: Of the 8 respondents, 3 voted

Ethnic Group D: Of the 8 respondents, 7 voted

The Contingency Table

In this two-variable chi-square test, a table called a contingency table helps to keep the data organized. The subsets of one variable are reflected in the rows of the table (the r in the r × k), and the subsets of the other variable are listed in the table columns or categories (the k in the r × k). Table 10.4, an example of a contingency table, shows the breakdown of ethnicity and voting behavior data results.

Table 10.4: Contingency table

Ethnic group	Voted in last election: Yes	Voted in last election: No	Total number of respondents
A	8 (a)	4 (b)	12
B	2 (c)	6 (d)	8
C	3 (e)	5 (f)	8
D	7 (g)	1 (h)	8
Totals	20	16	36

The subject’s ethnicity is indicated in the rows, which end with a row for column totals. The columns indicate how many voted and how many did not, as well as the total number in each ethnic group. Each of the 8 cells is identified with a letter, which the researcher will use to calculate the chi-square value. Cell a, for example, indicates that eight of the people in ethnic group A voted.

Calculating the Frequency-Expected Values, fe

As it was with the 1 × k chi-square test, the frequency-observed (fo) values reflect what actually occurred. The frequencies expected (fe) are calculated differently than they were in the one-variable test, however. Because each value reflects the influence of two variables (each cell in the contingency table is at the intersection of a row and a column), a researcher cannot just divide the number of subjects by the number of cells and use the same fe value for each cell. The fact that the two variables might have a different impact on some combinations than on others disallows such an approach. The fe value must reflect the impact that both variables have on the outcome in each combination. The fe value for each cell in the r × k chi-square test is calculated this way:

The fe value for a particular cell is the row total for that cell times the column total for that cell, divided by the total number of subjects.

The fe value for cell a, for example, is the row total for cell a (12) times the total for the column in which cell a is found (20), divided by the total number of subjects (36): (12 × 20) ÷ 36 = 6.67.

The fe calculations for cells b through h follow:

b: (12 × 16) ÷ 36 = 5.33

c: (8 × 20) ÷ 36 = 4.44

d: (8 × 16) ÷ 36 = 3.56

e: (8 × 20) ÷ 36 = 4.44

f: (8 × 16) ÷ 36 = 3.56

g: (8 × 20) ÷ 36 = 4.44

h: (8 × 16) ÷ 36 = 3.56
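
The row-total-times-column-total rule can also be sketched in a few lines of Python. This is only an illustration of the calculation above; the names `observed` and `expected_frequencies` are assumptions introduced here, not part of the text.

```python
# Observed counts (fo) from Table 10.4.
# Rows: ethnic groups A-D; columns: voted yes, voted no.
observed = [
    [8, 4],  # A
    [2, 6],  # B
    [3, 5],  # C
    [7, 1],  # D
]


def expected_frequencies(table):
    """Return fe = (row total x column total) / n for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


expected = expected_frequencies(observed)
for group, row in zip("ABCD", expected):
    print(group, [round(fe, 2) for fe in row])
```

Rounded to two decimals, the values match cells a through h above: 6.67 and 5.33 for group A, and 4.44 and 3.56 for each of groups B, C, and D.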

Using the frequency-observed values in the cells of the contingency table and the calculated frequency-expected values, the researcher can create the same table used in the goodness-of-fit problems earlier:

For each of the eight cells,

1. subtract fe from fo,

2. square the difference,

3. divide the squared difference by fe, and

4. sum the results from each of the cells, which is the value of chi-square.

Table 10.5 completes the ethnicity and voting behavior problem.

Table 10.5: The chi-square test of independence: Ethnicity and voting behavior

Value	a	b	c	d	e	f	g	h
fo	8.00	4.00	2.00	6.00	3.00	5.00	7.00	1.00
fe	6.67	5.33	4.44	3.56	4.44	3.56	4.44	3.56
fo − fe	1.33	−1.33	−2.44	2.44	−1.44	1.44	2.56	−2.56
(fo − fe)²	1.77	1.77	5.95	5.95	2.07	2.07	6.55	6.55
(fo − fe)² / fe	0.27	0.33	1.34	1.67	0.47	0.58	1.48	1.84
$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 7.98$
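
As a cross-check on the table, the four steps can be sketched in Python. The function name `chi_square` is an assumption made for this illustration. Because the code carries full precision rather than two decimals, its result (about 7.99) differs slightly from the longhand 7.98.

```python
observed = [[8, 4], [2, 6], [3, 5], [7, 1]]  # fo values from Table 10.4


def chi_square(table):
    """Sum (fo - fe)^2 / fe over every cell of a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, fo in enumerate(row):
            fe = row_totals[i] * col_totals[j] / n  # expected frequency
            total += (fo - fe) ** 2 / fe            # steps 1-3 for this cell
    return total                                    # step 4: the sum


chi2 = chi_square(observed)
print(f"chi-square = {chi2:.4f}")
```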

Degrees of Freedom in the Chi-Square Test of Independence

For a chi-square test of independence, the number of degrees of freedom is determined by the number of categories of one variable, minus one, times the number of categories in the other variable, minus one. For this problem, which has four rows and two columns in the contingency table, the number of degrees of freedom is (4 − 1) × (2 − 1) = 3.

From the table for critical values of chi-square (Table 10.2), the value for 3 degrees of freedom and testing for alpha error at p = 0.05 is $\chi^2_{0.05}(3) = 7.82$.
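
The degrees-of-freedom rule and the comparison with the table value can be sketched as follows. The dictionary holds only the three p = 0.05 critical values quoted in this chapter; the names `CRITICAL_05`, `degrees_of_freedom`, and `reject_null` are illustrative assumptions.

```python
# Critical values at p = 0.05 as quoted in this chapter, keyed by degrees of freedom.
CRITICAL_05 = {3: 7.82, 4: 9.49, 8: 15.51}


def degrees_of_freedom(n_rows, n_cols):
    """(rows - 1) x (columns - 1) for an r x k contingency table."""
    return (n_rows - 1) * (n_cols - 1)


def reject_null(chi2, n_rows, n_cols):
    """True when the calculated value meets or exceeds the table value."""
    df = degrees_of_freedom(n_rows, n_cols)
    return chi2 >= CRITICAL_05[df]


print(degrees_of_freedom(4, 2))  # 3
print(reject_null(7.98, 4, 2))   # True
```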

Interpreting the r × k Result

Try It!: #4

How many categories in either variable can the r × k chi-square accommodate?

By conducting the chi-square test of independence, the researcher is asking, “Is ethnicity related to whether the individual votes in a national election?” As with the first test, Pearson compared what actually occurs in a particular situation (fo) to what can be expected, but with the test of independence, what is expected is based on the hypothesis that the variables involved are unrelated, uncorrelated. The null hypothesis for this test is based on that uncorrelated hypothesis, so the fe values are calculated to indicate what to expect when the variables are independent of each other. The substantial variations of fo from fe prompt larger values of chi-square. If the variations between fo and fe are great enough that they meet or exceed the critical value, the statistical decision is to reject the null hypothesis and conclude that the variables are not independent of each other; they are correlated.

The psychologist’s data on ethnicity and voting behavior produced a calculated value of chi-square (7.98) that exceeds the critical value from Table 10.2 for p = 0.05 and three degrees of freedom (7.82). It is statistically significant. The lack of independence indicates that voting behavior for some ethnic groups is different than it is for those of other ethnic groups.

Classifying the r × k Test

Earlier chapters organized statistical tests according to whether they addressed the hypothesis of difference or the hypothesis of association. Tests like z, t, and ANOVA (F) are analyses of significant differences between samples and populations, or differences between samples. The Pearson and Spearman correlation procedures quantified the strength of the relationship between two variables; they addressed the hypothesis of association. The chi-square test of independence does not fit this either-or classification. The researcher initially questioned whether there are significant differences in voting behavior among the different ethnic groups, which makes the r × k sound a lot like an ANOVA. But the analysis is based on whether ethnicity and voting behavior are related, a question that makes the test more of a correlation analysis. The r × k test addresses both of those main hypotheses. It straddles the ground between the hypotheses of difference and association.

Phi Coefficient and Cramér’s V

Because the researcher’s results indicate that ethnicity and voting behavior are not independent, a supplementary question follows: How related are the two variables? This is a correlation question; whenever the r × k chi-square value is statistically significant, the strength of the relationship must be determined.

Research uses several correlation procedures for nominal data. Depending upon circumstances, the correlation procedure we’ll use is either the phi coefficient (symbolized by the Greek letter phi, φ) or a variation of the phi coefficient called Cramér’s V. The phi coefficient is appropriate when at least one of the variables has only two levels. In the ethnicity and voting behavior problem, the individual voted or did not; these are the only two categories. If both variables have three or more levels, the researcher must choose Cramér’s V.

Both Cramér’s V and phi coefficient employ the previously determined value of chi-square in their calculations, which means that most of the work has been done before the strength-of-the-correlation question is resolved. In addition, the correlation values require no separate significance tests. Since these values are calculated only when the chi-square value is statistically significant, any related φ or V coefficient will likewise be significant. For the same reason, the phi coefficient and Cramér’s V may also be considered effect sizes. Like all significance tests, the initial r × k result indicates only whether the chi-square value is or is not statistically significant. Phi and Cramér’s V provide information about the magnitude of the relationship.

The formula for the phi coefficient follows:

Formula 10.2

$\phi = \sqrt{\frac{\chi^2}{n}}$

where

χ² = the value already available from the r × k problem

n = the total number of observations (the sum of all the fo values)

The formula for Cramér’s V follows:

Formula 10.3

$V = \sqrt{\frac{\chi^2}{n(k-1)}}$

where

χ² = the value already available from the r × k problem

n = the total number of observations

k − 1 = the number of rows or columns—whichever is less—minus 1

When there are only two levels of either or both of the variables, then V = φ because the denominator in V would be n × 1.

For the ethnicity and voting behavior problem, χ² = 7.98, and it is statistically significant. The phi coefficient is

$\phi = \sqrt{\frac{\chi^2}{n}} = \sqrt{\frac{7.98}{36}} = 0.47$

Note that the chi-square value is symbolized by χ². Sometimes students note the exponent and proceed to square the value, 7.98 in this case. Keep in mind that the exponent refers to the way the value was initially calculated, (fo − fe)², and that to square it again as part of the phi or Cramér’s V coefficient would be an error.

The phi coefficient value is interpreted like any other correlation value, except for two things: (1) A researcher does not need to check for significance because that was already established for the chi-square value upon which it is based. (2) Because the coefficient is reached by calculating a square root, phi can never be negative. Correlations closer to 1.0 are stronger; values closer to 0 are weaker. The correlation here between ethnicity and the tendency to vote might be described as a moderate correlation.
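
A minimal Python sketch of Formula 10.2, using the chi-square value and sample size from the ethnicity and voting problem. The function name `phi_coefficient` is an assumption made for this illustration.

```python
import math


def phi_coefficient(chi2, n):
    """phi = sqrt(chi-square / n); chi2 is used as-is, never squared again."""
    return math.sqrt(chi2 / n)


phi = phi_coefficient(7.98, 36)
print(round(phi, 2))  # 0.47
```

Because the result is a square root, the function can only return zero or a positive value, which mirrors the second point above.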

Apply It! Chi-Square Test for Locating Group Homes

Because working with the mentally ill in group homes rather than in secure facilities is more economical and in some cases more effective, a group of mental health professionals has decided to gauge whether public opinion favors outpatient treatment of mental health inmates convicted of nonviolent crimes. The group assumes that there will be differences of opinion in rural, semi-rural, suburban, and urban locations, since some of those locations are more likely to house group homes than others. Researchers sent 3,000 questionnaires to people in different locations, asking residents whether they favor releasing certain classifications of the mentally ill to the custody of group homes for treatment and rehabilitation into society. The health professionals received 1,434 completed questionnaires. The null hypothesis (H0: fo = fe) holds that whether people favor group homes rather than assignment to a secure facility will be unrelated to where they live. The alternate hypothesis (HA: fo ≠ fe) is that the two variables are related. Table 10.6 summarizes the survey data results.

Table 10.6: Questionnaire results concerning release of mentally ill to group homes

Type of neighborhood	Favor group homes	Totals
	Yes	No
Urban	248	128	376
Suburban	212	143	355
Small town	198	95	293
Semi-rural	33	67	100
Rural	133	177	310
Totals	824	610	1434

Because this 5 × 2 chi-square involves so many repetitious calculations, this would be a good problem to complete using Excel. The related calculations are demonstrated below, but in the meantime, note that χ² = 75.47. By comparison, the critical value is $\chi^2_{0.05}(4) = 9.49$. With a statistically significant value of chi-square, the psychologists reject the null hypothesis and proceed to calculate phi:

$\phi = \sqrt{\frac{\chi^2}{n}}$

Inserting the values for χ² and for n, they discover that

$\phi = \sqrt{\frac{75.47}{1434}} = 0.23$

Researchers initially chose to break down responses by area of residence because they assumed that there might be differences of opinion about the value of establishing group homes depending upon the likelihood of one near respondents’ places of residence. The initial r × k results support this explanation. The two variables (where one lives and whether one supports group homes) do not operate independently. However, the phi coefficient value makes it clear that the relationship is not a particularly strong one. A value of φ = 0.23 suggests that where one lives and whether one supports group homes as an alternative to secure institutions are only modestly related.
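
For readers who prefer a script to a spreadsheet, the group-home analysis can be sketched in Python from the counts in Table 10.6. The variable names are assumptions made for this illustration; the arithmetic is the same fe, chi-square, and phi procedure described in this chapter.

```python
import math

# Table 10.6: rows are neighborhood types, columns are (favor, do not favor).
survey = {
    "Urban": (248, 128),
    "Suburban": (212, 143),
    "Small town": (198, 95),
    "Semi-rural": (33, 67),
    "Rural": (133, 177),
}

rows = list(survey.values())
row_totals = [sum(r) for r in rows]
col_totals = [sum(c) for c in zip(*rows)]
n = sum(row_totals)

# Sum (fo - fe)^2 / fe over all ten cells, with fe = row total x column total / n.
chi2 = sum(
    (fo - rt * ct / n) ** 2 / (rt * ct / n)
    for r, rt in zip(rows, row_totals)
    for fo, ct in zip(r, col_totals)
)
phi = math.sqrt(chi2 / n)
print(f"n = {n}, chi-square = {chi2:.2f}, phi = {phi:.2f}")
```

The result should agree with the values reported above to within rounding.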

Apply It! boxes written by Shawn Murphy

A 3 × 3 Problem

An earthquake and an accompanying tsunami have wreaked havoc in a particular country. A psychologist is analyzing the appeal of three different outreach programs (A, B, and C) to people who describe their losses as primarily material, primarily emotional, or a combination. The three programs offer counseling regarding insurance and financial recovery (A), therapy focused on coping with change and the loss of loved ones (B), and therapy that includes elements of both financial recovery and emotional disruption (C). The data are as follows:

· Among 8 people describing their losses as primarily material, 6 opt for A, 2 for B, and 0 for C.

· Among 6 people describing their losses as primarily emotional, 0 opt for A, 2 for B, and 4 for C.

· Among 6 people describing their losses as a combination, 0 opt for A, 5 opt for B, and 1 for C.

The question is whether people who have different kinds of losses choose to be involved in different kinds of counseling programs. Table 10.7 depicts the contingency table for these data.

Table 10.8 shows the solution for this problem.

Table 10.7: Contingency table for a 3 × 3 problem

Type of Loss	Program	Totals
	A	B	C
Material	6a	2b	0c	8
Emotional	0d	2e	4f	6
Material and Emotional	0g	5h	1i	6
Totals	6	9	5	20

Table 10.8: Another r × k problem: Outreach programs and the type of loss suffered

Value	a	b	c	d	e	f	g	h	i
fo	6.00	2.00	0.00	0.00	2.00	4.00	0.00	5.00	1.00
fe	2.40	3.60	2.00	1.80	2.70	1.50	1.80	2.70	1.50
fo − fe	3.60	−1.60	−2.00	−1.80	−0.70	2.50	−1.80	2.30	−0.50
(fo − fe)²	12.96	2.56	4.00	3.24	0.49	6.25	3.24	5.29	0.25
(fo − fe)² / fe	5.40	0.71	2.00	1.80	0.18	4.17	1.80	1.96	0.17
$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 18.19$

Completing the calculations produces a chi-square value of χ² = 18.19. With four degrees of freedom, (3 − 1) × (3 − 1), the critical value for testing for alpha error at p = 0.05 is $\chi^2_{0.05}(4) = 9.49$. Because 18.19 exceeds 9.49, the result is statistically significant: the type of loss the individual suffered is related to the kind of counseling the individual chooses. Because both variables have more than two levels, researchers use Cramér’s V as the correlation procedure to determine the strength of the relationship between type of loss and kind of counseling.

With n = 20 and the number of either the rows or columns = 3, k − 1 = 2. Cramér’s V is calculated as follows:

$V = \sqrt{\frac{\chi^2}{n(k-1)}} = \sqrt{\frac{18.19}{20(2)}} = 0.67$

The r × k result indicated that there are significant differences between what was observed and what would have been expected had there been no relationship between the type of counseling program and the kind of loss victims suffered. Cramér’s V quantified the strength of that relationship. Besides being statistically significant (which the significant χ² value establishes), the strength of the relationship appears to be substantial.
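
Formula 10.3 can be sketched in Python as follows. The function name `cramers_v` and its parameters are assumptions made for this illustration; k is taken as the smaller of the number of rows and columns, as defined above.

```python
import math


def cramers_v(chi2, n, n_rows, n_cols):
    """V = sqrt(chi-square / (n * (k - 1))), k = smaller of rows and columns."""
    k = min(n_rows, n_cols)
    return math.sqrt(chi2 / (n * (k - 1)))


v = cramers_v(18.19, 20, 3, 3)
print(round(v, 2))  # 0.67

# With a two-level variable, k - 1 = 1 and V reduces to phi.
assert cramers_v(7.98, 36, 4, 2) == math.sqrt(7.98 / 36)
```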

Apply It! Public Policy Research

Psychologists associated with a particular national psychological association are interested in how people view the insanity defense in criminal trials. Their research is based on anecdotal data that suggest that the viability of an insanity defense has something to do with the region of the country in which the respondent lives. The psychologists develop a questionnaire on mental health issues where one of the items is in Likert-type form and is posed as follows:

At certain times, circumstances make otherwise rational people unaccountable for their actions.

The response choices are the following:

· disagree

· neither agree nor disagree

· agree

The researchers solicit responses from five different areas of the country identified as these:

· the Northeast

· the South

· the Midwest

· the Northwest

· the West

This is a 5 (regions) × 3 (type of response) chi-square test of independence. The question the procedure will answer is whether where the respondent lives is independent of the respondent’s response to this particular item. The null hypothesis associated with such a question is that the two are unrelated, indicated by no statistically significant differences between fo and fe. Results for the survey are shown in Table 10.9.

Table 10.9: Agreement with insanity defense according to region

Region	Disagree	Agree	Strongly Agree	Totals
Northwest	1	4	13	18
Northeast	2	5	9	16
Midwest	10	4	2	16
West	11	5	1	17
South	13	3	0	16
Totals	37	21	25	83

Having completed the analysis, the psychologists find that

χ² = 42.04.

The number of degrees of freedom is (5 − 1) × (3 − 1) = 8, so

$\chi^2_{0.05}(8) = 15.51$.

The calculated value of chi-square is larger than the critical value of chi-square from Table 10.2, indexed by the probability of a type I error and the degrees of freedom. For these data at least, where respondents live does relate to how they answer the item about accountability. To determine the strength of the relationship between the region of residence and the response type, the psychologists will need Cramér’s V, since both variables have more than two levels.

In this case, the response type (agree, neither agree . . .) represents the variable with the fewest number of levels (3), so k − 1 = 2, and n = the total number of responses (83). Therefore,

$V = \sqrt{\frac{\chi^2}{n(k-1)}} = \sqrt{\frac{42.04}{83 \times 2}} = 0.50$

This is another moderate correlation. Psychologists who are called as expert witnesses in trials involving an insanity defense will have a more difficult time convincing the jury in some areas of the country than in others.
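
The whole sequence for this problem (chi-square, degrees of freedom, Cramér’s V) can be sketched as one Python function applied to the counts in Table 10.9. The function name `independence_test` is an assumption made for this illustration. Because the code does not round intermediate values, its chi-square may differ from the reported 42.04 in the second decimal.

```python
import math

# Table 10.9: rows are regions; columns are the three response categories.
responses = [
    [1, 4, 13],   # Northwest
    [2, 5, 9],    # Northeast
    [10, 4, 2],   # Midwest
    [11, 5, 1],   # West
    [13, 3, 0],   # South
]


def independence_test(table):
    """Return (chi-square, degrees of freedom, Cramér's V) for an r x k table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, fo in enumerate(row):
            fe = row_totals[i] * col_totals[j] / n
            chi2 += (fo - fe) ** 2 / fe
    n_rows, n_cols = len(table), len(table[0])
    df = (n_rows - 1) * (n_cols - 1)
    k = min(n_rows, n_cols)
    v = math.sqrt(chi2 / (n * (k - 1)))
    return chi2, df, v


chi2, df, v = independence_test(responses)
print(f"chi-square = {chi2:.2f}, df = {df}, V = {v:.2f}")
```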

10.4 Completing the r × k with Excel

The Excel statistical package focuses on analyses for interval and ratio data. Excel offers no procedure for either of the chi-square tests, but they are not difficult to set up and do not require elaborate programming. To illustrate, here is how to complete the second r × k problem in Excel. To produce Figure 10.2, follow these steps:

1. Leave cell A1 blank, and then label cells B1–F1 as follows:

B1: fo

C1: fe

D1: fo − fe

E1: (fo − fe)²

F1: (fo − fe)²/fe

Figure 10.2: Completing the chi-square test of independence using Excel

Screen capture of an Excel worksheet using data from the outreach-program survey

Source: Microsoft Excel. Used with permission from Microsoft.

2. Beginning in cell A2 and continuing to cell A10, enter the labels, as shown in Figure 10.2.

3. In column B under “fo,” enter the fo values for cells a through i from the contingency table.

4. Leave cell A11 blank, and then beginning in cell A12, enter the label “row 1 =.” The label “row 2 =” will go in cell A13.

5. Starting in cell B12, enter the formula for determining the sum of row 1 (as if you were adding cells a, b, and c in the contingency table): =sum(B2:B4).

· For the row 2 total in cell B13, the command will be =sum(B5:B7). Click Enter.

· For the row 3 total in cell B14 the command will be =sum(B8:B10). Click Enter.

· For the column 1 total in cell B15 the command will be =sum(B2,B5,B8). Click Enter.

· For the column 2 total in cell B16 the command will be =sum(B3,B6,B9). Click Enter.

· For the column 3 total in cell B17 the command will be =sum(B4,B7,B10). Click Enter.

6. In cell C2, enter the formula for determining the fe value for what is cell a in the contingency table, with the formula =(B12*B15)/20. For cells b–i in the contingency table, the commands are the following:

· In cell C3, enter the command =(B12*B16)/20.

· In cell C4, enter the command =(B12*B17)/20.

· In cell C5, enter the command =(B13*B15)/20.

· In cell C6, enter the command =(B13*B16)/20.

· In cell C7, enter the command =(B13*B17)/20.

· In cell C8, enter the command =(B14*B15)/20.

· In cell C9, enter the command =(B14*B16)/20.

· In cell C10, enter the command =(B14*B17)/20.

7. In cell D2, enter the command =B2-C2.

· Repeat this command for cells D3–D10 by placing the cursor on the result in cell D2 and dragging the cursor down to cell D10 so that all the cells from D2 to D10 are highlighted.

· From the Home tab, select the small down arrow to the right of the larger down arrow at the top of the page on the extreme right just under the summation (∑) sign, and then select Down.

8. In cell E2, enter the command =D2^2, which will enter the square of the cell D2 contents in cell E2.

· Repeat this command for cells E3–E10 by placing the cursor on the result in cell E2 and dragging it down to cell E10 so that all the cells from E2 to E10 are highlighted, and then fill down as in step 7.

9. In cell F2, enter the command =E2/C2.

· Repeat this command for cells F3–F10 by placing the cursor on the result in cell F2 and dragging it down to cell F10 so that all the cells from F2 to F10 are highlighted.

· From the Home tab, select the small down arrow to the right of the larger down arrow at the top of the page on the extreme right just below the summation (∑) sign, then select Down.

10. Sum the values from cells F2 to F10 by entering the command =sum(F2:F10) in cell F11. The result is the chi-square value, which may modestly differ from a longhand calculation because Excel carries several more decimals than the two in the longhand solution.
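
For comparison, the same worksheet layout can be mirrored in a short Python sketch: one list per worksheet column, built in the order of steps 3 through 10. The list names are assumptions made for this illustration. As with Excel, the unrounded result (about 18.185) differs slightly from the longhand 18.19.

```python
# fo values for cells a-i (worksheet column B), read row by row from Table 10.7.
fo = [6, 2, 0, 0, 2, 4, 0, 5, 1]
n_cols = 3

# Step 5: row totals (B12:B14) and column totals (B15:B17).
row_totals = [sum(fo[i:i + n_cols]) for i in range(0, len(fo), n_cols)]
col_totals = [sum(fo[j::n_cols]) for j in range(n_cols)]
n = sum(fo)  # 20, the constant divisor used in step 6

# Step 6: fe (column C) = row total x column total / n.
fe = [row_totals[i // n_cols] * col_totals[i % n_cols] / n for i in range(len(fo))]
diff = [o - e for o, e in zip(fo, fe)]        # step 7, column D
squared = [d ** 2 for d in diff]              # step 8, column E
ratio = [s / e for s, e in zip(squared, fe)]  # step 9, column F
chi2 = sum(ratio)                             # step 10, cell F11

for cell, *values in zip("abcdefghi", fo, fe, diff, squared, ratio):
    print(cell, *(f"{v:7.2f}" for v in values))
print(f"chi-square = {chi2:.4f}")
```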

Writing Up Statistics

Habersham (2014) used the chi-square test of independence to study the relationship between college persistence and student characteristics such as dual enrollment (concurrent college and high school enrollment), gender, and ethnicity. Of the three variables, only gender was significantly related to college persistence. The students’ persistence in post-secondary studies was dependent upon neither their prior experience with dual enrollment nor their ethnicity. Interestingly, the author does not report the strength of the relationship between gender and persistence.

Summary and Resources

Chapter Summary

Nineteenth-century British prime minister and apparent skeptic Benjamin Disraeli observed that what we expect usually does not occur and what we do not anticipate typically happens. Karl Pearson, who happened to be one of Disraeli’s contemporaries, would have disagreed. In fact, Pearson’s chi-square tests are based on the assumption that under normal circumstances, it is the expected thing that happens, and when the outcome differs significantly from what is expected, that is noteworthy.

The chi-square tests are the only procedures in this book that are designed for nominal-scale data, data assigned to categories rather than quantities (Objective 1). For that reason, they stand apart from the others. There are, however, some important similarities as well. Both the goodness-of-fit (1 × k) test and the test of independence (r × k) are tests of significant differences like the z, t, and ANOVA tests. In addition, a significant finding with the r × k test leads to calculating either the phi coefficient or, when both variables have more than two categories, Cramér’s V (Objectives 2 and 3). The phi coefficient and Cramér’s V are both correlation procedures, which places them in the same category as the Pearson and Spearman correlations. Although this chapter may seem very different from the first nine chapters, the differences are confined to the scale of the data involved and the kinds of values that are calculated as a result, not to the kinds of decisions that are made.

Data of any scale can be reduced to nominal-scale data and subjected to chi-square analysis. It is possible to reduce ratio-scale data into nominal-scale data. For example, this occurs if a psychologist who has collected data on the number of times clients manifest compulsive behaviors elects to ignore the number of behaviors (ratio data) and focus only on whether or not people from different groups manifest compulsive behavior (yes/no responses constitute nominal data; Objective 1) according to significantly different frequencies (Objective 2). Such research provides for a simpler analysis, but any opportunity to examine the degree of compulsivity is forfeited. The differences between the client who is minimally compulsive and the one who is extremely compulsive are lost. Reductions in data scale result in lost data, and when researchers have a choice, most avoid reducing data scale, in spite of the opportunity to use chi-square analysis with all its flexibility.

Chi-square tests’ appeal, on the other hand, is that they do not require that the data meet any of the normality requirements that parametric tests impose. Very small data sets, with their inherent risks to normality, are more acceptable in chi-square tests than they are with the t test, for example. The chi-square tests should have a place in every behavioral scientist’s analytical repertoire.

To some degree, the chi-square tests in this chapter represent a nominal data equivalent to the one-way ANOVA and the factorial ANOVA tests used for interval and ratio data. Like the one-way ANOVA, the goodness-of-fit chi-square involves just one independent variable (political party affiliation, for example), but it can be divided into any number of categories. Like the factorial ANOVA, the chi-square test of independence involves two independent variables (party affiliation and gender, for example), although in the case of the chi-square test, two variables are the limit (Objective 3). Although the coverage of statistical procedures in this book has not been exhaustive, it has been representative. The 10 chapters introduced some of the most important concepts in data analysis and some of the statistical procedures that provide the foundation for nearly all quantitative analysis. Someone with a grasp of the material in these chapters has a solid footing from which to conduct research, complete the data analysis that many reports require, and read the scholarly research.

A parting comment: data-analysis concepts do not often come up in day-to-day conversation, but they probably should provide at least the backdrop for those conversations. You now have the tools for summarizing data and presenting them to others and for answering a variety of important questions. Do the people in your community differ from a national population in voter participation? Are graduation rates at your university significantly different from those at others you considered attending? Is the relationship between marital status and salary merely a random relationship? People frequently make snap judgments about such things, but you need not. You now have the tools that will allow you, with some work in data collection, to answer such questions definitively. So it is worth working to keep your data-analysis tools well-oiled. Think of the understanding you have gained of these topics as elements of a cognitive muscle: Do not let it atrophy. Occasional practice and frequent review will keep your understanding active. The author wishes each student the very best of luck. Happy analyzing!

Key Terms

chi-square test of independence, or r × k chi-square

contingency table

Cramér’s V

goodness-of-fit chi-square test, or 1 × k chi-square

phi coefficient

Review Questions

Answers to the odd-numbered questions are provided in Appendix A.

1. A market researcher for an advertising agency wants to determine whether people listen to three local radio stations in similar proportions. The researcher conducts a telephone survey to find out which station people prefer. In two hours, 52 of the respondents indicated preference for one of the three. The data are as follows:

Station A: 22 Station B: 18 Station C: 12

Are the differences among their preferences statistically significant?

2. What does it mean when an r × k problem is statistically significant?

3. Why are significant r × k problems followed by either phi coefficient or Cramér’s V? What are the circumstances under which each procedure is used?

4. A counselor working with people who are developmentally disabled has read research relating accurate responses to the type of reinforcement they receive. For three weeks, the counselor provides verbal praise every time a client completes a simple task correctly and then totals the number of simple problem-solving tasks that are completed successfully. For the next three weeks, the counselor provides a small piece of candy for each successfully completed task. The data for the group are as follows:

After 3 weeks of verbal praise—17 tasks completed successfully in one hour. After 3 weeks of tangible rewards—27 tasks completed successfully in one hour.

Are the differences statistically significant?

5. An armed-services psychologist believes that among those who have been in a combat area, ground forces experience post-traumatic stress more frequently than Navy or Air Force personnel. Data from a number of armed-services personnel who have been in combat areas reveal the following:

Of 40 Army personnel, 22 have experienced some post-traumatic stress. Of 30 Air Force personnel, 3 have experienced some post-traumatic stress. Of 30 Navy personnel, 12 have experienced some post-traumatic stress.

a. Are the branch of service and post-traumatic stress related?

b. If so, what is the strength of the relationship?

6. A university counselor has a theory that students’ employment and their grades may be related. Among 20 employed students, 8 have grades above 3.0. Among 15 nonemployed students, 12 have grades better than 3.0. Is the counselor correct? If so, what is the strength of the relationship?

7. A university administrator believes that undergraduate students of different majors attend the writing lab in different proportions. Test this with the following data:

Of 20 English majors, 2 attend the writing lab. Of 18 engineering majors, 10 attend the lab. Of 15 history majors, 6 attend the lab.

Do major and lab attendance operate independently?

8. Juvenile offenders in court-ordered treatment can choose between community-service activities and a series of group-counseling sessions. The therapist believes that they will choose therapy over community service by a ratio of two to one. Among 35 offenders, 20 opt for therapy. Is the therapist correct?

9. In an effort to assist students more effectively, a high school counselor wonders whether students who opt for college choose one institution with significantly greater frequency than the others. Data are as follows for 70 graduating seniors who are going to college:

37 apply to the local community college; 23 apply to the state university; 10 apply to the private, liberal-arts college.

Are the differences in institutional preference statistically significant?

10. A car salesperson attempts to determine whether age and the type of car purchased are related.

Of 15 people in their 20s, 3 opt for sports cars, 8 for economy cars, and 4 for sedans. Of 18 people in their 30s, 7 opt for sports cars, 4 for economy cars, and 7 for sedans. Of 12 people in their 40s, 2 opt for sports cars, 4 for economy cars, and 6 for sedans.

Are customers’ ages and the type of car they select related?

Answers to Try It! Questions

1. The 1 × k chi-square will accommodate just one nominal-scale variable, but it may have any number of categories.

2. The fact that chi-square values are squared as they are calculated does away with any possibility of a negative value.

3. Type II errors are typically more of a problem with chi-square than type I errors are.

4. The r × k can accommodate just two variables, but theoretically, those two variables can have any number of categories.

Appendix A

Answers to Odd-Numbered Review Questions

Chapter 1

a. nominal

a. ratio

a. ordinal

a. mode and median

a. M = 52.083

b. Mdn = 51.5

c. R = 39

d. s = 10.475

a. M = 35.615, s2 = 88.590, R = 30

i. The mode is unchanged. The addition of 35 doesn’t alter the fact that the most frequently occurring value is still 36.

ii. Since the new value is very near the mean, the overall variance will shrink. It becomes 81.802.

iii. The range isn’t changed.

iv. The range is based on only the largest and smallest values, of which 35 is neither. The variance is based on all values.

v. The range would increase accordingly.

1. Both the standard deviation and variance are based on the distance between individual values and the mean of the group. Both values assume equal intervals between data points. Ordinal data lack equal intervals.

a. ratio

b. nominal

c. ordinal

d. For ratio data, it is the mean. For nominal data, it is the mode. For ordinal data, it is the median.

e. The standard deviation can be calculated for ratio data.

1. The lowercase sigma indicates the population standard deviation.

a. positive reinforcement

b. response rates

Chapter 2

4	3
3	33
2	02235577
1	2449
0	29

1. A histogram is a type of bar graph used to illustrate interval/quantitative data. Bar graphs are plots of categorical or qualitative data. Bar graphs compare variables and histograms show distributions of variables. Additionally, in the histogram, the next category begins where the last one ends. In the bar graph, there are gaps between categories.

1. In theory, at least, a bimodal distribution, which can’t be normal, could be symmetrical and mesokurtic.

1. It has slight negative skew (M = 21.063, Mdn = 22, Mo = 25) and is platykurtic (s = 9.753, R = 41).

a. Yes, there is positive skew. M = 18.083, Mdn = 17, Mo = 14.

b. With R = 22 and s = 7.342, the data are platykurtic. Small samples are typically platykurtic because, with only a few measures, there is unlikely to be enough scoring repetition for mesokurtic or leptokurtic distributions.

a. 17 marks the 50th percentile (15 + 19)/2.

b. 12.5 (11 + 14)/2 and 23.5 (23 + 24)/2 mark the extremes of the interquartile range.

· The midpoints of the class intervals are 13.5, 9.5, 5.5, and 1.5.

· The sum of those midpoints times the frequencies of each is 40.5 + 47.5 + 38.5 + 6 = 132.5.

· The sum divided by the total number of scores is 132.5/19 = 6.97.

Chapter 3

a. −1.049

a. It is positive because it is higher than the mean of the group.

a. It is 1.049 standard deviations from the mean.

a. 0. It is the same value as the mean.

a. Yes. You can calculate a z for any raw score. Even with extreme values, there is always a probability of occurrence.

a. With a z score on the RAT = −1.109, and z score on the CAT = −.975, the CAT score is the higher (less negative) relative to its mean.

b. Because the two tests use different metrics, the scores are not of equivalent value. To compare them you need to convert the two scores to a standard score.

1. z55 = −1.235 (39.25); 50 – 39.25 = 10.75%

1. zCAT = 0.734; zANGST = 0.659. The CAT score is the higher of the two.

1. A z = 1.0 corresponds to the 84th percentile.

a. MSS = 20.621

b. p = 0.189 (50 − 31.06)

1. Bell curve with the numbers -3, -2, -1, 0, 1, 2, and 3 along the number line. An arrow labeled “-1.17” points toward the number line near -1. An arrow labeled “2.53” points toward the number line roughly halfway between numbers 2 and 3.

z = +2.53 (49.43)

z = −1.17 (37.90); 49.43 + 37.90 = 87.33% between the two z values

Chapter 4

1. μ = μM = 47.5

a. z = 0.345 (13.68); p= 0.363

b. A p of .363 is higher than 0.05, indicating that such a difference between the sample and the population could have occurred by chance.

c. When a group is found to be statistically significant, the only possible decision error is type I (alpha).

1. The probability of a type I error is the alpha level, which indicates the point at which a difference becomes statistically significant. So, if the criterion is set at an alpha of 0.05, then the chance of making a type I error is p = .05, or 5%. If a result is statistically significant, the probability of beta (type II) error is p = 0. A type II error can only occur if the result is not found statistically significant.

a. sM = 25; z = 5.0. With a z of 5.0, they are not characteristic of a national population with M = 500.

b. With z = 1.0, p = .159 that a group of n = 16 selected at random will score M = 525 or greater.

a. σM = 1.440

b. z = −4.444

c. At z = –4.444, yes. The sample is significantly different from the population.

Source: Microsoft Excel. Used with permission from Microsoft.

1. The factors that will reduce the size of the confidence interval are less variability in the data, which usually means a larger sample, and a reduced level of probability: p = 0.95 rather than p = 0.99, for example.

Chapter 5

a. SEM = 3.454

a. z = 1.737. Yes, this sample is representative of a population with μM = 66.0.

1. t = 1.741. At p = 0.05, the two groups are not significantly different.

a. When the null hypothesis is rejected at p = 0.05, the probability of alpha error is 0.05. There is no probability of beta error, which can only occur when the null hypothesis is not rejected.

b. HA : μ1 ≠ μ2; HA : μ1 < μ2

a. The IV is whether they receive a rebate. The DV is attitude.

b. The IV is nominal scale.

c. The DV must be at least interval scale.

a. one-tailed

b. HA : μ1 > μ2

c. t = 3.444. The result is significant. The confidence interval for the difference between the population means runs from 1.441 to 6.003.

Chapter 6

1. SStot = 11.979

1. F = 10.671 and is statistically significant at p = 0.05.

1. There is only one possible difference.

a. F = 6.604 and is significant.

b. The 8 week group is significantly different from the 16 week group.

c. η2 indicates that 52.4% of the variance in days drug free can be explained by the number of weeks.

a. F = 3.112

b. F(0.05)3,28 = 2.95. The result is significant.

Chapter 7

a. sd = 1.768

a. SEMd = 0.625

a. t = 2.60

a. t(0.05)7 = 2.365. The differences are statistically significant.

a. F = 59.291 and is significant. HSD = 0.782; all groups are significantly different from one another.

b. The data are ratio scale.

c. η2 = 0.529; about 53% of the variance in responding can be explained by reinforcement.

1. With no treatment effect, F = 1.0. It will be a ratio of error to error.

a. F = 16.955; anxiety is related to time.

b. The first set of measures is significantly different from the other three. The second set of measures is significantly different from the fourth.

c. η2 = 0.408. About 40.8% of the variance in anxiety can be explained by time from the test.

a. t = 2.339; t0.05(7) = 2.365. The differences are not significant.

b. t = 2.082

c. The difference is in the reduction in error with a repeated measures design.

a. No. An F = 3.386 would not have been significant.

b. In the repeated measures design, the error term (SSresid) is reduced by the amount of the variability between rows, resulting in a larger F value.

Chapter 8

1. 1.0/−1.0 and 0 respectively

1. The most important requirements are a) interval or ratio scale, b) both variables must be normally distributed in their populations, and c) there should be no attenuation of range in either variable, and the relationship between variables must be linear.

1. point biserial correlation

a. assuming that grade averages are interval scale, a Pearson correlation

b. The statistic is r2, the coefficient of determination.

1. The two measures of attitude are significantly correlated. Spearman’s rho (rs) = .815 and is statistically significant.

1. r = .504 and is not statistically significant.

a. The two variables are not significantly correlated at r = −.601.

b. The null hypothesis is that there is no relationship between dogmatism and satisfaction.

c. fail to reject the null hypothesis

d. As the level of dogmatism goes up, the level of job satisfaction goes down.

e. No.

Chapter 9

1. y' = 57.562

a. The issue is the slope value; b = .817.

a. a = 17.619

a. It defines the intercept value.

1. SEest = 2.958; C.I.99 = 65.484, 49.640

a. The true value of the predicted variable, with a specified level of confidence.

b. The true value won’t occur within the specified interval 1 in every 100 times, as an average.

c. The confidence interval will shrink if the correlation becomes stronger, the sample size increases, or the probability is relaxed to 0.95 or lower.

1. b = rxy(Sy/Sx) = .894(5.250/9.317) = .504
a = My − bMx = 54.850 − (.504)(46.50) = 31.414
y' = 31.414 + .504(57.5) = 60.394

a. r = .907

b. The predicted number of responses (y') = 6.551

c. Actually, our table only goes to 25 pairs (df = 23) for which the value is .396. Using that value, the confidence interval of the prediction stretches from 6.201 to 6.901, with .95 confidence.

d. If rxy = .7 but the other descriptive statistics and the predicted value of y remained the same (an improbable occurrence), the resulting confidence interval grows to 5.957, 7.145 because of the weaker correlation.

1. The slope declines from left to right if the correlation is negative.

Chapter 10

1. χ2 = 2.923; not significant

a. The test in an r × k problem is whether the two variables operate independently. If they do not, the supplementary question is of the strength of the relationship between them, which is what phi or V measures.

b. Phi is used in 2 × 2 models. Cramér’s V is used whenever both variables have 3 or more levels.

a. Yes, χ2 = 15.058.

b. φ = 0.388

1. They are not independent; χ2 = 9.106.

1. The counselor is on to something; χ2 = 15.629 and is statistically significant.

Appendix B

Important Tables

Table B.1: The z table

	0.00	0.01	0.02	0.03	0.04	0.05	0.06	0.07	0.08	0.09
0.0	0.0000	0.0040	0.0080	0.0120	0.0160	0.0199	0.0239	0.0279	0.0319	0.0359
0.1	0.0398	0.0438	0.0478	0.0517	0.0557	0.0596	0.0636	0.0675	0.0714	0.0753
0.2	0.0793	0.0832	0.0871	0.0910	0.0948	0.0987	0.1026	0.1064	0.1103	0.1141
0.3	0.1179	0.1217	0.1255	0.1293	0.1331	0.1368	0.1406	0.1443	0.1480	0.1517
0.4	0.1554	0.1591	0.1628	0.1664	0.1700	0.1736	0.1772	0.1808	0.1844	0.1879
0.5	0.1915	0.1950	0.1985	0.2019	0.2054	0.2088	0.2123	0.2157	0.2190	0.2224
0.6	0.2257	0.2291	0.2324	0.2357	0.2389	0.2422	0.2454	0.2486	0.2517	0.2549
0.7	0.2580	0.2611	0.2642	0.2673	0.2704	0.2734	0.2764	0.2794	0.2823	0.2852
0.8	0.2881	0.2910	0.2939	0.2967	0.2995	0.3023	0.3051	0.3078	0.3106	0.3133
0.9	0.3159	0.3186	0.3212	0.3238	0.3264	0.3289	0.3315	0.3340	0.3365	0.3389
1.0	0.3413	0.3438	0.3461	0.3485	0.3508	0.3531	0.3554	0.3577	0.3599	0.3621
1.1	0.3643	0.3665	0.3686	0.3708	0.3729	0.3749	0.3770	0.3790	0.3810	0.3830
1.2	0.3849	0.3869	0.3888	0.3907	0.3925	0.3944	0.3962	0.3980	0.3997	0.4015
1.3	0.4032	0.4049	0.4066	0.4082	0.4099	0.4115	0.4131	0.4147	0.4162	0.4177
1.4	0.4192	0.4207	0.4222	0.4236	0.4251	0.4265	0.4279	0.4292	0.4306	0.4319
1.5	0.4332	0.4345	0.4357	0.4370	0.4382	0.4394	0.4406	0.4418	0.4429	0.4441
1.6	0.4452	0.4463	0.4474	0.4484	0.4495	0.4505	0.4515	0.4525	0.4535	0.4545
1.7	0.4554	0.4564	0.4573	0.4582	0.4591	0.4599	0.4608	0.4616	0.4625	0.4633
1.8	0.4641	0.4649	0.4656	0.4664	0.4671	0.4678	0.4686	0.4693	0.4699	0.4706
1.9	0.4713	0.4719	0.4726	0.4732	0.4738	0.4744	0.4750	0.4756	0.4761	0.4767
2.0	0.4772	0.4778	0.4783	0.4788	0.4793	0.4798	0.4803	0.4808	0.4812	0.4817
2.1	0.4821	0.4826	0.4830	0.4834	0.4838	0.4842	0.4846	0.4850	0.4854	0.4857
2.2	0.4861	0.4864	0.4868	0.4871	0.4875	0.4878	0.4881	0.4884	0.4887	0.4890
2.3	0.4893	0.4896	0.4898	0.4901	0.4904	0.4906	0.4909	0.4911	0.4913	0.4916
2.4	0.4918	0.4920	0.4922	0.4925	0.4927	0.4929	0.4931	0.4932	0.4934	0.4936
2.5	0.4938	0.4940	0.4941	0.4943	0.4945	0.4946	0.4948	0.4949	0.4951	0.4952
2.6	0.4953	0.4955	0.4956	0.4957	0.4959	0.4960	0.4961	0.4962	0.4963	0.4964
2.7	0.4965	0.4966	0.4967	0.4968	0.4969	0.4970	0.4971	0.4972	0.4973	0.4974
2.8	0.4974	0.4975	0.4976	0.4977	0.4977	0.4978	0.4979	0.4979	0.4980	0.4981
2.9	0.4981	0.4982	0.4982	0.4983	0.4984	0.4984	0.4985	0.4985	0.4986	0.4986
3.0	0.4987	0.4987	0.4987	0.4988	0.4988	0.4989	0.4989	0.4989	0.4990	0.4990

Source: StatSoft. (2011). Electronic Statistics Textbook. Tulsa, OK: StatSoft. Retrieved from http://www.statsoft.com/textbook/distribution-tables/#z
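Each entry in Table B.1 is the proportion of the normal curve that lies between the mean and a given z. As a rough cross-check, the lookups can be reproduced in a few lines. This is only a sketch: it relies on Python's standard-library `statistics.NormalDist` rather than on anything in the text, and the function name `area_mean_to_z` is invented for illustration. The two z values are the ones used in the Chapter 3 answers in Appendix A.

```python
from statistics import NormalDist

def area_mean_to_z(z):
    """Proportion of the standard normal curve between the mean and z."""
    return NormalDist().cdf(abs(z)) - 0.5

# Lookups from the Chapter 3 answers: z = +2.53 and z = -1.17.
upper = area_mean_to_z(2.53)
lower = area_mean_to_z(-1.17)

# The area between the two z values is the sum of the two table entries.
print(round(upper, 4), round(lower, 4), round(upper + lower, 4))
```

Because the table lists only positive z values, the sketch takes the absolute value of z before the lookup, which mirrors how the table is read for negative scores.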

Table B.2: t distribution critical values

	Critical t value
	Two-tailed tests	One-tailed tests
df	p = 0.05	p = 0.01	p = 0.05	p = 0.01
1	12.706	63.657	6.314	31.821
2	4.303	9.925	2.920	6.965
3	3.182	5.841	2.353	4.541
4	2.776	4.604	2.132	3.747
5	2.571	4.032	2.015	3.365
6	2.447	3.707	1.943	3.143
7	2.365	3.499	1.895	2.998
8	2.306	3.355	1.860	2.896
9	2.262	3.250	1.833	2.821
10	2.228	3.169	1.812	2.764
11	2.201	3.106	1.796	2.718
12	2.179	3.055	1.782	2.681
13	2.160	3.012	1.771	2.650
14	2.145	2.977	1.761	2.624
15	2.131	2.947	1.753	2.602
16	2.120	2.921	1.746	2.583
17	2.110	2.898	1.740	2.567
18	2.101	2.878	1.734	2.552
19	2.093	2.861	1.729	2.539
20	2.086	2.845	1.725	2.528
21	2.080	2.831	1.721	2.518
22	2.074	2.819	1.717	2.508
23	2.069	2.807	1.714	2.500
24	2.064	2.797	1.711	2.492
25	2.060	2.787	1.708	2.485
26	2.056	2.779	1.706	2.479
27	2.052	2.771	1.703	2.473
28	2.048	2.763	1.701	2.467
29	2.045	2.756	1.699	2.462
30	2.042	2.750	1.697	2.457
∞	1.96	2.576	1.645	2.326

Source: Adapted from Gerstman, B. B. (n.d.). Probability tables: t table. Retrieved from http://www.sjsu.edu/faculty/gerstman/StatPrimer/t-table.pdf

Table B.3: The critical values of F

In each cell, the first value is the critical value for p = 0.05 and the second value is the critical value for p = 0.01.

df denominator	df numerator
	1	2	3	4	5	6	7	8	9	10
2	18.51 98.50	19.00 99.00	19.16 99.16	19.25 99.25	19.30 99.30	19.33 99.33	19.35 99.36	19.37 99.38	19.38 99.39	19.40 99.40
3	10.13 34.12	9.55 30.82	9.28 29.46	9.12 28.71	9.01 28.24	8.94 27.91	8.89 27.67	8.85 27.49	8.81 27.34	8.79 27.23
4	7.71 21.20	6.94 18.00	6.59 16.69	6.39 15.98	6.26 15.52	6.16 15.21	6.09 14.98	6.04 14.80	6.00 14.66	5.96 14.55
5	6.61 16.26	5.79 13.27	5.41 12.06	5.19 11.39	5.05 10.97	4.95 10.67	4.88 10.46	4.82 10.29	4.77 10.16	4.74 10.05
6	5.99 13.75	5.14 10.92	4.76 9.78	4.53 9.15	4.39 8.75	4.28 8.47	4.21 8.26	4.15 8.10	4.10 7.98	4.06 7.87
7	5.59 12.25	4.74 9.55	4.35 8.45	4.12 7.85	3.97 7.46	3.87 7.19	3.79 6.99	3.73 6.84	3.68 6.72	3.64 6.62
8	5.32 11.26	4.46 8.65	4.07 7.59	3.84 7.01	3.69 6.63	3.58 6.37	3.50 6.18	3.44 6.03	3.39 5.91	3.35 5.81
9	5.12 10.56	4.26 8.02	3.86 6.99	3.63 6.42	3.48 6.06	3.37 5.80	3.29 5.61	3.23 5.47	3.18 5.35	3.14 5.26
10	4.96 10.04	4.10 7.56	3.71 6.55	3.48 5.99	3.33 5.64	3.22 5.39	3.14 5.20	3.07 5.06	3.02 4.94	2.98 4.85
11	4.84 9.65	3.98 7.21	3.59 6.22	3.36 5.67	3.20 5.32	3.09 5.07	3.01 4.89	2.95 4.74	2.90 4.63	2.85 4.54
12	4.75 9.33	3.89 6.93	3.49 5.95	3.26 5.41	3.11 5.06	3.00 4.82	2.91 4.64	2.85 4.50	2.80 4.39	2.75 4.30
13	4.67 9.07	3.81 6.70	3.41 5.74	3.18 5.21	3.03 4.86	2.92 4.62	2.83 4.44	2.77 4.30	2.71 4.19	2.67 4.10
14	4.60 8.86	3.74 6.51	3.34 5.56	3.11 5.04	2.96 4.69	2.85 4.46	2.76 4.28	2.70 4.14	2.65 4.03	2.60 3.94
15	4.54 8.68	3.68 6.36	3.29 5.42	3.06 4.89	2.90 4.56	2.79 4.32	2.71 4.14	2.64 4.00	2.59 3.89	2.54 3.80
16	4.49 8.53	3.63 6.23	3.24 5.29	3.01 4.77	2.85 4.44	2.74 4.20	2.66 4.03	2.59 3.89	2.54 3.78	2.49 3.69
17	4.45 8.40	3.59 6.11	3.20 5.19	2.96 4.67	2.81 4.34	2.70 4.10	2.61 3.93	2.55 3.79	2.49 3.68	2.45 3.59
18	4.41 8.29	3.55 6.01	3.16 5.09	2.93 4.58	2.77 4.25	2.66 4.01	2.58 3.84	2.51 3.71	2.46 3.60	2.41 3.51
19	4.38 8.18	3.52 5.93	3.13 5.01	2.90 4.50	2.74 4.17	2.63 3.94	2.54 3.77	2.48 3.63	2.42 3.52	2.38 3.43
20	4.35 8.10	3.49 5.85	3.10 4.94	2.87 4.43	2.71 4.10	2.60 3.87	2.51 3.70	2.45 3.56	2.39 3.46	2.35 3.37
21	4.32 8.02	3.47 5.78	3.07 4.87	2.84 4.37	2.68 4.04	2.57 3.81	2.49 3.64	2.42 3.51	2.37 3.40	2.32 3.31
22	4.30 7.95	3.44 5.72	3.05 4.82	2.82 4.31	2.66 3.99	2.55 3.76	2.46 3.59	2.40 3.45	2.34 3.35	2.30 3.26
23	4.28 7.88	3.42 5.66	3.03 4.76	2.80 4.26	2.64 3.94	2.53 3.71	2.44 3.54	2.37 3.41	2.32 3.30	2.27 3.21
24	4.26 7.82	3.40 5.61	3.01 4.72	2.78 4.22	2.62 3.90	2.51 3.67	2.42 3.50	2.36 3.36	2.30 3.26	2.25 3.17
25	4.24 7.77	3.39 5.57	2.99 4.68	2.76 4.18	2.60 3.85	2.49 3.63	2.40 3.46	2.34 3.32	2.28 3.22	2.24 3.13
26	4.23 7.72	3.37 5.53	2.98 4.64	2.74 4.14	2.59 3.82	2.47 3.59	2.39 3.42	2.32 3.29	2.27 3.18	2.22 3.09
27	4.21 7.68	3.35 5.49	2.96 4.60	2.73 4.11	2.57 3.78	2.46 3.56	2.37 3.39	2.31 3.26	2.25 3.15	2.20 3.06
28	4.20 7.64	3.34 5.45	2.95 4.57	2.71 4.07	2.56 3.75	2.45 3.53	2.36 3.36	2.29 3.23	2.24 3.12	2.19 3.03
29	4.18 7.60	3.33 5.42	2.93 4.54	2.70 4.04	2.55 3.73	2.43 3.50	2.35 3.33	2.28 3.20	2.22 3.09	2.18 3.00
30	4.17 7.56	3.32 5.39	2.92 4.51	2.69 4.02	2.53 3.70	2.42 3.47	2.33 3.30	2.27 3.17	2.21 3.07	2.16 2.98

Source: Critical values of F. (n.d.). Retrieved from http://faculty.vassar.edu/lowry/apx_d.html

Table B.4: Tukey’s HSD critical values: q (alpha, k, df)

* The critical values for q corresponding to alpha = 0.05 (top) and alpha = 0.01 (bottom)

df	k = Number of Treatments
	2	3	4	5	6	7	8	9	10
5	3.64 5.70	4.60 6.98	5.22 7.80	5.67 8.42	6.03 8.91	6.33 9.32	6.58 9.67	6.80 9.97	6.99 10.24
6	3.46 5.24	4.34 6.33	4.90 7.03	5.30 7.56	5.63 7.97	5.90 8.32	6.12 8.61	6.32 8.87	6.49 9.10
7	3.34 4.95	4.16 5.92	4.68 6.54	5.06 7.01	5.36 7.37	5.61 7.68	5.82 7.94	6.00 8.17	6.16 8.37
8	3.26 4.75	4.04 5.64	4.53 6.20	4.89 6.62	5.17 6.96	5.40 7.24	5.60 7.47	5.77 7.68	5.92 7.86
9	3.20 4.60	3.95 5.43	4.41 5.96	4.76 6.35	5.02 6.66	5.24 6.91	5.43 7.13	5.59 7.33	5.74 7.49
10	3.15 4.48	3.88 5.27	4.33 5.77	4.65 6.14	4.91 6.43	5.12 6.67	5.30 6.87	5.46 7.05	5.60 7.21
11	3.11 4.39	3.82 5.15	4.26 5.62	4.57 5.97	4.82 6.25	5.03 6.48	5.20 6.67	5.35 6.84	5.49 6.99
12	3.08 4.32	3.77 5.05	4.20 5.50	4.51 5.84	4.75 6.10	4.95 6.32	5.12 6.51	5.27 6.67	5.39 6.81
13	3.06 4.26	3.73 4.96	4.15 5.40	4.45 5.73	4.69 5.98	4.88 6.19	5.05 6.37	5.19 6.53	5.32 6.67
14	3.03 4.21	3.70 4.89	4.11 5.32	4.41 5.63	4.64 5.88	4.83 6.08	4.99 6.26	5.13 6.41	5.25 6.54
15	3.01 4.17	3.67 4.84	4.08 5.25	4.37 5.56	4.59 5.80	4.78 5.99	4.94 6.16	5.08 6.31	5.20 6.44
16	3.00 4.13	3.65 4.79	4.05 5.19	4.33 5.49	4.56 5.72	4.74 5.92	4.90 6.08	5.03 6.22	5.15 6.35
17	2.98 4.10	3.63 4.74	4.01 5.14	4.30 5.43	4.52 5.66	4.70 5.85	4.86 6.01	4.99 6.15	5.11 6.27
18	2.97 4.07	3.61 4.70	4.00 5.09	4.28 5.38	4.49 5.60	4.67 5.79	4.82 5.94	4.96 6.08	5.07 6.20
19	2.96 4.05	3.59 4.67	3.98 5.05	4.25 5.33	4.47 5.55	4.65 5.73	4.79 5.89	4.92 6.02	5.04 6.14
20	2.95 4.02	3.58 4.64	3.96 5.02	4.23 5.29	4.45 5.51	4.62 5.69	4.77 5.84	4.90 5.97	5.01 6.09
24	2.92 3.96	3.53 4.55	3.90 4.91	4.17 5.17	4.37 5.37	4.54 5.54	4.68 5.69	4.81 5.81	4.92 5.92
30	2.89 3.89	3.49 4.45	3.85 4.80	4.10 5.05	4.30 5.24	4.46 5.40	4.60 5.54	4.72 5.65	4.82 5.76
40	2.86 3.82	3.44 4.37	3.79 4.70	4.04 4.93	4.23 5.11	4.39 5.26	4.52 5.39	4.63 5.50	4.73 5.60

Source: Tukey’s HSD critical values. (n.d.). Retrieved from http://www.stat.duke.edu/courses/Spring98/sta110c/qtable.html

Table B.5: The critical values of rxy

Number of xy Pairs (n)	df (n − 2)	Lowest statistically significant correlation for the specified probability
		p = 0.10	p = 0.05	p = 0.01
3	1	0.988	0.997	1.000
4	2	0.900	0.950	0.990
5	3	0.805	0.878	0.959
6	4	0.729	0.811	0.917
7	5	0.669	0.754	0.875
8	6	0.621	0.707	0.834
9	7	0.582	0.666	0.798
10	8	0.549	0.632	0.765
11	9	0.521	0.602	0.735
12	10	0.497	0.576	0.708
13	11	0.476	0.553	0.684
14	12	0.458	0.532	0.661
15	13	0.441	0.514	0.641
16	14	0.426	0.497	0.623
17	15	0.412	0.482	0.606
18	16	0.400	0.468	0.590
19	17	0.389	0.456	0.575
20	18	0.378	0.444	0.561
21	19	0.369	0.433	0.549
22	20	0.360	0.423	0.537
23	21	0.352	0.413	0.526
24	22	0.344	0.404	0.515
25	23	0.337	0.396	0.505

Source: Brighton Webs Ltd. (2006). Critical values of correlation coefficient (R). Retrieved from http://www.brighton-webs.co.uk/tables/critical_values_r.asp

Table B.6: The critical values for Spearman’s rho

Number of Pairs of Scores	p = 0.05	p = 0.01
5	1.000
6	0.886	1.0
7	0.786	0.929
8	0.738	0.881
9	0.683	0.833
10	0.648	0.794
12	0.591	0.777
14	0.544	0.715
16	0.506	0.665
18	0.475	0.625
20	0.450	0.591
22	0.428	0.562
24	0.409	0.537
26	0.392	0.515
28	0.377	0.496
30	0.364	0.478

Source: University of Sussex. (n.d.). Critical values of Spearman’s rho (two-tailed). Retrieved from www.sussex.ac.uk/Users/grahamh/RM1web/Rhotable.htm

Table B.7: The critical values of chi-squared

df	p = 0.05	p = 0.01	p = 0.001
1	3.84	6.64	10.83
2	5.99	9.21	13.82
3	7.82	11.35	16.27
4	9.49	13.28	18.47
5	11.07	15.09	20.52
6	12.59	16.81	22.46
7	14.07	18.48	24.32
8	15.51	20.09	26.13
9	16.92	21.67	27.88
10	18.31	23.21	29.59
11	19.68	24.73	31.26
12	21.03	26.22	32.91
13	22.36	27.69	34.53
14	23.69	29.14	36.12
15	25.00	30.58	37.70
16	26.30	32.00	39.25
17	27.59	33.41	40.79
18	28.87	34.81	42.31
19	30.14	36.19	43.82
20	31.41	37.57	45.32
21	32.67	38.93	46.80
22	33.92	40.29	48.27
23	35.17	41.64	49.73
24	36.42	42.98	51.18
25	37.65	44.31	52.62
26	38.89	45.64	54.05
27	40.11	46.96	55.48
28	41.34	48.28	56.89
29	42.56	49.59	58.30
30	43.77	50.89	59.70

Appendix C

Important Formulas

Chapter 1

Formula 1.1: Mean of a Set of Scores

$M = \frac{\sum x}{n}$

Formula 1.2: Variance of a Sample

$s^2 = \frac{\sum (x - M)^2}{n - 1}$

Formula 1.3: Standard Deviation of a Sample

$s = \sqrt{\frac{\sum (x - M)^2}{n - 1}}$

Formula 1.4: Variance of a Population

$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$

Formula 1.5: Standard Deviation of a Population

$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$
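The only difference between the sample and population versions of these formulas is the divisor (n − 1 versus N). A minimal sketch with Python's standard `statistics` module shows the distinction; the scores are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, not from the text

mean = statistics.mean(scores)             # Formula 1.1
sample_var = statistics.variance(scores)   # Formula 1.2: divides by n - 1
sample_sd = statistics.stdev(scores)       # Formula 1.3
pop_var = statistics.pvariance(scores)     # Formula 1.4: divides by N
pop_sd = statistics.pstdev(scores)         # Formula 1.5

print(mean, round(sample_var, 3), round(sample_sd, 3), pop_var, pop_sd)
```

With the same sum of squared deviations (32 here), the sample variance is larger than the population variance because its divisor is smaller.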

Chapter 2

Formula 2.1: Skew

$sk = \frac{M - Mdn}{Mdn}$

Positive values indicate positive skew, and negative values, a negative skew. Values within ±1.0 indicate a distribution that is reasonably close to symmetrical.

Chapter 3

Formula 3.1: Transformation

$z = \frac{x - M}{s}$

This is the formula for transforming raw scores into z scores, giving them a mean of 0 and a standard deviation of 1; x is the score to be transformed, and M and s are the mean and standard deviation of the group of scores.

Formula 3.2: Modified Standard Score

$MSS = (s_{spec})(z) + M_{spec}$

The modified standard score (MSS) transforms z scores so that they conform to distributions that have any specified mean (Mspec) and standard deviation (sspec).
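The two transformations chain together: a raw score becomes a z score, and the z score is then rescaled to whatever mean and standard deviation are specified. The sketch below uses hypothetical numbers (a raw score of 65 in a group with M = 50 and s = 10, rescaled to a distribution with a mean of 100 and a standard deviation of 15); the function names are illustrative.

```python
def z_score(x, mean, sd):
    """Formula 3.1: distance of x from the mean in standard deviation units."""
    return (x - mean) / sd

def modified_standard_score(z, spec_mean, spec_sd):
    """Formula 3.2: rescale z to a distribution with a specified mean and sd."""
    return spec_sd * z + spec_mean

z = z_score(65, mean=50, sd=10)
mss = modified_standard_score(z, spec_mean=100, spec_sd=15)
print(z, mss)
```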

Chapter 4

Formula 4.1: z Test

$z = \frac{M - \mu_M}{\sigma_M}$

This is the formula for the z test. It allows one to determine whether a sample is characteristic of or significantly different from a population.

Formula 4.2: Standard Error of the Mean

$\sigma_M = \frac{\sigma}{\sqrt{N}}$

If a value for the population standard deviation (σ) is available, one can calculate the standard error of the mean (σM) with this formula.

Formula 4.3: Optimal Sample Size

$n = \left(\frac{(z)(\sigma)}{\text{variation from } \sigma}\right)^2$

Formula 4.4: Confidence Interval

CI = ±z(σM) + M

This is the formula for calculating a confidence interval when z is statistically significant in a z test.
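Formulas 4.1, 4.2, and 4.4 are usually applied in sequence: the standard error comes first, then z, then the interval. The sketch below loosely follows the Chapter 4 answer in Appendix A that reports a standard error of 25 and z = 1.0 for a sample of n = 16 with M = 525 against a population mean of 500. The population standard deviation of 100 is an assumption, chosen because it yields that standard error; the function names are illustrative.

```python
import math

def standard_error(sigma, n):
    """Formula 4.2: standard error of the mean."""
    return sigma / math.sqrt(n)

def z_test(sample_mean, pop_mean, sigma, n):
    """Formula 4.1: z for a sample mean against a population mean."""
    return (sample_mean - pop_mean) / standard_error(sigma, n)

def confidence_interval(sample_mean, sigma, n, z_crit=1.96):
    """Formula 4.4: interval around M; z_crit = 1.96 gives .95 confidence."""
    margin = z_crit * standard_error(sigma, n)
    return sample_mean - margin, sample_mean + margin

sem = standard_error(sigma=100, n=16)       # sigma = 100 is assumed
z = z_test(525, 500, sigma=100, n=16)
low, high = confidence_interval(525, sigma=100, n=16)
print(sem, z, round(low, 1), round(high, 1))
```

Quadrupling n would halve the standard error, which is why larger samples narrow the interval.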

Chapter 5

Formula 5.1: Estimated Standard Error of the Mean

$SE_M = \frac{s}{\sqrt{n}}$

Formula 5.2: One-Sample t Test

$t = \frac{M - \mu_M}{SE_M}$

Formula 5.3: Independent t-Test

$t = \frac{M_1 - M_2}{SE_{M_1 - M_2}}$

Formula 5.4: Estimated Standard Error of the Difference for Equal Sample Sizes

$SE_{M_1 - M_2} = \sqrt{(SE_{M_1})^2 + (SE_{M_2})^2}$

Formula 5.5: Estimated Standard Error of the Difference for Unequal Sample Sizes

$SE_{M_1 - M_2} = \sqrt{\left[\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}\right]\left[\frac{1}{n_1} + \frac{1}{n_2}\right]}$

Formula 5.6: .95 Confidence Interval for the Difference Between Means

$CI_{.95} = \pm t(SE_{M_1 - M_2}) + (M_1 - M_2)$

Formula 5.7: Calculating Effect Size Using Omega-Squared

$\omega^2 = \frac{t^2 - 1}{t^2 + n_1 + n_2 - 1}$
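The independent t test and its effect size can be sketched together. The example below uses the pooled standard error of Formula 5.5, which also covers the equal-n case, followed by Formulas 5.3 and 5.7. The two groups are hypothetical and the function name is illustrative.

```python
import math
from statistics import mean, variance

def independent_t(group1, group2):
    """Return t, the standard error of the difference, and omega-squared."""
    n1, n2 = len(group1), len(group2)
    # Formula 5.5: pooled variance weighted by each group's degrees of freedom.
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean(group1) - mean(group2)) / se_diff          # Formula 5.3
    omega_sq = (t**2 - 1) / (t**2 + n1 + n2 - 1)         # Formula 5.7
    return t, se_diff, omega_sq

t, se_diff, omega_sq = independent_t([5, 7, 9], [2, 4, 6])  # hypothetical scores
print(round(t, 3), round(se_diff, 3), round(omega_sq, 3))
```

With df = n1 + n2 − 2 = 4, the calculated t would be compared with the critical value in Table B.2 before the effect size is interpreted.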

Chapter 6

Formula 6.1: Total Sum of Squares

$SS_{tot} = \sum (x - M_G)^2$

The total sum of squares is the total of all variance from all sources in an ANOVA problem.

Formula 6.2: Sum of Squares Between

$SS_{bet} = (M_a - M_G)^2 n_a + (M_b - M_G)^2 n_b + (M_c - M_G)^2 n_c$

The sum of squares between, which includes the effect of the IV or IVs plus error variance, is a measure of how much particular groups differ from the mean of all the data.

Formula 6.3: Sum of Squares Within

$SS_{with} = \sum (x_a - M_a)^2 + \sum (x_b - M_b)^2 + \sum (x_c - M_c)^2$

The sum of squares within is a measure of how much individuals within a group differ from the mean of their sample when exposed to the same level of the IV or IVs. It’s a measure of error variance.

Formula 6.4: F Statistic

$F = \frac{MS_{bet}}{MS_{with}}$

The F statistic in ANOVA.

Formula 6.5: Tukey’s HSD

$HSD = x\sqrt{\frac{MS_{with}}{n}}$

Tukey’s HSD is a post hoc test used to determine which groups in an ANOVA are significantly different from which.

Formula 6.6: Alternate HSD

$HSD = x\sqrt{\left(\frac{MS_{with}}{2}\right)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$

This formula may be used when group sizes are unequal.

Formula 6.7: Eta-Squared

$\eta^2 = \frac{SS_{bet}}{SS_{tot}}$

Eta-squared is an estimate of effect size. It suggests the proportion of variance explained by the particular component. In this configuration, it’s the variability between groups.

Formula 6.8: Omega-Squared

$\omega^2 = \frac{SS_{bet} - (k - 1)MS_{with}}{SS_{tot} + MS_{with}}$
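The Chapter 6 formulas build on one another, so a short end-to-end sketch may help. It computes the three sums of squares, the F ratio, both effect sizes, and Tukey's HSD for three hypothetical groups of three scores each. The sketch assumes that x in Formula 6.5 is the value read from Table B.4; here that is 4.34 for alpha = .05, k = 3, and df = 6.

```python
import math
from statistics import mean

groups = {"a": [1, 2, 3], "b": [2, 3, 4], "c": [5, 6, 7]}  # hypothetical data

all_scores = [x for g in groups.values() for x in g]
grand_mean = mean(all_scores)
k = len(groups)
n_total = len(all_scores)

ss_tot = sum((x - grand_mean) ** 2 for x in all_scores)                      # 6.1
ss_bet = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())  # 6.2
ss_with = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)        # 6.3

ms_bet = ss_bet / (k - 1)             # df between = k - 1
ms_with = ss_with / (n_total - k)     # df within = N - k
f_ratio = ms_bet / ms_with            # 6.4

eta_sq = ss_bet / ss_tot                                        # 6.7
omega_sq = (ss_bet - (k - 1) * ms_with) / (ss_tot + ms_with)    # 6.8

q_crit = 4.34                          # Table B.4: alpha = .05, k = 3, df = 6
hsd = q_crit * math.sqrt(ms_with / 3)  # 6.5, with n = 3 scores per group

print(ss_tot, ss_bet, ss_with, f_ratio, round(eta_sq, 4), round(omega_sq, 4), round(hsd, 3))
```

A useful check on the arithmetic is that the between and within sums of squares add up to the total. Any pair of group means that differ by more than the HSD value is significantly different.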

Chapter 7

Formula 7.1: Before/After or Matched-Pairs t-Test

$t = \frac{M_d}{SE_{M_d}}$

The numerator is the mean of the differences between the first and second score, and the denominator is the standard error of the mean for the difference scores.
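In practice the before/after t test reduces to a one-sample t test on the difference scores. A minimal sketch with hypothetical before and after measures for five participants:

```python
import math
from statistics import mean, stdev

before = [10, 12, 9, 14, 11]   # hypothetical scores
after = [12, 15, 10, 15, 14]

diffs = [a - b for a, b in zip(after, before)]   # one difference per participant
m_d = mean(diffs)                                # mean of the differences
sem_d = stdev(diffs) / math.sqrt(len(diffs))     # standard error of the differences
t = m_d / sem_d                                  # Formula 7.1

print(diffs, m_d, round(sem_d, 3), round(t, 3))
```

The degrees of freedom are the number of pairs minus 1, so this result would be compared with the df = 4 row of Table B.2.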

Formula 7.2: Sum of Squares Between Columns for Within-Subjects F

$SS_{col} = (M_{col\,1} - M_G)^2 n_{col\,1} + (M_{col\,2} - M_G)^2 n_{col\,2} + \ldots + (M_{col\,k} - M_G)^2 n_{col\,k}$

This is the formula for determining the sum of squares between columns for a within-subjects F. It indicates the treatment effect plus some error.

Formula 7.3: Sum of Squares Between Rows for Within-Subjects F

$SS_{rows} = (M_{row\,1} - M_G)^2 n_{row\,1} + (M_{row\,2} - M_G)^2 n_{row\,2} + \ldots + (M_{row\,i} - M_G)^2 n_{row\,i}$

This formula determines the person-to-person variance within a group. This variance is a source of error, and because it’s common to each group in a repeated measures design, it’s calculated in order to eliminate it from what will be the denominator in the F ratio.

Formula 7.4: Residual Sum of Squares for Within-Subjects F

$SS_{resid} = SS_{tot} - SS_{col} - SS_{rows}$

The error term in the within-subjects F is determined by subtracting the column-to-column (treatments) and the row-to-row (participants) differences from all variance. Whatever is left, once it’s divided by its degrees of freedom, becomes the error term in the F ratio.
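The partitioning described above can be sketched for a small repeated measures layout. The data are hypothetical: each row is one participant and each column is one treatment. The sketch assumes the usual degrees of freedom, k − 1 for columns and (k − 1)(n − 1) for the residual, which are not spelled out in this appendix.

```python
from statistics import mean

# Rows are participants, columns are treatments (hypothetical scores).
scores = [
    [1, 3, 5],
    [2, 4, 6],
    [3, 5, 10],
]
n_rows, n_cols = len(scores), len(scores[0])
all_scores = [x for row in scores for x in row]
grand_mean = mean(all_scores)
columns = list(zip(*scores))

ss_tot = sum((x - grand_mean) ** 2 for x in all_scores)
ss_col = sum(n_rows * (mean(col) - grand_mean) ** 2 for col in columns)   # 7.2
ss_rows = sum(n_cols * (mean(row) - grand_mean) ** 2 for row in scores)   # 7.3
ss_resid = ss_tot - ss_col - ss_rows                                      # 7.4

ms_col = ss_col / (n_cols - 1)
ms_resid = ss_resid / ((n_cols - 1) * (n_rows - 1))
f_ratio = ms_col / ms_resid

print(round(ss_tot, 3), round(ss_col, 3), round(ss_rows, 3), round(ss_resid, 3), round(f_ratio, 3))
```

Removing the row-to-row sum of squares is what shrinks the error term relative to an independent-groups ANOVA on the same scores.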

Chapter 8

Formula 8.1: Pearson Correlation (z Score Formula)

$r_{xy} = \frac{\sum [(z_x)(z_y)]}{n - 1}$

This formula gauges the relationship between two sets of interval- or ratio-scale data by first turning raw scores into z scores.

Formula 8.2: Pearson Correlation (Raw-Score Formula) and Point-Biserial Correlation

$r_{xy} = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}}$

This raw-score formula for the Pearson correlation indicates the correlation between two variables that are interval or ratio. It is also the formula for the point-biserial correlation.

Formula 8.3: Spearman’s Rho

$\rho = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$

This formula measures the relationship between variables that are any combination of ordinal, interval, or ratio scales. The normality of the data isn’t an issue.
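The raw-score Pearson formula and Spearman's rho are both easy to mistype by hand, so a short sketch of each may be useful. The data and function names are hypothetical. The rho function expects ranks as input and, like Formula 8.3 itself, assumes there are no tied ranks.

```python
import math

def pearson_r(x, y):
    """Formula 8.2: raw-score Pearson correlation."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x**2) * (n * sum_y2 - sum_y**2))
    return numerator / denominator

def spearman_rho(rank_x, rank_y):
    """Formula 8.3: Spearman's rho from two lists of untied ranks."""
    n = len(rank_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - (6 * sum_d2) / (n * (n**2 - 1))

r = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])        # hypothetical scores
rho = spearman_rho([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])   # hypothetical ranks
print(round(r, 3), round(rho, 3))
```

With five pairs, Table B.5 requires r of at least .878 at p = 0.05, so a correlation of this size from so few pairs would not be statistically significant.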

Chapter 9

Formula 9.1: Bivariate Regression

y' = a + bx

This is the equation for ordinary least-squares regression.

Formula 9.2: Intercept

a = My − bMx

This formula establishes the regression constant, or intercept, value. It indicates the value of y when x = 0.

Formula 9.3: Regression Coefficient

$b = r_{xy}\left(\frac{s_y}{s_x}\right)$

This formula indicates the slope of the regression line, or the regression coefficient. It indicates the impact on y of increasing x by 1.0.

Formula 9.4: Standard Error of the Estimate

$SE_{est} = s_y\sqrt{1 - r_{xy}^2}$

This statistic indicates the amount of prediction error, a value used in the confidence interval.

Formula 9.5: Regression Confidence Interval

CI = ±t(SEest) + y'

Because y' is a “point estimate” of the predicted value without any gauge of how precise the estimate is, this formula can be used to determine how large the interval around y' would have to be to have confidence of capturing the actual value of y.
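The five regression formulas can be chained into one short calculation. The sketch below reuses the descriptive statistics from the worked Chapter 9 answer in Appendix A (rxy = .894, sy = 5.250, sx = 9.317, My = 54.850, Mx = 46.50, x = 57.5). That answer does not report the sample size, so the critical t of 2.228 (two-tailed, p = 0.05, df = 10 in Table B.2) is an assumption made only to show how the interval is assembled.

```python
import math

# Descriptive statistics from the Chapter 9 answer in Appendix A.
r_xy, s_y, s_x = 0.894, 5.250, 9.317
m_y, m_x = 54.850, 46.50
x_new = 57.5

b = round(r_xy * (s_y / s_x), 3)        # Formula 9.3, rounded as in the answer key
a = m_y - b * m_x                       # Formula 9.2
y_pred = a + b * x_new                  # Formula 9.1
se_est = s_y * math.sqrt(1 - r_xy**2)   # Formula 9.4

t_crit = 2.228                          # hypothetical: assumes df = 10
ci = (y_pred - t_crit * se_est, y_pred + t_crit * se_est)   # Formula 9.5

print(b, round(a, 3), round(y_pred, 3), round(se_est, 3), [round(v, 3) for v in ci])
```

Because the standard error of the estimate depends on 1 − rxy², a weaker correlation widens the interval even when every other value stays the same.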

Chapter 10

Formula 10.1: Chi-Square

$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$

The formula for chi-square for both the goodness-of-fit (1 × k) and the chi-square test of independence (r × k).

Formula 10.2: Phi Coefficient

$\varphi = \sqrt{\frac{\chi^2}{n}}$

The phi coefficient is the measure of the correlation of the two variables involved when a 2 × 2 chi-square test of independence is statistically significant.

Formula 10.3: Cramér’s V

$V = \sqrt{\frac{\chi^2}{n(k - 1)}}$

Cramér’s V is the measure of correlation for the variables in a statistically significant r × k chi-square, when both of the variables have at least three categories.
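For a test of independence, each expected frequency is the row total times the column total divided by n. The sketch below applies Formula 10.1 to the counts from the car-purchase review question in Chapter 10 (age group by type of car). Two points are assumptions rather than statements from the text: k in Formula 10.3 is taken to be the smaller of the number of rows and columns, and V is computed only to show the arithmetic, since it is normally reported after a significant result.

```python
import math

# Observed counts: rows are age groups, columns are sports, economy, sedan.
observed = [
    [3, 8, 4],   # 20s
    [7, 4, 7],   # 30s
    [2, 4, 6],   # 40s
]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi_sq = 0.0
for i, row in enumerate(observed):
    for j, f_o in enumerate(row):
        f_e = row_totals[i] * col_totals[j] / n   # expected frequency
        chi_sq += (f_o - f_e) ** 2 / f_e          # Formula 10.1

df = (len(observed) - 1) * (len(observed[0]) - 1)
k = min(len(observed), len(observed[0]))          # assumed meaning of k
cramers_v = math.sqrt(chi_sq / (n * (k - 1)))     # Formula 10.3

print(round(chi_sq, 3), df, round(cramers_v, 3))
```

The calculated value is then compared with the df = 4 row of Table B.7.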

Glossary

abscissa The horizontal (x) axis in a graph based on Cartesian coordinates.

actual limits In a class interval, includes upper and lower scores with decimal values, which allows consistent classification of any value into one of the class intervals.

alternate hypothesis The hypothesis that predicts that two samples belong to populations with distinct means.

analysis of variance (ANOVA) Name given to Fisher’s test allowing a research study to detect significant differences among any number of groups.

apparent limits The highest and lowest integers in a category for a particular class interval.

bar graphs Graphical data presentations that indicate proportions as comparative vertical or horizontal columns. Bar graphs are often used to indicate the frequency with which nominal data occur.

before/after t test A dependent-groups application of the t test in which one group is measured before and after a treatment.

bias In statistical analysis, a consistent error of the same nature. If a sample is drawn which distorts some characteristic of the population, the nature of the parent population will be misrepresented in all experiments involving the sample. Results are distorted (biased) in a way that can’t be corrected by simply repeating the experiment.

bivariate correlations Include all procedures that test for significant relationships between two variables.

canonical correlation Measures the relationship between two groups of variables.

Cartesian coordinates The values associated with the horizontal and vertical axes, often respectively termed x and y, that indicate the location of a score on a graph.

central limit theorem Holds that if a population is sampled an infinite number of times using sample size n and the mean of each sample determined, the multiple means (Ms) will take on the characteristics of a normal distribution whether or not the original population of individuals was normal.

chi-square test of independence, or r × k chi-square A test of whether two nominal variables operate independently of each other.

class intervals The groups in a grouped frequency distribution.

coefficient of determination Indicates the proportion of variance in one variable in a Pearson correlation that can be explained by the other.

confidence interval (CI = ±z(σM) + M) Provides a way to determine how precisely M estimates µM.

confidence interval of the difference Estimates, with a specified level of probability, the difference between the means of the two populations represented by the samples in the t test.

confounding variables Variables that influence an outcome but are uncontrolled in the analysis and obscure the effects of other variables. If a psychologist is interested in gender-related differences in problem-solving ability but does not control for age differences, differences in gender may be confounded by differences that are actually age-related.

constants Also called constant values; have only one value. For example, the temperature at which water boils at sea level is a constant value, 212 degrees Fahrenheit.

contingency table The arrangement of data from a chi-square test of independence where the categories of one nominal variable are in rows and the categories of the other variable are in columns.

correlation matrix A box in which the variables involved are listed in rows as well as in columns, and each variable is correlated with all variables, including itself.

Cramér’s V Variation of the phi coefficient used to determine the strength of the relationship between two variables with more than two categories.

criterion variable The variable for which the value is predicted in a regression procedure.

critical value A value from a table that indicates the point at which a calculated test result is statistically significant.

data scale The kind of information that data values provide. The scales of data include nominal data, which define category; ordinal data, which allow ranking; interval data, which have consistent increases/decreases between consecutive data points; and ratio data, which have a meaningful 0.

decision errors The result of any statistical decision based on a probability. There are two types of decision errors: type I errors and type II errors.

degrees of freedom (df) The number of measures in a procedure that are free to vary, or to have any value, when the result is known. For the standard deviation, for example, df = n − 1, which means that if there are 10 values on which the standard deviation calculation is based, 9 of them may have any value, so long as the 10th produces the correct result.

dependent variable In a research problem, the variable affected by the treatment.

dependent-groups designs Statistical procedures in which the groups are related, either because multiple measures are taken of the same participants, or because each participant in a particular group is matched on characteristics relevant to the analysis to a participant in the other groups with the same characteristics. Dependent-groups designs minimize error variance because they reduce score variation due to factors unrelated to the independent variable.

descriptive statistics Provide values that define the characteristics of a data set. Typical descriptive statistics are measures of what is most typical and measures of how different individual values are from each other.

disordered arrays A presentation of data without organization into classes or groups or by magnitude.

distribution of difference scores The theoretical population upon which the independent t test is based.

distribution of sample means A population based on the means of all possible samples of equal size drawn from the particular population, rather than on the values of the individual measures which make up the population.

effect sizes Indicators of the practical importance of a statistically significant outcome.

error variance Variability in a measure stemming from a source other than the variables introduced into the analysis.

eta squared A measure of effect size for ANOVA. It estimates the amount of variability in the DV explained by the IV.

experimental research A type of research in which the group members in a study are randomly selected and the treatment is randomly assigned.

factor An alternate name for an independent variable, particularly in procedures that involve more than one.

factorial ANOVA An ANOVA with more than one IV.

F ratio The test statistic calculated in an analysis of variance problem. It is the ratio of the variance between the groups to the variance within the groups.
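
The ratio described above can be sketched in Python for a one-way ANOVA. This is a minimal illustration that assumes the usual sum-of-squares and mean-square definitions given elsewhere in this glossary; the function name `f_ratio` and the three small groups are hypothetical.

```python
def f_ratio(groups):
    """Return the one-way ANOVA F ratio for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        group_mean = sum(g) / len(g)
        # Variability of the group means around the grand mean
        ss_between += len(g) * (group_mean - grand_mean) ** 2
        # Variability of individual scores around their own group mean
        ss_within += sum((x - group_mean) ** 2 for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    ms_between = ss_between / df_between  # mean square between
    ms_within = ss_within / df_within     # mean square within
    return ms_between / ms_within

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]  # made-up scores
f_value = f_ratio(groups)
print(round(f_value, 2))
```

The resulting value would then be compared with a critical value of F for the relevant degrees of freedom.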

frequency distribution A graphical data presentation that indicates the number of times each score in a group occurs.

frequency polygon A graphical data distribution in which straight lines connect the midpoints of the bars that make up a histogram. The normal curve is a frequency polygon but with so many successive values that the individual lines are imperceptible.

goodness-of-fit chi-square test, or 1 × k chi-square A test for significant differences in the frequency with which nominal data occur in distinct categories.

grouped frequency distribution A graphical data presentation in which the data are grouped according to some characteristic held in common.

histograms A graphical presentation of interval or ratio scale data in which the “bars” in the bar graph-like presentation represent continuous categories reflected in the absence of gaps between the bars.

homogeneity of variance The variance that occurs when multiple groups manifest similar variability.

hypothesis of association The umbrella term for significance tests that analyze the correlation between or among variables.

hypothesis of difference The umbrella term for significance tests that analyze the differences between groups.

independent t test A test to determine whether two samples likely belong to populations with identical means.

independent variable In a research problem, the antecedent variable expected to explain any change in the dependent variable.

inferential statistics Procedures that allow the analyst to draw inferences or conclusions from the data. It is common in statistical analysis, for example, to deduce the characteristics of the population from what occurs in a sample.

interaction Occurs when the combined effect of multiple independent variables is different than the variables acting independently.

intercept The point where the regression line crosses the y axis when the regression solution is plotted in a graph, defined as the value of y when x = 0.

interquartile range (IQR) The middle half of the distribution stretching from the first to the third quartile, or from the measure representing the 25th percentile to that representing the 75th percentile.
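
A short Python sketch of the interquartile range follows. It relies on `statistics.quantiles` from the standard library (Python 3.8 or later) with its default method; other software may use slightly different quartile conventions, so results can differ a little across tools. The function name `interquartile_range` and the data are hypothetical.

```python
import statistics

def interquartile_range(values):
    """Distance between the 25th and 75th percentile cut points."""
    # quantiles(..., n=4) returns the three cut points Q1, median, Q3
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

data = list(range(1, 12))  # the made-up scores 1 through 11
iqr = interquartile_range(data)
print(iqr)
```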

kurtosis A descriptor indicating the level of homogeneity among the measures in a data distribution.

law of large numbers Mathematical principle that errors diminish as the number of data points increases.

least-squares criterion The requirement that the sum of squared errors has its lowest possible value.

leptokurtic A descriptor for distributions of highly similar data gathered too closely around the mean for the resulting distribution to be normal.

linear Describes a relationship between two variables whose strength is consistent throughout their ranges. With curvilinear relationships, the strength and sometimes even the nature of the relationship (positive or negative) changes depending upon where in the variables’ ranges they are measured.

matched-pairs t test A dependent-groups application of the t test in which each participant in the second group is paired to a participant in the first group with the same characteristics, so as to limit the error variance that would otherwise stem from using dissimilar groups.

mean (M) The arithmetic average of a set of values. The mean of a population is represented by the symbol µ.

mean square The sum of squares divided by the relevant degrees of freedom. This division allows the mean square to reflect a mean, or average, amount of variability from a source.

measures of central tendency Indicate what measure is most typical in a data set and include the mean, median, and mode.

measures of variability Indicate how much variety there is in a set of data values. Also called measures of dispersion.

median (Mdn) The middle number when data are ordered.

mesokurtic Literally “middle” kurtic, distributions with an intermediate level of homogeneity among the data that are characteristic of the conventional bell-shaped curve, which represents a normal distribution.

mixed methods research Research involving both qualitative and quantitative variables, as well as methods that are appropriate to each.

mode (Mo) The most frequently occurring value in a set.

modified standard scores Standard scores adjusted to reflect a specified mean and standard deviation.
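
One way to picture this adjustment is to rescale a z score to a new mean and standard deviation. The sketch below is only an illustration; the function name `modified_standard_score` is hypothetical, and the example uses the t score convention given in this glossary (M = 50, s = 10).

```python
def modified_standard_score(z, new_mean, new_sd):
    """Rescale a z score to a distribution with the specified mean and SD."""
    return new_mean + new_sd * z

# A z score of 1.5 expressed as a t score (mean 50, standard deviation 10)
t_score = modified_standard_score(1.5, new_mean=50, new_sd=10)
print(t_score)
```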

multiple correlation Gauges the strength of the relationship between one variable and two or more other variables.

nominal scale Data categories that are not continuous, so the order of the categories is unimportant.

nonexperimental research A type of research which does not perform random selection or random assignment.

nonparametric Tests for data that do not meet the usual normality requirements. More technically, a test in which there is no interest in population parameters.

normal (or Gaussian) distribution Distribution characterized by the following: symmetry, unimodality, and having data distributed such that the range has about six times the value of the standard deviation. When presented in a frequency distribution, a normal distribution takes on the bell-shape by which it is commonly described.

null hypothesis The hypothesis that predicts that two samples belong to populations with identical means for the independent t test.

omega-squared A procedure for determining how much of the difference between groups can be attributed to the independent variable.

one-sample t test A test of whether a sample is likely to belong to a specified population.

one-tailed test When a t test is based on an initial hypothesis predicting the direction of the difference between population means, μ1 > μ2, for example, the test is a one-tailed test; a significant result can occur in only one tail of the distribution.

one-way ANOVA Simplest variance analysis, involving only one independent variable. Similar to the t test.

ordered array A presentation in which data are organized into classes or groups or by magnitude.

ordinary least-squares regression A form of regression in which the sum of the squared prediction errors must have its lowest possible value.
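
For a single predictor, the slope and intercept that satisfy the least-squares criterion have a closed-form solution. The Python sketch below is a minimal illustration under that assumption; the function name `least_squares_line` and the data are hypothetical.

```python
def least_squares_line(x, y):
    """Return (slope, intercept) minimizing the sum of squared prediction errors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares for the predictor
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx                      # change in y per 1.0 increase in x
    intercept = mean_y - slope * mean_x    # predicted y when x = 0
    return slope, intercept

x = [1, 2, 3, 4, 5]  # made-up predictor values
y = [2, 4, 5, 4, 5]  # made-up criterion values
slope, intercept = least_squares_line(x, y)

# Residual scores: actual minus predicted values of the criterion
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
print(round(slope, 2), round(intercept, 2))
```

With a least-squares solution that includes an intercept, the residuals sum to zero apart from rounding error.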

ordinate The vertical (y) axis in a graph based on Cartesian coordinates.

outlier A score that is substantially different from most of the other scores in a data distribution. If they are not balanced by outliers in the opposite direction, these extreme scores create skew.

overfitting the sample When the accuracy of regression solutions diminishes with new data sets.

parameters Population characteristics; symbolized by Greek letters, such as µ for the population mean and σ for the population standard deviation.

partial correlation Measures the relationship between two variables, controlling for the influence of a third in both of the first two.

Pearson correlation coefficient Indicates the strength of the relationship between interval- or ratio-scale variables.
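
A minimal Python sketch of the Pearson coefficient follows, using the deviation-score form of the formula (sum of cross-products divided by the square root of the product of the two sums of squares). The function name `pearson_r` and the data are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]  # made-up scores on the first variable
y = [2, 4, 5, 4, 5]  # made-up scores on the second variable
r = pearson_r(x, y)
print(round(r, 3))
```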

percentile A value below which a certain percentage of all scores in a distribution may be found; 37% of all scores occur below the 37th percentile.

percentile rank A point in a data distribution below which a specified percentage of the scores occur.

phi coefficient Used to determine the strength of the relationship when there are two categories of each variable in the chi-square test of independence. When there are more than two categories of both variables, the measure is Cramér’s V.
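
The sketch below illustrates both measures in Python. It assumes the commonly used formulas, phi = sqrt(chi-square / N) and V = sqrt(chi-square / (N × (k − 1))), where k is the smaller of the number of rows and columns; these formulas are not stated in this glossary, and the function names and table are hypothetical. For a 2 × 2 table, k − 1 = 1, so V reduces to phi.

```python
import math

def chi_square(table):
    """Chi-square statistic and total N for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected frequency if the two variables were independent
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n

def cramers_v(table):
    """Cramér's V; equals the phi coefficient for a 2 x 2 table."""
    chi2, n = chi_square(table)
    k = min(len(table), len(table[0]))
    return math.sqrt(chi2 / (n * (k - 1)))

table = [[10, 20], [20, 10]]  # made-up 2 x 2 frequencies
v = cramers_v(table)
print(round(v, 3))
```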

pie chart A circle that is divided into sectors. The size of each sector is defined by the percentage of the total area of the circle a whole category represents; also called a pie graph.

platykurtic A descriptor for distributions in which the data are too heterogeneous to constitute a normal distribution. Samples are typically more platykurtic than the populations they represent.

point of origin In a graph based on Cartesian coordinates, the point where the values plotted on the abscissa and those on the ordinate both equal zero.

point-biserial correlation A special application of the Pearson correlation for those instances where one of the variables, such as gender or marital status, has just two manifestations.

population Includes all members of a defined group.

post hoc test A test conducted after a significant ANOVA or some similar test that identifies which among multiple possibilities is statistically significant.

power Refers to the likelihood that a test will detect statistical significance when it is present. Because error is often reduced in a factorial ANOVA, reducing the value of $MS_{with}$ and making a significant F more likely, it is often a more powerful test than a one-way ANOVA.

predictor variable In regression, the variable used to predict the value of the criterion variable.

probability The measure of the likelihood that an event will occur. The values range from p = 0, for events that never occur, to p = 1.0, for events that occur every time.

qualitative variables Defined by the kind of characteristic they represent, such as gender or eye color.

quantitative variables Defined by the amount of the characteristic they represent, such as intelligence.

quartiles Indicate the fourths of a data distribution. The first quartile (Q1) is from the point representing the 25th percentile down. The second quartile is from the median (Mdn) of the distribution to Q1, and so on.

quasi-experimental research Research in which some variables are randomly assigned and some are not.

random selection Selection of a sample from a population where every member has an equal probability of being selected.

range (R) The difference between the highest and lowest values in a data set.

range attenuation Occurs when a variable is not measured throughout its entire range. Attenuated range artificially reduces the strength of any resulting correlation value.

regression coefficient Indicates the attitude of the regression line. Sometimes called the slope value, it is defined as the impact on y of increasing x by 1.0.

regression line When positioned in a scatterplot, an illustration of the relationship between predictor and criterion variables. It is positioned so as to minimize the sum of all squared prediction errors.

regression to the mean Term used to describe the fact that extreme values of a predictor always predict less extreme values of a criterion variable. This occurs because normal distributions have the preponderance of data in the middle of the distribution, and frequency declines with distance from the mean.

research design A formal plan for conducting a study. It specifies the variables to be studied; indicates who the subjects in the experiment will be, as well as how they will be selected; specifies how the data will be gathered, including how the independent variable will be manipulated; and indicates the type of analysis to be used.

residual scores The differences between the actual and predicted values of the criterion.

sample Any subset of a population.

sampling error Reflected in the degree to which the characteristics of the sample, such as the mean and standard deviation, vary from those of the population.

scatterplot A graph representing two variables, one on the horizontal axis, the other on the vertical axis. Each point in the graph indicates the measure of both variables for one individual.

semi-partial correlation Gauges the relationship between two variables, controlling for a third in just one of the first two.

shrinkage The degree to which a regression solution diminishes in accuracy when applied to new data sets.

simple or bivariate regression Refers to regression with one predictor variable.

skew A descriptor that indicates that a data distribution lacks symmetry, a characteristic which occurs when scores on one side of the mode are more extreme than those on the other. Positive skew indicates that in the frequency polygon plotted to represent the data, the curve to the right of the mode is more gradual than that to the left. When the skew is negative, the curve to the left of the mode is more gradual.

slope Indicates how much the regression line inclines or declines from left to right.

Spearman’s rho A correlation procedure for two ordinal variables, for one ordinal and one interval/ratio variable, or for two interval or ratio variables that fail to meet Pearson correlation requirements for normality.

standard deviation The sample standard deviation (s) and the population standard deviation (σ) indicate how much individual scores tend to differ from the mean of the respective group.

standard error of the differences The measure of variability in the distribution of difference scores, symbolized $SE_{M_1 - M_2}$.

standard error of the estimate An average measure of error in a regression solution. It is based on the strength of the correlation between the variables and the variability in the criterion variable.
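
The two ingredients named above can be combined as in the following Python sketch. It assumes one common textbook formula, the criterion's standard deviation multiplied by sqrt(1 − r²); that formula is not given in this glossary, and the function name and input values are hypothetical.

```python
import math

def standard_error_of_estimate(sd_criterion, r):
    """Assumed formula: s_y * sqrt(1 - r^2)."""
    return sd_criterion * math.sqrt(1 - r ** 2)

# Made-up values: criterion SD of 10 and a correlation of 0.6
se_est = standard_error_of_estimate(sd_criterion=10, r=0.6)
print(round(se_est, 2))
```

Under this formula, a stronger correlation or a less variable criterion both yield a smaller standard error of the estimate.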

standard error of the mean Standard deviation of the sample means ($\sigma_M$).

standard normal distribution A normal distribution in which µ = 0 and σ = 1.0.

standard scores Normally distributed, equal-interval scores that have a fixed mean and standard deviation.

statistic A characteristic of a sample. Some common statistics are the mean (M), the standard deviation (s), and the range (R).

statistically significant An outcome so unlike the population to which it is compared that it can be presumed to reflect some other population. Put another way, the value of M is distant enough from $\mu_M$ that it probably was not randomly selected from the distribution of sample means. If the probability that an outcome occurred by chance is p = 0.05 or less, the outcome is statistically significant.

stem-and-leaf plots Also called stem-and-leaf displays. Graphical data presentations which arrange data in a vertical hierarchy from the smallest to the largest. All digits except the last in each measure constitute the “stem” of the display. The last digit is the “leaf.” For two-digit numbers, the stem is the 10s value, and the leaf is the 1s value. The leaves of all measures with the same stem value occur to the right of their common stem.

sum of squares The variance measure in analysis of variance. It is the sum of the squared deviations between a set of scores and their mean.

sum of squares between The variability related to the independent variable and any measurement error that may occur.

sum of squares error Another name for the sum of squares within because it refers to the differences after treatment within the same group, all of which constitute error variance.

sum of squares total Total variance from all sources.

sum of squares within Variability stemming from different responses from individuals in the same group. Because all the individuals in a particular group receive the same treatment, differences among them constitute error variance.

systematic sampling error Occurs because the same mistake in selecting a sample of a population is made over and over.

t score A standard score based on a normal distribution in which M = 50 and s = 10. They are sometimes preferred to z scores because they rarely involve negative values.

two-tailed test When a t test is based on the assumption that the sample comes from a population that has either a significantly larger or smaller value than the population to which it is compared, it is a two-tailed test; a significant result can occur in either tail of the distribution.

type I errors, or alpha (α) errors Decision errors made when a result is judged to be statistically significant, but further research and testing would show that it is not.

type II errors, or beta (β) errors Decision errors that occur when the sample is characteristic of some population other than the distribution of sample means to which it was compared, but the statistical testing suggests no significant difference.

unimodal A distribution with just one most frequently occurring value, one mode.

variables Characteristics that can have changing values.

variance ($s^2$) The square of the standard deviation. The variance is one measure of how much individual scores differ from the mean of the group.

within-subjects F The dependent-groups equivalent of the one-way ANOVA. In this procedure, either participants in each group are paired on the relevant characteristics with participants in the other groups, or one group is measured repeatedly after different levels of the independent variable are introduced.

z score The score that results when scores from any source are made to conform to the characteristics of the standard normal, or z, distribution. The z distribution has M = 0 and s = 1.

z test Indicates how distant a sample mean is from the mean of the distribution of sample means in units of the standard error of the mean. When the absolute value of z is 1.96 or greater, there is a probability of p = 0.05 or less that the sample belongs to the population.
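
As a sketch of the calculation, the Python example below divides the distance between the sample mean and the population mean by the standard error of the mean, taken here to be σ / sqrt(n). The function name `z_test` and the input values are hypothetical.

```python
import math

def z_test(sample_mean, pop_mean, pop_sd, n):
    """Distance of a sample mean from the population mean, in standard errors."""
    standard_error = pop_sd / math.sqrt(n)  # standard error of the mean
    return (sample_mean - pop_mean) / standard_error

# Made-up values: a sample of 36 with M = 105 from a population with mean 100, SD 15
z = z_test(sample_mean=105, pop_mean=100, pop_sd=15, n=36)
significant = abs(z) >= 1.96  # two-tailed comparison at p = 0.05
print(z, significant)
```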

z transformation Changes any raw score into a z score so that it fits the standard normal distribution.
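
The transformation can be sketched in Python as the raw score's deviation from the mean divided by the standard deviation. The function name `z_transform` and the values are hypothetical.

```python
def z_transform(raw_score, mean, sd):
    """Express a raw score as its distance from the mean in SD units."""
    return (raw_score - mean) / sd

# Made-up values: a raw score of 85 in a distribution with mean 70 and SD 10
z_score = z_transform(85, mean=70, sd=10)
print(z_score)
```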

References

Arroyo, Y. (2015). A descriptive and correlational study between employees’ level ofworkplace engagement and workplace consideration. [Abstract]. Dissertation AbstractsInternational: Section A, Humanities and Social Sciences, 75 (10-A(E)).

Bai, Y., Lai, L., Lee, B., Chang, Y., & Chiou, C. (2015). The impact of depression on fatiguein patients with haemodialysis. A correlational study. Journal of Clinical Nursing, 24,2104–2022.

Bear, J. B., & Babcock, L. (2012). Negotiation topic as a moderator of gender differences innegotiation. Psychological Science: Research, Application, and Theory in Psychologyand Related Sciences, 23, 743–744.

Bowers, J. S., Mattys, S. L., & Gage, S. H. (2009). Preserved implicit knowledge of aforgotten childhood language. Psychological Science, 20, 1064–1069.

Brighton Webs Ltd. (2006). Critical values of correlation coefficient (R) [Table]. Statistics forEnergy and the Environment. Retrieved from http://www.brighton-webs.co.uk/statistics/critical_values_r.aspx

Brock, S. E. (2010). Descriptive statistics and psychological testing. [Course handout].Sacramento, CA: California State University. Retrieved from http://www.csus.edu/indiv/b/brocks/Courses/EDS%20250/EDS%20250/Handouts/11/Descrptive%20Statistics%20and%20the%20Normal%20Curve.pdf

Brown, P. H. (2012). Alternative class rankings using z scores. Assessment & Evaluation inHigher Education, 37, 889–905.

Butler, A. C., Zaromb, F. M., Lyle, K. B., & Roediger III, H. L. (2009). Using popular filmsto enhance classroom learning: The good, the bad, and the interesting. PsychologicalScience, 20, 1161–1168.

Ceci, M. W., & Kumar, V. K. (2015, January 14). A correlational study of creativity,happiness, motivation, and stress from creative pursuits. Journal of Happiness Studies,1–18. doi: 10.1007/s10902-015-9615-y.

Critical values of F. (n.d.). [Table]. Retrieved from http://faculty.vassar.edu/lowry/apx_d.html

Diekhoff, G. (1992). Statistics for the social and behavioral sciences: Univariate, bivariate,and multivariate. Dubuque, IA: Brown.

Fischer, R., & Milfont, T. L. (2010). Standardization in psychological research. InternationalJournal of Psychological Research, 3(1), 88–96.

Fisher, R. A. (1925). Statistical methods for research workers. Edinburgh, UK: Oliver andBoyd.

Flynn, J. R., & Weiss, L. G. (2007). American IQ gains from 1932–2002: The WISC subtestsand educational progress. International Journal of Testing, 7, 209–224.

Friendly, M. (2000). Visualizing categorical data. Cary, NC: SAS Institute.

Gertsman, B. B. (n.d.). Probability tables: t table. [Adapted table]. Retrieved from http://www.sjsu.edu/faculty/gerstman/StatPrimer/t-table.pdf

Guo, J., & Drasgow, F. (2010). Identifying cheating on unproctored internet tests: The z-testand the likelihood ratio test. International Journal of Selection and Assessment, 18(4),351–364.

Habersham, S. L. (2014). Dual enrollment: An analysis of persistence, ethnicity, and gender. Dissertation Abstracts International, Section A: Humanities and Social Sciences, 74(10-A(E)).

Hewer, M. (2014). Selling sweet nothings. Observer, 27(10), 14–18.

Jokela, M. (2012). Birth-cohort effects in the association between personality and fertility. Psychological Science, 23, 835–841.

Kiecolt-Glaser, J. K., Weng, N., Malarkey, W. B., Beversdorf, D. Q., & Glaser, R. (2011).Childhood adversity heightens the impact of later-life caregiving stress on telomerelength and inflammation. Psychosomatic Medicine, 73, 16–22.

Lambert-Lee, K. A., Jones, R., O’Sullivan, J., Hastings, R. P., Douglas-Cobane, E., Esther, T.J., . . . Griffith, G. (2015). Translating evidence-based practice into a comprehensiveeducational model within an autism-specific special school. British Journal of SpecialEducation, 42(1), 69–86.

Linn, R. L. (2003). Accountability: responsibility and reasonable expectations. EducationalResearcher, 32(7), 3–13.

Lockart, R. S. (1998). Introduction to statistics and data analysis for the behavioral sciences.New York, NY: W.H. Freeman.

Longo, M. R., Long, C., & Haggard, P. (2012). Mapping the invisible hand: A body model ofa phantom limb. Psychological Science: Research, Application, and Theory inPsychology and Related Sciences, 23, 740–742.

Lopez, N. (2003). Hopeful girls, troubled boys: Race and gender disparity in urbaneducation. New York, NY: Routledge.

Malik, A., Goodwin, G., Hoppitt, L., & Holmes, E. (2014). Hypomanic experience in youngadults confers vulnerability to intrusive imagery after experimental trauma: Relevancefor bipolar disorder. Clinical Psychological Science, 2, 675–684.

“Normal Distribution.” (2014). MathIsFun.com. Retrieved from https://www.mathsisfun.com/data/standard-normal-distribution.html

Overbeek, W. A. (2012). Differences in body mass index z scores in a Dutch pediatricpsychiatric population with and without use of second-generation antipsychotics. Journal of Child and Adolescent Psychopharmacology, 22, 166–173.

Robinson, T. E. (2014). Different roads, same reward. Observer, 27(10), 17, 25.

Sheskkin, D. (2004). Handbook of parametric and nonparametric statistical procedures. (3rded.) Boca Raton, FL: Chapman & Hall/CRC Press.

Sprinthall, R. C. (2000). Basic statistical analysis (6th Ed.). Needham Heights, MA: Allynand Bacon.

Sprinthall, R. C. (2011). Basic statistical analysis (9th ed.). New York: Pearson.

StatSoft. (2011). Electronic Statistics Textbook. Tulsa, OK: StatSoft. Retrieved from http://www.statsoft.com/textbook/distribution-tables/#z

Thomas, D. R., & Zumbo, B. D. (2012). Difference scores from the point of view ofreliability and repeated measures ANOVA. Educational and PsychologicalMeasurement, 72(1), 37–43.

Thomas, P., Rammsayer, T., Schweizer, K., & Troche, S. (2015). Elucidating the functionalrelationship between working memory capacity and psychometric intelligence: A fixed-links modeling approach for experimental repeated-measures designs. Advances inCognitive Psychology, 11(1), 3–13.

Tufte, E. R. (2001). The visual display of quantitative information. (2nd ed.) Cheshire, CT:Graphic Books.

Tukey’s HSD critical values. (n.d.). [Table]. Retrieved from http://www.stat.duke.edu/courses/Spring98/sta110c/qtable.html

University of Sussex. (n.d.). Critical values of Spearman’s rho (two-tailed). [Table].Retrieved from http://www.sussex.ac.uk/Users/grahamh/RM1web/Rhotable.htm

Virginia Tech, Quantitative Population Ecology. (n.d.). Table of chi-square statistics.Retrieved from http://alexei.nfshost.com/PopEcol/tables/chisq.html

Witkin, H. A., Moore, C. A., Goodenough, D. R., & Cox, P. W. (1977). Field-dependent andfield-independent cognitive styles and their educational implications. Reviews ofEducational Research, 47, 1–64.