The research assignment
Sukal, M. (2019). Research methods: Applying statistics in research. San Diego, CA: Bridgepoint Education, Inc.
Learning Objectives
After reading this chapter, you will be able to. . .
· distinguish between data of nominal, ordinal, interval, and ratio scale.
· associate the measures of central tendency with the scale of the data for which those measures are appropriate descriptive statistics.
· calculate the mean, median, and mode for a set of data.
· calculate and interpret the variance and the standard deviation for a set of data.
· distinguish between the characteristics and the notation used for sample and population data.
· present results of descriptive statistics and draw conclusions based on data analysis.
· interpret and report descriptive statistics in APA format.
A statistic published by IBM (Eaton, Deroos, Deutsch, Lapis, & Zikopoulos, 2012) stated that, "Every day, we create 2.5 quintillion bytes of data, so much that 90% of the data in the world today has been created in the last two years alone. This data comes from everywhere: sensors used to gather climate information, posts to social media sites, digital pictures and videos, purchase transaction records, and cell phone GPS signals to name a few." In fact, these statistics will already be outdated by the time you read this in your graduate class. In other words, we are a data-driven society in which using, analyzing, and understanding data will make you a more marketable and better-educated professional, whether as a psychologist, a consultant, or a consumer.
As a graduate student, you will encounter two categories of learners: field-dependent and field-independent. Field-dependent learners process information as a whole. These people tend to see complex problems or tasks in their entirety, and they usually resist taking them apart; they shy away from analytical tasks. People with this type of cognitive style are sometimes referred to as global thinkers. In contrast, field-independent learners are inclined to break complex problems into pieces. Their approach is to understand the whole by reducing it to its elements. People with this type of cognitive style are called analytical thinkers.
What does this theory about cognitive-style preferences have to do with learning statistics? The arm's-length assessment preferred by field-dependent people can be helpful, for example, to make a quick appraisal of whether a group of students understands a new concept. However, solving job-related problems often requires an analytical approach. Statistical analysis, with its concern for measurement, detail, and mathematical precision, seems to favor field-independent people. So where does that leave students for whom an analytical approach is not second nature? For these people, data analysis can seem like standing on foreign soil. This book has been written to help everyone involved in behavioral sciences tackle quantitative problems, especially students who do not naturally gravitate to data analysis. If analytical tasks just happen to be what you prefer, terrific! If they are not, you too will discover you can navigate your way around statistics.
1.1 How Much Math Will You Need?
Statistical analysis is a means to an end, rather than the end itself. You are studying the subject so that you can understand the numerical measures that you encounter in your job rather than because you are interested in statistical theory. This text explains what statistical formulas mean and in some cases why they have the form they have. However, for the most part, the math you have learned in secondary school will be quite adequate for this course. Ordinarily, there will be nothing more mathematically complex than order of operations to worry about. If you can remember whether to multiply or add first, and what parentheses and exponents mean, you will be fine. If not, review "Please Excuse My Dear Aunt Sally," or whatever memory aid you used in high school to prompt you to
· first, do what's in the parentheses,
· then deal with any exponents (any squaring, for example),
· then do any multiplication and division, working from left to right, and
· last, any addition and subtraction, from left to right.
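The order of operations listed above can be checked with any calculator or programming language. The short Python sketch below uses arbitrary numbers chosen only for illustration (they are not data from this chapter) to show how parentheses, exponents, and left-to-right evaluation change a result.

```python
# Order of operations as Python evaluates it.
# The numbers are arbitrary illustrations, not data from the text.

# Exponent first (4 ** 2 = 16), then multiplication (3 * 16 = 48), then addition.
without_parens = 2 + 3 * 4 ** 2

# Parentheses first (2 + 3 = 5), then the exponent (16), then multiplication.
with_parens = (2 + 3) * 4 ** 2

# Division and multiplication share a level, so work left to right: (20 / 5) * 2.
left_to_right = 20 / 5 * 2

print(without_parens)  # 50
print(with_parens)     # 80
print(left_to_right)   # 8.0
```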
In some contemporary statistics classes, the student does not do very much mathematical calculation. Computer programs, applets, and online calculators are so easy and so readily available that some educators say there is little need for the traditional longhand mathematical calculations that were once mainstays in statistics classes. Statistical packages such as SPSS and SAS are generally quite affordable, while others such as R are free, and many spreadsheet programs like Microsoft Excel can make all of the basic calculations, as well as many of the simpler statistical tests. With all of these resources, why do anything by hand?
Those who take the time to learn and complete the calculations by hand come to understand the underlying concepts more clearly than those who have the computer do all the work. With computer output, there is a result to view, and you can usually be assured of accuracy and precision, but often the tables and statistics do not communicate the logic involved very well. There is typically little guidance about what the output means or how the results were derived and why.
Hand calculation requires you to take each incremental step required for a solution. Taking these connected steps makes the reasoning more transparent and more coherent. The intent is that completing each chapter will enable you to explain what you have done and why it was necessary. Once you understand the logic and the process, you can turn to a computer for verification and for the economy of time that software analyses can provide, particularly with larger data sets.
1.2 What About the Statistical Notation Symbols?
Greek letters are commonly used to signify procedures or values that we use repeatedly. You may already know how to perform the procedures to which some of the Greek letters refer, and you have probably already calculated others under a different name. For example,
∑, the uppercase Greek letter sigma, indicates summation. That symbol tells us to add several values; ∑x means that several values are to be summed, with that group of values referred to as x. If x refers to a group of verbal ability scores, ∑x indicates that the scores are to be added.
Sometimes a Greek letter indicates a value. A population is all the members of a defined group. All the members of your family comprise a population, as do all Nevada voters, or all psychologists in the country.
The average, or the mean, of some population characteristic (the mean age of Nevada voters, for example) is indicated by the lowercase Greek letter mu (μ), pronounced "mew," like the sound a cat makes.
Although the idea of a mean score probably is not new, the symbol μ may be. The symbol is just a shorthand way to refer to a population mean. If the ages of the members of your family are 14, 21, 45, and 47, μ = 31.75 [(14 + 21 + 45 + 47)/4].
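As a minimal sketch of what ∑x and μ stand for, the family-age example can be reproduced in a few lines of Python. The variable names (`ages`, `sum_x`, `N`, `mu`) are chosen here for illustration and are not notation from the text.

```python
# The family is treated as a complete population, so the mean is mu.
ages = [14, 21, 45, 47]

sum_x = sum(ages)   # the summation, "sigma x": 14 + 21 + 45 + 47
N = len(ages)       # the number of members in the population
mu = sum_x / N      # the population mean

print(sum_x)  # 127
print(mu)     # 31.75
```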
People working with statistics find the symbols very helpful. They provide a great economy, particularly when the same procedures and values are used repeatedly, as they tend to be in statistical analysis. None of the symbols we use represents concepts or operations that are very complicated; they are just a briefer way to refer to operations or characteristics that we use repeatedly.
The point of all of this is that statistical analysis does not need to be the exclusive domain of those with an analytical style. It can be used by people who are global thinkers as well. The math in this book will be entirely manageable, and the notation used in the formulas is more economy than mystery.
1.3 What Are the Objectives of Statistics?
Many of the problems that behavioral science professionals must solve are complex and nuanced because the problems are about people with all of their variability. Some of the procedures required to analyze behavior must accommodate that complexity. Complex usually means many-faceted, so we often have several things to explain. We may wish to describe the mean age, the typical gender, the marital status, and the social class of a group of voters, which is a complex problem, but it is not difficult.
In fact, the descriptive statistics used to portray the characteristics of the things we study tend to be rather straightforward. Descriptive statistics are an economical way to describe characteristics found in multiple measures. When weather people say that the average temperature for the date is 55 degrees or that average rainfall for the month is 2.5 inches, they are providing descriptive statistics for a larger body of data that has been gathered over many years.
Other statistics that we can calculate have a different purpose. Perhaps a psychologist examines a group of clients with alcohol addiction problems to determine what the relationship is between addiction and depression for all who struggle with alcohol addiction. The psychologist's objective is to produce inferential statistics or to infer the characteristics of all people with alcohol addiction problems by examining a subset of that population, known as a sample. The ability to use the sample as a window on the population requires some care in selecting the sample and in determining the values. But when the appropriate conventions are observed, samples can provide great insight regarding population characteristics.
Our initial interest, however, is in descriptive statistics. We want to learn to calculate and interpret them. Once we have done that, descriptive statistics will lead us to the inferential statistics.
1.4 Describing Data
There are many ways to describe data. Although it is common to rely on numbers to manage large quantities of information, the numbers for different kinds of data convey different meanings. A data scale is the kind of information that data values provide. There are four scales, or kinds, of data:
· nominal data,
· ordinal data,
· interval data, and
· ratio data.
The first two types of data scales (nominal and ordinal) are known as categorical data because these values are discrete entities where each number is distinct from the other numbers in the array. There are also no fractions associated with these ratings. Conversely, the latter two scales (interval and ratio) are continuous data because these numbers can be used in mathematical functions and can take fractional values between ratings (e.g., 1.5, 2.3).
Nominal Data
Nominal data identifies the group to which an individual belongs. Some researchers refer to this as categorical data because the values refer to specific categories, such as gender, race, ethnicity, marital status, and religion. It is common to assign numbers to nominal data categories (1 for African American, 2 for Asian, 3 for Caucasian, 4 for Hispanic, for example) to make them easier to keep track of. The numbers have no mathematical meaning, however. It does not make sense to try computing the "average" ethnicity, for example, by adding up the 1s, 2s, 3s, and 4s and then dividing by the number of categories. The values are just labels. They indicate the group to which an individual belongs and keep us from having to write out the entire name.
Ordinal Data
Ordinal data provides more information than nominal data, enough in fact that individuals can be ranked. Higher values indicate more of whatever characteristic is measured, although they do not indicate how much more. By contrast, a higher number in nominal data just means that the individual belongs to a different group. Noting that one individual is faster, or more creative, or more successful than another is an example of ordinal measurement. It is not very precise, but it does provide for a ranking. A student who is ranked 5th in the class is doing better than someone ranked 6th, although how much better is not clear. First prize at a car show is better than second, but we cannot know the margin of victory. Additional examples are educational ranks (e.g., 1 = Bachelor degree, 2 = Master degree, 3 = Doctoral degree) or military ranks (e.g., 1 = Private, 2 = Corporal, 3 = Sergeant, 4 = Colonel, 5 = General). Ordinal data only allows ranking individuals relative to the others in the group, without equal gaps between ranks.
Interval Data
Try It!
There is a deep-seated debate in academia about the connections between statistics and the scales of measurement used for ordinal and interval data. Read the articles by S. S. Stevens (1946), J. Michell (1986), and T. R. Knapp (1990) to uncover more about the different perspectives.
With interval data, the size of the gap, or the interval, between individual cases is known because the amount of difference between consecutive data points is the same anywhere along the number line. Fahrenheit temperature measurements are an example of interval scale data. The increase in heat from 52 to 57 degrees is the same as the increase from 86 to 91 degrees. It is not just that the change is 5 degrees in either case, but that the incremental increase in heat is the same in both instances. In other words, the degree is equal across all intervals of the scale.
Interval scale measurement is quite common in the social sciences mainly because psychometricians (test developers) treat instruments developed to gauge depression, aptitude, intelligence, personality, and so on, as producing interval data. In other words, when employing a rating scale (1 = strongly disagree, 2 = disagree, 3 = neutral, 4 = agree, 5 = strongly agree), one assumes that the increments between anchors 1, 2, 3, 4, and 5 are equal. This assumption of equal distances between anchors allows data analysts to treat the data as interval scale, which permits more sophisticated analyses. The assumption is debatable, however. Critics argue that the scaling of psychological construct items is not equal across rating anchors, so the items should be treated as ordinal data with unequal intervals and interpreted on the basis of ranking only, which calls for caution when sophisticated analyses are applied to such data.
Ratio Data
Ratio scale data has all the characteristics that interval data has plus two more. First, a 0 in ratio scale measures indicates the absence of the characteristic being measured. By contrast, with interval scale data, 0 is just a point on the scale midway between −1 and +1. If the temperature is 0 degrees Fahrenheit, it does not mean that it cannot become colder; 0 is just a point on the scale.
Try It!
A (1) Someone comments about a student, "He's more ambitious than his classmates." What is the scale of data that is involved in such a statement? (2) What is the scale of the data in your bank statement?
The other characteristic that sets ratio data apart has to do with the name. With this data, what are called ratio comparisons are possible. A person who is 6 feet tall can be accurately said to be twice as tall as a child who is 3 feet tall, a 20-year-old is half as old as someone who is 40, and someone who makes $90,000 a year has an income three times as large as that of the person who makes $30,000. These ratio comparisons are part of what gives ratio data the name.
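One way to see why ratio comparisons need a true zero is to change the unit of measurement and check whether the ratio survives. The sketch below is an illustration only: the temperatures of 80 and 40 degrees are arbitrary values that do not come from the text, and the function names are hypothetical. It relies on the standard conversions of 12 inches per foot and C = (F − 32) × 5/9.

```python
def feet_to_inches(feet):
    """Convert a height in feet to inches (ratio scale: 0 means no height)."""
    return feet * 12

def fahrenheit_to_celsius(f):
    """Convert Fahrenheit to Celsius (interval scale: 0 is just a point on the scale)."""
    return (f - 32) * 5 / 9

# Height: the 6-foot adult is twice as tall as the 3-foot child in either unit.
height_ratio_feet = 6 / 3
height_ratio_inches = feet_to_inches(6) / feet_to_inches(3)

# Temperature: the apparent "twice as hot" disappears when the unit changes.
temp_ratio_f = 80 / 40
temp_ratio_c = fahrenheit_to_celsius(80) / fahrenheit_to_celsius(40)

print(height_ratio_feet, height_ratio_inches)  # the two height ratios agree
print(temp_ratio_f, round(temp_ratio_c, 2))    # the two temperature ratios do not
```

Because the height ratio is the same in feet and in inches, it describes the people rather than the unit. The temperature ratio depends entirely on where the scale happens to place its 0, which is why ratio statements are avoided for interval data.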
Except for those kinds of descriptive characteristics, however, ratio scale measurement is rare in the social sciences. Although we can say that someone who scores 20 on a spelling test answered twice as many questions correctly as someone who scored 10, we probably wouldn't say that the person is twice the speller that the person who scored 10 is, or four times the speller that the person who scored 5 is. Achievement testing, and in fact mental measurements generally, are rarely ratio scale.
One of the reasons for discussing data scale is that some descriptive statistics are specific to data of a particular scale. Some of the statistical tests you will calculate later are likewise tied to data of particular scales, so it is important that you become familiar with data scale. Tests that require interval data will also accommodate ratio data, so the distinction between those two is not important in terms of which test to use. The differences between nominal, ordinal, and interval data matter a great deal, however.
1.5 Descriptive Statistics
Descriptive statistics indicate the most common characteristics in a particular data set collected over time. They provide researchers with the ability to identify and measure similarities and differences that are revealed by a study.
The statistics that indicate what is most typical are called measures of central tendency. Statistics that indicate data variety are called measures of variability, or sometimes measures of dispersion. First, let us take a look at measures of central tendency.
Measures of Central Tendency
There are three different measures of central tendency: the mode, the median, and the mean. Each indicates what is most typical in a data set, but each defines what is typical differently. The mode is the value that occurs most frequently. The median is the middle number in an array of numbers, and the mean is the average of all numbers. Let us discuss each of these in greater detail, their respective calculations, and when they are to be used.
The Mode
The mode is the most frequently occurring value in a group, and it is the statistic for indicating central tendency when the data are nominal scale. Perhaps someone is interested in the political party affiliations for 20 social workers. If they indicate their party on a questionnaire and if 1 indicates Democrat, 2 Republican, 3 Independent, and 4 Green Party, the results might be as follows:
1, 2, 2, 1, 1, 3, 4, 1, 4, 3, 2, 1, 1, 2, 1, 3, 2, 1, 1, 2
Remember that the numbers indicate the category. If those in each category are counted, the result is this:
| Value | Frequency |
| 1s | 9 |
| 2s | 6 |
| 3s | 3 |
| 4s | 2 |
Because the most commonly occurring value is 1, mode = 1, which indicates that in this group of 20 social workers, more associate themselves with the Democratic Party than with any of the other three parties.
If we were to add a few more numbers to this data array:
1, 2, 2, 1, 1, 3, 4, 1, 4, 3, 2, 1, 1, 2, 1, 3, 2, 1, 1, 2, 2, 2, 2
Remember that the numbers indicate the category. If those in each category are counted, the result is this:
| Value | Frequency |
| 1s | 9 |
| 2s | 9 |
| 3s | 3 |
| 4s | 2 |
Now there are two values (1 and 2) that have a frequency of 9. This is called a bimodal distribution (literally two modes) because both values occur with the same, highest frequency. If there are more than two most frequently occurring values, this is termed a multimodal distribution.
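The two party-affiliation counts can be verified with a short Python sketch. It uses `collections.Counter` and `statistics.multimode` from the standard library (the latter needs Python 3.8 or later); the variable names are chosen here for illustration.

```python
from collections import Counter
from statistics import multimode

# 1 = Democrat, 2 = Republican, 3 = Independent, 4 = Green Party
parties = [1, 2, 2, 1, 1, 3, 4, 1, 4, 3, 2, 1, 1, 2, 1, 3, 2, 1, 1, 2]

counts = Counter(parties)     # frequency of each category label
modes = multimode(parties)    # every value tied for the highest frequency

# Add three more 2s, as in the second data array.
extended = parties + [2, 2, 2]
extended_counts = Counter(extended)
extended_modes = multimode(extended)

print(counts[1], counts[2], counts[3], counts[4])  # 9 6 3 2
print(modes)                                       # [1]
print(extended_counts[1], extended_counts[2])      # 9 9
print(extended_modes)                              # [1, 2]
```

`multimode` returns a list rather than a single number, which makes a bimodal or multimodal result visible instead of silently reporting only one of the tied values.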
The mode is often calculated for ordinal, interval, and ratio data, but it is the only measure of central tendency that makes sense for nominal data. When it is used for ordinal and for interval/ratio data, it indicates the most commonly occurring value just as it does for nominal data, but with ordinal or interval/ratio data, it is usually reported along with other measures of central tendency.
The Median
When scores are arranged either from largest to smallest or from smallest to largest, the median (Mdn) is the middle score. The median requires data of at least ordinal scale, although it can also be calculated for interval/ratio data. The median is not calculated for nominal data because the numbers are only category labels.
Suppose all freshman students at a university are ranked in order of their academic performance at the end of the year. These are the class rankings for nine freshmen in the department of biology:
3, 7, 13, 15, 17, 33, 36, 42, 51
The median is the middle ranking. With nine rankings, the middle ranking is the fifth, which for these class rankings is 17, so the Mdn ranking for the biology freshmen = 17. Note that for this data set, there is no mode. All the rankings occur with the same frequency, 1.
If there were 10 students in biology, and the 10th had a class ranking of 52, we would have the following:
3, 7, 13, 15, 17, 33, 36, 42, 51, 52
In this case, there is an even number of scores, so there are two middle scores. The median is the average of those two middle scores. The fifth and sixth scores are 17 and 33: (17 + 33) ÷ 2 = 25. With the additional class ranking, the Mdn ranking for the biology freshmen = 25.
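Both median calculations can be checked with `statistics.median` from the Python standard library, which sorts the values and averages the two middle ones when the count is even. The variable names are illustrative only.

```python
from statistics import median

rankings_9 = [3, 7, 13, 15, 17, 33, 36, 42, 51]
rankings_10 = rankings_9 + [52]   # the 10th student, ranked 52nd

mdn_9 = median(rankings_9)        # odd count: the single middle value
mdn_10 = median(rankings_10)      # even count: mean of the two middle values

print(mdn_9)   # 17
print(mdn_10)  # 25.0
```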
The Mean
The mean (M) is the average of a set of values, but hereafter we will use the term mean rather than average. To calculate the mean, the data must be interval or ratio scale. Perhaps a data analyst is interested in gauging the level of depression among 10 faculty members. A psychologist administers a depression scale to the 10 faculty members and has the following results:
3, 4, 4, 5, 5, 6, 6, 7, 7, 8
Probably calculating a mean is not new to you, but the symbols that statistics uses might be, so note Formula 1.1:
$$M = \frac{\sum x}{n}$$ (Formula 1.1)
Where
M = the mean
x = each value in the set (each depression score in this case)
n = the number of values or scores
For the depression data, verify that
∑x = 55 (this is the sum of the individual scores),
n = 10,
∑x/n = 55/10 = 5.5.
M = 5.5. The mean level of depression is 5.5.
It used to be common to use x̄ (pronounced "ex-bar") to symbolize the mean, but most journals in psychology and other disciplines now use M to indicate the mean of a sample (rather than the μ for population means that we noted earlier). It is the convention we will follow throughout this presentation.
Because the median, like the mean, can be calculated for interval data, we can also indicate the median level of depression for these faculty members. With an even number of scores as there are here, the median will be the midpoint between the two middle scores, the 5 and the 6. For the depression scores, Mdn = 5.5.
Calculating the mode for the depression scores shows that there are four modes (4, 5, 6, and 7) because 4, 5, 6, and 7 each occur twice, more often than any other score. In small data sets, the mode sometimes does not reveal very much and often is not calculated at all, but the most frequent score or measurement can be very informative with larger groups.
The fact that the mean and median have the same value (M = 5.5, Mdn = 5.5) will have particular importance in Chapter 2 when we discuss data normality. For now, just note that the mean and the median have the same value, and for our set of depression scores at least, the mode is not very helpful.
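All three measures of central tendency for the depression scores can be confirmed with the Python standard library. This is only a verification sketch; the variable names are chosen here for illustration.

```python
from statistics import mean, median, multimode

scores = [3, 4, 4, 5, 5, 6, 6, 7, 7, 8]

M = mean(scores)            # Formula 1.1: sum of the scores divided by n
Mdn = median(scores)        # midpoint of the two middle scores, 5 and 6
modes = multimode(scores)   # every score tied for the highest frequency

print(M)      # 5.5
print(Mdn)    # 5.5
print(modes)  # [4, 5, 6, 7]
```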
Measures of Variability
Measures of central tendency are more informative when they are accompanied by a variability measure, and these two types of statistics often go hand in hand. Measures of variability indicate how much variety there is in scores, and for most measures of variability, large values indicate more difference. Relatively small variability values indicate that the data is quite similar, or quite homogeneous.
The Range
When you need a quick measure of data variability, the range is the easiest to calculate. It is the difference between the highest and lowest values. For the depression data
3, 4, 4, 5, 5, 6, 6, 7, 7, 8
range = 5 (8 − 3)
It is common to hear people say, "Scores ranged from ____________ to ____________." To a statistician, however, the range is just one value: the difference between the highest and lowest measures. For that reason the range isn't very informative when it's reported by itself, such as the depression data's range of 5.
If you read a research report, and the authors report only that range = 5, you would not know how many values were involved or even what the highest and lowest values were. For example, if two administrators from the same college as the faculty members are also measured on the depression scale, and they are found to have scores of 9 and 14, the range for those two administrators is also
range = 5
Their depression scores are quite different from the depression scores of the faculty members in terms of both how many there are and their values, but the range does not reflect either of those differences.
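The comparison between the faculty and the administrators can be sketched in Python. The helper name `value_range` is hypothetical, chosen here to avoid clashing with Python's built-in `range`.

```python
def value_range(values):
    """Return the statistical range: highest value minus lowest value."""
    return max(values) - min(values)

faculty = [3, 4, 4, 5, 5, 6, 6, 7, 7, 8]
administrators = [9, 14]

faculty_range = value_range(faculty)          # 8 - 3
admin_range = value_range(administrators)     # 14 - 9

print(faculty_range)  # 5
print(admin_range)    # 5
```

The two groups differ in size and in the level of their scores, yet the single number returned is identical, which is the limitation described above.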
The Variance and the Standard Deviation
There are other dimensions to data variability besides the range. The variance (s²) and the standard deviation (s) both gauge variability in terms of how much individual values tend to differ from the mean (M) of the group. Their formulas are very similar.
That for the variance is
$$s^2 = \frac{\sum (x - M)^2}{n - 1}$$ (Formula 1.2)
That for the standard deviation is
$$s = \sqrt{\frac{\sum (x - M)^2}{n - 1}}$$ (Formula 1.3)
In the case of either formula,
∑ = summation,
x = each score in the group,
M = the mean of the group,
n = the number of scores in the sample.
As Formulas 1.2 and 1.3 suggest, if you multiply the standard deviation (s) by itself, the result is the variance (s²). From the other direction, entering the variance into the calculator and taking the square root produces the standard deviation.
The steps for calculating the variance follow:
1. Determine the mean for the group, M.
2. Take the difference between each individual score in the group and the mean (x − M for each score).
3. Square the differences from each of the x − M calculations, (x − M)².
4. Sum all of the squared differences, ∑(x − M)².
5. Divide the sum of the squared differences by the number of scores minus one, n − 1.
Using the depression data (3, 4, 4, 5, 5, 6, 6, 7, 7, 8), the procedure is
1. M = ∑x/n = 55/10 = 5.5.
2. Subtract M from each x.
   3 − 5.5 = −2.5
   4 − 5.5 = −1.5
   4 − 5.5 = −1.5
   5 − 5.5 = −0.5
   5 − 5.5 = −0.5
   6 − 5.5 = 0.5
   6 − 5.5 = 0.5
   7 − 5.5 = 1.5
   7 − 5.5 = 1.5
   8 − 5.5 = 2.5
3. Square each x − M difference. Remember that the square of a negative number is positive.
   (−2.5)² = 6.25
   (−1.5)² = 2.25
   (−1.5)² = 2.25
   (−0.5)² = 0.25
   (−0.5)² = 0.25
   (0.5)² = 0.25
   (0.5)² = 0.25
   (1.5)² = 2.25
   (1.5)² = 2.25
   (2.5)² = 6.25
4. Sum the squared differences.
   6.25 + 2.25 + 2.25 + 0.25 + 0.25 + 0.25 + 0.25 + 2.25 + 2.25 + 6.25 = 22.50
5. Divide the sum by the number of scores minus 1.
   22.50/9 = 2.50
   s² = 2.50
If you need the standard deviation, there is one more step.
6. Because the standard deviation is the square root of the variance,
$$s = \sqrt{s^2}$$
$$s = \sqrt{2.50} = 1.58$$
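The six steps above can be traced one at a time in Python. This is a minimal sketch that mirrors the hand calculation rather than a recommended way to compute these statistics in practice; the intermediate variable names are chosen here for illustration, and the last two lines cross-check the result against the standard library's `statistics.variance` and `statistics.stdev`, which also divide by n − 1.

```python
from math import sqrt
from statistics import stdev, variance

scores = [3, 4, 4, 5, 5, 6, 6, 7, 7, 8]
n = len(scores)

# Step 1: the mean of the group.
M = sum(scores) / n

# Step 2: the difference between each score and the mean.
deviations = [x - M for x in scores]

# Step 3: square each difference.
squared_deviations = [d ** 2 for d in deviations]

# Step 4: sum the squared differences.
sum_of_squares = sum(squared_deviations)

# Step 5: divide by n - 1 to get the sample variance (Formula 1.2).
s2 = sum_of_squares / (n - 1)

# Step 6: take the square root to get the standard deviation (Formula 1.3).
s = sqrt(s2)

print(M, sum_of_squares, s2, round(s, 2))  # 5.5 22.5 2.5 1.58

# Cross-check against the standard library.
print(variance(scores), round(stdev(scores), 2))
```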
Note that the way the variance and standard deviation are calculated emphasizes their relationship to the mean. The central component of each statistic is the repeated x − M calculation (Step 2). Because subtracting the mean from each individual value (x) indicates how far individual data points are from M, it might seem that we could just sum those differences to get a total of the differences between the xs and M. Why square them first?
The answer is that about half of those differences are going to be positive (when x > M) and about half are going to be negative (when x < M), as was the case with the depression scores. Summing them would result in 0, because the positive and negative differences from the mean cancel each other, which would not be very informative. However, when the differences are squared, all the negatives become positives, and the result is a value that provides a gauge of the typical distance between the scores and the mean.
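A short derivation, using only Formula 1.1, shows why the unsquared differences cannot serve as a measure of variability: because M = ∑x/n, the differences from the mean cancel exactly for any data set.

$$\sum (x - M) = \sum x - nM = \sum x - n \cdot \frac{\sum x}{n} = 0$$

For the depression scores, the differences from Step 2 confirm this:

$$(-2.5) + (-1.5) + (-1.5) + (-0.5) + (-0.5) + 0.5 + 0.5 + 1.5 + 1.5 + 2.5 = 0$$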
The variance and standard deviation have more than purely descriptive functions. Some of the more complex procedures that come later in this book involve the variance, and others the standard deviation, which is why we need both. Here, our interest is just in describing data sets. The depression scores are similar; the 10 scores have a range of only 5 points and a small variance and standard deviation:
s² = 2.50 and
s = 1.58
It is difficult to have a context for what makes a large or small standard deviation or variance; here both are quite small values because of the similarity in the original scores.
Apply It!
A Study on Studying
As part of her thesis, Anna, a graduate student in psychology, is studying college failure rates among first-year students. She has just read results from a nationwide study that found that the majority of students who dropped out in their first year studied less than 11 hours per week. Anna decides to conduct a survey of first-year students at her college. She chooses a random sample of 100 freshmen and asks each how many hours they study in a typical week. Rather than present all 100 pieces of data, she would like to use descriptive statistics to summarize the survey results.
The survey results measure a quantitative variable. In other words, the numbers indicate the amount of what is measured. The results are an example of ratio data. The number of hours can be ordered from lowest to highest. Furthermore, there are equal intervals between values. That is, a 1-hour difference in studying represents the same quantity anywhere on the scale. In addition, there is an absolute zero point. In other words, someone who replied that they studied 0 hours in a typical week has not studied at all.
Anna calculates several measures of central tendency, and the results are shown below. As you can see, values for the mode, median, and mean are all closely grouped.
· Mode = 16 hours
· Mdn = 15.5 hours
· M = 16.5 hours
Anna knows that measures of central tendency are more informative when they are accompanied by a variability measure. As a quick measure of variability, she calculates the range, which is the difference between the lowest and highest values. The range in this case is 33 hours.
In addition to the range, she calculates the sample standard deviation. The standard deviation gauges variability in terms of how much individual values tend to differ from the mean (M) of the group. She knows she should use the sample standard deviation (s), which is an estimate of the population standard deviation (σ), because she is using data from only some of the first-year students (a sample of first-year students), not all of the first-year students in the class (the population of first-year students). She calculates the sample standard deviation to be 5.5 hours.
By using descriptive statistics, Anna is able to quickly describe the results of her survey. In a typical week, first-year students studied an average of 16.5 hours, with a standard deviation of 5.5 hours. Students who are more than one standard deviation (5.5 hours) below the mean of 16.5 hours study less than 11 hours per week, and they are more likely to drop out in their first year.
At the end of the study, Anna presented her results to the dean of students. He asked her to make a presentation to the freshman class about the value of studying at least 11 hours per week.
Apply It! boxes written by Shawn Murphy.
Conceptual Versus Calculation Formulas
Try It!
B In a set of n = 5 where M = 6.0, the lowest value is 3 and the highest is 11. If a value of 12 is added, which will be affected most, range or s?
Formulas 1.2 (the variance) and 1.3 (the standard deviation) represent what are called conceptual formulas. In contrast to calculation formulas, conceptual formulas make it easier to see what the resulting value means. For example, the repeated x − M procedures in Formulas 1.2 and 1.3 remind us that what both the variance and the standard deviation measure is the amount of difference between individual values and the mean of the group.
Calculation formulas usually lack this clarity; however, they do provide the same answer as conceptual formulas. They are easier to use with large data sets when the calculations are all done by hand, but the logic behind the formula is not as clear. This is particularly the case when the statistic is generated by a computer. You will use Excel later in the chapter, after you have some practice with conceptual formulas. Because we emphasize clarity over ease of calculation, we will keep the data sets small.
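The text does not print a calculation formula for the variance. One widely used version, assumed here purely for illustration, is s² = (∑x² − (∑x)²/n)/(n − 1), which needs only the sum of the scores and the sum of the squared scores. The sketch below applies it to the depression data next to the conceptual Formula 1.2 to show that the two agree; the variable names are hypothetical.

```python
scores = [3, 4, 4, 5, 5, 6, 6, 7, 7, 8]
n = len(scores)

# Conceptual formula (Formula 1.2): built from each score's difference from M.
M = sum(scores) / n
s2_conceptual = sum((x - M) ** 2 for x in scores) / (n - 1)

# Calculation formula (assumed form, not printed in the text):
# only two running totals are needed, so no differences are computed.
sum_x = sum(scores)                          # 55
sum_x_squared = sum(x ** 2 for x in scores)  # 325
s2_calculation = (sum_x_squared - sum_x ** 2 / n) / (n - 1)

print(s2_conceptual)   # 2.5
print(s2_calculation)  # 2.5
```

The calculation version never shows a single x − M difference, which is why it is quicker by hand but says less about what the variance measures.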
The Impact of Different Score Values
The highest depression score was 8. If instead it were 12, what impact would that have on the variance and the standard deviation values? Would you agree that since 12 is a more extreme score than 8, on average, individual scores would vary more from the mean with 12 than with 8 as the highest score? The values of s² and s should increase if a 12 replaces the 8. If s² and s are recalculated to reflect the change,
s² becomes 6.32, and
s becomes 2.51.
Because both statistics are based on the square of the difference between individual scores and the mean, and because extreme scores producethe largest squared differences, extreme values have a disproportionate effect on the size of s2 and s. Note that very small scores have the sameimpact as large scores because the issue is the difference between x and M. If one of the 4s in the data set is changed to 0 so that thedepression scores become
0, 3, 4, 5, 5, 6, 6, 7, 7, 8, then
s² becomes 5.43, and
s becomes 2.33.
(Originally, s² was 2.50 and s was 1.58.)
Just as scores larger or smaller than M increase s² and s, scores similar to M make them smaller. If the 8 in the original data set is changed to 5.5 so that the depression scores become
3, 4, 4, 5, 5, 5.5, 6, 6, 7, 7, then
s² becomes 1.74, and
s becomes 1.32.
This last example provides an interesting contrast between variance and standard deviation statistics on the one hand and the range on the other. Once the range for a set of values is established, no value added between the high and the low values can shrink it. The number of scores in the range can increase, but the range cannot shrink. In contrast, additional values near the mean will always reduce s² and s. Both the standard deviation and the variance measure how much individual scores in the group tend to vary from the mean of the group. When average variability changes, so will s² and s.
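The recalculations in this section can be checked with a short script. The following is a minimal sketch using Python's standard `statistics` module, whose `variance` and `stdev` functions divide by n − 1 as Formulas 1.2 and 1.3 do. The variable names and the labels for each data set are illustrative choices, not part of the chapter.

```python
from statistics import stdev, variance

# The four versions of the depression scores discussed in the text.
variants = {
    "original": [3, 4, 4, 5, 5, 6, 6, 7, 7, 8],
    "8 replaced by 12": [3, 4, 4, 5, 5, 6, 6, 7, 7, 12],
    "one 4 replaced by 0": [0, 3, 4, 5, 5, 6, 6, 7, 7, 8],
    "8 replaced by 5.5": [3, 4, 4, 5, 5, 5.5, 6, 6, 7, 7],
}

# Sample variance and sample standard deviation for each version.
results = {
    label: (variance(scores), stdev(scores))
    for label, scores in variants.items()
}

for label, (s2, s) in results.items():
    print(f"{label:20s} variance = {s2:.2f}  standard deviation = {s:.2f}")
```

Rounded to two decimal places, the printed values should match those quoted above: the extreme scores (12 and 0) inflate both statistics, and the score near the mean (5.5) shrinks them.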
Populations Versus Samples and a Correction for a Biased Estimator
Earlier, the population was defined as every member of a particular group, and the sample as any subset of the population. Every psychologist in San Diego defines a population, as do all law enforcement officers in the state, or everyone in your family. Remove at least one person from any of those populations, and the group becomes a sample.
When a sample is representative, its characteristics are very similar to those of the population. Later we will be more specific about what makes samples representative, but for now note that the ability to understand the population from the sample, or the ability to conduct inferential analyses, depends upon samples that are representative.
One of the limitations in samples is that they tend to have less overall variability than populations. A sample cannot be an exact copy of the population, and the values that define the range in a population are unlikely to be represented completely in any sample. So inferring population variability from a variance or standard deviation that is not adjusted to account for the fact that the data is sample data will result in a consistent underestimation. Repeated errors in the same direction are what constitute bias in statistical analysis.
To counter the bias, an adjustment is made in the formulas for sample variances and sample standard deviations. It is the −1 in the denominators of Formulas 1.2 and 1.3, and it is called a correction for a biased estimator. In any division problem, if the value of the denominator is reduced, the resulting value gets larger. Here the value of the standard deviation or the variance is increased because of the adjustment. Note that the impact of the correction is greatest when the samples are small:
· The −1 adjustment to the denominator when n = 10 is proportionately much greater than when n = 100.
· 10 ÷ 9 = 1.11, but 100 ÷ 99 = 1.01.
It makes sense that the impact of the correction diminishes as the sample size increases because the sample's potential to distort population characteristics is generally greatest when sample sizes are smallest.
If all the data is available for every possible member of a group, bias will not be any concern, and the adjustment will not be necessary. In the case of population data, the formulas for variance and standard deviation are as follows:
$$\sigma^2 = \frac{\sum (x - \mu)^2}{n} \qquad \text{(Formula 1.4)}$$

$$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{n}} \qquad \text{(Formula 1.5)}$$
Try It!
C What constitutes bias in statistics?
Besides the absence of the −1 in the denominator, note that sigma (σ) has been substituted for s, and mu (μ) has been substituted for M in both formulas. The σ indicates the population standard deviation, and σ² indicates the population variance. Earlier in the chapter, we noted that μ indicates the population mean.
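To see the size of the correction on a concrete data set, the sketch below treats the ten depression scores from earlier in the chapter first as a sample and then, purely for illustration, as if they were an entire population. It relies on Python's standard `statistics` module, where `variance` and `stdev` divide by n − 1 and `pvariance` and `pstdev` divide by n. The variable names are illustrative.

```python
from statistics import pstdev, pvariance, stdev, variance

scores = [3, 4, 4, 5, 5, 6, 6, 7, 7, 8]  # depression scores from earlier in the chapter
n = len(scores)

sample_var = variance(scores)       # denominator n - 1 (sample formula)
population_var = pvariance(scores)  # denominator n (population formula)

print(f"sample:     variance = {sample_var:.2f}, standard deviation = {stdev(scores):.2f}")
print(f"population: variance = {population_var:.2f}, standard deviation = {pstdev(scores):.2f}")

# The two variances differ by exactly the factor n / (n - 1).
ratio = sample_var / population_var
print(f"ratio = {ratio:.2f}, n / (n - 1) = {n / (n - 1):.2f}")
```

With n = 10 the ratio is about 1.11, the same 10 ÷ 9 factor noted in the discussion of the correction.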
Differentiating Sample and Population Characteristics
The descriptive characteristics of populations are referred to as parameters. Although the word is often used more loosely, technically, the term statistic refers to a sample characteristic.
To summarize,
| | Population Parameter | Sample Statistic |
|---|---|---|
| Mean | μ | M |
| Standard deviation | σ | s |
The Statistic or the Parameter?
Unless we are working with a relatively small population, like the population of a particular social worker's clients or the population of a family or social group, we generally do not have population data to work with. There are exceptions, of course: Researchers sometimes work with data from the U.S. Census Bureau that includes the entire population of the country, and testing agencies will sometimes provide parameter means for the entire population who took a particular test. It is much more common to have access only to sample data. Therefore, in this book the examples and exercises will calculate the statistic, the sample standard deviation, using Formula 1.3.
Understanding Degrees of Freedom
The n − 1 correction in the variance and standard deviation formulas for samples gives rise to another topic. The statistics we calculate have degrees of freedom (df). Degrees of freedom are one of those odd statistical abstractions that are difficult to explain briefly but affect the way that many other procedures are computed and interpreted, such as t-tests (Chapter 5) and analysis of variance (Chapters 6–8). The variance and the standard deviation introduce us to degrees of freedom.
Degrees of freedom are the number of scores in a calculation that are free to vary when the final result of the calculation is known; it is a mathematical adjustment for use with sample data. Keep in mind that when population data is used, N (the population size) is used in the calculation, but since samples are most often used to represent populations, the adjustment would be n − 1 (the sample size minus 1). For example, if three integers must sum to 6, two of them are free to vary, and the third is then fixed by the requirement that the total be 6; this is seen in the following example.
· If _ + _ + _ = 6,
· then the first two of those three integers can be any integers.
· They could be 2 + 2, or
· 3 + 1, or
· 3 + 10, or any other two values, as long as the value of the third integer makes the result come out to 6.
· The third value cannot vary; it must be 2 in the first and second examples above, and −7 in the third example.
· The problem has 2 degrees of freedom.
If the number of integers that make up the problem is considered n, then degrees of freedom (df) for a problem like this are n − 1. That same expression, n − 1, also defines degrees of freedom for the standard deviation and the variance. If we know the final value of either s² or s, the scores in the group except for that final one can have any value. However, that final value must be whatever it takes to make the result be what it is. Other procedures have df values that differ from n − 1, and we will address these as they come up.
Please note that a measure of variability cannot be less than 0. To say it more directly, there is no such thing as negative variance. If a school psychologist measures intelligence scores for a variety of students who have a learning disability and finds that they all have the same score, there is no variation in the scores, and s², s, and range will all equal 0. If in the course of all the number crunching or statistical software output, a negative value somehow emerges for a variability statistic, the psychologist should start looking for a calculation error.
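One common way to see why the variance and the standard deviation have n − 1 degrees of freedom is that deviations from the mean always sum to zero, so the last deviation is fixed once the other n − 1 are known. The sketch below illustrates this with the depression scores used earlier in the chapter; it is an illustration of that argument under stated assumptions, not a procedure from the text, and the variable names are illustrative.

```python
scores = [3, 4, 4, 5, 5, 6, 6, 7, 7, 8]  # depression scores from earlier in the chapter
n = len(scores)
m = sum(scores) / n

# Deviations from the mean always sum to zero.
deviations = [x - m for x in scores]

# Knowing the first n - 1 deviations therefore determines the last one.
known = deviations[:-1]
implied_last = -sum(known)

print(f"actual last deviation  = {deviations[-1]}")
print(f"implied by the others  = {implied_last}")
print(f"degrees of freedom     = {n - 1}")
```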
1.6 Calculating Descriptive Statistics with Excel
Now that you can calculate descriptive statistics longhand and understand what they mean, we will move the calculations to Excel. As stated earlier, computer software makes it easy to generate results, but the output can be difficult to understand. We will counter this by doing the procedure systematically so that it is straightforward and the results make sense.
Like all spreadsheets, Excel is laid out in the rows and columns of a ledger. Spreadsheets were originally designed to allow businesspeople to keep track of and manipulate large amounts of numeric data, including calculating descriptive statistics. Although the commands vary for different kinds of software, all spreadsheets will produce descriptive statistics, and most programs, including Excel, will also complete some of the basic statistical tests.
There are two ways to generate descriptive statistics in Excel. They can be calculated directly by entering the individual commands, or they can be part of a package in Excel called "Descriptive Statistics." First, we will try the individual commands.
Consider a psychologist who is interested in the cognitive characteristics of teenage boys who get into trouble with the law. The psychologist gathers data on problem-solving ability among teenage boys consigned to juvenile hall. Using a Juvenile Delinquency Scale (JuDeS) and having secured the appropriate permissions, the psychologist tests 12 randomly selected juvenile offenders. These are the resulting JuDeS scores:
11, 14, 14, 15, 17, 17, 17, 19, 22, 22, 23, 27
Navigating Excel
The individual boxes in an Excel spreadsheet are called "cells." Each cell is identified by the column and row in which it is located.
· The columns are labeled from left to right, alphabetically from column A.
· The rows are numbered down the left side of the window.
· Cell A1 is the cell in column A, first row (the upper left). The next cell down is cell A2, and so on.
· When identifying a cell, the column letter is first, followed by the row number.
· When cell locations are entered in Excel, there is no space between letter and number.
The steps for entering the JuDeS data into the spreadsheet follow:
· Put the cursor in the first cell, cell A1 for example. This can be done
· by using the arrow keys in the keypad to move the cursor,
· by clicking the mouse on the particular cell, or
· by using the touchpad on a laptop.
· In cell A1, key in the number 11 and then press Enter. The Enter key will move the cursor to the next cell down.
· Enter each of the other 11 values so that the data is arranged vertically in all the cells from A1 to A12. That will make the spreadsheet look as it does in Figure 1.1.
Figure 1.1: A data set entered in Excel
Entering the Command for the Mean
An equal sign (=) tells Excel to expect a specific command or a formula next. Calculate the mean for the 12 JuDeS scores to appear in cell A13 as follows:
· Put the cursor in cell A13.
· Enter the command =average(A1:A12), with no space before the parenthesis, which will calculate the mean for the data in cells A1 to A12. Note the Excel command is average rather than mean.
· Press Enter.
· The value in cell A13 is the mean, 18.16667.
· On the Home tab, click the arrow in the bottom right corner of the Number group (it is in the middle near the top of the screen).
· Under Category, click Number, and then to the right indicate the number of decimal places. Rounding to two decimal places will make M = 18.17.
Using the Descriptive Statistics Option
Entering the specific command works well when a particular statistic is needed, but sometimes you may want a more comprehensive description of the data. The mean is reported with other descriptive values as part of a Descriptive Statistics option. For the JuDeS data list, the commands for that package of statistics are the following:
· From the Home tab, click the Data tab, which is four to the right at the top of the page.
· Click the Data Analysis window at the extreme right just below the tabs. This will open a small window in the page with a list of options.
· Click on the Descriptive Statistics option, and then click OK.
· In the small window labeled Input Range, type in the cells for which you wish the values to be included, A1:A12, just as we did when we entered the formula for the mean. When entering the letter for the column, it does not matter whether it is upper- or lowercase.
· Note that the default is that data is "Grouped by" columns. If the data was listed along a row, you would have to change the default.
· Click Output Range and indicate where the results display is to begin, perhaps cell C1, so that results are next to the original data but not on top of them.
· Finally, click the particular output you wish, which is Summary statistics.
· Click OK.
Results are shown in Figure 1.2.
Figure 1.2: The Excel data analysis, descriptive statistics option
Excel can also produce other descriptive statistics, including the median (Mdn), the mode, the lowest and highest values, the range, the sample standard deviation (s), and the variance (s²). There are also some statistics that will be introduced later in the book.
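For readers who want to cross-check the Excel output outside the spreadsheet, the sketch below computes the same descriptive statistics for the 12 JuDeS scores with Python's standard `statistics` module (`multimode` requires Python 3.8 or later). It is an independent check rather than a reproduction of Excel's output table, and the dictionary keys are illustrative labels.

```python
from statistics import mean, median, multimode, stdev, variance

judes = [11, 14, 14, 15, 17, 17, 17, 19, 22, 22, 23, 27]

summary = {
    "n": len(judes),
    "mean": mean(judes),
    "median": median(judes),
    "modes": multimode(judes),  # list of the most frequent value(s)
    "s": stdev(judes),          # sample standard deviation (n - 1)
    "s2": variance(judes),      # sample variance (n - 1)
    "minimum": min(judes),
    "maximum": max(judes),
    "range": max(judes) - min(judes),
}

for name, value in summary.items():
    shown = f"{value:.2f}" if isinstance(value, float) else value
    print(f"{name:8s} {shown}")
```

The mean rounds to 18.17 and the sample standard deviation to 4.59, which are the values reported for these scores later in the chapter.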
1.7 The Language of Research
Sometimes descriptive statistics serve their own ends. A researcher might need to know the mean level of education among those who are unemployed, or how much variation there is in autistic clients' verbal behaviors. Often, however, descriptive statistics are calculated as part of some more involved research project, like a senior paper or a research report.
The reason that people take the time to calculate descriptive statistics is that whatever is being measured varies. Constants, or constant values, hold little interest precisely because they do not change. If intelligence scores were constant, it would have been pointless for Alfred Binet, the father of intelligence testing, to take the time to develop an intelligence test. The changes are what make variables interesting and worth studying.
Qualitative variables are difficult to reduce to a number. Often they are the nominal scale data mentioned earlier that describe people's demographic characteristics, such as religious persuasion or national origin. Sometimes qualitative variables refer to emotional characteristics, such as passion, or to traits, such as appreciation, that defy measurement by ordinary means.
With quantitative variables, numbers indicate the amount of what is measured. Larger numbers indicate more of the characteristic, smaller numbers less. Mixed-methods research involves both qualitative and quantitative variables. Research in the social sciences typically uses mixed methods. Although some people advocate primarily for quantitative research or for qualitative research, it is difficult to find studies that do not involve both kinds of variables.
Research Design
When a formal plan is developed to gather and analyze data in order to investigate something, the plan is called a research design. Research design enables the researchers to gather the relevant data and perform the analyses needed to answer their questions. The statistical procedures in this chapter and those that follow allow people to execute research designs.
Dependent and Independent Variables
Research designs always identify the variables believed to be relevant to an outcome. The outcome itself is the dependent variable, also known as the outcome variable or criterion; it is the affected variable, or the consequence variable. The variable thought to help bring about the effect is the independent variable or the predictor. It is tempting to say that the independent variable causes the dependent variable, but causation is difficult to demonstrate in social science research. The problem is not that causes do not exist but that they are difficult to confirm.
Perhaps, having read the relevant research, a psychologist believes that service to other people will reduce individuals' feelings of discouragement. The psychologist develops a plan, a research design, to test whether serving in a soup kitchen for the poor (the independent variable) is associated with lower feelings of discouragement (the dependent variable). If there is a relationship, the psychologist might conclude that serving in the kitchen brings less discouragement, but it is also possible that those who serve in the kitchen are people who are naturally less discouraged.
As the psychologist executes the design, descriptive statistics will be calculated for the independent and dependent variables, such as the mean level of discouragement and the standard deviation of the hours served by subjects in the experiment. However, in this instance, the descriptive statistics are components of a broader purpose, which is to determine the relationship between the independent and dependent variables.
1.8 Presenting Results
Although it is important to be able to calculate and obtain statistical results, it is more important to be able to discern and present which results are most relevant and useful. Though we are somewhat limited in what we can do with descriptive statistics, they are important in understanding our data and describing its most basic characteristics.
SPSS steps for Descriptive Statistics: Analyze→Descriptive Statistics→Frequencies. Place the age and sex variables into the variable(s) box. Click on Statistics and check Mean, Median, Mode, Standard Deviation, Range, Skewness, and Kurtosis. Then click Continue and OK. The resulting SPSS output tables are provided in Figure 1.3.
Figure 1.3: SPSS output of descriptive statistics
The items highlighted in yellow are those used in the interpretation and reporting of these variables' descriptive statistics in the next section.
1.9 Interpreting Results
In Section 1.6, we learned how to calculate the various descriptive statistics in Excel for a study that looked at 12 juvenile offenders' scores on the JuDeS. Using the results obtained, what could we say about this sample? Well, first we know that the average score obtained was 18.17, which was slightly larger than the median and mode, both of which were 17.00. Thus, the "center" of our data was approximately between 17 and 18, when considering all three measures of central tendency. The standard deviation for the data was 4.59, which means that scores were, on average, roughly 4.59 points from the mean. The range of scores was from 11 (minimum score) to 27 (maximum score), giving a total range of 16. These interpretations give us a basic overview of what our data "looks like" in terms of the results obtained.
Graduate students typically conduct research, or are enrolled in research-based courses, where they may be running complex analyses using statistical software, such as SPSS. It is useful to learn to interpret the outputs from these programs. Though learning how to use SPSS is beyond the scope of this book, we will present how to find and interpret the results. Example SPSS outputs will be provided in each Presenting Results section of this book, and each Interpreting Results section will provide important pieces of information to help interpret results.
Example: We collected data on the sex and age of 200 participants in a study. The tables in Figure 1.3 depict the outputs that will display when calculating the descriptive statistics for this data in SPSS. These tables show the tabulations of age and sex for this sample. Is there any information tabulated that should not be reported based on what you know about scales of measurement? Since sex is a nominal variable, with males and females being the two categories, we should not use mean, median, standard deviation, and so on for this data. In this case, the only descriptive statistic that can be reported is mode. It does not make sense to report an average or calculated number for categories as the numbers used to represent them in the data file (1 = female, 2 = male) were simply used for data entry and categorization purposes. We can see that there were 112 females and 88 males, making female the value most frequently reported.
The variable for age is a ratio scale of measurement; thus, we should review the various descriptive statistics accordingly. In this sample of 200 participants, the average age was 41.63 years, with a standard deviation of 9.83 years. This means that the typical age was approximately 10 years from the mean age of 42. The range of ages was 48 years, with a minimum age of 23 years and maximum age of 71 years. The median age was close to the mean at 41.00 years. When there are multiple modes for a data set, SPSS will only present the lowest one with a superscript a. This means you will need to refer to the frequency table to determine what other ages are most frequent. In this case, it was 40 and 45, both of which occurred 12 times in the data. Thus, the modes were 40 and 45.
The most common format used for reporting research results and writing papers in many fields, including the social and behavioral sciences, is the American Psychological Association (APA) format. It is vital for students to know how to present statistical test results in the correct format. Though you should refer to the most recent edition of the APA manual for specific detail on formatting statistics, Table 1.1 may be used as a quick guide in presenting the descriptive statistics covered in this chapter.
Table 1.1: Guide to APA formatting of descriptive statistics

| Abbreviation or Term | Description |
|---|---|
| M | Sample mean |
| μ | Population mean |
| s | Sample standard deviation |
| σ | Population standard deviation |
| Mdn | Median |
| mode | Mode |
| N | Total number of cases |
| n | Number of cases in subsample/group |
| df | Degrees of freedom |

Source: Publication Manual of the American Psychological Association, 6th edition. © 2009 American Psychological Association, pp. 119–122.
Note that abbreviations are typically italicized, while characters (Greek or Latin) or words are not italicized. The following are examples of how to interpret and present results using these abbreviations and terms. These examples utilize the data presented in Section 1.8. Note that when the results are part of the flow of the sentence, abbreviation is typically not used. However, when the statistics are interpreted and being reported as supporting documentation that is not part of the sentence, the abbreviation is used and the data is presented in parentheses.
· The age of participants was from 23 to 71 years (M = 41.63, s = 9.83).
· The average age of participants was 41.63 years (s = 9.83).
· Participants were 112 females and 88 males aged 23 to 71 years (M = 41.63, s = 9.83).
· The median age of participants was 41.00 years (N = 200).
· There were 200 participants in the study (Mdn = 41.00).
· All measures of central tendency were within 5 years of one another for the age of participants (M = 41.63, Mdn = 41.00, modes = 40.00, 45.00).
· The most common ages of participants were 40.00 years and 45.00 years (M = 41.63, Mdn = 41.00).
Using the data from Section 1.6, we could present the results in the following way:
The average JuDeS score was 18.17 (s = 4.59). The median and mode were the same at 17.00, while the range of scores was from 11 to 27 (N = 12).
Summary
Part of the transition into any new discipline is learning the terminology. It is important that we have a common language. Part of that language is the scale of the data. Recall that scale (nominal, ordinal, interval, and ratio) refers to the kind and the quantity of information that the data provides (Objective 1). Data scale also helps us determine which statistics we can calculate (Objective 2).
Descriptive statistics can provide a great economy when data sets are more than a few measures. The central tendency measures (mean, median, and mode) suggest what is most representative in a data set, but they each do so by offering different descriptions of what is typical. For that reason, they are often reported together (Objective 3).
Measures of variability complement measures of central tendency. When the standard deviation, the variance, or the range is calculated and reported with the mean, we have a view of not just what is most typical but also of how homogeneous the data is (Objective 4). Variance and standard deviation values can tell us when there is a great deal of data variability and when data tends to be very similar.
The concepts of populations and samples likely are not new, but the notation used to represent them may be (Objective 5). These will become familiar as they are repeated in subsequent chapters, as will presenting results (Objective 6) and interpreting and reporting results in APA format (Objective 7) beyond basic descriptive statistics.
Chapter 2
Illustrating Data
Tim Graham/Getty Images
Learning Objectives
After reading this chapter, you will be able to. . .
· organize measures into frequency distributions, ordered arrays, and stem-and-leaf plots.
· use Excel to create pie charts, bar graphs, and frequency polygons.
· describe the components of data normality.
· judge data normality by using the relevant longhand statistics and by using Excel output.
· use statistical calculations and graphs to identify outliers.
· present graphical results and draw conclusions based on descriptive statistics.
· interpret results of analysis in graphical and tabular forms in APA format.
People who like to organize things will especially like this chapter. What is covered here can be particularly helpful when we are barraged with more data than we can absorb, especially since we only use and analyze about 20% of the available data anyway. When the material is irrelevant, this is not a problem, but when we are submerged in important information, we need ways to deal with the overload. This chapter offers some solutions involving visual data displays. We will introduce them with an anecdote.
During World War II, a British analyst was assigned to recommend to aircraft builders the points on airframes that should be reinforced with armor plating. Too much armor plating and the aircraft would lose maneuverability and range; too little and it would become too vulnerable to enemy fire. The analyst examined aircraft returning from combat and noted where there was damage. It is easy to imagine him drawing pictures of the planes and then noting the places where they had been hit. His recommendation was to reinforce the areas where the returning planes had not been damaged. How counterintuitive was that? As illogical as his recommendation may seem, his reasoning was that if the damage to the returning planes had been fatal, neither the pilot nor the airplane would have returned. Therefore, it was damage to the other areas that was apparently the most serious, and those were the areas that needed the most protection.
This story is a lesson in the value of clarifying relationships with visual displays. Certainly, there are times when mathematical manipulation and statistical procedures are required, but often a necessary first step to understanding a data set is to arrange the data so that it can be viewed. The conclusions we draw because of observation can then guide any analyses that follow.
Chapter 1 emphasized the descriptors and the statistical shorthand that allow us to classify and describe groups of data. In that chapter, descriptions were limited to the scale of the data and the measures of central tendency and variability that allow data summaries. In this chapter, we will use visual display for some of the same purposes and expand the applications for descriptive statistics.
2.1 From Description to Display
Because of the incremental nature of statistics, the early topics are the building blocks for those that come later. Here, we will use what we know about data scales of measurement and descriptive statistics to arrange measures into the tables and figures that reveal the multiple dimensions of numerical data.
The adage "a picture is worth a thousand words" is most relevant here, as most audiences are more attracted to a visual display than to a text presentation. When a good deal of data must be communicated in a short time, a visual display is a good place to begin. The pages that follow suggest some of the more common procedures for representing different kinds of data, but this chapter is only the briefest of introductions. For someone interested in a more in-depth discussion, an interesting starting point is The Visual Display of Quantitative Information by Edward Tufte (2001). Tufte, who taught joint seminars with his colleague John Tukey at Princeton, collected historical examples of effective visual representations of statistical information and provided a framework for the theory and practice of the effective display of data.
Data distributions of one sort or another are difficult to avoid. A glance at the newspaper indicates how unemployment numbers have changed during the year. Checking how the stock market has fluctuated over today's trading session indicates highs, lows, and probably the volume of trading. It is the fact that data fluctuates that makes data distributions interesting. Datasets that either all have the same value or that always occur in the same proportions leave little to be analyzed. They interest us much less than datasets for which proportions and frequencies change.
Frequency Distributions
Scores on most measures vary, but in that variation, there will generally be some repetition. Whether it is a college admissions test or the scores on a statistics quiz, all scores are not equally likely; some will occur more frequently than others will. Frequency distributions indicate the number of measures in a data set that have the same characteristic. They allow us to display scores in terms of both their variability and their frequency of occurrence.
If a licensing board administers an exam for organizational development professionals, it probably does not report every individual score, but instead reports the test results in terms of the categories:
Meritorious
Exceeds Expectations
Pass
Pass with Exceptions
Fail
A group of 25 graduates of State U's Organizational Development program takes the Training Professions Licensing Exam (TrPLE). Their results are shown in Table 2.1.
Table 2.1: A frequency distribution for results on the TrPLE

| Licensing Test Results | f |
|---|---|
| Meritorious | 4 |
| Exceeds Expectations | 6 |
| Pass | 8 |
| Pass with Exceptions | 4 |
| Fail | 3 |
| Total | 25 |
Table 2.1 is a frequency distribution, with the symbol f indicating the frequency or number of scores that occur in a particular category. If each individual score had been entered rather than being grouped into categories, the result would have been a table with 25 discrete entries. Instead, the data in Table 2.1 are a grouped frequency distribution. Such a table provides a very economical presentation when there are many scores.
Ordered and Disordered Arrays
If each of the 25 results were listed in order from the four that were Meritorious down to the three Fails, the display would be an ordered array. If instead of organizing the data from highest to lowest, we just arbitrarily piled all the scores into the table, it would be, not surprisingly, a disordered array. Table 2.1 is a much shorter display than either an ordered or a disordered array would have been.
The kind of presentation is not too much of an issue when n = 25, but it would be if the frequency distribution included data for every aspiring marriage and family counselor in the state who took the licensing test. Even if hundreds of scores were being reported, the grouped frequency distribution would have the same number of rows as Table 2.1; the only difference is that the sums for each category would be larger. Frequency distributions can make a presentation very compact.
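The step from a disordered array to a grouped frequency distribution amounts to counting how often each category occurs. The following minimal sketch uses Python's `collections.Counter`; the order of the 25 individual results is hypothetical and was constructed only so that the counts agree with Table 2.1.

```python
from collections import Counter

# Display order for the categories, highest result first.
categories = ["Meritorious", "Exceeds Expectations", "Pass",
              "Pass with Exceptions", "Fail"]

# A disordered array of the 25 results (hypothetical ordering,
# built so that the counts match Table 2.1).
results = (["Pass"] * 8 + ["Fail"] * 3 + ["Meritorious"] * 4
           + ["Pass with Exceptions"] * 4 + ["Exceeds Expectations"] * 6)

# Count how many results fall in each category.
frequencies = Counter(results)

for category in categories:
    print(f"{category:22s} {frequencies[category]}")
print(f"{'Total':22s} {sum(frequencies.values())}")
```

However many results are added to the list, the printed table keeps the same five rows; only the frequencies grow.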
Class Intervals
The "groups" in grouped frequency distributions are called class intervals. Although they provide a great economy and make a great deal of data accessible to even a casual observer, inevitably some of the details get lost. It is not apparent from studying the table, for example, which numerical test scores belong to a particular class interval. We can address that deficiency by adding score ranges, which might be the following:
|
28–34 |
Meritorious |
|
21–27 |
Exceeds Expectations |
|
14–20 |
Pass |
|
7–13 |
Pass with Exceptions |
|
0–6 |
Fail |
Try It!
A According to the discussion of the scale of data in Chapter 1, what scale do data categories such as Meritorious, Exceeds Expectations, and so on indicate?
See answer here.
With the ranges, we know how scores were classified, but it still is not apparent exactly how one individual whose score is in the Pass interval, for example, scored. It could have been anywhere from 14 to 20. We know only the category.
Although we cannot know precisely how each individual scored, the scores can at least be roughly ranked. Clearly, those who "exceeded expectations" did better than those in the Pass category, although exactly how much better is not indicated.
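The classification described above can be sketched in a few lines of Python. This is only an illustration: the function name `classify` and the list `CLASS_INTERVALS` are made up for the example, and the limits are the apparent limits from the score ranges listed earlier, so the sketch assumes whole-number scores.

```python
# Apparent limits of each class interval, taken from the score ranges above.
# The names CLASS_INTERVALS and classify are hypothetical, for illustration only.
CLASS_INTERVALS = [
    (0, 6, "Fail"),
    (7, 13, "Pass with Exceptions"),
    (14, 20, "Pass"),
    (21, 27, "Exceeds Expectations"),
    (28, 34, "Meritorious"),
]


def classify(score):
    """Return the category whose apparent limits contain a whole-number score."""
    for low, high, label in CLASS_INTERVALS:
        if low <= score <= high:
            return label
    raise ValueError(f"score {score} is outside the 0-34 range")


print(classify(17))  # Pass
print(classify(23))  # Exceeds Expectations
```

Note that the function returns only the category. Once a score has been classified, the original value cannot be recovered from the label, which is the loss of detail the text describes.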
Estimating the Mean From a Class Interval
Indicating the score frequencies in the class intervals reduces the scores to values that can be roughly ranked. Even without the individual scores, we can use the categories to estimate the mean of those scores. It involves
· determining the middle point in each class interval,
· multiplying each midpoint by the number of scores in its interval and summing the products, and then
· dividing the result by the total number of scores.
To see how accurate the estimation is, we will calculate the actual value first so we can make the comparison. Let's assume that the individual scores for the licensing test data in the grouped frequency distribution above were the following:
Meritorious: 34, 33, 33, 29
Exceeds Expectations: 26, 26, 24, 23, 23, 22
Pass: 20, 19, 19, 18, 17, 15, 15, 14
Pass with Exceptions: 12, 11, 9, 8
Fail: 6, 3, 1
Verify that the value of the mean is:
\[ M = \frac{\sum x}{n} = \frac{460}{25} = 18.40 \]
Now, to estimate the mean based on the class intervals, follow these four steps:
1. Determine the midpoint of each class interval by
a. adding the two end points in the interval and then
b. dividing by 2.
So we have
Meritorious: (34 + 28)/2 = 31
Exceeds Expectations: (27 + 21)/2 = 24
Pass: (20 + 14)/2 = 17
Pass with Exceptions: (13 + 7)/2 = 10
Fail: (6 + 0)/2 = 3
2. Multiply the midpoint values from Step 1 by the number of scores in the interval.
31 × 4 = 124
24 × 6 = 144
17 × 8 = 136
10 × 4 = 40
3 × 3 = 9
3. Sum the products from Step 2.
124 + 144 + 136 + 40 + 9 = 453
4. Divide the sum of the products from Step 3 by the number of scores.
453/25 = 18.12
The actual mean is 18.40. The estimated mean is 18.12.
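As a check on the arithmetic, the four steps can be reproduced in a short Python sketch. The scores and apparent limits are those given in the text; the variable names are illustrative only.

```python
# Individual scores assumed in the text for each category.
scores = {
    "Meritorious": [34, 33, 33, 29],
    "Exceeds Expectations": [26, 26, 24, 23, 23, 22],
    "Pass": [20, 19, 19, 18, 17, 15, 15, 14],
    "Pass with Exceptions": [12, 11, 9, 8],
    "Fail": [6, 3, 1],
}

# Apparent limits of each class interval.
limits = {
    "Meritorious": (28, 34),
    "Exceeds Expectations": (21, 27),
    "Pass": (14, 20),
    "Pass with Exceptions": (7, 13),
    "Fail": (0, 6),
}

# Actual mean, calculated from the raw scores.
all_scores = [x for group in scores.values() for x in group]
n = len(all_scores)
actual_mean = sum(all_scores) / n

# Estimated mean: midpoint of each interval, weighted by its frequency.
weighted_sum = 0
for label, group in scores.items():
    low, high = limits[label]
    midpoint = (low + high) / 2          # Step 1
    weighted_sum += midpoint * len(group)  # Steps 2 and 3
estimated_mean = weighted_sum / n          # Step 4

print(f"actual mean    = {actual_mean:.2f}")     # 18.40
print(f"estimated mean = {estimated_mean:.2f}")  # 18.12
```

Only the frequencies and the interval limits enter the estimate; the raw scores are used here solely to compute the actual mean for comparison.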
Remember that this is an estimate, so a discrepancy between the value estimated from the class intervals and the actual value of the mean is not surprising. Here the difference is 0.28. As the number of values in the data set increases, the size of the discrepancy will usually diminish. The point is that with only the values that constitute the class intervals and the number of scores in each interval, you can estimate the value of the mean, something that can be helpful in a data summary when the original scores are not available. If you determine the value of M by estimating it from the class intervals, any reporting of the value must clearly state that it is an estimate and that you did not calculate it directly from the raw data.
The Difference Between Apparent and Actual Limits
For the licensing data, the scores are all whole numbers: integers. This makes creating the class intervals easy, but often the data we work with include numbers with decimal values, and class limits must accommodate any value between the highest and lowest integers. Those highest and lowest integers in the category represent the apparent limits of the class interval. For the Meritorious category, for example, the apparent limits are 28 and 34. If the scores do not involve decimal values, there is not a problem, but sometimes that is not the case. A student's grade-point average, for example, is likely to have a decimal value. Ordinary grading procedures also often include decimals. If the lower limit for A work is 90% and the upper limit for B work is 89%, to which class interval does 89.5% belong?
To accommodate any value, class intervals must have actual limits in addition to apparent limits. In the case of grade averages and a great many other kinds of data, the class interval actually goes from a half point below the lower whole number in the interval to a half point above the upper whole number. That means the lower limit for an A would be 89.5%. For the 21–27 class interval for Exceeds Expectations, that would make the actual limits from 20.5 to 27.5. If we subtract the lower from the upper actual limit, we have the width of the class interval:
27.5 − 20.5 = 7.0
That difference between the actual limits is the same as the number of whole numbers in the 21–27 apparent limits. In this case, that would be 21, 22, 23, 24, 25, 26, 27, or seven whole numbers.
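A minimal sketch of the relationship between apparent and actual limits follows. The function name `actual_limits` and its `unit` parameter are assumptions made for the example; the half-point adjustment is the one described in the text for scores reported in whole numbers.

```python
def actual_limits(low, high, unit=1):
    """Extend apparent limits by half a unit of measurement on each side."""
    half = unit / 2
    return low - half, high + half


# Apparent limits for Exceeds Expectations are 21 and 27.
lower, upper = actual_limits(21, 27)
width = upper - lower

print(lower, upper)  # 20.5 27.5
print(width)         # 7.0
```

The width of 7.0 matches the count of whole numbers from 21 through 27.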
In our licensing example, actual limits involve a problem we did not have with apparent limits: The lower actual limit for Exceeds Expectations is the same as the upper actual limit for Pass. They are both 20.5. So when scores happen to include whole numbers and decimals, where does a score like 20.5 belong? Sheskin's (2004) solution is to adopt a rule. The rule could provide, for example, that if the first digit of the score in question is an odd number, it goes in one interval, perhaps the upper interval, and if it is an even number, the value goes in the lower interval. By that rule, someone scoring 20.5 would receive a Pass rating because the first digit, 2, is an even number. It does not matter much what the rule is so long as it is equitable and followed consistently.
Creating Grouped Frequency Distributions
Speaking of consistency, grouped frequency distributions are also developed according to a couple of conventions:
1. Each class interval must have the same range. Whether the class limits are apparent or actual, the ranges of the different intervals must be equal. In the licensing scores example, the range of the apparent limits is 6.0 for each interval:
34 − 28 = 6.0 for the Meritorious interval, 27 − 21 = 6.0 for the Exceeds Expectations interval, and so on.
2. A score must fit into just one group. This is simple enough when scores involve only whole numbers, but when there are decimal values, the difference between actual and apparent limits becomes relevant.
There are no rules about how many intervals are too few or too many, and of course, it is a nonissue when the data have their own categories, like the licensing results. However, when there are not prescribed categories, we have to decide on the number of categories to use. As a rough rule of thumb, Sheskin (2004) suggests taking the square root of the number of scores to determine the number of class intervals. So if, for example, there are 50 scores in the data set,
\[ \sqrt{50} = 7.071 \]
or about 7 class intervals would be a reasonable number. This is only a suggestion, however. When the data set is large, the rule may not be very helpful. If there are 200 scores, for example, 14 intervals
\[ \sqrt{200} = 14.14 \]
might be more than we want in one table.
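The square-root rule of thumb is easy to try for different data set sizes. The sketch below is illustrative; the function name `suggested_intervals` is hypothetical, and rounding to the nearest whole number is an assumption, since the text says only "about 7."

```python
import math


def suggested_intervals(n_scores):
    """Rule-of-thumb number of class intervals: the square root of n, rounded."""
    return round(math.sqrt(n_scores))


for n in (25, 50, 200):
    print(f"{n} scores -> sqrt = {math.sqrt(n):.3f}, about {suggested_intervals(n)} intervals")
```

For the 25 licensing scores the rule suggests 5 intervals, which happens to match the five prescribed categories in Table 2.1.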
The objective is to find a reasonable balance between the brevity of a few categories and the clarity about the data set that more categories yield. For example, if we were to reduce the five intervals in Table 2.1 to just a Pass category and a Fail category, the result would be a very compact table, but a good deal of information about the level at which particular individuals passed would be lost.
Score Frequencies and Score Aggregates
Table 2.1 provides a simple summary of frequency, or how the scores on the licensing exam are distributed for 25 test-takers. Other arrangements of these data can provide different pictures of how the scores are distributed.
· Frequency ( f ) indicates how many scores are in each class interval. This is what Table 2.1 provides.
· Relative frequency indicates the proportion or percentage of the total that represents scores in the class interval. Relative frequencies can be reported as common fractions, but proportions or percentages of the whole are more common. The proportions are calculated by dividing the number of scores in the class interval by the total number of scores.
· A cumulative relative frequency value adds each successive class interval to the proportions of scores that precede it so that the last interval will indicate 100%. The cumulative relative frequency for Exceeds Expectations will be the relative frequency for that class interval (.24) plus the relative frequency for the preceding class interval, Meritorious (.16).
.24 + .16 = .40
If we modify Table 2.1 by adding columns for relative frequency and for cumulative relative frequency, the result is Table 2.2.
Table 2.2: Frequencies, relative frequencies, and cumulative relative frequencies
| Licensing Test Results | f | Relative f | Cumulative Relative f |
| Meritorious | 4 | .16 | .16 |
| Exceeds Expectations | 6 | .24 | .40 |
| Pass | 8 | .32 | .72 |
| Pass with Exceptions | 4 | .16 | .88 |
| Fail | 3 | .12 | 1.00 |
| Total | 25 | | |
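The relative and cumulative relative frequency columns can be derived from the frequency column alone. The following Python sketch shows one way to do it; the variable names are illustrative, and the cumulative values are computed from running counts to avoid accumulating rounding error.

```python
from itertools import accumulate

# Frequencies from Table 2.1, listed from highest category to lowest.
frequencies = {
    "Meritorious": 4,
    "Exceeds Expectations": 6,
    "Pass": 8,
    "Pass with Exceptions": 4,
    "Fail": 3,
}

total = sum(frequencies.values())

# Relative frequency: category count divided by the total number of scores.
relative = [f / total for f in frequencies.values()]

# Cumulative relative frequency: running count divided by the total.
cumulative = [c / total for c in accumulate(frequencies.values())]

for label, f, rel, cum in zip(frequencies, frequencies.values(), relative, cumulative):
    print(f"{label:22s} {f:3d} {rel:.2f} {cum:.2f}")
```

The last cumulative value is 1.00 by construction, which is a quick check that no category has been omitted.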
Stem and Leaf: A More Comprehensive Data Display
Sometimes, rather than collapsing or abbreviating the data list, the scores need to be organized so that when all are presented, they are easy to absorb. There are data displays that accommodate all of the data and still manage to remain fairly compact. One such display is the stem-and-leaf display or stem-and-leaf plot. Rather than collapsing the scores into class intervals (and losing some of the information about their original values), stem-and-leaf displays show the original scores. They just rearrange them for a more compact presentation. The stem-and-leaf display has that name because each score is reduced to a stem and a leaf.
· The stem is all of the digits in the number preceding the last digit in the score.
· The leaf is the last digit in the score.
Figure 2.1 is a stem-and-leaf display of the 25 test scores on which Tables 2.1 and 2.2 are based.
Figure 2.1: A stem-and-leaf display of test scores
Try It!
B What would the stem be for a score of 1,012?
See answer here.
At first glance, the display appears a little odd, but the beauty of stem-and-leaf displays is that the data list is complete. The single-digit original scores are on the bottom row, where the stem (the number preceding the final value) is 0. The stem is just one number because all of the scores are either single-digit or two-digit numbers. If the data set contained a score of 100, the stem for that score would be 10, a two-digit number.
· The single-digit original scores on the bottom row, then, are 1, 3, 6, 8, and 9.
· The second-row test scores are those for which the first digit is a 1 (the stem is 1). Those scores are 11, 12, 14, 15, 15, 17, 18, 19, and 19.
· The third-row test scores are those for which the first number (the stem) is a 2. These, of course, are the test scores in the 20s.
· In addition, the top row, with a stem of 3, contains the three highest scores: 33, 33, and 34.
Try It!
C How many "stems" would a stem-and-leaf plot have if scores represented every integer from 1 to 99?
See answer here.
Once you are oriented to "stems" and "leaves," the display is not difficult to interpret. It is clear at a glance, for example, that the bulk of these test scores are in the 10s and 20s. Note also that if the display is rotated 90° counterclockwise, it resembles a histogram with numbered columns, except that the stem values descend from left to right.
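A stem-and-leaf display can be built mechanically by splitting each score into its last digit (the leaf) and everything before it (the stem). The sketch below uses the 25 licensing scores from the text; the exact layout of Figure 2.1 is not reproduced here, so the "stem | leaves" format, with the highest stem on the top row as the text describes, is an assumption.

```python
from collections import defaultdict

scores = [34, 33, 33, 29, 26, 26, 24, 23, 23, 22,
          20, 19, 19, 18, 17, 15, 15, 14,
          12, 11, 9, 8, 6, 3, 1]

# Group the leaves (last digit) under their stems (all preceding digits).
stems = defaultdict(list)
for score in sorted(scores):
    stem, leaf = divmod(score, 10)
    stems[stem].append(leaf)

# Print with the highest stem on the top row and stem 0 on the bottom row.
for stem in sorted(stems, reverse=True):
    leaves = " ".join(str(leaf) for leaf in stems[stem])
    print(f"{stem} | {leaves}")
```

Because `divmod(score, 10)` keeps every digit, the original scores can be read back from the display, which is the advantage over a grouped frequency distribution.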
Data Cross-Tabulations
Beyond simply listing data, the stem-and-leaf display suggests that the way data are organized can make what are often quite subtle relationships easier to recognize. There are other types of displays that do this well. For the sake of the example, assume that the 25 people represent all those from a particular city who took the licensing test in a given year. Assume further that they are the products of two different universities in that city. We know from the earlier tables that there were just three outright failures on the test and an additional four who passed with exceptions, a kind of conditional pass. It might be important to determine whether students from the two universities performed similarly. Cross-tabulating the data is one way to present them so that such questions are easier to answer.
Tables 2.1 and 2.2 organized test results according to just the categories that constitute the class intervals. If the university the student attended is added, a data table can be developed so that the test results are indicated in the columns and the university the individual attended is indicated in the rows (see Table 2.3 below).
Table 2.3: Cross-tabulating test results with the institution
| | Class Intervals | | | | |
| Institution | Meritorious | Exceeds Expectations | Pass | Pass with Exceptions | Fail |
| University A | 0 | 1 | 4 | 4 | 3 |
| University B | 4 | 5 | 4 | 0 | 0 |
This cross-tabulation reveals information about the relative success of students from the two universities. When the data are aggregated across institutions, it is not apparent, for example, that no one from University B failed the test. Nor is it clear that all those who scored at the Meritorious level were from University B. Cross-tabulating data allows a second variable to be represented and provides for a more sophisticated level of analysis.
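The counts in Table 2.3 can be summarized by row and by column to make the comparison between institutions explicit. The following is a minimal sketch using plain Python dictionaries; the variable names are illustrative.

```python
categories = ["Meritorious", "Exceeds Expectations", "Pass",
              "Pass with Exceptions", "Fail"]

# Counts from Table 2.3, one row per institution.
crosstab = {
    "University A": [0, 1, 4, 4, 3],
    "University B": [4, 5, 4, 0, 0],
}

# Row totals: how many test-takers came from each university.
row_totals = {inst: sum(counts) for inst, counts in crosstab.items()}

# Column totals: these should reproduce the frequencies in Table 2.1.
column_totals = [sum(col) for col in zip(*crosstab.values())]

for inst, counts in crosstab.items():
    n = row_totals[inst]
    failed = counts[categories.index("Fail")]
    print(f"{inst}: n = {n}, failed = {failed} ({failed / n:.0%})")

print(dict(zip(categories, column_totals)))
```

The column totals recover Table 2.1, which shows what is lost when the data are aggregated: the totals alone say nothing about which university the three failures came from.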
If information were also available on marital status, for example, the rows could be further divided to reflect that variable, which would allow us to compare results by test result, by university, and by marital status. If you have done enough reading to know what information is likely to be important in your report, the published research will guide you to the additional variables that ought to be gathered and presented in a data display.
2.2 Graphs and Other Data Figures
Sometimes, rather than grouping or arranging the scores, a more graphic presentation is helpful. That was certainly the case for the aircraft analyst described at the beginning of the chapter. Pie charts and bar graphs are both quite common and with good reason. Either one can be understood with very little explanation. As compact and efficient as the stem-and-leaf display is, the unfamiliar observer must be oriented to it before the data make sense. This is less often the case with pie charts and bar graphs, both of which are used more often than stem-and-leaf displays.
Pie Chart
Perhaps better than any other type of figure or graph, the pie chart clarifies proportions. Pie charts are not new technology. Scholars and business professionals have been using pie charts to illustrate proportional differences for several hundred years. Technically speaking, a pie chart is a circle that is divided into sectors. The size of each sector is defined by the length of the arc around the perimeter of the circle. Perhaps a pie chart is used to illustrate where people in a particular county live, and the percentages are as follows:
· 25% are city dwellers,
· 20% live in the suburbs,
· 25% live in small towns, and
· the remaining 30% live in rural areas.
If the circle that makes up the pie chart has a perimeter of 10 inches,
· an arc of 2½ inches around the perimeter will be the city sector,
· an arc of 2 inches around the perimeter will be for those in the suburbs,
· an arc of 2½ inches will be small-town people, and
· an arc of 3 inches will represent the sector for rural people.
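The arc lengths above follow directly from the percentages: each sector's arc is its share of the whole multiplied by the perimeter. A small sketch of that arithmetic follows; the function name `arc_lengths` is hypothetical.

```python
def arc_lengths(percentages, perimeter):
    """Arc length of each sector: its percentage share times the perimeter."""
    return {name: perimeter * pct / 100 for name, pct in percentages.items()}


# Percentages of county residents by where they live, from the text.
residence = {"city": 25, "suburbs": 20, "small towns": 25, "rural": 30}

arcs = arc_lengths(residence, perimeter=10)
for name, arc in arcs.items():
    print(f"{name}: {arc} inches")
```

The arcs sum to the full 10-inch perimeter, just as the percentages sum to 100.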
When we are interested in how much of the whole is explained by individual categories, a pie chart is usually more helpful than a table. This is particularly the case when the data sets are large.
Perhaps a sociologist is interested in the ethnic group makeup of the residents in a particular county. Examining census data might produce the following statistics:
| African Americans | 23,375 |
| Asian Americans | 18,217 |
| Caucasian Americans | 32,667 |
| Hispanic Americans | 40,886 |
| Native Americans | 11,364 |
| Other Americans | 5,887 |
Treated as a list, the data are certainly precise, but perhaps they are not as communicative as they might be. If the intent is to indicate how the different ethnic groups compare as proportions of the entire population of the county, a pie chart is probably more helpful. The population data in a pie chart result in Figure 2.2. The way this chart is prepared, the size of each slice of the pie is based on the percentage of representation for that group. The raw numbers may not matter to someone who wants a graphic demonstration (i.e., for someone who is interested in knowing that Hispanic American residents constitute the largest single ethnic group in the county, that the second largest group is the Caucasian American group, that Native Americans are about half as numerous as African Americans, etc.).
Figure 2.2: A pie chart of ethnic group makeup based on census data
To make this pie chart in Excel, enter the data in two columns just as they are in the list. Drag the cursor to highlight both columns and then select the Insert tab at the top of the page. Select the Pie option. The default product is the two-dimensional pie chart.
Pie charts illustrate proportional differences better for large proportions than for small. A good rule of thumb is to include the percentage numbers with each slice of the pie, even though this can become messy if there are many small slices. Note that Figure 2.2 makes it difficult to know how much larger the Native American population is than "Other Americans." Pie charts generally work better when comparing an individual "slice" to the whole rather than one slice to another.
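The percentage labels recommended above can be computed from the census counts before any chart is drawn. This sketch is independent of Excel and uses only the figures listed in the text; the variable names are illustrative.

```python
# Census counts from the text.
population = {
    "African Americans": 23375,
    "Asian Americans": 18217,
    "Caucasian Americans": 32667,
    "Hispanic Americans": 40886,
    "Native Americans": 11364,
    "Other Americans": 5887,
}

total = sum(population.values())

# Each group's share of the county population, as a percentage.
shares = {group: 100 * count / total for group, count in population.items()}

# List the slices from largest to smallest.
for group, pct in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{group:20s} {pct:5.1f}%")
```

Printing the shares side by side also makes slice-to-slice comparisons, such as Native Americans versus "Other Americans," easier than reading them off the pie.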
Bar Graphs
Bar graphs use a series of bars of different lengths to represent the different quantities of some variable. The bars can be either horizontal or vertical.
When there are gaps between the bars, it indicates that the categories in the graph are not continuous; they are discrete or independent categories. This is the case for a chart that shows the popularity of the different academic majors at a university or that illustrates the ethnic group makeup of the county, such as in Figure 2.3.
Bar graph data values are given along the y (vertical) axis of the graph (as seen in Figure 2.3). This makes it easy to read the total population value from the height of the bar for each ethnic group and also simplifies comparisons from group to group. Also note that the order of the bars usually is not significant in a bar graph with discrete categories; in Figure 2.3 they are simply in alphabetical order (compare this with the histograms discussed in the next section).
Figure 2.3: A bar graph of ethnic makeup based on census data
To create this bar graph in Excel, use the same data set used for the pie chart; highlight both columns of data; select the Insert tab at the top of the page; choose Bar; and then select All Chart Types at the bottom of the page because the default charts all use horizontal bars. Select the upper left column graph; place your cursor on the series 1 notation at the right; press the Delete key on your keyboard, and then click OK.
Histograms
The ethnicity data in the categories in Figure 2.3 are nominal scale. As discussed in Chapter 1 on the distinction between categorical and continuous data, the categories are not continuous, and as we noted, the order of the categories is unimportant. Sometimes, the data categories continue from one to the next so that each category indicates an incremental increase or decrease in the level of the same characteristic. This variation of a bar graph is a histogram.
The subtle visual difference between histograms and other bar graphs is the absence of a gap between the bars or columns. It is a reminder that the data continue without interruption into the next category. The data on the x-axis of a histogram are continuous (although this is not always the case, as ordinal data are sometimes used), as opposed to the categorical data found along the x-axis in bar graphs. Earlier in this chapter, we discussed actual versus apparent limits in class intervals; here the lack of interruption indicates that limits in a histogram are actual limits.
In a histogram, the order of the intervals is never random. It is dictated by the magnitude of the variable. In Figure 2.4, the "salary" is a continuous variable measured in $10,000 increments and the "number of employees" is measured using 100-person increments. The results indicate that the largest group of employees (over 500) earns a salary between $44,000 and $54,000. Conversely, there are about 50 employees who make a $0 to $10,000 salary. Like bar graphs, histograms allow "bars" of data to be compared with the added benefit of seeing the distribution of the entire data set.
Figure 2.4: A histogram of salary and number of employees
Source: Statistics Canada, http://www.statcan.gc.ca
Frequency Polygons
Recall that when we used the grouped frequency distribution to estimate the mean, we calculated the midpoint of each class interval. Picture a vertical bar graph where each bar represents one score. If a point indicates the midpoint at the top of each bar, joining the points with a series of short lines creates a frequency polygon.
In a frequency polygon, the lowest value occurs to the left on the graph, and scores graduate to the highest value on the right. Figure 2.5 is a frequency polygon for all the test scores that were in the stem-and-leaf display (Figure 2.1). Those data and the frequency polygon are presented below.
Figure 2.5: A frequency polygon of the test scores
Tracing from the point to the vertical axis indicates score frequency. Tracing from the point down to the horizontal axis indicates the value of the particular score. This is not a conventional frequency polygon because there are no more than two repetitions of any score, but studying the graph should make it clear that there is one score of 3, for example, no scores of 4 or 5, and so on. Figure 2.6 is a cumulative frequency polygon, which plots the cumulative frequency for each score on the abscissa. Because the cumulative frequency can only rise as the score (IQ, in that figure) increases, the graph takes on an S-shape. These types of graphs are common in psychological and business disciplines to chart, for example, progress in treatment, survival, or growth over time.
Figure 2.6: A cumulative frequency polygon
Source: Lane, D. (2003, July 18). Frequency Polygons. Retrieved from the Connexions Web site: http://cnx.org/content/m11214/1.3/
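The points plotted in a frequency polygon and in a cumulative frequency polygon can be tabulated before any graph is drawn. The sketch below does this for the 25 licensing scores rather than for the IQ data in Figure 2.6, which are not reproduced in the text; the variable names are illustrative.

```python
from collections import Counter
from itertools import accumulate

scores = [1, 3, 6, 8, 9, 11, 12, 14, 15, 15, 17, 18, 19, 19,
          20, 22, 23, 23, 24, 26, 26, 29, 33, 33, 34]

frequency = Counter(scores)  # a Counter returns 0 for scores that never occur
values = range(min(scores), max(scores) + 1)

# (score, frequency) pairs: the points of the frequency polygon.
frequency_points = [(x, frequency[x]) for x in values]

# (score, cumulative frequency) pairs: the points of the cumulative polygon.
cumulative_points = list(zip(values, accumulate(frequency[x] for x in values)))

for (x, f), (_, cum) in zip(frequency_points, cumulative_points):
    print(f"score {x:2d}: f = {f}, cumulative f = {cum}")
```

The cumulative column never decreases and ends at n = 25, which is why a cumulative frequency polygon always rises from left to right.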
The midpoints from which the frequency polygon is produced are plotted according to what are called Cartesian coordinates. Named for the 17th-century French mathematician René Descartes, Cartesian coordinates are values of x and y. The horizontal line is the x-axis, or the abscissa, and the vertical axis, which is at right angles to the x-axis, is the y-axis, or the ordinate. The abscissa and the ordinate intersect at the point of origin, where x = 0 and y = 0.
If negative values were included in the graph, they would be plotted either to the left of the point of origin (for negative values of x) or below it (for negative values of y). When all values are positive, it is common to delete the upper left, lower left, and lower right quadrants and present just the upper right quadrant. This is all that is seen in most graphs.
Box Plots
Try It!
The Khan Academy has a video tutorial on how to create a box-and-whiskers plot. Click the link provided here and search for and watch the Box-and-Whisker Plots video, included as part of the Probability and Statistics videos about descriptive statistics.
One of the most important graphs in statistical research is the box plot (also known as a box-and-whiskers plot). Like histograms, box plots are used to show the distribution of data with the added advantage that they are used to detect outliers. This graphical presentation of dispersion and extreme scores is essential to understanding issues of skewness, which is discussed later in the chapter. As shown in Figure 2.7, the line in the middle of the box represents the median. The ends of the box represent hinges or Q1 (the 25th percentile) and Q3 (the 75th percentile). The length of the box, or Q3 − Q1, is known as the interquartile range (IQR). Beyond the IQR or box length are the whiskers that extend out to the 5th and 95th percentiles. Beyond the whiskers are outlier points, and depending on your distribution, you may have none to several points of outliers. These terms and the ways to deal with these outliers in the distribution are elaborated later in the chapter.
Figure 2.7: Basic box plot showing percentiles and outliers
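The values that a box plot displays can be computed with Python's standard `statistics` module. The sketch below applies it to the 25 licensing scores. Note the assumptions: several conventions exist for interpolating percentiles, and `statistics.quantiles` uses its default ("exclusive") method here, so other software may report slightly different quartiles. The whisker positions follow the 5th and 95th percentile description in the text.

```python
import statistics

scores = [1, 3, 6, 8, 9, 11, 12, 14, 15, 15, 17, 18, 19, 19,
          20, 22, 23, 23, 24, 26, 26, 29, 33, 33, 34]

# Quartiles: Q1, the median, and Q3 mark the box.
q1, median, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1  # interquartile range, the length of the box

# Cutting the data into 20 equal groups gives the 5th, 10th, ..., 95th percentiles.
twentieths = statistics.quantiles(scores, n=20)
p5, p95 = twentieths[0], twentieths[-1]

# Scores beyond the whiskers would be plotted as individual outlier points.
beyond_whiskers = [x for x in scores if x < p5 or x > p95]

print(f"Q1 = {q1}, median = {median}, Q3 = {q3}, IQR = {iqr}")
print(f"whiskers at {p5} and {p95}; points beyond: {beyond_whiskers}")
```

Whether a point beyond the whiskers deserves to be treated as an outlier is a judgment about the data, not something the percentile calculation decides.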
2.3 The Normal Distribution
Perhaps the most common application of the frequency polygon is the normal curve. What is often referred to as a bell-shaped curve or just a bell curve is also a frequency polygon. It is based on enough individual scores so that the straight lines between consecutive scores are too short to notice; they appear as part of one long, continuous curved line. There is no way to show this with a small sample of just 25 test scores; however, if we had all scores from all students nationally, it would appear more like the smooth, continuous curve in Figure 2.8.
The normal curve is important for more than its appearance. That bell shape (Figure 2.8), which is low on either side and highest in the middle, suggests the way many characteristics tend to be distributed when they are measured for large numbers of people. The shape is a reminder that particularly when we are dealing with mental traits, there is often a predictable distribution of that characteristic in populations. We can sometimes know even before collecting data which scores in a distribution are likely to have the greatest frequency and which the least.
Much of what we know about the normal distribution is due to the work of Karl Friedrich Gauss (1777–1855). The son of a gardener and bricklayer, Karl Gauss was a gifted German mathematician who started correcting his father's arithmetic at the tender age of 3. He was perhaps the most important of those early scholars who began the modern theory of numbers and recognized the existence of normal distributions and defined their properties. In his honor, the normal distribution is sometimes called a Gaussian distribution.
Figure 2.8: The normal distribution
The Elements of Normality
According to Karl Gauss, if we were to measure very large numbers of people on some mental characteristic like verbal aptitude, a frequency polygon of the results would likely have the following characteristics:
1. Divided down the middle, each half of the distribution is a mirror image of the other. In other words, the distribution is symmetrical.
2. The distribution is unimodal; it will have just one most frequently occurring number (mode)—effectively, the bell has just one peak.
3. The standard deviation of the data in the population will be about one-sixth of the range; s = 1/6 range.
There are more technical descriptions of normality than these statements, but these will serve us well. With what we already know about descriptive statistics, we can make reasonably good judgments about whether a data distribution is normal. This is important because to some degree, the types of analyses that are available depend upon whether data are normal.
Skewness
An easy way to determine whether data are symmetrical is to compare the measures of central tendency. If the mean (M), the median (Mdn), and the mode all have the same value, the distribution is symmetrical, as well as unimodal.
When the measures of central tendency do not agree, it is because some scores on one side of the distribution are not counterbalanced by scores a similar distance from the mean on the other side of the distribution. This imbalance creates the lack of symmetry that is called skewness. When there is no skewness (skewness = 0), the distribution is symmetrical.
Comparing the M, Mdn, and mode can indicate whether data are skewed, but the amount of the skewness can also be calculated. One of the simpler formulas follows:
\[ \text{skewness} = \frac{M - Mdn}{Mdn} \qquad \text{(Formula 2.1)} \]
Where M = the mean of the values and Mdn = the median of the values.
With the following 25 test scores from the licensing test example:
1, 3, 6, 8, 9, 11, 12, 14, 15, 15, 17, 18, 19, 19, 20, 22, 23, 23, 24, 26, 26, 29, 33, 33, 34
verify that
M = 18.40
Mdn = 19
Using Formula 2.1,
\[ \text{skewness} = \frac{M - Mdn}{Mdn} = \frac{18.40 - 19}{19} = -0.032 \]
The negative value indicates some negative skewness. A positive value would have indicated positive skewness. Negative skewness means that the slope to the left of the mean and median is more gradual than that to the right. Particularly in small groups, some skewness is common, and for purposes of analysis, skewness values from −1.0 to +1.0 indicate quite modest skewness. Figure 2.9 shows graphs of distributions with negative (A) and positive (B) skewness.
Figure 2.9: Skewed distributions
We noted that many of the more common statistical procedures require that data be relatively normal. When data are not normal (and a lack of symmetry is one characteristic of nonnormal data), we have to use analytical procedures that do not rest on the normality assumption; you will learn about these procedures in Chapter 12. At skewness = −0.032, there is some negative skew, but because the value falls well within ±1.0, it is not enough to worry about. Skewness is not a problem in this set of data.
Note that the mean is slightly lower than the median in the 25 test scores. The way skewness is calculated in Formula 2.1 indicates that any time the mean is less than the median, the result will be a negative skew value. If M > Mdn, the data have positive skew.
In effect, the mean is pulled in the direction of the skewness, so that even before you calculate a skew value, a comparison of the two measures of central tendency will usually tell you whether skewness is positive or negative.
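Formula 2.1 is simple enough to verify in a few lines of Python. The sketch below implements the formula exactly as the chapter states it; this is not the moment-based skewness statistic that packages such as SPSS report, so the two values should not be expected to match. The function name `skewness` is illustrative.

```python
import statistics


def skewness(values):
    """Formula 2.1: (mean - median) / median."""
    m = statistics.mean(values)
    mdn = statistics.median(values)
    return (m - mdn) / mdn


scores = [1, 3, 6, 8, 9, 11, 12, 14, 15, 15, 17, 18, 19, 19,
          20, 22, 23, 23, 24, 26, 26, 29, 33, 33, 34]

print(f"M = {statistics.mean(scores):.2f}")    # 18.40
print(f"Mdn = {statistics.median(scores)}")    # 19
print(f"skewness = {skewness(scores):.3f}")    # -0.032
```

Because the denominator is the median, the sign of the result depends only on whether the mean is below or above the median, as the text notes.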
Outliers
The mean and median differ when the values on one side of the distribution are more extreme than those on the other side; in other words, extreme values create skewness. Consider these data:
20, 25, 30, 35, 40
For these five values, Mdn = 30 and M = 30. There is no skewness; the data are symmetrical. This example is certainly not normal, but it is symmetrical. However, if values of 5 and 45 are added to the set, things change:
5, 20, 25, 30, 35, 40, 45
For these seven values, Mdn remains at 30, but M becomes 28.57. The additional values created some negative skewness. That is because, of the two new values added to the distribution, the 5 is more distant from the mean than the 45. The effect of the 5 is to pull the mean away from the median. The result is negative skewness.
As the most extreme score in the set, the 5 can be termed an outlier. Outliers are scores that are uncharacteristic of the other scores in the data set. Outliers in just one direction create skewness. If instead of 45, the upper value had been 55, there would have been no skew because the 5 and 55 are equidistant from the mean (which in that case, would have remained M = 30).
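The effect of the outlier on the mean, and its lack of effect on the median, can be confirmed directly. This short sketch uses the three data sets discussed above; the helper name `describe` is illustrative.

```python
import statistics


def describe(values):
    """Return the mean and median of a data set."""
    return statistics.mean(values), statistics.median(values)


original = [20, 25, 30, 35, 40]
with_low_outlier = [5, 20, 25, 30, 35, 40, 45]   # 5 is farther from 30 than 45 is
balanced_extremes = [5, 20, 25, 30, 35, 40, 55]  # 5 and 55 are equidistant from 30

for label, data in [("original", original),
                    ("5 and 45 added", with_low_outlier),
                    ("5 and 55 added", balanced_extremes)]:
    m, mdn = describe(data)
    print(f"{label:15s} M = {m:.2f}, Mdn = {mdn}")
```

In all three sets the median stays at 30; only the unbalanced extreme value moves the mean.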
We have seen that the orientation of the mean to the median indicates skew. We also noted that normal data are unimodal, so we need to determine the mode as well, even though it is a statistic that is typically associated with nominal scale data.
Kurtosis
Try It!
D If the mode is least affected by outliers and the mean is the most influenced, what will be the order of the mean, median, and mode from left to right in a distribution with negative skewness?
See answer here.
Symmetry and unimodality do not assure normality. A data distribution can be symmetrical and unimodal but not normal. The third dimension of normality, kurtosis, has to do with how spread out the data are. That rather odd word, which comes from the Greek word that means "bulging" or "convex," is part of three different descriptions of data distributions:
· Normally distributed data are mesokurtic, literally "middle-kurtic."
· Data that are too homogeneous, or too similar to be normal, are leptokurtic, or "narrow-kurtic."
· Data that are too heterogeneous, or too varied to be normal, are platykurtic, or "flat-kurtic." A good mnemonic to help you remember: "plat" and "flat."
The data distributions in Figure 2.10 illustrate platykurtic, leptokurtic, and mesokurtic (normal) data distributions.
Figure 2.10: Kurtosis distributions
Although kurtosis can be calculated, the calculations are quite tedious, and it is more common to judge kurtosis by comparing the standard deviation (s) of the data to their range or by using software such as SPSS to calculate the values of skewness and kurtosis. Even simpler would be to superimpose a data distribution on that of a normal distribution. This "normality" test indicates whether the two graphs are significantly different from each other. Two commonly used tests, named after their respective statisticians, are the Kolmogorov-Smirnov and Shapiro-Wilk tests. Recall from Chapter 1 that the standard deviation is a measure of how much the data typically vary from the mean of the distribution. A large standard deviation indicates that, on average, individual measures differ substantially from the mean; they are not very homogeneous. In a normal distribution, the standard deviation is about one-sixth of the range. So if scores occur from, say, 20 to 55 (range = 35), the standard deviation in a normal distribution will be somewhere near 6 points, about (1/6) × range.
· If s < range/6, the distribution is leptokurtic; the data is too similar for normality.
· If s > range/6, the distribution is platykurtic; the data is too varied to be normal.
Note that the range/6 rule may not be helpful with small data sets. Small samples tend to be platykurtic because just one or two extreme scores have a disproportionate effect on the balance of the sample. As the sample grows, this effect is minimized, but it can be substantial when there are only a few scores. For the sake of manageability, many of the data sets we deal with in this book have fewer than 15 values, and in groups that small, ordinarily s > range/6.
This does not mean that the population from which the sample was drawn is not mesokurtic. Just note that the range/6 rule is a way to judge whether populations are normally distributed; it is much less dependable for samples, particularly small samples.
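As a rough illustration of the range/6 rule, the following Python sketch compares the sample standard deviation with one-sixth of the range. The function name `range_over_six_check` is hypothetical, and the seven values are the ones used earlier in the skewness example; with so few scores, the result mainly demonstrates the small-sample caveat just described.

```python
from statistics import stdev

def range_over_six_check(scores):
    """Compare the sample standard deviation (s) with range/6."""
    s = stdev(scores)                       # sample standard deviation (n - 1)
    benchmark = (max(scores) - min(scores)) / 6
    if s < benchmark:
        verdict = "leptokurtic"             # data too similar for normality
    elif s > benchmark:
        verdict = "platykurtic"             # data too varied for normality
    else:
        verdict = "mesokurtic"
    return s, benchmark, verdict

s, benchmark, verdict = range_over_six_check([5, 20, 25, 30, 35, 40, 45])
print(f"s = {s:.2f}, range/6 = {benchmark:.2f}, verdict = {verdict}")
```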
If more precision is needed, and there is not a computer with Excel handy, you can calculate kurtosis longhand by using Formula 2.2:
$$\text{kurtosis} = \frac{\sum (x_i - M)^4}{(n - 1)\,s^4} - 3$$
Formula 2.2
Where
$x_i$ = each value in the sample,
$(x_i - M)^4$ = the difference between each number and the mean raised to the fourth power, which means that
1. each x − M difference is squared, after which
2. the result is multiplied by (x − M), which raises it to the third power, after which
3. the result is again multiplied by (x − M), which raises it to the fourth power,
n = the number in the sample, and
$s^4$ = the sample standard deviation, also raised to the fourth power.
Note that Formula 2.2 is a simplified form of other equations for finding kurtosis. Because of this, Excel will not necessarily produce the same value as the equation. In addition, note that the Excel function KURT will not calculate kurtosis for sets of fewer than four numbers.
If three people are given a test to see how many word problems they can successfully solve in a 10-minute period, and the scores are 3, 5, and 7, a kurtosis value can be calculated for this set of scores as follows:
· M = 5
· $(x_i - M)^4$ for each score:
  3 − 5 = −2; $(-2)^2 = 4$; $(-2)^3 = -8$; $(-2)^4 = 16$
  5 − 5 = 0; $0^4 = 0$
  7 − 5 = 2; $2^2 = 4$; $2^3 = 8$; $2^4 = 16$
· $\sum (x_i - M)^4 = 16 + 0 + 16 = 32$
· s = 2; $s^2 = 4$, $s^3 = 8$, $s^4 = 16$
· n = 3
$$\text{kurtosis} = \frac{\sum (x_i - M)^4}{(n - 1)\,s^4} - 3 = \frac{32}{(2)(16)} - 3 = -2.0$$
The final "−3," which early formulas for calculating kurtosis didn't have, brings kurtosis values in line with calculated values for skewness, where 0 indicates a symmetrical distribution. In the case of kurtosis,
· Zero indicates a mesokurtic distribution—neither leptokurtic nor platykurtic.
· Positive values indicate a leptokurtic distribution.
· Negative values indicate a platykurtic distribution.
· Kurtosis values in the ±1.0 range are ideal for statistical analyses that require normal data. Values in the ±2.0 range are not considered normal, but they are, if not ideal, at least acceptable for statistical analyses.
The value for the three problem-solving scores indicates that the distribution is somewhat platykurtic, which is not surprising. Most distributions show some level of skewness and kurtosis; a perfectly mesokurtic distribution is seldom the case.
A normal frequency distribution has to have a peak. In this simple example, the three scores occur with equal frequency (one each), which indicates a multimodal distribution. Note that there is no skewness, but the kurtosis value indicates that the distribution is not normal.
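The longhand calculation with Formula 2.2 can be sketched in Python as follows. The function name `kurtosis_formula_2_2` is hypothetical; the code follows the simplified formula as given in this chapter, so its results will generally differ from Excel's KURT.

```python
from statistics import mean, stdev

def kurtosis_formula_2_2(scores):
    """Simplified kurtosis: sum((x - M)^4) / ((n - 1) * s^4) - 3."""
    m = mean(scores)
    s = stdev(scores)                       # sample standard deviation
    n = len(scores)
    fourth_powers = sum((x - m) ** 4 for x in scores)
    return fourth_powers / ((n - 1) * s ** 4) - 3

# The three word-problem scores from the worked example.
result = kurtosis_formula_2_2([3, 5, 7])
print(result)
```

For the three scores, the numerator is 32 and the denominator is 2 × 16, which reproduces the longhand value of −2.0.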
The values raised to the fourth power can make the longhand calculations quite grueling. Besides, the range/6 rule makes it possible to make judgments about kurtosis without the use of a calculator or appropriate statistical software.
Using Excel to Calculate Skewness and Kurtosis
Excel makes calculating kurtosis painless and provides additional information in the bargain, as this example will show.
If the 25 licensing test scores are entered in one column in Excel, the commands are as follows for calculating the descriptive statistics that will include skewness and kurtosis values:
Data → Data Analysis
· In the Data Analysis window, select Descriptive Statistics, and click OK.
· For Input Range, drag the cursor over the cells in which the 25 scores occur.
· Click Output range and designate a cell value where there will be room below that cell for the output, which takes up about 15 lines and two columns.
The Excel output is shown in Table 2.4.
Note that although there are multiple duplicated numbers, Excel only lists one value for the mode.
Table 2.4: Descriptive statistics using Excel
| Statistic | Value |
| --- | --- |
| Mean | 18.40 |
| Standard Error | 1.833939 |
| Median | 19 |
| Mode | 33 |
| Standard Deviation | 9.169696 |
| Sample Variance | 84.08333 |
| Kurtosis | −0.63032 |
| Skewness | −0.06491 |
| Range | 33 |
| Minimum | 1 |
| Maximum | 34 |
| Sum | 460 |
| Count | 25 |
The negative value for kurtosis indicates that these 25 values make up a platykurtic distribution, which, as we keep noting, is typical for relatively small groups. The distribution is a little too flat to be normal, although it is well within the limits for normality that statistical tests require.
The skewness value, by the way, is a little different from the value we calculated earlier, skewness = −.032. Formula 2.1 is one of the simpler formulas for calculating skewness, so the result is a value that is a little less precise than what Excel produces. Any variation between calculations with Formula 2.1 and what Excel produces will generally be minor.
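For readers without Excel, the Table 2.4 skewness and kurtosis values can be approximated in Python. The sketch below assumes the sample-adjusted formulas that Excel documents for its SKEW and KURT functions; the function names are hypothetical, and the 25 scores are the licensing test results listed in Section 2.4.

```python
from statistics import mean, stdev

scores = [1, 3, 6, 8, 9, 11, 12, 14, 15, 15, 17, 18, 19, 19, 20,
          22, 23, 23, 24, 26, 26, 29, 33, 33, 34]

def excel_style_skew(values):
    """Sample skewness: n / ((n-1)(n-2)) * sum(((x - M) / s)^3)."""
    n, m, s = len(values), mean(values), stdev(values)
    z_cubed = sum(((x - m) / s) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * z_cubed

def excel_style_kurt(values):
    """Sample excess kurtosis with the small-sample correction terms."""
    n, m, s = len(values), mean(values), stdev(values)
    z_fourth = sum(((x - m) / s) ** 4 for x in values)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * z_fourth - correction

print(round(excel_style_skew(scores), 5), round(excel_style_kurt(scores), 5))
```

These formulas include corrections for sample size that Formula 2.1 and Formula 2.2 omit, which is one reason the longhand and software values differ slightly.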
2.4 Determining What Is Representative
Earlier, we noted that extreme scores have the potential to distort the descriptive statistics in a data distribution. Particularly when the group is relatively small, both the mean and the standard deviation can be substantially affected by extreme scores. This observation prompts a question. Because nearly all distributions have scores that differ from the mean of the group, at what point does a score become an extreme score? What defines an outlier?
Percentile Ranks
There are several ways to answer this question. One approach requires a quick introduction to percentiles. Recall that the median (Mdn) is the point in a distribution where half of all scores occur below it. In terms of percentages, when they are arranged from lowest score to highest, 50% of the scores fall below the median in a distribution. Therefore, the median marks the 50th percentile rank. Percentile ranks define the percentage of scores occurring below a point.
If we divide each half of the distribution again into halves, the result is fourths of the distribution, which are called quartiles, the same term we used in our discussion of box plots in Section 2.2. Arranged in order, the TrPLE licensing test results are as follows:
1, 3, 6, 8, 9, 11, 12, 14, 15, 15, 17, 18, 19, 19, 20, 22, 23, 23, 24, 26, 26, 29, 33, 33, 34
Because there are 25 scores, the median is the 13th score, the first of the two 19s. If we exclude the Mdn and then find the middle of the lower half of the distribution, the middle point of the first 12 scores is halfway between the sixth and seventh scores, or between 11 and 12.
· Midway between 11 and 12 is 11.5. That score marks the 25th percentile rank, or quartile 1 (Q1).
· The middle of the uppermost 12 scores falls between 24 and 26. That makes 25 the 75th percentile rank, or quartile 3 (Q3).
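The quartile procedure just described (set the median aside, then take the middle of each half) can be sketched in Python. The function name `quartiles_excluding_median` is hypothetical; note that statistical packages use several different quartile conventions, so their results may not match this method exactly.

```python
from statistics import median

def quartiles_excluding_median(scores):
    """Return (Q1, Mdn, Q3) using the median-of-each-half method."""
    ordered = sorted(scores)
    half = len(ordered) // 2
    lower = ordered[:half]
    # With an odd count, the median itself is left out of both halves.
    upper = ordered[half + 1:] if len(ordered) % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)

licensing_scores = [1, 3, 6, 8, 9, 11, 12, 14, 15, 15, 17, 18, 19, 19, 20,
                    22, 23, 23, 24, 26, 26, 29, 33, 33, 34]
q1, mdn, q3 = quartiles_excluding_median(licensing_scores)
print(q1, mdn, q3)
```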
The Interquartile Range
The portion of the distribution from the 25th to the 75th percentile rank constitutes the interquartile range (IQR), which is shown in Figure 2.11.
Figure 2.11: The interquartile range
Because scores in the middle half of the distribution are more likely to be repeated than those at either end of the distribution, the interquartile range generally contains the most representative scores. One approach is to use the IQR to identify the outliers, the scores most likely to distort a distribution.
However, to exclude everything outside the IQR as an outlier would exclude half of the distribution, a move that is probably too extreme. An alternative approach (Lockhart, 1998) is to calculate the IQR and then identify as outliers scores that are more than 1.5 × IQR below Q1 or more than 1.5 × IQR above Q3.
As an illustration, a psychologist is working with clients in an addiction program. The issue is how many drug-free days each client has achieved. A random sample of seven clients yields the following numbers of drug-free days: 1, 7, 13, 17, 25, 27, 63.
Verify that
Mdn = 17
Q1 = 7
Q3 = 27
IQR = 20
For lower outliers,
Q1 − (1.5 × IQR) = 7 − (1.5 × 20) = −23; any score below −23 is an outlier
For upper outliers,
Q3 + (1.5 × IQR) = 27 + (1.5 × 20) = 57; any score above 57 is an outlier
Among these seven scores, only 63 is an outlier by definition. Although there's always a judgment involved and perhaps some subjectivity, the IQR outlier calculation approach will at least result in consistent decisions about which data to exclude as the least representative scores in a distribution.
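The 1.5 × IQR rule for the drug-free-days example can be sketched in Python as follows. The function name `iqr_outliers` is hypothetical, and the quartiles are found with the median-of-each-half method used in this section, which is only one of several quartile conventions.

```python
from statistics import median

def iqr_outliers(scores):
    """Return (lower fence, upper fence, outliers) under the 1.5 x IQR rule."""
    ordered = sorted(scores)
    half = len(ordered) // 2
    q1 = median(ordered[:half])
    # With an odd count, the median itself is left out of both halves.
    q3 = median(ordered[half + 1:] if len(ordered) % 2 else ordered[half:])
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [x for x in ordered if x < lower_fence or x > upper_fence]
    return lower_fence, upper_fence, outliers

drug_free_days = [1, 7, 13, 17, 25, 27, 63]
lower_fence, upper_fence, outliers = iqr_outliers(drug_free_days)
print(lower_fence, upper_fence, outliers)
```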
If the IQR outlier calculation seems too complex to answer the question of which scores are least like the other scores, there are alternatives. In Chapter 3, where the normal distribution will be examined more closely, you will see that when data is normal, specified percentages of the distribution always occur within certain ranges. For example, ± two standard deviations will always include about 95% of a normal distribution. We could rely on that fact to devise a rule for outliers and exclude scores beyond ± two standard deviations (2s). For the group of seven values just above,
M = 21.857 and
s = 20.359
By the ±2s rule,
· anything lower than −18.860 (21.857 − 2 × 20.359) or
· higher than 62.575 (21.857 + 2 × 20.359) could be excluded as outliers.
In this data set, of course, that would exclude just 63, which is the same decision we came to with Lockhart's (1998) approach. The point is to develop a reasonable rule, and be consistent and transparent in its application.
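The ±2s rule applied to the same seven values can be sketched in Python. The function name `two_sd_outliers` is hypothetical, and the sketch uses the sample standard deviation (dividing by n − 1), which matches the value of s reported above.

```python
from statistics import mean, stdev

def two_sd_outliers(scores):
    """Flag scores more than two sample standard deviations from the mean."""
    m, s = mean(scores), stdev(scores)
    lower, upper = m - 2 * s, m + 2 * s
    flagged = [x for x in scores if x < lower or x > upper]
    return lower, upper, flagged

drug_free_days = [1, 7, 13, 17, 25, 27, 63]
lower, upper, flagged = two_sd_outliers(drug_free_days)
print(f"{lower:.3f} to {upper:.3f}; outliers: {flagged}")
```

Here the score of 63 falls only slightly above the upper limit, a reminder that different outlier rules can disagree on borderline scores.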
The Distorting Effect of Outliers
The problem with outliers, of course, is that extreme scores make the mean, which is supposed to indicate central tendency, something less than central. Outliers can also dramatically inflate the value of the standard deviation. Remember that the standard deviation is based on the square of the difference between a score and the mean of the group. Squaring the difference between an extreme score and the mean can have a disproportionate effect on the magnitude of the statistic. When that happens, what are supposed to be descriptive statistics do not describe the most representative scores very well.
The point of organizing tables, creating graphs and figures, and calculating skew, kurtosis, and all the other descriptive statistics is to describe data sets. It is easy to get buried in a lot of complex calculations and arrangements as you create the different descriptive tools, but do not lose sight of the objective, which is to clarify and often to simplify the data. For example, if the point of the analysis is the range of scores included in the Meritorious category, it makes little sense to exclude the most extreme scores when developing a feel for the highest scores. That those highest scores may create positive skew in the distribution as a whole is, in this case, largely unimportant.
This advice is not meant to diminish the importance of sometimes identifying and excluding outliers. As an example, British educational practice required students to take very rigorous government examinations in the ninth and twelfth grades called O-levels and A-levels. The results from these exams had much to do with students' subsequent educational options in higher education. In the author's 12th grade year in a very small high school, several students took the exam in physics. One of the author's classmates produced a perfect score, something that had never occurred before in the history of the test. If someone wished to review those scores to develop a feel for the level of physics performance typical of seniors in that high school, it would make sense to exclude that one perfect score before calculating a mean, or perhaps to calculate the median score instead of the mean because medians are less affected than means by outliers. With just a few scores to begin with, a perfect score on a very difficult examination holds too much potential to distort a mean.
Apply It!
Placement Test Outliers
The local junior high school has three levels of sixth-grade math. New students entering from outside the district must take a standardized math placement test.
For the upcoming school year, 40 students from outside the district take the test. The counselor records their test scores and then performs statistics using Excel. The Descriptive Statistics function returns the following values for these 40 test scores:
| Statistic | Value |
| --- | --- |
| Mean | 54.65 |
| Median | 54.22 |
| Standard Deviation | 10.27 |
| Kurtosis | 4.65 |
| Skewness | 1.20 |
| Range | 62.00 |
| Minimum Score | 15 |
| Maximum Score | 76 |
The counselor uses her knowledge of statistics to evaluate the results. For example, she knows that these placement test results should follow a normal distribution and that in a normal distribution, the standard deviation of the data will be about ⅙ of the range, here about 10.33, which is very close to the standard deviation of 10.27. The counselor also notices that the mean and median values are almost the same. The skewness value of 1.2 indicates that the data lack symmetry. However, the value that really stands out is kurtosis.
The kurtosis value of 4.65 is far outside the ±2.0 range for normal data. This indicates that the data are not normally distributed, contrary to what was expected. As a next step, the counselor decides to look for outliers by using the interquartile range. She arranges the data in order and finds the scores marking the 25th (Q1) and 75th (Q3) percentile ranks. The portion between these values is the interquartile range (IQR).
Q1 = 49, Q3 = 60, IQR = 11
For lower outliers,
Q1 − (1.5 × IQR) = 49 − (1.5 × 11) = 32.5; any score below 32.5 is an outlier
For upper outliers,
Q3 + (1.5 × IQR) = 60 + (1.5 × 11) = 76.5; any score above 76.5 is an outlier
Based on these rules, only the minimum score of 15 is an outlier. The counselor decides to investigate this score and finds a transcription error. The minimum score should have been recorded as 51, not 15.
The erroneously recorded score of 15 is changed to 51 and new statistics are calculated, with the new results shown as follows:
| Statistic | Value |
| --- | --- |
| Mean | 55.55 |
| Median | 54.23 |
| Standard Deviation | 8.05 |
| Kurtosis | −0.16 |
| Skewness | 0.38 |
| Range | 35 |
| Minimum Score | 42 |
| Maximum Score | 76 |
The counselor notices that the standard deviation is smaller and that the values for skewness and kurtosis are now well within the range found for a normal distribution.
Based on her knowledge of statistics, the counselor knew beforehand that these test results should be normally distributed. When her initial analysis showed evidence to the contrary, she was able to investigate further, find, and then correct a mistake in the data.
Apply It! Boxes written by Shawn Murphy
Practical examples aside, outliers produce a very real problem for researchers. You will often see in the literature that researchers and statisticians give results with and without the outlier points. Best practice in research is to detect these extreme points and determine whether to use them in your data set. Specifically indicate the number of points detected and why they were deleted or retained in your analyses.
2.5 Presenting Results
In Section 2.3 we learned how to calculate skewness and kurtosis in Excel for a study that looked at 25 licensing test scores. Using the results obtained, what could we say about this sample in relation to the distribution? Well, first we know that the average score obtained was 18.40, which was slightly less than the median, which was 19.00. The mode was much larger at 33.00. The standard deviation for the data was 9.17, which means that scores were, on average, 9 points from the mean. The range of scores was from 1 (minimum score) to 34 (maximum score), giving a total range of 33. The skewness value was −0.06, which means there was a very small negative skew. We can use the standard of range/6, which would be 33/6 or 5.5, which is smaller than the standard deviation of 9.17. This means that the data is possibly platykurtic. This is common for small data sets and may not be representative of the population distribution. Moving on to the kurtosis, this value should be within ±2 to be considered normal. In this case the value is −0.63; thus we would state that the distribution appears to be fairly normal.
SPSS Steps for Descriptive Statistics with Histogram
Let us expand on the example we presented in Section 1.8, where we collected the age of 200 participants, to determine if the data is normally distributed. Table 2.5 presents the descriptive statistics for age. The average age was 41.63 (SD = 9.83). The skewness value was 0.468, which means there was a small positive skew. Using range/6 (48/6 = 8), we can compare this value, 8, to the standard deviation of 9.83. Though the numbers are close, the standard deviation is slightly larger, which implies that the data may be platykurtic. However, we can see in Figure 2.12 that the data is not necessarily flat; rather, there is a slight skew in the positive direction, as the tail of the data extends further in that direction. The kurtosis value of 0.19 is well within the ±2 needed to be considered normal. Though the data is not perfectly normal, as few things are, it is close enough to not be considered significantly skewed or varied.
The steps in executing this analysis are as follows: Analyze → Descriptive Statistics → Frequencies. Place the age and sex variables into the Variable(s) box. Click on Statistics and check Mean, Median, Mode, Standard Deviation, Range, Skewness, and Kurtosis. Click on Charts and check Histograms and Show normal curve on histogram. Then click Continue and OK.
Table 2.5: SPSS output of descriptive statistics
Statistics: Age
| Statistic | Value |
| --- | --- |
| N (Valid) | 200 |
| N (Missing) | 0 |
| Mean | 41.63 |
| Median | 41.00 |
| Mode | 40 (a) |
| Std. Deviation | 9.827 |
| Variance | 96.566 |
| Skewness | .468 |
| Std. Error of Skewness | .172 |
| Kurtosis | .191 |
| Std. Error of Kurtosis | .342 |
| Range | 48 |
| Minimum | 23 |
| Maximum | 71 |
(a) Multiple modes exist. The smallest value is shown.
The items highlighted in yellow are those used in the interpretation of these variables' descriptive statistics.
Figure 2.12: SPSS output of distribution
2.6 Interpreting Results
Though you should refer to the most recent edition of the APA manual for specific detail on formatting statistics, the following may be used as a quick guide in interpreting the descriptive statistics covered in this chapter. Since there are no abbreviations or symbols used to represent skew or kurtosis, we typically use the entire term in presenting results. Note that skew and kurtosis are not italicized since they are not abbreviations.
The following are some examples of how to interpret results using these terms, though you may use different combinations of results. These examples utilize the data presented in Section 2.5.
· The mean age of participants was 41.63 years (SD = 9.83), with a skewness of 0.47 and kurtosis of 0.19.
· There were 200 participants in the study (skewness = 0.47, kurtosis = 0.19).
· Ages were normally distributed with a skewness of 0.47 and kurtosis of 0.19.
· Ages were normally distributed with a skew less than ±1 (skewness = 0.47).
· Ages were normally distributed with a kurtosis less than ±2 (kurtosis = 0.19).
Using the data from Section 2.3, we could interpret the results in the following way:
The mean licensing test score was 18.40 (SD = 9.17). The median was 19.00 and the mode was 33. The range of scores was from 1 to 34 (N = 25) and the data was normally distributed (skewness = −0.06, kurtosis = −0.63).
Similarly, we could interpret the results from the Apply It! section in the following way:
The mean standardized math placement score was 54.65 (SD = 10.27). The median was 54.22 and the range of scores was from 15 to 76 (N = 40). The data was not normally distributed with a skewness of 1.20 and kurtosis of 4.65.
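When many variables must be reported, a small helper can keep the wording and rounding consistent. The following Python sketch is only an illustration: the function name `apa_descriptives` and the sentence template are assumptions modeled on the first example above, not an official APA tool, and the italics that APA style requires for symbols such as SD must still be applied in the word processor.

```python
def apa_descriptives(variable, m, sd, skewness, kurtosis):
    """Build a results sentence with statistics rounded to two decimals."""
    return (f"The mean {variable} was {m:.2f} (SD = {sd:.2f}), "
            f"with a skewness of {skewness:.2f} and kurtosis of {kurtosis:.2f}.")

# Values taken from the SPSS output in Table 2.5.
sentence = apa_descriptives("age of participants", 41.63, 9.827, 0.468, 0.191)
print(sentence)
```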
Summary
Everyone has to analyze data. Whether we are trying to determine if it is safe to cross a street, read a co-worker's body language, or conduct a complex numerical analysis of many variables and hundreds of data points, the tasks are analytical. When the data is numerous and the decision is important, it is usually helpful to begin by organizing the data. Objectives 1 and 2 note that learning to organize and present data is much of our purpose in this chapter. As the size of the data set increases, so does the need for the information that good organization can provide. Proper organization can make a muddle of data comprehensible; an ordered array is a good deal more informative than a disordered array. Whether a frequency distribution, a pie chart, a stem-and-leaf display, or some other presentation is appropriate will depend upon what we need to know. Frequency distributions provide a clear indicator of data repetition; pie charts reveal comparative proportions; a frequency polygon can provide a rough estimate of whether the data is normally distributed; and a box plot can detect outliers in the data set.
The normal distribution is a central concept in statistics and data analysis. When data is normally distributed, we know from the beginning roughly how data will be distributed relative to the mean. We know what proportions of a data distribution will occur where, a concept that will prove very helpful as we examine z scores in Chapter 3. Sometimes, even when it is organized into the proper display, a data set does not reflect the characteristics of a normal distribution. Consistent with Objectives 3 and 4, we learned to judge normality based on a few descriptive statistics. If we can compare the mean to the median, and the standard deviation to the range, we can make a rough estimate of whether the data is normal.
Some scores threaten normality because they are so different from the balance of the data set. These outliers are so extreme that including them in whatever statistics are calculated is likely to distort the value of those statistics. Outliers create skewed distributions, so having a mechanism for identifying them can be helpful (Objective 5). In addition, how to present results (Objective 6) and interpret them in APA format (Objective 7) as they relate to describing distributions are important pieces of utilizing and writing about statistical data.
At the beginning of the chapter was a comment on the incremental nature of statistics. Consistent with that theme, the concepts related to normality and normal distributions have particular application in Chapters 3 and 4, so to prepare for those next steps, work the end-of-chapter problems and review the terms and their definitions in the glossary.
The Standard Normal Distribution and z Scores
iStockphoto/Thinkstock
Learning Objectives
After reading this chapter, you will be able to. . .
· identify the characteristics of the standard normal distribution.
· demonstrate the use of the z transformation.
· determine the percent of a population above a point, below a point, and between two points on the horizontal axis of a normal distribution.
· calculate z scores using Excel.
· describe alternative standard scores.
· demonstrate the use of the modified standard score.
· present results based on z scores in SPSS.
· interpret results of the z and T score transformations in APA format.
The data used to describe the characteristics of groups come either from samples or from populations, descriptors that came up in both of the first two chapters. Recall that by definition, populations include all possible members of any specified group. All university students, all psychology majors, all residents of Orange County, all left-handed male tennis players in their 20s: each indicates a population. We rely on Greek letters, such as μ for the mean and σ for the standard deviation, to distinguish population parameters from the statistics that describe samples. (The word parameter indicates a characteristic of a population.) To reiterate, the sample is a subset of the population, and because we usually cannot collect data from the entire population, we rely on a representative sample of that population.
In the course of describing populations, we noted that many are normally distributed. Normality is suggested when data distributions are symmetrical (when all the measures of central tendency have very similar values) and the standard deviation has a value about one-sixth that of the range.
Normality is much more than a descriptive issue. Because many of the mental characteristics that describe human behavior are normally distributed, the proportions of scores that occur in a particular area of the distribution are the same for all normally distributed characteristics. The characteristics of normal populations have been defined well enough that from the mean of the population to one standard deviation below the mean always includes 34.13% of the area under the curve. Because normal distributions are symmetrical, from the mean to one standard deviation above the mean also includes 34.13%, so from −1σ to +1σ includes about 68.26% of the area under the curve in any normally distributed population. As long as the data is normal, percentages like these will hold true. This allows us to know a good deal about anything that is normally distributed without actually gathering the data and doing the analysis. Whether the characteristic is intelligence or achievement motivation or anxiety, a normal distribution means that the proportion of the distribution within +1 or −1 standard deviation from the mean will be the same:
· If a particular intelligence scale has μ = 100 and σ = 15, about 68% of any general population will have intelligence scores between 85 and 115.
· If an achievement motivation scale has μ = 40 and σ = 8, about two-thirds of any population will have achievement motivation scores from 32 to 48.
· For an anxiety measure with μ = 25 and σ = 5, about 68% of any general population will have scores between 20 and 30.
The consistency in the way so many characteristics are distributed provides us with a good deal of interpretive power. Anyone who needs information about the likelihood of individuals scoring in certain areas of a distribution has an advantage whenever data is normally distributed. In addition to the 68% of any general population likely to score between +1σ and −1σ,
· From μ to +2σ is about 47.72% of the population, so about 95% (2 × 47.72) of the people in any general population will have intelligence scores between 70 (100 − 30) and 130 (100 + 30).
· From +3σ (49.87%) to −3σ includes nearly everyone in any normally distributed population (2 × 49.87 = 99.74).
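The percentages quoted above can be checked with Python's standard library. This is a minimal sketch assuming Python 3.8 or later, where `statistics.NormalDist` is available; the function name `area_mean_to_z` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def area_mean_to_z(z):
    """Proportion of a normal distribution between the mean and z SDs above it."""
    return standard_normal.cdf(z) - 0.5

for z in (1, 2, 3):
    one_side = area_mean_to_z(z)
    print(f"mean to {z} SD: {one_side:.2%}; within +/-{z} SD: {2 * one_side:.2%}")

# The same proportions apply to any normal scale, such as mu = 100, sigma = 15.
iq = NormalDist(mu=100, sigma=15)
share_70_to_130 = iq.cdf(130) - iq.cdf(70)
print(f"scores between 70 and 130: {share_70_to_130:.2%}")
```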
The foregoing is intended to make the point that sometimes isolated bits of data can be quite informative. When a 12-year-old with an intelligence score of 170 pops up on YouTube, straightaway it is apparent that this is a very unusual child. An intelligence score of that magnitude is about 4.667σ (170 − 100 = 70; 70 ÷ 15 = 4.667) beyond the mean of the general population! If from +3σ to −3σ includes more than 99% of the population, from +4.667σ to −4.667σ must include all but the very most extreme scores. We get an even better context for how common (or uncommon) particular measures may be when we can determine the precise probability of their occurrence.
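The arithmetic for the score of 170 can be sketched in Python as follows, again assuming Python 3.8 or later for `statistics.NormalDist` and assuming the intelligence scale with μ = 100 and σ = 15 used in this section. The variable names are illustrative only.

```python
from statistics import NormalDist

mu, sigma, score = 100, 15, 170

# Distance of the score from the mean, in standard deviation units.
z = (score - mu) / sigma

# Proportion of a normal population falling within +/- z standard deviations.
within = 2 * NormalDist().cdf(z) - 1

print(f"z = {z:.3f}")
print(f"proportion within +/-{z:.3f} SD: {within:.6f}")
```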
The Standard Normal Distribution and z Scores
iStockphoto/Thinkstock
Learning Objectives
After reading this chapter, you will be able to. . .
· identify the characteristics of the standard normal distribution.
· demonstrate the use of the z transformation.
· determine the percent of a population above a point, below a point, and between two points on thehorizontal axis of a normal distribution.
· calculate z scores using Excel.
· describe alternative standard scores.
· demonstrate the use of the modified standard score.
· present results based on a z score results in SPSS.
· interpret results of the z and T score transformations in APA format.
The data from which the characteristics of groups are described comes either from samples or populations—descriptors—which came up inboth of the first two chapters. Recall that by definition, populations include all possible members of any specified group. All university students,all psychology majors, all residents of Orange County, all left-handed male tennis players in their 20s—each indicates a population. We rely onGreek letters, such as μ for the mean and σ for the standard deviation, to distinguish population parameters from the statistics that describesamples. (The word parameter indicates a characteristic of a population.) To reiterate, the sample is a subset of the population and since weusually cannot collect the entire populace we rely on a representative sample of that population.
In the course of describing populations, we noted that many are normally distributed. Normality is suggested when data distributions aresymmetrical—when all the measures of central tendency have very similar values—and the standard deviation has a value about one-sixth thatof the range.
Normality is much more than a descriptive issue. Because many of the mental characteristics that describe human behavior are normallydistributed, the proportions of scores that occur in a particular area of the distribution are the same for all normally distributed characteristics.The characteristics of normal populations have been defined well enough that from the mean of the population to one standard deviationbelow the mean always includes 34.13% of the area under the curve. Because normal distributions are symmetrical, from the mean to onestandard deviation above the mean also includes 34.13%, so from −1σ to +1σ includes about 68.26% of the area under the curve in anynormally distributed population. As long as the data is normal, percentages like these will hold true. This allows us to know a good deal aboutanything that is normally distributed without actually gathering the data and doing the analysis. Whether the characteristic is intelligence orachievement motivation or anxiety, a normal distribution means that the proportion of the distribution within +1 or −1 standard deviation fromthe mean will be the same:
· If a particular intelligence scale has μ = 100 and σ = 15, about 68% of any general population will have intelligence scores between 85and 115.
· If an achievement motivation scale has μ = 40 and σ = 8, about two-thirds of any population will have achievement motivation scoresfrom 32 to 48.
· For an anxiety measure with μ = 25 and σ = 5, about 68% of any general population will have scores between 20 and 30.
The consistency in the way so many characteristics are distributed provides us with a good deal of interpretive power. Anyone who needs information about the likelihood of individuals scoring in certain areas of a distribution has an advantage whenever data is normally distributed. In addition to the 68% of any general population likely to score between +1σ and −1σ,
· From μ to +2σ is about 47.72% of the population, so about 95% (2 × 47.72) of the people in any general population will have intelligence scores between 70 (100 − 30) and 130 (100 + 30).
· From +3σ (49.87%) to −3σ includes nearly everyone in any normally distributed population (2 × 49.87 = 99.74).
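The percentages above can be checked numerically. The following is a minimal sketch, assuming Python 3.8 or later and its standard-library `statistics.NormalDist` class; the helper name `percent_within` is introduced here for illustration and is not part of the text. Results may differ from the figures above in the second decimal place because the text doubles rounded table values.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def percent_within(k: float) -> float:
    """Percentage of a normal population within k standard deviations of the mean."""
    return 100 * (standard_normal.cdf(k) - standard_normal.cdf(-k))


for k in (1, 2, 3):
    print(f"within +/-{k} standard deviations: {percent_within(k):.2f}%")
```

Because the proportions depend only on the distance from the mean in standard deviation units, the same function applies to the intelligence, achievement motivation, and anxiety examples alike.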
The foregoing is intended to make the point that sometimes isolated bits of data can be quite informative. When a 12-year-old with an intelligence score of 170 pops up on YouTube, straightaway it is apparent that this is a very unusual child. An intelligence score of that magnitude is about 4.667σ (170 − 100 = 70; 70 ÷ 15 = 4.667) beyond the mean of the general population! If from +3σ to −3σ includes more than 99% of the population, from +4.667σ to −4.667σ must include all but the very most extreme scores. We get an even better context for how common (or uncommon) particular measures may be when we can determine the precise probability of their occurrence.
3.1 A Primer in Probability
Probability is defined as the number of times an event classified as A occurs divided by the total number of possible outcomes. Scholars, data analysts, and, in fact, people generally are rarely interested in outcomes that occur every time. If everyone had an intelligence score of 170, no one would pay any attention. It's the fact that we know it to be uncommon that piques our curiosity.
If we are not interested in events that always occur, we also do not closely follow events that never occur. If no one had ever had an intelligence score of 170, probably no one would wonder about what such a score means for the person who has it. It is the fact that things occur some of the time that intrigues us. The "some of the time" indicates that the event has some probability of occurrence.
· What's the probability that those newlyweds will divorce?
· How likely is it that the Yankees will win the World Series?
· What's the probability of an earthquake for someone near the San Andreas Fault?
· What's the probability of an IRS audit for one taxpayer?
Because all the things listed above have happened in the past and because their occurrence is important to at least someone, people are interested in the likelihood, or the probability, of those occurrences, whether or not they use the language of probability.
· When they're stated numerically, probability values range from 0, which means that the event never occurs, to 1.0, which is the probability for an event that occurs every time.
· Something that happens 50% of the time has a probability of 0.5.
As that last point indicates, percentages can be converted to probability values. Dividing the percentage of times an event occurs by 100 indicates the associated probability of the event.
Back to the intelligence scores: Because about 68% of the population has intelligence scores between 85 and 115, the probability that someone selected at random will have a score somewhere between 85 and 115 is p = .68 (68/100). What is the probability that someone selected at random will have a score of 100 or lower?
· Because 100 is the mean for intelligence scores, and
· because 50% of the population occur at the mean or below,
· p = .5.
What is the probability that someone selected at random will have an intelligence score higher than 115?
· Because 34.13% occur between μ and +1σ, which is between 100 and 115,
· 50 − 34.13 = 15.87% must occur above 115, so
· if we round 15.87%,
· p = .16.
By the same logic, because 85 is one standard deviation below the mean, p = .16 that someone selected at random from the population will score below 85. If we combine the two outcomes, the probability is about p = .32 that someone from the population will score either below 85 or above 115.
Recall that the lowest probability for any value is p = 0. If p = 0, then the event or outcome never occurs. There is no such thing as a negative probability.
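The reasoning above can be sketched in a few lines of Python. This is an illustration only, assuming the intelligence scale with μ = 100 and σ = 15 used in the text and the standard-library `NormalDist` class (Python 3.8+); the variable names are made up for this example.

```python
from statistics import NormalDist

# Intelligence scale from the text: mean 100, standard deviation 15.
iq = NormalDist(mu=100, sigma=15)

p_at_or_below_mean = iq.cdf(100)      # proportion scoring 100 or lower
p_above_115 = 1 - iq.cdf(115)         # proportion more than +1 sd above the mean
p_below_85 = iq.cdf(85)               # proportion more than 1 sd below the mean
p_outside = p_below_85 + p_above_115  # either below 85 or above 115

print(f"p(score <= 100)           = {p_at_or_below_mean:.2f}")
print(f"p(score > 115)            = {p_above_115:.2f}")
print(f"p(score < 85)             = {p_below_85:.2f}")
print(f"p(score < 85 or > 115)    = {p_outside:.2f}")
```

The cumulative distribution function (`cdf`) returns the proportion at or below a score, so the area above a score is obtained by subtracting from 1, which mirrors the "50 − 34.13" step in the text.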
3.2 The Standard Normal Distribution
Not all populations are normally distributed. Home sales are usually reported in terms of the median price of a home, and salary data is likewise reported as the median. In both cases, it is because the related populations are rarely normally distributed. In each case, a few extremely high values coupled with a great many more modest values skew the distributions. But when it comes to the characteristics describing people, particularly their mental characteristics such as intelligence, achievement motivation, problem-solving ability, verbal aptitude, and reading comprehension, population data is often normally distributed.
So, there are many normal distributions. Nevertheless, there are important differences among them. An intelligence test might have a mean and standard deviation of 100 and 15 points, respectively. A nationally administered reading test, for which the data is also normally distributed, might have a mean of 60 and a standard deviation of 8. These differences make it difficult to compare the same individual's performance across multiple normally distributed measures.
One way to address that problem is to convert the scores from different distributions into a common metric. If scores from different distributions are transformed so that they both fit the same distribution, the scores can be compared directly. This is one of the purposes of the standard normal distribution.
The standard normal distribution looks like all other normal distributions: from the mean to +1 standard deviation includes 34.13% of the distribution, for example. What makes it different from the others is that in the standard normal distribution, the mean and standard deviation have fixed values. The mean is always 0 and the standard deviation is always 1.0 (Figure 3.1). Other distributions have fixed values for their means and standard deviations, but only for this one is the mean always 0 and the standard deviation always 1.0.
Figure 3.1: The standard normal distribution
The z Transformation
Although different normal distributions might have different means and standard deviations, they all mirror each other in terms of the proportions of their populations that occur in particular areas. The difference between the standard normal distribution and other distributions is that in the case of the standard normal distribution, the proportions of its population that occur in nearly any area of the distribution have been determined. That means that when data from any normal distribution is transformed to the standard normal distribution, there is an important analytical advantage. Someone analyzing data can ask questions about what is likely to occur in virtually any area of the distribution. The data must simply conform to the standard normal distribution.
Individual scores in a distribution are called "raw scores," and scores in the standard normal distribution are called z scores. The formula for turning raw scores into z scores is the z transformation:
$$ z = \frac{x - M}{s} $$
Formula 3.1
Where
z = the transformed score
x = the original (raw) score
M = the mean of the scores before the transformation
s = the standard deviation of the scores before the transformation
Although the M and s indicate sample data, the process is the same for population data, except that μ replaces M and σ replaces s. But because it is far more common to have access to sample data, the formula here uses M and s. In either case, the transformation is from data that can have any mean and standard deviation to a distribution where the mean will always equal 0 and the standard deviation will always equal 1.0.
The process for turning scores from any source into z scores is the following:
1. Determine the mean and standard deviation for the data set.
2. Subtract the mean of the data set from the score to be transformed.
3. Divide the difference by the standard deviation of the data set.
A political scientist is interested in the level of apathy among potential voters regarding political issues. Scores on the Summary of Who's Apathetic Test (the SoWhat for short), an apathy measure, are gathered for 10 registered voters:
5, 6, 9, 11, 15, 15, 17, 20, 22, 25
What is the z score for someone who has an apathy score of 11?
1. Verify that for these 10 scores, M = 14.5 and s = 6.74.
2. The z score equivalent for an apathy score of 11 is
$$ z = \frac{x - M}{s} = \frac{11 - 14.5}{6.74} = -0.52 $$
A political apathy score of 11 translates into a z score of −0.52. Because the mean of the z distribution is 0 and the standard deviation in the z distribution is 1.0, where would a score of −0.52 occur on the horizontal axis of the data distribution? It would be a little more than half a standard deviation below the mean, right? Figure 3.2 indicates the z distribution and the point about where a raw score of 11 occurs in this distribution once it is transformed into a z score.
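The three steps of the z transformation can be sketched with the SoWhat data. This is a minimal illustration using Python's standard-library `statistics` module; the function name `z_score` and the variable names are assumptions made for this example. Note that `stdev` computes the sample standard deviation (dividing by n − 1), which is what reproduces s = 6.74 here.

```python
from statistics import mean, stdev

# Apathy (SoWhat) scores for the 10 registered voters in the text.
sowhat_scores = [5, 6, 9, 11, 15, 15, 17, 20, 22, 25]

# Step 1: determine the mean and (sample) standard deviation.
M = mean(sowhat_scores)
s = stdev(sowhat_scores)


def z_score(x: float, m: float, sd: float) -> float:
    """Steps 2 and 3: subtract the mean, then divide by the standard deviation."""
    return (x - m) / sd


z_11 = z_score(11, M, s)
print(f"M = {M:.1f}, s = {s:.2f}, z for a raw score of 11 = {z_11:.2f}")

# Transforming every score yields a set with mean 0 and standard deviation 1.
z_all = [z_score(x, M, s) for x in sowhat_scores]
print(f"mean of z scores = {mean(z_all):.2f}, sd of z scores = {stdev(z_all):.2f}")
```

The last two lines illustrate the point made earlier: whatever the original mean and standard deviation, the transformed scores have a mean of 0 and a standard deviation of 1.0.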
Try It!
A. A raw score has a z score of 1.5. How many standard deviations from the mean of all scores is this raw score?
See answer here.
Using the z transformation does not make data normal. The data is assumed to be normal before the formula is used. If the population from which the data is drawn is not normal before the transformation, transforming the data into z scores does not change the shape of the distribution. This is a linear transformation. Dealing with non-normal data calls for different statistical techniques, such as bootstrapping or log transformations (see Field, 2009).
With a mean of 0 in the standard normal distribution, half of all z scores (all the scores below the mean) are going to be negative. A raw score of 11 is lower than the mean of the original distribution, which was 14.5. Once transformed, it has a negative z value. Because of the way z scores are calculated, a score that has a value of z = −1.0 is one standard deviation below the mean. A z score of −0.52, therefore, is just over half a standard deviation below the mean.
Figure 3.2: Location of a score on the z distribution
Comparing Scores From Different Instruments
Consider another application of the standard normal distribution. A counselor has intelligence and reading scores for the same person and wishes to know on which measure the individual scored higher. The data is as follows:
· On the intelligence test, which has a mean of 100 and a standard deviation of 15, the individual scored 105.
· On the reading test, which has a mean of 60 and a standard deviation of 8, the individual's score is 62.
If the counselor transforms both scores so that they fit the standard normal distribution, they can be compared directly to each other.
The z value for the intelligence test is
$$ z = \frac{x - M}{s} = \frac{105 - 100}{15} = 0.33 $$
The z value for the reading test is
$$ z = \frac{x - M}{s} = \frac{62 - 60}{8} = 0.25 $$
Initially, the intelligence score of 105 and the reading score of 62 were difficult to compare because they belonged to different distributions, but after both scores are transformed so that they fit the standard normal distribution, they can be compared directly. Of the two z values, the intelligence score is higher, indicating that the individual has a higher intelligence score than reading score.
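The counselor's comparison can be expressed as a short calculation. The sketch below assumes nothing beyond the means, standard deviations, and scores given above; `z_score` is a hypothetical helper name.

```python
def z_score(x: float, m: float, sd: float) -> float:
    """Distance of a raw score from its mean, in standard deviation units."""
    return (x - m) / sd


# Values from the counselor example in the text.
z_intelligence = z_score(105, 100, 15)
z_reading = z_score(62, 60, 8)

print(f"intelligence z = {z_intelligence:.2f}")
print(f"reading z      = {z_reading:.2f}")

# Once both scores share a common scale, they can be compared directly.
higher = "intelligence" if z_intelligence > z_reading else "reading"
print(f"relatively higher score: {higher}")
```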
Expanding the Use of the z Distribution
However, an application of the standard normal distribution can also be more complex. Because the z distribution is a normal distribution, fixed proportions of the entire population occur in specific areas of the distribution. Because analysts and researchers often rely on this distribution to answer questions like those posed earlier, the proportions of the population that occur in all of the most common regions of the distribution have been calculated. The table, which provides all this information, allows us to know how much of the entire population is above or below nearly any value of z in the distribution. Therefore, by transforming scores from other distributions to fit the z distribution, we can use what we know about this population to answer questions about scores from any normal distribution. The proportions of the distribution that occur between all the most common values of z and the mean are reported in Table 3.1.
Try It!
B. Table 3.1 has table values only for positive z scores. How do we interpret the value when z turns out to be negative?
See answer here.
Not all z tables are organized like Table 3.1. The way this table is prepared, it indicates the proportion of the population between any particular value of z and the mean of the distribution. The table is listed again as Table A in the Critical Values Tables Appendix.
The z value for a SoWhat score of 11 was calculated and left at two decimal values, and the z score table also goes to only two decimals. So, rounded to two decimals, the z value for a raw score of 11 = −0.52.
To interpret the z score, read the whole numbers and tenths (the first value to the right of the decimal) vertically down the left margin of the table. For the hundredths (the second value to the right of the decimal), move across the columns at the top of the table from left to right.
1. Read down the left margin for 0.5.
2. Read across the top for 0.02.
3. The table value where row and column intersect is 0.1985.
4. To determine the percentage of the distribution between z = −0.52 and the mean, multiply the table value by 100; 100 × 0.1985 = 19.85% of the distribution is between −0.52 and the population mean.
Note that all the z values in Table 3.1 are positive. Because the z score for any raw score below the mean (which is 0) has to be negative, the table has values for only half the distribution. But half is all that is needed because
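A table value can also be computed rather than looked up. The sketch below assumes Python's standard-library `NormalDist` class; the function name `table_value` is hypothetical. It returns the proportion between z and the mean, which is how Table 3.1 is organized, and it takes the absolute value of z so that a negative z is handled the same way as its positive counterpart.

```python
from statistics import NormalDist


def table_value(z: float) -> float:
    """Proportion of the standard normal distribution between z and the mean."""
    return NormalDist().cdf(abs(z)) - 0.5


value = table_value(-0.52)
print(f"table value for z = -0.52: {value:.4f}")
print(f"as a percentage:           {100 * value:.2f}%")
```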
a. the standard normal distribution is a normal distribution,
b. all normal distributions are symmetrical, and
c. the table indicates the proportion of a normal population between any value and the mean,
d. then the table value for −0.52 will be the same as the value for +0.52. From either −0.52 or +0.52 back to the mean must include the same proportion of the entire distribution.
Table 3.1: The z table

| z | 0.00 | 0.01 | 0.02 | 0.03 | 0.04 | 0.05 | 0.06 | 0.07 | 0.08 | 0.09 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.0 | 0.0000 | 0.0040 | 0.0080 | 0.0120 | 0.0160 | 0.0199 | 0.0239 | 0.0279 | 0.0319 | 0.0359 |
| 0.1 | 0.0398 | 0.0438 | 0.0478 | 0.0517 | 0.0557 | 0.0596 | 0.0636 | 0.0675 | 0.0714 | 0.0753 |
| 0.2 | 0.0793 | 0.0832 | 0.0871 | 0.0910 | 0.0948 | 0.0987 | 0.1026 | 0.1064 | 0.1103 | 0.1141 |
| 0.3 | 0.1179 | 0.1217 | 0.1255 | 0.1293 | 0.1331 | 0.1368 | 0.1406 | 0.1443 | 0.1480 | 0.1517 |
| 0.4 | 0.1554 | 0.1591 | 0.1628 | 0.1664 | 0.1700 | 0.1736 | 0.1772 | 0.1808 | 0.1844 | 0.1879 |
| 0.5 | 0.1915 | 0.1950 | 0.1985 | 0.2019 | 0.2054 | 0.2088 | 0.2123 | 0.2157 | 0.2190 | 0.2224 |
| 0.6 | 0.2257 | 0.2291 | 0.2324 | 0.2357 | 0.2389 | 0.2422 | 0.2454 | 0.2486 | 0.2517 | 0.2549 |
| 0.7 | 0.2580 | 0.2611 | 0.2642 | 0.2673 | 0.2704 | 0.2734 | 0.2764 | 0.2794 | 0.2823 | 0.2852 |
| 0.8 | 0.2881 | 0.2910 | 0.2939 | 0.2967 | 0.2995 | 0.3023 | 0.3051 | 0.3078 | 0.3106 | 0.3133 |
| 0.9 | 0.3159 | 0.3186 | 0.3212 | 0.3238 | 0.3264 | 0.3289 | 0.3315 | 0.3340 | 0.3365 | 0.3389 |
| 1.0 | 0.3413 | 0.3438 | 0.3461 | 0.3485 | 0.3508 | 0.3531 | 0.3554 | 0.3577 | 0.3599 | 0.3621 |
| 1.1 | 0.3643 | 0.3665 | 0.3686 | 0.3708 | 0.3729 | 0.3749 | 0.3770 | 0.3790 | 0.3810 | 0.3830 |
| 1.2 | 0.3849 | 0.3869 | 0.3888 | 0.3907 | 0.3925 | 0.3944 | 0.3962 | 0.3980 | 0.3997 | 0.4015 |
| 1.3 | 0.4032 | 0.4049 | 0.4066 | 0.4082 | 0.4099 | 0.4115 | 0.4131 | 0.4147 | 0.4162 | 0.4177 |
| 1.4 | 0.4192 | 0.4207 | 0.4222 | 0.4236 | 0.4251 | 0.4265 | 0.4279 | 0.4292 | 0.4306 | 0.4319 |
| 1.5 | 0.4332 | 0.4345 | 0.4357 | 0.4370 | 0.4382 | 0.4394 | 0.4406 | 0.4418 | 0.4429 | 0.4441 |
| 1.6 | 0.4452 | 0.4463 | 0.4474 | 0.4484 | 0.4495 | 0.4505 | 0.4515 | 0.4525 | 0.4535 | 0.4545 |
| 1.7 | 0.4554 | 0.4564 | 0.4573 | 0.4582 | 0.4591 | 0.4599 | 0.4608 | 0.4616 | 0.4625 | 0.4633 |
| 1.8 | 0.4641 | 0.4649 | 0.4656 | 0.4664 | 0.4671 | 0.4678 | 0.4686 | 0.4693 | 0.4699 | 0.4706 |
| 1.9 | 0.4713 | 0.4719 | 0.4726 | 0.4732 | 0.4738 | 0.4744 | 0.4750 | 0.4756 | 0.4761 | 0.4767 |
| 2.0 | 0.4772 | 0.4778 | 0.4783 | 0.4788 | 0.4793 | 0.4798 | 0.4803 | 0.4808 | 0.4812 | 0.4817 |
| 2.1 | 0.4821 | 0.4826 | 0.4830 | 0.4834 | 0.4838 | 0.4842 | 0.4846 | 0.4850 | 0.4854 | 0.4857 |
| 2.2 | 0.4861 | 0.4864 | 0.4868 | 0.4871 | 0.4875 | 0.4878 | 0.4881 | 0.4884 | 0.4887 | 0.4890 |
| 2.3 | 0.4893 | 0.4896 | 0.4898 | 0.4901 | 0.4904 | 0.4906 | 0.4909 | 0.4911 | 0.4913 | 0.4916 |
| 2.4 | 0.4918 | 0.4920 | 0.4922 | 0.4925 | 0.4927 | 0.4929 | 0.4931 | 0.4932 | 0.4934 | 0.4936 |
| 2.5 | 0.4938 | 0.4940 | 0.4941 | 0.4943 | 0.4945 | 0.4946 | 0.4948 | 0.4949 | 0.4951 | 0.4952 |
| 2.6 | 0.4953 | 0.4955 | 0.4956 | 0.4957 | 0.4959 | 0.4960 | 0.4961 | 0.4962 | 0.4963 | 0.4964 |
| 2.7 | 0.4965 | 0.4966 | 0.4967 | 0.4968 | 0.4969 | 0.4970 | 0.4971 | 0.4972 | 0.4973 | 0.4974 |
| 2.8 | 0.4974 | 0.4975 | 0.4976 | 0.4977 | 0.4977 | 0.4978 | 0.4979 | 0.4979 | 0.4980 | 0.4981 |
| 2.9 | 0.4981 | 0.4982 | 0.4982 | 0.4983 | 0.4984 | 0.4984 | 0.4985 | 0.4985 | 0.4986 | 0.4986 |
| 3.0 | 0.4987 | 0.4987 | 0.4987 | 0.4988 | 0.4988 | 0.4989 | 0.4989 | 0.4989 | 0.4990 | 0.4990 |

Source: StatSoft. (2011). Electronic statistics textbook. Tulsa, OK: StatSoft. Retrieved from http://www.statsoft.com/textbook/distribution-tables/-z
From z Values to Percentages
A proportion of 1.0 is 100%. The table value indicates that, of a total proportion under the curve of 1 or 100%, a proportion of 0.1985 of a normal distribution occurs between the point at which z = 0.52 (or z = −0.52) and the mean of the distribution where z = 0. Multiplying by 100 converts the proportion to a percentage, and in the case of z = −0.52, we find that 19.85% of any normal distribution occurs between that point and the mean.
Try It!
C. What is the largest possible value for z?
See answer here.
To reemphasize an earlier point, because all normal distributions are symmetrical, the percentage of the population between the mean and a positive value of z will be the same as the percentage between the mean and a negative z with the same absolute value. (The "absolute value" of a number refers to its magnitude without regard to the ± sign. Values of −2.0 and 2.0 have the same absolute value.) Because the z value in our example is negative, it indicates that 19.85% of the entire population occurs between the mean and that point to the left of the mean indicated by z = −0.5195.
Because 50% of the distribution occurs on either side of the mean,
· if 19.85% of the distribution is from a z of −0.52 back to the mean, the rest of the left half of the distribution, 30.15% (50 − 19.85), must occur below a z score of −0.52, as Figure 3.3 illustrates.
· Working backward, if the question is, what's the probability of scoring 11 or lower on the SoWhat, the answer is to turn the percentage below 11 (30.15%) back into a probability: 30.15/100 = 0.3015, or p = .3015.
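The steps for finding the probability of scoring 11 or lower can be sketched as follows. This is an illustration under stated assumptions: it uses the sample values M = 14.5 and s = 6.737 from the text, rounds z to two decimals to mirror the table lookup, and relies on Python's standard-library `NormalDist`; the variable names are made up for this example.

```python
from statistics import NormalDist

M, s = 14.5, 6.737  # SoWhat sample mean and standard deviation from the text

# Transform the raw score and round to two decimals, as the z table requires.
z = round((11 - M) / s, 2)

# Proportion between z and the mean (what Table 3.1 reports).
between_z_and_mean = NormalDist().cdf(abs(z)) - 0.5

# z is negative, so the area below it is the left half minus the table value.
p_below = 0.5 - between_z_and_mean

print(f"z = {z:.2f}")
print(f"between z and the mean: {100 * between_z_and_mean:.2f}%")
print(f"p(score of 11 or lower) = {p_below:.4f}")
```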
Figure 3.3: Calculation of proportions under the normal distribution curve
Source: Statistics, 4th Edition by David Freedman, Robert Pisani and Roger Purves. Copyright 2007, 1998, 1991, 1978 by W.W. Norton & Company, Inc. Used by permission of W.W. Norton & Company, Inc.
Once a raw score is transformed into a z score and you have the table value, the answers to a variety of related questions are also available.
The Percentage Between Two Scores on Opposite Sides of the Mean
If 5 and 25 are the most extreme apathy scores gathered in the sample of SoWhat scores, we might ask what percentage of the entire distribution will score between 5 and 25. Because those were the political scientist's lowest and highest scores, should the answer not be 100%? Remember that the collected data is from a sample. Although everyone in the sample scored between 5 and 25, it is entirely possible, even probable, that someone in the larger population will score either below 5 or above 25. The related z scores will help us determine how probable. We will proceed as follows:
1. Turn both 5 and 25 into z scores.
2. Determine the related table values for each z.
3. Turn the table values into percentages by multiplying each by 100.
4. Add the resulting percentages together to indicate the percentage of the population from 5 to 25.
The z score formula is
$$ z = \frac{x - M}{s} $$
With the subscript to z indicating the particular raw score and with M = 14.5 and s = 6.737, we have the following:
$$ z_{5} = \frac{5 - 14.5}{6.737} = -1.410 $$
which corresponds to p = .4207. A proportion of .4207 is a percentage of 42.07% (.4207 × 100).
$$ z_{25} = \frac{25 - 14.5}{6.737} = 1.559 = 1.56 $$
which corresponds to p = .4406. A proportion of .4406 is a percentage of 44.06% (.4406 × 100).
42.07 + 44.06 = 86.13% from 5 to 25. This is clearly not 100%!
Subtracting 86.13% from 100% will indicate the percentage of the distribution either lower than 5 or higher than 25. This result is indicated in Figure 3.4.
100 − 86.13 = 13.87% either below 5 or above 25
Figure 3.4: Areas under the curve beyond z = −1.41 and z = 1.56
This problem reminds us that the political scientist is dealing with sample data. The tails in any normal curve extend infinitely outward in either direction along the horizontal axis. Consequently, there is the possibility of a score more extreme than those already gathered, a score lower than the lowest score in the data set or higher than the highest score already measured. In a standard normal distribution, there is never a value of z that accounts for 100% of the distribution.
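The four steps for scores on opposite sides of the mean can be sketched as below. The example assumes the SoWhat values M = 14.5 and s = 6.737 from the text and rounds each z to two decimals to match the table; `table_value` is a hypothetical helper that computes what Table 3.1 reports.

```python
from statistics import NormalDist

M, s = 14.5, 6.737  # SoWhat sample mean and standard deviation from the text


def table_value(z: float) -> float:
    """Proportion of the standard normal distribution between z and the mean."""
    return NormalDist().cdf(abs(z)) - 0.5


# Step 1: turn both raw scores into z scores (two decimals, as in the table).
z_5 = round((5 - M) / s, 2)
z_25 = round((25 - M) / s, 2)

# Steps 2-4: look up each table value and add them, because the two scores
# lie on opposite sides of the mean.
between = table_value(z_5) + table_value(z_25)
outside = 1 - between

print(f"z for 5 = {z_5:.2f}, z for 25 = {z_25:.2f}")
print(f"between 5 and 25:        {100 * between:.2f}%")
print(f"below 5 or above 25:     {100 * outside:.2f}%")
```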
The Percentage of the Distribution Between z Scores With the Same Sign
In the preceding example, the question was about the percentage of the distribution between z scores on opposite sides of the mean, that is, z scores with opposite signs, one positive (z = 1.56) and the other negative (z = −1.41). Perhaps the researcher has a question about the percentage of the distribution between SoWhat scores of 15 and 20. With M = 14.5, both of these raw scores are higher than the mean and are going to have positive z values. If the z scores have the same sign, we must change a step from the last example to find the answer:
1. Turn both raw scores into z scores.
2. Determine the related table values for each z.
3. Turn the table values into percentages by multiplying by 100.
4. Subtract the smaller percentage from the larger one.
$$ z = \frac{x - M}{s} $$
$$ z_{15} = \frac{15 - 14.5}{6.737} = 0.0742 \text{, or } 0.07 $$
which corresponds to p = .0279. This is the proportion of the distribution from z = 0.07 to the mean of the distribution. The percentage of the distribution between 15 and M is 2.79% (.0279 × 100).
$$ z_{20} = \frac{20 - 14.5}{6.737} = 0.8164 \text{, or } 0.82 $$
which corresponds to p = .2939. This is the proportion of the distribution from z = 0.82 to the mean of the distribution. The percentage between 20 and M is 29.39% (.2939 × 100).
The two table values indicate the proportion between the respective value of z and the mean. When the z scores were on opposite sides of the mean, determining the amount of the distribution between them was a simple matter of adding the two table values. With both scores on the same side of the mean, however, the respective table values overlap. To determine what is between the two values, subtract the smaller proportion (which we converted into a percentage) from the larger to eliminate the percentage of the distribution common to both scores:
29.39 − 2.79 = 26.6% of the distribution will score between 15 and 20.
Figure 3.5 illustrates this result below.
Figure 3.5: Areas under the curve between z = 0.07 and z = 0.82
It is very helpful to draw a simple diagram like Figure 3.5 when trying to answer a question about the percentage of the distribution in a particular area. It will help clarify the question asked and simplify the logic involved in arriving at a conclusion.
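When software is available, the add-or-subtract decision can be avoided by working with cumulative proportions: the area between two scores is the proportion below the higher score minus the proportion below the lower one, regardless of which side of the mean they fall on. The sketch below is an illustration, not the table-based method of the text; it assumes Python's standard-library `NormalDist`, rounds z to two decimals so the results line up with Table 3.1, and uses a hypothetical function name.

```python
from statistics import NormalDist


def proportion_between(x_low: float, x_high: float, m: float, sd: float) -> float:
    """Proportion of a normal distribution between two raw scores.

    z scores are rounded to two decimals to mirror a table lookup.
    """
    z_low = round((x_low - m) / sd, 2)
    z_high = round((x_high - m) / sd, 2)
    standard_normal = NormalDist()
    return standard_normal.cdf(z_high) - standard_normal.cdf(z_low)


M, s = 14.5, 6.737  # SoWhat sample mean and standard deviation from the text

same_side = proportion_between(15, 20, M, s)       # both scores above the mean
opposite_sides = proportion_between(5, 25, M, s)   # scores on either side

print(f"between 15 and 20: {100 * same_side:.2f}%")
print(f"between 5 and 25:  {100 * opposite_sides:.2f}%")
```

The same function reproduces both worked examples, which is one reason statistical software typically reports cumulative proportions rather than mean-to-z table values.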
Apply It!
Quality Control
A bottling company uses an automated machine to fill 4-liter plastic containers with orange juice. Bottles filled to less than 95% of the listed amount (3.80 liters) must be rejected or the company risks fines. Containers filled above 4.30 liters overflow and must also be rejected. The equipment engineer thinks that too many bottles are being rejected, costing the company money, and would like to get to a rejection rate of less than 1% of all containers.
Over the course of several months, the engineer plots values for the amount of orange juice in each container. He finds that the average fill amount is 4.00 liters. The process results in fill amounts that are normally distributed, with a standard deviation of 0.12 liters. The engineer wants to know what percentage of the bottles will be filled below 3.80 liters and what percentage above 4.30 liters. Because the raw data is normally distributed, the engineer can use the z transformation formula.
To find out what proportion of the bottles will be filled below 3.80 liters:
x = 3.80 liters
M = 4.00 liters
s = 0.12 liters
z3.80 = (3.80 − 4.00)/0.12 = −1.67
Table 3.1 shows that this z score corresponds to a proportion of 0.4525. Therefore, 0.50 − 0.4525 = 0.0475, or 4.75% of the bottles will be below 3.80 liters.
The proportion filled above 4.30 liters is
z4.30 = (4.30 − 4.00)/0.12 = 2.5
Table 3.1 shows that this z score corresponds to a proportion of 0.4938. Therefore, 0.50 − 0.4938 = 0.0062, or 0.62% of the bottles will be above 4.30 liters.
Clearly, this is unacceptable because 4.75% + 0.62% = 5.37% of all containers will be rejected. The engineer calls the machine manufacturer, who informs him they can overhaul and recalibrate the machine. After the overhaul, they guarantee that the fill amount will be normally distributed, with a standard deviation of 0.09 liters or less. The engineer decides to have the machine recalibrated and to fill the bottles to 4.05 liters (instead of 4.00 liters). The engineer wants to know what his new rejection rate would be.
The percentage filled below 3.80 liters is
z = (3.80 − 4.05)/0.09 = −2.78
This corresponds to a proportion of 0.4973 from Table 3.1. Therefore, 0.50 − 0.4973 = 0.0027, or 0.27% of the bottles will be below 3.80 liters.
The proportion filled above 4.30 liters is
z = (4.30 − 4.05)/0.09 = 2.78
This has the same absolute value as the z score for the percentage filled below 3.80 liters, so 0.27% of the bottles will be more than 4.30 liters.
In this case, the total rejected number of containers is 0.27% + 0.27% = 0.54%, which is less than the required 1% rejection rate. Therefore, recalibrating the machine so that the standard deviation is 0.09 liters and increasing the mean fill level from 4.00 to 4.05 liters will meet the objective.
Apply It! boxes written by Shawn Murphy
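The engineer's calculation in the box above can be sketched in code. This is an illustration only, assuming normally distributed fill amounts and Python's standard-library `NormalDist`; the function name `rejection_rate` and its parameter names are hypothetical. Because the code does not round z to two decimals, its results can differ slightly from the table-based percentages.

```python
from statistics import NormalDist


def rejection_rate(mean_fill: float, sd_fill: float,
                   low: float = 3.80, high: float = 4.30) -> float:
    """Proportion of bottles filled below `low` or above `high` liters."""
    fill = NormalDist(mu=mean_fill, sigma=sd_fill)
    below = fill.cdf(low)        # underfilled bottles
    above = 1 - fill.cdf(high)   # overfilled bottles
    return below + above


total_before = rejection_rate(4.00, 0.12)  # original machine settings
total_after = rejection_rate(4.05, 0.09)   # after overhaul and recalibration

print(f"rejection rate before: {100 * total_before:.2f}%")
print(f"rejection rate after:  {100 * total_after:.2f}%")
print(f"meets the 1% target:   {total_after < 0.01}")
```

Centering the mean at 4.05 liters places it midway between the two limits, so the reduced standard deviation trims both tails equally.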
Guidelines for Determining How Much of the Distribution Is Under Areas of the Curve
The related questions can usually be answered without a rule to follow, but for the sake of order and clarity, here are some guidelines for answering the different questions that might be asked:
A. If the question is about the proportion of the population below a score,
1. determine the z score, and
2. determine the proportion from Table 3.1.
3. If the z value is positive, add the table value to 0.50 so that the lower half of the distribution is included.
4. For a negative z score, subtract the table value from 0.50.
B. If the question is about the proportion of the population above a score,
1. determine z, and
2. look up the proportion associated with z in Table 3.1.
3. If the z value is positive, subtract the proportion from 0.50.
4. If the value of z is negative, add the proportion to 0.50.
C. If the question is about the proportion of the distribution between two scores, and the scores are in different halves of the distribution,
1. calculate z for each score,
2. determine the appropriate values for z from Table 3.1, and
3. sum the table values.
If the scores are in the same half of the distribution,
1. calculate the z value for both scores,
2. determine the appropriate values for z from Table 3.1, and
3. subtract the smaller table value from the larger.
D. If the question is about the percentage of the distribution either above or below two scores, and the two scores are in different halves of the distribution,
1. calculate z for each score,
2. determine the Table 3.1 value for each z score,
3. subtract each of the two table values from 0.50, and
4. sum the results.
If the scores are in the same half of the distribution,
1. calculate z for each score,
2. determine the Table 3.1 value for each z score,
3. subtract the smaller table value from the larger, and
4. subtract the difference from 1.00.
Try It!
D. How does the z transformation allow you to figuratively compare apples to oranges?
See answer here.
The list of steps must seem like a great deal to remember. In fact, the better course when confronted with a z score problem is to sketch out a distribution to produce something like Figures 3.3 and 3.4. They help clarify the question and then determine the steps needed to answer it. A moment or two of quiet study will usually suggest the process needed to answer any of the questions we have addressed here.
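For readers who prefer to see guidelines A through D written out as procedures, here is one possible sketch. It is an assumption-laden illustration, not the only way to organize the logic: the function names are hypothetical, and the table lookup is replaced by a calculation with Python's standard-library `NormalDist`.

```python
from statistics import NormalDist


def table_value(z: float) -> float:
    """Proportion between z and the mean, as Table 3.1 reports it."""
    return NormalDist().cdf(abs(z)) - 0.5


def proportion_below(z: float) -> float:
    """Guideline A: add to 0.50 for a positive z, subtract for a negative z."""
    return 0.5 + table_value(z) if z >= 0 else 0.5 - table_value(z)


def proportion_above(z: float) -> float:
    """Guideline B: subtract from 0.50 for a positive z, add for a negative z."""
    return 0.5 - table_value(z) if z >= 0 else 0.5 + table_value(z)


def proportion_between(z1: float, z2: float) -> float:
    """Guideline C: sum the table values for opposite signs, else subtract."""
    if z1 * z2 < 0:  # scores in different halves of the distribution
        return table_value(z1) + table_value(z2)
    return abs(table_value(z1) - table_value(z2))


def proportion_outside(z1: float, z2: float) -> float:
    """Guideline D: everything not between the two scores."""
    return 1 - proportion_between(z1, z2)


# Checks against the worked examples in the text.
print(f"below z = -0.52:           {proportion_below(-0.52):.4f}")
print(f"between z = -1.41 and 1.56: {proportion_between(-1.41, 1.56):.4f}")
print(f"between z = 0.07 and 0.82:  {proportion_between(0.07, 0.82):.4f}")
print(f"outside z = -1.41 and 1.56: {proportion_outside(-1.41, 1.56):.4f}")
```

Guideline D reduces to one minus the proportion between the two scores in both cases, which is why a single line covers it here.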
Comparing Data From Different Tests
One of the questions that came up when the standard normal distribution was introduced earlier was how test scores from two different instruments with different means and standard deviations can be compared. Perhaps a juvenile gang member under court-ordered counseling takes two different tests, a test of aggression and a test of social alienation.
· The aggression measure has M = 32.55 and s = 5.82.
· The alienation test has M = 12.92 and s = 2.67.
· The client scores 39 on the aggression test and 15 on the alienation test.
The question is, which is the more extreme score?
In both cases, the juvenile scored beyond the mean, suggesting higher-than-average aggression and also higher-than-average social alienation. Because the two tests have different means and standard deviations, comparing the raw scores directly is not helpful. However, if both scores are transformed into z scores, a direct comparison is possible because the transformation converts both scores into that distribution where the mean is 0 and the standard deviation is 1.0.
$$ z = \frac{x - M}{s} $$
First, the z for the aggression score:
$$ z_{39} = \frac{39 - 32.554}{5.824} = 1.11 $$
Then the z for social alienation:
$$ z_{15} = \frac{15 - 12.917}{2.674} = 0.78 $$
Interpreting Multiple z Values
There's no need for table values here. The z value indicates how distant a raw score is from its mean (the numerator) in standard deviation units (the denominator). The aggression value is 1.11 standard deviations from the mean of the distribution. The alienation score is just 0.78 standard deviations from its mean. The aggression score is more extreme than the alienation score.
As long as means and standard deviations are accessible for each score, they allow for the direct comparison of entirely dissimilar characteristics. They allow us to transform what would otherwise be "apples to oranges" both into measures with means of 0 and standard deviations of 1.0.
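The comparison of how extreme two scores are can be sketched as follows. The example uses the unrounded means and standard deviations that appear in the formulas above (32.554 and 5.824; 12.917 and 2.674); the dictionary layout and names are assumptions made for illustration.

```python
def z_score(x: float, m: float, sd: float) -> float:
    """Distance of a raw score from its mean, in standard deviation units."""
    return (x - m) / sd


# (raw score, mean, standard deviation) for each test, from the text.
tests = {
    "aggression": (39, 32.554, 5.824),
    "alienation": (15, 12.917, 2.674),
}

z_values = {name: z_score(*values) for name, values in tests.items()}

# "More extreme" means farther from the mean in either direction,
# so the comparison uses the absolute value of z.
most_extreme = max(z_values, key=lambda name: abs(z_values[name]))

for name, z in z_values.items():
    print(f"{name}: z = {z:.2f}")
print(f"more extreme score: {most_extreme}")
```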
Another Comparison
Perhaps like Lewis Terman, who developed the Stanford-Binet intelligence test, a psychologist is interested in giftedness among children. Because unusual verbal ability often seems to accompany superior intelligence in gifted children, the psychologist measures both characteristics for a group of subjects. For one particular candidate, the intelligence score is 140 and the verbal ability measure is 55. The descriptive data for each test is as follows:

| Test | Mean | Standard Deviation |
|---|---|---|
| Intelligence test | 100 | 15 |
| Verbal ability measure | 40 | 5.45 |

As with the first example, both scores must be turned into z scores before they can be directly compared.
|
z=x−Ms |
For the intelligence score,
$$z_{140} = \frac{140 - 100}{15} = 2.67$$
For the verbal ability measure,
$$z_{55} = \frac{55 - 40}{5.451} = 2.75$$
The z scores indicate that both test scores are about the same distance from their respective means. This makes it more difficult to just glance at the raw scores and know which is higher. But because both have been transformed into z scores, both measures now belong to a common distribution, and it's apparent that the verbal ability measure is slightly higher than the intelligence score.
Apply It!
Making Sense of Quantitative Ratings
Every year, there is a local singing contest that is judged by the same three judges. The judges are told to rate each singer on a 100-point scale. To save time, each judge only judges six of the nine singers. The scores for each judge tend to be normally distributed. However, one of the judges (Judge A) tends to give very low mean scores, and another judge (Judge B) tends to have a lot of variability in his scores, as shown by the large standard deviation.
Obviously, the singers judged by Judge A would be less likely to have a higher average score. One way to cancel out this variability is to turn all the scores into z scores first and then average them.
For this year's contest, there were nine singers, labeled 1–9. Scores for Judge A, Judge B, and Judge C are given in the table below, along with the average of the available scores for each singer. The table also shows the mean and standard deviations for all scores of each judge.
| Singer | Judge A | Judge B | Judge C | Mean Rating |
| 1 | 42 | 60 | — | 51.0 |
| 2 | 48 | 74 | — | 61.0 |
| 3 | 46 | 74 | — | 60.0 |
| 4 | 54 | — | 84 | 69.0 |
| 5 | 43 | — | 69 | 56.0 |
| 6 | 63 | — | 86 | 74.5 |
| 7 | — | 90 | 82 | 86.0 |
| 8 | — | 70 | 72 | 71.0 |
| 9 | — | 80 | 75 | 77.5 |
| Mean | 49.33 | 74.67 | 78 | 67.33 |
| Standard Deviation | 7.94 | 10.01 | 6.96 | 11.23 |
If the raw average of the scores were used, Singer 7 would take first place, Singer 9 second, and Singer 6 third. Note that neither the first- nor the second-place finisher was judged by Judge A.
A better way to aggregate these ratings is to convert the scores into a common metric by using the z transformation. The transformed z scores are shown in the table below. Each judge's scores now have a mean of 0 and a standard deviation of 1.
| Singer | Judge A | Judge B | Judge C | Mean Rating |
| 1 | −0.92 | −1.46 | — | −1.19 |
| 2 | −0.17 | −0.07 | — | −0.12 |
| 3 | −0.42 | −0.07 | — | −0.24 |
| 4 | 0.59 | — | 0.86 | 0.73 |
| 5 | −0.80 | — | −1.29 | −1.05 |
| 6 | 1.72 | — | 1.15 | 1.44 |
| 7 | — | 1.53 | 0.57 | 1.05 |
| 8 | — | −0.47 | −0.86 | −0.66 |
| 9 | — | 0.53 | −0.43 | 0.05 |
The rankings according to the average raw and z scores are shown in the table below.
| Contest Place | Singer (raw scores) | Mean raw score | Singer (z scores) | Mean z score |
| First | 7 | 86.0 | 6 | 1.44 |
| Second | 9 | 77.5 | 7 | 1.05 |
| Third | 6 | 74.5 | 4 | 0.73 |
| Fourth | 8 | 71.0 | 9 | 0.05 |
| Fifth | 4 | 69.0 | 2 | −0.12 |
| Sixth | 2 | 61.0 | 3 | −0.24 |
| Seventh | 3 | 60.0 | 8 | −0.66 |
| Eighth | 5 | 56.0 | 5 | −1.05 |
| Ninth | 1 | 51.0 | 1 | −1.19 |
As you can see, the ordered ranking changes when using the z scores. This method is used to transform scores from the three different judges to a common metric, so that they can then be averaged. Converting to z scores before averaging scores removes the variability due to the judges and makes the final score more just.
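The procedure in this box can be reproduced with a short Python sketch. The ratings are copied from the raw-score table; the data layout (a dictionary of lists with `None` for unrated singers) and the helper names `standardize` and `singer_average` are assumptions made for illustration, not part of the original example.

```python
from statistics import mean, stdev

# Raw ratings from the table above; None marks a singer the judge did not rate.
ratings = {
    "A": [42, 48, 46, 54, 43, 63, None, None, None],
    "B": [60, 74, 74, None, None, None, 90, 70, 80],
    "C": [None, None, None, 84, 69, 86, 82, 72, 75],
}

def standardize(scores):
    """Convert one judge's ratings to z scores, skipping missing ratings."""
    rated = [s for s in scores if s is not None]
    m, sd = mean(rated), stdev(rated)  # sample standard deviation, as in the table
    return [None if s is None else (s - m) / sd for s in scores]

z_by_judge = {judge: standardize(scores) for judge, scores in ratings.items()}

def singer_average(table, index):
    """Average the available values for one singer across judges."""
    return mean(col[index] for col in table.values() if col[index] is not None)

raw_avg = {i + 1: singer_average(ratings, i) for i in range(9)}
z_avg = {i + 1: singer_average(z_by_judge, i) for i in range(9)}

# Singer numbers ordered from first place to ninth place.
raw_rank = sorted(raw_avg, key=raw_avg.get, reverse=True)
z_rank = sorted(z_avg, key=z_avg.get, reverse=True)

print("raw order:", raw_rank)
print("z order:  ", z_rank)
```

Because each judge's ratings are standardized against that judge's own mean and standard deviation, a lenient or erratic judge no longer shifts a singer's average.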
Apply It! boxes written by Shawn Murphy
3.3 Working From Percentages Back to z Scores
To this point, the task has been to work from raw scores to z scores, and then to percentages or proportions of the distribution in specified areas. If the percentages are already available but neither the raw data nor the related descriptive statistics are, you can use Table 3.1 and work backward to determine the z value even without the mean and standard deviation for the data.
Perhaps published data indicates that only 1% of the population have intelligence scores above 140. What z score does this represent?
1. Because Table 3.1 uses proportions, the first step is to turn the percentage into a proportion: 1% is 1/100, which is the same as a proportion of .01.
2. Table 3.1 indicates the proportion of a normal population between a particular value of z and the mean for half (0.5) of the distribution. Therefore, we need a z value that includes all but that most extreme .01, which will be the z value for a proportion of .50 − .01 = .49. Whatever this z value is, it will exclude the highest .01 of the distribution.
Table 3.1 does not list a proportion of exactly .49, but it does list .4901, which is very close. Reading backward from the proportion to the left margin and also vertically to the column heading, the associated z value for .4901 is 2.33. If data were gathered for intelligence scores, a z = 2.33 excludes the top .01 or 1%. This is illustrated in Figure 3.6.
Figure 3.6: The value of z associated with a particular proportion (proportion = .4901, z value = 2.33)
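The backward lookup can also be checked without a printed table. The sketch below assumes Python 3.8 or later, where the standard library's `statistics.NormalDist` provides an inverse cumulative distribution function; the variable names are illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

top_proportion = 0.01   # the top 1% from the example
z_cutoff = standard_normal.inv_cdf(1 - top_proportion)
print(round(z_cutoff, 2))   # 2.33

# Proportion between the mean and that z, comparable to the table entry.
print(round(standard_normal.cdf(z_cutoff) - 0.5, 4))   # 0.49
```

The unrounded result is slightly below 2.33 because the table lists .4901 rather than exactly .49.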
Converting z Scores to Percentile Ranks
Percentile scores were introduced in Chapter 2. Recall that percentiles indicate the point below which a specified percentage of the group occurs. Seventy-three percent of the distribution occurs at or below the point defined by the 73rd percentile, and so on. Because the table values associated with z scores can be used to determine the percentage of the distribution occurring below a point, it is not difficult to go one more step and turn that percentage into a percentile score. For example,
· Because z = 1.0 includes 34.13% between that point and the mean, and
· because that part of the distribution from the mean downward is 50%,
· 34.13 + 50 = 84.13% of scores are at or below z = 1.0,
· which means that z = 1.0 occurs at the 84th percentile.
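The same steps can be expressed as a small function. This is a sketch that assumes a normal distribution and uses the standard library's `statistics.NormalDist`; the name `percentile_rank` is hypothetical.

```python
from statistics import NormalDist

def percentile_rank(z):
    """Percentage of a normal distribution at or below the given z score."""
    return NormalDist().cdf(z) * 100

print(round(percentile_rank(1.0), 2))   # 84.13
print(round(percentile_rank(0.0), 2))   # 50.0
```

The cumulative proportion already includes the 50% of the distribution below the mean, so no separate addition is needed.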
Although percentile scores can be easily determined from the table values that are associated with z scores, there is an important difference between percentile scores and z scores. The z score is one of several standard scores. Standard scores are all equal-interval scores: the interval between consecutive integers is constant, which means that, in terms of data scale, standard scores are interval scale. The increase in whatever is measured from z = −1.5 to z = −1.0 is the same as it is from z = 0.3 to z = 0.8. The increase is 0.5 in either case.
This is not the case for percentile scores. Because these scores indicate the percentage of scores below a point rather than being a direct measure of some characteristic, the distances between consecutive scores differ widely in various parts of the distribution. Most of the data in any normal distribution is in the middle portion where scores have the greatest frequency. The frequency diminishes as scores become more distant from the mean, something reflected in curves that are vertically highest in the middle and that then decline toward the two tails.
As a result of high frequency in the middle of the distribution, the magnitude of the difference in whatever is measured between, for example, the 49th and the 50th percentiles is likely to be much smaller than the difference between the 9th and 10th percentiles or between the 91st and 92nd percentiles. Consecutive percentile scores in either tail of a normal distribution will always indicate a much greater difference in whatever is measured than will consecutive percentile scores near the mean. Percentile scores are ordinal scale, whereas z scores are interval scale.
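To see how unevenly percentiles are spaced, the following sketch measures the distance in z units between consecutive percentile ranks. It assumes a normal distribution and uses `statistics.NormalDist` from the standard library; the helper name `z_gap` is hypothetical.

```python
from statistics import NormalDist

def z_gap(lower_pct, upper_pct):
    """Distance in z units between two percentile ranks of a normal distribution."""
    nd = NormalDist()
    return nd.inv_cdf(upper_pct / 100) - nd.inv_cdf(lower_pct / 100)

middle = z_gap(49, 50)       # one percentile step near the mean
lower_tail = z_gap(9, 10)    # one percentile step in the lower tail
upper_tail = z_gap(91, 92)   # one percentile step in the upper tail

print(f"{middle:.3f} {lower_tail:.3f} {upper_tail:.3f}")
```

Each of the two tail steps covers more than twice the z distance of the step at the middle, which is why percentile ranks cannot be treated as equal-interval scores.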
From z to Other Standard Scores
The appeal of the z score is that it translates relative performance so readily. Anyone familiar with z scores knows that someone with a score that transforms into a positive z value has scored in the upper half of the distribution. Someone who scores one standard deviation beyond the mean has scored at the 84th percentile, and so on. Because z scores communicate so readily, a number of other standard scores have also been developed, although the z score is easily the most commonly used in any kind of statistical analysis.
Because those who work with them would rather not report negative values, sometimes a T score is substituted for z. Like z, T has a fixed mean and standard deviation, and once z is calculated, it is easy to get to T. In fact, this is true for any score that has a fixed mean and standard deviation, whether it is a standard score like T or a score from the Graduate Record Exam (GRE), which also has a fixed mean and standard deviation.
| | Mean | Standard Deviation |
| T scores | 50 | 10 |
| Graduate Record Exam | 100 | 25 |
Either score can be derived from z. To go from z to T, for example, the z score is multiplied by 10, and 50 is added to the result. So,
if z = 1.75,
T = (10 × 1.75) + 50 = 17.5 + 50 = 67.50
The same pattern is followed for GRE scores: the z score is multiplied by the GRE's standard deviation (25), and the mean (100) is added to the result. Again,
if z = 1.75,
GRE = (25 × 1.75) + 100 = 43.75 + 100 = 143.75
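Both conversions follow one rule: multiply z by the target scale's standard deviation, then add the target scale's mean. A minimal sketch, with `from_z` as a hypothetical helper name and the means and standard deviations taken from the table above:

```python
def from_z(z, mean, sd):
    """Rescale a z score to a scale with the given fixed mean and standard deviation."""
    return z * sd + mean

z = 1.75
print(from_z(z, mean=50, sd=10))    # T score: 67.5
print(from_z(z, mean=100, sd=25))   # score on the table's GRE scale: 143.75
```

Keeping the mean and standard deviation as parameters lets the same function serve any standard score with fixed descriptive values.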
Try It!
E What makes a score a standard score?
One important note in performing any type of transformation is that all standardized scores are linear transformations. That is, the scores are transformed using a mathematical equation, as seen in the z and T scores, and this does not make the distribution normal. In fact, the shape of the z score or T score distribution will be identical to that of the raw scores. There are many other types of standardized scores for various purposes, as shown in Figure 3.7.
Figure 3.7: Different types of standardized scores
Source: http://upload.wikimedia.org/wikipedia/commons/b/bb/Normal_distribution_and_scales.gif
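The point that a linear transformation leaves the shape of a distribution unchanged can be checked numerically. The sketch below uses a small, made-up right-skewed data set (the values are hypothetical, not from the chapter) and a simple population skewness measure; `skewness` is an illustrative helper, not a standard library function.

```python
from statistics import mean, pstdev

def skewness(values):
    """Population skewness: the average cubed z score."""
    m, sd = mean(values), pstdev(values)
    return mean(((v - m) / sd) ** 3 for v in values)

raw = [1, 1, 2, 2, 3, 4, 6, 9, 15]   # hypothetical, right-skewed data
m, sd = mean(raw), pstdev(raw)
z_scores = [(v - m) / sd for v in raw]
t_scores = [10 * z + 50 for z in z_scores]

# All three values agree (up to rounding), so the skew survives the transformation.
print(round(skewness(raw), 4), round(skewness(z_scores), 4), round(skewness(t_scores), 4))
```

Skewed raw scores therefore produce equally skewed z and T scores; standardizing changes the mean and standard deviation, not the shape.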
3.4 Using Excel for the z Score Transformation
The z score transformation is a fairly simple formula. As a result, it is not difficult to program it into Excel and transform an entire data set into z scores. In fact, there are several ways to do this, but we'll explore just one. It involves programming the z score transformation formula directly into the data sheet.
Interested in the relationship between poverty and achievement motivation among secondary school-aged young people, a researcher gathers achievement motivation data from a group of students whose families qualify for free and reduced-price lunches at school. The achievement motivation scores are as follows:
4, 5, 7, 7, 8, 9, 9, 9, 10, 13
To use Excel to transform those data into their z score equivalents,
1. list the data in Excel in Column B, with the label "Ach Mot" in B1.
2. Enter the 10 scores into cells B2 to B11.
3. In cell B12, enter the formula =average(B2:B11).
a. The equal sign indicates that a formula follows.
b. The command average will provide the arithmetic mean.
c. When several cells are to be included in the function, they are placed in parentheses ( ). When the cells are consecutive, the colon (:) indicates that all cells from B2 to B11 are included in the function.
d. Press Enter.
e. In cell A12, enter the label "mean =."
The value in cell B12 will be 8.1, the mean of the achievement motivation scores.
4. In cell B13, enter the formula =stdev(B2:B11).
a. stdev is the Excel abbreviation for "sample standard deviation."
b. Press Enter.
c. In cell A13, enter the label "std dev =."
The value in cell B13 will be 2.558211, the standard deviation of the scores.
5. In cell C1, enter the label "equiv z."
6. In cell C2, enter the formula =(B2-8.1)/2.558 and press Enter.
a. Consistent with the z score transformation, this formula subtracts the mean from the raw score in cell B2 and then divides the result by the standard deviation, 2.558, which we rounded to three decimals.
b. Now the task is to repeat that operation for all the other scores without having to reenter the formula nine more times. Steps 7 and 8 detail how to do this.
7. With the cursor in cell C2, click and drag the cursor down from C2 to C11 so that cells C2 to C11 are highlighted.
8. In the Editing section at the top of the page near the right side, there is a Fill command with a down-arrow beside it. Click on the down-arrow next to the Fill command, and click Down. This will repeat the formula in C2 for the other nine cells, adjusting for the different test score in each cell.
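For readers who want to verify the spreadsheet results, the same calculation can be sketched in Python. This is offered as a cross-check rather than as part of the Excel procedure; the variable names are illustrative, and `statistics.stdev` computes the sample standard deviation, matching what the steps above describe for =stdev.

```python
from statistics import mean, stdev

ach_mot = [4, 5, 7, 7, 8, 9, 9, 9, 10, 13]

m = mean(ach_mot)     # plays the role of =average(B2:B11)
sd = stdev(ach_mot)   # sample standard deviation, like =stdev(B2:B11)

# One z score per raw score, the equivalent of filling the formula down column C.
equiv_z = [(score - m) / sd for score in ach_mot]

print(m, round(sd, 6))
for score, z in zip(ach_mot, equiv_z):
    print(score, round(z, 2))
```

Unlike the spreadsheet formula, which uses the standard deviation rounded to 2.558, this sketch uses the unrounded value, so results may differ slightly in the later decimal places.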
3.6 Presenting Results
As with Excel, SPSS presents the z scores as another variable in its own column. Using the data from Figure 3.8, in this data set (M = 41.63, SD = 9.83) an age of 30 years corresponds to a z score of −1.18, which is a little more than one standard deviation below the mean. An age of 64 years is more than two standard deviations above the mean at a z score of 2.28. Standardized scores give us a quick look at where the raw scores are in relation to the mean, providing us with the ability to compare scores to each other in the distribution and compare the scores to other variables that have been standardized.
Figure 3.8: SPSS input with z scores
3.7 Interpreting Results
Though you should refer to the most recent edition of the APA manual for specific detail on formatting statistics, the information in Table 3.2 may be used as a quick guide in interpreting and reporting the statistics covered in this chapter.
Table 3.2: Guide to APA formatting of standardized scores
| Abbreviation or Term | Description |
| T | T distribution score (M = 50, SD = 10) |
| z | standard score; z distribution score (M = 0, SD = 1) |
Source: Publication Manual of the American Psychological Association, 6th edition. ©2009 American Psychological Association, pp. 119–122.
Note that T and z are always italicized as they represent the standardized score for a T or z distribution. The following are some examples of how to interpret and report the results using these abbreviations, though you may use different combinations of results. These examples utilize the data presented in the Apply It! Quality Control section.
· The average fill amount was 4.00 liters (SD = 0.12), with 4.75% of bottles falling below the reject level of 3.80 liters (z = −1.67).
· After calibration to a standard deviation of 0.09 (M = 4.00), the percentage of bottles that fall below the reject level is 0.27% (x = 3.80, z = −2.78).
Using the data from Figure 3.8, we could report the results in the following way:
The average age was 41.63 (SD = 9.83). An age of 30 is over one standard deviation below the mean (z = −1.18), whereas an age of 64 is over two standard deviations above the mean (z = 2.28).
Summary
Not all data is normally distributed. However, people in the social sciences in general, and in psychology in particular, have an advantage. Many of the traits and conditions we are interested in are normally distributed, which means that, at least in populations, they share descriptive characteristics that are consistent even when the thing measured changes. By now, these characteristics are probably familiar. Normal distributions are unimodal, they are symmetrical, and their standard deviations tend to be about one-sixth of the range. Because these elements tend to be constant in normal distributions, we can have some confidence about how the different scores will be arrayed even before we view a display of the data.
What the standard normal distribution, or z distribution, does is capitalize on the consistency in normally distributed populations by offering one distribution to which all other normal populations can be referenced. In this distribution, where the mean is always 0 and the standard deviation 1.0 (Objective 1), table values indicate the proportions of the population likely to occur anywhere along its range. By transforming raw scores (Objective 2) from any normal population so that they fit this z distribution, we can take advantage of how well the characteristics of this distribution are known and answer important questions about data from any population (Objective 3) in terms of z:
· For example, when someone scores at a particular level, we can ask what proportion of the entire population is likely to score below (or above) that point.
· When most of the people in a particular group score between two points, we can ask what proportion of the entire population will score between (or outside) those points.
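Both kinds of question can be answered directly from a normal model. The sketch below reuses the intelligence-test parameters from earlier in the chapter (M = 100, SD = 15) and assumes the scores are normally distributed; the cut points of 85 and 115 are chosen here only for illustration.

```python
from statistics import NormalDist

# Intelligence-test parameters used earlier in the chapter (M = 100, SD = 15).
iq = NormalDist(mu=100, sigma=15)

below_140 = iq.cdf(140)                     # proportion scoring below 140
above_140 = 1 - below_140                   # proportion scoring above 140
between_85_115 = iq.cdf(115) - iq.cdf(85)   # proportion within one SD of the mean
outside_85_115 = 1 - between_85_115         # proportion outside those points

print(round(below_140, 4), round(above_140, 4))
print(round(between_85_115, 4), round(outside_85_115, 4))
```

Each "above" or "outside" proportion is simply the complement of the corresponding "below" or "between" proportion, because the whole distribution sums to 1.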
Because the z score transformation is a relatively simple formula, it is not difficult to program Excel to produce the z equivalents for any set of scores (Objective 4), something that can be helpful with large data sets.
The z is one of several standard scores in fairly common use. All standard scores are normative scores. Rather than how much of the measured quality an individual possesses, they all indicate how one person's score compares to any others for whom there is data. Those who prefer not to deal in negative values sometimes opt for T scores over z scores. In all material respects, T is the same as z, except that the mean is 50 and the standard deviation is 10.
The modified standard score (Objective 6) enhances the ability of standard scores to communicate an individual's standing relative to a population. Standard scores are often used to report the data from standardized tests, but these tests are revised from time to time, which can affect the test means and standard deviations. For the sake of stability over time, the modified standard score uses the z transformation as a way to maintain constant descriptive characteristics. Presenting results (Objective 7) and interpreting results in APA format (Objective 8) as they relate to describing T and z scores are important pieces of utilizing and writing about statistical data.
In the incremental nature of statistics books, each chapter is preface to the next. Chapters 1–3 are preface to Chapter 4. With all our effort to label, display, and describe data sets, the focus in the discussion of z scores and the like has been primarily about analyzing the performance of individuals. Nevertheless, behavioral scientists are generally much more interested in asking questions about groups. Analyzing how groups compare to specified populations is our direction in Chapter 4, where the z distribution will be expanded to discussions about the probability of selecting groups with particular characteristics.
The beauty of what we have just done in this chapter is that the math and the logic involved will be much the same in the next chapter. If the discussion in this chapter makes sense, the materials in Chapter 4 will not be difficult. However, the way we have discussed some of the concepts in this chapter may be new, and if they are, take some time to restudy the material, recalculate the sample problems worked in the chapter, and then work the problems that are at the end. There is great value in repetition, so do not be reluctant to repeat calculations that you have completed once.