Case study Statistics
Chapter 1
[SPSS output window: histogram of socint (horizontal axis from 0.00 to 50.00). Mean = 17.39, Std. Dev. = 11.55, N = 94.]
Figure 1-20 SPSS histogram for the social interactions example. (Data from
McLaughlin-Volpe et al., 2001.)
Most research articles follow the procedure we recommend here: going from lowest at the top to highest at the bottom. However, some statistics authorities recommend going from highest at the top to lowest at the bottom.
Displaying the Order in a Group of Numbers
[SPSS Viewer output window: frequency table for socint, with columns for Frequency, Percent, Valid Percent, and Cumulative Percent.]
Figure 1-19 SPSS frequency table for the social interactions example. (Data from
McLaughlin-Volpe et al., 2001.)
Practice these steps by creating a histogram for the social interactions example in this chapter (the scores are listed on p. 8). Your output window should look like Figure 1-20. Notice that SPSS automatically creates a histogram based on a grouped frequency table, with an interval in this case of 3 (1-3, 4-6, 7-9, and so on). (Should you wish, you can change the number of intervals or the interval size for the histogram by doing the following: Place your mouse cursor on the histogram and double-click to bring up a Chart Editor window; place your mouse cursor over one of the bars in the histogram and double-click to bring up a Properties window; click the tab labeled Binning; click Custom; then enter the number of intervals you want or the interval size (labeled Interval Width); click Apply.) (If you want a nongrouped histogram, type in "1" for the interval size.)
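The grouping that SPSS applies here can also be sketched outside SPSS. The following Python sketch is only an illustration, not SPSS's own procedure: the function name `grouped_frequencies` and the scores are hypothetical, and it assumes whole-number scores that are all at or above the starting value.

```python
from collections import Counter

def grouped_frequencies(scores, width, start):
    """Count scores in equal-width intervals: start to start+width-1, and so on.

    Assumes every score is a whole number at or above `start`.
    """
    # Interval 0 holds start..start+width-1, interval 1 the next `width` values, etc.
    counts = Counter((score - start) // width for score in scores)
    table = []
    for k in range(max(counts) + 1):
        low = start + k * width
        high = low + width - 1
        table.append((f"{low}-{high}", counts.get(k, 0)))
    return table

# Hypothetical scores for illustration (not the social interactions data).
scores = [1, 2, 3, 4, 4, 5, 7, 8, 8, 9, 12, 15]
for interval, frequency in grouped_frequencies(scores, width=3, start=1):
    print(interval, frequency)
```

Setting `width=1` gives an ordinary (nongrouped) frequency count, which parallels typing "1" for the interval size.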
[SPSS Data Editor window: the socint scores entered in one column, with the Frequencies dialog box open. The dialog box lists socint under Variable(s) and includes the Display frequency tables check box and the Statistics, Charts, and Format buttons.]
Figure 1-18 SPSS data window and frequencies window for the social interactions
example. (Data from McLaughlin-Volpe et al., 2001.)
Creating a Histogram
1. Enter the scores from your distribution in one column of the data window.
2. Click Analyze.
3. Click Descriptive statistics.
4. Click Frequencies.
5. Click the variable you want to make a histogram of and then click the arrow.
6. Click Charts, click Histograms, click Continue.
7. Optional: To instruct SPSS not to produce a frequency table, click the box labeled Display frequency tables (this unchecks the box).
8. Click OK.
Table 1-11 Dominant Category of Explanation for Intimate Aggression by Gender and Perpetrator Status

                                               Female                          Male
                                     Comparisons   Perpetrators     Comparisons   Perpetrators
                                      (n = 36)       (n = 33)        (n = 32)       (n = 25)
Category                               f     %        f     %         f     %        f     %
Self-defense                           2     6        3     9         3     9        1     4
Control motives                        8    22        9    27         9    28        3    12
Expressive aggression                  4    11        3     9         3     9        8    32
Face/self-esteem preservation          1     3        2     6         2     6        3    12
Exculpatory explanations               5    14        3     9         3     9        3    12
Rejection of perpetrator or act       12    33        6    18        10    31        7    28
Prosocial/acceptable explanations      0     0        0     0         0     0        0     0
Tied categories                        4    11        7    21         2     6        0     0

Note: f = frequency. % = percentage of respondents in a given group who provided a particular category of explanation. Source: Mouradian, V. E. (2001). Applying schema theory to intimate aggression: Individual and gender differences in representation of contexts and goals. Journal of Applied Social Psychology, 31, 376-408. Copyright © 2001 by Blackwell Publishing. Reprinted by permission of Blackwell Publishers Journals.
The word "click" in the following steps indicates a mouse click. (... to carry out these analyses. The steps and output may differ for other versions of SPSS.)
Creating a Frequency Table
1. Enter the scores from your distribution in one column of the data window.
2. Click Analyze.
3. Click Descriptive statistics.
4. Click Frequencies.
5. Click the variable you want to make a frequency table of and then click the arrow.
6. Click OK.
Practice the preceding steps by creating a frequency table for the social interactions example in this chapter (the scores are listed on p. 8). Your data window and frequencies window should look like Figure 1-18. Your output window (after you click OK in the last step) should look like Figure 1-19. As you will see, SPSS also produces a column with the cumulative percentage. (Note that it is possible to create grouped frequency tables in SPSS, but because it is not a straightforward process, we do not cover it here.)
17. Pick a book and a page number of your choice. (Select a page with at least
30 lines; do not pick a textbook or any book with tables or illustrations.) Make a list of the number of words on each line; use that list as your data set. Make (a) a frequency table and (b) a histogram. Then (c) describe the general shape of the distribution. (Be sure to give the name, author, publisher, and year of the
book you used, along with the page number, with your answer.)
18. Explain to a person who has never taken a course in statistics the meaning of a
grouped frequency table.
19. Give an example of something having these distribution shapes: (a) bimodal,
(b) approximately rectangular, and (c) positively skewed. Do not use an example given in this book or in class.
20. Find an example in a newspaper or magazine of a graph that misleads by failing
to use equal interval sizes or by exaggerating proportions.
21. Nownes (2000) surveyed representatives of interest groups who were registered
as lobbyists of three U.S. state legislatures. One of the issues he studied was whether interest groups are in competition with each other. Table 1-10 shows the results for one such question. (a) Using this table as an example, explain the idea of a frequency table to a person who has never had a course in statistics.
(b) Explain the general meaning of the pattern of results.
22. Mouradian (2001) surveyed college students selected from a screening session to
include two groups: (a) "Perpetrators"—students who reported at least one violent act (hitting, shoving, etc.) against their partner in their current or most recent relationship—and (b) "Comparisons"—students who did not report any such uses of violence in any of their last three relationships. At the actual testing session, the students first read a description of an aggressive behavior such as, "Throw something at his or her partner" or "Say something to upset his or her partner." They then were asked to write "as many examples of circumstances of situations as [they could] in which a person might engage in behaviors or acts of this sort with or towards their significant other." Table 1-11 shows the "Dominant
Category of Explanation" (the category a participant used most) for females and males, broken down by comparisons and perpetrators. (a) Using this table as an example, explain the idea of a frequency table to a person who has never had a course in statistics. (b) Explain the general meaning of the pattern of results.
Table 1-10 Competition for Members and Other Resources
Question: How much competition does this group face from other groups with similar goals for members and other resources?
Answer                    Percentage    Number
No competition                20          118
Some competition              58          342
A lot of competition          22          131
Total                        100          591
Note: There were no statistically significant differences between states. For full results of significance tests, contact the author. Source: Nownes, A. J. (2001). Policy conflict and the structure of interest communities. American Politics Quarterly, 28, 316. Copyright © 2001 by Sage Publications, Ltd. Reprinted by permission of Sage Publications, Thousand Oaks, London, and New Delhi.
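The Percentage column in Table 1-10 can be checked from the Number column: each percentage is the number of respondents giving that answer divided by the total, times 100. A minimal Python sketch of that arithmetic follows; the variable names are only illustrative, and rounding to whole percentages is an assumption made to match the table as printed.

```python
# Counts taken from the Number column of Table 1-10.
counts = {
    "No competition": 118,
    "Some competition": 342,
    "A lot of competition": 131,
}

total = sum(counts.values())  # total number of respondents

# Percentage = frequency / total * 100, rounded to a whole number.
percentages = {answer: round(100 * n / total) for answer, n in counts.items()}

for answer, pct in percentages.items():
    print(f"{answer}: {pct}%")
```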
Table 1-9 Descriptive Statistics for the Type of News Given

Category                                                      Frequency    Percentage
1. Relationship with family                                       19          21.1
2. School
3. Job/work                                                        6           6.7
4. Relationship with actual/potential girlfriend/boyfriend        17          18.9
5. Personal health
6. Finance
7. Relationship with friends                                      21          23.3
8. Health of family member/friend                                 23          25.6
9. Other                                                           1           1.1

Source: McKee, T. L. E., & Ptacek, J. T. (2001). I'm afraid I have something bad to tell you: Breaking bad news from the perspective of the giver. Journal of Applied Social Psychology, 31, 246-273. Copyright © 2001 by Blackwell Publishing. Reprinted by permission of Blackwell Publishers Journals.
12. Explain and give an example for each of the following types of variables: (a)
equal-interval, (b) rank-order, (c) nominal, (d) ratio scale, (e) continuous.
13. An organizational psychologist asks 20 employees in a company to rate their
job satisfaction on a 5-point scale from 1 = very unsatisfied to 5 = very satisfied. The ratings are as follows:
3, 2, 3, 4, 1, 3, 3, 4, 5, 2, 3, 5, 2, 3, 3, 4, 1, 3, 2, 4
Make (a) a frequency table and (b) a histogram. Then (c) describe the general
shape of the distribution.
14. A social psychologist asked 15 college students how many times they "fell in love" before they were 11 years old. The numbers of times were as follows:
2, 0, 6, 0, 3, 1, 0, 4, 9, 0, 5, 6, 1, 0, 2
Make (a) a frequency table and (b) a histogram. Then (c) describe the general
shape of the distribution.
15. Following are the speeds of 40 cars clocked by radar on a particular road in a 35-mph zone on a particular afternoon:
30, 36, 42, 36, 30, 52, 36, 34, 36, 33, 30, 32, 35, 32, 37, 34, 36, 31, 35, 20, 24, 46, 23, 31, 32, 45, 34, 37, 28, 40, 34, 38, 40, 52, 31, 33, 15, 27, 36, 40
Make (a) a frequency table and (b) a histogram. Then (c) describe the general
shape of the distribution.
16. Here are the number of holiday gifts purchased by 25 families randomly interviewed at a local mall at the end of the holiday season:
22, 18, 22, 26, 19, 14, 23, 27, 2, 18, 28, 28, 11, 16, 34, 28, 13, 21, 32,
17, 6, 29, 23, 22, 19
Make (a) a frequency table and (b) a grouped frequency table using intervals of 0-4, 5-9, 10-14, 15-19, 20-24, 25-29, and 30-34. Based on the grouped frequency table, (c) make a histogram and (d) describe the general shape of the distribution.
Make (a) a frequency table and (b) a histogram. Then (c) describe the general
shape of the distribution.
5. These are the scores on a test of sensitivity to smell taken by 25 chefs attending a national conference:
96, 83, 59, 64, 73, 74, 80, 68, 87, 67, 64, 92, 76, 71, 68, 50, 85, 75, 81, 70, 76, 91, 69, 83, 75
Make (a) a frequency table and (b) a histogram. (c) Make a grouped frequency table using intervals of 50-59, 60-69, 70-79, 80-89, and 90-99. Based on the grouped frequency table, (d) make a histogram and (e) describe the general shape of the distribution.
6. The following data are the number of minutes it took each of a group of 34 10-year-olds to do a series of abstract puzzles:
24, 83, 36, 22, 81, 39, 60, 62, 38, 66, 38, 36, 45, 20, 20, 67, 41, 87,
41, 82, 35, 82, 28, 80, 80, 68, 40, 27, 43, 80, 31, 89, 83, 24
Make (a) a frequency table and (b) a grouped frequency table using intervals of 20-29, 30-39, 40-49, 50-59, 60-69, 70-79, and 80-89. Based on the grouped frequency table, (c) make a histogram and (d) describe the general shape of the
distribution.
7. Describe the shapes of the three distributions illustrated.
(a)
(b)
(c)
8. Draw an example of each of the following distributions: (a) symmetrical,
(b) rectangular, and (c) skewed to the right.
9. Explain to a person who has never had a course in statistics what is meant by
(a) a symmetrical unimodal distribution and (b) a negatively skewed unimodal distribution. (Be sure to include in your first answer an explanation of what
"distribution" means.)
10. McKee and Ptacek (2001) asked 90 college students about a time they had delivered bad news to someone. Table 1-9 shows the results for the type of bad news given. (a) Using this table as an example, explain the idea of a frequency table to a person who has never had a course in statistics. (b) Explain the general meaning of the pattern of results.
Set II
11. A participant in a cognitive psychology study is given 50 words to remember
and later asked to recall as many of them as he can. This participant recalls 17. What is the (a) variable, (b) possible values, and (c) score?
Making a Histogram
See Figure 1-17.
[Figure 1-17 shows the frequency table for Interest in Graduate School (values 1 to 6) and the histogram drawn from it, with Frequency on the vertical axis.]
Figure 1-17 Answer to Worked-Out Problem for making a histogram. (1) Make a frequency table (or grouped frequency table). (2) Put the values along the bottom of the page, from left to right, from lowest to highest. (3) Make a scale of frequencies along the left edge of the page that goes from 0 at the bottom to the highest frequency for any value. (4) Make a bar above each value with a height for the frequency of that value.
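The four steps in the caption can be mirrored in a short program. The Python sketch below is only an illustration under stated assumptions: it draws a rough text histogram rather than a proper graph, the function name `text_histogram` is hypothetical, and it assumes whole-number values and frequencies below 10 so that the labels line up.

```python
from collections import Counter

def text_histogram(scores, lowest, highest):
    """Return the lines of a rough text histogram for whole-number scores."""
    counts = Counter(scores)                # step 1: frequency of each value
    values = range(lowest, highest + 1)     # step 2: values from lowest to highest
    tallest = max(counts.values())
    rows = []
    # Steps 3 and 4: one row per frequency level, from the top down;
    # a value gets a "#" on every level up to its own frequency.
    for level in range(tallest, 0, -1):
        cells = ["#" if counts.get(v, 0) >= level else " " for v in values]
        rows.append(f"{level} " + " ".join(cells))
    rows.append("  " + " ".join(str(v) for v in values))
    return [row.rstrip() for row in rows]

# Scores from the Example Worked-Out Problem (interest in graduate school).
scores = [2, 4, 5, 5, 1, 3, 6, 3, 6, 6]
for line in text_histogram(scores, lowest=1, highest=6):
    print(line)
```

The left-hand numbers are the frequency scale and the bottom row lists the values, so the tallest column sits above the value 6.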
Practice Problems
These problems involve tat lems are done on a comput( software, do these problerr how to use a computer to s the Using SPSS section at t
Workbook that accompanie
All data are fictional u
Set I (for Answers t
1. A client rates her satis
scale from 1 = not at c
(b) possible values, an
2. Give the level of mea
group to which a persc
turn in a laboratory ma
3. A particular block in a of children in these ho
2, 4, 2, 1,
Make (a) a frequency shape of the distributic
4. Fifty students were as1
their answers:
11, 2, 0, 13, 5, 7, 1, 8
11, 18, 2, 9, 7, 3, E, „,
8. Statistical graphs for the general public are sometimes distorted in ways
that mislead the eye, such as failing to use equal intervals or exaggerating
proportions.
9. Frequency tables and histograms are rarely shown in research articles. When
they are, they often follow nonstandard formats or involve frequencies (or percentages) for a nominal variable. The shapes of distributions are more often described.
... statistics (p. 2)
... statistics (p. 2)
... variable (p. 4)
... variable (p. 4)
... variable (p. 4)
continuous variable (p. 4)
rank-order variable (p. 4)
nominal variable (p. 4)
levels of measurement (p. 5)
frequency table (p. 7)
interval (p. 9)
grouped frequency table (p. 9)
histogram (p. 10)
frequency distribution (p. 15)
unimodal distribution (p. 15)
bimodal distribution (p. 15)
multimodal distribution (p. 15)
rectangular distribution (p. 15)
symmetrical distribution (p. 17)
skewed distribution (p. 17)
floor effect (p. 17)
ceiling effect (p. 18)
normal curve (p. 18)
kurtosis (p. 18)
Example Worked-Out Problems
Ten first-year students rated their interest in graduate school on a scale from 1 = no interest at all to 6 = high interest. Their scores were as follows: 2, 4, 5, 5, 1, 3, 6, 3, 6, 6.
Making a Frequency Table
See Figure 1-16.
Interest in Graduate School    Frequency    Percent
            1                      1           10
            2                      1           10
            3                      2           20
            4                      1           10
            5                      2           20
            6                      3           30

Figure 1-16 Answer to Example Worked-Out Problem for making a frequency table. (1) Make a list down the page of each possible value, from lowest to highest. (2) Go one by one through the scores, making a mark for each next to its value on your list. (3) Make a table showing how many times each value on your list is used. (4) Figure the percentage of scores for each value.
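The same frequency table can be produced with a few lines of code. This Python sketch is one possible way to do it, not the book's procedure; the function name `frequency_table` is hypothetical, and it assumes whole-number scores on a known scale (here 1 to 6).

```python
from collections import Counter

def frequency_table(scores, lowest, highest):
    """Return (value, frequency, percent) rows for every possible value."""
    counts = Counter(scores)   # how many times each value is used
    n = len(scores)            # total number of scores
    rows = []
    for value in range(lowest, highest + 1):
        frequency = counts.get(value, 0)   # values never used get a frequency of 0
        rows.append((value, frequency, 100 * frequency / n))
    return rows

# Scores from the Example Worked-Out Problem.
scores = [2, 4, 5, 5, 1, 3, 6, 3, 6, 6]
table = frequency_table(scores, lowest=1, highest=6)
for value, frequency, percent in table:
    print(value, frequency, percent)
```

Listing every possible value, rather than only the values that occur, keeps a row for any value with a frequency of 0, as in the hand method.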
[Histogram: vertical axis from 0 to 100; horizontal axis Age in Years (10-11, 12-13, 14-15, 16-17).]
Figure 1-15 Change in the percentage of adolescents surveyed in the Canadian
National Longitudinal Survey of Children and Youth longitudinal sample.
Source: Maggi, S., Hertzman, C., & Vaillancourt, T. (2007). Changes in smoking behaviors from late childhood to adolescence: Insights from the Canadian National Longitudinal Survey of Children and
Youth. Health Psychology, 26, 232-240. Published by the American Psychological Association. Reprinted with permission.
1. Psychologists use descriptive statistics to describe and summarize a group of numbers from a research study.
2. A value is a number or category; a variable is a characteristic that can have different values; a score is a particular person's value on the variable.
3. Most variables in psychology research are numeric with approximately equal intervals. However, some numeric variables are rank-ordered (the values are ranks), and some variables are not numeric at all (the values are categories).
4. A frequency table organizes the scores into a table of each of the possible values with the frequency and percentage of scores with that value.
5. When there are many different values, a grouped frequency table is useful. It is like an ordinary frequency table except that the frequencies are given for intervals that include a range of values.
6. The pattern of frequencies in a distribution can be shown visually with a histogram (or bar graph), in which the height of each bar is the frequency for a particular value.
7. The general shape of a histogram can be unimodal (having a single peak), bimodal (having two peaks), multimodal (including bimodal), or rectangular (having no peak); it can be symmetrical or skewed (having a long tail) to the right or the left; and, compared to the bell-shaped normal curve, it can be kurtotic (having a peaked or flat distribution).
Table 1-8 Incidence of Traditional and Electronic Bullying and Victimization (N = 84)

Form of bullying                              N       %
Electronic victims                           41     48.8
  Text-message victim                        27     32.1
  Internet victim (Web sites, chatrooms)     13     15.5
  Picture-phone victim                        8      9.5
Traditional victims                          60     71.4
  Physical victim                            38     45.2
  Teasing victim                             50     59.5
  Rumors victim                              32     38.6
  Exclusion victim                           30     50.0
Electronic bullies                           18     21.4
  Text-message bully                         18     21.4
  Internet bully                             11     13.1
Traditional bullies                          54     64.3
  Physical bully                             29     34.5
  Teasing bully                              38     45.2
  Rumor bully                                22     26.2
  Exclusion bully                            35     41.7
Source: Raskauskas, J., & Stoltz, A. D. (2007). Involvement in traditional and electronic bullying among adolescents. Developmental Psychology, 43, 564-575. Published by the American Psychological Association. Reprinted with permission.
bullying as ". . . a means of bullying in which peers use electronics [such as text messages, emails, and defaming Web sites] to taunt, threaten, harass, and/or intimidate a peer" (p. 565). Table 1-8 is a frequency table showing the adolescents' reported incidence of being victims or perpetrators of traditional and electronic bullying. The table shows, for example, that about half (48.8%) of the adolescents reported being the victim of electronic bullying, and the most common vehicle for electronic bullying (experienced by 32.1% of the adolescents) was text messaging.
Histograms are even more rare in research articles (except in articles about statistics), but they do appear occasionally. Maggi and colleagues (2007) conducted a study of age-related changes in cigarette smoking behaviors in Canadian adolescents. As shown in Figure 1-15, they created a histogram—from a grouped frequency table—to display their results. Their histogram shows the results from the two samples they studied (one shown in the light colored bars and the other in the dark colored bars). As you can see in the figure, less than 10% of the 10- and 11-year-olds reported that they had tried smoking, but more than half of the 16- and 17-year-olds said they had tried smoking. As already mentioned, such figures are often not standard in some way. In this example, the researchers drew the histogram with gaps between the bars, whereas it is standard not to use gaps (unless you are drawing a bar graph for a nominal variable). However, the histogram still does a good job of showing the distribution. Also, the researchers, to allow for a fair comparison of how the rate of smoking differed among adolescents of varying ages, plotted the percentage of adolescents on the vertical axis instead of the actual number of adolescents. (Plotting the actual number of adolescents who reported smoking would have been misleading, because there were not the same number of individuals in each of the age groups.)
[Three histograms of the stress ratings, panels (a), (b), and (c), each with Stress Rating on the horizontal axis.]
Figure 1-14 Histogram of students' stress ratings distorted from the standard of width
1 to 1.5 times height. (Data based on Aron et al., 1995.)
housing price in a particular region over a 4-year period (from 2004 to 2007). By starting the vertical axis at $150,000 (instead of 0, as is customary), the graph appears to exaggerate the changes in housing price over time. Figure 1-13b shows the same results with the vertical axis starting at $0. You can still see the changes in housing price from year to year in Figure 1-13b, but the figure does a better job of showing the size of those changes.
The overall proportion of a histogram or bar graph should be about 1 to 1.5 times as wide as it is tall, as in Figure 1-14a for the stress ratings example. But look what happens if we make the graph much taller or shorter, as shown in Figures 1-14b and 1-14c. The effect is like that of a fun house mirror: the true picture is distorted. Any particular shape is in a sense accurate. But the 1-to-1.5 proportion has been adopted to give people a standard for comparison. Changing this proportion misleads the eye.
Frequency Tables and Histograms in Research Articles
Psychology researchers mainly use frequency tables and histograms as a first step in more elaborate statistical analyses. They are usually not included in research articles, and when they are, just because they are so rare, they are often not standard in some way. When they do appear, they are most likely to be in survey studies. For example, Raskauskas and Stoltz (2007) asked a group of 84 adolescents about their involvement in traditional and electronic bullying. The researchers defined electronic
[Bar chart titled "Commission Payments to Travel Agents," with bars labeled 76, 77, and first half 78 for airlines including Eastern and United Airlines.]
Figure 1-12 Misleading illustration of a frequency distribution due to unequal interval
sizes.
Source: "Commission Payments to Travel Agents," from The New York Times, August 8, 1978. © 1978 The New York Times. Used by permission and protected by the Copyright Laws of the United States. The printing, copying, redistribution, or retransmission of the Material without express written permission is prohibited. www.nytimes.com
[Bar graphs (a) and (b), each with Year (2004 to 2007) on the horizontal axis. The vertical axis in (a) runs from 150,000 to 164,000 in steps of 2,000; in (b) it runs from 0 to 175,000 in steps of 25,000.]
Figure 1-13 Misleading bar graph due to not starting at zero. The vertical axis starts at $150,000 for figure (a) compared to $0 for figure (b).
How are you doing?
1. Describe the difference between a unimodal and multimodal distribution in terms of (a) a frequency graph and (b) a frequency table.
2. What does it mean to say that a distribution is skewed to the left?
3. What kind of skew is created by (a) a floor effect and (b) a ceiling effect?
4. When a distribution is described as being peaked or flat, what is it being compared to?
Answers
1. (a) A unimodal distribution has one main high point; a multimodal distribution has more than one main high point. (b) A unimodal distribution has one value with a higher frequency than all the other frequencies; a multimodal distribution has more than one value with large frequencies compared to the values around it.
2. Fewer scores have low values than have high values.
3. (a) A skew created by a floor effect is skewed to the right; (b) one created by a ceiling effect is skewed to the left.
4. The distribution is being compared to a normal curve.
Controversy: Misleading Graphs
The most serious controversy about frequency tables and histograms is not among psychologists, but among the general public. The misuse of these procedures by some public figures, advertisers, and the media seems to have created skepticism about the trustworthiness of statistics in general and of statistical tables and charts in particular. Everyone has heard that "statistics lie."
Of course, people can and do lie with statistics. It is just as easy to lie with words, but you may be less sure of your ability to recognize lies with numbers. In this section, we note two ways in which frequency tables and graphs can be misused and tell how to recognize such misuses. (Much of this material is based on the classic discussion of these issues in Tufte, 1983.)
Failure to Use Equal Interval Sizes
A key requirement of a grouped frequency table or graph is that the size of the intervals be equal. If they are not equal, the table or graph can be very misleading. Tufte (1983) gives an example, shown in Figure 1-12, from the respectable (and usually accurate) New York Times. This chart gives the impression that commissions paid to travel agents dropped dramatically in 1978. However, a close reading of the graph shows that the third bar for each airline is for only the first half of 1978. Thus, only half a year is being compared to each of the preceding full years. Assuming that the second half of 1978 was like the first half, the information in this graph actually tells us that 1978 shows an increase rather than a decrease. For example, Delta Airlines estimated a full-year 1978 figure of $72 million, much higher than 1977's $57 million.
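One way to see the problem is to put every bar on the same footing by converting each amount to a per-year rate. The Python sketch below does this for the Delta example. The $72 million and $57 million figures come from the text; the $36 million first-half amount is an assumption (half of the full-year estimate) used only for illustration, and the function name `per_year` is hypothetical.

```python
def per_year(amount, years_covered):
    """Convert an amount for an interval into an equivalent full-year amount."""
    return amount / years_covered

full_year_1977 = 57.0     # millions of dollars, from the text
first_half_1978 = 36.0    # assumed: half of the $72 million full-year estimate

# The chart compares a half-year bar directly with full-year bars.
print(first_half_1978 < full_year_1977)    # looks like a drop

# With equal interval sizes, the comparison reverses.
estimate_1978 = per_year(first_half_1978, years_covered=0.5)
print(estimate_1978 > full_year_1977)      # actually an increase
```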
Exaggeration of Proportions
The height of a histogram or bar graph (or frequency polygon) usually begins at 0 or the lowest value of the scale and continues to the highest value of the scale. Figure 1-13a
shows a bar graph that does not follow this standard. The bar graph shows the mean
A skewed distribution caused by an upper limit is shown in Figure 1-10b. This is a distribution of adults' scores on a multiplication table test. This distribution is strongly skewed to the left. Most of the scores pile up at the right, the high end (a perfect score). This shows a ceiling effect. The stress ratings example also shows a mild ceiling effect because many students had high levels of stress, the maximum rating was 10, and people often do not like to use ratings right at the maximum.
Normal and Kurtotic Distributions
Psychologists also describe a distribution in terms of whether the middle of the distribution is particularly peaked or flat. The standard of comparison is a bell-shaped curve. In psychology research and in nature generally, distributions often are similar to this bell-shaped standard, called the normal curve. We discuss this curve in some detail in later chapters. For now, however, the important thing is that the normal curve is a unimodal, symmetrical curve with an average peak, the sort of bell shape shown in Figure 1-11a. Both the stress ratings and the social interactions examples approximate a normal curve in a very general way, although, as we noted, both are somewhat skewed. In our experience, most distributions that result from psychology research are closer to the normal curve than are these two examples.
Kurtosis is how much the shape of a distribution differs from a normal curve in terms of whether its curve in the middle is more peaked or flat than the normal curve (DeCarlo, 1997). Kurtosis comes from the Greek word kyrtos, "curve." Figure 1-11b shows a kurtotic distribution with a more extreme peak than the normal curve. Figure 1-11c shows an extreme example of a kurtotic distribution, one with a very flat distribution. (A rectangular distribution would be even more extreme.)
Distributions that are more peaked or flat than a normal curve also tend to have a different shape in the tails. Those with a very peaked curve usually have more scores in the tails of the distribution than the normal curve (see Figure 1-11b). It is as if the normal curve got pinched in the middle and some of it went up into a sharp peak and the rest spread out into thick tails. Distributions with a flatter curve usually have fewer scores in the tails of the distribution than the normal curve (see Figure 1-11c). It is as if the tails and the top of the curve both got sucked in toward the middle on both sides. Although it is often easiest to identify kurtosis in terms of how peaked or flat the distribution is, the number of scores in the tails is what matters.
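This chapter describes kurtosis only by the shape of the curve, but it can also be put into a number. One common index, offered here as an assumption and not as a formula given in this chapter, is excess kurtosis: the average fourth power of the deviations from the mean, divided by the squared variance, minus 3. It is about 0 for a normal curve, positive for peaked distributions with heavy tails, and negative for flat ones. The scores below are hypothetical.

```python
def excess_kurtosis(scores):
    """Population excess kurtosis: fourth moment / variance squared, minus 3."""
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((x - mean) ** 2 for x in scores) / n   # variance
    m4 = sum((x - mean) ** 4 for x in scores) / n   # fourth central moment
    return m4 / m2 ** 2 - 3

# Hypothetical scores: a flat (rectangular) set and a peaked set with heavy tails.
flat = [1, 2, 3, 4, 5, 6]
peaked = [0, 5, 5, 5, 5, 5, 5, 5, 5, 10]

print(excess_kurtosis(flat))    # negative: flatter than a normal curve
print(excess_kurtosis(peaked))  # positive: sharper peak and more extreme tails
```

In the peaked example the two extreme scores (0 and 10) account for all of the spread, which fits the point above that the scores in the tails are what matter.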
ceiling effect: situation in which many scores pile up at the high end of a distribution (creating skewness) because it is not possible to have a higher score.
normal curve: specific, mathematically defined, bell-shaped frequency distribution that is symmetrical and unimodal; distributions observed in nature and in research commonly approximate it.
kurtosis: extent to which a frequency distribution deviates from a normal curve in terms of whether its curve in the middle is more peaked or flat than the normal curve.
Figure 1-11 Examples of (a) normal, (b) peaked, and (c) flat distribution. The normal distribution is shown as a dashed line in (b) and (c).
Source: Adapted from DeCarlo, T. (1997). On the meaning and use of kurtosis. Psychological Methods, 3, 292-307, Figure 1. Published by the American Psychological Association. Adapted with permission.
Figure 1-9 Examples of frequency polygons of distributions that are (a) approximately symmetrical, (b) skewed to the right (positively skewed), and (c) skewed to the left (negatively skewed).
symmetrical distribution (if you fold the graph of a symmetrical distribution in half, the two halves look the same).
symmetrical distribution: distribution in which the pattern of frequencies on the left and right side are mirror images of each other.
A distribution that clearly is not symmetrical is called a skewed distribution. The stress ratings distribution is an example. A skewed distribution has one side that is long and spread out, somewhat like a tail. The side with the fewer scores (the side that looks like a tail) is considered the direction of the skew. Thus, the stress study example, which has too few scores at the low end, is skewed to the left. However, the social interactions example, which has too few scores at the high end, is skewed to the right (see Figure 1-4). Figure 1-9 shows examples of approximately symmetrical and skewed distributions.
skewed distribution: distribution in which the scores pile up on one side of the middle and are spread out on the other side; distribution that is not symmetrical.
A distribution that is skewed to the right is also called positively skewed. A distribution skewed to the left is also called negatively skewed.
Strongly skewed distributions come up in psychology research mainly when what is being measured has some upper or lower limit. For example, a family cannot have fewer than zero children. When many scores pile up at the low end because it is impossible to have a lower score, the result is called a floor effect. A skewed distribution caused by a lower limit is shown in Figure 1-10a.
floor effect: situation in which many scores pile up at the low end of a distribution (creating skewness) because it is not possible to have any lower score.
It helps you remember the direction of the skew to know that the word skew comes from the French queue, which means line or tail. Thus, the direction of the skew is the side that has the long line, or tail.
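The direction of skew can also be checked with a number. One common index, an assumption here and not a formula from this chapter, is the average cubed deviation from the mean divided by the variance raised to the power 1.5. Its sign matches the terms above: positive for a tail to the right, negative for a tail to the left, and 0 for a symmetrical set of scores. The scores below are hypothetical examples of a floor effect and a ceiling effect.

```python
def skewness(scores):
    """Population skewness: third central moment / variance ** 1.5."""
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((x - mean) ** 2 for x in scores) / n   # variance
    m3 = sum((x - mean) ** 3 for x in scores) / n   # third central moment
    return m3 / m2 ** 1.5

# Hypothetical number of children in eight families (floor at 0).
children = [0, 0, 0, 1, 1, 2, 3, 6]
# Hypothetical percentage-correct scores on an easy test (ceiling at 100).
test_scores = [100, 100, 100, 90, 90, 80, 70, 40]

print(skewness(children))      # positive: skewed to the right
print(skewness(test_scores))   # negative: skewed to the left
```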
[Two histograms: (a) with Number of Children on the horizontal axis; (b) with Percentage of Correct Answers (10 to 100) on the horizontal axis.]
Figure 1-10 (a) A distribution skewed to the right due to a floor effect: fictional distribution of the number of children in families. (b) A distribution skewed to the left due to a ceiling effect: fictional distribution of adults' scores on a multiplication table test.
16
Chapter 1
Figure 1-8 Fictional examples of distributions that are not unimodal: (a) A bimodal distribution showing the possible frequencies for people of different ages in a toddler's play area. (b) A rectangular distribution showing the possible frequencies of students at different grade levels in an elementary school.
Axes: (a) Number of People in a Toddler's Play Area, by Age; (b) Number of Students, by Grade Level.
The scores from most psychology studies are usually an approximately unimodal distribution. Bimodal and other multimodal distributions occasionally turn up. A bimodal example is the distribution of the ages of people in a toddler's play area in a park, who are mostly either toddlers with ages of around 2 to 4 or caretakers with ages of 20 to 40 or so (with few people aged 5 to 19 years or above 40). Thus, if you make a frequency distribution of these ages, the large frequencies are at the values for low ages (2 to 4) and for higher ages (20 to 40 or so). An example of a rectangular distribution is the number of children at each grade level at an elementary school; there is about the same number in first grade, second grade, and so on. Figure 1-8 shows these examples.
Symmetrical and Skewed Distributions
Look again at the histograms of the stress ratings example (Figure 1-3). The distribution is lopsided, with more scores near the high end. This is somewhat unusual. Most things we measure in psychology have about equal numbers on both sides of the middle. That is, most of the time in psychology, the scores follow an approximately
Figure 1-6 Histogram for "How are you doing?" question 3.
Answers
1. Researchers make frequency graphs to show the pattern visually in a frequency table.
2. (a) The values, from lowest to highest, go along the bottom; (b) the frequencies, from 0 at the bottom to the highest frequency of any value at the top, go along the left edge; (c) above each value is a bar with a height of the frequency for that value.
3. See Figure 1-6.
frequency distribution: pattern of frequencies over the various values; what a frequency table, histogram, or frequency polygon describes.
unimodal distribution: frequency distribution with one value clearly having a larger frequency than any other.
bimodal distribution: frequency distribution with two approximately equal frequencies, each clearly larger than any of the others.
multimodal distribution: frequency distribution with two or more high frequencies separated by a lower frequency; a bimodal distribution is the special case of two high frequencies.
rectangular distribution: frequency distribution in which all values have approximately the same frequency.
Shapes of Frequency Distributions
A frequency distribution shows the pattern of frequencies over the various values. A frequency table or histogram describes a frequency distribution because each shows the pattern or shape of how the frequencies are spread out, or "distributed." Psychologists also describe this shape in words. Describing the shape of a distribution is important both in the descriptive statistics of this chapter and the next and in the inferential statistics of later chapters.
Unimodal and Bimodal Frequency Distributions
One question is whether a distribution's shape has only one main high point: one high "tower" in the histogram. For example, in the stress ratings study, the most frequent value is 7, giving a graph only one very high area. This is a unimodal distribution. If a distribution has two fairly equal high points, it is a bimodal distribution. Any distribution with two or more high points is called a multimodal distribution. (Strictly speaking, a distribution is bimodal or multimodal only if the peaks are exactly equal. However, psychologists use these terms more informally to describe the general shape.) Finally, a distribution with values of all about the same frequency is a rectangular distribution. Figure 1-7 shows examples of these frequency distribution shapes. As you will see, the graphs in Figure 1-7 are not histograms, but special line graphs called frequency polygons, which are another way to graph a frequency table. In a frequency polygon, the line moves from point to point. The height of each point shows the number of scores with that value. This creates a mountain peak skyline.
Figure 1-7 Examples of (a) unimodal, (b) approximately bimodal, and (c) approximately rectangular frequency polygons.
find the midpoint between the start of the interval and the start of what would be the next highest interval. So, in Figure 1-4, the midpoint for the 45-49 interval is halfway between 45 (the start of the interval) and 50 (the start of what would be the next interval), which is 47.5.
❸ Make a scale of frequencies along the left edge of the page that goes from 0 at the bottom to the highest frequency for any value.
❹ Make a bar above each value with a height for the frequency of that value. For each bar, make sure that the middle of the bar is above its value.
You will probably find it easier to make a histogram if you use graph paper.
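The histogram steps can also be sketched in code. The following is only an illustration, not a procedure from this book: the function names `interval_midpoint` and `text_histogram` are made up for the example, the chart is drawn with text characters rather than on graph paper, and the frequencies are the grouped counts for the 94 social interaction scores (interval size 5, intervals starting at 0, 5, 10, and so on).

```python
def interval_midpoint(start, interval_size):
    """Halfway between this interval's start and the start of the next interval."""
    next_start = start + interval_size
    return (start + next_start) / 2


def text_histogram(midpoints, frequencies):
    """Draw bars of '#' above each midpoint, with a frequency scale on the left."""
    labels = [str(m) for m in midpoints]
    width = max(len(label) for label in labels) + 1
    lines = []
    # Scale of frequencies: highest frequency at the top, down to 1.
    for level in range(max(frequencies), 0, -1):
        row = "".join(("#" if f >= level else " ").center(width) for f in frequencies)
        lines.append(f"{level:>3} |{row}".rstrip())
    # Values (here, interval midpoints) along the bottom, lowest to highest.
    lines.append("    +" + "-" * (width * len(labels)))
    lines.append("     " + "".join(label.center(width) for label in labels))
    return "\n".join(lines)


# Grouped frequencies for the social interactions example (intervals 0-4, 5-9, ..., 45-49).
starts = list(range(0, 50, 5))
frequencies = [12, 16, 16, 16, 10, 11, 4, 3, 3, 3]
midpoints = [interval_midpoint(s, 5) for s in starts]

chart = text_histogram(midpoints, frequencies)
print(chart)
```

Each bar is centered over its midpoint, which mirrors the advice to keep the middle of the bar above its value.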
When you have a nominal variable, the histogram is called a bar graph. Since the values of a nominal variable are not in any particular order, leave a space between the bars. Figure 1-5 shows a bar graph based on the frequency table in Table 1-4.
Figure 1-5 Bar graph for the closest person in life for 208 students (see Table 1-4). (Data from Aron et al., 1995.)
Horizontal axis: Closest Person (Family member, Nonromantic friend, Romantic partner, Other); vertical axis scaled from 0 to 100.
How are you doing?
1. Why do researchers make frequency graphs?
2. When making a histogram from a frequency table, (a) what goes along the bottom, (b) what goes along the left edge, and (c) what goes above each value?
3. Make a histogram based on the following frequency table:
Value
Frequency
thinking ability. Anxiety produces arousal, and one of the best understood relationships in psychology is between arousal and performance. Whereas moderate arousal helps performance, too much or too little dramatically reduces it. In the case of too much, things you have learned become harder to recall. Your mind starts to race, creating more anxiety, more arousal, and so on. Because during a test you may be fearing that you are "no good and never will be," it is important to rethink beforehand any poor grades you may have received in the past. They most likely reflected your problems with tests more than your abilities.
There are many ways to reduce anxiety and arousal in general, such as learning to breathe properly and to take a brief break to relax deeply. Your counseling center should be able to help you or direct you to some good books on the subject. Again, many Web sites deal with reducing anxiety.
Test anxiety specifically is first reduced by overpreparing for a few tests, so that you go in with the certainty that you cannot possibly fail, no matter how aroused you become. The best time to begin applying this tactic is the first test of this course. There will be no old material to review, success will not depend on having understood previous material, and initial success will help you do well throughout the course. (You also might enlist the sympathy of your instructor or teaching assistant. Bring in a list of what you have studied, state why you are being so exacting, and ask if you have missed anything.) Your preparation must be ridiculously thorough, but only for a few exams. After these successes, your test anxiety should decline.
Also, create a practice test situation as similar to a real test as possible, making a special effort to duplicate the aspects that bother you most. If feeling rushed is the troubling part, once you think you are well prepared, set yourself a time limit for solving some homework problems. Make yourself write out answers fully and legibly. This may be part of what makes you feel slow during a test. If the presence of others bothers you (the sound of their scurrying pencils while yours is frozen in midair), do your practice test with others in your course. Even make it an explicit contest to see who can finish first.
Is your problem a general lack of confidence? Is something else in your life causing you to worry or feel bad about yourself? Then we suggest that it is time you tried your friendly college counseling center.
Lastly, could you be highly sensitive? A final word about anxiety and arousal. About 15 to 20% of humans (and all higher animals) seem to be born with a temperament trait that has been seen traditionally as shyness, hesitancy, or introversion (Eysenck, 1981; Kagan, 1994). But this shyness or hesitancy seems actually due to a preference to observe and an ability to notice subtle stimulation and process information deeply (Aron, 1996; Aron & Aron, 1997). This often causes highly sensitive persons (HSPs) to be very intuitive or even gifted. But it also means they are more easily overaroused by high levels of stimulation, like tests.
You might want to find out if you are an HSP (at http://www.hsperson.com). If you are, appreciate the trait's assets and make some allowances for its one disadvantage, this tendency to become easily overaroused. It has to affect your performance on tests. What matters is what you actually know, which is probably quite a bit. This simple act of self-acceptance (that you are not less smart but are more sensitive) may in itself help ease your arousal when trying to express your statistical knowledge.
So good luck to all of you. We wish you the best while taking this course and in your lives.
How to Make a Histogram
There are four steps in making a histogram.
❶ Make a frequency table (or grouped frequency table).
Now try this yourself! Work out the interval midpoints for the grouped frequency table for the stress ratings example shown in Table 1-6. Your answers should be the same as the values shown along the bottom of Figure 1-3b.
❷ Put the values along the bottom of the page, from left to right, from lowest to highest. If you are making a histogram from a grouped frequency table, the values you put along the bottom of the page are the interval midpoints. The midpoint of an interval is halfway between the start of that interval and the start of the next highest interval. So, in Figure 1-4, the midpoint for the 0-4 interval is 2.5, because 2.5 is halfway between 0 (the start of the interval) and 5 (the start of the next highest interval). For the 5-9 interval, the midpoint is 7.5 because 7.5 is halfway between 5 (the start of the interval) and 10 (the start of the next highest interval). Do this for each interval. When you get to the last interval,
Figure 1-4 Histogram for number of social interactions during a week for 94 college students based on grouped frequencies. (Data from McLaughlin-Volpe et al., 2001.)
Horizontal axis: Number of Social Interactions, marked at the interval midpoints 2.5, 7.5, 12.5, 17.5, 22.5, 27.5, 32.5, 37.5, 42.5, and 47.5.
BOX 1-2
Math Anxiety, Statistics Anxiety, and You: A Message for Those of You Who Are Truly Worried About This Course
Let's face it: Many of you dread this course, even to the point of having a full-blown case of "statistics anxiety" (Zeidner, 1991). If you become tense the minute you see numbers, we need to talk about that right now.
First, this course is a chance for a fresh start with digits. Your past performance in (or avoidance of) geometry, trigonometry, calculus, or similar horrors need not influence in any way how well you comprehend statistics. This is largely a different subject.
Second, if your worry persists, you need to determine where it is coming from. Math or statistics anxiety, test anxiety, general anxiety, and generally low self-confidence each seems to play its own role in students' difficulties with math courses (Cooper & Robinson, 1989; Dwinell & Higbee, 1991).
Is your problem mainly math or statistics anxiety? An Internet search will yield hundreds of wonderful books and Web sites to help you. We highly recommend Sheila Tobias's classics Overcoming Math Anxiety (1995) or Succeed with Math: Every Student's Guide to Conquering Math Anxiety (1987). Tobias, a former math avoider herself, suggests that your goal should be "math mental health," which she defines as "the willingness to learn the math you need when you need it" (1995, p. 12). (Could it be that this course in statistics is one of those times?)
Tobias explains that math mental health is usually lost in elementary school, when you are called to the blackboard, your mind goes blank, and you are unable to produce the one right answer to an arithmetic problem. What confidence remained after such an experience probably faded during timed tests, which you did not realize were difficult for everyone except the most proficient few.
Tobias says that students who are good at math are not necessarily smarter than the rest of us, but they really know their strengths and weaknesses, and they have individual styles of thinking and feeling their way around a problem. They do not judge themselves harshly for mistakes. In particular, they do not expect to understand things instantly. Allowing yourself to be a "slow learner" does not mean that you are less intelligent. It shows that you are growing in math mental health.
Is your problem test anxiety? Test taking requires the use of the thinking part of our brain, the prefrontal cortex. When we are anxious, we naturally "downshift" to more basic, instinctual brain systems, and that effect ruins our
Figure 1-3 Histograms based on (a) a frequency table and (b) a grouped frequency table for the stress ratings example. (Data based on Aron et al., 1995.)
Horizontal axis: Stress Rating; vertical axis: Frequency.
When setting up a grouped frequency table, it makes a big difference how many intervals you use. There are guidelines to help researchers with this, but in practice it is done automatically by the researcher's computer (see the Using SPSS section for instructions on how to create frequency tables using statistical software). Thus, we will not focus on it in this book. However, should you have to make a grouped frequency table on your own, the key is to experiment with the interval size until you come up with one that is a round number (such as 2, 3, 5, or 10) and that creates about 5 to 15 intervals. Then, when actually setting up the table, be sure you set the start of each interval to a multiple of the interval size and the top end of each interval to the number just below the start of the next interval. For example, Table 1-6 uses six intervals with an interval size of 2. The intervals are 0-1, 2-3, 4-5, 6-7, 8-9, and 10-11. Note that each interval starts with a multiple of 2 (0, 2, 4, 6, 8, 10) and the top end of each interval (1, 3, 5, 7, 9) is the number just below the start of the next interval (2, 4, 6, 8, 10). Table 1-7 uses 10 intervals with an interval size of 5. The intervals are 0-4, 5-9, 10-14, 15-19, and so on, with a final interval of 45-49. Note that each interval starts with a multiple of 5 (0, 5, 10, 15, and so on) and that the top end of each interval (4, 9, 14, 19, and so on) is the number just below the start of the next interval (5, 10, 15, 20, and so on).
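The interval rule just described (each interval starts at a multiple of the interval size and ends just below the next start) can be sketched in a few lines. This is an illustrative sketch only: the function name `grouped_frequency_table` is made up for the example, and it assumes whole-number scores of 0 or more with the first interval starting at 0.

```python
from collections import Counter


def grouped_frequency_table(scores, interval_size):
    """Return (start, end, frequency) rows; each start is a multiple of interval_size."""
    # Map every score to the start of its interval, then count scores per interval.
    counts = Counter((score // interval_size) * interval_size for score in scores)
    last_start = (max(scores) // interval_size) * interval_size
    rows = []
    start = 0
    while start <= last_start:
        end = start + interval_size - 1  # the number just below the next interval's start
        rows.append((start, end, counts[start]))
        start += interval_size
    return rows


# The 30 stress ratings from this chapter, grouped with an interval size of 2.
stress = [8, 7, 4, 10, 8, 6, 8, 9, 9, 7, 3, 7, 6, 5, 0,
          9, 10, 7, 7, 3, 6, 7, 5, 2, 1, 6, 7, 10, 8, 8]

table = grouped_frequency_table(stress, 2)
for start, end, freq in table:
    print(f"{start}-{end}: {freq}")
```

Trying a few interval sizes this way is a quick check on whether a given size yields roughly 5 to 15 intervals.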
Table 1-7 Grouped Frequency Table for Numbers of Social Interactions During a Week for 94 College Students
Interval   Frequency   Percent
0-4           12        12.8
5-9           16        17.0
10-14         16        17.0
15-19         16        17.0
20-24         10        10.6
25-29         11        11.7
30-34          4         4.3
35-39          3         3.2
40-44          3         3.2
45-49          3         3.2
Source: Data from McLaughlin-Volpe et al. (2001).
How are you doing?
1. What is a frequency table?
2. Make a frequency table for the following scores: 5, 7, 4, 5, 6, 5, 4.
3. What does a grouped frequency table group?
Answers
A frequency table is a systematic listing of the number of scores (the frequency) of each value in the group studied.
A frequency table makes it easy to see the pattern in a large group of scores.
Value   Frequency   Percent
4          2         28.6
5          3         42.9
6          1         14.3
7          1         14.3
A grouped frequency table groups the frequencies of adjacent values into intervals.
Histograms
A graph is another good way to make a large group of scores easy to understand. A picture may be worth a thousand words, but it is sometimes worth a thousand numbers. A straightforward approach is to make a graph of the frequency table. One kind of graph of the information in a frequency table is a kind of bar chart called a histogram. In a histogram, the height of each bar is the frequency of each value in the frequency table. Ordinarily, in a histogram all the bars are put next to each other with no space in between. The result is that a histogram looks a bit like a city skyline. Figure 1-3 shows two histograms based on the stress ratings example (one based on the ordinary frequency table and one based on the grouped frequency table). Figure 1-4 shows a histogram based on the grouped frequency table for the example of the numbers of students' social interactions in a week.
histogram: barlike graph of a frequency distribution in which the values are plotted along the horizontal axis and the height of each bar is the frequency of that value; the bars are usually placed next to each other without spaces, giving the appearance of a city skyline.
Table 1-5 Frequency Table for Number of Social Interactions During a Week for 94 College Students
Score  Frequency    Score  Frequency    Score  Frequency
  0       0           17      4           34      0
  1       2           18      5           35      2
  2       1           19      4           36      0
  3       5           20      0           37      0
  4       4           21      4           38      1
  5       3           22      3           39      0
  6       2           23      1           40      1
  7       2           24      2           41      1
  8       6           25      3           42      0
  9       3           26      2           43      0
 10       6           27      1           44      1
 11       4           28      1           45      0
 12       1           29      4           46      0
 13       2           30      2           47      2
 14       3           31      0           48      1
 15       1           32      1
 16       2           33      1
Source: Data from McLaughlin-Volpe et al. (2001).
❹ Figure the percentage of scores for each value. We have not done so in this example because it would not help much for seeing the pattern of scores. However, if you want to check your understanding of this step, the first five percentages would be 0.0%, 2.1%, 1.1%, 5.3%, and 4.3%. (These are the percentages for frequencies of 0, 2, 1, 5, and 4, rounded to one decimal place.)
You can cross-check your work by adding the frequencies for all of the scores. This sum should equal the total number of scores you started with.
Grouped Frequency Tables
Sometimes there are so many possible values that an ordinary frequency table is too awkward to give a simple picture of the scores. The last example was a bit like that, wasn't it? The solution is to make groupings of values that include all values in a certain range. Consider the stress ratings example. Instead of having a separate frequency figure for the group of students who rated their stress as 8 and another for those who rated it as 9, you could have a combined category of 8 and 9. This combined category is a range of values that includes these two values. A combined category like this is called an interval. This particular interval of 8 and 9 has a frequency of 8 (the 5 scores with a value of 8 plus the 3 scores with a value of 9).
A frequency table that uses intervals is called a grouped frequency table. Table 1-6 is a grouped frequency table for the stress ratings example. (Note that in this example the full frequency table has only 11 different values. Thus, a grouped frequency table is not really necessary.) Table 1-7 is a grouped frequency table for the 94 students' number of social interactions over a week.
A grouped frequency table can make information even more directly understandable than an ordinary frequency table can. Of course, the greater understandability of a grouped frequency table is at a cost. You lose some information: the details of the breakdown of frequencies in each interval.
Table 1-6 Grouped Frequency Table for Stress Ratings
Stress Rating Interval   Frequency   Percent
0-1                          2         6.7
2-3                          3        10.0
4-5                          3        10.0
6-7                         11        36.7
8-9                          8        26.7
10-11                        3        10.0
Source: Data based on Aron et al. (1995).
interval: range of values in a grouped frequency table that are grouped together. (For example, if the interval size is 10, one of the intervals might be from 10 to 19.)
grouped frequency table: frequency table in which the number of individuals (frequency) is given for each interval of values.
When doing Step ❷, cross off each score as you mark it on the list. This should help you avoid mistakes, which are common in this step.
Figure 1-1 Making a frequency table for the stress ratings scores. (Data based on Aron et al., 1995.)
Frequency Tables for Nominal Variables
The preceding steps assume you are using numeric variables, the most common situation. However, you can also use a frequency table to show the number of scores in each value (or category) of a nominal variable. For example, researchers (Aron, Aron, & Smollan, 1992) asked 208 students to name the closest person in their life. As shown in Table 1-4, 33 students selected a family member, 76 a nonromantic friend, 92 a romantic partner, and 7 selected some other person. Also in Table 1-4, the values listed on the left-hand side of the frequency table are the values (the categories) of the variable.
Table 1-4 Frequency Table for a Nominal Variable: Closest Person in Life for 208 Students
Closest Person        Frequency   Percent
Family member            33        15.9
Nonromantic friend       76        36.5
Romantic partner         92        44.2
Other                     7         3.4
Source: Data from Aron et al. (1992).
Another Example
Tracy McLaughlin-Volpe and her colleagues (2001) had 94 introductory psychology students keep a diary of their social interactions for a week during the regular semester. Each time a participant had a social interaction lasting 10 minutes or longer, he or she would fill out a card. The card had questions about various aspects of the conversation and the conversation partner. Excluding family and work situations, the number of social interactions 10 minutes or longer over a week for these students were as follows:
48, 15, 33, 3, 21, 19, 17, 16, 44, 25, 30, 3, 5, 9, 35, 32, 26, 13, 14, 14, 47, 47, 18, 11, 5, 19, 24, 17, 6, 25, 8, 18, 29, 1, 18, 22, 3, 22, 29, 2, 6, 10, 29, 10, 29, 21, 38, 41, 16, 17, 8, 40, 8, 10, 18, 7, 4, 4, 8, 11, 3, 23, 10, 19, 21, 13, 12, 10, 4, 17, 11, 21, 9, 8, 7, 5, 3, 22, 14, 25, 4, 11, 10, 18, 1, 28, 27, 19, 24, 35, 9, 30, 8, 26.
Now, let's follow our four steps for making a frequency table.
❶ Make a list down the page of each possible value, from lowest to highest. The lowest possible number of interactions is 0. In this study, the highest number of interactions could be any number. However, the highest actual number in this group is 48; so we can use 48 as the highest value. Thus, the first step is to list these values down a page. (It might be good to use several columns so that you can have all the scores on a single page.)
❷ Go one by one through the scores, making a mark for each next to its value on your list. Figure 1-2 shows the results of this step.
❸ Make a table showing how many times each value on your list is used. Table 1-5 is the result.
Figure 1-2 Making a frequency table of students' social interactions over a week. (Data from McLaughlin-Volpe et al., 2001.)
Frequency Tables
An Example
Let's return to the stress ratings example. Recall that in this study, students in an introductory statistics class during the first week of the course answered the question, "How stressed have you been in the last 2½ weeks, on a scale of 0 to 10, with 0 being not at all stressed and 10 being as stressed as possible?" The actual study included scores from 151 students. To ease the learning for this example, we are going to use a representative subset of scores from 30 of the 151 students (this also saves you time if you want to try it for yourself). The 30 students' scores (their ratings on the scale) are:
8, 7, 4, 10, 8, 6, 8, 9, 9, 7, 3, 7, 6, 5, 0, 9, 10, 7, 7, 3, 6, 7, 5, 2, 1, 6, 7, 10, 8, 8.
Looking through all these scores gives some sense of the overall tendencies, but this is hardly an accurate method. One solution is to make a table showing how many students used each of the 11 values that the ratings can have (0, 1, 2, and so on, through 10). We have done this in Table 1-3. We also figured the percentage each value's frequency is of the total number of scores. Tables like this sometimes give only the raw-number frequencies, not the percentages, or only the percentages and not the raw-number frequencies. In addition, some frequency tables include, for each value, the total number of scores with that value and all values preceding it. These are called cumulative frequencies because they tell how many scores are accumulated up to this point on the table. If percentages are used, cumulative percentages also may be included (for an example, see Figure 1-18 in the Using SPSS section on page 30). Cumulative percentages give, for each value, the percentage of scores up to and including that value. The cumulative percentage for any given value (or for a score having that value) is also called a percentile. Cumulative frequencies and cumulative percentages allow you to see where a particular score falls in the overall group of scores.
Table 1-3 is called a frequency table because it shows how frequently (how many times) each score was used. A frequency table makes the pattern of numbers easy to see. In this example, you can see that most of the students rated their stress level around 7 or 8, with few rating it very low.
Table 1-3 Frequency Table of Number of Students Rating Each Value of the Stress Scale
Stress Rating   Frequency   Percent
 0                 1          3.3
 1                 1          3.3
 2                 1          3.3
 3                 2          6.7
 4                 1          3.3
 5                 2          6.7
 6                 4         13.3
 7                 7         23.3
 8                 5         16.7
 9                 3         10.0
10                 3         10.0
Source: Data based on Aron et al. (1995).
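Cumulative frequencies and cumulative percentages are running totals, so they are easy to sketch in code. The example below is an illustration only; the variable names are made up, and the frequencies are the counts for stress ratings 0 through 10 tallied from the 30 scores listed earlier in this section.

```python
from itertools import accumulate

# Frequencies for stress ratings 0 through 10, tallied from the 30 scores in the text.
values = list(range(11))
frequencies = [1, 1, 1, 2, 1, 2, 4, 7, 5, 3, 3]
total = sum(frequencies)

# Cumulative frequency: number of scores with this value or any lower value.
cumulative_freq = list(accumulate(frequencies))
# Cumulative percentage (percentile): that running total as a percentage of all scores.
cumulative_pct = [round(100 * c / total, 1) for c in cumulative_freq]

for value, freq, cum_f, cum_p in zip(values, frequencies, cumulative_freq, cumulative_pct):
    print(f"{value:>2}  {freq:>2}  {cum_f:>2}  {cum_p:>5.1f}")
```

The last cumulative frequency always equals the total number of scores, which makes a convenient check.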
How to Make a Frequency Table
There are four steps in making a frequency table.
❶ Make a list down the page of each possible value, from lowest to highest. In the stress ratings results, the list goes from 0, the lowest possible rating, up to 10, the highest possible rating.¹ Note that even if one of the ratings between 0 and 10 is not used, you still include that value in the listing, showing it as having a frequency of 0. For example, if no one gives a stress rating of 2, you still include 2 as one of the values on the frequency table.
❷ Go one by one through the scores, making a mark for each next to its value on your list. This is shown in Figure 1-1.
❸ Make a table showing how many times each value on your list is used. That is, add up the number of marks beside each value.
❹ Figure the percentage of scores for each value. To do this, take the frequency for that value, divide it by the total number of scores, and multiply by 100. You may need to round off the percentage. We recommend that you round percentages to one decimal place. Note that because of the rounding, your percentages do not usually add up to exactly 100% (but they should be close).
frequency table: listing of number of individuals having each of the different values for a particular variable.
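The four steps can be followed in code as well as by hand. The sketch below is one possible illustration, not the book's procedure: the function name `frequency_table` is made up, and it assumes whole-number scores with a known lowest and highest possible value.

```python
from collections import Counter


def frequency_table(scores, lowest, highest):
    """Return (value, frequency, percent) rows for every possible value."""
    counts = Counter(scores)  # Steps 2 and 3: tally the scores and count the marks
    total = len(scores)
    rows = []
    for value in range(lowest, highest + 1):  # Step 1: list every possible value
        freq = counts[value]                  # a value nobody used gets a frequency of 0
        percent = round(100 * freq / total, 1)  # Step 4: percentage, one decimal place
        rows.append((value, freq, percent))
    return rows


# The 30 stress ratings from this chapter (possible values 0 through 10).
stress = [8, 7, 4, 10, 8, 6, 8, 9, 9, 7, 3, 7, 6, 5, 0,
          9, 10, 7, 7, 3, 6, 7, 5, 2, 1, 6, 7, 10, 8, 8]

rows = frequency_table(stress, 0, 10)
for value, freq, percent in rows:
    print(f"{value:>2}  {freq:>2}  {percent:>5.1f}")
```

Adding up the frequency column and comparing it with the number of scores is the same cross-check you would do by hand.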
Answers
1. (a) crankiness, (b) …, (c) 1 to 7.
2. A numeric variable has values that are numbers that tell you the degree or extent of what the variable measures; a nominal variable has values that are different categories and have no particular numerical order.
3. A discrete variable has specific values and has no values between the specific values. A continuous variable has, in theory, an infinite number of values between any two values.
4. (a) nominal, (b) equal-interval, (c) rank-order.
BOX 1-1
Important Trivia for Poetic Statistics Students
The word statistics comes from the Italian word statista, a person dealing with affairs of state (from stato, "state"). It was originally called "state arithmetic," involving the tabulation of information about nations, especially for the purpose of taxation and planning the feasibility of wars.
Statistics were needed in ancient times to figure the odds of shipwrecks and piracy for marine insurance that would encourage voyages of commerce and exploration to far-flung places. The modern study of mortality rates and life insurance descended from the 17th-century plague pits, counting the bodies of persons cut down in the bloom of youth. The theory of errors (covered in Chapter 12) began in astronomy, that is, with stargazing; the theory of correlation (Chapter 11) has its roots in biology, from the observation of parent and child differences. Probability theory (Chapter 3) arose in the tense environs of the gambling table. The theory of analysis of experiments (Chapters 7 to 10) began in breweries and out among waving fields of wheat, where correct guesses determined not only the survival of a tasty beer but of thousands of marginal farmers. Theories of measurement and factor analysis (Chapter 15) derived from personality psychology, where the depths of human character were first explored with numbers. And chi-square (Chapter 13) came to us from sociology, where it was often a question of class.
In the early days of statistics, it was popular to use the new methods to prove the existence of God. For example, John Arbuthnot discovered that more male than female babies were born in London between 1629 and 1710. In what is considered the first use of a statistical test, he proved that the male birthrate was higher than could be expected by chance (assuming that 50:50 was chance) and concluded that there was a plan operating, since males face more danger to obtain food for their families, and only God, he said, could do such planning.
In 1767, John Michell also used probability theory to prove the existence of God when he argued that the odds were 500,000 to 1 against six stars being placed as close together as those in the constellation Pleiades; so their placement had to have been a deliberate act of the Creator.
Statistics in the "state arithmetic" sense are legally endorsed by most governments today. For example, the first article of the U.S. Constitution requires a census. And statistics helped the United States win the Revolutionary War. John Adams obtained critical aid from Holland by pointing out certain vital statistics, carefully gathered by the clergy in local parishes, demonstrating that the colonies had doubled their population every 18 years, adding 20,000 fighting men per annum. "Is this the case of our enemy, Great Britain?" Adams wrote. "Which then can maintain the war the longest?"
Similar statistics were observed by U.S. President Thomas Jefferson in 1786. He wrote that his people "become uneasy" when there are more of them than 10 per square mile and that given the population growth of the new country, within 40 years these restless souls would fill up all of their country's "vacant land." Some 17 years later, Jefferson doubled the size of the United States' "vacant" land through the Louisiana Purchase.
Table 1-2 Levels of Measurement
Level            Definition                                                        Example
Equal-interval   Numeric variable in which differences between values correspond   Stress level, age
                 to differences in the underlying thing being measured
Rank-order       Numeric variable in which values correspond to the relative       Class standing, position
                 position of things measured                                       finished in a race
Nominal          Variable in which the values are categories                       Gender, religion
variables are also called categorical variables because their values are categories.) For example, for the nominal variable gender, the values are female and male. A person's "score" on the variable gender is one of these two values. Another example is psychiatric diagnosis, which has values such as major depression, post-traumatic
stress disorder, schizophrenia, and obsessive-compulsive disorder.
These different kinds of variables are based on different levels of measurement (see Table 1-2). Researchers sometimes have to decide how they will measure a particular variable. For example, they might use an equal-interval scale, a rank-order scale, or a nominal scale. The level of measurement selected affects the type of statistics that can be used with a variable. Suppose a researcher is studying the effects of a particular type of brain injury on being able to recognize objects. One approach the researcher might take would be to measure the number of different objects an injured person can observe at once. This is an example of an equal-interval level of measurement. Alternately, the researcher might rate people as able to observe no objects (rated 0), only one object at a time (rated 1), one object with a vague sense of other objects (rated 2), or ordinary vision (rated 3). This would be a rank-order approach. Finally, the researcher might divide people into those who are completely blind (rated B), those who can identify the location of an object but not what the object is (rated L), those who can identify what the object is but not locate it in space (rated I), those who can both locate and identify an object but have other abnormalities of object perception (rated O), and those with normal visual perception (rated N). This is a nominal level of measurement.
In this book, as in most psychology research, we focus mainly on numeric, equal-interval variables (or variables that roughly approximate equal-interval variables). We discuss statistical methods for working with nominal variables in Chapter 13 and methods for working with rank-order variables in Chapter 14.
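The brain-injury example can be sketched in a few lines of Python. This is only an illustration: the data for five people are invented, and the pairing of each level with a particular summary (mean, median, mode) is a common convention that we assume here rather than something stated in this chapter.

```python
from statistics import mean, median, mode

# Hypothetical data for five people, one list per measurement approach.
objects_seen = [0, 1, 3, 4, 4]                # equal-interval: objects observed at once
vision_rating = [0, 1, 2, 3, 3]               # rank-order: 0 (no objects) to 3 (ordinary vision)
vision_category = ["B", "L", "I", "N", "N"]   # nominal: category codes

def summarize(scores, level):
    """Return a summary that suits the level of measurement (assumed convention)."""
    if level == "equal-interval":
        return mean(scores)      # differences between values are meaningful
    if level == "rank-order":
        return median(scores)    # only the ordering of values is meaningful
    if level == "nominal":
        return mode(scores)      # values are categories, so count the most common
    raise ValueError(f"unknown level: {level}")

print(summarize(objects_seen, "equal-interval"))
print(summarize(vision_rating, "rank-order"))
print(summarize(vision_category, "nominal"))
```

The same people yield three different kinds of data, and the summary that makes sense changes with the level of measurement chosen.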
levels of measurement: types of underlying numerical information provided by a measure, such as equal-interval, rank-order, and nominal (categorical).
How are you doing?
1. A father rates his daughter as a 2 on a 7-point scale (from 1 to 7) of crankiness. In this example, (a) what is the variable, (b) what is the score, and (c) what is the range of values?
2. What is the difference between a numeric and a nominal variable?
3. What is the difference between a discrete and a continuous variable?
4. Give the level of measurement of each of the following variables: (a) a person's nationality (Mexican, Spanish, Ethiopian, Australian, etc.), (b) a person's score on a standard IQ test, (c) a person's place on a waiting list (first in line, second in line, etc.).
an example of a numeric variable. Numeric variables are also called quantitative
variables.
There are several kinds of numeric variables. In psychology research the most important distinction is between two types: equal-interval variables and rank-order variables. An equal-interval variable is a variable in which the numbers stand for approximately equal amounts of what is being measured. For example, grade point average (GPA) is a roughly equal-interval variable, since the difference between a GPA of 2.5 and 2.8 means about as much as the difference between a GPA of 3.0 and 3.3 (each is a difference of 0.3 of a GPA). Most psychologists also consider
scales like the 0-to-10 stress ratings as roughly equal interval. So, for example, a difference between stress ratings of 4 and 6 means about as much as the difference
between 7 and 9.
numeric variable: variable whose values are numbers (as opposed to a nominal variable). Also called quantitative variable.
equal-interval variable: variable in which the numbers stand for approximately equal amounts of what is being measured.
Some equal-interval variables are measured on what is called a ratio scale. An equal-interval variable is measured on a ratio scale if it has an absolute zero point. An absolute zero point means that the value of zero on the variable indicates a complete absence of the variable. Most counts or accumulations of things use a ratio scale. For example, the number of siblings a person has is measured on a ratio scale, because a zero value means having no siblings. With variables that are measured on a ratio scale, you can make statements about the difference in magnitude between values. So, we can say that a person with four siblings has twice as many siblings as a person with two siblings. However, most of the variables in psychology are not on a ratio scale.
Equal-interval variables can also be distinguished as being either discrete variables or continuous variables. A discrete variable is one that has specific values and cannot have values between the specific values. The number of times you went to the dentist in the last 12 months is a discrete variable. You may have gone 0, 1, 2, 3, or more times, but you can't have gone 1.72 times or 2.34 times. With a continuous variable, there are in theory an infinite number of values between any two values. So, even though we usually answer the question "How old are you?" with a specific age, such as 19 or 20, you could also answer it by saying that you are 19.26 years old. Height, weight, and time are examples of other continuous variables.
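As a rough illustration, the distinctions among equal-interval, ratio-scale, discrete, and continuous variables can be expressed as small checks in Python. The numbers come from the examples in the text; the variable and function names are our own and purely illustrative.

```python
# Equal-interval: equal differences in scores mean about equal differences in amount.
stress_gap_low = 6 - 4     # difference between stress ratings of 4 and 6
stress_gap_high = 9 - 7    # difference between stress ratings of 7 and 9
same_gap = stress_gap_low == stress_gap_high

# Ratio scale: zero means "none at all", so ratios of values are meaningful.
siblings_a, siblings_b = 4, 2
sibling_ratio = siblings_a / siblings_b   # four siblings is twice as many as two

# Discrete versus continuous: a count takes only whole values,
# while a continuous variable can take any value in between.
def is_whole_count(x):
    """True if x could be a count such as a number of dentist visits."""
    return x >= 0 and float(x).is_integer()

dentist_visits = 2     # discrete
age_years = 19.26      # continuous

print(same_gap, sibling_ratio)
print(is_whole_count(dentist_visits), is_whole_count(1.72), is_whole_count(age_years))
```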
ratio scale: an equal-interval variable is measured on a ratio scale if it has an absolute zero point, meaning that the value of zero on the variable indicates a complete absence of the variable.
discrete variable: variable that has specific values and that cannot have values between these specific values.
continuous variable: variable for which, in theory, there are an infinite number of values between any two values.
rank-order variable: numeric variable in which the values are ranks, such as class standing or place finished in a race. Also called ordinal variable.
The other main type of numeric variable, a rank-order variable, is a variable in which the numbers stand only for relative ranking. (Rank-order variables are also called ordinal variables.) A student's standing in his or her graduating class is an example. The amount of difference in underlying GPA between being second and third in class standing could be very unlike the amount of difference between being eighth and ninth.
A rank-order variable provides less information than an equal-interval variable. That is, the difference from one rank to the next doesn't tell you the exact difference in amount of what is being measured. However, psychologists often use rank-order variables because they are the only information available. Also, when people are being asked to rate something, it is sometimes easier and less arbitrary for them to make rank-order ratings. For example, when rating how much you like each of your friends, it may be easier to rank them by how much you like them than to rate your liking for them on a scale. Yet another reason researchers often use rank-order variables is that asking people to do rankings forces them to make distinctions. For example, if asked to rate how much you like each of your friends on a 1-to-10 scale, you might rate several of them at exactly the same level, but ranking would avoid such ties.
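The contrast between ratings, which permit ties, and rankings, which force distinctions, can be sketched as follows. The friends' names, the ratings, and the ranking order are all invented for illustration.

```python
# Hypothetical 1-to-10 liking ratings for four friends (names are made up).
ratings = {"Ana": 8, "Ben": 8, "Cal": 6, "Dee": 8}

# Ratings allow ties: here three friends share the same level.
has_ties = len(set(ratings.values())) < len(ratings)

# A forced ranking gives every friend a distinct rank (1 = liked most).
# The order is a hypothetical answer the rater would have to supply;
# it cannot be recovered from the tied ratings alone.
ranking_order = ["Dee", "Ana", "Ben", "Cal"]
ranks = {friend: position for position, friend in enumerate(ranking_order, start=1)}

print(has_ties)
print(ranks)
```

Note that the ranks say nothing about how much more Dee is liked than Ana, which is the information a rank-order variable gives up.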
nominal variable: variable with values that are categories (that is, they are names rather than numbers). Also called categorical variable.
Another major type of variable used in psychology research, which is not a numeric variable at all, is a nominal variable in which the values are names or categories. The term nominal comes from the idea that its values are names. (Nominal
In this chapter and the next, we focus on descriptive statistics. This topic is important in its own right, but it also prepares you to understand inferential statistics.
Inferential statistics are the focus of the remainder of the book.
In this chapter we introduce you to some basic concepts, and then you will learn to use tables and graphs to describe a group of numbers. The purpose of descriptive statistics is to make a group of numbers easy to understand. As you will see, tables and graphs help a great deal.
Some Basic Concepts
Variables, Values, and Scores
As part of a larger study (Aron, Paris, & Aron, 1995), researchers gave a questionnaire to students in an introductory statistics class during the first week of the course. One question asked was, "How stressed have you been in the last 2½ weeks, on a scale of 0 to 10, with 0 being not at all stressed and 10 being as stressed as possible?" (How would you answer?) In this study, the researchers used a survey to examine students' level of stress. Other methods that researchers use to study stress include measuring stress-related hormones in human blood or conducting controlled laboratory studies with animals.
In this example, level of stress is a variable, which can have values from 0 to 10, and the value of any particular person's answer is the person's score. If you answered
6, your score is 6; your score has a value of 6 on the variable called "level of stress."
More formally, a variable is a condition or characteristic that can have different
values. In short, it can vary. In our example, the variable was level of stress, which can have the values of 0 through 10. Height is a variable, social class is a variable, score on a creativity test is a variable, type of psychotherapy received by patients is a variable, speed on a reaction time test is a variable, number of people absent from
work on a given day is a variable, and so forth.
A value is just a number, such as 4, –81, or 367.12. A value can also be a category, such as male or female, or a psychiatric diagnosis—major depression, post-traumatic
stress disorder—and so forth.
Finally, on any variable, each person studied has a particular number or score that is his or her value on the variable. As we've said, your score on the stress variable might have a value of 6. Another student's score might have a value of 8.
Psychology research is about variables, values, and scores (see Table 1-1). The formal definitions are a bit abstract, but in practice, the meaning usually is clear.
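One way to keep the three terms apart is to write them down as data. The following Python sketch uses the stress example from the text; the dictionary keys and the helper function are hypothetical names chosen only for illustration.

```python
# The variable is the characteristic being measured.
variable = "level of stress"

# The values are the numbers the variable can take: 0 through 10.
possible_values = range(0, 11)

# A score is one particular person's value on the variable.
scores = {"you": 6, "another student": 8}

def is_valid_score(score):
    """A score must be one of the variable's possible values."""
    return score in possible_values

for person, score in scores.items():
    print(f"{person}: score of {score} on '{variable}' (valid: {is_valid_score(score)})")
```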
variable: characteristic that can have different values.
values: possible number or category that a score can have.
score: particular person's value on a variable.
Levels of Measurement (Kinds of Variables)
Most of the variables psychologists use are like those in the stress ratings example: the scores are numbers that tell you how much there is of what is being measured. In the stress ratings example, the higher the number is, the more stress there is. This is
Table 1-1 Some Basic Terminology
Term | Definition | Examples
Variable | Condition or characteristic that can have different values | Stress level, age, gender, religion
Value | Number or category | 0, 1, 2, 3, 4, 25, 85, female, Catholic
Score | A particular person's value on a variable | 0, 1, 2, 3, 4, 25, 85, female, Catholic
helps you to read the work of other psychologists, to do your own research if you so choose, and to hone both your reasoning and intuition. Formally, statistics is a branch of mathematics that focuses on the organization, analysis, and interpretation of a group of numbers. But really what is statistics? Think of statistics as a tool that has evolved from a basic thinking process employed by every human: you observe a thing; you wonder what it means or what caused it; you have an insight or make an intuitive guess; you observe again, but now in detail, or you try making little changes in the process to test your intuition. Then you face the eternal problem: was your hunch confirmed or not? What are the chances that what you observed this second time will happen again and again, so
that you can announce your insight to the world as something probably true?
Statistics is a method of pursuing truth. As a minimum, statistics can tell you the likelihood that your hunch is true in this time and place and with these sorts of people. This pursuit of truth, or at least its future likelihood, is the essence of psychology, of science, and of human evolution. Think of the first research questions: what will the mammoths do next spring? What will happen if I eat this root? It is easy to see how the early accurate "researchers" survived. You are here today because your ancestors exercised brains as well as brawn. Do those who come after you the same
favor: think carefully about outcomes. Statistics is one good way to do that.
Psychologists use statistical methods to help them make sense of the numbers they collect when conducting research. The issue of how to design good research is a topic in itself, summarized in a Web Chapter (Overview of the Logic and Language of Psychology Research) available on the Web site for this book http://www.pearsonhighered.com/. But in this text we confine ourselves to the statistical methods for making sense of the data collected through research.
Psychologists usually use a computer and statistical software to carry out statistical procedures, such as the ones you will learn in this book. However, the best way to develop a solid understanding of statistics is to learn how to do the procedures by hand (with the help of a calculator). To minimize the amount of figuring you have to do, we use relatively small groups of numbers in each chapter's examples and practice problems. We hope that this will also allow you to focus more on the underlying principles and logic of the statistical procedure, rather than on the mathematics of each practice problem (such as subtracting 3 from 7 and then dividing the result by 2 to give an answer of 2). (See the Introduction to the Student on pp. xvi—xviii for more information on the goals of this book.) Having said that, we also recognize the importance of learning how to do statistical procedures on a computer, as you most likely would when conducting your own research. So, at the end of relevant chapters, there is a section called Using SPSS (see also the Study Guide and Computer Workbook that accompanies this text and that includes a guide to getting started with
SPSS). SPSS statistical software is commonly used by psychologists and other behavioral and social scientists to carry out statistical analyses. Check with your instructor to see if you have access to SPSS at your institution.
statistics: branch of mathematics that focuses on the organization, analysis, and interpretation of a group of numbers.
The Two Branches of Statistical Methods
There are two main branches of statistical methods.
1. Descriptive statistics: Psychologists use descriptive statistics to summarize and describe a group of numbers from a research study.
2. Inferential statistics: Psychologists use inferential statistics to draw conclusions and to make inferences that are based on the numbers from a research study but that go beyond the numbers. For example, inferential statistics allow researchers to make inferences about a large group of individuals based on a research study in which a much smaller number of individuals took part.
descriptive statistics: procedures for summarizing a group of scores or otherwise making them more comprehensible.
inferential statistics: procedures for drawing conclusions based on the scores collected in a research study but going beyond them.
CHAPTER 1
Displaying the Order in a Group of Numbers Using Tables and Graphs
Chapter Outline
• The Two Branches of Statistical Methods
• Some Basic Concepts
• Histograms
• Shapes of Frequency Distributions
• Controversy: Misleading Graphs
• Frequency Tables and Histograms in Research Articles
• Summary
• Key Terms
• Example Worked-Out Problems
• Practice Problems
• Using SPSS
• Chapter Note
Welcome to Statistics for Psychology. We imagine you to be like other students we have known who have taken this course. You have chosen to major in psychology or a related field because you are fascinated by people: by the visible behaviors of the people around you, perhaps too by their inner lives as well as by your own. Some of you are highly scientific sorts; others are more intuitive. Some of you are fond of math; others are less so, or even afraid of it. Whatever your style, we welcome you. We want to assure you that if you give this book some special attention (perhaps a little more than most textbooks require), you will learn statistics. The approach used in this book has successfully taught all sorts of students before you, including those who had taken statistics previously and done poorly. With this book and your instructor's help, you will learn statistics and learn it well.
More importantly, we want to assure you that whatever your reason for studying psychology or a related field, this course is not a waste of time. Learning about statistics
enormously. Those who fear trouble ahead need to work with those who do not (the blind leading the blind is no way to learn). Pick group members who live near you so that it is easy for you to get together. Also, meet often—between each class, if possible.
A Final Note
Believe it or not, we love teaching statistics. Time and again, we have had the wonderful experience of having beaming students come to us to say, "Professor, I got a 90% on this exam. I can't believe it! Me, a 90 on a statistics exam!" Or the student who tells us, "This is actually fun. Don't tell anyone, but I'm actually enjoying . . . statistics, of all things!" We hope you will have these kinds of experiences in this course.
Arthur Aron
Elaine N. Aron
Elliot J. Coups
abstraction often is grasped only superficially at first, as slogans instead of useful knowledge. Of all the courses you are likely to take in psychology, this one will probably do the most to help you learn to think precisely, to evaluate information, and to apply logical analysis at a very high level.
How to Gain the Most from This Course
There are five things we can advise:
1. Keep your attention on the concepts. Treat this course less like a math
course and more like a course in logic. When you read a section of a chapter, your attention should be on grasping the principles. When working the exercises, think about why you are doing each step. If you simply try to memorize how to come up with the right numbers, you will have learned very little of use in your future
studies—nor will you do very well on the tests in this course.
2. Be sure you know each concept before you go on to the next. Statistics is cumulative. Each new concept is built on the last one. There are short "How Are You Doing?" self-tests at the end of each main chapter section. Be sure you do them. You may also find it helpful to review the "How Are You Doing" sections before working on the practice problems and when studying for exams. If you are having trouble answering a question at any time—or even if you can answer it but aren't sure you really understand it—stop. Reread the section, rethink it, ask for help. Do whatever you need to do to grasp it. Don't go on to the next section until you are completely confident you have gotten this one. If you are not sure, and you've already done the "How are you doing?" questions, take a look at the Example Worked-Out Problems toward the end of the chapter, or try working a practice problem on this material from the end of the chapter. The answers to the Set I practice problems are given
toward the end of the book so that you will be able to check your work.
Having to read the material in this book over and over does not mean that you are stupid. Most students have to read each chapter several times. And each reading in statistics is usually much slower than that in other textbooks. Statistics reading has to be pored over with clear, calm attention for it to sink in. Allow plenty of time for
this kind of reading and rereading.
3. Keep up. Again, statistics is cumulative. If you fall behind in your reading or miss lectures, the lectures you do attend will be almost meaningless. It will get harder and harder to catch up.
4. Study especially intensely in the first half of the course. It is particularly important to master the material thoroughly at the start of the course. Everything else you learn in statistics is built on what you learn at the start. Yet the beginning of the
semester is often when students study least.
If you have mastered the first half of the course—not just learned the general idea, but really know it—the second half will be easier. If you have not mastered the
first half, the second half will be close to impossible.
5. Help each other. There is no better way to solidify and deepen your understanding of statistics than to try to explain it to someone who is having a harder time. (Of course, this explaining has to be done with patience and respect.) For those of you who are having a harder time, there is no better way to work through the difficult
parts than by learning from another student who has just mastered the material.
Thus, we strongly urge you to form study groups with one to three other students. It is best if your group includes some who expect this material to come easily and some who don't. Those who learn statistics easily will get the most from helping others who have to struggle with it—the latter will tax the former's supposed understanding
Introduction to the Student
The goal of this book is to help you understand statistics. We emphasize meaning
and concepts, not just symbols and numbers.
This emphasis plays to your strength. Most psychology majors are not lovers of mathematics but are keenly attuned to ideas. And we want to underscore the following, based on our collective many decades of teaching experience: We have never had a student who could do well in other college courses who could not also do well in this course. (However, we admit that doing well in this course may require more
work than doing well in others.)
In this introduction, we discuss why you are taking this course and how you can gain the most from it.
Why Learn Statistics, Other Than to Fulfill a Requirement?
1. Understanding statistics is crucial to being able to read psychology research articles. Nearly every course you will take as a psychology major will emphasize the results of research studies, and these almost always are expressed using statistics. If you do not understand the basic logic of statistics (if you cannot make sense of the jargon, the tables, and the graphs that are at the heart of any research report), your reading of research will be very superficial. (We also recommend that you take a course on how to design and evaluate good research. In this book, we focus on the statistical methods for making sense of the data collected through research. However, we have included a downloadable chapter on the Web site for the book that provides an overview of the logic and language of psychology research.)
2. Understanding statistics is crucial to doing research yourself. Many psychology majors eventually decide to go on to graduate school. Graduate study in psychology, even in clinical and counseling psychology and other applied areas, almost always involves doing research. In fact, learning to do research on your own is often the main focus of graduate school, and doing research almost always involves statistics. This course gives you a solid foundation in the statistics you need for doing research. Further, by mastering the basic logic and ways of thinking about statistics, you will be unusually well prepared for the advanced courses, which focus on the nitty-gritty of analyzing research results.
Many psychology programs also offer opportunities for undergraduates to do research. The main focus of this book is understanding statistics, not using statistics. Still, you will learn the basics you need to analyze the results of the kinds of research you are likely to do. (Also, the Web site that accompanies this book has a special chapter to help you with practical issues in using what you learn in this book for analyzing results of your own research.)
3. Understanding statistics develops your analytic and critical thinking.
Psychology majors are often most interested in people and in improving things in the practical world. This does not mean that you avoid abstractions. In fact, the students we know are exhilarated most by the almost philosophical levels of abstraction where the secrets of human experience so often seem to hide. Yet even this kind of
Preface to the Instructor
outlines before the Practice Problems section and including definitions of key terms in the margin. For several chapters, we expanded the Using SPSS section that shows students how to carry out the chapter's procedures. Also, we added a Using SPSS section to Chapter 15 that shows students how to use SPSS to figure a partial correlation, internal consistency reliability, and an analysis of covariance (ANCOVA). Yet another addition is a section on multilevel modeling analysis in Chapter 15.
Keep in Touch
Our goal is to do whatever we can to help you make your course a success. If you have any questions or suggestions, please send us an email ( [email protected] will do for all of us). Also, if you should find an error somewhere, for everyone's benefit, please let us know right away. When errors have come up in the past, we have usually been able to fix them in the very next printing.
Acknowledgments
First and foremost, we are grateful to our students through the years, who have shaped our approach to teaching by rewarding us with their appreciation for what we have done well as well as their various means of extinguishing what we have done not so well. We also deeply appreciate all those students and instructors who
have sent us their ideas and encouragement.
We remain grateful to all of those who helped us with the first four editions of this book, as well as to those who helped with the four editions of the Brief Course version. For their very helpful input on the development of this fifth edition of Statistics for Psychology, we want to thank Mark Walter, Albion College; Helga Walz, University of Baltimore; Susan Nolan, Seton Hall University; Jwa K. Kim, Middle Tennessee State University; Steven Gangestad, University of New Mexico; Mark Vosvick, University of North Texas; Ann Lynn, Ithaca College; John Bechtold, Messiah College; Donald Sharpe, University of Regina; Terri-Lynn MacKay, University of Manitoba; and Jacqueline Bichsel, Penn State Harrisburg. We are extremely grateful to LeeAnn Doherty and Jeff Marshall of Prentice Hall for superbly leading us through the long revision process. Thanks are also due to Jill Traut, Lori Hazzard, and Fred Dahl for their excellent assistance with the production of this edition. We also particularly want to acknowledge Ted Whitley (East Carolina University) for identifying many crucial final changes to the text.
Arthur Aron
Elaine N. Aron
Elliot J. Coups
Credits
Data in Tables 7-11, 7-12, 8-4, 8-5, 9-9, 9-10, 10-15, 10-16, 11-7, 11-8, 13-9, and 13-10 are based on tables in Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Copyright © 1988 by Lawrence Erlbaum Associates, Inc. Reprinted by permission.
suitable for copying for student handouts). These lecture outlines and worked-out examples are especially useful to new instructors or those using our book for the first time, since structuring lectures and creating good examples is one of the most demanding parts of teaching the course.
9. Our Test Bank makes preparing exams easy. We supply approximately 40 multiple-choice, 25 fill-in, and 10 to 12 problem/essay questions for each chapter. Considering that the emphasis of the course is so conceptual, the multiple-choice questions will be particularly useful for instructors who do not have the resources to
grade essays.
10. The accompanying Study Guide and Computer Workbook focuses on mastering concepts and also includes instructions and examples for working problems with SPSS. Most study guides concentrate on plugging numbers into formulas and memorizing rules (which is consistent with the emphasis of the textbooks they accompany). For each chapter, our Study Guide and Computer Workbook provides learning objectives, the chapter's formulas (with all symbols defined), and summaries of steps of conducting each procedure covered in the chapter, plus a set of
self tests, including multiple-choice, fill-in, and problem/essay questions.
Also, our Study Guide and Computer Workbook goes beyond the brief SPSS sections in each text chapter to provide the needed support for teaching students to become comfortable with this program and carrying out analyses on the computer. First, there is a special appendix on getting started with SPSS. Then, in each chapter corresponding to the text chapters, there is a section showing in detail how to carry out the chapter's procedures with SPSS. (These sections include step-by-step instructions, examples, and illustrations of how each menu and each output appears on the screen.) There are also special activities for using SPSS to strengthen understanding. As far as we know, no other statistics textbook package provides this much depth of explanation.
What's New in This Fifth Edition
With each new edition we have worked to improve the writing, update content, and make adjustments based on our experience teaching and the wonderful input we
have received from instructors using the text.
A Web page, which is available to instructors who adopt the book and their students, supplements the text with four downloadable chapters: (1) the basics of research methods, (2) applying statistics in one's own research projects, (3) repeated-measures analysis of variance, and (4) integration of statistical tests and the general linear model (which also serves as an excellent review/overview of the entire book).
In the fourth edition, we reconceptualized the teaching of the material on correlation and regression. We had long resisted calls from instructors to move these topics to after the t test and analysis of variance, thinking that they worked best as descriptive statistics (in previous editions they came right after mean and standard deviation). On the other hand, many instructors will no doubt continue to prefer to follow our original order; so we have made sure in this edition that users can still go directly from Chapter 2 to correlation and regression (now Chapters 11 and 12), and
then return to Chapter 3 to begin the discussion of inferential statistics.
In this fifth edition, we of course have continued to focus on simplifying exposition and have done our usual updating of content, examples, boxes, controversies, and other elements, in addition to making a host of minor adjustments to make the book more effective. And we have added further pedagogical aids, such as adding essay
thinking in statistical theory and application, and this book reflects that engagement. For example, we devote an entire early chapter (Chapter 6) to effect size and power
and then return to these topics as we teach each technique.
5. We capitalize on the students' motivations. We do this in two ways. First, our examples emphasize topics or populations that students seem to find most interesting. The very first is from a real study in which students in their first week of an introductory statistics class rated how much stress they felt they were under. Other examples emphasize clinical, organizational, social, and educational psychology while being sure to include sufficient interesting examples from cognitive, developmental, and behavioral psychology, as well as social and cognitive neuroscience, to inspire students with the value of those approaches. (Also, our examples continually emphasize the usefulness of statistical methods and ideas as tools in the research process, never allowing students to feel that what they are learning is theory for the
sake of theory.)
Second, we have worked to make the book extremely straightforward and systematic in its explanation of basic concepts so that students can have frequent "aha" experiences. Such experiences bolster self-confidence and motivate further learning. It is quite inspiring to us to see even fairly modest students glow from having mastered some concept like negative correlation or the distinction between failing to reject the null hypothesis and supporting the null hypothesis. At the same time, we do not constantly remind them how greatly oversimplified we have made things, as some books do. Instead, we show students, in the controversy sections in particular,
how much there is for them to consider deeply, even in an introductory course.
6. We emphasize statistical methods as a living, growing field of research. We take the time to describe the issues, such as the relative merits of both significance testing and confidence intervals. In addition, each chapter includes one or more "boxes" about famous statisticians or interesting sidelights. The goal is for students to see statistical methods as human efforts to make sense out of the jumble of numbers generated by a research study—to see that statistics are not "given" by nature, not infallible, not perfect descriptions of the events they try to describe, but rather a language that is constantly improving through the careful thought of those who use it. We hope that this orientation will help them maintain a questioning, alert attitude
as students and later as professionals.
7. The final chapter looks at advanced procedures without actually teaching them in detail. It explains in simple terms how to make sense out of these statistics when they are encountered in research articles. Most psychology research articles today use methods such as analysis of covariance, multivariate analysis of variance, multilevel modeling, mediation, factor analysis, or structural equation modeling. Students completing the ordinary introductory statistics course are ill equipped to comprehend most of the articles they must read to prepare a paper or study a course topic in further depth. This chapter makes use of the basics that students have just learned (along with extensive excerpts from current research articles) to give a rudimentary understanding of these advanced procedures. This chapter also serves as a reference
guide that students can keep and use in the future when reading such articles.
8. We have written an Instructor's Manual that really helps teach the course. The Manual begins with a chapter summarizing what we have gleaned from our own teaching experience and the research literature on effectiveness in college teaching. The next chapter discusses alternative organizations of the course, tables of possible schedules and a sample syllabus, advice on structuring exams and an example test, and more still! Then each chapter, corresponding to the text chapters, provides full lecture outlines and additional worked-out examples not found in the text (in a form
mouse clicks. What is important today is that students work problems in a way that keeps them constantly aware of the underlying logic of what they are doing. Consider the population variance—the average of the squared deviations from the mean. This concept is directly displayed in the definitional formula (once the student is used to the symbols): Variance = $[\sum (X - M)^2]/N$. Repeatedly working problems using this formula ingrains the meaning in the student's mind. In contrast, the usual computational version of this formula only obscures this meaning: Variance = $[\sum X^2 - (\sum X)^2/N]/N$. Repeatedly working problems using this formula does nothing but teach the student the difference between $\sum X^2$ and $(\sum X)^2$!
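The contrast between the two versions of the formula can be seen in a short Python sketch. The function names and the five scores are made up for illustration and do not come from the text; the point is only that both versions give the same number, while the first one reads like the definition.

```python
def variance_definitional(scores):
    """Average of the squared deviations from the mean."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** 2 for x in scores) / n


def variance_computational(scores):
    """Same quantity from the sum of squared scores and the squared sum."""
    n = len(scores)
    sum_of_squares = sum(x ** 2 for x in scores)  # sum of X squared
    square_of_sum = sum(scores) ** 2              # (sum of X) squared
    return (sum_of_squares - square_of_sum / n) / n


scores = [1, 2, 4, 5, 8]  # made-up scores, not from the text
print(variance_definitional(scores))   # 6.0
print(variance_computational(scores))  # 6.0
```

In the definitional version each line corresponds to a phrase in the verbal description; the computational version gives no such hint of what is being averaged.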
Teaching these tired computational formulas today is an anachronism—at least 40 years out-of-date! Researchers do their statistics on computers now, and the use of statistical software makes the understanding of the basic principles, as they are symbolically expressed in the definitional formulas, more important than ever. Students still need to work lots of problems by hand to learn the material. But they need to work them using the definitional formulas that reinforce the concepts, not using the antiquated computational formulas that obscure them. Not since the era when Lyndon B. Johnson was U.S. president have those computational formulas made sense as time-savers when researchers had to work with large data sets by hand. Even then, however, they were poor teaching tools. (Because some instructors may feel naked without them, we still provide the computational formulas, usually in a
brief note at the end of the chapter.)
2. Each procedure is taught both verbally and numerically—and usually visually as well. In fact, when we introduce every formula, it has attached to it a concise statement of the formula in words. (The major formulas with their verbal descriptions are also repeated on the inside front cover.) Typically, each example lays out the procedures in worked-out formulas, in words (often with a list of steps), and illustrated with easy-to-grasp figures. Practice problems and test bank items, in turn, require the student to calculate results, write a short explanation in layperson's language of what they have done, and make a sketch (for example, of the distributions involved in a t test). The chapter material completely prepares the student for these
kinds of practice problems and test questions.
It is our repeated experience that these different ways of expressing an idea are crucial for establishing a concept in a student's mind. Many psychology students are more at ease with words than with numbers. In fact, some have a positive fear of all mathematics. Writing the formula in words and providing the lay-language explanation gives them an opportunity to do what they do best.
3. A main goal of any introductory statistics course in psychology is to prepare students to read research articles. The way a procedure such as a t test or an analysis of variance is described in a research article is often quite different from what the student expects from the standard textbook discussions. Therefore, as this book teaches a statistical method, it also gives examples of how that method is reported in current journal articles. And we don't just leave it there. The practice problems and test bank
items also include excerpts from journal articles for the student to explain.
4. The book is unusually up-to-date. Most introductory statistics textbooks read as if they were written in the 1950s. The basics are still the basics, but statisticians and researchers think far more subtly about those basics now. Today, the basics are undergirded by a new appreciation of effect size, power, limitations of significance testing, the accumulation of results through meta-analysis, the critical role of models, the underlying unity of difference and association statistics, the growing prominence of regression and associated methods, and a host of new developments arising from the central role of the computer in statistical analyses. We are much engaged in the latest
Preface to the Instructor
The heart of this book was written over a summer in a small apartment near the Place Saint Ferdinand, having been outlined in nearby cafés and on walks in the Bois de Boulogne. It is based on our collective many decades of experience teaching, researching, and writing. We believe that the result is a book as different from the conventional lot of statistics texts as Paris is from Pompeii, yet still comfortable and
stimulating to the long-suffering community of statistics instructors.
Our approach was developed over decades of successful teaching—successful not only in the sense that students have consistently rated the course (a statistics course, remember) as a highlight of their major, but also in the sense that students come back to us long after graduating saying, "I was light years ahead of my fellow graduate students because of your course," or "Even though I don't do research, your
course has really helped me read the journals in my field."
The response to the first four editions has been overwhelming. We have received hundreds of thank-you emails and letters from instructors (and from students themselves!) from all over the world. (The text has been translated into Traditional Chinese and Spanish.) Of course, we have also been delighted by the enthusiastic reviews it has received, starting with the first edition in Contemporary Psychology (Bourgeois, 1997) and continuing through recent years (Shevlin, 2005, in Psychology
Learning and Teaching).
With each revision, we have tried to maintain those things about the book that have been especially appreciated, while reworking the text to take into account the feedback we have received, our own experiences, and advances and changes in the field. We have also added new pedagogical features to make the book even more accessible for students. (As we undertook this fifth edition we were particularly concerned that the book not become stale and that it remain as lively and as up-to-date as our very first edition.) However, before turning to what's new in this latest revision, we want to reiterate what we said with the first edition about how this book, from the beginning, has been so different from other statistics texts.
How This Book Was Dramatically Different from the Start
Different as this book is, it has from the start also done what the best of the better statistics texts of the last few years have been already doing pretty well: emphasizing the intuitive, de-emphasizing the mathematical, and explaining everything in direct, simple language. But what we have done continues to go beyond even the best of the
current lot in 10 key respects.
1. The definitional formulas are brought to center stage because they provide a concise symbolic summary of the logic of each particular procedure. All our explanations, examples, practice problems, and test bank items are based on these definitional formulas. (The amount of data to be processed in practice problems and test
bank items is reduced appropriately to keep computations manageable.)
Why this approach? Even in 2008, statistics texts have still not faced the technological realities. What is important today is not that the students learn to calculate a t test with a large data set—programs like SPSS can do this in an instant with a few
Contents
Box 15-2: The Golden Age of Statistics: Four Guys Around London 627
Procedures That Compare Groups 634
Analysis of Covariance (ANCOVA) 634
Multivariate Analysis of Variance (MANOVA) and Multivariate Analysis
of Covariance (MANCOVA) 635
Overview of Statistical Techniques 636
Controversy: Should Statistics Be Controversial? 637
Box 15-3: The Forced Partnership of Fisher and Pearson 638
How to Read Results Using Unfamiliar Statistical Techniques 639
Summary 641
Key Terms 642
Practice Problems 642
Using SPSS 654
Chapter Notes 662
Appendix: Tables 664
Answers to Set I Practice Problems 673
Glossary 701
Glossary of Symbols 708
References 710
Index 719
Web Chapters (available at http://www.pearsonhighered.com)
Chapter W1 Overview of the Logic and Language of Psychology Research
Chapter W2 Applying Statistical Methods in Your Own Project
Chapter W3 Repeated-Measures Analysis of Variance
Chapter W4 Integration and the General Linear Model
Issues in Prediction 503
Multiple Regression 506
Limitations of Prediction 508
Controversy: Unstandardized and Standardized Regression Coefficients; Comparing Predictors 509
Box 12-1: Clinical versus Statistical Prediction 510
Prediction in Research Articles 511
Advanced Topic: Error and Proportionate Reduction in Error 514
Summary 518
Key Terms 519
Example Worked-Out Problems 519
Practice Problems 524
Using SPSS 532
Chapter Notes 535
Chapter 13 Chi-Square Tests 536
Box 13-1: Karl Pearson, Inventor of Chi-Square and Center of Controversy 537
The Chi-Square Statistic and the Chi-Square Test for Goodness of Fit 538
The Chi-Square Test for Independence 546
Assumptions for Chi-Square Tests 554
Effect Size and Power for Chi-Square Tests for Independence 554
Controversy: The Minimum Expected Frequency 558
Chi-Square Tests in Research Articles 559
Summary 560
Key Terms 561
Example Worked-Out Problems 561
Practice Problems 565
Using SPSS 572
Chapter Notes 576
Chapter 14 Strategies When Population Distributions
Are Not Normal: Data Transformations and Rank-Order Tests 577
Assumptions in the Standard Hypothesis-Testing Procedures 578
Data Transformations 580
Rank-Order Tests 585
Comparison of Methods 589
Controversy: Computer-Intensive Methods 591
Box 14-1: Where Do Random Numbers Come From? 594
Data Transformations and Rank-Order Tests in Research Articles 595
Summary 596
Key Terms 597
Example Worked-Out Problems 597
Practice Problems 597
Using SPSS 602
Chapter Notes 609
Chapter 15 The General Linear Model and Making Sense of Advanced
Statistical Procedures in Research Articles 611
The General Linear Model 612
Box 15-1: Two Women Make a Point About Gender and Statistics 616
Partial Correlation 617
Reliability 618
Multilevel Modeling 620
Factor Analysis 622
Causal Modeling 625
Analyses of Variance in Research Articles 344
Advanced Topic: The Structural Model in the Analysis of Variance 345
Principles of the Structural Model 345
Summary 351
Key Terms 352
Example Worked-Out Problems 353
Practice Problems 357
Using SPSS 364
Chapter Notes 368
Chapter 10 Factorial Analysis of Variance 370
Basic Logic of Factorial Designs and Interaction Effects 371
Recognizing and Interpreting Interaction Effects 376
Basic Logic of the Two-Way Analysis of Variance 386
Box 10-1: Personality and Situational Influences on Behavior: An Interaction Effect 387
Assumptions in the Factorial Analysis of Variance 389
Extensions and Special Cases of the Analysis of Variance 389
Controversy: Dichotomizing Numeric Variables 391
Factorial Analysis of Variance in Research Articles 393
Advanced Topic: Figuring a Two-Way Analysis of Variance 395
Advanced Topic: Power and Effect Size in the Factorial Analysis
of Variance 406
Summary 410
Key Terms 411
Example Worked-Out Problems 412
Practice Problems 415
Using SPSS 426
Chapter Notes 431
Chapter 11 Correlation 432
Graphing Correlations: The Scatter Diagram 434
Patterns of Correlation 437
The Correlation Coefficient 443
Box 11-1: Galton: Gentleman Genius 446
Significance of a Correlation Coefficient 452
Correlation and Causality 456
Issues in Interpreting the Correlation Coefficient 458
Box 11-2: Illusory Correlation: When You Know Perfectly Well That If It's Big,
It's Fat-and You Are Perfectly Wrong 460
Effect Size and Power for the Correlation Coefficient 464
Controversy: What Is a Large Correlation? 466
Correlation in Research Articles 467
Summary 469
Key Terms 471
Example Worked-Out Problems 471
Practice Problems 474
Using SPSS 482
Chapter Notes 485
Chapter 12 Prediction 487
Predictor (X) and Criterion (Y) Variables 488
The Linear Prediction Rule 488
The Regression Line 492
Finding the Best Linear Prediction Rule 496
The Least Squared Error Principle 498
Summary 214
Key Terms 215
Example Worked-Out Problems 215
Practice Problems 217
Chapter Note 221
Chapter 7
Introduction to t Tests: Single Sample and Dependent Means 222
The t Test for a Single Sample 223
Box 7-1: William S. Gosset, Alias "Student": Not a Mathematician, But a Practical Man 224
The t Test for Dependent Means 236
Assumptions of the t Test for a Single Sample and the t Test for Dependent
Means 247
Effect Size and Power for the t Test for Dependent Means 247
Controversy: Advantages and Disadvantages of Repeated-Measures
Designs 250
Box 7-2: The Power of Studies Using Difference Scores: How the Lanarkshire Milk Experiment Could Have Been Milked for More 251
Single Sample t Tests and Dependent Means t Tests in Research Articles 252
Summary 253
Key Terms 254
Example Worked-Out Problems 254
Practice Problems 258
Using SPSS 265
Chapter Notes 268
Chapter 8
The t Test for Independent Means 270
The Distribution of Differences Between Means 271
Hypothesis Testing with a t Test for Independent Means 278
Assumptions of the t Test for Independent Means 286
Box 8-1: Monte Carlo Methods: When Mathematics Becomes Just an Experiment, and Statistics Depend on a Game of Chance 286
Effect Size and Power for the t Test for Independent Means 288
Review and Comparison of the Three Kinds of t Tests 290
Controversy: The Problem of Too Many t Tests 291
The t Test for Independent Means in Research Articles 292
Advanced Topic: Power for the t Test for Independent Means When Sample Sizes
Are Not Equal 293
Summary 294
Key Terms 295
Example Worked-Out Problems 295
Practice Problems 298
Using SPSS 305
Chapter Notes 309
Chapter 9
Introduction to the Analysis of Variance 310
Basic Logic of the Analysis of Variance 311
Box 9-1: Sir Ronald Fisher, Caustic Genius of Statistics 317
Carrying Out an Analysis of Variance 319
Hypothesis Testing with the Analysis of Variance 327
Assumptions in the Analysis of Variance 331
Planned Contrasts 334
Post Hoc Comparisons 337
Effect Size and Power for the Analysis of Variance 339
Controversy: Omnibus Tests versus Planned Contrasts 343
Controversies: Is the Normal Curve Really So Normal? and Using Nonrandom
Samples 93
Z Scores, Normal Curves, Samples and Populations, and Probabilities
in Research Articles 95
Advanced Topics: Probability Rules and Conditional Probabilities 96
Summary 97
Key Terms 98
Example Worked-Out Problems 99
Practice Problems 102
Using SPSS 105
Chapter Notes 106
Chapter 4
Introduction to Hypothesis Testing 107
A Hypothesis-Testing Example 108
The Core Logic of Hypothesis Testing 109
The Hypothesis-Testing Process 110
One-Tailed and Two-Tailed Hypothesis Tests 119
Controversy: Should Significance Tests Be Banned? 124
Box 4-1: Jacob Cohen, the Ultimate New Yorker: Funny, Pushy, Brilliant, and Kind 126
Hypothesis Tests in Research Articles 127
Summary 128
Key Terms 129
Example Worked-Out Problems 129
Practice Problems 131
Chapter Notes 136
Chapter 5
Hypothesis Tests with Means of Samples 137
The Distribution of Means 138
Hypothesis Testing with a Distribution of Means: The Z Test 146
Box 5-1: More About Polls: Sampling Errors and Errors in Thinking About Samples 147
Controversy: Marginal Significance 153
Hypothesis Tests About Means of Samples (Z Tests) and Standard Errors
in Research Articles 154
Advanced Topic: Estimation, Standard Errors, and Confidence Intervals 156
Advanced Topic Controversy: Confidence Intervals versus Significance Tests 162
Advanced Topic: Confidence Intervals in Research Articles 163
Summary 163
Key Terms 164
Example Worked-Out Problems 164
Practice Problems 167
Chapter Notes 173
Chapter 6
Making Sense of Statistical Significance: Decision Errors, Effect Size, and Statistical Power 175
Decision Errors 175
Effect Size 179
Box 6-1: Effect Sizes for Relaxation and Meditation: A Restful Meta-Analysis 184
Statistical Power 187
What Determines the Power of a Study? 191
Box 6-2: The Power of Typical Psychology Experiments 199
The Role of Power When Planning a Study 203
The Role of Power When Interpreting the Results of a Study 205
Controversy: Statistical Significance versus Effect Size 208
Decision Errors, Effect Size, and Power in Research Articles 210
Advanced Topic: Figuring Statistical Power 212
Contents
Preface to the Instructor xi
Introduction to the Student xvi
Chapter 1
Displaying the Order in a Group of Numbers Using Tables and Graphs 1
The Two Branches of Statistical Methods 2
Some Basic Concepts 3
Box 1-1: Important Trivia for Poetic Statistics Students 6
Frequency Tables 7
Histograms 10
Box 1-2: Math Anxiety, Statistics Anxiety, and You: A Message for Those of You Who Are Truly Worried About This Course 12
Shapes of Frequency Distributions 15
Controversy: Misleading Graphs 19
Frequency Tables and Histograms in Research Articles 21
Summary 23
Key Terms 24
Example Worked-Out Problems 24
Practice Problems 25
Using SPSS 29
Chapter Note 32
Chapter 2
Central Tendency and Variability 33
Central Tendency 34
Variability 43
Box 2-1: The Sheer Joy (Yes, Joy) of Statistical Analysis 51
Controversy: The Tyranny of the Mean 52
Box 2-2: Gender, Ethnicity, and Math Performance 53
Central Tendency and Variability in Research Articles 55
Summary 57
Key Terms 57
Example Worked-Out Problems 57
Practice Problems 59
Using SPSS 62
Chapter Notes 65
Chapter 3
Some Key Ingredients for Inferential Statistics: Z Scores, the Normal Curve, Sample versus Population, and Probability 67
Z Scores 68
The Normal Curve 73
Box 3-1: de Moivre, the Eccentric Stranger Who Invented the Normal Curve 74
Sample and Population 83
Box 3-2: Surveys, Polls, and 1948's Costly "Free Sample" 86
Probability 88
Box 3-3: Pascal Begins Probability Theory at the Gambling Table, Then Learns
to Bet on God 89
Brief Contents
Chapter W1 Overview of the Logic and Language of Psychology Research
Chapter W2 Applying Statistical Methods in Your Own Project
Chapter W3 Repeated-Measures Analysis of Variance
Chapter W4 Integration and the General Linear Model
Brief Contents
Preface to the Instructor xi
Introduction to the Student xvi
Chapter 1 Displaying the Order in a Group of Numbers Using Tables and Graphs 1
Chapter 2 Central Tendency and Variability 33
Chapter 3 Some Key Ingredients for Inferential Statistics: Z Scores, the Normal Curve, Sample versus Population, and Probability 67
Chapter 4 Introduction to Hypothesis Testing 107
Chapter 5 Hypothesis Tests with Means of Samples 137
Chapter 6 Making Sense of Statistical Significance: Decision Errors, Effect Size, and Statistical Power 175
Chapter 7 Introduction to t Tests: Single Sample and Dependent Means 222
Chapter 8 The t Test for Independent Means 270
Chapter 9 Introduction to the Analysis of Variance 310
Chapter 10 Factorial Analysis of Variance 370
Chapter 11 Correlation 432
Chapter 12 Prediction 487
Chapter 13 Chi-Square Tests 536
Chapter 14 Strategies When Population Distributions Are Not Normal: Data Transformations and Rank-Order Tests 577
Chapter 15 The General Linear Model and Making Sense of Advanced Statistical Procedures in Research Articles 611
Appendix: Tables 664
Answers to Set I Practice Problems 673
Glossary 701
Glossary of Symbols 708
References 710
Index 719
Web Chapters (available at http://www.pearsonhighered.com)
Cover Art: Courtesy of Photodisc/Getty Images.
Taken from:
Statistics for Psychology, Fifth Edition
by Arthur Aron, Elaine N. Aron, and Elliot J. Coups
Copyright © 2009, 2006, 2003, 1999, 1994 by Pearson Education
Published by Prentice Hall
Upper Saddle River, New Jersey 07458
All rights reserved. No part of this book may be reproduced, in any form or by any means, without permission in writing from the publisher.
This special edition published in cooperation with Pearson Learning Solutions.
All trademarks, service marks, registered trademarks, and registered service marks are the property of their respective owners and are used herein for identification purposes only.
Pearson Learning Solutions, 501 Boylston Street, Suite 900, Boston, MA 02116
A Pearson Education Company
Printed in the United States of America
1 2 3 4 5 6 7 8 9 10 OBRV 17 16 15 14 13 12
000200010271653336
PEARSON
ISBN 10: 1-256-77287-9 ISBN 13: 978-1-256-77287-3
PEARSON
ALWAYS LEARNING
Arthur Aron • Elaine N. Aron • Elliot J. Coups
Statistics for
Psychology
Custom Edition for Liberty University
Taken from:
Statistics for Psychology, Fifth Edition
by Arthur Aron, Elaine N. Aron, and Elliot J. Coups
The estimated variance of the distribution of means is the sum of each sample mean's squared deviation from the grand mean, divided by the degrees of freedom for the between-groups population variance estimate.
$S_M^2 = \frac{\sum (M - GM)^2}{df_{\text{Between}}}$ (9-2)
The between-groups population variance estimate (or mean squares between) is the estimated variance of the distribution of means multiplied by the number of scores in each group.
$S_{\text{Between}}^2 \text{ or } MS_{\text{Between}} = (S_M^2)(n)$ (9-4)
The F ratio is the between-groups population variance estimate (or mean squares between) divided by the within-groups population variance estimate (or mean squares within).
$F = \frac{S_{\text{Between}}^2}{S_{\text{Within}}^2} \text{ or } F = \frac{MS_{\text{Between}}}{MS_{\text{Within}}}$ (9-5)
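The F ratio combines two estimates: the within-groups estimate (the average of the groups' estimated variances) and the between-groups estimate (the estimated variance of the distribution of means multiplied by the number of scores in each group). The Python sketch below follows those verbal descriptions step by step. It assumes equal group sizes, and the function names and scores are made up for illustration rather than taken from the text.

```python
def estimated_variance(scores):
    """Sum of squared deviations from the mean, divided by N - 1."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** 2 for x in scores) / (n - 1)


def f_ratio(groups):
    """F ratio for a one-way analysis of variance with equal group sizes."""
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes equal group sizes")
    n_groups = len(groups)

    # Within-groups estimate: average of the groups' estimated variances.
    s2_within = sum(estimated_variance(g) for g in groups) / n_groups

    # Between-groups estimate: variance of the group means times n.
    means = [sum(g) / n for g in groups]
    grand_mean = sum(means) / n_groups
    df_between = n_groups - 1
    s2_m = sum((m - grand_mean) ** 2 for m in means) / df_between
    s2_between = s2_m * n

    return s2_between / s2_within


# Made-up scores for three groups of five people each.
groups = [[8, 10, 9, 11, 12], [6, 7, 5, 9, 8], [4, 5, 3, 6, 7]]
print(round(f_ratio(groups), 2))
```

Writing each intermediate quantity on its own line keeps the code parallel to the step-by-step hand calculation.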
The between-groups population variance estimate is the sum of squared deviations of each score's group's mean from the grand mean divided by the degrees of freedom for the between-groups population variance estimate.
$S_{\text{Between}}^2 = \frac{\sum (M - GM)^2}{df_{\text{Between}}} \text{ or } MS_{\text{Between}} = \frac{SS_{\text{Between}}}{df_{\text{Between}}}$ (9-10)
The within-groups population variance estimate is the sum of squared deviations of each score from its group's mean divided by the degrees of freedom for the within-groups population variance estimate.
$S_{\text{Within}}^2 = \frac{\sum (X - M)^2}{df_{\text{Within}}} \text{ or } MS_{\text{Within}} = \frac{SS_{\text{Within}}}{df_{\text{Within}}}$ (9-11)
The sum of squared deviations for rows is the sum of each score's row's mean's squared deviation from the grand mean.
$SS_{\text{Rows}} = \sum (M_{\text{Row}} - GM)^2$ (10-1)
The sum of squared deviations for the interaction is the sum of the squares of each score's deviation from the grand mean minus its deviation from its cell's mean, minus its row's mean's deviation from the grand mean, minus its column's mean's deviation from the grand mean.
$SS_{\text{Interaction}} = \sum [(X - GM) - (X - M) - (M_{\text{Row}} - GM) - (M_{\text{Column}} - GM)]^2$ (10-3)
The sum of squared deviations within groups (within cells) is the sum of each score's squared deviation from its cell's mean.
$SS_{\text{Within}} = \sum (X - M)^2$ (10-4)
The correlation coefficient is the sum, over all the people in the study, of the product of each person's two deviation scores, divided by the square root of the result of multiplying the sum of everyone's squared deviation scores on the X variable by the sum of everyone's squared deviation scores on the Y variable.
$r = \frac{\sum [(X - M_X)(Y - M_Y)]}{\sqrt{(SS_X)(SS_Y)}}$ (11-1)
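The verbal description of the correlation coefficient translates almost word for word into code. The following Python sketch is illustrative only: the function name and the five pairs of scores are assumptions made for this example, not data from the text.

```python
import math


def correlation(x_scores, y_scores):
    """Correlation coefficient from deviation scores."""
    n = len(x_scores)
    mean_x = sum(x_scores) / n
    mean_y = sum(y_scores) / n

    # Each person's two deviation scores.
    dev_x = [x - mean_x for x in x_scores]
    dev_y = [y - mean_y for y in y_scores]

    # Sum of the products of deviation scores.
    sum_products = sum(dx * dy for dx, dy in zip(dev_x, dev_y))

    # Sums of squared deviation scores on each variable.
    ss_x = sum(dx ** 2 for dx in dev_x)
    ss_y = sum(dy ** 2 for dy in dev_y)

    return sum_products / math.sqrt(ss_x * ss_y)


# Made-up scores for five people on two variables.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(round(correlation(x, y), 2))
```

Note that the sketch divides by zero if every score on one variable is the same, since the sum of squared deviations is then 0.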
A person's predicted score on the criterion variable equals the regression constant plus the regression coefficient multiplied by that person's score on the predictor variable.
$\hat{Y} = a + (b)(X)$ (12-1)
Chi-square is the sum, over all the categories or cells, of the squared difference between observed and expected frequencies divided by the expected frequency.
$\chi^2 = \sum \frac{(O - E)^2}{E}$ (13-1)
A cell's expected frequency is the number in its row divided by the total number, multiplied by the number in its column.
$E = \left(\frac{R}{N}\right)(C)$ (13-2)
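The two chi-square descriptions work together in a test for independence: each cell's expected frequency comes from its row and column totals, and chi-square then sums the squared observed-minus-expected differences divided by the expected frequencies. A minimal Python sketch follows, with a hypothetical function name and a made-up 2 x 2 table of observed frequencies.

```python
def chi_square_independence(observed):
    """Chi-square for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    total = sum(row_totals)

    chi_square = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected frequency: (row total / total) * column total.
            e = (row_totals[i] / total) * column_totals[j]
            chi_square += (o - e) ** 2 / e
    return chi_square


# Made-up observed frequencies, not from the text.
observed = [[20, 30],
            [30, 20]]
print(chi_square_independence(observed))  # 4.0
```

Here every expected frequency is 25, so each of the four cells contributes (5 squared)/25 = 1 to the total.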
Major Formulas
Formula Number
The mean is the sum of the scores divided by the number of scores.
$M = \frac{\sum X}{N}$ (2-1)
The variance is the sum of the squared deviations of the scores from the mean, divided by the number of scores.
$SD^2 = \frac{\sum (X - M)^2}{N}$ (2-2)
A Z score is the raw score minus the mean, divided by the standard deviation.
$Z = \frac{X - M}{SD}$ (3-1)
The variance of a distribution of means is the variance of the population of individuals divided by the number of individuals in each sample.
$\sigma_M^2 = \frac{\sigma^2}{N}$ (5-2)
The effect size for the difference between two means is the difference between the population means divided by the population's standard deviation.
$d = \frac{\mu_1 - \mu_2}{\sigma}$ (6-1)
The estimated population variance is the sum of the squared deviation scores divided by the number of scores minus 1.
$S^2 = \frac{\sum (X - M)^2}{N - 1} = \frac{SS}{N - 1}$ (7-1)
The variance of the distribution of means based on an estimated population variance is the estimated population variance divided by the number of scores in the sample.
$S_M^2 = \frac{S^2}{N}$ (7-5)
The t score in a single sample t test and a t test for dependent means is the sample mean minus the population mean, divided by the standard deviation of the distribution of means.
$t = \frac{M - \mu}{S_M}$ (7-7)
The pooled estimate of the population variance is the degrees of freedom in the first sample divided by the total degrees of freedom (from both samples) multiplied by the population variance estimate based on the first sample, plus the degrees of freedom in the second sample divided by the total degrees of freedom multiplied by the population variance estimate based on the second sample.
$S_{\text{Pooled}}^2 = \frac{df_1}{df_{\text{Total}}}(S_1^2) + \frac{df_2}{df_{\text{Total}}}(S_2^2)$ (8-1)
The variance of the distribution of means for the first population (based on an estimated population variance) is the pooled estimate of the population variance divided by the number of participants in the sample from the first population.
$S_{M_1}^2 = \frac{S_{\text{Pooled}}^2}{N_1}$ (8-2)
The variance of the distribution of differences between means is the variance of the distribution of means for the first population (based on an estimated population variance) plus the variance of the distribution of means for the second population (based on an estimated population variance).
$S_{\text{Difference}}^2 = S_{M_1}^2 + S_{M_2}^2$ (8-4)
The t score in a t test for independent means is the difference between the two sample means divided by the standard deviation of the distribution of differences between means.
$t = \frac{M_1 - M_2}{S_{\text{Difference}}}$ (8-7)
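The chain of steps for the t test for independent means (pooled variance estimate, variance of each distribution of means, variance of the distribution of differences, then t) can be sketched in Python. The function names and the two small samples are made up for illustration; they are not from the text.

```python
import math


def estimated_variance(scores):
    """Sum of squared deviations from the mean, divided by N - 1."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** 2 for x in scores) / (n - 1)


def t_independent_means(sample1, sample2):
    """Return (t, total degrees of freedom) for two independent samples."""
    n1, n2 = len(sample1), len(sample2)
    df1, df2 = n1 - 1, n2 - 1
    df_total = df1 + df2

    # Pooled estimate: each sample's estimate weighted by its share of df.
    s2_pooled = ((df1 / df_total) * estimated_variance(sample1)
                 + (df2 / df_total) * estimated_variance(sample2))

    # Variance of each distribution of means, then of the differences.
    s2_m1 = s2_pooled / n1
    s2_m2 = s2_pooled / n2
    s_difference = math.sqrt(s2_m1 + s2_m2)

    mean1 = sum(sample1) / n1
    mean2 = sum(sample2) / n2
    return (mean1 - mean2) / s_difference, df_total


# Made-up scores for two groups of five people each.
t, df = t_independent_means([6, 4, 9, 7, 4], [3, 1, 5, 2, 4])
print(round(t, 2), df)
```

With these scores the two variance estimates are 4.5 and 2.5, so the pooled estimate is 3.5 and the variance of the distribution of differences is 0.7 + 0.7 = 1.4.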
The within-groups population variance estimate (or mean squares within) is the sum of the population variance estimates based on each sample, divided by the number of groups.
$S_{\text{Within}}^2 \text{ or } MS_{\text{Within}} = \frac{S_1^2 + S_2^2 + \cdots + S_{\text{Last}}^2}{N_{\text{Groups}}}$ (9-1)