Case study Statistics


Chapter 1

Figure 1-20 SPSS histogram for the social interactions example (variable socint; horizontal axis from 0.00 to 50.00; Mean = 17.39, Std. Dev. = 11.55, N = 94). (Data from McLaughlin-Volpe et al., 2001.)

[...] research articles follow the procedure we recommend here: going from lowest at the top to highest at the bottom. However, some statistics authorities recommend going from highest at the top to lowest at the bottom.


Displaying the Order in a Group of Numbers

Figure 1-19 SPSS frequency table for the social interactions example, with columns for Frequency, Percent, Valid Percent, and Cumulative Percent. (Data from McLaughlin-Volpe et al., 2001.)

Practice these steps by creating a histogram for the social interactions example in this chapter (the scores are listed on p. 8). Your output window should look like Figure 1-20. Notice that SPSS automatically creates a histogram based on a grouped frequency table, with an interval in this case of 3 (1-3, 4-6, 7-9, and so on). (Should you wish, you can change the number of intervals or the interval size for the histogram by doing the following: Place your mouse cursor on the histogram and double-click to bring up a Chart Editor window; place your mouse cursor over one of the bars in the histogram and double-click to bring up a Properties window; click the tab labeled Binning; click Custom; then enter the interval size you want in the box labeled Interval Width; click Apply. If you want a nongrouped histogram, type in "1" for the interval size.)
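The grouping that SPSS applies here (intervals of 3: 1-3, 4-6, 7-9, and so on) can be sketched in a few lines of Python. This is only an illustration of the idea, not a description of how SPSS works internally. The function name `grouped_frequency_table` and the scores are assumptions invented for the example, and the sketch assumes whole-number scores that are all at least as large as `start`.

```python
def grouped_frequency_table(scores, start, width):
    """Count scores in equal-width intervals: start to start+width-1, and so on."""
    highest = max(scores)
    rows = []
    low = start
    while low <= highest:
        high = low + width - 1
        # Frequency for this interval: how many scores fall from low to high, inclusive
        freq = sum(1 for s in scores if low <= s <= high)
        rows.append((f"{low}-{high}", freq))
        low += width  # every interval has the same size
    return rows

# Hypothetical whole-number scores, invented for illustration only.
scores = [1, 2, 2, 4, 5, 7, 7, 8, 9, 12]
grouped = grouped_frequency_table(scores, start=1, width=3)
for interval, freq in grouped:
    print(f"{interval:>6} {freq:>3}")
```

Calling the same function with `width=1` gives one interval per value, which corresponds to the nongrouped case mentioned above.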

Figure 1-18 SPSS data window and frequencies window for the social interactions example. (Data from McLaughlin-Volpe et al., 2001.)

Creating a Histogram

1. Enter the scores from your distribution in one column of the data window.
2. Click Analyze.
3. Click Descriptive statistics.
4. Click Frequencies.
5. Click the variable you want to make a histogram of and then click the arrow.
6. Click Charts, click Histograms, click Continue.
7. Optional: To instruct SPSS not to produce a frequency table, click the box labeled Display frequency tables (this unchecks the box).
8. Click OK.


Table 1-11 Dominant Category of Explanation for Intimate Aggression by Gender and Perpetrator Status

                                               Female                       Male
                                     Comparisons  Perpetrators   Comparisons  Perpetrators
                                      (n = 36)      (n = 33)      (n = 32)      (n = 25)
Category                               f     %       f     %       f     %       f     %
Self-defense                           2     6       3     9       3     9       1     4
Control motives                        8    22       9    27       9    28       3    12
Expressive aggression                  4    11       3     9       3     9       8    32
Face/self-esteem preservation          1     3       2     6       2     6       3    12
Exculpatory explanations               5    14       3     9       3     9       3    12
Rejection of perpetrator or act       12    33       6    18      10    31       7    28
Prosocial/acceptable explanations      0     0       0     0       0     0       0     0
Tied categories                        4    11       7    21       2     6       0     0

Note: f = frequency. % = percentage of respondents in a given group who provided a particular category of explanation. Source: Mouradian, V. E. (2001). Applying schema theory to intimate aggression: Individual and gender differences in representation of contexts and goals. Journal of Applied Social Psychology, 31, 376-408. Copyright © 2001 by Blackwell Publishing. Reprinted by permission of Blackwell Publishers Journals.

In the following steps, "click" indicates a mouse click [...] to carry out these analyses. The steps and output may [...] versions of SPSS.)

Creating a Frequency Table

1. Enter the scores from your distribution in one column of the data window.
2. Click Analyze.
3. Click Descriptive statistics.
4. Click Frequencies.
5. Click the variable you want to make a frequency table of and then click the arrow.
6. Click OK.

Practice the preceding steps by creating a frequency table for the social interactions example in this chapter (the scores are listed on p. 8). [...] should look like Figure 1-18. Your output window (after you click OK in Step 6) should look like Figure 1-19. As you [...] produces a column with the cumulative percentage [...] (Note that it is possible to create grouped frequency tables [...] a straightforward process, we do not cover it here.)

17. Pick a book and a page number of your choice. (Select a page with at least 30 lines; do not pick a textbook or any book with tables or illustrations.) Make a list of the number of words on each line; use that list as your data set. Make (a) a frequency table and (b) a histogram. Then (c) describe the general shape of the distribution. (Be sure to give the name, author, publisher, and year of the book you used, along with the page number, with your answer.)

18. Explain to a person who has never taken a course in statistics the meaning of a grouped frequency table.

19. Give an example of something having these distribution shapes: (a) bimodal, (b) approximately rectangular, and (c) positively skewed. Do not use an example given in this book or in class.

20. Find an example in a newspaper or magazine of a graph that misleads by failing to use equal interval sizes or by exaggerating proportions.

21. Nownes (2000) surveyed representatives of interest groups who were registered as lobbyists of three U.S. state legislatures. One of the issues he studied was whether interest groups are in competition with each other. Table 1-10 shows the results for one such question. (a) Using this table as an example, explain the idea of a frequency table to a person who has never had a course in statistics. (b) Explain the general meaning of the pattern of results.

22. Mouradian (2001) surveyed college students selected from a screening session to include two groups: (a) "Perpetrators," students who reported at least one violent act (hitting, shoving, etc.) against their partner in their current or most recent relationship, and (b) "Comparisons," students who did not report any such uses of violence in any of their last three relationships. At the actual testing session, the students first read a description of an aggressive behavior such as, "Throw something at his or her partner" or "Say something to upset his or her partner." They then were asked to write "as many examples of circumstances of situations as [they could] in which a person might engage in behaviors or acts of this sort with or towards their significant other." Table 1-11 shows the "Dominant Category of Explanation" (the category a participant used most) for females and males, broken down by comparisons and perpetrators. (a) Using this table as an example, explain the idea of a frequency table to a person who has never had a course in statistics. (b) Explain the general meaning of the pattern of results.

Table 1-10 Competition for Members and Other Resources

Question: How much competition does this group face from other groups with similar goals for members and other resources?

Answer                  Percentage   Number
No competition              20         118
Some competition            58         342
A lot of competition        22         131
Total                      100         591

Note: There were no statistically significant differences between states. For full results of significance tests, contact the author. Source: Nownes, A. J. (2001). Policy conflict and the structure of interest communities. American Politics Quarterly, 28, 316. Copyright © 2001 by Sage Publications, Ltd. Reprinted by permission of Sage Publications, Thousand Oaks, London, and New Delhi.

Table 1-9 Descriptive Statistics for the Type of News Given

Category                                                      Frequency   Percentage
1. Relationship with family                                       19         21.1
2. School                                                         ...         ...
3. Job/work                                                        6          6.7
4. Relationship with actual/potential girlfriend/boyfriend        17         18.9
5. Personal health                                                ...         ...
6. Finance                                                        ...         ...
7. Relationship with friends                                      21         23.3
8. Health of family member/friend                                 23         25.6
9. Other                                                           1          1.1

Source: McKee, T. L. E., & Ptacek, J. T. (2001). I'm afraid I have something bad to tell you: Breaking bad news from the perspective of the giver. Journal of Applied Social Psychology, 31, 246-273. Copyright © 2001 by Blackwell Publishing. Reprinted by permission of Blackwell Publishers Journals.

12. Explain and give an example for each of the following types of variables: (a) equal-interval, (b) rank-order, (c) nominal, (d) ratio scale, (e) continuous.

13. An organizational psychologist asks 20 employees in a company to rate their job satisfaction on a 5-point scale from 1 = very unsatisfied to 5 = very satisfied. The ratings are as follows:

3, 2, 3, 4, 1, 3, 3, 4, 5, 2, 3, 5, 2, 3, 3, 4, 1, 3, 2, 4

Make (a) a frequency table and (b) a histogram. Then (c) describe the general shape of the distribution.

14. A social psychologist asked 15 college students how many times they "fell in love" before they were 11 years old. The numbers of times were as follows:

2, 0, 6, 0, 3, 1, 0, 4, 9, 0, 5, 6, 1, 0, 2

Make (a) a frequency table and (b) a histogram. Then (c) describe the general shape of the distribution.

15. Following are the speeds of 40 cars clocked by radar on a particular road in a 35-mph zone on a particular afternoon:

30, 36, 42, 36, 30, 52, 36, 34, 36, 33, 30, 32, 35, 32, 37, 34, 36, 31, 35, 20, 24, 46, 23, 31, 32, 45, 34, 37, 28, 40, 34, 38, 40, 52, 31, 33, 15, 27, 36, 40

Make (a) a frequency table and (b) a histogram. Then (c) describe the general shape of the distribution.

16. Here are the number of holiday gifts purchased by 25 families randomly interviewed at a local mall at the end of the holiday season:

22, 18, 22, 26, 19, 14, 23, 27, 2, 18, 28, 28, 11, 16, 34, 28, 13, 21, 32, 17, 6, 29, 23, 22, 19

Make (a) a frequency table and (b) a grouped frequency table using intervals of 0-4, 5-9, 10-14, 15-19, 20-24, 25-29, and 30-34. Based on the grouped frequency table, (c) make a histogram and (d) describe the general shape of the distribution.

Make (a) a frequency table and (b) a histogram. Then (c) describe the general shape of the distribution.

5. These are the scores on a test of sensitivity to smell taken by 25 chefs attending a national conference:

96, 83, 59, 64, 73, 74, 80, 68, 87, 67, 64, 92, 76, 71, 68, 50, 85, 75, 81, 70, 76, 91, 69, 83, 75

Make (a) a frequency table and (b) a histogram. (c) Make a grouped frequency table using intervals of 50-59, 60-69, 70-79, 80-89, and 90-99. Based on the grouped frequency table, (d) make a histogram and (e) describe the general shape of the distribution.

6. The following data are the number of minutes it took each of a group of 34 10-year-olds to do a series of abstract puzzles:

24, 83, 36, 22, 81, 39, 60, 62, 38, 66, 38, 36, 45, 20, 20, 67, 41, 87, 41, 82, 35, 82, 28, 80, 80, 68, 40, 27, 43, 80, 31, 89, 83, 24

Make (a) a frequency table and (b) a grouped frequency table using intervals of 20-29, 30-39, 40-49, 50-59, 60-69, 70-79, and 80-89. Based on the grouped frequency table, (c) make a histogram and (d) describe the general shape of the distribution.

7. Describe the shapes of the three distributions illustrated.

(a) (b) (c)

8. Draw an example of each of the following distributions: (a) symmetrical, (b) rectangular, and (c) skewed to the right.

9. Explain to a person who has never had a course in statistics what is meant by (a) a symmetrical unimodal distribution and (b) a negatively skewed unimodal distribution. (Be sure to include in your first answer an explanation of what "distribution" means.)

10. McKee and Ptacek (2001) asked 90 college students about a time they had delivered bad news to someone. Table 1-9 shows the results for the type of bad news given. (a) Using this table as an example, explain the idea of a frequency table to a person who has never had a course in statistics. (b) Explain the general meaning of the pattern of results.

Set II

11. A participant in a cognitive psychology study is given 50 words to remember and later asked to recall as many as he can of them. This participant recalls 17. What is the (a) variable, (b) possible values, and (c) score?


Making a Histogram

See Figure 1-17.

Interest in Graduate School   Frequency
            1                     1
            2                     1
            3                     2
            4                     1
            5                     2
            6                     3

Figure 1-17 Answer to Worked-Out Problem for making a histogram. (1) Make a frequency table (or grouped frequency table). (2) Put the values along the bottom of the page, from left to right, from lowest to highest. (3) Make a scale of frequencies along the left edge of the page that goes from 0 at the bottom to the highest frequency for any value. (4) Make a bar above each value with a height for the frequency of that value.
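The four steps for making a histogram can also be sketched in code. The Python below is a rough text-only illustration, not a substitute for a drawn figure. The function name `text_histogram` is an assumption made for this example, and it assumes single-digit whole-number values so that the columns line up. It uses the ten graduate school interest scores from the worked-out problem.

```python
from collections import Counter

def text_histogram(scores, lowest, highest):
    """Return the lines of a text histogram, tallest frequency level first."""
    counts = Counter(scores)                    # step 1: frequency of each value
    values = list(range(lowest, highest + 1))   # step 2: values from lowest to highest
    tallest = max(counts[v] for v in values)    # step 3: scale runs up to the highest frequency
    lines = []
    for level in range(tallest, 0, -1):
        # step 4: a bar reaches this level only if the value's frequency is at least `level`
        bars = "".join(" # " if counts[v] >= level else "   " for v in values)
        lines.append(f"{level:>2} |{bars}")
    lines.append("   +" + "---" * len(values))  # the baseline stands for a frequency of 0
    lines.append("    " + "".join(f" {v} " for v in values))
    return lines

scores = [2, 4, 5, 5, 1, 3, 6, 3, 6, 6]
histogram_lines = text_histogram(scores, lowest=1, highest=6)
print("\n".join(histogram_lines))
```

The bars touch each other because each value takes a fixed three-character column, which mirrors the convention of drawing a histogram without gaps between bars.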

Practice Problems

These problems involve [...] problems are done on a computer [...] software, do these problems [...] how to use a computer to [...] the Using SPSS section at [...] Workbook that accompanies [...]

All data are fictional [...]

Set I (for answers [...]

1. A client rates her satis[...] scale from 1 = not at [...] (b) possible values, an[...]

2. Give the level of mea[...] group to which a pers[...] turn in a laboratory ma[...]

3. A particular block in a [...] of children in these ho[...]

2, 4, 2, 1, [...]

Make (a) a frequency [...] shape of the distribution.

4. Fifty students were as[...] their answers:

11, 2, 0, 13, 5, 7, 1, 8 [...]

11, 18, 2, 9, 7, 3, [...]


8. Statistical graphs for the general public are sometimes distorted in ways that mislead the eye, such as failing to use equal intervals or exaggerating proportions.

9. Frequency tables and histograms are rarely shown in research articles. When they are, they often follow nonstandard formats or involve frequencies (or percentages) for a nominal variable. The shapes of distributions are more often described.

[...]atistics (p. 2)
[...]tistics (p. 2)
[...]ble (p. 4)
[...]ble (p. 4)
[...] variable (p. 4)
continuous variable (p. 4)
rank-order variable (p. 4)
nominal variable (p. 4)
levels of measurement (p. 5)
frequency table (p. 7)
interval (p. 9)
grouped frequency table (p. 9)
histogram (p. 10)
frequency distribution (p. 15)
unimodal distribution (p. 15)
bimodal distribution (p. 15)
multimodal distribution (p. 15)
rectangular distribution (p. 15)
symmetrical distribution (p. 17)
skewed distribution (p. 17)
floor effect (p. 17)
ceiling effect (p. 18)
normal curve (p. 18)
kurtosis (p. 18)

Example Worked-Out Problems

Ten first-year students rated their interest in graduate school on a scale from 1 = no interest at all to 6 = high interest. Their scores were as follows: 2, 4, 5, 5, 1, 3, 6, 3, 6, 6.

Making a Frequency Table

See Figure 1-16.

Interest in Graduate School   Frequency   Percent
            1                     1          10
            2                     1          10
            3                     2          20
            4                     1          10
            5                     2          20
            6                     3          30

Figure 1-16 Answer to Example Worked-Out Problem for making a frequency table. (1) Make a list down the page of each possible value, from lowest to highest. (2) Go one by one through the scores, making a mark for each next to its value on your list. (3) Make a table showing how many times each value on your list is used. (4) Figure the percentage of scores for each value.
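The four steps for making a frequency table can be sketched in code as well. The Python below is a minimal illustration rather than part of the book's SPSS procedure. The function name `frequency_table` is an assumption made for this example. It uses the ten graduate school interest scores given above.

```python
from collections import Counter

def frequency_table(scores, lowest, highest):
    """Return (value, frequency, percent) rows for every possible value."""
    counts = Counter(scores)       # one tally per score, like the marks in step 2
    total = len(scores)
    rows = []
    for value in range(lowest, highest + 1):   # step 1: each possible value, lowest to highest
        freq = counts.get(value, 0)            # step 3: unused values get a frequency of 0
        rows.append((value, freq, 100 * freq / total))   # step 4: percentage of scores
    return rows

scores = [2, 4, 5, 5, 1, 3, 6, 3, 6, 6]
table = frequency_table(scores, lowest=1, highest=6)
print("Value  Frequency  Percent")
for value, freq, pct in table:
    print(f"{value:>5} {freq:>10} {pct:>8.0f}")
```

Listing every possible value, including any with a frequency of 0, keeps gaps in the distribution visible instead of silently dropping them.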

Figure 1-15 Change in the percentage of adolescents surveyed in the Canadian National Longitudinal Survey of Children and Youth longitudinal sample. (Vertical axis: percentage, from 0 to 100; horizontal axis: Age in Years, in groups of 10-11, 12-13, 14-15, and 16-17.)

Source: Maggi, S., Hertzman, C., & Vaillancourt, T. (2007). Changes in smoking behaviors from late childhood to adolescence: Insights from the Canadian National Longitudinal Survey of Children and Youth. Health Psychology, 26, 232-240. Published by the American Psychological Association. Reprinted with permission.

1. Psychologists use descriptive statistics to describe and summarize a group of numbers from a research study.

2. A value is a number or category; a variable is a characteristic that can have different values; a score is a particular person's value on the variable.

3. Most variables in psychology research are numeric with approximately equal intervals. However, some numeric variables are rank-ordered (the values are ranks), and some variables are not numeric at all (the values are categories).

4. A frequency table organizes the scores into a table of each of the possible values with the frequency and percentage of scores with that value.

5. When there are many different values, a grouped frequency table is useful. It is like an ordinary frequency table except that the frequencies are given for intervals that include a range of values.

6. The pattern of frequencies in a distribution can be shown visually with a histogram (or bar graph), in which the height of each bar is the frequency for a particular value.

7. The general shape of a histogram can be unimodal (having a single peak), bimodal (having two peaks), multimodal (including bimodal), or rectangular (having no peak); it can be symmetrical or skewed (having a long tail) to the right or the left; and, compared to the bell-shaped normal curve, it can be kurtotic (having a peaked or flat distribution).


Table 1-8 Incidence of Traditional and Electronic Bullying and Victimization (N = 84)

Form of bullying                              N      %
Electronic victims                           41    48.8
  Text-message victim                        27    32.1
  Internet victim (Web sites, chatrooms)     13    15.5
  Picture-phone victim                        8     9.5
Traditional victims                          60    71.4
  Physical victim                            38    45.2
  Teasing victim                             50    59.5
  Rumors victim                              32    38.6
  Exclusion victim                           30    50.0
Electronic bullies                           18    21.4
  Text-message bully                         18    21.4
  Internet bully                             11    13.1
Traditional bullies                          54    64.3
  Physical bully                             29    34.5
  Teasing bully                              38    45.2
  Rumor bully                                22    26.2
  Exclusion bully                            35    41.7

Source: Raskauskas, J., & Stoltz, A. D. (2007). Involvement in traditional and electronic bullying among adolescents. Developmental Psychology, 43, 564-575. Published by the American Psychological Association. Reprinted with permission.

bullying as ". . . a means of bullying in which peers use electronics [such as text messages, emails, and defaming Web sites] to taunt, threaten, harass, and/or intimidate a peer" (p. 565). Table 1-8 is a frequency table showing the adolescents' reported incidence of being victims or perpetrators of traditional and electronic bullying. The table shows, for example, that about half (48.8%) of the adolescents reported being the victim of electronic bullying, and the most common vehicle for electronic bullying (experienced by 32.1% of the adolescents) was text messaging.

Histograms are even more rare in research articles (except in articles about statistics), but they do appear occasionally. Maggi and colleagues (2007) conducted a study of age-related changes in cigarette smoking behaviors in Canadian adolescents. As shown in Figure 1-15, they created a histogram—from a grouped frequency table—to display their results. Their histogram shows the results from the two samples they studied (one shown in the light colored bars and the other in the dark colored bars). As you can see in the figure, less than 10% of the 10- and 11-year-olds reported that they had tried smoking, but more than half of the 16- and 17-year-olds said they had tried smoking. As already mentioned, such figures are often not standard in some way. In this example, the researchers drew the histogram with gaps between the bars, whereas it is standard not to use gaps (unless you are drawing a bar graph for a nominal variable). However, the histogram still does a good job of showing the distribution. Also, the researchers, to allow for a fair comparison of how the rate of smoking differed among adolescents of varying ages, plotted the percentage of adolescents on the vertical axis instead of the actual number of adolescents. (Plotting the actual number of adolescents who reported smoking would have been misleading, because there were not the same number of individuals in each of the age groups.)
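The reason for plotting percentages rather than raw counts can be seen with a small calculation. The Python below is only a sketch. The function name `percent_by_group` is an assumption, and the counts and group sizes are invented for illustration; they are not the figures from Maggi and colleagues (2007).

```python
def percent_by_group(counts, group_sizes):
    """Convert each group's count to a percentage of that group's size."""
    return {group: 100 * counts[group] / group_sizes[group] for group in counts}

# Hypothetical numbers, invented for illustration only.
tried_smoking = {"10-11": 40, "16-17": 33}   # number who reported trying smoking
group_sizes = {"10-11": 500, "16-17": 60}    # number surveyed in each age group

percentages = percent_by_group(tried_smoking, group_sizes)
for group, pct in percentages.items():
    print(f"{group}: {tried_smoking[group]} of {group_sizes[group]} = {pct:.0f}%")
```

With these made-up numbers, the raw counts (40 versus 33) would suggest that fewer older adolescents had tried smoking, while the percentages (8% versus 55%) show the opposite, because the older group is much smaller.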

Figure 1-14 Histogram of students' stress ratings distorted from the standard of width 1 to 1.5 times height. (Panels (a), (b), and (c); horizontal axis in each panel: Stress Rating; vertical axis: frequency.) (Data based on Aron et al., 1995.)

housing price in a particular region over a 4-year period (from 2004 to 2007). By starting the vertical axis at $150,000 (instead of 0, as is customary), the graph appears to exaggerate the changes in housing price over time. Figure 1-13b shows the same results with the vertical axis starting at $0. You can still see the changes in housing price from year to year in Figure 1-13b, but the figure does a better job of showing the size of those changes.
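The exaggeration can be put in numbers by comparing how tall two bars are drawn under each choice of axis. The Python below is a sketch under stated assumptions: the function name `drawn_height_ratio` is made up for this example, and the two prices are invented, since the bar values cannot be read from Figure 1-13 here.

```python
def drawn_height_ratio(value_a, value_b, axis_start=0):
    """Ratio of the drawn bar heights when the vertical axis begins at axis_start."""
    return (value_b - axis_start) / (value_a - axis_start)

# Hypothetical mean prices, invented for illustration (not read from Figure 1-13).
price_early, price_late = 154_000, 162_000

ratio_truncated = drawn_height_ratio(price_early, price_late, axis_start=150_000)
ratio_from_zero = drawn_height_ratio(price_early, price_late, axis_start=0)
print(f"axis from $150,000: later bar is {ratio_truncated:.2f} times as tall")
print(f"axis from $0:       later bar is {ratio_from_zero:.2f} times as tall")
```

With these made-up prices, the later bar is drawn three times as tall when the axis starts at $150,000, although the price itself is only about 5% higher.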

The overall proportion of a histogram or bar graph should be about 1 to 1.5 times as wide as it is tall, as in Figure 1-14a for the stress ratings example. But look what happens if we make the graph much taller or shorter, as shown in Figures 1-14b and 1-14c. The effect is like that of a fun house mirror: the true picture is distorted. Any particular shape is in a sense accurate. But the 1-to-1.5 proportion has been adopted to give people a standard for comparison. Changing this proportion misleads the eye.

Frequency Tables and Histograms in Research Articles

Psychology researchers mainly use frequency tables and histograms as a first step in more elaborate statistical analyses. They are usually not included in research articles, and when they are, just because they are so rare, they are often not standard in some way. When they do appear, they are most likely to be in survey studies. For example, Raskauskas and Stoltz (2007) asked a group of 84 adolescents about their involvement in traditional and electronic bullying. The researchers defined electronic

Figure 1-12 Misleading illustration of a frequency distribution due to unequal interval sizes. (Chart title: "Commission Payments to Travel Agents.")

Source: "Commission Payments to Travel Agents," from The New York Times, August 8, 1978. © 1978 The New York Times. Used by permission and protected by the Copyright Laws of the United States. The printing, copying, redistribution, or retransmission of the Material without express written permission is prohibited. www.nytimes.com

Figure 1-13 Misleading bar graph due to not starting at zero. The vertical axis starts at $150,000 for figure (a) compared to $0 for figure (b). (Horizontal axis in both panels: Year, 2004 to 2007; the vertical axis runs from 150,000 to 164,000 in (a) and from 0 to 175,000 in (b).)

How are you doing?

1. Describe the difference between a unimodal and multimodal distribution in terms of (a) a frequency graph and (b) a frequency table.

2. What does it mean to say that a distribution is skewed to the left?

3. What kind of skew is created by (a) a floor effect and (b) a ceiling effect?

4. When a distribution is described as being peaked or flat, what is it being compared to?

Answers

1. (a) A unimodal distribution has one main high point; a multimodal distribution has more than one main high point. (b) A unimodal distribution has one value with a higher frequency than all the other frequencies; a multimodal distribution has more than one value with large frequencies compared to the values around it.

2. Fewer scores have low values than have high values.

3. (a) A skew created by a floor effect is skewed to the right; (b) one created by a ceiling effect is skewed to the left.

4. The distribution is being compared to a normal curve.

Controversy: Misleading Graphs

The most serious controversy about frequency tables and histograms is not among psychologists, but among the general public. The misuse of these procedures by some public figures, advertisers, and the media seems to have created skepticism about the trustworthiness of statistics in general and of statistical tables and charts in particular. Everyone has heard that "statistics lie."

Of course, people can and do lie with statistics. It is just as easy to lie with words, but you may be less sure of your ability to recognize lies with numbers. In this section, we note two ways in which frequency tables and graphs can be misused and tell how to recognize such misuses. (Much of this material is based on the classic discussion of these issues in Tufte, 1983.)

Failure to Use Equal Interval Sizes

A key requirement of a grouped frequency table or graph is that the size of the intervals be equal. If they are not equal, the table or graph can be very misleading. Tufte (1983) gives an example, shown in Figure 1-12, from the respectable (and usually accurate) New York Times. This chart gives the impression that commissions paid to travel agents dropped dramatically in 1978. However, a close reading of the graph shows that the third bar for each airline is for only the first half of 1978. Thus, only half a year is being compared to each of the preceding full years. Assuming that the second half of 1978 was like the first half, the information in this graph actually tells us that 1978 shows an increase rather than a decrease. For example, Delta Airlines estimated a full-year 1978 figure of $72 million, much higher than 1977's $57 million.
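The comparison in this paragraph rests on a simple projection: a half-year total has to be scaled up to a full year before it is set beside full-year totals. The Python below is a sketch of that arithmetic. The function name `annualize` is an assumption, and the $36 million half-year figure is inferred here by halving the $72 million full-year estimate quoted above; it is not stated in the text.

```python
def annualize(partial_total, fraction_of_year):
    """Project a full-year total, assuming the rest of the year matches the observed part."""
    if not 0 < fraction_of_year <= 1:
        raise ValueError("fraction_of_year must be greater than 0 and at most 1")
    return partial_total / fraction_of_year

# Figures in millions of dollars. The half-year value is an inferred assumption.
first_half_1978 = 36
full_year_1977 = 57

projected_1978 = annualize(first_half_1978, 0.5)
print(f"Projected 1978: ${projected_1978:.0f} million vs. 1977: ${full_year_1977} million")
```

Drawn as a bar, the unprojected half-year figure looks like a sharp drop from $57 million; the projected figure shows an increase.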

Exaggeration of Proportions

The height of a histogram or bar graph (or frequency polygon) usually begins at 0 or the lowest value of the scale and continues to the highest value of the scale. Figure 1-13a shows a bar graph that does not follow this standard. The bar graph shows the mean

A skewed distribution caused by an upper limit is shown in Figure 1-10b. This is a distribution of adults' scores on a multiplication table test. This distribution is strongly skewed to the left. Most of the scores pile up at the right, the high end (a perfect score). This shows a ceiling effect. The stress ratings example also shows a mild ceiling effect because many students had high levels of stress, the maximum rating was 10, and people often do not like to use ratings right at the maximum.

Normal and Kurtotic Distributions

Psychologists also describe a distribution in terms of whether the middle of the distribution is particularly peaked or flat. The standard of comparison is a bell-shaped curve. In psychology research and in nature generally, distributions often are similar to this bell-shaped standard, called the normal curve. We discuss this curve in some detail in later chapters. For now, however, the important thing is that the normal curve is a unimodal, symmetrical curve with an average peak, the sort of bell shape shown in Figure 1-11a. Both the stress ratings and the social interactions examples approximate a normal curve in a very general way, although, as we noted, both are somewhat skewed. In our experience, most distributions that result from psychology research are closer to the normal curve than are these two examples.

Kurtosis is how much the shape of a distribution differs from a normal curve in terms of whether its curve in the middle is more peaked or flat than the normal curve (DeCarlo, 1997). Kurtosis comes from the Greek word kyrtos, "curve." Figure 1-11b shows a kurtotic distribution with a more extreme peak than the normal curve. Figure 1-11c shows an extreme example of a kurtotic distribution, one with a very flat distribution. (A rectangular distribution would be even more extreme.)

Distributions that are more peaked or flat than a normal curve also tend to have a different shape in the tails. Those with a very peaked curve usually have more scores in the tails of the distribution than the normal curve (see Figure 1-11b). It is as if the normal curve got pinched in the middle and some of it went up into a sharp peak and the rest spread out into thick tails. Distributions with a flatter curve usually have fewer scores in the tails of the distribution than the normal curve (see Figure 1-11c). It is as if the tails and the top of the curve both got sucked in toward the middle on both sides. Although it is often easiest to identify kurtosis in terms of how peaked or flat the distribution is, the number of scores in the tails is what matters.

ceiling effect: situation in which many scores pile up at the high end of a distribution (creating skewness) because it is not possible to have a higher score.

normal curve: specific, mathematically defined, bell-shaped frequency distribution that is symmetrical and unimodal; distributions observed in nature and in research commonly approximate it.

kurtosis: extent to which a frequency distribution deviates from a normal curve in terms of whether its curve in the middle is more peaked or flat than the normal curve.

Figure 1-11 Examples of (a) normal, (b) peaked, and (c) flat distribution. The normal distribution is shown as a dashed line in (b) and (c).

Source: Adapted from DeCarlo, T. (1997). On the meaning and use of kurtosis. Psychological Methods, 3, 292-307, Figure 1. Published by the American Psychological Association. Adapted with permission.

Figure 1-9 Examples of frequency polygons of distributions that are (a) approximately symmetrical, (b) skewed to the right (positively skewed), and (c) skewed to the left (negatively skewed).

symmetrical distribution (if you fold the graph of a symmetrical distribution in half, the two halves look the same).

A distribution that clearly is not symmetrical is called a skewed distribution. The stress ratings distribution is an example. A skewed distribution has one side that is long and spread out, somewhat like a tail. The side with the fewer scores (the side that looks like a tail) is considered the direction of the skew. Thus, the stress study example, which has too few scores at the low end, is skewed to the left. However, the social interactions example, which has too few scores at the high end, is skewed to the right (see Figure 1-4). Figure 1-9 shows examples of approximately symmetrical and skewed distributions.

A distribution that is skewed to the right is also called positively skewed. A distribution skewed to the left is also called negatively skewed.

symmetrical distribution: distribution in which the pattern of frequencies on the left and right side are mirror images of each other.

skewed distribution: distribution in which the scores pile up on one side of the middle and are spread out on the other side; distribution that is not symmetrical.

floor effect: situation in which many scores pile up at the low end of a distribution (creating skewness) because it is not possible to have any lower score.

Strongly skewed distributions come up in psychology research mainly when what is being measured has some upper or lower limit. For example, a family cannot have fewer than zero children. When many scores pile up at the low end because it is impossible to have a lower score, the result is called a floor effect. A skewed distribution caused by a lower limit is shown in Figure 1-10a.
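The direction of skew can also be checked numerically. The Python below is a rough sketch that goes slightly beyond this chapter: it uses the sign of the average cubed deviation from the mean, one common numerical indicator of skew, which is an assumption of this example rather than a method the chapter describes. The function name `skew_direction` and the data are invented for illustration.

```python
def skew_direction(scores):
    """Classify skew from the sign of the average cubed deviation from the mean."""
    n = len(scores)
    mean = sum(scores) / n
    # Cubing keeps the sign of each deviation, so a long tail on one side dominates the sum.
    third_moment = sum((s - mean) ** 3 for s in scores) / n
    if third_moment > 0:
        return "skewed to the right (positive)"
    if third_moment < 0:
        return "skewed to the left (negative)"
    return "symmetrical"

# Hypothetical data, invented for illustration only.
children = [0, 0, 0, 1, 1, 1, 2, 2, 3, 6]             # floor at zero children
test_scores = [100, 100, 100, 90, 90, 90, 80, 80, 70, 40]   # ceiling at a perfect score

print("Number of children:", skew_direction(children))
print("Multiplication test:", skew_direction(test_scores))
```

In practice the result is rarely exactly zero, so a value near zero is better read as "approximately symmetrical" than taken literally.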

It helps you remember the direction of the skew to know that the word skew comes from the French queue , which l i or Thus, t he direction of the skew is the side that has the long line, or tail.
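If you like to check ideas with a computer, the "long tail" rule for the direction of skew can be sketched in a few lines of Python. This is only a rough heuristic of our own for illustration (it compares how far the scores stretch on each side of the most frequent value); it is not a standard statistical measure of skewness, and the function name `tail_direction` is made up for this example. The scores are the 30 stress ratings used in this chapter.

```python
from collections import Counter

# The 30 stress ratings listed in this chapter.
stress = [8, 7, 4, 10, 8, 6, 8, 9, 9, 7, 3, 7, 6, 5, 0,
          9, 10, 7, 7, 3, 6, 7, 5, 2, 1, 6, 7, 10, 8, 8]


def tail_direction(scores):
    """Rough heuristic (an assumption for illustration): the skew points
    toward the side where the scores stretch farther from the peak."""
    counts = Counter(scores)
    peak = max(counts, key=counts.get)    # most frequent value
    left_tail = peak - min(scores)        # how far scores extend below the peak
    right_tail = max(scores) - peak       # how far scores extend above the peak
    if left_tail > right_tail:
        return "skewed to the left"
    if right_tail > left_tail:
        return "skewed to the right"
    return "no clear skew"


print(tail_direction(stress))
```

For the stress ratings, the most frequent value is 7, the scores reach 7 points below it (down to 0) but only 3 points above it (up to 10), so the long tail is on the left.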

Figure 1-10 (a) A distribution skewed to the right due to a floor effect: fictional distribution of the number of children in families (horizontal axis: Number of Children, 0 to 6). (b) A distribution skewed to the left due to a ceiling effect: fictional distribution of adults' scores on a multiplication table test (horizontal axis: Percentage of Correct Answers, 10 to 100).

Figure 1-8 Fictional examples of distributions that are not unimodal: (a) A bimodal distribution showing the possible frequencies for people of different ages in a toddler's play area (vertical axis: Number of People in a Toddler's Play Area; horizontal axis: Age). (b) A rectangular distribution showing the possible frequencies of students at different grade levels in an elementary school (vertical axis: Number of Students; horizontal axis: Grade Level).

The scores from most psychology studies are usually an approximately unimodal distribution. Bimodal and other multimodal distributions occasionally turn up. A bimodal example is the distribution of the ages of people in a toddler's play area in a park, who are mostly either toddlers with ages of around 2 to 4 or caretakers with ages of 20 to 40 or so (with few people aged 5 to 19 years or above 40). Thus, if you make a frequency distribution of these ages, the large frequencies are at the values for low ages (2 to 4) and for higher ages (20 to 40 or so). An example of a rectangular distribution is the number of children at each grade level at an elementary school; there is about the same number in first grade, second grade, and so on. Figure 1-8 shows these examples.

Symmetrical and Skewed Distributions

Look again at the histograms of the stress ratings example (Figure 1-3). The distribution is lopsided, with more scores near the high end. This is somewhat unusual. Most things we measure in psychology have about equal numbers on both sides of the middle. That is, most of the time in psychology, the scores follow an approximately

Figure 1-6 Histogram for "How Are You Doing?" question 3.

Answers

1. Researchers make frequency graphs to show the pattern visually in a frequency table.
2. (a) The values, from lowest to highest, go along the bottom; (b) the frequencies, from 0 to the highest frequency of any value, go along the left edge; (c) above each value is a bar with a height of the frequency for that value.
3. See Figure 1-6.

frequency distribution: pattern of frequencies over the various values; what a frequency table, histogram, or frequency polygon describes.

unimodal distribution: frequency distribution with one value clearly having a larger frequency than any other.

bimodal distribution: frequency distribution with two approximately equal frequencies, each clearly larger than any of the others.

multimodal distribution: frequency distribution with two or more high frequencies separated by a lower frequency; a bimodal distribution is the special case of two high frequencies.

rectangular distribution: frequency distribution in which all values have approximately the same frequency.

Shapes of Frequency Distributions

A frequency distribution shows the pattern of frequencies over the various values. A frequency table or histogram describes a frequency distribution because each shows the pattern or shape of how the frequencies are spread out, or "distributed." Psychologists also describe this shape in words. Describing the shape of a distribution is important both in the descriptive statistics of this chapter and the next and in the inferential statistics of later chapters.

Unimodal and Bimodal Frequency Distributions

One question is whether a distribution's shape has only one main high point: one high "tower" in the histogram. For example, in the stress ratings study, the most frequent value is 7, giving a graph only one very high area. This is a unimodal distribution. If a distribution has two fairly equal high points, it is a bimodal distribution. Any distribution with two or more high points is called a multimodal distribution. (Strictly speaking, a distribution is bimodal or multimodal only if the peaks are exactly equal. However, psychologists use these terms more informally to describe the general shape.) Finally, a distribution with values of all about the same frequency is a rectangular distribution. Figure 1-7 shows examples of these frequency distribution shapes. As you will see, the graphs in Figure 1-7 are not histograms, but special line graphs called frequency polygons, which are another way to graph a frequency table. In a frequency polygon, the line moves from point to point. The height of each point shows the number of scores with that value. This creates a mountain peak skyline.

Figure 1-7 Examples of (a) unimodal, (b) approximately bimodal, and (c) approximately rectangular frequency polygons.
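To see the "one high tower" idea with actual numbers, here is a minimal Python sketch that finds the most frequent value (or values) in the 30 stress ratings from this chapter. It applies the strict sense of the term, in which only exactly equal peaks count as more than one mode; the informal judgment psychologists make about general shape is not something this short example tries to capture.

```python
from collections import Counter

# The 30 stress ratings listed in this chapter.
stress = [8, 7, 4, 10, 8, 6, 8, 9, 9, 7, 3, 7, 6, 5, 0,
          9, 10, 7, 7, 3, 6, 7, 5, 2, 1, 6, 7, 10, 8, 8]

counts = Counter(stress)            # frequency of each value
top = max(counts.values())          # the highest frequency
# Every value that reaches the highest frequency (one value means unimodal
# in the strict sense).
modes = sorted(v for v, f in counts.items() if f == top)

print("Most frequent value(s):", modes, "with frequency", top)
```

Only one value (7, with a frequency of 7) reaches the top frequency, which matches the description of the stress ratings as unimodal.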


find the midpoint between the start of the interval and the start of what would be the next highest interval. So, in Figure 1-4, the midpoint for the 45-49 interval is halfway between 45 (the start of the interval) and 50 (the start of what would be the next interval), which is 47.5.

❸ Make a scale of frequencies along the left edge of the page that goes from 0 at the bottom to the highest frequency for any value.

❹ Make a bar above each value with a height for the frequency of that value. For each bar, make sure that the middle of the bar is above its value.

You will probably find it easier to make a histogram if you use graph paper.
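The midpoint rule and the bar-drawing steps described above can be tried out with a short Python sketch. It assumes the grouped frequencies from Table 1-7 (interval size 5), and the function name `interval_midpoints` is our own invention. The bars are printed sideways as rows of `#` characters, so this is a rough text picture rather than a proper histogram with the values along the bottom.

```python
def interval_midpoints(starts, size):
    """Midpoint = halfway between the start of an interval and the start
    of the next highest interval (start + size)."""
    return [(start + (start + size)) / 2 for start in starts]


# Interval starts and frequencies from Table 1-7 (0-4, 5-9, ..., 45-49).
starts = list(range(0, 50, 5))
freqs = [12, 16, 16, 16, 10, 11, 4, 3, 3, 3]

mids = interval_midpoints(starts, 5)

# One row per interval: the midpoint, then a bar whose length is the frequency.
for mid, freq in zip(mids, freqs):
    print(f"{mid:5.1f} | {'#' * freq}")
```

The same function gives the midpoints 1, 3, 5, 7, 9, and 11 for the stress ratings intervals of size 2 (0-1, 2-3, and so on).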

When you have a nominal variable, the histogram is called a bar graph. Since the values of a nominal variable are not in any particular order, leave a space between the bars. Figure 1-5 shows a bar graph based on the frequency table in Table 1-4.

Figure 1-5 Bar graph for the closest person in life for 208 students (see Table 1-4). Vertical axis: frequency, from 0 to 100; horizontal axis: Closest Person (Family member, Nonromantic friend, Romantic partner, Other). (Data from Aron et al., 1995.)

How are you doing?

1. Why do researchers make frequency graphs?

2. When making a histogram from a frequency table, (a) what goes along the bottom, (b) what goes along the left edge, and (c) what goes above each value?

3. Make a histogram based on the following frequency table:

Value   Frequency


thinking ability. Anxiety produces arousal, and one of the best understood relationships in psychology is between arousal and performance. Whereas moderate arousal helps performance, too much or too little dramatically reduces it. In the case of too much, things you have learned become harder to recall. Your mind starts to race, creating more anxiety, more arousal, and so on. Because during a test you may be fearing that you are "no good and never will be," it is important to rethink beforehand any poor grades you may have received in the past. They most likely reflected your problems with tests more than your abilities.

There are many ways to reduce anxiety and arousal in general, such as learning to breathe properly and to take a brief break to relax deeply. Your counseling center should be able to help you or direct you to some good books on the subject. Again, many Web sites deal with reducing anxiety.

Test anxiety specifically is first reduced by overpreparing for a few tests, so that you go in with the certainty that you cannot possibly fail, no matter how aroused you become. The best time to begin applying this tactic is the first test of this course. There will be no old material to review, success will not depend on having understood previous material, and initial success will help you do well throughout the course. (You also might enlist the sympathy of your instructor or teaching assistant. Bring in a list of what you have studied, state why you are being so exacting, and ask if you have missed anything.) Your preparation must be ridiculously thorough, but only for a few exams. After these successes, your test anxiety should decline.

Also, create a practice test situation as similar to a real test as possible, making a special effort to duplicate the aspects that bother you most. If feeling rushed is the troubling part, once you think you are well prepared, set yourself a time limit for solving some homework problems. Make yourself write out answers fully and legibly. This may be part of what makes you feel slow during a test. If the presence of others bothers you (the sound of their scurrying pencils while yours is frozen in midair), do your practice test with others in your course. Even make it an explicit contest to see who can finish first.

Is your problem a general lack of confidence? Is something else in your life causing you to worry or feel bad about yourself? Then we suggest that it is time you tried your friendly college counseling center.

Lastly, could you be highly sensitive? A final word about anxiety and arousal. About 15 to 20% of humans (and all higher animals) seem to be born with a temperament trait that has been seen traditionally as shyness, hesitancy, or introversion (Eysenck, 1981; Kagan, 1994). But this shyness or hesitancy seems actually due to a preference to observe and an ability to notice subtle stimulation and process information deeply (Aron, 1996; Aron & Aron, 1997). This often causes highly sensitive persons (HSPs) to be very intuitive or even gifted. But it also means they are more easily overaroused by high levels of stimulation, like tests.

You might want to find out if you are an HSP (at http://www.hsperson.com). If you are, appreciate the trait's assets and make some allowances for its one disadvantage, this tendency to become easily overaroused. It has to affect your performance on tests. What matters is what you actually know, which is probably quite a bit. This simple act of self-acceptance (that you are not less smart but are more sensitive) may in itself help ease your arousal when trying to express your statistical knowledge.

So good luck to all of you. We wish you the best while taking this course and in your lives.

How to Make a Histogram

There are four steps in making a histogram.

❶ Make a frequency table (or grouped frequency table).

❷ Put the values along the bottom of the page, from left to right, from lowest to highest. If you are making a histogram from a grouped frequency table, the values you put along the bottom of the page are the interval midpoints. The midpoint of an interval is halfway between the start of that interval and the start of the next highest interval. So, in Figure 1-4, the midpoint for the 0-4 interval is 2.5, because 2.5 is halfway between 0 (the start of the interval) and 5 (the start of the next highest interval). For the 5-9 interval, the midpoint is 7.5 because 7.5 is halfway between 5 (the start of the interval) and 10 (the start of the next highest interval). Do this for each interval. When you get to the last interval,

Now try this yourself! Work out the interval midpoints for the grouped frequency table for the stress ratings example shown in Table 1-6. Your answers should be the same as the values shown along the bottom of Figure 1-3b.

Figure 1-4 Histogram for number of social interactions during a week for 94 college students based on grouped frequencies. Horizontal axis: Number of Social Interactions, marked at the interval midpoints 2.5, 7.5, 12.5, 17.5, 22.5, 27.5, 32.5, 37.5, 42.5, and 47.5. (Data from McLaughlin-Volpe et al., 2001.)

BOX 1-2

Math Anxiety, Statistics Anxiety, and You: A Message for Those of You Who Are Truly Worried About This Course

Let's face it: Many of you dread this course, even to the point of having a full-blown case of "statistics anxiety" (Zeidner, 1991). If you become tense the minute you see numbers, we need to talk about that right now.

First, this course is a chance for a fresh start with digits. Your past performance in (or avoidance of) geometry, trigonometry, calculus, or similar horrors need not influence in any way how well you comprehend statistics. This is largely a different subject.

Second, if your worry persists, you need to determine where it is coming from. Math or statistics anxiety, test anxiety, general anxiety, and generally low self-confidence each seems to play its own role in students' difficulties with math courses (Cooper & Robinson, 1989; Dwinell & Higbee, 1991).

Is your problem mainly math or statistics anxiety? An Internet search will yield hundreds of wonderful books and Web sites to help you. We highly recommend Sheila Tobias's classics Overcoming Math Anxiety (1995) or Succeed with Math: Every Student's Guide to Conquering Math Anxiety (1987). Tobias, a former math avoider herself, suggests that your goal should be "math mental health," which she defines as "the willingness to learn the math you need when you need it" (1995, p. 12). (Could it be that this course in statistics is one of those times?)

Tobias explains that math mental health is usually lost in elementary school, when you are called to the blackboard, your mind goes blank, and you are unable to produce the one right answer to an arithmetic problem. What confidence remained after such an experience probably faded during timed tests, which you did not realize were difficult for everyone except the most proficient few.

Tobias says that students who are good at math are not necessarily smarter than the rest of us, but they really know their strengths and weaknesses, and they have individual styles of thinking and feeling their way around a problem. They do not judge themselves harshly for mistakes. In particular, they do not expect to understand things instantly. Allowing yourself to be a "slow learner" does not mean that you are less intelligent. It shows that you are growing in math mental health.

Is your problem test anxiety? Test taking requires the use of the thinking part of our brain, the prefrontal cortex. When we are anxious, we naturally "downshift" to more basic, instinctual brain systems, and that effect ruins our

Figure 1-3 Histograms based on (a) a frequency table and (b) a grouped frequency table for the stress ratings example. Horizontal axis in both panels: Stress Rating. (Data based on Aron et al., 1995.)

When setting up a grouped frequency table, it makes a big difference how many intervals you use. There are guidelines to help researchers with this, but in practice it is done automatically by the researcher's computer (see the Using SPSS section for instructions on how to create frequency tables using statistical software). Thus, we will not focus on it in this book. However, should you have to make a grouped frequency table on your own, the key is to experiment with the interval size until you come up with one that is a round number (such as 2, 3, 5, or 10) and that creates about 5 to 15 intervals. Then, when actually setting up the table, be sure you set the start of each interval to a multiple of the interval size and the top end of each interval to the number just below the start of the next interval. For example, Table 1-6 uses six intervals with an interval size of 2. The intervals are 0-1, 2-3, 4-5, 6-7, 8-9, and 10-11. Note that each interval starts with a multiple of 2 (0, 2, 4, 6, 8, 10) and the top end of each interval (1, 3, 5, 7, 9) is the number just below the start of the next interval (2, 4, 6, 8, 10). Table 1-7 uses 10 intervals with an interval size of 5. The intervals are 0-4, 5-9, 10-14, 15-19, and so on, with a final interval of 45-49. Note that each interval starts with a multiple of 5 (0, 5, 10, 15, and so on) and that the top end of each interval (4, 9, 14, 19, and so on) is the number just below the start of the next interval (5, 10, 15, 20, and so on).
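If you want to see how a computer might set up such a table, here is a minimal Python sketch under a few assumptions of our own: scores are whole numbers of 0 or more, the first interval starts at 0, and the function name `grouped_frequency_table` is invented for this example (it is not what SPSS does internally). Each interval starts at a multiple of the interval size and ends just below the start of the next interval, as described above. The scores are the 94 students' numbers of social interactions listed in this chapter.

```python
from collections import Counter

# Number of social interactions in a week for the 94 students in this chapter.
scores = [
    48, 15, 33, 3, 21, 19, 17, 16, 44, 25,
    30, 3, 5, 9, 35, 32, 26, 13, 14, 14,
    47, 47, 18, 11, 5, 19, 24, 17, 6, 25,
    8, 18, 29, 1, 18, 22, 3, 22, 29, 2,
    6, 10, 29, 10, 29, 21, 38, 41, 16, 17,
    8, 40, 8, 10, 18, 7, 4, 4, 8, 11,
    3, 23, 10, 19, 21, 13, 12, 10, 4, 17,
    11, 21, 9, 8, 7, 5, 3, 22, 14, 25,
    4, 11, 10, 18, 1, 28, 27, 19, 24, 35,
    9, 30, 8, 26,
]


def grouped_frequency_table(scores, size):
    """Return rows of (interval start, interval end, frequency, percent).

    Assumes whole-number scores of 0 or more, with the first interval at 0.
    """
    # Each score belongs to the interval starting at the multiple of `size`
    # at or just below it.
    counts = Counter((score // size) * size for score in scores)
    last_start = (max(scores) // size) * size
    rows = []
    for start in range(0, last_start + size, size):
        freq = counts.get(start, 0)
        percent = round(100 * freq / len(scores), 1)
        rows.append((start, start + size - 1, freq, percent))
    return rows


table = grouped_frequency_table(scores, 5)
for start, end, freq, percent in table:
    print(f"{start:2d}-{end:<2d}  {freq:3d}  {percent:5.1f}")
```

With an interval size of 5, the frequencies and percentages printed agree with those in Table 1-7. Trying other round interval sizes (such as 2, 3, or 10) is an easy way to experiment with how many intervals result.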

Table 1-7 Grouped Frequency Table for Numbers of Social Interactions During a Week for 94 College Students

Interval   Frequency   Percent
0-4        12          12.8
5-9        16          17.0
10-14      16          17.0
15-19      16          17.0
20-24      10          10.6
25-29      11          11.7
30-34      4           4.3
35-39      3           3.2
40-44      3           3.2
45-49      3           3.2

Source: Data from McLaughlin-Volpe et al. (2001).

How are you doing?

1. What is a frequency table?

2. Why would a researcher want to make a frequency table?

3. Make a frequency table for the following scores: 5, 7, 4, 5, 6, 5, 4.

4. What does a grouped frequency table group?

Answers

1. A frequency table is a systematic listing of the number of scores (the frequency) of each value in the group studied.

2. A frequency table makes it easy to see the pattern in a large group of scores.

3.
Value   Frequency   Percent
4       2           28.6
5       3           42.9
6       1           14.3
7       1           14.3

4. A grouped frequency table groups the frequencies of adjacent values into intervals.

Histograms

A graph is another good way to make a large group of scores easy to understand. A picture may be worth a thousand words, but it is sometimes worth a thousand numbers. A straightforward approach is to make a graph of the frequency table. One kind of graph of the information in a frequency table is a kind of bar chart called a histogram. In a histogram, the height of each bar is the frequency of each value in the frequency table. Ordinarily, in a histogram all the bars are put next to each other with no space in between. The result is that a histogram looks a bit like a city skyline. Figure 1-3 shows two histograms based on the stress ratings example (one based on the ordinary frequency table and one based on the grouped frequency table). Figure 1-4 shows a histogram based on the grouped frequency table for the example of the numbers of students' social interactions in a week.

histogram: barlike graph of a frequency distribution in which the values are plotted along the horizontal axis and the height of each bar is the frequency of that value; the bars are usually placed next to each other without spaces, giving the appearance of a city skyline.


Table 1-5 Frequency Table for Number of Social Interactions During a Week for 94 College Students

Score   Frequency     Score   Frequency     Score   Frequency
0       0             17      4             34      0
1       2             18      5             35      2
2       1             19      4             36      0
3       5             20      0             37      0
4       4             21      4             38      1
5       3             22      3             39      0
6       2             23      1             40      1
7       2             24      2             41      1
8       6             25      3             42      0
9       3             26      2             43      0
10      6             27      1             44      1
11      4             28      1             45      0
12      1             29      4             46      0
13      2             30      2             47      2
14      3             31      0             48      1
15      1             32      1
16      2             33      1

Source: Data from McLaughlin-Volpe et al. (2001).

❹ Figure the percentage of scores for each value. We have not done so in this example because it would not help much for seeing the pattern of scores. However, if you want to check your understanding of this step, the first five percentages would be 0.0%, 2.1%, 1.1%, 5.3%, and 4.3%. (These are the percentages for frequencies of 0, 2, 1, 5, and 4, rounded to one decimal place.)

You can cross-check your work by adding the frequencies for all of the scores. This sum should equal the total number of scores you started with.

Grouped Frequency Tables

Sometimes there are so many possible values that an ordinary frequency table is too awkward to give a simple picture of the scores. The last example was a bit like that, wasn't it? The solution is to make groupings of values that include all values in a certain range. Consider the stress ratings example. Instead of having a separate frequency figure for the group of students who rated their stress as 8 and another for those who rated it as 9, you could have a combined category of 8 and 9. This combined category is a range of values that includes these two values. A combined category like this is called an interval. This particular interval of 8 and 9 has a frequency of 8 (the 5 scores with a value of 8 plus the 3 scores with a value of 9).

A frequency table that uses intervals is called a grouped frequency table. Table 1-6 is a grouped frequency table for the stress ratings example. (Note that in this example the full frequency table has only 11 different values. Thus, a grouped frequency table is not really necessary.) Table 1-7 is a grouped frequency table for the 94 students' number of social interactions over a week.

Table 1-6 Grouped Frequency Table for Stress Ratings

Stress Rating Interval   Frequency   Percent
0-1                      2           6.7
2-3                      3           10.0
4-5                      3           10.0
6-7                      11          36.7
8-9                      8           26.7
10-11                    3           10.0

Source: Data based on Aron et al. (1995).

A grouped frequency table can make information even more directly understandable than an ordinary frequency table can. Of course, the greater understandability of a grouped frequency table is at a cost. You lose some information: the details of the breakdown of frequencies in each interval.

interval: range of values in a grouped frequency table that are grouped together. (For example, if the interval size is 10, one of the intervals might be from 10 to 19.)

grouped frequency table: frequency table in which the number of individuals (frequency) is given for each interval of values.


When doing Step ❷, cross off each score as you mark it on the list. This should help you avoid mistakes, which are common in this step.

Figure 1-1 Making a frequency table for the stress ratings scores. (Data based on Aron et al., 1995.)

Frequency Tables for Nominal Variables

The preceding steps assume you are using numeric variables, the most common situation. However, you can also use a frequency table to show the number of scores in each value (or category) of a nominal variable. For example, researchers (Aron, Aron, & Smollan, 1992) asked 208 students to name the closest person in their life. As shown in Table 1-4, 33 students selected a family member, 76 a nonromantic friend, 92 a romantic partner, and 7 selected some other person. Also in Table 1-4, the values listed on the left hand side of the frequency table are the values (the categories) of the variable.

Table 1-4 Frequency Table for a Nominal Variable: Closest Person in Life for 208 Students

Closest Person       Frequency   Percent
Family member        33          15.9
Nonromantic friend   76          36.5
Romantic partner     92          44.2
Other                7           3.4

Source: Data from Aron et al. (1992).

Another Example

Tracy McLaughlin-Volpe and her colleagues (2001) had 94 introductory psychology students keep a diary of their social interactions for a week during the regular semester. Each time a participant had a social interaction lasting 10 minutes or longer, he or she would fill out a card. The card had questions about various aspects of the conversation and the conversation partner. Excluding family and work situations, the number of social interactions 10 minutes or longer over a week for these students were as follows:


48, 15, 33, 3, 21, 19, 17, 16, 44, 25, 30, 3, 5, 9, 35, 32, 26, 13, 14, 14, 47, 47, 18, 11, 5, 19, 24, 17, 6, 25, 8, 18, 29, 1, 18, 22, 3, 22, 29, 2, 6, 10, 29, 10, 29, 21, 38, 41, 16, 17, 8, 40, 8, 10, 18, 7, 4, 4, 8, 11, 3, 23, 10, 19, 21, 13, 12, 10, 4, 17, 11, 21, 9, 8, 7, 5, 3, 22, 14, 25, 4, 11, 10, 18, 1, 28, 27, 19, 24, 35, 9, 30, 8, 26.

Now, let's follow our four steps for making a frequency table.

❶ Make a list down the page of each possible value, from lowest to highest. The lowest possible number of interactions is 0. In this study, the highest number of interactions could be any number. However, the highest actual number in this group is 48; so we can use 48 as the highest value. Thus, the first step is to list these values down a page. (It might be good to use several columns so that you can have all the scores on a single page.)

❷ Go one by one through the scores, making a mark for each next to its value on your list. Figure 1-2 shows the results of this step.

❸ Make a table showing how many times each value on your list is used. Table 1-5 is the result.

Figure 1-2 Making a frequency table of students' social interactions over a week. (Data from McLaughlin-Volpe et al., 2001.)


Frequency Tables

An Example

Let's return to the stress ratings example. Recall that in this study, students in an introductory statistics class during the first week of the course answered the question, "How stressed have you been in the last 2½ weeks, on a scale of 0 to 10, with 0 being not at all stressed and 10 being as stressed as possible?" The actual study included scores from 151 students. To ease the learning for this example, we are going to use a representative subset of scores from 30 of the 151 students (this also saves you time if you want to try it for yourself). The 30 students' scores (their ratings on the scale) are:

8, 7, 4, 10, 8, 6, 8, 9, 9, 7, 3, 7, 6, 5, 0, 9, 10, 7, 7, 3, 6, 7, 5, 2, 1, 6, 7, 10, 8, 8.

Looking through all these scores gives some sense of the overall tendencies, but this is hardly an accurate method. One solution is to make a table showing how many students used each of the 11 values that the ratings can have (0, 1, 2, and so on, through 10). We have done this in Table 1-3. We also figured the percentage each value's frequency is of the total number of scores. Tables like this sometimes give only the raw-number frequencies, not the percentages, or only the percentages and not the raw-number frequencies. In addition, some frequency tables include, for each value, the total number of scores with that value and all values preceding it. These are called cumulative frequencies because they tell how many scores are accumulated up to this point on the table. If percentages are used, cumulative percentages also may be included (for an example, see Figure 1-18 in the Using SPSS section on page 30). Cumulative percentages give, for each value, the percentage of scores up to and including that value. The cumulative percentage for any given value (or for a score having that value) is also called a percentile. Cumulative frequencies and cumulative percentages allow you to see where a particular score falls in the overall group of scores.

Table 1-3 is called a frequency table because it shows how frequently (how many times) each score was used. A frequency table makes the pattern of numbers easy to see. In this example, you can see that most of the students rated their stress level around 7 or 8, with few rating it very low.

Table 1-3 Frequency Table of Number of Students Rating Each Value of the Stress Scale

Stress Rating   Frequency   Percent
0               1           3.3
1               1           3.3
2               1           3.3
3               2           6.7
4               1           3.3
5               2           6.7
6               4           13.3
7               7           23.3
8               5           16.7
9               3           10.0
10              3           10.0

Source: Data based on Aron et al. (1995).
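Cumulative frequencies and cumulative percentages are simply running totals, which a short Python sketch can make concrete. The example below is an illustration of our own (the variable names are made up) using the 30 stress ratings from this chapter; it adds up the frequencies value by value and expresses each running total as a percentage of all 30 scores, rounded to one decimal place.

```python
from collections import Counter
from itertools import accumulate

# The 30 stress ratings listed in this chapter.
stress = [8, 7, 4, 10, 8, 6, 8, 9, 9, 7, 3, 7, 6, 5, 0,
          9, 10, 7, 7, 3, 6, 7, 5, 2, 1, 6, 7, 10, 8, 8]

counts = Counter(stress)
values = list(range(0, 11))                    # every possible rating, 0 to 10
freqs = [counts.get(v, 0) for v in values]     # frequency of each rating

# Running total of frequencies: scores at this value and all values below it.
cum_freqs = list(accumulate(freqs))
# The same running totals as percentages of all the scores.
cum_pcts = [round(100 * c / len(stress), 1) for c in cum_freqs]

for v, f, cf, cp in zip(values, freqs, cum_freqs, cum_pcts):
    print(f"{v:2d}  {f:2d}  {cf:2d}  {cp:5.1f}")
```

Reading the row for a rating of 7, for example, 19 of the 30 students (63.3%) gave a rating of 7 or lower.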

How to Make a Frequency Table

There are four steps in making a frequency table.

❶ Make a list down the page of each possible value, from lowest to highest. In the stress ratings results, the list goes from 0, the lowest possible rating, up to 10, the highest possible rating.¹ Note that even if one of the ratings between 0 and 10 is not used, you still include that value in the listing, showing it as having a frequency of 0. For example, if no one gives a stress rating of 2, you still include 2 as one of the values on the frequency table.

❷ Go one by one through the scores, making a mark for each next to its value on your list. This is shown in Figure 1-1.

❸ Make a table showing how many times each value on your list is used. That is, add up the number of marks beside each value.

❹ Figure the percentage of scores for each value. To do this, take the frequency for that value, divide it by the total number of scores, and multiply by 100. You may need to round off the percentage. We recommend that you round percentages to one decimal place. Note that because of the rounding, your percentages do not usually add up to exactly 100% (but they should be close).

frequency table: listing of number of individuals having each of the different values for a particular variable.
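The four steps for making a frequency table can also be sketched in Python, which is a handy way to check a table you have made by hand. This is only an illustration under our own assumptions: the function name `frequency_table` is invented here, and Python's `Counter` stands in for the marks you would make by hand in the second step. The scores are the 30 stress ratings from this chapter.

```python
from collections import Counter

# The 30 stress ratings listed in this chapter.
stress = [8, 7, 4, 10, 8, 6, 8, 9, 9, 7, 3, 7, 6, 5, 0,
          9, 10, 7, 7, 3, 6, 7, 5, 2, 1, 6, 7, 10, 8, 8]


def frequency_table(scores, lowest, highest):
    """Return rows of (value, frequency, percent) for every possible value."""
    counts = Counter(scores)              # tallies each score (the "marks")
    total = len(scores)
    rows = []
    # List every possible value, even one that no score uses (frequency 0).
    for value in range(lowest, highest + 1):
        freq = counts.get(value, 0)
        percent = round(100 * freq / total, 1)   # frequency / total x 100
        rows.append((value, freq, percent))
    return rows


table = frequency_table(stress, 0, 10)
for value, freq, percent in table:
    print(f"{value:2d}  {freq:2d}  {percent:5.1f}")

# Cross-check: the frequencies should add up to the number of scores.
print("Total frequency:", sum(freq for _, freq, _ in table))
```

The frequencies add up to 30, the number of scores, and the rounded percentages add up to 99.9 rather than exactly 100, which illustrates the note about rounding.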

Answers

1. (a) crankiness, (b) [illegible], (c) 1 to 7.

2. A numeric variable has values that are numbers that tell you the degree or extent of what the variable measures; a nominal variable has values that are different categories and have no particular numerical order.

3. A discrete variable has specific values and has no values between the specific values. A continuous variable has, in theory, an infinite number of values between any two values.

4. (a) nominal, (b) equal-interval, (c) rank-order.

BOX 1-1

Important Trivia for Poetic Statistics Students

The word statistics comes from the Italian word statista, a person dealing with affairs of state (from stato, "state"). It was originally called "state arithmetic," involving the tabulation of information about nations, especially for the purpose of taxation and planning the feasibility of wars.

Statistics were needed in ancient times to figure the odds of shipwrecks and piracy for marine insurance that would encourage voyages of commerce and exploration to far-flung places. The modern study of mortality rates and life insurance descended from the 17th-century plague pits: counting the bodies of persons cut down in the bloom of youth. The theory of errors (covered in Chapter 12) began in astronomy, that is, with stargazing; the theory of correlation (Chapter 11) has its roots in biology, from the observation of parent and child differences. Probability theory (Chapter 3) arose in the tense environs of the gambling table. The theory of analysis of experiments (Chapters 7 to 10) began in breweries and out among waving fields of wheat, where correct guesses determined not only the survival of a tasty beer but of thousands of marginal farmers. Theories of measurement and factor analysis (Chapter 15) derived from personality psychology, where the depths of human character were first explored with numbers. And chi-square (Chapter 13) came to us from sociology, where it was often a question of class.

In the early days of statistics, it was popular to use the new methods to prove the existence of God. For example, John Arbuthnot discovered that more male than female babies were born in London between 1629 and 1710. In what is considered the first use of a statistical test, he proved that the male birthrate was higher than could be expected by chance (assuming that 50:50 was chance) and concluded that there was a plan operating, since males face more danger to obtain food for their families, and only God, he said, could do such planning.

In 1767, John Michell also used probability theory to prove the existence of God when he argued that the odds were 500,000 to 1 against six stars being placed as close together as those in the constellation Pleiades; so their placement had to have been a deliberate act of the Creator.

Statistics in the "state arithmetic" sense are legally endorsed by most governments today. For example, the first article of the U.S. Constitution requires a census. And statistics helped the United States win the Revolutionary War. John Adams obtained critical aid from Holland by pointing out certain vital statistics, carefully gathered by the clergy in local parishes, demonstrating that the colonies had doubled their population every 18 years, adding 20,000 fighting men per annum. "Is this the case of our enemy, Great Britain?" Adams wrote. "Which then can maintain the war the longest?"

Similar statistics were observed by U.S. President Thomas Jefferson in 1786. He wrote that his people "become uneasy" when there are more of them than 10 per square mile and that given the population growth of the new country, within 40 years these restless souls would fill up all of their country's "vacant land." Some 17 years later, Jefferson doubled the size of the United States' "vacant" land through the Louisiana Purchase.


Table 1-2 Levels of Measurement

Level | Definition | Example

Equal-interval | Numeric variable in which differences between values correspond to differences in the underlying thing being measured | Stress level, age

Rank-order | Numeric variable in which values correspond to the relative position of things measured | Class standing, position finished in a race

Nominal | Variable in which the values are categories | Gender, religion

variables are also called categorical variables because their values are categories.) For example, for the nominal variable gender, the values are female and male. A person's "score" on the variable gender is one of these two values. Another example is psychiatric diagnosis, which has values such as major depression, post-traumatic stress disorder, schizophrenia, and obsessive-compulsive disorder.

These different kinds of variables are based on different levels of measurement (see Table 1-2). Researchers sometimes have to decide how they will measure a particular variable. For example, they might use an equal-interval scale, a rank-order scale, or a nominal scale. The level of measurement selected affects the type of statistics that can be used with a variable. Suppose a researcher is studying the effects of a particular type of brain injury on being able to recognize objects. One approach the researcher might take would be to measure the number of different objects an injured person can observe at once. This is an example of an equal-interval level of measurement. Alternately, the researcher might rate people as able to observe no objects (rated 0), only one object at a time (rated 1), one object with a vague sense of other objects (rated 2), or ordinary vision (rated 3). This would be a rank-order approach. Finally, the researcher might divide people into those who are completely blind (rated B), those who can identify the location of an object but not what the object is (rated L), those who can identify what the object is but not locate it in space (rated I), those who can both locate and identify an object but have other abnormalities of object perception (rated O), and those with normal visual perception (rated N). This is a nominal level of measurement.
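The point that the level of measurement affects which statistics can be used can be sketched in a few lines of Python. The pairing below (mean for equal-interval scores, median for rank-order ratings, most frequent category for nominal codes) is a common convention assumed here for illustration, not something this section prescribes. The scores and the function name `summarize` are hypothetical.

```python
from statistics import mean, median, mode

def summarize(scores, level):
    """Return a summary suited to the stated level of measurement.

    Assumed convention (not from the text): mean for equal-interval,
    median for rank-order, most frequent category for nominal.
    """
    if level == "equal-interval":
        return mean(scores)
    if level == "rank-order":
        return median(scores)
    if level == "nominal":
        return mode(scores)
    raise ValueError(f"unknown level of measurement: {level}")

# Hypothetical scores for five people in the brain-injury example
objects_seen = [0, 1, 1, 2, 4]                 # equal-interval: objects observed at once
vision_rating = [0, 1, 1, 2, 3]                # rank-order: ratings from 0 to 3
perception_type = ["B", "L", "N", "N", "I"]    # nominal: category codes

print(summarize(objects_seen, "equal-interval"))
print(summarize(vision_rating, "rank-order"))
print(summarize(perception_type, "nominal"))
```

Averaging the letter codes would be meaningless, which is one way to see why a nominal variable calls for different methods than a numeric one.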

In this book, as in most psychology research, we focus mainly on numeric, equal-interval variables (or variables that roughly approximate equal-interval variables). We discuss statistical methods for working with nominal variables in Chapter 13 and methods for working with rank-order variables in Chapter 14.

levels of measurement: types of underlying numerical information provided by a measure, such as equal-interval, rank-order, and nominal (categorical).

How are you doing?

1. A father rates his daughter as a 2 on a 7-point scale (from 1 to 7) of crankiness. In this example, (a) what is the variable, (b) what is the score, and (c) what is the range of values?

2. What is the difference between a numeric and a nominal variable?

3. What is the difference between a discrete and a continuous variable?

4. Give the level of measurement of each of the following variables: (a) a person's nationality (Mexican, Spanish, Ethiopian, Australian, etc.), (b) a person's score on a standard IQ test, (c) a person's place on a waiting list (first in line, second in line, etc.).


an example of a numeric variable. Numeric variables are also called quantitative variables.

There are several kinds of numeric variables. In psychology research the most important distinction is between two types: equal-interval variables and rank-order variables. An equal-interval variable is a variable in which the numbers stand for approximately equal amounts of what is being measured. For example, grade point average (GPA) is a roughly equal-interval variable, since the difference between a GPA of 2.5 and 2.8 means about as much as the difference between a GPA of 3.0 and 3.3 (each is a difference of 0.3 of a GPA). Most psychologists also consider scales like the 0-to-10 stress ratings as roughly equal interval. So, for example, a difference between stress ratings of 4 and 6 means about as much as the difference between 7 and 9.
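The arithmetic behind the GPA and stress-rating comparisons above can be checked directly. The sketch below is only an illustration: the helper name `difference` is hypothetical, and Python's `decimal` module is used so that 0.3 comes out exactly rather than as a rounded floating-point number.

```python
from decimal import Decimal

def difference(low, high):
    """Exact difference between two scores written as strings."""
    return Decimal(high) - Decimal(low)

# Equal differences at different points on the GPA scale
gpa_gap_low = difference("2.5", "2.8")
gpa_gap_high = difference("3.0", "3.3")

# Equal differences at different points on the 0-to-10 stress scale
stress_gap_low = difference("4", "6")
stress_gap_high = difference("7", "9")

print(gpa_gap_low == gpa_gap_high)        # the two GPA gaps match
print(stress_gap_low == stress_gap_high)  # the two stress gaps match
```

On an equal-interval variable, matching differences like these are taken to mean roughly the same amount of the thing being measured, wherever on the scale they fall.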

Some equal-interval variables are measured on what is called a ratio scale. An equal-interval variable is measured on a ratio scale if it has an absolute zero point. An absolute zero point means that the value of zero on the variable indicates a complete absence of the variable. Most counts or accumulations of things use a ratio scale. For example, the number of siblings a person has is measured on a ratio scale, because a zero value means having no siblings. With variables that are measured on a ratio scale, you can make statements about the difference in magnitude between values. So, we can say that a person with four siblings has twice as many siblings as a person with two siblings. However, most of the variables in psychology are not on a ratio scale.

numeric variable: variable whose values are numbers (as opposed to a nominal variable). Also called quantitative variable.

equal-interval variable: variable in which the numbers stand for approximately equal amounts of what is being measured.

Equal-interval variables can also be distinguished as being either discrete variables or continuous variables. A discrete variable is one that has specific values and cannot have values between the specific values. The number of times you went to the dentist in the last 12 months is a discrete variable. You may have gone 0, 1, 2, 3, or more times, but you can't have gone 1.72 times or 2.34 times. With a continuous variable, there are in theory an infinite number of values between any two values. So, even though we usually answer the question "How old are you?" with a specific age, such as 19 or 20, you could also answer it by saying that you are 19.26 years old. Height, weight, and time are examples of other continuous variables.
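The distinction between discrete and continuous variables described above can be made concrete with a small sketch. Both function names are hypothetical, and the check assumes the discrete variable is a count such as number of dentist visits.

```python
def is_possible_count(x):
    """True if x could be a score on a discrete count variable
    (a whole number that is zero or more)."""
    return x >= 0 and float(x).is_integer()

def value_between(a, b):
    """Return a value lying between a and b, which a continuous
    variable always allows in theory."""
    return (a + b) / 2

print(is_possible_count(3))     # a possible number of dentist visits
print(is_possible_count(1.72))  # not a possible number of visits
print(value_between(19, 20))    # an age between 19 and 20
```

In practice a computer stores only finitely many decimal places, so the "infinite number of values" holds in theory rather than in any particular data file.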

The other main type of numeric variable, a rank-order variable, is a variable in which the numbers stand only for relative ranking. (Rank-order variables are also called ordinal variables.) A student's standing in his or her graduating class is an example. The amount of difference in underlying GPA between being second and third in class standing could be very unlike the amount of difference between being eighth and ninth.

ratio scale: an equal-interval variable is measured on a ratio scale if it has an absolute zero point, meaning that the value of zero on the variable indicates a complete absence of the variable.

A rank-order variable provides less information than an equal-interval variable. That is, the difference from one rank to the next doesn't tell you the exact difference in amount of what is being measured. However, psychologists often use rank-order variables because they are the only information available. Also, when people are being asked to rate something, it is sometimes easier and less arbitrary for them to make rank-order ratings. For example, when rating how much you like each of your friends, it may be easier to rank them by how much you like them than to rate your liking for them on a scale. Yet another reason researchers often use rank-order variables is that asking people to do rankings forces them to make distinctions. For example, if asked to rate how much you like each of your friends on a 1-to-10 scale, you might rate several of them at exactly the same level, but ranking would avoid such ties.

discrete variable: variable that has specific values and that cannot have values between these specific values.

continuous variable: variable for which, in theory, there are an infinite number of values between any two values.

rank-order variable: numeric variable in which the values are ranks, such as class standing or place finished in a race. Also called ordinal variable.

nominal variable: variable with values that are categories (that is, they are names rather than numbers). Also called categorical variable.

Another major type of variable used in psychology research, which is not a numeric variable at all, is a nominal variable in which the values are names or categories. The term nominal comes from the idea that its values are names. (Nominal


In this chapter and the next, we focus on descriptive statistics. This topic is important in its own right, but it also prepares you to understand inferential statistics. Inferential statistics are the focus of the remainder of the book.

In this chapter we introduce you to some basic concepts, and then you will learn to use tables and graphs to describe a group of numbers. The purpose of descriptive statistics is to make a group of numbers easy to understand. As you will see, tables and graphs help a great deal.

Some Basic Concepts

Variables, Values, and Scores

As part of a larger study (Aron, Paris, & Aron, 1995), researchers gave a questionnaire to students in an introductory statistics class during the first week of the course. One question asked was, "How stressed have you been in the last 2½ weeks, on a scale of 0 to 10, with 0 being not at all stressed and 10 being as stressed as possible?" (How would you answer?) In this study, the researchers used a survey to examine students' level of stress. Other methods that researchers use to study stress include measuring stress-related hormones in human blood or conducting controlled laboratory studies with animals.

In this example, level of stress is a variable, which can have values from 0 to 10, and the value of any particular person's answer is the person's score. If you answered 6, your score is 6; your score has a value of 6 on the variable called "level of stress."

More formally, a variable is a condition or characteristic that can have different values. In short, it can vary. In our example, the variable was level of stress, which can have the values of 0 through 10. Height is a variable, social class is a variable, score on a creativity test is a variable, type of psychotherapy received by patients is a variable, speed on a reaction time test is a variable, number of people absent from work on a given day is a variable, and so forth.

A value is just a number, such as 4, –81, or 367.12. A value can also be a category, such as male or female, or a psychiatric diagnosis—major depression, post-traumatic stress disorder—and so forth.

Finally, on any variable, each person studied has a particular number or score that is his or her value on the variable. As we've said, your score on the stress variable might have a value of 6. Another student's score might have a value of 8.

Psychology research is about variables, values, and scores (see Table 1-1). The formal definitions are a bit abstract, but in practice, the meaning usually is clear.
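One way to see how the three terms fit together is to write them down as a tiny data structure. This is only a sketch under stated assumptions: the class name `Variable` and its `score` method are hypothetical, chosen to mirror the definitions above, with the 0-to-10 stress question as the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Variable:
    """A characteristic that can take different values."""
    name: str
    values: tuple  # every value a score on this variable may take

    def score(self, value):
        """Return `value` as one person's score, or raise an error
        if the variable cannot take that value."""
        if value not in self.values:
            raise ValueError(f"{value!r} is not a possible value of {self.name}")
        return value

# The stress question allows whole-number answers from 0 through 10.
stress = Variable("level of stress", tuple(range(0, 11)))

your_score = stress.score(6)     # one person's value on the variable
another_score = stress.score(8)  # a different person's value
print(your_score, another_score)
```

The variable is the whole scale, the values are the eleven possible answers, and a score is the single value that belongs to one particular person.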

variable: characteristic that can have different values.

values: possible number or category that a score can have.

score: particular person's value on a variable.

Levels of Measurement (Kinds of Variables)

Most of the variables psychologists use are like those in the stress ratings example: the scores are numbers that tell you how much there is of what is being measured. In the stress ratings example, the higher the number is, the more stress there is. This is

Table 1-1 Some Basic Terminology

Term | Definition | Examples

Variable | Condition or characteristic that can have different values | Stress level, age, gender, religion

Value | Number or category | 0, 1, 2, 3, 4, 25, 85, female, Catholic

Score | A particular person's value on a variable | 0, 1, 2, 3, 4, 25, 85, female, Catholic


helps you to read the work of other psychologists, to do your own research if you so choose, and to hone both your reasoning and intuition. Formally, statistics is a branch of mathematics that focuses on the organization, analysis, and interpretation of a group of numbers. But really what is statistics? Think of statistics as a tool that has evolved from a basic thinking process employed by every human: you observe a thing; you wonder what it means or what caused it; you have an insight or make an intuitive guess; you observe again, but now in detail, or you try making little changes in the process to test your intuition. Then you face the eternal problem: was your hunch confirmed or not? What are the chances that what you observed this second time will happen again and again, so that you can announce your insight to the world as something probably true?

Statistics is a method of pursuing truth. As a minimum, statistics can tell you the likelihood that your hunch is true in this time and place and with these sorts of people. This pursuit of truth, or at least its future likelihood, is the essence of psychology, of science, and of human evolution. Think of the first research questions: what will the mammoths do next spring? What will happen if I eat this root? It is easy to see how the early accurate "researchers" survived. You are here today because your ancestors exercised brains as well as brawn. Do those who come after you the same favor: think carefully about outcomes. Statistics is one good way to do that.

Psychologists use statistical methods to help them make sense of the numbers they collect when conducting research. The issue of how to design good research is a topic in itself, summarized in a Web Chapter (Overview of the Logic and Language of Psychology Research) available on the Web site for this book (http://www.pearsonhighered.com/). But in this text we confine ourselves to the statistical methods for making sense of the data collected through research.

Psychologists usually use a computer and statistical software to carry out statistical procedures, such as the ones you will learn in this book. However, the best way to develop a solid understanding of statistics is to learn how to do the procedures by hand (with the help of a calculator). To minimize the amount of figuring you have to do, we use relatively small groups of numbers in each chapter's examples and practice problems. We hope that this will also allow you to focus more on the underlying principles and logic of the statistical procedure, rather than on the mathematics of each practice problem (such as subtracting 3 from 7 and then dividing the result by 2 to give an answer of 2). (See the Introduction to the Student on pp. xvi–xviii for more information on the goals of this book.) Having said that, we also recognize the importance of learning how to do statistical procedures on a computer, as you most likely would when conducting your own research. So, at the end of relevant chapters, there is a section called Using SPSS (see also the Study Guide and Computer Workbook that accompanies this text and that includes a guide to getting started with SPSS). SPSS statistical software is commonly used by psychologists and other behavioral and social scientists to carry out statistical analyses. Check with your instructor to see if you have access to SPSS at your institution.

The Two Branches of Statistical Methods

There are two main branches of statistical methods.

1. Descriptive statistics: Psychologists use descriptive statistics to summarize and describe a group of numbers from a research study.

2. Inferential statistics: Psychologists use inferential statistics to draw conclusions and to make inferences that are based on the numbers from a research study but that go beyond the numbers. For example, inferential statistics allow researchers to make inferences about a large group of individuals based on a research study in which a much smaller number of individuals took part.

statistics: branch of mathematics that focuses on the organization, analysis, and interpretation of a group of numbers.

descriptive statistics: procedures for summarizing a group of scores or otherwise making them more comprehensible.

inferential statistics: procedures for drawing conclusions based on the scores collected in a research study but going beyond them.
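As a small illustration of the descriptive branch, the sketch below summarizes a group of numbers by counting how often each score occurs and by figuring the average. The ten stress ratings are hypothetical, made up for this example, and the function name `describe` is an assumption. Inferential procedures, which go beyond the numbers in hand, are not shown.

```python
from collections import Counter
from statistics import mean

def describe(scores):
    """Summarize a group of scores: how many there are, their mean,
    and how often each value occurs (ordered by value)."""
    frequencies = dict(sorted(Counter(scores).items()))
    return {"n": len(scores), "mean": mean(scores), "frequencies": frequencies}

# Hypothetical stress ratings from ten students (0-to-10 scale)
stress_ratings = [8, 7, 4, 10, 8, 6, 8, 9, 9, 7]

summary = describe(stress_ratings)
print(summary["n"])
print(summary["mean"])
for value, count in summary["frequencies"].items():
    print(value, count)
```

The list of values and counts is the raw material for the frequency tables introduced later in this chapter.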


CHAPTER 1

Displaying the Order in a Group of Numbers Using Tables and Graphs

Chapter Outline

• The Two Branches of Statistical Methods 2

• Some Basic Concepts 3

• Frequency Tables 7

• Histograms 10

• Shapes of Frequency Distributions 15

• Controversy: Misleading Graphs 19

• Frequency Tables and Histograms in Research Articles 21

• Summary 23

• Key Terms 24

• Example Worked-Out Problems 24

• Practice Problems 25

• Using SPSS 29

• Chapter Note 32

Welcome to Statistics for Psychology. We imagine you to be like other students we have known who have taken this course. You have chosen to major in psychology or a related field because you are fascinated by people—by the visible behaviors of the people around you, perhaps too by their inner lives as well as by your own. Some of you are highly scientific sorts; others are more intuitive. Some of you are fond of math; others are less so, or even afraid of it. Whatever your style, we welcome you. We want to assure you that if you give this book some special attention (perhaps a little more than most textbooks require), you will learn statistics. The approach used in this book has successfully taught all sorts of students before you, including those who had taken statistics previously and done poorly.

With this book and your instructor's help, you will learn statistics and learn it well.

More importantly, we want to assure you that whatever your reason for studying psychology or a related field, this course is not a waste of time. Learning about statistics


Introduction to the Student

enormously. Those who fear trouble ahead need to work with those who do not (the blind leading the blind is no way to learn). Pick group members who live near you so that it is easy for you to get together. Also, meet often—between each class, if possible.

A Final Note

Believe it or not, we love teaching statistics. Time and again, we have had the wonderful experience of having beaming students come to us to say, "Professor, I got a 90% on this exam. I can't believe it! Me, a 90 on a statistics exam!" Or the student who tells us, "This is actually fun. Don't tell anyone, but I'm actually enjoying . . . statistics, of all things!" We hope you will have these kinds of experiences in this course.

Arthur Aron

Elaine N. Aron

Elliot J. Coups


abstraction often is grasped only superficially at first, as slogans instead of useful knowledge. Of all the courses you are likely to take in psychology, this one will probably do the most to help you learn to think precisely, to evaluate information, and to apply logical analysis at a very high level.

How to Gain the Most from This Course

There are five things we can advise:

1. Keep your attention on the concepts. Treat this course less like a math course and more like a course in logic. When you read a section of a chapter, your attention should be on grasping the principles. When working the exercises, think about why you are doing each step. If you simply try to memorize how to come up with the right numbers, you will have learned very little of use in your future studies—nor will you do very well on the tests in this course.

2. Be sure you know each concept before you go on to the next. Statistics is cumulative. Each new concept is built on the last one. There are short "How Are You Doing?" self-tests at the end of each main chapter section. Be sure you do them. You may also find it helpful to review the "How Are You Doing?" sections before working on the practice problems and when studying for exams. If you are having trouble answering a question at any time—or even if you can answer it but aren't sure you really understand it—stop. Reread the section, rethink it, ask for help. Do whatever you need to do to grasp it. Don't go on to the next section until you are completely confident you have gotten this one. If you are not sure, and you've already done the "How Are You Doing?" questions, take a look at the Example Worked-Out Problems toward the end of the chapter, or try working a practice problem on this material from the end of the chapter. The answers to the Set I practice problems are given toward the end of the book so that you will be able to check your work.

Having to read the material in this book over and over does not mean that you are stupid. Most students have to read each chapter several times. And each reading in statistics is usually much slower than that in other textbooks. Statistics reading has to be pored over with clear, calm attention for it to sink in. Allow plenty of time for this kind of reading and rereading.

3. Keep up. Again, statistics is cumulative. If you fall behind in your reading or miss lectures, the lectures you do attend will be almost meaningless. It will get harder and harder to catch up.

4. Study especially intensely in the first half of the course. It is particularly important to master the material thoroughly at the start of the course. Everything else you learn in statistics is built on what you learn at the start. Yet the beginning of the semester is often when students study least.

If you have mastered the first half of the course—not just learned the general idea, but really know it—the second half will be easier. If you have not mastered the first half, the second half will be close to impossible.

5. Help each other. There is no better way to solidify and deepen your understanding of statistics than to try to explain it to someone who is having a harder time. (Of course, this explaining has to be done with patience and respect.) For those of you who are having a harder time, there is no better way to work through the difficult parts than by learning from another student who has just mastered the material.

Thus, we strongly urge you to form study groups with one to three other students. It is best if your group includes some who expect this material to come easily and some who don't. Those who learn statistics easily will get the most from helping others who have to struggle with it—the latter will tax the former's supposed understanding

Introduction to the Student

The goal of this book is to help you understand statistics. We emphasize meaning and concepts, not just symbols and numbers.

This emphasis plays to your strength. Most psychology majors are not lovers of mathematics but are keenly attuned to ideas. And we want to underscore the following, based on our collective many decades of teaching experience: We have never had a student who could do well in other college courses who could not also do well in this course. (However, we admit that doing well in this course may require more work than doing well in others.)

In this introduction, we discuss why you are taking this course and how you can gain the most from it.

Why Learn Statistics, Other Than to Fulfill a Requirement?

1. Understanding statistics is crucial to being able to read psychology research articles. Nearly every course you will take as a psychology major will emphasize the results of research studies, and these almost always are expressed using statistics. If you do not understand the basic logic of statistics—if you cannot make sense of the jargon, the tables, and the graphs that are at the heart of any research report—your reading of research will be very superficial. (We also recommend that you take a course on how to design and evaluate good research. In this book, we focus on the statistical methods for making sense of the data collected through research. However, we have included a downloadable chapter on the Web site for the book that provides an overview of the logic and language of psychology research.)

2. Understanding statistics is crucial to doing research yourself. Many psychology majors eventually decide to go on to graduate school. Graduate study in psychology, even in clinical and counseling psychology and other applied areas, almost always involves doing research. In fact, learning to do research on your own is often the main focus of graduate school, and doing research almost always involves statistics. This course gives you a solid foundation in the statistics you need for doing research. Further, by mastering the basic logic and ways of thinking about statistics, you will be unusually well prepared for the advanced courses, which focus on the nitty gritty of analyzing research results.

Many psychology programs also offer opportunities for undergraduates to do research. The main focus of this book is understanding statistics, not using statistics. Still, you will learn the basics you need to analyze the results of the kinds of research you are likely to do. (Also, the Web site that accompanies this book, http://www.pearsonhighered.com/, has a special chapter to help you with practical issues in using what you learn in this book for analyzing results of your own research.)

3. Understanding statistics develops your analytic and critical thinking.

Psychology majors are often most interested in people and in improving things in the practical world. This does not mean that you avoid abstractions. In fact, the students we know are exhilarated most by the almost philosophical levels of abstraction where the secrets of human experience so often seem to hide. Yet even this kind of


Preface to the Instructor


outlines before the Practice Problems section and including definitions of key terms in the margin. For several chapters, we expanded the Using SPSS section that shows students how to carry out the chapter's procedures. Also, we added a Using SPSS section to Chapter 15 that shows students how to use SPSS to figure a partial correlation, internal consistency reliability, and an analysis of covariance (ANCOVA). Yet another addition is a section on multilevel modeling analysis in Chapter 15.

Keep in Touch

Our goal is to do whatever we can to help you make your course a success. If you have any questions or suggestions, please send us an email ( [email protected] will do for all of us). Also, if you should find an error somewhere, for everyone's benefit, please let us know right away. When errors have come up in the past, we have usually been able to fix them in the very next printing.

Acknowledgments

First and foremost, we are grateful to our students through the years, who have shaped our approach to teaching by rewarding us with their appreciation for what we have done well as well as their various means of extinguishing what we have done not so well. We also deeply appreciate all those students and instructors who have sent us their ideas and encouragement.

We remain grateful to all of those who helped us with the first four editions of this book, as well as to those who helped with the four editions of the Brief Course version. For their very helpful input on the development of this fifth edition of Statistics for Psychology, we want to thank Mark Walter, Albion College; Helga Walz, University of Baltimore; Susan Nolan, Seton Hall University; Jwa K. Kim, Middle Tennessee State University; Steven Gangestad, University of New Mexico; Mark Vosvick, University of North Texas; Ann Lynn, Ithaca College; John Bechtold, Messiah College; Donald Sharpe, University of Regina; Terri-Lynn MacKay, University of Manitoba; and Jacqueline Bichsel, Penn State Harrisburg. We are extremely grateful to LeeAnn Doherty and Jeff Marshall of Prentice Hall for superbly leading us through the long revision process. Thanks are also due to Jill Traut, Lori Hazzard, and Fred Dahl for their excellent assistance with the production of this edition. We also particularly want to acknowledge Ted Whitley (East Carolina University) for identifying many crucial final changes to the text.

Arthur Aron

Elaine N. Aron

Elliot J. Coups

Credits

Data in Tables 7-11, 7-12, 8-4, 8-5, 9-9, 9-10, 10-15, 10-16, 11-7, 11-8, 13-9, and 13-10 are based on tables in Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Copyright © 1988 by Lawrence Erlbaum Associates, Inc. Reprinted by permission.


suitable for copying for student handouts). These lecture outlines and worked-out examples are especially useful to new instructors or those using our book for the first time, since structuring lectures and creating good examples is one of the most demanding parts of teaching the course.

9. Our Test Bank makes preparing exams easy. We supply approximately 40 multiple-choice, 25 fill-in, and 10 to 12 problem/essay questions for each chapter. Considering that the emphasis of the course is so conceptual, the multiple-choice questions will be particularly useful for instructors who do not have the resources to grade essays.

10. The accompanying Study Guide and Computer Workbook focuses on mastering concepts and also includes instructions and examples for working problems with SPSS. Most study guides concentrate on plugging numbers into formulas and memorizing rules (which is consistent with the emphasis of the textbooks they accompany). For each chapter, our Study Guide and Computer Workbook provides learning objectives, the chapter's formulas (with all symbols defined), and summaries of steps of conducting each procedure covered in the chapter, plus a set of self tests, including multiple-choice, fill-in, and problem/essay questions.

Also, our Study Guide and Computer Workbook goes beyond the brief SPSS sections in each text chapter to provide the needed support for teaching students to become comfortable with this program and carrying out analyses on the computer. First, there is a special appendix on getting started with SPSS. Then, in each chapter corresponding to the text chapters, there is a section showing in detail how to carry out the chapter's procedures with SPSS. (These sections include step-by-step instructions, examples, and illustrations of how each menu and each output appears on the screen.) There are also special activities for using SPSS to strengthen understanding. As far as we know, no other statistics textbook package provides this much depth of explanation.

What's New in This Fifth Edition

With each new edition we have worked to improve the writing, update content, and make adjustments based on our experience teaching and the wonderful input we have received from instructors using the text.

A Web page, which is available to instructors who adopt the book and their students, supplements the text with four downloadable chapters: (1) the basics of research methods, (2) applying statistics in one's own research projects, (3) repeated-measures analysis of variance, and (4) integration of statistical tests and the general linear model (which also serves as an excellent review/overview of the entire book).

In the fourth edition, we reconceptualized the teaching of the material on correlation and regression. We had long resisted calls from instructors to move these topics to after the t test and analysis of variance, thinking that they worked best as descriptive statistics (in previous editions they came right after mean and standard deviation). On the other hand, many instructors will no doubt continue to prefer to follow our original order; so we have made sure in this edition that users can still go directly from Chapter 2 to correlation and regression (now Chapters 11 and 12), and then return to Chapter 3 to begin the discussion of inferential statistics.

In this fifth edition, we of course have continued to focus on simplifying exposition and have done our usual updating of content, examples, boxes, controversies, and other elements, in addition to making a host of minor adjustments to make the book more effective. And we have added further pedagogical aids, such as adding essay

Preface to the Instructor


thinking in statistical theory and application, and this book reflects that engagement. For example, we devote an entire early chapter (Chapter 6) to effect size and power and then return to these topics as we teach each technique.

5. We capitalize on the students' motivations. We do this in two ways. First, our examples emphasize topics or populations that students seem to find most interesting. The very first is from a real study in which students in their first week of an introductory statistics class rated how much stress they felt they were under. Other examples emphasize clinical, organizational, social, and educational psychology while being sure to include sufficient interesting examples from cognitive, developmental, and behavioral psychology, as well as social and cognitive neuroscience, to inspire students with the value of those approaches. (Also, our examples continually emphasize the usefulness of statistical methods and ideas as tools in the research process, never allowing students to feel that what they are learning is theory for the

sake of theory.)

Second, we have worked to make the book extremely straightforward and systematic in its explanation of basic concepts so that students can have frequent "aha" experiences. Such experiences bolster self-confidence and motivate further learning. It is quite inspiring to us to see even fairly modest students glow from having mastered some concept like negative correlation or the distinction between failing to reject the null hypothesis and supporting the null hypothesis. At the same time, we do not constantly remind them how greatly oversimplified we have made things, as some books do. Instead, we show students, in the controversy sections in particular,

how much there is for them to consider deeply, even in an introductory course.

6. We emphasize statistical methods as a living, growing field of research. We take the time to describe the issues, such as the relative merits of both significance testing and confidence intervals. In addition, each chapter includes one or more "boxes" about famous statisticians or interesting sidelights. The goal is for students to see statistical methods as human efforts to make sense out of the jumble of numbers generated by a research study—to see that statistics are not "given" by nature, not infallible, not perfect descriptions of the events they try to describe, but rather a language that is constantly improving through the careful thought of those who use it. We hope that this orientation will help them maintain a questioning, alert attitude

as students and later as professionals.

7. The final chapter looks at advanced procedures without actually teaching them in detail. It explains in simple terms how to make sense out of these statistics when they are encountered in research articles. Most psychology research articles today use methods such as analysis of covariance, multivariate analysis of variance, multilevel modeling, mediation, factor analysis, or structural equation modeling. Students completing the ordinary introductory statistics course are ill equipped to comprehend most of the articles they must read to prepare a paper or study a course topic in further depth. This chapter makes use of the basics that students have just learned (along with extensive excerpts from current research articles) to give a rudimentary understanding of these advanced procedures. This chapter also serves as a reference

guide that students can keep and use in the future when reading such articles.

8. We have written an Instructor's Manual that really helps teach the course. The Manual begins with a chapter summarizing what we have gleaned from our own teaching experience and the research literature on effectiveness in college teaching. The next chapter discusses alternative organizations of the course, tables of possible schedules and a sample syllabus, advice on structuring exams and an example test, and more still! Then each chapter, corresponding to the text chapters, provides full lecture outlines and additional worked-out examples not found in the text (in a form


mouse clicks. What is important today is that students work problems in a way that keeps them constantly aware of the underlying logic of what they are doing. Consider the population variance: the average of the squared deviations from the mean. This concept is directly displayed in the definitional formula (once the student is used to the symbols): $\text{Variance} = [\sum (X - M)^2]/N$. Repeatedly working problems using this formula ingrains the meaning in the student's mind. In contrast, the usual computational version of this formula only obscures this meaning: $\text{Variance} = [\sum X^2 - (\sum X)^2/N]/N$. Repeatedly working problems using this formula does nothing but teach the student the difference between $\sum X^2$ and $(\sum X)^2$!
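The contrast between the two versions of the variance formula can be sketched in a few lines of Python. This is only an illustration: the function names and the four scores are hypothetical and do not come from the text. Both functions return the same number, but only the first mirrors the verbal definition step by step.

```python
def variance_definitional(scores):
    """Average of the squared deviations from the mean: [sum (X - M)^2] / N."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** 2 for x in scores) / n


def variance_computational(scores):
    """Algebraically equivalent shortcut: [sum X^2 - (sum X)^2 / N] / N."""
    n = len(scores)
    sum_x = sum(scores)
    sum_x_squared = sum(x * x for x in scores)
    return (sum_x_squared - sum_x ** 2 / n) / n


# Hypothetical scores chosen only to keep the arithmetic easy to follow.
example_scores = [1, 2, 4, 5]
print(variance_definitional(example_scores))   # 2.5
print(variance_computational(example_scores))  # 2.5
```

With real data the two results can differ in the last decimal places because of floating-point rounding, so a comparison should allow a small tolerance.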

Teaching these tired computational formulas today is an anachronism—at least 40 years out-of-date! Researchers do their statistics on computers now, and the use of statistical software makes the understanding of the basic principles, as they are symbolically expressed in the definitional formulas, more important than ever. Students still need to work lots of problems by hand to learn the material. But they need to work them using the definitional formulas that reinforce the concepts, not using the antiquated computational formulas that obscure them. Not since the era when Lyndon B. Johnson was U.S. president have those computational formulas made sense as time-savers when researchers had to work with large data sets by hand. Even then, however, they were poor teaching tools. (Because some instructors may feel naked without them, we still provide the computational formulas, usually in a

brief note at the end of the chapter.)

2. Each procedure is taught both verbally and numerically—and usually visually as well. In fact, when we introduce every formula, it has attached to it a concise statement of the formula in words. (The major formulas with their verbal descriptions are also repeated on the inside front cover.) Typically, each example lays out the procedures in worked-out formulas, in words (often with a list of steps), and illustrated with easy-to-grasp figures. Practice problems and test bank items, in turn, require the student to calculate results, write a short explanation in layperson's language of what they have done, and make a sketch (for example, of the distributions involved in a t test). The chapter material completely prepares the student for these

kinds of practice problems and test questions.

It is our repeated experience that these different ways of expressing an idea are crucial for establishing a concept in a student's mind. Many psychology students are more at ease with words than with numbers. In fact, some have a positive fear of all mathematics. Writing the formula in words and providing the lay-language explanation gives them an opportunity to do what they do best.

3. A main goal of any introductory statistics course in psychology is to prepare students to read research articles. The way a procedure such as a t test or an analysis of variance is described in a research article is often quite different from what the student expects from the standard textbook discussions. Therefore, as this book teaches a statistical method, it also gives examples of how that method is reported in current journal articles. And we don't just leave it there. The practice problems and test bank

items also include excerpts from journal articles for the student to explain.

4. The book is unusually up-to-date. Most introductory statistics textbooks read as if they were written in the 1950s. The basics are still the basics, but statisticians and researchers think far more subtly about those basics now. Today, the basics are undergirded by a new appreciation of effect size, power, limitations of significance testing, the accumulation of results through meta-analysis, the critical role of models, the underlying unity of difference and association statistics, the growing prominence of regression and associated methods, and a host of new developments arising from the central role of the computer in statistical analyses. We are much engaged in the latest

Preface to the Instructor

The heart of this book was written over a summer in a small apartment near the Place Saint Ferdinand, having been outlined in nearby cafés and on walks in the Bois de Boulogne. It is based on our collective many decades of experience teaching, researching, and writing. We believe that the result is a book as different from the conventional lot of statistics texts as Paris is from Pompeii, yet still comfortable and stimulating to the long-suffering community of statistics instructors.

Our approach was developed over decades of successful teaching—successful not only in the sense that students have consistently rated the course (a statistics course, remember) as a highlight of their major, but also in the sense that students come back to us long after graduating saying, "I was light years ahead of my fellow graduate students because of your course," or "Even though I don't do research, your

course has really helped me read the journals in my field."

The response to the first four editions has been overwhelming. We have received hundreds of thank-you emails and letters from instructors (and from students themselves!) from all over the world. (The text has been translated into Traditional Chinese and Spanish.) Of course, we have also been delighted by the enthusiastic reviews it has received, starting with the first edition in Contemporary Psychology (Bourgeois, 1997) and continuing through recent years (Shevlin, 2005, in Psychology Learning and Teaching).

With each revision, we have tried to maintain those things about the book that have been especially appreciated, while reworking the text to take into account the feedback we have received, our own experiences, and advances and changes in the field. We have also added new pedagogical features to make the book even more accessible for students. (As we undertook this fifth edition we were particularly concerned that the book not become stale and that it remain as lively and as up-to-date as our very first edition.) However, before turning to what's new in this latest revision, we want to reiterate what we said with the first edition about how this book, from the beginning, has been so different from other statistics texts.

How This Book Was Dramatically Different from the Start

Different as this book is, it has from the start also done what the best of the better statistics texts of the last few years have been already doing pretty well: emphasizing the intuitive, de-emphasizing the mathematical, and explaining everything in direct, simple language. But what we have done continues to go beyond even the best of the current lot in 10 key respects.

1. The definitional formulas are brought to center stage because they provide a concise symbolic summary of the logic of each particular procedure. All our explanations, examples, practice problems, and test bank items are based on these definitional formulas. (The amount of data to be processed in practice problems and test bank items is reduced appropriately to keep computations manageable.)

Why this approach? Even in 2008, statistics texts have still not faced the technological realities. What is important today is not that the students learn to calculate a t test with a large data set—programs like SPSS can do this in an instant with a few


Contents

Box 15-2: The Golden Age of Statistics: Four Guys Around London 627

Procedures That Compare Groups 634

Analysis of Covariance (ANCOVA) 634

Multivariate Analysis of Variance (MANOVA) and Multivariate Analysis

of Covariance (MANCOVA) 635

Overview of Statistical Techniques 636

Controversy: Should Statistics Be Controversial? 637

Box 15-3: The Forced Partnership of Fisher and Pearson 638

How to Read Results Using Unfamiliar Statistical Techniques 639

Summary 641

Key Terms 642

Practice Problems 642

Using SPSS 654

Chapter Notes 662

Appendix: Tables 664

Answers to Set I Practice Problems 673

Glossary 701

Glossary of Symbols 708

References 710

Index 719

Web Chapters (available at http://www.pearsonhighered.com)

Chapter W1 Overview of the Logic and Language of Psychology Research

Chapter W2 Applying Statistical Methods in Your Own Project

Chapter W3 Repeated-Measures Analysis of Variance

Chapter W4 Integration and the General Linear Model


Issues in Prediction 503

Multiple Regression 506

Limitations of Prediction 508

Controversy: Unstandardized and Standardized Regression Coefficients; Comparing Predictors 509

Box 12-1: Clinical versus Statistical Prediction 510

Prediction in Research Articles 511

Advanced Topic: Error and Proportionate Reduction in Error 514

Summary 518

Key Terms 519

Example Worked-Out Problems 519

Practice Problems 524

Using SPSS 532

Chapter Notes 535

Chapter 13 Chi-Square Tests 536

Box 13-1: Karl Pearson, Inventor of Chi-Square and Center of Controversy 537

The Chi-Square Statistic and the Chi-Square Test for Goodness of Fit 538

The Chi-Square Test for Independence 546

Assumptions for Chi-Square Tests 554

Effect Size and Power for Chi-Square Tests for Independence 554

Controversy: The Minimum Expected Frequency 558

Chi-Square Tests in Research Articles 559

Summary 560

Key Terms 561

Example Worked-Out Problems 561

Practice Problems 565

Using SPSS 572

Chapter Notes 576

Chapter 14 Strategies When Population Distributions Are Not Normal: Data Transformations and Rank-Order Tests 577

Assumptions in the Standard Hypothesis-Testing Procedures 578

Data Transformations 580

Rank-Order Tests 585

Comparison of Methods 589

Controversy: Computer-Intensive Methods 591

Box 14-1: Where Do Random Numbers Come From? 594

Data Transformations and Rank-Order Tests in Research Articles 595

Summary 596

Key Terms 597

Example Worked-Out Problems 597

Practice Problems 597

Using SPSS 602

Chapter Notes 609

Chapter 15 The General Linear Model and Making Sense of Advanced Statistical Procedures in Research Articles 611

The General Linear Model 612

Box 15-1: Two Women Make a Point About Gender and Statistics 616

Partial Correlation 617

Reliability 618

Multilevel Modeling 620

Factor Analysis 622

Causal Modeling 625


Analyses of Variance in Research Articles 344

Advanced Topic: The Structural Model in the Analysis of Variance 345

Principles of the Structural Model 345

Summary 351

Key Terms 352

Example Worked-Out Problems 353

Practice Problems 357

Using SPSS 364

Chapter Notes 368

Chapter 10 Factorial Analysis of Variance 370

Basic Logic of Factorial Designs and Interaction Effects 371

Recognizing and Interpreting Interaction Effects 376

Basic Logic of the Two-Way Analysis of Variance 386

Box 10-1: Personality and Situational Influences on Behavior: An Interaction Effect 387

Assumptions in the Factorial Analysis of Variance 389

Extensions and Special Cases of the Analysis of Variance 389

Controversy: Dichotomizing Numeric Variables 391

Factorial Analysis of Variance in Research Articles 393

Advanced Topic: Figuring a Two-Way Analysis of Variance 395

Advanced Topic: Power and Effect Size in the Factorial Analysis of Variance 406

Summary 410

Key Terms 411

Example Worked-Out Problems 412

Practice Problems 415

Using SPSS 426

Chapter Notes 431

Chapter 11 Correlation 432

Graphing Correlations: The Scatter Diagram 434

Patterns of Correlation 437

The Correlation Coefficient 443

Box 11-1: Galton: Gentleman Genius 446

Significance of a Correlation Coefficient 452

Correlation and Causality 456

Issues in Interpreting the Correlation Coefficient 458

Box 11-2: Illusory Correlation: When You Know Perfectly Well That If It's Big,

It's Fat-and You Are Perfectly Wrong 460

Effect Size and Power for the Correlation Coefficient 464

Controversy: What Is a Large Correlation? 466

Correlation in Research Articles 467

Summary 469

Key Terms 471

Example Worked-Out Problems 471

Practice Problems 474

Using SPSS 482

Chapter Notes 485

Chapter 12 Prediction 487

Predictor (X) and Criterion (Y) Variables 488

The Linear Prediction Rule 488

The Regression Line 492

Finding the Best Linear Prediction Rule 496

The Least Squared Error Principle 498


Summary 214

Key Terms 215

Example Worked-Out Problems 215

Practice Problems 217

Chapter Note 221

Chapter 7

Introduction to t Tests: Single Sample and Dependent Means 222

The t Test for a Single Sample 223

Box 7-1: William S. Gosset, Alias "Student": Not a Mathematician, But a Practical Man 224

The t Test for Dependent Means 236

Assumptions of the t Test for a Single Sample and the t Test for Dependent Means 247

Effect Size and Power for the t Test for Dependent Means 247

Controversy: Advantages and Disadvantages of Repeated-Measures Designs 250

Box 7-2: The Power of Studies Using Difference Scores: How the Lanarkshire Milk Experiment Could Have Been Milked for More 251

Single Sample t Tests and Dependent Means t Tests in Research Articles 252

Summary 253

Key Terms 254

Example Worked-Out Problems 254

Practice Problems 258

Using SPSS 265

Chapter Notes 268

Chapter 8

The t Test for Independent Means 270

The Distribution of Differences Between Means 271

Hypothesis Testing with a t Test for Independent Means 278

Assumptions of the t Test for Independent Means 286

Box 8-1: Monte Carlo Methods: When Mathematics Becomes Just an Experiment, and Statistics Depend on a Game of Chance 286

Effect Size and Power for the t Test for Independent Means 288

Review and Comparison of the Three Kinds of t Tests 290

Controversy: The Problem of Too Many t Tests 291

The t Test for Independent Means in Research Articles 292

Advanced Topic: Power for the t Test for Independent Means When Sample Sizes Are Not Equal 293

Summary 294

Key Terms 295

Example Worked-Out Problems 295

Practice Problems 298

Using SPSS 305

Chapter Notes 309

Chapter 9

Introduction to the Analysis of Variance 310

Basic Logic of the Analysis of Variance 311

Box 9-1: Sir Ronald Fisher, Caustic Genius of Statistics 317

Carrying Out an Analysis of Variance 319

Hypothesis Testing with the Analysis of Variance 327

Assumptions in the Analysis of Variance 331

Planned Contrasts 334

Post Hoc Comparisons 337

Effect Size and Power for the Analysis of Variance 339

Controversy: Omnibus Tests versus Planned Contrasts 343


Controversies: Is the Normal Curve Really So Normal? and Using Nonrandom Samples 93

Z Scores, Normal Curves, Samples and Populations, and Probabilities in Research Articles 95

Advanced Topics: Probability Rules and Conditional Probabilities 96

Summary 97

Key Terms 98

Example Worked-Out Problems 99

Practice Problems 102

Using SPSS 105

Chapter Notes 106

Chapter 4

Introduction to Hypothesis Testing 107

A Hypothesis-Testing Example 108

The Core Logic of Hypothesis Testing 109

The Hypothesis-Testing Process 110

One-Tailed and Two-Tailed Hypothesis Tests 119

Controversy: Should Significance Tests Be Banned? 124

Box 4-1: Jacob Cohen, the Ultimate New Yorker: Funny, Pushy, Brilliant, and Kind 126

Hypothesis Tests in Research Articles 127

Summary 128

Key Terms 129

Example Worked-Out Problems 129

Practice Problems 131

Chapter Notes 136

Chapter 5

Hypothesis Tests with Means of Samples 137

The Distribution of Means 138

Hypothesis Testing with a Distribution of Means: The Z Test 146

Box 5-1: More About Polls: Sampling Errors and Errors in Thinking About Samples 147

Controversy: Marginal Significance 153

Hypothesis Tests About Means of Samples (Z Tests) and Standard Errors in Research Articles 154

Advanced Topic: Estimation, Standard Errors, and Confidence Intervals 156

Advanced Topic Controversy: Confidence Intervals versus Significance Tests 162

Advanced Topic: Confidence Intervals in Research Articles 163

Summary 163

Key Terms 164

Example Worked-Out Problems 164

Practice Problems 167

Chapter Notes 173

Chapter 6

Making Sense of Statistical Significance: Decision Errors, Effect Size, and Statistical Power 175

Decision Errors 175

Effect Size 179

Box 6-1: Effect Sizes for Relaxation and Meditation: A Restful Meta-Analysis 184

Statistical Power 187

What Determines the Power of a Study? 191

Box 6-2: The Power of Typical Psychology Experiments 199

The Role of Power When Planning a Study 203

The Role of Power When Interpreting the Results of a Study 205

Controversy: Statistical Significance versus Effect Size 208

Decision Errors, Effect Size, and Power in Research Articles 210

Advanced Topic: Figuring Statistical Power 212

Contents

Preface to the Instructor xi

Introduction to the Student xvi

Chapter 1

Displaying the Order in a Group of Numbers Using Tables and Graphs 1

The Two Branches of Statistical Methods 2

Some Basic Concepts 3

Box 1-1: Important Trivia for Poetic Statistics Students 6

Frequency Tables 7

Histograms 10

Box 1-2: Math Anxiety, Statistics Anxiety, and You: A Message for Those of You Who Are Truly Worried About This Course 12

Shapes of Frequency Distributions 15

Controversy: Misleading Graphs 19

Frequency Tables and Histograms in Research Articles 21

Summary 23

Key Terms 24

Example Worked-Out Problems 24

Practice Problems 25

Using SPSS 29

Chapter Note 32

Chapter 2

Central Tendency and Variability 33

Central Tendency 34

Variability 43

Box 2-1: The Sheer Joy (Yes, Joy) of Statistical Analysis 51

Controversy: The Tyranny of the Mean 52

Box 2-2: Gender, Ethnicity, and Math Performance 53

Central Tendency and Variability in Research Articles 55

Summary 57

Key Terms 57

Example Worked-Out Problems 57

Practice Problems 59

Using SPSS 62

Chapter Notes 65

Chapter 3

Some Key Ingredients for Inferential Statistics: Z Scores, the Normal Curve, Sample versus Population, and Probability 67

Z Scores 68

The Normal Curve 73

Box 3-1: de Moivre, the Eccentric Stranger Who Invented the Normal Curve 74

Sample and Population 83

Box 3-2: Surveys, Polls, and 1948's Costly "Free Sample" 86

Probability 88

Box 3-3: Pascal Begins Probability Theory at the Gambling Table, Then Learns to Bet on God 89


Brief Contents

Chapter W1 Overview of the Logic and Language of Psychology Research

Chapter W2 Applying Statistical Methods in Your Own Project

Chapter W3 Repeated-Measures Analysis of Variance

Chapter W4 Integration and the General Linear Model

Brief Contents

Preface to the Instructor xi

Introduction to the Student xvi

Chapter 1 Displaying the Order in a Group of Numbers Using Tables and Graphs 1

Chapter 2 Central Tendency and Variability 33

Chapter 3 Some Key Ingredients for Inferential Statistics: Z Scores, the Normal Curve, Sample versus Population, and Probability 67

Chapter 4 Introduction to Hypothesis Testing 107

Chapter 5 Hypothesis Tests with Means of Samples 137

Chapter 6 Making Sense of Statistical Significance: Decision Errors, Effect Size, and Statistical Power 175

Chapter 7 Introduction to t Tests: Single Sample and Dependent Means 222

Chapter 8 The t Test for Independent Means 270

Chapter 9 Introduction to the Analysis of Variance 310

Chapter 10 Factorial Analysis of Variance 370

Chapter 11 Correlation 432

Chapter 12 Prediction 487

Chapter 13 Chi-Square Tests 536

Chapter 14 Strategies When Population Distributions Are Not Normal: Data Transformations and Rank-Order Tests 577

Chapter 15 The General Linear Model and Making Sense of Advanced Statistical Procedures in Research Articles 611

Appendix: Tables 664

Answers to Set I Practice Problems 673

Glossary 701

Glossary of Symbols 708

References 710

Index 719

Web Chapters (available at http://www.pearsonhighered.com)


Cover Art: Courtesy of Photodisc/Getty Images.

Taken from:

Statistics for Psychology, Fifth Edition

by Arthur Aron, Elaine N. Aron, and Elliot J. Coups

Copyright © 2009, 2006, 2003, 1999, 1994 by Pearson Education

Published by Prentice Hall

Upper Saddle River, New Jersey 07458

All rights reserved. No part of this book may be reproduced, in any form or by any means, without permission in writing from the publisher.

This special edition published in cooperation with Pearson Learning Solutions.

All trademarks, service marks, registered trademarks, and registered service marks are the property of their respective owners and are used herein for identification purposes only.

Pearson Learning Solutions, 501 Boylston Street, Suite 900, Boston, MA 02116 A Pearson Education Company

www.pearsoned.com

Printed in the United States of America


PEARSON

ISBN 10: 1-256-77287-9 ISBN 13: 978-1-256-77287-3

PEARSON

ALWAYS LEARNING

Arthur Aron • Elaine N. Aron • Elliot J. Coups

Statistics for

Psychology

Custom Edition for Liberty University

Taken from:

Statistics for Psychology, Fifth Edition

by Arthur Aron, Elaine N. Aron, and Elliot J. Coups

The estimated variance of the distribution of means is the sum of each sample mean's squared deviation from the grand mean, divided by the degrees of freedom for the between-groups population variance estimate.

$S_M^2 = \frac{\sum (M - GM)^2}{df_{Between}}$ (9-2)

The between-groups population variance estimate (or mean squares between) is the estimated variance of the distribution of means multiplied by the number of scores in each group.

$S_{Between}^2 \text{ or } MS_{Between} = (S_M^2)(n)$ (9-4)

The F ratio is the between-groups population variance estimate (or mean squares between) divided by the within-groups population variance estimate (or mean squares within).

$F = \frac{S_{Between}^2}{S_{Within}^2} \text{ or } F = \frac{MS_{Between}}{MS_{Within}}$ (9-5)
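As a rough sketch of how the F ratio is assembled from its parts, the following Python snippet combines the between-groups estimate (the variance of the group means multiplied by the number of scores per group) with a within-groups estimate taken as the average of the groups' estimated population variances. It assumes equal group sizes; the function names and the three small groups of scores are hypothetical, chosen only so the arithmetic is easy to check.

```python
def estimated_variance(scores):
    """Estimated population variance: sum of squared deviations / (N - 1)."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** 2 for x in scores) / (n - 1)


def f_ratio(groups):
    """F ratio for a one-way analysis of variance with equal group sizes."""
    n = len(groups[0])                      # scores in each group (assumed equal)
    n_groups = len(groups)
    means = [sum(g) / n for g in groups]
    grand_mean = sum(means) / n_groups      # valid because group sizes are equal

    # Within-groups estimate: average of each group's variance estimate.
    s2_within = sum(estimated_variance(g) for g in groups) / n_groups

    # Between-groups estimate: variance of the means times scores per group.
    df_between = n_groups - 1
    s2_m = sum((m - grand_mean) ** 2 for m in means) / df_between
    s2_between = s2_m * n

    return s2_between / s2_within


# Hypothetical data: three groups of three scores each.
print(f_ratio([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))  # 27.0
```

In this example each group has a variance estimate of 1, the group means are 2, 5, and 8, and the between-groups estimate is 27, so the ratio is 27.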

The between-groups population variance estimate is the sum of squared deviations of each score's group's mean from the grand mean divided by the degrees of freedom for the between-groups population variance estimate.

$S_{Between}^2 = \frac{\sum (M - GM)^2}{df_{Between}} \text{ or } MS_{Between} = \frac{SS_{Between}}{df_{Between}}$ (9-10)

The within-groups population variance estimate is the sum of squared deviations of each score from its group's mean divided by the degrees of freedom for the within-groups population variance estimate.

$S_{Within}^2 = \frac{\sum (X - M)^2}{df_{Within}} \text{ or } MS_{Within} = \frac{SS_{Within}}{df_{Within}}$ (9-11)

The sum of squared deviations for rows is the sum of each score's row's mean's squared deviation from the grand mean.

$SS_{Rows} = \sum (M_{Row} - GM)^2$ (10-1)

The sum of squared deviations for the interaction is the sum of the squares of each score's deviation from the grand mean minus its deviation from its cell's mean, minus its row's mean's deviation from the grand mean, minus its column's mean's deviation from the grand mean.

$SS_{Interaction} = \sum [(X - GM) - (X - M) - (M_{Row} - GM) - (M_{Column} - GM)]^2$ (10-3)

The sum of squared deviations within groups (within cells) is the sum of each score's squared deviation from its cell's mean.

$SS_{Within} = \sum (X - M)^2$ (10-4)

The correlation coefficient is the sum, over all the people in the study, of the product of each person's two deviation scores, divided by the square root of the result of multiplying the sum of everyone's squared deviation scores on the X variable by the sum of everyone's squared deviation scores on the Y variable.

$r = \frac{\sum [(X - M_X)(Y - M_Y)]}{\sqrt{(SS_X)(SS_Y)}}$ (11-1)
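The verbal description of the correlation coefficient translates almost word for word into code. The Python sketch below is illustrative only; the function name and the sample scores are hypothetical and not taken from the text.

```python
import math


def correlation(xs, ys):
    """Correlation coefficient from deviation scores.

    Sum of the products of each person's two deviation scores, divided by
    the square root of (SS_X)(SS_Y).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_of_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sum_of_products / math.sqrt(ss_x * ss_y)


# Hypothetical scores: Y is exactly twice X, a perfect positive correlation.
print(correlation([1, 2, 3, 4], [2, 4, 6, 8]))  # 1.0
```

Reversing the order of the Y scores in this example gives a perfect negative correlation of -1.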

A person's predicted score on the criterion variable equals the regression constant plus the regression coefficient multiplied by that person's score on the predictor variable.

$\hat{Y} = a + (b)(X)$ (12-1)

Chi-square is the sum, over all the categories or cells, of the squared difference between observed and expected frequencies divided by the expected frequency.

$\chi^2 = \sum \frac{(O - E)^2}{E}$ (13-1)

A cell's expected frequency is the number in its row divided by the total number, multiplied by the number in its column.

$E = \left(\frac{R}{N}\right)(C)$ (13-2)
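The two chi-square formulas work together in a test for independence: each cell's expected frequency comes from its row and column totals, and chi-square sums the squared discrepancies relative to those expected frequencies. The Python sketch below illustrates this under the assumption that the observed frequencies are given as a list of rows; the function name and the 2 x 2 table are hypothetical.

```python
def chi_square_independence(table):
    """Chi-square statistic for a contingency table of observed frequencies."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    total = sum(row_totals)

    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected frequency: (row total / total N) * column total.
            expected = (row_totals[i] / total) * column_totals[j]
            chi_square += (observed - expected) ** 2 / expected
    return chi_square


# Hypothetical 2 x 2 table; every expected frequency works out to 15.
print(round(chi_square_independence([[10, 20], [20, 10]]), 2))  # 6.67
```

Each of the four cells differs from its expected frequency by 5, contributing 25/15 to the total, so the statistic is 100/15, or about 6.67.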


Major Formulas

Formula Number

The mean is the sum of the scores divided by the number of scores.

$M = \frac{\sum X}{N}$ (2-1)

The variance is the sum of the squared deviations of the scores from the mean, divided by the number of scores.

$SD^2 = \frac{\sum (X - M)^2}{N}$ (2-2)

A Z score is the raw score minus the mean, divided by the standard deviation.

$Z = \frac{X - M}{SD}$ (3-1)

The variance of a distribution of means is the variance of the population of individuals divided by the number of individuals in each sample.

$\sigma_M^2 = \frac{\sigma^2}{N}$ (5-2)

The effect size for the difference between two means is the difference between the population means divided by the population's standard deviation.

$d = \frac{\mu_1 - \mu_2}{\sigma}$ (6-1)

The estimated population variance is the sum of the squared deviation scores divided by the number of scores minus 1.

$S^2 = \frac{\sum (X - M)^2}{N - 1} = \frac{SS}{N - 1}$ (7-1)

The variance of the distribution of means based on an estimated population variance is the estimated population variance divided by the number of scores in the sample.

$S_M^2 = \frac{S^2}{N}$ (7-5)

The t score in a single sample t test and a t test for dependent means is the sample mean minus the population mean, divided by the standard deviation of the distribution of means.

$t = \frac{M - \mu}{S_M}$ (7-7)

The pooled estimate of the population variance is the degrees of freedom in the first sample divided by the total degrees of freedom (from both samples) multiplied by the population variance estimate based on the first sample, plus the degrees of freedom in the second sample divided by the total degrees of freedom multiplied by the population variance estimate based on the second sample.

$S_{Pooled}^2 = \frac{df_1}{df_{Total}}(S_1^2) + \frac{df_2}{df_{Total}}(S_2^2)$ (8-1)

The variance of the distribution of means for the first population (based on an estimated population variance) is the pooled estimate of the population variance divided by the number of participants in the sample from the first population.

$S_{M_1}^2 = \frac{S_{Pooled}^2}{N_1}$ (8-2)

The variance of the distribution of differences between means is the variance of the distribution of means for the first population (based on an estimated population variance) plus the variance of the distribution of means for the second population (based on an estimated population variance).

$S_{Difference}^2 = S_{M_1}^2 + S_{M_2}^2$ (8-4)

The t score in a t test for independent means is the difference between the two sample means divided by the standard deviation of the distribution of differences between means.

$t = \frac{M_1 - M_2}{S_{Difference}}$ (8-7)
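The chain of steps behind the t test for independent means (pooled variance, variance of each distribution of means, variance of the distribution of differences, then t) can be sketched in Python as follows. The function names and the two small samples are hypothetical. The second distribution of means is computed by analogy with the first, dividing the pooled estimate by the second sample's size.

```python
import math


def estimated_variance(scores):
    """Estimated population variance: sum of squared deviations / (N - 1)."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** 2 for x in scores) / (n - 1)


def t_independent_means(sample1, sample2):
    """Return (t, total degrees of freedom) for two independent samples."""
    n1, n2 = len(sample1), len(sample2)
    df1, df2 = n1 - 1, n2 - 1
    df_total = df1 + df2

    # Pooled estimate: each sample's estimate weighted by its share of df.
    s2_pooled = ((df1 / df_total) * estimated_variance(sample1)
                 + (df2 / df_total) * estimated_variance(sample2))

    # Variance of each distribution of means, then of the differences.
    s2_m1 = s2_pooled / n1
    s2_m2 = s2_pooled / n2
    s_difference = math.sqrt(s2_m1 + s2_m2)

    mean1 = sum(sample1) / n1
    mean2 = sum(sample2) / n2
    return (mean1 - mean2) / s_difference, df_total


# Hypothetical samples with means 8 and 3 and variance estimates of 4 each.
t, df = t_independent_means([6, 8, 10], [1, 3, 5])
print(round(t, 2), df)  # 3.06 4
```

Here the pooled estimate is 4, each distribution of means has variance 4/3, and the difference of 5 between the means is divided by the square root of 8/3.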

The within-groups population variance estimate (or mean squares within) is the sum of the population variance estimates based on each sample, divided by the number of groups.

$S_{Within}^2 \text{ or } MS_{Within} = \frac{S_1^2 + S_2^2 + \cdots + S_{Last}^2}{N_{Groups}}$ (9-1)
