
12

Judgment and Reasoning

Judgment

The activity of "thinking" takes many forms, but one of the central forms is judgment-the process through which people draw conclusions from the evidence they encounter, often evidence provided by life experiences. But how-and how well-do people make judgments? Experience is, of course, an extraordinary teacher, and so you're likely to believe the sports coach who, after many seasons, tells you which game strategies work and which ones don't. Likewise, you trust the police detective who asserts that over the years he's learned how to tell whether a suspect is lying. You welcome the advice of the hair stylist who says, "I can tell you from the hair I cut every day, this shampoo repairs split ends."

But we can also find cases in which people don't learn from experience: "He's getting married again? Why does he think this one will last longer than the last four?"; "It doesn't matter how many polite New Yorkers she meets; she's still convinced that everyone from that city is rude."

What's going on here? Why do people sometimes draw accurate conclusions from their experience, and sometimes not?

Attribute Substitution

Let's start with the information you use when drawing a conclusion from experience. Imagine that you're shopping for a car and trying to decide if European cars are reliable. Surely, you'd want to know how often these cars break down and need repair-how frequent are the problems? As a different case, imagine that you're trying to choose an efficient route for your morning drive to work. Here, too, the information you need concerns frequencies: When you've gone down 4th Avenue, how often were you late? How often were you late when you stayed on Front Street instead?

Examples like these remind us that a wide range of judgments begin with a frequency estimate-an assessment of how often various events have occurred in the past. For many of the judgments you make in day-to-day life, though, you don't have direct access to frequency information. You probably don't have instant access to a count of how many VWs break down, in comparison to how many Hondas. You probably don't have a detailed list of your various commute times. How, therefore, do you proceed in making your judgments?

Let's pursue the decision about commuting routes. In making your choice, you're likely to do a quick scan through memory, looking for relevant cases. If you can immediately think of three occasions when you got caught in a traffic snarl on 4th Avenue and can't think of similar occasions on Front Street, you'll probably decide that Front Street is the better bet. In contrast, if you can recall two horrible traffic jams on Front Street but only one on 4th Avenue, you'll draw the opposite conclusion. The strategy you're using here is known as attribute substitution-a strategy in which you rely on easily assessed information as a proxy for the information you really need. In this judgment about traffic, the information you need is frequency (how often you've been late when you've taken one route or the other), but you don't have access to this information. As a substitute, you base your judgment on availability-how easily and how quickly you can come up with relevant examples. The logic is this: "Examples leap to mind? Must be a common, often-experienced event. A struggle to come up with examples? Must be a rare event."

This strategy-relying on availability as a substitute for frequency-is a form of attribute substitution known as the availability heuristic (Tversky & Kahneman, 1973). Here's a different type of attribute substitution: Imagine that you're applying for a job. You hope that the employer will examine your credentials carefully and make a thoughtful judgment about whether you'd be a good hire. It's likely, though, that the employer will rely on a faster, easier strategy. Specifically, he may barely glance at your résumé and, instead, ask himself how much you resemble other people he's hired who have worked out well. Do you have the same mannerisms or the same look as Joan, an employee that he's very happy with? If so, you're likely to get the job. Or do you remind him of Jane, an employee he had to fire after just two months? If so, you'll still be looking at the job ads tomorrow. In this case, the person who's interviewing you needs to judge a probability (namely, the probability that you'd work out well if hired) and instead relies on resemblance to known cases. This substitution is referred to as the representativeness heuristic.

The Availability Heuristic

People rely on heuristics like availability and representativeness in a wide range of settings, and so, if we understand these strategies, we understand how a great deal of thinking proceeds. (See Table 12.1 for a summary comparison of these two strategies.)

In general, the term heuristic describes an efficient strategy that usually leads to the right answer. The key word, however, is "usually," because heuristics allow errors; that's the price you pay in order to gain the efficiency. The availability and representativeness heuristics both fit this profile. In each case, you're relying on an attribute (availability or resemblance) that's easy to assess, and that's the source of the efficiency. And in each case, the attribute is correlated with the target dimension, so that it can serve as a reasonable proxy for the target: Events or objects that are frequent in the world are, in most cases, likely to be easily available in memory, so generally you'll be fine if you rely on availability as an index for frequency. And many categories are homogeneous enough so that members of the category do resemble one another; that's why you can often rely on resemblance as a way of judging probability of category membership.

Nonetheless, these strategies can lead to error. To take a simple case, ask yourself: "Are there more words in the dictionary beginning with the letter R (rose, rock, rabbit) or more words with an R in the third position (tarp, bare, throw)?" Most people insist that there are more words beginning with R (Tversky & Kahneman, 1973, 1974), but the reverse is true-by a margin of at least 2-to-1.

Why do people get this wrong? The answer lies in availability. If you search your memory for words starting with R, many will come to mind. (Try it: How many R-words can you name in 10 seconds?) But if you search your memory for words with an R in the third position, fewer will come up. (Again, try this for 10 seconds.) This difference, favoring the words beginning with R, arises because your memory is organized roughly like a dictionary, with words that share a starting sound all grouped together. As a result, it's easy to search memory using "starting letter" as your cue; a search based on "R in third position" is more difficult. In this way, the organization of memory creates a bias in what's easily available, and this bias in availability leads to an error in frequency judgment.

The Wide Range of Availability Effects

The R-word example isn't very interesting on its own-after all, how often do you need to make judgments about spelling patterns? But other examples are easy to find, including cases in which people are making judgments of some importance.

For example, people regularly overestimate the frequency of events that are, in actuality, quite rare (Lichtenstein, Slovic, Fischhoff, Layman, & Combs, 1978). This probably plays a part in people's willingness to buy lottery tickets; they overestimate the likelihood of winning. Likewise, physicians often overestimate the likelihood of a rare disease and, in the process, fail to pursue other, more appropriate, diagnoses (e.g., Elstein et al., 1986; Obrecht, Chapman, & Gelman, 2009).

What causes this pattern? There's little reason to spend time thinking about familiar events ("Oh, look-that airplane has wings!"), but you're likely to notice and think about rare events, especially rare emotional events ("How awful-that airplane crashed!"). As a result, rare events are likely to be well recorded in memory, and this will, in turn, make these events easily available to you. As a consequence, if you rely on the availability heuristic, you'll overestimate the frequency of these distinctive events and, correspondingly, overestimate the likelihood of similar events happening in the future.

Here's a different example. Participants in one study were asked to think about episodes in their lives in which they'd acted in an assertive manner (Schwarz et al., 1991; also see Raghubir & Menon, 2005). Half of the participants were asked to recall 6 episodes; half were asked to recall 12 episodes. Then, all the participants were asked some general questions, including how assertive overall they thought they were. Participants had an easy time coming up with 6 episodes, and so, using the availability heuristic, they concluded, "Those cases came quickly to mind; therefore, there must be a large number of these episodes; therefore, I must be an assertive person." In contrast, participants who were asked for 12 episodes had some difficulty generating the longer list, so they concluded, "If these cases are so difficult to recall, I guess the episodes can't be typical for how I act."

Consistent with these suggestions, participants who were asked to recall fewer episodes judged themselves to be more assertive. Notice, ironically, that the participants who tried to recall more episodes actually ended up with more evidence in view for their own assertiveness. But it's not the quantity of evidence that matters. Instead, what matters is the ease of coming up with the episodes. Participants who were asked for a dozen episodes had a hard time with the task because they'd been asked to do something difficult-namely, to come up with a lot of cases. But the participants seemed not to realize this. They reacted only to the fact that the examples were difficult to generate, and using the availability heuristic, they concluded that being assertive was relatively infrequent in their past.

The Representativeness Heuristic

Similar points can be made about the representativeness heuristic. Just like availability, this strategy often leads to the correct conclusion. But here, too, the strategy can sometimes lead you astray.

How does the representativeness heuristic work? Let's start with the fact that many of the categories you encounter are relatively homogeneous. The category "birds," for example, is reasonably uniform with regard to the traits of having wings, having feathers, and so on. Virtually every member of the category has these traits, and so, in these regards, each member of the category resembles most of the others. Likewise, the category "motels" is homogeneous with regard to traits like has beds in each room, has a Bible in each room, and has an office, and so, again, in these regards each member of the category resembles the others. The representativeness heuristic capitalizes on this homogeneity. We expect each individual to resemble the other individuals in the category (i.e., we expect each individual to be representative of the category overall). As a result, we can use resemblance as a basis for judging the likelihood of category membership. So if a creature resembles other birds you've seen, you conclude that the creature probably is a bird. We first met this approach in Chapter 9, when we were discussing simple categories like "dog" and "fruit." But the same approach can be used more broadly-and this is the heart of the representativeness strategy. Thus, if a job candidate resembles successful hires you've made, you conclude that the person will probably be a successful hire; if someone you meet at a party resembles engineers you've known, you assume that the person is likely to be an engineer.

Once again, though, use of this heuristic can lead to error. Imagine tossing a coin over and over, and let's say that it lands "heads" up six times in a row. Many people believe that on the next toss the coin is more likely to come up tails. They reason that if the coin is fair, then any series of tosses should contain roughly equal numbers of heads and tails. If no tails have appeared for a while, then some are "overdue" to make up the balance.

This pattern of thinking is called the "gambler's fallacy." To see that it is a fallacy, bear in mind that a coin has no "memory," so the coin has no way of knowing how long it has been since the last tails. Therefore, the likelihood of a tail occurring on any particular toss must be independent of what happened on previous tosses; there's no way that the previous tosses could possibly influence the next one. As a result, the probability of a tail on toss number 7 is .50, just as it was on the first toss-and on every toss.

What produces the gambler's fallacy? The explanation lies in the assumption of category homogeneity. We know that in the long run, a fair coin will produce equal numbers of heads and tails. Therefore, the category of "all tosses" has this property. Our assumption of homogeneity, though, leads us to expect that any "representative" of the category will also have this property-that is, any sequence of tosses will also show the 50-50 split. But this isn't true: Some sequences of tosses are 75% heads; some are 5% heads. It's only when we combine these sequences that the 50-50 split emerges. (For a different perspective on the gambler's fallacy, see Farmer, Warren, & Hahn, 2017.)

Reasoning from a Single Case to the Entire Population

The assumption of homogeneity can also lead to a different error, one that's in view whenever people try to persuade each other with a "man who" argument. To understand this term (first proposed by Nisbett & Ross, 1980), imagine that you're shopping for a new cell phone. You've read various consumer magazines and decided, based on their test data, that you'll buy a Smacko brand phone. You report this to a friend, who is aghast. "Smacko? You must be crazy. Why, I know a guy who bought a Smacko, and the case fell apart two weeks after he got it. Then, the wire for the headphones went. Then, the charger failed. How could you possibly buy a Smacko?"

What should you make of this argument? The consumer magazines tested many phones and reported that, say, 2% of all Smackos have repair problems. In your friend's "data," 100% of the Smackos (one out of one) broke. It seems silly to let this "sample of one" outweigh the much larger sample tested by the magazine, but even so your friend probably thinks he's offering a persuasive argument. What guides your friend's thinking? He must be assuming that the category will resemble the instance. Only in that case would reasoning from a single instance be appropriate. (For a classic demonstration of the "man who" pattern, see Hamill, Wilson, & Nisbett, 1980.)

If you listen to conversations around you, you'll regularly hear "man who" (or "woman who") arguments. "What do you mean, cigarette smoking causes cancer?! I have an aunt who smoked for 50 years, and she runs in marathons!" Often, these arguments seem persuasive. But they have force only by virtue of the representativeness heuristic and your assumption of category homogeneity.

e. Demonstration 12.1: Sample Size

Research on how people make judgments suggests that their performance is at best uneven, with people in many cases drawing conclusions that are not justified by the evidence they've seen. Here, for example, is a question drawn from a classic study of judgment:

In a small town nearby, there are two hospitals. Hospital A has an average of 45 births per day; Hospital B is smaller and has an average of 15 births per day. As we all know, overall the proportion of males born is 50%. Each hospital recorded the number of days in which, on that day, at least 60% of the babies born were male.

Which hospital recorded more such days?

a. Hospital A

b. Hospital B

c. both equal

What's your answer to this question? In more formal procedures, the majority of research participants choose response (c), "both equal," but this answer is statistically unwarranted. All of the births in the country add up to a 50-50 split between male and female babies, and, the larger the sample you examine, the more likely you are to approximate this ideal. But, conversely, the smaller the sample you examine, the more likely you are to stray from this ideal. Days with 60% male births, straying from the ideal, are therefore more likely in the smaller hospital, Hospital B.

If you don't see this, consider a more extreme case:

Hospital C has 1,000 births per day; Hospital D has exactly 1 birth per day. Which hospital records more days with at least 90% male births?

This value will be observed in Hospital D rather often, since on many days all the babies born (one out of one) will be male. This value is surely less likely, though, in Hospital C: 900 male births, with just 100 female, would be a remarkable event indeed. In this case, it seems clear that the smaller hospital can more easily stray far from the 50-50 split.

In the hospital problem, participants seem not to take sample size into account. They seem to think a particular pattern is just as likely with a small sample as with a large sample, although this is plainly not true. This belief, however, is just what we would expect if people were relying on the representativeness heuristic, making the assumption that each instance of a category-or, in this case, each subset of a larger set-should show the properties associated with the entire set.
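To make the sample-size point concrete, here is a minimal simulation sketch in Python (not part of the original demonstration; the 45 and 15 births per day are the chapter's figures, and each birth is assumed to be male with probability .5). It counts how often each hypothetical hospital records a day on which at least 60% of the babies are male:

import random

def extreme_days(births_per_day, threshold=0.60, days=10_000):
    """Proportion of simulated days on which at least `threshold`
    of the babies born are male (each birth is male with p = .5)."""
    count = 0
    for _ in range(days):
        males = sum(random.random() < 0.5 for _ in range(births_per_day))
        if males / births_per_day >= threshold:
            count += 1
    return count / days

print("Hospital A (45 births/day):", extreme_days(45))
print("Hospital B (15 births/day):", extreme_days(15))

Running the sketch shows the smaller hospital crossing the 60% mark far more often than the larger one, which is exactly the sample-size effect described above.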

Try this demonstration with a couple of your friends. As you'll see, it's easy to find people who choose the incorrect option ("both equal"), underlining just how often people seem to be insensitive to considerations of sample size.

e. Demonstration 12.2: Relying on the Representativeness Heuristic

Demonstration 12.1 indicated that people often neglect (or misunderstand the meaning of) sample size. In other cases, people rely on heuristics that are not in any way guided by logic, so their conclusion ends up being quite illogical. For example, here is another classic problem from research on judgment:

Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and she also participated in anti-nuclear demonstrations.

Which of the following is more likely to be true?

a. Linda is a bank teller.

b. Linda is a bank teller and is active in the feminist movement.

What's your response? In many studies, a clear majority of participants (sometimes as high as 85%) choose option (b). Logically, though, this makes no sense. If Linda is a feminist bank teller, then she is still a bank teller. Therefore, there's no way for option (b) to be true without option (a) also being true. Therefore, option (b) couldn't possibly be more likely than option (a)! Choosing option (b), in other words, is akin to saying that if we randomly choose someone who lives in North America, the chance of that person being from Vermont is greater than the chance of that person being from the United States. Why, therefore, do so many people choose option (b)? This option makes sense if people are relying on the representativeness heuristic. In that case, they make the category judgment by asking themselves: "How much does Linda resemble my idea of a bank teller? How much does she resemble my idea of a feminist bank teller?" On this basis, they could easily be led to option (b), because the description of Linda does, in fact, encourage a particular view of her and her politics.
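The underlying logic, that a conjunction can never be more probable than either of its components, is easy to verify with a small counting sketch in Python; the numbers below are invented purely for illustration and are not from the original study:

# Hypothetical population of 1,000 people fitting Linda's description.
bank_tellers = 20              # everyone who is a bank teller, feminist or not
feminist_bank_tellers = 15     # the subset who are also feminists (invented figure)

p_a = bank_tellers / 1000            # P(bank teller)
p_b = feminist_bank_tellers / 1000   # P(bank teller AND feminist)

assert p_b <= p_a    # the conjunction can never beat its component
print(p_a, p_b)      # 0.02 0.015

Whatever counts you plug in, the feminist bank tellers remain a subset of the bank tellers, so option (b) can never be the more probable option.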

There is, however, another possibility. With options (a) and (b) sitting side-by-side, someone might say: "Well, if option (b) is talking about a bank teller who is a feminist, then option (a) must be talking about a bank teller who is not a feminist." On that interpretation, choosing option (b) does seem reasonable. Is this how you interpreted option (a)?

You might spend a moment thinking about how to test this alternative interpretation-the idea that research participants interpret option (a) in this narrowed fashion. One strategy is to present option (a) to some participants and ask them how likely it is, and to present option (b) to other participants and ask them how likely it is. In this way, the options are never put side by side, so there's never any implied contrast in the options. In this situation, then, there's no reason at all for participants to interpret option (a) in the narrowed fashion. Even so, in studies using this alternative procedure, the group of participants seeing option (a) still rated it as less likely than the other group of participants rated the option they saw. Again, this makes no sense from the standpoint of logic, but it makes perfect sense if participants are using the representativeness heuristic.

Detecting Covariation

It cannot be surprising that people often rely on mental shortcuts. After all, you don't have unlimited time, and many of the judgments you make, day by day, are far from life-changing. It's unsettling, though, that people use the same shortcuts when making deeply consequential judgments. And to make things worse, the errors caused by the heuristics can trigger other sorts of errors, including errors in judgments of covariation. This term has a technical meaning, but for our purposes we can define it this way: X and Y "covary" if X tends to be on the scene whenever Y is, and if X tends to be absent whenever Y is absent. For example, exercise and stamina covary: People who do the first tend to have a lot of the second. Years of education and annual salary also covary (and so people with more education tend to earn more), but the covariation is weaker than that between exercise and stamina. Notice, then, that covariation can be strong or weak, and it can also be negative or positive. Exercise and stamina, for example, covary positively (as exercise increases, so does stamina). Exercise and risk of heart attacks covary negatively (because exercise strengthens the heart muscle, decreasing the risk).

Covariation is important for many reasons-including the fact that it's what you need to consider when checking on a belief about cause and effect. For example, do you feel better on days when you eat a good breakfast? If so, then the presence or absence of a good breakfast in the morning should covary with how you feel as the day wears on. Similarly: Are you more likely to fall in love with someone tall? Does your car start more easily if you pump the gas pedal? These, too, are questions that hinge on covariation, leading us to ask: How accurately do people judge covariation?

Illusions of Covariation

People routinely "detect" covariation even where there is none. For example, many people are convinced there's a relationship between someone's astrological sign (e.g., whether the person is a Libra or a Virgo) and their personality, yet no serious study has documented this covariation. Likewise, many people believe they can predict the weather by paying attention to their arthritis pain ("My knee always acts up when a storm is coming"). This belief, too, turns out to be groundless. Other examples concern social stereotypes (e.g., the idea that being "moody" covaries with gender), superstitions (e.g., the idea that Friday the 13th brings bad luck), and more. (For some of the evidence, see King & Koehler, 2000; Redelmeier & Tversky, 1996; Shaklee & Mims, 1982.)

What causes illusions like these? One reason, which we've known about for years, is centered on the evidence people consider when judging covariation: In making these judgments, people seem to consider only a subset of the facts, and it's a subset skewed by their prior expectations (Baron, 1988; Evans, 1989; Gilovich, 1991; Jennings, Amabile, & Ross, 1982). This virtually guarantees mistaken judgments, since even if the judgment process were 100% fair, a biased input would lead to a biased output. Specifically, when judging covariation, your selection of evidence is likely to be guided by confirmation bias-a tendency to be more alert to evidence that confirms your beliefs rather than to evidence that might challenge them (Nisbett & Ross, 1980; Tweney, Doherty, & Mynatt, 1981). We'll have more to say about confirmation bias later, but for now let's note how confirmation bias can distort the assessment of covariation. Let's say, for example, that you have the belief that big dogs tend to be vicious. With this belief, you're more likely to notice big dogs that are, in fact, vicious and little dogs that are friendly. As a result, a biased sample of dogs is available to you, in the dogs you perceive and the dogs you remember. Therefore, if you try to estimate covariation between dog size and temperament, you'll get it wrong. This isn't because you're ignoring the facts. The problem instead lies in your "data"; if the data are biased, so will be your judgment.

Base Rates

Assessment of covariation can also be pulled off track by another problem: neglect of base-rate information-information about how frequently something occurs in general. Imagine that we're testing a new drug in the hope that it will cure the common cold. Here, we're trying to find out if taking the drug covaries with a better medical outcome, and let's say that our study tells us that 70% of patients taking the drug recover from their illness within 48 hours. This result is uninterpretable on its own, because we need the base rate: We need to know how often in general people recover from their colds in the same time span. If it turns out that the overall recovery rate within 48 hours is 70%, then our new drug is having no effect whatsoever.

Similarly, do good-luck charms help? Let's say that you wear your lucky socks whenever your favorite team plays, and the team has won 85% of its games. Here, too, we need to ask about base rates: How many games has your team won over the last few years? Perhaps the team has won 90% overall. In that case, your socks are actually a jinx.
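The general recipe in both examples is the same: compare the outcome rate when the supposed cause is present with the rate when it is absent. Here is a minimal Python sketch of that comparison for the cold-remedy case, using made-up counts rather than data from any actual study:

# Made-up 2 x 2 counts for the cold-remedy example:
# drug vs. no drug, recovered within 48 hours vs. not recovered.
recovered_with_drug, not_recovered_with_drug = 70, 30
recovered_no_drug, not_recovered_no_drug = 70, 30

rate_with_drug = recovered_with_drug / (recovered_with_drug + not_recovered_with_drug)
base_rate = recovered_no_drug / (recovered_no_drug + not_recovered_no_drug)

print(rate_with_drug, base_rate)   # 0.7 0.7 -> no covariation
# "70% of patients taking the drug recovered" sounds impressive on its own,
# but with these counts the drug adds nothing beyond the base rate of recovery.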

Despite the importance of base rates, people often ignore them. In a classic study, Kahneman and Tversky (1973) asked participants this question: If someone is chosen at random from a group of 70 lawyers and 30 engineers, what is his profession likely to be? Participants understood perfectly well that in this setting the probability of the person being a lawyer is .70. Apparently, in some settings people are appropriately sensitive to base-rate information. Other participants did a similar task, but they were given the same base rates and also brief descriptions of certain individuals. Based on this information, they were asked whether each individual was more likely to be a lawyer or an engineer. These descriptions provide diagnostic information-information about the particular case-and some of the descriptions had been crafted (based on common stereotypes) to suggest that the person was a lawyer; some suggested engineer; some were relatively neutral.

Participants understood the value of these descriptions and-as we've just seen-also seem to understand the value of base rates: They're responsive to base-rate information if this is the only information they have. When given both types of information, therefore, we should expect that the participants will combine these inputs as well as they can. If both the base rate and the diagnostic information favor the lawyer response, participants should offer this response with confidence. If the base rate indicates one response and the diagnostic information the other response, participants should temper their estimates accordingly.
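For reference, here is what actually combining the two inputs would look like under Bayes' rule. This is only an illustrative sketch; the diagnostic likelihoods are invented, since the original study did not quantify its descriptions this way:

def posterior_lawyer(prior_lawyer, p_desc_given_lawyer, p_desc_given_engineer):
    """P(lawyer | description), combining the base rate with the
    diagnostic information via Bayes' rule."""
    prior_engineer = 1 - prior_lawyer
    numer = prior_lawyer * p_desc_given_lawyer
    denom = numer + prior_engineer * p_desc_given_engineer
    return numer / denom

# Suppose a description is three times as likely to fit an engineer as a lawyer
# (invented likelihoods, purely for illustration).
print(posterior_lawyer(0.70, 0.10, 0.30))   # 0.4375: the base rate tempers the description
print(posterior_lawyer(0.30, 0.10, 0.30))   # 0.125: reversing the base rate should change the answer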

However, this isn't what participants did. When provided with both types of information, they relied only on the descriptive information about the individual. In fact, when given both the base rate and diagnostic information, participants' responses were the same if the base rates were as already described (70 lawyers, 30 engineers) or if the base rates were reversed (30 lawyers, 70 engineers). This reversal had no impact on participants' judgments, confirming that they were indeed ignoring the base rates.

What produces this neglect of base rates? The answer, in part, is attribute substitution. When asked whether a particular person-Tom, let's say-is a lawyer or an engineer, people seem to turn this question about category membership into a question about resemblance. (In other words, they rely on the representativeness heuristic.) Therefore, to ask whether Tom is a lawyer, they ask themselves how much Tom resembles (their idea of) a lawyer. This substitution is (as we've discussed) often helpful, but the strategy provides no role for base rates-and this guarantees that people will routinely ignore base rates. Consistent with this claim, base-rate neglect is widespread and can be observed both in laboratory tasks and in many real-world judgments. (For some indications, though, of when people do take base rates into account, see Griffin et al., 2012; Klayman & Brown, 1993; Pennycook, Trippas, Handley, & Thompson, 2014.)

e. Demonstration 12.3: "Man who" Arguments

The chapter suggests that people often rely on "man who" arguments. For example: "It's crazy to think Japanese cars are reliable. I have a friend who owns a Toyota, and she's had problem after problem with it, starting with the week she bought it!"

But are "man who" arguments really common? As an exercise, for the next week try to be on the lookout for these arguments. Bear in mind that there are many variations on this form: "I know a team that barely practices, and they win almost all their games" "I have a classmate who parties every Friday, and he's doing great in school, what these variations have in common is that they draw a conclusion based on just Friday nights?" Of course, why should I stay home on SO one case. How often can you detect any of these variations in your day-to-day conversations or in things you read online?

One more question: Once you're alert to "man who" arguments, and noticing them when they come into your view, does that help you to be on guard against such arguments? You might try responding, each time you encounter one of these arguments, with the simple assertion, "That's just a single case; maybe it's not in any way typical of the broader pattern."

e. Demonstration 12.4: Applying Base Rates

Chapter 12 documents many errors in judgment, and it is deeply troubling that these errors can be observed even when knowledgeable experts are making judgments about domains that are enormously consequential. As an illustration, consider the following scenario.

Imagine that someone you care about-let's call her Julia, age 42-is worried that she might have breast cancer. In thinking about Julia's case, we might start by asking: How common is breast cancer for women of Julia's age, with her family history, her dietary pattern, and so on? Let's assume that for this group the statistics show an overall 3% likelihood of developing breast cancer. This should be reassuring to Julia, because there is a 97% chance that she is cancer free.

Of course, a 3% chance is still scary for this disease, so Julia decides to get a mammogram. When her results come back, the report is bad-indicating that she does have breast cancer. Julia quickly does some research to find out how accurate mammograms are, and she learns that the available data are something like this:

                          Mammogram indicates
                          Cancer      No cancer
Cancer actually present     85%          15%
Cancer actually absent      10%          90%

(We emphasize that these are fictitious numbers, created for this exercise. Even so, the reality is that mammograms are reasonably accurate, in the way shown.)

In light of all this information, what is your best estimate of the chance that Julia does, in fact, have breast cancer? She comes from a group that only has a 3% risk for cancer, but she's gotten an abnormal mammogram result, and the test seems, according to her research, accurate. What should we conclude? Think about this for a few moments, and before reading on, estimate the percentage chance of Julia having breast cancer.

When medical doctors are asked questions like these, their answers are often wildly inaccurate, because they (like most people) fail to use base-rate information correctly. What was your estimate of the percentage chance of Julia having breast cancer? The correct answer is 20%. This is an awful number, given what's at stake, and Julia would surely want to pursue further tests. But the odds are still heavily in Julia's favor, with a 4-to-1 chance that she is entirely free of cancer.

Where does this answer come from? Let's create a table using actual counts rather than the percentages shown in the previous table. Go get a piece of paper, and set up a table like this:

                          Mammogram indicates
                          Cancer      No cancer      Total number
Cancer actually present
Cancer actually absent

Let's imagine that we're considering 100,000 women with medical histories similar to Julia's. We have already said that overall there's a 3% chance of breast cancer in this group, and so 3,000 (3% of 100,000) of these women will have breast cancer. Fill that number in as the top number in the "Total number" column, and this will leave the rest of the overall group (97,000) as the bottom number in this column.


Now, let's fill in the rows. There are 3,000 women counted in the top row, and we've already said that in this group the mammogram will (correctly) indicate that cancer is present in 85% of the cases. So the number for "Mammogram indicates cancer" in the top row will be 85% of the total in this row (3,000), or 2,550. The number of cases for "Mammogram indicates no cancer" in this row will be the remaining 15% of the 3,000, so let's fill in that number-450.


Let's now do the same for the bottom row. We've already said that there are 97,000 women represented in this row; of these, the mammogram will correctly indicate no cancer for 90% (87,300) and will falsely indicate cancer for 10% (9,700). Let's now put those numbers in the appropriate positions.


Finally, let's put these pieces together. According to our numbers, a total of 12,250 women will receive the horrid information that they have breast cancer. (That's the total of the two numbers, 2,550 + 9,700, in the left column, "Mammogram indicates cancer.") Within this group, this test result will be correct for 2,550 women (left column, top row). The result will be misleading for the remaining 9,700 (left column, bottom row). Therefore, of the women receiving awful news from their mammogram, 2,550 out of 12,250, or roughly 20%, will actually have breast cancer; the remaining 80% will be cancer free.
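The same arithmetic can be written out as a short Python script, using the chapter's fictitious figures (a 3% base rate, an 85% hit rate, and a 10% false-alarm rate):

population = 100_000
base_rate = 0.03          # 3% of women in Julia's group actually have cancer
hit_rate = 0.85           # mammogram signals cancer when cancer is present
false_alarm_rate = 0.10   # mammogram signals cancer when cancer is absent

with_cancer = population * base_rate                 # 3,000 women
without_cancer = population - with_cancer            # 97,000 women

true_positives = with_cancer * hit_rate              # 2,550
false_positives = without_cancer * false_alarm_rate  # 9,700

p_cancer_given_positive = true_positives / (true_positives + false_positives)
print(round(p_cancer_given_positive, 3))   # 0.208, i.e., roughly 20%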

Notice, then, that the mammogram is wrong far more often than it's right. This isn't because the mammogram is an inaccurate test. In fact, the test is rather accurate. However, if the test is used with patient groups for which the base rate is low, then most of the people tested will be cancer free. The mammogram might be wrong in only 10% of these cancer-free cases, but this will be 10% of a large number, producing a substantial number of horrifying false alarms. This is obviously a consequential example, because we're discussing a disease that is lethal in many cases. It is therefore deeply troubling that even in this very important example, people still make errors of judgment. Worse, it's striking that experienced physicians, when asked the same questions, also make errors-they, too, ignore the base rates and therefore give risk estimates that are off by a very wide margin.

At the same time, because this is a consequential example, let's add some caution to these points. First, if a woman has a different background from Julia (our hypothetical patient), her overall risk for breast cancer may be different from Julia's. In other words, the base rate for her group may be higher or lower (depending on the woman's age, exposure to certain toxins, family history, and other factors), and this will have a huge impact on the calculations we've discussed here. Therefore, we cannot freely generalize from the numbers considered here to other cases; we would need to know the base rate for these other cases.

Second, even if Julia's risk is 20%, this is still a high number, so Julia (or anyone in this situation) might pursue treatment for this life-threatening illness. A 1-in-5 chance of having a deadly disease must be taken seriously! However, this doesn't change the fact that a 20% risk is very different from the 85% risk that one might fear if one considered only the mammogram results in isolation from the base rates. At 20%, the odds are good that Julia is safe; at 85%, she probably does have this disease. It seems certain that this is a difference that would matter for Julia's subsequent steps, and it reminds us that medical decision making needs to be guided by full information-including, it seems, information about base rates.

Dual-Process Models

We seem to be painting a grim portrait of human judgment, and we can document errors even among experts-financial managers making large investments (e.g., Hilton, 2003; Kahneman, 2011) and physicians diagnosing cancer (but ignoring base rates; Eddy, 1982; also see Koehler, Brenner, & Griffin, 2002). The errors occur even when people are strongly motivated to be careful, with clear instructions and financial rewards offered for good performance (Arkes, 1991; Gilovich, 1991; Hertwig & Ortmann, 2003).

Could it be, then, that human judgment is fundamentally flawed? If so, this might explain why people are so ready to believe in telepathy, astrology, and a variety of bogus cures (Gilovich, 1991; King & Koehler, 2000). In fact, maybe these points help us understand why warfare, racism, neglect of poverty, and environmental destruction are so widespread; maybe these problems are the inevitable outcome of people's inability to understand facts and to draw decent conclusions.

Before we make these claims, however, let's acknowledge another side to our story: Sometimes human judgment rises above the heuristics we've described so far. People often rely on availability in judging frequency, but sometimes they seek other (more accurate) bases for making their judgments (Oppenheimer, 2004; Schwarz, 1998; Winkielman & Schwarz, 2001). Likewise, people often rely on the representativeness heuristic, and so (among other concerns) they draw conclusions from "man who" stories. But in other settings people are keenly sensitive to sample size, and they draw no conclusions if their sample is small or possibly biased. (For an early statement of this point, see Nisbett, Krantz, Jepson, & Kunda, 1983; for more recent discussion, see Kahneman, 2011.) How can we make sense of this mixed pattern?

Ways of Thinking: Type 1, Type 2

A number of authors have proposed that people have two distinct ways of thinking. One type of thinking is fast and easy; the heuristics we've described fall into this category. The other type is slower and more effortful, but also more accurate.

Researchers have offered various versions of this dual-process model, and different theorists use different terminology (Evans, 2006, 2012a; Ferreira, Garcia-Marques, Sherman, & Sherman, 2006; Kahneman, 2011; Pretz, 2008; Shafir & LeBoeuf, 2002). We'll rely on rather neutral terms (initially proposed by Stanovich and West, 2000; Stanovich, 2012), so we'll use Type 1 as the label for the fast, easy sort of thinking and Type 2 as the label for the slower, more effortful thinking. (Also see Figure 12.2.)

When do people use one type of thinking or the other? One hypothesis is that people choose when to rely on each system; presumably, they shift to the more accurate Type 2 when making judgments that really matter. As we've seen, however, people rely on Type 1 heuristics even when incentives are offered for accuracy, even when making important professional judgments, even when making medical diagnoses that may literally be matters of life and death. Surely people would choose to use Type 2 in these cases if they could, yet they still rely on Type 1 and fall into error. On these grounds, it's difficult to argue that using Type 2 is a matter of deliberate choice.

Instead, evidence suggests that Type 2 is likely to come into play only if it's triggered by certain cues and only if the circumstances are right. We've suggested, for example, that Type 2 judgments are slower than Type 1, and on this basis it's not surprising that heuristic-based judgments (and, thus, heuristic-based errors) are more likely when judgments are made under time pressure (Finucane, Alhakami, Slovic, & Johnson, 2000). We've also said that Type 2 judgments require effort, so this form of thinking is more likely if the person can focus attention on the judgment being made (De Neys, 2006; Ferreira et al., 2006; for some complexity, though, see Chun & Kruglanski, 2006).

Triggers for Skilled Intuition

We need to be clear, though, that we cannot equate Type 1 thinking with "bad" or "sloppy" thinking, because fast-and-efficient thinking can be quite sophisticated if the environment contains the "right sort" of triggers. Consider base-rate neglect. We've already said that people often ignore base rates-and, as a result, misinterpret the evidence they encounter. But sensitivity to base rates can also be demonstrated, even in cases involving Type 1 thinking (e.g., Pennycook et al., 2014). This mixed pattern is attributable, in part, to how the base rates are presented. Base-rate neglect is more likely if the relevant information is cast in terms of probabilities or proportions: "There is a .01 chance that people like Mary will have this disease"; "Only 5% of the people in this group are lawyers." But base-rate information can also be conveyed in terms of frequencies, and it turns out that people often use the base rates if they're conveyed in this way. For example, people are more alert to a base rate phrased as "12 out of every 1,000 cases" than they are to the same information cast as a percentage (1.2%) or a probability (.012). (See Gigerenzer & Hoffrage, 1995; also Brase, 2008; Cosmides & Tooby, 1996.) It seems, then, that much depends on how the problem is presented, with some presentations being more "user friendly" than others. (For more on the circumstances in which Type 1 thinking can be rather sophisticated, see Gigerenzer & Gaissmaier, 2011; Kahneman & Klein, 2009; Oaksford & Hall, 2016.)

The Role for Chance

Fast-but-accurate judgments are also more likely if the role of random chance is conspicuous in a problem. If this role is prominent, people are more likely to realize that the "evidence" they're considering may just be a fluke or an accident, not an indication of a reliable pattern. With this, people are more likely to pay attention to the quantity of evidence, on the (sensible) idea that a larger set of observations is less vulnerable to chance fluctuations.

In one study, participants were asked about someone who evaluated a restaurant based on just one meal (Nisbett, Krantz, Jepson, & Kunda, 1983). This is, of course, a weak basis for judging the restaurant: If the diner had a great meal, maybe he was lucky and selected by chance the one entrée the chef knew how to cook. If the meal was lousy, maybe the diner happened to choose the weakest option on the menu. With an eye on these possibilities, we should be cautious in evaluating the diner's report, based on his limited experience of just one dinner.

In one condition of the study, participants were told that the diner chose his entrée by blindly dropping a pencil onto the menu. This cue helped the participants realize that a different sample, and perhaps different views of the restaurant, might have emerged if the pencil had fallen on a different selection. As a result, these participants were appropriately cautious about the diner's assessment based on just a single meal. (Also see Gigerenzer, 1991; Gigerenzer, Hell, & Blank, 1988; Tversky & Kahneman, 1982.)

Education

The quality of a person's thinking is also shaped by the background knowledge that she or he brings to a judgment; Figure 12.3 provides an illustration of this pattern (after Nisbett et al., 1983). In addition, a person's quality of thinking is influenced by education. For example, Fong, Krantz, & Nisbett (1986) conducted a telephone survey of "opinions about sports," calling students who were taking an undergraduate course in statistics. Half of the students were contacted during the first week of the semester; half were contacted during the last week.

In their course, these students had learned about the importance of sample size. They'd been reminded that accidents do happen, but that accidents don't keep happening over and over. Therefore, a pattern visible in a small sample of data might be the result of some accident, but a pattern evident in a large sample probably isn't. Consequently, large samples are more reliable, more trustworthy, than small samples.

This classroom training had a broad impact. In the phone interview (which was-as far as the students knew-not in any way connected to their course), one of the questions involved a comparison between how well a baseball player did in his first year and how well he did in the rest of his career. This is essentially a question about sample size (with the first year being just a sample of the player's overall performance). Did the students realize that sample size was relevant here? For those contacted early in the term, only 16% gave answers that showed any consideration of sample size. For those contacted later, the number of answers influenced by sample size more than doubled (to 37%).

It seems, then, that how well people think about evidence can be improved, and the improvement applies to problems in new domains and new contexts. Training in statistics, it appears, can have widespread benefits. (For more on education effects, see Ferreira et al., 2006; Gigerenzer, Gaissmaier, Kurz-Milcke, Schwartz, & Woloshin, 2008; Lehman & Nisbett, 1990.)

The Cognitive Reflection Test

Even with education, some people make judgment errors all the time, and part of the explanation is suggested by the Cognitive Reflection Test (CRT). This test includes just three questions (Figure 12.4), and for each one, there is an obvious answer that turns out to be wrong. To do well on the test, therefore, you need to resist the obvious answer and instead spend a moment reflecting on the question; if you do, the correct answer is readily available.

Many people perform poorly on the CRT, even when we test students at elite universities (Frederick, 2005). People who do well on the CRT, in contrast, are people who in general are more likely to rely on Type 2 thinking-and therefore likely to avoid the errors we've described in this chapter. In fact, people with higher CRT scores tend to have better scientific understanding, show greater skepticism about paranormal abilities, and even seem more analytic in their moral decisions (Baron, Scott, Fincher, & Metz, 2015; Pennycook, Cheyne, Koehler, & Fugelsang, 2016; Travers, Rolison, & Feeney, 2016). Let's be clear, though, that no one is immune to the errors we've been discussing, but the risk of error does seem lower in people who score well on the CRT.

e. Demonstration 12.5: Frequencies Versus Percentages

This chapter argues that we can improve people's judgments by presenting evidence to them in the right way. To see how this plays out, recruit a few friends. Ask some of them Question 1, and some Question 2:

1. Mr. Jones is a patient in a psychiatric hospital, and he has a history of violence. However, the time has come to consider discharging Mr. Jones from the hospital. He is therefore evaluated by several experts at the hospital, and they conclude: Patients with Mr. Jones's profile are estimated to have a 10% probability of committing an act of violence against others during the first several months after discharge. How comfortable would you be in releasing Mr. Jones?

1   2   3   4   5   6   7

(1 = No way! Keep him in the hospital; 7 = Yes, he is certainly ready for discharge.)

2. Mr. Jones is a patient in a psychiatric hospital, and he has a history of violence. However, the time has come to consider discharging Mr. Jones from the hospital. He is therefore evaluated by several experts at the hospital, and they conclude: Of every 100 patients similar to Mr. Jones, 10 are estimated to commit an act of violence against others during the first several months after discharge. How comfortable would you be in releasing Mr. Jones?

1   2   3   4   5   6   7

(1 = No way! Keep him in the hospital; 7 = Yes, he is certainly ready for discharge.)

These two questions provide the same information (10% = 10 out of 100), but do your friends react in the same way? When experienced forensic psychologists were asked these questions, 41% of them denied the discharge when they saw the data in frequency format (10 out of 100), and only 21% denied the discharge when they saw the percentage format.

Of course, there's room for debate about what the "right answer" is in this case. Therefore, we cannot conclude from this example that a frequency format improves reasoning. (Other evidence, though, does confirm this important point.) But this example does make clear that a change in format matters-with plainly different outcomes when information is presented as a frequency, rather than as a percentage.

e. Demonstration 12.6: Cognitive Reflection

People make many errors in judgment and reasoning. Are some people, however, more likely to make these errors? One line of evidence, discussed in the chapter, comes from the Cognitive Reflection Test (CRT). For each of the three questions on the test, there's an obvious and intuitive answer that happens to be wrong. What the test really measures, therefore, is whether someone is inclined to quickly give that intuitive answer, or whether the person is instead inclined to pause and give a more reflective answer.

The CRT is widely used, and the questions are well publicized (in the media, on the Internet). If someone has seen the questions in one of these other settings, then the test loses all validity. To address this concern, some researchers have tried to develop variations on the CRT-still seeking to measure cognitive reflection, but using questions that may be less familiar.

Here are some examples; what is your answer to each question?

1. If you're running a race and you pass the person in second place, what place are you in?

2. A farmer had 15 sheep, and all but 8 died. How many are left?

3. Emily's father has three daughters. The first two are named April and May. What is the third daughter's name?

4. How many cubic feet of dirt are there in a hole that is 3 feet deep, 3 feet wide, and 3 feet long?

Answer the questions before you read further.

For question 1, the intuitive answer is that you're now in first place. The correct answer is that you're actually in second place.

For question 2, the intuitive answer is 7. The correct answer is 8 ("all but 8 died").

For question 3, the intuitive answer is June. But note that we're talking about Emily's father, so apparently Emily is the name of the third daughter!

For question 4, the intuitive answer is 27. But, of course, what makes a hole a hole is that all of the dirt has been removed from it. Therefore, the correct answer is "none."

Even if you got these questions right, you probably felt the "tug" toward the obvious-but-incorrect answer. This tug is exactly what the test is trying to measure-by determining how often you give in to the tug!

Confirmation and Disconfirmation

In this chapter so far, we've been looking at a type of thinking that requires induction-the process through which you make forecasts about new cases, based on cases you've observed so far. Just as important, though, is deduction-a process in which you start with claims or assertions that you count as "given" and ask what follows from these premises. For example, perhaps you're already convinced that red wine gives you headaches or that relationships based on physical attraction rarely last. You might want to ask: What follows from this? What implications do these claims have for your other beliefs or actions?

Deduction has many functions, including the fact that it helps keep your beliefs in touch with reality. After all, if deduction leads you to a prediction based on your beliefs and the prediction turns out to be wrong, this indicates that something is off track in your beliefs-so that claims you thought were solidly established aren't so solid after all.

Does human reasoning respect this principle? If you encounter evidence confirming your beliefs, does this strengthen your convictions? If evidence challenging your beliefs should come your way, do you adjust?

Confirmation Bias

It seems sensible that in evaluating any belief, you'd want to take a balanced approach-considering evidence that supports the belief, and weighing that information against other evidence that might challenge the belief. And, in fact, evidence that challenges you is especially valuable; many authors argue that this type of evidence is more informative than evidence that seems to support you. (For the classic statement of this position, see Popper, 1934.) There's a substantial gap, however, between these suggestions about what people should do and what they actually do. Specifically, people routinely display a pattern we've already mentioned, confirmation bias: a greater sensitivity to confirming evidence and a tendency to neglect disconfirming evidence. Let's emphasize, however, that this is an "umbrella" term, because confirmation bias can take many different forms (see Figure 12.5). What all the forms have in common is the tendency to protect your beliefs from challenge. (See, among others, Gilovich, 1991; Kassin, Bogart, & Kerner, 2012; Schulz-Hardt, Frey, Lüthgens, & Moscovici, 2000; Stangor & McMillan, 1992.)

In an early demonstration of confirmation bias, Wason (1966, 1968) presented research participants with a series of numbers, such as "2, 4, 6." The participants were told that this trio of numbers conformed to a specific rule, and their task was to figure out the rule. Participants were allowed to propose their own trios of numbers ("Does '8, 10, 12' follow the rule?"), and in each case the experimenter responded appropriately ("Yes, it follows the rule" or "No, it doesn't"). Then, once participants were satisfied that they had discovered the rule, they announced their "discovery."

The rule was actually quite simple: The three numbers had to be in ascending order. For example, "1, 3, 5" follows the rule, but "6, 4, 2" does not, and neither does "10, 10, 10." Despite this simplicity, participants had difficulty discovering the rule, often requiring many minutes. This was largely due to the type of information they requested as they tried to evaluate their hypotheses: To an overwhelming extent, they sought to confirm the rules they had proposed; requests for disconfirmation were relatively rare. And it's noteworthy that those few participants who did seek out disconfirmation for their hypotheses were more likely to discover the rule. It seems, then, that confirmation bias was strongly present in this experiment and interfered with performance.

Reinterpreting Disconfirming Evidence

Here's a different manifestation of confirmation bias. When people encounter information consistent with their beliefs, they're likely to take the evidence at face value, accepting it without challenge or question. In contrast, when people encounter evidence that's inconsistent with their beliefs, they're often skeptical and scrutinize this new evidence, seeking flaws or ambiguities.

One study examined gamblers who bet on professional football games (Gilovich, 1983; see also Gilovich, 1991; Gilovich & Douglas, 1986). These people all believed they had good strategies for picking winning teams, and their faith in these strategies was undiminished by a series of losses. Why is this? It's because the gamblers didn't understand their losses as "losses." Instead, they remembered them as flukes or oddball coincidences: "I was right. New York was going to win if it hadn't been for that crazy injury to their running back"; "I was correct in picking St. Louis. They would have won except for that goofy bounce the ball took after the kickoff." In this way, winning bets were remembered as wins; losing bets were remembered as "near wins." No wonder, then, that the gamblers maintained their views despite the contrary evidence provided by their empty wallets.

Belief Perseverance

Even when disconfirming evidence is undeniable, people sometimes don't use it, leading to a phenomenon called belief perseverance. Participants in a classic study were asked to read a series of suicide notes; their task was to figure out which notes were authentic, collected by the police, and which were fake, written by other students as an exercise. As participants offered their judgments, they received feedback about how well they were doing-that is, how accurate they were in detecting the authentic notes. The trick, though, was that the feedback had nothing to do with the participants' actual judgments. By prearrangement, some participants were told that they were performing at a level well above average in this task; other participants were told the opposite-that they were performing at a level far below average (Ross, Lepper, & Hubbard, 1975; also Ross & Anderson, 1982).

Later on, participants were debriefed. They were told that the feedback they had received was bogus and had nothing to do with their performance. They were even shown the experimenter's instruction sheet, which had assigned them in advance to the success or failure group. They were then asked a variety of additional questions, including some for which they had to assess their own "social sensitivity." Specifically, they were asked to rate their actual ability, as they perceived it, in tasks like the suicide-note task. Let's emphasize that participants were making these judgments about themselves after they'd been told clearly that the feedback they'd received was randomly determined and had no credibility whatsoever. Nonetheless, participants who had received the "above average" feedback continued to think of their social sensitivity as being above average, and likewise their ability to judge suicide notes. Those who had received the "below average" feedback showed the opposite pattern. All participants, in other words, persevered in their beliefs even when the basis for the belief had been completely discredited.

What is going on here? Imagine yourself as one of the participants, and let's say that we've told you that you're performing rather poorly at the suicide-note task. As you digest this new "information" about yourself, you'll probably wonder, "Could this be true? Am I less sensitive than I think I am?" To check on this possibility, you might search through your memory, looking for evidence that will help you evaluate this suggestion.

What sort of evidence will you seek? This is where confirmation bias comes into play. Because of this bias, chances are good that you'll check on the researcher's information by seeking other facts or other episodes in your memory that might confirm your lack of social perception. As a result, you'll soon have two sources of evidence for your social insensitivity: the (bogus) feedback provided by the researcher, and the supporting information you came up with yourself, thanks to your (selective) memory search. So even if the researcher discredits the information he provided, you still have the information you provided yourself, and on this basis you might maintain your belief. (For discussion, see Nisbett & Ross, 1980; also Johnson & Seifert, 1994.) Of course, in this experiment, participants could be led either to an enhanced estimate of their own social sensitivity or to a diminished estimate, depending on which false information they were given in the first place. Presumably, this is because the range of episodes in participants' memories is wide: In some previous episodes they've been sensitive, and in some they haven't been. Therefore, if they search through their memories seeking to confirm the hypothesis that they've been sensitive in the past, they'll find confirming evidence. If they search through memory seeking to confirm the opposite hypothesis, this too will be possible. In short, they can confirm either hypothesis via a suitably selective memory search. This outcome highlights the dangers built into a selective search of the evidence and, more broadly, the danger associated with confirmation bias. Logic  In displaying confirmation bias, people sometimes seem to defy logic. "If my gambling strategy is good, then I'll win my next bet. But I lose the bet. Therefore, my strategy is good." How should we think about this? In general, do people fail to understand-or perhaps ignore-the rules of logic?  Reasoning about Syllogisms  Over the years, a number of theorists have proposed that human thought does follow the rules of logic, and so, when people make reasoning errors, the problem must lie elsewhere: carelessness, perhaps, or a misinterpretation of the problem (Boole, 1854; Mill, 1874; Piaget, 1952). It turns out, however, that errors in logical reasoning happen all the time. If people are careless or misread problems, they do so rather frequently. This is evident, for example, in studies using categorical syllogisms-a type of logical argument that begins with two assertions (the problem's premises), each containing a statement about a category, as shown in Figure 12.6. The syllogism can then be completed with a conclusion that may or may not follow from these premises. The cases shown in the figure are all valid syllogisms-that is, the conclusion does follow from the premises stated. In contrast, here is an example of an invalid syllogism:

All P are M.

All S are M.

Therefore, all S are P.

To see that this is invalid, try translating it into concrete terms, such as "All plumbers are mortal" and "All secretaries are mortal." Both of these are surely true, but it doesn't follow from this that "All secretaries are plumbers."
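The same counterexample can be spelled out in a minimal sketch (our own illustration, in Python, with invented names), modeling each category as a set: both premises come out true, yet the conclusion fails, so the argument form cannot be valid.

```python
# A minimal sketch (not from the textbook) of why "All P are M; All S are M;
# therefore all S are P" is invalid: both premises hold, the conclusion fails.

mortals     = {"Ann", "Bob", "Cal", "Dee"}   # M
plumbers    = {"Ann", "Bob"}                 # P
secretaries = {"Cal", "Dee"}                 # S

premise_1  = plumbers.issubset(mortals)      # "All plumbers are mortal"     -> True
premise_2  = secretaries.issubset(mortals)   # "All secretaries are mortal"  -> True
conclusion = secretaries.issubset(plumbers)  # "All secretaries are plumbers"-> False

print(premise_1, premise_2, conclusion)      # True True False
```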

Research participants who are asked to reason about syllogisms do remarkably poorly-a fact that the research has confirmed for many years. Chapman and Chapman (1959), for example, gave their participants a number of syllogisms, including the one just discussed, with premises of "All P are M" and "All S are M." The vast majority of participants endorsed the invalid conclusion "All S are P." Only 9% got this problem right. More recently, other studies, with other problems, have yielded similar data-with error rates regularly as high as 70% to 90%. (Khemlani & Johnson-Laird, 2012, provide a review.) Belief Bias

Errors in logical reasoning are also quite systematic. For example, people often show a pattern called belief bias: If a syllogism's conclusion happens to be something people believe to be true anyhow, they're likely to judge the conclusion as following logically from the premises. Conversely, if the conclusion happens to be something they believe to be false, they're likely to reject the conclusion as invalid (Evans, 2012b; Trippas, Thompson, & Handley, 2017; Trippas, Verde, & Handley, 2014).

This strategy at first appears reasonable. Why wouldn't you endorse conclusions you believe to be true, based on the totality of your knowledge, and reject claims you believe to be false? Let's be clear, though, that there's a problem here: When people show the belief-bias pattern, they're failing to distinguish between good arguments (those that are truly persuasive) and bad ones. As a result, they'll endorse an illogical argument if it happens to lead to conclusions they like, and they'll reject a logical argument if it leads to conclusions they have doubts about. The Four-Card Task  Similar conclusions derive from research on reasoning about conditional statements. These are statements of the "If X, then Y" format, with the first statement providing a condition under which the second statement is guaranteed to be true.

Often, psychologists study conditional reasoning with the selection task (sometimes called the four-card task). In this task, participants are shown four playing cards, as in Figure 12.7 (after Wason, 1966, 1968). The participants are told that each card has a number on one side and a letter on the other. Their task is to evaluate this rule: "If a card has a vowel on one side, it must have an even number on the other side." Which cards must be turned over to put this rule to the test?

In many studies, roughly a third of the participants turn over just the "A" card to check for an even number. In addition, many turn over both the "A" and the "6." However, just a handful of participants give the correct answer-turning over the "A" and the "7." Plainly, performance is atrocious in this problem, with more than 90% of participants giving wrong answers. (See the caption for Figure 12.7 for an explanation of the right answer.)
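To make the logic of the correct answer concrete, here is a minimal sketch (our own illustration, in Python; the "K" is simply a stand-in for whatever consonant appears in Figure 12.7). A card is worth turning over only if what's hidden on its other side could falsify the rule.

```python
# Rule under test: "If a card has a vowel on one side, it must have an even
# number on the other side."

VOWELS = set("AEIOU")

def could_falsify(visible):
    """Return True if the card's hidden side might violate the rule."""
    if visible.isalpha():
        # A visible vowel might hide an odd number -> must check.
        # A consonant can't violate the rule no matter what's on the back.
        return visible in VOWELS
    else:
        # A visible odd number might hide a vowel -> must check.
        # An even number can't violate the rule no matter what's on the back.
        return int(visible) % 2 == 1

for card in ["A", "K", "6", "7"]:
    print(card, "-> turn over" if could_falsify(card) else "-> leave alone")
# Only "A" and "7" need to be turned over, matching the correct answer above.
```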

Performance is much better, though, with some variations of the four-card task. For example, Griggs and Cox (1982) asked their participants to test rules like this one: "If a person is drinking beer, then the person must be at least 21 years old." As in the other studies, participants were shown four cards and asked which cards they would need to turn over to test the rule (see Figure 12.8). In this version, participants did quite well: 73% (correctly) selected the card labeled "Drinking a beer" and also the card labeled "16 years of age." They did not select "Drinking a Coke" or "22 years of age."

It seems, then, that how well you think depends on what you're thinking about. The problems posed in Figures 12.7 and 12.8 have the same logical structure, but they yield very different performances. Researchers have offered a variety of explanations for this pattern, but the data don't allow us to determine which account is preferable. (For some of the options, see Almor & Sloman, 2000; Cheng, Holyoak, Nisbett, & Oliver, 1986; Cummins, 2004; Cummins & Allen, 1998; Gigerenzer & Hug, 1992; Girotto, 2004; Nisbett, 1993.)

Even with this unsettled issue, it's important to note the parallels between these points and our earlier discussion of how people make judgments about the evidence they encounter. In both domains (inductive judgments and deductive reasoning), it's easy to document errors in people's thinking. But in both domains we can also document higher-quality thinking, and this more-sophisticated thinking can be encouraged by the "right" circumstances. Specifically, in our discussion of judgment, we listed several factors that can trigger better thinking. Now, in our discussion of logic, we've seen that a problem's content can sometimes trigger more accurate reasoning. Thus, the quality of thinking is certainly uneven-but with the right triggers (and, it turns out, proper education), it can be improved. e. Demonstration 12.7: The Effect of Content on Reasoning  In the textbook chapter, we note that how people reason is heavily influenced by the content of what they're reasoning about, and this is not what we would expect if people were using the rules of logic. Those rules are guided only by the form of the statements being examined, and not by the content. (This is the basis for the term "formal logic.") Thus, the rules apply in exactly the same way to any of these cases:

"If Sam is happy, then I'll shout. Sam is happy. Therefore I'll shout."

"If p is true, then q is true. P is true. Therefore, q is true."

"If griss is triffle, then zupie is hockel. Griss is triffle. Therefore zupie is hockel."

All three of these are valid arguments, and there's just one logic rule (called modus ponens) that applies to all of them. Obviously, these statements vary in their content, but, again, that's irrelevant for logic. As we said, logic rules only consider the statements' form.
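To underscore the point, here is a minimal sketch (our own illustration, in Python; the function is hypothetical, not a standard library routine) in which a single, purely formal rule handles all three arguments, regardless of what the sentences say.

```python
# Modus ponens cares only about form: given "if p then q" and "p", it licenses "q".

def modus_ponens(conditional, asserted):
    """conditional is a (p, q) pair meaning 'if p then q'; asserted is a premise."""
    p, q = conditional
    if asserted == p:
        return q          # the conclusion follows
    return None           # the rule doesn't apply

# Exactly the same rule handles all three arguments in the text:
print(modus_ponens(("Sam is happy", "I'll shout"), "Sam is happy"))
print(modus_ponens(("p is true", "q is true"), "p is true"))
print(modus_ponens(("griss is triffle", "zupie is hockel"), "griss is triffle"))
```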

The chapter provides several examples, however, indicating that human reasoning does depend on a problem's content, and this provides important information about the principles guiding your reasoning. As one more demonstration of this broad point, let's look at a variation on the four-card task discussed in the chapter. Imagine that you're the owner of a large company. You're concerned that your employees have been taking days off even when they're not entitled to do so. The company's rule is:

If an employee works on the weekend, then that person gets a day off during the week

Here are employment records for four employees. Each record indicates, on one side, whether or not the employee has worked on the weekend. The record indicates, on the other side, whether or not that person got a day off.

As the owner of this business, which records would you want to inspect to make sure that your rule is being followed?

DID WORK ON A WEEKEND   |   DID NOT WORK ON A WEEKEND   |   DID GET A DAY OFF   |   DID NOT GET A DAY OFF

Now, imagine that you are a worker in the same large company. You're concerned that your boss isn't giving people their days off, even when they've earned the days off. Which records would you want to inspect to make sure that the company's rule is being followed?

DID WORK ON A WEEKEND   |   DID NOT WORK ON A WEEKEND   |   DID GET A DAY OFF   |   DID NOT GET A DAY OFF

Most people give different answers to these two questions. As a company owner, they choose the middle two options-to make sure that the person who didn't work on a weekend isn't taking an "illegal" day off, and to make sure the person who did take a day off really earned it. As a worker, they tend to choose the first and last options-to make sure that everyone who earned a day off gets one.

In the textbook chapter we mention that there's debate over how we should think about these cases. One theory is cast in terms of evolution, with an emphasis on how people reason when they're thinking about "cheaters." A second theory is cast in terms of pragmatic reasoning, based on the idea that people have learned, during their lifetime, how to think about situations involving permission and situations involving obligation. Which of these theories fits with the observation that selections in the four-card problem presented above change if your perspective changes (i.e., changes from the owner's perspective to the worker's perspective)? Can you see how an emphasis on "cheater-detection" might fit with these results, including the flip-flop in responses when your perspective changes? Can you see how an emphasis on permission and obligation also fits with the results? Decision Making  We turn now to a different type of thinking: the thinking that underlies choices. Choices, big and small, fill your life, whether you're choosing what courses to take next semester, how to spend your next vacation, or whether to stay with your current partner. How do you make any of these decisions?  Costs and Benefits  Each of us has our own values-things we prize or, conversely, things we hope to avoid. Likewise, each of us has a series of goals-things we hope to accomplish, things we hope to see. The obvious suggestion, then, is that we use these values and goals in making decisions. In choosing courses for next semester, for example, you'll choose classes that are interesting (something you value) and also those that help fill the requirements for your major (one of your goals). In choosing a medical treatment, you hope to avoid pain and also to retain your physical capacities as long as possible.

To put this a bit more formally, each decision will have certain costs attached to it (consequences that will carry you farther from your goals) as well as benefits (consequences moving you toward your goals and providing things you value). In deciding, you weigh the costs against the benefits and seek a path that will minimize the former and maximize the latter. When you have several options, you choose the one that provides the best balance of benefits and costs.

Economists cast these ideas in terms of utility maximization. The word "utility" refers to the value that you place on a particular outcome-some people gain utility from eating in a fancy restaurant; others gain utility from watching their savings accumulate in a bank account; still others, from giving their money to charity. No matter how you gain utility, the proposal is that you try to make decisions that will bring you as much utility as possible. (See von Neumann & Morgenstern, 1947; also see Baron, 1988; Speekenbrink & Shanks, 2012.) Framing of Outcomes  It's remarkably easy, however, to find cases in which decisions are guided by principles that have little to do with utility maximization. Consider the problem posed in Figure 12.9. In this choice, a huge majority of people-72%-choose Program A, selecting the sure bet rather than the gamble (Tversky & Kahneman, 1987; also Willemsen, Böckenholt, & Johnson, 2011). Now consider the problem in Figure 12.10. Here, an enormous majority-78%-choose Program B, preferring the gamble rather than the sure bet.

The puzzle lies in the fact that the two problems are objectively identical: 200 people saved out of 600 is the same as 400 dead out of 600. Nonetheless, the change in how the problem is phrased- that is, the framing of the decision-has an enormous impact, turning a 3-to-1 preference (72% to 28%) in one direction into a 4-to-1 preference (78% to 22%) in the opposite direction.

We should emphasize that there's nothing wrong with participants' individual choices. In either Figure 12.9 or Figure 12.10, there's no "right answer," and you can persuasively defend either the decision to avoid risk (by selecting Program A) or the decision to gamble (by choosing Program B). The problem lies in the contradiction created by choosing Program A in one context and Program B in the other context. In fact, if a single participant is given both frames on slightly different occasions, he's quite likely to contradict himself. For that matter, if you wanted to manipulate someone's evaluation of these programs (e.g., if you wanted to manipulate voters or shoppers), then framing effects provide an effective way to do this.

Related effects are easy to demonstrate. When participants are given the first problem in Figure 12.11, almost three quarters of them (72%) choose Option A-the sure gain of $100. Participants contemplating the second problem generally choose Option B, with 64% going for this choice (Tversky & Kahneman, 1987). Note, though, that the problems are once again identical. Both pose the question of whether you'd rather end up with a certain $400 or with an even chance of ending up with either $300 or $500. Despite this equivalence, participants treat these problems very differently, preferring the sure thing in one case and the gamble in the other.
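To see the equivalence concretely, here is a minimal sketch (our own illustration, in Python, using the dollar amounts given in the text). In final-outcome terms, both problems offer the same pair of options, and the gamble's expected value equals the sure thing.

```python
# Final-outcome view of the two problems in Figure 12.11.

sure_thing = 400                       # end up with a certain $400
gamble     = [(0.5, 300), (0.5, 500)]  # even chance of ending with $300 or $500

expected_gamble = sum(p * outcome for p, outcome in gamble)
print(sure_thing, expected_gamble)     # 400 400.0 -- identical expectations

# A decision maker focused only on final wealth (or expected final wealth)
# should treat the "gain" frame and the "loss" frame identically; the observed
# reversal is driven by the frame, not by the outcomes.
```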

In fact, there's a reliable pattern in these data. If the frame casts a choice in terms of losses, decision makers tend to be risk seeking-that is, they prefer to gamble, presumably attracted by the idea that maybe they'll avoid the loss. So, for example, when the Asian disease problem is cast in terms of people dying (Figure 12.10), people choose Program B, apparently focused on the hope that, with this program, there may be no loss of life. Likewise, Problem 2 in Figure 12.11 casts the options in terms of financial losses, and this, too, triggers risk seeking: Here, people reliably choose the 50-50 gamble over the sure loss. (This pattern-a willingness to take risks-is especially strong when people contemplate large losses; Harinck, Van Dijk, Van Beest, & Mersmann, 2007; also see LeBoeuf & Shafir, 2012.)

In contrast, if the frame casts a choice in terms of gains, decision makers are likely to show risk aversion: They refuse to gamble, choosing instead to hold tight to what they already have. Thus, Figure 12.9 casts the Asian disease problem in terms of gains (the number of people saved), and this leads people to prefer the risk-free choice (Program A) over the gamble offered by Program B. (And likewise for Problem 1 in Figure 12.11.)

Again, there's nothing wrong with either of these strategies by itself: If someone prefers to be risk seeking, this is fine; if someone prefers to be risk averse, this is okay too. The problem arises when people flip-flop between these strategies, depending on how the problem is framed. 

Framing of Questions and Evidence  Related effects emerge with changes in how a question is framed. For example, imagine that you're on a jury in a messy divorce case; the parents are battling over who will get custody of their only child. The two parents have the attributes listed in Figure 12.12. To which parent will you award sole custody of the child?

Research participants who are asked this question tend to favor Parent B by a wide margin. After all, this parent does have a close relationship with the child and has a good income. Note, though, that we asked to which parent you would award custody. Results are different if we ask participants to which parent they would deny custody. In this case, 55% of the participants choose to deny custody to Parent B (and so, by default, end up awarding custody to Parent A). In other words, the decision is simply reversed: With the "award" question, most participants award custody to Parent B. With the "deny" question, the majority deny custody to Parent B-and so give custody to Parent A (Shafir, 1993; Shafir, Simonson, & Tversky, 1993).

People are also influenced by how evidence is framed. For example, they rate a basketball player more highly if the player has made 75% of his free throws, compared to their ratings of a player who has missed 25% of his free throws. They're more likely to endorse a medical treatment with a "50% success rate" than one with a "50% failure rate." And so on. (See Levin & Gaeth, 1988; Levin, Schnittjer, & Thee, 1988; also Dunning & Parpal, 1989.)  Opt-In versus Opt-Out  A related pattern again hinges on how a decision is presented. Let's start with the fact that more than 100,000 people in the United States are waiting for medically necessary organ transplants. You can help these people-and save lives-by agreeing to be an organ donor. If so, then, in the event of your death, healthy organs from your body might help as many as 50 people. In light of these facts, it's discouraging that relatively few Americans agree to be organ donors, and the reason may lie in the way the decision to donate is framed. In the United States, decisions about organ donation are "opt-in" decisions: The potential donor has to say explicitly that he or she wishes to be a donor; otherwise, the assumption is that the person will not be an organ donor. Other countries use the reverse system: Unless people say explicitly that they don't want to be donors ("opt-out"), the assumption is that they will be donors.

How much does this contrast matter? In Germany, which relies on an opt-in system like the one used in the United States, only 12% of German citizens have agreed to be organ donors. Neighboring Austria, with a reasonably similar culture, uses an opt-out system, and here 99% of the citizens agree to be donors (Johnson & Goldstein, 2003; Thaler, 2009). Similar patterns have been observed with other decisions-for example, the decision to participate in "green energy" programs, or the step of signing up for a plan that will make an automatic monthly contribution to your pension fund (Sunstein, 2016). In each case, there's a sharp contrast between the number of people who say they're in favor of these programs and the number of people who actually participate. And, in each case, part of the reason for non-participation is the reliance on an opt-in system.

This pattern has broad implications for public policy, and some public figures suggest that governments should design programs that "nudge" people to sign up for green energy, to save for their retirement, and so on. (See Thaler & Sunstein, 2009; Sunstein, 2016; although, for discussion, see Burzzone, 2008; Randhawa, Brocklehurst, Pateman, & Kinsella, 2010.) But, in addition, the contrast between opt-in and opt-out decisions reminds us that our choices are governed not just by what's at stake, but also by how the decision is framed.

Maximizing Utility versus Seeing Reasons

In case after case, then, people are powerfully influenced by changes in how a decision is framed, even though, on most accounts, these changes have no impact on the utility you'd receive from the various options. To explain these findings, one possibility is that people are trying to use (something like) utility calculations when making decisions, but aren't very good at it. As a result, they're pulled off track by distractions, including how the decision is framed. A different possibility, though, is more radical. Perhaps we're not guided by utilities at all. Instead, suppose our goal is simply to make decisions that we feel good about, decisions that we think are reasonable and justified. This view of decision making is called reason-based choice (Shafir et al., 1993; also Redelmeier & Shafir, 1995). To see how this account plays out, let's go back to the divorce/custody case just described. Half of the participants in this study were asked to which parent they would award custody. These participants therefore asked themselves: "What would justify giving custody to one parent or another?" and this drew their attention to each parent's positive traits. As a result, they were swayed by Parent B's above-average income and close relationship with the child. Other participants were asked to which parent they would deny custody, and this led them to ask: "What would justify this denial?" This approach drew attention to the parents' negative attributes-especially, Parent B's heavy travel schedule and health problems.

In both cases, then, the participants relied on justification in making their decision. As it turns out, though, the shift in framing caused a change in the factors relevant to that justification, and this is why the shift in framing reversed the pattern of decisions. (For a different example, see Figure 12.13.)

Emotion  Still another factor needs to be included in our theorizing, because people's decisions are powerfully influenced by emotion. (See, among others, Kahneman, 2003; Loewenstein, Weber, Hsee, & Welch, 2001; Medin, Schwartz, Blok, & Birnbaum, 1999; Slovic, Finucane, Peters, & MacGregor, 2002; Weber & Johnson, 2009.)

We mentioned the importance of emotion early in the chapter, when we saw that Elliot, unable to feel emotion, seems unable to make decisions. But what is the linkage between emotion and decision making? At the chapter's start, we pointed out that many decisions involve an element of risk. (Should you try out a new, experimental drug? Should we rely more on nuclear power? Should you sign up for the new professor's course, even though you don't know much about her?) In cases like these, we suggested, people seem to assess the risk in emotional terms. For example, they ask themselves how much dread they experience when thinking about a nuclear accident, and they use that dread as an indicator of risk (Fischhoff, Slovic, & Lichtenstein, 1978; Slovic et al., 2002; also Pachur, Hertwig, & Steinmann, 2012).

But there are also other ways that emotion can influence decisions. Here, we start with the fact that-of course-memories can cause a strong bodily reaction. In remembering a scary movie, for example, you again become tense and your palms might sweat. In remembering a romantic encounter, you again become aroused. In the same way, anticipated events can also produce bodily arousal, and Damasio (1994) suggests that you use these sensations-he calls them somatic markers- as a way of evaluating your options. So, in making a choice, you literally rely on your "gut feelings" to assess your options-an approach that pulls you toward options that trigger positive feelings and away from ones that trigger negative feelings. We've mentioned that a particular region of the brain-the orbitofrontal cortex (at the base of the frontal lobe, just behind the eyeballs)-is crucial in your use of these somatic markers, because this is the brain region that enables you to interpret your emotions. When this region is damaged (as it was in Elliot, the case we met at the chapter's start), decision making is markedly impaired. (See Damasio, 1994; Naqvi, Shiv, & Bechara, 2006. Also see Coricelli, Dolan, & Sirigu, 2007; Dunn et al., 2010; Jones et al., 2012; also Figure 12.14.)

Predicting Emotions  Here's another way emotion shapes decision making: Many decisions depend on a forecast of future emotions. Imagine that you're choosing between two apartments you might rent for next year. One is cheaper and larger but faces a noisy street. Will you just get used to the noise, so that sooner or later it won't bother you? If so, then you should take the apartment. Or will the noise grow increasingly obnoxious as the weeks pass? If so, you should pay the extra money for the other apartment. Plainly, your decision here depends on a prediction about the future-about how your likes and dislikes will change as time goes by.

Research suggests that affective forecasting-your predictions for your own emotions-is often inaccurate. In many studies, people have been asked how they would feel after a significant event; the events at issue include "breaking up with a romantic partner, losing an election, receiving a gift, learning they have a serious illness, failure to secure a promotion, scoring well on an exam," and so on (Gilbert & Ebert, 2002, p. 503; Kermer, Driver-Linn, Wilson, & Gilbert, 2006; see also Figure 12.15). People can usually predict whether their reaction will be positive or negative-and so they realize that scoring well on an exam will make them feel good and that a romantic breakup will make them feel bad. But people consistently overestimate how long these feelings will last-apparently underestimating their ability to adjust to changes in fortune, and also underestimating how easily they'll find excuses and rationalizations for their own mistakes. (For evidence, though, that people aren't awful all the time in predicting their own emotions, see Doré, Meksin, Mather, Hirst, & Ochsner, 2016.)

As a related matter, people generally believe that their current feelings will last longer than they actually will-so they seem to be convinced that things that bother them now will continue to bother them in the future, and that things that please them now will continue to bring pleasure in the future. In both directions, people underestimate their own ability to adapt; as a result, they work to avoid things that they'd soon get used to anyhow and spend money for things that provide only short-term pleasure. (For the data, see Hsee & Hastie, 2005; Loewenstein & Schkade, 1999; Sevdalis & Harvey, 2007; Wilson, Wheatley, Meyers, Gilbert, & Axsom, 2000.)  Research on Happiness  Earlier in this chapter, we saw that people often make errors in judgment and reasoning. It now appears that people also lack skill in decision making. Framing effects leave them open to manipulation and self-contradiction, and errors in affective forecasting guarantee that people will often take steps to avoid regrets that in reality they wouldn't have felt, and pay for expensive toys that they'll soon lose interest in.

Some investigators draw strong conclusions from these findings. Maybe people really are incompetent in making decisions. Maybe they really don't know what will make them happy and might be better off if someone else made their choices for them. (See, e.g., Gilbert, 2006; Hsee, Hastie, & Chen, 2008; but for different views, see Kahneman, 2011; Keys & Schwartz, 2007; Weber & Johnson, 2009.) These are strong claims, and they've been the subject of considerable debate. One author, for example, simply asserts that people are "predictably irrational" in their decision making and we're stuck with that (Ariely, 2009). Another author suggests that in general people are unable to move efficiently toward happiness; the best they can do is "stumble on happiness" (Gilbert, 2006). Yet another author notes that we all like to have choices but argues that having too many choices actually makes us less happy-a pattern he calls the "paradox of choice" (Schwartz, 2003).

Plainly, these are issues that demand scrutiny, with implications for how each of us lives and also, perhaps, implications that might guide government policies or business practices, helping people to become happy (Layard, 2010; Thaler & Sunstein, 2009). In fact, the broad study of "subjective well-being"-what it is, what promotes it-has become an active and exciting area of research. In this way, the study of how people make decisions has led to important questions-and, perhaps, some helpful answers-regarding how they should make decisions. In the meantime, the research highlights some traps to avoid and suggests that each of us should be more careful in making the choices that shape our lives. e. Demonstration 12.8: Wealth Versus Changes in Wealth  Many economists regard utility theory as a description of how you should make choices, and also as a description of how you do make choices. But psychologists have had an easy time identifying challenges to utility theory. Specifically, ordinary decision makers seem to follow rules that are rather different from those proposed by this theory.

For example, utility calculations focus on where a decision will leave you-will you choose to end up with Outcome #1 or with Outcome #2? In making choices, however, people don't just look at the outcomes; they also compare the outcomes to their current status and are heavily influenced by the change that might result from their decision. This is a problem for utility theory, because current status is actually irrelevant to utility calculations.

To make this point concrete, first consider this comparison (from Nobel Prize winner Daniel Kahneman):

Today, Jack and Jill each have a wealth of $5 million.

Yesterday, Jack had $1 million and Jill had $9 million.

Are they equally happy?

For most people, this is an easy question-Jack is likely to be much happier than Jill, because of the change he has experienced. How might this influence their decision making? Here's a different comparison, again from Kahneman:

Anthony's current wealth is $1 million.

Betty's current wealth is $4 million.

They are both offered the choice below; which would they rather have?

· A gamble, with equal chances to end up winning $1 million or $4 million

OR

· A sure thing: End up owning $2 million

How will Anthony choose? How will Betty choose? You've probably answered differently for these two people, because you're alert to the changes in wealth, and not just the outcome levels.

Or, as one more example (and, again, adapted from Kahneman), imagine two contestants on a TV game show:

Contestant 1 has just won $1,000.

She now has a choice: Receive another $500 (and so end up with $1,500) OR toss a coin.

· If the coin comes up "heads," she wins another $1,000 (and so has $2,000).

· If the coin comes up "tails," nothing happens (and so she stays at $1,000).

Contestant 2 has just won $2,000.

She now has a choice: Give up $500 (and so end up with $1,500) OR toss a coin.

· If the coin comes up "heads," nothing happens (and so she stays at $2,000).

· If the coin comes up "tails," she loses $1,000 (and so drops to $1,000).
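If it helps to see the arithmetic laid out, here is a minimal sketch (our own illustration, in Python, using the amounts given above) describing each contestant's options in terms of final winnings.

```python
# Each contestant's options, expressed as final amounts won.

contestant_1 = {
    "sure thing": 1000 + 500,                 # $1,500
    "coin toss":  [(0.5, 1000 + 1000),        # heads: $2,000
                   (0.5, 1000)],              # tails: $1,000
}
contestant_2 = {
    "sure thing": 2000 - 500,                 # $1,500
    "coin toss":  [(0.5, 2000),               # heads: $2,000
                   (0.5, 2000 - 1000)],       # tails: $1,000
}

for name, options in [("Contestant 1", contestant_1), ("Contestant 2", contestant_2)]:
    gamble = sorted(outcome for _, outcome in options["coin toss"])
    print(name, "-> sure:", options["sure thing"], "| gamble outcomes:", gamble)
# Both print: sure: 1500 | gamble outcomes: [1000, 2000]
# In final-outcome terms the two contestants face exactly the same choice; one
# experiences it as a possible gain, the other as a possible loss.
```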

What would you do if you were Contestant 1? Contestant 2? Again, you're probably very sensitive to the changes in wealth state, and not just the outcomes. e. Demonstration 12.9: Probabilities Versus Decision Weights  Does utility theory describe how ordinary people make decisions-in the marketplace or in their lives? According to the theory, people should be influenced by probabilities. It's easy to show, however, that people don't respect the laws of probability in their decision making. Among other points, people overinterpret the difference between a 0% probability and, say, a 5% probability-because they perceive a huge difference between "can't happen" and "might happen." Kahneman and Tversky refer to this as the "possibility effect." (In other words, people are keenly sensitive to whether something is a possibility or not.) Likewise, people overinterpret the difference between 100% and, say, 95%-because here they perceive a huge difference between "will happen" and "might happen." This is the certainty effect.

These effects show up in many settings. People are willing to pay for lottery tickets because they're impressed by the notion that they might win (the possibility effect). People will likewise pay a lot to increase their chances of a gain from 99% to 100%-because of the high value they put on certainty. To see these effects in action, consider the following pair of questions:

· A friend of yours is going in for surgery. You've heard that the surgery has risks attached to it, so you do a bit of research on the Internet and discover that for people with your friend's profile, the risk is actually 0%. But then you find a more recent bit of news, and you realize that the risk is actually 5%. How much would this increase your level of anxiety?

· A different friend of yours is going in for surgery. You've heard that the surgery has risks attached to it, so you do a bit of research on the Internet and discover that for people with your friend's profile, the risk is actually 5%. But then you find a more recent bit of news, and you realize that the risk is actually 10%. How much would this increase your level of anxiety?

Alternatively, imagine two scenarios:

· You're the director of a small company, and you've heard that an illness is likely to affect the city in which your company is located. You decide to vaccinate all of your employees to protect them against the illness. You do a bit of research and discover that you can buy enough doses of Vaccine A for $30,000. But Vaccine A is only 90% effective; Vaccine B is 93% effective. How much more would you be willing to pay for Vaccine B?

· Again, you're the director of a small company, and (as before) you've heard that an illness is likely to affect your city. You decide to vaccinate all of your employees to protect them against the illness. You do a bit of research and discover that you can buy enough doses of Vaccine X for $30,000. But Vaccine X is only 97% effective; Vaccine Y is 100% effective. How much more would you be willing to pay for Vaccine Y?
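Before settling on your answers, it may be useful to note what a pure probability-based analysis says about these comparisons. The sketch below (our own illustration, in Python, using the percentages given above) simply computes the size of each change.

```python
# Surgery scenarios: change in your friend's risk, in percentage points.
risk_change_first  = 5 - 0     # 0% -> 5%
risk_change_second = 10 - 5    # 5% -> 10%
print(risk_change_first, risk_change_second)   # 5 5 -- identical changes

# Vaccine scenarios: change in effectiveness, in percentage points.
gain_A_to_B = 93 - 90          # Vaccine A -> Vaccine B
gain_X_to_Y = 100 - 97         # Vaccine X -> Vaccine Y
print(gain_A_to_B, gain_X_to_Y)                # 3 3 -- identical changes

# In these terms each pair involves the same-sized change, so a strict
# probability-weighted analysis predicts the same reaction to both members of
# a pair. The possibility and certainty effects predict otherwise.
```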

What are your intuitions about these cases? Is the difference between a 0% and 5% risk larger than the difference between 5% and 10%? Is the difference between 90% and 93% smaller than the difference between 97% and 100%? e. Demonstration 12.10: Framing Questions  What are the factors that influence your decisions? What are the factors that should influence your decisions? Evidence suggests that the "frame" of a decision plays an important role. Should it?

Recruit four friends for this demonstration. Ask two of them this question:

Imagine that you are part of a team, working for a medium-size company, trying to decide how to invest $10 million. You have just learned about a new stock market fund that has, in the last five years, outperformed 75% of its competitors.

What percentage of the $10 million would you want to invest in this stock market fund?

Ask two other friends this question:

Imagine that you are part of a team, working for a medium-size company, trying to decide how to invest $10 million. You have just learned about a new stock market fund that has, in the last five years, been outperformed by 25% of its competitors.

What percentage of the $10 million would you want to invest in this stock market fund?

According to a straightforward economic analysis, the two versions of this question provide identical information-outperforming 75% of a group is, of course, the same as being outperformed by 25% of the group. According to the points in the textbook chapter, though, this difference in frame may change how people react to the questions. Can you predict which group will be more likely to invest heavily in the fund?

In making your prediction, bear in mind that physicians are more likely to recommend a new medication if they've been told that the treatment has a 50% success rate, rather than being told that it has a 50% failure rate. People are more likely to buy ground meat that is 90% fat-free rather than meat that is 10% fat. It seems likely, therefore, that your friends will invest more heavily in the fund in the first frame described here, rather than the second! e. Demonstration 12.11: Mental Accounting  In the textbook chapter, we consider evidence that people seek reasons when making a decision, and they select an option only when they see a good reason to make that choice. But how do people seek reasons, and what reasons do they find persuasive? We can get some insights into this problem by looking at various decisions that people make. For example, imagine that you're at an electronics store and about to purchase a pair of headphones for $180 and a calculator for $20. Your friend mentions, though, that the same calculator is on sale for $10 at a different store 20 minutes away.

Would you make the trip to the other store? Think about it for a moment. Poll a few friends to find out if they decide the same way.

Now, imagine a different scenario. You're at an electronics store and about to purchase a pair of headphones for $20 and a calculator for $180. Your friend mentions that the same calculator is on sale for $170 at a different store 20 minutes away.

In this case, would you make the trip to the other store? Think about it and decide, and again poll a few friends. Most people would go to the other store in the first scenario, but they would not go in the second scenario. Of course, in either of these scenarios, your total purchase will be $200 if you buy the two items at the first store, and $190 if you buy both items at the second store. In both cases, therefore, the decision depends on whether you think a savings of $10 is enough to justify a 20-minute trip. Even so, people react to the two problems differently, as if they were dividing their purchases into different "accounts." If the $10 savings comes from the smaller account, then it seems like a great deal (the calculator is 50% cheaper!). If the savings comes from the more expensive account, it seems much less persuasive (merely a 5% savings on that item).
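Here is a minimal sketch (our own illustration, in Python, using the prices given above) of the arithmetic behind the two scenarios: the totals are identical, and only the "account" the $10 savings is drawn from differs.

```python
# Compare the two shopping scenarios in terms of totals and of the savings
# expressed as a percentage of the discounted item's price.

def summarize(headphones, calculator_here, calculator_there):
    total_here  = headphones + calculator_here
    total_there = headphones + calculator_there
    savings     = total_here - total_there
    pct_of_item = 100 * (calculator_here - calculator_there) / calculator_here
    return total_here, total_there, savings, round(pct_of_item, 1)

print(summarize(180, 20, 10))    # (200, 190, 10, 50.0)  -- $10 off a $20 item
print(summarize(20, 180, 170))   # (200, 190, 10, 5.6)   -- $10 off a $180 item
# Either way, you spend $200 at the first store or $190 after the trip; the
# decision "feels" different only because of how the savings is mentally booked.
```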

It seems clear, then, that our theories of decision making must include principles of "mental accounting"-principles that will describe how you separate your gains and losses, your income and your expenses, into separate "budget categories." These principles are likely to be complex, and, in truth, it's not at all obvious why, in our example, people seem to regard the calculator and the headphone purchases as separate (rather than, say, just thinking about them under the broader label "supplies"). But the fact is that people do think of these purchases as separate, and this influences their decision making. Therefore, complicated or not, principles of mental accounting must become part of our overall theorizing. e. Demonstration 12.12: Seeking Reasons

As you can see, we've offered a long list of demonstrations for this chapter-because the phenomena in play here involve the sort of real-world thinking that we all do all the time. Therefore, it's easy to find demonstrations that link the research in this domain to our everyday experience! Here's one last demonstration. For this one, you'll need to recruit some friends.

Ask your friends to imagine they're shopping for a specialized dictionary, one that covers the technical vocabulary in a field that they're especially interested in. Ask them to imagine that they're in a used-book store, and they find:

· For some of your friends: They find a copy of the highly respected Brown Dictionary. The copy they discover was published in 1993 and has 10,000 entries. Its condition is like new. How much would they be willing to spend for the dictionary?

· For other friends: They find a copy of the highly respected White Dictionary. The copy they discover was published in 1993 and has 20,000 entries. Its cover is torn, but it is otherwise like new. How much would they be willing to spend for the dictionary?

· For a third group of friends: They find a store that offers both the highly respected Brown Dictionary and also the highly respected White Dictionary. The copy of the Brown Dictionary that they find was published in 1993 and has 10,000 entries. Its condition is like new. The copy of the White Dictionary that they find was also published in 1993 but has 20,000 entries. Its cover is torn, but it is otherwise like new. Which copy seems more attractive, and therefore worth a higher price?

Odds are good that the first group of friends will offer a higher price than the second group of friends. If each dictionary is evaluated on its own, the Brown Dictionary looks better. But odds are good that people in the third group, evaluating the two dictionaries side by side, will prefer the White Dictionary! Is this the pattern of your data? If so, can you explain this in terms of reason-based choice? COGNITIVE PSYCHOLOGY AND EDUCATION  making people smarter  This chapter documents the many errors people make in judgment, but it also offers encouragement: We can take certain steps to improve our judgments. Some of those steps involve changes in the decision-making environment-so that we can, for example, ensure that the evidence we consider has been converted to frequencies (e.g., "4 cases out of 100") rather than percentages ("4%") or proportions (".04"). This simple step, it seems, is enough to make judgments more accurate and to increase the likelihood that people will consider base rates when drawing conclusions.

Other steps, in contrast, involve education. As the chapter mentions, training students in statistics seems to improve their ability to think about evidence-including evidence that's obviously quantitative (e.g., a baseball player's batting average or someone's exam scores) and also evidence that's not, at first appearance, quantitative (e.g., thinking about how to interpret a dancer's audition or someone's job interview). The benefits of statistics training are large, with some studies showing error rates in subsequent reasoning essentially cut in half. The key element in statistical training, however, is probably not the mathematics per se. It is surely valuable to know the derivation of statistical equations or the procedure for using a statistics software package. For improvement of everyday judgment, however, the key involves the perspective that a statistics course encourages. This perspective helps you realize that certain observations (e.g., an audition or an interview) can be thought of as a sample of evidence, drawn from a larger pool of observations that you could potentially have made. The perspective also alerts you to the fact that a sample may not be representative of a broader population and that larger samples are more likely to be representative. For purposes of the statistics course itself, these are simple points; but being alert to these points can have striking and widespread consequences in your thinking about issues separate from the topics covered in the statistics class.

In fact, once we cast things in this way, it becomes clear that other forms of education can also have the same benefit. Many courses in psychology, for example, include coverage of methodological issues. These courses can highlight the fact that a single observation is just a sample and that a small sample sometimes cannot be trusted. These courses also cover topics that might reveal (and warn you against) confirmation bias or caution against the dangers of informally collected evidence. On this basis, it seems likely that other courses (not just statistics classes) can actually improve your everyday thinking-and, in fact, several studies confirm this optimistic conclusion. Ironically, though, courses in the "hard sciences"-such as chemistry and physics-may not have these benefits. These courses are obviously of immense value for their own sake and will provide you with impressive and sophisticated skills. However, these courses may do little to improve your day-to-day reasoning. Why not? These courses do emphasize the process of testing hypotheses through the collection of evidence, as well as quantitative analysis of the evidence. But bear in mind that the data in, say, a chemistry course involve relatively homogeneous sets of observations: The weight of one carbon atom is the same as the weight of other carbon atoms; the temperature at which water boils (at a particular altitude) is the same on Tuesday as it is on Thursday. As a result, issues of variability in the data are less prominent in chemistry than in, say, psychology. (Compare how much people differ from one another to how much benzene molecules differ from one another.) This is a great strength for chemistry; it's one of the many reasons why chemistry has become such a sophisticated science. But this point means that chemists have to worry less than psychologists do about the variability within their sample, or whether their sample is of adequate size to compensate for the variability. As a result, chemistry courses often provide little practice in thinking about issues that are crucial when confronting the far messier data provided by day-to-day life.

In the same way, cause-and-effect sequences are often more straightforward in the "hard sciences" than they are in daily life: If a rock falls onto a surface, the impact depends simply on the mass of the rock and its velocity at the moment of collision. We don't need to ask what mood the rock was in, whether the surface was expecting the rock, or whether the rock was acting peculiarly on this occasion because it knew we were watching its behavior. But these latter factors are the sort of concerns that routinely crop up in the "messy" sciences-and in daily life. So here, too, the hard sciences gain enormous power from the "clean" nature of their data but, by the same token, don't provide practice in the skills of reasoning about these complications. Which courses, therefore, should you take? Again, courses in chemistry and physics (and biology and mathematics) are important and will teach you sophisticated methods and fascinating content. They will provide you with skills that you probably can't gain in any other setting. But, for purposes of improving your day-to-day reasoning, you probably want to seek out courses that involve the testing of hypotheses through quantitative evaluation of messy data. These courses will include many of the offerings of your school's psychology department, and probably some of the offerings in sociology, anthropology, political science, and economics. These are the courses that may genuinely make you a better, more critical thinker about the evidence you're likely to encounter in your daily existence. COGNITIVE PSYCHOLOGY AND THE LAW  confirmation bias in police investigation  We're well served when people who are guilty of crimes confess what they've done. Their confessions ensure an efficient prosecution, helping us to achieve justice and promote public safety. But there's another side to this story, because sometimes it's innocent people who confess, and the power of confession evidence makes it likely that the innocent party will end up in prison and the actual perpetrator will escape punishment.

How often do false confessions occur? No one really knows, but one insight comes from the cases (mentioned in Chapter 8) of people who have been convicted in U.S. courts but then exonerated, years later, when DNA evidence finally showed that they weren't guilty after all.

Scrutiny of these cases indicates that roughly 25% of the exonerees had offered confessions-and since the DNA evidence tells us they weren't guilty, we know these confessions were false.

Police understand these issues and certainly don't assume that every confession is truthful. Instead, they seek further evidence to corroborate (or, in some cases, undermine) a confession. The problem, though, is that police officers are human, and so they're vulnerable to the cognitive errors and illusions described in the chapter, including the pattern of confirmation bias. As a result, the collection of evidence intended to check on a confession can be biased by the confession itself, so that in some cases bad evidence (a false confession) leads to more bad evidence. Kassin, Bogart, and Kerner (2012) examined all of the DNA exoneration cases that included a confession (and again, the DNA evidence confirms that these confessions were false). Over and over, the researchers found that these cases tended to contain other errors as well-invalid forensic evidence, mistaken eyewitness identifications, and false testimony from snitches or informants. Each of these types of errors was more common in the false-confession cases than in cases without confessions. Worse still, the police records from these cases indicate that the confessions were obtained early in the investigation, before the other errors occurred. This pattern strongly suggests that the false confessions may have encouraged the other errors.

What's going on here? The confession persuaded the police that the suspect was likely to be guilty. At that point, confirmation bias entered the scene, and this bias influenced what further evidence the police found and how they interpreted it. Confirmation bias also influenced other people involved in the case, undermining the quality of their evidence.

As one example of how these effects emerge, Dror and Charlton (2006) presented six experienced fingerprint experts with pairs of fingerprints (one print from a crime scene, one from the suspect) and asked the experts whether the fingerprints in each pair were a "match." In some cases, the fingerprint experts were told that the suspect had already confessed. This contextual information shouldn't have influenced the experts' judgment (which should have depended only on the fingerprints themselves). Nonetheless, the experts-carefully trained professionals-were more likely to "perceive" a match after they'd learned about the confession. In fact, they were likely to "perceive" a match even though they'd earlier seen the same fingerprints and decided the prints weren't a match. Similar patterns have been observed with other forms of evidence. In a study by Hasel and Kassin (2009), witnesses were asked to identify a thief in a lineup. Once the witnesses had made their selection, they were sent home. Two days later, the witnesses were told that someone else in the lineup (not the person the witnesses had selected) had actually confessed. With this new information, almost two thirds of the witnesses abandoned their first selection and chose the confessor. ("Now that I think about it, I'm sure it was number 4, and not the person I chose two days ago.")

What should we do with these results? At the least, it seems important to get these findings into the view of the police, in the hope that they can somehow guard against this type of bias. It's also important to get these findings into a jury's view, with the goal of helping jurors to interpret confession evidence. More broadly, these results provide a compelling indication of the power of confirmation bias-which can even influence trained professionals and can certainly influence people making highly consequential judgments. COGNITIVE PSYCHOLOGY AND THE LAW  pretrial publicity  In many trials, potential jurors have been exposed to media coverage of the crime prior to the trial's start-perhaps in the news, perhaps on social media. This pretrial publicity can have many effects. One concern is the pattern called "belief bias." In the lab, this term refers to a tendency to consider an argument to be "more logical" if it leads to a conclusion that the person believed to begin with. In the courtroom, this bias could translate into a juror's evaluating the trial arguments not on their own terms, but in terms of whether the arguments led to the conclusion the juror would have endorsed (based on the media coverage) at the trial's start. This influence of information from outside the courthouse is contrary to the rules of a trial, but jurors may be unable to resist this powerful effect. As a related concern, consider confirmation bias. As the chapter discusses, this bias takes many forms, including a tendency to accept evidence favoring your views at face value but to subject evidence challenging your views to special scrutiny, seeking flaws or weaknesses in these unwelcome facts. This tendency can easily be documented in trials. In one study, participants were first exposed to a newspaper article that created a bias about a particular murder trial. These research-participant "jurors" were then presented with the trial evidence and had to evaluate how persuasive each bit of evidence was. The results showed a clear effect of the newspaper article: Evidence consistent with the (biased) pretrial publicity was seen as more compelling; evidence inconsistent with the publicity was seen as less compelling. And this effect fed on itself. Each bit of evidence that the "jurors" heard was filtered through their confirmation bias, so the evidence seemed particularly persuasive if it favored the view they held already. This led them to be more confident that their view was, in fact, correct. (After all, the evidence- as they interpreted it-did seem to favor that view.) This now-stronger view, in turn, amplified the confirmation bias, which colored how the "jurors" interpreted the next bit of evidence. Around and around we go-with confirmation bias coloring how the evidence is interpreted, which strengthens the belief held by the "jurors," which creates more confirmation bias, which colors how later evidence is interpreted, which further strengthens the belief.

In this study, the pretrial publicity had a powerful effect on the "jury's" verdict, but we need to be clear that the publicity didn't influence the verdict directly. In fact, the odds are good that the "jurors" weren't thinking of the publicity at all when they voted "guilty" or "not guilty." Instead, their verdicts were based (as all real jurors' verdicts should be) on their evaluation of the trial evidence. The problem, though, is that this evaluation was itself powerfully shaped by the pretrial publicity, via the mechanisms we've just described. In light of these results, we might worry that the courts' protections against juror bias may not be adequate. In some trials, for example, jurors are merely asked: "Can you set aside any personal beliefs or knowledge you have obtained outside the court and decide this case solely on the evidence you hear from the witness stand?" Such questions seem to be a thin shield against juror prejudice. As one concern, jurors might not know whether they'll be able to set aside their prejudices. They might not realize that they're vulnerable to belief bias or confirmation bias, and so they might overestimate their ability to overcome these effects. As a related point, jurors might be determined to vote, in the jury room, based only on what they heard during the trial. As we have now seen, though, that's no protection at all. In the study we described, the "jury's" ultimate decision was based on the evidence, but that doesn't change the fact that the decision was based on the evidence as viewed through the lens provided by pretrial publicity.

Belief bias and confirmation bias are powerful effects that often work in a way that is completely unconscious. This strongly suggests that the courts need to seek more potent means of avoiding these influences in order to ensure each defendant a fair and unbiased trial. Possible solutions include stricter screening of jurors and procedures that would make it easier to change a trial's location. In any case, it seems clear that stronger precautions are needed than those currently in place.