Critical Analytical Thinking
Part II: Heuristics and Biases
Dr. Abdelghani Es-Sajjade (aessajjade@uod.edu.sa)
Overview
The law of small numbers
Cause and chance
Anchors
Availability heuristic
The public and the experts
Representativeness
Causal stereotypes
Regression to the mean
A two-systems view of regression
The law of small numbers
Observations
The counties in which the incidence of kidney cancer is lowest are mostly rural, sparsely populated, and located in the Midwest, the South, and the West
Why? The clean living of the rural lifestyle. No air pollution, no water pollution, fresh food without additives.
Observations
The counties in which the incidence of kidney cancer is highest are mostly rural, sparsely populated, and located in the Midwest, the South, and the West
Why? Poverty of rural lifestyle—no access to good medical care, too much alcohol, too much tobacco.
Our mind & statistics
Explanation has nothing to do with rural life
System 1 excels in one form of thinking: it automatically and effortlessly establishes causal connections between events…
even when supporting data is minimal or totally absent
We are insensitive to sample size or reliability of data.
Sample of 150 or 3000, who cares?
Why? WYSIATI (what you see is all there is), and System 1 is gullible.
Our mind & statistics
We know about sample size!
But often can’t help ourselves.
Did you initially notice “sparsely populated”?
What is the difference?
Large samples are more precise than small samples.
Small samples yield extreme results more often than large samples do.
Hence: small counties, fewer people, so …?
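The claim that small samples yield extreme results more often can be checked with a short simulation. This is only a sketch: the true rate of 10%, the tolerance of 5 percentage points, the sample sizes and the function name `extreme_share` are all assumptions chosen for illustration, not figures from the kidney cancer data.

```python
import random

def extreme_share(sample_size, true_rate=0.10, tolerance=0.05, trials=500, seed=0):
    """Share of simulated samples whose observed rate misses true_rate by more than tolerance."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    extreme = 0
    for _ in range(trials):
        # Draw one sample: every individual has the same underlying rate.
        hits = sum(rng.random() < true_rate for _ in range(sample_size))
        if abs(hits / sample_size - true_rate) > tolerance:
            extreme += 1
    return extreme / trials

small = extreme_share(sample_size=50)     # stands in for a sparsely populated county
large = extreme_share(sample_size=2000)   # stands in for a densely populated county
print(f"extreme results with n=50:   {small:.1%}")
print(f"extreme results with n=2000: {large:.1%}")
```

Every simulated county has exactly the same underlying rate, so any very high or very low observed rate is produced by sample size alone, not by lifestyle.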
Certainty & doubt
Our mind has a preference for sliding into certainty over maintaining doubt
System 1: rich image with poor evidence
Even in science:
Small sample experiment, complex phenomenon.
Exercise 1
Cause & Chance
We have an inclination to causal thinking
Statistics is different because it focuses on what could have happened instead
The null-hypothesis
Randomness sometimes appears as a pattern
Hot hand: 3 or 4 scores in a row
Basketball hot hand: a player who scores 3 or 4 times in a row is believed to be "hot", so teammates pass to him more often and defenders guard him more closely. Research: the sequence of successful and missed shots satisfies all the tests of randomness. The hot hand is in the eye of the beholder: a massive and widespread cognitive illusion.
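How often does pure chance produce a "hot hand"? The sketch below simulates games of independent shots and counts how many contain a streak of at least 4 hits. The 50% hit rate, 20 shots per game and the function names are assumptions made for illustration only.

```python
import random

def longest_streak(shots):
    """Length of the longest run of consecutive hits (True values)."""
    best = current = 0
    for hit in shots:
        current = current + 1 if hit else 0
        best = max(best, current)
    return best

def share_with_streak(min_streak=4, shots_per_game=20, hit_rate=0.5, games=2000, seed=1):
    """Share of purely random games that contain a streak of at least min_streak hits."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    count = 0
    for _ in range(games):
        shots = [rng.random() < hit_rate for _ in range(shots_per_game)]
        if longest_streak(shots) >= min_streak:
            count += 1
    return count / games

print(f"random games with a streak of 4 or more: {share_with_streak():.0%}")
```

Under these assumed parameters, close to half of the purely random games contain such a streak, even though no shot depends on the previous one.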
Speaking of the Law of Small Numbers
“Yes, the studio has had three successful films since the new CEO took over. But it is too early to declare he has a hot hand.”
“The sample of observations is too small to make any inferences. Let’s not follow the law of small numbers.”
“I plan to keep the results of the experiment secret until we have a sufficiently large sample. Otherwise we will face pressure to reach a conclusion prematurely.”
Anchors
Anchoring effect: occurs when people consider a particular value for an unknown quantity before estimating that quantity. The estimate stays close to the value considered.
Question: was Ibn Taymiyyah younger or older than 114 years old when he passed away?
What is the anchor? 114 years old.
Your answer will be higher than it would have been with an anchor of 35.
Asking price: 1,500,000
How much should you pay for a house? You will be influenced by its asking price.
Anchoring effect
Produced by 2 processes:
Deliberate process of adjustment: System 2.
Automatic priming effect: System 1.
Anchoring as adjustment
Process 1: effortful operation, often insufficient
You stop at the near edge of uncertainty…
…when you are no longer sure you should continue
People adjust less when there’s depletion of mental resources
They stay closer to the anchor
Anchoring as priming
EXPERIMENT (Mussweiler and Strack)
Is the average temperature in Germany higher or lower than 20 degrees?
Is the average temperature in Germany higher or lower than 5 degrees?
All participants were then shown distorted words that they had to identify.
Results:
Those who were asked about 20 degrees had less difficulty identifying summer words (sun, beach).
Those asked about 5 degrees had less difficulty identifying winter words (frost, ski).
Conclusion: an anchor selectively activates information that is compatible with it.
The Anchoring Index
Is the height of the tallest redwood more or less than 1,200 feet? (A second group was asked the same question with 180 feet.)
Difference between anchors = 1020 feet
Difference in means between two groups of participants was 562 feet
Anchoring index = 562/1020 = 55%
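The anchoring index is a simple ratio, sketched below with the redwood numbers from this slide. The function name and parameter names are illustrative assumptions.

```python
def anchoring_index(difference_in_estimates, difference_in_anchors):
    """Ratio of the gap between the groups' mean estimates to the gap between their anchors.

    0 means the anchor was ignored; 1 (100%) means the anchor was adopted as the estimate.
    """
    return difference_in_estimates / difference_in_anchors

high_anchor, low_anchor = 1200, 180   # feet, from the redwood question
index = anchoring_index(562, high_anchor - low_anchor)
print(f"anchoring index: {index:.0%}")
```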
EXPERIMENT
Real estate agents were asked to visit a property and study an extensive information booklet that included an asking price.
2 groups: the booklet listed either a lower or a higher asking price.
What is the value of this property?
Anchoring index = 41%
How did you come up with that value?
The agents insisted, with pride in their expertise, that the asking price had not influenced them.
Anchoring index for business students with no real estate expertise, who admitted to being influenced by the anchor: 48%
Conclusion: anchors also/even influence experts
Exercise 2
USES AND ABUSES OF ANCHORS
We are much more receptive to suggestions than we think…
…and there are people and organizations who know this and exploit this.
EXPERIMENT
Supermarket in Sioux City, Iowa.
Promotion! Campbell's Soup.
Some days “Limit 12”
Other days no limit.
Average number of cans purchased: 7 with the limit versus 4 without.
Dealing with anchors
Negotiating a price over purchase of a home
The same holds in a shop. Advice: walk out if the amount is too much, and don't negotiate from the proposed anchor.
Galinsky and Mussweiler: use System 2.
Focus on the minimal (but fair) offer he would accept
Focus on the costs of the seller failing to reach an agreement. What does he have to lose?
Anchoring and the 2 systems
Anchoring, judgement and choice were long thought of as System 2 activities.
But System 2 works on data retrieved from memory, and retrieval is an automatic System 1 process.
System 2 is therefore subject to the biasing influence of anchors, which make some information easier to retrieve than other information, and it has no control over this.
That is why people deny the effect: “I cannot have been influenced by such absurd information.”
Some lessons on anchoring
Main lesson of priming research: our thoughts and behavior are influenced by the environment of the moment much more than we want or know
Many people find the results uncomfortable…
…because they threaten our sense of agency, autonomy, expertise and professional pride.
Any number on the table has an anchoring effect on you.
When the stakes are high, mobilize system 2 to counter the effect.
Speaking about anchoring
“The firm we want to acquire sent us their business plan, with the revenue they expect. We shouldn’t let that number influence our thinking. Set it aside.”
“Plans are best-case scenarios. Let’s avoid anchoring on plans when we forecast actual outcomes. Thinking about ways the plan could go wrong is one way to do it.”
The science of availability
What is the definition of a heuristic again?
Availability heuristic.
What do people do when they're asked about the frequency of an event?
What is the percentage of people getting divorced after 40?
How many Cubans are there in KSA?
Instances of the category are retrieved from memory.
If retrieval is easy, the frequency is judged to be high; if not, low.
Heuristic: a simple procedure that helps find an adequate, though imperfect, answer to a difficult question. A mental shortcut.
Exercise 3
Availability heuristic
Availability heuristic substitutes one question for another.
Frequency versus ease with which impressions come to the mind.
Problematic! Bias! Why?
Other factors that influence ease of retrieval:
Media attention: salient events related to celebrities or politicians. E.g. you may exaggerate the number of Hollywood drug addicts.
Dramatic event: a plane crash with extensive coverage changes your perception of safety.
Personal experiences: more powerful than what happens to others, statistics or mere words.
Resisting bias
Resisting this large collection of biases is onerous.
Makes you tired.
The chance to avoid a costly mistake is worth the effort.
Awareness of biases contributes to peace in marriages
and probably in other collective projects.
Experiment
Spouses were asked:
How large was your personal contribution to keeping the place tidy, in percentages? The question was split into different tasks, e.g. taking out the trash.
The two spouses' answers should add up to 100%.
The self-estimated contributions added up to more than 100%.
Both spouses remembered their own contributions better than those of the other.
THE PSYCHOLOGY OF AVAILABILITY Experiment (Norbert Schwarz)
Research question: how will people's impression of a frequency be affected if they have to list a specific number of instances?
First: list six instances in which you behaved assertively.
Next: evaluate how assertive you are.
Another group was asked to list 12 instances.
Would you think you were more or less assertive?
Schwarz thought that this impression of your own assertiveness would be affected by:
The number of instances retrieved
Or the ease with which they come to mind
Results: people who had listed only 6 rated themselves as more assertive than those who had struggled to come up with 12.
Counterintuitive because 12 is greater than 6.
The first few instances come easily; beyond 6 it becomes more difficult: "If I'm having this much difficulty coming up with instances, I can't be very assertive."
Other experiments
People believe that they use their bicycles less often after recalling many rather than few instances.
People are less confident in a choice when they are asked to generate MORE arguments to back it.
People are less confident that an event could have been avoided after listing more ways it could have been avoided.
People are less impressed by a car after listing many of its advantages.
Experiment Student feedback
A professor at UCLA asked students for different ways to improve the course.
Two groups: one had to come up with a lower and one with a higher number of improvements.
The group of students who were asked to come up with a higher number of improvements rated the course higher.
Who can explain?
Circumstances producing bias
People are more strongly affected by ease of retrieval:
When they are engaged in another effortful task at the same time
When they are in a good mood because they just thought of a happy episode in their life
If they are knowledgeable novices on the topic of the task, in contrast to true experts
If they are (or are made to feel) powerful
“I don’t spend a lot of time taking polls around the world to tell me what I think is the right way to act. I’ve just got to know how I feel.”
George W. Bush, November 2002
Exercise 4
AVAILABILITY AND AFFECT
Affect heuristic: instead of asking "What do I think about it?" we ask "How do I feel about it?"
Affect heuristic simplifies our world and makes it more organized and consistent than reality.
The public and the experts
Experts see risk differently from the public.
Experts count the number of life-years lost, whereas the public makes more nuanced distinctions,
such as good and bad deaths, or deaths during voluntary activities versus murder.
Paul Slovic: the public's perception of risk is better and richer than the experts'!
The public and the experts (2)
Cass Sunstein: no, Slovic is wrong.
We need experts to prevent influence of populist movements.
Any risk policy should be measured in the number of life-years saved (with more weight given to the young) and its dollar cost to the economy.
Policy makers and government intervention should be LESS influenced by public opinion.
Availability cascades
Minor event, major coverage, more worries and fear, more coverage, large-scale government action.
Can you come up with an example of an availability cascade?
Implications
The mind is limited in dealing with minor risks: they are either ignored or given too much weight.
Parent waiting up for child to come home.
The parent knows there is not much to worry about, but the horrible news stories cannot be pushed out of mind.
The numerator "horror story" is given attention while the denominator "Instances actually occurred" is ignored.
This is the "probability neglect" effect.
Availability cascades DISTORT priorities in the spending of public resources!
Terrorism vs other causes of risk/threat
Public or experts?
What do you think?
Ignore public fear and go with the experts or…
…forget about the experts and resolve the issues important to the public?
Public or experts? (2)
Sunstein: experts who are independent from public influence should have the strongest voice in informing policy making.
Slovic: policies not supported by the public will be rejected. Not sustainable.
Speaking of availability
“Because of the coincidence of two planes crashing last month, she now prefers to take the train. That’s silly. The risk hasn’t really changed; it is an availability bias.”
“She has been watching too many spy movies recently, so she’s seeing conspiracies everywhere.”
“The CEO has had several successes in a row, so failure doesn’t come easily to his mind. The availability bias is making him overconfident.”
“She’s raving about an innovation that has large benefits and no costs. I suspect the affect heuristic.”
“This is an availability cascade: a non-event that is inflated by the media and the public until it fills our TV screens and becomes all anyone is talking about.”
Exercise 5
Base rates
You’ve used a base rate: “How many students of a particular specialization are there?” which leads to rank.
Base rate: the number of units in a particular category divided by the total number of units across all categories.
Exercise 6
Representativeness
Participants used the stereotype of Tom while ignoring base rates.
They also ignored whether the description is accurate and comes from a trustworthy source.
The same description was offered to another scientist, a statistician and colleague of the researchers, who responded "computer scientist!"
Same experiment done with 114 graduate psychology students who are aware of base rates and trustworthiness of information.
Same outcome: use stereotype, ignore base rates and quality of information.
Representativeness
Explanation:
Assessing probability using base rates is a difficult question. It is SUBSTITUTED by an easier question about similarity to a stereotype; this similarity is called representativeness.
Serious mistake in probability assessment: ignoring quality of information and base rates.
This mistake is a product of the representativeness heuristic.
Interesting book: "Moneyball", about professional baseball. Scouts traditionally judged the future success of players by their build or looks. The lead in this story is Billy Beane, general manager of the Oakland A's, who bravely overruled his scouts and instead selected players on their past performance statistics. Result: good players at low cost (players rejected by other teams because of an unfitting build or look) and, eventually, excellent results.
THE SINS OF REPRESENTATIVENESS
You see someone on the New York subway reading the New York Times. Which of the following is a better guess?
She has a PhD.
She does not have a college degree.
PhD? Not wise: there are many more college dropouts and people who never started college on the subway than people holding a PhD.
THE SINS OF REPRESENTATIVENESS (2)
The second sin of representativeness is ignoring the quality of information.
Tom W's info should have been ignored, particularly when participants were told the information is not trustworthy.
Your System 1 cannot help but process the information, because of associative coherence, and produce the story.
Unless the information is immediately rejected (e.g. this comes from a liar), your system 1 will start working with it.
How to solve Tom W.
People who frowned did much better. Who can explain why?
Solution to Tom W.:
Stay very close to your initial estimates.
Reduce slightly the value of the highly populated fields (humanities and education, social science)
Increase slightly the value of the sparsely populated fields (library science, computer science).
Your estimates will still not be what they would have been had you never read Tom W's description, but the idea is to make a solid effort to ignore the information and work with base rates.
Bayesian reasoning
You need to discipline your intuition.
18th century English Minister Thomas Bayes:
How should people change their beliefs in light of evidence?
Mathematical details are not important for this course but remember two rules about Bayesian reasoning:
Base rates matter, even if information about a particular case is offered.
Intuitive beliefs about the accuracy of descriptions are often exaggerated.
WYSIATI and associative coherence stimulate us to believe in the stories we spin for ourselves.
Author’s note: implementing these rules does not come naturally and requires effort. I was shocked when I realised I was never taught how to implement them.
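Although the mathematical details are outside the scope of the course, the two rules can be seen at work in a few lines. The sketch below uses the odds form of Bayes' rule; the function name, the 3% base rate and the likelihood ratio of 4 are hypothetical values chosen for illustration, not results from the Tom W. study.

```python
def bayes_update(base_rate, likelihood_ratio):
    """Posterior probability from a base rate and the strength of the evidence.

    likelihood_ratio: how many times more likely the description is for a member
    of the category than for a non-member (1 means the evidence is uninformative).
    """
    prior_odds = base_rate / (1 - base_rate)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

# Hypothetical: 3% of students are in the field, and the description is judged
# 4 times more likely for a student in that field than for any other student.
posterior = bayes_update(base_rate=0.03, likelihood_ratio=4)
print(f"posterior probability: {posterior:.0%}")
```

Even with evidence assumed to be fairly diagnostic, the answer stays close to the low base rate, and with a likelihood ratio of 1 the function simply returns the base rate.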
Speaking of Representativeness
“The lawn is well trimmed, the receptionist looks competent, and the furniture is attractive, but this doesn’t mean it is a well-managed company. I hope the board does not go by representativeness.”
“This start-up looks as if it could not fail, but the base rate of success in the industry is extremely low. How do we know this case is different?”
Exercise 7
Linda: less is more
The critical items in the list:
Does Linda look more like a bank teller?
Or more like a bank teller who is active in the environmentalist movement?
Everyone agrees that Linda fits the idea of an “environmentalist bank teller” better than she fits the stereotype of bank tellers.
Even when scenarios are listed sequentially.
Logic vs representativeness
Same problem offered to doctoral students in the decision-science program of the Stanford Graduate School of Business, all of whom had taken several advanced courses in probability, statistics, and decision theory.
85% of these respondents also ranked “environmentalist bank teller” as more likely than “bank teller.”
Logic was again beaten by representativeness.
The word fallacy is used, in general, when people fail to apply a logical rule that is obviously relevant.
Linda problem is a conjunction fallacy.
Amos and I introduced the idea of a conjunction fallacy, which people commit when they judge a conjunction of two events (here, bank teller and environmentalist) to be more probable than one of the events (bank teller) in a direct comparison.
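The logical rule behind the conjunction fallacy can be checked by counting. The sketch below builds a made-up population; the 2% and 30% rates are arbitrary assumptions, and the rule holds whatever values are chosen.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: each person is (is_bank_teller, is_environmentalist).
population = [(rng.random() < 0.02, rng.random() < 0.30) for _ in range(10_000)]

tellers = sum(1 for teller, _ in population if teller)
green_tellers = sum(1 for teller, green in population if teller and green)

p_teller = tellers / len(population)
p_green_teller = green_tellers / len(population)

# Every environmentalist bank teller is also a bank teller,
# so the conjunction can never be the more probable of the two.
print(f"P(bank teller)                       = {p_teller:.3f}")
print(f"P(bank teller and environmentalist)  = {p_green_teller:.3f}")
```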
Implications
The most coherent stories are not always the most probable, but they are plausible, and ideas of coherence, plausibility, and probability are easily confused by the incautious.
Adding detail to scenarios makes them more persuasive, but less likely to come true.
Mark has hair.
Mark has blond hair
Conclusion: In the absence of a competing intuition, logic prevails.
Implications (2)
The less-is-more pattern is bizarre.
In all these cases, the conjunction seemed plausible (not probable), which was enough for an endorsement by System 2.
Again: lazy system 2.
Representativeness can hinder the application of an obvious logical rule.
Speaking of less is more
“They constructed a very complicated scenario and insisted on calling it highly probable. It is not—it is only a plausible story.”
“They added a cheap gift to the expensive product, and made the whole deal less attractive. Less is more in this case.”
“In most situations, a direct comparison makes people more careful and more logical. But not always. Sometimes intuition beats logic even when the correct answer stares you in the face.”
Experiment (Nisbett and Borgida)
Helping experiment conducted at New York University: participants sat in individual booths and spoke over an intercom about their personal lives. They talked in turn for about 2 minutes each. Only 1 microphone was active at any one time.
6 participants in every round, 1 of whom was a stooge. The stooge spoke first and mentioned, with embarrassment, that he sometimes had seizures. The microphone then switched automatically to the next speaker.
When the stooge's turn came again, he became distressed, said he was having a seizure, and asked for help in a disturbing way. Last words: “I'm gonna dieeee”. Then the microphone of the next participant became active and nothing more was heard from the possibly dying individual.
Results of NYU experiment:
Only 4 of the 15 participants (the experiment ran 3 times) responded immediately to help.
6 never came out of their booth.
5 came out only after they heard the stooge choking.
Experiment (2) (Nisbett and Borgida)
What would you do?
Conclusion of experiment: expectation is wrong. Most of us don't help when we expect others to help instead.
Nisbett and Borgida wanted to know: “Have our students changed their minds about human nature?”
They showed them interviews with people who had taken part in the New York experiment. The interviews were short: their hobbies, their leisure activities, and their future plans, which were entirely common.
After the interview, students were asked: how quickly did that person come out of his booth to help?
Experiment (3) (Nisbett and Borgida)
Solution should have involved Bayesian reasoning. Why?
Probabilities have changed after knowing results of experiment.
Without seeing the interview, what is the base rate? 4 out of 15 immediately helped. The probability for immediate help for unknown participant is 27%
Bayesian logic also requires rethinking the probability in light of new information. But the interviews were carefully designed to be UNINFORMATIVE, so the estimate should stay at the base rate.
What did students answer?
Nisbett and Borgida asked 2 groups of students to provide probabilities for immediate help: one group knew the results of the experiment and the other did not.
Results:
Group who did not: as expected, an optimistic view of human nature: these individuals would immediately help.
Group who knew: SAME ANSWER!!! As if knowing the results of the experiment had no effect at all.
Implications
Conclusion: students exempt themselves (and their friends and acquaintances) from the conclusions of experiments that surprise them.
The researchers however found one way to get students to appreciate the point of the helping experiment:
They are shown the interviews of the people and are told: they did NOT help. They are NOT told the results of the experiment. Then they're asked to produce the results. Results? The students' guesses were very accurate.
Implications: to teach students psychology they did not know before you need to surprise them. But how?
Statistical facts vs individual cases
Surprise using statistical fact did not produce a learning effect.
Surprise by individual cases - 2 nice people who had not helped - produced a learning effect:
Generalization that helping is more difficult than they thought.
Famous conclusion:
"Subjects’ unwillingness to deduce the particular from the general was matched only by their willingness to infer the general from the particular."
Deduction
p1: All ravens are black
p2: X is a raven
C: X is black
NB: new information cannot change the truth value of the conclusion anymore: i.e. the conclusion is given once the premises are given!
Induction
p1: Raven 1 is black;
p2: Raven 2 is black;
p3: Raven 3 is black;
…
pn: Raven N is black
Hence: All ravens are black?
NB: new information can change the truth value of the conclusion!
[Figure: The hypothetico-deductive method. Diagram labels: theory, observations, empirical laws, new hypotheses, (induction), (deduction), generalisation, theory development, testing hypothesis, developing new hypotheses.]
Claim: it is more effective to reward improved performance than to punish mistakes. One flight instructor: NO! When my cadets do well I compliment them and their performance declines; when they do poorly I punish them and their performance improves.
Regression to the mean
Statistics: you have an average performance. If you peak sometimes, you are likely to do less well on the next occasion and vice versa.
Statistical irony: we are punished when we're nice and rewarded when we're nasty as managers.
Who can explain the irony?
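One way to see the irony is to simulate cadets whose performance is a stable skill plus luck, with no praise or punishment at all. This is a sketch under assumed values: the number of cadets, the equal weight of skill and luck, and all variable names are illustrative.

```python
import random
from statistics import mean

rng = random.Random(7)   # fixed seed so the sketch is repeatable
n = 5000                 # hypothetical number of cadets

talent = [rng.gauss(0, 1) for _ in range(n)]        # stable skill
flight1 = [t + rng.gauss(0, 1) for t in talent]     # skill + luck on flight 1
flight2 = [t + rng.gauss(0, 1) for t in talent]     # same skill, fresh luck

# Pick the 500 worst and 500 best cadets on the first flight.
order = sorted(range(n), key=lambda i: flight1[i])
worst, best = order[:500], order[-500:]

best_then = mean([flight1[i] for i in best])
best_next = mean([flight2[i] for i in best])
worst_then = mean([flight1[i] for i in worst])
worst_next = mean([flight2[i] for i in worst])

print(f"best group:  {best_then:.2f} -> {best_next:.2f}")
print(f"worst group: {worst_then:.2f} -> {worst_next:.2f}")
```

The best group does worse on the next flight and the worst group does better, although nobody was praised or punished. An instructor who praises the best and punishes the worst would see the same pattern and be tempted to credit the feedback.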
Extreme scores
The more extreme the original score, the more regression we expect.
A worse or better next performance does not need a causal explanation
Although it may be there.
He was on the cover of a major sports magazine but this season his performance is awful.
Must have been overconfident or not handling the pressure properly.
Success = talent + hard work + fortune (pre-decreed)
وَتِلْكَ الْأَيَّامُ نُدَاوِلُهَا بَيْنَ النَّاسِ
(“And such days We alternate among the people.” Qur'an 3:140)
Regression effects
Regression effects are common, everywhere, but we do not recognize them as such.
No, Allaah does not want to punish you
It's not because you're a bad person
It's not because you're not good enough
It's not because you don't deserve it
It was simply meant to be that after your great performance you performed less well.
Or that after good health you fell ill. That is the way of life.
Our mind prefers causal stories over mere statistics; i.e. events that simply had to happen for no reason that is connected to your qualities.
Correlation
The correlation coefficient between two measures, which varies between -1 and 1 (between 0 and 1 in the examples here), is a measure of the relative weight of the factors they share.
For example, we all share half our genes with each of our parents, and for traits in which environmental factors have relatively little influence, such as height, the correlation between parent and child is not far from .50
Correlation and regression are not two concepts—they are different perspectives on the same concept.
Whenever the correlation between two scores is imperfect, there will be regression to the mean
People don't like statistical inferences, they want causal stories.
A business commentator who says on air that the business did better this year because it performed extremely poorly last year won't remain on air for very long.
Correlation and causation
Very depressed children who are given an energy drink improve over a 3-month period.
Correlation or causation?
When the rooster crows, the sun rises.
Correlation or causation?
Thanks for letting me know!
Exercise 8
Solution
Adding 10% to each store is wrong. You should add more to the low-performing stores and less to the high-performing stores.
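A regressive forecast along these lines can be sketched as follows. The sales figures, the `persistence` value of 0.9 (how much of a store's deviation from the average is assumed to carry over to next year) and the function name are all hypothetical; the exercise itself does not specify them.

```python
from statistics import mean

def regressive_forecasts(last_year_sales, overall_growth=0.10, persistence=0.9):
    """Forecast each store by first pulling it toward the average, then applying overall growth."""
    average = mean(last_year_sales)
    forecasts = []
    for sales in last_year_sales:
        # Only part of the gap to the average is assumed to persist (regression to the mean).
        regressed = average + persistence * (sales - average)
        forecasts.append(regressed * (1 + overall_growth))
    return forecasts

last_year = [11.0, 23.0, 18.0, 29.0]   # hypothetical sales per store, in millions
forecasts = regressive_forecasts(last_year)
growth = [f / s - 1 for f, s in zip(forecasts, last_year)]

for sales, forecast, g in zip(last_year, forecasts, growth):
    print(f"{sales:5.1f} -> {forecast:6.2f}  ({g:+.1%})")
```

Total forecast sales still grow by exactly 10%, but the weakest store is given more than 10% growth and the strongest store less.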
Speaking of Regression to Mediocrity
“She says experience has taught her that criticism is more effective than praise. What she doesn’t understand is that it’s all due to regression to the mean.”
“Perhaps his second interview was less impressive than the first because he was afraid of disappointing us, but more likely it was his first that was unusually good.”
“Our screening procedure is good but not perfect, so we should anticipate regression. We shouldn’t be surprised that the very best candidates often fail to meet our expectations.”
Taming intuitive predictions
Maryam is currently a senior in a state university. She read fluently when she was four years old. What is her grade point average (GPA)?
People quickly respond somewhere at 3.7 or 3.8.
Intuitive prediction
Several system 1 operations (not in this exact order):
A causal link is sought between the evidence (early reading) and the target (GPA). Both fit the story of academic talent, so a link is found. When a link is found: WYSIATI. The best possible story is produced.
This would not occur if the evidence were that she was good at weightlifting or flying her kite.
Evidence is evaluated against a norm. How special is a child who reads at 4?
Substitution: the percentile that her early reading puts her in, judged on little evidence, is used as the answer to which percentile she belongs to for GPA.
A better prediction tool
Start with an estimate of average GPA.
Step 1 retrieves the baseline, the prediction if you knew nothing about Maryam except that she was about to graduate.
Determine the GPA that matches your impression of the evidence.
Step 2 is your intuitive judgment based on the evidence.
Estimate the correlation between your evidence and GPA (evaluate quality of evidence)
Step 3 moves you from the baseline to your intuition but is controlled by the estimated correlation.
If the correlation is .30, move 30% from the average to the matching GPA.
Step 4: a prediction which is based on intuition but much more moderate.
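The four steps above amount to one line of arithmetic, sketched here. The baseline GPA of 3.0 is an assumed value for illustration; the intuitive 3.8 and the correlation of .30 are the figures used on the slides. The function name is hypothetical.

```python
def tamed_prediction(baseline, intuitive, correlation):
    """Move from the baseline toward the intuitive prediction in proportion to the correlation."""
    return baseline + correlation * (intuitive - baseline)

baseline_gpa = 3.0    # step 1: assumed average GPA
intuitive_gpa = 3.8   # step 2: GPA that matches the impression of the evidence
correlation = 0.30    # step 3: estimated correlation between early reading and GPA

prediction = tamed_prediction(baseline_gpa, intuitive_gpa, correlation)  # step 4
print(f"tamed prediction: {prediction:.2f}")
```

With a correlation of 0 the prediction stays at the baseline, and with a correlation of 1 it equals the intuitive prediction; anything in between gives a moderate estimate.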
Taming intuitive predictions (2)
When is the increased effort of the correction justified?
When you've got much to lose.
What about when you've got much to win? E.g. venture capitalists.
Risk of making a modest investment in a startup that fails vs. risk of missing the next Google
A two-systems view of regression
Extreme predictions and a willingness to predict rare events from weak evidence are both manifestations of System 1
So is overconfidence, which is produced by coherence.
Regression is also a problem for system 2.
Although a simple idea, it is hard to understand and share with others.
Many great statistics teachers fear the class in which they have to cover it and their students only get a vague idea of it.
We need to be trained to do this because matching predictions with evidence feels good, seems the right thing to do.
Even when we identify a regression (e.g. flight instructors) we use a causal explanation.
WARNING: your intuitions may render predictions that are too extreme and you may end up putting too much faith in them.
Speaking of Intuitive Predictions
“That start-up achieved an outstanding proof of concept, but we shouldn’t expect them to do as well in the future. They are still a long way from the market and there is a lot of room for regression.”
“Our intuitive prediction is very favorable, but it is probably too high. Let’s take into account the strength of our evidence and regress the prediction toward the mean.”
Group exercise: availability heuristic
Do the availability heuristic group exercise and prepare a brief presentation.