essay
A Rogue Economist
Explores the Hidden
Side of Everything
Steven D. Levitt and Stephen J. Dubner
Will iam Morrow An Imprint of Harp erCol lln sPublishers
FREAKONOMICS. Copyright© 2005 by Steven D. Levitt and Stephen J. Dubner. All rights reserved. Primed in .the United States of America. No part of this book may be used or reproduced in any manner whatsoever without written permission except in the case of brief quocu:ions embodied in critical articles and reviews. For information address HarperCollins Publishers Inc., 10 East 53rd Street, New York, NY 10022.
HarperCollins books may be purchased for educational, business, or sales promotional use. For information please write: Special Markets Department, HarperCollins Publishers Inc., 10 East 53rd Street, New York, NY 10022.
FIRSfEDffiON
Daigned by Katy Riegel
Printed on acid-free paper
Library of Congress Cataloging-in-Publication Data
Levin, Steven D.
Freakonomics : a rogue economist explores the hidden side of everything I Steven D. Levitt and Stephen J. Dubner.-lst ed.
p. cm. Includes bibliographical references and index. ISBN 0-06-073132-X 1. Economics---Psychological aspects. 2. Economics-Sociological aspects. I. Dubner,
Stephen J. II. Title.
HB74.P8L479 2005 330-<lc22 2004065478
05 06 07 08 09 DIXfRRD 30 29 28 27 26 25 24
What Do Schoolteachers and Sumo Wrestlers Have in Common?
Im agine for a moment that you are the manager of a day-care center.
You have a clearly stated policy that children are supposed to be
pi ·ked up by 4 p.m. But very often parents are late. The result: at day's
nd , you have some anxious children and at least one teacher who
must wait around for the parents to arrive. What to do?
A pair of economists who heard of this dilemma-it turned out to
b- n rather common one-offered a solution: fine the tardy parents. Why, after all, should the day-care center take care of these kids for Ii· ?
' rhe economists decided to test their solution by conducting a
t11Jy of ten day-care centers in Haifa, Israel. The study lasted twenty W ·ks, but the fine was not introduced immediately. For the first
)lll" weeks, the economists simply kept track of the number of par-
i Its who came late; there were, on average, eight late pickups per
k per day-care center. In the fifth week, the fine was enacted. It I nnnounced that any parent arriving more than ten minutes late
AK ONOMI C
would pay$ p r hild fi r ea h incident. T he fee wo uld be added ro the parents' monthly bill, which was roughly $380.
After the fine was enacted, the number of late pickups promprl y
went ... up. Before long there were twenty late pickups per week,
more than double the original average. The incentive had plainl y
backfired.
Economics is, at root, the study of incentives: how people get wh:i 1
they want, or need, especially when other people want or need th e
same thing. Economists love incentives. They love to dream them up
and enact them, study them and tinker with them. The typical econ -
omist believes the world has not yet invented a problem that he can -
not fix if given a free hand to design the proper incentive scheme. His
solution may not always be pretty-it may involve coercion or exor-
bitant penalties or the violation of civil liberties-but the original
problem, rest assured, will be fixed. An incentive is a bullet, a lever, a
key: an often tiny object with astonishing power to change a situa-
tion.
We. all learn to respond to incentives, negative and positive, from
the outset of life. If you toddle over to the hot stove and touch it, you burn a finger. But if you bring home straight Ns from school, you get a new bike. If you are spotted picking your nose in class, you get ridiculed. But if you make the basketball team, you move up the so-
cial ladder. If you break curfew, you get grounded. But if you ace your SATs, you get to go to a good college. If you flunk out of law school, you have to go to work at your father's insurance company. But if you
perform so well that a rival company comes calling, you become a vice
president and no longer have to work for your father. If you become so excited about your new vice president job that you drive home at
eighty mph, you get pulled over by the police and fined $100. But if
you hit your sales projections and collect a year-end bonus, you not
only aren't worried about the $100 ticket but can also afford to buy
20
Schoolteachers and Sumo Wrestlers
rhat Viking range you've always wanted-and on which your toddler
an now burn her own finger.
An incentive is simply a means of urging people to do more of
a good thing and less of a bad thing. But most incentives don't
·ome about organically. Someone-an economist or a politician or a
parent-has to invent them. Your three-year-old eats all her vegeta-
bl es for a week? She wins a trip to the toy store. A big steelmaker belches too much smoke into the air? The company is fined for each
·ubic foot of pollutants over the legal limit. Too many Americans
aren't paying their share of income tax? It was the economist Milton I ;riedman who helped come up with a solution to this one: automatic
lax withholding from employees' paychecks.
T here are three basic flavors of incentive: economic, social, and
moral. Very often a single incentive scheme will include all three vari-
l ies. Think about the anti-smoking campaign of recent years. The
1 ldition of a $3-per-pack "sin tax" is a strong economic incentive
1ga inst buying cigarettes. The banning of cigarettes in restaurants and
ha rs is a powerful social incentive. And when the U.S. government as- ' 'l'ts that terrorists raise money by selling black-market cigarettes,
I ha t acts as a rather jarring moral incentive.
Some of the most compelling incentives yet invented have been
put in place to deter crime. Considering this fact, it might be worth-
whil e to take a familiar question-why is there so much crime in
1110dern society?-and stand it on its head: why isn't there a lot more
·rime ?
After all, every one of us regularly passes up opportunities to
111 nim , steal, and defraud. The chance of going to jail-thereby losing
o ur job, your house, and your freedom, all of which are essentially
·o no mic penalties-is certainly a strong incentive. But when it
•om s to crime, people also respond to moral incentives (they don't 11 H co do so mething they consider wrong) and social incentives
21
F' FH: AK ONOMI CS
(they don't want to be seen by oth rs as doing something wrong). Fo r
certain types of misbehavior, social incentives are terribly powerful. In
an echo of Hester Prynne's scarlet letter, many American cities now
fight prostitution with a "shaming" offensive, posting pictures of con-
victed johns (and prostitutes) on websites or on local-access televi-
sion. Which is a more horrifying deterrent: a $500 fine for soliciting a
prostitute or the thought of your friends and family ogling you on
www.HookersAndJohns.com.
So through a complicated, haphazard, and constantly readjusted
web of economic, social, and moral incentives, modern society does
its best to militate against crime. Some people would argue that we
don't do a very good job. But taking the long view, that is clearly not
true. Consider the historical trend in homicide (not including wars),
which is both the most reliably measured crime and the best barome-
ter of a society's overall crime rate. These statistics, compiled by the
criminologist Manuel Eisner, track the historical homicide levels in
five European regions.
HOMICIDES
(per 100,000 People)
NETHERLANDS GERMANY AND
ENGLAND AND BELGIUM SCANDINAVIA SWITZERLAND
13th and 14th c. 23.0 47.0 n.a. 37 .0
15th c. n.a. 45 .0 46.0 16.0
16th c . 7.0 25.0 21 .0 11 .0
17th c. 5.0 7.5 18.0 7.0
18th c . 1.5 5.5 1.9 7.5
19th c. 1.7 1.6 1.1 2.8
1900-1949 0.8 1.5 0.7 1.7
1950-1994 0.9 0.9 0.9 1.0
22
5
7
47
32 10
12
3,
Schoolteachers and Sumo Wrestlers
The steep decline of these numbers over the centuries suggests
rhat, for one of the gravest human concerns-getting murdered-the
incentives that we collectively cook up are working better and better.
So what was wrong with the incentive at the Israeli day-care cen-
rers?
You have probably already guessed that the $3 fine was simply too
$mall. For that price, a parent with one child could afford to be late
·very day and only pay an extra $60 each month-just one-sixth of
the base fee. As babysitting goes, that's pretty cheap. What if the fine
had been set at $100 instead of $3? That would have likely put an end
10 the late pickups, though it would have also engendered plenty of ill will. (Any incentive is inherently a trade-off; the trick is to balance the
xtremes.)
But there was another problem with the day-care center fine. It 'ubstituted an economic incentive (the $3 penalty) for a moral incen-
tive (the guilt that parents were supposed to feel when they came late).
llor just a few dollars each day, parents could buy off their guilt. Fur-
1 h ' rmore, the small size of the fine sent a signal to the parents that late
pi ·Imps weren't such a big problem. If the day-care center suffers only $3 worth of pain for each late pickup, why bother to cut short the ten-
11 is game? Indeed, when the econo~ists eliminated the $3 fine in the ·v ·nteenth week of their study, the number of late-arriving parents
llc.ln'r change. Now they could arrive late, pay no fine, and feel no
ll ii [. Such is the strange and powerful nature of incentives. A slight
IW ak can produce drastic and often unforeseen results. Thomas Jef- f r.~o n noted this while reflecting on the tiny incentive that led to the
llosto n Tea Party and, in turn, the American Revolution: "So in-
rurnb le is the arrangement of causes and consequences in this world
th IC a two-pe nny duty on tea, unjustly imposed in a sequestered part I' It, hanges the condition of all its inhabitants."
23
R AKONOMI CS
In the 1970s, researchers onducted a study that, I ike the Israeli day-care study, pitted a moral incentive against an economic ince n-
tive. In this case, they wanted to learn about the motivation behind
blood donations. Their discovery: when people are given a small
stipend for donating blood rather than simply being praised for their
altruism, they tend to donate less blood. The stipend turned a noble
act of charity into a painful way to make a few dollars, and it wasn't
worth it.
What if the blood donors had been offered an incentive of $50, or
$500, or $5,000? Surely the number of donors would have changed
dramatically.
But something else would have changed dramatically as well, for
every incentive has its dark side. If a pint of blood were suddenly worth $5,000, you can be sure that plenty of people would take note.
They might literally steal blood at knifepoint. They might pass off pig
blood as their own. They might circumvent donation limits by using
fake IDs. Whatever the incentive, whatever the situation, dishonest
people will try to gain an advantage by whatever means necessary.
Or, as W C. Fields once said: a thing worth having is a thing worth cheating for.
Who cheats?
Well, just about anyone, if the stakes are right. You might say to
yourself, I don't cheat, regardless of the stakes. And then you might re-
member the time you cheated on, say, a board game. Last week. Or
the golf ball you nudged out of its bad lie. Or the time you really
wanted a bagel in the office break room but couldn't come up with the
dollar you were supposed to drop in the coffee can. And then took the
bagel anyway. And told yourself you'd pay double the next time. And
didn't.
For every clever person who goes to the trouble of creating an in-
24
Schoolteachers and Sumo Wrestlers
centive scheme, there is an army of people, clever and otherwise, who
will inevitably spend even more time trying to beat it. Cheating
may or may not be human nature, but it is certainly a prominent fea-
ture in just about every human endeavor. Cheating is a primordial
eco nomic act: getting more for less. So it isn't just the boldface
names-inside-trading CEOs and pill-popping ballplayers and perk-
abusing politicians-who cheat. It is the waitress who pockets her tips
instead of pooling them. It is the Wal-Mart payroll manager who goes
into the computer and shaves his employees' hours to make his own
performance look better. It is the third grader who, worried about not
making it to the fourth grade, copies test answers from the kid sitting
11 ext to him.
Some cheating leaves barely a shadow of evidence. In other cases,
th e evidence is massive. Consider what happened one spring evening
11 midnight in 1987: seven million American children suddenly dis-
tppeared. The worst kidnapping wave in history? Hardly. It was the
night of April 15, and the Internal Revenue Service had just changed
1 rule. Instead of merely listing each dependent child, tax filers were
now required to provide a Social Security numbei: for each child. Sud-
1 · 11 I y, seven million children-children who had existed only as phantom exemptions on the previous year's 1040 forms-vanished,
I ·presenting about one in ten of all dependent children in the United
, 'tntes .
' rh e incentive for those cheating taxpayers was quite dear. The
1111 c for the waitress, the payroll manager, and the third grader. But
lia t about that third grader's teacher? Might she have an incentive to
h 'tit? And if so, how would she do it?
Im 1gi ne now that instead of running a day-care center in Haifa, 11111 ar runnin g the Chicago Public Schools, a system that educates
00 ,000 s ud n ts ea h y ar.
25
R AK ONOMI CS
he most vol at il e u1or "nt d b. ce am ng Am eri ca n school admin - istrators, teachers, parents, and students concerns "high-stakes" tesr-
ing. The stakes are considered high because instead of simply testin g
students to measure their progress, schools are increasingly held ac-
countable for the results.
The federal government mandated high-stakes testing as pare of
the No Child Left Behind law, signed by President Bush in 2002. Bur
even before chat law, most states gave annual standardized tests co stu -
dents in elementary and secondary school. Twenty states rewarded in -
dividual schools for good test scores or dramatic improvement;
thirty-two states sanctioned the schools chat didn't do well.
The Chicago Public School system embraced high-stakes testing
in 1996. Under the new policy, a school with low reading scores would be placed on probation and face the threat of being shut down,
its staff co be dismissed or reassigned. The CPS also did away with
what is known as social promotion. In the past, only a dramatically
inept or difficult student was held back a grade. Now, in order to be
promoted, every student in third, sixth, and eighth grade had co man-
age a minimum score on the standardized, multiple-choice exam
known as the Iowa Test of Basic Skills.
Advocates of high-stakes testing argue chat it raises the standards
of learning and gives students more incentive to study. Also, if the test
prevents poor students from advancing without merit, cheywon'c clog
up che higher grades and slow down good students. Opponents,
meanwhile, worry chat certain students will be unfairly penalized if
they don't happen co test well, and chat teachers may concentrate on
the test topics at the exclusion of more important lessons.
Schoolchildren, of course, have had incentive to cheat for as long
as there have been tests. Bue high-stakes testing has so radically
changed the incentives for teachers chat they coo now have added rea-
son co cheat. With high-stakes testing, a teacher whose students test
26
Schoolteachers and Sumo Wrestlers
poorly can be censured or passed over for a raise or promotion. If the entire school does poorly, federal funding can be withheld; if the
school is put on probation, the teacher stands to be fired. High-stakes
tes ting also presents teachers with some positive incentives. If her stu- dents do well enough, she might find herself praised, promoted, and
even richer: the state of California at one point introduced bonuses of
$25 ,000 for teachers who produced big test-score gains.
And if a teacher were to survey this newly incentivized landscape
and consider somehow inflating her students' scores, she just might
be persuaded by one final incentive: teacher cheating is rarely looked
fo r, hardly ever detected, and just about never punished.
How might a teacher go about cheating? There are any number of
possibilities, from the brazen to the sophisticated. A fifth-grade stu-
dent in Oakland recently came home from school and gaily told her
mother that her super-nice teacher had written the answers to the
sta te exam right there on the chalkboard. Such instances are certainly
rare, for placing your fate in the hands of thirty prepubescent wit-
nesses doesn't seem like a risk that even the worst teacher would take.
(T he Oakland teacher was duly fired.) There are more subtle ways to
in A ate students' scores. A teacher can simply give students extra time 1 o complete the test. If she obtains a copy of the exam early-that is, ill egitimately-she can prepare them for specific questions. More
broadly, she can "teach to the test," basing her lesson plans on ques-
1 io ns from past years' exams, which isn't consider~d cheating but
· nainly violates the spirit of the test. Since these tests all have
mu 1 tiple-choice answers, with no penalty for wrong guesses, a teacher
1ni ght instruct her students to randomly fill in every blank as the
·lo k is winding down, perhaps inserting a long string of Bs or an al-
l rn ating pattern of Bs and Cs. She might even fill in the blanks for
1h ·m after they've left the room. But if a teac her really wanted to cheat- and make it worth her
2 7
AK ONOMI C
whil e- she mi gh t o ll e c her scud encs' answer shee r and, in th e ho lll'
or so before turning th em in to be read by an elec tronic scann er, eras1:
the wrong answers and fill in correct ones. (And you always thou ght
that no. 2 pencil was for the children to change their answers.) If chi s kind of teacher cheating is truly going on, how might it be detected ?
To catch a cheater, it helps to think like one. If you were willing to erase your students' wrong answers and fill in correct ones, you prob-
ably wouldn't want to change too many wrong answers. That would
clearly be a tip-off. You probably wouldn't even want to change a n-
swers on every student's test-another tip-off. Nor, in all likelihood ,
would you have enough time, because the answer sheets are turned in
soon after the test is over. So what you might do is select a string of
eight or ten consecutive questions and fill in the correct answers for,
say, one-half or two-thirds of your students. You could easily memo-
rize a short pattern of correct answers, and it would be a lot faster to
erase and change that pattern than to go through each student's an-
swer sheet individually. You might even think to focus your activity
toward the end of the test, where the questions tend to be harder than
the earlier questions. In that way, you'd be most likely to substitute
correct answers for wrong ones.
If economics is a science primarily concerned with incentives, it is also-fortunately-a science with statistical tools to measure how
people respond to those incentives. All you need are some data.
In this case, the Chicago Public School system obliged. It made
available a database of the test answers for every CPS student from
third grade through seventh grade from 1993 to 2000. This amounts
to roughly 30,000 students per grade per year, more than 700,000
sets of test answers, and nearly 100 million individual answers. The
data, organized by classroom, included each student's question-by-
question answer strings for reading and math tests. (The actual paper
answer sheets were not included; they were habitually shredded soon
28
Schoolteachers and Sumo Wrestlers
after a test.) The data also included some information about each
teacher and demographic information for every student, as well as his
or her past and future test scores-which would prove a key element
in detecting the teacher cheating.
Now it was time to construct an algorithm that could tease some
conclusions from this mass of data. What might a cheating teacher's
classroom look like?
The first thing to search for would be unusual answer patterns in a
given classroom: blocks of identical answers, for instance, especially
among the harder questions. If ten very bright students (as indicated by past and future test scores) gave correct answers to the exam's first
five questions (typically the easiest ones), such an identical block
shouldn't be considered suspicious. But if ten poor students gave cor-
rect answers to the last five questions on the exam (the hardest ones), rliat's worth looking into. Another red flag would be a strange pattern
within any one student's exam-such as getting the hard questions
ri ght while missing the easy ones-especially when measured against
I he thousands of students in other classrooms who scored similarly on
th e same test. Furthermore, the algorithm would seek out a classroom
ILdl of students who performed far better than their past scores would
have predicted and who then went on to score significantly lower the
fo l lowing year. A dramatic one-year spike in test scores might initially
b · attributed to a good teacher; but with a dramatic fall to follow, I h ·re's a strong likelihood that the spike was brought about by artifi-
·ia 1 means. o nsider now the answer strings from the students in two sixth-
w :i <l e C hicago classrooms who took the identical math test. Each hor-
w n cal row represents one student's answers. The letter a, b, c, or d ndi aces a correct answer; a number indicates a wrong answer, with 1
·or r spo nding to a, 2 corresponding to b, and so on. A zero represents
in :i nswer th at was left blank. One of these classrooms almost cer-
2 9
F' R AK ONOMI CS
rainly had a ch eating reacher and rhe ocher did nor. Try ro cell rhe dif'..
ference-alrhough be forewarned char ir's nor easywirh rhe naked ey ·.
30
Classroom A
112a4a342cb214d0001acd24a3al2dadbcb4a0000000
d4a2341cacbddad3142a2344a2ac23421c00adb4b3cb
lb2a34d4ac42d23b141acd24a3a12dadbcb4a2134141
dbaab3dcacbldadbc42ac2cc31012dadbcb4adb40000
dl2443d43232d32323c213c22d2c23234c332db4b300
db2abadlacbdda212blacd24a3a12dadbcb400000000
d4aab2124cbddadbcbla42cca3412dadbcb423134bcl
lb33b4d4a2bldadbc3ca22c000000000000000000000
d43a3a24acbld32b412acd24a3a12dadbcb422143bc0
313a3adlac3d2a23431223c000012dadbcb400000000
db2a33dcacbd32d313c21142323cc300000000000000
d43ab4dlac3dd43421240d24a3al2dadbcb400000000
db223a24acblla3b24cacd12a241cdadbcb4adb4b300
db4abadcacbldad3141ac212a3alc3a144ba2db4lb43
1142340c2cbddadb4blacd24a3a12dadbcb43d133bc4
214ab4dc4cbdd31blb2213c4ad412dadbcb4adb00000
1423b4d4a23d24131413234123a243a2413a21441343
3b3ab4d14c3d2ad4cbcaclc003a12dadbcb4adb40000
dba2ba21ac3d2ad3c4c4cd40a3al2dadbcb400000000
dl22ba2cacbdla13211a2d02a2412d0dbcb4adb4b3c0
144a3adc4cbddadbcbc2c2cc43al2dadbcb4211ab343
d43aba3cacbddadbcbca42c2a3212dadbcb42344b3cb
Classroom B
db3a431422bd131b4413cd422alacda332342d3ab4c4
dlaalallacb2d3dbclca22c23242c3a142b3adb243cl
Schoolteachers and Sumo Wrestlers
d42a12d2a4bld32b21ca2312a341ld00000000000000
3b2a34344c32d2lb1123cdc000000000000000000000
34aabad12cbdd3d4clca112cad2ccd00000000000000
d33a3431a2b2d2d44b2acd2cad2c2223b40000000000
2 3aa32d2albd,2 4 31141342cl 3d212d2 3 3c3 4a3b3b0 O O
d32234d4albdd23b242a22c2alalcda2blbaa33a0000
d3aab23c4cbddadb23c322c2a222223232b443b24bc3
d13a14313c31d42b14c421c42332cd2242b3433a3343
dl3a3ad122blda2b11242dcla3a12100000000000000
d12a3adlal3d23d3cb2a21ccada24d2131b440000000
314a133c4cbd142141ca424cad34cl22413223ba4b40
d42a3adcacbddadbc42ac2c2ada2cda341baa3b24321
db1134dc2cb2dadb24c412clada2c3a341ba20000000
dl341431acbddad3c4c213412da22d3dl132al344blb
lba41a21alb2dadb24ca22clada2cd32413200000000
dbaa33d2a2bddadbcbcallc2a2accdalb2ba20000000
If you guessed that classroom A was the cheating classroom, con- gratulations. Here again are the answer strings from classroom A, now
reordered by a computer that has been asked to apply the cheating al-
go rithm and seek out suspicious patterns.
Classroom A
(With cheating algorithm applied)
1. 112a4a342cb214d0001acd24a3al2dadbcb4a0000000
2. lb2a34d4ac42d23bl41acd24a3al2dadbcb4a2134141
3. db2abadlacbdda212blacd24a3al2dadbcb400000000
4. d43a3a24acbld32b412acd24a3al2dadbcb422143bc0
5. d43ab4dlac3dd43421240d24a3al2dadbcb400000000
6. 1142340c2cbddadb4blacd24a3al2dadbcb43d133bc4
7. db a 2ba2lac3d2ad3c4c4cd4 0a3al2dadbcb400000000
31
F' AK ONOMI CS
8. 1 44 a3a d c4cbdda dbcbc2c2cc4 3al2dadbcb4 211a b 343
9. 3b3ab4d14c3d2ad4cbcac l c003al2dadbcb4adb40000
10. d43aba3cacbddadbcbca42c2a3212dadbcb42344b3cb
11. 214ab4dc4cbdd31blb2213c4ad412dadbcb4adb00000
12. 313a3adlac3d2a23431223c000012dadbcb400000000
13. d4aab2124cbddadbcbla42cca3412dadbcb423134bcl
14. dbaab3dcacbldadbc42ac2cc31012dadbcb4adb40000
15. db223a24acblla3b24cacd12a241cdadbcb4adb4b300
16. dl22ba2cacbdla13211a2d02a2412d0dbcb4adb4b3c0
17. 1423b4d4a23d24131413234123a243a2413a21441343
18. db4abadcacbldad3141ac212a3alc3a144ba2db41b43
19. db2a33dcacbd32d313c21142323cc300000000000000
20. lb33b4d4a2bldadbc3ca22cOOOOOOOOOOOOOOOOOOOOO
21. dl2443d43232d32323c213c22d2c23234c332db4b300
22 . d4a2341cacbddad3142a2344a2ac23421c00adb4b3cb
Take a look at the answers in bold. Did fifteen out of twenty-two
students somehow manage to reel off the same six consecutive correct
answers (the d-a-d-b-c-b string) all by themselves?
There are at least four reasons this is unlikely. One: those ques-
tions, coming near the end of the test, were harder than the earlier
questions. Two: these were mainly subpar students to begin with, few
of whom got six consecutive right answers elsewhere on the test, mak-
ing it all the more unlikely they would get right the same six hard
questions. Three: up to this point in the test, the fifteen students' an-
swers were virtually uncorrelated. Four: three of the students (num-
bers 1, 9, and 12) left at least one answer blank before the suspicious string and then ended the test with another string of blanks. This sug-
gests that a long, unbroken string of blank answers, was broken not by
the student but by the teacher.
There is another oddity about the suspicious answer string. On
32
Schoolteachers and Sumo Wrestlers
nine of the fifteen tests, the six correct answers are preceded by an-
other identic~l string, 3-a-1-2, which includes three of four incorrect answers. And on all fifteen tests, the six correct answers are followed
by the same incorrect answer, a 4. Why on earth would a cheating
teacher go to the trouble of erasing a student's test sheet and then fill
in the wrong answer?
Perhaps she is merely being strategic. In case she is caught and
hauled into the principal's office, she could point to the wrong an-
swers as proof that she didn't cheat. Or perhaps-and this is a less
charitable but just as likely answer-she doesn't know the right an-
swers herself. (With standardized tests, the teacher is typically not
given an answer key.) If this is the case, then we have a pretty good due as to why her students are in need of inflated grades in the first
place: they have a bad teacher.
Another indication of teacher cheating in classroom A is the
class's overall performance. As sixth graders who were taking the test
in the eighth month of the academic year, these students needed to
achieve an average score of 6.8 to be considered up to national stan-
dards. (Fifth graders taking the test in the eighth month of the year
needed to score 5.8, seventh graders 7.8, and so on.) The students in
classroom A averaged 5.8 on their sixth-grade tests, which is a full
grade level below where they should be. So plainly these are poor stu-
dents. A year earlier, however, these students did even worse, averag-
ing just 4.1 on their fifth-grade tests. Instead of improving by one full
point between fifth and sixth grade, as would be expected, they im-
proved by 1.7 points, nearly two grades' worth. But this miraculous
improvement was short-lived. When these sixth-grade students
reached seventh grade, they averaged 5.5-more than two grade lev-
els below standard and even worse than they did in sixth grade. Con-
sid er the erratic year-to-year scores of three particular students from
lass roomA:
33
AK ONOMI CS
5TH G RAD S CORE 6TH GRAD E S CORE 7r H G RADE S co m
Stud ent 3 3.0 6.5 5.1
Stud ent 6 3.6 6.3 4.9
Student 14 3.8 7.1 5.6
The three-year scores from classroom B, meanwhile, are also poor
but at least indicate an honest effort: 4.2, 5.1, and 6.0. So an entirt:
roomful of children in classroom A suddenly got very smart one year
and very dim the next, or more likely, their sixth-grade teacher
worked some magic with a no. 2 pencil.
There are two noteworthy points to be made about the children in
classroom A, tangential to the cheating itself The first is that they are
obviously in terrible academic shape, which makes them the very
children whom high-stakes testing is promoted as helping the most.
The second point is that these students would be in for a terrible
shock once they reached the seventh grade. All they knew was that
they had been successfully promoted due to their test scores. (No
child left behind, indeed.) They weren't the ones who artificially jacked up their scores; they probably expected to do great in the sev-
enth grade-and then they failed miserably. This may be the cruelest
twist yet in high-stakes testing. A cheating teacher may tell herself
that she is helping her students, but the fact is that she would appear
far more concerned with helping herself
An analysis of the entire Chicago data reveals evidence of teacher
cheating in more than two hundred classrooms per year, roughly 5
percent of the total. This is a conservative estimate, since the algo-
rithm was able to identify only the most egregious form of cheating-
in which teachers systematically changed students' answers-and not
the many subtler ways a teacher might cheat. In a recent study among North Carolina schoolteachers, some 35 percent qf the respondents
said they had witnessed their colleagues cheating in some fashion,
34
Schoolteachers and Sumo Wrestlers
whether by giving students extra time, suggesting answers, or manu-
ally changing students' answers.
What are the chadcteristics of a cheating teacher? The Chicago
data show that male and female teachers are about equally prone to
cheating. A cheating teacher tends to be younger and less qualified
than average. She is also more likely to cheat after her incentives
change. Because the Chicago data ran from 1993 to 2000, it brack-
eted the introduction of high-stakes testing in 1996. Sure enough,
there was a pronounced spike in cheating in 1996. Nor was the cheat-
ing random. It was the teachers in the lowest-scoring classrooms who were most likely to cheat. It should also be noted that the
$25,000 bonus for California teachers was eventually revoked, in part
because of suspicions that too much of the money was going to
cheaters.
Not every result of the Chicago cheating analysis was so dour. In
addition to detecting cheaters, the algorithm could also identify the
best teachers in the school system. A good teacher's impact was nearly
as distinctive as a cheater's. Instead of getting random answers correct,
her students would show real improvement on the easier types of
questions they had previously missed, an indication of actual learn-
ing. And a good teacher's students carried over all their gains into the
next grade.
Most academic analyses of this sort tend to languish, unread, on a
dusty library shelf. But in early 2002, the new CEO of the Chicago
Public Schools, Arne Duncan, contacted the study's authors. He
didn't want to protest or hush up their findings. Rather, he wanted to
make sure that the teachers identified by the algorithm as cheaters
were truly cheating-and then do something about it.
Duncan was an unlikely candidate to hold such a powerful job.
He was only thirty-six when appointed, a onetime academic all-
American at Harvard who later played pro basketball in Australia. He
35
FREAKONOMICS
had spent just three years with the CPS-and never in a job impor tant enough to have his own secretary-before becoming its CEO. It didn't hurt that Duncan had grown up in Chicago. His father taught psychology at the University of Chicago; his mother ran an after school program for forty years, without pay, in a poor neighborhood. When Duncan was a boy, his afterschool playmates were the under privileged kids his mother cared for. So when he took over the public schools, his allegiance lay more with schoolchildren and their families than with teachers and their union.
The best way to get rid of cheating teachers, Duncan had decided, was to readminister the standardized exam. He only had the resources to retest 120 classrooms, however, so he asked the creators of the cheating algorithm to help choose which classrooms to test.
How could those 120 retests be used most effectively? It might have seemed sensible to retest only the classrooms that likely had a cheating teacher. But even if their retest scores were lower, the teach ers could argue that the students did worse merely because they were told that the scores wouldn't count in their official record-which, in fact, all retested students would be told. To make the retest results convincing, some non-cheaters were needed as a control group. The best control group? The classrooms shown by the algorithm to have the best teachers, in which big gains were thought to have been legiti mately attained. If those classrooms held their gains while the class rooms with a suspected cheater lost ground, the cheating teachers could hardly argue that their students did worse only because the scores wouldn't count.
So a blend was settled upon. More than half of the 120 retested classrooms were those suspected of having a cheating teacher. The re mainder were divided between the supposedly excellent teachers (high scores but no suspicious answer patterns) and, as a further con trol, classrooms with mediocre scores and no suspicious answers.
The retest was given a few weeks after the original exam. The chil-
36
s
dren were not told the reason hi But they may have gotten the i officials, not the teachers, wm were asked to stay in the � would not be allowed to even o
The results were as compdli dieted. In the classrooms chOSf suspected, scores stayed abour � srudents with the teachers ideo an average of more than a full
As a result, the Chicago Pl cheating teachers. The evidence dozen of them, but the many (i The final outcome of the Chic power of incentives: the followi:J than 30 percent.
You might think that the sophis increase along with the le d o
niversity of Georgia in the' course was called Coaching � 2nd the final grade was based o ·ans. Among the questions:
How many halves are in a coll�
a. 1 b.2 c.3 cl 4
!How many points does a 3-pL j g;ime?
a. 1 b.2 c. 3 d.4
and never in a job impor
fore becoming its CEO. It
in Chicago: His father taught
· his mother ran an after-
pay in a poor neighborhood.
I playmates were the under
when he took over the public
ichoolchildren and their families
!ICSC scores were lower, the teach
rse merely because they were
n their official record-which, in
m cl To make the retest results � needed as a control group. The shown by the algorithm to have
, cho�ght �o hav� been legiti hdd their gams while the class
ground, the cheating teachers
did worse only because the
than half of the 120 retested
1 .... ,,,,-�a a cheating teacher. The re
supposedly excellent teachers
�patterns) and, as a further con
j and no suspicious answers.
s after the original exam. The chil-
Schoolteachers and Sumo Wrestlers
dren were not told the reason for the retest. Neither were the teachers.
But they may have gotten the idea when it was announced that CPS
officials, not the teachers, would administer the test. The teachers
were asked to stay in the classroom with their students, but they
would not be allowed to even touch the answer sheets.
The results were as compelling as the cheating algorithm had pre
dicted. In the classrooms chosen as controls, where no cheating was
suspected, scores stayed about the same or even rose. In contrast, the
students with the teachers identified as cheaters scored far worse, by
an average of more than a full grade level.
As a result, the Chicago Public School system began to fire its
cheating teachers. The evidence was only strong enough to get rid of a
dozen of them, but the many other cheaters had been duly warned.
T he final outcome of the Chicago study is further testament to the
power of incentives: the following year, cheating by teachers fell more
than 30 percent.
You might think that the sophistication of teachers who cheat would
increase along with the level of schooling. But an exam given at the
University of Georgia in the fall of 2001 disputes that idea. The
course was called Coaching Principles and Strategies of Basketball,
and the final grade was based on a single exam that had twenty ques
cions. Among the questions:
How many halves are in a college basketball game?
a. 1 b.2 c. 3 d.4
How many points does a 3-pt. field goal account for in a basketball
game?
a. 1 b.2 c.3 d.4
37
I f
.1 t
t
FREAKONOMICS
What is the name of the exam which all high school seniors in the
State of Georgia must pass?
a. Eye Exam
b. How Do the Grits Taste Exam
c. Bug Control Exam
d. Georgia Exit Exam
In your opinion, who is the best Division I assistant coach in the
country?
a. Ron Jirsa
b. John Pelphrey
c. Jim Harri ck Jr.
d. Steve Wojciechowski
If you are stumped by the final question, it might help to know that Coaching Principles was taught by Jim Harrick Jr., an assistant coach with the university's basketball team. It might also help to know that his father, Jim Harrick Sr., was the head basketball coach. Not surprisingly, Coaching Principles was a favorite course among players on the Harricks' team. Every student in the class received an A. Not long afterward, both Harricks were relieved of their coaching duties.
If it strikes you as disgraceful that Chicago schoolteachers and/Uni versity of Georgia professors will cheat-a teacher, after all, is meant to instill values along with the facts-then the thought of cheating among sumo wrestlers may also be deeply disturbing. In Japan, sumo is not only the national sport but also a repository of the country's re ligious, military, and historical emotion. With its purification rituals and its imperial roots, sumo is sacrosanct in a way that American
38
sports can never be. lo
than about honor itsdi
It is true that spom
cheating is more COllll
line between winning:
c.entive. Olympic sprii
France, football linen
shown to swallow wha
is not only the parricip
sceal an opponents�
competition, a Frendl
o swap votes to make
of orchestrating the
-"'\limzhan Tokhtakho\1
eants in Moscow.)
An athlete who geli
ost fans at least app
mat he bent the rules.
if you're not ch�
, meanwhile, is co
919 Chicago White-•
orld Series (and are
rain a stench of iniqu
College of New YodC�
r its smart and scr.q
ered in 1951 that: 1
in ts-intentionally
read. Remember To
stemmed &om ch
could have had cbs
If cheating to lose.
ill high school seniors in the
sion I assistant coach in the
srion, ir might help to know Jtm Harrick Jr., an assistant
un.. Ir might also help to know ie head basketball coach. Not &vorire course among players
t in rhe class received an A. :re relieved of their coaching
ago schoolteachers and Uni -a teacher, after all, is meant -dien rhe thought of cheating i>lf disturbing. In Japan, sumo .repository of the country's re o.. With its purification rituals :an.a in a way that American
Schoolteachers and Sumo Wrestlers
sports can never be. Indeed, sumo is said to be less about competition than about honor itself.
It is true that sports and cheating go hand in hand. That's because cheating is more common in the face of a bright-line incentive (the line between winning and losing, for instance) than with a murky in centive. Olympic sprinters and weightlifters, cyclists in the Tour de France, football linemen and baseball sluggers: they have all been shown to swallow whatever pill or powder may give them an edge. It is not only the participants who cheat. Cagey baseball managers try to steal an opponent's signs. In the 2002 W inter Olympic .figure-skating competition, a French judge and a Russian judge were caught trying to swap votes to make sure their skaters medaled. (The man accused of orchestrating the vote swap, a reputed Russian mob boss named Alimzhan Tokhtakhounov, was also suspected of rigging beauty pag eants in Moscow.)
An athlete who gets caught cheating is generally condemned, but most fans at least appreciate his motive: he wanted so badly to win that he bent the rules. (As the baseball player Mark Grace once said,
If you're not cheating, you're not trying.") An athlete who cheats to lose, meanwhile, is consigned to a deep circle of sporting hell. The 1919 Chicago White Sox, who conspired with gamblers to throw the World Series (and are therefore known forever as the Black Sox), re tain a stench of iniquity among even casual baseball fans. The City College of New York's championship basketball team, once beloved for its smart and scrappy play, was instantly reviled when it was dis covered in 1951 that several players had taken mob money to shave points-intentionally missing baskets to help gamblers beat the point spread. Remember Terry Malloy, the torment.ed former boxer played by Marlon Brando in On the Waterfront? As Malloy saw it, all his trou bles stemmed from the one fight in which he took a dive. Otherwise, he could have had class; he could have been a contender.
If cheating to lose is sport's premier sin, and if sumo wrestling is
39
. I
FREAKONOMICS
the premier sport of a great nation, cheating to lose couldn't possibly
exist in sumo. Could it?
Once again, the data can tell the story. As with the Chicago school
tests, the data set under consideration here is surpassingly large: the
results from nearly every official sumo match among the top rank of
Japanese sumo wrestlers between January 1989 and January 2000, a
total of 32,000 bouts fought by 281 different wrestlers.
The incentive scheme that rules sumo is intricate and extraordi
narily powerful. Each wrestler maintains a ranking that affects every
slice of his life: how much money he makes, how large an entourage
he carries, how much he:gets to eat, sleep, and otherwise take advan
tage of his success. The sixty-six highest-ranked wrestlers in Japan,
comprising the makuuchi and juryo divisions, make up the sumo elite.
A wrestler near the top of this elite pyramid may earn millions and
is treated like royalty. Any wrestler in the top forty earns at least
$170,000 a year. The seventieth-ranked wrestler in Japan, mean
while, earns only $15,000 a year. Life isn't very sweet outside the elite.
Low-ranked wrestlers must tend to their superiors, preparing their
meals and deaning their quarters and even soaping up their hardest
to-reach body parts. So ranking is everything.
A wrestler's ranking is based on his performance in the elite tour
naments that are held six times a year. Each wrestler has fifteen bouts
per tournament, one per day over fifteen consecutive days. If he fin
ishes the tournament with a winning record (eight victories or better),
his ranking will rise. If he has a losing record, his ranking falls. If it
falls far enough, he is booted from the elite rank entirely. The eighth
victory in any tournament is therefore critical, the difference between
promotion and demotion; it is roughly four times as valuable in the
rankings as the typical victory.
So a wrestler entering the final day of a tournament on the bubble,
with a 7-7 record, has far more to gain from a victory than an oppo
nent with a record of 8-6 has to lose.
40
Is it possible, then, d
to beat him? A sumo bo
and leverage, often lasrit
to let yourself be tosst
wrestling is rigged. How
The first step woulc fought on a tournament and a wrestler who has a than half of all wrestleis1 nine victories, hundreds between two 7-7 wrestL badly need the victory; wouldn't throw a march to win: the $100, 000 pri ries of $20,000 prizes fu ing spirit" award, and oo
Let's now consider rl
hundreds of matches in�
a tournament's final day.'
on all past meetings hen
the 7-7 wrestler will win.
wrestler actually did win.
7-7 WRESTU!li PREDICTED Wr PERC
AGAINST 8-6 ()PF(
48.7
So the 7-7 wrestle� b: just less than half them tournament indicate chat ruality, the wrestler on du
earing to lose couldn't possibly
O'· As with the Chicago school
here is surpassingly large: the
march among the top rank of \
fY 1989 and January 2000, a
frerent wrestlers.
is intricate and extraordi
os a ranking that affects every
, how large an entourage
and otherwise take advan
-ranked wrestlers in Japan,
isions, make up the sumo elite.
· d may earn millions and
wrestler in Japan,, mean
't very sweet outside the elite.
ormance in the elite tour
Each wrestler has fifteen bouts
m consecutive days. If he fin
oord (eight victories or better),
.record, his ranking falls. If it
·re rank entirely. The eighth
· ·cal, the difference between
four times as valuable in the
a tournament on the bubble,
from a victory than an oppo-
Schoolteachers and Sumo Wrestlers
Is it possible, then, that an 8-6 wrestler might allow a 7-7 wrestler
to beat him? A sumo bout is a concentrated flurry of force and speed
and leverage, often lasting only a few seconds. It wouldn't be very hard
to let y ourself be tossed. Let's imagine for a moment that sumo
wrestling is rigged. How might we measure the data to prove it?
The first step would be to isolate the bouts in question: those
fought on a tournament's fin.al day between a wrestler on the bubble
and a wrestler who has already secured his eighth win. (Because more
than half of all wrestlers end a tournament with either seven, eight, or
nine victories, hundreds of bouts fit these criteria.) A final-day match
between two 7-7 wrestlers isn't likely to be fixed, since both fighters
badly need the victory. A wrestler with ten or more victories probably
wouldn't throw a match either, since he has his own strong incentive
to win: the $100,000 prize for overall tournament champion and a se
ries of $20,000 prizes for the "outstanding technique" award, "fight
ing spirit" award, and others.
Let's now consider the following statistic, which represents the
hundreds of matches in which a 7-7 wrestler faced an 8-6 wrestler on
a tournament's final day. The left column tallies the probability, based
on all past meetings between the two wrestlers fighting that day, that
the 7-7 wrestler will win. The right column shows how often the 7-7
wrestler actually did win.
7-7 WRESTLER'S
PREDICTED W1N PERCENTAGE
AGAINST 8-6 OPPONENT
48.7
7-7 WRESTLER'S
ACTUAL WIN PERCENTAGE
AGAINST 8-6 OPPONENT
79.6
So the 7-7 wrestler, based on past outcomes, was expected to win
just less than half the time. This makes sense; their records in this
tournament indicate that the 8-6 wrestler is slightly better. But in ac
tuality, the wrestler on the bubble won almost eight out of ten matches
41
FREAKONOMICS
against his 8-6 opponent. Wrestlers on the bubble also do astonish ingly well against 9-5 opponents:
7-7 WRESTLER'S
PREDICTED WIN PERCENTAGE
AGAINST 9-5 OPPONENT
47.2
7-7 WRESTLER'S
A CTUAL WIN PERCENTAGE
AGAINST 9-5 OPPONENT
73.4
As suspicious as this looks, a high winning percentage alone isn't enough to prove that a match is rigged. Since so much depends on a wrestler's eighth win, he should be expected to fight harder in a crucial bout. But perhaps there are further clues in the data that prove collu s10n.
It's worth thinking about the incentive a wrestler might have to throw a match. Maybe he accepts a bribe (which would obviously not be recorded in the data). Or perhaps some other arrangement is made between the two wrestlers. Keep in mind that the pool of elite s·umo wrestlers is extraordinarily tight-knit. Each of the sixty-�ix elite wrestlers fights fifteen of the others in a tournament every two months. Furthermore, each wrestler belongs to a stable that is typi cally managed by a former sumo champion, so even the rival stables have close ties. (Wrestlers from the same stable do not wrestle one an other.)
Now let's look at the win-loss percentage between the 7-7 wrestlers and the 8-6 wrestlers the next time they meet, when neither one is on the bubble. In this case, there is no great pressure on the in dividual match. So you might expect the wrestlers who won their 7-7 matches in the previous tournament to do about as well as they had in earlier matches against these same opponents-that is, winning roughly 50 percent of the time. You certainly wouldn't expect them to uphold their 80 percent clip.
42
As ic cums ou di oerc:enc of che reman:l -
the next? How 001 J
The mosc logicaJ < agreement you J
d I ll let you win di e a cash bribe.) 1
wrestlers' second sub expected levd o
only two mard And it isn't only di e collective reco
- nal. W hen one wrestlers from a secoi
' e second stables 'WI me match rigging1
spore-much like dit o formal discipl sumo wrestler l
o Association disgruntled form rds "sumo" and c or. People tend ro
sport is impugned. Still, allegations <
-
co the Japanese m ore chance to m
scrutiny, after all, en o their stab.les..ha�I rinue when a swam
em.
lie bubble also do astonish-
7-7 WRESTLER'S
CTUAL W1N PERCENTAGE
T 9-5 OPPONENT
73.4
ming percentage alone isn't
mce so much depends on a
eel to fight harder in a crucial the data that prove collu-
a wrestler might have to
WhiCh would obviously not
e ocher arrangement is made
that the pool of elite sumo
� of the sixty-six elite a tournament every two
1>ngs to a stable that is typi
lon so even the rival stables
Stable do not wrestle one an-
rcentage between the 7-7
ime they meet, when neither
; no great pressure on the in- tlers who won their 7-7
about as well as they had
�nents-that is, winning wouldn't expect them to
Schoolteachers and Sumo Wrestlers
As it turns out, the data show that the 7-7 wrestlers win only 40 percent of the rematches. Eighty percent in one match and 40 percent in the next? How do you make sense of that?
The most logical explanation is that the wrestlers made a quid pro quo agreement: you let me win todaY, when I really need the victory, and I ' ll let you win the next time. (Such an arrangement wouldn't pre clude a cash bribe.) It's especially interesting to note that by the two wrestlers' second subsequent meeting, the win percentages revert to the expected level of about 50 percent, suggesting that the collusion spans only two matches.
And it isn't only the individual wrestlers whose records are suspect. The collective records of the various sumo stables are similarly aberra tional. When one stable's wrestlers fare well on the bubble against wrestlers from a second stable, they tend to do especially poorly when the second stable's wrestlers are on the bubble. This indicates that some match rigging may be choreographed at ·the highest level of the sport-much like the Olympic skating judg�s' vote swapping.
No formal disciplinary attion has ever been taken against a Japa nese sumo wrestler for match rigging. Officials from the Japanese
umo Association typically dismiss any such charges as fabrications by disgruntled former wrestlers. In fact, the mere utterance of the words "sumo" and "rigged" in the same sentence can cause a national furor. People tend to get defensive when the integrity of their national sport is impugned.
Still, allegations of match rigging do occasionally find their way into the Japanese media. These occasional media storms offer one more chance to measure possible corruption in sumo. Media scrutiny, after all, creates a powerful incentive: if two sumo wrestlers or their stables have been rigging matches, they might be leery to con cinue when a swarm of journalists and TV cameras descend upon chem.
43
FREAKONOMICS
So what happens in such cases? The data show that in the sumo tournaments held immediately after allegations of match rigging, 7-7 wrestlers win only 50 percent of their final-day matches against 8--6 opponents instead of the ty pical 80 percent. No matter how the data ar� sliced, they inevitably suggest one thing: it is hard to argue that sumo wrestling isn't rigged.
Several years ago, two former sumo wrestlers came forward with extensive allegations of match rigging-and more. Aside from the crooked matches, they said, sumo was rife with drug use and sexca pades, bribes and tax evasion, and close ties to the yakuza., the Japa nese mafia. The two men began to receive threatening phone calls; one of them told friends he was afraid he would be killed by the yakuza.. Still, they went forward with plans to hold a press conference at the Foreign Correspondents' Club in Tokyo. But shortly before hand, the two men died-hours apart, in the same hospital, of a sim ilar respiratory ailment. The police declared there had been no foul play but did not conduct an investigation. "It seems very strange for these two people to die on the same day at the same hospital," said Mitsuru Miyake, the editor of a sumo magazine. "But no one has seen them poisoned, so you can't prove the skepticism."
Whether or not their deaths were intentional, these two men had done what no other sumo insider had previously done: named names. Of the 281 �restlers covered in the dat� cited above, they identified 29 crooked wrestlers and 11 who were said to be incorruptible.
What happens when the whistle-blowers' corroborating evidence is factored into the analysis of the match data? In matches between two supposedly corrupt wrestlers, the wrestler who was on the bubble won about 80 percent of the time. In bubble matches against a sup posedly dean opponent, meanwhile, the bubble wrestler was no more likely to win than his record would predict. Furthermore, when a supposedly corrupt wrestler faced an opponent whom the whistle-
44
blowers did nor name .as nearly as skewed as when 1 most wrestlers who t«7mi
So if sumo wrestlers, sdu: are we to assume char :ma And if so, how corrupt!
The answer may lie in . named Paul Feldman.
Once upon a time, Fd agricultural economist, he rook a job in WashingtoJJ
.S. Navy. This was in I� ore of the same. He hdcl
ur he wasn't fully engagal colleagues would inrrodoo
ublic research group" ( e bagels." The bagels had begun :
. loyees whenever they W1l habit. Every Friday, he wo
d cream cheese. When about the bagels, they waJi -· fifteen dozen bagels a a cash basket and a sign · � about 95 percent; he:
or fraud. In 1984, when his te.Se4
Feldman took a look at hii · job and sell bagels. His
.t in. the sumo ch rigging, 7-7
thes against 8-6 er how the data
to argue tha�
me forward with Aside from the
� use and sexca 'aku.za, the Japa . g phone calls; be killed by the
a press conference t shortly before-
very strange for e hospital," said
rating evidence In matches between
was on the bubble
tler was no more ermore, when a
t whom the whistle-
Schoolteachers and Sumo Wrestlers
blowers did not name as either corrupt or clean, the results were nearly as skewed as when two corrupt wrestlers met-suggesting that most wrestlers who weren't specifically named were also corrupt.
So if sumo wrestlers, schoolteachers, and day-care parents all cheat, are we to assume that mankind is innately and universally corrupt? And if so, how corrupt?
The answer may lie in ... bagels. Consider the true story of a man named Paul Feldman.
Once upon a time, Feldman dreamed big dreams. Trained as an agricultural economist, he wanted to tackle world hunger. Instead, he took a job in Washington, analyzing weapons expenditures for the U.S. Navy. This was in 1962. For the next twenty-odd years, he did more of the same. He held senior-level jobs and earned good money, but he wasn't fully engaged in his work. At the office Christmas party, colleagues would introduce him to thei� wives not as "the head of the public research group" (which he was) but as "the guy who brings in the bagels."
The bagels had begun as a casual gesture: a boss treating his em ployees whenever they won a research contract. Then he made it a habit. Every Friday, he would bring in some bagels, a serrated knife, and cream chee�e. W hen employees from neighboring floors heard about the bagels, they wanted some too. Eventually he was bringing in fifteen dozen bagels a week. In order to recoup his costs, he set out a cash basket and a sign with the suggested price. His collection rate was about 95 percent; he attributed the underpayment to oversight, not fraud.
In 1984, when his research institute fell under new management, Feldman took a look at his career and grimaced. He decided to quit his job and sell bagels. His economist friends thought he had lost his
45
FREAKONOMICS
mind, but his wife supported him. The last of their three children was finishing college, and they had retired their mortgage.
Driving around the office parks that encircle Washington, he so licited customers with a simple pitch: early in the morning, he would deliver some bagels and a cash basket to a company's snack room; he would return before lunch to pick up the money and the leftovers. It was an honor-system commerce scheme, and it worked. Within a few years, Feldman was delivering 8,400 bagels a week to 140 companies and earning as much as he had ever made as a research analyst. He had thrown off the shackles of cubicle life and made himself happy.
He had also-quite without meaning to-designed a beautiful economic experiment. From the beginning, Feldman kept rigorous data on his business. So by measuring the money collected against the bagels taken, he found it possible to tell, down to the penny, just how honest his customers were. Did they steal from him? If so, what were the characteristics of a company that stole versus a company that did not? Under what circumstances did people tend to steal more, or less?
& it happens, Feldman's accidental study provides a window onto a form of cheating that has long stymied academics: white-collar crime. (Yes, shorting the bagel man is white-collar crime, writ how ever small.) It might seem ludicrous to address as large and intractable a problem as white-collar crime through the life of a bagel man. But often a small and simple question can help chisel away at the biggest problems.
Despite all the attention paid to rogue companies like Enron, aca demics know very little about the practicalities of white-collar crime. The reason? There are no good data. A key fact of white-collar crime is that we hear about only the very slim fraction of people who are caught cheating. Most embezzlers lead quiet and theoretically happy lives; employees who steal company property are rarely detected.
With street crime, meanwhile, that is not the case. A mugging or a
46
burglary or a mun caught. A street cri the police, who go ademic papers b white-collar crime did the masters of if you don't know 1 what magnitude?
Paul Feldmalls The victim was Pa
When he started h based on the exper low on a street whc ri£cially high: Pel< but those bagel eat1 good ones) aboucJ :research has sh same item depen' Thaler, in his 198. sunbather would p nJy $1.50 for the
In the real wo c. He came roo >Ve 90 percenL
oying buc m Feldman�
The cost of bag the ear.
lCi.r three children was e.
Washington, he so
h.e morning, he would
y's snack room; he
and the leftovers. It
to 140 companies
ch analyst. He had
himself happy.
!-designed a beautiful
dman kept rigorous
�collected against the
to the penny, just how
a him? If so, what were
a company that did
a co steal more, or less?
·des a window onto
Cademics: white-collar
rollar crime, writ how
large and intractable
of a bagel man. But
isel away at the biggest
·es like Enron, aca
of white-collar crime.
ct of white-collar crime
·on of people who are
111d theoretically happy
a.re rarely detected. e case. A mugging or a
Schoolteachers and Sumo Wrestlers
burglary or a murder is usually tallied whether or not the criminal is caught. A street crime has a victim, who typically reports the crime to the police, who generate data, which in turn generate thousands of ac ademic papers by criminologists, socio�ogists, and economists. But white-collar crime presents no obvious victim. From whom, exactly, did the masters of Enron steal? And how can you measure something if you don't know to whom it, happened, or with what frequency, or in what magnitude?
Paul Feldman's bagel business was different. It did present a victim. The victim was Paul Feldman.
When he started his business, he expected a 95 percent payment rate, based on the experience at his own office. But just as crime tends to be low on a street where a police car is parked, the 95 percent rate was ar cificially high: Feldman's presence had deterred theft. Not only that, but those bagel eaters knew the provider and had feelings (presumably good ones) about him. A broad swath of psychological and economic research has shown that people will pay different amounts for the same item depending on who is providing it. The economist Richard Thaler, in his 1985 "Beer on the Beach" study, showed that a thirsty sunbather would pay $2.65 for a beer delivered from a resort hotel but only $1.50 for the same beer if it came from a shabby grocery store.
In the real world, Feldman learned to settle for less th<!,U 95 per t. He came to consider a company "honest" if its payment rate was ve 90 percent. He considered a rate between 80 and 90 percent
annoying but tolerable." If a company habitually paid below 80 per- t, Feldman might post a hectoring note, like this one:
The cost of bagels has gone up dramatically since the beginning of the year. Unfortunately, the number of bagels tha,t disappear
47
FREAKONOMICS
without being paid for has also gone up. Don't let that con
tinue. I don't imagine that you would teach your children to
cheat, so why do it yourselves?
In the beginning, Feldman left behind an open basket for the
cash, but too often the money vanished. Then he tried a coffee can
with a money slot in its plastic lid, which also proved too tempting.
In the end, he resorted to making small plywood boxes with a slot
cut into the top. The wooden box has worked well. Each year he drops
off about seven thousand boxes and loses, on average, just one to
theft. This is an intriguing statistic: the same people who routinely
steal more than 10 percent of his bagels almost never stoop to stealing
his money box-a tribute to the nuanced social calculus of theft.
From Feldman's perspective, an office worker who eats a bagel with
out paying is committing a crime; the office worker probably doesn't
think so. This distinction probably has less to do with the admittedly
small amount of money involved (Feldman's bagels cost one dollar
each, cream cheese included) than with the context of the "crime."
The same office worker who fails to pay for his bagel might also help
himself to a long slurp of soda while filling a glass in a self-serve
restaurant, but he is very unlikely to leave the restaurant without
paymg.
So what do the bagel data have to say? In recent years, there have
been two noteworthy trends in the overall payment rate. The first was
a long, slow decline that began in 1992. By the summer of 2001, the
overall rate had slipped to about 87 percent. But immediately after
September 11 of that year, the rate spiked a full 2 percent and hasn't
slipped much since. (If a 2 percent gain in payment doesn't sound like
much, think of it this way: the nonpayment rate fell from 13 to 11
percent, which amounts to a 15 percent decline in theft.) Because
many of Feldman's customers are affiliated with national security,
48
there may have been a parrio ; have represented a more gener
The data also show thar ones. An office with a few don 5 percent an office with a kw counterintuitive. In a bia� o bb vene around the bagel table, p you drop your money in che 1 comparison, bagel crime seems street crime per capita in rural cause a rural criminal is mor caught). Also, a smaller comm centives against crime, the mail
The bagel data also reflea: .h feet honesty. Weather, for insc
easant weather inspires peop' Jd weather, meanwhile, make
· and wind. Worst are the h a 2 percent drop in payme
• rheft, an effect on the same 11 Thanksgiving is nearly as bad;
, as is the week straddling holidays: the weeks rhaI iJJ
Columbus Day. The dilfen -cheating holidays represent:
. The high-cheating holie 'ecies and the high expeccatffi
Feldman has also reached so� _ based more on his experiel]j
that morale is a big facror employees like their boss ai:
µp. Don't let that con
reach your children to
an open basket for the
en he tried a coffee can
also proved too tempting.
plywood boxes with a slot
bl well. Each year he drops
on average, just one to
e people who routinely
nose never stoop to stealing
social calculus of theft.
who eats a bagel with
worker probably doesn't
s to do with the admittedly
s bagels cost one dollar
a glass in a self-serve
IVC the restaurant without
In recent years, there have
ent rate. The first was
the summer of 2001, the
a full 2 percent and hasn't
ent doesn't sound like
t rate fell from 13 to 11
with national security,
Schoolteachers and Sumo Wrestlers
there may have been a patriotic element to this 9/11 Effect. Or it may
have represented a more general surge in empathy.
The data also show that S!llaller offices are more honest than big
ones. An office with a few dozen employees generally outpays by 3 to
5 percent an office with a few hundred employees. This may seem
counterintuitive. In a,bigger office, a bigger crowd is bound to con
vene around the bagel table, providing more witnesses to make sure
you drop your money in the box. But in the big-office/small-office
comparison, bagel crime seems to mirror street crime. There is far less
street crime per capita in rural areas than in cities, in large part be
cause a rural criminal is more likely to be known (and therefore
caught). Also, a smaller community tends to exert greater social in
centives against crime, the main one being shame.
The bagel data also reflect how much personal mood seems to af
fect honesty. Weather, for instance, is a major factor. Unseasonably
pleasant weather inspires people to pay at a higher rate. Unseasonably
cold weather, meanwhile, makes people cheat prolifically; so do heavy
rain and wind. Worst are the holidays. The week of Christmas pro
duces a 2 percent drop in payment rates-again, a 15 percent increase
in theft, an effect on the same magnitude, in reverse, as that of 9/11.
Thanksgiving is nearly as bad; the week of Valentine's Day is also
lousy, as is the week straddling April 15. There are, however, a few
good holidays: the weeks that include the Fourth of July, Labor Day,
and Columbus Day. The difference in the two sets of holidays? The
low-cheating holidays represent little more than an extra day off from
work. Th� high-cheating holidays are fraught with miscellaneous
anxieties and the high expectations of loved ones .
Feldman has also reached some of his own conclusions about hon
esry, based more on his experience than the data. He has come to be
lieve that morale is a big factor-that an office is more honest when
che employees like their boss and their work. He also believes that
49
FREAKONOMICS
employees further up the corporate ladder cheat more than those
down below. He got this idea after delivering for years to one com
pany spread out over three floors-an executive floor on top and two
lower floors with sales, service, and administrative employees. (Feld
man wondered if perhaps the executives cheated out of an overdevel
oped sense of entitlement. What he didn't consider is that perhaps
cheating was how they got to be executives.)
If morality represents the way we would like the world to work and
economics represents how it actually does work, then the story of
Feldman's bagel business lies at the very intersection of morality and
economics. Yes, a lot of people steal from him, but the vast majority,
even though no one is watching over them, do not. This outcome
may surprise some people-including Feldman's economist friends,
who counseled him twenty years ago that his honor-system scheme
would never work. But it would not have surprised Adam Smith. In
fact, the theme of Smith's first book, The Theory of Moral Sentiments,
was the innate honesty of mankind. "How selfish soever man may be
supposed,' ' Smith wrote, "there are evidently some principles in his
nature, which interest him in the fortune of others, and render their
happiness necessary to him, though he derives nothing from it, except
the pleasure of seeing it."
There is a tale, "The Ring of Gyges," that Feldman sometimes tells
his economist friends. It comes from Plato's Republic. A student
named Glaucon offered � the story in response to a lesson by Socrates
who, like Adam Smith, argued that people are generally good even
without enforcement. Glaucon, like Feldman's economist friends, dis
agreed. He told of a shepherd named Gyges who stumbled upon a se
cret cavern with a corpse inside that wore a ring. When Gyges put on
the ring, he found that it made him invisible. With no one able to
50
monitor his behavior, Gyges the queen, murder the king. question: could any man res acts could not be witnessed!• no. But Paul Feldman sides knows that the answer, ar lea!
e than those
to one com
top and two
yees. (Feld-
1to work and
the story of
norality and
�t majority,
Ills outcome
rust friends,
item scheme
m Smith. In,
f Sentiments,
man may be
ciples in his
render their
1m it, except
!letimes tells
A student
rSocrates-
good even
friends, dis
i upon a se
�ges put on
one able to
Schoolteachers and Sumo Wrestlers
monitor his behavior, Gyges proceeded to do woeful things-seduce the queen, murder the king, and so on. Glaucon's story posed a moral question: could any man resist the temptation of evil if he knew his acts could not be witnessed? Glaucon seemed to think the answer was no. But Paul Feldman sides with Socrates and Adam Smith-for he knows that the answer, at least 87 percent of the time, is y es.
51