Assignment Kim

Mr.Garcia
Chapter3Randomness.pdf


 Randomness  and  Uncertainty  

Reports  that  say  that  something  hasn't  happened  are  always  interesting  to  me,  

because  as  we  know,  there  are  known  knowns;  there  are  things  we  know  we  know.  We  

also  know  there  are  known  unknowns;  that  is  to  say  we  know  there  are  some  things  we  

do  not  know.  But  there  are  also  unknown  unknowns  –  the  ones  we  don't  know  we  don't  

know.

– Donald Rumsfeld, February 12, 2002

 

What  do  randomness  and  uncertainty  have  to  do  with  clear  thinking?    Isn’t  

randomness  the  antithesis  of  thinking?    It  might  be  surprising  that  there  is  an  element  of  

randomness  in  most  things  we  do.  Without  randomness,  we  would  get  exactly  the  same  

result  each  time  we  repeated  the  exact  same  action.    The  drive  to  work  would  be  

completely  predictable,  friends  would  always  react  the  same  way,  and  sports  would  be  

boring  to  watch.    Even  if  something  seems  like  it  should  be  completely  predictable,  

inherent  variability  comes  into  play.    If  the  alarm  is  set  for  exactly  the  same  time  every  

day,  there  are  still  bound  to  be  a  few  minutes  of  difference  between  the  time  it  goes  off  

and  the  time  you  are  ready  to  leave  each  day.    Traffic  is  affected  by  any  number  of  

variables: weather, the number of other drivers, problems caused by other drivers

cutting  in  and  out  of  lanes,  and  road  construction.  On  any  given  day,  these  factors  may  

or  may  not  affect  what  is  usually  a  fairly  predictable  trip.    While  clear  thinking  isn’t  the  

result  of  randomness,  it  acknowledges  and  accounts  for  randomness  and  uncertainty  

when  making  choices  and  planning  for  the  future.  


Most  people  chronically  underestimate  the  effects  of  randomness.  Good  luck  

rarely  gets  the  credit  it  deserves,  while  bad  luck  receives  too  much  blame.      When  we  

make  plans,  we  often  forget  to  factor  in  variability.      

Randomness  is  intrinsic  to  the  laws  of  probability,  with  which  people  also  have  

trouble.    However,  it  should  be  given  its  due.    Many  times  one’s  efforts  seem  to  be  

highly  effective  when,  in  reality,  external  circumstances  may  be  more  responsible  for  

success.    The  converse  is  also  true  –  a  great  decision  cannot  always  compensate  for  the  

effects  of  the  economy,  Mother  Nature,  or  changing  consumer  tastes.    

  What  are  the  benefits  of  understanding  randomness  and  uncertainty?    With  the  

flood  of  information  we  are  constantly  subject  to,  we  need  to  know  what  to  believe  and  

what to ignore and how to use information to make realistic decisions. Too few people
understand the difference between correlation and causality, or know how to judge whether a
new product or medical treatment will make any difference in their well-being, whether an
investment is worth the risk, or what news is credible. A little skepticism about claims can go a long way

toward  developing  a  realistic  view  of  the  world.    Understanding  of  the  variability  of  the  

conditions  that  shape  our  decisions  will  foster  improved  choices  and  plans.      The  ability  

to  recognize  that  something  is  a  coincidence,  and  not  inherently  meaningful,  keeps  us  

from  developing  false  beliefs.      It’s  important  to  know  when  information  is  reliable  and  

when  it’s  not.    Choices,  both  personal  and  professional,  work  out  better  when  they  are  

based  on  reality,  not  assumptions  or  misperceptions.  

What  is  Randomness?  


What  does  it  mean  for  something  to  be  “random”?    People  use  the  word  random  

to  describe  events  that  are  unexpected  or  seem  to  be  unrelated  to  the  topic  at  hand  

(“That  was  a  random  comment”).    The  typical  definition  is  “without  any  discernable  

pattern.”    An  easy  way  to  understand  randomness  is  to  look  at  examples  from  gambling.  

The bouncing ping-pong balls that determine lottery winners are drawn at

random.    Every  ball  has  an  equal  chance  of  being  selected  every  time  the  lottery  is  

played,  despite  beliefs  about  lucky  numbers  or  relatives’  birthdays.    Although  the  balls  

can’t  remember  which  ones  were  drawn  in  the  past,  some  people  persist  in  trying  to  

find  patterns,  thinking  they  will  improve  their  chances  of  winning  the  jackpot.      

Many  people  don’t  know  what  randomness  looks  like.    If  someone  were  asked  to  

pick  a  random  number  between  1  and  50,  few  would  select  1  or  50,  even  though  those  

numbers are as likely as something more “random-sounding” like 19 or 37. If we flipped
a coin six times and saw the patterns HTHTHT, HHHTTT, HHHHHT, TTTTTT, and
HHTHTT, most people would say the last one is random, but the others aren’t. The truth
is that they are all equally likely, because each coin toss is an independent event – the
coin doesn’t remember what the outcome of the last flip was. Even though we
eventually expect roughly equal proportions of heads and tails from repeated flips, it takes many,
many flips to get this kind of result. This is due to the “law of large numbers.”
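The claim that these patterns are equally likely can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes a fair coin and independent flips, and it compares all patterns at six flips each, since only sequences of equal length are directly comparable. The function name `sequence_probability` is made up for this example.

```python
from itertools import product


def sequence_probability(sequence: str, p_heads: float = 0.5) -> float:
    """Probability of seeing this exact run of H/T from independent flips."""
    prob = 1.0
    for flip in sequence:
        prob *= p_heads if flip == "H" else (1.0 - p_heads)
    return prob


# Five six-flip patterns, from very "orderly" to "random-looking".
patterns = ["HTHTHT", "HHHTTT", "HHHHHT", "TTTTTT", "HHTHTT"]
probabilities = {s: sequence_probability(s) for s in patterns}
for s, p in probabilities.items():
    print(s, p)  # every pattern has probability (1/2)**6 = 1/64

# Sanity check: the 64 possible six-flip sequences together have probability 1.
total = sum(sequence_probability("".join(seq)) for seq in product("HT", repeat=6))
print(total)
```

No single pattern is favored; what differs is only how "patterned" each one looks to us.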

Simply put, the law of large numbers says that as the number of trials (flips of a
coin, dice rolls, spins of a roulette wheel, pulls of a slot machine lever, etc.) increases, the
average result gets closer to the expected value (in this case, 50% heads and 50%
tails). While a long series of trials will converge on the expected value, short series
seldom do. Most people know that there is supposed to be a 50:50 chance of heads or
tails, but relatively few understand that this is the long-run outcome. When there is a
streak of several heads or tails in a row, it seems surprising.
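A small simulation makes the law of large numbers concrete. This is a sketch under stated assumptions: a fair coin, a pseudo-random number generator standing in for real flips, and an arbitrary seed chosen only so the run is repeatable. The helper `heads_proportion` is a hypothetical name.

```python
import random


def heads_proportion(n_flips: int, rng: random.Random) -> float:
    """Flip a simulated fair coin n_flips times and return the share of heads."""
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips


rng = random.Random(2024)  # arbitrary seed, used only for repeatability
results = {n: heads_proportion(n, rng) for n in (10, 100, 10_000, 200_000)}
for n, proportion in results.items():
    print(f"{n:>7} flips: {proportion:.4f} heads")
```

Short runs can land well away from 50%, while the longest run should sit very close to it. The short runs are not "wrong"; they simply have not had enough trials to settle.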

One  phenomenon  that  sports  fans  wholeheartedly  believe  in  is  the  “hot  hand.”    

This  is  the  idea  that  an  athlete  is  on  a  winning  streak  (or  conversely,  a  losing  streak).    

The  usual  explanations  point  to  momentum  or  the  confidence  from  one  success  leading  

to  another  success.    From  a  probability  perspective,  a  hot  hand  implies  that  when  a  

player  scores,  the  probability  that  he  or  she  will  score  on  the  next  try  should  be  higher  

than average. Psychologists Robert Vallone, Tom Gilovich, and Amos Tversky wondered whether the

hot  hand  could  be  documented,  so  they  analyzed  the  shooting  records  of  each  player  on  

the  Philadelphia  76ers  for  48  games.    Much  to  the  dismay  of  players,  coaches,  and  fans,  

they  found  no  evidence  of  a  hot  hand  for  any  player.    The  reaction  to  this  finding  was,  

and continues to be, disbelief. However, think back to the coin-flipping example;

remember  that  a  series  of  flips  doesn’t  usually  alternate  between  heads  and  tails,  even  

though  the  average  over  the  long  run  is  50:50.    In  a  short  series,  a  streak  of  heads  or  

tails  may  not  look  random,  but  it  is.    It’s  the  same  with  the  hot  hand.    Great  players  

make more shots than average players, but the likelihood that a player will make the

next  shot  isn’t  a  function  of  the  last  shot.    Since  people  are  generally  not  very  good  at  

recognizing  randomness,  and  the  idea  that  momentum  and  confidence  affect  

performance  is  very  appealing,  the  myth  of  the  hot  hand  rings  true  despite  reality.  
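To see how often streaks show up with no hot hand at all, we can simulate a shooter whose every shot is independent. The numbers here are assumptions for illustration only: a 50% shooter, 20 shots per game, and 10,000 simulated games. They are not taken from the 76ers data.

```python
import random


def longest_streak(shots):
    """Length of the longest run of consecutive made shots."""
    best = current = 0
    for made in shots:
        current = current + 1 if made else 0
        best = max(best, current)
    return best


def simulate_games(n_games, shots_per_game, p_make, rng):
    """Longest streak in each simulated game of independent shots."""
    return [
        longest_streak([rng.random() < p_make for _ in range(shots_per_game)])
        for _ in range(n_games)
    ]


rng = random.Random(7)  # arbitrary seed
streaks = simulate_games(n_games=10_000, shots_per_game=20, p_make=0.5, rng=rng)
share_with_long_streak = sum(s >= 4 for s in streaks) / len(streaks)
print(f"Games with 4 or more makes in a row: {share_with_long_streak:.1%}")
```

Under these assumptions, close to half of the games contain a run of four or more makes, even though no shot depends on the one before it.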

The  “gambler’s  fallacy”  is  another  common  belief.    When  someone  is  betting  on  

a  random  outcome,  like  a  particular  number  on  a  roulette  wheel,  a  common  


misperception  is  that  the  longer  he  or  she  goes  without  winning,  the  more  likely  the  

desired  number  is  to  come  up.    The  problem  is  that  each  spin  is  independent  and  the  

roulette wheel has no memory. Luck doesn’t self-correct. The same is true for slot

machines,  the  lottery,  and  just  about  any  other  kind  of  gambling.  Thinking  that  they  are  

due  to  win  on  the  next  spin,  or  the  one  after  that,  or  maybe  the  one  after  that,  gamblers  

keep  betting,  often  ending  up  with  significant  financial  losses.    
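The gambler's fallacy can be tested the same way. The sketch below assumes an American-style wheel with 38 pockets and a bettor who always plays the same number. Both choices, along with the seed and the 30-loss threshold, are illustrative assumptions. It compares the overall win rate with the win rate on spins that come right after a long losing streak.

```python
import random

POCKETS = 38  # assumption: American wheel (1-36 plus 0 and 00)
TARGET = 17   # arbitrary pocket the bettor keeps playing


def win_rate_after_losses(n_spins, losing_streak, rng):
    """Return (overall win rate, win rate on spins following a long losing streak)."""
    losses_in_a_row = 0
    wins = wins_after_streak = spins_after_streak = 0
    for _ in range(n_spins):
        won = rng.randrange(POCKETS) == TARGET
        if losses_in_a_row >= losing_streak:
            # This spin follows at least `losing_streak` straight losses.
            spins_after_streak += 1
            wins_after_streak += won
        wins += won
        losses_in_a_row = 0 if won else losses_in_a_row + 1
    return wins / n_spins, wins_after_streak / spins_after_streak


overall_rate, after_streak_rate = win_rate_after_losses(
    n_spins=300_000, losing_streak=30, rng=random.Random(11)
)
print(f"Overall win rate:          {overall_rate:.4f}")
print(f"Win rate after 30+ losses: {after_streak_rate:.4f}")
print(f"Theoretical 1/38:          {1 / POCKETS:.4f}")
```

Both rates should come out near 1/38. A long drought does not make the next spin any more likely to win.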

  What  do  these  examples  have  to  do  with  everyday  life?    You  don’t  have  to  be  a  

gambler  to  encounter  problems  caused  by  misunderstanding  randomness  or  probability.      

Believing  that  success  will  continue  based  on  prior  success  can  lead  to  overconfidence  

and  less  careful  decision  making.    Continuing  to  make  risky  decisions  in  an  expectation  

that  a  win  is  due  is  wishful  thinking.    There  are  three  main  areas  in  decision  making  

where  understanding  randomness  will  help  you  make  better  choices  and  plans:  

• Understanding  cause  and  effect  

• Developing  more  accurate  expectations  about  future  outcomes    

• Being  a  smart  consumer  of  information  

Understanding  Cause  and  Effect  

Many athletes swear by pre-game rituals to give them an edge, from lucky shirts
to a specific way to tie shoes to special foods. Michael Jordan, the famed Chicago Bulls

basketball  player,  always  wore  his  University  of  North  Carolina  uniform  shorts  under  his  

Chicago  uniform.    These  rituals  may  give  athletes  a  boost  of  confidence,  but  do  they  

really  cause  better  performance?  


On  a  more  serious  note,  a  number  of  parents  in  the  U.S.  refuse  to  vaccinate  their  

children  against  childhood  diseases  such  as  measles  and  whooping  cough.    The  basis  for  

this  practice  was  a  now  widely  discredited  paper  by  Andrew  Wakefield,  a  British  doctor  

who  claimed  that  childhood  vaccination  caused  autism.      He  subsequently  lost  his  

medical  license  for  falsifying  data.      Still,  some  Hollywood  celebrities  helped  spread  the  

idea  that  vaccines  contain  harmful  ingredients  that  cause  autism,  giving  legitimacy  to  

the anti-vaccination trend in the eyes of some parents. Despite wide agreement in the

medical  community  that  there  is  no  link  between  vaccines  and  autism,  many  parents  

persist  in  refusing  vaccinations  for  their  children.  

Vaccination  provides  “herd  immunity”  –  if  the  majority  of  a  population  is  

immune  to  a  disease,  it’s  much  less  likely  to  spread  widely.    In  populations  where  the  

anti-vaccination movement is strong, diseases such as measles, mumps, whooping

cough  and  chicken  pox  are  on  the  rise.    For  most  healthy  individuals,  these  illnesses  

cause  minor  discomfort  for  a  few  days.    However,  for  those  with  a  compromised  

immune  system  or  infants  too  young  to  be  vaccinated,  these  illnesses  can  be  severe  or  

even  fatal.    How  can  we  determine  whether  vaccination  causes  autism?  

If  you  have  ever  taken  a  statistics  course,  you  will  have  heard  “Correlation  does  

not  imply  causation.”      Correlation  is  a  measure  of  the  relationship  between  two  

variables,  such  as  total  revenue  and  the  amount  of  money  spent  on  advertising  or  time  

spent exercising and cardiovascular health. Correlation is necessary to demonstrate
causal relationships, but it’s not enough. Two variables can be highly correlated, such
that an effect is present when a possible cause is present and absent when the
possible cause is absent, without one causing the other. That’s because other variables might be responsible. For

example,  deaths  from  drowning  are  highly  correlated  with  ice  cream  consumption.    

When  ice  cream  consumption  is  high,  deaths  by  drowning  are  high.    When  ice  cream  

consumption  is  low,  deaths  by  drowning  decrease.    Would  water  safety  be  improved  if  

the  ice  cream  supply  were  restricted?    Do  people  go  back  into  the  water  too  soon  after  

eating  ice  cream?      In  this  case,  the  answer  is  obvious.  There  is  a  correlation  between  

deaths  by  drowning  and  ice  cream  consumption  because  both  swimming  (and,  

unfortunately,  drowning)  and  eating  ice  cream  occur  more  frequently  in  hot  weather  

and  less  frequently  in  cold  weather.    
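The ice cream example can be mimicked with made-up data. In the sketch below, temperature drives both ice cream sales and drownings, and neither affects the other. All coefficients and noise levels are invented for illustration, and the helpers `pearson` and `residuals` are hypothetical names written out by hand to keep the example self-contained.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def residuals(ys, xs):
    """What is left of ys after removing its straight-line relationship with xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return [y - (my + slope * (x - mx)) for x, y in zip(xs, ys)]


rng = random.Random(1)  # arbitrary seed
temps = [rng.uniform(0, 35) for _ in range(5000)]      # made-up daily temperatures
ice_cream = [2.0 * t + rng.gauss(0, 10) for t in temps]  # depends only on temperature
drownings = [0.3 * t + rng.gauss(0, 2) for t in temps]   # depends only on temperature

raw_r = pearson(ice_cream, drownings)
adjusted_r = pearson(residuals(ice_cream, temps), residuals(drownings, temps))
print(f"Correlation, ignoring temperature:      {raw_r:.2f}")
print(f"Correlation, after removing temperature: {adjusted_r:.2f}")
```

The raw correlation is strong, but once the effect of temperature is removed from both variables it drops to roughly zero. That is the signature of a third variable doing the work.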

To  assess  whether  a  causal  relationship  exists  between  two  variables,  we  need  

information  about  each  variable.    Let’s  look  at  the  relationship  between  vaccination  and  

autism.    The  variables  are  whether  or  not  a  child  is  vaccinated  and  whether  or  not  the  

child is diagnosed with autism. According to the Centers for Disease Control and Prevention, the current
prevalence of autism in the U.S. is about 1.5% among children aged 3 to 10. With a
sample of 100,000 children of whom 10% are not vaccinated, this is what we would

expect  to  see.  

             Vaccinated    Not Vaccinated
Autism            1,350               150
No Autism        88,650             9,850
Total            90,000            10,000


The  number  of  autism  cases  is  proportional  to  the  number  of  children  in  each  

group.    There  are  more  autism  cases  in  the  vaccinated  group  because  there  are  9  times  

as  many  children,  not  because  they  were  vaccinated.  
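The expected counts follow directly from the figures given: 100,000 children, 10% unvaccinated, and a 1.5% prevalence applied equally to both groups. That equal prevalence is the "no relationship" assumption. A minimal sketch of the arithmetic, with `expected_counts` as a made-up helper name:

```python
def expected_counts(sample_size, share_unvaccinated, prevalence):
    """Expected 2x2 table if autism prevalence is the same in both groups."""
    unvaccinated = round(sample_size * share_unvaccinated)
    vaccinated = sample_size - unvaccinated
    table = {}
    for group, size in (("Vaccinated", vaccinated), ("Not Vaccinated", unvaccinated)):
        cases = round(size * prevalence)
        table[group] = {"Autism": cases, "No Autism": size - cases, "Total": size}
    return table


table = expected_counts(sample_size=100_000, share_unvaccinated=0.10, prevalence=0.015)
for group, counts in table.items():
    print(group, counts)
```

The vaccinated group has nine times as many cases only because it has nine times as many children. The rate within each group is the same 1.5%.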

If  vaccinations  did  cause  autism,  our  table  should  look  more  like  this.  

             Vaccinated    Not Vaccinated
Autism           90,000                 0
No Autism             0            10,000
Total            90,000            10,000

Of  course,  there  might  be  cases  of  autism  unrelated  to  vaccination,  and  not  

every  vaccinated  child  would  end  up  with  an  autism  diagnosis,  so  these  numbers  are  an  

exaggeration.    But  the  general  pattern  would  look  like  this.  

Here’s  what  you  need  to  determine  cause  and  effect:  

                  Cause Present    Cause Not Present
Effect Present    Yes              No
Effect Absent     No               Yes

 

If the possible cause is present, the effect should follow the majority of the
time; cases where the cause is present but the effect is absent should be rare. If the
possible cause is absent, the effect should usually be absent as well; cases where the
effect appears without the cause should also be rare.
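One way to put all four cells to work is to compare how often the effect occurs with the cause and without it. The sketch below applies that comparison to the two vaccination tables from earlier in the chapter. The function name `effect_rates` is hypothetical, and this comparison is only a first check, not a full test of causality.

```python
def effect_rates(cause_effect, cause_no_effect, no_cause_effect, no_cause_no_effect):
    """Rate of the effect when the cause is present vs. when it is absent."""
    rate_with_cause = cause_effect / (cause_effect + cause_no_effect)
    rate_without_cause = no_cause_effect / (no_cause_effect + no_cause_no_effect)
    return rate_with_cause, rate_without_cause


# Expected counts if vaccination and autism are unrelated.
observed = effect_rates(1_350, 88_650, 150, 9_850)
# Exaggerated counts if vaccination always caused autism.
exaggerated = effect_rates(90_000, 0, 0, 10_000)

print("Unrelated:  ", observed)     # the two rates are the same
print("Exaggerated:", exaggerated)  # the two rates are as far apart as possible
```

When the two rates match, the possible cause tells us nothing about the effect. The larger the gap, the more the data look like the Yes/No pattern in the table.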

Why  do  people  falsely  believe  that  one  thing  causes  another,  when  in  reality  

there  is  no  relationship?    Essentially,  they  only  look  at  one  cell  of  the  table  above  –  the  


cell  for  Cause  Present  and  Effect  Present.    When  two  events  happen  close  together,  

people  sometimes  think  the  first  one  caused  the  second  one.    They  forget  to  check  

whether  other  causes  account  for  the  effect  or  whether  the  effect  ever  happens  without  

the  possible  cause.      

Interestingly,  even  pigeons  can  be  conditioned  to  act  “superstitious”  by  

providing  food  at  predictable  intervals  that  have  nothing  to  do  with  the  bird’s  behavior.    

(Pigeons  are  usually  trained  by  receiving  food  after  they  perform  a  specific  task.)  The  

pigeons  engage  in  behaviors  like  whirling  around  or  flapping  their  wings  in  a  certain  way  

–  whatever  they  were  doing  when  the  food  first  arrived.    They  look  as  though  they  

believe  their  behavior  caused  the  food  to  appear  and  continue  to  repeat  the  specific  

behavior  so  the  food  will  keep  coming.  

When  people  hold  strong  beliefs,  they  are  likely  to  see  causality  when  there  is  

only  coincidence.    In  the  case  of  superstitious  sports  stars,  a  good  performance  

coincides  with  a  lucky  shirt  (or  meal,  socks,  etc.).    When  the  athlete  seeks  a  reason  for  

the  performance,  attention  falls  on  the  shirt.    Superstitions  like  this  are  harmless,  but  

when  mistaken  beliefs  about  causality  affect  public  health  and  policy  decisions,  we  are  

worse  off.  

In  business  settings,  there  are  numerous  occasions  when  it’s  important  to  know  

whether  two  variables  have  a  causal  connection.    Do  training  programs  improve  

employee  performance?    If  more  funds  are  allocated  to  the  social  media  budget,  will  

brand  image  improve  in  proportion  to  the  extra  spending?    Does  increased  customer  

satisfaction  really  increase  sales?    Many  online  firms  conduct  A/B  testing  to  determine  


whether  one  variable  has  a  causal  relationship  with  another.    Too  often,  businesses  

don’t have the luxury of conducting those real-world experiments and must work with the

data  that  are  available.    In  these  cases,  it’s  important  to  look  at  all  the  information  that  

bears  on  the  question,  not  just  that  which  supports  the  idea  of  a  causal  relationship.  

Expectations About the Future

Will  the  future  be  like  the  past?  

It’s  human  nature  to  wonder  what  will  happen  in  the  future.    Most  of  us  end  up  

basing  our  predictions  on  our  prior  experiences,  or  those  of  people  we  know.    When  

thinking  about  how  you  will  do  on  a  final  exam,  it’s  natural  to  think  about  how  well  you  

did  on  the  midterm.    If  you  have  an  exceptionally  good  meal  at  a  restaurant,  you  look  

forward  to  sampling  it  again.    How  could  randomness  be  part  of  predicting  your  

performance  on  an  exam  or  the  quality  of  a  restaurant  meal?        If  you  aced  the  midterm,  

shouldn’t  you  expect  to  ace  the  final?      

You  may  well  ace  the  final,  but  making  that  prediction  just  on  the  basis  of  your  

midterm  score  is  a  mistake.    Performance  on  exams,  quality  of  restaurant  meals,  stock  

prices,  race  times,  heights  of  siblings,  download  speeds,  and  almost  anything  else  that  

can  be  measured  are  a  combination  of  an  average  performance  plus  some  random  

variation.    Performance  varies  from  one  time  to  the  next,  so  a  truly  exceptional  

performance  (either  positive  or  negative)  is  unlikely  to  be  followed  by  another  that  is  

equally  exceptional.      This  is  due  to  a  phenomenon  called  regression  to  the  mean.    The  

basic  principle  is  that  over  time,  extreme  values  are  followed  by  more  moderate  values.    


Simply put, scores typically return to their long-run average. That doesn’t mean

extreme  values  can’t  be  followed  by  other  extreme  values,  just  that  it’s  unlikely.    With  

no  additional  information,  the  average  value  is  the  best  prediction.  
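Regression to the mean falls out of the "average performance plus random variation" idea. In the sketch below, each simulated student has a stable ability, and each exam score is that ability plus independent noise. The means, spreads, and seed are invented for illustration, and scores are not capped at 100.

```python
import random

rng = random.Random(42)  # arbitrary seed
N = 10_000

# Each student's score = stable ability + random variation on the day.
ability = [rng.gauss(75, 8) for _ in range(N)]
midterm = [a + rng.gauss(0, 8) for a in ability]
final = [a + rng.gauss(0, 8) for a in ability]


def mean(values):
    return sum(values) / len(values)


# Pick the students in roughly the top 10% on the midterm.
cutoff = sorted(midterm)[int(0.9 * N)]
top = [i for i in range(N) if midterm[i] >= cutoff]

top_midterm_avg = mean([midterm[i] for i in top])
top_final_avg = mean([final[i] for i in top])
overall_avg = mean(final)

print(f"Top group, midterm average: {top_midterm_avg:.1f}")
print(f"Top group, final average:   {top_final_avg:.1f}")
print(f"All students, final average: {overall_avg:.1f}")
```

The top midterm group still beats the class average on the final, because part of its score was real ability. But it scores noticeably lower than it did on the midterm, because part of that earlier score was good luck that does not repeat.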

If  a  student  consistently  aces  all  exams,  his  or  her  average  performance  is  pretty  

high and the student may well ace the next one. For a more typical student, an
exceptionally high or low score will likely be followed by something closer to his or her

usual  score.      If  a  restaurant  meal  is  exceptional,  it’s  more  likely  that  the  next  one  won’t  

stand  out  as  much  unless  the  average  quality  is  very  high.  

An easy way to understand this is to think about people’s heights. This is

actually  where  the  idea  of  regression  to  the  mean  originated,  with  British  scientist  

Francis  Galton  in  1886.    He  noted  that  very  tall  people  usually  had  tall  children,  but  at  

least  some  of  them  were  shorter  than  their  parents.    Very  short  people  usually  had  

short  children,  but  at  least  some  of  them  were  taller  than  their  parents.    If  the  children  

of  tall  people  were  always  taller  than  their  parents,  eventually  their  descendants  would  

be  extremely  tall.    The  same  holds  for  short  people.    Without  regression  to  the  mean,  

the  range  for  adult  human  height  eventually  might  go  from  1  foot  to  12  feet,  or  even  

more  extreme  sizes.  

Regression  to  the  mean  should  be  taken  into  account  when  making  plans  and  

predictions.    One  of  several  factors  contributing  to  the  2008  recession  was  an  unrealistic  

belief  that  housing  prices  only  went  in  one  direction  –  up.    Had  that  been  the  case,  the  

risky  loans  made  to  homebuyers  with  bad  credit  and  few  resources  would  have  been  

secured  by  continually  appreciating  assets.    Instead,  as  was  inevitable,  home  prices  fell.    


Because  so  many  risky  loans  had  been  made,  a  cascade  of  bad  debt  severely  impacted  

the  economy.  

A  similar  phenomenon  is  the  “Sports  Illustrated  effect,”  where  some  people  

believe  a  team  that  appears  on  the  cover  of  Sports  Illustrated  will  be  jinxed  and  perform  

worse  following  the  cover  feature.    Similarly,  the  performance  of  CEOs  who  appear  on  

the  cover  of  Business  Week  often  declines  following  the  cover  story.    Does  this  publicity  

really  affect  performance?    It’s  much  more  likely  that  the  events  that  prompted  the  

athletes  and  executives  to  be  featured  on  magazine  covers  were  outliers  and  their  

performance  returned  to  historic  averages  after  the  magazine  covers  appeared.  

The problem with over-specified plans

When  we  think  about  the  future,  we  often  engage  in  daydreaming  about  what  

we  think  our  lives  will  be  like  when  we  finish  graduate  school,  have  a  new  job,  move  to  a  

different  part  of  the  country,  or  whatever  other  event  we  hope  will  actually  happen.    

The  more  detail  we  add,  the  more  real  it  seems.    Daydreaming  about  the  details  of  your  

future  life  is  fun,  but  it  shouldn’t  be  the  basis  of  planning.    While  details  make  your  

daydreams  seem  more  real,  the  more  detail  you  add,  the  less  likely  it  is  that  those  

details  will  be  correct.  

This  may  seem  counterintuitive,  but  the  reason  lies  with  a  simple  rule  of  

probability. The probability of two independent events co-occurring is never higher
than the probability of either individual event, and it is lower whenever both events
are possible but neither is certain. Probabilities are always between 0 and
1: a probability of 0 means the event will never happen and a probability of 1 means
that it is certain to happen. To determine the joint probability of two independent events
co-occurring (e.g., taking a specific job in a specific city) you multiply the individual

probabilities.    So  if  you  have  a  20%  chance  of  being  hired  for  a  specific  job  and  a  30%  

chance  of  finding  a  job  in  a  specific  city,  the  probability  of  both  happening  is  6%.    Every  

time  a  detail  is  added,  the  joint  probability  is  reduced.    We  will  see  more  about  the  

probability  of  multiple  events  in  later  chapters.  
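The multiplication rule is easy to try out. The first calculation below uses the job and city figures from the text. The two extra details added afterwards (and their probabilities) are invented purely to show how each added detail shrinks the joint probability. The rule assumes the events are independent.

```python
from math import prod


def joint_probability(probabilities):
    """Probability that all events happen, assuming they are independent."""
    return prod(probabilities)


# From the text: 20% chance of the specific job, 30% chance of the specific city.
job_and_city = joint_probability([0.20, 0.30])
print(f"Job and city: {job_and_city:.2%}")

# Hypothetical extra details, e.g. a particular neighborhood and a particular salary.
plan = [0.20, 0.30, 0.50, 0.40]
running = [joint_probability(plan[:k]) for k in range(1, len(plan) + 1)]
for k, p in enumerate(running, start=1):
    print(f"{k} detail(s): {p:.2%}")
```

With four details, the fully specified plan has only about a 1% chance of coming true exactly as imagined, even though no single detail is especially unlikely.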

So,  how  should  people  think  about  the  future?    Do  we  need  to  be  statisticians  

before  we  can  start  making  good  plans?    Should  uncertainty  strike  fear  into  our  hearts?    

Absolutely  not.  The  most  important  thing  to  remember  is  that  there  is  variability  around  

future  events.    Rather  than  making  plans  depend  on  a  specific  outcome,  we  need  to  try  

to  figure  out  a  likely  range  of  outcomes.      Remember  that  trends  rarely  continue  in  a  

single  direction  indefinitely.    Investment  firms  always  include  the  statement,  “Past  

performance  does  not  guarantee  future  results.”    It’s  true  well  beyond  the  domain  of  

stock  prices.    Rather  than  evoking  fear,  accounting  for  uncertainty  will  lead  to  plans  that  

are  more  realistic  and  flexible.  

The  best  way  to  account  for  uncertainty  is  to  first  establish  what  is  known  and  

what  is  unknown,  then  develop  estimates  for  the  likelihood  of  different  situations.    With  

the  combination  of  what  is  known  and  what  is  estimated,  different  contingency  plans  

can  be  developed.    This  may  seem  a  bit  formal,  but  for  important  decisions  it’s  worth  

taking  the  time  to  be  as  accurate  as  possible.  

Following some significant intelligence failures, such as the prediction before the
2003 Iraq War that weapons of mass destruction would be found in Iraq, the
Intelligence Advanced Research Projects Activity funded research into how to improve


predictions.    In  response,  psychologists  Philip  Tetlock  and  Barbara  Mellers  developed  

the  Good  Judgment  Project  to  understand  the  characteristics  of  people  who  were  good  

at  predictions  and  what  might  make  them  even  better.    The  key  factors  turned  out  to  be  

training  in  basic  probability  theory,  education  about  cognitive  biases,  and  working  in  a  

team  that  included  both  specialists  and  generalists.      Keeping  track  of  results  and  

forming  teams  of  “superforecasters”  led  to  accuracy  that  was  almost  double  that  of  

people  with  no  training.      

Being a Smart Consumer of Information

More  than  60  years  ago,  Darrell  Huff  published  a  small  book  titled  How  to  Lie  

with  Statistics.  The  purpose  of  the  book  was  to  help  people  understand  how  statistics  in  

the  news  and  advertising  could  be  technically  correct,  but  misleading,  depending  on  the  

purpose  of  the  news  report  or  the  ad.    This  slim  volume  had  dozens  of  printings  and  

ultimately  over  half  a  million  copies  were  purchased.    The  examples  Huff  used  were  tied  

to  1950s  era  concerns,  but  decades  later  the  underlying  message  is  still  important.  

We  hear  statistics  about  government,  sports,  political  races,  traffic  accidents,  

crime  and  a  myriad  of  other  topics.    Are  we  in  a  recession  or  a  recovery?    How  can  the  

unemployment  rate  go  up  when  more  new  jobs  are  being  created?    The  news  is  full  of  

reports  about  purported  causes  of  cancer,  heart  disease,  and  other  health  issues.    

Advertising  makes  promises  that  products  will  make  us  more  attractive,  energetic,  and  

slimmer.    Should  we  eat  dark  chocolate  for  its  antioxidants  or  avoid  it  because  it  might  

contribute  to  obesity  and  diabetes?    Should  we  run  for  cardiovascular  health  or  walk  to  

avoid  joint  damage?      Do  we  need  to  buy  a  standing  desk  to  avoid  the  effects  of  too  


much  sitting?    We  often  forget  that  news  programs  shape  their  programming  to  

maximize  ratings  and  advertisements  are  designed  to  influence  our  spending,  not  to  

help  us  make  good  decisions.  

Many  of  us  glaze  over  at  the  mention  of  statistics.    But  statistics  enables  us  to  

summarize information in order to learn about the world. Statistics is a tool for
understanding whether a change has happened and whether variables are related: a
way to detect a signal in the noise of randomness. Unfortunately, someone with an

agenda  can  easily  “lie  with  statistics”  to  mislead  us.    We  don’t  have  to  look  too  far  for  

examples.  

During the lead-up to the Brexit vote, in which Britain voted to leave the
European Union, the Vote Leave group repeatedly claimed that the United Kingdom
sent £350 million every week to the European Union. This was true – but something
was missing. The European Union refunded about two-thirds of that amount, so the net
figure was actually £100 to £125 million.
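The gap between the headline number and the net number is simple arithmetic. The sketch below uses only the two figures given above and treats "about two-thirds" as exactly two-thirds, which is an assumption made for the sake of the calculation.

```python
gross_weekly_millions = 350  # headline figure, in millions of pounds per week
refund_share = 2 / 3         # assumption: "about two-thirds" taken as exactly 2/3

net_weekly_millions = gross_weekly_millions * (1 - refund_share)
print(f"Net weekly figure: about £{net_weekly_millions:.0f} million")
```

The result falls inside the £100 to £125 million range, roughly a third of the number that appeared in the campaign.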

A recently published study reported in the Wall Street Journal (8-29-16) was

titled  “Eating  Fruit  While  Pregnant  May  Boost  Your  Baby’s  Intelligence,”  with  a  subtitle  

of  “Infants  whose  mothers  ate  more  fruit  were  smarter  one  year  after  birth,  a  

preliminary  study  shows.”    Fruit  is  part  of  a  healthy  diet,  so  this  news  is  not  exactly  

earthshaking.    However,  the  claim  that  the  fruit  eaten  during  pregnancy  is  the  reason  

for  a  baby’s  higher  intelligence  is  stretching  what  the  scientists  found.    Researchers  

looked  at  cognitive  development  scores  for  688  infants  and  related  the  scores  to  data  

from  a  survey  the  mothers  completed  during  pregnancy.    The  finding  was  that  there  was  


a statistically significant relationship between self-reported fruit consumption and a

composite  of  the  scores  on  the  Bayley  Scales  of  Infant  and  Toddler  Development  at  age  

one. Test scores are not the same as intelligence, and the increase in scores was 2.38
points per serving of fruit, a small fraction of the standard deviation of the Bayley Scales, which
have a mean of 100 and a standard deviation of 15. The authors of the research study

were  careful  to  state  that  these  results  are  preliminary  and  that  cognitive  development  

scores  at  one  year  don’t  predict  cognitive  development  scores  at  the  age  of  three.  The  

journalist  made  a  claim  in  a  catchy  headline  about  intelligence,  but  the  researchers  were  

talking  about  test  scores  at  age  one,  not  intelligence,  which  is  a  much  more  complex  

concept.  
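To judge how large a 2.38-point gain really is, it helps to express it in standard deviation units. The sketch below does that with the figures from the text. The percentile step additionally assumes scores follow a normal distribution, which is an assumption added here, not something the study is quoted as saying.

```python
from statistics import NormalDist

BAYLEY_MEAN = 100
BAYLEY_SD = 15
gain_per_serving = 2.38  # points per serving of fruit, from the text

# Size of the gain relative to the typical spread of scores.
effect_in_sd = gain_per_serving / BAYLEY_SD
print(f"Gain in standard deviation units: {effect_in_sd:.2f}")

# Assumption: scores are normally distributed with the stated mean and SD.
scores = NormalDist(mu=BAYLEY_MEAN, sigma=BAYLEY_SD)
percentile_after = scores.cdf(BAYLEY_MEAN + gain_per_serving)
print(f"An average infant would move from the 50th to about the "
      f"{percentile_after:.0%} percentile")
```

A shift of about one-sixth of a standard deviation is modest. It is a long way from the headline's suggestion of a "smarter" baby.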

Questions  to  Ask  

There  are  a  few  things  to  keep  in  mind  when  someone  is  using  statistics  to  

support  a  point  of  view.    In  How  to  Lie  With  Statistics,  Darrell  Huff  characterized  these  

issues  in  a  chapter  titled  “How  to  Talk  Back  to  a  Statistic.”  

Who  Benefits?    

First,  does  the  sponsor  of  the  research  have  a  reason  to  favor  one  side  of  the  

argument?    Here  are  two  examples  from  nutritional  research  where  this  question  

needed  to  be  asked.    The  California  Walnut  Commission  sponsored  a  study  that  found  

eating walnuts improved the health of people at risk for diabetes. Another study, which found
that Concord grape juice improved driving performance and spatial memory among
mothers of pre-teens, included an author who was an employee of a major grape juice
provider. It’s entirely possible that these findings are legitimate, but in many cases,

studies  that  are  funded  by  organizations  with  a  vested  interest  in  the  results  tend  to  

show  more  positive  findings  than  studies  funded  by  neutral  organizations.    

How  Do  They  Know?  

   

What  Sample?    

  A  second  issue  to  consider  is  the  nature  of  the  sample.    Two  factors  matter  here:    

the  size  of  the  sample  and  how  the  people  in  it  were  selected.    When  a  sample  is  large,  

the  data  it  provides  is  more  likely  to  be  true  of  the  population  the  sample  represents  

because  of  the  law  of  large  numbers.    When  the  sample  is  small,  you  really  can’t  draw  

solid  conclusions  from  the  data.      
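
The effect of sample size can be illustrated with a small simulation. The sketch below is a hypothetical example, not a description of any real poll: it assumes that 40% of a population holds some opinion, draws many random samples of a given size, and measures how much the sample estimates vary. The 40% rate, the sample sizes, and the function name are all assumptions chosen for illustration.

```python
import random
import statistics

def spread_of_sample_estimates(true_rate, sample_size, trials=300, seed=1):
    """Draw `trials` random samples and return the standard deviation
    of the proportions estimated from those samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    estimates = []
    for _ in range(trials):
        # Count how many simulated respondents hold the opinion.
        hits = sum(rng.random() < true_rate for _ in range(sample_size))
        estimates.append(hits / sample_size)
    return statistics.pstdev(estimates)

# Hypothetical population in which 40% hold the opinion.
small = spread_of_sample_estimates(0.4, sample_size=25)
large = spread_of_sample_estimates(0.4, sample_size=2500)
print(f"Typical error with 25 respondents:   {small:.3f}")
print(f"Typical error with 2500 respondents: {large:.3f}")
```

Estimates from the small samples scatter widely around the true 40%, while estimates from the large samples cluster tightly around it. Note that the simulation assumes truly random selection; a large sample that is selected badly can still mislead.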

Problems  with  sample  selection  occur  for  a  number  of  different  reasons.    The  

ideal  sample  is  one  that  accurately  represents  the  population  of  interest.    Finding  a  truly  

random  sample  to  answer  a  pollster’s  survey  is  difficult.    If  you  select  people  from  a  

telephone  directory,  you’ll  miss  the  growing  number  of  those  who  only  use  cell  phones.    

With  the  prevalence  of  caller  ID,  many  people  won’t  answer  the  phone  unless  they  

recognize  the  caller.    If  your  survey  is  online,  you  miss  the  population  that  doesn’t  use  

the  Internet.      

There  are  many  reputable  polling  organizations  that  take  pains  to  sample  

respondents  and  report  statistics  properly.    Gallup,  Pew  Research,  Harris  and  NORC  

(National  Opinion  Research  Center)  all  apply  sophisticated  approaches  to  sampling  and  

analyzing  opinion  data,  so  you  can  be  confident  in  what  organizations  like  these  report.  

Which  Average?  

There  is  a  joke  about  Microsoft  founder  Bill  Gates  walking  into  a  bar  and  

everyone  in  the  bar  being  happy  because  their  average  income  just  went  up  

dramatically.    Technically,  a  scenario  like  that  would  be  true  (about  the  average,  not  

necessarily  the  happiness)  –  if  the  mean  is  the  average  that  you  use.    Income  

distributions  are  almost  always  positively  skewed,  meaning  that  there  are  some  

individuals  whose  income  is  high  enough  to  distort  the  mean  in  a  positive  direction.    If  

the  distribution  weren’t  skewed,  the  mean  would  be  very  close  to  two  other  average  

measures  –  the  median  and  the  mode.    The  median  is  the  number  that  divides  the  

distribution  in  two,  so  that  half  of  the  people  make  less  than  the  median  and  half  make  

more.  Medians  are  usually  used  to  report  income,  housing  prices  and  other  government  

statistics  because  they  aren’t  sensitive  to  extreme  values  like  Bill  Gates’s  income.      The  

mode  is  the  most  frequent  value  in  a  distribution  and  isn’t  used  as  commonly  as  means  

and  medians.    You  would  use  a  mode  if  you  wanted  to  figure  out  which  item  (or  flavor  

or  size)  was  the  most  popular.    So,  when  you  hear  a  news  story  that  reports  average  

income,  prices,  scores  on  educational  tests,  or  any  of  a  host  of  other  topics,  keep  in  

mind  which  average  is  being  reported.  
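
The bar joke can be worked through in a few lines of Python. The incomes below are invented for illustration, including the round figure used for the billionaire; only the standard library's statistics module is used.

```python
import statistics

# Hypothetical annual incomes (in dollars) for ten bar patrons; invented for illustration.
patrons = [32_000, 38_000, 41_000, 45_000, 45_000,
           52_000, 58_000, 61_000, 67_000, 75_000]

# A billionaire walks in. The figure is a round placeholder, not a real income.
with_billionaire = patrons + [1_000_000_000]

for label, incomes in [("Before", patrons), ("After", with_billionaire)]:
    print(label,
          "mean:", round(statistics.mean(incomes)),
          "median:", statistics.median(incomes),
          "mode:", statistics.mode(incomes))
```

The mean jumps from about $51,000 to about $91 million, while the median moves only from $48,500 to $52,000 and the mode stays at $45,000. No one in the bar is actually better off, which is why the median is the more informative average for skewed data.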

What’s  Missing?    

When  a  new  medical  study  comes  out,  we  are  often  warned  that  the  risk  of  

contracting  a  disease  is  increased  by  50%  among  people  who  fit  a  certain  profile  or  

promised  that  a  new  drug  will  reduce  the  time  required  to  recover  from  an  illness  by  

20%. What is left out is what is called the “base rate”: how many people are affected by

the  disease  or  how  long  people  are  typically  sick.    For  example,  Tamiflu  is  widely  

prescribed  for  the  flu  because  it  cuts  the  duration  of  the  illness  by  20%  when  taken  

within  36  hours  of  symptoms.    The  flu  will  make  most  people  miserable,  but  the  misery  

usually  lasts  about  5  to  7  days  without  medication.    Tamiflu  reduces  the  duration  by  

20%, to about 4 to 6 days (from 123 hours with a placebo to 98 hours with the drug,

according  to  a  2015  study).        
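
The arithmetic behind the Tamiflu example is easy to check. This small sketch uses only the two durations quoted above; the variable names are illustrative.

```python
# Durations quoted in the text, in hours.
placebo_hours = 123
drug_hours = 98

hours_saved = placebo_hours - drug_hours          # absolute benefit
relative_reduction = hours_saved / placebo_hours  # the advertised percentage

print(f"Relative reduction: {relative_reduction:.0%}")
print(f"Absolute reduction: {hours_saved} hours, or about {hours_saved / 24:.1f} day")
```

The same result can be described as a 20% reduction or as roughly one day less of illness. The second description is only available if you know the base rate, that is, how long the flu usually lasts.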

Since  1997,  direct  to  consumer  advertising  for  pharmaceuticals  has  become  

widespread  in  the  U.S.    Although  ads  must  include  disclosures  about  possible  side  

effects,  they  rarely  discuss  the  risks  and  benefits  of  drugs  in  a  transparent  way.    Most  

ads  mention  benefits  as  a  relative  risk,  such  as  a  50%  reduction  in  developing  a  disease.    

What  is  missing  is  absolute  risk,  without  which  you  can’t  tell  whether  the  50%  reduction  

is  meaningful.    Does  the  50%  reduction  mean  that  only  100  of  1000  people  would  

develop  the  disease  compared  to  200  of  1000  without  the  drug?    Or  does  it  mean  that  

only  1  of  1000  people  would  develop  the  disease,  compared  to  2  of  1000  people  

without  the  drug?    The  50%  reduction  in  relative  risk  is  correct  in  both  cases,  but  the  

extent  of  the  absolute  risk  is  different  by  two  orders  of  magnitude.    You  can’t  really  get  

an  idea  of  the  risk  unless  you  know  the  base  rate.    That’s  why  (from  a  marketing  

perspective)  many  pharmaceutical  ads  mention  benefits  only  in  relative  terms  without  

including  information  about  the  absolute  risk.  
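
The two scenarios above can be compared directly. The sketch below is a minimal illustration using the numbers from the text; the function and its field names are assumptions made for this example.

```python
def risk_summary(cases_without_drug, cases_with_drug, group_size=1000):
    """Return the relative and absolute risk reduction for a group of people."""
    avoided = cases_without_drug - cases_with_drug
    return {
        # Share of the original risk that the drug removes.
        "relative_reduction": avoided / cases_without_drug,
        # Share of all people in the group who are spared the disease.
        "absolute_reduction": avoided / group_size,
    }

common_disease = risk_summary(200, 100)  # 200 vs. 100 cases per 1000
rare_disease = risk_summary(2, 1)        # 2 vs. 1 case per 1000

for name, result in [("Common", common_disease), ("Rare", rare_disease)]:
    print(f"{name}: relative {result['relative_reduction']:.0%}, "
          f"absolute {result['absolute_reduction']:.1%}")
```

Both scenarios show a 50% relative reduction, but the drug spares 100 people per 1000 in the first case and only 1 person per 1000 in the second.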

Does  the  picture  tell  the  true  story?    

Many arguments are made using information presented in charts. Well-constructed

charts convey information more quickly than tables and make it easy to

understand  relationships  that  otherwise  might  be  difficult  to  discern.    Unfortunately,  

charts  are  susceptible  to  the  same  kinds  of  manipulation  as  statistics.      Can  you  tell  

what’s  wrong  with  the  following  chart?    It  documents  gun  deaths  over  time  in  Florida,  

with  a  special  emphasis  on  2005,  the  year  the  “Stand  Your  Ground”  law  was  passed.  

 

The vertical axis is inverted, starting at 1,000 at the bottom rather than zero, so what you might normally interpret

as a decline when the law was enacted in 2005 is actually a steep increase. This chart

drew  media  attention  because  it  was  so  misleading.  

  There  are  many  ways  charts  can  mislead.    As  in  this  example,  axes  can  be  

misleading,  especially  when  they  start  at  a  number  other  than  zero.    Pie  charts  are  often  

used  inappropriately  (they  should  only  be  used  to  indicate  proportions  within  a  whole),  

and  sometimes  add  to  more  than  100%.    Some  figures  on  infographics  represent  more  

of  a  difference  between  items  than  is  warranted,  because  the  area  of  the  figures  varies  

in  two  dimensions  when  the  numbers  they  represent  vary  only  in  one.    When  someone  

has  a  point  of  view  they  are  trying  to  sell  you,  be  sure  to  look  at  how  they  are  presenting  

the  data.  
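
The point about figures that vary in two dimensions comes down to simple arithmetic. The sketch below assumes an infographic in which an icon's width and height are both scaled in proportion to the number it represents; the function name is illustrative.

```python
def apparent_ratio(value_ratio: float) -> float:
    """If an icon's width and height are both scaled by value_ratio,
    its area (what the eye compares) scales by the square of that ratio."""
    return value_ratio ** 2

for ratio in (2, 3):
    print(f"A value {ratio} times larger is drawn {apparent_ratio(ratio):.0f} times larger in area")
```

A number that merely doubles ends up looking four times as large, so the picture exaggerates the difference.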

Applications      

  The  benefits  of  understanding  the  basics  of  randomness,  uncertainty,  and  

probability  are  similar  in  both  personal  and  managerial  settings.    You  will  be  at  a  

significant  advantage  because  the  evidence  is  that  far  too  few  people  understand  these  

topics,  even  those  who  are  educated.    You  will  be  less  susceptible  to  questionable  claims  

and  better  able  to  assess  possibilities.    Your  plans  will  account  for  uncertainty  and  be  

more  realistic.    There  are  two  major  types  of  benefits  associated  with  understanding  

randomness,  probability  and  uncertainty.    The  first  is  greater  clarity  in  your  thinking.    

The  second  is  that  you  will  be  able  to  make  plans  more  successfully.    Both  benefits  apply  

to  personal  and  business  life.  

Clarity     The  ability  to  discern  when  something  is  random  or  not  is  helpful  when  you  are  

trying  to  understand  why  things  happened  and  whether  a  causal  relationship  exists.    

When  you  see  a  true  causal  relationship,  your  actions  will  be  more  effective  and  you  will  

be  able  to  avoid  problems.    When  you  know  something  is  random,  you  can  stop  wasting  

time  trying  to  change  it.    You  won’t  be  fooled  into  thinking  something  will  succeed  just  

because  there’s  been  a  long  string  of  misses.  

  When  you  understand  the  principle  of  regression  to  the  mean,  you  will  have  

more  realistic  expectations  about  future  events.    Spectacularly  good  and  spectacularly  

bad  events  can  occur  to  anyone,  but  they  are  unlikely  to  be  repeated  and  shouldn’t  be  

taken  as  an  indication  of  how  future  events  will  unfold.    Investors  who  do  the  best  tend  

to be the ones who don’t react on the basis of day-to-day swings in the market. Instead,

they recognize that outliers occur on both the positive and negative side and focus on

the long-term return. The less fortunate investors are those who check their portfolios

daily,  reacting  to  what  is  essentially  random  noise.  
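
Regression to the mean can be seen in a simple simulation. The sketch below rests on a deliberately simplified, hypothetical model: each performer's result in a given year is a stable skill level plus an equal amount of independent luck. The model, the number of performers, and the function name are assumptions made for illustration, not a description of real markets.

```python
import random
import statistics

def simulate_regression(n=10_000, top_fraction=0.1, seed=42):
    """Return the average year-1 and year-2 results of the year-1 top performers."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    # Hypothetical model: result = stable skill + independent luck each year.
    skill = [rng.gauss(0, 1) for _ in range(n)]
    year1 = [s + rng.gauss(0, 1) for s in skill]
    year2 = [s + rng.gauss(0, 1) for s in skill]
    # Select the performers with the best year-1 results.
    top_count = int(n * top_fraction)
    top = sorted(range(n), key=lambda i: year1[i], reverse=True)[:top_count]
    return (statistics.mean(year1[i] for i in top),
            statistics.mean(year2[i] for i in top))

first_year, second_year = simulate_regression()
print(f"Top performers, year 1 average: {first_year:.2f}")
print(f"Same performers, year 2 average: {second_year:.2f}")
```

The year-1 stars are still above average in year 2, because part of their success was skill, but their average falls back toward the mean because the luck does not repeat.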

  Understanding  which  events  are  meaningful  and  which  are  just  noise  requires  a  

skeptical  eye.    Inclusion  of  base  rates  helps  you  understand  whether  a  risk  or  benefit  is  

significant  or  not.    Statistics  are  so  easily  distorted  that  it’s  worth  your  while  to  consider  

the  source  and  ask  the  basic  questions:  

• Who  says  so?    

• How  do  they  know?    

• Are  they  comparing  apples  to  apples?      

• Do  they  have  an  interest  in  a  particular  interpretation?  

Planning      

Planning  involves  making  choices  about  what  we  will  do  in  the  future  on  the  basis  of  

what  we  expect  the  state  of  the  world  to  be  in  the  future.    The  problem  is  that  the  

future is uncertain; as Benjamin Franklin famously noted, “… in this world

nothing  can  be  said  to  be  certain,  except  death  and  taxes.”    What  we  want  to  be  true  in  

the future doesn’t necessarily have an impact on what will happen. If you avoid smoking,

eat  wisely,  and  stay  fit,  you  will  be  more  likely  than  not  to  enjoy  a  long  and  energetic  

life,  but  there’s  no  guarantee.    You  may  want  to  win  the  lottery  and  quit  your  job,  but  

the  probability  remains  1  in  292  million,  so  you’ll  likely  need  to  find  an  alternative  for  

retirement.    Rare  events  do  happen,  but  they  are  by  definition  rare.      
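
The text does not say which lottery the 1 in 292 million figure describes. As an assumption for illustration, the sketch below uses a Powerball-style format in which a player must match 5 numbers drawn from 69 and 1 bonus number drawn from 26; counting the possible tickets under that assumption gives a figure of the same size.

```python
import math

def jackpot_combinations(main_pool=69, main_picks=5, bonus_pool=26):
    """Count the distinct tickets in a game where players choose `main_picks`
    numbers from `main_pool` plus one bonus number from `bonus_pool`.
    The default format is an assumption; the text does not name the lottery."""
    return math.comb(main_pool, main_picks) * bonus_pool

total = jackpot_combinations()
print(f"Possible tickets: {total:,}")
print(f"Chance of winning with one ticket: 1 in {total:,}")
```

Under these assumptions there are 292,201,338 equally likely tickets, so a single ticket has about a 1 in 292 million chance of winning.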

  How  can  understanding  randomness  and  probability  help  in  planning?    If  your  

plans  depend  on  economic  conditions,  competitors’  responses,  and  consumer  demand,  

you  are  already  well  aware  that  the  past  doesn’t  predict  the  future.  Certainly  the  

present  and  recent  past  provide  a  baseline  to  initiate  planning,  but  how  can  you  go  

beyond  looking  at  the  past  and  present  to  predict  the  most  likely  future?      

  As  mentioned  above,  regression  to  the  mean  should  be  taken  into  account  when  

trying  to  determine  whether  trends  are  likely  to  continue.    Extreme  results  are  most  

often  outliers,  so  unless  you  can  identify  the  specific  causes  and  can  expect  those  causal  

factors  to  continue  to  impact  your  business,  you  are  better  off  with  a  more  moderate  

forecast.    If  you  are  experiencing  phenomenal  success,  how  much  of  it  can  be  attributed  

to  you  or  your  firm’s  actions  and  how  much  can  be  attributed  to  external  factors?    

Similarly,  if  you’ve  had  a  disastrous  year,  can  you  identify  the  causes?    Was  it  something  

over  which  you  had  control?        

  To make good predictions, you need to identify those aspects of your life or

your  business  that  you  can’t  control.    For  each  of  these,  what  is  most  likely  to  happen?    

How  much  variability  exists?    For  example,  if  you  are  a  manufacturer,  what  factors  affect  

your  supply  chain  and  how  likely  are  they  to  occur?      The  2011  earthquake  and  

subsequent  tsunami  in  Japan  led  to  massive  shortages  in  the  automotive  supply  chain.    

These shortages affected more than Japanese carmakers: an estimated 350,000 to

400,000 fewer vehicles were produced in the US due to parts shortages. While it isn’t

possible  to  predict  specific  earthquakes,  Japan  is  part  of  the  “Ring  of  Fire”,  a  seismically  

active  area  that  stretches  around  the  Pacific  from  New  Zealand  to  Chile  and  is  home  to  

about  90%  of  the  world’s  earthquakes  and  most  of  the  active  volcanoes.    Earthquakes  

are  a  fact  of  daily  life  in  Japan,  although  most  are  quite  minor.    They  are  unpredictable  

as  far  as  timing,  but  they  are  unsurprising  due  to  Japan’s  location.    It  is  more  surprising  

that  automakers  did  not  already  have  plans  in  place  to  deal  with  the  aftermath  of  a  

severe  earthquake.    In  the  spring  of  2016,  two  major  earthquakes  again  struck  Japan,  

but  this  time  the  impact  on  the  supply  chain  was  less  severe  –  automakers  had  adopted  

a  policy  of  multiple  sources  for  parts.    While  they  didn’t  know  when  the  next  big  

earthquake would be, they knew it was coming eventually and developed a back-up

plan.  

  Our  plans  are  typically  affected  by  factors  that  are  much  more  predictable  than  

earthquakes.    Most  guides  to  business  planning  recommend  a  standard  list  of  items  to  

consider.    That’s  a  great  starting  point.    How  can  we  improve  on  that  list?  

  A useful planning exercise, developed by psychologist Gary Klein, is the

“pre-mortem.” In a pre-mortem, after you have spent some time developing plans, you are to

imagine  the  project  you  are  planning  has  failed,  then  come  up  with  as  many  plausible  

reasons  for  failure  as  you  can  in  two  minutes.  The  benefit  of  the  exercise,  which  is  

usually  done  with  other  members  of  your  workgroup,  is  that  you  have  to  think  carefully  

about threats to your success. In the process, issues often surface that no one has

thought much about but that many will recognize as potentially important. These are

examples  of  “unknown  unknowns,”  to  use  Donald  Rumsfeld’s  phrase.  

  Like  earthquakes  in  Japan,  severe  weather  events  can  be  hard  to  predict.    A  truly  

unusual  event,  like  a  blizzard  in  Georgia,  is  probably  an  outlier;  but  a  blizzard  in  Chicago  

is  a  typical  winter  event.    There  are  regions  of  the  US  where  floods,  blizzards  and  

tornadoes  occur  often  enough  to  be  included  as  a  risk  in  plans.    We  can’t  really  plan  for  

an  unexpected  extreme  event,  but  we  should  have  contingencies  in  place  for  the  

unsurprising  extreme  event,  the  “known  unknowns”.    

  Planning  should  include  estimations  of  probabilities  for  events  that  can  affect  

you  or  your  business,  along  with  what  the  consequences  of  those  events  are.    For  

example,  how  likely  is  a  significant  increase  in  the  price  of  gasoline?    If  you  drive  a  

hybrid  car,  it  wouldn’t  affect  you  significantly,  but  someone  with  a  fleet  of  delivery  

vehicles  could  be  severely  impacted.    The  probability  of  the  price  increase  is  the  same  in  

both  cases,  but  the  consequences  are  very  different.    Thinking  through  issues  in  this  way  

will  help  you  distinguish  the  risks  you  should  worry  about  from  the  ones  you  can  let  go.  

Along with probability estimates, remember that the probability of independent events

co-occurring is the product of their individual probabilities, so it is never higher, and often much lower, than the probability of either occurring alone.
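
For independent events, the probability that all of them occur is found by multiplying the individual probabilities. The sketch below illustrates this with invented numbers; the probabilities and the function name are assumptions chosen for the example, and the calculation is valid only when the events really are independent.

```python
import math

def joint_probability(probabilities):
    """Probability that all of several independent events occur:
    the product of their individual probabilities."""
    return math.prod(probabilities)

# Hypothetical risks: a 30% chance of a fuel price spike and,
# independently, a 20% chance of losing a key supplier.
both_risks = joint_probability([0.3, 0.2])
print(f"Both occur: {both_risks:.0%}")

# When one event is nearly certain, the joint probability is only slightly lower.
near_certain = joint_probability([0.99, 0.5])
print(f"Near-certain event plus a coin flip: {near_certain:.1%}")
```

The joint probability can never exceed the smaller of the individual probabilities, and it shrinks quickly as more uncertain events are added to the chain.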

  When  we  don’t  incorporate  randomness  and  uncertainty  into  our  thinking,  our  

vision  of  the  future  tends  to  be  flawed.    We  can  mistake  coincidence  for  causality  and  

develop  false  beliefs.    We  make  plans  as  though  the  present  state  of  the  world  will  

continue  into  the  future.    That’s  fine  as  a  starting  point,  but  it’s  important  to  remember  

that  the  future  comes  with  a  range  of  outcomes,  not  just  the  ones  we  want.    

 

Quick  Tips  to  Deal  with  Randomness    

Before  accepting  a  claim  that  one  thing  causes  another,  ask  yourself  

• Does  the  outcome  ever  occur  without  the  cause?  

• Does  the  cause  always  lead  to  the  outcome?  

• Does  the  person  making  the  claim  have  a  strong  belief  about  the  topic?  

When  assessing  risks,  be  sure  to  include  the  base  rate  for  the  risk  occurring.  

Planning  and  decisions  should  include  a  process  to  account  for  the  following:  

• What  is  the  most  likely  outcome  if  you  continue  your  current  actions?  

• Are  you  keeping  track  of  what  happened  as  a  result  of  prior  decisions?  

• What  are  the  uncontrollable  factors  in  your  situation?      

• What  is  the  range  of  outcomes  that  could  result  from  uncontrollable  factors?  

• Are  you  paying  attention  to  base  rates?  

When  you  hear  news  about  polls,  health,  the  economy,  and  potential  risks,  ask  yourself  

• Who  says  so?  

• How  do  they  know?  

• Is  the  quantitative  information  communicated  appropriately?  

• Are  comparisons  being  made  on  the  same  scale?