FOR YEARS YOU'VE BEEN TELLING US THAT RAIN IS GREEN

On Tuesday, August 23, 2005, an Air Force reconnaissance plane picked up signs of a disturbance over the Bahamas.1 There were "several small vortices," it reported, spirals of wind rotating in a counterclockwise motion from east to west---away from the expanse of the Atlantic and toward the United States. This disruption in wind patterns was hard to detect from clouds or from satellite data, but cargo ships were beginning to recognize it. The National Hurricane Center thought there was enough evidence to characterize the disturbance as a tropical cyclone, labeling it Tropical Depression Twelve. It was a "tricky" storm that might develop into something more serious or might just as easily dissipate; about half of all tropical depressions in the Atlantic Basin eventually become hurricanes.2

The depression strengthened quickly, however, and by Wednesday afternoon one of the Hurricane Center's computer models was already predicting a double landfall in the United States---a first one over southern Florida and a second that might "[take] the cyclone to New Orleans."3 The storm had gathered enough strength to become a hurricane and it was given a name, Katrina.4

Katrina's first landfall---it passed just north of Miami and then zoomed through the Florida Everglades a few hours later as a Category 1 hurricane---had not been prolonged enough to threaten many lives. But it had also not been long enough to take much energy out of the storm. Instead, Katrina was gaining strength in the warm waters of the Gulf of Mexico. In the wee hours of Saturday morning the forecast really took a turn for the worse: Katrina had become a Category 3 hurricane, on its way to being a Category 5. And its forecast track had gradually been moving westward, away from the Florida Panhandle and toward Mississippi and Louisiana. The computer models were now in agreement: the storm seemed bound for New Orleans.5

"I think I had five congressional hearings after Katrina." said Max Mayfield, who was director of the National Hurricane Center at the time the storm hit, when I asked him to recall when he first recognized the full magnitude of the threat. "One of them asked me when I first became concerned with New Orleans. I said 'Sixty years ago.'"

A direct strike of a major hurricane on New Orleans had long been every weather forecaster's worst nightmare. The city presented a perfect set of circumstances that might contribute to the death and destruction there. On the one hand there was its geography: New Orleans does not border the Gulf of Mexico as much as sink into it. Much of the population lived below sea level and was counting on protection from an outmoded system of levees and a set of natural barriers that had literally been washing away to sea.6 On the other hand there was its culture. New Orleans does many things well, but there are two things that it proudly refuses to do. New Orleans does not move quickly, and New Orleans does not place much faith in authority. If it did those things, New Orleans would not really be New Orleans. It would also have been much better prepared to deal with Katrina, since those are the exact two things you need to do when a hurricane threatens to strike.

The National Hurricane Center nailed its forecast of Katrina; it anticipated a potential hit on the city almost five days before the levees were breached, and concluded that some version of the nightmare scenario was probable more than forty-eight hours away. Twenty or thirty years ago, this much advance warning would almost certainly not have been possible, and fewer people would have been evacuated. The Hurricane Center's forecast, and the steady advances made in weather forecasting over the past few decades, undoubtedly saved many lives.

Not everyone listened to the forecast, however. About 80,000 New Orleanians7---almost a fifth of the city's population at the time---failed to evacuate the city, and 1,600 of them died. Surveys of the survivors found that about two-thirds of them did not think the storm would be as bad as it was.8 Others had been confused by a bungled evacuation order; the city's mayor, Ray Nagin, waited almost twenty-four hours to call for a mandatory evacuation, despite pleas from Mayfield and from other public officials. Still other residents---impoverished, elderly, or disconnected from the news---could not have fled even if they had wanted to.

Weather forecasting is one of the success stories in this book, a case of man and machine joining forces to understand and sometimes anticipate the complexities of nature. That we can sometimes predict nature's course, however, does not mean we can alter it. Nor does a forecast do much good if there is no one willing to listen to it. The story of Katrina is one of human ingenuity and human error.

The Weather of Supercomputers

The supercomputer labs at the National Center for Atmospheric Research (NCAR) in Boulder, Colorado, literally produce their own weather. They are hot: the 77 trillion calculations that the IBM Bluefire supercomputer makes every second generate a substantial amount of radiant energy. They are windy: all that heat must be cooled, lest the nation's ability to forecast its weather be placed into jeopardy, and so a series of high-pressure fans blast oxygen on the computers at all times. And they are noisy: the fans are loud enough that hearing protection is standard operating equipment.

The Bluefire is divided into eleven cabinets, each about eight feet tall and two feet wide with a bright green racing stripe running down the side. From the back, they look about how you might expect a supercomputer to look: a mass of crossed cables and blinking blue lights feeding into the machine's brain stem. From the front, they are about the size and shape of a portable toilet, complete with what appears to be a door with a silver handle.

"They look a little bit like Porta-Potties," I tell Dr. Richard Loft, the director of technology development for NCAR, who oversees the supercomputer lab.

Those in the meteorology business are used to being the butt of jokes. Larry David, in the show Curb Your Enthusiasm, posits that meteorologists sometimes predict rain when there won't be just so they can get a head start on everyone else at the golf course.9 Political commercials use weather metaphors as a basis to attack their opponents,10 usually to suggest that they are always flip-flopping on the issues. Most people assume that weather forecasters just aren't very good at what they do.

Indeed, it was tempting to look at the rows of whirring computers and wonder if this was all an exercise in futility: All this to forecast the weather? And they still can't tell us whether it's going to rain tomorrow?

Loft did not look amused. Improved computing power has not really improved earthquake or economic forecasts in any obvious way. But meteorology is a field in which there has been considerable, even remarkable, progress. The power of Loft's supercomputers is a big part of the reason why.

A Very Brief History of Weather Forecasting

"Allow me to deviate from the normal flight plan," Loft said back in his office. He proved to have a sense of humor after all-- -quirky and offbeat, like a more self-aware version of Dwight

Schrute from The Office.* From the very beginnings of history, Loft explained, man has tried to predict his environment. "You go back to Chaco Canyon or Stonehenge and people realized they could predict the shortest day of the year and the longest day of the year. That the moon moved in predictable ways. But there are things an ancient man couldn't predict. Ambush from some kind of animal. A flash flood or a thunderstorm."

Today we might take it for granted that we can predict where a hurricane will hit days in advance, but meteorology was very late to develop into a successful science. For centuries, progress was halting. The Babylonians, who were advanced astronomers, produced weather prognostications that have been preserved on stone tablets for more than 6,000 years.11 Ultimately, however, they deferred to their rain god Ningirsu. Aristotle wrote a treatise on meteorology12 and had a few solid intuitions, but all in all it was one of his feebler attempts. It's only been in the past fifty years or so, as computer power has improved, that any real progress has been made.

You might not think of the weather report as an exercise in metaphysics, but the very idea of predicting the weather evokes age-old debates about predestination and free will. "Is everything written, or do we write it ourselves?" Loft asked. "This has been a basic problem for human beings. And there really were two lines of thought.

"One comes through Saint Augustine and Calvinism," he continued, describing people who believed in predestination. Under this philosophy, humans might have the ability to predict the course they would follow. But there was nothing they could do to alter it. Everything was carried out in accordance with God's plan. "This is against the Jesuits and Thomas Aquinas who said we actually have free will. This question is about whether the world is predictable or unpredictable."

The debate about predictability began to be carried out on different terms during the Age of Enlightenment and the Industrial Revolution. Isaac Newton's mechanics had seemed to suggest that the universe was highly orderly and predictable, abiding by relatively simple physical laws. The idea of scientific, technological, and economic progress---which by no means could be taken for granted in the centuries before then---began to emerge, along with the notion that mankind might learn to control its own fate. Predestination was subsumed by a new idea, that of scientific determinism.

The idea takes on various forms, but no one took it further than Pierre-Simon Laplace, a French astronomer and mathematician. In 1814, Laplace made the following postulate, which later came to be known as Laplace's Demon:

We may regard the present state of the universe as the effect of its past and the cause of its future. An intellect which at a certain moment would know all forces that set nature in motion, and all positions of all items of which nature is composed, if this intellect were also vast enough to submit these data to analysis, it would embrace in a single formula the movements of the greatest bodies of the universe and those of the tiniest atom; for such an intellect nothing would be uncertain and the future just like the past would be present before its eyes.13

Given perfect knowledge of present conditions ("all positions of all items of which nature is composed"), and perfect knowledge of the laws that govern the universe ("all forces that set nature in motion"), we ought to be able to make perfect predictions ("the future just like the past would be present"). The movement of every particle in the universe should be as predictable as that of the balls on a billiard table. Human beings might not be up to the task, Laplace conceded. But if we were smart enough (and if we had fast enough computers) we could predict the weather and everything else---and we would find that nature itself is perfect.

Laplace's Demon has been controversial for all its two-hundred-year existence. At loggerheads with the determinists are the probabilists, who believe that the conditions of the universe are knowable only with some degree of uncertainty.* Probabilism was, at first, mostly an epistemological paradigm: it avowed that there were limits on man's ability to come to grips with the universe. More recently, with the discovery of quantum mechanics, scientists and philosophers have asked whether the universe itself behaves probabilistically. The particles Laplace sought to identify begin to behave like waves when you look closely enough---they seem to occupy no fixed position. How can you predict where something is going to go when you don't know where it is in the first place? You can't. This is the basis for the theoretical physicist Werner Heisenberg's famous uncertainty principle.14 Physicists interpret the uncertainty principle in different ways, but it suggests that Laplace's postulate cannot literally be true. Perfect predictions are impossible if the universe itself is random.

Fortunately, weather does not require quantum mechanics for us to study it. It happens at a molecular (rather than an atomic) level, and molecules are much too large to be discernibly impacted by quantum physics. Moreover, we understand the chemistry and Newtonian physics that govern the weather fairly well, and we have for a long time.

So what about a revised version of Laplace's Demon? If we knew the position of every molecule in the earth's atmosphere---a much humbler request than deigning to know the position of every atomic particle in the universe---could we make perfect weather predictions? Or is there a degree of randomness inherent in the weather as well?

The Matrix

Purely statistical predictions about the weather have long been possible. Given that it rained today, what is the probability that it will rain tomorrow? A meteorologist could look up all the past instances of rain in his database and give us an answer about that. Or he could look toward long-term averages: it rains about 35 percent of the time in London in March.15
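The conditional question posed above ("given that it rained today, what is the probability that it will rain tomorrow?") comes down to counting. What follows is a minimal Python sketch, not a description of how any weather service computes it. The function name `rain_after_rain` and the ten-day `record` are hypothetical and invented purely for illustration.

```python
def rain_after_rain(record):
    """Estimate P(rain tomorrow | rain today) from a list of daily booleans."""
    # Pair each day with the day that follows it.
    pairs = list(zip(record, record[1:]))
    # Keep only the pairs whose first day was rainy.
    followers = [tomorrow for today, tomorrow in pairs if today]
    if not followers:
        raise ValueError("no rainy day with a following day in the record")
    # True counts as 1, so the sum is the number of rainy follow-up days.
    return sum(followers) / len(followers)


# Hypothetical ten-day record: True means it rained that day.
record = [True, True, False, False, True, False, True, True, True, False]
print(rain_after_rain(record))  # 0.5
```

An estimate like this says nothing about why it rains, which is the limitation the next paragraph turns to.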

The problem is that these sorts of predictions aren't very useful---not precise enough to tell you whether to carry an umbrella, let alone to forecast the path of a hurricane. So meteorologists have been after something else. Instead of a statistical model, they wanted a living and breathing one that simulated the physical processes that govern the weather.

Our ability to compute the weather has long lagged behind our theoretical understanding of it, however. We know which equations to solve and roughly what the right answers are, but we just aren't fast enough to calculate them for every molecule in the earth's atmosphere. Instead, we have to make some approximations.

The most intuitive way to do this is to simplify the problem by breaking the atmosphere down into a finite series of pixels---what meteorologists variously refer to as a matrix, a lattice, or a grid. According to Loft, the earliest credible attempt to do this was made in 1916 by Lewis Fry Richardson, a prolific English physicist. Richardson wanted to determine the weather over northern Germany at a particular time: at 1 P.M. on May 20, 1910. This was not, strictly speaking, a prediction, the date being some six years in the past. But Richardson had a lot of data: a series of observations of temperature, barometric pressures and wind speeds that had been gathered by the German government. And he had a lot of time: he was serving in northern France as part of a volunteer ambulance unit and had little to do in between rounds of artillery fire. So Richardson broke Germany down into a series of two-dimensional boxes, each measuring three degrees of latitude (about 210 miles) by three degrees of longitude across. Then he went to work attempting to solve the chemical equations that governed the weather in each square and how they might affect weather in the adjacent ones.

FIGURE 4-1: RICHARDSON'S MATRIX: THE BIRTH OF MODERN WEATHER FORECASTING

Richardson's experiment, unfortunately, failed miserably16---it "predicted" a dramatic rise in barometric pressure that hadn't occurred in the real world on the day in question. But he published his results nevertheless. It certainly seemed like the right way to predict the weather---to solve it from first principles, taking advantage of our strong theoretical understanding of how the system behaves, rather than relying on a crude statistical approximation.

The problem was that Richardson's method required an awful lot of work. Computers were more suitable to the paradigm that he had established. As you'll see in chapter 9, computers aren't good at every task we hope they might accomplish and have been far from a panacea for prediction. But computers are very good at computing: at repeating the same arithmetic tasks over and over again and doing so quickly and accurately. Tasks like chess that abide by relatively simple rules, but which are difficult computationally, are right in their wheelhouse. So, potentially, was the weather.

The first computer weather forecast was made in 1950 by the mathematician John von Neumann, who used a machine that could make about 5,000 calculations per second.17 That was a lot faster than Richardson could manage with a pencil and paper in a French hay field. Still, the forecast wasn't any good, failing to do any better than a more-or-less random guess.

Eventually, by the mid-1960s, computers would start to demonstrate some skill at weather forecasting. And the Bluefire---some 15 billion times faster than the first computer forecast and perhaps a quadrillion times faster than Richardson---displays quite a bit of acumen because of the speed of computation. Weather forecasting is much better today than it was even fifteen or twenty years ago. But, while computing power has improved exponentially in recent decades, progress in the accuracy of weather forecasts has been steady but slow.

There are essentially two reasons for this. One is that the world isn't one or two dimensional. The most reliable way to improve the accuracy of a weather forecast---getting one step closer to solving for the behavior of each molecule---is to reduce the size of the grid that you use to represent the atmosphere. Richardson's squares were about two hundred miles by two hundred miles across, providing for at best a highly generalized view of the planet (you could nearly squeeze both New York and Boston---which can have very different weather---into the same two hundred by two hundred square). Suppose you wanted to cut the width of the squares in half, to a resolution of one hundred miles by one hundred miles. That improves the precision of your forecast, but it also increases the number of equations you need to solve. In fact, it would increase this number not twofold but fourfold---since you're doubling the number of squares both lengthwise and widthwise. That means, more or less, that you need four times as much computer power to produce a solution.

But there are more dimensions to worry about than just two. Different patterns can take hold in the upper atmosphere, in the lower atmosphere, in the oceans, and near the earth's surface. In a three-dimensional universe, a twofold increase in the resolution of our grid will require an eightfold increase in computer power.

And then there is the fourth dimension: time. A meteorological model is no good if it's static---the idea is to know how the weather is changing from one moment to the next. A thunderstorm moves at about forty miles per hour: if you have a three-dimensional grid that is forty by forty by forty across, you can monitor the storm's movement by collecting one observation every hour. But if you halve the dimensions of the grid to twenty by twenty by twenty, the storm will now pass through one of the boxes every half hour. That means you need to halve the time parameter as well---again doubling your requirement to sixteen times as much computing power as you had originally.

If this was the only problem it wouldn't be prohibitive. Although you need, roughly speaking, to get ahold of sixteen times more processing power in order to double the resolution of your weather forecast, processing power has been improving exponentially---doubling about once every two years.18 That means you only need to wait eight years for a forecast that should be twice as powerful; this is about the pace, incidentally, at which NCAR has been upgrading its supercomputers.
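The scaling argument in the preceding paragraphs can be checked with a few lines of arithmetic. The Python sketch below is illustrative only and rests on two assumptions: cost grows as the refinement factor raised to the number of dimensions, and processing power doubles every two years, as the text states. The function names `cost_multiplier` and `years_to_wait` are hypothetical.

```python
import math


def cost_multiplier(refinement, dimensions=4):
    """Factor by which the work grows when every dimension is refined."""
    return refinement ** dimensions


def years_to_wait(multiplier, doubling_period_years=2.0):
    """Years of exponential hardware growth needed to absorb the extra work."""
    return math.log2(multiplier) * doubling_period_years


# Doubling the resolution in two, three, and four dimensions.
for dims in (2, 3, 4):
    print(dims, cost_multiplier(2, dims))   # 2 4, then 3 8, then 4 16

# Sixteen times the work is four doublings, or eight years at this pace.
print(years_to_wait(cost_multiplier(2)))    # 8.0
```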

Say you've solved the laws of fluid dynamics that govern the movement of weather systems. They're relatively Newtonian: the uncertainty principle---interesting as it might be to physicists---won't bother you much. You've gotten your hands on a state-of-the-art piece of equipment like the Bluefire. You've hired Richard Loft to design the computer's software and to run its simulations. What could possibly go wrong?

How Chaos Theory Is Like Linsanity

What could go wrong? Chaos theory. You may have heard the expression: the flap of a butterfly's wings in Brazil can set off a tornado in Texas. It comes from the title of a paper19 delivered in 1972 by MIT's Edward Lorenz, who began his career as a meteorologist. Chaos theory applies to systems in which both of two properties hold:

1. The systems are dynamic, meaning that the behavior of the system at one point in time influences its behavior in the future;

2. And they are nonlinear, meaning they abide by exponential rather than additive relationships.

Dynamic systems give forecasters plenty of problems---as I describe in chapter 6, for example, the fact that the American economy is continually evolving in a chain reaction of events is one reason that it is very difficult to predict. So do nonlinear ones: the mortgage-backed securities that triggered the financial crisis were designed in such a way that small changes in macroeconomic conditions could make them exponentially more likely to default.

When you combine these properties, you can have a real mess. Lorenz did not realize just how profound the problems were until, in the tradition of Alexander Fleming and penicillin20 or the New York Knicks and Jeremy Lin, he made a major discovery purely by accident.

Lorenz and his team were working to develop a weather forecasting program on an early computer known as a Royal McBee LGP-30.21 They thought they were getting somewhere until the computer started spitting out erratic results. They began with what they thought was exactly the same data and ran what they thought was exactly the same code---but the program would forecast clear skies over Kansas in one run, and a thunderstorm in the next.

After spending weeks double-checking their hardware and trying to debug their program, Lorenz and his team eventually discovered that their data wasn't exactly the same: one of their technicians had truncated it in the third decimal place. Instead of having the barometric pressure in one corner of their grid read 29.5168, for example, it might instead read 29.517. Surely this couldn't make that much difference?

Lorenz realized that it could. The most basic tenet of chaos theory is that a small change in initial conditions---a butterfly flapping its wings in Brazil---can produce a large and unexpected divergence in outcomes---a tornado in Texas. This does not mean that the behavior of the system is random, as the term "chaos" might seem to imply. Nor is chaos theory some modern recitation of Murphy's Law ("whatever can go wrong will go wrong"). It just means that certain types of systems are very hard to predict.
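A toy system can reproduce the kind of divergence Lorenz stumbled on. The Python sketch below uses the logistic map, a standard textbook example of chaos. It is a stand-in chosen for brevity, not Lorenz's weather program. The two starting values are hypothetical: they borrow the digits of the pressure readings quoted above, rescaled to fit between 0 and 1.

```python
def logistic_trajectory(x0, steps, r=4.0):
    """Iterate x -> r * x * (1 - x), a simple deterministic chaotic system."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


full = logistic_trajectory(0.5168, 50)     # four-decimal starting value
rounded = logistic_trajectory(0.517, 50)   # same value cut to three decimals

# Distance between the two runs at every step.
gaps = [abs(a - b) for a, b in zip(full, rounded)]
print(f"initial gap: {gaps[0]:.4f}")
print(f"largest gap over 50 steps: {max(gaps):.4f}")
```

Both runs follow exactly the same rule with no randomness anywhere, yet after a handful of steps they bear little resemblance to each other.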

The problem begins when there are inaccuracies in our data. (Or inaccuracies in our assumptions, as in the case of mortgage-backed securities.) Imagine that we're supposed to be taking the sum of 5 and 5, but we keyed in the second number wrong. Instead of adding 5 and 5, we add 5 and 6. That will give us an answer of 11 when what we really want is 10. We'll be wrong, but not by much: addition, as a linear operation, is pretty forgiving. Exponential operations, however, extract a lot more punishment when there are inaccuracies in our data. If instead of taking 5^5 (five to the fifth power)---which should be 3,125---we instead take 5^6 (five to the sixth power), we wind up with an answer of 15,625. That's way off: our answer is five times too large.

This inaccuracy quickly gets worse if the process is dynamic, meaning that our outputs at one stage of the process become our inputs in the next. For instance, say that we're supposed to take five to the fifth power, and then take whatever result we get and apply it to the fifth power again. If we'd made the error described above, and substituted a 6 for the second 5, our results will now be off by a factor of more than 3,000.22 Our small, seemingly trivial mistake keeps getting larger and larger.

The weather is the epitome of a dynamic system, and the equations that govern the movement of atmospheric gases and fluids are nonlinear---mostly differential equations.23 Chaos theory therefore most definitely applies to weather forecasting, making the forecasts highly vulnerable to inaccuracies in our data.
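The arithmetic in the preceding paragraphs is easy to verify. The short Python check below simply restates the additive, exponential, and fed-back cases. The variable names are illustrative.

```python
# Linear operation: a one-unit slip moves the answer by one unit.
print(5 + 5, 5 + 6)                  # 10 11

# Exponential operation: the same slip multiplies the answer by five.
right = 5 ** 5
wrong = 5 ** 6
print(right, wrong, wrong // right)  # 3125 15625 5

# Dynamic process: each result is raised to the fifth power again.
right_twice = right ** 5
wrong_twice = wrong ** 5
print(wrong_twice // right_twice)    # 3125
```

The final ratio, 3,125, is the "factor of more than 3,000" mentioned above.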

Sometimes these inaccuracies arise as the result of human error. The more fundamental issue is that we can only observe our surroundings with a certain degree of precision. No thermometer is perfect, and if it's off in even the third or the fourth decimal place, this can have a profound impact on the forecast.

Figure 4-2 shows the output of fifty runs from a European weather model, which was attempting to make a weather forecast for France and Germany on Christmas Eve, 1999. All these simulations are using the same software, and all are making the same assumptions about how the weather behaves. In fact, the models are completely deterministic: they assume that we could forecast the weather perfectly, if only we knew the initial conditions perfectly. But small changes in the input can produce large differences in the output. The European forecast attempted to account for these errors. In one simulation, the barometric pressure in Hanover might be perturbed just slightly. In another, the wind conditions in Stuttgart are perturbed by a fraction of a percent. These small changes might be enough for a huge storm system to develop in Paris in some simulations, while it's a calm winter evening in others.

FIGURE 4-2: DIVERGENT WEATHER FORECASTS WITH SLIGHTLY DIFFERENT INITIAL CONDITIONS

This is the process by which modern weather forecasts are made. These small changes, introduced intentionally in order to represent the inherent uncertainty in the quality of the observational data, turn the deterministic forecast into a probabilistic one. For instance, if your local weatherman tells you that there's a 40 percent chance of rain tomorrow, one way to interpret that is that in 40 percent of his simulations, a storm developed, and in the other 60 percent---using just slightly different initial parameters---it did not.
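The idea of turning a deterministic model into a probability can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the logistic map stands in for a real weather model, a final state above `threshold` is arbitrarily labeled a "storm," and the names `toy_model` and `chance_of_storm` are hypothetical. Operational ensemble systems are far more elaborate.

```python
import random


def toy_model(x0, steps=30, r=4.0):
    """Deterministic stand-in for a weather model (the logistic map)."""
    x = x0
    for _ in range(steps):
        x = r * x * (1.0 - x)
    return x


def chance_of_storm(x0, members=50, noise=1e-4, threshold=0.6, seed=0):
    """Fraction of slightly perturbed runs that end above `threshold`."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    storms = 0
    for _ in range(members):
        # Nudge the starting condition to mimic measurement uncertainty.
        perturbed = x0 + rng.uniform(-noise, noise)
        if toy_model(perturbed) > threshold:
            storms += 1
    return storms / members


print(chance_of_storm(0.5168))
```

Each member is fully deterministic. The probability comes entirely from not knowing the starting point exactly, which is the interpretation of "a 40 percent chance of rain" given above.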

It is still not quite that simple, however. The programs that meteorologists use to forecast the weather are quite good, but they are not perfect. Instead, the forecasts you actually see reflect a combination of computer and human judgment. Humans can make the computer forecasts better or they can make them worse.

The Vision Thing

The World Weather Building is an ugly, butterscotch-colored, 1970s-era office building in Camp Springs, Maryland, about twenty minutes outside Washington. The building forms the operational headquarters of NOAA---the National Oceanic and Atmospheric Administration---which is the parent of the National Weather Service (NWS) on the government's organization chart.24 In contrast to NCAR's facilities in Boulder, which provide for sweeping views of the Front Range of the Rocky Mountains, it reminds one of nothing so much as bureaucracy.

The Weather Service was initially organized under the Department of War by President Ulysses S. Grant, who authorized it in 1870. This was partly because President Grant was convinced that only a culture of military discipline could produce the requisite accuracy in forecasting25 and partly because the whole enterprise was so hopeless that it was only worth bothering with during wartime when you would try almost anything to get an edge.

The public at large became more interested in weather forecasting after the Schoolhouse Blizzard of January 1888. On January 12 that year, initially a relatively warm day in the Great Plains, the temperature dropped almost 30 degrees in a matter of a few hours and a blinding snowstorm came.26 Hundreds of children, leaving school and caught unaware as the blizzard hit, died of hypothermia on their way home. As crude as early weather forecasts were, it was hoped that they might at least be able to provide some warning about an event so severe. So the National Weather Service was moved to the Department of Agriculture and took on a more civilian-facing mission.*

The Weather Service's origins are still apparent in its culture today. Its weather forecasters work around the clock for middling salaries27 and they see themselves as public servants. The meteorologists I met in Camp Springs were patriotic people, rarely missing an opportunity to remind me about the importance that weather forecasting plays in keeping the nation's farms, small businesses, airlines, energy sector, military, public services, golf courses, picnic lunches, and schoolchildren up and running, all for pennies on the dollar. (The NWS gets by on just $900 million per year28---about $3 per U.S. citizen---even though weather has direct effects on some 20 percent of the nation's economy.29)

Jim Hoke, one of the meteorologists I met, is the director of the NWS's Hydrometeorological Prediction Center. He is also a thirty-five-year veteran of the field, having taken his turn both on the computational side of the NWS (helping to build the computer models that his forecasters use) and on the operational side (actually making those forecasts and communicating them to the public). As such, he has some perspective on how man and machine intersect in the world of meteorology.

What is it, exactly, that humans can do better than computers that can crunch numbers at seventy-seven teraFLOPS? They can see. Hoke led me onto the forecasting floor, which consisted of a series of workstations marked with blue overhanging signs with such legends as MARITIME FORECAST CENTER and NATIONAL CENTER. Each station was manned by one or two meteorologists---accompanied by an armada of flat-screen monitors that displayed full-color maps of every conceivable type of weather data for every corner of the country. The forecasters worked quietly and quickly, with a certain amount of Grant's military precision.30

Some of the forecasters were drawing on these maps with what appeared to be a light pen, painstakingly adjusting the contours of temperature gradients produced by the computer models---fifteen miles westward over the Mississippi Delta, thirty miles northward into Lake Erie. Gradually, they were bringing them one step closer to the Platonic ideal they were hoping to represent.

The forecasters know the flaws in the computer models. These inevitably arise because---as a consequence of chaos theory---even the most trivial bug in the model can have potentially profound effects. Perhaps the computer tends to be too conservative on forecasting nighttime rainfalls in Seattle when there's a low-pressure system in Puget Sound. Perhaps it doesn't know that the fog in Acadia National Park in Maine will clear up by sunrise if the wind is blowing in one direction, but can linger until midmorning if it's coming from another. These are the sorts of distinctions that forecasters glean over time as they learn to work around the flaws in the model, in the way that a skilled pool player can adjust to the dead spots on the table at his local bar.

The unique resource that these forecasters were contributing was their eyesight. It is a valuable tool for forecasters in any discipline---a visual inspection of a graphic showing the interaction between two variables is often a quicker and more reliable way to detect outliers in your data than a statistical test. It's also one of those areas where computers lag well behind the human brain. Distort a series of letters just slightly---as with the CAPTCHA technology that is often used in spam or password protection---and very "smart" computers get very confused. They are too literal-minded, unable to recognize the pattern once it's subjected to even the slightest degree of manipulation. Humans, by contrast, out of pure evolutionary necessity, have very powerful visual cortexes. They rapidly parse through any distortions in the data in order to identify abstract qualities like pattern and organization---qualities that happen to be very important in different types of weather systems.

FIGURE 4-3: CAPTCHA

Indeed, back in the old days when meteorological computers weren't much help at all, weather forecasting was almost entirely a visual process. Rather than flat screens, weather offices were instead filled with a series of light tables, illuminating maps that meteorologists would mark with chalk or drafting pencils, producing a weather forecast fifteen miles at a time. Although the last light table was retired many years ago, the spirit of the technique survives today.

The best forecasters, Hoke explained, need to think visually and abstractly while at the same time being able to sort through the abundance of information the computer provides them with. Moreover, they must understand the dynamic and nonlinear nature of the system they are trying to study. It is not an easy task, requiring vigorous use of both the left and right brain. Many of his forecasters would make for good engineers or good software designers, fields where they could make much higher incomes, but they choose to become meteorologists instead.

The NWS keeps two different sets of books: one that shows how well the computers are doing by themselves and another that accounts for how much value the humans are contributing. According to the agency's statistics, humans improve the accuracy of precipitation forecasts by about 25 percent over the computer guidance alone,31 and temperature forecasts by about 10 percent.32 Moreover, according to Hoke, these ratios have been relatively constant over time: as much progress as the computers have made, his forecasters continue to add value on top of it. Vision accounts for a lot.

Being Struck by Lightning Is Increasingly Unlikely

When Hoke began his career, in the mid-'70s, the jokes about weather forecasters had some grounding in truth. On average, for instance, the NWS was missing the high temperature by about 6 degrees when trying to forecast it three days in advance (figure 4-4). That isn't much better than the accuracy you could get just by looking up a table of long-term averages. The partnership between man and machine is paying big dividends, however. Today, the average miss is about 3.5 degrees, meaning that almost half the inaccuracy has been stripped out.

FIGURE 4-4: AVERAGE HIGH TEMPERATURE ERROR IN NWS FORECASTS

Weather forecasters are also getting better at predicting severe weather. What are your odds of being struck---and killed---by lightning? Actually, this is not a constant number; they depend on how likely you are to be outdoors when lightning hits and unable to seek shelter in time because you didn't have a good forecast. In 1940, the chance of an American being killed by lightning in a given year was about 1 in 400,000.33 Today, it's just 1 chance in 11,000,000, making it almost thirty times less likely. Some of this reflects changes in living patterns (more of our work is done indoors now) and improvement in communications technology and medical care, but it's also because of better weather forecasts.

Perhaps the most impressive gains have been in hurricane forecasting. Just twenty-five years ago, when the National Hurricane Center tried to forecast where a hurricane would hit three days in advance of landfall, it missed by an average of 350 miles.34 That isn't very useful on a human scale. Draw a 350-mile radius outward from New Orleans, for instance, and it covers all points from Houston, Texas, to Tallahassee, Florida (figure 4-5). You can't evacuate an area that large.

FIGURE 4-5: IMPROVEMENT IN HURRICANE TRACK FORECASTING

Today, however, the average miss is only about one hundred miles, enough to cover only southeastern Louisiana and the southern tip of Mississippi. The hurricane will still hit outside that circle some of the time, but now we are looking at a relatively small area in which an impact is even money or better---small enough that you could plausibly evacuate it seventy-two hours in advance. In 1985, by contrast, it was not until twenty-four hours in advance of landfall that hurricane forecasts displayed the same skill. What this means is that we now have about forty-eight hours of additional warning time before a storm hits---and as we will see later, every hour is critical when it comes to evacuating a city like New Orleans.*

The Weather Service hasn't yet slain Laplace's Demon, but you'd think they might get more credit than they do. The science of weather forecasting is a success story despite the challenges posed by the intricacies of the weather system. As you'll find throughout this book, cases like these are more the exception than the rule when it comes to making forecasts. (Save your jokes for the economists instead.)

Instead, the National Weather Service often goes unappreciated. It faces stiff competition from private industry,35 competition that occurs on a somewhat uneven playing field. In contrast to most of its counterparts around the world, the Weather Service is supposed to provide its model data free of charge to anyone who wants it (most other countries with good weather bureaus charge licensing or usage fees for their government's forecasts). Private companies like AccuWeather and the Weather Channel can then piggyback off their handiwork to develop their own products and sell them commercially. The overwhelming majority of consumers get their forecast from one of the private providers; the Weather Channel's Web site, Weather.com, gets about ten times more traffic than Weather.gov.36

I am generally a big fan of free-market competition, or competition between the public and private sectors. Competition was a big part of the reason that baseball evolved as quickly as it did to better combine the insights gleaned from scouts and statistics in forecasting the development of prospects.

In baseball, however, the yardstick for competition is clear: How many ballgames did you win? (Or if not that, how many ballgames did you win relative to how much you spent.) In weather forecasting, the story is a little more complicated, and the public and private forecasters have differing agendas.

What Makes a Forecast Good?

"A pure researcher wouldn't be caught dead watching the Weather Channel, but lots of them do behind closed doors," Dr. Bruce Rose, the affable principal scientist and vice president at the Weather Channel (TWC), informed me. Rose wasn't quite willing to say that TWC's forecasts are better than those issued by the government, but they are different, he claimed, and
oriented more toward the needs of a typical consumer.

"The models typically aren't measured on how well they predict practical weather elements," he continued. "It's really important if, in New York City, you get an inch of rain rather than ten inches of snow.37 That's a huge [distinction] for the average consumer, but scientists just aren't interested in that."

Much of Dr. Rose's time, indeed, is devoted to highly pragmatic and even somewhat banal problems related to how customers interpret his forecasts. For instance: how to develop algorithms that translate raw weather data into everyday verbiage. What does bitterly cold mean? A chance of flurries? Just where is the dividing line between partly cloudy and mostly cloudy? The Weather Channel needs to figure this out, and it needs to establish formal rules for doing so, since it issues far too many forecasts for the verbiage to be determined on an ad hoc basis.

Sometimes the need to adapt the forecast to the consumer can take on comical dimensions. For many years, the Weather Channel had indicated rain on their radar maps with green shading (occasionally accompanied by yellow and red for severe storms). At some point in 2001, someone in the marketing department got the bright idea to make rain blue instead---which is, after all, what we think of as the color of water. The Weather Channel was quickly besieged with phone calls from outraged---and occasionally terrified---consumers, some of whom mistook the blue blotches for some kind of heretofore unknown precipitation (plasma storms? radioactive fallout?). "That was a nuclear meltdown," Dr. Rose told me. "Somebody wrote in and said, 'For years you've been telling us that rain is green---and now it's blue? What madness is this?'"

But the Weather Channel also takes its meteorology very seriously. And at least in theory, there is reason to think that they might be able to make a better forecast than the government. The Weather Channel, after all, gets to use all of the government's raw data as their starting point and then add whatever value they might be able to contribute on their own.

The question is, what is a "better" forecast? I've been defining it simply as a more accurate one. But there are some competing ideas, and they are pertinent in weather forecasting.

An influential 1993 essay38 by Allan Murphy, then a meteorologist at Oregon State University, posited that there were three definitions of forecast quality that were commonplace in the weather forecasting community. Murphy wasn't necessarily advocating that one or another definition was better; he was trying to facilitate a more open and honest conversation about them. Versions of these definitions can be applied in almost any field in which forecasts or predictions are made.

One way to judge a forecast, Murphy wrote---perhaps the most obvious one---was through what he called "quality," but which might be better defined as accuracy. That is, did the actual weather match the forecast?

A second measure was what Murphy labeled "consistency" but which I think of as honesty. However accurate the forecast turned out to be, was it the best one the forecaster was capable of at the time? Did it reflect her best judgment, or was it modified in some way before being presented to the public?

Finally, Murphy said, there was the economic value of a forecast. Did it help the public and policy makers to make better decisions?

Murphy's distinction between accuracy and honesty is subtle but important. When I make a forecast that turns out to be wrong, I'll often ask myself whether it was the best forecast I could have made given what I knew at the time. Sometimes I'll conclude that it was: my thought process was sound; I had done my research, built a good model, and carefully specified how much uncertainty there was in the problem. Other times, of course, I'll find that there was something I didn't like about it. Maybe I had too hastily dismissed a key piece of evidence. Maybe I had overestimated the predictability of the problem. Maybe I had been biased in some way, or otherwise had the wrong incentives.

I don't mean to suggest that you should beat yourself up every time your forecast is off the mark. To the contrary, one sign that you have made a good forecast is that you are equally at peace with however things turn out---not all of which is within your immediate control. But there is always room to ask yourself what objectives you had in mind when you made your decision.

In the long run, Murphy's goals of accuracy and honesty should converge when we have the right incentives. But sometimes we do not. The political commentators on The McLaughlin Group, for instance, probably cared more about sounding smart on television than about making accurate predictions. They may well have been behaving rationally. But if they were deliberately making bad forecasts because they wanted to appeal to a partisan audience, or to be invited back on the show, they failed Murphy's honesty-in-forecasting test.

Murphy's third criterion, the economic value of a forecast, can complicate matters further. One can sympathize with Dr. Rose's position that, for instance, a city's forecast might deserve more attention if it is close to its freezing point, and its precipitation might come down as rain, ice, or snow, each of which would have different effects on the morning commute and residents' safety. This, however, is more a matter of where the Weather Channel focuses its resources and places its emphasis. It does not necessarily impeach the forecast's accuracy or honesty. Newspapers strive to ensure that all their articles are accurate and honest, but they still need to decide which ones to put on the front page. The Weather Channel has to make similar decisions, and the economic impact of a forecast is a reasonable basis for doing so.

There are also times, however, when the goals may come into more conflict, and commercial success takes precedence over accuracy.

When Competition Makes Forecasts Worse

There are two basic tests that any weather forecast must pass to demonstrate its merit:

1. It must do better than what meteorologists call persistence: the assumption that the weather will be the same tomorrow (and the next day) as it was today.

2. It must also beat climatology, the long-term historical average of conditions on a particular date in a particular area.

These were the methods that were available to our ancestors long before Richardson, Lorenz, and the Bluefire came along; if we can't improve on them, then all that expensive computer power must not be doing much good.
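The two baseline tests can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the temperatures are made up, the function names `mean_abs_error` and `skill_vs` are hypothetical, and mean absolute error is assumed as the yardstick rather than taken from any forecaster's actual method. A skill value above zero means the forecast beats the baseline; a value below zero means it does worse.

```python
def mean_abs_error(predicted, observed):
    """Average absolute miss, in degrees."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


def skill_vs(baseline_error, forecast_error):
    """Skill relative to a baseline: 1 is perfect, 0 ties the baseline, negative is worse."""
    return 1 - forecast_error / baseline_error


# Illustrative (made-up) daily high temperatures, in degrees Fahrenheit.
observed = [70, 74, 69, 65, 72]       # what actually happened on days 1-5
persistence = [68, 70, 74, 69, 65]    # each day's guess is the previous day's high
climatology = [71, 71, 71, 71, 71]    # assumed long-term average high for these dates
forecast = [71, 73, 70, 67, 71]       # the forecast being evaluated

persistence_error = mean_abs_error(persistence, observed)
climatology_error = mean_abs_error(climatology, observed)
forecast_error = mean_abs_error(forecast, observed)

skill_over_persistence = skill_vs(persistence_error, forecast_error)
skill_over_climatology = skill_vs(climatology_error, forecast_error)

print(f"forecast misses by {forecast_error:.1f} degrees on average")
print(f"skill over persistence: {skill_over_persistence:.2f}")
print(f"skill over climatology: {skill_over_climatology:.2f}")
```

In this toy record the forecast passes both tests. A forecast that missed by more than the climatology baseline would produce a negative number from `skill_vs`.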

We have lots of data, going back at least to World War II, on past weather outcomes: I can go to Wunderground.com, for instance, and tell you that the weather at 7 A.M. in Lansing, Michigan, on January 13, 1978---the date and time when I was born---was 18 degrees with light snow and winds from the northeast.39 But relatively few people had bothered to collect information on past weather forecasts. Was snow expected in Lansing that morning? It was one of the few pieces of information that you might have expected to find on the Internet but couldn't.

In 2002 an entrepreneur named Eric Floehr, a computer science graduate from Ohio State who was working for MCI, changed that. Floehr simply started collecting data on the forecasts issued by the NWS, the Weather Channel, and AccuWeather, to see if the government model or the private-sector forecasts were more accurate. This was mostly for his own edification at first---a sort of very large scale science fair project---but it quickly evolved into a profitable business, ForecastWatch.com, which repackages the data into highly customized reports for clients ranging from energy traders (for whom a fraction of a degree can translate into tens of thousands of dollars) to academics.

Floehr found that there wasn't any one clear overall winner. His data suggests that AccuWeather has the best precipitation forecasts by a small margin, that the Weather Channel has slightly better temperature forecasts, and the government's forecasts are solid all around. They're all pretty good.

But the further out in time these models go, the less accurate they turn out to be (figure 4-6). Forecasts made eight days in advance, for example, demonstrate almost no skill; they beat persistence but are barely better than climatology. And at intervals of nine or more days in advance, the professional forecasts were actually a bit worse than climatology.

FIGURE 4-6: COMPARISON OF HIGH-TEMPERATURE FORECASTS40

After a little more than a week, Loft told me, chaos theory completely takes over, and the dynamic memory of the atmosphere erases itself. Although the following analogy is somewhat imprecise, it may help to think of the atmosphere as akin to a NASCAR oval, with various weather systems represented by individual cars that are running along the track. For the first couple of dozen laps around the track, knowing the starting order of the cars should allow us to make a pretty good prediction of the order in which they might pass by. Our predictions won't be perfect---there'll be crashes, pit stops, and engine failures that we've failed to account for---but they will be a lot better than random. Soon, however, the faster cars will start to lap the slower ones, and before long the field will be completely jumbled up. Perhaps the second-placed car is running side by side with the sixteenth-placed one (which is about to get lapped), as well as the one in the twenty-eighth place (which has already been lapped once and is in danger of being lapped again). What we knew of the initial conditions of the race is of almost no value to us. Likewise, once the atmosphere has had enough time to circulate, the weather patterns bear so little resemblance to their starting positions that the models don't do any good.

Still, Floehr's finding raises a couple of disturbing questions. It would be one thing if, after seven or eight days, the computer models demonstrated essentially zero skill. But instead, they actually display negative skill: they are worse than what you or I could do sitting around at home and looking up a table of long-term weather averages. How can this be? It is likely because the computer programs, which are hypersensitive to the naturally occurring feedbacks in the weather system, begin to produce feedbacks of their own. It's not merely that there is no longer a signal amid the noise, but that the noise is being amplified.

The bigger question is why, if these longer-term forecasts aren't any good, outlets like the Weather Channel (which publishes ten-day forecasts) and AccuWeather (which ups the ante and goes for fifteen) continue to produce them. Dr. Rose took the position that doing so doesn't really cause any harm; even a forecast based purely on climatology might be of some interest to their consumers.

The statistical reality of accuracy isn't necessarily the governing paradigm when it comes to commercial weather forecasting. It's more the perception of accuracy that adds value in the eyes of the consumer.

For instance, the for-profit weather forecasters rarely predict exactly a 50 percent chance of rain, which might seem wishy-washy and indecisive to consumers.41 Instead, they'll flip a coin and round up to 60, or down to 40, even though this makes the forecasts both less accurate and less honest.42
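One way to see why the rounding costs accuracy is to apply a scoring rule. The sketch below assumes the Brier score (the squared difference between the stated probability and the outcome, where lower is better); it is a standard measure, but it is not one named in the text, and the function name `expected_brier` is hypothetical. If the true chance of rain really is 50 percent, stating 60 or 40 earns a worse expected score than stating 50.

```python
def expected_brier(stated, true_prob):
    """Expected Brier score of a stated rain probability when rain truly occurs with true_prob."""
    rain_penalty = (1 - stated) ** 2   # squared error on days it rains (outcome = 1)
    dry_penalty = stated ** 2          # squared error on days it stays dry (outcome = 0)
    return true_prob * rain_penalty + (1 - true_prob) * dry_penalty


honest = expected_brier(0.50, 0.50)
rounded_up = expected_brier(0.60, 0.50)
rounded_down = expected_brier(0.40, 0.50)

print(f"honest 50: {honest:.2f}")
print(f"rounded to 60: {rounded_up:.2f}")
print(f"rounded to 40: {rounded_down:.2f}")
```

The penalty for rounding is small on any single day, which may be part of why the habit persists, but it is never a gain.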

Floehr also uncovered a more flagrant example of fudging the numbers, something that may be the worst-kept secret in the weather industry. Most commercial weather forecasts are biased, and probably deliberately so. In particular, they are biased toward forecasting more precipitation than will actually occur43---what meteorologists call a "wet bias." The further you get from the government's original data, and the more consumer-facing the forecasts, the worse this bias becomes. Forecasts "add value" by subtracting accuracy.

How to Know if Your Forecasts Are All Wet

One of the most important tests of a forecast---I would argue that it is the single most important one44---is called calibration. Out of all the times you said there was a 40 percent chance of rain, how often did rain actually occur? If, over the long run, it really did rain about 40 percent of the time, that means your forecasts were well calibrated. If it wound up raining just 20 percent of the time instead, or 60 percent of the time, they weren't.
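The calibration check described here can be sketched as a simple tally. The example assumes a made-up record of twenty forecasts and a hypothetical function name, `calibration_table`; it groups forecasts by the probability that was stated and reports how often rain actually followed.

```python
from collections import defaultdict


def calibration_table(stated_probs, rained):
    """Map each stated probability to the observed frequency of rain."""
    counts = defaultdict(lambda: [0, 0])   # stated probability -> [rainy days, total days]
    for prob, did_rain in zip(stated_probs, rained):
        counts[prob][0] += int(did_rain)
        counts[prob][1] += 1
    return {prob: rainy / total for prob, (rainy, total) in sorted(counts.items())}


# Made-up record: ten days with a stated 40 percent chance, ten with 20 percent.
stated = [0.4] * 10 + [0.2] * 10
rained = [True] * 4 + [False] * 6 + [True] * 1 + [False] * 9

table = calibration_table(stated, rained)
for prob, observed in table.items():
    print(f"said {prob:.0%}, rained {observed:.0%} of the time")
```

In this toy record the 40 percent forecasts are well calibrated, while the 20 percent forecasts overstate the chance of rain. Ten forecasts per group is far too few to judge a real forecaster; the small sample is only there to keep the arithmetic visible.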

Calibration is difficult to achieve in many fields. It requires you to think probabilistically, something that most of us (including most "expert" forecasters) are not very good at. It really tends to punish overconfidence---a trait that most forecasters have in spades. It also requires a lot of data to evaluate fully---cases where forecasters have issued hundreds of predictions.*

Meteorologists meet this standard. They'll forecast the temperatures, and the probability of rain and other precipitation, in hundreds of cities every day. Over the course of a year, they'll make tens of thousands of forecasts.

This sort of high-frequency forecasting is extremely helpful not just when we want to evaluate a forecast but also to the forecasters themselves---they'll get lots of feedback on whether they're doing something wrong and can change course accordingly. Certain computer models, for instance, tend to come out a little wet45---forecasting rain more often than they should. But once you are alert to this bias you can correct for it. Likewise, you will soon learn if your forecasts are overconfident.

The National Weather Service's forecasts are, it turns out, admirably well calibrated46 (figure 4-7). When they say there is a 20 percent chance of rain, it really does rain 20 percent of the time. They have been making good use of feedback, and their forecasts are honest and accurate.

FIGURE 4-7: NATIONAL WEATHER SERVICE CALIBRATION

The meteorologists at the Weather Channel will fudge a little bit under certain conditions. Historically, for instance, when they say there is a 20 percent chance of rain, it has actually only rained about 5 percent of the time.47 In fact, this is deliberate and is something the Weather Channel is willing to admit to. It has to do with their economic incentives.

People notice one type of mistake---the failure to predict rain---more than another kind, false alarms. If it rains when it isn't supposed to, they curse the weatherman for ruining their picnic, whereas an unexpectedly sunny day is taken as a serendipitous bonus. It isn't good science, but as Dr. Rose at the Weather Channel acknowledged to me: "If the forecast was objective, if it has zero bias in precipitation, we'd probably be in trouble."

Still, the Weather Channel is a relatively buttoned-down organization---many of their customers mistakenly think they are a government agency---and they play it pretty straight most of the time. Their wet bias is limited to slightly exaggerating the probability of rain when it is unlikely to occur---saying there is a 20 percent chance when they know it is really a 5 or 10 percent chance---covering their butts in the case of an unexpected sprinkle. Otherwise, their forecasts are well calibrated (figure 4-8). When they say there is a 70 percent chance of rain, for instance, that number can be taken at face value.

FIGURE 4-8: THE WEATHER CHANNEL CALIBRATION

Where things really go haywire is when weather is presented on the local network news. Here, the bias is very pronounced, with accuracy and honesty paying a major price.

Kansas City ought to be a great market for weather forecasting---it has scorching-hot summers, cold winters, tornadoes, and droughts, and it is large enough to be represented by all the major networks. A man there named J. D. Eggleston began tracking local TV forecasts to help his daughter with a fifth-grade classroom project. Eggleston found the analysis so interesting that he continued it for seven months, posting the results to the Freakonomics blog.48

The TV meteorologists weren't placing much emphasis on accuracy. Instead, their forecasts were quite a bit worse than those issued by the National Weather Service, which they could have taken for free from the Internet and reported on the air. And they weren't remotely well calibrated. In Eggleston's study, when a Kansas City meteorologist said there was a 100 percent chance of rain, it failed to rain about one-third of the time (figure 4-9).

FIGURE 4-9: LOCAL TV METEOROLOGIST CALIBRATION

The weather forecasters did not make any apologies for this. "There's not an evaluation of accuracy in hiring meteorologists. Presentation takes precedence over accuracy," one of them told Eggleston. "Accuracy is not a big deal to viewers," said another. The attitude seems to be that this is all in good fun---who cares if there is a little wet bias, especially if it makes for better television? And since the public doesn't think our forecasts are any good anyway, why bother with being accurate?

This logic is a little circular. TV weathermen say they aren't bothering to make accurate forecasts because they figure the public won't believe them anyway. But the public shouldn't believe them, because the forecasts aren't accurate.

This becomes a more serious problem when there is something urgent---something like Hurricane Katrina. Lots of Americans get their weather information from local sources49 rather than directly from the Hurricane Center, so they will still be relying on the goofball on Channel 7 to provide them with accurate information. If there is a mutual distrust between the weather forecaster and the public, the public may not listen when they need to most.

The Cone of Chaos

As Max Mayfield told Congress, he had been prepared for a storm like Katrina to hit New Orleans for most of his sixty-year life.50 Mayfield grew up around severe weather---in Oklahoma, the heart of Tornado Alley---and began his forecasting career in the Air Force, where people took risk very seriously and drew up battle plans to prepare for it. What took him longer to learn was how difficult it would be for the National Hurricane Center to communicate its forecasts to the general public.

"After Hurricane Hugo in 1989," Mayfield recalled in his Oklahoma drawl, "I was talking to a behavioral scientist from Florida State. He said people don't respond to hurricane warnings. And I was insulted. Of course they do. But I have learned that he is absolutely right. People don't respond just to the phrase 'hurricane warning.' People respond to what they hear from local officials. You don't want the forecaster or the TV anchor making decisions on when to open shelters or when to reverse lanes."

Under Mayfield's guidance, the National Hurricane Center began to pay much more attention to how it presented its forecasts. In contrast to most government agencies, whose Web sites look as though they haven't been updated since the days when you got those free AOL CDs in the mail, the Hurricane Center takes great care in the design of its products, producing a series of colorful and attractive charts that convey information intuitively and accurately on everything from wind speed to storm surge.

The Hurricane Center also takes care in how it presents the uncertainty in its forecasts. "Uncertainty is the fundamental component of weather prediction," Mayfield said. "No forecast is complete without some description of that uncertainty."

Instead of just showing a single track line for a hurricane's predicted path, for instance, their charts prominently feature a cone of uncertainty---"some people call it a cone of chaos," Mayfield said. This shows the range of places where the eye of the hurricane is most likely to make landfall.51 Mayfield worries that even this isn't enough. Significant impacts like flash floods (which are often more deadly than the storm itself) can occur far from the center of the storm and long after peak wind speeds have died down. No people in New York City died from Hurricane Irene in 2011 despite massive media hype surrounding the storm, but three people did from flooding in landlocked Vermont52 once the TV cameras were turned off.

What the Hurricane Center usually does not do is issue policy guidance to local officials, such as whether to evacuate a city. Instead, this function is outsourced to the National Weather Service's 122 local offices, who communicate with governors and mayors, sheriffs and police chiefs. The official reason for this is that the Hurricane Center figures the local offices will have better working knowledge of the cultures and the people they are dealing with on the ground. The unofficial reason, I came to recognize after speaking with Mayfield, is that the Hurricane Center wants to keep its mission clear. The Hurricane Center and the Hurricane Center alone issues hurricane forecasts, and it needs those forecasts to be as accurate and honest as possible, avoiding any potential distractions.

But that aloof approach just wasn't going to work in New Orleans. Mayfield needed to pick up the phone.

Evacuation decisions are not easy, in part because evacuations themselves can be deadly; a bus carrying hospital evacuees from another 2005 storm, Hurricane Rita, burst into flames while leaving Houston, killing twenty-three elderly passengers.53 "This is really tough with these local managers," Mayfield says. "They look at this probabilistic information and they've got to translate that into a decision. A go, no-go. A yes-or-no decision. They have to take a probabilistic decision and turn it into something deterministic."

In this case, however, the need for an evacuation was crystal clear, and the message wasn't getting through.

"We have a young man at the hurricane center named Matthew Green. Exceptional young man. Has a degree in meteorology. Coordinates warnings with the transit folks. His mother lived in New Orleans. For whatever reason, she was not leaving. Here's a guy who knows about hurricanes and emergency management and he couldn't get his own mother to evacuate."

So the Hurricane Center started calling local officials up and down the Gulf Coast. On Saturday, August 27---after the forecast had taken a turn for the worse but still two days before Katrina hit---Mayfield spoke with Governor Haley Barbour of Mississippi, who ordered a mandatory evacuation for its most vulnerable areas almost immediately,54 and Governor Kathleen Blanco of Louisiana, who had already declared a state of emergency. Blanco told Mayfield that he needed to call Ray Nagin, the mayor of New Orleans, who had been much slower to respond.

Nagin missed Mayfield's call but phoned him back. "I don't remember exactly what I said," Mayfield told me. "We had tons of interviews over those two or three days. But I'm absolutely positive that I told him, You've got some tough decisions and some potential for a large loss of life." Mayfield told Nagin that he needed to issue a mandatory evacuation order, and to do so as soon as possible.

Nagin dallied, issuing a voluntary evacuation order instead. In the Big Easy, that was code for "take it easy"; only a mandatory evacuation order would convey the full force of the threat.55 Most New Orleanians had not been alive when the last catastrophic storm, Hurricane Betsy, had hit the city in 1965. And those who had been, by definition, had survived it. "If I survived Hurricane Betsy, I can survive that one, too. We all ride the hurricanes, you know," an elderly resident who stayed in the city later told public officials.56 Responses like these were typical. Studies from Katrina and other storms have found that having survived a hurricane makes one less likely to evacuate the next time one comes.57

The reasons for Nagin's delay in issuing the evacuation order are a matter of some dispute---he may have been concerned that hotel owners might sue the city if their business was disrupted.58

Either way, he did not call for a mandatory evacuation until Sunday at 11 A.M.59---and by that point the residents who had not gotten the message yet were thoroughly confused. One study found that about a third of residents who declined to evacuate the city had not heard the evacuation order at all. Another third heard it but said it did not give clear instructions.60 Surveys of disaster victims are not always reliable---it is difficult for people to articulate why they behaved the way they did under significant emotional strain,61 and a small percentage of the population will say they never heard an evacuation order even when it is issued early and often. But in this case, Nagin was responsible for much of the confusion.

There is, of course, plenty of blame to go around for Katrina---certainly to FEMA in addition to Nagin. There is also credit to apportion---most people did evacuate, in part because of the Hurricane Center's accurate forecast. Had Betsy topped the levees in 1965, before reliable hurricane forecasts were possible, the death toll would probably have been even greater than it was in Katrina.

One lesson from Katrina, however, is that accuracy is the best policy for a forecaster. It is forecasting's original sin to put politics, personal glory, or economic benefit before the truth of the forecast. Sometimes it is done with good intentions, but it always makes the forecast worse. The Hurricane Center works as hard as it can to avoid letting these things compromise its forecasts. It may not be a coincidence that, in contrast to all the forecasting failures in this book, theirs have become 350 percent more accurate in the past twenty-five years alone.

"The role of a forecaster is to produce the best forecast possible," Mayfield says. It's so simple---and yet forecasters in so many fields routinely get it wrong.