When the uncertainty about the model is higher than the uncertainty in the model

Post author: Stuart_Armstrong 28 November 2014 06:12PM

Most models attempting to estimate or predict some element of the world will come with their own estimates of uncertainty. It could be the Standard Model of physics predicting the mass of the Z boson as 91.1874 ± 0.0021 GeV, or the rather wider uncertainty ranges of economic predictions.

In many cases, though, the uncertainties in or about the model dwarf the estimated uncertainty in the model itself - especially for low-probability events. This is a problem, because people working with models often try to take the in-model uncertainty and adjust it to get an estimate of the true uncertainty. They often realise the model is unreliable, but don't have a better one, and they have a measure of uncertainty already - so surely doubling or tripling this should do the trick? Surely...

The following three cases are going to be my go-to examples for showing what a mistake this can be; they cover three situations: extreme error, being in the domain of a hard science, and extreme negative impact.

Black Monday

On October 19, 1987, the world's stock markets crashed, shedding a huge value in a very short time. The Dow Jones Industrial Average dropped by 22.61% that day, losing between a fifth and a quarter of its value.

How likely was such an event, according to the prevailing models of the time? This was apparently a 20-sigma event, which means that the event was twenty standard deviations away from the expected behaviour.

Such events have a probability of around 10^-50 of happening, which in technical mathematical terms is classified as "very small indeed". If every day were compressed into a second, and the stock markets had been running since the big bang... this gives us only about 10^17 seconds. If every star in the observable universe ran its own stock market, and every day were compressed into a second, and we waited a billion times the lifetime of the universe... then we might expect to observe a twenty-sigma event. Once.

No amount of reasonable "adjusting" of the probability could bring 10^-50 into something plausible. What is interesting, though, is that if we took the standard deviation as the measure of uncertainty, the adjustment is much smaller. One day in a hundred years is a roughly 3×10^-5 event, which corresponds very roughly to three standard deviations. So "simply" multiplying the standard deviation by seven would have been enough. It seems that adjusting ranges is more effective than adjusting probabilities.
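As a sanity check on these conversions, the probability-to-sigma translation can be done with the Python standard library (a sketch assuming, as the market models did, a normal distribution of daily returns; the figure of 250 trading days per year is my round number):

```python
from statistics import NormalDist

# One trading day in a hundred years, at roughly 250 trading days per year.
p = 1 / (100 * 250)  # 4e-5

# The sigma level whose upper tail has probability p under a normal distribution.
z = NormalDist().inv_cdf(1 - p)
print(z)  # just under 4
```

Scaling the model's sigma by a single-digit factor is then enough to turn a 20-sigma event into a roughly once-a-century one.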

 

Castle Bravo

But economics is a soft science. Surely errors couldn't occur in a harder science, like physics? In nuclear bomb physics, where the US had access to the best brains, the best simulations, and some of the best physical evidence (and certainly a very high motivation to get it right), such errors could not occur? Okay, maybe at the very beginning, but by 1954 the predictions must have been very accurate?


The Castle Bravo hydrogen bomb was the highest-yield nuclear bomb ever detonated by the United States (though not necessarily intended to be). The yield was predicted to be 6 megatons of TNT, within a maximum range of 4 to 8. It ended up being an explosion of 15 megatons of TNT, around two and a half times the prediction, with fallout landing on inhabited islands (on the Rongelap and Utirik atolls) and spreading across the world, killing at least one person (a Japanese fisherman).

What went wrong? The bomb designers had assumed that the lithium-7 isotope used in the bomb was essentially inert, when in fact it... wasn't. And that alone was enough to multiply the yield of the weapon far beyond what the designers had considered possible.

Again, extending the range is more successful than adjusting the probabilities of extreme events. And even the hard sciences are susceptible to great errors.

 

Physician, wash thyself

The previous cases are good examples of dramatic underestimates of uncertainty by the model's internal measure of uncertainty. They are especially good because we have numerical estimates of the internal uncertainty. But they lack one useful rhetorical component: evidence of a large-scale disaster. Now, there are lots of models to choose from that dramatically underestimated the likelihood of disaster, but I'll go with one problem in particular: the practice that doctors used to have of not washing their hands.


Ignaz Semmelweis noted in 1847 that women giving birth in the presence of doctors died at about twice the rate of those attended only by midwives. He correctly deduced that doctors were carrying something over from their work on cadavers, and that this was causing the women to die. He instituted a policy of washing hands with a solution of chlorinated lime between autopsy work and the examination of patients - with great success, even reducing the death rate to zero in some months.

This caused a problem. Semmelweis was unable to convince other doctors of this, due to a variety of standard biases. But his greatest handicap was that he had no explanation for the effect. Yes, he could point at graphs and show improvements - but there was nothing in standard medical theory that could account for them. The other doctors could play around with their models and theories for days, but they could never have explained this kind of result.

His claims, in effect, lacked a scientific basis. So they ignored them. Until Pasteur came up with a new model, there was just no way to understand these odd results.

The moral of this is that sometimes the true uncertainty is not merely much greater than the model's own estimate of it. Sometimes the uncertainty isn't even visible anywhere in the model.

And sometimes making these kinds of mistakes can lead to millions of unnecessary deaths, with the decisions made (using doctors for childbirth) having the exact opposite effect of what was intended.

Comments (78)

Comment author: mwengler 29 November 2014 04:37:21PM 15 points [-]

The "model" you never name for the stock price distribution is a normal or Gaussian distribution. You point out that this model fails spectacularly to predict the highest-variability point, the 20-sigma event. What you don't point out is that this model fails to predict even the 5-sigma events.

Looking at the plot of "Daily Changes in the Dow," we see it is plotted over fewer than 95 years. Each trading year has a little fewer than 260 trading days in it. So the plot should contain at most about 24,000 daily changes. For a normal distribution, a 1/24000 event - the biggest event we would expect to happen once on this whole graph - would be somewhere between 4 and 4.5 sigma.

But instead of seeing one, or close to one, event of 4.5 sigma or higher in 24,000 daily changes, we see about 28 by my rough count looking at the graph. The data starts in 1915. By 1922 we have seen five "1 in 100 years" events.

My point being: by 1922, we know the model that daily changes in the stock market fit a normal or Gaussian distribution is just pure crap. We don't need to wait 70 years for a 20-sigma event to know the model isn't just wrong, it is stupid. We suspect this fact within the first year or two of data, and we know with high reliability that the model is garbage by 1922.

What I don't understand is why anybody rational would ever state that the data fit a normal distribution, when even within a few hundred events we see an overwhelmingly obvious excess of large daily changes over what a normal distribution would predict.

One need not go to a mathematically well-described alternative distribution to produce a much better prediction of how often 20-sigma events will actually occur. Simply plotting the histogram of daily changes will show the tail probability falling off much more slowly than a normal distribution predicts. Before the 20-sigma event occurs, one might, by estimating from the actual distribution of changes, underestimate its probability by a factor of a few (or not - I have not done the exercise on this data), but one will not have underestimated it by tens of orders of magnitude.
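The point can be illustrated with a quick simulation (a sketch using synthetic Student-t data as a stand-in for heavy tails, not the actual Dow series):

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(0)
n = 24_000  # roughly the number of daily changes on the plot

# Heavy-tailed synthetic "daily changes": Student-t with 3 degrees of freedom,
# standardised so one unit is one true standard deviation (the sd of t(3) is sqrt(3)).
changes = rng.standard_t(df=3, size=n) / np.sqrt(3)

observed = int(np.sum(np.abs(changes) > 4.5))

# How many |z| > 4.5 events a normal distribution predicts in n draws.
expected_normal = n * 2 * (1 - NormalDist().cdf(4.5))

print(observed, round(expected_normal, 2))  # dozens observed vs. well under one predicted
```

A heavy-tailed process produces dozens of "impossible" 4.5-sigma days in a sample where the normal model predicts essentially none.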

There is no "uncertainty" in modeling daily stock market changes as normally (or log-normally) distributed. One can look at just the first few hundred points in the measured data to be CERTAIN that a normal or a log-normal distribution is simply wrong.

Comment author: Stuart_Armstrong 01 December 2014 11:05:54AM 0 points [-]

The model used is the Black-Scholes model with, as you point out, a normal distribution. It endures, despite being clearly wrong, because there don't seem to be any good alternatives.

Comment author: Cthulhoo 04 December 2014 10:52:20AM 3 points [-]

It doesn't endure, not in Risk Management, anyway. Some alternatives for equities are e.g. the Heston model or other stochastic volatility approaches. Then there is the whole field of systemic risk, which studies correlated crashes: events where a bunch of equities all crash at the same time are way more common than they should be, and people are aware of this. See e.g. this analysis that uses a Hawkes model to capture the clustering of the crashes.

Comment author: Stuart_Armstrong 04 December 2014 12:28:13PM 2 points [-]

B-S endures, but is generally patched with insights like these.

Comment author: Cthulhoo 04 December 2014 02:27:43PM 2 points [-]

B-S endures, but is generally patched with insights like these.

I think I can see what you mean, and in fact I partially agree, so I'll try to restate the argument. Correct me if you think I got it wrong. In my experience it's true that B-S is still used for quick-and-dirty bulk calculations, or by organizations that don't have the means to implement more complex models. But the model's shortcomings are very well understood by the industry, and risk managers absolutely don't rely on this model when e.g. calculating the capital requirement for Basel or Solvency purposes. If they did, the regulators would utterly demolish their internal risk model.

There is still a lot of work to be done, and there is what you call model uncertainty at least when dealing with short time scales, but (fortunately) there's been a lot of progress since B-S.

Comment author: Larks 04 December 2014 03:50:03AM 3 points [-]

Why were you using an options-pricing model to predict stock returns? Black-Scholes is not used to model equity market returns.

Comment author: Azathoth123 02 December 2014 02:55:41AM *  0 points [-]

How about the Black-Scholes model with a more realistic distribution?

Or does BS make annoying assumptions about its distribution, like that it has a well-defined variance and mean?

Comment author: ChristianKl 03 December 2014 06:39:38PM 1 point [-]

It assumes that the underlying model follows a Gaussian distribution but as Mandelbrot showed a Lévy distribution is a better model.

Black-Scholes is a name for a formula that was around before Black and Scholes published. Beforehand it was simply a heuristic used by traders. Those traders also scaled a few parameters in ways that a normal distribution wouldn't allow. Black and Scholes then went and proved the formula correct for a Gaussian distribution, based on advanced math.

After Black-Scholes got a "Nobel prize", people started to believe that the formula actually measures real risk, and to bet accordingly. Betting like that is beneficial for traders who make bonuses when they win but don't suffer that much if they lose all the money they bet - or whose government bails them out when they lose.

The problem with Lévy distributions is that they have a parameter c that you can't simply estimate from a random sample, the way you can estimate all the parameters of a Gaussian distribution if you have a big enough sample.

*I'm no expert on the subject but the above is my understanding from reading Taleb and other reading.

Comment author: ahbwramc 29 November 2014 12:49:09AM 10 points [-]
Comment author: Stuart_Armstrong 29 November 2014 02:59:58PM 2 points [-]

Added a link to that, thanks.

Comment author: Lumifer 29 November 2014 04:35:15AM 9 points [-]

This was apparently a 20-sigma event

That's just a complicated way of saying "the model was wrong".

So "simply" multiplying the standard deviation by seven would have been enough.

Um... it's not that easy. If your model breaks down in a pretty spectacular fashion, you don't get to recover by inventing a multiple for your standard deviation. In the particular case of the stock markets, one way would be to posit a heavy-tailed underlying distribution - and if it's sufficiently heavy-tailed, the standard deviation isn't even defined. Another, better way would be to recognize that the underlying generating process is not stable.
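The "standard deviation isn't even defined" case can be seen directly with a Cauchy distribution (a sketch; the exact numbers depend on the random seed):

```python
import numpy as np

rng = np.random.default_rng(1)

# A Cauchy distribution has no defined mean or variance: the sample standard
# deviation never settles down, because it is dominated by the largest draw so far.
for n in (100, 10_000, 1_000_000):
    sample = rng.standard_cauchy(n)
    print(n, np.std(sample))
```

Each time you collect more data, the "measured" standard deviation can jump by orders of magnitude, so any sigma you quote for such a process is meaningless.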

In general, the problem with recognizing the uncertainty of your model is that you still need a framework to put it into and if your model blew up you may be left without any framework at all.

Comment author: [deleted] 30 November 2014 10:11:23PM 2 points [-]

Isn't this more or less what mixture models were made for?

Comment author: Stuart_Armstrong 01 December 2014 11:05:09AM *  2 points [-]

Those can work, if you have clear alternative candidate models. But it's not clear how you would have done that here in, say, the second problem. The model you would have mixed with is something like "lithium-7 is actually reactive on the relevant timescales here"; that's not really a model, barely a coherent assumption.

Comment author: shminux 28 November 2014 10:37:09PM *  2 points [-]

Re Black Monday, and possibly Castle Bravo, this site devoted to Managing the Unexpected seems to have some relevant research and recommendations.

And I don't think that your last example is in the same category of uncertain models with certain predictions.

Comment author: Stuart_Armstrong 29 November 2014 03:05:21PM 1 point [-]

And I don't think that your last example is in the same category of uncertain models with certain predictions.

It is somewhat different, yes. Do you have a better example, still with huge negative impact?

Comment author: Larks 04 December 2014 03:54:46AM 1 point [-]

Such events have a probability of around 10^-50 of happening

No, you cannot infer a probability just from a SD. You also need to know what type of distribution it is. You're implicitly assuming a normal distribution, but everyone knows asset price returns have negative skew and excess kurtosis.

You could easily correct this by adding "If you use a normal distribution...".

Comment author: Stuart_Armstrong 04 December 2014 12:27:11PM 0 points [-]

I'm not implicitly assuming it - the market models were the ones explicitly assuming it.

Comment author: Larks 05 December 2014 01:08:21AM 1 point [-]

Up to this point in the post you haven't mentioned any models. If you give a probability without first mentioning a model for it to be relative to, the implication is that you are endorsing the implicit model. But this is just nit-picking.

More importantly, there are many models used by people in the market. Plenty of people use far more sophisticated models. You can't just say "the market models" without qualification or citation.

Comment author: shminux 30 November 2014 05:01:40PM *  1 point [-]

Seems that (a generalization of) equation 1 in the paper by Toby Ord http://arxiv.org/abs/0810.5515, linked in the comments to Yvain's post, is something like what you are looking for.

Comment author: Stuart_Armstrong 01 December 2014 11:02:16AM 0 points [-]

This post is looking for salient examples of that type of behaviour.

Comment author: joaolkf 29 November 2014 03:00:50AM 1 point [-]

On October 18, 1987, what sort of model of the uncertainty of models would one have to have, to say that the uncertainty over the 20-sigma estimate was enough to allow it to be 3-sigma? 20-sigma, give or take 17? Seems a bit extreme, and maybe not useful.

Comment author: Stuart_Armstrong 29 November 2014 03:02:05PM 3 points [-]

This seems to depend almost entirely on what other models you had. A 1% belief in a wider model (say one using a Cauchy distribution rather than a normal one) might have been sufficient to make the result considerably less surprising.
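The effect of such a mixture can be made concrete with a quick tail-probability calculation (a sketch; the 99%/1% weighting and the Cauchy alternative are just this comment's example, not a fitted model):

```python
import math

z = 20.0  # the "20-sigma" daily move, in units of the normal model's sigma

# Two-sided tail probabilities under each component model.
p_normal = math.erfc(z / math.sqrt(2))      # ~5e-89: essentially impossible
p_cauchy = 2 / math.pi * math.atan(1 / z)   # ~0.03: entirely unremarkable

# A 99% normal / 1% Cauchy mixture.
p_mix = 0.99 * p_normal + 0.01 * p_cauchy
print(p_mix)  # ~3e-4: rare, but expected a handful of times per century of daily data
```

Even a 1% credence in the heavy-tailed alternative lifts the probability of the extreme event by dozens of orders of magnitude.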