Calibrating your probability estimates of world events: Russia vs Ukraine, 6 months later.

Shmi

Some of the comments on the link by James_Miller exactly six months ago provided very specific estimates of how the events might turn out:

James_Miller:

The odds of Russians intervening militarily = 40%.
The odds of the Russians losing the conventional battle (perhaps because of NATO intervention) conditional on them entering = 30%.
The odds of the Russians resorting to nuclear weapons conditional on them losing the conventional battle = 20%.

Me:

"Russians intervening militarily" could be anything from posturing to weapon shipments to a surgical strike to a Czechoslovakia-style tank-roll or Afghanistan invasion. My guess is that the odds of the latter are below 5%.

A bet between James_Miller and solipsist:

I will bet you $20 U.S. (mine) vs $100 (yours) that Russian tanks will be involved in combat in the Ukraine within 60 days. So in 60 days I will pay you $20 if I lose the bet, but you pay me $100 if I win.
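
For reference, the stakes of a bet like this pin down a break-even probability. The snippet below is only a sketch of that arithmetic: the function name is made up for illustration, and it assumes both parties care about expected dollars and nothing else.

```python
from fractions import Fraction

def break_even_probability(own_stake, other_stake):
    """Probability of winning at which risking own_stake to win other_stake
    has zero expected value. Hypothetical helper, not from the thread."""
    return Fraction(own_stake, own_stake + other_stake)

# James_Miller risks $20 to win $100 on "Russian tanks in combat within 60 days".
james_break_even = break_even_probability(20, 100)
print(james_break_even, round(float(james_break_even), 3))
```

So offering the bet only makes sense if James_Miller put the odds above 1 in 6, and accepting it only makes sense if solipsist put them below that.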

While it is hard to do any meaningful calibration based on a single event, there must be lessons to learn from it. Given that Russian armored columns are reported to be capturing key Ukrainian towns today, the first part of James_Miller's prediction has come true, even if it took 3 times longer than he estimated.

Note that even the most pessimistic person in that conversation (James) was probably too optimistic. My estimate of 5% appears way too low in retrospect, and I would probably bump it to 50% for a similar event in the future.

Now, given that the first prediction came true, how would one reevaluate the odds of the two further escalations he listed? I still feel that there is no way there will be a "conventional battle" between Russia and NATO, but having just been proven wrong makes me doubt my assumptions. If anything, maybe I should give more weight to what James_Miller (or at least Dan Carlin) has to say on the issue. And if I had any skin in the game, I would probably be even more cautious.
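
As a starting point for that reevaluation, here is a rough sketch of what James_Miller's three numbers imply when chained together, before and after the first event is known to have happened. It assumes his conditional estimates are taken at face value and left unrevised, and that the later escalations can only happen via the earlier ones; both assumptions are mine, not his.

```python
from fractions import Fraction

# James_Miller's estimates, as quoted above.
p_intervene = Fraction(40, 100)
p_lose_given_intervene = Fraction(30, 100)
p_nuke_given_lose = Fraction(20, 100)

# Six months ago: multiply down the chain of conditionals.
prior_lose = p_intervene * p_lose_given_intervene   # 12%
prior_nuke = prior_lose * p_nuke_given_lose         # 2.4%

# Today: the intervention has happened, so its probability is 1
# and the first factor drops out.
posterior_lose = p_lose_given_intervene                      # 30%
posterior_nuke = p_lose_given_intervene * p_nuke_given_lose  # 6%

for label, p in [("lose, prior", prior_lose), ("nukes, prior", prior_nuke),
                 ("lose, now", posterior_lose), ("nukes, now", posterior_nuke)]:
    print(f"{label}: {float(p):.1%}")
```

On these numbers, simply learning that the first event occurred moves the implied odds of nuclear use from 2.4% to 6%, before any rethinking of the conditionals themselves.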

Upvoted -- thanks for a long, even if not fully even-handed, reply (also, explaining CIs using a discrete set of hypotheses is perhaps not the most transparent approach). I will try to give an example with a continuous-valued parameter.

Say we want to estimate the mean height of LW posters. Ignoring the issue of sock puppets for the moment, we could pick LW usernames out of a hat, show up at the house of the person behind each username, and measure their height. Say we do that for 100 LW users picked at random and take the average; call it X1. The 100 users are a "sample" and X1 is a "sample mean." If we randomly picked a different set of 100, we would get a different average, call it X2. With yet another set of 100, we would get yet another average, call it X3, and so on.

These X1, X2, X3 are realizations drawn from something called the "sampling distribution," call it Ps. This distribution is a different thing from the distribution that governs height among all LW users, call it Ph. Ph could be anything in general: maybe Gaussian, maybe bimodal, maybe something weird. But if we can figure out what the distribution Ps is, we can make statements of the form

"most of the time that I draw a sample mean Xi from Ps (that is, most of the time I pick 100 LW users at random and average their heights), this average will be pretty close to the real average height of all LW users, under a very small set of assumptions on Ph."

This is what confidence intervals are about. In fact, if the number of LW users we pick for our sample is large enough, we can approximate Ps well by a Gaussian distribution because of a neat result called the Central Limit Theorem (again regardless of what Ph is, or more precisely under very mild assumptions on Ph).

What makes these kinds of statements powerful is that we can sometimes make them without needing to know much at all about Ph. Sometimes it is useful to be able to say something like that -- maybe we are very uncertain about Ph, or we suspect shenanigans with how Ph is defined.
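
To make the height example concrete, here is a small simulation sketch. The population is entirely made up (a bimodal mix of heights in cm standing in for Ph), and the interval is the usual Gaussian-approximation one, mean plus or minus 1.96 standard errors; nothing here is real data about LW users.

```python
import random
import statistics

random.seed(0)

# Hypothetical Ph: a deliberately non-Gaussian, bimodal population of heights (cm).
population = ([random.gauss(165, 6) for _ in range(5000)]
              + [random.gauss(180, 7) for _ in range(5000)])
true_mean = statistics.fmean(population)

def confidence_interval(sample, z=1.96):
    """Approximate 95% CI for the mean, relying on the Central Limit Theorem."""
    mean = statistics.fmean(sample)
    std_err = statistics.stdev(sample) / len(sample) ** 0.5
    return mean - z * std_err, mean + z * std_err

# Repeat the "pick 100 users out of a hat" experiment many times and count
# how often the interval built from the sample covers the true mean.
trials = 2000
hits = 0
for _ in range(trials):
    sample = random.sample(population, 100)
    low, high = confidence_interval(sample)
    if low <= true_mean <= high:
        hits += 1

coverage = hits / trials
print(f"true mean: {true_mean:.1f} cm, coverage: {coverage:.3f}")
```

The coverage should come out close to 0.95 even though Ph is bimodal, which is the point: the guarantee is about the procedure across repeated samples, and it needs very little knowledge of Ph.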
