While visiting Oxford for MIRI’s November 2013 workshop, I had the pleasure of visiting a meeting of “Skeptics in the Pub” in the delightfully British-sounding town of High Wycombe in Buckinghamshire. (Say that aloud in a British accent and try not to grin; I dare you!)

I presented a mildly drunk intro to applied rationality, followed by a 2-hour Q&A that, naturally, wandered into the subject of why AI will inevitably eat the Earth. I must have been fairly compelling despite the beer, because at one point I noticed the bartenders were leaning uncomfortably over one end of the bar in order to hear me, ignoring thirsty customers at the other end.

Anyhoo, at one point I was talking about the role of formal knowledge in applied rationality, so I explained Solomonoff’s lightsaber and why it made me think the wave function never collapses.

Someone — I can’t recall who; let’s say “Bob” — wisely asked, “But if quantum interpretations all predict the same observations, what does it mean for you to say the wave function never collapses? What do you anticipate?” [1]

Now, I don’t actually know whether the usual proposals for experimental tests of collapse make sense, so instead I answered:

Well, I think theoretical physics is truth-tracking enough that it eventually converges toward true theories, so one thing I anticipate as a result of favoring a no-collapse view is that a significantly greater fraction of physicists will reject collapse in 20 years, compared to today.

Had Bob and I wanted to bet on whether the wave function collapses or not, that would have been an awfully tricky bet to settle. But if we roughly agree on the truth-trackingness of physics as a field, then we can use the consensus of physicists a decade or two from now as a proxy for physical truth, and bet on that instead.

This won’t work for some fields. For example, philosophy sometimes looks more like a random walk than a truth-tracking inquiry (or, more charitably, it tracks truth on the scale of centuries rather than decades). Did you know that one year after the cover of TIME asked “Is God Dead?”, a philosopher named Alvin Plantinga launched a renaissance in Christian philosophy, such that theism and Christian particularism were more commonly defended by analytic philosophers in the 1970s than they had been in the 1930s? I also have the impression that moral realism was more popular in the 1990s than in the 1970s, and that physicalism is less common today than it was in the 1960s, but I’m less sure about those.

You can also do this for bets that are hard to settle for a different kind of reason, e.g. an apocalypse bet. [2] Suppose Bob and I want to bet on whether smarter-than-human AI is technologically feasible. Trouble is, if it’s ever proven that superhuman AI is feasible, that event might overthrow the global economy, making it hard to collect the bet, or at least pointless.

But suppose Bob and I agree that AI scientists, or computer scientists, or technology advisors to first-world governments, or some other set of experts, is likely to converge toward the true answer on the feasibility of superhuman AI as time passes, as humanity learns more, etc. Then we can instead make a bet on whether it will be the case, 20 years from now, that a significantly increased or decreased fraction of those experts will think superhuman AI is feasible.

Often there won’t be suitable polls of the experts at both times with which to settle the bet. But domain experts typically have a general sense of whether some view has become more or less common in their field over time. So Bob and I could agree to poll a randomly chosen subset of our chosen expert community 20 years from now, asking them how common the view in question is at that time and how common it was 20 years earlier, and settle our bet that way.
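To make that settlement procedure concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than part of the proposal itself: the sample size, the ten-percentage-point threshold for “significantly increased or decreased,” and the shape of the poll responses.

```python
import random

def settle_bet(expert_pool, sample_size, threshold=0.10, seed=None):
    """Settle the bet by polling a random subset of experts 20 years
    from now. Each polled expert estimates what fraction of the field
    holds the view today and what fraction held it 20 years earlier;
    the bet settles on the average estimated shift. The 10-point
    threshold for "significant" is an illustrative choice."""
    rng = random.Random(seed)
    sample = rng.sample(expert_pool, sample_size)
    avg_shift = sum(e["share_now"] - e["share_20y_ago"] for e in sample) / sample_size
    if avg_shift >= threshold:
        return "view significantly more common: feasibility bettor wins"
    if avg_shift <= -threshold:
        return "view significantly less common: skeptic wins"
    return "no significant shift: push"

# Hypothetical poll responses, as fractions of the field holding the view.
experts = [{"share_now": random.uniform(0.5, 0.9),
            "share_20y_ago": random.uniform(0.3, 0.6)} for _ in range(200)]
print(settle_bet(experts, sample_size=30, seed=42))
```

Agreeing up front on the sampling frame and on what counts as “significant” is exactly the kind of detail-work the next paragraph gestures at.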

Getting the details right for this sort of long-term bet isn’t trivial, but I don’t see a fatal flaw. Have I missed one? [3]

 


  1. I can’t recall exactly how the conversation went, but it was something like this.

  2. See also Jones, How to bet on bad futures.

  3. I also doubt I’m the first person to describe this idea in writing: please link to other articles making this point if you know of any.

14 comments

Suppose Bob and I want to bet on whether smarter-than-human AI is technologically feasible. Trouble is, if it’s ever proven that superhuman AI is feasible, that event might overthrow the global economy, making it hard to collect the bet, or at least pointless.

This doesn't preclude a bet. You think money will be worthless in 20 years and Bob doesn't. Bob pays you a small amount of money today in return for your promise to pay him a massive amount in 20 years. Or just look at long-term bond prices or life insurance prices, which should already reflect the possibility of money becoming worthless.
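To see why the numbers can work out for both sides, here is a minimal sketch (my illustration, not the commenter's; the payoff, probabilities, and discount rate are made-up inputs) of how each party might price such a promise:

```python
def fair_price_today(promised_payment, p_worthless, discount_rate, years):
    """Expected present value of a promise to pay `promised_payment`
    in `years` years, given probability `p_worthless` that money is
    worthless by then (in which case the promise pays nothing of value).
    All inputs are illustrative assumptions."""
    expected_payoff = (1 - p_worthless) * promised_payment
    return expected_payoff / (1 + discount_rate) ** years

# You think money is 90% likely to be worthless in 20 years; Bob thinks 10%.
print(fair_price_today(10_000, p_worthless=0.9, discount_rate=0.03, years=20))  # ~554
print(fair_price_today(10_000, p_worthless=0.1, discount_rate=0.03, years=20))  # ~4983
```

Any upfront payment between those two valuations looks like a good deal to both parties, which is what makes the bet possible despite the settlement problem.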

That was discussed here before. (The person who proposed it no longer believes it would work.)

Bob will accept that some phrase X is meaningful if there is a test that can be performed whose outcome depends on the truth value of X. If there is such a test, then we can construct a further test: ask someone who has performed the original test what its outcome was. Since the people who set up tests are usually honest, this second test would also be a test of X (provided the original test exists).

If I ask an honest peasant how long the emperor's nose is, but I also suspect no one has ever seen the emperor, how much do I learn from her statement? What if she says, "I have never seen the emperor, but other people tell me his nose is 5cm"? How many people has she talked to? Has any of them seen the emperor?

I don't know how to answer those questions, and yet your example is even less clear. You think no one has seen the emperor, and you're not sure whether he can be seen. 5cm? Well, she is honest.

I am not sure that meme propagation through subcultures of experts is a good proxy for "truth", except in very clear-cut cases. Not even physics is that reliable. For example, five years ago very few HEP experts believed there is anything physically detectable at a black hole's horizon. By last year that fraction had jumped significantly, thanks to the horizon firewall papers. I expect it will plummet again five years from now. The situation is worse in less quantitative sciences, like psychology or economics. I'm sure you can come up with a few examples yourself.

This just comes back to the question of how truth-tracking the two people think the given expert community is, and how they should set their bets based on those anticipations. If one is worried about temporally local perturbations (see e.g. this bet), one could also agree to randomly sample the expert population at 15 years, 20 years, and 25 years.

Using a human arbiter as a proxy for such difficult bets is also the approach chosen by the Foresight Exchange.

Is there documentation on that somewhere? I couldn't easily find it.

It is as if you're buying / shorting an index fund on opinions.

It's as if you're participating in a prediction market such as PredictionBook or The Good Judgement Project.

But suppose Bob and I agree that AI scientists, or computer scientists, or technology advisors to first-world governments, or some other set of experts, is likely to converge toward the true answer on the feasibility of superhuman AI as time passes, as humanity learns more, etc. Then we can instead make a bet on whether it will be the case, 20 years from now, that a significantly increased or decreased fraction of those experts will think superhuman AI is feasible.

You need stronger assumptions.

For example, you need to postulate that the speed of convergence will overcome the natural variability within your time frame.
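A toy simulation (my construction, with made-up drift and noise parameters) illustrates the worry: when year-to-year variability is large relative to the speed of convergence, a single 20-year snapshot often points the wrong way even in a genuinely truth-tracking field.

```python
import random

def misleading_snapshot_rate(drift=0.005, noise=0.05, years=20, trials=10_000):
    """Fraction of runs in which a field that is truly converging toward
    a view (positive drift) nonetheless shows the view *losing* ground
    after `years` years of noisy opinion change."""
    wrong = 0
    for _ in range(trials):
        share = 0.5                 # fraction of experts holding the true view
        for _ in range(years):
            share += drift + random.gauss(0, noise)
            share = min(max(share, 0.0), 1.0)
        if share < 0.5:
            wrong += 1
    return wrong / trials

print(misleading_snapshot_rate())  # roughly a third of runs mislead
```

With these particular numbers the expected 20-year gain (0.1) is smaller than the standard deviation of the accumulated noise (about 0.22), so the snapshot misleads about a third of the time.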

asking them how common the view in question is at that time and how common it was 20 years earlier, and settle our bet that way.

Planning in 2015 to poll people in 2035 about 2015 seems clearly inferior to polling them in 2015.

domain experts typically have a general sense of whether some view has become more or less common in their field over time

The hypothesis that this sense is accurate really should be tested. Maybe it's true for views that were controversial at both points in time, but for theories that get eliminated, history is rewritten. Consider, for example, the interpretation of quantum mechanics.


How accurate would recollections of the past be? You'd expect a lot of distortion.

It's kinda the same idea as temporal difference learning, which seems to work quite well. That's evidence that this kind of bet is an effective tool for improving one's knowledge, if one has thousands of lifetimes over which to make them :-).
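For anyone unfamiliar with the analogy: in temporal difference learning you update an estimate toward a later estimate instead of waiting for the final outcome, much as the proposed bet settles against experts' later opinions rather than against ground truth. Here is a minimal sketch of tabular TD(0) on the classic five-state random walk (the example is mine, not the commenter's):

```python
import random

# States 1..5, with terminal states 0 (reward 0) and 6 (reward 1).
# The true value of state s is s/6.
V = {s: 0.5 for s in range(1, 6)}
alpha = 0.1  # learning rate

for _ in range(5_000):
    s = 3                                  # each episode starts in the middle
    while 1 <= s <= 5:
        s_next = s + random.choice([-1, 1])
        reward = 1.0 if s_next == 6 else 0.0
        v_next = V.get(s_next, 0.0)        # terminal states have value 0
        # TD(0): nudge V(s) toward the bootstrapped target r + V(s'),
        # i.e. toward a later estimate rather than the episode's outcome.
        V[s] += alpha * (reward + v_next - V[s])
        s = s_next

print({s: round(v, 2) for s, v in V.items()})
# converges toward {1: 0.17, 2: 0.33, 3: 0.5, 4: 0.67, 5: 0.83}
```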