With the release of Rohin Shah and Eliezer Yudkowsky's conversation, the Late 2021 MIRI Conversations sequence is now complete.
This post is intended as a generalized comment section for discussing the whole sequence, now that it's finished. Feel free to:
- raise any topics that seem relevant
- signal-boost particular excerpts or comments that deserve more attention
- direct questions to participants
In particular, Eliezer Yudkowsky, Richard Ngo, Paul Christiano, Nate Soares, and Rohin Shah expressed active interest in receiving follow-up questions here. The Schelling time when they're likeliest to be answering questions is Wednesday March 2, though they may participate on other days too.
I'll try to explain the technique and why it's useful. I'll start with a non-probabilistic version of the idea, since it's a little simpler conceptually, then talk about the corresponding idea in the presence of uncertainty.
Suppose I'm building a mathematical model of some system or class of systems. As part of the modelling process, I write down some conditions which I expect the system to satisfy - think energy conservation, or Newton's Laws, or market efficiency, depending on what kind of systems we're talking about. My hope/plan is to derive (i.e. prove) some predictions from these conditions, or maybe prove some of the conditions from others.
Before I go too far down the path of proving things from the conditions, I'd like to do a quick check that my conditions are consistent at all. How can I do that? Well, human brains are quite good at constrained optimization, so one useful technique is to look for one example of a system which satisfies all the conditions. If I can find one example, then I can be confident that the conditions are at least not inconsistent. And in practice, once I have that one example in hand, I can also use it for other purposes: I can usually see which (possibly unexpected) degrees of freedom the conditions leave open, and which they don't. By looking at that example, I can get a feel for the "directions" along which the conditions do/don't "lock in" the properties of the system.
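As a minimal sketch of this idea in code (all the conditions and parameter names here are illustrative toy stand-ins, not anything from a real model): represent each condition as a predicate over candidate systems, then brute-force search for a single candidate satisfying all of them. Finding one witness certifies that the conditions are not mutually inconsistent.

```python
from itertools import product

# Toy conditions on a two-parameter "system" (price, quantity).
# These are illustrative stand-ins for conditions like energy
# conservation or market efficiency.
conditions = [
    lambda s: s["price"] > 0,                      # prices are positive
    lambda s: s["quantity"] >= 0,                  # quantities are non-negative
    lambda s: s["price"] * s["quantity"] <= 100,   # a budget-style constraint
]

def find_example(candidates, conditions):
    """Return the first candidate satisfying every condition, else None."""
    for s in candidates:
        if all(cond(s) for cond in conditions):
            return s
    return None

# Brute-force search over a small grid of candidate systems.
candidates = ({"price": p, "quantity": q}
              for p, q in product(range(1, 20), range(0, 20)))

example = find_example(candidates, conditions)
# A non-None result shows the conditions are at least not inconsistent.
```

Once an example is in hand, one can vary its parameters and re-check the conditions to probe which degrees of freedom are left open and which are locked in.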
(Note that in practice, we often start with an example to which we want our conditions to apply, and we choose the conditions accordingly. In that case, our one example is built in, although we do need to remember the unfortunately-often-overlooked step of actually checking what degrees of freedom the conditions do/don't leave open to the example.)
What would a probabilistic version of this look like? Well, we have a world model with some (uncertain) constraints in it - i.e. kinds-of-things-which-tend-to-happen, and kinds-of-things-which-tend-to-not-happen. Then, we look for an example which generally matches the kinds-of-things-which-tend-to-happen. If we can find such an example, then we know that the kinds-of-things-which-tend-to-happen are mutually compatible; a high probability for some of them does not imply a low probability for others. With that example in hand, we can also usually recognize which features of the example are very-nailed-down by the things-which-tend-to-happen, and which features have lots of freedom. We may, for instance, notice that there's some very-nailed-down property which seems unrealistic in the real world; I expect that to be the most common way for this technique to unearth problems.
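The probabilistic version can be sketched the same way, with each "tends to happen" represented as a soft score rather than a hard predicate (again, all names and numbers below are illustrative assumptions): we look for one example that scores well on every tendency simultaneously, which shows the tendencies are mutually compatible.

```python
from itertools import product

# Soft constraints: each returns a log-probability-like score,
# higher meaning "more typical". These are toy stand-ins.
tendencies = [
    lambda s: -abs(s["growth"] - 2.0),                    # growth tends toward ~2
    lambda s: -abs(s["inflation"] - 3.0),                 # inflation tends toward ~3
    lambda s: -abs(s["growth"] + s["inflation"] - 5.0),   # they tend to sum to ~5
]

def joint_score(s):
    """Sum of the soft-constraint scores; 0 means all tendencies exactly met."""
    return sum(t(s) for t in tendencies)

# Small grid of candidate worlds, in steps of 0.5.
candidates = [{"growth": g / 2, "inflation": i / 2}
              for g, i in product(range(0, 13), range(0, 13))]

best = max(candidates, key=joint_score)
# If the best example scores well on every tendency individually,
# a high probability for some tendencies doesn't force a low
# probability for the others: they are mutually compatible.
```

In this toy case the best example satisfies all three tendencies at once, so they are jointly realizable; if no candidate had scored well on all of them, that would flag a tension among the supposed things-which-tend-to-happen.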
That's the role a "mainline" prediction serves. Note that it does not imply the mainline has a high probability overall, nor that all of the things-which-tend-to-happen will occur simultaneously. It's checking whether the supposed kinds-of-things-which-tend-to-happen are mutually consistent, and it provides some intuition for what degrees of freedom the kinds-of-things-which-tend-to-happen do/don't leave open.