Outside View(s) and MIRI's FAI Endgame

Wei Dai

28th Aug 2013

On the subject of how an FAI team can avoid accidentally creating a UFAI, Carl Shulman wrote:

If we condition on having all other variables optimized, I'd expect a team to adopt very high standards of proof, and recognize limits to its own capabilities, biases, etc. One of the primary purposes of organizing a small FAI team is to create a team that can actually stop and abandon a line of research/design (Eliezer calls this "halt, melt, and catch fire") that cannot be shown to be safe (given limited human ability, incentives and bias).

In the history of philosophy, there have been many steps in the right direction, but virtually no significant problems have been fully solved, such that philosophers can agree that some proposed idea can be the last words on a given subject. An FAI design involves making many explicit or implicit philosophical assumptions, many of which may then become fixed forever as governing principles for a new reality. They'll end up being last words on their subjects, whether we like it or not. Given the history of philosophy and applying the outside view, how can an FAI team possibly reach "very high standards of proof" regarding the safety of a design? But if we can foresee that they can't, then what is the point of aiming for that predictable outcome now?

Until recently I hadn't paid much attention to the discussions here about inside view vs outside view, because they tended to focus on the applicability of these views to the problem of predicting intelligence explosion. It seemed obvious to me that outside views can't possibly rule out intelligence explosion scenarios, and even a small probability of a future intelligence explosion would justify a much higher than current level of investment in preparing for that possibility. But given that the inside vs outside view debate may also be relevant to the "FAI Endgame", I read up on Eliezer and Luke's most recent writings on the subject... and found them to be unobjectionable. Here's Eliezer:

On problems that are drawn from a barrel of causally similar problems, where human optimism runs rampant and unforeseen troubles are common, the Outside View beats the Inside View.

Does anyone want to argue that Eliezer's criteria for using the outside view are wrong, or don't apply here?

And Luke:

One obvious solution is to use multiple reference classes, and weight them by how relevant you think they are to the phenomenon you're trying to predict.

[...]

Once you've combined a handful of models to arrive at a qualitative or quantitative judgment, you should still be able to "adjust" the judgment in some cases using an inside view.

These ideas seem harder to apply, so I'll ask for readers' help. What reference classes should we use here, in addition to past attempts to solve philosophical problems? What inside view adjustments could a future FAI team make, such that they might justifiably overcome (the most obvious-to-me) outside view's conclusion that they're very unlikely to be in the possession of complete and fully correct solutions to a diverse range of philosophical problems?
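To make the shape of Luke's suggestion concrete, here is a minimal sketch of "weight several reference classes, then adjust with an inside view". Everything in it is an assumption made for illustration: the function names are hypothetical, the relevance-weighted average and the log-odds adjustment are just one simple way to do the combining, and the numbers are placeholders rather than estimates anyone has defended.

```python
import math


def combine_reference_classes(classes):
    """Relevance-weighted average of base rates.

    `classes` is a list of (base_rate, relevance_weight) pairs, one per
    reference class. Both numbers are the forecaster's own judgments.
    """
    total_weight = sum(weight for _, weight in classes)
    if total_weight <= 0:
        raise ValueError("at least one reference class needs a positive weight")
    return sum(rate * weight for rate, weight in classes) / total_weight


def adjust_with_inside_view(estimate, log_odds_shift):
    """Nudge an outside-view estimate by a shift in log-odds space.

    Working in log-odds keeps the result strictly between 0 and 1.
    A positive shift raises the estimate, a negative shift lowers it.
    """
    if not 0.0 < estimate < 1.0:
        raise ValueError("estimate must be strictly between 0 and 1")
    log_odds = math.log(estimate / (1.0 - estimate)) + log_odds_shift
    return 1.0 / (1.0 + math.exp(-log_odds))


# Placeholder inputs only: (base rate of success, relevance weight).
hypothetical_classes = [
    (0.01, 3.0),  # e.g. past attempts to fully solve philosophical problems
    (0.30, 1.0),  # e.g. some other reference class a reader might propose
]
outside_view = combine_reference_classes(hypothetical_classes)
adjusted = adjust_with_inside_view(outside_view, log_odds_shift=0.5)
print(f"outside view: {outside_view:.3f}, after adjustment: {adjusted:.3f}")
```

The sketch leaves the hard part untouched: it says nothing about which reference classes belong in the list, how to weight them, or how large an inside-view shift is justified, which are exactly the questions asked above.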

60 Comments

Wei Dai13y20

(to be specific, I don't think MIRI proponents have adequately addressed concerns like those expressed by Wei Dai here and elsewhere)

I do wonder why MIRI people often do not respond to my criticisms about their strategy. For example the only MIRI-affiliated person who responded to this post so far is Paul Christiano (but given his disagreements with Eliezer, he isn't actually part of my intended audience for this post). The upcoming workshop might be a good opportunity to see if I can get MIRI people to take my concerns more seriously, if I talk to them face to face. If you or anyone else has any ideas on what else I should try, please let me know.

[anonymous]13y00

I do wonder why MIRI people often do not respond to my criticisms about their strategy.

I'm not sure why you're wondering, when in the history of MIRI and its predecessor they've only ever responded to about two criticisms about their strategy.

13lukeprog13y

Speaking for myself... Explaining strategic choices, and replying to criticisms, takes enormous amounts of time. For example, Nick Bostrom set out to explain what MIRI/FHI insiders might consider to be "10% of the basics about AI risk" in a clear and organized way, and by the time he's done with the Superintelligence book it will have taken him something like 2.5 years of work just to do that, with hundreds of hours of help from other people — and he was already an incredibly smart, productive academic writer who had a strong comparative advantage writing exactly that book. It would've taken me, or Carl, or anybody else besides Nick a lot more time and effort to write that book at a similar level of quality. Which of your many discussion threads on AI risk strategy do you most wish would be engaged further by somebody on staff at MIRI?

2Kaj_Sotala13y

Personally, I didn't respond to this post because my reaction to it was mostly "yes, this is a problem, but I don't see a way by which talking about it will help at this point; we'll just have to wait and see". In other words, I feel that MIRI will just have to experiment with a lot of different strategies and see which ones look like they'll have promise, and then that experimentation will maybe reveal a way by which issues like this one can be solved, or maybe MIRI will end up pursuing an entirely different strategy. But I expect that we'll actually have to try out the different strategies before we can know.
