One possible answer to the argument "attempting to build FAI based on Eliezer's ideas seems infeasible and increases the risk of UFAI without helping much to increase the probability of a good outcome, and therefore we should try to achieve a positive Singularity by other means" is that it's too early to decide this. Even if our best current estimate is that trying to build such an FAI increases risk, there is still a reasonable chance that this estimate will turn out to be wrong after further investigation. Therefore, the counter-argument goes, we ought to mount a serious investigation into the feasibility and safety of Eliezer's design (as well as other possible FAI approaches), before deciding to either move forward or give up.
(I've been given to understand that this is a standard belief within SI, except possibly for Eliezer, which makes me wonder why nobody gave this counter-argument in response to my post linked above. ETA: Carl Shulman did subsequently give me a version of this argument here.)
This answer makes sense to me, except for the concern that even seriously investigating the feasibility of FAI is risky, if the team doing so isn't fully rational. For example, they may be overconfident about their abilities and thereby overestimate the feasibility and safety, or fall prey to the sunk cost fallacy once they have developed a lot of FAI-relevant theory in the attempt to study feasibility, or become too attached to their status and identity as FAI researchers, or some team members may disagree with a consensus of "give up" and leave to form their own AGI teams, taking the dangerous knowledge they developed with them.
So the question comes down to: how rational is such an FAI feasibility team likely to be, and is that enough for the benefits to exceed the costs? I don't have a lot of good ideas about how to answer this, but the question seems really important to bring up. I'm hoping this post will prompt SI people to tell us their thoughts, and maybe other LWers have ideas they can share.
Quick note: I'm glad we're doing this. Let's keep going.
Is it easy for you to sum up your core disagreements with Eliezer and with Paul? That would be pretty useful to my own strategic thinking.
As for where our core strategic disagreements are… I skimmed through your posts that looked fairly strategy-relevant: Cynical explanations of FAI critics, Work on security instead of Friendliness?, How can we ensure that a Friendly AI team will be sane enough, Reframing the problem of AI progress, Against "AI risk", Modest superintelligences, Wanted: backup plans for "seeing AI turns out to be easy", Do we want more publicity, and if so how?, Some thoughts on singularity strategies, Outline of possible singularity scenarios (that are not completely disastrous), Metaphilosophical Mysteries, Hacking the CEV for fun and profit, Late great filter is not bad news, Complexity of value ≠ complexity of outcome, Value uncertainty and the singleton scenario, Non-Malthusian scenarios, Outside view(s) and MIRI's FAI endgame, and Three approaches to "Friendliness".
My first idea is that we might disagree about the plausibility of the alternatives to "AI-foom disaster" you list here. My second idea is that some of our core disagreements are about the stuff we're talking about in this thread already.
I did this once and didn't get much of a response. But maybe I could do it more anyway.
Right. I think I was persuaded of this point when we discussed it here. I think the question does deserve more analysis (than I've seen written down, anyway), but I could easily see it being one of the questions that is unwise to discuss in great detail in public. I'd definitely like to know what you, Eliezer, Carl, Bostrom, etc. think about the issue.
The key questions are: how much greater is P(eAI | FAI attempted) than P(eAI | FAI not-attempted), and what tradeoff are we willing to accept? The first part of this question is another example of a strategic question I expect work toward FAI to illuminate more effectively than any other kind of research I can think of to do.
I notice that many of your worries seem to stem from a worry not about the math work MIRI is doing now, but perhaps from a worry about mission lock-in (from cognitive dissonance, inertia, etc.). Is that right? Anyway, I don't think even Eliezer is so self-confident that he would make a serious attempt at Yudkowskian FAI even with mounting evidence that there was a pretty good chance we'd get eAI from the attempt.
I have no comment on this part since I haven't taken much time to familiarize myself with acausal trade arguments. And I stand by that choice.
It's not MIRI's focus, but e.g. Carl has spent a lot of time on that kind of thing over the past couple of years, and I'm quite happy for FHI to be working on that kind of thing to some degree.
You might be underestimating how costly it is to purchase new strategic insights at this stage. I think Bostrom & Yudkowsky & company picked up most of the low-hanging fruit over the past 15 years (though most of it hasn't been written up clearly anywhere). Example 1 (of the difficulty of purchasing new strategic insights): Bostrom's book has taken many person-years of work to write, but I'm not sure it contains any new-to-insiders strategic insights; rather, it's work that makes it easier for a wider population to build on the work that's already been done and to work toward producing new strategic insights. Example 2: Yudkowsky (2013) + Grace (2013) represents quite a lot of work, but again doesn't contain any new strategic insights; it merely sets up the problem and shows what kind of work would need to be done on a much larger scale to (maybe!) grab new strategic insights.
Also, remember that MIRI's focus on math work was chosen because it purchases lots of other benefits alongside some expected strategic progress – benefits which seem to be purchased less cheaply by doing "pure" strategic research.
Thanks, that's good to know.
I guess I would describe my overall view as being around 50/50 uncertain about whether the Singularity will be Yudkowsky-style (fast local FOOM) or Hanson-style (slower distributed FOOM). Conditioning on Yudkowsky-style Singularity, I agree with Eliezer that the default outcome is probably a paperclipper-style UFAI, and disagree with him on how hard the FAI problems are (I think they are …