I'm a reactionary, not an innovator, dammit! Reacting against this newfangled antiheroic 'reference class' claim that says we ought to let the world burn because we don't have enough of a hero license!
"Reference class" to me is just an intuitive way of thinking about updating on certain types of evidence. It seems like you're saying that in some cases we ought to use the inside view, or weigh object-level evidence more heavily, but 1) I don't understand why you are not worried about "inside view" reasoning typically producing overconfidence or why you don't think it's likely to produce overconfidence in this case, and 2) according to my inside view, the probability of a team like the kind you're envisioning solving FAI is low, and a typical MIRI donor or potential donor can't be said to have much of an inside view on this matter, and has to use "reference class" reasoning. So what is your argument here?
I'm also really unconvinced by the claim that this work could reasonably have expected net negative consequences.
Every AGI researcher is unconvinced by that, about their own work.
but trying to warn people against reverse or negative effects seems pretty perverse for anything that has made it onto Givewell's Top 3, or CFAR, or FHI, or MIRI
CFAR and MIRI were created by you, to help you build FAI. If FHI has endorsed your plan for building FAI (as opposed to endorsing MIRI as an organization that's a force for good overall, which I'd agree with, and I've actually provided various forms of support to MIRI because of that), I'm not aware of it. I also think I've thought enough about this topic to give some weight to my own judgments, so even if FHI does endorse your plan, I'd want to see their reasoning (which I definitely have not seen) and not just take their word. I note that Givewell does publish its analyses and is not asking people to just trust it.
Info that shortens AI timelines should mostly just not be released publicly
My model of FAI development says that you have to get most of the way to being able to build an AGI just to be able to start working on many Friendliness-specific problems, and solving those problems would take a long time relative to finishing the rest of the AGI capability work. Unless you're flying completely below the radar, which is incompatible with your plan for funding via public donations, what is stopping your unpublished results from being stolen or leaked in the meantime? And just gathering 10 to 50 world-class talents to work on FAI is likely to spur competition and speed up AGI progress. The fact that you seem to be overconfident about your chance of success also suggests that you are likely to be overconfident in other areas, and indicates a high risk of accidental UFAI creation (relative to the probability of success, not necessarily high in absolute terms).
My model of FAI development says that you have to get most of the way to being able to build an AGI just to be able to start working on many Friendliness-specific problems, and solving those problems would take a long time relative to finishing the rest of the AGI capability work.
Agree, though luckily there are other Friendliness-specific problems that we can start solving right now.
...Unless you're flying completely below the radar, which is incompatible with your plan for funding via public donations, what is stopping your unpublished results from being st
In the past, people like Eliezer Yudkowsky (see 1, 2, 3, 4, and 5) have argued that MIRI has a medium probability of success. What is this probability estimate based on and how is success defined?
I've read standard MIRI literature (like "Evidence and Import" and "Five Theses"), but I may have missed something.
(Meta: I don't think this deserves a discussion thread, but I posted this on the open thread and no-one responded, and I think it's important enough to merit a response.)