I've been writing about effective altruism and AI and would be interested in feedback: Effective altruists should work towards human-level AI
Things like AI boxing or "emergency stop buttons" would be instances of safeguards: basically, any form of human supervision that can keep the AI in check even when it isn't safe to let it roam free.
Obviously AI is hard, and obviously software has bugs.
To counter my argument, you need to make the case that the bugs will be so fundamental and severe, and will go undetected for so long, that despite any safeguards we take they will lead to catastrophic results with probability greater than 99%.
It doesn't matter what happens when we sample a mind at random. We only care about the sorts of minds we might build, whether by designing them or evolving them. Either way, they'll be far from random.
Let's say "we" are the good guys in the race for AI. Define
W = we win the race to create an AI powerful enough to protect humanity from any subsequent AIs
G = our AI can be used to achieve a good outcome
F = we go the "formalize friendliness" route
O = we go a promising route other than formalizing friendliness
At issue is which of the following is higher:
P(G|WF)P(W|F) or P(G|WO)P(W|O)
From what I know of SIAI's approach to F, I estimate P(W|F) to be many orders of magnitude smaller than P(W|O). I estimate P(G|WO) to be more than 1% for a ...
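To make the comparison concrete, here is a minimal sketch with purely hypothetical numbers; they are not the estimates given above, only placeholders chosen to respect the stated constraints (P(W|F) several orders of magnitude below P(W|O), and P(G|WO) above 1%). The point is just that a large gap in P(W|·) can swamp a gap in P(G|W·).

```python
# Illustrative only: these probabilities are assumptions for the sketch,
# not claims from the comment above.

p_W_given_F = 1e-6   # chance "we" win the race via the formalize-friendliness route
p_W_given_O = 1e-3   # chance "we" win via some other promising route
p_G_given_WF = 0.99  # chance of a good outcome, given we win via F
p_G_given_WO = 0.05  # chance of a good outcome, given we win via O

route_F = p_G_given_WF * p_W_given_F  # P(G|WF) * P(W|F)
route_O = p_G_given_WO * p_W_given_O  # P(G|WO) * P(W|O)

print(f"Formalize-friendliness route: {route_F:.2e}")
print(f"Other route:                  {route_O:.2e}")
print("Prefer O" if route_O > route_F else "Prefer F")
```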
Sounds like we mostly agree. However, I don't think it's a question of general knowledge. I'm talking about smart, rational people who have studied AI enough to have strongly-held opinions about it. Those are the people who need to be convinced; their opinions propagate to smart, rational people who haven't personally investigated AI in depth.
I'd love to hear your take on X here. What are your reasons for believing that friendliness can be formalized practically, and an AGI based on that formalization built before any other sort of AGI?
My point is that Shane's conclusion strikes me as the obvious one, and I believe many smart, rational, informed people would agree. It may be the case that, for the majority of smart, rational, informed people, there exists an issue X for which they think "obviously X" and SIAI thinks "obviously not X." To be taken seriously, SIAI needs to engage with the X's.
Agreed, and I think there's a pattern here. XiXiDu is asking the right questions about why SIAI doesn't have wider support. It is because there are genuine holes in its reasoning about the singularity, and SIAI chooses not to engage with serious criticism that gets at those holes. Example (one of many): I recall Shane Legg commenting that it's not practical to formalize friendliness before anyone builds any form of AGI (or something to that effect). I haven't seen SIAI give a good argument to the contrary.
Is this happening again this Sunday?
It seems to me that whpearson's reasoning is an instance of the "ends don't justify means" heuristic, which is especially reasonable in this case since the ends are fuzzy and the means are clear.
If we grant that businesses feeding into status games (and other counterproductive activities) are likely to be much more profitable than businesses better aligned with rationalist/altruistic/"nerd" values, then arguing that one should go into the latter kind of business undermines the claim that the ends justify the means here. And if the ends don't justify going into business despite a lack of intrinsic motivation, what does?
Free online AGI seminar coming up