For example, if the FAI scenario is much less likely (a priori) than a Clippy scenario, then there's no reason for Clippy to make strong concessions.
But if a "paperclips" maximizer, as opposed to a "tables", "cars", or "alien sex toys" maximizer, is just one of many unfriendly maximizers, then maximizing "human values" is just one of many unlikely outcomes. In other words, you can't just say that unfriendly AIs are more likely than friendly AIs when it comes to cooperation, since the opposition between a paperclip maximizer and an "alien sex toy" maximizer is the same as the opposition between the former and an alien-friendly or human-friendly AI: all of them want to maximize their own opposing values. And even if there turns out to be a subset of values shared by some AIs, other groups could cooperate to outweigh their leverage.
But since there is an exponentially huge set of random maximisers, the probability of each individual one is infinitesimal. OTOH, human values have a high probability density in mindspace because people are actually working towards them.
Cross-posted from my blog.
Yudkowsky writes:
My own projection goes more like this:
At least one clear difference between my projection and Yudkowsky's is that I expect AI-expert performance on the problem to improve substantially as a greater fraction of elite AI scientists begin to think about the issue in Near mode rather than Far mode.
As a friend of mine suggested recently, current elite awareness of the AGI safety challenge is roughly where elite awareness of the global warming challenge was in the early 1980s. However, I expect elite acknowledgement of the AGI safety challenge to spread more slowly than it did for global warming or nuclear security, because AGI is tougher to forecast in general and involves trickier philosophical nuances. (Nobody was ever tempted to say, "But as the nuclear chain reaction grows in power, it will necessarily become more moral!")
Still, there is a worryingly non-negligible chance that AGI explodes "out of nowhere." Sometimes important theorems are proved suddenly after decades of failed attempts by other mathematicians, and sometimes a computational procedure is sped up by 20 orders of magnitude with a single breakthrough.