Hi folks,
My supervisor and I co-authored a philosophy paper on the argument that AI represents an existential risk. That paper has just been published in Ratio. We figured LessWrong would be able to catch things in it which we might have missed and, either way, hope it might provoke a conversation.
We reconstructed what we take to be the argument for how AI becomes an xrisk as follows:
- The "Singularity" Claim: Artificial Superintelligence is possible and would be out of human control.
- The Orthogonality Thesis: More or less any level of intelligence is compatible with more or less any final goal (as per Bostrom's 2014 definition).
From the conjunction of these two premises, we can conclude that ASI is possible; that it might have a goal, instrumental or final, which is at odds with human existence; and, given that the ASI would be out of our control, that the ASI is an xrisk.
We then suggested that each premise seems to assume a different interpretation of "intelligence", namely:
- The "Singularity" claim assumes general intelligence
- The Orthogonality Thesis assumes instrumental intelligence
If this is the case, then the premises cannot be joined together in the original argument, i.e., the argument is invalid.
We note that this does not mean that AI or ASI is not an xrisk, only that the current argument to that end, as we have reconstructed it, is invalid.
Eagerly, earnestly, and gratefully looking forward to any responses.
Thanks for posting this here! As you might expect, I disagree with you. I'd be interested to hear your positive account of why there isn't x-risk from AI (excluding from misused instrumental intelligence). Your view seems to be that we may eventually build AGI, but that it'll be able to reason about goals, morality, etc. unlike the cognitively limited instrumental AIs you discuss, *and therefore it won't be a threat*. Can you expand on the italicized bit? Is the idea that if it can reason about such things, it's as likely as we humans are to come to the truth about them? (And, there is in fact a truth about them? Some philosophers would deny this about e.g. morality.) Or indeed perhaps you would say it's more likely than humans to come to the truth, since if it were merely as likely as humans then it would be pretty scary (humans come to the wrong conclusions all the time, and have done terrible things when granted absolute power).
Laying my cards on the table, I think that there do exist valid arguments with plausible premises for x-risk from AI, and insofar as you haven't found them yet, you haven't been looking hard enough or charitably enough. The stuff I was saying above is a suggestion for how you could proceed: if you can't prove X, try to prove not-X for a bit; often you learn something that helps you prove X. So, I suggest you try to argue that there is no x-risk from AI (excluding the kinds you acknowledge, such as AI misused by humans) and see where that leads you. It sounds like you have the seeds of such an argument in your paper; I was trying to pull them together and flesh them out in the comment above.