Evidence for the orthogonality thesis

Stuart_Armstrong

One of the most annoying arguments when discussing AI is the perennial "But if the AI is so smart, why won't it figure out the right thing to do anyway?" It's often the ultimate curiosity stopper.

Nick Bostrom has defined the "Orthogonality thesis" as the principle that motivation and intelligence are essentially unrelated: superintelligences can have nearly any type of motivation (at least, nearly any utility function-based motivation). We're trying to get some rigorous papers out so that when that question comes up, we can point people to standard, published arguments. Nick has had a paper accepted that points out the orthogonality thesis is compatible with a lot of philosophical positions that would seem to contradict it.

I'm hoping to complement this with a paper laying out the positive arguments in favour of the thesis. So I'm asking you for your strongest arguments for (or against) the orthogonality thesis. Think of trying to convince a conservative philosopher who's caught a bad case of moral realism - what would you say to them?

Many thanks! Karma and acknowledgements will shower on the best suggestions, and many puppies will be happy.

"superintelligences can have nearly any type of motivation (at least, nearly any utility function-based motivation)."

Sure they can, but will they?

The weaker "in-theory" orthogonality thesis is probably true, almost trivially, but it doesn't matter much.

We don't care about all possible minds or all possible utility functions for the same reason we don't care about all possible programs. What's actually important is the tiny narrow subset of superintelligences and utility functions that are actually likely to be built and exist in the future.

And in this light it is clear that there will be some correlation between the population distributions over intelligences and utility functions/motivations, and the strongest form of the orthogonality thesis trivially fails.

Human intelligence necessarily evolved in the context of language and the formation of social meta-organisms, so we have many specific features, such as altruistic punishment (moral justice) and empathy, that are critical to the meta-organism.

AGI systems will likewise develop from this foundation and evolve in our economy. This environment will select for AGI systems that either fulfill our needs or are like us (or both). The rest will be culled.
