Evidence for the orthogonality thesis

Stuart_Armstrong

One of the most annoying arguments when discussing AI is the perennial "But if the AI is so smart, why won't it figure out the right thing to do anyway?" It's often the ultimate curiosity stopper.

Nick Bostrom has defined the "Orthogonality thesis" as the principle that motivation and intelligence are essentially unrelated: superintelligences can have nearly any type of motivation (at least, nearly any utility function-based motivation). We're trying to get some rigorous papers out so that when that question comes up, we can point people to standard, published arguments. Nick has had a paper accepted that points out that the orthogonality thesis is compatible with a lot of philosophical positions that would seem to contradict it.

I'm hoping to complement this with a paper laying out the positive arguments in favour of the thesis. So I'm asking you for your strongest arguments for (or against) the orthogonality thesis. Think of trying to convince a conservative philosopher who's caught a bad case of moral realism - what would you say to them?

Many thanks! Karma and acknowledgements will shower on the best suggestions, and many puppies will be happy.

I just reread this post yesterday and found it to be a very convincing counter-argument to the idea that we should act solely on high stakes.

Eh, I think Vassar's reply is more to the point.

I think Wei_Dai's reply does trump that.

What Vassar is saying sounds to me like a justification of Pascal's Wager: arguing that some gods have more measure than others, and that we can therefore rationally decide to believe in a particular god and live accordingly.

That is like saying that a biased coin does not have a probability of 1/2, and that we can therefore maximize our payoff by betting on the side of the coin that is more likely to end up face-up. That would be true if we had any information beyond the fact that the coin is biased. But if all we reliably know is that it is biased, and not in which direction, it makes no sense to deviate from the probability we would assign to a fair coin.
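To make the coin analogy concrete, here is a minimal sketch, assuming a symmetric prior over the direction of the bias (that symmetry is my reading of "we don't know which way it is biased", not something stated above). The function names and the example bias of 3/4 are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def marginal_heads(bias, prior_toward_heads=Fraction(1, 2)):
    """Probability of heads for a coin that lands on its favoured side
    with probability `bias`, when we assign `prior_toward_heads` to the
    hypothesis that the favoured side is heads.

    ASSUMPTION: the default prior of 1/2 encodes "no information about
    the direction of the bias".
    """
    favoured_is_heads = prior_toward_heads * bias
    favoured_is_tails = (1 - prior_toward_heads) * (1 - bias)
    return favoured_is_heads + favoured_is_tails


def expected_payoff_betting_heads(bias, prior_toward_heads=Fraction(1, 2)):
    """Expected payoff of an even-money bet on heads (+1 on a win, -1 on a loss)."""
    p = marginal_heads(bias, prior_toward_heads)
    return p - (1 - p)
```

Under the symmetric prior, `marginal_heads(Fraction(3, 4))` is exactly 1/2 and the expected payoff of always betting heads is 0, however strong the bias is. Any information about direction changes that: with a prior of 3/5 that the bias favours heads, the same coin gives a probability of 11/20 for heads and an expected payoff of 1/10. So the analogy holds only to the extent that we really have no evidence about the direction.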

And I don't think it is clear, at this point, that we are justified in assuming more than that there might be risks from AI. Claiming that some actions we can take with respect to risks from AI are superior to others is like claiming that the coin is biased while being unable to determine the direction of the bias. By claiming that doing something is better than doing nothing, we might equally well end up making things worse, just as we might by unconditionally assigning a higher probability to one side of a coin, of which we know nothing except that it is biased, in a coin-tossing tournament.

The only sensible option seems to be to wait for more information.

Rain (2 points, 14y ago)

This is one of The Big Three Problems I came to LW hoping to find a solution for, but have mainly noticed that nobody wants to talk about it. Oh well.