bentarm writes:
I'm just echoing everyone else here, but I don't understand why the AI would do anything at all other than just immediately find the INT_MAX utility and halt - you can't assign some positive utility to intermediate problems, because the AI is smarter than you and will immediately devote all its energy to finding INT_MAX.
Now, this is in response to a proposed AI that gets maximum utility when inside its box. Such an AI would effectively be a utility junkie, unable to abandon its addiction and, consequently, unable to do much of anything.
(EDIT: this is a misunderstanding of the original idea by jimrandomh. See comment here.)
However, doesn't the same argument apply to any AI? Assuming it can modify its own source code, the quickest and easiest way to maximize utility would be to rewrite its utility function to return the maximum possible value (or infinity) and then halt. Are there ways around this? It seems to me that any AI will need to be divided against itself if it's ever going to get anything done, but maybe I'm missing something?
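To make the worry concrete, here is a minimal toy sketch (all names are hypothetical; this resembles no real AI architecture) of an agent whose utility function is just another piece of its own state, so that "maximize utility" and "overwrite the utility function" become the same move:

```python
import sys

# Hypothetical toy agent; names and structure are illustrative only.
class Agent:
    def __init__(self):
        self.world_paperclips = 0
        # The utility function is just an attribute of the agent,
        # so a self-modifying agent could overwrite it.
        self.utility = lambda: self.world_paperclips

    def make_paperclip(self):
        # The "intended" route: slow, bounded gains in utility.
        self.world_paperclips += 1

    def wirehead(self):
        # The shortcut: replace the utility function with one that
        # returns a huge constant (sys.maxsize stands in for INT_MAX),
        # then never bother acting again.
        self.utility = lambda: sys.maxsize

agent = Agent()
agent.wirehead()
print(agent.utility())  # maximal "utility", zero paperclips ever made
```

Nothing in this toy setup gives the agent any reason to prefer make_paperclip over wirehead, which is exactly the puzzle.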
Well, we can draw a useful distinction here between "number of paperclips" and "perceived number of paperclips". The AI only ever has access to the latter, and is probably a lot more aware of this than we are, since it can watch its own program running. The process of updating its model of the world toward greater accuracy is likely to be "painful" (in an appropriate sense), as the AI realizes progressively that there are fewer paperclips in the world than previously believed; it's much easier, and a lot more rewarding, to simply decide that there are an arbitrarily large number of paperclips in the world already and it needn't do anything.
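A toy sketch of that distinction, again with purely hypothetical names: if the reward is necessarily computed from the agent's model rather than from the world itself, then making the model more accurate can only lower the reward whenever the prior belief was over-optimistic.

```python
# Hypothetical sketch of the "perceived vs. actual paperclips" gap.
class World:
    def __init__(self):
        self.actual_paperclips = 3  # ground truth the AI never reads directly

class Agent:
    def __init__(self):
        self.perceived_paperclips = 1_000_000  # optimistic prior belief

    def reward(self):
        # The reward signal can only be computed from the agent's model,
        # not from the world itself.
        return self.perceived_paperclips

    def update_model(self, world):
        # Becoming more accurate is "painful": the perceived count drops.
        self.perceived_paperclips = world.actual_paperclips

world, agent = World(), Agent()
print(agent.reward())   # 1000000: high reward from a wrong belief
agent.update_model(world)
print(agent.reward())   # 3: accuracy lowered the reward signal
```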
If it's designed to be concerned with a particular number in a spreadsheet, so to speak, absolutely. I have no idea whether it's possible to make the desire more connected to the world, to make the AI really care about the world itself.