Stuart_Armstrong comments on Naturalism versus unbounded (or unmaximisable) utility options - Less Wrong
You could generate a random number using a distribution that has infinite expected value, then tell Omega that number. Your expected utility of following this procedure is infinite.
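As a concrete illustration (the comment doesn't name a particular distribution; the St. Petersburg-style sampler below is just one standard choice with a divergent mean):

```python
import random

def sample_infinite_expectation():
    """Draw from a St. Petersburg-style distribution: the value 2**k occurs
    with probability 2**-k, so the expectation sum_k 2**-k * 2**k diverges."""
    k = 1
    while random.random() < 0.5:  # each successful coin flip doubles the payoff
        k += 1
    return 2 ** k

# The number you would tell Omega: any single draw is finite,
# but the expected value of the draw is infinite.
print(sample_infinite_expectation())
```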
But if there is a non-zero chance of an Omega existing that can grant you an arbitrary amount of utility, then there must also be a non-zero chance of some Omega deciding on its own at some future time to grant you a random amount of utility using the above distribution, so you've already got infinite expected utility, no matter what you do.
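Spelled out (with p standing for that assumed non-zero probability and X for the payoff drawn from the infinite-expectation distribution):

$$ \mathbb{E}[U] \;\ge\; p \cdot \mathbb{E}[X] \;=\; \infty \quad \text{for any } p > 0. $$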
It doesn't seem to me that the third problem ("You're immortal. Tell Omega any real number r > 0, and he'll give you 1-r utility.") corresponds to any real-world problems, so generalizing from the first two, the problem is just the well-known problem of an unbounded utility function leading to infinite or divergent expected utility. I don't understand why a lot of people seem to think very highly of this post. (What's the relevance of using ideas related to Busy Beaver to generate large numbers, if with a simple randomized strategy, or even by doing nothing, you can get infinite expected utility?)
Actually, the third problem is probably the most relevant of them all - it's akin to a bounded paperclipper uncertain as to whether it has succeeded. Kind of like: "You get utility 1 for creating 1 paperclip and then turning yourself off (and 0 in all other situations)."
I still don't see how it's relevant, since I don't see a reason why we would want to create an AI with a utility function like that. The problem goes away if we remove the "and then turning yourself off" part, right? Why would we give the AI a utility function that assigns 0 utility to an outcome where we get everything we want but it never turns itself off?
The designer of that AI might have (naively?) thought this was a clever way of solving the friendliness problem. Do the thing I want, and then make sure to never do anything again. Surely that won't lead to the whole universe being tiled with paperclips, etc.
This can arise indirectly, through design, or for a host of other reasons. That was just the first example that popped into my mind; I'm sure other relevant ones can be found. We might not assign such a utility function - then again, we (or someone) might, which makes it relevant.