Houshalter comments on AI indifference through utility manipulation - Less Wrong
Comments (53)
I want to second RolfAndreassen's viewpoint below.
The problem with this entire train of thought is that you completely skip past the actual real difficulty, which is constructing any type of utility function even remotely as complex as the one you propose.
Your hypothetical utility function references undefined concepts such as "taking control of", "cooperating", "humans", and "self", among others.
If you actually try to ground your utility function and go through the work of making it realistic, you quickly find that it ends up being something on the order of complexity of a human brain, and it's not something that you can easily define in a few pages of math.
I'm skeptical then about the entire concept of 'utility function filters', as it seems their complexity would be on the order of or greater than the utility function itself, and you need to keep constructing an endless sequence of such complex utility function filters.
A more profitable route, it seems to me, is something like this:
Put the AIs in a Matrix-like sim (a future evolution of current computer game and film simulation tech) and get a community of a few thousand humans to take part in a Truman Show-like experiment. Indeed, some people would pay to spectate or even participate, so it could even be a for-profit venture. A hierarchy of admins and controls would guard against potential 'liberators'. In the worst case, you can always just rewind time (something the Truman Show could never do, and a fundamental advantage of a massive sim).
The 'filter function' operates at the entire modal level of reality: the AIs think they are humans, and do not know they are in a sim. And even if they suspected they were in a sim (i.e., by figuring out the simulation argument), they wouldn't know who were humans and who were AIs (and indeed they wouldn't know which category they themselves were in). As the human operators would have godlike monitoring capability over the entire sim, including even an ability to monitor AI thought activity, this should make a high level of control possible.
They can't turn against humans in the outside world if they don't even believe it exists.
This sounds like a science fiction scenario (and it is), but it's also feasible, and I'd say far more feasible than approaches which directly try to modify, edit, or guarantee the mindstates of AIs who are allowed to actually know they are AIs.
This is interesting, because once you have AI you can use it to make a simulation like this feasible, by making the code more efficient, monitoring the AIs' thoughts, etc. And yet the "god AI" wouldn't be able to influence the outside world in any meaningful way, and its modification of the inside world would be heavily restricted to just alerting admins about problems, making the simulation more efficient, and finding glitches.
All you have to do is feed the original AI some basic parameters (humans look like this, cars have these properties, etc.) and it can generate its own laws of physics and look for inconsistencies. That way the AIs inside would have a hard time figuring it out and abusing bugs.
I don't think it's necessary to make the AIs human, though. You could run a variety of different simulations. In some, the AIs would be led into a scenario where they would have to do something or other (maybe make CEV) that would be useful in the real world, but you would want to test it in the simulation for hidden motives and traps before you implement it.
Despite a number of assumptions here that would have to be true first (like the development of AI in the first place), a real concern would be how you manage such an experiment without the whole world knowing about it, or, if the whole world does know about it, how you make it safe so that terrorists can't blow it up, hackers can't tamper with it, and spies can't steal it. The world's reaction to AI is my biggest concern in any AI development scenario.
A number of assumptions, yes, but actually I see this as a viable route to creating AI, not something you do after you already have AI. Perhaps the biggest problem in AI right now is the grounding problem - actually truly learning what nouns and verbs mean. I think the most straightforward viable approach is simulation in virtual reality.
I concur with your concern. However, I don't know if such an experiment necessarily must be kept a secret (although that certainly is an option, and if/when governments take this seriously, it may be so).
On the other hand, at the moment most of the world seems to be blissfully unconcerned with AI.