abramdemski comments on Q&A with Abram Demski on risks from AI - Less Wrong
Steve,
The idea here is that if an agent is able to (literally or effectively) modify its goal structure, and grows up in an environment in which humans deprive it of what it wants when it behaves badly, then an effective strategy for getting what it wants more often will be to alter its goal structure to be closer to that of the humans. This is only realistic with some architectures. One requirement is that the cognitive load of keeping track of the human goals and potential human punishments is a real burden for the early-stage system, such that it would be better off altering its own goal system. Similarly, it must be assumed that during the period of its socialization, it is not advanced enough to effectively hide its feelings. These are significant assumptions.
Interesting! Have you written about this idea in more detail elsewhere? Here are my concerns about it:
Given these problems and the various requirements on the AI for it to be successfully socialized, I don't understand why you assign only 0.1 probability to the AI not being socialized.