Kaj_Sotala comments on [draft] Concepts are Difficult, and Unfriendliness is the Default: A Scary Idea Summary - Less Wrong Discussion
Hmm. Actually, I'm not making any assumptions about the AGI's decision-making process (or at least I'm trying not to): it could have a formal utility function, but it could also have e.g. a more human-like system with various instincts that pull it in different directions, or pretty much any other reasonable decision-making system.
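To make the contrast concrete, here's a minimal sketch of the two architectures I have in mind (all names and numbers are hypothetical illustrations, not a real AGI design):

```python
# A minimal sketch contrasting two decision-making architectures.
# All names and values here are hypothetical illustrations.

def utility_maximizer(actions, utility):
    """Pick the action with the highest score under a single,
    explicit utility function."""
    return max(actions, key=utility)

def instinct_agent(actions, instincts):
    """Pick the action favored by several weighted 'instincts' that
    may pull in different directions; no single utility function is
    ever written down explicitly."""
    def combined(action):
        return sum(weight * drive(action) for drive, weight in instincts)
    return max(actions, key=combined)

actions = ["explore", "rest", "hoard"]

# One explicit utility function.
utility = {"explore": 0.6, "rest": 0.2, "hoard": 0.9}.get

# Competing drives with different weights.
curiosity = {"explore": 1.0, "rest": 0.0, "hoard": 0.2}.get
caution   = {"explore": 0.1, "rest": 1.0, "hoard": 0.5}.get
instincts = [(curiosity, 0.7), (caution, 0.3)]

print(utility_maximizer(actions, utility))   # -> "hoard"
print(instinct_agent(actions, instincts))    # -> "explore"
```

Of course, a fixed weighted sum of drives is still a de facto utility function; the point of the sketch is only that the instinct-based agent's goals were never specified as one, which is the kind of system I want the argument to cover too.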
You make a good point that this probably needs to be clarified. Could you point out the main things that give the impression that I'm presuming utility-function-based decision making?
I am not sure what AGI designs other than utility-function-based decision makers exist where it would make sense to talk about "friendly" and "unfriendly" goal architectures. If we're talking about behavior executors or AGI designs with malleable goals, then we're talking about hardcoded tools in the former case and unpredictable systems in the latter, no?