Gunnar_Zarncke comments on Restrictions that are hard to hack - Less Wrong Discussion

6 Post author: Stuart_Armstrong 09 March 2015 01:52PM

Comments (8)

Comment author: Gunnar_Zarncke 09 March 2015 08:47:26PM -1 points

Very interesting.

Some time ago I posted a comment about raising AIs with a caregiver. Basically, rules given to the child/AI cause it to search for circumventions, whereas rewarding positive behaviors could be modelled as shaping the motivation structure of the child/AI. At least for children, positively reinforced behaviors prompt a search for new behaviors in that direction and implicitly inhibit other behaviors.

However, the theoretical model you gave for the motivation part looks quite different from mine. The child model seems to work more like heavily penalizing (heavily, that is, for an advanced AI) any search outside the rewarded areas. This is different from the usual temporal discounting, so it might nonetheless be another AI control approach. For this, search distance would need to be quantified, which is more difficult than discounting over time.
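
To make the contrast concrete, here is a rough sketch and not a worked-out model. Everything in it is an assumption made for illustration: behaviors are represented as points in a vector space, Euclidean distance stands in for "search distance", and the function names and the `penalty` weight are hypothetical.

```python
import math


def time_discounted_value(rewards, gamma=0.9):
    """Usual temporal discounting: the reward at step t is weighted by gamma**t."""
    return sum(r * gamma ** t for t, r in enumerate(rewards))


def distance_penalized_value(reward, candidate, rewarded_examples, penalty=5.0):
    """Hypothetical alternative: penalize by distance from the rewarded areas.

    The penalty grows with the Euclidean distance from the candidate behavior
    to the nearest previously rewarded behavior (an assumed metric).
    """
    nearest = min(math.dist(candidate, example) for example in rewarded_examples)
    return reward - penalty * nearest


# A behavior identical to a rewarded one keeps its full reward, while a
# distant one is heavily penalized regardless of when it pays off.
rewarded = [(0.0, 0.0), (1.0, 0.0)]
near = distance_penalized_value(10.0, (1.0, 0.0), rewarded)
far = distance_penalized_value(10.0, (4.0, 4.0), rewarded)
```

The hard part remains the one noted above: time steps come with a natural scale, whereas any distance measure over behaviors has to be chosen, and the Euclidean one used here is only a placeholder.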