Warrigal comments on Tiling Agents for Self-Modifying AI (OPFAI #2) - Less Wrong
Lemme expand on my thoughts a little bit. I imagine a non-self-modifying AI to be made of three parts: a thinking algorithm, a decision algorithm, and a belief database. The thinking and decision algorithms are immutable, and the belief database is (obviously) mutable. The supergoal is coded into the decision algorithm, so it can't be changed. (Problem: the supergoal only makes sense in the context of certain beliefs, and beliefs are mutable.) The contents of the belief database influence the thinking algorithm's behavior, but they don't fully determine it.
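To make the three-part picture concrete, here is a rough Python sketch. Everything in it is made up for illustration and is not from the original post: the names `Agent`, `toy_think`, and `toy_decide`, the idea of storing beliefs as a dict of credences, and the "stay dry" supergoal are all assumptions. It only shows the shape of the idea: the two algorithms are fixed at construction time, while the contents of the belief database can change.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Assumption: a belief is a proposition name mapped to a credence in [0, 1].
Beliefs = Dict[str, float]


@dataclass(frozen=True)  # frozen: the two algorithm slots cannot be reassigned
class Agent:
    think: Callable[[Beliefs, str], Beliefs]  # immutable thinking algorithm
    decide: Callable[[Beliefs], str]          # immutable decision algorithm
    # The attribute itself is frozen, but the dict's contents stay mutable.
    beliefs: Beliefs = field(default_factory=dict)

    def step(self, observation: str) -> str:
        # Thinking reads the current beliefs and an observation, then
        # writes updated beliefs back into the database.
        self.beliefs.update(self.think(self.beliefs, observation))
        # Deciding reads beliefs only; it never rewrites itself.
        return self.decide(self.beliefs)


def toy_think(beliefs: Beliefs, observation: str) -> Beliefs:
    # Existing beliefs influence the update (via the prior), but the
    # observation matters too, so beliefs alone do not fix the outcome.
    prior = beliefs.get(observation, 0.5)
    return {observation: min(1.0, prior + 0.25)}


def toy_decide(beliefs: Beliefs) -> str:
    # Hypothetical supergoal "stay dry", hard-coded here. Note that it is
    # only meaningful relative to a belief about "raining".
    return "take umbrella" if beliefs.get("raining", 0.0) > 0.6 else "go outside"


agent = Agent(think=toy_think, decide=toy_decide)
action = agent.step("raining")  # "take umbrella" (credence goes 0.5 -> 0.75)
```

Trying `agent.decide = something_else` raises an error because the dataclass is frozen, which is the toy version of "the supergoal can't be changed". The problem from the parenthetical shows up too: `toy_decide` depends on the key `"raining"`, and nothing stops the belief database from dropping or reinterpreting that key.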
The ideal possibility is that we can make the following happen:
(My ideas haven't been taken seriously in the past, and I have no special knowledge in this area, so it's likely that my ideas are worthless. They feel valuable to me, however.)