I've been meditating lately on a possibility of an advanced artificial intelligence modifying its value function, even writing some excrepts about this topic.
Is it theoretically possible? Has anyone of note written anything about this -- or anyone at all? This question is so, so interesting for me.
My thoughts led me to believe that it is theoretically possible to modify it for sure, but I could not come to any conclusion about whether it would want to do it. I seriously lack a good definition of value function and understanding about how it is enforced on the agent. I really want to tackle this problem from human-centric point, but i don't really know if anthropomorphization will work here.
Subscribe to RSS Feed
= f037147d6e6c911a85753b9abdedda8d)
Is the following a rationality failure? When I make a stupid mistake that caused some harm I tend to ruminate over it and blame myself a lot. Is this healthy or not? The good thing is that I analyze what I did wrong and learn something from it. The bad part is that it makes me feel terrible. Is there any analysis of this behaviour out there? Studies?
I suspect attempted telekinesis is relevant.