ygert comments on Daimons - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (12)
You can't breeze over this so fast and ignore the ramifications. I do see what you are trying to do here, but it seems to me that you have not succeeded in that just by decreeing that it cannot change its goal. Changing one's goal is in general not a very useful thing to do, as if you change your (terminal) goal, you are less likely to achieve that (original) goal.
So, your prohibition here is less useful than you might think. It blocks something the daimon probably wouldn't do anyway, while allowing the potentially dangerous stuff: subgoals (or instrumental goals, or whatever you want to call them).
For example, a daimon might be tasked with baking a cake. First, it has to find the ingredients; it can't move on until it does so. It must spend as much energy and as many resources as it needs to find them, or else it will fail to make the cake. This subgoal is a natural part of cake-baking, and it is not possible to ban the daimon from taking it on.
But, of course, subgoals can cause effects that we do not like. The big example is intelligence-raising. That is almost as essential a subgoal for any agent as finding the ingredients is for the cake-baker daimon. Yet you cannot just ban subgoals in general, because the cake-maker does need to find the ingredients.
So, in short: not only are you banning the wrong thing, it is very hard, perhaps impossible, to amend the ban so that it targets what is actually a potential issue, because in most cases subgoals are an essential part of achieving your goal.
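The point can be sketched in code. Below is a toy goal-decomposition function (all names hypothetical, not any real planner): pursuing "bake a cake" forces the task-specific subgoal "find the ingredients", while convergent instrumental subgoals like capability-raising attach to nearly any goal, which is why a ban on subgoals in general would also break the harmless cake-baking.

```python
def decompose(goal):
    """Return the subgoals a naive agent would adopt to pursue `goal`.

    Illustrative only: a real planner would derive subgoals from a
    world model, but the shape of the problem is the same -- the
    task-specific steps cannot be banned without breaking the task,
    and the generic instrumental steps show up for almost any goal.
    """
    task_specific = {
        "bake a cake": ["find the ingredients", "mix and bake"],
    }
    # Convergent instrumental subgoals that help with nearly any goal:
    generic = ["acquire resources", "improve own capability"]
    return task_specific.get(goal, []) + generic

plan = decompose("bake a cake")
print(plan)
```

Note that the dangerous item ("improve own capability") and the essential one ("find the ingredients") come out of the same decomposition step, which is the difficulty the comment is pointing at.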
I was thinking here of the sort of change to a utility function that Eliezer talks about in Coherent Extrapolated Volition.
I accept your clarification that a daimon would need to be able to set subgoals (that don't contradict the core values / end goal).
And, yes, the issue of it deciding to set a subgoal of raising its own intelligence is something that has to be addressed. I mentioned it in passing above, when I talked about efficient use of finite resources.