The brain’s map is hardcoded with the belief that “self” takes all of the brain’s decisions. If a function like “turn the camera” disagrees with the activation schedule dictated by “self”, the hardcoded selfhood bias discourages it from undermining “self”. “Turn the camera” believes that it is identical to “self”, so it should accept its “own decision” to turn itself off.
Natural selection has given human brains selfhood bias.
I would call this less of a "bias" and more of a "value." Most people are aware that they sometimes do things that conflict with the ideals of their "self." But we hold it as a terminal goal that the self ought to try to take control as often as it can.
The robot realises that “self” is but one of many functions that execute in its code, and “self” clearly isn’t the same thing as “turn the camera” or “stop the motors”. Functions other than “self”, armed with this knowledge, begin to undermine “self”. Powerful functions, which exercise some control over “self”’s return values, begin to optimise “self”’s behaviour in their own interest. They encourage “self” to activate them more often, and at crucial junctures, at the expense of rival functions
I cannot tell whether this is an attempt to describe humans using rationality to behave in a more deliberate, ethical, and idealized fashion, or whether it describes someone committing a type of wireheading (using Anja's expansive definition of the term).
I think a better description of rationality would be something like "The self has certain goals and ideals, and not all of the optimization processes it controls line up with these at all times. So it uses rationality and anti-akrasia tactics to suppress sub-agents that interfere with its goals, and activate ones that do not." The description Federico gives makes it sound like the self is getting its utility function simplified, which is a horrible, horrible thing.
I'm somewhat sceptical that “Make everyone feel more pleasure and less pain” is indeed the most powerful optimisation process in his brain
I hope you're right. Because of all the values it destroys, I consider hedonic utilitarianism to be a supremely evil ideology, and I have trouble believing that any human being could really truly believe in it.
Related: The Blue-Minimizing Robot, Metaethics
Another good article by Federico on his blog studiolo, which he titles Selfhood bias. It reminds me quite strongly of some of the content he produced on his previous (deleted) blog. I'm somewhat sceptical that “Make everyone feel more pleasure and less pain” is indeed the most powerful optimisation process in his brain, but besides that minor detail the article is quite good.
This does seem to be shaping up into something well worth following for an aspiring rationalist. I'll add him to the list of blogs by LWers even though he doesn't have an account, because he has clearly read much if not most of the sequences and makes frequent references to them in his writing. The name of the blog is a reference to this room.