endoself comments on Where do selfish values come from? - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
It seems that in this post, by "selfish" you mean something like "not updateless" or "not caring about counterfactuals". A meaning closer to the usual sense of the word would be "caring about the welfare of a particular individual" (including counterfactual instances of that individual, etc.), which seems perfectly amenable to being packaged as a reflectively consistent agent (that is not the individual in question) with a world-determined utility function.
(A reference to usage in Stuart's paper maybe? I didn't follow it.)
By "selfish" I mean how each human (apparently) cares about himself more than others, which needs an explanation because there can't be a description of himself embedded in his brain at birth. "Not updateless" is meant to be a proposed explanation, not a definition of "selfish".
No, that's not the meaning I had in mind.
This post isn't related to his paper, except that it made me think about selfishness and how it relates to AIXI and UDT.
Pointing at oneself is possible, and such a pointer looks like a reasonable description of self, referring to all the details of a particular person. That is, the interpretation of an individual's goal representation depends on the fact that the valued individual is collocated with the individual-as-agent.
Just as a file offset value stored in the memory of my computer won't refer to the same data if used on (moved to) your computer, which has different files; its usefulness depends on its being kept on the same computer, and it will continue to refer to the same data if we move the whole computer around.
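The analogy can be made concrete in a few lines of Python (the file names, contents, and offset here are invented for illustration):

```python
# A file offset is only meaningful relative to the computer it was taken on.
my_computer = {"journal.txt": "Dear diary, today I..."}
your_computer = {"journal.txt": "TODO: buy milk. Also..."}

offset = 5  # position of some data of interest on *my* computer

# The same offset, interpreted on different computers, picks out
# different data:
print(my_computer["journal.txt"][offset:offset + 5])    # "diary"
print(your_computer["journal.txt"][offset:offset + 5])  # unrelated data

# Moving the whole computer around preserves the reference:
moved = dict(my_computer)
assert moved["journal.txt"][offset:offset + 5] == "diary"
```

The offset, like the self-pointer, carries no description of its referent; it works only because it stays collocated with the data it points at.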
Now I'm confused again, as I don't see how these senses (the one I suggested and the one you explained in the parent comment) differ, other than on the point of caring vs. not caring about counterfactual versions of the same individual. You said, "each human (apparently) cares about himself more than others, which needs an explanation", and that reads to me as asking how humans can have the individual-focused utility I suggested, which you then characterized as not the meaning you had in mind...
Why not, or what do you mean by this? Common sense suggests that we do know ourselves from others at a very low, instinctive level.
I expect Wei's intuition is that knowing oneself means having an axiomatic definition of (something sufficiently similar to) oneself, so that it can be reasoned about for decision-theoretic purposes. But if we look at an axiomatic definition as merely some structure that is in a known relation to the structure it defines, then your brain state in the past is just as good, and the latter can be observed in many ways, including through memory, accounts of one's own behavior, etc., and in principle to any level of detail.
(Knowing self "at a low, instinctive level" doesn't in itself meet the requirement of having access to a detailed description, but is sufficient to point to one.)
I think there's enough science on the subject - here's the first paper (PDF) I could find with a quick Google - to sketch out an approximate answer to the question of how self-care arises over the course of an individual life. The infant first needs to form the concept of a person (what Bischof calls self-objectification), loosely speaking a being with both a body and a mind. This concept can be applied to both self and others. Then, depending on its level of emotional contagion (the likelihood of feeling similarly to others when observing their emotions), it will learn, through sophisticated operant conditioning, self-concern and other-concern at different rates.
Since the typical human degree of emotional contagion is less than unity, we tend to be selfish to some degree. I'm using the word "selfish" just as you've indicated.
Just as altruism can be related to trust, selfishness can be related to distrust.
An agent which has a high prior belief in the existence of deceptive adversaries would exhibit "selfish" behaviors.
What is your meaning then? What would you call "caring about the welfare of a particular individual (that happens to be myself)"?
Ok, I do mean:
but I don't mean:
(i.e., without the part in parentheses.) Does that clear it up?
Do you mean that the agent itself must be the person it cares about? What if the agent is carried in a backpack (of the person in question), or works over the Internet?
What if the selfish agent that cares about itself writes an AI that cares about the agent, giving this AI more optimization power, since they share the same goal?
Ah, there was a slight confusion on my part. So if I'm reading this correctly you define formally selfish to mean... selfish. :-)
The usage in Stuart's posts here just meant a certain way of calculating expected utilities. Selfish agents used only their own future utility when calculating expected utility; unselfish agents mixed in other people's utilities. To make this a bit more robust to redefinition of what's in your utility function, we could say that a purely selfish agent's expected utility doesn't change if actions stay the same but other people's utilities change.
But this is all basically within option (2).
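That robustness condition can be sketched in a few lines of Python (the weights and utility numbers are invented for illustration, not taken from Stuart's posts):

```python
def expected_utility(weights, utilities):
    """Weighted mix of everyone's utility for some fixed outcome."""
    return sum(w * u for w, u in zip(weights, utilities))

# Three people; utilities[i] is person i's utility for the outcome.
outcome_a = [10.0, 3.0, 7.0]
outcome_b = [10.0, 8.0, 1.0]  # person 0's utility unchanged, others' changed

selfish_weights = [1.0, 0.0, 0.0]      # person 0, purely selfish
unselfish_weights = [0.5, 0.25, 0.25]  # person 0, mixes in others' utilities

# A purely selfish agent's expected utility doesn't change when only
# other people's utilities change...
assert expected_utility(selfish_weights, outcome_a) == \
       expected_utility(selfish_weights, outcome_b)

# ...but an unselfish agent's does.
assert expected_utility(unselfish_weights, outcome_a) != \
       expected_utility(unselfish_weights, outcome_b)
```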
No one can mix another person's actual utility function into their own. You can mix in your estimate of it. You can mix in your estimate of what you think it should be. But the actual utility function of another person is in that other person, and not in you.
You can mix a pointer to it into your own. To see that this is different from mixing in your estimate, consider what you would do if you found out your estimate was mistaken.
Good point, if not totally right.
In general, you can have anything in your utility function you please. I could care about the number of ducks in the pond near where I grew up, even though I can't see it. And when I say I care about the number of ducks in the pond, I don't just mean my perception of it - I don't want to maximize how many ducks I think are in the pond, or I would just drug myself. However, you're right that when calculating an "expected utility" - that is, your best guess at the time - you don't usually have perfect information about other people's utility functions, just as I wouldn't have perfect information about the number of ducks in the pond, and so would have to use an estimate.
The reason it worked without this distinction in Stuart's articles on the Sleeping Beauty problem is that the "other people" were actually copies of Sleeping Beauty, so you knew that their utility functions were the same.
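The duck example distinguishes utility over the actual world from utility over one's estimate of it; a minimal Python sketch (all numbers invented):

```python
# What I care about: the actual number of ducks (not directly observable).
world_ducks = 12

def utility(actual_ducks):
    """Utility is a function of the world, not of my beliefs."""
    return actual_ducks

def expected_utility(believed_ducks):
    """My best guess at the time, computed from an estimate."""
    return utility(believed_ducks)

# "Drugging myself" inflates my estimate, and hence my expected utility...
believed_ducks = 1000
print(expected_utility(believed_ducks))  # 1000

# ...but actual utility is unchanged, because the world didn't change.
print(utility(world_ducks))  # 12
```

The wireheading failure is exactly an agent optimizing `expected_utility`'s input (its belief) instead of the world state that `utility` is actually defined over.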