I'm not sure this idea of selfness, in the sense of narrow self-interest, even makes sense.
Well, one can certainly talk about agents who have what we might describe as "narrow self-interest"; I don't care much about the distinction between self-interest and paperclipping and so on, both of which do seem to be well-defined.
E.g., whenever I experience something, I add it to a list of experiences. I get a distribution over infinite lists of experiences by applying Solomonoff induction. At each moment I define my values in terms of that distribution, and then try to maximize them. (This is reflectively inconsistent: I'll quickly self-modify toward copy-altruistic values, but still toward something that looks pretty self-interested.)
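A minimal sketch of that construction, purely as an illustration: Solomonoff induction itself is uncomputable, so this substitutes a tiny hypothesis class of periodic bit patterns with a 2^-length simplicity prior, and a stand-in "value" (count of predicted 1-experiences). The Hypothesis class, expected_value, and the hypothesis class are assumptions of the sketch, not anything claimed in the thread.

```
import itertools

# Toy, computable stand-in for the construction above. Real Solomonoff
# induction is uncomputable; here the "programs" are periodic bit patterns
# weighted by a 2^-length simplicity prior. All names are illustrative.

class Hypothesis:
    """A world that emits a periodic stream of 0/1 'experiences'."""
    def __init__(self, pattern):
        self.pattern = pattern

    def prior(self):
        # Crude complexity prior: longer patterns are exponentially less likely.
        return 2.0 ** (-len(self.pattern))

    def consistent_with(self, history):
        return all(history[i] == self.pattern[i % len(self.pattern)]
                   for i in range(len(history)))

    def continuation(self, history, horizon):
        start = len(history)
        return [self.pattern[(start + i) % len(self.pattern)]
                for i in range(horizon)]

def hypotheses(max_len=5):
    for n in range(1, max_len + 1):
        for pattern in itertools.product([0, 1], repeat=n):
            yield Hypothesis(pattern)

def posterior(history):
    """Simplicity prior renormalized over hypotheses matching the experience list."""
    consistent = [h for h in hypotheses() if h.consistent_with(history)]
    total = sum(h.prior() for h in consistent)
    return [(h, h.prior() / total) for h in consistent] if total else []

def expected_value(history, action, horizon=5):
    """Score an action by values defined now over predicted future experiences
    (here, trivially: the expected number of predicted 1-experiences)."""
    extended = history + [action]
    return sum(w * sum(h.continuation(extended, horizon))
               for h, w in posterior(extended))

if __name__ == "__main__":
    history = [0, 1, 0, 1]   # experiences observed so far
    best = max([0, 1], key=lambda a: expected_value(history, a))
    print("chosen action:", best)
```

The point of the sketch is only the shape of the procedure: condition a simplicity-weighted prior on the experience list so far, then evaluate actions by the agent's current values over the predicted continuations.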
Are you claiming that this sort of definition is incoherent, or just that such agents appear to act in service of universal values once they are wise enough?
If "wise enough" is taken to mean "not instantaneously implosive/self-defeating" and "universal values" is taken to mean "decision problem representations robust against instantaneous implosion/self-defeat", then the latter option, but in practice that amounts to a claim of incoherence; in other words the described agent is incoherent/inconsistent and thus its...