steven0461 comments on Branches of rationality - Less Wrong

75 Post author: AnnaSalamon 12 January 2011 03:24AM

Comments (64)

Comment author: AnnaSalamon 13 January 2011 11:45:47AM * 3 points

The point is, we don't just want to turn humans into coherent agents, we want to turn them into coherent agents who can be said to have the same preferences as the original humans. But given that we don't have a theory of preferences for incoherent agents, how do we know that any given trick intended to improve coherence is preference-preserving? Right now we have little to guide us except intuition.

I absolutely agree. The actual question I had written on my sheet, as I tried to figure out what a more powerful “rationality” might include, was “... into coherent agents, with something like the goals ‘we’ wish to have?” Branch #8 above is exactly the art of not having the goals-one-acts-on be at odds with the goals-one-actually-cares-about (and includes much mention of the usefulness of theory).

My impression, though, is that some of the other branches of rationality in the post are very helpful for self-modifying in a manner you’re less likely to regret. Philosophy always holds dangers, but a person approaching the question of “What goals shall I choose?”, and encountering confusing information that may affect what he wants (e.g., encountering arguments in meta-ethics, or realizing his religion is false, or realizing he might be able to positively or negatively affect a disorienting number of lives) will be much better off if he already has good self-knowledge and has accepted that his current state is his current state (vs. if he wants desperately to maintain that, say, he doesn’t care about status and that only utilitarian expected-global-happiness-impacts affect his behavior -- a surprisingly common nerd failure mode).

I don’t know how to extrapolate the preferences of myself or other people either, but my guess is that, while further theoretical work is critical, it’ll be easier to do this work in a non-insane fashion in the context of a larger, or more whole-personed, rationality. What are your thoughts here?

Comment author: steven0461 13 January 2011 11:08:44PM 2 points

will be much better off if he already has good self-knowledge and has accepted that his current state is his current state

Everything here turns on the meaning of "accept". Does it mean "acknowledge as a possibly fixable truth" or does it mean "consciously endorse"? I think you're suggesting the latter but only defending the former, which is much more obviously true.

he wants desperately to maintain that, say, he doesn’t care about status and that only utilitarian expected-global-happiness-impacts affect his behavior -- a surprisingly common nerd failure mode

Is the disagreement here about what his brain does, or about what parts of his brain to label as himself? If the former, it's not obviously common; if the latter, it's not obviously a failure mode.

Comment author: Nick_Tarleton 26 February 2011 07:26:41AM * 1 point

will be much better off if he already has good self-knowledge and has accepted that his current state is his current state

Everything here turns on the meaning of "accept". Does it mean "acknowledge as a possibly fixable truth" or does it mean "consciously endorse"?

Those both sound like basically verbal/deliberate activities, which is probably not what Anna meant. I would say "not be averse to the thought of".