DSimon comments on So You Want to Save the World - Less Wrong

41 Post author: lukeprog 01 January 2012 07:39AM




Comment author: TheOtherDave 04 January 2012 09:32:19PM 0 points

are you saying that if you believe X, there can't possibly exist any information that you haven't discovered yet that could convince you that your belief is false?

No. I'm saying that if I value X, I can't think of any information that would cause me to value NOT(X) instead.

Can you give me an example of something you desire not to do, which you would willingly edit yourself to desire to do?

Comment author: DSimon 06 January 2012 11:03:43PM 0 points

No. I'm saying that if I value X, I can't think of any information that would cause me to value NOT(X) instead.

Er, this seems to imply that you believe yourself immune to being hacked, which can't be right; human brains are far from impregnable. Do you consider such things to not be information in this context, or are you referring to "I" in a general "If I were an AI" sense, or something else?

Comment author: TheOtherDave 06 January 2012 11:16:03PM 1 point

Mm, interesting question. I think that when I said it, I was referring to "I" in a "if I were an AI" sense. Or, rather, "if I were an AI properly designed to draw inferences from information while avoiding value drift," since of course it's quite possible to build an AI that doesn't have this property. I was also clearly assuming that X is the only thing I value; if I value X and Y, discovering that Y implies NOT(X) might lead me to value NOT(X) instead. (Explicitly, I mean. In this example I started out valuing X and NOT(X), but I didn't necessarily know it.)

But the question of what counts as information (as opposed to reprogramming attempts) is an intriguing one that I'm not sure how to address. On five seconds' thought, it seems there's no clear line to be drawn between information and attempts to hack my brain, and that if I want such a distinction to exist I need to design a brain that enforces that kind of security... certainly evolution hasn't done so.