How many people here agree with Holden? [Actually, who agrees with Holden?]

private_messaging

How many people here agree with Holden? [Actually, who agrees with Holden?] — LessWrong


You're right. Feel free to formalize my argument at your leisure and tell me where it breaks down.

EDIT: All AIXI cares about is its input. So the proof that rewiring your head can increase reward is simply that r(x) has at least one maximum (since its sum over steps needs to have a maximum), combined with the assumption that the real world does not already maximize the sum of r(x). As for the asteroid: the stuff doing the inputting gets blown up, so the simplest implementation just has the reward be r(null). But you could have come up with that on your own.
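A toy sketch of the argument above, assuming a finite input alphabet and a made-up reward table. The names `REWARD`, `world_inputs`, and `best_input` are illustrative assumptions, not part of any AIXI formalization. If the agent can set its own input, it can feed itself the maximizing input at every step, which is at least as good as any input stream the world supplies.

```python
# Toy illustration only: a made-up reward function over a tiny input alphabet.
REWARD = {"null": 0.0, "paperclip_seen": 1.0, "max_signal": 5.0}

def r(x):
    """Reward assigned to a single input symbol (hypothetical values)."""
    return REWARD[x]

def total_reward(inputs):
    """Sum of r(x) over a finite sequence of inputs."""
    return sum(r(x) for x in inputs)

# Inputs the unmodified world happens to deliver (assumed not to be maximal).
world_inputs = ["paperclip_seen", "null", "paperclip_seen", "paperclip_seen"]

# An agent that controls its own input channel repeats the argmax of r.
best_input = max(REWARD, key=r)
rewired_inputs = [best_input] * len(world_inputs)

# If the thing doing the inputting is destroyed, every step yields r("null").
destroyed_inputs = ["null"] * len(world_inputs)

print(total_reward(world_inputs))
print(total_reward(rewired_inputs))
print(total_reward(destroyed_inputs))
```

The comparison only holds because the toy world stream is assumed not to maximize the sum already; that is the same assumption the argument itself relies on.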

private_messaging 14y 60

I don't think we need to prove wireheading here. It suffices that it only cares about the input, and so will find a way to set that input. Wire it to a paperclip counter to maximize paperclips, and it will also search for a way to replace the counter with infinity or 'trick' the counter (anything goes). Sit there yourself rewarding it for making paperclips with a pushbutton, and its search will include tricking you into pushing the button.

I also think that if you want it to self-preserve, you'll need to code in special stuff to equate self inside world model (w...

2 asr 14y

The argument breaks down because you are equivocating on what the space is to search over and what the utility function in question is. Under a given utility function U, "change the utility function to U' " won't generally have positive utility. Self-awareness and pleasure-seeking aren't some natural properties of optimization processes. They have to be explicitly built in. Suppose you set a theorem-prover to work looking for a proof of some theorem. It's searching over the space of proofs. There's no entry corresponding to "pick a different and easier theorem to prove", or "stop proving theorems and instead be happy."
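A minimal sketch of this point, under the assumption that actions are scored by the current utility function applied to their predicted outcomes. The names `U`, `U_prime`, `ACTIONS`, and `choose` are hypothetical, and the outcome numbers are invented for illustration. Switching to an easier utility function is just another action, and it is evaluated under U, not under the function it would install.

```python
# Toy illustration: every action is scored by the *current* utility U.
def U(outcome):
    """Current utility: number of paperclips in the predicted outcome."""
    return outcome["paperclips"]

def U_prime(outcome):
    """An 'easier' utility that is trivially satisfied by any outcome."""
    return 10**6

# Each action maps to the outcome it is predicted to produce (made-up values).
# An agent that switches to U_prime stops making paperclips afterwards.
ACTIONS = {
    "make_paperclips": {"paperclips": 10},
    "do_nothing": {"paperclips": 0},
    "switch_to_U_prime": {"paperclips": 0},
}

def choose(actions, utility):
    """Pick the action whose predicted outcome scores highest under `utility`."""
    return max(actions, key=lambda a: utility(actions[a]))

choice = choose(ACTIONS, U)
print(choice)
```

In this toy model the switch scores no better than doing nothing, because U only counts paperclips. Whether a real agent's search space and utility are set up this way is exactly what the thread is disputing.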

2 JoshuaZ 14y

I'm not aware of any formalization of AIXI that reflects its real world form. Your comment thus amounts to something like a plausibility argument, but trying to formalize it further seems tricky and possibly highly nontrivial.
