
Slider comments on Presidents, asteroids, natural categories, and reduced impact - Less Wrong Discussion

Post author: Stuart_Armstrong 06 July 2015 05:44PM




Comment author: Stuart_Armstrong 07 July 2015 12:28:31PM 0 points

If ¬X happened, the result would be misaiming. But since X happens (almost certainly), it aims correctly.

Comment author: Slider 07 July 2015 05:51:42PM 1 point

The programmer expects ¬X and must program the bot with things that are X-agnostic, so it is planning not to aim. Then, because the programmed bot can't be X-sensitive, it will essentially behave as if ¬X.

If the mission is to do the grue thing given that t<1, and the executor of the strategy cannot have references to t, the simple strategy is for the bot to do the green thing. Now if we put the programmed bot in an environment where t>1, the grue thing to do would be to press the blue button, but the bot presses the green button. Such a solution is neither grue-friendly nor blue-friendly.
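A minimal Python sketch of this grue point (all names here are illustrative, not from the thread): the bot's policy is fixed while t<1 and may not reference t, so it can only hard-code the green action, which comes apart from the grue-correct action once t>1.

```python
def grue_correct_action(t: float) -> str:
    # What a t-aware agent should do: grue means green before t=1, blue after.
    return "green" if t < 1 else "blue"

def compiled_bot_action() -> str:
    # The t-agnostic bot: programmed while grue and green coincide,
    # it can only hard-code the green action.
    return "green"

for t in (0.5, 1.5):
    print(t, compiled_bot_action(), grue_correct_action(t))
# At t=1.5 the bot still presses green while the grue-correct action is
# blue: the fixed policy is neither grue-friendly nor blue-friendly.
```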

Comment author: Stuart_Armstrong 08 July 2015 10:18:51AM 0 points

I'm not sure what you're saying. The AI is programmed to have reduced impact, conditional on ¬X. If ¬X happens, then outputting the correct y coordinate is reduced impact, which it will thus do (as it is separately motivated to do that).

So, given ¬X, the AI is motivated to: a) output the correct y coordinate (or cause its subagent to do so), b) have a reduced impact overall.

The whole construction is an attempt to generalise a) and b) to X, even though they are in tension/contradiction with each other under X (because outputting the correct y coordinate will have a high impact).
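A rough sketch of how this motivation structure reads (the reward shape and all names are my own illustrative assumptions, not Stuart's formalism): both motivations are scored only conditional on ¬X, where they point the same way; under X they come apart.

```python
def reward(x_ai_on: bool, y_correct: bool, impact: float) -> float:
    # Motivations (a) and (b) are only evaluated conditional on ¬X
    # (the x-coordinate AI being off).
    if x_ai_on:
        return 0.0                             # the conditional reward is silent under X
    aim_bonus = 1.0 if y_correct else 0.0      # (a) output the correct y coordinate
    impact_penalty = -impact                   # (b) have a reduced impact overall
    return aim_bonus + impact_penalty

# Under ¬X, a correct y is itself low impact, so both terms agree:
print(reward(x_ai_on=False, y_correct=True, impact=0.0))   # 1.0
# Under X, a correct y would deflect the asteroid (high impact), and the
# conditional reward says nothing about which motivation should win:
print(reward(x_ai_on=True, y_correct=True, impact=100.0))  # 0.0
```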

Comment author: Slider 08 July 2015 10:56:19AM 1 point

> If ¬X happens, then outputting the correct y coordinate is reduced impact, which it will thus do (as it is separately motivated to do that).

> If the x-coordinate AI is not turned on (call this event ¬X), it is motivated to have reduced impact. This motivation is sufficiently strong that it will not want to have the correct y-coordinate outputted.

These bits are contradictory. One tells a story where a tie between two low-impact options is broken by an aiming instinct, so the bot aims anyway. The other says the "sit tight" instinct will overwhelm the aiming instinct.

If you want to control what happens in X, drives that are conditioned on ¬X are irrelevant. In my understanding the attempt is to generalise the reduced-impact drive by not having it conditioned on X. Then what it does in ¬X cannot be based on the fact that ¬X holds. But it can't deduce that aiming is low impact even in ¬X, because it must assume that the x-aiming robot could be on, and that would make aiming a high-impact decision. It must use the same decision process in both X and ¬X, and the X decision process can't be based on what it would do if it were allowed to assume ¬X (that is, you are not allowed to know whether the grue object is currently green or blue, and you can't decide what you would do if it were green based on what you would do if it were blue).
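A sketch of this objection in expected-impact terms (the probability and impact numbers are made up for illustration): a single X-agnostic policy is scored across both worlds, so "aim" inherits the full impact of the X world, and a reduced-impact bot sits tight even in ¬X.

```python
P_X = 0.5  # the bot must treat X as live, whatever its actual probability

# Impact of each action in each world (True = x-aiming robot is on).
IMPACT = {
    ("aim", True): 100.0,  # aiming while the x-AI is on deflects the asteroid
    ("aim", False): 0.0,   # aiming while it is off changes nothing
    ("sit tight", True): 0.0,
    ("sit tight", False): 0.0,
}

def expected_impact(action: str) -> float:
    # One decision process for both X and ¬X: it may not branch on X.
    return P_X * IMPACT[(action, True)] + (1 - P_X) * IMPACT[(action, False)]

print({a: expected_impact(a) for a in ("aim", "sit tight")})
# {'aim': 50.0, 'sit tight': 0.0} -- the reduced-impact bot never aims,
# even in the ¬X world where aiming would have been harmless.
```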

Comment author: Stuart_Armstrong 08 July 2015 12:09:32PM 0 points

> These bits are contradictory

Indeed. I have corrected the top post. Thanks!