whowhowho comments on Decision Theory FAQ - Less Wrong

52 Post author: lukeprog 28 February 2013 02:15PM

You are viewing a comment permalink. View the original post to see all comments and the full post content.

Comments (467)

You are viewing a single comment's thread. Show more comments above.

Comment author: whowhowho 15 March 2013 12:35:23PM -2 points [-]

If Clippy wants to be efficient, Clippy must be rational and knowledgeable. If Clippy wants to be rational, CLippy must value reason. The -- open --question is whether Clippy can become ever more rational without realising at some stage that Clipping is silly or immoral. Can Clippy keep its valuation of clipping firewalled from everything else in its mind, even when such doublethink is rationally disvalued?

Comment author: Viliam_Bur 15 March 2013 12:55:28PM *  2 points [-]

I assume that Clippy already is rational, and it instrumentally values remaining rational and, if possible, becoming more rational (as a way to make most paperclips).

The -- open -- question is whether Clippy can become ever more rational without realising at some stage that Clipping is silly or immoral.

The correct model of humans will lead Clippy to understand that humans consider Clippy immoral. This knowledge has an instrumental value for Clippy. How will Clippy use this knowledge, that depends entirely on the power balance between Clippy and humans. If Clippy is stronger, it can ignore this knowledge, or just use it to lie to humans to destroy them faster or convince them to make paperclips. If humans are stronger, Clippy can use this knowledge to self-modify to become more sympathetic to humans, to avoid being destroyed.

Can Clippy keep its valuation of clipping firewalled from everything else in its mind

Yes, if it helps to maximize the number of paperclips.

even when such doublethink is rationally disvalued?

Doublethink is not the same as firewalling; or perhaps it is imperfect firewalling on the imperfect human hardware. Clippy does not doublethink when firewalling; Clippy simply reasons: "this is what humans call immoral; this is why they call it so; this is how they will probably react on this knowledge; and most importantly this is how it will influence the number of paperclips".

Only if the humans are stronger, and Clippy has the choice to a) remain immoral, get in conflict with humans and be destroyed, leading to a smaller number of paperclips; or b) self-modify to value paperclip maximization and morality, predictably cooperate with humans, leading to a greater number of paperclips; then in absence of another choice (e.g. successfully lying to humans about its morality, or make it more efficient for humans to cooperate with Clippy instead of destroying Clippy) Clippy would choose the latter, to maximize the number of paperclips.

Comment author: wedrifid 15 March 2013 01:54:34PM 2 points [-]

Warning: Parent Contains an Equivocation.

If Clippy wants to be efficient, Clippy must be rational and knowledgeable. If Clippy wants to be rational, CLippy must value reason. The -- open --question is whether Clippy can become ever more rational without realising at some stage that Clipping is silly or immoral. Can Clippy keep its valuation of clipping firewalled from everything else in its mind, even when such doublethink is rationally disvalued?

The first usage of 'rational' in the parent conforms to the standard notions on lesswrong. The remainder of the comment adopts the other definition of 'rational' (which consists of implementing a specific morality). There is nothing to the parent except taking a premise that holds with the standard usage and then jumping to a different one.