Rain comments on xkcd on the AI box experiment - Less Wrong

15 Post author: FiftyTwo 21 November 2014 08:26AM

You are viewing a comment permalink. View the original post to see all comments and the full post content.

Comments (229)

You are viewing a single comment's thread. Show more comments above.

Comment author: XiXiDu 21 November 2014 12:05:59PM *  6 points [-]

Regarding Yudkowsky's accusations against RationalWiki. Yudkowsky writes:

First false statement that seems either malicious or willfully ignorant:

In LessWrong's Timeless Decision Theory (TDT),[3] punishment of a copy or simulation of oneself is taken to be punishment of your own actual self

TDT is a decision theory and is completely agnostic about anthropics, simulation arguments, pattern identity of consciousness, or utility.

Calling this malicious is a huge exaggeration. Here is a quote from the LessWrong Wiki entry on Timeless Decision Theory:

When Omega predicts your behavior, it carries out the same abstract computation as you do when you decide whether to one-box or two-box. To make this point clear, we can imagine that Omega makes this prediction by creating a simulation of you and observing its behavior in Newcomb's problem. [...] TDT says to act as if deciding the output of this computation...

RationalWiki explains this in the way that you should act as if it is you that is being simulated and who possibly faces punishment. This is very close to what the LessWrong Wiki says, phrased in a language that people with a larger inferential distance can understand.

Yudkowsky further writes:

The first malicious lie is here:

an argument used to try and suggest people should subscribe to particular singularitarian ideas, or even donate money to them, by weighing up the prospect of punishment versus reward

Neither Roko, nor anyone else I know about, ever tried to use this as an argument to persuade anyone that they should donate money.

This is not a malicious lie. Here is a quote from Roko's original post (emphasis mine):

...there is the ominous possibility that if a positive singularity does occur, the resultant singleton may have precommitted to punish all potential donors who knew about existential risks but who didn't give 100% of their disposable incomes to x-risk motivation. This would act as an incentive to get people to donate more to reducing existential risk, and thereby increase the chances of a positive singularity. This seems to be what CEV (coherent extrapolated volition of humanity) might do if it were an acausal decision-maker.1 So a post-singularity world may be a world of fun and plenty for the people who are currently ignoring the problem, whilst being a living hell for a significant fraction of current existential risk reducers (say, the least generous half). You could take this possibility into account and give even more to x-risk in an effort to avoid being punished.

This is like a robber walking up to you and explaining that you could take into account that he could shoot you if you don't give him your money.

Also notice that Roko talks about trading with uFAIs as well.