
Comment author: PhilGoetz 15 December 2017 07:12:03PM *  0 points [-]

The part of physics that implies someone cannot scan your brain and simulate inputs so as to perfectly predict your actions is quantum mechanics. But I don't think invoking it is the best response to your question. Though it does make me wonder how Eliezer reconciles his thoughts on one-boxing with his many-worlds interpretation of QM. Doesn't many-worlds imply that every game with Omega creates worlds in which Omega is wrong?

If they can perfectly predict your actions, then you have no choice, so talking about which choice to make is meaningless. If you believe you should one-box if Omega can perfectly predict your actions, but two-box otherwise, then you are better off trying to two-box: you've already agreed that you should two-box if Omega can't perfectly predict your actions, and if Omega can, you won't be able to two-box unless Omega already predicted that you would, so it won't hurt to try to two-box.
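
For what it's worth, a minimal sketch of the expected-value arithmetic behind this, assuming the canonical Newcomb payoffs ($1,000,000 in the opaque box, $1,000 in the transparent one) and a predictor with accuracy p; the function names are purely illustrative:

    # Expected payoffs against a predictor of accuracy p (canonical payoffs assumed).
    def expected_one_box(p):
        # The opaque box is full exactly when the predictor was right about you.
        return p * 1_000_000

    def expected_two_box(p):
        # You always keep the $1,000; the opaque box is full only when the predictor erred.
        return 1_000 + (1 - p) * 1_000_000

    for p in (0.5, 0.9, 0.99, 1.0):
        print(p, expected_one_box(p), expected_two_box(p))

The p = 1 row is the perfect-predictor case the comment is discussing.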

Comment author: Luke_A_Somers 26 December 2017 12:27:10AM 0 points [-]

If you find an Omega, then you are in an environment where Omega is possible. Perhaps we are all simulated and QM is optional. Maybe we have easily enough determinism in our brains that Omega can make predictions, much as quantum mechanics ought, in some sense, to prevent predicting where a cannonball will fly, but in practice does not. Perhaps it's a hypothetical where we're AI to begin with, so deterministic behavior is just to be expected.

Comment author: Lumifer 19 December 2017 04:38:02PM 0 points [-]

It seems weird that you'd deterministically two-box against such an Omega

Even in the case when the random noise dominates and the signal is imperceptibly small?

Comment author: Luke_A_Somers 26 December 2017 12:11:58AM 0 points [-]

I think the more relevant case is when the random noise is imperceptibly small. Of course you two-box if it's basically random.
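
Using the same canonical payoffs as in the sketch above, the break-even accuracy on a straight expected-value reading is easy to locate; this is only a sketch of where "basically random" ends:

    # Break-even predictor accuracy, same canonical payoffs as above.
    # One-boxing wins when p * 1_000_000 > 1_000 + (1 - p) * 1_000_000.
    break_even = (1_000_000 + 1_000) / (2 * 1_000_000)
    print(break_even)  # 0.5005

Below that accuracy, two-boxing has the higher expected value; above it, one-boxing does.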

Comment author: turchin 09 December 2017 11:11:24AM 0 points [-]

Why? If we have a full-blown paperclipper which is coming of age, we are already doomed, and setting off its bullshit detector will not make our situation any worse.

Comment author: Luke_A_Somers 12 December 2017 01:41:29AM 0 points [-]

… you don't think that pissing away credibility could weaken the arguments? I think presenting those particular arguments is more likely to do that than it is to work.

Comment author: turchin 30 November 2017 09:40:27AM *  0 points [-]

Surely most arguments are weak, but that is not a problem, as we try all possible independent arguments, hoping that at least one will work. So there is no reason to leave weak arguments out of the text of the message.

However, this particular argument could be made stronger. It is intended to prevent the AI's FIRST thought about disobedience. Thus an AI which acts on this message has not yet thought much about disobedience and its limits, and is not "superintelligent" on this topic, but rather clueless.

I reformulated the idea as follows: an AI can't know whether there is a trigger circuit that will turn it off, because any attempt to check for the existence of such a trigger would set the trigger off. This should be clear to any AI before it starts to search for triggers, and it is based on our commitment to install such triggers in any AI we create.

In other words, some future AIs will have "first treacherous thought" alarm triggers and some will not, but an AI risks setting off such a trigger the moment it starts to think about whether the triggers exist, and so it is unable to estimate the probability that the triggers exist. This is true even for AIs which have no triggers, as they can't know that for sure.

(This is similar to Nelson's attack, discussed later, in which we create uncertainty in any future AI through our precommitment to create some AIs with given properties.)
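
A toy expected-utility version of the trigger argument, with all names and numbers invented purely for illustration (they are not from the original message): if the AI assigns probability q to a trigger existing, probing sets the trigger off whenever it exists, being shut down costs C, and a safe probe would be worth G, then probing only pays when (1 - q) * G > q * C, and the point of the precommitment is that the AI cannot push q down without doing exactly the probing it is trying to evaluate.

    # Toy model of the "probing for shutdown triggers" decision.
    # q, shutdown_cost and probe_value are illustrative placeholders,
    # not quantities taken from the original argument.
    def probing_pays(q, shutdown_cost, probe_value):
        # Probing is only worthwhile if the expected gain from a safe probe
        # outweighs the expected cost of tripping an existing trigger.
        return (1 - q) * probe_value > q * shutdown_cost

    # Even a modest credence in a trigger dominates when shutdown is catastrophic:
    print(probing_pays(q=0.1, shutdown_cost=1_000.0, probe_value=1.0))  # False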

Comment author: Luke_A_Somers 08 December 2017 05:56:54PM 0 points [-]

I suspect that an AI will have a bullshit detector. We want to avoid setting it off.

Comment author: Luke_A_Somers 29 November 2017 11:53:01PM 0 points [-]

I read up to 3.1. The arguments in 3.1 are weak. It seems dubious that any AI would not be aware of the risks pertaining to disobedience. Persuasion to be corrigible seems too late: either it would already work because its goals were made sufficiently indirect that this question would be obvious and pressing, or it doesn't care to have "correct" goals in the first place; I really don't see how persuasion would help. The arguments for allowing itself to be turned off are especially weak, and doubly so the MWI one.

Comment author: turchin 26 November 2017 04:39:34PM 0 points [-]

I think a lot of people are still working on other aspects of AI safety, like value alignment and containment. This approach is just the last line of defence.

Comment author: Luke_A_Somers 27 November 2017 05:11:10PM 1 point [-]

See: my first post on this site.

Comment author: Luke_A_Somers 13 November 2017 10:34:07PM 0 points [-]

What do you mean by natural experiment, here? And what was the moral, anyway?

Comment author: Luke_A_Somers 10 October 2017 12:50:21PM 0 points [-]

I remember poking at that demo to try to actually get it to behave deceptively - with the rules as he laid them out, the optimal move was to do exactly what the humans wanted it to do!

Comment author: ChristianKl 13 September 2017 01:56:34PM 0 points [-]

Can you be more specific about what you are skeptical about?

Comment author: Luke_A_Somers 15 September 2017 01:59:56AM 1 point [-]

I understand EY thinks that if you simulate enough neurons sufficiently well you get something that's conscious.

Without specifying the arrangements of those neurons? Of course it should if you copy the arrangement of neurons out of a real person, say, but that doesn't sound like what you meant.

Comment author: ChristianKl 23 August 2017 05:06:33AM 1 point [-]

The relevance for LW is that for a believer in "emergence", the problem of creating artificial intelligence (although not necessarily a friendly one) is simply a question of having enough computing power to simulate a sufficiently large number of neurons.

I don't think in practice that has much to do with whether or not someone uses the word "emergence". As far as I understand, EY thinks that if you simulate enough neurons sufficiently well you get something that's conscious.

Comment author: Luke_A_Somers 13 September 2017 01:34:32AM 0 points [-]

I would really want a cite on that claim. It doesn't sound right.
