Daniel Kokotajlo (5y)

Thanks for the thoughtful pushback! It was in anticipation of comments like this that I put in hedging language like "I think" and "perhaps." My replies:

This seems a bit like writing the bottom line first?
Like, AI fears in our community have come about because of particular arguments. If those arguments don't apply, I don't see why one should strongly assume that AI is to be feared, outside of having written the bottom line first.

1. Past experience has shown that even when particular AI risk arguments don't apply, an AI design is often still risky; we just haven't thought of the reasons why yet. So we should make a pessimistic meta-induction and conclude that even if our standard arguments for risk don't apply, the system might still be risky, and we should think more about it.

2. I intended those two "perhaps..." statements to be things the person says, not necessarily things that are true. So yeah, maybe they *say* the standard arguments don't apply. But maybe they are wrong. People are great at rationalizing, coming up with reasons to get to the conclusion they wanted. If the conclusion they want is "We finally did it and made a super powerful impressive AI, come on come on let's take it for a spin!" then it'll be easy for them to fool themselves into thinking their architecture is sufficiently different so as not to be problematic, even when it is just a special case of the architecture in the standard arguments.

Points 1 and 2 are each individually sufficient to vindicate my claims, I think.

It also seems kind of condescending to operate under the assumption that you know more about the AI system someone is creating than the person who's creating it knows? You refer to their safety strategy as "amateur", but isn't there a chance that having created this system entitles them to a "professional" designation? A priori, I would expect that an outsider not knowing anything about the project at hand would be much more likely to qualify for the "amateur" designation.

3. I'm not operating under the assumption that I know more about the AI system someone is creating than the person who's creating it knows. The fact that you said this dismays me, because it is such an obvious straw man. It makes me wonder if I touched a nerve somehow, or had the wrong tone or something, to raise your hackles.

4. Yes, I refer to their safety strategy as amateur. Yes, this is appropriate. AI safety is related to AI capabilities, but the two are distinct sub-fields, and someone who is great at one could be not so great at the other. Someone who doesn't know the AI safety literature, who does something to make their AI safe, probably deserves the title amateur. I don't claim to be a non-amateur AI scientist, and whether I'm a non-amateur AI safety person is irrelevant because I'm not going to be one of the people in The Talk. I do claim that e.g. someone like Paul Christiano or Stuart Russell is a professional AI safety person, whereas most AI scientists are not.

This isn't obvious to me. One possibility is that there will be some system which is safe if used carefully, and having a decent technological lead gives you plenty of room to use it carefully, but if you delay your development too much, competing teams will catch up and you'll no longer have space to use it carefully. I think you have to learn more about the situation to know for sure whether a month of delay is a good thing.

5. I agree that this is a possibility. This is why I said "say it buys us a month"; I meant that to be an average of the various possibilities. In retrospect I was unclear; I should have clarified that it might not be a good idea to delay at all, for the reasons you mention. I agree we have to learn more about the situation; in retrospect I shouldn't have said "I think it would be better for these conversations to end X way" (even though that is what I think is most likely) but rather found some way to express the more nuanced position.

6. I agree with everything you say about overconfidence, echo chambers, etc., except that I don't think I was writing the bottom line first in this case. I was making a claim without arguing for it, but then I argued for it in the comments when you questioned it. It's perfectly reasonable (indeed necessary) to have some unargued-for claims in any particular finite piece of writing.

Preparing for "The Talk" with AI projects
