Dmytry comments on against "AI risk" - Less Wrong Discussion

Post author: Wei_Dai 11 April 2012 10:46PM 24 points


Comment author: Dmytry 12 April 2012 05:40:22PM * -1 points

Pretty ordinary meaning: a bunch of people trusting extraordinary claims not backed by any evidence or expert consensus, originating from a charismatic leader who earns a living off the cultists. Subtype: doomsday. Now, I don't give any plus or minus points for the leader-earning-a-living-off-cultists part, but the general lack of expert concern about the issue is a killer. Experts being people with expertise on the relevant subject (but no doomsday experts allowed; the expertise has to be something practically useful, or at least not all about the doomsday itself, else you start counting theologians as experts). E.g. for AI risk, the relevant experts might be people with CS accomplishments: the folks who made the self-driving car, the visual object recognition experts, the speech recognition experts, the people who developed actual working AI of some kind, etc.

I wonder what would happen if we trained an SPR for cult recognition (http://lesswrong.com/lw/3gv/statistical_prediction_rules_outperform_expert/). SPRs don't make allowances for any unusual redeeming qualities or special circumstances.
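For concreteness, here is a minimal sketch of what such an SPR might look like; the features, weights, and example group are made up purely for illustration, not taken from any study:

```python
# Minimal sketch of a statistical prediction rule (SPR) for "cult-likeness".
# The features and weights are hypothetical illustrations, not values from any study.

FEATURES = [
    ("charismatic_central_leader", 2.0),
    ("leader_earns_living_off_followers", 1.5),
    ("extraordinary_claims_without_expert_support", 3.0),
    ("doomsday_or_salvation_narrative", 2.0),
    ("discourages_outside_criticism", 2.5),
]

def cult_score(group):
    """Weighted sum of binary features; `group` maps feature name -> True/False."""
    return sum(weight for name, weight in FEATURES if group.get(name, False))

# The same rule is applied to every group, ignoring any
# "unusual redeeming qualities or special circumstances".
some_group = {
    "charismatic_central_leader": True,
    "extraordinary_claims_without_expert_support": True,
}
print(cult_score(some_group))  # 5.0; compare against a calibrated cutoff
```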

Can you list some non-cults that are most similar to LW/SIAI?

Comment author: Incorrect 12 April 2012 07:22:54PM 0 points

extraordinary claims not backed by any evidence

There are two claims whose conjunction must be true in order for a doomsday scenario to be likely:

  1. self-improving human-level AI, if created, would be dangerous enough to cause one
  2. humans are likely to create human-level AI

I am unsure of 2 but believe 1. Do you disagree with 1?

Comment author: Dmytry 12 April 2012 08:28:11PM * -1 points

I think the problem is conflating different aspects of intelligence into one variable. The three major groups of aspects are:

1: Thought/engineering/problem-solving/etc.; it can work entirely within a mathematical model. This is the part we are making steady progress on.

2: Real-world volition, especially the will to form the most accurate beliefs about the world. This we don't know how to build, and don't even need to automate. We ourselves aren't even a shining example of 2, but we generally don't care so much about that. 2 is a hard philosophical problem.

3: Morals.

Even strongly superhuman 1 by itself is entirely harmless, even if very general within the problem space of 1. 2 without 1 can't invent anything. 3 may follow from strong 1 and 2, assuming the AI assigns a non-zero chance to being under test in a simulation, with strong 1 providing enormous resources.

So, what is your human-level AI?

It seems to me that people with a high capacity for 1, i.e. the engineers and scientists, are so dubious about AI risk because it is pretty clear to them, both internally and from the AI effort itself, that 1 doesn't imply 2 and adding 2 won't strengthen 1. There isn't some great issue with 1 that 2 would resolve; 1 works just fine. If, for example, we invent an awesome automatic software-development AI, it will be harmless even if superhuman at programming, and will self-improve as much as possible without 2. Not just harmless: there's no reason why a 1-agent plus a human are together any less powerful than a 1-agent with 2-capability.

Eliezer, it looks like, is very concerned with forming accurate beliefs, i.e. 2-type behaviour, but I don't see him inventing novel solutions as much. Maybe he's so scared of the AI because he attributes other people's problem solving to an intellect paralleling his, while it's more orthogonal. Maybe he imagines that a strongly more-2 agent will somehow be innovative and foom, and he sees a lot of room for improving the 2. Or something along those lines. He is a very unusual person; I don't know how he thinks. The way I think, it is very natural that problem solving does not require wanting to actually do anything real first. That also parallels the software effort, because ultimately everyone capable of working effectively as an innovative software developer is very 1-orientated and doesn't see 2 as either necessary or desirable. I don't think 2 would just suddenly appear out of nothing by some emergence or accident.

Comment author: Incorrect 12 April 2012 08:46:01PM 0 points

Even strongly superhuman 1 by itself is entirely harmless, even if very general within the problem space of 1.

Type 1 intelligence is dangerous as soon as you try to use it for anything practical simply because it is powerful. If you ask it "how can we reduce global temperatures" and "causing a nuclear winter" is in its solution space, it may return that. Powerful tools must be wielded precisely.

Comment author: Dmytry 12 April 2012 08:49:18PM * 0 points

See, that's what is so incredibly irritating about dealing with people who lack any domain-specific knowledge. You can't ask it "how can we reduce global temperatures" about the real world.

You can ask it how to make a model out of data, and you can ask it what to do to the model so that such-and-such function decreases; it may try nuking this model (inside the model) and generate such a solution. You have to actually put in a lot of effort, like mindlessly replicating its in-model actions in the real world, for that nuking to happen in the real world. (And you'll also have the model visualization to examine, by the way.)
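A toy sketch of that workflow (the "climate model" and candidate interventions below are invented purely for illustration): the solver optimizes inside a fitted model, happily picks the most extreme intervention in its input space, and what comes out is a candidate plan plus its in-model consequences for a human to examine, not an action taken in the world:

```python
# Toy sketch: a type-1 solver optimizing inside a model, not acting in the world.
# The "climate model" and the candidate interventions are invented for illustration.

def predicted_temperature_change(intervention):
    """Hypothetical fitted model: maps an intervention to predicted warming (deg C)."""
    effects = {
        "plant_forests": -0.3,
        "solar_panels": -0.5,
        "carbon_tax": -0.7,
        "nuclear_winter": -8.0,   # extreme corner of the model's input space
    }
    return effects[intervention]

def minimize_in_model(candidates):
    """Pure type-1 step: pick whatever minimizes the modeled objective."""
    return min(candidates, key=predicted_temperature_change)

candidates = ["plant_forests", "solar_panels", "carbon_tax", "nuclear_winter"]
plan = minimize_in_model(candidates)

# The output is a proposal plus its in-model consequences for a human to examine;
# nothing happens in the real world unless someone mindlessly executes the plan.
print(plan, predicted_temperature_change(plan))  # nuclear_winter -8.0
```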

Comment author: Incorrect 12 April 2012 08:56:39PM * 0 points

What if, instead of giving the solution "cause nuclear war", it simply gives a seemingly innocuous solution expected to cause nuclear war? I'm assuming that the modelling portion is a black box so you can't look inside and see why that solution is expected to lead to a reduction in global temperatures.

If the software is using models we can understand and check ourselves then it isn't nearly so dangerous.

Comment author: Dmytry 12 April 2012 09:02:09PM * -2 points

I'm assuming that the modelling portion is a black box so you can't look inside and see why that solution is expected to lead to a reduction in global temperatures.

Let's just assume that Mister President sits on the nuclear launch button by accident, shall we?

It isn't an amazing novel philosophical insight that type-1 agents 'love' to solve problems in the wrong way. It is a fact of life, apparent even in the simplest automated software of that kind. You also, of course, have some pretty visualization of the scenario in which the parameter was minimized or maximized.

edit: Also, the answers could be really funny. How do we solve global warming? Okay, just abduct the prime minister of China! That should cool the planet off.

Comment author: Incorrect 12 April 2012 09:09:42PM * 0 points

It isn't an amazing novel philosophical insight that type-1 agents 'love' to solve problems in the wrong way. It is a fact of life, apparent even in the simplest automated software of that kind.

Of course it isn't.

Let's just assume that Mister President sits on the nuclear launch button by accident, shall we?

There are machine learning techniques like genetic programming that can result in black-box models. As I stated earlier, I'm not sure humans will ever combine black-box problem-solving techniques with self-optimization and attempt to use the product to solve practical problems; I just think it is dangerous to do so once the techniques become powerful enough.
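As a toy illustration of how genetic programming can yield a model nobody designed or can easily read (the target function, operators, and population sizes below are arbitrary choices made for this sketch, not anything from a real system):

```python
# Minimal sketch of genetic programming producing an opaque, evolved model.
# It evolves small expression trees to fit y = x**2 + x on a few sample points;
# the surviving expression is often a redundant, nested tree no human wrote.

import random

OPS = {"+": lambda a, b: a + b,
       "-": lambda a, b: a - b,
       "*": lambda a, b: a * b}

def random_tree(depth=3):
    """Random expression tree: ('op', left, right), the variable 'x', or a constant."""
    if depth == 0 or random.random() < 0.3:
        return "x" if random.random() < 0.5 else random.randint(-2, 2)
    op = random.choice(list(OPS))
    return (op, random_tree(depth - 1), random_tree(depth - 1))

def evaluate(tree, x):
    if tree == "x":
        return x
    if isinstance(tree, int):
        return tree
    op, left, right = tree
    return OPS[op](evaluate(left, x), evaluate(right, x))

def mutate(tree):
    """Replace a randomly chosen subtree with a fresh random one."""
    if isinstance(tree, tuple) and random.random() < 0.7:
        op, left, right = tree
        if random.random() < 0.5:
            return (op, mutate(left), right)
        return (op, left, mutate(right))
    return random_tree(2)

def fitness(tree, xs):
    """Sum of squared errors against the target y = x**2 + x (lower is better)."""
    err = 0
    for x in xs:
        val = evaluate(tree, x)
        if abs(val) > 1e6:          # guard against runaway expressions
            return float("inf")
        err += (val - (x * x + x)) ** 2
    return err

xs = list(range(-5, 6))
population = [random_tree() for _ in range(200)]
for _ in range(30):
    population.sort(key=lambda t: fitness(t, xs))
    survivors = population[:50]
    population = survivors + [mutate(random.choice(survivors)) for _ in range(150)]

best = min(population, key=lambda t: fitness(t, xs))
# Even when it fits well, `best` is usually a nested tree that has to be treated
# as a black box unless you go to the trouble of simplifying it by hand.
print(best, fitness(best, xs))
```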

Comment author: Dmytry 12 April 2012 09:16:20PM * 0 points

There are machine learning techniques like genetic programming that can result in black-box models.

Which are even more prone to outputting crap solutions even without being superintelligent.

Comment author: Incorrect 12 April 2012 09:18:12PM * 1 point

Yup, we seem safe for the moment because we simply lack the ability to create anything dangerous.

Sorry you're being downvoted. It's not me.