
Lumifer comments on The map of "Levels of defence" in AI safety - Less Wrong Discussion

0 Post author: turchin 12 December 2017 10:44AM


Comments (5)


Comment author: Lumifer 04 January 2018 03:47:35PM 0 points

Are you reinventing Asimov's Three Laws of Robotics?

Comment author: RedMan 04 January 2018 07:17:21PM * 0 points

I hadn't thought about it that way.

I do think that either compile-time flags for the AI system, or a second 'monitor' system chained to the AI system to enforce the named rules, would probably limit the damage.

The broader point is that probabilistic AI safety is probably a much more tractable problem than absolute AI safety, for a lot of reasons. To further the nuclear analogy, emergency shutdown is probably a viable safety measure for many of the plausible 'paperclip maximizer turns us into paperclips' scenarios.

"I need to disconnect the AI safety monitoring robot from my AI-enabled nanotoaster robot prototype because it keeps deactivating it" might still be the last words a human ever speaks, but hey, we tried.
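
To make the 'monitor chained to the AI system' idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Monitor` class, the `no_disassembly` rule, and the string-valued actions are hypothetical, and a real system would face the much harder problem of specifying the rules at all. It only shows the shape of the mechanism: every action passes through the monitor, and the first violation triggers a permanent shutdown.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical rule type: returns True when a proposed action is allowed.
Rule = Callable[[str], bool]


@dataclass
class Monitor:
    """Sits between the AI system and the world and vets every action."""
    rules: List[Rule]
    halted: bool = False
    log: List[str] = field(default_factory=list)

    def submit(self, action: str) -> bool:
        """Return True if the action may run; halt permanently on a violation."""
        if self.halted:
            return False  # after shutdown, nothing is allowed through
        if all(rule(action) for rule in self.rules):
            self.log.append(action)
            return True
        self.halted = True  # emergency shutdown on the first violation
        return False


def no_disassembly(action: str) -> bool:
    # Illustrative rule only; real rules would be far harder to state.
    return "disassemble" not in action


monitor = Monitor(rules=[no_disassembly])
actions = ["toast bread", "disassemble kitchen", "toast bread"]
results = [monitor.submit(a) for a in actions]
```

In this toy run the first action is allowed, the second trips the rule, and the third is refused because the monitor has already halted. Note that the sketch assumes the AI system cannot act except through `submit`, which is exactly the assumption the nanotoaster quip above is poking at.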

Comment author: Lumifer 05 January 2018 04:35:27PM * 1 point

There seems to be a complexity limit to what humans can build. A full GAI is likely to be somewhere beyond that limit.

The usual solution to that problem -- see EY's fooming scenario -- is to make the process recursive: let a mediocre AI improve itself, and as it gets better it can improve itself more rapidly. Exponential growth can go fast and far.

This, of course, gives rise to another problem: you have no idea what the end product is going to look like. If you're looking at the gazillionth iteration, your compiler flags were probably lost around the thousandth iteration and your chained monitor system mutated into a cute puppy around the millionth iteration...

Probabilistic safety systems are indeed more tractable, but that's not the question. The question is whether they are good enough.
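
One rough way to put numbers on "good enough" is to ask how a per-iteration survival probability compounds over many self-modification steps. The sketch below is a toy model, not a claim about any real system: it assumes each iteration independently preserves the safeguard with the same probability `p_keep`, and the function names and the example figures (0.999, 1000 iterations) are made up for illustration.

```python
import math


def survival_probability(p_keep: float, iterations: int) -> float:
    """Chance a safeguard survives all `iterations` independent rewrites."""
    if not 0.0 <= p_keep <= 1.0:
        raise ValueError("p_keep must be in [0, 1]")
    if iterations < 0:
        raise ValueError("iterations must be non-negative")
    return p_keep ** iterations


def iterations_until(p_keep: float, target: float) -> int:
    """Smallest n such that p_keep ** n has dropped to `target` or below."""
    if not (0.0 < p_keep < 1.0 and 0.0 < target < 1.0):
        raise ValueError("p_keep and target must be strictly between 0 and 1")
    return math.ceil(math.log(target) / math.log(p_keep))


# Assumed figures: a safeguard that survives 99.9% of individual rewrites.
after_thousand = survival_probability(0.999, 1000)  # roughly 0.37
coin_flip_point = iterations_until(0.999, 0.5)      # 693 iterations
```

Under these assumptions a safeguard that looks very reliable per step is more likely lost than kept by the thousandth iteration, and is at coin-flip odds after about 693. Whether real self-modification behaves anything like independent trials is of course part of what is in dispute.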