jsteinhardt comments on The Inefficiency of Theoretical Discovery - Less Wrong

Post author: lukeprog 03 November 2013 09:26PM


Comment author: Kaj_Sotala 05 November 2013 05:51:28AM 2 points

I don't think the argument is that AI would be fundamentally different, but rather that "we can reason at least somewhat reliably when making predictions about agents who don't drastically self-modify, and of whom we have thousands of years of data on which to build our predictions" isn't good enough to deal with an agent that can drastically self-modify, and that could exhibit entirely novel behavior and cognitive dynamics even if it weren't capable of self-modifying. "Somewhat reliably" is fine only as long as a single failure isn't enough to throw all the rest of your predictions to the trash bin.

I don't know enough about your second example to feel confident commenting on it.

Comment author: jsteinhardt 05 November 2013 06:03:37AM 3 points

"Somewhat reliably" is fine only as long as a single failure isn't enough to throw all the rest of your predictions to the trash bin.

Humans seem pretty good at making correct predictions even if they have made incorrect predictions in the past. More generally, any agent for whom a single wrong prediction throws everything into disarray will probably not continue to function for very long.

I don't know enough about your second example to feel confident commenting on it.

Fair enough. This is an admirable habit that is all too rare, so have an upvote :).

Comment author: Kaj_Sotala 05 November 2013 07:30:38AM 2 points

Humans seem pretty good at making correct predictions even if they have made incorrect predictions in the past. More generally, any agent for whom a single wrong prediction throws everything into disarray will probably not continue to function for very long.

That's basically my point. A human has to predict the answer to questions of the type "what would I do in situation X", and their overall behavior is the sum of their actions across all situations, so they can still get the overall result roughly right as long as they are correct on average. An AI that's capable of self-modification also has to predict the answer to questions of the type "how would my behavior be affected if I modified my decision-making algorithm in this way", where the answer doesn't just influence its behavior in one situation but in all the situations that follow. The effects of individual decisions become global rather than local. It needs to be able to make much more reliable predictions if it wants to have a chance of even remaining basically operational over the long term.

Fair enough. This is an admirable habit that is all too rare, so have an upvote :).

Thanks. :)

Comment author: jmmcd 08 November 2013 09:09:20PM 0 points

And more importantly, its creators want to be sure that it will be very reliable before they switch it on.