
AndreInfante comments on Steelmaning AI risk critiques - Less Wrong Discussion

Post author: Stuart_Armstrong, 23 July 2015 10:01AM


Comment author: AndreInfante, 27 July 2015 08:50:01PM

(1) Intelligence is an extendible method that enables software to satisfy human preferences. (2) If human preferences can be satisfied by an extendible method, humans have the capacity to extend the method. (3) Extending the method that satisfies human preferences will yield software that is better at satisfying human preferences. (4) Magic happens. (5) There will be software that can satisfy all human preferences perfectly but which will instead satisfy orthogonal preferences, causing human extinction.

This is deeply silly. The thing about arguing from definitions is that you can prove anything you want if you just pick a sufficiently bad definition. That definition of intelligence is a sufficiently bad definition.

EDIT:

To spell out this rebuttal in more detail:

I'm going to accept the definition of 'intelligence' given above. Now, here's a parallel argument of my own:

  1. Entelligence is an extendible method for satisfying an arbitrary set of preferences that are not human preferences.

  2. If these preferences can be satisfied by an extendible method, then the entelligent agent has the capacity to extend the method.

  3. Extending the method that satisfies these non-human preferences will yield software that's better at satisfying non-human preferences.

  4. The inevitable happens.

  5. There will be software that will satisfy non-human preferences, causing human extinction.


Now, I pose to you: how do we make sure that we're making intelligent software, and not "entelligent" software, under the above definitions? Obviously, this puts us back to the original problem of how to make a safe AI.
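
The symmetry between the two arguments can be made concrete with a toy sketch (my own illustration, not from the original argument): a generic "extendible method", here naive hill climbing, is entirely agnostic about whose preferences it optimizes. Nothing in the optimizer distinguishes "intelligent" from "entelligent" software; only the preference function plugged into it does. The function names and the two example preferences below are hypothetical.

```python
def optimize(preference, state, steps=1000):
    """Greedy hill climbing over integer states: the same machinery
    serves whatever preference function is handed to it."""
    for _ in range(steps):
        # Consider the neighboring states and keep the most preferred one.
        best = max((state - 1, state, state + 1), key=preference)
        if preference(best) <= preference(state):
            break  # local optimum reached
        state = best
    return state

# "Human" preferences peak at 10; "orthogonal" preferences peak at 42.
human_preference = lambda x: -(x - 10) ** 2
orthogonal_preference = lambda x: -(x - 42) ** 2

print(optimize(human_preference, 0))       # -> 10
print(optimize(orthogonal_preference, 0))  # -> 42
```

The optimizer is identical in both runs; the definitional argument offers no way to tell, from the method alone, which kind of software we have built.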

The original argument is rhetorical sleight of hand. The given definition of intelligence implicitly assumes that the problem doesn't exist and that all AIs will be safe, and then goes on to prove that all AIs will be safe.

It's really, fundamentally silly.