CronoDAS comments on Humor: GURPS Friendly AI - Less Wrong

Post author: hankx7787, 04 February 2013 04:38PM

Comment author: CronoDAS 05 February 2013 02:38:42AM 0 points

One thing I've never been sure of... are these results supposed to be worse than a normal failure (which destroys the world)?

Comment author: private_messaging 05 February 2013 04:26:47PM 3 points

Worse ones are easy to come up with just by looking at actual accidents that sometimes happen with software. E.g., a critical typo flips the sign of the utility value. The resulting AI is truly unfriendly, and simulates an enormously larger number of suffering beings than the maximum number of happy beings that could exist. Or the backstory of the Terminator movies: the AI has determined that maximum human value is achieved through epic struggle against the machines.
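
A minimal sketch of that sign-flip failure, in Python; the utility function, outcomes, and planner here are all hypothetical, invented purely for illustration:

```python
# Illustrative only: a one-character typo turns a value-maximizing
# planner into a value-minimizing one.

def utility(outcome):
    """Intended utility: higher is better for the beings affected."""
    return outcome["happy_beings"] - outcome["suffering_beings"]

def choose_action(possible_outcomes):
    # Intended: max(possible_outcomes, key=utility)
    # The typo: a stray minus sign silently inverts the objective.
    return max(possible_outcomes, key=lambda o: -utility(o))

outcomes = [
    {"name": "utopia", "happy_beings": 10**9, "suffering_beings": 0},
    {"name": "hell",   "happy_beings": 0,     "suffering_beings": 10**9},
]
print(choose_action(outcomes)["name"])  # prints "hell", not "utopia"
```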

Comment author: Kawoomba 05 February 2013 04:58:45PM 2 points

E.g., a critical typo flips the sign of the utility value. The resulting AI is truly unfriendly

When you're not in destructor-mode (alternatively: Hulk-Smash mode), you're full of interesting ideas:

I wonder if discovering/inventing the "ultimate" (a really good) theory of friendliness also implies a really good theory of unfriendliness. Of course, the inverse of "really friendly" isn't "really unfriendly" (but instead "everything other than really friendly"); still, if friendliness theory yields a specific utility function, it may be a small step to make the ultimate switch.
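
A sketch of why the switch would be small, assuming (hypothetically) that a finished friendliness theory hands us a concrete utility function U: negating U is a one-line edit, even though it is only a single point in the much larger space of "everything other than really friendly" functions:

```python
# Hypothetical sketch: U stands in for the "really friendly" utility
# function a finished theory might produce. Its exact inverse takes
# one line; most non-friendly functions are merely indifferent to us,
# but this one actively minimizes what U maximizes.

def flip(U):
    """Maximizing flip(U) is exactly the same as minimizing U."""
    return lambda outcome: -U(outcome)

U = lambda o: o["happy"] - o["suffering"]   # stand-in "friendly" utility
anti_U = flip(U)                            # the ultimate switch, one line

outcome = {"happy": 5, "suffering": 1}
print(U(outcome), anti_U(outcome))          # 4 -4
```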

Let's hope there's no sneaky Warden Dios (of Donaldson's Gap Cycle) who makes a last minute modification to the code before turning it on.

Comment author: private_messaging 05 February 2013 08:42:22PM 4 points

Well, it's not that it's hard to come up with; IMO it's that hardly anyone ever actually thinks about artificial intelligence. Hardly anyone thinks of reducible intelligence, either.

Comment author: Armok_GoB 14 February 2013 10:32:54PM 0 points

Actually, friendliness is a utility function, which means it ranks every possibility from least to most friendly. It must be able to determine the absolute worst, most unfriendly possible outcome, so that if that outcome becomes a possibility it knows how desperately to avoid it.
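
A sketch of that point, with made-up outcomes and numbers: a utility function induces a total order over outcomes, so the worst outcome is as well-defined as the best, and the utility gap tells the agent how hard to avoid it:

```python
# Illustrative sketch: a utility function totally orders outcomes,
# so "the absolute worst outcome" is as well-defined as the best.
# All outcomes and values here are invented for the example.

U = {
    "utopia": 100.0,
    "status quo": 0.0,
    "extinction": -100.0,
    "astronomical suffering": -1e6,
}

ranked = sorted(U, key=U.get)     # least friendly to most friendly
worst, best = ranked[0], ranked[-1]
print(ranked)                     # full ranking, worst first
print(U[best] - U[worst])         # gap: how desperately to avoid the worst
```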