Baughn comments on NES-game playing AI [video link and AI-boxing-related comment] - Less Wrong

30 Post author: Dr_Manhattan 12 April 2013 01:11PM


Comments (22)


Comment author: latanius 13 April 2013 06:32:57AM 7 points

This thing looks more and more relevant as I think about it. It doesn't just optimize an objective function in a weird and unexpected way; it actually learns that function, in all its complexity, from observed human behavior.

Would it be an overestimation to call this a FAI research paper?

Comment author: Baughn 14 April 2013 01:24:51PM 1 point

AI research paper? Maybe not.

What's friendly about this AI?

Comment author: latanius 14 April 2013 07:42:54PM 6 points

The point is that it's not, but making it so is a design goal of the paper.

Example: Mario immediately jumping into a pit at level 2. According to the learned utility function of the system, it's a good idea. According to ours, it's not.

Just as with optimizing smiling faces. But while that one was purely a thought experiment, this paper presents a practical, experimentally testable benchmark for utility function learning, and, by the way, shows a not-yet-perfect but working solution for it. (After all, Mario's Flying Goomba Kick of High Munchkinry definitely satisfies our utility functions.)

Comment author: pjeby 15 April 2013 04:57:03AM -1 points

What's friendly about this AI?

Nothing. It's mostly useful for illustrating cognitive biases around AI: it demonstrates how alien a simple "utility"-maximizing process is compared to how humans think about things. It's an example answer to the standard "But my AI wouldn't do a stupid thing like that" objection. Well, yes, actually, it would. And the simpler and more elegant your design is, the higher the probability that it will do things like that: things you don't even think about because, to a human, they're obviously stupid. (At the same time, of course, it will also do things that seem utterly brilliant to a human, for the exact same reason: finding that brilliant move first required doing something stupid, like jumping at an enemy.)

It also illustrates some decision theory concepts, like looking into the future to see how your actions fare, and the importance of matching the machine's "utility" with a human's utility. (In each game, the actual game utility differs in certain ways from the simple utility function derived from scoring, and it's these differences that create the bad-weird moves.)
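To make the utility mismatch concrete, here is a minimal toy sketch in Python. It is not the paper's algorithm: the one-dimensional world, the pit position, the action set, and the names `proxy_utility`, `human_utility`, and `plan` are all hypothetical, chosen only to show how a planner that looks a few steps ahead and maximizes a simple score can walk straight into a move a human would call obviously stupid.

```python
from itertools import product

# Hypothetical toy world: the agent moves right along a line; x == PIT is fatal.
ACTIONS = ("wait", "walk", "jump")
STEP = {"wait": 0, "walk": 1, "jump": 2}
PIT = 4  # assumed pit position, purely illustrative


def step(state, action):
    """Advance the world by one action. A dead agent stays where it fell."""
    x, alive = state
    if not alive:
        return state
    x += STEP[action]
    return (x, x != PIT)


def proxy_utility(state):
    """Simple score-like utility: rightward progress only, blind to death."""
    x, _alive = state
    return x


def human_utility(state):
    """What a human player wants: progress, but dying is worse than anything."""
    x, alive = state
    return x if alive else -1


def plan(state, horizon, utility):
    """Try every action sequence of length `horizon`; return the best one."""
    def rollout(seq):
        s = state
        for action in seq:
            s = step(s, action)
        return s

    # max() keeps the first best sequence, so ties resolve deterministically.
    return max(product(ACTIONS, repeat=horizon),
               key=lambda seq: utility(rollout(seq)))


start = (0, True)
proxy_plan = plan(start, 2, proxy_utility)   # ('jump', 'jump'): lands in the pit
human_plan = plan(start, 2, human_utility)   # ('walk', 'jump'): stops short of it
print(proxy_plan, human_plan)
```

Both planners use the same search; only the utility differs. The proxy prefers landing in the pit because x = 4 scores higher than x = 3, which is the kind of bad-weird move described above.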