# Eliezer_Yudkowsky comments on Reply to Holden on 'Tool AI' - Less Wrong

91 12 June 2012 06:00PM


Comment author: 13 June 2012 06:19:29PM 8 points

Yeah, well, hardwiring the AI to understand human desires wouldn't be goddamned trivial either, I just decided not to go down that particular road, mostly because I'd said it before and Holden had apparently read at least some of it.

Getting the light-square bishop out of danger as highest priority...

1) Do I assume the opponent assigns symmetric value to attacking the light-square bishop?

2) Or that the opponent actually values checkmates only, but knows that I value the light-square bishop myself and plan forks and skewers accordingly?

3) Or that the opponent has no idea why I'm doing what I'm doing?

4) Or that the opponent will figure it out eventually, but maybe not in the first game?

5) What about the complicated static-position evaluator? Do I have to retrain all of it, and possibly design new custom heuristics, now that the value of a position isn't "leads to checkmate" but rather "leads to checkmate + 25% leads to bishop being captured"?

Adding this to Deep Blue is not remotely as trivial as it sounds in English. Even to add it in a half-assed way, you have to at least answer question 1, because the entire non-brute-force search-tree pruning mechanism depends on guessing which branches the opponent will prune. Look up alpha-beta search to start seeing why everything becomes more interesting when position-values are no longer being determined symmetrically.
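For readers who want to see the symmetry assumption concretely, here is a minimal sketch of alpha-beta pruning on a toy tree. It is not Deep Blue's code; the tree, the leaf values, and the function names are made up for illustration. Leaves are scores from the searching player's point of view, and the pruning step is only justified because the opponent is assumed to minimize exactly that same score.

```python
from math import inf

def minimax(node, maximizing):
    """Plain minimax over a nested-list tree; leaves are my scores."""
    if not isinstance(node, list):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

def alphabeta(node, maximizing, alpha=-inf, beta=inf, visited=None):
    """Minimax with alpha-beta pruning; records which leaves were evaluated."""
    if not isinstance(node, list):
        if visited is not None:
            visited.append(node)
        return node
    if maximizing:
        best = -inf
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # the opponent would never allow this line
        return best
    best = inf
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, best)
        if alpha >= beta:
            break  # assumes the opponent minimizes MY score, so the rest is skipped
    return best

# Toy tree (assumed values): I pick a branch, then the opponent picks a leaf.
tree = [[3, 5], [2, 9], [0, 1]]
visited = []
value = alphabeta(tree, True, visited=visited)
print(value, visited)
```

In this toy tree the leaves 9 and 1 are never evaluated. That skip is sound only if the opponent scores positions as the exact negation of my scores; once the two sides value positions differently, a "pruned" reply may be the one the opponent actually plays.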

Comment author: 13 June 2012 07:11:58PM *  9 points

For what it's worth, the intended answers are 1) no 2) no 3) yes 4) no 5) the evaluation function and the opening book stay the same, there's just a bit of logic squished above them that kicks in only when the bishop is threatened, not on any move before that.

Yeah, game-theoretic considerations make the problem funny, but the intent wasn't to convert an almost-consistent utility maximizer into another almost-consistent utility maximizer with a different utility function that somehow values keeping the bishop safe. The intent was to add a hack that throws consistency to the wind, and observe that the AI doesn't rebel against the hack. After all, there's no law saying you must build only consistent AIs.

My guess is that's what most folks probably mean when they talk about "hardwiring" stuff into the AI. They don't mean changing the AI's utility function over the real world, they mean changing the AI's code so it's no longer best described as maximizing such a function. That might make the AI stupid in some respects and manipulable by humans, which may or may not be a bad thing :-) Of course your actual goals (whatever they are) would be better served by a genuine expected utility maximizer, but building that could be harder and more dangerous. Or at least that's how the reasoning is supposed to go, I think.
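A minimal sketch of the kind of hack described in this comment, assuming a hypothetical engine interface: the position dictionaries, their keys, and `engine_best_move` are stand-ins invented for illustration, not any real chess engine's API. The evaluator and search are untouched; a thin layer sits above them and overrides the engine only when the bishop is attacked.

```python
def choose_move(position, engine_best_move):
    """Hypothetical wrapper: override the engine only when the bishop is attacked."""
    if position["bishop_attacked"] and position["bishop_escapes"]:
        # The hack: ignore the search result and move the bishop to safety.
        return position["bishop_escapes"][0]
    # Otherwise defer entirely to the unmodified engine.
    return engine_best_move(position)

def engine_best_move(position):
    """Stand-in for the real search; here it just reads a precomputed choice."""
    return position["engine_choice"]

# Toy positions (assumed data, not real game states).
quiet = {"bishop_attacked": False, "bishop_escapes": ["Bd3"], "engine_choice": "Nf3"}
threat = {"bishop_attacked": True, "bishop_escapes": ["Bd3"], "engine_choice": "Qxh7+"}
trapped = {"bishop_attacked": True, "bishop_escapes": [], "engine_choice": "Nf3"}

for pos in (quiet, threat, trapped):
    print(choose_move(pos, engine_best_move))
```

Note that in the `threat` position the wrapper plays the bishop retreat even though the engine preferred a different move. The resulting program is no longer well described as maximizing any single evaluation function, which is the point being made above.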

Comment author: 14 June 2012 02:55:52AM 5 points

> The intent was to add a hack that throws consistency to the wind, and observe that the AI doesn't rebel against the hack.

Why doesn't the AI reason "if I remove this hack, I'll be more likely to win?" Because this is just a narrow chess AI and the programmer never gave it general reasoning abilities?

Comment author: 26 June 2012 10:45:59AM *  1 point

> Why doesn't the AI reason "if I remove this hack, I'll be more likely to win?"

A more interesting question is why it (if made capable of such reflection) would not take this one small step further and ponder what happens if it removes the enemy's queen from its internal board, which would also make it more likely to win, given that its internal definition of "win" is defined in terms of the internal board.

Or why anyone would go to the bother of implementing a possibly irreducible notion of what "win" really means in the real world, given that this would simultaneously waste computing power on unnecessary explorations and make the AI dangerous / uncontrollable.

The thing is, you don't need to imagine the world dying to avoid attempting pointless, likely impossible accomplishments.

Comment author: 14 June 2012 07:39:24AM *  0 points

Yeah, because it's just a narrow real-world AI without philosophical tendencies... I'm actually not sure. A more precise argument would help, something like "all sufficiently powerful AIs will try to become or create consistent maximizers of expected utility, for such-and-such reasons".

Comment author: 14 June 2012 08:19:14AM *  4 points

Does a pair of consistent optimizers with different goals have a tendency to become a consistent optimizer?

The problem with powerful non-optimizers seems to be that the "powerful" property already presupposes optimization power, and so at least one optimizer-like thing is present in the system. If it's powerful enough and is not contained, it's going to eat all the other tendencies of its environment, and so optimization for its goal will be all that remains. Unless there is another optimizer able to defend its non-conformity from the optimizer in question, in which case the two of them might constitute what counts as not-a-consistent-optimizer, maybe?

Comment author: 13 June 2012 08:18:01PM 0 points

Option 3? Doesn't work very well. You're assuming the opponent doesn't want to threaten the bishop, which means you yank it to a place where it would be safe if the opponent doesn't want to threaten it, but if the opponent clues in, it's then trivial for them to threaten the bishop again (to gain more advantage as you try to defend), which you weren't expecting them to do, because that's not how your search tree was structured. Kasparov would kick hell out of thus-hardwired Deep Blue as soon as he realized what was happening.
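A toy illustration of the mismatch, with made-up moves and scores (nothing here comes from a real engine): the hardwired program predicts the opponent's reply by assuming the opponent minimizes the program's own evaluation, while an opponent who has noticed the hack prefers any reply that attacks the bishop.

```python
# Each candidate opponent reply: (name, engine's score for me, threatens my bishop?)
# All names and numbers are assumptions chosen for illustration.
replies = [
    ("develop knight", 0.2, False),
    ("push pawn at bishop", 0.5, True),
    ("trade queens", 0.1, False),
]

def predicted_reply(replies):
    """What the search tree expects: the reply that minimizes my evaluation."""
    return min(replies, key=lambda r: r[1])

def exploiting_reply(replies):
    """An opponent who knows about the hack: threaten the bishop if possible."""
    threats = [r for r in replies if r[2]]
    return min(threats or replies, key=lambda r: r[1])

print("engine expects:", predicted_reply(replies)[0])
print("opponent plays:", exploiting_reply(replies)[0])
```

In this sketch the engine expects "trade queens" but the opponent plays "push pawn at bishop", a reply the engine's own evaluation rated as good for the engine and so would not have planned around. The hack then fires, the bishop moves, and the opponent can repeat the trick.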

It's that whole "see the consequences of the math" thing...

Comment author: 14 June 2012 07:46:51AM 13 points

Either your comment is in violent agreement with mine ("that might make the AI stupid in some respects and manipulable by humans"), or I don't understand what you're trying to say...

Comment author: 14 June 2012 09:23:08PM 3 points

Probably violent agreement.