
Stuart_Armstrong comments on Trapping AIs via utility indifference - Less Wrong Discussion

3 Post author: Stuart_Armstrong 28 February 2012 07:27PM



Comment author: Stuart_Armstrong 02 March 2012 10:37:17AM *  0 points [-]

It did look at the rules of the game, and it did some thinking - how can I, without being able to calculate EU(Z), make moves that would work? - and it came up with an approach based on that function. (Incidentally, this approach is applicable only to fairly simple utility functions of some future state.)

And that's where it comes up with: play randomly for three moves, then apply the material advantage process. This maximises the new utility function without needing to calculate EU(Z) (or EU(A)).
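A minimal sketch of this "random for three moves, then material advantage" policy might look like the following. The names (`material_advantage`, `choose_move`) and the board representation (a flat list of piece letters) are my own illustrative stand-ins, not anything from the post:

```python
import random

# Standard piece values; kings are never captured, so they carry no value here.
PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}

def material_advantage(board):
    """Material balance from White's perspective: + for White's pieces
    (uppercase letters), - for Black's (lowercase letters)."""
    score = 0
    for piece in board:
        if piece.upper() in PIECE_VALUES:
            value = PIECE_VALUES[piece.upper()]
            score += value if piece.isupper() else -value
    return score

def choose_move(moves, board_after, move_number):
    """Play randomly for the first three moves, then greedily pick the
    move whose resulting position maximises material advantage."""
    if move_number <= 3:
        return random.choice(moves)
    return max(moves, key=lambda m: material_advantage(board_after(m)))
```

The point of the sketch is that the policy never evaluates EU(Z): during the first three moves it ignores the board entirely, and afterwards it only scores concrete resulting positions.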

An AI needs to be programmed in a specific, very under-optimized way to allow you to make the sort of modification you're proposing here.

Specific, certainly; under-optimised is debatable.

For a seed AI, we can build the indifference in early and, under broad conditions, be certain that it will retain the indifference at a later step.

Comment author: Dmytry 02 March 2012 11:39:48AM *  0 points [-]

And why exactly does this 'play randomly for 3 moves, then apply material advantage' give better utility than just applying material advantage?

Plus, you've got yourself a utility function that is entirely ill-defined in a screwy, self-referential way (since the expected utility of a move ultimately depends on the AI itself and its ability to use the resulting state to its advantage). You can talk about it in words, but you haven't defined it other than 'okay, now it will make the AI indifferent'.

To be contrasted with the original, well-defined utility function over future states: the AI may be unable to predict the future states and calculate utility numbers to assign to moves, but it can calculate the utility of a particular end state of the board, and it can reason from this to strategies. Originally, there is a simple thing for it to reason about. I can write Python code that looks at a board and returns the available legal moves, or the win/loss/tie utility if it is an end state. That is the definition of chess utility. The AI can take it and reason about it. You instead have some utility function that feeds the AI's own conclusions about the utility of potential moves back into the utility function.
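The "definition of chess utility" being described really can fit in a few lines. Here is a toy sketch with Nim (take 1-3 stones; taking the last stone wins) standing in for chess to keep it short; both function names are illustrative, but the interface is the point - given a state, list the legal moves, and return a utility only at an end state:

```python
def legal_moves(stones):
    """Legal moves in this state: take 1, 2 or 3 stones (at most what's left)."""
    return [take for take in (1, 2, 3) if take <= stones]

def utility(stones):
    """Utility of the state for the player to move, or None if the game
    is not over. With no stones left, the player to move has lost,
    because the opponent took the last stone."""
    return -1 if stones == 0 else None
```

Nothing here refers to the agent doing the reasoning: the utility is a fixed function of board states, which is exactly the property the comment is contrasting with the modified, indifference-inducing utility.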

Comment author: Stuart_Armstrong 05 March 2012 07:41:24PM -1 points [-]

And why exactly does this 'play randomly for 3 moves, then apply material advantage' give better utility than just applying material advantage?

In this instance, they won't differ at all. But if the AI had some preferences outside of the chess board, then the indifferent AI would be open to playing any particular move (for the first three turns) in exchange for some other separate utility gain.

Plus, you've got yourself a utility function that is entirely ill-defined in a screwy, self-referential way

In fact, no. It seems like that because of the informal language I used, but the utility function is perfectly well defined without any reference to the AI. The only self-reference is the usual one - how do I predict my future actions now?

If you mean that an indifferent utility can make these predictions harder, or more necessary, in some circumstances, then you are correct - but this seems trivial for a superintelligence.