
Gram_Stone comments on Rationality Reading Group: Part V: Value Theory

Post author: Gram_Stone 10 March 2016 01:11AM 6 points




Comment author: SquirrelInHell 20 March 2016 03:56:13AM 1 point

You would indeed maximize EV in each iteration, but this EV would also include a term for the value of information.
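To make that concrete, here is a toy two-round example (made-up numbers, not anything from the original post) in which the myopically worse option has the higher EV once the value of information is counted:

```python
# Toy two-round game: arm K pays off with known probability 0.55; arm U is
# a Bernoulli arm with unknown p under a uniform Beta(1,1) prior (mean 0.5).
# Myopically K beats U (0.55 > 0.5), but playing U first buys information
# that can be used in round two.

KNOWN = 0.55
A, B = 1, 1  # Beta(a, b) prior on arm U's success probability

def beta_mean(a, b):
    return a / (a + b)

# Plan 1: play K in round one (learn nothing), then the better-looking arm.
ev_exploit = KNOWN + max(KNOWN, beta_mean(A, B))

# Plan 2: play U in round one, update the prior, then play optimally.
p = beta_mean(A, B)                                     # P(U succeeds) = 0.5
ev_after_success = 1 + max(KNOWN, beta_mean(A + 1, B))  # posterior mean 2/3 -> stay on U
ev_after_failure = 0 + max(KNOWN, beta_mean(A, B + 1))  # posterior mean 1/3 -> switch to K
ev_explore = p * ev_after_success + (1 - p) * ev_after_failure

print(f"exploit-first EV: {ev_exploit:.4f}")  # 1.1000
print(f"explore-first EV: {ev_explore:.4f}")  # 1.1083, higher thanks to VOI
```

With more rounds remaining after the observation, the information term only grows, so the gap widens.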

Comment author: Gram_Stone 20 March 2016 04:11:20AM * 0 points

Ah, okay. I went downstairs for a minute and thought to myself, "Well, the only way I get what he's saying is if we go up a level and assume that the given utilities are not simply changing, but are changing according to some sort of particular rule."

Also, I spent a long time writing my reply to your original problem statement without refreshing the page, so I only read the original comment, not the edit. That might explain why I didn't immediately notice that you were talking about value of information, and why my earlier comment, with all of the math, may have seemed a little pedantic.

Back to the original point that brought this problem up: what's going on inside the brain is that it has assigned utilities to outcomes, but there's a tremble on its actions caused by the stochastic nature of neural networks. The brain isn't so much uncertain about its utilities as it is confident that its utility estimates are accurate while randomly failing to do what it considers most desirable. (There's a toy sketch of this at the end of this comment.)

That's why I wrote, in the original comment:

It just seems interesting to consider the consequences of the assumption that there is a decision-maker without a trembling hand.

Does that make sense?
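As a sketch of the tremble model described above (an illustration only, with a softmax standing in for whatever noise process the neural networks actually implement):

```python
# Fixed, fully trusted utilities; the noise is a tremble on the emitted
# action, not uncertainty about the utilities themselves.
import math
import random

def trembling_choice(utilities, temperature=0.1, rng=random):
    """Pick an action by softmax over the utilities: usually the argmax,
    occasionally something else (the 'tremble')."""
    weights = [math.exp(u / temperature) for u in utilities]
    r = rng.random() * sum(weights)
    for i, w in enumerate(weights):
        r -= w
        if r <= 0:
            return i
    return len(utilities) - 1  # guard against float rounding

# With utilities (1.0, 0.8) and temperature 0.1, option 0 comes out
# about 88% of the time; the remaining ~12% is the tremble.
counts = [0, 0]
for _ in range(10_000):
    counts[trembling_choice([1.0, 0.8])] += 1
print(counts)  # roughly [8800, 1200]
```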

Comment author: SquirrelInHell 20 March 2016 05:31:57AM 1 point

Ah, okay. I went downstairs for a minute and thought to myself, "Well, the only way I get what he's saying is if we go up a level and assume that the given utilities are not simply changing, but are changing according to some sort of particular rule."

Congratulations on good thinking and attitude :)

Does that make sense?

Yes, I get that. What I meant to suggest, in the broader picture, is that this "tremble" might be evolution's way of crudely approximating a fully rational agent who makes decisions based on VOI.

So it's not necessarily detrimental to us, though sometimes it might well be.

The main takeaway from all that I have said is that replacing your intuition with "let's always take option A because it's the rational thing to do" just doesn't do the trick when you play multiple games (as is often the case in real life).
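Here is a quick simulation of that takeaway (again an illustration, with made-up payoff probabilities): an agent that always takes the option with the best current estimate, versus one that is identical except for a small tremble.

```python
# Repeated two-option game with hidden success probabilities (0.45, 0.55).
import random

def play(tremble, true_p=(0.45, 0.55), rounds=10_000, seed=0):
    rng = random.Random(seed)
    wins = [0, 0]   # observed successes per option
    pulls = [1, 1]  # smoothed counts so estimates are defined from the start
    total = 0
    for _ in range(rounds):
        estimates = [wins[i] / pulls[i] for i in range(2)]
        if rng.random() < tremble:
            choice = rng.randrange(2)  # the trembling hand
        else:
            # Ties break toward option 0, so the no-tremble agent locks
            # onto whichever option it happens to try first.
            choice = max(range(2), key=lambda i: estimates[i])
        reward = 1 if rng.random() < true_p[choice] else 0
        wins[choice] += reward
        pulls[choice] += 1
        total += reward
    return total

print("no tremble: ", play(tremble=0.0))   # stuck on option 0, ~4500
print("5% tremble: ", play(tremble=0.05))  # finds option 1, ~5400
```

The tremble here plays the same role as the VOI term above: it keeps buying the information that the pure argmax policy never pays for.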