Vaniver comments on Value of Information: Four Examples - Less Wrong
Comments (60)
Great post on a very important topic.
One suggestion: pictures would help a lot here. Norvig's AIMA has some very nice illustrations in ch. 16 (I think).
Not sure how you worked this out. Not clear what X is.
I like the idea of having pictures but I do not like the idea of procuring pictures. I'll make it a higher priority for future posts, though, and if someone wants to send me pictures (which I can legally use) for this post I'll be happy to edit them in.
I replaced the "x"s with "p"s; hopefully that'll make it a bit clearer.
We start off with a prior P(p)=1. That is, I think every p is equally likely, and when I integrate over the domain of p (from 0 to 1) I get 1, like I should.
Then I update on seeing heads. For each value of p, the chance I saw heads was p, and so I expect my function to have the functional form P(p)=p. Notice that after seeing heads I think the mode is a coin that always lands on heads and that it's impossible that the coin always lands on tails; both are what I expect. When I integrate p from 0 to 1, though, I get 1/2. I need to multiply it by 2 to normalize it, and so we have P(p)=2p.
This might look odd at first because it sounds like the probability of the coin always landing on heads is 2, which suggests an ill-formed probability. That's the probability density, though. Right now, my prior puts 0 probability on the coin always landing on heads, because that's an integral with 0 width.
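To make the density-versus-probability point concrete, here is a minimal sketch in Python. The helper names (density, prob_between) are mine, chosen for illustration; the only thing taken from the argument above is the form P(p)=2p, whose integral from a to b is b^2 - a^2.

```python
from fractions import Fraction

def density(p):
    """Density after seeing one heads: P(p) = 2p (a density, not a probability)."""
    return 2 * Fraction(p)

def prob_between(a, b):
    """Probability that p lies in [a, b]: the integral of 2p, which is b^2 - a^2."""
    a, b = Fraction(a), Fraction(b)
    return b * b - a * a

# The density at p = 1 is 2, which is fine for a density.
density_at_one = density(1)

# The probability of exactly p = 1 is an integral with zero width.
prob_exactly_one = prob_between(1, 1)

# A narrow interval near 1 gets a small but nonzero probability.
prob_near_one = prob_between(Fraction(99, 100), 1)

# The whole range [0, 1] still integrates to 1.
prob_total = prob_between(0, 1)
```

Only the integrals over intervals are probabilities, and none of them exceeds 1 even though the density itself reaches 2.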
The 2-2x comes from the same argument applied to seeing tails, but the form is now 1-x.
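For anyone who wants to check the normalization step, here is a small sketch in Python that does the integrals exactly. It represents each function of the coin's bias as a list of polynomial coefficients; that representation and the helper names (integrate_poly, normalize) are my own choices for illustration, not anything from the post.

```python
from fractions import Fraction

def integrate_poly(coeffs):
    """Exactly integrate sum(coeffs[k] * p**k) over p from 0 to 1."""
    return sum(Fraction(c) / (k + 1) for k, c in enumerate(coeffs))

def normalize(coeffs):
    """Rescale the polynomial so that it integrates to 1 over [0, 1]."""
    total = integrate_poly(coeffs)
    return [Fraction(c) / total for c in coeffs]

# Uniform prior: P(p) = 1, which already integrates to 1.
prior = [1]

# Prior times the likelihood of heads (p) gives the unnormalized form p.
heads_unnormalized = [0, 1]
# Prior times the likelihood of tails (1 - p) gives the unnormalized form 1 - p.
tails_unnormalized = [1, -1]

# Both unnormalized forms integrate to 1/2, hence the factor of 2.
heads_area = integrate_poly(heads_unnormalized)
tails_area = integrate_poly(tails_unnormalized)

posterior_heads = normalize(heads_unnormalized)  # coefficients of 2p
posterior_tails = normalize(tails_unnormalized)  # coefficients of 2 - 2p
```

The coefficient lists read lowest power first, so [0, 2] is 2p and [2, -2] is 2 - 2p.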
Just to be clear I was not suggesting ripping off their illustration, but it is a very good one worthy of legal emulation :).
The reason I put that is because I find at least half of the ugh in finding pictures is checking legality.
I'm not sure it isn't clearer with 'x's, given that you have two different kinds of probabilities to confuse.
It may just be that there's a fair bit of inferential distance to clear, though, in presenting this notation at all.
I have a strong (if rusty) math background, but I had to reason through exactly what you could possibly mean down a couple of different trees (for one of which I had partially written a whole comment asking you to explain certain things about your notation and meaning). It finally clicked for me on a second reading of your comment here, after I tried to explain my confusion in formal mathematical terms.
I think a footnote about what probability distribution functions look like and what the values actually represent (densities, rather than probabilities), and a bit of work with them would be helpful. Or perhaps there's enough inferential work there to be worth a whole post.
I definitely think that should be a post of its own.
Thanks for the feedback! It's helpful when planning out a sequence to know where I should focus extra attention.