Vaniver comments on Some thoughts on meta-probabilties - Less Wrong Discussion

0 Post author: iarwain1 21 September 2015 05:23PM

Comment author: Vaniver 21 September 2015 07:03:44PM

If you make no approximations, the normatively correct approach is to carry around your current probability estimate p, together with a table of what p would be updated to under every possible piece of evidence you could receive. For example, I might say "I know very little about sports, so I'll assign probability 50% that the Dallas Cowboys will win their next game, but if my friend who follows football tells me they will, I'll assign probability 75%, and if I see a bookie's odds, I'll adopt the implied probability estimate." (This is, of course, an incomplete list; there are many, many other pieces of evidence I could see.) Obviously, these updates should follow the laws of probability on pain of paradox.
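One thing "follow the laws of probability" implies for such a table is that the current estimate must equal the probability-weighted average of the entries. Here is a minimal sketch of that check. The function name is made up for illustration, and so are the numbers for how likely my friend is to say "win" and what I'd believe if they said "lose"; only the 50% and 75% come from the example above.

```python
def expected_posterior(table):
    """Average the posteriors in an update table, weighted by how likely
    each piece of evidence is to show up.

    table: list of (probability of seeing this evidence, posterior if seen).
    The evidence outcomes are assumed mutually exclusive and exhaustive.
    """
    total = sum(p_evidence for p_evidence, _ in table)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("evidence probabilities must sum to 1")
    return sum(p_evidence * posterior for p_evidence, posterior in table)


prior = 0.5

# Hypothetical table: the friend says "win" half the time (-> 75%)
# and "lose" half the time (-> 25%). The 25% is an assumed value.
coherent_table = [(0.5, 0.75), (0.5, 0.25)]

# Hypothetical incoherent table: I expect to end up above 50% on average,
# which means I should already be above 50% now.
incoherent_table = [(0.5, 0.75), (0.5, 0.50)]

print(abs(expected_posterior(coherent_table) - prior) < 1e-9)    # coherent
print(abs(expected_posterior(incoherent_table) - prior) < 1e-9)  # not coherent
```

If the weighted average doesn't come back to p, the table and the estimate contradict each other, and one of them has to change.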

Why is this necessary to do things correctly? Because my friend's prediction moved me from 1:1 odds to 3:1 odds, you can work out that I treated it as having a likelihood ratio of three. But where did the 3 come from? It's the interaction between my knowledge and my friend's knowledge. If the same friend makes the same prediction again, I shouldn't update my probability: the first time they gave me useful information, and the second time they gave me nothing new. If a second friend also predicts that the Cowboys will win, then I need to estimate how correlated their predictions are in order to determine how to update.
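The odds arithmetic can be sketched in a few lines. The helper names are mine, and the second-friend numbers only show the two extreme cases (fully redundant and fully independent with the same likelihood ratio of 3, which is an assumption); the realistic case lies somewhere in between and depends on how correlated the two friends are.

```python
def to_odds(p):
    """Convert a probability to odds in favor."""
    return p / (1.0 - p)


def to_prob(odds):
    """Convert odds in favor back to a probability."""
    return odds / (1.0 + odds)


def update(p, likelihood_ratio):
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio."""
    return to_prob(to_odds(p) * likelihood_ratio)


after_friend = update(0.5, 3.0)  # 1:1 odds times 3 gives 3:1, i.e. 75%

# Same friend repeats the prediction: no new information, likelihood ratio 1.
after_repeat = update(after_friend, 1.0)

# Second friend, if (hypothetically) fully independent of the first and
# equally informative: likelihood ratio 3 again, so 9:1 odds.
after_independent_friend = update(after_friend, 3.0)

print(after_friend, after_repeat, after_independent_friend)
```

The likelihood ratio is not a property of the statement "the Cowboys will win" by itself. It depends on what I already know, which is why the same words can be worth 3 the first time and 1 the second.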

The hyperparameter approach is the clean way to do this in cases where the likelihood of incoming evidence given existing evidence is easy to determine. If I've got a coin flipped in a random fashion (but weighted in an unknown way), then I think that successive flips are independent and equally indicative of the underlying propensity of the coin to land heads when flipped randomly. But if I've got a coin flipped in a precisely controlled deterministic fashion, then I don't think that successive flips are independent and equally indicative of the underlying propensity of the coin to land heads, because that "propensity" is no longer a useful node in my model.
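For the randomly flipped coin, one standard way to carry hyperparameters is a Beta distribution over the propensity. That is my choice of illustration, not something the argument above depends on, and the specific priors are made up. The sketch assumes independent flips, so it applies only to the random-flipping case and not to the deterministic one.

```python
from fractions import Fraction


def update_beta(alpha, beta, heads, tails):
    """Conjugate update: add observed counts to the Beta hyperparameters."""
    return alpha + heads, beta + tails


def prob_heads(alpha, beta):
    """Predictive probability that the next flip lands heads."""
    return Fraction(alpha, alpha + beta)


# Two hypothetical states of knowledge that both say p = 1/2 right now.
vague = (1, 1)        # know almost nothing about the coin's weighting
confident = (100, 100)  # have effectively seen many balanced flips already

print(prob_heads(*vague), prob_heads(*confident))

# Both observe a single head and update.
vague_after = update_beta(*vague, heads=1, tails=0)
confident_after = update_beta(*confident, heads=1, tails=0)

print(prob_heads(*vague_after))      # moves a lot
print(prob_heads(*confident_after))  # barely moves
```

The two hyperparameters are a compressed version of the full update table: they fix both the current p and what p becomes after any sequence of flips, as long as the independence assumption holds.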