multifoliaterose comments on Why We Can't Take Expected Value Estimates Literally (Even When They're Unbiased) - Less Wrong
Interesting! But let's go back to the roots...
You're proposing to equip an agent with a prior over the effectiveness of its actions instead of (in addition to?) a prior over possible worlds. Will such an agent be Bayesian-rational, or will it exhibit weird preference reversals? If the latter, do you see that as a problem? If the former, Bayesian rationality means the agent must behave as though its actions were governed only by some prior over possible worlds; what does that prior look like?
Approximately normal distributions arise when a quantity is the sum of many independent random variables, the largest of which are of roughly comparable size (this is the central limit theorem). It's intuitively plausible that a Solomonoff-type prior would (at least approximately) satisfy such an assumption.
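A minimal numerical sketch of that central-limit intuition (the uniform variables and the sample sizes below are illustrative choices, not anything from the thread):

```python
import numpy as np

rng = np.random.default_rng(0)

# Sum many independent, comparably sized random variables; by the
# central limit theorem the sums should be approximately normal.
n_vars, n_samples = 100, 50_000
sums = rng.uniform(-1.0, 1.0, size=(n_samples, n_vars)).sum(axis=1)

# Standardize and compare higher moments with a standard normal,
# which has skewness 0 and excess kurtosis 0.
z = (sums - sums.mean()) / sums.std()
print("skewness (should be ~0):       ", (z**3).mean())
print("excess kurtosis (should be ~0):", (z**4).mean() - 3.0)
```

With many comparably sized summands, both moments come out near zero, as they would for an exact normal.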
But even if "intuitively plausible" equates to, say, probability 0.9999, that's insufficient to disarm Pascal's Mugging. I think there's at least a 0.0001 chance that a better approximate prior distribution for the "value of an action" is one with a "heavy tail", e.g., one with infinite variance; and a mixture distribution that puts any weight on a heavy-tailed component is itself heavy-tailed.
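For contrast, a sketch of how an infinite-variance tail behaves (the Pareto shape α = 1.5 is an illustrative assumption; any α ≤ 2 gives infinite variance):

```python
import numpy as np

rng = np.random.default_rng(0)

# numpy's pareto() draws from the Lomax distribution; adding 1 gives
# the classical Pareto with minimum value 1.
alpha = 1.5  # tail index; alpha <= 2 means infinite variance

for n in (10**3, 10**5, 10**7):
    x = rng.pareto(alpha, size=n) + 1.0
    print(f"n={n:>8}  mean={x.mean():7.3f}  "
          f"std={x.std():10.1f}  max={x.max():12.1f}")
```

The true mean here is finite (α/(α−1) = 3), but unlike the (log)normal case, the sample standard deviation never stabilizes: there is no finite variance for it to converge to, and rare, enormous draws keep dominating. That is exactly the failure mode Pascal's Mugging exploits.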
Sure, the present post deals only with the case where the value one assigns to an action follows a (log-)normal distribution across actions. In the case that you describe, there may (or may not) be a different way to disarm Pascal's Mugging.
Intuitively plausible, but wrong; Solomonoff priors have long, very slowly decreasing tails.
Care to elaborate or give a reference?
See http://lesswrong.com/lw/6fd/observed_pascals_mugging/4fky and the replies.
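For readers without the link handy, here is a sketch of the standard argument (the Kolmogorov-complexity notation K(·) is a gloss, not a quote from the linked comment). A Solomonoff-style prior gives any outcome describable by a short program a weight of roughly 2 to the minus its description length, and enormous numbers such as 3↑↑↑3 have short descriptions, so

```latex
\Pr\!\left[\,|U| \ge 3\uparrow\uparrow\uparrow 3\,\right]
\;\gtrsim\; 2^{-K(3\uparrow\uparrow\uparrow 3)}
\;\gg\; \frac{1}{3\uparrow\uparrow\uparrow 3}.
```

The tail therefore shrinks far more slowly than any power of the payoff, and expected values taken against such a prior can be dominated by astronomically unlikely, astronomically large outcomes.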
Okay, I didn't mean a literal Solomonoff prior; I meant "what your posterior would be after starting with a Solomonoff prior, observing the natural/human world at some length and Bayesian updating accordingly." The prior alone contains essentially no information!
Observations would merely shrink the tails by a multiplicative constant; they would not change the shape.
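To make that concrete under an illustrative assumption (the power-law form below is mine, chosen to match the "heavy tail" case discussed above): if the prior density behaves like p(x) ∝ x^(−α) for large x, and the likelihood of the observed data D tends to some constant c > 0 as x grows (finite observations can discount enormous payoffs but cannot rule them out), then

```latex
p(x \mid D) \;\propto\; L(D \mid x)\, p(x) \;\sim\; c\, x^{-\alpha}
\quad \text{as } x \to \infty .
```

The constant c rescales the tail, but the index α, i.e. the shape, survives the update.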