Thanks Emile,
Is there anything you'd like to see added?
For example, I was thinking of running it on Node.js and logging players' scores, so you could see how you compare. (I don't have a way to host this right now, though.)
Another possibility is to add diagnostics: for example, were you systematically setting your guess too high, or was it fluctuating more than the data would justify (under some model for the prior/posterior, say)?
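A rough sketch of what two such diagnostics might compute, purely as an illustration. The function names `bias` and `jumpiness` and the sample data are hypothetical, not part of the game. The second measure is a raw spread of trial-to-trial changes and does not yet compare against what any prior/posterior model would predict.

```python
from statistics import mean, pstdev

def bias(guesses, outcomes):
    """Mean of (guess - outcome); positive means guesses ran too high on average."""
    return mean(g - o for g, o in zip(guesses, outcomes))

def jumpiness(guesses):
    """Population standard deviation of trial-to-trial changes in the guess."""
    steps = [b - a for a, b in zip(guesses, guesses[1:])]
    return pstdev(steps)

# Hypothetical session: stated probabilities of heads, and what happened (1 = heads).
guesses = [0.9, 0.8, 0.9, 0.7]
outcomes = [1, 0, 1, 0]

print("bias:", bias(guesses, outcomes))
print("jumpiness:", jumpiness(guesses))
```

To judge whether the fluctuation is excessive, the jumpiness value would still need to be compared against the same statistic computed for an ideal updater under an assumed model.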
Also, I'd be happy to have pointers to your calibration apps or others you've found useful.
I've written a game (or see the source on GitHub) that tests your ability to assign probabilities to yes/no events accurately using a logarithmic scoring rule (called a Bayes score on LW, apparently).
For example, in the subgame "Coins from Urn Anise," you'll be told: "I have a mysterious urn labelled 'Anise' full of coins, each with possibly different probabilities. I'm picking a fresh coin from the urn. I'm about to flip the coin. Will I get heads? [Trial 1 of 10; Session 1]". You can then adjust a slider to select a number a in [0,1]. As you adju...
There's no math error.
Why is it consistent that assigning a probability of 99% to one half of a binary proposition that turns out false is much better than assigning a probability of 1% to the opposite half that turns out true?
I think there's some confusion. Coscott said these three facts:
Let f(x) be the output if the question is true, and let g(x) be the output if the question is false.
f(x)=g(1-x)
f(x)=log(x)
Consequently, g(x)=log(1-x). So if x=0.99 and the question is false, the output is g(x)=log(1-x)=log(0.01). And if x=0.01 and the question is true, the output is f(x)=log(x)=log(0.01). The two scores are equal, so the symmetry you want does hold.
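The same check can be run numerically. This is a minimal sketch using the definitions of f and g above; the Python function names are just illustrative labels.

```python
import math

def score_if_true(x: float) -> float:
    """f(x) = log(x): score when the question turns out true."""
    return math.log(x)

def score_if_false(x: float) -> float:
    """g(x) = f(1 - x) = log(1 - x): score when the question turns out false."""
    return score_if_true(1.0 - x)

# 99% on a proposition that turns out false...
confident_and_wrong = score_if_false(0.99)
# ...versus 1% on the opposite proposition that turns out true.
doubtful_and_right = score_if_true(0.01)

# Compare with a tolerance, since 1 - 0.99 is not exactly 0.01 in floating point.
print(math.isclose(confident_and_wrong, doubtful_and_right))
```

Both values equal log(0.01) up to floating-point rounding, which is why the comparison uses `math.isclose` rather than `==`.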
It's certainly in the right spirit. He's reasoning backwards in the same way Bayesian reasoning does: here's what I see; here's what I know about the possible mechanisms that could produce that observation, and their prior probabilities; so here's what I think is most likely to be really going on.