All of mcoram's Comments + Replies

It's certainly in the right spirit. He's reasoning backwards in the same way Bayesian reasoning does: here's what I see; here's what I know about possible mechanisms for how that could be observed, and their prior probabilities; so here's what I think is most likely to be really going on.

Thanks Emile,

Is there anything you'd like to see added?

For example, I was thinking of running it on nodejs and logging the scores of players, so you could see how you compare. (I don't have a way to host this, right now, though.)

Or another possibility is to add diagnostics. E.g. were you setting your guess too high systematically or was it fluctuating more than the data would really say it should (under some models for the prior/posterior, say).

Also, I'd be happy to have pointers to your calibration apps or others you've found useful.

Here's the "normalized" version: f(x)=1+log2(x), g(x)=1+log2(1-x) (i.e. scale f and g by 1/log(2) and add 1).

Now f(1)=1, f(.5)=0, f(0)=-Inf ; g(1)=-Inf, g(.5)=0, g(0)=1.

Ok?
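A quick sketch of this normalized rule in Python (the function name is mine):

```python
import math

def score(p, outcome):
    """Normalized log score: 1 + log2(p) if the event happened,
    1 + log2(1 - p) if it did not."""
    q = p if outcome else 1 - p
    return 1 + math.log2(q) if q > 0 else float("-inf")

# score(1.0, True) == 1, score(0.5, True) == 0, score(0.0, True) == -inf
```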

0Decius
Huh. I thought that wasn't a Bayesian score (not maximized by estimating correctly), but doing the math the maximum is at the right point for 1/4, 1/100, 3/4, 99/100, and 1/2.
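That check can be automated; a small grid search (my sketch, not Coscott's code) confirms the expected score p*f(x) + (1-p)*g(x) peaks at x = p for each of those probabilities:

```python
import math

def expected_score(p, x):
    """Expected log score of reporting x when the true probability is p."""
    return p * math.log(x) + (1 - p) * math.log(1 - x)

grid = [i / 1000 for i in range(1, 1000)]
for p in (0.25, 0.01, 0.75, 0.99, 0.5):
    best = max(grid, key=lambda x: expected_score(p, x))
    assert abs(best - p) < 1e-9  # honest reporting maximizes the score
```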

I've written a game (or see (github)) that tests your ability to assign probabilities to yes/no events accurately using a logarithmic scoring rule (called a Bayes score on LW, apparently).

For example, in the subgame "Coins from Urn Anise," you'll be told: "I have a mysterious urn labelled 'Anise' full of coins, each with possibly different probabilities. I'm picking a fresh coin from the urn. I'm about to flip the coin. Will I get heads? [Trial 1 of 10; Session 1]". You can then adjust a slider to select a number a in [0,1]. As you adju...
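The mechanics of one trial might look roughly like this (a hypothetical sketch, not the game's actual code):

```python
import math
import random

def urn_trial(guess, rng=random.Random()):
    """One imagined 'Urn Anise' trial: draw a fresh coin with an unknown
    bias, flip it, and score the player's probability `guess` for heads
    with the normalized log rule."""
    bias = rng.random()            # a coin from the urn, bias unknown
    heads = rng.random() < bias    # flip it
    p = guess if heads else 1 - guess
    score = 1 + math.log2(p) if p > 0 else float("-inf")
    return heads, score
```

Guessing a = 0.5 always scores 0, whatever the coin does.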

0[anonymous]
Thank you. I really, really want to see more of these. Feature request #976: More stats to give you an indication of overconfidence / underconfidence. (e.g. out of 40 questions where you gave an answer between .45 and .55, you were right 70% of the time).
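The requested stat could be computed along these lines (a sketch; the bucket width and names are my own):

```python
from collections import defaultdict

def calibration_table(predictions, width=0.1):
    """Bucket (probability, outcome) pairs by the stated probability and
    report how often the event actually happened in each bucket."""
    n_buckets = round(1 / width)
    buckets = defaultdict(list)
    for p, happened in predictions:
        buckets[min(int(p / width), n_buckets - 1)].append(happened)
    return {
        (round(b * width, 10), round((b + 1) * width, 10)): sum(v) / len(v)
        for b, v in sorted(buckets.items())
    }

# e.g. three answers near 0.5, right twice: the 0.5-0.6 bucket shows ~0.67
```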
6Scott Garrabrant
This game has taught me something. I get more enjoyment than I should out of watching a random variable go up and down, and probably should avoid gambling. :)

There's no math error.

Why is it consistent that assigning a probability of 99% to one half of a binary proposition that turns out false is much better than assigning a probability of 1% to the opposite half that turns out true?

I think there's some confusion. Coscott said these three facts:

Let f(x) be the output if the question is true, and let g(x) be the output if the question is false.

f(x)=g(1-x)

f(x)=log(x)

In consequence, g(x)=log(1-x). So if x=0.99 and the question is false, the output is g(x)=log(1-x)=log(0.01). Or if x=0.01 and the question is true, the output is f(x)=log(x)=log(0.01). So the symmetry that you desire is true.
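A quick numerical check of that symmetry (my sketch):

```python
import math

def f(x):
    return math.log(x)      # output when the question is true

def g(x):
    return math.log(1 - x)  # output when the question is false

for x in (0.01, 0.25, 0.5, 0.75, 0.99):
    assert math.isclose(f(x), g(1 - x))  # f(x) = g(1-x), as Coscott said
```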

0Decius
But that doesn't output 1 for estimates of 100%, 0 for estimates of 50%, and -inf (or even -1) for estimates of 0%, or even something that can be normalized to either of those triples.