RowanE comments on Calibration Test with database of 150,000+ questions - Less Wrong
Comments (31)
Awesome!
I've been dying for something like this after I zoomed through all the questions in the CFAR calibration app.
Notes so far:
* The highest-available confidence is 99%, so the lowest-available confidence should be 1% rather than 0%. Or even better, you could add 99.9% and 0.1% as additional options.
* So far I've come across one question that was blank. It just said Category: jewelry and then had no other text. Somehow the answer was Ernest Hemingway.
* Would be great to be able to sign up for an account so I could track my calibration across multiple sessions.
It's possible to be fairly certain that you haven't thought of a correct answer (even if you aren't certain that you don't know the answer): you don't have any answer in mind, and yet you aren't considering "this is a trick question" or "there is no correct answer" as answers either. Is this something that should be represented, making "0%" correct to include, or am I confused?
I got one blank question, which I think was a loading error, since the answer came up the same as the previous question's and the question after it took a couple of seconds to appear on-screen.
I'd prefer not to allow 0 and 1 as available credences. If 0 remained as an option, I would just interpret it as "very close to 0" and keep using the app. But if a future version of the app showed me my Bayes score, the difference between what the app allows me to choose (0%) and what I'm interpreting 0 to mean ("very close to 0") could matter.
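To illustrate why that difference could matter, here is a rough sketch. It assumes "Bayes score" means the logarithmic scoring rule; how the app would actually score answers is unknown, and `log_score` is a hypothetical helper name, not anything from the app.

```python
import math

def log_score(credence: float, outcome: bool) -> float:
    """Logarithmic score in bits for a stated credence that a claim is true.

    Assumption: the score is log2 of the probability assigned to what
    actually happened. Scores are <= 0, and closer to 0 is better.
    """
    # Probability that was assigned to the outcome that actually occurred.
    p = credence if outcome else 1.0 - credence
    if p == 0.0:
        # A literal 0% on something that turns out true is infinitely penalized.
        return float("-inf")
    return math.log2(p)

# "Very close to 0" (here 0.1%) costs a large but finite amount when wrong.
print(log_score(0.001, True))
# A literal 0% costs an unbounded amount when wrong.
print(log_score(0.0, True))
# When the claim is in fact false, a literal 0% scores a perfect 0 bits.
print(log_score(0.0, False))
```

Under this assumption a single wrong answer entered at a literal 0% would swamp every other answer in the session, whereas the same answer read as 0.1% would cost only about ten bits.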
I think it's misleading to just drop in the statement that 0 and 1 are not probabilities.
There is a reasonable and arguably better definition of probabilities which excludes them, but it's not the standard one, and it also has costs. For example, probabilities are a useful tool in building models, and it is sometimes useful to use probabilities 0 and 1 in models.
(aside: it works as a kind of 'clickbait' in the original article title, and Eliezer doesn't actually make such a controversial statement in the post, so I'm not complaining about that)
Fair enough. I've edited my original comment.
(For posterity: the text for my original comment's first hyperlink originally read "0 and 1 are not probabilities".)
Perfect, thanks!