PredictionBook.com - Track your calibration

26 Eliezer_Yudkowsky 14 October 2009 12:08AM

Our hosts at Tricycle Developments have created PredictionBook.com, which lets you make predictions and then track your calibration - see whether things you assigned a 70% probability happen 7 times out of 10.
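As a rough illustration of what "tracking your calibration" means, here is a minimal sketch that groups past predictions by the probability you stated and compares that with how often they actually came true. The function name `calibration_table` and the sample record are made up for the example; nothing here reflects how PredictionBook itself is implemented.

```python
from collections import defaultdict


def calibration_table(predictions):
    """Group (stated_probability, came_true) pairs by stated probability.

    Returns {stated_probability: (number_of_predictions, observed_frequency)}.
    """
    buckets = defaultdict(list)
    for stated, came_true in predictions:
        buckets[stated].append(came_true)
    return {
        stated: (len(outcomes), sum(outcomes) / len(outcomes))
        for stated, outcomes in sorted(buckets.items())
    }


# Hypothetical record: ten predictions made at 70%, seven of which came true,
# and four predictions made at 90%, two of which came true.
record = (
    [(0.7, True)] * 7 + [(0.7, False)] * 3
    + [(0.9, True)] * 2 + [(0.9, False)] * 2
)

for stated, (count, observed) in calibration_table(record).items():
    print(f"stated {stated:.0%}: {observed:.0%} came true over {count} predictions")
```

In this made-up record the 70% bucket is well calibrated, while the 90% bucket is overconfident, since only half of those predictions came true.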

The major challenges with a tool like this are (a) coming up with good short-term predictions to track and (b) maintaining your will to keep on tracking yourself even if the results are discouraging, as they probably will be.

I think the main motivation to actually use it would be rationalists challenging each other to put a prediction on the record and track the results - I'm going to try to remember to do this the next time Michael Vassar says "X%" and I assign a different probability. (Vassar would have won quite a few points for his superior predictions of Singularity Summit 2009 attendance - I was pessimistic, Vassar was accurate.)

Comments (31)

Jack 19 October 2009 04:56:53AM* 3 points

Eliezer, you ought to be ashamed of yourself!

Cyan 14 October 2009 02:02:27PM 7 points

maintaining your will to keep on tracking yourself even if the results are discouraging, as they probably will be...

I predict with probability 0.95 that my 95% intervals will contain the quantity I'm estimating around 50% of the time.

gwern 14 October 2009 02:49:29PM 2 points

I see what you did there.

LauraABJ 15 October 2009 08:22:54PM 0 points

Hilarious.

kess3r 14 October 2009 11:51:45PM 1 point

Also, there need to be more explanations of how things work and the interface needs to be tweaked for better user friendliness. Also, please add more bandwidth. Otherwise, awesome idea.

matt 15 October 2009 09:12:52AM 0 points

Yeh - sorry about the slow. Speed optimizations are one of the things we left out. If enough of you keep using it, we'll make it faster.

kess3r 14 October 2009 11:22:54PM 1 point

This is pure awesome. Finally something has been done! This is akin to the mythbusters going on TV and doing science instead of just talking about how awesome science is.

Apologies for my little rant above.

As for the site itself, other than being awesome, it needs a few tweaks. There is no place to discuss the site itself and possible improvements to it. Also, I wish there was a feature to hide the result until after I vote.

matt 15 October 2009 01:32:52AM 0 points

See the Feedback tab floating on the right.

MBlume 14 October 2009 06:14:36AM 7 points

I've created an account to represent predictions made by the intrade markets. If anyone would like to help me update with regular quotes, PM me for the password.

gwern 29 July 2010 06:48:40AM 0 points

I don't think I have time for regular quotes, but I could help you out monthly by judging predictions, and adding the prediction of markets added since the last month (if there is any easy way to get such a list).

CannibalSmith 14 October 2009 11:05:42AM* 2 points

The sun will rise tomorrow morning
( 80% confidence; 5 wagers; 1 comment )

O_o

rwallace 14 October 2009 10:39:26AM 2 points

+1 Interesting! I've put in a prediction... and also pressed the wrong button on somebody else's prediction (for which the time hasn't elapsed yet) and marked it judged right, hopefully clicking Unknown undoes that...

The advantage of a site like this having been brought to the attention of geeks is that there are at least a few predictions listed to which my answer isn't "how the heck would I know?" :)

rwallace 14 October 2009 10:46:34AM 3 points

Seems like a few other people have been doing the pressing-the-wrong-button thing, if I'm now understanding the user interface correctly? I've tried setting some of those still-in-the-future predictions to unknown; hopefully that's the right thing to do. If so, would it be possible to change the user interface to avoid this error?

Emile 14 October 2009 02:53:36PM 5 points

Same here - once I entered a percentage, I wasn't sure which button to press. I hesitated between "right" (meaning the percentage I was giving was my confidence that it was right) and "my 2 cents" (which I thought only applied when you entered a comment). I selected "right", which was wrong.

The interface needs a bit of polishing.

ektimo 16 October 2009 06:13:17PM 0 points

Me too. The interface for that was confusing enough that I ended up not submitting at all.

Jack 14 October 2009 06:28:49AM 4 points

Everyone here should go and make at least one prediction. It's rationalist homework time.

Jonathan_Graehl 14 October 2009 01:41:06AM 4 points

Calibration may be achievable by a general procedure of making and testing (banded) predictions, but I wouldn't trust anyone's calibration in a particular domain on evidence of calibration in another.

In other words, people will have studied the accuracy of only some of their maps.

DanArmak 14 October 2009 01:11:52AM* 1 point

This is quite interesting & exciting.

Are they planning on adding features relevant to a prediction market (apart from betting money)? E.g., tracking bettor reputation/score based on success or transitive trust; or tracking the overall predicted value of a prediction with many bets, weighted by the success/reputation/... of the bettors.
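One way to picture the last suggestion is a weighted average of everyone's estimates. The sketch below is purely illustrative: the function name `weighted_consensus`, the weights, and the sample numbers are all hypothetical, and nothing in this thread says PredictionBook computes anything like it.

```python
def weighted_consensus(estimates):
    """Weighted mean of probability estimates.

    estimates: list of (probability, weight) pairs, where weight is some
    hypothetical non-negative track-record score for the person estimating.
    """
    total_weight = sum(weight for _, weight in estimates)
    if total_weight == 0:
        raise ValueError("at least one estimate needs a positive weight")
    return sum(p * weight for p, weight in estimates) / total_weight


# Made-up estimates: (stated probability, reputation weight).
estimates = [(0.80, 3.0), (0.60, 1.0), (0.10, 0.5)]

unweighted = sum(p for p, _ in estimates) / len(estimates)
print(f"unweighted mean:    {unweighted:.2f}")
print(f"weighted consensus: {weighted_consensus(estimates):.2f}")
```

How the weights would be derived (past accuracy, transitive trust, or something else) is exactly the open design question; the averaging step itself is the easy part.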

Eliezer_Yudkowsky 14 October 2009 02:13:21AM 2 points

Whether they add features will depend on whether people seem interested in using it, they say.

matt 14 October 2009 04:53:13AM 5 points

Official answer: Eliezer's right. If we see traffic growing we'll invest in further development.

We can think of many things we could do to make the site better… but those users who currently use it don't use it enough, and if they tell their friends about it their friends don't become regular users (often enough).

Hosting the current code is very cheap and easy, so the site's in little danger of being shut down, but we won't be developing it further unless you guys and gals (and your friends, and their friends) pile on the love.

kess3r 15 October 2009 10:41:41PM 3 points

Just out of curiosity, are you a startup, a non profit or a guy doing a side project?

I predict the site's userbase will not explode overnight but will escalate in the shape of a hockey stick. That's how these things usually happen. You will have to keep improving it even while the userbase is still low, otherwise people will think the site is dying and they will stop showing up. Interesting things need to already be happening on the site before a larger audience will keep coming back to it, not vice versa.

Also, you need to add documentation no matter how simple and intuitive you think the site's features are. They don't seem as intuitive from the outside. By 'documentation' I mean a short and EXPLICIT description of what each feature does. I like the 'help' button near the timeframe for the prediction. You could add help buttons next to everything. Also, a FAQ would be nice.

Overall I think the site has great potential. Keep up the good work.

matt 16 October 2009 04:35:57AM* 3 points

Just out of curiosity, are you a startup, a non profit or a guy doing a side project?

We're Investling, which is a handful of startups and an IT consultancy. We're for-profit, with some non-profit projects on the side (in part because we'll make more profits if we can help save the world from surprise conversion to paperclips). The majority of our non-profit work is SIAI related.

I predict the site's userbase will not explode overnight but will escalate in the shape of a hockey stick. […]

Some projects follow that pattern. Some projects never hockey-stick. How can you tell which curve you're riding?

We have many projects running: some have maintained exponential growth since we became involved; some are too young to judge; and some are on the low end of a curve that may be a hockey stick and may just be a project that doesn't have any legs. I very much hope that the LW crowd will latch on to PBook (keep coming back, tell your friends, etc.). If you do (we do - several of us are very keen LWers) and we see traffic growing, we'll flood more resources into the project. If it languishes we'll continue to host it and may even open source it, but it seems more sensible to flood our resources into projects that are winning. I really don't want to see PBook die, but I'm trying to count warm fuzzies consciously.

Also, you need to add documentation […]

We know the documentation is sparse (or, more precisely, the user interface isn't intuitive - documentation is evidence of a UI failure and good design is self-documenting). If you guys are still around in 14 days we should talk about more dev resources.

gwern 29 July 2010 06:03:28AM* 1 point

I just signed up and did a bunch of predictions. Here are my initial impressions:

The majority of our non-profit work is SIAI related.

1) A tool like PB is like spaced repetition flash card programs or writing Wikipedia articles - a long-term tool. Some benefits appear quickly, yes, but the bulk of the benefits arrive over years or decades. (PB is somewhat like Long Bets.)

As the saying goes, "In the long run, the utility of all non-Free software approaches zero. All non-Free software is a dead end." If I invest time in PB, what guarantee do I have that I will be able to get my data* out of PB when** it dies, especially for topics I didn't write? Are you guys going to license the content under a CC license?

(You should do it early, while there still isn't too much content - once Wikipedia got large, it took years and years and a unique one-time exemption by the FSF to liberate its content from the GFDL into a CC license.)

* My data is vastly more important to me than the website software. If I had to, I could run a personal PB in just a flat text file, after all.

** And it will die eventually. Every site either dies or evolves out of recognition.


2) comments are ridiculously constrained. I dunno if you guys were trying for some sort of auto-Twitter compatibility, but it's really annoying. If you need to dump comments on Twitter and they're too long, then just truncate them.

3) I just judged a Michael Jackson-related prediction wrong, with a citation that the predicted event happened in the wrong year. But in the history section, my comment never appeared!
My current workaround is to make a 0 or 100% prediction (wrong/right), explain my reasoning as best as I can in so short a space, and then separately mark it wrong/right. This is unfair to my score, since obviously I can choose 0 or 100% and always be right.

4) The black boxes on prediction pages (eg. "Join this prediction") are horrible. I was convinced for the longest time that they were buttons to push, and that they were disabled by some JavaScript pokery until I went and read the page source.

5) Newlines in comments do not get translated to a space or two in the comment; they get translated to nothing whatsoever.

6) No apparent way to edit 'due dates' for predictions; many unjudged predictions can't be judged at all because they seem to have been created expired.

7) On userpages, the most recent prediction/action gets split in half by the statistics graph in my Firefox; screenshot.

thomblake 16 October 2009 12:55:24PM 2 points

good design is self-documenting

Yes yes yes. Four times yes.

and we see traffic growing

Right now the UI is so slow / bad that I couldn't see myself using it.

anonym 18 October 2009 07:31:27PM 2 points

Agreed on the UI being incredibly confusing (and slow).

In terms of usability, if they just moved the judgment buttons down below, added text like "Render final judgment on this prediction" to make it obvious what judgment does, and changed "My 2 cents" to "Submit Estimate" or something like that, it would be a huge improvement over the current interface. These sorts of very minor cosmetic UI changes would be trivial to make.

Jack 19 October 2009 04:49:12PM 0 points

Can someone explain to me what is going on with this prediction given this prediction? I'm not going crazy, right? People are confused.

thomblake 19 October 2009 08:39:48PM 0 points

Interesting - by the time I checked this, it looks like there aren't any inconsistent estimates.

Jack 19 October 2009 09:20:33PM 0 points

Right now the later prediction has 9 points higher probability than the sooner prediction. I counted two or three cases of individual users posting higher probabilities for the later prediction. Unless they're really confident the first cryonic revival takes place during that decade, they're making a huge mistake. My best explanation is that they just saw a farther-off date and assumed a higher probability of everything...

ektimo 16 October 2009 06:30:03PM 0 points

Some of my predictions are of the sort "the stock market will fall 50% tomorrow with 20% odds" (not a real prediction!). If it did happen I should get huge credit, but would it show up as negative credit since I predicted there was only a 20% chance it would happen? Is there some way it would be possible to do this kind of prediction with PredictionBook?

I predict this comment will get less than 4 points by Oct. 19 with 75% odds.

gwern 16 October 2009 07:39:09PM 0 points

If it did happen I should get huge credit, but would it show up as negative credit since I predicted there was only a 20% chance it would happen? Is there some way it would be possible to do this kind of prediction with PredictionBook?

It seems to me like you're asking about 2 different issues: the first is not desiring to be penalized for making low-probability bets; but that should be handled pretty simply - if you figure it at 1 in 5, then after only a few failed bets things should and ought to start looking bad for you, but if at 1 in thousands, each failed prediction ought to affect your score very little.

Presumably PredictionBook is offering richer rewards for low-probability successes, just like a 5% share on a prediction market pays out (proportionately) much more than a 95% share would; on net you would do the same.
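To make the "on net you would do the same" point concrete, here is a small sketch using the logarithmic scoring rule, a standard proper scoring rule. This is a swapped-in illustration: the thread does not say how PredictionBook actually scores predictions, and the 20% "true" chance is just the hypothetical figure from the question above.

```python
import math


def log_score(stated, came_true):
    """Logarithmic score for one prediction: closer to 0 is better."""
    return math.log(stated if came_true else 1.0 - stated)


def expected_log_score(stated, true_probability):
    """Average score per prediction if the event really occurs with true_probability."""
    return (
        true_probability * log_score(stated, True)
        + (1.0 - true_probability) * log_score(stated, False)
    )


# Assume, for illustration only, that the crash really has a 20% chance.
true_probability = 0.2

for stated in (0.001, 0.2, 0.5):
    score = expected_log_score(stated, true_probability)
    print(f"stating {stated:.1%}: expected score {score:.3f}")
```

Under this rule a single 20% prediction that comes true does score worse than the same prediction failing, but averaged over many such predictions, stating 20% beats both the "1 in thousands" default and a coin-flip 50%. The credit for going out on a limb shows up in the long-run average, not in any one outcome.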

The second issue is that you seem to think that certain predictions are simply harder to predict better than chance, and that you should be rewarded for going out on a limb? (20% odds on a big market bet tomorrow is much more detailed than the default 1-in-thousands-chance-per-day prediction.)

I don't know what the fair reward here is. If few people are making that prediction at all, then it should be easy to do better than them. In prediction markets, one expects that unpopular markets will be easier to arbitrage and beat - the thicker the market, the more efficient; standard economics. So in a sense, unpopular predictions are their own reward.

But this doesn't prevent making obscure predictions ('will I remember to change my underwear tomorrow?'). Nor would it seem to adequately cover 'big questions' like open scientific puzzles or predictions about technological development (think the union of Longbets & Intrade). Maybe there could be a bonus for having predictions pay out with confidence levels higher than the average? This would attract well-calibrated people to predictions where others are not informed or are too pessimistic.

UnholySmoke 15 October 2009 01:24:07PM 0 points

Cracking idea, like it a lot. Hofstadter would jump for joy, and in his honour:

http://predictionbook.com/predictions/532