Discussion article for the meetup : Bath, UK: Agreement, practical meetups, and report from last meetup

WHEN: 02 November 2014 02:00:00PM (+0000)

WHERE: 5-10 James St W, Avon, Bath BA1 2BX

Bath, UK will be having its second meetup this Sunday 2nd November at 14:00 in the King of Wessex, which is a Wetherspoons pub in the city. I shall wait at least ninety minutes (i.e. until 15:30) for the first arrivals.

I'll put a sheet featuring a paperclip and saying 'Less Wrong' on the table so you know you've found us. Make sure you venture into the pub, since there's no guarantee our table will be near the door.

In case you need to contact me (e.g. if the venue is unexpectedly busy and we have to move elsewhere and you can't find us), my mobile number is the product 3 x 3 x 23 x 97 x 375127, preceded by a zero (so eleven digits total).

We have a Facebook group.

At the start we'll chat for a bit, then move on to an agreement exercise. At the last meetup we made predictions for our own PredictionBook accounts somewhat independently, without necessarily sharing all our information. This time we shall try to reach consensus in our probabilities and then see how well calibrated that consensus is, by means of a single PredictionBook account for the meetup group.
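For anyone who wants to check calibration by hand, here is a minimal Python sketch. It assumes you have your predictions as (stated probability, outcome) pairs; the function names and the 10% bucket width are my own choices for illustration, not how PredictionBook does it internally.

```python
from collections import defaultdict

def calibration_table(records):
    """Group (probability, outcome) pairs into 10% buckets.

    Returns {bucket_percent: (count, observed_frequency)}, where
    bucket_percent is the stated probability rounded to the nearest 10.
    """
    buckets = defaultdict(list)
    for probability, outcome in records:
        if not 0.0 <= probability <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
        buckets[10 * round(probability * 10)].append(bool(outcome))
    return {
        bucket: (len(outcomes), sum(outcomes) / len(outcomes))
        for bucket, outcomes in sorted(buckets.items())
    }

def brier_score(records):
    """Mean squared difference between stated probability and outcome (lower is better)."""
    records = list(records)
    return sum((p - float(bool(o))) ** 2 for p, o in records) / len(records)

# Made-up example data, not results from the meetup.
example = [(0.9, True), (0.9, True), (0.9, False), (0.6, True), (0.6, False)]
for bucket, (count, frequency) in calibration_table(example).items():
    print(f"stated ~{bucket}%: {count} predictions, {frequency:.0%} came true")
```

A well-calibrated set of predictions has observed frequencies close to the stated probabilities in each bucket, though with only a handful of predictions per bucket the figures will be noisy.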

After that, we shall discuss ideas for future meetups and activities. In particular, we shall discuss how we can move forward with practical meetups and instrumental rationality, and how to balance this with discussion and 'abstract' or epistemic stuff.

====Previous meetup (2014-10-19 Sunday)====

It went well. There was me, someone who tagged along with me, someone from Bristol, and someone from Bradford-on-Avon. I think everyone had arrived by 14:30, and we probably stayed until 17:30 or 18:00. (We stayed long enough that we all got something to eat in the pub.)

We got to know each other a bit then did 15 predictions. The previous night, I had prepared a list of prompts for things to make predictions about, ranging from things where I thought we might have very high (or low) confidences, to things where I expected that most of the attendees would be basically indifferent (e.g. whether an even or odd number of elements have been observed, whether the density of water is above or below 1kg per liter, etc.).

We skipped some of the ambiguous prompts, and for a couple we had to work out midway through what we'd use to judge the prediction. I'd state the prompt, then where necessary we'd pin it down into something we could judge objectively enough. There might be some brief discussion, but we weren't trying to share all our information. I would type in (but not submit or write my probability for) the final wording of the prediction on PredictionBook. At a suitable point, when everyone understood what we were predicting and how it would be judged, I would give 90 seconds for everyone to stop communicating and log their final probabilities. I'd then type in my probability and create the prediction on PredictionBook, then we'd go round stating the probability we'd written down.

Some of the prompts were intentionally underspecified. For example, the first prediction was about Wladimir Klitschko's mass. In that case, after I'd explained who he is, we each independently (to avoid priming) wrote down a figure. Then we took the arithmetic mean of the figures to obtain a 'wisdom of crowds' estimate and used that as the mass for the prediction.

(If you're worried that averaging the guesses would lead everyone to put 50% probability on the proposition, then you can shift by some amount to encourage more extreme confidences. But remember that it's still useful to test calibration at the 50% level!)

That was one of the cases where we had to decide partway through how we were going to judge the prediction, since we realised his mass would fluctuate a lot depending on e.g. whether he'd cut weight for a weigh-in. We agreed that if Google gave a unique figure and it seemed plausible, then we'd go with that. I'm not certain, but I don't think we actually shifted the average in this case, and the mean of our initial guesses turned out to be exactly correct (110kg).

In some cases, where the initial estimates varied wildly, I suggested we use a 'logarithmic average' (i.e. the geometric mean): the exponential of the mean of the logarithms of the estimates, exp(arithmetic_mean(log(estimates))).
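As a rough illustration of why this helps, here is a short Python sketch comparing the two averages. The function names and the guesses are made up for the example; they aren't figures from the meetup.

```python
import math

def arithmetic_mean(estimates):
    """Ordinary mean: sum divided by count."""
    return sum(estimates) / len(estimates)

def logarithmic_average(estimates):
    """Geometric mean: exp of the arithmetic mean of the logs.

    Only defined for strictly positive estimates.
    """
    if any(x <= 0 for x in estimates):
        raise ValueError("all estimates must be positive")
    return math.exp(arithmetic_mean([math.log(x) for x in estimates]))

# Hypothetical guesses spanning two orders of magnitude.
guesses = [10.0, 100.0, 1000.0]
print(arithmetic_mean(guesses))      # dominated by the largest guess
print(logarithmic_average(guesses))  # close to the middle guess on a log scale
```

With guesses that differ by orders of magnitude, the arithmetic mean is pulled towards the largest guess, whereas the logarithmic average treats "ten times too high" and "ten times too low" symmetrically.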

Then we'd check the prediction and I'd mark it Right or Wrong accordingly on PredictionBook. When they got home, the others marked the prediction Unknown, then put in the probability they'd made a note of, then re-marked the prediction as Right or Wrong as before.

I had my laptop and used the pub's Wi-Fi to create the predictions on PredictionBook with my estimate. The others made a note of their probabilities. After each prediction, we checked the prediction and I marked it Right or Wrong on PredictionBook accordingly. When they got home, each of the others who attended then marked the prediction Unknown, then submitted their probability from earlier, then re-marked the Prediction as Right or Wrong.



I shall be attending (90% confidence).

Anyway, I looked through the questions, and, well, please take this as constructive criticism, but I'd have no idea about the truth of most of those statements, and they mostly seem to be fairly dry statistics. I dunno what the people who actually attended the last meeting thought, but I'd suggest maybe something more like geeky pub-quiz with probability estimates?

Statements can still be used for calibration even if you don't know the answer, but it's always more fun if you have at least an inkling of the answer. It's always good to add more fun to things like this, so any chance I could convince you to bring along some of the type of questions you think would be good?

Ahh, this completely slipped my mind on the day. As it turned out, I thought the problems we tackled on the day were more interesting than I had previously assumed they would be. I had thought that questions like, I dunno, "which got better reviews, Indiana Jones and the Temple of Doom or Return of the Jedi" or "who had more number 1 records, the Beatles or Elvis" (I think it's Elvis, but with only 60% confidence) could be fun, perhaps interspersed with the questions about the mass of Pluto.

Not sure about the equivalent for Fermi estimates.

See you on Sunday! I've had a look at some cool exercises to do, and Fermi Calculations, 5 minute debiasing and Zendo all look fun and useful. We'll talk it over at the rock climbing place in a few hours.