"Often I compare my own Fermi estimates with those of other people, and that’s sort of cool, but what’s way more interesting is when they share what variables and models they used to get to the estimate."
– Oliver Habryka, at a model building workshop at FHI in 2016
One question that people in the AI x-risk community often ask is
"By what year do you assign a 50% probability of human-level AGI?"
We go back and forth with statements like "Well, I think you're not updating enough on AlphaGo Zero." "But did you know that person X has 50% in 30 years? You should weigh that heavily in your calculations."
However, 'timelines' is not the interesting question. The interesting parts are in the causal models behind the estimates. Some possibilities:
- Do you have a story about how the brain in fact implements back-propagation, and thus whether current ML techniques have all the key insights?
- Do you have a story about the reference class of human brains and monkey brains and evolution, that gives a forecast for how hard intelligence is and as such whether it’s achievable this century?
- Do you have a story about the amount of resources flowing into the problem, that uses factors like 'Number of PhDs in ML handed out each year' and 'Amount of GPU available to the average PhD'?
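To make that last kind of story concrete, here is a minimal toy sketch of a resources-based model in Python. Every variable and number in it is invented purely for illustration, not a real estimate; the point is that each line becomes a separate claim you can argue with, rather than a single 50% date.

```python
# Toy "resources flowing into the problem" model.
# All numbers are invented for illustration; none of them are real estimates.

phds_per_year = 5_000    # assumed: new ML PhDs minted per year at the start
phd_growth = 1.15        # assumed: yearly growth in that number
gpus_per_phd = 8         # assumed: GPU-equivalents available to the average PhD
gpu_growth = 1.30        # assumed: yearly growth in effective compute per researcher
effort_needed = 1e9      # assumed: arbitrary units of research effort "required"

effort = 0.0
year = 2018
while effort < effort_needed:
    effort += phds_per_year * gpus_per_phd   # effort contributed this year
    phds_per_year *= phd_growth
    gpus_per_phd *= gpu_growth
    year += 1

print(f"Under these made-up assumptions, the model crosses its threshold around {year}.")
```

The year it prints is meaningless. What matters is that a disagreement about timelines turns into disagreements about growth rates, thresholds, and which variables belong in the model at all.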
Timelines is an area where many people discuss one variable all the time, when in fact the interesting disagreement is much deeper. Regardless of whether our 50% dates are close, when you and I have different models we will often recommend contradictory strategies for reducing x-risk.
For example, Eliezer Yudkowsky, Robin Hanson, and Nick Bostrom all have different timelines, but their models tell such different stories about what’s happening in the world that focusing on timelines instead of the broad differences in their overall pictures is a red herring.
(If in fact two very different models converge in many places, this is indeed evidence that they are both capturing the same thing - and the more different the two models are, the more likely it is that this shared factor is the truth. But if two models significantly disagree on strategy and outcome yet hit the same 50% confidence date, we should not count this as agreement.)
Let me sketch a general model of communication.
A Sketch
Step 1: You each have a different model that predicts a different probability for a certain event.
[Diagram for Step 1]
Step 2: You talk until you have understood how they see the world.
[Diagram for Step 2]
Step 3: You do some cognitive work to integrate your evidence and ontologies with theirs, and this implies a new probability (a toy numerical sketch of this follows below).
[Diagram for Step 3]
"If we were simply increasing the absolute number of average researchers in the field, then I’d still expect AGI much slower than you, but if now we factor in the very peak researchers having big jumps of insight (for the rest of the field to capitalise on), then I think I actually have shorter timelines than you.”
One of the common issues I see with disagreements in general is people jumping prematurely to the third diagram before spending time getting to the second one. It’s as though if you both agree on the decision node, then you must surely agree on all the other nodes.
I prefer to spend an hour or two sharing models before trying to change either of our minds; doing otherwise creates false consensus rather than successful communication. Going directly to Step 3 can be the right call when you’re on a logistics team and need to make a decision quickly, but it is quite inappropriate for research, and in my experience the most important communication challenges are around deep intuitions.
Don't practice coming to agreement; practice exchanging models.
Something Other Than Good Reasoning
Here’s an alternative thing you might do after Step 1. This is where you haven't changed your model, but decide to agree with the other person anyway.
[Diagram: adopting the other person's probability while your own model stays unchanged]
This doesn’t make any sense but people try it anyway, especially when they’re talking to high status people and/or experts. “Oh, okay, I’ll try hard to believe what the expert said, so I look like I know what I’m talking about.”
[Diagram: throwing out your model and keeping only the other person's probability]
This last one is the worst, because it means you can’t notice your confusion any more. It represents “Ah, I notice that p = 0.6 is inconsistent with my model, therefore I will throw out my model.” Equivalently, “Oh, I don’t understand something, so I’ll stop trying.”
This is the first post in a series of short thoughts on epistemic rationality, integrity, and curiosity. My thanks to Jacob Lagerros and Alex Zhu for comments on drafts.
Descriptions of your experiences of successful communication about subtle intuitions (in any domain) are welcomed.
I like the approach, but the problem this may run into on many occasions is that parts of the reasoning include things which are not easily communicable. Suppose you talk to AlphaGo and want it to explain why it prefers some move. It may relatively easily communicate part of its tree search (and I can easily integrate the non-overlapping part of the tree into my tree), but we will run into trouble communicating the "policy network" consisting of weights obtained by training on a very large number of examples. *
From a human perspective, such parts of our cognitive infrastructure are often not really accessible, giving their results in the form of "intuition". What's worse, another part of our cognitive infrastructure is very good at making up fake stories explaining the intuitions, so if you are motivated enough your mind may generate plausible but in fact fake models for your intuition. In the worst case, you discard the good intuition when someone shows you problems in the fake model.
Also... it seems to me it is in practice difficult to communicate such parts of your reasoning, among rationalists in particular. I would expect communicating parts of the model in the form of "my intuition is ..." to be intuitively frowned upon, down-voted, pattern-matched to argument from authority, or to "argument" by someone who cannot think clearly. (An exception is the intuitions of high-status people.) In theory it would help if you describe why your intuitive black box is producing something other than random noise, but my intuition tells me it may sound even more awkward ("Hey, I studied this for several years").
*Btw, in some cases it _may_ make sense to throw away part of your model and replace it with just the opinion of the expert. I'm not sure I can describe it clearly without drawing.
In retrospect, I think this comment of mine didn't address Jan's key point, which is that we often form intuitions/emotions by running a process analogous to aggregating data into a summary statistic and then throwing away the data. Now the evidence we saw is quite incommunicable - we no longer have the evidence ourselves.
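As a toy picture of that aggregate-then-discard process (the numbers are invented and purely illustrative):

```python
# Toy sketch: forming an "intuition" by compressing observations into a
# summary statistic and then losing the observations themselves.

observations = [0.9, 0.7, 0.85, 0.95, 0.8]   # invented experiences/evidence

intuition = sum(observations) / len(observations)  # aggregate into a summary
del observations                                   # the raw evidence is gone

# All that is left to communicate is the summary, not what produced it.
print(f"My intuition says roughly {intuition:.2f}, but I can no longer show you why.")
```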
Ray Arnold gave me a good example the other day of two people - one an individualist libertarian, the other a communitarian Christian. In the example these two people deeply disagree on how society should be set up, and this is ...