I’ve recently been spending some time thinking about the rationality mistakes I’ve made in the past. Here’s an interesting one: I think I have historically been too hasty to go from “other people seem very wrong on this topic” to “I am right on this topic”.

Throughout my life, I’ve often thought that other people had beliefs that were really repugnant and stupid. Now that I am older and wiser, I still think I was correct to think that these ideas were repugnant and stupid. Overall, I was probably slightly insufficiently dismissive of things like the opinions of apparent domain experts and the opinions of people who seemed smart but whose arguments I couldn’t really follow. I also overrated conventional wisdom about factual claims about how the world worked, though I underrated conventional wisdom about how to behave.

Examples of ideas where I thought the conventional wisdom was really dumb:

  • I thought that animal farming was a massive moral catastrophe, and I thought it was a sign of terrible moral failure that almost everyone around me didn’t care about this and wasn’t interested when I brought it up.
  • I thought that AI safety was a big deal, and I thought the arguments against it were all pretty stupid. (Nowadays the conventional wisdom has a much higher opinion of AI safety; I’m talking about 2010-2014.)
  • I thought that people have terrible taste in economic policy, and that they mostly vote for good-sounding stuff that stops sounding good if you think about it properly for even a minute.
  • I was horrified by people proudly buying products that said “Made in Australia” on them; I didn’t understand how that wasn’t obviously racist, and I thought that we should make it much easier for anyone who wants to come live in Australia to do so. (This one has become much less controversial since Trump inadvertently convinced liberals that they should be in favor of immigration liberalization.)
  • I thought, and still think, that a lot of people’s arguments about why it’s good to call the police on bike thieves were dumb. See eg many of the arguments people made in response to a post of mine about this (which, in fairness, was a really dumb post, IMO).

I think I was right about other people being wrong. However, I think that my actual opinions on these topics were pretty confused and wrong, much more than I thought at the time. Here’s how I updated my opinion for all the things above:

  • I have updated against the simple view of hedonic utilitarianism under which it’s plausible that simple control systems can suffer. A few years ago, I was seriously worried that the future would contain much more factory farming and therefore end up net negative; I now think that I overrated this fear, because (among other arguments) almost no-one actually endorses torturing animals, we just do it out of expediency, and in the limit of better technology our weak preferences will override our expediency.
  • My understanding of AI safety was “eventually someone will build a recursively self improving singleton sovereign AGI, and we need to figure out how to build it such that it can have an off switch and it implements some good value function instead of something bad.” I think this picture was massively oversimplified. On the strategic side, I didn’t think about the possibilities of slower takeoffs or powerful technologies without recursive self improvement; on the technical safety side, I didn’t understand that it’s hard to even build a paperclip maximizer, and a lot of our effort might go into figuring out how to do that.
  • Other people have terrible taste in economic policy, but I think that I was at the time overconfident in various libertarianish ideas that I’m now less enthusiastic about. Also, I no longer think it’s a slam dunk that society is better off from becoming wealthier, because of considerations related to the far future, animals, and whether more money makes us happier.
  • I think that immigration liberalization is more dangerous than I used to think, because rich societies seem to generate massive positive externalities for the rest of the world and it seems possible that a sudden influx of less educated people with (in my opinion) worse political opinions might be killing the goose that lays the golden eggs.
  • Re bike thieves: I think that even though utilitarianism is good and stuff, it’s extremely costly to have thievery be tolerated, because then you have to do all these negative-sum things like buying bike locks. Also it seems like we’re generally better off if people help with enforcement of laws.

In all of these cases, my arguments against others were much higher quality than my actual beliefs. Much more concerningly, I think I was much better at spotting the holes in other people’s arguments than spotting holes in my own.

There’s also a general factor here of me being overconfident in the details of ideas that had some ring of truth to them. Like, the importance of AGI safety seemed really obvious to me, and I think that my sense of obviousness has historically been pretty good at spotting arguments that later stand up to intense scrutiny. But I was massively overconfident in my particular story for how AGI would go down. I should have been more disjunctive: I should have said “It sure seems like something like this ought to happen, and it seems like step three could happen in any of these four possible ways, and I don’t know which of them will be true, and maybe it will actually be another one, but I feel pretty convinced that there’s some way it will happen”.

Here are some other ideas which I continue to endorse which had that ring of truth to them, but whose details I’ve been similarly overconfident about. (Some of these are pretty obscure.)

  • The simulation hypothesis
  • UDASSA
  • The malignancy of the universal prior
  • The mathematical universe hypothesis
  • Humans have weird complex biases related to categories like race and gender, and we should be careful about this in our thinking. (Nowadays this idea is super widespread and so it feels weird to put it in the same list as all these crazy other ideas. But when I first encountered it seriously in my first year of college, it felt like an interesting and new idea, in the same category as many of the cognitive biases I heard about on LessWrong.)

And here are ideas which had this ring of truth to them that I no longer endorse:

  • We should fill the universe with hedonium.
  • The future might be net negative, because humans so far have caused great suffering with their technological progress and there’s no reason to imagine that this will change. Futurists are biased against this argument because they personally don’t want to die and have a strong selfish desire for human civilization to persist.
  • Because of Landauer’s limit, civilizations have an incentive to aestivate. (This one is wrong because it involves a misunderstanding of thermodynamics.)

My bias towards thinking my own beliefs are more reasonable than they are would be disastrous if it prevented me from changing my mind in response to good new arguments. Luckily, I don’t think that I am particularly biased in that direction, for two reasons. Firstly, when I’m talking to someone who thinks I’m wrong, for whatever reason I usually take them pretty seriously and I have a small crisis of faith that prompts me to go off and reexamine my beliefs a bunch. Secondly, I think that most of the time that people present an argument which later changes my mind, my initial reaction is confusion rather than dismissiveness.

As an example of the first: Once upon a time I told someone I respected that they shouldn’t eat animal products, because of the vast suffering caused by animal farming. He looked over scornfully and told me that it was pretty rich for me to say that, given that I use Apple products—hadn’t I heard about the abusive Apple factory conditions and how they have nets to prevent people killing themselves by jumping off the tops of the factories? I felt terrified that I’d been committing some grave moral sin, and then went off to my room to research the topic for an hour or two. I eventually became convinced that the net effect of buying Apple products on human welfare is probably very slightly positive but small enough to not worry about, and also it didn’t seem to me that there’s a strong deontological argument against doing it.

(I went back and told the guy about the result of me looking into it. He said he didn’t feel interested in the topic anymore and didn’t want to talk about it. I said “wow, man, I feel pretty annoyed by that; you gave me a moral criticism and I took it real seriously; I think it’s bad form to not spend at least a couple minutes hearing about what I found.” Someone else who was in the room, who was very enthusiastic about social justice, came over and berated me for trying to violate someone else’s preferences about not talking about something. I learned something that day about how useful it is to take moral criticism seriously when it’s from people who don’t seem to be very directed by their morals.)

Other examples: When I first ran across charismatic people who were in favor of deontological values and social justicey beliefs, I took those ideas really seriously and mulled them over a lot. A few weeks ago, someone gave me some unexpectedly harsh criticism about my personal manner and several aspects of how I approach my work; I initially updated quite far in the direction of their criticism, only to update 70% of the way back towards my initial views after I spent ten more hours thinking and talking to people about it.

Examples of the second: When I met people whose view of AI safety didn’t match my own naive view, I felt confused and took them seriously (including when they were expressing a bunch of skepticism of MIRI). When my friend Howie told me he thought the criminal justice system was really racist, I was surprised and quickly updated my opinion to “I am confused about this”, rather than dismissing him.

I can’t think of cases where I initially thought an argument was really stupid but then it ended up convincing either me or a majority of people who I think of as my epistemic peers and superiors (eg people who I think have generally good judgement at EA orgs).

However, I can think of cases where I initially felt that an argument was dumb, but lots of my epistemic peers think it is at least sort of reasonable. I am concerned by this and I’m trying to combat it. For example, the following arguments are on my current list of things that I am worried I’m undervaluing because they initially seem implausible to me, and are on my to-do list to eventually look into more carefully: Drexler’s Comprehensive AI Systems, AI safety via ambitious value learning, and arguments that powerful AI won’t lead to a singleton.

Please let me know if you have examples along these lines where I seemed dumber than I’m presenting here.


Here’s another perspective on why my approach might be a problem. I think that people are often pretty bad at expressing why they believe things, and in particular they don’t usually say “I don’t know why I believe this, but I believe it anyway.” So if I dismiss arguments that suck, I might be dismissing useful knowledge that other people have gained through experience.

I think I’ve made mistakes along these lines in the past. For example, I used to have a much lower opinion of professionalism than I now do. And there are a couple of serious personal mistakes I’ve made where I looked around for the best arguments against doing something weird I wanted to do, and all of those arguments sucked, and then I decided to do the weird thing, and then it was a bad idea.

Katja Grace calls this mistake “breaking Chesterton’s fence in the presence of bull”.

This would suggest the heuristic “Take received wisdom on topics into account, even if you ask people where the received wisdom comes from and they tell you a source that seems extremely unreliable”.

I think this heuristic is alright but shouldn’t be an overriding consideration. The ideas that evolve through the experience of social groups are valuable because they’re somewhat selected for truth and importance. But the selection process for these ideas is extremely simple and dumb.

I’d expect that in most cases where something is bad, there is a legible argument for why we shouldn’t do it (where I’m including arguments from empirical evidence as legible arguments). I’d prefer to just learn all of the few things that society implicitly knows, rather than giving up every time it disagrees with me.

Maybe this is me being arrogant again, but I feel like the mistake I made with the bike-stealing thing wasn’t me refusing to bow to social authority, it was me not trying hard enough to think carefully about the economics of the situation. My inside view is that if I now try to think about economics, I don’t need to incorporate that much outside-view-style discounting of my own arguments.

I have the big advantage of being around people who are really good at articulating the actual reasons why things are bad. Possibly the number one strength of the rationalist community is creating and disseminating good explicit models of things that are widely implicitly understood (eg variants of Goodhart’s law, Moloch, Chesterton’s fence, the unilateralist’s curse, “toxoplasma of rage”). If I was in any other community, I’m worried that I’d make posts like the one about the bike, and no-one would be able to articulate why I was wrong in a way that was convincing. So I don’t necessarily endorse other people taking the strategy I take.

I am not aware of that many cases where I believed something really stupid because all the common arguments against it seemed really dumb to me. If I knew of more such cases, I’d be more worried.


Claire Zabel says, in response to all this:

I'd say you're too quick to buy a whole new story if it has the ring of truth, and too quick to ask others (and probably yourself) to either refute on the spot, or accept, a complex and important new story about something about the world, and leave too little room to say "this seems sketchy but I can't articulate how" or "I want to think about it for a while" or "I'd like to hear the critics' counterarguments" or "even though none of the above has yielded fruit, I'm still not confident about this thing"

This seems plausible. I spend a bunch of time trying to explain why I’m worried about AI risk to people who don’t know much about the topic. This requires covering quite a lot of ground; perhaps I should try harder to explicitly say “by the way, I know I’m telling you a lot of crazy stuff; you should take as long as it takes to evaluate all of this on your own; my goal here is just to explain what I believe; you should use me as a datapoint about one place that human beliefs sometimes go after thinking about the subject.”


I feel like my intuitive sense of whether someone else’s argument is roughly legit is pretty good, and I plan to continue feeling pretty confident when I intuitively feel like someone else is being dumb. But I am trying to not make the jump from “I think that this argument is roughly right” to “I think that all of the steps in this fleshed out version of that argument are roughly right”. Please let me know if you think I’m making that particular mistake.

20 comments

I already told Buck that I loved this post. For this curation notice, let me be specific about why.

  • Posts from people who think carefully and seriously about difficult questions writing about some of the big ways they changed their mind over time are rare and valuable (other examples: Holden, Eliezer, Kahneman).
  • OP is unusually transparent, in a way that leads me to feel I can actually update on the data rather than holding it in an internal sandbox. I feel it has not been as adversarially selected as most other writings by someone about themselves, making it extremely valuable data. (Where data is normally covered up, even small amounts of true data are often very surprising.)
  • I find the specific update quite useful, including all of the examples. It fits together with Eliezer's claim (at the end of section 5 here) that you can figure out which experts are right/wrong far more often than you can come up with the correct theory yourself.
(I went back and told the guy about the result of me looking into it. He said he didn’t feel interested in the topic anymore and didn’t want to talk about it. I said “wow, man, I feel pretty annoyed by that; you gave me a moral criticism and I took it real seriously; I think it’s bad form to not spend at least a couple minutes hearing about what I found.” Someone else who was in the room, who was very enthusiastic about social justice, came over and berated me for trying to violate someone else’s preferences about not talking about something. I learned something that day about how useful it is to take moral criticism seriously when it’s from people who don’t seem to be very directed by their morals.)

I've run into this phenomenon myself at times. Popehat calls it the "Doctrine of the Preferred First Speaker" (https://www.popehat.com/2013/12/21/ten-points-about-speech-ducks-and-flights-to-africa/):

The doctrine of the Preferred First Speaker holds that when Person A speaks, listeners B, C, and D should refrain from their full range of constitutionally protected expression to preserve the ability of Person A to speak without fear of non-governmental consequences that Person A doesn't like. The doctrine of the Preferred First Speaker applies different levels of scrutiny and judgment to the first person who speaks and the second person who reacts to them; it asks "why was it necessary for you to say that" or "what was your motive in saying that" or "did you consider how that would impact someone" to the second person and not the first. It's ultimately incoherent as a theory of freedom of expression.

This has been one of the most useful posts on LessWrong in recent years for me personally. I find myself often referring to it, and I think almost everyone underestimates the difficulty gap between critiquing others and proposing their own, correct, ideas.

In the last year or so, I've noticed people are getting tired of staring at their phones all the time or having many strong opinions prompted by the internet. Looks like the internet is settling into its niche alongside books and TV, which are nice but everyone knows they shouldn't be the center of your being.

I find many of the views you updated away from plausible and perhaps compelling, and I have found your writing on other topics compelling as well. Given this, I feel like I should update my confidence in my own beliefs. Based on the post, I find it hard to model where you currently stand on some of these issues. For example, you claim you no longer endorse the following:

The future might be net negative, because humans so far have caused great suffering with their technological progress and there’s no reason to imagine that this will change.

I certainly don't think it's obvious that average suffering will be higher in the future. But it also seems plausible to me that the future will be net negative. 'The trendline will continue' seems like a strong enough argument to find a net-negative future plausible. Elsewhere in the article you claim that humans' weak preferences will eventually end factory farming, and I agree with that. However, new forms of suffering may develop. One could imagine strong competitive pressures rewarding agents that 'negatively reinforce' agents they simulate. There are many other ways things can go wrong. So I am genuinely unsure what you mean when you say you don't endorse this claim anymore. Do you think it is implausible that the future is net negative? Or have you just substantially reduced the probability you assign to a net-negative future?

Relatedly, do you have any links on why you updated your opinion of professionalism? I should note I am not at all trying to nitpick this post; I am very interested in how my own views should update.


Regarding the title problem,

I have historically been too hasty to go from “other people seem very wrong on this topic” to “I am right on this topic”

I think it's helpful here to switch from binary wrong/right language to continuous language. We can talk of degrees of wrongness and rightness.

Consider people who are smarter than those they usually argue with, in the specific sense of "smarter" where we mean they produce more-correct, better-informed, or more-logical arguments and objections. These people probably have some (binarily) wrong ideas. The people they usually argue with, however, are likely to be (by degrees) wronger.

When the other people are wronger, the smart person is in fact righter. So I think, as far as you were thinking in terms of degrees of wrongness and rightness, it would be perfectly fair for you to have had the sense you did. It wouldn't have been a hasty generalization. And if you stopped to consider whether there might exist views that are even righter still, you'd probably conclude there are.

I'm confused about what point you're making with the bike thief example. I'm reading through that post and its comments to see if I can understand your post better with that as background context, but you might want to clarify that part of the post (with a reader who doesn't have that context in mind).

I think the techniques in this post may be helpful to avoid the kind of overconfidence you describe here and be more disjunctive in one's thinking.

Here are some other ideas which I continue to endorse which had that ring of truth to them, but whose details I’ve been similarly overconfident about.

I'm curious what details you were overconfident about, in case I can use the same updates that you made.

I'm confused about what point you're making with the bike thief example. I'm reading through that post and its comments to see if I can understand your post better with that as background context, but you might want to clarify that part of the post (with a reader who doesn't have that context in mind).

Can you clarify what is unclear about it?

One way I’ve experienced this was playing poker in a casino. I was pretty decent and could see a lot of the mistakes other people were making. I thought that meant I should just be able to come in and win their money, but I was sorely mistaken. It was only after a full week of training and practice that I was able to get to that level. (A week sounds like not a lot of time, but it took me a few years of playing on and off to get to that level and that realization.)

A relevant book recommendation: The Enigma of Reason argues that thinking of high-level human reasoning as a tool for attacking other people's beliefs and defending our own (regardless of their actual veracity) helps explain a lot of weird asymmetries in cognitive biases we're susceptible to, including this one.

[mod hat] I suspect some people may be confused about where we draw some lines re: "what goes on the frontpage", and thought I'd outline why we thought this post made sense for frontpage.

The post may be confusing because it references a bunch of political positions that are controversial. We sometimes don't promote things to frontpage that we expect to invite a bunch of controversy that sheds more heat than light.

But this post otherwise does a pretty good job of sticking to "explain rather than persuade", using political examples to illustrate broader points that seem quite important. Even when it comes to the core point of "what are the best ways to think", it does a good job of explaining rather than persuading, as well as leaving signposts for what sorts of people/situations might benefit from which sorts of approaches.

See also https://www.lesswrong.com/posts/qNZM3EGoE5ZeMdCRt/reversed-stupidity-is-not-intelligence. I applaud (and aspire to) your ability to distinguish levels of plausibility without falling into false dichotomies.

We should fill the universe with hedonium.

I'll admit that the "pink goo" scenario (which I tend to call orgasmium, but will now use hedonium as the name of the substance) is one which I find most likely to be my supervillain almost-correct belief (that thing which, if I were powerful and not epistemically vigilant, I might undertake, causing great harm).

This is awesome.

Reminds me of Ben Kuhn's recent question on the EA Forum – Has your EA worldview changed over time?

Thanks heaps for the post man, I really enjoyed it! While I was reading it felt like you were taking a bunch of half-baked vague ideas out of my own head, cleaning them up, and giving some much clearer more-developed versions of those ideas back to me :)

One good idea to take out of this is that other people's ability to articulate their reasons for their belief can be weak—weak enough that it can distract from the strength of evidence for the actual belief. (More people can catch a ball than explain why it follows the arc that it does).

This is an insight I’d had in a more abstract way before. As Zvi puts it in Zeroing Out:

It is much, much easier to pick out a way in which a system is sub-optimal, than it is to implement or run that system at anything like its current level of optimization.

A corollary of this:

It is much, much easier to find a bias in a complex model, than it is to build an equally good model.

Nonetheless, I didn’t truly grok it until reading this post. 

Sometimes I notice that a system is biased or flawed, and I think in part due to reading this post I am not likely to then think “And now I know better”, but in fact I will realize that my thinking on this topic has only just begun. Once I discover I am not able to rely on others to do my thinking on a subject, I realize I will have to do it myself, from scratch. This is a large and difficult undertaking. Far harder than noticing the initial mistake in others.

Another way of saying it is just that “Just because I know one answer is wrong, does not mean I know which answer is right.” Yet another would be “Reversed stupidity is not intelligence”. I guess we all have to learn the basics. Still, this post helped me learn this lesson quite a bit.

Question: how is the criminal justice system not racist? As I understand it, there are no laws banning slavery in prison, so there were financial incentives to move enslaved people to prison, and there are financial incentives today to put people in jail, which results in the perpetuation of racial biases in law enforcement.

I doubt anyone will think the basic argument of the post is very surprising. Nevertheless, it is not something that lends itself to staying in one's mind. The other side is dumb, so "I am permitted" to continue to believe my side. A post with a very memorable title and compelling stories is a good way to fight that effect.

A question: are you, or were you, in Australia? I've never heard anyone being proud of “made in Australia” stuff, so I don't understand this example, unless of course you are speaking of people living in Australia and thus proud of their country.
