You are unlikely to see me posting here again, after today. There is a saying here that politics is the mind-killer. My heretical realization lately is that philosophy, as generally practiced, can also be mind-killing.
As many of you know, I am, or rather was, running a twice-monthly Rationality: From AI to Zombies reading group. One of the things I wanted to include in each reading group post was a collection of contrasting views. To research such views I found myself listening during my commute to talks given by other thinkers in the field, e.g. Nick Bostrom, Anders Sandberg, and Ray Kurzweil, and by people I feel are doing "ideologically aligned" work, like Aubrey de Grey, Christine Peterson, and Robert Freitas. Some of these were talks I had seen before, or views I had generally been exposed to in the past. But looking through the lens of learning and applying rationality, I came to a surprising (to me) conclusion: it was the philosophical thinkers who demonstrated the largest and most costly mistakes. On the other hand, de Grey and others who are primarily working on the scientific and/or engineering challenges of singularity and transhumanist technologies were far less likely to make epistemic mistakes of significant consequence.
Philosophy as the anti-science...
What sort of mistakes? Most often, reasoning by analogy. To cite a specific example, one of the core underlying assumptions of the singularity interpretation of super-intelligence is that just as a chimpanzee would be unable to predict what a human intelligence would do or how we would make decisions (aside: how would we know? Were any chimps consulted?), we would be equally inept in the face of a super-intelligence. This argument is, however, nonsense. The human capacity for abstract reasoning over mathematical models is in principle a fully general intelligent behaviour, as the scientific revolution has shown: there is no aspect of the natural world which has remained beyond the reach of human understanding, once a sufficient amount of evidence is available. The wave-particle duality of quantum physics, or the 11-dimensional space of string theory, may defy human intuition, i.e. our built-in intelligence. But we have proven ourselves perfectly capable of understanding the logical implications of models which employ them. We may not be able to build intuition for how a super-intelligence thinks. Maybe; that's not proven either. But even if that is so, we will still be able to reason about its intelligent behaviour in advance, just as string theorists are able to reason about 11-dimensional space-time without using their evolutionarily derived intuitions at all.
This post is not about the singularity or the nature of super-intelligence; that was merely my choice of an illustrative example of a category of mistakes that are too often made by those with a philosophical background rather than one in the empirical sciences: reasoning by analogy instead of building and analyzing predictive models. The fundamental mistake is that an analogy is not in itself a sufficient explanation of a natural phenomenon, because it says nothing about the context sensitivity or insensitivity of the original example, or about the conditions under which it may or may not hold true in a different situation.
A successful physicist or biologist or computer engineer would have approached the problem differently. A core part of being successful in these fields is knowing when you have insufficient information to draw conclusions. If you don't know what you don't know, then you can't know when you might be wrong. To be an effective rationalist, the important first question is often not "what is the calculated probability of that outcome?" but rather "what is the uncertainty in my calculated probability of that outcome?" If the uncertainty is too high, then the data supports no conclusions. And the way you reduce that uncertainty is to build models of the domain in question and test them empirically.
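To make the point concrete, here is a minimal sketch (my own illustration, assuming Python with scipy available; the observation counts and the uniform Beta(1, 1) prior are invented for the example, not anything from this post) of how the same point estimate can carry wildly different amounts of uncertainty depending on how much evidence stands behind it:

```python
# Minimal sketch: how wide is the uncertainty around a "calculated probability"?
# Illustration only: the observation counts and the Beta(1, 1) prior are
# assumptions chosen for the example.
from scipy import stats

def credible_interval(successes, trials, mass=0.95):
    """Posterior mean and 95% interval for an event probability, uniform Beta(1, 1) prior."""
    posterior = stats.beta(1 + successes, 1 + (trials - successes))
    lo, hi = posterior.interval(mass)
    return posterior.mean(), (lo, hi)

# Same observed frequency (20%), very different amounts of evidence.
for successes, trials in [(1, 5), (20, 100), (2000, 10000)]:
    mean, (lo, hi) = credible_interval(successes, trials)
    print(f"{successes}/{trials}: point estimate {mean:.2f}, 95% interval ({lo:.2f}, {hi:.2f})")
```

With five observations the interval is so wide that almost nothing follows from the point estimate; with ten thousand it pins the value down. The estimate alone tells you neither.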
The lens that sees its own flaws...
Coming back to LessWrong and the sequences: in the preface to Rationality: From AI to Zombies, Eliezer Yudkowsky says his biggest regret is that he did not make the material in the sequences more practical. The problem is in fact deeper than that. The art of rationality is the art of truth seeking, and empiricism is part and parcel of truth seeking. Lip service is paid to empiricism throughout, but in the "applied" sequences relating to quantum physics and artificial intelligence it appears to be forgotten. Instead we get definitive conclusions drawn from thought experiments alone. It is perhaps not surprising that these are the sequences which seem most controversial.
I have long been concerned that those sequences in particular promote some ungrounded conclusions. I had thought that, while annoying, this was perhaps a one-off, fixable mistake. Recently I have come to realize that the underlying cause runs much deeper: what the sequences teach is a flawed form of truth-seeking (thought experiments favored over real-world experiments), which inevitably results in errors, and the errors I take issue with in the sequences are merely examples of this phenomenon.
And these errors have consequences. Every single day, 100,000 people die of preventable causes, and every day we continue to risk extinction of the human race at unacceptably high odds. There is work that could be done now to alleviate both of these issues. But within the LessWrong community there is actually outright hostility to work that has a reasonable chance of alleviating suffering (e.g. artificial general intelligence applied to molecular manufacturing and life-science research) due to concerns arrived at by flawed reasoning.
I now regard the sequences as a memetic hazard, one which may at the end of the day be doing more harm than good. One should work to develop one's own rationality, but I now fear that the approach taken by the LessWrong community, as a continuation of the sequences, may produce the opposite result. The anti-humanitarian behaviors I observe in this community are not the result of its initial conditions but of the process itself.
What next?
How do we fix this? I don't know. On a personal level, I am no longer sure engagement with such a community is a net benefit. I expect this to be my last post to LessWrong. It may happen that I check back in from time to time, but for the most part I intend to try not to. I wish you all the best.
A note about effective altruism…
One shining light of goodness in this community is the focus on effective altruism: doing the most good for the most people, as measured by some objective means. This is a noble goal, and the correct goal for a rationalist who wants to contribute to charity. Unfortunately it too has been poisoned by incorrect modes of thought.
Existential risk reduction, the argument goes, trumps all other forms of charitable work, because reducing the chance of extinction by even a small amount has far more expected utility than accomplishing all other charitable works combined. The problem lies in estimating the likelihood of extinction and in selecting the actions that would reduce existential risk. There is so much uncertainty regarding what we know, and so much uncertainty regarding what we don't know, that it is impossible to determine with any accuracy the expected risk of, say, unfriendly artificial intelligence creating perpetual suboptimal outcomes, or what effect charitable work in the area (e.g. MIRI) has in reducing that risk, if any.
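To illustrate how much the conclusion depends on guessed inputs, here is a toy calculation (my own sketch; every number in it is a placeholder, not anyone's actual estimate) showing the expected value of a hypothetical x-risk intervention swinging across many orders of magnitude as the guesses vary:

```python
# Toy calculation with made-up numbers: the expected value of an x-risk
# intervention is a product of quantities we can only guess at, so plausible
# inputs spread the answer across roughly ten orders of magnitude.
import itertools

FUTURE_LIVES = 1e16  # assumed number of future lives at stake (pure illustration)

p_extinction = [1e-4, 1e-2, 0.5]      # guessed probability of extinction this century
risk_reduction = [1e-8, 1e-5, 1e-2]   # guessed fraction of that risk the intervention removes

for p, r in itertools.product(p_extinction, risk_reduction):
    expected_lives = FUTURE_LIVES * p * r
    print(f"P(extinction)={p:g}, reduction={r:g} -> expected lives saved ~ {expected_lives:.1e}")
```

When the output moves by factors of billions depending on which guess you feed in, the "expected utility" figure is telling you about your assumptions, not about the world.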
This is best explored by an example of existential risk done right. Asteroid and cometary impacts are perhaps the category of external (not human-caused) existential risk which we know the most about, and have done the most to mitigate. When it was recognized that impactors were a risk to be taken seriously, we identified what we did not know about the phenomenon: What are the orbits and masses of Earth-crossing asteroids? We built telescopes to find out. What is the material composition of these objects? We built space probes and collected meteorite samples to find out. How damaging would an impact be for various material properties, speeds, and incidence angles? We built high-speed projectile test ranges to find out. What could be done to change the course of an asteroid found to be on a collision course? We have executed at least one impact probe and monitored the effect it had on the comet's orbit, and we have on the drawing board probes that would use gravitational mechanisms to move their target. In short, we identified what it is that we don't know and sought to resolve those uncertainties.
How, then, might one approach an existential risk like unfriendly artificial intelligence? By identifying what it is we don't know about the phenomenon, and seeking to experimentally resolve that uncertainty. What relevant facts do we not know about (unfriendly) artificial intelligence? Well, much of our uncertainty about the actions of an unfriendly AI could be resolved if we knew more about how such agents construct their thought models, and, relatedly, what languages are used to construct their goal systems. We could also stand to benefit from more practical information (experimental data) about in what ways AI boxing works and in what ways it does not, and how much that depends on the structure of the AI itself. Thankfully there is an institution doing that kind of work: the Future of Life Institute (not MIRI).
Where should I send my charitable donations?
Aubrey de Grey's SENS Research Foundation.
100% of my charitable donations are going to SENS. Why they do not get more play in the effective altruism community is beyond me.
If you feel you want to spread your money around, here are some non-profits which I have vetted for doing reliable, evidence-based work on singularity technologies and existential risk:
- Robert Freitas and Ralph Merkle's Institute for Molecular Manufacturing does research on molecular nanotechnology. They are the only group working on the long-term Drexlerian vision of molecular machines, and they publish their research online.
- Future of Life Institute is the only existential-risk AI organization which is actually doing meaningful evidence-based research into artificial intelligence.
- B612 Foundation is a non-profit seeking to launch a spacecraft with the capability to detect, to the extent possible, ALL Earth-crossing asteroids.
I wish I could recommend a skepticism, empiricism, and rationality promoting institute. Unfortunately I am not aware of an organization which does not suffer from the flaws I identified above.
Addendum regarding unfinished business
I will no longer be running the Rationality: From AI to Zombies reading group, as I am no longer able or willing, in good conscience, to host it or to participate in this site, even from my typically contrarian point of view. Nevertheless, I am enough of a libertarian that I feel it is not my role to put up roadblocks for others who wish to delve into the material as it is presented. So if someone wants to take over the role of organizing these reading groups, I would be happy to hand over the reins to that person. If you think that person should be you, please leave a reply in another thread, not here.
EDIT: Obviously I'll stick around long enough to answer questions below :)
Thanks for sharing your contrarian views, both with this post and with your previous posts. Part of me is disappointed that you didn't write more... it feels like you have several posts' worth of objections to Less Wrong here, and at times you are just vaguely gesturing towards a larger body of objections you have to some popular LW position. I wouldn't mind seeing those objections fleshed out into long, well-researched posts. Of course you aren't obliged to put in the time & effort to write more posts, but it might be worth your time to fix specific flaws you see in the LW community, given that it consists of many smart people interested in maximizing their positive impact on the far future.
I'll preface this by stating some points of general agreement:
I haven't bothered to read the quantum physics sequence (I figure if I want to take the time to learn that topic, I'll learn from someone who researches it full-time).
I'm annoyed by the fact that the sequences in practice seem to constitute a relatively static document that doesn't get updated in response to critiques people have written up. I think it's worth reading them with a grain of salt for that reason. (I'm also annoyed by the fact that they are extremely wordy and mostly without citation. Given the choice of getting LWers to read either the sequences or Thinking, Fast and Slow, I would prefer they read the latter; it's a fantastic book, thoroughly backed up by citations. No intellectually serious person should go without reading it IMO, and it's definitely a better return on time. Caveat: I personally haven't read the sequences through and through, although I've read lots of individual posts, some of which were quite insightful. Also, there is surprisingly little overlap between the two works, and it's likely worthwhile to read both.)
And here are some points of disagreement :P
You talk about how Less Wrong encourages the mistake of reasoning by analogy. I searched for "site:lesswrong.com reasoning by analogy" on Google and came up with these 4 posts: 1, 2, 3, 4. Posts 1, 2, and 4 argue against reasoning by analogy, while post 3 claims the situation is a bit more nuanced. In this comment here, I argue that reasoning by analogy is a bit like taking the outside view: analogous phenomena can be considered part of the same (weak) reference class. So...
Insofar as there is an explicit "LW consensus" about whether reasoning by analogy is a good idea, it seems like you've diagnosed it incorrectly (although maybe there are implicit cultural norms that go against professed best practices).
It seems useful to know the answer to questions like "how valuable are analogies", and the discussions I linked to above seem like discussions that might help you answer that question. These discussions are on LW.
Finally, it seems you've been unable to escape a certain amount of reasoning by analogy in your post. You state that experimental investigation of asteroid impacts was useful, so by analogy, experimental investigation of AI risks should be useful.
The steelman of this argument would be something like "experimentally, we find that investigators who take experimental approaches tend to do better than those who take theoretical approaches". But first, this isn't obviously true... mathematicians, for instance, have found theoretical approaches to be more powerful. (I'd guess that the developer of Bitcoin took a theoretical rather than an empirical approach to creating a secure cryptocurrency.) And second, I'd say that even this argument is analogy-like in its structure, since the reference class of "people investigating things" seems sufficiently weak to start pushing into analogy territory. See my above point about how reasoning by analogy at its best is reasoning from a weak reference class. (Do people think this is worth a toplevel post?)
This brings me to what I think is my most fundamental point of disagreement with you. Viewed from a distance, your argument goes something like "Philosophy is a waste of time! Resolve your disagreements experimentally! There's no need for all this theorizing!" And my rejoinder would be: resolving disagreements experimentally is great... when it's possible. We'd love to run a randomized controlled trial of whether universes with a Machine Intelligence Research Institute are more likely to have a positive singularity, but unfortunately we don't currently know how to do that.
There are a few issues with putting too much emphasis on experimentation over theory. The first is that you may be tempted to prefer experimentation even for problems that theory is better suited to (e.g. empirically testing prime number conjectures). The second is that you may fall prey to the streetlight effect and prioritize areas of investigation that look tractable from an experimental point of view, ignoring questions that are both very important and not very tractable experimentally.
You write:
This would seem to depend on the specifics of the agent in question. This seems like a potentially interesting line of inquiry. My impression is that MIRI thinks most possible AGI architectures wouldn't meet its standards for safety, so given that their ideal architecture is so safety-constrained, they're focused on developing the safety stuff first before working on constructing thought models etc. This seems like a pretty reasonable approach for an organization with limited resources, if it is in fact MIRI's approach. But I could believe that value could be added by looking at lots of budding AGI architectures and trying to figure out how one might make them safer on the margin.
Sure... but note that Eliezer Yudkowsky from MIRI was the one who invented the AI box experiment and ran the first few experiments, and FHI wrote this paper collecting a bunch of ideas for what AI boxes could consist of. (The other thing I didn't mention as a weakness of empiricism is that empiricism doesn't tell you which hypotheses might be useful to test. Knowing what hypotheses to test is especially valuable when testing hypotheses is expensive.)
I could believe that there are fruitful lines of experimental inquiry that are neglected in the AI safety space. Overall it looks kinda like crypto to me in the sense that theoretical investigation seems more likely to pan out. But I'm supportive of people thinking hard about specific useful experiments that someone could run. (You could survey all the claims in Bostrom's Superintelligence and try to estimate what fraction could be cheaply tested experimentally. Remember that just because a claim can't be tested experimentally doesn't mean it's not an important claim worth thinking about...)
I think the definition of 'experiment' gets tricky and confusing when you are talking about math specifically. When you talk about finding the distribution of prime numbers and using that to arrive at a more accurate model for your prior probability of 3339799 being prime, that is an experiment.
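To make that concrete, here is a minimal sketch (my own illustration, assuming Python; not anything the commenter wrote): the prime number theorem suggests a prior of roughly 1/ln(n) for a number near n being prime, and a cheap trial-division check then settles the question outright.

```python
# Concrete version of the prime example: a density-of-primes prior, then a
# deterministic check. The specific workflow is my own sketch.
import math

def is_prime(n):
    """Trial division; perfectly adequate for numbers this small."""
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False
    return True

n = 3339799
prior = 1 / math.log(n)  # prime number theorem heuristic, roughly 0.067 here
print(f"Prior from prime density: {prior:.3f}")
print(f"Trial division verdict: {is_prime(n)}")
```

Both steps feel "experimental" in the sense of gathering evidence to update a belief, even though everything happens inside mathematics.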
Math is unique in that regard though. For questions about the real world we must seek evidence that is outside of our heads.