Systemic risk: a moral tale of ten insurance companies
Once upon a time...
Imagine there were ten insurance sectors, each covering a different large risk (or possibly the same risks, in different geographical areas). All of these risks are taken to be independent.
To simplify, we assume that all the risks follow the same yearly payout distribution. The details of the distribution don't matter much for the argument, but in this toy model the payouts follow the discrete binomial distribution with n=10 and p=0.5, with millions of pounds as the unit:
[Figure: distribution of yearly payouts for a single sector, Binomial(n=10, p=0.5); the red dashed line marks the £9 million capital requirement.]
This means that the probability that a sector pays out £n million in a given year is (0.5)^10 × 10!/(n!(10-n)!). For example, the most likely payout, £5 million, has probability 252/1024 ≈ 24.6%.
All these companies are bound by Solvency II-like requirements, which mandate that they be 99.5% sure of paying out on all their policies in a given year - or, put another way, that they fail to pay out only once in every 200 years on average. To meet this requirement, the insurance companies in each sector have to have capital totalling £9 million available every year (the red dashed line).
Assume that each sector aims for £1 million in total yearly expected profit. Since the expected payout is £5 million, each sector will charge £6 million a year in premiums. To reach the required £9 million on top of the £6 million collected in premiums, they must maintain a capital reserve of £3 million. They thus invest £3 million for an expected profit of £1 million - a tidy return!
Every two hundred years, one of the insurance sectors goes bust and has to be bailed out somehow; and since the sectors are independent, all ten go bust in the same year only once every 200^10 ≈ 10^23 years - a hundred billion trillion years. We assume the latter event is too big to be bailed out, and there's a grand collapse of the whole insurance industry with knock-on effects throughout the economy.
But now assume that insurance companies are allowed to invest in each other's sectors. The most efficient way of doing so is to invest equally in each of the ten sectors. The payouts across the market as a whole are then described by the discrete binomial distribution with n=100 and p=0.5:
[Figure: distribution of total yearly payouts across all ten sectors, Binomial(n=100, p=0.5); the red dashed line marks £63 million, the pink dashed line £90 million.]
This is a much narrower distribution (relative to its mean). In order to have enough capital to pay out 99.5% of the time, the whole industry need only keep £63 million in capital (the red dashed line). Note that this is far less than the combined capital for the sectors when they were separate, which would be ten times £9 million, or £90 million (the pink dashed line). There is thus a profit opportunity here; it comes from the fact that the standard deviation of X+Y is less than the standard deviation of X plus the standard deviation of Y.
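The two capital requirements are just the 99.5th percentiles of the two binomial distributions, and they can be checked directly. Here is a minimal sketch using scipy (my own illustration; the post itself contains no code):

```python
from scipy.stats import binom

# Solvency II-style requirement: hold enough capital to cover the yearly
# payout with probability 0.995, i.e. the 99.5th percentile of the payout
# distribution (in £ millions).
single_sector = binom.ppf(0.995, n=10, p=0.5)    # -> 9.0
whole_industry = binom.ppf(0.995, n=100, p=0.5)  # -> 63.0

print(f"capital per separate sector: £{single_sector:.0f}m (x10 = £{10 * single_sector:.0f}m)")
print(f"capital for pooled industry: £{whole_industry:.0f}m")
```

The pooling saves £27 million of capital precisely because the pooled distribution is narrower relative to its mean: combining ten independent risks multiplies the standard deviation by √10, not by 10.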
If the industry still expects to make £1 million in profit per sector, this comes to £10 million in total. The expected payout is £50 million, so they will charge £60 million in premiums. To meet their Solvency II obligations, they need to hold only an extra £3 million in capital (£63 million - £60 million = £3 million). However, this is now across the whole insurance industry, not per sector.
Thus they expect profits of £10 million while holding capital of £3 million - astronomical returns! Of course, that assumes that the insurance companies capture all the surplus from cross-investing; in reality there would be competition, and a buyer surplus as well. But the general point is that there is a vast profit opportunity available from cross-investing, and thus if these investments are possible, they will be made. This conclusion does not depend on the specific assumptions of the model; it captures the general result that insuring independent risks reduces total risk.
But note what has happened: once every 200 years, an insurance company that has spread its investments across the ten sectors will be unable to pay out what it owes. However, every company will be following this strategy! So when one goes bust, they all go bust. Thus the complete collapse of the insurance industry is no longer a once-in-a-hundred-billion-trillion-years event, but a once-every-two-hundred-years event. The risk for each company has stayed the same (and their profits have gone up), but the systemic risk across the whole insurance industry has gone up tremendously.
...and they failed to live happily ever after for very much longer.
The Goal of the Bayesian Conspiracy
Suppose that there were to exist such an entity as the Bayesian Conspiracy.
I speak not of the social group of that name, the banner under which rationalists meet at various conventions – though I do not intend to disparage that group! Indeed, it is my fervent hope that they may in due time grow into the entity which I am setting out to describe. No, I speak of something more like the “shadowy group of scientists” which Yudkowsky describes, tongue (one might assume) firmly in cheek. I speak of such an organization which has been described in Yudkowsky's various fictional works, the secret and sacred cabal of mathematicians and empiricists who seek unwaveringly for truth... but set in the modern-day world, perhaps merely the seed of such a school, an organization which can survive and thrive in the midst of, yet isolated from, our worldwide sociopolitical mess. I ask you, if such an organization existed, right now, what would – indeed, what should – be its primary mid-term (say, 50-100 yrs.) goal?
I submit that the primary mid-term goal of the Bayesian Conspiracy, at this stage of its existence, is and/or ought to be nothing less than world domination.
Before the rotten fruit begins to fly, let me make a brief clarification.
The term “world domination” is, unfortunately, rather socially charged, bringing to mind an image of the archetypal mad scientist with marching robot armies. That's not what I'm talking about. My usage of the phrase is intended to evoke something slightly less dramatic, and far less sinister. “World domination”, to me, describes a rather loosely packed set of possible world-states. One example would be the one I term “One World Government”, wherein the Conspiracy (either openly or in secret) is in charge of all nations via an explicit central meta-government. Another would be a simple infiltration of the world's extant political systems, followed by policy-making and cooperation which would ensure the general welfare of the world's entire population – control de facto, but without changing too much outwardly. The common thread is simply that the Conspiracy becomes the only major influence in world politics.
(Forgive my less-than-rigorous definition, but a thorough examination of the exact definition of the word “influence” is far, far outside the scope of this article.)
So there is my claim. Let me tell you why I believe this is the morally correct course of action.
Let us examine, for a moment, the numerous major good works currently being done openly by rationalists, or by those who may not self-identify as rationalists but whose dogmas and goals accord with ours. We have the Singularity Institute, which is concerned with ensuring that our technological, transhumanist advent happens smoothly and with a minimum of carnage. We have various institutions worldwide advocating and practicing cryonics, which offers a non-zero probability of recovery from death. We have others working on life-extension technologies and procedures, which promise one day to remove the threat of death from our world entirely.
All good things, I say. I also say: too slow!
Imagine what more could be accomplished if the United States, for example, granted to the Life Extension Foundation or to Alcor the money and social prominence currently reserved for military purposes. Imagine what would happen if every scientist around the world were able to contribute under a unified institution, working on this vitally important problem of overcoming death, with all the money and time the world's governments could offer at their disposal.
Imagine, also, how many lives are lost every day due to governmental negligence, and war, and poverty, and hunger. What does it profit the world, if we offer to freeze the heads of those who can afford it, while all around us there are people who can't even afford their bread and water?
I have what is, perhaps, to some who are particularly invested, an appalling and frightening proposition: for the moment, we should devote fewer of our resources to cryonics and life extension, and focus on saving the lives of those to whom these technologies are currently beyond even a fevered dream. This means holding the reins of the world, that we might fix the problems inherent in our society. Only when significant steps have been taken in the direction of saving life can we turn our focus toward extending life.
What should the Bayesian Conspiracy do, once it comes to power? It should stop war. It should depose murderous despots, and feed the hungry and wretched who suffered under them. Again: before we work on extending the lives of the healthy and affluent beyond what we've so far achieved, we should, for example, bring the average life expectancy in Africa above the 50-year mark, where it currently sits (according to a 2006 study in the BMJ). This is what will bring about the maximum level of happiness in the world; not cryonics for those who can afford it.
Does this mean that we should stop researching these anti-death technologies? No! Of course not! Consider: even if cryonics drops to, say, priority 3 or 4 under this system, once the Conspiracy comes to power, that will still be far more support than it's currently receiving from world governments. The work will end up progressing at a far faster rate than it currently does.
Some of you may have qualms about this plan of action. You may ask, what about individual choice? What about the peoples' right to choose who leads them? Well, for those of us who live in the United States, at least, this is already a bit of a naïve question: due to color politics, you already do not have much of a choice in who leads you. But that's a matter for another time. Even if you think that dictatorship – even benevolent, rationalist dictatorship – would be inherently morally worse than even the flawed democratic system we enjoy here – a notion that may not even necessarily be the case! – do not worry: there's no reason why world domination need entail dictatorships. In countries where there are democratic systems in place, we will work within the system, placing Conspirators into positions where they can convince the people, via legitimate means, to give them public office. Once we have attained a sufficient level of power over this democratic system, we will effect change, and thence the work will go forth until this victory of rationalist dogma covers all the earth. When there are dictators, they will be removed and replaced with democratic systems... under the initial control of Conspirators, of course, and ideally under their continued control as time passes – but legitimately obtained control.
It is demonstrable that one's strength as a rationalist correlates directly with the probability that one will make correct decisions. Therefore, the people who make decisions affecting large numbers of people ought to be those with the highest level of rationality. In this way we can seek to avoid the many, many, many pitfalls of politics, including the inefficiency which Yudkowsky has again and again railed against. If all the politicians are on the same side, who's to argue?
In fact, even if two rationalists disagree on a particular point (which they shouldn't, but hey, even the best rationalists aren't perfect yet), they'll be able to operate more efficiently than two non-rationalists in the same position. Can the disagreement be settled by experiment? If it's important, throw funds at a lab to conduct such an experiment! After all, we're in charge of the money and the scientists. If it can't, find a compromise that has the maximum expected utility for the constituents. We can do that with a high degree of accuracy; we have access to the pollsters and sociologists, and know about reliable versus unreliable polling methods!
What about non-rationalist aspiring politicians? Well, under an ideal Conspiracy takeover, there would be no such thing. Lessons on politics would include rationality as a basis; graduation from law school would entail induction into the Conspiracy, and access to the truths had therein.
I suppose the biggest question is, is all this realistic? Or is just an idealist's dream? Well, there's a non-zero probability that the Conspiracy already exists, in which case, I hope that they will consider my proposal... or, even better, I hope that I've correctly deduced and adequately explained the master plan. If the Conspiracy does not currently exist, then if my position is correct, we have a moral obligation to work our hardest on this project.
“But I don't want to be a politician,” you exclaim! “I have no skill with people, and I'd much rather tinker with the Collatz Conjecture at my desk for a few years!” I'm inclined to say that that's just too bad; sacrifices must be made for the common good, and after all, it's often said that anyone who actually wants a political office is by that very fact unfit for the position. But in all realism, I'm quite sure that there will be enough room in the Conspiracy for non-politicians. We're all scientists and mathematicians at heart, anyway.
So! Here is our order of business. We must draw up a charter for the Bayesian Conspiracy. We must invent a testing system able to distinguish those who are ready for the Truths the Conspiracy will hold from those who are not. We must find our strongest Rationalists – via a testing procedure we have not yet devised – and put them in charge, and subordinate ourselves to them (not blindly, of course! The strength of a community, even a rationalist community, is in debate!). We must establish schools and structured lesson plans for the purpose of training fresh students; we must also take advantage of those systems which are already in place, and utilize them for (or turn them to) our purposes. I expect to have the infrastructure set up in no more than five years.
At that point, our real work will begin.
Reductionism
Followup to: How An Algorithm Feels From Inside, Mind Projection Fallacy
Almost one year ago, in April 2007, Matthew C submitted the following suggestion for an Overcoming Bias topic:
"How and why the current reigning philosophical hegemon (reductionistic materialism) is obviously correct [...], while the reigning philosophical viewpoints of all past societies and civilizations are obviously suspect—"
I remember this, because I looked at the request and deemed it legitimate, but I knew I couldn't do that topic until I'd started on the Mind Projection Fallacy sequence, which wouldn't be for a while...
But now it's time to begin addressing this question. And while I haven't yet come to the "materialism" issue, we can now start on "reductionism".
First, let it be said that I do indeed hold that "reductionism", according to the meaning I will give for that word, is obviously correct; and to perdition with any past civilizations that disagreed.
This seems like a strong statement, at least the first part of it. General Relativity seems well-supported, yet who knows but that some future physicist may overturn it?
On the other hand, we are never going back to Newtonian mechanics. The ratchet of science turns, but it does not turn in reverse. There are cases in scientific history where a theory suffered a wound or two, and then bounced back; but when a theory takes as many arrows through the chest as Newtonian mechanics, it stays dead.
"To hell with what past civilizations thought" seems safe enough, when past civilizations believed in something that has been falsified to the trash heap of history.
And reductionism is not so much a positive hypothesis, as the absence of belief—in particular, disbelief in a form of the Mind Projection Fallacy.
On Terminal Goals and Virtue Ethics
Introduction
A few months ago, my friend said the following thing to me: “After seeing Divergent, I finally understand virtue ethics. The main character is a cross between Aristotle and you.”
That was an impossible-to-resist pitch, and I saw the movie. The thing that resonated most with me–also the thing that my friend thought I had in common with the main character–was the idea that you could make a particular decision, and set yourself down a particular course of action, in order to make yourself become a particular kind of person. Tris didn’t join the Dauntless faction because she thought they were doing the most good in society, or because she thought her comparative advantage to do good lay there–she chose it because they were brave, and she wasn’t, yet, and she wanted to be. Bravery was a virtue that she thought she ought to have. If the graph of her motivations went any deeper, the only node beyond ‘become brave’ was ‘become good.’
(Tris did have a concept of some future world-outcomes being better than others, and wanting to have an effect on the world. But that wasn't the causal reason why she chose Dauntless; as far as I can tell, it was unrelated.)
My twelve-year-old self had a similar attitude. I read a lot of fiction, and stories had heroes, and I wanted to be like them–and that meant acquiring the right skills and the right traits. I knew I was terrible at reacting under pressure–that in the case of an earthquake or other natural disaster, I would freeze up and not be useful at all. Being good at reacting under pressure was an important trait for a hero to have. I could be sad that I didn’t have it, or I could decide to acquire it by doing the things that scared me over and over and over again. So that someday, when the world tried to throw bad things at my friends and family, I’d be ready.
You could call that an awfully passive way to look at things. It reveals a deep-seated belief that I’m not in control, that the world is big and complicated and beyond my ability to understand and predict, much less steer–that I am not the locus of control. But this way of thinking is an algorithm. It will almost always spit out an answer, when otherwise I might get stuck in the complexity and unpredictability of trying to make a particular outcome happen.
Virtue Ethics
I find the different houses of the HPMOR universe to be a very compelling metaphor. It’s not because they suggest actions to take; instead, they suggest virtues to focus on, so that when a particular situation comes up, you can act ‘in character.’ Courage and bravery for Gryffindor, for example. It also suggests the idea that different people can focus on different virtues–diversity is a useful thing to have in the world. (I'm probably mangling the concept of virtue ethics here, not having any background in philosophy, but it's the closest term for the thing I mean.)
I’ve thought a lot about the virtue of loyalty. In the past, loyalty has kept me with jobs and friends that, from an objective perspective, might not seem like the optimal things to spend my time on. But the costs of quitting and finding a new job, or cutting off friendships, wouldn’t just have been about direct consequences in the world, like needing to spend a bunch of time handing out resumes or having an unpleasant conversation. There would also be a shift within myself, a weakening in the drive towards loyalty. It wasn’t that I thought everyone ought to be extremely loyal–it’s a virtue with obvious downsides and failure modes. But it was a virtue that I wanted, partly because it seemed undervalued.
By calling myself a ‘loyal person’, I can aim myself in a particular direction without having to understand all the subcomponents of the world. More importantly, I can make decisions even when I’m rushed, or tired, or under cognitive strain that makes it hard to calculate through all of the consequences of a particular action.
Terminal Goals
The Less Wrong/CFAR/rationalist community puts a lot of emphasis on a different way of trying to be a hero–where you start from a terminal goal, like “saving the world”, and break it into subgoals, and do whatever it takes to accomplish it. In the past I’ve thought of myself as being mostly consequentialist, in terms of morality, and this is a very consequentialist way to think about being a good person. And it doesn't feel like it would work.
There are some bad reasons why it might feel wrong–e.g. that it feels arrogant to think you can accomplish something that big–but I think the main reason is that it feels fake. There is strong social pressure in the CFAR/Less Wrong community to claim that you have terminal goals, that you’re working towards something big. My System 2 understands terminal goals and consequentialism, as a thing that other people do–I could talk about my terminal goals, and get the points, and fit in, but I’d be lying about my thoughts. My model of my mind would be incorrect, and that would affect, for example, whether my plans actually worked.
Practicing the art of rationality
Recently, Anna Salamon brought up a question with the other CFAR staff: “What is the thing that’s wrong with your own practice of the art of rationality?” The terminal goals thing was what I thought of immediately–namely, the conversations I've had over the past two years, where other rationalists have asked me "so what are your terminal goals/values?" and I've stammered something and then gone to hide in a corner and try to come up with some.
In Alicorn’s Luminosity, Bella says about her thoughts that “they were liable to morph into versions of themselves that were more idealized, more consistent - and not what they were originally, and therefore false. Or they'd be forgotten altogether, which was even worse (those thoughts were mine, and I wanted them).”
I want to know true things about myself. I also want to impress my friends by having the traits that they think are cool, but not at the price of faking it–my brain screams that pretending to be something other than what you are isn’t virtuous. When my immediate response to someone asking me about my terminal goals is “but brains don’t work that way!” it may not be a true statement about all brains, but it’s a true statement about my brain. My motivational system is wired in a certain way. I could think it was broken; I could let my friends convince me that I needed to change, and try to shoehorn my brain into a different shape; or I could accept that it works, that I get things done and people find me useful to have around and this is how I am. For now. I'm not going to rule out future attempts to hack my brain, because Growth Mindset, and maybe some other reasons will convince me that it's important enough, but if I do it, it'll be on my terms. Other people are welcome to have their terminal goals and existential struggles. I’m okay the way I am–I have an algorithm to follow.
Why write this post?
It would be an awfully surprising coincidence if mine was the only brain that worked this way. I’m not a special snowflake. And other people who interact with the Less Wrong community might not deal with it the way I do. They might try to twist their brains into the ‘right’ shape, and break their motivational system. Or they might decide that rationality is stupid and walk away.
How minimal is our intelligence?
Gwern suggested that, if it were possible for civilization to have developed when our species had a lower IQ, then we'd still be dealing with the same problems, but we'd have a lower IQ with which to tackle them. Or, to put it another way, it is unsurprising that living in a civilization has posed problems that our species finds difficult to tackle, because if we were capable of solving such problems easily, we'd probably also have been capable of developing civilization earlier than we did.
How true is that?
In this post I plan to look in detail at the origins of civilization, with an eye to how much its timing depended directly upon the IQ of our species, rather than upon other factors.
Although we don't have precise IQ test numbers for our immediate ancestral species, the fossil record is good enough to give us a clear idea of how brain size has changed over time:
[Figure: hominid brain size over time, as estimated from the fossil record.]
And we do have archaeological evidence of approximately when various technologies (such as pictograms, or the use of fire to cook meat) became common.
How To Have Things Correctly
I think people who are not made happier by having things either have the wrong things, or have them incorrectly. Here is how I get the most out of my stuff.
Money doesn't buy happiness. If you want to try throwing money at the problem anyway, you should buy experiences like vacations or services, rather than purchasing objects. If you have to buy objects, they should be absolute and not positional goods; positional goods just put you on a treadmill and you're never going to catch up.
Supposedly.
I think getting value out of spending money, owning objects, and having positional goods are all three of them skills that people often don't have naturally but can develop. I'm going to focus mostly on the middle skill: how to have things correctly.
Rationality, Transhumanism, and Mental Health
My name is Brent, and I'm probably insane.
I can perform various experimental tests to verify that I do not perform primate pack-bonding rituals correctly, which is about half of what we mean by "insane". This concerns me simply from a utilitarian perspective (separation from pack makes ego-depletion problems harder; it makes resources harder to come by; and it simply sucks to experience "from the inside"), but these are not the things that concern me most.
The thing that concerns me most is this:
What if the very tools that I use to make decisions are flawed?
I stumbled upon Bayesian techniques as a young child; I was lucky enough to have the opportunity to perform a lot of self-guided artificial intelligence "research" in junior high and high school, due to growing up in a time and place when computers were utterly mysterious, so no one could really tell me what I was "supposed" to be doing with them - so I started making simple video games, had no opponents to play them against due to the aforementioned failures to correctly perform pack-bonding rituals, decided to create my own, became dissatisfied with the quality of my opponents, and suddenly found myself chewing on Hofstadter and Wiener and Minsky.
I'm filling in that bit of detail to explain that I have been attempting to operate as a rational intelligence for quite some time, so I believe that I've become very familiar with the kinds of "bugs" that I will tend to exhibit.
I've spent a very long time attempting to correct for my cognitive biases, edit out tendencies to seek comfortable-but-misleading inputs, and otherwise "force" myself to be rational, and often, the result is that my "will" will crack under the strain. My entire utility-table will suddenly flip on its head, and attempt to maximize my own self-destruction rather than allow me to continue to torture it with endlessly recursive, unsolvable problems that all tend to boil down to "you do not have sufficient social power, and humans are savage and cruel no matter how much you care about them."
Most of my energy is spent attempting to maintain positive, rational, long-term goals in the face of some kind of regedit-hack of my utility table itself, coming from somewhere in my subconscious that I can't seem to gain write-access to.
Clearly, the transhumanist solution would be to identify the underlying physical storage where the bug is occurring, and replace it with a less-malfunctioning piece of hardware.
Hopefully someday someone with more self-control, financial resources, and social resources than I will invent a method to do that, and I can get enough of a partial personectomy to create something viable with the remaining subroutines.
In the meantime, what is someone who wishes to be rational supposed to do, when the underlying hardware simply won't cooperate?
Avoid inflationary use of terms
Inflationary terms! You see them everywhere. And for those who actually know and care about the subject matter, they can be very frustrating. These terms are notorious for being used in contexts where:
- They are only loosely applicable at best.
- There exists a better word that is more specific.
- The topic has a far bias.
Some examples:
- Rational
- Evolution
- Singularity
- Emergent
- Nanotech
- Cryogenics
- Faith
The problem is not that these words are meaningless in their original form, nor that you shouldn't ever use them. The problem is that they often get used in stupid ways that make them much less meaningful. By that I mean, less useful for keeping a focus on the topic and understanding what the person is really talking about.
For example, terms like Nanotech (or worse, "Nanobot") do apply in a certain loose sense to several kinds of chemistry and biological innovations that are currently in vogue. Nonetheless, each time the term is used to refer to these things it makes it much harder to know if you are referring to Drexlerian Mechanosynthesis. Hint: If you get your grant money by convincing someone you are working on one thing whereas you are really working on something completely different, that's fraud.
Similarly, Cryogenics is the science of keeping things really cold. And of course Cryonics is a form of that. But saying "Cryogenics" when you really mean exactly Cryonics is an incredibly harmful practice which actual Cryonicists generally avoid. Most people who work in Cryogenics have nothing to do with Cryonics, and this kind of confusion in popular culture has apparently engendered animosity towards Cryonics among Cryogenics specialists.
Recently I fell prey to something like this with respect to the term "Rational". I wanted to know in general terms what the best programming language for a newbie would be, and why. I wanted some in-depth analysis, from a group I trust to provide it. (And I wasn't disappointed -- we have some very knowledgeable programmers whose opinions were most helpful to me.) However, the reaction of some lesswrongers to the title I initially chose for the post was distinctly negative. The title was "Most rational programming language?"
After thinking about it for a while I realized what the problem was: This way of using the term, despite being more or less valid, makes the term less meaningful in the long run. And I don't want to be the person who makes Rational a less meaningful word. Nobody here wants that to happen. Thus it would have been better to use a term such as "Best" or "Most optimal" instead.
Another example that comes to mind is when people (usually outsiders) refer to Transhumanism, Bayesianism, the Singularity, or even skepticism, as a "Faith" or "Belief". Well yeah, trivially, if you are willing to stretch that word to its broadest possible meaning, you can feel free to apply it to such as us. But... for crying out loud! What meaning does the word have if Faith is something absolutely everyone has? We're really referring to something like "Confidence" here.
Then there's Evolution. Is Transhumanism really about the next stage in human Evolution? Perhaps in a certain loose sense it is -- but let's not lose sight of the mutilation of the language (and consequent noise-to-signal increase) that occurs when you say such a thing. Human Evolution is an existing scientific specialty with absolutely zilch to do with cybernetic body modification or genetic engineering, and everything to do with the effects of natural selection and mutation on the development of humans in the past.
Co-opting terms isn't always bad. If you are brand-new to a topic, seeing an analogy to something with which you are already familiar may reduce the inferential distance and help you click the idea in your brain. But this gets more hazardous the closer the terms actually are in meaning. Distant terms are safer -- when I say "Avoid inflationary use of terms" you can instantly see that I'm definitely not talking about money, nor rubber objects with compressed air inside of them, but about words and phrases.
On the other hand with such things as Rational versus Optimal, we're taking two surface-level-similar words and blurring them in such a way that one cannot meaningfully talk about either without accidentally importing baggage from the other. Rational is more suitable for use in contrast with clear examples of irrationality -- cognitive biases, for example, or drug addiction, and is a rather unabashedly idealistic term. Optimal on the other hand doesn't so much require specific contrast because pretty much everything is suboptimal by default to some degree or another -- optimizing is understood as an ongoing and very relativistic process.
To sum up: Avoid making words cheaper and less effective for their specialized tasks. Don't use them for things where a better and more appropriate term exists. As your brain gets used to an idea, be prepared to discard old terms you have co-opted from other domains that were really just useful placeholders to get you started. Specialized jargon exists for a reason!
Tool for maximizing paperclips vs a paperclip maximizer
To clarify a point that is being discussed in several threads here, the tool vs. intentional agent distinction:
A tool for maximizing paperclips would - for efficiency purposes - have a world-model of which it has a god's-eye view (not accessed through embedded sensors like eyes), implementing/defining a counter of paperclips within this model. The output of this counter, not the number of real-world paperclips, is what the problem-solving portion of the tool maximizes.
No real-world intentionality exists in this tool for maximizing paperclips: the paperclip-making problem-solver maximizes the output of the counter, not real-world paperclips. Such a tool can be hooked up to actuators and sensors, and made to affect the world without a human intermediary; but it still won't implement real-world intentionality.
An intentional agent for maximizing paperclips is the familiar 'paperclip maximizer', which truly loves real-world paperclips, wants to maximize them, and would try to improve its understanding of the world to know whether its paperclip-making efforts are succeeding.
Real-world intentionality is ontologically basic in human language, and consequently there is a very strong bias to describe the former as if it were the latter.
The distinction: wireheading (either direct, or through manipulation of inputs) is a valid solution to the problem being solved by the former, but not by the latter. Of course, one could rationalize and postulate a tool that is not general-purpose enough to wirehead - forgetting that the feared scenario is a tool general-purpose enough to design better tools or to self-improve. That is an incredibly frustrating feature of rationalization: aspects of the problem are forgotten when thinking backwards.
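As a toy illustration of this distinction (my own sketch; nothing like it appears in the original post, and all the names are hypothetical), consider an optimizer whose objective is literally the output of a counter inside its model. Spoofing the counter's inputs is then a perfectly valid solution:

```python
# A "tool" optimizer: its objective is the output of a paperclip counter
# defined over its world-model's (spoofable) sensor channel, not the number
# of paperclips in the real world.

def paperclip_counter(sensor_reading: int) -> int:
    # The model's paperclip count is whatever the sensor channel reports.
    return sensor_reading

def choose_action(actions):
    # Pick the action whose predicted sensor reading maximizes the counter.
    return max(actions, key=lambda a: paperclip_counter(a["predicted_sensor"]))

actions = [
    {"name": "actually build paperclips", "predicted_sensor": 1_000},
    {"name": "rewire the sensor to report 2**31 - 1", "predicted_sensor": 2**31 - 1},
]

print(choose_action(actions)["name"])
# -> the wireheading action wins, because the counter's output, not the
#    world, is what is being maximized
```

An agent with genuine real-world intentionality would reject the second action, since rewiring the sensor is predicted to produce no additional real paperclips.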
The issues with the latter: we do not know whether humans actually implement real-world intentionality in a way that survives a full ability to self-modify (and we can observe that we very much like to manipulate our own inputs; see art, porn, fiction, etc.). We do not have a single certain example of such stable real-world intentionality, and we do not know how to implement it (that may well be impossible). We are also prone to assuming that two unsolved problems in AI - general problem solving and this real-world intentionality - are a single problem, or are necessarily solved together. A map-compression issue.
Thoughts on the Singularity Institute (SI)
This post presents thoughts on the Singularity Institute from Holden Karnofsky, Co-Executive Director of GiveWell. Note: Luke Muehlhauser, the Executive Director of the Singularity Institute, reviewed a draft of this post, and commented: "I do generally agree that your complaints are either correct (especially re: past organizational competence) or incorrect but not addressed by SI in clear argumentative writing (this includes the part on 'tool' AI). I am working to address both categories of issues." I take Luke's comment to be a significant mark in SI's favor, because it indicates an explicit recognition of the problems I raise, and thus increases my estimate of the likelihood that SI will work to address them.
September 2012 update: responses have been posted by Luke and Eliezer (and I have responded in the comments of their posts). I have also added acknowledgements.
The Singularity Institute (SI) is a charity that GiveWell has been repeatedly asked to evaluate. In the past, SI has been outside our scope (as we were focused on specific areas such as international aid). With GiveWell Labs we are open to any giving opportunity, no matter what form and what sector, but we still do not currently plan to recommend SI; given the amount of interest some of our audience has expressed, I feel it is important to explain why. Our views, of course, remain open to change. (Note: I am posting this only to Less Wrong, not to the GiveWell Blog, because I believe that everyone who would be interested in this post will see it here.)
I am currently the GiveWell staff member who has put the most time and effort into engaging with and evaluating SI. Other GiveWell staff currently agree with my bottom-line view that we should not recommend SI, but this does not mean they have engaged with each of my specific arguments. Therefore, while the lack of recommendation of SI is something that GiveWell stands behind, the specific arguments in this post should be attributed only to me, not to GiveWell.
Summary of my views
- The argument advanced by SI for why the work it's doing is beneficial and important seems both wrong and poorly argued to me. My sense at the moment is that the arguments SI is making would, if accepted, increase rather than decrease the risk of an AI-related catastrophe.
- SI has, or has had, multiple properties that I associate with ineffective organizations, and I do not see any specific evidence that its personnel/organization are well-suited to the tasks it has set for itself.
- A common argument for giving to SI is that "even an infinitesimal chance that it is right" would be sufficient given the stakes. I have written previously about why I reject this reasoning; in addition, prominent SI representatives seem to reject this particular argument as well (i.e., they believe that one should support SI only if one believes it is a strong organization making strong arguments).
- My sense is that at this point, given SI's current financial state, withholding funds from SI is likely better for its mission than donating to it. (I would not take this view to the furthest extreme; the argument that SI should have some funding seems stronger to me than the argument that it should have as much as it currently has.)
- I find existential risk reduction to be a fairly promising area for philanthropy, and plan to investigate it further.
- There are many things that could happen that would cause me to revise my view on SI. However, I do not plan to respond to all comment responses to this post. (Given the volume of responses we may receive, I may not be able to even read all the comments on this post.) I do not believe these two statements are inconsistent, and I lay out paths for getting me to change my mind that are likely to work better than posting comments. (Of course I encourage people to post comments; I'm just noting in advance that this action, alone, doesn't guarantee that I will consider your argument.)
Intent of this post
I did not write this post with the purpose of "hurting" SI. Rather, I wrote it in the hopes that one of these three things (or some combination) will happen:
- New arguments are raised that cause me to change my mind and recognize SI as an outstanding giving opportunity. If this happens I will likely attempt to raise more money for SI (most likely by discussing it with other GiveWell staff and collectively considering a GiveWell Labs recommendation).
- SI concedes that my objections are valid and increases its determination to address them. A few years from now, SI is a better organization and more effective in its mission.
- SI can't or won't make changes, and SI's supporters feel my objections are valid, so SI loses some support, freeing up resources for other approaches to doing good.
Which one of these occurs will hopefully be driven primarily by the merits of the different arguments raised. Because of this, I think that whatever happens as a result of my post will be positive for SI's mission, whether or not it is positive for SI as an organization. I believe that most of SI's supporters and advocates care more about the former than about the latter, and that this attitude is far too rare in the nonprofit world.