Less Wrong is a community blog devoted to refining the art of human rationality. Please visit our About page for more information.
I don't know if anyone on LW watches Doctor Who/Torchwood, but the American Torchwood remake's first episode just came out, and the premise is that everyone on Earth loses the ability to die: S04 EP1 the "Day of Miracles"
Enjoy. The reason I added it to discussion was the whole never-die concept and one of the best quotes of all time :)
"And someday when the descendants of humanity have spread from star to star, they won't tell the children about the history of Ancient Earth until they're old enough to bear it; and when they learn they'll weep to hear that such a thing as Death had ever once existed" [Harry Potter and the Methods of Rationality](http://www.fanfiction.net/s/5782108/1/Harry_Potter_and_the_Methods_of_Rationality)
Every project needs a risk assessment.
There's a feeling, just bubbling under the surface here at Less Wrong, that we're just playing at rationality. It's rationality kindergarten. The problem has been expressed in various ways:
- not a whole lot of rationality
- rationalist porn for daydreamers
- not quite as great as everyone seems to think
- shiny distraction
- only good for certain goals
And people are starting to look at fixing it. I'm not worried that their attempts - and mine - will fail. At least we'd have fun and learn something.
I'm worried that they will succeed.
What would such a Super Less Wrong community do? Its members would self-improve to the point where they had a good chance of succeeding at most things they put their mind to. They would recruit new rationalists and then optimize that recruitment process, until the community got big. They would develop methods for rapidly generating, classifying and evaluating ideas, so that the only ideas that got tried would be the best that anyone had come up with so far. The group would structure itself so that people's basic social drives - such as their desire for status - worked in the interests of the group rather than against it.
It would be pretty formidable.
What would the products of such a community be? There would probably be a self-help book that works. There would be an effective, practical guide to setting up effective communities. There would be an intuitive, practical guide to human behavior. There would be books, seminars and classes on how to really achieve your goals - and only the materials which actually got results would be kept. There would be a bunch of stuff on the Dark Arts too, no doubt. Possibly some AI research.
That's a whole lot of material that we wouldn't want to get into the hands of the wrong people.
- Half-rationalists: people who pick up on enough memes to be really dangerous, but not on enough to realise that what they're doing might be foolish. For example, building an AI without adding the friendliness features.
- Rationalists with bad goals: Someone could rationally set about trying to destroy humanity, just for the lulz.
- Dangerous information discovered: e.g. the rationalist community develops a Theory of Everything that reveals a recipe for a physics disaster (e.g. a cheap way to turn the Earth into a black hole). A non-rationalist decides to exploit this.
If this is a problem we should take seriously, what are some possible strategies for dealing with it?
- Just go ahead and ignore the issue.
- The Bayesian Conspiracy: only those who can be trusted are allowed access to the secret knowledge.
- The Good Word: mix in rationalist ideas with do-good and stay-safe ideas, to the extent that they can't be easily separated. The idea being that anyone who understands rationality will also understand that it must be used for good.
- Rationality cap: we develop enough rationality to achieve our goals (e.g. friendly AI) but deliberately stop short of developing the ideas too far.
- Play at rationality: create a community which appears rational enough to distract people who are that way inclined, but which does not dramatically increase their personal effectiveness.
- Risk management: accept that each new idea has a potential payoff (in terms of helping us avoid existential threats) and a potential cost (in terms of helping "bad rationalists"). Implement the ideas which come out positive.
In the post title, I have suggested an analogy with AI takeoff. That's not entirely fair; there is probably an upper bound to how effective a community of humans can be, at least until brain implants come along. We're probably talking two orders of magnitude rather than ten. But given that humanity already has technology with slight existential threat implications (nuclear weapons, rudimentary AI research), I would be worried about a movement that aims to make all of humanity more effective at everything they do.
Inspired by Don't Plan For the Future.
For the purposes of discussion on this site, a Friendly AI is assumed to be one that shares our terminal values. It's a safe genie that doesn't need to be told what to do, but anticipates how to best serve the interests of its creators. Since our terminal values are a function of our evolutionary history, it seems reasonable to assume that an FAI created by one intelligent species would not necessarily be friendly to other intelligent species, and that being subsumed by another species' FAI would be fairly catastrophic.
Except.... doesn't that seem kind of bad? Supposing I were able to create a strong AI, and it created a sound fun-theoretic utopia for human beings, but then proceeded to expand and subsume extraterrestrial intelligences, and subject them to something they considered a fate worse than death, I would have to regard that as a major failing of my design. My utility function assigns value to the desires of beings whose values conflict with my own. I can't allow other values to supersede mine, but absent other considerations, I have to assign negative utility in my own function for creating negative utility in the functions of other existing beings. I'm skeptical that an AI that would impose catastrophe on other thinking beings is really maximizing my utility.
It seems to me that to truly maximize my utility, an AI would need to have consideration for the utility of other beings. Secondary consideration, perhaps, but it could not maximize my utility simply by treating them as raw material with which to tile the universe with my utopian civilization.
Perhaps my utility function gives more value than most to beings that don't share my values (full disclosure, I prefer the "false" ending of Three Worlds Collide, although I don't consider it ideal.) However, if an AI imposes truly catastrophic fates on other intelligent beings, my own utility function takes such a hit that I cannot consider it friendly. A true Friendly AI would need to be at least passably friendly to other intelligences to satisfy me.
I don't know if I've finally come to terms with Eliezer's understanding of how hard Friendly AI is, or made it much, much harder, but it gives me a somewhat humbling perspective of the true scope of the problem.
First, if you don't already know it, Frequentist Doomsday Argument:
There's some number of total humans. There's a 95% chance that you come after the first 5%, in which case the total is less than 20 times the number born so far. There's been about 60 to 120 billion people so far, so there's a 95% chance that the total will be less than 1.2 to 2.4 trillion.
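The arithmetic behind that bound can be sketched in a few lines. This is only an illustration of the argument as stated; the function name `frequentist_doomsday_bound` is made up for the example.

```python
def frequentist_doomsday_bound(born_so_far: float, confidence: float = 0.95) -> float:
    """Upper bound on the total number of humans at the given confidence.

    With probability `confidence` you come after the first (1 - confidence)
    fraction of all humans, so born_so_far > (1 - confidence) * total.
    """
    return born_so_far / (1.0 - confidence)


# The two population estimates quoted in the post.
low = frequentist_doomsday_bound(60e9)
high = frequentist_doomsday_bound(120e9)
print(f"95% bound: {low:.2e} to {high:.2e} people in total")
```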
I've modified it to be Bayesian.
First, find the priors:
Do you think it's possible that the total number of sentients that have ever lived or will ever live is less than a googolplex? I'm not asking if you're certain, or even if you think it's likely. Is it more likely than one in infinity? I think it is too. This means that the prior must be normalizable.
If we take P(T=n) ∝ 1/n, where T is the total number of people, it can't be normalized, as 1/1 + 1/2 + 1/3 + ... is an infinite sum. If it decreases faster, it can at least be normalized. As such, we can use 1/n as an upper limit.
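A quick numerical sketch of why 1/n sits right on the boundary: partial sums of 1/n keep growing (roughly like ln n), while partial sums of 1/n^2 settle down near pi^2/6. The helper `partial_sum` is a name invented for this illustration.

```python
import math


def partial_sum(power: float, terms: int) -> float:
    """Sum of 1/k**power for k = 1..terms."""
    return sum(1.0 / k**power for k in range(1, terms + 1))


for terms in (10**2, 10**4, 10**6):
    harmonic = partial_sum(1, terms)  # keeps growing, so it cannot be normalized
    squares = partial_sum(2, terms)   # converges, so it can be normalized
    print(terms, round(harmonic, 3), round(squares, 6))

print("limit of the 1/n^2 sum:", round(math.pi**2 / 6, 6))
```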
Of course, that's just the limit of the upper tail, so maybe that's not a very good argument. Here's another one:
We're not so much dealing with lives as life-years. A year is a pretty arbitrary unit, so we'd expect the distribution to look much the same over most of its range if we used, say, days instead. Only the 1/n distribution is unchanged by that kind of rescaling.
T = total number of people
U = number you are
P(T=n) ∝ 1/n
U = m
P(U=m|T=n) ∝ 1/n
P(T=n|U=m) = P(U=m|T=n) * P(T=n) / P(U=m)
= (1/n^2) / P(U=m)
P(T>n|U=m) = ∫P(T=k|U=m)dk, integrated over k from n to ∞
= (1/n) / P(U=m)
And to normalize:
P(T>m|U=m) = 1
= (1/m) / P(U=m)
m = 1/P(U=m)
P(T>n|U=m) = (1/n)*m
P(T>n|U=m) = m/n
So, the probability of there being more than 1 trillion people in total, if there's been 100 billion so far, is 1/10.
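As a rough check on the result P(T>n|U=m) = m/n, here is a minimal sketch that compares the closed form with a discrete version of the same posterior (weights proportional to 1/k^2 for k >= m). The function names and the truncation point `cutoff` are assumptions made for the example, not part of the derivation.

```python
def prob_total_exceeds(n: float, m: float) -> float:
    """Closed-form tail P(T > n | U = m) = m / n, valid for n >= m."""
    if n < m:
        return 1.0  # the total cannot be smaller than your own birth rank
    return m / n


def discrete_tail(n: int, m: int, cutoff: int) -> float:
    """Same quantity from a discrete posterior with weights 1/k^2 on k = m..cutoff.

    `cutoff` is a hypothetical truncation point so that the sums are finite.
    """
    weights_total = sum(1.0 / k**2 for k in range(m, cutoff + 1))
    weights_tail = sum(1.0 / k**2 for k in range(n + 1, cutoff + 1))
    return weights_tail / weights_total


# 100 billion people so far, asking about more than 1 trillion in total.
print(prob_total_exceeds(1e12, 100e9))
# Scaled-down discrete check: m = 100, n = 1000 should also come out near 1/10.
print(discrete_tail(1000, 100, 10**6))
```

The discrete sum lands slightly below m/n because the integral only approximates the sum, but the two agree closely.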
There's still a few issues with this. It assumes P(U=m|T=n) ∝ 1/n. This seems like it makes sense. If there's a million people, there's a one-in-a-million chance of being the 268,547th. But if there's also a trillion sentient animals, the chance of being the nth person won't change that much between a million and a billion people. There's a few ways I can amend this.
First: a = number of sentient animals. P(U=m|T=n) ∝ 1/(a+n). This would make the end result P(T>n|U=m) = (m+a)/(n+a).
Second: Just replace every mention of people with sentients.
Third: Take this as a prediction of the number of sentients who aren't humans who have lived so far.
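The first amendment is easy to sketch numerically. The function below just evaluates (m+a)/(n+a); the animal counts passed in are arbitrary illustrative values, not estimates from the post.

```python
def prob_total_exceeds_with_animals(n: float, m: float, a: float) -> float:
    """P(T > n | U = m) = (m + a) / (n + a), where a is the number of sentient animals."""
    if n < m:
        return 1.0  # the total cannot be smaller than your own birth rank
    return (m + a) / (n + a)


m = 100e9  # humans so far
n = 1e12   # candidate total number of humans

# Illustrative animal counts only: none, as many as n, and vastly more than n.
for a in (0.0, 1e12, 1e15):
    print(f"a = {a:.0e}: P(T > n) = {prob_total_exceeds_with_animals(n, m, a):.3f}")
```

With a = 0 this reduces to m/n, and as a grows the probability moves toward 1, so a large sentient animal population weakens the doomsday conclusion for humans.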
The first would work well if we can find the number of sentient animals without knowing how many humans there will be. Assuming we don't take the time to terraform every planet we come across, this should work okay.
The second would work well if we did terraform every planet we came across.
The third seems a bit weird. It gives a smaller answer than the other two. It gives a smaller answer than what you'd expect for animals alone. It does this because it combines it with a Doomsday Argument against animals being sentient. You can work that out separately. Just say T is the total number of humans, and U is the total number of animals. Unfortunately, you have to know the total number of humans to work out how many animals are sentient, and vice versa. As such, the combined argument may be more useful. It won't tell you how many of the denizens of planets we colonise will be animals, but I don't think it's actually possible to tell that.
One more thing, you have more information. You have a lifetime of evidence, some of which can be used in these predictions. The lifetime of humanity isn't obvious. We might make it to the heat death of the universe, or we might just kill each other off in a nuclear or biological war in a few decades. We also might be annihilated by a paperclipper somewhere in between. As such, I don't think the evidence that way is very strong.
The evidence for animals is stronger. Emotions aren't exclusive to intelligent beings, and it doesn't seem that animals would have to be very intelligent to be sentient. Even so, how sure can you really be? This is much more subjective than the doomsday part, and the evidence against their sentience is staggering. At least, I think so; how many animals are there at different levels of intelligence?
Also, there's the priors for total human population so far. I've read estimates vary between 60 and 120 billion. I don't think a factor of two really matters too much for this discussion.
So, what can we use for these priors?
Another issue is that this is for all of space and time, not just Earth.
Consider that you're the mth person (or sentient) from the lineage of a given planet. l(m) is the number of planets with a lineage of at least m people. N is the total number of people ever, n is the number on the average planet, and p is the number of planets.
l(m)/p is the portion of planets that made it this far. This increases with n, so this weakens my argument, but only to a limited extent. I'm not sure what that extent is, though. My instinct is that l(m)/p is 50% when m=n, but the mean is not the median. I'd expect a left-skew, which would make l(m)/p much lower than that. Even so, if you placed it at 0.01%, this would mean that it's a thousand times less likely at that value. This argument still takes the estimate orders of magnitude below what you'd think, so that's not really that significant.
Also, a back-of-the-envelope calculation:
Assume, against all odds, there are a trillion times as many sentient animals as humans, and we happen to be the humans. Also, assume humans only increase their own numbers, and they're at the top percentile for the populations you'd expect. Also, assume 100 billion humans so far.
n = 1,000,000,000,000 * 100,000,000,000 * 100
n = 10^12 * 10^11 * 10^2
n = 10^25
Here's more what I'd expect:
Humanity eventually puts up a satellite to collect solar energy. Once they do one, they might as well do another, until they have a Dyson swarm. Assume 1% efficiency. Also, assume humans still use their whole bodies instead of being a brain in a vat. Finally, assume they get fed with 0.1% efficiency. And assume an 80-year lifetime.
n = solar luminosity * 1% / power of a human * 0.1% * lifetime of Sun / lifetime of human
n = 4 * 10^26 Watts * 0.01 / 100 Watts * 0.001 * 5,000,000,000 years / 80 years
n = 2.5 * 10^27
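The same estimate, written out as a small script so each assumption is visible. All inputs are the figures given above; the variable names are just labels chosen for this sketch.

```python
SOLAR_LUMINOSITY_W = 4e26     # total power output of the Sun, in watts
COLLECTION_EFFICIENCY = 0.01  # 1% of sunlight captured by the swarm
FEEDING_EFFICIENCY = 0.001    # 0.1% of captured energy ends up as food
HUMAN_POWER_W = 100.0         # power of one human, after digestive losses
SUN_LIFETIME_YEARS = 5e9
HUMAN_LIFETIME_YEARS = 80.0

# How many people the swarm can feed at any one time.
people_alive_at_once = (
    SOLAR_LUMINOSITY_W * COLLECTION_EFFICIENCY * FEEDING_EFFICIENCY / HUMAN_POWER_W
)

# How many 80-year lifetimes fit into the remaining lifetime of the Sun.
generations = SUN_LIFETIME_YEARS / HUMAN_LIFETIME_YEARS

total_people = people_alive_at_once * generations
print(f"alive at once: {people_alive_at_once:.1e}")
print(f"total people:  {total_people:.1e}")
```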
By the way, the value I used for power of a human is after the inefficiencies of digesting.
Even with assumptions that extreme, we couldn't use this planet to its full potential. Granted, that requires mining pretty much the whole planet, but with a Dyson sphere you can do that in a week, or two years with the efficiency I gave.
It actually works out to about 150 tons of Earth per person alive at any one time (about 4 * 10^19 people). How much do you need to get the elements to make a person?
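A hedged check of the 150-ton figure: it appears to match the Earth's mass divided by the number of people alive at one time under the Dyson swarm assumptions (about 4 * 10^19), rather than by everyone who ever lives. That reading is an assumption of this sketch, as is the Earth mass of roughly 5.97 * 10^24 kg, which does not appear in the post.

```python
EARTH_MASS_KG = 5.97e24       # approximate mass of the Earth (assumed here, not from the post)
PEOPLE_ALIVE_AT_ONCE = 4e19   # 4e26 W * 1% * 0.1% / 100 W per person
TOTAL_PEOPLE_EVER = 2.5e27    # the post's estimate over the lifetime of the Sun

# Share of the planet available to each person alive at the same time.
tonnes_per_living_person = EARTH_MASS_KG / PEOPLE_ALIVE_AT_ONCE / 1000.0

# For comparison: share per person if the mass were split among everyone who ever lives.
grams_per_person_ever = EARTH_MASS_KG / TOTAL_PEOPLE_EVER * 1000.0

print(f"{tonnes_per_living_person:.0f} tonnes per person alive at once")
print(f"{grams_per_person_ever:.1f} grams per person who ever lives")
```

Since atoms can be reused from one generation to the next, the per-living-person figure is the one that matters for whether there are enough elements to go around.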
Incidentally, I rewrote the article, so don't be surprised if some of the comments don't make sense.