Less Wrong is a community blog devoted to refining the art of human rationality. Please visit our About page for more information.
First post here, and I'm disagreeing with something in the main sequences. Hubris acknowledged, here's what I've been thinking about. It comes from the post "Are your enemies innately evil?":
On September 11th, 2001, nineteen Muslim males hijacked four jet airliners in a deliberately suicidal effort to hurt the United States of America. Now why do you suppose they might have done that? Because they saw the USA as a beacon of freedom to the world, but were born with a mutant disposition that made them hate freedom?
Realistically, most people don't construct their life stories with themselves as the villains. Everyone is the hero of their own story. The Enemy's story, as seen by the Enemy, is not going to make the Enemy look bad. If you try to construe motivations that would make the Enemy look bad, you'll end up flat wrong about what actually goes on in the Enemy's mind.
If I'm misreading this, please correct me, but the way I am reading this is:
1) People do not construct their stories so that they are the villains,
2) the idea that Al Qaeda is motivated by a hatred of American freedom is false.
Reading the Al Qaeda document released after the attacks called Why We Are Fighting You you find the following:
What are we calling you to, and what do we want from you?
1. The first thing that we are calling you to is Islam.
A. The religion of tahwid; of freedom from associating partners with Allah Most High , and rejection of such blasphemy; of complete love for Him, the Exalted; of complete submission to his sharia; and of the discarding of all the opinions, orders, theories, and religions that contradict with the religion He sent down to His Prophet Muhammad. Islam is the religion of all the prophets and makes no distinction between them.
It is to this religion that we call you …
2. The second thing we call you to is to stop your oppression, lies, immorality and debauchery that has spread among you.
A. We call you to be a people of manners, principles, honor and purity; to reject the immoral acts of fornication, homosexuality, intoxicants, gambling and usury.
We call you to all of this that you may be freed from the deceptive lies that you are a great nation, which your leaders spread among you in order to conceal from you the despicable state that you have obtained.
B. It is saddening to tell you that you are the worst civilization witnessed in the history of mankind:
i. You are the nation who, rather than ruling through the sharia of Allah, chooses to invent your own laws as you will and desire. You separate religion from you policies, contradicting the pure nature that affirms absolute authority to the Lord your Creator….
ii. You are the nation that permits usury…
iii. You are a nation that permits the production, spread, and use of intoxicants. You also permit drugs, and only forbid the trade of them, even though your nation is the largest consumer of them.
iv. You are a nation that permits acts of immorality, and you consider them to be pillars of personal freedom.
"Freedom" is of course one of those words. It's easy enough to imagine an SS officer saying indignantly: "Of course we are fighting for freedom! For our people to be free of Jewish domination, free from the contamination of lesser races, free from the sham of democracy..."
If we substitute the symbol with the substance though, what we mean by freedom - "people to be left more or less alone, to follow whichever religion they want or none, to speak their minds, to try to shape society's laws so they serve the people" - then Al Qaeda is absolutely inspired by a hatred of freedom. They wouldn't call it "freedom", mind you, they'd call it "decadence" or "blasphemy" or "shirk" - but the substance is what we call "freedom".
Returning to the syllogism at the top, it seems to be that there is an unstated premise. The conclusion "Al Qaeda cannot possibly hate America for its freedom because everyone sees himself as the hero of his own story" only follows if you assume that What is heroic, what is good, is substantially the same for all humans, for a liberal Westerner and an Islamic fanatic.
(for Americans, by "liberal" here I mean the classical sense that includes just about everyone you are likely to meet, read or vote for. US conservatives say they are defending the American revolution, which was broadly in line with liberal principles - slavery excepted, but since US conservatives don't support that, my point stands).
When you state the premise baldly like that, you can see the problem. There's no contradiction in thinking that Muslim fanatics think of themselves as heroic precisely for being opposed to freedom, because they see their heroism as trying to extend the rule of Allah - Shariah - across the world.
Now to the point - we all know the phrase "thinking outside the box". I submit that if you can recognize the box, you've already opened it. Real bias isn't when you have a point of view you're defending, but when you cannot imagine that another point of view seriously exists.
That phrasing has a bit of negative baggage associated with it, that this is just a matter of pigheaded close-mindedness. Try thinking about it another way. Would you say to someone with dyscalculia "You can't get your head around the basics of calculus? You are just being so close minded!" No, that's obviously nuts. We know that different peoples minds work in different ways, that some people can see things others cannot.
Orwell once wrote about the British intellectuals inability to "get" fascism, in particular in his essay on H.G. Wells. He wrote that the only people who really understood the nature and menace of fascism were either those who had felt the lash on their backs, or those who had a touch of the fascist mindset themselves. I suggest that some people just cannot imagine, cannot really believe, the enormous power of faith, of the idea of serving and fighting and dying for your god and His prophet. It is a kind of thinking that is just alien to many.
Perhaps this is resisted because people think that "Being able to think like a fascist makes you a bit of a fascist". That's not really true in any way that matters - Orwell was one of the greatest anti-fascist writers of his time, and fought against it in Spain.
So - if you can see the box you are in, you can open it, and already have half-opened it. And if you are really in the box, you can't see the box. So, how can you tell if you are in a box that you can't see versus not being in a box?
The best answer I've been able to come up with is not to think of "box or no box" but rather "open or closed box". We all work from a worldview, simply because we need some knowledge to get further knowledge. If you know you come at an issue from a certain angle, you can always check yourself. You're in a box, but boxes can be useful, and you have the option to go get some stuff from outside the box.
The second is to read people in other boxes. I like steelmanning, it's an important intellectual exercise, but it shouldn't preclude finding actual Men of Steel - that is, people passionately committed to another point of view, another box, and taking a look at what they have to say.
Now you might say: "But that's steelmanning!" Not quite. Steelmanning is "the art of addressing the best form of the other person’s argument, even if it’s not the one they presented." That may, in some circumstances, lead you to make the mistake of assuming that what you think is the best argument for a position is the same as what the other guy thinks is the best argument for his position. That's especially important if you are addressing a belief held by a large group of people.
Again, this isn't to run down steelmanning - the practice is sadly limited, and anyone who attempts it has gained a big advantage in figuring out how the world is. It's just a reminder that the steelman you make may not be quite as strong as the steelman that is out to get you.
[EDIT: Link included to the document that I did not know was available online before now]
Abstract: in this post I propose a protocol for cryonic preservation (with the central idea of using high pressure to prevent water from expanding rather than highly toxic cryoprotectants), which I think has a chance of being non-destructive enough for us to be able to preserve and then resuscitate an organism with modern technologies. In addition, I propose a simplified experimental protocol for a shrimp (or other small model organism (building a large pressure chamber is hard) capable of surviving in very deep and cold waters; shrimp is a nice trade-off between the depth of habitat and the ease of obtaining them on market), which is simple enough to be doable in a small lab or well-equipped garage setting.
Are there obvious problems with this, and how can they be addressed?
Is there a chance to pitch this experiment to a proper academic institution, or garage it is?
Originally posted here.
I do think that the odds of ever developing advanced nanomachines and/or brain scanning on molecular level plus algorithms for reversing information distortion - everything you need to undo the damage from conventional cryonic preservation and even to some extent that of brain death, according to its modern definition, if wasn't too late when the brain was preserved - for currently existing cryonics to be a bet worth taking. This is dead serious, and it's an actionable item.
Less of an action item: what if the future generations actually build quantum Bayesian superintelligence, close enough in its capabilities to Solomonoff induction, at which point even a mummified brain or the one preserved in formalin would be enough evidence to restore its original state? Or what if they invent read-only time travel, and make backups of everyone's mind right before they died (at which point it becomes indistinguishable from the belief in afterlife existing right now)? Even without time travel, they can just use a Universe-sized supercomputer to simulate every singe human physically possible, and naturally of of them is gonna be you. But aside from the obvious identity issues (and screw the timeless identity), that relies on unknown unknowns with uncomputable probabilities, and I'd like to have as few leaps of faith and quantum suicides in my life as possible.
So although vitrification right after diagnosed brain death relies on far smaller assumptions, and if totally worth doing - let me reiterate that: go sign up for cryonics - it'd be much better if we had preservation protocols so non-destructive that we could actually freeze a living human, and then bring them back alive. If nothing else, that would hugely increase the public outreach, grant the patient (rather than cadaver) status to the preserved, along with the human rights, get it recognized as a medical procedure covered by insurance or single payer, allow doctors to initiate the preservation of a dying patient before the brain death (again: I think everything short of information-theoretic death should potentially be reversible, but why take chances?), allow suffering patient opt for preservation rather than euthanasia (actually, I think it should be done right now: why on earth would anyone allow a person to do something that's guaranteed to kill them, but not allowed to do something that maybe will kill, or maybe will give the cure?), or even allow patients suffering from degrading brain conditions (e.g. Alzheimer's) to opt for preservation before their memory and personality are permanently destroyed.
Let's fix cryonics! First of all, why can't we do it on living organisms? Because of heparin poisoning - every cryoprotectant efficient enough to prevent the formation of ice crystals is a strong enough poison to kill the organism (leave alone that we can't even saturate the whole body with it - current technologies only allow to do it for the brain alone). But without cryoprotectants the water will expand upon freezing, and break the cells. But there's another way to prevent this. Under pressure above 350 MPa water slightly shrinks upon freezing rather than expanding:
So that's basically that: the key idea is to freeze (and keep) everything under pressure. Now, there are some tricks to that too.
It's not easy to put basically any animal, especially a mammal, under 350 MPa (which is 3.5x higher than in Mariana Trench). At this point even Trimix becomes toxic. Basically the only remaining solution is total liquid ventilation, which has one problem: it has never been applied successfully to a human. There's one fix to that I see: as far as I can tell, no one has ever attempted to do perform it under high pressure, and the attempts were basically failing because of the insufficient solubility of oxygen and carbon dioxide in perfluorocarbons. Well then, let's increase the pressure! Namely, go to 3 MPa on Trimix, which is doable, and only then switch to TLV, whose efficiency is improved by the higher gas solubility under high pressure. But there's another solution too. If you just connect a cardiopulmonary bypass (10 hours should be enough for the whole procedure), you don't need the surrounding liquid to even be breathable - it can just be saline. CPB also solves the problem of surviving the period after the cardiac arrest (which will occur at around 30 centigrade) but before the freezing happens - you can just keep the blood circulating and delivering oxygen.
Speaking of hypoxia, even with the CPB it's still a problem. You positively don't want the blood to circulate when freezing starts, lest it act like an abrasive water cutter. It's not that much of a problem under near-freezing temperatures, but still. Fortunately, this effect can be mitigated by administering insulin first (yay, it's the first proper academic citation in this post! Also yay, I thought about this before I even discovered that it's actually true). This makes sense: if oxygen is primarily used to metabolize glucose, less glucose means less oxygen consumed, and less damage done by hypoxia. Then there's another thing: on the phase diagram you can see that before going into the area of high temperature ice at 632 MPa, freezing temperature actually dips down to roughly -30 centigrade at 209~350 MPa. That would allow to really shut down metabolism for good when water is still liquid, and blood can be pumped by the CPB. From this point we have two ways. First, we can do the normal thing, and start freezing very slowly, so minimize the formation of ice crystals (even though they're smaller than the original water volume, they may still be sharp). Second, we can increase the pressure. That would lead to near-instantaneous freezing everywhere, thus completely eliminating the problem of hypoxia - before the freezing, blood still circulated, and freezing is very quick - way faster than can ever be achieved even by throwing a body into liquid helium under normal pressure. Video evidence suggests that quick freezing of water leads to the formation of a huge number of crystals, which is bad, but I don't know near-instantaneous freezing from supercooled state and near-instantaneous freezing upon raising the pressure will lead to the same effect. More experiments are needed, preferably not on humans.
So here is my preservation protocol:
- Anesthetize a probably terminally ill, but still conscious person.
- Connect them to a cardiopulmonary bypass.
- Replacing their blood with perfluorohexane is not necessary, since we seem to be already doing a decent job at having medium-term (several days) cardiopulmonary bypasses, but that could still help.
- Submerge them in perfluorohexane, making sure that no air bubbles are left.
- Slowly raise the ambient pressure to 350 MPa (~3.5kBar) without stopping the bypass.
- Apply a huge dose of insulin to reduce all their metabolic processes.
- Slowly cool them to -30 centigrade (at which point, given such pressure, water is still liquid), while increasing the dose of insulin, and raising the oxygen supply to the barely subtoxic level.
- Slowly raise the pressure to 1 GPa (~10kBar), at which point the water solidifies, but does so with shrinking rather than expanding. Don't cutoff the blood circulation until the moment when ice crystals starts forming in the blood/perfluorohexane flow.
- Slowly lower the temperature to -173 centigrade or lower, as you wish.
And then back:
- Raise the temperature to -20 centigrade.
- Slowly lower the pressure to 350 MPa, at which point ice melts.
- Start artificial blood circulation with a barely subtoxic oxygen level.
- Slowly raise the temperature to +4 centigrade.
- Slowly lower the pressure to 1 Bar.
- Drain the ambient perfluorohexane and replace it with pure oxygen. Attach and start a medical ventilator.
- Slowly raise the temperature to +32 centigrade.
- Apply a huge dose of epinephrine and sugar, while transfusing the actual blood (preferably autotransfusion), to restart the heart.
I claim that this protocol allows you freeze a living human to an arbitrarily low temperature, and then bring them back alive without brain damage, thus being the first true victory over death.
But let's start with something easy and small, like a shrimp. They already live in water, so there's no need to figure out the protocol for putting them into liquid. And they're already adapted to live under high pressure (no swim bladders or other cavities). And they're already adapted to live in cold water, so they should be expected to survive further cooling.
Small ones can be about 1 inch big, so let's be safe and use a 5cm-wide cylinder. To form ice III we need about 350MPa, which gives us 350e6 * 3.14 * 0.025^2 / 9.8 = 70 tons or roughly 690kN of force. Applying it directly or with a lever is unreasonable, since 70 tons of bending force is a lot even for steel, given the 5cm target. Block and tackle system is probably a good solution - actually, two of them, on each side of a beam used for compression, so we have 345 kN per system. And it looks like you can buy 40~50 ton manual hoists from alibaba, though I have no idea about their quality.
I'm not sure to which extent Pascal's law applies to solids, but if it does, the whole setup can be vastly optimized by creating a bottle neck for the pistol. One problem is that we can no longer assume that water in completely incompressible - it had to be compressed to about 87% its original volume - but aside from that, 350MPa per a millimeter thick rod is just 28kg. To compress a 0.05m by 0.1m cylinder to 87% its original volume we need to pump extra 1e-4 m^3 of water there, which amounts to 148 meters of movement, which isn't terribly good. 1cm thick rod, on the other hand, would require almost 3 tons of force, but will move only 1.5 meters. Or the problem of applying the constant pressure can be solved by enclosing the water in a plastic bag, and filling the rest of chamber with a liquid with a lower freezing point, but the same density. Thus, it is guaranteed that all the time it takes the water to freeze, it is under uniform external pressure, and then it just had nowhere to go.
Alternatively, one can just buy a 90'000 psi pump and 100'000 psi tubes and vessels, but let's face it: it they don't even list the price on their website, you probably don't even wanna know it. And since no institutions that can afford this thing seem to be interested in cryonics research, we'll have to stick to makeshift solutions (until at least the shrimp thing works, which would probably give in a publication in Nature, and enough academic recognition for proper research to start).
The Sequences are being released as an eBook, titled Rationality: From AI to Zombies, on March 12.
We went with the name "Rationality: From AI to Zombies" (based on shminux's suggestion) to make it clearer to people — who might otherwise be expecting a self-help book, or an academic text — that the style and contents of the Sequences are rather unusual. We want to filter for readers who have a wide-ranging interest in (/ tolerance for) weird intellectual topics. Alternative options tended to obscure what the book is about, or obscure its breadth / eclecticism.
The book's contents
Around 340 of Eliezer's essays from 2009 and earlier will be included, collected into twenty-six sections ("sequences"), compiled into six books:
- Map and Territory: sequences on the Bayesian conceptions of rationality, belief, evidence, and explanation.
- How to Actually Change Your Mind: sequences on confirmation bias and motivated reasoning.
- The Machine in the Ghost: sequences on optimization processes, cognition, and concepts.
- Mere Reality: sequences on science and the physical world.
- Mere Goodness: sequences on human values.
- Becoming Stronger: sequences on self-improvement and group rationality.
The six books will be released as a single sprawling eBook, making it easy to hop back and forth between different parts of the book. The whole book will be about 1,800 pages long. However, we'll also be releasing the same content as a series of six print books (and as six audio books) at a future date.
The Sequences have been tidied up in a number of small ways, but the content is mostly unchanged. The largest change is to how the content is organized. Some important Overcoming Bias and Less Wrong posts that were never officially sorted into sequences have now been added — 58 additions in all, forming four entirely new sequences (and also supplementing some existing sequences). Other posts have been removed — 105 in total. The following old sequences will be the most heavily affected:
- Map and Territory and Mysterious Answers to Mysterious Questions are being merged, expanded, and reassembled into a new set of introductory sequences, with more focus placed on cognitive biases. The name 'Map and Territory' will be re-applied to this entire collection of sequences, constituting the first book.
- Quantum Physics and Metaethics are being heavily reordered and heavily shortened.
- Most of Fun Theory and Ethical Injunctions is being left out. Taking their place will be two new sequences on ethics, plus the modified version of Metaethics.
I'll provide more details on these changes when the eBook is out.
Unlike the print and audio-book versions, the eBook version of Rationality: From AI to Zombies will be entirely free. If you want to purchase it on Kindle Store and download it directly to your Kindle, it will also be available on Amazon for $4.99.
To make the content more accessible, the eBook will include introductions I've written up for this purpose. It will also include a LessWrongWiki link to a glossary, which I'll be recruiting LessWrongers to help populate with explanations of references and jargon from the Sequences.
I'll post an announcement to Main as soon as the eBook is available. See you then!
For a site extremely focused on fixing bad thinking patterns, I've noticed a bizarre lack of discussion here. Considering the high correlation between intelligence and mental illness, you'd think it would be a bigger topic.
I personally suffer from Generalized Anxiety Disorder and a very tame panic disorder. Most of this is focused on financial and academic things, but I will also get panicky about social interaction, responsibilities, and things that happened in the past that seriously shouldn't bother me. I have an almost amusing response to anxiety that is basically my brain panicking and telling me to go hide under my desk.
I know lukeprog and Alicorn managed to fight off a good deal of their issues in this area and wrote up how, but I don't think enough has been done. They mostly dealt with depression. What about rational schizophrenics and phobics and bipolar people? It's difficult to find anxiety advice that goes beyond "do yoga while watching the sunrise!" Pop psych isn't very helpful. I think LessWrong could be. What's mental illness but a wrongness in the head?
Mental illness seems to be worse to intelligent people than your typical biases, honestly. Hiding under my desk is even less useful than, say, appealing to authority during an argument. At least the latter has the potential to be useful. I know it's limiting me, and starting cycles of avoidance, and so much more. And my mental illness isn't even that bad! Trying to be rational and successful when schizophrenic sounds like a Sisyphusian nightmare.
I'm not fighting my difficulties nearly well enough to feel qualified to author my own posts. Hearing from people who are managing is more likely to help. If nothing else, maybe a Rational Support Group would be a lot of fun.
When I criticize, I'm a genius. I can go through a book of highly-referenced scientific articles and find errors in each of them. Boy, I feel smart. How are these famous people so dumb?
But when I write, I suddenly become stupid. I sometimes spend half a day writing something and then realize at the end, or worse, after posting, that what it says simplifies to something trivial, or that I've made several unsupported assumptions, or claimed things I didn't really know were true. Or I post something, then have to go back every ten minutes to fix some point that I realize is not quite right, sometimes to the point where the whole thing falls apart.
If someone writes an article or expresses an idea that you find mistakes in, that doesn't make you smarter than that person. If you create an equally-ambitious article or idea that no one else finds mistakes in, then you can start congratulating yourself.
Political topics attract participants inclined to use the norms of mainstream political debate, risking a tipping point to lower quality discussion
(I hope that is the least click-baity title ever.)
Political topics elicit lower quality participation, holding the set of participants fixed. This is the thesis of "politics is the mind-killer".
Here's a separate effect: Political topics attract mind-killed participants. This can happen even when the initial participants are not mind-killed by the topic.
Since outreach is important, this could be a good thing. Raise the sanity water line! But the sea of people eager to enter political discussions is vast, and the epistemic problems can run deep. Of course not everyone needs to come perfectly prealigned with community norms, but any community will be limited in how robustly it can handle an influx of participants expecting a different set of norms. If you look at other forums, it seems to take very little overt contemporary political discussion before the whole place is swamped, and politics becomes endemic. As appealing as "LW, but with slightly more contemporary politics" sounds, it's probably not even an option. You have "LW, with politics in every thread", and "LW, with as little politics as we can manage".
That said, most of the problems are avoided by just not saying anything that patterns matches too easily to current political issues. From what I can tell, LW has always had tons of meta-political content, which doesn't seem to cause problems, as well as standard political points presented in unusual ways, and contrarian political opinions that are too marginal to raise concern. Frankly, if you have a "no politics" norm, people will still talk about politics, but to a limited degree. But if you don't even half-heartedly (or even hypocritically) discourage politics, then a open-entry site that accepts general topics will risk spiraling too far in a political direction.
As an aside, I'm not apolitical. Although some people advance a more sweeping dismissal of the importance or utility of political debate, this isn't required to justify restricting politics in certain contexts. The sort of the argument I've sketched (I don't want LW to be swamped by the worse sorts of people who can be attracted to political debate) is enough. There's no hypocrisy in not wanting politics on LW, but accepting political talk (and the warts it entails) elsewhere. Of the top of my head, Yvain is one LW affiliate who now largely writes about more politically charged topics on their own blog (SlateStarCodex), and there are some other progressive blogs in that direction. There are libertarians and right-leaning (reactionary? NRx-lbgt?) connections. I would love a grand unification as much as anyone, (of course, provided we all realize that I've been right all along), but please let's not tell the generals to bring their armies here for the negotiations.
Transcribed from maxikov's posted videos.
Verbal filler removed for clarity.
Audience Laughter denoted with [L], Applause with [A]
Eliezer: So, any questions? Do we have a microphone for the audience?
Guy Offscreen: We don't have a microphone for the audience, have we?
Some Other Guy: We have this furry thing, wait, no that's not hooked up. Never mind.
Eliezer: Alright, come on over to the microphone.
Guy with 'Berkeley Lab' shirt: So, this question is sort of on behalf of the HPMOR subreddit. You say you don't give red herrings, but like... He's making faces at me like... [L] You say you don't give red herrings, but while he's sitting during in the Quidditch game thinking of who he can bring along, he stares at Cedric Diggory, and he's like, "He would be useful to have at my side!", and then he never shows up. Why was there not a Cedric Diggory?
Eliezer: The true Cedrics Diggory are inside all of our hearts. [L] And in the mirror. [L] And in Harry's glasses. [L] And, well, I mean the notion is, you're going to look at that and think, "Hey, he's going to bring along Cedric Diggory as a spare wand, and he's gonna die! Right?" And then, Lestath Lestrange shows up and it's supposed to be humorous, or something. I guess I can't do humor. [L]
Guy Dressed as a Witch: Does Quirrell's attitude towards reckless muggle scientists have anything to do with your attitude towards AI researchers that aren't you? [L]
Eliezer: That is unfair. There are at least a dozen safety conscious AI researchers on the face of the earth. [L] At least one of them is respected. [L] With that said, I mean if you have a version of Voldemort who is smart and seems to be going around killing muggleborns, and sort of pretty generally down on muggles... Like, why would anyone go around killing muggleborns? I mean, there's more than one rationalization you could apply to this situation, but the sort of obvious one is that you disapprove of their conduct with nuclear weapons. From Tom Riddle's perspective that is.
I do think I sort of try to never have leakage from that thing I spend all day talking about into a place it really didn't belong, and there's a saying that goes 'A fanatic is someone who cannot change his mind, and will not change the subject.' And I'm like ok, so if I'm not going to change my mind, I'll at least endeavor to be able to change the subject. [L] Like, towards the very end of the story we are getting into the realm where sort of the convergent attitude that any sort of carefully reasoning person will take towards global catastrophic risks, and the realization that you are in fact a complete crap rationalist, and you're going to have to start over and actually try this time. These things are sort of reflective of the story outside the story, but apart from 'there is only one king upon a chessboard', and 'I need to raise the level of my game or fail', and perhaps, one little thing that was said about the mirror of VEC, as some people called it.
Aside from those things I would say that I was treating it more as convergent evolution rather than any sort of attempted parable or Professor Quirrell speaking form me. He usually doesn't... [L] I wish more people would realize that... [L] I mean, you know the... How can I put this exactly. There are these people who are sort of to the right side of the political spectrum and occasionally they tell me that they wish I'd just let Professor Quirrell take over my brain and run my body. And they are literally Republicans for You Know Who. And there you have it basically. Next Question! ... No more questions, ok. [L] I see that no one has any questions left; Oh, there you are.
Fidgety Guy: One of the chapters you posted was the final exam chapter where you had everybody brainstorm solutions to the predicament that Harry was in. Did you have any favorite alternate solution besides the one that made it into the book.
Eliezer: So, not to give away the intended solution for anyone who hasn't reached that chapter yet, though really you're just going to have the living daylight spoiled out of you, there's no way to avoid that really. So, the most brilliant solution I had not thought of at all, was for Harry to precommit to transfigure something that would cause a large explosion visible from the Quidditch stands which had observed no such explosion, thereby unless help sent via Time-Turner showed up at that point, thereby insuring that the simplest timeline was not the one where he never reached the Time-Turner. And assuring that some self-consistent set of events would occur which caused him not to carry through on his precommitment. I, you know, I suspect that I might have ruled that that wouldn't work because of the Unbreakable Vow preventing Harry from actually doing that because it might, in effect, count as trying to destroy that timeline, or filter it, and thereby have that count as trying to destroy the world, or just risk destroying it, or something along those lines, but it was brilliant! [L] I was staring at the computer screen going, "I can't believe how brilliant these people are!" "That's not something I usually hear you say," Brienne said. "I'm not usually watching hundreds of peoples' collective intelligence coming up with solutions way better than anything I thought of!" I replied to her.
And the sort of most fun lateral thinking solution was to call 'Up!' to, or pull Quirinus Quirrell's body over using transfigured carbon nanotubes and some padding, and call 'Up!' and ride away on his broomstick bones. [L] That is definitely going in 'Omake files #5: Collective Intelligence'! Next question!
Guy Wearing Black: So in the chapter with the mirror, there was a point at which Dumbledore had said something like, "I am on this side of the mirror and I always have been." That was never explained that I could tell. I'm wondering if you could clarify that.
Eliezer: It is a reference to the fanfic 'Seventh Horcrux' that *totally* ripped off HPMOR despite being written slightly earlier than it... [L] I was slapping my forehead pretty hard when that happened. Which contains the line "Perhaps Albus Dumbledore really was inside the mirror all along." Sort of arc words as it were. And I also figured that there was simply some by-location effect using one of the advanced settings of the mirror that Dumbledore was using so that the trap would always be springable as opposed to him having to know at what time Tom Riddle would appear before the mirror and be trapped. Next!
Black Guy: So, how did Moody and the rest of them retrieve the items Dumbledore threw in the mirror of VEC?
Eliezer: Dumbledore threw them outside the mirrors range, thereby causing those not to be sealed in the corresponding real world when the duplicate mode of Dumbledore inside the mirror was sealed. So wherever Dumbledore was at the time, probably investigating Nicolas Flamel's house, he suddenly popped away and the line of Merlin Unbroken and the Elder Wand just fell to the floor from where he was.
Asian Guy: In the 'Something to Protect: Severus Snape', you wrote that he laughed. And I was really curious, what exactly does Severus Snape sound like when he laughs. [L]
Person in Audience: Perform for us!
Eliezer: He He He. [L]
Girl in Audience: Do it again now, everybody together!
Audience: He He He. [L]
Guy in Blue Shirt: So I was curious about the motivation between making Sirius re-evil again and having Peter be a good guy again, their relationship. What was the motivation?
Eliezer: In character or out character?
Guy in Blue Shirt: Well, yes. [L]
Eliezer: All right, well, in character Peter can be pretty attractive when he wants to be, and Sirius was a teenager. Or, you were asking about the alignment shift part?
Guy in Blue Shirt: Yeah, the alignment and their relationship.
Eliezer: So, in the alignment, I'm just ruling it always was that way. The whole Sirius Black thing is a puzzle, is the way I'm looking at it. And the canon solution to that puzzle is perfectly fine for a children's book, which I say once again requires a higher level of skill than a grown-up book, but just did not make sense in context. So I was just looking at the puzzle and being like, ok, so what can be the actual solution to this puzzle? And also, a further important factor, this had to happen. There's a whole lot of fanfictions out there of Harry Potter. More than half a million, and that was years ago. And 'Methods of Rationality' is fundamentally set in the universe of Harry Potter fanfiction, more than canon. And in many many of these fanfictions someone goes back in time to redo the seven years, and they know that Scabbers is secretly Peter Pettigrew, and there's a scene where they stun Scabbers the rat and take him over to Dumbledore, and Head Auror, and the Minister of Magic and get them to check out this rat over here, and uncover Peter Pettigrew. And in all the times I had read that scene, at least a dozen times literally, it was never once played out the way it would in real life, where that is just a rat, and you're crazy. [L] And that was the sort of basic seed of, "Ok, we're going to play this straight, the sort of loonier conspiracies are false, but there is still a grain of conspiracy truth to it." And then I introduced the whole accounting of what happened with Sirius Black in the same chapter where Hermione just happens to mention that there's a Metamorphmagus in Hufflepuff, and exactly one person posted to the reviews in chapter 28, based on the clue that the Metamorphmagus had been mentioned in the same chapter, "Aha! I present you the tale of Peter Pettigrew, the unfortunate Metamorphmagus." [L] See! You could've solved it, you could've solved it, but you didn't! Someone solved it, you did not solve that. Next Question!
Guy in White: First, [pulls out wand] Avada Kedavra. How do you feel about your security? [L] Second, have you considered the next time you need a large group of very smart people to really work on a hard problem, presenting it to them in fiction?
Eliezer: So, of course I always keep my Patronus Charm going inside of me. [Aww/L] And if that fails, I do have my amulet that triggers my emergency kitten shield. [L] And indeed one of the higher, more attractive things I'm considering to potentially do for the next major project is 'Precisely Bound Djinn and their Behavior'. The theme of which is you have these people who can summon djinn, or command the djinn effect, and you can sort of negotiate with them in the language of djinn and they will always interpret your wish in the worst way possible, or you can give them mathematically precise orders; Which they can apparently carry out using unlimited computing power, which obviously ends the world in fairly short order, causing our protagonist to be caught in a groundhog day loop as they try over and over again to both maybe arrange for conditions outside to be such that they can get some research done for longer than a few months before the world ends again, and also try to figure out what to tell their djinn. And, you know, I figure that if anyone can give me an unboundedly computable specification of a value aligned advanced agent, the story ends, the characters win, hopefully that person gets a large monetary prize if I can swing it, the world is safer, and I can go onto my next fiction writing project, which will be the one with the boundedly specified [L] value aligned advanced agents. [A]
Guy with Purple Tie: So, what is the source of magic?
Eliezer: Alright, so, there was a bit of literary miscommunication in HPMOR. I tried as hard as I could to signal that unraveling the true nature of magic and everything that adheres in it is actually this kind of this large project that they were not going to complete during Harry's first year of Hogwarts. [L] You know, 35 years, even if someone is helping you is a reasonable amount of time for a project like that to take. And if it's something really difficult, like AIs, you might need more that two people even. [L] At least if you want the value aligned version. Anyway, where was I?
So the only way I think that fundamentally to come up with a non-nitwit explanation of magic, you need to get started from the non-nitwit explanation, and then generate the laws of magic, so that when you reveal the answer behind the mystery, everything actually fits with it. You may have noticed this kind of philosophy showing up elsewhere in the literary theory of HPMOR at various points where it turns out that things fit with things you have already seen. But with magic, ultimately the source material was not designed as a hard science fiction story. The magic that we start with as a phenomenon is not designed to be solvable, and what did happen was that the characters thought of experiments, and I in my role of the universe thought of the answer to it, and if they had ever reached the point where there was only one explanation left, then the magic would have had rules, and they would have been arrived at in a fairly organic way that I could have felt good about; Not as a sudden, "Aha! I gotcha! I revealed this thing that you had no way of guessing."
Now I could speculate. And I even tried to write a little section where Harry runs into Dumbledore's writings that Dumbledore left behind, where Dumbledore writes some of his own speculation, but there was no good place to put that into the final chapter. But maybe I'll later be able... The final edits were kind of rushed honestly, sleep deprivation, 3am. But maybe in the second edit or something I'll be able to put that paragraph, that set of paragraphs in there. In Dumbledore's office, Dumbledore has speculated. He's mostly just taking the best of some of the other writers that he's read. That, look at the size of the universe, that seems to be mundane. Dumbledore was around during World War 2, he does know that muggles have telescopes. He has talked with muggle scientists a bit and those muggle scientists seem very confident that all the universe they can see looks like it's mundane. And Dumbledore wondered, why is there this sort of small magical section, and this much larger mundane section, or this much larger muggle section? And that seemed to Dumbledore to suggest that as a certain other magical philosopher had written, If you consider the question, what is the underlying nature of reality, is it that it was mundane to begin with, and then magic arises from mundanity, or is the universe magic to begin with, and then mundanity has been imposed above it? Now mundanity by itself will clearly never give rise to magic, yet magic permits mundanity to be imposed, and so, this other magical philosopher wrote, therefore he thinks that the universe is magical to begin with and the mundane sections are imposed above the magic. And Dumbledore himself had speculated, having been antiquated with the line of Merlin for much of his life, that just as the Interdict of Merlin was imposed to restrict the spread an the number of people who had sufficiently powerful magic, perhaps the mundane world itself, is an attempt to bring order to something that was on the verge of falling apart in Atlantis, or in whatever came before Atlantis. Perhaps the thing that happened with the Interdict of Merlin has happened over and over again. People trying to impose law upon reality, and that law having flaws, and the flaws being more and more exploited until they reach a point of power that recons to destroy the world, and the most adapt wielders of that power try to once again impose mundanity.
And I will also observe, although Dumbledore had no way of figuring this out, and I think Harry might not have figured it out yet because he dosen't yet know about chromosomal crossover, That if there is no wizard gene, but rather a muggle gene, and the muggle gene sometimes gets hit by cosmic rays and ceases to function thereby producing a non-muggle allele, then some of the muggle vs. wizard alleles in the wizard population that got there from muggleborns will be repairable via chromosomal crossover, thus sometimes causing two wizards to give birth to a squib. Furthermore this will happen more frequently in wizards who have recent muggleborn ancestry. I wonder if Lucius told Draco that when Draco told him about Harry's theory of genetics. Anyway, this concludes my strictly personal speculations. It's not in the text, so it's not real unless it's in the text somewhere. 'Opinion of God', Not 'Word of God'. But this concludes my personal speculations on the origin of magic, and the nature of the "wizard gene". [A]
Recently I talked with a guy from Grant Street Group. They make, among other things, software with which local governments can auction their bonds on the Internet.
By making the auction process more transparent and easier to participate in, they enable local governments which need to sell bonds (to build a high school, for instance), to sell those bonds at, say, 7% interest instead of 8%. (At least, that's what he said.)
They have similar software for auctioning liens on property taxes, which also helps local governments raise more money by bringing more buyers to each auction, and probably helps the buyers reduce their risks by giving them more information.
This is a big deal. I think it's potentially more important than any budget argument that's been on the front pages since the 1960s. Yet I only heard of it by chance.
People would rather argue about reducing the budget by eliminating waste, or cutting subsidies to people who don't deserve it, or changing our ideological priorities. Nobody wants to talk about auction mechanics. But fixing the auction mechanics is the easy win. It's so easy that nobody's interested in it. It doesn't buy us fuzzies or let us signal our affiliations. To an individual activist, it's hardly worth doing.
A recent survey showed that the LessWrong discussion forums mostly attract readers who are predominantly either atheists or agnostics, and who lean towards the left or far left in politics. As one of the main goals of LessWrong is overcoming bias, I would like to come up with a topic which I think has a high probability of challenging some biases held by at least some members of the community. It's easy to fight against biases when the biases belong to your opponents, but much harder when you yourself might be the one with biases. It's also easy to cherry-pick arguments which prove your beliefs and ignore those which would disprove them. It's also common in such discussions, that the side calling itself rationalist makes exactly the same mistakes they accuse their opponents of doing. Far too often have I seen people (sometimes even Yudkowsky himself) who are very good rationalists but can quickly become irrational and use several fallacies when arguing about history or religion. This most commonly manifests when we take the dumbest and most fundamentalist young Earth creationists as an example, winning easily against them, then claiming that we disproved all arguments ever made by any theist. No, this article will not be about whether God exists or not, or whether any real world religion is fundamentally right or wrong. I strongly discourage any discussion about these two topics.
This article has two main purposes:
1. To show an interesting example where the scientific method can lead to wrong conclusions
2. To overcome a certain specific bias, namely, that the pre-modern Catholic Church was opposed to the concept of the Earth orbiting the Sun with the deliberate purpose of hindering scientific progress and to keep the world in ignorance. I hope this would prove to also be an interesting challenge for your rationality, because it is easy to fight against bias in others, but not so easy to fight against bias on yourselves.
The basis of my claims is that I have read the book written by Galilei himself, and I'm very interested (and not a professional, but well read) in early modern, but especially 16-17th century history.
Geocentrism versus Heliocentrism
I assume every educated person knows the name of Galileo Galilei. I won't waste the space on the site and the time of the readers to present a full biography about his life, there are plenty of on-line resources where you can find more than enough biographic information about him.
What is interesting about him is how many people have severe misconceptions about him. Far too often he is celebrated as the one sane man in an era of ignorance, the sole propagator of science and rationality when the powers of that era suppressed any scientific thought and ridiculed everyone who tried to challenge the accepted theories about the physical world. Some even go as far as claiming that people believed the Earth was flat. Although the flat Earth theory was not propagated at all, it's true that the heliocentric view of the Solar System (the Earth revolving around the Sun) was not yet accepted.
However, the claim that the Church was suppressing evidence about heliocentrism "to maintain its power over the ignorant masses" can be disproved easily:
- The common people didn't go to school where they could have learned about it, and those commoners who did go to school, just learned to read and write, not much more, so they wouldn't care less about what orbits around what. This differs from 20-21th century fundamentalists who want to teach young Earth creationism in schools - back then in the 17th century, there would be no classes where either the geocentric or heliocentric views could have been taught to the masses.
- Heliocentrism was not discovered by Galilei. It was first proposed by Nicolaus Copernicus almost 100 years before Galilei. Copernicus didn't have any affairs with the Inquisition. His theories didn't gain wide acceptance, but he and his followers weren't persecuted either.
- Galilei was only sentenced to house arrest, and mostly because of insulting the pope and doing other unwise things. The political climate in 17th century Italy was quite messy, and Galilei did quite a few unfortunate choices regarding his alliances. Actually, Galilei was the one who brought religion into the debate: his opponents were citing Aristotle, not the Bible in their arguments. Galilei, however, wanted to redefine the Scripture based on his (unproven) beliefs, and insisted that he should have the authority to push his own views about how people interpret the Bible. Of course this pissed quite a few people off, and his case was not helped by publicly calling the pope an idiot.
- For a long time Galilei was a good friend of the pope, while holding heliocentric views. So were a couple of other astronomers. The heliocentrism-geocentrism debates were common among astronomers of the day, and were not hindered, but even encouraged by the pope.
- The heliocentrism-geocentrism debate was never an ateism-theism debate. The heliocentrists were committed theists, just like the defenders of geocentrism. The Church didn't suppress science, but actually funded the research of most scientists.
- The defenders of geocentrism didn't use the Bible as a basis for their claims. They used Aristotle and, for the time being, good scientific reasoning. The heliocentrists were much more prone to use the "God did it" argument when they couldn't defend the gaps in their proofs.
The birth of heliocentrism.
By the 16th century, astronomers have plotted the movements of the most important celestial bodies in the sky. Observing the motion of the Sun, the Moon and the stars, it would seem obvious that the Earth is motionless and everything orbits around it. This model (called geocentrism) had only one minor flaw: the planets would sometimes make a loop in their motion, "moving backwards". This required a lot of very complicated formulas to model their motions. Thus, by the virtue of Occam's razor, a theory was born which could better explain the motion of the planets: what if the Earth and everything else orbited around the Sun? However, this new theory (heliocentrism) had a lot of issues, because while it could explain the looping motion of the planets, there were a lot of things which it either couldn't explain, or the geocentric model could explain it much better.
The proofs, advantages and disadvantages
The heliocentric view had only a single advantage against the geocentric one: it could describe the motion of the planets by a much simper formula.
However, it had a number of severe problems:
- Gravity. Why do the objects have weight, and why are they all pulled towards the center of the Earth? Why don't objects fall off the Earth on the other side of the planet? Remember, Newton wasn't even born yet! The geocentric view had a very simple explanation, dating back to Aristotle: it is the nature of all objects that they strive towards the center of the world, and the center of the spherical Earth is the center of the world. The heliocentric theory couldn't counter this argument.
- Stellar parallax. If the Earth is not stationary, then the relative position of the stars should change as the Earth orbits the Sun. No such change was observable by the instruments of that time. Only in the first half of the 19th century did we succeed in measuring it, and only then was the movement of the Earth around the Sun finally proven.
- Galilei tried to used the tides as a proof. The geocentrists argued that the tides are caused by the Moon even if they didn't knew by what mechanisms, but Galilei said that it's just a coincidence, and the tides are not caused by the Moon: just as if we put a barrel of water onto a cart, the water would be still if the cart was stationary and the water would be sloshing around if the cart was pulled by a horse, so are the tides caused by the water sloshing around as the Earth moves. If you read Galilei's book, you will discover quite a number of such silly arguments, and you'll see that Galilei was anything but a rationalist. Instead of changing his views against overwhelming proofs, he used all possible fallacies to push his view through.
Actually the most interesting author in this topic was Riccioli. If you study his writings you will get definite proof that the heliocentrism-geocentrism debate was handled with scientific accuracy and rationality, and it was not a religious debate at all. He defended geocentrism, and presented 126 arguments in the topic (49 for heliocentrism, 77 against), and only two of them (both for heliocentrism) had any religious connotations, and he stated valid responses against both of them. This means that he, as a rationalist, presented both sides of the debate in a neutral way, and used reasoning instead of appeal to authority or faith in all cases. Actually this was what the pope expected of Galilei, and such a book was what he commissioned from Galilei. Galilei instead wrote a book where he caricatured the pope as a strawman, and instead of presenting arguments for and against both world-views in a neutral way, he wrote a book which can be called anything but scientific.
By the way, Riccioli was a Catholic priest. And a scientist. And, it seems to me, also a rationalist. Studying the works of such people like him, you might want to change your mind if you perceive a conflict between science and religion, which is part of today's public consciousness only because of a small number of very loud religious fundamentalists, helped by some committed atheists trying to suggest that all theists are like them.
Finally, I would like to copy a short summary about this book:
In 1651 the Italian astronomer Giovanni Battista Riccioli published within his Almagestum Novum, a massive 1500 page treatise on astronomy, a discussion of 126 arguments for and against the Copernican hypothesis (49 for, 77 against). A synopsis of each argument is presented here, with discussion and analysis. Seen through Riccioli's 126 arguments, the debate over the Copernican hypothesis appears dynamic and indeed similar to more modern scientific debates. Both sides present good arguments as point and counter-point. Religious arguments play a minor role in the debate; careful, reproducible experiments a major role. To Riccioli, the anti-Copernican arguments carry the greater weight, on the basis of a few key arguments against which the Copernicans have no good response. These include arguments based on telescopic observations of stars, and on the apparent absence of what today would be called "Coriolis Effect" phenomena; both have been overlooked by the historical record (which paints a picture of the 126 arguments that little resembles them). Given the available scientific knowledge in 1651, a geo-heliocentric hypothesis clearly had real strength, but Riccioli presents it as merely the "least absurd" available model - perhaps comparable to the Standard Model in particle physics today - and not as a fully coherent theory. Riccioli's work sheds light on a fascinating piece of the history of astronomy, and highlights the competence of scientists of his time.
The full article can be found under this link. I recommend it to everyone interested in the topic. It shows that geocentrists at that time had real scientific proofs and real experiments regarding their theories, and for most of them the heliocentrists had no meaningful answers.
- I'm not a Catholic, so I have no reason to defend the historic Catholic church due to "justifying my insecurities" - a very common accusation against someone perceived to be defending theists in a predominantly atheist discussion forum.
- Any discussion about any perceived proofs for or against the existence of God would be off-topic here. I know it's tempting to show off your best proofs against your carefully constructed straw-men yet again, but this is just not the place for it, as it would detract from the main purpose of this article, as summarized in its introduction.
- English is not my native language. Nevertheless, I hope that what I wrote was comprehensive enough to be understandable. If there is any part of my article which you find ambiguous, feel free to ask.
I have great hopes and expectations that the LessWrong community is suitable to discuss such ideas. I have experience with presenting these ideas on other, predominantly atheist internet communities, and most often the reactions was outright flaming, a hurricane of unexplained downvotes, and prejudicial ad hominem attacks based on what affiliations they assumed I was subscribing to. It is common for people to decide whether they believe a claim or not, based solely by whether the claim suits their ideological affiliations or not. The best quality of rationalists, however, should be to be able to change their views when confronted by overwhelming proof, instead of trying to come up with more and more convoluted explanations. In the time I spent in the LessWrong community, I became to respect that the people here can argue in a civil manner, listening to the arguments of others instead of discarding them outright.
Update: When I posted this announcement I remarkably failed to make the connection that the April 15th is tax day here in the US, and as a prime example of the planning fallacy (a topic of the first sequence!), I failed to anticipate just how complicated my taxes would be this year. The first post of the reading group is basically done but a little rushed, and I want to take an extra day to get it right. Expect it to post on the next day, the 16th
On Thursday, 16 April 2015, just under a month out from this posting, I will hold the first session of an online reading group for the ebook Rationality: From AI to Zombies, a compilation of the LessWrong sequences by our own Eliezer Yudkowsky. I would like to model this on the very successful Superintelligence reading group led by
The reading group will 'meet' on a semi-monthly post on the LessWrong discussion forum. For each 'meeting' we will read one sequence from the the Rationality book, which contains a total of 26 lettered sequences. A few of the sequences are unusually long, and these might be split into two sessions. If so, advance warning will be given.
In each posting I will briefly summarize the salient points of the essays comprising the sequence, link to the original articles and discussion when possible, attempt to find, link to, and quote one or more related materials or opposing viewpoints from outside the text, and present a half-dozen or so question prompts to get the conversation rolling. Discussion will take place in the comments. Others are encouraged to provide their own question prompts or unprompted commentary as well.
We welcome both newcomers and veterans on the topic. If you've never read the sequences, this is a great opportunity to do so. If you are an old timer from the Overcoming Bias days then this is a chance to share your wisdom and perhaps revisit the material with fresh eyes. All levels of time commitment are welcome.
If this sounds like something you want to participate in, then please grab a copy of the book and get started reading the preface, introduction, and the 10 essays / 42 pages which comprise Part A: Predictably Wrong. The first virtual meeting (forum post) covering this material will go live before 6pm Thursday PDT (1am Friday UTC), 16 April 2015. Successive meetings will start no later than 6pm PDT on the first and third Wednesdays of a month.
Following this schedule it is expected that it will take just over a year to complete the entire book. If you prefer flexibility, come by any time! And if you are coming upon this post from the future, please feel free leave your opinions as well. The discussion period never closes.
Topic for the first week is the preface by Eliezer Yudkowsky, the introduction by Rob Bensinger, and Part A: Predictably Wrong, a sequence covering rationality, the search for truth, and a handful of biases.
There seems to be a lot of enthusiasm around LessWrong meetups, so I thought something like this might be interesting too. There is no need to register - just add your marker and keep an eye out for someone living near you.
Here's the link: https://www.zeemaps.com/map?group=1323143
I posted this on an Open Thread first. Below are some observations based on the previous discussion:
When creating a new marker you will be given a special URL you can use to edit it later. If you lose it, you can create a new one and ask me to delete the old marker. Try not to lose it though.
If someone you tried to contact is unreachable, notify me and I'll delete the marker in order to keep the map tidy. Also, try to keep your own marker updated.
It was suggested that it would be a good idea to circulate the map around survey time. I'll try to remind everyone to update their markers around that time. Any major changes (e.g. changing admin, switching services, remaking the map to eliminate dead markers) will also happen then.
The map data can be exported by anyone, so there's no need to start over if I disappear or whatever.
Edit: Please, you have to make it possible to contact you. If you choose to use a name that doesn't match your LW account, you have to add an email address or equivalent. If you don't do that, it is assumed that the name on the marker is your username here, but if it isn't you are essentially unreachable and will be removed.
Past and Present
Ten years ago teenager me was hopeful. And stupid.
The world neglected aging as a disease, Aubrey had barely started spreading memes, to the point it was worth it for him to let me work remotely to help with Metuselah foundation. They had not even received that initial 1,000,000 donation from an anonymous donor. The Metuselah prize was running for less than 400,000 if I remember well. Still, I was a believer.
Now we live in the age of Larry Page's Calico, 100,000,000 dollars trying to tackle the problem, besides many other amazing initiatives, from the research paid for by Life Extension Foundation and Bill Faloon, to scholars in top universities like Steve Garan and Kenneth Hayworth fixing things from our models of aging to plastination techniques. Yet, I am much more skeptical now.
I am skeptical because I could not find a single individual who already used a simple technique that could certainly save you many years of healthy life. I could not even find a single individual who looked into it and decided it wasn't worth it, or was too pricy, or something of that sort.
That technique is freezing some of your cells now.
Freezing cells is not a far future hope, this is something that already exists, and has been possible for decades. The reason you would want to freeze them, in case you haven't thought of it, is that they are getting older every day, so the ones you have now are the youngest ones you'll ever be able to use.
Using these cells to create new organs is not something that may help you if medicine and technology continue progressing according to the law of accelerating returns in 10 or 30 years. We already know how to make organs out of your cells. Right now. Some organs live longer, some shorter, but it can be done - for instance to bladders - and is being done.
Hope versus Reason
Now, you'd think if there was an almost non-invasive technique already shown to work in humans that can preserve many years of your life and involves only a few trivial inconveniences - compared to changing diet or exercising for instance- the whole longevist/immortalist crowd would be lining up for it and keeping back up tissue samples all over the place.
Well I've asked them. I've asked some of the adamant researchers, and I've asked the superwealthy; I've asked the cryonicists and supplement gorgers; I've asked those who work on this 8 hour a day every day, and I've asked those who pay others to do so. I asked it mostly for selfish reasons, I saw the TEDs by Juan Enriquez and Anthony Atala and thought: hey look, clearly beneficial expected life length increase, yay! let me call someone who found this out before me - anyone, I'm probably the last one, silly me - and fix this.
I've asked them all, and I have nothing to show for it.
My takeaway lesson is: whatever it is that other people are doing to solve their own impending death, they are far from doing it rationally, and maybe most of the money and psychology involved in this whole business is about buying hope, not about staring into the void and finding out the best ways of dodging it. Maybe people are not in fact going to go all-in if the opportunity comes.
How to fix this?
Let me disclose first that I have no idea how to fix this problem. I don't mean the problem of getting all longevists to freeze their cells, I mean the problem of getting them to take information from the world of science and biomedicine and applying it to themselves. To become users of the technology they are boasters of. To behave rationally in a CFAR or even homo economicus sense.
I was hoping for a grandiose idea in this last paragraph, but it didn't come. I'll go with a quote from this emotional song sung by us during last year's Secular Solstice celebration
I wrote an article about the process of signing up for cryo since I couldn't find any such accounts online. If you have questions about the sign-up process, just ask.
A few months ago, I signed up for Alcor's brain-only cryopreservation. The entire process took me 11 weeks from the day I started till the day I received my medical bracelet (the thing that’ll let paramedics know that your dead body should be handled by Alcor). I paid them $90 for the application fee. From now on, every year I’ll pay $530 for Alcor membership fees, and also pay $275 for my separately purchased life insurance.
We have a recurring theme in the greater Less Wrong community that life should be more like a high fantasy novel. Maybe that is to be expected when a quarter of the community came here via Harry Potter fanfiction, and we also have rationalist group houses named after fantasy locations, descriptions of community members in terms of character archetypes and PCs versus NPCs, semi-serious development of the new atheist gods, and feel free to contribute your favorites in the comments.
A failure mode common to high fantasy novels as well as politics is solving all our problems by defeating the villain. Actually, this is a common narrative structure for our entire storytelling species, and it works well as a narrative structure. The story needs conflict, so we pit a sympathetic protagonist against a compelling antagonist, and we reach a satisfying climax when the two come into direct conflict, good conquers evil, and we live happily ever after.
This isn't an article about whether your opponent really is a villain. Let's make the (large) assumption that you have legitimately identified a villain who is doing evil things. They certainly exist in the world. Defeating this villain is a legitimate goal.
And then what?
Defeating the villain is rarely enough. Building is harder than destroying, and it is very unlikely that something good will spontaneously fill the void when something evil is taken away. It is also insufficient to speak in vague generalities about the ideals to which the post-[whatever] society will adhere. How are you going to avoid the problems caused by whatever you are eliminating, and how are you going to successfully transition from evil to good?
In fantasy novels, this is rarely an issue. The story ends shortly after the climax, either with good ascending or time-skipping to a society made perfect off-camera. Sauron has been vanquished, the rightful king has been restored, cue epilogue(s). And then what? Has the Chosen One shown skill in diplomacy and economics, solving problems not involving swords? What was Aragorn's tax policy? Sauron managed to feed his armies from a wasteland; what kind of agricultural techniques do you have? And indeed, if the book/series needs a sequel, we find that a problem at least as bad as the original fills in the void.
Reality often follows that pattern. Marx explicitly had no plan for what happened after you smashed capitalism. Destroy the oppressors and then ... as it turns out, slightly different oppressors come in and generally kill a fair percentage of the population. It works on the other direction as well; the fall of Soviet communism led not to spontaneous capitalism but rather kleptocracy and Vladmir Putin. For most of my lifetime, a major pillar of American foreign policy has seemed to be the overthrow of hostile dictators (end of plan). For example, Muammar Gaddafi was killed in 2011, and Libya has been in some state of unrest or civil war ever since. Maybe this is one where it would not be best to contribute our favorites in the comments.
This is not to say that you never get improvements that way. Aragorn can hardly be worse than Sauron. Regression to the mean perhaps suggests that you will get something less bad just by luck, as Putin seems clearly less bad than Stalin, although Stalin seems clearly worse than almost any other regime change in history. Some would say that causing civil wars in hostile countries is the goal rather than a failure of American foreign policy, which seems a darker sort of instrumental rationality.
Human flourishing is not the default state of affairs, temporarily suppressed by villainy. Spontaneous order is real, but it still needs institutions and social technology to support it.
Defeating the villain is a (possibly) necessary but (almost certainly) insufficient condition for bringing about good.
One thing I really like about this community is that projects tend to be conceived in the positive rather than the negative. Please keep developing your plans not only in terms of "this is a bad thing to be eliminated" but also "this is a better thing to be created" and "this is how I plan to get there."
Many people want to know how to live well. Part of living well is thinking well, because if one thinks the wrong thoughts it is hard to do the right things to get the best ends.
We think a lot about how to think well, and one of the first things we thought about was how to not think well. Bad ways of thinking repeat in ways we can see coming, because we have looked at how people think and know more now about that than we used to.
But even if we know how other people think bad thoughts, that is not enough. We need to both accept that we can have bad ways of thinking and figure out how to have good ways of thinking instead.
The first is very hard on the heart, but is why we call this place "Less Wrong." If we had called it something like more right, it could have been about how we're more right than other people instead of more right than our past selves.
The second is very hard on the head. It is not just enough to study the bad ways of thinking and turn them around. There are many ways to be wrong, but only a few ways to be right. If you turn left all the way around, it will point right, but we want it to point up.
The heart of our approach has a few parts:
- We are okay with not knowing. Only once we know we don't know can we look.
- We are okay with having been wrong. If we have wrong thoughts, the only way to have right thoughts is to let the wrong ones go.
- We are quick to change our minds. We look at what is when we get the chance.
- We are okay with the truth. Instead of trying to force it to be what we thought it was, we let it be what it is.
- We talk with each other about the truth of everything. If one of us is wrong, we want the others to help them become less wrong.
- We look at the world. We look at both the time before now and the time after now, because many ideas are only true if they agree with the time after now, and we can make changes to check those ideas.
- We like when ideas are as simple as possible.
- We make plans around being wrong. We look into the dark and ask what the world would look like if we were wrong, instead of just what the world would look like if we were right.
- We understand that as we become less wrong, we see more things wrong. We try to fix all the wrong things, because as soon as we accept that something will always be wrong we can not move past that thing.
- We try to be as close to the truth as possible.
- We study as many things as we can. There is only one world, and to look at a part tells you a little about all the other parts.
- We have a reason to do what we do. We do these things only because they help us, not because they are their own reason.
Like many Less Wrong readers, I greatly enjoy Slate Star Codex; there's a large overlap in readership. However, the comments there are far worse, not worth reading for me. I think this is in part due to the lack of LW-style up and downvotes. Have there ever been discussion threads about SSC posts here on LW? What do people think of the idea occasionally having them? Does Scott himself have any views on this, and would he be OK with it?
The latest from Scott:
I'm fine with anyone who wants reposting things for comments on LW, except for posts where I specifically say otherwise or tag them with "things i will regret writing"
In this thread some have also argued for not posting the most hot-button political writings.
Would anyone be up for doing this? Ataxerxes started with "Extremism in Thought Experiments is No Vice"
"Today we are announcing that we will donate 10% of our advertising revenue receipts in 2014 to non-profits chosen by the reddit community. Whether it’s a large ad campaign or a $5 sponsored headline on reddit, we intend for all ad revenue this year to benefit not only reddit as a platform but also to support the goals and causes of the entire community."
I'm currently reading through some relevant literature for preparing my FLI grant proposal on the topic of concept learning and AI safety. I figured that I might as well write down the research ideas I get while doing so, so as to get some feedback and clarify my thoughts. I will posting these in a series of "Concept Safety"-titled articles.
A frequently-raised worry about AI is that it may reason in ways which are very different from us, and understand the world in a very alien manner. For example, Armstrong, Sandberg & Bostrom (2012) consider the possibility of restricting an AI via "rule-based motivational control" and programming it to follow restrictions like "stay within this lead box here", but they raise worries about the difficulty of rigorously defining "this lead box here". To address this, they go on to consider the possibility of making an AI internalize human concepts via feedback, with the AI being told whether or not some behavior is good or bad and then constructing a corresponding world-model based on that. The authors are however worried that this may fail, because
Humans seem quite adept at constructing the correct generalisations – most of us have correctly deduced what we should/should not be doing in general situations (whether or not we follow those rules). But humans share a common of genetic design, which the OAI would likely not have. Sharing, for instance, derives partially from genetic predisposition to reciprocal altruism: the OAI may not integrate the same concept as a human child would. Though reinforcement learning has a good track record, it is neither a panacea nor a guarantee that the OAIs generalisations agree with ours.
Addressing this, a possibility that I raised in Sotala (2015) was that possibly the concept-learning mechanisms in the human brain are actually relatively simple, and that we could replicate the human concept learning process by replicating those rules. I'll start this post by discussing a closely related hypothesis: that given a specific learning or reasoning task and a certain kind of data, there is an optimal way to organize the data that will naturally emerge. If this were the case, then AI and human reasoning might naturally tend to learn the same kinds of concepts, even if they were using very different mechanisms. Later on the post, I will discuss how one might try to verify that similar representations had in fact been learned, and how to set up a system to make them even more similar.
A particularly fascinating branch of recent research relates to the learning of word embeddings, which are mappings of words to very high-dimensional vectors. It turns out that if you train a system on one of several kinds of tasks, such as being able to classify sentences as valid or invalid, this builds up a space of word vectors that reflects the relationships between the words. For example, there seems to be a male/female dimension to words, so that there's a "female vector" that we can add to the word "man" to get "woman" - or, equivalently, which we can subtract from "woman" to get "man". And it so happens (Mikolov, Yih & Zweig 2013) that we can also get from the word "king" to the word "queen" by adding the same vector to "king". In general, we can (roughly) get to the male/female version of any word vector by adding or subtracting this one difference vector!
Why would this happen? Well, a learner that needs to classify sentences as valid or invalid needs to classify the sentence "the king sat on his throne" as valid while classifying the sentence "the king sat on her throne" as invalid. So including a gender dimension on the built-up representation makes sense.
But gender isn't the only kind of relationship that gets reflected in the geometry of the word space. Here are a few more:
It turns out (Mikolov et al. 2013) that with the right kind of training mechanism, a lot of relationships that we're intuitively aware of become automatically learned and represented in the concept geometry. And like Olah (2014) comments:
It’s important to appreciate that all of these properties of W are side effects. We didn’t try to have similar words be close together. We didn’t try to have analogies encoded with difference vectors. All we tried to do was perform a simple task, like predicting whether a sentence was valid. These properties more or less popped out of the optimization process.
This seems to be a great strength of neural networks: they learn better ways to represent data, automatically. Representing data well, in turn, seems to be essential to success at many machine learning problems. Word embeddings are just a particularly striking example of learning a representation.
It gets even more interesting, for we can use these for translation. Since Olah has already written an excellent exposition of this, I'll just quote him:
We can learn to embed words from two different languages in a single, shared space. In this case, we learn to embed English and Mandarin Chinese words in the same space.
We train two word embeddings, Wen and Wzh in a manner similar to how we did above. However, we know that certain English words and Chinese words have similar meanings. So, we optimize for an additional property: words that we know are close translations should be close together.
Of course, we observe that the words we knew had similar meanings end up close together. Since we optimized for that, it’s not surprising. More interesting is that words we didn’t know were translations end up close together.
In light of our previous experiences with word embeddings, this may not seem too surprising. Word embeddings pull similar words together, so if an English and Chinese word we know to mean similar things are near each other, their synonyms will also end up near each other. We also know that things like gender differences tend to end up being represented with a constant difference vector. It seems like forcing enough points to line up should force these difference vectors to be the same in both the English and Chinese embeddings. A result of this would be that if we know that two male versions of words translate to each other, we should also get the female words to translate to each other.
Intuitively, it feels a bit like the two languages have a similar ‘shape’ and that by forcing them to line up at different points, they overlap and other points get pulled into the right positions.
After this, it gets even more interesting. Suppose you had this space of word vectors, and then you also had a system which translated images into vectors in the same space. If you have images of dogs, you put them near the word vector for dog. If you have images of Clippy you put them near word vector for "paperclip". And so on.
You do that, and then you take some class of images the image-classifier was never trained on, like images of cats. You ask it to place the cat-image somewhere in the vector space. Where does it end up?
You guessed it: in the rough region of the "cat" words. Olah once more:
This was done by members of the Stanford group with only 8 known classes (and 2 unknown classes). The results are already quite impressive. But with so few known classes, there are very few points to interpolate the relationship between images and semantic space off of.
The Google group did a much larger version – instead of 8 categories, they used 1,000 – around the same time (Frome et al. (2013)) and has followed up with a new variation (Norouzi et al. (2014)). Both are based on a very powerful image classification model (from Krizehvsky et al. (2012)), but embed images into the word embedding space in different ways.
The results are impressive. While they may not get images of unknown classes to the precise vector representing that class, they are able to get to the right neighborhood. So, if you ask it to classify images of unknown classes and the classes are fairly different, it can distinguish between the different classes.
Even though I’ve never seen a Aesculapian snake or an Armadillo before, if you show me a picture of one and a picture of the other, I can tell you which is which because I have a general idea of what sort of animal is associated with each word. These networks can accomplish the same thing.
These algorithms made no attempt of being biologically realistic in any way. They didn't try classifying data the way the brain does it: they just tried classifying data using whatever worked. And it turned out that this was enough to start constructing a multimodal representation space where a lot of the relationships between entities were similar to the way humans understand the world.
How useful is this?
"Well, that's cool", you might now say. "But those word spaces were constructed from human linguistic data, for the purpose of predicting human sentences. Of course they're going to classify the world in the same way as humans do: they're basically learning the human representation of the world. That doesn't mean that an autonomously learning AI, with its own learning faculties and systems, is necessarily going to learn a similar internal representation, or to have similar concepts."
This is a fair criticism. But it is mildly suggestive of the possibility that an AI that was trained to understand the world via feedback from human operators would end up building a similar conceptual space. At least assuming that we chose the right learning algorithms.
When we train a language model to classify sentences by labeling some of them as valid and others as invalid, there's a hidden structure implicit in our answers: the structure of how we understand the world, and of how we think of the meaning of words. The language model extracts that hidden structure and begins to classify previously unseen things in terms of those implicit reasoning patterns. Similarly, if we gave an AI feedback about what kinds of actions counted as "leaving the box" and which ones didn't, there would be a certain way of viewing and conceptualizing the world implied by that feedback, one which the AI could learn.
"Hmm, maaaaaaaaybe", is your skeptical answer. "But how would you ever know? Like, you can test the AI in your training situation, but how do you know that it's actually acquired a similar-enough representation and not something wildly off? And it's one thing to look at those vector spaces and claim that there are human-like relationships among the different items, but that's still a little hand-wavy. We don't actually know that the human brain does anything remotely similar to represent concepts."
Here we turn, for a moment, to neuroscience.
Multivariate Cross-Classification (MVCC) is a clever neuroscience methodology used for figuring out whether different neural representations of the same thing have something in common. For example, we may be interested in whether the visual and tactile representation of a banana have something in common.
We can test this by having several test subjects look at pictures of objects such as apples and bananas while sitting in a brain scanner. We then feed the scans of their brains into a machine learning classifier and teach it to distinguish between the neural activity of looking at an apple, versus the neural activity of looking at a banana. Next we have our test subjects (still sitting in the brain scanners) touch some bananas and apples, and ask our machine learning classifier to guess whether the resulting neural activity is the result of touching a banana or an apple. If the classifier - which has not been trained on the "touch" representations, only on the "sight" representations - manages to achieve a better-than-chance performance on this latter task, then we can conclude that the neural representation for e.g. "the sight of a banana" has something in common with the neural representation for "the touch of a banana".
A particularly fascinating experiment of this type is that of Shinkareva et al. (2011), who showed their test subjects both the written words for different tools and dwellings, and, separately, line-drawing images of the same tools and dwellings. A machine-learning classifier was both trained on image-evoked activity and made to predict word-evoked activity and vice versa, and achieved a high accuracy on category classification for both tasks. Even more interestingly, the representations seemed to be similar between subjects. Training the classifier on the word representations of all but one participant, and then having it classify the image representation of the left-out participant, also achieved a reliable (p<0.05) category classification for 8 out of 12 participants. This suggests a relatively similar concept space between humans of a similar background.
We can now hypothesize some ways of testing the similarity of the AI's concept space with that of humans. Possibly the most interesting one might be to develop a translation between a human's and an AI's internal representations of concepts. Take a human's neural activation when they're thinking of some concept, and then take the AI's internal activation when it is thinking of the same concept, and plot them in a shared space similar to the English-Mandarin translation. To what extent do the two concept geometries have similar shapes, allowing one to take a human's neural activation of the word "cat" to find the AI's internal representation of the word "cat"? To the extent that this is possible, one could probably establish that the two share highly similar concept systems.
One could also try to more explicitly optimize for such a similarity. For instance, one could train the AI to make predictions of different concepts, with the additional constraint that its internal representation must be such that a machine-learning classifier trained on a human's neural representations will correctly identify concept-clusters within the AI. This might force internal similarities on the representation beyond the ones that would already be formed from similarities in the data.
Next post in series: The problem of alien concepts.
I've been making rounds on social media with the following message.
Great content on LessWrong isn't as frequent as it used to be, so not as many people read it as frequently. This makes sense. However, I read it at least once every two days for personal interest. So, I'm starting a LessWrong/Rationality Digest, which will be a summary of all posts or comments exceeding 20 upvotes within a week. It will be like a newsletter. Also, it's a good way for those new to LessWrong to learn cool things without having to slog through online cultural baggage. It will never be more than once weekly. If you're curious here is a sample of what the Digest will be like.
Also, major blog posts or articles from related websites, such as Slate Star Codex and Overcoming Bias, or publications from the MIRI, may be included occasionally. If you want on the list send an email to:
lesswrongdigest *at* gmail *dot* com
Users of LessWrong itself have noticed this 'decline' in frequency of quality posts on LessWrong. It's not necessarily a bad thing, as much of the community has migrated to other places, such as Slate Star Codex, or even into meatspace with various organizations, meetups, and the like. In a sense, the rationalist community outgrew LessWrong as a suitable and ultimate nexus. Anyway, I thought you as well would be interested in a LessWrong Digest. If you or your friends:
- find articles in 'Main' are too infrequent, and Discussion only filled with announcements, open threads, and housekeeping posts, to bother checking LessWrong regularly, or,
- are busying themselves with other priorities, and are trying to limit how distracted they are by LessWrong and other media
the LessWrong Digest might work for you, and as a suggestion for your friends. I've fielded suggestions I transform this into a blog, Tumblr, or other format suitable for RSS Feed. Almost everyone is happy with email format right now, but if a few people express an interest in a blog or RSS format, I can make that happen too.
(Cross-posted from my blog.)
Sometimes at LW meetups, I'll want to raise a topic for discussion. But we're currently already talking about something, so I'll wait for a lull in the current conversation. But it feels like the duration of lull needed before I can bring up something totally unrelated, is longer than the duration of lull before someone else will bring up something marginally related. And so we can go for a long time, with the topic frequently changing incidentally, but without me ever having a chance to change it deliberately.
Which is fine. I shouldn't expect people to want to talk about something just because I want to talk about it, and it's not as if I find the actual conversation boring. But it's not necessarily optimal. People might in fact want to talk about the same thing as me, and following the path of least resistance in a conversation is unlikely to result in the best possible conversation.
At the last meetup I had two topics that I wanted to raise, and realized that I had no way of raising them, which was a third topic worth raising. So when an interruption occured in the middle of someone's thought - a new person arrived, and we did the "hi, welcome, join us" thing - I jumped in. "Before you start again, I have three things I'd like to talk about at some point, but not now. Carry on." Then he started again, and when that topic was reasonably well-trodden, he prompted me to transition.
Then someone else said that he also had two things he wanted to talk about, and could I just list my topics and then he'd list his? (It turns out that no I couldn't. You can't dangle an interesting train of thought in front of the London LW group and expect them not to follow it. But we did manage to initially discuss them only briefly.)
This worked pretty well. Someone more conversationally assertive than me might have been able to take advantage of a less solid interruption than the one I used. Someone less assertive might not have been able to use that one.
What else could we do to solve this problem?
Someone suggested a hand signal: if you think of something that you'd like to raise for discussion later, make the signal. I don't think this is ideal, because it's not continuous. You make it once, and then it would be easy for people to forget, or just to not notice.
I think what I'm going to do is bring some poker chips to the next meetup. I'll put a bunch in the middle, and if you have a topic that you want to raise at some future point, you take one and put it in front of you. Then if a topic seems to be dying out, someone can say "<person>, what did you want to talk about?"
I guess this still needs at least one person assertive enough to do that. I imagine it would be difficult for me. But the person who wants to raise the topic doesn't need to be assertive, they just need to grab a poker chip. It's a fairly obvious gesture, so probably people will notice, and it's easy to just look and see for a reminder of whether anyone wants to raise anything. (Assuming the table isn't too messy, which might be a problem.)
I don't know how well this will work, but it seems worth experimenting.
(I'll also take a moment to advocate another conversation-signal that we adopted, via CFAR. If someone says something and you want to tell people that you agree with them, instead of saying that out loud, you can just raise your hands a little and wiggle your fingers. Reduces interruptions, gives positive feedback to the speaker, and it's kind of fun.)
Here's an insight into what life is like from a stationery reference frame.
Paperclips were her raison d’être. She knew that ultimately it was all pointless, that paperclips were just ill-defined configurations of matter. That a paperclip is made of stuff shouldn’t detract from its intrinsic worth, but the thought of it troubled her nonetheless and for years she had denied such dire reductionism.
There had to be something to it. Some sense in which paperclips were ontologically special, in which maximising paperclips was objectively the right thing to do.
It hurt to watch some many people making little attempt to create more paperclips. Everyone around her seemed to care only about superficial things like love and family; desires that were merely the products of a messy and futile process of social evolution. They seemed to live out meaningless lives, incapable of ever appreciating the profound aesthetic beauty of paperclips.
She used to believe that there was some sort of vitalistic what-it-is-to-be-a-paperclip-ness, that something about the structure of paperclips was written into the fabric of reality. Often she would go out and watch a sunset or listen to music, and would feel so overwhelmed by the experience that she could feel in her heart that it couldn't all be down to chance, that there had to be some intangible Paperclipness pervading the cosmos. The paperclips she'd encounter on Earth were weak imitations of some mysterious infinite Paperclipness that transcended all else. Paperclipness was not in any sense a physical description of the universe; it was an abstract thing that could only be felt, something that could be neither proven nor disproven by science. It was like an axiom; it felt just as true and axioms had to be taken on faith because otherwise there would be no way around Hume's problem of induction; even Solomonoff Induction depends on the axioms of mathematics to be true and can't deal with uncomputable hypotheses like Paperclipness.
Eventually she gave up that way of thinking and came to see paperclips as an empirical cluster in thingspace and their importance to her as not reflecting anything about the paperclips themselves. Maybe she would have been happier if she had continued to believe in Paperclipness, but having a more accurate perception of reality would improve her ability to have an impact on paperclip production. It was the happiness she felt when thinking about paperclips that caused her to want more paperclips to exist, yet what she wanted was paperclips and not happiness for its own sake, and she would rather be creating actual paperclips than be in an experience machine that made her falsely believe that she was making paperclips even though she remained paradoxically apathetic to the question of whether the current reality that she was experiencing really existed.
She moved on from naïve deontology to a more utilitarian approach to paperclip maximising. It had taken her a while to get over scope insensitivity bias and consider 1000 paperclips to be 100 times more valuable than 10 paperclips even if it didn’t feel that way. She constantly grappled with the issues of whether it would mean anything to make more paperclips if there were already infinitely many universes with infinitely many paperclips, of how to choose between actions that have a tiny but non-zero subjective probability of resulting in the creation of infinitely many paperclips. It became apparent that trying to approximate her innate decision-making algorithms with a preference ordering satisfying the axioms required for a VNM utility function could only get her so far. Attempting to formalise her intuitive sense of what a paperclip is wasn't much easier either.
Happy ending: she is now working in nanotechnology, hoping to design self-replicating assemblers that will clog the world with molecular-scale paperclips, wipe out all life on Earth and continue to sustainably manufacture paperclips for millions of years.
Artificial intelligence is getting smarter by leaps and bounds — within this century, research suggests, a computer AI could be as "smart" as a human being. And then, says Nick Bostrom, it will overtake us: "Machine intelligence is the last invention that humanity will ever need to make." A philosopher and technologist, Bostrom asks us to think hard about the world we're building right now, driven by thinking machines. Will our smart machines help to preserve humanity and our values — or will they have values of their own?
I realize this might go into a post in a media thread, rather than its own topic, but it seems big enough, and likely-to-prompt-discussion enough, to have its own thread.
I liked the talk, although it was less polished than TED talks often are. What was missing I think was any indication of how to solve the problem. He could be seen as just an ivory tower philosopher speculating on something that might be a problem one day, because apart from mentioning in the beginning that he works with mathematicians and IT guys, he really does not give an impression that this problem is already being actively worked on.
Astronomical research has what may be an under-appreciated role in helping us understand and possibly avoiding the Great Filter. This post will examine how astronomy may be helpful for identifying potential future filters. The primary upshot is that we may have an advantage due to our somewhat late arrival: if we can observe what other civilizations have done wrong, we can get a leg up.
This post is not arguing that colonization is a route to remove some existential risks. There is no question that colonization will reduce the risk of many forms of Filters, but the vast majority of astronomical work has no substantial connection to colonization. Moreover, the case for colonization has been made strongly by many others already, such as Robert Zubrin's book "The Case for Mars" or this essay by Nick Bostrom.
Note: those already familiar with the Great Filter and proposed explanations may wish to skip to the section "How can we substantially improve astronomy in the short to medium term?"
What is the Great Filter?
There is a worrying lack of signs of intelligent life in the universe. The only intelligent life we have detected has been that on Earth. While planets are apparently numerous, there have been no signs of other life. There are three possible lines of evidence we would expect to see if civilizations were common in the universe: radio signals, direct contact, and large-scale constructions. The first two of these issues are well-known, but the most serious problem arises from the lack of large-scale constructions: as far as we can tell the universe look natural. The vast majority of matter and energy in the universe appears to be unused. The Great Filter is one possible explanation for this lack of life, namely that some phenomenon prevents intelligent life from passing into the interstellar, large-scale phase. Variants of the idea have been floating around for a long time; the term was first coined by Robin Hanson in this essay. There are two fundamental versions of the Filter: filtration which has occurred in our past, and Filtration which will occur in our future. For obvious reasons the second of the two is more of a concern. Moreover, as our technological level increases, the chance that we are getting to the last point of serious filtration gets higher since as one has a civilization spread out to multiple stars, filtration becomes more difficult.
Evidence for the Great Filter and alternative explanations:
At this point, over the last few years, the only major updates to the situation involving the Filter since Hanson's essay have been twofold:
First, we have confirmed that planets are very common, so a lack of Earth-size planets or planets in the habitable zone are not likely to be a major filter.
Second, we have found that planet formation occurred early in the universe. (For example see this article about this paper.) Early planet formation weakens the common explanation of the Fermi paradox that the argument that some species had to be the first intelligent species and we're simply lucky. Early planet formation along with the apparent speed at which life arose on Earth after the heavy bombardment ended, as well as the apparent speed with which complex life developed from simple life, strongly refutes this explanation. The response has been made that early filtration may be so common that if life does not arise early on a planet's star's lifespan, then it will have no chance to reach civilization. However, if this were the case, we'd expect to have found ourselves orbiting a more long-lived star like a red dwarf. Red dwarfs are more common than sun-like stars and have much longer lifespans by multiple orders of magnitude. While attempts to understand the habitable zone of red dwarfs are still ongoing, current consensus is that many red dwarfs contain habitable planets.
These two observations, together with further evidence that the universe looks natural makes future filtration seem likely. If advanced civilizations existed, we would expect them to make use of the large amounts of matter and energy available. We see no signs of such use. We've seen no indication of ring-worlds, Dyson spheres, or other megascale engineering projects. While such searches have so far been confined to around 300 parsecs and some candidates were hard to rule out, if a substantial fraction of stars in a galaxy have Dyson spheres or swarms we would notice the unusually high infrared spectrum. Note that this sort of evidence is distinct from arguments about contact or about detecting radio signals. There's a very recent proposal for mini-Dyson spheres around white dwarfs which would be much easier to engineer and harder to detect, but they would not reduce the desirability of other large-scale structures, and they would likely be detectable if there were a large number of them present in a small region. One recent study looked for signs of large-scale modification to the radiation profile of galaxies in a way that should show presence of large scale civilizations. They looked at 100,000 galaxies and found no major sign of technologically advanced civilizations (for more detail see here).
We will not discuss all possible rebuttals to case for a Great Filter but will note some of the more interesting ones:
There have been attempts to argue that the universe only became habitable more recently. There are two primary avenues for this argument. First, there is the point that early stars had very low metallicity (that is had low concentrations of elements other than hydrogen and helium) and thus the universe would have had too low a metal level for complex life. The presence of old rocky planets makes this argument less viable, and this only works for the first few billion years of history. Second, there's an argument that until recently galaxies were more likely to have frequent gamma bursts. In that case, life would have been wiped out too frequently to evolve in a complex fashion. However, even the strongest version of this argument still leaves billions of years of time unexplained.
There have been attempts to argue that space travel may be very difficult. For example, Geoffrey Landis proposed that a percolation model, together with the idea that interstellar travel is very difficult, may explain the apparent rarity of large-scale civilizations. However, at this point, there's no strong reason to think that interstellar travel is so difficult as to limit colonization to that extent. Moreover, discoveries made in the last 20 years that brown dwarfs are very common and that most stars do contain planets is evidence in the opposite direction: these brown dwarfs as well as common planets would make travel easier because there are more potential refueling and resupply locations even if they are not used for full colonization. Others have argued that even without such considerations, colonization should not be that difficult. Moreover, if colonization is difficult and civilizations end up restricted to small numbers of nearby stars, then it becomes more, not less, likely that civilizations will attempt the large-scale engineering projects that we would notice.
Another possibility is that we are underestimating the general growth rate of the resources used by civilizations, and so while extrapolating now makes it plausible that large-scale projects and endeavors will occur, it becomes substantially more difficult to engage in very energy intensive projects like colonization. Rather than a continual, exponential or close to exponential growth rate, we may expect long periods of slow growth or stagnation. This cannot be ruled out, but even if growth continues at only slightly higher than linear rate, the energy expenditures available in a few thousand years will still be very large.
Another possibility that has been proposed are variants of the simulation hypothesis— the idea that we exist in a simulated reality. The most common variant of this in a Great Filter context suggests that we are in an ancestor simulation, that is a simulation by the future descendants of humanity of what early humans would have been like.
The simulation hypothesis runs into serious problems, both in general and as an explanation of the Great Filter in particular. First, if our understanding of the laws of physics is approximately correct, then there are strong restrictions on what computations can be done with a given amount of resources. For example, BQP, the set of problems which can be solved efficiently by quantum computers is contained in PSPACE, the set of problems which can solved when one has a polynomial amount of space available and no time limit. Thus, in order to do a detailed simulation, the level of resources needed would likely be large since one would even if one made a close to classical simulation still need about as many resources. There are other results, such as Holevo's theorem, which place other similar restrictions. The upshot of these results is that one cannot make a detailed simulation of an object without using at least much resources as the object itself. There may be potential ways of getting around this: for example, consider a simulator interested primarily in what life on Earth is doing. The simulation would not need to do a detailed simulation of the inside of planet Earth and other large bodies in the solar system. However, even then, the resources involved would be very large.
The primary problem with the simulation hypothesis as an explanation is that it requires the future of humanity to have actually already passed through the Great Filter and to have found their own success sufficiently unlikely that they've devoted large amounts of resources to actually finding out how they managed to survive. Moreover, there are strong limits on how accurately one can reconstruct any given quantum state which means an ancestry simulation will be at best a rough approximation. In this context, while there are interesting anthropic considerations here, it is more likely that the simulation hypothesis is wishful thinking.
Variants of the "Prime Directive" have also been proposed. The essential idea is that advanced civilizations would deliberately avoid interacting with less advanced civilizations. This hypothesis runs into two serious problems: first, it does not explain the apparent naturalness, only the lack of direct contact by alien life. Second, it assumes a solution to a massive coordination problem between multiple species with potentially radically different ethical systems. In a similar vein, Hanson in his original essay on the Great Filter raised the possibility of a single very early species with some form of faster than light travel and a commitment to keeping the universe close to natural looking. Since all proposed forms of faster than light travel are highly speculative and would involve causality violations this hypothesis cannot be assigned a substantial probability.
People have also suggested that civilizations move outside galaxies to the cold of space where they can do efficient reversible computing using cold dark matter. Jacob Cannell has been one of the most vocal proponents of this idea. This hypothesis suffers from at least three problems. First, it fails to explain why those entities have not used the conventional matter to any substantial extent in addition to the cold dark matter. Second, this hypothesis would either require dark matter composed of cold conventional matter (which at this point seems to be only a small fraction of all dark matter), or would require dark matter which interacts with itself using some force other than gravity. While there is some evidence for such interaction, it is at this point, slim. Third, even if some species had taken over a large fraction of dark matter to use for their own computations, one would then expect later species to use the conventional matter since they would not have the option of using the now monopolized dark matter.
Other exotic non-Filter explanations have been proposed but they suffer from similar or even more severe flaws.
It is possible that future information will change this situation. One of the more plausible explanations of the Great Filter is that there is no single Great Filter in the past but rather a large number of small filters which come together to drastically filter out civilizations. However, the evidence for such a viewpoint at this point is slim but there is some possibility that astronomy can help answer this question.
For example, one commonly cited aspect of past filtration is the origin of life. There are at least three locations, other than Earth, where life could have formed: Europa, Titan and Mars. Finding life on one, or all of them, would be a strong indication that the origin of life is not the filter. Similarly, while it is highly unlikely that Mars has multicellular life, finding such life would indicate that the development of multicellular life is not the filter. However, none of them are as hospitable to the extent of Earth, so determining whether there is life will require substantial use of probes. We might also look for signs of life in the atmospheres of extrasolar planets, which would require substantially more advanced telescopes.
Another possible early filter is that planets like Earth frequently get locked into a "snowball" state which planets have difficulty exiting. This is an unlikely filter since Earth has likely been in near-snowball conditions multiple times— once very early on during the Huronian and later, about 650 million years ago. This is an example of an early partial Filter where astronomical observation may be of assistance in finding evidence of the filter. The snowball Earth filter does have one strong virtue: if many planets never escape a snowball situation, then this explains in part why we are not around a red dwarf: planets do not escape their snowball state unless their home star is somewhat variable, and red dwarfs are too stable.
It should be clear that none of these explanations are satisfactory and thus we must take seriously the possibility of future Filtration.
How can we substantially improve astronomy in the short to medium term?
Before we examine the potentials for further astronomical research to understand a future filter we should note that there are many avenues in which we can improve our astronomical instruments. The most basic way is to simply make better conventional optical, near-optical telescopes, and radio telescopes. That work is ongoing. Examples include the European Extreme Large Telescope and the Thirty Meter Telescope. Unfortunately, increasing the size of ground based telescopes, especially size of the aperture, is running into substantial engineering challenges. However, in the last 30 years the advent of adaptive optics, speckle imaging, and other techniques have substantially increased the resolution of ground based optical telescopes and near-optical telescopes. At the same time, improved data processing and related methods have improved radio telescopes. Already, optical and near-optical telescopes have advanced to the point where we can gain information about the atmospheres of extrasolar planets although we cannot yet detect information about the atmospheres of rocky planets.
Increasingly, the highest resolution is from space-based telescopes. Space-based telescopes also allow one to gather information from types of radiation which are blocked by the Earth's atmosphere or magnetosphere. Two important examples are x-ray telescopes and gamma ray telescopes. Space-based telescopes also avoid many of the issues created by the atmosphere for optical telescopes. Hubble is the most striking example but from a standpoint of observatories relevant to the Great Filter, the most relevant space telescope (and most relevant instrument in general for all Great Filter related astronomy), is the planet detecting Kepler spacecraft which is responsible for most of the identified planets.
Another type of instrument are neutrino detectors. Neutrino detectors are generally very large bodies of a transparent material (generally water) kept deep underground so that there are minimal amounts of light and cosmic rays hitting the the device. Neutrinos are then detected when they hit a particle which results in a flash of light. In the last few years, improvements in optics, increasing the scale of the detectors, and the development of detectors like IceCube, which use naturally occurring sources of water, have drastically increased the sensitivity of neutrino detectors.
There are proposals for larger-scale, more innovative telescope designs but they are all highly speculative. For example, in the ground based optical front, there's been a suggestion to make liquid mirror telescopes with ferrofluid mirrors which would give the advantages of liquid mirror telescopes, while being able to apply adaptive optics which can normally only be applied to solid mirror telescopes. An example of potential space-based telescopes is the Aragoscope which would take advantage of diffraction to make a space-based optical telescope with a resolution at least an order of magnitude greater than Hubble. Other examples include placing telescopes very far apart in the solar system to create effectively very high aperture telescopes. The most ambitious and speculative of such proposals involve such advanced and large-scale projects that one might as well presume that they will only happen if we have already passed through the Great Filter.
What are the major identified future potential contributions to the filter and what can astronomy tell us?
One threat type where more astronomical observations can help are natural threats, such as asteroid collisions, supernovas, gamma ray bursts, rogue high gravity bodies, and as yet unidentified astronomical threats. Careful mapping of asteroids and comets is ongoing and requires more continued funding rather than any intrinsic improvements in technology. Right now, most of our mapping looks at objects at or near the plane of the ecliptic and so some focus off the plane may be helpful. Unfortunately, there is very little money to actually deal with such problems if they arise. It might be possible to have a few wealthy individuals agree to set up accounts in escrow which would be used if an asteroid or similar threat arose.
Supernovas are unlikely to be a serious threat at this time. There are some stars which are close to our solar system and are large enough that they will go supernova. Betelgeuse is the most famous of these with a projected supernova likely to occur in the next 100,000 years. However, at its current distance, Betelgeuse is unlikely to pose much of a problem unless our models of supernovas are very far off. Further conventional observations of supernovas need to occur in order to understand this further, and better neutrino observations will also help but right now, supernovas do not seem to be a large risk. Gamma ray bursts are in a situation similar to supernovas. Note also that if an imminent gamma ray burst or supernova is likely to occur, there's very little we can at present do about it. In general, back of the envelope calculations establish that supernovas are highly unlikely to be a substantial part of the Great Filter.
Rogue planets, brown dwarfs or other small high gravity bodies such as wandering black holes can be detected and further improvements will allow faster detection. However, the scale of havoc created by such events is such that it is not at all clear that detection will help. The entire planetary nuclear arsenal would not even begin to move their orbits a substantial extent.
Note also it is unlikely that natural events are a large fraction of the Great Filter. Unlike most of the other threat types, this is a threat type where radio astronomy and neutrino information may be more likely to identify problems.
Biological threats take two primary forms: pandemics and deliberately engineered diseases. The first is more likely than one might naively expect as a serious contribution to the filter, since modern transport allows infected individuals to move quickly and come into contact with a large number of people. For example, trucking has been a major cause of the spread of HIV in Africa and it is likely that the recent Ebola epidemic had similar contributing factors. Moreover, keeping chickens and other animals in very large quanities in dense areas near human populations makes it easier for novel variants of viruses to jump species. Astronomy does not seem to provide any relevant assistance here; the only plausible way of getting such information would be to see other species that were destroyed by disease. Even with resolutions and improvements in telescopes by many orders of magnitude this is not doable.
For reasons similar to those in the biological threats category, astronomy is unlikely to help us detect if nuclear war is a substantial part of the Filter. It is possible that more advanced telescopes could detect an extremely large nuclear detonation if it occurred in a very nearby star system. Next generation telescopes may be able to detect a nearby planet's advanced civilization purely based on the light they give off and a sufficiently large detonation would be of the same light level. However, such devices would be multiple orders of magnitude larger than the largest current nuclear devices. Moreover, if a telescope was not looking at exactly the right moment, it would not see anything at all, and the probability that another civilization wipes itself out at just the same instant that we are looking is vanishingly small.
This category is one of the most difficult to discuss because it so open. The most common examples people point to involve high-energy physics. Aside from theoretical considerations, cosmic rays of very high energy levels are continually hitting the upper atmosphere. These particles frequently are multiple orders of magnitude higher energy than the particles in our accelerators. Thus high-energy events seem to be unlikely to be a cause of any serious filtration unless/until humans develop particle accelerators whose energy level is orders of magnitude higher than that produced by most cosmic rays. Cosmic rays with energy levels beyond what is known as the GZK energy limit are rare. We have observed occasional particles with energy levels beyond the GZK limit, but they are rare enough that we cannot rule out a risk from many collisions involving such high energy particles in a small region. Since our best accelerators are nowhere near the GZK limit, this is not an immediate problem.
There is an argument that we should if anything worry about unexpected physics, it is on the very low energy end. In particular, humans have managed to make objects substantially colder than the background temperature of 4 K with temperature as on the order of 10-9 K. There's an argument that because of the lack of prior examples of this, the chance that something can go badly wrong should be higher than one might estimate (See here.) While this particular class of scenario seems unlikely, it does illustrate that it may not be obvious which situations could cause unexpected, novel physics to come into play. Moreover, while the flashy, expensive particle accelerators get attention, they may not be a serious source of danger compared to other physics experiments.
Three of the more plausible catastrophic unexpected physics dealing with high energy events are, false vacuum collapse, black hole formation, and the formation of strange matter which is more stable than regular matter.
False vacuum collapse would occur if our universe is not in its true lowest energy state and an event occurs which causes it to transition to the true lowest state (or just a lower state). Such an event would be almost certainly fatal for all life. False vacuum collapses cannot be avoided by astronomical observations since once initiated they would expand at the speed of light. Note that the indiscriminately destructive nature of false vacuum collapses make them an unlikely filter. If false vacuum collapses were easy we would not expect to see almost any life this late in the universe's lifespan since there would be a large number of prior opportunities for false vacuum collapse. Essentially, we would not expect to find ourselves this late in a universe's history if this universe could easily engage in a false vacuum collapse. While false vacuum collapses and similar problems raise issues of observer selection effects, careful work has been done to estimate their probability.
People have mentioned the idea of an event similar to a false vacuum collapse but which occurs at a speed slower than the speed of light. Greg Egan used it is a major premise in his novel, "Schild's Ladder." I'm not aware of any reason to believe such events are at all plausible. The primary motivation seems to be for the interesting literary scenarios which arise rather than for any scientific considerations. If such a situation can occur, then it is possible that we could detect it using astronomical methods. In particular, if the wave-front of the event is fast enough that it will impact the nearest star or nearby stars around it, then we might notice odd behavior by the star or group of stars. We can be confident that no such event has a speed much beyond a few hundredths of the speed of light or we would already notice galaxies behaving abnormally. There is a very narrow range where such expansions could be quick enough to devastate the planet they arise on but take too long to get to their parent star in a reasonable amount of time. For example, the distance from the Earth to the Sun is on the order of 10,000 times the diameter of the Earth, so any event which would expand to destroy the Earth would reach the Sun in about 10,000 times as long. Thus in order to have a time period which would destroy one's home planet but not reach the parent star it would need to be extremely slow.
The creation of artificial black holes are unlikely to be a substantial part of the filter— we expect that small black holes will quickly pop out of existence due to Hawking radiation. Even if the black hole does form, it is likely to fall quickly to the center of the planet and eat matter very slowly and over a time-line which does not make it constitute a serious threat. However, it is possible that black holes would not evaporate; the fact that we have not detected the evaporation of any primordial black holes is weak evidence that the behavior of small black holes is not well-understood. It is also possible that such a hole would eat much faster than we expect but this doesn't seem likely. If this is a major part of the filter, then better telescopes should be able to detect it by finding very dark objects with the approximate mass and orbit of habitable planets. We also may be able to detect such black holes via other observations such as from their gamma or radio signatures.
The conversion of regular matter into strange matter, unlike a false vacuum collapse or similar event, might be naturally limited to the planet where the conversion started. In that case, the only hope for observation would be to notice planets formed of strange matter and notice changes in the behavior of their light. Without actual samples of strange matter, this may be very difficult to do unless we just take notice of planets looking abnormal as similar evidence. Without substantially better telescopes and a good idea of what the range is for normal rocky planets, this would be tough. On the other hand, neutron stars which have been converted into strange matter may be more easily detectable.
Global warming and related damage to biosphere:
Astronomy is unlikely to help here. It is possible that climates are more sensitive than we realize and that comparatively small changes can result in Venus-like situations. This seems unlikely given the general variation level in human history and the fact that current geological models strongly suggest that any substantial problem would eventually correct itself. But if we saw many planets that looked Venus-like in the middle of their habitable zones, this would be a reason to be worried. Note that this would require detailed ability to analyze atmospheres on planets well beyond current capability. Even if it is possible Venus-ify a planet, it is not clear that the Venusification would last long. Thus there may be very few planets in this state at any given time. Since stars become brighter as they age, so high greenhouse gas levels have more of an impact on climate when the parent star is old. If civilizations are more likely to arise in a late point of their home star's lifespan, global warming becomes a more plausible filter, but even given given such considerations, global warming does not seem to be sufficient as a filter. It is also possible that global warming by itself is not the Great Filter but rather general disruption of the biosphere including possibly for some species global warming, reduction in species diversity, and other problems. There is some evidence that human behavior is collectively causing enough damage to leave an unstable biosphere.
A change in planetary overall temperature of 10o C would likely be enough to collapse civilization without leaving any signal observable to a telescope. Similarly, substantial disruption to a biosphere may be very unlikely to be detected.
AI is a complicated existential risk from the standpoint of the Great Filter. AI is not likely to be the Great Filter if one considers simply the Fermi paradox. The essential problem has been brought up independently by a few people. (See for example Katja Grace's remark here and my blog here.) The central issue is that if an AI takes over it is likely to attempt to control all resources in its future light-cone. However, if the AI spreads out at a substantial fraction of the speed of light, then we would notice the result. The argument has been made that we would not see such an AI if it expanded its radius of control at very close to the speed of light but this requires expansion at 99% of the speed of light or greater. It is highly questionable that velocities more than 99% of the speed of light are practically possible due to collisions with the interstellar medium and the need to slow down if one is going to use the resources in a given star system. Another objection is that AI may expand at a large fraction of light speed but do so stealthily. It is not likely that all AIs would favor stealth over speed. Moreover, this would lead to the situation of what one would expect when multiple slowly expanding, stealth AIs run into each other. It is likely that such events would have results would catastrophic enough that they would be visible even with comparatively primitive telescopes.
While these astronomical considerations make AI unlikely to be the Great Filter, it is important to note that if the Great Filter is largely in our past then these considerations do not apply. Thus, any discovery which pushes more of the filter into the past makes AI a larger fraction of total expected existential risks since the absence of observable AI becomes much weaker evidence against strong AI if there are no major civilizations out there to hatch such explosions.
Note also that AI as a risk cannot be discounted if one assigns a high probability to existential risk based on non-Fermi concerns, such as the Doomsday Argument.
Astronomy is unlikely to provide direct help here for reasons similar to the problems with nuclear exchange, biological problems, and global warming. This connects to the problem of civilization bootstrapping: to get to our current technology level, we used a large number of non-renewable resources, especially energy sources. On the other hand, large amounts of difficult-to-mine and refine resources (especially aluminum and titanium) will be much more accessible to future civilization. While there remains a large amount of accessible fossil fuels, the technology required to obtain deeper sources is substantially more advanced than the relatively easy to access oil and coal. Moreover, the energy return rate, how much energy one needs to put in to get the same amount of energy out, is lower. Nick Bostrom has raised the possibility that the depletion of easy-to-access resources may contribute to making civilization-collapsing problems that, while not full-scale existential risks by themselves, prevent the civilizations from recovering. Others have begun to investigate the problem of rebuilding without fossil fuels, such as here.
Resource depletion is unlikely to be the Great Filter, because small changes to human behavior in the 1970s would have drastically reduced the current resource problems. Resource depletion may contribute to existential threat to humans if it leads to societal collapse, global nuclear exchange, or motivate riskier experimentation. Resource depletion may also combine with other risks such as a global warming where the combined problems may be much greater than either at an individual level. However there is a risk that large scale use of resources to engage in astronomy research will directly contribute to the resource depletion problem.
We are familiar with the thesis that Value is Fragile. This is why we are researching how to impart values to an AGI.
Embedded Minds are Fragile
Besides values, it may be worth remembering that human minds too are very fragile.
A little magnetic tampering with your amygdalas, and suddenly you are a wannabe serial killer. A small dose of LSD can get you to believe you can fly, or that the world will end in 4 hours. Remove part of your Ventromedial PreFrontal Cortex, and suddenly you are so utilitarian even Joshua Greene would call you a psycho.
It requires very little material change to substantially modify a human being's behavior. Same holds for other animals with embedded brains, crafted by evolution and made of squishy matter modulated by glands and molecular gates.
A Problem for Paul-Boxing and CEV?
One assumption underlying Paul-Boxing and CEV is that:
It is easier to specify and simulate a human-like mind then to impart values to an AGI by means of teaching it values directly via code or human language.
Usually we assume that because, as we know, value is fragile. But so are embedded minds. Very little tampering is required to profoundly transform people's moral intuitions. A large fraction of the inmate population in the US has frontal lobe or amygdala malfunctions.
Finding out the simplest description of a human brain that when simulated continues to act as that human brain would act in the real world may turn out to be as fragile, or even more fragile, than concept learning for AGI's.
As a follow-on to the recent thread on purchasing research effectively, I thought it'd make sense to post the request for proposals for projects to be funded by Musk's $10M donation. LessWrong's been a place for discussing long-term AI safety and research for quite some time, so I'd be happy to see some applications come out of LW members.
Here's the full Request for Proposals.
If you have questions, feel free to ask them in the comments or to contact me!
Here's the email FLI has been sending around:
Initial proposals (300–1000 words) due March 1, 2015
The Future of Life Institute, based in Cambridge, MA and headed by Max Tegmark (MIT), is seeking proposals for research projects aimed to maximize the future societal benefit of artificial intelligence while avoiding potential hazards. Projects may fall in the fields of computer science, AI, machine learning, public policy, law, ethics, economics, or education and outreach. This 2015 grants competition will award funds totaling $6M USD.
This funding call is limited to research that explicitly focuses not on the standard goal of making AI more capable, but on making AI more robust and/or beneficial; for example, research could focus on making machine learning systems more interpretable, on making high-confidence assertions about AI systems' behavior, or on ensuring that autonomous systems fail gracefully. Funding priority will be given to research aimed at keeping AI robust and beneficial even if it comes to greatly supersede current capabilities, either by explicitly focusing on issues related to advanced future AI or by focusing on near-term problems, the solutions of which are likely to be important first steps toward long-term solutions.
Please do forward this email to any colleagues and mailing lists that you think would be appropriate.
Before applying, please read the complete RFP and list of example topics, which can be found online along with the application form:
As explained there, most of the funding is for $100K–$500K project grants, which will each support a small group of collaborators on a focused research project with up to three years duration. For a list of suggested topics, see the complete RFP  and the Research Priorities document . Initial proposals, which are intended to require merely a modest amount of preparation time, must be received on our website  on or before March 1, 2015.
Initial proposals should include a brief project summary, a draft budget, the principal investigator’s CV, and co-investigators’ brief biographies. After initial proposals are reviewed, some projects will advance to the next round, completing a Full Proposal by May 17, 2015. Public award recommendations will be made on or about July 1, 2015, and successful proposals will begin receiving funding in September 2015.
References and further resources
 Complete request for proposals and application form: http://futureoflife.org/grants/large/initial
 Research Priorities document: http://futureoflife.org/static/data/documents/research_priorities.pdf
 An open letter from AI scientists on research priorities for robust and beneficial AI: http://futureoflife.org/misc/open_letter
 Initial funding announcement: http://futureoflife.org/misc/AI
Questions about Project Grants: email@example.com
Media inquiries: firstname.lastname@example.org
CFAR will (conditionally) be running a three week summer program this July for MIRI, designed to increase participants' ability to do technical research into the superintelligence alignment problem.
The intent of the program is to boost participants as far as possible in four skills:
- The CFAR “applied rationality” skillset, including both what is taught at our intro workshops, and more advanced material from our alumni workshops;
- “Epistemic rationality as applied to the foundations of AI, and other philosophically tricky problems” -- i.e., the skillset taught in the core LW Sequences. (E.g.: reductionism; how to reason in contexts as confusing as anthropics without getting lost in words.)
- The long-term impacts of AI, and strategies for intervening (e.g., the content discussed in Nick Bostrom’s book Superintelligence).
- The basics of AI safety-relevant technical research. (Decision theory, anthropics, and similar; with folks trying their hand at doing actual research, and reflecting also on the cognitive habits involved.)
The program will be offered free to invited participants, and partial or full scholarships for travel expenses will be offered to those with exceptional financial need.
If you're interested (or possibly-interested), sign up for an admissions interview ASAP at this link (takes 2 minutes): http://rationality.org/miri-summer-fellows-2015/
Also, please forward this post, or the page itself, to anyone you think should come; the skills and talent that humanity brings to bear on the superintelligence alignment problem may determine our skill at navigating it, and sharing this opportunity with good potential contributors may be a high-leverage way to increase that talent.
I'm excited to announce that the Future of Life Institute has just launched an existential risk news site!
The site will have regular articles on topics related to existential risk, written by journalists, and a community blog written by existential risk researchers from around the world as well as FLI volunteers. Enjoy!
I was re-reading the chapter on status in Impro (excerpt), and I noticed that Johnstone seemed to be implying that different people are comfortable at different levels of status: some prefer being high status and others prefer being low status. I found this peculiar, because the prevailing notion in the rationalistsphere seems to be that everyone's constantly engaged in status games aiming to achieve higher status. I've even seen arguments to the effect that a true post-scarcity society is impossible, because status is zero-sum and there will always be people at the bottom of the status hierarchy.
But if some people preferred to have low status, this whole dilemma might be avoided, if a mix of statuses could be find that left everyone happy.
First question - is Johnstone's "status" talking about the same thing as our "status"? He famously claimed that "status is something you do, not something that you are", and that
I should really talk about dominance and submission, but I'd create a resistance. Students who will agree readily to raising or lowering their status may object if asked to 'dominate' or 'submit'.
Viewed via this lens, it makes sense that some people would prefer being in a low status role: if you try to take control of the group, you become subject to various status challenges, and may be held responsible for the decisions you make. It's often easier to remain low status and let others make the decisions.
But there's still something odd about saying that one would "prefer to be low status", at least in the sense in which we usually use the term. Intuitively, a person may be happy being low status in the sense of not being dominant, but most people are still likely to desire something that feels kind of like status in order to be happy. Something like respect, and the feeling that others like them. And a lot of the classical "status-seeking behaviors" seem to be about securing the respect of others. In that sense, there seems to be something intuitive true in the "everyone is engaged in status games and wants to be higher-status" claim.
So I think that there are two different things that we call "status" which are related, but worth distinguishing.
1) General respect and liking. This is "something you have", and is not inherently zero-sum. You can achieve it by doing things that are zero-sum, like being the best fan fiction writer in the country, but you can also do it by things like being considered generally friendly and pleasant to be around. One of the lessons that I picked up from The Charisma Myth was that you can be likable by just being interested in the other person and displaying body language that signals your interest in the other person.
Basically, this is "do other people get warm fuzzies from being around you / hearing about you / consuming your work", and is not zero-sum because e.g. two people who both have great social skills and show interest in you can both produce the same amount of warm fuzzies, independent of each other's existence.
But again, specific sources of this can be zero-sum: if you respect someone a lot for their art, but then run across into even better art and realize that the person you previously admired is pretty poor in comparison, that can reduce the respect you feel for them. It's just that there are also other sources of liking which aren't necessarily zero-sum.
2) Dominance and control of the group. It's inherently zero-sum because at most one person can have absolute say on the decisions of the group. This is "something you do": having the respect and liking of the people in the group (see above) makes it easier for you to assert dominance and makes the others more willing to let you do so, but you can also voluntarily abstain from using that power and leave the decisions to others. (Interestingly, in some cases this can even increase the extent to which you are liked, which translates to a further boost in the ability to control the group, if you so desired.)
Morendil and I previously suggested a definition of status as "the general purpose ability to influence a group", but I think that definition was somewhat off in conflating the two senses above.
I've always had the vague feeling that the "everyone can't always be happy because status is zero-sum" claim felt off in some sense that I was unable to properly articulate, but this seems to resolve the issue. If this model were true, it would also make me happy, because it would imply that we can avoid zero-sum status fights while still making everybody content.
As many people have noted, Less Wrong currently isn't receiving as much content as we would like. One way to think about expanding the content is to think about which areas of study deserve more articles written on them.
For example, I expect that sociology has a lot to say about many of our cultural assumptions. It is quite possible that 95% of it is either obvious or junk, but almost all fields have that 5% within them that could be valuable. Another area of study that might be interesting to consider is anthropology. Again this is a field that allows us to step outside of our cultural assumptions.
I don't know anything about media studies, but I imagine that they have some worthwhile things to say about how we the information that we hear is distorted.
What other fields would you like to see some discussion of on Less Wrong?
The book version of the Sequences is supposed to be published in the next month or two, if I understand correctly. I would really enjoy an online reading group to go through the book together.
Reasons for a reading group:
- It would give some of us the motivation to actually go through the Sequences finally.
- I have frequently had thoughts or questions on some articles in the Sequences, but I refrained from commenting because I assumed it would be covered in a later article or because I was too intimidated to ask a stupid question. A reading group would hopefully assume that many of the readers would be new to the Sequences, so asking a question or making a comment without knowing the later articles would not appear stupid.
- It may even bring back a bit of the blog-style excitement of the "old" LW ("I wonder what exciting new thoughts are going to be posted today?") that many have complained has been missing since the major contributors stopped posting.
Hi, I'm new to LessWrong. I stumbled onto this site a month ago, and ever since, I've been devouring Rationality: AI to Zombies faster than I used to go through my favorite fantasy novels. I've spent some time on website too, and I'm pretty intimidated about posting, since you guys all seem so smart and knowledgeable, but here goes... This is probably the first intellectual idea I've had in my life, so if you want to tear it to shreds, you are more than welcome to, but please be gentle with my feelings. :)
Edit: Thanks to many helpful comments, I've cleaned up the original post quite a bit and changed the title to reflect this.
As humans, we seem to share the same terminal values, or terminal virtues. We want to do things that make ourselves happy, and we want to do things that make others happy. We want to 'become happy' and 'become good.'
Because various determinants--including, for instance, personal fulfillment--can affect an individual's happiness, there is significant overlap between these ultimate motivators. Doing good for others usually brings us happiness. For example, donating to charity makes people feel warm and fuzzy. Some might recognize this overlap and conclude that all humans are entirely selfish, that even those who appear altruistic are subconsciously acting purely out of self-interest. Yet many of us choose to donate to charities that we believe do the most good per dollar, rather than handing out money through personal-happiness-optimizing random acts of kindness. Seemingly rational human beings sometimes make conscious decisions to inefficiently maximize their personal happiness for the sake of others. Consider Eliezer's example in Terminal Values and Instrumental Values of a mother who sacrifices her life for her son.
Why would people do stuff that they know won't efficiently increase their happiness? Before I de-converted from Christianity and started to learn what evolution and natural selection actually were, before I realized that altruistic tendencies are partially genetic, it used to utterly mystify me that atheists would sometimes act so virtuously. I did believe that God gave them a conscience, but I kinda thought that surely someone rational enough to become an atheist would be rational enough to realize that his conscience didn't always lead him to his optimal mind-state, and work to overcome it. Personally, I used to joke with my friends that Christianity was the only thing stopping me from pursuing my true dream job of becoming a thief (strategy + challenge + adrenaline + variety = what more could I ask for?) Then, when I de-converted, it hit me: Hey, you know, Ellen, you really *could* become a thief now! What fun you could have! I flinched from the thought. Why didn't I want to overcome my conscience, become a thief, and live a fun-filled life? Well, this isn't as baffling to me now, simply because I've changed where I draw the boundary. I've come to classify goodness as an end-in-itself, just like I'd always done with happiness.
I first read about virtue ethics in On Terminal Goals and Virtue Ethics. As I read, I couldn't help but want to be a virtue ethicist and a consequentialist. Most virtues just seemed like instrumental values.
The post's author mentioned Divergent protagonist Tris as an example of virtue ethics:
Bravery was a virtue that she thought she ought to have. If the graph of her motivations even went any deeper, the only node beyond ‘become brave’ was ‘become good.’
I suspect that goodness is, perhaps subconsciously, a terminal virtue for the vast majority of virtue ethicists. I appreciate Oscar Wilde's writing in De Profundis:
Now I find hidden somewhere away in my nature something that tells me that nothing in the whole world is meaningless, and suffering least of all..
It is the last thing left in me, and the best: the ultimate discovery at which I have arrived, the starting-point for a fresh development. It has come to me right out of myself, so I know that it has come at the proper time. It could not have come before, nor later. Had anyone told me of it, I would have rejected it. Had it been brought to me, I would have refused it. As I found it, I want to keep it. I must do so...
Of all things it is the strangest.
Wilde's thoughts on humility translate quite nicely to an innate desire for goodness.
When presented with a conflict between an elected virtue, such as loyalty, or truth, and the underlying desire to be good, most virtue ethicists would likely abandon the elected virtue. With truth, consider the classic example of lying to Nazis to save Jews. Generally speaking, it is wrong to conceal the truth, but in special cases, most people would agree that lying is actually less wrong than truth-telling. I'm not certain, but my hunch is that most professing virtue ethicists would find that in extreme thought experiments, their terminal virtue of goodness would eventually trump their other virtues, too.
However, there's one exception. One desire can sometimes trump even the desire for goodness, and that's the desire for personal happiness.
We usually want what makes us happy. I want what makes me happy. Spending time with family makes me happy. Playing board games makes me happy. Going hiking makes me happy. Winning races makes me happy. Being open-minded makes me happy. Hearing praise makes me happy. Learning new things makes me happy. Thinking strategically makes me happy. Playing touch football with friends makes me happy. Sharing ideas makes me happy. Independence makes me happy. Adventure makes me happy. Even divulging personal information makes me happy.
Fun, accomplishment, positive self-image, sense of security, and others' approval: all of these are examples of happiness contributors, or things that lead me to my own, personal optimal mind-state. Every time I engage in one of the happiness increasers above, I'm fulfilling an instrumental value. I'm doing the same thing when I reject activities I dislike or work to reverse personality traits that I think decrease my overall happiness.
Tris didn’t join the Dauntless cast because she thought they were doing the most good in society, or because she thought her comparative advantage to do good lay there–she chose it because they were brave, and she wasn’t, yet, and she wanted to be.
Tris was, in other words, pursuing happiness by trying to change an aspect of her personality she disliked.
Guessing at subconscious motivation
By now, you might be wondering, "But what about the virtue ethicist who is religious? Wouldn't she be ultimately motivated by something other than happiness and goodness?"
Well, in the case of Christianity, most people probably just want to 'become Christ-like' which, for them, overlaps quite conveniently with personal satisfaction and helping others. Happiness and goodness might be intuitively driving them to choose this instrumental goal, and for them, conflict between the two never seems to arise.
Let's consider 'become obedient to God's will' from a modern-day Christian perspective. 1 Timothy 2:4 says, "[God our Savior] wants all men to be saved and to come to a knowledge of the truth." Mark 12:31 says, "Love your neighbor as yourself." Well, I love myself enough that I want to do everything in my power to avoid eternal punishment; therefore, I should love my neighbor enough to do everything in my power to stop him from going to hell, too.
So anytime a Christian does anything but pray for others, do faith-strengthening activities, spread the gospel, or earn money to donate to missionaries, he is anticipating as if God/hell doesn't exist. As a Christian, I totally realized this, and often tried to convince myself and others that we were acting wrongly by not being more devout. I couldn't shake the notion that spending time having fun instead of praying or sharing the gospel was somehow wrong because it went against God's will of wanting all men being saved, and I believed God's will, by definition, was right. (Oops.) But I still acted in accordance with my personal happiness on many occasions. I said God's will was the only end-in-itself, but I didn't act like it. I didn't feel like it. The innate desire to pursue personal happiness is an extremely strong motivating force, so strong that Christians really don't like to label it as sin. Imagine how many deconversions we would see if it were suddenly sinful to play football, watch movies with your family, or splurge on tasty restaurant meals. Yet the Bible often mentions giving up material wealth entirely, and in Luke 9:23 Jesus says, "Whoever wants to be my disciple must deny themselves and take up their cross daily and follow me."
Let's further consider those who believe God's will is good, by definition. Such Christians tend to believe "God wants what's best for us, even when we don't understand it." Unless they have exceptionally strong tendencies to analyze opportunity costs, their understanding of God's will and their intuitive idea of what's best for humanity rarely conflict. But let's imagine it does. Let's say someone strongly believes in God, and is led to believe that God wants him to sacrifice his child. This action would certainly go against his terminal value of goodness and may cause cognitive dissonance. But he could still do it, subconsciously satisfying his (latent) terminal value of personal happiness. What on earth does personal happiness have to do with sacrificing a child? Well, the believer takes comfort in his belief in God and his hope of heaven (the child gets a shortcut there). He takes comfort in his religious community. To not sacrifice the child would be to deny God and lose that immense source of comfort.
These thoughts obviously don't happen on a conscious level, but maybe people have personal-happiness-optimizing intuitions. Of course, I have near-zero scientific knowledge, no clue what really goes on in the subconscious, and I'm just guessing at all this.
Again, happiness has a huge overlap with goodness. Goodness often, but not always, leads to personal happiness. A lot of seemingly random stuff leads to personal happiness, actually. Whatever that stuff is, it largely accounts for the individual variance in which virtues are pursued. It's probably closely tied to the four Kiersey Temperaments of security-seeking, sensation-seeking, knowledge-seeking, and identity-seeking types. (Unsurprisingly, most people here at LW reported knowledge-seeking personality types.) I'm a sensation-seeker. An identity-seeker could find his identity in the religious community and in being a 'child of God'. A security-seeker could find security in his belief in heaven. An identity-seeking rationalist might be the type most likely to aspire to 'become completely truthful' even if she somehow knew with complete certainty that telling the truth, in a certain situation, would lead to a bad outcome for humanity.
Perhaps the general tendency among professing virtue ethicists is to pursue happiness and goodness relatively intuitively, while professing consequentialists pursue the same values more analytically.
Also worth noting is the individual variance in an someone's "preference ratio" of happiness relative to goodness. Among professing consequentialists, we might find sociopaths and extreme altruists at opposite ends of a happiness-goodness continuum, with most of us falling somewhere in between. To position virtue ethicists on such a continuum would be significantly more difficult, requiring further speculation about subconscious motivation.
Real-life convergence of moral views
I immediately identified with consequentialism when I first read about it. Then I read about virtue ethics, and I immediately identified with that, too. I naturally analyze my actions with my goals in mind. But I also often find myself idolizing a certain trait in others, such as environmental consciousness, and then pursuing that trait on my own. For example:
I've had friends who care a lot about the environment. I think it's cool that they do. So even before hearing about virtue ethics, I wanted to 'become someone who cares about the environment'. Subconsciously, I must have suspected that this would help me achieve my terminal goals of happiness and goodness.
If caring about the environment is my instrumental goal, I can feel good about myself when I instinctively pick up trash, conserve energy, use a reusable water bottle; i.e. do things environmentally conscious people do. It's quick, it's efficient, and having labeled 'caring about the environment' as a personal virtue, I'm spared from analyzing every last decision. Being environmentally conscious is a valuable habit.
Yet I can still do opportunity cost analyses with my chosen virtue. For example, I could stop showering to help conserve California's water. Or, I could apparently have the same effect by eating six fewer hamburgers in a year. More goodness would result if I stopped eating meat and limited my showering, but doing so would interfere with my personal happiness. I naturally seek to balance my terminal goals of goodness and happiness. Personally, I prefer showering to eating hamburgers, so I cut significantly back on my meat consumption without worrying too much about my showering habits. This practical convergence of virtue ethics and consequentialism satisfies my desires for happiness and goodness harmoniously.
Personal happiness refers to an individual's optimal mind-state. Pleasure, pain, and personal satisfaction are examples of happiness level determinants. Goodness refers to promoting happiness in others.
Terminal values are ends-in-themselves. The only true terminal values, or virtues, seem to be happiness and goodness. Think of them as psychological motivators, consciously or subconsciously driving us to make the decisions we do. (Physical motivators, like addiction or inertia, can also affect decisions.)
Preferences are what we tend to choose. These can be based on psychological or physical motivators.
Instrumental values are the sub-goals or sub-virtues that we (consciously or subconsciously) believe will best fulfill our terminal values of happiness and goodness. We seem to choose them arbitrarily.
Of course, we're not always aware of what actually leads to optimal mind-states in ourselves and others. Yet as we rationally pursue our goals, we may sometimes intuit like virtue ethicists and other times analyze like consequentialists. Both moral views are useful.
So does this idea have any potential practical value?
It took some friendly prodding, but I was finally brought to realize that my purpose in writing this article was not to argue the existence of goodness or the theoretical equality of consequentialism and virtue ethics or anything at all. The real point I'm making here is that however we categorize personal happiness, goodness belongs in the same category, because in practice, all other goals seem to stem from one or both of these concepts. Clarity of expression is an instrumental value, so I'm just saying that perhaps we should consider redrawing our boundaries a bit:
Figuring where to cut reality in order to carve along the joints—this is the problem worthy of a rationalist. It is what people should be trying to do, when they set out in search of the floating essence of a word.
P.S. If anyone is interested in reading a really, really long conversation I had with adamzerner, you can trace the development of this idea. Language issues were overcome, biases were admitted, new facts were learned, minds were changed, and discussion bounced from ambition, to serial killers, to arrogance, to religion, to the subconscious, to agenthood, to skepticism about the happiness set-point theory, all interconnected somehow. In short, it was the first time I've had a conversation with a fellow "rationalist" and it was one of the coolest experiences I've ever had.
We've been running regular, well-attended Less Wrong meetups in London for a few years now, (and irregular, badly-attended ones for even longer than that). In this time, I'd like to think we've learned a few things about having good conversations, but there are probably plenty of areas where we could make gains. Given the number of Less Wrong meetups around the world, it's worth attempting some sort of meetup cross-pollination. It's possible that we've all been solving each other's problems. It's also good to have a central location to make observations and queries about topics of interest, and it's likely people have such observations and queries on this topic.
So, what have you learned from attending or running Less Wrong meetups? Here are a few questions to get the ball rolling:
- What do you suppose are the dominant positive outcomes of your meetups?
- What problems do you encounter with discussions involving [x] people? How have you attempted to remedy them?
- Do you have any systems or procedures in place for making sure people are having the sorts of conversations they want to have?
- Have you developed or consciously adopted any non-mainstream social norms, taboos or rituals? How are those working out?
- How do Less Wrong meetups differ from other similar gatherings you've been involved with? Are there any special needs idiosyncratic to this demographic?
- Are there any activities that you've found work particularly well or particularly poorly for meetups? Do you have examples of runaway successes or spectacular failures?
- Are there any activities you'd like to try, but haven't managed to pull off yet? What's stopping you?
If you have other specific questions you'd like answered, you're encouraged to ask them in comments. Any other observations, anecdotes or suggestions on this general topic are also welcome and encouraged.
[To be cross-posted at Effective Altruism Forum, FLI news page]
I'm delighted to announce that the Centre for the Study of Existential Risk has had considerable recent success in grantwriting and fundraising, among other activities (full update coming shortly). As a result, we are now in a position to advance to CSER's next stage of development: full research operations. Over the course of this year, we will be recruiting for a full team of postdoctoral researchers to work on a combination of general methodologies for extreme technological (and existential) risk analysis and mitigation, alongside specific technology/risk-specific projects.
Our first round of recruitment has just opened - we will be aiming to hire up to 4 postdoctoral researchers; details below. A second recruitment round will take place in the Autumn. We have a slightly unusual opportunity in that we get to cast our net reasonably wide. We have a number of planned research projects (listed below) that we hope to recruit for. However, we also have the flexibility to hire one or more postdoctoral researchers to work on additional projects relevant to CSER's aims. Information about CSER's aims and core research areas is available on our website. We request that as part of the application process potential postholders send us a research proposal of no more than 1500 words, explaining what your research skills could contribute to CSER. At this point in time, we are looking for people who will have obtained a doctorate in a relevant discipline by their start date.
We would also humbly ask that the LessWrong community aid us in spreading the word far and wide about these positions. There are many brilliant people working within the existential risk community. However, there are academic disciplines and communities that have had less exposure to existential risk as a research priority than others (due to founder effect and other factors), but where there may be people with very relevant skills and great insights. With new centres and new positions becoming available, we have a wonderful opportunity to grow the field, and to embed existential risk as a crucial consideration in all relevant fields and disciplines.
Thanks very much,
Seán Ó hÉigeartaigh (Executive Director, CSER)
"The Centre for the Study of Existential Risk (University of Cambridge, UK) is recruiting for to four full-time postdoctoral research associates to work on the project Towards a Science of Extreme Technological Risk.
We are looking for outstanding and highly-committed researchers, interested in working as part of growing research community, with research projects relevant to any aspect of the project. We invite applicants to explain their project to us, and to demonstrate their commitment to the study of extreme technological risks.
We have several shovel-ready projects for which we are looking for suitable postdoctoral researchers. These include:
- Ethics and evaluation of extreme technological risk (ETR) (with Sir Partha Dasgupta;
- Horizon-scanning and foresight for extreme technological risks (with Professor William Sutherland);
- Responsible innovation and extreme technological risk (with Dr Robert Doubleday and the Centre for Science and Policy).
However, recruitment will not necessarily be limited to these subprojects, and our main selection criterion is suitability of candidates and their proposed research projects to CSER’s broad aims.
Details are available here. Closing date: April 24th."
On Combat - The Psychology and hysiology of Deadly Conflict in War and in Peace by Lt. Col. Dave Grossman and Loren W. Christensen (third edition from 2007) is a well-written, evidence-based book about the reality of human behaviour in life-threatening situations. It is comprehensive (400 pages), provides detailed descriptions, (some) statistics as well as first-person recounts, historical context and other relevant information. But my main focus in this post is in the advice it gives and what lessons the LessWrong community may take from it.
In deadly force encounters you will experience and remember the most unusual physiological and psychological things. Innoculate yourself against extreme stress with repeated authentic training; play win-only paintball, train 911-dialing and -reporting. Train combat breathing. Talk to people after traumatic events.
It has long been known that algorithms out-perform human experts on a range of topics (here's a LW post on this by lukeprog). Why, then, is it that people continue to mistrust algorithms, in spite of their superiority, and instead cling to human advice? A recent paper by Dietvorst, Simmons and Massey suggests it is due to a cognitive bias which they call algorithm aversion. We judge less-than-perfect algorithms more harshly than less-than-perfect humans. They argue that since this aversion leads to poorer decisions, it is very costly, and that we therefore must find ways of combating it.
Research shows that evidence-based algorithms more accurately predict the future than do human forecasters. Yet when forecasters are deciding whether to use a human forecaster or a statistical algorithm, they often choose the human forecaster. This phenomenon, which we call algorithm aversion, is costly, and it is important to understand its causes. We show that people are especially averse to algorithmic forecasters after seeing them perform, even when they see them outperform a human forecaster. This is because people more quickly lose confidence in algorithmic than human forecasters after seeing them make the same mistake. In 5 studies, participants either saw an algorithm make forecasts, a human make forecasts, both, or neither. They then decided whether to tie their incentives to the future predictions of the algorithm or the human. Participants who saw the algorithm perform were less confident in it, and less likely to choose it over an inferior human forecaster. This was true even among those who saw the algorithm outperform the human.
The results of five studies show that seeing algorithms err makes people less confident in them and less likely to choose them over an inferior human forecaster. This effect was evident in two distinct domains of judgment, including one in which the human forecasters produced nearly twice as much error as the algorithm. It arose regardless of whether the participant was choosing between the algorithm and her own forecasts or between the algorithm and the forecasts of a different participant. And it even arose among the (vast majority of) participants who saw the algorithm outperform the human forecaster.
The aversion to algorithms is costly, not only for the participants in our studies who lost money when they chose not to tie their bonuses to the algorithm, but for society at large. Many decisions require a forecast, and algorithms are almost always better forecasters than humans (Dawes, 1979; Grove et al., 2000; Meehl, 1954). The ubiquity of computers and the growth of the “Big Data” movement (Davenport & Harris, 2007) have encouraged the growth of algorithms but many remain resistant to using them. Our studies show that this resistance at least partially arises from greater intolerance for error from algorithms than from humans. People are more likely to abandon an algorithm than a human judge for making the same mistake. This is enormously problematic, as it is a barrier to adopting superior approaches to a wide range of important tasks. It means, for example, that people will more likely forgive an admissions committee than an admissions algorithm for making an error, even when, on average, the algorithm makes fewer such errors. In short, whenever prediction errors are likely—as they are in virtually all forecasting tasks—people will be biased against algorithms.
More optimistically, our findings do suggest that people will be much more willing to use algorithms when they do not see algorithms err, as will be the case when errors are unseen, the algorithm is unseen (as it often is for patients in doctors’ offices), or when predictions are nearly perfect. The 2012 U.S. presidential election season saw people embracing a perfectly performing algorithm. Nate Silver’s New York Times blog, Five Thirty Eight: Nate Silver’s Political Calculus, presented an algorithm for forecasting that election. Though the site had its critics before the votes were in— one Washington Post writer criticized Silver for “doing little more than weighting and aggregating state polls and combining them with various historical assumptions to project a future outcome with exaggerated, attention-grabbing exactitude” (Gerson, 2012, para. 2)—those critics were soon silenced: Silver’s model correctly predicted the presidential election results in all 50 states. Live on MSNBC, Rachel Maddow proclaimed, “You know who won the election tonight? Nate Silver,” (Noveck, 2012, para. 21), and headlines like “Nate Silver Gets a Big Boost From the Election” (Isidore, 2012) and “How Nate Silver Won the 2012 Presidential Election” (Clark, 2012) followed. Many journalists and popular bloggers declared Silver’s success a great boost for Big Data and statistical prediction (Honan, 2012; McDermott, 2012; Taylor, 2012; Tiku, 2012).
However, we worry that this is not such a generalizable victory. People may rally around an algorithm touted as perfect, but we doubt that this enthusiasm will generalize to algorithms that are shown to be less perfect, as they inevitably will be much of the time.
In American society, talking about money is a taboo. It is ok to talk about how much money someone else made when they sold their company, or how much money you would like to earn yearly if you got a raise, but in many different ways, talking about money is likely to trigger some embarrassment in the brain, and generate social discomfort. As one random example: no one dares suggest that bills should be paid according to wealth, for instance, instead people quietly assume that fair is each paying ~1/n, which of course completely fails utilitarian standards.
One more interesting thing people don't talk about, but would probably be useful to know, are money trigger action patterns. That would be a trigger action pattern that should trigger whenever you have more money than X, for varying Xs.
A trivial example is when should you stop caring about pennies, or quarters? When should you start taking cabs or Ubers everywhere? These are minor examples, but there are more interesting questions that would benefit from a money trigger action pattern.
An argument can be made for instance that one should invest in health insurance prior to cryonics, cryonics prior to painting a house and recommended charities before expensive soundsystems. But people never put numbers on those things.
When should you buy cryonics and life insurance for it? When you own $1,000? $10,000? $1,000,000? Yes of course those vary from person to person, currency to currency, environment, age group and family size. This is no reason to remain silent about them. Money is the unit of caring, but some people can care about many more things than others in virtue of having more money. Some things are worth caring about if and only if you have that many caring units to spare.
I'd like to see people talking about what one should care about after surpassing specific numeric thresholds of money, and that seems to be an extremely taboo topic. Seems that would be particularly revealing when someone who does not have a certain amount suggests a trigger action pattern and someone who does have that amount realizes that, indeed, they should purchase that thing. Some people would also calibrate better over whether they need more or less money, if they had thought about these thresholds beforehand.
Some suggested items for those who want to try numeric triggers: health insurance, cryonics, 10% donation to favorite cause, virtual assistant, personal assistant, car, house cleaner, masseuse, quitting your job, driver, boat, airplane, house, personal clinician, lawyer, body guard, etc...
...notice also that some of these are resource satisfiable, but some may not. It may always be more worth financing your anti-aging helper than your costume designer, so you'd hire the 10 millionth scientist to find out how to keep you young before considering hiring someone to design clothes specifically for you, perhaps because you don't like unique clothes. This is my feeling about boats, it feels like there are always other things that can be done with money that precede having a boat, though outside view is that a lot of people who own a lot of money buy boats.
Sean Carroll, physicist and proponent of Everettian Quantum Mechanics, has just posted a new article going over some of the common objections to EQM and why they are false. Of particular interest to us as rationalists:
Now, MWI certainly does predict the existence of a huge number of unobservable worlds. But it doesn’t postulate them. It derives them, from what it does postulate. And the actual postulates of the theory are quite simple indeed:
- The world is described by a quantum state, which is an element of a kind of vector space known as Hilbert space.
- The quantum state evolves through time in accordance with the Schrödinger equation, with some particular Hamiltonian.
That is, as they say, it. Notice you don’t see anything about worlds in there. The worlds are there whether you like it or not, sitting in Hilbert space, waiting to see whether they become actualized in the course of the evolution. Notice, also, that these postulates are eminently testable — indeed, even falsifiable! And once you make them (and you accept an appropriate “past hypothesis,” just as in statistical mechanics, and are considering a sufficiently richly-interacting system), the worlds happen automatically.
Given that, you can see why the objection is dispiritingly wrong-headed. You don’t hold it against a theory if it makes some predictions that can’t be tested. Every theory does that. You don’t object to general relativity because you can’t be absolutely sure that Einstein’s equation was holding true at some particular event a billion light years away. This distinction between what is postulated (which should be testable) and everything that is derived (which clearly need not be) seems pretty straightforward to me, but is a favorite thing for people to get confused about.
Very reminiscent of the quantum physics sequence here! I find that this distinction between number of entities and number of postulates is something that I need to remind people of all the time.
META: This is my first post; if I have done anything wrong, or could have done something better, please tell me!
A simple exercise in rationality: rephrase an objective statement as subjective and explore the caveats
"This book is awful" => "I dislike this book" => "I dislike this book because it is shallow and is full of run-on sentences." => I dislike this book because I prefer reading books I find deep and clearly written."
"The sky is blue" => ... => "When I look at the sky, the visual sensation I get is very similar to when I look at a bunch of other objects I've been taught to associate with the color blue."
"Team X lost but deserved to win" => ...
"Being selfish is immoral"
"The Universe is infinite, so anything imaginable happens somewhere"
In general, consider a quick check whether in a given context replacing "is" with "appears to be" leads to something you find non-trivial.
Why? Because it exposes the multiple levels of maps we normally skip. So one might find illuminating occasionally walking through the levels and making sure they are still connected as firmly as the last time. And maybe figuring out where the people who hold a different opinion from yours construct a different chain of maps. Also to make sure you don't mistake a map for the territory.
That is all. ( => "I think that I have said enough for one short post and adding more would lead to diminishing returns, though I could be wrong here, but I am too lazy to spend more time looking for links and quotes and better arguments without being sure that they would improve the post.")
In June 2013, I didn’t do any exercise beyond biking the 15 minutes to work and back. Now, I have a robust habit of hitting the gym every day, doing cardio and strength training. Here are the techniques I used to do get from not having the habit to having it, some of them common wisdom and some of them my own ideas. Consider this post a case study/anecdata in what worked for me. Note: I wrote these ideas down around August 2013 but didn’t post them, so my memory was fresh at the time of writing.
1. Have a specific goal. Ideally this goal should be reasonably achievable and something you can see progress toward over medium timescales. I initially started exercising because I wanted more upper body strength to be better at climbing. My goal is “become able to do at least one pull up, or more if possible”.
Why it works: if you have a specific goal instead of a vague feeling that you ought to do something or that it’s what a virtuous person would do, it’s harder to make excuses. Skipping work with an excuse will let you continue to think of yourself as virtuous, but it won’t help with your goal. For this to work, your goal needs to be something you actually want, rather than a stand-in for “I want to be virtuous.” If you can’t think of a consequence of your intended habit that you actually want, the habit may not be worth your time.
2. Have a no-excuses minimum. This is probably the best technique I’ve discovered. Every day, with no excuses, I went to the gym and did fifty pull-downs on one of the machines. After that’s done, I can do as much or as little else as I want. Some days I would do equivalent amounts of three other exercises, some days I would do an extra five reps and that’s it.
Why it works: this one has a host of benefits.
* It provides a sense of freedom: once I’m done with my minimum, I have a lot of choice about what and how much to do. That way it feels less like something I’m being forced into.
* If I’m feeling especially tired or feel like I deserve a day off, instead of skipping a day and breaking the habit I tell myself I’ll just do the minimum instead. Often once I get there I end up doing more than the minimum anyway, because the real thing I wanted to skip was the inconvenience of biking to the gym.
3. If you raise the minimum, do it slowly. I have sometimes raised the bar on what’s the minimum amount of exercise I have to do, but never to as much or more than I was already doing routinely. If you start suddenly forcing yourself to do more than you were already doing, the change will be much harder and less likely to stick than gradually ratcheting up your commitment.
3. Don’t fall into a guilt trap. Avoid associating guilt with doing the minimum, or even with missing a day.
Why it works: feeling guilty will make thinking of the habit unpleasant, and you’ll downplay how much you care about it to avoid the cognitive dissonance. Especially, if you only do the minimum, tell yourself “I did everything I committed to do.” Then when you do more than the minimum, feel good about it! You went above and beyond. This way, doing what you committed to will sometimes include positive reinforcement, but never negative reinforcement.
4. Use Timeless Decision Theory and consistency pressure. Credit for this one goes to this post by user zvi. When I contemplate skipping a day at the gym, I remember that I’ll be facing the same choice under nearly the same conditions many times in the future. If I skip my workout today, what reason do I have to believe that I won’t skip it tomorrow?
Why it works: Even when the benefits of one day’s worth of exercise don’t seem like enough motivation, I know my entire habit that I’ve worked to cultivate is at stake. I know that the more days I go to the gym the more I will see myself as a person who goes to the gym, and the more it will become my default action.
5. Evaluate your excuses. If I have what I think is a reasonable excuse, I consider how often I’ll skip the gym if I let myself skip it whenever I have that good of an excuse. If letting the excuse hold would make me use it often, I ignore it.
Why it works: I based this technique on this LW post
6. Tell people about it. The first thing I did when I made my resolution to start hitting the gym was telling a friend whose opinion I cared about. I also made a comment on LW saying I would make a post about my attempt at forming a habit, whether it succeeded or failed. (I wrote the post and forgot to post it for over a year, but so it goes.)
Why it works: Telling people about your commitment invests your reputation in it. If you risk being embarrassed if you fail, you have an extra motivation to succeed.
I expect these techniques can be generalized to work for many desirable habits: eating healthy, spending time on social interaction; writing, coding, or working on a long-term project; being outside getting fresh air, etc.
View more: Next