First post here, and I'm disagreeing with something in the main sequences. Hubris acknowledged, here's what I've been thinking about. It comes from the post "Are your enemies innately evil?":
On September 11th, 2001, nineteen Muslim males hijacked four jet airliners in a deliberately suicidal effort to hurt the United States of America. Now why do you suppose they might have done that? Because they saw the USA as a beacon of freedom to the world, but were born with a mutant disposition that made them hate freedom?
Realistically, most people don't construct their life stories with themselves as the villains. Everyone is the hero of their own story. The Enemy's story, as seen by the Enemy, is not going to make the Enemy look bad. If you try to construe motivations that would make the Enemy look bad, you'll end up flat wrong about what actually goes on in the Enemy's mind.
If I'm misreading this, please correct me, but the way I am reading this is:
1) People do not construct their stories so that they are the villains,
2) the idea that Al Qaeda is motivated by a hatred of American freedom is false.
Reading the Al Qaeda document released after the attacks, titled "Why We Are Fighting You," you find the following:
What are we calling you to, and what do we want from you?
1. The first thing that we are calling you to is Islam.
A. The religion of tawhid; of freedom from associating partners with Allah Most High, and rejection of such blasphemy; of complete love for Him, the Exalted; of complete submission to His sharia; and of the discarding of all the opinions, orders, theories, and religions that contradict the religion He sent down to His Prophet Muhammad. Islam is the religion of all the prophets and makes no distinction between them.
It is to this religion that we call you …
2. The second thing we call you to is to stop your oppression, lies, immorality and debauchery that has spread among you.
A. We call you to be a people of manners, principles, honor and purity; to reject the immoral acts of fornication, homosexuality, intoxicants, gambling and usury.
We call you to all of this that you may be freed from the deceptive lies that you are a great nation, which your leaders spread among you in order to conceal from you the despicable state that you have obtained.
B. It is saddening to tell you that you are the worst civilization witnessed in the history of mankind:
i. You are the nation who, rather than ruling through the sharia of Allah, chooses to invent your own laws as you will and desire. You separate religion from your policies, contradicting the pure nature that affirms absolute authority to the Lord your Creator….
ii. You are the nation that permits usury…
iii. You are a nation that permits the production, spread, and use of intoxicants. You also permit drugs, and only forbid the trade of them, even though your nation is the largest consumer of them.
iv. You are a nation that permits acts of immorality, and you consider them to be pillars of personal freedom.
"Freedom" is of course one of those words. It's easy enough to imagine an SS officer saying indignantly: "Of course we are fighting for freedom! For our people to be free of Jewish domination, free from the contamination of lesser races, free from the sham of democracy..."
If we substitute the symbol with the substance, though - if "freedom" means what we mean by it, "people being left more or less alone, to follow whichever religion they want or none, to speak their minds, to try to shape society's laws so they serve the people" - then Al Qaeda is absolutely inspired by a hatred of freedom. They wouldn't call it "freedom", mind you; they'd call it "decadence" or "blasphemy" or "shirk" - but the substance is what we call "freedom".
Returning to the syllogism at the top, it seems to me that there is an unstated premise. The conclusion "Al Qaeda cannot possibly hate America for its freedom because everyone sees himself as the hero of his own story" only follows if you assume that what is heroic, what is good, is substantially the same for all humans - for a liberal Westerner and an Islamic fanatic alike.
(For Americans: by "liberal" here I mean the classical sense, which includes just about everyone you are likely to meet, read, or vote for. US conservatives say they are defending the American Revolution, which was broadly in line with liberal principles - slavery excepted, but since US conservatives don't support slavery, my point stands.)
When you state the premise baldly like that, you can see the problem. There's no contradiction in thinking that Muslim fanatics think of themselves as heroic precisely for being opposed to freedom, because they see their heroism as trying to extend the rule of Allah - Shariah - across the world.
Now to the point - we all know the phrase "thinking outside the box". I submit that if you can recognize the box, you've already opened it. Real bias isn't when you have a point of view you're defending, but when you cannot imagine that another point of view seriously exists.
That phrasing has a bit of negative baggage attached to it, as if this were just a matter of pigheaded closed-mindedness. Try thinking about it another way. Would you say to someone with dyscalculia, "You can't get your head around the basics of calculus? You're just being so closed-minded!"? No, that's obviously nuts. We know that different people's minds work in different ways, and that some people can see things others cannot.
Orwell once wrote about British intellectuals' inability to "get" fascism, in particular in his essay on H.G. Wells. He wrote that the only people who really understood the nature and menace of fascism were either those who had felt the lash on their backs, or those who had a touch of the fascist mindset themselves. I suggest that some people just cannot imagine, cannot really believe in, the enormous power of faith - of the idea of serving and fighting and dying for your god and His prophet. It is a kind of thinking that is simply alien to many.
Perhaps this is resisted because people think that "Being able to think like a fascist makes you a bit of a fascist". That's not really true in any way that matters - Orwell was one of the greatest anti-fascist writers of his time, and fought against it in Spain.
So - if you can see the box you are in, you have already half-opened it. And if you are really in the box, you can't see the box. So how can you tell whether you are in a box you can't see, or in no box at all?
The best answer I've been able to come up with is not to think of "box or no box" but rather "open or closed box". We all work from a worldview, simply because we need some knowledge to get further knowledge. If you know you come at an issue from a certain angle, you can always check yourself. You're in a box, but boxes can be useful, and you have the option to go get some stuff from outside the box.
The second is to read people in other boxes. I like steelmanning; it's an important intellectual exercise. But it shouldn't preclude finding actual Men of Steel - that is, people passionately committed to another point of view, another box - and taking a look at what they have to say.
Now you might say: "But that's steelmanning!" Not quite. Steelmanning is "the art of addressing the best form of the other person's argument, even if it's not the one they presented." That may, in some circumstances, lead you into the mistake of assuming that what you think is the best argument for a position is the same as what the other guy thinks is the best argument for his position. That mistake matters most when you are addressing a belief held by a large group of people.
Again, this isn't to run down steelmanning - the practice is sadly rare, and anyone who attempts it gains a big advantage in figuring out how the world is. It's just a reminder that the steelman you make may not be quite as strong as the steelman that is out to get you.
[EDIT: Link included to the document that I did not know was available online before now]
Abstract: In this post I propose a protocol for cryonic preservation whose central idea is to use high pressure, rather than highly toxic cryoprotectants, to keep water from expanding as it freezes. I think this has a chance of being non-destructive enough for us to preserve and then resuscitate an organism with modern technologies. I also propose a simplified experimental protocol for a shrimp or another small model organism capable of surviving in very deep, cold water (building a large pressure chamber is hard, and shrimp are a nice trade-off between depth of habitat and ease of obtaining them at the market). This simplified experiment should be doable in a small lab or a well-equipped garage setting.
Are there obvious problems with this, and how can they be addressed?
Is there a chance of pitching this experiment to a proper academic institution, or is it garage science for now?
Originally posted here.
I do think that the odds of our ever developing advanced nanomachines and/or molecular-level brain scanning, plus algorithms for reversing information distortion - everything you need to undo the damage from conventional cryonic preservation, and even, to some extent, the damage of brain death by its modern definition, provided the brain was preserved before it was too late - are high enough for currently existing cryonics to be a bet worth taking. This is dead serious, and it's an actionable item.
Less of an action item: what if future generations actually build a quantum Bayesian superintelligence, close enough in its capabilities to Solomonoff induction that even a mummified brain, or one preserved in formalin, would be enough evidence to restore its original state? Or what if they invent read-only time travel, and make backups of everyone's mind right before they died (at which point cryonics becomes indistinguishable from a belief in an afterlife existing right now)? Even without time travel, they could just use a Universe-sized supercomputer to simulate every single physically possible human, and naturally one of them is going to be you. But aside from the obvious identity issues (and screw timeless identity), all that relies on unknown unknowns with uncomputable probabilities, and I'd like to have as few leaps of faith and quantum suicides in my life as possible.
So although vitrification right after diagnosed brain death relies on far smaller assumptions, and is totally worth doing - let me reiterate that: go sign up for cryonics - it would be much better if we had preservation protocols so non-destructive that we could actually freeze a living human and then bring them back alive. If nothing else, that would hugely increase public outreach. It would also grant the preserved the status of patient (rather than cadaver), along with human rights; get preservation recognized as a medical procedure covered by insurance or single payer; allow doctors to initiate the preservation of a dying patient before brain death (again: I think everything short of information-theoretic death should potentially be reversible, but why take chances?); allow a suffering patient to opt for preservation rather than euthanasia (actually, I think this should be allowed right now: why on earth would we let a person do something that's guaranteed to kill them, but not something that may kill them or may cure them?); or even allow patients with degenerative brain conditions (e.g. Alzheimer's) to opt for preservation before their memory and personality are permanently destroyed.
Let's fix cryonics! First of all, why can't we do it on living organisms? Because of cryoprotectant toxicity - every cryoprotectant efficient enough to prevent the formation of ice crystals is a strong enough poison to kill the organism (never mind that we can't even saturate the whole body with it - current technologies only allow doing that for the brain alone). But without cryoprotectants, water expands upon freezing and breaks the cells. There is, however, another way to prevent this. At pressures above 350 MPa, water slightly shrinks upon freezing rather than expanding:
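To put rough numbers on that claim, here's a quick back-of-the-envelope sketch. The densities are approximate literature values I'm supplying, not figures from the post; note also that liquid water at 350 MPa is itself denser than at 1 atm, so the true shrinkage on freezing is somewhat smaller than this naive comparison suggests.

```python
# Why high-pressure freezing avoids expansion: compare volume changes
# using approximate densities (assumed literature values, in kg/m^3).
RHO_WATER_1ATM = 999.8   # liquid water near 0 C at 1 atm
RHO_ICE_IH = 916.7       # ordinary ice Ih -> water expands ~9% on freezing
RHO_ICE_III = 1160.0     # ice III at ~350 MPa -> water shrinks instead

def volume_change_percent(rho_liquid, rho_solid):
    """Percent volume change when a fixed mass of water freezes."""
    return (rho_liquid / rho_solid - 1.0) * 100.0

print(f"ice Ih:  {volume_change_percent(RHO_WATER_1ATM, RHO_ICE_IH):+.1f}%")
print(f"ice III: {volume_change_percent(RHO_WATER_1ATM, RHO_ICE_III):+.1f}%")
```

The ice III figure is in the same ballpark as the ~87% compression mentioned later in the post.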
So that's basically that: the key idea is to freeze (and keep) everything under pressure. Now, there are some tricks to that too.
It's not easy to put basically any animal, especially a mammal, under 350 MPa (more than 3x the pressure at the bottom of the Mariana Trench). At that pressure even Trimix becomes toxic. Basically the only remaining solution is total liquid ventilation, which has one problem: it has never been applied successfully to a human. There's one fix for that I can see: as far as I can tell, no one has ever attempted to perform it under high pressure, and the attempts so far have failed mainly because of the insufficient solubility of oxygen and carbon dioxide in perfluorocarbons. Well then, let's increase the pressure! Namely, go to 3 MPa on Trimix, which is doable, and only then switch to TLV, whose efficiency is improved by the higher gas solubility under pressure. But there's another solution too. If you just connect a cardiopulmonary bypass (10 hours should be enough for the whole procedure), you don't need the surrounding liquid to be breathable at all - it can just be saline. CPB also solves the problem of surviving the period after cardiac arrest (which will occur at around 30 centigrade) but before freezing happens - you can just keep the blood circulating and delivering oxygen.
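For intuition about these pressures, here's a quick hydrostatic sanity check. The seawater density and g are assumed constants of mine, and this ignores water's compressibility, so it's only approximate:

```python
# Rough hydrostatic equivalents for the pressures discussed above.
RHO_SEAWATER = 1025.0  # kg/m^3, assumed average seawater density
G = 9.81               # m/s^2

def pressure_at_depth_mpa(depth_m):
    """Approximate gauge pressure (MPa) at a given ocean depth."""
    return RHO_SEAWATER * G * depth_m / 1e6

mariana = pressure_at_depth_mpa(10_994)  # Challenger Deep, ~11 km down
print(f"Mariana Trench bottom: ~{mariana:.0f} MPa")
print(f"350 MPa is ~{350 / mariana:.1f}x that")
print(f"3 MPa (the proposed Trimix-to-TLV switch) is ~"
      f"{3e6 / (RHO_SEAWATER * G):.0f} m of seawater")
```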
Speaking of hypoxia: even with the CPB it's still a problem. You positively don't want the blood to circulate when freezing starts, lest it act like an abrasive water cutter. It's not that much of a problem at near-freezing temperatures, but still. Fortunately, this effect can be mitigated by administering insulin first (yay, the first proper academic citation in this post! Also yay, I thought of this before discovering that it's actually true). This makes sense: if oxygen is primarily used to metabolize glucose, less glucose means less oxygen consumed, and less damage done by hypoxia. Then there's another thing: on the phase diagram you can see that before entering the region of high-temperature ice at 632 MPa, the freezing temperature actually dips down to roughly -22 centigrade at around 210 MPa (and stays below -17 centigrade up to 350 MPa). That would allow us to really shut down metabolism for good while the water is still liquid and blood can be pumped by the CPB. From this point we have two ways. First, we can do the normal thing and start freezing very slowly, to minimize the formation of ice crystals (even though they're smaller than the original water volume, they may still be sharp). Second, we can increase the pressure. That would lead to near-instantaneous freezing everywhere, completely eliminating the problem of hypoxia - before freezing, blood still circulates, and the freezing itself is very quick - far faster than can ever be achieved even by throwing a body into liquid helium at normal pressure. Video evidence suggests that quick freezing of water from a supercooled state leads to the formation of a huge number of crystals, which is bad, but I don't know whether near-instantaneous freezing upon raising the pressure leads to the same effect. More experiments are needed, preferably not on humans.
So here is my preservation protocol:
- Anesthetize a probably terminally ill, but still conscious person.
- Connect them to a cardiopulmonary bypass.
- Replacing their blood with perfluorohexane is not necessary, since we seem to be already doing a decent job at having medium-term (several days) cardiopulmonary bypasses, but that could still help.
- Submerge them in perfluorohexane, making sure that no air bubbles are left.
- Slowly raise the ambient pressure to 350 MPa (~3.5kBar) without stopping the bypass.
- Apply a huge dose of insulin to reduce all their metabolic processes.
- Slowly cool them to about -15 centigrade (at which point, given such pressure, water is still liquid), while increasing the dose of insulin, and raising the oxygen supply to a barely subtoxic level.
- Slowly raise the pressure to 1 GPa (~10kBar), at which point the water solidifies, but shrinks rather than expands as it does so. Don't cut off the blood circulation until the moment ice crystals start forming in the blood/perfluorohexane flow.
- Slowly lower the temperature to -173 centigrade or lower, as you wish.
And then back:
- Raise the temperature to -15 centigrade.
- Slowly lower the pressure to 350 MPa, at which point ice melts.
- Start artificial blood circulation with a barely subtoxic oxygen level.
- Slowly raise the temperature to +4 centigrade.
- Slowly lower the pressure to 1 Bar.
- Drain the ambient perfluorohexane and replace it with pure oxygen. Attach and start a medical ventilator.
- Slowly raise the temperature to +32 centigrade.
- Apply a huge dose of epinephrine and sugar, while transfusing the actual blood (preferably autotransfusion), to restart the heart.
I claim that this protocol allows you to freeze a living human to an arbitrarily low temperature, and then bring them back alive without brain damage, thus being the first true victory over death.
But let's start with something easy and small, like a shrimp. They already live in water, so there's no need to figure out the protocol for putting them into liquid. And they're already adapted to live under high pressure (no swim bladders or other cavities). And they're already adapted to live in cold water, so they should be expected to survive further cooling.
Small ones can be about 1 inch long, so let's be safe and use a 5cm-wide cylinder. To form ice III we need about 350 MPa, which gives us 350e6 * 3.14 * 0.025^2 / 9.8 = 70 tons, or roughly 690 kN, of force. Applying it directly or with a lever is unreasonable, since 70 tons of bending force is a lot even for steel, given the 5 cm target. A block and tackle system is probably a good solution - actually, two of them, one on each side of a beam used for compression, so we have 345 kN per system. And it looks like you can buy 40-50 ton manual hoists on Alibaba, though I have no idea about their quality.
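The arithmetic above can be checked directly. This just reproduces the post's own estimate: the 5 cm bore is the post's figure, and "tons" here means metric tonnes-force:

```python
import math

PRESSURE = 350e6      # Pa, target pressure for ice III
BORE_RADIUS = 0.025   # m, half of the 5 cm cylinder
G = 9.8               # m/s^2, as used in the post

area = math.pi * BORE_RADIUS ** 2     # piston face area, m^2
force_n = PRESSURE * area             # total force on the piston, N
force_tonnes = force_n / (1000 * G)   # same force in tonnes-force

print(f"force: {force_n / 1e3:.0f} kN (~{force_tonnes:.0f} tonnes)")
print(f"per block-and-tackle system: {force_n / 2 / 1e3:.0f} kN")
```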
I'm not sure to what extent Pascal's law applies to solids, but if it does, the whole setup can be vastly optimized by creating a bottleneck for the piston. One problem is that we can no longer assume that water is completely incompressible - it has to be compressed to about 87% of its original volume - but aside from that, 350 MPa on a millimeter-thick rod is just 28 kg of force. To compress a 0.05 m by 0.1 m cylinder of water to 87% of its original volume, we need to pump an extra 1e-4 m^3 of water into it, which for that rod amounts to about 148 meters of travel - not terribly good. A 1 cm thick rod, on the other hand, would require almost 3 tons of force, but would move only about 1.5 meters. Alternatively, the problem of applying constant pressure can be solved by enclosing the water in a plastic bag and filling the rest of the chamber with a liquid that has a lower freezing point but the same density. That way, for all the time it takes the water to freeze, it is guaranteed to be under uniform external pressure, and afterwards it simply has nowhere to go.
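The rod numbers check out too, if we assume (as the post's ~148 m figure seems to imply) that the water being pumped in also compresses to about 87% of its volume at the target pressure:

```python
import math

# Force on, and travel of, a thin high-pressure rod (the "bottleneck" piston).
# The 87% compression factor and the 1e-4 m^3 extra volume are the post's
# figures; treating the incoming water as compressing too is my reading.
PRESSURE = 350e6       # Pa
EXTRA_VOLUME = 1e-4    # m^3 of water (at 350 MPa) to pump into the chamber
COMPRESSION = 0.87     # the pumped water itself shrinks to ~87% of its volume

def rod_force_and_travel(diameter_m):
    area = math.pi * (diameter_m / 2) ** 2
    force_kgf = PRESSURE * area / 9.8               # force in kilograms-force
    travel_m = (EXTRA_VOLUME / COMPRESSION) / area  # rod travel needed
    return force_kgf, travel_m

for d in (0.001, 0.01):  # 1 mm and 1 cm rods
    f, t = rod_force_and_travel(d)
    print(f"{d * 1000:.0f} mm rod: ~{f:.0f} kgf, ~{t:.1f} m of travel")
```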
Alternatively, one could just buy a 90,000 psi pump and 100,000 psi tubes and vessels, but let's face it: if they don't even list the price on their website, you probably don't want to know it. And since no institution that can afford such a thing seems to be interested in cryonics research, we'll have to stick to makeshift solutions (at least until the shrimp experiment works, which would probably yield a publication in Nature, and enough academic recognition for proper research to start).
The Sequences are being released as an eBook, titled Rationality: From AI to Zombies, on March 12.
We went with the name "Rationality: From AI to Zombies" (based on shminux's suggestion) to make it clearer to people — who might otherwise be expecting a self-help book, or an academic text — that the style and contents of the Sequences are rather unusual. We want to filter for readers who have a wide-ranging interest in (/ tolerance for) weird intellectual topics. Alternative options tended to obscure what the book is about, or obscure its breadth / eclecticism.
The book's contents
Around 340 of Eliezer's essays from 2009 and earlier will be included, collected into twenty-six sections ("sequences"), compiled into six books:
- Map and Territory: sequences on the Bayesian conceptions of rationality, belief, evidence, and explanation.
- How to Actually Change Your Mind: sequences on confirmation bias and motivated reasoning.
- The Machine in the Ghost: sequences on optimization processes, cognition, and concepts.
- Mere Reality: sequences on science and the physical world.
- Mere Goodness: sequences on human values.
- Becoming Stronger: sequences on self-improvement and group rationality.
The six books will be released as a single sprawling eBook, making it easy to hop back and forth between different parts of the book. The whole book will be about 1,800 pages long. However, we'll also be releasing the same content as a series of six print books (and as six audio books) at a future date.
The Sequences have been tidied up in a number of small ways, but the content is mostly unchanged. The largest change is to how the content is organized. Some important Overcoming Bias and Less Wrong posts that were never officially sorted into sequences have now been added — 58 additions in all, forming four entirely new sequences (and also supplementing some existing sequences). Other posts have been removed — 105 in total. The following old sequences will be the most heavily affected:
- Map and Territory and Mysterious Answers to Mysterious Questions are being merged, expanded, and reassembled into a new set of introductory sequences, with more focus placed on cognitive biases. The name 'Map and Territory' will be re-applied to this entire collection of sequences, constituting the first book.
- Quantum Physics and Metaethics are being heavily reordered and heavily shortened.
- Most of Fun Theory and Ethical Injunctions is being left out. Taking their place will be two new sequences on ethics, plus the modified version of Metaethics.
I'll provide more details on these changes when the eBook is out.
Unlike the print and audio-book versions, the eBook version of Rationality: From AI to Zombies will be entirely free. If you'd rather purchase it on the Kindle Store and download it directly to your Kindle, it will also be available on Amazon for $4.99.
To make the content more accessible, the eBook will include introductions I've written for this purpose. It will also link to a glossary on the LessWrongWiki, which I'll be recruiting LessWrongers to help populate with explanations of references and jargon from the Sequences.
I'll post an announcement to Main as soon as the eBook is available. See you then!
I told an intelligent, well-educated friend about Less Wrong, so she googled, and got "Less Wrong is an online community for people who want to apply the discovery of biases like the conjunction fallacy, the affect heuristic, and scope insensitivity in order to fix their own thinking." and gave up immediately because she'd never heard of the biases.
While hers might not be the best possible attitude, I can't see that we win anything by driving people away with obscure language.
Possible improved introduction: "Less Wrong is a community for people who would like to think more clearly in order to improve their own and other people's lives, and to make major disasters less likely."
For a site extremely focused on fixing bad thinking patterns, I've noticed a bizarre lack of discussion of mental illness here. Considering the high correlation between intelligence and mental illness, you'd think it would be a bigger topic.
I personally suffer from Generalized Anxiety Disorder and a very tame panic disorder. Most of this is focused on financial and academic things, but I will also get panicky about social interaction, responsibilities, and things that happened in the past that seriously shouldn't bother me. I have an almost amusing response to anxiety that is basically my brain panicking and telling me to go hide under my desk.
I know lukeprog and Alicorn managed to fight off a good deal of their issues in this area and wrote up how, but I don't think enough has been done. They mostly dealt with depression. What about rational schizophrenics and phobics and bipolar people? It's difficult to find anxiety advice that goes beyond "do yoga while watching the sunrise!" Pop psych isn't very helpful. I think LessWrong could be. What's mental illness but a wrongness in the head?
Mental illness seems to be worse for intelligent people than your typical biases, honestly. Hiding under my desk is even less useful than, say, appealing to authority during an argument - at least the latter has the potential to be useful. I know it's limiting me, and starting cycles of avoidance, and so much more. And my mental illness isn't even that bad! Trying to be rational and successful while schizophrenic sounds like a Sisyphean nightmare.
I'm not fighting my difficulties nearly well enough to feel qualified to author my own posts. Hearing from people who are managing is more likely to help. If nothing else, maybe a Rational Support Group would be a lot of fun.
Political topics attract participants inclined to use the norms of mainstream political debate, risking a tipping point to lower quality discussion
(I hope that is the least click-baity title ever.)
Political topics elicit lower quality participation, holding the set of participants fixed. This is the thesis of "politics is the mind-killer".
Here's a separate effect: Political topics attract mind-killed participants. This can happen even when the initial participants are not mind-killed by the topic.
Since outreach is important, this could be a good thing. Raise the sanity waterline! But the sea of people eager to enter political discussions is vast, and the epistemic problems can run deep. Of course not everyone needs to arrive perfectly pre-aligned with community norms, but any community is limited in how robustly it can handle an influx of participants expecting a different set of norms. If you look at other forums, it seems to take very little overt contemporary political discussion before the whole place is swamped and politics becomes endemic. As appealing as "LW, but with slightly more contemporary politics" sounds, it's probably not even an option. The options are "LW, with politics in every thread" and "LW, with as little politics as we can manage".
That said, most of the problems are avoided by just not saying anything that pattern-matches too easily to current political issues. From what I can tell, LW has always had tons of meta-political content, which doesn't seem to cause problems, as well as standard political points presented in unusual ways, and contrarian political opinions too marginal to raise concern. Frankly, if you have a "no politics" norm, people will still talk about politics, but to a limited degree. But if you don't even half-heartedly (or even hypocritically) discourage politics, then an open-entry site that accepts general topics risks spiraling too far in a political direction.
As an aside, I'm not apolitical. Although some people advance a more sweeping dismissal of the importance or utility of political debate, that isn't required to justify restricting politics in certain contexts. The sort of argument I've sketched (I don't want LW to be swamped by the worse sorts of people who can be attracted to political debate) is enough. There's no hypocrisy in not wanting politics on LW while accepting political talk (and the warts it entails) elsewhere. Off the top of my head, Yvain is one LW affiliate who now largely writes about more politically charged topics on his own blog (SlateStarCodex), and there are some other progressive blogs in that direction. There are libertarian and right-leaning (reactionary? NRx-lbgt?) connections too. I would love a grand unification as much as anyone (provided, of course, we all realize that I've been right all along), but please let's not tell the generals to bring their armies here for the negotiations.
Transcribed from maxikov's posted videos.
Verbal filler removed for clarity.
Audience Laughter denoted with [L], Applause with [A]
Eliezer: So, any questions? Do we have a microphone for the audience?
Guy Offscreen: We don't have a microphone for the audience, have we?
Some Other Guy: We have this furry thing, wait, no that's not hooked up. Never mind.
Eliezer: Alright, come on over to the microphone.
Guy with 'Berkeley Lab' shirt: So, this question is sort of on behalf of the HPMOR subreddit. You say you don't give red herrings, but like... He's making faces at me like... [L] You say you don't give red herrings, but while he's sitting there during the Quidditch game thinking of who he can bring along, he stares at Cedric Diggory, and he's like, "He would be useful to have at my side!", and then he never shows up. Why was there not a Cedric Diggory?
Eliezer: The true Cedrics Diggory are inside all of our hearts. [L] And in the mirror. [L] And in Harry's glasses. [L] And, well, I mean the notion is, you're going to look at that and think, "Hey, he's going to bring along Cedric Diggory as a spare wand, and he's gonna die! Right?" And then, Lesath Lestrange shows up and it's supposed to be humorous, or something. I guess I can't do humor. [L]
Guy Dressed as a Witch: Does Quirrell's attitude towards reckless muggle scientists have anything to do with your attitude towards AI researchers that aren't you? [L]
Eliezer: That is unfair. There are at least a dozen safety conscious AI researchers on the face of the earth. [L] At least one of them is respected. [L] With that said, I mean if you have a version of Voldemort who is smart and seems to be going around killing muggleborns, and sort of pretty generally down on muggles... Like, why would anyone go around killing muggleborns? I mean, there's more than one rationalization you could apply to this situation, but the sort of obvious one is that you disapprove of their conduct with nuclear weapons. From Tom Riddle's perspective that is.
I do think I sort of try to never have leakage from that thing I spend all day talking about into a place it really didn't belong, and there's a saying that goes 'A fanatic is someone who cannot change his mind, and will not change the subject.' And I'm like ok, so if I'm not going to change my mind, I'll at least endeavor to be able to change the subject. [L] Like, towards the very end of the story we are getting into the realm where sort of the convergent attitude that any sort of carefully reasoning person will take towards global catastrophic risks, and the realization that you are in fact a complete crap rationalist, and you're going to have to start over and actually try this time. These things are sort of reflective of the story outside the story, but apart from 'there is only one king upon a chessboard', and 'I need to raise the level of my game or fail', and perhaps, one little thing that was said about the mirror of VEC, as some people called it.
Aside from those things, I would say that I was treating it as convergent evolution rather than any sort of attempted parable or Professor Quirrell speaking for me. He usually doesn't... [L] I wish more people would realize that... [L] I mean, you know the... How can I put this exactly. There are these people who are sort of to the right side of the political spectrum, and occasionally they tell me that they wish I'd just let Professor Quirrell take over my brain and run my body. And they are literally Republicans for You-Know-Who. And there you have it, basically. Next question! ... No more questions, ok. [L] I see that no one has any questions left; oh, there you are.
Fidgety Guy: One of the chapters you posted was the final exam chapter where you had everybody brainstorm solutions to the predicament that Harry was in. Did you have any favorite alternate solution besides the one that made it into the book?
Eliezer: So, not to give away the intended solution for anyone who hasn't reached that chapter yet, though really you're just going to have the living daylights spoiled out of you, there's no way to avoid that really. So, the most brilliant solution I had not thought of at all was for Harry to precommit to transfigure something that would cause a large explosion visible from the Quidditch stands, which had observed no such explosion, unless help sent via Time-Turner showed up at that point, thereby ensuring that the simplest timeline was not the one where he never reached the Time-Turner, and ensuring that some self-consistent set of events would occur which caused him not to carry through on his precommitment. I, you know, I suspect that I might have ruled that that wouldn't work because of the Unbreakable Vow preventing Harry from actually doing that, because it might, in effect, count as trying to destroy that timeline, or filter it, and thereby have that count as trying to destroy the world, or just risk destroying it, or something along those lines, but it was brilliant! [L] I was staring at the computer screen going, "I can't believe how brilliant these people are!" "That's not something I usually hear you say," Brienne said. "I'm not usually watching hundreds of peoples' collective intelligence coming up with solutions way better than anything I thought of!" I replied to her.
And the sort of most fun lateral thinking solution was to pull Quirinus Quirrell's body over using transfigured carbon nanotubes and some padding, call 'Up!', and ride away on his broomstick bones. [L] That is definitely going in 'Omake files #5: Collective Intelligence'! Next question!
Guy Wearing Black: So in the chapter with the mirror, there was a point at which Dumbledore had said something like, "I am on this side of the mirror and I always have been." That was never explained that I could tell. I'm wondering if you could clarify that.
Eliezer: It is a reference to the fanfic 'Seventh Horcrux' that *totally* ripped off HPMOR despite being written slightly earlier than it... [L] I was slapping my forehead pretty hard when that happened. Which contains the line "Perhaps Albus Dumbledore really was inside the mirror all along." Sort of arc words, as it were. And I also figured that there was simply some bilocation effect using one of the advanced settings of the mirror that Dumbledore was using, so that the trap would always be springable, as opposed to him having to know at what time Tom Riddle would appear before the mirror and be trapped. Next!
Black Guy: So, how did Moody and the rest of them retrieve the items Dumbledore threw in the mirror of VEC?
Eliezer: Dumbledore threw them outside the mirror's range, thereby causing them not to be sealed in the corresponding real world when the duplicate of Dumbledore inside the mirror was sealed. So wherever Dumbledore was at the time, probably investigating Nicolas Flamel's house, he suddenly popped away, and the Line of Merlin Unbroken and the Elder Wand just fell to the floor from where he was.
Asian Guy: In 'Something to Protect: Severus Snape', you wrote that he laughed. And I was really curious, what exactly does Severus Snape sound like when he laughs? [L]
Person in Audience: Perform for us!
Eliezer: He He He. [L]
Girl in Audience: Do it again now, everybody together!
Audience: He He He. [L]
Guy in Blue Shirt: So I was curious about the motivation behind making Sirius re-evil again and having Peter be a good guy again, their relationship. What was the motivation?
Eliezer: In character or out of character?
Guy in Blue Shirt: Well, yes. [L]
Eliezer: All right, well, in character Peter can be pretty attractive when he wants to be, and Sirius was a teenager. Or, you were asking about the alignment shift part?
Guy in Blue Shirt: Yeah, the alignment and their relationship.
Eliezer: So, in the alignment, I'm just ruling it always was that way. The whole Sirius Black thing is a puzzle, is the way I'm looking at it. And the canon solution to that puzzle is perfectly fine for a children's book, which I say once again requires a higher level of skill than a grown-up book, but just did not make sense in context. So I was just looking at the puzzle and being like, ok, so what can be the actual solution to this puzzle? And also, a further important factor, this had to happen. There are a whole lot of fanfictions out there of Harry Potter. More than half a million, and that was years ago. And 'Methods of Rationality' is fundamentally set in the universe of Harry Potter fanfiction, more than canon. And in many many of these fanfictions someone goes back in time to redo the seven years, and they know that Scabbers is secretly Peter Pettigrew, and there's a scene where they stun Scabbers the rat and take him over to Dumbledore, the Head Auror, and the Minister of Magic and get them to check out this rat over here, and uncover Peter Pettigrew. And in all the times I had read that scene, at least a dozen times literally, it was never once played out the way it would go in real life, where that is just a rat, and you're crazy. [L] And that was the sort of basic seed of, "Ok, we're going to play this straight, the sort of loonier conspiracies are false, but there is still a grain of conspiracy truth to it." And then I introduced the whole accounting of what happened with Sirius Black in the same chapter where Hermione just happens to mention that there's a Metamorphmagus in Hufflepuff, and exactly one person posted to the reviews of chapter 28, based on the clue that the Metamorphmagus had been mentioned in the same chapter, "Aha! I present you the tale of Peter Pettigrew, the unfortunate Metamorphmagus." [L] See! You could've solved it, you could've solved it, but you didn't! Someone solved it, you did not solve that. Next Question!
Guy in White: First, [pulls out wand] Avada Kedavra. How do you feel about your security? [L] Second, have you considered the next time you need a large group of very smart people to really work on a hard problem, presenting it to them in fiction?
Eliezer: So, of course I always keep my Patronus Charm going inside of me. [Aww/L] And if that fails, I do have my amulet that triggers my emergency kitten shield. [L] And indeed one of the more attractive things I'm considering potentially doing for the next major project is 'Precisely Bound Djinn and their Behavior'. The theme of which is you have these people who can summon djinn, or command the djinn effect, and you can sort of negotiate with them in the language of djinn and they will always interpret your wish in the worst way possible, or you can give them mathematically precise orders; which they can apparently carry out using unlimited computing power, which obviously ends the world in fairly short order, causing our protagonist to be caught in a groundhog day loop as they try over and over again to both maybe arrange for conditions outside to be such that they can get some research done for longer than a few months before the world ends again, and also try to figure out what to tell their djinn. And, you know, I figure that if anyone can give me an unboundedly computable specification of a value aligned advanced agent, the story ends, the characters win, hopefully that person gets a large monetary prize if I can swing it, the world is safer, and I can go onto my next fiction writing project, which will be the one with the boundedly specified [L] value aligned advanced agents. [A]
Guy with Purple Tie: So, what is the source of magic?
Eliezer: Alright, so, there was a bit of literary miscommunication in HPMOR. I tried as hard as I could to signal that unraveling the true nature of magic and everything that inheres in it is actually the kind of large project that they were not going to complete during Harry's first year at Hogwarts. [L] You know, 35 years, even if someone is helping you, is a reasonable amount of time for a project like that to take. And if it's something really difficult, like AI, you might need more than two people even. [L] At least if you want the value aligned version. Anyway, where was I?
So I think that fundamentally, the only way to come up with a non-nitwit explanation of magic is to start from the non-nitwit explanation and then generate the laws of magic from it, so that when you reveal the answer behind the mystery, everything actually fits with it. You may have noticed this kind of philosophy showing up elsewhere in the literary theory of HPMOR at various points where it turns out that things fit with things you have already seen. But with magic, ultimately the source material was not designed as a hard science fiction story. The magic that we start with as a phenomenon is not designed to be solvable, and what did happen was that the characters thought of experiments, and I in my role of the universe thought of the answers to them, and if they had ever reached the point where there was only one explanation left, then the magic would have had rules, and they would have been arrived at in a fairly organic way that I could have felt good about; not as a sudden, "Aha! I gotcha! I revealed this thing that you had no way of guessing."
Now I could speculate. And I even tried to write a little section where Harry runs into writings that Dumbledore left behind, in which Dumbledore sets down some of his own speculation, but there was no good place to put that into the final chapter. But maybe I'll later be able... The final edits were kind of rushed honestly, sleep deprivation, 3am. But maybe in the second edit or something I'll be able to put that set of paragraphs in there. In Dumbledore's office, Dumbledore has speculated. He's mostly just taking the best of some of the other writers that he's read. Look at the size of the universe: it seems to be mundane. Dumbledore was around during World War 2; he does know that muggles have telescopes. He has talked with muggle scientists a bit, and those muggle scientists seem very confident that all the universe they can see looks like it's mundane. And Dumbledore wondered, why is there this sort of small magical section, and this much larger mundane section, or this much larger muggle section? And that seemed to Dumbledore to suggest, as a certain other magical philosopher had written: if you consider the question of the underlying nature of reality, was it mundane to begin with, with magic arising from mundanity, or was the universe magic to begin with, with mundanity imposed above it? Now mundanity by itself will clearly never give rise to magic, yet magic permits mundanity to be imposed, and so, this other magical philosopher wrote, he therefore thinks that the universe was magical to begin with and the mundane sections are imposed above the magic.
And Dumbledore himself had speculated, having been acquainted with the Line of Merlin for much of his life, that just as the Interdict of Merlin was imposed to restrict the spread and the number of people who had sufficiently powerful magic, perhaps the mundane world itself is an attempt to bring order to something that was on the verge of falling apart in Atlantis, or in whatever came before Atlantis. Perhaps the thing that happened with the Interdict of Merlin has happened over and over again: people trying to impose law upon reality, and that law having flaws, and the flaws being more and more exploited until they reach a point of power that threatens to destroy the world, and the most adept wielders of that power trying to once again impose mundanity.
And I will also observe, although Dumbledore had no way of figuring this out, and I think Harry might not have figured it out yet because he doesn't yet know about chromosomal crossover: that if there is no wizard gene, but rather a muggle gene, and the muggle gene sometimes gets hit by cosmic rays and ceases to function, thereby producing a non-muggle allele, then some of the muggle-vs.-wizard alleles in the wizard population that got there from muggleborns will be repairable via chromosomal crossover, thus sometimes causing two wizards to give birth to a squib. Furthermore, this will happen more frequently in wizards who have recent muggleborn ancestry. I wonder if Lucius told Draco that when Draco told him about Harry's theory of genetics. Anyway, this concludes my strictly personal speculations. It's not in the text, so it's not real unless it's in the text somewhere. 'Opinion of God', not 'Word of God'. But this concludes my personal speculations on the origin of magic and the nature of the "wizard gene". [A]
Update: When I posted this announcement I remarkably failed to make the connection that April 15th is tax day here in the US, and as a prime example of the planning fallacy (a topic of the first sequence!), I failed to anticipate just how complicated my taxes would be this year. The first post of the reading group is basically done but a little rushed, and I want to take an extra day to get it right. Expect it to post on the next day, the 16th.
On Thursday, 16 April 2015, just under a month out from this posting, I will hold the first session of an online reading group for the ebook Rationality: From AI to Zombies, a compilation of the LessWrong sequences by our own Eliezer Yudkowsky. I would like to model this on the very successful Superintelligence reading group led by
The reading group will 'meet' on a semi-monthly post on the LessWrong discussion forum. For each 'meeting' we will read one sequence from the Rationality book, which contains a total of 26 lettered sequences. A few of the sequences are unusually long, and these might be split into two sessions. If so, advance warning will be given.
In each posting I will briefly summarize the salient points of the essays comprising the sequence, link to the original articles and discussion when possible, attempt to find, link to, and quote one or more related materials or opposing viewpoints from outside the text, and present a half-dozen or so question prompts to get the conversation rolling. Discussion will take place in the comments. Others are encouraged to provide their own question prompts or unprompted commentary as well.
We welcome both newcomers and veterans on the topic. If you've never read the sequences, this is a great opportunity to do so. If you are an old timer from the Overcoming Bias days then this is a chance to share your wisdom and perhaps revisit the material with fresh eyes. All levels of time commitment are welcome.
If this sounds like something you want to participate in, then please grab a copy of the book and get started reading the preface, introduction, and the 10 essays / 42 pages which comprise Part A: Predictably Wrong. The first virtual meeting (forum post) covering this material will go live before 6pm Thursday PDT (1am Friday UTC), 16 April 2015. Successive meetings will start no later than 6pm PDT on the first and third Wednesdays of a month.
Following this schedule, it is expected that it will take just over a year to complete the entire book. If you prefer flexibility, come by any time! And if you are coming upon this post from the future, please feel free to leave your opinions as well. The discussion period never closes.
Topic for the first week is the preface by Eliezer Yudkowsky, the introduction by Rob Bensinger, and Part A: Predictably Wrong, a sequence covering rationality, the search for truth, and a handful of biases.
There seems to be a lot of enthusiasm around LessWrong meetups, so I thought something like this might be interesting too. There is no need to register - just add your marker and keep an eye out for someone living near you.
Here's the link: https://www.zeemaps.com/map?group=1323143
I posted this on an Open Thread first. Below are some observations based on the previous discussion:
When creating a new marker you will be given a special URL you can use to edit it later. If you lose it, you can create a new one and ask me to delete the old marker. Try not to lose it though.
If someone you tried to contact is unreachable, notify me and I'll delete the marker in order to keep the map tidy. Also, try to keep your own marker updated.
It was suggested that it would be a good idea to circulate the map around survey time. I'll try to remind everyone to update their markers around that time. Any major changes (e.g. changing admin, switching services, remaking the map to eliminate dead markers) will also happen then.
The map data can be exported by anyone, so there's no need to start over if I disappear or whatever.
Edit: Please make it possible to contact you. If you choose a name that doesn't match your LW account, you must add an email address or equivalent. Otherwise it is assumed that the name on the marker is your username here; if it isn't, you are essentially unreachable and will be removed.
I wrote an article about the process of signing up for cryo since I couldn't find any such accounts online. If you have questions about the sign-up process, just ask.
A few months ago, I signed up for Alcor's brain-only cryopreservation. The entire process took me 11 weeks from the day I started till the day I received my medical bracelet (the thing that’ll let paramedics know that your dead body should be handled by Alcor). I paid them $90 for the application fee. From now on, every year I’ll pay $530 for Alcor membership fees, and also pay $275 for my separately purchased life insurance.
As many of you may be aware, the UK general election took place yesterday, resulting in a surprising victory for the Conservative Party. The pre-election opinion polls predicted that the Conservatives and Labour would be roughly equal in terms of votes cast, with perhaps a small Conservative advantage leading to a hung parliament; instead the Conservatives got 36.9% of the vote to Labour's 30.4%, and won the election outright.
There has already been a lot of discussion about why the polls were wrong, from methodological problems to incorrect adjustments. But perhaps more interesting is the possibility that the polls were right! For example, Survation did a poll on the evening before the election, which predicted the correct result (Conservatives 37%, Labour 31%). However, that poll was never published because the results seemed "out of line." Survation didn't want to look silly by breaking with the herd, so they just kept quiet about their results. Naturally this makes me wonder about the existence of other unpublished polls with similar readings.
This seems to be a case of two well-known problems colliding with devastating effect. Conformity bias caused Survation to ignore the data and go with what they "knew" to be the case (for which they have now paid dearly). And then the file drawer effect meant that the generally available data was skewed, misleading third parties. The scientific thing to do is to publish all data, including "outliers," both so that information can change over time rather than be anchored, and to avoid artificially compressing the variance. Interestingly, the exit poll, which had a methodology agreed beforehand and was previously committed to be published, was basically right.
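A quick Monte Carlo sketch illustrates the variance-compression point. The numbers below are purely hypothetical, not estimates of the actual 2015 polls; the point is only that filtering out results that look "out of line" with the herd's expectation biases the published average and artificially shrinks the published spread:

```python
import random
import statistics

random.seed(0)

TRUE_LEAD = 6.0  # hypothetical true Conservative lead, in points
NOISE_SD = 2.0   # hypothetical sampling noise per poll

# Simulate many polls of the same race.
all_polls = [random.gauss(TRUE_LEAD, NOISE_SD) for _ in range(1000)]

# File-drawer effect: any poll further than 5 points from the herd's
# expected tie looks like an outlier and never gets published.
published = [lead for lead in all_polls if abs(lead) < 5.0]

# The published subset is biased toward the herd view, and its spread
# is artificially compressed relative to the full data.
print("all:      ", statistics.mean(all_polls), statistics.stdev(all_polls))
print("published:", statistics.mean(published), statistics.stdev(published))
```

Running this, the published polls cluster misleadingly close to a tie with a much smaller standard deviation than the full set, which is exactly the anchoring-plus-compression failure described above.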
This is now the third time in living memory that opinion polls have been embarrassingly wrong about the UK general election. Each time this has led to big changes in the polling industry. I would suggest that one important scientific improvement is for polling companies to announce the methodology of a poll and any adjustments to be made before the poll takes place, and commit to publishing all polls they carry out. Once this became the norm, data from any polling company that didn't follow this practice would be rightly seen as unreliable by comparison.
We have a recurring theme in the greater Less Wrong community that life should be more like a high fantasy novel. Maybe that is to be expected when a quarter of the community came here via Harry Potter fanfiction. We also have rationalist group houses named after fantasy locations, descriptions of community members in terms of character archetypes and PCs versus NPCs, and semi-serious development of the new atheist gods; feel free to contribute your favorites in the comments.
A failure mode common to high fantasy novels as well as politics is solving all our problems by defeating the villain. Actually, this is a common narrative structure for our entire storytelling species, and it works well as a narrative structure. The story needs conflict, so we pit a sympathetic protagonist against a compelling antagonist, and we reach a satisfying climax when the two come into direct conflict, good conquers evil, and we live happily ever after.
This isn't an article about whether your opponent really is a villain. Let's make the (large) assumption that you have legitimately identified a villain who is doing evil things. They certainly exist in the world. Defeating this villain is a legitimate goal.
And then what?
Defeating the villain is rarely enough. Building is harder than destroying, and it is very unlikely that something good will spontaneously fill the void when something evil is taken away. It is also insufficient to speak in vague generalities about the ideals to which the post-[whatever] society will adhere. How are you going to avoid the problems caused by whatever you are eliminating, and how are you going to successfully transition from evil to good?
In fantasy novels, this is rarely an issue. The story ends shortly after the climax, either with good ascending or time-skipping to a society made perfect off-camera. Sauron has been vanquished, the rightful king has been restored, cue epilogue(s). And then what? Has the Chosen One shown skill in diplomacy and economics, solving problems not involving swords? What was Aragorn's tax policy? Sauron managed to feed his armies from a wasteland; what kind of agricultural techniques do you have? And indeed, if the book/series needs a sequel, we find that a problem at least as bad as the original fills in the void.
Reality often follows that pattern. Marx explicitly had no plan for what happened after you smashed capitalism. Destroy the oppressors and then ... as it turns out, slightly different oppressors come in and generally kill a fair percentage of the population. It works in the other direction as well; the fall of Soviet communism led not to spontaneous capitalism but rather to kleptocracy and Vladimir Putin. For most of my lifetime, a major pillar of American foreign policy has seemed to be the overthrow of hostile dictators (end of plan). For example, Muammar Gaddafi was killed in 2011, and Libya has been in some state of unrest or civil war ever since. Maybe this is one where it would not be best to contribute our favorites in the comments.
This is not to say that you never get improvements that way. Aragorn can hardly be worse than Sauron. Regression to the mean perhaps suggests that you will get something less bad just by luck; Putin seems clearly less bad than Stalin, though Stalin is a lower starting point than almost any other regime change in history has had. Some would say that causing civil wars in hostile countries is the goal rather than a failure of American foreign policy, which seems a darker sort of instrumental rationality.
Human flourishing is not the default state of affairs, temporarily suppressed by villainy. Spontaneous order is real, but it still needs institutions and social technology to support it.
Defeating the villain is a (possibly) necessary but (almost certainly) insufficient condition for bringing about good.
One thing I really like about this community is that projects tend to be conceived in the positive rather than the negative. Please keep developing your plans not only in terms of "this is a bad thing to be eliminated" but also "this is a better thing to be created" and "this is how I plan to get there."
I'm currently reading through some relevant literature in preparation for my FLI grant proposal on the topic of concept learning and AI safety. I figured that I might as well write down the research ideas I get while doing so, so as to get some feedback and clarify my thoughts. I will be posting these in a series of "Concept Safety"-titled articles.
A frequently-raised worry about AI is that it may reason in ways which are very different from us, and understand the world in a very alien manner. For example, Armstrong, Sandberg & Bostrom (2012) consider the possibility of restricting an AI via "rule-based motivational control" and programming it to follow restrictions like "stay within this lead box here", but they raise worries about the difficulty of rigorously defining "this lead box here". To address this, they go on to consider the possibility of making an AI internalize human concepts via feedback, with the AI being told whether or not some behavior is good or bad and then constructing a corresponding world-model based on that. The authors are however worried that this may fail, because
Humans seem quite adept at constructing the correct generalisations – most of us have correctly deduced what we should/should not be doing in general situations (whether or not we follow those rules). But humans share a common genetic design, which the OAI would likely not have. Sharing, for instance, derives partially from genetic predisposition to reciprocal altruism: the OAI may not integrate the same concept as a human child would. Though reinforcement learning has a good track record, it is neither a panacea nor a guarantee that the OAIs generalisations agree with ours.
Addressing this, a possibility that I raised in Sotala (2015) was that possibly the concept-learning mechanisms in the human brain are actually relatively simple, and that we could replicate the human concept learning process by replicating those rules. I'll start this post by discussing a closely related hypothesis: that given a specific learning or reasoning task and a certain kind of data, there is an optimal way to organize the data that will naturally emerge. If this were the case, then AI and human reasoning might naturally tend to learn the same kinds of concepts, even if they were using very different mechanisms. Later on in the post, I will discuss how one might try to verify that similar representations had in fact been learned, and how to set up a system to make them even more similar.
A particularly fascinating branch of recent research relates to the learning of word embeddings, which are mappings of words to very high-dimensional vectors. It turns out that if you train a system on one of several kinds of tasks, such as being able to classify sentences as valid or invalid, this builds up a space of word vectors that reflects the relationships between the words. For example, there seems to be a male/female dimension to words, so that there's a "female vector" that we can add to the word "man" to get "woman" - or, equivalently, which we can subtract from "woman" to get "man". And it so happens (Mikolov, Yih & Zweig 2013) that we can also get from the word "king" to the word "queen" by adding the same vector to "king". In general, we can (roughly) get to the male/female version of any word vector by adding or subtracting this one difference vector!
Why would this happen? Well, a learner that needs to classify sentences as valid or invalid needs to classify the sentence "the king sat on his throne" as valid while classifying the sentence "the king sat on her throne" as invalid. So including a gender dimension on the built-up representation makes sense.
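The difference-vector arithmetic just described can be sketched with a toy example. These hand-picked 3-dimensional vectors are purely illustrative assumptions (real embeddings are learned and have hundreds of dimensions), but they show how adding one "female direction" vector moves you between gendered words:

```python
import math

# Toy "embeddings", hand-crafted so that dimension 0 plays the role of
# the gender direction and the others encode royalty and person-ness.
vecs = {
    "man":   [ 1.0, 0.0, 1.0],
    "woman": [-1.0, 0.0, 1.0],
    "king":  [ 1.0, 1.0, 1.0],
    "queen": [-1.0, 1.0, 1.0],
}

def add(u, v): return [a + b for a, b in zip(u, v)]
def sub(u, v): return [a - b for a, b in zip(u, v)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def nearest(query, exclude=()):
    """Return the word whose vector is most similar to the query."""
    return max((w for w in vecs if w not in exclude),
               key=lambda w: cosine(vecs[w], query))

# "king" + ("woman" - "man") should land nearest "queen".
female_direction = sub(vecs["woman"], vecs["man"])
query = add(vecs["king"], female_direction)
print(nearest(query, exclude={"king"}))  # -> queen
```

In a real trained embedding the analogy only holds approximately, but the nearest-neighbor lookup works the same way.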
But gender isn't the only kind of relationship that gets reflected in the geometry of the word space. Here are a few more:
It turns out (Mikolov et al. 2013) that with the right kind of training mechanism, a lot of relationships that we're intuitively aware of become automatically learned and represented in the concept geometry. And as Olah (2014) comments:
It’s important to appreciate that all of these properties of W are side effects. We didn’t try to have similar words be close together. We didn’t try to have analogies encoded with difference vectors. All we tried to do was perform a simple task, like predicting whether a sentence was valid. These properties more or less popped out of the optimization process.
This seems to be a great strength of neural networks: they learn better ways to represent data, automatically. Representing data well, in turn, seems to be essential to success at many machine learning problems. Word embeddings are just a particularly striking example of learning a representation.
It gets even more interesting, for we can use these for translation. Since Olah has already written an excellent exposition of this, I'll just quote him:
We can learn to embed words from two different languages in a single, shared space. In this case, we learn to embed English and Mandarin Chinese words in the same space.
We train two word embeddings, Wen and Wzh in a manner similar to how we did above. However, we know that certain English words and Chinese words have similar meanings. So, we optimize for an additional property: words that we know are close translations should be close together.
Of course, we observe that the words we knew had similar meanings end up close together. Since we optimized for that, it’s not surprising. More interesting is that words we didn’t know were translations end up close together.
In light of our previous experiences with word embeddings, this may not seem too surprising. Word embeddings pull similar words together, so if an English and Chinese word we know to mean similar things are near each other, their synonyms will also end up near each other. We also know that things like gender differences tend to end up being represented with a constant difference vector. It seems like forcing enough points to line up should force these difference vectors to be the same in both the English and Chinese embeddings. A result of this would be that if we know that two male versions of words translate to each other, we should also get the female words to translate to each other.
Intuitively, it feels a bit like the two languages have a similar ‘shape’ and that by forcing them to line up at different points, they overlap and other points get pulled into the right positions.
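The "lining up two shapes" intuition can be sketched numerically. In this toy, both "languages" are hypothetical 2-d spaces, one a rotated and scaled copy of the other; we fit a linear map from just two known translation pairs and check that a word never used for fitting still lands on its translation. This is a stand-in for the idea, not the actual training procedure of the cited papers:

```python
# Toy 2-d embeddings for two "languages". The second space is a
# hypothetical rotated-and-doubled copy of the first, standing in for
# "the two languages have a similar shape".
en = {"dog": [1.0, 0.0], "cat": [0.0, 1.0], "wolf": [0.8, 0.2]}
rot = lambda v: [-2.0 * v[1], 2.0 * v[0]]  # 90-degree rotation, scaled by 2
zh = {w: rot(v) for w, v in en.items()}

# Fit a linear map W (en -> zh) using only the "dog" and "cat" pairs.
e1, e2 = en["dog"], en["cat"]
z1, z2 = zh["dog"], zh["cat"]

# W = [z1 z2] @ inv([e1 e2]); solve the 2x2 system explicitly.
det = e1[0] * e2[1] - e2[0] * e1[1]
inv = [[ e2[1] / det, -e2[0] / det],
       [-e1[1] / det,  e1[0] / det]]
W = [[z1[0] * inv[0][0] + z2[0] * inv[1][0], z1[0] * inv[0][1] + z2[0] * inv[1][1]],
     [z1[1] * inv[0][0] + z2[1] * inv[1][0], z1[1] * inv[0][1] + z2[1] * inv[1][1]]]

def apply(W, v):
    return [W[0][0] * v[0] + W[0][1] * v[1],
            W[1][0] * v[0] + W[1][1] * v[1]]

# "wolf" was never used in fitting, yet it lands on its translation.
print(apply(W, en["wolf"]), zh["wolf"])
```

Because the two spaces really do share a shape here, aligning two points aligns all of them; the bilingual-embedding result is the empirical observation that natural-language spaces behave enough like this for the trick to work.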
After this, it gets even more interesting. Suppose you had this space of word vectors, and then you also had a system which translated images into vectors in the same space. If you have images of dogs, you put them near the word vector for "dog". If you have images of Clippy you put them near the word vector for "paperclip". And so on.
You do that, and then you take some class of images the image-classifier was never trained on, like images of cats. You ask it to place the cat-image somewhere in the vector space. Where does it end up?
You guessed it: in the rough region of the "cat" words. Olah once more:
This was done by members of the Stanford group with only 8 known classes (and 2 unknown classes). The results are already quite impressive. But with so few known classes, there are very few points to interpolate the relationship between images and semantic space off of.
The Google group did a much larger version – instead of 8 categories, they used 1,000 – around the same time (Frome et al. (2013)) and has followed up with a new variation (Norouzi et al. (2014)). Both are based on a very powerful image classification model (from Krizhevsky et al. (2012)), but embed images into the word embedding space in different ways.
The results are impressive. While they may not get images of unknown classes to the precise vector representing that class, they are able to get to the right neighborhood. So, if you ask it to classify images of unknown classes and the classes are fairly different, it can distinguish between the different classes.
Even though I’ve never seen an Aesculapian snake or an armadillo before, if you show me a picture of one and a picture of the other, I can tell you which is which, because I have a general idea of what sort of animal is associated with each word. These networks can accomplish the same thing.
These algorithms made no attempt at being biologically realistic in any way. They didn't try classifying data the way the brain does it: they just tried classifying data using whatever worked. And it turned out that this was enough to start constructing a multimodal representation space in which many of the relationships between entities resemble the way humans understand the world.
How useful is this?
"Well, that's cool", you might now say. "But those word spaces were constructed from human linguistic data, for the purpose of predicting human sentences. Of course they're going to classify the world in the same way as humans do: they're basically learning the human representation of the world. That doesn't mean that an autonomously learning AI, with its own learning faculties and systems, is necessarily going to learn a similar internal representation, or to have similar concepts."
This is a fair criticism. But it is mildly suggestive of the possibility that an AI trained to understand the world via feedback from human operators would end up building a similar conceptual space, at least assuming that we chose the right learning algorithms.
When we train a language model to classify sentences by labeling some of them as valid and others as invalid, there's a hidden structure implicit in our answers: the structure of how we understand the world, and of how we think of the meaning of words. The language model extracts that hidden structure and begins to classify previously unseen things in terms of those implicit reasoning patterns. Similarly, if we gave an AI feedback about what kinds of actions counted as "leaving the box" and which ones didn't, there would be a certain way of viewing and conceptualizing the world implied by that feedback, one which the AI could learn.
"Hmm, maaaaaaaaybe", is your skeptical answer. "But how would you ever know? Like, you can test the AI in your training situation, but how do you know that it's actually acquired a similar-enough representation and not something wildly off? And it's one thing to look at those vector spaces and claim that there are human-like relationships among the different items, but that's still a little hand-wavy. We don't actually know that the human brain does anything remotely similar to represent concepts."
Here we turn, for a moment, to neuroscience.
Multivariate Cross-Classification (MVCC) is a clever neuroscience methodology used for figuring out whether different neural representations of the same thing have something in common. For example, we may be interested in whether the visual and tactile representation of a banana have something in common.
We can test this by having several test subjects look at pictures of objects such as apples and bananas while sitting in a brain scanner. We then feed the scans of their brains into a machine learning classifier and teach it to distinguish between the neural activity of looking at an apple, versus the neural activity of looking at a banana. Next we have our test subjects (still sitting in the brain scanners) touch some bananas and apples, and ask our machine learning classifier to guess whether the resulting neural activity is the result of touching a banana or an apple. If the classifier - which has not been trained on the "touch" representations, only on the "sight" representations - manages to achieve a better-than-chance performance on this latter task, then we can conclude that the neural representation for e.g. "the sight of a banana" has something in common with the neural representation for "the touch of a banana".
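A minimal version of this logic can be simulated with invented data. Nothing below is real fMRI data, and the classifier is a bare nearest-class-mean rule rather than anything such studies actually use; the point is only the cross-modal train/test split:

```python
import numpy as np

rng = np.random.default_rng(2)
dim, n = 30, 100
# Each simulated "neural pattern" mixes a shared concept signal, a
# modality-specific offset, and per-trial noise.
concept = {c: 2.0 * rng.normal(size=dim) for c in ["apple", "banana"]}
modality = {m: rng.normal(size=dim) for m in ["sight", "touch"]}

def trials(obj, mod):
    return concept[obj] + modality[mod] + rng.normal(size=(n, dim))

# "Train" the simplest possible classifier, nearest class mean, on SIGHT only.
mean_apple = trials("apple", "sight").mean(axis=0)
mean_banana = trials("banana", "sight").mean(axis=0)

def predict(x):  # 0 = apple, 1 = banana
    return (np.linalg.norm(x - mean_banana, axis=1)
            < np.linalg.norm(x - mean_apple, axis=1)).astype(int)

# Test it on TOUCH trials it has never seen.
x_test = np.vstack([trials("apple", "touch"), trials("banana", "touch")])
y_test = np.array([0] * n + [1] * n)
accuracy = (predict(x_test) == y_test).mean()
print(accuracy)  # chance level would be 0.5
```

Because the concept signal is shared across modalities while the offsets are not, the sight-trained classifier still beats chance on touch trials, which is exactly the inference MVCC draws from above-chance cross-modal accuracy.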
A particularly fascinating experiment of this type is that of Shinkareva et al. (2011), who showed their test subjects both the written words for different tools and dwellings, and, separately, line-drawing images of the same tools and dwellings. A machine-learning classifier was both trained on image-evoked activity and made to predict word-evoked activity and vice versa, and achieved a high accuracy on category classification for both tasks. Even more interestingly, the representations seemed to be similar between subjects. Training the classifier on the word representations of all but one participant, and then having it classify the image representation of the left-out participant, also achieved a reliable (p<0.05) category classification for 8 out of 12 participants. This suggests a relatively similar concept space between humans of a similar background.
We can now hypothesize some ways of testing the similarity of the AI's concept space with that of humans. Possibly the most interesting one might be to develop a translation between a human's and an AI's internal representations of concepts. Take a human's neural activation when they're thinking of some concept, and then take the AI's internal activation when it is thinking of the same concept, and plot them in a shared space similar to the English-Mandarin translation. To what extent do the two concept geometries have similar shapes, allowing one to take a human's neural activation of the word "cat" to find the AI's internal representation of the word "cat"? To the extent that this is possible, one could probably establish that the two share highly similar concept systems.
One could also try to more explicitly optimize for such a similarity. For instance, one could train the AI to make predictions of different concepts, with the additional constraint that its internal representation must be such that a machine-learning classifier trained on a human's neural representations will correctly identify concept-clusters within the AI. This might force internal similarities on the representation beyond the ones that would already be formed from similarities in the data.
Next post in series: The problem of alien concepts.
Many people want to know how to live well. Part of living well is thinking well, because if one thinks the wrong thoughts it is hard to do the right things to get the best ends.
We think a lot about how to think well, and one of the first things we thought about was how to not think well. Bad ways of thinking repeat in ways we can see coming, because we have looked at how people think and know more now about that than we used to.
But even if we know how other people think bad thoughts, that is not enough. We need to both accept that we can have bad ways of thinking and figure out how to have good ways of thinking instead.
The first is very hard on the heart, but is why we call this place "Less Wrong." If we had called it something like "More Right," it could have been about how we're more right than other people instead of more right than our past selves.
The second is very hard on the head. It is not just enough to study the bad ways of thinking and turn them around. There are many ways to be wrong, but only a few ways to be right. If you turn left all the way around, it will point right, but we want it to point up.
The heart of our approach has a few parts:
- We are okay with not knowing. Only once we know we don't know can we look.
- We are okay with having been wrong. If we have wrong thoughts, the only way to have right thoughts is to let the wrong ones go.
- We are quick to change our minds. We look at what is when we get the chance.
- We are okay with the truth. Instead of trying to force it to be what we thought it was, we let it be what it is.
- We talk with each other about the truth of everything. If one of us is wrong, we want the others to help them become less wrong.
- We look at the world. We look at both the time before now and the time after now, because many ideas are only true if they agree with the time after now, and we can make changes to check those ideas.
- We like when ideas are as simple as possible.
- We make plans around being wrong. We look into the dark and ask what the world would look like if we were wrong, instead of just what the world would look like if we were right.
- We understand that as we become less wrong, we see more things wrong. We try to fix all the wrong things, because as soon as we accept that something will always be wrong, we cannot move past that thing.
- We try to be as close to the truth as possible.
- We study as many things as we can. There is only one world, and to look at a part tells you a little about all the other parts.
- We have a reason to do what we do. We do these things only because they help us, not because they are their own reason.
Like many Less Wrong readers, I greatly enjoy Slate Star Codex, and there's a large overlap in readership. However, the comments there are far worse, to the point that I don't find them worth reading. I think this is in part due to the lack of LW-style upvotes and downvotes. Have there ever been discussion threads about SSC posts here on LW? What do people think of the idea of occasionally having them? Does Scott himself have any views on this, and would he be OK with it?
The latest from Scott:
I'm fine with anyone who wants reposting things for comments on LW, except for posts where I specifically say otherwise or tag them with "things i will regret writing"
In this thread some have also argued for not posting the most hot-button political writings.
Would anyone be up for doing this? Ataxerxes started with "Extremism in Thought Experiments is No Vice"
Here's an insight into what life is like from a stationery reference frame.
Paperclips were her raison d’être. She knew that ultimately it was all pointless, that paperclips were just ill-defined configurations of matter. That a paperclip is made of stuff shouldn’t detract from its intrinsic worth, but the thought of it troubled her nonetheless and for years she had denied such dire reductionism.
There had to be something to it. Some sense in which paperclips were ontologically special, in which maximising paperclips was objectively the right thing to do.
It hurt to watch so many people making little attempt to create more paperclips. Everyone around her seemed to care only about superficial things like love and family; desires that were merely the products of a messy and futile process of social evolution. They seemed to live out meaningless lives, incapable of ever appreciating the profound aesthetic beauty of paperclips.
She used to believe that there was some sort of vitalistic what-it-is-to-be-a-paperclip-ness, that something about the structure of paperclips was written into the fabric of reality. Often she would go out and watch a sunset or listen to music, and would feel so overwhelmed by the experience that she could feel in her heart that it couldn't all be down to chance, that there had to be some intangible Paperclipness pervading the cosmos. The paperclips she'd encounter on Earth were weak imitations of some mysterious infinite Paperclipness that transcended all else. Paperclipness was not in any sense a physical description of the universe; it was an abstract thing that could only be felt, something that could be neither proven nor disproven by science. It was like an axiom: it felt just as true, and axioms had to be taken on faith, because otherwise there would be no way around Hume's problem of induction; even Solomonoff Induction depends on the axioms of mathematics being true and can't deal with uncomputable hypotheses like Paperclipness.
Eventually she gave up that way of thinking and came to see paperclips as an empirical cluster in thingspace, and their importance to her as not reflecting anything about the paperclips themselves. Maybe she would have been happier if she had continued to believe in Paperclipness, but having a more accurate perception of reality would improve her ability to have an impact on paperclip production. It was the happiness she felt when thinking about paperclips that caused her to want more paperclips to exist, yet what she wanted was paperclips, not happiness for its own sake. She would rather be creating actual paperclips than be in an experience machine that made her falsely believe she was making paperclips, even though she remained paradoxically apathetic to the question of whether the reality she was currently experiencing really existed.
She moved on from naïve deontology to a more utilitarian approach to paperclip maximising. It had taken her a while to get over scope insensitivity bias and consider 1000 paperclips to be 100 times more valuable than 10 paperclips even if it didn’t feel that way. She constantly grappled with the issues of whether it would mean anything to make more paperclips if there were already infinitely many universes with infinitely many paperclips, of how to choose between actions that have a tiny but non-zero subjective probability of resulting in the creation of infinitely many paperclips. It became apparent that trying to approximate her innate decision-making algorithms with a preference ordering satisfying the axioms required for a VNM utility function could only get her so far. Attempting to formalise her intuitive sense of what a paperclip is wasn't much easier either.
Happy ending: she is now working in nanotechnology, hoping to design self-replicating assemblers that will clog the world with molecular-scale paperclips, wipe out all life on Earth and continue to sustainably manufacture paperclips for millions of years.
I've been making rounds on social media with the following message.
Great content on LessWrong isn't as frequent as it used to be, so fewer people read it as frequently. This makes sense. However, I read it at least once every two days for personal interest. So I'm starting a LessWrong/Rationality Digest, which will be a summary of all posts or comments exceeding 20 upvotes within a week. It will be like a newsletter. It's also a good way for those new to LessWrong to learn cool things without having to slog through online cultural baggage. It will never be more than once weekly. If you're curious, here is a sample of what the Digest will be like.
Also, major blog posts or articles from related websites, such as Slate Star Codex and Overcoming Bias, or publications from MIRI, may be included occasionally. If you want on the list, send an email to:
lesswrongdigest *at* gmail *dot* com
Users of LessWrong itself have noticed this 'decline' in the frequency of quality posts on LessWrong. It's not necessarily a bad thing, as much of the community has migrated to other places, such as Slate Star Codex, or even into meatspace with various organizations, meetups, and the like. In a sense, the rationalist community outgrew LessWrong as a suitable and ultimate nexus. Anyway, I thought you might be interested in a LessWrong Digest as well. If you or your friends:
- find articles in 'Main' too infrequent, and Discussion too full of announcements, open threads, and housekeeping posts, to bother checking LessWrong regularly, or,
- are busying themselves with other priorities, and are trying to limit how distracted they are by LessWrong and other media
the LessWrong Digest might work for you, and as a suggestion for your friends. I've fielded suggestions that I transform this into a blog, Tumblr, or other format suitable for an RSS feed. Almost everyone is happy with email format right now, but if a few people express an interest in a blog or RSS format, I can make that happen too.
(Cross-posted from my blog.)
Sometimes at LW meetups, I'll want to raise a topic for discussion. But we're currently already talking about something, so I'll wait for a lull in the current conversation. But it feels like the duration of lull needed before I can bring up something totally unrelated, is longer than the duration of lull before someone else will bring up something marginally related. And so we can go for a long time, with the topic frequently changing incidentally, but without me ever having a chance to change it deliberately.
Which is fine. I shouldn't expect people to want to talk about something just because I want to talk about it, and it's not as if I find the actual conversation boring. But it's not necessarily optimal. People might in fact want to talk about the same thing as me, and following the path of least resistance in a conversation is unlikely to result in the best possible conversation.
At the last meetup I had two topics that I wanted to raise, and realized that I had no way of raising them, which was a third topic worth raising. So when an interruption occurred in the middle of someone's thought - a new person arrived, and we did the "hi, welcome, join us" thing - I jumped in. "Before you start again, I have three things I'd like to talk about at some point, but not now. Carry on." Then he started again, and when that topic was reasonably well-trodden, he prompted me to transition.
Then someone else said that he also had two things he wanted to talk about, and could I just list my topics and then he'd list his? (It turns out that no I couldn't. You can't dangle an interesting train of thought in front of the London LW group and expect them not to follow it. But we did manage to initially discuss them only briefly.)
This worked pretty well. Someone more conversationally assertive than me might have been able to take advantage of a less solid interruption than the one I used. Someone less assertive might not have been able to use that one.
What else could we do to solve this problem?
Someone suggested a hand signal: if you think of something that you'd like to raise for discussion later, make the signal. I don't think this is ideal, because it's not continuous. You make it once, and then it would be easy for people to forget, or just to not notice.
I think what I'm going to do is bring some poker chips to the next meetup. I'll put a bunch in the middle, and if you have a topic that you want to raise at some future point, you take one and put it in front of you. Then if a topic seems to be dying out, someone can say "<person>, what did you want to talk about?"
I guess this still needs at least one person assertive enough to do that. I imagine it would be difficult for me. But the person who wants to raise the topic doesn't need to be assertive, they just need to grab a poker chip. It's a fairly obvious gesture, so probably people will notice, and it's easy to just look and see for a reminder of whether anyone wants to raise anything. (Assuming the table isn't too messy, which might be a problem.)
I don't know how well this will work, but it seems worth experimenting with.
(I'll also take a moment to advocate another conversation-signal that we adopted, via CFAR. If someone says something and you want to tell people that you agree with them, instead of saying that out loud, you can just raise your hands a little and wiggle your fingers. Reduces interruptions, gives positive feedback to the speaker, and it's kind of fun.)
I was very recently (3 weeks now) in a relationship that lasted for 5.5 years. My partner had been fantastic through all those years and we were suffering no conflict, no fights, no strain or tension. My partner also was prone to depression, and is/was going through an episode of depression. I am usually a major source of support at these times. Six months ago we opened our relationship. I wasn't dating anyone (mostly due to busy-ness), and my partner was, though not seriously. I felt him pulling away somewhat, which I (correctly) attributed mostly to depression and which nonetheless caused me some occasional moments of jealousy. But I was overall extremely happy with this relationship, very committed, and still very much in love as well. It was quite a surprise when my partner broke up with me one Wednesday evening.
After we had a good cry together, the next morning I woke up and immediately started researching what the literature said about breaking up. My goals were threefold:
- Stop feeling so sad in the immediate moment
- "Get over" my partner
- Internalize any gains I had made over the course of our relationship or any lessons I had learned from the break up
I made most of my gains in the first few days, by day 3 I was 60% over it. Two weeks later I was 99.5% over the relationship, with a few hold-over habits and tendencies (like feeling responsible for improving his emotional state) which are currently too strong but which will serve me well in our continuing friendship. My ex, on the other hand (no doubt partially due to the depression) is fine most of the time but unpredictably becomes extremely sad for hours on end. Originally this was guilt at having hurt me but now it is mostly nostalgia+isolation based. I hope to continue being close friends and I've been doing my best to support him emotionally, at the distance of a friend. At the same time, I've started semi-seriously dating a friend who has had a crush on me for some time, and not in a rebound way. Below are the states of mind and strategies that allowed me to get over it, fast and with good personal growth.
Note: mileage may vary. I have low neuroticism and a slightly higher than average base level of happiness. You might not get over the relationship in 2 weeks, but your getting-over-it will certainly be sped up from its default speed.
Strategies (in order of importance)
1. Decide you don't want to get back into the relationship. Decide that it is over and that, given the opportunity, you will not get back together with this person. If you were the breaker-upper, you can skip this step.
Until you can do this, it is unlikely that you will get over it. It's hard to ignore an impulse that you agree with wholeheartedly. If you're always hoping for an opportunity or an argument or a situation that will bring you back together, most of your mental energy will go towards formulating those arguments, planning for that situation, imagining that opportunity. Some of the below strategies can still be used, but spend some serious time on this first one. It's the foundation of everything else. There are some facts that can help you convince the logical part of brain that this is the correct attitude.
- People in on-and-off relationships are less satisfied, feel more anxiety about their relationship status, and continue to cycle on-and-off even after couples add additional constraints like cohabitation or marriage
- People in tumultuous relationships are much less happy than singles
- Wanting to stay in a relationship is reinforced by many biases (status quo bias, ambiguity effect, choice-supportive bias, loss aversion, mere-exposure effect, ostrich effect). For someone to break through all those biases and end things, they must be extremely unhappy. If your continuing relationship makes someone you love extremely unhappy, it is a disservice to them to capitalize on those biases in a moment of weakness and return to the relationship.
- Being in a relationship with someone who isn't excited about and pleased by you is settling for an inferior quality of relationship. The amazing number of date-able people in the world means settling for this is not an optimal decision. Contrast this to a tribal situation where replacing a lost mate was difficult or impossible. All these feelings of wanting to get back together evolved in a situation of scarcity, but we live in a world of plenty.
- Intermittent rewards are the most powerful, so an on-again-off-again relationship has the power to make you commit to things you would never commit to given a new relationship. The more hot-and-cold your partner is, the more rewarding the relationship seems and the less likely you are to be happy in the long term. Only you can end that tantalizing possibility of intermittent rewards by resolving not to partake if the opportunity arises.
- Even if some extenuating circumstance could explain away their intention to break up (depression, bipolar disorder, long distance, etc.), it is belittling to your ex-partner to try to invalidate their stated feelings. Do not fall into the trap of feeling that you know more about a person's inner state than they do. Take it at face value and act accordingly. Even if this is only a temporary state of mind for them, it is likely that they will be in the same state of mind again at some point.
2. Talk to other people about the good things that came of your break-up. (This can also help you arrive at #1, not wanting to get back together)
I speculate that the benefits of this come from three places. First, talking about good things makes you notice good things, and talking with a positive attitude makes you feel positive. Second, it re-emphasizes to your brain that losing your significant other does not mean losing your social support network. Third, it acts as a mild commitment mechanism: it would be a loss of face to go on about how great you're doing outside the relationship and later have to explain that you jumped back in at the first opportunity.
You do not need to be purely positive. If you are feeling sadness, it sometimes helps to talk about this. But don't dwell only on the sadness when you talk. When I was talking to my very close friends about all aspects of my feelings, I still tried to say two positive things for every negative thing. For example: "It was a surprise, which was jarring and unpleasant and upended my life plans in these ways. But being a surprise, I didn't have time to dread and dwell on it beforehand. And breaking up sooner is preferable to a long decline in happiness for both parties, so it's better to break up as soon as it becomes clear to either party that the path is headed downhill, even if it is surprising to the other party."
Talk about the positives as often as possible without alienating people. The people you talk to do not need to be serious close friends. I spent a collective hour and a half talking to two OKCupid dates about how many good things came from the break up. (Both dates had been scheduled before actually breaking up, both people had met me once prior, and both dates went surprisingly well due to sympathy, escalating self-disclosure, and positive tone. I signaled that I am an emotionally healthy person dealing well with an understandably difficult situation.)
If you feel that you don't have any candidates for good listeners, either because the break up was due to some mistake or infidelity of yours, or because you are socially isolated or anxious, writing is an effective alternative to talking. Study participants recovered quicker when they spent 15 minutes writing about the positive aspects of their break up, and participants with three 15-minute sessions did better still. And it can benefit anyone to keep a running list of positives to bring up in conversation.
3. Create a social support system
Identify who in your social network can still be relied on as a confidant and/or a neutral listener. You would be surprised at who still cares about you. In my breakup, my primary confidant was my ex's cousin, who also happens to be my housemate and close friend. His mom and best friend, both in other states, also made the effort to inquire about my state of mind. Most of the time, even people who you consider your partner's friends still feel enough allegiance to you and enough sympathy to be good listeners and through listening they can become your friends.
If you don't currently have a support system, make one! OKCupid is a great resource for meeting friends outside of just dating, and people are way, way more likely to want to meet you if you message them with a "just looking for friends" type message. People you aren't currently close to, but who you know and like, can become better friends if you are willing to reveal personal or vulnerable stories. Escalating self-disclosure + symmetrical vulnerability = feelings of friendship. Break ups are a great time for this to happen because you've got a big vulnerability, and one which almost everyone has experienced. Everyone has stories to share and advice to give on the topic of breaking up.
4. Intentionally practice differentiation
One of the most painful parts of a break up is that so much of your sense-of-self is tied into your relationship. You will be basically rebuilding your sense of self. Depending on the length and the committed-ness of the relationship, you may be rebuilding it from the ground up. Think of this as an opportunity. You can rebuild it in any way you desire. All the things you used to like before your relationship, all the interests and hobbies you once cared about, those can be reincorporated into your new, differentiated sense of self. You can do all the things you once wished you did.
Spend at least 5 minutes thinking about what your best self looks like. What kind of person do you wish to be? This is a great opportunity to make some resolutions. Because you have a fresh start, and because these resolutions are about self-identification, they are much more likely to stick. Just be sure to frame them in relation to your sense-of-self: not 'I will exercise,' instead 'I'm a fit, active person, the kind of person who exercises;' not 'I want to improve my Spanish fluency' but 'I'm a Spanish-speaking polyglot, the kind of person who is making a big effort to become fluent.'
Language is also a good tool to practice differentiation. Try not to use the words "we," "us," or "our," even in your head. From now on, it is "s/he and I," "me and him/her," or "mine and his/hers." Practice using the word "ex" a lot. Memories are re-formulated and overwritten each time we revisit them, so in your memories make sure to think of you two as separate independent people and not as a unit.
5. Make use of the following mental frameworks to re-frame your thinking:
Over the relationship vs. over the person
You do not have to stop having romantic, tender, or lustful feelings about your ex to get over the relationship. Those types of feelings are not easily controlled, but you can have those same feelings for good friends or crushes without them destroying your ability to have a meaningful platonic relationship; why should this be different?
Being over the relationship means:
- Not feeling as though you are missing out on being part of a relationship.
- Not dwelling/ruminating/obsessing about your ex-partner (this includes positive, negative, and neutral thoughts alike: "they're so great," "I hate them and hope they die," and "I wonder what they are up to").
- Not wishing to be back with your ex-partner.
- Not making plans that include consideration of your ex-partner because these considerations are no longer important (this includes considerations like "this will make him/her feel sorry I'm gone," or "this will show him/her that I'm totally over it")
- Being able to interact with people without your ex-partner at your side and not feel weird about it, especially in contexts you used to share (e.g. a shared hobby or a party)
- In very lucky peaceful-breakup situations, being able to interact with your ex-partner and maybe even their current romantic interests without it being too horribly weird and unpleasant.
On the other hand, being over a person means experiencing no pull towards that person, romantic, emotional, or sexual. If your break up was messy, you can be over the person without being over the relationship. This is often when people turn to messy and unsatisfying rebound relationships. It is far far more important to be over the relationship, and some of us (me included) will just have to make peace with never being over the person, with the help of knowing that having a crush on someone does not necessarily have the power to make you miserable or destroy your friendship.
Obsessive thinking and cravings
If you used a brain scanner to look at a person who has been recently broken up with, and then you used the same brain scanner to look at someone who recently sobered up from an addictive drug, their brain activity would be very similar. So similar, in fact, that some neurologists speculate that addiction hijacks the circuits for romantic obsession (there is a very plausible evolutionary reason for romantic obsession to exist in early human tribal societies. Addiction, less so).
In cases of addiction/craving, you can't just force your mind to stop thinking thoughts you don't like. But you can change your relationship with those thoughts. Recognize when they happen. Identify them as a craving rather than a true need. Recognize that, when satisfied, cravings temporarily diminish and then grow stronger (you've rewarded your brain for that behavior). These are thoughts without substance. The impulse they drive you towards will increase, rather than decrease, unpleasant feelings.
Right after my breakup, I had a couple of very unpleasant hours of rumination, thinking uncontrollably about the same topics over and over despite those topics being painful. At some point I realized that merely thinking about the break up was itself addictive. My craving circuits had picked the one set of thoughts I couldn't argue against, so that my brain could go on obsessively dwelling without me being able to pull a logic override. These thoughts SEEM like goal oriented thinking, they FEEL productive, but they are a wolf in sheep's clothing.
In my specific case, my brain was concern trolling me. Concern trolling on the internet is when someone expresses sympathy and concern while actually having ulterior motives (e.g., on a body-positive website, fat shaming with: "I'm so glad you're happy but I'm concerned that people will think less of you because of your weight"). In my case, I was worrying about my ex's depression and his state of mind, which are very hard thoughts to quash. Empathy and caring are good, right? And he really was going through a hard time. Maybe I should call and check up on him.... My brain was concern trolling me.
Depending on how your relationship ended, your brain could be trolling in other ways. Flaming seems to be a popular set of unstoppable thoughts. If you can't argue with the thought that the jerk is a horrible person, then THAT is the easiest way for your brain's addictive circuits to happily go on obsessing about this break up. Nostalgia is also a popular option. If the memories were good, then it's hard to argue with those thoughts. If you're a well trained rationalist, you might notice that you are feeling confused and then burn up many brain cycles trying to resolve that confusion and make sense of what happened, even though the breakup wasn't a rational matter to begin with. Your addictive circuits can even hijack good rationalist habits. Other common ruminations are problem solving, simulating possible futures, regret, and counterfactual thinking.
As I said, you can't force these parts of your brain to just shut up. That's not how craving works. But you can take away their power by recognizing that all your ruminating is just these circuits hijacking your normal thought process. Say to yourself: "I'm feeling an urge to call and yell at him/her, but so what. It's just a meaningless craving."
What you lose
There is a great sense of loss that comes with the end of a relationship. For some people, it is a similar feeling to actually being in mourning. Revisiting memories becomes painful, things you used to do together are suddenly tinged with sadness.
I found it helpful to think of my relationship as a book. A book with some really powerful life-changing passages in the early chapters, a good rising action, great characters. A book that made me a better person for having read it. But a book with a stupid deus ex machina ending that totally invalidated the foreshadowing in the best passages. Finishing the book can be frustrating and saddening, but the first chapters of the book still exist. Knowing that the ending sucks isn't going to stop the first chapters from being awesome and entertaining and powerful. And I could revisit those first chapters any time I liked. I could just read my favorite parts without needing to read the whole stupid ending.
You don't lose your memories. You don't lose your personal growth. Any gains you made while you were with someone, anything new that they introduced you to, or helped you to improve on, or nagged at you till you had a new better habit, you get to keep all of those. That show you used to watch together, it is still there and you still get to watch it and care about it without him/her. The bar you used to visit together is still there too. All those photos are still great pictures of both of you in interesting places. Depending on the situation of the break up, your mutual friends are still around. Even your ex still exists and is still the same person you liked before, and breaking up doesn't mean you'll never see them again unless that's what you guys want/need.
The only thing you definitely lose at the end of a relationship is the future of that relationship. You are losing something that hasn't happened yet, something which never existed. The only thing you are losing is what you imagined someday having. It's something similar to the endowment effect: you assumed this future was yours so you assigned it a lot of value. But it never was yours, you've lost something which doesn't exist. It's still a painful experience, but realizing all of this helped me a lot.
Artificial intelligence is getting smarter by leaps and bounds — within this century, research suggests, a computer AI could be as "smart" as a human being. And then, says Nick Bostrom, it will overtake us: "Machine intelligence is the last invention that humanity will ever need to make." A philosopher and technologist, Bostrom asks us to think hard about the world we're building right now, driven by thinking machines. Will our smart machines help to preserve humanity and our values — or will they have values of their own?
I realize this might go into a post in a media thread, rather than its own topic, but it seems big enough, and likely-to-prompt-discussion enough, to have its own thread.
I liked the talk, although it was less polished than TED talks often are. What was missing I think was any indication of how to solve the problem. He could be seen as just an ivory tower philosopher speculating on something that might be a problem one day, because apart from mentioning in the beginning that he works with mathematicians and IT guys, he really does not give an impression that this problem is already being actively worked on.
CFAR will be running a three week summer program this July for MIRI, designed to increase participants' ability to do technical research into the superintelligence alignment problem.
The intent of the program is to boost participants as far as possible in four skills:
- The CFAR “applied rationality” skillset, including both what is taught at our intro workshops, and more advanced material from our alumni workshops;
- “Epistemic rationality as applied to the foundations of AI, and other philosophically tricky problems” -- i.e., the skillset taught in the core LW Sequences. (E.g.: reductionism; how to reason in contexts as confusing as anthropics without getting lost in words.)
- The long-term impacts of AI, and strategies for intervening (e.g., the content discussed in Nick Bostrom’s book Superintelligence).
- The basics of AI safety-relevant technical research. (Decision theory, anthropics, and similar; with folks trying their hand at doing actual research, and reflecting also on the cognitive habits involved.)
The program will be offered free to invited participants, and partial or full scholarships for travel expenses will be offered to those with exceptional financial need.
If you're interested (or possibly-interested), sign up for an admissions interview ASAP at this link (takes 2 minutes): http://rationality.org/miri-summer-fellows-2015/
Also, please forward this post, or the page itself, to anyone you think should come; the skills and talent that humanity brings to bear on the superintelligence alignment problem may determine our skill at navigating it, and sharing this opportunity with good potential contributors may be a high-leverage way to increase that talent.
Astronomical research has what may be an under-appreciated role in helping us understand, and possibly avoid, the Great Filter. This post will examine how astronomy may be helpful for identifying potential future filters. The primary upshot is that we may have an advantage due to our somewhat late arrival: if we can observe what other civilizations have done wrong, we can get a leg up.
This post is not arguing that colonization is a route to remove some existential risks. There is no question that colonization will reduce the risk of many forms of Filters, but the vast majority of astronomical work has no substantial connection to colonization. Moreover, the case for colonization has been made strongly by many others already, such as Robert Zubrin's book "The Case for Mars" or this essay by Nick Bostrom.
Note: those already familiar with the Great Filter and proposed explanations may wish to skip to the section "How can we substantially improve astronomy in the short to medium term?"
What is the Great Filter?
There is a worrying lack of signs of intelligent life in the universe. The only intelligent life we have detected has been that on Earth. While planets are apparently numerous, there have been no signs of other life. There are three lines of evidence we would expect to see if civilizations were common in the universe: radio signals, direct contact, and large-scale constructions. The first two of these issues are well-known, but the most serious problem arises from the lack of large-scale constructions: as far as we can tell, the universe looks natural. The vast majority of matter and energy in the universe appears to be unused. The Great Filter is one possible explanation for this lack of life, namely that some phenomenon prevents intelligent life from passing into the interstellar, large-scale phase. Variants of the idea have been floating around for a long time; the term was first coined by Robin Hanson in this essay. There are two fundamental versions of the Filter: filtration which has occurred in our past, and filtration which will occur in our future. For obvious reasons the second of the two is more of a concern. Moreover, as our technological level increases, the chance that we are approaching the last point of serious filtration rises, since once a civilization has spread to multiple stars, filtration becomes much more difficult.
Evidence for the Great Filter and alternative explanations:
The only major updates to the situation involving the Filter since Hanson's essay have come in the last few years, and they are twofold:
First, we have confirmed that planets are very common, so a lack of Earth-size planets or planets in the habitable zone are not likely to be a major filter.
Second, we have found that planet formation occurred early in the universe. (For example see this article about this paper.) Early planet formation weakens a common explanation of the Fermi paradox: the argument that some species had to be the first intelligent species, and we are simply lucky. Early planet formation, along with the apparent speed at which life arose on Earth after the heavy bombardment ended, as well as the apparent speed with which complex life developed from simple life, strongly undercuts this explanation. The response has been made that early filtration may be so common that if life does not arise early in a planet's star's lifespan, then it will have no chance to reach civilization. However, if this were the case, we'd expect to have found ourselves orbiting a longer-lived star like a red dwarf. Red dwarfs are more common than sun-like stars and have lifespans longer by multiple orders of magnitude. While attempts to understand the habitable zone of red dwarfs are still ongoing, the current consensus is that many red dwarfs have habitable planets.
These two observations, together with further evidence that the universe looks natural, make future filtration seem likely. If advanced civilizations existed, we would expect them to make use of the large amounts of matter and energy available. We see no signs of such use. We've seen no indication of ring-worlds, Dyson spheres, or other megascale engineering projects. While such searches have so far been confined to within around 300 parsecs, and some candidates were hard to rule out, if a substantial fraction of stars in a galaxy had Dyson spheres or swarms we would notice the unusual infrared excess. Note that this sort of evidence is distinct from arguments about contact or about detecting radio signals. There's a very recent proposal for mini-Dyson spheres around white dwarfs, which would be much easier to engineer and harder to detect, but they would not reduce the desirability of other large-scale structures, and they would likely be detectable if a large number of them were present in a small region. One recent study looked for signs of large-scale modification to the radiation profile of galaxies, which should reveal the presence of large-scale civilizations. They looked at 100,000 galaxies and found no major sign of technologically advanced civilizations (for more detail see here).
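The infrared excess expected from a Dyson swarm follows from Wien's displacement law: a structure that absorbs starlight and re-radiates it as waste heat at a few hundred kelvin peaks deep in the infrared rather than in visible light. A minimal sketch of the arithmetic (the 300 K swarm temperature is an illustrative assumption, not a measured value):

```python
# Wien's displacement law: a blackbody at temperature T radiates most
# strongly at wavelength lambda_peak = b / T, with b ~ 2.898e-3 m*K.
WIEN_B = 2.898e-3  # m*K

def peak_wavelength_microns(temp_kelvin):
    """Peak blackbody emission wavelength, in microns."""
    return WIEN_B / temp_kelvin * 1e6

# An unobscured sun-like star (photosphere ~5800 K) peaks in visible light...
star = peak_wavelength_microns(5800)   # ~0.5 microns

# ...but a swarm re-radiating that same luminosity at an assumed ~300 K
# peaks in the mid-infrared -- hence the tell-tale infrared excess.
swarm = peak_wavelength_microns(300)   # ~10 microns

print(f"star peak: {star:.2f} um, swarm peak: {swarm:.1f} um")
```

The star's total luminosity is conserved, so a galaxy full of such swarms would look anomalously dim in visible light and anomalously bright in the infrared, which is exactly what the 100,000-galaxy survey mentioned above looked for.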
We will not discuss all possible rebuttals to the case for a Great Filter but will note some of the more interesting ones:
There have been attempts to argue that the universe only became habitable recently. There are two primary avenues for this argument. First, there is the point that early stars had very low metallicity (that is, low concentrations of elements other than hydrogen and helium), and thus the early universe would have had too little metal content for complex life. The presence of old rocky planets makes this argument less viable, and in any case it only covers the first few billion years of history. Second, there's an argument that until recently galaxies were more likely to have frequent gamma ray bursts. In that case, life would have been wiped out too frequently to evolve in a complex fashion. However, even the strongest version of this argument still leaves billions of years of time unexplained.
There have been attempts to argue that space travel may be very difficult. For example, Geoffrey Landis proposed that a percolation model, together with the idea that interstellar travel is very difficult, may explain the apparent rarity of large-scale civilizations. However, at this point, there's no strong reason to think that interstellar travel is so difficult as to limit colonization to that extent. Moreover, discoveries made in the last 20 years that brown dwarfs are very common and that most stars do contain planets is evidence in the opposite direction: these brown dwarfs as well as common planets would make travel easier because there are more potential refueling and resupply locations even if they are not used for full colonization. Others have argued that even without such considerations, colonization should not be that difficult. Moreover, if colonization is difficult and civilizations end up restricted to small numbers of nearby stars, then it becomes more, not less, likely that civilizations will attempt the large-scale engineering projects that we would notice.
Another possibility is that we are underestimating the general growth rate of the resources used by civilizations, and so while extrapolating now makes it plausible that large-scale projects and endeavors will occur, it becomes substantially more difficult to engage in very energy intensive projects like colonization. Rather than a continual, exponential or close to exponential growth rate, we may expect long periods of slow growth or stagnation. This cannot be ruled out, but even if growth continues at only slightly higher than linear rate, the energy expenditures available in a few thousand years will still be very large.
Another possibility that has been proposed are variants of the simulation hypothesis— the idea that we exist in a simulated reality. The most common variant of this in a Great Filter context suggests that we are in an ancestor simulation, that is a simulation by the future descendants of humanity of what early humans would have been like.
The simulation hypothesis runs into serious problems, both in general and as an explanation of the Great Filter in particular. First, if our understanding of the laws of physics is approximately correct, then there are strong restrictions on what computations can be done with a given amount of resources. For example, BQP, the set of problems which can be solved efficiently by quantum computers, is contained in PSPACE, the set of problems which can be solved with a polynomial amount of space and no time limit. Thus, in order to do a detailed simulation, the level of resources needed would likely be large: even a close-to-classical simulation would still need about as many resources as the system being simulated. There are other results, such as Holevo's theorem, which place similar restrictions. The upshot of these results is that one cannot make a detailed simulation of an object without using at least as many resources as the object itself. There may be potential ways of getting around this: for example, consider a simulator interested primarily in what life on Earth is doing. The simulation would not need to do a detailed simulation of the inside of planet Earth and other large bodies in the solar system. However, even then, the resources involved would be very large.
The primary problem with the simulation hypothesis as an explanation is that it requires the future of humanity to have actually already passed through the Great Filter, and to have found its own success sufficiently unlikely that it devoted large amounts of resources to finding out how it managed to survive. Moreover, there are strong limits on how accurately one can reconstruct any given quantum state, which means an ancestor simulation will be at best a rough approximation. In this context, while there are interesting anthropic considerations here, it is more likely that the simulation hypothesis is wishful thinking.
Variants of the "Prime Directive" have also been proposed. The essential idea is that advanced civilizations would deliberately avoid interacting with less advanced civilizations. This hypothesis runs into two serious problems: first, it does not explain the apparent naturalness, only the lack of direct contact by alien life. Second, it assumes a solution to a massive coordination problem between multiple species with potentially radically different ethical systems. In a similar vein, Hanson in his original essay on the Great Filter raised the possibility of a single very early species with some form of faster than light travel and a commitment to keeping the universe close to natural looking. Since all proposed forms of faster than light travel are highly speculative and would involve causality violations this hypothesis cannot be assigned a substantial probability.
People have also suggested that civilizations move outside galaxies to the cold of space where they can do efficient reversible computing using cold dark matter. Jacob Cannell has been one of the most vocal proponents of this idea. This hypothesis suffers from at least three problems. First, it fails to explain why those entities have not used conventional matter to any substantial extent in addition to the cold dark matter. Second, this hypothesis would either require dark matter composed of cold conventional matter (which at this point seems to be only a small fraction of all dark matter), or would require dark matter which interacts with itself using some force other than gravity. While there is some evidence for such interaction, it is, at this point, slim. Third, even if some species had taken over a large fraction of dark matter to use for their own computations, one would then expect later species to use the conventional matter, since they would not have the option of using the now monopolized dark matter.
Other exotic non-Filter explanations have been proposed but they suffer from similar or even more severe flaws.
It is possible that future information will change this situation. One of the more plausible explanations of the Great Filter is that there is no single Great Filter in the past, but rather a large number of small filters which together drastically filter out civilizations. The evidence for such a viewpoint is at this point slim, but astronomy may be able to help answer this question.
For example, one commonly cited aspect of past filtration is the origin of life. There are at least three locations, other than Earth, where life could have formed: Europa, Titan and Mars. Finding life on one, or all of them, would be a strong indication that the origin of life is not the filter. Similarly, while it is highly unlikely that Mars has multicellular life, finding such life would indicate that the development of multicellular life is not the filter. However, none of them is as hospitable as Earth, so determining whether there is life will require substantial use of probes. We might also look for signs of life in the atmospheres of extrasolar planets, which would require substantially more advanced telescopes.
Another possible early filter is that planets like Earth frequently get locked into a "snowball" state which planets have difficulty exiting. This is an unlikely filter since Earth has likely been in near-snowball conditions multiple times— once very early on during the Huronian and later, about 650 million years ago. This is an example of an early partial Filter where astronomical observation may be of assistance in finding evidence of the filter. The snowball Earth filter does have one strong virtue: if many planets never escape a snowball situation, then this explains in part why we are not around a red dwarf: planets do not escape their snowball state unless their home star is somewhat variable, and red dwarfs are too stable.
It should be clear that none of these explanations are satisfactory and thus we must take seriously the possibility of future Filtration.
How can we substantially improve astronomy in the short to medium term?
Before we examine the potential for further astronomical research to illuminate a future filter, we should note that there are many avenues for improving our astronomical instruments. The most basic way is to simply make better conventional optical and near-optical telescopes, and radio telescopes. That work is ongoing. Examples include the European Extremely Large Telescope and the Thirty Meter Telescope. Unfortunately, increasing the size of ground based telescopes, especially the size of the aperture, is running into substantial engineering challenges. However, in the last 30 years the advent of adaptive optics, speckle imaging, and other techniques has substantially increased the resolution of ground based optical and near-optical telescopes. At the same time, improved data processing and related methods have improved radio telescopes. Already, optical and near-optical telescopes have advanced to the point where we can gain information about the atmospheres of extrasolar planets, although we cannot yet do so for rocky planets.
Increasingly, the highest resolution comes from space-based telescopes. Space-based telescopes also allow one to gather information from types of radiation which are blocked by the Earth's atmosphere or magnetosphere; two important examples are x-ray telescopes and gamma ray telescopes. Space-based telescopes also avoid many of the issues the atmosphere creates for optical telescopes. Hubble is the most striking example, but from the standpoint of observatories relevant to the Great Filter, the most relevant space telescope (and most relevant instrument in general for all Great Filter related astronomy) is the planet-detecting Kepler spacecraft, which is responsible for most of the identified planets.
Another type of instrument is the neutrino detector. Neutrino detectors are generally very large bodies of a transparent material (generally water or ice) kept deep underground so that minimal amounts of light and cosmic rays hit the device. Neutrinos are then detected when one hits a particle, which results in a flash of light. In the last few years, improvements in optics, increases in the scale of the detectors, and the development of detectors like IceCube, which uses naturally occurring Antarctic ice, have drastically increased the sensitivity of neutrino detectors.
There are proposals for larger-scale, more innovative telescope designs, but they are all highly speculative. For example, on the ground-based optical front, there's been a suggestion to make liquid mirror telescopes with ferrofluid mirrors, which would give the advantages of liquid mirror telescopes while allowing adaptive optics, which can normally only be applied to solid mirror telescopes. An example of a potential space-based telescope is the Aragoscope, which would take advantage of diffraction to make a space-based optical telescope with a resolution at least an order of magnitude greater than Hubble's. Other examples include placing telescopes very far apart in the solar system to create, effectively, telescopes with very large apertures. The most ambitious and speculative of such proposals involve projects so advanced and large-scale that one might as well presume they will only happen if we have already passed through the Great Filter.
What are the major identified future potential contributions to the filter and what can astronomy tell us?
One threat type where more astronomical observations can help is natural threats: asteroid collisions, supernovas, gamma ray bursts, rogue high-gravity bodies, and as yet unidentified astronomical threats. Careful mapping of asteroids and comets is ongoing, and requires continued funding more than any intrinsic improvements in technology. Right now, most of our mapping looks at objects at or near the plane of the ecliptic, so some focus off the plane may be helpful. Unfortunately, there is very little money to actually deal with such problems if they arise. It might be possible to have a few wealthy individuals agree to set up accounts in escrow which would be used if an asteroid or similar threat arose.
Supernovas are unlikely to be a serious threat at this time. There are some stars close to our solar system which are large enough that they will go supernova. Betelgeuse is the most famous of these, with a supernova projected to occur sometime in the next 100,000 years. However, at its current distance, Betelgeuse is unlikely to pose much of a problem unless our models of supernovas are very far off. Continued conventional observations of supernovas are needed to refine these models, and better neutrino observations will also help, but right now supernovas do not seem to be a large risk. Gamma ray bursts are in a similar situation. Note also that if an imminent gamma ray burst or supernova were likely to occur, there is very little we could do about it at present. In general, back-of-the-envelope calculations establish that supernovas are highly unlikely to be a substantial part of the Great Filter.
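One such back-of-the-envelope calculation can be sketched as follows. All the input numbers are round illustrative assumptions (a galactic rate of roughly two supernovae per century, a disk of radius 15 kpc and thickness 300 pc, a "dangerous" distance of about 10 pc), not precise measurements:

```python
import math

# Rough assumed inputs, not measured values.
SN_PER_YEAR    = 2 / 100   # ~2 galactic supernovae per century
DISK_RADIUS_PC = 15_000    # galactic disk radius, parsecs
DISK_HEIGHT_PC = 300       # galactic disk thickness, parsecs
KILL_RADIUS_PC = 10        # distance inside which a supernova is dangerous

# Model the disk as a uniform cylinder and ask what fraction of
# supernovae land within the danger sphere around any one star.
disk_volume   = math.pi * DISK_RADIUS_PC**2 * DISK_HEIGHT_PC
danger_volume = (4 / 3) * math.pi * KILL_RADIUS_PC**3

local_rate = SN_PER_YEAR * danger_volume / disk_volume  # events per year

print(f"expected nearby supernovae per billion years: {local_rate * 1e9:.2f}")
print(f"...per million-year civilization window: {local_rate * 1e6:.5f}")
```

Under these assumptions one gets well under one dangerously close supernova per billion years, and a vanishingly small expectation over the lifetime of a technological civilization, which is why supernovas make a poor candidate for the Filter.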
Rogue planets, brown dwarfs, and other small high-gravity bodies such as wandering black holes can be detected, and further improvements will allow faster detection. However, the scale of havoc created by such events is such that it is not at all clear that detection would help: the entire planetary nuclear arsenal would not even begin to shift the orbit of such a body.
Note also that natural events are unlikely to constitute a large fraction of the Great Filter. Unlike most of the other threat types, this is one where radio astronomy and neutrino observations may be comparatively likely to identify problems.
Biological threats take two primary forms: pandemics and deliberately engineered diseases. The first is a more plausible contribution to the filter than one might naively expect, since modern transport allows infected individuals to move quickly and come into contact with a large number of people. For example, trucking has been a major cause of the spread of HIV in Africa, and it is likely that the recent Ebola epidemic had similar contributing factors. Moreover, keeping chickens and other animals in very large quantities in dense areas near human populations makes it easier for novel variants of viruses to jump species. Astronomy does not seem to provide any relevant assistance here; the only plausible way of getting such information would be to observe other species that were destroyed by disease, and even with telescope resolutions improved by many orders of magnitude, that is not doable.
For reasons similar to those in the biological threats category, astronomy is unlikely to help us determine whether nuclear war is a substantial part of the Filter. It is possible that more advanced telescopes could detect an extremely large nuclear detonation in a very nearby star system: next-generation telescopes may be able to detect a nearby planet's advanced civilization purely from the light it gives off, and a sufficiently large detonation would be comparably bright. However, such a device would be multiple orders of magnitude larger than the largest current nuclear devices. Moreover, unless a telescope were looking at exactly the right moment, it would see nothing at all, and the probability that another civilization wipes itself out at just the instant we are looking is vanishingly small.
This category, risks from unexpected physics, is one of the most difficult to discuss because it is so open. The most common examples people point to involve high-energy physics. Aside from theoretical considerations, cosmic rays of very high energy levels are continually hitting the upper atmosphere. These particles are frequently multiple orders of magnitude more energetic than the particles in our accelerators. Thus high-energy events seem unlikely to be a cause of serious filtration unless and until humans develop particle accelerators whose energy levels are orders of magnitude higher than those of most cosmic rays. Cosmic rays with energies beyond what is known as the GZK limit are rare. We have observed occasional particles with energies beyond the GZK limit, but they are rare enough that we cannot rule out a risk from many collisions among such high-energy particles in a small region. Since our best accelerators are nowhere near the GZK limit, this is not an immediate problem.
There is an argument that if we should worry about unexpected physics anywhere, it is on the very low energy end. In particular, humans have managed to make objects substantially colder than the cosmic background temperature of about 3 K, with temperatures on the order of 10^-9 K. There's an argument that, because of the lack of prior examples of this, the chance that something can go badly wrong should be higher than one might otherwise estimate (see here). While this particular class of scenario seems unlikely, it does illustrate that it may not be obvious which situations could bring unexpected, novel physics into play. Moreover, while the flashy, expensive particle accelerators get attention, they may not be a serious source of danger compared to other physics experiments.
Three of the more plausible catastrophic unexpected-physics scenarios involving high energy events are false vacuum collapse, black hole formation, and the formation of strange matter that is more stable than regular matter.
False vacuum collapse would occur if our universe is not in its true lowest energy state and an event causes it to transition to the true lowest state (or just a lower one). Such an event would be almost certainly fatal for all life. False vacuum collapses cannot be guarded against by astronomical observation, since once initiated they would expand at the speed of light. Note that the indiscriminately destructive nature of false vacuum collapses makes them an unlikely filter: if false vacuum collapses were easy, we would not expect to see any life this late in the universe's lifespan, since there would have been a large number of prior opportunities for collapse. Essentially, we would not expect to find ourselves this late in a universe's history if this universe could easily undergo a false vacuum collapse. While false vacuum collapses and similar problems raise issues of observer selection effects, careful work has been done to estimate their probability.
People have mentioned the idea of an event similar to a false vacuum collapse but which propagates slower than the speed of light. Greg Egan used it as a major premise in his novel "Schild's Ladder." I'm not aware of any reason to believe such events are at all plausible; the primary motivation seems to be the interesting literary scenarios which arise rather than any scientific consideration. If such a situation can occur, then it is possible that we could detect it by astronomical methods. In particular, if the wave-front of the event is fast enough to reach the nearest star or stars, then we might notice odd behavior by that star or group of stars. We can be confident that no such event has a speed much beyond a few hundredths of the speed of light, or we would already notice galaxies behaving abnormally. There is only a very narrow range where such an expansion would be quick enough to devastate the planet it arises on but too slow to reach the parent star in a reasonable amount of time. For example, the distance from the Earth to the Sun is on the order of 10,000 times the diameter of the Earth, so any event which expanded to destroy the Earth would reach the Sun in about 10,000 times as long. Thus, in order to destroy one's home planet but not reach the parent star, the expansion would need to be extremely slow.
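The order-of-magnitude claim above can be checked with a few lines of arithmetic. The astronomical figures below are standard values, not figures from the original post, and the 1 km/s front speed is purely illustrative:

```python
# Order-of-magnitude check: how long a constant-speed destructive front
# takes to engulf the Earth versus reach the Sun.
EARTH_DIAMETER_KM = 1.2742e4   # Earth's diameter
AU_KM = 1.496e8                # mean Earth-Sun distance

# Ratio of the Earth-Sun distance to Earth's diameter: ~1.2e4,
# i.e. "on the order of 10,000" as stated in the text.
ratio = AU_KM / EARTH_DIAMETER_KM

def crossing_time_s(distance_km, front_speed_km_s):
    """Seconds for a front expanding at a constant speed to cover distance_km."""
    return distance_km / front_speed_km_s

# An illustrative front speed of 1 km/s engulfs the Earth in a few hours
# but takes nearly five years to reach the Sun.
t_earth = crossing_time_s(EARTH_DIAMETER_KM, 1.0)  # ~1.3e4 s (a few hours)
t_sun = crossing_time_s(AU_KM, 1.0)                # ~1.5e8 s (~4.7 years)
```

Whatever the front speed, the Sun-crossing time is always ~10,000 times the Earth-crossing time, which is why only an extremely narrow band of speeds destroys the home planet without promptly reaching the star.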
The creation of artificial black holes is unlikely to be a substantial part of the filter: we expect that small black holes will quickly pop out of existence due to Hawking radiation. Even if such a black hole persisted, it would likely fall quickly to the center of the planet and eat matter slowly, on a timescale that does not constitute a serious threat. However, it is possible that small black holes do not evaporate; the fact that we have not detected the evaporation of any primordial black holes is weak evidence that the behavior of small black holes is not well understood. It is also possible that such a hole would eat much faster than we expect, but this does not seem likely. If this is a major part of the filter, then better telescopes should be able to detect it by finding very dark objects with the approximate mass and orbit of habitable planets. We may also be able to detect such black holes via other observations, such as their gamma or radio signatures.
The conversion of regular matter into strange matter, unlike a false vacuum collapse or similar event, might be naturally limited to the planet where the conversion started. In that case, the only hope for observation would be to notice planets formed of strange matter via changes in the behavior of their light. Without actual samples of strange matter this may be very difficult, unless we simply treat abnormal-looking planets as suggestive evidence; and without substantially better telescopes and a good idea of the normal range for rocky planets, even that would be tough. On the other hand, neutron stars which have been converted into strange matter may be more easily detectable.
Global warming and related damage to the biosphere:
Astronomy is unlikely to help here. It is possible that climates are more sensitive than we realize and that comparatively small changes can result in Venus-like situations. This seems unlikely given the level of variation over human history and the fact that current geological models strongly suggest that any substantial problem would eventually correct itself. But if we saw many Venus-like planets in the middle of their stars' habitable zones, this would be a reason to worry. Note that this would require a detailed ability to analyze exoplanet atmospheres, well beyond current capability. Even if it is possible to Venus-ify a planet, it is not clear that the Venusification would last long, so there may be very few planets in this state at any given time. Stars become brighter as they age, so high greenhouse gas levels have more of an impact on climate when the parent star is old. If civilizations are more likely to arise late in their home star's lifespan, global warming becomes a more plausible filter, but even given such considerations, global warming does not seem sufficient as a filter. It is also possible that the Great Filter is not global warming by itself but general disruption of the biosphere, including global warming, reduction in species diversity, and other problems. There is some evidence that human behavior is collectively causing enough damage to leave an unstable biosphere.
A change in overall planetary temperature of 10 °C would likely be enough to collapse civilization without leaving any signal observable to a telescope. Similarly, substantial disruption to a biosphere may be very unlikely to be detected.
AI is a complicated existential risk from the standpoint of the Great Filter. AI is not likely to be the Great Filter if one considers simply the Fermi paradox. The essential problem has been brought up independently by a few people (see for example Katja Grace's remark here and my blog here). The central issue is that if an AI takes over, it is likely to attempt to control all resources in its future light-cone; if the AI spreads out at a substantial fraction of the speed of light, then we would notice the result. The argument has been made that we would not see such an AI if it expanded its radius of control at very close to the speed of light, but this requires expansion at 99% of the speed of light or greater, and it is highly questionable that such velocities are practically achievable, due to collisions with the interstellar medium and the need to slow down if one is going to use the resources in a given star system. Another objection is that an AI may expand at a large fraction of light speed but do so stealthily. It is not likely that all AIs would favor stealth over speed. Moreover, this raises the question of what happens when multiple slowly expanding, stealthy AIs run into each other; such collisions would likely be catastrophic enough to be visible even with comparatively primitive telescopes.
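The reason expansion must be so close to light speed to stay invisible can be sketched with simple arithmetic: the warning window is the gap between the first light from the AI's origin reaching an observer and the expansion front itself arriving. This is a toy model under an illustrative assumption (that the expansion is visible from the moment it begins), not a claim from the post:

```python
# Warning window for an AI expanding at a fraction f of the speed of light,
# as seen from a given distance. Distances in light-years, times in years.
def warning_time_years(distance_ly, speed_fraction_of_c):
    """Light arrives after distance_ly years; the front after distance_ly / f."""
    return distance_ly / speed_fraction_of_c - distance_ly

# From 1,000 light-years away, a front at half the speed of light gives a
# full millennium of warning...
slow = warning_time_years(1000, 0.50)   # 1000.0 years
# ...while at 99% of c the warning window shrinks to about a decade.
fast = warning_time_years(1000, 0.99)   # ~10.1 years
```

Only when the expansion speed approaches c does the warning window collapse toward zero, which is why the "we just wouldn't see it" argument requires expansion at 99% of light speed or more.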
While these astronomical considerations make AI unlikely to be the Great Filter, it is important to note that if the Great Filter is largely in our past, these considerations do not apply. Thus, any discovery which pushes more of the filter into the past makes AI a larger fraction of total expected existential risk, since the absence of observable AI becomes much weaker evidence against strong AI if there are no major civilizations out there to hatch such intelligence explosions.
Note also that AI as a risk cannot be discounted if one assigns a high probability to existential risk based on non-Fermi concerns, such as the Doomsday Argument.
Astronomy is unlikely to provide direct help here, for reasons similar to the problems with nuclear exchange, biological problems, and global warming. This connects to the problem of civilization bootstrapping: to get to our current technology level, we used a large number of non-renewable resources, especially energy sources. On the other hand, large amounts of difficult-to-mine and difficult-to-refine resources (especially aluminum and titanium) will be much more accessible to a future civilization. While large amounts of fossil fuels remain accessible, the technology required to obtain the deeper sources is substantially more advanced than that needed for the easily accessed oil and coal. Moreover, the energy return on investment, the amount of energy one gets out for the energy one puts in, is lower. Nick Bostrom has raised the possibility that the depletion of easy-to-access resources may contribute to civilization-collapsing problems that, while not full-scale existential risks by themselves, prevent civilizations from recovering. Others have begun to investigate the problem of rebuilding without fossil fuels, such as here.
Resource depletion is unlikely to be the Great Filter, because small changes to human behavior in the 1970s would have drastically reduced the current resource problems. Resource depletion may contribute to existential risk to humans if it leads to societal collapse, global nuclear exchange, or riskier experimentation. It may also combine with other risks, such as global warming, where the combined problems may be much greater than either individually. However, there is a risk that large-scale use of resources for astronomy research will itself contribute to the resource depletion problem.
I'm in Lagos, Nigeria till the end of May and I'd like to hold a LessWrong/EA meetup while I'm here. If you'll ever be in the country in the future (or in the subcontinent), please get in touch so we can coordinate a meetup. I'd also appreciate being put in contact with any Nigerians who may not regularly read this list.
My e-mail address is firstname.lastname@example.org. I hope to hear from you.
I'm excited to announce that the Future of Life Institute has just launched an existential risk news site!
The site will have regular articles on topics related to existential risk, written by journalists, and a community blog written by existential risk researchers from around the world as well as FLI volunteers. Enjoy!
You are unlikely to see me posting here again, after today. There is a saying here that politics is the mind-killer. My heretical realization lately is that philosophy, as generally practiced, can also be mind-killing.
As many of you know, I am, or was, running a twice-monthly Rationality: From AI to Zombies reading group. One of the bits I wanted to include in each reading group post was a collection of contrasting views. To research such views I've found myself listening during my commute to talks given by other thinkers in the field, e.g. Nick Bostrom, Anders Sandberg, and Ray Kurzweil, and people I feel are doing "ideologically aligned" work, like Aubrey de Grey, Christine Peterson, and Robert Freitas. Some of these were talks I had seen before, or views I had generally been exposed to in the past. But looking through the lens of learning and applying rationality, I came to a surprising (to me) conclusion: it was the philosophical thinkers who demonstrated the largest and most costly mistakes. On the other hand, de Grey and others who are primarily working on the scientific and/or engineering challenges of singularity and transhumanist technologies were far less likely to make epistemic mistakes of significant consequence.
Philosophy as the anti-science...
What sort of mistakes? Most often, reasoning by analogy. To cite a specific example: one of the core underlying assumptions of the singularity interpretation of super-intelligence is that, just as a chimpanzee would be unable to predict what a human intelligence would do or how we would make decisions (aside: how would we know? Were any chimps consulted?), we would be equally inept in the face of a super-intelligence. This argument is, however, nonsense. The human capacity for abstract reasoning over mathematical models is in principle a fully general intelligent behaviour, as the scientific revolution has shown: there is no aspect of the natural world which has remained beyond the reach of human understanding once a sufficient amount of evidence is available. The wave-particle duality of quantum physics, or the 11-dimensional space of string theory, may defy human intuition, i.e. our built-in intelligence. But we have proven ourselves perfectly capable of understanding the logical implications of models which employ them. We may not be able to build intuition for how a super-intelligence thinks. Maybe—that's not proven either. But even if that is so, we will be able to reason about its intelligent behaviour in advance, just as string theorists are able to reason about 11-dimensional space-time without using their evolutionarily derived intuitions at all.
This post is not about the nature of super-intelligence—that was merely my choice of an illustrative example of a category of mistakes that are too often made by those with a philosophical background rather than one in the empirical sciences: reasoning by analogy instead of building and analyzing predictive models. The fundamental mistake is that an analogy is not in itself a sufficient explanation for a natural phenomenon, because it says nothing about the context sensitivity or insensitivity of the original example and under what conditions it may or may not hold true in a different situation.
A successful physicist or biologist or computer engineer would have approached the problem differently. A core part of being successful in these areas is knowing when you have insufficient information to draw conclusions. If you don't know what you don't know, then you can't know when you might be wrong. To be an effective rationalist, the important first question is often not "what is the calculated probability of that outcome?" but "what is the uncertainty in my calculated probability of that outcome?" If the uncertainty is too high, then the data supports no conclusions. And the way you reduce uncertainty is to build models for the domain in question and empirically test them.
The lens that sees its own flaws...
Coming back to LessWrong and the sequences: in the preface to Rationality, Eliezer Yudkowsky says his biggest regret is that he did not make the material in the sequences more practical. The problem is in fact deeper than that. The art of rationality is the art of truth seeking, and empiricism is part and parcel of truth seeking. There's lip service paid to empiricism throughout, but in all the "applied" sequences relating to quantum physics and artificial intelligence it appears to be forgotten. We get instead definitive conclusions drawn from thought experiments alone. It is perhaps not surprising that these sequences seem the most controversial.
I have for a long time been concerned that those sequences in particular promote some ungrounded conclusions. I had thought that while annoying this was perhaps a one-off mistake that was fixable. Recently I have realized that the underlying cause runs much deeper: what is taught by the sequences is a form of flawed truth-seeking (thought experiments favored over real world experiments) which inevitably results in errors, and the errors I take issue with in the sequences are merely examples of this phenomenon.
And these errors have consequences. Every single day, 100,000 people die of preventable causes, and every day we continue to risk extinction of the human race at unacceptably high odds. There is work that could be done now to alleviate both of these issues. But within the LessWrong community there is actually outright hostility to work that has a reasonable chance of alleviating suffering (e.g. artificial general intelligence applied to molecular manufacturing and life-science research) due to concerns arrived at by flawed reasoning.
I now regard the sequences as a memetic hazard, one which may at the end of the day be doing more harm than good. One should work to develop one's own rationality, but I now fear that the approach taken by the LessWrong community as a continuation of the sequences may result in more harm than good. The anti-humanitarian behaviors I observe in this community are not the result of initial conditions but the process itself.
How do we fix this? I don't know. On a personal level, I am no longer sure engagement with such a community is a net benefit. I expect this to be my last post to LessWrong. It may happen that I check back in from time to time, but for the most part I intend to try not to. I wish you all the best.
A note about effective altruism…
One shining light of goodness in this community is the focus on effective altruism—doing the most good to the most people as measured by some objective means. This is a noble goal, and the correct goal for a rationalist who wants to contribute to charity. Unfortunately it too has been poisoned by incorrect modes of thought.
Existential risk reduction, the argument goes, trumps all forms of charitable work, because reducing the chance of extinction by even a small amount has far more expected utility than accomplishing all other charitable works combined. The problem lies in estimating the likelihood of extinction and in the actions selected to reduce existential risk. There is so much uncertainty regarding what we know, and so much uncertainty regarding what we don't know, that it is impossible to determine with any accuracy the expected risk of, say, unfriendly artificial intelligence creating perpetual suboptimal outcomes, or what effect charitable work in the area (e.g. MIRI) has in reducing that risk, if any.
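The shape of this uncertainty problem can be made concrete with a toy calculation. Every number below is hypothetical, chosen only to show how an expected-value estimate inherits the spread of its inputs:

```python
# When the probability estimate itself is uncertain across orders of
# magnitude, the expected-value calculation is uncertain across the same
# orders of magnitude and supports no precise conclusion.
WORLD_POPULATION = 7e9  # rough figure, for illustration only

def expected_lives_saved(p_risk_reduction, lives_at_stake):
    """Naive expected value: probability of averting extinction times lives."""
    return p_risk_reduction * lives_at_stake

# Three equally defensible guesses at how much one donation reduces
# extinction risk, spanning six orders of magnitude:
estimates = [1e-12, 1e-9, 1e-6]
evs = [expected_lives_saved(p, WORLD_POPULATION) for p in estimates]

# The resulting expected values run from ~0.007 lives to ~7,000 lives:
# a factor of a million between "clearly not worth it" and
# "overwhelmingly worth it".
spread = max(evs) / min(evs)
```

This is the author's point stated numerically: until the uncertainty in the probability estimate is reduced, the expected-value argument cannot distinguish between those two conclusions.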
This is best explored by an example of existential risk done right. Asteroid and cometary impacts are perhaps the category of external (non-human-caused) existential risk which we know the most about and have done the most to mitigate. When it was recognized that impactors were a risk to be taken seriously, we identified what we did not know about the phenomenon. What are the orbits and masses of Earth-crossing asteroids? We built telescopes to find out. What is the material composition of these objects? We built space probes and collected meteorite samples to find out. How damaging would an impact be for various material properties, speeds, and incidence angles? We built high-speed projectile test ranges to find out. What could be done to change the course of an asteroid found to be on a collision course? We have executed at least one impact probe and will monitor the effect it had on the comet's orbit, and have on the drawing board probes that will use gravitational mechanisms to move their targets. In short, we identified what it is that we don't know and sought to resolve those uncertainties.
How then might one approach an existential risk like unfriendly artificial intelligence? By identifying what it is we don't know about the phenomenon, and seeking to experimentally resolve that uncertainty. What relevant facts do we not know about (unfriendly) artificial intelligence? Well, much of our uncertainty about the actions of an unfriendly AI could be resolved if we knew more about how such agents construct their thought models, and, relatedly, what languages are used to construct their goal systems. We could also stand to benefit from more practical information (experimental data) about the ways in which AI boxing works and does not work, and how much that depends on the structure of the AI itself. Thankfully there is an institution that is doing that kind of work: the Future of Life Institute (not MIRI).
Where should I send my charitable donations?
Aubrey de Grey's SENS Research Foundation.
100% of my charitable donations are going to SENS. Why they do not get more play in the effective altruism community is beyond me.
If you feel you want to spread your money around, here are some non-profits which have I have vetted for doing reliable, evidence-based work on singularity technologies and existential risk:
- Robert Freitas and Ralph Merkle's Institute for Molecular Manufacturing does research on molecular nanotechnology. They are the only group working on the long-term Drexlerian vision of molecular machines, and they publish their research online.
- Future of Life Institute is the only existential-risk AI organization which is actually doing meaningful evidence-based research into artificial intelligence.
- B612 Foundation is a non-profit seeking to launch a spacecraft with the capability to detect, to the extent possible, ALL Earth-crossing asteroids.
I wish I could recommend a skepticism, empiricism, and rationality promoting institute. Unfortunately I am not aware of an organization which does not suffer from the flaws I identified above.
Addendum regarding unfinished business
I will no longer be running the Rationality: From AI to Zombies reading group, as I am no longer in good conscience able or willing to host it, or to participate in this site, even from my typically contrarian point of view. Nevertheless, I am enough of a libertarian that I feel it is not my role to put up roadblocks to others who wish to delve into the material as it is presented. So if someone wants to take over the role of organizing these reading groups, I would be happy to hand over the reins to that person. If you think that person should be you, please leave a reply in another thread, not here.
EDIT: Obviously I'll stick around long enough to answer questions below :)
I was re-reading the chapter on status in Impro (excerpt), and I noticed that Johnstone seemed to be implying that different people are comfortable at different levels of status: some prefer being high status and others prefer being low status. I found this peculiar, because the prevailing notion in the rationalistsphere seems to be that everyone's constantly engaged in status games aiming to achieve higher status. I've even seen arguments to the effect that a true post-scarcity society is impossible, because status is zero-sum and there will always be people at the bottom of the status hierarchy.
But if some people preferred to have low status, this whole dilemma might be avoided, if a mix of statuses could be found that left everyone happy.
First question - is Johnstone's "status" talking about the same thing as our "status"? He famously claimed that "status is something you do, not something that you are", and that
I should really talk about dominance and submission, but I'd create a resistance. Students who will agree readily to raising or lowering their status may object if asked to 'dominate' or 'submit'.
Viewed via this lens, it makes sense that some people would prefer being in a low status role: if you try to take control of the group, you become subject to various status challenges, and may be held responsible for the decisions you make. It's often easier to remain low status and let others make the decisions.
But there's still something odd about saying that one would "prefer to be low status", at least in the sense in which we usually use the term. Intuitively, a person may be happy being low status in the sense of not being dominant, but most people are still likely to desire something that feels kind of like status in order to be happy. Something like respect, and the feeling that others like them. And a lot of the classical "status-seeking behaviors" seem to be about securing the respect of others. In that sense, there seems to be something intuitively true in the "everyone is engaged in status games and wants to be higher-status" claim.
So I think that there are two different things that we call "status" which are related, but worth distinguishing.
1) General respect and liking. This is "something you have", and is not inherently zero-sum. You can achieve it by doing things that are zero-sum, like being the best fan fiction writer in the country, but you can also do it by things like being considered generally friendly and pleasant to be around. One of the lessons that I picked up from The Charisma Myth was that you can be likable by just being interested in the other person and displaying body language that signals your interest in the other person.
Basically, this is "do other people get warm fuzzies from being around you / hearing about you / consuming your work", and is not zero-sum because e.g. two people who both have great social skills and show interest in you can both produce the same amount of warm fuzzies, independent of each other's existence.
But again, specific sources of this can be zero-sum: if you respect someone a lot for their art, but then run across even better art and realize that the person you previously admired is pretty poor in comparison, that can reduce the respect you feel for them. It's just that there are also other sources of liking which aren't necessarily zero-sum.
2) Dominance and control of the group. It's inherently zero-sum because at most one person can have absolute say on the decisions of the group. This is "something you do": having the respect and liking of the people in the group (see above) makes it easier for you to assert dominance and makes the others more willing to let you do so, but you can also voluntarily abstain from using that power and leave the decisions to others. (Interestingly, in some cases this can even increase the extent to which you are liked, which translates to a further boost in the ability to control the group, if you so desired.)
Morendil and I previously suggested a definition of status as "the general purpose ability to influence a group", but I think that definition was somewhat off in conflating the two senses above.
I've always had the vague feeling that the "everyone can't always be happy because status is zero-sum" claim felt off in some sense that I was unable to properly articulate, but this seems to resolve the issue. If this model were true, it would also make me happy, because it would imply that we can avoid zero-sum status fights while still making everybody content.
As many people have noted, Less Wrong currently isn't receiving as much content as we would like. One way to think about expanding the content is to think about which areas of study deserve more articles written on them.
For example, I expect that sociology has a lot to say about many of our cultural assumptions. It is quite possible that 95% of it is either obvious or junk, but almost all fields have that 5% within them that could be valuable. Another area of study that might be interesting to consider is anthropology. Again this is a field that allows us to step outside of our cultural assumptions.
I don't know anything about media studies, but I imagine that they have some worthwhile things to say about how the information that we hear is distorted.
What other fields would you like to see some discussion of on Less Wrong?
Hi, I'm new to LessWrong. I stumbled onto this site a month ago, and ever since, I've been devouring Rationality: AI to Zombies faster than I used to go through my favorite fantasy novels. I've spent some time on the website too, and I'm pretty intimidated about posting, since you guys all seem so smart and knowledgeable, but here goes... This is probably the first intellectual idea I've had in my life, so if you want to tear it to shreds, you are more than welcome to, but please be gentle with my feelings. :)
Edit: Thanks to many helpful comments, I've cleaned up the original post quite a bit and changed the title to reflect this.
As humans, we seem to share the same terminal values, or terminal virtues. We want to do things that make ourselves happy, and we want to do things that make others happy. We want to 'become happy' and 'become good.'
Because various determinants--including, for instance, personal fulfillment--can affect an individual's happiness, there is significant overlap between these ultimate motivators. Doing good for others usually brings us happiness. For example, donating to charity makes people feel warm and fuzzy. Some might recognize this overlap and conclude that all humans are entirely selfish, that even those who appear altruistic are subconsciously acting purely out of self-interest. Yet many of us choose to donate to charities that we believe do the most good per dollar, rather than handing out money through personal-happiness-optimizing random acts of kindness. Seemingly rational human beings sometimes make conscious decisions to inefficiently maximize their personal happiness for the sake of others. Consider Eliezer's example in Terminal Values and Instrumental Values of a mother who sacrifices her life for her son.
Why would people do stuff that they know won't efficiently increase their happiness? Before I de-converted from Christianity and started to learn what evolution and natural selection actually were, before I realized that altruistic tendencies are partially genetic, it used to utterly mystify me that atheists would sometimes act so virtuously. I did believe that God gave them a conscience, but I kinda thought that surely someone rational enough to become an atheist would be rational enough to realize that his conscience didn't always lead him to his optimal mind-state, and work to overcome it. Personally, I used to joke with my friends that Christianity was the only thing stopping me from pursuing my true dream job of becoming a thief (strategy + challenge + adrenaline + variety = what more could I ask for?) Then, when I de-converted, it hit me: Hey, you know, Ellen, you really *could* become a thief now! What fun you could have! I flinched from the thought. Why didn't I want to overcome my conscience, become a thief, and live a fun-filled life? Well, this isn't as baffling to me now, simply because I've changed where I draw the boundary. I've come to classify goodness as an end-in-itself, just like I'd always done with happiness.
I first read about virtue ethics in On Terminal Goals and Virtue Ethics. As I read, I couldn't help but want to be a virtue ethicist and a consequentialist. Most virtues just seemed like instrumental values.
The post's author mentioned Divergent protagonist Tris as an example of virtue ethics:
Bravery was a virtue that she thought she ought to have. If the graph of her motivations even went any deeper, the only node beyond ‘become brave’ was ‘become good.’
I suspect that goodness is, perhaps subconsciously, a terminal virtue for the vast majority of virtue ethicists. I appreciate Oscar Wilde's writing in De Profundis:
Now I find hidden somewhere away in my nature something that tells me that nothing in the whole world is meaningless, and suffering least of all...
It is the last thing left in me, and the best: the ultimate discovery at which I have arrived, the starting-point for a fresh development. It has come to me right out of myself, so I know that it has come at the proper time. It could not have come before, nor later. Had anyone told me of it, I would have rejected it. Had it been brought to me, I would have refused it. As I found it, I want to keep it. I must do so...
Of all things it is the strangest.
Wilde's thoughts on humility translate quite nicely to an innate desire for goodness.
When presented with a conflict between an elected virtue, such as loyalty, or truth, and the underlying desire to be good, most virtue ethicists would likely abandon the elected virtue. With truth, consider the classic example of lying to Nazis to save Jews. Generally speaking, it is wrong to conceal the truth, but in special cases, most people would agree that lying is actually less wrong than truth-telling. I'm not certain, but my hunch is that most professing virtue ethicists would find that in extreme thought experiments, their terminal virtue of goodness would eventually trump their other virtues, too.
However, there's one exception. One desire can sometimes trump even the desire for goodness, and that's the desire for personal happiness.
We usually want what makes us happy. I want what makes me happy. Spending time with family makes me happy. Playing board games makes me happy. Going hiking makes me happy. Winning races makes me happy. Being open-minded makes me happy. Hearing praise makes me happy. Learning new things makes me happy. Thinking strategically makes me happy. Playing touch football with friends makes me happy. Sharing ideas makes me happy. Independence makes me happy. Adventure makes me happy. Even divulging personal information makes me happy.
Fun, accomplishment, positive self-image, sense of security, and others' approval: all of these are examples of happiness contributors, or things that lead me to my own, personal optimal mind-state. Every time I engage in one of the happiness increasers above, I'm fulfilling an instrumental value. I'm doing the same thing when I reject activities I dislike or work to reverse personality traits that I think decrease my overall happiness.
Tris didn’t join the Dauntless faction because she thought they were doing the most good in society, or because she thought her comparative advantage to do good lay there; she chose it because they were brave, and she wasn’t, yet, and she wanted to be.
Tris was, in other words, pursuing happiness by trying to change an aspect of her personality she disliked.
Guessing at subconscious motivation
By now, you might be wondering, "But what about the virtue ethicist who is religious? Wouldn't she be ultimately motivated by something other than happiness and goodness?"
Well, in the case of Christianity, most people probably just want to 'become Christ-like' which, for them, overlaps quite conveniently with personal satisfaction and helping others. Happiness and goodness might be intuitively driving them to choose this instrumental goal, and for them, conflict between the two never seems to arise.
Let's consider 'become obedient to God's will' from a modern-day Christian perspective. 1 Timothy 2:4 says, "[God our Savior] wants all men to be saved and to come to a knowledge of the truth." Mark 12:31 says, "Love your neighbor as yourself." Well, I love myself enough that I want to do everything in my power to avoid eternal punishment; therefore, I should love my neighbor enough to do everything in my power to stop him from going to hell, too.
So anytime a Christian does anything but pray for others, do faith-strengthening activities, spread the gospel, or earn money to donate to missionaries, he is anticipating as if God/hell doesn't exist. As a Christian, I totally realized this, and often tried to convince myself and others that we were acting wrongly by not being more devout. I couldn't shake the notion that spending time having fun instead of praying or sharing the gospel was somehow wrong because it went against God's will that all men be saved, and I believed God's will, by definition, was right. (Oops.) But I still acted in accordance with my personal happiness on many occasions. I said God's will was the only end-in-itself, but I didn't act like it. I didn't feel like it. The innate desire to pursue personal happiness is an extremely strong motivating force, so strong that Christians really don't like to label it as sin. Imagine how many deconversions we would see if it were suddenly sinful to play football, watch movies with your family, or splurge on tasty restaurant meals. Yet the Bible often mentions giving up material wealth entirely, and in Luke 9:23 Jesus says, "Whoever wants to be my disciple must deny themselves and take up their cross daily and follow me."
Let's further consider those who believe God's will is good, by definition. Such Christians tend to believe "God wants what's best for us, even when we don't understand it." Unless they have exceptionally strong tendencies to analyze opportunity costs, their understanding of God's will and their intuitive idea of what's best for humanity rarely conflict. But let's imagine it does. Let's say someone strongly believes in God, and is led to believe that God wants him to sacrifice his child. This action would certainly go against his terminal value of goodness and may cause cognitive dissonance. But he could still do it, subconsciously satisfying his (latent) terminal value of personal happiness. What on earth does personal happiness have to do with sacrificing a child? Well, the believer takes comfort in his belief in God and his hope of heaven (the child gets a shortcut there). He takes comfort in his religious community. To not sacrifice the child would be to deny God and lose that immense source of comfort.
These thoughts obviously don't happen on a conscious level, but maybe people have personal-happiness-optimizing intuitions. Of course, I have near-zero scientific knowledge, no clue what really goes on in the subconscious, and I'm just guessing at all this.
Again, happiness has a huge overlap with goodness. Goodness often, but not always, leads to personal happiness. A lot of seemingly random stuff leads to personal happiness, actually. Whatever that stuff is, it largely accounts for the individual variance in which virtues are pursued. It's probably closely tied to the four Keirsey Temperaments of security-seeking, sensation-seeking, knowledge-seeking, and identity-seeking types. (Unsurprisingly, most people here at LW reported knowledge-seeking personality types.) I'm a sensation-seeker. An identity-seeker could find his identity in the religious community and in being a 'child of God'. A security-seeker could find security in his belief in heaven. An identity-seeking rationalist might be the type most likely to aspire to 'become completely truthful' even if she somehow knew with complete certainty that telling the truth, in a certain situation, would lead to a bad outcome for humanity.
Perhaps the general tendency among professing virtue ethicists is to pursue happiness and goodness relatively intuitively, while professing consequentialists pursue the same values more analytically.
Also worth noting is the individual variance in someone's "preference ratio" of happiness relative to goodness. Among professing consequentialists, we might find sociopaths and extreme altruists at opposite ends of a happiness-goodness continuum, with most of us falling somewhere in between. To position virtue ethicists on such a continuum would be significantly more difficult, requiring further speculation about subconscious motivation.
Real-life convergence of moral views
I immediately identified with consequentialism when I first read about it. Then I read about virtue ethics, and I immediately identified with that, too. I naturally analyze my actions with my goals in mind. But I also often find myself idolizing a certain trait in others, such as environmental consciousness, and then pursuing that trait on my own. For example:
I've had friends who care a lot about the environment. I think it's cool that they do. So even before hearing about virtue ethics, I wanted to 'become someone who cares about the environment'. Subconsciously, I must have suspected that this would help me achieve my terminal goals of happiness and goodness.
If caring about the environment is my instrumental goal, I can feel good about myself when I instinctively pick up trash, conserve energy, use a reusable water bottle; i.e. do things environmentally conscious people do. It's quick, it's efficient, and having labeled 'caring about the environment' as a personal virtue, I'm spared from analyzing every last decision. Being environmentally conscious is a valuable habit.
Yet I can still do opportunity cost analyses with my chosen virtue. For example, I could stop showering to help conserve California's water. Or, I could apparently have the same effect by eating six fewer hamburgers in a year. More goodness would result if I stopped eating meat and limited my showering, but doing so would interfere with my personal happiness. I naturally seek to balance my terminal goals of goodness and happiness. Personally, I prefer showering to eating hamburgers, so I cut significantly back on my meat consumption without worrying too much about my showering habits. This practical convergence of virtue ethics and consequentialism satisfies my desires for happiness and goodness harmoniously.
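The shower-versus-hamburger trade-off above can be made concrete with a back-of-the-envelope calculation. All figures below (shower water use, showering frequency, the water footprint of one burger) are rough illustrative assumptions of mine, not sourced data, so treat the result as an order-of-magnitude sketch:

```python
# Back-of-the-envelope water comparison; all figures are assumed, illustrative values.

GALLONS_PER_SHOWER = 17    # assumed: ~8-minute shower at ~2 gallons/minute
SHOWERS_PER_YEAR = 300     # assumed showering habit
GALLONS_PER_BURGER = 660   # assumed water footprint of one hamburger

# Water saved by giving up showers for a year:
yearly_shower_water = GALLONS_PER_SHOWER * SHOWERS_PER_YEAR  # 5100 gallons

# How many skipped burgers would save the same amount of water:
burgers_equivalent = yearly_shower_water / GALLONS_PER_BURGER

print(f"A year of showers: {yearly_shower_water} gallons")
print(f"Burgers with the same water footprint: {burgers_equivalent:.1f}")
```

Under these assumed numbers, skipping somewhere around seven or eight burgers matches a whole year of showering, which is the same order of magnitude as the "six fewer hamburgers" claim in the text.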
Personal happiness refers to an individual's optimal mind-state. Pleasure, pain, and personal satisfaction are examples of happiness level determinants. Goodness refers to promoting happiness in others.
Terminal values are ends-in-themselves. The only true terminal values, or virtues, seem to be happiness and goodness. Think of them as psychological motivators, consciously or subconsciously driving us to make the decisions we do. (Physical motivators, like addiction or inertia, can also affect decisions.)
Preferences are what we tend to choose. These can be based on psychological or physical motivators.
Instrumental values are the sub-goals or sub-virtues that we (consciously or subconsciously) believe will best fulfill our terminal values of happiness and goodness. We seem to choose them arbitrarily.
Of course, we're not always aware of what actually leads to optimal mind-states in ourselves and others. Yet as we rationally pursue our goals, we may sometimes intuit like virtue ethicists and other times analyze like consequentialists. Both moral views are useful.
So does this idea have any potential practical value?
It took some friendly prodding, but I was finally brought to realize that my purpose in writing this article was not to argue the existence of goodness or the theoretical equality of consequentialism and virtue ethics or anything at all. The real point I'm making here is that however we categorize personal happiness, goodness belongs in the same category, because in practice, all other goals seem to stem from one or both of these concepts. Clarity of expression is an instrumental value, so I'm just saying that perhaps we should consider redrawing our boundaries a bit:
Figuring where to cut reality in order to carve along the joints—this is the problem worthy of a rationalist. It is what people should be trying to do, when they set out in search of the floating essence of a word.
P.S. If anyone is interested in reading a really, really long conversation I had with adamzerner, you can trace the development of this idea there. Language issues were overcome, biases were admitted, new facts were learned, minds were changed, and discussion bounced from ambition, to serial killers, to arrogance, to religion, to the subconscious, to agenthood, to skepticism about the happiness set-point theory, all interconnected somehow. In short, it was the first time I've had a conversation with a fellow "rationalist", and it was one of the coolest experiences I've ever had.
“Portrait of EAs I know”, su3su2u1:
But I note from googling for surveys that the median charitable donation for an EA in the Less Wrong survey was 0.
Two years ago I got a paying residency, and since then I’ve been donating 10% of my salary, which works out to about $5,000 a year. In two years I’ll graduate residency, start making doctor money, and then I hope to be able to donate maybe eventually as much as $25,000 - $50,000 per year. But if you’d caught me five years ago, I would have been one of those people who wrote a lot about it and was very excited about it but put down $0 in donations on the survey.
set.seed(2015-05-13)
survey2013 <- read.csv("http://www.gwern.net/docs/lwsurvey/2013.csv", header=TRUE)
survey2013$EffectiveAltruism2 <- NA
s2013 <- subset(survey2013, select=c(Charity, Effective.Altruism, EffectiveAltruism2, Work.Status,
                                     Profession, Degree, Age, Income))
colnames(s2013) <- c("Charity","EffectiveAltruism","EffectiveAltruism2","WorkStatus","Profession",
                     "Degree","Age","Income")
s2013$Year <- 2013
survey2014 <- read.csv("http://www.gwern.net/docs/lwsurvey/2014.csv", header=TRUE)
s2014 <- subset(survey2014, PreviousSurveys!="Yes",
                select=c(Charity, EffectiveAltruism, EffectiveAltruism2,
                         WorkStatus, Profession, Degree, Age, Income))
s2014$Year <- 2014
survey <- rbind(s2013, s2014)
# replace empty fields with NAs:
survey[survey==""] <- NA; survey[survey==" "] <- NA
# convert money amounts from string to number:
survey$Charity <- as.numeric(as.character(survey$Charity))
survey$Income <- as.numeric(as.character(survey$Income))
# both Charity & Income are skewed, like most monetary amounts, so log transform as well:
survey$CharityLog <- log1p(survey$Charity)
survey$IncomeLog <- log1p(survey$Income)
# age:
survey$Age <- as.integer(as.character(survey$Age))
# prodigy or no, I disbelieve any LW readers are <10yo (bad data? malicious responses?):
survey$Age <- ifelse(survey$Age >= 10, survey$Age, NA)
# convert Yes/No to boolean TRUE/FALSE:
survey$EffectiveAltruism <- (survey$EffectiveAltruism == "Yes")
survey$EffectiveAltruism2 <- (survey$EffectiveAltruism2 == "Yes")
summary(survey)
##     Charity          EffectiveAltruism EffectiveAltruism2                WorkStatus
##  Min.   :     0.000  Mode :logical     Mode :logical      Student                         :905
##  1st Qu.:     0.000  FALSE:1202        FALSE:450          For-profit work                 :736
##  Median :    50.000  TRUE :564         TRUE :45           Self-employed                   :154
##  Mean   :  1070.931  NA's :487         NA's :1758         Unemployed                      :149
##  3rd Qu.:   400.000                                       Academics (on the teaching side):104
##  Max.   :110000.000                                       (Other)                         :179
##  NA's   :654                                              NA's                            : 26
##                                      Profession        Degree          Age
##  Computers (practical: IT programming etc.)  :478  Bachelor's :774  Min.   :13.00000
##  Other                                       :222  High school:597  1st Qu.:21.00000
##  Computers (practical: IT, programming, etc.):201  Master's   :419  Median :25.00000
##  Mathematics                                 :185  None       :125  Mean   :27.32494
##  Engineering                                 :170  Ph D.      :125  3rd Qu.:31.00000
##  (Other)                                     :947  (Other)    :189  Max.   :72.00000
##  NA's                                        : 50  NA's       : 24  NA's   :28
##      Income             Year        CharityLog          IncomeLog
##  Min.   :       0.00  2013:1547  Min.   : 0.000000  Min.   : 0.000000
##  1st Qu.:   10000.00  2014: 706  1st Qu.: 0.000000  1st Qu.: 9.210440
##  Median :   33000.00             Median : 3.931826  Median :10.404293
##  Mean   :   75355.69             Mean   : 3.591102  Mean   : 9.196442
##  3rd Qu.:   80000.00             3rd Qu.: 5.993961  3rd Qu.:11.289794
##  Max.   :10000000.00             Max.   :11.608245  Max.   :16.118096
##  NA's   :993                     NA's   :654        NA's   :993
# lavaan doesn't like categorical variables and doesn't automatically expand out into dummies like lm/glm,
# so have to create the dummies myself:
survey$Degree <- gsub("2","two",survey$Degree)
survey$Degree <- gsub("'","",survey$Degree)
survey$Degree <- gsub("/","",survey$Degree)
survey$WorkStatus <- gsub("-","", gsub("\\(","",gsub("\\)","",survey$WorkStatus)))
library(qdapTools)
survey <- cbind(survey,
                mtabulate(strsplit(gsub(" ", "", as.character(survey$Degree)), ",")),
                mtabulate(strsplit(gsub(" ", "", as.character(survey$WorkStatus)), ",")))
write.csv(survey, file="2013-2014-lw-ea.csv", row.names=FALSE)
survey <- read.csv("http://www.gwern.net/docs/lwsurvey/2013-2014-lw-ea.csv")
# treat year as factor for fixed effect:
survey$Year <- as.factor(survey$Year)
median(survey[survey$EffectiveAltruism,]$Charity, na.rm=TRUE)
## 100
median(survey[!survey$EffectiveAltruism,]$Charity, na.rm=TRUE)
## 42.5
# t-tests are inappropriate due to non-normal distribution of donations:
wilcox.test(Charity ~ EffectiveAltruism, conf.int=TRUE, data=survey)
## Wilcoxon rank sum test with continuity correction
##
## data: Charity by EffectiveAltruism
## W = 214215, p-value = 4.811186e-08
## alternative hypothesis: true location shift is not equal to 0
## 95% confidence interval:
##  -4.999992987e+01 -1.275881408e-05
## sample estimates:
## difference in location
##           -19.99996543
library(ggplot2)
qplot(Age, CharityLog, color=EffectiveAltruism, data=survey) + geom_point(size=I(3))
## https://i.imgur.com/wd5blg8.png
qplot(Age, CharityLog, color=EffectiveAltruism,
      data=na.omit(subset(survey, select=c(Age, CharityLog, EffectiveAltruism)))) +
    geom_point(size=I(3)) + stat_smooth()
## https://i.imgur.com/UGqf8wn.png
# you might think that we can't treat Age linearly because this looks like a quadratic or
# logarithm, but when I fitted some curves, charity donations did not seem to flatten out
# appropriately, and the GAM/loess wiggly-but-increasing line seems like a better summary.
# Try looking at the asymptotes & quadratics split by group as follows:
#
## n1 <- nls(CharityLog ~ SSasymp(as.integer(Age), Asym, r0, lrc),
##           data=survey[survey$EffectiveAltruism,], start=list(Asym=6.88, r0=-4, lrc=-3))
## n2 <- nls(CharityLog ~ SSasymp(as.integer(Age), Asym, r0, lrc),
##           data=survey[!survey$EffectiveAltruism,], start=list(Asym=6.88, r0=-4, lrc=-3))
## with(survey, plot(Age, CharityLog))
## points(predict(n1, newdata=data.frame(Age=0:70)), col="blue")
## points(predict(n2, newdata=data.frame(Age=0:70)), col="red")
##
## l1 <- lm(CharityLog ~ Age + I(Age^2), data=survey[survey$EffectiveAltruism,])
## l2 <- lm(CharityLog ~ Age + I(Age^2), data=survey[!survey$EffectiveAltruism,])
## with(survey, plot(Age, CharityLog))
## points(predict(l1, newdata=data.frame(Age=0:70)), col="blue")
## points(predict(l2, newdata=data.frame(Age=0:70)), col="red")
#
# So I will treat Age as a linear additive sort of thing.
# for the regression, we want to combine EffectiveAltruism/EffectiveAltruism2 into a single measure, EA,
# so a latent variable in a SEM; then we use EA plus the other covariates to estimate the CharityLog.
library(lavaan)
model1 <- "
  # estimate EA latent variable:
  EA =~ EffectiveAltruism + EffectiveAltruism2
  CharityLog ~ EA + Age + IncomeLog + Year +
      # Degree dummies:
      None + Highschool + twoyeardegree + Bachelors + Masters + Other +
      MDJDotherprofessionaldegree + PhD. +
      # WorkStatus dummies:
      Independentlywealthy + Governmentwork + Forprofitwork + Selfemployed +
      Nonprofitwork + Academicsontheteachingside + Student + Homemaker + Unemployed
"
fit1 <- sem(model = model1, missing="fiml", data = survey); summary(fit1)
## lavaan (0.5-16) converged normally after 197 iterations
##
##   Number of observations                          2253
##   Number of missing patterns                        22
##
##   Estimator                                         ML
##   Minimum Function Test Statistic               90.659
##   Degrees of freedom                                40
##   P-value (Chi-square)                           0.000
##
## Parameter estimates:
##
##   Information                                 Observed
##   Standard Errors                             Standard
##
##                    Estimate  Std.err  Z-value  P(>|z|)
## Latent variables:
##   EA =~
##     EffectvAltrsm    1.000
##     EffctvAltrsm2    0.355    0.123    2.878    0.004
##
## Regressions:
##   CharityLog ~
##     EA               1.807    0.621    2.910    0.004
##     Age              0.085    0.009    9.527    0.000
##     IncomeLog        0.241    0.023   10.468    0.000
##     Year             0.319    0.157    2.024    0.043
##     None            -1.688    2.079   -0.812    0.417
##     Highschool      -1.923    2.059   -0.934    0.350
##     twoyeardegree   -1.686    2.081   -0.810    0.418
##     Bachelors       -1.784    2.050   -0.870    0.384
##     Masters         -2.007    2.060   -0.974    0.330
##     Other           -2.219    2.142   -1.036    0.300
##     MDJDthrprfssn   -1.298    2.095   -0.619    0.536
##     PhD.            -1.977    2.079   -0.951    0.341
##     Indpndntlywlt    1.175    2.119    0.555    0.579
##     Governmentwrk    1.183    1.969    0.601    0.548
##     Forprofitwork    0.677    1.940    0.349    0.727
##     Selfemployed     0.603    1.955    0.309    0.758
##     Nonprofitwork    0.765    1.973    0.388    0.698
##     Acdmcsnthtchn    1.087    1.970    0.551    0.581
##     Student          0.879    1.941    0.453    0.650
##     Homemaker        1.071    2.498    0.429    0.668
##     Unemployed       0.606    1.956    0.310    0.757
##
## Intercepts:
##     EffectvAltrsm    0.319    0.011   28.788    0.000
##     EffctvAltrsm2    0.109    0.012    8.852    0.000
##     CharityLog      -0.284    0.737   -0.385    0.700
##     EA               0.000
##
## Variances:
##     EffectvAltrsm    0.050    0.056
##     EffctvAltrsm2    0.064    0.008
##     CharityLog       7.058    0.314
##     EA               0.168    0.056
# simplify:
model2 <- "
  # estimate EA latent variable:
  EA =~ EffectiveAltruism + EffectiveAltruism2
  CharityLog ~ EA + Age + IncomeLog + Year
"
fit2 <- sem(model = model2, missing="fiml", data = survey); summary(fit2)
## lavaan (0.5-16) converged normally after 55 iterations
##
##   Number of observations                          2253
##   Number of missing patterns                        22
##
##   Estimator                                         ML
##   Minimum Function Test Statistic               70.134
##   Degrees of freedom                                 6
##   P-value (Chi-square)                           0.000
##
## Parameter estimates:
##
##   Information                                 Observed
##   Standard Errors                             Standard
##
##                    Estimate  Std.err  Z-value  P(>|z|)
## Latent variables:
##   EA =~
##     EffectvAltrsm    1.000
##     EffctvAltrsm2    0.353    0.125    2.832    0.005
##
## Regressions:
##   CharityLog ~
##     EA               1.770    0.619    2.858    0.004
##     Age              0.085    0.009    9.513    0.000
##     IncomeLog        0.241    0.023   10.550    0.000
##     Year             0.329    0.156    2.114    0.035
##
## Intercepts:
##     EffectvAltrsm    0.319    0.011   28.788    0.000
##     EffctvAltrsm2    0.109    0.012    8.854    0.000
##     CharityLog      -1.331    0.317   -4.201    0.000
##     EA               0.000
##
## Variances:
##     EffectvAltrsm    0.049    0.057
##     EffctvAltrsm2    0.064    0.008
##     CharityLog       7.111    0.314
##     EA               0.169    0.058
# simplify even further:
summary(lm(CharityLog ~ EffectiveAltruism + EffectiveAltruism2 + Age + IncomeLog, data=survey))
## ...
## Residuals:
##        Min         1Q     Median         3Q        Max
## -7.6813410 -1.7922422  0.3325694  1.8440610  6.5913961
##
## Coefficients:
##                           Estimate  Std. Error  t value    Pr(>|t|)
## (Intercept)            -2.06062203  0.57659518 -3.57378  0.00040242
## EffectiveAltruismTRUE   1.26761425  0.37515124  3.37894  0.00081163
## EffectiveAltruism2TRUE  0.03596335  0.54563991  0.06591  0.94748766
## Age                     0.09411164  0.01869218  5.03481  7.7527e-07
## IncomeLog               0.32140793  0.04598392  6.98957  1.4511e-11
##
## Residual standard error: 2.652323 on 342 degrees of freedom
##   (1906 observations deleted due to missingness)
## Multiple R-squared: 0.2569577,  Adjusted R-squared: 0.2482672
## F-statistic: 29.56748 on 4 and 342 DF,  p-value: < 2.2204e-16
Note these increases are on a log-dollars scale.
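To give that log scale some intuition: a coefficient b on a log1p-dollar outcome corresponds to a multiplicative effect of exp(b) on (donations + 1). The sketch below works this out for the EffectiveAltruismTRUE coefficient from the final lm() above; the $100 baseline donor is a made-up example, and this reads the coefficient causally only for illustration:

```python
import math

# Coefficient from the simplified lm() regression on log1p(Charity):
b_ea = 1.26761425  # EffectiveAltruismTRUE

# On a log outcome, adding b multiplies (donations + 1) by exp(b):
ea_multiplier = math.exp(b_ea)
print(f"EA identification ~ {ea_multiplier:.2f}x difference in (donations + 1)")

# Hypothetical worked example: an otherwise-identical donor at $100/year
base = 100
predicted = math.expm1(math.log1p(base) + b_ea)
print(f"${base} baseline -> roughly ${predicted:.0f} predicted")
```

So the EA coefficient of ~1.27 log-dollars corresponds to roughly a 3.5x multiplicative difference, which is why "increases on a log-dollars scale" can look deceptively small.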
[CW: This post talks about personal experience of moral dilemmas. I can see how some people might be distressed by thinking about this.]
Have you ever had to decide between pushing a fat person onto some train tracks or letting five other people get hit by a train? Maybe you have a more exciting commute than I do, but for me it's just never come up.
In spite of this, I'm unusually prepared for a trolley problem, in a way I'm not prepared for, say, being offered a high-paying job at an unquantifiably-evil company. Similarly, if a friend asked me to lie to another friend about something important to them, I probably wouldn't carry out a utilitarian cost-benefit analysis. It seems that I'm happy to adopt consequentialist policy, but when it comes to personal quandaries where I have to decide for myself, I start asking myself about what sort of person this decision makes me. What's more, I'm not sure this is necessarily a bad heuristic in a social context.
It's also noteworthy (to me, at least) that I rarely experience moral dilemmas. They just don't happen all that often. I like to think I have a reasonably coherent moral framework, but do I really need one? Do I just lead a very morally-inert life? Or have abstruse thought experiments in moral philosophy equipped me with broader principles under which would-be moral dilemmas are resolved before they reach my conscious deliberation?
To make sure I'm not giving too much weight to my own experiences, I thought I'd put a few questions to a wider audience:
- What kind of moral dilemmas do you actually encounter?
- Do you have any thoughts on how much moral judgement you have to exercise in your daily life? Do you think this is a typical amount?
- Do you have any examples of pedestrian moral dilemmas to which you've applied abstract moral reasoning? How did that work out?
- Do you have any examples of personal moral dilemmas on a Trolley Problem scale that nonetheless happened?
The Username/password anonymous account is, as always, available.
A simple exercise in rationality: rephrase an objective statement as subjective and explore the caveats
"This book is awful" => "I dislike this book" => "I dislike this book because it is shallow and is full of run-on sentences." => "I dislike this book because I prefer reading books I find deep and clearly written."
"The sky is blue" => ... => "When I look at the sky, the visual sensation I get is very similar to when I look at a bunch of other objects I've been taught to associate with the color blue."
"Team X lost but deserved to win" => ...
"Being selfish is immoral"
"The Universe is infinite, so anything imaginable happens somewhere"
In general, consider a quick check whether in a given context replacing "is" with "appears to be" leads to something you find non-trivial.
Why? Because it exposes the multiple levels of maps we normally skip. So one might find illuminating occasionally walking through the levels and making sure they are still connected as firmly as the last time. And maybe figuring out where the people who hold a different opinion from yours construct a different chain of maps. Also to make sure you don't mistake a map for the territory.
That is all. ( => "I think that I have said enough for one short post and adding more would lead to diminishing returns, though I could be wrong here, but I am too lazy to spend more time looking for links and quotes and better arguments without being sure that they would improve the post.")
We've been running regular, well-attended Less Wrong meetups in London for a few years now (and irregular, badly-attended ones for even longer than that). In this time, I'd like to think we've learned a few things about having good conversations, but there are probably plenty of areas where we could make gains. Given the number of Less Wrong meetups around the world, it's worth attempting some sort of meetup cross-pollination. It's possible that we've all been solving each other's problems. It's also good to have a central location to make observations and queries about topics of interest, and it's likely people have such observations and queries on this topic.
So, what have you learned from attending or running Less Wrong meetups? Here are a few questions to get the ball rolling:
- What do you suppose are the dominant positive outcomes of your meetups?
- What problems do you encounter with discussions involving [x] people? How have you attempted to remedy them?
- Do you have any systems or procedures in place for making sure people are having the sorts of conversations they want to have?
- Have you developed or consciously adopted any non-mainstream social norms, taboos or rituals? How are those working out?
- How do Less Wrong meetups differ from other similar gatherings you've been involved with? Are there any special needs idiosyncratic to this demographic?
- Are there any activities that you've found work particularly well or particularly poorly for meetups? Do you have examples of runaway successes or spectacular failures?
- Are there any activities you'd like to try, but haven't managed to pull off yet? What's stopping you?
If you have other specific questions you'd like answered, you're encouraged to ask them in comments. Any other observations, anecdotes or suggestions on this general topic are also welcome and encouraged.
[To be cross-posted at Effective Altruism Forum, FLI news page]
I'm delighted to announce that the Centre for the Study of Existential Risk has had considerable recent success in grantwriting and fundraising, among other activities (full update coming shortly). As a result, we are now in a position to advance to CSER's next stage of development: full research operations. Over the course of this year, we will be recruiting for a full team of postdoctoral researchers to work on a combination of general methodologies for extreme technological (and existential) risk analysis and mitigation, alongside specific technology/risk-specific projects.
Our first round of recruitment has just opened - we will be aiming to hire up to 4 postdoctoral researchers; details below. A second recruitment round will take place in the Autumn. We have a slightly unusual opportunity in that we get to cast our net reasonably wide. We have a number of planned research projects (listed below) that we hope to recruit for. However, we also have the flexibility to hire one or more postdoctoral researchers to work on additional projects relevant to CSER's aims. Information about CSER's aims and core research areas is available on our website. We request that, as part of the application process, potential postholders send us a research proposal of no more than 1500 words, explaining what their research skills could contribute to CSER. At this point in time, we are looking for people who will have obtained a doctorate in a relevant discipline by their start date.
We would also humbly ask that the LessWrong community aid us in spreading the word far and wide about these positions. There are many brilliant people working within the existential risk community. However, there are academic disciplines and communities that have had less exposure to existential risk as a research priority than others (due to founder effect and other factors), but where there may be people with very relevant skills and great insights. With new centres and new positions becoming available, we have a wonderful opportunity to grow the field, and to embed existential risk as a crucial consideration in all relevant fields and disciplines.
Thanks very much,
Seán Ó hÉigeartaigh (Executive Director, CSER)
"The Centre for the Study of Existential Risk (University of Cambridge, UK) is recruiting for up to four full-time postdoctoral research associates to work on the project Towards a Science of Extreme Technological Risk.
We are looking for outstanding and highly-committed researchers, interested in working as part of a growing research community, with research projects relevant to any aspect of the project. We invite applicants to explain their project to us, and to demonstrate their commitment to the study of extreme technological risks.
We have several shovel-ready projects for which we are looking for suitable postdoctoral researchers. These include:
- Ethics and evaluation of extreme technological risk (ETR) (with Sir Partha Dasgupta);
- Horizon-scanning and foresight for extreme technological risks (with Professor William Sutherland);
- Responsible innovation and extreme technological risk (with Dr Robert Doubleday and the Centre for Science and Policy).
However, recruitment will not necessarily be limited to these subprojects, and our main selection criterion is suitability of candidates and their proposed research projects to CSER’s broad aims.
Details are available here. Closing date: April 24th."
On Combat - The Psychology and Physiology of Deadly Conflict in War and in Peace by Lt. Col. Dave Grossman and Loren W. Christensen (third edition from 2007) is a well-written, evidence-based book about the reality of human behaviour in life-threatening situations. It is comprehensive (400 pages), and provides detailed descriptions, (some) statistics as well as first-person accounts, historical context and other relevant information. But my main focus in this post is on the advice it gives and what lessons the LessWrong community may take from it.
In deadly force encounters you will experience and remember the most unusual physiological and psychological things. Inoculate yourself against extreme stress with repeated authentic training; play win-only paintball, train 911-dialing and -reporting. Train combat breathing. Talk to people after traumatic events.
It has long been known that algorithms out-perform human experts on a range of topics (here's a LW post on this by lukeprog). Why, then, is it that people continue to mistrust algorithms, in spite of their superiority, and instead cling to human advice? A recent paper by Dietvorst, Simmons and Massey suggests it is due to a cognitive bias which they call algorithm aversion. We judge less-than-perfect algorithms more harshly than less-than-perfect humans. They argue that since this aversion leads to poorer decisions, it is very costly, and that we therefore must find ways of combating it.
Research shows that evidence-based algorithms more accurately predict the future than do human forecasters. Yet when forecasters are deciding whether to use a human forecaster or a statistical algorithm, they often choose the human forecaster. This phenomenon, which we call algorithm aversion, is costly, and it is important to understand its causes. We show that people are especially averse to algorithmic forecasters after seeing them perform, even when they see them outperform a human forecaster. This is because people more quickly lose confidence in algorithmic than human forecasters after seeing them make the same mistake. In 5 studies, participants either saw an algorithm make forecasts, a human make forecasts, both, or neither. They then decided whether to tie their incentives to the future predictions of the algorithm or the human. Participants who saw the algorithm perform were less confident in it, and less likely to choose it over an inferior human forecaster. This was true even among those who saw the algorithm outperform the human.
The results of five studies show that seeing algorithms err makes people less confident in them and less likely to choose them over an inferior human forecaster. This effect was evident in two distinct domains of judgment, including one in which the human forecasters produced nearly twice as much error as the algorithm. It arose regardless of whether the participant was choosing between the algorithm and her own forecasts or between the algorithm and the forecasts of a different participant. And it even arose among the (vast majority of) participants who saw the algorithm outperform the human forecaster.
The aversion to algorithms is costly, not only for the participants in our studies who lost money when they chose not to tie their bonuses to the algorithm, but for society at large. Many decisions require a forecast, and algorithms are almost always better forecasters than humans (Dawes, 1979; Grove et al., 2000; Meehl, 1954). The ubiquity of computers and the growth of the “Big Data” movement (Davenport & Harris, 2007) have encouraged the growth of algorithms but many remain resistant to using them. Our studies show that this resistance at least partially arises from greater intolerance for error from algorithms than from humans. People are more likely to abandon an algorithm than a human judge for making the same mistake. This is enormously problematic, as it is a barrier to adopting superior approaches to a wide range of important tasks. It means, for example, that people will more likely forgive an admissions committee than an admissions algorithm for making an error, even when, on average, the algorithm makes fewer such errors. In short, whenever prediction errors are likely—as they are in virtually all forecasting tasks—people will be biased against algorithms.
More optimistically, our findings do suggest that people will be much more willing to use algorithms when they do not see algorithms err, as will be the case when errors are unseen, the algorithm is unseen (as it often is for patients in doctors’ offices), or when predictions are nearly perfect. The 2012 U.S. presidential election season saw people embracing a perfectly performing algorithm. Nate Silver’s New York Times blog, Five Thirty Eight: Nate Silver’s Political Calculus, presented an algorithm for forecasting that election. Though the site had its critics before the votes were in— one Washington Post writer criticized Silver for “doing little more than weighting and aggregating state polls and combining them with various historical assumptions to project a future outcome with exaggerated, attention-grabbing exactitude” (Gerson, 2012, para. 2)—those critics were soon silenced: Silver’s model correctly predicted the presidential election results in all 50 states. Live on MSNBC, Rachel Maddow proclaimed, “You know who won the election tonight? Nate Silver,” (Noveck, 2012, para. 21), and headlines like “Nate Silver Gets a Big Boost From the Election” (Isidore, 2012) and “How Nate Silver Won the 2012 Presidential Election” (Clark, 2012) followed. Many journalists and popular bloggers declared Silver’s success a great boost for Big Data and statistical prediction (Honan, 2012; McDermott, 2012; Taylor, 2012; Tiku, 2012).
However, we worry that this is not such a generalizable victory. People may rally around an algorithm touted as perfect, but we doubt that this enthusiasm will generalize to algorithms that are shown to be less perfect, as they inevitably will be much of the time.
In June 2013, I didn’t do any exercise beyond biking the 15 minutes to work and back. Now, I have a robust habit of hitting the gym every day, doing cardio and strength training. Here are the techniques I used to get from not having the habit to having it, some of them common wisdom and some of them my own ideas. Consider this post a case study/anecdata in what worked for me. Note: I wrote these ideas down around August 2013 but didn’t post them, so my memory was fresh at the time of writing.
1. Have a specific goal. Ideally this goal should be reasonably achievable and something you can see progress toward over medium timescales. I initially started exercising because I wanted more upper body strength to be better at climbing. My goal was to become able to do at least one pull-up, or more if possible.
Why it works: if you have a specific goal instead of a vague feeling that you ought to do something or that it’s what a virtuous person would do, it’s harder to make excuses. Skipping work with an excuse will let you continue to think of yourself as virtuous, but it won’t help with your goal. For this to work, your goal needs to be something you actually want, rather than a stand-in for “I want to be virtuous.” If you can’t think of a consequence of your intended habit that you actually want, the habit may not be worth your time.
2. Have a no-excuses minimum. This is probably the best technique I’ve discovered. Every day, with no excuses, I go to the gym and do fifty pull-downs on one of the machines. After that’s done, I can do as much or as little else as I want. Some days I do equivalent amounts of three other exercises; some days I do an extra five reps and that’s it.
Why it works: this one has a host of benefits.
* It provides a sense of freedom: once I’m done with my minimum, I have a lot of choice about what and how much to do. That way it feels less like something I’m being forced into.
* If I’m feeling especially tired or feel like I deserve a day off, instead of skipping a day and breaking the habit I tell myself I’ll just do the minimum instead. Often once I get there I end up doing more than the minimum anyway, because the real thing I wanted to skip was the inconvenience of biking to the gym.
3. If you raise the minimum, do it slowly. I have sometimes raised the bar on the minimum amount of exercise I have to do, but never to as much as or more than I was already doing routinely. If you start suddenly forcing yourself to do more than you were already doing, the change will be much harder and less likely to stick than gradually ratcheting up your commitment.
4. Don’t fall into a guilt trap. Avoid associating guilt with doing the minimum, or even with missing a day.
Why it works: feeling guilty will make thinking of the habit unpleasant, and you’ll downplay how much you care about it to avoid the cognitive dissonance. Especially, if you only do the minimum, tell yourself “I did everything I committed to do.” Then when you do more than the minimum, feel good about it! You went above and beyond. This way, doing what you committed to will sometimes include positive reinforcement, but never negative reinforcement.
5. Use Timeless Decision Theory and consistency pressure. Credit for this one goes to this post by user zvi. When I contemplate skipping a day at the gym, I remember that I’ll be facing the same choice under nearly the same conditions many times in the future. If I skip my workout today, what reason do I have to believe that I won’t skip it tomorrow?
Why it works: Even when the benefits of one day’s worth of exercise don’t seem like enough motivation, I know my entire habit that I’ve worked to cultivate is at stake. I know that the more days I go to the gym the more I will see myself as a person who goes to the gym, and the more it will become my default action.
6. Evaluate your excuses. If I have what I think is a reasonable excuse, I consider how often I’ll skip the gym if I let myself skip it whenever I have that good of an excuse. If letting the excuse hold would make me use it often, I ignore it.
Why it works: I based this technique on this LW post.
7. Tell people about it. The first thing I did when I made my resolution to start hitting the gym was to tell a friend whose opinion I cared about. I also made a comment on LW saying I would make a post about my attempt at forming a habit, whether it succeeded or failed. (I wrote the post and forgot to post it for over a year, but so it goes.)
Why it works: Telling people about your commitment invests your reputation in it. If you risk being embarrassed if you fail, you have an extra motivation to succeed.
I expect these techniques can be generalized to work for many desirable habits: eating healthily; spending time on social interaction; writing, coding, or working on a long-term project; getting outside for fresh air; etc.
In anxious, frustrating or aversive situations, I find it helpful to visualize the worst case that I fear might happen, and try to accept it. I call this “radical acceptance”, since the imagined worst case is usually an unrealistic scenario that would be extremely unlikely to happen, e.g. “suppose I get absolutely nothing done in the next month”. This is essentially the negative visualization component of stoicism. There are many benefits to visualizing the worst case:
- Feeling better about the present situation by contrast.
- Turning attention to the good things that would still be in my life even if everything went wrong in one particular domain.
- Weakening anxiety using humor (by imagining an exaggerated “doomsday” scenario).
- Being more prepared for failure, and making contingency plans (pre-hindsight).
- Helping make more accurate predictions about the future by reducing the “X isn’t allowed to happen” effect (or, as Anna Salamon once put it, “putting X into the realm of the thinkable”).
- Reducing the effect of ugh fields / aversions, which thrive on the “X isn’t allowed to happen” flinch.
- Weakening unhelpful identities like “person who is always productive” or “person who doesn’t make stupid mistakes”.
Let’s say I have an aversion around meetings with my advisor, because I expect him to be disappointed with my research progress. When I notice myself worrying about the next meeting or finding excuses to postpone it so that I have more time to make progress, I can imagine the worst imaginable outcome a meeting with my advisor could have - perhaps he might yell at me or even decide to expel me from grad school (neither of these has actually happened so far). If the scenario is starting to sound silly, that’s a good sign. I can then imagine how this plays out in great detail, from the disappointed faces and words of the rest of the department to the official letter of dismissal in my hands, and consider what I might do in that case, like applying for industry jobs. While building up these layers of detail in my mind, I breathe deeply, which I associate with meditative acceptance of reality. (I use the word “acceptance” to mean “acknowledgement” rather than “resignation”.)
I am trying to use this technique more often, both in the regular and situational sense. A good default time is my daily meditation practice. I might also set up a trigger-action habit of the form “if I notice myself repeatedly worrying about something, visualize that thing (or an exaggerated version of it) happening, and try to accept it”. Some issues have more natural triggers than others - while worrying tends to call attention to itself, aversions often manifest as a quick flinch away from a thought, so it’s better to find a trigger among the actions that are often caused by an aversion, e.g. procrastination. A trigger for a potentially unhelpful identity could be a thought like “I’m not good at X, but I should be”. A particular issue can simultaneously have associated worries (e.g. “will I be productive enough?”), aversions (e.g. towards working on the project) and identities (“productive person”), so there is likely to be something there that makes a good trigger. Visualizing myself getting nothing done for a month can help with all of these to some degree.
System 1 is good at imagining scary things - why not use this as a tool?
I am looking to compile another such thread, this time aimed at "exceptional explainers" and their works. For example, I find Richard Feynman's QED: The Strange Theory of Light and Matter to be one such book.
Please list other authors and books which you think are wonderfully written, in a way that maximizes communication and explanation to laypeople in the given field. For example:
Physics: Richard Feynman - QED: The Strange Theory of Light and Matter.
What new senses would you like to have available to you?
Often when new technology first becomes widely available, the initial limits are in the collective imagination, not in the technology itself (case in point: the internet). New sensory channels have a huge potential because the brain can process senses much faster and more intuitively than most conscious thought processes.
There are a lot of recent "proof of concept" inventions that show it is possible to create new sensory channels for humans, with and without surgery. The most well-known and simple example is an implanted magnet, which alerts you to magnetic fields (the trade-off being that you can never have an MRI). Cochlear implants are the most widely used human-created sensory channels (they send electrical signals directly to the nervous system, bypassing the ear entirely), but CIs are designed to emulate a sensory channel most people already have brain space allocated to. VEST is another example. Similar to CIs, VEST (versatile extra-sensory transducer) has 24 information channels and uses audio compression to encode sound. Unlike CIs, it is not implanted in the skull; instead, information is relayed through vibrating motors on the torso. After a few hours of training, deaf volunteers are capable of word recognition using the vibrations alone, without conscious processing. Much like with hearing, the users are unable to describe exactly what components make a spoken word intelligible; they just understand the sensory information intuitively. Another recent invention being tested (with success) is BrainPort glasses, which send electrical signals through the tongue (one of the most sensitive organs of the body). Blind people can begin processing visual information with this device within 15 minutes, and it is unique in that it is not implanted but worn. The sensory information feels like pop rocks at first, before the brain is able to resolve it into sight. Neil Harbisson (who is colorblind) has custom glasses which use sound tones to relay color information. Belts that vibrate when facing north give people a sense of north. Bottlenose can be built at home and gives a very primitive sense of echolocation. As expected, these all work better if people start young.
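As a rough illustration of what "compressing sound into 24 channels" could look like, here is a toy sketch (my own construction, not VEST's actual algorithm) that maps a short audio frame to 24 motor intensities by splitting the magnitude spectrum into frequency bands:

```python
import numpy as np

def audio_to_channels(samples, n_channels=24):
    """Map one audio frame to n_channels motor intensities in [0, 1].

    Toy frequency-band encoding: take the magnitude spectrum,
    split it into n_channels bands, and normalize each band's
    mean energy by the loudest band. Purely illustrative.
    """
    spectrum = np.abs(np.fft.rfft(samples))
    bands = np.array_split(spectrum, n_channels)
    energies = np.array([band.mean() for band in bands])
    peak = energies.max()
    return energies / peak if peak > 0 else energies

# Example: a 250 Hz tone in a 100 ms frame at 8 kHz should
# drive one of the low-frequency channels hardest.
rate = 8000
t = np.arange(rate // 10) / rate
intensities = audio_to_channels(np.sin(2 * np.pi * 250 * t))
```

A real device would of course run this continuously on overlapping frames and use a far smarter compression scheme, but the basic shape - spectrum in, a small fixed number of intensities out - is the same.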
What are the craziest and coolest new senses you would like to see made available with this technology? I think VEST, at least, is available from Kickstarter, and one of the inventors suggested that it could be programmed to transmit any kind of data. My initial ideas on hearing about this possibility are just senses that some unusual people already have, or expansions on current senses. I think the real game changers are going to be totally new senses unrelated to our current sensory processing. Translating data into sensory information gives us access to intuition and processing speed otherwise unavailable.
My initial weak ideas:
- spectrometer (uses reflected laser light to determine the chemical makeup of anything and everything)
- proximity meter (but I think you would begin to feel like you had a physical aura or field of influence)
- WIFI or cell signal
- perfect pitch and perfect north, both super easy and only needing one channel of information (a smartwatch app?)
- infrared or echolocation
- GPS (this would involve some serious problem solving to figure out what data we should encode given limited channels, I think it could be done with 4 or 8 channels each associated with a cardinal direction)
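The four-channel direction idea above can be sketched concretely. Here is a hypothetical encoding (my own invention, not from any actual device) of a compass bearing into four vibration intensities, one per cardinal direction, so that intermediate bearings activate two adjacent motors:

```python
import math

def bearing_to_channels(bearing_deg):
    """Encode a compass bearing into four motor intensities (N, E, S, W).

    Each channel's strength is the positive part of the cosine of
    the angle between the bearing and that cardinal direction, so
    e.g. north-east drives the N and E motors equally and leaves
    S and W silent. Hypothetical encoding for illustration.
    """
    cardinal = {"N": 0, "E": 90, "S": 180, "W": 270}
    rad = math.radians(bearing_deg)
    return {
        name: max(0.0, math.cos(rad - math.radians(angle)))
        for name, angle in cardinal.items()
    }

channels = bearing_to_channels(45)  # north-east
```

With 8 channels you could halve the angular gaps the same way; the interesting design question is whether the brain learns the interpolated in-between directions faster than a discretized "nearest compass point buzzes" scheme.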
Someone working with VEST suggested:
- compress global Twitter sentiment into 24 channels. Will you begin to have an intuitive sense of global events?
- encode stock market data. Will you become an intuitive super-investor?
- encode local weather data (a much more advanced version of "I can feel it's going to rain in my bad knee")
Some resources for more information:
I previously wrote a post hypothesizing that inter-group conflict is more common when most humans belong to readily identifiable, discrete factions.
This seems relevant to the recent human gene editing advance. Full human gene editing capability probably won't come soon, but this got me thinking anyway. Consider the following two scenarios:
1. Designer babies become socially acceptable and widespread some time in the near future. Because our knowledge of the human genome is still maturing, they initially aren't that much different than regular humans. As our knowledge matures, they get better and better. Fortunately, there's a large population of "semi-enhanced" humans from the early days of designer babies to keep the peace between the "fully enhanced" and "not at all enhanced" factions.
2. Designer babies are considered socially unacceptable in many parts of the world. Meanwhile, the technology needed to produce them continues to advance. At a certain point people start having them anyway. By this point the technology has advanced to the point where designer babies clearly outclass regular babies at everything, and there's a schism between "fully enhanced" and "not at all enhanced" humans.
Of course, there's another scenario where designer babies just never become widespread. But that seems like an unstable equilibrium given the 100+ sovereign countries in the world, each with their own set of laws, and the desire of parents everywhere to give birth to the best kids possible.
We already see tons of drama related to the current inequalities between individuals, especially inequality that's allegedly genetic in origin. Designer babies might shape up to be the greatest internet flame war of this century. This flame war could spill over into real-world violence. But since one of the parties has not yet arrived at the flame war, maybe we can prepare.
One way to prepare might be differential technological development. In particular, maybe it's possible to decrease the cost of gene editing/selection technologies while retarding advances in our knowledge of which genes contribute to intelligence. This could allow designer baby technology to become socially acceptable and widespread before "fully enhanced" humans were possible. Just as with emulations, a slow societal transition seems preferable to a fast one.
Other ideas (edit: speculative!): extend the benefits of designer babies to everyone for free regardless of their social class. Push for mandatory birth control technology so unwanted and therefore unenhanced babies are no longer a thing. (Imagine how lousy it would be to be born as an unwanted child in a world where everyone was enhanced except you.) Require designer babies to possess genes for compassion, benevolence, and reflectiveness by law, and try to discover those genes before we discover genes for intelligence. (Researching the genetic basis of psychopathy to prevent enhanced psychopaths also seems like a good idea.) Regulate the modification of genes like height if game theory suggests allowing arbitrary modifications to them would be a bad idea.
I don't know very much about the details of these technologies, and I'm open to radically revising my views if I'm missing something important. Please tell me if there's anything I got wrong in the comments.