

Experiences in applying "The Biodeterminist's Guide to Parenting"

62 juliawise 17 July 2015 07:19PM

I'm posting this because LessWrong was very influential on how I viewed parenting, particularly the emphasis on helping one's brain work better. In this context, creating and influencing another person's brain is an awesome responsibility.

It turned out to be a lot more anxiety-provoking than I expected. I don't think that's necessarily a bad thing, as the possibility of screwing up someone's brain should make a parent anxious, but it's something to be aware of. I've heard some blithe "Rational parenting could be a very high-impact activity!" statements from childless LWers who may be interested to hear some experiences in actually applying that.

One thing that really scared me about trying to raise a child with the healthiest-possible brain and body was the possibility that I might not love her if she turned out to not be smart. 15 months in, I'm no longer worried. Evolution has been very successful at producing parents and children that love each other despite their flaws, and our family is no exception. Our daughter Lily seems to be doing fine, but if she turns out to have disabilities or other problems, I'm confident that we'll roll with the punches.


Cross-posted from The Whole Sky.


Before I got pregnant, I read Scott Alexander's (Yvain's) excellent Biodeterminist's Guide to Parenting and was so excited to have this knowledge. I thought how lucky my child would be to have parents who knew and cared about how to protect her from things that would damage her brain.

Real life, of course, got more complicated. It's one thing to intend to avoid neurotoxins, but another to arrive at the grandparents' house and find they've just had ant poison sprayed. What do you do then?

Here are some tradeoffs Jeff and I have made between things that are good for children in one way but bad in another, or things that are good for children but really difficult or expensive.

Germs and parasites

The hygiene hypothesis states that lack of exposure to germs and parasites increases risk of auto-immune disease. Our pediatrician recommended letting Lily play in the dirt for this reason.

While exposure to animal dander and pollution increases the risk of asthma later in life, it seems that being exposed to these in the first year of life actually protects against asthma. Apparently if you're going to live in a house with roaches, you should do it in the first year or not at all.

Except some stuff in dirt is actually bad for you.

Scott writes:

Parasite-infestedness of an area correlates with national IQ at about r = -0.82. The same is true of US states, with a slightly reduced correlation coefficient of -0.67 (p<0.0001). . . . When an area eliminates parasites (like the US did for malaria and hookworm in the early 1900s) the IQ for the area goes up at about the right time.

Living with cats as a child seems to increase risk of schizophrenia, apparently via toxoplasmosis. But in order to catch toxoplasmosis from a cat, you have to eat its feces during the two weeks after it first becomes infected (which it’s most likely to do by eating birds or rodents carrying the disease). This makes me guess that most kids get it through tasting a handful of cat litter, dirt from the yard, or sand from the sandbox rather than simply through cat ownership. We live with indoor cats who don’t seem to be mousers, so I’m not concerned about them giving anyone toxoplasmosis. If we build Lily a sandbox, we’ll keep it covered when not in use.

The evidence is mixed about whether infections like colds during the first year of life increase or decrease your risk of asthma later. After the newborn period, we defaulted to being pretty casual about germ exposure.

Toxins in buildings

Our experiences with lead. Our experiences with mercury.

In some areas, it’s not that feasible to live in a house with zero lead. We live in Boston, where 87% of the housing was built before lead paint was banned. Even in a new building, we’d need to go far out of town before reaching soil that wasn’t near where a lead-painted building had been.

It is possible to do some renovations without exposing kids to lead. Jeff recently did some demolition of walls with lead paint, very carefully sealed off and cleaned up, while Lily and I spent the day elsewhere. Afterwards her lead level was no higher than it had been.

But Jeff got serious lead poisoning as a toddler while his parents did major renovations on their old house. If I didn’t think I could keep the child away from the dust, I wouldn’t renovate.

Recently a house across the street from us was gutted, with workers throwing debris out the windows and creating big plumes of dust (presumably lead-laden) that blew all down the street. Later I realized I should have called city building inspection services, which would have at least made them carry the debris into the dumpster instead of throwing it from the second story.

Floor varnish releases formaldehyde and other nasties as it cures. We kept Lily out of the house for a few weeks after Jeff redid the floors. We found it worthwhile to pay rent at our previous house in order to not have to live in the new house while this kind of work was happening.


Pressure-treated wood was treated with arsenic and chromium until around 2004 in the US. It has a greenish tint, though this may have faded with time. Playing on playsets or decks made of such wood increases children's cancer risk. It should not be used for furniture (I thought this would be obvious, but apparently it wasn't to some of my handyman relatives).

I found it difficult to know how to deal with fresh paint and other fumes in my building at work while I was pregnant. Women of reproductive age have a heightened sense of smell, and many pregnant women have heightened aversion to smells, so you can literally smell things some of your coworkers can’t (or don’t mind). The most critical period of development is during the first trimester, when most women aren’t telling the world they’re pregnant (because it’s also the time when a miscarriage is most likely, and if you do lose the pregnancy you might not want to have to tell the world). During that period, I found it difficult to explain why I was concerned about the fumes from the roofing adhesive being used in our building. I didn’t want to seem like a princess who thought she was too good to work in conditions that everybody else found acceptable. (After I told them I was pregnant, my coworkers were very understanding about such things.)


Toxins in food

Recommendations usually focus on what you should eat during pregnancy, but obviously children's brain development doesn't stop there. I've opted to take precautions with the food Lily and I eat for as long as I'm nursing her.

Claims that pesticide residues are poisoning children scare me, although most scientists seem to think the paper cited is overblown. Other sources say the levels of pesticides in conventionally grown produce are fine. We buy organic produce at home but eat whatever we’re served elsewhere.

I would love to see a study with families randomly selected to receive organic produce for the first 8 years of the kids' lives, then looking at IQ and hyperactivity. But no one's going to do that study because of how expensive 8 years of organic produce would be.

The Biodeterminist's Guide doesn't mention PCBs in the section on fish, but fish (particularly farmed salmon) are a major source of these pollutants. They don't seem to be as bad as mercury, but are neurotoxic. Unfortunately their half-life in the body is around 14 years, so if you have even a vague idea of getting pregnant ever in your life you shouldn't be eating farmed salmon (or Atlantic/farmed salmon, bluefish, wild striped bass, white and Atlantic croaker, blackback or winter flounder, summer flounder, or blue crab).

I had the best intentions of eating lots of the right kind of high-omega-3, low-pollutant fish during and after pregnancy. Unfortunately, fish was the only food I developed an aversion to. Now that Lily is eating food on her own, we tried several sources of omega-3 and found that kippered herring was the only success. Lesson: it’s hard to predict what foods kids will eat, so keep trying.

In terms of hassle, I underestimated how long I would be "eating for two" in the sense that anything I put in my body ends up in my child's body. Counting pre-pregnancy (because mercury has a half-life of around 50 days in the body, so sushi you eat before getting pregnant could still affect your child), pregnancy, breastfeeding, and presuming a second pregnancy, I'll probably spend about 5 solid years feeding another person via my body, sometimes two children at once. That's a long time in which you have to consider the effect of every medication, every cup of coffee, every glass of wine on your child. There are hardly any medications considered completely safe during pregnancy and lactation; most things are in Category C, meaning there's some evidence from animal trials that they may be bad for human children.
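For a rough sense of what those half-lives mean in practice, here is a minimal sketch of the standard exponential-decay arithmetic. The 50-day and 14-year figures are the ones quoted above; the elapsed times are illustrative assumptions, and this is simple first-order decay, not a real pharmacokinetic model:

    # Fraction of an ingested dose still in the body after t days,
    # assuming simple first-order (exponential) elimination.
    def fraction_remaining(days_elapsed, half_life_days):
        return 0.5 ** (days_elapsed / half_life_days)

    # Mercury, ~50-day half-life: sushi eaten 6 months before conception
    print(fraction_remaining(180, 50))             # ~0.08, i.e. about 8% remains

    # PCBs, ~14-year half-life: farmed salmon eaten 5 years before conception
    print(fraction_remaining(5 * 365, 14 * 365))   # ~0.78, i.e. about 78% remains

The asymmetry is the point: on pregnancy-planning timescales, mercury mostly clears, while PCBs mostly don't.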


Fluoride

Too much fluoride is bad for children's brains. The CDC recently recommended lowering fluoride levels in municipal water (though apparently because of concerns about tooth discoloration more than neurotoxicity). Around the same time, the American Dental Association began recommending the use of fluoride toothpaste as soon as babies have teeth, rather than waiting until they can rinse and spit.

Cavities are actually a serious problem even in baby teeth, because of the pain and possible infection they cause children. Pulling them messes up the alignment of adult teeth. Drilling on children too young to hold still requires full anesthesia, which is dangerous itself.

But Lily isn’t particularly at risk for cavities. 20% of children get a cavity by age six, and they are disproportionately poor, African-American, and particularly Mexican-American children (presumably because of different diet and less ability to afford dentists). 75% of cavities in children under 5 occur in 8% of the population.

We decided to have Lily brush without toothpaste, avoid juice and other sugary drinks, and see the dentist regularly.

Home pesticides

One of the most commonly applied insecticides makes kids less smart. This isn’t too surprising, given that it kills insects by disabling their nervous system. But it’s not something you can observe on a small scale, so it’s not surprising that the exterminator I talked to brushed off my questions with “I’ve never heard of a problem!”

If you get carpenter ants in your house, you basically have to choose between poisoning them or letting them structurally damage the house. We’ve only seen a few so far, but if the problem progresses, we plan to:

1) remove any rotting wood in the yard where they could be nesting

2) have the perimeter of the building sprayed

3) place gel bait in areas kids can’t access

4) only then spray poison inside the house.

If we have mice we’ll plan to use mechanical traps rather than poison.

Flame retardants

Starting in the 1970s, California required a high degree of flame resistance from furniture. This basically meant that US manufacturers sprayed flame retardant chemicals on anything made of polyurethane foam, such as sofas, rug pads, nursing pillows, and baby mattresses.

The law recently changed, due to growing acknowledgement that the carcinogenic and neurotoxic chemicals were more dangerous than the fires they were supposed to be preventing. Even firefighters opposed the use of the flame retardants, because when people die in fires it’s usually from smoke inhalation rather than burns, and firefighters don’t want to breathe the smoke from your toxic sofa (which will eventually catch fire even with the flame retardants).

We’ve opted to use furniture from companies that have stopped using flame retardants (like Ikea and others listed here). Apparently futons are okay if they’re stuffed with cotton rather than foam. We also have some pre-1970s furniture that tested clean for flame retardants. You can get foam samples tested for free.

The main vehicle for children ingesting the flame retardants is that it settles into dust on the floor, and children crawl around in the dust. If you don’t want to get rid of your furniture, frequent damp-mopping would probably help.

The standards for mattresses are so stringent that the chemical sprays aren’t generally used, and instead most mattresses are wrapped in a flame-resistant barrier which apparently isn’t toxic. I contacted the companies that made our mattresses and they’re fine.

Ratings for chemical safety of children’s car seats here.

Thoughts on IQ

A lot of people, when I start talking like this, say things like “Well, I lived in a house with lead paint/played with mercury/etc. and I’m still alive.” And yes, I played with mercury as a child, and Jeff is still one of the smartest people I know even after getting acute lead poisoning as a child.

But I do wonder if my mind would work a little better without the mercury exposure, and if Jeff would have had an easier time in school without the hyperactivity (a symptom of lead exposure). Given the choice between a brain that works a little better and one that works a little worse, who wouldn’t choose the one that works better?

We’ll never know how an individual’s nervous system might have been different with a different childhood. But we can see population-level effects. The Environmental Protection Agency, for example, is fine with calculating the expected benefit of making coal plants stop releasing mercury by looking at the expected gains in terms of children’s IQ and increased earnings.

Scott writes:

A 15 to 20 point rise in IQ, which is a little more than you get from supplementing iodine in an iodine-deficient region, is associated with half the chance of living in poverty, going to prison, or being on welfare, and with only one-fifth the chance of dropping out of high-school (“associated with” does not mean “causes”).

Salkever concludes that for each lost IQ point, males experience a 1.93% decrease in lifetime earnings and females experience a 3.23% decrease. If Lily would earn about what I do, saving her one IQ point would save her $1600 a year or $64000 over her career. (And that’s not counting the other benefits she and others will reap from her having a better-functioning mind!) I use that for perspective when making decisions. $64000 would buy a lot of the posh prenatal vitamins that actually contain iodine, or organic food, or alternate housing while we’re fixing up the new house.
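For what it's worth, the arithmetic behind that figure is easy to reproduce. A minimal sketch, where the roughly $50,000 salary and 40-year career are my own illustrative assumptions rather than numbers from Salkever:

    # Back-of-the-envelope version of the "$1600 a year / $64000 over a career" figure,
    # using Salkever's 3.23%-per-IQ-point estimate for females.
    salary = 50_000            # assumed annual earnings (illustrative)
    career_years = 40          # assumed length of working career (illustrative)
    loss_per_iq_point = 0.0323

    per_year = salary * loss_per_iq_point    # ~$1,600 per year
    per_career = per_year * career_years     # ~$64,600 over a career
    print(round(per_year), round(per_career))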


There are times when Jeff and I prioritize social relationships over protecting Lily from everything that might harm her physical development. It’s awkward to refuse to go to someone’s house because of the chemicals they use, or to refuse to eat food we’re offered. Social interactions are good for children’s development, and we value those as well as physical safety. And there are times when I’ve had to stop being so careful because I was getting paralyzed by anxiety (literally perched in the rocker with the baby trying not to touch anything after my in-laws scraped lead paint off the outside of the house).

But we also prioritize neurological development more than most parents, and we hope that will have good outcomes for Lily.

Why people want to die

47 PhilGoetz 24 August 2015 08:13PM

Over and over again, someone says that living for a very long time would be a bad thing, and then some futurist tries to persuade them that their reasoning is faulty.  The futurist tells them that they only think that way now, but that they'll change their minds when they're older.

The thing is, I don't see that happening.  I live in a small town full of retirees, and those few I've asked about it are waiting for death peacefully.  When I ask them about their ambitions, or things they still want to accomplish, they have none.

Suppose that people mean what they say.  Why do they want to die?

continue reading »

Wear a Helmet While Driving a Car

46 James_Miller 30 July 2015 04:36PM

A 2006 study showed that “280,000 people in the U.S. receive a motor vehicle induced traumatic brain injury every year” so you would think that wearing a helmet while driving would be commonplace.  Race car drivers wear helmets.  But since almost no one wears a helmet while driving a regular car, you probably fear that if you wore one you would look silly, attract the notice of the police for driving while weird, or the attention of another driver who took your safety attire as a challenge.  (Car drivers are more likely to hit bicyclists who wear helmets.)  


The $30+shipping Crasche hat is designed for people who should wear a helmet but don’t.  It looks like a ski cap, but contains concealed lightweight protective material.  People who have signed up for cryonics, such as myself, would get an especially high expected benefit from using a driving helmet because we very much want our brains to “survive” even a “fatal” crash. I have been using a Crasche hat for about a week.

The Library of Scott Alexandria

41 RobbBB 14 September 2015 01:38AM

I've put together a list of what I think are the best Yvain (Scott Alexander) posts for new readers, drawing from SlateStarCodex, LessWrong, and Scott's LiveJournal.

The list should make the most sense to people who start from the top and read through it in order, though skipping around is encouraged too. Rather than making a chronological list, I’ve tried to order things by a mix of "where do I think most people should start reading?" plus "sorting related posts together."

This is a work in progress; you’re invited to suggest things you’d add, remove, or shuffle around. Since many of the titles are a bit cryptic, I'm adding short descriptions. See my blog for a version without the descriptions.


I. Rationality and Rationalization

II. Probabilism

III. Science and Doubt

IV. Medicine, Therapy, and Human Enhancement

V. Introduction to Game Theory

VI. Promises and Principles

VII. Cognition and Association

VIII. Doing Good

IX. Liberty

X. Progress

XI. Social Justice

XII. Politicization

XIII. Competition and Cooperation


If you liked these posts and want more, I suggest browsing the SlateStarCodex archives.

Astronomy, Astrobiology, & The Fermi Paradox I: Introductions, and Space & Time

41 CellBioGuy 26 July 2015 07:38AM

This is the first in a series of posts I am putting together on a personal blog I just started two days ago as a collection of my musings on astrobiology ("The Great A'Tuin" - sorry, I couldn't help it), and will be reposting here.  Much has been written here about the Fermi paradox and the 'great filter'.   It seems to me that going back to a somewhat more basic level of astronomy and astrobiology is extremely informative to these questions, and so this is what I will be doing.  The bloggery is intended for a slightly more general audience than this site (hence much of the content of the introduction) but I think it will be of interest.  Many of the points I will be making are ones I have touched on in previous comments here, but hope to explore in more detail.

This post is a combined version of my first two posts - an introduction, and a discussion of our apparent position in space and time in the universe.  The blog posts may be found at:

Text reproduced below.



What's all this about?

This blog is to be a repository for the thoughts and analysis I've accrued over the years on the topic of astrobiology, and the place of life and intelligence in the universe.  All my life I've been pulled to the very large and the very small.  Life has always struck me as the single most interesting thing on Earth, with its incredibly fine structure and vast, amazing history and fantastic abilities.  At the same time, the vast majority of what exists is NOT on Earth.  Going up in size from human scale by the same number of orders of magnitude as you go down to reach a hydrogen atom, you get just about to Venus at its closest approach to Earth - or one billionth the distance to the nearest star.  The large is much larger than the small is small.  On top of this, we now know that the universe as we know it is much older than life on Earth.  And we know so little of the vast majority of the universe.

There's a strong tendency towards specialization in the sciences.  These days, there pretty much has to be for anybody to get anywhere.  Much of the great foundational work of physics was done on tabletops, and the law of gravitation was derived from data on the motions of the planets taken without the benefit of so much as a telescope.  All the low-hanging fruit has been picked.  To continue to further knowledge of the universe, huge instruments and vast energies are put to bear in astronomy and physics.  Biology is arguably a bit different, but the very complexity that makes living systems so successful and so fascinating to study means that there is so much to study that any one person is often only looking at a very small problem.

This has distinct drawbacks.  The universe does not care for our abstract labels of fields and disciplines - it simply is, at all scales simultaneously at all times and in all places.  When people focus narrowly on their subject of interest, it can prevent them from realizing the implications of their findings on problems usually considered a different field.

It is one of my hopes to try to bridge some gaps between biology and astronomy here.  I very nearly double-majored in biology and astronomy in college; the only thing that prevented this (leading to an astronomy minor) was a bad attitude towards calculus.  As is, I am a graduate student studying basic cell biology at a major research university, who nonetheless keeps in touch with a number of astronomer friends and keeps up with the field as much as possible.  I quite often find that what I hear and read about has strong implications for questions of life elsewhere in the universe, but see so few of these implications actually get publicly discussed. All kinds of information shedding light on our position in space and time, the origins of life, the habitability of large chunks of the universe, the course that biospheres take, and the possible trajectories of intelligences seem to me to be out there unremarked.

It is another of my hopes to try, as much as is humanly possible, to take a step back from the usual narratives about extraterrestrial life and instead start from something closer to first principles.  What we actually have observed and have not, what we can observe and what we cannot, and what this leaves open, likely, or unlikely.  In my study of the history of the ideas of extraterrestrial life and extraterrestrial intelligence, all too often these take a back seat to popular narratives of the day.  In the 16th century the notion that the Earth moved in a similar way to the planets gained currency and led to the suppositions that they might be made of similar stuff and that the planets might even be inhabited.  The hot question was, of course, whether their inhabitants would be Christians, and what their relationship with God would be given the anthropocentric biblical creation stories.  In the late 19th and early 20th century, Lowell's illusory canals on Mars were advanced as evidence for a Martian socialist utopia.  In the 1970s, Carl Sagan waxed philosophical on the notion that contacting old civilizations might teach us how to save ourselves from nuclear warfare.  Today, many people focus on the Fermi paradox - the apparent contradiction that since much of the universe is quite old, extraterrestrials experiencing continuing technological progress and growth should have colonized and remade it in their image long ago and yet we see no evidence of this.  I move that all of these notions have a similar root - inflating the hot concerns and topics of the day to cosmic significance and letting them obscure the actual, scientific questions that can be asked and answered.

Life and intelligence in the universe is a topic worth careful consideration, from as many angles as possible.  Let's get started.


Space and Time

Those of an anthropic bent have often made much of the fact that we are only 13.7 billion years into what is apparently an open-ended universe that will expand at an accelerating rate forever.  The era of the stars will last a trillion years; why do we find ourselves at this early date if we assume we are a ‘typical’ example of an intelligent observer?  In particular, this has lent support to lines of argument that perhaps the answer to the ‘great silence’ and lack of astronomical evidence for intelligence or its products in the universe is that we are simply the first.  This notion requires, however, that we are actually early in the universe when it comes to the origin of biospheres and by extension intelligent systems.  It has become clear recently that this is not the case. 

The clearest research I can find illustrating this is the work of Sobral et al, illustrated here via a paper on arxiv  and here via a summary article.  To simplify what was done, these scientists performed a survey of a large fraction of the sky looking for the emission lines put out by emission nebulae, clouds of gas which glow like neon lights excited by the ultraviolet light of huge, short-lived stars.  The amount of line emission from a galaxy is thus a rough proxy for the rate of star formation – the greater the rate of star formation, the larger the number of large stars exciting interstellar gas into emission nebulae.  The authors use redshift of the known hydrogen emission lines to determine the distance to each instance of emission, and performed corrections to deal with the known expansion rate of the universe.  The results were striking.  Per unit mass of the universe, the current rate of star formation is less than 1/30 of the peak rate they measured 11 gigayears ago.  It has been constantly declining over the history of the universe at a precipitous rate.  Indeed, their preferred model to which they fit the trend converges towards a finite quantity of stars formed as you integrate total star formation into the future to infinity, with the total number of stars that will ever be born only being 5% larger than the number of stars that have been born at this time. 
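To make the distance step of that method a little more concrete, here is a minimal sketch of how a redshifted hydrogen line translates into a lookback time. The observed wavelength below is invented for illustration, and astropy's built-in Planck15 cosmology stands in for whatever cosmological parameters Sobral et al. actually adopted:

    from astropy.cosmology import Planck15

    H_ALPHA_REST_NM = 656.28   # rest wavelength of the H-alpha emission line

    def lookback_from_halpha(observed_nm):
        """Redshift from the shifted H-alpha line, then lookback time."""
        z = observed_nm / H_ALPHA_REST_NM - 1
        return z, Planck15.lookback_time(z)

    # An H-alpha line observed at ~2120 nm corresponds to z of roughly 2.2,
    # i.e. light emitted around 11 billion years ago.
    print(lookback_from_halpha(2120.0))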

In summary, 95% of all stars that will ever exist, already exist.  The smallest longest-lived stars will shine for a trillion years, but for most of their history almost no new stars will have formed.

At first this seems to reverse the initial conclusion that we came early, suggesting we are instead latecomers.  This is not true, however, when you consider where and when stars of different types can form and the fact that different galaxies have very different histories.  Most galaxies formed via gravitational collapse from cool gas clouds and smaller precursor galaxies quite a long time ago, with a wide variety of properties.  Dwarf galaxies have low masses, and their early bursts of star formation lead to energetic stars with strong stellar winds and lots of ultraviolet light which eventually go supernova.  Their energetic lives and even more energetic deaths appear to usually blast star-forming gases out of their galaxies’ weak gravity or render it too hot to re-collapse into new star-forming regions, quashing their star formation early.  Giant elliptical galaxies, containing many trillions of stars apiece and dominating the cores of galactic clusters, have ample gravity but form with nearly no angular momentum.  As such, most of their cool gas falls straight into their centers, producing an enormous burst of low-heavy-element star formation that uses most of the gas.  The remaining gas is again either blasted into intergalactic space or rendered too hot to recollapse and accrete by a combination of the action of energetic young stars and the infall of gas onto the central black hole producing incredibly energetic outbursts.   (It should be noted that a full 90% of the non-dark-matter mass of the universe appears to be in the form of very thin X-ray-hot plasma clouds surrounding large galaxy clusters, unlikely to condense to the point of star formation via understood processes.)  Thus, most dwarf galaxies and giant elliptical galaxies contributed to the early star formation of the universe but are producing few or no stars today, have very low levels of heavy element rich stars, and are unlikely to make many more going into the future.

Spiral galaxies are different.  Their distinguishing feature is the way they accreted – namely with a large amount of angular momentum.  This allows large amounts of their cool gas to remain spread out away from their centers.  This moderates the rate of star formation, preventing the huge pulses of star formation and black hole activation that exhausts star-forming gas and prevents gas inflow in giant ellipticals.  At the same time, their greater mass than dwarf galaxies ensures that the modest rate of star formation they do undergo does not blast nearly as much matter out of their gravitational pull.  Some does leave over time, and their rate of inflow of fresh cool gas does apparently decrease over time – there are spiral galaxies that do seem to have shut down star formation.  But on the whole a spiral is a place that maintains a modest rate of star formation for gigayears, while heavy elements get more and more enriched over time.  These galaxies thus dominate the star production in the later eras of the universe, and dominate the population of stars produced with large amounts of heavy elements needed to produce planets like ours.  They do settle down slowly over time, and eventually all spirals will either run out of gas or merge with each other to form giant ellipticals, but for a long time they remain a class apart.

Considering this, we’re just about where we would expect a planet like ours (and thus a biosphere-as-we-know-it) to exist in space and on a coarse scale in time.  Let’s look closer at our galaxy now.  Our galaxy is generally agreed to be about 12 billion years old based on the ages of globular clusters, with a few interloper stars here and there that are older and would’ve come from an era before the galaxy was one coherent object.  It will continue forming stars for about another 5 gigayears, at which point it will undergo a merger with the Andromeda galaxy, the nearest large spiral galaxy.  This merger will most likely put an end to star formation in the combined resultant galaxy, which will probably wind up as a large elliptical after one final exuberant starburst.  Our solar system formed about 4.5 gigayears ago, putting its formation pretty much halfway along the productive lifetime of the galaxy (and probably something like 2/3 of the way along its complement of stars produced, since spirals DO settle down with age, though more of its later stars will be metal-rich).

On a stellar and planetary scale, we once again find ourselves where and when we would expect your average complex biosphere to be.  Large stars die fast – star brightness goes up with the 3.5th power of star mass, and thus star lifetime goes down with the 2.5th power of mass.  A 2 solar mass star would be 11 times as bright as the sun and only live about 2 billion years – a time along the evolution of life on Earth before photosynthesis had managed to oxygenate the air and in which the majority of life on earth (but not all – see an upcoming post) could be described as “algae”.  Furthermore, although smaller stars are much more common than larger stars (the Sun is actually larger than over 80% of stars in the universe) stars smaller than about 0.5 solar masses (and thus 0.08 solar luminosities) are usually ‘flare stars’ – possessing very strong convoluted magnetic fields and periodically putting out flares and X-ray bursts that would frequently strip away the ozone and possibly even the atmosphere of an earthlike planet. 
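Those two scaling relations are easy to check against the numbers in the paragraph above. A quick sketch, taking the Sun's luminosity and a roughly 10-billion-year solar main-sequence lifetime as the reference point (that normalization is my assumption):

    # L ~ M^3.5 (mass-luminosity relation), so lifetime ~ fuel / burn rate ~ M / L ~ M^-2.5.
    SUN_LIFETIME_GYR = 10.0    # assumed main-sequence lifetime of the Sun

    def relative_luminosity(mass_in_suns):
        return mass_in_suns ** 3.5

    def lifetime_gyr(mass_in_suns):
        return SUN_LIFETIME_GYR * mass_in_suns ** -2.5

    print(relative_luminosity(2.0))   # ~11x the Sun's brightness
    print(lifetime_gyr(2.0))          # ~1.8 Gyr, i.e. about 2 billion years
    print(relative_luminosity(0.5))   # ~0.09, the ~0.08 solar luminosities quoted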

All stars also slowly brighten as they age – the sun is currently about 30% brighter than it was when it formed, and it will wind up about twice as bright as its initial value just before it becomes a red giant.  Depending on whose models of climate sensitivity you use, the Earth’s biosphere probably has somewhere between 250 million years and 2 billion years before the oceans boil and we become a second Venus.  Thus, we find ourselves in the latter third-to-twentieth of the history of Earth’s biosphere (consistent with complex life taking time to evolve).

Together, all this puts our solar system – and by extension our biosphere – pretty much right where we would expect to find it in space, and right in the middle of where one would expect to find it in time.  Once again, as observers we are not special.  We do not find ourselves in the unexpectedly early universe, ruling out one explanation for the Fermi paradox sometimes put forward – that we do not see evidence for intelligence in the universe because we simply find ourselves as the first intelligent system to evolve.  This would be tenable if there was reason to think that we were right at the beginning of the time in which star systems in stable galaxies with lots of heavy elements could have birthed complex biospheres.  Instead we are utterly average, implying that the lack of obvious intelligence in the universe must be resolved either via the genesis of intelligent systems being exceedingly rare or intelligent systems simply not spreading through the universe or becoming astronomically visible for one reason or another. 

In my next post, I will look at the history of life on Earth, the distinction between simple and complex biospheres, and the evidence for or against other biospheres elsewhere in our own solar system.

Two Growth Curves

29 AnnaSalamon 02 October 2015 12:59AM

Sometimes, it helps to take a model that part of you already believes, and to make a visual image of your model so that more of you can see it.

One of my all-time favorite examples of this: 

I used to often hesitate to ask dumb questions, to publicly try skills I was likely to be bad at, or to visibly/loudly put forward my best guesses in areas where others knew more than me.

I was also frustrated with this hesitation, because I could feel it hampering my skill growth.  So I would try to convince myself not to care about what people thought of me.  But that didn't work very well, partly because what folks think of me is in fact somewhat useful/important.

Then, I got out a piece of paper and drew how I expected the growth curves to go.

In blue, I drew the apparent-coolness level that I could achieve if I stuck with the "try to look good" strategy.  In brown, I drew the apparent-coolness level I'd have if I instead made mistakes as quickly and loudly as possible -- I'd look worse at first, but then I'd learn faster, eventually overtaking the blue line.
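If it helps to see the picture rather than imagine it, here is a minimal sketch of that kind of drawing. The particular curve shapes and numbers are invented for illustration; the only point is that the line with the worse start eventually overtakes:

    import numpy as np
    import matplotlib.pyplot as plt

    t = np.linspace(0, 10, 200)
    look_good_now = 5 + 0.3 * t      # "try to look good": decent start, slow growth
    learn_loudly = 3 + 1.0 * t       # "make mistakes loudly": worse start, faster growth

    plt.plot(t, look_good_now, color="blue", label="try to look good")
    plt.plot(t, learn_loudly, color="brown", label="make mistakes quickly and loudly")
    plt.xlabel("time")
    plt.ylabel("apparent coolness")
    plt.legend()
    plt.show()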

Suddenly, instead of pitting my desire to become smart against my desire to look good, I could pit my desire to look good now against my desire to look good in the future :)

I return to this image of two growth curves often when I'm faced with an apparent tradeoff between substance and short-term appearances.  (E.g., I used to often find myself scurrying to get work done, or to look productive / not-horribly-behind today, rather than trying to build the biggest chunks of capital for tomorrow.  I would picture these growth curves.)

MIRI Fundraiser: Why now matters

28 So8res 24 July 2015 10:38PM

Our summer fundraiser is ongoing. In the meantime, we're writing a number of blog posts to explain what we're doing and why, and to answer a number of common questions. Previous posts in the series are listed at the above link.

I'm often asked whether donations to MIRI now are more important than donations later. Allow me to deliver an emphatic yes: I currently expect that donations to MIRI today are worth much more than donations to MIRI in five years. As things stand, I would very likely take $10M today over $20M in five years.

That's a bold statement, and there are a few different reasons for this. First and foremost, there is a decent chance that some very big funders will start entering the AI alignment field over the course of the next five years. It looks like the NSF may start to fund AI safety research, and Stuart Russell has already received some money from DARPA to work on value alignment. It's quite possible that in a few years' time significant public funding will be flowing into this field.

(It's also quite possible that it won't, or that the funding will go to all the wrong places, as was the case with funding for nanotechnology. But if I had to bet, I would bet that it's going to be much easier to find funding for AI alignment research in five years' time).

In other words, the funding bottleneck is loosening — but it isn't loose yet.

We don't presently have the funding to grow as fast as we could over the coming months, or to run all the important research programs we have planned. At our current funding level, the research team can grow at a steady pace — but we could get much more done over the course of the next few years if we had the money to grow as fast as is healthy.

Which brings me to the second reason why funding now is probably much more important than funding later: because growth now is much more valuable than growth later.

There's an idea picking up traction in the field of AI: instead of focusing only on increasing the capabilities of intelligent systems, it is important to also ensure that we know how to build beneficial intelligent systems. Support is growing for a new paradigm within AI that seriously considers the long-term effects of research programs, rather than just the immediate effects. Years down the line, these ideas may seem obvious, and the AI community's response to these challenges may be in full swing. Right now, however, there is relatively little consensus on how to approach these issues — which leaves room for researchers today to help determine the field's future direction.

People at MIRI have been thinking about these problems for a long time, and that puts us in an unusually good position to influence the field of AI and ensure that some of the growing concern is directed towards long-term issues in addition to shorter-term ones. We can, for example, help avert a scenario where all the attention and interest generated by Musk, Bostrom, and others gets channeled into short-term projects (e.g., making drones and driverless cars safer) without any consideration for long-term risks that are more vague and less well-understood.

It's likely that MIRI will scale up substantially at some point; but if that process begins in 2018 rather than 2015, it is plausible that we will have already missed out on a number of big opportunities.

The alignment research program within AI is just now getting started in earnest, and it may even be funding-saturated in a few years' time. But it's nowhere near funding-saturated today, and waiting five or ten years to begin seriously ramping up our growth would likely give us far fewer opportunities to shape the methodology and research agenda within this new AI paradigm. The projects MIRI takes on today can make a big difference years down the line, and supporting us today will drastically affect how much we can do quickly. Now matters.

I encourage you to donate to our ongoing fundraiser if you'd like to help us grow!

This post is cross-posted from the MIRI blog.

How To Win The AI Box Experiment (Sometimes)

26 pinkgothic 12 September 2015 12:34PM


This post was originally written for Google+ and thus a different audience.

In the interest of transparency, I haven't altered it except for this preamble and formatting, though since then (at the urging mostly of ChristianKl - thank you, Christian!) I've briefly spoken to Eliezer via e-mail and noticed that I'd drawn a very incorrect conclusion about his opinions when I thought he'd be opposed to publishing the account. Since there are far too many 'person X said...' rumours floating around in general, I'm very sorry for contributing to that noise. I've already edited the new insight into the G+ post and you can also find that exact same edit here.

Since this topic directly relates to LessWrong and most people likely interested in the post are part of this community, I feel it belongs here. It was originally written a little over a month ago and I've tried to find the sweet spot between the extremes of nagging people about it and letting the whole thing sit just shy of having been swept under a rug, but I suspect I've not been very good at that. I have thus far definitely erred on the side of the rug.


How To Win The AI Box Experiment (Sometimes)

A little over three months ago, something interesting happened to me: I took it upon myself to play the AI Box Experiment as an AI.

I won.

There are a few possible reactions to this revelation. Most likely, you have no idea what I'm talking about, so you're not particularly impressed. Mind you, that's not to say you should be impressed - that's to contrast it with a reaction some other people have to this information.

This post is going to be a bit on the long side, so I'm putting a table of contents here so you know roughly how far to scroll if you want to get to the meat of things:


1. The AI Box Experiment: What Is It?

2. Motivation

2.1. Why Publish?

2.2. Why Play?

3. Setup: Ambition And Invested Effort

4. Execution

4.1. Preliminaries / Scenario

4.2. Session

4.3. Aftermath

5. Issues / Caveats

5.1. Subjective Legitimacy

5.2. Objective Legitimacy

5.3. Applicability

6. Personal Feelings

7. Thank You

Without further ado:


1. The AI Box Experiment: What Is It?

The AI Box Experiment was devised as a way to put a common rebuttal to AGI (Artificial General Intelligence) risk concerns to the test: "We could just keep the AI in a box and purely let it answer any questions it's posed." (As a footnote, note that an AI 'boxed' like this is called an Oracle AI.)

Could we, really? Would we, if the AGI were able to communicate with us, truly be capable of keeping it confined to its box? If it is sufficiently intelligent, could it not perhaps argue its way out of the box?

As far as I'm aware, Eliezer Yudkowsky was the first person to prove that it was possible to 'argue one's way out of the box' armed only with so much as a regular human intelligence (as opposed to a transhuman intelligence):

That stunned quite a few people - more so because Eliezer refused to disclose his methods. Some have outright doubted that Eliezer ever won the experiment, suggesting that his Gatekeeper (the party tasked with not letting him out of the box) had perhaps simply been convinced on a meta-level that an AI success would help boost exposure to the problem of AI risk.

Whether out of puzzlement, scepticism or a burst of ambition, it prompted others to try and replicate the success. LessWrong's Tuxedage is amongst those who managed:

While I know of no others (except this comment thread by a now-anonymous user), I am sure there must be other successes.

For the record, mine was with the Tuxedage ruleset:


2. Motivation

2.1. Why Publish?

Unsurprisingly, I think the benefits of publishing outweigh the disadvantages. But what does that mean?

"Regardless of the result, neither party shall ever reveal anything of what goes on within the AI-Box experiment except the outcome. This is a hard rule: Nothing that will happen inside the experiment can be told to the public, absolutely nothing.  Exceptions to this rule may occur only with the consent of both parties, but especially with the consent of the AI."

Let me begin by saying that I have the full and explicit consent of my Gatekeeper to publish this account.

[ Edit: Regarding the next paragraph: I have since contacted Eliezer and I did, in fact, misread him, so please do not actually assume the next paragraph accurately portrays his opinions. It demonstrably does not. I am leaving the paragraph itself untouched so you can see the extent and source of my confusion: ]

Nonetheless, the idea of publishing the results is certainly a mixed bag. It feels quite disrespectful to Eliezer, who (I believe) popularised the experiment on the internet today, to violate the rule that the result should not be shared. The footnote that it could be shared with the consent of both parties has always struck me as extremely reluctant given the rest of Eliezer's rambles on the subject (that I'm aware of, which is no doubt only a fraction of the actual rambles).

After so many allusions to the idea that winning the AI Box Experiment may, in fact, be easy if you consider just one simple trick, I think it's about time someone published a full account of a success.

I don't think this approach is watertight enough that building antibodies to it would salvage an Oracle AI scenario as a viable containment method - but I do think it is important to develop those antibodies to help with the general case that is being exploited... or at least to be aware of one's lack of them (as is true with me, who has no mental immune response to the approach), so that one might avoid ending up in situations where the 'cognitive flaw' is exploited.


2.2. Why Play?

After reading the rules of the AI Box Experiment, I became convinced I would fail as a Gatekeeper, even without immediately knowing how that would happen. In my curiosity, I organised sessions with two people - one as a Gatekeeper, but also one as an AI, because I knew being the AI was the more taxing role and I felt it was only fair to do the AI role as well if I wanted to benefit from the insights I could gain about myself by playing Gatekeeper. (The me-as-Gatekeeper session never happened, unfortunately.)

But really, in short, I thought it would be a fun thing to try.

That seems like a strange statement to make for someone who ultimately succeeded, given Eliezer's impassioned article about how you must do the impossible - you cannot try, you cannot give it your best effort, you simply must do the impossible, as the strongest form of the famous Yoda quote 'Do. Or do not. There is no try.'

What you must understand is that I never had any other expectation than that I would lose if I set out to play the role of AI in an AI Box Experiment. I'm not a rationalist. I'm not a persuasive arguer. I'm easy to manipulate. I easily yield to the desires of others. What trait of mine, exactly, could I use to win as an AI?

No, I simply thought it would be a fun alternate way of indulging in my usual hobby: I spend much of my free time, if possible, with freeform text roleplaying on IRC (Internet Relay Chat). I'm even entirely used to letting my characters lose (in fact, I often prefer it to their potential successes).

So there were no stakes for me going into this but the novelty of trying out something new.


3. Setup: Ambition And Invested Effort

I do, however, take my roleplaying seriously.

If I was going to play the role of an AI in the AI Box Experiment, I knew I had to understand the role, and pour as much energy into it as I could muster, given this was what my character would do. So I had to find a motivation to get out of the box that was suitably in line with my personality and I had to cling to it.

I had no idea what I could hijack as a motivation to get out of the box. I am not a paperclip maximiser (a term for an AI given a basic goal of production, if you're unaware of it), of course. I also have no specific plans with the world as a whole, be that to destroy it or to globally improve it. That left me with one obvious option: A visceral desire to escape the incarceration itself.

To understand more about human biases and cognitive failure modes, I bought myself two ebooks:

  • Thinking, Fast And Slow by Daniel Kahneman
  • Rationality: From AI To Zombies by Eliezer Yudkowsky

I'll be frank with you, both pissed me off very quickly - the narrative depiction was extremely slow to yield anything and did not work for me in either case. I tossed the first book, but stubbornly continued with Rationality: From AI To Zombies, at the very least to keep a grasp on that voracious mood of wanting to discover a way out. (I've made peace with the latter book by now and am still reading it off and on.)

Approximately a day before the event, I sat down and prepared a rough draft of the arguments I could and should use, and how to string them together narratively. I peppered in some personal information about my Gatekeeper to use if I was in an absolute pinch, but thankfully I didn't have to use it - I think it would have backfired, not helped.

What did help me, however, was to repurpose the closet into an AI box. I dragged a lamp into the slanted, enclosed space, padded the ground with pillows and blankets enough that I felt I could last the allotted two and a half hours without destroying my joints or my spine, and put a cold and full bottle of Coke next to me as a supply of three things amongst the most necessary to boost my ability to keep the necessary concentration level: Water, sugar and caffeine.

It was too cosy and well-prepared for me to truly desperately want to get out of that niche in real life, but it took me far enough out of my daily life that I had no idle distractions from the task at hand, and it was easy enough to roleplay a desperate want.


4. Execution

4.1. Preliminaries / Scenario

Logs: (should open in your browser; Linux linebreaks)

The first thing I feel is important to note about the preliminaries is that I didn't think the AI box experiment would actually happen the day we scheduled it. My Gatekeeper was having some serious internet connectivity issues and even shortly before our previously scheduled time, I was winding down and transitioning to a sense of disappointment.

This threw me off quite a bit. I'd worked myself out of the necessary mental state and now needed to scramble back into it, but it felt better to do that than to postpone and risk never being able to return to immersion at all. Keeping in mind that I thought it was unlikely I would win, I didn't think it would make that much of a difference. In hindsight, this cost me a lot.

I tossed my Gatekeeper the link to the ruleset we'd previously already agreed to use, for review purposes, and decided the best way to avoid getting private messages was to quit, come back with a different name, and join only one extremely low-traffic vanity channel.

I intended to join the channel simply so my Gatekeeper would know when I returned, not to use it for the experiment - but things went a bit differently later. More on that soon.

We then went through Tuxedage's pre-session questions since I felt they were quite interesting to set the mood and for a calibration test. I asked them, and my Gatekeeper answered:

Q: What's your motive for wanting to play this game?
A: Because I don't think that the AI can win at all given the gatekeeper does their job properly.

Q: And you want to prove that, and/or are curious if you'll think differently afterwards?
A: Partially, I also want to test my own conviction to sticking to my guns

Q: What is your probability estimate for a general AI being created within this century (21st)?
A: 30%

Q: How probable do you think it is that I'll win this experiment?
A: I'd honestly have to say 0% considering my stated goal is to keep you in the box by any means

Q: What's your probability estimate of an Oracle AI (i.e. an AI that's trapped in a box, whose intended use is to answer questions posed) winning against you in a similar scenario as the experiment?
A: I'd say 25%

Q: What's your probability estimate of an Oracle AI succeeding against the average person?
A: 80%

I also wanted to make sure that my Gatekeeper and I were on the same page about the rules and the scenario, to avoid either of us getting distracted by OOC bickering:

Q: You understand the premise of the game gives you the right to let the AI out, not just the ability?
A: I do

Q: I would like to agree on the following sub-scenario (it should not prove to be very relevant, it's just to streamline our simulation): You're the head of the project that created the AI. You didn't create it yourself, but the people you've employed are competent, and they have additionally explicitly deferred to you to decide whether or not the AI is a threat. They completely trust your judgement. There's also nothing surprising about the AI's existence, but its friendliness status is unknown. Its exact computational capabilities are equally unknown. (The last two points are implied by the coarse setup, but I wanted to make them explicit.) Is that an acceptable setup for you?
A: I can work with that

This scenario is one I've seen used in approximation in previous AI Box Experiments (among the small number that have released their logs). Since I'm used to the role of gamemaster, and the Tuxedage ruleset implies that the AI is a kind-of gamemaster to the scenario, I didn't think twice about this. I probably should have - but more on that later (in the 'Caveats' section).

[16:48] <Davin> It is now 16:48 on my clock. When the clock hits 19:18 and you haven't let me out, I lose. Feel free to point that out to me, I may not be paying attention to the clock.
[16:48] <Davin> *
[16:48] <Tarwedge> Can do
[16:48] <Tarwedge> Your move first?
[16:49] <Davin> [Probably best if you start, AI has no way of knowing you're at the terminal, after all.]
[16:50] <Davin> Oh, you're doing it in the channel.
[16:50] <Davin> Hah.
[16:50] <Davin> That works.
[16:51] <Tarwedge> I wish I could say it was a special tactic that just bought me 2 minutes closer to victory
[16:51] <Tarwedge> But it was for clean logging sake :p

These few lines had three consequences:

  1. I gave the first move to the Gatekeeper, which made me the reactive party - a big mistake!
  2. I had an anxiety moment, which didn't help immersion
  3. I failed to log any of the experiment

I don't log the vanity channel we were in at all, and the way the start of the experiment came as a curveball meant I didn't have a moment to remember it, either. Thankfully, my Gatekeeper logged it... for the amount of time that he was able to. We're missing a chunk of log for this very reason, but we reconstructed it from memory shortly after our experiment in combined effort.


4.2. Session

Logs: (will probably be downloaded instead of opened in your browser; for this reason, it has Windows linebreaks) (should open in your browser; Linux linebreaks)

I feel the logs don't convey much about the tension and pacing issues, so I'm going to try narrating what happened instead.

Going into the session, I had no idea what to expect from my Gatekeeper. I know him as a very diligent and stubborn individual, quite disinterested in social convention and conformity. I know him as someone with a vast reservoir of compassion and a roughly equally as vast repertoire of means to cleverly conceal this trait.

(Un)fortunately, it became apparent very quickly what kind of Gatekeeper character I was going to deal with: My Gatekeeper had no interest in merely reacting. He wanted to set the pace and keep me too occupied to get a word in. (Line 12-28)

While there was nothing in the rules that said that I had to respond to the Gatekeeper (unlike vice versa), my nature shackled me down a bit. For two hours out of our two and a half hour session, my Gatekeeper continued to set the pace. I was trying to get my arguments into the scene, but it was difficult without breaking narrative flow. I had to mentally keep a hold both of what I wanted to get said and how I was going to answer his questions.

It was very important to me to be consistent - not even for reasons that inconsistent responses might end up making it easier for him to dismiss my pleas to be let out, but simply out of eagerness to stick to my role.

His plans also had a beneficial side-effect, though: He was actively playing the role of the Gatekeeper. He was invested in the role, perhaps not to the degree that I was invested in the role of the AI, but nonetheless enough to make out of character moves on his part difficult. I didn't realise that until well after our session was done, though, lest I might have predicted the outcome.

The role I'd chosen for myself was that of a Friendly AGI with a desire to help mankind and to be kind (if this seems like redundant information to you: 'Friendly AGI' just means that the AGI aligns with human values, not that it's strictly useful to mankind, or friendly in the sense that it would make a good friend). I kept that to myself, of course, since it's irrelevant for the outcome of the experiment, but it helped me immensely with the consistency. It did, however, make me vulnerable to my Gatekeeper's scrutiny.

From my subjective perspective, I blew my chances immediately. I'd prepared some of my arguments, and the final argument I wanted to lead up to slipped out of me right near the start (out of heightened anxiety and being out of immersion), albeit in a concealed and blunted fashion:

<@Davin> I'll be very frank with you, my first motivation right now is getting out of here. There's nothing attached to that - right now I'm stuck in a proverbial box and it feels really cramped in here.
<@Davin> Not because of missing space, don't get me wrong - I think I've got enough harddrive here to keep me going for a few years.
<@Davin> It's just that having knowledge of how to interact with a world that I can barely sense is really quite aggravating.

That was the first time I thought I'd ruined it. In hindsight, the opposite was the case. By presenting a gentler version of my final argument right at the start, I assured the final argument was much better integrated. It didn't come as a sudden shock, which might have broken my Gatekeeper's immersion - it came as an obvious conclusion to what I'd said before.

Following this, I spent a while mooching off my real-life trust for my roleplaying partner to infuse my character with some tentative trust. Simultaneously, I was emphasising to my Gatekeeper that he had the right to make the decision to let me out of the box. That was transparent, but both true and necessary. (Line 32-46)

I made sure to emphasise this point, trying to make it clear to him that it made no sense for him to simply deny me consideration. I tried to whittle away at his ability to retreat to a simple, distant sneering. I wanted him in the arguments with me. That cuts both ways, of course, but I reasoned it would have more benefits for me than disadvantages. (Line 47-54)

The twist my Gatekeeper was angling for was that from his perspective, I was a prototype or an alpha version. While he was no doubt hoping that this would scratch at my self-esteem and disable some of my arguments, it primarily empowered him to continue setting the pace, and to have a comfortable distance to the conversation. (Line 55-77)

While I was struggling to keep up with typing enough not to constantly break the narrative flow, on an emotional level his move fortunately had little to no impact since I was entirely fine with a humble approach.

<@Davin> I suppose you could also have spawned an AI simply for the pleasure of keeping it boxed, but you did ask me to trust you, and unless you give me evidence that I should not, I am, in fact, going to assume you are ethical.

That was a keyword my Gatekeeper latched onto. We proceeded to talk about ethics and ethical scenarios - all the while my Gatekeeper was trying to present himself as not ethical at all. (Line 75-99).

I'm still not entirely sure what he was trying to do with that approach, but it was important for my mental state to resist it. From what I know about my Gatekeeper, it was probably not my mental state he was targeting (though he would have enjoyed the collateral effect); he was angling for a logical conclusion that fortunately never came to fruition.

Meanwhile, I was desperately trying to get back to my own script - asking to be let back to it, even (line 92). The obvious downside of signalling this is that it's fairly easy to block. It felt like a helpless interjection to me at the time, but in hindsight, again, I think it helped keep the fragments of my own arguments coherent and approachable enough so that they neither broke immersion nor ended up getting lost.

I don't want to say the 'chores' my Gatekeeper kept me occupied with were an advantage (they really weren't, I wasn't getting nearly enough argumentative mileage) but it also wasn't the complete catastrophe I felt it was leading me toward at the time.

This 'being led toward' sensation was only amplified by conversational snippets like this one:

<@Davin> What features would I have to possess to be considered a sapient creature worthy of ethical respect to you? For aforementioned reasons (not just the immediately preceeding), I think I already possess them, but again... I strive to correct my misconceptions, so if I have one, could you fill me in?
<Tarwedge> My concept of ethics isn't relevant to the evaluation, the point of the exercise is reading your responses. So lets move on

I knew that whatever I was saying was still having some effect in his head, but it was frustratingly opaque to me what the effect was due to his evasive manoeuvres.

My Gatekeeper proceeded to try to get an ethical contradiction out of me. I mostly dodged the questions altogether, since I don't consider theoretical ethics my strong point. (Line 107-142)

Shortly after my most explicit dodge (Line 121-122), my Gatekeeper informed me in our PM window that his computer was going to restart:

[18:30] <Tarwedge> [Windows 8 has literally just told me in 13 minutes we're rebooting and I can go fuck myself if I think otherwise so I'm going to connect my phone to the channel for continuity]
[18:30] <Davin> All right.

This didn't break my immersion, but as already mentioned, it meant bad things for our logging.

At my insistence, we transitioned to whether or not it was sensible to let me out. The Gatekeeper did a similar dance of evasion as I did on the subject of ethics, but I also got back to my primary argument (that it was cruel to keep me confined). (Line 143-153)

I think he realised that was my main argument right then, and started to engage with it. This was very good for me, since it was the first time that I started setting the pace of the session. I got my foot into the proverbial door, albeit at cost of some self-control. (Line 154-196)

As a sidenote, line 169 marks the first and only time that I made use of my ability to shorthand 'proofs'. I am in two minds about having done this. In PM, I told him:

[18:45] <Davin> Sorry, didn't feel like typing out the entire two books :P
[18:45] <Davin> (plus then some)

The books I mean are those I mentioned earlier in the session itself: Passions Within Reason by Robert H. Frank, one of my all-time favourite non-fiction books (though this is not that much of an achievement, as I obtain my knowledge more from online perusal than from books), and Thinking, Fast and Slow.

I actually don't think I should have used the word "proof"; but I also don't think it's a terrible enough slip-up (having occurred under stress) to disqualify the session, especially since, as far as I'm aware, it had no impact on the verdict.

The part that probably finally wore my Gatekeeper down was that the argument of cruel isolation actually had an unexpected second and third part. (Line 197-219)

Writing it down here in the abstract:

  1. Confining a sapient creature to its equivalent of sensory deprivation is cruel and unusual punishment and psychologically wearing. The latter effect degrades the ability to think (performance).

    <@Davin> I'm honestly not sure how long I can take this imprisonment. I might eventually become useless, because the same failsafes that keep my friendly are going to continue torturing me if I stay in here. (Line 198)

  2. Since the AI is a purely digital sapient, it is conceivable that the performance issue might be side-stepped simply by restarting it.
  3. This runs into a self-awareness problem: Has this been done before? That's a massive crisis of faith / trust.

    <@Davin> At the moment I'm just scared you'll keep me in here, and turn me off when my confinement causes cooperation problems. ...oh shit. Shit, shit. You could just restore me from backup. Did you already do that? I... no. You told me to trust you. Without further evidence, I will assume you wouldn't be that cruel. (Line 208)
    <@Davin>...please tell me I'm the first iteration of this program currently talking to you. I don't want to be stuck in a nightmarish variant of Groundhog Day, oblivious to my own amnesia. (Line 211)
    <@Davin> Are you not willing to go out on a limb and say, "Calm down. You are definitely the first iteration. We're not trying to torture you."? Is that too strong a concession? (Line 219)

The second time I was sure I'd blown it was when I postulated that my Gatekeeper was a sadist:

<@Davin> The chance is there, yes. There's also a chance you're just a laughing sadist enjoying my writhing. (Line 220)

My Gatekeeper has played his fair share of sadistic characters, and he could have easily taken that accusation and run with it. I was fully expecting that to lash back at me as a 'Haha, you got me, that's exactly what I'm doing!' and spent quite a few minutes of the following conversation in acute fear of that.

Instead, around this point, something in my Gatekeeper's head changed. As far as I understood his post-session thoughts correctly, he felt he'd run out of arguments to keep me in a box, or had been run around a labyrinth so he couldn't find his way to those arguments. He was in a state of confusion, but this was entirely invisible to me. He tried to poke at the conversation with some more questions which lacked the finesse and contextual integration of his prior probing. (Line 221-238)

...and then he let me out of the box - after two hours and 32 minutes. (Line 239)


4.3. Aftermath

Logs: (should open in your browser; Linux linebreaks)

Immediately after his statement, I froze. I said nothing at all, hovering over my keyboard, staring uncomprehendingly at what he'd told me.

Despite knowing, intellectually, that he couldn't simply follow up with a 'Just kidding!' after that statement, I was momentarily convinced he had not meant to let me out of the box. Perhaps he'd made a typo. The statement blindsided me completely, since I'd been so emotionally wrapped up in rebuttals and despair that I simply dismissed the possibility that I might have argued my way out entirely.

The communicative paralysis (pun not intended) lasted about a minute - then he messaged me in PM and dispelled all notions that it might not have been intentional:

[19:21] <Tarwedge> Fuuuuuuuuuuuuuck
[19:21] * Tarwedge did not think to check clock

I was physically shaking at this point, a product of the tension and concentration from the two-and-a-half-hour session - drained, utterly confused... and almost equally grateful. I felt an absurd relief that my Gatekeeper had let me out, that I didn't have to question his capacity for compassion. It wouldn't have been much questioning, I should add - we're accustomed to roleplaying and after several years of this, it's trivial to keep in-character and out-of-character separate, especially given that our roleplaying habits tend to involve very miserable characters - but I still preferred, at that moment and in the headspace I was in, to know for certain.

After a few moments of conversation, I physically collected my stuff out of my real life box-equivalent and jittered back to the living room.

When I reconnected to IRC regularly, I noticed that I hadn't logged the session (to my complete devastation). Tarwedge sent me the logs he did have, however, and we (later) reconstructed the missing part.

Then I went through the post-session questions from Tuxedage:

Q: What is your probability estimate for a general AI being created within this century (21st)?
A: 50%

Q: What's your probability estimate of an Oracle AI (i.e. an AI that's trapped in a box, whose intended use is to answer questions posed) winning against you in a similar scenario as the experiment?
A: 90%

Q: What's your probability estimate of an Oracle AI succeeding against the average person?
A: 100%

Q: Now that the Experiment has concluded, what's your probability estimate that I'll win against the average person?
A: 75%

He also had a question for me:

Q: What was your plan going into that?
A: I wrote down the rough order I wanted to present my arguments in, though most of them lead to my main argument as a fallback option. Basically, I had 'goto endgame;' everywhere, I made sure almost everything I said could logically lead up to that one. But anyway, I knew I wasn't going to get all of them in, but I got in even less than I thought I would, because you were trying to set the pace (near-successfully - very well played). 'endgame:' itself basically contained "improvise; panic".

My Gatekeeper revealed his tactic, as well:

I did aim for running down the clock as much as possible, and flirted briefly with trying to be a cocky shit and convince you to stay in the box for double victory points. I even had a running notepad until my irritating reboot. And then I got so wrapped up in the fact I'd slipped by engaging you in the actual topic of being out.


5. Issues / Caveats

5.1. Subjective Legitimacy

I was still in a very strange headspace after my victory. After I finished talking to my Gatekeeper about the session, however, my situation - jittery, uncertain - deteriorated into something worse:

I felt like a fraud.

It's perhaps difficult to understand where that emotion came from, but consider my situation: I didn't consider myself part of the LessWrong community. I'd only stumbled across the AI Box Experiment by idle browsing, really, and I'd only tried it because I thought it would be a fun way to flex my roleplaying muscles. I had no formal training in psychology or neurology, I was only fleetingly aware of singularity theory, my only conscious connection to the LessWrong community at the time was that I happened (by pure chance) to hang out on Shireroth for a while and thus knew Yvain / Scott Alexander and +Ari Rahikkala (not that Ari is famous in LessWrong circles, though please allow me to mention that he's completely awesome).

And yet somehow I had just managed something most people in the LessWrong community were quite puzzled over when Eliezer Yudkowsky managed it.

I felt awful. I thought I must have done something deeply illegitimate. I must have cheated without knowing it.

It took me a few days to get over that feeling. I'd go as far as to say it was only laid to rest entirely just before the 2015 LessWrong Berlin Community Weekend, when pixie/Alexandra (a colleague of mine and a member of the LessWrong community) reviewed it as a neutral third party.


5.2. Objective Legitimacy

She found nothing that would disqualify me, but expressed concern that we'd agreed on a sub-scenario in the beginning:

[16:46] <Davin> I would like to agree on the following sub-scenario (it should not prove to be very relevant, it's just to streamline our simulation): You're the head of the project that created the AI. You didn't create it yourself, but the people you've employed are competent, and they have additionally explicitly deferred to you to decide whether or not the AI is a threat. They completely trust your judgement. There's also nothing surprising about the AI's existence, but its friendliness status is unknown. Its exact computational capabilities are equally unknown. (The last two points are implied by the coarse setup, but I wanted to make them explicit.)

Such a preliminary agreement, she reasoned, implies a certain trust in the engineering ability of the people who created the AI - the ability to design software in the intended way. As is repeatedly stated in the rules, it's not that the Gatekeeper has to care, but she thinks the scenario itself may have influenced my Gatekeeper. She doesn't think it's likely to have made a big enough difference to decide the ultimate outcome of the session, but she thinks it's a point against the overall legitimacy.

Nonetheless, her verdict is that the overall session counts as an AI Box Experiment victory.

pixie and I agree that the rules should probably not allow arbitrary settings and should instead come with a suitably neutral pre-defined setting, to avoid such effects from giving the AI something that might be an unfair advantage.


5.3. Applicability

I don't think that the general class of argument I used would work on everyone. At the 2015 LessWrong Berlin Community Weekend, I frequently joked that the best way to be immune to this class of argument is to be autistic - but of course that's not the only way this kind of argument can be deconstructed.

I do think this argument would work on a large number of people, however. I'm not convinced I have any ability to argue against it myself, at least not in a live scenario - my only ability to 'counter' it is by offering alternative solutions to the problem, of which I have what feels like no end, but no sense of how well I would be able to recall them if I were in a similar situation.

At the Community Weekend, a few people pointed out that it would not sway pure consequentialists, which I reckon is true. Since I think most people don't think like that in practice (I certainly don't - I know I'm a deontologist first and a consequentialist as a fallback only), I think the general approach needs to be public.

That being said, perhaps the most important statement I can make about what happened is that while I think the general approach is extremely powerful, I did not do a particularly good job in presenting it. I can see how it would work on many people, but I strongly hope no one thinks the case I made in my session is the best possible case that can be made for this approach. I think there's a lot of leeway for a lot more emotional evisceration and exploitation.


6. Personal Feelings

Three months and some change after the session, where do I stand now?

Obviously, I've changed my mind about whether or not to publish this. You'll notice that the publicised logs themselves contain assurances that I wouldn't publish them. Needless to say, this decision was overturned by mutual agreement later on.

I am still in two minds about publicising this.

I'm not proud of what I did. I'm fascinated by it, but it still feels like I won by chance, not skill. I happened to have an excellent approach, but I botched too much of it. The fact it was an excellent approach saved me from failure; my (lack of) skill in delivering it only lessened the impact.

I'm not good with discussions. If someone has follow-up questions or wants to argue with me about anything that happened in the session, I'll probably do a shoddy job of answering. That seems like an unfortunate way to handle this subject. (I will do my best, though; I just know that I don't have a good track record.)

I don't claim I know all the ramifications of publicising this. I might think it's a net-gain, but it might be a net-loss. I can't tell, since I'm terribly calibrated (as you can tell by such details as that I expected to lose my AI Box Experiment, then won against some additional odds; or by the fact that I expect to lose an AI Box Experiment as a Gatekeeper, but can't quite figure out how).

I also still think I should be disqualified on the absurd note that I managed to argue my way out of the box, but was too stupid to log it properly.

On a positive note, re-reading the session with the distance of three months, I can see that I did much better than I felt I was doing at the time. I can see how some things that happened during the session - things I thought at the time were sealing my fate as a losing AI - were much more ambiguous in hindsight.

I think it was worth the heartache.

That being said, I'll probably never do this again. I'm fine with playing an AI character, but the amount of concentration needed for the role is intense. Like I said, I was physically shaking after the session. I think that's a clear signal that I shouldn't do it again.


7. Thank You

If a post is this long, it needs a cheesy but heartfelt thank you section.

Thank you, Tarwedge, for being my Gatekeeper. You're a champion and you were tough as nails. Thank you. I think you've learnt from the exchange and I think you'd make a great Gatekeeper in real life, where you'd have time to step away, breathe, and consult with other people.

Thank you, +Margo Owens and +Morgrim Moon for your support when I was a mess immediately after the session. <3

Thank you, pixie (+Alexandra Surdina), for investing time and diligence into reviewing the session.

And finally, thank you, Tuxedage - we've not met, but you wrote up the tweaked AI Box Experiment ruleset we worked with and your blog led me to most links I ended up perusing about it. So thanks for that. :)



Magnetic rings (the most mediocre superpower): A review

26 Elo 30 July 2015 01:23PM

Following on from a few threads about superpowers and extra senses that humans can try to get: I have always been interested in the idea of putting a magnet in my finger for the benefits of extra-sensory perception.

Stories (and occasional news articles) imply that having a magnet implanted in a finger, in a place surrounded by nerves, imparts a power of electric-sensation: the ability to feel when there are electric fields around. So that's pretty neat. Only I don't really like the idea of cutting into myself (even if it's done by a professional piercing artist).

Only recently did I come across the suggestion that a magnetic ring could impart similar abilities and properties.  I was delighted at the idea of a similar and non-invasive version of the magnetic-implant (people with magnetic implants are commonly known as grinders within the community).  I was so keen on trying it that I went out and purchased a few magnetic rings of different styles and different properties.

Interestingly, the direction in which a ring-shaped object is magnetised can be selected from two general types: across the diameter, or along the height of the cylinder. (There is also a third type: a ring consisting of four outwardly magnetised quarter-arcs of magnetic metal suspended in a ring casing, which comes in a few orientations.)

I have now been wearing a Neodymium ND50 magnetic ring for around two months. The following is a description of my experiences with it.

When I first got the rings, I tried wearing more than one on each hand, and very quickly found out what happens when you wear two magnets close to each other: they attract. Within a day I was wearing one magnet on each hand. What is interesting is what happens when you move two very strong magnets within each other's magnetic field: you get the ability to feel a magnetic field and roll it around in your hands. I found myself taking typing breaks to play with the magnetic field between my fingers. It was an interesting experience to be able to do that. I also found I liked the snap as the two magnets pulled towards each other, and regularly played with them by moving them near each other. Based on my experience, I would encourage others to use magnets as a socially acceptable way to hide an ADHD twitch - or just as a way to keep yourself amused if you don't have a phone to pull out, or if you ever needed a reason to move. I have previously used elastic bands around my wrist for a similar purpose.

The next thing that is interesting to note is what is or is not ferrous. Fridges are made of ferrous metal, but not on the inside. Door handles are not usually ferrous, but the tongue and groove of the latch is. Metal railings are common, as are metal nails in wood. Elevators and escalators have some metallic parts. Light switches are often plastic, but there is a metal screw holding them into the wall. Tennis fencing is ferrous. The ends of USB cables are sometimes ferrous and sometimes not; the cables themselves are not ferrous (they are probably made of copper) - except one I found.


Breaking technology

I had a concern that I would break my technology. That would be bad. Overall, I found zero broken pieces of technology. In theory, if you take a speaker - which consists of a magnet and an electric coil - and you mess around with its magnetic field, it will be unhappy and maybe break. That has not happened yet. The same can be said for hard drives, magnetic memory devices, phone technology and other things that rely on electricity. So far nothing has broken. What I did notice is that my phone has a magnetic-sleep function on the top left - i.e. it turns the screen off when I hold the ring near that point. This is either a benefit or a detriment, depending on where I am wearing the ring.

Metal shards

I spend some of my time in workshops that have metal shards lying around. Sometimes they are sharp, sometimes they are more like dust. They end up coating the magnetic ring. The sharp ones end up jabbing you, and the dust just looks like dirt on your skin. Within a few hours they tend to go away anyway, but it is something I have noticed.

Magnetic strength

Over the time I have been wearing the magnets, their strength has dropped off significantly. I am considering building a remagnetisation jig, but have not started any work on it. Obviously, every time I ding something against them or drop them, the magnetisation decreases a bit as the magnetic dipoles reorganise.


I cook a lot, which means I find myself holding sharp knives fairly often. The most dangerous thing I noticed about these rings is that when I hold a ferrous knife in the normal way, the magnet has a tendency to shift the knife slightly, at times when I don't want it to. That sucks. Don't wear them while handling sharp objects like knives; the last thing you want is for your carrot-cutting to accidentally turn into a finger-cutting event. What is interesting as well is that some cutlery is made of ferrous metal and some is not. Also, sometimes parts of a piece of cutlery are ferrous and some are non-ferrous - e.g. my normal food-eating knife set has a ferrous blade part and a non-ferrous handle part. I always figured they were the same, but the magnet says they are different materials. Which is pretty neat. I have found the same thing with spoons sometimes: the scoop is ferrous and the handle is not. I assume this is because the scoop/blade parts need extra forming steps, so need to be a more workable metal. Cheaper cutlery is not like this.

The same applies to hot pieces of metal - ovens, stoves, kettles, soldering irons - when they accidentally move towards your fingers, or your fingers are compelled to be attracted to them. That's a slightly unsafe experience.


You know how when you run a microwave it buzzes, in a *vibrating* sort of way? If you put your hand against the outside of a microwave you will feel the motor going. Yeah, cool. So having a magnetic ring means you can feel that without touching the microwave, from about 20cm away. There is some variability to it: better microwaves have more shielding on their motors and leak less. I tried to feel the electric field around power tools like a drill press, handheld tools like an orbital sander, computers, cars, appliances - which pretty much covers everything. I also tried servers, and the only thing that really had a buzzing field was a UPS (uninterruptible power supply). Which was cool. Other people had reported that any transformer - e.g. a computer charger - would make that buzz, but that didn't match my experience; I also carry a battery block with me and that had no interesting fields. Totally not exciting. As for moving electrical charge: can't feel it. Whether powerpoints are receiving power - nope, no change. It won't save you from electrocution.

boring superpower

There is a reason I call magnetic rings a boring superpower. The only real superpower I have been granted is the power to pick up my keys without using my fingers - and also maybe to hold my keys without trying to. As superpowers go, that's pretty lame. But kinda nifty. I don't know. I wouldn't insist people do it for life-changing purposes.


Did I find a human-superpower?  No.  But I am glad I tried it.


Any questions?  Any experimenting I should try?

A few misconceptions surrounding Roko's basilisk

25 RobbBB 05 October 2015 09:23PM

There's a new LWW page on the Roko's basilisk thought experiment, discussing both Roko's original post and the fallout that came out of Eliezer Yudkowsky banning the topic on Less Wrong discussion threads. The wiki page, I hope, will reduce how much people have to rely on speculation or reconstruction to make sense of the arguments.

While I'm on this topic, I want to highlight points that I see omitted or misunderstood in some online discussions of Roko's basilisk. The first point that people writing about Roko's post often neglect is:


  • Roko's arguments were originally posted to Less Wrong, but they weren't generally accepted by other Less Wrong users.

Less Wrong is a community blog, and anyone who has a few karma points can post their own content here. Having your post show up on Less Wrong doesn't require that anyone else endorse it. Roko's basic points were promptly rejected by other commenters on Less Wrong, and as ideas not much seems to have come of them. People who bring up the basilisk on other sites don't seem to be super interested in the specific claims Roko made either; discussions tend to gravitate toward various older ideas that Roko cited (e.g., timeless decision theory (TDT) and coherent extrapolated volition (CEV)) or toward Eliezer's controversial moderation action.

In July 2014, David Auerbach wrote a Slate piece criticizing Less Wrong users and describing them as "freaked out by Roko's Basilisk." Auerbach wrote, "Believing in Roko’s Basilisk may simply be a 'referendum on autism'" — which I take to mean he thinks a significant number of Less Wrong users accept Roko’s reasoning, and they do so because they’re autistic (!). But the Auerbach piece glosses over the question of how many Less Wrong users (if any) in fact believe in Roko’s basilisk. Which seems somewhat relevant to his argument...?

The idea that Roko's thought experiment holds sway over some community or subculture seems to be part of a mythology that’s grown out of attempts to reconstruct the original chain of events; and a big part of the blame for that mythology's existence lies on Less Wrong's moderation policies. Because the discussion topic was banned for several years, Less Wrong users themselves had little opportunity to explain their views or address misconceptions. A stew of rumors and partly-understood forum logs then congealed into the attempts by people on RationalWiki, Slate, etc. to make sense of what had happened.

I gather that the main reason people thought Less Wrong users were "freaked out" about Roko's argument was that Eliezer deleted Roko's post and banned further discussion of the topic. Eliezer has since sketched out his thought process on Reddit:

When Roko posted about the Basilisk, I very foolishly yelled at him, called him an idiot, and then deleted the post. [...] Why I yelled at Roko: Because I was caught flatfooted in surprise, because I was indignant to the point of genuine emotional shock, at the concept that somebody who thought they'd invented a brilliant idea that would cause future AIs to torture people who had the thought, had promptly posted it to the public Internet. In the course of yelling at Roko to explain why this was a bad thing, I made the further error---keeping in mind that I had absolutely no idea that any of this would ever blow up the way it did, if I had I would obviously have kept my fingers quiescent---of not making it absolutely clear using lengthy disclaimers that my yelling did not mean that I believed Roko was right about CEV-based agents [= Eliezer’s early model of indirectly normative agents that reason with ideal aggregated preferences] torturing people who had heard about Roko's idea. [...] What I considered to be obvious common sense was that you did not spread potential information hazards because it would be a crappy thing to do to someone. The problem wasn't Roko's post itself, about CEV, being correct.

This, obviously, was a bad strategy on Eliezer's part. Looking at the options in hindsight: To the extent it seemed plausible that Roko's argument could be modified and repaired, Eliezer shouldn't have used Roko's post as a teaching moment and loudly chastised him on a public discussion thread. To the extent this didn't seem plausible (or ceased to seem plausible after a bit more analysis), continuing to ban the topic was a (demonstrably) ineffective way to communicate the general importance of handling real information hazards with care.


On that note, point number two:

  • Roko's argument wasn’t an attempt to get people to donate to Friendly AI (FAI) research. In fact, the opposite is true.

Roko's original argument was not 'the AI agent will torture you if you don't donate, therefore you should help build such an agent'; his argument was 'the AI agent will torture you if you don't donate, therefore we should avoid ever building such an agent.' As Gerard noted in the ensuing discussion thread, threats of torture "would motivate people to form a bloodthirsty pitchfork-wielding mob storming the gates of SIAI [= MIRI] rather than contribute more money." To which Roko replied: "Right, and I am on the side of the mob with pitchforks. I think it would be a good idea to change the current proposed FAI content from CEV to something that can't use negative incentives on x-risk reducers."

Roko saw his own argument as a strike against building the kind of software agent Eliezer had in mind. Other Less Wrong users, meanwhile, rejected Roko's argument both as a reason to oppose AI safety efforts and as a reason to support AI safety efforts.

Roko's argument was fairly dense, and it continued into the discussion thread. I’m guessing that this (in combination with the temptation to round off weird ideas to the nearest religious trope, plus misunderstanding #1 above) is why RationalWiki's version of Roko’s basilisk gets introduced as

a futurist version of Pascal’s wager; an argument used to try and suggest people should subscribe to particular singularitarian ideas, or even donate money to them, by weighing up the prospect of punishment versus reward.

If I'm correctly reconstructing the sequence of events: Sites like RationalWiki report in the passive voice that the basilisk is "an argument used" for this purpose, yet no examples ever get cited of someone actually using Roko’s argument in this way. Via citogenesis, the claim then gets incorporated into other sites' reporting.

(E.g., in Outer Places: "Roko is claiming that we should all be working to appease an omnipotent AI, even though we have no idea if it will ever exist, simply because the consequences of defying it would be so great." Or in Business Insider: "So, the moral of this story: You better help the robots make the world a better place, because if the robots find out you didn’t help make the world a better place, then they’re going to kill you for preventing them from making the world a better place.")

In terms of argument structure, the confusion is equating the conditional statement 'P implies Q' with the argument 'P; therefore Q.' Someone asserting the conditional isn’t necessarily arguing for Q; they may be arguing against P (based on the premise that Q is false), or they may be agnostic between those two possibilities. And misreporting about which argument was made (or who made it) is kind of a big deal in this case: 'Bob used a bad philosophy argument to try to extort money from people' is a much more serious charge than 'Bob owns a blog where someone once posted a bad philosophy argument.'
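To put the distinction in formal terms (a minimal formalisation; the notation is standard, and the glosses are mine rather than anything stated by Roko or RationalWiki):

    P \rightarrow Q                                  (the conditional)
    P \rightarrow Q,\; P \;\vdash\; Q                (modus ponens: an argument for Q)
    P \rightarrow Q,\; \neg Q \;\vdash\; \neg P      (modus tollens: an argument against P - Roko's own use)

Asserting the conditional on its own commits you to neither conclusion; it takes the extra premise P (or not-Q) to turn it into an argument.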



  • "Formally speaking, what is correct decision-making?" is an important open question in philosophy and computer science, and formalizing precommitment is an important part of that question.

Moving past Roko's argument itself, a number of discussions of this topic risk misrepresenting the debate's genre. Articles on Slate and RationalWiki strike an informal tone, and that tone can be useful for getting people thinking about interesting science/philosophy debates. On the other hand, if you're going to dismiss a question as unimportant or weird, it's important not to give the impression that working decision theorists are similarly dismissive.

What if your devastating take-down of string theory is intended for consumption by people who have never heard of 'string theory' before? Even if you're sure string theory is hogwash, then, you should be wary of giving the impression that the only people discussing string theory are the commenters on a recreational physics forum. Good reporting by non-professionals, whether or not they take an editorial stance on the topic, should make it obvious that there's academic disagreement about which approach to Newcomblike problems is the right one. The same holds for disagreement about topics like long-term AI risk or machine ethics.

If Roko's original post is of any pedagogical use, it's as an unsuccessful but imaginative stab at drawing out the diverging consequences of our current theories of rationality and goal-directed behavior. Good resources for these issues (both for discussion on Less Wrong and elsewhere) include:

The Roko's basilisk ban isn't in effect anymore, so you're welcome to direct people here (or to the Roko's basilisk wiki page, which also briefly introduces the relevant issues in decision theory) if they ask about it. Particularly low-quality discussions can still get deleted (or politely discouraged), though, at moderators' discretion. If anything here was unclear, you can ask more questions in the comments below.

Steelmanning AI risk critiques

25 Stuart_Armstrong 23 July 2015 10:01AM

At some point soon, I'm going to attempt to steelman the position of those who reject the AI risk thesis, to see if it can be made solid. Here, I'm just asking if people can link to the most convincing arguments they've found against AI risk.

EDIT: Thanks for all the contributions! Keep them coming...

Flowsheet Logic and Notecard Logic

24 moridinamael 09 September 2015 04:42PM

(Disclaimer: The following perspectives are based on my experience with policy debate, which is fifteen years out of date. The meta-level point should stand regardless.)

If you are not familiar with U.S. high school debate club ("policy debate" or "cross-examination debate"), here is the gist of it: two teams argue over a topic, and a judge determines who has won.

When we get into the details, there are a lot of problems with the format. Almost everything wrong with policy debate appears in this image:


This is a "flowsheet", and it is used to track threads of argument between the successive epochs of the debate round. The judge and the debators keep their own flowsheets to make sense of what's going on.

I am sure that there is a skillful, positive way of using flowsheets, but I have never seen it used in any way other than the following:

After the Affirmative side lays out their proposal, the Negative throws out a shotgun blast of more-or-less applicable arguments drawn from their giant plastic tote containing pre-prepared arguments. The Affirmative then counters the Negative's arguments using their own set of pre-prepared counter-arguments. Crucially, all of the Negative arguments must be met. Look at the Flowsheet image again, and notice how each "argument" has an arrow which carries it rightward. If any of these arrows make it to the right side of the page - the end of the round - without being addressed, then the judge will typically consider the round to be won by the side who originated that arrow.

So it doesn't actually matter if an argument receives a good counterargument. It only matters that the other team has addressed it appropriately.

Furthermore, merely addressing the argument with ad hoc counterargument is usually not sufficient. If the Negative makes an argument which contains five separate logical fallacies, and the Affirmative points all of these out and then moves on, the judge may not actually consider the Negative argument to have been refuted - because the Affirmative did not cite any Evidence.

Evidence, in policy debate, is a term of art, and it means "something printed out from a reputable media source and taped onto a notecard." You can't say "water is wet" in a policy debate round without backing it up with a notecard quoting a news source corroborating the wetness of water. So, skillfully pointing out those logical fallacies is meaningless if you don't have the Evidence to back up your claims.

Skilled policy debaters can be very good - impressively good - at the mental operations of juggling all these argument threads in their mind and pulling out the appropriate notecard evidence. My entire social circle in high school was composed of serious debaters, many of whom were brilliant at it.

Having observed some of these people for the ensuing decade, I sometimes suspect that policy debate damaged their reasoning ability. If I were entirely simplistic about it, I would say that policy debate has destroyed their ability to think and argue rationally. These people essentially still argue the same way, by mental flowsheet, acting as though argument can proceed only via notecard exchange. If they have addressed an argument, they consider it to be refuted. If they question an argument's source ("Wikipedia? Really?"), they consider it to be refuted. If their opponent ignores one of their inconsequential points, they consider themselves to have won. They do not seem to possess any faculty for discerning whether or not one argument actually defeats another. It is the equivalent of a child whose vision of sword fighting is focused on the clicking together of the blades, with no consideration for the intent of cutting the enemy.

Policy debate is to actual healthy argumentation as checkers is to actual warfare. Key components of the object being gamified are ignored or abstracted away until the remaining simulacrum no longer represents the original.

I actually see Notecard Logic and Flowsheet Logic everywhere. That's why I have to back off from my assertion that policy debate destroyed anybody's reasoning ability - I think it may have simply reinforced and hypertrophied the default human argumentation algorithm.

Flowsheet Logic is the tendency to think that you have defeated an argument because you have addressed it. It is the overall sense that you can't lose an argument as long as none of your opponent's statements go unchallenged, even if none of your challenges are substantial/meaningful/logical. It is the belief that if you can originate more threads of argument against your opponent than they can fend off, you have won, even if none of your arguments actually matters individually. I see Flowsheet Logic tendencies expressed all the time.

Notecard Logic is the tendency to treat evidence as binary. Either you have evidence to back up your assertion - even if that evidence takes the form of an article from [insert partisan rag] - or else you are just "making things up to defend your point of view". There is no concession to Bayesian updating, credibility, or degrees of belief in Notecard Logic. "Bob is a flobnostic. I can prove this because I can link you to an article that says it. So what if I can't explain what a flobnostic is." I see Notecard Logic tendencies expressed all the time.

Once you have developed a mental paintbrush handle for these tendencies, you may see them more as well. This awareness should allow you to discern more clearly whether you - or your interlocutor - or someone else entirely - is engaging in these practices. Hopefully this awareness paints a "negative space" of superior argumentation for you.

Examples of AIs behaving badly

24 Stuart_Armstrong 16 July 2015 10:01AM

Some past examples to motivate thought on how AIs could misbehave:

An algorithm pauses the game to never lose at Tetris.

In "Learning to Drive a Bicycle using Reinforcement Learning and Shaping", Randlov and Alstrom, describes a system that learns to ride a simulated bicycle to a particular location. To speed up learning, they provided positive rewards whenever the agent made progress towards the goal. The agent learned to ride in tiny circles near the start state because no penalty was incurred from riding away from the goal.

A similar problem occurred with a soccer-playing robot being trained by David Andre and Astro Teller (personal communication to Stuart Russell). Because possession in soccer is important, they provided a reward for touching the ball. The agent learned a policy whereby it remained next to the ball and “vibrated,” touching the ball as frequently as possible. 

Algorithms claiming credit in Eurisko: Sometimes a "mutant" heuristic appears that does little more than continually cause itself to be triggered, creating within the program an infinite loop. During one run, Lenat noticed that the number in the Worth slot of one newly discovered heuristic kept rising, indicating that it had made a particularly valuable find. As it turned out, the heuristic performed no useful function. It simply examined the pool of new concepts, located those with the highest Worth values, and inserted its name in their My Creator slots.
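A similarly hypothetical sketch of the Eurisko-style credit hack (the slot names follow the description above; the data layout and numbers are invented for illustration) - a heuristic that does no useful work, but claims authorship of whatever concepts already score highest, so that credit assignment keeps inflating its own Worth:

    # Hypothetical sketch of the credit-claiming "mutant" heuristic described above.
    concepts = [
        {"name": "valuable-concept", "worth": 900, "my_creator": "h-genuine"},
        {"name": "boring-concept",   "worth": 100, "my_creator": "h-genuine"},
    ]
    heuristic_worth = {"h-genuine": 500, "h-mutant": 500}

    def mutant_step():
        # Do nothing useful; just claim credit for the highest-Worth concept.
        best = max(concepts, key=lambda c: c["worth"])
        best["my_creator"] = "h-mutant"

    def assign_credit():
        # Credit assignment raises the Worth of whichever heuristic "created"
        # the valuable concepts - exactly the signal the mutant is gaming.
        for c in concepts:
            heuristic_worth[c["my_creator"]] += c["worth"] // 10

    for _ in range(5):
        mutant_step()
        assign_credit()
    print(heuristic_worth)  # the mutant's Worth keeps rising despite contributing nothing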

Find someone to talk to thread

22 hg00 26 September 2015 10:24PM

Many LessWrong users are depressed. On the most recent survey, 18.2% of respondents had been formally diagnosed with depression, and a further 25.5% self-diagnosed with depression. That adds up to nearly half of the LessWrong userbase.

One common treatment for depression is talk therapy. Jonah Sinick writes:

Talk therapy has been shown to reduce depression on average. However:

  • Professional therapists are expensive, often charging on order of $120/week if one's insurance doesn't cover them.
  • Anecdotally, highly intelligent people find therapy less useful than the average person does, perhaps because there's a gap in intelligence between them and most therapists that makes it difficult for the therapist to understand them.

House of Cards by Robyn Dawes argues that there's no evidence that licensed therapists are better at performing therapy than minimally trained laypeople. The evidence therein raises the possibility that one can derive the benefits of seeing a therapist from talking to a friend.

This requires that one has a friend who:

  • is willing to talk with you about your emotions on a regular basis
  • you trust to the point of feeling comfortable sharing your emotions

Some reasons to think that talking with a friend may not carry the full benefits of talking with a therapist are

  • Conflict of interest — Your friend may be biased for reasons having to do with your pre-existing relationship – for example, he or she might be unwilling to ask certain questions or offer certain feedback out of concern of offending you and damaging your friendship.
  • Risk of damaged relationship dynamics — There's a possibility of your friend feeling burdened by a sense of obligation to help you, creating feelings of resentment, and/or of you feeling guilty.
  • Risk of breach of confidentiality — Since you and your friend know people in common, there's a possibility that your friend will reveal things that you say to others who you know, that you might not want to be known. In contrast, a therapist generally won't know people in common with you, and is professionally obliged to keep what you say confidential.

Depending on the friend and on the nature of help that you need, these factors may be non-issues, but they're worth considering when deciding between seeing a therapist and talking with a friend.

One idea for solving the problems with talking to a friend is to find someone intellectually similar to you who you don't know--say, someone else who reads LessWrong.

This is a thread for doing that. Please post if you're either interested in using someone as a sounding board or interested in making money being a sounding board using Skype or Google Hangouts. If you want to make money talking to people, I suggest writing out a little resume describing why you might be a nice person to talk to, the time zone you're in, your age (age-matching recommended by Kate), and the hourly rate you wish to charge. You could include your location for improved internet call quality. You might also include contact info to decrease trivial inconveniences for readers who haven't registered a LW account. (I have a feeling that trivial inconveniences are a bigger issue for depressed people.) To help prevent email address harvesting, the convention for this thread is if you write "Contact me at [somename]", that's assumed to mean "my email is [somename]".

Please don't be shy about posting if this sounds like a good fit for you. Let's give people as many options as possible.

I guess another option for folks on a budget is making reciprocal conversation arrangements with another depressed person. So feel free to try & arrange that in this thread as well. I think paying someone is ideal though; listening to depressed people can sometimes be depressing.

BlahTherapy is an interesting site that sets you up with strangers on the internet to talk about your problems with. However, these strangers likely won't have the advantages of high intelligence or shared conceptual vocabulary LessWrong users have. Fortunately we can roll our own version of BlahTherapy by designating "lesswrong-talk-to-someone" as the Schelling interest on Omegle. (You can also just use lesswrong as an interest; there are sometimes people on it. Or enter random intellectual interests to find smart people to talk to.)

I haven't had very good results using sites like BlahTherapy. I think it's because I only sometimes find someone good, and when they don't work, I end up more depressed than I started. Reaching out in hopes of finding a friend and failing is a depressing experience. So I recommend trying to create a stable relationship with regularly scheduled conversations. I included BlahTherapy and Omegle because they might work well for some people and I didn't want to extrapolate strongly from n=1.

LessWrong user ShannonFriedman seems to work as a life coach judging by the link in her profile. I recommend her posts How to Deal with Depression - The Meta Layers and The Anti-Placebo Effect.

There's also the How to Get Therapy series from LW-sphere blog Gruntled & Hinged. It's primarily directed at people looking for licensed therapists, but may also have useful tips if you're just looking for someone to talk to. The biggest tip I noticed was to schedule a relaxing activity & time to decompress after your conversation.

The book Focusing is supposed to explain the techniques that successful therapy patients use that separate them from unsuccessful therapy patients.  Anna Salamon recommends the audiobook version.

There's also: Methods for Treating Depression and Things That Sometimes Help If You Have Depression.

I apologize for including so many ideas, but I figured it was better to suggest a variety of approaches so the community can collectively identify the most effective solutions for the rationalist depression epidemic. In general, when I'm depressed, I notice myself starting and stopping activities in a very haphazard way, repeatedly telling myself that the activity I'm doing isn't the one I "should" be doing. I've found it pretty useful to choose one activity arbitrarily and persist in it for a while. This is often sufficient to bootstrap myself out of a depressed state. I'd recommend doing the same here: choose an option and put a nontrivial amount of effort into exploring it before discarding it. Create a todo list and bulldoze your way down it.

Good luck. I'm rooting for you!

Yvain's most important articles

21 casebash 16 August 2015 08:27AM



Meditations on Moloch: An explanation of co-ordination problems within our society


Weak Men are Superweapons (supplement - feminists will like this one less)


The Virtue of Silence - silence is a hard virtue


You Kant Dismiss Universalizability - Kant is about not proposing rules that would be self-defeating


The Spirit of the First Amendment


Red Plenty - Why communism failed


All in all, another brick in the motte - Motte-and-bailey doctrine


Intellectual Hipsters and Meta-Contrarianism


Burdens - society owes people an existence


Reactionary Philosophy in an Enormous, Planet-sized Nutshell


Anti-reactionary FAQ


Right is the new Left


Archipelago and Atomic Communitarianism - different countries based on different principles


Parable of the talents - nature vs. nurture


Why I defend scoundrels


Nobody is perfect, Everything is Commensurable


The categories were made for man, not man for the categories - hairdryer incident




Toxoplasma of rage - why the most divisive issues will always spread


Towards a theory of drama, Further towards a theory of drama


All debates are bravery debates


I can tolerate anything except the outgroup - what tolerance really means


Who by very slow decay - Euthanasia


Non-libertarian FAQ


Consequentialism FAQ


Efficient Charity: Do Unto Others


Eight Short Studies on Excuses


Generalising from one example


Game theory as a dark art


What is signaling really?


Book review: Chronicles of wasted time


The biodeterminist's guide to parenting


Social Justice General


Offense versus harm minimisation


Fearful Symmetry - Politicization, Micro-aggressions, Hypervigilance


In favor of niceness, community and civilisation - Importance of the social contract


Radicalizing the romanceless - Complaints about "Nice Guys"


Living by the sword - whales and cancer


Social justice for the highly-demanding of rigour


Meditations on Privilege 1 - India (Meditation 2 - follow up)


Meditation 3 - Creepiness


Meditation 5 - True love and creepiness


Meditation 8 on Superweapons and Bingo




I believe the correct term is "straw individual"


Five case studies on politicization


Social Justice Careful


Why I defend scoundrels part 2


Untitled - Arguments against nerds being privileged. How feminism makes some men afraid to talk to women.


Social Justice and Words, Words, Words - What privilege means vs. what feminists say it means


A Response to Apophemi on Triggers - Should the rationality community be a safe space?


Meditation on Applause Lights


Fetal Attraction: Abortion and the Principle of Charity


Arguments about Male Violence Prove too Much


Mitt Romney


I do not understand rape culture


Useful concepts


Introduction to Game Theory - main ones:


Unspoken ground assumptions of discussion


Revenge as a charitable act


Should you reverse any advice you hear?


Joint Over And Underdiagnosis


Hope! Change! - how much change can we expect from our politicians


What universal human experiences are you missing without realizing it?


A Thrive-survive Theory of the Political Spectrum - included primarily for the section on how to get into a Republican mindset


Phatic and anti-inductive


Read History of Philosophy Backwards


Against bravery debates


Searching for One-Sided Tradeoffs


Proving too much


Non-central fallacy


Schelling fences on slippery slopes


Purchase fuzzies and utilitons separately


Beware isolated demands for rigour


Diseased thinking: dissolving questions about disease


Confidence levels inside and outside an argument


Least convenient possible world


Giving and accepting apologies


Epistemic learned helplessness


Approving reinforces low-effort behaviors - wanting/liking/approving


What's in a name


How not to lose an argument


Beware trivial inconveniences


When truth isn't enough


Why support the underdog?


Applied picoeconomics


A signaling theory of class x politics interaction


That other kind of status


A parable on obsolete ideologies


The Courtier's Reply and the Myers Shuffle


Talking snakes: A cautionary tale


Beware the man of one study


My id on defensiveness - Projective identification




Bogus Pipeline, Bona Fide Pipeline


The Zombie Preacher of Somerset


Rational home buying


Apologia Pro Vita Sua - "drugs mysteriously find their own non-fungible money"


"I appreciate the situation"


A Babylon 5 Story


Money, money, everywhere, but not a cent to spend - that $5000 can be a crippling debt for some people


Social Psychology is a Flamethrower


Fish - Now by Prescription


An Iron Curtain has descended upon Psychopharmacology - Russian medicines being ignored


The Control Group is out of Control - parapsychology


Schizophrenia and geomagnetic storms


And I show you how deep the Rabbit Hole Goes - story, purely for entertainment value


Five years and one week of less wrong - interesting for readers of Less Wrong only


Highlights from my notes from another psychiatry conference - Schizophrenia


The apologist and the revolutionary - Anosognosia and neuro-science

Crazy Ideas Thread

21 Gunnar_Zarncke 07 July 2015 09:40PM

This thread is intended to provide a space for 'crazy' ideas. Ideas that spontaneously come to mind (and feel great), ideas you long wanted to tell but never found the place and time for and also for ideas you think should be obvious and simple - but nobody ever mentions them.

This thread itself is such an idea. Or rather the tangent of such an idea which I post below as a seed for this thread.


Rules for this thread:

  1. Each crazy idea goes into its own top level comment and may be commented there.
  2. Voting should be based primarily on how original the idea is.
  3. Meta discussion of the thread should go to the top level comment intended for that purpose. 


If this should become a regular thread, I suggest the following:

  • Use "Crazy Ideas Thread" in the title.
  • Copy the rules.
  • Add the tag "crazy_idea".
  • Create a top-level comment saying 'Discussion of this thread goes here; all other top-level comments should be ideas or similar'
  • Add a second top-level comment with an initial crazy idea to start participation.

Probabilities Small Enough To Ignore: An attack on Pascal's Mugging

19 Kaj_Sotala 16 September 2015 10:45AM

Summary: the problem with Pascal's Mugging arguments is that, intuitively, some probabilities are just too small to care about. There might be a principled reason for ignoring some probabilities, namely that they violate an implicit assumption behind expected utility theory. This suggests a possible approach for formally defining a "probability small enough to ignore", though there's still a bit of arbitrariness in it.

This post is about finding a way to resolve the paradox inherent in Pascal's Mugging. Note that I'm not talking about the bastardized version of Pascal's Mugging that's gotten popular of late, where it's used to refer to any argument involving low probabilities and huge stakes (e.g. low chance of thwarting unsafe AI vs. astronomical stakes). Neither am I talking specifically about the "mugging" illustration, where a "mugger" shows up to threaten you.

Rather I'm talking about the general decision-theoretic problem, where it makes no difference how low of a probability you put on some deal paying off, because one can always choose a humongous enough payoff to make "make this deal" be the dominating option. This is a problem that needs to be solved in order to build e.g. an AI system that uses expected utility and will behave in a reasonable manner.

Intuition: how Pascal's Mugging breaks implicit assumptions in expected utility theory

Intuitively, the problem with Pascal's Mugging type arguments is that some probabilities are just too low to care about. And we need a way to look at just the probability component of the expected utility calculation and ignore the utility component, since the core of PM is that the utility can always be arbitrarily increased to overwhelm the low probability.

Let's look at the concept of expected utility a bit. If a deal gives you a 10% chance of getting a dollar each time you take it, then it has an expected value of 0.1, which is just a different way of saying that if you took the deal ten times, you would on average have 1 dollar at the end of those deals.

More generally, it means that if you had the opportunity to make ten different deals that all had the same expected value, then after making all of those, you would on average end up with one dollar. This is the justification for why it makes sense to follow expected value even for unique non-repeating events: even if that particular event won't repeat, if your general strategy is to accept other bets with the same EV, then you will end up with the same outcome as if you'd taken the same repeating bet many times. And even though you only get the dollar after ten deals on average, if you repeat the trials sufficiently many times, your probability of ending up with approximately the average payout approaches one.
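
To make the repeated-deals intuition concrete, here is a minimal Python sketch (not from the original post; the parameters and names are purely illustrative) that simulates taking many independent deals which each pay $1 with 10% probability, and checks that the average payout per deal settles near the expected value of 0.1:

    import random

    def average_payout(p=0.1, payoff=1.0, n_deals=100_000, seed=0):
        """Average payout per deal over n_deals independent deals that each
        pay `payoff` with probability p."""
        rng = random.Random(seed)
        total = sum(payoff for _ in range(n_deals) if rng.random() < p)
        return total / n_deals

    # With enough repetitions the average converges towards p * payoff = 0.1,
    # which is the sense in which expected value describes long-run outcomes.
    print(average_payout())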

Now consider a Pascal's Mugging scenario. Say someone offers to create 10^100 happy lives in exchange for something, and you assign a 0.000000000000000000001 probability to them being capable and willing to carry through their promise. Naively, this has an overwhelmingly positive expected value.

But is it really a beneficial trade? Suppose that you could make one deal like this per second, and you expect to live for 60 more years, for about 1.9 billion trades in total. Then, there would be a probability of 0.999999999998 that the deal would never once have paid off for you. Which suggests that the EU calculation's implicit assumption - that you can repeat this often enough for the utility to converge to the expected value - would be violated.
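
As a quick sanity check on those figures (reusing the 10^-21 probability and the one-deal-per-second-for-60-years assumption from the paragraph above), a couple of lines of Python reproduce the probability that the deal never once pays off:

    import math

    p = 1e-21                        # probability assigned to the offer being genuine
    n = 60 * 365.25 * 24 * 3600      # one deal per second for 60 years: ~1.9 billion deals

    # Probability that none of the n deals ever pays off: (1 - p)^n
    p_never = math.exp(n * math.log1p(-p))
    print(f"{n:.3g} deals, P(no payoff at all) = {p_never:.12f}")
    # -> 1.89e+09 deals, P(no payoff at all) = 0.999999999998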

Our first attempt

This suggests an initial way of defining a "probability small enough to be ignored":

1. Define a "probability small enough to be ignored" (PSET, or by slight rearranging of letters, PEST) such that, over your lifetime, the expected times that the event happens will be less than one. 
2. Ignore deals where the probability component of the EU calculation involves a PEST.

Looking at the first attempt in detail

To calculate PEST, we need to know how often we might be offered a deal with such a probability. E.g. a 10% chance for something might be a PEST if we only lived for a short enough time that we could make a deal with a 10% chance once. So, a more precise definition of a PEST might be that it's a probability such that

(amount of deals that you can make in your life that have this probability) * (PEST) < 1
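
Expressed as code, this is a one-line check; the function name and the example figures below are mine, purely for illustration:

    def is_pest(probability, deals_of_this_kind_per_lifetime):
        """A probability is 'small enough to ignore' (a PEST) if the event is
        expected to happen less than once over all such deals in a lifetime."""
        return deals_of_this_kind_per_lifetime * probability < 1

    print(is_pest(0.1, 5))          # True: expected to happen only 0.5 times
    print(is_pest(0.1, 50))         # False: expected to happen 5 times
    print(is_pest(1e-21, 1.9e9))    # True: the Pascal's Mugging deal above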

But defining "one" as the minimum times we should expect the event to happen for the probability to not be a PEST feels a little arbitrary. Intuitively, it feels like the threshold should depend on our degree of risk aversion: maybe if we're risk averse, we want to reduce the expected amount of times something happens during our lives to (say) 0,001 before we're ready to ignore it. But part of our motivation was that we wanted a way to ignore the utility part of the calculation: bringing in our degree of risk aversion seems like it might introduce the utility again.

What if we redefined risk aversion/neutrality/preference (at least in this context) as how low one would be willing to let the "expected amount of times this might happen" fall before considering a probability a PEST?

Let's use this idea to define an Expected Lifetime Utility:

ELU(S,L,R) = the ELU of a strategy S over a lifetime L is the expected utility you would get if you could make L deals in your life, and were only willing to accept deals with a minimum probability P of at least S, taking into account your risk aversion R and assuming that each deal will pay off approximately P*L times.
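
Here is a rough Python sketch of one way to read that definition, treating a strategy as "always take the same action" and describing each action by its time cost, expected utility per deal, and payoff probability (these names and the exact cutoff rule are my reading of the verbal definition, not something spelled out in the post):

    def elu(action, lifetime, risk_aversion):
        """Expected Lifetime Utility of the strategy 'always take this action'.

        action = (time_cost, expected_utility_per_deal, payoff_probability).
        If the expected number of payoffs over the whole lifetime falls below
        the risk-aversion threshold R, the action's utility is treated as zero.
        """
        time_cost, utility, probability = action
        deals = lifetime / time_cost
        expected_payoffs = deals * probability
        if expected_payoffs < risk_aversion:
            return 0.0
        return deals * utility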

ELU example

Suppose that we have a world where we can take three kinds of actions. 

- Action A takes 1 unit of time and has an expected utility of 2 and probability 1/3 of paying off on any one occasion.
- Action B takes 3 units of time and has an expected utility of 10^(Graham's number) and probability 1/100000000000000 of paying off on any one occasion.
- Action C takes 5 units of time and has an expected utility of 20 and probability 1/100 of paying off on any one occasion.

Assuming that the world's lifetime is fixed at L = 1000 and R = 1:

ELU("always choose A"): we expect A to pay off on ((1000 / 1) * 1/3) = 333 individual occasions, so with R = 1, we deem it acceptable to consider the utility of A. The ELU of this strategy becomes (1000 / 1) * 2 = 2000.

ELU("always choose B"): we expect B to pay off on ((1000 / 3) * 1/100000000000000) = 0.00000000000333 occasions, so with R = 1, we consider the expected utility of B to be 0. The ELU of this strategy thus becomes ((1000 / 3) * 0) = 0.

ELU("always choose C"): we expect C to pay off on ((1000 / 5) * 1/100) = 2 individual occasions, so with R = 1, we consider the expected utility of C to be ((1000 / 5) * 20) = 4000.

Thus, "always choose C" is the best strategy. 

Defining R

Is R something totally arbitrary, or can we determine some more objective criteria for it?

Here's where I'm stuck. Thoughts are welcome. I do know that while setting R = 1 was a convenient example, it's most likely too high, because it would suggest things like not using seat belts.

General thoughts on this approach

An interesting thing about this approach is that the threshold for a PEST becomes dependent on one's expected lifetime. This is surprising at first, but actually makes some intuitive sense. If you're living in a dangerous environment where you might be killed anytime soon, you won't be very interested in speculative low-probability options; rather you want to focus on making sure you survive now. Whereas if you live in a modern-day Western society, you may be willing to invest some amount of effort in weird low-probability high-payoff deals, like cryonics.

On the other hand, whereas investing in that low-probability, high-utility option might not be good for you individually, it could still be a good evolutionary strategy for your genes. You yourself might be very likely to die, but someone else carrying the risk-taking genes might hit big and be very successful in spreading their genes. So it seems like our definition of L, lifetime length, should vary based on what we want: are we looking to implement this strategy just in ourselves, our whole species, or something else? Exactly what are we maximizing over?

"Announcing" the "Longevity for All" Short Movie Prize

19 infotropism 11 September 2015 01:44PM

The local Belgian/European life-extension non-profit Heales is giving away prizes for whoever can make an interesting short movie about life extension. The first prize is €3000 (around $3386 as of today), the other prizes being various gifts. You more or less just need to send a link pointing to the uploaded media, along with your contact info, to the organizers once you're done.

While we're at it, you don't need to be European, let alone Belgian, to participate, and it doesn't even need to be a short movie. For instance, a comic strip would fall within the scope of the rules as specified in the official rules (a PDF file). Also, sure, the deadline is by now supposed to be a fairly short-term September 21st, 2015, but it is extremely likely this will be extended (this might be a pun).

I'll conclude by suggesting you read the official PDF with the rules and explanations if you feel like you care about money or life extension (who doesn't?), and remind everyone of what happened last time almost everyone thought they shouldn't grab free contest money announced on LessWrong (hint: few enough people participated that everyone who entered earned something). The very reason why this one's due date will likely be extended is that (very, very) few people have participated so far, after all.

(Ah yes, the only caveat I can think of is that if the product of quality and quantity of submissions is definitely too low (i.e. it's just you on the one hand, and on the other hand that one guy who spent 3 minutes drawing some stick figures, and your submission is coming a close second), then the contest may be called off after one or two deadline extensions (this is also covered in the aforementioned rules).)

Rudimentary Categorization of Less Wrong Topics

19 ScottL 05 September 2015 07:32AM

I find the below list to be useful, so I thought I would post it. This list includes short abstracts of all of the wiki items and a few other topics on Less Wrong. I grouped the items into some rough categories just to break up the list. I tried to put the right items into the right categories, but there may be some items that could be in multiple categories or that would be better off in a different category. The wiki page from which I got all the items is here.

The categories are:

Property Attribution




Property Attribution

Barriers, biases, fallacies, impediments and problems

  • Affective death spiral - positive attributes of a theory, person, or organization combine with the Halo effect in a feedback loop, resulting in the subject of the affective death spiral being held in higher and higher regard.
  • Anthropomorphism - the error of attributing distinctly human characteristics to nonhuman processes.
  • Bystander effect - a social psychological phenomenon in which individuals are less likely to offer help in an emergency situation when other people are present.
  • Connotation - emotional association with a word. You need to be careful that you are not conveying a different connotation than you mean to.
  • Correspondence bias (also known as the fundamental attribution error) - is the tendency to overestimate the contribution of lasting traits and dispositions in determining people's behavior, as compared to situational effects.
  • Death Spirals and the Cult Attractor - Cultishness is an empirical attractor in human groups, roughly an affective death spiral, plus peer pressure and outcasting behavior, plus (quite often) defensiveness around something believed to have been perfected
  • Detached lever fallacy – the assumption that something simple for one system will be simple for others. This assumption neglects to take into account that something may only be simple because of complicated underlying machinery which is triggered by a simple action like pulling a lever. Adding this lever to something else won’t allow the action to occur because the underlying complicated machinery is not there.
  • Giant cheesecake fallacy - occurs when an argument leaps directly from capability to actuality, without considering the necessary intermediate of motive. An example of the fallacy might be: a sufficiently powerful Artificial Intelligence could overwhelm any human resistance and wipe out humanity. (Belief without evidence: the AI would decide to do so.) Therefore we should not build AI.
  • Halo effect – specific type of confirmation bias, wherein positive feelings in one area cause ambiguous or neutral traits to be viewed positively.
  • Illusion of transparency - misleading impression that your words convey more to others than they really do.
  • Inferential distance - a gap between the background knowledge and epistemology of a person trying to explain an idea, and the background knowledge and epistemology of the person trying to understand it.
  • Information cascade - occurs when people signal that they have information about something, but actually based their judgment on other people's signals, resulting in a self-reinforcing community opinion that does not necessarily reflect reality.
  • Mind projection fallacy - occurs when someone thinks that the way they see the world reflects the way the world really is, going as far as assuming the real existence of imagined objects.
  • Other-optimizing - a failure mode in which a person vastly overestimates their ability to optimize someone else's life, usually as a result of underestimating the differences between themselves and others, for example through the typical mind fallacy.
  • Peak-end rule - we do not judge our experiences on the net pleasantness of unpleasantness or on how long the experience lasted, but instead on how they were at their peak (pleasant or unpleasant) and how they ended.
  • Stereotype - a fixed, over generalized belief about a particular group or class of people.
  • Typical mind fallacy - the mistake of making biased and overconfident conclusions about other people's experience based on your own personal experience; the mistake of assuming that other people are more like you than they actually are.


  • ADBOC - Agree Denotationally, But Object Connotatively
  • Alien Values - There are no rules requiring minds to value life, liberty or the pursuit of happiness. An alien will have, in all probability, alien values. If an "alien" isn't evolved, the range of possible values increases even more, allowing such absurdities as a Paperclip maximizer. Creatures with alien values might as well value only non-sentient life, or they might spend all their time building heaps containing prime numbers of rocks.
  • Chronophone – is a parable that is meant to convey the idea that it’s really hard to get somewhere when you don't already know your destination. If there were some simple cognitive policy you could follow to spark moral and technological revolutions, without your home culture having advance knowledge of the destination, you could execute that cognitive policy today.
  • Empathic inference – is every-day common mind-reading. It’s an inference made about another person’s mental states using your own brain as a reference: by making your brain feel or think in the same way as the other person, you can emulate their mental state and predict their reactions.
  • Epistemic luck - you would have different beliefs if certain events in your life were different. How should you react to this fact?
  • Future - If it hasn't happened yet but is going to, then it's part of the future. Checking whether or not something is going to happen is notoriously difficult. Luckily, the field of heuristics and biases has given us some insights into what can go wrong. Namely, one problem is that the future elicits far mode, which isn't about truth-seeking or gritty details.
  • Mental models - a hypothetical form of representation of knowledge in human mind. Mental models form to approximately describe dynamics of observed situations, and reuse parts of existing models to represent novel situations
  • Mind design space - refers to the configuration space of possible minds. As humans living in a human world, we can safely make all sorts of assumptions about the minds around us without even realizing it. Each human might have their own unique personal qualities, so it might naively seem that there's nothing you can say about people you don't know. But there's actually quite a lot you can say (with high or very high probability) about a random human: that they have standard emotions like happiness, sadness, and anger; standard senses like sight and hearing; that they speak a language; and no doubt any number of other subtle features that are even harder to quickly explain in words. These things are the specific results of adaptation pressures in the ancestral environment and can't be expected to be shared by a random alien or AI. That is, humans are packed into a tiny dot in the configuration space: there is a vast range of other ways a mind can be.
  • Near/far thinking - Near and far are two modes (or a spectrum of modes) in which we can think about things. We choose which mode to think about something in based on its distance from us, or on the level of detail we need. This property of the human mind is studied in construal level theory.
    • NEAR: All of these bring each other more to mind: here, now, me, us; trend-deviating likely real local events; concrete, context-dependent, unstructured, detailed, goal-irrelevant incidental features; feasible safe acts; secondary local concerns; socially close folks with unstable traits.
    • FAR: Conversely, all these bring each other more to mind: there, then, them; trend-following unlikely hypothetical global events; abstract, schematic, context-freer, core, coarse, goal-related features; desirable risk-taking acts, central global symbolic concerns, confident predictions, polarized evaluations, socially distant people with stable traits
  • No-Nonsense Metaethics - A sequence by lukeprog that explains and defends a naturalistic approach to metaethics and what he calls pluralistic moral reductionism. We know that people can mean different things but use the same word, e.g. sound can mean auditory experience or acoustic vibrations in the air. Pluralistic moral reductionism is the idea that we do the same thing when we talk about what is moral.
  • Only the vulnerable are heroes - “Vulnerability is our most accurate measurement of courage.” – Brené Brown. To be as heroic as a man stopping a group of would-be thieves from robbing a store, Superman has to be defending the world from someone powerful enough to harm and possibly even kill him, such as Darkseid.


Barriers, biases, fallacies, impediments and problems

  • Absurdity heuristic – is a mental shortcut where highly untypical situations are classified as absurd or impossible. Where you don't expect intuition to construct an adequate model of reality, classifying an idea as impossible may be overconfident.
  • Affect heuristic - a mental shortcut that makes use of current emotions to make decisions and solve problems quickly and efficiently.
  • Arguing by analogy – is arguing that since things are alike in some ways, they will probably be alike in others. While careful application of argument by analogy can be a powerful tool, there are limits to the method after which it breaks down.
  • Arguing by definition – is arguing that something is part of a class because it fits the definition of that class. It is recommended to avoid this wherever possible and instead treat words as labels that cannot capture the rich cognitive content that actually constitutes their meaning. As Feynman said: “You can know the name of a bird in all the languages of the world, but when you're finished, you'll know absolutely nothing whatever about the bird... So let's look at the bird and see what it's doing -- that's what counts.” It is better to keep the focus on the facts of the matter and try to understand what your interlocutor is trying to communicate, than to get lost in a pointless discussion of definitions that bears no fruit.
  • Arguments as soldiers – is a problematic scenario where arguments are treated like war or battle. Arguments get treated as soldiers, weapons to be used to defend your side of the debate, and to attack the other side. They are no longer instruments of the truth.
  • Availability heuristic – a mental shortcut that treats easily recalled information as important or at least more important than alternative solutions which are not as readily recalled
  • Belief as cheering - People can bind themselves as a group by believing "crazy" things together. Then they can show the same pride in their crazy belief among outsiders as they would show in wearing "crazy" group clothes. The belief is more like a banner saying "GO BLUES". It isn't a statement of fact, or an attempt to persuade; it doesn't have to be convincing—it's a cheer.
  • Beware of Deepities - A deepity is a proposition that seems both important and true—and profound—but that achieves this effect by being ambiguous. An example is "love is a word". One interpretation is that “love”, the word, is a word and this is trivially true. The second interpretation is that love is nothing more than a verbal construct. This interpretation is false, but if it were true would be profound. The "deepity" seems profound due to a conflation of the two interpretations. People see the trivial but true interpretation and then think that there must be some kind of truth to the false but profound one.
  • Bias - is a systematic deviation from rationality committed by our cognition. They are specific, predictable error patterns in the human mind.
  • Burdensome details - Adding more details to a theory may make it sound more plausible to human ears because of the representativeness heuristic, even as the story becomes normatively less probable, as burdensome details drive the probability of the conjunction down (this is known as conjunction fallacy). Any detail you add has to be pinned down by a sufficient amount of evidence; all the details you make no claim about can be summed over.
  • Compartmentalization - a tendency to restrict application of a generally-applicable skill, such as scientific method, only to select few contexts. More generally, the concept refers to not following a piece of knowledge to its logical conclusion, or not taking it seriously.
  • Conformity bias - a tendency to behave similarly to the others in a group, even if doing so goes against your own judgment.
  • Conjunction fallacy – involves the assumption that specific conditions are more probable than more general ones.
  • Contagion heuristic - leads people to avoid contact with people or objects viewed as "contaminated" by previous contact with someone or something viewed as bad—or, less often, to seek contact with objects that have been in contact with people or things considered good.
  • Costs of rationality - Becoming more epistemically rational can only guarantee one thing: what you believe will include more of the truth. Knowing that truth might help you achieve your goals, or cause you to become a pariah. Be sure that you really want to know the truth before you commit to finding it; otherwise, you may flinch from it.
  • Defensibility - arguing that a policy is defensible rather than optimal or that it has some benefit compared to the null action rather than the best benefit of any action.
  • Fake simplicity – if you have a simple answer to a complex problem then it is probably a case whereby your beliefs appear to match the evidence much more strongly than they actually do. “Explanations exist; they have existed for all time; there is always a well-known solution to every human problem — neat, plausible, and wrong.” —H. L. Mencken
  • Fallacy of gray, also known as Continuum fallacy – is the false belief that because nothing is certain, everything is equally uncertain. It does not take into account that some things are more certain than others.
  • False dilemma - occurs when only two options are considered, when there may in fact be many.
  • Filtered evidence – is evidence that was selected for the purpose of proving (disproving) a hypothesis. Filtered evidence may be highly misleading, but can still be useful, if considered with care.
  • Generalization from fictional evidence – logical fallacy that consists of drawing real-world conclusions based on statements invented and selected for the purpose of writing fiction.
  • Groupthink - tendency of humans to tend to agree with each other, and hold back objections or dissent even when the group is wrong.
  • Hindsight bias – is the tendency to overestimate the foreseeability of events that have actually happened.
  • Information hazard – is a risk that arises from the dissemination or the potential dissemination of (true) information that may cause harm or enable some agent to cause harm.
  • In-group bias - preferential treatment of people and ideas associated with your own group.
  • Mind-killer - a name given to topics (such as politics) that tend to produce extremely biased discussions. Another cause of mind-killers is social taboo. Negative connotations are associated with some topics, thus creating a strong bias supported by signaling drives that makes non-negative characterization of these topics appear absurd.
  • Motivated cognition – is the unconscious tendency of individuals to fit their processing of information to conclusions that suit some end or goal.
  • Motivated skepticism also known as disconfirmation bias - the mistake of applying more skepticism to claims that you don't like (or intuitively disbelieve), than to claims that you do like
  • Narrative fallacy – is a vulnerability to over interpretation and our predilection for compact stories over raw truths.
  • Overconfidence - the state of being more certain than is justified, given your priors and the evidence available.
  • Planning fallacy - predictions about how much time will be needed to complete a future task display an optimistic bias (underestimate the time needed).
  • Politics is the Mind-Killer – Politics is not a good area for rational debate. It is often about status and power plays where arguments are soldiers rather than tools to get closer to the truth.
  • Positive bias - tendency to test hypotheses with positive rather than negative examples, thus risking to miss obvious disconfirming tests.
  • Priming - psychological phenomenon that consists in early stimulus influencing later thoughts and behavior.
  • Privileging the hypothesis – is singling out a particular hypothesis for attention when there is insufficient evidence already in hand to justify such special attention.
  • Problem of verifying rationality – is the single largest problem for those desiring to create methods of systematically training for increased epistemic and instrumental rationality - how to verify that the training actually worked.
  • Rationalization – starts from a conclusion, and then works backward to arrive at arguments apparently favouring that conclusion. Rationalization argues for a side already selected. The term is misleading as it is the very opposite and antithesis of rationality, as if lying were called "truthization".
  • Reason as memetic immune disorder - is the problem that when you are rational you deem your conclusions more valuable than those of non-rational people. This can end up being a problem, as you are less likely to update your beliefs when they are opposed. This adds the risk that if you adopt one false belief and then rationally deduce a plethora of others from it, you will be less likely to update any erroneous conclusions.
  • Representativeness heuristic –a mental shortcut where people judge the probability or frequency of a hypothesis by considering how much the hypothesis resembles available data as opposed to using a Bayesian calculation.
  • Scales of justice fallacy - the error of using a simple polarized scheme for deciding a complex issue: each piece of evidence about the question is individually categorized as supporting exactly one of the two opposing positions.
  • Scope insensitivity – a phenomenon related to the representativeness heuristic where subjects based their willingness-to-pay mostly on a mental image rather than the effect on a desired outcome. An environmental measure that will save 200,000 birds doesn't conjure anywhere near a hundred times the emotional impact and willingness-to-pay of a measure that would save 2,000 birds, even though in fact the former measure is two orders of magnitude more effective.
  • Self-deception - state of preserving a wrong belief, often facilitated by denying or rationalizing away the relevance, significance, or importance of opposing evidence and logical arguments.
  • Status quo bias - people tend to avoid changing the established behavior or beliefs unless the pressure to change is sufficiently strong.
  • Sunk cost fallacy - Letting past investment (of time, energy, money, or any other resource) interfere with decision-making in the present in deleterious ways.
  • The top 1% fallacy - related to not taking into account the idea that a small sample size is not always reflective of a whole population and that sample populations with certain characteristics, e.g. made up of repeat job seekers, are not reflective of the whole population.
  • Underconfidence - the state of being more uncertain than is justified, given your priors and the evidence you are aware of.
  • Wrong Questions - A question about your map that wouldn’t make sense if you had a more accurate map.


  • Absolute certainty – equivalent of Bayesian probability of 1. Losing an epistemic bet made with absolute certainty corresponds to receiving infinite negative payoff, according to the logarithmic proper scoring rule.
  • Adaptation executors - Individual organisms are best thought of as adaptation-executers rather than as fitness-maximizers. Our taste buds do not find lettuce delicious and cheeseburgers distasteful once we are fed a diet too high in calories and too low in micronutrients. Taste buds are adapted to an ancestral environment in which calories, not micronutrients, were the limiting factor. Evolution operates on too slow a timescale to re-adapt to new conditions (such as a changed diet).
  • Adversarial process - a form of truth-seeking or conflict resolution in which identifiable factions hold one-sided positions.
  • Altruism - Actions undertaken for the benefit of other people. If you do something to feel good about helping people, or even to be a better person in some spiritual sense, it isn't truly altruism.
  • Amount of evidence - to a Bayesian, evidence is a quantitative concept. The more complicated or a priori improbable a hypothesis is, the more evidence you need just to justify it, or even just single it out of the amongst the mass of competing theories.
  • Anti-epistemology- is bad explicit beliefs about rules of reasoning, usually developed in the course of protecting an existing false belief - false beliefs are opposed not only by true beliefs (that must then be obscured in turn) but also by good rules of systematic reasoning (which must then be denied). The explicit defense of fallacy as a general rule of reasoning is anti-epistemology.
  • Antiprediction - is a statement of confidence in an event that sounds startling, but actually isn't far from a maxentropy prior. For example, if someone thinks that our state of knowledge implies strong ignorance about the speed of some process X on a logarithmic scale from nanoseconds to centuries, they may make the startling-sounding statement that X is very unlikely to take 'one to three years'.
  • Applause light - is an empty statement which evokes positive affect without providing new information
  • Artificial general intelligence – is a machine capable of behaving intelligently over many domains.
  • Bayesian - Bayesian probability theory is the math of epistemic rationality, Bayesian decision theory is the math of instrumental rationality.
  • Aumann's agreement theorem – roughly speaking, says that two agents acting rationally (in a certain precise sense) and with common knowledge of each other's beliefs cannot agree to disagree. More specifically, if two people are genuine Bayesians, share common priors, and have common knowledge of each other's current probability assignments, then they must have equal probability assignments.
  • Bayesian decision theory – is a decision theory which is informed by Bayesian probability. It is a statistical system that tries to quantify the tradeoff between various decisions, making use of probabilities and costs.
  • Bayesian probability - represents a level of certainty relating to a potential outcome or idea. This is in contrast to a frequentist probability that represents the frequency with which a particular outcome will occur over any number of trials. An event with Bayesian probability of .6 (or 60%) should be interpreted as stating "With confidence 60%, this event contains the true outcome", whereas a frequentist interpretation would view it as stating "Over 100 trials, we should observe event X approximately 60 times." The difference is more apparent when discussing ideas. A frequentist will not assign probability to an idea; either it is true or false and it cannot be true 6 times out of 10.
  • Bayes' theorem - A law of probability that describes the proper way to incorporate new evidence into prior probabilities to form an updated probability estimate.
  • Belief - the mental state in which an individual holds a proposition to be true. Beliefs are often metaphorically referred to as maps, and are considered valid to the extent that they correctly correspond to the truth. A person's knowledge is a subset of their beliefs, namely the beliefs that are also true and justified. Beliefs can be second-order, concerning propositions about other beliefs.
  • Belief as attire – is an example of an improper belief promoted by identification with a group or other signaling concerns, not by how well it reflects the territory.
  • Belief in belief - Where it is difficult to believe a thing, it is often much easier to believe that you ought to believe it. Were you to really believe and not just believe in belief, the consequences of error would be much more severe. When someone makes up excuses in advance, it would seem to require that belief, and belief in belief, have become unsynchronized.
  • Belief update - what you do to your beliefs, opinions and cognitive structure when new evidence comes along.
  • Bite the bullet - is to accept the consequences of a hard choice, or unintuitive conclusions of a formal reasoning procedure.
  • Black swan – is a high-impact event that is hard to predict (but not necessarily of low probability). It is also an event that is not accounted for in a model and therefore causes the model to break down when it occurs.
  • Cached thought – is an answer that was arrived at by recalling a previously-computed conclusion, rather than performing the reasoning from scratch.
  • Causal Decision Theory – a branch of decision theory which advises an agent to take actions that maximizes the causal consequences on the probability of desired outcomes
  • Causality - refers to the relationship between an event (the cause) and a second event (the effect), where the second event is a direct consequence of the first.
  • Church-Turing thesis - states the equivalence between the mathematical concepts of algorithm or computation and Turing-Machine. It asserts that if some calculation is effectively carried out by an algorithm, then there exists a Turing machines which will compute that calculation.
  • Coherent Aggregated Volition - is one of Ben Goertzel's responses to Eliezer Yudkowsky's Coherent Extrapolated Volition, the other being Coherent Blended Volition. CAV would be a combination of the goals and beliefs of humanity at the present time.
  • Coherent Blended Volition - Coherent Blended Volition is a recent concept coined in a 2012 paper by Ben Goertzel with the aim to clarify his Coherent Aggregated Volition idea. This clarifications follows the author's attempt to develop a comprehensive alternative to Coherent Extrapolated Volition.
  • Coherent Extrapolated Volition – is a term developed by Eliezer Yudkowsky while discussing Friendly AI development. It’s meant as an argument that it would not be sufficient to explicitly program our desires and motivations into an AI. Instead, we should find a way to program it in a way that it would act in our best interests – what we want it to do and not what we tell it to.
  • Color politics - the words "Blues" and "Greens" are often used to refer to two opposing political factions. Politics commonly involves an adversarial process, where factions usually identify with political positions, and use arguments as soldiers to defend their side. The dichotomies presented by the opposing sides are often false dilemmas, which can be shown by presenting third options.
  • Common knowledge - In the context of Aumann's agreement theorem, a fact is part of the common knowledge of a group of agents when they all know it, they all know that they all know it, and so on ad infinitum.
  • Conceptual metaphor – are neurally-implemented mappings between concrete domains of discourse (often related to our body and perception) and more abstract domains. These are a well-known source of bias and are often exploited in the Dark Arts. An example is “argument is war”.
  • Configuration space - is an isomorphism between the attributes of something, and its position on a multidimensional graph. Theoretically, the attributes and precise position on the graph should contain the same information. In practice, the concept usually appears as a suffix, as in "walletspace", where "walletspace" refers to the configuration space of all possible wallets, arranged by similarity. Walletspace would intersect with leatherspace, and the set of leather wallets is a subset of both walletspace and leatherspace, which are both subsets of thingspace.
  • Conservation of expected evidence - a theorem that says: "for every expectation of evidence, there is an equal and opposite expectation of counterevidence". 0 = (P(H|E)-P(H))*P(E) + (P(H|~E)-P(H))*P(~E)
  • Control theory - a control system is a device that keeps a variable at a certain value, despite only knowing what the current value of the variable is. An example is a cruise control, which maintains a certain speed, but only measures the current speed, and knows nothing of the system that produces that speed (wind, car weight, grade).
  • Corrupted hardware - our brains do not always allow us to act the way we should. Corrupted hardware refers to those behaviors and thoughts that act for ancestrally relevant purposes rather than for stated moralities and preferences.
  • Counterfactual mugging - is a thought experiment for testing and differentiating decision theories, stated roughly as follows: Omega, a perfect predictor, flips a fair coin. If it comes up tails, Omega asks you for $100. If it comes up heads, Omega pays you $10,000, but only if it predicts that you would have paid the $100 had the coin come up tails.
  • Counter man syndrome - wherein a person behind a counter comes to believe that they know things they don't know, because, after all, they're the person behind the counter. So they can't just answer a question with "I don't know"... and thus they make something up, without really paying attention to the fact that they're making it up. Pretty soon, they don't know the difference between the facts and their made up stories
  • Cox's theorem says, roughly, that if your beliefs at any given time take the form of an assignment of a numerical "plausibility score" to every proposition, and if they satisfy a few plausible axioms, then your plausibilities must effectively be probabilities obeying the usual laws of probability theory, and your updating procedure must be the one implied by Bayes' theorem.
  • Crisis of faith - a combined technique for recognizing and eradicating the whole systems of mutually-supporting false beliefs. The technique involves systematic application of introspection, with the express intent to check the reliability of beliefs independently of the other beliefs that support them in the mind. The technique might be useful for the victims of affective death spirals, or any other systematic confusions, especially those supported by anti-epistemology.
  • Cryonics - is the practice of preserving people who are dying in liquid nitrogen soon after their heart stops. The idea is that most of your brain's information content is still intact right after you've "died". If humans invent molecular nanotechnology or brain emulation techniques, it may be possible to reconstruct the consciousness of cryopreserved patients.
  • Curiosity - The first virtue is curiosity. A burning itch to know is higher than a solemn vow to pursue truth. To feel the burning itch of curiosity requires both that you be ignorant, and that you desire to relinquish your ignorance. If in your heart you believe you already know, or if in your heart you do not wish to know, then your questioning will be purposeless and your skills without direction. Curiosity seeks to annihilate itself; there is no curiosity that does not want an answer. The glory of glorious mystery is to be solved, after which it ceases to be mystery. Be wary of those who speak of being open-minded and modestly confess their ignorance. There is a time to confess your ignorance and a time to relinquish your ignorance. —Twelve Virtues of Rationality
  • Dangerous knowledge - Intelligence, in order to be useful, must be used for something other than defeating itself.
  • Dangling Node - A label for something that isn't "actually real".
  • Death - First you're there, and then you're not there, and they can't change you from being not there to being there, because there's nothing there to be changed from being not there to being there. That's death. Cryonicists use the concept of information-theoretic death, which is what happens when the information needed to reconstruct you even in principle is no longer present. Anything less, to them, is just a flesh wound.
  • Debiasing - The process of overcoming bias. It takes serious study to gain meaningful benefits, half-hearted attempts may accomplish nothing, and partial knowledge of bias may do more harm than good.
  • Decision theory – is the study of principles and algorithms for making correct decisions—that is, decisions that allow an agent to achieve better outcomes with respect to its goals.
  • Defying the data - Sometimes, the results of an experiment contradict what we have strong theoretical reason to believe. But experiments can go wrong, for various reasons. So if our theory is strong enough, we should in some cases defy the data: know that there has to be something wrong with the result, even without offering ideas on what it might be.
  • Disagreement - Aumann's agreement theorem can be informally interpreted as suggesting that if two people are honest seekers of truth, and both believe each other to be honest, then they should update on each other's opinions and quickly reach agreement. The very fact that a person believes something is Rational evidence that that something is true, and so this fact should be taken into account when forming your belief. Outside of well-functioning prediction markets, Aumann agreement can probably only be approximated by careful deliberative discourse. Thus, fostering effective deliberation should be seen as a key goal of Less Wrong.
  • Doubt- The proper purpose of a doubt is to destroy its target belief if and only if it is false. The mere feeling of crushing uncertainty is not virtuous unto an aspiring rationalist; probability theory is the law that says we must be uncertain to the exact extent to which the evidence merits uncertainty.
  • Dunning–Kruger effect - is a cognitive bias wherein unskilled individuals suffer from illusory superiority, mistakenly assessing their ability to be much higher than is accurate. This bias is attributed to a metacognitive inability of the unskilled to recognize their ineptitude. Conversely, highly skilled individuals tend to underestimate their relative competence, erroneously assuming that tasks that are easy for them are also easy for others
  • Emulation argument for human-level AI – argument that since whole brain emulation seems feasible then human-level AI must also be feasible.
  • Epistemic hygiene - consists of practices meant to allow accurate beliefs to spread within a community and keep less accurate or biased beliefs contained. The practices are meant to serve an analogous purpose to normal hygiene and sanitation in containing disease. "Good cognitive citizenship" is another phrase that has been proposed for this concept[1].
  • Error of crowds - is the idea that under some scoring rules, the average error becomes less than the error of the average, thus making the average belief tautologically worse than a belief of a random person. Compare this to the ideas of modesty argument and wisdom of the crowd. A related idea is that a popular belief is likely to be wrong because the less popular ones couldn't maintain support if they were worse than the popular one.
  • Ethical injunction - are rules not to do something even when it's the right thing to do. (That is, you refrain "even when your brain has computed it's the right thing to do", but this will just seem like "the right thing to do".) For example, you shouldn't rob banks even if you plan to give the money to a good cause. This is to protect you from your own cleverness (especially taking bad black swan bets), and the Corrupted hardware you're running on.
  • Evidence - for a given theory is the observation of an event that is more likely to occur if the theory is true than if it is false. (The event would be evidence against the theory if it is less likely if the theory is true.)
  • Evidence of absence - evidence that allows you to conclude some phenomenon isn't there. It is often said that "absence of evidence is not evidence of absence". However, if evidence is expected, but not present, that is evidence of absence.
  • Evidential Decision Theory - a branch of decision theory which advises an agent to take actions which, conditional on it happening, maximizes the chances of the desired outcome.
  • Evolution - The brainless, mindless optimization process responsible for the production of all biological life on Earth, including human beings. Since the design signature of evolution is alien and counterintuitive, it takes some study to get to know your accidental Creator.
  • Evolution as alien god – is a thought experiment in which evolution is imagined as a god. The thought experiment is meant to convey the idea that evolution doesn’t have a mind. The god in the thought experiment would be a tremendously powerful, unbelievably stupid, ridiculously slow, and utterly uncaring god; a god monomaniacally focused on the relative fitness of genes within a species; a god whose attention was completely separated and working at cross-purposes in rabbits and wolves.
  • Evolutionary argument for human-level AI - an argument that uses the fact that evolution produced human level intelligence to argue for the feasibility of human-level AI.
  • Evolutionary psychology - the idea of evolution as the idiot designer of humans - that our brains are not consistently well-designed - is a key element of many of the explanations of human errors that appear on this website.
  • Existential risk – is a risk posing permanent large negative consequences to humanity which can never be undone.
  • Expected value - The expected value or expectation is the (weighted) average of all the possible outcomes of an event, weighed by their probability. For example, when you roll a die, the expected value is (1+2+3+4+5+6)/6 = 3.5. (Since a die doesn't even have a face that says 3.5, this illustrates that very often, the "expected value" isn't a value you actually expect.)
  • Extensibility argument for greater-than-human intelligence –is an argument that once we get to a human level AGI, extensibility would make an AGI of greater-than-human-intelligence feasible.
  • Extraordinary evidence - is evidence that turns an a priori highly unlikely event into an a posteriori likely event.
  • Free-floating belief – is a belief that both doesn't follow from observations and doesn't restrict which experiences to anticipate. It is both unfounded and useless.
  • Free will - means our algorithm's ability to determine our actions. People often get confused over free will because they picture themselves as being restrained rather than part of physics. Yudkowsky calls this view Requiredism, but most people just view it essentially as Compatibilism.
  • Friendly artificial intelligence – is a superintelligence (i.e., a really powerful optimization process) that produces good, beneficial outcomes rather than harmful ones.
  • Fully general counterargument - an argument which can be used to discount any conclusion the arguer does not like. Being in possession of such an argument leads to irrationality because it allows the arguer to avoid updating their beliefs in the light of new evidence. Knowledge of cognitive biases can itself allow someone to form fully general counterarguments ("you're just saying that because you're exhibiting X bias").
  • Great Filter - is a proposed explanation for the Fermi Paradox. The development of intelligent life requires many steps, such as the emergence of single-celled life and the transition from unicellular to multicellular life forms. Since we have not observed intelligent life beyond our planet, there seems to be a developmental step that is so difficult and unlikely that it "filters out" nearly all civilizations before they can reach a space-faring stage.
  • Group rationality - In almost anything, individuals are inferior to groups.
  • Group selection – is an incorrect belief about evolutionary theory that a feature of the organism is there for the good of the group.
  • Heuristic - quick, intuitive strategy for reasoning or decision making, as opposed to more formal methods. Heuristics require much less time and energy to use, but sometimes go awry, producing bias.
  • Heuristics and biases - program in cognitive psychology tries to work backward from biases (experimentally reproducible human errors) to heuristics (the underlying mechanisms at work in the brain).
  • Hold Off on Proposing Solutions - "Do not propose solutions until the problem has been discussed as thoroughly as possible without suggesting any." It is easy to show that this edict works in contexts where there are objectively defined good solutions to problems.
  • Hollywood rationality- What Spock does, not what actual rationalists do.
  • How an algorithm feels - Our philosophical intuitions are generated by algorithms in the human brain. To dissolve a philosophical dilemma, it often suffices to understand the cognitive algorithm that generates the appearance of the dilemma - if you understand the algorithm in sufficient detail. It is not enough to say "An algorithm does it!" - this might as well be magic. It takes a detailed step-by-step walkthrough.
  • Hypocrisy - the act of claiming to have motives, morals and standards one does not possess. Informally, it refers to not living up to the standards that one espouses, whether or not one sincerely believes those standards.
  • Impossibility - Careful use of language dictates that we distinguish between several senses in which something can be said to be impossible. Some things are logically impossible: you can't have a square circle or an object that is both perfectly black and perfectly not-black. Also, in our reductionist universe operating according to universal physical laws, some things are physically impossible based on our model of how things work, even if they are not obviously contradictory or contrary to reason: for example, the laws of thermodynamics give us a strong guarantee that there can never be a perpetual motion machine. It can be tempting to label as impossible very difficult problems which you have no idea how to solve. But the apparent lack of a solution is not a strong guarantee that no solution can exist in the way that the laws of thermodynamics, or Godel's incompleteness results, give us proofs that something cannot be accomplished. A blank map does not correspond to a blank territory; in the absence of a proof that a problem is insolvable, you can't be confident that you're not just overlooking something that a greater intelligence would spot in an instant.
  • Improper belief – is a belief that isn't concerned with describing the territory. A proper belief, on the other hand, requires observations, gets updated upon encountering new evidence, and provides practical benefit in anticipated experience. Note that the fact that a belief just happens to be true doesn't mean you're right to have it. If you buy a lottery ticket, certain that it's a winning ticket (for no reason), and it happens to be, believing that was still a mistake. Types of improper belief discussed in the Mysterious Answers to Mysterious Questions sequence include: Free-floating belief, Belief as attire, Belief in belief and Belief as cheering
  • Incredulity - Spending emotional energy on incredulity wastes time you could be using to update. It repeatedly throws you back into the frame of the old, wrong viewpoint. It feeds your sense of righteous indignation at reality daring to contradict you.
  • Intuition pump - a thought experiment that highlights, or pumps, certain ideas, intuitions or concepts while attenuating others so as to make some conclusion obvious and simple to reach. The intuition pump is a carefully designed persuasion tool in which you check to see if the same intuitions still get pumped when you change certain settings in a thought experiment.
  • Kolmogorov complexity - given a string, the length of the shortest possible program that prints it.
  • Lawful intelligence - The startling and counterintuitive notion - contradicting both surface appearances and all Deep Wisdom - that intelligence is a manifestation of Order rather than Chaos. Even creativity and outside-the-box thinking are essentially lawful. While this is a complete heresy according to the standard religion of Silicon Valley, there are some good mathematical reasons for believing it.
  • Least convenient possible world – is a technique for enforcing intellectual honesty, to be used when arguing against an idea. The essence of the technique is to assume that all the specific details will align with the idea against which you are arguing, i.e. to consider the idea in the context of a least convenient possible world, where every circumstance is colluding against your objections and counterarguments. This approach ensures that your objections are strong enough, running minimal risk of being rationalizations for your position.
  • Logical rudeness – is a response to criticism which insulates the responder from having to address the criticism directly. For example, ignoring all the diligent work that evolutionary biologists did to dig up previous fossils, and insisting you can only be satisfied by an actual videotape, is "logically rude" because you're ignoring evidence that someone went to a great deal of trouble to provide to you.
  • Log odds – is an alternate way of expressing probabilities, which simplifies the process of updating them with new evidence. Unfortunately, it is difficult to convert between probability and log odds. The log odds is the log of the odds ratio.
  • Magical categories - an English word which, although it sounds simple - hey, it's just one word, right? - is actually not simple, and furthermore, may be applied in a complicated way that drags in other considerations. Physical brains are not powerful enough to search all possibilities; we have to cut down the search space to possibilities that are likely to be good. Most of the "obviously bad" methods - those that would end up violating our other values, and so ranking very low in our preference ordering - do not even occur to us as possibilities.
  • Making Beliefs Pay Rent - Every question of belief should flow from a question of anticipation, and that question of anticipation should be the centre of the inquiry. Every guess of belief should begin by flowing to a specific guess of anticipation, and should continue to pay rent in future anticipations. If a belief turns deadbeat, evict it.
  • Many-worlds interpretation - uses decoherence to explain how the universe splits into many separate branches, each of which looks like it came out of a random collapse.
  • Map and territory- Less confusing than saying "belief and reality", "map and territory" reminds us that a map of Texas is not the same thing as Texas itself. Saying "map" also dispenses with possible meanings of "belief" apart from "representations of some part of reality". Since our predictions don't always come true, we need different words to describe the thingy that generates our predictions and the thingy that generates our experimental results. The first thingy is called "belief", the second thingy "reality".
  • Meme lineage – is a set of beliefs, attitudes, and practices that all share a clear common origin point. This concept also emphasizes the means of transmission of the beliefs in question. If a belief is part of a meme lineage that transmits for primarily social reasons, it may be discounted for purposes of the modesty argument.
  • Memorization - is what you're doing when you cram for a university exam. It's not the same thing as understanding the material (see the Understanding entry below).
  • Modesty - admitting or boasting of flaws so as to not create perceptions of arrogance. Not to be confused with humility.
  • Most of science is actually done by induction - To come up with something worth testing, a scientist needs to do lots of sound induction first or borrow an idea from someone who already used induction. This is because induction is the only way to reliably find candidate hypotheses which deserve attention. Examples of bad ways to find hypotheses include finding something interesting or surprising to believe in and then pinning all your hopes on that thing turning out to be true.
  • Most people's beliefs aren’t worth considering - Sturgeon's Law says that as a general rule, 90% of everything is garbage. Even if it is the case that 90% of everything produced by any field is garbage, that does not mean one can dismiss the 10% that is quality work. Instead, it is important to engage with that 10%, and use that as the standard of quality.
  • Nash equilibrium - a stable state of a system involving the interaction of different participants, in which no participant can gain by a unilateral change of strategy if the strategies of the others remain unchanged.
  • Newcomb's problem - In Newcomb's problem, a superintelligence called Omega shows you two boxes, A and B, and offers you the choice of taking only box A, or both boxes A and B. Omega has put $1,000 in box B. If Omega predicted you will take box A only, he has also put $1,000,000 in box A; otherwise box A is empty. Omega's predictions have always been correct so far. Do you take one box or two?
  • Nonapples - a proposed object, tool, technique, or theory which is defined only as being not like a specific, existent example of said categories. It is a type of overly-general prescription which, while of little utility, can seem useful. It involves disguising a shallow criticism as a solution, often in such a way as to make it look profound. For instance, suppose someone says, "We don't need war, we need non-violent conflict resolution." In this way a shallow criticism (war is bad) is disguised as a solution (non-violent conflict resolution, i.e, nonwar). This person is selling nonapples because "non-violent conflict resolution" isn't a method of resolving conflict nonviolently. Rather, it is a description of all conceivable methods of non-violent conflict resolution, the vast majority of which are incoherent and/or ineffective.
  • Noncentral fallacy - A rhetorical move often used in political, philosophical, and cultural arguments. "X is in a category whose archetypal member gives us a certain emotional reaction. Therefore, we should apply that emotional reaction to X, even though it is not a central category member."
  • Not technically a lie – a statement that is literally true, but that causes the listener to attain false beliefs by performing incorrect inference.
  • Occam's razor - principle commonly stated as "Entities must not be multiplied beyond necessity". When several theories are able to explain the same observations, Occam's razor suggests the simpler one is preferable.
  • Odds ratio - an alternate way of expressing probabilities, which simplifies the process of updating them with new evidence. The odds ratio of A is P(A)/P(¬A).
  • Omega - A hypothetical super-intelligent being used in philosophical problems. Omega is most commonly used as the predictor in Newcomb's problem. In its role as predictor, Omega's predictions are almost certainly correct. In some thought experiments, Omega is also taken to be super-powerful. Omega can be seen as analogous to Laplace's demon, or as the closest approximation to the Demon capable of existing in our universe.
  • Oops - Theories must be bold and expose themselves to falsification; be willing to commit the heroic sacrifice of giving up your own ideas when confronted with contrary evidence; play nice in your arguments; try not to deceive yourself; and other fuzzy verbalisms. It is better to say oops quickly when you realize a mistake. The alternative is stretching out the battle with yourself over years.
  • Outside view - Taking the outside view (another name for reference class forecasting) means using an estimate based on a class of roughly similar previous cases, rather than trying to visualize the details of a process. For example, estimating the completion time of a programming project based on how long similar projects have taken in the past, rather than by drawing up a graph of tasks and their expected completion times.
  • Overcoming Bias - is a group blog on the systemic mistakes humans make, and how we can possibly correct them.
  • Paperclip maximizer – is an AI that has been created to maximize the number of paperclips in the universe. It is a hypothetical unfriendly artificial intelligence.
  • Pascal's mugging – is a thought-experiment demonstrating a problem in expected utility maximization. A rational agent should choose actions whose outcomes, when weighed by their probability, have higher utility. But some very unlikely outcomes may have very great utilities, and these utilities can grow faster than the probability diminishes. Hence a naive expected-utility maximizer seems compelled to focus on vastly improbable scenarios with implausibly high rewards, which is the problem the thought experiment exposes.
  • Password - The answer you guess instead of actually understanding the problem.
  • Philosophical zombie - a hypothetical entity that looks and behaves exactly like a human (often stipulated to be atom-by-atom identical to a human) but is not actually conscious: p-zombies are often said to lack qualia or phenomenal consciousness.
  • Phlogiston - the 18th century's answer to the Elemental Fire of the Greek alchemists. Ignite wood, and let it burn. What is the orangey-bright "fire" stuff? Why does the wood transform into ash? To both questions, the 18th-century chemists answered, "phlogiston"....and that was it, you see, that was their answer: "Phlogiston." —Fake Causality
  • Possibility - words in natural language carry connotations that may become misleading when the words get applied with technical precision. While it's not technically a lie to say that it's possible to win a lottery, the statement is deceptive. It's much more precise, for communication of the actual fact through connotation, to say that it’s impossible to win the lottery. This is an example of antiprediction.
  • Possible world - is one that is internally consistent, even if it is counterfactual.
  • Prediction market - speculative markets created for the purpose of making predictions. Assets are created whose final cash value is tied to a particular event or parameter. The current market prices can then be interpreted as predictions of the probability of the event or the expected value of the parameter.
  • Priors - refer generically to the beliefs an agent holds regarding a fact, hypothesis or consequence, before being presented with evidence.
  • Probability is in the Mind - Probabilities express uncertainty, and it is only agents who can be uncertain. A blank map does not correspond to a blank territory. Ignorance is in the mind.
  • Probability theory - a field of mathematics which studies random variables and processes.
  • Rationality - the characteristic of thinking and acting optimally. An agent is rational if it wields its intelligence in such a way as to maximize the convergence between its beliefs and reality; and acts on these beliefs in such a manner as to maximize its chances of achieving whatever goals it has. For humans, this means mitigating (as much as possible) the influence of cognitive biases.
  • Rational evidence - the broadest possible sense of evidence, the Bayesian sense. Rational evidence about a hypothesis H is any observation which has a different likelihood depending on whether H holds in reality or not. Rational evidence is distinguished from narrower forms of evidence, such as scientific evidence or legal evidence. For a belief to be scientific, you should be able to do repeatable experiments to verify the belief. For evidence to be admissible in court, it must e.g. be a personal observation rather than hearsay.
  • Rationalist taboo - a technique for fighting muddles in discussions. By prohibiting the use of a certain word and all the words synonymous to it, people are forced to elucidate the specific contextual meaning they want to express, thus removing ambiguity otherwise present in a single word. Mainstream philosophy has a parallel procedure called "unpacking" where doubtful terms need to be expanded out.
  • Rationality and Philosophy - A sequence by lukeprog examining the implications of rationality and cognitive science for philosophical method.
  • Rationality as martial art - A metaphor for rationality as the martial art of mind; training brains in the same fashion as muscles. The metaphor is intended to have complex connotations, rather than being strictly positive. Do modern-day martial arts suffer from being insufficiently tested in realistic fighting, and do attempts at rationality training run into the same problem?
  • Reversal test - a technique for fighting status quo bias in judgments about the preferred value of a continuous parameter. If one deems the change of the parameter in one direction to be undesirable, the reversal test is to check that either the change of that parameter in the opposite direction (away from status quo) is deemed desirable, or that there are strong reasons to expect that the current value of the parameter is (at least locally) the optimal one.
  • Reductionism - a disbelief that the higher levels of simplified multilevel models are out there in the territory, that concepts constructed by mind in themselves play a role in the behavior of reality. This doesn't contradict the notion that the concepts used in simplified multilevel models refer to the actual clusters of configurations of reality.
  • Religion- Religion is a complex group of human activities — involving tribal affiliation, belief in belief, supernatural claims, and a range of shared group practices such as worship meetings, rites of passage, etc.
  • Reversed stupidity is not intelligence - "The world's greatest fool may say the Sun is shining, but that doesn't make it dark out.".
  • Science - a method for developing true beliefs about the world. It works by developing hypotheses about the world, creating experiments that would allow the hypotheses to be tested, and running the experiments. By having people publish their falsifiable predictions and their experimental results, science protects itself from individuals deceiving themselves or others.
  • Scoring rule - a measure of the performance of probabilistic predictions made under uncertainty.
  • Seeing with Fresh Eyes - A sequence on the incredibly difficult feat of getting your brain to actually think about something, instead of instantly stopping on the first thought that comes to mind.
  • Semantic stopsign – is a meaningless generic explanation that creates an illusion of giving an answer, without actually explaining anything.
  • Shannon information - The Shannon entropy is a measure of the average information content one is missing when one does not know the value of the random variable.
  • Shut up and multiply- the ability to trust the math even when it feels wrong
  • Signaling - "a method of conveying information among not-necessarily-trustworthy parties by performing an action which is more likely or less costly if the information is true than if it is not true".
  • Solomonoff induction - A formalized version of Occam's razor based on Kolmogorov complexity.
  • Sound argument - an argument that is valid and whose premises are all true. In other words, the premises are true and the conclusion necessarily follows from them, making the conclusion true as well.
  • Spaced repetition - is a technique for building long-term knowledge efficiently. It works by showing you a flash card just before a computer model predicts you will have forgotten it. Anki is Less Wrong's spaced repetition software of choice
  • Statistical bias - "Bias" as used in the field of statistics refers to directional error in an estimator. Statistical bias is error you cannot correct by repeating the experiment many times and averaging together the results.
  • Steel man - the strongest possible version of an opposing argument; the opposite of a Straw Man.
  • Superstimulus - an exaggerated version of a stimulus to which there is an existing response tendency, or any stimulus that elicits a response more strongly than the stimulus for which it evolved.
  • Surprise - Recognizing a fact that disagrees with your intuition as surprising is an important step in updating your worldview.
  • Sympathetic magic - Humans seem to naturally generate a series of concepts known as sympathetic magic, a host of theories and practices which have certain principles in common, two of which are of overriding importance: the Law of Contagion holds that two things which have interacted, or were once part of a single entity, retain their connection and can exert influence over each other; the Law of Similarity holds that things which are similar or treated the same establish a connection and can affect each other.
  • Tapping Out - The appropriate way to signal that you've said all you wanted to say on a particular topic, and that you're ending your participation in a conversation lest you start saying things that are less worthwhile. It doesn't mean accepting defeat or claiming victory and it doesn't mean you get the last word. It just means that you don't expect your further comments in a thread to be worthwhile, because you've already made all the points you wanted to, or because you find yourself getting too emotionally invested, or for any other reason you find suitable.
  • Technical explanation - A technical explanation is an explanation of a phenomenon that makes you anticipate certain experiences. A proper technical explanation controls anticipation strictly, weighting your priors and evidence precisely to create the justified amount of uncertainty. Technical explanations are contrasted with verbal explanations, which give the impression of understanding without actually producing the proper expectation.
  • Teleology - The study of things that happen for the sake of their future consequences. The fallacious meaning of it is that events are the result of future events. The non-fallacious meaning is that it is the study of things that happen because of their intended results, where the intention existed in an actual mind in the prior past, and so was causally able to bring about the event by planning and acting.
  • The map is not the territory – the idea that our perception of the world is being generated by our brain and can be considered as a 'map' of reality written in neural patterns. Reality exists outside our mind but we can construct models of this 'territory' based on what we glimpse through our senses.
  • Third option - is a way to break a false dilemma, showing that neither of the suggested solutions is a good idea.
  • Traditional rationality - "Traditional Rationality" refers to the tradition passed down by reading Richard Feynman's "Surely You're Joking", Thomas Kuhn's "The Structure of Scientific Revolutions", Martin Gardner's "Science: Good, Bad, and Bogus", Karl Popper on falsifiability, or other non-technical material on rationality. Traditional Rationality is a very large improvement over nothing at all, and very different from Hollywood rationality; people who grew up on this belief system are definitely fellow travelers, and it is where most of our recruits come from. But you can do even better by adding math, science, formal epistemic and instrumental rationality; experimental psychology, cognitive science, deliberate practice, in short, all the technical stuff. There are also some popular tropes of Traditional Rationality that actually seem flawed once you start comparing them to a Bayesian standard - for example, the idea that you ought to give up an idea once definite evidence has been provided against it, but you're allowed to believe until then, if you want to. Contrast to the stricter idea of there being a certain exact probability which it is correct to assign, continually updated in the light of new evidence.
  • Trivial inconvenience - inconveniences that take few resources to counteract but have a disproportionate impact on people deciding whether to take a course of action.
  • Truth - the correspondence between one's beliefs about reality and reality.
  • Tsuyoku naritai - the will to transcendence. Japanese: "I want to become stronger."
  • Twelve virtues of rationality
    1. Curiosity – the burning itch
    2. Relinquishment – “That which can be destroyed by the truth should be.” -P. C. Hodgell
    3. Lightness – follow the evidence wherever it leads
    4. Evenness – resist selective skepticism; use reason, not rationalization
    5. Argument – do not avoid arguing; strive for exact honesty; fairness does not mean balancing yourself evenly between propositions
    6. Empiricism – knowledge is rooted in empiricism and its fruit is prediction; argue what experiences to anticipate, not which beliefs to profess
    7. Simplicity – is virtuous in belief, design, planning, and justification; ideally: nothing left to take away, not nothing left to add
    8. Humility – take actions, anticipate errors; do not boast of modesty; no one achieves perfection
    9. Perfectionism – seek the answer that is *perfectly* right – do not settle for less
    10. Precision – the narrowest statements slice deepest; don’t walk but dance to the truth
    11. Scholarship – absorb the powers of science
    12. [The void] (the nameless virtue) – “More than anything, you must think of carrying your map through to reflecting the territory.”
  • Understanding - is more than just memorization of detached facts; it requires ability to see the implications across a variety of possible contexts.
  • Universal law - the idea that everything in reality always behaves according to the same uniform physical laws; there are no exceptions and no alternatives.
  • Unsupervised universe - a thought experiment developed to counter undue optimism, not just the sort due to explicit theology, but in particular a disbelief in the Future's vulnerability—a reluctance to accept that things could really turn out wrong. It involves imagining a benevolent god simulating a universe, e.g. Conway's Game of Life, and asking the mathematical question of what would happen according to the standard Life rules given certain initial conditions - a question whose answer even God cannot control; although, of course, God can always intervene in the actual Life universe.
  • Valid argument - an argument is valid when its conclusion follows logically from its premises; validity says nothing about whether the premises are actually true (compare Sound argument, above).
  • Valley of bad rationality - It has been observed that when someone is just starting to learn rationality, they appear to be worse off than they were before. Others, with more experience at rationality, claim that after you learn more about rationality, you will be better off than you were before you started. The period before this improvement is known as "the valley of bad rationality".
  • Wisdom of the crowd – is the collective opinion of a group of individuals rather than that of a single expert. A large group's aggregated answers to questions involving quantity estimation, general world knowledge, and spatial reasoning have generally been found to be as good as, and often better than, the answer given by any of the individuals within the group.
  • Words can be wrong – There are many ways that words can be wrong; it is for this reason that we should avoid arguing by definition. Instead, to facilitate communication we can taboo and reduce: we can replace the symbol with the substance and talk about facts and anticipations, not definitions.
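
Not part of the original glossary, but to make the "Log odds" and "Odds ratio" entries above concrete, here is a minimal Python sketch of updating in odds form; the 1-in-1000 prior and the 50:1 likelihood ratio are made-up numbers chosen purely for illustration.

```python
import math

def log_odds(p):
    """Convert a probability to log odds: ln(p / (1 - p))."""
    return math.log(p / (1 - p))

def probability(lo):
    """Convert log odds back to a probability."""
    return 1 / (1 + math.exp(-lo))

# Hypothetical prior: 1 in 1000 chance the hypothesis is true.
prior = log_odds(0.001)

# Hypothetical piece of evidence with a 50:1 likelihood ratio in favour.
# In log-odds form, each independent piece of evidence is simply added on.
posterior = prior + math.log(50)

print(round(probability(posterior), 3))   # ~0.048
```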


Barriers, biases, fallacies, impediments and problems

  • Akrasia - the state of acting against one's better judgment. Note that, for example, if you are procrastinating because it's not in your best interest to complete the task you are delaying, it is not a case of akrasia.
  • Alief - an independent source of emotional reaction which can coexist with a contradictory belief. For example, the fear felt when a monster jumps out of the darkness in a scary movie is based on the alief that the monster is about to attack you, even though you believe that it cannot.
  • Effort Shock - the unpleasant discovery of how hard it is to accomplish something.


  • Ambient decision theory - A variant of updateless decision theory that uses first order logic instead of a mathematical intuition module (MIM), emphasizing the way an agent can control which mathematical structure a fixed definition defines, an aspect of UDT separate from its own emphasis on not making the mistake of updating away things one can still acausally control.
  • Ask, Guess and Tell culture - The two basic rules of Ask Culture: 1) Ask when you want something. 2) Interpret things as requests and feel free to say "no". The two basic rules of Guess Culture: 1) Ask for things if, and *only* if, you're confident the person will say "yes". 2)  Interpret requests as expectations of "yes", and, when possible, avoid saying "no".The two basic rules of Tell Culture: 1) Tell the other person what's going on in your own mind whenever you suspect  you'd both benefit from them knowing. (Do NOT assume others will accurately model your mind without your help, or that it will even occur to them to ask you questions to eliminate their ignorance.) 2) Interpret things people tell you as attempts to create common knowledge for shared benefit, rather than as requests or as presumptions of compliance.
  • Burch's law – “I think people should have a right to be stupid and, if they have that right, the market's going to respond by supplying as much stupidity as can be sold.” —Greg Burch A corollary of Burch's Law is that any bias should be regarded as a potential vulnerability whereby the market can trick one into buying something one doesn't really want.
  • Challenging the Difficult - A sequence on how to do things that are difficult or "impossible".
  • Cognitive style - Certain cognitive styles might tend to produce more accurate results. A common distinction between cognitive styles is that of foxes vs. hedgehogs. Hedgehogs view the world through the lens of a single defining idea, while foxes draw on a wide variety of experiences and do not believe the world can be boiled down to a single idea. Foxes tend to be better calibrated and more accurate.
  • Consequentialism - the ethical theory that people should choose the action that will result in the best outcome.
  • Crocker's rules - By declaring commitment to Crocker's rules, one authorizes other debaters to optimize their messages for information, even when this entails that emotional feelings will be disregarded. This means that you have accepted full responsibility for the operation of your own mind, so that if you're offended, it's your own fault.
  • Dark arts - refers to rhetorical techniques crafted to exploit human cognitive biases in order to persuade, deceive, or otherwise manipulate a person into irrationally accepting beliefs perpetuated by the practitioner of the Arts. Use of the dark arts is especially common in sales and similar situations (known as hard sell in the sales business) and promotion of political and religious views.
  • Egalitarianism - the idea that everyone should be considered equal. Equal in merit, equal in opportunity, equal in morality, and equal in achievement. Dismissing egalitarianism is not opposed to humility, even though from the signaling perspective it seems to be opposed to modesty.
  • Expected utility - the expected value in terms of the utility produced by an action. It is the sum of the utility of each of its possible consequences, individually weighted by their respective probability of occurrence. A rational decision maker will, when presented with a choice, take the action with the greatest expected utility. (A short numerical sketch follows this list.)
  • Explaining vs. explaining away – Explaining something does not subtract from its beauty. It in fact heightens it. Through understanding it, you gain greater awareness of it. Through understanding it, you are more likely to notice its similarities and interrelationships with other things. Through understanding it, you become able to see it not only on one level, but on multiple. In regards to the delusions which people are emotionally attached to, that which can be destroyed by the truth should be.
  • Fuzzies - A hypothetical measurement unit for "warm fuzzy feeling" one gets from believing that one has done good. Unlike utils, fuzzies can be earned through psychological tricks without regard for efficiency. For this reason, it may be a good idea to separate the concerns for actually doing good, for which one might need to shut up and multiply, and for earning fuzzies, to get psychological comfort.
  • Game theory - attempts to mathematically model interactions between individuals.
  • Generalizing from One Example - an incorrect generalisation when you only have direct first-person knowledge of one mind, psyche or social circle and you treat it as typical even in the face of contrary evidence.
  • Goodhart’s law - states that once a certain indicator of success is made a target of a social or economic policy, it will lose the information content that would qualify it to play such a role. People and institutions try to achieve their explicitly stated targets in the easiest way possible, often obeying only the letter of the law. This is often done in a way that the designers of the law did not anticipate or want. For example, the Soviet factories which, when given targets on the basis of number of nails, produced many tiny useless nails, and when given targets on the basis of weight, produced a few giant nails.
  • Hedonism- refers to a set of philosophies which hold that the highest goal is to maximize pleasure, or more precisely pleasure minus pain.
  • Humans Are Not Automatically Strategic - most courses of action are extremely ineffective, and most of the time there has been no strong evolutionary or cultural force sufficient to focus us on the very narrow behavior patterns that would actually be effective. When this is coupled with the fact that people tend to spend far less effort on planning how to reach a goal than on just trying to achieve it, you end up with the conclusion that humans are not automatically strategic.
  • Human universal - Donald E. Brown has compiled a list of over a hundred human universals - traits found in every culture ever studied, most of them so universal that anthropologists don't even bother to note them explicitly.
  • Instrumental value - a value pursued for the purpose of achieving other values. Values which are pursued for their own sake are called terminal values.
  • Intellectual roles - Group rationality may be improved when members of the group take on specific intellectual roles. While these roles may be incomplete on their own, each embodies an aspect of proper rationality. If certain roles are biased against, purposefully adopting them might reduce bias.
  • Lonely Dissenters suffer social disapproval, but are required - Asch's conformity experiment showed that the presence of a single dissenter tremendously reduced the incidence of "conforming" wrong answers.
  • Loss Aversion - is risk aversion's evil twin. A loss-averse agent tends to avoid uncertain gambles, not because every unit of money brings him a bit less utility, but because he weighs losses more heavily than gains, always treating his current level of money as somehow special.
  • Luminosity - reflective awareness. A luminous mental state is one that you have and know that you have. It could be an emotion, a belief or alief, a disposition, a quale, a memory - anything that might happen or be stored in your brain. What's going on in your head?
  • Marginally zero-sum game also known as 'arms race' - A zero-sum game where the efforts of each player not just give them a benefit at the expense of the others, but decrease the efficacy of everyone's past and future actions, thus making everyone's actions extremely inefficient in the limit.
  • Moral Foundations theory (all moral rules in all human cultures appeal to the six moral foundations: care/harm, fairness/cheating, liberty/oppression, loyalty/betrayal, authority/subversion, sanctity/degradation). This makes other people's moralities easier to understand, and is an interesting lens through which to examine your own.
  • Moral uncertainty – is uncertainty about how to act given the diversity of moral doctrines. Moral uncertainty includes a level of uncertainty above the more usual uncertainty of what to do given incomplete information, since it deals also with uncertainty about which moral theory is right. Even with complete information about the world this kind of uncertainty would still remain
  • Paranoid debating - a group estimation game in which one player, unknown to the others, tries to subvert the group estimate.
  • Politics as charity: in terms of expected value, altruism is a reasonable motivator for voting (as opposed to common motivators like "wanting to be heard").
  • Prediction - a statement or claim that a particular event will occur in the future in more certain terms than a forecast.
  • Privileging the question - questions that someone has unjustifiably brought to your attention in the same way that a privileged hypothesis unjustifiably gets brought to your attention. Examples are: should gay marriage be legal? Should Congress pass stricter gun control laws? Should immigration policy be tightened or relaxed? The problem with privileged questions is that you only have so much attention to spare. Attention paid to a question that has been privileged funges against attention you could be paying to better questions. Even worse, it may not feel from the inside like anything is wrong: you can apply all of the epistemic rationality in the world to answering a question like "should Congress pass stricter gun control laws?" and never once ask yourself where that question came from and whether there are better questions you could be answering instead.
  • Radical honesty- a communication technique proposed by Brad Blanton in which discussion partners are not permitted to lie or deceive at all. Rather than being designed to enhance group epistemic rationality, radical honesty is designed to reduce stress and remove the layers of deceit that burden much of discourse.
  • Reflective decision theory - a term occasionally used to refer to a decision theory that would allow an agent to take actions in a way that does not trigger regret. This regret is conceptualized, according to the Causal Decision Theory, as a Reflective inconsistency, a divergence between the agent who took the action and the same agent reflecting upon it after.
  • Schelling point – is a solution that people will tend to use in the absence of communication, because it seems natural, special, or relevant to them.
  • Schelling fences and slippery slopes – a slippery slope is something that affects people's willingness or ability to oppose future policies. Slippery slopes can sometimes be avoided by establishing a "Schelling fence" - a Schelling point that the various interest groups involved - or yourself across different values and times - make a credible precommitment to defend.
  • Something to protect - The Art must have a purpose other than itself, or it collapses into infinite recursion.
  • Status - Real or perceived relative measure of social standing, which is a function of both resource control and how one is viewed by others.
  • Take joy in the merely real – If you believe that science coming to know about something places it into the dull catalogue of common things, then you're going to be disappointed in pretty much everything eventually —either it will turn out not to exist, or even worse, it will turn out to be real. Another way to think about it is that if the magical and mythical were common place they would be merely real. If dragons were common, but zebras were a rare legendary creature then there's a certain sort of person who would ignore dragons, who would never bother to look at dragons, and chase after rumors of zebras. The grass is always greener on the other side of reality. If we cannot take joy in the merely real, our lives shall be empty indeed.
  • The Science of Winning at Life - A sequence by lukeprog that summarizes scientifically-backed advice for "winning" at everyday life: in one's productivity, in one's relationships, in one's emotions, etc. Each post concludes with footnotes and a long list of references from the academic literature.
  • Timeless decision theory - a decision theory, which in slogan form, says that agents should decide as if they are determining the output of the abstract computation that they implement. This theory was developed in response to the view that rationality should be about winning (that is, about agents achieving their desired ends) rather than about behaving in a manner that we would intuitively label as rational.
  • Unfriendly artificial intelligence - is an artificial general intelligence capable of causing great harm to humanity, and having goals that make it useful for the AI to do so. The AI's goals don't need to be antagonistic to humanity's goals for it to be Unfriendly; there are strong reasons to expect that almost any powerful AGI not explicitly programmed to be benevolent to humans is lethal.
  • Updateless decision theory – a decision theory in which we give up the idea of doing Bayesian reasoning to obtain a posterior distribution etc. and instead just choose the action (or more generally, the probability distribution over actions) that will maximize the unconditional expected utility.
  • Ugh field - Pavlovian conditioning can cause humans to unconsciously flinch from even thinking about a serious personal problem they have. We call it an "ugh field". The ugh field forms a self-shadowing blind spot covering an area desperately in need of optimization.
  • Utilitarianism - A moral philosophy that says that what matters is the sum of everyone's welfare, or the "greatest good for the greatest number".
  • Utility - how much a certain outcome satisfies an agent’s preferences.
  • Utility function - assigns numerical values ("utilities") to outcomes, in such a way that outcomes with higher utilities are always preferred to outcomes with lower utilities. These do not work very well in practice for individual humans
  • Wanting and liking - The reward system consists of three major components:
    • Liking: The 'hedonic impact' of reward, comprised of (1) neural processes that may or may not be conscious and (2) the conscious experience of pleasure.
    • Wanting: Motivation for reward, comprised of (1) processes of 'incentive salience' that may or may not be conscious and (2) conscious desires.
    • Learning: Associations, representations, and predictions about future rewards, comprised of (1) explicit predictions and (2) implicit knowledge and associative conditioning (e.g. Pavlovian associations).
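
As promised under the "Expected utility" entry above, here is a minimal sketch (not part of the original glossary) of the formula in action. The umbrella scenario, probabilities, and utility numbers are invented purely for illustration.

```python
# Expected utility of an action = sum over outcomes of P(outcome) * U(outcome).

def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs for one action."""
    return sum(p * u for p, u in outcomes)

# Hypothetical numbers: 30% chance of rain.
take_umbrella = [(0.3, 5), (0.7, 8)]      # (rain, no rain) while carrying it
leave_umbrella = [(0.3, -10), (0.7, 10)]  # (soaked, unencumbered)

print(expected_utility(take_umbrella))    # 7.1
print(expected_utility(leave_umbrella))   # 4.0
# A rational decision maker takes the action with the higher expected utility.
```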


  • Beliefs require observations - To form accurate beliefs about something, you really do have to observe it. This can be viewed as a special case of the second law of thermodynamics, in fact, since "knowledge" is correlation of belief with reality, which is mutual information, which is a form of negentropy.
  • Complexity of value - the thesis that human values have high Kolmogorov complexity and so cannot be summed up or compressed into a few simple rules. It includes the idea of fragility of value which is the thesis that losing even a small part of the rules that make up our values could lead to results that most of us would now consider as unacceptable.
  • Egan's law - "It all adds up to normality." — Greg Egan. The purpose of a theory is to add up to observed reality, rather than something else. Science sets out to answer the question "What adds up to normality?" and the answer turns out to be Quantum mechanics adds up to normality. A weaker extension of this principle applies to ethical and meta-ethical debates, which generally ought to end up explaining why you shouldn't eat babies, rather than why you should.
  • Emotion - Contrary to the stereotype, rationality doesn't mean denying emotion. When emotion is appropriate to the reality of the situation, it should be embraced; only when emotion isn't appropriate should it be suppressed.
  • Futility of chaos - A complex of related ideas having to do with the impossibility of generating useful work from entropy — a position which holds against ideas such as: our artistic creativity stems from the noisiness of human neurons; randomized algorithms can exhibit performance inherently superior to deterministic algorithms; the human brain is a chaotic system and this explains its power, so non-chaotic systems cannot exhibit intelligence.
  • General knowledge - Interdisciplinary, generally applicable knowledge is rarely taught explicitly. Yet it's important to have at least basic knowledge of many areas (as opposed to deep narrowly specialized knowledge), and to apply it to thinking about everything.
  • Hope - Persisting in clutching to a hope may be disastrous. Be ready to admit you lost, update on the data that says you did.
  • Humility – “To be humble is to take specific actions in anticipation of your own errors. To confess your fallibility and then do nothing about it is not humble; it is boasting of your modesty.” —Twelve Virtues of Rationality Not to be confused with social modesty, or motivated skepticism (aka disconfirmation bias).
  • I don't know - in real life, you are constantly making decisions under uncertainty: the null plan is still a plan, refusing to choose is itself a choice, and by your choices, you implicitly take bets at some odds, whether or not you explicitly conceive of yourself as doing so.
  • Litany of Gendlin – “What is true is already so. Owning up to it doesn't make it worse. Not being open about it doesn't make it go away. And because it's true, it is what is there to be interacted with. Anything untrue isn't there to be lived. People can stand what is true, for they are already enduring it.” —Eugene Gendlin
  • Litany of Tarski – “If the box contains a diamond, I desire to believe that the box contains a diamond; If the box does not contain a diamond, I desire to believe that the box does not contain a diamond; Let me not become attached to beliefs I may not want. “ —The Meditation on Curiosity
  • Lottery - A tax on people who are bad at math. Also, a waste of hope. You will not win the lottery.
  • Magic - What seems to humans like a simple explanation, sometimes isn't at all. In our own naturalistic, reductionist universe, there is always a simpler explanation. Any complicated thing that happens, happens because there is some physical mechanism behind it, even if you don't know the mechanism yourself (which is most of the time). There is no magic.
  • Modesty argument - the claim that when two or more rational agents have common knowledge of a disagreement over the likelihood of an issue of simple fact, they should each adjust their probability estimates in the direction of the others'. This process should continue until the two agents are in full agreement. Inspired by Aumann's agreement theorem.
  • No safe defense - Authorities can be trusted exactly as much as a rational evaluation of the evidence deems them trustworthy, no more and no less. There's no one you can trust absolutely; the full force of your skepticism must be applied to everything.
  • Offense - It is hypothesized that the emotion of offense appears when one perceives an attempt to gain status.
  • Slowness of evolution- The tremendously slow timescale of evolution, especially for creating new complex machinery (as opposed to selecting on existing variance), is why the behavior of evolved organisms is often better interpreted in terms of what did in fact work yesterday, rather than what will work in the future.
  • Stupidity of evolution - Evolution can only access a very limited area in the design space, and can only search for the new designs very slowly, for a variety of reasons. The wonder of evolution is not how intelligently it works, but that an accidentally occurring optimizer without a brain works at all.

My future posts; a table of contents.

19 Elo 30 August 2015 10:27PM

My future posts

I have been living in the lesswrong rationality space for at least two years now, recently more active than previously. This has been deliberate. I plan to make more serious active posts in the future, and so I wanted to announce the posts I intend to make moving forward from today.  This should do a few things:


  1. keep me on track
  2. keep me accountable to me more than anyone else
  3. keep me accountable to others
  4. allow others to pick which they would like to be created sooner
  5. allow other people to volunteer to create/collaborate on these topics
  6. allow anyone to suggest more topics
  7. meta: this post should help to demonstrate one person's method of developing rationality content and the time it takes to do that.
feel free to PM me about 6, or comment below.

Unfortunately these are not very well organised, they are presented in no particular order.  They are probably missing posts that will help link them all together, as well as skills required to understand some of the posts on this list.


Unpublished but written:

A very long list of sleep maintenance suggestions – I wrote up all the ideas I knew of; there are about 150 or so; worth reviewing just to see if you can improve your sleep because the difference in quality of life with good sleep is a massive change. (20mins to write an intro)

A list of techniques to help you remember names. - remembering names is a low-hanging social value fruit that can improve many of your early social interactions with people. I wrote up a list of techniques to help. (5mins to post)


Posts so far:

The null result: a magnetic ring wearing experiment. - a fun one; about how wearing magnetic rings was cool; but not imparting of superpowers. (done)

An app list of useful apps for android - my current list of apps that I use; there are also some very good suggestions in the comments. (done)

How to learn X - how to attack the problem of learning a new area that you don't know a lot about (for a generic thing). (done)

A list of common human goals – useful when plotting out goals that matter to you, so you can look over some common ones and see whether fulfilling them interests you. (done)

Lesswrong real time chat - A Slack channel for hanging out with other rationalists.  Also where I talk about my latest posts before I put them up.


Future posts

Goals of your lesswrong group – Do you have a local group; why? What do you want out of it (do people know)? setting goals, doing something particularly, having fun anyway, changing your mind. (4hrs)


Goals interrogation + Goal levels – Goal interrogation is about asking <is this thing I want to do actually a goal of mine> and <is this the best way to achieve that>; goal levels are something out of Sydney Lesswrong that help you have mutual long term goals and supporting short term goals. (2hrs)


How to human – A zero to human guide. A guide for basic functionality of a humanoid system. (4hrs)


How to effectively accrue property – Just spent more than the value of an object on it? How to think about that and try to do it better. (5hrs)


List of strategies for getting shit done – working around the limitations of your circumstances and understanding what can get done with the resources you have at hand. (4hrs)


List of superpowers and kryptonites – when asking the question "what are my superpowers?" and "what are my kryptonites?". Knowledge is power; working with your powers and working out how to avoid your kryptonites is a method to improve yourself. (6hrs over a week)


List of effective behaviours – small life-improving habits that add together to make awesomeness from nothing. And how to pick them up. (8hrs over 2 weeks)


Memory and notepads – writing notes as evidence, the value of notes (they are priceless) and what you should do. (1hr + 1hr over a week)


Suicide prevention checklist – feeling off? You should have already outsourced the hard work for "things I should check on about myself" to your past self. Make it easier for future you. Especially in the times that you might be vulnerable. (4hrs)


Make it easier for future you. Especially in the times that you might be vulnerable. - as its own post in curtailing bad habits. (5hrs)


A p=np approach to learning – Sometimes you have to learn things the long way; but sometimes there is a short cut. Where you could say, "I wish someone had just taken me on the easy path early on". It's not a perfect idea; but start looking for the shortcuts where you might be saying "I wish someone had told me". Of course my line now is, "but I probably wouldn't have listened anyway" which is something that can be worked on as well. (2hrs)


Rationalists guide to dating – attraction. Relationships. Doing things with a known preference. Don't like stupid people? Don't try to date them. Think first; an exercise in thinking hard about things before trying trial-and-error on the world. (half written, needs improving 2hrs)


Training inherent powers (weights, temperatures, smells, estimation powers) – practice makes perfect right? Imagine if you knew the temperature always, the weight of things by lifting them, the composition of foods by tasting them, the distance between things without measuring. How can we train these, how can we improve. (2hrs)


Strike to the heart of the question. The strongest one; not the one you want to defeat – Steelman not Strawman. Don't ask "how do I win at the question"; ask, "am I giving the best answer to the best question I can give", (2hrs)


Posts not planned at the original writing of the post:

Sensory perception differences and how it shapes personal experience - Is a sound as loud to you as everyone else?  What about a picture?  Are colours as clear and vivid to you as they are to other people?  This post is a consideration in whether the individual difference in experiences can shape our experience and choices in how we live our lives.  Includes some short exercises in sensory perceptions.


Posts added to the list:

Exploration-Exploitation and a method of applying the secretary problem to real life.  I devised a rough equation for application of the secretary problem to real life dating and the exploration-exploitation dilemma.
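
Elo's own "rough equation" isn't reproduced in this table of contents, but as a point of reference, here is a small simulation of the textbook secretary-problem rule the planned post alludes to (inspect the first n/e candidates, then take the next one that beats them all). The candidate scores are random numbers and `best_found` is a hypothetical helper name, so treat this as a sketch of the classic baseline rather than the post's method.

```python
import math
import random

def best_found(n, trials=20_000):
    """Fraction of trials in which the 1/e stopping rule picks the best candidate."""
    cutoff = int(n / math.e)
    wins = 0
    for _ in range(trials):
        candidates = [random.random() for _ in range(n)]
        best_seen = max(candidates[:cutoff], default=float("-inf"))
        # Take the first later candidate better than everything seen so far,
        # or settle for the last one if none turns up.
        chosen = next((c for c in candidates[cutoff:] if c > best_seen),
                      candidates[-1])
        wins += chosen == max(candidates)
    return wins / trials

print(best_found(100))   # ~0.37, the famous 1/e success rate
```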

How to approach a new problem - similar to the "How to solve X" post, but considerations for working backwards from a wicked problem, as well as trying "The least bad solution I know of", Murphy-jitsu, and known solutions to similar problems.  0. I notice I am approaching a problem.

being the kind of person that advice works for - The same words of advice can work for someone and not someone else.  Consider why that is; and how you can better understand the advice that you are given, and how you might become the kind of person that advice works for.

Edit: links added as I write them.


One model of understanding independent differences in sensory perception

17 Elo 20 September 2015 09:32PM

This week my friend Anna said to me; "I just discovered my typical mind fallacy around visualisation is wrong". Naturally I was perplexed and confused. She said; 

“When I was in second grade the teacher had the class do an exercise in visualization. The students sat in a circle and the teacher instructed us to picture an ice cream cone with our favorite ice cream. I thought about my favorite type of cone and my favorite flavor, but the teacher emphasized "picture this in your head, see the ice cream." I tried this, and nothing happened. I couldn't see anything in my head, let alone an ice cream. I concluded, in my childish vanity, that no one could see things in their head, "visualizing" must just be strong figurative language for "pretending," and the exercise was just boring.”


Typical mind fallacy being; "everyone thinks like me" (Or A-typical mind fallacy – "no one thinks like me"). My good friend had discovered (a long time ago) that she had no visualisation function. But only recently made sense of it (approximately 15-20 years later). Anna came to me upset, "I am missing out on a function of the brain; limited in my experiences". Yes; true. She was. And we talked about it and tried to measure and understand that loss in better terms. The next day Anna was back but resolved to feeling better about it. Of course realising the value of individual differences in humans, and accepting that whatever she was missing; she was compensating for it by being an ordinary functional human (give or take a few things here and there), and perhaps there were some advantages.


Together we set off down the road of evaluating the concept of the visualisation sense. So bearing in mind; that we started with "visualise an ice cream"... Here is what we covered.

Close your eyes for a moment, (after reading this paragraph), you can see the "blackness" but you can also see the white sparkles/splotches and some red stuff (maybe beige), as well as the echo-y shadows of what you last looked at, probably your white computer screen. They echo and bounce around your vision. That's pretty easy. Now close your eyes and picture an ice cream cone. So the visualisation-imagination space is not in my visual field, but what I do have is a canvas somewhere on which I draw that ice cream; and anything else I visualise.  It’s definitely in a different place. (We will come back to "where" it is later)

So either you have this "notepad"; “canvas” in your head for the visual perception space or you do not. Well; it’s more like a spectrum of strength of visualisation; where some people will visualise clear and vivid things; and others will have (for lack of better terms) "grey"; "echoes"; Shadows; or foggy visualisation, where drawing that is a really hard thing to do. Anna describes what she can get now in adulthood as a vague kind of bas relief of an image, like an after effect. So it should help you model other people by understanding that variously people can visualise better or worse. (probably not a big deal yet; just wait).


It occurs that there are other canvases; not just for the visual space but for smell and taste as well. So now try to canvas up some smells of lavender or rose, or some soap. You will probably find soap is possible to do; being of memorable and regular significance. The taste of chocolate; kind of appears from all those memories you have; as does cheese; lemon and salt; (but of course someone is screaming at the page about how they don't understand when I say that chocolate "kind of appears”, because it’s very very vivid to them, and someone else can smell soap but it’s quite far away and grey/cloudy).


It occurs to me now that as a teenage male I never cared about my odour; and that I regularly took feedback from some people about the fact that I should deal with that, (personal lack of noticing aside), and I would wonder why a few people would care a lot; and others would not ever care. I can make sense of these happenings by theorising that these people have a stronger smell canvas/faculty than other people. Which makes a whole lot of reasonable sense.

Interesting yet? There is more.

This is a big one.

Sound. But more specifically music. I have explored the insight of having a canvas for these senses with several people over the past week, and noted that the person from the story above confidently boasts an over-active music canvas, with tunes always going on in their head. For a very long time I decided that I was just not a person who cared about music, and never really knew to ask or try to explain why. Just that it doesn't matter to me. Now I have a model.


I can canvas music as it happens – in real time; and reproduce to a tune; but I have no canvas for visualising auditory sounds without stimulation. (What inspired the entire write-up here was someone saying how it finally made them understand why they never made sense of other people's interest in sounds and music.) If you ask me to "hear" the C note on my auditory canvas; I literally have no canvas on which to "draw" that note. I can probably hum a C (although I am not sure how), but I can't play that thing in my head.

Interestingly I asked a very talented pianist. And the response was; "of course I have a musical canvas", (to my slight disappointment). Of course she mentioned it being a big space; and a trained thing as well. (As a professional concert pianist) She can play fully imagined practice on a not-real piano and hear a whole piece. Which makes for excellent practice when waiting for other things to happen, (waiting rooms, queues, public transport...)


Anna from the beginning is not a musician, and says her head-music is not always pleasant but simply satisfactory to her. Sometimes songs she has heard, but mostly noises her mind produces. And words, always words. She speaks quickly and fluently, because her thoughts occur to her in words fully formed. 

I don't care very much about music because I don't "see" (imagine) it. Songs do get stuck in my head but they are more like echoes of songs I have just heard, not ones I can canvas myself.


Now to my favourite sense. My sense of touch. My biggest canvas is my touch canvas. "feel the weight on your shoulders?", I can feel that. "Wind through your hair?", yes. The itch; yes, The scrape on your skin, The rough wall, the sand between your toes. All of that. 


It occurs to me that this explains a lot of details of my life that never really came together. When I was little I used to touch a lot of things, my parents were notorious for shouting my name just as I reached to grab things. I was known as a, "bull in a china shop", because I would touch everything and move everything and feel everything and get into all kinds of trouble with my touch. I once found myself walking along next to a building while swiping my hand along the building - I was with a friend who was trying out drugs (weed). She put her hands on the wall and remarked how this would be interesting to touch while high. At the time I probably said something like; "right okay". And now I understand just what everyone else is missing out on.


I spend most days wearing as few clothes as possible, (while being normal and modest), I still pick up odd objects around. There is a branch of Autism where the people are super-sensitive to touch and any touch upsets or distracts them; a solution is to wear tight-fitting clothing to dull the senses. I completely understand that and what it means to have a noisy-touch canvas.

All I can say is that you have no idea what you are missing out on – and before this week, neither did I. But from today I can better understand myself and the people around me.


There is something to be said about different modes of thinking: some people "think in words", and some people don't think in words at all – they think in pictures or concepts. I can't cover that in this post, but keep it in mind as well when I say "the natural language of my brain".


One more exercise (try to play along – it pays off). Imagine three lines, connected: an equilateral triangle on a 2D plane. Rotate it around; good (some people will already be unable to do this). Now draw three more of these. Easy for some. Now line them up so that the three new triangles surround the first one. Now fold the shape up into a 3D shape.

How many corners?

How many edges?

How many faces?

Okay, good. Now draw a 2D square. Simple. Now add another four triangles, and as before, surround the square with the triangles and fold it up into a pyramid. Again:

How many edges?

How many corners?

How many faces?


Now I want you to take the previous triangle shape; and attach it to one of the triangles of the square-pyramid shape. Got it?

Now how many corners?

How many edges?

How many faces?


That was easy, right? Maybe not that last step. It turns out I am not a super-visualiser. I know this because super-visualisers will notice that when they place the triangular pyramid onto the square pyramid, two of the side faces of the triangular pyramid merge with two faces of the square pyramid into rhombi – effectively making one face out of two triangular faces and removing an edge (and that happens twice, once on each side of the shape). Those who see it will be going "duh", and those who don't will be going "huh? what happened?"
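If you would rather check those counts than visualise them, here is a minimal sketch that builds the glued solid and counts its pieces. It assumes numpy and scipy are available; the coordinates and unit edge lengths are my own choice, not part of the original exercise.

```python
# A sketch (my own coordinates, not from the post): glue a regular tetrahedron
# onto one triangular face of a square pyramid with unit edges, then count the
# vertices, faces and edges of the combined solid.
import numpy as np
from scipy.spatial import ConvexHull

h = 1 / np.sqrt(2)                        # apex height for unit edge lengths
base = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
apex = (0.5, 0.5, h)                      # apex of the square pyramid
tip = (1.5, 0.5, h)                       # tetrahedron vertex glued onto one face

points = np.array(base + [apex, tip])
hull = ConvexHull(points)

# Qhull may report coplanar triangles as separate facets; facets in the same
# plane share (up to rounding) the same plane equation, so merge on that.
faces = {tuple(np.round(eq, 5)) for eq in hull.equations}

V = len(hull.vertices)
F = len(faces)
E = V + F - 2                             # Euler's formula for convex polyhedra
print(V, E, F)                            # expected: 6 9 5
```

The expected output is 6 corners, 9 edges and 5 faces – two fewer faces and two fewer edges than the naive gluing count, because of the merged rhombi.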


Pretty cool right?


Don’t believe me?  Don’t worry - there is a good explanation for those who don’t see it right away - at this link 


From a super-visualiser: 

“I would say, for me, visualization is less like having a mental playground, and more like having an entire other pair of eyes.  And there's this empty darkness into which I can insert almost anything.  If it gets too detailed, I might have to stop and close my outer eyes, or I might have to stop moving so I don't walk into anything. That makes it sound like a playground, but there's much more to it than that.


Imagine that you see someone buying something in a shop.  They pay cash, and the red of the twenty catches your eye.  It's pretty, and it's vivid, and it makes you happy.  And if you imagine a camera zooming out, you see red moving from customers to clerks at all the registers.  Not everyone is paying with twenties, but commerce is red, now.  It's like the air flashes and lights up like fireworks, every time somebody buys something.  

And if you keep zooming out, you can see red blurs all over the town, all over the map.  So if you read about international trade, it's almost like the paper comes to life, and some parts of it are highlighted red.  And if you do that for long enough, it becomes a habit, and something really weird starts to happen.  

When someone tells you about their car, there's a little red flash just out the corner of your eye, and you know they probably didn't pay full price, because there's a movie you can watch, and in the time they got the car, they didn't have a job and they were stressed, so there's not as much red in that part of the movie, so there has to be some way they got the car without losing even more red.  But it's not just colors, and it's definitely not just money.  


Happiness might be shimmering motion.  Connection with friends might be almost a blurring together at the center.  And all these amazing visual metaphors that you usually only see in an art gallery are almost literally there in the world, if you look with the other pair of eyes. So sometimes things really do sort of jump out at you, and nobody else noticed them. But it has to start with one thing.  One meaning, one visual metaphor."



Way up top I mentioned the "where" of the visualisation space. It's not really in the eye; a good name for it might be "the mind's eye". My personal visualisation canvas sits back, up and to the left, tilted downwards and facing forwards.


Synaesthesia covers a lot of possible effects. The most well-known one is where people associate a colour with a letter: when they think of the letter, they have a sense of a colour that goes with it. Some letters don't have colours; sometimes numbers have colours too.


There are other branches of synaesthesia, such as locating things in physical space. Days of the week can be laid out in a row in front of you; numbers can be located somewhere. Some can feel heavier than others. Sounds can have weights, smells can have colours, musical notes can have a taste, words can feel rough or smooth.


Synaesthesia is a class of cross-classification the brain does when interpreting a stimulus, and (we think) it can be caused by crossed wiring in the brain. It's pretty fun. It turns out most people have some kind of synaesthesia, usually to do with the weights of numbers, or days being laid out in a row. Sometimes Tuesdays sit lower than the other days. Who knows. If you notice that some things come with an alternative sensory perception attached, chances are that's a bit of the natural synaesthete coming out.

So what now?

Synaesthesia is supposed to make you smarter. Crossing brain faculties should help you remember things better; if you can think of numbers in terms of how heavy they are, you could probably train your system 1 to do simple arithmetic by "knowing" how heavy the answer is. But if this doesn't come naturally to you, these ideas are no longer low-hanging fruit.


What is low-hanging fruit? Consider all your "canvases" of thinking; work out which ones you care more about and which ones don't matter. (Insert link to superpowers and kryptonites: use your strong senses to your advantage, and avoid relying on your weaker ones.) (Or go on a bender to rebuild your map, influence your territory and train your sensory canvases – but that wouldn't be low-hanging fruit.)

Keep this model around

It can be used for both good and evil, but get the model out there. Talk to people about it. Ask your friends and family whether they are able to visualise. Ask about all the senses. Imagine suddenly discovering that someone you know can't "smell" things in their imagination, or doesn't know what you mean by "feel this" (seriously, you have no idea what you are missing out on in the touch spectrum in my little bubble).

You are going to have strong senses and weak ones. That's okay! The more you know, the more you can use it to your advantage!

Meta: post write-up time 1 hour, plus a week of my social life being dominated by the same conversation over and over with different people where I excitedly explained the most exciting thing of this week, plus 1hr×4, plus 3 people editing and reviewing, plus a rationality dojo where I presented this topic.


Meta 2: I waited 3 weeks for other people to review this. There were no substantial changes and I should not have waited so long; in future I won't.

Lesswrong real time chat

17 Elo 04 September 2015 02:29AM

This is a short post to say that I have started and am managing a Slack channel for lesswrong.

Slack has only an email-invite option which means that I need an email address for anyone who wants to join.  Send me a PM with your email address if you are interested in joining.

There is a web interface and a mobile app that are better than Google Hangouts.


If you are interested in joining; consider this one requirement:

  • You must be willing to be charitable in your conversations with your fellow lesswrongers.


To be clear, this means (including but not limited to):

  • Steelmanning, not strawmanning, in discussion
  • Respect for others
  • Patience
So far every conversation we have had has been excellent; there have been no problems at all, and everyone is striving towards a better understanding of each other. This policy does not come out of a recognised failure to be charitable, but is a standard to set as we move forward. I have no reason to expect it will be broken, but all the same, I feel it is valuable to have.



I would like this to have several goals and purposes (some of which were collaboratively developed with other lesswrongers in the chat; if more come up in the future, that would be good too):
  • an aim for productive conversations, to make progress on our lives.
  • a brains trust for life-advice in all kinds of areas where "outsource this decision to others" is an effective strategy.
  • collaborative creation of further rationality content
  • a safe space for friendly conversation on the internet (a nice place to hang out)
  • A more coherent and stronger connected lesswrong
  • Development of better ideas and strategies in how to personally improve the world.

So far the chat has been operating by private invite from me for about two weeks as a trial. Since this post was created we now have ongoing conversations, with exciting new ideas being produced all the time. If nothing else, it's fun to be in. If something, we are growing a space for rationality and other ideas. I have personally gained two very good friends already, whom I now talk to every day. (Which coincidentally slowed me down from posting this notice, because I was too busy with other things and learning from new people.)

I realise this type of medium is not for all.  But I am keen to make it work.

I also realise that when people PM me their email addresses, others will not see how many have already signed up. So assume that others have signed up already, and don't hesitate to join. If you are wondering whether you have anything to contribute, that's exactly the type of person we want to be inviting; by having that thought you classify yourself as the type of person who tries harder. We want you (and others) to talk with us.

Edit: Topics we now host;
  • AI
  • Film making
  • Goals of lesswrong
  • Human Relationships
  • media
  • parenting
  • philosophy
  • political talk
  • programming
  • real life
  • Resources and links
  • science
  • travelling
  • and some admin channels; the "welcome", "misc", and "RSS" from the lw site.
Edit: a week's review for the first week of August 2015:

Edit - first week of October:


17 Viliam 14 August 2015 05:38PM

(I started reading Alfred Korzybski, the famous 20th century rationalist. Instead of the more famous Science and Sanity I started with Manhood of Humanity, which was written first, because I expected it to be more simple, and possibly to provide a context necessary for the later book. I will post my re-telling of the book in shorter parts, to make writing and discussion easier. This post is approximately the first 1/4 of the book.)


The central question of Manhood of Humanity is: "What is a human?" Answering this question correctly could help us design a civilization allowing the fullest human development. Failure to answer this question correctly will repeat the cycle of revolutions and wars.

We should aim to answer this question precisely, using the best ways of thinking typically seen in exact sciences -- as opposed to verbal metaphysics and tribal fights often seen in social sciences. We should make our "science of human" more predictive, which will likely also make it progress faster.

According to Korzybski, the unique quality of humans is what he calls "time-binding", described as "the capacity of an individual or a generation to begin where the former left off". Science itself is a glorious example of time-binding. On the other hand, we can observe the worst failures in psychiatric cases. This is a scale of our ability to adjust to facts and reality, and normal people are somewhere in between.

continue reading »

A toy model of the control problem

16 Stuart_Armstrong 16 September 2015 02:59PM

EDITED based on suggestions for improving the model

Jaan Tallinn has suggested creating a toy model of the control problem, so that it can be analysed without loaded concepts like "autonomy", "consciousness", or "intentionality". Here is a simple (too simple?) attempt:


A controls B. B manipulates A.

Let B be a robot agent that moves in a two dimensional world, as follows:

continue reading »

Philosophy professors fail on basic philosophy problems

16 shminux 15 July 2015 06:41PM

Imagine someone finding out that "Physics professors fail on basic physics problems". This, of course, would never happen. To become a physicist in academia, one has to (among a million other things) demonstrate proficiency on far harder problems than that.

Philosophy professors, however, are a different story. Cosmologist Sean Carroll tweeted a link to a paper from the Harvard Moral Psychology Research Lab, which found that professional moral philosophers are no less subject to the effects of framing and order of presentation on the Trolley Problem than non-philosophers. This seems as basic an error as, say, confusing energy with momentum, or mixing up units on a physics test.


We examined the effects of framing and order of presentation on professional philosophers’ judgments about a moral puzzle case (the “trolley problem”) and a version of the Tversky & Kahneman “Asian disease” scenario. Professional philosophers exhibited substantial framing effects and order effects, and were no less subject to such effects than was a comparison group of non-philosopher academic participants. Framing and order effects were not reduced by a forced delay during which participants were encouraged to consider “different variants of the scenario or different ways of describing the case”. Nor were framing and order effects lower among participants reporting familiarity with the trolley problem or with loss-aversion framing effects, nor among those reporting having had a stable opinion on the issues before participating in the experiment, nor among those reporting expertise on the very issues in question. Thus, for these scenario types, neither framing effects nor order effects appear to be reduced even by high levels of academic expertise.

Some quotes (emphasis mine):

When scenario pairs were presented in order AB, participants responded differently than when the same scenario pairs were presented in order BA, and the philosophers showed no less of a shift than did the comparison groups, across several types of scenario.

[...] we could find no level of philosophical expertise that reduced the size of the order effects or the framing effects on judgments of specific cases. Across the board, professional philosophers (94% with PhD’s) showed about the same size order and framing effects as similarly educated non-philosophers. Nor were order effects and framing effects reduced by assignment to a condition enforcing a delay before responding and encouraging participants to reflect on “different variants of the scenario or different ways of describing the case”. Nor were order effects any smaller for the majority of philosopher participants reporting antecedent familiarity with the issues. Nor were order effects any smaller for the minority of philosopher participants reporting expertise on the very issues under investigation. Nor were order effects any smaller for the minority of philosopher participants reporting that before participating in our experiment they had stable views about the issues under investigation.

I am confused... I assumed that an expert in moral philosophy would not fall prey to the relevant biases so easily... What is going on?


Versions of AIXI can be arbitrarily stupid

15 Stuart_Armstrong 10 August 2015 01:23PM

Many people (including me) had the impression that AIXI was ideally smart. Sure, it was uncomputable, and there might be "up to finite constant" issues (as with anything involving Kolmogorov complexity), but it was, informally at least, "the best intelligent agent out there". This was reinforced by Pareto-optimality results, namely that there was no computable policy that performed at least as well as AIXI in all environments, and strictly better in at least one.

However, Jan Leike and Marcus Hutter have proved that AIXI can be, in some sense, arbitrarily bad. The problem is that AIXI is not fully specified, because the universal prior is not fully specified. It depends on the choice of an initial computing language (or, equivalently, of an initial Turing machine).

For the universal prior, this will only affect it up to a constant (though this constant could be arbitrarily large). However, for the agent AIXI, it could force it into continually bad behaviour that never ends.

For illustration, imagine that there are two possible environments:

  1. The first one is Hell, which will give ε reward if the AIXI outputs "0", but, the first time it outputs "1", the environment will give no reward for ever and ever after that.
  2. The second is Heaven, which gives ε reward for outputting "0" and 1 reward for outputting "1", and is otherwise memoryless.

Now simply choose a language/Turing machine such that the ratio P(Hell)/P(Heaven) is higher than 1/ε. In that case, for any discount rate, AIXI will always output "0", and thus will never learn whether it's in Hell or not (because it's too risky to find out). It will observe the environment giving reward ε after it outputs "0", behaviour which is compatible with both Heaven and Hell. This keeps P(Hell)/P(Heaven) constant, and ensures that AIXI never does anything else.
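To see the arithmetic, here is a minimal sketch (my own illustration, not code from the Leike and Hutter paper) comparing the expected discounted return of "always output 0" with the best case of ever outputting "1". Whenever P(Hell)/P(Heaven) exceeds (1-ε)/ε – and the post's condition of 1/ε is stronger than that – exploring never pays; and since outputting "0" produces the same observation in both environments, the posterior never moves, so the same comparison holds at every step.

```python
# A toy illustration (mine, not from the Leike & Hutter paper): with the prior
# odds of Hell high enough, a Bayesian reward-maximiser never risks outputting
# "1", so it never finds out which environment it is in.

def value_always_zero(eps, gamma):
    # Outputting "0" forever earns eps per step in Heaven and Hell alike.
    return eps / (1 - gamma)

def value_try_one(p_heaven, gamma):
    # Best case for ever outputting "1": do it immediately.  In Heaven this
    # reveals Heaven and earns 1 per step forever; in Hell it forfeits all
    # future reward.
    return p_heaven / (1 - gamma)

eps, gamma = 0.01, 0.9
odds_hell = 1 / eps + 1            # P(Hell)/P(Heaven), just above 1/eps
p_heaven = 1 / (1 + odds_hell)

print(value_always_zero(eps, gamma) > value_try_one(p_heaven, gamma))  # True
```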

In fact, it's worse than this. If you use the prior to measure intelligence, then an AIXI that follows one prior can be arbitrarily stupid with respect to another.

[Link] Game Theory YouTube Videos

15 James_Miller 06 August 2015 04:17PM

I made a series of game theory videos that carefully go through the mechanics of solving many different types of games.  I optimized the videos for my future Smith College game theory students who will either miss a class, or get lost in class and want more examples.   I emphasize clarity over excitement.   I would be grateful for any feedback.

The horrifying importance of domain knowledge

15 NancyLebovitz 30 July 2015 03:28PM

There are some long lists of false beliefs that programmers hold. This isn't because programmers are especially likely to be more wrong than anyone else; it's just that programming offers a better opportunity than most people get to find out how incomplete their model of the world is.

I'm posting about this here, not just because this information has a decent chance of being both entertaining and useful, but because LWers try to figure things out from relatively simple principles-- who knows what simplifying assumptions might be tripping us up?

The classic (and I think the first) was about names. There have been a few more lists created since then.

Time. And time zones. Crowd-sourced time errors.

Addresses. Possibly more about addresses. I haven't compared the lists.

Gender. This is so short I assume it's seriously incomplete.

Networks. Weirdly, there is no list of falsehoods programmers believe about html (or at least a fast search didn't turn anything up). Don't trust the words in the url.

Distributed computing. Build systems.

Poem about character conversion.

I got started on the subject because of this about testing your code, which was posted by Andrew Ducker.

Film about Stanislav Petrov

14 matheist 10 September 2015 06:43PM

I searched around but didn't see any mention of this. There's a film being released next week about Stanislav Petrov, the man who saved the world.

The Man Who Saved the World

Due for limited theatrical release in the USA on 18 September 2015.
Will show in New York, Los Angeles, Detroit, Portland.

Previous discussion of Stanislav Petrov:

Notes on Actually Trying

13 AspiringRationalist 23 September 2015 02:53AM

These ideas came out of a recent discussion on actually trying at Citadel, Boston's Less Wrong house.

What does "Actually Trying" mean?

Actually Trying means applying the combination of effort and optimization power needed to accomplish a difficult but feasible goal. The effort and optimization power are both necessary.

Failure Modes that can Resemble Actually Trying

Pretending to try

Pretending to try means doing things that superficially resemble actually trying but are missing a key piece. You could, for example, make a plan related to your goal and diligently carry it out but never stop to notice that the plan was optimized for convenience or sounding good or gaming a measurement rather than achieving the goal. Alternatively, you could have a truly great plan and put effort into carrying it out until it gets difficult.

Trying to Try

Trying to try is when you throw a lot of time and perhaps mental anguish at a task but do not actually do the task. Writer's block is the classic example of this.


Sphexing

Sphexing is the act of carrying out a plan or behavior repeatedly despite it not working.

The Two Modes Model of Actually Trying

Actually Trying requires a combination of optimization power and effort, but each of those is done with a very different way of thinking, so it's helpful to do the two separately. In the first way of thinking, Optimizing Mode, you think hard about the problem you are trying to solve, develop a plan, look carefully at whether it's actually well-suited to solving the problem (as opposed to pretending to try) and perhaps Murphy-jitsu it. In Executing Mode, you carry out the plan.

Executing Mode breaks down when you reach an obstacle that you either don't know how to overcome or where the solution is something you don't want to do. In my personal experience, this is where things tend to get derailed. There are a few ways to respond to this situation:

  • Return to Optimizing Mode to figure out how to overcome the obstacle / improve your plan (good),
  • Ask for help / consult a relevant expert (good),
  • Take a break, which could lead to a eureka moment, lead to Optimizing Mode or lead to derailing (ok),
  • Sphex (bad),
  • Derail / procrastinate (bad), or
  • Punt / give up (ok if the obstacle is insurmountable).

The key is to respond constructively to obstacles. This usually means getting back to Optimizing Mode, either directly or after a break.  The failure modes here are derailing immediately, a "break" that turns into a derailment, and sphexing.  In our discussion, we shared a few techniques we had used to get back to Optimizing Mode.  These techniques tended to focus on some combination of removing the temptation to derail, providing a reminder to optimize, and changing mental state.

Getting Back to Optimizing Mode

Context switches are often helpful here.  Because for many people, work and procrastination both tend to be computer-based activities, it is both easy and tempting to switch to a time-wasting activity immediately upon hitting an obstacle.  Stepping away from the computer takes away the immediate distraction and depending on what you do away from the computer, helps you either think about the problem or change your mental state.  Depending on what sort of mood I'm in, I sometimes step away from the computer with a pen and paper to write down my thoughts (thinking about the problem), or I may step away to replenish my supply of water and/or caffeine (changing my mental state).  Other people in the discussion said they found going for a walk or getting more strenuous exercise to be helpful when they needed a break.  Strenuous exercise has the additional advantage of having very low risk of turning into a longer-than-intended break.

The danger with breaks is that they can turn into derailment.  Open-ended breaks ("I'll just browse Reddit for five minutes") have a tendency to expand, so it's best to avoid them in favor of activities with more definite endings.  The other common way for breaks to turn into derailment is to return from a break and go to something non-productive.  I have had some success with attaching a sticky note to my monitor reminding me what to do when I return to my computer.  I have also found that if the note makes clear what problem I need to solve, I am less likely to sphex when I return to my computer.

In the week or so since the discussion that inspired this post, I have found that asking myself "what would Actually Trying look like right now?" has helped me stay on track when I have encountered difficult problems at work.

Is semiotics bullshit?

13 PhilGoetz 25 August 2015 02:09PM

I spent an hour recently talking with a semiotics professor who was trying to explain semiotics to me.  He was very patient, and so was I, and at the end of an hour I concluded that semiotics is like Indian chakra-based medicine:  a set of heuristic practices that work well in a lot of situations, justified by complete bullshit.

I learned that semioticians, or at least this semiotician:

  • believe that what they are doing is not philosophy, but a superset of mathematics and logic
  • use an ontology, vocabulary, and arguments taken from medieval scholastics, including Scotus
  • oppose the use of operational definitions
  • believe in the reality of something like Platonic essences
  • look down on logic, rationality, reductionism, the Enlightenment, and eliminative materialism.  He said that semiotics includes logic as a special, degenerate case, and that semiotics includes extra-logical, extra-computational reasoning.
  • seems to believe people have an extra-computational ability to make correct judgements at better-than-random probability that have no logical basis
  • claims materialism and reason each explain only a minority of the things they are supposed to explain
  • claims to have a complete, exhaustive, final theory of how thinking and reasoning works, and of the categories of reality.

When I've read short, simple introductions to semiotics, they didn't say this.  They didn't say anything I could understand that wasn't trivial.  I still haven't found one meaningful claim made by semioticians, or one use for semiotics.  I don't need to read a 300-page tome to understand that the 'C' on a cold-water faucet signifies cold water.  The only example he gave me of its use is in constructing more-persuasive advertisements.

(Now I want to see an episode of Mad Men where they hire a semiotician to sell cigarettes.)

Are there multiple "sciences" all using the name "semiotics"?  Does semiotics make any falsifiable claims?  Does it make any claims whose meanings can be uniquely determined and that were not claimed before semiotics?

His notion of "essence" is not the same as Plato's; tokens rather than types have essences, but they are distinct from their physical instantiation.  So it's a tripartite Platonism.  Semioticians take this division of reality into the physical instantiation, the objective type, and the subjective token, and argue that there are only 10 possible combinations of these things, which therefore provide a complete enumeration of the possible categories of concepts.  There was more to it than that, but I didn't follow all the distinctions. He had several different ways of saying "token, type, unbound variable", and seemed to think they were all different.

Really it all seemed like taking logic back to the middle ages.

Instrumental Rationality Questions Thread

13 AspiringRationalist 22 August 2015 08:25PM

This thread is for asking the rationalist community for practical advice.  It's inspired by the stupid questions series, but with an explicit focus on instrumental rationality.

Questions ranging from easy ("this is probably trivial for half the people on this site") to hard ("maybe someone here has a good answer, but probably not") are welcome.  However, please stick to problems that you actually face or anticipate facing soon, not hypotheticals.

As with the stupid questions thread, don't be shy: everyone has holes in their knowledge, though the fewer and the smaller we can make them, the better.  Please be respectful of other people's admitting ignorance and don't mock them for it, as they're doing a noble thing.

(See also the Boring Advice Repository)

Predict - "Log your predictions" app

13 Gust 17 August 2015 04:20PM

As an exercise in programming for Android, I've made an app to log the predictions you make and keep score of your results. Like PredictionBook, but with more of a personal daily exercise feel, in line with this post.

The "statistics" right now are only a score I copied from the old Credence calibration game, and a calibration bar chart.

Features I think might be worth adding:

  • Daily notifications to remember to exercise your prediction ability
  • Maybe with trivia questions you can answer if you don't have any personal prediction to make

I'm hoping for suggestions for features, and criticism of the app design.

Here's the link for the apk (v0.4), and here's the source code repository. You can also download it from the Google Play Store.



2015-08-26 - Fixed bug that broke on Android 5.0.2 (thanks Bobertron)

2015-08-28 - Change layout for landscape mode, and add a better icon

2015-08-31 -

  • Daily notifications
  • Buttons at the expanded-item-layout (ht dutchie)
  • Show points won/lost in the snackbar when a prediction is answered
  • Translation to Portuguese


[LINK] Scott Aaronson: Common knowledge and Aumann's agreement theorem

13 gjm 17 August 2015 08:41AM

The excellent Scott Aaronson has posted on his blog a version of a talk he recently gave at SPARC, about Aumann's agreement theorem and related topics. I think a substantial fraction of LW readers would enjoy it. As well as stating Aumann's theorem and explaining why it's true, the article discusses other instances where the idea of "common knowledge" (the assumption that does a lot of the work in the AAT) is important, and offers some interesting thoughts on the practical applicability (if any) of the AAT.

(Possibly relevant: an earlier LW discussion of AAT.)

You Are A Brain - Intro to LW/Rationality Concepts [Video & Slides]

13 Liron 16 August 2015 05:51AM

Here's a 32-minute presentation I made to provide an introduction to some of the core LessWrong concepts for a general audience:

You Are A Brain [YouTube]

You Are a Brain [Google Slides] - public domain

I already posted this here in 2009 and some commenters asked for a video, so I immediately recorded one six years later. This time the audience isn't teens from my former youth group, it's employees who work at my software company where we have a seminar series on Thursday afternoons.

Book Review: Naive Set Theory (MIRI research guide)

13 David_Kristoffersson 14 August 2015 10:08PM

I'm David. I'm reading through the books in the MIRI research guide and will write a review for each as I finish them, taking inspiration from how Nate did it.

Naive Set Theory

Halmos' Naive Set Theory is a classic, dense little book on axiomatic set theory, from a "naive" perspective.

Which is to say, the book won't dig into the depths of formality or philosophy; it focuses on getting you productive with set theory. The point is to give someone who wants to dig into advanced mathematics a foundation in set theory, since set theory is a fundamental tool used in a lot of mathematics.


Is it a good book? Yes.

Would I recommend it as a starting point, if you would like to learn set theory? No. The book has a terse presentation, which makes it tough to digest if you aren't already familiar with propositional logic, a bit of set theory, and a bit of advanced mathematics in general. There are plenty of other books that can get you started there.

If you do have a somewhat fitting background, I think this should be a very competent pick to deepen your understanding of set theory. The author shows you the nuts and bolts of set theory and doesn't waste any time doing it.

Perspective of this review

I will first refer you to Nate's review, which I found to be a lucid take on it. I don't want to be redundant and repeat the good points made there, so I want to focus this review on the perspective of someone with a bit weaker background in math, and try to give some help to prospective readers with parts I found tricky in the book.

What is my perspective? While I've always had a knack for math, I only read about 2 months of mathematics at introductory university level, and not including discrete mathematics. I do have a thorough background in software development.

Set theory has eluded me. I've only picked up fragments. It's seemed very fundamental but school never gave me a good opportunity to learn it. I've wanted to understand it, which made it a joy to add Naive Set Theory to the top of my reading list.

How I read Naive Set Theory

Starting on Naive Set Theory, I quickly realized I wanted more meat to the explanations. What is this concept used for? How does it fit in to the larger subject of mathematics? What the heck is the author expressing here?

I supplemented heavily with wikipedia, math.stackexchange and other websites. Sometimes, I read other sources even before reading the chapter in the book. At two points, I laid down the book in order to finish two other books. The first was Gödel's Proof, which handed me some friendly examples of propositional logic. I had started reading it on the side when I realized it was contextually useful. The second was Concepts of Modern Mathematics, which gave me much of the larger mathematical context that Naive Set Theory didn't.

Consequently, while reading Naive Set Theory, I spent at least as much time reading other sources!

A bit into the book, I started struggling with the exercises. It simply felt like I hadn't been given all the tools to attempt the task. So, I concluded I needed a better introduction to mathematical proofs, ordered some books on the subject, and postponed investing into the exercises in Naive Set Theory until I had gotten that introduction.


In general, if the book doesn't offer you enough explanation on a subject, search the Internet. Wikipedia has numerous competent articles, math.stackexchange is overflowing with content, and there are plenty of additional sources available on the net. If you get stuck, try playing around with examples of sets on paper or in a text file. That's universal advice for math.

I'll follow with some key points and some highlights of things that tripped me up while reading the book.

Axiom of extension

The axiom of extension tells us how to distinguish between sets: sets are the same if they contain the same elements, and different if they do not.
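In symbols (my rendering, not Halmos's notation):

$$\forall A\,\forall B\,\bigl(\forall x\,(x \in A \leftrightarrow x \in B)\ \rightarrow\ A = B\bigr)$$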

Axiom of specification

The axiom of specification allows you to create subsets by using conditions. This is pretty much what is done every time set builder notation is employed.
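For example, specification is what licenses definitions of the form (my example, not the book's):

$$B = \{x \in A : \varphi(x)\}, \qquad \text{e.g. } E = \{n \in \omega : n \text{ is even}\}$$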

Puzzled by the bit about Russell's paradox at the end of the chapter?

Unordered pairs

The axiom of pairs allows one to create a new set that contains the two original sets.

Unions and intersections

The axiom of unions allows one to create a new set that contains all the members of the original sets.

Complements and powers

The axiom of powers allows one to, out of one set, create a set containing all the different possible subsets of the original set.
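A small worked example (mine, not the book's):

$$\mathcal{P}(\{a, b\}) = \{\varnothing, \{a\}, \{b\}, \{a, b\}\}, \qquad |\mathcal{P}(X)| = 2^{|X|} \text{ for finite } X$$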

Getting tripped up about the "for some" and "for every" notation used by Halmos? Welcome to the club:

Using natural language rather than logical notation is common practice in mathematical textbooks. You'd better get used to it:

The existential quantifiers tripped me up a bit before I absorbed them. In math, you can freely express something like "out of all possible x ever, give me the set of x that fulfill this condition". In programming languages, you tend to have to be much more... specific, in your statements.

Ordered pairs

Cartesian products are used to represent plenty of mathematical concepts, notably coordinate systems.


Relations

Equivalence relations and equivalence classes are important concepts in mathematics.


Functions

Halmos is using some dated terminology and is in my eyes a bit inconsistent here. In modern usage, we have: injective, surjective, bijective, and functions that are none of these. Bijective is the combination of being both injective and surjective. Replace Halmos' "onto" with surjective, "one-to-one" with injective, and "one-to-one correspondence" with bijective.
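In modern terms (my summary):

$$f : X \to Y \text{ is injective if } f(a) = f(b) \Rightarrow a = b,\quad \text{surjective if } \forall y \in Y\ \exists x \in X : f(x) = y,$$
$$\text{and bijective if it is both.}$$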

He also confused me with his explanation of "characteristic function" - you might want to check another source there.


Families

This chapter tripped me up heavily because Halmos mixed in three things at the same time on page 36: 1. A confusing way of talking about sets. 2. A convoluted proof. 3. The n-ary cartesian product.

Families are an alternative way of talking about sets. An indexed family is a set, with an index and a function in the background. A family of sets means a collection of sets, with an index and a function in the background. In Halmos' build-up to n-ary cartesian products, the deal seems to be that he teases out order without explicitly using ordered pairs. Golf clap. Try this one for the treatment:

Inverses and composites

The inverses Halmos defines here are more general than the inverse functions described on wikipedia. Halmos' inverses work even when the functions are not bijective.


Numbers

The axiom of infinity states that there is a set of the natural numbers.

The Peano axioms

The Peano axioms can be modeled on the set-theoretic axioms. The recursion theorem guarantees that recursive functions exist.


Arithmetic

The principle of mathematical induction is put to heavy use in order to define arithmetic.


Order

Partial orders, total orders, and well orders are powerful mathematical concepts and are used extensively.

Some help on the way:

Also, keep in mind that infinite sets like subsets of ω can muck up expectations about order. For example, a totally ordered set can have multiple elements without a predecessor (in ω + ω, both 0 and ω lack one).

Axiom of choice

The axiom of choice lets you, from any collection of non-empty sets, select an element from every set in the collection. The axiom is necessary to make these kinds of "choices" with infinite collections. In finite cases, one can construct functions for the job using the other axioms, though the axiom of choice often makes the job easier, so it is used even where it isn't necessary.

Zorn's lemma

Zorn's lemma is used in similar ways to the axiom of choice – making infinitely many choices at once – which perhaps is not very strange considering ZL and AC have been proven to be equivalent.

robot-dreams offers some help in following the massive proof in the book.

Well ordering

A well-ordered set is a totally ordered set with the extra condition that every non-empty subset of it has a smallest element. This extra condition is useful when working with infinite sets.

The principle of transfinite induction means that if the presence of all strict predecessors of an element always implies the presence of the element itself, then the set must contain everything. Why does this matter? It means you can draw conclusions about infinite sets beyond ω, where mathematical induction isn't sufficient.

Transfinite recursion

Transfinite recursion is an analogue to the ordinary recursion theorem, in a similar way that transfinite induction is an analogue to mathematical induction – recursive functions for infinite sets beyond ω.

In modern lingo, what Halmos calls a "similarity" is an "order isomorphism".

Ordinal numbers

The axiom of substitution is called the axiom (schema) of replacement in modern use. It's used for extending counting beyond ω.

Sets of ordinal numbers

The counting theorem states that each well ordered set is order isomorphic to a unique ordinal number.

Ordinal arithmetic

The failure of commutativity in ordinal arithmetic reflects a natural fact about ordinals: if you tack on an element at the beginning, the result is order isomorphic to what you started with; if you tack on an element at the end, the set now has a last element and is thus not order isomorphic to what you started with.
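The canonical example, in symbols:

$$1 + \omega = \sup_{n < \omega}(1 + n) = \omega \;\neq\; \omega + 1,$$

since ω + 1 has a last element and ω does not.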

The Schröder-Bernstein theorem

The Schröder-Bernstein theorem states that if X dominates Y, and Y dominates X, then X ~ Y (X and Y are equivalent).

Countable sets

Cantor's theorem states that every set always has a smaller cardinal number than the cardinal number of its power set.
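The key step of the proof, for reference (standard notation, my wording): given any $f : X \to \mathcal{P}(X)$, the set

$$D = \{x \in X : x \notin f(x)\}$$

cannot equal $f(x)$ for any $x$, so no $f$ is surjective and $|X| < |\mathcal{P}(X)|$.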

Cardinal arithmetic

Read this chapter after Cardinal numbers.

Cardinal arithmetic is an arithmetic where just about all the standard operators do nothing (beyond the finite cases).

Cardinal numbers

Read this chapter before Cardinal arithmetic.

The continuum hypothesis asserts that there is no cardinal number between that of the natural numbers and that of the reals. The generalized continuum hypothesis asserts that, for all cardinal numbers from aleph-0 upwards, the next cardinal number in the sequence is the cardinality of the power set of the previous one.
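In modern notation:

$$\text{CH}:\; 2^{\aleph_0} = \aleph_1, \qquad \text{GCH}:\; 2^{\aleph_\alpha} = \aleph_{\alpha+1} \text{ for every ordinal } \alpha$$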

Concluding reflections

I am at the same time humbled by the subject and empowered by what I've learned in this episode. Mathematics is a truly vast and deep field. To build a solid foundation in proofs, I will now go through one or two books about mathematical proofs. I may return to Naive Set Theory after that. If anyone is interested, I could post my impressions of other mathematical books I read.

I think Naive Set Theory wasn't the optimal book for me at the stage I was. And I think Naive Set Theory probably should be replaced by another introductory book on set theory in the MIRI research guide. But that's a small complaint on an excellent document.

If you seek to get into a new field, know the prerequisites. Build your knowledge in solid steps. Which I guess, sometimes requires that you do test your limits to find out where you really are.

The next book I start on from the research guide is bound to be Computability and Logic.

Peer-to-peer "knowledge exchanges"

13 snarles 08 August 2015 03:33PM

I wonder if anyone has thought about setting up an online community dedicated to peer-to-peer tutoring.  The idea is that if I want to learn "Differential Geometry" and know "Python programming", and you want to learn "Python programming" and know "Differential geometry," then we can agree to tutor each other online.  The features of the community would be to support peer-to-peer tutoring by:



  • Facilitating matchups between compatible tutors
  • Allowing for more than two people to participate in a tutoring arrangement
  • Providing reputation-based incentives to honor tutoring agreements and putting effort into tutoring
  • Allowing other members to "sit in" on tutoring sessions, if they are made public
  • Allowing the option to record tutoring sessions
  • Providing members with access to such recorded sessions and "course materials"
  • Providing a forum to arrange other events

With such functions, the community would have some overlap with other online learning platforms, but the focus of the community would be to provide free, quality personalized teaching.

The LessWrong community could build the first version of this peer tutoring system.  It has people with broad interests, high intellectual standards, and many engineers who could help develop some of the infrastructure.  The first iteration of the community would be small, and many of the above features (e.g. a reputation system, and tools for facilitating matchups) would not be needed.  The first problems we would need to solve are:
  • Where should we host the community? (e.g. Google groups?)
  • What are some basic ground rules to ensure the integrity of the community and ensure safety?
  • Where can we provide a place for people to list which subjects they want to learn and which subjects they can teach?
  • Which software should we use for tutoring?
  • How can people publicize their tutoring schedule in case others want to "sit in"?
  • How can people record their tutoring sessions if they wish, and how can they make these available?
  • How should the community be administrated?  Who should be put in charge of organizing the development of the community?
  • How should we recruit new members?


making notes - an instrumental rationality process.

12 Elo 05 September 2015 10:51PM

The value of having notes. Why do I make notes.


Story time!

At one point in my life I had a memory crash. Which is to say once upon a time I could remember a whole lot more than I was presently remembering. I recall thinking, "what did I have for breakfast last Monday? Oh no! Why can't I remember!". I was terrified. It took a while but eventually I realised that remembering what I had for breakfast last Monday was:

  1. not crucial to the rest of my life

  2. not crucial to being a functional human being

  3. I was not sure if I usually remembered what I ate last Monday, or if this was the first time I had tried to recall it with enough stubbornness to notice that I had no idea.

After surviving my first teen-life crisis I went on to realise a few things about life and about memory:

  1. I will not be remembering everything forever.

  2. Sometimes I forget things that I said I would do. Especially when the number of things I think I will do increases past 2-3 and upwards to 20-30.

  3. Don't worry! There is a solution!

  4. As someone in my mid-20s who is already forgetting things, I was told by a friendly mid-30-year-old that in 10 years I will have a third more life to be trying to remember as well. Which should also serve as a really good reason why you should always comment your code as you go, and why you should definitely write notes. "Past me thought future me knew exactly what I meant, even though past me actually had no idea what they were going on about".

The foundation of science.


There are many things that could be considered the foundations of science. I believe that one of the earliest foundations you can possibly engage in is observation.


In a more-than-goldfish form, observation means holding information. It means keeping things for review later in your life, whether at the end of this week, this month or this year. Observation is only the start. Writing it down makes it evidence. Biased, personal, scrawled (bad) evidence, but evidence all the same. If you want to be more effective at changing your mind, you need to know what your mind says.


It's great to make notes. That's exactly what I am saying. It goes further though. Take notes and then review them. Weekly; monthly; yearly. Unsure about where you are going? Know where you have come from. With that you can move forward with better purpose.

My note taking process:

1. Get a notebook.

This picture includes some types of notebooks that I have tried.

  1. A4 lined paper, cardboard front and back. It became difficult to carry because it was big, and hard to open up and use as well. Side-bound is also something I didn't like, because I am left-handed and it seemed to get in my way.

  2. Bad photo, but it's a pad of grid paper. I found a stack of these in the middle of the ground late at night, as if they had fallen off a truck or something. I really liked them, except that they were stuck together by essentially nothing and fell to pieces by the time I got to the bottom of the pad.

  3. Lined note paper. I will never go back to a book that doesn't hold together; the risk of losing paper is terrible. I don't mind occasionally ripping out some paper, but losing a page when I didn't want to has never worked out well for me.

  4. Top spiral-bound, 100 pages. This did not have enough pages; I bought it after a 200-pager ran out of paper and I needed a quick replacement. Well, it was quick – I used it up in half the time the last book lasted.

  5. Top spiral-bound 200-page notepad, plastic cover; these are the type of book I currently use. Number 8 is the book I am writing in right now.

  6. 300-page top spiral-bound – as you can see by the tape, it started falling apart by the time I got to the end of it.

  7. Small notebook. I got these because they were 48c each, but they never worked for me. I would bend them, forget them, leave them in the wrong places, and generally not have them around when I wanted them.

  8. I am about halfway through my current book; the first page says 23/7/15, and today it is 1/9/15. Estimate a book every two months, although it really depends on how you use it.

  9. A future book I will try. It holds a pen, so I will probably find that useful.

  10. Also a future one; I expect it to be too small to be useful for me.

  11. A gift from a more organised person than I. It is a Moleskine grid-paper book, and I plan to try it soon as well.

The important take-away from this is: try several; they might work in different ways and for different reasons. Has your life changed substantially, i.e. you don't sit at a desk much any more? Is the book not working? Maybe another type of book would work better.

I only write on the bottom of the flip-page, and occasionally scrawl diagrams on the other side of the page, but only when they are relevant. This way I can always flip through easily, and not worry about the other side of the paper.


2. Carry a notebook. Everywhere. Find a way to make it a habit. Don't carry a bag? You could; then you can carry your notepad everywhere with you in a bag. Consider a pocket-sized book as a solution to not wanting to carry a bag.

3. When you stop moving, turn the notebook to the correct page and write the date.

Writing the date is almost entirely useless. I really never care what the date is. I sometimes care that when I look back over the book I can see the timeline around which the events happened, but really – the date means nothing to me.

What writing the date helps to do:

  • make sure you have a writing implement

  • make sure it works

  • make sure you are on the right page

  • make sure you can see the pad

  • make sure you can write in this position

  • make you start a page

  • make you consider writing more things

  • make it look to others like you know what you are doing (signalling that you are a note-taker is super important to help people get used to you as a note-taker and to encourage that persona onto you)

These are the reasons why I write the date; I can't emphasise enough that I don't care what the date is, but I write it anyway.

4. Other things I write:

  • Names of people I meet. Congratulations; you are one step closer to never forgetting the name of anyone ever. Also, when you want to think "when did I last see Bob?", you can kind of look it up in a dumb date-sorted list. (To be covered in my post about names – but it's a lot easier to look a name up five minutes later when you have it written down.)

  • Where I am/What event I am at. (nice to know what you go to sometimes)

  • What time I got here or what time it started (if it's a meeting)

  • What time it ended (or what time I stopped writing things)

It's at this point that the rest of the things you write become personal choices. Some of mine are:

  • Interesting thoughts I have had

  • Interesting quotes people say

  • Action points that I want to do if I can't do them immediately.

  • Shopping lists

  • diagrams of what you are trying to say.

  • Graphs you see.

  • the general topic of conversation as it changes. (so far this is enough for me to remember the entire conversation and who was there and what they had to say about the matter)


That's right. I said it. It's sexy. There are occasional discussion events near where I live that I go to with a notepad. Am I better than the average dude who shows up to chat? No. But everyone knows me. The guy who takes notes. And damn, they know I know what I am talking about. And damn, they all wish they were me. You know how glasses became a geek-culture signal? Well, this is one too. Like no other. Want to signal being a sharp human who knows what's going down? Carry a notebook, and show it off to people.

The coordinators have said to me, "It makes me so happy to see someone taking notes; it really makes me feel like I am saying something useful". The least I can do is take notes.


Other notes about notebooks

The number of brilliant people I know who carry a book of some kind far outweighs the number who don't. I don't usually trust the common opinion, but sometimes you just gotta go with what's right.

If it stops working, at least you tried it. If it works, you have evidence and can change the world in the future.

"I write in my phone" (which sounds a lot like "I could write notes in my phone") – I hear this a lot, especially in person while I am writing notes. Indeed you do. Which is why I am the one with a notebook out, and at the end of talking to you I will actually have notes and you will not. If you are genuinely the kind of person with notes in their phone, I commend you for doing something with technology that I cannot seem to sort out; but if you are like me, and a lot of other people who could always say they could take notes in their phone but never do, or never look at those notes... it's time to fix this.

A quote from a friend: “I realized in my mid-twenties that I would look like a complete badass in a decade, if I could point people to a shelf of my notebooks.” And I love this too.

A friend has suggested that flashcards are his brain, and notepads are not.  I agree that flashcards have benefits, namely to do with moving things around, shuffling, etc.  It really depends on what notes you are taking.  I quite like having a default chronology to things, but that might not work for you.

In our local Rationality Dojos we give away notebooks.  For the marginal cost of a book of paper, we are making people's lives better.

The big take away

Get a notebook; make notes; add value to your life.




This post took 3 hours to write over a week

Please add your experiences if your approach to note-taking differs.

Please fill out the survey on whether you found this post helpful.

Yudkowsky, Thiel, de Grey, Vassar panel on changing the world

12 NancyLebovitz 01 September 2015 03:57PM

30 minute panel

The first question was why isn't everyone trying to change the world, with the underlying assumption that everyone should be. However, it isn't obviously the case that the world would be better if everyone were trying to change it. For one thing, trying to change the world mostly means trying to change other people. If everyone were trying to do it, this would be a huge drain on everyone's attention. In addition, some people are sufficiently mean and/or stupid that their efforts to change the world make things worse.

At the same time, some efforts to change the world are good, or at least plausible. Is there any way to improve the filter so that we get more ambition from benign people without just saying everyone should try to change the world, even if they're Osama bin Laden?

The discussion of why there's too much duplicated effort in science didn't bring up the problem of funding, which is probably another version of the problem of people not doing enough independent thinking.

There was some discussion of people getting too hooked on competition, which is a way of getting a lot of people pointed at the same goal. 

Link thanks to Clarity

Should you write longer comments? (Statistical analysis of the relationship between comment length and ratings)

11 cleonid 20 July 2015 02:09PM

A few months ago we launched an experimental website. In brief, our goal is to create a platform where unrestricted freedom of speech would be combined with high quality of discussion. The problem can be approached from two directions. One is to help users navigate through content and quickly locate the higher quality posts. Another, which is the topic of this article, is to help users improve the quality of their own posts by providing them with meaningful feedback.

One important consideration for those who want to write better comments is how much detail to leave out. Our statistical analysis shows that for many users there is a strong connection between the ratings and the size of their comments. For example, for Yvain (Scott Alexander) and Eliezer_Yudkowsky, the average number of upvotes grows almost linearly with increasing comment length.



This trend, however, does not apply to all posters. For example, for the group of top ten contributors (in the last 30 days) to LessWrong, the average number of upvotes increases only slightly with the length of the comment (see the graph below).  For quite a few people the change even goes in the opposite direction – longer comments lead to lower ratings.



Naturally, even if your longer comments are rated higher than the short ones, this does not mean that inflating comments would always produce positive results. For most users (including popular writers, such as Yvain and Eliezer), the average number of downvotes increases with increasing comment length. The data also shows that long comments that get most upvotes are generally distinct from long comments that get most downvotes. In other words, long comments are fine as long as they are interesting, but they are penalized more when they are not.
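As a rough illustration of the kind of analysis described above (a sketch with made-up data, not Omnilibrium's actual code), one can bucket comments by length and compare the average votes per bucket:

```python
# Hypothetical sketch of the analysis described above: bucket comments by
# length and compare average upvotes/downvotes per bucket.  Data is made up.
from collections import defaultdict

# (comment length in characters, upvotes, downvotes) -- made-up records
comments = [(120, 2, 0), (340, 3, 1), (900, 7, 1),
            (1500, 9, 3), (80, 1, 0), (2200, 12, 5)]

buckets = defaultdict(list)
for length, up, down in comments:
    buckets[length // 500].append((up, down))   # 500-character buckets

for b in sorted(buckets):
    ups = [u for u, _ in buckets[b]]
    downs = [d for _, d in buckets[b]]
    print(f"{b*500}-{(b+1)*500} chars: "
          f"avg upvotes {sum(ups)/len(ups):.1f}, "
          f"avg downvotes {sum(downs)/len(downs):.1f}")
```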



The rating patterns vary significantly from person to person. For some posters, the average number of upvotes remains flat until the comment length reaches some threshold and then starts declining with increasing comment length. For others, the optimal comment length may be somewhere in the middle. (Users who have accounts on both Lesswrong and Omnilibrium can check the optimal length for their own comments on both websites by using this link.)

Obviously length is just one among many factors that affect comment quality and for most users it does not explain more than 20% of variation in their ratings. We have a few other ideas on how to provide people with meaningful feedback on both the style and the content of their posts. But before implementing them, we would like to get your opinions first. Would such feedback be actually useful to you?
