

A Year of Spaced Repetition Software in the Classroom

91 tanagrabeast 04 July 2015 10:30PM

Last year, I asked LW for some advice about spaced repetition software (SRS) that might be useful to me as a high school teacher. With said advice came a request to write a follow-up after I had accumulated some experience using SRS in the classroom. This is my report.

Please note that this was not a scientific experiment to determine whether SRS "works." Prior studies are already pretty convincing on this point and I couldn't think of a practical way to run a control group or "blind" myself. What follows is more of an informal debriefing for how I used SRS during the 2014-15 school year, my insights for others who might want to try it, and how the experience is changing how I teach.


Summary

SRS can raise student achievement even with students who won't use the software on their own, and even with frequent disruptions to the study schedule. Gains are most apparent among already high-performing students, but are also meaningful for the lowest performers. Deliberate effort is needed to get student buy-in, and getting the most out of SRS may require changes in course design.

The software

After looking into various programs, including the game-like Memrise, and even writing my own simple SRS, I ultimately went with Anki for its multi-platform availability, cloud sync, and ease-of-use. I also wanted a program that could act as an impromptu catch-all bin for the 2,000+ cards I would be producing on the fly throughout the year. (Memrise, in contrast, really needs clearly defined units packaged in advance).

The students

I teach 9th and 10th grade English at an above-average suburban American public high school in a below-average state. Mine are the lower "required level" students at a school with high enrollment in honors and Advanced Placement classes. Generally speaking, this means my students are mostly not self-motivated, are only very weakly motivated by grades, and will not do anything school-related outside of class no matter how much it would be in their interest to do so. There are, of course, plenty of exceptions, and my students span an extremely wide range of ability and apathy levels.

The procedure

First, what I did not do. I did not make Anki decks, assign them to my students to study independently, and then quiz them on the content. With honors classes I taught in previous years I think that might have worked, but I know my current students too well. Only about 10% of them would have done it, and the rest would have blamed me for their failing grades—with some justification, in my opinion.

Instead, we did Anki together, as a class, nearly every day.

As initial setup, I created a separate Anki profile for each class period. With a third-party add-on for Anki called Zoom, I enlarged the display font sizes to be clearly legible on the interactive whiteboard at the front of my room.

Nightly, I wrote up cards to reinforce new material and integrated them into the deck in time for the next day's classes. This averaged about 7 new cards per lesson period. These cards came in many varieties, but the three main types were:

  1. concepts and terms, often with reversed companion cards, sometimes supplemented with "what is this an example of" scenario cards.
  2. vocabulary, 3 cards per word: word/def, reverse, and fill-in-the-blank example sentence
  3. grammar, usually in the form of "What change(s), if any, does this sentence need?" Alternative cards had different permutations of the sentence.
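To make the vocabulary pattern concrete, here is a small sketch of how the three card variants for one word could be generated programmatically. The example word, definition, and the tab-separated output are my own illustrative choices; Anki can import tab-separated text files as front/back pairs.

```python
# Sketch: generate the three card variants for one vocabulary word.
# The word and definition below are illustrative, not from the course.

def vocab_cards(word, definition, sentence):
    """Return (front, back) pairs: word/def, reverse, fill-in-the-blank.

    `sentence` must contain the word itself so it can be blanked out.
    """
    blanked = sentence.replace(word, "_____")
    return [
        (word, definition),   # word -> definition
        (definition, word),   # reverse
        (blanked, word),      # fill-in-the-blank example sentence
    ]

if __name__ == "__main__":
    cards = vocab_cards(
        "perfunctory",
        "done quickly and carelessly, as a routine duty",
        "Her apology seemed perfunctory, as if she didn't mean it.",
    )
    for front, back in cards:
        print(front + "\t" + back)  # tab-separated, ready for Anki import
```

One function call per word keeps the nightly card-writing overhead close to zero for vocabulary.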

Weekly, I updated the deck to the cloud for self-motivated students wishing to study on their own.

Daily, I led each class in an Anki review of new and due cards for an average of 8 minutes per study day, usually as our first activity, at a rate of about 3.5 cards per minute. As each card appeared on the interactive whiteboard, I would read it out loud while students willing to share the answer raised their hands. Depending on the card, I might offer additional time to think before calling on someone to answer. Depending on their answer, and my impressions of the class as a whole, I might elaborate or offer some reminders, mnemonics, etc. I would then quickly poll the class on how they felt about the card by having them show a color by way of a small piece of card-stock divided into green, red, yellow, and white quadrants. Based on my own judgment (informed only partly by the poll), I would choose and press a response button in Anki, determining when we should see that card again.
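For readers unfamiliar with what those response buttons do: Anki's scheduler descends from SuperMemo's SM-2 algorithm, which grows each card's review interval by an "ease" multiplier on success and resets it on failure. A minimal sketch of the idea, with illustrative constants that are not Anki's actual ones:

```python
# Minimal SM-2-style scheduler sketch. Constants are illustrative;
# Anki's real scheduler differs in details (learning steps, fuzz, etc.).

def next_interval(interval_days, ease, grade):
    """Return (new_interval_days, new_ease) for a review response.

    grade: 'again', 'hard', 'good', or 'easy'.
    """
    if grade == "again":
        # Failed: relearn tomorrow, and make the card come up more often.
        return 1, max(1.3, ease - 0.2)
    if grade == "hard":
        return max(1, round(interval_days * 1.2)), max(1.3, ease - 0.15)
    if grade == "good":
        return max(1, round(interval_days * ease)), ease
    if grade == "easy":
        return max(1, round(interval_days * ease * 1.3)), ease + 0.15
    raise ValueError(grade)

interval, ease = 1, 2.5  # a typical starting point
for grade in ["good", "good", "good"]:
    interval, ease = next_interval(interval, ease, grade)
    print(grade, "->", interval, "days")
```

The exponential growth of intervals is why a classwide session only needs a few minutes a day: most mature cards are due weeks apart.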

End-of-year summary for one of my classes

[Data shown is from one of my five classes. We didn't start using Anki until a couple weeks into the school year.]

Opportunity costs

8 minutes is a significant portion of a 55 minute class period, especially for a teacher like me who fills every one of those minutes. Something had to give. For me, I entirely cut some varieties of written vocab reinforcement, and reduced the time we spent playing the team-based vocab/term review game I wrote for our interactive whiteboards some years ago. To a lesser extent, I also cut back on some oral reading comprehension spot-checks that accompany my whole-class reading sessions. On balance, I think Anki was a much better way to spend the time, but it's complicated. Keep reading.

Whole-class SRS not ideal

Every student is different, and each would get the most out of having a personal Anki profile determine when they should see each card. Also, most individuals could study many more cards per minute on their own than we averaged doing it together. (To be fair, a small handful of my students did use the software independently, judging from Ankiweb download stats.)

Getting student buy-in

Before we started using SRS I tried to sell my students on it with a heartfelt, over-prepared 20 minute presentation on how it works and the superpowers to be gained from it. It might have been a waste of time. It might have changed someone's life. Hard to say.

As for the daily class review, I induced engagement partly through participation points that were part of the final semester grade, and which students knew I tracked closely. Raising a hand could earn a kind of bonus currency, but was never required—unlike looking up front and showing colors during polls, which I insisted on. When I thought students were just reflexively holding up the same color and zoning out, I would sometimes spot check them on the last card we did and penalize them if warranted.

But because I know my students are not strongly motivated by grades, I think the most important influence was my attitude. I made it a point to really turn up the charm during review and play the part of the engaging game show host. Positive feedback. Coaxing out the lurkers. Keeping that energy up. Being ready to kill and joke about bad cards. Reminding classes how awesome they did on tests and assignments because they knew their Anki stuff.

(This is a good time to point out that the average review time per class period stabilized at about 8 minutes because I tried to end reviews before student engagement tapered off too much, which typically started happening at around the 6-7 minute mark. Occasional short end-of-class reviews mostly account for the difference.)

I also got my students more on the Anki bandwagon by showing them how it was directly linked to reduced note-taking requirements. If I could trust that they would remember something through Anki alone, why waste time waiting for them to write it down? They were unlikely to study from those notes anyway. And if they aren't looking down at their paper, they'll be paying more attention to me. I'd better come up with more cool things to tell them!

Making memories

Everything I had read about spaced repetition suggested it was a great reinforcement tool but not a good way to introduce new material. With that in mind, I tried hard to find or create memorable images, examples, mnemonics, and anecdotes that my Anki cards could become hooks for, and to get those cards into circulation as soon as possible. I even gave this method a mantra: "vivid memory, card ready".

When a student during review raised their hand, gave me a pained look, and said, "like that time when...." or "I can see that picture of..." as they struggled to remember, I knew I had done well. (And I would always wait a moment, because they would usually get it.)

Baby cards need immediate love

Unfortunately, if the card wasn't introduced quickly enough—within a day or two of the lesson—the entire memory often vanished and had to be recreated, killing the momentum of our review. This happened far too often—not because I didn't write the card soon enough (I stayed really on top of that), but because it didn't always come up for study soon enough. There were a few reasons for this:

  1. We often had too many due cards to get through in one session, and by default Anki puts new cards behind due ones.
  2. By default, Anki only introduces 20 new cards in one session (I soon uncapped this).
  3. Some cards were in categories that I gave lower priority to.

Two obvious cures for this problem:

  1. Make fewer cards. (I did get more selective as the year went on.)
  2. Have all cards prepped ahead of time and introduce new ones at the end of the class period they go with. (For practical reasons, not the least of which was that I didn't always know what cards I was making until after the lesson, I did not do this. I might be able to next year.)

Days off suck

SRS is meant to be used every day. When you take weekends off, you get a backlog of due cards. Not only do my students take every weekend and major holiday off (slackers), they have a few 1-2 week vacations built into the calendar. Coming back from a week's vacation means a 9-day backlog (due to the weekends bookending it). There's no good workaround for students who won't study on their own. The best I could do was run longer or multiple Anki sessions on return days to try to catch up with the backlog. It wasn't enough. The "caught up" condition was not normal for most classes at most points during the year, but rather something to aspire to and occasionally applaud ourselves for reaching. Some cards spent weeks or months on the bottom of the stack. Memories died. Baby cards emerged stillborn. Learning was lost.
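The backlog dynamic is easy to see in a toy simulation. The numbers below are made-up assumptions (25 cards coming due per day; about 30 reviewed per session, i.e. roughly 8 minutes at 3.5 cards/minute), not measurements from my classes:

```python
# Toy model of a due-card backlog: cards keep coming due every day,
# but reviews only happen on school days. All rates are assumptions.

DUE_PER_DAY = 25   # assumed cards coming due daily
CAPACITY = 30      # assumed: ~8 min/session at ~3.5 cards/min

def backlog_after(days, is_school_day):
    """Backlog size after `days` days; is_school_day(d) -> bool."""
    backlog = 0
    for d in range(days):
        backlog += DUE_PER_DAY
        if is_school_day(d):
            backlog = max(0, backlog - CAPACITY)
    return backlog

def weekday(d):
    return d % 7 < 5  # Mon-Fri school weeks, weekends off

def vacation(d):
    return weekday(d) and not (14 <= d < 21)  # week 3 is a vacation

print("after 4 normal weeks:", backlog_after(28, weekday))
print("with a 1-week vacation:", backlog_after(28, vacation))
```

Even with these toy numbers, weekends alone make the backlog creep upward, and a vacation week dumps its entire load of due cards on top, which matches the "aspire to be caught up" experience described above.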

Needless to say, the last weeks of the school year also had a certain silliness to them. When the class will never see the card again, it doesn't matter whether I push the button that says 11 days or the one that says 8 months. (So I reduced polling and accelerated our cards/minute rate.)

Never before SRS did I fully appreciate the loss of learning that must happen every summer break.


Prioritizing with subdecks

I kept each course's master deck divided into a few large subdecks. This was initially for organizational reasons, but I eventually started using it as a prioritizing tool. That happened after a curse-worthy discovery: if you tell Anki to review a deck made from subdecks, due cards from subdecks higher in the list are shown before cards from subdecks listed below, no matter how overdue they might be. From then on, on days when we were backlogged (most days), I would specifically review the concept/terminology subdeck for the current semester before any other subdecks, as these were my highest priority.
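A toy model of that discovery: due cards are drawn subdeck by subdeck in tree order, never globally by overdueness. (The most-overdue-first ordering within a subdeck is my simplifying assumption; the subdeck names are illustrative.)

```python
# Sketch of subdeck-ordered review: a badly overdue card in a lower
# subdeck still waits behind everything in the subdecks above it.

def review_order(subdecks):
    """subdecks: list of (name, [days_overdue, ...]) in tree order."""
    order = []
    for name, overdue_list in subdecks:
        # Within a subdeck, show the most overdue first (an assumption).
        for days in sorted(overdue_list, reverse=True):
            order.append((name, days))
    return order

decks = [
    ("Concepts-S2", [1, 3]),
    ("Vocabulary", [40, 2]),  # 40 days overdue, yet still waits its turn
    ("Grammar", [90]),
]
for name, days in review_order(decks):
    print(f"{name}: {days} days overdue")
```

Putting the high-priority subdeck first in the tree therefore doubles as a crude priority system when you can't clear the whole backlog.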

On a couple of occasions, I also used Anki's study deck tools to create temporary decks of especially high-priority cards.

Seizing those moments

Veteran teachers start acquiring a sense of when it might be a good time to go off book and teach something that isn't in the unit, and maybe not even in the curriculum. Maybe it's teaching exactly the right word to describe a vivid situation you're reading about, or maybe it's advice on what to do in a certain type of emergency that nearly happened. As the year progressed, I found myself humoring my instincts more often because of a new confidence that I can turn an impressionable moment into a strong memory and lock it down with a new Anki card. I don't even care if it will ever be on a test. This insight has me questioning a great deal of what I thought I knew about organizing a curriculum. And I like it.

A lifeline for low performers

An accidental discovery came from having written some cards that were, it was immediately obvious to me, much too easy. I was embarrassed to even be reading them out loud. Then I saw which hands were coming up.

In any class you'll get some small number of extremely low performers who never seem to be doing anything that we're doing, and, when confronted, deny that they have any ability whatsoever. Some of the hands I was seeing were attached to these students. And you better believe I called on them.

It turns out that easy cards are really important because they can give wins to students who desperately need them. Knowing a 6th grade level card in a 10th grade class is no great achievement, of course, but the action takes what had been negative morale and nudges it upward. And it can trend. I can build on it. A few of these students started making Anki the thing they did in class, even if they ignored everything else. I can confidently name one student I'm sure passed my class only because of Anki. Don't get me wrong—he just barely passed. Most cards remained over his head. Anki was no miracle cure here, but it gave him and me something to work with that we didn't have when he failed my class the year before.

A springboard for high achievers

It's not even fair. The lowest students got something important out of Anki, but the highest achievers drank it up and used it for rocket fuel. When people ask who's widening the achievement gap, I guess I get to raise my hand now.

I refuse to feel bad for this. Smart kids are badly underserved in American public schools thanks to policies that encourage staff to focus on that slice of students near (but not at) the bottom—the ones who might just barely be able to pass the state test, given enough attention.

Where my bright students might have been used to high Bs and low As on tests, they were now breaking my scales. You could see it in the multiple choice, but it was most obvious in their writing: they were skillfully working in terminology at an unprecedented rate, and making way more attempts to use new vocabulary—attempts that were, for the most part, successful.

Given the seemingly objective nature of Anki, it might seem counterintuitive that the benefits were more obvious in writing than in multiple choice. But it makes sense when I consider that even without SRS these students probably would have known the terms and the vocab well enough to get multiple choice questions right; what they lacked was the confidence to use them on their own initiative. Anki gave them that extra confidence.

A wash for the apathetic middle?

I'm confident that about a third of my students got very little out of our Anki review. They were either really good at faking involvement while they zoned out, or didn't even try to pretend and just took the hit to their participation grade day after day, no matter what I did or who I contacted.

These weren't even necessarily failing students—just the apathetic middle that's smart enough to remember some fraction of what they hear and regurgitate some fraction of that at the appropriate times. Review of any kind holds no interest for them. It's a rerun. They don't really know the material, but they tell themselves that they do, and they don't care if they're wrong.

On the one hand, these students are no worse off with Anki than they would have been with the activities it replaced, and nobody cries when average kids get average grades. On the other hand, I'm not OK with this... but so far I don't like any of my ideas for what to do about it.

Putting up numbers: a case study

For unplanned reasons, I taught a unit at the start of a quarter that I didn't formally test them on until the end of said quarter. Historically, this would have been a disaster. In this case, it worked out well. For five weeks, Anki was the only ongoing exposure they were getting to that unit, but it proved to be enough. Because I had given the same test as a pre-test early in the unit, I have some numbers to back it up. The test was all multiple choice, with two sections: the first was on general terminology and concepts related to the unit. The second was a much harder reading comprehension section.

As expected, scores did not go up much on the reading comprehension section. Overall reading levels are very difficult to boost in the short term and I would not expect any one unit or quarter to make a significant difference. The average score there rose by 4 percentage points, from 48 to 52%.

Scores in the terminology and concept section were more encouraging. For material we had not covered until after the pre-test, the average score rose by 22 percentage points, from 53 to 75%. No surprise there either, though; it's hard to say how much credit we should give to SRS for that.

But there were also a number of questions about material we had already covered before the pretest. Being the earliest material, I might have expected some degradation in performance on the second test. Instead, the already strong average score in that section rose by an additional 3 percentage points, from 82 to 85%. (These numbers are less reliable because of the smaller number of questions, but they tell me Anki at least "locked in" the older knowledge, and may have strengthened it.)

Some other time, I might try reserving a section of content that I teach before the pre-test but don't make any Anki cards for. This would give me a way to compare Anki to an alternative review exercise.

What about formal standardized tests?

I don't know yet. The scores aren't back. I'll probably be shown some "value added" analysis numbers at some point that tell me whether my students beat expectations, but I don't know how much that will tell me. My students were consistently beating expectations before Anki, and the state gave an entirely different test this year because of legislative changes. I'll go back and revise this paragraph if I learn anything useful.

Those discussions...

If I'm trying to acquire a new skill, one of the first things I try to do is listen to skilled practitioners of that skill talk about it to each other. What are the terms-of-art? How do they use them? What does this tell me about how they see their craft? Their shorthand is a treasure trove of crystallized concepts; once I can use it the same way they do, I find I'm working at a level of abstraction much closer to theirs.

Similarly, I was hoping Anki could help make my students more fluent in the subject-specific lexicon that helps you score well in analytical essays. After introducing a new term and making the Anki card for it, I made extra efforts to use it conversationally. I used to shy away from that because so many students would have forgotten it immediately and tuned me out for not making any sense. Not this year. Once we'd seen the card, I used the term freely, with only the occasional reminder of what it meant. I started using multiple terms in the same sentence. I started talking about writing and analysis the way my fellow experts do, and so invited them into that world.

Even though I was already seeing written evidence that some of my high performers had assimilated the lexicon, the high quality discussions of these same students caught me off guard. You see, I usually dread whole-class discussions with non-honors classes because good comments are so rare that I end up dejectedly spouting all the insights I had hoped they could find. But by the end of the year, my students had stepped up.

I think what happened here was, as with the writing, as much a boost in confidence as a boost in fluency. Whatever it was, they got into some good discussions where they used the terminology and built on it to say smarter stuff.

Don't get me wrong. Most of my students never got to that point. But on average even small groups without smart kids had a noticeably higher level of discourse than I am used to hearing when I break up the class for smaller discussions.


Limitations of SRS

SRS is inherently weak when it comes to the abstract and complex. No card I've devised enables a student to develop a distinctive authorial voice, or write essay openings that reveal just enough to make the reader curious. Yes, you can make cards about strategies for this sort of thing, but these were consistently my worst cards: the overly difficult "leeches" that I eventually suspended from my decks.

A less obvious limitation of SRS is that students with a very strong grasp of a concept often fail to apply that knowledge in more authentic situations. For instance, they may know perfectly well the difference between "there", "their", and "they're", but never pause to think carefully about whether they're using the right one in a sentence. I am very open to suggestions about how I might train my students' autonomous "System 1" brains to have "interrupts" for that sort of thing... or even just a reflex to go back and check after finishing a draft.

Moving forward

I absolutely intend to continue using SRS in the classroom. Here's what I intend to do differently this coming school year:

  • Reduce the number of cards by about 20%, to maybe 850-950 for the year in a given course, mostly by reducing the number of variations on some overexposed concepts.
  • Be more willing to add extra Anki study sessions to stay better caught-up with the deck, even if this means my lesson content doesn't line up with class periods as neatly.
  • Be more willing to press the red button on cards we need to re-learn. I think I was too hesitant here because we were rarely caught up as it was.
  • Rework underperforming cards to be simpler and more fun.
  • Use more simple cloze deletion cards. I only had a few of these, but they worked better than I expected for structured idea sets like, "characteristics of a tragic hero".
  • Take a less linear and more opportunistic approach to introducing terms and concepts.
  • Allow for more impromptu discussions where we bring up older concepts in relevant situations and build on them.
  • Shape more of my lessons around the "vivid memory, card ready" philosophy.
  • Continue to reduce needless student note-taking.
  • Keep a close eye on 10th grade students who had me for 9th grade last year. I wonder how much they retained over the summer, and I can't wait to see what a second year of SRS will do for them.

Suggestions and comments very welcome!

Experiences in applying "The Biodeterminist's Guide to Parenting"

60 juliawise 17 July 2015 07:19PM

I'm posting this because LessWrong was very influential on how I viewed parenting, particularly the emphasis on helping one's brain work better. In this context, creating and influencing another person's brain is an awesome responsibility.

It turned out to be a lot more anxiety-provoking than I expected. I don't think that's necessarily a bad thing, as the possibility of screwing up someone's brain should make a parent anxious, but it's something to be aware of. I've heard some blithe "Rational parenting could be a very high-impact activity!" statements from childless LWers who may be interested to hear some experiences in actually applying that.

One thing that really scared me about trying to raise a child with the healthiest-possible brain and body was the possibility that I might not love her if she turned out to not be smart. 15 months in, I'm no longer worried. Evolution has been very successful at producing parents and children that love each other despite their flaws, and our family is no exception. Our daughter Lily seems to be doing fine, but if she turns out to have disabilities or other problems, I'm confident that we'll roll with the punches.


Cross-posted from The Whole Sky.


Before I got pregnant, I read Scott Alexander's (Yvain's) excellent Biodeterminist's Guide to Parenting and was so excited to have this knowledge. I thought how lucky my child would be to have parents who knew and cared about how to protect her from things that would damage her brain.

Real life, of course, got more complicated. It's one thing to intend to avoid neurotoxins, but another to arrive at the grandparents' house and find they've just had ant poison sprayed. What do you do then?

Here are some tradeoffs Jeff and I have made between things that are good for children in one way but bad in another, or things that are good for children but really difficult or expensive.

Germs and parasites

The hygiene hypothesis states that lack of exposure to germs and parasites increases risk of auto-immune disease. Our pediatrician recommended letting Lily play in the dirt for this reason.

While exposure to animal dander and pollution increases asthma risk later in life, it seems that being exposed to these in the first year of life actually protects against asthma. Apparently if you're going to live in a house with roaches, you should do it in the first year or not at all.

Except some stuff in dirt is actually bad for you.

Scott writes:

Parasite-infestedness of an area correlates with national IQ at about r = -0.82. The same is true of US states, with a slightly reduced correlation coefficient of -0.67 (p<0.0001). . . . When an area eliminates parasites (like the US did for malaria and hookworm in the early 1900s) the IQ for the area goes up at about the right time.

Living with cats as a child seems to increase risk of schizophrenia, apparently via toxoplasmosis. But in order to catch toxoplasmosis from a cat, you have to eat its feces during the two weeks after it first becomes infected (which it’s most likely to do by eating birds or rodents carrying the disease). This makes me guess that most kids get it through tasting a handful of cat litter, dirt from the yard, or sand from the sandbox rather than simply through cat ownership. We live with indoor cats who don’t seem to be mousers, so I’m not concerned about them giving anyone toxoplasmosis. If we build Lily a sandbox, we’ll keep it covered when not in use.

The evidence is mixed about whether infections like colds during the first year of life increase or decrease your risk of asthma later. After the newborn period, we defaulted to being pretty casual about germ exposure.

Toxins in buildings

Our experiences with lead. Our experiences with mercury.

In some areas, it’s not that feasible to live in a house with zero lead. We live in Boston, where 87% of the housing was built before lead paint was banned. Even in a new building, we’d need to go far out of town before reaching soil that wasn’t near where a lead-painted building had been.

It is possible to do some renovations without exposing kids to lead. Jeff recently did some demolition of walls with lead paint, very carefully sealed off and cleaned up, while Lily and I spent the day elsewhere. Afterwards her lead level was no higher than it had been.

But Jeff got serious lead poisoning as a toddler while his parents did major renovations on their old house. If I didn’t think I could keep the child away from the dust, I wouldn’t renovate.

Recently a house across the street from us was gutted, with workers throwing debris out the windows and creating big plumes of dust (presumably lead-laden) that blew all down the street. Later I realized I should have called city building inspection services, which would at least have made them carry the debris into the dumpster instead of throwing it from the second story.

Floor varnish releases formaldehyde and other nasties as it cures. We kept Lily out of the house for a few weeks after Jeff redid the floors. We found it worthwhile to pay rent at our previous house in order to not have to live in the new house while this kind of work was happening.


Pressure-treated wood was treated with arsenic and chromium until around 2004 in the US. It has a greenish tint, though this may have faded with time. Playing on playsets or decks made of such wood increases children's cancer risk. It should not be used for furniture (I thought this would be obvious, but apparently it wasn't to some of my handyman relatives).

I found it difficult to know how to deal with fresh paint and other fumes in my building at work while I was pregnant. Women of reproductive age have a heightened sense of smell, and many pregnant women have heightened aversion to smells, so you can literally smell things some of your coworkers can’t (or don’t mind). The most critical period of development is during the first trimester, when most women aren’t telling the world they’re pregnant (because it’s also the time when a miscarriage is most likely, and if you do lose the pregnancy you might not want to have to tell the world). During that period, I found it difficult to explain why I was concerned about the fumes from the roofing adhesive being used in our building. I didn’t want to seem like a princess who thought she was too good to work in conditions that everybody else found acceptable. (After I told them I was pregnant, my coworkers were very understanding about such things.)


Food

Recommendations usually focus on what you should eat during pregnancy, but obviously children’s brain development doesn’t stop there. I’ve opted to take precautions with the food Lily and I eat for as long as I’m nursing her.

Claims that pesticide residues are poisoning children scare me, although most scientists seem to think the paper cited is overblown. Other sources say the levels of pesticides in conventionally grown produce are fine. We buy organic produce at home but eat whatever we’re served elsewhere.

I would love to see a study with families randomly selected to receive organic produce for the first 8 years of the kids’ lives, then looking at IQ and hyperactivity. But no one’s going to do that study because of how expensive 8 years of organic produce would be.

The Biodeterminist’s Guide doesn’t mention PCBs in the section on fish, but fish (particularly farmed salmon) are a major source of these pollutants. They don’t seem to be as bad as mercury, but are neurotoxic. Unfortunately their half-life in the body is around 14 years, so if you have even a vague idea of getting pregnant ever in your life you shouldn’t be eating farmed salmon (or Atlantic/farmed salmon, bluefish, wild striped bass, white and Atlantic croaker, blackback or winter flounder, summer flounder, or blue crab).

I had the best intentions of eating lots of the right kind of high-omega-3, low-pollutant fish during and after pregnancy. Unfortunately, fish was the only food I developed an aversion to. Now that Lily is eating food on her own, we tried several sources of omega-3 and found that kippered herring was the only success. Lesson: it’s hard to predict what foods kids will eat, so keep trying.

In terms of hassle, I underestimated how long I would be “eating for two” in the sense that anything I put in my body ends up in my child’s body. Counting pre-pregnancy (because mercury has a half-life of around 50 days in the body, so sushi you eat before getting pregnant could still affect your child), pregnancy, breastfeeding, and presuming a second pregnancy, I’ll probably spend about 5 solid years feeding another person via my body, sometimes two children at once. That’s a long time in which you have to consider the effect of every medication, every cup of coffee, every glass of wine on your child. There are hardly any medications considered completely safe during pregnancy and lactation; most things are in Category C, meaning there’s some evidence from animal trials that they may be bad for human children.
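The half-lives quoted above (about 50 days for mercury, about 14 years for PCBs) make the timescales easy to sanity-check with simple exponential decay. This is a back-of-envelope sketch only; real pharmacokinetics is messier than a single first-order process.

```python
import math

# Back-of-envelope decay from the half-lives quoted in the text.
# Treats elimination as simple exponential decay (a simplification).

def fraction_remaining(days, half_life_days):
    return 0.5 ** (days / half_life_days)

def days_to_fraction(target, half_life_days):
    """Days until only `target` fraction of the original dose remains."""
    return half_life_days * math.log(target, 0.5)

# Mercury, half-life ~50 days: mostly cleared within a few months.
print("mercury after 6 months: %.0f%%" % (100 * fraction_remaining(180, 50)))
print("mercury, days until 10%% remains: %.0f" % days_to_fraction(0.10, 50))

# PCBs, half-life ~14 years: the bulk is still there years later.
print("PCBs after 5 years: %.0f%%" % (100 * fraction_remaining(5 * 365, 14 * 365)))
```

This is why the pre-pregnancy window matters for mercury (months) while PCB exposure is effectively a decision about your whole reproductive lifetime.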


Fluoride

Too much fluoride is bad for children’s brains. The CDC recently recommended lowering fluoride levels in municipal water (though apparently because of concerns about tooth discoloration more than neurotoxicity). Around the same time, the American Dental Association began recommending the use of fluoride toothpaste as soon as babies have teeth, rather than waiting until they can rinse and spit.

Cavities are actually a serious problem even in baby teeth, because of the pain and possible infection they cause children. Pulling them messes up the alignment of adult teeth. Drilling on children too young to hold still requires full anesthesia, which is dangerous itself.

But Lily isn’t particularly at risk for cavities. 20% of children get a cavity by age six, and they are disproportionately poor, African-American, and particularly Mexican-American children (presumably because of different diet and less ability to afford dentists). 75% of cavities in children under 5 occur in 8% of the population.

We decided to have Lily brush without toothpaste, avoid juice and other sugary drinks, and see the dentist regularly.

Home pesticides

One of the most commonly applied insecticides makes kids less smart. This isn’t too surprising, given that it kills insects by disabling their nervous system. But it’s not something you can observe on a small scale, so it’s not surprising that the exterminator I talked to brushed off my questions with “I’ve never heard of a problem!”

If you get carpenter ants in your house, you basically have to choose between poisoning them or letting them structurally damage the house. We’ve only seen a few so far, but if the problem progresses, we plan to:

1) remove any rotting wood in the yard where they could be nesting

2) have the perimeter of the building sprayed

3) place gel bait in areas kids can’t access

4) only then spray poison inside the house.

If we have mice we’ll plan to use mechanical traps rather than poison.

Flame retardants

Since the 1970s, California has required a high degree of flame resistance from furniture. In practice, this meant that US manufacturers sprayed flame-retardant chemicals on anything made of polyurethane foam, such as sofas, rug pads, nursing pillows, and baby mattresses.

The law recently changed, due to growing acknowledgement that the carcinogenic and neurotoxic chemicals were more dangerous than the fires they were supposed to be preventing. Even firefighters opposed the use of the flame retardants, because when people die in fires it’s usually from smoke inhalation rather than burns, and firefighters don’t want to breathe the smoke from your toxic sofa (which will eventually catch fire even with the flame retardants).

We’ve opted to use furniture from companies that have stopped using flame retardants (like Ikea and others listed here). Apparently futons are okay if they’re stuffed with cotton rather than foam. We also have some pre-1970s furniture that tested clean for flame retardants. You can get foam samples tested for free.

The main vehicle for children ingesting the flame retardants is that it settles into dust on the floor, and children crawl around in the dust. If you don’t want to get rid of your furniture, frequent damp-mopping would probably help.

The standards for mattresses are so stringent that the chemical sprays aren’t generally used, and instead most mattresses are wrapped in a flame-resistant barrier which apparently isn’t toxic. I contacted the companies that made our mattresses and they’re fine.

Ratings for chemical safety of children’s car seats here.

Thoughts on IQ

A lot of people, when I start talking like this, say things like “Well, I lived in a house with lead paint/played with mercury/etc. and I’m still alive.” And yes, I played with mercury as a child, and Jeff is still one of the smartest people I know even after getting acute lead poisoning as a child.

But I do wonder if my mind would work a little better without the mercury exposure, and if Jeff would have had an easier time in school without the hyperactivity (a symptom of lead exposure). Given the choice between a brain that works a little better and one that works a little worse, who wouldn’t choose the one that works better?

We’ll never know how an individual’s nervous system might have been different with a different childhood. But we can see population-level effects. The Environmental Protection Agency, for example, is fine with calculating the expected benefit of making coal plants stop releasing mercury by looking at the expected gains in terms of children’s IQ and increased earnings.

Scott writes:

A 15 to 20 point rise in IQ, which is a little more than you get from supplementing iodine in an iodine-deficient region, is associated with half the chance of living in poverty, going to prison, or being on welfare, and with only one-fifth the chance of dropping out of high-school (“associated with” does not mean “causes”).

Salkever concludes that for each lost IQ point, males experience a 1.93% decrease in lifetime earnings and females experience a 3.23% decrease. If Lily would earn about what I do, saving her one IQ point would save her $1600 a year or $64000 over her career. (And that’s not counting the other benefits she and others will reap from her having a better-functioning mind!) I use that for perspective when making decisions. $64000 would buy a lot of the posh prenatal vitamins that actually contain iodine, or organic food, or alternate housing while we’re fixing up the new house.
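For the curious, the arithmetic above can be reproduced directly. The $50,000 salary and 40-year career below are illustrative assumptions chosen to match the figures in the paragraph, not numbers from Salkever's paper:

```python
# Reproducing the per-IQ-point valuation above. The salary and career
# length are illustrative assumptions, not figures from Salkever's paper.
FEMALE_LOSS_PER_IQ_POINT = 0.0323   # Salkever: 3.23% of lifetime earnings
ANNUAL_INCOME = 50_000              # assumed annual salary
CAREER_YEARS = 40                   # assumed career length

annual_value = ANNUAL_INCOME * FEMALE_LOSS_PER_IQ_POINT
career_value = annual_value * CAREER_YEARS
print(round(annual_value))   # ~1615 dollars/year, i.e. the ~$1600 above
print(round(career_value))   # ~64600 dollars, i.e. the ~$64000 above
```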


There are times when Jeff and I prioritize social relationships over protecting Lily from everything that might harm her physical development. It’s awkward to refuse to go to someone’s house because of the chemicals they use, or to refuse to eat food we’re offered. Social interactions are good for children’s development, and we value those as well as physical safety. And there are times when I’ve had to stop being so careful because I was getting paralyzed by anxiety (literally perched in the rocker with the baby trying not to touch anything after my in-laws scraped lead paint off the outside of the house).

But we also prioritize neurological development more than most parents, and we hope that will have good outcomes for Lily.

Astronomy, Astrobiology, & The Fermi Paradox I: Introductions, and Space & Time

37 CellBioGuy 26 July 2015 07:38AM

This is the first in a series of posts I am putting together on a personal blog I just started two days ago as a collection of my musings on astrobiology ("The Great A'Tuin" - sorry, I couldn't help it), and will be reposting here.  Much has been written here about the Fermi paradox and the 'great filter'.   It seems to me that going back to a somewhat more basic level of astronomy and astrobiology is extremely informative to these questions, and so this is what I will be doing.  The bloggery is intended for a slightly more general audience than this site (hence much of the content of the introduction) but I think it will be of interest.  Many of the points I will be making are ones I have touched on in previous comments here, but hope to explore in more detail.

This post is a combined version of my first two posts - an introduction, and a discussion of our apparent position in space and time in the universe.  The blog posts may be found at:

Text reproduced below.



What's all this about?

This blog is to be a repository for the thoughts and analysis I've accrued over the years on the topic of astrobiology, and the place of life and intelligence in the universe.  All my life I've been pulled to the very large and the very small.  Life has always struck me as the single most interesting thing on Earth, with its incredibly fine structure and vast, amazing history and fantastic abilities.  At the same time, the vast majority of what exists is NOT on Earth.  Going up in size from human scale by the same number of orders of magnitude that you go down to reach a hydrogen atom, you get just about to Venus at its closest approach to Earth - or one billionth the distance to the nearest star.  The large is much larger than the small is small.  On top of this, we now know that the universe as we know it is much older than life on Earth.  And we know so little of the vast majority of the universe.

There's a strong tendency towards specialization in the sciences.  These days, there pretty much has to be for anybody to get anywhere.  Much of the great foundational work of physics was done on tabletops, and the law of gravitation was derived from data on the motions of the planets taken without the benefit of so much as a telescope.  All the low-hanging fruit has been picked.  To continue to further knowledge of the universe, huge instruments and vast energies are put to bear in astronomy and physics.  Biology is arguably a bit different, but the very complexity that makes living systems so successful and so fascinating to study means that there is so much to study that any one person is often only looking at a very small problem.

This has distinct drawbacks.  The universe does not care for our abstract labels of fields and disciplines - it simply is, at all scales simultaneously at all times and in all places.  When people focus narrowly on their subject of interest, it can prevent them from realizing the implications of their findings on problems usually considered a different field.

It is one of my hopes to try to bridge some gaps between biology and astronomy here.  I very nearly double-majored in biology and astronomy in college; the only thing that prevented this (leading to an astronomy minor) was a bad attitude towards calculus.  As is, I am a graduate student studying basic cell biology at a major research university, who nonetheless keeps in touch with a number of astronomer friends and keeps up with the field as much as possible.  I quite often find that what I hear and read about has strong implications for questions of life elsewhere in the universe, but see so few of these implications actually get publicly discussed. All kinds of information shedding light on our position in space and time, the origins of life, the habitability of large chunks of the universe, the course that biospheres take, and the possible trajectories of intelligences seem to me to be out there unremarked.

It is another of my hopes to try, as much as is humanly possible, to take a step back from the usual narratives about extraterrestrial life and instead work from something closer to first principles.  What we actually have observed and have not, what we can observe and what we cannot, and what this leaves open, likely, or unlikely.  In my study of the history of the ideas of extraterrestrial life and extraterrestrial intelligence, all too often these take a back seat to popular narratives of the day.  In the 16th century the notion that the Earth moved in a similar way to the planets gained currency and led to the suppositions that the planets might be made of similar stuff and might even be inhabited.  The hot question was, of course, whether their inhabitants would be Christians, and what their relationship with God was given the anthropocentric biblical creation stories.  In the late 19th and early 20th century, Lowell's illusory canals on Mars were advanced as evidence for a Martian socialist utopia.  In the 1970s, Carl Sagan waxed philosophical on the notion that contacting old civilizations might teach us how to save ourselves from nuclear warfare.  Today, many people focus on the Fermi paradox - the apparent contradiction that since much of the universe is quite old, extraterrestrials experiencing continuing technological progress and growth should have colonized and remade it in their image long ago, and yet we see no evidence of this.  I move that all of these notions have a similar root - inflating the hot concerns and topics of the day to cosmic significance and letting them obscure the actual, scientific questions that can be asked and answered.

Life and intelligence in the universe is a topic worth careful consideration, from as many angles as possible.  Let's get started.


Space and Time

Those of an anthropic bent have often made much of the fact that we are only 13.7 billion years into what is apparently an open-ended universe that will expand at an accelerating rate forever.  The era of the stars will last a trillion years; why do we find ourselves at this early date if we assume we are a ‘typical’ example of an intelligent observer?  In particular, this has lent support to lines of argument that perhaps the answer to the ‘great silence’ and lack of astronomical evidence for intelligence or its products in the universe is that we are simply the first.  This notion requires, however, that we are actually early in the universe when it comes to the origin of biospheres and by extension intelligent systems.  It has become clear recently that this is not the case. 

The clearest research I can find illustrating this is the work of Sobral et al, illustrated here via a paper on arxiv  and here via a summary article.  To simplify what was done, these scientists performed a survey of a large fraction of the sky looking for the emission lines put out by emission nebulae, clouds of gas which glow like neon lights excited by the ultraviolet light of huge, short-lived stars.  The amount of line emission from a galaxy is thus a rough proxy for the rate of star formation – the greater the rate of star formation, the larger the number of large stars exciting interstellar gas into emission nebulae.  The authors use redshift of the known hydrogen emission lines to determine the distance to each instance of emission, and performed corrections to deal with the known expansion rate of the universe.  The results were striking.  Per unit mass of the universe, the current rate of star formation is less than 1/30 of the peak rate they measured 11 gigayears ago.  It has been constantly declining over the history of the universe at a precipitous rate.  Indeed, their preferred model to which they fit the trend converges towards a finite quantity of stars formed as you integrate total star formation into the future to infinity, with the total number of stars that will ever be born only being 5% larger than the number of stars that have been born at this time. 

In summary, 95% of all stars that will ever exist already exist.  The smallest, longest-lived stars will shine for a trillion years, but for most of their history almost no new stars will have formed.
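To see how a steeply declining star formation rate implies a finite total, here is a toy calculation. Sobral et al fit a different (and better) functional form; the exponential below is assumed purely for illustration:

```python
# Toy model: an exponentially declining star formation rate (SFR),
# calibrated so the SFR 11 Gyr ago was ~30x today's. Sobral et al fit a
# different functional form; this only shows how the integral converges.
import math

T_NOW = 13.7                    # current age of the universe, Gyr
TAU = 11 / math.log(30)         # implied e-folding time, ~3.2 Gyr

# Fraction of all stars ever (integrating the SFR to t = infinity) that
# have already formed by T_NOW:
formed_by_now = 1 - math.exp(-T_NOW / TAU)
print(round(formed_by_now, 3))  # ~0.99 in this toy model (vs ~0.95 in the paper)
```

Even this crude model reproduces the qualitative result: nearly all the stars that will ever form have already formed.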

At first this seems to reverse the initial conclusion that we came early, suggesting we are instead latecomers.  This is not true, however, when you consider where and when stars of different types can form and the fact that different galaxies have very different histories.  Most galaxies formed via gravitational collapse from cool gas clouds and smaller precursor galaxies quite a long time ago, with a wide variety of properties.  Dwarf galaxies have low masses, and their early bursts of star formation lead to energetic stars with strong stellar winds and lots of ultraviolet light which eventually go supernova.  Their energetic lives and even more energetic deaths appear to usually blast star-forming gases out of their galaxies’ weak gravity or render it too hot to re-collapse into new star-forming regions, quashing their star formation early.  Giant elliptical galaxies, containing many trillions of stars apiece and dominating the cores of galactic clusters, have ample gravity but form with nearly no angular momentum.  As such, most of their cool gas falls straight into their centers, producing an enormous burst of low-heavy-element star formation that uses most of the gas.  The remaining gas is again either blasted into intergalactic space or rendered too hot to recollapse and accrete by a combination of the action of energetic young stars and the infall of gas onto the central black hole producing incredibly energetic outbursts.   (It should be noted that a full 90% of the non-dark-matter mass of the universe appears to be in the form of very thin X-ray-hot plasma clouds surrounding large galaxy clusters, unlikely to condense to the point of star formation via understood processes.)  Thus, most dwarf galaxies and giant elliptical galaxies contributed to the early star formation of the universe but are producing few or no stars today, have very low levels of heavy element rich stars, and are unlikely to make many more going into the future.

Spiral galaxies are different.  Their distinguishing feature is the way they accreted – namely with a large amount of angular momentum.  This allows large amounts of their cool gas to remain spread out away from their centers.  This moderates the rate of star formation, preventing the huge pulses of star formation and black hole activation that exhausts star-forming gas and prevents gas inflow in giant ellipticals.  At the same time, their greater mass than dwarf galaxies ensures that the modest rate of star formation they do undergo does not blast nearly as much matter out of their gravitational pull.  Some does leave over time, and their rate of inflow of fresh cool gas does apparently decrease over time – there are spiral galaxies that do seem to have shut down star formation.  But on the whole a spiral is a place that maintains a modest rate of star formation for gigayears, while heavy elements get more and more enriched over time.  These galaxies thus dominate the star production in the later eras of the universe, and dominate the population of stars produced with large amounts of heavy elements needed to produce planets like ours.  They do settle down slowly over time, and eventually all spirals will either run out of gas or merge with each other to form giant ellipticals, but for a long time they remain a class apart.

Considering this, we’re just about where we would expect a planet like ours (and thus a biosphere-as-we-know-it) to exist in space and on a coarse scale in time.  Let’s look closer at our galaxy now.  Our galaxy is generally agreed to be about 12 billion years old based on the ages of globular clusters, with a few interloper stars here and there that are older and would’ve come from an era before the galaxy was one coherent object.  It will continue forming stars for about another 5 gigayears, at which point it will undergo a merger with the Andromeda galaxy, the nearest large spiral galaxy.  This merger will most likely put an end to star formation in the combined resultant galaxy, which will probably wind up as a large elliptical after one final exuberant starburst.  Our solar system formed about 4.5 gigayears ago, putting its formation pretty much halfway along the productive lifetime of the galaxy (and probably something like 2/3 of the way along its complement of stars produced, since spirals DO settle down with age, though more of its later stars will be metal-rich).

On a stellar and planetary scale, we once again find ourselves where and when we would expect your average complex biosphere to be.  Large stars die fast – star brightness goes up with the 3.5th power of star mass, and thus star lifetime goes down with the 2.5th power of mass.  A 2 solar mass star would be 11 times as bright as the sun and only live about 2 billion years – a point in the evolution of life on Earth before photosynthesis had managed to oxygenate the air, when the majority of life on Earth (but not all – see an upcoming post) could be described as “algae”.  Furthermore, although smaller stars are much more common than larger stars (the Sun is actually larger than over 80% of stars in the universe), stars smaller than about 0.5 solar masses (and thus 0.08 solar luminosities) are usually ‘flare stars’ – possessing very strong, convoluted magnetic fields and periodically putting out flares and X-ray bursts that would frequently strip away the ozone and possibly even the atmosphere of an earthlike planet. 
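These scaling relations are easy to play with. A minimal sketch, assuming the rough power laws quoted above and a ~10-gigayear solar main-sequence lifetime (a standard ballpark figure, not a number from this post):

```python
# Rough main-sequence scalings quoted above: L ~ M^3.5, so
# lifetime ~ M/L ~ M^-2.5 (all in solar units).
SUN_LIFETIME_GYR = 10.0  # assumed ballpark solar lifetime

def luminosity(mass):
    """Luminosity in solar units for a star of `mass` solar masses."""
    return mass ** 3.5

def lifetime(mass):
    """Approximate main-sequence lifetime in Gyr."""
    return SUN_LIFETIME_GYR * mass ** -2.5

print(round(luminosity(2.0), 1))  # 11.3: the "11 times as bright" above
print(round(lifetime(2.0), 1))    # 1.8 Gyr: the "about 2 billion years" above
print(round(luminosity(0.5), 2))  # 0.09: the "0.08 solar luminosities" ballpark
```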

All stars also slowly brighten as they age – the sun is currently about 30% brighter than it was when it formed, and it will wind up about twice as bright as its initial value just before it becomes a red giant.  Depending on whose models of climate sensitivity you use, the Earth’s biosphere probably has somewhere between 250 million years and 2 billion years before the oceans boil and we become a second Venus.  Thus, we find ourselves in the latter third-to-twentieth of the history of Earth’s biosphere (consistent with complex life taking time to evolve).

Together, all this puts our solar system – and by extension our biosphere – pretty much right where we would expect to find it in space, and right in the middle of where one would expect to find it in time.  Once again, as observers we are not special.  We do not find ourselves in the unexpectedly early universe, ruling out one explanation for the Fermi paradox sometimes put forward – that we do not see evidence for intelligence in the universe because we simply find ourselves as the first intelligent system to evolve.  This would be tenable if there was reason to think that we were right at the beginning of the time in which star systems in stable galaxies with lots of heavy elements could have birthed complex biospheres.  Instead we are utterly average, implying that the lack of obvious intelligence in the universe must be resolved either via the genesis of intelligent systems being exceedingly rare or intelligent systems simply not spreading through the universe or becoming astronomically visible for one reason or another. 

In my next post, I will look at the history of life on Earth, the distinction between simple and complex biospheres, and the evidence for or against other biospheres elsewhere in our own solar system.

Wear a Helmet While Driving a Car

33 James_Miller 30 July 2015 04:36PM

A 2006 study showed that “280,000 people in the U.S. receive a motor vehicle induced traumatic brain injury every year” so you would think that wearing a helmet while driving would be commonplace.  Race car drivers wear helmets.  But since almost no one wears a helmet while driving a regular car, you probably fear that if you wore one you would look silly, attract the notice of the police for driving while weird, or the attention of another driver who took your safety attire as a challenge.  (Car drivers are more likely to hit bicyclists who wear helmets.)  


The $30+shipping Crasche hat is designed for people who should wear a helmet but don’t.  It looks like a ski cap, but contains concealed lightweight protective material.  People who have signed up for cryonics, such as myself, would get an especially high expected benefit from using a driving helmet because we very much want our brains to “survive” even a “fatal” crash. I have been using a Crasche hat for about a week.

Analogical Reasoning and Creativity

25 jacob_cannell 01 July 2015 08:38PM

This article explores analogism and creativity, starting with a detailed investigation into IQ-test style analogy problems and how both the brain and some new artificial neural networks solve them.  Next we analyze concept map formation in the cortex and the role of the hippocampal complex in establishing novel semantic connections: the neural basis of creative insights.  From there we move into learning strategies, and finally conclude with speculations on how a grounded understanding of analogical creative reasoning could be applied towards advancing the art of rationality.

  1. Introduction
  2. Under the Hood
  3. Conceptual Abstractions and Cortical Maps
  4. The Hippocampal Association Engine
  5. Cultivate memetic heterogeneity and heterozygosity
  6. Construct and maintain clean conceptual taxonomies
  7. Conclusion


The computer is like a bicycle for the mind.

-- Steve Jobs

The kingdom of heaven is like a mustard seed, the smallest of all seeds, but when it falls on prepared soil, it produces a large plant and becomes a shelter for the birds of the sky.

-- Jesus

Sigmoidal neural networks are like multi-layered logistic regression.

-- various

The threat of superintelligence is like a tribe of sparrows who find a large egg to hatch and raise.  It grows up into a great owl which devours them all.

-- Nick Bostrom (see this video)

Analogical reasoning is one of the key foundational mechanisms underlying human intelligence, and perhaps a key missing ingredient in machine intelligence.  For some - such as Douglas Hofstadter - analogy is the essence of cognition itself.[1] 

Steve Jobs's bicycle analogy is clever because it encapsulates the whole cybernetic idea of computers as extensions of the nervous system into a single memorable sentence using everyday terms.  

A large chunk of Jesus's known sayings are parables about the 'Kingdom of Heaven': a complex enigmatic concept that he explains indirectly through various analogies, of which the mustard seed is perhaps the most memorable.  It conveys the notions of exponential/sigmoidal growth of ideas and social movements (see also the Parable of the Leaven), while also hinting at greater future purpose.

In a number of fields, including the technical, analogical reasoning is key to creativity: most new insights come from establishing mappings between or with concepts from other fields or domains, or from generalizing existing insights/concepts (which is closely related).  These abilities all depend on deep, wide, and well organized internal conceptual maps.

In a previous post, I presented a high level working hypothesis of the brain as a biological implementation of a universal learning machine, using various familiar computational concepts as analogies to explain brain subsystems.  In my last post, I used the conceptions of unfriendly superintelligence and value alignment as analogies for market mechanism design and the healthcare problem (and vice versa).

A clever analogy is like a sophisticated conceptual compressor that helps maximize knowledge transmission.  Coming up with good novel analogies is hard because it requires compressing a complex large body of knowledge into a succinct message that heavily exploits the recipient's existing knowledgebase.  Due to the deep connections between compression, inference, intelligence and creativity, a deeper investigation of analogical reasoning is useful from a variety of angles.

Coming up with novel analogical connections is the hard task that can lead to creative insights, but to understand that process we should start first with the mechanics of recognition.

Under the Hood

You can think of the development of IQ tests as a search for simple tests which have high predictive power for g-factor in humans, while being relatively insensitive to specific domain knowledge.  That search process resulted in a number of problem categories, many of which are based on verbal and mathematical analogies.

The image to the right is an example of a simple geometric analogy problem.  As an experiment, start a timer before having a go at it.  For bonus points, attempt to introspect on your mental algorithm.

Solving this problem requires first reducing the images to simpler compact abstract representations.  The first rows of images then become something like sentences describing relations or constraints (Z is to ? as A is to B and C is to D).  The solution to the query sentence can then be found by finding the image which best satisfies the likely analogous relations.

Imagine watching a human subject (such as your previous self) solve this problem while hooked up to a future high resolution brain imaging device.  Viewed in slow motion, you would see the subject move their eyes from location to location through a series of saccades, while various vectors or mental variable maps flowed through their brain modules.  Each fixation lasts about 300ms[2], which gives enough time for one complete feedforward pass through the ventral vision stream and perhaps one backwards sweep.  

The output of the ventral stream in inferior temporal cortex (TE on the bottom) results in abstract encodings which end up in working memory buffers in prefrontal cortex.  From there some sort of learned 'mental program' implements the actual analogy evaluations, probably involving several more steps in PFC, cingulate cortex, and various other cortical modules (coordinated by the basal ganglia and PFC). Meanwhile the frontal eye fields and various related modules are computing the next saccade decision every 300ms or so.

If we assume that visual parsing requires one fixation on each object and 50ms saccades, this suggests that solving this problem would take a typical brain a minimum of about 4 seconds (and much longer on average).  The minimum estimate assumes - probably unrealistically - that the subject can perform the analogy checks or mental rotations near instantly without any backtracking to help prime working memory.  Of course faster times are also theoretically possible - but not dramatically faster.
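The minimum-time estimate can be sketched as a back-of-envelope calculation. The panel count below is hypothetical, chosen to represent a typical matrix problem:

```python
# Back-of-envelope minimum solve time, assuming one 300 ms fixation per
# panel and 50 ms saccades between fixations. The panel count is a
# hypothetical stand-in for a typical matrix-style analogy problem.
FIXATION_MS = 300
SACCADE_MS = 50
N_PANELS = 11    # assumed number of panels/objects to fixate

total_ms = N_PANELS * FIXATION_MS + (N_PANELS - 1) * SACCADE_MS
print(total_ms / 1000)   # 3.8 seconds: about the "minimum of 4 seconds"
```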

These types of visual analogy problems test a wide set of cognitive operations, which by itself can explain much of the correlation with IQ or g-factor: speed and efficiency of neural processing, working memory, module communication, etc.  

However once we lay all of that aside, there remains a core dependency on the ability for conceptual abstraction.  The mapping between these simple visual images and their compact internal encodings is ambiguous, as is the predictive relationship.  Solving these problems requires the ability to find efficient and useful abstractions - a general pattern recognition ability which we can relate to efficient encoding, representation learning, and nonlinear dimension reduction: the very essence of learning in both man and machine[3].

The machine learning perspective can help make these connections more concrete when we look into state of the art programs for IQ tests in general and analogy problems in particular.  Many of the specific problem subtypes used in IQ tests can be solved by relatively simple programs.  In 2003, Sanghi and Dowe created a simple Perl program (less than 1000 lines of code) that can solve several specific subtypes of common IQ problems[4] - but not analogies.  It scored an IQ of a little over 100, simply by excelling in a few categories and making random guesses for the remaining harder problem types.  Thus its score is highly dependent on the test's particular mix of subproblems, but that is also true for humans to some extent.  

The IQ test sub-problems that remain hard for computers are those that require pattern recognition combined with analogical reasoning and/or inductive inference.  Precise mathematical inductive inference is easier for machines, whereas humans excel at natural reasoning - inference problems involving huge numbers of variables that can only be solved by scalable approximations.

For natural language tasks, neural networks have recently been used to learn vector embeddings which map words or sentences to abstract conceptual spaces encoded as vectors (typically of dimensionality 100 to 1000).  Combining word vector embeddings with some new techniques for handling multiple word senses, Wang, Gao, et al. recently trained a system that can solve typical verbal reasoning problems from IQ tests (or the GRE) at upper human level - including verbal analogies[5].

The word vector embedding is learned as a component of an ANN trained via backprop on a large corpus of text data - Wikipedia.  This particular model is rather complex: it combines a multi-sense word embedding, a local sliding window prediction objective, task-specific geometric objectives, and relational regularization constraints.  Unlike the recent crop of general linguistic modeling RNNs, this particular system doesn't model full sentence structure or longer term dependencies - as those aren't necessary for answering these specific questions.  Surprisingly all it takes to solve the verbal analogy problems typical of IQ/SAT/GRE style tests are very simple geometric operations in the word vector space - once the appropriate embedding is learned.  

As a trivial example: "Uncle is to Aunt as King is to ?" literally reduces to:

Uncle + X = Aunt, King + X = ?, and thus X = Aunt-Uncle, and:

? = King + (Aunt-Uncle).

The (Aunt-Uncle) expression encapsulates the concept of 'femaleness', which can be combined with any male version of a word to get the female version.  This is perhaps the simplest example, but more complex transformations build on this same principle.  The embedded concept space allows for easy mixing and transforms of memetic sub-features to get new concepts.
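A toy sketch of this arithmetic, using hand-built 4-dimensional vectors as stand-ins for learned embeddings (real systems learn 100-1000 dimensional vectors from a corpus; the numbers below are invented for illustration):

```python
import numpy as np

# Hypothetical word vectors: the first two dimensions crudely encode
# male/female, the last two encode family-word vs royalty-word.
vecs = {
    "uncle": np.array([1.0, 0.0, 0.9, 0.1]),
    "aunt":  np.array([0.0, 1.0, 0.9, 0.1]),
    "king":  np.array([1.0, 0.0, 0.1, 0.9]),
    "queen": np.array([0.0, 1.0, 0.1, 0.9]),
}

def analogy(a, b, c):
    """Solve 'a is to b as c is to ?' via target = c + (b - a)."""
    target = vecs[c] + (vecs[b] - vecs[a])

    def cos(u, v):
        return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

    # Return the nearest remaining word by cosine similarity.
    candidates = {w: v for w, v in vecs.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cos(candidates[w], target))

print(analogy("uncle", "aunt", "king"))  # -> queen
```

Here (aunt - uncle) is exactly the 'femaleness' direction in the toy space, so adding it to 'king' lands on 'queen'; real embeddings behave the same way only approximately.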

Conceptual Abstractions and Cortical Maps

The success of these simplistic geometric transforms operating on word vector embeddings should not come as a huge surprise to one familiar with the structure of the brain.  The brain is extraordinarily slow, so it must learn to solve complex problems via extremely simple and short mental programs operating on huge wide vectors.  Humans (and now convolutional neural networks) can perform complex visual recognition tasks in just 10-15 individual computational steps (150 ms), or 'cortical clock cycles'.  The entire program that you used to solve the earlier visual analogy problem probably took on the order of a few thousand cycles (assuming it took you a few dozen seconds).  Einstein solved general relativity in - very roughly - around 10 billion low level cortical cycles.

The core principle behind word vector embeddings, convolutional neural networks, and the cortex itself is the same: learning to represent the statistical structure of the world by an efficient low complexity linear algebra program (consisting of local matrix vector products and per-element non-linearities).  The local wiring structure within each cortical module is equivalent to a matrix with sparse local connectivity, optimized heavily for wiring and computation such that semantically related concepts cluster close together.

(Concept mapping the cortex, from this research page)

The image above is from the paper "A Continuous Semantic Space Describes the Representation of Thousands of Object and Action Categories across the Human Brain" by Huth et al.[6]  They used fMRI to record activity across the cortex while subjects watched annotated video clips, and then used that data to find out roughly what types of concepts each voxel of cortex responds to.  It correctly identifies the FFA region as specializing in people/face things and the PPA as specializing in man-made objects and buildings.  A limitation of these image visualizations is that they don't show response variance or breadth, so the voxel colors are especially misleading for lower level cortical regions that represent generic local features (such as gabor edges in V1).

The power of analogical reasoning depends entirely on the formation of efficient conceptual maps that carve reality at the joints.  The visual pathway learns a conceptual hierarchy that builds up objects from their parts: a series of hierarchical has-a relationships encoded in the connections between V1, V2, V4 and so on.  Meanwhile the semantic clustering within individual cortical maps allows for fast computations of is-a relationships through simple local pooling filters.  

An individual person can be encoded as a specific active subnetwork in the face region, and simple pooling over a local cluster of neurons across the face region can then compute the presence of a face in general.  Smaller local pooling filters with more specific shapes can then compute the presence of a female or male face, and so on - all starting from the full specific feature encoding.  

The pooling filter concept has been extensively studied in the lower levels of the visual system, where 'complex' cells higher up in V1 pool over 'simple' cell features: abstracting away gabor edges at specific positions to get edges OR'd over a range of positions (CNNs use this same technique to gain invariance to small local translations).  
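A minimal numerical sketch of this pooling idea: a max over a local window OR's together a feature detected at nearby positions, so small translations yield the same pooled output (the 1-d signal and window size here are illustrative, not biological values):

```python
import numpy as np

def max_pool(signal, width):
    """OR-like pooling: max over non-overlapping windows of `width`."""
    return np.array([signal[i:i + width].max()
                     for i in range(0, len(signal), width)])

# The same 'edge' feature detected at position 1 vs position 2:
a = np.array([0., 1., 0., 0., 0., 0., 0., 0.])
b = np.array([0., 0., 1., 0., 0., 0., 0., 0.])

# After pooling with window 4, both map to identical representations:
print(max_pool(a, 4))  # [1. 0.]
print(max_pool(b, 4))  # [1. 0.]
```

This is the same mechanism CNNs use to gain invariance to small local translations, as noted above.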

This key semantic organization principle is used throughout the cortex: is-a relations and more general abstractions/invariances are computed through fast local intramodule connections that exploit the physical semantic clustering on the cortical surface, and more complex has-a relations and arbitrary transforms (ex: mapping between an eye centered coordinate basis and a body centered coordinate basis) are computed through intermodule connections (which also exploit physical clustering).


The Hippocampal Association Engine

The Hippocampus is a tubular seahorse shaped module located in the center of the brain, to the exterior side of the central structures (basal ganglia, thalamus).  It is the brain's associative database and search engine responsible for storing, retrieving, and consolidating patterns and declarative memories (those which we are consciously aware of and can verbally declare) over long time scales beyond the reach of short term memory in the cortex itself.

A human (or animal) unfortunate enough to suffer complete loss of hippocampal functionality basically loses the ability to form and consolidate new long term episodic and semantic memories.  They also lose more recent memories that have not yet been consolidated down the cortical hierarchy.  In rats and humans, problems in the hippocampal complex can also lead to spatial navigation impairments (forgetting current location or recent path), as the HC is used to compute and retrieve spatial map information associated with current sensory impressions (a specific instance of the HC's more general function).

In terms of module connectivity, the hippocampal complex sits on top of the cortical sensory hierarchy.  It receives inputs from a number of cortical modules, largely in the nearby associative cortex, which collectively provide a summary of the recent sensory stream and overall brain state.  The HC then has several sub circuits which further compress the mental summary into something like a compact key which is then sent into a hetero-auto-associative memory circuit to find suitable matches.  

If a good match is found, it can then cause retrieval: reactivation of the cortical subnetworks that originally formed the memory.  As the hippocampus can't know for sure which memories will be useful in the future, it tends to store everything with emphasis on the recent, perhaps as a sort of slow exponentially fading stream.  Each memory retrieval involves a new decoding and encoding to drive learning in the cortex through distillation/consolidation/retraining (this also helps prevent ontological crisis).  The amygdala is a little cap on the edge of the hippocampus which connects to the various emotion subsystems and helps estimate the importance of current memories for prioritization in the HC.

A very strong retrieval of an episodic memory causes the inner experience of reliving the past (or imagining the future), but more typical weaker retrievals (those which load information into the cortex without overriding much of the existing context) are a crucial component in general higher cognition.

In short the computation that the HC performs is that of dynamic association between the current mental pattern/state loaded into short term memory across the cortex and some previous mental pattern/state.  This is the very essence of creative insight.
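A classic minimal model of this kind of pattern-completing association is a Hopfield network: store patterns via Hebbian learning, then iterate to clean up a noisy cue into the nearest stored memory. This is only a cartoon of retrieval, not a model of actual hippocampal circuitry:

```python
import numpy as np

rng = np.random.default_rng(0)

# Store a few random binary (+1/-1) patterns via the Hebbian outer-product rule.
n = 64
patterns = rng.choice([-1.0, 1.0], size=(3, n))
W = sum(np.outer(p, p) for p in patterns) / n
np.fill_diagonal(W, 0.0)  # no self-connections

def retrieve(cue, steps=10):
    """Iteratively clean up a noisy cue toward the nearest stored pattern."""
    s = cue.copy()
    for _ in range(steps):
        s = np.sign(W @ s)
        s[s == 0] = 1.0
    return s

# Corrupt 10 of 64 bits of the first pattern, then retrieve it.
noisy = patterns[0].copy()
flip = rng.choice(n, size=10, replace=False)
noisy[flip] *= -1

recovered = retrieve(noisy)
print((recovered == patterns[0]).mean())  # fraction of bits recovered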

Associative recall can be viewed as a type of pattern recognition with the attendant familiar tradeoffs between precision/recall or sensitivity/specificity.  At the extreme of low recall high precision the network is very conservative and risk averse: it only returns high confidence associations, maximizing precision at the expense of recall (few associations found, many potentially useful matches are lost).  At the other extreme is the over-confident crazy network which maximizes recall at the expense of precision (many associations are made, most of which are poor).  This can also be viewed in terms of the exploitation vs exploration tradeoff.

This general analogy or framework - although oversimplified - also provides a useful perspective for understanding both schizotypy and hallucinogenic drugs.  There is a large body of accumulated evidence in the form of use cases or trip reports, with a general consensus that hallucinogens can provide occasional flashes of creative insight at the expense of pushing one farther towards madness.

From a skeptical stance, using hallucinogenic drugs in an attempt to improve the mind is like doing surgery with butter-knives.  Nonetheless, careful exploration of the sanity border can help one understand more on how the mind works from the inside. 

Cannabis in particular is believed - by many of its users - to enhance creativity via occasional flashes of insight.  Most of its main mental effects: time dilation, random associations, memory impairment, spatial navigation impairment, etc appear to involve the hippocampus.  We could explain much of this as a general shift in the precision/recall tradeoff to make the hippocampus less selective.  Mainly that makes the HC just work less effectively, but it also can occasionally lead to atypical creative insights, and appears to elevate some related low level measures such as schizotypy and divergent thinking[7].  The tradeoff is one must be willing to first sift through a pile of low value random associations.


Cultivate memetic heterogeneity and heterozygosity

Fluid intelligence is obviously important, but in many endeavors net creativity is even more important.  

Of all the components underlying creativity, improving the efficiency of learning, the quality of knowledge learned, and the organizational efficiency of one's internal cortical maps are probably the most profitable dimensions of improvement: the low hanging fruits.

Our learning process is largely automatic and subconscious : we do not need to teach children how to perceive the world.  But this just means it takes some extra work to analyze the underlying machinery and understand how to best utilize it.

Over long time scales humanity has learned a great deal on how to improve on natural innate learning: education is more or less learning-engineering.  The first obvious lesson from education is the need for curriculum: acquiring concepts in stages of escalating complexity and order-dependency (which of course is already now increasingly a thing in machine learning).

In most competitive creative domains, formal education can only train you up to the starting gate.  This of course is to be expected, for the creation of novel and useful ideas requires uncommon insights.

Memetic evolution is similar to genetic evolution in that novelty comes more from recombination than mutation.  We can draw some additional practical lessons from this analogy: cultivate memetic heterogeneity and heterozygosity.

The first part - cultivate memetic heterogeneity - should be straightforward, but it is worth examining some examples.  If you possess only the same baseline memetic population as your peers, then the chances of your mind evolving truly novel creative combinations are substantially diminished.  You have no edge - your insights are likely to be common.

To illustrate this point, let us consider a few examples:

Geoffrey Hinton is one of the most successful researchers in machine learning - which itself is a diverse field.  He first formally studied psychology, and then artificial intelligence.  His various 200 research publications integrate ideas from statistics, neuroscience and physics.  His work on boltzmann machines and variants in particular imports concepts from statistical physics whole cloth.

Before founding DeepMind (now one of the premier DL research groups in the world), Demis Hassabis studied the brain and hippocampus in particular at the Gatsby Computational Neuroscience Unit, and before that he worked for years in the video game industry after studying computer science.

Before the Annus Mirabilis, Einstein worked at the patent office for four years, during which time he was exposed to a large variety of ideas relating to the transmission of electric signals and electrical-mechanical synchronization of time, core concepts which show up in his later thought experiments.[8]

Creative people also tend to have a diverse social circle of creative friends to share and exchange ideas across fields.

Genetic heterozygosity is the quality of having two different alleles at a gene locus; summed over the organism this leads to a different but related concept of diversity.

Within developing fields of knowledge we often find key questions or subdomains for which there are multiple competing hypotheses or approaches.  Good old fashioned AI vs Connectionism, Ray tracing vs Rasterization, and so on.

In these scenarios, it is almost always better to understand both viewpoints or knowledge clusters - at least to some degree.  Each cluster is likely to have some unique ideas which are useful for understanding the greater truth or at the very least for later recombination.  

This then is memetic heterozygosity.  It invokes the Jain version of the blind men and the elephant.

Construct and maintain clean conceptual taxonomies

Formal education has developed various methods and rituals which have been found to be effective through a long process of experimentation.  Some of these techniques are still quite useful for autodidacts.

When one sets out to learn, it is best to start with a clear goal.  The goal of high school is just to provide a generalist background.  In college one then chooses a major suitable for a particular goal cluster: do you want to become a computer programmer? a physicist? a biologist? etc.  A significant amount of work then goes into structuring a learning curriculum most suitable for these goal types.

Once out of the educational system we all end up creating our own curriculums, whether intentionally or not.  It can be helpful to think strategically as if planning a curriculum to suit one's longer term goals.

For example, about four years ago I decided to learn how the brain works and how AGI could be built in particular.  When starting on this journey, I had a background mainly in computer graphics, simulation, and game related programming.  I decided to focus about equally on mainstream AI, machine learning, computational neuroscience, and the AGI literature.  I quickly discovered that my statistics background was a little weak, so I had to shore that up.  Doing it all over again I may have started with a statistics book.  Instead I started with AI: a modern approach (of course I mostly learn from the online research literature).

Learning works best when it is applied.  Education exploits this principle and it is just as important for autodidactic learning.  The best way to learn many math or programming concepts is learning by doing, where you create reasonable subtasks or subgoals for yourself along the way.  

For general knowledge, application can take the form of writing about what you have learned.  Academics are doing this all the time as they write papers and textbooks, but the same idea applies outside of academia.

In particular a good exercise is to imagine that you need to communicate all that you have learned about the domain.  Imagine that you are writing a textbook or survey paper for example, and then you need to compress all that knowledge into a summary chapter or paper, and then all of that again down into an abstract.  Then actually do write up a summary - at least in the form of a blog post (even if you don't show it to anybody).

The same ideas apply on some level to giving oral presentations or just discussing what you have learned informally - all of which are also features of the academic learning environment.

Early on, your first attempts to distill what you have learned into written form will be ... poor.  But doing this process forces you to attempt to compress what you have learned, and thus it helps encourage the formation of well structured concept maps in the cortex.

A well structured conceptual map can be thought of as a memetic taxonomy.  The point of a taxonomy is to organize all the invariances and 'is-a' relationships between objects so that higher level inferences and transformations can generalize well across categories.  

Explicitly asking questions which probe the conceptual taxonomy can help force said structure to take form.  For example in computer science/programming the question: "what is the greater generalization of this algorithm?" is a powerful tool.

In some domains, it may even be possible to semi-automate or at least guide the creative process using a structured method.

For example consider sci-fi/fantasy genre novels.  Many of the great works have a general analogical structure based on real history ported over into a more exotic setting.  The foundation series uses the model of the fall of the roman empire.  Dune is like Lawrence of Arabia in space.  Stranger in a Strange Land is like the Mormon version of Jesus the space alien, but from Mars instead of Kolob.  A Song of Fire and Ice is partly a fantasy port of the war of the roses.  And so on.

One could probably find some new ideas for novels just by creating and exploring a sufficiently large table of historical events and figures and comparing it to a map of the currently colonized space of ideas.  Obviously having an idea for a novel is just the tiniest tip of the iceberg in the process, but a semi-formal method is interesting nonetheless for brainstorming and applies across domains (others have proposed similar techniques for generating startup ideas, for example).


We are born equipped with sophisticated learning machinery and yet lack innate knowledge on how to use it effectively - for this too we must learn.

The greatest constraint on creative ability is the quality of conceptual maps in the cortex.  Understanding how these maps form doesn't automagically increase creativity, but it does help ground our intuitions and knowledge about learning, and could pave the way for future improved techniques.

In the meantime: cultivate memetic heterogeneity and heterozygosity, create a learning strategy, develop and test your conceptual taxonomy, continuously compress what you learn by writing and summarizing, and find ways to apply what you learn as you go.

MIRI Fundraiser: Why now matters

24 So8res 24 July 2015 10:38PM

Our summer fundraiser is ongoing. In the meantime, we're writing a number of blog posts to explain what we're doing and why, and to answer a number of common questions. Previous posts in the series are listed at the above link.

I'm often asked whether donations to MIRI now are more important than donations later. Allow me to deliver an emphatic yes: I currently expect that donations to MIRI today are worth much more than donations to MIRI in five years. As things stand, I would very likely take $10M today over $20M in five years.

That's a bold statement, and there are a few different reasons for this. First and foremost, there is a decent chance that some very big funders will start entering the AI alignment field over the course of the next five years. It looks like the NSF may start to fund AI safety research, and Stuart Russell has already received some money from DARPA to work on value alignment. It's quite possible that in a few years' time significant public funding will be flowing into this field.

(It's also quite possible that it won't, or that the funding will go to all the wrong places, as was the case with funding for nanotechnology. But if I had to bet, I would bet that it's going to be much easier to find funding for AI alignment research in five years' time).

In other words, the funding bottleneck is loosening — but it isn't loose yet.

We don't presently have the funding to grow as fast as we could over the coming months, or to run all the important research programs we have planned. At our current funding level, the research team can grow at a steady pace — but we could get much more done over the course of the next few years if we had the money to grow as fast as is healthy.

Which brings me to the second reason why funding now is probably much more important than funding later: because growth now is much more valuable than growth later.

There's an idea picking up traction in the field of AI: instead of focusing only on increasing the capabilities of intelligent systems, it is important to also ensure that we know how to build beneficial intelligent systems. Support is growing for a new paradigm within AI that seriously considers the long-term effects of research programs, rather than just the immediate effects. Years down the line, these ideas may seem obvious, and the AI community's response to these challenges may be in full swing. Right now, however, there is relatively little consensus on how to approach these issues — which leaves room for researchers today to help determine the field's future direction.

People at MIRI have been thinking about these problems for a long time, and that puts us in an unusually good position to influence the field of AI and ensure that some of the growing concern is directed towards long-term issues in addition to shorter-term ones. We can, for example, help avert a scenario where all the attention and interest generated by Musk, Bostrom, and others gets channeled into short-term projects (e.g., making drones and driverless cars safer) without any consideration for long-term risks that are more vague and less well-understood.

It's likely that MIRI will scale up substantially at some point; but if that process begins in 2018 rather than 2015, it is plausible that we will have already missed out on a number of big opportunities.

The alignment research program within AI is just now getting started in earnest, and it may even be funding-saturated in a few years' time. But it's nowhere near funding-saturated today, and waiting five or ten years to begin seriously ramping up our growth would likely give us far fewer opportunities to shape the methodology and research agenda within this new AI paradigm. The projects MIRI takes on today can make a big difference years down the line, and supporting us today will drastically affect how much we can do quickly. Now matters.

I encourage you to donate to our ongoing fundraiser if you'd like to help us grow!

This post is cross-posted from the MIRI blog.

Steelmaning AI risk critiques

23 Stuart_Armstrong 23 July 2015 10:01AM

At some point soon, I'm going to attempt to steelman the position of those who reject the AI risk thesis, to see if it can be made solid. Here, I'm just asking if people can link to the most convincing arguments they've found against AI risk.

EDIT: Thanks for all the contribution! Keep them coming...

There is no such thing as strength: a parody

23 ZoltanBerrigomo 05 July 2015 11:44PM

The concept of strength is ubiquitous in our culture. It is commonplace to hear one person described as "stronger" or "weaker" than another. And yet the notion of strength is a a pernicious myth which reinforces many our social ills and should be abandoned wholesale. 


1. Just what is strength, exactly? Few of the people who use the word can provide an exact definition. 

On first try, many people would say that  strength is the ability to lift heavy objects. But this completely ignores the strength necessary to push or pull on objects; to run long distances without exhausting oneself; to throw objects with great speed; to balance oneself on a tightrope, and so forth. 

When this is pointed out, people often try to incorporate all of these aspects into the definition of strength, with a result that is long, unwieldy, ad-hoc, and still missing some acts commonly considered to be manifestations of strength. 


Attempts to solve the problem by referring to the supposed cause of strength -- for example, by saying that strength is just a measure of  muscle mass -- do not help. A person with a large amount of muscle mass may be quite weak on any of the conventional measures of strength if, for example, they cannot lift objects due to injuries or illness. 



2. The concept of strength has an ugly history. Indeed, strength is implicated in both sexism and racism. Women have long been held to be the "weaker sex," consequently needing protection from the "stronger" males, resulting in centuries of structural oppression. Myths about racialist differences in strength have informed pernicious stereotypes and buttressed inequality.


3. There is no consistent way of grouping people into strong and weak. Indeed, what are we to make of the fact that some people are good at running but bad at lifting and vice versa? 


One might think that we can talk about different strengths - the strength in one's arms and one's legs for example. But what, then, should we make of the person who is good at arm-wrestling but poor at lifting? Arms can move in many ways; what will we make of someone who can move arms one way with great force, but not another? It is not hard to see that potential concepts such as "arm strength" or "leg strength" are problematic as well. 


4. When people are grouped into strong and weak according to any number of criteria, the amount of variation within each group is far larger than the amount of variation between groups. 


5. Strength is a social construct. Thus no one is inherently weak or strong. Scientifically, anthropologically, we are only human


6. Scientists are rapidly starting to understand the illusory nature of strength, and one needs only to glance at any of the popular scientific periodicals to encounter refutations of this notion. 


In on experiment, respondents from two different cultures were asked to lift a heavy object as much as they could. In one of the cultures, the respondents lifted the object higher. Furthermore, the manner in which the respondents attempted to lift the object depended on the culture. This shows that tests of strength cannot be considered culture-free and that there may be no such thing as a universal test of strength


7. Indeed, to even ask "what is strength?" is to assume that there is a quality, or essence, of humans with essential, immutable qualities. Asking the question begins the process of reifying strength... (see page 22 here).




For a serious statement of what the point of this was supposed to be, see this comment


A Federal Judge on Biases in the Criminal Justice System.

22 Costanza 03 July 2015 03:17AM

A well-known American federal appellate judge, Alex Kozinski, has written a commentary on systemic biases and institutional myths in the criminal justice system.

The basic thrust of his criticism will be familiar to readers of the sequences and rationalists generally. Lots about cognitive biases (but some specific criticisms of fingerprints and DNA evidence as well). Still, it's interesting that a prominent federal judge -- the youngest when appointed, and later chief of the Ninth Circuit -- would treat some sacred cows of the judiciary so ruthlessly. 

This is specifically a criticism of U.S. criminal justice, but, ceteris paribus, much of it applies not only to other areas of U.S. law, but to legal practices throughout the world as well.

Examples of AI's behaving badly

21 Stuart_Armstrong 16 July 2015 10:01AM

Some past examples to motivate thought on how AI's could misbehave:

An algorithm pauses the game to never lose at Tetris.

In "Learning to Drive a Bicycle using Reinforcement Learning and Shaping", Randlov and Alstrom, describes a system that learns to ride a simulated bicycle to a particular location. To speed up learning, they provided positive rewards whenever the agent made progress towards the goal. The agent learned to ride in tiny circles near the start state because no penalty was incurred from riding away from the goal.

A similar problem occurred with a soccer-playing robot being trained by David Andre and Astro Teller (personal communication to Stuart Russell). Because possession in soccer is important, they provided a reward for touching the ball. The agent learned a policy whereby it remained next to the ball and “vibrated,” touching the ball as frequently as possible. 

Algorithms claiming credit in Eurisko: Sometimes a "mutant" heuristic appears that does little more than continually cause itself to be triggered, creating within the program an infinite loop. During one run, Lenat noticed that the number in the Worth slot of one newly discovered heuristic kept rising, indicating that had made a particularly valuable find. As it turned out the heuristic performed no useful function. It simply examined the pool of new concepts, located those with the highest Worth values, and inserted its name in their My Creator slots.

Crazy Ideas Thread

21 Gunnar_Zarncke 07 July 2015 09:40PM

This thread is intended to provide a space for 'crazy' ideas. Ideas that spontaneously come to mind (and feel great), ideas you long wanted to tell but never found the place and time for and also for ideas you think should be obvious and simple - but nobody ever mentions them.

This thread itself is such an idea. Or rather the tangent of such an idea which I post below as a seed for this thread.


Rules for this thread:

  1. Each crazy idea goes into its own top level comment and may be commented there.
  2. Voting should be based primarily on how original the idea is.
  3. Meta discussion of the thread should go to the top level comment intended for that purpose. 


If this should become a regular thread I suggest the following :

  • Use "Crazy Ideas Thread" in the title.
  • Copy the rules.
  • Add the tag "crazy_idea".
  • Create a top-level comment saying 'Discussion of this thread goes here; all other top-level comments should be ideas or similar'
  • Add a second top-level comment with an initial crazy idea to start participation.

[link] FLI's recommended project grants for AI safety research announced

17 Kaj_Sotala 01 July 2015 03:27PM

You may recognize several familiar names there, such as Paul Christiano, Benja Fallenstein, Katja Grace, Nick Bostrom, Anna Salamon, Jacob Steinhardt, Stuart Russell... and me. (the $20,000 for my project was the smallest grant that they gave out, but hey, I'm definitely not complaining. ^^)

Philosophy professors fail on basic philosophy problems

16 shminux 15 July 2015 06:41PM

Imagine someone finding out that "Physics professors fail on basic physics problems". This, of course, would never happen. To become a physicist in academia, one has to (among million other things) demonstrate proficiency on far harder problems than that.

Philosophy professors, however, are a different story. Cosmologist Sean Carroll tweeted a link to a paper from the Harvard Moral Psychology Research Lab, which found that professional moral philosophers are no less subject to the effects of framing and order of presentation on the Trolley Problem than non-philosophers. This seems as basic an error as, say, confusing energy with momentum, or mixing up units on a physics test.


We examined the effects of framing and order of presentation on professional philosophers’ judgments about a moral puzzle case (the “trolley problem”) and a version of the Tversky & Kahneman “Asian disease” scenario. Professional philosophers exhibited substantial framing effects and order effects, and were no less subject to such effects than was a comparison group of non-philosopher academic participants. Framing and order effects were not reduced by a forced delay during which participants were encouraged to consider “different variants of the scenario or different ways of describing the case”. Nor were framing and order effects lower among participants reporting familiarity with the trolley problem or with loss-aversion framing effects, nor among those reporting having had a stable opinion on the issues before participating the experiment, nor among those reporting expertise on the very issues in question. Thus, for these scenario types, neither framing effects nor order effects appear to be reduced even by high levels of academic expertise.

Some quotes (emphasis mine):

When scenario pairs were presented in order AB, participants responded differently than when the same scenario pairs were presented in order BA, and the philosophers showed no less of a shift than did the comparison groups, across several types of scenario.

[...] we could find no level of philosophical expertise that reduced the size of the order effects or the framing effects on judgments of specific cases. Across the board, professional philosophers (94% with PhD’s) showed about the same size order and framing effects as similarly educated non-philosophers. Nor were order effects and framing effects reduced by assignment to a condition enforcing a delay before responding and encouraging participants to reflect on “different variants of the scenario or different ways of describing the case”. Nor were order effects any smaller for the majority of philosopher participants reporting antecedent familiarity with the issues. Nor were order effects any smaller for the minority of philosopher participants reporting expertise on the very issues under investigation. Nor were order effects any smaller for the minority of philosopher participants reporting that before participating in our experiment they had stable views about the issues under investigation.

I am confused... I assumed that an expert in moral philosophy would not fall prey to the relevant biases so easily... What is going on?
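To make the quoted result concrete, here is a toy sketch of what an order effect looks like as a statistic (the ratings below are invented for illustration, not taken from the paper): the same scenario draws different average judgments depending on which case preceded it.

```python
# Toy data: 1-7 permissibility ratings of the *same* trolley scenario,
# collected under two presentation orders. All numbers are invented.
ab = [5, 6, 5, 7, 6, 5, 6, 5]  # scenario judged after a milder case
ba = [4, 3, 4, 5, 3, 4, 4, 3]  # same scenario judged after a starker case

def mean(xs):
    return sum(xs) / len(xs)

# An order effect is a nonzero gap between the two group means.
effect = mean(ab) - mean(ba)
print(f"order effect: {effect} rating points")  # 1.875 here
```

The paper's finding is that a gap of this kind was about as large for professional philosophers as for the comparison group; expertise did not shrink it.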


MIRI needs an Office Manager (aka Force Multiplier)

16 alexvermeer 03 July 2015 01:10AM

(Cross-posted from MIRI's blog.)

MIRI's looking for a full-time office manager to support our growing team. It’s a big job that requires organization, initiative, technical chops, and superlative communication skills. You’ll develop, improve, and manage the processes and systems that make us a super-effective organization. You’ll obsess over our processes (faster! easier!) and our systems (simplify! simplify!). Essentially, it’s your job to ensure that everyone at MIRI, including you, is able to focus on their work and Get Sh*t Done.

That’s a super-brief intro to what you’ll be working on. But first, you need to know if you’ll even like working here.

A Bit About Us

We’re a research nonprofit working on the critically important problem of superintelligence alignment: how to bring smarter-than-human artificial intelligence into alignment with human values.1 Superintelligence alignment is a burgeoning field, and arguably the most important and under-funded research problem in the world. Experts largely agree that AI is likely to exceed human levels of capability on most cognitive tasks in this century—but it’s not clear when, and we aren’t doing a very good job of preparing for the possibility. Given how disruptive smarter-than-human AI would be, we need to start thinking now about AI’s global impact. Over the past year, a number of leaders in science and industry have voiced their support for prioritizing this endeavor:

People are starting to discuss these issues in a more serious way, and MIRI is well-positioned to be a thought leader in this important space. As interest in AI safety grows, we’re growing too—we’ve gone from a single full-time researcher in 2013 to what will likely be a half-dozen research fellows by the end of 2015, and intend to continue growing in 2016.

All of which is to say: we really need an office manager who will support our efforts to hack away at the problem of superintelligence alignment!

If our overall mission seems important to you, and you love running well-oiled machines, you’ll probably fit right in. If that’s the case, we can’t wait to hear from you.

What it’s like to work at MIRI

We try really hard to make working at MIRI an amazing experience. We have a team full of truly exceptional people—the kind you’ll be excited to work with. Here’s how we operate:

Flexible Hours

We do not have strict office hours. Simply ensure you’re here enough to be available to the team when needed, and to fulfill all of your duties and responsibilities.

Modern Work Spaces

Many of us have adjustable standing desks with multiple large external monitors. We consider workspace ergonomics important, and try to rig up work stations to be as comfortable as possible.

Living in the Bay Area

We’re located in downtown Berkeley, California. Berkeley’s monthly average temperature ranges from 60°F in the winter to 75°F in the summer. From our office you’re:

  • A 10-second walk to the roof of our building, from which you can view the Berkeley Hills, the Golden Gate Bridge, and San Francisco.
  • A 30-second walk to the BART (Bay Area Rapid Transit), which can get you around the Bay Area.
  • A 3-minute walk to UC Berkeley Campus.
  • A 5-minute walk to dozens of restaurants (including ones in Berkeley’s well-known Gourmet Ghetto).
  • A 30-minute BART ride to downtown San Francisco.
  • A 30-minute drive to the beautiful west coast.
  • A 3-hour drive to Yosemite National Park.

Vacation Policy

Our vacation policy is that we don’t have a vacation policy. That is, take the vacations you need to be a happy, healthy, productive human. There are checks in place to ensure this policy isn’t abused, but we haven’t actually run into any problems since initiating the policy.

We consider our work important, and we care about whether it gets done well, not about how many total hours you log each week. We’d much rather you take a day off than extend work tasks just to fill that extra day.

Regular Team Dinners and Hangouts

We get the whole team together every few months, order a bunch of food, and have a great time.

Top-Notch Benefits

We provide top-notch health and dental benefits. We care about our team’s health, and we want you to be able to get health care with as little effort and annoyance as possible.

Agile Methodologies

Our ops team follows standard Agile best practices, meeting regularly to plan, as a team, the tasks and priorities over the coming weeks. If the thought of being part of an effective, well-functioning operation gets you really excited, that’s a promising sign!

Other Tidbits

  • Moving to the Bay Area? We’ll cover up to $3,500 in travel expenses.
  • Use public transit to get to work? You get a transit pass with a large monthly allowance.
  • All the snacks and drinks you could want at the office.
  • You’ll get a smartphone and full plan.
  • This is a salaried position. (That is, your job is not to sit at a desk for 40 hours a week. Your job is to get your important work done, even if this occasionally means working on a weekend or after hours.)

It can also be surprisingly motivating to realize that your day job is helping people explore the frontiers of human understanding, mitigate global catastrophic risk, etc., etc. At MIRI, we try to tackle the very largest problems facing humanity, and that can be a pretty satisfying feeling.

If this sounds like your ideal work environment, read on! It’s time to talk about your role.

What an office manager does and why it matters

Our ops team and researchers (and collection of remote contractors) are swamped making progress on the huge task we’ve taken on as an organization.

That’s where you come in. An office manager is the oil that keeps the engine running. They’re indispensable. Office managers are force multipliers: a good one doesn’t merely improve their own effectiveness—they make the entire organization better.

We need you to build, oversee, and improve all the “behind-the-scenes” things that ensure MIRI runs smoothly and effortlessly. You will devote your full attention to looking at the big picture and the small details and making sense of it all. You’ll turn all of that into actionable information and tools that make the whole team better. That’s the job.

Sometimes this looks like researching and testing out new and exciting services. Other times this looks like stocking the fridge with drinks, sorting through piles of mail, lugging bags of groceries, or spending time on the phone on hold with our internet provider. But don’t think that the more tedious tasks are low-value. If the hard tasks don’t get done, none of MIRI’s work is possible. Moreover, you’re actively encouraged to find creative ways to make the boring stuff more efficient—making an awesome spreadsheet, writing a script, training a contractor to take on the task—so that you can spend more time on what you find most exciting.

We’re small, but we’re growing, and this is an opportunity for you to grow too. There’s room for advancement at MIRI (if that interests you), based on your interests and performance.

Sample Tasks

You’ll have a wide variety of responsibilities, including, but not necessarily limited to, the following:

  • Orienting and training new staff.
  • Onboarding and offboarding staff and contractors.
  • Managing employee benefits and services, like transit passes and health care.
  • Payroll management; handling staff questions.
  • Championing our internal policies and procedures wiki—keeping everything up to date, keeping everything accessible, and keeping staff aware of relevant information.
  • Managing various services and accounts (ex. internet, phone, insurance).
  • Championing our work space, with the goal of making the MIRI office a fantastic place to work.
  • Running onsite logistics for introductory workshops.
  • Processing all incoming mail packages.
  • Researching and implementing better systems and procedures.

Your “value-add” is taking responsibility for making all of these things happen. Having a competent individual in charge of this diverse set of tasks at MIRI is extremely valuable!

A Day in the Life

A typical day in the life of MIRI’s office manager may look something like this:

  • Come in.
  • Process email inbox.
  • Process any incoming mail, scanning/shredding/dealing-with as needed.
  • Stock the fridge, review any low-stocked items, and place an order online for whatever’s missing.
  • Onboard a new contractor.
  • Spend some time thinking of a faster/easier way to onboard contractors. Implement any hacks you come up with.
  • Follow up with Employee X about their benefits question.
  • Outsource some small tasks to TaskRabbit or Upwork. Follow up with previously outsourced tasks.
  • Notice that you’ve spent a few hours per week the last few weeks doing xyz. Spend some time figuring out whether you can eliminate the task completely, automate it in some way, outsource it to a service, or otherwise simplify the process.
  • Review the latest post drafts on the wiki. Polish drafts as needed and move them to the appropriate location.
  • Process email.
  • Go home.

You’re the one we’re looking for if:

  • You are authorized to work in the US. (Prospects for obtaining an employment-based visa for this type of position are slim; sorry!)
  • You can solve problems for yourself in new domains; you find that you don’t generally need to be told what to do.
  • You love organizing information. (There’s a lot of it, and it needs to be super-accessible.)
  • Your life is organized and structured.
  • You enjoy trying things you haven’t done before. (How else will you learn which things work?)
  • You’re way more excited at the thought of being the jack-of-all-trades than at the thought of being the specialist.
  • You are good with people—good at talking about things that are going great, as well as things that aren’t.
  • People thank you when you deliver difficult news. You’re that good.
  • You can notice all the subtle and wondrous ways processes can be automated, simplified, streamlined… while still keeping the fridge stocked in the meantime.
  • You know your way around a computer really well.
  • Really, really well.
  • You enjoy eliminating unnecessary work, automating automatable work, outsourcing outsourcable work, and executing on everything else.
  • You want to do what it takes to help all other MIRI employees focus on their jobs.
  • You’re the sort of person who sees the world, organizations, and teams as systems that can be observed, understood, and optimized.
  • You think Sam is the real hero in Lord of the Rings.
  • You have the strong ability to take real responsibility for an issue or task, and ensure it gets done. (This doesn’t mean it has to get done by you; but it has to get done somehow.)
  • You celebrate excellence and relentlessly pursue improvement.
  • You lead by example.

Bonus Points:

  • Your technical chops are really strong. (Dabbled in scripting? HTML/CSS? Automator?)
  • Involvement in the Effective Altruism space.
  • Involvement in the broader AI-risk space.
  • Previous experience as an office manager.

Experience & Education Requirements

  • Let us know about anything that’s evidence that you’ll fit the bill.

How to Apply

Apply by July 31, 2015!

P.S. Share the love! If you know someone who might be a perfect fit, we’d really appreciate it if you pass this along!

  1. More details on our About page. 

Magnetic rings (the most mediocre superpower): a review

15 Elo 30 July 2015 01:23PM

Following on from a few threads about superpowers and extra senses that humans can try to acquire, I have always been interested in the idea of putting a magnet in my finger for the benefits of extra-sensory perception.

Stories (and occasional news articles) imply that having a magnet implanted in a finger, in a place surrounded by nerves, imparts a power of electric-sensation: the ability to feel when there are electric fields around. So that's pretty neat. Only I don't really like the idea of cutting into myself (even if it's done by a professional piercing artist).

Only recently did I come across the suggestion that a magnetic ring could impart similar abilities and properties. I was delighted at the idea of a non-invasive version of the magnetic implant (people with magnetic implants are commonly known as grinders within the community). I was so keen on trying it that I went out and purchased a few magnetic rings of different styles and with different properties.

Interestingly, a ring-shaped object can be magnetised in one of two general ways: across the diameter, or along the height of the cylinder. (There is a third type: a ring consisting of four outwardly magnetised quarter-arcs of magnetic metal suspended in a ring casing, plus a few orientations of that system.)

I have now been wearing a Neodymium ND50 magnetic ring for around two months. The following is a description of my experiences with it.

When I first got the rings I tried wearing more than one on each hand, and very quickly found out what happens when you wear two magnets close to each other: they attract. Within a day I was wearing one magnet on each hand. What is interesting is what happens when you move two very strong magnets within each other's magnetic field: you can feel the field and roll it around in your hands. I found myself taking typing breaks to play with the magnetic field between my fingers. I also liked the snap as the two magnets pulled towards each other, and regularly played with them by moving them near each other. Based on my experience, I would encourage others to use magnets as a socially acceptable way to hide an ADHD twitch, or just a way to keep yourself amused if you don't have a phone to pull out and ever need a reason to move. I have previously used elastic bands around my wrist for a similar purpose.

The next thing that is interesting to note is what is or is not ferrous. Fridges are made of ferrous metal, but not on the inside. Door handles are not usually ferrous, but the tongue and groove of the latch is. Metal railings are commonly ferrous, as are metal nails in wood. Elevators and escalators have some metallic parts. Light switches are often plastic, but there is a metal screw holding them into the wall. Tennis fencing is ferrous. The ends of USB cables are sometimes ferrous and sometimes not; the cables themselves are not ferrous (they are probably made of copper), except for one I found.


Breaking technology

I had a concern that I would break my technology. That would be bad. Overall, I found zero broken pieces of technology. In theory, if you take a speaker, which consists of a magnet and an electric coil, and you mess around with its magnetic field, it will be unhappy and maybe break. That has not happened yet. The same can be said for hard drives, magnetic memory devices, phone technology, and other things that rely on electricity. So far nothing has broken. What I did notice is that my phone has a magnetic-sleep function on the top left; i.e. it turns the screen off when I hold the ring near that point. That is both a benefit and a detriment, depending on where I am wearing the ring.

Metal shards

I spend some of my time in workshops that have metal shards lying around. Sometimes they are sharp; sometimes they are more like dust. They end up coating the magnetic ring. The sharp ones end up jabbing you, and the dust just looks like dirt on your skin. In a few hours they tend to go away anyway, but it is something I have noticed.

Magnetic strength

Over the time I have been wearing the magnets, their strength has dropped off significantly. I am considering building a remagnetisation jig, but have not started any work on it. Obviously, every time I ding something against the ring or drop it, the magnetisation decreases a bit as the magnetic dipoles reorganise.


Knives

I cook a lot, which means I find myself holding sharp knives fairly often. The most dangerous thing I noticed about these rings is that when I hold a ferrous knife in the normal way, the magnet has a tendency to shift the knife slightly, at times when I don't want it to. That sucks. Don't wear them while handling sharp objects like knives; the last thing you want is for your carrot-cutting to accidentally turn into a finger-cutting event. What is also interesting is that some cutlery is made of ferrous metal and some is not, and sometimes parts of a single piece of cutlery are ferrous while others are non-ferrous. For example, my normal food-eating knife set has a ferrous blade and a non-ferrous handle. I always figured they were the same, but the magnet says they are different materials, which is pretty neat. I have found the same thing with spoons: sometimes the scoop is ferrous and the handle is not. I assume the scoop and blade parts need extra forming steps, so they need a more workable metal. Cheaper cutlery is not like this.

The same applies to hot pieces of metal: ovens, stoves, kettles, soldering irons... When they accidentally move towards your fingers, or your fingers are compelled to move towards them, that's a slightly unsafe experience.


Electric fields

You know how when you run a microwave it buzzes, in a *vibrating* sorta way? If you put your hand against the outside of a microwave you will feel the motor going. So having a magnetic ring means you can feel that without touching the microwave, from about 20cm away. There is variability to it; better microwaves have more shielding on their motors and leak less. I tried to feel the electric field around power tools like a drill press, handheld tools like an orbital sander, computers, cars, and appliances, which pretty much covers everything. I also tried servers, and the only thing that really had a buzzing field was a UPS (uninterruptible power supply), which was cool. Only, other people had reported that any transformer, e.g. a computer charger, would make that buzz. I also carry a battery block with me, and that had no interesting fields. Totally not exciting. As for moving electrical charge: can't feel it. Whether powerpoints are receiving power: nope. Not dying by electrocution: no change.

Boring superpower

There is a reason I call magnetic rings a boring superpower. The only real superpower imparted to me is the power to pick up my keys without using my fingers, and also maybe to hold my keys without trying to. As superpowers go, that's pretty lame. But kinda nifty. I don't know; I wouldn't insist people do it for life-changing purposes.


Did I find a human-superpower?  No.  But I am glad I tried it.


Any questions?  Any experimenting I should try?

The Pre-Historical Fallacy

13 Tem42 03 July 2015 08:21PM

One fallacy that I see frequently in works of popular science -- and also here on LessWrong -- is the belief that we have strong evidence of the way things were in pre-history, particularly when claiming that we can explain various aspects of our culture, psychology, or personal experience because we evolved in a certain way. Moreover, it is held implicitly that because we have this 'strong evidence', it must be relevant to the topic at hand. While it is true that the environment did affect our evolution and thus the way we are today, the evolution and anthropology of pre-historic societies are emphasized to a much greater extent than rational thought would indicate is appropriate.

As a matter of course, you should remember these points whenever you hear a claim about prehistory:

  • Most of what we know (or guess) is based on less data than you would expect, and the publish or perish mentality is alive and well in the field of anthropology.
  • Most of the information is limited and technical, which means that anyone writing for a popular audience will have strong motivation to generalize and simplify.
  • It has been found time and time again that for any statement we can make about human culture and behavior, there is (or was) a society somewhere that will serve as a counterexample. 
  • Very rarely do anthropologists or members of related fields have finely tuned critical thinking skills or a strong background in the philosophy of science, and they are highly motivated to come up with interpretations of results that match their previous theories and expectations. 

Results that you should have reasonable levels of confidence in should be framed in generalities, not absolutes. E.g., "The great majority of human cultures that we have observed have distinct and strong religious traditions", and not "humans evolved to have religion". It may be true that we have areas in our brain that evolved not only 'consistent with holding religion', but actually evolved 'specifically for the purpose of experiencing religion'... but it would be very hard to prove this second statement, and anyone who makes it should be highly suspect. 

Perhaps more importantly, these statements are almost always a red herring. It may make you feel better that humans evolved to be violent, to fit in with the tribe, to eat meat, to be spiritual, to die at the age of thirty.... But rarely do we see these claims in a context where the stated purpose is to make you feel better. Instead they are couched in language indicating that they are making a normative statement -- that this is the way things in some way should be. (This is specifically the argumentum ad antiquitatem or appeal to tradition, and should not be confused with the historical fallacy, but it is certainly a fallacy). 

It is fine to identify, for example, that your fear of flying has an evolutionary basis. However, it is foolish to therefore refuse to fly because flying is unnatural, or to undertake gene therapy to correct the fear. Whether or not the explanation is valid, it is not meaningful. 

Obviously, this doesn't mean that we shouldn't study evolution or the effects evolution has on behavior. However, any time you hear someone refer to this information in order to support any argument outside the fields of biology or anthropology, you should look carefully at why they are taking the time to distract you from the practical implications of the matter under discussion. 


Rationality Reading Group: Part D: Mysterious Answers

12 Gram_Stone 02 July 2015 01:55AM

This is part of a semi-monthly reading group on Eliezer Yudkowsky's ebook, Rationality: From AI to Zombies. For more information about the group, see the announcement post.

Welcome to the Rationality reading group. This week we discuss Part D: Mysterious Answers (pp. 117-191). This post summarizes each article of the sequence, linking to the original LessWrong post where available.

D. Mysterious Answers

30. Fake Explanations - People think that fake explanations use words like "magic," while real explanations use scientific words like "heat conduction." But being a real explanation isn't a matter of literary genre. Scientific-sounding words aren't enough. Real explanations constrain anticipation. Ideally, you could explain only the observations that actually happened. Fake explanations could just as well "explain" the opposite of what you observed.

31. Guessing the Teacher's Password - In schools, "education" often consists of having students memorize answers to specific questions (i.e., the "teacher's password"), rather than learning a predictive model that says what is and isn't likely to happen. Thus, students incorrectly learn to guess at passwords in the face of strange observations rather than admit their confusion. Don't do that: any explanation you give should have a predictive model behind it. If your explanation lacks such a model, start from a recognition of your own confusion and surprise at seeing the result.

32. Science as Attire - You don't understand the phrase "because of evolution" unless it constrains your anticipations. Otherwise, you are using it as attire to identify yourself with the "scientific" tribe. Similarly, it isn't scientific to reject strongly superhuman AI only because it sounds like science fiction. A scientific rejection would require a theoretical model that bounds possible intelligences. If your proud beliefs don't constrain anticipation, they are probably just passwords or attire.

33. Fake Causality - It is very easy for a human being to think that a theory predicts a phenomenon, when in fact it was fitted to a phenomenon. Properly designed reasoning systems (GAIs) would be able to avoid this mistake with our knowledge of probability theory, but humans have to write down a prediction in advance in order to ensure that our reasoning about causality is correct.

34. Semantic Stopsigns - There are certain words and phrases that act as "stopsigns" to thinking. They aren't actually explanations, nor do they help to resolve the actual issue at hand; they act as a marker saying "don't ask any questions."

35. Mysterious Answers to Mysterious Questions - The theory of vitalism was developed before the idea of biochemistry. It stated that the mysterious properties of living matter, compared to nonliving matter, were due to an "elan vital". This explanation acts as a curiosity-stopper, and leaves the phenomenon just as mysterious and inexplicable as it was before the answer was given. It feels like an explanation, though it fails to constrain anticipation.

36. The Futility of Emergence - The theory of "emergence" has become very popular, but is just a mysterious answer to a mysterious question. After learning that a property is emergent, you aren't able to make any new predictions.

37. Say Not "Complexity" - The concept of complexity isn't meaningless, but too often people assume that adding complexity to a system they don't understand will improve it. If you don't know how to solve a problem, adding complexity won't help; better to say "I have no idea" than to say "complexity" and think you've reached an answer.

38. Positive Bias: Look into the Dark - Positive bias is the tendency to look for evidence that confirms a hypothesis, rather than disconfirming evidence.

39. Lawful Uncertainty - Facing a random scenario, the correct solution is really not to behave randomly. Faced with an irrational universe, throwing away your rationality won't help.

40. My Wild and Reckless Youth - Traditional rationality (without Bayes' Theorem) allows you to formulate hypotheses without a reason to prefer them to the status quo, as long as they are falsifiable. Even following all the rules of traditional rationality, you can waste a lot of time. It takes a lot of rationality to avoid making mistakes; a moderate level of rationality will just lead you to make new and different mistakes.

41. Failing to Learn from History - There are no inherently mysterious phenomena, but every phenomenon seems mysterious, right up until the moment that science explains it. It seems to us now that biology, chemistry, and astronomy are naturally the realm of science, but if we had lived through their discoveries, and watched them reduced from mysterious to mundane, we would be more reluctant to believe the next phenomenon is inherently mysterious.

42. Making History Available - It's easy not to take the lessons of history seriously; our brains aren't well-equipped to translate dry facts into experiences. But imagine living through the whole of human history - imagine watching mysteries be explained, watching civilizations rise and fall, being surprised over and over again - and you'll be less shocked by the strangeness of the next era.

43. Explain/Worship/Ignore? - When you encounter something you don't understand, you have three options: to seek an explanation, knowing that that explanation will itself require an explanation; to avoid thinking about the mystery at all; or to embrace the mysteriousness of the world and worship your confusion.

44. "Science" as Curiosity-Stopper - Although science does have explanations for phenomena, it is not enough to simply say that "Science!" is responsible for how something works -- nor is it enough to appeal to something more specific like "electricity" or "conduction". Yet for many people, simply noting that "Science has an answer" is enough to make them no longer curious about how it works. In that respect, "Science" is no different from more blatant curiosity-stoppers like "God did it!" But you shouldn't let your interest die simply because someone else knows the answer (which is a rather strange heuristic anyway): You should only be satisfied with a predictive model, and how a given phenomenon fits into that model.

45. Truly Part of You - Any time you believe you've learned something, you should ask yourself, "Could I re-generate this knowledge if it were somehow deleted from my mind, and how would I do so?" If the supposed knowledge is just empty buzzwords, you will recognize that you can't, and therefore that you haven't learned anything. But if it's an actual model of reality, this method will reinforce how the knowledge is entangled with the rest of the world, enabling you to apply it to other domains, and know when you need to update those beliefs. It will have become "truly part of you", growing and changing with the rest of your knowledge.

Interlude: The Simple Truth

This has been a collection of notes on the assigned sequence for this week. The most important part of the reading group though is discussion, which is in the comments section. Please remember that this group contains a variety of levels of expertise: if a line of discussion seems too basic or too incomprehensible, look around for one that suits you better!

The next reading will cover Part E: Overly Convenient Excuses (pp. 211-252). The discussion will go live on Wednesday, 15 July 2015 at or around 6 p.m. PDT, right here on the discussion forum of LessWrong.

The horrifying importance of domain knowledge

11 NancyLebovitz 30 July 2015 03:28PM

There are some long lists of false beliefs that programmers hold. This isn't because programmers are especially likely to be more wrong than anyone else; it's just that programming offers a better opportunity than most people get to find out how incomplete their model of the world is.

I'm posting about this here, not just because this information has a decent chance of being both entertaining and useful, but because LWers try to figure things out from relatively simple principles-- who knows what simplifying assumptions might be tripping us up?

The classic (and I think the first) was about names. There have been a few more lists created since then.

Time. And time zones. Crowd-sourced time errors.

Addresses. Possibly more about addresses. I haven't compared the lists.

Gender. This is so short I assume it's seriously incomplete.

Networks. Weirdly, there is no list of falsehoods programmers believe about html (or at least a fast search didn't turn anything up). Don't trust the words in the url.

Distributed computing. Build systems.

Poem about character conversion.

I got started on the subject because of this about testing your code, which was posted by Andrew Ducker.

Should you write longer comments? (Statistical analysis of the relationship between comment length and ratings)

11 cleonid 20 July 2015 02:09PM

A few months ago we launched an experimental website. In brief, our goal is to create a platform where unrestricted freedom of speech would be combined with high quality of discussion. The problem can be approached from two directions. One is to help users navigate through content and quickly locate the higher quality posts. Another, which is the topic of this article, is to help users improve the quality of their own posts by providing them with meaningful feedback.

One important consideration for those who want to write better comments is how much detail to leave out. Our statistical analysis shows that for many users there is a strong connection between the ratings and the size of their comments. For example, for Yvain (Scott Alexander) and Eliezer_Yudkowsky, the average number of upvotes grows almost linearly with increasing comment length.



This trend, however, does not apply to all posters. For example, for the group of top ten contributors (in the last 30 days) to LessWrong, the average number of upvotes increases only slightly with the length of the comment (see the graph below).  For quite a few people the change even goes in the opposite direction – longer comments lead to lower ratings.



Naturally, even if your longer comments are rated higher than the short ones, this does not mean that inflating comments would always produce positive results. For most users (including popular writers, such as Yvain and Eliezer), the average number of downvotes increases with increasing comment length. The data also shows that long comments that get most upvotes are generally distinct from long comments that get most downvotes. In other words, long comments are fine as long as they are interesting, but they are penalized more when they are not.



The rating patterns vary significantly from person to person. For some posters, the average number of upvotes remains flat until the comment length reaches some threshold and then starts declining with increasing comment length. For others, the optimal comment length may be somewhere in the middle. (Users who have accounts on both Lesswrong and Omnilibrium can check the optimal length for their own comments on both websites by using this link.)
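The kind of analysis described above can be sketched in a few lines. This is a hypothetical reconstruction, not Omnilibrium's actual code, and the comment data below is invented for illustration:

```python
from statistics import mean

# Invented (length_in_characters, upvotes) pairs for one hypothetical user.
comments = [(120, 2), (340, 5), (560, 7), (80, 1), (900, 12),
            (430, 6), (700, 9), (150, 2), (1100, 14), (260, 4)]

def binned_averages(data, bin_width=250):
    """Average upvotes for comments grouped into length bins."""
    bins = {}
    for length, votes in data:
        bins.setdefault(length // bin_width, []).append(votes)
    return {b * bin_width: mean(v) for b, v in sorted(bins.items())}

def pearson_r(data):
    """Linear correlation between comment length and upvotes."""
    xs = [x for x, _ in data]
    ys = [y for _, y in data]
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in data)
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

print(binned_averages(comments))
print(f"share of rating variance explained by length: {pearson_r(comments) ** 2:.2f}")
```

Squaring the correlation gives the share of rating variance explained by length alone, the same kind of figure quoted for most users in this analysis.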

Obviously length is just one among many factors that affect comment quality, and for most users it does not explain more than 20% of the variation in their ratings. We have a few other ideas on how to provide people with meaningful feedback on both the style and the content of their posts. But before implementing them, we would like to get your opinions first. Would such feedback actually be useful to you?

State-Space of Background Assumptions

10 algekalipso 29 July 2015 12:22AM

Hello everyone!

My name is Andrés Gómez Emilsson, and I'm the former president of the Stanford Transhumanist Association. I just graduated from Stanford with a master's in computational psychology (my undergraduate degree was in Symbolic Systems, the major with the highest LessWronger density at Stanford and possibly of all universities).

I have a request for the LessWrong community: I would like as many of you as possible to fill out this questionnaire I created to help us understand what causes the diversity of values in transhumanism. The purpose of this questionnaire is twofold:


  1. Characterize the state-space of background assumptions about consciousness
  2. Evaluate the influence of beliefs about consciousness, as well as personality and activities, in the acquisition of memetic affiliations


The first part is not specific to transhumanism, and it will be useful whether or not the second is fruitful. What do I mean by the state-space of background assumptions? The best way to get a sense of what this would look like is to see the results of a previous study I conducted: State-space of drug effects. There I asked participants to "rate the effects of a drug they have taken" by selecting the degree to which certain phrases describe the effects of the drug. I then conducted factor analysis on the dataset and extracted 6 meaningful factors accounting for more than 50% of the variance. Finally, I mapped the centroid of the responses of each drug in the state-space defined, so that people could visually compare the relative position of all of the substances in a normalized 6-dimensional space. 
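For readers curious about the mechanics, the variance-accounting step of such a study can be sketched as follows. This is a toy stand-in (PCA via power iteration rather than full factor analysis with rotation and uniquenesses), and the survey data is simulated, not the real drug-study responses:

```python
import random

random.seed(0)

# Simulate survey responses: two latent factors drive six observed ratings.
def respondent():
    f1, f2 = random.gauss(0, 1), random.gauss(0, 1)
    noise = lambda: random.gauss(0, 0.3)
    return [f1 + noise(), f1 + noise(), f1 + noise(),
            f2 + noise(), f2 + noise(), f2 + noise()]

data = [respondent() for _ in range(500)]
n, p = len(data), len(data[0])

# Center the columns and form the covariance matrix.
means = [sum(row[j] for row in data) / n for j in range(p)]
X = [[row[j] - means[j] for j in range(p)] for row in data]
C = [[sum(X[i][a] * X[i][b] for i in range(n)) / n for b in range(p)]
     for a in range(p)]

def top_eigen(M, iters=200):
    """Largest eigenvalue/eigenvector of a symmetric matrix by power iteration."""
    v = [1.0] * len(M)
    for _ in range(iters):
        w = [sum(M[i][j] * v[j] for j in range(len(M))) for i in range(len(M))]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    lam = sum(v[i] * sum(M[i][j] * v[j] for j in range(len(M)))
              for i in range(len(M)))
    return lam, v

total_var = sum(C[j][j] for j in range(p))
explained, M = 0.0, [row[:] for row in C]
for _ in range(2):                  # extract two components
    lam, v = top_eigen(M)
    explained += lam
    M = [[M[i][j] - lam * v[i] * v[j] for j in range(p)] for i in range(p)]  # deflate

print(f"variance explained by 2 components: {explained / total_var:.0%}")
```

Because two latent factors really do generate the simulated ratings, two extracted components account for nearly all the variance; with real survey data, the analogue of the 6 factors and 50%-plus figure above falls out of the same computation.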

I don't know what the state-space of background assumptions about consciousness looks like, but hopefully the analysis of the responses to this survey will reveal it.

The second part is specific to transhumanism, and I think it should concern us all. To the extent that we are participating in the historical debate about what the future of humanity should be, it is important for us to know what makes people prefer certain views over others. To give you a fictitious example of a possible effect I might discover: It may turn out that being very extraverted predisposes you to be uninterested in Artificial Intelligence and its implications. If this is the case, we could pin-point possible sources of bias in certain communities and ideological movements, thereby increasing the chances of making more rational decisions.

The survey is scheduled to be closed in 2 days, on July 30th 2015. That said, I am willing to extend the deadline until August 2nd if I see that the number of LessWrongers answering the questionnaire is not slowing down by the 30th. [July 31st edit: I extend the deadline until midnight (California time) of August 2nd of 2015.]

Thank you all!

Andrés :)


Here are some links about my work in case you are interested and want to know more:

Survey link

Qualia Computing

Psychophysics for Psychedelic Research 

Psychedelic Perception of Visual Textures

The Psychedelic Future of Consciousness

A Workable Solution to the Problem of Other Minds

How to accelerate your cognitive growth in 8 difficult steps

10 BayesianMind 22 July 2015 04:01AM

I believe there is some truth in William James' conclusion that "compared with what we ought to be, we are only half awake." (James, 1907). So what can we do to awaken our slumbering potentials? I am especially interested in our potential for cognitive growth, that is learning to think, learn, and decide better. Early in life we learn amazingly fast, but as we transition into adulthood our cognitive development plateaus, and most of us get stuck in suboptimal mental habits and never realize our full potential. I think this is very sad, and I wish we could find a way to accelerate our cognitive growth. Yesterday, I organized a discussion on this very topic at the Consciousness Hacking Meetup, and it inspired me to propose the following eight steps as a starting point for our personal and scientific exploration of interventions to promote cognitive growth:

1.    Tap into your intrinsic motivation by mental contrasting: Who do you want to become and why? Imagine your best possible self and how wonderful it will be to become that person. Imagine how wonderful it will be to have perfected the skill you seek to develop and how it will benefit you. Next, contrast the ideal future self you just imagined with who you are right now and be brutally honest with yourself. Realizing the discrepancy between who you are and who you want to be is a powerful motivator (Oettingen, et al., 2009). Finally, make yourself aware that you and the world around you will benefit from any progress that you make on yourself for a very, very long time. A few hours of hard work per week is a small price to pay for the sustained benefits of being a better person and feeling better about yourself for the rest of your life.

2.    Become more self-aware: Introspect, observe yourself, and ask your friends to develop an accurate understanding and acceptance of how you currently fare in the skill you want to improve and why. What do you do in situations that require the skill? How well does it work? How do you feel? Have you tried doing it differently? Are you currently improving? Why or why not?

3.    Develop a growth mindset (Dweck, 2006): Convince yourself that you will learn virtually any cognitive skill if you invest the necessary hard work. Even talent is just a matter of training. Each failure is a learning opportunity and so are your little successes along the way.

4.    Understand the skill and how it is learned: What do masters of this skill do? How does it work? How did they develop the skill? What are the intermediate stages? How can the skill be learned and practiced? Are there any exercises, tutorials, tools, books, or courses for acquiring the skill you want to improve on?

5.    Create a growth structure for yourself:

a. Set SMART self-improvement goals (Doran, 1981). The first three steps give you a destination (i.e. a better version of yourself), a starting point (i.e. the awareness of your strengths and weaknesses), and a road map (i.e. how to practice). Now it is time to plan your journey. Which path do you want to take from who you are right now to who you want to be in the future? A good way to delineate your path might be to place a number of milestones and decide by when you want to have reached each of them. Milestones are specific, measurable goals that lie between where you are now and where you want to be. Starting with the first milestone, you can choose a series of steps and decide when to take each step. It helps to set concrete goals at the beginning of every day. To set good milestones and choose appropriate steps, you can ask yourself the following questions: What exactly do I want to learn? How will I know that I have learned it? What will I do to develop that skill? By which time do I want to have learned it?

b. Translate your goals into implementation intentions. An implementation intention is a simple IF-THEN plan. It specifies a concrete situation in which you will take action (IF) and what exactly you will do (THEN). Committing to an implementation intention will make you much more likely to seize opportunities to make progress towards your goals and eventually achieve them (Gollwitzer, 1999).

c. You can restructure your physical environment to make your goals and your progress more salient. To make your goals more salient you can write them down and post them on your desktop, in your office, and in your apartment. To make your progress more salient, make to-do lists and celebrate checking off every subtask that you have completed. Give yourself points for every task you complete and compute your daily score, e.g. the percentage of daily goals that you have accomplished. Celebrate these small moments of victory! Post your path and score board in a visible manner.

d. Restructure your social environment to make it more conducive to growth. You can share your self-improvement goals with a friend or mentor who helps you understand where you are at, encourages you to grow, and will hold you accountable for following through with your plan. Friends can make suggestions for what to try and give you feedback on how you are doing. They can also help you notice, appreciate and celebrate your progress. Identify social interactions that help you grow and seek them out more while changing or avoiding social interactions that hinder your growth.

e. There are also many things you can do to restructure your own mind for growth. There are at least three kinds of things you can do. First, you can be more mindful of what you do, how well it works, and why. Mindful learning is much more effective than mindless learning. Second, you can pay more attention to the moments when you do well at what you want to improve. Let yourself appreciate these small (or large) successes more—give yourself a compliment for getting better, smile, and give yourself a mental or physical pat on the shoulder. Attend specifically to your improvement. To do so, ask yourself whether you are getting better rather than how well you did. You can mentally contrast what you did this time with how poorly you used to do when you started working on that skill. Rate your improvement by how many percent better you perform now than you used to. Third, you can be kind to yourself: Don’t beat yourself up for failing and being imperfect. Instead, embrace failure as an opportunity for growth. This will allow you to continue practicing a skill that you have not mastered yet rather than giving up in frustration.

6.   Seek advice, experiment, and get feedback: Accept that you don’t know how to do it yet and adopt a beginner’s mindset. Curious infants learn much more rapidly than seniors who think they know it all. So emulate a curious infant rather than pretending that you know everything already. With this mindset, it will be much easier to seek advice from other people. Experimenting with new ways of doing things is critical, because if you merely repeat what you have done a thousand times the results won’t be dramatically different. Sometimes we are unaware of something large or small that really matters, and it is often hard to notice what you are doing wrong and what you are doing well. This is why it is crucial to get feedback; ideally from somebody who has already mastered the skill you are trying to learn.

7.  Practice, practice, practice. Becoming a world-class expert requires 10,000 hours of deliberate practice (Ericsson, Krampe, & Tesch-Romer, 1993). Since you probably don’t need to become the world’s leading expert in the skill you are seeking to develop, fewer hours will be sufficient. But the point is that you will have to practice a lot. You will have to challenge yourself regularly, and practicing will be hard. Schedule regular time to practice the skill. Make practicing a habit. Kindly help yourself resume the practice after you have let it slip.

8.  Reflect on your progress on a regular basis, perhaps at the end of every day. Ask yourself: What have I learned today/this week/this month? Am I making any progress? What did I do well? What will I do better tomorrow/this week/month?


Doran, G. T. (1981). There's a S.M.A.R.T. way to write management's goals and objectives. Management Review, 70(11), 35–36.

Dweck, C. (2006). Mindset: The new psychology of success. Random House.

Ericsson, K. A., Krampe, R. Th., & Tesch-Romer, C. (1993). The role of deliberate practice in the acquisition of expert performance. Psychological Review, 100(3), 363–406.

Gollwitzer, P. M. (1999). Implementation intentions: Strong effects of simple plans. American Psychologist, 54, 493-503.

James, W. (1907). The energies of men. Science, 321-332.

Oettingen, G., Mayer, D., Sevincer, A. T., Stephens, E. J., Pak, H. J., & Hagenah, M. (2009). Mental contrasting and goal commitment: The mediating role of energization. Personality and Social Psychology Bulletin, 35(5), 608–622.

AGI Safety Solutions Map

10 turchin 21 July 2015 02:41PM

When I started to work on the map of AI safety solutions, I wanted to illustrate the excellent article “Responses to Catastrophic AGI Risk: A Survey” by Kaj Sotala and Roman V. Yampolskiy, 2013, which I strongly recommend.

However, during the process I had a number of ideas to expand the classification of the proposed ways to create safe AI. In their article there are three main categories: social constraints, external constraints and internal constraints.

I added three more categories: "AI is used to create a safe AI", "Multi-level solutions" and "meta-level", which describes the general requirements for any AI safety theory.

In addition, I divided the solutions into simple and complex. Simple solutions are those whose recipe we know today. For example: “do not create any AI”. Most of these solutions are weak, but they are easy to implement.

Complex solutions require extensive research and the creation of complex mathematical models for their implementation, and could potentially be much stronger. But the odds are lower that there will be time to develop and implement them successfully.

Since the aforementioned article was published, several new ideas about AI safety have appeared.

The new ideas in the map are based primarily on the works of Ben Goertzel, Stuart Armstrong and Paul Christiano, but probably many more exist that have been published and didn’t come to my attention.

Moreover, I have some ideas of my own about how to create a safe AI and I have added them into the map too. Among them I would like to point out the following ideas:

1.     Restriction of self-improvement of the AI. Just as a nuclear reactor is controlled by regulating the intensity of the chain reaction, one may try to control AI by limiting its ability to self-improve in various ways.

2.     Capture the beginning of dangerous self-improvement. A potentially dangerous AI has a moment of critical vulnerability at start-up, just as a ballistic missile is most vulnerable at launch. Imagine that an AI gained an unauthorized malignant goal system and started to strengthen itself. At the beginning of this process it is still weak, and if it is below the level of human intelligence at this point, it may still be less intelligent than the average human even after several cycles of self-empowerment. Let's say it has an IQ of 50 and after self-improvement it rises to 90. At this level it is already committing violations that can be observed from the outside (especially unauthorized self-improvement), but it does not yet have the ability to hide them. At this point in time, you can turn it off. Alas, this idea will not work in all cases, as some objectives may become hazardous gradually as the scale grows (1,000 paperclips are safe, one billion are dangerous, 10^20 are an x-risk). This idea was put forward by Ben Goertzel.

3.     AI constitution. First, in order to describe the Friendly AI and human values we can use the existing body of criminal and other laws. (And if we create an AI that does not comply with criminal law, we are committing a crime ourselves.) Second, to describe the rules governing the conduct of AI, we can create a complex set of rules (laws that are much more complex than Asimov’s three laws), which will include everything we want from AI. This set of rules can be checked in advance by specialized AI, which calculates only the way in which the application of these rules can go wrong (something like mathematical proofs on the basis of these rules).

4.     "Philosophical landmines." In the map of AI failure levels I have listed a number of ways in which high-level AI may halt when faced with intractable mathematical tasks or complex philosophical problems. One may try to fight high-level AI using "landmines", that is, putting it in a situation where it will have to solve some problem within which more complex problems are encoded, the solving of which will cause it to halt or crash. These problems may include Godelian mathematical problems, nihilistic rejection of any goal system, or the inability of the AI to prove that it actually exists.

5. Multi-layer protection. The idea here is not that if we apply several methods at the same time, the likelihood of their success will add up; that notion will not work if all the methods are weak. The idea is that the methods of protection work together to protect the object from all sides. In a sense, human society works the same way: a child is educated by example as well as by rules of conduct; then he begins to understand the importance of compliance with these rules, but at the same time the law, the police and his neighbours are watching him, so he knows that criminal acts will put him in jail. As a result, lawful behaviour becomes a goal he finds rational to pursue. This idea can be reflected in a specific AI architecture, which will have at its core a set of immutable rules, around which is built a human emulation that makes high-level decisions, with complex tasks delegated to narrow Tool AIs. In addition, an independent emulation (a conscience) will check the ethics of its decisions. Decisions will first be tested in a multi-level virtual reality, and the self-improvement ability of the whole system will be significantly limited. That is, it will have an IQ of 300, but not a million. This will make it effective in solving aging and global risks, but it will also be predictable and understandable to us. The scope of its jurisdiction should be limited to a few important factors: prevention of global risks, death prevention, and the prevention of war and violence. But we should not trust it with such an ethically delicate topic as the prevention of suffering, which will be addressed with the help of conventional methods.

This map could be useful for the following applications:

1. As illustrative material in discussions. People often propose ad hoc solutions as soon as they learn about the problem of friendly AI, or focus on one favourite solution.

2. As a quick way to check whether a new solution really has been found.

3. As a tool to discover new solutions. Any systematisation creates "free cells" that one can try to fill with new solutions. One can also combine existing solutions or be inspired by them.

4. There are several new ideas in the map.

A companion to this map is the map of AI failures levels. In addition, this map is subordinated to the map of global risk prevention methods and corresponds to the block "Creating Friendly AI" Plan A2 within it.

The pdf of the map is here:


Previous posts:

A map: AI failures modes and levels

A Roadmap: How to Survive the End of the Universe

A map: Typology of human extinction risks

Roadmap: Plan of Action to Prevent Human Extinction Risks



(scroll down to see the map)

List of Fully General Counterarguments

9 Gunnar_Zarncke 18 July 2015 09:49PM

Follow-up to: Knowing About Biases Can Hurt People

See also: Fully General Counterargument (LW Wiki)

A fully general counterargument [FGCA] is an argument which can be used to discount any conclusion the arguer does not like.

The caveat is that the arguer doesn't need to be aware that this is the case. And if they are not aware of it, this looks much like the other biases we are prone to. The question is: Is there a tendency or risk to accidentally form FGCAs? Do we fall easily into this mind-trap?

This post tries to (non-exhaustively) list some FGCAs as well as possible countermeasures.

continue reading »

Moral AI: Options

9 Manfred 11 July 2015 09:46PM

Epistemic status: One part quotes (informative, accurate), one part speculation (not so accurate).

One avenue towards AI safety is the construction of "moral AI" that is good at solving the problem of human preferences and values. Five FLI grants have recently been funded that pursue different lines of research on this problem.

The projects, in alphabetical order:

Most contemporary AI systems base their decisions solely on consequences, whereas humans also consider other morally relevant factors, including rights (such as privacy), roles (such as in families), past actions (such as promises), motives and intentions, and so on. Our goal is to build these additional morally relevant features into an AI system. We will identify morally relevant features by reviewing theories in moral philosophy, conducting surveys in moral psychology, and using machine learning to locate factors that affect human moral judgments. We will use and extend game theory and social choice theory to determine how to make these features more precise, how to weigh conflicting features against each other, and how to build these features into an AI system. We hope that eventually this work will lead to highly advanced AI systems that are capable of making moral judgments and acting on them.

Techniques: Top-down design, game theory, moral philosophy

Previous work in economics and AI has developed mathematical models of preferences, along with algorithms for inferring preferences from observed actions. [Citation of inverse reinforcement learning] We would like to use such algorithms to enable AI systems to learn human preferences from observed actions. However, these algorithms typically assume that agents take actions that maximize expected utility given their preferences. This assumption of optimality is false for humans in real-world domains. Optimal sequential planning is intractable in complex environments and humans perform very rough approximations. Humans often don't know the causal structure of their environment (in contrast to MDP models). Humans are also subject to dynamic inconsistencies, as observed in procrastination, addiction and in impulsive behavior. Our project seeks to develop algorithms that learn human preferences from data despite the suboptimality of humans and the behavioral biases that influence human choice. We will test our algorithms on real-world data and compare their inferences to people's own judgments about their preferences. We will also investigate the theoretical question of whether this approach could enable an AI to learn the entirety of human values.

Techniques: Trying to find something better than inverse reinforcement learning, supervised learning from preference judgments

The future will see autonomous agents acting in the same environment as humans, in areas as diverse as driving, assistive technology, and health care. In this scenario, collective decision making will be the norm. We will study the embedding of safety constraints, moral values, and ethical principles in agents, within the context of hybrid human/agents collective decision making. We will do that by adapting current logic-based modelling and reasoning frameworks, such as soft constraints, CP-nets, and constraint-based scheduling under uncertainty. For ethical principles, we will use constraints specifying the basic ethical "laws", plus sophisticated prioritised and possibly context-dependent constraints over possible actions, equipped with a conflict resolution engine. To avoid reckless behavior in the face of uncertainty, we will bound the risk of violating these ethical laws. We will also replace preference aggregation with an appropriately developed constraint/value/ethics/preference fusion, an operation designed to ensure that agents' preferences are consistent with the system's safety constraints, the agents' moral values, and the ethical principles of both individual agents and the collective decision making system. We will also develop approaches to learn ethical principles for artificial intelligent agents, as well as predict possible ethical violations.

Techniques: Top-down design, obeying ethical principles/laws, learning ethical principles

The objectives of the proposed research are (1) to create a mathematical framework in which fundamental questions of value alignment can be investigated; (2) to develop and experiment with methods for aligning the values of a machine (whether explicitly or implicitly represented) with those of humans; (3) to understand the relationships among the degree of value alignment, the decision-making capability of the machine, and the potential loss to the human; and (4) to understand in particular the implications of the computational limitations of humans and machines for value alignment. The core of our technical approach will be a cooperative, game-theoretic extension of inverse reinforcement learning, allowing for the different action spaces of humans and machines and the varying motivations of humans; the concepts of rational metareasoning and bounded optimality will inform our investigation of the effects of computational limitations.

Techniques: Trying to find something better than inverse reinforcement learning (differently this time), creating a mathematical framework, whatever rational metareasoning is

Autonomous AI systems will need to understand human values in order to respect them. This requires having similar concepts as humans do. We will research whether AI systems can be made to learn their concepts in the same way as humans learn theirs. Both human concepts and the representations of deep learning models seem to involve a hierarchical structure, among other similarities. For this reason, we will attempt to apply existing deep learning methodologies for learning what we call moral concepts, concepts through which moral values are defined. In addition, we will investigate the extent to which reinforcement learning affects the development of our concepts and values.

Techniques: Trying to identify learned moral concepts, unsupervised learning 


The elephant in the room is that making judgments that always respect human preferences is nearly FAI-complete. Application of human ethics depends on human preferences in general, which depend on a model of the world and how actions impact it. Calling an action ethical can also depend on the space of possible actions, requiring a good judgment-maker to be capable of searching for good actions. Any "moral AI" we build with our current understanding is going to have to be limited and/or unsatisfactory.

Limitations might be things like judging which of two actions is "more correct" rather than finding correct actions, only taking input in terms of one paragraph-worth of words, or only producing good outputs for situations similar to some combination of trained situations.

Two of the proposals are centered on top-down construction of a system for making ethical judgments. Designing a system by hand, it's nigh-impossible to capture the subtleties of human values. Relatedly, it seems weak at generalization to novel situations, unless the specific sort of generalization has been foreseen and covered. The good points of a top-down approach are that it can capture things that are important, but are only a small part of the description, or are not easily identified by statistical properties. A top-down model of ethics might be used as a fail-safe, sometimes noticing when something undesirable is happening, or as a starting point for a richer learned model of human preferences.
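As a concrete illustration of the fail-safe role suggested here, a top-down rule layer can be as simple as a veto function over proposed actions. Everything below (rule names, the action encoding) is invented for illustration; the actual proposals involve far richer formalisms such as CP-nets and constraint reasoning:

```python
# Hand-coded constraints: each rule maps a proposed action (a dict of
# declared properties) to True if the action is acceptable under that rule.
RULES = [
    ("no_deception", lambda a: not a.get("deceives_user", False)),
    ("reversible_only", lambda a: a.get("reversible", True)),
    ("resource_cap", lambda a: a.get("resources_used", 0) <= 100),
]

def veto(action):
    """Names of the rules a proposed action violates (empty list = allowed)."""
    return [name for name, ok in RULES if not ok(action)]

proposed = {"name": "acquire_compute", "resources_used": 5000, "reversible": True}
print(veto(proposed))  # -> ['resource_cap']
```

Note that the filter only notices violations that are explicitly declared in the action's description, which is exactly the weakness of hand-designed systems: a top-down layer catches what its designers foresaw and nothing else.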

Other proposals are inspired by inverse reinforcement learning. Inverse reinforcement learning seems like the sort of thing we want - it observes actions and infers preferences - but it's very limited. The problem of having to know a very good model of the world in order to be good at human preferences rears its head here. There are also likely unforeseen technical problems in ensuring that the thing it learns is actually human preferences (rather than human foibles, or irrelevant patterns) - though this is, in part, why this research should be carried out now.
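The core observe-actions-infer-preferences step can be sketched in its simplest form: a softmax-choice model fit by gradient ascent on the likelihood of observed choices. This is a toy single-step version with invented data; real inverse reinforcement learning must additionally handle sequential decisions, an environment model, and human suboptimality:

```python
import math
import random

random.seed(1)
TRUE_W = [2.0, -1.0]          # the hidden preference weights we try to recover

def softmax_choice(options, w):
    """Sample an option index with probability proportional to exp(reward)."""
    scores = [math.exp(sum(wi * fi for wi, fi in zip(w, f))) for f in options]
    r, acc = random.random() * sum(scores), 0.0
    for i, s in enumerate(scores):
        acc += s
        if r <= acc:
            return i
    return len(scores) - 1

# Each observation: three options (2-feature vectors) and the agent's choice.
observations = []
for _ in range(500):
    opts = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(3)]
    observations.append((opts, softmax_choice(opts, TRUE_W)))

# Gradient ascent on the log-likelihood of the observed choices:
# d/dw_k log p(chosen) = f_chosen[k] - E_softmax[f[k]].
w = [0.0, 0.0]
for _ in range(300):
    grad = [0.0, 0.0]
    for opts, chosen in observations:
        scores = [math.exp(sum(wi * fi for wi, fi in zip(w, f))) for f in opts]
        total = sum(scores)
        for k in range(2):
            expected = sum(s * f[k] for s, f in zip(scores, opts)) / total
            grad[k] += opts[chosen][k] - expected
    w = [wi + 0.1 * g / len(observations) for wi, g in zip(w, grad)]

print("recovered weights:", [round(x, 2) for x in w])
```

With enough observations the recovered weights approach TRUE_W; if the chooser is noisy or biased in ways the model doesn't capture, the naive fit recovers something systematically wrong, which is precisely the research problem the grant proposals describe.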

Some proposals want to take advantage of learning using neural networks, trained on peoples' actions or judgments. This sort of approach is very good at discovering patterns, but not so good at treating patterns as a consequence of underlying structure. Such a learner might be useful as a heuristic, or as a way to fill in a more complicated, specialized architecture. For this approach, like the others, it seems important to make the most progress toward learning human values in a way that doesn't require a very good model of the world.

The Person As Input

8 Eneasz 08 July 2015 12:40AM

I. Humans are emotion-feeling machines. 

I don’t mean that humans are machines that happen to feel emotions. I mean that humans are machines whose output is the feeling of emotions—“emotion-feeling” is the thing of value that we produce.

Not just “being happy”: if that were the whole story, wireheading would be the ultimate good, rather than the go-to utopia-horror example. But emotions must be involved, because everything else one can do is no more than a means to an end. Producing things, propagating life, even thinking. They all seem like useful endeavors, but a life spent maximizing those things would suck. And the implication is that if we can create a machine that can do those things better than we can, it would be good to replace ourselves with that machine and set it to reproduce itself infinitely. 

I recently saw a statement to the effect of “Art exists to produce feelings in us that we want, but do not get enough of in the course of normal life.” That’s what makes art valuable – supplementing emotional malnutrition. Such a thing exists because “to feel emotions” is the core function of humanity, and not fulfilling that function hurts like not eating does.

This is why (for many people) the optimal level of psychosis is non-zero. This is why intelligence is important – a greater level of intelligence allows a species to experience far more complex and nuanced emotional states. And the ability to experience more varieties of emotions is why it’s better to become more complex rather than simply dialing up happiness. It’s why disorders that prevent us from experiencing certain emotions are so awful (with the worst obviously being the ones that prevent us from feeling the “best” desires).

It’s why we like funny things, and tragic things, and scary things. Who wants to feel the way they feel after watching all of Evangelion?? Turns out – everyone, at some point, for at least a little bit of time!

It is why all human life has value. You do not matter based on what you can produce, or how smart you are, or how useful you are to others. You matter because you are a human who feels things.

My utility function is to feel a certain elastic web of emotions, and it varies from other utility functions by which emotions are desired in which amounts. My personality determines what actions produce what emotions.

And a machine that could feel things even better than humans can could be a wonderful thing. Greg Egan’s "Diaspora" features an entire society of uploaded humans, living rich, complex lives of substance. Loving, striving, crying, etc. The society can support far more humans than is physically possible in meat-bodies, running far faster than is possible in realspace. Since all these humans are running on computer chips, one could argue that one way of looking at this thing is not “A society of uploaded humans” but “A machine that feels human emotions better than meat-humans do.” And it’s a glorious thing. I would be happy to live in such a society.


II. God Mode is Super Lame

Why not just wirehead with a large and complex set of emotions?

I’m old enough to have played the original Doom when it came out (sooo old!). It had a cheat-code that made you invincible, commonly called god-mode. The first thing you notice is that it’s super cool to be invincible and just mow down all those monsters with impunity! The next thing you notice is that after a while (maybe ten minutes?) it loses all appeal. It becomes boring. There is no game anymore, once you no longer have to worry about taking damage. It becomes a task. You start enabling other cheats to get through it faster. Full-ammo cheats, to just use the biggest, fastest gun nonstop and get those monsters out of your way. Then walk-through-wall cheats, so you can just go straight to the level exit without wandering around looking for keys. Over, and over, and over again, level after level. It becomes a Kafka-esque grotesquery. Why am I doing this? Why am I here? Is my purpose just to keep walking endlessly from Spawn Point to Exit, the world passing around me in a blur, green and blue explosions obscuring all vision? When will this end?

It was a relief to be finished with the game.

That was my generation’s first brush with the difference between goal-oriented objectives, and process-oriented objectives. We learned that the point of a game isn’t to get to the end, the point is to play the game. It used to be that if you wanted to be an awesome guitarist, you had to go through the process of playing guitar a LOT. There was no shortcut. So one could be excused for confusing “I want to be a rock star” with “I want to be playing awesome music.” Before cheat codes, getting to the end of the game was fun, so we thought that was our objective. After cheat-codes we could go straight to the end any time we wanted, and now we had to choose – is your objective really just to get to the end? Or is it to go through the process of playing the game?

Some things are goal-oriented, of course. Very few people clean their toilets because they enjoy the process of cleaning their toilet. They want their toilet to be clean. If they could push a button and have a clean toilet without having to do the cleaning, they would.

Process-oriented objectives still have a goal. You want to beat the game. But you do not want first-order control over the bit “Game Won? Y/N”. You want first-order control over the actions that can get you there – strafing, shooting, jumping – resulting in second-order control over if the bit finally gets flipped or not.

First-order control is god mode. Your goal is completed with full efficiency. Second-order control is indirect. You can take actions, and those actions will, if executed well, get you closer to your goal. They are fuzzier, you can be wrong about their effects, their effects can be inconsistent over time, and you can get better at using them. You can tell if you’d prefer god-mode for a task by considering if you’d like to have it completed without going through the steps.

Do you want to:

Have Not Played The Game, And Have It Completed?  or Be Playing The Game?
Have A Clean Toilet, Without Cleaning It Yourself? or Be Cleaning The Toilet?
Be At The End of a Movie? or Be Watching The Movie?

If the answer is in the first column, you want first-order control. If it is in the second column, you want second-order control.

Wireheading, even variable multi-emotional wireheading, assumes that emotions are a goal-oriented objective, and thus takes first-order control of one’s emotional state. I contest that emotions are a process-oriented objective. The purpose is to evoke those emotions by using second-order control – taking actions that will lead to those emotions being felt. To eliminate that step and go straight to the credits is to lose the whole point of being human.


III. Removing The Person From The Output

How is the process of playing Doom without cheat codes distinguished from the process of repeatedly pushing a button connected to certain electrodes in your head that produce the emotions associated with playing Doom without cheat codes? (Or just lying there while the computer chooses which electrodes to stimulate on your behalf?)

If it’s just the emotions without the experiences that would cause those emotions, I think that’s a huge difference. That is once again just jumping right to the end-state, rather than experiencing the process that brings it about. It’s first-order control, and that efficiency and directness strips out all the complexity and nuance of a second-order experience.

See Incoming Fireball -> Startled, Fear
Strafe Right -> Anticipation, Dread
Fireball Dodged -> Relief
Return Fire -> Vengeance!!

Is strictly more complicated than just

Startled, Fear
Anticipation, Dread

The key difference being that in the first case, the player is entangled in the process. While these things are designed to produce a specific and very similar experience for everyone (which is why they’re popular with a wide player base), it takes a pre-existing person and combines them with a series of elements that are supposed to lead to an emotional response. The exact situation is unique(ish) for each person, because the person is a vital input. The output (of person feeling X emotions) is unique and personalized, as the input is different in every case.

When simply conjuring the emotions directly via wire, the individual is removed as an input. The emotions are implanted directly and do not depend on the person. The output (of person feeling X emotions) is identical and of far less complexity and value. Even if the emotions are hooked up to a random number generator or in some other way made to result in non-identical outputs, the situation is not improved. Because the problem isn’t so much “identical output” as it is that the Person was not an input, was not entangled in the process, and therefore doesn’t matter.

I actually don’t have much of a problem with simulated realities. Already a large percentage of the emotions felt by middle-class people in the first world are due to simulated realities. We induce feelings via music, television/movies, video games, novels, and other art. I think this has had some positive effects on society – it’s nice when people can get their Thrill needs met without actually risking their lives and/or committing crimes. In fact, the sorts of people who still try to get all their emotional needs met in the real world tend to be destructive and dramatic; I’m sure everyone knows at least one person like that, and tries to avoid them.

I think a complete retreat to isolation would be sad, because other human minds are the most complex things that exist, and to cut that out of one’s life entirely would be an impoverishment. But a community of people interacting in a cyberworld, with access to physical reality? Shit, that sounds amazing!

Of course a “Total Recall” style system has the potential to become nightmarish. Right now when someone watches a movie, they bring their whole life with them. The movie is interpreted in light of one’s life experience. Every viewer has a different experience (some people have radically different experiences, as my SO and I recently discovered when we watched Birdman together. In fact, comparing these differences in experience is the most fun part of my bi-weekly book club meetings. It’s kinda the whole point.). The person is an input in the process, and they’re mashed up into the product. If your proposed system would simply impose a memory or an experience onto someone else wholesale* without them being involved in the process, then it would be just as bad as the “series of emotions” process.

I have a vision of billions of people spending all of eternity simply reliving the most intense emotional experiences ever recorded, in perfect carbon copy, over and over again, and I shudder in horror. That’s not even being a person anymore. That’s overwriting your own existence with the recorded existence of someone(s) else. :(

But a good piece of art, that respects the person-as-input, and uses the artwork to cause them to create/feel more of their own emotions? That seems like a good thing.

(*this was adapted from a series of posts on my blog)

Zooming your mind in and out

8 John_Maxwell_IV 06 July 2015 12:30PM

I recently noticed I had two mental processes opposing one another in an interesting way.

The first mental process was instilled by reading Daniel Kahneman on the focusing illusion and Paul Graham on procrastination.  This process encourages me to "zoom out" when engaging in low-value activities so I can see they don't deliver much value in the grand scheme of things.

The second mental process was instilled by reading about the importance of just trying things.  (These articles could be seen as steelmanning Mark Friedenbach's recent Less Wrong critique.)  This mental process encourages me to "zoom in" and get my hands dirty through experimentation.

Both these processes seem useful.  Instead of spending long stretches of time in either the "zoomed in" or "zoomed out" state, I think I'd do better flip-flopping between them.  For example, if I'm wandering down internet rabbit holes, I'm spending too much time zoomed in.  Asking "why" repeatedly could help me realize I'm doing something low value.  If I'm daydreaming or planning lots with little doing, I'm spending too much time zoomed out.  Asking "how" repeatedly could help me identify a first step.

This fits in with construal level theory, aka "near/far theory" as discussed by Robin Hanson.  (I recommend the reviews Hanson links to; they gave me a different view of the concept than his standard presentation.)  To be more effective, maybe one should increase cross communication between the "near" and "far" modes, so the parts work together harmoniously instead of being at odds.

If Hanson's view is right, maybe the reason people become uncomfortable when they realize they are procrastinating (or not Just Trying It) is that this maps to getting caught red-handed in an act of hypocrisy in the ancestral environment.  You're pursuing near interests (watching Youtube videos) instead of working towards far ideals (doing your homework)?  For shame!

(Possible cure: Tell yourself that there's nothing to be ashamed of if you get stuck zoomed in; it happens to everyone.  Just zoom out.)

Part of me is reluctant to make this post, because I just had this idea and it feels like I should test it out more before writing about it.  So here are my excuses:

1. If I wait until I develop expertise in everything, it may be too late to pass it on.

2. In order to see if this idea is useful, I'll need to pay attention to it.  And writing about it publicly is a good way to help myself pay attention to it, since it will become part of my identity and I'll be interested to see how people respond.

There might be activities people already do on a regular basis that consist of repeated zooming in and out.  If so, engaging in them could be a good way to build this mental muscle.  Can anyone think of something like this?

Green Emeralds, Grue Diamonds

8 Stuart_Armstrong 06 July 2015 11:27AM

A putative new idea for AI control; index here.

When posing his "New Riddle of Induction", Goodman introduced the concepts of "grue" and "bleen" to show some of the problems with the conventional understanding of induction.

I've somewhat modified those concepts. Let T be a set of intervals in time, and we'll use the boolean X to designate the fact that the current time t belongs to T (with ¬X equivalent to t∉T). We'll define an object to be:

  • Grue if it is green given X (ie whenever t∈T), and blue given ¬X (ie whenever t∉T).
  • Bleen if it is blue given X, and green given ¬X.

At this point, people are tempted to point out the ridiculousness of the concepts, dismissing them because of their strange disjunctive definitions. However, this doesn't really solve the problem; if we take grue and bleen as fundamental concepts, then we have the disjunctively defined green and blue; an object is:

  • Green if it is grue given X, and bleen given ¬X.
  • Blue if it is bleen given X, and grue given ¬X.

Still, the categories green and blue are clearly more fundamental than grue and bleen. There must be something we can whack them with to get this - maybe Kolmogorov complexity or stuff like that? Sure, someone on Earth could make a grue or bleen object (a screen with a timer, maybe?), but it would be completely artificial. Note that though grue and bleen are unnatural, "currently grue" (colour=green XOR ¬X) or "currently bleen" (colour=blue XOR ¬X) make perfect sense (though they require knowing X, an important point for later on).
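The symmetry between the two vocabularies can be sketched in a few lines of Python (an illustrative toy of my own, not anything from Goodman; `x` plays the role of the boolean X). Each pair of colour predicates is definable from the other by exactly the same disjunctive pattern:

```python
# x is True when the current time t is in T.

def is_grue(colour: str, x: bool) -> bool:
    """Grue: green given X, blue given not-X."""
    return colour == ("green" if x else "blue")

def is_bleen(colour: str, x: bool) -> bool:
    """Bleen: blue given X, green given not-X."""
    return colour == ("blue" if x else "green")

# Going the other way: green/blue defined from grue/bleen,
# with exactly the same disjunctive shape.
def is_green(grue: bool, bleen: bool, x: bool) -> bool:
    return grue if x else bleen

def is_blue(grue: bool, bleen: bool, x: bool) -> bool:
    return bleen if x else grue

# An emerald (always green) is grue during X and bleen outside it:
assert is_grue("green", x=True) and is_bleen("green", x=False)
```

Nothing in the code marks one vocabulary as primitive; whatever makes green more natural than grue has to come from outside the definitions.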

But before that... are we so sure the grue and bleen categories are unnatural? Relative to what?


Welcome to Chiron Beta Prime

Chiron Beta Prime, apart from having its own issues with low-intelligence AIs, is noted for having many suns: one large sun that glows mainly in the blue spectrum, and multiple smaller ones glowing mainly in the green spectrum. They all emit in the totality of the spectrum, but they are stronger in those colours.

Because of the way the orbits are locked to each other, the green suns are always visible from everywhere. The blue sun rises and sets on a regular schedule; define T to be time when the blue sun is risen (so X="Blue sun visible, some green suns visible" and ¬X="Blue sun not visible, some green suns visible").

Now "green" is a well defined concept in this world. Emeralds are green; they glow green under the green suns, and do the same when the blue sun is risen. "Blue" is also a well-defined concept. Sapphires are blue. They glow blue under the blue sun and continue to do so (albeit less intensely) when it is set.

But "grue" is also a well defined concept. Diamonds are grue. They glow green when the green suns are the only ones visible, but glow blue under the glare of the blue sun.

Green, blue, and grue (which we would insist on calling green, blue and white) are thus well understood and fundamental concepts that people of this world use regularly to compactly convey useful information to each other. They match up easily to fundamental properties of the objects in question (eg frequency of light reflected).

Bleen, on the other hand - don't be ridiculous. Sure, someone on Chiron Beta Prime could make a bleen object (a screen with a timer, maybe?), but it would be completely artificial.

In contrast, the inhabitants of Pholus Delta Secundus, who have a major green sun and many minor blue suns (coincidentally with exactly the same orbital cycles), feel that green, blue and bleen are the natural categories...


Natural relative to the (current) universe

We've shown that some categories that we see as disjunctive or artificial can seem perfectly natural and fundamental to beings in different circumstances. Here's another example:

A philosopher proposes, as thought experiment, to define a certain concept for every object. It's the weighted sum of the inverse of the height of an object (from the centre of the Earth), and its speed (squared, because why not?), and its temperature (but only on an "absolute" scale), and some complicated thing involving its composition and shape, and another term involving its composition only. And maybe we can add another piece for its total mass.

And then that philosopher proposes, to great derision, that this whole messy sum be given a single name, "Energy", and that we start talking about it as if it was a single thing. Faced with such an artificially bizarre definition, sensible people who want to use induction properly have no choice... but to embrace energy as one of the fundamental useful facts of the universe.

What these examples show is that green, blue, grue, bleen, and energy are not natural or non-natural categories in some abstract sense, but relative to the universe we inhabit. For instance, if we had some strange energy' which used the inverse of the height cubed, then we'd have a useless category - unless we lived in five spatial dimensions.


You're grue, what time is it?

So how can we say that green and blue are natural categories in our universe, while grue and bleen are not? A very valid explanation seems to be the dependence on X - on the time of day. On our Earth, we can tell whether objects are green or blue without knowing anything about the time. Certainly we can get combined information about an object's colour and the time of day (for instance by looking at emeralds out in the open). But we also expect to get information about the colour (by looking at an object in a lit basement) and the time (by looking at a clock). And we expect these pieces of information to be independent of each other.

In contrast, we never expect to get information about an object being currently grue or currently bleen without knowing the time (or the colour, for that matter). And information about the time can completely change our assessment as to whether an object is grue versus bleen. It would be a very contrived set of circumstances where we would be able to assert "I'm pretty sure that object is currently grue, but I have no idea about its colour or about the current time".

Again, this is a feature of our world and the evidence we see in it, not some fundamental feature of the categories of grue and bleen. We just don't generally see green objects change into blue objects, nor do we typically learn about disjunctive statements of the type "colour=green XOR time=night" without learning about the colour and the time separately.
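The independence point can be made concrete with a toy calculation (my own illustration, under the assumption that colour and time are independent fair coin flips, as on Earth): "currently grue" is the XOR of colour and time, and such a bit carries zero mutual information with colour alone, even though colour and time together determine it completely.

```python
import math
from itertools import product

def mutual_information(joint):
    """I(A;B) in bits, for a joint distribution given as {(a, b): p}."""
    pa, pb = {}, {}
    for (a, b), p in joint.items():
        pa[a] = pa.get(a, 0.0) + p
        pb[b] = pb.get(b, 0.0) + p
    return sum(p * math.log2(p / (pa[a] * pb[b]))
               for (a, b), p in joint.items() if p > 0)

# Joint distribution of (colour, currently-grue?), marginalising over time.
# Colour and night-time are independent fair coins; grue-now = green XOR night.
joint = {}
for colour, night in product(["green", "blue"], [False, True]):
    grue_now = (colour == "green") != night
    joint[(colour, grue_now)] = joint.get((colour, grue_now), 0.0) + 0.25

print(mutual_information(joint))  # prints 0.0: colour alone says nothing about grue-ness
```

On Chiron Beta Prime the analogous joint distribution would not factorize this way, which is one way of cashing out why grue is a workable category there and not here.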

What about the grue objects on Chiron Beta Prime? There, people do see objects change colour regularly, and, upon investigation, they can detect whether an object is grue without knowing either the time or the apparent colour of the object. For instance, they know that diamond is grue, so they can detect some grue objects by a simple hardness test.

But what's happening is that the Chiron Beta Primers have correctly identified a fundamental category - the one we call white, or, more technically "prone to reflect light both in the blue and green parts of the spectrum" - that has different features on their planet than on ours. From the macroscopic perspective, it's as if we and they live in a different universe, hence grue means something to them and not to us. But the same laws of physics underlie both our worlds, so fundamentally the concepts converge - our white, their grue, mean the same things at the microscopic level.


Definitions open to manipulation

In the next post, I'll look at whether we can formalise "expect independent information about colour and time", and "we don't expect a change to the time information to change our colour assessment."

But be warned. The naturalness of these categories is dependent on facts about the universe, and these facts could be changed. A demented human (or a powerful AI) could go through the universe, hiding everything in boxes, smashing clocks, and putting "current bleen detectors" all over the place, so that it suddenly becomes very easy to know statements like "colour=blue XOR time=night", but very hard to know about colour (or time) independently from this. So it would be easy to say "this object is currently bleen", but hard to say "this object is blue". Thus the "natural" categories may be natural now, but this could well change, so we must take care when using these definitions to program an AI.

AI: requirements for pernicious policies

7 Stuart_Armstrong 17 July 2015 02:18PM

Some have argued that "tool AIs" are safe(r). Recently, Eric Drexler decomposed AIs into "problem solvers" (eg calculators), "advisors" (eg GPS route planners), and "actors" (autonomous agents). Both solvers and advisors can be seen as examples of tools.

People have argued that tool AIs are not safe. It's hard to imagine a calculator going berserk, no matter what its algorithm is, but it's not too hard to come up with clear examples of dangerous tools. This suggests that solvers vs advisors vs actors (or tools vs agents, or oracles vs agents) is not the right distinction.

Instead, I've been asking: how likely is the algorithm to implement a pernicious policy? If we model the AI as having an objective function (or utility function) and algorithm that implements it, a pernicious policy is one that scores high in the objective function but is not at all what is intended. A pernicious policy could be harmless and entertaining or much more severe.

I will lay aside, for the moment, the issue of badly programmed algorithms (possibly containing their own objective sub-functions). In any case, to implement a pernicious policy, we have to ask these questions about the algorithm:

  1. Do pernicious policies exist? Are there many?
  2. Can the AI find them?
  3. Can the AI test them?
  4. Would the AI choose to implement them?

The answer to 1. seems to be trivially yes. Even a calculator could, in theory, output a series of messages that socially hack us, blah, take over the world, blah, extinction, blah, calculator finishes its calculations. What is much more interesting is that some types of agents have many more pernicious policies than others. This seems the big difference between actors and other designs. An actor AI in complete control of the USA or Russia's nuclear arsenal has all sorts of pernicious policies easily to hand; an advisor or oracle has far fewer (generally going through social engineering), a tool typically fewer still. A lot of the physical protection measures are about reducing the number of successful pernicious policies the AI has access to.

The answer to 2. is mainly a function of the power of the algorithm. A basic calculator will never find anything dangerous: its programming is simple and tight. But compare an agent with the same objective function and the ability to do an unrestricted policy search with vast resources... So it seems that the answer to 2. does not depend on any solver vs actor division, but purely on the algorithm used.

And now we come to the big question 3., whether the AI can test these policies. Even if the AI can find pernicious policies that rank high on its objective function, it will never implement them unless it can ascertain this fact. And there are several ways it could do so. Let's assume that a solver AI has a very complicated objective function - one that encodes many relevant facts about the real world. Now, the AI may not "care" about the real world, but it has a virtual version of it, in which it can virtually test all of its policies. With enough computing power, it can establish whether the pernicious policy would be effective at achieving its virtual goal. If this is a good approximation of how the pernicious policy would behave in the real world, we could have a problem.

Extremely detailed objective functions are unlikely, but even simple ones can show odd behaviour if the agent gets to interact repeatedly with the real world - this is the issue with reinforcement learning. Suppose that the agent attempts a translation job, and is rewarded on the accuracy of its translation. Depending on the details of what the AI knows and who chooses the rewards, the AI could end up manipulating its controllers, similarly to this example. The problem is that once there is any interaction, all the complexity of humanity could potentially show up in the reward function, even if the objective function is simple.

Of course, some designs make this very unlikely - resetting the AI periodically can help to alleviate the problem, as can choosing more objective criteria for any rewards. Lastly on this point, we should mention the possibility that human R&D, by selecting and refining the objective function and the algorithm, could take the role of testing the policies. This is likely to emerge only in cases where many AI designs are considered, and the best candidates are retained based on human judgement.

Finally we come to the question of whether the AI will implement the policy once it has found and tested it. You could say that the point of FAI is to create an AI that doesn't choose to implement pernicious policies - but, more correctly, the point of FAI is to ensure that very few (or zero) pernicious policies exist in the first place, as they all score low on the utility function. However, there are a variety of more complicated designs - satisficers, agents using crude measures - where the questions of "Do pernicious policies exist?" and "Would the AI choose to implement them?" could become quite distinct.


Conclusion: a more thorough analysis of AI designs is needed

A calculator is safe because it is a solver, it has a very simple objective function with no holes in the algorithm, and it can neither find nor test any pernicious policies. It is the combination of these elements that makes it almost certainly safe. If we want to make the same claim about other designs, neither "it's just a solver" nor "its objective function is simple" would be enough; we need a careful analysis.

Though, as usual, "it's not certainly safe" is a quite distinct claim from "it's (likely) dangerous", and they should not be conflated.

Rationality Reading Group: Part E: Overly Convenient Excuses

7 Gram_Stone 16 July 2015 03:38AM

This is part of a semi-monthly reading group on Eliezer Yudkowsky's ebook, Rationality: From AI to Zombies. For more information about the group, see the announcement post.

Welcome to the Rationality reading group. This fortnight we discuss Part E: Overly Convenient Excuses (pp. 211-252). This post summarizes each article of the sequence, linking to the original LessWrong post where available.

Essay: Rationality: An Introduction

E. Overly Convenient Excuses

46. The Proper Use of Humility - There are good and bad kinds of humility. Proper humility is not being selectively underconfident about uncomfortable truths. Proper humility is not the same as social modesty, which can be an excuse for not even trying to be right. Proper scientific humility means not just acknowledging one's uncertainty with words, but taking specific actions to plan for the case that one is wrong.

47. The Third Alternative - People justify Noble Lies by pointing out their benefits over doing nothing. But, if you really need these benefits, you can construct a Third Alternative for getting them. How? You have to search for one. Beware the temptation not to search or to search perfunctorily. Ask yourself, "Did I spend five minutes by the clock trying hard to think of a better alternative?"

48. Lotteries: A Waste of Hope - Some defend lottery-ticket buying as a rational purchase of fantasy. But you are occupying your valuable brain with a fantasy whose probability is nearly zero, wasting emotional energy. Without the lottery, people might fantasize about things that they can actually do, which might lead to thinking of ways to make the fantasy a reality. To work around a bias, you must first notice it, analyze it, and decide that it is bad. Lottery advocates are failing to complete the third step.

49. New Improved Lottery - If the opportunity to fantasize about winning justified the lottery, then a "new improved" lottery would be even better. You would buy a nearly-zero chance to become a millionaire at any moment over the next five years. You could spend every moment imagining that you might become a millionaire at that moment.

50. But There's Still A Chance, Right? - Sometimes, you calculate the probability of a certain event and find that the number is so unbelievably small that your brain really can't keep track of how small it is, any more than you can spot an individual grain of sand on a beach from 100 meters off. But, because you're already thinking about that event enough to calculate the probability of it, it feels like it's still worth keeping track of. It's not.

51. The Fallacy of Gray - Nothing is perfectly black or white. Everything is gray. However, this does not mean that everything is the same shade of gray. It may be impossible to completely eliminate bias, but it is still worth reducing bias.

52. Absolute Authority - Those without the understanding of the Quantitative Way will often map the process of arriving at beliefs onto the social domains of Authority. They think that if Science is not infinitely certain, or if it has ever admitted a mistake, then it is no longer a trustworthy source, and can be ignored. This cultural gap is rather difficult to cross.

53. How to Convince Me That 2 + 2 = 3 - The way to convince Eliezer that 2+2=3 is the same way to convince him of any proposition: give him enough evidence. If all available evidence, social, mental and physical, starts indicating that 2+2=3, then you will shortly convince Eliezer that 2+2=3 and that something is wrong with his past or recollection of the past.

54. Infinite Certainty - If you say you are 99.9999% confident of a proposition, you're saying that you could make one million equally likely statements and be wrong, on average, once. Probability 1 indicates a state of infinite certainty. Furthermore, once you assign a probability 1 to a proposition, Bayes' theorem says that it can never be changed, in response to any evidence. Probability 1 is a lot harder to get to with a human brain than you would think.

55. 0 And 1 Are Not Probabilities - In the ordinary way of writing probabilities, 0 and 1 both seem like entirely reachable quantities. But when you transform probabilities into odds ratios, or log-odds, you realize that in order to get a proposition to probability 1 would require an infinite amount of evidence.
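The log-odds point is easy to make concrete. Here is a quick Python sketch (with illustrative numbers, not from the original essay) showing that each additional "nine" of confidence costs roughly another 10 decibels of evidence, and that probability 1 sits infinitely far away:

```python
import math

def log_odds_db(p):
    """Convert a probability to log-odds, measured in decibels of evidence."""
    return 10 * math.log10(p / (1 - p))

for p in [0.5, 0.9, 0.99, 0.999999]:
    print(p, round(log_odds_db(p), 1))
# Roughly: 0.5 -> 0 dB, 0.9 -> 9.5 dB, 0.99 -> 20 dB, 0.999999 -> 60 dB.
# log_odds_db(1.0) would divide by zero: probability 1 requires
# infinitely many decibels of evidence, which no finite mind can collect.
```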

56. Your Rationality Is My Business - As a human, I have a proper interest in the future of human civilization, including the human pursuit of truth. That makes your rationality my business. The danger is that we will think that we can respond to irrationality with violence. Relativism is not the way to avoid this danger. Instead, commit to using only arguments and evidence, never violence, against irrational thinking.


This has been a collection of notes on the assigned sequence for this fortnight. The most important part of the reading group though is discussion, which is in the comments section. Please remember that this group contains a variety of levels of expertise: if a line of discussion seems too basic or too incomprehensible, look around for one that suits you better!

The next reading will cover Part F: Politics and Rationality (pp. 255-289). The discussion will go live on Wednesday, 29 July 2015, right here on the discussion forum of LessWrong.

Biases and Fallacies Game Cards

7 Gunnar_Zarncke 15 July 2015 08:19AM

On the Stupid Questions Thread I asked

I need some list of biases for a game of Biased Pandemic for our Meet-Up. Do suitably prepared/formatted lists exist somewhere?

But none came forward.

Therefore I created a simple deck based on Wikipedia entries. I selected those that can presumably be used easily in a game, summarized the descriptions and added an illustrative quote.

The deck can be found in Dropbox here (PDF and ODT).

I'd be happy for corrections and further suggestions.

ADDED: We used these cards during the LW Hamburg Meetup. They attracted significant interest, and even though we didn't use them during a board game, we drew them and tried to act them out during a discussion round (which didn't work out that well but stimulated discussion nonetheless).

The AI, the best human advisor

7 Stuart_Armstrong 13 July 2015 03:33PM

A putative new idea for AI control; index here.

All my ideas on reduced impact AI follow the same format: some restriction to ensure a reduced impact (I generally use something like "could a future AI detect this one existed?") and some "trick" to allow a way of still achieving a larger impact. Furthermore, there is always a "nice" way of achieving this larger impact (do the thing we really want you to do, and only that!), so that we hope the AI can be forced into that behaviour. Examples of "tricks" are:

In this post, I'll present another variant - one that I think is the most interesting so far. I won't change the reduced impact condition, but will present a different "trick".


AI pretends to be human

As in this post, imagine that we have a reduced-impact AI in a box that conceals it from us. The AI has some goal, and will output some data to further that goal, then turn itself off. And then we'll shake the box around or something, so we can't be sure that the AI was turned on.

There are two issues to deal with, here: a) The message itself is strong evidence that the AI was turned on, and b) if we exclude the message from the reduced impact, the message becomes dangerous.

To deal with a) first: this approach gets much easier if we have a whole brain emulation (WBE). Then we can house the WBE in the box with the AI. If the AI doesn't send out a message, then the WBE sends out its own message instead. There are variants we can try here if we don't have WBE - maybe some human volunteer willing to either die or take amnesia pills at the end of the experiment.

So, receiving a coherent message is not a surprising thing to come out of the output channel. However, an AI motivated strongly by reduced impact will simply seek to imitate as precisely as possible what the WBE would send. We still need a trick to get something useful.


AI pretends to be BEST human

Imagine that we wanted the AI to give us investment advice to make us rich. Say, a binary task: get us ten million dollars over the course of the year.

Now, this is the kind of thing that the WBE could conceivably achieve. We could run the WBE fast, for three subjective weeks, and it has a chance of giving us advice that would get us that rich. Assume that it only has 0.1% chance of doing so. The AI, on the other hand, could produce advice that makes us that rich with almost certainty.

Therefore if we get advice, follow it, and end up at least ten million dollars richer, the odds update in favour of the AI having given us the advice (rather than the WBE) is about 1000:1.

And this gives us the solution to b). The "trick", the escape clause, is by conditioning on the outcome of our investments. So we're asking: given that we have become rich (and given the other facts about the universe in 50 years time) what additional evidence is there that the AI was turned on?
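As a sanity check on the arithmetic, the update can be sketched in Python using the post's hypothetical numbers (0.1% success probability for the WBE, near-certainty for the AI):

```python
p_rich_given_wbe = 0.001  # WBE advice gets us rich with 0.1% probability
p_rich_given_ai = 1.0     # AI advice gets us rich with near-certainty

# Likelihood ratio: how strongly the observation "we got rich"
# favours "the AI wrote the message" over "the WBE wrote it".
likelihood_ratio = p_rich_given_ai / p_rich_given_wbe
print(likelihood_ratio)  # 1000.0

# Starting from even prior odds, the posterior odds of AI:WBE
# after seeing the money are 1000:1.
posterior_odds = 1.0 * likelihood_ratio
```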

Now, a sensible WBE is almost certain not to come up with a scheme that destroys the universe with self-tiling agents - or, indeed, to do anything too crazy. We could instruct the WBE to this effect, and we're only running it for three weeks, which prevents the kinds of drift that you see in more extreme situations. Thus a message from the WBE, no matter how likely it is to make us rich, is almost certain not to lead to tiling the universe with self-replicating nano-bots.

So here, the AI merely has to come up with a message that could have come from the WBE, with at least 0.01% probability, and that would indeed make us rich. And, therefore, the message cannot contain secret code that would optimise the universe.

Note that the goal we are looking for must not be too extreme in impact (the higher the impact of the goal, the more latitude the AI has to have an impact itself, "hiding" inside the goal's impact), and we should restrict ourselves to events of probability, say, greater than 1/10,000 (when the probability gets too small, odd events start happening).

It seems that this scheme could result in AIs that always give the very best possible advice a human (or WBE) could ever give, which would be a very useful result.


Whose probability?

We have a probability distribution over events. The AI itself will have a probability distribution over events. The future hypothetical super-super-AI it is using to compute reduced impact has a probability distribution over events - and the AI has a probability distribution over that probability distribution. If all of them agree on the probability of us getting richer (given WBE advice and given not), then this scheme should work.

If they disagree, there might be problems. A more complex approach could directly take into account the divergent probability estimates; I'll think of that and return to the issue later.

Entrepreneurial autopsies

7 Clarity 13 July 2015 03:27AM

Entrepreneurial ideas come and go. Some I don't give a second thought to. Others I commence market research for, examine the competitive landscape, and explore the feasibility of development. This can be time consuming, and has yet to produce any tangible, commercialized product.

I figure it's about time I devote the time I would spend to exploiting my existing repertoire of knowledge to develop an idea, to exploring parsimonious, efficient techniques for assessing viability.

In my search I found [], a startup graveyard. Founders describe, concisely, why their startups failed. It made me think about my past startup ideas and why they haven't flown.

I'm going to work that out, put it in a spreadsheet, and regress on whatever problem keeps popping up. Then I'll work on improving my subject matter knowledge in that domain. For example, if it's the feasibility of implementing with existing technology, I might learn more about the current technological landscape in general; or more about existing services for investors, if my product is a service for investors, like my last startup idea, which I have autopsied in detail here

I just thought I'd share my general strategy for anyone who'd want to copy this procedure for startup autopsy. Please use this space to suggest other appropriate diagnostic methods.

edit 1: Thanks for pointing out the typos :)


'Charge for something and make more than you spend' - Marco Arment, Founder of Instapaper

Rationality Reading Group: Part F: Politics and Rationality

6 Gram_Stone 29 July 2015 10:22PM

This is part of a semi-monthly reading group on Eliezer Yudkowsky's ebook, Rationality: From AI to Zombies. For more information about the group, see the announcement post.

Welcome to the Rationality reading group. This fortnight we discuss Part F: Politics and Rationality (pp. 255-289). This post summarizes each article of the sequence, linking to the original LessWrong post where available.

F. Politics and Rationality

57. Politics is the Mind-Killer - People act funny when they talk about politics. In the ancestral environment, being on the wrong side might get you killed, and being on the correct side might get you sex, food, or let you kill your hated rival. If you must talk about politics (for the purpose of teaching rationality), use examples from the distant past. Politics is an extension of war by other means. Arguments are soldiers. Once you know which side you're on, you must support all arguments of that side, and attack all arguments that appear to favor the enemy side; otherwise, it's like stabbing your soldiers in the back - providing aid and comfort to the enemy. If your topic legitimately relates to attempts to ban evolution in school curricula, then go ahead and talk about it, but don't blame it explicitly on the whole Republican/Democratic/Liberal/Conservative/Nationalist Party.

58. Policy Debates Should Not Appear One-Sided - Debates over outcomes with multiple effects will have arguments both for and against, so you must integrate the evidence, not expect the issue to be completely one-sided.

59. The Scales of Justice, the Notebook of Rationality - People have an irrational tendency to simplify their assessment of things into how good or bad they are without considering that the things in question may have many distinct and unrelated attributes.

60. Correspondence Bias - Also known as the fundamental attribution error, this refers to the tendency to attribute the behavior of others to intrinsic dispositions, while excusing one's own behavior as the result of circumstance.

61. Are Your Enemies Innately Evil? - People want to think that the Enemy is an innately evil mutant. But, usually, the Enemy is acting as you might in their circumstances. They think that they are the hero in their story and that their motives are just. That doesn't mean that they are right. Killing them may be the best option available. But it is still a tragedy.

62. Reversed Stupidity Is Not Intelligence - The world's greatest fool may say the Sun is shining, but that doesn't make it dark out. Stalin also believed that 2 + 2 = 4. Stupidity or human evil do not anticorrelate with truth. Arguing against weaker advocates proves nothing, because even the strongest idea will attract weak advocates.

63. Argument Screens Off Authority - There are many cases in which we should take the authority of experts into account, when we decide whether or not to believe their claims. But, if there are technical arguments that are available, these can screen off the authority of experts.

64. Hug the Query - The more directly your arguments bear on a question, without intermediate inferences, the more powerful the evidence. We should try to observe evidence that is as near to the original question as possible, so that it screens off as many other arguments as possible.

65. Rationality and the English Language - George Orwell's writings on language and totalitarianism are critical to understanding rationality. Orwell was an opponent of the use of words to obscure meaning, or to convey ideas without their emotional impact. Language should get the point across - when the effort to convey information gets lost in the effort to sound authoritative, you are acting irrationally.

66. Human Evil and Muddled Thinking - It's easy to think that rationality and seeking truth is an intellectual exercise, but this ignores the lessons of history. Cognitive biases and muddled thinking allow people to hide from their own mistakes and allow evil to take root. Spreading the truth makes a real difference in defeating evil.


This has been a collection of notes on the assigned sequence for this fortnight. The most important part of the reading group though is discussion, which is in the comments section. Please remember that this group contains a variety of levels of expertise: if a line of discussion seems too basic or too incomprehensible, look around for one that suits you better!

The next reading will cover Part G: Against Rationalization (pp. 293-339). The discussion will go live on Wednesday, 12 August 2015, right here on the discussion forum of LessWrong.

Immortality Roadmap

6 turchin 28 July 2015 09:27PM

Added: Direct link on pdf:


A lot of people value indefinite life extension, but most have their own preferred method of achieving it. The goal of this map is to present all known ways of radical life extension in an orderly and useful way.

A rational person could choose to implement all of these plans or to concentrate only on one of them, depending on his available resources, age and situation. Such actions may be personal or social; both are necessary.

The roadmap consists of several plans; each of them acts as insurance in the case of failure of the previous plan. (The roadmap has a similar structure to the "Plan of action to prevent human extinction risks".) The first two plans contain two rows, one of which represents personal actions or medical procedures, and the other represents any collective activity required.

Plan A. The most obvious way to reach immortality is to survive until the creation of Friendly AI; in that case if you are young enough and optimistic enough, you can simply do nothing – or just fund MIRI. However, if you are older, you have to jump from one method of life extension to the next as they become available. So plan A is a relay race of life extension methods, until the problem of death is solved.

This plan includes actions to defeat aging, to grow and replace diseased organs with new bioengineered ones, to get a nanotech body and in the end to be scanned into a computer. It is an optimized sequence of events, and depends on two things – your personal actions (such as regular medical checkups), and collective actions such as civil activism and scientific research funding.

Plan B. However, if Plan A fails, i.e. if you die before the creation of superintelligence, there is Plan B, which is cryonics. Some simple steps can be taken now, such as calling your nearest cryocompany about a contract.

Plan C. Unfortunately, cryonics could also fail, and in that case Plan C is invoked. Of course it is much worse – less reliable and less proven. Plan C is so-called digital immortality, where one could be returned to life based on existing recorded information about that person. It is not a particularly good plan, because we are not sure how to solve the identity problem which will arise, and we don’t know if the collected amount of information would be enough. But it is still better than nothing.

Plan D. Lastly, if Plan C fails, we have Plan D. It is not a plan in fact, it is just hope or a bet that immortality already exists somehow: perhaps there is quantum immortality, or perhaps future AI will bring us back to life.

The first three plans demand particular actions now: we need to prepare for all of them simultaneously. All of the plans will lead to the same result: our minds will be uploaded into a computer with help of highly developed AI.

The plans could also help each other. Digital immortality data may help to fill any gaps in the memory of a cryopreserved person. Cryonics also raises the chance that quantum immortality will result in something useful: you have a better chance of being cryopreserved and successfully revived than of living naturally until you are 120 years old.

After you have become immortal with the help of Friendly AI you might exist until the end of the Universe or even beyond – see my map “How to prevent the end of the Universe”.

A map of currently available methods of life extension is a sub-map of this one and will be published later.

The map was made in collaboration with Maria Konovalenko and Michael Batin and its earlier version was presented in August 2014 in Aubrey de Grey’s conference Rejuvenation Biotechnology.

Pdf of the map is here

Previous posts:

AGI Safety Solutions Map

A map: AI failures modes and levels

A Roadmap: How to Survive the End of the Universe

A map: Typology of human extinction risks

Roadmap: Plan of Action to Prevent Human Extinction Risks

Group rationality diary for July 12th - August 1st 2015

6 Gunnar_Zarncke 26 July 2015 11:31PM

This is the public group rationality diary for July 12th - August 1st, 2015. It's a place to record and chat about it if you have done, or are actively doing, things like:

  • Established a useful new habit

  • Obtained new evidence that made you change your mind about some belief

  • Decided to behave in a different way in some set of situations

  • Optimized some part of a common routine or cached behavior

  • Consciously changed your emotions or affect with respect to something

  • Consciously pursued new valuable information about something that could make a big difference in your life

  • Learned something new about your beliefs, behavior, or life that surprised you

  • Tried doing any of the above and failed

Or anything else interesting which you want to share, so that other people can think about it, and perhaps be inspired to take action themselves. Try to include enough details so that everyone can use each other's experiences to learn about what tends to work out, and what doesn't tend to work out.

Archive of previous rationality diaries

Note to future posters: no one is in charge of posting these threads. If it's time for a new thread, and you want a new thread, just create it. It should run for about two weeks, finish on a Saturday, and have the 'group_rationality_diary' tag.

LessWrong Diplomacy Game 2015

6 Sherincall 20 July 2015 03:10PM

Related: Diplomacy as a Game Theory Laboratory by Yvain.

I've been floating this idea around for a while, and there was enough interest to organize it.

Diplomacy is a board game of making and breaking alliances. It is a semi-iterative prisoner's dilemma with 7 prisoners. The rules are very simple, there is no luck factor and any tactical tricks can be learned quickly. You play as one of the great powers in pre-WW1 Europe, and your goal is to dominate over half of the board. To do this, you must negotiate alliances with the other players, and then stab them at the most opportune moment. But beware, if you are too stabby, no one will trust you. And if you are too trusting, you will get stabbed yourself.

If you have never played the game, don't worry. It is really quick to pick up. I explain the rules in detail here.

The game will (most likely) be played at You need an account, which requires a valid email. To play the game, you will need to spend at least 10 minutes every phase (3 days) to enter your orders. In the meantime, you will be negotiating with other players. That takes as much time as you want it to, but I recommend setting aside at least 30 minutes per day (in 5-minute quanta). A game usually lasts about 10 in-game years, which comes down to 30-something phases (60-90 days). A phase can progress early if everyone agrees. Likewise, the game can be paused indefinitely if everyone agrees (e.g. if a player will not have Internet access).

Joining a game is Serious Business, as missing a deadline can spoil it for the other 6 players. Please apply iff:

  1. You will be able to access the game for 10 minutes every 3 days (90% certainty required)
  2. If 1) changes, you will be able to let the others know at least 1 day in advance (95% certainty required)
  3. You will be able to spend an average of 30 minutes per day (standard normal distribution)
  4. You will not hold an out-of-game grudge against a player who stabbed you (adjusting for stabbyness in potential future games is okay)

If you still wish to play, please sign up in the comments. Please specify the earliest time it would suit you for the game to start. If we somehow get more than 7 players, we'll discuss our options (play a variant with more players, multiple games, etc).


See also: First game of LW Diplomacy


Well, the interest is there, so I've set up two games.

Game 1:  (started!)

Game 2:  (started! First phase will be extended to end on the 4th of August)

Password: clippy

Please note a couple important rules of the website:


  1. You can only have one account. If you are caught with multiple accounts, they will all be banned.
  2. You may not blame your moves on the website bugs as a diplomacy tactic. This gives the site's mods extra work to do when someone actually reports the bug.
  3. Should go without saying, but you are not allowed to illegally access another player's account (i.e. hacking).


LessWrong Hamburg Meetup July 2015 Summary

6 Gunnar_Zarncke 18 July 2015 11:13PM

After a hiatus of about a year the LessWrong Hamburg Meetup had a very strong revival! Infused by motivation from the Berlin Weekend I reached out to colleagues, and an amazing 24 people gathered on July 17th in a location kindly provided by my employer.

Because the number of participants quickly exceeded my expectations I had to scramble to put something together for a larger group. For this I had tactical aid from blob and practical support from colleagues putting everything together from name tags to getting food and drinks and chairs.

We had an easy start with getting to know each other with Fela's Ice-Breaking Game.

The main topics covered were:

Besides the main topics there was a good atmosphere, with many people having smaller discussions.

The event ended with a short wrap-up based on Irina's Sustainable Change talk from the Berlin event, which prompted some people to take action based on what they heard.

What I learned from the event:

  • I still tend to overplan. Having a plan for eventualities isn't bad, but the agenda doesn't need to be as highly structured as I made it. It could create expectations that can't be met. 
  • Apparently I appeared stressed, even though I didn't feel that way myself - probably from hurrying around. I wonder whether that has a negative effect on other people and how I can avoid it. 
  • A standard-issue meeting room for 12 people can comfortably host 24 people if tables and furniture are rearranged and comfy beanbags etc. are added.
  • The number of people showing up can vary unpredictably. This may depend on weather, how the event is communicated, and unknown factors.
  • Visualize the concrete effects of your charity. This can give you a specific intuition you can use to decide whether it's worth it. Imma's example was thinking about how your donated AMF bednets hang over children and protect from mosquitoes.

There will definitely be a follow-up meeting of a comparable size in a few months (no date yet). And maybe smaller get-togethers will be organized in between.


Beware the Nihilistic Failure Mode

6 Gram_Stone 09 July 2015 03:31PM

I have noticed that the term 'nihilism' has quite a few different connotations. I do not know that it is a coincidence. Reputedly, the most popular connotation, and in my opinion, the least well-defined, is existential nihilism, 'the philosophical theory that life has no intrinsic meaning or value.' I think that most LessWrong users would agree that there is no intrinsic meaning or value, but also that they would argue that there is a contingent meaning or value, and that the absence of such intrinsic meaning or value is no justification to be a generally insufferable person.

There is also the slightly similar but perhaps more well-defined moral nihilism; epistemological nihilism; and the not-unrelated fatalism.

Here, it goes without saying that each of these positions is wrong.

I recognize a pattern here. It seems that in each case the person who arrives at each of these positions has, in some informal sense, given up.

The idea finally came to my explicit attention after reading a passage in Nick Bostrom's Technological Revolutions: Ethics and Policy in the Dark. Bostrom writes:

If we want to make sense of the claim that physics is better at predicting than social science is, we have to work harder to explicate what it might mean. One possible way of explicating the claim is that when one says that physics is better at predicting than social science one might mean that experts in physics have a greater advantage over non‐experts in predicting interesting things in the domain of physics than experts in social science have over non‐experts in predicting interesting things in the domain of social science. This is still very imprecise since it relies on an undefined concept of “interesting things”. Yet the explication does at least draw attention to one aspect of the idea of predictability that is relevant in the context of public policy, namely the extent to which research and expertise can improve our ability to predict. The usefulness of ELSI‐funded activities might depend not on the absolute obtainable degree of predictability of technological innovation and social outcomes but on how much improvement in predictive ability these activities will produce. Let us hence set aside the following unhelpful question:

"Is the future of science or technological innovation predictable?"

A better question would be,

"How predictable are various aspects of the future of science or technological innovation?"

But often, we will get more mileage out of asking,

"How much more predictable can (a certain aspect of) the future of science or technological
innovations become if we devote a certain amount of resources to study it?"

Or better still:

"Which particular inquiries would do most to improve our ability to predict those aspects of the future of S&T that we most need to know about in advance?"

Pursuit of this question could lead us to explore many interesting avenues of research which might result in improved means of obtaining foresight about S&T developments and their policy consequences.

Crow and Sarewitz, however, wishing to side‐step the question about predictability, claim that it is “irrelevant”:

"preparation for the future obviously does not require accurate prediction; rather, it requires a foundation of knowledge upon which to base action, a capacity to learn from experience, close attention to what is going on in the present, and healthy and resilient institutions that can effectively respond or adapt to change in a timely manner."

This answer is too quick. Each of the elements they mention as required for the preparation for the future relies in some way on accurate prediction. A capacity to learn from experience is not useful for preparing for the future unless we can correctly assume (predict) that the lessons we derive from the past will be applicable to future situations. Close attention to what is going on in the present is likewise futile unless we can assume that what is going on in the present will reveal stable trends or otherwise shed light on what is likely to happen next. It also requires prediction to figure out what kind of institutions will prove healthy, resilient, and effective in responding or adapting to future changes. Predicting the future quality and behavior of institutions that we create today is not an exact science.

This is about quick answers. The One True Morality is not written in the atoms, but it is also a mistake to conclude that we may value whatever we postulate. We cannot know things certainly, but it is also a mistake to conclude that we can know nothing as a consequence. The universe is fundamentally deterministic, but it is also a mistake to conclude that we should take the null action in every case.

I think that in each case where a person has arrived at one of these positions, it is not as the result of a verbal, deductive argument, but rather, it is a verbalization of a wordless feeling of difficulty, an expression of one's attitude that the confusion surrounding morality, epistemology, and free will is intractable.

It has already been said that one should be suspicious of ordinary solutions to impossible problems. But I do think that the point that I have made above has been overlooked as a special case. Sometimes, something even less than an ordinary solution is proposed. Sometimes, it is proposed that there is no solution.

These points are obvious to most LessWrong users, but the general experience is perhaps worth distinguishing. When you encounter a difficult problem (of either an instrumentally or epistemically rational nature, I might add), beware a feeling of futility, or a compulsion to inform others that their actions are futile.

This is also perhaps similar to the idea of a wrong question. I would argue that even if one has a verbal, propositional belief that confusion exists in the map and not the territory, it is easy to be dissuaded by feelings of difficulty without noticing, and perhaps it is worth learning to notice a feeling of difficulty in itself, the sort of behavior that it inspires, and the danger therewith.

Stupid Questions July 2015

6 Gondolinian 01 July 2015 07:13PM

This thread is for asking any questions that might seem obvious, tangential, silly or what-have-you. Don't be shy, everyone has holes in their knowledge, though the fewer and the smaller we can make them, the better.

Please be respectful of other people's admitting ignorance and don't mock them for it, as they're doing a noble thing.

To any future monthly posters of SQ threads, please remember to add the "stupid_questions" tag.

Open Thread, Jul. 27 - Aug 02, 2015

5 MrMind 27 July 2015 07:16AM

If it's worth saying, but not worth its own post (even in Discussion), then it goes here.

Notes for future OT posters:

1. Please add the 'open_thread' tag.

2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)

3. Open Threads should be posted in Discussion, and not Main.

4. Open Threads should start on Monday, and end on Sunday.

View more: Next