Deliberate Grad School
Among my friends interested in rationality, effective altruism, and existential risk reduction, I often hear: "If you want to have a real positive impact on the world, grad school is a waste of time. It's better to use deliberate practice to learn whatever you need instead of working within the confines of an institution."
While I'd agree that grad school won't, by itself, make you do good for the world, if you're a self-driven person who can spend time in a PhD program deliberately acquiring skills and connections for making a positive difference, I think you can make grad school a highly productive path, perhaps more so than many alternatives. In this post, I want to share some advice that I've been repeating a lot lately for how to do this:
- Find a flexible program. PhD programs in mathematics, statistics, philosophy, and theoretical computer science tend to give you a great deal of free time and flexibility, provided you can pass the various qualifying exams without too much studying. By contrast, sciences like biology and chemistry can require time-consuming laboratory work that you can't always speed through by being clever.
- Choose high-impact topics to learn about. AI safety and existential risk reduction are my favorite examples, but there are others, and I won't spend more time here arguing their case. If you can't make your thesis directly about such a topic, choosing a related, more popular topic can give you valuable personal connections, and you can still learn whatever you want during the spare time a flexible program will afford you.
- Teach classes. Grad programs that let you teach undergraduate tutorial classes provide a rare opportunity to practice engaging a non-captive audience. If you just want to work on general presentation skills, maybe you practice on your friends... but your friends already like you. If you want to learn to win over a crowd that isn't particularly interested in you, try teaching calculus! I've found this skill particularly useful when presenting AI safety research that isn't yet mainstream, which requires carefully stepping through arguments that are unfamiliar to the audience.
- Use your freedom to accomplish things. I used my spare time during my PhD program to cofound CFAR, the Center for Applied Rationality. Alumni of our workshops have gone on to do such awesome things as creating the Future of Life Institute and sourcing a $10MM donation from Elon Musk to fund AI safety research. I never would have had the flexibility to volunteer for weeks at a time if I'd been working at a typical 9-to-5 or a startup.
- Organize a graduate seminar. Organizing conferences is critical to getting the word out on important new research, and in fact, running a conference on AI safety in Puerto Rico is how FLI was able to bring so many researchers together on its Open Letter on AI Safety. It's also where Elon Musk made his donation. During grad school, you can get lots of practice organizing research events by running seminars for your fellow grad students. In fact, several of the organizers of the FLI conference were grad students.
- Get exposure to experts. A top-10 US school will have professors around who are world experts on myriad topics, and you can attend departmental colloquia to expose yourself to the cutting edge of research in fields you're curious about. I regularly attended cognitive science and neuroscience colloquia during my PhD in mathematics, which gave me many perspectives that I found useful working at CFAR.
- Learn how productive researchers get their work done. Grad school surrounds you with researchers, and by getting exposed to how a variety of researchers do their thing, you can pick and choose from their methods and find what works best for you. For example, I learned from my advisor Bernd Sturmfels that, for me, quickly passing a draft back and forth with a coauthor can get a paper written much more quickly than agonizing about each revision before I share it.
- Remember you don't have to stay in academia. If you limit yourself to only doing research that will get you good post-doc offers, you might find you aren't able to focus on what seems highest impact (because often what makes a topic high impact is that it's important and neglected, and if a topic is neglected, it might not be trendy enough to land you a good post-doc). But since grad school is run by professors, becoming a professor is usually the most salient path forward for most grad students, and you might end up pressuring yourself to follow the standards of that path. When I graduated, I got my top choice of post-doc, but then I decided not to take it and to instead try earning to give as an algorithmic stock trader, and now I'm a research fellow at MIRI. In retrospect, I might have done more valuable work during my PhD itself if I'd decided in advance not to do a typical post-doc.
That's all I have for now. The main sentiment behind most of this, I think, is that you have to be deliberate to get the most out of a PhD program, rather than passively expecting it to make you into anything in particular. Grad school still isn't for everyone, far from it. But if you were seriously considering it at some point, and "do something more useful" felt like a compelling reason not to go, be sure to first consider the most useful version of grad school that you could reliably make for yourself... and then decide whether or not to do it.
Please email me (lastname@thisdomain.com) if you have more ideas for getting the most out of grad school!
A Year of Spaced Repetition Software in the Classroom
Last year, I asked LW for some advice about spaced repetition software (SRS) that might be useful to me as a high school teacher. With said advice came a request to write a follow-up after I had accumulated some experience using SRS in the classroom. This is my report.
Please note that this was not a scientific experiment to determine whether SRS "works." Prior studies are already pretty convincing on this point and I couldn't think of a practical way to run a control group or "blind" myself. What follows is more of an informal debriefing for how I used SRS during the 2014-15 school year, my insights for others who might want to try it, and how the experience is changing how I teach.
Summary
SRS can raise student achievement even with students who won't use the software on their own, and even with frequent disruptions to the study schedule. Gains are most apparent with the already high-performing students, but are also meaningful for the lowest students. Deliberate efforts are needed to get student buy-in, and getting the most out of SRS may require changes in course design.
The software
After looking into various programs, including the game-like Memrise, and even writing my own simple SRS, I ultimately went with Anki for its multi-platform availability, cloud sync, and ease-of-use. I also wanted a program that could act as an impromptu catch-all bin for the 2,000+ cards I would be producing on the fly throughout the year. (Memrise, in contrast, really needs clearly defined units packaged in advance).
The students
I teach 9th and 10th grade English at an above-average suburban American public high school in a below-average state. Mine are the lower "required level" students at a school with high enrollment in honors and Advanced Placement classes. Generally speaking, this means my students are mostly not self-motivated, are only very weakly motivated by grades, and will not do anything school-related outside of class no matter how much it would be in their interest to do so. There are, of course, plenty of exceptions, and my students span an extremely wide range of ability and apathy levels.
The procedure
First, what I did not do. I did not make Anki decks, assign them to my students to study independently, and then quiz them on the content. With honors classes I taught in previous years I think that might have worked, but I know my current students too well. Only about 10% of them would have done it, and the rest would have blamed me for their failing grades—with some justification, in my opinion.
Instead, we did Anki together, as a class, nearly every day.
As initial setup, I created a separate Anki profile for each class period. With a third-party add-on for Anki called Zoom, I enlarged the display font sizes to be clearly legible on the interactive whiteboard at the front of my room.
Nightly, I wrote up cards to reinforce new material and integrated them into the deck in time for the next day's classes. This averaged about 7 new cards per lesson period. These cards came in many varieties (a code sketch of the vocabulary card structure follows this list), but the three main types were:
- concepts and terms, often with reversed companion cards, sometimes supplemented with "what is this an example of" scenario cards.
- vocabulary, 3 cards per word: word/def, reverse, and fill-in-the-blank example sentence
- grammar, usually in the form of "What change(s), if any, does this sentence need?" Alternative cards had different permutations of the sentence.
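For the curious, here is a rough sketch of that vocabulary card structure as code, using the third-party genanki Python library. I built all of my actual cards by hand in Anki's own editor, so treat this purely as an illustration of the three-cards-per-word pattern (the word, IDs, and example sentence below are made up):

```python
import genanki

# One note per vocabulary word; the three templates generate the three cards
# per word described above: word/def, reverse, and fill-in-the-blank sentence.
vocab_model = genanki.Model(
    1607392319,  # arbitrary model ID
    'Vocabulary (word/def, reverse, fill-in-the-blank)',
    fields=[{'name': 'Word'}, {'name': 'Definition'}, {'name': 'Sentence'}],
    templates=[
        {'name': 'Word -> Definition', 'qfmt': '{{Word}}',
         'afmt': '{{FrontSide}}<hr id="answer">{{Definition}}'},
        {'name': 'Definition -> Word', 'qfmt': '{{Definition}}',
         'afmt': '{{FrontSide}}<hr id="answer">{{Word}}'},
        {'name': 'Fill in the blank', 'qfmt': '{{Sentence}}',
         'afmt': '{{FrontSide}}<hr id="answer">{{Word}}'},
    ])

deck = genanki.Deck(2059400110, 'English 10::Vocabulary')  # arbitrary deck ID
deck.add_note(genanki.Note(
    model=vocab_model,
    fields=['laconic', 'using very few words',
            'His ____ reply told us nothing about his plans.']))
genanki.Package(deck).write_to_file('vocab.apkg')  # importable into any Anki profile
```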
Weekly, I updated the deck to the cloud for self-motivated students wishing to study on their own.
Daily, I led each class in an Anki review of new and due cards for an average of 8 minutes per study day, usually as our first activity, at a rate of about 3.5 cards per minute. As each card appeared on the interactive whiteboard, I would read it out loud while students willing to share the answer raised their hands. Depending on the card, I might offer additional time to think before calling on someone to answer. Depending on their answer, and my impressions of the class as a whole, I might elaborate or offer some reminders, mnemonics, etc. I would then quickly poll the class on how they felt about the card by having them show a color by way of a small piece of card-stock divided into green, red, yellow, and white quadrants. Based on my own judgment (informed only partly by the poll), I would choose and press a response button in Anki, determining when we should see that card again.

[Data shown is from one of my five classes. We didn't start using Anki until a couple weeks into the school year.]
Opportunity costs
8 minutes is a significant portion of a 55 minute class period, especially for a teacher like me who fills every one of those minutes. Something had to give. For me, I entirely cut some varieties of written vocab reinforcement, and reduced the time we spent playing the team-based vocab/term review game I wrote for our interactive whiteboards some years ago. To a lesser extent, I also cut back on some oral reading comprehension spot-checks that accompany my whole-class reading sessions. On balance, I think Anki was a much better way to spend the time, but it's complicated. Keep reading.
Whole-class SRS not ideal
Every student is different, and each would get the most out of having a personal Anki profile determine when they should see each card. Also, most individuals could study many more cards per minute on their own than we averaged doing it together. (To be fair, a small handful of my students did use the software independently, judging from Ankiweb download stats.)
Getting student buy-in
Before we started using SRS I tried to sell my students on it with a heartfelt, over-prepared 20 minute presentation on how it works and the superpowers to be gained from it. It might have been a waste of time. It might have changed someone's life. Hard to say.
As for the daily class review, I induced engagement partly through participation points that were part of the final semester grade, and which students knew I tracked closely. Raising a hand could earn a kind of bonus currency, but was never required—unlike looking up front and showing colors during polls, which I insisted on. When I thought students were just reflexively holding up the same color and zoning out, I would sometimes spot check them on the last card we did and penalize them if warranted.
But because I know my students are not strongly motivated by grades, I think the most important influence was my attitude. I made it a point to really turn up the charm during review and play the part of the engaging game show host. Positive feedback. Coaxing out the lurkers. Keeping that energy up. Being ready to kill and joke about bad cards. Reminding classes how awesome they did on tests and assignments because they knew their Anki stuff.
(This is a good time to point out that the average review time per class period stabilized at about 8 minutes because I tried to end reviews before student engagement tapered off too much, which typically started happening at around the 6-7 minute mark. Occasional short end-of-class reviews mostly account for the difference.)
I also got my students more on the Anki bandwagon by showing them how it was directly linked to reduced note-taking requirements. If I could trust that they would remember something through Anki alone, why waste time waiting for them to write it down? They were unlikely to study from those notes anyway. And if they aren't looking down at their paper, they'll be paying more attention to me. I'd better come up with more cool things to tell them!
Making memories
Everything I had read about spaced repetition suggested it was a great reinforcement tool but not a good way to introduce new material. With that in mind, I tried hard to find or create memorable images, examples, mnemonics, and anecdotes that my Anki cards could become hooks for, and to get those cards into circulation as soon as possible. I even gave this method a mantra: "vivid memory, card ready".
When a student during review raised their hand, gave me a pained look, and said, "like that time when...." or "I can see that picture of..." as they struggled to remember, I knew I had done well. (And I would always wait a moment, because they would usually get it.)
Baby cards need immediate love
Unfortunately, if the card wasn't introduced quickly enough—within a day or two of the lesson—the entire memory often vanished and had to be recreated, killing the momentum of our review. This happened far too often—not because I didn't write the card soon enough (I stayed really on top of that), but because it didn't always come up for study soon enough. There were a few reasons for this:
- We often had too many due cards to get through in one session, and by default Anki puts new cards behind due ones.
- By default, Anki only introduces 20 new cards in one session (I soon uncapped this).
- Some cards were in categories that I gave lower priority to.
Two obvious cures for this problem:
- Make fewer cards. (I did get more selective as the year went on.)
- Have all cards prepped ahead of time and introduce new ones at the end of the class period they go with. (For practical reasons, not the least of which was the fact that I didn't always know what cards I was making until after the lesson, I did not do this. I might be able to next year.)
Days off suck
SRS is meant to be used every day. When you take weekends off, you get a backlog of due cards. Not only do my students take every weekend and major holiday off (slackers), they have a few 1-2 week vacations built into the calendar. Coming back from a week's vacation means a 9-day backlog (due to the weekends bookending it). There's no good workaround for students that won't study on their own. The best I could do was run longer or multiple Anki sessions on return days to try to catch up with the backlog. It wasn't enough. The "caught up" condition was not normal for most classes at most points during the year, but rather something to aspire to and occasionally applaud ourselves for reaching. Some cards spent weeks or months on the bottom of the stack. Memories died. Baby cards emerged stillborn. Learning was lost.
Needless to say, the last weeks of the school year also had a certain silliness to them. When the class will never see the card again, it doesn't matter whether I push the button that says 11 days or the one that says 8 months. (So I reduced polling and accelerated our cards/minute rate.)
Never before SRS did I fully appreciate the loss of learning that must happen every summer break.
Triage
I kept each course's master deck divided into a few large subdecks. This was initially for organizational reasons, but I eventually started using it as a prioritizing tool. This happened after a curse-worthy discovery: if you tell Anki to review a deck made from subdecks, due cards from subdecks higher up in the stack are shown before cards from decks listed below, no matter how overdue they might be. From then on, on days when we were backlogged (most days), I would specifically review the concept/terminology subdeck for the current semester before any other subdecks, as these were my highest priority.
On a couple of occasions, I also used Anki's study deck tools to create temporary decks of especially high-priority cards.
Seizing those moments
Veteran teachers start acquiring a sense of when it might be a good time to go off book and teach something that isn't in the unit, and maybe not even in the curriculum. Maybe it's teaching exactly the right word to describe a vivid situation you're reading about, or maybe it's advice on what to do in a certain type of emergency that nearly happened. As the year progressed, I found myself humoring my instincts more often because of a new confidence that I can turn an impressionable moment into a strong memory and lock it down with a new Anki card. I don't even care if it will ever be on a test. This insight has me questioning a great deal of what I thought I knew about organizing a curriculum. And I like it.
A lifeline for low performers
An accidental discovery came from having written some cards that were, it was immediately obvious to me, much too easy. I was embarrassed to even be reading them out loud. Then I saw which hands were coming up.
In any class you'll get some small number of extremely low performers who never seem to be doing anything that we're doing, and, when confronted, deny that they have any ability whatsoever. Some of the hands I was seeing were attached to these students. And you better believe I called on them.
It turns out that easy cards are really important because they can give wins to students who desperately need them. Knowing a 6th grade level card in a 10th grade class is no great achievement, of course, but the action takes what had been negative morale and nudges it upward. And it can trend. I can build on it. A few of these students started making Anki the thing they did in class, even if they ignored everything else. I can confidently name one student I'm sure passed my class only because of Anki. Don't get me wrong—he just barely passed. Most cards remained over his head. Anki was no miracle cure here, but it gave him and me something to work with that we didn't have when he failed my class the year before.
A springboard for high achievers
It's not even fair. The lowest students got something important out of Anki, but the highest achievers drank it up and used it for rocket fuel. When people ask who's widening the achievement gap, I guess I get to raise my hand now.
I refuse to feel bad for this. Smart kids are badly underserved in American public schools thanks to policies that encourage staff to focus on that slice of students near (but not at) the bottom—the ones who might just barely be able to pass the state test, given enough attention.
Where my bright students might have been used to high Bs and low As on tests, they were now breaking my scales. You could see it in the multiple choice, but it was most obvious in their writing: they were skillfully working in terminology at an unprecedented rate, and making way more attempts to use new vocabulary—attempts that were, for the most part, successful.
Given the seemingly objective nature of Anki it might seem counterintuitive that the benefits would be more obvious in writing than in multiple choice, but it actually makes sense when I consider that even without SRS these students probably would have known the terms and the vocab well enough to get multiple choice questions right, but might have lacked the confidence to use them on their own initiative. Anki gave them that extra confidence.
A wash for the apathetic middle?
I'm confident that about a third of my students got very little out of our Anki review. They were either really good at faking involvement while they zoned out, or didn't even try to pretend and just took the hit to their participation grade day after day, no matter what I did or who I contacted.
These weren't even necessarily failing students—just the apathetic middle that's smart enough to remember some fraction of what they hear and regurgitate some fraction of that at the appropriate times. Review of any kind holds no interest for them. It's a rerun. They don't really know the material, but they tell themselves that they do, and they don't care if they're wrong.
On the one hand, these students are no worse off with Anki than they would have been with the activities it replaced, and nobody cries when average kids get average grades. On the other hand, I'm not ok with this... but so far I don't like any of my ideas for what to do about it.
Putting up numbers: a case study
For unplanned reasons, I taught a unit at the start of a quarter that I didn't formally test them on until the end of said quarter. Historically, this would have been a disaster. In this case, it worked out well. For five weeks, Anki was the only ongoing exposure they were getting to that unit, but it proved to be enough. Because I had given the same test as a pre-test early in the unit, I have some numbers to back it up. The test was all multiple choice, with two sections: the first was on general terminology and concepts related to the unit. The second was a much harder reading comprehension section.
As expected, scores did not go up much on the reading comprehension section. Overall reading levels are very difficult to boost in the short term and I would not expect any one unit or quarter to make a significant difference. The average score there rose by 4 percentage points, from 48 to 52%.
Scores in the terminology and concept section were more encouraging. For material we had not covered until after the pre-test, the average score rose by 22 percentage points, from 53 to 75%. No surprise there either, though; it's hard to say how much credit we should give to SRS for that.
But there were also a number of questions about material we had already covered before the pretest. Being the earliest material, I might have expected some degradation in performance on the second test. Instead, the already strong average score in that section rose by an additional 3 percentage points, from 82 to 85%. (These numbers are less reliable because of the smaller number of questions, but they tell me Anki at least "locked in" the older knowledge, and may have strengthened it.)
Some other time, I might try reserving a section of content that I teach before the pre-test but don't make any Anki cards for. This would give me a way to compare Anki to an alternative review exercise.
What about formal standardized tests?
I don't know yet. The scores aren't back. I'll probably be shown some "value added" analysis numbers at some point that tell me whether my students beat expectations, but I don't know how much that will tell me. My students were consistently beating expectations before Anki, and the state gave an entirely different test this year because of legislative changes. I'll go back and revise this paragraph if I learn anything useful.
Those discussions...
If I'm trying to acquire a new skill, one of the first things I try to do is listen to skilled practitioners talk about it with each other. What are the terms-of-art? How do they use them? What does this tell me about how they see their craft? Their shorthand is a treasure trove of crystallized concepts; once I can use it the same way they do, I find I'm working at a level of abstraction much closer to theirs.
Similarly, I was hoping Anki could help make my students more fluent in the subject-specific lexicon that helps you score well in analytical essays. After introducing a new term and making the Anki card for it, I made extra efforts to use it conversationally. I used to shy away from that because so many students would have forgotten it immediately and tuned me out for not making any sense. Not this year. Once we'd seen the card, I used the term freely, with only the occasional reminder of what it meant. I started using multiple terms in the same sentence. I started talking about writing and analysis the way my fellow experts do, and so invited them into that world.
Even though I was already seeing written evidence that some of my high performers had assimilated the lexicon, the high quality discussions of these same students caught me off guard. You see, I usually dread whole-class discussions with non-honors classes because good comments are so rare that I end up dejectedly spouting all the insights I had hoped they could find. But by the end of the year, my students had stepped up.
I think what happened here was, as with the writing, as much a boost in confidence as a boost in fluency. Whatever it was, they got into some good discussions where they used the terminology and built on it to say smarter stuff.
Don't get me wrong. Most of my students never got to that point. But on average even small groups without smart kids had a noticeably higher level of discourse than I am used to hearing when I break up the class for smaller discussions.
Limitations
SRS is inherently weak when it comes to the abstract and complex. No card I've devised enables a student to develop a distinctive authorial voice, or write essay openings that reveal just enough to make the reader curious. Yes, you can make cards about strategies for this sort of thing, but these were consistently my worst cards—the overly difficult "leeches" that I eventually suspended from my decks.
A less obvious limitation of SRS is that students with a very strong grasp of a concept often fail to apply that knowledge in more authentic situations. For instance, they may know perfectly well the difference between "there", "their", and "they're", but never pause to think carefully about whether they're using the right one in a sentence. I am very open to suggestions about how I might train my students' autonomous "System 1" brains to have "interrupts" for that sort of thing... or even just a reflex to go back and check after finishing a draft.
Moving forward
I absolutely intend to continue using SRS in the classroom. Here's what I intend to do differently this coming school year:
- Reduce the number of cards by about 20%, to maybe 850-950 for the year in a given course, mostly by reducing the number of variations on some overexposed concepts.
- Be more willing to add extra Anki study sessions to stay better caught-up with the deck, even if this means my lesson content doesn't line up with class periods as neatly.
- Be more willing to press the red button on cards we need to re-learn. I think I was too hesitant here because we were rarely caught up as it was.
- Rework underperforming cards to be simpler and more fun.
- Use more simple cloze deletion cards. I only had a few of these, but they worked better than I expected for structured idea sets like, "characteristics of a tragic hero".
- Take a less linear and more opportunistic approach to introducing terms and concepts.
- Allow for more impromptu discussions where we bring up older concepts in relevant situations and build on them.
- Shape more of my lessons around the "vivid memory, card ready" philosophy.
- Continue to reduce needless student note-taking.
- Keep a close eye on 10th grade students who had me for 9th grade last year. I wonder how much they retained over the summer, and I can't wait to see what a second year of SRS will do for them.
Suggestions and comments very welcome!
Analogical Reasoning and Creativity
This article explores analogical reasoning and creativity, starting with a detailed investigation into IQ-test style analogy problems and how both the brain and some new artificial neural networks solve them. Next we analyze concept map formation in the cortex and the role of the hippocampal complex in establishing novel semantic connections: the neural basis of creative insights. From there we move into learning strategies, and finally conclude with speculations on how a grounded understanding of analogical creative reasoning could be applied towards advancing the art of rationality.

- Introduction
- Under the Hood
- Conceptual Abstractions and Cortical Maps
- The Hippocampal Association Engine
- Cultivate memetic heterogeneity and heterozygosity
- Construct and maintain clean conceptual taxonomies
- Conclusion
Introduction
The computer is like a bicycle for the mind.
-- Steve Jobs
The kingdom of heaven is like a mustard seed, the smallest of all seeds, but when it falls on prepared soil, it produces a large plant and becomes a shelter for the birds of the sky.
-- Jesus
Sigmoidal neural networks are like multi-layered logistic regression.
-- various
The threat of superintelligence is like a tribe of sparrows who find a large egg to hatch and raise. It grows up into a great owl which devours them all.
-- Nick Bostrom (see this video)
Analogical reasoning is one of the key foundational mechanisms underlying human intelligence, and perhaps a key missing ingredient in machine intelligence. For some - such as Douglas Hofstadter - analogy is the essence of cognition itself.[1]
Steve Jobs's bicycle analogy is clever because it encapsulates the whole cybernetic idea of computers as extensions of the nervous system into a single memorable sentence using everyday terms.
A large chunk of Jesus's known sayings are parables about the 'Kingdom of Heaven': a complex enigmatic concept that he explains indirectly through various analogies, of which the mustard seed is perhaps the most memorable. It conveys the notions of exponential/sigmoidal growth of ideas and social movements (see also the Parable of the Leaven), while also hinting at greater future purpose.
In a number of fields, including technical ones, analogical reasoning is key to creativity: most new insights come from establishing mappings to or between concepts from other fields or domains, or from generalizing existing insights/concepts (which is closely related). These abilities all depend on deep, wide, and well organized internal conceptual maps.
Under the Hood

You can think of the development of IQ tests as a search for simple tests which have high predictive power for g-factor in humans, while being relatively insensitive to specific domain knowledge. That search process resulted in a number of problem categories, many of which are based on verbal and mathematical analogies.
The image to the right is an example of a simple geometric analogy problem. As an experiment, start a timer before having a go at it. For bonus points, attempt to introspect on your mental algorithm.
Solving this problem requires first reducing the images to simpler compact abstract representations. The first rows of images then become something like sentences describing relations or constraints (Z is to ? as A is to B and C is to D). The solution to the query sentence can then be found by finding the image which best satisfies the likely analogous relations.
Imagine watching a human subject (such as your previous self) solve this problem while hooked up to a future high resolution brain imaging device. Viewed in slow motion, you would see the subject move their eyes from location to location through a series of saccades, while various vectors or mental variable maps flowed through their brain modules. Each fixation lasts about 300ms[2], which gives enough time for one complete feedforward pass through the ventral vision stream and perhaps one backwards sweep.

The output of the ventral stream in inferior temporal cortex (TE on the bottom) results in abstract encodings which end up in working memory buffers in prefrontal cortex. From there some sort of learned 'mental program' implements the actual analogy evaluations, probably involving several more steps in PFC, cingulate cortex, and various other cortical modules (coordinated by the Basal Ganglia and PFC). Meanwhile the frontal eye fields and various related modules are computing the next saccade decision every 300ms or so.
If we assume that visual parsing requires one fixation on each object and 50ms saccades - on the order of a dozen figures at roughly 350ms apiece - this suggests that solving this problem would take a typical brain a minimum of about 4 seconds (and much longer on average). The minimum estimate assumes - probably unrealistically - that the subject can perform the analogy checks or mental rotations near instantly without any backtracking to help prime working memory. Of course faster times are also theoretically possible - but not dramatically faster.
These types of visual analogy problems test a wide set of cognitive operations, which by itself can explain much of the correlation with IQ or g-factor: speed and efficiency of neural processing, working memory, module communication, etc.
However once we lay all of that aside, there remains a core dependency on the ability for conceptual abstraction. The mapping between these simple visual images and their compact internal encodings is ambiguous, as is the predictive relationship. Solving these problems requires the ability to find efficient and useful abstractions - a general pattern recognition ability which we can relate to efficient encoding, representation learning, and nonlinear dimension reduction: the very essence of learning in both man and machine[3].
The machine learning perspective can help make these connections more concrete when we look into state of the art programs for IQ tests in general and analogy problems in particular. Many of the specific problem subtypes used in IQ tests can be solved by relatively simple programs. In 2003, Sanghi and Dowe created a simple Perl program (less than 1000 lines of code) that can solve several specific subtypes of common IQ problems[4] - but not analogies. It scored an IQ of a little over 100, simply by excelling in a few categories and making random guesses for the remaining harder problem types. Thus its score is highly dependent on the test's particular mix of subproblems, but that is also true for humans to some extent.

The IQ test sub-problems that remain hard for computers are those that require pattern recognition combined with analogical reasoning and/or inductive inference. Precise mathematical inductive inference is easier for machines, whereas humans excel at natural reasoning - inference problems involving huge numbers of variables that can only be solved by scalable approximations.

The word vector embedding is learned as a component of an ANN trained via backprop on a large corpus of text data - Wikipedia. This particular model is rather complex: it combines a multi-sense word embedding, a local sliding window prediction objective, task-specific geometric objectives, and relational regularization constraints. Unlike the recent crop of general linguistic modeling RNNs, this particular system doesn't model full sentence structure or longer term dependencies - as those aren't necessary for answering these specific questions. Surprisingly all it takes to solve the verbal analogy problems typical of IQ/SAT/GRE style tests are very simple geometric operations in the word vector space - once the appropriate embedding is learned.
As a trivial example: "Uncle is to Aunt as King is to ?" literally reduces to:
Uncle + X = Aunt, King + X = ?, and thus X = Aunt-Uncle, and:
? = King + (Aunt-Uncle).
The (Aunt-Uncle) expression encapsulates the concept of 'femaleness', which can be combined with any male version of a word to get the female version. This is perhaps the simplest example, but more complex transformations build on this same principle. The embedded concept space allows for easy mixing and transforms of memetic sub-features to get new concepts.
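To make the geometry concrete, here is a minimal self-contained sketch. Real systems learn the vectors from a large corpus; the three-dimensional toy embedding below is hand-made purely for illustration:

```python
import numpy as np

# Toy 3-d embedding; dimensions loosely encode (royalty, gender, kinship).
vocab = {
    "uncle": np.array([0.0, -1.0, 1.0]),
    "aunt":  np.array([0.0,  1.0, 1.0]),
    "king":  np.array([1.0, -1.0, 0.0]),
    "queen": np.array([1.0,  1.0, 0.0]),
    "man":   np.array([0.0, -1.0, 0.0]),
    "woman": np.array([0.0,  1.0, 0.0]),
}

def analogy(a, b, c, vocab):
    """Solve 'a is to b as c is to ?' by nearest neighbor to c + (b - a)."""
    target = vocab[c] + (vocab[b] - vocab[a])

    def cosine(u, v):
        return u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9)

    # Exclude the query words themselves from the candidate answers.
    candidates = {w: v for w, v in vocab.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine(candidates[w], target))

print(analogy("uncle", "aunt", "king", vocab))  # -> 'queen'
```

With a real learned embedding (for example word2vec vectors queried through gensim's most_similar, to name one common setup), the same additive trick answers a surprising fraction of verbal analogy questions.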
Conceptual Abstractions and Cortical Maps
The success of these simplistic geometric transforms operating on word vector embeddings should not come as a huge surprise to one familiar with the structure of the brain. The brain is extraordinarily slow, so it must learn to solve complex problems via extremely simple and short mental programs operating on huge wide vectors. Humans (and now convolutional neural networks) can perform complex visual recognition tasks in just 10-15 individual computational steps (150 ms), or 'cortical clock cycles'. The entire program that you used to solve the earlier visual analogy problem probably took on the order of a few thousand cycles (assuming it took you a few dozen seconds). Einstein solved general relativity in - very roughly - around 10 billion low level cortical cycles.
The core principle behind word vector embeddings, convolutional neural networks, and the cortex itself is the same: learning to represent the statistical structure of the world by an efficient low complexity linear algebra program (consisting of local matrix vector products and per-element non-linearities). The local wiring structure within each cortical module is equivalent to a matrix with sparse local connectivity, optimized heavily for wiring and computation such that semantically related concepts cluster close together.
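As a rough, purely illustrative sketch of that "short wide program" idea (the sizes, sparsity, and random weights below are arbitrary and nothing here is learned), one "cortical clock cycle" is just a sparse matrix-vector product followed by a per-element nonlinearity, and a short chain of them is the whole program:

```python
import numpy as np

rng = np.random.default_rng(0)

def module_step(x, W):
    # One 'cycle': local linear transform followed by a pointwise nonlinearity.
    h = np.maximum(W @ x, 0.0)
    return h / (np.linalg.norm(h) + 1e-9)  # keep activity bounded

x = rng.standard_normal(1024)  # a wide activity vector
modules = [rng.standard_normal((1024, 1024)) * (rng.random((1024, 1024)) < 0.05)
           for _ in range(12)]  # 12 sparsely wired 'modules'

for W in modules:  # ~12 cycles, roughly one 150 ms feedforward sweep
    x = module_step(x, W)
```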

(Concept mapping the cortex, from this research page)
The image above is from the paper "A Continuous Semantic Space Describes the Representation of Thousands of Object and Action Categories across the Human Brain" by Huth et al.[5] They used fMRI to record activity across the cortex while subjects watched annotated video clips, and then used that data to find out roughly what types of concepts each voxel of cortex responds to. It correctly identifies the FFA region as specializing in people-face things and the PPA as specializing in man-made objects and buildings. A limitation of the above image visualizations is that they don't show response variance or breadth, so the voxel colors are especially misleading for lower level cortical regions that represent generic local features (such as gabor edges in V1).
The power of analogical reasoning depends entirely on the formation of efficient conceptual maps that carve reality at the joints. The visual pathway learns a conceptual hierarchy that builds up objects from their parts: a series of hierarchical has-a relationships encoded in the connections between V1, V2, V4 and so on. Meanwhile the semantic clustering within individual cortical maps allows for fast computations of is-a relationships through simple local pooling filters.
An individual person can be encoded as a specific active subnetwork in the face region, and simple pooling over a local cluster of neurons across the face region can then compute the presence of a face in general. Smaller local pooling filters with more specific shapes can then compute the presence of a female or male face, and so on - all starting from the full specific feature encoding.
The pooling filter concept has been extensively studied in the lower levels of the visual system, where 'complex' cells higher up in V1 pool over 'simple' cell features: abstracting away gabor edges at specific positions to get edges OR'd over a range of positions (CNNs use this same technique to gain invariance to small local translations).
This key semantic organization principle is used throughout the cortex: is-a relations and more general abstractions/invariances are computed through fast local intramodule connections that exploit the physical semantic clustering on the cortical surface, and more complex has-a relations and arbitrary transforms (ex: mapping between an eye centered coordinate basis and a body centered coordinate basis) are computed through intermodule connections (which also exploit physical clustering).
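Here is a toy sketch of the pooling idea (all numbers below are invented): max-pooling over position-specific detectors gives the complex cell's "edge somewhere nearby" invariance, and the same trick over a cluster of person-specific units gives a generic "some face is present" (is-a) signal.

```python
import numpy as np

# 'Simple cell' responses to a vertical edge at five adjacent positions.
simple_cells = np.array([0.0, 0.1, 0.9, 0.2, 0.0])
complex_cell = simple_cells.max()  # edge present somewhere nearby -> 0.9

# Units tuned to specific individuals within a face-selective patch.
person_units = {"alice": 0.05, "bob": 0.85, "carol": 0.10}
face_present = max(person_units.values())  # is-a: 'a face' in general -> 0.85
# A smaller pooling filter over a sub-cluster would compute a narrower
# category (e.g. 'a male face') from the same specific feature encoding.
```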
The Hippocampal Association Engine

The Hippocampus is a tubular seahorse shaped module located in the center of the brain, to the exterior side of the central structures (basal ganglia, thalamus). It is the brain's associative database and search engine responsible for storing, retrieving, and consolidating patterns and declarative memories (those which we are consciously aware of and can verbally declare) over long time scales beyond the reach of short term memory in the cortex itself.
A human (or animal) unfortunate enough to suffer complete loss of hippocampal functionality basically loses the ability to form and consolidate new long term episodic and semantic memories. They also lose more recent memories that have not yet been consolidated down the cortical hierarchy. In rats and humans, problems in the hippocampal complex can also lead to spatial navigation impairments (forgetting current location or recent path), as the HC is used to compute and retrieve spatial map information associated with current sensory impressions (a specific instance of the HC's more general function).
In terms of module connectivity, the hippocampal complex sits on top of the cortical sensory hierarchy. It receives inputs from a number of cortical modules, largely in the nearby associative cortex, which collectively provide a summary of the recent sensory stream and overall brain state. The HC then has several sub circuits which further compress the mental summary into something like a compact key which is then sent into a hetero-auto-associative memory circuit to find suitable matches.
If a good match is found, it can then cause retrieval: reactivation of the cortical subnetworks that originally formed the memory. As the hippocampus can't know for sure which memories will be useful in the future, it tends to store everything with emphasis on the recent, perhaps as a sort of slow exponentially fading stream. Each memory retrieval involves a new decoding and encoding to drive learning in the cortex through distillation/consolidation/retraining (this also helps prevent ontological crisis). The amygdala is a little cap on the edge of the hippocampus which connects to the various emotion subsystems and helps estimate the importance of current memories for prioritization in the HC.
A very strong retrieval of an episodic memory causes the inner experience of reliving the past (or imagining the future), but more typical weaker retrievals (those which load information into the cortex without overriding much of the existing context) are a crucial component in general higher cognition.
In short the computation that the HC performs is that of dynamic association between the current mental pattern/state loaded into short term memory across the cortex and some previous mental pattern/state. This is the very essence of creative insight.
Associative recall can be viewed as a type of pattern recognition with the attendant familiar tradeoffs between precision/recall or sensitivity/specificity. At the extreme of low recall high precision the network is very conservative and risk averse: it only returns high confidence associations, maximizing precision at the expense of recall (few associations found, many potentially useful matches are lost). At the other extreme is the over-confident crazy network which maximizes recall at the expense of precision (many associations are made, most of which are poor). This can also be viewed in terms of the exploitation vs exploration tradeoff.
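Here is a toy numerical illustration of that tradeoff (a thresholded similarity lookup, not a model of the hippocampus; all parameters are arbitrary): raising the retrieval threshold gives the conservative, high-precision network, while lowering it gives the over-confident one.

```python
import numpy as np

rng = np.random.default_rng(0)
memories = rng.standard_normal((500, 64))           # stored patterns
cue = memories[0] + 0.8 * rng.standard_normal(64)   # noisy partial reminder of memory 0

def retrieve(cue, memories, threshold):
    sims = memories @ cue / (np.linalg.norm(memories, axis=1) * np.linalg.norm(cue))
    return np.flatnonzero(sims > threshold)          # indices of 'retrieved' memories

print(retrieve(cue, memories, threshold=0.6))        # conservative: only the true match
print(len(retrieve(cue, memories, threshold=0.1)))   # 'crazy': dozens of spurious associations
```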
This general analogy or framework - although oversimplified - also provides a useful perspective for understanding both schizotypy and hallucinogenic drugs. There is a large body of accumulated evidence in the form of use cases or trip reports, with a general consensus that hallucinogens can provide occasional flashes of creative insight at the expense of pushing one farther towards madness.
From a skeptical stance, using hallucinogenic drugs in an attempt to improve the mind is like doing surgery with butter-knives. Nonetheless, careful exploration of the sanity border can help one understand more on how the mind works from the inside.
Cannabis in particular is believed - by many of its users - to enhance creativity via occasional flashes of insight. Most of its main mental effects (time dilation, random associations, memory impairment, spatial navigation impairment, etc.) appear to involve the hippocampus. We could explain much of this as a general shift in the precision/recall tradeoff to make the hippocampus less selective. Mostly that just makes the HC work less effectively, but it can also occasionally lead to atypical creative insights, and appears to elevate some related low level measures such as schizotypy and divergent thinking[7]. The tradeoff is one must be willing to first sift through a pile of low value random associations.
Cultivate memetic heterogeneity and heterozygosity
Fluid intelligence is obviously important, but in many endeavors net creativity is even more important.
Of all the components underlying creativity, improving the efficiency of learning, the quality of knowledge learned, and the organizational efficiency of one's internal cortical maps are probably the most profitable dimensions of improvement: the low hanging fruits.
Our learning process is largely automatic and subconscious: we do not need to teach children how to perceive the world. But this just means it takes some extra work to analyze the underlying machinery and understand how to best utilize it.
Over long time scales humanity has learned a great deal on how to improve on natural innate learning: education is more or less learning-engineering. The first obvious lesson from education is the need for curriculum: acquiring concepts in stages of escalating complexity and order-dependency (which of course is already now increasingly a thing in machine learning).
In most competitive creative domains, formal education can only train you up to the starting gate. This of course is to be expected, for the creation of novel and useful ideas requires uncommon insights.
Memetic evolution is similar to genetic evolution in that novelty comes more from recombination than mutation. We can draw some additional practical lessons from this analogy: cultivate memetic heterogeneity and heterozygosity.
The first part - cultivate memetic heterogeneity - should be straightforward, but it is worth examining some examples. If you possess only the same baseline memetic population as your peers, then the chances of your mind evolving truly novel creative combinations are substantially diminished. You have no edge - your insights are likely to be common.
To illustrate this point, let us consider a few examples:
Geoffrey Hinton is one of the most successful researchers in machine learning - which itself is a diverse field. He first formally studied psychology, and then artificial intelligence. His roughly 200 research publications integrate ideas from statistics, neuroscience, and physics. His work on Boltzmann machines and variants in particular imports concepts from statistical physics whole cloth.
Before founding DeepMind (now one of the premier DL research groups in the world), Demis Hassabis studied the brain and hippocampus in particular at the Gatsby Computational Neuroscience Unit, and before that he worked for years in the video game industry after studying computer science.
Before the Annus Mirabilis, Einstein worked at the patent office for four years, during which time he was exposed to a large variety of ideas relating to the transmission of electric signals and electrical-mechanical synchronization of time, core concepts which show up in his later thought experiments.[8]
Creative people also tend to have a diverse social circle of creative friends to share and exchange ideas across fields.
Genetic heterozygosity is the quality of having two different alleles at a gene locus; summed over the organism this leads to a different but related concept of diversity.
Within developing fields of knowledge we often find key questions or subdomains for which there are multiple competing hypotheses or approaches. Good old fashioned AI vs Connectionism, Ray tracing vs Rasterization, and so on.
In these scenarios, it is almost always better to understand both viewpoints or knowledge clusters - at least to some degree. Each cluster is likely to have some unique ideas which are useful for understanding the greater truth or at the very least for later recombination.
This then is memetic heterozygosity. It invokes the Jain version of the blind men and the elephant.
Construct and maintain clean conceptual taxonomies
Formal education has developed various methods and rituals which have been found to be effective through a long process of experimentation. Some of these techniques are still quite useful for autodidacts.
When one sets out to learn, it is best to start with a clear goal. The goal of high school is just to provide a generalist background. In college one then chooses a major suitable for a particular goal cluster: do you want to become a computer programmer? a physicist? a biologist? etc. A significant amount of work then goes into structuring a learning curriculum most suitable for these goal types.
Once out of the educational system we all end up creating our own curriculums, whether intentionally or not. It can be helpful to think strategically as if planning a curriculum to suit one's longer term goals.
For example, about four years ago I decided to learn how the brain works and how AGI could be built in particular. When starting on this journey, I had a background mainly in computer graphics, simulation, and game related programming. I decided to focus about equally on mainstream AI, machine learning, computational neuroscience, and the AGI literature. I quickly discovered that my statistics background was a little weak, so I had to shore that up. Doing it all over again I may have started with a statistics book. Instead I started with AI: a modern approach (of course I mostly learn from the online research literature).
Learning works best when it is applied. Education exploits this principle and it is just as important for autodidactic learning. The best way to learn many math or programming concepts is learning by doing, where you create reasonable subtasks or subgoals for yourself along the way.
For general knowledge, application can take the form of writing about what you have learned. Academics are doing this all the time as they write papers and textbooks, but the same idea applies outside of academia.
In particular a good exercise is to imagine that you need to communicate all that you have learned about the domain. Imagine that you are writing a textbook or survey paper for example, and then you need to compress all that knowledge into a summary chapter or paper, and then all of that again down into an abstract. Then actually do write up a summary - at least in the form of a blog post (even if you don't show it to anybody).
The same ideas apply on some level to giving oral presentations or just discussing what you have learned informally - all of which are also features of the academic learning environment.
Early on, your first attempts to distill what you have learned into written form will be... poor. But going through this process forces you to compress what you have learned, and thus it helps encourage the formation of well structured concept maps in the cortex.
A well structured conceptual map can be thought of as a memetic taxonomy. The point of a taxonomy is to organize all the invariances and 'is-a' relationships between objects so that higher level inferences and transformations can generalize well across categories.
Explicitly asking questions which probe the conceptual taxonomy can help force said structure to take form. For example in computer science/programming the question: "what is the greater generalization of this algorithm?" is a powerful tool.
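As a deliberately tiny illustration of that question in code: summing a list is one instance of folding a list with a binary operation and an identity element, and noticing the generalization lets product, maximum, concatenation, and friends fall out of the same structure.

```python
from functools import reduce

def fold(op, identity, xs):
    """The greater generalization of 'sum the list'."""
    return reduce(op, xs, identity)

total   = fold(lambda a, b: a + b, 0, [1, 2, 3, 4])             # 10 (sum)
product = fold(lambda a, b: a * b, 1, [1, 2, 3, 4])             # 24 (product)
joined  = fold(lambda a, b: a + b, "", ["is-a", "/", "has-a"])  # 'is-a/has-a'
```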
In some domains, it may even be possible to semi-automate or at least guide the creative process using a structured method.
For example consider sci-fi/fantasy genre novels. Many of the great works have a general analogical structure based on real history ported over into a more exotic setting. The Foundation series uses the model of the fall of the Roman Empire. Dune is like Lawrence of Arabia in space. Stranger in a Strange Land is like the Mormon version of Jesus the space alien, but from Mars instead of Kolob. A Song of Ice and Fire is partly a fantasy port of the Wars of the Roses. And so on.
One could probably find some new ideas for novels just by creating and exploring a sufficiently large table of historical events and figures and comparing it to a map of the currently colonized space of ideas. Obviously having an idea for a novel is just the tiniest tip of the iceberg in the process, but a semi-formal method is interesting nonetheless for brainstorming and applies across domains (others have proposed similar techniques for generating startup ideas, for example).
Conclusion
We are born equipped with sophisticated learning machinery and yet lack innate knowledge on how to use it effectively - for this too we must learn.
The greatest constraint on creative ability is the quality of conceptual maps in the cortex. Understanding how these maps form doesn't automagically increase creativity, but it does help ground our intuitions and knowledge about learning, and could pave the way for future improved techniques.
In the meantime: cultivate memetic heterogeneity and heterozygosity, create a learning strategy, develop and test your conceptual taxonomy, continuously compress what you learn by writing and summarizing, and find ways to apply what you learn as you go.
The Brain as a Universal Learning Machine
This article presents an emerging architectural hypothesis of the brain as a biological implementation of a Universal Learning Machine. I present a rough but complete architectural view of how the brain works under the universal learning hypothesis. I also contrast this new viewpoint - which comes from computational neuroscience and machine learning - with the older evolved modularity hypothesis popular in evolutionary psychology and the heuristics and biases literature. These two conceptions of the brain lead to very different predictions for the likely route to AGI, the value of neuroscience, the expected differences between AGI and humans, and thus any consequent safety issues and dependent strategies.

(The image above is from a recent mysterious post to r/machinelearning, probably from a Google project that generates art based on a visualization tool used to inspect the patterns learned by convolutional neural networks. I am especially fond of the weird figures riding the cart in the lower left.)
- Intro: Two viewpoints on the Mind
- Universal Learning Machines
- Historical Interlude
- Dynamic Rewiring
- Brain Architecture (the whole brain in one picture and a few pages of text)
- The Basal Ganglia
- Implications for AGI
- Conclusion
Intro: Two Viewpoints on the Mind
Few discoveries are more irritating than those that expose the pedigree of ideas.
-- Lord Acton (probably)
Less Wrong is a site devoted to refining the art of human rationality, where rationality is based on an idealized conceptualization of how minds should or could work. Less Wrong and its founding sequences draw heavily on the heuristics and biases literature in cognitive psychology and related work in evolutionary psychology. More specifically, the sequences build upon a specific cluster in the space of cognitive theories, which can be identified in particular with the highly influential "evolved modularity" perspective of Cosmides and Tooby.
From Wikipedia:
Evolutionary psychologists propose that the mind is made up of genetically influenced and domain-specific[3] mental algorithms or computational modules, designed to solve specific evolutionary problems of the past.[4]
From "Evolutionary Psychology and the Emotions":[5]
An evolutionary perspective leads one to view the mind as a crowded zoo of evolved, domain-specific programs. Each is functionally specialized for solving a different adaptive problem that arose during hominid evolutionary history, such as face recognition, foraging, mate choice, heart rate regulation, sleep management, or predator vigilance, and each is activated by a different set of cues from the environment.
If you imagine these general theories or perspectives on the brain/mind as points in theory space, the evolved modularity cluster posits that much of the machinery of human mental algorithms is largely innate. General learning - if it exists at all - exists only in specific modules; in most modules learning is relegated to the role of adapting existing algorithms and acquiring data; the impact of the information environment is de-emphasized. In this view the brain is a complex, messy kludge of evolved mechanisms.
The universal learning hypothesis proposes that all significant mental algorithms are learned; nothing is innate except for the learning and reward machinery itself (which is somewhat complicated, involving a number of systems and mechanisms), the initial rough architecture (equivalent to a prior over mindspace), and a small library of simple innate circuits (analogous to the operating system layer in a computer). In this view the mind (software) is distinct from the brain (hardware). The mind is a complex software system built out of a general learning mechanism.
Additional indirect support comes from the rapid unexpected success of Deep Learning[7], which is entirely based on building AI systems using simple universal learning algorithms (such as Stochastic Gradient Descent or various other approximate Bayesian methods[8][9][10][11]) scaled up on fast parallel hardware (GPUs). Deep Learning techniques have quickly come to dominate most of the key AI benchmarks including vision[12], speech recognition[13][14], various natural language tasks, and now even ATARI[15] - proving that simple architectures (priors) combined with universal learning are a path (and perhaps the only viable path) to AGI. Moreover, the internal representations that develop in some deep learning systems are structurally and functionally similar to representations in analogous regions of biological cortex[16].
To paraphrase Feynman: to truly understand something you must build it.
In this article I am going to quickly introduce the abstract concept of a universal learning machine, present an overview of the brain's architecture as a specific type of universal learning machine, and finally I will conclude with some speculations on the implications for the race to AGI and AI safety issues in particular.
Universal Learning Machines
A universal learning machine is a simple and yet very powerful and general model for intelligent agents. It is an extension of a general computer - such as a Turing Machine - amplified with a universal learning algorithm. Do not view this as my 'big new theory' - it is simply an amalgamation of a set of related proposals by various researchers.
An initial untrained seed ULM can be defined by 1.) a prior over the space of models (or equivalently, programs), 2.) an initial utility function, and 3.) the universal learning machinery/algorithm. The machine is a real-time system that processes an input sensory/observation stream and produces an output motor/action stream to control the external world using a learned internal program that is the result of continuous self-optimization.
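Purely as an illustrative caricature of the above (the class and method names here are invented, and the 'learning step' is a trivial placeholder rather than anything resembling a real universal learning algorithm), a seed ULM and its real-time loop might look roughly like this:

```python
import random

class SeedULM:
    """Toy caricature of an untrained seed ULM: a prior over programs,
    an initial utility function, and learning machinery that revises the
    current internal program in light of experience."""

    def __init__(self, sample_program, utility):
        self.sample_program = sample_program   # 1) prior: a sampler over candidate programs
        self.utility = utility                 # 2) initial utility/reward function
        self.program = sample_program()        # current learned internal program

    def act(self, observation):
        return self.program(observation)       # output motor/action stream

    def learn(self, observation, action, outcome):
        # 3) placeholder learning step: keep a proposed revision only if it
        # scores at least as well as the current program on this experience.
        candidate = self.sample_program()
        if self.utility(candidate(observation), outcome) >= self.utility(action, outcome):
            self.program = candidate

# Tiny example: programs are constant-offset policies, utility is closeness to a target.
sample = lambda: (lambda obs, k=random.uniform(-1, 1): obs + k)
utility = lambda action, target: -abs(action - target)

agent, target, obs = SeedULM(sample, utility), 3.0, 0.0
for _ in range(50):                            # real-time sensory -> motor loop
    action = agent.act(obs)
    agent.learn(obs, action, target)
    obs = action                               # toy environment dynamics
```

The point of the sketch is only the division of labor: a tiny fixed prior and utility function, and a learned program that carries essentially all of the eventual complexity.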
There is of course always room to smuggle in arbitrary innate functionality via the prior, but in general the prior is expected to be extremely small in bits in comparison to the learned model.
The key defining characteristic of a ULM is that it uses its universal learning algorithm for continuous recursive self-improvement with regard to the utility function (reward system). We can view this as second (and higher) order optimization: the ULM optimizes the external world (first order), and also optimizes its own internal optimization process (second order), and so on. There is nothing special about self-modification decisions here: any system capable of computing a large number of decision variables can also compute internal self-modification decisions - they are just more decision variables.
Conceptually the learning machinery computes a probability distribution over program-space that is proportional to the expected utility distribution. At each timestep it receives a new sensory observation and expends some amount of computational energy to infer an updated (approximate) posterior distribution over its internal program-space: an approximate 'Bayesian' self-improvement.
The above description is intentionally vague in the right ways to cover the wide space of possible practical implementations and current uncertainty. You could view AIXI as a particular formalization of the above general principles, although it is also as dumb as a rock in any practical sense and has other potential theoretical problems. Although the general idea is simple enough to convey in the abstract, one should beware of concise formal descriptions: practical ULMs are too complex to reduce to a few lines of math.
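With that caveat in mind, and purely as a toy schematic (an illustrative paraphrase, not a formalization the above commits to), a single timestep might be written as:

$$p_{t+1}(m) \;\propto\; p_t(m)\, p(o_t \mid m), \qquad a_t \in \arg\max_{a} \; \mathbb{E}_{m \sim p_{t+1}}\!\left[ U(m, a) \right]$$

where $m$ ranges over internal programs/models, $o_t$ is the current observation, $a_t$ the chosen action, and $U$ the utility function; the 'approximate Bayesian self-improvement' is then the resource-bounded computation of an approximation to $p_{t+1}$.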
A ULM inherits the general property of a Turing Machine that it can compute anything that is computable, given appropriate resources. However a ULM is also more powerful than a TM. A Turing Machine can only do what it is programmed to do. A ULM automatically programs itself.
If you were to open up an infant ULM - a machine with zero experience - you would mainly just see the small initial code for the learning machinery. The vast majority of the codestore starts out empty - initialized to noise. (In the brain the learning machinery is built in at the hardware level for maximal efficiency).
Theoretical Turing Machines are all qualitatively alike, and are all qualitatively distinct from any non-universal machine. Likewise for ULMs. Theoretically a small ULM is just as general/expressive as a planet-sized ULM. In practice quantitative distinctions do matter, and can become effectively qualitative.
Just as the simplest possible Turing Machine is in fact quite simple, the simplest possible Universal Learning Machine is also probably quite simple. A couple of recent proposals for simple universal learning machines include the Neural Turing Machine[16] (from Google DeepMind), and Memory Networks[17]. The core of both approaches involves training an RNN to learn how to control a memory store through gating operations.
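As a rough illustration of the 'RNN controlling a memory store through gating' idea (a simplified sketch inspired by those proposals, not their actual architectures; all names and sizes below are made up), consider content-based reading and gated writing over a matrix of memory slots:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def read(memory, key):
    """Content-based read: attend to slots by similarity to the key."""
    weights = softmax(memory @ key)          # one attention weight per slot
    return weights @ memory, weights         # weighted sum of slot vectors

def write(memory, weights, erase_gate, add_vector):
    """Gated write: each slot is partially erased and partially overwritten,
    in proportion to its attention weight."""
    erase = np.outer(weights, erase_gate)    # (slots, width), values in [0, 1]
    add = np.outer(weights, add_vector)
    return memory * (1 - erase) + add

# Toy usage; in the real systems the key, erase gate, and add vector
# are emitted each timestep by a trained controller RNN.
memory = np.random.randn(8, 4) * 0.1         # 8 slots, width 4
key = np.random.randn(4)
value, w = read(memory, key)
memory = write(memory, w, erase_gate=np.full(4, 0.5), add_vector=np.ones(4))
```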
Historical Interlude
At this point you may be skeptical: how could the brain be anything like a universal learner? What about all of the known innate biases/errors in human cognition? I'll get to that soon, but let's start by thinking of a couple of general experiments to test the universal learning hypothesis vs the evolved modularity hypothesis.
In a world where the ULH is mostly correct, what do we expect to be different than in worlds where the EMH is mostly correct?
One type of evidence that would support the ULH is the demonstration of key structures in the brain along with associated wiring such that the brain can be shown to directly implement some version of a ULM architecture.
To support the EMH, it is not sufficient to demonstrate that there are things brains cannot learn in practice - because those could simply be quantitative limitations. Demonstrating that an Intel 486 can't compute some known computable function in our lifetimes is not proof that the 486 is not a Turing Machine.
Nor is it sufficient to demonstrate that biases exist: a ULM is only 'rational' to the extent that its observational experience and learning machinery allows (and to the extent one has the correct theory of rationality). In fact, the existence of many (most?) biases intrinsically depends on the EMH - based on the implicit assumption that some cognitive algorithms are innate. If brains are mostly ULMs then most cognitive biases dissolve, or become learning biases - for if all cognitive algorithms are learned, then evidence for biases is evidence for cognitive algorithms that people haven't had sufficient time/energy/motivation to learn. (This does not imply that intrinsic limitations/biases do not exist or that the study of cognitive biases is a waste of time; rather the ULH implies that educational history is what matters most)
The genome can only specify a limited amount of information. The question is then how much of our advanced cognitive machinery for things like facial recognition, motor planning, language, logic, planning, etc. is innate vs learned. From evolution's perspective there is a huge advantage to preloading the brain with innate algorithms so long as said algorithms have high expected utility across the expected domain landscape.
On the other hand, evolution is also highly constrained in a bit coding sense: every extra bit of code costs additional energy for the vast number of cellular replication events across the lifetime of the organism. Low code complexity solutions also happen to be exponentially easier to find. These considerations seem to strongly favor the ULH but they are difficult to quantify.

Neuroscientists have long known that the brain is divided into physical and functional modules. These modular subdivisions were discovered a century ago by Brodmann. Every time neuroscientists opened up a new brain, they saw the same old cortical modules in the same old places doing the same old things. The specific layout of course varied from species to species, but the variations between individuals are minuscule. This evidence seems to strongly favor the EMH.
Throughout most of the 90's and into the 2000's, computational neuroscience models and AI systems were heavily influenced by - and, unsurprisingly, largely supported - the EMH. Neural nets and backprop were known of course since the 1980's and worked on small problems[18], but at the time they didn't scale well - and there was no theory to suggest they ever would.
Theory of the time also suggested local minima would always be a problem (now we understand that local minima are not really the main problem[19], and modern stochastic gradient descent methods combined with highly overcomplete models and stochastic regularization[20] are effectively global optimizers that can often handle obstacles such as local minima and saddle points[21]).
The other related historical criticism rests on the lack of biological plausibility for backprop style gradient descent. (There is as of yet little consensus on how the brain implements the equivalent machinery, but target propagation is one of the more promising recent proposals[22][23].)
Many AI researchers are naturally interested in the brain, and we can see the influence of the EMH in much of the work before the deep learning era. HMAX is a hierarchical vision system developed in the late 90's by Poggio et al as a working model of biological vision[24]. It is based on a preconfigured hierarchy of modules, each of which has its own mix of innate features such as Gabor edge detectors along with a little bit of local learning. It implements the general idea that complex algorithms/features are innate - the result of evolutionary global optimization - while neural networks (incapable of global optimization) use Hebbian local learning to fill in details of the design.
Dynamic Rewiring
In a groundbreaking study from 2000 published in Nature, Sharma et al successfully rewired ferret retinal pathways to project into the auditory cortex instead of the visual cortex.[25] The result: auditory cortex can become visual cortex, just by receiving visual data! Not only does the rewired auditory cortex develop the specific Gabor features characteristic of visual cortex; the rewired cortex also becomes functionally visual. [26] True, it isn't quite as effective as normal visual cortex, but that could also possibly be an artifact of crude and invasive brain rewiring surgery.
The ferret study was popularized by the book On Intelligence by Hawkins in 2004 as evidence for a single cortical learning algorithm. This helped percolate the evidence into the wider AI community, and thus probably helped set the stage for the deep learning movement of today. The modern view of the cortex is that of a mostly uniform set of general purpose modules which slowly become recruited for specific tasks and filled with domain specific 'code' as a result of the learning (self optimization) process.
The next key set of evidence comes from studies of atypical human brains with novel extrasensory powers. In 2009 Vuillerme et al showed that the brain could automatically learn to process sensory feedback rendered onto the tongue[27]. This research was developed into a complete device that allows blind people to develop primitive tongue based vision.
In the modern era some blind humans have apparently acquired the ability to perform echolocation (sonar), similar to cetaceans. In 2011 Thaler et al used MRI and PET scans to show that human echolocators use diverse non-auditory brain regions to process echo clicks, predominantly relying on re-purposed 'visual' cortex.[27]
The echolocation study in particular helps establish the case that the brain is actually doing global, highly nonlocal optimization - far beyond simple Hebbian dynamics. Echolocation is an active sensing strategy that requires very low latency processing, involving complex timed coordination between a number of motor and sensory circuits - all of which must be learned.
Somehow the brain is dynamically learning how to use and assemble cortical modules to implement mental algorithms: everyday tasks such as visual counting, comparisons of images or sounds, reading, etc - all are tasks which require simple mental programs that can shuffle processed data between modules (some or any of which can also function as short term memory buffers).
To explain this data, we should be on the lookout for a system in the brain that can learn to control the cortex - a general system that dynamically routes data between different brain modules to solve domain specific tasks.
But first let's take a step back and start with a high level architectural view of the entire brain to put everything in perspective.
Brain Architecture
Below is a circuit diagram for the whole brain. The main subsystems work together and are best understood together. You can probably get a good, high level, extremely coarse understanding of the entire brain in less than one hour.

(There are a couple of circuit diagrams of the whole brain on the web, but this is the best. It is from this site.)
The human brain has ~100 billion neurons and ~100 trillion synapses, but ultimately it evolved from the bottom up - from organisms with just hundreds of neurons, like the tiny brain of C. elegans.
We know that evolution is code complexity constrained: much of the genome codes for cellular metabolism, all the other organs, and so on. For the brain, most of its bit budget needs to be spent on all the complex neuron, synapse, and even neurotransmitter level machinery - the low level hardware foundation.
For a tiny brain with 1000 neurons or less, the genome can directly specify each connection. As you scale up to larger brains, evolution needs to create vastly more circuitry while still using only about the same amount of code/bits. So instead of specifying connectivity at the neuron layer, the genome codes connectivity at the module layer. Each module can be built from simple procedural/fractal expansion of progenitor cells.
So the size of a module has little to nothing to do with its innate complexity. The cortical modules are huge - V1 alone contains 200 million neurons in a human - but there is no reason to suspect that V1 has greater initial code complexity than any other brain module. Big modules are built out of simple procedural tiling patterns.
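To make the 'constant code, variable size' point concrete, here is a hypothetical sketch (the function and its parameters are invented for illustration): a few lines of 'genome' specify a local wiring rule, and that same rule tiles out a module of whatever size is requested, so the description length does not grow with the module:

```python
def local_connections(width, height, radius=1):
    """Procedurally generate local 2D connectivity: each unit connects to its
    neighbors within `radius`. The rule stays a constant few lines of code
    no matter how large the module is."""
    def idx(x, y):
        return y * width + x
    edges = []
    for y in range(height):
        for x in range(width):
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    nx, ny = x + dx, y + dy
                    if (dx or dy) and 0 <= nx < width and 0 <= ny < height:
                        edges.append((idx(x, y), idx(nx, ny)))
    return edges

# The same tiny rule yields a tiny or a huge module:
print(len(local_connections(4, 4)))      # small module
print(len(local_connections(400, 400)))  # much bigger module, same "genome"
```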
Very roughly the brain's main modules can be divided into six subsystems (there are numerous smaller subsystems):
- The neocortex: the brain's primary computational workhorse (blue/purple modules at the top of the diagram). Kind of like a bunch of general purpose FPGA coprocessors.
- The cerebellum: another set of coprocessors with a simpler feedforward architecture. Specializes more in motor functionality.
- The thalamus: the orangish modules below the cortex. Kind of like a relay/routing bus.
- The hippocampal complex: the apex of the cortex, and something like the brain's database.
- The amygdala and limbic reward system: these modules specialize in something like the value function.
- The Basal Ganglia (green modules): the central control system, similar to a CPU.
In the interest of space/time I will focus primarily on the Basal Ganglia and will just touch on the other subsystems very briefly and provide some links to further reading.
The neocortex has been studied extensively and is the main focus of several popular books on the brain. Each neocortical module is a 2D array of neurons (technically 2.5D, with a depth of a few dozen neurons arranged in roughly 5 to 6 layers).
Each cortical module is something like a general purpose RNN (recurrent neural network) with 2D local connectivity. Each neuron connects to its neighbors in the 2D array. Each module also has nonlocal connections to other brain subsystems, and these connections follow the same local 2D connectivity pattern, in some cases with some simple affine transformations. Convolutional neural networks use the same general architecture (but they are typically not recurrent).
Cortical modules - like artificial RNNs - are general purpose and can be trained to perform various tasks. There are a huge number of models of the cortex, varying across the tradeoff between biological realism and practical functionality.
Perhaps surprisingly, any of a wide variety of learning algorithms can reproduce cortical connectivity and features when trained on appropriate sensory data[27]. This is a computational proof of the one-learning-algorithm hypothesis; furthermore it illustrates the general idea that data determines functional structure in any general learning system.
There is evidence that cortical modules learn automatically (unsupervised) to some degree, and there is also some evidence that cortical modules can be trained to relearn data from other brain subsystems - namely the hippocampal complex. The dark knowledge distillation technique in ANNs[28][29] is a potential natural analog/model of hippocampus -> cortex knowledge transfer.
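For readers unfamiliar with the ANN technique, here is a minimal numpy sketch of the core of distillation - training a 'student' against the teacher's softened ('dark knowledge') output distribution. This is offered only as an analogy for hippocampus -> cortex transfer, not as a model of the biology; the temperature value and array sizes are arbitrary:

```python
import numpy as np

def softmax(logits, temperature=1.0):
    z = logits / temperature
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """Cross-entropy between the teacher's softened output distribution
    and the student's, computed at the same temperature."""
    teacher_probs = softmax(teacher_logits, temperature)
    student_log_probs = np.log(softmax(student_logits, temperature) + 1e-12)
    return -np.sum(teacher_probs * student_log_probs, axis=-1).mean()

# Toy usage with random logits standing in for the teacher and student networks.
teacher = np.random.randn(32, 10)
student = np.random.randn(32, 10)
print(distillation_loss(student, teacher))
```

The softened targets carry much more information per example than hard labels, which is what makes this kind of network-to-network knowledge transfer efficient.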
Module connections are bidirectional, and feedback connections (from high level modules to low level) outnumber forward connections. We can speculate that something like target propagation can also be used to guide or constrain the development of cortical maps (speculation).
The hippocampal complex is the root or top level of the sensory/motor hierarchy. This short YouTube video gives a good seven-minute overview of the HC. It is like a spatiotemporal database. It receives compressed scene descriptor streams from the sensory cortices, it stores this information in medium-term memory, and it supports later auto-associative recall of these memories. Imagination and memory recall seem to be basically the same.
The 'scene descriptors' take the sensible form of things like 3D position and camera orientation, as encoded in place, grid, and head direction cells. This is basically the logical result of compressing the sensory stream, comparable to the networking data stream in a multiplayer video game.
Imagination/recall is basically just the reverse of the forward sensory coding path - in reverse mode a compact scene descriptor is expanded into a full imagined scene. Imagined/remembered scenes activate the same cortical subnetworks that originally formed the memory (or would have if the memory was real, in the case of imagined recall).
The amygdala and associated limbic reward modules are rather complex, but look something like the brain's version of the value function for reinforcement learning. These modules are interesting because they clearly rely on learning, but clearly the brain must specify an initial version of the value/utility function that has some minimal complexity.
As an example, consider taste. Infants are born with basic taste detectors and a very simple initial value function for taste. Over time the brain receives feedback from digestion and various estimators of general mood/health, and it uses this to refine the initial taste value function. Eventually the adult sense of taste becomes considerably more complex. Acquired tastes for bitter substances - such as coffee and beer - are good examples.
The amygdala appears to do something similar for emotional learning. For example, infants are born with a simple version of a fear response, which is later refined through reinforcement learning. The amygdala sits on the end of the hippocampus, and it is also involved heavily in memory processing.
See also these two videos from Khan Academy: one on the limbic system and amygdala (10 mins), and another on the midbrain reward system (8 mins).

The Basal Ganglia
The Basal Ganglia is a weird-looking complex of structures located in the center of the brain. It is a conserved structure found in all vertebrates, which suggests a core functionality. The BG is proximal to and connects heavily with the midbrain reward/limbic systems. It also connects to the brain's various modules in the cortex/hippocampus, thalamus and the cerebellum . . . basically everything.
All of these connections form recurrent loops between associated compartmental modules in each structure: thalamocortical/hippocampal-cerebellar-basal_ganglial loops.


Just as the cortex and hippocampus are subdivided into modules, there are corresponding modular compartments in the thalamus, basal ganglia, and the cerebellum. The set of modules/compartments in each main structure are all highly interconnected with their correspondents across structures, leading to the concept of distributed processing modules.
Each DPM forms a recurrent loop across brain structures (the local networks in the cortex, BG, and thalamus are also locally recurrent, whereas those in the cerebellum are not). These recurrent loops are mostly separate, but each sub-structure also provides different opportunities for inter-loop connections.
The BG appears to be involved in essentially all higher cognitive functions. Its core functionality is action selection via subnetwork switching. In essence action selection is the core problem of intelligence, and it is also general enough to function as the building block of all higher functionality. A system that can select between motor actions can also select between tasks or subgoals. More generally, low level action selection can easily form the basis of a Turing Machine via selective routing: deciding where to route the output of thalamocortical-cerebellar modules (some of which may specialize in short term memory as in the prefrontal cortex, although all cortical modules have some short term memory capability).
There are now a number of computational models for the Basal Ganglia-Cortical system that demonstrate possible biologically plausible implementations of the general theory[28][29]; integration with the hippocampal complex leads to larger-scale systems which aim to model/explain most of higher cognition in terms of sequential mental programs[30] (of course fully testing any such models awaits sufficient computational power to run very large-scale neural nets).
For an extremely oversimplified model of the BG as a dynamic router, consider an array of N distributed modules controlled by the BG system. The BG control network expands these N inputs into an NxN matrix. There are N^2 potential intermodular connections, each of which can be individually controlled. The control layer reads a compressed, downsampled version of the modules' hidden units as its main input, and is also recurrent. Each output node in the BG has a multiplicative gating effect which selectively enables/disables an individual intermodular connection. If the control layer is naively fully connected, this would require (N^2)^2 connections, which is only feasible for N ~ 100 modules, but sparse connectivity can substantially reduce those numbers.
It is unclear (to me) whether the BG actually implements NxN style routing as described above, or something more like 1xN or Nx1 routing, but there is general agreement that it implements cortical routing.

Of course in actuality the BG architecture is considerably more complex, as it also must implement reinforcement learning, and the intermodular connectivity map itself is also probably quite sparse/compressed (the BG may not control all of cortex, certainly not at a uniform resolution, and many controlled modules may have a very limited number of allowed routing decisions). Nonetheless, the simple multiplicative gating model illustrates the core idea.
This same multiplicative gating mechanism is the core principle behind the highly successful LSTM (Long Short-Term Memory)[30] units that are used in various deep learning systems. The simple version of the BG's gating mechanism can be considered a wider parallel and hierarchical extension of the basic LSTM architecture, where you have a parallel array of N memory cells instead of 1, and each memory cell is a large vector instead of a single scalar value.
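Here is a minimal numpy sketch of the multiplicative gating idea described above (an oversimplified toy, not a model of actual BG circuitry; all sizes and weight matrices are made up, and the control layer is left naively dense for clarity): N module state vectors, and an N x N gate matrix, computed from a compressed readout of those states, deciding which module outputs get routed to which modules.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

N, width, readout = 6, 16, 4                         # made-up sizes
modules = np.random.randn(N, width)                   # current state of each module
W_read = np.random.randn(width, readout) * 0.1        # compress each module's state
W_gate = np.random.randn(N * readout, N * N) * 0.1    # control layer (naive/dense here)

# The controller reads a downsampled summary of every module...
summary = (modules @ W_read).reshape(-1)               # (N * readout,)
# ...and emits one multiplicative gate per potential intermodular connection.
gates = sigmoid(summary @ W_gate).reshape(N, N)        # values in [0, 1]

# Routing step: module i's next input is the gate-weighted mix of module outputs.
routed_inputs = gates @ modules                        # (N, width)
```

With a sparse gate matrix and recurrence in the controller, this is essentially the LSTM gating trick applied in parallel across an array of vector-valued 'cells'.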
The main advantage of the BG architecture is parallel hierarchical approximate control: it allows a large number of hierarchical control loops to update and influence each other in parallel. It also reduces the huge complexity of general routing across the full cortex down into a much smaller-scale, more manageable routing challenge.
Implications for AGI
These two conceptions of the brain - the universal learning machine hypothesis and the evolved modularity hypothesis - lead to very different predictions for the likely route to AGI, the expected differences between AGI and humans, and thus any consequent safety issues and strategies.
In the extreme case imagine that the brain is a pure ULM, such that the genetic prior information is close to zero or is simply unimportant. In this case it is vastly more likely that successful AGI will be built around designs very similar to the brain, as the ULM architecture in general is the natural ideal, vs the alternative of having to hand engineer all of the AI's various cognitive mechanisms.
In reality learning is computationally hard, and any practical general learning system depends on good priors to constrain the learning process (essentially taking advantage of previous knowledge/learning). The recent and rapid success of deep learning is strong evidence for how much prior information is ideal: just a little. The prior in deep learning systems takes the form of a compact, small set of hyperparameters that control the learning process and specify the overall network architecture (an extremely compressed prior over the network topology and thus the program space).
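To make the 'small prior, large learned model' contrast concrete, here is a hypothetical back-of-the-envelope comparison (the hyperparameter names and layer sizes are invented for illustration):

```python
# Toy illustration: the "prior" (architecture + training hyperparameters) is a
# handful of numbers, while the learned model is millions of parameters.
hyperparameters = {
    "layer_sizes": [784, 1024, 1024, 10],  # hypothetical network topology
    "learning_rate": 1e-3,
    "batch_size": 128,
    "weight_decay": 1e-5,
}

def parameter_count(layer_sizes):
    """Number of learned weights and biases in a fully connected net."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

n_prior_numbers = len(hyperparameters["layer_sizes"]) + 3   # the few knobs above
n_learned = parameter_count(hyperparameters["layer_sizes"])
print(f"prior: ~{n_prior_numbers} numbers, learned model: {n_learned:,} parameters")
```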
The ULH suggests that most everything that defines the human mind is cognitive software rather than hardware: the adult mind (in terms of algorithmic information) is 99.999% a cultural/memetic construct. Obviously there are some important exceptions: infants are born with some functional but very primitive sensory and motor processing 'code'. Most of the genome's complexity is used to specify the learning machinery, and the associated reward circuitry. Infant emotions appear to simplify down to a single axis of happy/sad; differentiation into the more subtle vector space of adult emotions does not occur until later in development.
If the mind is software, and if the brain's learning architecture is already universal, then AGI could - by default - end up with a similar distribution over mindspace, simply because it will be built out of similar general purpose learning algorithms running over the same general dataset. We already see evidence for this trend in the high functional similarity between the features learned by some machine learning systems and those found in the cortex.
Of course an AGI will have little need for some specific evolutionary features: emotions that are subconsciously broadcast via the facial muscles are a quirk unnecessary for an AGI - but that is a rather specific detail.
The key takeaway is that the data is what matters - and in the end it is all that matters. Train a universal learner on image data and it just becomes a visual system. Train it on speech data and it becomes a speech recognizer. Train it on ATARI and it becomes a little gamer agent.
Train a universal learner on the real world in something like a human body and you get something like the human mind. Put a ULM in a dolphin's body and echolocation is the natural primary sense, put a ULM in a human body with broken visual wiring and you can also get echolocation.
Control over training is the most natural and straightforward way to control the outcome.
To create a superhuman AI driver, you 'just' need to create a realistic VR driving sim and then train a ULM in that world (better training and the simple power of selective copying lead to superhuman driving capability).
So to create benevolent AGI, we should think about how to create virtual worlds with the right structure, how to educate minds in those worlds, and how to safely evaluate the results.
One key idea - which I proposed five years ago - is that the AI should not know it is in a sim.
New AI designs (world design + architectural priors + training/education system) should be tested first in the safest virtual worlds: which, to simplify, are low tech worlds without computer technology. Design combinations that work well in safe low-tech sandboxes are promoted to less safe high-tech VR worlds, and then finally the real world.
A key principle of a secure code sandbox is that the code you are testing should not be aware that it is in a sandbox. If you violate this principle then you have already failed. Yudkowsky's AI box thought experiment assumes the violation of the sandbox security principle a priori and thus is something of a distraction. (The virtual sandbox idea was most likely discussed elsewhere previously, as Yudkowsky indirectly critiques a strawman version of the idea via this sci-fi story.)
The virtual sandbox approach also combines nicely with invisible thought monitors, where the AI's thoughts are automatically dumped to searchable logs.
Of course we will still need a solution to the value learning problem. The natural route with brain-inspired AI is to learn the key ideas behind value acquisition in humans to help derive an improved version of something like inverse reinforcement learning and/or imitation learning[31] - an interesting topic for another day.
Conclusion
Ray Kurzweil has been predicting for decades that AGI will be built by reverse engineering the brain, and this particular prediction is not especially unique - this has been a popular position for quite a while. My own investigation of neuroscience and machine learning led me to a similar conclusion some time ago.
The recent progress in deep learning, combined with the emerging modern understanding of the brain, provides further evidence that AGI could arrive around the time when we can build and train ANNs with similar computational power as measured very roughly in terms of neuron/synapse counts. In general the evidence from the last four years or so supports Hanson's viewpoint from the Foom debate. More specifically, his general conclusion:
Future superintelligences will exist, but their vast and broad mental capacities will come mainly from vast mental content and computational resources. By comparison, their general architectural innovations will be minor additions.
The ULH supports this conclusion.
Current ANN engines can already train and run models with around 10 million neurons and 10 billion (compressed/shared) synapses on a single GPU, which suggests that the goal could soon be within the reach of a large organization. Furthermore, Moore's Law for GPUs still has some steam left, and software advances are currently improving simulation performance at a faster rate than hardware. These trends imply that Anthropomorphic/Neuromorphic AGI could be surprisingly close, and may appear suddenly.
What kind of leverage can we exert on a short timescale?
[Link] Nate Soares is answering questions about MIRI at the EA Forum
Nate Soares, MIRI's new Executive Director, is going to be answering questions tomorrow at the EA Forum (link). You can post your questions there now; he'll start replying Thursday, 15:00-18:00 US Pacific time.
Quoting Nate:
Last week Monday, I took the reins as executive director of the Machine Intelligence Research Institute. MIRI focuses on studying technical problems of long-term AI safety. I'm happy to chat about what that means, why it's important, why we think we can make a difference now, what the open technical problems are, how we approach them, and some of my plans for the future.
I'm also happy to answer questions about my personal history and how I got here, or about personal growth and mindhacking (a subject I touch upon frequently in my blog, Minding Our Way), or about whatever else piques your curiosity.
Nate is a regular poster on LessWrong under the name So8res -- you can find stuff he's written in the past here.
Update: Question-answering is live!
Update #2: Looks like Nate's wrapping up now. Feel free to discuss the questions and answers, here or at the EA Forum.
Update #3: Here are some interesting snippets from the AMA:
Alex Altair: What are some of the most neglected sub-tasks of reducing existential risk? That is, what is no one working on which someone really, really should be?
Nate Soares: Policy work / international coordination. Figuring out how to build an aligned AI is only part of the problem. You also need to ensure that an aligned AI is built, and that’s a lot harder to do during an international arms race. (A race to the finish would be pretty bad, I think.)
I’d like to see a lot more people figuring out how to ensure global stability & coordination as we enter a time period that may be fairly dangerous.
Diego Caleiro: 1) Which are the implicit assumptions, within MIRI's research agenda, of things that "currently we have absolutely no idea of how to do that, but we are taking this assumption for the time being, and hoping that in the future either a more practical version of this idea will be feasible, or that this version will be a guiding star for practical implementations"? [...]
2) How do these assumptions diverge from how FLI, FHI, or non-MIRI people publishing on the AGI 2014 book conceive of AGI research?
3) Optional: Justify the differences in 2 and why MIRI is taking the path it is taking.
Nate Soares: 1) The things we have no idea how to do aren't the implicit assumptions in the technical agenda, they're the explicit subject headings: decision theory, logical uncertainty, Vingean reflection, corrigibility, etc :-)
We've tried to make it very clear in various papers that we're dealing with very limited toy models that capture only a small part of the problem (see, e.g., basically all of section 6 in the corrigibility paper).
Right now, we basically have a bunch of big gaps in our knowledge, and we're trying to make mathematical models that capture at least part of the actual problem -- simplifying assumptions are the norm, not the exception. All I can easily say that common simplifying assumptions include: you have lots of computing power, there is lots of time between actions, you know the action set, you're trying to maximize a given utility function, etc. Assumptions tend to be listed in the paper where the model is described.
2) The FLI folks aren't doing any research; rather, they're administering a grant program. Most FHI folks are focused more on high-level strategic questions (What might the path to AI look like? What methods might be used to mitigate xrisk? etc.) rather than object-level AI alignment research. And remember that they look at a bunch of other X-risks as well, and that they're also thinking about policy interventions and so on. Thus, the comparison can't easily be made. (Eric Drexler's been doing some thinking about the object-level FAI questions recently, but I'll let his latest tech report fill you in on the details there. Stuart Armstrong is doing AI alignment work in the same vein as ours. Owain Evans might also be doing object-level AI alignment work, but he's new there, and I haven't spoken to him recently enough to know.)
Insofar as FHI folks would say we're making assumptions, I doubt they'd be pointing to assumptions like "UDT knows the policy set" or "assume we have lots of computing power" (which are obviously simplifying assumptions on toy models), but rather assumptions like "doing research on logical uncertainty now will actually improve our odds of having a working theory of logical uncertainty before it's needed."
3) I think most of the FHI folks & FLI folks would agree that it's important to have someone hacking away at the technical problems, but just to make the arguments more explicit, I think that there are a number of problems that it's hard to even see unless you have your "try to solve FAI" goggles on. [...]
We're still in the preformal stage, and if we can get this theory to the formal stage, I expect we may be able to get a lot more eyes on the problem, because the ever-crawling feelers of academia seem to be much better at exploring formalized problems than they are at formalizing preformal problems.
Then of course there's the heuristic of "it's fine to shout 'model uncertainty!' and hover on the sidelines, but it wasn't the armchair philosophers who did away with the epicycles, it was Kepler, who was up to his elbows in epicycle data." One of the big ways that you identify the things that need working on is by trying to solve the problem yourself. By asking how to actually build an aligned superintelligence, MIRI has generated a whole host of open technical problems, and I predict that that host will be a very valuable asset now that more and more people are turning their gaze towards AI alignment.
Buck Shlegeris: What's your response to Peter Hurford's arguments in his article Why I'm Skeptical Of Unproven Causes...?
Nate Soares: (1) One of Peter's first (implicit) points is that AI alignment is a speculative cause. I tend to disagree.
Imagine it's 1942. The Manhattan project is well under way, Leo Szilard has shown that it's possible to get a neutron chain reaction, and physicists are hard at work figuring out how to make an atom bomb. You suggest that this might be a fine time to start working on nuclear containment, so that, once humans are done bombing the everloving breath out of each other, they can harness nuclear energy for fun and profit. In this scenario, would nuclear containment be a "speculative cause"?
There are currently thousands of person-hours and billions of dollars going towards increasing AI capabilities every year. To call AI alignment a "speculative cause" in an environment such as this one seems fairly silly to me. In what sense is it speculative to work on improving the safety of the tools that other people are currently building as fast as they can? Now, I suppose you could argue that either (a) AI will never work or (b) it will be safe by default, but both those arguments seem pretty flimsy to me.
You might argue that it's a bit weird for people to claim that the most effective place to put charitable dollars is towards some field of scientific study. Aren't charitable dollars supposed to go to starving children? Isn't the NSF supposed to handle scientific funding? And I'd like to agree, but society has kinda been dropping the ball on this one.
If we had strong reason to believe that humans could build strangelets, and society were pouring billions of dollars and thousands of human-years into making strangelets, and almost no money or effort was going towards strangelet containment, and it looked like humanity was likely to create a strangelet sometime in the next hundred years, then yeah, I'd say that "strangelet safety" would be an extremely worthy cause.
How worthy? Hard to say. I agree with Peter that it's hard to figure out how to trade off "safety of potentially-very-highly-impactful technology that is currently under furious development" against "children are dying of malaria", but the only way I know how to trade those things off is to do my best to run the numbers, and my back-of-the-envelope calculations currently say that AI alignment is further behind than the globe is poor.
Now that the EA movement is starting to look more seriously into high-impact interventions on the frontiers of science & mathematics, we're going to need to come up with more sophisticated ways to assess the impacts and tradeoffs. I agree it's hard, but I don't think throwing out everything that doesn't visibly pay off in the extremely short term is the answer.
(2) Alternatively, you could argue that MIRI's approach is unlikely to work. That's one of Peter's explicit arguments: it's very hard to find interventions that reliably affect the future far in advance, especially when there aren't hard objective metrics. I have three disagreements with Peter on this point.
First, I think he picks the wrong reference class: yes, humans have a really hard time generating big social shifts on purpose. But that doesn't necessarily mean humans have a really hard time generating math -- in fact, humans have a surprisingly good track record when it comes to generating math!
Humans actually seem to be pretty good at putting theoretical foundations underneath various fields when they try, and various people have demonstrably succeeded at this task (Church & Turing did this for computing, Shannon did this for information theory, Kolmogorov did a fair bit of this for probability theory, etc.). This suggests to me that humans are much better at producing technical progress in an unexplored field than they are at generating social outcomes in a complex economic environment. (I'd be interested in any attempt to quantitatively evaluate this claim.)
Second, I agree in general that any one individual team isn't all that likely to solve the AI alignment problem on their own. But the correct response to that isn't "stop funding AI alignment teams" -- it's "fund more AI alignment teams"! If you're trying to ensure that nuclear power can be harnessed for the betterment of humankind, and you assign low odds to any particular research group solving the containment problem, then the answer isn't "don't fund any containment groups at all," the answer is "you'd better fund a few different containment groups, then!"
Third, I object to the whole "there's no feedback" claim. Did Kolmogorov have tight feedback when he was developing an early formalization of probability theory? It seems to me like the answer is "yes" -- figuring out what was & wasn't a mathematical model of the properties he was trying to capture served as a very tight feedback loop (mathematical theorems tend to be unambiguous), and indeed, it was sufficiently good feedback that Kolmogorov was successful in putting formal foundations underneath probability theory.
Interstice: What is your AI arrival timeline?
Nate Soares: Eventually. Predicting the future is hard. My 90% confidence interval conditioned on no global catastrophes is maybe 5 to 80 years. That is to say, I don't know.
Tarn Somervell Fletcher: What are MIRI's plans for publication over the next few years, whether peer-reviewed or arxiv-style publications?
More specifically, what are the a) long-term intentions and b) short-term actual plans for the publication of workshop results, and what kind of priority does that have?
Nate Soares: Great question! The short version is, writing more & publishing more (and generally engaging with the academic mainstream more) are very high on my priority list.
Mainstream publications have historically been fairly difficult for us, as until last year, AI alignment research was seen as fairly kooky. (We've had a number of papers rejected from various journals due to the "weird AI motivation.") Going forward, it looks like that will be less of an issue.
That said, writing capability is a huge bottleneck right now. Our researchers are currently trying to (a) run workshops, (b) engage with & evaluate promising potential researchers, (c) attend conferences, (d) produce new research, (e) write it up, and (f) get it published. That's a lot of things for a three-person research team to juggle! Priority number 1 is to grow the research team (because otherwise nothing will ever be unblocked), and we're aiming to hire a few new researchers before the year is through. After that, increasing our writing output is likely the next highest priority.
Expect our writing output this year to be similar to last year's (i.e., a small handful of peer reviewed papers and a larger handful of technical reports that might make it onto the arXiv), and then hopefully we'll have more & higher quality publications starting in 2016 (the publishing pipeline isn't particularly fast).
Tor Barstad: Among recruiting new talent and having funding for new positions, what is the greatest bottleneck?
Nate Soares: Right now we’re talent-constrained, but we’re also fairly well-positioned to solve that problem over the next six months. Jessica Taylor is joining us in August. We have another researcher or two pretty far along in the pipeline, and we’re running four or five more research workshops this summer, and CFAR is running a summer fellows program in July. It’s quite plausible that we’ll hire a handful of new researchers before the end of 2015, in which case our runway would start looking pretty short, and it’s pretty likely that we’ll be funding constrained again by the end of the year.
Diego Caleiro: I see a trend in the way new EAs concerned about the far future think about where to donate money that seems dangerous, it goes:
I am an EA and care about impactfulness and neglectedness -> Existential risk dominates my considerations -> AI is the most important risk -> Donate to MIRI.
The last step frequently involves very little thought, it borders on a cached thought.
Nate Soares: Huh, that hasn't been my experience. We have a number of potential donors who ring us up and ask who in AI alignment needs money the most at the moment. (In fact, last year, we directed a number of donors to FHI, who had much more of a funding gap than MIRI did at that time.)
1. What are your plans for taking MIRI to the next level? What is the next level?
2. Now that MIRI is focused on math research (a good move) and not on outreach, there is less of a role for volunteers and supporters. With the donation from Elon Musk, some of which will presumably get to MIRI, the marginal value of small donations has gone down. How do you plan to keep your supporters engaged and donating? (The alternative, which is perhaps feasible, could be for MIRI to be an independent research institution, without a lot of public engagement, funded by a few big donors.)
Nate Soares:
1. (a) grow the research team, (b) engage more with mainstream academia. I'd also like to spend some time experimenting to figure out how to structure the research team so as to make it more effective (we have a lot of flexibility here that mainstream academic institutes don't have). Once we have the first team growing steadily and running smoothly, it's not entirely clear whether the next step will be (c.1) grow it faster or (c.2) spin up a second team inside MIRI taking a different approach to AI alignment. I'll punt that question to future-Nate.
2. So first of all, I'm not convinced that there's less of a role for supporters. If we had just ten people earning-to-give at the (amazing!) level of Ethan Dickinson, Jesse Liptrap, Mike Blume, or Alexei Andreev (note: Alexei recently stopped earning-to-give in order to found a startup), that would bring in as much money per year as the Thiel Foundation. (I think people often vastly overestimate how many people are earning-to-give to MIRI, and underestimate how useful it is: the small donors taken together make a pretty big difference!)
Furthermore, if we successfully execute on (a) above, then we're going to be burning through money quite a bit faster than before. An FLI grant (if we get one) will certainly help, but I expect it's going to be a little while before MIRI can support itself on large donations & grants alone.
Announcement: The Sequences eBook will be released in mid-March
The Sequences are being released as an eBook, titled Rationality: From AI to Zombies, on March 12.
We went with the name "Rationality: From AI to Zombies" (based on shminux's suggestion) to make it clearer to people — who might otherwise be expecting a self-help book, or an academic text — that the style and contents of the Sequences are rather unusual. We want to filter for readers who have a wide-ranging interest in (/ tolerance for) weird intellectual topics. Alternative options tended to obscure what the book is about, or obscure its breadth / eclecticism.
The book's contents
Around 340 of Eliezer's essays from 2009 and earlier will be included, collected into twenty-six sections ("sequences"), compiled into six books:
- Map and Territory: sequences on the Bayesian conceptions of rationality, belief, evidence, and explanation.
- How to Actually Change Your Mind: sequences on confirmation bias and motivated reasoning.
- The Machine in the Ghost: sequences on optimization processes, cognition, and concepts.
- Mere Reality: sequences on science and the physical world.
- Mere Goodness: sequences on human values.
- Becoming Stronger: sequences on self-improvement and group rationality.
The six books will be released as a single sprawling eBook, making it easy to hop back and forth between different parts of the book. The whole book will be about 1,800 pages long. However, we'll also be releasing the same content as a series of six print books (and as six audio books) at a future date.
The Sequences have been tidied up in a number of small ways, but the content is mostly unchanged. The largest change is to how the content is organized. Some important Overcoming Bias and Less Wrong posts that were never officially sorted into sequences have now been added — 58 additions in all, forming four entirely new sequences (and also supplementing some existing sequences). Other posts have been removed — 105 in total. The following old sequences will be the most heavily affected:
- Map and Territory and Mysterious Answers to Mysterious Questions are being merged, expanded, and reassembled into a new set of introductory sequences, with more focus placed on cognitive biases. The name 'Map and Territory' will be re-applied to this entire collection of sequences, constituting the first book.
- Quantum Physics and Metaethics are being heavily reordered and heavily shortened.
- Most of Fun Theory and Ethical Injunctions is being left out. Taking their place will be two new sequences on ethics, plus the modified version of Metaethics.
I'll provide more details on these changes when the eBook is out.
Unlike the print and audio-book versions, the eBook version of Rationality: From AI to Zombies will be entirely free. If you want to purchase it on the Kindle Store and download it directly to your Kindle, it will also be available on Amazon for $4.99.
To make the content more accessible, the eBook will include introductions I've written up for this purpose. It will also include a link to a LessWrongWiki glossary, which I'll be recruiting LessWrongers to help populate with explanations of references and jargon from the Sequences.
I'll post an announcement to Main as soon as the eBook is available. See you then!
CFAR in 2014: Continuing to climb out of the startup pit, heading toward a full prototype
Summary: We outline CFAR’s purpose, our history in 2014, and our plans heading into 2015.
- Highlights from 2014.
- Improving operations.
- Attempts to go beyond the current workshop and toward the ‘full prototype’ of CFAR: our experience in 2014 and plans for 2015.
- Nuts, bolts, and financial details.
- The big picture and how you can help.
One of the reasons we’re publishing this review now is that we’ve just launched our annual matching fundraiser, and we want to provide the information our prospective donors need in order to decide. This is the best time of year to decide to donate to CFAR. Donations up to $120k will be matched until January 31.[1]
To briefly preview: For the first three years of our existence, CFAR mostly focused on getting going. We followed the standard recommendation to build a ‘minimum viable product’, the CFAR workshops, that could test our ideas and generate some revenue. Coming into 2013, we had a workshop that people liked (9.3 average rating on “Are you glad you came?”; a more recent random survey showed 9.6 average rating on the same question 6-24 months later), which helped keep the lights on and gave us articulate, skeptical, serious learners to iterate on. At the same time, the workshops are not everything we would want in a CFAR prototype; it feels like the current core workshop does not stress-test most of our hopes for what CFAR can eventually do. The premise of CFAR is that we should be able to apply the modern understanding of cognition to improve people’s ability to (1) figure out the truth (2) be strategically effective (3) do good in the world. We have dreams of scaling up some particular kinds of sanity. Our next goal is to build the minimum strategic product that more directly justifies CFAR’s claim to be an effective altruist project.[2]
The Future of Humanity Institute could make use of your money
Many people have an incorrect view of the Future of Humanity Institute's funding situation, so this is a brief note to correct that; think of it as a spiritual successor to this post. As John Maxwell puts it, FHI is "one of the three organizations co-sponsoring LW [and] a group within the University of Oxford's philosophy department that tackles important, large-scale problems for humanity like how to go about reducing existential risk." (If you're not familiar with our work, this article is a nice, readable introduction, and our director, Nick Bostrom, wrote Superintelligence.) Though we are a research institute in an ancient and venerable institution, this does not guarantee funding or long-term stability.
[meta] Future moderation and investigation of downvote abuse cases, or, I don't want to deal with this stuff
Since the episode with Eugine_Nier, I have received three private messages from different people asking me to investigate various cases of suspected mass downvoting. And to be quite honest, I don't want to deal with this. Eugine's case was relatively clear-cut, since he had engaged in systematic downvoting of a massive scale, but the new situations are a lot fuzzier and I'm not sure of what exactly the rules should be (what counts as a permitted use of the downvote system and what doesn't?).
At least one person has also privately contacted me and offered to carry out moderator duties if I don't want them, but even if I told them yes (on what basis? why them and not someone else?), I don't know what kind of policy I should tell them to enforce. I only happened to be appointed a moderator because I was in the list of top 10 posters at a particular time, and I don't feel like I should have any particular authority to make the rules. Nor do I feel like I have any good idea of what the rules should be, or who would be the right person to enforce them.
In any case, I don't want to be doing this job, nor do I particularly feel like being responsible for figuring out who should, or how, or what the heck. I've already started visiting LW less often because I dread having new investigation requests to deal with. So if you folks could be so kind as to figure it out without my involvement? If there's a clear consensus that someone in particular should deal with this, I can give them mod powers, or something.
[meta] Policy for dealing with users suspected/guilty of mass-downvote harassment?
Below is a message I just got from jackk. Some specifics have been redacted 1) so that we can discuss general policy rather than the details of this specific case 2) because presumption of innocence, just in case there happens to be an innocuous explanation to this.
Hi Kaj_Sotala,
I'm Jack, one of the Trike devs. I'm messaging you because you're the moderator who commented most recently. A while back the user [REDACTED 1] asked if Trike could look into retributive downvoting against his account. I've done that, and it looks like [REDACTED 2] has downvoted at least [over half of REDACTED 1's comments, amounting to hundreds of downvotes] ([REDACTED 1]'s next-largest downvoter is [REDACTED 3] at -15).
What action to take is a community problem, not a technical one, so we'd rather leave that up to the moderators. Some options:
1. Ask [REDACTED 2] for the story behind these votes
2. Use the "admin" account (which exists for sending scripted messages, &c.) to apply an upvote to each downvoted post
3. Apply a karma award to [REDACTED 1]'s account. This would fix the karma damage but not the sorting of individual comments
4. Apply a negative karma award to [REDACTED 2]'s account. This makes him pay for false downvotes twice over. This isn't possible in the current code, but it's an easy fix
5. Ban [REDACTED 2]
For future reference, it's very easy for Trike to look at who downvoted someone's account, so if you get questions about downvoting in the future I can run the same report.
If you need to verify my identity before you take action, let me know and we'll work something out.
-- Jack
So... thoughts? I have mod powers, but when I was granted them I was basically just told to use them to fight spam; there was never any discussion of any other policy, and I don't feel like I have the authority to decide on the suitable course of action without consulting the rest of the community.