Less Wrong is a community blog devoted to refining the art of human rationality. Please visit our About page for more information.
As previously discussed, on June 6th I received a message from jackk, a Trike Admin. He reported that the user Jiro had asked Trike to carry out an investigation into the retributive downvoting that Jiro had been subjected to. The investigation revealed that the user Eugine_Nier had downvoted over half of Jiro's comments, amounting to hundreds of downvotes.
I asked for the community's guidance on dealing with the issue, and while the matter was being discussed, I also reviewed previous discussions about mass downvoting and looked for other people who mentioned being the victims of it. I asked Jack to compile reports on several other users who mentioned having been mass-downvoted, and it turned out that Eugine was also overwhelmingly the biggest downvoter of users David_Gerard, daenarys, falenas108, ialdabaoth, shminux, and Tenoke. As this discussion was going on, it turned out that user Ander had also been targeted by Eugine.
I sent two messages to Eugine, requesting an explanation. I received a response today. Eugine admitted his guilt, expressing the opinion that LW's karma system was failing to carry out its purpose of keeping out weak material and that he was engaged in a "weeding" of users who he did not think displayed sufficient rationality.
Needless to say, it is not the place of individual users to unilaterally decide that someone else should be "weeded" out of the community. The Less Wrong content deletion policy contains this clause:
Harassment of individual users.
If we determine that you're e.g. following a particular user around and leaving insulting comments to them, we reserve the right to delete those comments. (This has happened extremely rarely.)
Although the wording does not explicitly mention downvoting, harassment by downvoting is still harassment. Several users have indicated that they have experienced considerable emotional anguish from the harassment, and have in some cases been discouraged from using Less Wrong at all. This is not a desirable state of affairs, to say the least.
I was originally given my moderator powers on a rather ad-hoc basis, with someone awarding mod privileges to the ten users with the highest karma at the time. The original purpose for that appointment was just to delete spam. Nonetheless, since retributive downvoting has been a clear problem for the community, I asked the community for guidance on dealing with the issue. The rough consensus of the responses seemed to authorize me to deal with the problem as I deemed appropriate.
The fact that Eugine remained quiet about his guilt until directly confronted with the evidence, despite several public discussions of the issue, is indicative of him realizing that he was breaking prevailing social norms. Eugine's actions have worsened the atmosphere of this site, and that atmosphere will remain troubled for as long as he is allowed to remain here.
Therefore, I now announce that Eugine_Nier is permanently banned from posting on LessWrong. This decision is final and will not be changed in response to possible follow-up objections.
Unfortunately, it looks like while a ban prevents posting, it does not actually block a user from casting votes. I have asked jackk to look into the matter and find a way to actually stop the downvoting. Jack indicated earlier on that it would be technically straightforward to apply a negative karma modifier to Eugine's account, and wiping out Eugine's karma balance would prevent him from casting future downvotes. Whatever the easiest solution is, it will be applied as soon as possible.
EDIT 24 July 2014: Banned users are now prohibited from voting.
Due in part to Eliezer's writing style (e.g. not many citations), and in part to Eliezer's scholarship preferences (e.g. his preference to figure out much of philosophy on his own), Eliezer's Sequences don't accurately reflect the close agreement between the content of The Sequences and work previously done in mainstream academia.
I predict several effects from this:
- Some readers will mistakenly think that common Less Wrong views are more parochial than they really are.
- Some readers will mistakenly think Eliezer's Sequences are more original than they really are.
- If readers want to know more about the topic of a given article, it will be more difficult for them to find the related works in academia than if those works had been cited in Eliezer's article.
I'd like to counteract these effects by connecting the Sequences to the professional literature. (Note: I sort of doubt it would have been a good idea for Eliezer to spend his time tracking down more references and so on, but I realized a few weeks ago that it wouldn't take me much effort to list some of those references.)
I don't mean to minimize the awesomeness of the Sequences. There is much original content in them (edit: probably most of their content is original), they are engagingly written, and they often have a more transformative effect on readers than the corresponding academic literature.
I'll break my list of references into sections based on how likely I think it is that a reader will have missed the agreement between Eliezer's articles and mainstream academic work.
(This is only a preliminary list of connections.)
Viliam Bur made the announcement in Main, but not everyone checks Main, so I'm repeating it here.
During the following months my time and attention will be heavily occupied by some personal stuff, so I will be unable to function as a LW moderator. The new LW moderator is... NancyLebovitz!
From today, please direct all your complaints and investigation requests to Nancy. Please, not everyone at once during the first week. That can be a bit frightening for a new moderator.
There are a few old requests I haven't completed yet. I will try to close everything during the following days, but if I don't do it till the end of January, then I will forward the unfinished cases to Nancy, too.
Long live the new moderator!
So we say we know evolution is an alien god, which can do absolutely horrifying things to creatures. And surely we are aware that includes us, but how exactly does one internalize something like that? Something so at odds with default cultural intuitions. It may be just my mood tonight, but this short entry on the West Hunter (thanks Glados) blog really grabbed my attention and in a few short paragraphs on a hypothesis regarding the Hobbits of Flores utterly changed how I grok Eliezer's old post.
There is still doubt, but there seems to be a good chance that the Flores Hobbit was a member of a distinct hominid species, rather than some homo sap with a nasty case of microcephalic dwarfism. If this is the case, the Hobbits are likely descended from a small, Australopithecus-like population that managed to move from Africa to Indonesia without leaving any fossils in between, or from some ancient hominid (perhaps homo erectus) that managed to strand themselves on Flores and then shrank, as many large animals do when isolated on islands.
Island dwarfing of a homo erectus population is the dominant idea right now. However, many proponents are really bothered by how small the Hobbit’s brain was. At 400 cc, it was downright teeny, about the size of a chimpanzee’s brain. Most researchers seem to think that hominid brains naturally increase in size with time. They also suspect that anyone with a brain this small couldn’t be called sentient – and the idea of natural selection driving a population from sentience to nonsentience bothers them.
They should get over it. Hominid brain volume has increased pretty rapidly over the past few million years, but the increase hasn’t been monotonic. It’s decreased about 10% over the past 25,000 years. Moreover, we know of examples where natural selection has caused drastic decreases in organismal complexity – for example, canine venereal sarcoma, which today is an infectious cancer, but was once a dog.
I have to break here to note that was the most awesome fact I have learned in some time.
There is a mechanism that might explain what happened on Flores – partial mutational meltdown. Classic mutational meltdown occurs when a population is too small for too long. Selection is inefficient in such a small population: alleles that decrease fitness by less than 1/N drift fairly freely, and can go to fixation. At the same time, favorable mutations, which are very rare, almost never occur. In such a situation, mutational load accumulates – likely further reducing population size – and the population spirals down into extinction. Since small population size and high genetic load increase vulnerability to disaster, some kind of environmental catastrophe usually nails such doomed, shrinking populations before they manage to die off from purely genetic causes.
In principle, if the population is the right size and one adaptive function is considerably more complicated than others, presenting a bigger mutational target, you might see a population suffer a drastic decline in that function while continuing to exist. There is reason to think that intelligence is the most complex adaptation in hominids. More than half of all genes are expressed in the brain, and it seems that a given degree of inbreeding depression – say cousin marriage – depresses IQ more than other traits.
Flores is not that big an island and the population density of homo-erectus type hunter-gatherers must have been low – certainly lower than that of contemporary hunter-gatherers, who have much more sophisticated tools. Thus the hobbit population was likely small. It may not have been possible to sustain a high-performing brain over the long haul in that situation. Given that their brains performed poorly – while the metabolic costs were as high as ever – selection would have shrunk their brains. Over hundreds of thousands of years, this could well have generated the chimp-sized brain we see in the LB1 skeleton.
Of course, this could only have happened if there was an available ecological niche that did not require human-level intelligence. And there was such an opening: Flores had no monkeys.
That last sentence just struck me with utter horror.
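The quoted claim that alleles decreasing fitness by less than 1/N "drift fairly freely" can be checked numerically. The sketch below is not from the West Hunter post: it uses Kimura's standard diffusion approximation for the fixation probability of a single new mutant, and it assumes a diploid population whose effective size equals its census size N. The function names and the example population of 100 are hypothetical, chosen only for illustration.

```python
import math


def fixation_probability(n: int, s: float) -> float:
    """Kimura's diffusion approximation for one new mutant copy.

    n: diploid population size (assumed equal to the effective size).
    s: selection coefficient; negative values are deleterious.
    """
    if s == 0:
        return 1.0 / (2 * n)  # neutral case: pure drift
    # (1 - e^(-2s)) / (1 - e^(-4ns)), written with expm1 for accuracy near 0
    return math.expm1(-2 * s) / math.expm1(-4 * n * s)


def relative_to_neutral(n: int, s: float) -> float:
    """How often the mutant fixes compared with a neutral mutant."""
    return fixation_probability(n, s) * 2 * n


if __name__ == "__main__":
    n = 100  # hypothetical small island population
    for s in (-0.001, -0.005, -0.01, -0.05):
        print(f"s = {s:+.3f} (|s| * N = {abs(s) * n:.1f}): "
              f"{relative_to_neutral(n, s):.3g} x the neutral rate")
```

Under these assumptions, a mutation with |s| well below 1/N fixes almost as often as a neutral one, while a mutation several times more harmful than 1/N almost never does. That is the sense in which selection becomes inefficient in a small population.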
In hindsight, this post seems incredibly obvious. The meat of it already exists in sayings which we all know we ought to listen to: "Always arrive 10 minutes earlier than you think early is," "If you arrive on time, then you're late," or "Better three hours too soon than one minute too late." Yet even with these sayings, I still never trusted them nor arrived on time. I'd miss deadlines, show up late, and just be generally tardy. The reason is that I never truly understood what it took to arrive on time until I grokked the math of it. So, while this may be remedial reading for most of you, I'm posting this because maybe there's someone out there who missed the same obviousness that I missed.
Everyone here understands that our universe is controlled and explained by math. Math describes how heavenly bodies move. Math describes how our computers run. Math describes how other people act in aggregate. Wait a second, something's not right with that statement... "other people". The way it comes out it's natural to think that math controls the way that other people act, and not myself. Intellectually, I am aware that I am not a special snowflake who is exempt from the laws of math. While I had managed to propagate this thought far enough to crush my belief in libertarian free will, I hadn't propagated it fully through my mind. Specifically, I hadn't realized I could also use math to describe my actions and reap the benefit of understanding them mathematically. I was still late to arrive and missing deadlines, and nothing seemed to help.
But wait, I'm a rationalist! I know all about the planning fallacy; I know to take the outside view! That's enough to save me right? Well, not quite. It seemed I missed one last part of the puzzle... Bell Curves.
When I go to work every day, the time from when I start doing nothing but getting ready for work until the time that I actually arrive there (I'll just call this prep time) usually takes 45 minutes, but sometimes it can take more time or less time. Weirdly and crazily enough, if you plot all the prep times on a graph, the shape would end up looking roughly like a bell. Well that's funny. Math is for other people, but my behavior appears like it can be described statistically. Some days I will have deviations from the normal routine that help me arrive faster while other days will have things that slow me down. Some of them happen more often, some of them happen less often. If I were describable by math, I could almost call these things standard deviations: on days with almost zero traffic, prep time takes 1 standard deviation less; on days when I can't find my car keys, it takes 1 standard deviation more; days when I realize I would be late and skip showering take 2 standard deviations less; and days when there is a terrible accident on the freeway end up requiring 2 or 3 standard deviations more. To put it in other words, my prep time is a bell curve, and I've got 1-sigma and 2-sigma (and occasionally 3-sigma) events speeding me up and slowing me down.
This holds true for more than just going to work. Everything's time-until-completion can be described this way: project completion times, homework, going to the airport, the duration of foreplay and sex. Everything. It's not always bell curves, but it's a probability distribution with respect to completion times, and that can help give useful insights.
Starting 'On Time' Means You Won't be On Time
What do we gain by understanding that our actions are described by a probability distribution? The first and most important take away is this: If you only allocate the exact amount of time to do something, you'll be late 50% of the time. I'm going to repeat it and italicize because I think it's that important of a point. If you only allocate the exact amount of time to do something, you'll be late 50% of the time. That's the way bell curves work.
I know I've heard jokes about how 90% of the population has above average children, but it wasn't until I really looked at the math of my behavior that I realized I was doing the exact same thing. I'd say "oh it takes me 45 minutes on average to go to work every day, so I'll leave at 7:15." Yet I never realized that I was completely ignoring that half the time it would take longer than average. So half the time, I'd end up being pressed for time and have to skip shaving (or something) or I'd end up late. I was terribly unpunctual until I realized that I had to arrive early to always arrive on time. "If you arrive on time, then you are late." Hmm. You win this one, folk wisdom.
Still, the question remained. How much early would it take to never be late? The answer lay in bell curves.
Acceptable Lateness and Standard Deviation
Looking at time requirements as a bell curve implies another thing: One can never completely eliminate all lateness; the only option is to make a choice about what probability of lateness is acceptable. A person must decide what lateness ratio they're willing to take, and then start prepping that many standard deviations beforehand. And, despite what employers say, 0% is not a probability.
If my prep time averages 45 minutes with a standard deviation of 10 minutes then that means...
- Starting 45 minutes beforehand will force me to be late or miss services (eg shaving) around 50% of the time or about 10 workdays a month.
- Starting 55 minutes beforehand will force me to be late or miss services (eg shaving) around 16% of the time or about 3 workdays a month.
- Starting 65 minutes beforehand will force me to be late or miss services (eg shaving) around 2.3% of the time or about 1 day every other month.
That's really good risk reduction for a small amount of time spent. (NB, remember that averages are dangerous little things. Taking this to a meta level, consider that being late to work about 3 times a month isn't helpful if you arrive late only once the first month, then get fired the next month when you arrive late 5 times. Hence, "Always arrive 10 minutes earlier than you think early is." God I hate folk wisdom, especially when it's right.)
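The three figures in the list above can be reproduced in a few lines of Python. This is a minimal sketch, assuming prep time is exactly normal with the mean (45 minutes) and standard deviation (10 minutes) from the example. The 20-workday month is an assumption implied by "50% of the time or about 10 workdays a month", and the name late_probability is made up for illustration.

```python
from statistics import NormalDist

# Figures from the example: mean prep time 45 minutes, standard deviation 10.
PREP_TIME = NormalDist(mu=45, sigma=10)
WORKDAYS_PER_MONTH = 20  # assumption, implied by "50% ... about 10 workdays"


def late_probability(minutes_allowed: float, prep: NormalDist = PREP_TIME) -> float:
    """Chance that prep takes longer than the time allowed."""
    return 1.0 - prep.cdf(minutes_allowed)


if __name__ == "__main__":
    for minutes in (45, 55, 65):
        p = late_probability(minutes)
        print(f"{minutes} min allowed: late {p:.1%} of the time, "
              f"about {p * WORKDAYS_PER_MONTH:.1f} workdays a month")
```

If your own prep times are skewed rather than bell-shaped, the same idea applies with a different distribution, but the 50% figure for "exactly the average" would shift.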
The risk level you're comfortable with dictates how much time you need for padding. For job interviews, I'm only willing to arrive late to about 1 in 1000, so I prepare 3 standard deviations early now. For first dates, I'm willing to miss about 5%. For dinners with the family, I'm okay with being late half the time. It feels similar to the algorithm I used before, which was a sort of ad-hoc thing where I'd prepare earlier for important things. The main difference is that now I can quantify the risk I'm assuming when I procrastinate. It causes each procrastination to become more concrete for me, and drastically reduces the chance that I'll be willing to make those tradeoffs. Instead of being willing to read lesswrong for 10 more minutes in exchange for "oh I might have to rush", I can now see that it would increase my chance of being late from 16% to 50%, which is flatly unacceptable. Viewing procrastination in terms of the latter tradeoff makes it much easier to get myself moving.
The last quote is "Better three hours too soon than one minute too late." I'm glad that at least that one's wrong. I'm sure Umesh would have some stern words for that saying. The key to arriving on time is locating your acceptable risk threshold and making an informed decision about how much risk you are willing to take.
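Going the other direction, from an acceptable lateness rate to a lead time, means reading the bell curve backwards. The sketch below assumes the same normal model as the earlier example (mean 45 minutes, standard deviation 10). The function name minutes_needed and the way the three occasions are mapped to probabilities are illustrative assumptions.

```python
from statistics import NormalDist


def minutes_needed(acceptable_late: float, mean: float = 45, sd: float = 10) -> float:
    """Lead time at which prep overruns with probability `acceptable_late`.

    Raises statistics.StatisticsError unless 0 < acceptable_late < 1,
    which matches the point that 0% lateness is not on offer.
    """
    return NormalDist(mean, sd).inv_cdf(1.0 - acceptable_late)


if __name__ == "__main__":
    occasions = [("job interview", 0.001), ("first date", 0.05), ("family dinner", 0.5)]
    for label, risk in occasions:
        print(f"{label}: allow about {minutes_needed(risk):.0f} minutes")
```

Under an exactly normal model, 3 standard deviations corresponds to being late roughly 1 time in 740 rather than 1 in 1000, so "3 standard deviations" is a round-number stand-in and the stricter target needs a little more padding.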
The time it takes for you to complete any task is (usually) described by a bell curve. How much time you think you'll take is a lie, and not just because of the planning fallacy. Even if you do the sciency-thing and take the outside view, it's still not enough to keep you from getting fired or showing up to your interview late. To consistently show up on time, you must incorporate padding time.
So I've got a new saying, "If you wish to be late only 2.3% of the time, you must start getting ready at least two standard deviations before the average prep time you have needed historically." I wish my mom would have told me this one. It's so much easier to understand than all those other sayings!
(Also my first actual article-thingy, so any comments or suggestions are welcome)
- Curse of knowledge
- Duration neglect
- Extension neglect
- Extrinsic incentives bias
- Illusion of external agency
- Illusion of validity
- Insensitivity to sample size
- Lady Macbeth effect
- Less-is-better effect
- Naïve cynicism
- Naïve realism
- Reactive devaluation
- Rhyme-as-reason effect
- Scope neglect
Also conjunction fallacy has been expanded.
The idea of this article is something I've talked about a couple of times in comments. It seems to require more attention.
As a general rule, what is obvious to some people may not be obvious to others. Is this obvious to you? Maybe it was. Maybe it wasn't, and you thought it was because of hindsight bias.
Imagine a substantive Less Wrong comment. It's insightful, polite, easy to understand, and otherwise good. Ideally, you upvote this comment. Now imagine the same comment, only with "obviously" in front. This shouldn't change much, but it does. This word seems to change the comment in multifarious bad ways that I'd rather not try to list.
Uncharitably, I might reduce this whole phenomenon to an example of a mind projection fallacy. The implicit deduction goes like this: "I found <concept> obvious. Thus, <concept> is inherently obvious." The problem is that obviousness, like probability, is in the mind.
The stigma of "obvious" ideas has another problem in preventing things from being said at all. I don't know how common this is, but I've actually been afraid of saying things that I thought were obvious, even though ignoring this fear and just posting has yet to result in a poorly-received comment. (That is, in fact, why I'm writing this.)
Even tautologies, which are always obvious in retrospect, can be hard to spot. How many of us would have explicitly realized the weak anthropic principle without Nick Bostrom's help?
And what about implications of beliefs you already hold? These should be obvious, and sometimes are, but our brains are notoriously bad at putting two and two together. Luke's example was not realizing that an intelligence explosion was imminent until he read the I.J. Good paragraph. I'm glad he provided that example, as it has saved me the trouble of making one.
This is not (to paraphrase Eliezer) a thunderbolt of insight. I bring it up because I propose a few community norms based on the idea:
- Don't be afraid of saying something because it's "obvious". It's like how your teachers always said there are no stupid questions.
- Don't burden your awesome ideas with "obvious but it needs to be said".
- Don't vote down a comment because it says something "obvious" unless you've thought about it for a while. Also, don't shun "obvious" ideas.
- Don't call an idea obvious as though obviousness were an inherent property of the idea. Framing it as a personally obvious thing can be a more accurate way of saying what you're trying to say, but it's hard to do this without looking arrogant. (I suspect this is actually one of the reasons we implicitly treat obviousness as impersonal.)
Computer activist Aaron H. Swartz committed suicide in New York City yesterday, Jan. 11.
The accomplished Swartz co-authored the now widely-used RSS 1.0 specification at age 14, was one of the three co-owners of the popular social news site Reddit, and completed a fellowship at Harvard’s Ethics Center Lab on Institutional Corruption. In 2010, he founded DemandProgress.org, a “campaign against the Internet censorship bills SOPA/PIPA.”
He deserves a eulogy more eloquent than what I am capable of writing. Here's one by Cory Doctorow, one of his longtime friends.
It's a sad world in which you are being arrested and grand jury'd for downloading scientific journals and papers with the intent to share them.
I recently gave a talk at Chicago Ideas Week on adapting Turing Tests to have better, less mindkill-y arguments, and this is the precis for folks who would prefer not to sit through the video (which is available here).
Conventional Turing Tests check whether a programmer can build a convincing facsimile of a human conversationalist. The test has turned out to reveal less about machine intelligence than human intelligence. (Anger is really easy to fake, since fights can end up a little more Markov chain-y, where you only need to reply to the most recent rejoinder and can ignore what came before). Since normal Turing Tests made us think more about our model of human conversation, economist Bryan Caplan came up with a way to use them to make us think more usefully about our models of our enemies.
After Paul Krugman disparaged Caplan's brand of libertarian economics, Caplan challenged him to an ideological Turing Test, where both players would be human, but would be trying to accurately imitate each other. Caplan and Krugman would each answer questions about their true beliefs honestly, and then would fill out the questionnaire again in persona inimici - trying to guess the answers given by the other side. Caplan was willing to bet that he understood Krugman's position well enough to mimic it, but Krugman would be easily spotted as a fake!Caplan.
Krugman didn't take him up on the offer, but I've run a couple iterations of the test for my religion/philosophy blog. The first year, some of the most interesting results were the proxy variables people were using, which weren't as strong indicators as the judges thought. (One Catholic coasted through to victory as a faux atheist, since many of the atheist judges thought there was no way a Christian would appreciate the webcomic SMBC).
The trouble was, the Christians did a lot better, since it turned out I had written boring, easy to guess questions for the true and faux atheists. The second year, I wrote weirder questions, and the answers were a lot more diverse and surprising (and a number of the atheist participants called out each other as fakes or just plain wrong, since we'd gotten past the shallow questions from year one, and there's a lot of philosophical diversity within atheism).
The exercise made people get curious about what it was their opponents actually thought and why. It helped people spot incorrect stereotypes of an opposing side and faultlines they'd been ignoring within their own. Personally, (and according to other participants) it helped me have an argument less antagonistically. Instead of just trying to find enough of a weak point to discomfit my opponent, I was trying to build up a model of how they thought, and I needed their help to do it.
Taking a calm, inquisitive look at an opponent's position might teach me that my position is wrong, or has a gap I need to investigate. But even if my opponent is just as wrong as zer seemed, there's still a benefit to me. Having a really detailed, accurate model of zer position may help me show them why it's wrong, since now I can see exactly where it rasps against reality. And even if my conversation isn't helpful to them, it's interesting for me to see what they were missing. I may be correct in this particular argument, but the odds are good that I share the rationalist weak-point that is keeping them from noticing the error. I'd like to be able to see it more clearly so I can try and spot it in my own thought. (Think of this as the shift from "How the hell can you be so dumb?!" to "How the hell can you be so dumb?").
When I get angry, I'm satisfied when I beat my interlocutor. When I get curious, I'm only satisfied when I learn something new.
In Keep Your Identity Small, Paul Graham argues against associating yourself with labels (e.g. “libertarian,” “feminist,” “gamer,” “American”) because labels constrain what you’ll let yourself believe. It’s a wonderful essay that’s led me to make concrete changes in my life. That said, it’s only about 90% correct. I have two issues with Graham’s argument; one is a semantic quibble, but it leads into the bigger issue, which is a tactic I’ve used to become a better person.
Graham talks about the importance of identity in determining beliefs. This isn’t quite the right framework. I’m a fanatical consequentialist, so I care what actions people take. Beliefs can constrain actions, but identity can also constrain actions directly.
To give a trivial example from the past week in which beliefs didn’t matter: I had a self-image as someone who didn’t wear jeans or t-shirts. As it happens, there are times when wearing jeans is completely fine, and when other people wore jeans in casual settings, I knew it was appropriate. Nevertheless, I wasn’t able to act on this belief because of my identity. (I finally realized this was silly, consciously discarded that useless bit of identity, and made a point of wearing jeans to a social event.)
Why is this distinction important? If we’re looking at identity from an action-centered framework, this recommends a different approach from Graham’s.
Do you want to constrain your beliefs? No; you want to go wherever the evidence pushes you. “If X is true, I desire to believe that X is true. If X is not true, I desire to believe that X is not true.” Identity will only get in the way.
Do you want to constrain your actions? Yes! Ten thousand times yes! Akrasia exists. Commitment devices are useful. Beeminder is successful. Identity is one of the most effective tools for the job, if you wield it deliberately.
I’ve cultivated an identity as a person who makes events happen. It took months to instill, but now, when I think “I wish people were doing X,” I instinctively start putting together a group to do X. This manifests in minor ways, like the tree-climbing expedition I put together at the Effective Altruism Summit, and in big ways, like the megameetup we held in Boston. If I hadn’t used my identity to motivate myself, neither of those things would’ve happened, and my life would be poorer.
Identity is powerful. Powerful things are dangerous, like backhoes and bandsaws. People use them anyway, because sometimes they’re the best tools for the job, and because safety precautions can minimize the danger.
Identity is hard to change. Identity can be difficult to notice. Identity has unintended consequences. Use this tool only after careful deliberation. What would this identity do to your actions? What would it do to your beliefs? What social consequences would it have? Can you do the same thing with a less dangerous tool? Think twice, and then think again, before you add to your identity. Most identities are a hindrance.
But please, don’t discard this tool just because some things might go wrong. If you are willful, and careful, and wise, then you can cultivate the identity of the person you always wanted to be.
Last year, I asked LW for some advice about spaced repetition software (SRS) that might be useful to me as a high school teacher. With said advice came a request to write a follow-up after I had accumulated some experience using SRS in the classroom. This is my report.
Please note that this was not a scientific experiment to determine whether SRS "works." Prior studies are already pretty convincing on this point and I couldn't think of a practical way to run a control group or "blind" myself. What follows is more of an informal debriefing for how I used SRS during the 2014-15 school year, my insights for others who might want to try it, and how the experience is changing how I teach.
SRS can raise student achievement even with students who won't use the software on their own, and even with frequent disruptions to the study schedule. Gains are most apparent with the already high-performing students, but are also meaningful for the lowest students. Deliberate efforts are needed to get student buy-in, and getting the most out of SRS may require changes in course design.
After looking into various programs, including the game-like Memrise, and even writing my own simple SRS, I ultimately went with Anki for its multi-platform availability, cloud sync, and ease-of-use. I also wanted a program that could act as an impromptu catch-all bin for the 2,000+ cards I would be producing on the fly throughout the year. (Memrise, in contrast, really needs clearly defined units packaged in advance).
I teach 9th and 10th grade English at an above-average suburban American public high school in a below-average state. Mine are the lower "required level" students at a school with high enrollment in honors and Advanced Placement classes. Generally speaking, this means my students are mostly not self-motivated, are only very weakly motivated by grades, and will not do anything school-related outside of class no matter how much it would be in their interest to do so. There are, of course, plenty of exceptions, and my students span an extremely wide range of ability and apathy levels.
First, what I did not do. I did not make Anki decks, assign them to my students to study independently, and then quiz them on the content. With honors classes I taught in previous years I think that might have worked, but I know my current students too well. Only about 10% of them would have done it, and the rest would have blamed me for their failing grades—with some justification, in my opinion.
Instead, we did Anki together, as a class, nearly every day.
As initial setup, I created a separate Anki profile for each class period. With a third-party add-on for Anki called Zoom, I enlarged the display font sizes to be clearly legible on the interactive whiteboard at the front of my room.
Nightly, I wrote up cards to reinforce new material and integrated them into the deck in time for the next day's classes. This averaged about 7 new cards per lesson period. These cards came in many varieties, but the three main types were:
- concepts and terms, often with reversed companion cards, sometimes supplemented with "what is this an example of" scenario cards.
- vocabulary, 3 cards per word: word/def, reverse, and fill-in-the-blank example sentence
- grammar, usually in the form of "What change(s), if any, does this sentence need?" Alternative cards had different permutations of the sentence.
Weekly, I updated the deck to the cloud for self-motivated students wishing to study on their own.
Daily, I led each class in an Anki review of new and due cards for an average of 8 minutes per study day, usually as our first activity, at a rate of about 3.5 cards per minute. As each card appeared on the interactive whiteboard, I would read it out loud while students willing to share the answer raised their hands. Depending on the card, I might offer additional time to think before calling on someone to answer. Depending on their answer, and my impressions of the class as a whole, I might elaborate or offer some reminders, mnemonics, etc. I would then quickly poll the class on how they felt about the card by having them show a color by way of a small piece of card-stock divided into green, red, yellow, and white quadrants. Based on my own judgment (informed only partly by the poll), I would choose and press a response button in Anki, determining when we should see that card again.
[Data shown is from one of my five classes. We didn't start using Anki until a couple weeks into the school year.]
8 minutes is a significant portion of a 55 minute class period, especially for a teacher like me who fills every one of those minutes. Something had to give. For me, I entirely cut some varieties of written vocab reinforcement, and reduced the time we spent playing the team-based vocab/term review game I wrote for our interactive whiteboards some years ago. To a lesser extent, I also cut back on some oral reading comprehension spot-checks that accompany my whole-class reading sessions. On balance, I think Anki was a much better way to spend the time, but it's complicated. Keep reading.
Whole-class SRS not ideal
Every student is different, and would get the most out of having a personal Anki profile determine when they should see each card. Also, most individuals could study many more cards per minute on their own than we averaged doing it together. (To be fair, a small handful of my students did use the software independently, judging from Ankiweb download stats.)
Getting student buy-in
Before we started using SRS I tried to sell my students on it with a heartfelt, over-prepared 20 minute presentation on how it works and the superpowers to be gained from it. It might have been a waste of time. It might have changed someone's life. Hard to say.
As for the daily class review, I induced engagement partly through participation points that were part of the final semester grade, and which students knew I tracked closely. Raising a hand could earn a kind of bonus currency, but was never required—unlike looking up front and showing colors during polls, which I insisted on. When I thought students were just reflexively holding up the same color and zoning out, I would sometimes spot check them on the last card we did and penalize them if warranted.
But because I know my students are not strongly motivated by grades, I think the most important influence was my attitude. I made it a point to really turn up the charm during review and play the part of the engaging game show host. Positive feedback. Coaxing out the lurkers. Keeping that energy up. Being ready to kill and joke about bad cards. Reminding classes how awesome they did on tests and assignments because they knew their Anki stuff.
(This is a good time to point out that the average review time per class period stabilized at about 8 minutes because I tried to end reviews before student engagement tapered off too much, which typically started happening at around the 6-7 minute mark. Occasional short end-of-class reviews mostly account for the difference.)
I also got my students more on the Anki bandwagon by showing them how this was directly linked to reduced note-taking requirements. If I could trust that they would remember something through Anki alone, why waste time waiting for them to write it down? They were unlikely to study from those notes anyway. And if they aren't looking down at their paper, they'll be paying more attention to me. I better come up with more cool things to tell them!
Everything I had read about spaced repetition suggested it was a great reinforcement tool but not a good way to introduce new material. With that in mind, I tried hard to find or create memorable images, examples, mnemonics, and anecdotes that my Anki cards could become hooks for, and to get those cards into circulation as soon as possible. I even gave this method a mantra: "vivid memory, card ready".
When a student during review raised their hand, gave me a pained look, and said, "like that time when...." or "I can see that picture of..." as they struggled to remember, I knew I had done well. (And I would always wait a moment, because they would usually get it.)
Baby cards need immediate love
Unfortunately, if the card wasn't introduced quickly enough—within a day or two of the lesson—the entire memory often vanished and had to be recreated, killing the momentum of our review. This happened far too often—not because I didn't write the card soon enough (I stayed really on top of that), but because it didn't always come up for study soon enough. There were a few reasons for this:
- We often had too many due cards to get through in one session, and by default Anki puts new cards behind due ones.
- By default, Anki only introduces 20 new cards in one session (I soon uncapped this).
- Some cards were in categories that I gave lower priority to.
Two obvious cures for this problem:
- Make fewer cards. (I did get more selective as the year went on.)
- Have all cards prepped ahead of time and introduce new ones at the end of the class period they go with. (For practical reasons, not the least of which was the fact that I didn't always know what cards I was making until after the lesson, I did not do this. I might be able to next year.)
Days off suck
SRS is meant to be used every day. When you take weekends off, you get a backlog of due cards. Not only do my students take every weekend and major holiday off (slackers), they have a few 1-2 week vacations built into the calendar. Coming back from a week's vacation means a 9-day backlog (due to the weekends bookending it). There's no good workaround for students that won't study on their own. The best I could do was run longer or multiple Anki sessions on return days to try to catch up with the backlog. It wasn't enough. The "caught up" condition was not normal for most classes at most points during the year, but rather something to aspire to and occasionally applaud ourselves for reaching. Some cards spent weeks or months on the bottom of the stack. Memories died. Baby cards emerged stillborn. Learning was lost.
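The 9-day figure above is just a count of the calendar days with no review between the last class before a break and the first class after it. A minimal Python sketch of that arithmetic follows; the dates and the `missed_days` helper are hypothetical, chosen only to illustrate the count.

```python
from datetime import date

def missed_days(last_review: date, next_review: date) -> int:
    """Calendar days strictly between two review sessions (days with no review)."""
    return max((next_review - last_review).days - 1, 0)

# Hypothetical calendar: last class on a Friday, back on the Monday
# after a one-week vacation.
last_class = date(2014, 3, 14)   # a Friday
return_day = date(2014, 3, 24)   # the Monday ten days later

week_break = missed_days(last_class, return_day)
# An ordinary weekend, for comparison: Friday to the next Monday.
weekend = missed_days(date(2014, 3, 14), date(2014, 3, 17))

print(f"one-week vacation: {week_break} days without review")
print(f"ordinary weekend:  {weekend} days without review")
```

The five weekdays of the break plus the two weekends around it account for the nine days.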
Needless to say, the last weeks of the school year also had a certain silliness to them. When the class will never see the card again, it doesn't matter whether I push the button that says 11 days or the one that says 8 months. (So I reduced polling and accelerated our cards/minute rate.)
Never before SRS did I fully appreciate the loss of learning that must happen every summer break.
I kept each course's master deck divided into a few large subdecks. This was initially for organizational reasons, but I eventually started using it as a prioritizing tool. This happened after a curse-worthy discovery: if you tell Anki to review a deck made from subdecks, due cards from subdecks higher up in the stack are shown before cards from decks listed below, no matter how overdue they might be. From that point, on days when we were backlogged (most days) I would specifically review the concept/terminology subdeck for the current semester before any other subdecks, as these were my highest priority.
On a couple of occasions, I also used Anki's study deck tools to create temporary decks of especially high-priority cards.
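The subdeck behavior described above can be pictured with a toy model. This is only a sketch of the ordering as I observed it, not Anki's actual scheduler code; the `Card` class, the subdeck names, and the sample cards are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Card:
    front: str
    subdeck: str
    days_overdue: int

# Hypothetical subdeck order as listed in the deck, top to bottom.
SUBDECK_ORDER = ["grammar", "terms", "vocabulary"]

cards = [
    Card("define 'foil'", "terms", 12),
    Card("comma splice fix", "grammar", 1),
    Card("'ubiquitous'", "vocabulary", 30),
    Card("its vs. it's", "grammar", 0),
]

def by_subdeck(cards):
    """Observed order: subdeck position decides, however overdue a lower card is."""
    return sorted(cards, key=lambda c: SUBDECK_ORDER.index(c.subdeck))

def most_overdue_first(cards):
    """The order one might naively expect: most overdue cards first."""
    return sorted(cards, key=lambda c: c.days_overdue, reverse=True)

def prioritized(cards, first):
    """Workaround: review one chosen subdeck before all the others."""
    ordered = by_subdeck(cards)
    return ([c for c in ordered if c.subdeck == first]
            + [c for c in ordered if c.subdeck != first])

for label, order in [("by subdeck", by_subdeck(cards)),
                     ("most overdue first", most_overdue_first(cards)),
                     ("terms first", prioritized(cards, "terms"))]:
    print(label, "->", [c.front for c in order])
```

In this toy deck the vocabulary card that is 30 days overdue comes last under the subdeck ordering, which is the situation that pushed me to review the highest-priority subdeck on its own.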
Seizing those moments
Veteran teachers start acquiring a sense of when it might be a good time to go off book and teach something that isn't in the unit, and maybe not even in the curriculum. Maybe it's teaching exactly the right word to describe a vivid situation you're reading about, or maybe it's advice on what to do in a certain type of emergency that nearly happened. As the year progressed, I found myself humoring my instincts more often because of a new confidence that I can turn an impressionable moment into a strong memory and lock it down with a new Anki card. I don't even care if it will ever be on a test. This insight has me questioning a great deal of what I thought I knew about organizing a curriculum. And I like it.
A lifeline for low performers
An accidental discovery came from having written some cards that were, it was immediately obvious to me, much too easy. I was embarrassed to even be reading them out loud. Then I saw which hands were coming up.
In any class you'll get some small number of extremely low performers who never seem to be doing anything that we're doing, and, when confronted, deny that they have any ability whatsoever. Some of the hands I was seeing were attached to these students. And you better believe I called on them.
It turns out that easy cards are really important because they can give wins to students who desperately need them. Knowing a 6th grade level card in a 10th grade class is no great achievement, of course, but the action takes what had been negative morale and nudges it upward. And it can trend. I can build on it. A few of these students started making Anki the thing they did in class, even if they ignored everything else. I can confidently name one student I'm sure passed my class only because of Anki. Don't get me wrong: he just barely passed. Most cards remained over his head. Anki was no miracle cure here, but it gave him and me something to work with that we didn't have when he failed my class the year before.
A springboard for high achievers
It's not even fair. The lowest students got something important out of Anki, but the highest achievers drank it up and used it for rocket fuel. When people ask who's widening the achievement gap, I guess I get to raise my hand now.
I refuse to feel bad for this. Smart kids are badly underserved in American public schools thanks to policies that encourage staff to focus on that slice of students near (but not at) the bottom—the ones who might just barely be able to pass the state test, given enough attention.
Where my bright students might have been used to high Bs and low As on tests, they were now breaking my scales. You could see it in the multiple choice, but it was most obvious in their writing: they were skillfully working in terminology at an unprecedented rate, and making way more attempts to use new vocabulary—attempts that were, for the most part, successful.
Given the seemingly objective nature of Anki it might seem counterintuitive that the benefits would be more obvious in writing than in multiple choice, but it actually makes sense when I consider that even without SRS these students probably would have known the terms and the vocab well enough to get multiple choice questions right, but might have lacked the confidence to use them on their own initiative. Anki gave them that extra confidence.
A wash for the apathetic middle?
I'm confident that about a third of my students got very little out of our Anki review. They were either really good at faking involvement while they zoned out, or didn't even try to pretend and just took the hit to their participation grade day after day, no matter what I did or who I contacted.
These weren't even necessarily failing students—just the apathetic middle that's smart enough to remember some fraction of what they hear and regurgitate some fraction of that at the appropriate times. Review of any kind holds no interest for them. It's a rerun. They don't really know the material, but they tell themselves that they do, and they don't care if they're wrong.
On the one hand, these students are no worse off with Anki than they would have been with the activities it replaced, and nobody cries when average kids get average grades. On the other hand, I'm not ok with this... but so far I don't like any of my ideas for what to do about it.
Putting up numbers: a case study
For unplanned reasons, I taught a unit at the start of a quarter that I didn't formally test them on until the end of said quarter. Historically, this would have been a disaster. In this case, it worked out well. For five weeks, Anki was the only ongoing exposure they were getting to that unit, but it proved to be enough. Because I had given the same test as a pre-test early in the unit, I have some numbers to back it up. The test was all multiple choice, with two sections: the first was on general terminology and concepts related to the unit. The second was a much harder reading comprehension section.
As expected, scores did not go up much on the reading comprehension section. Overall reading levels are very difficult to boost in the short term and I would not expect any one unit or quarter to make a significant difference. The average score there rose by 4 percentage points, from 48 to 52%.
Scores in the terminology and concept section were more encouraging. For material we had not covered until after the pre-test, the average score rose by 22 percentage points, from 53 to 75%. No surprise there either, though; it's hard to say how much credit we should give to SRS for that.
But there were also a number of questions about material we had already covered before the pretest. Being the earliest material, I might have expected some degradation in performance on the second test. Instead, the already strong average score in that section rose by an additional 3 percentage points, from 82 to 85%. (These numbers are less reliable because of the smaller number of questions, but they tell me Anki at least "locked in" the older knowledge, and may have strengthened it.)
Some other time, I might try reserving a section of content that I teach before the pre-test but don't make any Anki cards for. This would give me a way to compare Anki to an alternative review exercise.
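For anyone who wants the three comparisons side by side, the short Python sketch below recomputes the gains from the averages reported above. The section labels and the `point_gain` helper are just names chosen for illustration; the figures are the ones given in this section.

```python
# Average scores in percent: (pre-test, end-of-quarter test).
scores = {
    "reading comprehension": (48, 52),
    "terminology, covered after the pre-test": (53, 75),
    "terminology, covered before the pre-test": (82, 85),
}

def point_gain(pre: int, post: int) -> int:
    """Gain in percentage points (a difference, not a percent change)."""
    return post - pre

gains = {section: point_gain(pre, post) for section, (pre, post) in scores.items()}

for section, gain in gains.items():
    pre, post = scores[section]
    print(f"{section}: {pre}% -> {post}% (+{gain} points)")
```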
What about formal standardized tests?
I don't know yet. The scores aren't back. I'll probably be shown some "value added" analysis numbers at some point that tell me whether my students beat expectations, but I don't know how much that will tell me. My students were consistently beating expectations before Anki, and the state gave an entirely different test this year because of legislative changes. I'll go back and revise this paragraph if I learn anything useful.
If I'm trying to acquire a new skill, one of the first things I try to do is listen to skilled practitioners of that skill talk about it to each other. What are the terms-of-art? How do they use them? What does this tell me about how they see their craft? Their shorthand is a treasure trove of crystallized concepts; once I can use it the same way they do, I find I'm working at a level of abstraction much closer to theirs.
Similarly, I was hoping Anki could help make my students more fluent in the subject-specific lexicon that helps you score well in analytical essays. After introducing a new term and making the Anki card for it, I made extra efforts to use it conversationally. I used to shy away from that because so many students would have forgotten it immediately and tuned me out for not making any sense. Not this year. Once we'd seen the card, I used the term freely, with only the occasional reminder of what it meant. I started using multiple terms in the same sentence. I started talking about writing and analysis the way my fellow experts do, and so invited them into that world.
Even though I was already seeing written evidence that some of my high performers had assimilated the lexicon, the high quality discussions of these same students caught me off guard. You see, I usually dread whole-class discussions with non-honors classes because good comments are so rare that I end up dejectedly spouting all the insights I had hoped they could find. But by the end of the year, my students had stepped up.
I think what happened here was, as with the writing, as much a boost in confidence as a boost in fluency. Whatever it was, they got into some good discussions where they used the terminology and built on it to say smarter stuff.
Don't get me wrong. Most of my students never got to that point. But on average even small groups without smart kids had a noticeably higher level of discourse than I am used to hearing when I break up the class for smaller discussions.
SRS is inherently weak when it comes to the abstract and complex. No card I've devised enables a student to develop a distinctive authorial voice, or write essay openings that reveal just enough to make the reader curious. Yes, you can make cards about strategies for this sort of thing, but these were consistently my worst cards—the overly difficult "leeches" that I eventually suspended from my decks.
A less obvious limitation of SRS is that students with a very strong grasp of a concept often fail to apply that knowledge in more authentic situations. For instance, they may know perfectly well the difference between "there", "their", and "they're", but never pause to think carefully about whether they're using the right one in a sentence. I am very open to suggestions about how I might train my students' autonomous "System 1" brains to have "interrupts" for that sort of thing... or even just a reflex to go back and check after finishing a draft.
I absolutely intend to continue using SRS in the classroom. Here's what I intend to do differently this coming school year:
- Reduce the number of cards by about 20%, to maybe 850-950 for the year in a given course, mostly by reducing the number of variations on some overexposed concepts.
- Be more willing to add extra Anki study sessions to stay better caught-up with the deck, even if this means my lesson content doesn't line up with class periods as neatly.
- Be more willing to press the red button on cards we need to re-learn. I think I was too hesitant here because we were rarely caught up as it was.
- Rework underperforming cards to be simpler and more fun.
- Use more simple cloze deletion cards. I only had a few of these, but they worked better than I expected for structured idea sets like, "characteristics of a tragic hero".
- Take a less linear and more opportunistic approach to introducing terms and concepts.
- Allow for more impromptu discussions where we bring up older concepts in relevant situations and build on them.
- Shape more of my lessons around the "vivid memory, card ready" philosophy.
- Continue to reduce needless student note-taking.
- Keep a close eye on 10th grade students who had me for 9th grade last year. I wonder how much they retained over the summer, and I can't wait to see what a second year of SRS will do for them.
Suggestions and comments very welcome!
In honor of System Administrator Appreciation Day, this is a post to thank Trike Apps for creating & maintaining Less Wrong. A lot of the time when they are mentioned on Less Wrong, it is to complain about bugs or request new features. So this is the time of year: thanks for everything that continues to silently go right!
I enjoy teaching, and I'd like to do my bit for the Less Wrong community. I've tutored a few people on the #lesswrong IRC channel in freenode without causing permanent brain damage. Hence I'm extending my offer of free tutoring from #lesswrong to lesswrong.com.
I offer tutoring in the following programming languages:
I offer tutoring in the following areas of mathematics:
- Elementary Algebra
- Linear Algebra
- Abstract Algebra
- Category Theory
- Probability Theory
- Computational Complexity
If you're interested, contact me. Contact details below:
Apptimize is a 2-year-old startup closely connected with the rationalist community, one of the first founded by CFAR alumni. We make “lean” possible for mobile apps -- our software lets mobile developers update or A/B test their apps in minutes, without submitting to the App Store. Our customers include big companies such as Nook and eBay, as well as Top 10 apps such as Flipagram. When companies evaluate our product against competitors, they’ve chosen us every time.
We work incredibly hard, and we’re striving to build the strongest engineering team in the Bay Area. If you’re a good developer, we have a lot to offer.
Our team of 14 includes 7 MIT alumni, 3 ex-Googlers, 1 Wharton MBA, 1 CMU CS alum, 1 Stanford alum, 2 MIT Masters, 1 MIT Ph.D. candidate, and 1 “20 Under 20” Thiel Fellow. Our CEO was also just named to the Forbes “30 Under 30”
David Salamon, Anna Salamon’s brother, built much of our early product
HP:MoR is required reading for the entire company
We evaluate candidates on curiosity even before evaluating them technically
Seriously, our team is badass. Just look
You will have huge autonomy and ownership over your part of the product. You can set up new infrastructure and tools, expense business products and services, and even subcontract some of your tasks if you think it's a good idea
You will learn to be a more goal-driven agent, and understand the impact of everything you do on the rest of the business
Access to our library of over 50 books and audiobooks, and the freedom to purchase more
Everyone shares insights they’ve had every week
Self-improvement is so important to us that we only hire people committed to it. When we say that it’s a company value, we mean it
Our mobile engineers dive into the dark, undocumented corners of iOS and Android, while our backend crunches data from billions of requests per day
Engineers get giant monitors, a top-of-the-line MacBook Pro, and we’ll pay for whatever else is needed to get the job done
We don’t demand prior experience, but we do demand the fearlessness to jump outside your comfort zone and job description. That said, our website uses AngularJS, jQuery, and nginx, while our backend uses AWS, Java (the good parts), and PostgreSQL
We don’t have gratuitous perks, but we have what counts: Free snacks and catered meals, an excellent health and dental plan, and free membership to a gym across the street
Seriously, working here is awesome. As one engineer puts it, “we’re like a family bent on taking over the world”
If you’re interested, send some Bayesian evidence that you’re a good match to email@example.com
Spurred by discussion of whether Luke's Q&A session should be on video or text-only, I volunteered to transcribe Eliezer's Q&A videos from January 2010. I finished last night, much earlier than my estimate, mostly due to feeling motivated to finish it and spending more time on it than my very conservative estimated 30 minutes a day (estimate of number of words was pretty close; about 16000). I have posted a link to this post as a comment in the original thread here, if you would like to upvote that.
Some advice for transcribing videos: I downloaded the .wmv videos, which allowed me to use VLC's global hotkeys to create a pause and "short skip backwards and forwards" buttons (ctrl-space and ctrl-shift left/right arrow), which were so much more convenient than any other method I tried.
Edited out: repetition of the question, “um/uh”, “you know,” false starts.
Punctuation, capitalization, and structure, etc may not be entirely consistent.
Keep in mind the opinions expressed here are those of Eliezer circa January 2010.
1. What is your information diet like? Do you control it deliberately (do you have a method; is it, er, intelligently designed), or do you just let it happen naturally.
By that I mean things like: Do you have a reading schedule (x number of hours daily, etc)? Do you follow the news, or try to avoid information with a short shelf-life? Do you frequently stop yourself from doing things that you enjoy (f.ex reading certain magazines, books, watching films, etc) to focus on what is more important? etc.
It’s not very planned, most of the time, in other words Hacker News, Reddit, Marginal Revolution, other random stuff found on the internet. In order to learn something, I usually have to set aside blocks of time and blocks of effort and just focus on specifically reading something. It’s only sort of popular level books which I can put on a restroom shelf and get them read that way. In order to learn actually useful information I generally find that I have to set aside blocks of time or run across a pot of gold, and you’re about as likely to get a pot of gold from Hacker News as anywhere else really. So not very controlled.
2. Your "Bookshelf" page is 10 years old (and contains a warning sign saying it is obsolete): http://yudkowsky.net/obsolete/bookshelf.html
Could you tell us about some of the books and papers that you've been reading lately? I'm particularly interested in books that you've read since 1999 that you would consider to be of the highest quality and/or importance (fiction or not).
I guess I’m a bit ashamed of how little I’ve been reading whole books and how much I’ve been reading small bite-sized pieces on the internet recently. Right now I’m reading Predictably Irrational which is a popular book by Dan Ariely about biases, it’s pretty good, sort of like a sequence of Less Wrong posts. I’ve recently finished reading Good and Real by Gary Drescher, which is something I kept on picking up and putting down, which is very Lesswrongian, it’s master level Reductionism and the degree of overlap was incredible enough that I would read something and say ‘OK I should write this up on my own before I read how Drescher wrote it so that you can get sort of independent views of it and see how they compare.’
Let’s see, other things I’ve read recently. I’ve fallen into the black hole of Fanfiction.net, well actually fallen into a black hole is probably too extreme. It’s got a lot of reading and the reading’s broken up into nice block size chapters and I’ve yet to exhaust the recommendations of the good stuff, but probably not all that much reading there, relatively speaking.
I guess it really has been quite a while since I picked up a good old-fashioned book and said ‘Wow, what an amazing book’. My memory is just returning the best hits of the last 10 years instead of the best hits of the last six months or anything like that. If we expand it out to the best hits of the last 10 years then Artificial Intelligence: A Modern Approach by Russell and Norvig is a really wonderful artificial intelligence textbook. It was on reading through that that I sort of got the epiphany of artificial intelligence really has made a lot more progress than people credit for, it’s just not really well organized, so you need someone with good taste to go through and tell you what’s been done before you recognize what has been done.
There was a book on statistical inference, I’m trying to remember the exact title, it’s by Hastie and Tibshirani, Elements of Statistical Learning, that was it. Elements of Statistical Learning was when I realized that the top people, they really do understand their subject, the people who wrote the Elements of Statistical Learning, they really understand statistics. At the same time you read through and say ‘Gosh, by comparison with these people, the average statistician, to say nothing of the average scientist who’s just using statistics, doesn’t really understand statistics at all.’
Let’s see, other really great... Yeah my memory just doesn’t really associate all that well I’m afraid, it doesn’t sort of snap back and cough up a list of the best things I’ve read recently. This would probably be something better for me to answer in text than in video I’m afraid.
3. What is a typical EY workday like? How many hours/day on average are devoted to FAI research, and how many to other things, and what are the other major activities that you devote your time to?
I’m not really sure I have anything I could call a ‘typical’ workday. Akrasia, weakness of will, that has always been what I consider to be my Great Bugaboo, and I still do feel guilty about the amount of rest time and downtime that I require to get work done, and even so I sometimes suspect that I’m taking too little downtime relative to work time just because on those occasions when something or other prevents me from getting work done, for a couple of days, I come back and I’m suddenly much more productive. In general, I feel like I’m stupid with respect to organizing my work day, that sort of problem, it used to feel to me like it was chaotic and unpredictable, but I now recognize that when something looks chaotic and unpredictable, that means that you are stupid with respect to that domain.
So it’ll probably look like, when I manage to get a work session in the work session will be a couple of hours, I’ll sometimes when I run into a difficult problem I’ll sometimes stop and go off and read things on the internet for a few minutes or a lot of minutes, until I can come back and solve the problem or my brain is rested enough to go to the more tiring high levels of abstraction where I can actually understand what it is that’s been blocking me and move on. That’s for writing, which I’ve been doing a lot of lately.
A typical workday when I’m actually working on Friendly AI with Marcello, that’ll look like we get together and sit down and open up a notebook and stare at our notebooks and throw ideas back and forth and sometimes sit in silence and think about things, write things down, I’ll propose things, Marcello will point out flaws in them or vice versa, sort of reach the end of a line of thought, go blank, stop and stare at each other and try to think of another line of thought, keep that up for two to three hours, break for lunch, keep it up for another two to three hours, and then break for a day, could spend the off day just recovering or reading math if possible or otherwise just recovering. Marcello doesn’t need as much recovery time, but I also suspect that Marcello, because he’s still sort of relatively inexperienced isn’t quite confronting the most difficult parts of the problem as directly.
So taking a one-day-on one-day-off, with respect to Friendly AI I actually don’t feel guilty about it at all, because it really is apparent that I just cannot work two days in a row on this problem and be productive. It’s just really obvious, and so instead of the usual cycle of ‘Am I working enough? Could I be working harder?’ and feeling guilty about it it’s just obvious that in that case after I get a solid day’s work I have to take a solid day off.
Let’s see, any other sorts of working cycles? Back when I was doing the Overcoming Bias/Less Wrong arc at one post per day, I would sometimes get more than one post per day in and that’s how I’d occasionally get a day off, other times a post would take more than one day. I find that I am usually relatively less productive in the morning; a lot of advice says ‘as soon as you get up in the morning, sit down, start working, get things done’; that’s never quite worked out for me, and of course that could just be because I’m doing it wrong, but even so I find that I tend to be more productive later in the day.
Let’s see, other info... Oh yes, at one point I tried to set up my computer to have a separate login without any of the usual distractions, and that caused my productivity to drop down because it meant that when I needed to take some time off, instead of browsing around the internet and then going right back to working, I’d actually separated work and so it was harder to switch back and forth between them both, so that was something that seemed like it was a really good idea that ought to work in theory, setting aside this sort of separate space with no distractions to work, and that failed.
And right now I’m working sort of on the preliminaries for the book, The Art of Rationality being the working title, and I haven’t started writing the book yet, I’m still sort of trying to understand what it is that I’ve previously written on Less Wrong, Overcoming Bias, organize it using mind mapping software from FreeMind which is open source mind mapping software; it’s really something I wish I’d known existed and started using back when the whole Overcoming Bias/Less Wrong thing started, I think it might have been a help.
So right now I’m just still sort of trying to understand what did I actually say, what’s the point, how do the points relate to each other, and thereby organizing the skeleton of the book, rather than writing it just yet, and the reason I’m doing it that way is that when it comes to writing things like books where I don’t push out a post every day I tend to be very slow, unacceptably slow even, and so one method of solving that was to write a post every day, and this time I’m seeing if I can, by planning everything out sufficiently thoroughly in advance and structuring it sufficiently thoroughly in advance, get it done at a reasonable clip.
4. Could you please tell us a little about your brain? For example, what is your IQ, at what age did you learn calculus, do you use cognitive enhancing drugs or brain fitness programs, are you Neurotypical and why didn't you attend school?
So the question is ‘please tell us a little about your brain.’ What’s your IQ? Tested as 143, that would have been back when I was... 12? 13? Not really sure exactly. I tend to interpret that as ‘this is about as high as the IQ test measures’ rather than ‘you are three standard deviations above the mean’. I’ve scored higher than that on(?) other standardized tests; the largest I’ve actually seen written down was 99.9998th percentile, but that was not really all that well standardized because I was taking the test and being scored as though for the grade above mine and so it was being scored for grade rather than by age, so I don’t know whether or not that means that people who didn’t advance through grades tend to get the highest scores and so I was competing well against people who were older than me, or whether if the really smart people all advanced farther through the grades and so the proper competition doesn’t really get sorted out, but in any case that’s the highest percentile I’ve seen written down.
‘At what age did I learn calculus’, well it would have been before 15, probably 13 would be my guess. I’ll also state just how stunned I am at how poorly calculus is taught.
Do I use cognitive enhancing drugs or brain fitness programs? No. I’ve always been very reluctant to try tampering with the neurochemistry of my brain because I just don’t seem to react to things typically; as a kid I was given Ritalin and Prozac and neither of those seemed to help at all and the Prozac in particular seemed to blur everything out and you just instinctively(?) just... eugh.
One of the questions over here is ‘are you neurotypical’. And my sort of instinctive reaction to that is ‘Hah!’ And for that reason I’m reluctant to tamper with things. Similarly with the brain fitness programs, don’t really know which one of those work and which don’t, I’m sort of waiting for other people in the Less Wrong community to experiment with that sort of thing and come back and tell the rest of us what works and if there’s any consensus between them, I might join the crowd.
‘Why didn’t you attend school?’ Well I attended grade school, but when I got out of grade school it was pretty clear that I just couldn’t handle the system; I don’t really know how else to put it. Part of that might have been that at the same time that I hit puberty my brain just sort of... I don’t really know how to describe it. Depression would be one word for it, sort of ‘spontaneous massive will failure’ might be another way to put it; it’s not that I was getting more pessimistic or anything, just that my will sort of failed and I couldn’t get stuff done. Sort of a long process to drag myself out of that and you could probably make a pretty good case that I’m still there, I just handle it a lot better? Not even really sure quite what I did right, as I said in an answer to a previous question, this is something I’ve been struggling with for a while and part of having a poor grasp on something is that even when you do something right you don’t understand afterwards quite what it is that you did right.
So... ‘tell us about your brain’. I get the impression that it’s got a different balance of abilities; like, some neurons got allocated to different areas, other areas got shortchanged, some areas got some extra neurons, other areas got shortchanged, the hypothesis has occurred to me lately that my writing is attracting other people with similar problems because of the extent to which one has noticed a sort of similar tendency to fall on the lines of very reflective, very analytic and has mysterious trouble executing and getting things done and working at sustained regular output for long periods of time, among the people who like my stuff.
On the whole though, I never actually got around to getting an MRI scan; it’s probably a good thing to do one of these days, but this isn’t Japan where that sort of thing only costs 100 dollars, and getting it analyzed, you know they’re not just looking for some particular thing but just sort of looking at it and saying ‘Hmm, well what is this about your brain?’, well I’d have to find someone to do that too.
So, I’m not neurotypical... asking sort of ‘what else can you tell me about your brain’ is sort of ‘what else can you tell me about who you are apart from your thoughts’, and that’s a bit of a large question. I don’t try and whack on my brain because it doesn’t seem to react typically and I’m afraid of being in a sort of narrow local optimum where anything I do is going to knock it off the tip of the local peak, just because it works better than average and so that’s sort of what you would expect to find there.
5. During a panel discussion at the most recent Singularity Summit, Eliezer speculated that he might have ended up as a science fiction author, but then quickly added:
I have to remind myself that it's not what's the most fun to do, it's not even what you have talent to do, it's what you need to do that you ought to be doing.
Shortly thereafter, Peter Thiel expressed a wish that all the people currently working on string theory would shift their attention to AI or aging; no disagreement was heard from anyone present.
I would therefore like to ask Eliezer whether he in fact believes that the only two legitimate occupations for an intelligent person in our current world are (1) working directly on Singularity-related issues, and (2) making as much money as possible on Wall Street in order to donate all but minimal living expenses to SIAI/Methuselah/whatever.
How much of existing art and science would he have been willing to sacrifice so that those who created it could instead have been working on Friendly AI? If it be replied that the work of, say, Newton or Darwin was essential in getting us to our current perspective wherein we have a hope of intelligently tackling this problem, might the same not hold true in yet unknown ways for string theorists? And what of Michelangelo, Beethoven, and indeed science fiction? Aren't we allowed to have similar fun today? For a living, even?
So, first, why restrict it to intelligent people in today’s world? Why not everyone? And second... the reply to the essential intent of the question is yes, with a number of little details added. So for example, if you’re making money on Wall Street, I’m not sure you should be donating all but minimal living expenses because that may or may not be sustainable for you. And in particular if you’re, say, making 500,000 dollars a year and you’re keeping 50,000 dollars of that per year, which is totally not going to work in New York, probably, then it’s probably more effective to double your living expenses to 100,000 dollars per year and have the amount donated to the Singularity Institute go from 450,000 to 400,000 when you consider how much more likely that makes it that more people follow in your footsteps. That number is totally not realistic and not even close to the percentage of income donated versus spent on living expenses for present people working on Wall Street who are donors to the Singularity Institute. So considering at present that no one seems willing to do that, I wouldn’t even be asking that, but I would be asking for more people to make as much money as possible if they’re the sorts of people who can make a lot of money and can donate a substantial fraction, never mind all but minimal living expenses, to the Singularity Institute.
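The arithmetic in that example can be checked with a short sketch. The income and living-expense figures are the ones quoted in the answer; the variable names and the break-even calculation at the end are illustrative assumptions added here, not something stated in the talk.

```python
# Figures quoted in the answer (dollars per year).
income = 500_000
frugal_expenses = 50_000
sustainable_expenses = 100_000  # living expenses doubled

# Amount left over to donate under each lifestyle.
frugal_donation = income - frugal_expenses            # 450000
sustainable_donation = income - sustainable_expenses  # 400000

# Donations given up by choosing the more sustainable lifestyle.
shortfall = frugal_donation - sustainable_donation    # 50000

# Assumption (not from the talk): a "follower" is another donor who
# gives at the sustainable level. This is how many such followers it
# takes to make up the shortfall.
break_even_followers = shortfall / sustainable_donation  # 0.125

print(frugal_donation, sustainable_donation, shortfall, break_even_followers)
```

Under that assumption, the more sustainable lifestyle comes out ahead if it raises the expected number of followers by more than one eighth of a person.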
Comparative advantage is what money symbolizes; each of us able to specialize in doing what we do best, get a lot of experience doing it, and trade off with other people specialized at what they’re doing best with attendant economies of scale and large fixed capital installations as well, that’s what money symbolizes, sort of in idealistic reality, as it were; that’s what money would mean to someone who could look at human civilization and see what it was really doing. On the other hand, what money symbolizes emotionally in practice, is that it imposes market norms, instead of social norms. If you sort of look at how cooperative people are, they can actually get a lot less cooperative once you offer to pay them a dollar, because that means that instead of cooperating because it’s a social norm, they’re now accepting a dollar, and a dollar puts it in the realm of market norms, and they become much less altruistic.
So it’s sort of a sad fact about how things are set up that people look at the Singularity Institute and think ‘Isn’t there some way for me to donate something other than money?’ partially for the obvious reason and partially because their altruism isn’t really emotionally set up to integrate properly with their market norms. For me, money is reified time, reified labor. To me it seems that if you work for an hour on something and then donate the money, that’s more or less equivalent to donating the time, or should be, logically. We have very large bodies of experimental literature showing that the difference between even a dollar bill versus a token that’s going to be exchanged for a dollar bill at the end of the experiment can be very large, just because that token isn’t money. So there’s nothing dirty about money, and there’s nothing dirty about trying to make money so that you can donate it to a charitable cause; the question is ‘can you get your emotions to line up with reality in this case?’
Part of the question was sort of like ‘What of Michelangelo, Beethoven, and indeed science fiction? Aren’t we allowed to have similar fun today? For a living even?’
This is crunch time. This is crunch time for the entire human species. This is the hour before the final exam, we are trying to get as much studying done as possible, and it may be that you can’t make yourself feel that, for a decade, or 30 years on end or however long this crunch time lasts. But again, the reality is one thing, and the emotions are another. So it may be that you can’t make yourself feel that this is crunch time, for more than an hour at a time, or something along those lines. But relative to the broad sweep of human history, this is crunch time; and it’s crunch time not just for us, it’s crunch time for the intergalactic civilization whose existence depends on us. I think that if you’re actually just going to sort of confront it, rationally, full-on, then you can’t really justify trading off any part of that intergalactic civilization for any intrinsic thing that you could get nowadays, and at the same time it’s also true that there are very few people who can live like that, and I’m not one of them myself, so because trying to live with that would even rule out things like ordinary altruism; I hold open doors for little old ladies, because I find that I can’t live only as an altruist in theory; I need to commit sort of actual up-front deeds of altruism, or I stop working properly.
So having seen that intergalactic civilization depends on us, in one sense, all you can really do is try not to think about that, and in another sense though, if you spend your whole life creating art to inspire people to fight global warming, you’re taking that ‘forgetting about intergalactic civilization’ thing much too far. If you look over our present civilization, part of that sort of economic thinking that you’ve got to master as a rationalist is learning to think on the margins. On the margins, does our civilization need more art and less work on the singularity? I don’t think so. I think that the amount of effort that our civilization invests in defending itself against existential risks, and to be blunt, Friendly AI in particular is ludicrously low. Now if it became the sort of pop-fad cause and people were investing billions of dollars into it, all that money would go off a cliff and probably produce anti-science instead of science, because very few people are capable of working on a problem where they don’t find out immediately whether or not they were wrong, and it would just instantaneously go wrong and generate a lot of noise from people of high prestige who would just drown out the voices of sanity. So it wouldn’t be a nice thing if our civilization started devoting billions of dollars to Friendly AI research, because our civilization is not set up to do that sanely. But at the same time, the Singularity Institute exists, the Singularity Institute, now that Michael Vassar is running it, should be able to scale usefully; that includes actually being able to do interesting things with more money, now that Michael Vassar’s the president.
To say ‘No, on the margin, what human civilization, at this present time, needs to do is not put more money in the Singularity Institute, but rather do this thing that I happen to find fun’ not that I’m doing this and I’m going to professionally specialize in it and become good in it and sort of trade hours of doing this thing that I’m very good at for hours that go into the Singularity Institute via the medium of money, but rather ‘no, this thing that I happen to find fun and interesting is actually what our civilization needs most right now, not Friendly AI’, that’s not defensible; and, you know, these are all sort of dangerous things to think about possibly, but I think if you sort of look at that face-on, up-front, take it and stare at it, there’s no possible way the numbers could work out that way.
It might be helpful to visualize a Friendly Singularity so that the kid who was one year old at the time is now 15 years old and still has something like a 15 year old human psychology and they’re asking you ‘So here’s this grand, dramatic moment in history, not human history, but history, on which the whole future of the intergalactic civilization that we now know we will build; it hinged on this one moment, and you knew that was going to happen. What were you doing?’ and you say, ‘Well, I was creating art to inspire people to fight global warming.’ The kid says ‘What’s global warming?’
That’s what you get for not even taking into account at all the whole ‘crunch time, fate of the world depends on it, squeaking through by a hair if we do it at all, already played into a very poor position in terms of how much work has been done and how much work we need to do relative to the amount of work that needs to be done to destroy the world as opposed to saving it; how long we could have been working on this previously and how much trouble it’s been to still get started.’ When this is all over, it’s going to be difficult to explain to that kid, what in the hell the human species was thinking. It’s not going to be a baroque tale. It’s going to be a tale of sheer insanity. And you don’t want to be explaining yourself to that kid afterward as part of the insanity rather than the sort of small core of ‘realizing what’s going on and actually doing something about it that got it done.’
6. I know at one point you believed in staying celibate, and currently your main page mentions you are in a relationship. What is your current take on relationships, romance, and sex, how did your views develop, and how important are those things to you? (I'd love to know as much personal detail as you are comfortable sharing.)
This is not a topic on which I consider myself an expert, and so it shouldn’t be shocking to hear that I don’t have incredibly complicated and original theories about these issues. Let’s see, is there anything else to say about that... It’s asking ‘at one point I believed in staying celibate and currently your main page mentions you are in a relationship.’ So, it’s not that I believed in staying celibate as a matter of principle, but that I didn’t know where I could find a girl who would put up with me and the life that I intended to lead, and said as much, and then one woman, Erin, read about the page I’d put up to explain why I didn’t think any girl would put up with me and my life and said essentially ‘Pick me! Pick me!’ and it was getting pretty difficult to keep up with the celibate lifestyle by then so I said ‘Ok!’ And that’s how we got together, and if that sounds a bit odd to you, or like, ‘What!? What do you mean...?’ then... that’s why you’re not my girlfriend.
I really do think that in the end I’m not an expert; that might be as much as there is to say.
7. What's your advice for Less Wrong readers who want to help save the human race?
Find whatever you’re best at; if that thing that you’re best at is inventing new math[s] of artificial intelligence, then come work for the Singularity Institute. If the thing that you’re best at is investment banking, then work for Wall Street and transfer as much money as your mind and will permit to the Singularity Institute where [it] will be used by other people. And for a number of sort of intermediate cases, if you’re familiar with all the issues of AI and all the issues of rationality and you can write papers at a reasonable clip, and you’re willing to work for a not overwhelmingly high salary, then the Singularity Institute is, as I understand it, hoping to make a sort of push toward getting some things published in academia. I’m not going to be in charge of that, Michael Vassar and Anna Salamon would be in charge of that side of things. There’s an internship program whereby we provide you with room and board and you drop by for a month or whatever and see whether or not this is work you can do and how good you are at doing it.
Aside from that, though, I think that saving the human species eventually comes down to, metaphorically speaking, nine people and a brain in a box in a basement, and everything else feeds into that. Publishing papers in academia feeds into either attracting attention that gets funding, or attracting people who read about the topic, not necessarily reading the papers directly even but just sort of raising the profile of the issues where intelligent people who wonder what they can do with their lives think artificial intelligence instead of string theory. Hopefully not too many of them are thinking that because that would just generate noise, but the very most intelligent people... string theory is a marginal waste of the most intelligent people. Artificial intelligence and Friendly Artificial Intelligence, sort of developing precise, precision grade theories of artificial intelligence that you could actually use to actually build a Friendly AI instead of blowing up the world; the need for one more genius there is much greater than the need for one more genius in string theory. Most of us can’t work on that problem directly. I, in a sense, have been lucky enough not to have to confront a lot of the hard issues here, because of being lucky enough to be able to work on the problem directly, which simplifies my choice of careers.
For everyone else, I’ll just sort of repeat what I said in an earlier video about comparative advantage, professional specialization, doing what we do best at and practicing a lot; everyone doing that and trading with each other is the essence of economics, and the symbol of this is money, and it’s completely respectable to work hours doing what you’re best at, and then transfer the sort of expected utilons that a society assigns to that to the Singularity Institute, where it can pay someone else to work at it such that it’s an efficient trade, because the total amount of labor and effectiveness that they put into it that you can purchase is more than you could do by working an equivalent number of hours on the problem yourself. And as long as that’s the case, the economically rational thing to do is going to be to do what you’re best at and trade those hours to someone else, and let them do it. And there should probably be fewer people, one expects, who are working on the problem directly, full time; stuff just does not get done if you’re not working on it full time, that’s what I’ve discovered, anyway; I can’t even do more than one thing at a time. And that’s the way grown ups do it, essentially, that’s the way a grown up economy does it.
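The "efficient trade" condition described there can be written down as a small comparison. This is only a sketch under stated assumptions: the function name, its parameters, and every number in the usage lines are hypothetical and are not figures from the talk.

```python
def donating_buys_more(hours, my_wage, specialist_cost_per_hour,
                       my_relative_effectiveness):
    """Return True if earning and donating buys more work than doing it directly.

    All parameters are hypothetical illustrations:
      hours                      hours you have available to work
      my_wage                    what you earn per hour doing what you're best at
      specialist_cost_per_hour   what it costs to fund one hour of a specialist
      my_relative_effectiveness  your output per hour on the problem, as a
                                 fraction of a specialist's output per hour
    """
    # Specialist-hours that your earnings can purchase.
    purchased = hours * my_wage / specialist_cost_per_hour
    # Specialist-hour equivalents you would produce by working directly.
    direct = hours * my_relative_effectiveness
    return purchased > direct


# A high earner who would be half as effective as a specialist.
print(donating_buys_more(40, 100.0, 50.0, 0.5))
# A low earner who is as effective as a specialist.
print(donating_buys_more(40, 20.0, 50.0, 1.0))
```

The comparison only captures the trade-off as stated in the answer; it ignores things like taxes, motivation, and whether a suitable specialist is actually available to hire.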
8. Eliezer, first congratulations for having the intelligence and courage to voluntarily drop out of school at age 12! Was it hard to convince your parents to let you do it? AFAIK you are mostly self-taught. How did you accomplish this? Who guided you, did you have any tutor/mentor? Or did you just read/learn what was interesting and kept going for more, one field of knowledge opening pathways to the next one, etc...?
EDIT: Of course I would be interested in the details, like what books did you read when, and what further interests did they spark, etc... Tell us a little story. ;)
Well, amazingly enough, I’ve discovered the true, secret, amazing formula for teaching yourself and... I lie, I just winged it. Yeah, just read whatever interested me until age 15-16 thereabouts which is when I started to discover the Singularity as opposed to background low-grade Transhumanism that I’d been engaged with up until that point; started thinking that cognitive technologies, creating smarter than human level intelligence was the place to be and initially thought that neural engineering was going to be the sort of leading, critical path of that. Studied a bit of neuroscience and didn’t get into that too far before I started thinking that artificial intelligence was going to be the route; studied computer programming, studied a bit of business type stuff because at one point I thought I’d do a start up at something I’m very glad I didn’t end up doing, in order to get the money to do the AI thing, and I’m very glad that I didn’t go that route, and I won’t even say that the knowledge has served me all that good instead, it’s just not my comparative advantage.
At some point sort of woke up and smelled the Bayesian coffee and started studying probability theory and decision theory and statistics and that sort of thing, but really I haven’t had an opportunity to study anywhere near as much as I need to know. And part of that, I won’t apologize for because a lot of sort of fact memorization is more showing off than because you’re going to use that fact every single day; part of that I will apologize for because I feel that I don’t know enough to get the job done and that when I’m done writing the book I’m just going to have to take some more time off and just study some of the sort of math and mathematical technique that I expect to need in order to get this done. I come across as very intelligent, but a surprisingly small amount of that relies on me knowing lots of facts, or at least that’s the way it feels to me. So I come across as very intelligent, but that’s because I’m good at winging it, might be one way to put it. The road of the autodidact, I feel... I used to think that anyone could just go ahead and do it and that the only reason to go to college was for the reputational ‘now people can hire you’ aspect which sadly is very important in today’s world. Since then I’ve come to realize both that college is less valuable and less important than I used to think and also that autodidacticism might be a lot harder for the average person than I thought because the average person is less similar to myself than my sort of intuitions would have it.
‘How do you become an autodidact’; the question you would ask before that would be ‘what am I going to do, and is it something that’s going to rely on me having memorized lots of standard knowledge and worked out lots of standard homework problems, or is it going to be something else’, because if you’re heading for a job where you’re going to want to memorize lots of the same standardized facts as people around you, then autodidacticism might not be the best way to go. If you’re going to be a computer programmer, on the other hand, then [going] into a field where every day is a new adventure, and most jobs in computer programming will not require you to know the Nth detail of computer science, and even if they did, the fact that this is math means you might even have a better chance of learning it out of a book, and above all it’s a field where people have some notion that you’re allowed to teach yourself; if you’re good, other people can see it by looking at your code, and so there’s sort of a tradition of being willing to hire people who don’t have a Masters.
So I guess I can’t really give all that much advice about how to be a successful autodidact in terms of... studying hard, doing the same sort of thing you’d be doing in college only managing to do it on your own because you’re that self-disciplined, because that is completely not the route I took. I would rather advise you to think very hard about what it is you’re going to be doing, whether or not anyone will let you do it if you don’t have the official credential, and to what degree the road you’re going is going to depend on the sort of learning that you have found that you can get done on your own.
9. Is your pursuit of a theory of FAI similar to, say, Hutter's AIXI, which is intractable in practice but offers an interesting intuition pump for the implementers of AGI systems? Or do you intend on arriving at the actual blueprints for constructing such systems? I'm still not 100% certain of your goals at SIAI.
Definitely actual blueprint, but, on the way to an actual blueprint, you probably have to, as an intermediate step, construct intractable theories that tell you what you’re trying to do, and enable you to understand what’s going on when you’re trying to do something. If you want a precise, practical AI, you don’t get there by starting with an imprecise, impractical AI and going to a precise, practical AI. You start with a precise, impractical AI and go to a precise, practical AI. I probably should write that down somewhere else because it’s extremely important, and as(?) various people who will try to dispute it, and at the same time hopefully ought to be fairly obvious if you’re not motivated to arrive at a particular answer there. You don’t just run out and construct something imprecise because, yeah, sure, you’ll get some experimental observations out of that, but what are your experimental observations telling you? And one might say along the lines of ‘well, I won’t know that until I see it,’ and suppose that has been known to happen a certain number of times in history; just inventing the math has also happened a certain number of times in history.
We already have a very large body of experimental observations of various forms of imprecise AIs, both the domain specific types we have now, and the sort of imprecise AI constituted by human beings, and we already have a large body of experimental data, and eyeballing it... well, I’m not going to say it doesn’t help, but on the other hand, we already have this data and now there is this sort of math step in which we understand what exactly is going on; and then the further step of translating the math back into reality. It is the goal of the Singularity Institute to build a Friendly AI. That’s how the world gets saved, someone has to do it. A lot of people tend to think that this is going to require, like, a country’s worth of computing power or something like that, but that’s because the problem seems very difficult because they don’t understand it, so they imagine throwing something at it that seems very large and powerful and gives this big impression of force, which might be a country-size computing grid, or it might be a Manhattan Project where some computer scientists... but size matters not, as Yoda says.
What matters is understanding, and if the understanding is widespread enough, then someone is going to grab the understanding and use it to throw together the much simpler AI that does destroy the world, the one that’s built to much lower standards, so the model of ‘yes, you need the understanding, the understanding has to be concentrated within a group of people small enough that there is not one defector in the group who goes off and destroys the world, and then those people have to build an AI.’ If you condition on that the world got saved, and look back and within history, I expect that that is what happened in the majority of cases where a world anything like this one gets saved, and working back from there, they will have needed a precise theory, because otherwise they’re doomed. You can make mistakes and pull yourself up, even if you think you have a precise theory, but if you don’t have a precise theory then you’re completely doomed, or if you don’t think you have a precise theory then you’re completely doomed.
And working back from there, you probably find that there were people spending a lot of time doing math based on the experimental results that other people had sort of blundered out into the dark and gathered because it’s a lot easier to blunder out into the dark; more people can do it, lots more people have done it; it’s the math part that’s really difficult. So I expect that if you look further back in time, you see a small group of people who had honed their ability to understand things to a very high pitch, and then were working primarily on doing math and relying on either experimental data that other people had gathered by accident, or doing experiments where they have a very clear idea why they’re doing the experiment and what different results will tell them.
10. What was the story purpose and/or creative history behind the legalization and apparent general acceptance of non-consensual sex in the human society from Three Worlds Collide?
The notion that non-consensual sex is not illegal and appears to be socially accepted might seem a bit out of place in the story, as if it had been grafted on. This is correct. It was grafted on from a different story in which, for example, theft is while not so much legal, because they don’t have what would you call a strong, centralized government, but rather, say, theft is, in general, something you pull off by being clever rather than a horrible crime; but of course, you would never steal a book. I have yet to publish a really good story set in this world; most of them I haven’t finished, the one I have finished has other story problems. But if you were to see the story set in this world, then you would see that it develops out of a much more organic thing than say... dueling, theft, non-consensual sex; all of these things are governed by tradition rather than by law, and they certainly aren’t prohibited outright.
So why did I pick up that one aspect from that story and put it into Three Worlds Collide? Well, partially it was because I wanted that backpoint to introduce a culture clash between their future and our past, and that’s what came to mind, more or less, it was more something to test out to see what sort of reaction it got, to see if I could get away with putting it into this other story. Because one can’t use theft; Three Worlds Collide’s society actually does run on private property. One can’t use dueling; their medical technology isn’t advanced enough to make that trivial. But you can use non-consensual sex and try to explain sort of what happens in a society in which people are less afraid, and not afraid of the same things. They’re stronger than we are in some senses, they don’t need as much protection, the consequences aren’t the same consequences that we know, and the people there sort of generally have a higher grade of ethics and are less likely to abuse things. That’s what made that sort of particular culture clash feature a convenient thing to pick up from one story and graft onto another, but ultimately it was a graft, and any feelings of ‘why is that there?’ that you have, might make a bit more sense if you saw the other story, if I can ever repair the flaws in it, or manage to successfully complete and publish a story set in that world that actually puts the world on display.
11. If you were to disappear (freak meteorite accident), what would the impact on FAI research be?
Do you know other people who could continue your research, or that are showing similar potential and working on the same problems? Or would you estimate that it would be a significant setback for the field (possibly because it is a very small field to begin with)?
Marcello Herreshoff is the main person whom I’ve worked with on this, and Marcello doesn’t yet seem to be to the point where he could replace me, although he’s young so he could easily develop further in coming years and take over as the lead, or even, say, ‘Aha! Now I’ve got it! No more need for Eliezer Yudkowsky.’ That sort of thing would be very nice if it happened, but it’s not the sort of thing I would rely on.
So if I got hit by a meteor right now, what would happen is that Michael Vassar would take over responsibility for seeing the planet through to safety, and say ‘Yeah I’m personally just going to get this done, not going to rely on anyone else to do it for me, this is my problem, I have to handle it.’ And Marcello Herreshoff would be the one who would be tasked with recognizing another Eliezer Yudkowsky if one showed up and could take over the project, but at present I don’t know of any other person who could do that, or I’d be working with them. There’s not really much of a motive in a project like this one to have the project split into pieces; whoever can do work on it is likely to work on it together.
12. Your approach to AI seems to involve solving every issue perfectly (or very close to perfection). Do you see any future for more approximate, rough and ready approaches, or are these dangerous?
More approximate, rough and ready approaches might produce interesting data that math theorist types can learn something from even though the people who did it didn’t have that in mind. The thing is, though, there’s already a lot of people running out and doing that and really failing at AI, or even approximate successes at AI, result in much fewer sudden thunderbolts of enlightenment about the structure of intelligence than the people that are busily producing ad hoc AI programs because that’s easier to do and you can get a paper out of it and you get respect out of it and prestige and so on. So it’s a lot harder for that sort of work to result in sudden thunderbolts of enlightenment about the structure of intelligence than the people doing it would like to think, because that way it gives them an additional justification for doing the work. The basic answer to the question is ‘no’, or at least I don’t see a future for Singularity Institute funding, going as marginal effort, into sort of rough and ready ‘forages’ like that. It’s been done already. If we had more computer power and our AIs were more sophisticated, then the level of exploration that we’re doing right now would not be a good thing, as it is, it’s probably not a very dangerous thing because the AIs are weak more or less. It’s not something you would ever do with AI that was powerful enough to be dangerous. If you know what it is that you want to learn by running a program, you may go ahead and run it; if you’re just foraging out at random, well other people are doing that, and even then they probably won’t understand what their answers mean until you on your end, the sort of math structure of intelligence type people, understand what it means. And mostly the result of an awful lot of work in domain specific AIs tell us that we don’t understand something, and this can often be surprisingly easy to figure out, simply by querying your brain without being overconfident.
So, I think that at this point, what’s needed is math structure of intelligence type understanding, and not just any math, not just ‘Ooh, I’m going to make a bunch of Greek symbols and now I can publish a paper and everyone will be impressed by how hard it is to understand,’ but sort of very specific math, the sort that results in thunderbolts of enlightenment; the usual example I hold up is the Bayesian Network Causality insight as depicted in Judea Pearl’s Probabilistic Reasoning in Intelligent Systems and (later book of causality?). So if you sort of look at the total amount of papers that have been written with neat Greek symbols and things that are mathematically hard to understand and compare that to those Judea Pearl books I mentioned, though one should always mention this is the culmination of a lot of work not just by Judea Pearl; that will give you a notion of just how specific the math has to be.
In terms of solving every issue perfectly or very close to perfection, there’s kinds of perfection. As long as I know that any proof is valid, I might not know how long it takes to do a proof; if there’s something that does proof, then I may not know how long the algorithm takes to produce a proof but I may know that anything it claims is a proof is definitely a proof, so there’s different kinds of perfection and types of precision. But basically, yeah, if you want to build a recursively self-improving AI, have it go through a billion sequential self-modifications, become vastly smarter than you, and not die, you’ve got to work to a pretty precise standard.
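The distinction drawn above, between knowing that anything accepted as a proof is valid and not knowing how long it takes to find one, can be illustrated with a small Python sketch. This is only an analogy, not a theorem prover and not anything described in the interview: a satisfying assignment for a Boolean formula stands in for a "proof", and the function names and the example formula are assumptions made up for illustration.

```python
from itertools import product

def is_valid_certificate(clauses, assignment):
    """Sound checker: returns True only if every clause has a satisfied literal.

    A clause is a list of nonzero ints; literal k means variable k is True,
    literal -k means variable k is False.
    """
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

def search_for_certificate(clauses, num_vars):
    """Brute-force search. How long this runs depends on the formula,
    but anything it returns has already passed the checker."""
    for bits in product([False, True], repeat=num_vars):
        assignment = {i + 1: bit for i, bit in enumerate(bits)}
        if is_valid_certificate(clauses, assignment):
            return assignment
    return None  # no certificate exists

# Hypothetical example formula: (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
clauses = [[1, 2], [-1, 3], [-2, -3]]
found = search_for_certificate(clauses, num_vars=3)
print(found, is_valid_certificate(clauses, found))
```

The guarantee lives entirely in the checker: the search could be replaced by something faster or slower without weakening the claim that whatever comes out is valid.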
13. How young can children start being trained as rationalists? And what would the core syllabus / training regimen look like?
I am not an expert in the education of young children. One has these various ideas that one has written up on Less Wrong, and one could try to distill those ideas, popularize them, illustrate them through simpler and simpler stories and so take these ideas and push them down to a lower level, but in terms of sort of training basic thought skills, training children to be self-aware, to be reflective, getting them into the habit of reading and storing up lots of pieces of information, trying to get them more interested in being fair to both sides of an argument, the virtues of honest curiosity over rationalization, not in the way that I do it by sort of telling people and trying to lay out stories and parables that illustrate it and things like that, but if there’s some other way to do it with children, I’m not sure that my grasp of this concept of teaching rationality extends to before the young adult level. I believe that we had some sort of thread on Less Wrong about this, sort of recommended reading for young rationalists, I can’t quite remember.
Oh, but one thing that does strike me as being fairly important is that if this ever starts to happen on a larger scale and individual parents teaching individual children, the number one thing we want to do is test out different approaches and see which one works experimentally.
14. Could you elaborate a bit on your "infinite set atheism"? How do you feel about the set of natural numbers? What about its power set? What about that thing's power set, etc?
From the other direction, why aren't you an ultrafinitist?
The question is ‘can you elaborate on your infinite set atheism’, that’s where I say ‘I don’t believe in infinite sets because I’ve never seen one.’
So first of all, my infinite set atheism is a bit tongue-in-cheek. I mean, I’ve seen a whole lot of natural numbers, and I’ve seen that times tend to have successor times, and in my experience, at least, time doesn’t return to its starting point; as I understand current cosmology, the universe is due to keep on expanding, and not return to its starting point. So it’s entirely possible that I’m faced with certain elements that have successors where if the successors of two elements are the same then the two elements are the same, and in which there’s no cycle. So in that sense I might be forced to recognize the empirical existence of every member of what certainly looks like an infinite set. As for the question of whether this collection of infinitely many finite things constitutes an infinite thing that exists, it is an interesting metaphysical one, or it would be if we didn’t have the fact that even though by looking at time we can see that it looks like infinite things ought to exist, nonetheless, we’ve never encountered an infinite thing in person. We’ve never encountered a physical process that performs a supertask. If you look more at physics, you find that actually matters are even worse than this.
We’ve got real numbers down there, or at least if you postulate that it’s something other than real numbers underlying physics then you have to postulate something that looks continuous but isn’t continuous, and in this way, by Occam’s Razor, one might very easily suspect that the appearance of continuity arises from actual continuity, so that we have, say, an amplitude distribution, a neighborhood in configuration space, and the amplitude[s that] flows in configuration space are continuous, instead of having a discrete time with a discrete successor, we actually have a flow of time, so when you write the rules of causality, it’s not possible to write the rules of causality the way we write them for a Turing machine, you have to write the rules of causality as differential equations.
So these are the two main cases in which the universe defies infinite set atheism. The universe is handing me what looks like an infinite collection of things, namely times; the universe is handing me things that exist and are causes and the simplest explanation would have them being described by continuous differential equations, not by discrete ticks. So that’s the main sense in which my infinite set atheism is challenged by the universe’s actual presentation to me of things that look infinite. Aside from this, however, if you start trying to hand me paradoxes that are being produced by just assuming that you have an infinite thing in hand as an accomplished fact, an infinite thing of the sort where you can’t just present to me a physical example of it, you’re just assuming that that infinity exists, and then you’re generating paradoxes from it, well, we do have these nice mathematical rules for reasoning about infinities, but, rather than putting the blame on the person for having violated these elaborate mathematical rules that we develop to reason about infinities, I’m even more likely to cluck my tongue and say ‘But what good is it?’ Now it may be a tongue-in-cheek tongue cluck... I’m trying to figure out how to put this into words... Map that corresponds to the territory, if you can’t have infinities in your map, because your neurons, they fire discretely, and you only have a finite number of neurons in your head, so if you can’t have infinities in the map, what makes you think that you can make them correspond to infinities in the territory, especially if you’ve never actually seen that sort of infinity? And so the sort of math of the higher infinities, I tend to view as works of imaginative literature, like Lord of the Rings; they may be pretty, in the same way that Tolkien’s Middle Earth is pretty, but they don’t correspond to anything real until proven otherwise.
15. Why do you have a strong interest in anime, and how has it affected your thinking?
‘Well, as a matter of sheer, cold calculation I decided that...’
It’s anime! (laughs)
How has it affected my thinking? I suppose that you could view it as a continuity of reading dribs and drabs of westernized eastern philosophy from Gödel, Escher, Bach or Raymond Smullyan; concepts like ‘Tsuyoku Naritai’, ‘I want to become stronger’, are things that being exposed to the alternative eastern culture as found in anime might have helped me to develop concepts of. But on the whole... it’s anime! There’s not some kind of elaborate calculation behind it, and I can’t quite say that when I’m encountering a daily problem, I think to myself ‘How would Light Yagami solve this?’ If the point of studying a programming language is to change the way you think, then I’m not sure that studying anime has changed the way I think all that much.
16. What are your current techniques for balancing thinking and meta-thinking?
For example, trying to solve your current problem, versus trying to improve your problem-solving capabilities.
I tend to focus on thinking, and it’s only when my thinking gets stuck or I run into a particular problem that I will resort to meta-thinking, unless it’s a particular meta skill that I already have, in which case I’ll just execute it. For example, the meta skill of trying to focus on the original problem. In one sense, a whole chunk of Less Wrong is more or less my meta-thinking skills.
So I guess on reflection (ironic look), I would say that there’s a lot of routine meta-thinking that I already know how to do, and that I do without really thinking of it as meta-thinking. On the other hand, original meta-thinking, which is the time-consuming part, is something I tend to resort to only when my current meta-thinking skills have broken down. And that’s probably a reasonably exceptional circumstance even though it’s something of comparative advantage and so I expect to do a bit more of it than average. Even so, when I’m trying to work on an object-level problem at any given point, I’m probably not doing original meta-level questioning about how to execute these meta-level skills.
If I bog down in writing something I may execute my sort of existing meta-level skill of ‘try to step back and look at this from a more abstract level’, and if that fails, then I may have to sort of think about what kind of abstract levels can you view this problem on, similar problems as opposed to tasks, and in that sense go into original meta-level thinking mode. But one of those meta-level skills I would say is the notion that your meta-level problem comes from an object-level problem and you’re supposed to keep one eye on the object-level problem the whole time you’re working on the meta-level.
17. Could you give an up-to-date estimate of how soon non-Friendly general AI might be developed? With confidence intervals, and by type of originator (research, military, industry, unplanned evolution from non-general AI...)
We’re talking about this very odd sector of program space and programs that self-modify and wander around that space and sort of amble into a pot of gold that enables them to keep going and... I have no idea...
There are all sorts of different ways that it could happen, I don’t know which one of them are plausible or implausible or how hard or difficult they are relative to modern hardware or computer science. I have no idea what the odds are; I know they aren’t getting any better as time goes on or that is, the probabilities of Unfriendly AI are increasing over time. So if you were actually to make some kind of graph, then you’d see the probability rising over time as the odds got worse, and then the graph would slope down again as you entered into regions where it was more likely than not that Unfriendly AI had actually occurred before that; the slope would actually fall off faster as you went forward in time because the amount of probability mass has been drained away by Unfriendly AI happening now.
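The shape of the graph described above, rising while the odds get worse and then falling as the probability mass is drained away, can be reproduced with a short Python sketch. The numbers here are entirely made up for illustration and are not estimates from the interview; `hazards` is a hypothetical per-period chance that the event happens given that it has not happened yet, and it only ever increases.

```python
def arrival_density(hazards):
    """Turn conditional per-period probabilities into unconditional ones.

    hazards[t] is the chance the event happens in period t, given that it
    has not happened before t. The result gives, for each t, the chance
    that the event happens for the first time in period t.
    """
    not_yet = 1.0  # probability mass that has not been drained so far
    density = []
    for h in hazards:
        density.append(not_yet * h)
        not_yet *= 1.0 - h
    return density

# Made-up, steadily worsening odds: 2% in the first period, rising to 100%.
hazards = [min(1.0, 0.02 * (t + 1)) for t in range(50)]
density = arrival_density(hazards)

# Index of the period in which the event is most likely to first occur.
peak = max(range(len(density)), key=density.__getitem__)
print(peak, round(density[peak], 4))
```

Even though the conditional odds never improve, the unconditional curve peaks and then declines, because less and less probability remains for worlds in which the event has not already happened.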
‘By type of originator’ or something, I might have more luck answering. I would put academic research at the top of it, because academic research that actually can try blue sky things. Or... OK, first commercial, that wasn’t quite on the list, as in people doing startup-ish things, hedge funds, people trying to improve the internal AI systems that they’re using for something, or build weird new AIs to serve commercial needs; those are the people most likely to build AI ‘stews’(?) Then after that, academic research, because in academia you have a chance of trying blue sky things. And then military, because they can hire smart people and give the smart people lots of computing power and have a sense of always trying to be on the edge of things. Then industry, if that’s supposed to mean car factories and so on because... that actually strikes me as pretty unlikely; they’re just going to be trying to automate ordinary processes, that sort of thing, it’s generally unwise to sort of push the bounds of theoretical limits while you’re trying to do that sort of thing; you can count Google as industry, but that’s the sort of thing I had in mind when I was talking about commercial. Unplanned evolution from non-general AI [is] not really all that likely to happen. These things aren’t magic. If something can happen by itself spontaneously, it’s going to happen before that because humans are pushing on it.
As for confidence intervals... doing that just feels like pulling numbers out of thin air. I’m kind of reluctant to do it because of the extent to which I feel that, even to the extent that my brain has a grasp on this sort of thing, by making up probabilities and making up times, I’m not even translating the knowledge that I do have into reality, so much as pulling things out of thin air. And if you were to sort of ask ‘what sort of attitude do your revealed actions indicate?’ then I would say that my revealed actions don’t indicate that I expect to die tomorrow of Unfriendly AI, and my revealed actions don’t indicate that we can safely take until 2050. And that’s not even a probability estimate, that’s sort of looking at what I’m doing and trying to figure out what my brain thinks the probabilities are.
18. What progress have you made on FAI in the last five years and in the last year?
The last five years would take us back to the end of 2004, which is fairly close to the beginning of my Bayesian enlightenment, so the whole ‘coming to grips with the Bayesian structure of it all’, a lot of that would fall into the last five years. And if you were to ask me... the development of Timeless Decision Theory would be in the last five years. I’m trying to think if there’s anything else I can say about that. Getting a lot of clarification of what the problems were.
In the last year, I managed to get in a decent season of work with Marcello after I stopped regular posting to OBLW over the summer, before I started writing the book. That, there’s not much I can say about; there was something I suspected was going to be a problem and we tried to either solve the problem or at least nail down exactly what the problem was, and I think that we did a fairly good job of the latter; we now have a nice precise, formal explanation of what it is we want to do and why we can’t do it in the obvious way; we came up with sort of one hack for getting around it that’s a hack and doesn’t have all the properties that we want a real solution to have.
So, step one, figure out what the problem is, step two, understand the problem, and step three, solve the problem. Some degree of progress on step two but not finished with it, and we didn’t get to step three, but that’s not overwhelmingly discouraging. Most of the real progress that has been made when we sit down and actually work on the problem [are] things I’d rather not talk about and the main exception to that is Timeless Decision Theory which has been posted to Less Wrong.
19. How do you characterize the success of your attempt to create rationalists?
It’s a bit of an ambiguous question, and certainly an ongoing project. Recently, for example, I was in a room with a group of people with a problem of what Robin Hanson called a far-type and what I would call the type where it’s difficult because you don’t get immediate feedback when you say something stupid, and it really was clear who in that room was an ‘X-rationalist’ or ‘neo-rationalist’, or ‘Lesswrongian’ or ‘Lessiath’ and who was not. The main distinction was that the sort of non-X-rationalists were charging straight off and were trying to propose complicated policy solutions right off the bat, and the rationalists were actually holding off, trying to understand the problem, break it down into pieces, analyze the pieces modularly, and just that one distinction was huge; it was the difference between ‘these are the people who can make progress on the problem’ and ‘these are the people who can’t make progress on the problem’. So in that sense, once you hand these deep Lesswrongian types a difficult problem, the distinction between them and someone who has merely had a bunch of successful life experiences and so on is really obvious.
There’s a number of other interpretations that can be attached to the question, but I don’t really know what it means aside from that, even though it was voted up by 17 people.
20. What is the probability that this is the ultimate base layer of reality?
I would answer by saying, hold on, this is going to take me a while to calculate... um.... uh... um... 42 percent! (sarcastic)
21. Who was the most interesting would-be FAI solver you encountered?
Most people do not spontaneously try to solve the FAI problem. If they’re spontaneously doing something, they try to solve the AI problem. If we’re talking about sort of ‘who’s made interesting progress on FAI problems without being a Singularity Institute Eliezer supervised person,’ then I would have to say: Wei Dai.
22. If Omega materialized and told you Robin was correct and you are wrong, what do you do for the next week? The next decade?
If Robin’s correct, then we’re on a more or less inevitable path to competing intelligences driving existence down to subsistence level, but this does not result in the loss of everything we regard as valuable, and there seem to be some values disputes here, or things that are cleverly disguised as values disputes while probably not being very much like values disputes at all.
I’m going to take the liberty of reinterpreting this question as ‘Omega materializes and tells you “You’re Wrong”’, rather than telling me Robin in particular is right; for one thing that’s a bit more probable. And, Omega materializes and tells me ‘Friendly AI is important but you can make no contribution to that problem, in fact everything you’ve done so far is worse than nothing.’ So, publish a retraction... Ordinarily I would say that the next most important thing after this is to go into talking about rationality, but then if Omega tells me that I’ve actually managed to do worse than nothing on Friendly AI, that of course has to change my opinion of how good I am at rationality or teaching others rationality, unless this is a sort of counterfactual surgery type of thing where it doesn’t affect my opinion of how useful I can be by teaching people rationality, and mostly the thing I’d be doing if Friendly AI weren’t an option would probably be pushing human rationality. And if that were blocked out of existence, I’d probably end up as a computer programmer whose hobby was writing science fiction.
I guess I have enough difficulty visualizing what it means for Robin to be correct or how the human species isn’t just plain screwed in that situation that I could wish that Omega had materialized and either told me someone else was correct or given me a bit more detail about what I was wrong about exactly; I mean I can’t be wrong about everything; I think that two plus two equals four.
23. In one of the discussions surrounding the AI-box experiments, you said that you would be unwilling to use a hypothetical fully general argument/"mind hack" to cause people to support SIAI. You've also repeatedly said that the friendly AI problem is a "save the world" level issue. Can you explain the first statement in more depth? It seems to me that if anything really falls into "win by any means necessary" mode, saving the world is it.
Ethics are not pure personal disadvantages that you take on for others’ benefit. Ethics are not just penalties to the current problem you’re working on that have sort of side benefits for other things. When I first started working on the Singularity problem, I was making non-reductionist type mistakes about Friendly AI, even though I thought of myself as a rationalist at the time. And so I didn’t quite realize that Friendly AI was going to be a problem, and I wanted to sort of go all-out on any sort of AI, as quickly as possible; and actually, later on when I realized that Friendly AI was an issue, the sort of sneers that I now get about not writing code or being a luddite were correctly anticipated by my past self with the result that my past self sort of kept on advocating the kind of ‘rush ahead and write code’ strategy, rather than face the sneers, instead of going back and replanning everything from scratch once my past self realized that Friendly AI was going to be an issue, on which basis all the plans had been made before then.
So if I’d lied to get people to do what I had wanted them to do at that point, to just get AI done, to rush ahead and write code rather than doing theory; being honest as I actually was, I could just come back and say ‘OK, here’s what I said, I’m honestly mistaken, here’s the new information that I encountered that caused me to change my mind, here’s the new strategy that we need to use after taking this new information into account’. If you lie, there’s not necessarily any equally easy way to retract your lies. ... So for example, one sort of lie that I used to hear advocated back in the old days was by other people working on AI projects and it was something along the lines of ‘AI is going to be safe and harmless and will inevitably cure cancer, but not really take over the world or anything’ and if you tell that lie in order to get people to work on your AI project, then it’s going to be a bit more difficult to explain to them why you suddenly have to back off and do math and work on Friendly AI. Now, if I were an expert liar, I’d probably be able to figure out some sort of way to reconfigure those lies as well, I mean I don’t really know what an expert liar could accomplish by way of lying because I don’t have enough practice.
So I guess in that sense it’s not all that defensible... a defensive ethics, because I haven’t really tried it both ways, but it does seem to me, looking over my history, my ethics have played a pretty large role in protecting me from myself. Another example is [that] the whole reason that I originally pursued the thought of Friendly AI long enough to realize that it was important was not so much out of a personal desire as out of a sense that this was something I owed to the other people who were funding the project, Brian Atkins in particular back then, and that if there’s a possibility from their perspective that you can do better by Friendly AI, or that a fully honest account would cause them to go off and fund someone who was more concerned about Friendly AI, then I owed it to them to make sure that they didn’t suffer by helping me. And so it was a sense of ethical responsibility for others at that time which caused me to focus in on this sort of small, discordant note, ‘Well, this minor possibility that doesn’t look all that important, follow it long enough to get somewhere’. So maybe there are people who could defend the Earth by any means necessary and recruit other people to defend the Earth by any means necessary, and nonetheless have that all and well and happily smiling ever after, rather than bursting into flames and getting arrested for murder and robbing banks and being international outlaws, or more likely just arrested and attracting the ‘wrong’ sort of people who are trying to go along with this and people being corrupted by power and deciding that ‘no, the world really would be a better place with them in charge’ and etcetera etcetera etcetera.
I think if you sort of survey the Everett branches of the Many Worlds and look at the ones with successful Singularities, or pardon me, look at the conditional probability of successful Singularities, my guess is that the worlds that start out with programming teams who are trying to play it ethical versus the worlds that start off with programming teams that figure ‘well no, this is a planetary-class problem, we should throw away all our ethics and do whatever is necessary to get it done’ that the former world will have a higher proportion of happy outcomes. I could be mistaken, but if it does take a sort of master ruthless type person to do it optimally, then I am not that person, and that is not my comparative advantage, and I am not really all that willing to work with them either; so I suppose if there was any way you could end up with two Friendly AI projects, then I suppose the possibility of there actually being sort of completely ruthless programmers versus ethical programmers, they might both have good intentions and separate into two groups that refuse to work with one another, but I’m sort of skeptical about these alleged completely ruthless altruists. Has there ever, in history, been a completely ruthless altruist with that turning out well? Knut Haukelid, if I’m pronouncing his name correctly, the guy who blew up a civilian ferry in order to sink the Deuterium that the Nazis needed for their nuclear weapons program; you know you never see that in a Hollywood movie; so you killed civilians and did it to end the Nazi nuclear weapons program. So that’s about the best historical example I can think of where there’s a ruthless altruist and it turns out well, and I’m not really sure that’s quite enough to persuade me to give up my ethics.
24. What criteria do you use to decide upon the class of algorithms / computations / chemicals / physical operations that you consider "conscious" in the sense of "having experiences" that matter morally? I assume it includes many non-human animals (including wild animals)? Might it include insects? Is it weighted by some correlate of brain / hardware size? Might it include digital computers? Lego Turing machines? China brains? Reinforcement-learning algorithms? Simple Python scripts that I could run on my desktop? Molecule movements in the wall behind John Searle's back that can be interpreted as running computations corresponding to conscious suffering? Rocks? How does it distinguish interpretations of numbers as signed vs. unsigned, or ones complement vs. twos complement? What physical details of the computations matter? Does it regard carbon differently from silicon?
This is something that I don’t know, and would like to know. What you’re really being asked is ‘what do you consider as people? Who you consider as people is a value. How can you not know what your own values are?’ Well, for one, it’s very easy to not know what your own values are. And for another thing, my judgement of what is a person, I do want to rely, if I can, on the notion of ‘what has... (hesitant) subjective experience’. For example, one reason that I’m not very concerned about my laptop’s feelings is because I’m fairly sure that whatever else is going on in there, it’s not ‘feeling’ it. And this is really something I wish I knew more about.
And the number one reason I wish I knew more about it is because the most accurate possible model of a person is probably a person; not necessarily the same person, but if you had an Unfriendly AI and it was looking at a person and using huge amounts of computing power, or just very efficient computing power, to model that person and predict the next event as accurately and as precisely as it could, then its model of that person might not be the same person, but it would probably be a person in its own right. So, one of the problems that I don’t even try talking to other AI researchers about, because it’s so much more difficult than what they signed up to handle that I just assume that they don’t want to hear about it; I’ve confronted them with much less difficult sounding problems like this and they just make stuff up or run away, and don’t say ‘Hmm, I better solve this problem before I go on with my plans to... destroy the world,’ or whatever it is they think they’re doing.
But in terms of danger points; three example danger points. First, if you have an AI with a pleasure-pain reinforcement architecture and any sort of reflectivity, the ability to sort of learn about its own thoughts and so on, then I might consider that a possible danger point, because then, who knows, it might be able to hurt and be aware that it was hurting; in particular because pleasure-pain reinforcement architecture is something that I think of as an evolutionary legacy architecture rather than an incredibly brilliant way to do things; that scenario space is easy to clear out of.
If you had an AI with terminal values over how it was treated and its role in surrounding social networks; like you had an AI that could... just, like, not as a means to an end but just, like, in its own right, the fact that you are treating it as a non-person; even if you don’t know whether or not it was feeling that about that, you might still be treading into territory where, just for the sake of safety, it might be worth steering out of it in terms of what we would consider as a person.
Oh, and the third consideration is that if your AI spontaneously starts talking about the mystery of subjective experience and/or the solved problem of subjective experience, and a sense of its own existence, and whether or not it seems mysterious to the AI; it could be lying, but you are now in probable trouble; you have wandered out of the safe zone. And conversely, as long as we go on about building AIs that don’t have pleasure, pain, and internal reflectivity, and anything resembling social emotions or social terminal values, and that exhibit no signs at all of spontaneously talking about a sense of their own existence, we’re hopefully still safe. I mean ultimately, if you push these things far enough without knowing what you’re doing, sooner or later you’re going to open the black box that contains the black swan surprise from hell. But at least as long as you sort of steer clear of those three land mines, and things just haven’t gone further and further and further, it gives you a way of looking at a pocket calculator and saying that the pocket calculator is probably safe.
25. I admit to being curious about various biographical matters. So for example I might ask: What are your relations like with your parents and the rest of your family? Are you the only one to have given up religion?
As far as I know I’m the only one in my family to give up religion except for one grand-uncle. I still talk to my parents, still phone calls and so on, amicable relations and so on. They’re Modern Orthodox Jews, and mom’s a psychiatrist and dad’s a physicist, so... ‘Escher painting’ minds; thinking about some things but always avoiding the real weak points of their beliefs and developing more and more complicated rationalizations. I tried confronting them directly about it a couple of times and each time have been increasingly surprised at the sheer depth of tangledness in there.
I might go on trying to confront them about it a bit, and it would be interesting to see what happens to them if I finish my rationality book and they read it. But certainly among the many things to resent religion for is the fact that I feel that it prevents me from having the sort of family relations that I would like; that I can’t talk with my parents about a number of things that I would like to talk with them about. The kind of closeness that I have with my fellow friends and rationalists is a kind of closeness that I can never have with them; even though they’re smart enough to learn the skills, they’re blocked off by this boulder of religion squatting in their minds. That may not be much to lay against religion, it’s not like I’m being burned at the stake, or even having my clitoris cut off, but it is one more wound to add to the list. And yeah, I resent it.
I guess even when I do meet with my parents and talk with my parents, the fact of their religion is never very far from my mind. It’s always there as the block, as a problem to be solved that dominates my attention, as something that prevented me from saying the things I want to say, and as the thing that’s going to kill them when they don’t sign up for cryonics. My parents may make it without cryonics, but all four of my grandparents are probably going to die, because of their religion. So even though they didn’t cut off all contact with me when I turned Atheist, I still feel like their religion has put a lot of distance between us.
26. Is there any published work in AI (whether or not directed towards Friendliness) that you consider does not immediately, fundamentally fail due to the various issues and fallacies you've written on over the course of LW? (E.g. meaningfully named Lisp symbols, hiddenly complex wishes, magical categories, anthropomorphism, etc.)
ETA: By AI I meant AGI.
There’s lots of work that’s regarded as plain old AI that does not immediately fail. There’s lots of work in plain old AI that succeeds spectacularly, and Judea Pearl is sort of like my favorite poster child there. But one could also name the whole Bayesian branch of statistical inference, which can be regarded with some equanimity as part of AI. There’s the sort of Bayesian methods that are used in robotics as well, which is sort of a surprisingly... how do I put it, it’s not theoretically distinct because it’s all Bayesian at heart, but in terms of the algorithms, it looks to me like there’s quite a bit of work that’s done in robotics that’s a separate branch of Bayesianism from the work done in statistical learning type stuff. That’s all well and good.
But if we’re asking about works that are sort of billing themselves as ‘I am Artificial General Intelligence’, then I would say that most of that does indeed fail immediately and indeed I cannot think of a counterexample which fails to fail immediately, but that’s a sort of extreme selection effect, and it’s because if you’ve got a good partial solution, or solution to a piece of the problem, and you’re an academic working in AI, and you’re anything like sane, you’re just going to bill it as plain old AI, and not take the reputational hit from AGI. The people who are bannering themselves around as AGI tend to be people who think they’ve solved the whole problem, and of course they’re mistaken. So to me it really seems like to say that all the things I’ve read on AGI immediately fundamentally fail is not even so much a critique of AI as rather a comment on what sort of work tends to bill itself as Artificial General Intelligence.
27. Do you feel lonely often? How bad (or important) is it?
(Above questions are a corollary of:) Do you feel that — as you improve your understanding of the world more and more —, there are fewer and fewer people who understand you and with whom you can genuinely relate in a personal level?
That’s a bit hard to say exactly. I often feel isolated to some degree, but the fact of isolation is a bit different from the emotional reaction of loneliness. I suspect and put some probability to the suspicion that I’ve actually just been isolated for so long that I don’t have a state of social fulfillment to contrast it to, whereby I could feel lonely, or as it were, lonelier, or that I’m too isolated relative to my baseline or something like that. There's also the degree to which I, personality-wise, don’t hold with trying to save the world in an Emo fashion...? And as I improve my understanding of the world more and more, I actually would not say that I felt any more isolated as I’ve come to understand the world better.
There’s some degree to which hanging out with cynics like Robin Hanson has caused me to feel that the world is even more insane than I started out thinking it was, but that’s more a function of realizing that the rest of the world is crazier than I thought rather than myself improving.
Writing Less Wrong has, I think, helped a good deal. I now feel a great deal less like I’m walking around with all of this stuff inside my head that causes most of my thoughts to be completely incomprehensible to anyone. Now my thoughts are merely completely incomprehensible to the vast majority of people, but there’s a sizable group out there who can understand up to, oh, I don’t know, like one third of my thoughts without a year’s worth of explanation because I actually put in the year’s worth of explanation. And even attracted a few people whom I feel like I can relate to on a personal level, and Michael Vassar would be the poster child there.
28. Previously, you endorsed this position:
Never try to deceive yourself, or offer a reason to believe other than probable truth; because even if you come up with an amazing clever reason, it's more likely that you've made a mistake than that you have a reasonable expectation of this being a net benefit in the long run.
One counterexample has been proposed a few times: holding false beliefs about oneself in order to increase the appearance of confidence, given that it's difficult to directly manipulate all the subtle signals that indicate confidence to others.
What do you think about this kind of self-deception?
So... Yeah, ‘cuz y’know people are always criticizing me on the grounds that I come across as too hesitant and not self confident enough. (sarcastic)
But to just sort of answer the broad thrust of the question; four legs good, two legs bad, self-honest good, self-deception bad. You can’t sort of say ‘Ok now I’m going to execute a 180 degree turn from the entire life I’ve led up until this point and now, for the first time, I’m going to throw away all the systematic training I’ve put into noticing when I’m deceiving myself, finding the truth, noticing thoughts that are hidden away in the corner of my mind, and taking reflectivity on a serious, gut level, so that if I know I have no legitimate reason to believe something I will actually stop believing it because, by golly, when you have no legitimate reason to believe something, it’s usually wrong. I’m now going to throw that out the window; I’m going to deceive myself about something and I’m not going to realize it’s hopeless and I’m going to forget the fact that I tried to deceive myself.’ I don’t see any way that you can turn away from self-honesty and towards self-deception, once you’ve gone far enough down toward the path of self-honesty without ‘A’ relinquishing The Way and losing your powers, and ‘B’ it doesn’t work anyway.
Most of the time, deceiving yourself is much harder than people think. But, because they don’t realize this, they can easily deceive themselves into believing that they’ve deceived themselves, and since they’re expecting a placebo effect, they get most of the benefits of the placebo effect. However, at some point, you become sufficiently skilled in reflection that this sort of thing does not confuse you anymore, and you actually realize that that’s what’s going on, and at that point, you’re just stuck with the truth. How sad. I’ll take it.
29. In the spirit of considering semi abyssal plans, what happens if, say, next week you discover a genuine reduction of consciousness and it turns out that... There's simply no way to construct the type of optimization process you want without it being conscious, even if very different from us?
ie, what if it turned out that The Law turned out to have the consequence of "to create a general mind is to create a conscious mind. No way around that"? Obviously that shifts the ethics a bit, but my question is basically if so, well... "now what?" what would have to be done differently, in what ways, etc?
Now, this question actually comes in two flavors. The difficult flavor is, you build this Friendly AI, and you realize there’s no way for it to model other people at the level of resolution that you need without every imagination that it has of another person being conscious. And so the first obvious question is ‘why aren’t my imaginations of other people conscious?’ and of course the obvious answer would be ‘they are!’ The models in your mind that you have of your friends are not your friends, they’re not identical with your friends, they’re not as complicated as the people you’re trying to model, so the person that you have in your imagination does not much resemble the person that you’re imagining; it doesn’t even much resemble the referent... like I think Michael Vassar is a complicated person, but my model of him is simple and then the person who that model is is not as complicated as my model says Michael Vassar is, etcetera, etcetera. But nonetheless, every time that I’ve modeled a person, and I write my stories, the characters that I create are real people. They may not hurt as intensely as the people do in my stories, but they nonetheless hurt when I make bad things happen to them, and as you scale up to superintelligence the problem just gets worse and worse and the people get realer and realer.
What do I do if this turns out to be the law? Now, come to think of it, I haven’t much considered what I would do in that case; and I can probably justify that to you by pointing out the fact that if I actually knew that this was the case I would know a great number of things I do not currently know. But mostly I guess I would have to start working on sort of different Friendly AI designs so that the AI could model other people less, and still get something good done.
And as for the question of ‘Well, the AI can go ahead and model other people but it has to be conscious itself, and then it might experience empathically what it imagines conscious beings experiencing the same way that I experience some degree of pain and shock, although not a correspondingly large amount of pain and shock, when I imagine one of my characters watching their home planet be destroyed.’ So in this case, one is now faced with the question of creating an AI such that it can, in the future, become a good person; to the extent that you regard it as having human rights, it hasn’t been set onto a trajectory that would lock it out of being a good person. And this would entail a number of complicated issues, but it’s not like you have to make a true good person right off the bat, you just have to avoid putting it into horrible pain, or making it so that it doesn’t want to be what we would think of as a humane person later on. … You might have to give it goals beyond the sort of thing I talk about in Coherent Extrapolated Volition, and at the same time, perhaps a sort of common sense understanding that it will later be a full citizen in society, but for now it can sort of help the rest of us save the world.
30. What single technique do you think is most useful for a smart, motivated person to improve their own rationality in the decisions they encounter in everyday life?
It depends on where that person has a deficit; so, the first thought that came to mind for that answer is ‘hold off on proposing solutions until you’ve analyzed the problem for a bit’, but on the other hand, if I’m dealing with someone who’s given to extensive, deliberate rationalization, then the first thing I tell them is ‘stop doing that’. If I’m dealing with someone who’s ended up stuck in a hole because they now have this immense library of flaws to accuse other people of, so that no matter what is presented to them, they can find a flaw in that and yet they don’t turn, at full force, that ability upon themselves, then the number one technique that they need is ‘avoid motivated skepticism’. If I’m dealing with someone who tends to be immensely driven by cognitive dissonance and rationalizing mistakes that they already made, then I might advise them on Cialdini’s time machine technique; ask yourself ‘would you do it differently if you could go back in time, in your heart of hearts’, or pretend that you have now been teleported into your situation spontaneously; some technique like that, say.
But these are all matters of ‘here’s a single flaw that the person has that is stopping them’. So if you move aside from that a bit and ask ‘what sort of positive counter intuitive technique you might use’, I might say ‘hold off on proposing solutions until you understand the problem’. Well, the question was about everyday life, so, in everyday life, I guess I would still say that people’s intelligence might probably still be improved a bit if they sort of paused and looked at more facets of the situation before jumping to a policy solution; or it might be rationalization, cognitive dissonance, the tendency to just sort of reweave their whole life stories just to make it sound better and to justify their past mistake, that doing something to help tone that down a bit might be the most important thing they could do in their everyday lives. Or if you’ve got someone who’s giving away their entire income to their church then they could do with a bit more reductionism in their lives, but my guess is that, in terms of everyday life, then either one of ‘holding off on proposing solutions until thinking about the problem’ or ‘against rationalization, against cognitive dissonance, against sour grapes, not reweaving your whole life story to make sure that you didn’t make any mistakes, to make sure that you’re always in the right and everyone else is in the wrong, etcetera, etcetera’, that one of those two would be the most important thing.
I bet most people here have realized this explicitly or implicitly, but this comment has inspired me to write a short, linkable summary of this error pattern, with a name:
The Relation Projection Fallacy: a denotational error whereby one confuses an n-ary relation for an m-ary relation, where usually m<n.
Example instance: "Life has no purpose."
This is a troublesome phrase. Why? If you look at unobjectionable uses of the concept <purpose> --- also referenced by synonyms like "having a point" --- it is in fact a ternary relation.
Example non-instance: "The purpose of a doorstop is to stop doors."
Here, one can query "to whom?" and be returned the context "to the person who made it" or "to the person who's using it", etc. That is, the full denotation of "purpose" is always of the form "The purpose of X to Y is Z," where Y is often implicit or can take a wide range of values.
This has nothing to do with connotation... it's just how the concept <purpose> typically works as people use it. But to flog a dead horse, the purpose of a doorstop to a cat may be to make an amusing sound as it glides across the floor after the cat hits it. The value of Y always matters. There is no "true purpose" stored anywhere inside the doorstop, or even in the combination of the doorstop and the door it is stopping. To think otherwise is literally projecting, in the mathematical sense, a ternary relation, i.e., a subset of a product of three sets (objects)x(agents)x(verbs), into a product of two sets, (objects)x(verbs). But people often do this projection incorrectly, by either searching for a purpose that is intrinsic to the Doorstop or to Life, or by searching for a canonical value of "Y" like "The Great Arbiter of Purpose", both of which are not to be found, at least to their satisfaction when they utter the phrase "Life has no purpose."
Likewise, the relation "has a purpose" is typically a binary relation, because again, we can always ask "to whom?". "<That doorstop> has a purpose to <me>."
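To make the projection concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than anything from the post itself: the triples are made-up examples in the spirit of the doorstop, and the names `PURPOSE`, `project_out_agent`, `purposes_by_object`, and `purpose_to` are hypothetical. It stores "the purpose of X to Y is Z" as a set of (object, agent, purpose) triples, then drops the agent to see what is left.

```python
# Hypothetical illustration: these triples are invented examples, not data from anywhere.
from collections import defaultdict

# A ternary relation: each triple reads "the purpose of <object> to <agent> is <purpose>".
PURPOSE = {
    ("doorstop", "maker", "stop doors"),
    ("doorstop", "user", "stop doors"),
    ("doorstop", "cat", "make an amusing sound"),
}


def project_out_agent(relation):
    """Project the ternary relation onto (object, purpose) pairs, dropping the agent."""
    return {(obj, purpose) for (obj, _agent, purpose) in relation}


def purposes_by_object(binary_relation):
    """Group the projected (object, purpose) pairs by object."""
    grouped = defaultdict(set)
    for obj, purpose in binary_relation:
        grouped[obj].add(purpose)
    return dict(grouped)


def purpose_to(relation, obj, agent):
    """Answer the well-posed question: what is the purpose of obj *to* agent?"""
    return {p for (o, a, p) in relation if o == obj and a == agent}


projected = project_out_agent(PURPOSE)
print(purposes_by_object(projected))           # the doorstop now maps to two purposes
print(purpose_to(PURPOSE, "doorstop", "cat"))  # one answer once the agent is supplied
```

After the projection the doorstop is paired with two different purposes and nothing in the data picks one as "the" purpose; the question only gets a single answer again when the agent is put back, as in `purpose_to`.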
In some form, this realization is of course the cause of many schools of thought taking the name "relativist" on many different issues. But I find that people over-use the phrase "It's all relative" to connote "It's all meaningless" or "there is no answer". Which is ironic, because meaning itself is a ternary relation! Its typical denotation is of the form "The meaning of X to Y is Z", like in
- "The meaning of <the sound 'owe'> to <French people> is <liquid water>" or
- "The meaning of <that pendant> to <your mother> is <a certain undescribed experience of sentimentality>".
Realizing this should NOT result in a cascade of bottomless relativism where nothing means anything! In fact, the first time I had this thought as a kid, I arrived at the connotationally pleasing conclusion "My life can have as many purposes as there are agents for it to have a purpose to."
Indeed, the meaning of <"purpose"> to <humans> is <a certain ternary functional relationship between objects, agents, and verbs>, and the meaning of <"meaning"> to <humans> is <a certain ternary relationship between syntactic elements, people generating or perceiving them, and referents>.
When I found LessWrong, I was happy to find that Eliezer wrote on almost exactly this realization in 2-Place and 1-Place Words, but sad that the post had few upvotes -- only 14 right now. So in case it was too long, or didn't have a snappy enough name, I thought I'd try giving the idea another shot.
ETA: In the special case of talking to someone wondering about the purpose of life, here is how I would use this observation in the form of an argument:
First of all, you may be lacking satisfaction in your life for some reason, and framing this to yourself in philosophical terms like "Life has no purpose, because <argument>." If that's true, it's quite likely that you'd feel differently if your emotional needs as a social primate were being met, and in that sense the solution is not an "answer" but rather some actions that will result in these needs being met.
Still, that does not address the <argument>. So because "What is the purpose of life?" may be a hard question, let's look at easier examples of purpose and see how they work. Notice how they all have someone the purpose is to? And how that's missing in your "purpose of life" question? Because of that, you could end up feeling one of two ways:
(1) Satisfied, because now you can just ask "What could be the purpose of my life to <my friends, my family, myself, the world at large, etc>", and come up with answers, or
(2) Unsatisfied, because there is no agent to ask about such that the answer would seem important enough to you.
And I claim that whether you end up at (1) or (2) is probably more a function of whether your social primate emotional needs are being met than any particular philosophical argument.
That being said, if you believe this argument, the best thing to do for someone lacking a sense of purpose is probably not to just say the argument, but to help them start satisfying their emotional needs, and have this argument mainly to satisfy their sense of curiosity or nagging intellectual doubts about the issue.
Post will be returning in Main, after a rewrite by the company's writing staff. Citations Galore.
The Singularity Institute needs researchers capable of doing literature searches, critically analyzing studies, and summarizing their findings. The fields involved are mostly psychology (biases and debiasing, effective learning, goal-directed behavior / self help), computer science (AI and AGI), technological forecasting, and existential risks.
Pay is hourly and starts at $14/hr but that will rise if the product is good. You must be available to work at least 20 hrs/week to be considered.
- Work from home, with flexible hours.
- Age and credentials are irrelevant; only the product matters.
- Get paid to research things you're probably interested in anyway.
- Contribute to human knowledge in immediately actionable ways. We need this research because we're about to act on it. Your work will not fall into the journal abyss that most academic research falls into.
If you're interested, apply here.
Why post this job ad on LessWrong? We need people with some measure of genuine curiosity.
Also see Scholarship: How to Do It Efficiently.
At a recent meetup, we tried having a structured discussion in which we would all choose to talk about a belief that influences our behavior, talk about something we protect, or talk about a mistake we once made and have corrected. And it seemed that people thought it would require exceptional bravery to choose to talk about one's mistake. Elsewhere on Less Wrong, people are concerned about retaining the ability to edit a comment expressing a position they later reconsider and think is wrong.
My first reaction to all of this is that we need a group norm so that it doesn't require bravery to admit a mistake, or to leave a record of previously held positions. My second reaction is that we do in fact have such a norm. Comments expressing a change in position, that accept counterarguments and refutations, get upvoted. Old comments reflecting the old wrong position are generally not downvoted for being wrong. The problem is not how we treat people who make mistakes, but that people have inaccurate anticipations of how we will react.
So, to everyone who is worried about this, I want to say: It's OK. You can admit your mistakes. You can make a mistake and change your mind. We, the community, will applaud your growth, celebrate your new strength, and leave your mistake in the past where it belongs.
There's a concept (inspired by a Metafilter blog post) of ask culture vs. guess culture. In "ask culture," it's socially acceptable to ask for a favor -- staying over at a friend's house, requesting a raise or a letter of recommendation -- and equally acceptable to refuse a favor. Asking is literally just inquiring if the request will be granted, and it's never wrong to ask, provided you know you might be refused. In "guess culture," however, you're expected to guess if your request is appropriate, and you are rude if you accidentally make a request that's judged excessive or inappropriate. You can develop a reputation as greedy or thoughtless if you make inappropriate requests.
When an asker and a guesser collide, the results are awful. I've seen it in marriages, for example.
Husband: "Could you iron my shirt? I have a meeting today."
Wife: "Can't you see I'm packing lunches and I'm not even dressed yet? You're so insensitive!"
Husband: "But I just asked. You could have just said no if you were too busy -- you don't have to yell at me!"
Wife: "But you should pay enough attention to me to know when you shouldn't ask!"
It's not clear how the asking vs. guessing divide works. Some individual people are more comfortable asking than guessing, and vice versa. It's also possible that some families, and some cultures, are more "ask-based" than "guess-based." (Apparently East Asia is more "guess-based" than the US.) It also varies from situation to situation: "Will you marry me?" is a question you should only ask if you know the answer is yes, but "Would you like to get coffee with me?" is the kind of question you should ask freely and not worry too much about rejection.
There's a lot of scope for rationality in deciding when to ask and when to guess. I'm a guesser, myself. But that means I often pass up the opportunity to get what I want, because I'm afraid of being judged as "greedy" if I make an inappropriate request. If you're a systematic "asker" or a systematic "guesser," then you're systematically biased, liable to guess when you should ask and vice versa.
In my experience, there are a few situations in which you should experiment with asking even if you're a guesser: in a situation where failure/rejection is so common as to not be shameful (e.g. dating), in a situation where it's someone's job to handle requests, and requests are common (e.g. applying for jobs or awards, academic administration), in a situation where granting or refusing a request is ridiculously easy (most internet communication). Most of the time when I've tried this out I've gotten my requests granted. I'm still much more afraid of being judged as greedy than I am of not getting what I want, so I'll probably always stay on the "guessing" end of the spectrum, but I'd like to get more flexible about it, and more willing to ask when I'm in situations that call for it.
Anyone else have a systematic bias, one way or another? Anybody trying to overcome it?
(relevant: The Daily Ask, a website full of examples of ways you can make requests. Some of these shock me -- I wouldn't believe it's acceptable to bargain over store prices like that. But, then again, I'm running on corrupted hardware and I wouldn't know what works and what doesn't until I make the experiment.)
I was shocked, absolutely shocked, to find that Tyler Cowen's excellent TEDxMidAtlantic talk on stories had not yet been transcribed. It generated a lot of discussion in the thread about it where it was first introduced, so I went ahead and transcribed it. I added hyperlinks to background information where I thought it was due. Here you go:
Newest edit: I just realized that by "philosophy journals" in the original post I really meant "cognitive science" journals. (I made the mistake because for me, philosophy basically just is cognitive science.) So please read the below in terms of cognitive science journals, not just philosophy journals.
First edit: Some people apparently read this as an "ultimatum" for SIAI, which was not the intent at all. It's merely an argument for why I think SIAI could benefit from publishing in mainstream journals, and then some advice on how to do it. I'm making recommendations, not "demands" - how silly would that be? Also, it's not like I'm saying SIAI should do a bunch of stuff, and I'm then walking away. For example, I'm actively writing a journal-grade paper on Friendly AI, putting it in the context of existing literature on the subject. And I'd love to do more.
Also, I suspect that many at SIAI already want to be publishing in mainstream philosophy journals. The problem is that it requires a fair amount of resources and know-how to do so (as the below post shows), and that takes time. It doesn't appear that SIAI has anyone whose primary training is in philosophy, because they've (wisely) invested their resources in, you know, math geniuses and people who can bring in funds and so on. Anyhoo...
After reading about 80% of the literature in the field of machine ethics, I've realized that the field hasn't quite caught up to where Yudkowsky's thinking was (on the most important issues) circa 2001.*
One cause of this may be the fact that unlike almost every other 10-year research institute or university research program on the planet, SIAI has no publications in established peer-reviewed journals. This fact has at least two effects: (1) SIAI's researchers are able to work more quickly on these problems when they are not spending their time reading hundreds of mostly useless papers from the mainstream literature, and then composing arduously crafted papers that conform to the style and expectations of the mainstream community, citing all the right literature. And: (2) the mainstream community has not caught up with SIAI's advances because SIAI has not shared them with anyone - at least not in their language, in their journals, to their expectations of clarity and style.
However, I suspect that SIAI may now want to devote some resources to doing what must be done to get published in mainstream journals, because (1) many donors do know the difference between conference papers and papers published in mainstream journals, and will see SIAI as more valuable and credible if they are publishing in mainstream journals, (2) SIAI's views will look less cult-like and more academically credible in general if they publish in mainstream journals, and (3) SIAI and LW people will need to spend less time answering dumb questions like "Why not just program the AI to maximize human happiness?" if SIAI publishes short, well-cited, well-argued responses to such questions in the language that everybody else knows how to understand, rather than responding to those questions in a way that requires someone to read a set of dozens of blog posts and articles with a complex web of dependencies and an unfamiliar writing/citation style and vocabulary. Also: (4) Talking in everyone else's language and their journals will probably help some really smart people make genuine progress on the Friendly AI problem! Gert-Jan Lokhorst is a really smart guy interested in these issues, but it's not clear that he has read Yudkowsky. Perhaps he's never heard of Yudkowsky, or if he has, he doesn't have time to risk spending it on something that hasn't even bothered to pass a journal's peer review process. Finally, bringing the arguments to the world in the common language and journals will (5) invite criticism, some of which will be valid and helpful in reformulating SIAI's views and giving us all a better chance of surviving the singularity.
Thus, I share some advice on how to get published in philosophy journals. Much of SIAI's work is technically part of the domain of 'philosophy', even when it looks like math or computer science. Just don't think of Kant or Plato when you think of 'philosophy.' Much of SIAI's work is more appropriate for math and computer science journals, but I'm not familiar with how to get published in those fields, though I suspect the strategy is much the same.
Who am I to share advice? I've never published in a philosophy journal. But a large cause of that fact is that I haven't tried. (Though I'm beginning early drafts of some journal-bound papers now.) Besides, what I share with you below just repeats what published authors have said to me and online, so you're getting their advice, not particularly mine.
Okay, how to get published in philosophy journals...
The easiest way to get published is to be a respected academic with a long publication history, working at a major university. Barring that, find a co-author or two who fit that description.
Still, that won't be enough, and sometimes the other conditions below will be sufficient if you don't match that profile. After all, people do manage to build up a long publication history starting with a publication history of 0. Here's how they do it:
1. Write in the proper style. Anglophone philosophy has, over the years, developed a particular style marked by clarity and other norms. These norms have been expressed in writing guides for undergraduate philosophy students here, here, and elsewhere. However, such guides are insufficient. Really, the only way to learn the style of Anglophone philosophy is to read hundreds and hundreds of journal articles. You will then have an intuitive sense of what sounds right or wrong, and which structures are right or wrong, and your writing will be much easier because you won't need to look it up in a style guide every time. As an example, Yudkowsky's TDT paper is much closer to the standard style than his CEV paper, but it's still not quite there yet.
2. Use the right vocabulary and categories. Of course, you might write a paper aiming to recommend a new term or new categories, but even then you need to place your arguments in the context of the existing terms and categories first. As an example, consider Eliezer's Coherent Extrapolated Volition paper from 2004. The paper was not written for journals, so I'm not criticizing the paper. I'm explaining how it would need to be written differently if it was intended for journal publication. Let's pretend it is now 2004, and I am co-writing the Coherent Extrapolated Volition paper with Eliezer, and we want to publish it in a mainstream journal.
First, what is Eliezer's topic? It is the topic of how to design the goal system of an AI so that it behaves ethically, or in ways that we want. For a journal paper, our first goal would be to place the project of our paper in the context of the existing literature on that subject. Now, in 2004, it wasn't clear that this field would come to be called by the term "machine ethics" rather than by other terms that were floating around at the time like "artificial morality" (Danielson, 1992) or "computational ethics" (Allen et al., 2000) or "friendly ai" (Yudkowsky, 2001). So, we would probably cite the existing literature on this issue of how to design the goal system of an AI so that it behaves ethically (only about two dozen works existed in 2004) and pick the terms that worked best for our purpose, after making clear what we meant by them.
Next, we would undertake the same considerations for the other concepts we use. For example, Eliezer introduces the term volition:
Suppose you're faced with a choice between two boxes, A and B. One and only one of the boxes contains a diamond. You guess that the box which contains the diamond is box A. It turns out that the diamond is in box B. Your decision will be to take box A. I now apply the term volition to describe the sense in which you may be said to want box B, even though your guess leads you to pick box A.
But here, it's unnecessary to invent a new term, because philosophers talk a lot about this concept, and they already have a well-developed vocabulary for talking about it. Eliezer is making use of the distinction between "means" and "ends," and he's talking about "informed desires" or "informed wants" or "what an agent would want if fully informed." There is a massive and precise literature on this concept, and mainstream journals would expect us to pick one variant of this vocabulary for use in our paper and cite the people who use it, rather than just introducing a brand new term for no good reason.
Next, when Eliezer writes about "extrapolating" human volition, he actually blends two concepts that philosophers keep distinct for good reasons. He blends the concept of distinguishing means from ends with the notion of ends that change in response to the environment or inner processes. To describe the boxes example above, a mainstream philosopher would say that you desired to choose box A as a means, but you desired the diamond in box B as an end. (You were simply mistaken about which box contained the diamond.) Eliezer calls this a type of "extrapolation," but he also refers to something else as "extrapolation":
In poetic terms, our coherent extrapolated volition is our wish if we knew more, thought faster, were more the people we wished we were, had grown up farther together; where the extrapolation converges rather than diverges, where our wishes cohere rather than interfere; extrapolated as we wish that extrapolated, interpreted as we wish that interpreted.
This is a very different thing to the mainstream philosopher. This is an actual changing of (or extrapolating of) what one desires as an end, perhaps through a process by which reward signals reinforce certain neural pathways, thus in certain circumstances transforming a desire-as-means into a desire-as-end (Schroeder, 2004). Or, in Yudkowsky's sense, it's an "extrapolation" of what we would desire as an end if our desires-as-ends were transformed through a process that involved not just more information but also changes to our neural structure due to environment (such as growing up together).
This kind of unjustified blending and mixing of concepts - without first putting your work in the context of the current language and then justifying your use of a brand new language - is definitely something that would keep our paper out of mainstream journals. In this case, I think the mainstream language is just fine, so I would simply adopt it, briefly cite some of the people who explain and defend that language, and move forward.
There are other examples, right from the start. Eliezer talks about the "spread" of "extrapolated volition" where a mainstream philosopher would talk about its uncertainty. He talks about "muddle" where a mainstream philosopher would call it inconsistency or incoherence. And so on. If we were writing the CEV paper in 2004 with the intention of publishing in a mainstream journal, we would simply adopt the mainstream language if we found it adequate, or we would first explain ourselves in terms of the mainstream language and then argue in favor of using a different language, before giving other arguments in that brand new language.
Same goes for every other subject. If you're writing on the complexity of wishes, you should probably be citing from, say, OUP's recent edited volume on the very latest affective neuroscience of pleasure and desire, and you should probably know that what you're talking about is called "affective neuroscience," and you should probably know that one of the leading researchers in that field is Kent Berridge, and that he recently co-edited a volume for OUP on exactly the subject you are talking about. (Hint: neuroscience overwhelmingly confirms Eliezer's claims about the complexity of wishes, but the standards of mainstream philosophy expect you to cite actual science on the topic, not just appeal to your readers' intuitions. Or at least, good mainstream philosophy requires you to cite actual science.)
I should also mention there's a huge literature on this "fully informed" business, too. One of the major papers is from David Sobel.
3. Put your paper in context and cite the right literature. Place your work in the context of things already written on the subject. Start with a brief overview of the field or sub-field, citing a few key works. Distinguish a few of the relevant questions from each other, and explain exactly which questions you'll be tackling in this paper, and which ones you will not. Explain how other people have answered those questions, and explain why your paper is needed. Then, go on to give your arguments, along the way explaining why you think your position on the question, or your arguments, are superior to the others that have been given, or valuable in some other way. Cite the literature all along the way.
4. Get feedback from mainstream philosophers. After you've written a pretty good third draft, send it to the philosophers whose work you interact with most thoroughly. If the draft is well-written according to the above rules, they will probably read it. Philosophers get way less attention than scientists, and are usually interested to read anything that engages their work directly. They will probably send you a few comments within a month or two, and may name a few other papers you may want to read. Revise.
5. Submit to the right journals. If you have no mainstream academic publishing history, you may want to start conservatively and submit to some established but less-prestigious journals first. As your mainstream academic publishing record grows, you can feel more confident in submitting to major journals in your field - in the case of CEV, this would be journals like Minds and Machines and IEEE Intelligent Systems and International Journal of Applied Philosophy. After a couple successes there, you might be able to publish in a major general-subject philosophy journal like Journal of Philosophy or Nous. But don't get your hopes up.
Note that journals vary widely in what percentage of submissions they accept, how good the feedback is, and so on. For that, you'll want to keep track of what the community is saying about various journals. This kind of thing is often reported on blogs like Leiter Reports.
6. Remember your strengths and weaknesses. If this process sounds like a lot of work - poring through hundreds of journal articles and books to figure out what the existing language is for each concept you want to employ, and thinking about whether you want to adopt that language or argue for a new one, figuring out which journals to submit to, and so on - you're right! Writing for mainstream journals is a lot of work. It's made much easier these days with online search engines and digital copies of articles, but it's still work, and you have to know how to look for it. You have to know the names of related terms that might bring you to the right articles. You have to know which journals and publishers and people are the "big names." You have to know what the fields and sub-fields of philosophy (and any relevant sciences) are, and how they interact. This is one advantage that someone who is familiar with philosophy has over someone who is not - it may not be that the former is any smarter or more creative than the latter, it's just that the former knows what to look for, and probably already knows what language to use for a great many subjects so he doesn't have to look it up. Also, if you're going to do this it is critical that you have some mastery over procrastination.
Poring through the literature, along with other steps in the process of writing a mainstream philosophy paper, is often a godawful slog. And of course it helps if you quite simply enjoy research. That's probably the most important quality you can have. If you're not great with procrastination and you don't enjoy research but you have brilliant and important ideas to publish, team up with somebody who does enjoy research and has a handle on procrastination as your writing partner. You can do the brilliant insight stuff, the other person can do the literature slog and using-the-right-terms-and-categories part.
There is tons more I could say about the subject, but that's at least a start. I hope it's valuable to some people, especially if you think you might want to publish something on a really important subject like existential risks and Friendly AI. Good luck!
* This is not to say the field of machine ethics is without valuable contributions. Far from it!
Has no one else mentioned this on LW yet?
Elizabeth Edwards has been elected as a New Hampshire State Rep, self-identifies as a Rationalist and explicitly mentions Less Wrong in her first post-election blog post.
Sorry if this is a repost
[EDIT, Nov 13th: I've submitted to FIMFiction, and will update with a link to its permanent home if it passes moderation. I have also removed the docs link and will make the document private once it goes live.]
Over the last year, I’ve spent a lot of my free time writing a semi-rationalist My Little Pony fanfic. Whenever I’ve mentioned this side project, I’ve received requests to alpha the story.
I present, as an open beta: Friendship is Optimal. Please do not spread that link outside of LessWrong; Google Docs is not its permanent home. I intend to put it up on fanfiction.net and submit it to Equestria Daily after incorporating any feedback. The story is complete, and I believe I've caught the majority of typographical and grammatical problems. (Though if you find some, comments are open on the doc itself.) Given the subject matter, I’m asking for the LessWrong community’s help in spotting any major logical flaws or other storytelling problems.
Cover jacket text:
Hanna, the CEO of Hofvarpnir Studios, just won the contract to write the official My Little Pony MMO. She had better hurry; a US military contractor is developing weapons based on her artificial intelligence technology, which just may destroy the world. Hanna has built an A.I. Princess Celestia and given her one basic drive: to satisfy values through friendship and ponies. What will Princess Celestia do when she’s let loose upon the world, following the drives Hanna has given her?
Special thanks to my roommate (who did extensive editing and was invaluable in noticing attempts by me to anthropomorphize an AI), and to Vaniver, who along with my roommate, convinced me to delete what was just a flat out bad chapter.
The following is a dialogue intended to illustrate what I think may be a serious logical flaw in some of the conclusions drawn from the famous Mere Addition Paradox.
EDIT: To make this clearer, the interpretation of the Mere Addition Paradox this post is intended to criticize is the belief that a world consisting of a large population full of lives barely worth living is the optimal world. That is, I am disagreeing with the idea that the best way for a society to use the resources available to it is to create as many lives barely worth living as possible. Several commenters have argued that another interpretation of the Mere Addition Paradox is that a sufficiently large population with a lower quality of life will always be better than a smaller population with a higher quality of life, even if such a society is far from optimal. I agree that my argument does not necessarily refute this interpretation, but think the other interpretation is common enough that it is worth arguing against.
EDIT: On the advice of some of the commenters I have added a shorter summary of my argument in non-dialogue form at the end. Since it is shorter I do not think it summarizes my argument as completely as the dialogue, but feel free to read it instead if pressed for time.
Bob: Hi, I'm with R&P cable. We're selling premium cable packages to interested customers. We have two packages to start out with that we're sure you'll love. Package A+ offers a larger selection of basic cable channels and costs $50. Package B offers a larger variety of exotic channels for connoisseurs and costs $100. If you buy package A+, however, you'll get a 50% discount on B.
Alice: That's very nice, but looking at the channel selection, I just don't think that it will provide me with enough utilons.
Bob: Utilons? What are those?
Alice: They're the unit I use to measure the utility I get from something. I'm really good at shopping, so if I spend my money on the things I usually spend it on I usually get 1.5 utilons for every dollar I spend. Now, looking at your cable channels, I've calculated that I will get 10 utilons from buying Package A+ and 100 utilons from buying Package B. Obviously the total is 110, significantly less than the 150 utilons I'd get from spending $100 on other things. It's just not a good deal for me.
Bob: You think so? Well it so happens that I've met people like you in the past and have managed to convince them. Let me tell you about something called the "Mere Cable Channel Addition Paradox."
Alice: Alright, I've got time, make your case.
Bob: Imagine that the government is going to give you $50. Sounds like a good thing, right?
Alice: It depends on where it gets the $50 from. What if it defunds a program I think is important?
Bob: Let's say that it would defund a program that you believe is entirely neutral. The harms the program causes are exactly outweighed by the benefits it brings, leaving a net utility of zero.
Alice: I can't think of any program like that, but I'll pretend one exists for the sake of the argument. Yes, defunding it and giving me $50 would be a good thing.
Bob: Okay, now imagine the program's beneficiaries put up a stink, and demand the program be re-instituted. That would be bad for you, right?
Alice: Sure. I'd be out $50 that I could convert into 75 utilons.
Bob: Now imagine that the CEO of R&P Cable Company sleeps with an important senator and arranges a deal. You get the $50, but you have to spend it on Package A+. That would be better than not getting the money at all, right?
Alice: Sure. 10 utilons is better than zero. But getting to spend the $50 however I wanted would be best of all.
Bob: That's not an option in this thought experiment. Now, imagine that after you use the money you received to buy Package A+, you find out that the 50% discount for Package B still applies. You can get it for $50. Good deal, right?
Alice: Again, sure. I'd get 100 utilons for $50. Normally I'd only get 75 utilons.
Bob: Well, there you have it. By a mere addition I have demonstrated that a world where you have bought both Package A+ and Package B is better than one where you have neither. The only difference between the hypothetical world I imagined and the world we live in is that in one you are spending money on cable channels. A mere addition. Yet you have admitted that that world is better than this one. So what are you waiting for? Sign up for Package A+ and Package B!
And that's not all. I can keep adding cable packages to get the same result. The end result of my logic, which I think you'll agree is impeccable, is that you purchase Package Z, a package where you spend all the money other than that you need for bare subsistence on cable television packages.
Alice: That seems like a pretty repugnant conclusion.
Bob: It still follows from the logic. For every world where you are spending your money on whatever you have calculated generates the most utilons there exists another, better world where you are spending all your money on premium cable channels.
Alice: I think I found a flaw in your logic. You didn't perform a "mere addition." The hypothetical world differs from ours in two ways, not one. Namely, in this world the government isn't giving me $50. So your world doesn't just differ from this one in terms of how many cable packages I've bought, it also differs in how much money I have to buy them.
Bob: So can I interest you in a special form of the package? This one is in the form of a legally binding pledge. You pledge that if you ever make an extra $50 in the future you will use it to buy Package A+.
Alice: No. In the scenario you describe the only reason buying Package A+ has any value is that it is impossible to get utility out of that money any other way. If I just get $50 for some reason it's more efficient for me to spend it normally.
Bob: Are you sure? I've convinced a lot of people with my logic.
Alice: Like who?
Bob: Well, there were these two customers named Michael Huemer and Robin Hanson who both accepted my conclusion. They've both mortgaged their homes and started sending as much money to R&P cable as they can.
Alice: There must be some others who haven't.
Bob: Well, there was this guy named Derek Parfit who seemed disturbed by my conclusion, but couldn't refute it. The best he could do is mutter something about how the best things in his life would gradually be lost if he spent all his money on premium cable. I'm working on him though, I think I'll be able to bring him around eventually.
Alice: Funny you should mention Derek Parfit. It so happens that the flaw in your "Mere Cable Channel Addition Paradox" is exactly the same as the flaw in a famous philosophical argument he made, which he called the "Mere Addition Paradox."
Bob: Really? Do tell?
Alice: Parfit posited a population he called "A" which had a moderately large population with large amounts of resources, giving them a very high level of utility per person. Then he added a second population, which was totally isolated from the other population. How they were isolated wasn't important, although Parfit suggested maybe they were on separate continents and can't sail across the ocean or something like that. These people don't have nearly as many resources per person as the other population, so each person's level of utility is lower (their lack of resources is the only reason they have lower utility). However, their lives are still just barely worth living. He called the two populations "A+."
Parfit asked if "A+" was a better world than "A." He thought it was, since the extra people were totally isolated from the original population they weren't hurting anyone over there by existing. And their lives were worth living. Follow me so far?
Bob: I guess I can see the point.
Alice: Next Parfit posited a population called "B," which was the same as A+, except that the two populations had merged together. Maybe they got better at sailing across the ocean, it doesn't really matter how. The people share their resources. The result is that everyone in the original population had their utility lowered, while everyone in the second had it raised.
Parfit asked if population "B" was better than "A+" and argued that it was because it had a greater level of equality and total utility.
Bob: I think I see where this is going. He's going to keep adding more people, isn't he?
Alice: Yep. He kept adding more and more people until he reached population "Z," a vast population where everyone had so few resources that their lives were barely worth living. This, he argued, was a paradox, because he argued that most people would believe that Z is far worse than A, but he had made a convincing argument that it was better.
Bob: Are you sure that sharing their resources like that would lower the standard of living for the original population? Wouldn't there be economies of scale and such that would allow them to provide more utility even with fewer resources per person?
Alice: Please don't fight the hypothetical. We're assuming that it would for the sake of the argument.
Now, Parfit argued that this argument led to the "Repugnant Conclusion," the idea that the best sort of world is one with a large population with lives barely worth living. That confers on people a duty to reproduce as often as possible, even if doing so would lower the quality of their and everyone else's lives.
He claimed that the reason his argument showed this was that he had conducted "mere addition." The populations in his paradox differed in no way other than their size. By merely adding more people he had made the world "better," even if the level of utility per person plummeted. He claimed that "For every population, A, with a high average level of utility there exists another, better population, B, with more people and a lower average level of utility."
Do you see the flaw in Parfit's argument?
Bob: No, and that kind of disturbs me. I have kids, and I agree that creating new people can add utility to the world. But it seems to me that it's also important to enhance the utility of the people who already exist.
Alice: That's right. Normal morality tells us that creating new people with lives worth living and enhancing the utility of people that already exist are both good things to use resources on. Our common sense tells us that we should spend resources on both those things. The disturbing thing about the Mere Addition Paradox is that it seems at first glance to indicate that that's not true, that we should only devote resources to creating more people with barely worthwhile lives. I don't agree with that, of course.
Bob: Neither do I. It seems to me that having a large number of worthwhile lives and a high average utility are both good things and that we should try to increase them both, not just maximize one.
Alice: You're right, of course. But don't say "having a high average utility." Say "use resources to increase the utility of people who already exist."
Bob: What's the difference? They're the same thing, aren't they?
Alice: Not quite. There are other ways to increase average utility than enhancing the utility of existing people. You could kill all the depressed people, for instance. Plus, if there was a world where everyone was tortured 24 hours a day, you could increase average utility by creating some new people who are only tortured 23 hours a day.
Bob: That's insane! Who could possibly be that literal-minded?
Alice: You'd be surprised. The point is, a better way to phrase it is "use resources to increase the utility of people who already exist," not "increase average utility." Of course, that still leaves some stuff out, like the fact that it's probably better to increase everyone's utility equally, rather than focus on just one person. But it doesn't lead to killing depressed people, or creating slightly less tortured people in a Hellworld.
Bob: Okay, so what I'm trying to say is that resources should be used to create people, and to improve people's lives. Also equality is good. And that none of these things should completely eclipse the other, they're each too valuable to maximize just one. So a society that increases all of those values should be considered more efficient at generating value than a society that just maximizes one value. Now that we're done getting our terminology straight, will you tell me what Parfit's mistake was?
Alice: Population "A" and population "A+" differ in two ways, not one. Think about it. Parfit is clear that the extra people in "A+" do not harm the existing people when they are added. That means they do not use any of the original population's resources. So how do they manage to live lives worth living? How are they sustaining themselves?
Bob: They must have their own resources. To use Parfit's example of continents separated by an ocean; each continent must have its own set of resources.
Alice: Exactly. So "A+" differs from "A" both in the size of its population, and the amount of resources it has access to. Parfit was not "merely adding" people to the population. He was also adding resources.
Bob: Aren't you the one who is fighting the hypothetical now?
Alice: I'm not fighting the hypothetical. Fighting the hypothetical consists of challenging the likelihood of the thought experiment happening, or trying to take another option than the ones presented. What I'm doing is challenging the logical coherence of the hypothetical. One of Parfit's unspoken premises is that you need some resources to live a life worth living, so by adding more worthwhile lives he's also implicitly adding resources. If he had just added some extra people to population A without giving them their own continent full of extra resources to live on then "A+" would be worse than "A."
Bob: So the Mere Addition Paradox doesn't confer on us a positive obligation to have as many children as possible, because the amount of resources we have access to doesn't automatically grow with them. I get that. But doesn't it imply that as soon as we get some more resources we have a duty to add some more people whose lives are barely worth living?
Alice: No. Adding lives barely worth living uses the extra resources more efficiently than leaving Parfit's second continent empty for all eternity. But, it's not the most efficient way. Not if you believe that creating new people and enhancing the utility of existing people are both important values.
Let's take population "A+" again. Now imagine that instead of having a population of people with lives barely worth living, the second continent is inhabited by a smaller population with the same very high level of resources and utility per person as the population of the first continent. Call it "A++." Would you say "A++" was better than "A+"?
Bob: Sure, definitely.
Alice: How about a world where the two continents exist, but the second one was never inhabited? The people of the first continent then discover the second one and use its resources to improve their level of utility.
Bob: I'm less sure about that one, but I think it might be better than "A+."
Alice: So what Parfit actually proved was: "For every population, A, with a high average level of utility there exists another, better population, B, with more people, access to more resources and a lower average level of utility."
And I can add my own corollary to that: "For every population, B, there exists another, better population, C, that has the same access to resources as B, but a smaller population and higher average utility."
Bob: Okay, I get it. But how does this relate to my cable TV sales pitch?
Alice: Well, my current situation, where I'm spending my money on normal things is analogous to Parfit's population "A." High utility, and very efficient conversion of resources into utility, but not as many resources. We're assuming, of course, that using resources to both create new people and improve the utility of existing people is more morally efficient than doing just one or the other.
The situation where the government gives me $50 to spend on Package A+ is analogous to Parfit's population A+. I have more resources and more utility. But the resources aren't being converted as efficiently as they could be.
The situation where I take the 50% discount and buy Package B is equivalent to Parfit's population B. It's a better situation than A+, but not the most efficient way to use the money.
The situation where I get the $50 from the government to spend on whatever I want is equivalent to my population C. A world with more access to resources than A, but more efficient conversion of resources to utility than A+ or B.
Bob: So what would a world where the government kept the money be analogous to?
Alice: A world where Parfit's second continent was never settled and remained uninhabited for all eternity, its resources never used by anyone.
Bob: I get it. So the Mere Addition Paradox doesn't prove what Parfit thought it did? We don't have any moral obligation to tile the universe with people whose lives are barely worth living?
Alice: Nope, we don't. It's more morally efficient to use a large percentage of our resources to enhance the lives of those who already exist.
Bob: This sure has been a fun conversation. Would you like to buy a cable package from me? We have some great deals.
My argument is that Parfit’s Mere Addition Paradox doesn’t prove what it seems to. The argument behind the Mere Addition Paradox is that you can make the world a better place by the “mere addition” of extra people, even if their lives are barely worth living. In other words: "For every population, A, with a high average level of utility there exists another, better population, B, with more people and a lower average level of utility." This supposedly leads to the Repugnant Conclusion, the belief that a world full of people whose lives are barely worth living is better than a world with a smaller population where the people lead extremely fulfilled and happy lives.
Parfit demonstrates this by moving from world A, consisting of a population full of people with lots of resources and high average utility, and moving to world A+. World A+ has an addition population of people who are isolated from the original population and not even aware of the other’s existence. The extra people live lives just barely worth living. Parfit argues that A+ is a better world than A because everyone in it has lives worth living, and the additional people aren’t hurting anyone by existing because they are isolated from the original population.
Parfit them moves from World A+ to World B, where the populations are merged and share resources. This lowers the standard of living for the original people and raises it for the newer people. Parfit argues that B must be better than A+, because it has higher total utility and equality. He then keeps adding people until he reaches Z, a world where everyones’ lives are barely worth living and the population is vast. He argues that this is a paradox because most people would agree that Z is not a desirable world compared to A.
I argue that the Mere Addition Paradox is a flawed argument because it does not just add people, it also adds resources. The fact that the extra people in A+ do not harm the original people of A by existing indicates that their population must have a decent amount of resources to live on, even if it is not as many per person as the population of A. For this reason what the Mere Addition Paradox proves is not that you can make the world better by adding extra people, but rather that you can make it better by adding extra people and resources to support them. I use a series of choices about purchasing cable television packages to illustrate this in concrete terms.
I further argue for a theory of population ethics that values both using resources to create lives worth living, and using resources to enhance the utility of already existing people, and considers the best sort of world to be one where neither of these two values totally dominates the other. By this ethical standard A+ might be better than A because it has more people and resources, even if the average level of utility is lower. However, a world with the same amount of resources as A+, but a lower population and the same or higher average utility as A is better than A+.
The main unsatisfying thing about my argument is that while it avoids the Repugnant Conclusion in most cases, it might still lead to it, or something close to it, in situations where creating new people and getting new resources are, as one commenter noted, a “package deal.” In other words, a situation where it is impossible to obtain new resources without creating some new people whose utility levels are below average. However, even in this case, my argument holds that the best world of all is one where it would be possible to obtain the resources without creating new people, or creating a smaller number of people with higher utility.
In other words, the Mere Addition Paradox does not prove that: "For every population, A, with a high average level of utility there exists another, better population, B, with more people and a lower average level of utility." Instead what the Mere Addition Paradox seems to demonstrate is that: "For every population, A, with a high average level of utility there exists another, better population, B, with more people, access to more resources and a lower average level of utility." Furthermore, my own argument demonstrates that: "For every population, B, there exists another, better population, C, which has the same access to resources as B, but a smaller population and higher average utility."
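The last two quoted claims can be made concrete with a small sketch in Python. Everything numeric here is invented for illustration: the population sizes, the resource totals, and especially the rule that average utility is simply resources per person are assumptions of the sketch, not something Parfit or the argument above specifies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class World:
    name: str
    people: int
    resources: int

    @property
    def average_utility(self) -> float:
        # Illustrative assumption only: utility per person equals resources per person.
        return self.resources / self.people


# Hypothetical numbers chosen to match the three quoted claims.
A = World("A", people=100, resources=1000)
B = World("B", people=200, resources=1400)  # more people, more resources, lower average
C = World("C", people=120, resources=1400)  # same resources as B, fewer people

for w in (A, B, C):
    print(f"{w.name}: people={w.people}, resources={w.resources}, "
          f"average utility={w.average_utility:.2f}")
```

With these made-up figures, B differs from A in exactly the way the second quoted claim describes (more people, more resources, lower average), and C differs from B in the way the third describes (same resources, fewer people, higher average). Whether B is actually better than A, or C better than B, is the ethical question the essay argues about; the code only shows that the three descriptions are mutually consistent.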
When I first learned about Friendly AI, I assumed it was mostly a programming problem. As it turns out, it's actually mostly a math problem. That's because most of the theory behind self-reference, decision theory, and general AI techniques hasn't been formalized and solved yet. Thus, when people ask me what they should study in order to work on Friendliness theory, I say "Go study math and theoretical computer science."
But that's not specific enough. Should aspiring Friendliness researchers study continuous or discrete math? Imperative or functional programming? Topology? Linear algebra? Ring theory?
I do, in fact, have specific recommendations for which subjects Friendliness researchers should study. And so I worked with a few of my best interns at MIRI to provide recommendations below:
- University courses. We carefully hand-picked courses on these subjects from four leading universities — but we aren't omniscient! If you're at one of these schools and can give us feedback on the exact courses we've recommended, please do so.
- Online courses. We also linked to online courses, for the majority of you who aren't able to attend one of the four universities whose course catalogs we dug into. Feedback on these online courses is also welcome; we've only taken a few of them.
- Textbooks. We have read nearly all the textbooks recommended below, along with many of their competitors. If you're a strongly motivated autodidact, you could learn these subjects by diving into the books on your own and doing the exercises.
Have you already taken most of the subjects below? If so, and you're interested in Friendliness research, then you should definitely contact me or our project manager Malo Bourgon (firstname.lastname@example.org). You might not feel all that special when you're in a top-notch math program surrounded by people who are as smart as or smarter than you are, but here's the deal: we rarely get contacted by aspiring Friendliness researchers who are familiar with most of the material below. If you are, then you are special and we want to talk to you.
Not everyone cares about Friendly AI, and not everyone who cares about Friendly AI should be a researcher. But if you do care and you might want to help with Friendliness research one day, we recommend you consume the subjects below. Please contact me or Malo if you need further guidance. Or when you're ready to come work for us.
If you're endeavoring to build a mind, why not start by studying your own? It turns out we know quite a bit: human minds are massively parallel, highly redundant, and although parts of the cortex and neocortex seem remarkably uniform, there are definitely dozens of special purpose modules in there too. Know the basic details of how the only existing general purpose intelligence currently functions.
While cognitive science will tell you all the wonderful things we know about the immense, parallel nature of the brain, there's also the other side of the coin. Evolution designed our brains to be optimized at doing rapid thought operations that work in 100 steps or less. Your brain is going to make stuff up to cover up that it's mostly cutting corners. These errors don't feel like errors from the inside, so you'll have to learn how to patch the ones you can and then move on.
There are two major branches of programming: Functional and Imperative. Unfortunately, most programmers only learn imperative programming languages (like C++ or Python). I say unfortunately, because these languages achieve all their power through what programmers call "side effects". The major downside for us is that this means they can't be efficiently machine checked for safety or correctness. The first self-modifying AIs will hopefully be written in functional programming languages, so learn something useful like Haskell or Scheme.
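For readers who haven't met the term, here is a minimal sketch of what a "side effect" looks like. It is written in Python purely for familiarity (Python is not a functional language), and the function names are made up for this example.

```python
from typing import List, Tuple


def append_total_impure(values: List[int]) -> None:
    """Imperative style: mutates its argument, which is a side effect."""
    values.append(sum(values))


def append_total_pure(values: Tuple[int, ...]) -> Tuple[int, ...]:
    """Functional style: returns a new tuple and leaves the input untouched."""
    return values + (sum(values),)


data = [1, 2, 3]
append_total_impure(data)
append_total_impure(data)  # same call, but the result depends on hidden history
print(data)

frozen = (1, 2, 3)
first = append_total_pure(frozen)
second = append_total_pure(frozen)  # same input, same output, every time
print(first, second, frozen)
```

The pure version can be reasoned about from its inputs alone, while the impure version can only be understood by tracking what has already happened to `data`. That difference is one reason side-effect-free code tends to be easier to check mechanically.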
Much like programming, there are two major branches of mathematics as well: Discrete and continuous. It turns out a lot of physics and all of modern computation is actually discrete. And although continuous approximations have occasionally yielded useful results, sometimes you just need to calculate it the discrete way. Unfortunately, most engineers squander the majority of their academic careers studying higher and higher forms of calculus and other continuous mathematics. If you care about AI, study discrete math so you can understand computation and not just electricity.
Linear algebra is the foundation of quantum physics and a huge amount of probability theory. It even shows up in analyses of things like neural networks. You can't possibly get by in machine learning (later) without speaking linear algebra. So learn it early in your scholastic career.
Like learning how to read in mathematics. But instead of building up letters into words, you'll be building up axioms into theorems. This will introduce you to the program of using axioms to capture intuition, finding problems with the axioms, and fixing them.
The mathematical equivalent of building words into sentences. Essential for the mathematics of self-modification. And even though Sherlock Holmes and other popular depictions make it look like magic, it's just lawful formulas all the way down.
Like building sentences into paragraphs. Algorithms are the recipes of thought. One of the more amazing things about algorithm design is that it's often possible to tell how long a process will take to solve a problem before you actually run the process to check it. Learning how to design efficient algorithms like this will be a foundational skill for anyone programming an entire AI, since AIs will be built entirely out of collections of algorithms.
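As a small illustration of predicting running time before running anything, the sketch below (in Python, with made-up helper names) counts the steps taken by linear search and binary search and compares them with bounds worked out in advance. The bounds used are the standard worst-case ones: n steps for linear search and floor(log2(n)) + 1 steps for binary search over n sorted items.

```python
import math


def linear_search_steps(items, target):
    """Count how many elements are examined before the target is found."""
    steps = 0
    for x in items:
        steps += 1
        if x == target:
            break
    return steps


def binary_search_steps(items, target):
    """Count loop iterations of a standard binary search over sorted items."""
    lo, hi, steps = 0, len(items) - 1, 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if items[mid] == target:
            break
        if items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps


n = 100_000
items = range(n)  # already sorted and indexable

# Predictions made before running either search.
predicted_linear = n
predicted_binary = math.floor(math.log2(n)) + 1

print("linear:", linear_search_steps(items, n - 1), "predicted at most", predicted_linear)
print("binary:", binary_search_steps(items, n - 1), "predicted at most", predicted_binary)
```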
There are ways to systematically design algorithms that only get things slightly wrong when the input data has tiny errors. And then there's programs written by amateur programmers who don't take this class. Most programmers will skip this course because it's not required. But for us, getting the right answer is very much required.
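A tiny, related illustration in Python: this one concerns rounding error building up inside a computation rather than errors in the input data, so treat it as a neighbouring example and not as the exact topic described above. Adding 0.1 ten times with a plain loop does not give exactly 1.0 in binary floating point, while the standard library's `math.fsum`, which tracks the lost low-order bits, does. The helper name `naive_sum` is made up for this sketch.

```python
import math


def naive_sum(xs):
    """Add floats one at a time, letting rounding error accumulate."""
    total = 0.0
    for x in xs:
        total += x
    return total


values = [0.1] * 10

print(naive_sum(values) == 1.0)  # False: the accumulated rounding error shows
print(math.fsum(values) == 1.0)  # True: fsum compensates for lost precision
```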
This is where you get to study computing at its most theoretical. Learn about the Church-Turing thesis, the universal nature and applicability of computation, and how just like AIs, everything else is algorithms... all the way down.
It turns out that our universe doesn't run on Turing Machines, but on quantum physics. And something called BQP is the class of algorithms that are actually efficiently computable in our universe. Studying the efficiency of algorithms relative to classical computers is useful if you're programming something that only needs to work today. But if you need to know what is efficiently computable in our universe (at the limit) from a theoretical perspective, quantum computing is the only way to understand that.
There's a good chance that the first true AIs will have at least some algorithms that are inefficient. So they'll need as much processing power as we can throw at them. And there's every reason to believe that they'll be run on parallel architectures. There are a ton of issues that come up when you switch from assuming sequential instruction ordering to parallel processing. There's threading, deadlocks, message passing, etc. The good part about this course is that most of the problems are pinned down and solved: You're just learning the practice of something that you'll need to use as a tool, but won't need to extend much (if any).
Remember how I told you to learn functional programming way back at the beginning? Now that you wrote your code in functional style, you'll be able to do automated and interactive theorem proving on it to help verify that your code matches your specs. Errors don't make programs better and all large programs that aren't formally verified are reliably *full* of errors. Experts who have thought about the problem for more than 5 minutes agree that incorrectly designed AI could cause disasters, so world-class caution is advisable.
Life is uncertain and AIs will handle that uncertainty using probabilities. Also, probability is the foundation of the modern concept of rationality and the modern field of machine learning. Probability theory has the same foundational status in AI that logic has in mathematics. Everything else is built on top of probability.
Now that you've learned how to calculate probabilities, how do you combine and compare all the probabilistic data you have? Like many choices before, there is a dominant paradigm (frequentism) and a minority paradigm (Bayesianism). If you learn the wrong method here, you're deviating from a knowably correct framework for integrating degrees of belief about new information and embracing a cadre of special purpose, ad-hoc statistical solutions that often break silently and without warning. Also, quite embarrassingly, frequentism's ability to get things right is bounded by how well it later turns out to have agreed with Bayesian methods anyway. Why not just do the correct thing from the beginning and not have your lunch eaten by Bayesians every time you and they disagree?
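For a concrete picture of "integrating degrees of belief about new information", here is a minimal sketch of a single Bayes' theorem update in Python. The function name and all the numbers (a 1% prior, evidence that shows up 90% of the time when the hypothesis is true and 10% of the time when it is false) are invented for the example.

```python
from fractions import Fraction


def bayes_update(prior, likelihood_if_true, likelihood_if_false):
    """Return P(hypothesis | evidence) via Bayes' theorem."""
    numerator = prior * likelihood_if_true
    evidence = numerator + (1 - prior) * likelihood_if_false
    return numerator / evidence


# Hypothetical numbers, kept as exact fractions to avoid rounding noise.
prior = Fraction(1, 100)
posterior = bayes_update(prior, Fraction(9, 10), Fraction(1, 10))
print(posterior)  # 1/12

# Seeing the same kind of evidence again: yesterday's posterior is today's prior.
posterior_2 = bayes_update(posterior, Fraction(9, 10), Fraction(1, 10))
print(posterior_2)  # 9/20
```

The same three-line function handles every new piece of evidence, which is the sense in which the Bayesian framework is a single method and not a collection of special-purpose tools.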
No more applied probability: Here be theory! Deep theories of probabilities are something you're going to have to extend to help build up the field of AI one day. So you actually have to know why all the things you're doing are working inside out.
Now that you chose the right branch of math, the right kind of statistics, and the right programming paradigm, you're prepared to study machine learning (aka statistical learning theory). There are lots of algorithms that leverage probabilistic inference. Here you'll start learning techniques like clustering, mixture models, and other things that cache out as precise, technical definitions of concepts that normally have rather confused or confusing English definitions.
We made it! We're finally doing some AI work! Doing logical inference, heuristic development, and other techniques will leverage all the stuff you just learned in machine learning. While modern, mainstream AI has many useful techniques to offer you, the authors will tell you outright that, "the princess is in another castle". Or rather, there isn't a princess of general AI algorithms anywhere -- not yet. We're gonna have to go back to mathematics and build our own methods ourselves.
Probably the most celebrated results in mathematics are the negative results by Kurt Goedel: No finite set of axioms can allow all arithmetic statements to be decided as either true or false... and no set of self-referential axioms can even "believe" in its own consistency. Well, that's a darn shame, because recursively self-improving AI is going to need to side-step these theorems. Eventually, someone will unlock the key to overcoming this difficulty with self-reference, and if you want to help us do it, this course is part of the training ground.
Working within a framework of mathematics is great. Working above mathematics -- on mathematics -- with mathematics, is what this course is about. This would seem to be the most obvious first step to overcoming incompleteness somehow. Problem is, it's definitely not the whole answer. But it would be surprising if there were no clues here at all.
One day, when someone does side-step self-reference problems enough to program a recursively self-improving AI, the guy sitting next to her who glances at the solution will go "Gosh, that's a nice bit of Model Theory you got there!"
Category theory is the precise way that you check if structures in one branch of math represent the same structures somewhere else. It's a remarkable field of meta-mathematics that nearly no one knows... and it could hold the keys to importing useful tools to help solve dilemmas in self-reference, truth, and consistency.
Highly recommended book of light, enjoyable reading that predictably inspires people to realize FAI is an important problem AND that they should probably do something about that.
A good primer on xrisks and why they might matter. SPOILER ALERT: They matter.
Rationality: the indispensable art of non-self-destruction! There are manifold ways you can fail at life... especially since your brain is made out of broken, undocumented spaghetti code. You should learn more about this ASAP. That goes double if you want to build AIs.
A surprisingly thoughtful book on decision theory and other paradoxes in physics and math that can be dissolved. Reading this book is 100% better than continuing to go through your life with a hazy understanding of how important things like free will, choice, and meaning actually work.
MIRI has already published 30+ research papers that can help orient future Friendliness researchers. The work is pretty fascinating and readily accessible for people interested in the subject. For example: How do different proposals for value aggregation and extrapolation work out? What are the likely outcomes of different intelligence explosion scenarios? Which ethical theories are fit for use by an FAI? What improvements can be made to modern decision theories to stop them from diverging from winning strategies? When will AI arrive? Do AIs deserve moral consideration? Even though most of your work will be more technical than this, you can still gain a lot of shared background knowledge and more clearly see where the broad problem space is located.
A useful book on "optimal" AI that gives a reasonable formalism for studying how the most powerful classes of AIs would behave under conservative safety design scenarios (i.e., lots and lots of reasoning ability).
I intended Leveling Up in Rationality to communicate this:
Despite worries that extreme rationality isn't that great, I think there's reason to hope that it can be great if some other causal factors are flipped the right way (e.g. mastery over akrasia). Here are some detailed examples I can share because they're from my own life...
But some people seem to have read it and heard this instead:
I'm super-awesome. Don't you wish you were more like me? Yay rationality!
This failure (on my part) fits into a larger pattern of the Singularity Institute seeming too arrogant and (perhaps) being too arrogant. As one friend recently told me:
At least among Caltech undergrads and academic mathematicians, it's taboo to toot your own horn. In these worlds, one's achievements speak for themselves, so whether one is a Fields Medalist or a failure, one gains status purely passively, and must appear not to care about being smart or accomplished. I think because you and Eliezer don't have formal technical training, you don't instinctively grasp this taboo. Thus Eliezer's claim of world-class mathematical ability, in combination with his lack of technical publications, make it hard for a mathematician to take him seriously, because his social stance doesn't pattern-match to anything good. Eliezer's arrogance as evidence of technical cluelessness, was one of the reasons I didn't donate until I met [someone at SI in person]. So for instance, your boast that at SI discussions "everyone at the table knows and applies an insane amount of all the major sciences" would make any Caltech undergrad roll their eyes; your standard of an "insane amount" seems to be relative to the general population, not relative to actual scientists. And posting a list of powers you've acquired doesn't make anyone any more impressed than they already were, and isn't a high-status move.
So, I have a few questions:
- What are the most egregious examples of SI's arrogance?
- On which subjects and in which ways is SI too arrogant? Are there subjects and ways in which SI isn't arrogant enough?
- What should SI do about this?
This is meant as an open discussion thread someplace where I won't censor anything (and in fact can't censor anything, since I don't have mod permissions on this subreddit), in a location where comments aren't going to show up unsolicited in anyone's feed (which is why we're not doing this locally on LW). If I'm wrong about this - i.e. if there's some reason that Reddit LW followers are going to see comments without choosing to click on the post - please let me know and I'll retract the thread and try to find some other forum.
I have been deleting a lot of comments from (self-confessed and publicly designated) trolls recently, most notably Dmytry aka private-messaging and Peterdjones, and I can understand that this disturbs some people. I also know that having an uncensored thread somewhere else is probably not your ideal solution. But I am doing my best to balance considerations, and I hope that having threads like these is, if not your perfect solution, then something that you at least regard as better than nothing.
I thought this video was a really good question dissolving by Richard Feynman. But it's in 240p! Nobody likes watching 240p videos. So I transcribed it. (Edit: That was in jest. The real reasons are because I thought I could get more exposure this way, and because a lot of people appreciate transcripts. Also, Paul Graham speculates that the written word is universally superior to the spoken word for the purpose of ideas.) I was going to post it as a rationality quote, but the transcript was sufficiently long that I think it warrants a discussion post instead.
Here you go:
tl;dr: My grandpa died, and I gave a eulogy with a mildly anti-deathist message, in a Catholic funeral service that was mostly pretty disagreeable.
I'm a little uncomfortable writing this post, because it's very personal, and I'm not exactly a regular with friends here. But I need to get it out, and I don't know any other place to put it.
My grandfather (one of two) died last week, and there was a funeral mass (Catholic) today. Although a ‘pro-life’ organisation, the Roman Catholic Church has a very deathist funeral liturgy. It wasn't just ‘Stanley has gone on to a better place’, and all that; the priest had the gall to say that Grandpa had probably done everything that he wanted to do in life, so it was OK for him to die now. I know from discussions with my mother and my aunt that Grandpa did not want to die now; although his life and health were not what they used to be, he was happy to live. Yes, he had gone to his great-granddaughter's second birthday party, but he wanted to go to her third, and that will never happen.
There are four of us grandchildren, two (not including me) with spouses. At first, it was suggested that each of us six say one of the Prayers of the Faithful (which are flexible). Mom thought that I might find one that I was willing to recite, so I looked them up online. It wasn't so bad that they end with ‘We pray to the Lord.’ recited by the congregation; I would normally remain silent during that, but I decided that I could say it, and even lead others in saying it, pro forma. And I could endorse the content of some (at least #6 from that list) with some moderate edits. But overall, the whole thing was very disturbing to me. (I had to read HPMoR 45 afterwards to get rid of the bad taste.) I told Mom ‘This is a part of the Mass where I would normally remain in respectful silence.’, and she apologised for ‘put[ting] [me] in an uncomfortable position’ (to quote from our text messages). In the end, the two grandchildren-in-law were assigned to say these prayers.
But we grandchildren still had a place in the programme; we would give eulogies. So I had to think about what to say. I was never close to Grandpa; I loved him well enough, but we didn't have much in common. I tried to think about what I remembered about him and what I would want to tell people about him. It was a little overwhelming; in the end, I read my sibling's notes and decided to discuss only what she did not plan to discuss, and that narrowed it down enough. So then I knew what I wanted to say about Grandpa.
But I wanted to say something more. I wanted to say something to counter the idea that Grandpa's death was OK. I didn't yet know how appalling the priest's sermon would be, but I knew that there would be a lot of excuses made for death. I wanted to preach ‘Grandpa should not have died.’ and go on from there, but I knew that this would be disturbing to people who wanted comfort from their grief, and a lecture on death would not really be a eulogy. Still, I wanted to say something.
(I also didn't want to say anything that could be interpreted as critical of the decision to remove life support. I wasn't consulted on that decision, but under the circumstances, I agree with it. As far as I'm concerned, he was killed on Monday, even though he didn't finally die until Wednesday. In the same conversation in which Mom and I talked about how Grandpa wanted to live, we talked about how he didn't want to live under the circumstances under which he was living on Tuesday, conditions which his doctors expected would never improve. Pulling the plug was the best option available in a bad situation.)
Enough background; here is my eulogy. Some of this is paraphrase, since my written notes were only an outline.
When I was young, we would visit my grandparents every year, for Thanksgiving or Christmas. Grandma and Grandpa would greet us at the door with hugs and kisses. The first thing that I remember about their place was the candy. Although I didn't realise it at the time, they didn't eat it; it was there as a gift for us kids.
Later I noticed the books that they had, on all topics: religion, history, humour, science fiction, technical material. Most of it was older than I was used to reading, and I found it fascinating. All of this was open to me, and sometimes I would ask Grandpa about some of it; but mostly I just read his books, and to a large extent, this was his influence on me.
Grandpa was a chemical engineer, although he was retired by the time I was able to appreciate that, and this explains the technical material, and to some extent the science fiction. Even that science fiction mostly took death for granted; but Grandpa was with us as long as he was because of the chemists and other people who studied medicine and the arts of healing. They helped him to stay healthy and happy until the heart attack that ended his life.
So, I thank them for what they did for Grandpa, and I wish them success in their future work, to help other people live longer and better, until we never have to go through this again.
I was working on this until the ceremony began, and I even edited it a little in the pew. I wasn't sure until I got up to the podium how strong to make the ending. Ultimately, I said something that could be interpreted as a reference to the Second Coming, but Catholics are not big on that, and my family knows that I don't believe in it. So I don't know how the church officials and Grandpa's personal friends interpreted it, but it could only mean transhumanism to my family.
Nobody said anything, positive or negative, afterwards. Well, a couple of people said that my eulogy was well done; but without specifics, it sounded like they were just trying to make me feel good, to comfort my grief. After my speech, the other three grandchildren went, and then the priest said more pleasant falsehoods, and then it was over.
Goodbye, Grandpa. I wish that you were alive and happy in Heaven, but at least you were alive and happy here on Earth for a while. I'll miss you.
[Edit: Fix my cousin's age.]
This is a first draft. Over the next few days I'll add citations and that sort of thing, but I'm posting it as-is in order to solicit feedback. Also, I wasn't able to find any specific policy regarding mention of illicit substances, so I'm going to assume this is okay, but if not please let me know.
Disclaimer: This is a work of postmodern fiction about two irredeemable junkies named Alice and Bob and their cat Fido. The views contained herein are not medical or legal advice, they are not my views, and they are not the views of LessWrong.com or any of its members. In fact they are not views at all: they are transnarrative flows in alterity-space, or that's what my lit prof tells me. I do not condone any illegal activity whatsoever, except jaywalking.
Today, says Alice, I'm going to talk to you about drugs. I'll be covering several nutritional supplements, some stimulants and nootropics, and - as some of you have probably guessed - I'll also be talking briefly about recreational drugs, particularly psychedelics. Now, I don't have any sense of what the popular perception is of drugs around here, but I presume that at least some of you will be a little put off at the suggestion of recreational drug use. If that is how you feel, please bear with me. To partake or not partake of prohibited substances is a choice that must be made individually, and for many the payoffs may not be worth the risks; but I hope to convince you that, at least for some people, responsible drug use is a very reasonable and beneficial activity.
Well, hold on, says Bob - who takes a permissive but detached view of these things - hold on, now. It may be true (as cursory research will show) that drug use is far less dangerous than it's made out to be, and it may be true that some people get a lot of enjoyment out of them. But if you value knowledge and reason over hedonic pleasure, it seems better to cut them out entirely. After all, it's your brain on the line if anything goes wrong!
As a matter of fact, says Alice, drugs are good for more than just hedonism. First of all, they give you a handle on your own neurochemistry. It's unlikely that your brain is optimally tuned for the things you want to accomplish, so if you can tweak it the right way, you might be able to improve your functioning. In extreme cases, you might have chronic imbalances leading to depression, mania, etc., in which case you'll probably want to talk to a doctor about medication; but the ability to use drugs to change yourself goes well beyond this. For example, judicious use of MDMA can help you retrain your social reflexes and become more outgoing and sociable. Having this handle also allows you to begin to experimentally correlate your subjective experience with the physical processes to which they correspond, and by carefully observing more unusual states of consciousness, you broaden your understanding of the mind and how it operates. Lastly, psychedelics can sometimes help you understand things differently or more deeply. I've often found my mathematical ability improved by moderate doses of LSD, for example. So, even for someone concerned primarily with rationality and the accumulation and application of knowledge, drugs are at least worth considering.
And what of the risks? says Bob.
I was getting to that, says Alice. There will always be a risk/benefit tradeoff, but the risks can be minimized through careful and responsible use:
- Thoroughly research every new drug before trying it. Unfortunately, in the case of prohibited substances, very little good clinical research has been done (this has started to change in recent years, but there are still vast swaths of uncharted territory). Nevertheless, there's good information to be had. For drugs used recreationally, I usually start at Erowid.org, which provides an overall summary of the effects of a wide variety of drugs; academic citations and sometimes full articles, if there are any; "trip reports" (anecdotal evidence is better than nothing, especially if there's a lot of it); and other useful information. For nootropics and "smart drugs", I usually just start with a Google search and/or Wikipedia.
- Pay particular attention to addictive potential, toxicity and contraindications. Drugs with high addictive potential require extra caution, and should perhaps be avoided by people with akrasia problems. Also be mindful of any history of addiction you may have in your family. Regarding contraindications: besides drug interactions, a lot of this is just common sense. If you are prone to anxiety, you should probably avoid amphetamines. As far as toxicity goes, a good number to look at is the therapeutic index, which is the ratio of the LD50 (the dose, per kilogram of body weight, at which 50% of experimental test subjects (usually rodents) die) to the effective dose (per kilogram of body weight). However, keep in mind also that frequent or heavy drug use can tax the liver, and that otherwise safe chemicals may build up to toxic levels over time.
- If you decide to take a drug known to be addictive, take it in moderate quantities over brief periods of time, well-separated from each other. This is not a hard and fast rule: under a doctor's supervision, for example, you may choose to take prescribed medication every day. You should recognize, however, that this comes at a cost: antidepressants can be used to pull your life together and overcome depression, but it's going to be nasty coming off them. Finally, as a rule of thumb, oral ingestion is significantly less addictive than smoking, insufflation or injection, since this gives a gradual and delayed onset of the reward stimulus. For the same reason, you can further reduce your chances of becoming addicted by taking prodrugs wherever possible (e.g. Vyvanse instead of Dexedrine).
- Always take a low dose first, in case you react badly, and do so around other people who know what you are taking.
- To the greatest extent possible, maintain an open and honest relationship with your doctor, who is in a position to help you minimize the health risks associated with your drug-taking.
- If you decide to seek out prohibited substances, it's important to have a good source of high-quality product. Street drugs may be cut with cheap substitutes or contaminated with solvents used in extraction/synthesis, or they may simply not be what they are claimed to be. Go to people you trust who already do drugs on a regular basis, and ask them for help finding a reputable dealer.
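The therapeutic index from the toxicity item in the list above can be written out explicitly. This is a sketch of the usual textbook form, in which the "effective dose" is taken to be the median effective dose, ED50 (an assumption; the list only says "effective dose"):

```latex
\mathrm{TI} = \frac{\mathrm{LD}_{50}}{\mathrm{ED}_{50}}
```

A larger ratio means a wider margin between a dose that works and a dose that kills; a ratio close to 1 means there is very little margin at all. Since the LD50 usually comes from rodent studies, the number is a rough guide rather than a guarantee.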
With that out of the way, continues Alice, let's start simple: what is a drug? For our purposes, we'll say a drug is any substance consumed for reasons other than its nutritive value or the sensory experience of consumption. We'll specifically be focusing on psychoactive drugs, which are consumed for their effects on the mind. Note that just about anything you eat or drink is potentially a psychoactive drug, and you may not have to turn to outlandish synthetic compounds to alter your neurochemistry. For example: after three years of vegetarianism, I gradually began to develop chronic anxiety, with occasional panic attacks. It plateaued at a (barely) manageable level, so I never ended up seeking medical help; it took two years before I thought to try eating meat again. When I finally did, the anxiety immediately vanished and has not returned. So, for me, meat is a psychoactive drug. In fact, let's talk about nutritional supplements first.
Supplements and Neurotransmitters
The first group of drugs we're going to be looking at are neurotransmitters, and their chemical precursors, which can be found at health food stores. First, there are 5-HTP and tryptophan, which are serotonin precursors. There is some evidence that these can help treat depression, improve quality of sleep, and improve your mood, but since you need to take them for a few days before you start to notice the effect, it might be hard to tell whether they are actually doing anything for you.
Next, consider phenylalanine, an amino acid which serves as a precursor to dopamine, norepinephrine and adrenaline. Phenylalanine is first metabolized into tyrosine, which is also available as a dietary supplement. Research seems to suggest that these are mainly effective only for people under conditions of physical, emotional or mental stress, and don't do much for the general population. I've found that, in fact, L-phenylalanine has a noticeable uplifting effect on my mood within a short time of taking it; but maybe this just says something about how much stress I'm under.
Lastly, there's GABA, a neurotransmitter which has an inhibitory effect on the dopamine system and certain other neurotransmitters. In short, this will calm you down right quick, which makes it useful for dealing with intense and uncontrollable emotions - anxiety, grief, rage, etc. I find that, for this purpose, theanine is even better: it promotes GABA production and alpha brainwave activity, and also seems to increase dopamine levels. Its calming effect is very similar to that of GABA, but I find it much less likely to leave me feeling tired and out of it: if anything, it seems to have a mildly stimulating effect. As an added bonus, theanine appears to boost the immune system. Theanine synergizes well with caffeine, which we'll cover shortly.
All of the above - 5-HTP, tryptophan, phenylalanine, tyrosine, theanine and GABA - are not only useful for regulating your mood, but also for learning what your neurochemistry feels like from the inside. I found it edifying to take fairly large doses of 5-HTP (or phenylalanine, etc.) every day for a couple weeks, stop for a few weeks, go back on for a couple weeks, stop, etc. - all the while noting changes in my mood and perception. In that respect, melatonin tablets can be added to this list: they're not really going to make you a more effective rationalist, but they will teach you what melatonin does to your cognition. Melatonin will also be useful if you're taking stimulants, which might otherwise interfere with your sleep patterns.
It is also worth mentioning that vitamin deficiencies (or excesses) can have a significant impact on mood and cognitive functioning. I recommend taking multivitamins; this need not be a daily regimen if you have a healthy diet, just kind of take them when you remember to. Women should look for multivitamins with iron, and men should look for those without.
Next, let's talk about stimulants, starting with caffeine - by far the most popular, although by no means the most effective. Caffeine works by blocking the activity of adenosine, an inhibitory neurotransmitter that plays a role in sleep and drowsiness. As a result, neural activity goes up, accompanied by a kick to the sympathetic nervous system and an increase in blood sugar levels. Taken on a fairly regular daily schedule, caffeine seems to improve my attention, motivation and energy level. In the long term, there appear to be health benefits from drinking coffee in this way: in addition to its stimulating effects, it appears to help prevent heart disease, Alzheimer's disease and Parkinson's disease, among others. For all-nighters, though, caffeine is an inferior choice: although it suffices to keep you up and running, it doesn't seem to do much to mitigate the cognitive effects of sleep deprivation. Also, as increasing amounts are consumed, a variety of unpleasant side-effects begin to appear, including tremors, heart palpitations, anxiety, diarrhea, and dehydration. It should also be noted that caffeine builds tolerance, and the withdrawal is rather unpleasant. Despite this, it seems to make sense to take coffee every day in moderation, unless you are especially sensitive to its negative effects.
Caffeine can also be had in tea (green tea, in particular, also contains theanine, as we discussed), in chocolate, preferably dark (chocolate also contains a number of other psychoactive compounds, including phenethylamine and theobromine), in caffeine pills and in energy drinks. It is perhaps worth mentioning that I find that the energy drinks and energy shots containing other medicinal ingredients (phenylalanine, taurine, B vitamins, etc.) really do seem to be slightly better, minimizing the unpleasant side effects and smoothing out the crash. Still, I try to avoid these because of the sugar and/or artificial sweetener content. It is unclear exactly what effect each of these other medicinal ingredients has individually, if any at all, so you should also be warned that you are probably buying some nonsense along with your actually mind-altering compounds.
Next are amphetamines, which act on the serotonin, norepinephrine and especially dopamine systems, causing increased focus, improved cognitive ability, and elevated energy levels. They also mitigate some of the effects of sleep deprivation, although your cognitive performance will still suffer. While amphetamines can greatly improve your productivity if used correctly, they can also easily do the opposite, because it's just as easy to hyperfocus on video games as it is to hyperfocus on neural network algorithms. Body tics and bad habits can also get strongly reinforced, since your reward systems are getting pummelled by dopamine. Basically, if you're doing amphetamines, keep your akrasia-fighting systems on high alert (fortunately this, too, will be aided by the amphetamines). Another downside to amphetamines is that they're quite addictive; take them either in fixed quantities on a regular schedule (if you have a prescription) or else in occasional bursts of no more than a few days, and in moderate quantities.
Besides addiction, a lot of the danger from amphetamines comes from failing to eat and sleep, if you're taking them for more than a day or two. Amphetamines are strong appetite suppressants, and of course they keep you awake, so you'll need to force yourself to eat three square meals a day and get to sleep at a reasonable hour. Sleep is especially important because the longer you stay up, the more amphetamines you have to take to stay awake; if your dose gets high enough, and if you're badly enough sleep deprived, you put yourself at risk of amphetamine psychosis, which is about as much fun as it sounds like.
There are a number of prescription amphetamines on the market, and these are generally to be preferred to street speed, due to their purity and lack of adulterants. It's not terribly hard to get diagnosed with ADD/ADHD, so this can be above-the-board. Dexedrine is pure dextroamphetamine, while Adderall is a mix of dextroamphetamine and racemic salts (which contain a 50%/50% split between dextro- and levoamphetamine). The difference between the two stereoisomers is complicated, and your best bet is to experiment to see which works best for you, but a rule of thumb is that Adderall has more "kick" at lower doses, while Dexedrine is stronger at higher doses. Methamphetamine, you may be surprised to learn, is also prescribed for ADHD, although much more rarely. Meth is stronger, longer-lasting, and significantly more addictive than amphetamine. It is also more neurotoxic. I would recommend exercising extreme caution around this one, or else avoiding it entirely, unless you actually have severe ADHD and this is the only thing that works for you. Lastly, there is a prodrug for dextroamphetamine that just came on the market, called Vyvanse (lisdexamfetamine). This has a much slower onset, since it has to be metabolized into dextroamphetamine, and therefore has significantly less addiction potential.
Ritalin works similarly to amphetamines; apparently it tends to produce less euphoria. Ephedrine is chemically similar to amphetamines, but works primarily by increasing the activity of noradrenaline; it has the advantage of being legally available without a prescription. I mention these only in passing because I don't have much experience with them; if you want to contribute some information about them, please comment!
I also want to briefly mention MDPV (methylenedioxypyrovalerone), an experimental stimulant still legally available in the U.S. and in Canada (sold online as a research chemical or, sometimes, as "envigorating bath salts"). This one acts as a dopamine and norepinephrine reuptake inhibitor, producing a state reminiscent of that caused by amphetamines. It has a quick onset and short lifetime (3-5 hours), which makes it well-suited to accomplishing quick chores. However, there are a number of nasty side effects reported, and it seems to have some addictive potential. Looking at the evidence, it appears that most of the people reporting this sort of thing were insufflating larger doses; so taken orally and in moderation, this one can still be reasonably safe. But there's very little out there that's peer-reviewed, so take a look at some trip reports and proceed with caution.
All of this probably makes it sound like stimulants are, at best, not far off from a zero-sum game: they may benefit cognitive performance, but they come with side effects and addictive potential and a nasty crash when you come off them. Well, good news: we'll be considering Modafinil next, which some are calling the perfect stimulant. Modafinil is sold by prescription only, but its prodrug, Adrafinil, can be purchased online. I've taken the latter on a number of occasions, and have been quite impressed with the results. Adrafinil promotes a state of wakefulness and energy, but without the "edge" that comes with amphetamine use. Even under conditions of sleep deprivation it has a significant positive effect on cognitive functioning, memory retention, focus and motivation. It has no significant crash, produces no tolerance, and seems to have very little addictive potential. You can even sleep on Adrafinil, if you wish; this is a significant advantage over all of the other stimulants we've discussed, since if you take Dexedrine at 9PM and finish your work at 1AM, you're still effectively committed to staying up all night. Adrafinil also seems to have a positive effect on mood; lastly, there are signs it can be used both to prevent and to treat some of the effects of aging on energy level and cognitive ability.
Modafinil and Adrafinil are not perfectly safe, though. They put a fairly heavy load on the liver and kidneys if taken daily, and should therefore be taken only occasionally, unless as part of medically-supervised treatment. There are also occasional side effects to watch out for, most notably skin rashes.
Nootropics are still a very new and experimental class of drugs. Many purported nootropics give negative or inconclusive results in clinical tests, and many of them have not been tested at all, or very little: more specifically, most of the research has focused on using nootropics to treat neural disorders and injuries, or to mitigate the effects of aging, with very little focus on young and healthy individuals. This does not mean, however, that they do nothing beneficial; it only means that you'll have to do some careful experimentation to determine how they affect you, if at all. For our purposes, I'll only be covering some of the more popular and (most importantly) well-studied nootropic drugs.
We'll start with piracetam. The research on piracetam shows positive effects in the prevention and treatment of aphasia, dementia, epilepsy and hypoxic injury, but says almost nothing about its effect on healthy individuals. The general consensus among those who take it daily is that it seems to do something, though it's hard to put a finger on. One friend of mine suggests that it seems to subtly improve the general flow of cognition, making memories and good ideas more available. One significant effect that piracetam seems to have is general potentiation of other drugs, especially stimulants and psychedelics, so if you incorporate piracetam into your daily regimen, you should be extra-careful about trying new drugs.
Another "smart drug" is DHEA, an endogenous chemical with a variety of functions, including inhibition of cortisol. The research shows that it has anti-depressant effects, and seems to improve cognitive functioning under stressful conditions. It also seems to improve episodic memory in young men, but has no such effect in elderly people. You'll note I said "young men", not "young people": the effects of DHEA appear to be asymmetric with respect to gender. In particular, higher levels of endogenous DHEA are correlated with longer lifespan in men, but there is no such correlation in women.
On the life extension angle, another nootropic worth considering is Selegiline, which appears to be available for purchase online, although technically it's not supposed to be sold to anyone without a prescription (possession, on the other hand, is legal). Selegiline is an MAO-B inhibitor, commonly used to treat Parkinson's, depression and dementia. Even for someone without these conditions, Selegiline produces cognitive benefits similar to those of Adrafinil, and there are reports that long-term use might tend to increase your lifespan. Looking at the evidence, I am inclined to take such claims seriously. Since it targets MAO-B specifically, Selegiline is less dangerous than nonspecific MAOIs. However, at higher doses, Selegiline begins to lose its specificity and inhibits MAO-A as well. Women on oral contraception should be especially careful, as birth control pills appear to increase Selegiline's bioavailability, so that MAO-A inhibition may kick in at lower doses. At any rate, use caution with this one, and take lower-than-usual doses of other substances, including foods containing tyrosine and other potentially dangerous monoamines (e.g. chocolate, cheese, wine).
Some other less common nootropics with effects similar to those of the above include Vinpocetine and Hydergine, which function as neuroprotectives and might also improve cognitive functioning. I haven't tried these, and available research is slim, so I can't say much. Beyond the nootropics we've discussed, the field begins to look a little grim. For example, the jury seems to be out on ginkgo biloba: some clinical trials failed to demonstrate any measurable effect on memory or cognition, and others appeared to show short-term benefits. It gets worse, though, than merely ambiguous research. For example, DMAE, once marketed as a life-extension agent, may actually shorten your lifespan. Other nootropics have turned out to be severely toxic, such as Fipexide, which appears to cause liver failure with prolonged use. At any rate, your best bet is probably to stick with established and well-studied drugs; there are whole communities out there perfectly willing to put themselves on the line testing new contenders, and I feel it's best to leave that up to them.
Psychedelics and "recreational" drugs
Now, says Alice, for psychedelics. We should begin with some general remarks. First of all, these are contraindicated for anyone with a family history of, or predisposition to, psychotic disorders. More generally, set and setting (mental state and external circumstances, respectively) are extremely important to having a positive experience. Psychedelics have been described as "nonspecific amplifiers of experience", which means that if you're having a bad day, acid will probably make it worse. Ideally, your first trip should be in a place where you feel safe and inspired, with a few people you trust and who are experienced and knowledgeable about the drug you're taking. You'll probably want to have art books, sketchbooks, good music and someplace comfy to lie or sit.
Your provisos and warnings are all well and good, says Bob, but what do psychedelics actually do?
It's... hard to describe, says Alice, in the same way that the colour red is hard to describe, but I'll give it a shot; just keep in mind this is an incomplete description. First there are the visual effects. These aren't hallucinations, in the sense that you'll recognize them as being effects of the drug. With your eyes open, you'll tend to see colours intensified and altered, and your brain will be having a field day reinterpreting interesting textures (cf. pareidolia and form constants); with your eyes closed, you'll see animated geometric patterns, tessellations, visions, and all kinds of surprisingly interesting stuff. Depictions in popular culture of psychedelic experiences are notoriously bad, but this movie does a reasonably good job. In addition to visual alterations (which can teach you a lot about the functioning of your visual cortex) you might also experience changes in auditory and tactile sensation, as well as synaesthetic crossover between senses.
Even more interesting than the sensory changes are the cognitive effects. It turns out that your sense of a coherent self can be overridden, and you may experience a blurring of the boundary between you and everything else. You may have strange, spontaneous ideas or insights; to some extent this is because the peculiarity of the experience forces you to reexamine tacit assumptions hidden deep in your reality model; these assumptions do not always turn out to be wrong, but it's good to be aware of them and to understand them more deeply. You may also become unusually aware of pathologies in your lifestyle and relationships, and with practice you may be better able to articulate those pathologies than you normally are. Ideas that come to you while high must be carefully examined and tested while sober, of course, but my experience has been that many of them turn out to be genuinely good ideas, and some have even led to significant improvements in my functional relationship to the world.
LSD, in particular, seems well-suited to understanding technical fields, including math and physics. Unlike mushrooms, acid does not significantly impair my ability to read and understand mathematical texts, and the heightened ability to flex my visual cortex allows me to see difficult and abstract constructions quite vividly, as well as to understand on an intuitive level how they work. Mushrooms, conversely, are more likely to present me (somewhat forcefully) with ideas I might never have otherwise considered. These two are the most popular psychedelics, and the two experiences bear a definite family resemblance. LSD is calmer, and easier to control and direct, but it lasts up to twice as long as mushrooms - twelve hours is common. Mushrooms tend to be more emotionally intense: usually this means euphoria and lots of giggling, but occasionally you might be overcome by grief or anger, especially if you're already feeling that way before you dose.
I'm not going to say much about cannabis, because while the experience is certainly interesting, it's probably not going to help most of you think better (there are some, mind you, who actually function better with THC in their systems; Carl Sagan, for example, was a notorious pothead). One reason you might want to take cannabis anyway is that it can serve as a gentle introduction to psychedelia - but be warned that some people, even those who generally enjoy psychedelics, have consistently bad reactions to THC. Proceed with caution. Another reason to take cannabis is for life extension purposes; there's good evidence that THC helps prevent certain kinds of cancer. If you're taking it for health reasons, though, you probably want to use a vapourizer or eat it instead of smoking it. Also note that, at least for some particularly susceptible people, cannabis can be addictive. Again, if you have reason to believe you're prone to substance abuse, you might want to give this one a skip.
Another controlled substance to consider is MDMA. MDMA has a variety of neurochemical effects: it inhibits dopamine and norepinephrine reuptake, actually reverses the serotonin reuptake pump, and also seems to increase levels of oxytocin, the "trust hormone". So, you feel a sense of love, joy, wellbeing, safety, etc. You'll also get many of the same stimulant effects as methamphetamine, albeit milder. I mention MDMA because in my experience, if you already have a decent idea of how social interactions are supposed to work but still have trouble getting over your anxiety, this can help you teach yourself to be more socially confident, if taken in the appropriate environment (hint: do this around other people on MDMA). The surge of oxytocin makes you temporarily fearless about approaching strangers, and also cushions the blow if things go badly, while the increased dopamine activity strongly reinforces behaviours leading to successful interactions. This learned confidence persists into the sober state. MDMA is also useful for confronting emotionally difficult issues - indeed, it was used for psychotherapy before it became popular recreationally and was banned - but I'll leave that to you to research on your own.
Some warnings about MDMA. First of all, most "Ecstasy" contains adulterants (commonly caffeine and methamphetamine and sometimes PCP, among others), and sometimes contains no MDMA at all. As a general rule, avoid pressed pills; pure MDMA most commonly comes in crystal form. If you don't have a reliable source, you might want to skip MDMA entirely. Also note that MDMA commonly causes hangovers, although these can be mitigated by taking 5-HTP.
Lastly, although all the drugs I've mentioned in this section are controlled, you might actually be able to experience very similar altered states legally. Alex and Ann Shulgin, the research chemist and psychotherapist (respectively) who first popularized MDMA, also came up with literally hundreds of other psychoactive compounds, many of which are still legal outside America and can be purchased online from so-called "research chemical" companies (the U.S. has the Analogues Act, which automatically makes illegal any chemical broadly similar to any other illegal chemical, but Canada, among other countries, has not shared this dismal fate). For example, instead of MDMA, you might consider AMT, a tryptamine with similar and in some ways better effects. Another research chemical, 4-ACO-DMT, is actually metabolized into psilocin, as is psilocybin, and so the trip is almost identical to that of mushrooms. The downside to all this is that research chemicals are generally sold only in larger quantities, so if you don't want to drop a couple hundred dollars on something you may not enjoy, this may not be your best bet. There's also the fact that these are not as well-understood as more popular psychedelics, which makes them riskier, although these risks can be minimized by using caution and moderation.
As we've seen, says Alice with a smirk, you too can alter your neurochemistry for fun and profit - but this must be done responsibly. Although I've tried to give a sense of the dangers alongside the benefits, this post is really only meant to serve as a broad introduction. If you're thinking of actually trying any of the drugs I've mentioned, it's important that you do some in-depth research, and a proper cost-benefit analysis. But with a little practice, you too can expand your mind.
New version! I updated the link above to it as well. Added LOADS and LOADS of new content, although I'm not entirely sure if it's actually more fun (my guess is there's more total fun due to variety, but that it's more diluted).
I ended up working on this basically the entire day today, and implemented practically all the ideas I have so far, except for some grammar issues that'd require disproportionately much work. So unless there are loads of suggestions or my brain comes up with lots of new ideas over the next few days, this may be the last version in a while and I may call it beta and ask for spell-check. Still alpha as of writing this though.
Since there were some close calls already, I'll restate this explicitly: It'd be easier for everyone if there weren't any forks for at least a few more days, even ones just for spell-checking. After that, or once I move this to beta, feel more than free to do whatever you want.
Thanks to everyone who commented! ^_^
Credits: http://lesswrong.com/lw/d2w/cards_against_rationality/ , http://lesswrong.com/lw/9ki/shit_rationalists_say/ , various people commenting on this article with suggestions, random people on the bay12 forums who helped me, ages ago, with the engine this is descended from.
I'm worried that LW doesn't have enough good contrarians and skeptics, people who disagree with us or like to find fault in every idea they see, but do so in a way that is often right and can change our minds when they are. I fear that when contrarians/skeptics join us but aren't "good enough", we tend to drive them away instead of improving them.
For example, I know a couple of people who occasionally had interesting ideas that were contrary to the local LW consensus, but were (or appeared to be) too confident in their ideas, both good and bad. Both people ended up being repeatedly downvoted and left our community a few months after they arrived. This must have happened more often than I have noticed (partly evidenced by the large number of comments/posts now marked as written by [deleted], sometimes with whole threads written entirely by deleted accounts). I feel that this is a waste that we should try to prevent (or at least think about how we might). So here are some ideas:
- Try to "fix" them by telling them that they are overconfident and give them hints about how to get LW to take their ideas seriously. Unfortunately, from their perspective such advice must appear to come from someone who is themselves overconfident and wrong, so they're not likely to be very inclined to accept the advice.
- Create a separate section with different social norms, where people are not expected to maintain the "proper" level of confidence and niceness (on pain of being downvoted), and direct overconfident newcomers to it. Perhaps through no-holds-barred debate we can convince them that we're not as crazy and wrong as they thought, and then give them the above-mentioned advice and move them to the main sections.
- Give newcomers some sort of honeymoon period (marked by color-coding of their usernames or something like that), where we ignore their overconfidence and associated social transgressions (or just be extra nice and tolerant towards them), and take their ideas on their own merits. Maybe if they see us take their ideas seriously, that will cause them to reciprocate and take us more seriously when we point out that they may be wrong or overconfident.
OTOH, I don’t think groupthink is a big problem. Criticism by folks like Will Newsome, Vladimir Slepnev and especially Wei Dai is often upvoted. (I upvote almost every comment of Dai or Newsome if I don’t forget it. Dai always makes very good points and Newsome is often wrong but also hilariously funny or just brilliant and right.) Of course, folks like this Dymytry guy are often downvoted, but IMO with good reason.
or: Why Everything Is Terrible, An Overview.1
It sounds like a theory which explains too much. But it's not a theory, hardly even an explanation, more a pattern that manifests itself once you start trying to seriously answer rhetorical questions about the state of the world. From many perspectives, it's obvious to the point of being mundane, practically tautological, but sometimes such obvious facts are worth pointing out regardless.
The idea is this: The subset of participants which rises to prominence in any area does so because its members have traits helpful to becoming prominent, not necessarily because they have traits which are desirable. Thus, without ongoing and concerted effort, a great many arenas end up dominated by players employing strategies which are bad for everyone.
This comes up again and again:
- Why does science (or rather, the publisher-based model thereof) so frequently produce results which are laughably wrong? Because those journals which don't publish retractions or reproductions will more frequently be the first to publish revolutionary results, and so become more widely read and widely cited. Journals don't attract authors by being as accurate as possible; they win by looking important.
- Why do cigarette companies target kids and teens whenever they think they can get away with it, and breed tobacco for maximized nicotine? Because those companies which do will turn more profit and thus last longer and grow faster than those that don't, and so have more resources to devote to proliferating. Companies don't expand by playing fair; they win by making and keeping customers.
- Why is the Make-A-Wish Foundation sitting on more donations than it knows what to do with when the Against Malaria Foundation could have used that money to save literally tens or hundreds of thousands of lives per year? Because knowing how to elicit donations is a skill almost completely unrelated to knowing how to spend donations, and because American children with cancer make for better advertising than African children with malaria. Charities don't get donations by making the best possible use of their money; they win by advertising effectively towards potential donors. (cf. Efficient Charity)
- Why do governments inevitably end up run by career lawyers and politicians instead of scientists and economists[2]? Because polarizing rhetoric and political connections look better than a nuanced, accurate understanding of the issues. There is only finite time for training and practice, and eventually a choice must be made between training in looking good and training in being good. People don't get elected or appointed by being good Bayesians; they win by being popular.
- Why do the big media channels seem to be more concerned with celebrities than science, and spend more time talking about individual murders than they do entire genocides? Because those channels talking about Laci Peterson seem more personal and are thus more watched than those talking about some religious sect in China. Television programming isn't determined by what's important; what wins is what's watched.
- Why is the sex ratio in animals almost always nearly 1:1, when a population with one male for every five females could grow faster and adapt to problems more readily? Because in such a population, or in any population with a sufficiently large sex imbalance, a gene causing a female to have only male children will be vastly overrepresented in the grandchild generation relative to the rest of the population, and so shift the balance closer to 50/50. Genes don't proliferate by being good for the species; they win by being good for themselves. (cf. Evolutionarily stable strategy, evolutionary game theory.)
- Why do most big businesses make use of sweatshop conditions and shady tax dodges? Because the businesses which do so will outperform the businesses which don't. Corporations don't grow by being nice; they win by being profitable.
- Why do so many apparently intelligent people spend hours per day idly browsing the likes of Reddit, Hacker News, or TVTropes (or indeed LW), when a similar dedication to active self-improvement could have made them a master of a field inside of a decade? (Using for back-of-the-envelope's sake the supposition that 10,000 hours of practice are required for mastery of some specific art, we find that three hours per day for ten years is approximately 1.1 masteries.) Because which activities become habitual is determined by their immediate dopamine release, and for intelligent people the act of (say) reading about strategies for becoming an effective entrepreneur makes for more instant dopamine than does the painful daily grind involved in actually becoming an entrepreneur. Activities don't become part of daily life by being useful; they win by tricking your brain into making them feel good.
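The sex-ratio example in the list above can be checked with a little bookkeeping. The following is only a sketch under simplifying assumptions that are not in the original text: every grandchild has exactly one father and one mother, mating is random, and all parents have the same number of children. The function name `all_sons_advantage` is made up for this illustration.

```python
from fractions import Fraction


def all_sons_advantage(males: int, females: int) -> Fraction:
    """Expected grandchildren of a parent who has only sons, divided by the
    expected grandchildren of a typical parent with the same number of
    children, in a generation with the given sex counts."""
    total = males + females
    # Each grandchild has one father and one mother, so the males of this
    # generation jointly get exactly as many offspring as the females do.
    # Normalise that shared total to 1 and split it evenly within each sex.
    per_son = Fraction(1, males)
    per_daughter = Fraction(1, females)
    # A typical parent's child is a son with probability males / total.
    typical_child = (Fraction(males, total) * per_son
                     + Fraction(females, total) * per_daughter)
    return per_son / typical_child


if __name__ == "__main__":
    print(all_sons_advantage(1, 5))  # one male for every five females
    print(all_sons_advantage(1, 1))  # balanced population
```

Under these assumptions, a parent with only sons in the 1:5 population expects three times as many grandchildren as a typical parent, and the advantage vanishes at 1:1, which is the pressure toward a balanced ratio that the bullet describes.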
Although I feel that Nick Bostrom’s new book “Superintelligence” is generally awesome and a well-needed milestone for the field, I do have one quibble: both he and Steve Omohundro appear to be more convinced than I am by the assumption that an AI will naturally tend to retain its goals as it reaches a deeper understanding of the world and of itself. I’ve written a short essay on this issue from my physics perspective, available at http://arxiv.org/pdf/1409.0813.pdf.
give you, some we can't, few have been written up and even fewer in any
well-organized way. Benja or Nate might be able to expound in more detail
while I'm in my seclusion.
Very briefly, though:
The problem of utility functions turning out to be ill-defined in light of
new discoveries of the universe is what Peter de Blanc named an
"ontological crisis" (not necessarily a particularly good name, but it's
what we've been using locally).
The way I would phrase this problem now is that an expected utility
maximizer makes comparisons between quantities that have the type
"expected utility conditional on an action", which means that the AI's
utility function must be something that can assign utility-numbers to the
AI's model of reality, and these numbers must have the further property
that there is some computationally feasible approximation for calculating
expected utilities relative to the AI's probabilistic beliefs. This is a
constraint that rules out the vast majority of all completely chaotic and
uninteresting utility functions, but does not rule out, say, "make lots of
Models also have the property of being Bayes-updated using sensory
information; for the sake of discussion let's also say that models are
about universes that can generate sensory information, so that these
models can be probabilistically falsified or confirmed. Then an
"ontological crisis" occurs when the hypothesis that best fits sensory
information corresponds to a model that the utility function doesn't run
on, or doesn't detect any utility-having objects in. The example of
"immortal souls" is a reasonable one. Suppose we had an AI that had a
naturalistic version of a Solomonoff prior, a language for specifying
universes that could have produced its sensory data. Suppose we tried to
give it a utility function that would look through any given model, detect
things corresponding to immortal souls, and value those things. Even if
the immortal-soul-detecting utility function works perfectly (it would in
fact detect all immortal souls) this utility function will not detect
anything in many (representations of) universes, and in particular it will
not detect anything in the (representations of) universes we think have
most of the probability mass for explaining our own world. In this case
the AI's behavior is undefined until you tell me more things about the AI;
an obvious possibility is that the AI would choose most of its actions
based on low-probability scenarios in which hidden immortal souls existed
that its actions could affect. (Note that even in this case the utility
function is stable!)
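As a rough illustration of the "obvious possibility" just described, here is a toy sketch. Everything in it is hypothetical: the two models, their probabilities, the action names, and the soul counts are invented for the example, and a real agent would not be this simple. It only shows how a utility function that detects nothing in the most probable model leaves the choice of action to a low-probability model.

```python
# Hypothetical toy models of the universe. Each has a posterior probability
# and says how many immortal souls each action would benefit in that model.
MODELS = [
    {"name": "naturalistic physics", "prob": 0.999999,
     "souls_helped": {"build_hospital": 0, "build_temple": 0}},
    {"name": "hidden immortal souls", "prob": 0.000001,
     "souls_helped": {"build_hospital": 10, "build_temple": 1000}},
]


def utility(model, action):
    """Soul-detecting utility: counts only soul-like objects in the model."""
    return model["souls_helped"].get(action, 0)


def expected_utility(action, models=MODELS):
    """Probability-weighted utility of an action across all models."""
    return sum(m["prob"] * utility(m, action) for m in models)


def best_action(actions, models=MODELS):
    """Pick the action with the highest expected utility."""
    return max(actions, key=lambda a: expected_utility(a, models))


if __name__ == "__main__":
    print(best_action(["build_hospital", "build_temple"]))
```

The naturalistic model contributes zero to every action's expected utility, so the ranking of actions is decided entirely by the one-in-a-million model, while the utility function itself never changes.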
Since we don't know the final laws of physics and could easily be
surprised by further discoveries in the laws of physics, it seems pretty
clear that we shouldn't be specifying a utility function over exact
physical states relative to the Standard Model, because if the Standard
Model is even slightly wrong we get an ontological crisis. Of course
there are all sorts of extremely good reasons we should not try to do this
anyway, some of which are touched on in your draft; there just is no
simple function of physics that gives us something good to maximize. See
also Complexity of Value, Fragility of Value, indirect normativity, the
whole reason for a drive behind CEV, and so on. We're almost certainly
going to be using some sort of utility-learning algorithm, the learned
utilities are going to bind to modeled final physics by way of modeled
higher levels of representation which are known to be imperfect, and we're
going to have to figure out how to preserve the model and learned
utilities through shifts of representation. E.g., the AI discovers that
humans are made of atoms rather than being ontologically fundamental
humans, and furthermore the AI's multi-level representations of reality
evolve to use a different sort of approximation for "humans", but that's
okay because our utility-learning mechanism also says how to re-bind the
learned information through an ontological shift.
This sorta thing ain't going to be easy which is the other big reason to
start working on it well in advance. I point out however that this
doesn't seem unthinkable in human terms. We discovered that brains are
made of neurons but were nonetheless able to maintain an intuitive grasp
on what it means for them to be happy, and we don't throw away all that
info each time a new physical discovery is made. The kind of cognition we
want does not seem inherently self-contradictory.
Three other quick remarks:
*) Natural selection is not a consequentialist, nor is it the sort of
consequentialist that can sufficiently precisely predict the results of
modifications that the basic argument should go through for its stability.
The Omohundrian/Yudkowskian argument is not that we can take an arbitrary
stupid young AI and it will be smart enough to self-modify in a way that
preserves its values, but rather that most AIs that don't self-destruct
will eventually end up at a stable fixed-point of coherent
consequentialist values. This could easily involve a step where, e.g., an
AI that started out with a neural-style delta-rule policy-reinforcement
learning algorithm, or an AI that started out as a big soup of
self-modifying heuristics, is "taken over" by whatever part of the AI
first learns to do consequentialist reasoning about code. But this
process doesn't repeat indefinitely; it stabilizes when there's a
consequentialist self-modifier with a coherent utility function that can
precisely predict the results of self-modifications. The part where this
does happen to an initial AI that is under this threshold of stability is
a big part of the problem of Friendly AI and it's why MIRI works on tiling
agents and so on!
*) Natural selection is not a consequentialist, nor is it the sort of
consequentialist that can sufficiently precisely predict the results of
modifications that the basic argument should go through for its stability.
It built humans to be consequentialists that would value sex, not value
inclusive genetic fitness, and not value being faithful to natural
selection's optimization criterion. Well, that's dumb, and of course the
result is that humans don't optimize for inclusive genetic fitness.
Natural selection was just stupid like that. But that doesn't mean
there's a generic process whereby an agent rejects its "purpose" in the
light of exogenously appearing preference criteria. Natural selection's
anthropomorphized "purpose" in making human brains is just not the same as
the cognitive purposes represented in those brains. We're not talking
about spontaneous rejection of internal cognitive purposes based on their
causal origins failing to meet some exogenously-materializing criterion of
validity. Our rejection of "maximize inclusive genetic fitness" is not an
exogenous rejection of something that was explicitly represented in us,
that we were explicitly being consequentialists for. It's a rejection of
something that was never an explicitly represented terminal value in the
first place. Similarly the stability argument for sufficiently advanced
self-modifiers doesn't go through a step where the successor form of the
AI reasons about the intentions of the previous step and respects them
apart from its constructed utility function. So the lack of any universal
preference of this sort is not a general obstacle to stable
*) The case of natural selection does not illustrate a universal
computational constraint, it illustrates something that we could
anthropomorphize as a foolish design error. Consider humans building Deep
Blue. We built Deep Blue to attach a sort of default value to queens and
central control in its position evaluation function, but Deep Blue is
still perfectly able to sacrifice queens and central control alike if the
position reaches a checkmate thereby. In other words, although an agent
needs crystallized instrumental goals, it is also perfectly reasonable to
have an agent which never knowingly sacrifices the terminally defined
utilities for the crystallized instrumental goals if the two conflict;
indeed "instrumental value of X" is simply "probabilistic belief that X
leads to terminal utility achievement", which is sensibly revised in the
presence of any overriding information about the terminal utility. To put
it another way, in a rational agent, the only way a loose generalization
about instrumental expected-value can conflict with and trump terminal
actual-value is if the agent doesn't know it, i.e., it does something that
it reasonably expected to lead to terminal value, but it was wrong.
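The Deep Blue point can be sketched in a few lines. This is not Deep Blue's actual evaluation function; the piece values, the `WIN` constant, and the move names are assumptions chosen for illustration. The heuristic (material) stands in for a crystallized instrumental goal, and it is consulted only when the terminal outcome (checkmate) is not already known.

```python
WIN = 1_000_000  # terminal utility of delivering checkmate (arbitrary large value)
PIECE_VALUES = {"queen": 9, "rook": 5, "bishop": 3, "knight": 3, "pawn": 1}


def heuristic_value(own_pieces):
    """Instrumental proxy: material is valued because it tends to lead to wins."""
    return sum(PIECE_VALUES[piece] for piece in own_pieces)


def evaluate(outcome):
    """Known terminal utility overrides the heuristic whenever it is available."""
    if outcome["is_checkmate"]:
        return WIN
    return heuristic_value(outcome["own_pieces"])


def choose_move(candidates):
    """Return the name of the candidate move with the best evaluation."""
    return max(candidates, key=lambda name: evaluate(candidates[name]))


if __name__ == "__main__":
    moves = {
        "keep_queen": {"is_checkmate": False,
                       "own_pieces": ["queen", "rook", "pawn"]},
        "sacrifice_queen": {"is_checkmate": True,
                            "own_pieces": ["rook", "pawn"]},
    }
    print(choose_move(moves))
```

With these made-up inputs the agent gives up the queen because doing so delivers mate; if neither move mated, the same code would fall back on material and keep the queen.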
This has been very off-the-cuff and I think I should hand this over to
Nate or Benja if further replies are needed, if that's all right.
A summary of standard non-Bayesian criticisms of common frequentist statistical practices, with pointers into the academic literature.
With the help of many dedicated Less Wrongers (players muflax, Karl, Charlie, and Emile; musicians Mike Blume and Alicorn; technical support Ari Rahikkala) we have successfully completed what is, as far as I know, the first actual Dungeons and Discourse adventure anywhere. Except we're not calling it that, because I don't have the rights to use that name. Though it's not precisely rationality related, I hope it is all right if I post a summary of the adventure by popular demand.
Also, at some point it turned into a musical. The first half of the songs are only available as lyrics at the moment, but Alicorn and MBlume very kindly produced the second half as real music, which I've uploaded to YouTube and linked at the bottom of this post.
The known world has many sects and religions, but all contain shadowy legends of two primeval deities: Sophia, Goddess of Wisdom; and Aleithos, God of Truth. When Sophia announced her plan to create thinking, rational beings, Aleithos objected, declaring that they would fall into error and produce endless falsehoods. Sophia ignored her brother's objections and created humankind, who named the world after their goddess-mother. But Aleithos' fears proved well-founded: humankind fell into error and produced endless falsehoods, and their clamor drove the God of Truth insane.
The once mighty Aleithos fell from heaven, and all of his angelic servants turned into Paradox Beasts, arachnoid monstrosities that sought and devoured those who challenged the laws of logic. Over centuries, most of the Paradox Beasts were banished, but Aleithos himself remained missing. And though thousands of seekers set off to all the corners of the world in search of Truth, the Mad God keeps his own counsel, if He still even exists at all.
The Truth God's madness had one other effect; the laws of physics, once inviolable, turned fluid, and those sufficiently advanced in the study of Truth gained apparently magical abilities. With knowledge literally being power, great philosophers and scientists built mighty cities and empires.
In the middle of the Cartesian Plain at the confluence of the rivers Ordinate and Abcissa stands the mightiest of all, the imperial city of Origin. At the very center of the city stands the infinitely tall Z-Axis Tower, on whose bottom floor lives the all-seeing Wizard of 0=Z. Surrounding the Tower are a host of colleges and universities that attract the greatest scholars from all over Origin, all gathered in service to the great project to find Truth.
Into the city comes Lady Cerune Russell, an exotic noblewoman from far-off parts seeking great thinkers to join her on a dangerous adventure. Four scholars flock to her banner. Nomophilos the Elder the Younger (Emile) is a political scientist studying the central role of laws in creating a just society. Phaidros (muflax) is a zealous Protestant theologian trying to meld strains of thought as disparate as Calvinism, Gnosticism, and W.L. Craig's apologetics. Ephraim (Charlie) is a Darwinian biologist with strong leftist sympathies and an experimental streak that sometimes gets him in trouble. And Macx (Karl) is a quiet but very precise logician with a talent for puzzles.
Cerune explains to the Original scholars that she is the last living descendant of Good King Bertrand, historic ruler of the land of Russellia far to the west. Russellia was the greatest nation in the world until two hundred years ago, when a cataclysm destroyed the entire kingdom in a single day and night. Now the skies above Russellia are dark and filled with choking ash, monsters roam its plains, and the Good King is said to be locked in a magical undying sleep deep beneath the Golden Mountain in the kingdom's center. Though many have traveled to Russellia in search of answers, none have returned alive; Cerune, armed with secret information from the Turing Oracle which she refuses to share, thinks she can do better. The four Originals agree to protect her as she makes the dangerous journey to the Golden Mountain to investigate the mysterious disaster and perhaps lift the curse. Cerune gives them a day in Origin to prepare for the journey.
CHAPTER ONE: ORIGIN
The party skip the city's major attractions, including the Z-Axis Tower and the Hagia Sophia, in favor of more academic preparations: a visit to the library to conduct research, and a shopping trip to Barnes & Aristoi Booksellers, where they purchase reading material for the journey ahead. Here, they find a map of the lands on the road to Russellia, including the unpleasant-sounding Slough of Despotism and the Shadow City of Xar-Morgoloth, whose very name inexplicably chills the air when spoken aloud. After a long discussion on how this thermodynamic-defying effect could probably be used to produce unlimited free energy, they return to more immediate matters and head to the armory to pick up some weapons - a trusty isosceles triangle for Nomophilos, a bow for Macx - before the stores close for the evening. After a final night in Origin, they meet Cerune at the city gates and set off.
They originally intend to stick to the course of the Abcissa, but it is flooding its banks and Cerune recommends crossing the river into Platonia at the Pons Asinorum. After being attacked by a Euclidean Elemental charged with letting no one enter who does not know geometry, they reach the other bank and find a strange old man, raving incomprehensibly. His turns of phrase start to make sense only after the party realizes that he is speaking as if he - and all objects - have no consistent identity.
In his roundabout way, he identifies himself as Heraclitus, the Fire Mage, one of the four great Elemental Mages of Platonia. Many years ago, he crossed into Origin on some errand, only to be ambushed by his arch-enemy, the Water Mage Thales. Thales placed a curse on Heraclitus that he could never cross the same river twice, trapping him on the wrong side of the Abcissa and preventing his return to Platonia. To escape it, Heraclitus finds a loophole in the curse: he convinces himself that objects have no permanent identity, and so crossing the river again would not be crossing the same river twice, since it is not the same river and he is not the same man. Accepting this thesis, he crosses the Abcissa without incident - only to find that his new metaphysics of identity prevents him from forming goals, executing long-term plans, or doing anything more complicated than sitting by the riverbank and eating the fish that swim by.
This sets off a storm of conversation, as each member of the party tries to set Heraclitus right in their own way; Phaidros by appealing to God as a final arbiter of identity, Macx and Nomophilos by arguing that duty is independent of identity and that Heraclitus has a duty to his family and followers. Unfortunately, they make a logical misstep and end up convincing Heraclitus that it is illogical from his perspective to hold conversation; this ends the debate. And as the five philosophers stand around discussing what to do, they are ambushed by a party of assassins, who shoot poisoned arrows at them from a nearby knoll.
Outnumbered and outflanked, the situation seems hopeless, until Macx notices several of the attackers confused and unwilling to attack. With this clue, he identifies them as Buridan's Assassins, who in the presence of two equally good targets will hesitate forever, unable to choose: he yells to his friends to stand with two or more adventurers equidistant from each assassin, and sure enough, this paralyzes the archers and allows the party some breathing space.
But when a second group of assassins arrives to join the first, the end seems near - until Heraclitus, after much pondering, decides to accept his interlocutors' arguments for object permanence and joins in the battle. His fire magic makes short work of the remaining assassins, and when the battle is over, he thanks them and gives a powerful magic item as a gift to each. Then he disappears in a burst of flame after warning his new friends to beware the dangers ahead.
The party searches the corpses of the assassins - who all carry obsidian coins marked PLXM - and then camps for the night on the fringe of the Slough of Despotism.
CHAPTER TWO: THE SLOUGH OF DESPOTISM
The Slough of Despotism is a swamp unfortunately filled with allegators, giant reptiles who thrive on moral superiority and on casting blame. They accuse our heroes of trespassing on their property; our heroes counter that the allegators, who do not have a state to enforce property rights, cannot have a meaningful concept of property. The allegators threaten to form a state, but before they can do so the party manages to turn them against each other by pointing out where their property rights conflict; while the allegators argue, the adventurers sneak off.
They continue through the swamp, braving dense vegetation, giant snakes, and more allegators (who are working on the whole state thing; the party tells them that they're too small and disorganized to be a real state, and that they would have to unite the entire allegator nation under a mutually agreed system of laws) before arriving at an old barrow tomb. Though four of the five adventurers want to leave well enough alone, Ephraim's experimental spirit gets the better of him, and he enters the mound. Its local Barrow Wight has long since departed, but he has left behind a suit of Dead Wight Mail, which confers powerful bonuses on Conservatives and followers of the Right-Hand Path. Nomophilos, the party's Conservative, is all set to take the Mail when Phaidros objects that it is morally wrong to steal from the dead; this sparks a fight that almost becomes violent before Nomo finally backs down; with a sigh of remorse, he leaves the magic item where he found it.
Beyond the barrow tomb lies the domain of the Hobbesgoblins, the mirror image of the Allegators in that they have a strong - some might say dictatorial - state under the rule of their unseen god-king, Lord-Over-All. They are hostile to any foreigners who refuse to swear allegiance to their ruler, but after seeing an idol of the god-king - a tentacled monstrosity bearing more than a passing resemblance to Cthulhu - our heroes are understandably reluctant to do so. As a result, the Hobbesgoblins try to refuse them passage through their capital city of Malmesbury on the grounds that, without being subordinated to Lord-Over-All or any other common ruler, the adventurers are in a state of nature relative to the Hobbesgoblins and may rob, murder, or otherwise exploit them. The Hobbesgoblins don't trust mere oaths or protestations of morality - but Nomophilos finally comes up with a compromise that satisfies them. He offers them a hostage in return for their good behavior, handing them his pet tortoise Xeno. This satisfies the Hobbesgoblins as assurance of their good behavior, and the party passes through Malmesbury without incident.
On the far side of Malmesbury they come to a great lake, around which the miasmas of the swamp seem to swirl expectantly. On the shore of the lake lives Theseus with his two ships. Theseus tells his story: when he came of age, he set off on a trading expedition upon his father's favorite ship. His father made him swear to return the ship intact, but after many years of travel, Theseus realized that every part of the ship had been replaced and repaired, so that there was not a single piece of the ship that was the same as when it had left port. Mindful of his oath, he hunted down the old pieces he had replaced, and joined them together into a second ship. But now he is confused: is it the first or the second ship which he must return to his father?
The five philosophers tell Theseus that it is the first ship: the ship's identity is linked to its causal history, not to the matter that composes it. Delighted with this answer, he offers the second ship to the adventurers, who sail toward the far shore.
Halfway across the lake, they meet an old man sitting upon a small island. He introduces himself as Thomas Hobbes, and says that his spies and secret police have told him everything about the adventurers since they entered the Slough. Their plan to save Russellia is a direct threat to his own scheme to subordinate the entire world under one ruler, and so he will destroy them. When the party expresses skepticism, his "island" rises out of the water and reveals itself to be the back of the monstrous sea creature, Leviathan, the true identity of the Hobbesgoblins' Lord-Over-All. After explaining his theory of government ("Let's Hear It For Leviathan", lyrics only) Hobbes and the monster attack for the game's first boss battle. The fight is immediately plagued by mishaps, including one incident where Phaidros's "Calvin's Predestined Hellfire" spell causes Hobbes to briefly turn into a Dire Tiger. When one of Leviathan's tentacles grabs Cerune, she manifests a battle-axe of magic fire called the Axe of Separation and hacks the creature's arm off. She refuses to explain this power, but inspired by the small victory the party defeat Hobbes and reduce Leviathan into a state of Cartesian doubt; the confused monster vanishes into the depths, and the adventurers hurry to the other side and out of the Slough.
CHAPTER THREE: THE SHADOW CITY
Although our heroes make good time, they soon spot a detachment of Hobbesgoblins pursuing them. Afraid the goblins will be angry at the defeat of their god, the party hides; this turns out to be unnecessary, as the goblins only want Ephraim - the one who actually dealt the final blow against Leviathan - to be their new Lord-Over-All. Ephraim rejects the position, and the party responds to the goblins' desperate pleading by suggesting a few pointers for creating a new society - punishing violence, promoting stability, reinforcing social behavior. The Hobbesgoblins grumble, but eventually depart - just in time for the party to be attacked by more of Buridan's Assassins. These killers' PLXM coins seem to suggest an origin in Xar-Morgoloth, the Shadow City, and indeed its jet-black walls now loom before them. But the city sits upon the only pass through the Central Mountains, so the party reluctantly enters.
Xar-Morgoloth turns out to be a pleasant town of white-washed fences and laughing children. In search of an explanation for the incongruity the five seek out the town's spiritual leader, the Priest of Lies. The Priest explains that although Xar-Morgoloth is superficially a nice place, the town is evil by definition. He argues that all moral explanations must be grounded in base moral facts that cannot be explained, whether these be respect for others, preference of pleasure over pain, or simple convictions that murder and theft are wrong. One of these base level moral facts, he says, is that Xar-Morgoloth is evil. It is so evil, in fact, that it is a moral imperative to keep people out of the city - which is why he sent assassins to scare them off.
Doubtful, the party seeks the mysterious visiting philosopher whom the Priest claimed originated these ideas: they find Immanuel Kant living alone on the outskirts of the city. Kant tells his story: he came from a parallel universe, but one day a glowing portal appeared in the sky, flinging him into the caves beyond Xar-Morgoloth. Wandering into Xar-Morgoloth, he tried to convince the citizens of his meta-ethical theories, but they insisted they could ground good and evil in basic moral intuitions instead. Kant proposed that Xar-Morgoloth was evil as a thought experiment to disprove them, but it got out of hand.
When our heroes challenge Kant's story and blame him for the current state of the city, Kant gets angry and casts Parmenides' Stasis Hex, freezing them in place. Then he announces his intention to torture and kill them all. For although in this world Immanuel Kant is a moral philosopher, in his own world (he explains) Immanuel Kant is a legendary villain and figure of depravity ("I'm Evil Immanuel Kant", lyrics only). Cerune manifests a second magic weapon, the Axe of Choice, to break the Stasis Hex, and the party have their second boss battle, which ends in defeat for Evil Kant. Searching his home, they find an enchanted Parchment of Natural Law that causes the chill in the air whenever the city's name is spoken.
Armed with this evidence, they return to the Priest of Lies and convince him that his moral theory is flawed. The Priest dispels the shadow over the city, recalls his assassins, and restores the town name to its previous non-evil transliteration of Summerglass. He then offers free passage through the caverns that form the only route through the Central Mountains.
CHAPTER FOUR: THE CAVERNS OF ABCISSA
Inside the caverns, which are nearly flooded by the overflowing Abcissa River, the party encounter an army of Water Elementals, leading them to suspect that they may be nearing the headquarters of Heraclitus' arch-enemy, Thales. The Water Elementals are mostly busy mining the rock for gems and magic artifacts, but one of them is sufficiently spooked by Phaidros to cast a spell on him, temporarily turning him to water. This is not immediately a disaster - Phaidros assumes a new form as a water elemental but keeps his essential personality - except that in an Ephraimesque display of overexperimentation, Phaidros wonders what would happen if he temporarily relaxed the morphogenic field that holds him in place - as a result, he loses his left hand, a wound which stays in place when he reverts back to his normal form a few hours later. A resigned Phaidros only quotes the Bible ("And if your hand offend you, cut it off: it is better for you to enter into life maimed, than having two hands to go into hell" - Mark 9:43) and trusts in the Divine plan.
The Caverns of Abcissa are labyrinthine and winding, but eventually the party encounters a trio who will reappear several times in their journey: Ruth (who tells the truth), Guy (who'll always lie) and Clancy (who acts on fancy). These three have a habit of hanging around branching caverns and forks in the road, and Ephraim solves their puzzle thoroughly enough to determine what route to take to the center of the cave system.
Here, in a great cavern, lives a civilization of cave-men whose story sounds a lot like Evil Kant's - from another world, minding their own business until a glowing portal appeared in the sky and sucked them into the caves. The cave-men are currently on the brink of civil war after one of their number, Thag, claims to have visited the mythical "outside" and discovered a world of magic and beauty far more real than the shadows dancing on the walls of their cavern. Most of the other cave-men, led by the very practical Vur, have rejected his tale, saying that the true magic and beauty lies in accepting the real, in-cave world rather than chasing after some outside paradise - but a few of the youth have flocked to Thag's banner, including Antil, a girl with mysterious magic powers.
Only the timely arrival of the adventurers averts a civil war; the party negotiates a truce and offers to solve the dispute empirically - they will escort Vur and Antil with them through the caverns so that representatives of both sides can see whether or not the "outside" really exists. This calms most of the cave-men down, and with Vur and Antil alongside, they head onward to the underground source of the Abcissa - which, according to their research, is the nerve center of Thales' watery empire.
On the way, they encounter several dangers. First, they awake a family of hibernating bears, who are quickly dispatched but who manage to maul the frail Vur so severely that only some divine intervention mediated by Phaidros saves his life. Second, they come across a series of dimensional portals clearly linked to the stories related by Evil Kant and the cave-men. Some link directly to otherworldly seas, pouring their water into the Abcissa and causing the recent floods. Others lead to otherworldly mines and quarries, and are being worked by gangs of Water Elementals. After some discussion of the ethics of stranding the Water Elementals, the five philosophers decide to shut down as many of the portals as possible.
They finally reach the source of the Abcissa and, expecting a battle, deck themselves out in magic armor that grants immunity to water magic. As expected, they encounter Thales, who reveals the full scale of his dastardly plot - to turn the entire world into water. But his exposition is marred by a series of incongruities, including his repeated mispronunciations of his own name ("All is Water", lyrics only). And when the battle finally begins, the party dispatches Thales with minimal difficulty, and the resulting corpse is not that of a Greek philosopher at all, but rather that of Davidson's Swampman, a Metaphysical summon that can take the form of any creature it encounters and imitate it perfectly.
Before anyone has time to consider the implications of their discovery, they are attacked by the real Water Mage, who bombards them with powerful water spells to which their magic armor mysteriously offers no protection. Worse, the Mage is able to create dimensional portals at will, escaping attacks effortlessly. After getting battered by a series of magic Tsunamis that nearly kill several of the weaker party members, the adventurers are in dire straits.
Then the tide begins to turn. Antil manifests the power to go invisible and attack the Water Mage from an unexpected vantage. Cerune manifests another magic weapon, the Axe of Extension, which gives her allies the same powers over space as the Water Mage seems to possess. And with a little prompting from Cerune, Phaidros and Nomophilos realize the Water Mage's true identity. Magic armor doesn't grant protection from his water spells because they are not water at all, but XYZ, a substance physically identical to but chemically different from H2O. And his mastery of dimensional portals arises from his own origin in a different dimension, Twin Earth. He is Hilary Putnam ("All is Water, Reprise", lyrics only) who has crossed dimensions, defeated Thales, and assumed his identity in order to take over his watery empire and complete his world domination plot. With a last push of magic, the party manage to defeat Putnam, who is knocked into the raging Abcissa and drowned in the very element he sought to control.
They tie up the loose ends of the chapter by evacuating the Water Elementals from Twin Earth, leading the cave-men to the promised land of the Outside, and confronting Antil about her mysterious magic. Antil gives them the source of her power to turn invisible: the Ring of Gyges, which she found on the cave floor after an earthquake. She warns them never to use it, as it presents a temptation which their ethics might be unable to overcome.
CHAPTER FIVE: CLIMBING MOUNT IMPROBABLE
Now back on the surface, the party finds their way blocked by the towering Mount Improbable, which at first seems too tall to ever climb. But after some exploration, they find there is a gradual path sloping upward, and begin their ascent. They are blocked, however, by a regiment of uniformed apes: cuteness turns to fear when they get closer and find the apes have machine guns. They decide to negotiate, and the apes prove willing to escort them to their fortress atop the peak if they can prove their worth by answering a few questions about their religious beliefs.
Satisfied, the ape army lead them to a great castle at the top of the mountain where Richard Dawkins ("Beware the Believers", credit Michael Edmondson) and his snow leopard daemon plot their war against the gods themselves. Dawkins believes the gods to be instantiated memes - creations of human belief that have taken on a life of their own due to Aleithos' madness - and accuses them of causing disasters, poverty, and ignorance in order to increase humanity's dependence upon them and keep the belief that sustains their existence intact. With the help of his genetically engineered apes and a fleet of flying battleships, he has been waging war against all the major pantheons of polytheism simultaneously. Dawkins is now gearing up to attack his most implacable foe, Jehovah Himself, although he admits He has so far managed to elude him.
Hoping the adventurers will join his forces, he takes them on a tour of the castle, showing them the towering battlements, the flotilla of flying battleships, and finally, the dungeons. In these last are imprisoned Fujin, Japanese god of storms; Meretseger, Egyptian goddess of the flood; and even Ares, the Greek god of war (whom Dawkins intends to try for war crimes: not any specific war crime, just war crimes in general). When the party reject Dawkins' offer to join his forces (most vocally Phaidros, most reluctantly Ephraim), Dawkins locks them in the dungeons themselves.
They are rescued late at night by their old friend Theseus. Theseus lost his ship in a storm (caused by the Japanese storm god, Fujin) and joined Dawkins' forces to get revenge; he is now captain of the aerial battleships. Theseus loads the adventurers onto a flying battleship and deposits them on the far side of the mountain, where Dawkins and his apes will be unlikely to find them.
Their troubles are not yet over, however, for they quickly encounter a three-man crusade consisting of Blaise Pascal, Johann Tetzel, and St. Augustine of Hippo (mounted, cavalry-style, upon an actual hippopotamus). The three have come, led by a divine vision, to destroy Dawkins and his simian armies as an abomination unto the Lord, and upon hearing that the adventurers have themselves escaped Dawkins, invite them to come along. But the five, despite their appreciation for Pascal's expository fiddle music ("The Devil and Blaise Pascal"), are turned off by Tetzel's repeated attempts to sell them indulgences and Augustine's bombastic preaching. After Phaidros gets in a heated debate with Augustine over the role of pacifism in Christian thinking, the two parties decide to go their separate ways, despite Augustine's fiery condemnations and Pascal's warning that there is a non-zero chance the adventurers' choice will doom them to Hell.
After another encounter with Ruth, Guy, and Clancy, our heroes reach the base of Mount Improbable and at last find themselves in Russellia.
CHAPTER SIX: THE PALL OVER RUSSELLIA
Russellia is, as the legends say, shrouded in constant darkness. The gloom and the shock of being back in her ancestral homeland are too much for Cerune, who breaks down and reveals her last few secrets. Before beginning the quest, she consulted the Turing Oracle in Cyberia, who told her to seek the aid of a local wizard, Zermelo the Magnificent. Zermelo gave her nine magic axes of holy fire, which he said possessed the power to break the curse over Russellia. But in desperation, she has already used three of the magic axes, and with only six left she is uncertain whether she will have the magic needed.
At that moment, Heraclitus appears in a burst of flame, seeking a debriefing on the death of his old enemy Thales. After recounting the events of the past few weeks, our heroes ask Heraclitus whether, as a Fire Mage, he can reforge the axes of holy fire. Heraclitus admits the possibility, but says he would need to know more about the axes, their true purpose, and the enemy they were meant to fight. He gives the party an enchanted matchbook, telling them to summon him by striking a match when they gather the information he needs.
Things continue going wrong when, in the midst of a discussion about large numbers, Phaidros makes a self-contradictory statement that summons a Paradox Beast. Our heroes stand their ground and manage to destroy the abomination, despite its habit of summoning more Paradox Beasts to its aid through its Principle of Explosion spell. Bruised and battered, they limp into the nearest Russellian city on their map, the town of Ravenscroft.
The people of Ravenscroft tell their story: in addition to the eternal darkness, Russellia is plagued by vampire attacks and by a zombie apocalypse, which has turned the population of the entire country, save Ravenscroft, into ravenous brain-eating zombies. Despite the burghers claiming the zombie apocalypse had been confirmed by no less a figure than Thomas Nagel, who passed through the area a century ago, our heroes are unconvinced: for one thing, the Ravenscrofters are unable to present any evidence that the other Russellians are zombies except for their frequent attacks on Ravenscroft - and the Ravenscrofters themselves attack the other towns as a "pre-emptive measure". But the Ravenscrofters remain convinced, and even boast of their plan to launch a surprise attack on neighboring Brixton the next day.
Suspicious, our heroes head to the encampment of the Ravenscroft army, where they are just in time to see Commander David Chalmers give a rousing oration against the zombie menace ("Flee! A History of Zombieism In Western Thought", credit Emerald Rain). They decide to latch on to Chalmers' army, both because it is heading the same direction they are and because they hope they may be able to resolve the conflict between Ravenscroft and Brixton before it turns violent.
They camp with the army in some crumbling ruins from the golden age of the Russellian Empire. Entering a ruined temple, they disarm a series of traps to enter a vault containing a legendary artifact, the Morningstar of Frege. They also encounter a series of statues and bas-reliefs of the Good King, in which he demonstrates his chivalry by swearing an oath to Aleithos that he will defend all those who cannot defend themselves. Before they can puzzle out the meaning of all they have seen, they are attacked by vampires, confirming the Ravenscrofters' tales; they manage to chase them away with their magic and a hare-brained idea of Phaidros' to bless their body water, turning it into holy water and burning them up from the inside.
The next morning, they sneak into Brixton before the main army, and find their fears confirmed: the Brixtonites are normal people, no different from the Ravenscrofters, and they claim that Thomas Nagel told them that they were the only survivors of the zombie apocalypse. They manage to forge a truce between Ravenscroft and Brixton, but to their annoyance, the two towns make peace only to attack a third town, Mountainside, which they claim is definitely populated by zombies this time. In fact, they say, the people of Mountainside openly admit to being zombies and don't even claim to have souls.
Once again, our heroes rush to beat the main army to Mountainside. There they find the town's leader, Daniel Dennett, who explains the theory of eliminative materialism ("The Zombies' Secret"). The party tries to explain the subtleties of Dennett's position to a bloodthirsty Chalmers, and finally all sides agree to drop loaded terms like "human" and "zombie" and replace them with a common word that suggests a fundamental humanity but without an internal Cartesian theater (one of our heroes suggests "NPC", and it sticks). The armies of the three towns agree to ally against their true common enemy - the vampires who live upon the Golden Mountain and kidnap their friends and families in their nighttime raids.
Before the attack, Nomophilos and Ephraim announce their intention to build an anti-vampire death ray. The theory is that places on the fringe of Russellia receive some sunlight, while places in the center are shrouded in endless darkness. If the towns of Russellia can set up a system of mirrors from their highest towers, they can reflect the sunlight from the borderlands into a central collecting mirror in Mountainside, which can be aimed at the vampires' hideout to flood it with daylight, turning them to ashes. Ephraim, who invested most of his skill points into techne, comes up with schematics for the mirror, and after constructing a successful prototype, Chalmers and Dennett sound the attack order.
The death ray takes out many of the vampires standing guard, but within their castle they are protected from its light: our heroes volunteer to infiltrate the stronghold, but are almost immediately captured and imprisoned - the vampires intend to sacrifice Cerune in a ritual to use her royal blood to increase their power. But the adventurers make a daring escape: arch-conservative Nomophilos uses the invisible hand of the marketplace to steal the keys out of the jailer's pocket, and Phaidros summons a five hundred pound carnivorous Christ metaphor to maul the guards. Before the party can escape the castle, they are confronted by the vampire lord himself, who is revealed to be none other than Thomas Nagel ("What Is It Like To Be A Bat?"). In the resulting battle, Nagel is turned to ashes and the three allied cities make short work of the remaining vampires, capturing the castle.
The next morning finds our heroes poring over the vampire lord's library. Inside, they find an enchanted copy of Godel Escher Bach (with the power to summon an identical enchanted copy of Godel Escher Bach) and a slew of books on Russellian history. Over discussion of these latter, they finally work out what curse has fallen over the land, and what role the magic axes play in its removal.
[spoiler alert; stop here if you want to figure it out for yourself]
The Good King's oath to defend those who could not defend themselves was actually more complicated than that: he swore an oath to the god Aleithos to defend those and only those who could not defend themselves. His enemies, realizing the inherent contradiction, attacked him, trapping Russell in a contradiction - if he defended himself, he was prohibited from doing so; if he did not defend himself, he was obligated to do so. Trapped, he was forced to break his oath, and the Mad God punished him by casting his empire into eternal darkness and himself into an endless sleep.
The nine axes of Zermelo the Magnificent embody the nine axioms of ZFC. If applied to the problem, they will allow set theory to be reformulated in a way that makes the paradox impossible, lifting the curse and waking the Good King.
Upon figuring out the mystery, the party strike the enchanted match and summon Heraclitus, who uses fire magic to reforge the Axes of Choice, Separation, and Extension. Thus armed, the party leave the Vampire Lord's castle and enter the system of caverns leading into the Golden Mountain.
CHAPTER SEVEN: THE KING UNDER THE MOUNTAIN
The party's travels through the cavern are quickly blocked by a chasm too deep to cross. Nomophilos saves the day by realizing that the enchanted copy of Godel Escher Bach creates the possibility of infinite recursion; he uses each copy of GEB to create another copy, and eventually fills the entire chasm with books, allowing the party to walk through to the other side.
There they meet Ruth, Clancy, and Guy one last time; the three are standing in front of a Logic Gate, and to open it the five philosophers must solve the Hardest Logic Puzzle Ever. In an epic feat that the bards will no doubt sing for years to come, Macx comes up with a solution to the puzzle, identifies each of the three successfully, and opens the Logic Gate.
Inside the gate is the Good King, still asleep after two centuries. His resting place is guarded by the monster he unleashed, a fallen archangel who has become a Queen Paradox Beast. The Queen summons a small army of Paradox Beast servants with Principle of Explosion, and the battle begins in earnest. Cerune stands in a corner, trying to manifest her nine magic axes, but Nomophilos uses his Conservative spell "Morning in America" to summon a Raygun capable of piercing the Queen Paradox Beast's armored exoskeleton. Macx summons a Universal Quantifier and attaches it to his Banish Paradox Beast spell to decimate the Queen's armies. Ephraim desperately tries to wake the Good King, while Phaidros simply prays.
After an intense battle, Cerune manifests all nine axes and casts them at the Queen Paradox Beast, dissolving the paradox and destroying the beast's magical defenses. The four others redouble their efforts, and finally manage to banish the Queen. When the Queen Paradox Beast is destroyed, Good King Bertrand awakens.
Bertrand is temporarily discombobulated, but eventually regains his bearings and listens to the entire adventure. Then he tells his story. The attack that triggered the curse upon him, he says, was no coincidence, but rather a plot by a sinister organization against whom he had been waging a shadow war: the Bayesian Conspiracy. He first encountered the conspiracy when their espionage arm, the Bayes Network, tried to steal a magic emerald of unknown origin from his treasury. Since then, he worked tirelessly to unravel the conspiracy, and had reached the verge of success - learning that their aim was in some way linked to a plan to gain the shattered power of the Mad God Aleithos for themselves - when the Conspiracy took advantage of his oath and managed to put him out of action permanently.
He is horrified to hear that two centuries have passed, and worries that the Bayesians' mysterious plan may be close to fruition. He begs the party to help him re-establish contact with the Conspiracy and continue figuring out their plans, which may be a dire peril to the entire world. But he expresses doubt that such a thing is even possible at this stage.
In a burst of flame, Heraclitus appears, announcing that all is struggle and that he has come to join in theirs. He admits that the situation is grim, but declares it is not as hopeless as it seems, because they do not fight alone. He invokes the entire Western canon as the inspiration they follow and the giants upon whose shoulders they stand ("Grand Finale").
Heraclitus, Good King Bertrand, and the five scholars end the adventure by agreeing to seek out the Bayesian Conspiracy and discover whether Russell's old adversaries are still active. There are nebulous plans to continue the campaign (subject to logistical issues) in a second adventure, Fermat's Last Stand.
Hobbes' Song: Let's Hear It For Leviathan
Kant's Song: I'm Evil Immanuel Kant
Thales' Song: All Is Water
Putnam's Song: All Is Water, Reprise
GOOD ARTISTS BORROW, GREAT ARTISTS STEAL
Dawkins' Song: Beware The Believers (credit: Michael Edmondson)
Chalmers' Song: Flee! A History of Zombieism In Western Thought (credit: Emerald Rain)
Pascal's Song: The Devil and Blaise Pascal
Dennett's Song: The Zombies' Secret
Vampire Nagel's Song: What Is It Like To Be A Bat?
Heraclitus' Song: Grand Finale
We are delighted to report that technology inventor Elon Musk, creator of Tesla and SpaceX, has decided to donate $10M to the Future of Life Institute to run a global research program aimed at keeping AI beneficial to humanity.
There is now a broad consensus that AI research is progressing steadily, and that its impact on society is likely to increase. A long list of leading AI-researchers have signed an open letter calling for research aimed at ensuring that AI systems are robust and beneficial, doing what we want them to do. Musk's donation aims to support precisely this type of research: "Here are all these leading AI researchers saying that AI safety is important", says Elon Musk. "I agree with them, so I'm today committing $10M to support research aimed at keeping AI beneficial for humanity."
[...] The $10M program will be administered by the Future of Life Institute, a non-profit organization whose scientific advisory board includes AI-researchers Stuart Russell and Francesca Rossi. [...]
The research supported by the program will be carried out around the globe via an open grants competition, through an application portal at http://futureoflife.org that will open by Thursday January 22. The plan is to award the majority of the grant funds to AI researchers, and the remainder to AI-related research involving other fields such as economics, law, ethics and policy (a detailed list of examples can be found here [PDF]). "Anybody can send in a grant proposal, and the best ideas will win regardless of whether they come from academia, industry or elsewhere", says FLI co-founder Viktoriya Krakovna.
[...] Along with research grants, the program will also include meetings and outreach programs aimed at bringing together academic AI researchers, industry AI developers and other key constituents to continue exploring how to maximize the societal benefits of AI; one such meeting was held in Puerto Rico last week with many of the open-letter signatories.