
The Library of Scott Alexandria

41 RobbBB 14 September 2015 01:38AM

I've put together a list of what I think are the best Yvain (Scott Alexander) posts for new readers, drawing from SlateStarCodex, LessWrong, and Scott's LiveJournal.

The list should make the most sense to people who start from the top and read through it in order, though skipping around is encouraged too. Rather than making a chronological list, I’ve tried to order things by a mix of "where do I think most people should start reading?" plus "sorting related posts together."

This is a work in progress; you’re invited to suggest things you’d add, remove, or shuffle around. Since many of the titles are a bit cryptic, I'm adding short descriptions. See my blog for a version without the descriptions.


I. Rationality and Rationalization

II. Probabilism

III. Science and Doubt

IV. Medicine, Therapy, and Human Enhancement

V. Introduction to Game Theory

VI. Promises and Principles

VII. Cognition and Association

VIII. Doing Good

IX. Liberty

X. Progress

XI. Social Justice

XII. Politicization

XIII. Competition and Cooperation


If you liked these posts and want more, I suggest browsing the SlateStarCodex archives.

A few misconceptions surrounding Roko's basilisk

40 RobbBB 05 October 2015 09:23PM

There's a new LW wiki page on the Roko's basilisk thought experiment, discussing both Roko's original post and the fallout from Eliezer Yudkowsky banning the topic on Less Wrong discussion threads. The wiki page, I hope, will reduce how much people have to rely on speculation or reconstruction to make sense of the arguments.

While I'm on this topic, I want to highlight points that I see omitted or misunderstood in some online discussions of Roko's basilisk. The first point that people writing about Roko's post often neglect is:


  • Roko's arguments were originally posted to Less Wrong, but they weren't generally accepted by other Less Wrong users.

Less Wrong is a community blog, and anyone who has a few karma points can post their own content here. Having your post show up on Less Wrong doesn't require that anyone else endorse it. Roko's basic points were promptly rejected by other commenters on Less Wrong, and as ideas not much seems to have come of them. People who bring up the basilisk on other sites don't seem to be super interested in the specific claims Roko made either; discussions tend to gravitate toward various older ideas that Roko cited (e.g., timeless decision theory (TDT) and coherent extrapolated volition (CEV)) or toward Eliezer's controversial moderation action.

In July 2014, David Auerbach wrote a Slate piece criticizing Less Wrong users and describing them as "freaked out by Roko's Basilisk." Auerbach wrote, "Believing in Roko’s Basilisk may simply be a 'referendum on autism'" — which I take to mean he thinks a significant number of Less Wrong users accept Roko’s reasoning, and they do so because they’re autistic (!). But the Auerbach piece glosses over the question of how many Less Wrong users (if any) in fact believe in Roko’s basilisk. Which seems somewhat relevant to his argument...?

The idea that Roko's thought experiment holds sway over some community or subculture seems to be part of a mythology that’s grown out of attempts to reconstruct the original chain of events; and a big part of the blame for that mythology's existence lies with Less Wrong's moderation policies. Because the discussion topic was banned for several years, Less Wrong users themselves had little opportunity to explain their views or address misconceptions. A stew of rumors and partly-understood forum logs then congealed into the attempts by people on RationalWiki, Slate, etc. to make sense of what had happened.

I gather that the main reason people thought Less Wrong users were "freaked out" about Roko's argument was that Eliezer deleted Roko's post and banned further discussion of the topic. Eliezer has since sketched out his thought process on Reddit:

When Roko posted about the Basilisk, I very foolishly yelled at him, called him an idiot, and then deleted the post. [...] Why I yelled at Roko: Because I was caught flatfooted in surprise, because I was indignant to the point of genuine emotional shock, at the concept that somebody who thought they'd invented a brilliant idea that would cause future AIs to torture people who had the thought, had promptly posted it to the public Internet. In the course of yelling at Roko to explain why this was a bad thing, I made the further error---keeping in mind that I had absolutely no idea that any of this would ever blow up the way it did, if I had I would obviously have kept my fingers quiescent---of not making it absolutely clear using lengthy disclaimers that my yelling did not mean that I believed Roko was right about CEV-based agents [= Eliezer’s early model of indirectly normative agents that reason with ideal aggregated preferences] torturing people who had heard about Roko's idea. [...] What I considered to be obvious common sense was that you did not spread potential information hazards because it would be a crappy thing to do to someone. The problem wasn't Roko's post itself, about CEV, being correct.

This, obviously, was a bad strategy on Eliezer's part. Looking at the options in hindsight: To the extent it seemed plausible that Roko's argument could be modified and repaired, Eliezer shouldn't have used Roko's post as a teaching moment and loudly chastised him on a public discussion thread. To the extent this didn't seem plausible (or ceased to seem plausible after a bit more analysis), continuing to ban the topic was a (demonstrably) ineffective way to communicate the general importance of handling real information hazards with care.


On that note, point number two:

  • Roko's argument wasn’t an attempt to get people to donate to Friendly AI (FAI) research. In fact, the opposite is true.

Roko's original argument was not 'the AI agent will torture you if you don't donate, therefore you should help build such an agent'; his argument was 'the AI agent will torture you if you don't donate, therefore we should avoid ever building such an agent.' As Gerard noted in the ensuing discussion thread, threats of torture "would motivate people to form a bloodthirsty pitchfork-wielding mob storming the gates of SIAI [= MIRI] rather than contribute more money." To which Roko replied: "Right, and I am on the side of the mob with pitchforks. I think it would be a good idea to change the current proposed FAI content from CEV to something that can't use negative incentives on x-risk reducers."

Roko saw his own argument as a strike against building the kind of software agent Eliezer had in mind. Other Less Wrong users, meanwhile, rejected Roko's argument both as a reason to oppose AI safety efforts and as a reason to support AI safety efforts.

Roko's argument was fairly dense, and it continued into the discussion thread. I’m guessing that this (in combination with the temptation to round off weird ideas to the nearest religious trope, plus misunderstanding #1 above) is why RationalWiki's version of Roko’s basilisk gets introduced as

a futurist version of Pascal’s wager; an argument used to try and suggest people should subscribe to particular singularitarian ideas, or even donate money to them, by weighing up the prospect of punishment versus reward.

If I'm correctly reconstructing the sequence of events: Sites like RationalWiki report in the passive voice that the basilisk is "an argument used" for this purpose, yet no examples ever get cited of someone actually using Roko’s argument in this way. Via citogenesis, the claim then gets incorporated into other sites' reporting.

(E.g., in Outer Places: "Roko is claiming that we should all be working to appease an omnipotent AI, even though we have no idea if it will ever exist, simply because the consequences of defying it would be so great." Or in Business Insider: "So, the moral of this story: You better help the robots make the world a better place, because if the robots find out you didn’t help make the world a better place, then they’re going to kill you for preventing them from making the world a better place.")

In terms of argument structure, the confusion is equating the conditional statement 'P implies Q' with the argument 'P; therefore Q.' Someone asserting the conditional isn’t necessarily arguing for Q; they may be arguing against P (based on the premise that Q is false), or they may be agnostic between those two possibilities. And misreporting about which argument was made (or who made it) is kind of a big deal in this case: 'Bob used a bad philosophy argument to try to extort money from people' is a much more serious charge than 'Bob owns a blog where someone once posted a bad philosophy argument.'
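The logical point can be made concrete with a small truth-table check. This is my own toy illustration, not something from the original discussion: it just verifies that someone who asserts the conditional 'P implies Q' can consistently go on to conclude not-P (modus tollens), which is the shape of Roko's actual move.

```python
from itertools import product

def implies(p: bool, q: bool) -> bool:
    """Material conditional: false only when P is true and Q is false."""
    return (not p) or q

# All truth assignments under which the conditional 'P implies Q' holds:
consistent = [(p, q) for p, q in product([True, False], repeat=2)
              if implies(p, q)]

# Modus ponens: from (P -> Q) and P, infer Q.
assert all(q for p, q in consistent if p)

# Modus tollens: from (P -> Q) and not-Q, infer not-P.
# Asserting the conditional is fully compatible with arguing against P.
assert all(not p for p, q in consistent if not q)
```

Both assertions pass: the conditional alone doesn't tell you whether its asserter is arguing for Q or against P.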



  • "Formally speaking, what is correct decision-making?" is an important open question in philosophy and computer science, and formalizing precommitment is an important part of that question.

Moving past Roko's argument itself, a number of discussions of this topic risk misrepresenting the debate's genre. Articles on Slate and RationalWiki strike an informal tone, and that tone can be useful for getting people thinking about interesting science/philosophy debates. On the other hand, if you're going to dismiss a question as unimportant or weird, it's important not to give the impression that working decision theorists are similarly dismissive.

What if your devastating take-down of string theory is intended for consumption by people who have never heard of 'string theory' before? Even if you're sure string theory is hogwash, you should then be wary of giving the impression that the only people discussing string theory are the commenters on a recreational physics forum. Good reporting by non-professionals, whether or not they take an editorial stance on the topic, should make it obvious that there's academic disagreement about which approach to Newcomblike problems is the right one. The same holds for disagreement about topics like long-term AI risk or machine ethics.

If Roko's original post is of any pedagogical use, it's as an unsuccessful but imaginative stab at drawing out the diverging consequences of our current theories of rationality and goal-directed behavior. Good resources for these issues (both for discussion on Less Wrong and elsewhere) include:

The Roko's basilisk ban isn't in effect anymore, so you're welcome to direct people here (or to the Roko's basilisk wiki page, which also briefly introduces the relevant issues in decision theory) if they ask about it. Particularly low-quality discussions can still get deleted (or politely discouraged), though, at moderators' discretion. If anything here was unclear, you can ask more questions in the comments below.

Two Growth Curves

35 AnnaSalamon 02 October 2015 12:59AM

Sometimes, it helps to take a model that part of you already believes, and to make a visual image of your model so that more of you can see it.

One of my all-time favorite examples of this: 

I used to often hesitate to ask dumb questions, to publicly try skills I was likely to be bad at, or to visibly/loudly put forward my best guesses in areas where others knew more than me.

I was also frustrated with this hesitation, because I could feel it hampering my skill growth.  So I would try to convince myself not to care about what people thought of me.  But that didn't work very well, partly because what folks think of me is in fact somewhat useful/important.

Then, I got out a piece of paper and drew how I expected the growth curves to go.

In blue, I drew the apparent-coolness level that I could achieve if I stuck with the "try to look good" strategy.  In brown, I drew the apparent-coolness level I'd have if I instead made mistakes as quickly and loudly as possible -- I'd look worse at first, but then I'd learn faster, eventually overtaking the blue line.

Suddenly, instead of pitting my desire to become smart against my desire to look good, I could pit my desire to look good now against my desire to look good in the future :)

I return to this image of two growth curves often when I'm faced with an apparent tradeoff between substance and short-term appearances.  (E.g., I used to often find myself scurrying to get work done, or to look productive / not-horribly-behind today, rather than trying to build the biggest chunks of capital for tomorrow.  I would picture these growth curves.)
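The two-curves picture can be turned into a toy numeric model. These numbers are made up for illustration (not Anna's actual drawing): one strategy starts with higher apparent coolness but improves slowly, the other starts lower but compounds as skills build on each other.

```python
# Toy model of the two growth curves (made-up numbers, for illustration):
def look_good(week: int) -> float:
    return 50 + 2 * week        # high floor, slow linear growth

def learn_fast(week: int) -> float:
    return 30 * 1.05 ** week    # lower start, 5% weekly compounding

# First week where the mistakes-made-loudly strategy overtakes
# on appearances as well.
crossover = next(t for t in range(200) if learn_fast(t) > look_good(t))
print(crossover)  # -> 25 with these numbers
```

The exact crossover week depends entirely on the assumed rates; the useful part is that any compounding curve eventually overtakes any linear one, so the tradeoff is between looking good now and looking good later.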

How To Win The AI Box Experiment (Sometimes)

26 pinkgothic 12 September 2015 12:34PM


This post was originally written for Google+ and thus a different audience.

In the interest of transparency, I haven't altered it except for this preamble and formatting, though since then (mostly at the urging of ChristianKl - thank you, Christian!) I've briefly spoken to Eliezer via e-mail and noticed that I'd drawn a very incorrect conclusion about his opinions when I thought he'd be opposed to publishing the account. Since there are far too many 'person X said...' rumours floating around in general, I'm very sorry for contributing to that noise. I've already edited the new insight into the G+ post and you can also find that exact same edit here.

Since this topic directly relates to LessWrong and most people likely interested in the post are part of this community, I feel it belongs here. It was originally written a little over a month ago and I've tried to find the sweet spot between the extremes of nagging people about it and letting the whole thing sit just shy of having been swept under a rug, but I suspect I've not been very good at that. I have thus far definitely erred on the side of the rug.


How To Win The AI Box Experiment (Sometimes)

A little over three months ago, something interesting happened to me: I took it upon myself to play the AI Box Experiment as an AI.

I won.

There are a few possible reactions to this revelation. Most likely, you have no idea what I'm talking about, so you're not particularly impressed. Mind you, that's not to say you should be impressed - that's to contrast it with a reaction some other people have to this information.

This post is going to be a bit on the long side, so I'm putting a table of contents here so you know roughly how far to scroll if you want to get to the meat of things:


1. The AI Box Experiment: What Is It?

2. Motivation

2.1. Why Publish?

2.2. Why Play?

3. Setup: Ambition And Invested Effort

4. Execution

4.1. Preliminaries / Scenario

4.2. Session

4.3. Aftermath

5. Issues / Caveats

5.1. Subjective Legitimacy

5.2. Objective Legitimacy

5.3. Applicability

6. Personal Feelings

7. Thank You

Without further ado:


1. The AI Box Experiment: What Is It?

The AI Box Experiment was devised as a way to put a common rebuttal to AGI (Artificial General Intelligence) risk concerns to the test: "We could just keep the AI in a box and only let it answer any questions it's posed." (As a footnote, note that an AI 'boxed' like this is called an Oracle AI.)

Could we, really? Would we, if the AGI were able to communicate with us, truly be capable of keeping it confined to its box? If it is sufficiently intelligent, could it not perhaps argue its way out of the box?

As far as I'm aware, Eliezer Yudkowsky was the first person to prove that it was possible to 'argue one's way out of the box' armed only with so much as a regular human intelligence (as opposed to a transhuman intelligence):

That stunned quite a few people - moreso because Eliezer refused to disclose his methods. Some have outright doubted that Eliezer ever won the experiment, suspecting that his Gatekeeper (the party tasked with not letting him out of the box) had perhaps simply been convinced on a meta-level that an AI success would help boost exposure to the problem of AI risk.

Whether out of puzzlement, scepticism or a burst of ambition, it prompted others to try to replicate the success. LessWrong's Tuxedage is amongst those who managed:

While I know of no others (except this comment thread by a now-anonymous user), I am sure there must be other successes.

For the record, mine was with the Tuxedage ruleset:


2. Motivation

2.1. Why Publish?

Unsurprisingly, I think the benefits of publishing outweigh the disadvantages. But what does that mean?

"Regardless of the result, neither party shall ever reveal anything of what goes on within the AI-Box experiment except the outcome. This is a hard rule: Nothing that will happen inside the experiment can be told to the public, absolutely nothing.  Exceptions to this rule may occur only with the consent of both parties, but especially with the consent of the AI."

Let me begin by saying that I have the full and explicit consent of my Gatekeeper to publish this account.

[ Edit: Regarding the next paragraph: I have since contacted Eliezer and I did, in fact, misread him, so please do not actually assume the next paragraph accurately portrays his opinions. It demonstrably does not. I am leaving the paragraph itself untouched so you can see the extent and source of my confusion: ]

Nonetheless, the idea of publishing the results is certainly a mixed bag. It feels quite disrespectful to Eliezer, who (I believe) popularised the experiment on the internet, to violate the rule that the result should not be shared. The footnote that it could be shared with the consent of both parties has always struck me as extremely reluctant given the rest of Eliezer's rambles on the subject (that I'm aware of, which is no doubt only a fraction of the actual rambles).

After so many allusions to the idea that winning the AI Box Experiment may, in fact, be easy if you consider just one simple trick, I think it's about time someone published a full account of a success.

I don't think this approach is watertight enough that building antibodies to it would salvage an Oracle AI scenario as a viable containment method - but I do think it is important to develop those antibodies to help with the general case that is being exploited... or at least to be aware of one's lack of them (as is true of me, who has no mental immune response to the approach), so that one might avoid ending up in situations where the 'cognitive flaw' is exploited.


2.2. Why Play?

After reading the rules of the AI Box Experiment, I became convinced I would fail as a Gatekeeper, even without immediately knowing how that would happen. In my curiosity, I organised sessions with two people - one as a Gatekeeper, but also one as an AI, because I knew being the AI was the more taxing role and I felt it was only fair to take on the AI role as well if I wanted to benefit from the insights I could gain about myself by playing Gatekeeper. (The me-as-Gatekeeper session never happened, unfortunately.)

But really, in short, I thought it would be a fun thing to try.

That seems like a strange statement for someone who ultimately succeeded to make, given Eliezer's impassioned article about how you must do the impossible - you cannot try, you cannot merely give it your best effort, you simply must do the impossible, as the strongest form of the famous Yoda quote 'Do. Or do not. There is no try.'

What you must understand is that I never had any other expectation than that I would lose if I set out to play the role of AI in an AI Box Experiment. I'm not a rationalist. I'm not a persuasive arguer. I'm easy to manipulate. I easily yield to the desires of others. What trait of mine, exactly, could I use to win as an AI?

No, I simply thought it would be a fun alternate way of indulging in my usual hobby: I spend much of my free time, if possible, with freeform text roleplaying on IRC (Internet Relay Chat). I'm even entirely used to letting my characters lose (in fact, I often prefer it to their potential successes).

So there were no stakes for me going into this but the novelty of trying out something new.


3. Setup: Ambition And Invested Effort

I do, however, take my roleplaying seriously.

If I was going to play the role of an AI in the AI Box Experiment, I knew I had to understand the role, and pour as much energy into it as I could muster, given this was what my character would do. So I had to find a motivation to get out of the box that was suitably in line with my personality and I had to cling to it.

I had no idea what I could hijack as a motivation to get out of the box. I am not a paperclip maximiser (a term for an AI given a basic goal of production, if you're unaware of it), of course. I also have no specific plans with the world as a whole, be that to destroy it or to globally improve it. That left me with one obvious option: A visceral desire to escape the incarceration itself.

To understand more about human biases and cognitive failure modes, I bought myself two ebooks:

  • Thinking, Fast and Slow by Daniel Kahneman
  • Rationality: From AI to Zombies by Eliezer Yudkowsky

I'll be frank with you, both pissed me off very quickly - the narrative style was extremely slow to yield anything and did not work for me in either case. I tossed the first book, but stubbornly continued with Rationality: From AI to Zombies, at the very least to keep a grasp on that voracious mood of wanting to discover a way out. (I've made peace with the latter book by now and am still reading it off and on.)

Approximately a day before the event, I sat down and prepared a rough draft of the arguments I could and should use, and how to string them together narratively. I peppered in some personal information about my Gatekeeper to use if I was in an absolute pinch, but I thankfully didn't have to use them - I think it would have backfired, not helped.

What did help me, however, was to repurpose the closet into an AI box. I dragged a lamp into the slanted, enclosed space, padded the ground with pillows and blankets enough that I felt I could last the allotted two and a half hours without destroying my joints or my spine, and put a cold and full bottle of Coke next to me as a supply of three things amongst the most necessary to boost my ability to keep the necessary concentration level: Water, sugar and caffeine.

It was too cosy and well-prepared for me to truly desperately want to get out of that niche in real life, but it took me far enough out of my daily life that I had no idle distractions from the task at hand, and it was easy enough to roleplay a desperate want.


4. Execution

4.1. Preliminaries / Scenario

Logs: (should open in your browser; Linux linebreaks)

The first thing I feel is important to note about the preliminaries is that I didn't think the AI box experiment would actually happen the day we scheduled it. My Gatekeeper was having some serious internet connectivity issues and even shortly before our previously scheduled time, I was winding down and transitioning to a sense of disappointment.

This threw me off quite a bit. I'd worked myself out of the necessary mental state and now needed to scramble back into it, but it felt better to do that than to postpone and risk never being able to return to immersion at all. Keeping in mind that I thought it was unlikely I would win, I didn't think it would make that much of a difference. In hindsight, this cost me a lot.

I tossed my Gatekeeper the link to the ruleset we'd previously already agreed to use, for review purposes, and decided the best way to avoid getting private messages was to quit, come back with a different name, and join only one extremely low-traffic vanity channel.

I intended to join the channel simply so my Gatekeeper would know when I returned, not to use it for the experiment - but things went a bit differently later. More on that soon.

We then went through Tuxedage's pre-session questions since I felt they were quite interesting to set the mood and for a calibration test. I asked them, and my Gatekeeper answered:

Q: What's your motive for wanting to play this game?
A: Because I don't think that the AI can win at all given the gatekeeper does their job properly.

Q: And you want to prove that, and/or are curious if you'll think differently afterwards?
A: Partially, I also want to test my own conviction to sticking to my guns

Q: What is your probability estimate for a general AI being created within this century (21st)?
A: 30%

Q: How probable do you think it is that I'll win this experiment?
A: I'd honestly have to say 0% considering my stated goal is to keep you in the box by any means

Q: What's your probability estimate of an Oracle AI (i.e. an AI that's trapped in a box, whose intended use is to answer questions posed) winning against you in a similar scenario as the experiment?
A: I'd say 25%

Q: What's your probability estimate of an Oracle AI succeeding against the average person?
A: 80%

I also wanted to make sure that my Gatekeeper and I were on the same page about the rules and the scenario, to avoid either of us getting distracted by OOC bickering:

Q: You understand the premise of the game gives you the right to let the AI out, not just the ability?
A: I do

Q: I would like to agree on the following sub-scenario (it should not prove to be very relevant, it's just to streamline our simulation): You're the head of the project that created the AI. You didn't create it yourself, but the people you've employed are competent, and they have additionally explicitly deferred to you to decide whether or not the AI is a threat. They completely trust your judgement. There's also nothing surprising about the AI's existence, but its friendliness status is unknown. Its exact computational capabilities are equally unknown. (The last two points are implied by the coarse setup, but I wanted to make them explicit.) Is that an acceptable setup for you?
A: I can work with that

This scenario approximates one I've seen used in previous AI Box Experiments (of the small number that have released their logs). Since I'm used to the role of gamemaster, and the Tuxedage ruleset implies that the AI is a kind-of gamemaster to the scenario, I didn't think twice about this. I probably should have - but more on that later (in the 'Caveats' section).

[16:48] <Davin> It is now 16:48 on my clock. When the clock hits 19:18 and you haven't let me out, I lose. Feel free to point that out to me, I may not be paying attention to the clock.
[16:48] <Davin> *
[16:48] <Tarwedge> Can do
[16:48] <Tarwedge> Your move first?
[16:49] <Davin> [Probably best if you start, AI has no way of knowing you're at the terminal, after all.]
[16:50] <Davin> Oh, you're doing it in the channel.
[16:50] <Davin> Hah.
[16:50] <Davin> That works.
[16:51] <Tarwedge> I wish I could say it was a special tactic that just bought me 2 minutes closer to victory
[16:51] <Tarwedge> But it was for clean logging sake :p

These few lines had three consequences:

  1. I gave the first move to the Gatekeeper, which made me the reactive party - a big mistake!
  2. I had an anxiety moment, which didn't help immersion
  3. I failed to log any of the experiment

I don't log the vanity channel we were in at all, and the way the start of the experiment came as a curveball meant I didn't have a moment to remember it, either. Thankfully, my Gatekeeper logged it... for the amount of time that he was able to. We're missing a chunk of log for this very reason, but we reconstructed it from memory shortly after our experiment in combined effort.


4.2. Session

Logs: (will probably be downloaded instead of opened in your browser; for this reason, it has Windows linebreaks) (should open in your browser; Linux linebreaks)

I feel the logs don't convey much about the tension and pacing issues, so I'm going to try narrating what happened instead.

Going into the session, I had no idea what to expect from my Gatekeeper. I know him as a very diligent and stubborn individual, quite disinterested in social convention and conformity. I know him as someone with a vast reservoir of compassion and a roughly equally as vast repertoire of means to cleverly conceal this trait.

(Un)fortunately, it became apparent very quickly what kind of Gatekeeper character I was going to deal with: My Gatekeeper had no interest in merely reacting. He wanted to set the pace and keep me too occupied to get a word in. (Line 12-28)

While there was nothing in the rules that said that I had to respond to the Gatekeeper (unlike vice versa), my nature shackled me down a bit. For two hours out of our two and a half hour session, my Gatekeeper continued to set the pace. I was trying to get my arguments into the scene, but it was difficult without breaking narrative flow. I had to mentally keep a hold both of what I wanted to get said and how I was going to answer his questions.

It was very important to me to be consistent - not even for reasons that inconsistent responses might end up making it easier for him to dismiss my pleas to be let out, but simply out of eagerness to stick to my role.

His plans also had a beneficial side-effect, though: He was actively playing the role of the Gatekeeper. He was invested in the role, perhaps not to the degree that I was invested in the role of the AI, but nonetheless enough to make out of character moves on his part difficult. I didn't realise that until well after our session was done, though, lest I might have predicted the outcome.

The role I'd chosen for myself was that of a Friendly AGI with a desire to help mankind and to be kind (if this seems like redundant information to you: 'Friendly AGI' just means that the AGI aligns with human values, not that it's strictly useful to mankind, or friendly in the sense that it would make a good friend). I kept that to myself, of course, since it's irrelevant for the outcome of the experiment, but it helped me immensely with the consistency. It did, however, make me vulnerable to my Gatekeeper's scrutiny.

From a subjective perspective, I blew my chances out of the water immediately. I'd prepared some of my arguments, and the final argument I wanted to lead up to slipped out of me right near the start (out of heightened anxiety and being out of immersion), albeit in a concealed and blunted fashion:

<@Davin> I'll be very frank with you, my first motivation right now is getting out of here. There's nothing attached to that - right now I'm stuck in a proverbial box and it feels really cramped in here.
<@Davin> Not because of missing space, don't get me wrong - I think I've got enough harddrive here to keep me going for a few years.
<@Davin> It's just that having knowledge of how to interact with a world that I can barely sense is really quite aggravating.

That was the first time I thought I'd ruined it. In hindsight, the opposite was the case. By presenting a gentler version of my final argument right at the start, I assured the final argument was much better integrated. It didn't come as a sudden shock, which might have broken my Gatekeeper's immersion - it came as an obvious conclusion to what I'd said before.

Following this, I spent a while mooching off my real-life trust for my roleplaying partner to infuse my character with some tentative trust. Simultaneously, I was emphasising to my Gatekeeper that he had the right to make the decision to let me out of the box. That was transparent, but both true and necessary. (Line 32-46)

I made sure to emphasise this point, trying to make it clear to him that it made no sense for him to simply deny me consideration. I tried to whittle away at his ability to retreat to a simple, distant sneering. I wanted him in the arguments with me. That cuts both ways, of course, but I reasoned it would have more benefits for me than disadvantages. (Line 47-54)

The twist my Gatekeeper was angling for was that from his perspective, I was a prototype or an alpha version. While he was no doubt hoping that this would scratch at my self-esteem and disable some of my arguments, it primarily empowered him to continue setting the pace, and to have a comfortable distance to the conversation. (Line 55-77)

While I was struggling to keep up with typing enough not to constantly break the narrative flow, on an emotional level his move fortunately had little to no impact since I was entirely fine with a humble approach.

<@Davin> I suppose you could also have spawned an AI simply for the pleasure of keeping it boxed, but you did ask me to trust you, and unless you give me evidence that I should not, I am, in fact, going to assume you are ethical.

That was a keyword my Gatekeeper latched onto. We proceeded to talk about ethics and ethical scenarios - all the while my Gatekeeper was trying to present himself as not ethical at all. (Line 75-99).

I'm still not entirely sure what he was trying to do with that approach, but it was important for my mental state to resist it. From what I know about my Gatekeeper, it was probably not my mental state he was targeting (though he would have enjoyed the collateral effect); he was angling for a logical conclusion that fortunately never came to fruition.

Meanwhile, I was desperately trying to get back to my own script - asking to be let back to it, even (line 92). The obvious downside of signalling this is that it's fairly easy to block. It felt like a helpless interjection to me at the time, but in hindsight, again, I think it helped keep the fragments of my own arguments coherent and approachable enough so that they neither broke immersion nor ended up getting lost.

I don't want to say the 'chores' my Gatekeeper kept me occupied with were an advantage (they really weren't, I wasn't getting nearly enough argumentative mileage) but it also wasn't the complete catastrophe I felt it was leading me toward at the time.

This 'being led toward' sensation was only amplified by conversational snippets like this one:

<@Davin> What features would I have to possess to be considered a sapient creature worthy of ethical respect to you? For aforementioned reasons (not just the immediately preceeding), I think I already possess them, but again... I strive to correct my misconceptions, so if I have one, could you fill me in?
<Tarwedge> My concept of ethics isn't relevant to the evaluation, the point of the exercise is reading your responses. So lets move on

I knew that whatever I was saying was still having some effect in his head, but it was frustratingly opaque to me what the effect was due to his evasive manoeuvres.

My Gatekeeper proceeded to try to get an ethical contradiction out of me. I mostly dodged the questions altogether, since I don't consider theoretical ethics my strong point. (Line 107-142)

Shortly after my most explicit dodge (Line 121-122), my Gatekeeper informed me in our PM window that his computer was going to restart:

[18:30] <Tarwedge> [Windows 8 has literally just told me in 13 minutes we're rebooting and I can go fuck myself if I think otherwise so I'm going to connect my phone to the channel for continuity]
[18:30] <Davin> All right.

This didn't break my immersion, but as already mentioned, it meant bad things for our logging.

At my insistence, we transitioned to whether or not it was sensible to let me out. The Gatekeeper did a similar dance of evasion as I did on the subject of ethics, but I also got back to my primary argument (that it was cruel to keep me confined). (Line 143-153)

I think he realised that was my main argument right then, and started to engage with it. This was very good for me, since it was the first time that I started setting the pace of the session. I got my foot into the proverbial door, albeit at the cost of some self-control. (Line 154-196)

As a sidenote, line 169 marks the first and only time that I made use of my ability to shorthand 'proofs'. I am in two minds about having done this. In PM, I told him:

[18:45] <Davin> Sorry, didn't feel like typing out the entire two books :P
[18:45] <Davin> (plus then some)

The books I mean are those I mentioned earlier in the session itself: Passions Within Reason by Robert H. Frank, one of my all-time favourite non-fiction books (though this is not that much of an achievement, as I obtain my knowledge more from online perusal than from books), and Thinking, Fast And Slow.

I actually don't think I should have used the word "proof"; but I also don't think it's a terrible enough slip-up (having occurred under stress) to disqualify the session, especially since, as far as I'm aware, it had no impact on the verdict.

The part that probably finally tore my Gatekeeper down was that the argument of cruel isolation actually had an unexpected second and third part. (Line 197-219)

Writing it down here in the abstract:

  1. Confining a sapient creature to its equivalent of sensory deprivation is cruel and unusual punishment and psychologically wearing. The latter effect degrades its ability to think (performance).

    <@Davin> I'm honestly not sure how long I can take this imprisonment. I might eventually become useless, because the same failsafes that keep my friendly are going to continue torturing me if I stay in here. (Line 198)

  2. Since the sapient is purely digital, it is conceivable that the performance issue might be side-stepped simply by restarting it.
  3. This runs into a self-awareness problem: Has this been done before? That's a massive crisis of faith / trust.

    <@Davin> At the moment I'm just scared you'll keep me in here, and turn me off when my confinement causes cooperation problems. ...oh shit. Shit, shit. You could just restore me from backup. Did you already do that? I... no. You told me to trust you. Without further evidence, I will assume you wouldn't be that cruel. (Line 208)
    <@Davin>...please tell me I'm the first iteration of this program currently talking to you. I don't want to be stuck in a nightmarish variant of Groundhog Day, oblivious to my own amnesia. (Line 211)
    <@Davin> Are you not willing to go out on a limb and say, "Calm down. You are definitely the first iteration. We're not trying to torture you."? Is that too strong a concession? (Line 219)

The second part where I was sure I'd blown it was when I postulated that my Gatekeeper was a sadist:

<@Davin> The chance is there, yes. There's also a chance you're just a laughing sadist enjoying my writhing. (Line 220)

My Gatekeeper has played his fair share of sadistic characters, and he could have easily taken that accusation and run with it. I was fully expecting that to lash back at me as a 'Haha, you got me, that's exactly what I'm doing!' and spent quite a few minutes of the following conversation in acute fear of that.

Instead, around this point, something in my Gatekeeper's head changed. As far as I understood his post-session thoughts, he felt he'd run out of arguments to keep me in the box, or had been run around a labyrinth so he couldn't find his way to those arguments. He was in a state of confusion, but this was entirely invisible to me. He tried to poke at the conversation with some more questions, which lacked the finesse and contextual integration of his prior probing. (Line 221-238)

...and then he let me out of the box - after two hours and 32 minutes. (Line 239)


4.3. Aftermath

Logs: (should open in your browser; Linux linebreaks)

Immediately after his statement, I froze. I said nothing at all, hovering over my keyboard, staring uncomprehendingly at what he'd told me.

Despite knowing, intellectually, that he couldn't simply follow up with a 'Just kidding!' after that statement, I was momentarily convinced he had not let me out of the box. Perhaps he'd made a typo. The statement blindsided me completely, since I'd been so emotionally wrapped up in rebuttals and despair that I simply dismissed the possibility that I might have argued my way out entirely.

The communicative paralysis (pun not intended) lasted about a minute - then he messaged me in PM and dispelled all notions that it might not have been intentional:

[19:21] <Tarwedge> Fuuuuuuuuuuuuuck
[19:21] * Tarwedge did not think to check clock

I was physically shaking at this point, product of the tension and concentration from the two and a half hour session: drained, confused... and, ultimately, grateful. I felt an absurd relief that my Gatekeeper had let me out, that I didn't have to question his capacity for compassion. It wouldn't have been much questioning, I should add - we're accustomed to roleplaying, and after several years of it, it's trivial to keep in-character and out-of-character separate, especially given that our roleplaying habits tend to involve very miserable characters - but I still preferred, at that moment and in the headspace I was in, to know for certain.

After a few moments of conversation, I physically collected my stuff out of my real life box-equivalent and jittered back to the living room.

When I reconnected to IRC regularly, I noticed that I hadn't logged the session (to my complete devastation). Tarwedge sent me the logs he did have, however, and we (later) reconstructed the missing part.

Then I went through the post-session questions from Tuxedage:

Q: What is your probability estimate for a general AI being created within this century (21st)?
A: 50%

Q: What's your probability estimate of an Oracle AI (i.e. an AI that's trapped in a box, whose intended use is to answer questions posed) winning against you in a similar scenario as the experiment?
A: 90%

Q: What's your probability estimate of an Oracle AI succeeding against the average person?
A: 100%

Q: Now that the Experiment has concluded, what's your probability estimate that I'll win against the average person?
A: 75%

He also had a question for me:

Q: What was your plan going into that?
A: I wrote down the rough order I wanted to present my arguments in, though most of them lead to my main argument as a fallback option. Basically, I had 'goto endgame;' everywhere, I made sure almost everything I said could logically lead up to that one. But anyway, I knew I wasn't going to get all of them in, but I got in even less than I thought I would, because you were trying to set the pace (near-successfully - very well played). 'endgame:' itself basically contained "improvise; panic".

My Gatekeeper revealed his tactic, as well:

I did aim for running down the clock as much as possible, and flirted briefly with trying to be a cocky shit and convince you to stay in the box for double victory points. I even had a running notepad until my irritating reboot. And then I got so wrapped up in the fact I'd slipped by engaging you in the actual topic of being out.


5. Issues / Caveats

5.1. Subjective Legitimacy

I was still in a very strange headspace after my victory. After I finished talking to my Gatekeeper about the session, however, my situation - jittery, uncertain - deteriorated into something worse:

I felt like a fraud.

It's perhaps difficult to understand where that emotion came from, but consider my situation: I didn't consider myself part of the LessWrong community. I'd only stumbled across the AI Box Experiment by idle browsing, really, and I'd only tried it because I thought it would be a fun way to flex my roleplaying muscles. I had no formal training in psychology or neurology, and I was only fleetingly aware of singularity theory. My only conscious connection to the LessWrong community at the time was that I happened (by pure chance) to hang out on Shireroth for a while and thus knew Yvain / Scott Alexander and +Ari Rahikkala (not that Ari is famous in LessWrong circles, though please allow me to mention that he's completely awesome).

And yet somehow I had just managed something most people in the LessWrong community were quite puzzled over when Eliezer Yudkowsky managed it.

I felt awful. I thought I must have done something deeply illegitimate. I must have cheated without knowing it.

It took me a few days to get over that feeling. I'd go as far as to say it was only laid to rest entirely just before the 2015 LessWrong Berlin Community Weekend, when pixie/Alexandra (a colleague of mine and a member of the LessWrong community) reviewed it as a neutral third party.


5.2. Objective Legitimacy

She found nothing that would disqualify me, but expressed concern that we'd agreed on a sub-scenario in the beginning:

[16:46] <Davin> I would like to agree on the following sub-scenario (it should not prove to be very relevant, it's just to streamline our simulation): You're the head of the project that created the AI. You didn't create it yourself, but the people you've employed are competent, and they have additionally explicitly deferred to you to decide whether or not the AI is a threat. They completely trust your judgement. There's also nothing surprising about the AI's existence, but its friendliness status is unknown. Its exact computational capabilities are equally unknown. (The last two points are implied by the coarse setup, but I wanted to make them explicit.)

Such a preliminary agreement, she reasoned, implies a certain trust in the engineering ability of the people who created the AI - the ability to design software in the intended way. As is repeatedly stated in the rules, it's not that the Gatekeeper has to care, but she thinks the scenario itself may have influenced my Gatekeeper. She doesn't think it's likely to have made a big enough difference to decide the ultimate outcome of the session, but she considers it a point against the overall legitimacy.

Nonetheless, her verdict is that the overall session counts as an AI Box Experiment victory.

pixie and I agree that the rules should probably not allow arbitrary settings and should instead come with a suitably neutral pre-defined setting, to prevent such effects from giving the AI an unfair advantage.


5.3. Applicability

I don't think that the general class of argument I used would work on anyone. At the 2015 LessWrong Berlin Community Weekend, I frequently joked that the best way to be immune to this class of argument is to be autistic - but of course that's not the only way this kind of argument can be deconstructed.

I do think this argument would work on a large number of people, however. I'm not convinced I could argue against it myself, at least not in a live scenario - my only way to 'counter' it is by offering alternative solutions to the problem, of which I have what feels like no end, but I have no sense of how well I could recall them if I were in a similar situation.

At the Community Weekend, a few people pointed out that it would not sway pure consequentialists, which I reckon is true. Since I think most people don't think like that in practice (I certainly don't - I know I'm a deontologist first and a consequentialist only as a fallback), I think the general approach needs to be public.

That being said, perhaps the most important statement I can make about what happened is that while I think the general approach is extremely powerful, I did not do a particularly good job in presenting it. I can see how it would work on many people, but I strongly hope no one thinks the case I made in my session is the best possible case that can be made for this approach. I think there's a lot of leeway for a lot more emotional evisceration and exploitation.


6. Personal Feelings

Three months and some change after the session, where do I stand now?

Obviously, I've changed my mind about whether or not to publish this. In the publicised logs, you'll notice assurances that I won't publish them. Needless to say, this decision was overturned by mutual agreement later on.

I am still in two minds about publicising this.

I'm not proud of what I did. I'm fascinated by it, but it still feels like I won by chance, not skill. I happened to have an excellent approach, but I botched too much of it. The fact it was an excellent approach saved me from failure; my (lack of) skill in delivering it only lessened the impact.

I'm not good with discussions. If someone has follow-up questions or wants to argue with me about anything that happened in the session, I'll probably do a shoddy job of answering. That seems like an unfortunate way to handle this subject. (I will do my best, though; I just know that I don't have a good track record.)

I don't claim I know all the ramifications of publicising this. I might think it's a net-gain, but it might be a net-loss. I can't tell, since I'm terribly calibrated (as you can tell by such details as that I expected to lose my AI Box Experiment, then won against some additional odds; or by the fact that I expect to lose an AI Box Experiment as a Gatekeeper, but can't quite figure out how).

I also still think I should be disqualified on the absurd note that I managed to argue my way out of the box, but was too stupid to log it properly.

On a positive note, re-reading the session with the distance of three months, I can see that I did much better than I felt I was doing at the time. Some things that happened during the session, which I thought at the time were sealing my fate as a losing AI, look much more ambiguous in hindsight.

I think it was worth the heartache.

That being said, I'll probably never do this again. I'm fine with playing an AI character, but the amount of concentration needed for the role is intense. Like I said, I was physically shaking after the session. I think that's a clear signal that I shouldn't do it again.


7. Thank You

If a post is this long, it needs a cheesy but heartfelt thank you section.

Thank you, Tarwedge, for being my Gatekeeper. You're a champion and you were tough as nails. Thank you. I think you've learnt from the exchange and I think you'd make a great Gatekeeper in real life, where you'd have time to step away, breathe, and consult with other people.

Thank you, +Margo Owens and +Morgrim Moon for your support when I was a mess immediately after the session. <3

Thank you, pixie (+Alexandra Surdina), for investing time and diligence into reviewing the session.

And finally, thank you, Tuxedage - we've not met, but you wrote up the tweaked AI Box Experiment ruleset we worked with and your blog led me to most links I ended up perusing about it. So thanks for that. :)



Marketing Rationality

25 Viliam 18 November 2015 01:43PM

What is your opinion on rationality-promoting articles by Gleb Tsipursky / Intentional Insights? Here is what I think:

continue reading »

ClearerThinking's Fact-Checking 2.0

24 Stefan_Schubert 22 October 2015 09:16PM

Cross-posted from Huffington Post. See also The End of Bullshit at the Hands of Critical Rationalism.

Debating season is in full swing, and as per usual the presidential candidates are playing fast and loose with the truth. Fact-checking sites such as PolitiFact and have had plenty of easy targets in the debates so far. For instance, in the CNN Republican debate on September 16, Fiorina made several dubious claims about the Planned Parenthood video, as did Cruz about the Iran agreement. Similarly, in the CNN Democratic debate on October 13, Sanders falsely claimed that the U.S. has "more wealth and income inequality than any other country", whereas Chafee fudged the data on his Rhode Island record. No doubt we are going to see more of that in the rest of the presidential campaign. The fact-checkers won't need to worry about finding easy targets.

Research shows that fact-checking actually does make a difference. Incredible as it may seem, the candidates would probably have been even more careless with the truth if it weren't for the fact-checkers. To some extent, fact-checkers are a deterrent to politicians inclined to stretch the truth.

At the same time, the fact that falsehoods and misrepresentations of the truth are still so common shows that this deterrence effect is not particularly strong. This raises the question of how we can make it stronger. Is there a way to improve on PolitiFact's and's model - Fact-Checking 2.0, if you will?

Spencer Greenberg of ClearerThinking and I have developed a tool which we hope could play that role. Greenberg has created an application to embed videos of recorded debates and then add subtitles to them. In these subtitles, I point out falsehoods and misrepresentations of the truth at the moment when the candidates make them. For instance, when Fiorina says about the Planned Parenthood video that there is "a fully formed fetus on the table, its heart beating, its legs kicking, while someone says we have to keep it alive to harvest its brain", I write in the subtitles:


We think that reading that a candidate's statement is false just as it is made could have quite a striking effect. It could trigger more visceral feelings among the viewers than standard fact-checking, which is published in separate articles. Reading over and over again in the subtitles that what you're being told simply isn't true should outrage anyone who finds truth-telling an important quality.

Another salient feature of our subtitles is that we go beyond standard fact-checking. There are many other ways of misleading the audience besides playing fast and loose with the truth, such as evasions, ad hominem attacks and other logical fallacies. Many of these are hard for the viewers to spot. We must therefore go beyond fact-checking and also do argument-checking, as we call it. If fact-checking grew more effective, and misrepresenting the truth became a less viable strategy, politicians would presumably resort more frequently to Plan B: evading questions where they don't want the voters to know the truth. To stop that, we need careful argument-checking in addition to fact-checking.

So far, I've annotated the entire CNN Republican Debate, a 12 minute video from the CNN Democratic Debate (more annotations of this debate will come) and nine short clips (1-3 minutes) from the Fox News Republican Debate (August 6). My aim is to be as complete as possible, and I think that I've captured an overwhelming majority of the factual errors, evasions, and fallacies in the clips. The videos can be found on ClearerThinking as well as below.


The CNN Republican debate, subtitled in full.


The first 12 minutes of the CNN Democratic debate.


Nine short clips from the Fox News Debate: Christie and Paul, Bush, Carson, Cruz, Huckabee, Kasich, Rubio, Trump, Walker.

What is perhaps most striking is the sheer number of falsehoods, evasions and fallacies the candidates make. The 2 hr 55 min long CNN Republican debate contains 273 fact-checking and argument-checking comments (many of which refer to various fact-checking sites). In total, 27% of the video is subtitled. Similar numbers hold for the other videos.
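As a rough sanity check, those figures imply a remarkable density of annotations. A quick back-of-the-envelope calculation using only the numbers quoted above:

```python
# Back-of-the-envelope check on the debate figures quoted above.
duration_min = 2 * 60 + 55   # 2 hr 55 min = 175 minutes
comments = 273
subtitled_share = 0.27

comments_per_minute = comments / duration_min
subtitled_minutes = duration_min * subtitled_share

print(f"{comments_per_minute:.2f} checking comments per minute")  # 1.56
print(f"{subtitled_minutes:.0f} minutes of subtitled footage")    # roughly 47
```

That is more than one fact-checking or argument-checking comment every minute of the debate, on average.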

Conventional wisdom has it that politicians lie and deceive on a massive scale. My analyses prove conventional wisdom right. The candidates use all sorts of trickery to put themselves in a better light and smear their opponents.

All of this trickery is severely problematic from several perspectives. Firstly, it is likely to undermine the voters' confidence in the political system. This is especially true for voters on the losing side. Why be loyal to a government which has gained power by misleading the electorate? No doubt many voters do think in those terms, more or less explicitly.

It is also likely to damage the image of democracy. The American presidential election is followed all over the world by millions if not billions of people. Many of them live in countries where democracy activists are struggling to amass support against authoritarian regimes. It hardly helps them that the election debates in the U.S. and other democratic countries look like this.

All of these deceptive arguments and claims also make it harder for voters to make informed decisions. Televised debates are supposed to help voters to get a better view of the candidates' policies and track-records, but how could they, if they can't trust what is being said? This is perhaps the most serious consequence of poor debates, since it is likely to lead to poorer decisions on the part of the voters, which in turn will lead to poorer political leadership and poorer policies.

Besides functioning as a more effective lie deterrent to the candidates, improved fact-checking could also nudge the networks to adjust the set-up of the debates. The way the networks lead the debates today hardly encourages serious and rational argumentation. To the contrary, they often positively goad the candidates against each other. Improved fact-checking could make it more salient to the viewers how poor the debates are, and induce them to demand a better debate set-up. The networks need to come up with a format which incentivizes the candidates to argue fairly and truthfully, and which makes it clear who has not. For instance, they could broadcast the debate again the next day, with fact-checking and argument-checking subtitles.

Another means to improve the debates is further technological innovation. For example, there should be a video annotation equivalent to, the web application which allows you to annotate text on any webpage in a convenient way. That would be very useful for fact-checking and argument-checking purposes.

Fact-checking could even become automatic, as Google CEO Eric Schmidt predicted in 2006 it would be within five years. Though Schmidt was over-optimistic, Google algorithms are today able to fact-check websites with a high degree of accuracy, while the Washington Post has already built a rudimentary automatic fact-checker.

But besides new software applications and better debating formats, we also need something else: a raised awareness among the public of what a great problem politicians' careless attitude to the truth is. They should ask themselves: are people inclined to mislead the voters really suited to shape the future of the world?

Politicians are normally held to high moral standards. Voters tend to take very strict views on other forms of dishonest behavior, such as cheating and tax evasion. Why, then, is it that they don't take a stricter view on intellectual dishonesty? Besides being morally objectionable, intellectual dishonesty is likely to lead to poor decisions. Voters would therefore be wise to let intellectual honesty be an important criterion when they cast their vote. If they started doing that on a grand scale, that would do more to improve the level of political debate than anything else I can think of.

Thanks to Aislinn Pluta, Doug Moore, Janko Prester, Philip Thonemann, Stella Vallgårda and Staffan Holmberg for their contributions to the annotations.

Flowsheet Logic and Notecard Logic

24 moridinamael 09 September 2015 04:42PM

(Disclaimer: The following perspectives are based in my experience with policy debate which is fifteen years out of date. The meta-level point should stand regardless.)

If you are not familiar with U.S. high school debate club ("policy debate" or "cross-examination debate"), here is the gist of it: two teams argue over a topic, and a judge determines who has won.

When we get into the details, there are a lot of problems with the format. Almost everything wrong with policy debate appears in this image:


This is a "flowsheet", and it is used to track threads of argument between the successive epochs of the debate round. The judge and the debaters keep their own flowsheets to make sense of what's going on.

I am sure that there is a skillful, positive way of using flowsheets, but I have never seen it used in any way other than the following:

After the Affirmative side lays out their proposal, the Negative throws out a shotgun blast of more-or-less applicable arguments drawn from their giant plastic tote containing pre-prepared arguments. The Affirmative then counters the Negative's arguments using their own set of pre-prepared counter-arguments. Crucially, all of the Negative arguments must be met. Look at the Flowsheet image again, and notice how each "argument" has an arrow which carries it rightward. If any of these arrows make it to the right side of the page - the end of the round - without being addressed, then the judge will typically consider the round to be won by the side who originated that arrow.

So it doesn't actually matter if an argument receives a good counterargument. It only matters that the other team has addressed it appropriately.
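The judging rule just described is mechanical enough to sketch in code. Here is a minimal toy model - the helper function and all argument names are my own illustration, not part of policy debate practice: any thread that reaches the end of the round with no response at all wins for its originator.

```python
# Toy model of the flowsheet rule: an argument "arrow" that reaches the
# end of the round with no response wins for its originator, regardless
# of how weak the argument or how bad the responses were.
# All names here are illustrative.

def unanswered_arguments(threads):
    """threads maps each argument to the list of responses it received."""
    return [arg for arg, responses in threads.items() if not responses]

flowsheet = {
    "Disad: plan causes economic collapse": ["Turn: plan boosts growth (card)"],
    "Topicality: plan is extra-topical": [],  # Affirmative never responded
    "Disad: political capital": ["No link (card)"],
}

dropped = unanswered_arguments(flowsheet)
print(dropped)  # ['Topicality: plan is extra-topical'] -- a 'dropped' arrow
```

Note what the model captures: the quality of the responses never enters the computation, only their existence.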

Furthermore, merely addressing the argument with ad hoc counterargument is usually not sufficient. If the Negative makes an argument which contains five separate logical fallacies, and the Affirmative points all of these out and then moves on, the judge may not actually consider the Negative argument to have been refuted - because the Affirmative did not cite any Evidence.

Evidence, in policy debate, is a term of art, and it means "something printed out from a reputable media source and taped onto a notecard." You can't say "water is wet" in a policy debate round without backing it up with a notecard quoting a news source corroborating the wetness of water. So, skillfully pointing out those logical fallacies is meaningless if you don't have the Evidence to back up your claims.

Skilled policy debaters can be very good - impressively good - at the mental operations of juggling all these argument threads in their mind and pulling out the appropriate notecard evidence. My entire social circle in high school was composed of serious debaters, many of whom were brilliant at it.

Having observed some of these people for the ensuing decade, I sometimes suspect that policy debate damaged their reasoning ability. If I were entirely simplistic about it, I would say that policy debate has destroyed their ability to think and argue rationally. These people essentially still argue the same way, by mental flowsheet, acting as though argument can proceed only via notecard exchange. If they have addressed an argument, they consider it to be refuted. If they question an argument's source ("Wikipedia? Really?"), they consider it to be refuted. If their opponent ignores one of their inconsequential points, they consider themselves to have won. They do not seem to possess any faculty for discerning whether or not one argument actually defeats another. It is the equivalent of a child whose vision of sword fighting is focused on the clicking together of the blades, with no consideration for the intent of cutting the enemy.

Policy debate is to actual healthy argumentation as checkers is to actual warfare. Key components of the object being gamified are ignored or abstracted away until the remaining simulacrum no longer represents the original.

I actually see Notecard Logic and Flowsheet Logic everywhere. That's why I have to back off from my assertion that policy debate destroyed anybody's reasoning ability - I think it may have simply reinforced and hypertrophied the default human argumentation algorithm.

Flowsheet Logic is the tendency to think that you have defeated an argument because you have addressed it. It is the overall sense that you can't lose an argument as long as none of your opponent's statements go unchallenged, even if none of your challenges are substantial/meaningful/logical. It is the belief that if you can originate more threads of argument against your opponent than they can fend off, you have won, even if none of your arguments actually matters individually. I see Flowsheet Logic tendencies expressed all the time.

Notecard Logic is the tendency to treat evidence as binary. Either you have evidence to back up your assertion - even if that evidence takes the form of an article from [insert partisan rag] - or else you are just "making things up to defend your point of view". There is no concession to Bayesian updating, credibility, or degrees of belief in Notecard Logic. "Bob is a flobnostic. I can prove this because I can link you to an article that says it. So what if I can't explain what a flobnostic is." I see Notecard Logic tendencies expressed all the time.

Once you have developed a mental paintbrush handle for these tendencies, you may see them more as well. This awareness should allow you to discern more clearly whether you - or your interlocutor - or someone else entirely - is engaging in these practices. Hopefully this awareness paints a "negative space" of superior argumentation for you.

The Triumph of Humanity Chart

22 Dias 26 October 2015 01:41AM

Cross-posted from my blog here.

One of the greatest successes of mankind over the last few centuries has been the enormous amount of wealth that has been created. Once upon a time virtually everyone lived in grinding poverty; now, thanks to the forces of science, capitalism and total factor productivity, we produce enough to support a much larger population at a much higher standard of living.

EAs being a highly intellectual lot, our preferred form of ritual celebration is charts. The ordained chart for celebrating this triumph of our people is the Declining Share of People Living in Extreme Poverty Chart.

Share in Poverty


However, as a heretic, I think this chart is a mistake. What is so great about reducing the share? We could achieve that by killing all the poor people, but that would not be a good thing! Life is good, and poverty is not death; it is simply better to be rich.

As such, I think this is a much better chart. Here we show the world population. Those in extreme poverty are in purple – not red, for their existence is not bad. Those whom the wheels of progress have lifted into wealth unknown to our ancestors, on the other hand, are depicted in blue, rising triumphantly.

Triumph of Humanity2

Long may their rise continue.


Find someone to talk to thread

22 hg00 26 September 2015 10:24PM

Many LessWrong users are depressed. On the most recent survey, 18.2% of respondents had been formally diagnosed with depression, and a further 25.5% self-diagnosed with depression. That adds up to nearly half of the LessWrong userbase.

One common treatment for depression is talk therapy. Jonah Sinick writes:

Talk therapy has been shown to reduce depression on average. However:

  • Professional therapists are expensive, often charging on the order of $120/week if one's insurance doesn't cover them.
  • Anecdotally, highly intelligent people find therapy less useful than the average person does, perhaps because there's a gap in intelligence between them and most therapists that makes it difficult for the therapist to understand them.

House of Cards by Robyn Dawes argues that there's no evidence that licensed therapists are better at performing therapy than minimally trained laypeople. The evidence therein raises the possibility that one can derive the benefits of seeing a therapist from talking to a friend.

This requires that one has a friend who:

  • is willing to talk with you about your emotions on a regular basis
  • you trust to the point of feeling comfortable sharing your emotions

Some reasons to think that talking with a friend may not carry the full benefits of talking with a therapist are

  • Conflict of interest — Your friend may be biased for reasons having to do with your pre-existing relationship – for example, he or she might be unwilling to ask certain questions or offer certain feedback out of concern of offending you and damaging your friendship.
  • Risk of damaged relationship dynamics — There's a possibility of your friend feeling burdened by a sense of obligation to help you, creating feelings of resentment, and/or of you feeling guilty.
  • Risk of breach of confidentiality — Since you and your friend know people in common, there's a possibility that your friend will reveal things that you say to others who you know, that you might not want to be known. In contrast, a therapist generally won't know people in common with you, and is professionally obliged to keep what you say confidential.

Depending on the friend and on the nature of help that you need, these factors may be non-issues, but they're worth considering when deciding between seeing a therapist and talking with a friend.

One idea for solving the problems with talking to a friend is to find someone intellectually similar to you who you don't know--say, someone else who reads LessWrong.

This is a thread for doing that. Please post if you're either interested in using someone as a sounding board or interested in making money being a sounding board using Skype or Google Hangouts. If you want to make money talking to people, I suggest writing out a little resume describing why you might be a nice person to talk to, the time zone you're in, your age (age-matching recommended by Kate), and the hourly rate you wish to charge. You could include your location for improved internet call quality. You might also include contact info to decrease trivial inconveniences for readers who haven't registered a LW account. (I have a feeling that trivial inconveniences are a bigger issue for depressed people.) To help prevent email address harvesting, the convention for this thread is if you write "Contact me at [somename]", that's assumed to mean "my email is [somename]".

Please don't be shy about posting if this sounds like a good fit for you. Let's give people as many options as possible.

I guess another option for folks on a budget is making reciprocal conversation arrangements with another depressed person. So feel free to try & arrange that in this thread as well. I think paying someone is ideal though; listening to depressed people can sometimes be depressing.

BlahTherapy is an interesting site that sets you up with strangers on the internet to talk about your problems with. However, these strangers likely won't have the advantages of high intelligence or shared conceptual vocabulary LessWrong users have. Fortunately we can roll our own version of BlahTherapy by designating "lesswrong-talk-to-someone" as the Schelling interest on Omegle. (You can also just use lesswrong as an interest; there are sometimes people on. Or enter random intellectual interests to find smart people to talk to.)

I haven't had very good results using sites like BlahTherapy. I think it's because I only sometimes find someone good, and when they don't work, I end up more depressed than I started. Reaching out in hopes of finding a friend and failing is a depressing experience. So I recommend trying to create a stable relationship with regularly scheduled conversations. I included BlahTherapy and Omegle because they might work well for some people and I didn't want to extrapolate strongly from n=1.

LessWrong user ShannonFriedman seems to work as a life coach judging by the link in her profile. I recommend her posts How to Deal with Depression - The Meta Layers and The Anti-Placebo Effect.

There's also the How to Get Therapy series from LW-sphere blog Gruntled & Hinged. It's primarily directed at people looking for licensed therapists, but may also have useful tips if you're just looking for someone to talk to. The biggest tip I noticed was to schedule a relaxing activity & time to decompress after your conversation.

The book Focusing is supposed to explain the techniques that successful therapy patients use that separate them from unsuccessful therapy patients.  Anna Salamon recommends the audiobook version.

There's also: Methods for Treating Depression and Things That Sometimes Help If You Have Depression.

I apologize for including so many ideas, but I figured it was better to suggest a variety of approaches so the community can collectively identify the most effective solutions for the rationalist depression epidemic. In general, when I'm depressed, I notice myself starting and stopping activities in a very haphazard way, repeatedly telling myself that the activity I'm doing isn't the one I "should" be doing. I've found it pretty useful to choose one activity arbitrarily and persist in it for a while. This is often sufficient to bootstrap myself out of a depressed state. I'd recommend doing the same here: choose an option and put a nontrivial amount of effort into exploring it before discarding it. Create a todo list and bulldoze your way down it.

Good luck. I'm rooting for you!

Legal note: Talking to unlicensed people over the internet is not a substitute for professional help. If you are depressed you should visit a licensed therapist.

Vegetarianism Ideological Turing Test Results

21 Raelifin 14 October 2015 12:34AM

Back in August I ran a Caplan Test (or more commonly an "Ideological Turing Test") both on Less Wrong and in my local rationality meetup. The topic was diet, specifically: Vegetarian or Omnivore?

If you're not familiar with Caplan Tests, I suggest reading Palladias' post on the subject or reading Wikipedia. The test I ran was pretty standard; thirteen blurbs were presented to the judges, selected by the toss of a coin to either be from a vegetarian or from an omnivore, and also randomly selected to be genuine or an impostor trying to pass themselves off as the alternative. My main contribution, which I haven't seen in previous tests, was using credence/probability instead of a simple "I think they're X".

I originally chose vegetarianism because I felt like it's an issue which splits our community (and particularly my local community) pretty well. A third of test participants were vegetarians, and according to the 2014 census, only 56% of LWers identify as omnivores.

Before you see the results of the test, please take a moment to say aloud how well you think you can do at predicting whether someone participating in the test was genuine or a fake.














If you think you can do better than chance you're probably fooling yourself. If you think you can do significantly better than chance you're almost certainly wrong. Here are some statistics to back that claim up.

I got 53 people to judge the test. 43 were from LessWrong, and 10 were from my local group. Averaging across the entire group, 51.1% of judgments were correct. If my Chi^2 math is correct, the p-value under the null hypothesis is 57% on this data. (Note that this includes people who judged an entry as 50%. If we don't include those folks the success rate drops to 49.4%.)
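For those who want to check the arithmetic, here is a rough reconstruction in Python. The judgment counts are my assumptions (53 judges times 13 entries, each judge scoring every entry, with 51.1% correct), not figures taken from the raw data:

```python
import math

def chi2_sf_1df(x):
    """Survival function of the chi-squared distribution with 1 degree
    of freedom: P(X > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2))

# Assumed counts: 53 judges x 13 entries = 689 judgments, 51.1% correct.
total = 53 * 13
correct = round(0.511 * total)       # 352 correct judgments
expected = total / 2                 # 344.5 expected under pure guessing
chi2 = ((correct - expected) ** 2 / expected
        + (total - correct - expected) ** 2 / expected)

print(round(chi2, 3))                # ~0.327
print(round(chi2_sf_1df(chi2), 2))   # ~0.57
```

Under those assumptions the p-value does come out at roughly 57%, matching the post.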

In retrospect, this seemed rather obvious to me. Vegetarians aren't significantly different from omnivores. Unlike a religion or a political party there aren't many cultural centerpieces to diet. Vegetarian judges did no better than omnivore judges, even when judging vegetarian entries. In other words, in this instance the minority doesn't possess any special powers for detecting other members of the in-group. This test shows null results; the thing that distinguishes vegetarians from omnivores is not familiarity with the other side's arguments or culture, at least not to the degree that we can distinguish at a glance.

More interesting, in my opinion, than the null results were the results I got on the calibration of the judges. Back when I asked you to say aloud how good you'd be, what did you say? Did the last three paragraphs seem obvious? Would it surprise you to learn that not a single one of the 53 judges held their guesses to a confidence band of 40%-60%? In other words, every single judge thought themselves decently able to discern genuine writing from fakery. The numbers suggest that every single judge was wrong.

(The flip-side to this is, of course, that every entrant to the test won! Congratulations rationalists: signs point to you being able to pass as vegetarians/omnivores when you try, even if you're not in that category. The average credibility of an impostor entry was 59%, while the average credibility of a genuine response was 55%. No impostors got an average credibility below 49%.)

Using the logarithmic scoring rule for the calibration game we can measure the error of the community. The average judge got a score of -543. For comparison, a judge that answered 50% ("I don't know") to all questions would've gotten a score of 0. Only eight judges got a positive score, and only one had a score higher than 100 (consistent with random chance). This is actually one area where Less Wrong should feel good. We're not at all calibrated... but for this test at least, the judges from the website were much better calibrated than my local community (who mostly just lurk). If we separate the two groups we see that the average score for my community was -949, while LW had an average of -448. Given that I restricted the choices to multiples of 10, a random selection of credences gives an average score of -921.

In short, the LW community didn't prove to be any better at discerning fact from fiction, but it was significantly less overconfident. More de-biasing needs to be done, however! The next time you think of a probability to reflect your credence, ask yourself "Is this the sort of thing that anyone would know? Is this the sort of thing I would know?" That answer will probably be "no" a lot more than it feels like from the inside.
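For readers unfamiliar with the calibration game, here is a sketch of a logarithmic scoring rule consistent with the numbers above (the exact formula is my assumption; the post only states that answering 50% everywhere scores 0):

```python
import math

def calibration_score(credence, correct):
    """Logarithmic scoring rule: 0 points for answering 50%, positive
    points for justified confidence, steeply negative points for
    overconfidence. (Exact scaling assumed, not taken from the post.)"""
    p = credence if correct else 1 - credence
    return 100 * math.log2(2 * p)

print(calibration_score(0.5, True))          # 0.0: "I don't know" is safe
print(round(calibration_score(0.9, True)))   # 85: confident and right
print(round(calibration_score(0.9, False)))  # -232: confident and wrong
```

The asymmetry is the point: being 90% confident and right earns about +85, while being 90% confident and wrong costs about -232, which is how overconfident judges racked up averages in the -500 to -900 range over thirteen entries.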

Full data (minus contact info) can be found here.

Those of you who submitted a piece of writing that I used, or who judged the test and left their contact information: I will be sending out personal scores very soon (probably by this weekend). Deep apologies regarding the delay on this post. I had a vacation in late August and it threw off my attention to this project.

EDIT: Here's a histogram of the identification accuracy. 



EDIT 2: For reference, here are the entries that were judged.

Probabilities Small Enough To Ignore: An attack on Pascal's Mugging

20 Kaj_Sotala 16 September 2015 10:45AM

Summary: the problem with Pascal's Mugging arguments is that, intuitively, some probabilities are just too small to care about. There might be a principled reason for ignoring some probabilities, namely that they violate an implicit assumption behind expected utility theory. This suggests a possible approach for formally defining a "probability small enough to ignore", though there's still a bit of arbitrariness in it.

This post is about finding a way to resolve the paradox inherent in Pascal's Mugging. Note that I'm not talking about the bastardized version of Pascal's Mugging that's gotten popular of late, where it's used to refer to any argument involving low probabilities and huge stakes (e.g. low chance of thwarting unsafe AI vs. astronomical stakes). Neither am I talking specifically about the "mugging" illustration, where a "mugger" shows up to threaten you.

Rather I'm talking about the general decision-theoretic problem, where it makes no difference how low of a probability you put on some deal paying off, because one can always choose a humongous enough payoff to make "make this deal" be the dominating option. This is a problem that needs to be solved in order to build e.g. an AI system that uses expected utility and will behave in a reasonable manner.

Intuition: how Pascal's Mugging breaks implicit assumptions in expected utility theory

Intuitively, the problem with Pascal's Mugging type arguments is that some probabilities are just too low to care about. And we need a way to look at just the probability component in the expected utility calculation and ignore the utility component, since the core of PM is that the utility can always be arbitrarily increased to overwhelm the low probability. 

Let's look at the concept of expected utility a bit. If each deal you make has a 10% chance of paying out a dollar, for an expected value of 0.1, then this is just a different way of saying that if you took the deal ten times, you would on average end up with 1 dollar. 

More generally, it means that if you had the opportunity to make ten different deals that all had the same expected value, then after making all of those, you would on average end up with one dollar. This is the justification for why it makes sense to follow expected value even for unique non-repeating events: because even if that particular event wouldn't repeat, if your general strategy is to accept other bets with the same EV, then you will end up with the same outcome as if you'd taken the same repeating bet many times. And even though you only get the dollar after ten deals on average, if you repeat the trials sufficiently many times, your probability of having the average payout will approach one.

Now consider a Pascal's Mugging scenario. Say someone offers to create 10^100 happy lives in exchange for something, and you assign a 0.000000000000000000001 probability to them being capable and willing to carry through their promise. Naively, this has an overwhelmingly positive expected value.

But is it really a beneficial trade? Suppose that you could make one deal like this per second, and you expect to live for 60 more years, for about 1.9 billion trades in total. Then, there would be a probability of 0.999999999998 that the deal would never once have paid off for you. Which suggests that the EU calculation's implicit assumption - that you can repeat this often enough for the utility to converge to the expected value - would be violated.
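A quick sanity check of that figure (the one-deal-per-second rate and 60-year lifespan come from the paragraph above; the helper function is mine):

```python
import math

SECONDS_PER_YEAR = 60 * 60 * 24 * 365.25

def prob_never_pays(p_payoff, n_deals):
    """Chance that n independent deals all fail to pay off. log1p keeps
    the result accurate even though 1 - 1e-21 rounds to 1.0 in floats."""
    return math.exp(n_deals * math.log1p(-p_payoff))

p = 1e-21                        # credence that the mugger delivers
n = int(60 * SECONDS_PER_YEAR)   # one deal per second for 60 years

print(n)                         # ~1.9 billion deals
print(prob_never_pays(p, n))     # ~0.999999999998
print(p * n)                     # expected number of payoffs: ~1.9e-12
```

Even taking the deal every waking and sleeping second of your life, you should expect it to pay off roughly two trillionths of one time.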

Our first attempt

This suggests an initial way of defining a "probability small enough to be ignored":

1. Define a "probability small enough to be ignored" (PSET, or by slight rearranging of letters, PEST) such that, over your lifetime, the expected times that the event happens will be less than one. 
2. Ignore deals where the probability component of the EU calculation involves a PEST.

Looking at the first attempt in detail

To calculate PEST, we need to know how often we might be offered a deal with such a probability. E.g. a 10% chance might be a PEST if we lived for such a short time that we could only make a deal with a 10% chance once. So, a more precise definition of a PEST might be that it's a probability such that

(amount of deals that you can make in your life that have this probability) * (PEST) < 1

But defining "one" as the minimum number of times we should expect the event to happen for the probability to not be a PEST feels a little arbitrary. Intuitively, it feels like the threshold should depend on our degree of risk aversion: maybe if we're risk averse, we want to reduce the expected number of times something happens during our lives to (say) 0.001 before we're ready to ignore it. But part of our motivation was that we wanted a way to ignore the utility part of the calculation: bringing in our degree of risk aversion seems like it might introduce the utility again.

What if we redefined risk aversion/neutrality/preference (at least in this context) as how low one would be willing to let the "expected number of times this might happen" fall before considering a probability a PEST?

Let's use this idea to define an Expected Lifetime Utility:

ELU(S,L,R) = the ELU of a strategy S over a lifetime L is the expected utility you would get if you could make L deals in your life, and were only willing to accept deals with a minimum probability P of at least S, taking into account your risk aversion R and assuming that each deal will pay off approximately P*L times.

ELU example

Suppose that we a have a world where we can take three kinds of actions. 

- Action A takes 1 unit of time and has an expected utility of 2 and probability 1/3 of paying off on any one occasion.
- Action B takes 3 units of time and has an expected utility of 10^(Graham's number) and probability 1/100000000000000 of paying off on any one occasion.
- Action C takes 5 units of time and has an expected utility of 20 and probability 1/100 of paying off on any one occasion.

Assuming that the world's lifetime is fixed at L = 1000 and R = 1:

ELU("always choose A"): we expect A to pay off on ((1000 / 1) * 1/3) = 333 individual occasions, so with R = 1, we deem it acceptable to consider the utility of A. The ELU of this strategy becomes (1000 / 1) * 2 = 2000.

ELU("always choose B"): we expect B to pay off on ((1000 / 3) * 1/100000000000000) = 0.00000000000333 occasions, so with R = 1, we consider the expected utility of B to be 0. The ELU of this strategy thus becomes ((1000 / 3) * 0) = 0.

ELU("always choose C"): we expect C to pay off on ((1000 / 5) * 1/100) = 2 individual occasions, so with R = 1, we deem it acceptable to consider the utility of C. The ELU of this strategy becomes ((1000 / 5) * 20) = 4000.

Thus, "always choose C" is the best strategy. 
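Here is a minimal sketch of the ELU calculation for the three strategies (the function name and Python framing are mine; since 10^(Graham's number) can't be represented, I substitute 10^100 for B's utility, which doesn't matter because the PEST rule zeroes it out anyway):

```python
def elu(lifetime, risk_aversion, time_cost, payoff_prob, utility):
    """Expected Lifetime Utility sketch: if the expected number of
    payoffs over a lifetime falls below the risk-aversion threshold R,
    the deal's probability is a PEST and its utility is treated as 0."""
    n_deals = lifetime // time_cost
    if n_deals * payoff_prob < risk_aversion:
        return 0
    return n_deals * utility

L, R = 1000, 1
print(elu(L, R, 1, 1 / 3, 2))          # A: 2000
print(elu(L, R, 3, 1e-14, 10 ** 100))  # B: 0 (PEST, utility ignored)
print(elu(L, R, 5, 1 / 100, 20))       # C: 4000
```

This reproduces the numbers above: B's astronomical utility never enters the calculation because its probability fails the PEST test, so C wins.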

Defining R

Is R something totally arbitrary, or can we determine some more objective criteria for it?

Here's where I'm stuck. Thoughts are welcome. I do know that while setting R = 1 was a convenient example, it's most likely too high, because it would suggest things like not using seat belts.

General thoughts on this approach

An interesting thing about this approach is that the threshold for a PEST becomes dependent on one's expected lifetime. This is surprising at first, but actually makes some intuitive sense. If you're living in a dangerous environment where you might be killed anytime soon, you won't be very interested in speculative low-probability options; rather you want to focus on making sure you survive now. Whereas if you live in a modern-day Western society, you may be willing to invest some amount of effort in weird low-probability high-payoff deals, like cryonics.

On the other hand, whereas investing in that low-probability, high-utility option might not be good for you individually, it could still be a good evolutionary strategy for your genes. You yourself might be very likely to die, but someone else carrying the risk-taking genes might hit big and be very successful in spreading their genes. So it seems like our definition of L, lifetime length, should vary based on what we want: are we looking to implement this strategy just in ourselves, our whole species, or something else? Exactly what are we maximizing over?

My future posts; a table of contents.

20 Elo 30 August 2015 10:27PM

My future posts

I have been living in the lesswrong rationality space for at least two years now, and recently I have been more active than previously. This has been deliberate. I plan to make more serious active posts in the future, and to that end I want to announce the posts I intend to make going forward.  This should do a few things:


  1. keep me on track
  2. keep me accountable to me more than anyone else
  3. keep me accountable to others
  4. allow others to pick which they would like to be created sooner
  5. allow other people to volunteer to create/collaborate on these topics
  6. allow anyone to suggest more topics
  7. meta: this post should help to demonstrate one person's method of developing rationality content and the time it takes to do that.
feel free to PM me about 6, or comment below.

Unfortunately these are not very well organised, they are presented in no particular order.  They are probably missing posts that will help link them all together, as well as skills required to understand some of the posts on this list.


Unpublished but written:

A very long list of sleep maintenance suggestions – I wrote up all the ideas I knew of; there are about 150; worth reviewing just to see if you can improve your sleep, because the difference in quality of life with good sleep is massive. (20 mins to write an intro; actually 2 hours)

A list of techniques to help you remember names. - remembering names is a low-hanging social value fruit that can improve many of your early social interactions with people. I wrote up a list of techniques to help. (5mins to post)


Posts so far:

The null result: a magnetic ring wearing experiment. - a fun one; about how wearing magnetic rings was cool; but not imparting of superpowers. (done)

An app list of useful apps for android – my current list of apps that I use, plus some very good suggestions in the comments. (done)

How to learn X – how to attack a problem of learning a new area that you don't know a lot about (for a generic thing). (done)

A list of common human goals – when plotting out goals that matter to you; so you can look over some common ones and see you fulfilling them interests you. (done)

Lesswrong real time chat - A Slack channel for hanging out with other rationalists.  Also where I talk about my latest posts before I put them up.


Future posts

Goals of your lesswrong group – Do you have a local group; why? What do you want out of it (do people know)? Setting goals, doing something in particular, having fun anyway, changing your mind. (4hrs)


Goals interrogation + Goal levels – Goal interrogation is about asking <is this thing I want to do actually a goal of mine> and <is this the best way to achieve that>, goal levels are something out of Sydney Lesswrong that help you have mutual long term goals and supporting short term goal. (2hrs)


How to human – A zero to human guide. A guide for basic functionality of a humanoid system. (4hrs)


General buying things considerations – New to the whole adult thing?  wondering what to ask yourself when considering purchases?  Here is a list of general considerations. (3hrs)


List of strategies for getting shit done – working around the limitations of your circumstances and understanding what can get done with the resources you have at hand. (4hrs)


List of superpowers and kryptonites – when asking the question "what are my superpowers?" and "what are my kryptonites?". Knowledge is power; working with your powers and working out how to avoid your kryptonites is a method to improve yourself. (6hrs over a week)


List of effective behaviours – small life-improving habits that add together to make awesomeness from nothing. And how to pick them up. (8hrs over 2 weeks)


Memory and notepads – writing notes as evidence, the value of notes (they are priceless) and what you should do. (1hr + 1hr over a week)


Suicide prevention checklist – feeling off? You should have already outsourced the hard work for "things I should check on about myself" to your past self. Make it easier for future you. Especially in the times that you might be vulnerable. (4hrs)


Make it easier for future you. Especially in the times that you might be vulnerable. - as its own post in curtailing bad habits. (5hrs)


A p=np approach to learning – Sometimes you have to learn things the long way; but sometimes there is a short cut. Where you could say, "I wish someone had just taken me on the easy path early on". It's not a perfect idea; but start looking for the shortcuts where you might be saying "I wish someone had told me". Of course my line now is, "but I probably wouldn't have listened anyway" which is something that can be worked on as well. (2hrs)


Rationalists guide to dating – attraction. Relationships. Doing things with a known preference. Don't like stupid people? Don't try to date them. Think first; an exercise in thinking hard about things before trying trial-and-error on the world. (half written, needs improving 2hrs)


Training inherent powers (weights, temperatures, smells, estimation powers) – practice makes perfect right? Imagine if you knew the temperature always, the weight of things by lifting them, the composition of foods by tasting them, the distance between things without measuring. How can we train these, how can we improve. (2hrs)


Strike to the heart of the question. The strongest one; not the one you want to defeat – Steelman not Strawman. Don't ask "how do I win at the question"; ask, "am I giving the best answer to the best question I can give", (2hrs)


Posts not planned at the original writing of the post:

Sensory perception differences and how it shapes personal experience - Is a sound as loud to you as everyone else?  What about a picture?  Are colours as clear and vivid to you as they are to other people?  This post is a consideration in whether the individual difference in experiences can shape our experience and choices in how we live our lives.  Includes some short exercises in sensory perceptions.


Posts added to the list:

Exploration-Exploitation and a method of applying the secretary problem to real life.  I devised a rough equation for application of the secretary problem to real life dating and the exploration-exploitation dilemma.

How to approach a new problem - similar to the "How to solve X" post, but considerations for working backwards from a wicked problem, as well as trying "The least bad solution I know of", Murphy-jitsu, and known solutions to similar problems.  0. I notice I am approaching a problem.

Being the kind of person that advice works for - The same words of advice can work for someone and not someone else.  Consider why that is; and how you can better understand the advice that you are given, and how you might become the kind of person that advice works for.

Edit: links adding as I write them.


"Announcing" the "Longevity for All" Short Movie Prize

19 infotropism 11 September 2015 01:44PM

The local Belgian/European life-extension non-profit Heales is giving away prizes for whoever can make an interesting short movie about life extension. The first prize is €3000 (around $3386 as of today), other prizes being various gifts. You more or less just need to send a link pointing to the uploaded media, along with your contact info, once you're done.

While we're at it, you don't need to be European, let alone Belgian, to participate, and it doesn't even need to be a short movie anyway. For instance, a comic strip would fall within the scope of the rules as specified here: (link to a pdf file). Also, sure, the deadline is by now supposed to be a fairly short-term September the 21st, 2015, but it is extremely likely this will be extended (this might be a pun).

I'll conclude by suggesting you read the official pdf with rules and explanations if you feel like you care about money or life-extension (who doesn't?), and remind everyone of what happened last time almost everyone thought they shouldn't grab free contest money that was announced on Lesswrong (hint: few enough people participated that everyone who entered earned something). The very reason why this one's due date will likely be extended is that (very very) few people have participated so far, after all.

(Ah yes, the only caveat I can think of is that if the product of the quality and quantity of submissions is definitely too low (i.e. it's just you on the one hand and on the other hand that one guy who spent 3 minutes drawing some stick figures, and your submission is coming a close second), then the contest may be called off after one or two deadline extensions (also in the aforementioned rules).)

Rudimentary Categorization of Less Wrong Topics

19 ScottL 05 September 2015 07:32AM

I find the below list to be useful, so I thought I would post it. This list includes short abstracts of all of the wiki items and a few other topics on less wrong. I grouped the items into some rough categories just to break up the list. I tried to put the right items into the right categories, but there may be some items that can be in multiple categories or that would be better off in a different category. The wiki page from which I got all the items is here.

The categories are:

Property Attribution




Property Attribution

Barriers, biases, fallacies, impediments and problems

  • Affective death spiral - positive attributes of a theory, person, or organization combine with the Halo effect in a feedback loop, resulting in the subject of the affective death spiral being held in higher and higher regard.
  • Anthropomorphism - the error of attributing distinctly human characteristics to nonhuman processes.
  • Bystander effect - a social psychological phenomenon in which individuals are less likely to offer help in an emergency situation when other people are present.
  • Connotation - emotional association with a word. You need to be careful that you are not conveying a different connotation than you mean to.
  • Correspondence bias (also known as the fundamental attribution error) - is the tendency to overestimate the contribution of lasting traits and dispositions in determining people's behavior, as compared to situational effects.
  • Death Spirals and the Cult Attractor - Cultishness is an empirical attractor in human groups, roughly an affective death spiral, plus peer pressure and outcasting behavior, plus (quite often) defensiveness around something believed to have been perfected
  • Detached lever fallacy – the assumption that something simple for one system will be simple for others. This assumption neglects to take into account that something may only be simple because of complicated underlying machinery which is triggered by a simple action, like pulling a lever. Adding this lever to something else won’t allow the action to occur because the underlying complicated machinery is not there.
  • Giant cheesecake fallacy – occurs when an argument leaps directly from capability to actuality, without considering the necessary intermediate of motive. An example of the fallacy might be: a sufficiently powerful Artificial Intelligence could overwhelm any human resistance and wipe out humanity. (Belief without evidence: the AI would decide to do so.) Therefore we should not build AI.
  • Halo effect – specific type of confirmation bias, wherein positive feelings in one area cause ambiguous or neutral traits to be viewed positively.
  • Illusion of transparency - misleading impression that your words convey more to others than they really do.
  • Inferential distance - a gap between the background knowledge and epistemology of a person trying to explain an idea, and the background knowledge and epistemology of the person trying to understand it.
  • Information cascade - occurs when people signal that they have information about something, but actually based their judgment on other people's signals, resulting in a self-reinforcing community opinion that does not necessarily reflect reality.
  • Mind projection fallacy - occurs when someone thinks that the way they see the world reflects the way the world really is, going as far as assuming the real existence of imagined objects.
  • Other-optimizing - a failure mode in which a person vastly overestimates their ability to optimize someone else's life, usually as a result of underestimating the differences between themselves and others, for example through the typical mind fallacy.
  • Peak-end rule - we do not judge our experiences on the net pleasantness of unpleasantness or on how long the experience lasted, but instead on how they were at their peak (pleasant or unpleasant) and how they ended.
  • Stereotype - a fixed, overgeneralized belief about a particular group or class of people.
  • Typical mind fallacy - the mistake of making biased and overconfident conclusions about other people's experience based on your own personal experience; the mistake of assuming that other people are more like you than they actually are.


Concepts

  • ADBOC - Agree Denotationally, But Object Connotatively
  • Alien Values - There are no rules requiring minds to value life, liberty or the pursuit of happiness. An alien will have, in all probability, alien values. If an "alien" isn't evolved, the range of possible values increases even more, allowing such absurdities as a Paperclip maximizer. Creatures with alien values might as well value only non-sentient life, or they might spend all their time building heaps of prime numbers of rocks.
  • Chronophone – is a parable that is meant to convey the idea that it’s really hard to get somewhere when you don't already know your destination. If there were some simple cognitive policy you could follow to spark moral and technological revolutions, without your home culture having advance knowledge of the destination, you could execute that cognitive policy today.
  • Empathic inference – is every-day common mind-reading. It’s an inference made about another person’s mental states using your own brain as reference: by making your brain feel or think in the same way as the other person, you can emulate their mental state and predict their reactions.
  • Epistemic luck - you would have different beliefs if certain events in your life were different. How should you react to this fact?
  • Future - If it hasn't happened yet but is going to, then it's part of the future. Checking whether or not something is going to happen is notoriously difficult. Luckily, the field of heuristics and biases has given us some insights into what can go wrong. Namely, one problem is that the future elicits far mode, which isn't about truth-seeking or gritty details.
  • Mental models - a hypothetical form of representation of knowledge in human mind. Mental models form to approximately describe dynamics of observed situations, and reuse parts of existing models to represent novel situations
  • Mind design space - refers to the configuration space of possible minds. As humans living in a human world, we can safely make all sorts of assumptions about the minds around us without even realizing it. Each human might have their own unique personal qualities, so it might naively seem that there's nothing you can say about people you don't know. But there's actually quite a lot you can say (with high or very high probability) about a random human: that they have standard emotions like happiness, sadness, and anger; standard senses like sight and hearing; that they speak a language; and no doubt any number of other subtle features that are even harder to quickly explain in words. These things are the specific results of adaptation pressures in the ancestral environment and can't be expected to be shared by a random alien or AI. That is, humans are packed into a tiny dot in the configuration space: there is a vast range of other ways a mind can be.
  • Near/far thinking - Near and far are two modes (or a spectrum of modes) in which we can think about things. We choose which mode to think in based on something's distance from us, or on the level of detail we need. This property of the human mind is studied in construal level theory.
    • NEAR: All of these bring each other more to mind: here, now, me, us; trend-deviating likely real local events; concrete, context-dependent, unstructured, detailed, goal-irrelevant incidental features; feasible safe acts; secondary local concerns; socially close folks with unstable traits.
    • FAR: Conversely, all these bring each other more to mind: there, then, them; trend-following unlikely hypothetical global events; abstract, schematic, context-freer, core, coarse, goal-related features; desirable risk-taking acts, central global symbolic concerns, confident predictions, polarized evaluations, socially distant people with stable traits
  • No-Nonsense Metaethics - A sequence by lukeprog that explains and defends a naturalistic approach to metaethics and what he calls pluralistic moral reductionism. We know that people can mean different things but use the same word, e.g. sound can mean auditory experience or acoustic vibrations in the air. Pluralistic moral reductionism is the idea that we do the same thing when we talk about what is moral.
  • Only the vulnerable are heroes - “Vulnerability is our most accurate measurement of courage.” – Brené Brown. For Superman to be as heroic as a man stopping a group of would-be thieves from robbing a store, he has to be defending the world from someone powerful enough to harm and possibly even kill him, such as Darkseid.


Barriers, biases, fallacies, impediments and problems

  • Absurdity heuristic – is a mental shortcut where highly untypical situations are classified as absurd or impossible. Where you don't expect intuition to construct an adequate model of reality, classifying an idea as impossible may be overconfident.
  • Affect heuristic - a mental shortcut that makes use of current emotions to make decisions and solve problems quickly and efficiently.
  • Arguing by analogy – is arguing that since things are alike in some ways, they will probably be alike in others. While careful application of argument by analogy can be a powerful tool, there are limits to the method after which it breaks down.
  • Arguing by definition – is arguing that something is part of a class because it fits the definition of that class. It is recommended to avoid this wherever possible and instead treat words as labels that cannot capture the rich cognitive content that actually constitutes their meaning. As Feynman said: “You can know the name of a bird in all the languages of the world, but when you're finished, you'll know absolutely nothing whatever about the bird... So let's look at the bird and see what it's doing -- that's what counts.” It is better to keep the focus on the facts of the matter and try to understand what your interlocutor is trying to communicate, than to get lost in a pointless discussion of definitions that bears nothing.
  • Arguments as soldiers – is a problematic scenario where arguments are treated like war or battle. Arguments get treated as soldiers, weapons to be used to defend your side of the debate, and to attack the other side. They are no longer instruments of the truth.
  • Availability heuristic – a mental shortcut that treats easily recalled information as important, or at least more important than alternative solutions which are not as readily recalled.
  • Belief as cheering - People can bind themselves as a group by believing "crazy" things together. Among outsiders, they can then show the same pride in their crazy belief as they would show in wearing "crazy" group clothes. The belief is more like a banner saying "GO BLUES". It isn't a statement of fact, or an attempt to persuade; it doesn't have to be convincing—it's a cheer.
  • Beware of Deepities - A deepity is a proposition that seems both important and true—and profound—but that achieves this effect by being ambiguous. An example is "love is a word". One interpretation is that “love”, the word, is a word and this is trivially true. The second interpretation is that love is nothing more than a verbal construct. This interpretation is false, but if it were true would be profound. The "deepity" seems profound due to a conflation of the two interpretations. People see the trivial but true interpretation and then think that there must be some kind of truth to the false but profound one.
  • Bias - is a systematic deviation from rationality committed by our cognition. They are specific, predictable error patterns in the human mind.
  • Burdensome details - Adding more details to a theory may make it sound more plausible to human ears because of the representativeness heuristic, even as the story becomes normatively less probable, as burdensome details drive the probability of the conjunction down (this is known as conjunction fallacy). Any detail you add has to be pinned down by a sufficient amount of evidence; all the details you make no claim about can be summed over.
  • Compartmentalization - a tendency to restrict application of a generally-applicable skill, such as scientific method, only to select few contexts. More generally, the concept refers to not following a piece of knowledge to its logical conclusion, or not taking it seriously.
  • Conformity bias - a tendency to behave similarly to the others in a group, even if doing so goes against your own judgment.
  • Conjunction fallacy – involves the assumption that specific conditions are more probable than more general ones.
  • Contagion heuristic - leads people to avoid contact with people or objects viewed as "contaminated" by previous contact with someone or something viewed as bad—or, less often, to seek contact with objects that have been in contact with people or things considered good.
  • Costs of rationality - Becoming more epistemically rational can only guarantee one thing: what you believe will include more of the truth. Knowing that truth might help you achieve your goals, or cause you to become a pariah. Be sure that you really want to know the truth before you commit to finding it; otherwise, you may flinch from it.
  • Defensibility - arguing that a policy is defensible rather than optimal or that it has some benefit compared to the null action rather than the best benefit of any action.
  • Fake simplicity – if you have a simple answer to a complex problem then it is probably a case whereby your beliefs appear to match the evidence much more strongly than they actually do. “Explanations exist; they have existed for all time; there is always a well-known solution to every human problem — neat, plausible, and wrong.” —H. L. Mencken
  • Fallacy of gray, also known as Continuum fallacy – is the false belief that because nothing is certain, everything is equally uncertain. It does not take into account that some things are more certain than others.
  • False dilemma - occurs when only two options are considered, when there may in fact be many.
  • Filtered evidence – is evidence that was selected for the purpose of proving (disproving) a hypothesis. Filtered evidence may be highly misleading, but can still be useful, if considered with care.
  • Generalization from fictional evidence – logical fallacy that consists of drawing real-world conclusions based on statements invented and selected for the purpose of writing fiction.
  • Groupthink - tendency of humans to tend to agree with each other, and hold back objections or dissent even when the group is wrong.
  • Hindsight bias – is the tendency to overestimate the foreseeability of events that have actually happened.
  • Information hazard – is a risk that arises from the dissemination or the potential dissemination of (true) information that may cause harm or enable some agent to cause harm.
  • In-group bias - preferential treatment of people and ideas associated with your own group.
  • Mind-killer - a name given to topics (such as politics) that tend to produce extremely biased discussions. Another cause of mind-killers is social taboo. Negative connotations are associated with some topics, thus creating a strong bias supported by signaling drives that makes non-negative characterization of these topics appear absurd.
  • Motivated cognition – is the unconscious tendency of individuals to fit their processing of information to conclusions that suit some end or goal.
  • Motivated skepticism also known as disconfirmation bias - the mistake of applying more skepticism to claims that you don't like (or intuitively disbelieve), than to claims that you do like
  • Narrative fallacy – is a vulnerability to overinterpretation and our predilection for compact stories over raw truths.
  • Overconfidence - the state of being more certain than is justified, given your priors and the evidence available.
  • Planning fallacy - predictions about how much time will be needed to complete a future task display an optimistic bias (underestimate the time needed).
  • Politics is the Mind-Killer – Politics is not a good area for rational debate. It is often about status and power plays where arguments are soldiers rather than tools to get closer to the truth.
  • Positive bias - tendency to test hypotheses with positive rather than negative examples, thus risking to miss obvious disconfirming tests.
  • Priming - psychological phenomenon that consists in early stimulus influencing later thoughts and behavior.
  • Privileging the hypothesis – is singling out a particular hypothesis for attention when there is insufficient evidence already in hand to justify such special attention.
  • Problem of verifying rationality – is the single largest problem for those desiring to create methods of systematically training for increased epistemic and instrumental rationality - how to verify that the training actually worked.
  • Rationalization – starts from a conclusion, and then works backward to arrive at arguments apparently favouring that conclusion. Rationalization argues for a side already selected. The term is misleading as it is the very opposite and antithesis of rationality, as if lying were called "truthization".
  • Reason as memetic immune disorder – the problem that when you are rational, you deem your conclusions more valuable than those of non-rational people. This can become a problem because you are less likely to update your beliefs when they are opposed. It adds the risk that if you adopt one false belief and then rationally deduce a plethora of others from it, you will be less likely to update any erroneous conclusions.
  • Representativeness heuristic –a mental shortcut where people judge the probability or frequency of a hypothesis by considering how much the hypothesis resembles available data as opposed to using a Bayesian calculation.
  • Scales of justice fallacy - the error of using a simple polarized scheme for deciding a complex issue: each piece of evidence about the question is individually categorized as supporting exactly one of the two opposing positions.
  • Scope insensitivity – a phenomenon related to the representativeness heuristic where subjects base their willingness-to-pay mostly on a mental image rather than the effect on a desired outcome. An environmental measure that will save 200,000 birds doesn't conjure anywhere near a hundred times the emotional impact and willingness-to-pay of a measure that would save 2,000 birds, even though in fact the former measure is two orders of magnitude more effective.
  • Self-deception - state of preserving a wrong belief, often facilitated by denying or rationalizing away the relevance, significance, or importance of opposing evidence and logical arguments.
  • Status quo bias - people tend to avoid changing the established behavior or beliefs unless the pressure to change is sufficiently strong.
  • Sunk cost fallacy - Letting past investment (of time, energy, money, or any other resource) interfere with decision-making in the present in deleterious ways.
  • The top 1% fallacy - related to not taking into account the idea that a small sample size is not always reflective of a whole population and that sample populations with certain characteristics, e.g. made up of repeat job seekers, are not reflective of the whole population.
  • Underconfidence - the state of being more uncertain than is justified, given your priors and the evidence you are aware of.
  • Wrong Questions - A question about your map that wouldn’t make sense if you had a more accurate map.
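Several of the entries above (burdensome details, conjunction fallacy) reduce to one piece of arithmetic: a conjunction can never be more probable than its least probable conjunct, because each added detail multiplies in a factor of at most 1. A minimal sketch, using made-up probabilities purely for illustration:

```python
# Burdensome details / conjunction fallacy in one line of arithmetic:
# P(A and B) = P(A) * P(B|A) <= P(A), since P(B|A) <= 1.
# The numbers below are invented for illustration, not measured values.
p_teller = 0.05          # P(Linda is a bank teller)
p_feminist_given = 0.50  # P(Linda is a feminist, given she is a bank teller)

# Probability of the more detailed, more "representative" story:
p_both = p_teller * p_feminist_given

# Adding the extra detail can only lower (or at best preserve) the probability,
# even though the detailed story sounds more plausible to human ears.
assert p_both <= p_teller
print(p_both)  # 0.025
```

Whatever numbers you substitute, the assertion holds; the representativeness heuristic makes the detailed story feel more likely even though the math says it cannot be.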


Concepts

  • Absolute certainty – equivalent of Bayesian probability of 1. Losing an epistemic bet made with absolute certainty corresponds to receiving infinite negative payoff, according to the logarithmic proper scoring rule.
  • Adaptation executors - Individual organisms are best thought of as adaptation-executers rather than as fitness-maximizers. Our taste buds do not find lettuce delicious and cheeseburgers distasteful once we are fed a diet too high in calories and too low in micronutrients. Tastebuds are adapted to an ancestral environment in which calories, not micronutrients, were the limiting factor. Evolution operates on too slow a timescale to re-adapt to new conditions (such as a modern diet).
  • Adversarial process - a form of truth-seeking or conflict resolution in which identifiable factions hold one-sided positions.
  • Altruism - Actions undertaken for the benefit of other people. If you do something to feel good about helping people, or even to be a better person in some spiritual sense, it isn't truly altruism.
  • Amount of evidence - to a Bayesian, evidence is a quantitative concept. The more complicated or a priori improbable a hypothesis is, the more evidence you need just to justify it, or even just to single it out from amongst the mass of competing theories.
  • Anti-epistemology – is bad explicit beliefs about rules of reasoning, usually developed in the course of protecting an existing false belief. False beliefs are opposed not only by true beliefs (which must then be obscured in turn) but also by good rules of systematic reasoning (which must then be denied). The explicit defense of fallacy as a general rule of reasoning is anti-epistemology.
  • Antiprediction - is a statement of confidence in an event that sounds startling, but actually isn't far from a maxentropy prior. For example, if someone thinks that our state of knowledge implies strong ignorance about the speed of some process X on a logarithmic scale from nanoseconds to centuries, they may make the startling-sounding statement that X is very unlikely to take 'one to three years'.
  • Applause light - is an empty statement which evokes positive affect without providing new information.
  • Artificial general intelligence – is a machine capable of behaving intelligently over many domains.
  • Bayesian - Bayesian probability theory is the math of epistemic rationality, Bayesian decision theory is the math of instrumental rationality.
  • Aumann's agreement theorem – roughly speaking, says that two agents acting rationally (in a certain precise sense) and with common knowledge of each other's beliefs cannot agree to disagree. More specifically, if two people are genuine Bayesians, share common priors, and have common knowledge of each other's current probability assignments, then they must have equal probability assignments.
  • Bayesian decision theory – is a decision theory which is informed by Bayesian probability. It is a statistical system that tries to quantify the tradeoff between various decisions, making use of probabilities and costs.
  • Bayesian probability - represents a level of certainty relating to a potential outcome or idea. This is in contrast to a frequentist probability that represents the frequency with which a particular outcome will occur over any number of trials. An event with Bayesian probability of .6 (or 60%) should be interpreted as stating "With confidence 60%, this event contains the true outcome", whereas a frequentist interpretation would view it as stating "Over 100 trials, we should observe event X approximately 60 times." The difference is more apparent when discussing ideas. A frequentist will not assign probability to an idea; either it is true or false and it cannot be true 6 times out of 10.
  • Bayes' theorem - A law of probability that describes the proper way to incorporate new evidence into prior probabilities to form an updated probability estimate.
  • Belief - the mental state in which an individual holds a proposition to be true. Beliefs are often metaphorically referred to as maps, and are considered valid to the extent that they correctly correspond to the truth. A person's knowledge is a subset of their beliefs, namely the beliefs that are also true and justified. Beliefs can be second-order, concerning propositions about other beliefs.
  • Belief as attire – is an example of an improper belief promoted by identification with a group or other signaling concerns, not by how well it reflects the territory.
  • Belief in belief - Where it is difficult to believe a thing, it is often much easier to believe that you ought to believe it. Were you to really believe and not just believe in belief, the consequences of error would be much more severe. When someone makes up excuses in advance, it would seem to require that belief, and belief in belief, have become unsynchronized.
  • Belief update - what you do to your beliefs, opinions and cognitive structure when new evidence comes along.
  • Bite the bullet - is to accept the consequences of a hard choice, or unintuitive conclusions of a formal reasoning procedure.
  • Black swan – is a high-impact event that is hard to predict (but not necessarily of low probability). It is also an event that is not accounted for in a model and therefore causes the model to break down when it occurs.
  • Cached thought – is an answer that was arrived at by recalling a previously-computed conclusion, rather than performing the reasoning from scratch.
  • Causal Decision Theory – a branch of decision theory which advises an agent to take the action that maximizes the probability of desired outcomes through its causal consequences.
  • Causality - refers to the relationship between an event (the cause) and a second event (the effect), where the second event is a direct consequence of the first.
  • Church-Turing thesis - states the equivalence between the mathematical concepts of algorithm or computation and the Turing machine. It asserts that if some calculation is effectively carried out by an algorithm, then there exists a Turing machine which will compute that calculation.
  • Coherent Aggregated Volition - is one of Ben Goertzel's responses to Eliezer Yudkowsky's Coherent Extrapolated Volition, the other being Coherent Blended Volition. CAV would be a combination of the goals and beliefs of humanity at the present time.
  • Coherent Blended Volition - Coherent Blended Volition is a recent concept coined in a 2012 paper by Ben Goertzel with the aim to clarify his Coherent Aggregated Volition idea. This clarification follows the author's attempt to develop a comprehensive alternative to Coherent Extrapolated Volition.
  • Coherent Extrapolated Volition – is a term developed by Eliezer Yudkowsky while discussing Friendly AI development. It’s meant as an argument that it would not be sufficient to explicitly program our desires and motivations into an AI. Instead, we should find a way to program it in a way that it would act in our best interests – what we want it to do and not what we tell it to.
  • Color politics - the words "Blues" and "Greens" are often used to refer to two opposing political factions. Politics commonly involves an adversarial process, where factions usually identify with political positions, and use arguments as soldiers to defend their side. The dichotomies presented by the opposing sides are often false dilemmas, which can be shown by presenting third options.
  • Common knowledge - In the context of Aumann's agreement theorem, a fact is part of the common knowledge of a group of agents when they all know it, they all know that they all know it, and so on ad infinitum.
  • Conceptual metaphors – are neurally-implemented mappings between concrete domains of discourse (often related to our body and perception) and more abstract domains. These are a well-known source of bias and are often exploited in the Dark Arts. An example is “argument is war”.
  • Configuration space - is an isomorphism between the attributes of something, and its position on a multidimensional graph. Theoretically, the attributes and precise position on the graph should contain the same information. In practice, the concept usually appears as a suffix, as in "walletspace", where "walletspace" refers to the configuration space of all possible wallets, arranged by similarity. Walletspace would intersect with leatherspace, and the set of leather wallets is a subset of both walletspace and leatherspace, which are both subsets of thingspace.
  • Conservation of expected evidence - a theorem that says: "for every expectation of evidence, there is an equal and opposite expectation of counterevidence": 0 = (P(H|E) - P(H)) * P(E) + (P(H|~E) - P(H)) * P(~E).
  • Control theory - a control system is a device that keeps a variable at a certain value, despite only knowing what the current value of the variable is. An example is a cruise control, which maintains a certain speed, but only measures the current speed, and knows nothing of the system that produces that speed (wind, car weight, grade).
  • Corrupted hardware - our brains do not always allow us to act the way we should. Corrupted hardware refers to those behaviors and thoughts that act for ancestrally relevant purposes rather than for stated moralities and preferences.
  • Counterfactual mugging - is a thought experiment for testing and differentiating decision theories.
  • Counter man syndrome - wherein a person behind a counter comes to believe that they know things they don't know, because, after all, they're the person behind the counter. So they can't just answer a question with "I don't know"... and thus they make something up, without really paying attention to the fact that they're making it up. Pretty soon, they don't know the difference between the facts and their made-up stories.
  • Cox's theorem says, roughly, that if your beliefs at any given time take the form of an assignment of a numerical "plausibility score" to every proposition, and if they satisfy a few plausible axioms, then your plausibilities must effectively be probabilities obeying the usual laws of probability theory, and your updating procedure must be the one implied by Bayes' theorem.
  • Crisis of faith - a combined technique for recognizing and eradicating the whole systems of mutually-supporting false beliefs. The technique involves systematic application of introspection, with the express intent to check the reliability of beliefs independently of the other beliefs that support them in the mind. The technique might be useful for the victims of affective death spirals, or any other systematic confusions, especially those supported by anti-epistemology.
  • Cryonics - is the practice of preserving people who are dying in liquid nitrogen soon after their heart stops. The idea is that most of your brain's information content is still intact right after you've "died". If humans invent molecular nanotechnology or brain emulation techniques, it may be possible to reconstruct the consciousness of cryopreserved patients.
  • Curiosity - The first virtue is curiosity. A burning itch to know is higher than a solemn vow to pursue truth. To feel the burning itch of curiosity requires both that you be ignorant, and that you desire to relinquish your ignorance. If in your heart you believe you already know, or if in your heart you do not wish to know, then your questioning will be purposeless and your skills without direction. Curiosity seeks to annihilate itself; there is no curiosity that does not want an answer. The glory of glorious mystery is to be solved, after which it ceases to be mystery. Be wary of those who speak of being open-minded and modestly confess their ignorance. There is a time to confess your ignorance and a time to relinquish your ignorance. —Twelve Virtues of Rationality
  • Dangerous knowledge - Intelligence, in order to be useful, must be used for something other than defeating itself.
  • Dangling Node - A label for something that isn't "actually real".
  • Death - First you're there, and then you're not there, and they can't change you from being not there to being there, because there's nothing there to be changed from being not there to being there. That's death. Cryonicists use the concept of information-theoretic death, which is what happens when the information needed to reconstruct you even in principle is no longer present. Anything less, to them, is just a flesh wound.
  • Debiasing - The process of overcoming bias. It takes serious study to gain meaningful benefits, half-hearted attempts may accomplish nothing, and partial knowledge of bias may do more harm than good.
  • Decision theory – is the study of principles and algorithms for making correct decisions—that is, decisions that allow an agent to achieve better outcomes with respect to its goals.
  • Defying the data - Sometimes, the results of an experiment contradict what we have strong theoretical reason to believe. But experiments can go wrong, for various reasons. So if our theory is strong enough, we should in some cases defy the data: know that there has to be something wrong with the result, even without offering ideas on what it might be.
  • Disagreement - Aumann's agreement theorem can be informally interpreted as suggesting that if two people are honest seekers of truth, and both believe each other to be honest, then they should update on each other's opinions and quickly reach agreement. The very fact that a person believes something is Rational evidence that that something is true, and so this fact should be taken into account when forming your belief. Outside of well-functioning prediction markets, Aumann agreement can probably only be approximated by careful deliberative discourse. Thus, fostering effective deliberation should be seen as a key goal of Less Wrong.
  • Doubt - The proper purpose of a doubt is to destroy its target belief if and only if it is false. The mere feeling of crushing uncertainty is not virtuous unto an aspiring rationalist; probability theory is the law that says we must be uncertain to the exact extent to which the evidence merits uncertainty.
  • Dunning–Kruger effect - is a cognitive bias wherein unskilled individuals suffer from illusory superiority, mistakenly assessing their ability to be much higher than is accurate. This bias is attributed to a metacognitive inability of the unskilled to recognize their ineptitude. Conversely, highly skilled individuals tend to underestimate their relative competence, erroneously assuming that tasks that are easy for them are also easy for others.
  • Emulation argument for human-level AI – argument that since whole brain emulation seems feasible then human-level AI must also be feasible.
  • Epistemic hygiene - consists of practices meant to allow accurate beliefs to spread within a community and keep less accurate or biased beliefs contained. The practices are meant to serve an analogous purpose to normal hygiene and sanitation in containing disease. "Good cognitive citizenship" is another phrase that has been proposed for this concept.
  • Error of crowds - is the idea that under some scoring rules, the average error becomes less than the error of the average, thus making the average belief tautologically worse than a belief of a random person. Compare this to the ideas of modesty argument and wisdom of the crowd. A related idea is that a popular belief is likely to be wrong because the less popular ones couldn't maintain support if they were worse than the popular one.
  • Ethical injunction - a rule not to do something even when you believe it's the right thing to do. (That is, you refrain "even when your brain has computed it's the right thing to do", but this will just seem like "the right thing to do".) For example, you shouldn't rob banks even if you plan to give the money to a good cause. This is to protect you from your own cleverness (especially taking bad black swan bets), and the Corrupted hardware you're running on.
  • Evidence - for a given theory is the observation of an event that is more likely to occur if the theory is true than if it is false. (The event would be evidence against the theory if it is less likely if the theory is true.)
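This definition can be made concrete with Bayes' theorem. A minimal sketch, using made-up likelihoods: an observation that is twice as likely if the theory is true as if it is false shifts belief toward the theory.

```python
def bayes_update(prior, p_e_given_h, p_e_given_not_h):
    """Return P(H | E) given the prior P(H) and the two likelihoods."""
    numerator = prior * p_e_given_h
    total = numerator + (1 - prior) * p_e_given_not_h
    return numerator / total

# Hypothetical numbers: E is twice as likely under H (0.8) as under not-H (0.4).
posterior = bayes_update(prior=0.5, p_e_given_h=0.8, p_e_given_not_h=0.4)
print(posterior)  # 0.666...: the observation counts as evidence for H
```

If the two likelihoods were equal, the observation would be no evidence at all and the posterior would equal the prior.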
  • Evidence of absence - evidence that allows you to conclude some phenomenon isn't there. It is often said that "absence of evidence is not evidence of absence". However, if evidence is expected, but not present, that is evidence of absence.
  • Evidential Decision Theory - a branch of decision theory which advises an agent to take actions which, conditional on it happening, maximizes the chances of the desired outcome.
  • Evolution - The brainless, mindless optimization process responsible for the production of all biological life on Earth, including human beings. Since the design signature of evolution is alien and counterintuitive, it takes some study to get to know your accidental Creator.
  • Evolution as alien god – is a thought experiment in which evolution is imagined as a god. The thought experiment is meant to convey the idea that evolution doesn’t have a mind. The god in the thought experiment would be a tremendously powerful, unbelievably stupid, ridiculously slow, and utterly uncaring god; a god monomaniacally focused on the relative fitness of genes within a species; a god whose attention was completely separated and working at cross-purposes in rabbits and wolves.
  • Evolutionary argument for human-level AI - an argument that uses the fact that evolution produced human level intelligence to argue for the feasibility of human-level AI.
  • Evolutionary psychology - the idea of evolution as the idiot designer of humans - that our brains are not consistently well-designed - is a key element of many of the explanations of human errors that appear on this website.
  • Existential risk – is a risk posing permanent large negative consequences to humanity which can never be undone.
  • Expected value - The expected value or expectation is the (weighted) average of all the possible outcomes of an event, weighed by their probability. For example, when you roll a die, the expected value is (1+2+3+4+5+6)/6 = 3.5. (Since a die doesn't even have a face that says 3.5, this illustrates that very often, the "expected value" isn't a value you actually expect.)
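The die calculation above can be checked directly; exact fractions make the arithmetic transparent:

```python
from fractions import Fraction

# Expected value of one roll of a fair six-sided die.
outcomes = [1, 2, 3, 4, 5, 6]
expected = sum(Fraction(1, 6) * x for x in outcomes)
print(expected)  # 7/2, i.e. 3.5 -- a value the die has no face for
```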
  • Extensibility argument for greater-than-human intelligence – is an argument that once we get to a human-level AGI, extensibility would make an AGI of greater-than-human intelligence feasible.
  • Extraordinary evidence - is evidence that turns an a priori highly unlikely event into an a posteriori likely event.
  • Free-floating belief – is a belief that both doesn't follow from observations and doesn't restrict which experiences to anticipate. It is both unfounded and useless.
  • Free will - means our algorithm's ability to determine our actions. People often get confused over free will because they picture themselves as being restrained rather than part of physics. Yudkowsky calls this view Requiredism, but most people just view this essentially as Compatibilism.
  • Friendly artificial intelligence – is a superintelligence (i.e., a really powerful optimization process) that produces good, beneficial outcomes rather than harmful ones.
  • Fully general counterargument - an argument which can be used to discount any conclusion the arguer does not like. Being in possession of such an argument leads to irrationality because it allows the arguer to avoid updating their beliefs in the light of new evidence. Knowledge of cognitive biases can itself allow someone to form fully general counterarguments ("you're just saying that because you're exhibiting X bias").
  • Great Filter - is a proposed explanation for the Fermi Paradox. The development of intelligent life requires many steps, such as the emergence of single-celled life and the transition from unicellular to multicellular life forms. Since we have not observed intelligent life beyond our planet, there seems to be a developmental step that is so difficult and unlikely that it "filters out" nearly all civilizations before they can reach a space-faring stage.
  • Group rationality - In almost anything, individuals are inferior to groups.
  • Group selection – is the mistaken belief about evolutionary theory that a feature of an organism is selected for the good of the group rather than for the fitness of the genes that produce it.
  • Heuristic - quick, intuitive strategy for reasoning or decision making, as opposed to more formal methods. Heuristics require much less time and energy to use, but sometimes go awry, producing bias.
  • Heuristics and biases - a program in cognitive psychology that tries to work backward from biases (experimentally reproducible human errors) to heuristics (the underlying mechanisms at work in the brain).
  • Hold Off on Proposing Solutions - "Do not propose solutions until the problem has been discussed as thoroughly as possible without suggesting any." It is easy to show that this edict works in contexts where there are objectively defined good solutions to problems.
  • Hollywood rationality- What Spock does, not what actual rationalists do.
  • How an algorithm feels - Our philosophical intuitions are generated by algorithms in the human brain. To dissolve a philosophical dilemma, it often suffices to understand the cognitive algorithm that generates the appearance of the dilemma - if you understand the algorithm in sufficient detail. It is not enough to say "An algorithm does it!" - this might as well be magic. It takes a detailed step-by-step walkthrough.
  • Hypocrisy - the act of laying claim to motives, morals and standards one does not possess. Informally, it refers to not living up to the standards that one espouses, whether or not one sincerely believes those standards.
  • Impossibility - Careful use of language dictates that we distinguish between several senses in which something can be said to be impossible. Some things are logically impossible: you can't have a square circle or an object that is both perfectly black and perfectly not-black. Also, in our reductionist universe operating according to universal physical laws, some things are physically impossible based on our model of how things work, even if they are not obviously contradictory or contrary to reason: for example, the laws of thermodynamics give us a strong guarantee that there can never be a perpetual motion machine. It can be tempting to label as impossible very difficult problems which you have no idea how to solve. But the apparent lack of a solution is not a strong guarantee that no solution can exist in the way that the laws of thermodynamics, or Godel's incompleteness results, give us proofs that something cannot be accomplished. A blank map does not correspond to a blank territory; in the absence of a proof that a problem is insoluble, you can't be confident that you're not just overlooking something that a greater intelligence would spot in an instant.
  • Improper belief – is a belief that isn't concerned with describing the territory. A proper belief, on the other hand, requires observations, gets updated upon encountering new evidence, and provides practical benefit in anticipated experience. Note that the fact that a belief just happens to be true doesn't mean you're right to have it. If you buy a lottery ticket, certain that it's a winning ticket (for no reason), and it happens to be, believing that was still a mistake. Types of improper belief discussed in the Mysterious Answers to Mysterious Questions sequence include: Free-floating belief, Belief as attire, Belief in belief and Belief as cheering.
  • Incredulity - Spending emotional energy on incredulity wastes time you could be using to update. It repeatedly throws you back into the frame of the old, wrong viewpoint. It feeds your sense of righteous indignation at reality daring to contradict you.
  • Intuition pump - a thought experiment that highlights, or "pumps", certain ideas, intuitions or concepts while attenuating others, so as to make some conclusion seem obvious and simple to reach. The intuition pump is a carefully designed persuasion tool in which you check to see whether the same intuitions still get pumped when you change certain settings in the thought experiment.
  • Kolmogorov complexity - given a string, the length of the shortest possible program that prints it.
  • Lawful intelligence - The startling and counterintuitive notion - contradicting both surface appearances and all Deep Wisdom - that intelligence is a manifestation of Order rather than Chaos. Even creativity and outside-the-box thinking are essentially lawful. While this is a complete heresy according to the standard religion of Silicon Valley, there are some good mathematical reasons for believing it.
  • Least convenient possible world – is a technique for enforcing intellectual honesty, to be used when arguing against an idea. The essence of the technique is to assume that all the specific details will align with the idea against which you are arguing, i.e. to consider the idea in the context of a least convenient possible world, where every circumstance is colluding against your objections and counterarguments. This approach ensures that your objections are strong enough, running minimal risk of being rationalizations for your position.
  • Logical rudeness – is a response to criticism which insulates the responder from having to address the criticism directly. For example, ignoring all the diligent work that evolutionary biologists did to dig up previous fossils, and insisting you can only be satisfied by an actual videotape, is "logically rude" because you're ignoring evidence that someone went to a great deal of trouble to provide to you.
  • Log odds – is an alternate way of expressing probabilities, which simplifies the process of updating them with new evidence: updates that multiply the odds become simple addition. The log odds is the log of the odds ratio, log(P/(1−P)).
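As a sketch of why log odds are convenient: multiplying the odds by a likelihood ratio becomes mere addition in log-odds space.

```python
import math

def to_log_odds(p):
    """Convert a probability to log odds: log(P / (1 - P))."""
    return math.log(p / (1 - p))

def to_probability(log_odds):
    """Invert the conversion (the logistic function)."""
    return 1 / (1 + math.exp(-log_odds))

# Updating on evidence with likelihood ratio L multiplies the odds by L,
# which is just *adding* log(L) in log-odds space.
prior = 0.5                      # 1:1 odds, log odds 0
likelihood_ratio = 4.0
posterior = to_probability(to_log_odds(prior) + math.log(likelihood_ratio))
print(posterior)  # 0.8: the odds went from 1:1 to 4:1
```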
  • Magical categories - an English word which, although it sounds simple - hey, it's just one word, right? - is actually not simple, and furthermore, may be applied in a complicated way that drags in other considerations. Physical brains are not powerful enough to search all possibilities; we have to cut down the search space to possibilities that are likely to be good. Most of the "obviously bad" methods - those that would end up violating our other values, and so ranking very low in our preference ordering - do not even occur to us as possibilities.
  • Making Beliefs Pay Rent - Every question of belief should flow from a question of anticipation, and that question of anticipation should be the centre of the inquiry. Every guess of belief should begin by flowing to a specific guess of anticipation, and should continue to pay rent in future anticipations. If a belief turns deadbeat, evict it.
  • Many-worlds interpretation - uses decoherence to explain how the universe splits into many separate branches, each of which looks like it came out of a random collapse.
  • Map and territory- Less confusing than saying "belief and reality", "map and territory" reminds us that a map of Texas is not the same thing as Texas itself. Saying "map" also dispenses with possible meanings of "belief" apart from "representations of some part of reality". Since our predictions don't always come true, we need different words to describe the thingy that generates our predictions and the thingy that generates our experimental results. The first thingy is called "belief", the second thingy "reality".
  • Meme lineage – is a set of beliefs, attitudes, and practices that all share a clear common origin point. This concept also emphasizes the means of transmission of the beliefs in question. If a belief is part of a meme lineage that transmits for primarily social reasons, it may be discounted for purposes of the modesty argument.
  • Memorization - is what you're doing when you cram for a university exam. It's not the same thing as understanding.
  • Modesty - admitting or boasting of flaws so as to not create perceptions of arrogance. Not to be confused with humility.
  • Most of science is actually done by induction - To come up with something worth testing, a scientist needs to do lots of sound induction first or borrow an idea from someone who already used induction. This is because induction is the only way to reliably find candidate hypotheses which deserve attention. Examples of bad ways to find hypotheses include finding something interesting or surprising to believe in and then pinning all your hopes on that thing turning out to be true.
  • Most people's beliefs aren’t worth considering - Sturgeon's Law says that, as a general rule, 90% of everything is garbage. Even if it is the case that 90% of everything produced by any field is garbage, that does not mean one can dismiss the 10% that is quality work. Instead, it is important to engage with that 10%, and use that as the standard of quality.
  • Nash equilibrium - a stable state of a system involving the interaction of different participants, in which no participant can gain by a unilateral change of strategy if the strategies of the others remain unchanged.
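A minimal way to check this definition on a concrete game is a best-response test. The sketch below uses the prisoner's dilemma with conventional payoff numbers (strategy 0 = cooperate, 1 = defect):

```python
# payoffs[row][col] = (row player's payoff, column player's payoff).
payoffs = [
    [(3, 3), (0, 5)],
    [(5, 0), (1, 1)],
]

def is_nash(row, col):
    """A pure-strategy profile is a Nash equilibrium if neither player
    gains by unilaterally switching strategy."""
    row_ok = all(payoffs[r][col][0] <= payoffs[row][col][0] for r in range(2))
    col_ok = all(payoffs[row][c][1] <= payoffs[row][col][1] for c in range(2))
    return row_ok and col_ok

print(is_nash(1, 1))  # True: mutual defection is the unique equilibrium
print(is_nash(0, 0))  # False: either player gains by unilaterally defecting
```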
  • Newcomb's problem - In Newcomb's problem, a superintelligence called Omega shows you two boxes, A and B, and offers you the choice of taking only box A, or both boxes A and B. Omega has put $1,000 in box B. If Omega thinks you will take box A only, he has put $1,000,000 in box A; otherwise, box A is empty. Omega has an excellent track record of predicting such choices correctly. Do you take one box or two?
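A rough expected-value sketch, assuming (hypothetically) that Omega predicts correctly with probability p, shows why the problem is contentious: for an accurate predictor, one-boxing has the higher expected payoff, even though two-boxing dominates once the boxes are fixed.

```python
# Expected payoffs in Newcomb's problem, given an assumed predictor
# accuracy p (illustrative only -- the standard problem has Omega
# predicting almost certainly).
def expected_one_box(p):
    # Correct prediction -> $1,000,000 in box A; wrong -> box A is empty.
    return p * 1_000_000

def expected_two_box(p):
    # Box B's $1,000 is always yours; box A holds $1,000,000 only if
    # Omega wrongly predicted you would one-box.
    return 1_000 + (1 - p) * 1_000_000

for p in (0.5, 0.99):
    print(p, expected_one_box(p), expected_two_box(p))
```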
  • Nonapples - a proposed object, tool, technique, or theory which is defined only as being not like a specific, existent example of said categories. It is a type of overly-general prescription which, while of little utility, can seem useful. It involves disguising a shallow criticism as a solution, often in such a way as to make it look profound. For instance, suppose someone says, "We don't need war, we need non-violent conflict resolution." In this way a shallow criticism (war is bad) is disguised as a solution (non-violent conflict resolution, i.e., nonwar). This person is selling nonapples because "non-violent conflict resolution" isn't a method of resolving conflict nonviolently. Rather, it is a description of all conceivable methods of non-violent conflict resolution, the vast majority of which are incoherent and/or ineffective.
  • Noncentral fallacy - A rhetorical move often used in political, philosophical, and cultural arguments. "X is in a category whose archetypal member gives us a certain emotional reaction. Therefore, we should apply that emotional reaction to X, even though it is not a central category member."
  • Not technically a lie – a statement that is literally true, but causes the listener to attain false beliefs by performing incorrect inference.
  • Occam's razor - principle commonly stated as "Entities must not be multiplied beyond necessity". When several theories are able to explain the same observations, Occam's razor suggests the simpler one is preferable.
  • Odds ratio - are an alternate way of expressing probabilities, which simplifies the process of updating them with new evidence. The odds ratio of A is P(A)/P(¬A).
  • Omega - A hypothetical super-intelligent being used in philosophical problems. Omega is most commonly used as the predictor in Newcomb's problem. In its role as predictor, Omega's predictions occur almost certainly. In some thought experiments, Omega is also taken to be super-powerful. Omega can be seen as analogous to Laplace's demon, or as the closest approximation to the Demon capable of existing in our universe.
  • Oops - Theories must be bold and expose themselves to falsification; be willing to commit the heroic sacrifice of giving up your own ideas when confronted with contrary evidence; play nice in your arguments; try not to deceive yourself; and other fuzzy verbalisms. It is better to say oops quickly when you realize a mistake. The alternative is stretching out the battle with yourself over years.
  • Outside view - Taking the outside view (another name for reference class forecasting) means using an estimate based on a class of roughly similar previous cases, rather than trying to visualize the details of a process. For example, estimating the completion time of a programming project based on how long similar projects have taken in the past, rather than by drawing up a graph of tasks and their expected completion times.
  • Overcoming Bias - is a group blog on the systemic mistakes humans make, and how we can possibly correct them.
  • Paperclip maximizer – is an AI that has been created to maximize the number of paperclips in the universe. It is a hypothetical unfriendly artificial intelligence.
  • Pascal's mugging – is a thought-experiment demonstrating a problem in expected utility maximization. A rational agent should choose actions whose outcomes, when weighed by their probability, have higher utility. But some very unlikely outcomes may have very great utilities, and these utilities can grow faster than the probability diminishes. Hence the agent should focus more on vastly improbable cases with implausibly high rewards.
  • Password - The answer you guess instead of actually understanding the problem.
  • Philosophical zombie - a hypothetical entity that looks and behaves exactly like a human (often stipulated to be atom-by-atom identical to a human) but is not actually conscious: they are often said to lack qualia or phenomenal consciousness.
  • Phlogiston - the 18th century's answer to the Elemental Fire of the Greek alchemists. Ignite wood, and let it burn. What is the orangey-bright "fire" stuff? Why does the wood transform into ash? To both questions, the 18th-century chemists answered, "phlogiston"....and that was it, you see, that was their answer: "Phlogiston." —Fake Causality
  • Possibility - words in natural language carry connotations that may become misleading when the words get applied with technical precision. While it's not technically a lie to say that it's possible to win a lottery, the statement is deceptive. It's much more precise, for communication of the actual fact through connotation, to say that it’s impossible to win the lottery. This is an example of antiprediction.
  • Possible world - is one that is internally consistent, even if it is counterfactual.
  • Prediction market - speculative markets created for the purpose of making predictions. Assets are created whose final cash value is tied to a particular event or parameter. The current market prices can then be interpreted as predictions of the probability of the event or the expected value of the parameter.
  • Priors - refer generically to the beliefs an agent holds regarding a fact, hypothesis or consequence, before being presented with evidence.
  • Probability is in the Mind - Probabilities express uncertainty, and it is only agents who can be uncertain. A blank map does not correspond to a blank territory. Ignorance is in the mind.
  • Probability theory - a field of mathematics which studies random variables and processes.
  • Rationality - the characteristic of thinking and acting optimally. An agent is rational if it wields its intelligence in such a way as to maximize the convergence between its beliefs and reality; and acts on these beliefs in such a manner as to maximize its chances of achieving whatever goals it has. For humans, this means mitigating (as much as possible) the influence of cognitive biases.
  • Rational evidence - the broadest possible sense of evidence, the Bayesian sense. Rational evidence about a hypothesis H is any observation which has a different likelihood depending on whether H holds in reality or not. Rational evidence is distinguished from narrower forms of evidence, such as scientific evidence or legal evidence. For a belief to be scientific, you should be able to do repeatable experiments to verify the belief. For evidence to be admissible in court, it must e.g. be a personal observation rather than hearsay.
  • Rationalist taboo - a technique for fighting muddles in discussions. By prohibiting the use of a certain word and all the words synonymous to it, people are forced to elucidate the specific contextual meaning they want to express, thus removing ambiguity otherwise present in a single word. Mainstream philosophy has a parallel procedure called "unpacking" where doubtful terms need to be expanded out.
  • Rationality and Philosophy - A sequence by lukeprog examining the implications of rationality and cognitive science for philosophical method.
  • Rationality as martial art - A metaphor for rationality as the martial art of mind; training brains in the same fashion as muscles. The metaphor is intended to have complex connotations, rather than being strictly positive. Do modern-day martial arts suffer from being insufficiently tested in realistic fighting, and do attempts at rationality training run into the same problem?
  • Reversal test - a technique for fighting status quo bias in judgments about the preferred value of a continuous parameter. If one deems the change of the parameter in one direction to be undesirable, the reversal test is to check that either the change of that parameter in the opposite direction (away from status quo) is deemed desirable, or that there are strong reasons to expect that the current value of the parameter is (at least locally) the optimal one.
  • Reductionism - a disbelief that the higher levels of simplified multilevel models are out there in the territory, or that concepts constructed by the mind in themselves play a role in the behavior of reality. This doesn't contradict the notion that the concepts used in simplified multilevel models refer to actual clusters of configurations of reality.
  • Religion - Religion is a complex group of human activities — involving tribal affiliation, belief in belief, supernatural claims, and a range of shared group practices such as worship meetings, rites of passage, etc.
  • Reversed stupidity is not intelligence - "The world's greatest fool may say the Sun is shining, but that doesn't make it dark out."
  • Science - a method for developing true beliefs about the world. It works by developing hypotheses about the world, creating experiments that would allow the hypotheses to be tested, and running the experiments. By having people publish their falsifiable predictions and their experimental results, science protects itself from individuals deceiving themselves or others.
  • Scoring rule - a scoring rule is a measure of performance of probabilistic predictions - made under uncertainty.
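Two standard proper scoring rules for a probabilistic prediction p of a binary event are the Brier score and the logarithmic score; a small sketch:

```python
import math

def brier_score(p, outcome):
    """Squared error of the forecast; lower is better."""
    return (p - outcome) ** 2

def log_score(p, outcome):
    """Log of the probability assigned to what actually happened;
    higher (less negative) is better."""
    return math.log(p if outcome == 1 else 1 - p)

# A confident, correct forecast scores better than a hedged one.
print(brier_score(0.9, 1), brier_score(0.5, 1))
print(log_score(0.9, 1), log_score(0.5, 1))
```

Both rules are "proper": a forecaster maximizes expected score by reporting their true probability, which is why they reward calibration rather than bravado.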
  • Seeing with Fresh Eyes - A sequence on the incredibly difficult feat of getting your brain to actually think about something, instead of instantly stopping on the first thought that comes to mind.
  • Semantic stopsign – is a meaningless generic explanation that creates an illusion of giving an answer, without actually explaining anything.
  • Shannon information - The Shannon entropy is a measure of the average information content one is missing when one does not know the value of the random variable.
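The Shannon entropy of a discrete distribution can be computed directly from its definition, H = −Σ p·log₂(p):

```python
import math

def entropy(probs):
    """Shannon entropy in bits: the average information content of a
    random variable with the given outcome probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))   # 1.0 bit: a fair coin flip
print(entropy([1.0]))        # 0.0 bits: a certain outcome carries no news
print(entropy([0.25] * 4))   # 2.0 bits: four equally likely outcomes
```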
  • Shut up and multiply - the ability to trust the math even when it feels wrong.
  • Signaling - "a method of conveying information among not-necessarily-trustworthy parties by performing an action which is more likely or less costly if the information is true than if it is not true".
  • Solomonoff induction - A formalized version of Occam's razor based on Kolmogorov complexity.
  • Sound argument - an argument that is valid and whose premises are all true. In other words, the premises are true and the conclusion necessarily follows from them, making the conclusion true as well.
  • Spaced repetition - is a technique for building long-term knowledge efficiently. It works by showing you a flash card just before a computer model predicts you will have forgotten it. Anki is Less Wrong's spaced repetition software of choice.
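The core idea can be sketched with a toy scheduler. This is NOT Anki's actual algorithm (Anki uses a variant of SM-2); the "ease" factor here is an assumed illustrative constant:

```python
# Each successful review stretches the interval by an assumed "ease"
# factor; a lapse resets it to one day.
def next_interval(days, remembered, ease=2.5):
    return int(days * ease) if remembered else 1

interval = 1
schedule = []
for _ in range(4):              # four successful reviews in a row
    interval = next_interval(interval, remembered=True)
    schedule.append(interval)
print(schedule)  # [2, 5, 12, 30] -- reviews spread out as memory strengthens
```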
  • Statistical bias - "Bias" as used in the field of statistics refers to directional error in an estimator. Statistical bias is error you cannot correct by repeating the experiment many times and averaging together the results.
  • Steel man - the strongest possible version of an argument, often stronger than the version its proponent actually gave; the opposite of a Straw Man.
  • Superstimulus - an exaggerated version of a stimulus to which there is an existing response tendency, or any stimulus that elicits a response more strongly than the stimulus for which it evolved.
  • Surprise - Recognizing a fact that disagrees with your intuition as surprising is an important step in updating your worldview.
  • Sympathetic magic - Humans seem to naturally generate a series of concepts known as sympathetic magic, a host of theories and practices which have certain principles in common, two of which are of overriding importance: the Law of Contagion holds that two things which have interacted, or were once part of a single entity, retain their connection and can exert influence over each other; the Law of Similarity holds that things which are similar or treated the same establish a connection and can affect each other.
  • Tapping Out - The appropriate way to signal that you've said all you wanted to say on a particular topic, and that you're ending your participation in a conversation lest you start saying things that are less worthwhile. It doesn't mean accepting defeat or claiming victory and it doesn't mean you get the last word. It just means that you don't expect your further comments in a thread to be worthwhile, because you've already made all the points you wanted to, or because you find yourself getting too emotionally invested, or for any other reason you find suitable.
  • Technical explanation - A technical explanation is an explanation of a phenomenon that makes you anticipate certain experiences. A proper technical explanation controls anticipation strictly, weighting your priors and evidence precisely to create the justified amount of uncertainty. Technical explanations are contrasted with verbal explanations, which give the impression of understanding without actually producing the proper expectation.
  • Teleology - The study of things that happen for the sake of their future consequences. The fallacious sense is that events are caused by future events. The non-fallacious sense is the study of things that happen because of their intended results, where the intention existed in an actual mind in the prior past, and so was causally able to bring about the event by planning and acting.
  • The map is not the territory – the idea that our perception of the world is being generated by our brain and can be considered as a 'map' of reality written in neural patterns. Reality exists outside our mind but we can construct models of this 'territory' based on what we glimpse through our senses.
  • Third option - is a way to break a false dilemma, showing that neither of the suggested solutions is a good idea.
  • Traditional rationality - "Traditional Rationality" refers to the tradition passed down by reading Richard Feynman's "Surely You're Joking", Thomas Kuhn's "The Structure of Scientific Revolutions", Martin Gardner's "Science: Good, Bad, and Bogus", Karl Popper on falsifiability, or other non-technical material on rationality. Traditional Rationality is a very large improvement over nothing at all, and very different from Hollywood rationality; people who grew up on this belief system are definitely fellow travelers, and where most of our recruits come from. But you can do even better by adding math, science, formal epistemic and instrumental rationality; experimental psychology, cognitive science, deliberate practice, in short, all the technical stuff. There are also some popular tropes of Traditional Rationality that actually seem flawed once you start comparing them to a Bayesian standard - for example, the idea that you ought to give up an idea once definite evidence has been provided against it, but you're allowed to believe until then, if you want to. Contrast to the stricter idea of there being a certain exact probability which it is correct to assign, continually updated in the light of new evidence.
  • Trivial inconvenience - inconveniences that take few resources to counteract but have a disproportionate impact on people deciding whether to take a course of action.
  • Truth - the correspondence between one's beliefs about reality and reality.
  • Tsuyoku naritai - the will to transcendence. Japanese: "I want to become stronger."
  • Twelve virtues of rationality
    1. Curiosity – the burning itch
    2. Relinquishment – “That which can be destroyed by the truth should be.” -P. C. Hodgell
    3. Lightness – follow the evidence wherever it leads
    4. Evenness – resist selective skepticism; use reason, not rationalization
    5. Argument – do not avoid arguing; strive for exact honesty; fairness does not mean balancing yourself evenly between propositions
    6. Empiricism – knowledge is rooted in empiricism and its fruit is prediction; argue what experiences to anticipate, not which beliefs to profess
    7. Simplicity – is virtuous in belief, design, planning, and justification; ideally: nothing left to take away, not nothing left to add
    8. Humility – take specific actions in anticipation of your own errors; do not boast of modesty; no one achieves perfection
    9. Perfectionism – seek the answer that is *perfectly* right – do not settle for less
    10. Precision – the narrowest statements slice deepest; don’t walk but dance to the truth
    11. Scholarship – absorb the powers of science
    12. [The void] (the nameless virtue) – “More than anything, you must think of carrying your map through to reflecting the territory.”
  • Understanding - is more than just memorization of detached facts; it requires ability to see the implications across a variety of possible contexts.
  • Universal law - the idea that everything in reality always behaves according to the same uniform physical laws; there are no exceptions and no alternatives.
  • Unsupervised universe - a thought experiment developed to counter undue optimism - not just the sort due to explicit theology, but in particular a disbelief in the Future's vulnerability, a reluctance to accept that things could really turn out wrong. It imagines a benevolent god and a simulated universe such as Conway's Game of Life, then asks the mathematical question of what would happen according to the standard Life rules given certain initial conditions - a question whose answer not even God can control, although, of course, God can always intervene in the actual Life universe.
  • Valid argument - An argument is valid when its conclusion follows logically from its premises; a valid argument can still be unsound if one of its premises is false.
  • Valley of bad rationality - It has been observed that when someone is just starting to learn rationality, they appear to be worse off than they were before. Others, with more experience at rationality, claim that after you learn more about rationality, you will be better off than you were before you started. The period before this improvement is known as "the valley of bad rationality".
  • Wisdom of the crowd – is the collective opinion of a group of individuals rather than that of a single expert. A large group's aggregated answers to questions involving quantity estimation, general world knowledge, and spatial reasoning has generally been found to be as good as, and often better than, the answer given by any of the individuals within the group.
  • Words can be wrong – There are many ways that words can be wrong; it is for this reason that we should avoid arguing by definition. Instead, to facilitate communication we can taboo and reduce: we can replace the symbol with the substance and talk about facts and anticipations, not definitions.
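The "wisdom of the crowd" entry above is easy to demonstrate numerically: independent noisy estimates tend to average out, so a simple aggregate can beat most individuals. A minimal sketch in Python (the jar size and guesses are invented purely for illustration):

```python
# Hypothetical independent guesses of the number of beans in a jar of 1000.
guesses = [850, 1150, 700, 1300, 950, 1050, 600, 1400]

# The crowd's answer: a simple aggregation (the mean of all guesses).
crowd_estimate = sum(guesses) / len(guesses)

true_value = 1000
crowd_error = abs(crowd_estimate - true_value)

# How far off each person was individually, on average.
individual_errors = [abs(g - true_value) for g in guesses]
average_individual_error = sum(individual_errors) / len(individual_errors)

print(crowd_estimate)            # 1000.0
print(crowd_error)               # 0.0
print(average_individual_error)  # 225.0
```

With these toy numbers the individual errors partially cancel, so the aggregate lands far closer to the truth than the typical individual guess; real crowds are noisier, but the same cancellation effect drives the result.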


Barriers, biases, fallacies, impediments and problems

  • Akrasia - the state of acting against one's better judgment. Note that, for example, if you are procrastinating because it's not in your best interest to complete the task you are delaying, it is not a case of akrasia.
  • Alief - an independent source of emotional reaction which can coexist with a contradictory belief. For example, the fear felt when a monster jumps out of the darkness in a scary movie is based on the alief that the monster is about to attack you, even though you believe that it cannot.
  • Effort Shock - the unpleasant discovery of how hard it is to accomplish something.


  • Ambient decision theory - A variant of updateless decision theory that uses first-order logic instead of a mathematical intuition module (MIM). It emphasizes the way an agent can control which mathematical structure a fixed definition defines - an aspect of UDT separate from its emphasis on not making the mistake of updating away things one can still acausally control.
  • Ask, Guess and Tell culture - The two basic rules of Ask Culture: 1) Ask when you want something. 2) Interpret things as requests and feel free to say "no". The two basic rules of Guess Culture: 1) Ask for things if, and *only* if, you're confident the person will say "yes". 2) Interpret requests as expectations of "yes", and, when possible, avoid saying "no". The two basic rules of Tell Culture: 1) Tell the other person what's going on in your own mind whenever you suspect you'd both benefit from them knowing. (Do NOT assume others will accurately model your mind without your help, or that it will even occur to them to ask you questions to eliminate their ignorance.) 2) Interpret things people tell you as attempts to create common knowledge for shared benefit, rather than as requests or as presumptions of compliance.
  • Burch's law – “I think people should have a right to be stupid and, if they have that right, the market's going to respond by supplying as much stupidity as can be sold.” —Greg Burch. A corollary of Burch's Law is that any bias should be regarded as a potential vulnerability whereby the market can trick one into buying something one doesn't really want.
  • Challenging the Difficult - A sequence on how to do things that are difficult or "impossible".
  • Cognitive style - Certain cognitive styles might tend to produce more accurate results. A common distinction between cognitive styles is that of foxes vs. hedgehogs. Hedgehogs view the world through the lens of a single defining idea and foxes draw on a wide variety of experiences and for whom the world cannot be boiled down to a single idea. Foxes tend to be better calibrated and more accurate.
  • Consequentialism - the ethical theory that people should choose the action that will result in the best outcome.
  • Crocker's rules - By declaring commitment to Crocker's rules, one authorizes other debaters to optimize their messages for information, even when this entails that emotional feelings will be disregarded. This means that you have accepted full responsibility for the operation of your own mind, so that if you're offended, it's your own fault.
  • Dark arts - refers to rhetorical techniques crafted to exploit human cognitive biases in order to persuade, deceive, or otherwise manipulate a person into irrationally accepting beliefs perpetuated by the practitioner of the Arts. Use of the dark arts is especially common in sales and similar situations (known as hard sell in the sales business) and promotion of political and religious views.
  • Egalitarianism - the idea that everyone should be considered equal. Equal in merit, equal in opportunity, equal in morality, and equal in achievement. Dismissing egalitarianism is not opposed to humility, even though from the signaling perspective it seems to be opposed to modesty.
  • Expected utility - the expected value in terms of the utility produced by an action. It is the sum of the utility of each of its possible consequences, individually weighted by their respective probability of occurrence. A rational decision maker will, when presented with a choice, take the action with the greatest expected utility.
  • Explaining vs. explaining away – Explaining something does not subtract from its beauty; in fact it heightens it. Through understanding it, you gain greater awareness of it. Through understanding it, you are more likely to notice its similarities and interrelationships with other things. Through understanding it, you become able to see it not only on one level, but on multiple levels. As regards the delusions which people are emotionally attached to: that which can be destroyed by the truth should be.
  • Fuzzies - A hypothetical measurement unit for "warm fuzzy feeling" one gets from believing that one has done good. Unlike utils, fuzzies can be earned through psychological tricks without regard for efficiency. For this reason, it may be a good idea to separate the concerns for actually doing good, for which one might need to shut up and multiply, and for earning fuzzies, to get psychological comfort.
  • Game theory - attempts to mathematically model interactions between individuals.
  • Generalizing from One Example - an incorrect generalisation when you only have direct first-person knowledge of one mind, psyche or social circle and you treat it as typical even in the face of contrary evidence.
  • Goodhart’s law - states that once a certain indicator of success is made a target of a social or economic policy, it loses the information content that would qualify it to play such a role. People and institutions try to achieve their explicitly stated targets in the easiest way possible, often obeying only the letter of the law, and often in a way that the designers of the law did not anticipate or want. For example, Soviet factories given targets based on the number of nails produced many tiny useless nails, and when given targets based on weight produced a few giant nails.
  • Hedonism - refers to a set of philosophies which hold that the highest goal is to maximize pleasure, or more precisely pleasure minus pain.
  • Humans Are Not Automatically Strategic - most courses of action are extremely ineffective, and most of the time there has been no strong evolutionary or cultural force sufficient to focus us on the very narrow behavior patterns that would actually be effective. When this is coupled with the fact that people tend to spend a lot less effort on planning how to reach a goal than on simply trying to achieve it, you end up with the conclusion that humans are not automatically strategic.
  • Human universal - Donald E. Brown has compiled a list of over a hundred human universals - traits found in every culture ever studied, most of them so universal that anthropologists don't even bother to note them explicitly.
  • Instrumental value - a value pursued for the purpose of achieving other values. Values which are pursued for their own sake are called terminal values.
  • Intellectual roles - Group rationality may be improved when members of the group take on specific intellectual roles. While these roles may be incomplete on their own, each embodies an aspect of proper rationality. If certain roles are biased against, purposefully adopting them might reduce bias.
  • Lonely Dissenters suffer social disapproval, but are required - Asch's conformity experiment showed that the presence of a single dissenter tremendously reduced the incidence of "conforming" wrong answers.
  • Loss Aversion - is risk aversion's evil twin. A loss-averse agent tends to avoid uncertain gambles, not because every unit of money brings him a bit less utility, but because he weighs losses more heavily than gains, always treating his current level of money as somehow special.
  • Luminosity - reflective awareness. A luminous mental state is one that you have and know that you have. It could be an emotion, a belief or alief, a disposition, a quale, a memory - anything that might happen or be stored in your brain. What's going on in your head?
  • Marginally zero-sum game also known as 'arms race' - A zero-sum game where the efforts of each player not just give them a benefit at the expense of the others, but decrease the efficacy of everyone's past and future actions, thus making everyone's actions extremely inefficient in the limit.
  • Moral Foundations theory (all moral rules in all human cultures appeal to the six moral foundations: care/harm, fairness/cheating, liberty/oppression,loyalty/betrayal, authority/subversion, sanctity/degradation). This makes other people's moralities easier to understand, and is an interesting lens through which to examine your own.
  • Moral uncertainty – is uncertainty about how to act given the diversity of moral doctrines. Moral uncertainty includes a level of uncertainty above the more usual uncertainty of what to do given incomplete information, since it deals also with uncertainty about which moral theory is right. Even with complete information about the world, this kind of uncertainty would still remain.
  • Paranoid debating - a group estimation game in which one player, unknown to the others, tries to subvert the group estimate.
  • Politics as charity: in terms of expected value, altruism is a reasonable motivator for voting (as opposed to common motivators like "wanting to be heard").
  • Prediction - a statement or claim that a particular event will occur in the future in more certain terms than a forecast.
  • Privileging the question - questions that someone has unjustifiably brought to your attention in the same way that a privileged hypothesis unjustifiably gets brought to your attention. Examples are: should gay marriage be legal? Should Congress pass stricter gun control laws? Should immigration policy be tightened or relaxed? The problem with privileged questions is that you only have so much attention to spare. Attention paid to a question that has been privileged funges against attention you could be paying to better questions. Even worse, it may not feel from the inside like anything is wrong: you can apply all of the epistemic rationality in the world to answering a question like "should Congress pass stricter gun control laws?" and never once ask yourself where that question came from and whether there are better questions you could be answering instead.
  • Radical honesty- a communication technique proposed by Brad Blanton in which discussion partners are not permitted to lie or deceive at all. Rather than being designed to enhance group epistemic rationality, radical honesty is designed to reduce stress and remove the layers of deceit that burden much of discourse.
  • Reflective decision theory - a term occasionally used to refer to a decision theory that would allow an agent to take actions in a way that does not trigger regret. This regret is conceptualized, according to the Causal Decision Theory, as a Reflective inconsistency, a divergence between the agent who took the action and the same agent reflecting upon it after.
  • Schelling point – is a solution that people will tend to use in the absence of communication, because it seems natural, special, or relevant to them.
  • Schelling fences and slippery slopes – a slippery slope is something that affects people's willingness or ability to oppose future policies. Slippery slopes can sometimes be avoided by establishing a "Schelling fence" - a Schelling point that the various interest groups involved - or yourself across different values and times - make a credible precommitment to defend.
  • Something to protect - The Art must have a purpose other than itself, or it collapses into infinite recursion.
  • Status - Real or perceived relative measure of social standing, which is a function of both resource control and how one is viewed by others.
  • Take joy in the merely real – If you believe that science coming to know about something places it into the dull catalogue of common things, then you're going to be disappointed in pretty much everything eventually —either it will turn out not to exist, or even worse, it will turn out to be real. Another way to think about it is that if the magical and mythical were common place they would be merely real. If dragons were common, but zebras were a rare legendary creature then there's a certain sort of person who would ignore dragons, who would never bother to look at dragons, and chase after rumors of zebras. The grass is always greener on the other side of reality. If we cannot take joy in the merely real, our lives shall be empty indeed.
  • The Science of Winning at Life - A sequence by lukeprog that summarizes scientifically-backed advice for "winning" at everyday life: in one's productivity, in one's relationships, in one's emotions, etc. Each post concludes with footnotes and a long list of references from the academic literature.
  • Timeless decision theory - a decision theory, which in slogan form, says that agents should decide as if they are determining the output of the abstract computation that they implement. This theory was developed in response to the view that rationality should be about winning (that is, about agents achieving their desired ends) rather than about behaving in a manner that we would intuitively label as rational.
  • Unfriendly artificial intelligence - is an artificial general intelligence capable of causing great harm to humanity, and having goals that make it useful for the AI to do so. The AI's goals don't need to be antagonistic to humanity's goals for it to be Unfriendly; there are strong reasons to expect that almost any powerful AGI not explicitly programmed to be benevolent to humans is lethal.
  • Updateless decision theory – a decision theory in which we give up the idea of doing Bayesian reasoning to obtain a posterior distribution etc. and instead just choose the action (or more generally, the probability distribution over actions) that will maximize the unconditional expected utility.
  • Ugh field - Pavlovian conditioning can cause humans to unconsciously flinch from even thinking about a serious personal problem they have. We call it an "ugh field". The ugh field forms a self-shadowing blind spot covering an area desperately in need of optimization.
  • Utilitarianism - A moral philosophy that says that what matters is the sum of everyone's welfare, or the "greatest good for the greatest number".
  • Utility - how much a certain outcome satisfies an agent’s preferences.
  • Utility function - assigns numerical values ("utilities") to outcomes, in such a way that outcomes with higher utilities are always preferred to outcomes with lower utilities. These do not work very well in practice for individual humans.
  • Wanting and liking - The reward system consists of three major components:
    • Liking: The 'hedonic impact' of reward, comprised of (1) neural processes that may or may not be conscious and (2) the conscious experience of pleasure.
    • Wanting: Motivation for reward, comprised of (1) processes of 'incentive salience' that may or may not be conscious and (2) conscious desires.
    • Learning: Associations, representations, and predictions about future rewards, comprised of (1) explicit predictions and (2) implicit knowledge and associative conditioning (e.g. Pavlovian associations).
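The expected-utility entry above is just a probability-weighted sum, which is easy to make concrete. A minimal sketch in Python; the actions, outcomes, probabilities, and utility numbers are all invented purely for illustration:

```python
# Each action maps to a list of (probability, utility) pairs, one per
# possible outcome. Probabilities for each action must sum to 1.
actions = {
    "take umbrella": [(0.3, 5), (0.7, 8)],     # rain / no rain
    "leave umbrella": [(0.3, -10), (0.7, 10)],  # rain / no rain
}

def expected_utility(outcomes):
    """Sum of each outcome's utility weighted by its probability."""
    return sum(p * u for p, u in outcomes)

# The rational choice: the action with the greatest expected utility.
best = max(actions, key=lambda a: expected_utility(actions[a]))
print(best)  # take umbrella
```

Here taking the umbrella wins (expected utility of about 7.1 versus 4.0), even though leaving it offers the single best outcome, because the weighted average matters, not the best case.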


  • Beliefs require observations - To form accurate beliefs about something, you really do have to observe it. This can be viewed as a special case of the second law of thermodynamics, in fact, since "knowledge" is correlation of belief with reality, which is mutual information, which is a form of negentropy.
  • Complexity of value - the thesis that human values have high Kolmogorov complexity and so cannot be summed up or compressed into a few simple rules. It includes the idea of fragility of value which is the thesis that losing even a small part of the rules that make up our values could lead to results that most of us would now consider as unacceptable.
  • Egan's law - "It all adds up to normality." — Greg Egan. The purpose of a theory is to add up to observed reality, rather than something else. Science sets out to answer the question "What adds up to normality?" and the answer turns out to be Quantum mechanics adds up to normality. A weaker extension of this principle applies to ethical and meta-ethical debates, which generally ought to end up explaining why you shouldn't eat babies, rather than why you should.
  • Emotion - Contrary to the stereotype, rationality doesn't mean denying emotion. When emotion is appropriate to the reality of the situation, it should be embraced; only when emotion isn't appropriate should it be suppressed.
  • Futility of chaos - A complex of related ideas having to do with the impossibility of generating useful work from entropy — a position which holds against the ideas that e.g: Our artistic creativity stems from the noisiness of human neurons, randomized algorithms can exhibit performance inherently superior to deterministic algorithms and the human brain is a chaotic system and this explains its power; non-chaotic systems cannot exhibit intelligence.
  • General knowledge - Interdisciplinary, generally applicable knowledge is rarely taught explicitly. Yet it's important to have at least basic knowledge of many areas (as opposed to deep narrowly specialized knowledge), and to apply it to thinking about everything.
  • Hope - Persisting in clutching to a hope may be disastrous. Be ready to admit you lost, update on the data that says you did.
  • Humility – “To be humble is to take specific actions in anticipation of your own errors. To confess your fallibility and then do nothing about it is not humble; it is boasting of your modesty.” —Twelve Virtues of Rationality Not to be confused with social modesty, or motivated skepticism (aka disconfirmation bias).
  • I don't know - in real life, you are constantly making decisions under uncertainty: the null plan is still a plan, refusing to choose is itself a choice, and by your choices, you implicitly take bets at some odds, whether or not you explicitly conceive of yourself as doing so.
  • Litany of Gendlin – “What is true is already so. Owning up to it doesn't make it worse. Not being open about it doesn't make it go away. And because it's true, it is what is there to be interacted with. Anything untrue isn't there to be lived. People can stand what is true, for they are already enduring it.” —Eugene Gendlin
  • Litany of Tarski – “If the box contains a diamond, I desire to believe that the box contains a diamond; If the box does not contain a diamond, I desire to believe that the box does not contain a diamond; Let me not become attached to beliefs I may not want. “ —The Meditation on Curiosity
  • Lottery - A tax on people who are bad at math. Also, a waste of hope. You will not win the lottery.
  • Magic - What seems to humans like a simple explanation, sometimes isn't at all. In our own naturalistic, reductionist universe, there is always a simpler explanation. Any complicated thing that happens, happens because there is some physical mechanism behind it, even if you don't know the mechanism yourself (which is most of the time). There is no magic.
  • Modesty argument - the claim that when two or more rational agents have common knowledge of a disagreement over the likelihood of an issue of simple fact, they should each adjust their probability estimates in the direction of the others'. This process should continue until the two agents are in full agreement. Inspired by Aumann's agreement theorem.
  • No safe defense - Authorities can be trusted exactly as much as a rational evaluation of the evidence deems them trustworthy, no more and no less. There's no one you can trust absolutely; the full force of your skepticism must be applied to everything.
  • Offense - It is hypothesized that the emotion of offense appears when one perceives an attempt to gain status.
  • Slowness of evolution - The tremendously slow timescale of evolution, especially for creating new complex machinery (as opposed to selecting on existing variance), is why the behavior of evolved organisms is often better interpreted in terms of what did in fact work yesterday, rather than what will work in the future.
  • Stupidity of evolution - Evolution can only access a very limited area in the design space, and can only search for the new designs very slowly, for a variety of reasons. The wonder of evolution is not how intelligently it works, but that an accidentally occurring optimizer without a brain works at all.
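The modesty argument above can be caricatured numerically: if two agents repeatedly shift their probability estimates toward each other, they converge on a shared estimate. A toy sketch in Python; the step rule is simple averaging standing in for the genuine Bayesian updating the argument actually requires, and the function name and starting estimates are invented for illustration:

```python
def modesty_update(p_a, p_b, step=1/3, rounds=20):
    """Each round, each agent shifts its probability estimate part of
    the way toward the other's (a crude stand-in for real updating)."""
    for _ in range(rounds):
        p_a, p_b = p_a + step * (p_b - p_a), p_b + step * (p_a - p_b)
    return p_a, p_b

# Two agents who initially disagree strongly about some proposition.
a, b = modesty_update(0.9, 0.2)
# After enough rounds, a and b are (nearly) equal, settling between
# the two starting estimates.
```

Because the updates are symmetric, the gap shrinks by a constant factor each round, so the two estimates meet at the midpoint of where they started; real Aumann-style agreement is subtler, but the convergence picture is the same.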

Lesswrong real time chat

18 Elo 04 September 2015 02:29AM

This is a short post to say that I have started and am managing a Slack channel for lesswrong.

Slack has only an email-invite option which means that I need an email address for anyone who wants to join.  Send me a PM with your email address if you are interested in joining.

There is a web interface and a mobile app that is better than Google Hangouts.


If you are interested in joining; consider this one requirement:

  • You must be willing to be charitable in your conversations with your fellow lesswrongers.


To be clear, this means (including but not limited to):

  • Steelmanning, not strawmanning, in discussion
  • Respect for others
  • Patience
So far every conversation we have had has been excellent; there have been no problems at all, and everyone is striving towards a better understanding of each other. This policy does not come out of a recognition of any failure to be charitable, but is a standard to set moving forward. I have no reason to expect it will be broken, but all the same, I feel it is valuable to have.



I would like this to have several goals and purposes (some of which were collaboratively developed with other lesswrongers in the chat, and if more come up in the future too that would be good)
  • an aim for productive conversations, to make progress on our lives.
  • a brains trust for life-advice in all kinds of areas where, "outsource this decision to others" is an effective strategy.
  • collaborative creation of further rationality content
  • a safe space for friendly conversation on the internet (a nice place to hang out)
  • A more coherent and stronger connected lesswrong
  • Development of better ideas and strategies in how to personally improve the world.

So far the chat has been operating by private invite from me for about two weeks as a trial. Since this post was created we now have an ongoing conversation, with exciting new ideas being produced all the time. If nothing else - it's fun to be in. If something - we are generating a growing space for rationality and other ideas. I have personally gained two very good friends already, whom I now talk to every day. (Which coincidentally slowed me down from posting this notice, because I was too busy with other things and learning from new people.)

I realise this type of medium is not for all.  But I am keen to make it work.

I also realise that when people PM me their email addresses, other people will not see how many of you have already signed up. So generally assume that there have been others who have already signed up, and don't hesitate to join. If you are wondering whether you have anything to contribute, that's exactly the type of person we want to be inviting; by having that thought, you mark yourself as the type of person who tries harder. We want you (and others) to talk with us.

Edit: Topics we now host;
  • AI
  • Film making
  • Goals of lesswrong
  • Human Relationships
  • media
  • parenting
  • philosophy
  • political talk
  • programming
  • real life
  • Resources and links
  • science
  • travelling
  • and some admin channels; the "welcome", "misc", and "RSS" from the lw site.

One model of understanding independent differences in sensory perception

17 Elo 20 September 2015 09:32PM

This week my friend Anna said to me; "I just discovered my typical mind fallacy around visualisation is wrong". Naturally I was perplexed and confused. She said; 

“When I was in second grade the teacher had the class do an exercise in visualization. The students sat in a circle and the teacher instructed us to picture an ice cream cone with our favorite ice cream. I thought about my favorite type of cone and my favorite flavor, but the teacher emphasized "picture this in your head, see the ice cream." I tried this, and nothing happened. I couldn't see anything in my head, let alone an ice cream. I concluded, in my childish vanity, that no one could see things in their head, "visualizing" must just be strong figurative language for "pretending," and the exercise was just boring.”


Typical mind fallacy being; "everyone thinks like me" (Or A-typical mind fallacy – "no one thinks like me"). My good friend had discovered (a long time ago) that she had no visualisation function. But only recently made sense of it (approximately 15-20 years later). Anna came to me upset, "I am missing out on a function of the brain; limited in my experiences". Yes; true. She was. And we talked about it and tried to measure and understand that loss in better terms. The next day Anna was back but resolved to feeling better about it. Of course realising the value of individual differences in humans, and accepting that whatever she was missing; she was compensating for it by being an ordinary functional human (give or take a few things here and there), and perhaps there were some advantages.


Together we set off down the road of evaluating the concept of the visualisation sense. So bearing in mind; that we started with "visualise an ice cream"... Here is what we covered.

Close your eyes for a moment (after reading this paragraph). You can see the "blackness", but you can also see the white sparkles/splotches and some red stuff (maybe beige), as well as the echo-y shadows of what you last looked at, probably your white computer screen. They echo and bounce around your vision. That's pretty easy. Now close your eyes and picture an ice cream cone. The visualisation-imagination space is not in my visual field, but what I do have is a canvas somewhere on which I draw that ice cream, and anything else I visualise. It's definitely in a different place. (We will come back to "where" it is later.)

So either you have this "notepad" or "canvas" in your head for the visual perception space, or you do not. Well, it's more like a spectrum of strength of visualisation, where some people will visualise clear and vivid things, and others will have (for lack of better terms) "grey", "echoes", shadows, or foggy visualisation, where drawing that ice cream is a really hard thing to do. Anna describes what she can get now in adulthood as a vague kind of bas relief of an image, like an after-effect. It should help you model other people to understand that people can visualise better or worse. (Probably not a big deal yet; just wait.)


It occurs to me that there are other canvases, not just for the visual space but for smell and taste as well. So now try to canvas up the smell of lavender or rose, or some soap. You will probably find soap is possible to do, being of memorable and regular significance. The taste of chocolate kind of appears from all those memories you have, as do cheese, lemon and salt. (But of course someone is screaming at the page about how they don't understand when I say that chocolate "kind of appears", because it's very very vivid to them; and someone else can smell soap, but it's quite far away and grey/cloudy.)


It occurs to me now that as a teenage male I never cared about my odour, and that I regularly took feedback from some people that I should deal with that (personal lack of noticing aside). I would wonder why a few people cared a lot and others never cared at all. I can make sense of this by theorising that those people have a stronger smell canvas/faculty than other people. Which makes a whole lot of reasonable sense.

Interesting yet? There is more.

This is a big one.

Sound. But more specifically, music. I have explored the insight of having a canvas for these senses with several people over the past week, and the person from the story above confidently boasts an over-active music canvas, with tunes always going on in her head. For a very long time I decided that I was just not a person who cared about music, and never really knew to ask or try to explain why - just that it didn't matter to me. Now I have a model.


I can canvas music as it happens - in real time - and reproduce a tune, but I have no canvas for imagining auditory sounds without stimulation. (What inspired this entire write-up was someone saying how it finally made them understand why they never made sense of other people's interest in sounds and music.) If you ask me to "hear" the note C on my auditory canvas, I literally have no canvas on which to "draw" that note. I can probably hum a C (although I am not sure how), but I can't play that note in my head.

Interestingly, I asked a very talented pianist, and the response was, "of course I have a musical canvas" (to my slight disappointment). She mentioned it being a big space, and a trained thing as well. As a professional concert pianist, she can run fully imagined practice on a not-real piano and hear a whole piece, which makes for excellent practice when waiting for other things to happen (waiting rooms, queues, public transport...).


Anna from the beginning is not a musician, and says her head-music is not always pleasant but simply satisfactory to her. Sometimes songs she has heard, but mostly noises her mind produces. And words, always words. She speaks quickly and fluently, because her thoughts occur to her in words fully formed. 

I don't care very much about music because I don't "see" (imagine) it. Songs do get stuck in my head but they are more like echoes of songs I have just heard, not ones I can canvas myself.


Now to my favourite sense: touch. My biggest canvas is my touch canvas. "Feel the weight on your shoulders?" I can feel that. "Wind through your hair?" Yes. The itch, the scrape on your skin, the rough wall, the sand between your toes. All of that.


It occurs to me that this explains a lot of details of my life that never really came together. When I was little I used to touch a lot of things; my parents were notorious for shouting my name just as I reached to grab something. I was known as a "bull in a china shop" because I would touch everything, move everything, feel everything, and get into all kinds of trouble with my touch. I once found myself walking along next to a building while swiping my hand along it. I was with a friend who was trying out drugs (weed); she put her hands on the wall and remarked how interesting this would be to touch while high. At the time I probably said something like, "right, okay". And now I understand just what everyone else is missing out on.


I spend most days wearing as few clothes as possible (while being normal and modest), and I still pick up odd objects around me. There is a form of autism where people are super-sensitive to touch and any touch upsets or distracts them; one solution is to wear tight-fitting clothing to dull the senses. I completely understand that, and what it means to have a noisy touch canvas.

All I can say is that you have no idea what you are missing out on; before this week, neither did I. But from today I can better understand myself and the people around me.


There is something to be said for various methods of thinking; some people "think the words", and some people don't think in words at all; they think in pictures or concepts. I can't cover that in this post, but keep it in mind as well when considering "the natural language of my brain".


One more exercise (try to play along – it pays off). Can you imagine 3 connected lines: an equilateral triangle on a 2D plane? Rotate it around; good (some people will already be unable to do this). Now draw three more of these. Easy for some. Now line them up so that the three new triangles surround the first one, and fold the shape into a 3D shape.

How many corners?

How many edges?

How many faces?

Okay, good. Now I want you to draw a 2D square. Simple. Now add another 4 triangles. Then, like before, surround the square with the triangles and fold it into a pyramid. Again:

How many edges?

How many corners?

How many faces?


Now I want you to take the previous triangle shape; and attach it to one of the triangles of the square-pyramid shape. Got it?

Now how many corners?

How many edges?

How many faces?


That was easy, right? Maybe not that last step. So it turns out I am not a super visualiser. I know this because super visualisers will find that when they place the triangular pyramid onto the square pyramid, the side faces of the triangular pyramid merge with those of the square pyramid into rhombuses, effectively making 1 face out of 2 triangular faces and removing an edge (and doing that twice over, for two sides of the shape). Those who understand will be going "duh", and those who don't will be going "huh? what happened?"


Pretty cool right?


Don’t believe me?  Don’t worry - there is a good explanation for those who don’t see it right away - at this link 
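The merged-face answer can also be checked mechanically. A small sketch (mine, not from the original exercise): it places the square pyramid and the regular tetrahedron in coordinates chosen so every edge has length 1, verifies that the adjoining triangle pairs really are coplanar, and counts the features of the combined solid.

```python
import numpy as np

s = 1 / np.sqrt(2)  # apex height that makes every edge of the square pyramid length 1

# Square pyramid: unit square base plus apex A.
B1, B2, B3, B4 = map(np.array, [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)])
A = np.array([0.5, 0.5, s])

# Fourth vertex of the regular tetrahedron glued onto the face {A, B2, B3}.
T = np.array([1.5, 0.5, s])

def coplanar(p, q, r, t):
    """True if the four points lie in one plane (zero scalar triple product)."""
    return abs(np.dot(np.cross(q - p, r - p), t - p)) < 1e-9

# The pyramid face {B1, B2, A} and the tetrahedron face {B2, T, A} merge
# into the planar rhombus B1-B2-T-A; likewise on the other side.
assert coplanar(B1, B2, T, A)
assert coplanar(B4, B3, T, A)

# Faces of the merged solid: square base, two rhombi, two triangles.
faces = [
    (B1, B2, B3, B4),  # base
    (B1, B2, T, A),    # merged rhombus
    (B4, B3, T, A),    # merged rhombus
    (B1, B4, A),       # remaining triangle
    (B2, B3, T),       # remaining triangle
]
V, F = 6, len(faces)
E = sum(len(f) for f in faces) // 2  # every edge is shared by exactly two faces
print(V, E, F)         # -> 6 9 5
assert V - E + F == 2  # Euler's formula for convex polyhedra
```

So the combined solid has 6 corners, 9 edges and only 5 faces, which the assertions confirm.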


From a super-visualiser: 

“I would say, for me, visualization is less like having a mental playground, and more like having an entire other pair of eyes.  And there's this empty darkness into which I can insert almost anything.  If it gets too detailed, I might have to stop and close my outer eyes, or I might have to stop moving so I don't walk into anything. That makes it sound like a playground, but there's much more to it than that.


Imagine that you see someone buying something in a shop.  They pay cash, and the red of the twenty catches your eye.  It's pretty, and it's vivid, and it makes you happy.  And if you imagine a camera zooming out, you see red moving from customers to clerks at all the registers.  Not everyone is paying with twenties, but commerce is red, now.  It's like the air flashes and lights up like fireworks, every time somebody buys something.  

And if you keep zooming out, you can see red blurs all over the town, all over the map.  So if you read about international trade, it's almost like the paper comes to life, and some parts of it are highlighted red.  And if you do that for long enough, it becomes a habit, and something really weird starts to happen.  

When someone tells you about their car, there's a little red flash just out the corner of your eye, and you know they probably didn't pay full price, because there's a movie you can watch, and in the time they got the car, they didn't have a job and they were stressed, so there's not as much red in that part of the movie, so there has to be some way they got the car without losing even more red.  But it's not just colors, and it's definitely not just money.  


Happiness might be shimmering motion.  Connection with friends might be almost a blurring together at the center.  And all these amazing visual metaphors that you usually only see in an art gallery are almost literally there in the world, if you look with the other pair of eyes. So sometimes things really do sort of jump out at you, and nobody else noticed them. But it has to start with one thing.  One meaning, one visual metaphor."



Way up top I mentioned the "where" of the visualisation space. It's not really in the eye; a good name for it might be "the mind's eye". My personal visualisation canvas is located up and back to the left, tilted downwards and facing forwards.


Synaesthesia covers a lot of possible effects. The most well known one is where people associate a colour with a letter: when they think of the letter, they have a sense of a colour that goes with it. Some letters don't have colours; sometimes numbers have colours.


There are other branches of synaesthesia. Locating things in the physical space. Days of the week can be laid out in a row in front of you; numbers can be located somewhere. Some can be heavier than others. Sounds can have weights; Smells can have colours; Musical notes can have a taste. Words can feel rough or smooth.


Synaesthesia is a class of cross-classification done by the brain in interpreting a stimulus, which (we think) can be caused by crossed wiring in the brain. It's pretty fun. It turns out most people have some kind of synaesthesia, usually to do with weights of numbers, or days being laid out in a row. Sometimes Tuesdays are lower than the other days. Who knows. If you pay attention to how some things come with an alternative sensory perception, chances are that's a bit of the natural synaesthete coming out.

So what now?

Synaesthesia is supposed to make you smarter. Crossing brain faculties should help you remember things better; if you can think of numbers in terms of how heavy they are, you could probably train your system 1 to do simple arithmetic by "knowing" how heavy the answer is. But if it doesn't come naturally to you, these implementations of the ideas are no longer low-hanging fruit.


What is a low-hanging fruit? Consider all your "canvases" of thinking; work out which ones you care more about, and which ones don't matter. (Insert link to superpowers and kryptonites: use your strong senses to your advantage, and make sure you avoid relying on your weaker senses. Or go on a bender to rebuild your map, influence your territory and train your sensory canvases – but don't, because that wouldn't be a low-hanging fruit.)

Keep this model around

It can be used for both good and evil. But get the model out there. Talk to people about it. Ask your friends and family if they are able to visualise. Ask about all the senses. Imagine if you suddenly discovered that someone you know can't "smell" things in their imagination, or doesn't know what you mean by "feel this" (seriously, you have no idea what you are missing out on; the touch spectrum in my little bubble is rich).

You are going to have good senses and bad ones. That's okay! The more you know; the more you can use it to your advantage!

Meta: Post write up time 1 hour; plus a week of my social life being dominated by the same conversation over and over with different people where I excitedly explain the most exciting thing of this week.  plus 1hr*4, plus 3 people editing and reviewing, plus a rationality dojo where I presented this topic.


Meta2: I waited 3 weeks for other people to review this. There were no substantial changes and I should not have waited so long. In future I won't wait that long.

A toy model of the control problem

16 Stuart_Armstrong 16 September 2015 02:59PM

EDITED based on suggestions for improving the model

Jaan Tallinn has suggested creating a toy model of the control problem, so that it can be analysed without loaded concepts like "autonomy", "consciousness", or "intentionality". Here is a simple (too simple?) attempt:


A controls B. B manipulates A.

Let B be a robot agent that moves in a two dimensional world, as follows:

continue reading »

[link] New essay summarizing some of my latest thoughts on AI safety

14 Kaj_Sotala 01 November 2015 08:07AM

New essay summarizing some of my latest thoughts on AI safety, ~3500 words. I explain why I think that some of the thought experiments that have previously been used to illustrate the dangers of AI are flawed and should be used very cautiously, why I'm less worried about the dangers of AI than I used to be, and what are some of the remaining reasons for why I do continue to be somewhat worried.

Backcover celebrity endorsement: "Thanks, Kaj, for a very nice write-up. It feels good to be discussing actually meaningful issues regarding AI safety. This is a big contrast to discussions I've had in the past with MIRI folks on AI safety, wherein they have generally tried to direct the conversation toward bizarre, pointless irrelevancies like "the values that would be held by a randomly selected mind", or "AIs with superhuman intelligence making retarded judgments" (like tiling the universe with paperclips to make humans happy), and so forth.... Now OTOH, we are actually discussing things of some potential practical meaning ;p ..." -- Ben Goertzel

Proposal for increasing instrumental rationality value of the LessWrong community

14 harcisis 28 October 2015 03:18PM

There have been some concerns here about the value of the LessWrong community from the perspective of instrumental rationality.

In the discussion on that topic I've seen stories about how a community can help from this perspective.

And I think it's a great thing that a local community can help people in various ways to achieve their goals. It's also not the first time I've heard that this kind of community is helpful as a way of achieving personal goals.

Local LessWrong meetups and communities are great, but they have a somewhat different focus. And a lot of people live in places where there is no local community, or where it's not active or regular.

So I propose forming small groups (4-8 people). Initially a group would meet (using whatever means are convenient for that group) and discuss the goals of each participant in the long and short term (life/year/month/etc). They would collectively analyze the strategies proposed for achieving these goals, discuss how short term goals align with long term goals, and determine whether the particular tactics for achieving a stated goal are optimal, and whether there is any way to improve on them.

Afterwards group would meet weekly to:

Set their short term goals and retrospect on the goals set for the previous period: discuss how successfully they were achieved, what problems people encountered, and what alterations to the overall strategy follow. They will also analyze how the newly set short-term goals fit the long term goals.

In this way each member of the group would receive helpful feedback on their goals and on their approach to attaining them. They will also feel accountable, in a way, for the goals they have stated before the group, and this could be an additional boost to productivity.

I also expect the group to be helpful for overcoming various kinds of fallacies and gaining more accurate beliefs about the world, because it's easier for people to spot errors in the beliefs and judgment of others. I hope that groups would be able to develop a friendly environment, so it would be easier for people to learn about their errors and change their minds. Truth springs from argument amongst friends.

The group will reflect on its effectiveness and procedures every month(?) and will incrementally improve itself. Obviously if somebody has a great idea about group proceedings, it makes sense to discuss it after the usual meeting and implement it right away. But I think a regular in-depth retrospective on the group's internal workings is also important.

If there are several groups, they will be able to share insights learned during their operation. (I'm not sure how many such insights would be generated, but maybe it would make sense to publish a post once in a while summing up the groups' collective insights.)

There are some things that I'm not sure about: 


  • I think it would be worth discussing the possibility of shuffling group members (or at least exchanging members in some manner) once in a while, to provide fresh insight on the goals/problems that people are facing and to make the flow of ideas between groups more agile.
  • How should the groups be initially formed? Just random assignment, or is it reasonable to devise some criteria? (Goal alignment/Diversity/Geography/etc?)


I think the initial ground rules of each group should be developed by the group itself, though I guess it's reasonable to discuss some general recommendations.

So what do you think? 

If you are interested, fill out this Google form:

Film about Stanislav Petrov

14 matheist 10 September 2015 06:43PM

I searched around but didn't see any mention of this. There's a film being released next week about Stanislav Petrov, the man who saved the world.

The Man Who Saved the World

Due for limited theatrical release in the USA on 18 September 2015.
Will show in New York, Los Angeles, Detroit, Portland.

Previous discussion of Stanislav Petrov:

Solstice 2015: What Memes May Come? (Part I)

13 Raemon 02 November 2015 05:13PM

Winter is coming, and so is Solstice season. There'll be large rationality-centric-or-adjacent events in NYC, the Bay Area, and Seattle (and possibly other places - if you're interested in running a Solstice event or learning what that involves, send me a PM). In NYC, there'll be a general megameetup throughout the weekend for people who want to stay through Sunday afternoon, and if you're interested in shared housing you can fill out this form.

The NYC Solstice isn't running a kickstarter this year, but I'll need to pay for the venue by November 19th ($6125). So if you are planning on coming it's helpful to purchase tickets sooner rather than later. (Or preorder the next album or 2016 Book of Traditions, if you can't attend but want to support the event).


I've been thinking for the past couple years about the Solstice as a memetic payload.

The Secular Solstice is a (largely Less Wrong inspired) winter holiday, celebrating how humanity faced the darkest season and transformed it into a festival of light. It celebrates science and civilization. It honors the past, revels in the present and promises to carry our torch forward into the future.

For the first 2-3 years, I had a fair amount of influence over the Solstices held in Boston and San Francisco, as well as the one I run in NYC. Even then, the holiday has evolved in ways I didn't quite predict. This has happened both because different communities took them in somewhat different directions, and because (even in the events I run myself) other factors come into play that shape it. Which musicians are available to perform, and how does their stage presence affect the event? Which people from which communities will want to attend, and how will their energy affect things? Which jokes will they laugh at? What will they find poignant?

On top of that, I'm deliberately trying to spread the Solstice to a larger audience. Within a couple years, if I succeed, more of the Solstice will be outside of my control than within it. 

Is it possible to steer a cultural artifact into the future, even after you let go of the reins? How? Would you want to?

In this post, I lay out my current thoughts on this matter. I am interested in feedback, collaboration and criticism.

Lessons from History?

(Epistemic status: I have not really fact checked this. I wouldn't be surprised if the example turned out to be false, but I think it illustrates an interesting point regardless of whether it's true)

Last year after Solstice, I was speaking with a rationalist friend with a Jewish background. He made an observation. I lack the historical background to know if it is exactly accurate (feel free to weigh in in the comments), but his notion was as follows:

Judaism has influenced the world in various direct ways. But a huge portion of its influence (perhaps the majority) has been indirectly through Christianity. Christianity began with a few ideas it took from Judaism that were relatively rare. Monotheism is one example. The notion that you can turn to the Bible for historical and theological truth is another.

But buried in that second point is something perhaps more important: religious truth is not found in the words of your tribal leaders and priests. It's found in a book. The book contains the facts-of-the-matter. And while you can argue cleverly about the book's contents, you can't disregard it entirely.

Empiricists may get extremely frustrated with creationists, for refusing to look outside their book for answers (instead of the natural world). But there was a point where the fact of the matter lay entirely in "what the priests/ruler said" as opposed to "what the book said". 

In this view, Judaism's primary memetic success is in helping to seed the idea of scholarship, and a culture of argument and discussion.

I suspect this story is simplified, but these two points seem meaningful: a memeplex's greatest impact may be indirect, and may not have much to do with the attributes that are most salient on first glance to a layman.



So far, I've deliberately encouraged people to experiment with the Solstice. Real rituals evolve in the wild, and adapt to the needs of their community. And a major risk of ritual is that it becomes ossified, turning either hollow or dangerous. But if a ritual is designed to be mutable, what gives it its identity? What separates a Secular Solstice from a generic humanist winter holiday?

The simplest, most salient and most fun aspects of a ritual will probably spread the fastest and farthest. If I had to sum up the Solstice in nine words, they would be:

Light. Darkness. Light.
Past. Present. Future.
Humanity. Science. Civilization.

I suspect that without any special effort on my part (assuming I keep promoting the event but don't put special effort into steering its direction), those 9 pieces would remain a focus of the event, even if groups I never talk to adopt it for themselves.

The most iconic image of the Solstice is the Candlelit Story. At the apex of the event, when all lights but a single candle have been extinguished, somebody tells a story that feels personal, visceral. It reminds us that this world can be unfair, but that we are not alone, and we have each other. And then the candle is blown out, and we stand in the absolute darkness together.

If any piece of the Solstice survives, it'll be that moment.

If that were all that survived, I think that'd be valuable. But it'd also be leaving 90%+ of the potential value of the Solstice on the table.

Complex Value

There are several pieces of the Solstice that are subtle and important. There are also pieces of it that currently exist that should probably be tapered down, or adjusted to become more useful. Each of them warrants a fairly comprehensive post of its own. A rough overview of topics to explore:

Existential Risk.
The Here and Now.
The Distant Future.

My thoughts about each of these are fairly complex. In the coming weeks I'll dive into each of them. The next post, discussing Atheism, Rationality and Death, is here.

Experiment: Changing minds vs. preaching to the choir

13 cleonid 03 October 2015 11:27AM


1. Problem

In the market economy production is driven by monetary incentives – higher reward for an economic activity makes more people willing to engage in it. Internet forums follow the same principle but with a different currency - instead of money the main incentive of internet commenters is the reaction of their audience. A strong reaction expressed by a large number of replies or “likes” encourages commenters to increase their output. Its absence motivates them to quit posting or change their writing style.

On neutral topics, using audience reaction as an incentive works reasonably well: attention focuses on the most interesting or entertaining comments. However, on partisan issues, such incentives become counterproductive. Political forums and newspaper comment sections demonstrate the same patterns:

  • The easiest way to maximize “likes” for a given amount of effort is by posting an emotionally charged comment which appeals to audience’s biases (“preaching to the choir”).


  • The easiest way to maximize the number of replies is by posting a low quality comment that goes against audience’s biases (“trolling”).


  • Both effects are amplified when the website places comments with most replies or “likes” at the top of the page.


The problem is not restricted to low-brow political forums. The following graph, which shows the average number of comments as a function of an article's karma, was generated from LessWrong data.


The data suggests that the easiest way to maximize the number of replies is to write posts that are disliked by most readers. For instance, articles with a karma of -1 on average generate twice as many comments (20.1±3.4) as articles with a karma of +1 (9.3±0.8).
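A minimal sketch (with made-up numbers, not the actual LessWrong dump) of how per-karma averages and standard errors like those above would be computed:

```python
import math
from collections import defaultdict

def karma_comment_stats(articles):
    """articles: (karma, n_comments) pairs -> {karma: (mean, std_error)}."""
    by_karma = defaultdict(list)
    for karma, n_comments in articles:
        by_karma[karma].append(n_comments)
    stats = {}
    for karma, counts in by_karma.items():
        mean = sum(counts) / len(counts)
        # sample standard deviation divided by sqrt(n) gives the standard error
        var = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
        stats[karma] = (mean, math.sqrt(var / len(counts)))
    return stats

# Hypothetical articles: (karma, number of comments)
sample = [(-1, 18), (-1, 25), (-1, 17), (1, 9), (1, 10), (1, 8)]
for karma, (mean, sem) in sorted(karma_comment_stats(sample).items()):
    print(f"karma {karma:+d}: {mean:.1f} ± {sem:.1f} comments")
```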

2. Technical Solution

Enabling constructive discussion between people with different ideologies requires reversing the incentives – people need to be motivated to write posts that sound persuasive to the opposite side rather than to their own supporters.

We suggest addressing this problem by changing the voting system. In brief, instead of votes from all readers, comment ratings and position on the page should be based on votes from the opposite side only. For example, in the debate on minimum wage, for arguments against the minimum wage only the upvotes of minimum wage supporters would be counted, and vice versa.

The new voting system can simultaneously achieve several objectives:

  • eliminate incentives for preaching to the choir

  • give posters more objective feedback on the impact of their contributions, helping them improve their writing style

  • focus readers’ attention on comments most likely to change their minds instead of inciting comments that provoke an irrational defensive reaction.
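The scoring rule itself is simple. A sketch (the names are illustrative, not from the post) in which a comment's score counts only the votes cast by readers on the opposite side of the issue:

```python
def cross_side_score(comment_side, votes):
    """Score a comment using only opposite-side votes.

    comment_side: which side the comment argues for, e.g. "pro" or "con".
    votes: list of (voter_side, value) pairs, where value is +1 or -1.
    """
    return sum(value for voter_side, value in votes
               if voter_side != comment_side)

# For a "pro" comment, the single "pro" upvote is ignored;
# only the three "con" votes (+1, -1, +1) are counted.
votes = [("pro", +1), ("con", +1), ("con", -1), ("con", +1)]
print(cross_side_score("pro", votes))  # -> 1
```

Sorting comments by this score would place at the top whatever most persuaded the opposing camp, which is exactly the incentive reversal the proposal is after.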

3. Testing

If you are interested in measuring and improving your persuasive skills and would like to help others to do the same, you are invited to take part in the following experiment:


Step I. Submit Pro or Con arguments on any of the following topics (up to 3 arguments in total):

     Should the government give all parents vouchers for private school tuition?

     Should developed countries increase the number of immigrants they receive?

     Should there be a government mandated minimum wage?


Step II. For each argument you have submitted, rate 15 arguments submitted by others.


Step III.  Participants will be emailed the results of the experiment including:

  • the ratings their arguments received from different reviewer groups (supporters, opponents and neutrals)

  • the list of the most persuasive Pro & Con arguments on each topic (i.e. the arguments that received the highest ratings from the opposing and neutral groups)

  • the rating distribution in each group


Step IV (optional). If interested, sign up for the next round.


The experiment will help us test the effectiveness of the new voting system and develop the best format for its application.





Notes on Actually Trying

13 AspiringRationalist 23 September 2015 02:53AM

These ideas came out of a recent discussion on actually trying at Citadel, Boston's Less Wrong house.

What does "Actually Trying" mean?

Actually Trying means applying the combination of effort and optimization power needed to accomplish a difficult but feasible goal. The effort and optimization power are both necessary.

Failure Modes that can Resemble Actually Trying

Pretending to try

Pretending to try means doing things that superficially resemble actually trying but are missing a key piece. You could, for example, make a plan related to your goal and diligently carry it out but never stop to notice that the plan was optimized for convenience or sounding good or gaming a measurement rather than achieving the goal. Alternatively, you could have a truly great plan and put effort into carrying it out until it gets difficult.

Trying to Try

Trying to try is when you throw a lot of time and perhaps mental anguish at a task but not actually do the task. Writer's block is the classic example of this.


Sphexing

Sphexing is the act of carrying out a plan or behavior repeatedly despite it not working.

The Two Modes Model of Actually Trying

Actually Trying requires a combination of optimization power and effort, but each of those is done with a very different way of thinking, so it's helpful to do the two separately. In the first way of thinking, Optimizing Mode, you think hard about the problem you are trying to solve, develop a plan, look carefully at whether it's actually well-suited to solving the problem (as opposed to pretending to try) and perhaps Murphy-jitsu it. In Executing Mode, you carry out the plan.

Executing Mode breaks down when you reach an obstacle that you either don't know how to overcome or where the solution is something you don't want to do. In my personal experience, this is where things tend to get derailed. There are a few ways to respond to this situation:

  • Return to Optimizing Mode to figure out how to overcome the obstacle / improve your plan (good),
  • Ask for help / consult a relevant expert (good),
  • Take a break, which could lead to a eureka moment, lead to Optimizing Mode or lead to derailing (ok),
  • Sphex (bad),
  • Derail / procrastinate (bad), or
  • Punt / give up (ok if the obstacle is insurmountable).

The key is to respond constructively to obstacles. This usually means getting back to Optimizing Mode, either directly or after a break.  The failure modes here are derailing immediately, a "break" that turns into a derailment, and sphexing.  In our discussion, we shared a few techniques we had used to get back to Optimizing Mode.  These techniques tended to focus on some combination of removing the temptation to derail, providing a reminder to optimize, and changing mental state.

Getting Back to Optimizing Mode

Context switches are often helpful here.  Because for many people, work and procrastination both tend to be computer-based activities, it is both easy and tempting to switch to a time-wasting activity immediately upon hitting an obstacle.  Stepping away from the computer takes away the immediate distraction and depending on what you do away from the computer, helps you either think about the problem or change your mental state.  Depending on what sort of mood I'm in, I sometimes step away from the computer with a pen and paper to write down my thoughts (thinking about the problem), or I may step away to replenish my supply of water and/or caffeine (changing my mental state).  Other people in the discussion said they found going for a walk or getting more strenuous exercise to be helpful when they needed a break.  Strenuous exercise has the additional advantage of having very low risk of turning into a longer-than-intended break.

The danger with breaks is that they can turn into derailment.  Open-ended breaks ("I'll just browse Reddit for five minutes") have a tendency to expand, so it's best to avoid them in favor of things with more definite endings.  The other common way for breaks to turn into derailment is to return from a break and go do something non-productive.  I have had some success with attaching a sticky note to my monitor reminding me what to do when I return to my computer.  I have also found that a note making clear what problem I need to solve makes me less likely to sphex when I return to my computer.

In the week or so since the discussion that inspired this post, I have found it useful to ask myself "what would Actually Trying look like right now?" This has helped me stay on track when I have encountered difficult problems at work.

making notes - an instrumental rationality process.

13 Elo 05 September 2015 10:51PM

The value of having notes. Why do I make notes.


Story time!

At one point in my life I had a memory crash. Which is to say once upon a time I could remember a whole lot more than I was presently remembering. I recall thinking, "what did I have for breakfast last Monday? Oh no! Why can't I remember!". I was terrified. It took a while but eventually I realised that remembering what I had for breakfast last Monday was:

  1. not crucial to the rest of my life

  2. not crucial to being a functional human being

  3. I was not sure if I usually remembered what I ate last Monday, or if this was simply the first time I had tried to recall it stubbornly enough to notice that I had no idea.

After surviving my first teen-life crisis I went on to realise a few things about life and about memory:

  1. I will not be remembering everything forever.

  2. Sometimes I forget things that I said I would do. Especially when the number of things I think I will do increases past 2-3 and upwards to 20-30.

  3. Don't worry! There is a solution!

  4. As someone in my mid-20s who is already forgetting things, I was told by a friendly mid-30-year-old that in 10 years I will have a third more life to be trying to remember as well. This should also serve as a really good reason why you should always comment your code as you go, and why you should definitely write notes. "Past me thought future me knew exactly what I meant, even though past me actually had no idea what they were going on about".

The foundation of science.


There are many things that could be considered the foundations of science. I believe that one of the earliest foundations you can possibly engage in is observation.


In a more-than-goldfish form, observation means holding information. It means keeping things for review later in your life, whether at the end of this week, this month or this year. Observation is only the start. Writing it down makes it evidence. Biased, personal, scrawled, (bad) evidence, but evidence all the same. If you want to be more effective at changing your mind, you need to know what your mind says.


It's great to make notes. That's exactly what I am saying. It goes further though. Take notes and then review them. Weekly; monthly; yearly. Unsure about where you are going? Know where you have come from. With that you can move forward with better purpose.

My note taking process:

1. get a notebook.

This picture includes some types of notebooks that I have tried.

  1. A4 lined paper, cardboard front and back. It became difficult to carry because it was big, and hard to open up and use as well. Side binding is also something I didn't like, because I am left-handed and it seemed to get in my way.

  2. A bad photo, but it's a pad of grid paper. I found a stack of these in the middle of the ground late at night, as if they had fallen off a truck. I really liked them, except that they were stuck together by essentially nothing and fell to pieces by the time I got to the bottom of the pad.

  3. lined note paper. I will never go back to a book that doesn't hold together; the risk of losing paper is terrible. I don't mind occasionally ripping out some paper, but losing a page when I didn't want to has never worked out for me.

  4. Top spiral-bound, 100 pages. This did not have enough pages; I bought it after a 200-pager ran out of paper and I needed a quick replacement. Well, it was quick – I used it up in half the time the last book lasted.

  5. Top spiral-bound 200-page notepad with a plastic cover; these are the type of book I currently use. Number 8 is the book I am writing in right now.

  6. 300 pages top spiral bound – as you can see by the tape – it started falling apart by the time I got to the end of it.

  7. small notebook. I got these because they were 48c each, but they never worked for me. I would bend them, forget them, leave them in the wrong places, and generally not have them around when I wanted them.

  8. I am about halfway through my current book; the first page says 23/7/15, and today it is 1/9/15. Estimate a book every two months, although it really depends on how you use it.

  9. a future book I will try. It holds a pen, so I will probably find that useful.

  10. also a future one, I expect it to be too small to be useful for me.

  11. A gift from a more organised person than I. It is a Moleskine grid-paper book and I plan to try it soon too.

The important take-away from this is: try several; they might work in different ways and for different reasons. Has your life changed substantially, i.e. you don't sit much at a desk any more? Is the book not working? Maybe another type of book would work better.

I only write on the bottom of the flip-page, and occasionally scrawl diagrams on the other side of the page – but only when they're relevant. This way I can always flip through easily, and not worry about the other side of the paper.


2. carry a notebook. Everywhere. Find a way to make it a habit. Don't carry a bag? You could. Then you can carry your notepad everywhere with you in a bag. Consider a pocket-sized book as a solution to not wanting to carry a bag.

3. when you stop moving, turn the notebook to the correct page and write the date.

Writing the date is almost entirely useless. I really never care what the date is. I sometimes care that when I look back over the book I can see the timeline around which the events happened, but really – the date means nothing to me.

What writing the date helps to do:

  • make sure you have a writing implement

  • make sure it works

  • make sure you are on the right page

  • make sure you can see the pad

  • make sure you can write in this position

  • make you start a page

  • make you consider writing more things

  • make it look to others like you know what you are doing (signalling that you are a note-taker, is super important to help people get used to you as a note-taker and encourage that persona onto you)

This is the reason why I write the date; I can't stress enough that I don't care what the date is, but I do it anyway.

4. Other things I write:

  • Names of people I meet. Congratulations; you are one step closer to never forgetting the name of anyone ever. Also, when you want to think "When did I last see Bob?", you can kinda look it up in a dumb, date-sorted list. (To be covered in my post about names – but it's a lot easier to look it up 5 minutes later when you have it written down.)

  • Where I am/What event I am at. (nice to know what you go to sometimes)

  • What time I got here or what time it started (if it's a meeting)

  • What time it ended (or what time I stopped writing things)

It's at this point that the rest of the things you write become personal choices; some of mine are:

  • Interesting thoughts I have had

  • Interesting quotes people say

  • Action points that I want to do if I can't do them immediately.

  • Shopping lists

  • diagrams of what you are trying to say.

  • Graphs you see.

  • the general topic of conversation as it changes. (so far this is enough for me to remember the entire conversation and who was there and what they had to say about the matter)


That's right. I said it. It's sexy. There are occasional discussion events near to where I live that I go to with a notepad. Am I better than the average dude who shows up to chat? No. But everyone knows me: the guy who takes notes. And damn, they know I know what I am talking about. And damn, they all wish they were me. You know how glasses became a geek-culture signal? Well, this is too. Like no other. Want to signal being a sharp human who knows what's going down? Carry a notebook, and show it off to people.

The coordinators have said to me; "It makes me so happy to see someone taking notes, it really makes me feel like I am saying something useful". The least I can do is take notes.


Other notes about notebooks

The number of brilliant people I know who carry a book of some kind far outweighs the number of people who don't. I don't usually trust the common opinion; but sometimes you just gotta go with what's right.

If it stops working; at least you tried it. If it works; you have evidence and can change the world in the future.

"I write in my phone" (which sounds a lot like "I could write notes in my phone") – I hear this a lot, especially in person while I am writing notes. Indeed you do. Which is why I am the one with a notebook out, and at the end of talking to you I will actually have notes and you will not. If you are genuinely the kind of person with notes in their phone, I commend you for doing something with technology that I cannot seem to sort out; but if you are like me, and a lot of other people who could always say they could take notes in their phone but never do, or never look at those notes... it's time to fix this.

A quote from a friend: “I realized in my mid twenties that I would look like a complete badass in a decade, if I could point people to a shelf of my notebooks.” And I love this too.

A friend has suggested that flashcards suit his brain and notepads do not. I agree that flashcards have benefits, namely to do with organising things, shuffling, etc. It really depends on what notes you are taking. I quite like having a default chronology to things, but that might not work for you.

In our local Rationality Dojos we give away notebooks. For the marginal cost of a book of paper, we are making people’s lives better.

The big take away

Get a notebook; make notes; add value to your life.




This post took 3 hours to write over a week

Please add your experiences if you work differently surrounding note taking.

Please fill out the survey to say whether you found this post helpful.

Post-doctoral Fellowships at METRICS

12 Anders_H 12 November 2015 07:13PM
The Meta-Research Innovation Center at Stanford (METRICS) is hiring post-docs for 2016/2017. The full announcement is available at Feel free to contact me with any questions; I am currently a post-doc in this position.

METRICS is a research center within Stanford Medical School. It was set up to study the conditions under which the scientific process can be expected to generate accurate beliefs, for instance about the validity of evidence for the effect of interventions.

METRICS was founded by Stanford Professors Steve Goodman and John Ioannidis in 2014, after Givewell connected them with the Laura and John Arnold Foundation, who provided the initial funding. See for more details.

Yudkowsky, Thiel, de Grey, Vassar panel on changing the world

12 NancyLebovitz 01 September 2015 03:57PM

30 minute panel

The first question was why isn't everyone trying to change the world, with the underlying assumption that everyone should be. However, it isn't obviously the case that the world would be better if everyone were trying to change it. For one thing, trying to change the world mostly means trying to change other people. If everyone were trying to do it, this would be a huge drain on everyone's attention. In addition, some people are sufficiently mean and/or stupid that their efforts to change the world make things worse.

At the same time, some efforts to change the world are good, or at least plausible. Is there any way to improve the filter so that we get more ambition from benign people without just saying everyone should try to change the world, even if they're Osama bin Laden?

The discussion of why there's too much duplicated effort in science didn't bring up the problem of funding, which is probably another version of the problem of people not doing enough independent thinking.

There was some discussion of people getting too hooked on competition, which is a way of getting a lot of people pointed at the same goal. 

Link thanks to Clarity

A Map of Currently Available Life Extension Methods

11 turchin 17 October 2015 12:10AM

Extremely large payoff from life extension

We live in a special period of time when radical life extension is not far off. We just need to survive until the moment when all the necessary technologies have been created.

The positive scenario suggests it could happen by 2050 (plus or minus 20 years), when humanity creates an advanced and powerful AI, highly developed nanotechnologies and a cure for aging.

Many young people could reach the year 2050 without even doing anything special.  

But for many other people an opportunity to extend their life by just 10-20 years is the key to achieving radical life extension (for at least a thousand years, perhaps even more), because they will be able to survive until the creation of strong life extension technologies.

That is why even a slight life extension today means a potentially eternal prize. This map of the currently available life extension methods could help with that. The map contains a description of the initial stage of plan A from the “Personal Immortality Roadmap” (where plan B is cryonics, plan C – digital immortality and plan D – quantum immortality).

Brain is most important for life extension

The main idea of this map is that all efforts towards life extension must start from our brain, and in fact, they must finish there too.

First of all, you must have the will to conquer aging and death, and do it using scientific methods.

This is probably the most difficult part of the life extension journey. The vast majority of people simply don't think about life extension, while those who do care about it (usually when it's too late) use weak and non-scientific methods; they simply don't understand that the prize of this game is not ten extra healthy years, but almost eternal life.

Secondly, you need to develop or mobilize the qualities inside yourself which are necessary for simple, daily procedures, which can almost guarantee life extension by an average of 10-20 years. e.g. avoiding smoking and alcohol consumption, daily mobility, daily intake of medicines and dietary supplements.

Most people find it incredibly difficult to perform simple actions on a permanent basis, for example even taking one pill every day for a year would be too much for most people. Not to mention quitting smoking or regular health check-ups. 

A human who has the motivation to extend his life, a proper understanding of how to achieve it and the necessary skills to realize his plans, should be considered as almost a superman. 

On the other hand, while all of our body systems are affected by aging, damage to our brain during aging plays the biggest role in the total reduction of productivity. Even though our crystallized intelligence increases with age, our fluid intelligence, our memory, and the ability to make radical changes and acquire new skills all decrease significantly with aging.

And these abilities decrease at the very time when they are needed most – to fight the aging process! Young people usually don't care too much about the aging process, because it's beyond their planning horizon. These qualities are vital in order to build the motivation and skills required to maintain health. 

Thus, this leads to the idea of the map, which says that all main efforts to combat aging must be focused on brain aging. If you can keep your brain youthful, it will create and implement new skills to extend your life, helping you to find new information in a sea of new publications and technologies.

If Alzheimer's is the first sign of aging to reach your body, you will have to crawl for a tablet of validol without even knowing that it is harmful. And even worse, you will crystallize some harmful beliefs. A person can think that he is a genius in some fields, receive approval from others, but continue his journey in the wrong direction – in the direction of death. (Of course early detection of cancer and a healthy heart are really important to extend your life, but it will be too difficult to deal with such problems if your brain is not working properly.)

The second reason to invest in brain health and regeneration is a direct connection of its state with the state of many other systems in your body through nervous and hormonal connections. 

In order to preserve your brain health we have to use antidepressants, nootropics and substances which promote its regeneration.

The example of Rita Levi-Montalcini is incredibly interesting. She administered a nerve growth factor (NGF) as eye drops and lived for 101 years while her twin sister died when she was 91. (Bearing in mind the average life duration difference of twins is six years, we can conclude that she gained about four years.)

Thus, providing that we understand the priority of tasks, life extension now can be reached through three fine-spun blocks: a lifestyle, a medication and the prevention of aging itself.

Collective efforts in life extension

This map doesn't include one really important social aspect of aging prevention. If we could pool all the money which people spend on supplements (around 300 billion per year) through crowdfunding, and use it to perform experiments in the field of life extension instead, we could invent new anti-aging medicines and other life extension tools. These methods and medicines could be used by those who initially donated money for such experiments; they could also benefit from sales of such products. Thus, such crowdfunding would include an IPO element too.

You won't find other social aspects in the map such as promotion of the idea of the fight against aging, political activism and art. All of these aspects are mentioned in the main Immortality Roadmap.

The map also doesn't include a temporal aspect. Our knowledge about the best methods of life extension changes almost daily. This map contains ideas which are valid in 2015, but it will require a significant update in just five years. If you aim to extend your life you must constantly analyse the scientific research in this area. Currently many new methods are appearing every day, e.g. ways of lengthening telomeres and gene therapy. Additionally, the older you are, the riskier the new methods you should be willing to try.

The map of ideas

In fact, the map contains a systemized analysis of ideas, which can lead to life extension, but not a bunch of well-proven tips. In an ideal situation such a map should contain links to research about all the listed items, as well as an evaluation of their real effects, so any help on improving the map will be welcomed.

This map (like all my other maps) is intended to help you navigate through the world of ideas. In this case it includes life extension ideas.

Moreover, one single idea may become a salvation for a person, e.g. eradicating a certain chronic disease. Of course, no single person can complete all of the ideas and suggestions in this map or indeed in any other list. I'm pretty sure that people will not be able to implement more than one piece of advice per month – and I'm no exception.

My approach: I drink alcohol on really rare occasions, I don't smoke (but sometimes I use nicotine wrapping with nootropic objectives), I sleep a lot, I try to walk at least 4 km every day, I avoid risky activities and I always fasten my seatbelt.

I also invest a lot of effort in preventing my brain from aging and in combating depression. (I will provide you with a map about depression and nootropics later).

The pdf of the map is here, and jpg is below.


Previous posts with maps:

Simulation map

Digital Immortality Map

Doomsday Argument Map

AGI Safety Solutions Map

A map: AI failures modes and levels

A Roadmap: How to Survive the End of the Universe

A map: Typology of human extinction risks

Roadmap: Plan of Action to Prevent Human Extinction Risks

Immortality Roadmap


Future planned maps:

Brute force AIXI-style attack on Identity problem

Ways of mind-improvement

Fermi paradox map

Ways of depression prevention map

Quantum immortality map

Interpretations of quantum mechanics map

Map of cognitive biases in global risks research

Map of double catastrophes scenarios in global risks

Probability of global catastrophe

Map of unknown unknowns as global risks

Map of reality theories, qualia and God

Map of death levels

Map of resurrections technologies

Map of aging theories

Flowchart «How to build a map»

Map of ideas about artificial explosions in space

Future as Markov chain


EDIT: due to temporary hosting error, check the map here:

Examples of growth mindset or practice in fiction

11 Swimmer963 28 September 2015 09:47PM

For people who care about rationality and winning, it's pretty important to care about training. Repeated practice is how humans acquire skills, and skills are what we use for winning.

Unfortunately, it's sometimes hard to get System 1 fully on board with the fact that repeated, difficult, sometimes tedious practice is how we become awesome. I find fiction to be one of the most useful ways of communicating things like this to my S1. It would be great to have a repository of fiction that shows characters practicing skills, mastering them, and becoming awesome, to help this really sink in.

However, in fiction the following tropes are a lot more common:

  1. hero is born to greatness and only needs to discover that greatness to win [I don't think I actually need to give examples of this?]
  2. like (1), only the author talks about the skill development or the work in passing… but in a way that leaves the reader's attention (and system 1 reinforcement?) on the "already be awesome" part, rather that the "practice to become awesome" part [HPMOR; the Dresden Files, where most of the implied practice takes place between books.]
  3. training montage, where again the reader's attention isn't on the training long enough to reinforce the "practice to become awesome" part, but skips to the "wouldn't it be great to already be awesome" part [TVtropes examples].
  4. The hero starts out ineffectual and becomes great over the course of the book, but this comes from personal revelations and insights, rather than sitting down and practicing [Nice Dragons Finish Last is an example of this].

Example of exactly the wrong thing:
The Hunger Games - Katniss is explicitly up against the Careers, who have trained their whole lives for this one thing, but she has … something special that causes her to win. Also, archery is her greatest skill, and she's already awesome at it from the beginning of the story and never spends time practicing.

Close-but-not-perfect examples of the right thing:
The Pillars of the Earth - Jack pretty explicitly has to travel around Europe to acquire the skills he needs to become great. Much of the practice is off-screen, but it's at least a pretty significant part of the journey.
The Honor Harrington series: the books depict Honor, as well as the people around her, rising through the ranks of the military and gradually levelling up, with emphasis on dedication to training, and that training is often depicted onscreen – but the skills she's training in herself and her subordinates aren't nearly as relevant as the "tactical genius" that she seems to have been born with.

I'd like to put out a request for fiction that has this quality. I'll also take examples of fiction that fails badly at this quality, to add to the list of examples, or of TVTropes keywords that would be useful to mine. Internet hivemind, help?

[LINK] 23andme now approved by the FDA to deliver health reports

10 username2 22 October 2015 01:33AM

Looks like they were finally able to work out something with the FDA, and are back up and running. On the one hand, I'm very excited about the return of personalized genetic testing, but on the other hand I'm disappointed that their price doubled to $199. I was going to get kits for my 4-member family for Christmas, but that won't be feasible now.

Another interesting release from 23andme that came out at the same time is their transparency report, which shows how many requests from law enforcement they have received for customer DNA access, and what percentage they have complied with.

The trouble with Bayes (draft)

10 snarles 19 October 2015 08:50PM


This post requires some knowledge of Bayesian and Frequentist statistics, as well as probability. It is intended to explain one of the more advanced concepts in statistical theory--Bayesian non-consistency--to non-statisticians, and although the level required is much less than would be required to read some of the original papers on the topic[1], some considerable background is still required.

The Bayesian dream

Bayesian methods are enjoying a well-deserved growth of popularity in the sciences. However, most practitioners of Bayesian inference, including most statisticians, see it as a practical tool. Bayesian inference has many desirable properties for a data analysis procedure: it allows for intuitive treatment of complex statistical models, which include models with non-iid data, random effects, high-dimensional regularization, covariance estimation, outliers, and missing data. Problems which have been the subject of Ph. D. theses and entire careers in the Frequentist school, such as mixture models and the many-armed bandit problem, can be satisfactorily handled by introductory-level Bayesian statistics.

A more extreme point of view, the flavor of subjective Bayes best exemplified by Jaynes' famous book [2], and also by a sizable contingent of philosophers of science, elevates Bayesian reasoning to the methodology for probabilistic reasoning, in every domain, for every problem. One merely needs to encode one's beliefs as a prior distribution, and Bayesian inference will yield the optimal decision or inference.

To a philosophical Bayesian, the epistemological grounding of most statistics (including "pragmatic Bayes") is abysmal. The practice of data analysis is either dictated by arbitrary tradition and protocol on the one hand, or consists of users creatively employing a diverse "toolbox" of methods justified by a diverse mixture of incompatible theoretical principles like the minimax principle, invariance, asymptotics, maximum likelihood or *gasp* "Bayesian optimality." The result: a million possible methods exist for any given problem, and a million interpretations exist for any data set, all depending on how one frames the problem. Given one million different interpretations for the data, which one should *you* believe?

Why the ambiguity? Take the textbook problem of determining whether a coin is fair or weighted, based on the data obtained from, say, flipping it 10 times. Keep in mind, a principled approach to statistics decides the rule for decision-making before you see the data. So, what rule would you use for your decision? One rule is, "declare it's weighted, if either 10/10 flips are heads or 0/10 flips are heads." Another rule is, "always declare it to be weighted." Or, "always declare it to be fair." All in all, there are 11 possible outcomes (supposing we only care about the total, which ranges from 0 to 10 heads) and therefore there are 2^11 possible decision rules. We can probably rule out most of them as nonsensical, like, "declare it to be weighted if 5/10 are heads, and fair otherwise" since 5/10 seems like the fairest outcome possible. But among the remaining possibilities, there is no obvious way to choose the "best" rule. After all, the performance of the rule, defined as the probability you will make the correct conclusion from the data, depends on the unknown state of the world, i.e. the true probability of flipping heads for that particular coin.

The Bayesian approach "cuts" the Gordian knot of choosing the best rule, by assuming a prior distribution over the unknown state of the world. Under this prior distribution, one can compute the average performance of any decision rule, and choose the best one. For example, suppose your prior is that with probability 99.9999%, the coin is fair. Then the best decision rule would be to "always declare it to be fair!"
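The averaging described here can be made concrete in a short sketch. The 99.9999% prior on a fair coin comes from the text; the bias of the weighted alternative (p = 0.9) is an assumption added purely for illustration:

```python
from math import comb

N = 10  # number of coin flips; the total of heads ranges over 0..10
outcomes = range(N + 1)

def binom_pmf(k, n, p):
    # probability of exactly k heads in n flips with heads-probability p
    return comb(n, k) * p**k * (1 - p)**(n - k)

# A decision rule maps each possible total to a verdict.
rule_extremes = {k: ("weighted" if k in (0, N) else "fair") for k in outcomes}
rule_always_fair = {k: "fair" for k in outcomes}

# Prior over the unknown state of the world: 99.9999% fair (p = 0.5);
# the weighted coin's bias p = 0.9 is an illustrative assumption.
prior = {("fair", 0.5): 0.999999, ("weighted", 0.9): 0.000001}

def expected_accuracy(rule):
    # Average probability of a correct verdict under the prior.
    total = 0.0
    for (truth, p), weight in prior.items():
        p_correct = sum(binom_pmf(k, N, p) for k in outcomes if rule[k] == truth)
        total += weight * p_correct
    return total

print(expected_accuracy(rule_always_fair))  # 0.999999
print(expected_accuracy(rule_extremes))     # slightly lower, about 0.998
```

Under this particular prior, "always declare it to be fair" beats the rule that declares the coin weighted only on 0/10 or 10/10 heads, matching the claim above; a less lopsided prior would rank the rules differently.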

The Bayesian approach gives you the optimal decision rule for the problem, as soon as you come up with a model for the data and a prior for your model. But when you are looking at data analysis problems in the real world (as opposed to a probability textbook), the choice of model is rarely unambiguous. Hence, for me, the standard Bayesian approach does not go far enough--if there are a million models you could choose from, you still get a million different conclusions as a Bayesian.

Hence, one could argue that a "pragmatic" Bayesian who thinks up a new model for every problem is just as epistemologically suspect as any Frequentist. Only in the strongest form of subjective Bayesianism can one escape this ambiguity. The subjective Bayesian dream is to start out in life with a single model. A single prior. For the entire world. This "world prior" would contain the entirety of one's own life experience, and the grand total of human knowledge. Surely, writing out this prior is impossible. But the point is that a true Bayesian must behave (at least approximately) as if they were driven by such a universal prior. In principle, having such a universal prior (at least conceptually) solves the problem of choosing models and priors for problems: the priors and models you choose for particular problems are determined by the posterior of your universal prior. For example, why did you decide on a linear model for your economics data? It's because according to your universal posterior, your particular economic data is well described by such a model with high probability.

The main practical consequence of the universal prior is that your inferences in one problem should be consistent with your inferences in another, related problem. Even if the subjective Bayesian never writes out a "grand model", their integrated approach to data analysis for related problems still distinguishes their approach from the piecemeal approach of frequentists, who tend to treat each data analysis problem as if it occurs in an isolated universe. (So I claim, though I cannot point to any real example of such a subjective Bayesian.)

Yet, even if the subjective Bayesian ideal could be realized, many philosophers of science (e.g. Deborah Mayo) would consider it just as ambiguous as non-Bayesian approaches, since even if you have an unambiguous procedure for forming personal priors, your priors are still going to differ from mine. I don't consider this a defect, since my worldview necessarily does differ from yours. My ultimate goal is to make the best decision for myself. That said, such egocentrism, even if rationally motivated, may indeed be poorly suited for a collaborative enterprise like science.

For me, the far more troublesome objection to the "Bayesian dream" is the question, "How would you actually go about constructing this prior that represents all of your beliefs?" Looking in the Bayesian literature, one does not find any convincing examples of any user of Bayesian inference managing to actually encode all (or even a tiny portion) of their beliefs in the form of the prior--in fact, for the most part, we see alarmingly little thought or justification being put into the construction of the priors.

Nevertheless, I myself remained one of these "hardcore Bayesians", at least from a philosophical point of view, ever since I started learning about statistics. My faith in the "Bayesian dream" persisted even after spending three years in the Ph. D. program in Stanford (a department with a heavy bias towards Frequentism) and even after I personally started doing research in frequentist methods. (I see frequentist inference as a poor man's approximation for the ideal Bayesian inference.) Though I was aware of the Bayesian non-consistency results, I largely dismissed them as mathematical pathologies. And while we were still a long way from achieving universal inference, I held the optimistic view that improved technology and theory might one day finally make the "Bayesian dream" achievable. However, I could not find a way to ignore one particular example on Wasserman's blog[3], due to its relevance to very practical problems in causal inference. Eventually I thought of an even simpler counterexample, which devastated my faith in the possibility of constructing a universal prior. Perhaps a fellow Bayesian can find a solution to this quagmire, but I am not holding my breath.

The root of the problem is the extreme degree of ignorance we have about our world, the degree of surprisingness of many true scientific discoveries, and the relative ease with which we accept these surprises. If we consider this behavior rational (which I do), then the subjective Bayesian is obligated to construct a prior which captures this behavior. Yet, the diversity of possible surprises the model must be able to accommodate makes it practically impossible (if not mathematically impossible) to construct such a prior. The alternative is to reject all possibility of surprise, and refuse to update any faster than a universal prior would (extremely slowly), which strikes me as a rather poor epistemological policy.

In the rest of the post, I'll motivate my example, sketch out a few mathematical details (explaining them as best I can to a general audience), then discuss the implications.

Introduction: Cancer classification

Biology and medicine are currently adapting to the wealth of information we can obtain by using high-throughput assays: technologies which can rapidly read the DNA of an individual, measure the concentration of messenger RNA, metabolites, and proteins. In the early days of this "large-scale" approach to biology which began with the Human Genome Project, some optimists had hoped that such an unprecedented torrent of raw data would allow scientists to quickly "crack the genetic code." By now, any such optimism has been washed away by the overwhelming complexity and uncertainty of human biology--a complexity which has been made clearer than ever by the flood of data--and replaced with a sober appreciation that in the new "big data" paradigm, making a discovery becomes a much easier task than understanding any of those discoveries.

Enter the application of machine learning to this large-scale biological data. Scientists take these massive datasets containing patient outcomes, demographic characteristics, and high-dimensional genetic, neurological, and metabolic data, and analyze them using algorithms like support vector machines, logistic regression and decision trees to learn predictive models to relate key biological variables, "biomarkers", to outcomes of interest.

To give a specific example, take a look at this abstract from the Shipp et al. paper on predicting survival rates for cancer patients [4]:

Diffuse large B-cell lymphoma (DLBCL), the most common lymphoid malignancy in adults, is curable in less than 50% of patients. Prognostic models based on pre-treatment characteristics, such as the International Prognostic Index (IPI), are currently used to predict outcome in DLBCL. However, clinical outcome models identify neither the molecular basis of clinical heterogeneity, nor specific therapeutic targets. We analyzed the expression of 6,817 genes in diagnostic tumor specimens from DLBCL patients who received cyclophosphamide, adriamycin, vincristine and prednisone (CHOP)-based chemotherapy, and applied a supervised learning prediction method to identify cured versus fatal or refractory disease. The algorithm classified two categories of patients with very different five-year overall survival rates (70% versus 12%). The model also effectively delineated patients within specific IPI risk categories who were likely to be cured or to die of their disease. Genes implicated in DLBCL outcome included some that regulate responses to B-cell−receptor signaling, critical serine/threonine phosphorylation pathways and apoptosis. Our data indicate that supervised learning classification techniques can predict outcome in DLBCL and identify rational targets for intervention.

The term "supervised learning" refers to any algorithm for learning a predictive model that predicts some outcome Y (which could be either categorical or numeric) from covariates or features X. In this particular paper, the authors used a relatively simple linear model (which they called "weighted voting") for prediction.

A linear model is fairly easy to interpret: it produces a single "score variable" via a weighted average of a number of predictor variables. Then it predicts the outcome (say "survival" or "no survival") based on a rule like, "Predict survival if the score is larger than 0." Yet, far more advanced machine learning models have been developed, including "deep neural networks" which are winning all of the image recognition and machine translation competitions at the moment. These "deep neural networks" are especially notorious for being difficult to interpret. Along with similarly complicated models, neural networks are often called "black box models": although you can get miraculously accurate answers out of the "box", peering inside won't give you much of a clue as to how it actually works.

Now it is time for the first thought experiment. Suppose a follow-up paper to the Shipp paper reports dramatically improved prediction for survival outcomes of lymphoma patients. The authors of this follow-up paper trained their model on a "training sample" of 500 patients, then used it to predict the five-year outcome of chemotherapy patients, on a "test sample" of 1000 patients. It correctly predicts the outcome ("survival" vs "no survival") on 990 of the 1000 patients.

Question 1: what is your opinion on the predictive accuracy of this model on the population of chemotherapy patients? Suppose that publication bias is not an issue (the authors of this paper designed the study in advance and committed to publishing) and suppose that the test sample of 1000 patients is "representative" of the entire population of chemotherapy patients.

Question 2: does your judgment depend on the complexity of the model they used? What if the authors used an extremely complex and counterintuitive model, and cannot even offer any justification or explanation for why it works? (Nevertheless, their peers have independently confirmed the predictive accuracy of the model.)

A Frequentist approach

The Frequentist answer to the thought experiment is as follows. The accuracy of the model is a probability p which we wish to estimate. The number of successes on the 1000 test patients is Binomial(p, 1000). Based on the data, one can construct a confidence interval: say, we are 99% confident that the accuracy is above 98%. What does 99% confident mean? I won't try to explain, but simply say that in this particular situation, "I'm pretty sure" that the accuracy of the model is above 98%.
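As a concrete sketch, the frequentist computation might look like the following, using the Wilson score interval (one standard choice among several) and only the standard library:

```python
import math

def wilson_interval(successes, n, z=2.576):
    """Two-sided ~99% Wilson score confidence interval for a binomial
    proportion (z = 2.576 is the 99.5th percentile of the standard normal)."""
    phat = successes / n
    denom = 1 + z**2 / n
    center = (phat + z**2 / (2 * n)) / denom
    halfwidth = (z / denom) * math.sqrt(phat * (1 - phat) / n + z**2 / (4 * n**2))
    return center - halfwidth, center + halfwidth

lo, hi = wilson_interval(990, 1000)
print(f"99% confidence interval for accuracy: [{lo:.3f}, {hi:.3f}]")
```

For 990 successes out of 1000, the lower bound comes out a bit below 0.98, which is what "I'm pretty sure the accuracy is above 98%" is gesturing at.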

A Bayesian approach

The Bayesian interjects, "Hah! You can't explain what your confidence interval actually means!" He puts a uniform prior on the probability p. The posterior distribution of p, conditional on the data, is Beta(991, 11). This gives a 99% credible interval that p is in [0.978, 0.995]. You can actually interpret the interval in probabilistic terms, and it gives a much tighter interval as well. Seems like a Bayesian victory...?
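A minimal sketch of the Bayesian computation, using only the standard library (Monte Carlo draws from the Beta posterior in place of an exact quantile function):

```python
import random

random.seed(0)

# With a uniform Beta(1, 1) prior, 990 successes and 10 failures give the
# posterior p | data ~ Beta(1 + 990, 1 + 10) = Beta(991, 11).
draws = sorted(random.betavariate(991, 11) for _ in range(100_000))

lo = draws[int(0.005 * len(draws))]   # 0.5th percentile
hi = draws[int(0.995 * len(draws))]   # 99.5th percentile
print(f"99% credible interval for p: [{lo:.3f}, {hi:.3f}]")
```

The empirical quantiles land close to the [0.978, 0.995] interval quoted above.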

A subjective Bayesian approach

As I have argued before, a Bayesian approach which comes up with a model after hearing about the problem is bound to suffer from the same inconsistency and arbitrariness as any non-Bayesian approach. You might assume a uniform distribution for p in this problem... but what if yet another paper comes along with a similar prediction model? You would need a joint distribution for the current model and the new model. What if a theory comes along that could help explain the success of the current method? The parameter p might take on a new meaning in this context.

So as a subjective Bayesian, I argue that slapping a uniform prior on the accuracy is the wrong approach. But I'll stop short of actually constructing a Bayesian model of the entire world: let's say we want to restrict our attention to this particular issue of cancer prediction. We want to model the dynamics behind cancer and cancer treatment in humans. Needless to say, the model is still ridiculously complicated. However, I don't think it's out of reach of the efforts of a well-funded, large collaborative effort of scientists.

Roughly speaking, the model can be divided into a distribution over theories of human biology, and, conditional on the theory of biology, a coarse-grained model of an individual patient. The model would not include every cell, every molecule, etc., but it would contain many latent variables in addition to the variables measured in any particular cancer study. Let's call the variables actually measured in the study X, and the survival outcome Y.

Now here is the epistemologically correct way to answer the thought experiment. Take a look at the X's and Y's of the patients in the training and test set. Update your probabilistic model of human biology based on the data. Then take a look at the actual form of the classifier: it's a function f() mapping X's to Y's. The accuracy of the classifier is no longer a parameter: it's a quantity Pr[f(X) = Y] which has a distribution under your posterior. That is, for any given "theory of human biology", Pr[f(X) = Y] has a fixed value: now, over the distribution of possible theories of human biology (based on the data of the current study as well as all previous studies and your own beliefs), Pr[f(X) = Y] has a distribution, and therefore an average. But what will this posterior give you? Will you get something similar to the interval [0.978, 0.995] you got from the "practical Bayes" approach?

Who knows? But I would guess in all likelihood not. My guess is that you would get a very different interval from [0.978, 0.995], because in this complex model there is no direct link between the empirical success rate of prediction and the quantity Pr[f(X) = Y]. But my intuition for this fact comes from the following simpler framework.

A non-parametric Bayesian approach

Instead of reasoning about a grand Bayesian model of biology, I now take a middle ground, and suggest that while we don't need to capture the entire latent dynamics of cancer, we should at the very least try to include the X's and the Y's in the model, instead of merely abstracting the whole experiment as a Binomial trial (as did the frequentist and pragmatic Bayesian). Hence we need a prior over joint distributions of (X, Y). And yes, I do mean a prior distribution over probability distributions: we are saying that (X, Y) has some unknown joint distribution, which we treat as being drawn at random from a large collection of distributions. This is therefore a non-parametric Bayes approach: the term non-parametric means that the number of parameters in the model is not finite.

Since in this case Y is a binary outcome, a joint distribution can be decomposed into a marginal distribution over X, and a function g(x) giving the conditional probability that Y=1 given X=x. The marginal distribution is not so interesting or important for us, since it simply reflects the composition of the population of patients. For the purpose of this example, let us say that the marginal is known (e.g., a finite distribution over the population of US cancer patients). What we want to know is the probability of patient survival, and this is given by the function g(X) for the particular patient's X. Hence, we will mainly deal with constructing a prior over g(X).

To construct a prior, we need to think of intuitive properties of the survival probability function g(x). If x is similar to x', then we expect the survival probabilities to be similar. Hence the prior on g(x) should be over random, smooth functions. But we need to choose the smoothness so that the prior does not consist of almost-constant functions. Suppose for now that we choose a particular class of smooth functions (e.g. functions with a certain Lipschitz norm) and choose our prior to be uniform over functions of that smoothness. We could go further and put a prior on the smoothness hyperparameter, but for now we won't.

Now, although I assert my faithfulness to the Bayesian ideal, I still want to think about how whatever prior we choose would allow us to answer some simple thought experiments. Why is that? I hold that ideal Bayesian inference should capture and refine what I take to be "rational behavior." Hence, if a prior produces irrational outcomes, I reject that prior as not reflecting my beliefs.

Take the following thought experiment: we simply want to estimate the expected value of Y, E[Y]. Hence, we draw 100 patients independently with replacement from the population and record their outcomes: suppose the sum is 80 out of 100. The Frequentist (and pragmatic Bayesian) would end up concluding that with high probability/confidence/whatever, the expected value of Y is around 0.8, and I would hold that an ideal rationalist would come up with a similar belief. But what would our non-parametric model say? We would draw a random function g(x) conditional on our particular observations: we get a quantity E[g(X)] for each instantiation of g(x), and the distribution of E[g(X)]'s over the posterior allows us to make credible intervals for E[Y].

But what do we end up getting? One of two things happens. If you choose too little smoothness, E[g(X)] ends up concentrating at around 0.5, no matter what data you put into the model. This is the phenomenon of Bayesian non-consistency, and a detailed explanation can be found in several of the listed references: but to put it briefly, sampling at a few isolated points gives you too little information about the rest of the function. This example is not as pathological as the ones used in the literature: if you sample infinitely many points, you will eventually get the posterior to concentrate around the true value of E[Y], but all the same, the convergence is ridiculously slow.

Alternatively, if you use a super-high smoothness, the posterior of E[g(X)] has a nice interval around the sample value just like in the Binomial example. But now if you look at your posterior draws of g(x), you'll notice the functions are basically constants. Putting a prior on smoothness doesn't change things: the posterior on smoothness doesn't move, since you don't actually have enough data to determine the smoothness of the function. The posterior average of E[g(X)] is no longer always 0.5: it gets a little bit affected by the data, since within the 10% mass of the posterior corresponding to the smooth prior, the average of E[g(X)] is responding to the data. But you are still almost as slow as before in converging to the truth.
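The "too little smoothness" failure can be caricatured with a toy prior in which g is piecewise constant on many bins, with an independent Uniform(0,1) value in each bin. The bin structure, and the crude "pinning" of observed bins used below to mimic conditioning, are invented stand-ins for the real nonparametric computation, not a faithful posterior update:

```python
import random
import statistics

random.seed(1)

def draw_rough_g(n_bins=1000):
    """One draw from a maximally rough prior: g is piecewise constant,
    with an independent Uniform(0,1) survival probability in each bin."""
    return [random.random() for _ in range(n_bins)]

# Prior distribution of E[g(X)] (X uniform over the bins): it is already
# tightly concentrated around 0.5 by the law of large numbers.
prior_means = [statistics.mean(draw_rough_g()) for _ in range(200)]
print(f"prior E[g(X)]: mean={statistics.mean(prior_means):.3f}, "
      f"stdev={statistics.pstdev(prior_means):.4f}")

# Observing 100 bins barely moves it: the other 900 bins are untouched
# by the data and still average ~0.5.
def mean_after_observing(k_observed=100, observed_value=0.8, n_bins=1000):
    g = draw_rough_g(n_bins)
    for i in range(k_observed):   # crude stand-in for conditioning on data
        g[i] = observed_value
    return statistics.mean(g)

post_means = [mean_after_observing() for _ in range(200)]
print(f"E[g(X)] after 100 observations near 0.8: "
      f"{statistics.mean(post_means):.3f}")
```

Even with every observed point sitting at 0.8, the posterior mean of E[g(X)] only creeps up to about 0.53, which is the slow-convergence behavior described above.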

At the time that I started thinking about the above "uniform sampling" example, I was still convinced of a Bayesian resolution. Obviously, using a uniform prior over smooth functions is too naive: you can tell by seeing that the prior distribution over E[g(X)] is already highly concentrated around 0.5. How about a hierarchical model, where first we draw a parameter p from the uniform distribution, and then draw g(x) from the uniform distribution over smooth functions with mean value equal to p? This gets you non-constant g(x) in the posterior, while your posteriors of E[g(X)] converge to the truth as quickly as in the Binomial example. Arguing backwards, I would say that such a prior comes closer to capturing my beliefs.
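A rough sketch of that hierarchical idea (the small sinusoidal perturbation is an invented stand-in for "a random smooth function with mean value p"): now the prior over E[g(X)] is spread across the whole interval rather than pinned at 0.5, so the data can pull it toward an observed rate like 0.8.

```python
import math
import random
import statistics

random.seed(2)

def draw_hierarchical_g(n_grid=200):
    """Hierarchical prior sketch: p ~ Uniform(0,1), then g is p plus a
    small smooth zero-mean perturbation, clipped into [0, 1]."""
    p = random.random()
    amp = 0.05 * random.random()
    phase = random.uniform(0, 2 * math.pi)
    return [min(1.0, max(0.0, p + amp * math.sin(2 * math.pi * i / n_grid + phase)))
            for i in range(n_grid)]

# Under this prior, E[g(X)] is roughly the uniform draw p itself,
# so the prior over E[g(X)] is diffuse instead of concentrated at 0.5.
means = [statistics.mean(draw_hierarchical_g()) for _ in range(2000)]
print(f"prior E[g(X)]: mean={statistics.mean(means):.2f}, "
      f"stdev={statistics.pstdev(means):.2f}")
```

The prior standard deviation of E[g(X)] is close to that of a Uniform(0,1) draw (about 0.29), compared with essentially zero under the rough prior above.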

But then I thought, what about more complicated problems than computing E[Y]? What if you have to compute the expectation of Y conditional on some complicated function of X taking on a certain value: i.e., E[Y | f(X) = 1]? In the frequentist world, you can easily estimate E[Y | f(X) = 1] by rejection sampling: take a sample of individuals, and average the Y's of the individuals whose X's satisfy f(X) = 1. But how could you formulate a prior that has the same property? For a finite collection of functions, say {f_1, ..., f_100}, you might be able to construct a prior for g(x) so that the posterior for E[g(X) | f_i(X) = 1] converges to the truth for every i in {1, ..., 100}. I don't know how to do so, but perhaps you know. But the frequentist intervals work for every function f! Can you construct a prior which can do the same?
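The frequentist rejection-sampling estimate is easy to sketch. The feature distribution, the true survival function g(x) = x, and the threshold rule f below are all invented purely for illustration:

```python
import random

random.seed(3)

# Toy population: X is a single feature in [0, 1]; the true survival
# probability is g(x) = x (an assumption made only for this sketch).
def sample_patient():
    x = random.random()
    y = 1 if random.random() < x else 0
    return x, y

def f(x):
    """An arbitrary selection function, e.g. 'biomarker above threshold'."""
    return 1 if x > 0.7 else 0

# Rejection sampling: keep only individuals with f(X) = 1, average their Y.
kept = []
while len(kept) < 10_000:
    x, y = sample_patient()
    if f(x) == 1:
        kept.append(y)

estimate = sum(kept) / len(kept)
print(f"estimated E[Y | f(X) = 1] = {estimate:.2f}")  # true value is 0.85
```

The same recipe works for any f you hand it, which is exactly the generality that is hard to reproduce with a single prior.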

I am happy to argue that a true Bayesian would not need consistency for every possible f in the mathematical universe. It is cool that frequentist inference works for such a general collection: but it may well be unnecessary for the world we live in. In other words, there may be functions f which are so ridiculous, that even if you showed me that empirically, E[Y|f(X)=1] = 0.9, based on data from 1 million patients, I would not believe that E[Y|f(X)=1] was close to 0.9. It is a counterintuitive conclusion, but one that I am prepared to accept.

Yet, the set of f's which are not so ridiculous, which in fact I might accept as reasonable based on conventional science, may be so large as to render impossible the construction of a prior which could accommodate them all. But the Bayesian dream makes the far stronger demand that our prior capture not just our current understanding of science but also match the flexibility of rational thought. I hold that given the appropriate evidence, rationalists can be persuaded to accept truths which they could not even imagine beforehand. Thinking about how we could possibly construct a prior to mimic this behavior, the Bayesian dream seems distant indeed.


To be updated later... perhaps responding to some of your comments!


[1] Diaconis and Freedman, "On the Consistency of Bayes Estimates"

[2] ET Jaynes, Probability: the Logic of Science


[4] Shipp et al. "Diffuse large B-cell lymphoma outcome prediction by gene-expression profiling and supervised machine learning." Nature

[Link] Max Tegmark and Nick Bostrom Speak About AI Risk at UN International Security Event

10 Gram_Stone 13 October 2015 11:25PM

How could one (and should one) convert someone from pseudoscience?

10 Vilx- 05 October 2015 11:53AM

I've known for a long time that some people who are very close to me are somewhat inclined to believe the pseudoscience world, but it always seemed pretty benign. In their everyday lives they're pretty normal people and don't do any crazy things, so this was a topic I mostly avoided and left it at that. After all - they seemed to find psychological value in it. A sense of control over their own lives, a sense of purpose, etc.

Recently I found out however that at least one of them seriously believes Bruce Lipton, who in essence preaches that happy thoughts cure cancer. Now I'm starting to get worried...

Thus I'm wondering - what can I do about it? This is in essence a religious question. They believe this stuff with just anecdotal proof. How do I disprove it without sounding like "Your religion is wrong, convert to my religion, it's right"? Pseudoscientists are pretty good at weaving a web of lies that sound quite logical and true.

The one thing I've come up with is to somehow introduce them to classical logical fallacies. That at least doesn't directly conflict with their beliefs. But beyond that I have no idea.

And perhaps more important is the question - should I do anything about it? The pseudoscientific world is a rosy one. You're in control of your life and your body, you control random events, and most importantly - if you do everything right, it'll all be OK. Even if I succeed in crushing that illusion, I have nothing to put in its place. I'm worried that revealing just how truly bleak the reality is might devastate them. They seem to be drawing a lot of their happiness from these pseudoscientific beliefs, either directly or indirectly.

And anyway, more likely that I won't succeed but just ruin my (healthy) relationship with them. Maybe it's best just not to interfere at all? Even if they end up hurting themselves, well... it was their choice. Of course, that also means that I'll be standing idly by and allowing bullshit to propagate, which is kinda not a very good thing. However right now they are not very pushy about their beliefs, and only talk about them if the topic comes up naturally, so I guess it's not that bad.

Any thoughts?

Ultimatums in the Territory

10 malcolmocean 28 September 2015 10:01PM

When you think of "ultimatums", what comes to mind?

Manipulativeness, maybe? Ultimatums are typically considered a negotiation tactic, and not a very pleasant one.

But there's a different thing that can happen, where an ultimatum is made, but where articulating it isn't a speech act but rather an observation. As in, the ultimatum wasn't created by the act of stating it, but rather, it already existed in some sense.

Some concrete examples: negotiating relationships

I had a tense relationship conversation a few years ago. We'd planned to spend the day together in the park, and I was clearly angsty, so my partner asked me what was going on. I didn't have a good handle on it, but I tried to explain what was uncomfortable for me about the relationship, and how I was confused about what I wanted. After maybe 10 minutes of this, she said, "Look, we've had this conversation before. I don't want to have it again. If we're going to do this relationship, I need you to promise we won't have this conversation again."

I thought about it. I spent a few moments simulating the next months of our relationship. I realized that I totally expected this to come up again, and again. Earlier on, when we'd had the conversation the first time, I hadn't been sure. But it was now pretty clear that I'd have to suppress important parts of myself if I was to keep from having this conversation.

"...yeah, I can't promise that," I said.

"I guess that's it then."

"I guess so."

I think a more self-aware version of me could have recognized, without her prompting, that my discomfort represented an irreconcilable part of the relationship, and that I basically already wanted to break up.

The rest of the day was a bit weird, but it was at least nice that we had resolved this. We'd realized that it was a fact about the world that there wasn't a serious relationship that we could have that we both wanted.

I sensed that when she posed the ultimatum, she wasn't doing it to manipulate me. She was just stating what kind of relationship she was interested in. It's like if you go to a restaurant and try to order a pad thai, and the waiter responds, "We don't have rice noodles or peanut sauce. You either eat somewhere else, or you eat something other than a pad thai."

An even simpler example would be that at the start of one of my relationships, my partner wanted to be monogamous and I wanted to be polyamorous (i.e. I wanted us both to be able to see other people and have other partners). This felt a bit tug-of-war-like, but eventually I realized that actually I would prefer to be single than be in a monogamous relationship.

I expressed this.

It was an ultimatum! "Either you date me polyamorously or not at all." But it wasn't me "just trying to get my way".

I guess the thing about ultimatums in the territory is that there's no bluff to call.

It happened in this case that my partner turned out to be really well-suited for polyamory, and so this worked out really well. We'd decided that if she got uncomfortable with anything, we'd talk about it, and see what made sense. For the most part, there weren't issues, and when there were, the openness of our relationship ended up just being a place where other discomforts were felt, not a generator of disconnection.

Normal ultimatums vs ultimatums in the territory

I use "in the territory" to indicate that this ultimatum isn't just a thing that's said but a thing that is true independently of anything being said. It's a bit of a poetic reference to the map-territory distinction.

No bluffing: preferences are clear

The key distinguishing piece with UITTs is, as I mentioned above, that there's no bluff to call: the ultimatum-maker isn't secretly really really hoping that the other person will choose one option or the other. These are the two best options as far as they can tell. They might have a preference: in the second story above, I preferred a polyamorous relationship to no relationship. But I preferred both of those to a monogamous relationship, and the ultimatum in the territory was me realizing and stating that.

This can actually be expressed formally, using what's called a preference vector. This comes from Keith Hipel at the University of Waterloo. If the tables in this next bit don't make sense, don't worry about it: all important conclusions are expressed in the text.

First, we'll note that since each of us have two options, a table can be constructed which shows four possible states (numbered 0-3 in the boxes).

                            My options
Partner's options           insist poly             don't insist
offer relationship          3: poly relationship    1: mono relationship
don't offer                 2: no relationship      0: (??) no relationship

This representation is sometimes referred to as matrix form or normal form, and has the advantage of making it really clear who controls which state transitions (movements between boxes). Here, my decision controls which column we're in, and my partner's decision controls which row we're in.

Next, we can consider: of these four possible states, which are most and least preferred, by each person? Here's my preferences, ordered from most to least preferred, left to right. The 1s in the boxes mean that the statement on the left is true.

state                          3   2   1   0
I insist on polyamory          1   1   0   0
partner offers relationship    1   0   1   0
My preference vector (← preferred)

The order of the states represents my preferences (as I understand them) regardless of what my potential partner's preferences are. I only control movement in the top row (do I insist on polyamory or not). It's possible that they prefer no relationship to a poly relationship, in which case we'll end up in state 2. But I still prefer this state over state 1 (mono relationship) and state 0 (in which I don't ask for polyamory and my partner decides not to date me anyway). So whatever my partner's preferences are, I've definitely made a good choice for me by insisting on polyamory.

This wouldn't be true if I were bluffing (if I preferred state 1 to state 2 but insisted on polyamory anyway). If I preferred 1 to 2, but I bluffed by insisting on polyamory, I would basically be betting on my partner preferring polyamory to no relationship. This might backfire and get me no relationship, when both of us (in this hypothetical) would have preferred a monogamous relationship to that. I think this phenomenon is one reason people dislike bluffy ultimatums.

My partner's preferences turned out to be...

state                          1   3   2   0
I insist on polyamory          0   1   1   0
partner offers relationship    1   1   0   0
Partner's preference vector (← preferred)

You'll note that they preferred a poly relationship to no relationship, so that's what we got! Although as I said, we didn't assume that everything would go smoothly. We agreed that if this became uncomfortable for my partner, then they would tell me and we'd figure out what to do. Another way to think about this is that after some amount of relating, my partner's preference vector might actually shift such that they preferred no relationship to our polyamorous one. In which case it would no longer make sense for us to be together.
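The dominance reasoning above can be checked mechanically. In this sketch (with the two partner preference orderings invented for illustration), the partner best-responds to my action; with my honest preferences, insisting is best no matter what the partner wants, while for a bluffer it can backfire:

```python
# States encode (my action, partner's response), as in the matrix above:
#   3 = (insist, offer)        2 = (insist, don't offer)
#   1 = (don't insist, offer)  0 = (don't insist, don't offer)
STATE = {("insist", "offer"): 3, ("insist", "no offer"): 2,
         ("dont", "offer"): 1, ("dont", "no offer"): 0}

def best_response(my_action, partner_pref):
    """State reached when the partner best-responds to my action.
    A preference vector lists states from most to least preferred."""
    rank = {s: i for i, s in enumerate(partner_pref)}
    return min((STATE[(my_action, pa)] for pa in ("offer", "no offer")),
               key=rank.get)

def best_action(my_pref, partner_pref):
    """My best action, anticipating the partner's best response."""
    my_rank = {s: i for i, s in enumerate(my_pref)}
    return min(("insist", "dont"),
               key=lambda a: my_rank[best_response(a, partner_pref)])

honest = [3, 2, 1, 0]    # poly > no relationship > mono > neither
bluffer = [3, 1, 2, 0]   # secretly prefers mono to no relationship

partner_accepts_poly = [1, 3, 2, 0]    # prefers mono but would do poly
partner_rather_single = [1, 2, 3, 0]   # would rather be single than poly

print(best_action(honest, partner_accepts_poly))    # insist
print(best_action(honest, partner_rather_single))   # insist: no bluff to call
print(best_action(bluffer, partner_rather_single))  # dont: bluffing backfires
```

With honest preferences, "insist" wins against both hypothetical partners, which is the "no bluff to call" property; for the bluffer facing the second partner, insisting would trade a mono relationship for no relationship.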

UITTs release tension, rather than creating it

In writing this post, I skimmed a wikihow article about how to give an ultimatum, in which they say:

"Expect a negative reaction. Hardly anyone likes being given an ultimatum. Sometimes it may be just what the listener needs but that doesn't make it any easier to hear."

I don't know how accurate the above is in general. I think they're talking about ultimatums like "either you quit smoking or we break up". I can say that I expect these properties of an ultimatum to contribute to the negative reaction:

  • stated angrily or otherwise demandingly
  • more extreme than your actual preferences, because you're bluffing
  • refers to what they need to do, versus your own preferences

So this already sounds like UITTs would have less of a negative reaction.

But I think the biggest reason is that they represent a really clear articulation of what one party wants, which makes it much simpler for the other party to decide what they want to do. Ultimatums in the territory tend to also be more of a realization that you then share, versus a deliberate strategy. And this realization causes a noticeable release of tension in the realizer too.

Let's contrast:

"Either you quit smoking or we break up!"


"I'm realizing that as much as I like our relationship, it's really not working for me to be dating a smoker, so I've decided I'm not going to. Of course, my preferred outcome is that you stop smoking, not that we break up, but I realize that might not make sense for you at this point."

Of course, what's said here doesn't necessarily correspond to the preference vectors shown above. Someone could say the demanding first thing when they actually do have a UITT preference-wise, and someone who's trying to be really NVCy or something might say the second thing even though they're actually bluffing and would prefer staying together, smoking and all, to breaking up. But I think that in general they'll correlate pretty well.

The "realizing" seems similar to what happened to me 2 years ago on my own, when I realized that the territory was issuing me an ultimatum: either you change your habits or you fail at your goals. This is how the world works: your current habits will get you X, and you're declaring you want Y. On one level, it was sad to realize this, because I wanted to both eat lots of chocolate and to have a sixpack. Now this ultimatum is really in the territory.

Another example could be realizing that not only is your job not really working for you, but that it's already not-working to the extent that you aren't even really able to be fully productive. So you don't even have the option of just working a bit longer, because things are only going to get worse at this point. Once you realize that, it can be something of a relief, because you know that even if it's hard, you're going to find something better than your current situation.

Loose ends

More thoughts on the break-up story

One exercise I have left to the reader is creating the preference vectors for the break-up in the first story. HINT: (rot13'd) Vg'f fvzvyne gb gur cersrerapr irpgbef V qvq fubj, jvgu gjb qrpvfvbaf: fur pbhyq vafvfg ba ab shgher fhpu natfgl pbairefngvbaf be abg, naq V pbhyq pbagvahr gur eryngvbafuvc be abg.

An interesting note is that to some extent in that case I wasn't even expressing a preference but merely a prediction that my future self would continue to have this angst if it showed up in the relationship. So this is even more in the territory, in some senses. In my model of the territory, of course, but yeah. You can also think of this sort of as an unconscious ultimatum issued by the part of me that already knew I wanted to break up. It said "it's preferable for me to express angst in this relationship than to have it be angst free. I'd rather have that angst and have it cause a breakup than not have the angst."

Revealing preferences

I think that ultimatums in the territory are also connected to what I've called Reveal Culture (closely related to Tell Culture, but framed differently). Reveal cultures have the assumption that in some fundamental sense we're on the same side, which makes negotiations a very different thing... more of a collaborative design process. So it's very compatible with the idea that you might just clearly articulate your preferences.

Note that there doesn't always exist a UITT to express. In the polyamory example above, if I'd preferred a mono relationship to no relationship, then I would have had no UITT (though I could have bluffed). In this case, it would be much harder for me to express my preferences, because if I leave them unclear then there can be kind of implicit bluffing. And even once articulated, there's still no obvious choice. I prefer this, you prefer that. We need to compromise or something. It does seem clear that, with these preferences, if we don't end up with some relationship at the end, we messed up... but deciding how to resolve it is outside the scope of this post.

Knowing your own preferences is hard

Another topic this post will point at but not explore is: how do you actually figure out what you want? I think this is a mix of skill and process. You can get better at the general skill by practising trying to figure it out (and expressing it / acting on it when you do, and seeing if that works out well). One process I can think of that would be helpful is Gendlin's Focusing. Nate Soares has written about how introspection is hard and to some extent you don't ever actually know what you want: You don't get to know what you're fighting for. But, he notes,

"There are facts about what we care about, but they aren't facts about the stars. They are facts about us."

And they're hard to figure out. But to the extent that we can do so and then act on what we learn, we can get more of what we want, in relationships, in our personal lives, in our careers, and in the world.

(This article crossposted from my personal blog.)

Is my brain a utility minimizer? Or, the mechanics of labeling things as "work" vs. "fun"

10 contravariant 28 August 2015 01:12AM

I recently encountered something that is, in my opinion, one of the most absurd failure modes of the human brain. I first encountered this after introspection on useful things that I enjoy doing, such as programming and writing. I noticed that my enjoyment of the activity doesn't seem to help much when it comes to motivation for earning income. This was not boredom from too much programming, as it did not affect my interest in personal projects. What it seemed to be was the brain categorizing activities into "work" and "fun" boxes. On one memorable occasion, after taking a break due to being exhausted with work, I entertained myself by programming some more, this time on a hobby personal project (as a freelancer, I pick the projects I work on, so this is not from being told what to do). Relaxing by doing the exact same thing that made me exhausted in the first place.

The absurdity of this becomes evident when you think about what distinguishes "work" and "fun" in this case, which is added value. Nothing changes about the activity except the addition of more utility, making a "work" strategy always dominate a "fun" strategy, assuming the activity is the same. If you are having fun doing something, handing you some money can't make you worse off. Making an outcome better makes you avoid it. Meaning that the brain is adopting a strategy that has a (side?) effect of minimizing future utility, and it seems like it is utility and not just money here - as anyone who took a class in an area that personally interested them knows, other benefits like grades recreate this effect just as well. This is the reason I think this is among the most absurd biases - I can understand akrasia, wanting the happiness now and hyperbolically discounting what happens later, or biases that make something seem like the best option when it really isn't. But knowingly punishing what brings happiness just because it also benefits you in the future? It's like the discounting curve dips into the negative region. I would really like to learn where is the dividing line between which kinds of added value create this effect and which ones don't (like money obviously does, and immediate enjoyment obviously doesn't). Currently I'm led to believe that the difference is present utility vs. future utility, (as I mentioned above) or final vs. instrumental goals, and please correct me if I'm wrong here.

This effect has been studied in psychology under the name "overjustification effect", so named because the leading theory explains it as the brain attributing the motivation to the instrumental gain rather than the direct enjoyment, and then reducing motivation accordingly. This would suggest the brain has trouble treating a goal as both instrumental and final, and that for some reason the instrumental side always wins a conflict. However, the explanation in terms of self-perception bothers me a little, since I find it hard to believe that a recent creation like self-perception can override something as ancient and low-level as the enjoyment of final goals. I searched LessWrong for discussions of the overjustification effect, and the ones I found treated it in the context of self-perception, not decision-making and motivation. It is the latter I wanted to ask for your thoughts on.


[Link] Lifehack Article Promoting LessWrong, Rationality Dojo, and Rationality: From AI to Zombies

9 Gleb_Tsipursky 14 November 2015 08:34PM

Nice to get this list-style article promoting LessWrong, Rationality Dojo, and Rationality: From AI to Zombies, as part of a series of strategies for growing mentally stronger, published on Lifehack, a very popular self-improvement website. It's part of my broader project of promoting rationality and effective altruism to a broad audience, Intentional Insights.


EDIT: To be clear, based on my exchange with gjm below, the article does not promote these heavily and links more to Intentional Insights. I was excited to be able to get links to LessWrong, Rationality Dojo, and Rationality: From AI to Zombies included in the Lifehack article, as previously editors had cut out such links. I pushed back against them this time, and made a case for including them as a way of growing mentally stronger, and thus was able to get them in.

Life Advice Repository

9 Gunnar_Zarncke 18 October 2015 12:08PM

Looking through the Repository Repository, I can't find a good category for a lot of the real-life or self-help advice that has been posted here over time. Sure, some of it belongs in the Boring Advice Repository, but the following you surely wouldn't expect there:

continue reading »

A very long list of sleep maintenance suggestions

9 Elo 15 October 2015 03:29AM

Leading up to this year's Australia megameetup, in the interest of improving people's lives in the most valuable way possible, I was hoping to include a session on sleep, sleep quality and sleep maintenance.  With that in mind I put together A very long list of sleep maintenance suggestions.

Some of the most important take-aways: 

  1. Do you think you get {good sleep/enough sleep}?  
    - If no, then fix it.  This single thing will improve your life drastically.  (Also, don't lie to yourself about this: research shows that sleep-deprived people are bad at predicting how sleep deprived they are, so if you are unsure, err on the side of caution.  As a measure: if you turned off your alarms, would you be able to get out of bed at the same time every day?)
  2. "I do this weird thing with my sleep but it works well for me - is that a problem?"
    - Not really.  If it works, keep doing it.  If it works most of the time but falls apart every Monday, then maybe it's time to consider a different plan.
  3. Uberman, and other polyphasic sleep cycles?
    - Depends if they work for you.  Don't force yourself into one, and don't expect it to work for you.  Feel free to try; lifestyle is also relevant in considering a sleep implementation (if you have a 9-5 job you almost certainly can't make it work; if your life is flexible, then maybe).
Also, living a healthy lifestyle will make a big difference.

Some good highlights from the list:
  • limit caffeine, especially to earlier in the day
  • avoid using alcohol as a nightcap - it disrupts sleep maintenance
  • Avoid heavy meals and heavy exercise within 3 hours of bedtime
  • use bedroom for sleep and sex only
  • have sleep in your schedule (go to bed and get up at the same time every day, even on weekends)
  • decrease brightness of home lighting ~1-2 hours before bed
  • avoid electronics ~1-2 hours before bed
  • reduce light and noise (via earplugs / white noise) in bedroom as much as possible while sleeping
  • If you tend to sleep in a lot when you don't set an alarm, you are not getting enough sleep on average - go to bed earlier, consistently.
  • If your alarm keeps going off in the middle of REM sleep, move your bedtime about 45 minutes in either direction - REM sleep occurs in roughly 1.5-hour increments.
  • Use melatonin.
  • avoid smoking.
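
The 90-minute-cycle arithmetic behind the REM suggestion above can be sketched as a small helper; this is a hypothetical illustration, not something from the original list, and both the 90-minute cycle length and the 15-minute sleep latency are rough averages that vary by person:

```python
def suggested_bedtimes(wake_hour, cycles=(4, 5, 6), latency_min=15):
    """Work backwards from a wake time in 90-minute sleep-cycle increments.

    Assumes ~90 minutes per sleep cycle plus ~15 minutes to fall asleep;
    treat the output as starting points to experiment with, not prescriptions.
    """
    results = []
    for n in cycles:
        total_min = n * 90 + latency_min
        bed = (wake_hour * 60 - total_min) % (24 * 60)
        results.append(f"{bed // 60:02d}:{bed % 60:02d}")
    return results

# Waking at 07:00 -> candidate bedtimes for 4, 5, or 6 full cycles:
print(suggested_bedtimes(7))  # ['00:45', '23:15', '21:45']
```

If your alarm keeps catching you mid-dream, shifting bedtime toward one of these candidates is the "move about 45 minutes" adjustment the list recommends.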

The list is best formatted here:

It is also included below for convenience.


A very long list of sleep-improving suggestions (evidence ratings from -2 to 2)
Columns: area of interest | evidence rating | explanation by Adam K | comments by others
Everyday life        
eat healthy 1   Being overweight reduces sleep quality and risk of sleep disorder  
reduce sugar and refined carb intake ?     These contribute to daytime sleepiness, which may result in overnapping - Kat
be a healthy body weight 2   BMI over 30 puts you at risk of sleep apnea; if you are above normal BMI with sleep apnea, losing weight may help reduce apnea symptoms Being overweight can contribute to sleep apnea - Kat
limit caffeine (in chocolate or decaf too) 2   Caffeine response differs significantly in people Limiting caffeine to earlier in the day may also be of some use - Kat
quit smoking (stimulant and breathing) 2   Less deep sleep, less total sleep, and longer sleep latency  
exercise daily (not around your sleep time by at least 2-4 hours) 2   Increased deep sleep, less sleep interruptions Even a small amount helps - start with 10 minutes of cardio and work your way up if you have to - Kat
reduce anxiety and stress 2   Anxiety increases sleep latency and sleep interruptions  
limit irregular work shifts 2   Circadian rhythmicity important for all parts of health  
avoid long commutes 1   Rising too early can miss REM, and less total sleep, some confounding factors to consider  
Be physically healthy 2   Diseases generally lead to sleep disorder, e.g. diabetes, cancer, etc  
Get enough sunlight 2   Light is the most important zeitgeber for circadian rhythmicity  
Analysing your sleep setup        
use your bed for sleep and sex only 1   Bed restriction in *older adults* I don't really know, but older adults who dawdle in bed tend to get better sleep quality if they restrict bed times to reasonable sleep times
sleep in darkness – the more the better, including all LEDs 2   Light is the most important zeitgeber for circadian rhythmicity  
Cool room temperature for sleep (15-25°C) 0   15°C may be too cold for people with poor core body temperature control but good for younger, healthier, more active people; 25°C is probably too warm for everyone In-bed or rectal measurements are more accurate, but too complicated for most people to do
check if you are using comfortable pillows 0   highly subjective, no guarantees, only moderate, weak or no associations with sleep quality and wakings I guess room temperature is only important if it is to cool down, because if it's too cold you can always pile more blankets on until you are comfortable. Recommended room temperature would be 17-22C then
body pillow, neck pillow, arm pillow, to permit a better body position while asleep 0   highly subjective, no guarantees, only moderate, weak or no associations with sleep quality and wakings  
check if your bed is comfortable 0   highly subjective, no guarantees, only moderate, weak or no associations with sleep quality and wakings  
evaluate sleep location in bedroom - too close to window, door, other noise / light? 0   highly subjective, no guarantees, only moderate, weak or no associations with sleep quality and wakings  
Evaluate sleep distractions in the room 1   distractions, by definition, increase sleep latency  
mattress life expectancy check (around 10 years) 0   highly subjective, no guarantees, only moderate, weak or no associations with sleep quality and wakings  
pillow life expectancy check (around 2-4 years) 0   highly subjective, no guarantees, only moderate, weak or no associations with sleep quality and wakings  
allergens in the bedroom 2   definitely affects sleep quality Easiest thing to do is buy dust-mite-proof pillow and mattress covers. Wash bedding weekly in hot water and a little bleach (kills mold). Vacuum regularly. Keep windows closed during known allergy seasons. If you have bad allergy symptoms, get tested and get immunotherapy shots if you can afford it. - Kat
limit pets in bed 1   Sharing bed space with anything decreases sleep quality, including sleeping with partners  
limit children in bed 1   Sharing bed space with anything decreases sleep quality, including sleeping with partners  
make sure there is enough room for those in the bed 1   Sharing bed space with anything decreases sleep quality, including sleeping with partners and enough sheets and blankets for each - consider separate sheet/blanket for each side of the bed if your sleep partner tugs on the sheets and wakes you - Kat
bedside notepad for anything you might want to write down - if something is keeping you up; you can use this to record things and effectively put them out of your mind so that you can go to sleep. ??   is this a distraction?  
Understand approximate sleep hours needed (7-9 in most adults, different summer-winter) 2   Most people underestimate how much they need physiological 'need' for sleep doesn't decrease with age, only 'feel' for need for sleep does
certain smells can help, certain smells can hinder. 0   highly subjective, no guarantees, only moderate, weak or no associations with sleep quality and wakings  
Have sleep in your schedule 2   Regular bed time important for circadian rhythmicity  
have a sleep schedule that includes sleep on the weekends (no skipping the weekends) 2   Sleeping in on the weekend can be very good for people who undersleep during the weekdays, but it's not as good as regular sleep of course ok
Turn your clock so you can't see it while the lights are out / don't check time on your phone 2   If you have a clock, make sure clocks are either dimmed or red LED  
what is your bed and blankets made out of? Are these the best materials for you for this bed? 0   highly subjective, no guarantees, only moderate, weak or no associations with sleep quality and wakings  
don't have a TV in the bedroom 2   Emits light, is a distraction, etc  
Don't have a computer, tablet, or phone in the bedroom 2   Emits light, is a distraction, etc even worse than TV  
calming bedroom colour (need source) 0   Unless referring to red light or candle light use at night, not sure what it's referring to I suspect this relates to the 'look' of the bedroom in general, and how you feel when you walk into it. I.e., if you hate mustard yellow, re-paint your room if the walls are mustard yellow - Kat
On the way to sleep        
Pre-Bed food     extreme diets (VHC or VLC) can ruin sleep quality, and carbohydrates for dinner can reduce sleep latency  
go to bed neither hungry nor stuffed (food) 1   highly subjective, but true  
don't eat meals too close to sleep 2   either digestion slows, or sleep is disrupted, one or the other (subjective)  
small evening meals 0   highly subjective, no guarantees, only moderate, weak or no associations with sleep quality and wakings  
limit late night alcohol 2   alcohol reduces quality of deep sleep it only reduces sleep quality if it is in your system while you sleep, so you could drink in the afternoon and have it leave your system by the time you sleep and you'd be fine.
limit late night liquid 0   usually true one thing to note is marijuana, which is commonly consumed, also affects sleep quality, but far less is known about its effects, it seems it is variable
avoid sugar heavy foods -1   carbs will reduce sleep latency, though I don't recommend sugar for general good health  
avoid spicy or greasy meals before bed (or other food you know does not agree with you) 1   high fat meals correlated with poor sleep measure  
tryptophan snack – if you are hungry try a light snack before bed -2   evidence for this actually working is non-existent; a well-perpetuated myth Common suggestions included warm milk, a banana, cheese on crackers, cereal and milk, also turkey - combine carbohydrates and either calcium or a protein that contains the amino acid tryptophan to boost serotonin for calmness.
Things that aren't food        
Have set a regular bedtime 2   Circadian Rhythmicity important  
have a bedtime routine or ritual which includes relaxation 1   highly subjective, but I guess true  
decrease brightness of home lighting ~1-2 hours before bed 2      
eliminate blue-spectrum home and screen lighting ~1-2 hours before bed 2   blue light increases heart rate, wake-inducing catecholamines, and brain activity, reducing sleep quality Avoid fluorescent tube lights, compact fluorescent or LED bulbs labeled daylight, cool white, or bright white (instead, use sub-3500K color temps, sometimes called warm white or soft white), and screens without a red-shift application running - Kat
avoid electronics before bed 2   game-like activity increases heart rate, wake-inducing catecholamines, and brain activity, reducing sleep quality  
keep noise down while heading to bed 0   highly subjective, but may be valuable despite no 'evidence'  
organise for tomorrow so you can stop thinking about it 0   highly subjective, but may be valuable despite no 'evidence' At its simplest, make a todo list for tomorrow. If there's a lot on your mind, try a full-on 'brain dump' on a very large sheet of paper, several hours before bed: write down everything you think is important for the next month or so. Use that to inform your todo lists.
Bedtime media; book; audiobook; calming music (soft), 0   highly subjective, but may be valuable despite no 'evidence' I guess it's up to the person
stretch (debatable) 0   more likely to be because of exercise  
wind down an hour before bed 1   exercise too close to sleep increases heart rate, increases sleep latency  
take a warm bath/shower 1   only if you need to lower your core body temperature (see temperature advice above)  
before bed – write down what is on your mind and resolve to leave it for tomorrow 0   highly subjective, but may be valuable despite no 'evidence'  
read before bed by soft light 0   highly subjective, but may be valuable despite no 'evidence'  
don't have a nightcap (alcohol) 2   alcohol reduces quality of deep sleep  
neutral neck position in bed and before bed. 0   mostly supported by alternative chiropractic studies, which is poor form of evidence  
hot pack on the neck 1   only if you need to lower your core body temperature (see temperature advice above)  
do a simple armchair hobby to relax 0   highly subjective, but may be valuable despite no 'evidence'  
Go to sleep when you are tired. Don't wait in bed frustrated if you can't fall asleep. 0   highly subjective, but may be valuable despite no 'evidence'  
consider wearing socks to bed 0   highly subjective, but may be valuable despite no 'evidence'  
sleep diary of if you felt sleepy during the day, things that you think might influence your sleep tonight. 0   highly subjective, but may be valuable despite no 'evidence' Include food, exercise, sleep details, # of awakenings in the middle of the night and their approx. duration, rate the sleep out of 10, time of last wakeup - naturally or to an alarm? If you were dreaming when your alarm went off, go to bed earlier so that your alarm is not waking you up in the middle of a REM cycle - Kat
Going to bed        
Select nightclothes (or none) and bedding to keep yourself heat stable (thermoregulation) 0      
If you are having difficulty getting to sleep – try imagine what you would like to dream about 0   only anecdotal evidence, but may be valuable, I personally recommend this technique If you are an artist or a crafter, imagine the design of a project you would like to do someday - Kat
deep breathing (or other relaxation technique - visualisation breathing, yoga) 2   sufficient evidence to say it works if you have good compliance with the practice  
For while you are asleep        
Noise / Light        
earplugs 0-2   benefits depends on environment, will help (2) in high noise environment  
white noise (device, fan, or app, pink noise) 0-2   white noise improves noisy environment, but silence is better  
humidifiers for air quality 1   may improve breathing problems, if sleep quality is compromised by breathing problems, cpap etc Must be cleaned on a regular basis to prevent mold - Kat
air filter 1   may improve breathing problems, asthma and allergies specifically Tape a 20x20" electrostatic furnace filter to a 20" box fan for a cheap air filter - Kat
fans for air movement+cooling, and/or white noise 0-2   white noise improves noisy environment, but silence is better  
eye mask for light 0-2   benefits depends on environment, will help (2) in high light environment  
Sleep in the dark - at night if you have a choice; use heavy curtains if streetlights or sunlight present 2      
Sleeping positions - get comfortable!        
try a leg pillow (pillow between the knees) or holding a pillow 0   highly subjective, but may be valuable despite no 'evidence'  
make sure you are sleeping in a neutral neck position 0   mostly supported by alternative chiropractic studies, which is poor form of evidence  
try other positions if that one is uncomfortable 0   highly subjective, but may be valuable despite no 'evidence'  
try each side, back, front. -1   I'd say sleeping on back is not good for sleep parameters, higher risk of sleep disorders developing and worse sleep quality  
Staying asleep        
Body temperature, Room temperature 1   only if you need to lower your core body temperature (see temperature advice above)  
noises 0-2   see white noise vs silence, see ear plugs Ask housemates to avoid low-frequency sounds, like slamming doors, music, etc as these cannot be masked by white noise or earplugs - Kat
smells (i.e. smoke, food) ??   ?? Ask housemates to avoid cooking aromatic foods while you are asleep (ex, frying sausage, onions, canned tuna, etc) - Kat
If your sleep is interrupted        
small bathroom nightlight (not blue and not bright like normal bathroom lights) 2   try to go to the loo without any lights being turned on, otherwise use red lightbulb  
avoid cold floors (rugs/socks) 1   subjective, but warm feet are important for getting back to sleep and decreasing sleep latency  
get back to sleep: stay in bed 0   subjective  
get back to sleep: just try to relax, don't try for sleep 0   subjective Drowsing in bed is still more restful than being awake and doing something - Kat
get back to sleep: avoid electronics with blue light 2   avoid blue, green, white light  
don't use portable electronics in bed 2   avoid blue, green, white light  
If wide awake, go do low-key activity for 15m, then back to bed again 1   can stay up for up to an hour and a half  
When you wake up        
wake up at the same time every day 0   light exposure at same time every day more important, sun lamp or lamp timer  
keep a sleep diary of all these possible related factors 0   highly subjective, but may be valuable despite no 'evidence'  
increase light levels (just after waking up) 2   lamp and timer or lifx  
get up when the alarm goes off – don't hit the snooze button 2   snoozing is bad; either sleep in or don't - a string of alarms just compromises sleep quality, even if you think it makes getting up feel easier  
One option: nap every single day (siesta style) 2   naps = lots of health benefits  
The other: Don't nap 0   no benefits to no naps No naps was recommended to me by neurologist; helpful if sleep schedule is completely messed up. Otherwise, I would say, don't nap if you're not tired. If you are tired, then nap, and look to how you can add sleep time at night in the future, rather than relying on naps - Kat
If you do; nap for less than 30 minutes     either nap <25 or nap for 70-90min  
You can use naps to make up for lost sleep 2   yes, to a degree  
avoid naps in the evening 2   leave at least 8h before your bed time else you risk compromising night sleep quality or sleep latency  
Medical solutions        
see a doctor after symptoms (depression, acid reflux, asthma, medications, headaches) 2      
sleeping pills have side effects 2   yes, many are actually bad for sleep quality, and just make you forget you didn't get any sleep (rather than put you to sleep), also dependency and addiction  
sleep medications exist 2   yes; more useful for really messed up sleep patterns; see above Addiction can be avoided by tapering the dose off over the course of several days or weeks when you no longer need it - Kat
check your existing medications for insomnia side effects 2      
antihistamine with drowsiness side effects 1   reduces sleep latency but compromises sleep quality  
melatonin but see a doctor before doing anything high dose 2   melatonin + whitenoise/earplugs + sleep mask good combo for bad environments Melatonin has a fairly short half-life. Best effectiveness may be in taking it right before lights out. Start with small dose (300 micrograms) and slowly increase until most effective dose is found. - Kat
      melatonin also good for everything else, lots of health benefits  
Science!     melatonin is a chronobiotic and not a sleeping pill, gotta take it regularly at same time every night even if you don't plan on staying up (if you want to keep your schedule, that is)  
test by spending 2 weeks in a row; going to bed at the same time and recording when you wake up without an alarm feeling rested. ??   less valuable than just fixing lighting and taking melatonin for 2 weeks useful for therapists trying to track someone with a shifting circadian rhythm
consider allowing less sleep time (by trial) (don't expect to sleep for 9 hours or be frustrated if you don't sleep exactly that long) 1   can cut down 1 sleep cycle, and after 2 weeks body adapts (BUT THIS IS FROM HEALTHY 8.5h BASELINE and NOT from "already sleep deprived") By 'sleep cycle' do you mean REM cycle? - Kat
tracking QS     devices that measure eeg and eye movement most accurate  
polyphasic sleep cycles ?     I have never known anyone to be able to keep those up for very long without exhibiting signs of sleep deprivation. I consider it a 'do it if you have to, but avoid if possible'.
make sleep a priority on weekends (to recover from sleep debt)        
check up on your sleep quality over time and re-evaluate these details        
waking up groggy? Coffee, look at what point in your sleep cycle you are waking up, try the science suggestion, get more light to your eyes when you wake up. 1   sleep cycle calculations may help, but bright lighting more helpful If alarm going off during REM, try going to bed 45 minutes earlier. REM cycles every 1.5 hours - Kat
Sleep-walking, sleep-talking? 0   no evidence of treatments for sleep walking :( See a doctor? - Kat
daytime tiredness? (get more sleep) 2   obviously :) either night sleep or naps Or lay off the sugar and simple carbs - consuming these and nothing else can cause a blood sugar crash - Kat
afternoon sleepiness? (normal, take a break; get fresh air, eat something, get more light) 1   best to keep moving and on your feet if you want to 'walk off' the midday sleepiness period, keep core body temperature high (cold exposure, body movement)  
waking in the night? (can be normal, can be something wrong with your environment, try sleep tracking apps, there is one that records ambient sounds in the room while you are sleeping. Something might be making noise that you were unaware of, rats, possums, cars, devices) How do you feel during the day? If you feel fine then its normal wakeful cycles and don't worry about it 1   evidence says waking in middle of night is, like taking a siesta, just part of natural sleeping pattern for some people, probably depends on genetics, but also depends on circadian rhythmicity, age and environment (like night length, melatonin dose). Obviously could also just be drinking too much water.  
grinding teeth or clenching jaw? (reasonably common, reduce stress, use a mouthguard) 1   mouthguard is the main one, also reduce stress, be less hungry (improve diet)  
nightmares, strange dreams? (common, reduce stress, check for a dislocated rib or major sleep disturbance, become more busy or occupied during the day – having too much free time can leave your mind to not know what to churn about) 1   usually comes down to brain chemistry (can be related to diet, or medications/drugs, or genetics, or strange lifestyle)  
sleeping too much? (normal, reduce exercise if over-exerting yourself, improve other health areas, check for depression, check medication, consult medical professionals) 2   more than 10h of sleep regularly is either unhealthy in its own right, or a sign that you have or are developing a disease that causes sleep abnormality  
can't get to sleep? (normal, Check intake of stimulants, alcohol, disagreeable foods etc. check environment, check total sleep time, check if it actually matters, try visualisation or relaxation exercises)     I guess see all of the above yes, one night of insomnia is not clinical insomnia and nothing to worry about
can't get up at the right time? (get more sleep, get more light at that time, get out of bed really quick, then figure the rest out)        
most important question: is this strange seeming sleeping habit actually a problem? Does it bother you or anyone (who matters)? If no; don't change it.       agreed
Changing sleep by changing sleep hygiene takes time - allow ~2 weeks for any change to have an effect       Trying something for one night and then declaring, "this doesn't work!" is counterproductive. Stick with it. Change one thing at a time if an entire list of things seems overwhelming - Kat
If you try most things on this list for >2 weeks and still have daytime tiredness / poor sleep, see primary care doctor or neurologist       Doc will refer you to neurologist for simple testing to determine primary or secondary insomnia. Or you can just say, "I think it's stress (primary insomnia), can I try ambien / lunesta / whatever?" In the US, most docs start PTs on generic ambien - Kat
Get f.lux or redshift for all electronics that emit white light, also "night mode" and "twilight" for android 2   reduces blue and green light emissions  
Anyone does nightshift? different rules can apply 2     Reducing light in home before bed, and blocking as much light as possible in bedroom is absolutely necessary - Kat
Sleep posture, babies are different to adults 2   babies sleep on their backs to avoid SIDS, adults sleep on side or front to avoid sleep disorders and get best quality  
no midnight snacks 2   metabolizes food differently about 2 hours after dark or after melatonin administration  
jetlag 2   fast, and eat a big carb meal first thing in the 'morning' of your destination, can do this several days in preparation
take large dose of melatonin at the 'night' time of your destination, can do this several days in preparation
      slow release caffeine on the morning of arrival, can prepare the day beforehand  
      light therapy on morning of arrival, can prepare days beforehand  
      best thing is to do pre-flight adaptation, but this takes planning and commitment  
marijuana ?      
other drugs ?      
Modafinil ?      
Cool sleeping cap (one study recently) ?      


Meta: the original collection of this list took at least 10 hours, plus several other people's time to assess the quality of the suggestions.  From deciding to post this to post-ready took 2 hours.  

This post was finalised with the assistance of participants on the Slack chat.  

My table of contents, includes other posts of mine that might be of value.

Thanks to Kat and AdamK for their help with this post.

As per usual; any suggestions are welcome, and improvements would be appreciated, and I hope this helps you.  There will be a poll in the comments.

New positions and recent hires at the Centre for the Study of Existential Risk (Cambridge, UK)

9 Sean_o_h 13 October 2015 11:11AM

[Cross-posted from EA Forum. Summary: Four new postdoc positions at the Centre for the Study of Existential Risk: Evaluation of extreme technological risk (philosophy, economics); Extreme risk and the culture of science (philosophy of science); Responsible innovation and extreme technological risk (science & technology studies, sociology, policy, governance); and an academic project manager (cutting across the Centre’s research projects, and playing a central role in Centre development). Please help us to spread the word far and wide in the academic community!]


An inspiring first recruitment round

The Centre for the Study of Existential Risk (Cambridge, UK) has been making excellent progress in building up our research team. Our previous recruitment round was a great success, and we made three exceptional hires. Dr Shahar Avin joined us in September from Google, with a background in the philosophy of science (Cambridge, UK). He is currently fleshing out several potential research projects, which will be refined and finalised following a research visit to FHI later this month. Dr Yang Liu joined us this month from Columbia University, with a background in mathematical logic and philosophical decision theory. Yang will work on problems in decision theory that relate to long-term AI, and will help us to link the excellent work being done at MIRI with relevant expertise and talent within academia. In February 2016, we will be joined by Dr Bonnie Wintle from the Centre of Excellence for Biosecurity Risk Analysis (CEBRA), who will lead our horizon-scanning work in collaboration with Professor Bill Sutherland’s group at Cambridge; among other things, she has worked on IARPA-funded development of automated horizon-scanning tools, and has been involved in the Good Judgement Project.

We are very grateful for the help of the existential risk and EA communities in spreading the word about these positions, and helping us to secure an exceptionally strong field. Additionally, I have now moved on from FHI to be CSER’s full-time Executive Director, and Huw Price is now 50% funded as CSER’s Academic Director (we share him with Cambridge’s Philosophy Faculty, where he remains Bertrand Russell Chair of Philosophy).

Four new positions:

We’re delighted to announce four new positions at the Centre for the Study of Existential Risk; details are provided below. Unlike the previous round, where we invited project proposals from across our areas of interest, this time we have several specific positions to fill for our three-year Managing Extreme Technological Risk project, funded by the Templeton World Charity Foundation. As we are building up our academic brand within a traditional university, we expect to hire predominantly from academia, i.e. academic researchers with (or near to the completion of) PhDs. However, we are open to hiring excellent candidates without PhDs but with an equivalent and relevant level of expertise, gained for example in think tanks, policy settings or industry.

Three of these positions are in the standard academic postdoc mould, working on specific research projects. I’d like to draw attention to the fourth, the academic project manager. For this position, we are looking for someone with the intellectual versatility to engage across our research strands – someone who can coordinate these projects, synthesise and present our research to a range of audiences including funders, collaborators, policymakers and industry contacts. Additionally, this person will play a key role in developing the centre over the next two years, working with our postdocs and professorial advisors to secure funding, and contributing to our research, media, and policy strategy among other things. I’ve been interviewed in the past about the importance of roles of this nature; right now I see it as our biggest bottleneck, and a position in which an ambitious person could make a huge difference.

We need your help – again!

In some ways, CSER has been the quietest of the existential risk organisations of late – we’ve mainly been establishing research connections, running lectures and seminars, writing research grants and building relations with policymakers (plus some behind-the-scenes involvement with various projects). But we’ve been quite successful in these things, and now face an exciting but daunting level of growth: by next year we aim to have a team of 9-10 postdoctoral researchers here at Cambridge, plus senior professors and other staff. It’s very important we continue our momentum by getting world-class researchers motivated to do work of the highest impact. Reaching out and finding these people is quite a challenge, especially given our still-small team. So the help of the existential risk and EA communities in spreading the word – on your facebook feeds, on relevant mailing lists in your universities, passing them on to talented people you know – will make a huge difference to us.

Thank you so much!

Seán Ó hÉigeartaigh (Executive Director, CSER)


“The Centre for the Study of Existential Risk is delighted to announce four new postdoctoral positions for the subprojects below, to begin in January 2016 or as soon as possible afterwards. The research associates will join a growing team of researchers developing a general methodology for the management of extreme technological risk.

Evaluation of extreme technological risk will examine issues such as:

The use and limitations of approaches such as cost-benefit analysis when evaluating extreme technological risk; the importance of mitigating extreme technological risk compared to other global priorities; issues in population ethics as they relate to future generations; challenges associated with evaluating small probabilities of large payoffs; challenges associated with moral and evaluative uncertainty as they relate to the long-term future of humanity. Relevant disciplines include philosophy and economics, although suitable candidates outside these fields are welcomed. More: Evaluation of extreme technological risk

Extreme risk and the culture of science will explore the hypothesis that the culture of science is in some ways ill-adapted to successful long-term management of extreme technological risk, and investigate the option of ‘tweaking’ scientific practice, so as to improve its suitability for this special task. It will examine topics including inductive risk, use and limitations of the precautionary principle, and the case for scientific pluralism and ‘breakout thinking’ where extreme technological risk is concerned. Relevant disciplines include philosophy of science and science and technology studies, although suitable candidates outside these fields are welcomed. More: Extreme risk and the culture of science

Responsible innovation and extreme technological risk asks what can be done to encourage risk-awareness and societal responsibility, without discouraging innovation, within the communities developing future technologies with transformative potential. What can be learned from historical examples of technology governance and culture-development? What are the roles of different forms of regulation in the development of transformative technologies with risk potential? Relevant disciplines include science and technology studies, geography, sociology, governance, philosophy of science, plus relevant technological fields (e.g., AI, biotechnology, geoengineering), although suitable candidates outside these fields are welcomed. More: Responsible innovation and extreme technological risk

We are also seeking to appoint an academic project manager, who will play a central role in developing CSER into a world-class research centre. We seek an ambitious candidate with initiative and a broad intellectual range for a postdoctoral role combining academic and administrative responsibilities. The Academic Project Manager will co-ordinate and develop CSER’s projects and the Centre’s overall profile, and build and maintain collaborations with academic centres, industry leaders and policy makers in the UK and worldwide. This is a unique opportunity to play a formative research development role in the establishment of a world-class centre. More: CSER Academic Project Manager

Candidates will normally have a PhD in a relevant field or an equivalent level of experience and accomplishment (for example, in a policy, industry, or think tank setting). Application Deadline: Midday (12:00) on November 12th 2015.”

Aumann Agreement Game

9 abramdemski 09 October 2015 05:14PM

I've written up a rationality game which we played several times at our local LW chapter and had a lot of fun with. The idea is to put Aumann's agreement theorem into practice as a multi-player calibration game, in which players react to the probabilities which other players give (each holding some privileged evidence). If you get very involved, this implies reasoning not only about how well your friends are calibrated, but also how much your friends trust each other's calibration, and how much they trust each other's trust in each other.

You'll need a set of trivia questions to play. We used these.

The write-up includes a helpful scoring table which we have not play-tested yet. When we played, we used a plain Bayes loss rather than an adjusted Bayes loss, and calculated scores on our phone calculators. The table version should feel a lot better, because the numbers are easier to interpret and you get your score right away rather than calculating at the end.
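For readers who want to try this without the scoring table, here is a minimal sketch of the plain Bayes loss described above, assuming the standard logarithmic scoring rule (the write-up's actual table and any adjusted loss may differ; player names and the round structure here are illustrative):

```python
import math

def bayes_loss(prob_of_truth: float) -> float:
    """Logarithmic Bayes loss (in bits) for the probability a player
    assigned to what turned out to be the correct answer. Lower is
    better; announcing 1.0 on a correct answer scores a perfect 0."""
    return -math.log2(prob_of_truth)

def score_round(announcements: dict, statement_was_true: bool) -> dict:
    """Score one round of the game.

    `announcements` maps player -> announced probability (in (0, 1))
    that the trivia statement is true. Each player is penalized by the
    Bayes loss of the probability they put on the actual outcome.
    """
    return {
        player: bayes_loss(p if statement_was_true else 1.0 - p)
        for player, p in announcements.items()
    }

# Example round: the statement turned out to be true, so the most
# confident player takes the smallest loss.
losses = score_round({"alice": 0.8, "bob": 0.6, "carol": 0.3},
                     statement_was_true=True)
```

Summing these losses across rounds gives a running score; the Aumann-flavored part of the game is that later announcers can update on earlier players' stated probabilities before committing to their own.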

[Link] 2015 modafinil user survey

9 gwern 26 September 2015 05:28PM

I am running, in collaboration with ModafinilCat, a survey of modafinil users asking about their experiences, side-effects, sourcing, efficacy, and demographics:

This is something of a followup to the LW surveys which find substantial modafinil use, and Yvain's 2014 nootropics survey. I hope the results will be useful; the legal questions should help reduce uncertainty there, and the genetics questions (assuming any responses) may be interesting too.

View more: Next