Less Wrong is a community blog devoted to refining the art of human rationality. Please visit our About page for more information.

My Kind of Moral Responsibility

0 Gram_Stone 02 May 2016 05:54AM

The following is an excerpt of an exchange between Julia Galef and Massimo Pigliucci, from the transcript for Rationally Speaking Podcast episode 132:

Massimo: [cultivating virtue and 'doing good' locally 'does more good' than directly eradicating malaria]

Julia: [T]here's lower hanging fruit [in the developing world than there is in the developed world]. By many orders of magnitude, there's lower hanging fruit in terms of being able to reduce poverty or disease or suffering in some parts of the world than other parts of the world. In the West, we've picked a lot of the low hanging fruit, and by any sort of reasonable calculation, it takes much more money to reduce poverty in the West -- because we're sort of out in the tail end of having reduced poverty -- than it does to bring someone out of poverty in the developing world.

Massimo: That kind of reasoning brings you quickly to the idea that everybody here is being a really really bad person because they spent money for coming here to NECSS listening to us instead of saving children on the other side of the world. I resist that kind of logic.

Massimo (to the audience): I don't think you guys are that bad! You see what I mean?

I see a lot of people, including bullet-biters, who feel a lot of internal tension, and even guilt, because of this apparent paradox.

Utilitarians usually stop at the question, "Are the outcomes different?"

Clearly, they aren't. But people still feel tension, so it must not be enough to believe that a world where some people are alive is better than a world where those very people are dead. The confusion has not evaporated in a puff of smoke, as we should expect.

After all, imagine a different gedanken where a virtue ethicist and a utilitarian each stand in front of a user interface, with each interface bearing only one shiny red button. Omega tells each, "If you press this button, then you will prevent one death. If you do not press this button, then you will not prevent one death."

There would be no disagreement. Both of them would press their buttons without a moment of hesitation.

So, in a certain sense, it's not only a question of which outcome is better. The repugnant part of the conclusion is the implication for our intuitions about moral responsibility. It's intuitive that you should save ten lives instead of one, but it's counterintuitive that the one who permits death is just as culpable as the one who causes death. You look at ten people who are alive when they could be dead, and it feels right to say that it is better that they are alive than that they are dead, but you juxtapose a murderer and your best friend who is not an ascetic, and it feels wrong to say that the one is just as awful as the other.

The virtue-ethical response is to say that the best friend has lived a good life and the murderer has not. Of course, I don't think that anyone who says this has done any real work.

So, if you passively don't donate every cent of discretionary income to the most effective charities, then are you morally culpable in the way that you would be if you had actively murdered everyone that you chose not to save who is now dead?

Well, what is moral responsibility? Hopefully we all know that there is not one culpable atom in the universe.

Perhaps the most concrete version of this question is: what happens, cognitively, when we evaluate whether or not someone is responsible for something? What's the difference between situations where we consider someone responsible and situations where we don't? What happens in the brain when we do these things? How do different attributions of responsibility change our judgments and decisions?

Most research on feelings has focused only on valence, how positiveness and negativeness affect judgment. But there's clearly a lot more to this: sadness, anger, and guilt are all negative feelings, but they're not all the same, so there must be something going on beyond valence.

One hypothesis is that the differences between sadness, anger, and guilt reflect different appraisals of agency. When we are sad, we haven't attributed the cause of the inciting event to an agent; the cause is situational, beyond human control. When we are angry, we've attributed the cause of the event to the actions of another agent. When we are guilty, we've attributed the cause of the event to our own actions.

(It's worth noting that there are many more types of appraisal than this, many more emotions, and many more feelings beyond emotions, but I'm going to focus on negative emotions and appraisals of agency for the sake of brevity. For a review of proposed appraisal types, see Demir, Desmet, & Hekkert (2009). For a review of emotions in general, check out Ortony, Clore, & Collins' The Cognitive Structure of Emotions.)

So, what's it look like when we narrow our attention to specific feelings on the same side of the valence spectrum? How are judgments affected when we only look at, say, sadness and anger? Might experiments based on these questions provide support for an account of our dilemma in terms of situational appraisals?

In one experiment, Keltner, Ellsworth, & Edwards (1993) found that sad subjects consider events with situational causes more likely than events with agentic causes, and that angry subjects consider events with agentic causes more likely than events with situational causes. In a second experiment in the same study, they found that sad subjects are more likely to consider situational factors as the primary cause of an ambiguous event than agentic factors, and that angry subjects are more likely to consider agentic factors as the primary cause of an ambiguous event than situational factors.

Perhaps unsurprisingly, watching someone commit murder and merely knowing that someone could have prevented a death on the other side of the world through an unusual effort make very different things happen in our brains. I expect that even the utilitarians are biting a fat bullet; that even the utilitarians feel the tension, the counterintuitiveness, when utilitarianism leads them to conclude that indifferent bystanders are just as bad as murderers. Intuitions are strong, and I hope that a few more utilitarians can understand why utilitarianism is just as repugnant to a virtue ethicist as virtue ethics is to a utilitarian.

My main thrust here is that "Is a bystander as morally responsible as a murderer?" is a wrong question. You're always secretly asking another question when you ask that question, and the answer often doesn't have the word 'responsibility' anywhere in it.

Utilitarians replace the question with, "Do indifference and evil result in the same consequences?" They answer, "Yes."

Virtue ethicists replace the question with, "Does it feel like indifference is as 'bad' as 'evil'?" They answer, "No."

And the one thinks, in too little detail, "They don't think that bystanders are just as bad as murderers!", and likewise, the other thinks, "They do think that bystanders are just as bad as murderers!".

And then the one and the other proceed to talk past one another for a period of time during which millions more die.

As you might expect, I must confess to a belief that the utilitarian is often the one less confused, so I will speak to that one henceforth.

As a special kind of utilitarian, the kind that frequents this community, you should know that, if you take the universe, and grind it down to the finest powder, and sieve it through the finest sieve, then you will not find one agentic atom. If you only ask the question, "Has the virtue ethicist done the moral thing?", and you silently reply to yourself, "No.", and your response is to become outraged at this, then you have failed your Art on two levels.

On the first level, you have lost sight of your goal. As if your goal were to find out whether or not someone has done the moral thing! Your goal is to cause them to commit the moral action. By your own lights, if you fail to be as creative as you can possibly be in your attempts at persuasion, then you're just as culpable as someone who purposefully turned someone away from utilitarianism as a normative-ethical position. And if all you do is scorn the virtue ethicists, instead of engaging with them, then you're definitely not being very creative.

On the second level, you have failed to apply your moral principles to yourself. You have not considered that the utility-maximizing action might be something besides getting righteously angry, even if that's the easiest thing to do. And believe me, I get it. I really do understand that impulse.

And if you are that sort of utilitarian who has come to such a repugnant conclusion epistemically, but who has failed to meet your own expectations instrumentally, then be easy now. For there is no longer a question of 'whether or not you should be guilty'. There are only questions of what guilt is used for, and whether or not that guilt ends more lives than it saves.

All of this is not to say that 'moral outrage' is never the utility-maximizing action. I'm at least a little outraged right now. But in the beginning, all you really wanted was to get rid of naive notions of moral responsibility. The action to take in this situation is not to keep them in some places and toss them in others.

Throw out the bath water, and the baby, too. The virtue ethicists are expecting it anyway.


Demir, E., Desmet, P. M. A., & Hekkert, P. (2009). Appraisal patterns of emotions in human-product interaction. International Journal of Design, 3(2), 41-51.

Keltner, D., Ellsworth, P., & Edwards, K. (1993). Beyond simple pessimism: Effects of sadness and anger on social perception. Journal of Personality and Social Psychology, 64, 740-752.

Ortony, A., Clore, G. L., & Collins, A. (1990). The Cognitive Structure of Emotions. (1st ed.).

Open Thread May 2 - May 8, 2016

2 Elo 02 May 2016 02:43AM

If it's worth saying, but not worth its own post (even in Discussion), then it goes here.

Notes for future OT posters:

1. Please add the 'open_thread' tag.

2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)

3. Open Threads should be posted in Discussion, and not Main.

4. Open Threads should start on Monday, and end on Sunday.

A Second Year of Spaced Repetition Software in the Classroom

9 tanagrabeast 01 May 2016 10:14PM

This is a follow-up to last year's report. Here, I will talk about my successes and failures using Spaced Repetition Software (SRS) in the classroom for a second year. The year's not over yet, but I have reasons for reporting early that should become clear in a subsequent post. A third post will then follow, and together these will constitute a small sequence exploring classroom SRS and the adjacent ideas that bubble up when I think deeply about teaching.


I experienced net negative progress this year in my efforts to improve classroom instruction via spaced repetition software. While this is mostly attributable to shifts in my personal priorities, I have also identified a number of additional failure modes for classroom SRS, as well as additional shortcomings of Anki for this use case. My experiences also showcase some fundamental challenges to teaching-in-general that SRS depressingly spotlights without being any less susceptible to. Regardless, I am more bullish than ever about the potential for classroom SRS, and will lay out a detailed vision for what it can be in the next post.


May 2016 Media Thread

1 ArisKatsaris 01 May 2016 09:27PM

This is the monthly thread for posting media of various types that you've found that you enjoy. Post what you're reading, listening to, watching, and your opinion of it. Post recommendations to blogs. Post whatever media you feel like discussing! To see previous recommendations, check out the older threads.


  • Please avoid downvoting recommendations just because you don't personally like the recommended material; remember that liking is a two-place word. If you can point out a specific flaw in a person's recommendation, consider posting a comment to that effect.
  • If you want to post something that (you know) has been recommended before, but have another recommendation to add, please link to the original, so that the reader has both recommendations.
  • Please post only under one of the already created subthreads, and never directly under the parent media thread.
  • Use the "Other Media" thread if you believe the piece of media you want to discuss doesn't fit under any of the established categories.
  • Use the "Meta" thread if you want to discuss the monthly media thread itself (e.g. to propose adding/removing/splitting/merging subthreads, or to discuss the type of content properly belonging to each subthread) or for any other question or issue you may have about the thread or the rules.

Hedge drift and advanced motte-and-bailey

10 Stefan_Schubert 01 May 2016 02:45PM

Motte and bailey is a technique by which one protects an interesting but hard-to-defend view by making it similar to a less interesting but more defensible position. Whenever the more interesting position (the bailey) is attacked, one retreats to the more defensible one (the motte), but when the attackers are gone, one expands again to the bailey.

In that case, one and the same person switches between two interpretations of the original claim. Here, I rather want to focus on situations where different people make different interpretations of the original claim. The originator of the claim adds a number of caveats and hedges to their claim, which makes it more defensible, but less striking and sometimes also less interesting.* When others refer to the same claim, however, the caveats and hedges gradually disappear, making it more and more bailey-like.

A salient example of this is that scientific claims (particularly in messy fields like psychology and economics) often come with a number of caveats and hedges, which tend to get lost when re-told. This is especially so when media writes about these claims, but even other scientists often fail to properly transmit all the hedges and caveats that come with them.

Since this happens over and over again, people probably do expect their hedges to drift to some extent. Indeed, it would not surprise me if some people actually want hedge drift to occur. Such a strategy effectively amounts to a more effective, because less observable, version of the motte-and-bailey-strategy. Rather than switching back and forth between the motte and the bailey - something which is at least moderately observable, and also usually relies on some amount of vagueness, which is undesirable - you let others spread the bailey version of your claim, whilst you sit safe in the motte. This way, you get what you want - the spread of the bailey version - in a much safer way.

Even when people don't use this strategy intentionally, you could argue that they should expect hedge drift, and that omitting to take action against it is, if not outright intellectually dishonest, then at least approaching that. This argument would rest on the consequentialist notion that if you have strong reasons to believe that some negative event will occur, and you could prevent it from happening by fairly simple means, then you have an obligation to do so. I certainly do think that scientists should do more to prevent their views from being garbled via hedge drift.

Another way of expressing all this is by saying that when including hedging or caveats, scientists often seem to seek plausible deniability ("I included these hedges; it's not my fault if they were misinterpreted"). They don't actually try to prevent their claims from being misunderstood. 

What concrete steps could one then take to prevent hedge-drift? Here are some suggestions. I am sure there are many more.

  1. Many authors use eye-catching, hedge-free titles and/or abstracts, and then only include hedges in the paper itself. This is a recipe for hedge-drift and should be avoided.
  2. Make abundantly clear, preferably in the abstract, just how dependent the conclusions are on key assumptions. Say this not in a way that enables you to claim plausible deniability in case someone misinterprets you, but in a way that actually reduces the risk of hedge-drift as much as possible.
  3. Explicitly caution against hedge drift, using that term or a similar one, in the abstract of the paper.

* Edited 2/5 2016. By hedges and caveats I mean terms like "somewhat" ("x reduces y somewhat"), "slightly", etc, as well as modelling assumptions without which the conclusions don't follow and qualifications regarding domains in which the thesis doesn't hold.

2016 LessWrong Diaspora Survey Results

16 ingres 01 May 2016 08:26AM


As we wrap up the 2016 survey, I'd like to start by thanking everybody who took
the time to fill it out. This year we had 3060 respondents, more than twice the
number we had last year. (Source: http://lesswrong.com/lw/lhg/2014_survey_results/)
This seems consistent with the hypothesis that the LW community hasn't declined
in population so much as migrated into different communities. Being the *diaspora*
survey I had expectations for more responses than usual, but twice as many was
far beyond them.

Before we move on to the survey results, I feel obligated to put a few affairs
in order in regards to what should be done next time. The copyright situation
for the survey was ambiguous this year, and to prevent that from happening again
I'm pleased to announce that this year's survey questions will be released jointly
by me and Scott Alexander as Creative Commons licensed content. We haven't
finalized the details of this yet so expect it sometime this month.

I would also be remiss not to mention the large amount of feedback we received
on the survey, some of which led to actionable recommendations I'm going to
preserve here for whoever does it next:

- Put free response form at the very end to suggest improvements/complain.

- Fix metaethics question in general, lots of options people felt were missing.

- Clean up definitions of political affiliations in the short politics section.
  In particular, 'Communist' has an overly aggressive/negative definition.

- Possibly completely overhaul short politics section.

- Everywhere that a non-answer is taken as an answer should be changed so that
  non answer means what it ought to, no answer or opinion. "Absence of a signal
  should never be used as a signal." - Julian Bigelow, 1947

- Give a definition for the singularity on the question asking when you think it
  will occur.

- Ask if people are *currently* suffering from depression. Possibly add more
  probing questions on depression in general since the rates are so extraordinarily
  high.

- Include a link to what cisgender means on the gender question.

- Specify if the income question is before or after taxes.

- Add charity questions about time donated.

- Add "ineligible to vote" option to the voting question.

- Adding some way for those who are pregnant to indicate it on the number of
  children question would be nice. It might be onerous however so don't feel
  obligated. (Remember that it's more important to have a smooth survey than it
  is to catch every edge case.)

And read this thread: http://lesswrong.com/lw/nfk/lesswrong_2016_survey/,
it's full of suggestions, corrections and criticism.

Without further ado,

Basic Results:

2016 LessWrong Diaspora Survey Questions (PDF Format)

2016 LessWrong Diaspora Survey Results (PDF Format)

Our report system is currently on the fritz and isn't calculating numeric questions. If I'd known this earlier I'd have prepared the results for said questions ahead of time. Instead they'll be coming out later today or tomorrow.

2016 LessWrong Diaspora Survey Public Dataset

(Note for people looking to work with the dataset: My survey analysis code repository includes a sqlite converter, examples, and more coming soon. It's a great way to get up and running with the dataset really quickly.)

In depth analysis:

Effective Altruism and Charitable Giving Analysis

Mental Health Stats By Diaspora Community (Including self dxers)

How Diaspora Communities Compare On Mental Health Stats (I suspect these charts are subtly broken somehow, will investigate later)

Political Opinions By Political Affiliation

More coming soon!

Survey Analysis Code

Some notes:

1. FortForecast on the communities section, Bayesed And Confused on the blogs section, and Synthesis on the stories section were all 'troll' answers designed to catch people who just put down everything. Somebody noted that the three 'fortforecast' users had the entire DSM split up between them, that's why.

2. Lots of people asked me for a list of all those cool blogs and stories and communities on the survey; they're included in the survey questions PDF above.

Public TODO:

1. Fix the report system or perform the calculations manually.

2. Add more in depth analysis, and fix the ones that suddenly broke at the last minute or that I suspect were always broken.

3. Finish public data release function and release public dataset.

4. See if I can make sense of the calibration questions.

5. Add a compatibility mode so that the current question codes are converted to older ones for 3rd party analyses that rely on them.

If anybody would like to help with these, write to jd@fortforecast.com

The 'why does it even tell me this' moment

2 Romashka 01 May 2016 08:15AM

I thought one day that it would be helpful to develop a habit of seeking implications of what one is reading right afterwards - to have a default setting of evaluating and making predictions based on it. But I did not go writing up internal dialogues. As soon as I cast about for a book to test, my laziness dredged up the fact that I do not work in the same industry as most people here, and so any 'analysis' wouldn't be that useful/engaging.

(Still true)

So here's an anecdote, and I hope people would add their observations in the comments.

I find it easier to 'butt at' a book when it engenders some personal attitude - 'oh come ON this is nuts', 'and I imagine you tried it yourself', 'spot on, just let me show this to the guys', 'sure d'Artagnan wasn't actually made of iron, but', 'three-dimensional reconstructions just put it all in a new light', 'can I substitute cashew with walnuts here' etc.

Yet some stuff just sucks you in over your head and doesn't leave you any freedom of mental movement. There should be structural properties common to the more 'training' books which help the reader to disengage enough to at least wonder, right? Even if it's not in itself sufficient to make one test ideas.

Specifically, it feels entirely natural to question outdated sources on widely defined subjects with some structure imposed by necessity which had been actually used by a number of people until better books came out. Even more specifically, let's look at a now-somewhat-above-highschool-level Russian plant identification text from 1948.*

It is written in clear words, uncluttered by the species found/acknowledged later and those Neistadt considered less important. That last was an assumption, but considering the (now obsolete) details provided for crops, the book was meant as a tool, a part of a statewide project, and it drives home the fact that it was written by someone quite different. I just...got annoyed the first time 'something went missing', then squinted at the beetroot production rates, and then finally realized 'oh, they treat this as economically urgent knowledge'. This added a level of interest, of puzzle. Being sufficiently removed from those times, and without a base in agriculture, I could only guess at what was added to round it up and what had measurable consequences, and what is even meaningfully true nowadays. And why.

So when Neistadt stated 'we don't have strong bees necessary to pollinate the pea, which is why it is usually self-fertilized', I thought 'but what bees do you need?!', 'must have shaped selection here throughout known history, probably lots of local varieties' and 'but most of them would have been replaced by now, more's the pity, wonder if there are any saved in the seed banks'. It was a sidenote in royally simple language, and it demanded more of me than if I read it most anywhere else, which kinda reminded me gently that all those other things from MAE are not less worthy and should be thought about. It's like... I guess when you read Darwin's travelling accounts... Only the structure forced by the need to, well, identify these blessed plants while we're at it, makes for tidier bites of insight.

Certainly there should be other combinations of question-inducing traits, and things within one's domain of competence are much easier to digest. And yet... Perhaps this can be learned.

* Ф. Нейштадт. Определитель растений. - Учпедгиз, 1948. - 476 с.

[LINK] Updating Drake's Equation with values from modern astronomy

6 DanArmak 30 April 2016 10:08PM

A paper published in Astrobiology: A New Empirical Constraint on the Prevalence of Technological Species in the Universe (PDF), by A. Frank and W.T. Sullivan.

From the abstract:

Recent advances in exoplanet studies provide strong constraints on all astrophysical terms in the Drake equation. [...] We find that as long as the probability that a habitable zone planet develops a technological species is larger than ~ 10^-24, humanity is not the only time technological intelligence has evolved.

They say we now know with reasonable certainty the total number of stars ever to exist (in the observable universe), and the average number of planets in the habitable zone. But we still don't know the probabilities of life, intelligence, and technology arising. They call this cumulative unknown factor fbt.

Their result: for technological civilization to arise no more than once, with probability 0.01, in the lifetime of the observable universe, fbt should be no greater than ~ 2.5 x 10^-24.
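To make the arithmetic behind that threshold concrete, here is a minimal sketch. It reads the 0.01 as the expected number of technological species, A = N x fbt, where N is the number of habitable-zone planets ever to exist. That reading, the back-derived value of N, and the Poisson approximation are all assumptions of this sketch rather than figures quoted above.

```python
import math

# Values quoted in the post.
F_BT_THRESHOLD = 2.5e-24        # threshold value of fbt
EXPECTED_AT_THRESHOLD = 0.01    # read here as an expected count (assumption)

# Inferred, not quoted: the number of habitable-zone planets implied by
# the two values above, if the expected count is N * fbt.
N_HABITABLE = EXPECTED_AT_THRESHOLD / F_BT_THRESHOLD


def expected_species(f_bt, n_planets=N_HABITABLE):
    """Expected number of technological species ever to arise."""
    return n_planets * f_bt


def prob_at_least_one(f_bt, n_planets=N_HABITABLE):
    """Chance that at least one arises, using a Poisson approximation
    (an assumption: many planets, each with a tiny independent chance)."""
    return -math.expm1(-expected_species(f_bt, n_planets))


if __name__ == "__main__":
    print(f"implied habitable-zone planets: {N_HABITABLE:.2e}")
    for f_bt in (2.5e-24, 2.5e-22, 1e-20):
        print(f"fbt={f_bt:.1e}  expected={expected_species(f_bt):.3g}  "
              f"P(at least one)={prob_at_least_one(f_bt):.4f}")
```

Under these assumptions, any fbt a few orders of magnitude above the threshold makes at least one other technological species close to certain.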


It's convenient that they calculate the chance technological civilization ever arose, rather than the chance one exists now. This is just the number we need to estimate the likelihood of a Great Filter.

They state their result as "[if we set fbt ≤ 2.5 x 10^-24, then] in a statistical sense were we to rerun the history of the Universe 100 times, only once would a lone technological species occur". But I don't know what rerunning the Universe means. I also can't formulate this as saying "if we hadn't already observed the Universe to be apparently empty of life, we would expect it to contain or to have once contained life with a probability of 10^24", because that would ignore the chance that another civilization (if it counterfactually existed) would have affected or prevented the rise of life on Earth. Can someone help reformulate this?

I don't know if their modern values for star and planet formation have been used in previous discussions of the Fermi paradox or the Great Filter. (The papers they cite for their values date from 2012, 2013 and 2015.) I also don't know if these values should be trusted, or what concrete values had been used previously. People on top of the Great Filter discussion probably already updated when the astronomical data came in.

Weekly LW Meetups

1 FrankAdamek 29 April 2016 03:50PM

New meetups (or meetups with a hiatus of more than a year) are happening in:

Irregularly scheduled Less Wrong meetups are taking place in:

The remaining meetups take place in cities with regular scheduling, but involve a change in time or location, special meeting content, or simply a helpful reminder about the meetup:

Locations with regularly scheduled meetups: Austin, Berkeley, Berlin, Boston, Brussels, Buffalo, Canberra, Columbus, Denver, Kraków, London, Madison WI, Melbourne, Moscow, Mountain View, New Hampshire, New York, Philadelphia, Research Triangle NC, Seattle, Sydney, Tel Aviv, Toronto, Vienna, Washington DC, and West Los Angeles. There's also a 24/7 online study hall for coworking LWers and a Slack channel for daily discussion and online meetups on Sunday night US time.


Double Corrigibility: better Corrigibility

4 Stuart_Armstrong 28 April 2016 02:46PM

A putative new idea for AI control; index here.

Corrigibility was an attempt to allow an AI to safely change its values, without seeking to provoke or avoid the change. The idea is that, when the AI's utility changes from u to v at time t, it maximises a meta-utility U such that

  • U_{≤t} = u
  • U_{>t} = v + E(u|u→u) - E(v|u→v).

Here u→u designates the event that u remains unchanged, while u→v designates the change.

As has been shown, adding those expectation terms means the AI will not seek to provoke or resist such a utility change - it is indifferent to the change. Those expectation terms can be designated compensatory rewards.
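The indifference claim can be checked with a toy calculation. This is only a sketch: the function name and the numbers are hypothetical, and the two expectations are passed in as plain floats rather than computed by any real estimator E.

```python
def expected_meta_utility(changed, e_u_if_unchanged, e_v_if_changed):
    """Expected value of the meta-utility U, as estimated before time t.

    e_u_if_unchanged stands for E(u|u→u) and e_v_if_changed for E(v|u→v).
    Both are illustrative inputs, not outputs of a real estimator.
    """
    if not changed:
        # No change: the AI just keeps maximising u.
        return e_u_if_unchanged
    # Change: the AI gets v plus the compensatory reward.
    compensatory_reward = e_u_if_unchanged - e_v_if_changed
    return e_v_if_changed + compensatory_reward


if __name__ == "__main__":
    for e_u, e_v in [(10.0, 3.0), (1.0, 50.0), (0.0, 0.0)]:
        stay = expected_meta_utility(False, e_u, e_v)
        switch = expected_meta_utility(True, e_u, e_v)
        print(f"E(u|u→u)={e_u}, E(v|u→v)={e_v}: stay={stay}, switch={switch}")
```

Whatever the two expectations are, the expected meta-utility comes out the same with or without the change, which is why the AI gains nothing by provoking or resisting it.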

What it is not indifferent to, though, is changes to E. If the AI knows that change u→v is coming, then it has incentives to change E to E' with a high E'(u|u→u) and a low E'(v|u→v). Its ability to rewire is limited by the fact that E' will serve as its future expectation estimator, so it won't mangle it completely, but it certainly can twist it a bit.

Notice that the AI is now no longer indifferent to the change of utility. It can benefit from rewiring E this way only if a change is coming up. Notice that E' probably can't rewire itself any more, so the rewired AI will be indifferent, but not before rewiring.


Double corrigibility

To solve this problem, we need to turn to corrigibility a second time. Let ρ be the AI's current expectation/probability estimator. Let ρ' be one that it can conceivably change into. Let C be a random variable designating the sum total of all compensatory rewards. Now, to guarantee that C is finite would require bounds on all the utility functions and some discount rate. In practice, it often will be finite. That's because E(C) = E(u_0|u_0→u_0) - E(u_n|u_{n-1}→u_n, H), where u_0 is the AI's first utility, u_n its last, and H intermediate history. So some milder restrictions on the u_i should suffice to bound C.

Then, whenever the AI takes an action that changes ρ to ρ', it gets the special meta-compensatory reward

  • E_ρ(C|ρ→ρ) - E_ρ(C|ρ→ρ').

This prevents it from deriving any advantage to C via changes to ρ. Or, more precisely, it prevents it from deriving any expected advantage, that it can figure out using its current ρ.

Note that it doesn't prevent improvements to ρ in general, it just removes the incentives to change it to affect compensatory rewards.

Since any action a might have an indirect effect on ρ, the "utility maximising" for a given u must be changed to:

  • E_ρ(u|a) + Σ_{ρ'} P_ρ(ρ→ρ'|a) (E_ρ(C|ρ→ρ) - E_ρ(C|ρ→ρ')),

where P_ρ is the probability estimate corresponding to ρ; the probability term can be rewritten as E_ρ(I_{ρ→ρ'}) for I_{ρ→ρ'} the indicator function for ρ→ρ'. In fact the whole line above can be rewritten as

  • E_ρ(u|a) + E_ρ(E_ρ(C|ρ→ρ) - E_ρ(C|ρ→ρ') | a).

For this to work, E_ρ needs to be able to say sensible things about itself, and also about E_{ρ'}, which is used to estimate C if ρ→ρ'.
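As a rough illustration of the modified action value, here is a toy version with a finite set of candidate estimators. The dictionary layout, the estimator labels, and all numbers are hypothetical; in particular, the expected compensatory rewards are simply given, whereas the real difficulty is having the current estimator produce them.

```python
def action_value(e_u_given_a, transition_probs, expected_c, current="rho"):
    """Expected u under action a, plus the expected meta-compensatory reward.

    transition_probs maps each candidate estimator label to the current
    estimator's probability that action a leads to it (including staying put).
    expected_c maps each label to the current estimator's expectation of C
    given that transition. Labels and values are illustrative only.
    """
    baseline = expected_c[current]  # expectation of C if the estimator stays unchanged
    meta_reward = sum(
        prob * (baseline - expected_c[label])
        for label, prob in transition_probs.items()
    )
    return e_u_given_a + meta_reward


if __name__ == "__main__":
    expected_c = {"rho": 1.0, "rho_prime": 3.0}

    # An action that leaves the estimator alone gets no adjustment.
    print(action_value(2.0, {"rho": 1.0}, expected_c))

    # An action with a 50% chance of rewiring to rho_prime would raise
    # expected C by 1.0; the meta-compensatory term subtracts exactly that.
    print(action_value(2.0, {"rho": 0.5, "rho_prime": 0.5}, expected_c))
```

In the second case the expected gain in C from rewiring (0.5 x 3.0 + 0.5 x 1.0 - 1.0 = 1.0) is cancelled by a meta-compensatory term of -1.0, so as far as compensatory rewards go, the AI is no better off for having changed its estimator.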

If we compare this with various ways of factoring out variables, we can see that it's a case where we have a clear default, ρ, and are estimating deviations from that.

Is the average ethical review board ethical from a utilitarian standpoint?

3 ChristianKl 27 April 2016 12:11PM
Many people argue that Facebook's study of how the emotions of its users changed depending on the emotional content of messages in their Facebook feed wouldn't have been approved by the average ethical review board because Facebook didn't seek informed consent for the experiment.

Is the harm that the average ethical review board prevents less than the harm that it causes by preventing research from happening? Are principles such as requiring informed consent from all research participants justifiable from a utilitarian perspective?

JFK was not assassinated: prior probability zero events

17 Stuart_Armstrong 27 April 2016 11:47AM

A lot of my work involves tweaking the utility or probability of an agent to make it believe - or act as if it believed - impossible or almost impossible events. But we have to be careful about this; an agent that believes the impossible may not be so different from one that doesn't.

Consider for instance an agent that assigns a prior probability of zero to JFK ever having been assassinated. No matter what evidence you present to it, it will go on disbelieving the "non-zero gunmen theory".

Initially, the agent will behave very unusually. If it was in charge of JFK's security in Dallas before the shooting, it would have sent all secret service agents home, because no assassination could happen. Immediately after the assassination, it would have disbelieved everything. The films would have been faked or misinterpreted; the witnesses, deluded; the dead body of the president, that of a twin or an actor. It would have had huge problems with the aftermath, trying to reject all the evidence of death, seeing a vast conspiracy to hide the truth of JFK's non-death, including the many other conspiracy theories that must be false flags, because they all agree with the wrong statement that the president was actually assassinated.

But as time went on, the agent's behaviour would start to become more and more normal. It would realise the conspiracy was incredibly thorough in its faking of the evidence. All avenues it pursued to expose them would come to naught. It would stop expecting people to come forward and confess the joke, and it would stop expecting to find radical new evidence overturning the accepted narrative. After a while, it would start to expect the next new piece of evidence to be in favour of the assassination idea - because if a conspiracy has been faking things this well so far, then it should continue to do so in the future. Though it cannot change its view of the assassination, its expectations for observations converge towards the norm.

If it does a really thorough investigation, it might stop believing in a conspiracy at all. At some point, a miracle will start to become more likely than a perfect but undetectable conspiracy. It is very unlikely that Lee Harvey Oswald shot at JFK, missed, and the president's head exploded simultaneously for unrelated natural causes. But after a while, such a miraculous explanation will start to become more likely than anything else the agent can consider. This explanation opens the possibility of miracles; but again, if the agent is very thorough, it will fail to find evidence of other miracles, and will probably settle on "an unrepeatable miracle caused JFK's death in a way that is physically undetectable".

But then note that such an agent will have a probability distribution over future events that is almost indistinguishable from a normal agent that just believes the standard story of JFK being assassinated. The zero-prior has been negated, not in theory but in practice.
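The convergence can be illustrated with a toy Bayesian model. This is only a sketch: the three hypotheses, their priors and the likelihoods below are made-up assumptions, and every observation is taken to be a piece of evidence that points towards the assassination.

```python
def posterior(prior, likelihoods, observation_count):
    """Bayes update after `observation_count` pieces of pro-assassination evidence."""
    weights = {h: prior[h] * likelihoods[h] ** observation_count for h in prior}
    total = sum(weights.values())
    return {h: w / total for h, w in weights.items()}


def predictive(post, likelihoods):
    """Probability that the next piece of evidence also supports the assassination."""
    return sum(post[h] * likelihoods[h] for h in post)


# Hypothetical chance that a single piece of evidence points to an assassination.
LIKELIHOODS = {
    "assassinated": 0.99,
    "alive_no_coverup": 0.01,
    "alive_perfect_coverup": 0.99,  # stands in for the conspiracy or the miracle
}
# The denier gives the assassination prior probability zero; the normal agent does not.
DENIER_PRIOR = {"assassinated": 0.0, "alive_no_coverup": 0.999, "alive_perfect_coverup": 0.001}
NORMAL_PRIOR = {"assassinated": 0.5, "alive_no_coverup": 0.4995, "alive_perfect_coverup": 0.0005}

for n in (0, 1, 5, 10):
    denier = posterior(DENIER_PRIOR, LIKELIHOODS, n)
    normal = posterior(NORMAL_PRIOR, LIKELIHOODS, n)
    print(n, denier["assassinated"], predictive(denier, LIKELIHOODS), predictive(normal, LIKELIHOODS))
```

The denier's posterior on the assassination stays at exactly zero however much evidence arrives, while its prediction for the next observation approaches that of the normal agent.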


How to do proper probability manipulation

This section is still somewhat a work in progress.

So the agent believes one false fact about the world, but its expectation is otherwise normal. This can be both desirable and undesirable. The negative is if we try and control the agent forever by giving it a false fact.

To see the positive, ask why would we want an agent to believe impossible things in the first place? Well, one example was an Oracle design where the Oracle didn't believe its output message would ever be read. Here we wanted the Oracle to believe the message wouldn't be read, but not believe anything else too weird about the world.

In terms of causality, if X designates the message being read at time t, and B and A are events before and after t, respectively, we want P(B|X)≈P(B) (probabilities about current facts in the world shouldn't change much) while P(A|X)≠P(A) is fine and often expected (the future should be different if the message is read or not).

In the JFK example, the agent eventually concluded "a miracle happened". I'll call this miracle a scrambling point. It's kind of a breakdown in causality: two futures are merged into one, given two different pasts. The two pasts are "JFK was assassinated" and "JFK wasn't assassinated", and their common scrambled future is "everything appears as if JFK was assassinated". The non-assassination belief has shifted the past but not the future.

For the Oracle, we want to do the reverse: we want the non-reading belief to shift the future but not the past. However, unlike the JFK assassination, we can try and build the scrambling point. That's why I always talk about messages going down noisy wires, or specific quantum events, or chaotic processes. If the past goes through a truly stochastic event (it doesn't matter whether there is true randomness or just that the agent can't figure out the consequences), we can get what we want.
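A toy joint distribution shows what a built scrambling point buys us. The sketch below assumes three binary variables with made-up probabilities: a past fact B, the reading event X decided by a noisy process that ignores B, and a future event A that depends on X. None of these numbers come from the original design.

```python
from itertools import product

P_B1 = 0.3                        # hypothetical probability of the past fact B
P_X1 = 0.9                        # hypothetical probability that the message is read
P_A1_GIVEN_X = {0: 0.1, 1: 0.8}   # the future event A depends on whether it was read


def joint(b: int, x: int, a: int) -> float:
    """Joint probability of (B=b, X=x, A=a) in the toy model."""
    p_b = P_B1 if b else 1 - P_B1
    p_x = P_X1 if x else 1 - P_X1  # independent of b: this is the scrambling point
    p_a = P_A1_GIVEN_X[x] if a else 1 - P_A1_GIVEN_X[x]
    return p_b * p_x * p_a


def prob(event, given=lambda b, x, a: True) -> float:
    """P(event | given), computed by enumerating all eight outcomes."""
    outcomes = [o for o in product((0, 1), repeat=3) if given(*o)]
    total = sum(joint(*o) for o in outcomes)
    return sum(joint(*o) for o in outcomes if event(*o)) / total


def not_read(b, x, a):
    return x == 0


print("P(B)      =", prob(lambda b, x, a: b == 1))
print("P(B|X=0)  =", prob(lambda b, x, a: b == 1, given=not_read))
print("P(A)      =", prob(lambda b, x, a: a == 1))
print("P(A|X=0)  =", prob(lambda b, x, a: a == 1, given=not_read))
```

Conditioning on non-reading leaves the past untouched and shifts only the future, which is the pattern we want. If X depended on B in a way the agent could exploit, the first two numbers would differ.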

The Oracle idea will go wrong if the Oracle concludes that non-reading must imply something is different about the past (maybe it can see through chaos in ways we thought it couldn't), just as the JFK assassination denier will continue to be crazy if it can't find a route to reach "everything appears as if JFK was assassinated".

But there is a break in the symmetry: the JFK assassination denier will eventually reach that point as long as the world is complex and stochastic enough, while the Oracle requires that the future probabilities be the same in all (realistic) past universes.

Now, once the Oracle's message has been read, the Oracle will find itself in the same situation as the other agent: believing an impossible thing. For Oracles, we can simply reset them. Other agents might have to behave more like the JFK assassination disbeliever. Though if we're careful, we can quantify things more precisely, as I attempted to do here.

Suggest best book as an introduction to computational neuroscience

2 BiasedBayes 26 April 2016 09:16PM

I'm trying to find the best place to start learning the field. I have no special math background. I'm very eager to learn. Thanks a lot!


Exercise in rationality: popular quotes, revisited

1 jennabouche 25 April 2016 11:35PM

A friend recently shared an image of Lincoln with the quote, "Better to remain silent and be thought a fool than speak and remove all doubt."


Correcting that idea, I replied with the following: "Speak! Reveal your foolishness, and open yourself so that others may enlighten you and you can learn. Fear the false mantle of silence-as-wisdom; better to briefly be the vocal fool than forever the silent fool."


The experience led me to thinking that it might be fun, cathartic, and/or a good mental exercise/reminder to translate our culture's more irrational memes into a more presentable package.


Post your own examples if you like, and if I think of/see more I'll post here.

Sphere packing and logical uncertainty

3 sanxiyn 25 April 2016 06:02AM

I'm posting here since I don't see how to post to https://agentfoundations.org/.

Recently sphere packing was solved in dimension 24, and I read about it on Quanta Magazine. I found the following part of the article (paraphrased) fascinating.

Cohn and Kumar found that the best possible sphere packing in dimension 24 could be at most 0.0000000000000000000000000001 percent denser than the Leech lattice. Given this ridiculously close estimate, it seemed clear that the Leech lattice must be the best sphere packing in dimension 24.

This is clearly a kind of reasoning under logical uncertainty, and seems very reasonable. Most humans probably would reason similarly, even when they have no idea what the Leech lattice is.

Is this kind of reasoning covered by already known desiderata for logical uncertainty?

Open Thread April 25 - May 1, 2016

2 Elo 25 April 2016 06:02AM

If it's worth saying, but not worth its own post (even in Discussion), then it goes here.

Notes for future OT posters:

1. Please add the 'open_thread' tag.

2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)

3. Open Threads should be posted in Discussion, and not Main.

4. Open Threads should start on Monday, and end on Sunday.

[Link] Mutual fund fees

3 James_Miller 23 April 2016 10:09PM

An easy win for rationalists is to avoid actively managed mutual funds.  As a NYT article points out:   


"High fees, often hidden from view, are still enriching many advisers and financial services companies at the expense of ordinary people who are struggling to salt away savings....even for retirement accounts that are to be covered by the rules, many advisers are not required to act in their clients’ best interests. This means that they are legally entitled to look out for themselves first and recommend investments with higher fees, to the detriment of those who have asked for help....even when fund managers succeed in outperforming their peers in one year, they cannot easily repeat the feat in successive years, as many studies have shown. That’s why low-cost index funds, which merely mirror the performance of the market and don’t try to beat it, make a great deal of sense as a core investment....With fees included, the average actively managed fund in each of 29 asset categories — from those that invest in various sizes and styles of stocks to those that hold fixed-income instruments like government or municipal bonds — underperformed its benchmark over the decade through December. In other words, index funds outperformed the average actively managed fund in every single category....Investors who believe they have found honest and skillful advisers may still want to understand all of this. Not everyone truly has your best interest at heart."

My Custom Spelling Dictionary

3 Gram_Stone 23 April 2016 09:56PM

I looked at my custom spelling dictionary in Google Chrome, and thought custom spelling dictionaries in general might be a good place for you to look if you wonder what kinds of terms you'll have to explain to people to help them understand what you mean. If something's on your list, then you would probably have to provide an explanation of its usage to a given random individual from the world population.

Here's my list:

Share yours, too, if you'd like. Maybe something interesting or useful will come out of it. Maybe there will be patterns.

What is up with carbon dioxide and cognition? An offer

19 paulfchristiano 23 April 2016 05:47PM

One or two research groups have published work on carbon dioxide and cognition. The state of the published literature is confusing.

Here is one paper on the topic. The authors investigate a proprietary cognitive benchmark, and experimentally manipulate carbon dioxide levels (without affecting other measures of air quality). They find implausibly large effects from increased carbon dioxide concentrations.

If the reported effects are real and the suggested interpretation is correct, I think it would be a big deal. To put this in perspective, carbon dioxide concentrations in my room vary between 500 and 1500 ppm depending on whether I open the windows. The experiment reports on cognitive effects for moving from 600 to 1000 ppm, and finds significant effects compared to interindividual differences.

I haven't spent much time looking into this (maybe 30 minutes, and another 30 minutes to write this post). I expect that if we spent some time looking into indoor CO2 we could have a much better sense of what was going on, by some combination of better literature review, discussion with experts, looking into the benchmark they used, and just generally thinking about it.

So, here's a proposal:

  • If someone looks into this and writes a post that improves our collective understanding of the issue, I will be willing to buy part of an associated certificate of impact, at a price of around $100*N, where N is my own totally made up estimate of how many hours of my own time it would take to produce a similarly useful writeup. I'd buy up to 50% of the certificate at that price.
  • Whether or not they want to sell me some of the certificate, on May 1 I'll give a $500 prize to the author of the best publicly-available analysis of the issue. If the best analysis draws heavily on someone else's work, I'll use my discretion: I may split the prize arbitrarily, and may give it to the earlier post even if it is not quite as excellent.

Some clarifications:

  • The metric for quality is "how useful it is to Paul." I hope that's a useful proxy for how useful it is in general, but no guarantees. I am generally a pretty skeptical person. I would care a lot about even a modest but well-established effect on performance. 
  • These don't need to be new analyses, either for the prize or the purchase.
  • I reserve the right to resolve all ambiguities arbitrarily, and in the end to do whatever I feel like. But I promise I am generally a nice guy.
  • I posted this 2 weeks ago on the EA forum and haven't had serious takers yet.
(Thanks to Andrew Critch for mentioning these results to me and Jessica Taylor for lending me a CO2 monitor so that I could see variability in indoor CO2 levels. I apologize for deliberately not doing my homework on this post.)

The Validity of the Anthropic Principle

1 casebash 23 April 2016 09:12AM

In my last post, I wrote about how the anthropic principle was often misapplied, that it could not be used within a single model, but only for comparing two or more models. This post will explain why I think that the anthropic principle is valid in every case where we aren't making those mistakes.

There have been many probability problems discussed on this site and one popular viewpoint is that probabilities cannot be discussed as existing by themselves, but only as existing in relation to a series of bets. Imagine that there are two worlds: World A has 10 people and World B has 100. Both worlds have a prior probability of 50% of being correct. Is it the case that World B should instead be given 10:1 odds due to there being ten times the number of people and the anthropic principle? This sounds surprising, but I would say yes as you’d have to be paid 10 times as much from each person in World A who is correct in order for you to be indifferent between the two worlds. What this means is that if there is a bet that gains or loses you money according to whether you are in world A or world B, you should bet as though the probability of you being in world B is 10 times as high. That doesn’t quite show that the probability is 10:1, but it is rather close. I can’t actually remember the exact process/theorem for determining probabilities from betting odds. Can anyone link it to me?
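The break-even point of that bet can be checked with a few lines of arithmetic. This is a sketch under one explicit assumption that goes beyond the text: the bet is scored by the total payoff summed over every person in whichever world is real, with each person taking the same bet. The function name and stake sizes are hypothetical.

```python
def expected_total_payoff(
    win_if_b: float,
    loss_if_a: float,
    people_a: int = 10,
    people_b: int = 100,
    prior_b: float = 0.5,
) -> float:
    """Expected summed payoff when every inhabitant bets that they are in world B."""
    gain = prior_b * people_b * win_if_b         # world B is real: 100 people each win
    loss = (1 - prior_b) * people_a * loss_if_a  # world A is real: 10 people each lose
    return gain - loss


# Risking 10 units per person to win 1 unit per person is exactly break-even.
print(expected_total_payoff(win_if_b=1.0, loss_if_a=10.0))
# Even-money stakes are strongly favourable to the bettor on B.
print(expected_total_payoff(win_if_b=1.0, loss_if_a=1.0))
```

Under this scoring rule, indifference occurs at 10:1 stakes, which matches betting as though world B were ten times as likely.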

Another way to show that the anthropic principle is probably correct is to note that if world A had 0 people instead, then there would be a 100% chance of observing world B rather than world A. This doesn’t prove much, but it does prove that anthropic effects exist on some level.

Suppose now that world A has 1 person and world B has 1 million people. Maybe you aren’t convinced that you are more likely to observe world B. Let’s consider an equivalent formulation where world A has 1 person who is extremely unobservant and only has a 1 in a million chance of noticing the giant floating A in world A and the other world has a single person, but this time with a 100% chance of noticing the giant floating B in their world. I think it is clear that it is more likely for you to notice a giant floating B than an A.

One more formulation is to have world A have 10 humans and 90 cyborgs and world B have 100 humans. We can then ask about the probability of being in world B given that you are a human observing the world. It seems clear here that, given that you are a human, you are 10 times as likely to be in world B as in world A. It seems that this should be equivalent to the original problem since the cyborgs don’t change anything.
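The humans-and-cyborgs version is an ordinary Bayesian update, which can be worked through exactly. The sketch assumes that "you" are a uniformly random inhabitant of whichever world is real; that sampling assumption is what carries the anthropic weight, and the function name is hypothetical.

```python
from fractions import Fraction


def posterior_world_b(prior_b, humans_a, total_a, humans_b, total_b):
    """P(world B | I am human), treating me as a random inhabitant of the real world."""
    like_a = Fraction(humans_a, total_a)  # P(human | world A)
    like_b = Fraction(humans_b, total_b)  # P(human | world B)
    return prior_b * like_b / (prior_b * like_b + (1 - prior_b) * like_a)


p_b = posterior_world_b(Fraction(1, 2), humans_a=10, total_a=100, humans_b=100, total_b=100)
print(p_b, p_b / (1 - p_b))  # posterior probability of B, and the odds of B against A
```

With equal priors the posterior is 10/11, that is, odds of 10 to 1 in favour of world B.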

I admit that none of this is fully rigorous philosophical reasoning, but I thought that I’d post it anyway a) to get feedback b) to see if anyone denied the use of the anthropic principle in this way (not the way described in my last post), which would provide me with more motivation to try making all of this more formal.

Update: I thought it was worth adding that applying the anthropic principle to two models is really very similar to null hypothesis testing to determine if it is likely that a coin is biased. If there are a million people in one possible world, but only one in another, it would seem to be an amazing coincidence for you to be that one.

Sleepwalk bias, self-defeating predictions and existential risk

4 Stefan_Schubert 22 April 2016 06:31PM

Connected to: The Argument from Crisis and Pessimism Bias

When we predict the future, we often seem to underestimate the degree to which people will act to avoid adverse outcomes. Examples include Marx's prediction that the ruling classes would fail to act to avert a bloody revolution, predictions of environmental disasters and resource constraints, Y2K, etc. In most or all of these cases, there could have been a catastrophe, if people had not acted with determination and ingenuity to prevent it. But when pressed, people often do that, and it seems that we often fail to take that into account when making predictions. In other words: too often we postulate that people will sleepwalk into a disaster. Call this sleepwalk bias.

What are the causes of sleepwalk bias? I think there are two primary causes:

Cognitive constraints. It is easier to just extrapolate existing trends than to engage in complicated reasoning about how people will act to prevent those trends from continuing.

Predictions as warnings. We often fail to distinguish between predictions in the pure sense (what I would bet will happen) and what we may term warnings (what we think will happen, unless appropriate action is taken). Some of these predictions could perhaps be interpreted as warnings - in which case, they were not as bad as they seemed.

However, you could also argue that they were actual predictions, and that they were more effective because they were predictions, rather than warnings. For, more often than not, there will of course be lots of work to reduce the risk of disaster, which will reduce the risk. This means that a warning saying that "if no action is taken, there will be a disaster" is not necessarily very effective as a way to change behaviour - since we know for a fact that action will be taken. A prediction that there is a high probability of a disaster all things considered is much more effective. Indeed, the fact that predictions are more effective than warnings might be the reason why people predict disasters, rather than warn about them. Such predictions are self-defeating - which you may argue is why people make them.

In practice, I think people often fail to distinguish between pure predictions and warnings. They slide between these interpretations. In any case, the effect of all this is for these "prediction-warnings" to seem too pessimistic qua pure predictions.



The upshot for existential risk is that those suffering from sleepwalk bias may be too pessimistic. They fail to appreciate the enormous efforts people will make to avoid an existential disaster.

Is sleepwalk bias common among the existential risk community? If so, that would be a pro tanto reason to be somewhat less worried about existential risk. Since it seems to be a common bias, it would be unsurprising if the existential risk community also suffered from it. On the other hand, they have thought about these issues a lot, and may have been able to overcome it (or even overcorrect for it).

Also, even if sleepwalk bias does indeed affect existential risk predictions, it would be dangerous to let this notion make us decrease our efforts to reduce existential risk, given the enormous stakes, and the present neglect of existential risk. If pessimistic predictions may be self-defeating, so may optimistic predictions.



[Added 24/4 2016] Under which circumstances can we expect actors to sleepwalk? And under what circumstances can we expect that people will expect them to sleepwalk, even though they won't? Here are some considerations, inspired by the comments below. Sleepwalking is presumably more likely if:

  1. The catastrophe is arriving too fast for actors to react.
  2. It is unclear whether the catastrophe will in fact occur, or it is at least not very observable for the relevant actors (the financial crisis, possibly AGI).
  3. The possible disaster, though observable in some sense, is not sufficiently salient (especially to voters) to override more immediate concerns (climate change).
  4. There are conflicts (World War I) and/or free-riding problems (climate change) which are hard to overcome.
  5. The problem is technically harder than initially thought.

1, 2 and, in a way, 3, have to do with observing the disaster in time to act, whereas 4 and 5 have to do with ability to act once the problem is identified.

On the second question, my guess would be that people in general do not differentiate sufficiently between scenarios where sleepwalking is plausible and those where it is not (i.e. predicted sleepwalking has less variance than actual sleepwalking).  This means that we sometimes probably underestimate the amount of sleepwalking, but more often, if my main argument is right, we overestimate it. An upshot of this is that it is important to try to carefully model the amount of sleepwalking that there will be regarding different existential risks.

One weird trick to turn maximisers into minimisers

1 Stuart_Armstrong 22 April 2016 04:47PM

A putative new idea for AI control; index here.

A simple and easy design for a u-maximising agent that turns into a u-minimising one.

Let X be some boolean random variable outside the agent's control, that will be determined at some future time t (based on a cosmic event, maybe?). Set it up so that P(X=1)=ε, and for a given utility u, consider the utility:

  • u# = (2/ε)Xu - u.

Before t, the expected value of (2/ε)X is 2, so in expectation u# = u. Hence the agent is a u-maximiser. After t, the most likely option is X=0, hence a little bit of evidence to that effect is enough to turn the agent into a u-minimiser.

This isn't perfect corrigibility - the agent would be willing to sacrifice a bit of u-value (before t) in order to maintain its flexibility after t. To combat this effect, we could instead use:

  • u# = Ω(2/ε)Xu - u.

If Ω is large, then the agent is willing to pay very little u-value to maintain flexibility. However, the amount of evidence of X=0 that it needs to become a u-minimiser is equally proportional to Ω, so X better be a clear and convincing event.
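Both versions of u# can be examined by looking at the expected multiplier on u as the agent's credence in X=1 changes. The sketch below assumes the agent acts on the expected value of u#, so only the sign of that multiplier matters; the function names and the value of ε are hypothetical.

```python
def u_coefficient(p_x1: float, epsilon: float, omega: float = 1.0) -> float:
    """Expected multiplier on u in u# = omega*(2/epsilon)*X*u - u, given credence P(X=1) = p_x1."""
    return omega * (2.0 / epsilon) * p_x1 - 1.0


def flip_threshold(epsilon: float, omega: float = 1.0) -> float:
    """Credence in X=1 below which the multiplier turns negative (agent minimises u)."""
    return epsilon / (2.0 * omega)


EPSILON = 0.01  # hypothetical prior P(X=1)

# Before t the credence is epsilon, so the multiplier is 2*omega - 1 (positive).
print(u_coefficient(EPSILON, EPSILON), u_coefficient(EPSILON, EPSILON, omega=100.0))
# After mild evidence for X=0 (credence 0.004), only the omega=1 agent has flipped.
print(u_coefficient(0.004, EPSILON), u_coefficient(0.004, EPSILON, omega=100.0))
print(flip_threshold(EPSILON), flip_threshold(EPSILON, omega=100.0))
```

With Ω=1 the credence only has to halve before the sign flips; with Ω=100 it has to fall by a factor of 200, which is the sense in which the required evidence scales with Ω.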

Weekly LW Meetups

0 FrankAdamek 22 April 2016 03:58PM

This summary was posted to LW Main on April 22nd. The following week's summary is here.

New meetups (or meetups with a hiatus of more than a year) are happening in:

Irregularly scheduled Less Wrong meetups are taking place in:

The remaining meetups take place in cities with regular scheduling, but involve a change in time or location, special meeting content, or simply a helpful reminder about the meetup:

Locations with regularly scheduled meetups: Austin, Berkeley, Berlin, Boston, Brussels, Buffalo, Canberra, Columbus, Denver, Kraków, London, Madison WI, Melbourne, Moscow, Mountain View, New Hampshire, New York, Philadelphia, Research Triangle NC, Seattle, Sydney, Tel Aviv, Toronto, Vienna, Washington DC, and West Los Angeles. There's also a 24/7 online study hall for coworking LWers and a Slack channel for daily discussion and online meetups on Sunday night US time.


[link] Simplifying the environment: a new convergent instrumental goal

4 Kaj_Sotala 22 April 2016 06:48AM


Convergent instrumental goals (also basic AI drives) are goals that are useful for pursuing almost any other goal, and are thus likely to be pursued by any agent that is intelligent enough to understand why they’re useful. They are interesting because they may allow us to roughly predict the behavior of even AI systems that are much more intelligent than we are.

Instrumental goals are also a strong argument for why sufficiently advanced AI systems that were indifferent towards human values could be dangerous towards humans, even if they weren’t actively malicious: because the AI having instrumental goals such as self-preservation or resource acquisition could come to conflict with human well-being. “The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else.”

I’ve thought of a candidate for a new convergent instrumental drive: simplifying the environment to make it more predictable in a way that aligns with your goals.

The Web Browser is Not Your Client (But You Don't Need To Know That)

20 Error 22 April 2016 12:12AM

(Part of a sequence on discussion technology and NNTP. As last time, I should probably emphasize that I am a crank on this subject and do not actually expect anything I recommend to be implemented. Add whatever salt you feel is necessary)1

If there is one thing I hope readers get out of this sequence, it is this: The Web Browser is Not Your Client.

It looks like you have three or four viable clients -- IE, Firefox, Chrome, et al. You don't. You have one. It has a subforum listing with two items at the top of the display; some widgets on the right hand side for user details, RSS feed, meetups; the top-level post display; and below that, replies nested in the usual way.

Changing your browser has the exact same effect on your Less Wrong experience as changing your operating system, i.e. next to none.

For comparison, consider the Less Wrong IRC, where you can tune your experience with a wide range of different software. If you don't like your UX, there are other clients that give a different UX to the same content and community.

That is how the mechanism of discussion used to work, and does not now. Today, your user experience (UX) in a given community is dictated mostly by the admins of that community, and software development is often neither their forte nor something they have time for. I'll often find myself snarkily responding to feature requests with "you know, someone wrote something that does that 20 years ago, but no one uses it."

Semantic Collapse

What defines a client? More specifically, what defines a discussion client, a Less Wrong client?

The toolchain by which you read LW probably looks something like this; anyone who's read the source please correct me if I'm off:

Browser -> HTTP server -> LW UI application -> Reddit API -> Backend database.

The database stores all the information about users, posts, etc. The API presents subsets of that information in a way that's convenient for a web application to consume (probably JSON objects, though I haven't checked). The UI layer generates a web page layout and content using that information, which is then presented -- in the form of (mostly) HTML -- by the HTTP server layer to your browser. Your browser figures out what color pixels go where.

All of this is a gross oversimplification, obviously.

In some sense, the browser is self-evidently a client: It talks to an http server, receives hypertext, renders it, etc. It's a UI for an HTTP server.

But consider the following problem: Find and display all comments by me that are children of this post, and only those comments, using only browser UI elements, i.e. not the LW-specific page widgets. You cannot -- and I'd be pretty surprised if you could make a browser extension that could do it without resorting to the API, skipping the previous elements in the chain above. For that matter, if you can do it with the existing page widgets, I'd love to know how.

That isn't because the browser is poorly designed; it's because the browser lacks the semantic information to figure out what elements of the page constitute a comment, a post, an author. That information was lost in translation somewhere along the way.
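For contrast, here is how small the same task becomes once the semantic structure is available. This is a sketch against a hypothetical in-memory comment tree; the `Comment` class and its fields are illustrative assumptions, not the Less Wrong or Reddit data model.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Comment:
    author: str
    body: str
    replies: List["Comment"] = field(default_factory=list)


def comments_by(author: str, roots: List[Comment]) -> Iterator[Comment]:
    """Yield every comment by `author` anywhere below a post, in depth-first order."""
    for comment in roots:
        if comment.author == author:
            yield comment
        yield from comments_by(author, comment.replies)


# A made-up thread under a single post.
thread = [
    Comment("alice", "First!", [
        Comment("Error", "Please read the post."),
        Comment("bob", "Agreed.", [Comment("Error", "Thanks.")]),
    ]),
    Comment("Error", "Clarification for everyone."),
]

for c in comments_by("Error", thread):
    print(c.body)
```

The filter is a handful of lines because authorship and reply structure are explicit fields. In rendered HTML the same facts have to be guessed from the layout.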

Your browser isn't actually interacting with the discussion. Its role is more akin to an operating system than a client. It doesn't define a UX. It provides a shell, a set of system primitives, and a widget collection that can be used to build a UX. Similarly, HTTP is not the successor to NNTP; the successor is the plethora of APIs, for which HTTP is merely a substrate.

The Discussion Client is the point where semantic metadata is translated into display metadata; where you go from 'I have post A from user B with content C' to 'I have a text string H positioned above visual container P containing text string S.' Or, more concretely, when you go from this:

Author: somebody
Subject: I am right, you are mistaken, he is mindkilled.
Date: timestamp
Content: lorem ipsum nonsensical statement involving plankton....

to this:

<h1>I am right, you are mistaken, he is mindkilled.</h1>
<div><span align=left>somebody</span><span align=right>timestamp</span></div>
<div><p>lorem ipsum nonsensical statement involving plankton....</p></div>

That happens at the web application layer. That's the part that generates the subforum headings, the interface widgets, the display format of the comment tree. That's the part that defines your Less Wrong experience, as a reader, commenter, or writer.

That is your client, not your web browser. If it doesn't suit your needs, if it's missing features you'd like to have, well, you probably take for granted that you're stuck with it.

But it doesn't have to be that way.
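The translation from semantic metadata to display metadata described above can be sketched in a few lines. The function `render_post` and the dictionary keys are hypothetical; the markup mirrors the example given earlier in this section rather than the actual Less Wrong templates.

```python
from html import escape


def render_post(post: dict) -> str:
    """Turn semantic fields (author, subject, date, content) into display markup."""
    return (
        f"<h1>{escape(post['subject'])}</h1>\n"
        f"<div><span align=left>{escape(post['author'])}</span>"
        f"<span align=right>{escape(post['date'])}</span></div>\n"
        f"<div><p>{escape(post['content'])}</p></div>"
    )


post = {
    "author": "somebody",
    "subject": "I am right, you are mistaken, he is mindkilled.",
    "date": "timestamp",
    "content": "lorem ipsum nonsensical statement involving plankton....",
}
print(render_post(post))
```

Everything downstream of this function sees only headings, spans and paragraphs. A different client would be a different function over the same dictionary.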

Mechanism and Policy

One of the difficulties forming an argument about clients is that the proportion of people who have ever had a choice of clients available for any given service keeps shrinking. I have this mental image of the Average Internet User as having no real concept for this.

Then I think about email. Most people have probably used at least two different clients for email, even if it's just Gmail and their phone's built-in mail app. Or perhaps Outlook, if they're using a company system. And they (I think?) mostly take for granted that if they don't like Outlook they can use something else, or if they don't like their phone's mail app they can install a different one. They assume, correctly, that the content and function of their mail account is not tied to the client application they use to work with it.

(They may make the same assumption about web-based services, on the reasoning that if they don't like IE they can switch to Firefox, or if they don't like Firefox they can switch to Chrome. They are incorrect, because The Web Browser is Not Their Client)

Email does a good job of separating mechanism from policy. Its format is defined in RFC 2822 and its transmission protocol is defined in RFC 5321. Neither defines any conventions for user interfaces. There are good reasons for that from a software-design standpoint, but more relevant to our discussion is that interface conventions change more rapidly than the objects they interface with. Forum features change with the times; but the concepts of a Post, an Author, or a Reply are forever.
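As a small illustration of that separation, Python's standard `email` package can parse a message in this format without knowing anything about the program that wrote it, and two different "clients" can then present it however they like. The message text, addresses and the two view functions below are made up for the example.

```python
from email import message_from_string
from email.message import Message

# A hypothetical message; any mail program could have produced it.
RAW = (
    "From: somebody@example.com\n"
    "To: reader@example.com\n"
    "Subject: I am right, you are mistaken, he is mindkilled.\n"
    "Date: Fri, 22 Apr 2016 00:12:00 +0000\n"
    "\n"
    "lorem ipsum nonsensical statement involving plankton....\n"
)


def terse_view(msg: Message) -> str:
    """One-line presentation, as a minimalist client might show it."""
    return f"{msg['From']}: {msg['Subject']}"


def verbose_view(msg: Message) -> str:
    """Multi-line presentation of the same message."""
    return "\n".join([
        f"Subject: {msg['Subject']}",
        f"Author:  {msg['From']}",
        f"Date:    {msg['Date']}",
        "",
        msg.get_payload().strip(),
    ])


msg = message_from_string(RAW)
print(terse_view(msg))
print(verbose_view(msg))
```

Both views read the same headers and body, so the choice of presentation is left entirely to the reader's software.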

The benefit of this separation: If someone sends you mail from Outlook, you don't need to use Outlook to read it. You can use something else -- something that may look and behave entirely differently, in a manner more to your liking.

The comparison: If there is a discussion on Less Wrong, you do need to use the Less Wrong UI to read it. The same goes for, say, Facebook.

I object to this.

Standards as Schelling Points

One could argue that the lack of choice is for lack of interest. Less Wrong, and Reddit on which it is based, have an API. One could write a native client. Reddit does have them.

Let's take a tangent and talk about Reddit. Seems like they might have done something right. They have (I think?) the largest contiguous discussion community on the net today. And they have a published API for talking to it. It's even in use.

The problem with this method is that Reddit's API applies only to Reddit. I say problem, singular, but it's really problem, plural, because it hits users and developers in different ways.

On the user end, it means you can't have a unified user interface across different web forums; other forum servers have entirely different APIs, or none at all.2 It also makes life difficult when you want to move from one forum to another.

On the developer end, something very ugly happens when a content provider defines its own provision mechanism. Yes, you can write a competing client. But your client exists only at the provider's sufferance, subject to their decision not to make incompatible API changes or just pull the plug on you and your users outright. That isn't paranoia; in at least one case, it actually happened. Using an agreed-upon standard limits this sort of misbehavior, although it can still happen in other ways.

NNTP is a standard for discussion, like SMTP is for email. It is defined in RFC 3977 and its data format is defined in RFC 5536. The point of a standard is to ensure lasting interoperability; because it is a standard, it serves as a deliberately-constructed Schelling point, a place where unrelated developers can converge without further coordination.

Expertise is a Bottleneck

If you're trying to build a high-quality community, you want a closed system. Well kept gardens die by pacifism, and it's impossible to fully moderate an open system. But if you're building a communication infrastructure, you want an open system.

In the early Usenet days, this was exactly what existed; NNTP was standardized and open, but Usenet was a de-facto closed community, accessible mostly to academics. Then AOL hooked its customers into the system. The closed community became open, and the Eternal September began.3 I suspect, but can't prove, that this was a partial cause of the flight of discussion from Usenet to closed web forums.

I don't think that was the appropriate response. I think the appropriate response was private NNTP networks or even single servers, not connected to Usenet at large.

Modern web forums throw the open-infrastructure baby out with the open-community bathwater. The result, in our specific case, is that if we want something not provided by the default Less Wrong interface, it must be implemented by Less Wrongers.

I don't think UI implementation is our comparative advantage. In fact I know it isn't, or the Less Wrong UI wouldn't suck so hard. We're pretty big by web-forum standards, but we still contain only a tiny fraction of the Internet's technical expertise.

The situation is even worse among the diaspora; for example, at SSC, if Scott's readers want something new out of the interface, it must be implemented either by Scott himself or his agents. That doesn't scale.

One of the major benefits of a standardized, open infrastructure is that your developer base is no longer limited to a single community. Any software written by any member of any community backed by the same communication standard is yours for the using. Additionally, the developers are competing for the attention of readers, not admins; you can expect the reader-facing feature set to improve accordingly. If readers want different UI functionality, the community admins don't need to be involved at all.

A Real Web Client

When I wrote the intro to this sequence, the most common thing people insisted on was this: Any system that actually gets used must allow links from the web, and those links must reach a web page.

I completely, if grudgingly, agree. No matter how insightful a post is, if people can't link to it, it will not spread. No matter how interesting a post is, if Google doesn't index it, it doesn't exist.

One way to achieve a common interface to an otherwise-nonstandard forum is to write a gateway program, something that answers NNTP requests and does magic to translate them to whatever the forum understands. This can work and is better than nothing, but I don't like it -- I'll explain why in another post.

Assuming I can suppress my gag reflex for the next few moments, allow me to propose: a web client.

(No, I don't mean write a new browser. The Browser Is Not Your Client.4)

Real NNTP clients use the OS's widget set to build their UI and talk to the discussion board using NNTP. There is no fundamental reason the same cannot be done using the browser's widget set. Google did it. Before them, Deja News did it. Both of them suck, but they suck on the UI level. They are still proof that the concept can work.

I imagine an NNTP-backed site where casual visitors never need to know that's what they're dealing with. They see something very similar to a web forum or a blog, but whatever software today talks to a database on the back end, instead talks to NNTP, which is the canonical source of posts and post metadata. For example, it gets the results of a link to http://lesswrong.com/posts/message_id.html by sending ARTICLE message_id to its upstream NNTP server (which may be hosted on the same system), just as a native client would.
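As a rough sketch of that last step, here is how such a web front end might turn a request path into an NNTP command. The URL layout (/posts/message_id.html) comes from the example above; the function names, the host argument, and the error handling are my own assumptions for illustration, not part of any existing Less Wrong code. The wire format (ARTICLE followed by a message-id in angle brackets, a 220 response, and a multi-line body terminated by a lone ".") is standard NNTP as described in RFC 3977.

```python
import socket
from urllib.parse import urlparse, unquote


def message_id_from_url(url):
    """Extract the message-id from a hypothetical /posts/<message_id>.html URL."""
    path = urlparse(url).path
    prefix, suffix = "/posts/", ".html"
    if not (path.startswith(prefix) and path.endswith(suffix)):
        raise ValueError(f"not a post URL: {url!r}")
    ident = unquote(path[len(prefix):-len(suffix)])
    if not ident:
        raise ValueError(f"empty message-id in {url!r}")
    # NNTP message-ids are wrapped in angle brackets on the wire.
    return ident if ident.startswith("<") else f"<{ident}>"


def article_command(url):
    """Build the raw NNTP command a native client would send for this post."""
    return f"ARTICLE {message_id_from_url(url)}\r\n".encode("utf-8")


def fetch_article(url, host, port=119):
    """Fetch the raw article from an upstream NNTP server (host is an assumption)."""
    with socket.create_connection((host, port), timeout=10) as sock:
        stream = sock.makefile("rwb")
        greeting = stream.readline()
        if not greeting.startswith((b"200", b"201")):
            raise ConnectionError(f"unexpected greeting: {greeting!r}")
        stream.write(article_command(url))
        stream.flush()
        status = stream.readline()
        if not status.startswith(b"220"):
            raise LookupError(f"article not available: {status!r}")
        lines = []
        for raw in stream:
            line = raw.rstrip(b"\r\n")
            if line == b".":          # a lone dot ends the multi-line response
                break
            if line.startswith(b".."):  # undo dot-stuffing
                line = line[1:]
            lines.append(line)
        return b"\r\n".join(lines)
```

The web layer would then render the returned headers and body as HTML. Nothing here is specific to the browser: a native newsreader sends the same command to the same server.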

To the drive-by reader, nothing has changed. Except, maybe, one thing. When a regular reader, someone who's been around long enough to care about such things, says "Hey, I want feature X," and our hypothetical web client doesn't have it, I can now answer:

Someone wrote something that does that twenty years ago.

Here is how to get it.

  1. Meta-meta: This post took about eight hours to research and write, plus two weeks procrastinating. If anyone wants to discuss it in realtime, you can find me on #lesswrong or, if you insist, the LW Slack.

  2. The possibility of "universal clients" that understand multiple APIs is an interesting case, as with Pidgin for IM services. I might talk about those later.

  3. Ironically, despite my nostalgia for Usenet, I was a part of said September; or at least its aftermath.

  4. Okay, that was a little shoehorned in. The important thing is this: What I tell you three times is true.

Expect to know better when you know more

3 Stuart_Armstrong 21 April 2016 03:47PM

A seemingly trivial result that, as far as I could find, hasn't been posted anywhere in this form. It simply shows that we expect evidence to increase the posterior probability of the true hypothesis.

Let H be the true hypothesis/model/environment/distribution, and ~H its negation. Let e be evidence we receive, taking values e_1, e_2, ... e_n. Let p_i=P(e=e_i|H) and q_i=P(e=e_i|~H).

The expected posterior weighting of H, P(e|H), is Σ p_i p_i while the expected posterior weighting of ~H, P(e|~H), is Σ q_i p_i. Then since the p_i and q_i both sum to 1, Cauchy–Schwarz implies that


  • E(P(e|H)) ≥ E(P(e|~H)).

Thus, in expectation, the probability of the evidence given the true hypothesis, is higher than or equal to the probability of the evidence given its negation.

This, however, doesn't mean that the Bayes factor - P(e|H)/P(e|~H) - must have expectation greater than one, since ratios of expectation are not the same as expectations of ratio. The Bayes factor given e=e_i is (p_i/q_i). Thus the expected Bayes factor is Σ(p_i/q_i)p_i. The negative logarithm is a convex function; hence by Jensen's inequality, -log[E(P(e|H)/P(e|~H))] ≤ -E[log(P(e|H)/P(e|~H))]. That last expectation is Σ(log(p_i/q_i))p_i. This is the Kullback–Leibler divergence of P(e|~H) from P(e|H), and hence is non-negative. Thus log[E(P(e|H)/P(e|~H))] ≥ 0, and hence


  • E(P(e|H)/P(e|~H)) ≥ 1.

Thus, in expectation, the Bayes factor, for the true hypothesis versus its negation, is greater than or equal to one.

Note that this is not true for the inverse. Indeed E(P(e|~H)/P(e|H)) = Σ(q_i/p_i)p_i = Σ q_i = 1.

In the preceding proofs, ~H played no specific role, and hence


  • For all K,    E(P(e|H)) ≥ E(P(e|K))    and    E(P(e|H)/P(e|K)) ≥ 1    (and E(P(e|K)/P(e|H)) = 1).

Thus, in expectation, the probability of the true hypothesis versus anything, is greater or equal in both absolute value and ratio.

Now we can turn to the posterior probability P(H|e). For e=ei, this is P(H)*P(e=ei|H)/P(e=ei). We can compute the expectation of P(e|H)/P(e) as above, using the non-negative Kullback–Leibler divergence of P(e) from P(e|H), and thus showing it has an expectation greater than or equal to 1. Hence:


  • E(P(H|e)) ≥ P(H).

Thus, in expectation, the posterior probability of the true hypothesis is greater than or equal to its prior probability.
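For readers who prefer a numerical sanity check, here is a minimal sketch that tests three of the statements above on randomly generated distributions: the expected Bayes factor is at least 1, the expected inverse Bayes factor equals 1, and the expected posterior of H is at least its prior. All expectations are taken under the true hypothesis H. The function names, the number of outcomes, and the tolerance are my own choices, and the sketch assumes every p_i and q_i is strictly positive so that the ratios are defined.

```python
import random


def random_distribution(n, rng):
    """Return n strictly positive probabilities that sum to 1."""
    weights = [rng.random() + 1e-9 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


def expected_bayes_factor(p, q):
    """E[P(e|H)/P(e|~H)] with e drawn from H: sum of p_i * (p_i / q_i)."""
    return sum(pi * (pi / qi) for pi, qi in zip(p, q))


def expected_inverse_bayes_factor(p, q):
    """E[P(e|~H)/P(e|H)] with e drawn from H: sum of p_i * (q_i / p_i)."""
    return sum(pi * (qi / pi) for pi, qi in zip(p, q))


def expected_posterior(p, q, prior_h):
    """E[P(H|e)] with e drawn from H, using Bayes' rule for each outcome."""
    total = 0.0
    for pi, qi in zip(p, q):
        p_e = prior_h * pi + (1 - prior_h) * qi  # marginal P(e = e_i)
        total += pi * (prior_h * pi / p_e)
    return total


rng = random.Random(0)
for _ in range(1000):
    p = random_distribution(5, rng)
    q = random_distribution(5, rng)
    prior = rng.uniform(0.01, 0.99)
    assert expected_bayes_factor(p, q) >= 1 - 1e-9
    assert abs(expected_inverse_bayes_factor(p, q) - 1) < 1e-9
    assert expected_posterior(p, q, prior) >= prior - 1e-9
```

A passing run is of course not a proof; it only shows that these three statements survive a thousand random trials.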

Roughly you

3 JDR 21 April 2016 03:28PM

Since, like everyone, I generalise from single examples, I expect most people have some older relative or friend who they feel has added some wisdom to their life - some small pieces of information which seem to have pervasively wormed their way into more of their cognitive algorithms than you would expect, coloring and informing perceptions and decisions. For me, this would mostly be my grandfather. Over his now 92 years he has given me gems such as "always cut a pear before you peel it" (make quick checks of the value of success before committing to time consuming projects) and whenever someone says "that's never happened before", finishing their sentence with "said the old man when his donkey died" (just because something hasn't happened before doesn't mean it wasn't totally predictable).

Recently, though, I've been thinking about something else he has said, admittedly in mock seriousness: "If I lose my mind, you should take me out back and shoot me". We wouldn't, he wouldn't expect us to, but it's what he has said.

The reason I've been thinking of this darker quotation is that I've been spending a lot of time with people who have "lost their minds" in the way that he means. I am a medical student, and on a rotation in old age psychiatry, so have been talking to patients most of whom have some level of dementia, often layered with psychotic conditions such as intractable schizophrenia, some of whom increasingly can't remember their own pasts let alone their recent present. They can become fixed in untrue beliefs, their emotions can become limited, or they can lose motivation to complete even simple tasks.

It can be scary. In some ways, such illness represents death by degrees. These people can remain happy and have a good quality of life, but it's certain that they are not entirely the people they once were. In fact, this is a question we have asked relatives when deciding whether someone is suffering from early dementia: "Overall, in the way she behaves, does this seem like your mother to you? Is this how your mother acts?". Sometimes, the answer is "No, it's like she is a different person", or "Only some of the time". It's a process of personality-approximation, blurring, abridging and changing the mind to create something not quite the same. What my grandfather fears is becoming a rough estimate of himself - though again, for some, that re-drawn person might be perfectly happy with who they are when they arrive.

Why is this of interest to LessWrong? I think it is because quite a few people here (me included) have at least thought about bidding to live forever using things like cryonics and maybe brain-download. These things could work at some point; but what if they don't work perfectly? What if the people of the future can recover some of the information from a frozen brain, but not all of it? What if we had to leave out a few crucial memories, a few talents, maybe 60 points of IQ? Or even more subtle things - it's been written a few times that the entirety of who a person is resides in their brain, but that's probably not entirely true - the brain is influenced by the body, and aspects of your personality are probably influenced by how sensitive your adrenals are, the amount of fat you have, and even the community of bacteria in your intestines. Even a perfect neural computer-you wouldn't have these things; it would be subtle, but the created immortal agent wouldn't completely be you, as you are now. Somehow, though, missing my precise levels of testosterone would seem an acceptable compromise for the rest of my personality living forever, but missing the memory of my childhood, half my intelligence or my ability to change my opinion would leave me a lot less sure.

So here's the question I want to ask, to see what people think: If I offered you partial immortality - immortality for just part of you - how rough an approximation of "you" would you be willing to accept?

Rationality Reading Group: Part Y: Challenging the Difficult

2 Gram_Stone 20 April 2016 10:32PM

This is part of a semi-monthly reading group on Eliezer Yudkowsky's ebook, Rationality: From AI to Zombies. For more information about the group, see the announcement post.

Welcome to the Rationality reading group. This fortnight we discuss Part Y: Challenging the Difficult (pp. 1605-1647). This post summarizes each article of the sequence, linking to the original LessWrong post where available.

Y. Challenging the Difficult

304. Tsuyoku Naritai! (I Want to Become Stronger) - Don't be satisfied knowing you are biased; instead, aspire to become stronger, studying your flaws so as to remove them. There is a temptation to take pride in confessions, which can impede progress.

305. Tsuyoku vs. the Egalitarian Instinct - There may be evolutionary psychological factors that encourage modesty and mediocrity, at least in appearance; while some of that may still apply today, you should mentally plan and strive to pull ahead, if you are doing things right.

306. Trying to Try - As a human, if you try to try something, you will put much less work into it than if you try something.

307. Use the Try Harder, Luke - A fictional exchange between Mark Hamill and George Lucas over the scene in Empire Strikes Back where Luke Skywalker attempts to lift his X-wing with the force.

308. On Doing the Impossible - A lot of projects seem impossible, meaning that we don't immediately see a way to do them. But after working on them for a long time, they start to look merely extremely difficult.

309. Make an Extraordinary Effort - It takes an extraordinary amount of rationality before you stop making stupid mistakes. Doing better requires making extraordinary efforts.

310. Shut Up and Do the Impossible! - The ultimate level of attacking a problem is the point at which you simply shut up and solve the impossible problem.

311. Final Words - The conclusion of the Beisutsukai series.


This has been a collection of notes on the assigned sequence for this fortnight. The most important part of the reading group though is discussion, which is in the comments section. Please remember that this group contains a variety of levels of expertise: if a line of discussion seems too basic or too incomprehensible, look around for one that suits you better!

The next reading will cover Part Z: The Craft and the Community (pp. 1651-1750). The discussion will go live on Wednesday, 4 May 2016, right here on the discussion forum of LessWrong.

Gratitude Thread :-)

1 Gleb_Tsipursky 19 April 2016 03:11AM

Hi folks! Building up on the recent experiment and the #LessWrongMoreNice meme, this thread is devoted to any and all expressions of gratitude. Special rules for communication and voting apply here. Please play along!


A lot of research shows that expressing gratitude improves mental and physical health, qualities that most of us want to increase. So in this thread, please express anything you are grateful for, big or small, one-time or continuing, and feel free to post stuff that you would not normally post to Less Wrong. Encourage and support others in what they post in comments, and upvote posts that you like, while downvoting those that don't express the spirit of this thread.


If you want to discuss this thread, please do so in response to this open thread comment.


I'm grateful to you for following the spirit of this thread!

Open thread, Apr. 18 - Apr. 24, 2016

2 MrMind 18 April 2016 07:19AM

If it's worth saying, but not worth its own post (even in Discussion), then it goes here.

Notes for future OT posters:

1. Please add the 'open_thread' tag.

2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)

3. Open Threads should be posted in Discussion, and not Main.

4. Open Threads should start on Monday, and end on Sunday.

Monthly Outreach Thread

0 Gleb_Tsipursky 17 April 2016 11:18PM

Please share about any outreach that you have done to convey rationality-style ideas broadly, whether recent or not, which you have not yet shared on previous Outreach threads. The goal of having this thread is to organize information about outreach and provide community support and recognition for raising the sanity waterline, a form of cognitive altruism that contributes to creating a flourishing world. Likewise, doing so can help inspire others to emulate some aspects of these good deeds through social proof and network effects.

Does immortality imply eternal existence in linear time?

0 turchin 17 April 2016 11:17PM

The question is important, as it is often used as an argument against the idea of immortality, on the level of desirability as well as feasibility. It may result in less interest in radical life extension, on the grounds that "the result will be the same": we will die. Religion, on the other hand, is not afraid to "sell" immortality, as it has God, who will resolve any contradictions in implementing immortality. As a result, religion wins in the marketplace of ideas.


Immortality (by definition) is about not dying. That eternal linear existence follows from it seems to be a very simple and obvious theorem:


“If, for every N, I do not die between moment N and moment N+1, then I will exist at any moment N”.


If we prove that immortality is impossible, then any life would look like: Now + unknown very long time + death. So, death is inevitable, and the only difference is the unknown time until it happens.


It is an unpleasant perspective, by the way. 


So we have either “bad infinity” or inevitable death. Both look unappealing. Both also look logically contradictory. "Infinite linear existence" requires infinite memory on the observer's part, for example. "Death of the observer" also implies that the stream of experiences ends, which can't be proved empirically and is, from a logical point of view, an unproved hypothesis.


But we can change our point of view if we abandon the idea of linear time.


Physics suggests that near black holes closed time-like curves could be possible. https://en.wikipedia.org/wiki/Closed_timelike_curve (Nietzsche's idea of "eternal recurrence" is an example of such circular immortality.)


If I am in such a curve, my experiences may recur after, say, one billion years. In this case, I am immortal but have finite time duration.


It may not be very good, but it is just a starting point for considerations that would help lead us away from the linear time model.


There may be other configurations in non-linear time. Another obvious one is the merging of different personal timelines. 


Another is the circular attractor.


Another is a combination of attractors, merges and circular timelines, which may result in complex geometry.


Another is two-dimensional (or many-dimensional) time, with a second time arrow perpendicular to the first. It results in a time topology. Time could also include singularities, in which one has an infinite number of experiences in finite time.


We could also add here the idea of time splitting in a quantum multiverse.


We could also add the idea that there is a possible path between any two observer-moments, and given that infinitely many such paths exist in a splitting multiverse, any observer has a non-zero probability of becoming any other observer, which results in a tangle of time-like curves in the space of all possible minds.


Timeless physics ideas also give us another view of “time”, in which we don’t have “infinite time”, not because infinity is impossible, but because there is no such thing as time.


TL;DR: The idea of time is so complex that we can’t state that immortality results in eternal linear existence. These two ideas may be true or false independently.


Also, I have a question for readers: If you think that superintelligence will be created, do you think it will be immortal, and why?

Using humility to counteract shame

8 Vika 15 April 2016 06:32PM

"Pride is not the opposite of shame, but its source. True humility is the only antidote to shame."

Uncle Iroh, "Avatar: The Last Airbender"

Shame is one of the trickiest emotions to deal with. It is difficult to think about, not to mention discuss with others, and gives rise to insidious ugh fields and negative spirals. Shame often underlies other negative emotions without making itself apparent - anxiety or anger at yourself can be caused by unacknowledged shame about the possibility of failure. It can stack on top of other emotions - e.g. you start out feeling upset with someone, and end up being ashamed of yourself for feeling upset, and maybe even ashamed of feeling ashamed if meta-shame is your cup of tea. The most useful approach I have found against shame is invoking humility.

What is humility, anyway? It is often defined as a low view of your own importance, and tends to be conflated with modesty. Another common definition that I find more useful is acceptance of your own flaws and shortcomings. This is more compatible with confidence, and helpful irrespective of your level of importance or comparison to other people. What humility feels like to me on a system 1 level is a sense of compassion and warmth towards yourself while fully aware of your imperfections (while focusing on imperfections without compassion can lead to beating yourself up). According to LessWrong, "to be humble is to take specific actions in anticipation of your own errors", which seems more like a possible consequence of being humble than a definition.

Humility is a powerful tool for psychological well-being and instrumental rationality that is more broadly applicable than just the ability to anticipate errors by seeing your limitations more clearly. I can summon humility when I feel anxious about too many upcoming deadlines, or angry at myself for being stuck on a rock climbing route, or embarrassed about forgetting some basic fact in my field that I am surely expected to know by the 5th year of grad school. While humility comes naturally to some people, others might find it useful to explicitly build an identity as a humble person. How can you invoke this mindset?

One way is through negative visualization or pre-hindsight, considering how your plans could fail, which can be time-consuming and usually requires system 2. A faster and less effortful way is to imagine a person, real or fictional, who you consider to be humble. I often bring to mind my grandfather, or Uncle Iroh from the Avatar series, sometimes literally repeating the above quote in my head, sort of like an affirmation. I don't actually agree that humility is the only antidote to shame, but it does seem to be one of the most effective.

(Cross-posted from my blog. Thanks to Janos Kramar for his feedback on this post.)

Weekly LW Meetups

2 FrankAdamek 15 April 2016 03:26PM

This summary was posted to LW Main on April 15th. The following week's summary is here.

New meetups (or meetups with a hiatus of more than a year) are happening in:

Irregularly scheduled Less Wrong meetups are taking place in:

The remaining meetups take place in cities with regular scheduling, but involve a change in time or location, special meeting content, or simply a helpful reminder about the meetup:

Locations with regularly scheduled meetups: Austin, Berkeley, Berlin, Boston, Brussels, Buffalo, Canberra, Columbus, Denver, Kraków, London, Madison WI, Melbourne, Moscow, Mountain View, New Hampshire, New York, Philadelphia, Research Triangle NC, Seattle, Sydney, Tel Aviv, Toronto, Vienna, Washington DC, and West Los Angeles. There's also a 24/7 online study hall for coworking LWers and a Slack channel for daily discussion and online meetups on Sunday night US time.

Anthropics and Biased Models

4 casebash 15 April 2016 02:18AM

The Fine-tuned Universe Theory, according to Wikipedia, is the belief that "our universe is remarkably well suited for life, to a degree unlikely to happen by mere chance". It is typically used to argue that our universe must therefore be the result of Intelligent Design.

One of the most common counter-arguments to this view is based on the Anthropic Principle. The argument is that if the conditions were not such that life would be possible, then we would not be able to observe this, as we would not be alive. Therefore, we shouldn't be surprised that the universe has favourable conditions.

I am going to argue that this particular application of the anthropic principle is in fact an incorrect way to deal with this problem. I'll begin first by explaining one way to deal with this problem; afterwards I will explain why the other way is incorrect.

Two model approach

We begin with two models:

  • Normal universe model: The universe has no bias towards supporting life
  • Magic universe model: The universe is 100% biased towards supporting life
We can assign both of these models a prior probability; naturally, I'd suggest the prior probability for the latter should be rather low. We then update based on the evidence that we see.

p(normal universe|we exist) = p(we exist|normal universe)/p(we exist) * p(normal universe)

The limit of p(normal universe|we exist) as p(we exist|normal universe) approaches 0 is 0 (assuming p(normal universe)!=1). This is proven in the supplementary materials at the end of this post. In plain English, as the chance of us existing in the normal universe approaches zero, as long as we assign some probability to the magic universe model we will at some point conclude that the Magic universe model is overwhelmingly likely to be correct. I should be clear: I am definitely not claiming that the Fine-Tuned Universe argument is correct. I expect that if we come to the conclusion that the Magical model is more likely than the Normal model of the universe, then that is because we have set our prior for the magical model of the universe too high or the chances of life inside the normal universe model too low. Regarding the former, our exposure to science fiction and fantasy subjects us to the availability bias, which biases our priors upwards. Regarding the latter, many scientists make arguments that life can only exist in a very specific form, which I don't find completely convincing.

Standard anthropic argument

Let's quote an example of the standard anthropic argument by DanielLC:

Alice notices that Earth survived the cold war. She asks Bob why that is. After all, so much more likely for Earth not to survive. Bob tells her that it's a silly question. The only reason she picked out Earth is that it's her home planet, which is because it survived the cold war. If Earth died and, say, Pandora survived, she (or rather someone else, because it's not going to be the same people) would be asking why Pandora survived the cold war. There's no coincidence.

This paragraph notes that the answer to the question, "What is the probability that we survived the Cold War given that we can ask this question?" is always going to be 1. It is then implied that since there is no surprise (indeed, this is what must have happened), the anthropic principle lacks any force.

However, this is actually asking the wrong question. It is right to note that we shouldn't be surprised to observe that we survived given that it would be impossible to observe otherwise. However, if we were then informed that we lived in a normal, unbiased universe, rather than in an alternate biased universe, and if the maths worked out a particular way such that it leaned heavily towards the alternate universe, then we would be surprised to learn we lived in a normal universe. In particular, we showed how this could work out above, when we examined the situation where p(we exist|normal universe) approached 0. The anthropic argument against the alternate hypothesis denies that surprise in a certain sense can occur; however, it fails to show that surprise in another, more meaningful sense cannot occur.

Reframing this, the problem is that it fails to be comparative. The proper question we should be asking is “Given that we observe an unlikely condition, is it more probable that the normal or magical model of the universe is true?”. Simply noting that we can explain our observations perfectly well within our universe, does not mean that an alternate model wouldn't provide a better explanation. As an analogy, if we want to determine whether a coin is biased or unbiased, then we must start with (at least) two models - fair and unfair. We assign each a prior probability and then do a Bayesian update on the new information provided - ie. the unusual run or state of the universe.

Coin flip argument

Let's consider a version of this analogy in more detail. Imagine that you are flipping coins. If you flip heads, then you live; if you flip tails, then you are shot. Suppose you get 15 heads in a row. You could argue that only the people who got 15 heads in a row are alive to ask this question, so there is nothing to explain. However, if there is a 1% chance that the coin you have is perfectly biased towards heads, then the number of people with biased coins who get 15 heads and ask the question will massively outweigh the number of people with unbiased coins who get to 15 heads. Simply stating that there was nothing surprising about you observing 15 heads, given that you would be dead if you hadn't gotten them, doesn't counteract the fact that one model is more likely than the other.
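To put numbers on this, here is a small sketch of the Bayesian update for the coin example. It uses the figures from the paragraph above (a 1% prior on a perfectly biased coin, 15 flips) and assumes the only alternative is a fair coin; the function name is mine.

```python
from fractions import Fraction


def posterior_biased(prior_biased, heads_in_a_row):
    """P(coin is perfectly biased | survived this many flips)."""
    p_fair_survives = Fraction(1, 2) ** heads_in_a_row
    survivors_biased = prior_biased * 1              # a biased coin always survives
    survivors_fair = (1 - prior_biased) * p_fair_survives
    return survivors_biased / (survivors_biased + survivors_fair)


posterior = posterior_biased(Fraction(1, 100), 15)
odds = posterior / (1 - posterior)
print(posterior, float(posterior))  # exact fraction, roughly 0.997
print(float(odds))                  # biased-coin survivors per fair-coin survivor
```

Among the survivors, people holding biased coins outnumber people holding fair coins by roughly 330 to 1, even though biased coins started out as only 1% of the population.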

Edit - Extra Perspective: Null hypothesis testing

Another view comes from the idea of hypothesis testing in statistics. In hypothesis testing, you start with a null hypothesis, i.e. a probability distribution based on the Normal universe model, and then calculate a p-value representing the chance that you would get this kind of result given that probability distribution. If we get a low p-value, then we generally "reject" the null hypothesis, or at least argue that we have evidence for rejecting it in favour of the alternate hypothesis, which is in this case that there exists at least some bias in the universe towards life. People using the anthropic principle argue that our null hypothesis should be a probability distribution based on the odds of you surviving given that you are asking this question, rather than simply the odds of you surviving, full stop. This would mean that all the probability should be distributed to the survive case, providing a p-value of 1, meaning that we should reject the evidence.

While the p-value may remain fixed at 1 as p(alive|normal universe) -> 0, it is worth noting that the prior probability of our null hypothesis, p(alive & normal universe), is actually changing. At some point, the prior probability becomes sufficiently close to 0 that we reject the hypothesis despite the p-value still being stuck at 1. That is, hypothesis testing is not the only situation in which we may reject a hypothesis. A hypothesis that perfectly fits the data may be rejected based on a minuscule prior probability.


This post was originally about the Fine-tuned universe theory, but we also answered the Cold war anthropic puzzle and a Coin Flip Anthropic puzzle. I'm not claiming that all anthropic reasoning is broken in general, only that we can't use anthropic reasoning on a single side of a model. I think that there are cases where we can use anthropic reasoning, but these are cases where we are trying to determine likely properties of our universe, not ones where we are trying to use it to argue against the existence of a biased model. Future posts will deal with these applications of the anthropic principle.

Edit: After consideration, I have realised that the anthropic principle actually works when combined with the multiple worlds hypothesis, as per Luke_A_Somers' comment. My argument only works against the idea that there is a single universe with parameters that just happen to be right. If the hypotheses are a multiverse as per string theory vs. a magical (single) universe, then even though each individual universe may only have a small chance of life, the multiverse as a whole can have almost guaranteed life, meaning our beliefs would simply be based on priors. I suppose someone might complain that I should be comparing a Normal multiverse against a Magical multiverse, but the problem is that my priors for a Magical multiverse would be even lower than those for a Magical universe. It is also possible to use the multiple worlds argument without using the anthropic principle at all: you can just deny that the fine tuning argument applies to the multiverse as a whole.

Supplementary Materials

Limit of p(normal universe|we exist)

The formula we had was:

p(normal universe|we exist) = p(we exist|normal universe)/p(we exist) * p(normal universe)

The extra information that we exist has led to a factor of p(we exist|normal universe)/p(we exist) being applied.

We note that p(we exist)=p(we exist|normal universe)p(normal universe) + p(we exist|magical universe)p(magical universe)
                                    =p(we exist|normal universe)p(normal universe) + 1 - p(normal universe)

The limit of p(we exist) as p(we exist|normal universe) -> 0, with p(normal universe) fixed, is 1 - p(normal universe). So long as p(normal universe) != 1, p(we exist) approaches a fixed value greater than 0.

The limit of p(we exist|normal universe)/p(we exist) as p(we exist|normal universe) -> 0 is 0.

Meaning that the limit of p(normal universe|we exist) as p(we exist|normal universe) -> 0 is 0 (assuming p(normal universe) != 1).
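The limit can also be checked numerically. The following is a minimal Python sketch, assuming (as in the derivation above) that p(we exist|magical universe) = 1. The function name `posterior_normal` and the prior of 0.5 are illustrative choices, not part of the original argument.

```python
def posterior_normal(likelihood, prior):
    """p(normal universe | we exist), assuming p(we exist | magical universe) = 1."""
    # Total probability of existing: normal branch plus magical branch.
    p_exist = likelihood * prior + 1.0 * (1.0 - prior)
    return likelihood * prior / p_exist


prior = 0.5  # illustrative value for p(normal universe); any value below 1 behaves the same way
for likelihood in (1e-1, 1e-3, 1e-6, 1e-9):
    # The posterior shrinks towards 0 as the likelihood does.
    print(likelihood, posterior_normal(likelihood, prior))
```

With the prior held fixed, the printed posterior falls roughly in step with p(we exist|normal universe), which is the behaviour the limit describes.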

Performing Bayesian updates

Again, we'll imagine that we have a biased universe where we have 100% chance of being alive.

We will use Bayes' law:

p(a|b) = p(b|a)p(a)/p(b)

a = being in a normal universe

b = we are alive


We'll also use:

p(alive) = p(alive|normal universe)p(normal universe) + p(alive|biased universe)p(biased universe)


Example 1:


p(alive|normal universe) = 1/100

p(normal universe) = 1/2

The results are:

p(we are alive) = (1/100)*(1/2)+1*(1/2) = 101/200

p(normal universe|alive) = (1/100)*(1/2)*(200/101) = 1/101


Example 2:


p(alive|normal universe) = 1/100

p(normal universe) = 100/101

The results are:

p(we are alive) = 100/101*1/100+1/101*1 = 2/101

p(normal universe|alive) = (1/100)*(100/101)* (101/2) = 1/2
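Both examples can be reproduced with exact arithmetic. This is a small Python sketch using the standard `fractions` module; the helper name `bayes_update` is made up for illustration, and it assumes p(alive|biased universe) = 1 as stated above.

```python
from fractions import Fraction


def bayes_update(p_alive_given_normal, p_normal, p_alive_given_biased=Fraction(1)):
    """Return (p(alive), p(normal universe | alive)) as exact fractions."""
    p_biased = 1 - p_normal
    # Law of total probability over the two kinds of universe.
    p_alive = p_alive_given_normal * p_normal + p_alive_given_biased * p_biased
    # Bayes' law for the normal universe given that we are alive.
    posterior = p_alive_given_normal * p_normal / p_alive
    return p_alive, posterior


example_1 = bayes_update(Fraction(1, 100), Fraction(1, 2))
example_2 = bayes_update(Fraction(1, 100), Fraction(100, 101))
print(example_1)  # (Fraction(101, 200), Fraction(1, 101))
print(example_2)  # (Fraction(2, 101), Fraction(1, 2))
```

Example 2 shows that a prior of 100/101 for the normal universe is what it takes to end up at even odds after the update.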



How to provide a simple example to the requirement of falsifiability in the scientific method to a novice audience?

5 Val 11 April 2016 09:26PM

(I once posted this question on academia.stackexchange, but it was deemed to be off topic there. I hope it would be more on-topic here)

I would like to introduce the basics of the scientific method to an audience unfamiliar with the real meaning of it, without making it hard to understand.

The intended audience is likely to think that to "prove something scientifically" means "use modern technological gadgets to measure something, then interpret the results as we wish", so my main topics would be the selection of an experimental method and the importance of falsifiability. Wikipedia lists "all swans are white" as an example of a falsifiable statement, but it is not practical enough: to prove that all swans are white would require observing all the swans in the world. I'm searching for a simple example which uses the scientific method to determine the workings of an unknown system, starting by forming a good hypothesis.

A good example I found is the 2-4-6 game, culminating in the very catchy phrase "if you are equally good at explaining any outcome, you have zero knowledge". This would be one of the best examples to illustrate the most important part of the scientific method, which a lot of people imagine incorrectly. It has just one flaw: for best effect it has to be interactive. And if I make it interactive, it has some non-negligible chance of failing, especially if done with a broader audience.

Is there any simple, non-interactive example to illustrate the problem underlying the 2-4-6 game? (for example, if we had taken this naive method to formulate our hypothesis, we would have failed)
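For readers who have not met the 2-4-6 game, its structure can be sketched in a few lines of Python. This assumes the usual version of the task, in which the hidden rule is "any strictly increasing triple"; the guessed rule and the particular test triples are illustrative choices.

```python
def hidden_rule(triple):
    """The experimenter's rule in the usual version of the task: strictly increasing."""
    a, b, c = triple
    return a < b < c


def naive_hypothesis(triple):
    """A typical first guess: each number is two more than the previous one."""
    a, b, c = triple
    return b - a == 2 and c - b == 2


# Tests chosen because the guess predicts "yes": they can only ever agree with it.
confirming_tests = [(2, 4, 6), (10, 12, 14), (-3, -1, 1)]
# Tests chosen because the guess predicts "no": these can expose a wrong guess.
falsifying_tests = [(1, 2, 3), (5, 10, 100), (6, 4, 2)]

for triple in confirming_tests + falsifying_tests:
    predicted = naive_hypothesis(triple)
    actual = hidden_rule(triple)
    verdict = "agrees" if predicted == actual else "GUESS REFUTED"
    print(triple, "predicted:", predicted, "actual:", actual, verdict)
```

Every confirming test agrees with the guess, so a player who only tries those never learns that the guess is too narrow; only the triples the guess forbids can refute it.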

I know the above example is mostly used when discussing fallacies such as confirmation bias, but it nevertheless seems to me a good way of grasping the most important aspects of the scientific method.

I've seen several good posts about the importance of falsifiability, some of them in this very community, but I have not yet seen any example simple enough that people unfamiliar with how scientists work can also understand it. A good working example would be one where we want to study a familiar concept but, by forgetting to take falsifiability into account, arrive at an obviously wrong (and preferably humorous) conclusion.

(How do I imagine such an example working? My favorite example from a different topic is the egg-laying dog. A dog enters a room where we have placed ten sausages and ten eggs, and when it leaves the room, we observe that the percentage of eggs relative to the sausages has increased, so we conclude that the dog must have produced eggs. It's easy to spot the mistake in this example, because the image of a dog laying eggs is absurd. However, let's replace the dog with an effective medicine against heart disease, where someone noticed that the chance of dying of cancer in the next ten years increased for patients who were treated with it, and so declared the medicine to be carcinogenic even though it wasn't (people are not immortal, so if they don't die of one disease, they die later of another). In this case, many people will accept that it's carcinogenic without a second thought. This is why the example of the egg-laying dog can be so useful in illustrating the problem. Now, the egg-laying dog is not a good example for raising awareness of the importance of falsifiability; I present it only to show the style of example that any layman can understand.)
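The arithmetic behind the egg-laying dog is easy to make concrete. Here is a minimal Python sketch; the number of sausages eaten is an assumption made purely for illustration, since the example does not give one.

```python
def egg_share(eggs, sausages):
    """Fraction of the items in the room that are eggs."""
    return eggs / (eggs + sausages)


# Before the dog enters: ten eggs and ten sausages, as in the example.
before = egg_share(eggs=10, sausages=10)
# Assumption for illustration: the dog eats six sausages and touches no eggs.
after = egg_share(eggs=10, sausages=10 - 6)

print(before, after)  # the share of eggs rises although no egg was added
```

The medicine case has the same shape: removing deaths from one cause raises the share of deaths from the others without adding any.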


The Science of Effective Fundraising: Four Common Mistakes to Avoid

8 Gleb_Tsipursky 11 April 2016 03:19PM

This article will be of interest primarily for Effective Altruists. It's also cross-posted to the EA Forum.



Summary/TL;DR: Charities that have the biggest social impact often get significantly less financial support than rivals that tell better stories but have a smaller social impact. Drawing on academic research across different fields, this article highlights four common mistakes that fundraisers for effective charities should avoid and suggests potential solutions to these mistakes. 1) Focus on individual victims as well as statistics; 2) Present problems that are solvable by individual donors; 3) Avoid relying excessively on matching donations and focus on learning about your donors; 4) Empower your donors and help them feel good.



Co-written by Gleb Tsipursky and Peter Slattery


Acknowledgments: Thanks to Stefan Schubert, Scott Weathers, Peter Hurford, David Moss, Alfredo Parra, Owen Shen, Gina Stuessy, Sheannal Anthony Obeyesekere and other readers who prefer to remain anonymous for providing feedback on this post. The authors take full responsibility for all opinions expressed here and any mistakes or oversights. Versions of this piece will be published on The Life You Can Save blog and the Intentional Insights blog.



Charities that use their funds effectively to make a social impact frequently struggle to fundraise effectively. Indeed, while these charities receive plaudits from those committed to measuring and comparing the impact of donations across sectors, many effective charities have not successfully fundraised large sums outside of donors focused highly on impact.


In many cases, this situation results from the beliefs of key stakeholders at effective charities. Some think that persuasive fundraising tactics are “not for them” and instead assume that presenting hard data and statistics will be optimal, as they believe that their nonprofit’s effectiveness can speak for itself.

The belief that a nonprofit’s effectiveness can speak for itself can be very harmful to fundraising efforts as it overlooks the fact that donors do not always optimise their giving for social impact. Instead, studies suggest that donors’ choices are influenced by many other considerations, such as a desire for a warm glow, social prestige, or being captured by engrossing stories. Indeed, charities that have the biggest social impact often get significantly less financial support than rivals that tell better stories but have a smaller social impact. For example, while one fundraiser collected over $700,000 to remove a young girl from a well and save a single life, most charities struggle to raise anything proportionate for causes that could save many more lives or lift thousands out of poverty.


Given these issues, the aim of this article is to use available science on fundraising and social impact to address some of the common misconceptions that charities may have about fundraising and, hopefully, make it easier for effective charities to also become more effective at fundraising. To do this it draws on academic research across different fields to highlight four common mistakes that those who raise funds for effective charities should avoid and suggest potential solutions to these mistakes.


Don’t forget individual victims


Many fundraisers focus on using statistics and facts to convey the severity of the social issues they tackle. However, while facts and statistics are often an effective way to convince potential donors, it is important to recognise that different people are persuaded by different things. While some individuals are best persuaded to do good deeds through statistics and facts, others are most influenced by the closeness and vividness of the suffering. Indeed, it has been found that people often prefer to help a single identifiable victim rather than many faceless victims: the so-called identifiable victim effect.


One way in which charities can cover all bases is to complement their statistics by telling stories about one or more of the most compelling victims. Stories have been shown to be excellent ways of tapping emotions, and stories told using video and audio are likely to be particularly good at creating vivid depictions of victims that compel others to want to help them.


Don’t overemphasise the problem


Focusing on the size of the problem has been shown to be ineffective for at least two reasons. First, most people prefer to give to causes where they can save the greatest proportion of people. This means that rather than save 100 out of 1,000 victims of malaria, the majority of people would rather use the same or even more resources to save all five out of five people stranded on a boat, or one girl stranded in a well, even if saving 100 people is clearly the more rational choice. People's reluctance to help where they feel their impact is not going to be significant is often called the drop in the bucket effect.


Second, humans have a tendency to neglect the scope of the problem when dealing with social issues. This is called scope insensitivity: people do not scale up their efforts in proportion to a problem’s true size. For example, a donor willing to give $100 to help one person might only be willing to give $200 to help 100 people, instead of the proportional amount of $10,000.


Of course charities often need to deal with big problems. In such cases one solution is to break these big problems into smaller pieces (e.g., individuals, families or villages) and present situations on a scale that the donor can relate to and realistically address through their donation.


Don’t assume that matching donations is always a good way to spend funds


Charitable fundraisers frequently put a lot of emphasis on arranging for big donors to offer to match any contributions from smaller donors. Intuitively, donation matching seems to be a good incentive for givers as they will generate twice (sometimes three times) the social impact for donating the same amount. However, research provides insufficient evidence to support or discourage donation matching: after reviewing the evidence, Ben Kuhn argues that its positive effects on donations are relatively small (and highly uncertain), and that sometimes the effects can be negative.


Given the lack of strong supporting research, charities should make sure to check that donation matching works for them and should also consider other ways to use their funding from large donors. One option is to use some of this money to cover experiments and other forms of prospect research to better understand their donors’ reasons for giving. Another is to pay various non-program costs so that a charity may claim that more of the smaller donors’ donations will go to program costs, or to use big donations as seed money for a fundraising campaign.


Don't forget to empower donors and help them feel good


Charities frequently focus on showing tragic situations to motivate donors to help. However, charities can sometimes go too far in focusing on the negatives, as too much negative communication can overwhelm and upset potential donors, which can deter them from giving. Additionally, while people often help due to feeling sadness for others, they also give for the warm glow and feeling of accomplishment that they expect to get from helping.


Overall, charities need to remember that most donors want to feel good for doing good and ensure that they achieve this. One reason why the ALS Ice Bucket Challenge was such an incredibly effective approach to fundraising was that it gave donors the opportunity to have a good time, while also doing good. Even when it isn’t possible to think of a clever new way to make donors feel good while donating, it is possible to make donors look good by publicly thanking and praising them for their donations. Likewise it is possible to make them feel important and satisfied by explaining how their donations have been key to resolving tragic situations and helping address suffering.




Remember four key strategies suggested by the research:


1) Focus on individual victims as well as statistics


2) Present problems that are solvable by individual donors


3) Avoid relying excessively on matching donations and focus on learning about your donors


4) Empower your donors and help them feel good.


By following these strategies and avoiding the mistakes outlined above, you will not only provide high-impact services, but will also be effective at raising funds.

The Sally-Anne fallacy

25 philh 11 April 2016 01:06PM

Cross-posted from my blog

I'd like to coin a term. The Sally-Anne fallacy is the mistake of assuming that someone believes something, simply because that thing is true.[1]

The name comes from the Sally-Anne test, used in developmental psychology to detect theory of mind. Someone who lacks theory of mind will fail the Sally-Anne test, thinking that Sally knows where the marble is. The Sally-Anne fallacy is also a failure of theory of mind.

In internet arguments, this will often come up as part of a chain of reasoning, such as: you think X; X implies Y; therefore you think Y. Or: you support X; X leads to Y; therefore you support Y.[2]

So for example, we have this complaint about the words "African dialect" used in Age of Ultron. The argument goes: a dialect is a variation on a language, therefore Marvel thinks "African" is a language.

You think "African" has dialects; "has dialects" implies "is a language"; therefore you think "African" is a language.

Or maybe Marvel just doesn't know what a "dialect" is.

This is also a mistake I was pointing at in Fascists and Rakes. You think it's okay to eat tic-tacs; tic-tacs are sentient; therefore you think it's okay to eat sentient things. Versus: you think I should be forbidden from eating tic-tacs; tic-tacs are nonsentient; therefore you think I should be forbidden from eating nonsentient things. No, in both cases the defendant is just wrong about whether tic-tacs are sentient.

Many political conflicts include arguments that look like this. You fight our cause; our cause is the cause of [good thing]; therefore you oppose [good thing]. Sometimes people disagree about what's good, but sometimes they just disagree about how to get there, and think that a cause is harmful to its stated goals. Thus, liberals and libertarians symmetrically accuse each other of not caring about the poor.[3]

If you want to convince someone to change their mind, it's important to know what they're wrong about. The Sally-Anne fallacy causes us to mistarget our counterarguments, and to mistake potential allies for inevitable enemies.

  1. From the outside, this looks like "simply because you believe that thing".

  2. Another possible misunderstanding here is if you agree that X leads to Y and Y is bad, but still think X is worth it.

  3. Of course, sometimes people will pretend not to believe the obvious truth so that they can further their dastardly ends. But sometimes they're just wrong. And sometimes they'll be right, and the obvious truth will be untrue.

Open Thread April 11 - April 17, 2016

3 Clarity 10 April 2016 09:01PM

If it's worth saying, but not worth its own post (even in Discussion), then it goes here.


Notes for future OT posters:

1. Please add the 'open_thread' tag.

2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)

3. Open Threads should be posted in Discussion, and not Main.

4. Open Threads should start on Monday, and end on Sunday.

The Thyroid Madness: Two Apparently Contradictory Studies. Proof?

6 johnlawrenceaspden 10 April 2016 08:21PM

Recap: (See also: http://lesswrong.com/r/discussion/lw/nef/the_thyroid_madness_core_argument_evidence/ and previous posts)

Chronic Fatigue Syndrome and Fibromyalgia both look far too much like the classical presentation of hypothyroidism for comfort, but thyroid hormone blood tests are normal.

Many alternative medicine practitioners, most prominently John Lowe, and several conventional medical doctors, most prominently Kenneth Blanchard, a practising endocrinologist with a longstanding practice completely free of lawsuits, have tried diagnosing hypothyroidism 'by clinical symptoms', and treating it with various combinations of thyroid hormones, and they all report success, but the practice is dismissed as ignorant and dangerous quackery by conventional medicine.

I suspect that there are acquired 'hormone resistance' or 'type II' versions of all the various endocrine disorders. These would produce the symptoms without reducing the levels of the hormones in the blood. However hormone treatments should still work, simply by overwhelming the resistance.

We know that diabetes comes in two forms, (type I) gland failure and (type II), 'insulin resistance', and that the resistance version is usually acquired rather than inborn. The mechanism for the resistance version of diabetes is mysterious.

There are known to be corresponding 'gland failure' and 'resistance' versions of diseases associated with all the other endocrine hormones, but for some reason the resistance versions are thought to be very rare, and only to be inherited, never acquired.

Should such acquired resistance mechanisms exist and be common, then on evolutionary grounds they would have to be caused by the direct action of pathogens, be a side effect of immune defense against such pathogens, or have an environmental cause. Nothing else would be stable.

Chronic Fatigue Syndrome often seems to start with an infection.


I thought until recently that the problem must be rather complex, and depend on subtle balances of hormones in a complicated system. The idea is so simple and obvious that if it were straightforwardly true, it isn't credible that it would have been missed.

But it turns out that there have been two formal studies of the simplest possible version of the idea (treat the symptoms of hypothyroidism with thyroxine) in the medical literature. And they're all I've managed to find. Further examples would be most welcome.

The two studies are apparently contradictory, but there's no real contradiction; in fact the second supports the first.

The first:

Clinical Response to Thyroxine Sodium in Clinically Hypothyroid but Biochemically Euthyroid Patients

was an open trial done in 2000, by Gordon Skinner in Birmingham.

Dr Skinner took 139 patients, all of whom had symptoms consistent with a clinical diagnosis of hypothyroidism.

Of these the majority had been diagnosed with CFS or ME or Post-Viral Fatigue Syndrome, but thirty had been diagnosed with Major Depression, which also has all the right symptoms.

Dr Skinner started off with small doses of thyroxine, and slowly increased the doses, to quite high levels, until the patients got better. He reported that they all got considerably better. In fact his results are phenomenally good.

In the paper he mentioned the possibility of a placebo effect, and the necessity of ruling it out by a placebo-controlled blinded randomised trial, but thought it unlikely. Many of these patients had been seriously ill for many years, and had usually tried a lot of things already.

[From the study] In the absence of a control group, a placebo effect cannot be excluded in this or any study. However, the average duration of illness was 7.5 years in patients who had usually undergone an alarming array of traditional and alternative medications without significant improvement as evidenced by their wish to seek further medical advice. Secondly, certain clinical features allowed objective assessment, namely change in appearance, hair or skin texture, reduction in size of tongue and thyroid gland and increase in pulse rate.

If these patients hadn't had a hormone resistance, he would have done them very serious harm! He kept increasing the dose until it worked, and the highest dose he used was 300 µg of thyroxine. That's more than the amount you'd usually use to completely replace the output of a removed thyroid gland. Given that all these people had normal hormone levels to start with, if the patients were not resisting the hormone, this should have caused a range of extremely unpleasant effects, up to and including death.

He mentions no adverse effects whatsoever.

Dr Skinner wrote to the British Medical Journal suggesting that thyroxine should be tried in cases where the clinical symptoms of hypothyroidism were present but the blood tests were normal.

This prompted a small trial:

Thyroxine treatment in patients with symptoms of hypothyroidism but thyroid function tests within the reference range: randomised double blind placebo controlled crossover trial  

M Anne Pollock, Alison Sturrock, Karen Marshall, Kate M Davidson, Christopher J G Kelly, Alex D McMahon, E Hamish McLaren

This trial looks very well designed and established that:

(a) There was a huge placebo effect in the patients

(b) Thyroxine is very strongly disliked by the healthy controls (they could tell it from placebo and hated it)

(c) The patient group couldn't tell the difference between thyroxine and placebo (on average).

This result is very interesting in itself, and I make no criticism of the brave GPs who organised it in response to Skinner's letter, but unfortunately it has been taken as a refutation of Skinner's methods. Which it is not. In fact it supports him.

In fact there are two obvious relevant differences between what they did and what Skinner did:

(i) They used a fixed dose for everyone (100 µg thyroxine / day) and made no attempt to tailor the dose to the patient.

I suspect that this would have made Skinner's treatment less effective, but it should still have worked.

(ii) They used very different criteria for selecting their patients.

Skinner had carefully done a 'clinical diagnosis' of hypothyroidism, using 16 symptoms, most of which were present in the majority of his patients.

The criteria for the formal trial were:

At least three of the following symptoms for six months: tiredness, lethargy, weight gain or inability to lose weight, intolerance to cold, hair loss, or dry skin or hair.

So a fat person with dry hair who didn't get enough sleep would have qualified as a patient.

This is utterly inadequate as a diagnosis of hypothyroidism! It is a famously difficult disease to diagnose!

Their patient group would have consisted mainly of people who didn't have the clinical symptoms of hypothyroidism.

If the type II version is rare or non-existent, then it would have included no real patients at all.

If the type II version is very common, then at least some of the patient group should have had the disease Skinner said he could cure.

What I think must have happened here is that the treatment produced great improvements in a few patients, and caused unpleasant symptoms in all the rest. This averaged out to 'can't tell the difference between placebo and treatment'. Remember that healthy people can!
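To see how such a cancellation could arise, here is a small Python sketch. Every number in it is hypothetical and chosen only to show the arithmetic; none of it comes from either study, and the function name `average_effect` is made up for illustration.

```python
from statistics import mean


def average_effect(n_responders, n_non_responders, benefit, harm):
    """Mean change in a symptom score across a mixed patient group."""
    scores = [benefit] * n_responders + [harm] * n_non_responders
    return mean(scores)


# Hypothetical group: 5 patients improve by 8 points, 20 feel worse by 2 points.
mixed_group = average_effect(5, 20, benefit=8, harm=-2)
print(mixed_group)  # large individual effects, yet the group average shows nothing
```

An average of zero is therefore compatible with a strong effect in a minority of patients, which is why per-patient results would be needed to tell the two explanations apart.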

I deduce that Skinner's treatment works pretty much as well as he thought it did, and that the disease he was curing is very common indeed.

Can anyone explain these two studies in any other way?


Taken together with Sarah Myhill's paper showing that the principal cause of chronic fatigue is 'mitochondrial dysfunction', and given that the action of thyroid hormone is to stimulate the mitochondria, I think the case for a 'thyroid hormone resistance' disease manifesting as Chronic Fatigue Syndrome is unanswerable.

At the very least, this should be investigated.

I now believe my own argument, which until I saw Skinner's paper appeared even to me to be a wild idea made up from shreds of mathematical intuition and questionable evidence from biased sources. I think that Skinner's treatment is unlikely to be optimal, and research into what is actually going on needs to be done.

The problem, if it does exist, is likely to be extremely widespread, and explain far more than the mystery of Chronic Fatigue Syndrome and Fibromyalgia. I immediately claim Major Depressive Disorder and Irritable Bowel Syndrome as alternative labels for: 'type II hypothyroidism'. There is a large cluster of these diseases, all mysterious, all with very similar symptoms, known as the 'central sensitivity syndromes'.

And I should like to add that 'blood cholesterol' was once a test for hypothyroidism, so there are probably implications for heart disease as well. Anyone interested in the wider implications might want to take a look at Broda Barnes' work. I started off thinking he was a lunatic. I'm now fairly sure he must have been right all along.

I think it's now urgent to bring this to the attention of the medical profession and the sufferers' groups. Has anyone got any ideas how to do that?




Two excellent arguments made on reddit's r/CFS group by EmergencyLies (I paraphrase/steelman him):

  • If there's a widespread hormone resistance version of hypothyroidism, where are the most severe cases?

(i) The mild version may be polymorphic, but the severe 'myxoedema' described in Victorian literature was the sort of thing that could be diagnosed on sight (or by hearing the voice) by anyone who'd seen a few severe cases.

(ii) One hears anecdotes of people who can tolerate insane levels of T3. If the hormone resistance can get that severe, why isn't the same problem killing people, or at least making them obviously hypothyroid?

I can't answer this one. Where are they? This is the best objection to this idea that I have seen in three months. Does anyone know of people with really obvious hypothyroidism and normal TSH values?


  • CFS should look like hypothyroidism, but doesn't

(i) Skinner and Pollock together strongly suggest that there's a widespread form of hypothyroidism, undetected by usual blood tests, but treatable with thyroxine

(ii) Anyone with hypothyroidism but normal blood tests is going to get diagnosed with something like CFS/FMS/IBS/MDD etc...

(iii) Some of those people are going to end up diagnosed with CFS. Probably lots, if it's widespread.

(iv) Hypothyroidism causes lowered heart rate

(v) But CFS patients have raised heart rates, (on average?).

Those five things together look like a proof of contradiction, so one of them must be wrong.


I think it's (iv). Billewicz's clinical hypothyroidism test doesn't think heart rate has diagnostic value. Thus there were both low and high heart rates in hypothyroidism. I suspect that there's a low basal heart rate because of low metabolism, but that it goes high and stays high after even mild exercise because of the need to clear fatigue poison. Also, of course, hypothyroidism weakens the heart like any other muscle, so heart rate would actually need to be higher to pump the same amount of blood.
