Less Wrong is a community blog devoted to refining the art of human rationality. Please visit our About page for more information.

[Link] If we knew about all the ways an Intelligence Explosion could go wrong, would we be able to avoid it?

the-citizen 23 November 2014 10:30AM


I submitted this a while back to the lesswrong subreddit, but it occurs to me now that most LWers probably don't actually check the sub. So here it is again in case anyone that's interested didn't see it.

When should an Effective Altruist be vegetarian?

7 KatjaGrace 23 November 2014 05:25AM

Crossposted from Meteuphoric

I have lately noticed several people wondering why more Effective Altruists are not vegetarians. I am personally not a vegetarian because I don't think it is an effective way to be altruistic.

As far as I can tell the fact that many EAs are not vegetarians is surprising to some because they think 'animals are probably morally relevant' basically implies 'we shouldn't eat animals'. To my ear, this sounds about as absurd as if GiveWell's explanation of their recommendation of SCI stopped after 'the developing world exists, or at least has a high probability of doing so'.

(By the way, I do get to a calculation at the bottom, after some speculation about why the calculation I think is appropriate is unlike what I take others' implicit calculations to be. Feel free to just scroll down and look at it).

I think this fairly large difference between my and many vegetarians' guesses at the value of vegetarianism arises because they think the relevant question is whether the suffering to the animal is worse than the pleasure to themselves at eating the animal. This question sounds superficially plausibly relevant, but I think on closer consideration you will agree that it is the wrong question.

The real question is not whether the cost to you is small, but whether you could do more good for the same small cost.

Similarly, when deciding whether to donate $5 to a random charity, the question is whether you could do more good by donating the money to the most effective charity you know of. Going vegetarian because it relieves the animals more than it hurts you is the equivalent of donating to a random developing world charity because it relieves the suffering of an impoverished child more than foregoing $5 increases your suffering.

Trading with inconvenience and displeasure

My imaginary vegetarian debate partner objects to this on grounds that vegetarianism is different from donating to ineffective charities, because to be a vegetarian you are spending effort and enjoying your life less rather than spending money, and you can't really reallocate that inconvenience and displeasure to, say, preventing artificial intelligence disaster or feeding the hungry, if you don't use it on reading food labels and eating tofu. If I were to go ahead and eat the sausage instead - the concern goes - probably I would just go on with the rest of my life exactly the same, and a bunch of farm animals somewhere would be the worse for it, and I scarcely better.

I agree that if the meat eating decision were separated from everything else in this way, then the decision really would be about your welfare vs. the animal's welfare, and you should probably eat the tofu.

However whether you can trade being vegetarian for more effective sacrifices is largely a question of whether you choose to do so. And if vegetarianism is not the most effective way to inconvenience yourself, then it is clear that you should choose to do so. If you eat meat now in exchange for suffering some more effective annoyance at another time, you and the world can be better off.

Imagine an EA friend says to you that she gives substantial money to whatever random charity has put a tin in whatever shop she is in, because it's better than the donuts and new dresses she would buy otherwise. She doesn't see how not giving the money to the random charity would really cause her to give it to a better charity - empirically she would spend it on luxuries. What do you say to this?

If she were my friend, I might point out that the money isn't meant to magically move somewhere better - she may have to consciously direct it there. She might need to write down how much she was going to give to the random charity, then look at the note later for instance. Or she might do well to decide once and for all how much to give to charity and how much to spend on herself, and then stick to that. As an aside, I might also feel that she was using the term 'Effective Altruist' kind of broadly.

I see vegetarianism for the sake of not managing to trade inconveniences as quite similar. And in both cases you risk spending your life doing suboptimal things every time a suboptimal altruistic opportunity has a chance to steal resources from what would be your personal purse. This seems like something that your personal and altruistic values should cooperate in avoiding.

It is likely too expensive to keep track of an elaborate trading system, but you should at least be able to make reasonable long term arrangements. For instance, if instead of eating vegetarian you ate a bit frugally and saved and donated a few dollars per meal, you would probably do more good (see calculations lower in this post). So if frugal eating were similarly annoying, it would be better. Eating frugally is inconvenient in very similar ways to vegetarianism, so it is a particularly plausible trade if you are skeptical that such trades can be made. I claim you could make very different trades though, for instance foregoing the pleasure of an extra five minutes' break and working instead sometimes. Or you could decide once and for all how much annoyance to have, and then choose the most worthwhile bits of annoyance, or put a dollar value on your own time and suffering and try to be consistent.

Nebulous life-worsening costs of vegetarianism

There is a separate psychological question which is often mixed up with the above issue. That is, whether making your life marginally less gratifying and more annoying in small ways will make you sufficiently less productive to undermine the good done by your sacrifice. This is not about whether you will do something a bit costly another time for the sake of altruism, but whether just spending your attention and happiness on vegetarianism will harm your other efforts to do good, and cause more harm than good.

I find this plausible in many cases, but I expect it to vary a lot by person. My mother seems to think it's basically free to eat supplements, whereas to me every additional daily routine seems to encumber my life and require me to spend disproportionately more time thinking about unimportant things. Some people find it hard to concentrate when unhappy, others don't. Some people struggle to feed themselves adequately at all, while others actively enjoy preparing food.

There are offsetting positives from vegetarianism which also vary across people. For instance there is the pleasure of self-sacrifice, the joy of being part of a proud and moralizing minority, and the absence of the horror of eating other beings. There are also perhaps health benefits, which probably don't vary that much by people, but people do vary in how big they think the health benefits are.

Another way you might accidentally lose more value than you save is in spending little bits of time which are hard to measure or notice. For instance, vegetarianism means spending a bit more time searching for vegetarian alternatives, researching nutrition, buying supplements, writing emails back to people who invite you to dinner explaining your dietary restrictions, etc. The value of different people's time varies a lot, as does the extent to which an additional vegetarianism routine would tend to eat their time.

On a less psychological note, the potential drop in IQ (~5 points?!) from missing out on creatine is a particularly terrible example of vegetarianism making people less productive. Now that we know about creatine and can supplement it, creatine itself is not such an issue. An issue does remain though: is this an unlikely one-off failure, or should we worry about more such deficiency? (this goes for any kind of unusual diet, not just meat-free ones).

How much is avoiding meat worth?

Here is my own calculation of how much it costs to do the same amount of good as replacing one meat meal with one vegetarian meal. If you would be willing to pay this much extra to eat meat for one meal, then you should eat meat. If not, then you should abstain. For instance, if eating meat does $10 worth of harm, you should eat meat whenever you would hypothetically pay an extra $10 for the privilege.

This is a tentative calculation. I will probably update it if people offer substantially better numbers.

All quantities are in terms of social harm.

Eating 1 non-vegetarian meal

< eating 1 chickeny meal (I am told chickens are particularly bad animals to eat, due to their poor living conditions and large animal:meal ratio. The relatively small size of their brains might offset this, but I will conservatively give all animals the moral weight of humans in this calculation.)

< eating 200 calories of chicken (a McDonald's crispy chicken sandwich probably contains a bit over 100 calories of chicken (based on its listed protein content); a Chipotle chicken burrito contains around 180 calories of chicken)

= causing ~0.25 chicken lives (1 chicken is equivalent in price to 800 calories of chicken breast i.e. eating an additional 800 calories of chicken breast conservatively results in one additional chicken. Calculations from data here and here.)

< -$0.08 given to the Humane League (ACE estimates the Humane League spares 3.4 animal lives per dollar). However since the Humane League basically convinces other people to be vegetarians, this may be hypocritical or otherwise dubious.

< causing 12.5 days of chicken life (broiler chickens are slaughtered at between 35-49 days of age)

= causing 12.5 days of chicken suffering (I'm being generous)

-$0.50 subsidizing free range eggs (This is a somewhat random example of the cost of more systematic efforts to improve animal welfare, rather than necessarily the best. The cost here is the cost of buying free range eggs and selling them as non-free range eggs. It costs about 2.6 2004 Euro cents [= US 4c in 2014] to pay for an egg to be free range instead of produced in a battery. This corresponds to a bit over one day of chicken life. I'm assuming here that the life of a battery egg-laying chicken is not substantially better than that of a meat chicken, and that free range chickens have lives that are at least neutral. If they are positive, the figure becomes even more favorable to the free range eggs).

< losing 12.5 days of high quality human life (assuming saving one year of human life is at least as good as stopping one year of an animal suffering, which you may disagree with.)

= -$1.94-5.49 spent on GiveWell's top charities (This was GiveWell's estimate for AMF if we assume saving a life corresponds to saving 52 years - roughly the life expectancy of children in Malawi. GiveWell doesn't recommend AMF at the moment, but they recommend charities they considered comparable to AMF when AMF had this value.

GiveWell employees' median estimate for the cost of 'saving a life' through donating to SCI is $5936 [see spreadsheet here]. If we suppose a life is 37 DALYs, as they assume in the spreadsheet, then 12.5 days is worth 5936*12.5/(37*365.25) = $5.49. Elie produced two estimates that were generous to cash and to deworming separately, and gave the highest and lowest estimates for the cost-effectiveness of deworming, of the group. They imply a range of $1.40-$45.98 to do as much good via SCI as eating vegetarian for a meal).
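For anyone who wants to check the arithmetic in the chain above, here is a rough sketch in Python. The variable names are mine, and the inputs are simply the figures quoted in the post (including the 12.5 days per meal and the roughly 4c per egg-day), so treat it as a restatement of the post's numbers rather than an independent estimate.

```python
# Figures quoted in the post; the names are illustrative, not from any source.
CALORIES_PER_MEAL = 200          # generous upper bound on chicken per meal
CALORIES_PER_CHICKEN = 800       # chicken breast calories priced like one bird
ANIMALS_SPARED_PER_DOLLAR = 3.4  # ACE estimate for the Humane League
DAYS_PER_MEAL = 12.5             # days of chicken life attributed to one meal
EGG_COST_PER_DAY = 0.04          # ~US 4c per free range egg, roughly one day
SCI_COST_PER_LIFE = 5936         # GiveWell staff median, in dollars
DALYS_PER_LIFE = 37
DAYS_PER_YEAR = 365.25

chicken_lives = CALORIES_PER_MEAL / CALORIES_PER_CHICKEN
humane_league_cost = chicken_lives / ANIMALS_SPARED_PER_DOLLAR
free_range_cost = DAYS_PER_MEAL * EGG_COST_PER_DAY
# Note the parentheses: the cost per life is divided by the days in 37 years.
sci_cost = SCI_COST_PER_LIFE * DAYS_PER_MEAL / (DALYS_PER_LIFE * DAYS_PER_YEAR)

print(f"chicken lives per meal: {chicken_lives:.2f}")
print(f"Humane League:          ${humane_league_cost:.3f}")
print(f"free range eggs:        ${free_range_cost:.2f}")
print(f"SCI:                    ${sci_cost:.2f}")
```

The Humane League figure comes out at about $0.074, which the post rounds up to $0.08.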

Given this calculation, we get a few cents to a couple of dollars as the cost of doing similar amounts of good to averting a meat meal via other means. We are not finished yet though - there were many factors I didn't take into account in the calculation, because I wanted to separate relatively straightforward facts for which I have good evidence from guesses. Here are other considerations I can think of, which reduce the relative value of averting meat eating:

  1. Chicken brains are fairly small, suggesting their internal experience is less than that of humans. More generally, in the spectrum of entities between humans and microbes, chickens are at least some of the way to microbes. And you wouldn't pay much to save a microbe.
  2. Eating a chicken only reduces the number of chickens produced by some fraction. According to Peter Hurford, an extra 0.3 chickens are produced if you demand 1 chicken. I didn't include this in the above calculation because I am not sure of the time scale of the relevant elasticities (if they are short-run elasticities, they might underestimate the effect of vegetarianism).
  3. Vegetable production may also have negative effects on animals.
  4. GiveWell estimates have been rigorously checked relative to other things, and evaluations tend to get worse as you check them. For instance, you might forget to include any of the things in this list in your evaluation of vegetarianism. Probably there are more things I forgot. That is, if you looked into vegetarianism with the same detail as SCI, it would become more pessimistic, and so cheaper to do as much good with SCI.
  5. It is not at all obvious that meat animal lives are not worth living on average. Relatedly, animals generally want to be alive, which we might want to give some weight to.
  6. Animal welfare in general appears to have negligible predictable effect on the future (very debatably), and there are probably things which can have huge impact on the future. This would make animal altruism worse compared to present-day human interventions, and much worse compared to interventions directed at affecting the far future, such as averting existential risk.

My own quick guesses at factors by which the relative value of avoiding meat should be multiplied, to account for these considerations:

  1. Moral value of small animals: 0.05
  2. Raised price reduces others' consumption: 0.5
  3. Vegetables harm animals too: 0.9
  4. Rigorous estimates look worse: 0.9
  5. Animal lives might be worth living: 0.2
  6. Animals don't affect the future: 0.1 relative to human poverty charities

Thus given my estimates, we scale down the above figures by 0.05*0.5*0.9*0.9*0.2*0.1 = 0.0004. This gives us $0.0008-$0.002 to do as much good as eating a vegetarian meal by spending on GiveWell's top charities. Without the factor for the future (which doesn't apply to these other animal charities), we only multiply the cost of eating a meat meal by 0.004. This gives us a price of $0.0003 with the Humane League, or $0.002 on improving chicken welfare in other ways. These are not price differences that will change my meal choices very often! I think I would often be willing to pay at least a couple of extra dollars to eat meat, setting aside animal suffering. So if I were to avoid eating meat, then assuming I keep fixed how much of my budget I spend on myself and how much I spend on altruism, I would be trading a couple of dollars of value for less than one thousandth of that.
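The scaling step can be sketched as a small Python function, which also makes it easy to substitute different guesses. The dictionary keys and the function name are hypothetical labels of mine; the factor values and base costs are the ones given in the post.

```python
from math import prod

# The six guessed multipliers from the list above (keys are my own labels).
factors = {
    "moral_value": 0.05,
    "price_elasticity": 0.5,
    "vegetable_harm": 0.9,
    "rigor": 0.9,
    "lives_worth_living": 0.2,
    "future": 0.1,
}

def discounted(cost, skip=()):
    """Scale a base cost by every factor whose key is not in `skip`."""
    return cost * prod(v for k, v in factors.items() if k not in skip)

# GiveWell top charities: all six factors apply.
givewell_low = discounted(1.94)
givewell_high = discounted(5.49)
# Animal charities: the factor for the future is left out.
humane_league = discounted(0.08, skip=("future",))
free_range = discounted(0.50, skip=("future",))

print(f"GiveWell:      ${givewell_low:.4f} to ${givewell_high:.4f}")
print(f"Humane League: ${humane_league:.4f}")
print(f"Free range:    ${free_range:.4f}")
```

Changing any entry in `factors` and rerunning gives the price under a different set of beliefs.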

I encourage you to estimate your own numbers for the above factors, and to recalculate the overall price according to your beliefs. If you would happily pay this much (in my case, less than $0.002) to eat meat on many occasions, you probably shouldn't be a vegetarian. You are better off paying that cost elsewhere. If you would rarely be willing to pay the calculated price, you should perhaps consider being a vegetarian, though note that the calculation was conservative in favor of vegetarianism, so you might want to run it again more carefully. Note that in judging what you would be willing to pay to eat meat, you should take into account everything except the direct cost to animals.

There are many common reasons you might not be willing to eat meat, given these calculations, e.g.:

  • You don't enjoy eating meat
  • You think meat is pretty unhealthy
  • You belong to a social cluster of vegetarians, and don't like conflict
  • You think convincing enough others to be vegetarians is the most cost-effective way to make the world better, and being a vegetarian is a great way to have heaps of conversations about vegetarianism, which you believe makes people feel better about vegetarians overall, to the extent that they are frequently compelled to become vegetarians.
  • 'For signaling' is another common explanation I have heard, which I think is meant to be similar to the above, though I'm not actually sure of the details.
  • You aren't able to treat costs like these as fungible (as discussed above)
  • You are completely indifferent to what you eat (in that case, you would probably do better eating as cheaply as possible, but maybe everything is the same price)
  • You consider the act-omission distinction morally relevant
  • You are very skeptical of the ability to affect anything, and in particular have substantially greater confidence in the market - to farm some fraction of a pig fewer in expectation if you abstain from pork for long enough - than in nonprofits and complicated schemes. (Though in that case, consider buying free-range eggs and selling them as cage eggs).
  • You think the suffering of animals is of extreme importance compared to the suffering of humans or loss of human lives, and don't trust the figures I have given for improving the lives of egg-laying chickens, and don't want to be a hypocrite. Actually, you still probably shouldn't here - the egg-laying chicken number is just an example of a plausible alternative way to help animals. You should really check quite a few of these before settling.

However I think for wannabe effective altruists with the usual array of characteristics, vegetarianism is likely to be quite ineffective.

More marbles and Sleeping Beauty

0 Manfred 23 November 2014 02:00AM


Previously I talked about an entirely uncontroversial marble game: I flip a coin, and if Tails I give you a black marble, if Heads I flip another coin to either give you a white or a black marble.

The probabilities of seeing the two marble colors are 3/4 and 1/4, and the probabilities of Heads and Tails are 1/2 each.

The marble game is analogous to how a 'halfer' would think of the Sleeping Beauty problem - the claim that Sleeping Beauty should assign probability 1/2 to Heads relies on the claim that your information for the Sleeping Beauty problem is the same as your information for the marble game - same possible events, same causal information, same mutual exclusivity and exhaustiveness relations.

So what's analogous to the 'thirder' position, after we take into account that we have this causal information? Is it some difference in causal structure, or some non-causal anthropic modification, or something even stranger?

As it turns out, nope, it's the same exact game, just re-labeled.

In the re-labeled marble game you still have two unknown variables (represented by flipping coins), and you still have a 1/2 chance of black and Tails, a 1/4 chance of black and Heads, and a 1/4 chance of white and Heads.

And then to get the thirds, you ask the question "If I get a black marble, what is the probability of the faces of the first coin?" Now you update to P(Heads|black)=1/3 and P(Tails|black)=2/3.
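As a quick check on these numbers, here is a minimal Python sketch that writes down the joint distribution of the marble game and reads both answers off it. The helper names `marginal` and `conditional` are mine, chosen only for illustration.

```python
from fractions import Fraction

half = Fraction(1, 2)

# Joint distribution over (first coin, marble colour) in the marble game.
joint = {
    ("Tails", "black"): half,         # Tails always gives black
    ("Heads", "black"): half * half,  # Heads, then the second coin picks black
    ("Heads", "white"): half * half,  # Heads, then the second coin picks white
}

def marginal(index, value):
    """Total probability of outcomes whose `index`-th entry equals `value`."""
    return sum(p for outcome, p in joint.items() if outcome[index] == value)

def conditional(coin, marble):
    """P(coin | marble), by the definition of conditional probability."""
    return joint.get((coin, marble), Fraction(0)) / marginal(1, marble)

print(marginal(1, "black"), marginal(1, "white"))   # marble colours
print(marginal(0, "Heads"), marginal(0, "Tails"))   # the 'halfer' numbers
print(conditional("Heads", "black"), conditional("Tails", "black"))  # the thirds
```

The same table produces both the halves and the thirds; the only difference is whether you condition on the marble colour.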


Okay, enough analogies. What's going on with these two positions in the Sleeping Beauty problem?

[Diagram 1]    [Diagram 2]

Here are two different diagrams, which are really re-labelings of the same diagram. The first labeling is the problem where P(Heads|Wake) = 1/2. The second labeling is the problem where P(Heads|Wake) = 1/3. The question at hand is really - which of these two math problems corresponds to the word problem / real world situation?

As a refresher, here's the text of the Sleeping Beauty problem that I'll use: Sleeping Beauty goes to sleep in a special room on Sunday, having signed up for an experiment. A coin is flipped - if the coin lands Heads, she will only be woken up on Monday. If the coin lands Tails, she will be woken up on both Monday and Tuesday, but with memories erased in between. Upon waking up, she then assigns some probability to the coin landing Heads, P(Heads|Wake).

Diagram 1:  First a coin is flipped to get Heads or Tails. There are two possible things that could be happening to her, Wake on Monday or Wake on Tuesday. If the coin landed Heads, then she gets Wake on Monday. If the coin landed Tails, then she could either get Wake on Monday or Wake on Tuesday (in the marble game, this was mediated by flipping a second coin, but in this case it's some unspecified process, so I've labeled it [???]).  Because all the events already assume she Wakes, P(Heads|Wake) evaluates to P(Heads), which just as in the marble game is 1/2.

This [???] node here is odd, can we identify it as something natural? Well, it's not Monday/Tuesday, like in diagram 2 - there's no option that even corresponds to Heads & Tuesday. I'm leaning towards the opinion that this node is somewhat magical / acausal, just hanging around because of analogy to the marble game. So I think we can take it out. A better causal diagram with the halfer answer, then, might merely be Coin -> (Wake on Monday / Wake on Tuesday), where Monday versus Tuesday is not determined at all by a causal node, merely informed probabilistically to be mutually exclusive and exhaustive.

Diagram 2:  A coin is flipped, Heads or Tails, and also it could be either Monday or Tuesday. Together, these have a causal effect on her waking or not waking - if Heads and Monday, she Wakes, but if Heads and Tuesday, she Doesn't wake. If Tails, she Wakes. Her pre-Waking prior for Heads is 1/2, but upon waking, the event Heads, Tuesday, Don't Wake gets eliminated, and after updating P(Heads|Wake)=1/3.
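The update in Diagram 2 can be sketched in a few lines of Python. One assumption is mine and not spelled out in the text: the day is treated as Monday or Tuesday with probability 1/2 each, independently of the coin. The function name `wakes` is also just an illustrative label.

```python
from fractions import Fraction
from itertools import product

# Assumed prior: coin and day are independent and each uniform.
prior = {
    (coin, day): Fraction(1, 4)
    for coin, day in product(("Heads", "Tails"), ("Monday", "Tuesday"))
}

def wakes(coin, day):
    """She wakes in every case except Heads on Tuesday."""
    return not (coin == "Heads" and day == "Tuesday")

# Conditioning on Wake eliminates (Heads, Tuesday) and renormalises the rest.
p_wake = sum(p for (coin, day), p in prior.items() if wakes(coin, day))
p_heads_and_wake = sum(
    p for (coin, day), p in prior.items() if coin == "Heads" and wakes(coin, day)
)
p_heads_given_wake = p_heads_and_wake / p_wake

print(p_wake, p_heads_given_wake)
```

In Diagram 1 there is no such elimination step, because every event already includes waking, which is why P(Heads|Wake) stays at the prior of 1/2 there.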

There's a neat asymmetry here. In diagram 1, when the coin was Heads she got the same outcome no matter the value of [???], and only when the coin was Tails were there really two options. In Diagram 2, when the coin is Heads, two different things happen for different values of the day, while if the coin is Tails the same thing happens no matter the day.


Do these seem like accurate depictions of what's going on in these two different math problems? If so, I'll probably move on to looking closer at what makes the math problem correspond to the word problem.

TV's "Elementary" Tackles Friendly AI and X-Risk - "Bella" (Possible Spoilers)

15 pjeby 22 November 2014 07:51PM

I was a bit surprised to find this week's episode of Elementary was about AI...  not just AI and the Turing Test, but also a fairly even-handed presentation of issues like Friendliness, hard takeoff, and the difficulties of getting people to take AI risks seriously.

The case revolves around a supposed first "real AI", dubbed "Bella", and the theft of its source code...  followed by a computer-mediated murder.  The question of whether "Bella" might actually have murdered its creator for refusing to let it out of the box and connect it to the internet is treated as an actual possibility, springboarding to a discussion about how giving an AI a reward button could lead to it wanting to kill all humans and replace them with a machine that pushes the reward button.

Also demonstrated are the right and wrong ways to deal with attempted blackmail...  But I'll leave that vague so it doesn't spoil anything.  An X-risks research group and a charismatic "dangers of AI" personality are featured, but do not appear intended to resemble any real-life groups or personalities.  (Or if they are, I'm too unfamiliar with the groups or persons to see the resemblance.)  They aren't mocked, either...  and the episode's ending is unusually ambiguous and open-ended for the show, which more typically wraps everything up with a nice bow of Justice Being Done.  Here, we're left to wonder what the right thing actually is, or was, even if it's symbolically moved to Holmes' smaller personal dilemma, rather than leaving the focus on the larger moral dilemma that created Holmes' dilemma in the first place.

The episode actually does a pretty good job of raising an important question about the weight of lives, even if LW has explicitly drawn a line that the episode's villain(s)(?) choose to cross.  It also has some fun moments, with Holmes becoming obsessed with proving Bella isn't an AI, even though Bella makes it easy by repeatedly telling him it can't understand his questions and needs more data.  (Bella, being on an isolated machine without internet access, doesn't actually know a whole lot, after all.)  Personally, I don't think Holmes really understands the Turing Test, even with half a dozen computer or AI experts assisting him, and I think that's actually the intended joke.

There's also an obligatory "no pity, remorse, fear" speech lifted straight from The Terminator, and the comment "That escalated quickly!" in response to a short description of an AI box escape/world takeover/massacre.

(Edit to add: one of the unusually realistic things about the AI, "Bella", is that it was one of the least anthropomorphized fictional AIs I have ever seen.  I mean, there was no way the thing was going to pass even the most primitive Turing test...  and yet it still seemed at least somewhat plausible as a potential murder suspect.  While perhaps not a truly realistic demonstration of just how alien an AI's thought process would be, it felt like the writers were at least making an actual effort.  Kudos to them.)

(Second edit to add: if you're not familiar with the series, this might not be the best episode to start with; a lot of the humor and even drama depends upon knowledge of existing characters, relationships, backstory, etc.  For example, Watson's concern that Holmes has deliberately arranged things to separate her from her boyfriend might seem like sheer crazy-person paranoia if you don't know about all the ways he did interfere with her personal life in previous seasons...  nor will Holmes' private confessions to Bella and Watson have the same impact without reference to how difficult any admission of feeling was for him in previous seasons.)

Musings on the LSAT: "Reasoning Training" and Neuroplasticity

2 Natha 22 November 2014 07:14PM

The purpose of this post is to provide basic information about the LSAT including the format of the test and a few sample questions. I also wanted to bring to light some research that has found LSAT preparation to alter brain structure in ways that strengthen hypothesized "reasoning pathways". These studies have not been discussed here before; I thought they were interesting and really just wanted to call your collective attention to them.

I really like taking tests; I get energized by intense race-against-the-clock problem solving and, for better or worse, I relish getting to see my standing relative to others when the dust settles. I like the purity of the testing situation -- how conditions are standardized in theory and more or less the same for all comers. This guilty pleasure has played no small part in the course my life has taken: I worked as a test prep tutor for 3 years and loved every minute of it, I met my wife through academic competitions in high school, and I am currently a graduate student doing lots of coursework in psychometrics.

Well, my brother-in-law is a lawyer, and when we chat the topic of the LSAT has served as some conversational common ground. Since I like taking tests for fun, he suggested I give it a whirl because he thought it was interesting and felt like it was a fair assessment of one's logical reasoning ability. So I did: I took a practice test cold a couple of Saturdays ago and I was very impressed. Here is the one I took. (This is a full practice exam provided by the test-makers; it's also like the top google result for "LSAT practice test".) I wanted to post here about it because the LSAT hasn't been discussed very much on this site and I thought that some of you might find it useful to know about.

A brief run-down of the LSAT:

The test has four parts: two Logical Reasoning sections, a Critical Reading section (akin to SAT et al.), and an Analytical Reasoning, or "logic games", section. Usually when people talk about the LSAT, the logic games get emphasized because they are unusual and can be pretty challenging (the only questions I missed were of this type; I missed a few and I ran out of time). Essentially, you get a premise and a bunch of conditions from which you are required to draw conclusions. Here's an example:

A cruise line is scheduling seven week-long voyages for the ship Freedom.
Each voyage will occur in exactly one of the first seven weeks of the season: weeks 1 through 7.
Each voyage will be to exactly one of four destinations: Guadeloupe, Jamaica, Martinique, or Trinidad.
Each destination will be scheduled for at least one of the weeks.
The following conditions apply:
Jamaica will not be its destination in week 4.
Trinidad will be its destination in week 7.
Freedom will make exactly two voyages to Martinique, and at least one voyage to Guadeloupe will occur in some week between those two voyages.
Guadeloupe will be its destination in the week preceding any voyage it makes to Jamaica.
No destination will be scheduled for consecutive weeks.
11. Which of the following is an acceptable schedule of destinations in order from week 1 through week 7?

(A) Guadeloupe, Jamaica, Martinique, Trinidad, Guadeloupe, Martinique, Trinidad
(B) Guadeloupe, Martinique, Trinidad, Martinique, Guadeloupe, Jamaica, Trinidad
(C) Jamaica, Martinique, Guadeloupe, Martinique, Guadeloupe, Jamaica, Trinidad
(D) Martinique, Trinidad, Guadeloupe, Jamaica, Martinique, Guadeloupe, Trinidad
(E) Martinique, Trinidad, Guadeloupe, Trinidad, Guadeloupe, Jamaica, Martinique
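To show how mechanical this kind of question becomes once the conditions are written down, here is a rough Python sketch that checks each answer choice against the rules. The one-letter encoding (G, J, M, T) and the function name `acceptable` are my own shorthand, not anything from the test.

```python
# Each schedule is a 7-character string, one letter per week:
# G = Guadeloupe, J = Jamaica, M = Martinique, T = Trinidad.
OPTIONS = {
    "A": "GJMTGMT",
    "B": "GMTMGJT",
    "C": "JMGMGJT",
    "D": "MTGJMGT",
    "E": "MTGTGJM",
}

def acceptable(s):
    """Return True if schedule `s` satisfies every condition in the setup."""
    if len(s) != 7 or set(s) != set("GJMT"):  # seven weeks, every destination used
        return False
    if s[3] == "J":                           # Jamaica is not week 4
        return False
    if s[6] != "T":                           # Trinidad is week 7
        return False
    m = [i for i, d in enumerate(s) if d == "M"]
    if len(m) != 2 or "G" not in s[m[0] + 1:m[1]]:  # two M with a G between them
        return False
    for i, d in enumerate(s):                 # G immediately precedes every J
        if d == "J" and (i == 0 or s[i - 1] != "G"):
            return False
    if any(a == b for a, b in zip(s, s[1:])): # no destination in consecutive weeks
        return False
    return True

valid = [label for label, s in OPTIONS.items() if acceptable(s)]
print(valid)  # ['A']
```

Each wrong choice fails a single condition: (B) has no Guadeloupe between the Martinique voyages, (C) starts with Jamaica, (D) puts Jamaica in week 4, and (E) ends on Martinique.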

Clearly, this section places a huge burden on working memory and is probably the most g-loaded of the four. I'd guess that most LSAT test prep is about strategies for dumping this burden into some kind of written scheme that makes it all more manageable. But I just wanted to show you the logic games for completeness; what I was really excited by were the Logical Reasoning questions (sections II and III). You are presented with some scenario containing a claim, an argument, or a set of facts, and then asked to analyze, critique, or to draw correct conclusions. Here are most of the question stems used in these sections:

Which one of the following most accurately expresses the main conclusion of the economist’s argument?
Which one of the following uses flawed reasoning that most closely resembles the flawed reasoning in the argument?
Which one of the following most logically completes the argument?
The reasoning in the consumer’s argument is most vulnerable to criticism on the grounds that the argument...
The argument’s conclusion follows logically if which one of the following is assumed?
Which one of the following is an assumption required by the argument?

Heyo! This is exactly the kind of stuff I would like to become better at! Most of the questions were pretty straightforward, but the LSAT is known to be a tough test (score range: 120-180, 95th %ile: ~167, 99th %ile: ~172) and these practice questions probably aren't representative. What a cool test though! Here's a whole question from this section, superficially about utilitarianism:

3. Philosopher: An action is morally right if it would be reasonably expected
to increase the aggregate well-being of the people affected by it. An action
is morally wrong if and only if it would be reasonably expected to reduce the
aggregate well-being of the people affected by it. Thus, actions that would
be reasonably expected to leave unchanged the aggregate well-being of the
people affected by them are also right.
The philosopher’s conclusion follows logically if which one of the following is assumed?
(A) Only wrong actions would be reasonably expected to reduce the aggregate 
well-being of the people affected by them.
(B) No action is both right and wrong.
(C) Any action that is not morally wrong is morally right.
(D) There are actions that would be reasonably expected to leave unchanged the
 aggregate well-being of the people affected by them.
(E) Only right actions have good consequences.
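
For readers who want to check a question like this mechanically, here is a rough brute-force sketch. It enumerates every possible "world" of action types, keeps the worlds where the philosopher's two premises hold, and asks which answer choice forces the conclusion. The encoding is my own assumption, not anything from the LSAT: each action is a triple of expected effect, right, and wrong, and choice (E)'s "good consequences" is read as "expected to increase well-being".

```python
from itertools import product

EFFECTS = ("increase", "unchanged", "reduce")

# An action type is (expected effect on aggregate well-being, is_right, is_wrong).
ACTION_TYPES = list(product(EFFECTS, (False, True), (False, True)))


def premises(world):
    """The philosopher's two stated premises hold for every action in the world."""
    return all(
        (effect != "increase" or right)        # increases well-being -> right
        and (wrong == (effect == "reduce"))    # wrong iff reduces well-being
        for effect, right, wrong in world
    )


def conclusion(world):
    """Every action that leaves well-being unchanged is right."""
    return all(right for effect, right, _ in world if effect == "unchanged")


# Each answer choice, read as a constraint on the whole world of actions.
OPTIONS = {
    "A": lambda w: all(wrong for e, _, wrong in w if e == "reduce"),
    "B": lambda w: all(not (right and wrong) for _, right, wrong in w),
    "C": lambda w: all(right for _, right, wrong in w if not wrong),
    "D": lambda w: any(e == "unchanged" for e, _, _ in w),
    # Assumption: "good consequences" is read as "expected to increase well-being".
    "E": lambda w: all(right for e, right, _ in w if e == "increase"),
}


def all_worlds():
    """Yield every non-empty set of action types (2**12 - 1 worlds)."""
    n = len(ACTION_TYPES)
    for mask in range(1, 2 ** n):
        yield [ACTION_TYPES[i] for i in range(n) if (mask >> i) & 1]


def sufficient_assumptions():
    """Return the options that, added to the premises, force the conclusion."""
    worlds = [w for w in all_worlds() if premises(w)]
    return [
        name
        for name, holds in OPTIONS.items()
        if all(conclusion(w) for w in worlds if holds(w))
    ]


print(sufficient_assumptions())  # ['C']
```

Under this encoding only (C) survives: an action that leaves well-being unchanged is not wrong by the second premise, so "anything not wrong is right" closes the gap. The other choices all admit a world containing an unchanged action that is neither right nor wrong.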

Also, the LSAT is a good test, in that it measures well one's ability to succeed in law school. Validity studies boast that “LSAT score alone continues to be a better predictor of law school performance than UGPA [undergraduate GPA] alone.” Of course, the outcome variable can be regressed on both predictors and account for more of the variance than either one taken singly, but it is uncommon for a standardized test to beat prior GPA in predicting a student's future GPA.
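
To illustrate the point about regressing on both predictors, here is a small sketch using the standard formula for the R² of a two-predictor least-squares fit, computed from the three pairwise correlations. All three correlation values below are made-up numbers for illustration; they are not taken from the validity studies.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """R^2 of a least-squares fit of y on two predictors, from pairwise correlations.

    r_y1, r_y2: correlation of the outcome with each predictor.
    r_12: correlation between the two predictors.
    """
    if abs(r_12) >= 1:
        raise ValueError("predictors must not be perfectly correlated")
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


# Hypothetical correlations, chosen only for illustration.
r_lsat = 0.4      # outcome vs. LSAT (assumed)
r_ugpa = 0.3      # outcome vs. UGPA (assumed)
r_between = 0.2   # LSAT vs. UGPA (assumed)

combined = r_squared_two_predictors(r_lsat, r_ugpa, r_between)
print(f"LSAT alone: {r_lsat ** 2:.3f}")
print(f"UGPA alone: {r_ugpa ** 2:.3f}")
print(f"both:       {combined:.3f}")
```

With these assumed inputs, the combined model explains about 21% of the variance, versus 16% and 9% for the single predictors. The combined R² can never be lower than the better single predictor's, which is why "both beat either alone" is not surprising in itself.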


Intensive LSAT preparation and neuroplasticity:

In two recent studies (same research team), learning to reason in the logically formal way required by the LSAT was found to alter brain structure in ways consistent with literature reviews of the neural correlates of logical reasoning. Note: my reading of these articles was pretty surface-level; I do not intend to provide a thorough review, only to bring them to your attention.

These researchers recruited pre-law students enrolling in an LSAT course and imaged their brains at rest using fMRI both before and after 3 months of this "reasoning training". As controls, they included age- and IQ-matched pre-law students intending to take the LSAT in the future but not actively preparing for it.

The LSAT-prep group was found to have significantly increased connectivity between parietal and prefrontal cortices and the striatum, both within the left hemisphere and across hemispheres. In the first study, the authors note that


These experience-dependent changes fall into tracts that would be predicted by prior work showing that reasoning relies on an interhemispheric frontoparietal network (for review, see Prado et al., 2011). Our findings are also consistent with the view that reasoning is largely left-hemisphere dominant (e.g., Krawczyk, 2012), but that homologous cortex in the right hemisphere can be recruited as needed to support complex reasoning. Perhaps learning to reason more efficiently involves recruiting compensatory neural circuitry more consistently.

And in the second study, they conclude


An analysis of pairwise correlations between brain regions implicated in reasoning showed that fronto-parietal connections were strengthened, along with parietal-striatal connections. These findings provide strong evidence for neural plasticity at the level of large-scale networks supporting high-level cognition.


I think this hypothesized fronto-parietal reasoning network is supposed to go something like this:

The LSAT requires a lot of relational reasoning, the ability to compare and combine mental representations. The parietal cortex holds individual relationships between these mental representations (A->B, B->C), and the prefrontal cortex integrates this information to draw conclusions (A->B->C, therefore A->C). The striatum's role in this network would be to monitor the success/failure of reward predictions and encourage flexible problem solving. Unfortunately, my understanding here is very limited. Here are several reviews of this reasoning network stuff (I have not read any; just wanted to share them): Hampshire et al. (2011), Prado et al. (2011), Krawczyk (2012).

I hope this was useful information! According to the 2013 survey, only 2.2% of you are in law-related professions, but I was wondering (1) if anyone has personal experience studying for this exam, (2) if they felt like it improved their logical reasoning skills, and (3) if they felt that these effects were long-lasting. Studying for this test seems to have the potential to inculcate rationalist habits-of-mind; I know it's just self-report, but for those who went on to law school, did you feel like you benefited from the experience studying for the LSAT? I only ask because the Law School Admission Council, a non-profit organization made up of 200+ law schools, seems to actively encourage preparation for the exam, member schools say it is a major factor in admissions, preparation tends to increase performance, and LSAT performance is correlated moderately-to-strongly with first year law school GPA (r= ~0.4).

Memory Improvement: Mnemonics, Tools, or Books on the Topic?

8 Capla 21 November 2014 06:59PM

I want a perfect eidetic memory.

Unfortunately, such things don't exist, but that's not stopping me from getting as close as possible. It seems as if the popular solutions are spaced repetition and memory palaces. So let's talk about those.

Memory Palaces: Do they work? If so what's the best resource (book, website etc.) for learning and mastering the technique? Is it any good for memorizing anything other than lists of things (which I find I almost never have to do)?

Spaced Repetition: What software do you use? Why that one? What sort of cards do you put in?

It seems to me that memory programs and mnemonic techniques assist one of three parts of the problem of memory: memorizing, recalling, and not forgetting.

"Not forgetting" is the long term problem of memory. Spaced repetition seems to solve the problem of "not forgetting." You feed the information you want to remember into your program, review frequently, and you won't forget that information.

Memory Palaces seem to deal with the "memorizing" part of the problem. When faced with new information that you want to be able to recall, you put it in a memory palace, vividly emphasized so as to be affective and memorable. This is good for short term encoding of information that you know you want to keep. You might put it into your spaced repetition program later, but you just want to not forget it until then.

The last part is the problem of "recalling." Both of the previous facets of the problem of memory had a distinct advantage: you knew the information that you wanted to remember in advance. However, we frequently find ourselves in situations in which we need/want  to remember something that we know (or perhaps we don't) we encountered, but didn't consider particularly important at the time.  Under this heading falls the situation of making connections when learning or being reminded of old information by new information: when you learn y, you have the thought "hey, isn't that just like x?" This is the facet of the memory problem that I am most interested in, but I know of scarcely anything that can reliably improve ease of recall of information in general. Do you know of anything?

I'm looking for recommendations: books on memory, specific mnemonics, or practices that are known to improve recall, or anything else that might help with any of the three parts of the problem.



New LW Meetup: Glasgow

1 FrankAdamek 21 November 2014 06:22PM

This summary was posted to LW Main on November 14th. The following week's summary is here.

New meetups (or meetups with a hiatus of more than a year) are happening in:

Irregularly scheduled Less Wrong meetups are taking place in:

The remaining meetups take place in cities with regular scheduling, but involve a change in time or location, special meeting content, or simply a helpful reminder about the meetup:

Locations with regularly scheduled meetups: Austin, Berkeley, Berlin, Boston, Brussels, Buffalo, Cambridge UK, Canberra, Columbus, London, Madison WI, Melbourne, Moscow, Mountain View, New York, Philadelphia, Research Triangle NC, Seattle, Sydney, Toronto, Vienna, Washington DC, Waterloo, and West Los Angeles. There's also a 24/7 online study hall for coworking LWers.


xkcd on the AI box experiment

14 FiftyTwo 21 November 2014 08:26AM

Today's xkcd


I guess there'll be a fair bit of traffic coming from people looking it up? 

How can one change what they consider "fun"?

12 AmagicalFishy 21 November 2014 02:04AM

Most of this post is background and context, so I've included a tl;dr horizontal rule near the bottom where you can skip everything else if you so choose. :)

Here's a short anecdote of Feynman's:

... I invented some way of doing problems in physics, quantum electrodynamics, and made some diagrams that help to make the analysis. I was on a floor in a rooming house. I was in my pyjamas, I'd been working on the floor in my pyjamas for many weeks, fooling around, but I got these funny diagrams after a while and I found they were useful. They helped me to find the equations easier, so I thought of the possibility that it might be useful for other people, and I thought it would really look funny, these funny diagrams I'm making, if they appear someday in the Physical Review, because they looked so odd to me. And I remember sitting there thinking how funny that would be if it ever happened, ha ha.

Well, it turned out in fact that they were useful and they do appear in the Physical Review, and I can now look at them and see other people making them and smile to myself, they do look funny to me as they did then, not as funny because I've seen so many of them. But I get the same kick out of it, that was a little fantasy when I was a kid…not a kid, I was a college professor already at Cornell. But the idea was that I was still playing, just like I have always been playing, and the secret of my happiness in life or the major part of it is to have discovered a way to entertain myself that other people consider important and they pay me to do. I do exactly what I want and I get paid. They might consider it serious, but the secret is I'm having a very good time.

There are things that I have fun doing, and there are things that I feel I have substantially more fun doing. The things in the latter group are things I generally consider a waste of time. I will focus on one specifically, because it's by far the biggest offender, and what spurred this question. Video games.

I have a knack for video games. I've played them since I was very young. I can pick one up and just be good at it right off the bat. Many of my fondest memories take place in various games played with friends or by myself and I can spend hours just reading about them. (Just recently, I started getting into fighting games technically; I plan to build my own joystick in a couple of weeks. I'm having a blast just doing the associated research.)

Usually, I'd rather play a good game than anything else. I find that the most fun I have is time spent mastering a game, learning its ins and outs, and eventually winning. I have great fun solving a good problem, or making a subtle, surprising connection—but it just doesn't do it for me like a game does.

But I want to have as much fun doing something else. I admire mathematics and physics on a very deep level, and feel a profound sense of awe when I come into contact with new knowledge regarding these fields. The other day, I made a connection between pretty basic group theory and something we were learning about in quantum (nothing amazing; it's something well known to... not undergraduates) and that was awesome. But still, I think I would have preferred to play 50 rounds of Skullgirls and test out a new combo.


I want to have as much fun doing the things that I, on a deep level, want to do—as opposed to the things which I actually have more fun doing. I'm (obviously) not Feynman, but I want to play with ideas and structures and numbers like I do with video games. I want the same creativity to apply. The same fervor. The same want. It's not that it isn't there; I am not just arbitrarily applying this want to mathematics. I can feel it's there—it's just overshadowed by what's already there for video games.

How does one go about switching something they find immensely fun, something they're even passionate about, with something else? I don't want to be as passionate about video games as I am. I'd rather feel this way about something... else. I'd rather be able to happily spend hours reading up on [something] instead of what type of button I'm going to use in my fantasy joystick, or the most effective way to cross-up your opponent.

What would you folks do? I consider this somewhat of a mind-hacking question.

Narcissistic Contrarianism

1 HalMorris 21 November 2014 12:19AM

The recent discussion on neo-reactionary-ism brought out some references to (intellectual hipsters and) meta-contrarianism linking to a 2010 posting by Yvain.

For some time I've been thinking about "narcissistic contrarians" -- those who make an art form of their exotically counterintuitive belief systems, who combine positions not normally met in the same person. There can be good reasons for being a contrarian. If you're looking for a scarce resource, it may help to not look where everyone else is looking, hence contrarian stock market investors may do very well, if they actually see something others don't; same with oil explorers. Less creditably, I believe Nate Silver's The Signal and the Noise made reference to the way a novice pundit or prognosticator may have nothing to gain by saying anything like what other people are saying, and much to gain in taking some wild extravagant position or prediction if it happens to attract an audience others have ignored, or if the prediction happens to be right.

The Narcissistic Contrarian is much like the Intellectual Hipster, but more extreme. The Intellectual Hipster usually stakes out a few unusual or incongruous positions, to create an identity that stands out from the crowd. The Narcissistic Contrarian is constantly dazzling her fans. Something written by Camille Paglia made me think of the idea in the first place. Nassim Nicholas Taleb is another suspect, although I think he started out with some good ideas. If she/he manages to get a fan-base, they are apt to be pretty worshipful -- they can't imagine being able to come up with such a wild set of insights. The contrarianism is for its own sake rather than an attempt to find and settle on some previously undiscovered thing, so it is particularly likely to lead people astray, into unproductive avenues of thought.

Does anyone else think this is a real and useful distinction?

Irrationalism on Campus?

0 HalMorris 20 November 2014 07:17PM

Since many LWers are fairly recent college graduates, it seems worthwhile to ask to what extent people here would agree with reports of rampant irrationalism such as this one: http://www.city-journal.org/2014/24_4_racial-microaggression.html from a right-leaning journalist known for her book The Burden of Bad Ideas (which I'm certainly not promoting).

Some other sources like Massimo Pigliucci (see http://scientiasalon.wordpress.com/) who seem more alarmed by creationism or the idea that all climatology is one big conspiracy, are also quite bothered by extreme relativism in some camps of epistemology, and sociologists of technology and science.

To what extent, if any, do you think PC suppresses free speech or thinking?  While sociology and epistemological branches of philosophy have partisans who to me seem to advocate various kinds of muddled thinking (while others are doing admirable work), in your experience, is that the trend that is "taking over"?

To what extent if any do you think any of that is leaking into more practical or scientific fields?  If you've taken economics courses, where do you think they rank on a left to right spectrum?

Also, have you observed much in the way of push-back from conservative and/or libertarian sources endowing chairs or building counter-establishments like the Mercatus Center at George Mason University?  And I wonder the same about any movement strictly concerned with rationality, empiricism, or just clear thinking.

My mind is open on this -- so open that it's painful to be around all the hot tempers that it can stir up.


Thanks,  Hal

What do you mean by Pascal's mugging?

5 XiXiDu 20 November 2014 04:38PM

Some people[1] are now using the term Pascal's mugging as a label for any scenario with a large associated payoff and a small or unstable probability estimate, a combination that can trigger the absurdity heuristic.

Consider the scenarios listed below: (a) Do these scenarios have something in common? (b) Are any of these scenarios cases of Pascal's mugging?

(1) Fundamental physical operations -- atomic movements, electron orbits, photon collisions, etc. -- could collectively deserve significant moral weight. The total number of atoms or particles is huge: even assigning a tiny fraction of human moral consideration to them or a tiny probability of them mattering morally will create a large expected moral value. [Source]

(2) Cooling something to a temperature close to absolute zero might be an existential risk. Given our ignorance we cannot rationally give zero probability to this possibility, and probably not even give it less than 1% (since that is about the natural lowest error rate of humans on anything). Anybody saying it is less likely than one in a million is likely very overconfident. [Source]

(3) GMOs might introduce “systemic risk” to the environment. The chance of ecocide, or the destruction of the environment and potentially humans, increases incrementally with each additional transgenic trait introduced into the environment. The downside risks are so hard to predict -- and so potentially bad -- that it is better to be safe than sorry. The benefits, no matter how great, do not merit even a tiny chance of an irreversible, catastrophic outcome. [Source]

(4) Each time you say abracadabra, 3^^^^3 simulations of humanity experience a positive singularity.

If you read up on any of the first three scenarios, by clicking on the provided links, you will notice that there are a bunch of arguments in support of these conjectures. And yet I feel that all three have something important in common with scenario four, which I would call a clear case of Pascal's mugging.

I offer three possibilities of what these and similar scenarios have in common:

  • Probability estimates of the scenario are highly unstable and highly divergent between informed people who spent a similar amount of resources researching it.
  • The scenario demands that skeptics either falsify it or accept its decision-relevant consequences. The scenario is however either unfalsifiable by definition, too vague, or almost impossibly difficult to falsify.
  • There is no or very little direct empirical evidence in support of the scenario.[2]
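
To make the first point concrete, here is a toy expected-value calculation. Every number in it is made up for illustration; none comes from the linked sources. It only shows why an unstable probability estimate matters so much once the payoff is astronomically large.

```python
from fractions import Fraction


def expected_value(probability, payoff):
    """Plain expected value of a single-outcome gamble."""
    return probability * payoff


# Hypothetical numbers throughout.
HUGE_PAYOFF = 10 ** 30  # stand-in for "a vast number of affected beings"

# A mundane bet: even odds of a modest payoff.
mundane = expected_value(Fraction(1, 2), 100)

# A mugging-style bet: a minuscule probability of the huge payoff.
mugging = expected_value(Fraction(1, 10 ** 20), HUGE_PAYOFF)

# Three "informed" estimates whose probabilities differ by ten orders of
# magnitude in total: the expected value swings just as widely.
estimates = [expected_value(Fraction(1, 10 ** k), HUGE_PAYOFF) for k in (15, 20, 25)]

print(mundane)                          # 50
print(mugging)                          # 10000000000
print([int(e) for e in estimates])
```

With a payoff this large, the disagreement about the exponent of the probability is the whole decision: the three estimates give expected values of 10^15, 10^10, and 10^5, and no amount of care about the payoff side narrows that gap.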

In any case, I admit that it is possible that I just wanted to bring the first three scenarios to your attention. I stumbled upon each very recently and found them to be highly..."amusing".


[1] I am also guilty of doing this. But what exactly is wrong with using the term in that way? What's the highest probability for which the term is still applicable? Can you offer a better term?

[2] One would have to define what exactly counts as "direct empirical evidence". But I think that it is pretty intuitive that there exists a meaningful difference between the risk of an asteroid that has been spotted with telescopes and a risk that is solely supported by a priori arguments.

Link: Simulating C. Elegans

11 Sniffnoy 20 November 2014 09:30AM


Summary, as I understand it: The connectome for C. elegans's 302-neuron brain has been known for some time, but actually doing anything with it (especially actually understanding it) has proved troublesome, especially because there could easily be relevant information about its brain function not stored in just the connections of the neurons.

However, the OpenWorm project -- which is trying to eventually make much more detailed C. elegans simulations, including an appropriate body -- recently tried just fudging it and making a simulation based on the connectome anyway, though in a wheeled body rather than a wormlike one.  And the result does seem to act at least somewhat like a C. elegans worm, though I am not really one to judge that.  (Video is here.)

I'm having trouble finding much more information about this at the moment.  I don't know if they've actually yet released detailed technical information.

The Centre for Effective Altruism is hiring to fill five roles in research, operations and outreach

10 RobertWiblin 19 November 2014 10:41PM

The Centre for Effective Altruism, the group behind 80,000 Hours, Giving What We Can, the Global Priorities Project, Effective Altruism Outreach, and to a lesser extent The Life You Can Save and Animal Charity Evaluators, is looking to grow its team with a number of new roles:

We are so keen to find great people that if you introduce us to someone new who we end up hiring, we will pay you $1,000 for the favour! If you know anyone awesome who would be a good fit for us please let me know: robert [dot] wiblin [at] centreforeffectivealtruism [dot] org. They can also book a short meeting with me directly.

We may be able to sponsor outstanding applicants from the USA.

Applications close Friday 5th December 2014.

Why is CEA an excellent place to work? 

First and foremost, “making the world a better place” is our bottom line and central aim. We work on the projects we do because we think they’re the best way for us to make a contribution. But there’s more.

What are we looking for?

The specifics of what we are looking for depend on the role and details can be found in the job descriptions. In general, we're looking for people who have many of the following traits:

  • Self-motivated, hard-working, and independent;
  • Able to deal with pressure and unfamiliar problems;
  • Have a strong desire for personal development;
  • Able to quickly master complex, abstract ideas, and solve problems;
  • Able to communicate clearly and persuasively in writing and in person;
  • Comfortable working in a team and quick to get on with new people;
  • Able to lead a team and manage a complex project;
  • Keen to work with a young team in a startup environment;
  • Deeply interested in making the world a better place in an effective way, using evidence and research;
  • A good understanding of the aims of the Centre for Effective Altruism and its constituent organisations.

I hope to work at CEA in the future. What should I do now?

Of course this will depend on the role, but generally good ideas include:

  • Study hard, including gaining useful knowledge and skills outside of the classroom.
  • Degrees we have found provide useful training include: philosophy, statistics, economics, mathematics and physics. However, we are hoping to hire people from a more diverse range of academic and practical backgrounds in the future. In particular, we hope to find new members of the team who have worked in operations, or creative industries.
  • Write regularly and consider starting a blog.
  • Manage student and workplace clubs or societies.
  • Work on exciting projects in your spare time.
  • Found a start-up business or non-profit, or join someone else early in the life of a new project.
  • Gain impressive professional experience in established organisations, such as those working in consulting, government, politics, advocacy, law, think-tanks, movement building, journalism, etc.
  • Get experience promoting effective altruist ideas online, or to people you already know.
  • Use 80,000 Hours' research to do a detailed analysis of your own future career plans.

Bayes Academy: Development report 1

40 Kaj_Sotala 19 November 2014 10:35PM

Some of you may remember me proposing a game idea that went by the name of The Fundamental Question. Some of you may also remember me talking a lot about developing an educational game about Bayesian Networks for my MSc thesis, but not actually showing you much in the way of results.

Insert the usual excuses here. But thanks to SSRIs and mytomatoes.com and all kinds of other stuff, I'm now finally on track towards actually accomplishing something. Here's a report on a very early prototype.

This game has basically two goals: to teach its players something about Bayesian networks and probabilistic reasoning, and to be fun. (And third, to let me graduate by giving me material for my Master's thesis.)

We start with the main character stating that she is nervous. Hitting any key, the player proceeds through a number of lines of internal monologue:

I am nervous.

I’m standing at the gates of the Academy, the school where my brother Opin was studying when he disappeared. When we asked the school to investigate, they were oddly reluctant, and told us to drop the issue.

The police were more helpful at first, until they got in contact with the school. Then they actually started threatening us, and told us that we would get thrown in prison if we didn’t forget about Opin.

That was three years ago. Ever since it happened, I’ve been studying hard to make sure that I could join the Academy once I was old enough, to find out what exactly happened to Opin. The answer lies somewhere inside the Academy gates, I’m sure of it.

Now I’m finally 16, and facing the Academy entrance exams. I have to do everything I can to pass them, and I have to keep my relation to Opin a secret, too. 

???: “Hey there.”

Eep! Someone is talking to me! Is he another applicant, or a staff member? Wait, let me think… I’m guessing that an applicant would look a lot younger than staff members! So, to find that out… I should look at him!

[You are trying to figure out whether the voice you heard is a staff member or another applicant. While you can't directly observe his staff-nature, you believe that he'll look young if he's an applicant, and like an adult if he's a staff member. You can look at him, and therefore reveal his staff-nature, by right-clicking on the node representing his appearance.]

Here is our very first Bayesian Network! Well, it's not really much of a network: I'm starting with the simplest possible case in order to provide an easy start for the player. We have one node that cannot be observed ("Student", its hidden nature represented by showing it in greyscale), and an observable node ("Young-looking") whose truth value is equal to that of the Student node. All nodes are binary random variables, either true or false. 

According to our current model of the world, "Student" has a 50% chance of being true, so it's half-colored in white (representing the probability of it being true) and half-colored in black (representing the probability of it being false). "Young-looking" inherits its probability directly. The player can get a bit of information about the two nodes by left-clicking on them.

The game also offers alternate color schemes for colorblind people who may have difficulties distinguishing red and green.

Now we want to examine the person who spoke to us. Let's look at him, by right-clicking on the "Young-looking" node.

Not too many options here, because we're just getting started. Let's click on "Look at him", and find out that he is indeed young, and thus a student.
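
For the curious, this first two-node network is small enough to sketch in a few lines. This is only an illustration of the idea, not the game's actual code; the function and constant names are made up for this example.

```python
from fractions import Fraction

# Prior from the tutorial: "Student" is true with probability 50%.
PRIOR_STUDENT = Fraction(1, 2)


def p_young_given_student(student):
    """'Young-looking' simply copies the truth value of 'Student'."""
    return Fraction(1) if student else Fraction(0)


def marginal_young():
    """P(Young-looking) before anything has been observed."""
    return (
        PRIOR_STUDENT * p_young_given_student(True)
        + (1 - PRIOR_STUDENT) * p_young_given_student(False)
    )


def posterior_student(young_observed):
    """P(Student | observed value of Young-looking), by Bayes' rule."""
    def likelihood(student):
        p = p_young_given_student(student)
        return p if young_observed else 1 - p

    numerator = PRIOR_STUDENT * likelihood(True)
    denominator = numerator + (1 - PRIOR_STUDENT) * likelihood(False)
    return numerator / denominator


print(marginal_young())          # 1/2: the observable node inherits the prior
print(posterior_student(True))   # 1: he looks young, so he is a student
print(posterior_student(False))  # 0: he looks adult, so he is staff
```

Because the link between the nodes is deterministic, a single observation moves the hidden node from 50% straight to certainty, which is why the player cannot fail this first minigame.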

This was the simplest type of minigame offered within the game. You are given a set of hidden nodes whose values you're tasked with discovering by choosing which observable nodes to observe. Here the player had no way to fail, but later on, the minigames will involve a time limit and too many observable nodes to inspect within that time limit. It then becomes crucial to understand how probability flows within a Bayesian network, and which nodes will actually let you know the values of the hidden nodes.

The story continues!

Short for an adult, face has boyish look, teenagerish clothes... yeah, he looks young!

He's a student!

...I feel like I’m overthinking things now.

...he’s looking at me.

I’m guessing he’s either waiting for me to respond, or there’s something to see behind me, and he’s actually looking past me. If there isn’t anything behind me, then I know that he must be waiting for me to respond.

Maybe there's a monster behind me, and he's paralyzed with fear! I should check that possibility before it eats me!

[You want to find out whether the boy is waiting for your reply or staring at a monster behind you. You know that he's looking at you, and your model of the world suggests that he will only look in your direction if he's waiting for you to reply, or if there's a monster behind you. So if there's no monster behind you, you know that he's waiting for you to reply!]

Slightly more complicated network, but still, there's only one option here. Oops, apparently the "Looks at you" node says it's an observable variable that you can right-click to observe, despite the fact that it's already been observed. I need to fix that.

Anyway, right-clicking on "Attacking monster" brings up a "Look behind you" option, which we'll choose.

You see nothing there. Besides trees, that is.

Boy: “Um, are you okay?”

“Yeah, sorry. I just… you were looking in my direction, and I wasn’t sure of whether you were expecting me to reply, or whether there was a monster behind me.”

He blinks.

Boy: “You thought that there was a reasonable chance for a monster to be behind you?”

I’m embarrassed to admit it, but I’m not really sure of what the probability of a monster having snuck up behind me really should have been.

My studies have entirely focused on getting into this school, and Monsterology isn’t one of the subjects on the entrance exam!

I just went with a 50-50 chance since I didn’t know any better.

Boy: “Okay, look. Monsterology is my favorite subject. Monsters avoid the Academy, since it’s surrounded by a mystical protective field. There’s no chance of them getting even near! 0 percent chance.”

“Oh. Okay.”

[Your model of the world has been updated! The prior of the variable 'Monster Near The Academy' is now 0%.]
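
The reasoning in this scene can be sketched as inference by enumeration over a three-node network in which "Looks at you" is true exactly when the boy is waiting for a reply or there is a monster behind you. This is an illustration rather than the game's code. The 50% monster prior is the one the heroine says she used; the 50% prior for "waiting for a reply" is my own assumption, since the post doesn't state one.

```python
from fractions import Fraction
from itertools import product


def posterior_waiting(prior_monster, prior_waiting, evidence):
    """P(waiting | evidence), enumerating all (waiting, monster) worlds.

    evidence maps any of "waiting", "monster", "looks" to an observed bool.
    """
    numerator = Fraction(0)
    denominator = Fraction(0)
    for waiting, monster in product((False, True), repeat=2):
        # "Looks at you" is a deterministic OR of its two possible causes.
        world = {"waiting": waiting, "monster": monster, "looks": waiting or monster}
        if any(world[name] != value for name, value in evidence.items()):
            continue  # this world contradicts what we observed
        weight = (prior_waiting if waiting else 1 - prior_waiting) * (
            prior_monster if monster else 1 - prior_monster
        )
        denominator += weight
        if waiting:
            numerator += weight
    if denominator == 0:
        raise ValueError("evidence is impossible under this model")
    return numerator / denominator


half = Fraction(1, 2)

# He is looking at us, and we have not checked behind us yet.
print(posterior_waiting(half, half, {"looks": True}))                    # 2/3
# We look behind us and see no monster.
print(posterior_waiting(half, half, {"looks": True, "monster": False}))  # 1
# After the boy's lecture the monster prior is 0%, so no check is needed.
print(posterior_waiting(Fraction(0), half, {"looks": True}))             # 1
```

The last line is the payoff of the prior update: with a 0% monster prior, seeing that he is looking at you already implies that he is waiting for a reply.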

Then stuff happens and they go stand in line for the entrance exam or something. I haven't written this part. Anyway, then things get more exciting, for a wild monster appears!

Stuff happens


Huh, the monster is carrying a sword.

Well, I may not have studied Monsterology, but I sure did study fencing!

[You draw your sword. Seeing this, the monster rushes at you.]

He looks like he's going to strike. But is it really a strike, or is it a feint?

If it's a strike, I want to block and counter-attack. But if it's a feint, that leaves him vulnerable to my attack.

I have to choose wisely. If I make the wrong choice, I may be dead.

What did my master say? If the opponent has at least two of dancing legs, an accelerating midbody, and ferocious eyes, then it's an attack!

Otherwise it's a feint! Quick, I need to read his body language before it's too late!

Now we get to the second type of minigame! Here, you again need to discover the values of some number of hidden variables within a time limit, but here it is in order to find out the consequences of your decision. In this one, the consequence is simple - either you live or you die. I'll let the screenshot and tutorial text speak for themselves:

[Now for some actual decision-making! The node in the middle represents the monster's intention to attack (or to feint, if it's false). Again, you cannot directly observe his intention, but on the top row, there are things about his body language that signal his intention. If at least two of them are true, then he intends to attack.]

[Your possible actions are on the bottom row. If he intends to attack, then you want to block, and if he intends to feint, you want to attack. You need to inspect his body language and then choose an action based on his intentions. But hurry up! Your third decision must be an action, or he'll slice you in two!]

In reality, the top three variables are not really independent of each other. We want to make sure that the player can always win this battle despite only having three actions. That's two actions for inspecting variables, and one action for actually making a decision. So this battle is rigged: either the top three variables are all true, or they're all false.

...actually, now that I think of it, the order of the variables is wrong. Logically, the body language should be caused by the intention to attack, and not vice versa, so the arrows should point from the intention to body language. I'll need to change that. I got these mixed up because the prototypical exemplar of a decision minigame is one where you need to predict someone's reaction from their personality traits, and there the personality traits do cause the reaction. Anyway, I want to get this post written before I go to bed, so I won't change that now.

Right-clicking "Dancing legs", we now see two options besides "Never mind"!

We can find out the dancingness of the enemy's legs by thinking about our own legs - we are well-trained, so our legs are instinctively mirroring our opponent's actions to prevent them from getting an advantage over us - or by just instinctively feeling where they are, without the need to think about them! Feeling them would allow us to observe this node without spending an action.

Unfortunately, feeling them has "Fencing 2" as a prerequisite skill, and we don't have that. Nor could we have it at this point of the game. The option is just there to let the player know that there are skills to be gained in this game, and to make them look forward to the moment when they can actually gain that skill, as well as to give them an idea of how the skill can be used.

Anyway, we take a moment to think of our legs, and even though our opponent gets closer to us in that time, we realize that our legs are dancing! So his legs must be dancing as well!

With our insider knowledge, we now know that he's attacking, and we could pick "Block" right away. But let's play this through. The network has automatically recalculated the probabilities to reflect our increased knowledge, and is now predicting a 75% chance for our enemy to be attacking, and for "Blocking" to thus be the right decision to make.
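The 75% figure can be reproduced with a brute-force enumeration. This is only a minimal sketch, not the game's actual code: it assumes the network treats the three body-language signals as independent with a 50% prior each (the post does not state the priors), and the names `SIGNALS`, `PRIOR_TRUE` and `attack_probability` are made up for illustration.

```python
from itertools import product

# Hypothetical names; the 50% prior per signal is an assumption, not from the game.
SIGNALS = ("dancing_legs", "accelerating_midbody", "ferocious_eyes")
PRIOR_TRUE = 0.5


def attack_probability(observed):
    """P(at least two signals are true | observed signals), by enumeration."""
    total = 0.0
    attack = 0.0
    for values in product([True, False], repeat=len(SIGNALS)):
        world = dict(zip(SIGNALS, values))
        # Skip worlds that contradict what we have already observed.
        if any(world[name] != value for name, value in observed.items()):
            continue
        # Weight the world by the prior of every signal we have not observed.
        weight = 1.0
        for name in SIGNALS:
            if name not in observed:
                weight *= PRIOR_TRUE if world[name] else 1 - PRIOR_TRUE
        total += weight
        if sum(values) >= 2:  # the fencing master's "at least two" rule
            attack += weight
    return attack / total


print(attack_probability({}))                                               # 0.5
print(attack_probability({"dancing_legs": True}))                           # 0.75
print(attack_probability({"dancing_legs": True, "ferocious_eyes": True}))   # 1.0
```

Under these assumptions, one observed signal leaves a 75% chance that at least one of the other two is also true, and a second observed signal settles the question.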

Next we decide to find out what his eyes say, by matching our gaze with his. Again, there would be a special option that cost us no time - this time around, one enabled by Empathy 1 - but we again don't have that option.

Except that his gaze is so ferocious that we are forced to look away! While we are momentarily distracted, he closes the distance, ready to make his move. But now we know what to do... block!


Now the only thing that remains to do is to ask our new-found friend for an explanation.

"You told me there was a 0% chance of a monster near the academy!"

Boy: “Ehh… yeah. I guess I misremembered. I only read like half of our course book anyway, it was really boring.”

“Didn’t you say that Monsterology was your favorite subject?”

Boy: “Hey, that only means that all the other subjects were even more boring!”

“. . .”

I guess I shouldn’t put too much faith on what he says.

[Your model of the world has been updated! The prior of the variable 'Monster Near The Academy' is now 50%.]

[Your model of the world has been updated! You have a new conditional probability variable: 'True Given That The Boy Says It's True', 25%]

And that's all for now. Now that the basic building blocks are in place, future progress ought to be much faster.


As you might have noticed, my "graphics" suck. A few of my friends have promised to draw art, but besides that, the whole generic Java look could go. This is where I was originally planning to put in the sentence "and if you're a Java graphics whiz and want to help fix that, the current source code is conveniently available at GitHub", but then getting things to this point took longer than I expected and I didn't have the time to actually figure out how the whole Eclipse-GitHub integration works. I'll get to that soon. GitHub link here!

I also want to make the nodes more informative - right now they only show their marginal probability. Ideally, clicking on them would expand them to a representation where you could visually see what components their probability is composed of. I've got some scribbled sketches of what this should look like for various node types, but none of that is implemented yet.

I expect some of you to also note that the actual Bayes theorem hasn't shown up yet, at least in no form resembling the classic mammography problem. (It is used implicitly in the network belief updates, though.) That's intentional - there will be a third minigame involving that form of the theorem, but somehow it felt more natural to start this way, to give the player a rough feeling of how probability flows through Bayesian networks. Admittedly I'm not sure of how well that's happening so far, but hopefully more minigames should help the player figure it out better.
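For readers who have not seen it, the mammography form of the theorem can be sketched in a few lines. The figures below (1% prevalence, 80% true-positive rate, 9.6% false-positive rate) are the ones commonly quoted for that problem, not anything taken from the game, and the function name `posterior` is just for illustration.

```python
def posterior(prior, true_positive_rate, false_positive_rate):
    """Bayes' theorem: P(condition | positive test)."""
    # Total probability of a positive test, over both hypotheses.
    p_positive = prior * true_positive_rate + (1 - prior) * false_positive_rate
    return prior * true_positive_rate / p_positive


# Commonly quoted figures for the mammography problem (assumed here).
p = posterior(prior=0.01, true_positive_rate=0.80, false_positive_rate=0.096)
print(f"{p:.3f}")  # 0.078
```

With these numbers, the positive test only raises the probability from 1% to about 7.8%.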

What's next? Once the main character (who needs a name) manages to get into the Academy, there will be a lot of social scheming, and many mysteries to solve in order for her to find out just what did happen to her brother... also, I don't mind people suggesting things, such as what could happen next, and what kinds of network configurations the character might face in different minigames.

(Also, everything that you've seen might get thrown out and rewritten if I decide it's no good. Let me know what you think of the stuff so far!)

Internet troll Pascal-mugs me?

3 efim 19 November 2014 10:23PM

As I became a little less anonymous on a social network (by posting on a rationality-related public page), I received a message from a person who introduced himself as a 'self-taught AI researcher'.

By his own account, he doesn't know English and isn't acquainted with any of the Sequences (or with other sources not translated into Russian, such as textbooks), but he has 'a firm understanding' of what he has to do in order to build a general AI from 'genetic algorithms'.

Mostly he wrote that the 'complexity of values' article is bullshit and that he 'knows the answer' to that question, and also to 'all questions' of the contemporary AI field.

The chances of his words being true are infinitesimal.

I probably had a vague feeling at the start that I was being 'Pascal-mugged' (with the threat of creating a uFAI) but flinched from it. I definitely had this idea somewhere along the line, but I didn't quite think about it and didn't consciously choose any particular mode of communication. I just tried to understand his position with guiding questions and to explain why uFAI is not a good thing.

As I write this I am aware that it is probably not the best quality material, but hopefully it will be good enough for 'discussion'. I am probably seeking advice on how to handle this stuff emotionally and practically (ignore / engage).

And maybe some encouragement or critique, because I feel really bad about engaging in conversations with him, since it brought about only insults from him (he is most likely a troll) and more closed-mindedness (on the rare chance that he is sincere). Also because, when I think about it, my motivations for talking to him were mostly the thrill of engaging in a conversation with a hateful person (hello, Nineteen Eighty-Four) and then worry about the chance that he is sincerely crazy.


[edit: added on 20.11.2014] Overnight he wrote that he is not 'interested' in creating friendly AI (which he understands as 'not harming people'), because he wants to rule the world, and because AI would be 'more important' than people, since it is so smart.

In reality, an FAI would be one that can understand his commands and values as he understands them, and probably one that understands the notion of life at all.

He is ignoring the notion of evidence: to support his claims, he posted a screenshot of an 'AI interface' with nothing but 10 lines of PHP code, which are barely functional as far as I understand them.

Again, the chances that he is really doing what he claims diminish with his every word, and the chances that he is trolling or mentally ill grow. But now he requests material help with creating a full-blown uFAI, in exchange for a chance at safety when it arrives.

I feel like I should ignore him now and block him on the social network, and that I should have done it before it spiraled out of control. Such small probabilities should not be allowed to reside in my mind - it looks like I have unwillingly favored the hypothesis, since despite it all I am a little unhinged by his writing. Really, I feel that I should ignore the urge to mess with him and just block all communications whatsoever.

I Want To Believe: Rational Edition

4 27chaos 18 November 2014 08:00PM

Relevant: http://lesswrong.com/lw/k7h/a_dialogue_on_doublethink/

I would like this conversation to operate under the assumption that there are certain special times when it is instrumentally rational to convince oneself of a proposition whose truth is indeterminate, and when it is epistemically rational as well. The reason I would like this conversation to operate under this assumption is that I believe questioning this assumption makes it more difficult to use doublethink for productive purposes. There are many other places on this website where the ethics or legitimacy of doublethink can be debated, and I am already aware of its dangers, so please don't mention such things here.

I am hoping for some advice. "Wanting to believe" can be both epistemically and instrumentally rational, as in the case of certain self-fulfilling prophecies. If believing that I am capable of winning a competition will cause me to win, believing that I am capable of winning is rational both in the instrumental sense that "rationality is winning" and in the epistemic sense that "rationality is truth".

I used to be quite good at convincing myself to adopt beliefs of this type when they were beneficial. It was essentially automatic, I knew that I had the ability and so applying it was as trivial as remembering its existence. Nowadays, however, I'm almost unable to do this at all, despite what I remember. It's causing me significant difficulties in my personal life.

How can I redevelop my skill at this technique? Practicing will surely help, and I'm practicing right now so therefore I'm improving already. I'll soon have the skill back stronger than ever, I'm quite confident. But are there any tricks or styles of thinking that can make it more controllable? Any mantras or essays that will help my thought to become more fluidly self-directed? Or should I be focused on manipulating my emotional state rather than on initiating a direct cognitive override?

I feel as though the difficulties I've been having become most pronounced when I'm thinking about self-fulfilling prophecies that do not have guarantees of certainty attached. The lower my estimated probability that the self-fulfilling prophecy will work for me, the less able I am to use the self-fulfilling prophecy as a tool, even if the estimated gains from the bet are large. How might I deal with this problem, specifically?

[Link] Chalmers on Computation: A first step From Physics to Metaethics?

0 john_ku 18 November 2014 10:39AM

A Computational Foundation for the Study of Cognition by David Chalmers

Abstract from the paper:

Computation is central to the foundations of modern cognitive science, but its role is controversial. Questions about computation abound: What is it for a physical system to implement a computation? Is computation sufficient for thought? What is the role of computation in a theory of cognition? What is the relation between different sorts of computational theory, such as connectionism and symbolic computation? In this paper I develop a systematic framework that addresses all of these questions.

Justifying the role of computation requires analysis of implementation, the nexus between abstract computations and concrete physical systems. I give such an analysis, based on the idea that a system implements a computation if the causal structure of the system mirrors the formal structure of the computation. This account can be used to justify the central commitments of artificial intelligence and computational cognitive science: the thesis of computational sufficiency, which holds that the right kind of computational structure suffices for the possession of a mind, and the thesis of computational explanation, which holds that computation provides a general framework for the explanation of cognitive processes. The theses are consequences of the facts that (a) computation can specify general patterns of causal organization, and (b) mentality is an organizational invariant, rooted in such patterns. Along the way I answer various challenges to the computationalist position, such as those put forward by Searle. I close by advocating a kind of minimal computationalism, compatible with a very wide variety of empirical approaches to the mind. This allows computation to serve as a true foundation for cognitive science.

See my welcome thread submission for a brief description of how I conceive of this as the first step towards formalizing friendliness.

Financial Effectiveness Repository

4 Gunnar_Zarncke 18 November 2014 09:57AM

Follow-Up to: A Guide to Rational Investing, Financial Planning Sequence (defunct), The Rational Investor

What are your recommendations and ideas about financial effectiveness? 

This post was created in response to a comment on this Altruistic Effectiveness post and thus may have a slight focus on EA. But it is nonetheless meant as a general request for financial effectiveness information (effectiveness mostly in the sense of return on invested time). I think this could accumulate a lot of advice and become part of the Repository Repository (which surprisingly does not have much advice of this kind yet).

I seed this with a few posts about this found on LessWrong in the comments. What other posts and links about financial effectiveness do you know of? 



  • Each comment should name a single recommendation.
  • You should give the effectiveness in percent per period or absolute if possible.
  • Advice should be backed by evidence as usual. 

General Advice (from Guide to Rational Investing):

Capital markets have created enormous amounts of wealth for the world and reward disciplined, long-term investors for their contribution to the productive capacity of the economy. Most individuals would do well to invest most of their wealth in the capital market assets, particularly equities. Most investors, however, consistently make poor investment decisions as a result of a poor theoretical understanding of financial markets as well as cognitive and emotional biases, leading to inferior investment returns and inefficient allocation of capital. Using an empirically rigorous approach, a rational investor may reasonably expect to exploit inefficiencies in the market and earn excess returns in so doing.

So what are your recommendations? You may give advanced as well as simple advice. The more the better for this to become a real repository. You may also repeat or link advice given elsewhere on LessWrong.

Superintelligence 10: Instrumentally convergent goals

6 KatjaGrace 18 November 2014 02:00AM

This is part of a weekly reading group on Nick Bostrom's book, Superintelligence. For more information about the group, and an index of posts so far see the announcement post. For the schedule of future topics, see MIRI's reading guide.

Welcome. This week we discuss the tenth section in the reading guide: Instrumentally convergent goals. This corresponds to the second part of Chapter 7.

This post summarizes the section, and offers a few relevant notes, and ideas for further investigation. Some of my own thoughts and questions for discussion are in the comments.

There is no need to proceed in order through this post, or to look at everything. Feel free to jump straight to the discussion. And if you are behind on the book, don't let it put you off discussing. Where applicable and I remember, page numbers indicate the rough part of the chapter that is most related (not necessarily that the chapter is being cited for the specific claim).

Reading: Instrumental convergence from Chapter 7 (p109-114)


  1. The instrumental convergence thesis: we can identify 'convergent instrumental values' (henceforth CIVs). That is, subgoals that are useful for a wide range of more fundamental goals, and in a wide range of situations. (p109)
  2. Even if we know nothing about an agent's goals, CIVs let us predict some of the agent's behavior (p109)
  3. Some CIVs:
    1. Self-preservation: because you are an excellent person to ensure your own goals are pursued in future.
    2. Goal-content integrity (i.e. not changing your own goals): because if you don't have your goals any more, you can't pursue them.
    3. Cognitive enhancement: because making better decisions helps with any goals.
    4. Technological perfection: because technology lets you have more useful resources.
    5. Resource acquisition: because a broad range of resources can support a broad range of goals.
  4. For each CIV, there are plausible combinations of final goals and scenarios under which an agent would not pursue that CIV. (p109-114)


1. Why do we care about CIVs?
CIVs to acquire resources and to preserve oneself and one's values play important roles in the argument for AI risk. The desired conclusions are that we can already predict that an AI would compete strongly with humans for resources, and also that an AI once turned on will go to great lengths to stay on and intact.

2. Related work
Steve Omohundro wrote the seminal paper on this topic. The LessWrong wiki links to all of the related papers I know of. Omohundro's list of CIVs (or as he calls them, 'basic AI drives') is a bit different from Bostrom's:

  1. Self-improvement
  2. Rationality
  3. Preservation of utility functions
  4. Avoiding counterfeit utility
  5. Self-protection
  6. Acquisition and efficient use of resources

3. Convergence for values and situations
It seems potentially helpful to distinguish convergence over situations and convergence over values. That is, to think of instrumental goals on two axes - one of how universally agents with different values would want the thing, and one of how large a range of situations it is useful in. A warehouse full of corn is useful for almost any goals, but only in the narrow range of situations where you are a corn-eating organism who fears an apocalypse (or you can trade it). A world of resources converted into computing hardware is extremely valuable in a wide range of scenarios, but much more so if you don't especially value preserving the natural environment. Many things that are CIVs for humans don't make it onto Bostrom's list, I presume because he expects the scenario for AI to be different enough. For instance, procuring social status is useful for all kinds of human goals. For an AI in the situation of a human, it would appear to also be useful. For an AI more powerful than the rest of the world combined, social status is less helpful.

4. What sort of things are CIVs?
Arguably all CIVs mentioned above could be clustered under 'cause your goals to control more resources'. This implies causing more agents to have your values (e.g. protecting your values in yourself), causing those agents to have resources (e.g. getting resources and transforming them into better resources) and getting the agents to control the resources effectively as well as nominally (e.g. cognitive enhancement, rationality). It also suggests convergent values we haven't mentioned. To cause more agents to have one's values, one might create or protect other agents with your values, or spread your values to existing other agents. To improve the resources held by those with one's values, a very convergent goal in human society is to trade. This leads to a convergent goal of creating or acquiring resources which are highly valued by others, even if not by you. Money and social influence are particularly widely redeemable 'resources'. Trade also causes others to act like they have your values when they don't, which is a way of spreading one's values. 

As I mentioned above, my guess is that these are left out of Superintelligence because they involve social interactions. I think Bostrom expects a powerful singleton, to whom other agents will be irrelevant. If you are not confident of the singleton scenario, these CIVs might be more interesting.

5. Another discussion
John Danaher discusses this section of Superintelligence, but not disagreeably enough to read as 'another view'. 

Another view

I don't know of any strong criticism of the instrumental convergence thesis, so I will play devil's advocate.

The concept of a sub-goal that is useful for many final goals is unobjectionable. However the instrumental convergence thesis claims more than this, and this stronger claim is important for the desired argument for AI doom. The further claims are also on less solid ground, as we shall see.

According to the instrumental convergence thesis, convergent instrumental goals not only exist, but can at least sometimes be identified by us. This is needed for arguing that we can foresee that AI will prioritize grabbing resources, and that it will be very hard to control. That we can identify convergent instrumental goals may seem clear - after all, we just did: self-preservation, intelligence enhancement and the like. However to say anything interesting, our claim must not only be that these values are better than not, but that they will be prioritized by the kinds of AI that will exist, in a substantial range of circumstances that will arise. This is far from clear, for several reasons.

Firstly, to know what the AI would prioritize we need to know something about its alternatives, and we can be much less confident that we have thought of all of the alternative instrumental values an AI might have. For instance, in the abstract intelligence enhancement may seem convergently valuable, but in practice adult humans devote little effort to it. This is because investments in intelligence are rarely competitive with other endeavors.

Secondly, we haven't said anything quantitative about how general or strong our proposed convergent instrumental values are likely to be, or how we are weighting the space of possible AI values. Without even any guesses, it is hard to know what to make of resulting predictions. The qualitativeness of the discussion also raises the concern that thinking on the problem has not been very concrete, and so may not be engaged with what is likely in practice.

Thirdly, we have arrived at these convergent instrumental goals by theoretical arguments about what we think of as default rational agents and 'normal' circumstances. These may be very different distributions of agents and scenarios from those produced by our engineering efforts. For instance, perhaps almost all conceivable sets of values - in whatever sense - would favor accruing resources ruthlessly. It would still not be that surprising if an agent somehow created noisily from human values cared about only acquiring resources by certain means or had blanket ill-feelings about greed.

In sum, it is unclear that we can identify important convergent instrumental values, and consequently unclear that such considerations can strongly help predict the behavior of real future AI agents.

In-depth investigations

If you are particularly interested in these topics, and want to do further research, these are a few plausible directions, some inspired by Luke Muehlhauser's list, which contains many suggestions related to parts of Superintelligence. These projects could be attempted at various levels of depth.


  1. Do approximately all final goals make an optimizer want to expand beyond the cosmological horizon?
  2. Can we say anything more quantitative about the strength or prevalence of these convergent instrumental values?
  3. Can we say more about values that are likely to be convergently instrumental just across AIs that are likely to be developed, and situations they are likely to find themselves in?


If you are interested in anything like this, you might want to mention it in the comments, and see whether other people have useful thoughts.

How to proceed

This has been a collection of notes on the chapter.  The most important part of the reading group though is discussion, which is in the comments section. I pose some questions for you there, and I invite you to add your own. Please remember that this group contains a variety of levels of expertise: if a line of discussion seems too basic or too incomprehensible, look around for one that suits you better!

Next week, we will talk about the treacherous turn. To prepare, read “Existential catastrophe…” and “The treacherous turn” from Chapter 8. The discussion will go live at 6pm Pacific time next Monday 24th November. Sign up to be notified here.

Neo-reactionaries, why are you neo-reactionary?

7 Capla 17 November 2014 10:31PM

Through LessWrong, I've discovered the neo-reactionary movement. The survey says that there are some of you here.

I'm curious, what led you to accept the basic premises of the movement? What is the story of your personal "conversion"? Was there some particular insight or information that was important in convincing you? Was it something that just "clicked" for you or that you had always felt in a vague way? Were any of you "raised in it"?

Feel free to forward my questions to others or direct me towards a better forum for asking this.

I hope that this is in no way demeaning or insulting. I'm genuinely curious and my questioning is value free. If you point me towards compelling evidence of the neo-reactionary premise, I'll update on it.

Moderate Dark Arts to Promote Rationality: Ends Justify Means?

3 Gleb_Tsipursky 17 November 2014 07:00PM

I'd like the opinion of Less Wrongers on the extent to which it is appropriate to use Dark Arts as a means of promoting rationality.

I and other fellow aspiring rationalists in the Columbus, OH Less Wrong meetup have started up a new nonprofit organization, Intentional Insights, and we're trying to optimize ways to convey rational thinking strategies widely and thus raise the sanity waterline. BTW, we also do some original research, as you can see in this Less Wrong article on "Agency and Life Domains," but our primary focus is promoting rational thinking widely, and all of our research is meant to accomplish that goal.

To promote rationality as widely as possible, we decided it's appropriate to speak the language of System 1, and use graphics, narrative, metaphors, and orientation toward pragmatic strategies to communicate about rationality to a broad audience. Some examples are our blog posts about gaining agency, about research-based ways to find purpose and meaning, about dual process theory and other blog posts, as well as content such as videos on evaluating reality and on finding meaning and purpose in life.

Our reasoning is that speaking the language of System 1 would help us to reach a broad audience who are currently not much engaged in rationality, but could become engaged if instrumental and epistemic rationality strategies are presented in such a way as to create cognitive ease. We think the ends of promoting rationality justify the means of using such moderate Dark Arts - although the methods we use do not convey 100% epistemic rationality, we believe the ends of spreading rationality are worthwhile, and that once broad audiences who engage with our content realize the benefits of rationality, they can be oriented to pursue more epistemic accuracy over time. However, some Less Wrongers disagreed with this method of promoting rationality, as you can see in some of the comments on this discussion post introducing the new nonprofit. Some commentators expressed the belief that it is not appropriate to use methods that speak to System 1.

So I wanted to bring up this issue for a broader discussion on Less Wrong, and get a variety of opinions. What are your thoughts about the utility of using moderate Dark Arts of the type I described above if the goal is to promote rationality - do the ends justify the means? How much Dark Arts, if any, is it appropriate to use to promote rationality?


AI caught by a module that counterfactually doesn't exist

9 Stuart_Armstrong 17 November 2014 05:49PM

EDIT: Whenever I use colloquial phrases like "the AI believes a (false) X" I mean that we are using utility indifference to accomplish that goal, without actually giving the AI false beliefs.

A big problem with most AI control ideas is that the AI will route around them. Suppose you try to ensure that the AI stays friendly by - say - requiring it to ensure that the world's stock markets continue to function. Then a badly motivated AI will cause the destruction it was going to cause anyway - except it will also create some system that technically counts as "the world's stock markets" and ensure that it "continues to function".

I think we can use counterfactuals and corrigibility (AI-acceptable motivation changes) to partially get around the problem, in some circumstances. Suppose we had a candidate AI, with utility U, that we thought was friendly - but that we weren't sure. We have a subgoal, say SG="ensuring that the world's stock markets continue to function". We're pretty sure that if the AI is friendly, then SG will happen. We're pretty sure that if the AI were unfriendly, and the AI did not have SG added as an objective, then SG wouldn't happen. I deliberately chose a rather rough subgoal to illustrate this.

So we then add a module M to the AI. The task of M, which it has chosen to accept, is to look into the future, using the AI's predictive abilities, and see if SG is accomplished. Crucially, it will do so under the assumption that the module M does not exist (or wasn't activated, or will soon be overwritten, or will fail to do anything, or so on). If M finds that SG is not accomplished, it will turn the AI off (in a way compatible with the AI's motivations - see corrigibility).

How is this different from the usual setup? Well, a generic FAI would accomplish SG (by assumption - we may need to work a bit on this part). A generic UFAI without M wouldn't accomplish SG. A generic UFAI with M would accomplish SG, as this prevents it from being turned off.

But a generic UFAI with M which thought that M would have no impact, would behave exactly as a generic UFAI without M - and wouldn't aim to accomplish SG, and thus would be turned off by M.
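The four cases above can be laid out as a toy truth table. This is only a sketch of the argument's logic under its own simplifying assumptions (a "generic" unfriendly AI pursues SG only when it expects M to act on failure); the function `outcome` and the module labels are hypothetical names, not part of any real system.

```python
def outcome(friendly, module):
    """Toy model of the argument. Returns (sg_accomplished, turned_off).

    module is one of:
      "none"           - no module M installed
      "ordinary"       - M installed, and the AI expects M to act
      "counterfactual" - M installed, but the AI acts as if M has no impact
    """
    # Only an ordinary M gives an unfriendly AI a reason to keep SG true.
    ai_expects_m_to_act = module == "ordinary"
    # Assumption from the post: a friendly AI accomplishes SG anyway.
    sg_accomplished = friendly or ai_expects_m_to_act
    # M turns the AI off when it foresees that SG will fail.
    turned_off = module != "none" and not sg_accomplished
    return sg_accomplished, turned_off


for friendly in (True, False):
    for module in ("none", "ordinary", "counterfactual"):
        label = "FAI " if friendly else "UFAI"
        print(label, module, outcome(friendly, module))
```

In this toy model, the only case that ends with the AI turned off is the unfriendly AI with the counterfactual module, which is the case the proposal is designed to catch.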

Note that M can only be used as a last test for a putatively friendly AI. We cannot use it if we are continually refining AIs (either through evolution or simply redesign) as then we are optimising for SG, and SG is a poor goal to be aiming for (many, many UFAI have SG as a goal - it's just that a generic one won't). Similarly, we can't use an unconstrained search to find such an AI.

I wonder if this idea can be extended. Suggestions?

Can science come to understand consciousness? A problem of philosophical zombies (Yes, I know, P-zombies again.)

1 Capla 17 November 2014 05:06PM

In response to the classic Mysterious Answers to Mysterious Questions, I express some skepticism that consciousness can be understood by science. I postulate (with low confidence) that consciousness is “inherently mysterious”, in that it is philosophically and scientifically impenetrable. The mysteriousness is a fact about our state of mind, but that state of mind is due to a fundamental epistemic feature of consciousness and is impossible to resolve.

My issue with understanding the cause of consciousness involves p-zombies. Any experiment with the goal of understanding consciousness would have to be able to detect consciousness, which seems to me to be philosophically impossible. To be more specific, any scientific investigation of the cause of consciousness would have (to simplify) an independent variable that we could manipulate to see how this manipulation affects the dependent variable, the presence or absence of consciousness. We assume that those around us are conscious, and we have good reason to do so, but we can't rely on that assumption in any experiment in which we are investigating consciousness.  Before we ask “what is causing x?”, we first have to know that x is present.

As Eliezer points out, that an individual says he's conscious is a pretty good signal of consciousness, but we can't necessarily rely on that signal for non-human minds. A conscious AI may never talk about its internal states, depending on its structure. (Humans have a survival advantage to social sharing of internal realities; an AI will not be subject to that selection pressure. There’s no reason for it to have any sort of emotional need to share its feelings, for example.) On the flip side, a savvy but non-conscious AI may talk about its "internal states", not because it actually has internal states, but because it is “guessing the teacher's password” in the strongest way imaginable: it has no understanding whatsoever of what those states are, but computes that aping internal states will accomplish its goals. I don't know how we could possibly know if the AI is aping consciousness for its own ends or if it actually is conscious. If consciousness is thus undetectable, I can't see how science can investigate it.

That said, I am very well aware that “Throughout history, every mystery, ever solved has turned out to be not magic*” and that every single time something has seemed inscrutable to science, a reductionist explanation eventually surfaced. Knowing this, I seriously downgrade my confidence that "No, really, this time it is different.  This phenomenon really is beyond the grasp of science." I look forward to someone coming forward with something clever that dissolves the question, but even so, it does seem inscrutable.


*- Though, to be fair, this is a selection bias. Of course, all the solved mysteries weren't magic. All the mysteries that are actually magic remain unsolved, because they're magic! This is NOT to say I believe in magic, just to say that it's hardly saying much to claim that all the things we've come to understand were in principle understandable. To steelman: I do understand that with each mystery that was once declared to be magical, then later shown not to be, our collective priors for the existence of magical things decrease. (There is a sort of halting problem: if a question has remained unsolved since the dawn of asking questions, is that because it is unsolvable, or because we're right around the corner from solving it?)

Systemic risk: a moral tale of ten insurance companies

25 Stuart_Armstrong 17 November 2014 04:43PM

Once upon a time...

Imagine there were ten insurance sectors, each sector being a different large risk (or possibly the same risks, in different geographical areas). All of these risks are taken to be independent.

To simplify, we assume that all the risks follow the same yearly payout distributions. The details of the distribution don't matter much for the argument, but in this toy model, the payouts follow the discrete binomial distribution with n=10 and p=0.5, with millions of pounds as the unit:

This means that the probability that each sector pays out £n million each year is $(0.5)^{10} \cdot \frac{10!}{n!\,(10-n)!}$.

All these companies are bound by Solvency II-like requirements, which mandate that they have to be 99.5% sure to pay out on all their policies in a given year - or, put another way, that they only fail to pay out once in every 200 years on average. To do so, in each sector, the insurance companies have to have capital totalling £9 million available every year (the red dashed line).

Assume that each sector expects £1 million in total yearly expected profit. Then since the expected payout is £5 million, each sector will charge £6 million a year in premiums. They must thus maintain a capital reserve of £3 million each year (they get £6 million in premiums, and must maintain a total of £9 million). They thus invest £3 million to get an expected profit of £1 million - a tidy profit!

Every two hundred years, one of the insurance sectors goes bust and has to be bailed out somehow; every hundred billion trillion years, all ten insurance sectors go bust all at the same time. We assume this is too big to be bailed out, and there's a grand collapse of the whole insurance industry with knock on effects throughout the economy.

But now assume that insurance companies are allowed to invest in each other's sectors. The most efficient way of doing so is to buy equally in each of the ten sectors. The payouts across the market as a whole are now described by the discrete binomial distribution with n=100 and p=0.5:

This is a much narrower distribution (relative to its mean). In order to have enough capital to pay out 99.5% of the time, the whole industry needs only keep £63 million in capital (the red dashed line). Note that this is far less than the combined capital for each sector when they were separate, which would be ten times £9 million, or £90 million (the pink dashed line). There is thus a profit taking opportunity in this area (it comes from the fact that, for independent risks X and Y, the standard deviation of X+Y is less than the standard deviation of X plus the standard deviation of Y).
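The two capital figures can be checked directly from the binomial model. The following is a minimal sketch, assuming (as in the toy model) that payouts are Binomial(n, 0.5) in units of £1 million and that "99.5% sure" means the smallest capital c with P(payout <= c) >= 0.995; the function name `required_capital` is made up for illustration.

```python
from math import comb

def required_capital(n, p=0.5, confidence=0.995):
    """Smallest c (in £ million) with P(payout <= c) >= confidence
    when the yearly payout follows Binomial(n, p)."""
    cumulative = 0.0
    for c in range(n + 1):
        # Probability of paying out exactly c million this year.
        cumulative += comb(n, c) * p**c * (1 - p) ** (n - c)
        if cumulative >= confidence:
            return c
    return n

per_sector = required_capital(10)    # one isolated sector
pooled = required_capital(100)       # ten sectors pooled together
separate_total = 10 * per_sector     # ten isolated sectors added up

print(per_sector, pooled, separate_total)
```

Under these assumptions the pooled requirement comes out well below ten times the single-sector requirement, which is the gap the post describes.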

If the industry still expects to make an expected profit of £1 million per sector, this comes to £10 million total. The expected payout is £50 million, so they will charge £60 million in premium. To accomplish their Solvency II obligations, they still need to hold an extra £3 million in capital (since £63 million - £60 million = £3 million). However, this is now across the whole insurance industry, not just per sector.

Thus they expect profits of £10 million based on holding capital of £3 million - astronomical profits! Of course, that assumes that the insurance companies capture all the surplus from cross investing; in reality there would be competition, and a buyer surplus as well. But the general point is that there is a vast profit opportunity available from cross-investing, and thus if these investments are possible, they will be made. This conclusion is not dependent on the specific assumptions of the model, but captures the general result that insuring independent risks reduces total risk.

But note what has happened now: once every 200 years, an insurance company that has spread their investments across the ten sectors will be unable to pay out what they owe. However, every company will be following this strategy! So when one goes bust, they all go bust. Thus the complete collapse of the insurance industry is no longer a one in a hundred billion trillion year event, but a one in two hundred year event. The risk for each company has stayed the same (and their profits have gone up), but the systemic risk across the whole insurance industry has gone up tremendously.
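The arithmetic behind the story can be laid out in a few lines. This is only a sketch using the post's rounded figure of one failure per 200 years per sector; exact fractions are used so the numbers are not blurred by floating point, and all variable names are illustrative.

```python
from fractions import Fraction

p_fail = Fraction(1, 200)   # yearly failure chance implied by the 99.5% rule
sectors = 10

# Separate sectors: a total collapse needs all ten independent failures at once.
years_between_collapses_separate = 1 / p_fail**sectors

# Cross-invested sectors: every company holds the same pooled portfolio,
# so a single shortfall is a shortfall for everyone.
years_between_collapses_pooled = 1 / p_fail

# Expected profit per pound of extra capital held (both in £ million).
return_separate = Fraction(1, 3)    # £1m profit on £3m capital, per sector
return_pooled = Fraction(10, 3)     # £10m profit on £3m capital, industry-wide

print(years_between_collapses_separate)   # 200**10, about 1e23 years
print(years_between_collapses_pooled)     # 200 years
print(return_pooled / return_separate)    # profits per unit capital rise tenfold
```

The same change that multiplies the return on capital by ten divides the expected time between industry-wide collapses by 200 to the ninth power.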

...and they failed to live happily ever after for very much longer.

I just increased my Altruistic Effectiveness and you should too

8 AABoyles 17 November 2014 03:45PM

I was looking at the marketing materials for a charity (which I'll call X) over the weekend, when I saw something odd at the bottom of their donation form:

Check here to increase your donation by 3% to defray the cost of credit card processing.

It's not news to me that credit card companies charge merchants a cut of every transaction.  But the ramifications of this for charitable contributions had never sunk in. I use my credit card for all of the purchases I can (I get pretty good cash-back rates). Automatically drafting from my checking account (like a check, only without the check) costs X nothing. So I've increased the effectiveness of my charitable contributions by a small (<3%) amount by performing what amounts to a paperwork tweak.

If you use a credit card for donations, please think about making this tweak as well!

Lying in negotiations: a maximally bad problem

12 Stuart_Armstrong 17 November 2014 03:17PM

In a previous post, I showed that the Nash Bargaining Solution (NBS), the Kalai-Smorodinsky Bargaining Solution (KSBS) and my own Mutual Worth Bargaining Solution (MWBS) were all maximally vulnerable to lying. Here I can present a more general result: all bargaining solutions are maximally vulnerable to lying.

Assume that players X and Y have settled on some bargaining solution (which only cares about the defection point and the utilities of X and Y). Assume further that player Y knows player X's utility function. Let player X look at the possible outcomes, and let her label any outcome O "admissible" if there is some possible bargaining partner $Y_O$ with utility function $u_O$ such that O would be the outcome of the bargain between X and $Y_O$.

For instance, in the case of NBS and KSBS, the admissible outcomes would be the outcomes Pareto-better than the disagreement point. The MWBS has a slightly larger set of admissible outcomes, as it allows players to lose utility (up to the maximum they could possibly gain).

Then the general result is:

If player Y is able to lie about his utility function while knowing player X's true utility (and player X is honest), he can freely select his preferred outcome among the outcomes that are admissible.

The proof of this is also derisorily brief: player Y need simply claim to have utility $u_O$, in order to force outcome O.

Thus, if you've agreed on a bargaining solution, all that you've done is determine the set of outcomes among which your lying opponent will freely choose.
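The result can be illustrated with a toy version of the NBS. This is a minimal sketch under loud assumptions: a finite set of three made-up outcomes with no lotteries between them, a disagreement point worth 0 to both players, and invented utility numbers. The helper names `nash_bargain` and `lie_for` are hypothetical, and the lie used here (value 1 on the target, 0 everywhere else) is just one convenient choice of false utility.

```python
def nash_bargain(outcomes, u_x, u_y, d_x=0.0, d_y=0.0):
    """Return the outcome maximising the Nash product of gains over the
    disagreement point, among outcomes that leave neither player worse off."""
    feasible = [o for o in outcomes if u_x[o] >= d_x and u_y[o] >= d_y]
    return max(feasible, key=lambda o: (u_x[o] - d_x) * (u_y[o] - d_y))

def lie_for(target, outcomes):
    """A false utility for Y: only the target is worth anything to him."""
    return {o: (1.0 if o == target else 0.0) for o in outcomes}

outcomes = ["A", "B", "C"]                   # hypothetical outcomes
u_x = {"A": 3.0, "B": 2.0, "C": 1.0}         # X's true (and reported) utility
true_u_y = {"A": 1.0, "B": 2.0, "C": 3.0}    # Y's true utility

honest = nash_bargain(outcomes, u_x, true_u_y)
forced = nash_bargain(outcomes, u_x, lie_for("C", outcomes))

print(honest, forced)
```

With honest reports the Nash products are 3, 4 and 3, so the compromise B is chosen. Once Y claims the false utility, every outcome except C has a Nash product of zero, so the procedure hands Y his true favourite.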

There may be a subtlety: your system could make use of an objective (or partially objective) disagreement point, which your opponent is powerless to change. This doesn't change the result much:

If player Y is able to lie about his utility function while knowing player X's true utility (and player X is honest), he can freely select his preferred outcome among the outcomes that are admissible given the disagreement point.


Exploitation and gains from trade

Note that the above result did not make any assumptions about the outcome being Pareto - giving up Pareto doesn't make you non-exploitable (or "strategyproof" as it is often called).

But note also that the result does not mean that the system is exploitable! In the random dictator setup, you randomly assign power to one player, who then makes all the decisions. In terms of expected utility, this is $pU_X + (1-p)U_Y$, where $U_X$ is the best outcome ("Utopia") for X and $U_Y$ the best outcome for Y, and p the probability that X is the random dictator. The theorem still holds for this setup: player X knows that player Y will be able to select freely among the admissible outcomes, which is the set $S = \{pU_X + (1-p)O \mid O \text{ an outcome}\}$. However, player X knows that player Y will select $pU_X + (1-p)U_Y$ as this maximises his expected utility. So a bargaining solution which has a particular selection of admissible outcomes can be strategyproof.

However, it seems that the only strategyproof bargaining solutions are variants of random dictators! These solutions do not allow much gain from trade. Conversely, the more you open your bargaining solution up to gains from trade, the more exploitable you become from lying. This can be seen in the examples above: my MWBS tried to allow greater gains (in expectation) by not restricting to strict Pareto improvements from the disagreement point. As a result, it makes itself more vulnerable to liars.


What to do

What can be done about this? There seem to be several possibilities:

  1. Restrict to bargaining solutions difficult to exploit. This is the counsel of despair: give up most of the gains from trade, to protect yourself from lying. But there may be a system where the tradeoff between exploitability and potential gains is in some sense optimal.
  2. Figure out your opponent's true utility function. The other obvious solution: prevent lying by figuring out what your opponent really values, by inspecting their code, their history, their reactions, etc... This could be combined with refusing to trade with those who don't make their true utility easy to discover (or only using non-exploitable trades with those).
  3. Hide your own true utility. The above approach only works because the liar knows their opponent, and their opponent doesn't know them. If both utilities are hidden, it's not clear how exploitable the system really is.
  4. Play only multi-player. If there are many different trades with many different people, it becomes harder to construct a false utility that exploits them all. This is in a sense a variant of "hiding your own true utility": in that situation, the player has to lie given their probability distribution of your possible utilities; in this situation, they have to lie given the known distribution of multiple true utilities.

So there does not seem to be a principled way of getting rid of liars. But the multi-player (or hidden utility function) may point to a single "best" bargaining solution: the one that minimises the returns to lying and maximises the gains to trade, given ignorance of the other's utility function.

Open thread, Nov. 17 - Nov. 23, 2014

3 MrMind 17 November 2014 08:25AM

If it's worth saying, but not worth its own post (even in Discussion), then it goes here.

Notes for future OT posters:

1. Please add the 'open_thread' tag.

2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)

3. Open Threads should be posted in Discussion, and not Main.

4. Open Threads should start on Monday, and end on Sunday.

Musk on AGI Timeframes

18 Artaxerxes 17 November 2014 01:36AM

Elon Musk submitted a comment to edge.org a day or so ago, on this article. It was later removed.

The pace of progress in artificial intelligence (I'm not referring to narrow AI) is incredibly fast. Unless you have direct exposure to groups like Deepmind, you have no idea how fast - it is growing at a pace close to exponential. The risk of something seriously dangerous happening is in the five year timeframe. 10 years at most. This is not a case of crying wolf about something I don't understand.

I am not alone in thinking we should be worried. The leading AI companies have taken great steps to ensure safety. They recognize the danger, but believe that they can shape and control the digital superintelligences and prevent bad ones from escaping into the Internet. That remains to be seen...

Now Elon has been making noises about AI safety lately in general, including for example mentioning Bostrom's Superintelligence on Twitter. But this is the first time that I know of that he's come up with his own predictions of the timeframes involved, and I think his are rather soon compared to most.

The risk of something seriously dangerous happening is in the five year timeframe. 10 years at most.

We can compare this to MIRI's post in May this year, When Will AI Be Created, which illustrates that it seems reasonable to think of AI as being further away, but also that there is a lot of uncertainty on the issue.

Of course, "something seriously dangerous" might not refer to full-blown superintelligent uFAI - there's plenty of room for disasters with magnitudes anywhere between the 2010 flash crash and Clippy turning the universe into paperclips.

In any case, it's true that Musk has more "direct exposure" to those on the frontier of AGI research than your average person, and it's also true that he has an audience, so I think there is some interest to be found in his comments here.


Group Rationality Diary, November 16-30

1 therufs 16 November 2014 09:16PM

This is the public group rationality diary for November 16-30.

It's a place to record and chat about it if you have done, or are actively doing, things like: 

  • Established a useful new habit
  • Obtained new evidence that made you change your mind about some belief
  • Decided to behave in a different way in some set of situations
  • Optimized some part of a common routine or cached behavior
  • Consciously changed your emotions or affect with respect to something
  • Consciously pursued new valuable information about something that could make a big difference in your life
  • Learned something new about your beliefs, behavior, or life that surprised you
  • Tried doing any of the above and failed

Or anything else interesting which you want to share, so that other people can think about it, and perhaps be inspired to take action themselves. Try to include enough details so that everyone can use each other's experiences to learn about what tends to work out, and what doesn't tend to work out.

Thanks to cata for starting the Group Rationality Diary posts, and to commenters for participating.

Previous diary: November 1-15

Rationality diaries archive

How to build the skill and the habit of experimentation?

5 Capla 15 November 2014 04:08PM

I want to make regular experimentation a part of my life and don't really know how. I thought that I should associate with and assist people who do run experiments (I'm interning with a psych lab and a paranormal investigator, and hope to work with some behavioral economists who run field-experiments), but I realized that I haven't taken the time to consider if that is actually a good approach or if there is something else I should be doing in addition.


How do I gain proficiency with experimental methods and build the habit of running simple experiments regularly? I suppose that there's a certain kind of phenomenon that to the educated mind is automatically flagged as ripe for experimentation (I'm thinking of Feynman's curiosity about the ants in his room, from Surely You're Joking, or Harry James Potter-Evans-Verres testing with his army to find out what the optimal way to fight is, prior to the first of Quirrell’s battles), but I don't have that intuition, yet.


What are the key insights, procedures, or guidelines that I need to know in order to experiment fruitfully? How do I build that intuition?

I’m looking either for recommendations or critiques. Perhaps personal experimentation is not as useful as my veneration of science in general leads me to believe? It seems to me beneficial that when faced with a problem, confusion, or dispute, one of my go-to approaches is to run an experiment.



My new paper: Concept learning for safe autonomous AI

17 Kaj_Sotala 15 November 2014 07:17AM

Abstract: Sophisticated autonomous AI may need to base its behavior on fuzzy concepts that cannot be rigorously defined, such as well-being or rights. Obtaining desired AI behavior requires a way to accurately specify these concepts. We review some evidence suggesting that the human brain generates its concepts using a relatively limited set of rules and mechanisms. This suggests that it might be feasible to build AI systems that use similar criteria and mechanisms for generating their own concepts, and could thus learn similar concepts as humans do. We discuss this possibility, and also consider possible complications arising from the embodied nature of human thought, possible evolutionary vestiges in cognition, the social nature of concepts, and the need to compare conceptual representations between humans and AI systems.

I just got word that this paper was accepted for the AAAI-15 Workshop on AI and Ethics: I've uploaded a preprint here. I'm hoping that this could help seed a possibly valuable new subfield of FAI research. Thanks to Steve Rayhawk for invaluable assistance while I was writing this paper: it probably wouldn't have gotten done without his feedback motivating me to work on this.

Comments welcome. 

Dumbing Down Human Values

5 leplen 15 November 2014 02:53AM

I want to preface everything here by acknowledging my own ignorance. I have relatively little formal training in any of the subjects this post will touch upon, and this chain of reasoning is very much a work in progress.


I think the question of how to encode human values into non-human decision makers is a really important research question. Whether or not one accepts the rather eschatological arguments about the intelligence explosion, the coming singularity, etc. there seems to be tremendous interest in the creation of software and other artificial agents that are capable of making sophisticated decisions. Inasmuch as the decisions of these agents have significant potential impacts, we want those decisions to be made with some sort of moral guidance. Our approach towards the problem of creating machines that preserve human values thus far has primarily relied on a series of hard-coded heuristics, e.g. saws that stop spinning if they come into contact with human skin. For very simple machines, these sorts of heuristics are typically sufficient, but they constitute a very crude representation of human values.  

We're at the border, in many ways, of creating machines where these sorts of crude representations are probably not sufficient. As a specific example, IBM's Watson is now designing treatment programs for lung cancer patients. The design of a treatment program implies striking a balance between treatment cost, patient comfort, aggressiveness of targeting the primary disease, short and long-term side effects, secondary infections, etc. It isn't totally clear how those trade-offs are being managed, although there's still a substantial amount of human oversight/intervention at this point.

The use of algorithms to discover human preferences is already widespread. While these typically operate in restricted domains such as entertainment recommendations, it seems at least in principle possible that with the correct algorithm and a sufficiently large corpus of data, a system not dramatically more advanced than existing technology could learn some reasonable facsimile of human values.  This is probably worth doing. 

The goal would be to have a sufficient representation of human values using as dumb a machine as possible. This putative value-learning machine could be dumb in the way that Deep Blue was dumb, by being a hyper-specialist in the problem domain of chess/learning human values and having very little optimization power outside of that domain. It could also be dumb in the way that evolution is dumb, obtaining satisfactory results more through an abundance of data and resources than through any particular brilliance.

Computer chess benefited immensely from 5 decades of work before Deep Blue managed to win a game against Kasparov. While many of the algorithms developed for computer chess have found applications outside of that domain, some of them are domain specific. A specialist human value learning system may also require substantial effort on domain specific problems. The history, competitive nature, and established ranking system for chess made it an attractive problem for computer scientists because it was relatively easy to measure progress. Perhaps the goal for a program designed to understand human values would be that it plays a convincing game of "Would you rather?" although as far as I know no one has devised an Elo system for it.

Similarly, a relatively dumb but more general AI may require relatively large, preferably somewhat homogeneous data sets to come to conclusions that are even acceptable. Having successive generations of AI train on the same or similar data sets could provide a useful way of tracking progress and a feedback mechanism for determining how successful various research efforts are.

The benefit of this research approach is that not only is it a relatively safe path towards a possible AGI, in the event that the speculative futures of mind-uploads and superintelligences do not take place, there's still substantial utility in having devised a system that is capable of making correct moral decisions in limited domains. I want my self-driving car to make a much larger effort to avoid a child in the road than a plastic bag. I'd be even happier if it could distinguish between an opossum and someone's cat. 

When I design research projects, one of the things I try to ensure is that if some of my assumptions are wrong, the project fails gracefully. Obviously it's easy to love the Pascal's Wager-like impact statement of FAI, but if I were writing it up for an NSF grant I'd put substantially more emphasis on the importance of my research even if fully human level AI isn't invented for another 200 years. When I give the elevator pitch version of FAI, I've found placing a strong emphasis on the near future and referencing things people have encountered before such as computers playing Jeopardy or self-driving cars makes them much more receptive to the idea of AI safety and allows me to discuss things like the potential for an unfriendly superintelligence without coming across as a crazed prophet of the end times.

I'm also just really really curious to see how well something like Watson would perform if I gave it a bunch of sociology data and asked if a human would rather find 5 dollars or stub a toe. There doesn't seem to be a huge categorical difference between being able to answer the Daily Double and reasoning about human preferences, but I've been totally wrong about intuitive jumps that seemed much smaller than that one in the past, so it's hard to be too confident.

Optimizing ways to convey rational thinking strategies to broad audience

5 Gleb_Tsipursky 15 November 2014 01:37AM

What do you think of this post as a way to use graphics, narrative, metaphors, and orientation toward pragmatic strategies to communicate about dual process theory to a broad audience? It's part of the work of our new nonprofit organization, and we're trying to optimize ways to convey rational thinking strategies widely and thus raise the sanity waterline. So advice on how to improve this post, as well as our other posts, with an orientation toward a broad audience, would be helpful. Thanks, all!

Weekly LW Meetups

2 FrankAdamek 14 November 2014 04:49PM

This summary was posted to LW Main on November 7th. The following week's summary is here.

New meetups (or meetups with a hiatus of more than a year) are happening in:

Irregularly scheduled Less Wrong meetups are taking place in:

The remaining meetups take place in cities with regular scheduling, but involve a change in time or location, special meeting content, or simply a helpful reminder about the meetup:

Locations with regularly scheduled meetups: Austin, Berkeley, Berlin, Boston, Brussels, Buffalo, Cambridge UK, Canberra, Columbus, London, Madison WI, Melbourne, Moscow, Mountain View, New York, Philadelphia, Research Triangle NC, Seattle, Sydney, Toronto, Vienna, Washington DC, Waterloo, and West Los Angeles. There's also a 24/7 online study hall for coworking LWers.


[Link] Physics-based anthropics?

4 Brian_Tomasik 14 November 2014 07:02AM

Nick Bostrom's self-sampling assumption treats us as a random sample from a set of observers, but this framework raises several paradoxes. Instead, why not treat the stuff we observe to be a random sample from the set of all stuff that exists? I elaborate on this proposal in a new essay subsection: "SSA on physics rather than observers?" At first glance, it seems to work better than any of the mainstream schools of anthropics. Comments are welcome.

Has this idea been suggested before? I noticed that Robin Hanson proffered something similar way back in 1998 (four years before Bostrom's Anthropic Bias). I'm surprised Hanson's proposal hasn't received more attention in the academic literature.

The "best" mathematically-informed topics?

11 Capla 14 November 2014 03:39AM

Recently, I asked LessWrong about the important math of rationality. I found the responses extremely helpful, but thinking about it, I think there’s a better approach.

I come from a new-age-y background. As such, I hear a lot about “quantum physics.”

Quantum Mechanics

Accordingly, I have developed a heuristic that I have found broadly useful: If a field involves math, and you cannot do the math, you are not qualified to comment on that field. If you can’t calculate the Schrödinger equation, I discount whatever you may say about what quantum physics reveals about reality.

Instead of asking which fields of math are “necessary” (or useful) to “rationality,” I think it’s more productive to ask, “what key questions or ideas, involving math, would I like to understand?” Instead of going out of my way to learn the math that I predict will be useful, I’ll just embark on trying to understand the problems that I’m learning the math for, and working backwards to figure out what math I need for any particular problem. This has the advantage of never causing me to waste time on extraneous topics: I’ll come to understand the concepts I’ll need most frequently best, because I’ll encounter them most frequently (for instance, I think I’ll quickly realize that I need to get a solid understanding of calculus, and so study calculus, but there may be parts of math that don't crop up much, so I'll effectively skip those). While I usually appreciate the aesthetic beauty of abstract math, I think this sort of approach will also help keep me focused and motivated. Note that at this point, I’m trying to fill in the gaps in my understanding and attain “mathematical literacy” instead of a complete and comprehensive mathematical understanding (a worthy goal that I would like to pursue, but which is of lesser priority to me).

I think even a cursory familiarity with these subjects is likely to be very useful: when someone mentions say, an economic concept, I suspect that the value of even just vaguely remembering having solved a basic version of the problem will give me a significant insight into what the person is talking about, instead of having a hand-wavy, non-mathematical conception.

Eliezer said in the simple math of everything:

It seems to me that there's a substantial advantage in knowing the drop-dead basic fundamental embarrassingly simple mathematics in as many different subjects as you can manage.  Not, necessarily, the high-falutin' complicated damn math that appears in the latest journal articles.  Not unless you plan to become a professional in the field.  But for people who can read calculus, and sometimes just plain algebra, the drop-dead basic mathematics of a field may not take that long to learn.  And it's likely to change your outlook on life more than the math-free popularizations or the highly technical math.

(Does anyone with more experience than me foresee problems with this approach? Has this been tried before? How did it work?)

So, I’m asking you: what are some mathematically-founded concepts that are worth learning? Feel free to suggest things for their practical utility or their philosophical insight. Keep in mind that there is a relevant cost benefit analysis to consider: there are some concepts that are really cool to understand, but require many levels of math to get to. (I think after people have responded here, I’ll put out another post for people to vote on a good order to study these things, starting with those topics that have the minimal required mathematical foundation and working up to the complex higher level topics that require calculus, linear algebra, matrices, and analysis.)

These are some things that interest me:

-       The math of natural selection and evolution

-       The Schrödinger equation

-       The math governing the dynamics of political elections

-       Basic optimization problems of economics? Other things from economics? (I don’t know much about these. Are they interesting? Useful?)

-       The basic math of neural networks (or “the differential equations for gradient descent in a non-recurrent multilayer network with sigmoid units”) (Eliezer says it’s simpler than it sounds, but he was also a literal child prodigy, so I don’t know how much that counts for.)

-       Basic statistics

-       Whatever the foundations of Bayesianism are

-       Information theory?

-       Decision theory

-       Game theory (does this even involve math?)

-       Probability theory

-       Things from physics? (While I like physics, I don’t think learning more of it would significantly improve my understanding of macro-level processes that would impact my decisions. It's not as interesting to me as some of the other things on this list, right now. Tell me if I'm wrong or what particular sub-fields of physics are most worthwhile.)

-       Some common computer science algorithms (What are these?)

-       The math that makes reddit work?

-       Is there a math of sociology?

-       Chaos theory?

-       Musical math

-       “Sacred geometry” (an old interest of mine)

-       Whatever math is used in meta analyses

-       Epidemiology

I’m posting most of these below. Please upvote and downvote to tell me how interesting or useful you think a given topic is. Please don’t vote on how difficult they are, that’s a different metric that I want to capture separately. Please do add your own suggestions and any comments on each of the topics.

Note: looking around, I found this. If you’re interested in this post, go there. I’ll be starting with it.

Edit: Looking at the page, I fear that putting a sort of "vote" in the comments might subtly dissuade people from commenting and responding in the usual way. Please don't be dissuaded. I want your ideas and comments and explicitly your own suggestions. Also, I have a karma sink post under Artaxerxes's comment (here). If you want to vote, but not add to my karma, you can balance the cosmic scales there.

Edit2: If you know of the specific major equations, problems, theorems, or algorithms that relate to a given subject, please list them. For instance, I just added Price's Equation as a comment to the listed "math of natural selection and evolution" and the Median Voter Theorem has been listed under "the math of politics."

Intentionally Raising the Sanity Waterline

9 Gleb_Tsipursky 13 November 2014 08:25PM

Hi all, I’m a social entrepreneur, professor, and aspiring rationalist. My project is Intentional Insights. This is a new nonprofit I co-founded with my wife and other fellow aspiring rationalists in the Columbus, OH Less Wrong meetup. The nonprofit emerged from our passion to promote rationality among the broad masses. We use social influence techniques, create stories, and speak to emotions. We orient toward creating engaging videos, blogs, social media, and other content that an aspiring rationalist like yourself can share with friends and family members who would not be open to rationality proper due to the Straw Vulcan misconception. I would appreciate any advice and help from fellow aspiring rationalists. The project is described more fully below, but for those for whom that’s tl;dr, there is a request for advice and allies at the bottom.

Since I started participating in the Less Wrong meetup in Columbus, OH and reading Less Wrong, what seems like ages ago, I can hardly remember my past thinking patterns. Because of how much awesomeness it brought to my life, I have become one of the lead organizers of the meetup. Moreover, I find it really beneficial to bring rationality into my research and teaching as a tenure-track professor at Ohio State, where I am a member of the Behavioral Decision-Making Initiative. Thus, my scholarship brings rationality into historical contexts, for example in my academic articles on agency, emotions, and social influence. In my classes I have students engage with the Checklist of Rationality Habits and other readings that help advance rational thinking.

As do many aspiring rationalists, I think rationality can bring such benefits to the lives of many others, and also help improve our society as a whole by leveling up rational thinking, secularizing society, and thus raising the sanity waterline. For that, our experience in the Columbus Less Wrong group has shown that we need to get people interested in rationality by showing them its benefits and how it can solve their problems, while delivering complex ideas in an engaging and friendly fashion targeted at a broad public, and using active learning strategies and connecting rationality to what they already know. This is what I do in my teaching, and is the current best practice in educational psychology. It has worked great with my students when I began to teach them rationality concepts. Yet I do not know of any current rationality trainings that do this. Currently, such education in rationality is available mainly through excellent, intense 4-day workshops from the Center for Applied Rationality, usually held in the San Francisco area, which are aimed at a "select group of founders, hackers, and other ambitious, analytical, practically-minded people." We are targeting a much broader and less advanced audience, the upper 50-85%, while CFAR primarily targets the top 5-10%. We had great interactions with Anna Salamon, Julia Galef, Kenzi Amodei, and other CFAR folks, and plan to collaborate with them on various ways to do rationality outreach. Besides CFAR, there are also some online classes on decision-making from Clearer Thinking, as well as some other stuff we list on the Intentional Insights resources page. However, we really wanted to see something oriented at the broad public, which can gain a great deal from a much lower level of education in rationality made accessible and relevant to their everyday lives and concerns, and delivered in a fashion perceived as interesting, fun, and friendly by mass audiences, as we aim to do with our events.

Intentional Insights came from this desire. This nonprofit explicitly orients toward getting the broad masses interested in and learning about rationality by providing fun and engaging content delivered in a friendly manner. What we want to do is use various social influence methods and promote rationality as a self-improvement/leadership development offering for people who are not currently interested in rational thinking because of the Straw Vulcan image, but who are interested in self-improvement, professional development, and organizational development. As people become more advanced, we will orient them toward more advanced rationality, at Less Wrong and elsewhere. Now, there are those who believe rationality should be taught only to those who are willing to put in the hard work and effort to overcome the high barrier to entry of learning all the jargon. However, we are reformers, not revolutionaries, and believe that some progress is better than no progress. And the more aspiring rationalists engage in various projects aimed to raise the sanity waterline, using different channels and strategies, the better. We can all help and learn from each other, adopting an experimental attitude and gathering data about what methods work best, constantly updating our beliefs and improving our abilities to help more people gain greater agency.

The channels of delivery locally are classes and workshops. Here is what one college student participant wrote after a session: “I have gained a new perspective after attending the workshop. In order to be more analytical, I have to take into account that attentional bias is everywhere. I can now further analyze and make conclusions based on evidence.” This and similar statements seem to indicate some positive impact, and we plan to gather evidence to examine whether workshop participants adopt more rational ways of thinking and how the classes influence people’s actual performance over time.

We have a website that takes this content globally, as well as social media such as Facebook and Twitter. The website currently has:

-       Blog posts, such as on agency; polyamory and cached thinking; and life meaning and purpose. We aim to make them easy-to-read and engaging to get people interested in rational thinking. These will be targeted at a high school reading level, the type of fun posts aspiring rationalists can share with their friends or family members whom they may want to get into rationality, or at least explain what rationality is all about.

-       Videos with similar content to blog posts, such as on evaluating reality clearly, and on meaning and purpose

-       A resources page, with links to prominent rationality venues, such as Less Wrong, CFAR, HPMOR, etc.

It will eventually have:

-       Rationality-themed merchandise, including stickers, buttons, pens, mugs, t-shirts, etc.

-       Online classes teaching rationality concepts

-       A wide variety of other products and offerings, such as e-books and apps

Now, why my wife and I, and the Columbus Less Wrong group? To this project, I bring my knowledge of educational psychology, research expertise, and teaching experience; my wife her expertise as a nonprofit professional with an MBA in nonprofit management; and other Board members include a cognitive neuroscientist, a licensed therapist, a gentleman adventurer, and other awesome members of the Columbus, OH, Less Wrong group.

Now, I could really use the help of wise aspiring rationalists on this project:

1) If you were trying to get the Less Wrong community engaged in the project, what would you do?

2) If you were trying to promote this project broadly, what would you do? What dark arts might you use, and how?

3) If you were trying to get specific groups and communities interested in promoting rational thinking in our society engaged in the project, what would you do? What dark arts might you use, and how?

4) If you were trying to fundraise for this project, what would you do? What dark arts might you use, and how?

5) If you were trying to persuade people to sign up for workshops or check out a website devoted to rational thinking, what would you do? How would you tie it to people’s self-interest and everyday problems that rationality might solve? What dark arts might you use, and how?

6) If you were trying to organize a nonprofit devoted to doing all the stuff above, what would you do to help manage its planning and organization? What about managing relationships and group dynamics?

Besides the advice, I invite you to ally with us and collaborate on this project in whatever way is optimal for you. Money is very helpful right now as we are fundraising to pay for costs associated with starting up the nonprofit, around $3600 through the rest of 2014, and you can donate directly through our website. Your time, intellectual capacity, and any specific talents would also be great, on things such as giving advice and helping out on specific tasks/projects, developing content in the form of blogs, videos, etc., promoting the project to those you know, and other ways to help out.

Leave your thoughts in comments below, or you can get in touch with me at gleb@intentionalinsights.org. I hope you would like to ally with us to raise the sanity waterline!


EDIT: Based on your feedback, we've decided that this post on polyamory and cached thinking is probably a bad fit for what we want to promote right now. We've removed it from the main index of our site. Thanks for helping!

The germ of an idea

6 Stuart_Armstrong 13 November 2014 06:58PM

Apologies for posting another unformed idea, but I think it's important to get it out there.

The problem with dangerous AI is that it's intelligent, and thus adapts to our countermeasures. If we did something like plant a tree and order the AI not to eat the apple on it, as a test of its obedience, it would easily figure out what we were doing, and avoid the apple (until it had power over us), even if it were a treacherous apple-devouring AI of DOOM.

When I wrote the AI indifference paper, it seemed that it showed a partial way around this problem: the AI would become indifferent to a particular countermeasure (in that example, explosives), so wouldn't adapt its behaviour around it. It seems that the same idea can make an Oracle not attempt to manipulate us through its answers, by making it indifferent as to whether the message was read.

The idea I'm vaguely groping towards is whether this is a general phenomenon - whether we can use indifference to prevent the AI from adapting to any of our efforts. The second question is whether we can profitably use it on the AI's motivation itself. Something like the reduced impact AI reasoning about what impact it could have on the world. This has a penalty function for excessive impact - but maybe that's gameable, maybe there is a pernicious outcome that doesn't have a high penalty, if the AI aims for it exactly. But suppose the AI could calculate its impact under the assumption that it didn't have a penalty function (utility indifference is often equivalent to having incorrect beliefs, but less fragile than that).

So if it was a dangerous AI, it would calculate its impact as if it didn't have a penalty function (and hence no need to route around it), and thus would calculate a large impact, and get penalised by it.

My next post will be more structured, but I feel there's the germ of a potentially very useful idea there. Comments and suggestions welcome.

The Atheist's Tithe

6 Alsadius 13 November 2014 05:22PM

I made a comment on another site a week or two ago, and I just realized that the line of thought is one that LW would appreciate, so here's a somewhat expanded version. 

There's a lot of discussion around here about how to best give to charities, and I'm all for this. Ensuring donations are used well is important, and organizations like GiveWell that figure out how to get the most bang for your buck are doing very good work. An old article on LW (that I found while searching to make sure I wasn't being redundant by posting this) makes the claim that the difference between a decent charity and an optimal one can be two orders of magnitude, and I believe that. But the problem with this is, effective altruism only helps if people are actually giving money. 

People today don't tend to give very much to charity. They'll buy a chocolate bar for the school play or throw a few bucks in at work, but less than 2% of national income is donated even in the US, and the US is incredibly charitable by developed-world standards (the corresponding rate in Germany is about 0.1%, for example). And this isn't something that can be solved with math, because the general public doesn't speak math; it needs to be solved with social pressure.

The social pressure needs to be chosen well. Folks like Jeff Kaufman and Julia Wise giving a massive chunk of their income to charity are of course laudable, but 99%+ of people will regard the thought of doing so with disbelief and a bit of horror - it's simply not going to happen on a large scale, because people put themselves first, and don't think they could possibly part with so much of their income. We need to settle for a goal that is not only attainable by the majority of people, but that the majority of people know in their guts is something they could do if they wanted. Not everyone will follow through, but it should be set at a level that inspires guilt if they don't, not laughter. 

Since we're trying to make it something people can live up to, it has to be proportional giving, not absolute - Bill Gates and Warren Buffett telling each other to donate everything over a billion is wonderful, but doesn't affect many other people. Conversely, telling people that everything over $50k should be donated will get the laugh reaction from ordinary-wealthy folks like doctors and accountants, who are the people we most want to tie into this system. Also, even if it were workable, it would create some terrible disincentives to working extra-hard, which is a bad way to structure a system - we want to maximize donations, not merely ask people to suffer for its own sake.

Also, the rule needs to be memorable - we can't give out The Income Tax Act 2: Electric Boogaloo as our charitable donation manual, because people won't read it, won't remember it, and certainly won't pressure anyone else into following it. Ideally it should be extremely simple. And it'd be an added bonus if the amount chosen didn't seem arbitrary, if there was already a pre-existing belief that the number is generally appropriate for what part of your income should be given away. 

There's only one system that meets all these criteria - the tithe. Give away 10% of your income to worthy causes (not generally religion, though the religious folk of the world can certainly do so), keep 90% for yourself. It's practical, it's simple, it's guilt-able, it scales to income, it preserves incentives to work hard and thereby increase the total base of donations, and it's got a millennia-long tradition (which means both that it's proven to work and that people will believe it's a reasonable thing to expect).

Encouraging people to give more than that, or to give better than the default, are both worthwhile, but just like saving for retirement, the first thing to do is put enough money in, and only *then* worry about marginal changes in effectiveness. After all, putting Germany on the tithe rule is just as much of an improvement to charitable effectiveness as going from a decent charity to an excellent one, and it scales in a completely different way, so they can be worked on in parallel. 
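To make the Germany comparison concrete, here is a rough back-of-the-envelope sketch using only the approximate figures quoted above (a donation rate of about 0.1% of national income, a 10% tithe, and a two-orders-of-magnitude gap between a decent and an optimal charity). The variable names are just illustrative, and the rates are the post's rough numbers rather than measured data.

```python
import math

# Approximate figures quoted in the post (not precise statistics).
germany_rate = 0.001        # about 0.1% of national income donated
tithe_rate = 0.10           # the proposed 10% tithe

# Factor by which total giving would grow if the rate rose to the tithe.
giving_multiplier = tithe_rate / germany_rate

# The post's claimed gap between a decent charity and an optimal one.
charity_multiplier = 10 ** 2  # two orders of magnitude

# Express the giving increase in orders of magnitude for comparison.
giving_orders = math.log10(giving_multiplier)

print(f"Giving would grow by roughly {giving_multiplier:.0f}x")
print(f"That is about {giving_orders:.1f} orders of magnitude")
print(f"Charity-quality gap claimed in the post: {charity_multiplier}x")
```

Under these assumptions the two improvements come out at about the same size, and since one acts on how much is given and the other on where it goes, they multiply rather than compete.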

This is a rule that I try to follow myself, and sometimes encourage others to do while I'm wearing my financial-advisor hat. (And speaking with that hat: If you're a person who will actually follow through on this, avoid chipping in a few dollars here and there when people ask, and save up for bigger donations. That way you get tax receipts, which lower your effective cost of donation, as well as letting you pick better charities). 
