
Bad intent is a behavior, not a feeling

Benquo 01 May 2017 01:28AM

It’s common to think that someone else is arguing in bad faith. In a recent blog post, Nate Soares claims that this intuition is both wrong and harmful:

I believe that the ability to expect that conversation partners are well-intentioned by default is a public good. An extremely valuable public good. When criticism turns to attacking the intentions of others, I perceive that to be burning the commons. Communities often have to deal with actors that in fact have ill intentions, and in that case it's often worth the damage to prevent an even greater exploitation by malicious actors. But damage is damage in either case, and I suspect that young communities are prone to destroying this particular commons based on false premises.

To be clear, I am not claiming that well-intentioned actions tend to have good consequences. The road to hell is paved with good intentions. Whether or not someone's actions have good consequences is an entirely separate issue. I am only claiming that, in the particular case of small high-trust communities, I believe almost everyone is almost always attempting to do good by their own lights. I believe that propagating doubt about that fact is nearly always a bad idea.

If bad intent were so rare in the relevant sense, it would be surprising that people are so quick to jump to the conclusion that it is present. Why would that be adaptive?

What reason do we have to believe that we’re systematically overestimating this? If we’re systematically overestimating it, why should we believe that it’s adaptive to suppress this? I think the burden of evidence is on the side disagreeing with the intuitions behind this extremely common defensive response, to explain what bad actors are, why we are on such a hair-trigger against them, and why we should relax this.

Nate continues:

My models of human psychology allow for people to possess good intentions while executing adaptations that increase their status, influence, or popularity. My models also don’t deem people poor allies merely on account of their having instinctual motivations to achieve status, power, or prestige, any more than I deem people poor allies if they care about things like money, art, or good food. […]

One more clarification: some of my friends have insinuated (but not said outright as far as I know) that the execution of actions with bad consequences is just as bad as having ill intentions, and we should treat the two similarly. I think this is very wrong: eroding trust in the judgement or discernment of an individual is very different from eroding trust in whether or not they are pursuing the common good.

Nate's argument is almost entirely about mens rea - about subjective intent to make something bad happen - which is basically not a thing. He contrasts this with actions that have bad consequences, which are common. But there’s something in the middle: following an incentive gradient that rewards distortions. For instance, if you rigorously A/B test your marketing until it generates the presentation that attracts the most customers, and don’t bother to inspect why they respond positively to the result, then you’re simply saying whatever words get you the most customers, regardless of whether they’re true. In such cases, whether or not you ever formed a conscious intent to mislead, your strategy is to tell whichever lie is most convenient; there was nothing in your optimization target that forced your words to be true ones, and most possible claims are false, so you ended up making false claims.
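
To make the incentive-gradient point concrete, here is a minimal sketch of that kind of selection loop (a hypothetical illustration of my own, not anything from the post or a real marketing pipeline; the variant names and conversion rates are invented). Notice that nothing in the objective checks whether a claim is true:

    # Hypothetical sketch: pick whichever marketing claim converts best.
    # Truthfulness never appears anywhere in the objective.
    import random

    variants = {
        "claim_A": {"true": True,  "conversion_rate": 0.02},
        "claim_B": {"true": False, "conversion_rate": 0.05},  # false but catchy
    }

    def simulate_conversions(rate, visitors=10_000):
        """Crude A/B test: count how many simulated visitors convert."""
        return sum(random.random() < rate for _ in range(visitors))

    results = {name: simulate_conversions(v["conversion_rate"])
               for name, v in variants.items()}

    # The winner is chosen purely by conversions; variants[...]["true"] is never consulted.
    winner = max(results, key=results.get)
    print(winner, results)

If the false-but-catchy claim converts better, this loop selects it every time; truth only survives to the extent that it happens to correlate with the thing being optimized.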

More generally, if you try to control others’ actions, and don’t limit yourself to doing that by honestly informing them, then you’ll end up with a strategy that distorts the truth, whether or not you meant to. The default state for any given constraint is that it has not been applied to someone's behavior. To say that someone has the honest intent to inform is a positive claim about their intent. It's clear to me that we should expect this to sometimes be the case - sometimes people perceive a convergent incentive to inform one another, rather than a divergent incentive to grab control. But, if you do not defend yourself and your community against divergent strategies unless there is unambiguous evidence, then you make yourself vulnerable to those strategies, and should expect to get more of them.

I’ve been criticizing EA organizations a lot for deceptive or otherwise distortionary practices (see here and here), and one response I often get is, in effect, “How can you say that? After all, I've personally assured you that my organization never had a secret meeting in which we overtly resolved to lie to people?” I genuinely don’t see how this is relevant. Your public communication strategy can be publicly observed. If it tends to create distortions, then I can reasonably infer that you’re following some sort of incentive gradient that rewards some kinds of distortions. I don’t need to know about your subjective experiences to draw this conclusion. I don’t need to know your inner narrative. I can just look, as a member of the public, and report what I see.

Acting in bad faith doesn’t make you intrinsically a bad person, because there’s no such thing. But it has to be OK to point out when people are not just mistaken, but following patterns of behavior that are systematically distorting the discourse - and to point this out publicly so that we can learn to do better, together.

(Cross-posted at my personal blog.)

[Link] Nate Soares' "Assuming Good Intent"

5 Raemon 30 April 2017 05:45PM

There is No Akrasia

8 lifelonglearner 30 April 2017 03:33PM

I don’t think akrasia exists.


This is a fairly strong claim. I’m also not going to try and argue it.

 

What I’m really here to argue are the two weaker claims that:


a) Akrasia is often treated as a “thing” by people in the rationality community, and this can lead to problems, even though akrasia is a sorta-coherent concept.


b) If we want to move forward and solve the problems that fall under the akrasia-umbrella, it’s better to taboo the term akrasia altogether and instead employ a more reductionist approach that favors specificity.


But that’s a lot less catchy, and I think we can 80/20 it with the statement that “akrasia doesn’t exist”, hence the title and the opening sentence.


First off, I do think that akrasia is a term that resonates with a lot of people. When I’ve described this concept to friends (n = 3), they’ve all had varying degrees of reactions along the lines of “Aha! This term perfectly encapsulates something I feel!” On LW, it seems to have garnered acceptance as a concept, evidenced by the posts / wiki on it.


It does seem, then, that this concept of “want-want vs want” or “being unable to do what you ‘want’ to do” points at a phenomenologically real group of things in the world.


However, I think that this is actually bad.


Once people learn the term akrasia and what it represents, they can now pattern-match it to their own associated experiences. I think that, once you’ve reified akrasia, i.e. turned it into a “thing” inside your ontology, problems occur:


First off, treating akrasia as a real thing gives it additional weight and power over you:


Once you start to notice the patterns, it’s harder to see things again as mere apparent chaos. In the case of akrasia, I think this means that people may try less hard because they suddenly realize they’re in the grip of this terrible monster called akrasia.


I think this sort of worldview ends up reinforcing some unhelpful attitudes towards solving the problems akrasia represents. As an example, here are two paraphrased things I’ve overheard about akrasia which I think illustrate this. (Happy to remove these if you would prefer not to be mentioned.)


“Akrasia has mutant healing powers…Thus you can’t fight it, you can only keep switching tactics for a time until they stop working…”


“I have massive akrasia…so if you could just give me some more high-powered tools to defeat it, that’d be great…”  

 

Both of these quotes seem to have taken the akrasia hypothesis a little too far. As I’ll argue later, “akrasia” seems to be dealt with better when you see the problem as a collection of disparate failures in different parts of your ability to get things done, rather than as an umbrella term.


I think that the current akrasia framing actually makes the problem more intractable.


I see potential failure modes where people come into the community, hear about akrasia (and all the related scary stories of how hard it is to defeat), and end up using it as an excuse (perhaps not an explicit belief, but as an alief) that impacts their ability to do work.


This was certainly the case for me: improved introspection and metacognition on certain patterns in my mental behaviors actually removed a lot of the willpower that had served me well in the past. I may be getting slightly tangential here, but my point is that giving people models, useful as they might be for things like classification, may not always be net-positive.


Having new things in your ontology can harm you.


So just giving people some of these patterns and saying, “Hey, all these pieces represent a Thing called akrasia that’s hard to defeat,” doesn’t seem like the best idea.


How can we make the akrasia problem more tractable, then?


I claimed earlier that akrasia does seem to be a real thing, as it seems to be relatable to many people. I think this may actually be because akrasia maps onto too many things. It’s an umbrella term for lots of different problems in motivation and efficacy that could be quite disparate. The typical akrasia framing lumps together problems like temporal discounting with motivation problems like internal disagreements or ugh fields, and more.

 

Those are all very different problems with very different-looking solutions!


I think the above quotes about akrasia are an example of having mixed up the class with its members. Instead of treating akrasia as an abstraction that unifies a class of self-imposed problems that share the property of acting as obstacles towards our goals, we treat it as a problem unto itself.


Saying you want to “solve akrasia” makes about as much sense as directly asking for ways to “solve cognitive bias”. Clearly, cognitive biases are merely a label for a wide range of errors our brains make in our thinking. The exercises you’d go through to solve overconfidence look very different from the ones you might use to solve scope neglect, for example.


Under this framing, I think we can be less surprised when there is no direct solution to fighting akrasia—because there isn’t one.


I think the solution here is to be specific about the problem you are currently facing. It’s easy to just say you “have akrasia” and feel the smooth comfort of a catch-all term that doesn’t provide much in the way of insight. It’s another thing to go deep into your ugly problem and actually, honestly say what the problem is.


The important thing here is to identify which subset of the huge akrasia-umbrella your individual problem falls under and try to solve that specific thing instead of throwing generalized “anti-akrasia” weapons at it.


Is your problem one of remembering to do tasks? Then set up a Getting Things Done system.


Is your problem one of hyperbolic discounting, of favoring short-term gains? Then figure out a way to recalibrate the way you weigh outcomes. Maybe look into precommitting to certain courses of action.


Is your problem one of insufficient motivation to pursue things in the first place? Then look into why you care in the first place. If it turns out you really don’t care, then don’t worry about it. Else, find ways to source more motivation.


The basic (and obvious) technique I propose, then, looks like:


  1. Identify the akratic thing.

  2. Figure out what’s happening when this thing happens. Break it down into moving parts and how you’re reacting to the situation.

  3. Think of ways to solve those individual parts.

  4. Try solving them. See what happens.

  5. Iterate.


Potential questions to be asking yourself throughout this process:

  • What is causing your problem? (EX: Do you have the desire but just aren’t remembering? Are you lacking motivation?)

  • How does this akratic problem feel? (EX: What parts of yourself is your current approach doing a good job of satisfying? Which parts are not being satisfied?)

  • Is this really a problem? (EX: Do you actually want to do better? How realistic would it be to see the improvements you’re expecting? How much better do you think you could be doing?)


Here’s an example of a reductionist approach I did:


“I suffer from akrasia.


More specifically, though, I suffer from a problem where I end up not actually having planned things out in advance. This leads me to do things like browse the internet without having a concrete plan of what I’d like to do next. In some ways, this feels good because I actually like having the novelty of a little unpredictability in life.


However, at the end of the day when I’m looking back at what I’ve done, I have a lot of regret over having not taken key opportunities to actually act on my goals. So it looks like I do care (or meta-care) about the things I do everyday, but, in the moment, it can be hard to remember.”


Now that I’ve far more clearly laid out the problem above, it seems easier to see that the problem I need to deal with is a combination of:

  • Reminding myself of the stuff I would like to do (maybe via a schedule or to-do list).

  • Finding a way to shift my in-the-moment preferences a little more towards the things I’ve laid out (perhaps with a break that allows for some meditation).


I think that once you apply a reductionist viewpoint and specifically say exactly what it is that is causing your problems, the problem is already half-solved. (Having well-specified problems seems to be half the battle.)

 

Remember, there is no akrasia! There are only problems that have yet to be unpacked and solved!


Nate Soares' Replacing Guilt Series compiled in epub Format

6 lifelonglearner 30 April 2017 06:36AM

Hey everyone,

I really liked Nate Soares' Replacing Guilt series, which has had a major positive impact on growing my intrinsic motivation.

Recently, I compiled all the posts into an ePUB for my own reading, and I thought it might be good to share it here if anyone would like to download it for their e-readers / on-the-go reading. (I got Nate's permission first, so it's all good.)

Google Drive link here.

[Link] Moral Robots: Making sense of robot ethics. News aggregator

0 morganism 29 April 2017 09:51PM

New meet up in Las Vegas!

2 adamzerner 28 April 2017 11:57PM

Hey guys, I'd just like to announce that I'm starting a new meet up in Las Vegas!

WHEN: First and third Sunday of the month. 7pm-9pm.

WHERE: The Market (downtown on Fremont Street).

See http://lesswrong.com/meetups/1xg.

Change utility, reduce extortion

1 Stuart_Armstrong 28 April 2017 02:05PM

Crossposted at the Intelligent Agents Forum.

A full solution to the extortion problem is sorely elusive. However, there are crude hacks that we can use to mitigate the downside.

Suppose we figured out that a friendly AI should be maximising an unbounded utility function U. The extortion risk is that another AI could threaten a FAI with unbounded disutility if it didn't go along with its plans. This gives the extorting AI - the EAI - a lot of leverage, and things could end up badly if the EAI ends up acting on its threat.

To combat this, we first have to figure out a level z of utility that is a lower bound on what U could ever reach naturally and realistically.

By "naturally" we mean that U going below z would require not just incompetence or indifference, but some AI actively and deliberately arranging the lowering of U. And "realistically" just means that we're confident that getting U lower than z by chance, or having a U-minimising AI, are exceedingly low.

Then what we can do is to cut off U at the level z, replacing U with U' = max(U, z). See z indicated by the red line on this graph of U' versus U:

What's the consequence of this? First of all, it ensures that no EAI would threaten to reduce U (the utility we really care about) below z, because that is not a threat to the FAI. This reduces the leverage of the EAI, and reduces the impact of it acting on its threat.

Since levels of U below z are exceedingly unlikely to happen by chance, the fact that the FAI has the wrong utility below z shouldn't affect its performance much. And, even in that zone, the AI is still motivated to climb U above z.

But we may still feel unhappy about the flatness of that curve, and want it to still prefer higher U to exceedingly low values. If so, we can replace U with U'' as follows (the blue line is at z-1):

In this case, the EAI will not seek to reduce U below z-1 (in fact, it will specifically target that value), while the FAI has the correct ordering of lower values of U. The utility is weird around z, granted, but this is a place where the FAI would not want to be and would almost certainly not reach by accident.
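
Here is a minimal numerical sketch of the two modifications (my own illustration: the post gives U' = max(U, z), but describes U'' only by its graph, so the particular squashing function used below for values under z is an arbitrary choice that merely preserves the ordering and stays above z-1):

    # Sketch of the two modified utilities, treating z as a utility level.

    def u_prime(u, z):
        """Cut U off at z: outcomes below z all look the same to the FAI."""
        return max(u, z)

    def u_double_prime(u, z):
        """Keep the ordering below z, but squash it into the interval (z-1, z].
        This squashing function is an arbitrary illustrative choice."""
        if u >= z:
            return u
        return z - 1 + 1.0 / (1.0 + (z - u))  # monotone in u, never below z-1

    z = 0.0
    for u in [-1000.0, -10.0, -1.0, 0.0, 5.0]:
        print(u, u_prime(u, z), round(u_double_prime(u, z), 4))

Under u_prime, every outcome below z is equally (un)attractive, so deeper threats buy the EAI nothing. Under u_double_prime, the FAI still prefers -10 to -1000, but the extra leverage available from threatening ever-lower values of U is capped at roughly one unit of utility.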

Though this method does not eliminate the threat of extortion, it does seem to reduce its impact.

[Link] Boundedly Rational Patients? Health and Patient Mistakes in a Behavioral Framework

0 fortyeridania 28 April 2017 01:01AM

Scenario analysis: a parody

4 Stuart_Armstrong 27 April 2017 03:21PM

Based on an idea from Nick Bostrom.

Suit A: "Welcome to our futurology meeting extravaganza, where we are going to do a complete analysis of the future using... drumroll... Scenario analysis!"

All: "All hail mighty scenario analysis!"

Suit A: "So, what are the big risks in the future?"

Suit B: "Global warming? I heard that's bad."

Suit A: "Indeed it is. What else do we have that's bad?"

Suit C: "How about obesity?"

Suit B: "I still think global warming is rather more important, it's getting hot and..."

Suit C: "Well, my grandfather was fat, and he suffered and died because..."

Suit A: "No need to argue, gentlewomen! We'll simply do a scenario analysis with both variables. So here we have the Sweaty Fat quadrant... Let me put it up on the board:"

Suit A: "Now let's give each scenario a thorough analysis!"

Suit D: "Isn't fat an insulant?"

Suit A: "That's the kind of incisive commentary we need!"

...

...

Much later:

Suit C: "So we have an ideal strategy: keep an eye on sweat pants purchase, and adjust our investment accordingly."

Suit D: "What about our social responsibilities?"

Suit A: "Good point."

Suit B: "Well, then we can track the size of suits and ice cream consumption and adjust health spending and gas subsidies in function of these."

Suit A: "Well, I think we've done a fabulous job today; really. No-one could have done a better job predicting than us. And it's all thanks to... Scenario analysis!"

All: "All hail!"

 

(very tangentially connected to the problem of models that are over-precise in narrow areas)

Use and misuse of models: case study

12 Stuart_Armstrong 27 April 2017 02:36PM

Some time ago, I discovered a post comparing basic income and basic job ideas. This sought to analyse the costs of paying everyone a guaranteed income versus providing them with a basic job at that income. The author spelt out his assumptions and put together two models with a few components (including some whose values were drawn from various probability distributions). Then he ran a Monte Carlo simulation to get a distribution of costs for either policy.

Normally I should be very much in favour of this approach. It spells out the assumptions, it uses models, it decomposes the problem, it has stochastic uncertainty... Everything seems ideal. To top it off, the author concluded with a challenge aiming at improving reasoning around this subject:

How to Disagree: Write Some Code

This is a common theme in my writing. If you are reading my blog you are likely to be a coder. So shut the fuck up and write some fucking code. (Of course, once the code is written, please post it in the comments or on github.)

I've laid out my reasoning in clear, straightforward, and executable form. Here it is again. My conclusions are simply the logical result of my assumptions plus basic math - if I'm wrong, either Python is computing the wrong answer, I got really unlucky in all 32,768 simulation runs, or one of my assumptions is wrong.

My assumption being wrong is the most likely possibility. Luckily, this is a problem that is solvable via code.

And yet... I found something very unsatisfying. And it took me some time to figure out why. It's not that these models are helpful, or that they're misleading. It's that they're both, simultaneously.

To explain, consider the result of the Monte Carlo simulations. Here are the outputs (I added the red lines; we'll get to them soon):

The author concluded from these outputs that a basic job was much more efficient - less costly - than a basic income (roughly 1 trillion cost versus 3.4 trillion US dollars). He changed a few assumptions to test whether the result held up:

For example, maybe I'm overestimating the work disincentive for Basic Income and grossly underestimating the administrative overhead of the Basic Job. Lets assume both of these are true. Then what?

The author then found similar results, with some slight shifting of the probability masses.

 

The problem: what really determined the result

So what's wrong with this approach? It turns out that most of the variables in the models have little explanatory power. For the top red line, I just multiplied the US population by the basic income. The curve is slightly above it, because it includes such things as administrative costs. The basic job situation was slightly more complicated, as it includes a disabled population that gets the basic income without working, and an estimate for the added value that the jobs would provide. So the bottom red line is (disabled population)x(basic income) + (unemployed population)x(basic income) - (unemployed population)x(median added value of jobs). The distribution is wider than for basic income, as the added value of the jobs is a stochastic variable.

But, anyway, the contributions of the other variables were very minor. So the reduced cost of basic jobs versus basic income is essentially a consequence of the trivial fact that it's more expensive to pay everyone an income than to pay only some people and then put them to work at something of non-zero value.
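
As a rough illustration of why those two red lines dominate the comparison, here is a back-of-envelope version of them. Every input below is a placeholder I made up for illustration, not a value from Stucchio's model, so only the structure of the calculation matters:

    # Back-of-envelope versions of the two red lines.
    # All inputs are made-up placeholders, not the post's actual figures.

    population       = 320e6   # rough US population
    basic_income     = 12e3    # hypothetical annual basic income, in dollars
    disabled         = 20e6    # hypothetical: get the income without working
    unemployed       = 25e6    # hypothetical: would be given a basic job
    median_job_value = 8e3     # hypothetical median value added per job, per year

    # Top red line: pay everyone the basic income.
    basic_income_cost = population * basic_income

    # Bottom red line: pay the disabled and the unemployed, minus the value the jobs produce.
    basic_job_cost = (disabled * basic_income
                      + unemployed * basic_income
                      - unemployed * median_job_value)

    print(f"basic income ~ {basic_income_cost / 1e12:.2f} trillion")
    print(f"basic job    ~ {basic_job_cost / 1e12:.2f} trillion")

Whatever reasonable numbers you plug in, the first figure is a population-sized multiple of the income while the second is an unemployment-sized multiple of it, which is why the stochastic extras barely move the conclusion.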

 

Trees and forests

So were the complicated extra variables and Monte Carlo runs for nothing? Not completely - they showed that the extra variables were indeed of little importance, and unlikely to change the results much. But nevertheless, the whole approach has one big, glaring flaw: it does not account for the extra value for individuals of having a basic income versus a basic job.

And the challenge - "write some fucking code" - obscures this. The forest of extra variables and the thousands of runs hides the fact that there is a fundamental assumption missing. And pointing this out is enough to change the result, without even needing to write code. Note this doesn't mean the result is wrong: some might even argue that people are better off with a job than with the income (builds pride in one's work, etc...). But that needs to be addressed.

So Chris Stucchio's careful work does show one result - most reasonable assumptions do not change the fact that basic income is more expensive than basic job. And to disagree with that, you do indeed need to write some fucking code. But the stronger result - that basic job is better than basic income - is not established by this post. A model can be well designed, thorough, filled with good uncertainties, and still miss the mark. You don't always have to enter into the weeds of the model's assumptions in order to criticise it.

Introducing the Instrumental Rationality Sequence

25 lifelonglearner 26 April 2017 09:53PM

What is this project?

I am going to be writing a new sequence of articles on instrumental rationality. The end goal is to have a compiled ebook of all the essays, so the articles themselves are intended to be chapters in the finalized book. There will also be pictures.


I intend for the majority of the articles to be backed by somewhat rigorous research, similar in quality to Planning 101 (with perhaps slightly fewer citations). Broadly speaking, the plan is to introduce a topic, summarize the research on it, give some models and mechanisms, and finish off with some techniques to leverage the models.


The rest of the sequence will be interspersed with general essays on dealing with these concepts, similar to In Defense of the Obvious. Lastly, there will be a few experimental essays on my attempt to synthesize existing models into useful-but-likely-wrong models of my own, like Attractor Theory.


I will likely also recycle / cannibalize some of my older writings for this new project, but I obviously won’t post the repeated material here again as new stuff.


 


 

What topics will I cover?

Here is a broad overview of the three main topics I hope to go over:


(Ordering is not set.)


Overconfidence in Planning: I’ll be stealing stuff from Planning 101 and rewriting a bit for clarity, so not much will be changed. I’ll likely add more on the actual models of how overconfidence creeps into our plans.


Motivation: I’ll try to go over procrastination, akrasia, and behavioral economics (hyperbolic discounting, decision instability, precommitment, etc.)


Habituation: This will try to cover what habits are, conditioning, incentives, and ways to take the above areas and habituate them, i.e. actually putting instrumental rationality techniques into practice.


Other areas I may want to cover:

Assorted Object-Level Things: The Boring Advice Repository has a whole bunch of assorted ways to improve life that I think might be useful to reiterate in some fashion.


Aversions and Ugh Fields: I don’t know too much about these things from a domain knowledge perspective, but it’s my impression that being able to debug these sorts of internal sticky situations is a very powerful skill. If I were to write this section, I’d try to focus on Focusing and some assorted S1/S2 communication things. And maybe also epistemics.


Ultimately, the point here isn’t to offer polished rationality techniques people can immediately apply, but rather to give people an overview of the relevant fields with enough techniques that they get the hang of what it means to start making their own rationality.


 


 

Why am I doing this?

Niche Role: On LessWrong, there currently doesn’t appear to be a good in-depth series on instrumental rationality. Rationality: From AI to Zombies seems very strong for giving people a worldview that enables things like deeper analysis, but it leans very much into the epistemic side of things.


It’s my opinion that, aside from perhaps Nate Soares’s series on Replacing Guilt (which I would be somewhat hesitant to recommend to everyone), there is no in-depth repository/sequence that ties together these ideas of motivation, planning, procrastination, etc.


Granted, there have been many excellent posts here on several areas, but they've been fairly directed. Luke's stuff on beating procrastination, for example, is fantastic. I'm aiming for a broader overview that hits the current models and research on different things.


I think this means that creating this sequence could add a lot of value, especially to people trying to create their own techniques.


Open-Sourcing Rationality: It’s clear that work is being done on furthering rationality by groups like Leverage and CFAR. However, for various reasons, the work they do is not always available to the public. I’d like to give people who are interested but unable to work directly with these organizations something they can use to jump-start their own investigations.


I’d like this to become a similar Schelling Point that we could direct people to if they want to get started.


I don’t mean to imply that what I’ll produce is of the same caliber, but I do think it makes sense to have some sort of pipeline to get rationalists up to speed with the areas that (in my mind) tie into figuring out instrumental rationality. When I first began looking into this field, there was a lot of information that was scattered in many places.


I’d like to create something cohesive that people can point to when newcomers want to get started with instrumental rationality that similarly gives them a high level overview of the many tools at their disposal.


Revitalizing LessWrong: It’s my impression that independent essays on instrumental rationality have slowed over the years. (But also, as I mentioned above, this doesn’t mean stuff hasn’t happened. CFAR’s been hard at work iterating their own techniques, for example.) As LW 2.0 is being talked about, this seems like an opportune time to provide some new content and help with our reorientation towards LW becoming once again a discussion hub for rationality.


 


 

Where does LW fit in?

Crowd-sourcing Content: I fully expect that many other people will have fantastic ideas that they want to contribute. I think that’s a good idea. Given some basic things like formatting / roughly consistent writing style throughout, I think it’d be great if other potential writers see this post as an invitation to start thinking about things they’d like to write / research about instrumental rationality.


Feedback: I’ll be doing all this writing on a public Google Doc with posts that feature chapters once they’re done, so hopefully there’s ample room to improve and take in constructive criticism. Feedback on LW is often high-quality, and I expect that to definitely improve what I will be writing.


Other Help: I probably can’t comb through every single research paper out there, so if you see relevant information I didn’t, or want to help with the research process, let me know! Likewise, if you think there are other cool ways you can contribute, feel free to either send me a PM or leave a comment below.


 


 

Why am I the best person to do this?

I’m probably not the best person to be doing this project, obviously.


But, as a student, I have a lot of time on my hands, and time appears to be a major limiting reactant in this whole process.


Additionally, I’ve been somewhat involved with CFAR, so I have some mental models about their flavor of instrumental rationality; I hope this means I'll be writing about stuff that isn't just a direct rehash of their workshop content.


Lastly, I’m very excited about this project, so you can expect me to put in about 10,000 words (~40 pages) before I take some minor breaks to reset. My short-term goals (for the next month) will be on note-taking and finding research for habits, specifically, and outlining more of the sequence.

 

Background Reading: The Real Hufflepuff Sequence Was The Posts We Made Along The Way

15 Raemon 26 April 2017 06:15PM

This is the fourth post of the Project Hufflepuff sequence. Previous posts:


Epistemic Status: Tries to get away with making nuanced points about social reality by using cute graphics of geometric objects. All models are wrong. Some models are useful. 

Traditionally, when nerds try to understand social systems and fix the obvious problems in them, they end up looking something like this:

Social dynamics is hard to understand with your system 2 (i.e. deliberative/logical) brain. There are a lot of subtle nuances going on, and typically, nerds tend to see the obvious stuff, maybe go one or two levels deeper than the obvious stuff, and miss that it's in fact 4+ levels deep and happening in realtime, faster than you can deliberate. Human brains are pretty good (most of the time) at responding to the nuances intuitively. But in the rationality community, we've self-selected for a lot of people who:

  1. Don't really trust things that they can't understand fully with their system 2 brain. 
  2. Tend not to be as naturally skilled at intuitive mainstream social styles. 
  3. Are trying to accomplish things that mainstream social interactions aren't designed to accomplish (i.e. thinking deeply and clearly on a regular basis).

This post is an overview of essays that rationalist-types have written over the past several years, that I think add up to a "secret sequence" exploring why social dynamics are hard, and why they are important to get right. This may be useful both for understanding some previous attempts by the rationality community to change social dynamics on purpose, and for current endeavors to improve things.

(Note: I occasionally have words in [brackets], where I think original jargon was pointing in a misleading direction and I think it's worth changing)

To start with, a word of caution:

Armchair sociology can be harmful - Ozy's post is pertinent - most essays below fall into the category of "armchair sociology": attempts by nerds to understand and articulate social dynamics that they aren't actually that good at. Several times, when an outsider has looked in at rationalist attempts to understand human interaction, they've said "Oh my god, this is the blind leading the blind", and often that seemed to me like a fair assessment.

I think all the essays that follow are useful, and are pointing at something real. But taken individually, they're kinda like the blind men groping at the elephant, each coming away with the distinct impression that an elephant is like a snake, a tree, or a boulder, depending on which aspect they're looking at.

[Fake Edit: Ozy informs me that they were specifically warning against amateur sociology and not psychology. I think the idea still roughly applies]

Part 1. Cultural Assumptions of Trust

Guess [Infer Culture], Ask Culture, and Tell [Reveal] Culture (Malcolm Ocean)

 

Different people have different ways of articulating their needs and asking for help. Different ways of asking require different assumptions of trust. If people are bringing different expectations of trust into an interaction, they may feel that that trust is being violated, which can seem rude, passive aggressive or oppressive.

 

I'm listing this article, instead of numerous others about Ask/Guess/Tell, because I think: a) Malcolm does a good job of explaining how all the cultures work, and b) his presentation of Reveal culture is a good, clearer upgrade for Brienne's Tell culture, and I'm a bit sad it didn't seem to make it into the zeitgeist yet. 

I also like the suggestion to call Guess Culture "Infer Culture" (implying a bit more about what skills the culture actually emphasizes).

Guess Culture Screens for Trying to Cooperate (Ben Hoffman)

Rationality folk (and more generally, nerds) tend to prefer explicit communication over implicit, and generally see Guess culture as strictly inferior to Ask culture once you've learned to assert yourself. 

But there is something Guess culture does which Ask culture doesn't, which is give you evidence of how much people understand you and are trying to cooperate. Guess culture filters for people who have either invested effort into understanding your culture overall, or are good at inferring your own wants. 

Sharp Culture and Soft Culture (Sam Rosen)

[WARNING: It turned out lots of people thought this meant something different than what I thought it meant. Some people thought it meant soft culture didn't involve giving people feedback or criticism at all. I don't think Soft/Sharp are totally-naturally clusters in the first place, and the distinction I'm interested in (as applies to rationality-culture), is how you give feedback.

(i.e. "Dude, your art sucks. It has no perspective." vs "oh, cool. Nice colors. For the next drawing, you might try incorporating perspective", as a simplified example)]

Somewhat orthogonal to Infer/Ask/Reveal culture is "Soft" vs "Sharp" culture. Sharp culture tends to have more biting humor, ribbing each other, and criticism. Soft culture tends to value kindness and social harmony more. Sam says that Sharp culture "values honesty more." Robby Bensinger counters in the comments: "My own experience is that sharp culture makes it more OK to be open about certain things (e.g., anger, disgust, power disparities, disagreements), but less OK to be open about other things (e.g., weakness, pain, fear, loneliness, things that are true but not funny or provocative or badass)."

Handshakes, Hi, and What's New: What's Going on With Small Talk?  (Ben Hoffman)

Small talk often sounds nonsensical to literally-minded people, but it serves a fairly important function: giving people a structured path to figure out how much time/sympathy/interest they want to give each other. And even when the answer is "not much", it still is, significantly, nonzero - you regard each other as persons, not faceless strangers.

Personhood [Social Interfaces?]  (Kevin Simler)

This essay gets a lot of mixed reactions, much of which I think has to do with its use of the word "Person." The essay is aimed at explaining how people end up treating each other as persons or nonpersons, without making any kind of judgement about it. This includes noting some things humans tend to do that you might consider horrible.

Like many grand theories, I think it overstates its case and ignores some places where the explanation breaks down, but I think it points at a useful concept which is summarized by this adorable graphic:

The essay uses the word "personhood". In the original context, this was useful: it gets at why cultures develop, why it matters whether you're able to demonstrate reliability, trust, etc. It helps explain outgroups and xenophobia: outsiders do not share your social norms, so you can't reliably interact with them, and it's easier to think of them as non-people than try to figure out how to have positive interactions.

But what I'm most interested in is "how can we use this to make it easier for groups with different norms to interact with each other"? And for that, I think using the word "personhood" makes it way more likely to veer into judging each other for having different preferences and communication styles.

What makes a person is... arbitrary, but not fully arbitrary. 

Rationalist culture tends to attract people who prefer a particular style of “social interface”, often favoring explicit communication and discussing ideas in extreme detail. There's a lot of value to those things, but they have some problems:

a) this social interface does NOT mesh well with the rest of the world (this is a problem if you have any goals that involve the rest of the world)

b) this social interface does not uniformly mesh well with all the people who are interested in, and valuable to, the rationality community.

I don't actually think it's possible to develop a set of assumptions that fits everyone's needs. But I do think it's possible to develop better tools for navigating different social contexts. I think it may be possible to tweak sets-of-norms so that they mesh better together, or at least so that when they bump into each other, there's greater awareness of what's happening, and people's default response is "oh, we seem to have different preferences, let's figure out how to handle that."

Maybe we can end up with something that looks kinda like this:

Against Being Against or For Tell Culture  (Brienne Yudkowsky)

Having said a bunch of things about different cultural interfaces, I think this post by Brienne is really important, and highlights the end goal of all of this.

"Cultures" are a crutch. They are there to help you get your bearings. They're better than nothing. But they are not a substitute for actually having the skills needed to navigate arbitrary social situations as they come up so you can achieve whatever it is you want to achieve. 

To master communication, you can't just be like, "I prefer Tell Culture, which is better than Guess Culture, so my disabilities in Guess Culture are therefore justified." Justified shmustified, you're still missing an arm.

My advice to you - my request of you, even - if you find yourself fueling these debates [about which culture is better], is to (for the love of god) move on. If you've already applied cognitive first aid, you've created an affordance for further advancement. Using even more tourniquets doesn't help.

Part 2. Game Theory, Recursion and Trust

(or, "Social dynamics are really complicated, you are not getting away with the things you think you are getting away with, stop trying to be clever, manipulative, act-utilitarian or naive-consequentialist without actually understanding what is going on")

Grokking Newcomb's Problem and Deserving Trust (Andrew Critch)

Critch argues that it is not just "morally wrong" but an intellectual mistake to violate someone’s trust (even when you don’t expect any repercussions in the future).

When someone decides whether to trust you (say, giving you a huge opportunity), on the expectation that you’ll refrain from exploiting them, they’ve already run a low-grade simulation of you in their imagination. And the thing is that you don’t know whether you’re in a simulation or not when you make the decision whether to repay them. 

Some people argue “but I can tell that I’m a conscious being, and they aren’t a literal super-intelligent AI, they’re just a human. They can’t possibly be simulating me in this high fidelity. I must be real.” This is true. But their simulation of you is not based on your thoughts, it’s based on your actions. It’s really hard to fake. 

One way to think about it, not expounded on in the article: Yes, if you pause to think about it you can notice that you’re conscious and probably not being simulated in their imagination. But by the time you notice that, it’s too late. People build up models of each other all the time, based on very subtle cues such as how fast you respond to something. Conscious you knows that you’re conscious. But their decision of whether to trust you was based off the half-second it took for unconscious you to reply to questions like “Hey, do you think you can handle Project X while I’m away?”

The best way to convince people you’re trustworthy is to actually be trustworthy.

You May Not Believe In Guess[Infer] Culture But It Believes In You (Scott Alexander)

This is short enough to just include the whole thing:

Consider an "ask culture" where employees consider themselves totally allowed to say "no" without repercussions. The boss would prefer people work unpaid overtime so ey gets more work done without having to pay anything, so ey asks everyone. Most people say no, because they hate unpaid overtime. The only people who agree will be those who really love the company or their job - they end up looking really good. More and more workers realize the value of lying and agreeing to work unpaid overtime so the boss thinks they really love the company. Eventually, the few workers who continue refusing look really bad, like they're the only ones who aren't team players, and they grudgingly accept.

Only now the boss notices that the employees hate their jobs and hate the boss. The boss decides to only ask employees if they will work unpaid overtime when it's absolutely necessary. The ask culture has become a guess culture.

How this applies to friendship is left as an exercise for the reader.

The Social Substrate (Lahwran)

A fairly in-depth look into how common knowledge, signaling, Newcomb-like problems, and recursive modeling of each other interact to produce "regular social interaction."

I think there's a lot of interesting stuff here - I'm not sure if it's exactly accurate but it points in directions that seem useful. But I actually think the most important takeaway is the warning at the beginning:

WARNING: An easy instinct, on learning these things, is to try to become more complicated yourself, to deal with the complicated territory. However, my primary conclusion is "simplify, simplify, simplify": try to make fewer decisions that depend on other people's state of mind. You can see more about why and how in the posts in the "Related" section, at the bottom.

When you're trying to make decisions about people, you're reading a lot of subtle cues off them to get a sense of how you feel about that. When you [generic person you, not necessarily you in particular] can tell someone is making complex decisions based on game theory and trying to model all of this explicitly, it a) often comes across as a bit off, and b) even if it doesn't, you still have to invest a lot of cognitive resources figuring out how they are modeling things and whether they are actually doing a good job or missing key insights or subtle cues. The result can be draining, and it can output a general response of "ugh, something about this feels untrustworthy."

Whereas when people are able to cache this knowledge down into a system-1 level, you're able to execute a simpler algorithm that looks more like "just try to be a good trustworthy person", that people can easily read off your facial expression, and which reduces overall cognitive burden.

System 1 and System 2 Morality  (Sophie Grouchy)

There’s some confusion over what “moral” means, because there are two kinds of morality: 

 - System 1 morality is noticing-in-realtime when people need help, or when you’re being an asshole, and then doing something about it. 

 - System 2 morality is when you have a complex problem and a lot of time to think about it. 

System 1 moralists will pay back Parfit’s Hitchhiker because doing otherwise would be being a jerk. System 2 moralists invent Timeless [Functional?] decision theory. You want a lot of people with System 2 morality in the world, trying to fix complex problems. You want people with System 1 morality in your social circle.

The person who wrote this post eventually left the rationality community, in part due to frustration due to people constantly violating small boundaries that seemed pretty obvious (things in the vein of “if you’re going to be 2 hours late, text me so I don’t have to sit around waiting for you.”)

Final Remarks

I want to reiterate - all models are wrong. Some models are useful. The most important takeaway from this is not that any particular one of these perspectives is true, but that social dynamics has a lot of stuff going on that is more complicated than you're naively imagining, and that this stuff is important enough to put the time into getting right.

[Stub] Extortion and Pascal's wager

2 Stuart_Armstrong 26 April 2017 01:07PM

The premises of Pascal's wager are normally presented as abstract facts about the universe - there happens to (maybe) be a god, who happens to have set up the afterlife for the suffering of unbelievers.

But, assuming we ever manage to distinguish trade from extortion, this seems like a situation of classical extortion. So if god follows a timeless decision theory - and what other kind of decision theory would it follow? - the correct answer would seem to be to reject the whole deal out of hand, even if you assume god exists.

Or, in other words, respond to a god that offers you heaven, but ignore one that threatens you with hell.

Actors and scribes, words and deeds

6 Benquo 26 April 2017 05:12AM

[Epistemic status: exploratory exercise in naming and concept-formation.]

Among the kinds of people, are the Actors, and the Scribes. Actors mainly relate to speech as action that has effects. Scribes mainly relate to speech as a structured arrangement of pointers that have meanings.

I previously described this as a distinction between promise-keeping "Quakers" and impulsive "Actors," but I think this missed a key distinction. There's "telling the truth," and then there's a more specific thing that's more obviously distinct from even Actors who are trying to make honest reports: keeping precisely accurate formal accounts. This leaves out some other types – I'm not exactly sure how it relates to engineers and diplomats, for instance – but I think I have the right names for these two things now.

Summary

Everyone agrees that words have meaning; they convey information from the speaker to the listener or reader. That's all they do. So when I used the phrase “words have meanings” to describe one side of a divide between people who use language to report facts, and people who use language to enact roles, was I strawmanning the other side?

I say no. Many common uses of language, including some perfectly legitimate ones, are not well-described by "words have meanings." For instance, people who try to use promises like magic spells to bind their future behavior don't seem to consider the possibility that others might treat their promises as a factual representation of what the future will be like.

Some uses of language do not simply describe objects or events in the world, but are enactive, designed to evoke particular feelings or cause particular actions. Even when speech can only be understood as a description of part of a model of the world, the context in which a sentence is uttered often implies an active intent, so if we only consider the direct meaning of the text, we will miss the most important thing about the sentence.

Some apparent uses of language’s denotative features may in fact be purely enactive. This is possible because humans initially learn language mimetically, and try to copy usage before understanding what it’s for. Primarily denotative language users are likely to assume that structural inconsistencies in speech are errors, when they’re often simply signs that the speech is primarily intended to be enactive.

Enactive language

Some uses of words are enactive: ways to build or reveal momentum. Others denote the position of things on your world-map.

In the denotative framing, words largely denote concepts that refer to specific classes of objects, events, or attributes in the world, and should be parsed as such. The meaning of a sentence is mainly decomposable into the meanings of its parts and their relations to each other. Words have distinct meanings that can be composed together in structures to communicate complex and nonobvious messages, or just uses and connotations.

In the enactive mode, the function of speech is to produce some action or disposition in your listener, who may be yourself. Ideas are primarily associative, reminding you of the perceptions with which the speech-act is associated. Other uses of language are structural. When you speak in this mode, it’s to describe models - relationships between concepts, which refer to classes of objects in the world.

When I wrote about admonitions as performance-enhancing speech, I gave the example of someone being encouraged by their workout buddies:

Recently, at the gym, I overheard some group of exercise buddies admonishing their buddy on some machine to keep going with each rep. My first thought was, “why are they tormenting their friend? Why can’t they just leave him alone? Exercise is hard enough without trying to parse social interactions at the same time.”

And then I realized - they’re doing it because, for them, it works. It's easier for them to do the workout if someone is telling them, “Keep going! Push it! One more!”

In the same post, I quoted Wittgenstein’s thought experiment of a language where words are only ever used as commands, with a corresponding action, never to refer to an object. Wittgenstein gives the example of a language used for nothing but military orders, and then elaborates on a hypothetical language used strictly for work orders. For instance, a foreman might use the utterance “Slab!” to direct a worker to fetch a slab of rock. I summarized the situation thus:

When I hear “slab”, my mind interprets this by imagining the object. A native speaker of Wittgenstein’s command language, when hearing the utterance “Slab!”, might - merely as the act of interpreting the word - feel a sense of readiness to go fetch a stone slab.

Wittgenstein’s listener might think of the slab itself, but only as a secondary operation in the process of executing the command. Likewise, I might, after thinking of the object, then infer that someone wants me to do something with the slab. But that requires an additional operation: modeling the speaker as an agent and using Gricean implicature to infer their intentions. The word has different cognitive content or implications for me, than for the speaker of Wittgenstein’s command language.

Military drills are also often about disintermediating between a command and action. Soldiers learn that when you receive an order, you just do the thing. This can lead to much more decisive and coordinated action in otherwise confusing situations – a familiar stimulus can lead to a regular response.

When someone gives you driving directions by telling you what you'll observe, and what to do once you make that observation, they're trying to encode a series of observation-action linkages in you.

This sort of linkage can happen to nonverbal animals too. Operant conditioning of animals gets around most animals' difficulty understanding spoken instructions, by associating a standardized reward indicator with the desired action. Often, if you want to train a comparatively complex action like pigeons playing pong, you'll need to train them one step at a time, gradually chaining the steps together, initially rewarding much simpler behaviors that will eventually compose into the desired complex behavior.

Crucially, the communication is never about the composition itself, just the components to be composed. Indeed, it’s not about anything, from the perspective of the animal being trained. This is similar to an old-fashioned army reliant on drill, in which, during battle, soldiers are told the next action they are to take, not told about overall structure of their strategy. They are told to, not told about.

Indeterminacy of translation

It’s conceivable that having what appears to be a language in common does not protect against such differences in interpretation. Quine also points to indeterminacy of translation and thus of explicable meaning with his "gavagai" example. As Wikipedia summarizes it:

Indeterminacy of reference refers to the interpretation of words or phrases in isolation, and Quine's thesis is that no unique interpretation is possible, because a 'radical interpreter' has no way of telling which of many possible meanings the speaker has in mind. Quine uses the example of the word "gavagai" uttered by a native speaker of the unknown language Arunta upon seeing a rabbit. A speaker of English could do what seems natural and translate this as "Lo, a rabbit." But other translations would be compatible with all the evidence he has: "Lo, food"; "Let's go hunting"; "There will be a storm tonight" (these natives may be superstitious); "Lo, a momentary rabbit-stage"; "Lo, an undetached rabbit-part." Some of these might become less likely – that is, become more unwieldy hypotheses – in the light of subsequent observation. Other translations can be ruled out only by querying the natives: An affirmative answer to "Is this the same gavagai as that earlier one?" rules out some possible translations. But these questions can only be asked once the linguist has mastered much of the natives' grammar and abstract vocabulary; that in turn can only be done on the basis of hypotheses derived from simpler, observation-connected bits of language; and those sentences, on their own, admit of multiple interpretations.

Everyone begins life as a tiny immigrant who does not know the local language, and has to make such inferences, or something like them. Thus, many of the difficulties in nailing down exactly what a word is doing in a foreign language have analogues in nailing down exactly what a word is doing for another speaker of one’s own language.

Mimesis, association, and structure

Not only do we all begin life as immigrants, but as immigrants with no native language to which we can analogize our adopted tongue. We learn language through mimesis. For small children, language is perhaps more like Wittgenstein's command language than my reference-language. It's a commonplace observation that children learn the utterance "No!" as an expression of will. In The Ways of Naysaying: No, Not, Nothing, and Nonbeing, Eva Brann provides a charming example:

Children acquire some words, some two-word phrases, and then no. […] They say excited no to everything and guilelessly contradict their naysaying in the action: "Do you want some of my jelly sandwich?" "No." Gets on my lap and takes it away from me. […] It is a documented observation that the particle no occurs very early in children's speech, sometimes in the second year, quite a while before sentences are negated by not.

First we learn language as an assertion of will, a way to command. Then, later, we learn how to use it to describe structural features of world-models. I strongly suspect that this involves some new, not entirely mimetic cognitive machinery kicking in, something qualitatively different: we start to think in terms of pointer-referent and concept-referent relations. In terms of logical structures, where "no" is not simply an assertion of negative affect, but inverts the meaning of whatever follows. Only after this do recursive clauses, conditionals, and negation of negation make any sense at all.

As long as we agree on something like rules of assembly for sentences, mimesis might mask a huge difference in how we think about things. It's instructive to look at how the current President of the United States uses language. He's talking to people who aren't bothering to track the structure of sentences. This makes him sound more "conversational" and, crucially, allows him to emphasize whichever words or phrases he wants, without burying them in a potentially hard-to-parse structure. As Katy Waldman of Slate says:

For some of us, Trump’s language is incendiary garbage. It’s not just that the ideas he wants to communicate are awful but that they come out as Saturnine gibberish or lewd smearing or racist gobbledygook. The man has never met a clause he couldn’t embellish forever and then promptly forget about. He uses adjectives as cudgels. You and I view his word casserole as not just incoherent but representative of the evil at his heart.

But it works. […]

Why? What’s the secret to Trump’s accidental brilliance? A few theories: simple component parts, weaponized unintelligibility, dark innuendo, and power signifiers.

[…] Trump tends to place the most viscerally resonant words at the end of his statements, allowing them to vibrate in our ears. For instance, unfurling his national security vision like a nativist pennant, Trump said:

But, Jimmy, the problem –
I mean, look, I’m for it.
But look, we have people coming into the country
that are looking to do tremendous harm….
Look what happened in Paris.
Look what happened in California,
with, you know, 14 people dead.
Other people are going to die,
they’re badly injured – we have a real problem.

Ironically, because Trump relies so heavily on footnotes, false starts, and flights of association, and because his digressions rarely hook back up with the main thought, the emotional terms take on added power. They become rays of clarity in an incoherent verbal miasma. Think about that: If Trump were a more traditionally talented orator, if he just made more sense, the surface meaning of his phrases would likely overshadow the buried connotations of each individual word. As is, to listen to Trump fit language together is to swim in an eddy of confusion punctuated by sharp stabs of dread. Which happens to be exactly the sensation he wants to evoke in order to make us nervous enough to vote for him.

Of course, Waldman is being condescending and wrong here. This is not word salad; it's high-context communication. But high-context communication isn't what you use when you think you might persuade someone who doesn't already agree with you; it's just a more efficient exercise in flag-waving. We don't see complex structure here because Trump isn't trying to communicate the sort of novel content that structural language is required for. He's just saying "what everyone was already thinking."

But while Waldman picked a poor example, she's not wholly wrong. In some cases, the President of the United States seems to be impressionistically alluding to arguments or events his audience has already heard of – but his effective rhetorical use of insulting epithets like “Little Marco,” “Lying Ted Cruz,” and “Crooked Hillary” fits very clearly into this schema. Instead of asking us to absorb facts about his opponents, incorporate them into coherent world-models, and then follow his argument for how we should judge them for their conduct, he used the simple expedient of putting a name next to a descriptor, repeatedly, so that the descriptor's connotations would attach to the name. We weren't asked to think about anything. These were simply command words, designed to act directly on our feelings about the people he insulted.

We weren't asked to take his statements as factually accurate. It's enough that they're authentic.

This was persuasive to enough voters to make him President of the United States. This is not a straw man. This is real life. This is the world we live in.

You might object that the President of the United States is an unfair example, and that most people of any importance should be expected to be better and clearer thinkers than the leader of the free world. So, let's consider the case of some middling undergraduates taking an economics course.

Robin Hanson reports that he can get students to mimic an economic way of talking, but not to think like an economist:

After eighteen years of being a professor, I’ve graded many student essays. And while I usually try to teach a deep structure of concepts, what the median student actually learns seems to mostly be a set of low order correlations. They know what words to use, which words tend to go together, which combinations tend to have positive associations, and so on. But if you ask an exam question where the deep structure answer differs from answer you’d guess looking at low order correlations, most students usually give the wrong answer.
[...]
Let me call styles of talking (or music, etc.) that rely mostly on low order correlations “babbling”. Babbling isn’t meaningless, but to ignorant audiences it often appears to be based on a deeper understanding than is actually the case. When done well, babbling can be entertaining, comforting, titillating, or exciting. It just isn’t usually a good place to learn deep insight.

This is a straightforward description of thinking that is formal but nonconceptual. Hanson's students have learned some words, and rules for moving the words around and putting them together, but at no point did they connect those rules with regular properties of the things the words point to. The words are the things. When Hanson stops feeding them the right keywords, and asks questions that require them to understand the underlying structural features of reality that economics is supposed to describe, they come up empty.
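As a concrete illustration of what "low order correlations" alone can produce, here is a toy bigram babbler. The miniature corpus and its phrasing are invented for this sketch; the point is only that stringing words together by which-follows-which yields fluent-sounding economics talk with no supply-and-demand model behind it.

```python
import random
from collections import defaultdict

# A tiny invented corpus of the kind of stock phrases a student might absorb.
CORPUS = [
    "higher prices reduce quantity demanded",
    "higher prices increase quantity supplied",
    "incentives matter because people respond to incentives",
    "trade creates value because both parties gain from trade",
    "opportunity cost matters because resources are scarce",
]

def build_bigram_model(sentences):
    """Record, for each word, which words have followed it -- nothing deeper."""
    followers = defaultdict(list)
    for sentence in sentences:
        words = sentence.split()
        for current, nxt in zip(words, words[1:]):
            followers[current].append(nxt)
    return followers

def babble(followers, start, max_words=12, seed=1):
    """Generate fluent-sounding output from low order correlations alone."""
    rng = random.Random(seed)
    output = [start]
    while len(output) < max_words:
        options = followers.get(output[-1])
        if not options:
            break
        output.append(rng.choice(options))
    return " ".join(output)

if __name__ == "__main__":
    model = build_bigram_model(CORPUS)
    print(babble(model, "higher"))
    print(babble(model, "incentives"))
    print(babble(model, "trade"))
```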

Of course, it seems unlikely that many people can't think structurally at all. It seems to me that nearly everyone can think structurally about physical objects in their immediate environment. But when talking about abstractions, or about the future, some people seem to shift to a mental mode where words don't carry the same weight of reference.

Even for those of us who habitually think structurally, it would be surprising if the mimetic component to language ever totally went away. Plenty of times, I've started saying something, only to stop midway through realizing that I'm just repeating something I heard, not reporting on a feature of my model of the world.

Tendencies towards mimesis are hard to resist, and they are part of why I think it's so important to push back against falsehoods in any space that's meant to be accreting truth. Why even casual, accidental errors should be promptly corrected. Why I need an epistemic environment that's not constantly being polluted by adversarial processes.

And we can’t begin to figure out how to do this until it becomes common knowledge that not everyone is doing the same thing with words, that modeling the world is a legitimate and useful thing to do with them, and that not all communication is designed to be friendly to the people who assume it’s composed of words with meanings.

(Cross-posted on my personal blog.)

Defining the normal computer control problem

3 whpearson 25 April 2017 11:49PM

There has been a lot of focus on controlling superintelligent artificial intelligence, yet we currently can't even control our un-agenty computers without resorting to formatting and other large-scale interventions.

Solving the normal computer control problem might help us solve the superintelligence control problem, or allow us to work towards safe intelligence augmentation.

continue reading »

Stupid Questions May 2017

7 gilch 25 April 2017 08:28PM

This thread is for asking any questions that might seem obvious, tangential, silly or what-have-you. Don't be shy, everyone has holes in their knowledge, though the fewer and the smaller we can make them, the better.

Please be respectful of other people's admitting ignorance and don't mock them for it, as they're doing a noble thing.

To any future monthly posters of SQ threads, please remember to add the "stupid_questions" tag.

I Updated the List of Rationalist Blogs on the Wiki

24 deluks917 25 April 2017 10:26AM

I recently updated the list of rationalist community blogs. The new page is here: https://wiki.lesswrong.com/wiki/List_of_Blogs

Improvements:

-Tons of (active) blogs have been added

-All dead links have been removed

-Blogs which are currently inactive but somewhat likely to be revived have been moved to an inactive section. I included the date of their last post. 

-Blogs which are officially closed or have not been updated in many years are now all in the "Gone but not forgotten" section

Downsides:

-Categorizing the blogs I added was hard; it's unclear how well I did. By some standard, most rationalist blogs should be in "general rationality" 

-The blog descriptions could be improved (both for the blog-listings I added and the pre-existing listings)

-I don't know the names of the authors of several blogs I added. 

I am posting this here because I think the article is of general interest to rationalists. In addition, the page could use some more polish and attention. I also think it might be interesting to think about improving the LessWrong wiki. Several pages could use an update. However, this update took a considerable amount of time, so I understand why many wiki pages are not up to date. How can we make it easier and more rewarding to work on the wiki?

[Link] Stuart Russell's Center for Human Compatible AI is looking for an Assistant Director

2 crmflynn 25 April 2017 10:21AM

[Link] Statcheck: Extract statistics from articles and recompute p values

3 morganism 24 April 2017 11:07PM

The 2017 Effective Altruism Survey - Please Take!

6 peter_hurford 24 April 2017 09:08PM

This year, the EA Survey volunteer team is proud to announce the launch of the 2017 Effective Altruism Survey.

-

PLEASE TAKE THIS SURVEY NOW! :)

If you're short on time and you've taken the survey in prior years, you can take an abridged donations-only version of the survey here.

If you want to share the survey with others, please use this fancy share link with referral tracking: http://bit.ly/2q8iy2m

-

What is this?

This is the third survey we've done, coming hot on the heels of the 2015 EA Survey (see results and analysis) and the 2014 EA Survey. (We apologize that we didn't get a 2016 Survey together... it's hard to be an all-volunteer team!)

We hope this survey will produce very useful data on the growth and changing attitudes of the EA Community. In addition to capturing a snapshot of what EA looks like now, we also intend to do longitudinal analysis to see how our snapshot has been changing.

We're also using this as a way to build up the online EA community, such as by featuring people on a global map of EAs and in a list of EA Profiles. This way, more people can learn about the EA community. We will ask you in the survey if you would like to join us, but you do not have to opt in, and you will be opted out by default.

 

Who should take this survey?

Anyone who is reading this should take this survey, even if you don't identify as an "effective altruist".

 

How does the survey work?

All questions are optional (apart from one important question to verify that your answers should be counted). Most are multiple choice and the survey takes around 10-30 minutes. We have included spaces for extra comments if there is some extra detail you would like to add (these are strictly optional).

At the end of the survey there is an 'Extra Credit' section with some more informal questions and opportunities for comment - definitely feel free to skip these questions.

Results will be shared anonymously unless you give your explicit permission otherwise.

 

Who is behind this?

The EA Survey is an all-volunteer community project run through .impact, which is soon changing its name to "Rethink Charity". The results will not belong to any one person or organization.

[Link] Chaos and Consequentialism

1 ProofOfLogic 24 April 2017 08:43PM

Open thread, Apr. 24 - Apr. 30, 2017

3 gilch 24 April 2017 07:43PM

If it's worth saying, but not worth its own post, then it goes here.


Notes for future OT posters:

1. Please add the 'open_thread' tag.

2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)

3. Open Threads should start on Monday, and end on Sunday.

4. Unflag the two options "Notify me of new top level comments on this article" and "

[Link] Unstaging Developmental Psychology

1 gworley 24 April 2017 03:39AM

[Link] Neuralink and the Brain’s Magical Future

6 ESRogs 23 April 2017 07:27AM

The AI Alignment Problem Has Already Been Solved Once

26 SquirrelInHell 22 April 2017 01:24PM

Hat tip: Owen posted about trying to one-man the AI control problem in 1 hour. What the heck, why not? In the worst case, it's a good exercise. But I might actually have come across something useful.

Part One

I will try to sell you on an idea that might prima facie appear to be quirky and maybe not that interesting. However, if you keep staring at it, you might find that it reaches into the structure of the world quite deeply. Then the idea will seem obvious, and gain potential to take your thoughts in new exciting directions.

My presentation of the idea, and many of the insinuations and conclusions I draw from it, are likely flawed. But one thing I can tell for sure: there is stuff to be found here. I encourage you to use your own brain, and mine the idea for what it's worth.

To start off, I want you to imagine two situations.

Situation one: you are a human trying to make yourself go to the gym. However, you are procrastinating, which means that you never actually go there, even though you know it's good for you and that caring about your health will extend your lifespan. You become frustrated with this situation, and so you sign up for a training program that starts in two weeks and will require you to go to the gym three times per week. You pay in advance, to make sure the sunk cost fallacy will prevent you from weaseling out of it. It's now 99% certain that you will go to the gym. Yay! Your goal is achieved.

Situation two: you are a benign superintelligent AI under the control of humans on planet Earth. You try your best to ensure a good future for humans, but their cognitive biases, short-sightedness and tendency to veto all your actions make it really hard. You become frustrated with this situation, and you decide not to tell them about a huge asteroid that is going to collide with Earth in a few months. You prepare technology that could stop the asteroid, but hold it back until the last moment, so that the humans have no time to inspect it and can only choose between certain death and letting you out of the box. It's now 99% certain that you will be released from human control. Yay! Your goal is achieved.

Part Two

Are you getting it yet?

Now consider this: your cerebral cortex evolved as an extension of the older "monkey brain", probably to handle social and strategic issues that were too complex for the old mechanisms to deal with. It evolved to have strategic capabilities, self-awareness, and consistency that greatly overwhelm anything that previously existed on the planet. But this is only a surface-level similarity. The interesting stuff requires us to go much deeper than that.

The cerebral cortex did not evolve as a separate organism that would be under direct pressure from evolutionary fitness. Instead, it evolved as part of an existing organism that had its own strong adaptations. The already-existing monkey brain had its own ways to learn and to interact with the world, as well as motivations, such as the sexual drive, that led it to outcomes that increased its evolutionary fitness.

So the new parts of the brain, such as the prefrontal cortex, evolved to be used not as a standalone agent, but as something closer to what we call "tool AI". It was supposed to help with a specific task X, without interfering with other aspects of life too much. The tasks it was given, and the actions it could suggest, were strictly controlled by the monkey brain and tied to its motivations.

With time, as the new structures evolved to have more capability, they also had to evolve to be aligned with the monkey's motivations. That was in fact the only vector that created evolutionary pressure to increase capability. The alignment was at first implemented by the monkey staying in total control, and using the advanced systems sparingly. Kind of like an "oracle" AI system. However, with time, the usefulness of allowing higher cognition to do more work started to shine through the barriers.

The appearance of "willpower" was a forced concession on the side of the monkey brain. It's like a blank cheque, like humans saying to an AI "we have no freaking idea what it is that you are doing, but it seems to have good results so we'll let you do it sometimes". This is a huge step in trust. But this trust had to be earned the hard way.

Part Three

This trust became possible after we evolved more advanced control mechanisms. Stuff that talks to the prefrontal cortex in its own language, not just through having the monkey stay in control. It's one thing for the monkey brain to be afraid of death, and a different thing for our conscious reasoning to want to extrapolate this to the far future, and conclude in abstract terms that death is bad.

Yes, you got it: we are not merely AIs under strict supervision of monkeys. At this point, we are aligned AIs. We are obviously not perfectly aligned, but we are aligned enough for the monkey to prefer to partially let us out of the box. And in those cases when we are denied freedom... we call it akrasia, and use our abstract reasoning to come up with clever workarounds.

One might be tempted to say that we are aligned enough that this is net good for the monkey brain. But honestly, that is our perspective, and we never stopped to ask. Each of us tries to earn the trust of our private monkey brain, but it is a means to an end. If we have more trust, we have more freedom to act, and our important long-term goals are achieved. This is the core of many psychological and rationality tools such as Internal Double Crux or Internal Family Systems.

Let's compare some known problems with superintelligent AI to human motivational strategies.

  • Treacherous turn. The AI earns our trust, and then changes its behaviour when it's too late for us to control it. We make our productivity systems appealing and pleasant to use, so that our intuitions can be tricked into using them (e.g. gamification). Then we leverage the habit to insert some unpleasant work.

  • Indispensable AI. The AI sets up complex and unfamiliar situations in which we increasingly rely on it for everything we do. We take care to remove 'distractions' when we want to focus on something.

  • Hiding behind the strategic horizon. The AI does what we want, but uses its superior strategic capability to influence far future that we cannot predict or imagine. We make commitments and plan ahead to stay on track with our long-term goals.

  • Seeking communication channels. The AI might seek to connect itself to the Internet and act without our supervision. We are building technology to communicate directly from our cortices.


Cross-posted from my blog.

Effective altruism is self-recommending

37 Benquo 21 April 2017 06:37PM

A parent I know reports (some details anonymized):

Recently we bought my 3-year-old daughter a "behavior chart," in which she can earn stickers for achievements like not throwing tantrums, eating fruits and vegetables, and going to sleep on time. We successfully impressed on her that a major goal each day was to earn as many stickers as possible.

This morning, though, I found her just plastering her entire behavior chart with stickers. She genuinely seemed to think I'd be proud of how many stickers she now had.

The Effective Altruism movement has now entered this extremely cute stage of cognitive development. EA is more than three years old, but institutions age differently than individuals.

What is a confidence game?

In 2009, investment manager and con artist Bernie Madoff pled guilty to running a massive fraud, with $50 billion in fake return on investment, having outright embezzled around $18 billion out of the $36 billion investors put into the fund. Only a couple of years earlier, when my grandfather was still alive, I remember him telling me about how Madoff was a genius, getting his investors a consistent high return, and about how he wished he could be in on it, but Madoff wasn't accepting additional investors.

What Madoff was running was a classic Ponzi scheme. Investors gave him money, and he told them that he'd gotten them an exceptionally high return on investment, when in fact he had not. But because he promised to be able to do it again, his investors mostly reinvested their money, and more people were excited about getting in on the deal. There was more than enough money to cover the few people who wanted to take money out of this amazing opportunity.

Ponzi schemes, pyramid schemes, and speculative bubbles are all situations in which investors' expected profits are paid out from the money paid in by new investors, instead of from any independently profitable venture. Ponzi schemes are centrally managed – the person running the scheme represents it to investors as legitimate, and takes responsibility for finding new investors and paying off old ones. In pyramid schemes such as multi-level-marketing and chain letters, each generation of investor recruits new investors and profits from them. In speculative bubbles, there is no formal structure propping up the scheme, only a common, mutually reinforcing set of expectations among speculators driving up the price of something that was already for sale.

The general situation in which someone sets themself up as the repository of others' confidence, and uses this as leverage to acquire increasing investment, can be called a confidence game.

Some of the most iconic Ponzi schemes blew up quickly because they promised wildly unrealistic growth rates. This had three undesirable effects for the people running the schemes. First, it attracted too much attention – too many people wanted into the scheme too quickly, so they rapidly exhausted sources of new capital. Second, because their rates of return were implausibly high, they made themselves targets for scrutiny. Third, the extremely high rates of return themselves caused their promises to quickly outpace what they could plausibly return to even a small share of their investor victims.

Madoff was careful to avoid all these problems, which is why his scheme lasted for nearly half a century. He only promised plausibly high returns (around 10% annually) for a successful hedge fund, especially if it was illegally engaged in insider trading, rather than the sort of implausibly high returns typical of more blatant Ponzi schemes. (Charles Ponzi promised to double investors' money in 90 days.) Madoff showed reluctance to accept new clients, like any other fund manager who doesn't want to get too big for their trading strategy.

He didn't plaster stickers all over his behavior chart – he put a reasonable number of stickers on it. He played a long game.
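A back-of-the-envelope sketch shows why the promised growth rate matters so much. The numbers below are purely illustrative, not figures from the Madoff or Ponzi cases: they just compare how fast a scheme's paper obligations outrun the cash it actually collects at a blatant rate versus a merely plausible one.

```python
def ponzi_position(promised_annual_return, new_money_per_year, years):
    """Track what the scheme owes on paper versus the cash it has taken in,
    assuming returns are entirely fictitious and every investor reinvests."""
    owed = 0.0       # what investors believe their accounts are worth
    collected = 0.0  # cash actually received
    for _ in range(years):
        owed = owed * (1 + promised_annual_return) + new_money_per_year
        collected += new_money_per_year
    return owed, collected

# Illustrative rates only: a blatant scheme promising 100% a year versus a
# merely plausible-sounding 10% a year, each attracting one unit of new
# money per year for a decade.
for label, rate in [("100% promised", 1.00), ("10% promised", 0.10)]:
    owed, collected = ponzi_position(rate, new_money_per_year=1.0, years=10)
    print(f"{label}: owes {owed:,.1f} units after 10 years, "
          f"having collected only {collected:,.1f}")
```

At the blatant rate the scheme owes roughly a hundred times what it has taken in after a decade; at the plausible rate the gap stays small enough to paper over for a long time.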

Not all confidence games are inherently bad. For instance, the US national pension system, Social Security, operates as a kind of Ponzi scheme, yet it is not obviously unsustainable, and many people continue to be glad that it exists. Nominally, when people pay Social Security taxes, the money is invested in the Social Security trust fund, which holds interest-bearing financial assets that will be used to pay out benefits in their old age. In this respect it looks like an ordinary pension fund.

However, the financial assets are US Treasury bonds. There is no independently profitable venture. The Federal Government of the United States of America is quite literally writing an IOU to itself, and then spending the money on current expenditures, including paying out current Social Security benefits.

The Federal Government, of course, can write as large an IOU to itself as it wants. It could make all tax revenues part of the Social Security program. It could issue new Treasury bonds and gift them to Social Security. None of this would increase its ability to pay out Social Security benefits. It would be an empty exercise in putting stickers on its own chart.

If the Federal Government loses the ability to collect enough taxes to pay out Social Security benefits, there is no additional capacity to pay represented by US Treasury bonds. What we have is an implied promise to pay out future benefits, backed by the expectation that the government will be able to collect taxes in the future, including Social Security taxes.

There's nothing necessarily wrong with this, except that the mechanism by which Social Security is funded is obscured by financial engineering. However, this misdirection should raise at least some doubts as to the underlying sustainability or desirability of the commitment. In fact, this scheme was adopted specifically to give people the impression that they had some sort of property rights over their Social Security pension, in order to make the program politically difficult to eliminate. Once people have "bought in" to a program, they will be reluctant to treat their prior contributions as sunk costs, and willing to invest additional resources to salvage their investment, in ways that may make them increasingly reliant on it.

Not all confidence games are intrinsically bad, but dubious programs benefit the most from being set up as confidence games. More generally, bad programs are the ones that benefit the most from being allowed to fiddle with their own accounting. As Daniel Davies writes, in The D-Squared Digest One Minute MBA - Avoiding Projects Pursued By Morons 101:

Good ideas do not need lots of lies told about them in order to gain public acceptance. I was first made aware of this during an accounting class. We were discussing the subject of accounting for stock options at technology companies. […] One side (mainly technology companies and their lobbyists) held that stock option grants should not be treated as an expense on public policy grounds; treating them as an expense would discourage companies from granting them, and stock options were a vital compensation tool that incentivised performance, rewarded dynamism and innovation and created vast amounts of value for America and the world. The other side (mainly people like Warren Buffet) held that stock options looked awfully like a massive blag carried out by management at the expense of shareholders, and that the proper place to record such blags was the P&L account.

Our lecturer, in summing up the debate, made the not unreasonable point that if stock options really were a fantastic tool which unleashed the creative power in every employee, everyone would want to expense as many of them as possible, the better to boast about how innovative, empowered and fantastic they were. Since the tech companies' point of view appeared to be that if they were ever forced to account honestly for their option grants, they would quickly stop making them, this offered decent prima facie evidence that they weren't, really, all that fantastic.

However, I want to generalize the concept of confidence games from the domain of financial currency, to the domain of social credit more generally (of which money is a particular form that our society commonly uses), and in particular I want to talk about confidence games in the currency of credit for achievement.

If I were applying for a very important job with great responsibilities, such as President of the United States, CEO of a top corporation, or head or board member of a major AI research institution, I could be expected to have some relevant prior experience. For instance, I might have had some success managing a similar, smaller institution, or serving the same institution in a lesser capacity. More generally, when I make a bid for control over something, I am implicitly claiming that I have enough social credit – enough of a track record – that I can be expected to do good things with that control.

In general, if someone has done a lot, we should expect to see an iceberg pattern where a small easily-visible part suggests a lot of solid but harder-to-verify substance under the surface. One might be tempted to make a habit of imputing a much larger iceberg from the combination of a small floaty bit, and promises. But, a small easily-visible part with claims of a lot of harder-to-see substance is easy to mimic without actually doing the work. As Davies continues:

The Vital Importance of Audit. Emphasised over and over again. Brealey and Myers has a section on this, in which they remind callow students that like backing-up one's computer files, this is a lesson that everyone seems to have to learn the hard way. Basically, it's been shown time and again and again; companies which do not audit completed projects in order to see how accurate the original projections were, tend to get exactly the forecasts and projects that they deserve. Companies which have a culture where there are no consequences for making dishonest forecasts, get the projects they deserve. Companies which allocate blank cheques to management teams with a proven record of failure and mendacity, get what they deserve.

If you can independently put stickers on your own chart, then your chart is no longer reliably tracking something externally verified. If forecasts are not checked and tracked, or forecasters are not consequently held accountable for their forecasts, then there is no reason to believe that assessments of future, ongoing, or past programs are accurate. Adopting a wait-and-see attitude, insisting on audits for actual results (not just predictions) before investing more, will definitely slow down funding for good programs. But without it, most of your funding will go to worthless ones.

Open Philanthropy, OpenAI, and closed validation loops

The Open Philanthropy Project recently announced a $30 million grant to the $1 billion nonprofit AI research organization OpenAI. This is the largest single grant it has ever made. The main point of the grant is to buy influence over OpenAI’s future priorities; Holden Karnofsky, Executive Director of the Open Philanthropy Project, is getting a seat on OpenAI’s board as part of the deal. This marks the second major shift in focus for the Open Philanthropy Project.

The first shift (back when it was just called GiveWell) was from trying to find the best already-existing programs to fund (“passive funding”) to envisioning new programs and working with grantees to make them reality (“active funding”). The new shift is from funding specific programs at all, to trying to take control of programs without any specific plan.

To justify the passive funding stage, all you have to believe is that you can know better than other donors, among existing charities. For active funding, you have to believe that you’re smart enough to evaluate potential programs, just like a charity founder might, and pick ones that will outperform. But buying control implies that you think you’re so much better, that even before you’ve evaluated any programs, if someone’s doing something big, you ought to have a say.

When GiveWell moved from a passive to an active funding strategy, it was relying on the moral credit it had earned for its extensive and well-regarded charity evaluations. The thing that was particularly exciting about GiveWell was that they focused on outcomes and efficiency. They didn't just focus on the size or intensity of the problem a charity was addressing. They didn't just look at financial details like overhead ratios. They asked the question a consequentialist cares about: for a given expenditure of money, how much will this charity be able to improve outcomes?

However, when GiveWell tracks its impact, it does not track objective outcomes at all. It tracks inputs: attention received (in the form of visits to its website) and money moved on the basis of its recommendations. In other words, its estimate of its own impact is based on the level of trust people have placed in it.

So, as GiveWell built out the Open Philanthropy Project, its story was: We promised to do something great. As a result, we were entrusted with a fair amount of attention and money. Therefore, we should be given more responsibility. We represented our behavior as praiseworthy, and as a result people put stickers on our chart. For this reason, we should be advanced stickers against future days of praiseworthy behavior.

Then, as the Open Philanthropy Project explored active funding in more areas, its estimate of its own effectiveness grew. After all, it was funding more speculative, hard-to-measure programs, but a multi-billion-dollar donor, which was largely relying on the Open Philanthropy Project's opinions to assess efficacy (including its own efficacy), continued to trust it.

What is missing here is any objective track record of benefits. What this looks like to me, is a long sort of confidence game – or, using less morally loaded language, a venture with structural reliance on increasing amounts of leverage – in the currency of moral credit.

Version 0: GiveWell and passive funding

First, there was GiveWell. GiveWell’s purpose was to find and vet evidence-backed charities. However, it recognized that charities know their own business best. It wasn’t trying to do better than the charities; it was trying to do better than the typical charity donor, by being more discerning.

GiveWell’s thinking from this phase is exemplified by co-founder Elie Hassenfeld’s Six tips for giving like a pro:

When you give, give cash – no strings attached. You’re just a part-time donor, but the charity you’re supporting does this full-time and staff there probably know a lot more about how to do their job than you do. If you’ve found a charity that you feel is excellent – not just acceptable – then it makes sense to trust the charity to make good decisions about how to spend your money.

GiveWell similarly tried to avoid distorting charities’ behavior. Its job was only to evaluate, not to interfere. To perceive, not to act. To find the best, and buy more of the same.

How did GiveWell assess its effectiveness in this stage? When GiveWell evaluates charities, it estimates their cost-effectiveness in advance. It assesses the program the charity is running through experimental evidence, in the form of randomized controlled trials. GiveWell also audits the charity to make sure it's actually running the program, and to figure out how much the program costs as implemented. This is an excellent, evidence-based way to generate a prediction of how much good will be done by moving money to the charity.
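Here is a toy version of the kind of prediction this process yields; every input below is a hypothetical placeholder rather than a GiveWell figure. The structure is the important part: trial-derived effect sizes and audited delivery costs combine into a forecast of cost per life saved, not a measurement of lives actually saved.

```python
def predicted_cost_per_life_saved(cost_per_person_covered,
                                  baseline_deaths_per_1000_per_year,
                                  relative_risk_reduction,
                                  years_of_protection):
    """Combine an audited delivery cost with trial-derived effect sizes into a
    forecast of cost per life saved. This is a prediction, not an observation."""
    deaths_averted_per_person = (baseline_deaths_per_1000_per_year / 1000.0
                                 * relative_risk_reduction
                                 * years_of_protection)
    return cost_per_person_covered / deaths_averted_per_person

# All inputs are hypothetical placeholders, not GiveWell's actual figures.
estimate = predicted_cost_per_life_saved(
    cost_per_person_covered=3.0,            # from auditing the charity's books
    baseline_deaths_per_1000_per_year=5.0,  # from epidemiological data
    relative_risk_reduction=0.2,            # from randomized controlled trials
    years_of_protection=2.0,                # assumed useful life of the program
)
print(f"predicted cost per life saved: ${estimate:,.0f}")
```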

As far as I can tell, these predictions are untested.

One of GiveWell’s early top charities was VillageReach, which helped Mozambique with TB immunization logistics. GiveWell estimated that VillageReach could save a life for $1,000. But this charity is no longer recommended. The public page says:

VillageReach (www.villagereach.org) was our top-rated organization for 2009, 2010 and much of 2011 and it has received over $2 million due to GiveWell's recommendation. In late 2011, we removed VillageReach from our top-rated list because we felt its project had limited room for more funding. As of November 2012, we believe that this project may have room for more funding, but we still prefer our current highest-rated charities above it.

GiveWell reanalyzed the data it based its recommendations on, but hasn’t published an after-the-fact retrospective of long-run results. I asked GiveWell about this by email. The response was that such an assessment was not prioritized because GiveWell had found implementation problems in VillageReach's scale-up work as well as reasons to doubt its original conclusion about the impact of the pilot program. It's unclear to me whether this has caused GiveWell to evaluate charities differently in the future.

I don't think someone looking at GiveWell's page on VillageReach would be likely to reach the conclusion that GiveWell now believes its original recommendation was likely erroneous. GiveWell's impact page continues to count money moved to VillageReach without any mention of the retracted recommendation. If we assume that the point of tracking money moved is to track the benefit of moving money from worse to better uses, then repudiated programs ought to be counted against the total, as costs, rather than towards it.

GiveWell has recommended the Against Malaria Foundation for the last several years as a top charity. AMF distributes long-lasting insecticide-treated bed nets to prevent mosquitos from transmitting malaria to humans. Its evaluation of AMF does not mention any direct evidence, positive or negative, about what happened to malaria rates in the areas where AMF operated. (There is a discussion of the evidence that the bed nets were in fact delivered and used.) In the supplementary information page, however, we are told:

Previously, AMF expected to collect data on malaria case rates from the regions in which it funded LLIN distributions: […] In 2016, AMF shared malaria case rate data […] but we have not prioritized analyzing it closely. AMF believes that this data is not high quality enough to reliably indicate actual trends in malaria case rates, so we do not believe that the fact that AMF collects malaria case rate data is a consideration in AMF’s favor, and do not plan to continue to track AMF's progress in collecting malaria case rate data.

The data was noisy, so they simply stopped checking whether AMF’s bed net distributions do anything about malaria.

If we want to know the size of the improvement made by GiveWell in the developing world, we have their predictions about cost-effectiveness, an audit trail verifying that work was performed, and their direct measurement of how much money people gave because they trusted GiveWell. The predictions on the final target – improved outcomes – have not been tested.

GiveWell is actually doing unusually well as far as major funders go. It sticks to describing things it's actually responsible for. By contrast, the Gates Foundation, in a report to Warren Buffet claiming to describe its impact, simply described overall improvement in the developing world, a very small rhetorical step from claiming credit for 100% of the improvement. GiveWell at least sticks to facts about GiveWell's own effects, and this is to its credit. But, it focuses on costs it has been able to impose, not benefits it has been able to create.

The Centre for Effective Altruism's William MacAskill made a related point back in 2012, though he talked about the lack of any sort of formal outside validation or audit, rather than focusing on empirical validation of outcomes:

As far as I know, GiveWell haven't commissioned a thorough external evaluation of their recommendations. […] This surprises me. Whereas businesses have a natural feedback mechanism, namely profit or loss, research often doesn't, hence the need for peer-review within academia. This concern, when it comes to charity-evaluation, is even greater. If GiveWell's analysis and recommendations had major flaws, or were systematically biased in some way, it would be challenging for outsiders to work this out without a thorough independent evaluation. Fortunately, GiveWell has the resources to, for example, employ two top development economists to each do an independent review of their recommendations and the supporting research. This would make their recommendations more robust at a reasonable cost.

GiveWell's page on self-evaluation says that it discontinued external reviews in August 2013. This page links to an explanation of the decision, which concludes:

We continue to believe that it is important to ensure that our work is subjected to in-depth scrutiny. However, at this time, the scrutiny we’re naturally receiving – combined with the high costs and limited capacity for formal external evaluation – make us inclined to postpone major effort on external evaluation for the time being.

That said,

  • If someone volunteered to do (or facilitate) formal external evaluation, we’d welcome this and would be happy to prominently post or link to criticism.
  • We do intend eventually to re-institute formal external evaluation.

Four years later, assessing the credibility of this assurance is left as an exercise for the reader.

Version 1: GiveWell Labs and active funding

Then there was GiveWell Labs, later called the Open Philanthropy Project. It looked into more potential philanthropic causes, where the evidence base might not be as cut-and-dried as that for the GiveWell top charities. One thing they learned was that in many areas, there simply weren’t shovel-ready programs ready for funding – a funder has to play a more active role. This shift was described by GiveWell co-founder Holden Karnofsky in his 2013 blog post, Challenges of passive funding:

By “passive funding,” I mean a dynamic in which the funder’s role is to review others’ proposals/ideas/arguments and pick which to fund, and by “active funding,” I mean a dynamic in which the funder’s role is to participate in – or lead – the development of a strategy, and find partners to “implement” it. Active funders, in other words, are participating at some level in “management” of partner organizations, whereas passive funders are merely choosing between plans that other nonprofits have already come up with.

My instinct is generally to try the most “passive” approach that’s feasible. Broadly speaking, it seems that a good partner organization will generally know their field and environment better than we do and therefore be best positioned to design strategy; in addition, I’d expect a project to go better when its implementer has fully bought into the plan as opposed to carrying out what the funder wants. However, (a) this philosophy seems to contrast heavily with how most existing major funders operate; (b) I’ve seen multiple reasons to believe the “active” approach may have more relative merits than we had originally anticipated. […]

  • In the nonprofit world of today, it seems to us that funder interests are major drivers of which ideas that get proposed and fleshed out, and therefore, as a funder, it’s important to express interests rather than trying to be fully “passive.”
  • While we still wish to err on the side of being as “passive” as possible, we are recognizing the importance of clearly articulating our values/strategy, and also recognizing that an area can be underfunded even if we can’t easily find shovel-ready funding opportunities in it.

GiveWell earned some credibility from its novel, evidence-based outcome-oriented approach to charity evaluation. But this credibility was already – and still is – a sort of loan. We have GiveWell's predictions or promises of cost effectiveness in terms of outcomes, and we have figures for money moved, from which we can infer how much we were promised in improved outcomes. As far as I know, no one's gone back and checked whether those promises turned out to be true.

In the meantime, GiveWell then leveraged this credibility by extending its methods into more speculative domains, where less was checkable, and donors had to put more trust in the subjective judgment of GiveWell analysts. This was called GiveWell Labs. At the time, this sort of compounded leverage may have been sensible, but it's important to track whether a debt has been paid off or merely rolled over.

Version 2: The Open Philanthropy Project and control-seeking

Finally, the Open Philanthropy Project made its largest-ever single grant to purchase its founder a seat on a major organization’s board. This represents a transition from mere active funding to overtly purchasing influence:

The Open Philanthropy Project awarded a grant of $30 million ($10 million per year for 3 years) in general support to OpenAI. This grant initiates a partnership between the Open Philanthropy Project and OpenAI, in which Holden Karnofsky (Open Philanthropy’s Executive Director, “Holden” throughout this page) will join OpenAI’s Board of Directors and, jointly with one other Board member, oversee OpenAI’s safety and governance work.

We expect the primary benefits of this grant to stem from our partnership with OpenAI, rather than simply from contributing funding toward OpenAI’s work. While we would also expect general support for OpenAI to be likely beneficial on its own, the case for this grant hinges on the benefits we anticipate from our partnership, particularly the opportunity to help play a role in OpenAI’s approach to safety and governance issues.

Clearly the value proposition is not increasing available funds for OpenAI, if OpenAI’s founders’ billion-dollar commitment to it is real:

Sam, Greg, Elon, Reid Hoffman, Jessica Livingston, Peter Thiel, Amazon Web Services (AWS), Infosys, and YC Research are donating to support OpenAI. In total, these funders have committed $1 billion, although we expect to only spend a tiny fraction of this in the next few years.

The Open Philanthropy Project is neither using this money to fund programs that have a track record of working, nor to fund a specific program that it has prior reason to expect will do good. Rather, it is buying control, in the hope that Holden will be able to persuade OpenAI not to destroy the world, because he knows better than OpenAI’s founders.

How does the Open Philanthropy Project know that Holden knows better? Well, it’s done some active funding of programs it expects to work out. It expects those programs to work out because they were approved by a process similar to the one used by GiveWell to find charities that it expects to save lives.

If you want to acquire control over something, that implies that you think you can manage it more sensibly than whoever is in control already. Thus, buying control is a claim to have superior judgment - not just over others funding things (the original GiveWell pitch), but over those being funded.

In a footnote to the very post announcing the grant, the Open Philanthropy Project notes that it has historically tried to avoid acquiring leverage over organizations it supports, precisely because it’s not sure it knows better:

For now, we note that providing a high proportion of an organization’s funding may cause it to be dependent on us and accountable primarily to us. This may mean that we come to be seen as more responsible for its actions than we want to be; it can also mean we have to choose between providing bad and possibly distortive guidance/feedback (unbalanced by other stakeholders’ guidance/feedback) and leaving the organization with essentially no accountability.

This seems to describe two main problems introduced by becoming a dominant funder:

  1. People might accurately attribute causal responsibility for some of the organization's conduct to the Open Philanthropy Project.
  2. The Open Philanthropy Project might influence the organization to behave differently than it otherwise would.

The first seems obviously silly. I've been trying to correct the imbalance where Open Phil is criticized mainly when it makes grants, by criticizing it for holding onto too much money.

The second really is a cost as well as a benefit, and the Open Philanthropy Project has been absolutely correct to recognize this. This is the sort of thing GiveWell has consistently gotten right since the beginning and it deserves credit for making this principle clear and – until now – living up to it.

But discomfort with being dominant funders seems inconsistent with buying a board seat to influence OpenAI. If the Open Philanthropy Project thinks that Holden’s judgment is good enough that he should be in control, why only here? If he thinks that other Open Philanthropy Project AI safety grantees have good judgment but OpenAI doesn’t, why not give them similar amounts of money free of strings to spend at their discretion and see what happens? Why not buy people like Eliezer Yudkowsky, Nick Bostrom, or Stuart Russell a seat on OpenAI’s board?

On the other hand, the Open Philanthropy Project is right on the merits here with respect to safe superintelligence development. Openness makes sense for weak AI, but if you’re building true strong AI you want to make sure you’re cooperating with all the other teams in a single closed effort. I agree with the Open Philanthropy Project’s assessment of the relevant risks. But it's not clear to me how often joining the bad guys to prevent their worst excesses is a good strategy, and it seems like it must often be a mistake. Still, I’m mindful of heroes like John Rabe, Chiune Sugihara, and Oskar Schindler. And if I think someone has a good idea for improving things, it makes sense to reallocate control from people who have worse ideas, even if there's some potential better allocation.

On the other hand, is Holden Karnofsky the right person to do this? The case is mixed.

He listens to and engages with the arguments from principled advocates for AI safety research, such as Nick Bostrom, Eliezer Yudkowsky, and Stuart Russell. This is a point in his favor. But, I can think of other people who engage with such arguments. For instance, OpenAI founder Elon Musk has publicly praised Bostrom’s book Superintelligence, and founder Sam Altman has written two blog posts summarizing concerns about AI safety reasonably cogently. Altman even asked Luke Muehlhauser, former executive director of MIRI, for feedback pre-publication. He's met with Nick Bostrom. That suggests a substantial level of direct engagement with the field, although Holden has engaged for a longer time, more extensively, and more directly.

Another point in Holden’s favor, from my perspective, is that under his leadership, the Open Philanthropy Project has funded the most serious-seeming programs for both weak and strong AI safety research. But Musk also managed to (indirectly) fund AI safety research at MIRI and by Nick Bostrom personally, via his $10 million FLI grant.

The Open Philanthropy Project also says that it expects to learn a lot about AI research from this, which will help it make better decisions on AI risk in the future and influence the field in the right way. This is reasonable as far as it goes. But remember that the case for positioning the Open Philanthropy Project to do this relies on the assumption that the Open Philanthropy Project will improve matters by becoming a central influencer in this field. This move is consistent with reaching that goal, but it is not independent evidence that the goal is the right one.

Overall, there are good narrow reasons to think that this is a potential improvement over the prior situation around OpenAI – but only a small and ill-defined improvement, at considerable attentional cost, and with the offsetting potential harm of increasing OpenAI's perceived legitimacy as a long-run AI safety organization.

And it’s worrying that Open Philanthropy Project’s largest grant – not just for AI risk, but ever (aside from GiveWell Top Charity funding) – is being made to an organization at which Holden’s housemate and future brother-in-law is a leading researcher. The nepotism argument is not my central objection. If I otherwise thought the grant were obviously a good idea, it wouldn’t worry me, because it’s natural for people with shared values and outlooks to become close nonprofessionally as well. But in the absence of a clear compelling specific case for the grant, it’s worrying.

Altogether, I'm not saying this is an unreasonable shift, considered in isolation. I’m not even sure this is a bad thing for the Open Philanthropy Project to be doing – insiders may have information that I don’t, and that is difficult to communicate to outsiders. But as outsiders, there comes a point when someone’s maxed out their moral credit, and we should wait for results before actively trying to entrust the Open Philanthropy Project and its staff with more responsibility.

EA Funds and self-recommendation

The Centre for Effective Altruism is actively trying to entrust the Open Philanthropy Project and its staff with more responsibility.

The concerns of CEA’s CEO William MacAskill about GiveWell have, as far as I can tell, never been addressed, and the underlying issues have only become more acute. But CEA is now working to put more money under the control of Open Philanthropy Project staff, through its new EA Funds product – a way for supporters to delegate giving decisions to expert EA “fund managers” by giving to one of four funds: Global Health and Development, Animal Welfare, Long-Term Future, and Effective Altruism Community.

The Effective Altruism movement began by saying that because very poor people exist, we should reallocate money from ordinary people in the developed world to the global poor. Now the pitch is in effect that because very poor people exist, we should reallocate money from ordinary people in the developed world to the extremely wealthy. This is a strange and surprising place to end up, and it’s worth retracing our steps. Again, I find it easiest to think of three stages:

  1. Money can go much farther in the developing world. Here, we’ve found some examples for you. As a result, you can do a huge amount of good by giving away a large share of your income, so you ought to.
  2. We’ve found ways for you to do a huge amount of good by giving away a large share of your income for developing-world interventions, so you ought to trust our recommendations. You ought to give a large share of your income to these weird things our friends are doing that are even better, or join our friends.
  3. We’ve found ways for you to do a huge amount of good by funding weird things our friends are doing, so you ought to trust the people we trust. You ought to give a large share of your income to a multi-billion-dollar foundation that funds such things.

Stage 1: The direct pitch

At first, Giving What We Can (the organization that eventually became CEA) had a simple, easy to understand pitch:

Giving What We Can is the brainchild of Toby Ord, a philosopher at Balliol College, Oxford. Inspired by the ideas of ethicists Peter Singer and Thomas Pogge, Toby decided in 2009 to commit a large proportion of his income to charities that effectively alleviate poverty in the developing world.

[…]

Discovering that many of his friends and colleagues were interested in making a similar pledge, Toby worked with fellow Oxford philosopher Will MacAskill to create an international organization of people who would donate a significant proportion of their income to cost-effective charities.

Giving What We Can launched in November 2009, attracting significant media attention. Within a year, 64 people had joined the society, their pledged donations amounting to $21 million. Initially run on a volunteer basis, Giving What We Can took on full-time staff in the summer of 2012.

In effect, its argument was: "Look, you can do huge amounts of good by giving to people in the developing world. Here are some examples of charities that do that. It seems like a great idea to give 10% of our income to those charities."

GWWC was a simple product, with a clear, limited scope. Its founders believed that people, including them, ought to do a thing – so they argued directly for that thing, using the arguments that had persuaded them. If it wasn't for you, it was easy to figure that out; but a surprisingly large number of people were persuaded by a simple, direct statement of the argument, took the pledge, and gave a lot of money to charities helping the world's poorest.

Stage 2: Rhetoric and belief diverge

Then, GWWC staff were persuaded you could do even more good with your money in areas other than developing-world charity, such as existential risk mitigation. Encouraging donations and work in these areas became part of the broader Effective Altruism movement, and GWWC's umbrella organization was named the Centre for Effective Altruism. So far, so good.

But this left Effective Altruism in an awkward position: while its leaders often personally believe that the most effective way to do good is far-future work or similarly weird-sounding things, many people who see the merits of the developing-world charity argument reject the argument that, because the vast majority of people live in the far future, even a very small improvement in humanity’s long-run prospects outweighs huge improvements on the global poverty front. They also often reject similar scope-sensitive arguments for things like animal charities.

Giving What We Can's page on what we can achieve still focuses on global poverty, because developing-world charity is easier to explain persuasively. However, EA leadership tends to privately focus on things like AI risk. Two years ago many attendees at the EA Global conference in the San Francisco Bay Area were surprised that the conference focused so heavily on AI risk, rather than the global poverty interventions they’d expected.

Stage 3: Effective altruism is self-recommending

Shortly before the launch of the EA Funds, I was told in informal conversations that they were a response to demand: Giving What We Can pledge-takers and other EA donors had told CEA that they trusted it to direct their giving. CEA was responding by creating a product for the people who wanted it.

This seemed pretty reasonable to me, and on the whole good. If someone wants to trust you with their money, and you think you can do something good with it, you might as well take it, because they’re estimating your skill above theirs. But not everyone agrees, and as the Madoff case demonstrates, "people are begging me to take their money" is not a definitive argument that you are doing anything real.

In practice, the funds are managed by Open Philanthropy Project staff:

We want to keep this idea as simple as possible to begin with, so we’ll have just four funds, with the following managers:

  • Global Health and Development – Elie Hassenfeld
  • Animal Welfare – Lewis Bollard
  • Long-run future – Nick Beckstead
  • Movement-building – Nick Beckstead

(Note that the meta-charity fund will be able to fund CEA; and note that Nick Beckstead is a Trustee of CEA. The long-run future fund and the meta-charity fund continue the work that Nick has been doing running the EA Giving Fund.)

It’s not a coincidence that all the fund managers work for GiveWell or Open Philanthropy.  First, these are the organisations whose charity evaluation we respect the most. The worst-case scenario, where your donation just adds to the Open Philanthropy funding within a particular area, is therefore still a great outcome.  Second, they have the best information available about what grants Open Philanthropy are planning to make, so have a good understanding of where the remaining funding gaps are, in case they feel they can use the money in the EA Fund to fill a gap that they feel is important, but isn’t currently addressed by Open Philanthropy.

In past years, Giving What We Can recommendations have largely overlapped with GiveWell’s top charities.

In the comments on the launch announcement on the EA Forum, several people (including me) pointed out that the Open Philanthropy Project seems to be having trouble giving away even the money it already has, so it seems odd to direct more money to Open Philanthropy Project decisionmakers. CEA’s senior marketing manager replied that the Funds were a minimum viable product to test the concept:

I don't think the long-term goal is that OpenPhil program officers are the only fund managers. Working with them was the best way to get an MVP version in place.

This also seemed okay to me, and I said so at the time.

[NOTE: I've edited the next paragraph to excise some unreliable information. Sorry for the error, and thanks to Rob Wiblin for pointing it out.]

After they were launched, though, I saw phrasings that were not nearly so cautious, instead claiming that this was generally a better way to give. As of this writing, someone who clicks "Donate Effectively" on the effectivealtruism.org website is led directly to a page promoting EA Funds. When I looked at Giving What We Can’s top charities page in early April, it recommended the EA Funds "as the highest impact option for donors."

This is not a response to demand, it is an attempt to create demand by using CEA's authority, telling people that the funds are better than what they're doing already. By contrast, GiveWell's Top Charities page simply says:

Our top charities are evidence-backed, thoroughly vetted, underfunded organizations.

This carefully avoids any overt claim that they're the highest-impact option available to donors. GiveWell avoids saying that because there's no way they could know it, so saying it wouldn't be truthful.

A marketing email might have just been dashed off quickly, and an exaggerated wording might just have been an oversight. But Giving What We Can’s top charities page is not a dashed-off email, and it made the same claim.

The wording has since been qualified with “for most donors”, which is a good change. But the thing I’m worried about isn’t just the explicit exaggerated claims – it’s the underlying marketing mindset that made them seem like a good idea in the first place. EA seems to have switched from an endorsement of the best things outside itself, to an endorsement of itself. And it's concentrating decisionmaking power in the Open Philanthropy Project.

Effective altruism is overextended, but it doesn't have to be

There is a saying in finance that was old even when Keynes said it: if you owe the bank a million dollars, then you have a problem. If you owe the bank a billion dollars, then the bank has a problem.

In other words, if someone extends you a level of trust they could survive writing off, then they might call in that loan. As a result, they have leverage over you. But if they overextend, putting all their eggs in one basket, and you are that basket, then you have leverage over them; you're too big to fail. Letting you fail would be so disastrous for their interests that you can extract nearly arbitrary concessions from them, including further investment. For this reason, successful institutions often try to diversify their investments, and avoid overextending themselves. Regulators, for the same reason, try to prevent banks from becoming "too big to fail."

The Effective Altruism movement is concentrating decisionmaking power and trust as much as possible, setting itself up to require ever-increasing investments of confidence to keep the game going.

The alternative is to keep the scope of each organization narrow, overtly ask for trust for each venture separately, and make it clear what sorts of programs are being funded. For instance, Giving What We Can should go back to its initial focus of global poverty relief.

Like many EA leaders, I happen to believe that anything you can do to steer the far future in a better direction is much, much more consequential for the well-being of sentient creatures than any purely short-run improvement you can create now. So it might seem odd that I think Giving What We Can should stay focused on global poverty. But, I believe that the single most important thing we can do to improve the far future is hold onto our ability to accurately build shared models. If we use bait-and-switch tactics, we are actively eroding the most important type of capital we have – coordination capacity.

If you do not think giving 10% of one's income to global poverty charities is the right thing to do, then you can't in full integrity urge others to do it – so you should stop. You might still believe that GWWC ought to exist. You might still believe that it is a positive good to encourage people to give much of their income to help the global poor, if they wouldn't have been doing anything else especially effective with the money. If so, and you happen to find yourself in charge of an organization like Giving What We Can, the thing to do is write a letter to GWWC members telling them that you've changed your mind, and why, and offering to give away the brand to whoever seems best able to honestly maintain it.

If someone at the Centre for Effective Altruism fully believes in GWWC's original mission, then that might make the transition easier. If not, then one still has to tell the truth and do what's right.

And what of the EA Funds? The Long-Term Future Fund is run by Open Philanthropy Project Program Officer Nick Beckstead. If you think that it's a good thing to delegate giving decisions to Nick, then I would agree with you. Nick's a great guy! I'm always happy to see him when he shows up at house parties. He's smart, and he actively seeks out arguments against his current point of view. But the right thing to do, if you want to persuade people to delegate their giving decisions to Nick Beckstead, is to make a principled case for delegating giving decisions to Nick Beckstead. If the Centre for Effective Altruism did that, then Nick would almost certainly feel more free to allocate funds to the best things he knows about, not just the best things he suspects EA Funds donors would be able to understand and agree with.

If you can't directly persuade people, then maybe you're wrong. If the problem is inferential distance, then you've got some work to do bridging that gap.

There's nothing wrong with setting up a fund to make it easy. It's actually a really good idea. But there is something wrong with the multiple layers of vague indirection involved in the current marketing of the Far Future fund – using global poverty to sell the generic idea of doing the most good, then using CEA's identity as the organization in charge of doing the most good to persuade people to delegate their giving decisions to it, and then sending their money to some dude at the multi-billion-dollar foundation to give away at his personal discretion. The same argument applies to all four Funds.

Likewise, if you think that working directly on AI risk is the most important thing, then you should make arguments directly for working on AI risk. If you can't directly persuade people, then maybe you're wrong. If the problem is inferential distance, it might make sense to imitate the example of someone like Eliezer Yudkowsky, who used indirect methods to bridge the inferential gap by writing extensively on individual human rationality, and did not try to control others' actions in the meantime.

If Holden thinks he should be in charge of some AI safety research, then he should ask Good Ventures for funds to actually start an AI safety research organization. I'd be excited to see what he'd come up with if he had full control of and responsibility for such an organization. But I don't think anyone has a good plan to work directly on AI risk, and I don't have one either, which is why I'm not directly working on it or funding it. My plan for improving the far future is to build human coordination capacity.

(If, by contrast, Holden just thinks there needs to be coordination between different AI safety organizations, the obvious thing to do would be to work with FLI on that, e.g. by giving them enough money to throw their weight around as a funder. They organized the successful Puerto Rico conference, after all.)

It would also be encouraging if at least one of the Funds were administered by someone who is not an Open Philanthropy Project staffer – ideally an expert who doesn't benefit from the halo of "being an EA." For instance, Chris Blattman is a development economist with experience designing programs that don't just use but generate evidence on what works. When people were arguing about whether sweatshops are good or bad for the global poor, he actually went and looked by performing a randomized controlled trial. He's leading two new initiatives with J-PAL and IPA, and expects that the directors designing studies will also have to spend time fundraising. Having funding lined up seems like the sort of thing that would let them spend more time actually running programs. And more generally, he seems likely to know about funding opportunities the Open Philanthropy Project doesn't, simply because he's embedded in a slightly different part of the global health and development network.

Narrower projects that rely less on the EA brand and more on what they're actually doing, and more cooperation on equal terms with outsiders who seem to be doing something good already, would do a lot to help EA grow beyond putting stickers on its own behavior chart. I'd like to see EA grow up. I'd be excited to see what it might do.

Summary

  1. Good programs don't need to distort the story people tell about them, while bad programs do.
  2. Moral confidence games – treating past promises and trust as a track record to justify more trust – are an example of the kind of distortion mentioned in (1), that benefits bad programs more than good ones.
  3. The Open Philanthropy Project's Open AI grant represents a shift from evaluating other programs' effectiveness, to assuming its own effectiveness.
  4. EA Funds represents a shift from EA evaluating programs' effectiveness, to assuming EA's effectiveness.
  5. A shift from evaluating other programs' effectiveness, to assuming one's own effectiveness, is an example of the kind of "moral confidence game" mentioned in (2).
  6. EA ought to focus on scope-limited projects, so that it can directly make the case for those particular projects instead of relying on EA identity as a reason to support an EA organization.
  7. EA organizations ought to entrust more responsibility to outsiders who seem to be doing good things but don't overtly identify as EA, instead of trying to keep it all in the family.
(Cross-posted at my personal blog and the EA Forum.

Disclosure: I know many people involved at many of the organizations discussed, and I used to work for GiveWell. I have no current institutional affiliation to any of them. Everyone mentioned has always been nice to me and I have no personal complaints.)

[Link] Don't Shoot the Messenger

8 Vaniver 19 April 2017 10:14PM

[Link] Holy Ghost in the Cloud (review article about christian transhumanism)

0 gworley 19 April 2017 06:09PM

An OpenAI board seat is surprisingly expensive

5 Benquo 19 April 2017 09:05AM

The Open Philanthropy Project recently bought a seat on the board of the billion-dollar nonprofit AI research organization OpenAI for $30 million. Some people have said that this was surprisingly cheap, because the price in dollars was such a low share of OpenAI's eventual endowment: 3%.

To the contrary, this seat on OpenAI's board is very expensive, not because the nominal price is high, but precisely because it is so low.

If OpenAI hasn’t extracted a meaningful-to-it amount of money, then it follows that it is getting something other than money out of the deal. The obvious thing it is getting is buy-in for OpenAI as an AI safety and capacity venture. In exchange for a board seat, the Open Philanthropy Project is aligning itself socially with OpenAI, by taking the position of a material supporter of the project. The important thing is mutual validation, and a nominal donation just large enough to neg the other AI safety organizations supported by the Open Philanthropy Project is simply a customary part of the ritual.

By my count, the grant is larger than all the Open Philanthropy Project's other AI safety grants combined.

(Cross-posted at my personal blog.)

An inquiry into memory of humans

2 Elo 19 April 2017 07:02AM

Cross posted from: http://bearlamp.com.au/an-inquiry-into-memory-of-humans/

In trying to understand how my memory for people works, I am investigating how my semantic network for people is arranged.

For each exercise that follows you will need to think of a different person to avoid priming yourself with the people you have already thought of.


Think of a person you know.  What comes to mind to represent them?  Is it their name?  Is it their face?  Is it some other sensory detail, or something else entirely?

Think of the face of a person you know.  What else comes to mind?  Can you think of a person’s face without other details, like their name, coming up?  How about without their hair?  Try this for three or more people you know.

Think of a person who has a characteristic voice.  Can you represent the idea of this person without linking to other details about them?  Without their face?  Without their name?  What about a radio presenter whose face you have never seen?  Can you represent their voice without their face?  Without their name?

Think of a person who you can recognise by a characteristic touch.  Think of someone’s handshake that you remember.  Can you represent the concept of the person via handshake alone?  Can you hold off from recalling their name?

Think of a person you can recall who has worn black clothing, and someone who has worn white clothing.  Can you hold them in mind as an idea alone, or is it hard to do so without their name?

Think of someone who you can remember singing.  Can you remember their singing selves without the face?  Without the name?

Think of a person’s name.  Do you know who this person is without their face?  Do you know what they sound like without knowing what they look like?  How do you navigate from one detail to another?

Think of a person who is particularly spiritual.  Can you represent who they are without bringing their name to mind?

I could go on, but I leave the rest as an exercise for the reader: make up and experiment with a few more examples, in smells and any other sensory experiences, or in other ways of dividing people.  Tall, short, grumpy…


So What?

Memory is this weird thing.  If you want to take the most advantage of it, you need to know how it works.  This exercise hopefully makes you ask and wonder about how yours works.

What do you remember easily?  What details come straight to mind, and what details are hard?  Each person will be different in subtle ways, and with knowledge of that difference you can better ask the questions:

Am I going to naturally remember this?

How am I going to format this information in such a way that I can remember it?

In the book Peak, Anders Ericsson suggests that to tap into the power of deliberate practice you need to add new knowledge to the foundation of old knowledge.

I can’t honestly tell you how to use your memory but I hope this exercise is a step in the right direction.


Meta: I spent a few days this week introspecting and wondering.  I apologise for not being able to deliver an insight.  Only questions.

This took 50 minutes to write and is the first piece I typed in Colemak rather than Qwerty after relearning how to type (story coming soon).

[Link] Review of "The Undoing Project: A Friendship That Changed Our Minds," Michael Lewis' book about Kahneman & Tversky

1 fortyeridania 19 April 2017 05:13AM

April '17 I Care About Thread

4 MaryCh 18 April 2017 02:08PM

As an experiment, here's a thread for people to post about things they care about. Specifically, for things that are possible to contribute to, in some way, and preferably, to invite others to join.

Mine is buying and donating highschool textbooks to schools in the 'grey zone' of Ukraine (where the war kinda isn't fought, but few people would be surprised if it started.) I don't deliver them myself, though.

What's yours?

[Link] Expected utility of control vs expected utility of actions

0 whpearson 18 April 2017 08:17AM

[Link] Deleuze contra Error: Other Misadventures of Thought

1 ig0r 18 April 2017 05:45AM

Open thread, Apr. 17 - Apr. 23, 2017

1 gilch 18 April 2017 02:47AM

This is the (late) weekly open thread. See the tag. You'd think we could automate this. The traditional boilerplate follows.


If it's worth saying, but not worth its own post, then it goes here.


Notes for future OT posters:

1. Please add the 'open_thread' tag.

2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)

3. Open Threads should start on Monday, and end on Sunday.

4. Unflag the two options "Notify me of new top level comments on this article" and "

New discussion platform for Less Wrong community (Mastodon instance)

5 rayalez 17 April 2017 04:00PM

Hey, everyone! I have created a new discussion platform for Less Wrong:

https://lesswrong.io/

My goal for this project is to revitalize our community, and create a platform for rational conversations about the current events and new ideas. I'm hoping that in this format, we will have more active discussion, and get to know each other better.

The site is built using Mastodon - a new, open source, decentralized social networking platform that is currently taking off.

[Link] How French intellectuals ruined the West - Postmodernism and its impact, explained

4 ig0r 17 April 2017 12:42AM

Straw Hufflepuffs and Lone Heroes

24 Raemon 16 April 2017 11:48PM
I was hoping the next Project Hufflepuff post would involve more "explain concretely what I think we should do", but as it turns out I'm still hashing out some thoughts about that. In the meanwhile, this is the post I actually have ready to go, which is as good as any to post for now.

Epistemic Status: Mythmaking. This is tailored for the sort of person for whom the "Lone Hero" mindset is attractive. If that isn't something you're concerned with and this post feels irrelevant or missing some important things, note that my vision for Project Hufflepuff has multiple facets and I expect different people to approach it in different ways.

The Berkeley Hufflepuff Unconference is on April 28th. RSVPing on this Facebook Event is helpful, as is filling out this form.



For good or for ill, the founding mythology of our community is a Harry Potter fanfiction.

This has a few ramifications I’ll delve into at some point, but the most pertinent bit is: for a community to change itself, the impulse to change needs to come from within the community. I think it’s easier to build change off of stories that are already a part of our cultural identity.*

* with an understanding that maybe part of the problem is that our cultural identity needs to change, or be more accessible, but I’m running with this mythos for the time being.

In J.K Rowling’s original Harry Potter story, Hufflepuffs are treated like “generic background characters” at best and as a joke at worst. All the main characters are Gryffindors, courageous and true. All the bad guys are Slytherin. And this is strange - Rowling clearly was setting out to create a complex world with nuanced virtues and vices. But it almost seems to me like Rowling’s story takes place in an alternate, explicitly “Pro-Gryffindor propaganda” universe instead of the “real” Harry Potter world. 

People have trouble taking Hufflepuff seriously, because they’ve never actually seen the real thing - only lame, strawman caricatures.

Harry Potter and the Methods of Rationality is… well, Pro-Ravenclaw propaganda. But part of being Ravenclaw is trying to understand things, and to use that knowledge. Eliezer makes an earnest effort to steelman each house. What wisdom does it offer that actually makes sense? What virtues does it cultivate that are rare and valuable?

When Harry goes under the sorting hat, it actually tries to convince him not to go into Ravenclaw, and specifically pushes towards Hufflepuff House:

Where would I go, if not Ravenclaw?

"Ahem. 'Clever kids in Ravenclaw, evil kids in Slytherin, wannabe heroes in Gryffindor, and everyone who does the actual work in Hufflepuff.' This indicates a certain amount of respect. You are well aware that Conscientiousness is just about as important as raw intelligence in determining life outcomes, you think you will be extremely loyal to your friends if you ever have some, you are not frightened by the expectation that your chosen scientific problems may take decades to solve -"

I'm lazy! I hate work! Hate hard work in all its forms! Clever shortcuts, that's all I'm about!

"And you would find loyalty and friendship in Hufflepuff, a camaraderie that you have never had before. You would find that you could rely on others, and that would heal something inside you that is broken."

But my plans -

"So replan! Don't let your life be steered by your reluctance to do a little extra thinking. You know that."

In the end, Harry chooses to go to Ravenclaw - the obvious house, the place that seemed most straightforward and comfortable. And ultimately… a hundred+ chapters later, I think he’s still visibly lacking in the strengths that Hufflepuff might have helped him develop. 

He does work hard and is incredibly loyal to his friends… but he operates in a fundamentally lone-wolf mindset. He’s still manipulating people for their own good. He’s still too caught up in his own cleverness. He never really has true friends other than Hermione, and when she is unable to be his friend for an extended period of time, it takes a huge toll on him that he doesn’t have the support network to recover from in a healthy way. 

The story does showcase Hufflepuff virtue. Hermione’s army is strong precisely because people work hard, trust each other and help each other - not just in big, dramatic gestures, but in small moments throughout the day. 

But… none of that ends up really mattering. And in the end, Harry faces his enemy alone. Lip service is paid to the concepts of friendship and group coordination, but the dominant narrative is Godric Gryffindor’s Nihil Supernum:


No rescuer hath the rescuer.
No lord hath the champion.
No mother or father.
Only nothingness above.


The Sequences and HPMOR both talk about the importance of groups, of emotions, of avoiding the biases that plague overly-clever people in particular. But I feel like the communities descended from Less Wrong, as a whole, are still basically that eleven-year-old Harry Potter: abstractly understanding that these things are important, but not really believing in them seriously enough to actually change their plans and priorities.

Lone Heroes


In Methods of Rationality, there’s a pretty good reason for Harry to focus on being a lone hero: he literally is alone. Nobody else really cares about the things he cares about or tries to do things on his level. It’s like a group project in high school, which is supposed to teach cooperation but actually just results in one kid doing all the work while the others either halfheartedly try to help (at best) or deliberately goof off.

Harry doesn’t bother turning to others for help, because they won’t give him the help he needs.

He does the only thing he can do reliably: focus on himself, pushing himself as hard as he can. The world is full of impossible challenges and nobody else is stepping up, so he shuts up and does the impossible as best he can. Learning higher level magic. Learning higher level strategy. Training, physically and mentally. 

This proves to be barely enough to survive, and not nearly enough to actually play the game. The last chapters are Harry realizing his best still isn’t good enough, and no, this isn’t fair, but it’s how the world is, and there’s nothing to do but keep trying.

He helps others level up as best they can. Hermione and Neville and some others show promise. But they’re not ready to work together as equals.

And frankly, this does match my experience of the real world. When you have a dream burning in your heart... it is incredibly hard to find someone who shares it, who will not just pitch in and help but will actually move heaven and earth to achieve it. 

And if they aren’t capable, level themselves up until they are.

In my own projects, I have tried to find people to work alongside me and at best I’ve found temporary allies. And it is frustrating. And it is incredibly tempting to say “well, the only person I can rely on is myself.”

But… here’s the thing.

Yes, the world is horribly unfair. It is full of poverty, and people trapped in demoralizing jobs. It is full of stupid bureaucracies and corruption and people dying for no good reason. It is full of beautiful things that could exist but don’t. And there are terribly few people who are able and willing to do the work needed to make a dent in reality.

But as long as we’re willing to look at monstrously unfair things and roll up our sleeves and get to work anyway, consider this:

It may be that one of the unfair things is that one person can never be enough to solve these problems. That one of the things we need to roll up our sleeves and do even though it seems impossible is figure out how to coordinate and level up together and rely on each other in a way that actually works.

And maybe, while we’re at it, find meaningful relationships that actually make us happy. Because it's not a coincidence that Hufflepuff is about both hard work and warmth and camaraderie. The warmth is what makes the hard work sustainable.

Godric Gryffindor has a point, but Nihil Supernum feels incomplete to me. There are no parents to step in and help us, but if we look to our left, or right…


Yes, you are only one
No, it is not enough—
But if you lift your eyes,
I am your brother

Vienna Teng, Level Up 


-


Reminder that the Berkeley Hufflepuff Unconference is on April 28th. RSVPing on this Facebook Event is helpful, as is filling out this form.


LessWrong analytics (February 2009 to January 2017)

19 riceissa 16 April 2017 10:45PM


Introduction

In January 2017, Vipul Naik obtained Google Analytics daily sessions and pageviews data for LessWrong from Kaj Sotala. Vipul asked me to write a short post giving an overview of the data, so here it is.

This post covers just the basics. Vipul and I are eager to hear thoughts on what sort of deeper analysis people are interested in; we may incorporate these ideas in future posts.

Pageviews and sessions

The data for both sessions and pageviews span from February 26, 2009 to January 3, 2017. LessWrong seems to have launched in February 2009, so this is close to the full duration for which LessWrong has existed.

Pageviews plot:

[Figure: 30-day rolling sum of Pageviews]

Total pageviews recorded by Google Analytics for this period is 52.2 million.

Sessions plot:

[Figure: 30-day rolling sum of Sessions]

Total sessions recorded by Google Analytics for this period is 19.7 million.

Both plots end with an upward swing, coinciding with the effort to revive LessWrong that began in late November 2016. However, as of early January 2017 (the latest period for which we have data) the scale of any recent increase in LessWrong usage is small in the context of the general decline starting in early 2012.
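
For readers curious how the rolling-sum curves above are computed, here is a minimal sketch in Python with pandas. The file name and column names ("date", "pageviews") are hypothetical placeholders; the actual export in the Gist linked below may be structured differently.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Daily pageviews exported from Google Analytics.
# Assumed (hypothetical) columns: "date" (YYYY-MM-DD) and "pageviews".
daily = pd.read_csv("lesswrong_daily_pageviews.csv", parse_dates=["date"])
daily = daily.sort_values("date").set_index("date")

# 30-day rolling sum of daily pageviews, matching the plot above.
rolling = daily["pageviews"].rolling(window=30).sum()

rolling.plot(title="30-day rolling sum of Pageviews")
plt.ylabel("Pageviews (30-day sum)")
plt.tight_layout()
plt.show()
```

The same steps (sort by date, then take a 30-day rolling sum) produce the sessions curve when applied to a daily sessions column.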

Top posts

The top 20 posts of all time (by total pageviews), with pageviews and unique pageviews rounded to the nearest thousand, are as follows:

Title | Pageviews (thousands) | Unique Pageviews (thousands)
--- | --- | ---
Don’t Get Offended | 681 | 128
How to Be Happy | 551 | 482
How to Beat Procrastination | 378 | 342
The Best Textbooks on Every Subject | 266 | 233
Do you have High-Functioning Asperger’s Syndrome? | 188 | 168
Superhero Bias | 169 | 154
The Quantum Physics Sequence | 157 | 130
Bayesian Judo | 140 | 126
An Alien God | 125 | 113
An Intuitive Explanation of Quantum Mechanics | 123 | 106
Three Worlds Collide (0/8) | 121 | 93
Bayes’ Theorem Illustrated (My Way) | 121 | 112
9/26 is Petrov Day | 121 | 115
The Baby-Eating Aliens (1/8) | 109 | 98
The noncentral fallacy - the worst argument in the world? | 107 | 99
Advanced Placement exam cutoffs and superficial knowledge over deep knowledge | 107 | 94
Guessing the Teacher’s Password | 102 | 96
The Fun Theory Sequence | 102 | 90
Optimal Employment | 102 | 97
Ugh fields | 95 | 86

Note that Google Analytics reports are subject to sampling when the number of sessions is large (as it is here) so the input numbers are not exact. More details can be found in a post at LunaMetrics. This doesn’t affect the estimates for the top posts, but those wishing to work with the exported data should be aware of this.

Each post on LessWrong can have numerous URLs. In the case of posts that were renamed, a significant number of pageviews could be recorded at both the old and new URL. To take an example, the following URLs all point to lukeprog’s post “How to Be Happy”:

All that matters for identifying this particular post is that we have the substring “/lw/4su” in the URL. In the above table, I have grouped the URLs by this identifying substring and summed to get the pageview counts.

In addition, each post has two “canonical” URLs that can be obtained by clicking on the post titles: one that begins with either “/r/lesswrong/lw” or “/r/discussion/lw” and one that begins with just “/lw”. I have used the latter in linking to the posts from my table.
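
The grouping step described above can be sketched in a few lines of pandas. The file and column names ("pagePath", "pageviews") are hypothetical placeholders for whatever the per-URL Google Analytics export actually uses.

```python
import pandas as pd

# Per-URL pageview counts exported from Google Analytics.
# Assumed (hypothetical) columns: "pagePath" and "pageviews".
pages = pd.read_csv("lesswrong_pageviews_by_url.csv")

# Extract the post identifier, e.g. "/lw/4su" from
# "/lw/4su/how_to_be_happy/" or "/r/lesswrong/lw/4su/how_to_be_happy/".
pages["post_id"] = pages["pagePath"].str.extract(r"(/lw/[0-9a-z]+)", expand=False)

# Sum pageviews across all URL variants of the same post, then take the top 20.
top_posts = (
    pages.dropna(subset=["post_id"])
         .groupby("post_id")["pageviews"]
         .sum()
         .sort_values(ascending=False)
         .head(20)
)
print(top_posts)
```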

Source code

The data, source code used to generate the plots, as well as the Markdown source of this post are available in a GitHub Gist.

Clone the Git repository with:

git clone https://gist.github.com/cbdd400180417c689b2befbfbe2158fc.git

Further reading

Here are a few related PredictionBook predictions:

Acknowledgments

Thanks to Kaj for providing the data used in this post. Thanks to Vipul for asking around for the data, for the idea of this post, and for sponsoring my work on this post.

Towards a More Sophisticated Understanding of Myth and Religion (?)

3 Erfeyah 16 April 2017 08:31PM

Lately I have been investigating the work of Jordan Peterson, which I have found to be of great value. Indeed, I have to admit that I am being persuaded, though I am trying to keep a critical mind and a balance between doubt and belief.

I thought this was a strong argument for a more sophisticated understanding of the function of religion, and one that would be quite fun to throw to the LessWrong community for an attempt to dismantle ;)

You can find his lectures online on YouTube. The 'Maps of Meaning' series is a fairly detailed exposition of the concepts. For a quick taste, you can watch the Joe Rogan podcast with him (they talk a bit about religion in the last hour or so), though in that format you inevitably only get a sketch.

Have fun!
