Criticism of EA Criticism Contest

Zvi

Back when it was announced, I toyed with the idea of criticizing the EA criticism and red teaming contest as an entry to that contest, leading to some very good Twitter discussion that I compiled into a post.

I finally found myself motivated to write up my thinking in detail.

Before I begin, I recommend reading the contest announcement yourself, and forming your own reaction. Ask yourself, among other things:

What are they telling people they are interested in and likely to reward?
What are they not interested in and telling people they will not reward?
What does this announcement say more generally about EA?

Praise First

I will start with some things that I think are good about the contest and announcement.

It exists at all. It is good to solicit (paid!) public criticism.
It awards at least $100,000. That’s good prize money.
It promises an even bigger prize for something that really hits home.
It suggests it will reward causing people to change their minds.
Guidelines offered to help with techniques include red teaming.
The exercise I read this as being is still worth doing, if not as worthwhile as the thing I’d rather be happening, so long as it doesn’t create a story that the thing I want to happen did indeed happen and is thus handled, when it isn’t handled.
What actually wins will be an informative experiment. Stay tuned.

Core Critique

Effective Altruism has a core set of assumptions, and a core method of modeling and acting in the world.

The core critique I am offering is that the contest is mostly taking things like the list below as givens, rather than as things to be questioned.

It is sending a set of signals that such punches should be pulled, both for the contest and in general. Effective Altruism, in this model, very much wants criticism of its tactics, and mostly wants it also of its strategy, but only within the framework.

This parallels EA more broadly, where within-paradigm critiques are welcomed (although, as everywhere, often incorrectly disregarded) but deeper critiques are unwelcome, and treated as mistakes.

Here is my attempt to summarize the framework, which came out to 21 points.

Note that the rest of the post does not depend on the exact points or their wordings. Although I did attempt to make the list match my perceptions as much as possible, the list is included primarily to give an idea of the type of thing that is assumed here.

Utilitarianism. Alternatives are considered at best to be mistakes.
Importance of Suffering. Suffering is The Bad. Happiness/pleasure is The Good.
Quantification. Emphasis on that which can be seen and measured.
Bureaucracy. Distribution of funds via organizational grants and applications.
Scope Sensitivity. Shut up and multiply, two are twice as good as one.
Intentionality. You should plan your life around the impact it will have.
Effectiveness. Do what works. The goal is to cut the enemy.
Altruism. The best way to do good yourself is to act selflessly to do good.
Obligation. We owe the future quite a lot, arguably everything.
Coordination. Working together is more effective than cultivating competition.
Selflessness. You shouldn’t value yourself, locals or family more than others.
Self-Recommending. Belief in the movement and methods themselves.
Evangelicalism. Belief that it is good to convert others and add resources to EA.
Reputation. EA should optimize largely for EA’s reputation.
Modesty. Non-neglected topics can be safely ignored, often consensus trusted.
Existential Risk. Wiping out all value in the universe is really, really bad.
Sacrifice. Important to set a good example, and to not waste resources.
Judgment. Not living up to this list is morally bad. Also sort of like murder.
Veganism. If you are not vegan many EAs treat you as non-serious (or even evil).
Grace. In practice people can’t live up to this list fully and that’s acceptable.
Totalization. Things outside the framework are considered to have no value.

There are also things that follow from these points.

And importantly, there are also things one is not socially allowed to question or consider, not in EA in particular but fully broadly. Some key considerations are things that cannot be said on the internet, and some general assumptions are importantly wrong but cannot be questioned. This is a very hard problem but is especially worrisome when calculations and legibility are required, as this directly clashes with there being norms against making certain things legible.

Where I agree with the list above, I am a rather large fan. In particular, #7 is insanely great, #5 can be taken too far but almost always isn’t taken far enough, and #16 is super important. And if you’re going to do the rest of this, you really need #20.

Yet when I think about the remaining 17 points above, I notice that I strongly disagree with a majority of them (#1, #2, #4, #8, #10, #11, #13, #14, #15, #17, #18, #19, #21), and disagree on magnitude or practical usefulness with the rest (#3, #6, #9, #12), where I would say something like ‘yes, more of this than others think, less than you think.’

That many disagreements strongly implies the list is not doing a good job cutting reality at its joints, and a shorter list is possible - that many items here are more examples than they are core things. I am confident that starts at the top with #1. That all seems right and important, but a harder topic for another day. I don’t know how to write at that level in a way that I feel permission to do so, or the expectation of being able to do so effectively. Which makes me assume most others feel the same way.

The same goes for disagreement with things that follow from the assumptions, or that can only be challenged by challenging societal assumptions where there are norms against being seen challenging them.

One could say as Helen does here that EA is merely asking the question of “how can I do the most good, with the resources available to me?” or alternatively “with the resources I choose to give” although the contradictions and implications on that never sit well any more than they do with almost every other value system. To some extent sure, the details are up for grabs, but a lot of the answers on things like the items above aren’t considered all that up for grabs, and EA has an implied model and definition that goes along with ‘do the most good’ being a coherent concept, which packs a lot of logical punch.

Rather than being said too explicitly, although there is some of that as well, the call to work within the paradigm and pull punches comes centrally in the form of a vibe. Things come together to communicate the message implicitly, even unconsciously. I do not think that those who wrote the contest were doing this deliberately. Rather, it is a property of the systems that produce such things, that happens if not fought against.

The only way I know of to explain is to do so via a close reading.

A Close Reading of the Criteria

The section most worth zeroing in on is the criteria. Note that at various points someone could rightfully say ‘a literal interpretation of these words does not say the thing you are claiming they are saying’ and my response to that is ‘yes, you are right, as a matter of logic, but that is not the way that words like this are interpreted, nor is it the way they are intended to be interpreted at the level that matters most.’

Here is their guideline for what they are looking for.

Overall, we want to reward critical work according to a question like: “to what extent did this cause me to change my mind about something important?” — where “change my mind” can mean “change my best guess about whether some claim is true”, or just “become significantly more or less confident in this important thing.”
Below are some virtues of the kind of work we expect to be most valuable. We’ll look out for these features in the judging process, but we’re aware it can be difficult or impossible to live up to all of them.

Critical. The piece takes a critical or questioning stance towards some aspect of EA theory or practice. Note that this does not mean that your conclusion must end up disagreeing with what you are criticizing; it is entirely possible to approach some work critically, check the sources, note some potential weaknesses, and conclude that the original was broadly correct.

Important. The issues discussed really matter for our ability to do the most good as a movement.

Constructive and action-relevant. Where possible we would be most interested in arguments that recommend some specific, realistic action or change of belief. It’s fine to just point out where something is going wrong; even better to be constructive, by suggesting a concrete improvement.

Transparent and legible. We encourage transparency about your process: how much expertise do you have? How confident are you about the claims you’re making? What would change your mind? If your work includes data, how were they collected? Relatedly, we encourage epistemic legibility: the property of being easy to argue with, separate from being correct.

Aware. Take some time to check that you’re not missing an existing response to your argument. If responses do exist, mention (or engage with) them.

Novel. The piece presents new arguments, or otherwise presents familiar ideas in a new way. Novelty is great but not always necessary — it’s often still valuable to distill or “translate” existing criticisms.

Focused. Critical work is often (but not always) most useful when it is focused on a small number of arguments and a small number of objects. We’d love to see (and we’re likely to reward) work that engages with specific texts, strategic choices, or claims.

Once again I am going to ask you to stop and read through yourself, and form your own interpretation of what this is telling you.

Listen to those little voices inside your head.

We’ll start at the top. They don’t use italics, so I will use them to highlight particular details.

Want

Overall, we want to reward critical work according to a question like: “to what extent did this cause me to change my mind about something important?”

That’s a great question. Love the question.

What is the word ‘want’ doing here? It is placing an invisible yet unmistakable-once-you-notice-it ‘but’ in that sentence. It is often said, for good reason, that anything before the ‘but’ does not count.

Then later they say they will ‘look out for these virtues’ in the judging process:

We’ll look out for these features in the judging process, but we’re aware it can be difficult or impossible to live up to all of them.

Why? Why would we look out for these virtues?

If the thing to be rewarded is the extent to which minds were changed about something important, you can simply ask yourself that question.

I agree that having more of the properties listed will make it more likely that a given effort will change one’s mind about something important. It is easy to see why being critical, important, aware, novel, focused, transparent and legible all help the cause here, with constructive and action-relevant being the notable exception.

That doesn’t explain why you would deliberately Goodhart yourself here. It doesn’t explain why they would be part of the judging criteria rather than being merely things to consider when composing an entry.

Note the statement that it can be difficult or impossible ‘to live up to’ all of them, which emphasizes that absolutely we will dock your grade for anything you miss, on top of any effect it has on your ability to change our mind.

It’s the difference between a teacher saying each of these two things to a class:

Tomorrow’s report on population ethics must be double-spaced, with 10-point Arial font, and 4-5 pages long and 1-inch margins, or I’m docking your grade.

Tomorrow’s report on population ethics should succinctly explain and justify your position, without going into too much detail, as we’ve discussed. It probably wants to be about 4-5 pages long.

The second teacher wants something useful, a thought out and justified view on population ethics that doesn’t get too lost in the weeds.

The first teacher is more focused on something like ensuring that the student write the correct number of characters in the proper format, to ensure everyone knows how to check all the boxes correctly and obey arbitrary rules, and to ensure no one ‘cheats’ by doing less work.

Or, alternatively, the first teacher is in a school culture where they know the students are utterly uninterested in producing a useful thing and can only be forced to do anything by threat of punishment, and that punishment requires violation of the rules, so the rules need to be exactly specified or they’ll get back papers in 18-inch (or for that one weirdo 8-inch) fonts with improper spacing and huge margins so the kids can get back to TikTok or whatever they’re doing these days.

This note from the FAQ confirms we are dealing with the first teacher (note the italicized term requirements):

Does my submission need to fulfill all the criteria outlined above? No. We understand that some formats make it difficult or impossible to satisfy all the requirements, and we don’t want that to be a barrier to submitting. At the same time, we do think each of the criteria are good indicators of the kind of work we’d like to see.

So to bring it back consider the following three sets of instructions.

We will judge entries based on the question: “To what extent did this cause me to change my mind about something important?” Here are some virtues we expect to best help people to change their minds about important things, but you will be judged only on whether our minds were changed.

We want to judge entries based on the question: “To what extent did this cause me to change my mind about something important?” Here are some virtues we expect to find most valuable, so we will look out for them during the judging process.

We want to judge entries based on the question above. We will instead judge the contest largely on other things, including the following virtues.

My claim here is that #1 is very different from #2, but that #2 and #3 are very similar.

My additional claim is that #1 is a much better thing to be doing.

There is no need to look out for any virtues here. The whole idea is that those virtues/features are helpful in cutting the enemy. So look at the enemy. Has it been cut?

There are times and places where you want the first teacher (or something in between them) rather than the second one. This does not seem like it is one of those places.

Just

“to what extent did this cause me to change my mind about something important?” — where “change my mind” can mean “change my best guess about whether some claim is true”, or just “become significantly more or less confident in this important thing.”

It took me a bit to notice consciously what I had automatically noticed and flinched from here unconsciously, like a Fnord or a bit of alchemy.

The sentence starts off with a great question: “To what extent did this cause me to change my mind about something important?”

It then clarifies “change my mind” to mean “X or just Y,” where Y ends with ‘in this important thing.’

The brain instinctively knows three things here.

That the word ‘just’ in this context means something lesser or minimal. It satisfies the criteria, but to an importantly lesser degree.

That ‘something important’ has become optional.

Confidence levels are less important than best guesses.

To see the second one, notice that one of the two formulations, the lesser one, ends with ‘this important thing.’ Then move that back into the original sentence - “…become significantly more or less confident in this important thing about something important.”

That’s basically Paris In The The Springtime. The middle instance drops out.

Thus, the message I heard was:

We are looking for you to change our view of a [specific] claim.

Changing our confidence level is not as good.

Importance of claim is nice to have, but optional.

Critical

If you are looking for EA criticism, it makes sense to want it to be critical.

I can’t help but notice that this is actually the opposite request. It is saying that you do not need to be critical in order to count as critical.

You can ‘approach some work ‘critically’, check the sources, note some potential weaknesses, and conclude that the original was broadly correct.’

That is of course a valuable thing to do. If the original is broadly correct, I want to conclude that the original is broadly correct.

There is also the danger that if you only reward ‘criticism’ with money, and someone sets out to do an examination, that they will be biased towards being unfairly critical and negative, worried that it otherwise wouldn’t count.

The contest authors seem clearly worried about this, as an extension of the worry that people considering entering might need some assurance that they will get compensation, and giving them the ability to request funding in advance. I notice I am suspicious of a contest working this way, but it is not obviously wrong or crazy.

As written, this instead sends the opposite message. It is saying:

You need to take a critical stance towards something, some ‘aspect of EA theory or practice.’

That does not mean you need to actually be criticizing it.

Noticing some potential weaknesses is enough.

We are looking to reward things that appear to be critical assessments, but that conclude broad agreement with the thing being criticized.

Nowhere does this say explicitly that it would be preferred if you didn’t substantially criticize the target, and that the goal is to allow everyone involved to tell a story that criticism was solicited and received and that the ‘changing of one’s mind’ is supposed to be an affirmation of existing stories.

But is that the vibe here? The implicit message? Absolutely. Three quarters of the virtue of criticism is about how we meant critical but we didn’t mean criticize or disagree.

It is, again, totally fine to commission examinations of existing practice with an eye towards catching mistakes and being more confident in conclusions rather than looking for the strongest challenges. I would not call that a ‘criticism and red teaming’ task.

I realize, again, that covering all one’s bases in various directions is a challenge. What would I have written here instead? I would have written this:

Critical. The piece examines, criticizes and challenges an important EA theory or practice.

Thus, I would have accepted the ‘if all you do is affirm what we were already doing you are not embodying this virtue’ downside, because in this context it is a downside, it is less likely to importantly change one’s mind and the contest should reflect that while remaining open to having one’s mind changed anyway. When you run a contest, it needs to be expected that sometimes an attempted entry will turn out to be barking up the wrong tree - if the investigation doesn’t change our minds, but we want to pay anyway, then we are rewarding something other than what we ‘want’ to reward. We’re at minimum rewarding displays of effort, and potentially rewarding storytelling. That’s more like commissioning work. If you want to commission work, commission work.

If I was sufficiently worried about this I might extend this to:

Critical. The piece examines, criticizes and challenges an important EA theory or practice. This does not mean it needs to claim the target is centrally bad and wrong, but it must point to a way in which minds should change.

Important

Important. The issues discussed really matter for our ability to do the most good as a movement.

Important things are more important than less important things. Yes, strongly agreed that it is more valuable to change one’s mind about important things. I do notice that the second clause is suspicious, as opposed to saying:

Important. The issues discussed really matter.

The first version cares only about whether it matters for our ability to do good. And beyond that, for our ability to do good as a movement.

A philosophy question. If something is important for reasons other than the ability to do good, is it still important?

I would argue strongly yes. The trivial argument is that it can matter for our ability to avoid doing bad, or our cost of doing good, or various higher-level effects, or other such caveats. These buy the basic premise that the thing that is important is doing good on some scale, with minor extensions.

I do still think that distinction matters. Noticing how to do good ‘cheaper’ or avoid harms is often more (tractable/neglected/important) in a situation, and many failed collective efforts (especially government ones) were rushes to ‘do good’ without in this sense thinking it all through.

The more fundamental challenge is whether doing good is indeed The Good. Can you learn something, have that not impact your ability to do good (including the extensions above) but have that still be important? Could you still be correct to invest time or other resources into learning that?

My answer again is strongly yes, and also that if we answer ‘no’ here our ability to do good is substantially reduced to the extent I question whether someone who sets out to do good while answering ‘no’ should be expected to on net do good.

There is a very deep disagreement this is pointing to where I do not think ‘do the most good’ is the right way to think about how to achieve The Good, which encompasses many core disagreements along the way.

Discussing properly would alas be beyond scope here. What I will note is that this reinforces the message that importance means on the basis of importance to doing things that are classified by the EA structures as ‘doing good’ and that can be quantifiably linked to specific good done, steering once again into narrow bounds.

Movement

Then we come to the end, ‘as a movement.’

I care about achieving The Good. I care about doing good. What I absolutely do not care about is whether this good is done or achieved by EA. I want the good because it’s good. If something is important to helping others do more good, or suggesting people can do importantly more good outside of EA, then that is important information, and information I actively want to find.

Constructive

This is the odd one out. The other descriptions are worrisome as is grading based on them, but they do seem clearly to be virtues in the desired sense.

Here we have something else. This is a commonly used strategy to prevent criticism.

One of the biggest reasons errors go unnoticed and uncorrected is ‘don’t tell [the boss/everyone/whoever] about the problem if you don’t have a solution.’

The attempt to say that the action can merely be ‘change of belief’ indicates that there is awareness of this issue, as does ‘it’s fine to just point out where something is going wrong’ but that is exactly what is not meant by ‘constructive’ or especially ‘action relevant.’

In other words, it is saying these things are ‘fine’ but in the sense that we will uphold the rhetorical claim that it is fine while making it very clear implicitly that this is totally, absolutely not fine and will be treated accordingly. If you can’t figure this out, or choose not to be a team player on this, that’s on you.

The request that the proposal be ‘specific and realistic’ and later ‘concrete’ is very clear that a change of belief that doesn’t cash out in a particular new action is insufficient.

There are several clear indications that ‘what you are doing does not work, halt and catch fire’ is not what is being requested here, not in practice. They want ‘what you are doing does not work, so fix it with this One Weird Trick’ or ‘what you are doing does not work, you should do this [within paradigm] alternative instead.’

The counterargument is that of course it is beneficial to be constructive rather than non-constructive. It is better to offer new information about what is failing and also a worthwhile path forward than to offer the same new information about what is failing and not to offer a path forward. And that’s certainly true. But when important problems are noticed, and important criticisms or red teaming is offered, gating the problem announcement on a solution kills the whole process.

The whole point of red teaming is to find where you are likely to fail. It is a different job and task to then fix the problem.

I hear a clear message here, and that message is: One must be a team player. You must provide a story that we can continue to mostly execute our strategies and do the most good without a halt and catch fire or a paradigm shift. If you do not do this, we will go all Copenhagen Interpretation of Ethics on you, and hold you responsible for breaking the narrative compact.

Once again, a good question is, what would I have written here, to make the core point that it is a plus to point out solutions and suggest useful actions?

This one is tough. If I had to keep the ‘you will be judged on these bullet points’ language I think I delete this entirely. Otherwise, maybe something like this?

Explicit. It is best if you can explicitly indicate clear and important belief updates that should be made. That makes it more likely to importantly change minds. It is also helpful if you can explicitly indicate how these belief updates should change behavior, including where it makes sense to halt and catch fire.

Even this makes me nervous. Perhaps I am typical minding here, and people (especially in our circles) would if left to their own devices be overly neglectful of these stages and thus can use pushes in that direction. I doubt this, because I think most people fully recognize that everyone loves it when you not only point out a problem but also come with a solution.

When I point out a problem, I certainly attempt to find a solution or path forward, or at least ways to look for one, to suggest them as well. That is a normal, friendly, highly useful course of action. The key is not to let the lack of it stop you from speaking the truth, or make your voice tremble.

If someone does both point out a big problem and also come up with a good solution, then yes, that is even better, that’s two changes of mind in one and should be rewarded accordingly. The problem is that by default this mostly causes silencing.

Legible

The top link here goes to an interesting document that leads off like this:

In short, our top recommendations are to:

Open with a linked summary of key takeaways. [more]

Throughout a document, indicate which considerations are most important to your key takeaways. [more]

Throughout a document, indicate how confident you are in major claims, and what support you have for them. [more]

This is vastly better practice than the vast majority of alternatives. More people doing it would mostly be great, given what it would be replacing.

I am especially a big fan of indicating confidence in one’s claims, and would extend this to minor claims. It’s worth it.

The top reasons I often don’t quite do this are, with some overlap:

Anchoring. If you know the takeaways before seeing the evidence, you will be more inclined to draw the same conclusions and think in the same ways.

Passivity. Telling people the takeaways and key points in advance acts as a blocker that makes it harder for them to think about the problem, either the way you did or their own way, and pushes them into a kind of ‘check your work’ mode.

Learning to Think. I consider the secondary goal of essentially everything I write to be sharing my methods of thinking so others can improve theirs, and I can also improve mine. I realize there is a time and a place not to do this.

Discarding the Illegible. I consider this an especially big problem in EA but it is a problem in general. When you list legible considerations, and start assigning numbers, what happens to considerations that aren’t as legible? Even if you can spell out your worry, often it gets reduced to the part whose impact you can quantify via a lower bound. Thus, some considerations have clearly-much-too-low estimates or get discarded, others have reasonable estimates, and they get compared.

The Law of One Reason. If you give someone five good reasons for something, the default is for the brain to mostly discard four of them and compare the remaining reason to the one reason on the other side. Thus, it is often stronger in practice to drop four of the five reasons, and only state your strongest case. It can also be beneficial to trick the other side into providing lots of additional reasons, even real ones, while you hammer on your talking points that work. Indicating what are your most important reasons risks making this problem worse by dropping everything not on the list, and it also risks invoking the problem by having most of the list be discarded. There’s also the reverse part, where if they find one reason they disagree with people often then discard the whole thing whether or not that makes sense.

Key Takeaways Crowd Out. There is the argument that you only get five words, so better to choose what five words readers will remember and make it easy to have them stick. There is something to that. I still choose to hope for better. When you categorize your key takeaways, then talk about what is important to your key takeaways, what chance do any other potential takeaways have? Especially becoming stronger more generally, or unexpected things.

The second link basically explains that if one can understand what your claims are saying, quantify them, check them against other sources, figure out if they’re right and whether they justify the thing they are claiming they justify, that’s all helpful.

I’d agree with that, in something like this form: If a claim can be made more legible, that is a good thing to do, and you probably should do that. Needlessly vague statements are less useful on many levels.

The problem comes when you can’t easily make the claim more legible. Your choices are often to either (A) not make the claim at all, (B) make a similar but importantly different claim so it will be legible, (C) write endlessly trying to make the thing legible via giving people tacit knowledge or (D) say the thing illegibly and hope at least some people get it, or that at least some readers can grok it through context later on. Often it is helpful here to point out you are saying something illegible.

I think about this problem a lot. There are a lot of things I want to say that I don’t know how to say in a way that would sound not-crazy and not get dismissed, or would make sense to anyone who didn’t already know or mostly know. Often this is because I don’t yet understand it well enough. Other times it is because there are lots of load bearing things holding up the thing, or explaining why the thing isn’t crazy, and getting through those seems impossible or at least has to happen first. Then there are times when some of the load bearing things are things I can’t say out loud on the internet so then what do you do?

Often communicating such things through in-person conversation is possible where written communication is not, because you can figure out which metaphors resonate and click, and which parts of your model they disagree with and have to be explained or justified in which ways. Often I have good hope of tutoring on a question where I would be hopeless at teaching a class. Often that is because it is insufficiently legible.

Mostly all of this is quibbling and nitpicking, since almost everyone should move in the direction being suggested here, and I could probably do to work harder at this as well. The problem comes when you double-count this, as in both explicitly checking for legibility and then also checking whether understanding follows.

I do worry about explicitly calling for people to say how much expertise they have in this way. I would expect a chilling effect on those who do not have sufficiently legible expertise in an area, making them doubt themselves and expect not to be listened to on the merits, and worried there will be reliance on argument from authority. One should of course still point out where and when one does and doesn’t have expertise of various kinds, especially when relying upon it to make claims, but mostly the words should stand on their own.

How would I have worded this one?

Transparent and legible. We encourage transparency about your process. How did you reason things out and reach your conclusions? How confident are you about the claims you’re making? What would change your mind? If your work includes data, how were they collected? Relatedly, we encourage epistemic legibility to the extent possible: Make your claims and reasoning as easy to pin down and evaluate and find cruxes with as you can, but no more than that.

While epistemic legibility is distinct from correctness, I would aim to avoid ending a point one is being judged on with ‘separate from being correct’ since that gives the impression correctness is not so valued.

Aware

Aware. Take some time to check that you’re not missing an existing response to your argument. If responses do exist, mention (or engage with) them.

Is it a good argument?

It is virtuous to respond to all good arguments against your position whether or not anyone has made them before.

Looking for existing arguments has several justifications.

It may be more efficient than figuring out the counterarguments yourself.

You might find good arguments.

You might be able to preempt bad arguments and save everyone time.

You might be blamed for not addressing existing arguments.

Whenever I write anything, there’s always a voice that says: Thousands of people are going to read this. You are one person. Shouldn’t you put in more work?

The answer is, sometimes, but if you take that too far nothing gets written. The point is to follow procedures and have virtues that actually lead to providing value to people.

To me, the idea of ‘responding to arguments’ misses the point entirely. Docking points for not pointing out existing counter-arguments, while not telling people to be on the lookout for arguments against your position that haven’t been suggested (which is especially important if it is, as the next point suggests, novel) means the goal isn’t to help people have all they need to form their own opinions and seek truth. Instead, you’re checking social boxes, and you’re doing adversarial advocacy rather than trying to figure things out.

Thus, I’d move this one around.

Balanced. Seek out potential reasons you might be wrong. Where they have merit, at least note them, and ideally address them. If people disagree with you, it is worth understanding why and considering addressing that.

Novel

This one seems fine, although I am unsure it is necessary, and I worry about awarding points for it directly. This is an example of good wording.

Focused

I strongly agree with this sentiment provided it then generalizes. I am unsure that focused is the correct name for the thing (Grounded? Detail Oriented or Detailed? Specific? Not sure anything else is better), but working with examples and their details is usually the right way to go. Then readers can draw the larger point and generalize from there. Otherwise, you often say a bunch of vague things and don’t have a good example even if asked, and the whole thing feels abstract and gets ignored, often with good reason.

The flip side is that the reason to do a detailed report that engages with specific objects can be either:

You care mostly about the specific objects.

The example grounds discussion and illustrates larger points.

I am much more interested, in most contexts, in that second one. So I’d like to see an extra sentence here, something like “Especially when this concreteness helps people to understand things that can then be generalized to other contexts.”

Putting That Together

That was a lot of detail, which together forms a pattern and vibe. That vibe says that what is desired is a superficial critique that stays within and affirms the EA paradigm while it also checks off the boxes of what ‘good criticism’ looks like and it also tells a story of a concrete win that justifies the prize award. Then everyone can feel good about the whole thing, and affirm that EA is seeking out criticism.

Formats For Investigation

The post also suggests adapting one of the standard forms of investigation that have been established within the EA paradigm. Giving examples of things to do is good, and a lot of this seems reasonable, but the vibe is repeatedly reinforced.

You might consider framing your submission as one of the following:

Minimal trust investigation — A minimal trust investigation involves suspending your trust in others' judgments, and trying to understand the case for and against some claim yourself. Suspending trust does not mean determining in advance that you’ll end up disagreeing.

Red teaming — ‘Red teaming’ is the practice of “subjecting [...] plans, programmes, ideas and assumptions to rigorous analysis and challenge”. You’re setting out to find the strongest reasonable case against something, whatever you actually think about it (and you should flag that this is what you’re doing).

Fact checking and chasing citation trails — If you notice claims that seem crucial, but whose origin is unclear, you could track down the source, and evaluate its legitimacy.

Adversarial collaboration — An adversarial collaboration is where people with opposing views work together to clarify their disagreements.

Clarifying confusions — You might simply be confused about some aspect of EA, rather than confidently critical. You could try getting clear on what you’re confused about, and why.

Evaluating organizations — including their (implicit) theory of change, key claims, and their track record; and suggesting concrete changes where relevant.

Steelmanning and ‘translating’ existing criticism for an EA audience — We’d love to see work succinctly explaining these existing ideas, and constructing the strongest versions (‘steelmanning’) them. You might consider doing this in collaboration with a domain expert who does not consider themself part of the EA community.

In particular, this suggests various formats that are unlikely to offer central or fundamental criticism, or often even criticism at all, and that make it impossible to break out of or challenge the paradigm.

A minimal trust investigation, fact checking, chasing citation trails and evaluating organizations are inherently local, standard things that EA does all the time. Red teaming is similar. You’re sizing up the fish but you’re not going to notice the water.

Adversarial collaboration I haven’t loved as I’ve seen it on SSC/ACX, tending to end up as a dialectic of sorts and rarely breaking out, although it can often help with factual disputes along the way. Mostly it seems like it picks a disagreement, makes it as legible as possible, and ensures (here) that half the work goes into defending rather than criticizing whatever the target may be. I don’t expect this to give us anything too challenging.

Clarifying confusions is explicitly noted to often not even be critical at all, if you don’t count that something was not sufficiently clearly explained and thus you are confused. The description is pointing to the possibility of doing this in a non-critical fashion, to reward the noticing of confusion - a fine thing to reward, but likely to be with punches pulled. This framing pushes for a general modesty, setting up for the conclusion where confusions are resolved by explanation (that will often largely have effectively been social proof).

Steelmanning and translating existing criticism is taking things that are outside of the paradigm, and bringing them inside the paradigm. Often I worry this will take the most valuable disagreements and pretend not to see them, but it could also go the other way and result in a big flashing sign that says ‘this argument doesn’t fit into the paradigm because it rejects the following assumptions.’ The key is to not then go, ‘oh, this argument doesn’t understand these important fundamental things, this person needs to study EA basics more.’ And yes, that has happened to me and I did not like it.

The steelmanning part, in particular, seems likely to cause this. If an EA is looking to steelman criticism, there will be great temptation to change the thinking and arguments to be more EA-correct and thus throw out the most important content. Rob Bensinger called steelmanning ‘niche’ recently for similar reasons, quoting many well-known arguments against steelmanning.

I think there are two distinct useful things that can be done in this space.

Taking existing criticism and making it understood as it was intended.

Taking existing criticism and extracting and reasoning from its good points.

Steelmanning does one at the expense of the other.

What is most significant in this list is what it leaves out. In particular, if one has a central disagreement with something important about EA or its paradigm, that seems like the most important thing to talk about. How should one express that? One can frame it as ‘expressing confusion’ but that is pretending to be confused in order to allow those involved to save face and/or not actually engage fully. The last point explicitly says one must take existing criticism, rather than your own.

Overall, if I was looking to express important fundamental disagreements, to offer the most powerful criticisms of EA overall, to challenge its assumptions and Shibboleths, this list would discourage me quite a lot.

The Judging Panel

This looks like a large judging panel (so consensus and social dynamics will likely be important even without veto powers) that consists entirely of EA insiders who likely buy into the core EA principles.

I didn’t notice this at all unprompted, partly because I assumed of course such a thing would be true. Then in the Twitter discussion someone else noticed this as a reason to be skeptical.

A reasonable objection is that if you are having a contest to see who can cause people in EA to change their minds, a non-EA judge would not be able to evaluate that. To the extent that evaluation was directly on the basis of whether the enemy was cut and minds changed, this seems reasonable - if the insiders all reject your entry then you likely did not productively change minds of insiders. If good entries that should change minds get rejected, that means the whole thing was pointless no matter who got the prize funds. If they get accepted, system works, great.

If you’re judging on the basis of intermediate metrics instead, and considering the social implications of various awards and implied commitments and such, and are effectively more focused on box checking in various ways, then an insider-only judging panel is a serious problem.

It also sends a clear message to someone who knows they are telling insiders what they do not want to hear. Which, of course, is often the most important criticism. You are much more likely to hear what you need to be told but don’t want to be told, if there are outsiders on the panel.

Rationale

This section is useful in large part to contrast the conscious intentional story it tells with the story being told above, but also it has things worth noticing in other ways.

In his opening talk for EA Global this year, Will MacAskill considered how a major risk to the success of effective altruism is the risk of degrading its quality of thinking: “if you look at other social movements, you get this club where there are certain beliefs that everyone holds, and it becomes an indicator of in-group mentality; and that can get strengthened if it’s the case that if you want to get funding and achieve very big things you have to believe certain things — I think that would be very bad indeed. Looking at other social movements should make us worried about that as a failure mode for us as well.”

This implies that MacAskill does not believe that you currently need to (missing but belongs: pretend to) believe certain things to get big EA funding. I would be very surprised if that were correct. I expect that you currently do need to, or at a minimum that it helps quite a lot.

I don’t even think that is obviously an error. It seems more like a question of selection, magnitude and method. EA has a giant pool of money, and it needs to be protected somehow. We are, as Churchill said, talking price.

This paragraph is also of note:

It’s also possible that some of the most useful critical work goes relatively unrewarded because it might be less attention-grabbing or narrow in its conclusions. Conducting really high-quality criticism is sometimes thankless work: as the blogger Dynomight points out, there’s rarely much glory in fact-checking someone else’s work. We want to set up some incentives to attract this kind of work, as well as more broadly attention-grabbing work.

This is saying once again that they want high quality criticism in terms of getting the details and facts right, and they don’t mind if the conclusions are narrow.

I am totally sympathetic to the goal here. Good narrow criticism is a worthwhile exercise, and I can totally believe that throwing money at this to create supply is a good idea. I have no problem with setting out to get a bunch of fact checks via a fact-checking contest. The contest format seems like an odd fit but it should still work. That’s simply a much more compact and modest goal, and one that is very different from a general call for important criticism and mind changing.

I myself do some form of fact checking reasonably often. I consider it valuable, and would be appreciative if there was a good easy-to-use fact checking service that could do the ‘thankless’ parts of it while I did other parts, to let me put more focus elsewhere.

The final note is that they welcome criticism that EA’s mistake may potentially be not being EA enough, or not being sufficiently weird or having sufficient urgency.

We’re not going to privilege arguments for more caution about projects over arguments for urgency or haste. Scrutinizing projects in their early stages is a good way to avoid errors of commission; but errors of omission (not going ahead with an ambitious project because of an unjustified amount of risk aversion, or oversensitivity to downsides over upsides) can be just as bad.

I do think this is a good note. As the contest notes, my willingness to offer criticism at all is good news. If something (EA, the contest, etc) isn’t interesting enough or lacks potential, criticism is wasted time. EA differs from the mainstream in lots of ways, and it seems all but certain that many of them involve EA not moving away from the mainstream far enough, even if the direction is mostly correct.

Confidence Levels

Since it was explicitly requested, it makes sense to spell out my confidence level explicitly.

I’m almost certain of most of the specific detailed observations about individual words or phrases, in terms of their effect on me. I am almost as confident that the result of these details is that the entries into the contest will be less important and less impactful than they otherwise would be, in expectation. I am less confident, but still pretty confident, that each of them individually has that effect in general. I am somewhere in between in my confidence that none of this is a coincidence.

I am highly confident that the core observation is right - that the post was sending out a vibe with the effect of discouraging important criticism in favor of superficial criticism or things that aren’t even criticism, and that this was a reflection of the intent (at some level) of the people and/or systems that led to the contest.

And I am highly confident that this is reflective of a broader problem.

Where I have the core disagreements with EA or otherwise have a unique philosophy, modesty considerations would say I need to not be confident. Of course, one of my disagreements is that I am skeptical of modesty arguments, but I do think that you reading this should be skeptical here unless you reason it out on your own.

What would change my mind on any or all of this? There’s no one particular detail that I know would do it, but learning I was repeatedly wrong about how people are reading and interpreting things seems like the most likely way. That’s not the only way to do it, but other ways seem like bigger overall updates that would be much harder to get to, and seem less likely to be right.

Another thing I could change my mind on is the worthwhileness of writing about various topics, which is a function of the extent to which:

I learn something and change my mind via writing and seeing reactions to it.

Others change their minds because of my writing and reactions to it.

Social consequences are good rather than bad and I end up in interesting and productive interactions as a result rather than stressful pointless arguments.

Writing is fun and feels like exploration and learning, not like forced work.

There is revealed willingness to pay for such work in whatever way, both because it justifies time spent and also it indicates that others value the work.

I have availability, which may become very limited if one of my projects goes sufficiently well, in which case that would almost certainly take priority.

I have a lot of uncertainty about all of these points. As noted above I have a fun little Manifold Markets market up on whether or not I’ll get at least a second prize.

Conclusion

This has been a critique about critiques, and about solicitation of and reactions to critiques. The core critique is that what is presenting itself as a request for important critiques is instead a request for superficial ones, because there is motivation to tell a story that important critiques are solicited rather than an actual appetite for fundamental critiques. Superficial critiques both support this storytelling and are actually welcome in their own right.

This has been something in-between. It gestures towards fundamental critiques, but doesn’t focus on actually making them or justifying them. Making the actual case properly is hard and time intensive, and signals are strong it is unwelcome and would not be rewarded.

In the interest of being actionable and constructive, what can be done?

A lot, even without considering the outcomes from more fundamental criticisms. Here are some places to start.

Judge the Contest Based Only on Whether Something Changed Your Mind. There’s nothing forcing the judges to Goodhart themselves here if they don’t want to. They still could simply not do so. That’s a good first step.

Continue Soliciting Criticism. Even when it has issues, still a good thing. Even better, solicit more fundamental critiques as explicitly as possible, in addition to watching out for actively discouraging such things.

Look for the Fnords. When designing a system for awarding money, whether it is grants, a contest, a job or something else, seek conscious awareness of what your system is telling people it wants and will reward, and what it is telling those evaluating to want and reward. Then ask if that’s what you want.

Vibe and Implication Matter. When you read or write something meant to induce behavior or belief, there is a kind of ‘listen for the vibe’ move that you need to make, and a ‘notice the implications my brain is picking up unconsciously’ related move as well. You need to draw them out into conscious awareness, then consider whether there is a problem. Then act accordingly.

Beware Self-Recommendations. If judgments on money and status are made exclusively by insiders who have gotten there in part by buying fully into the paradigm, and who have as a primary goal to direct resources towards the paradigm, the paradigm will not be questioned, and anything within it is at risk of also not being all that questioned. Get viewpoint diversity where it counts. That actually means avoiding having big panels in charge of such things - the only way to not have everything average out into consensus is to keep decision making at any given time on any given thing fast, nimble, flexible and concentrated on a small group, and ideally an individual. There should also be much less emphasis on whether something is EA or not, versus whether or not it is accomplishing something worthwhile and useful.

Question the Fundamentals. Everything from the core of utilitarianism on up, the full model of the world and how one ends up with a better one, needs to be up for grabs. It isn’t merely a thing to be dealt with in Eternal September after which everyone can move on. Which also means finding ways to have these discussions not be stuck in Eternal September.

Heed the Implications. And of course, when you do get information, use it.

This post contains specific examples/suggestions of details and wordings that would bring improvement, and one can build from there.

(Note: There is an additional copy of this and other highly EA-relevant posts at the EA Forum, and it is likely that some discussion will take place there.)

Épiphanie Gédéon (3y)

I feel that, while you went a level of meta up, this article really encapsulates why I am so hesitant about EA. I have several concerns about VNM utilitarianism applied to a global monolithic scope. My experience discussing them in the EA space is people looking at me funny and something along the lines of "How can you be against it though?"

By far the main problem I have, which I feel you pointed out here elegantly, is how prone EA is to congratulate itself on being so willing to change its mind, without a broader interrogation of what that even means. (I remember Julia Galef saying something like "There was an EA forum post voicing criticism of the movement, to which others cheered on and said 'well said, here are some more criticisms'" which I do not think is a receptive reaction.)

Another concern I have is that "Effective Altruism" is eating the memetic space of "doing altruism effectively". That is, there is more and more a conflation between the general idea of "let's do the most good" and the set of values and memespace EA has developed. I find it makes communicating about it a lot harder.

Thank you for writing this article. If there were an effective altruism movement receptive to it, I would feel less hesitation about the whole thing.

Zvi (2y), Review for 2022 Review

This post was, in the end, largely a failed experiment. It did win a lesser prize, and in a sense that proved its point, and I had fun doing it, but I do not think it successfully changed minds, and I don't think it has lasting value, although someone gave it a +9 so it presumably worked for them. The core idea - that EA in particular wants 'criticism' but it wants it in narrow friendly ways and it discourages actual substantive challenges to its core stuff - does seem important. But also this is LW, not EA Forum. If I had to do it over again, I wouldn't bother writing this.

Daniel (2y)

I am surprised to hear this, especially “I don't think it has lasting value”. In my opinion, this post has aged incredibly well. Reading it now, knowing that the EA criticism contest utterly failed to do one iota of good with regards to stopping the giant catastrophe on the horizon (FTX), and seeing that the top prizes were all given to long, well-formatted essays providing incremental suggestions on heavily trodden topics while the one guy vaguely gesturing at the actual problem (https://forum.effectivealtruism.org/posts/T85NxgeZTTZZpqBq2/the-effective-altruism-movement-is-not-above-conflicts-of) gets ignored, cements this as one of your more prophetic works.

ShardPhoenix (3y)

I generally agree with what I read of this but wasn't sure if I should upvote since it was far too long and I skipped most of the last 1/3 or so. So I compromised by upvoting but also making this comment.

mingyuan (3y)

Thanks for this! I've recently been thinking about why I'm so turned off by the writing on the EA Forum, effectivealtruism.org, and centreforeffectivealtruism.org, and I was struggling to put my aversion into words. I hit upon the evangelism, and the attitude that the core tenets upon which every post rested were obviously objectively correct, but as usual you say it far better and in more detail

Raemon (3y)

(Mod note: I'm kinda unsure whether to frontpage this. The content seems mostly like a fairly longterm, timeless-ish discussion of longstanding philosophical disagreements that'll continue to be relevant for awhile. The exact prompt is a fairly time-sensitive contest. I'm frontpaging it for now but open to counterargument)

awlego (3y)

I'm curious, for each of the proposed tenets of EA, how many people in the LW/EA communities would agree that 1) they themselves hold it and 2) it is actually a key tenet of the EA philosophy.

ChristianKl (3y)

This does call for someone running a survey.

Said Achmiz (3y)

Effective Altruism, in this model, very much wants criticism of its tactics, and mostly wants them also of its strategy, but only within the framework.

Relevant: Four Scopes of Advice.

Anthony DiGiovanni (3y)

I notice that I strongly disagree with a majority of them (#1, #2, #4, #8, #10, #11, #13, #14, #15, #17, #18, #21)

Re: #2, what do you consider to be The Bad other than suffering?

Noosphere89 (3y)

There's one final assumption in EA, separate from my disagreements with some other assumptions, and that is objective morality, or the idea that there's one right morality or CEV, rather than individual or cultural relativism about morality, under which, in the limit, it's divergent rather than convergent. Suffice it to say I have massive disagreements here, so that's another hidden assumption.

Algon (3y)

By "Utilitarianism" do you mean the original total utilitarianism?

Zvi (3y)

Not entirely, but basically yes.

Algon (3y)

OK, that makes a lot more sense. I parse "utilitarianism" as "total order over lotteries of world states" so your disagreement with "utilitarianism" threw me for a loop.

Jack R (3y)

I think there should be a word for your parsing, maybe "VNM utilitarianism," but I think most people mean roughly what's on the wiki page for utilitarianism:

Utilitarianism is a family of normative ethical theories that prescribe actions that maximize happiness and well-being for all affected individuals

Corm (3y)

It seems to me like first and second teacher are getting mixed up.

"The second teacher wants something useful, a thought out and justified view on population ethics that doesn’t get too lost in the weeds."

"There are times and places where you want the second teacher (or something in between them) rather than the first one. This does not seem like it is one of those places."

Don't we want the second teacher in this place?

Yes. Good call.
