
Bayesian Adjustment Does Not Defeat Existential Risk Charity

37 steven0461 17 March 2013 08:50AM

(This is a long post. If you’re going to read only part, please read sections 1 and 2, subsubsection 5.6.2, and the conclusion.)

1. Introduction

Suppose you want to give some money to charity: where can you get the most bang for your philanthropic buck? One way to make the decision is to use explicit expected value estimates. That is, you could get an unbiased (averaging to the true value) estimate of what each candidate for your donation would do with an additional dollar, and then pick the charity associated with the most promising estimate.

Holden Karnofsky of GiveWell, an organization that rates charities for cost-effectiveness, disagreed with this approach in two posts he made in 2011. This is a response to those posts, addressing the implications for existential risk efforts.

According to Karnofsky, high returns are rare, and even unbiased estimates don’t take into account the reasons why they’re rare. So in Karnofsky’s view, our favorite charity shouldn’t just be one associated with a high estimate; it should be one that supports the estimate with robust evidence derived from multiple independent lines of inquiry.1 If a charity’s returns are being estimated in a way that intuitively feels shaky, maybe that means the fact that high returns are rare should outweigh the fact that high returns were estimated, even if the people making the estimate were doing an excellent job of avoiding bias.

Karnofsky’s first post, Why We Can’t Take Expected Value Estimates Literally (Even When They’re Unbiased), explains how one can mitigate this issue by supplementing an explicit estimate with what Karnofsky calls a “Bayesian Adjustment” (henceforth “BA”). This method treats estimates as merely noisy measures of true values. BA starts with a prior representing the distribution of cost-effectiveness values in the general population of charities, then treats the estimate as evidence and updates the prior into a posterior in standard Bayesian fashion.

Karnofsky provides some example graphs, illustrating his preference for robustness. If the estimate error is small, the posterior lies close to the explicit estimate. But if the estimate error is large, the posterior lies close to the prior. In other words, if there simply aren’t many high-return charities out there, a sharp estimate can be taken seriously, but a noisy estimate that says it has found a high-return charity must represent some sort of fluke.
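To make the behavior described above concrete, here is a minimal sketch of the update, assuming a normal prior and a normally distributed, unbiased estimate error. The function name and all numbers are illustrative assumptions, not taken from Karnofsky's posts.

```python
def bayesian_adjustment(prior_mean, prior_sd, estimate, estimate_sd):
    """Posterior (mean, sd) for a normal prior updated on one normal, unbiased estimate.

    Hypothetical helper for illustration; assumes the conjugate normal-normal case.
    """
    prior_precision = 1.0 / prior_sd ** 2
    estimate_precision = 1.0 / estimate_sd ** 2
    posterior_precision = prior_precision + estimate_precision
    # The posterior mean is a precision-weighted average of prior mean and estimate.
    posterior_mean = (
        prior_mean * prior_precision + estimate * estimate_precision
    ) / posterior_precision
    posterior_sd = posterior_precision ** -0.5
    return posterior_mean, posterior_sd


# Same explicit estimate (10 units), same prior (mean 0, sd 1), different estimate error.
sharp_mean, sharp_sd = bayesian_adjustment(0.0, 1.0, 10.0, 0.1)   # small error
noisy_mean, noisy_sd = bayesian_adjustment(0.0, 1.0, 10.0, 10.0)  # large error

print(f"sharp estimate -> posterior mean {sharp_mean:.2f}")
print(f"noisy estimate -> posterior mean {noisy_mean:.2f}")
```

Under these assumptions the sharp estimate leaves the posterior mean close to 10, while the noisy one leaves it close to the prior mean of 0.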

Karnofsky does not advocate a policy of performing an explicit adjustment. Rather, he uses BA to emphasize that estimates are likely to be inadequate if they don’t incorporate certain kinds of intuitions — in particular, a sense of whether all the components of an estimation procedure feel reliable. If intuitions say an estimate feels shaky and too good to be true, then maybe the estimate was noisy and the prior is more important. On the other hand, if intuitions say an estimate has taken everything into account, then maybe the estimate was sharp and outweighs the prior.

Karnofsky’s second post, Maximizing Cost-Effectiveness Via Critical Inquiry, expands on these points. Where the first post looks at how BA is performed on a single charity at a time, the second post examines how BA affects the estimated relative values of different charities. In particular, it assumes that although the charities are all drawn from the same prior, they come with different estimates of cost-effectiveness. Higher estimates of cost-effectiveness come from estimation procedures with proportionally higher uncertainty.

It turns out that higher estimates aren’t always more auspicious: an estimate may be “too good to be true,” concentrating much of its evidential support on values that the prior already rules out for the most part. On the bright side, this effect can be mitigated via multiple independent observations, and such observations can provide enough evidence to solidify higher estimates despite their low prior probability.
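The "too good to be true" effect can be sketched with the same kind of toy model. The sketch assumes a normal prior centered on zero, an estimate whose standard deviation is proportional to the estimate itself, and (for the multiple-observation case) independent observations that all agree on the same value. The function and parameter names are hypothetical, and the numbers are chosen only to show the shape of the effect.

```python
def posterior_mean(estimate, noise_ratio=1.0, prior_sd=1.0, n_observations=1):
    """Posterior mean under a zero-centered normal prior.

    Assumes each observation has sd = noise_ratio * estimate, and that
    n_observations independent observations all report the same value.
    """
    estimate_sd = noise_ratio * estimate
    prior_precision = 1.0 / prior_sd ** 2
    evidence_precision = n_observations / estimate_sd ** 2
    return estimate * evidence_precision / (prior_precision + evidence_precision)


# One observation each: the posterior mean rises, peaks, then falls as the estimate grows.
single = {x: posterior_mean(x) for x in (0.5, 1.0, 2.0, 10.0)}

# Many independent observations agreeing on the high estimate restore most of its weight.
repeated = posterior_mean(10.0, n_observations=100)

for x, mean in single.items():
    print(f"estimate {x:5.1f} -> posterior mean {mean:.3f}")
print(f"estimate  10.0 with 100 observations -> posterior mean {repeated:.3f}")
```

In this toy setup an estimate of 10 ends up with a lower posterior mean than an estimate of 1, unless it is backed by many independent observations.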

Charities aiming to reduce existential risk have a potential claim to high expected returns, simply because of the size of the stakes. But if such charities are difficult to evaluate, and the prior probability of high expected values is low, then the implications of BA for this class of charities loom large.

This post will argue that competent efforts to reduce existential risk are still likely to be optimal, despite BA. The argument will have three parts:

  1. BA differs from fully Bayesian reasoning, so that BA risks double-counting priors.

  2. The models in Karnofsky’s posts, when applied to existential risk, boil down to our having prior knowledge that the claimed returns are virtually impossible. (Moreover, similar models without extreme priors don’t lead to the same conclusions.)

  3. We don’t have such prior knowledge. Extreme priors would have implied false predictions in the past, imply unphysical predictions for the future, and are justified neither by our past experiences nor by any other considerations.

Claim 1 is not essential to the conclusion. While Claim 2 seems worth expanding on, it’s Claim 3 that makes up the core of the controversy. Each of these concerns will be addressed in turn.

Before responding to the claims themselves, however, it’s worth discussing a highly simplified model that will illustrate what Karnofsky’s basic point is.

Comment author: steven0461 16 March 2013 09:36:45PM 5 points

The post discusses the limiting case where astronomical waste has zero importance and the only thing that matters is saving present lives. Extending that to the case where astronomical waste has some finite level of importance based on time discounting seems like a matter of interpolating between full astronomical waste and no astronomical waste.

Comment author: steven0461 16 March 2013 02:41:03AM 5 points

Thanks for your detailed comment! I certainly agree that, if one takes into account ripple effects where saving lives leads to reduced existential risk, the disparities between direct ways of reducing existential risk on the one hand and other efficient ways of saving people's lives on the other hand are no longer astronomical in size. I learned of this argument partway into writing the post, and subsection 5.5 was meant to address it, but it's quite rough and far from the final word on that subject, particularly if you compare direct efforts to medium-direct efforts rather than to very indirect efforts.

It sounds as though, to model your intuitions on the situation, instead of putting a probability distribution on how many DALYs one could save by donating a dollar to a given charity, we'd instead have to put a probability distribution on what % of existential risk you could rationally expect to reduce by donating one dollar to a given charity. Does that sound right?

I would weakly guess that such a model would favor direct over semi-direct existential risk reduction and strongly guess that such a model would favor direct over indirect existential risk reduction. This is just based on thinking that some of the main variables relevant to existential risk are being pushed on by few enough people, and in ways that are sufficiently badly thought through, that there's likely to be low-hanging fruit to be picked by those who analyze the issues in a sufficiently careful and calculating manner. But this is a pretty vague and sketchy argument, and it definitely seems worth discussing this sort of model more thoroughly.

Comment author: steven0461 16 March 2013 02:04:28AM *  3 points

"I have a feeling that the fundamental difference between your position and GiveWell's arises not from a difference of opinion regarding mathematical arguments but because of a difference of values."

Karnofsky has, as far as I know, not endorsed measures of charitable effectiveness that discount the utility of potential people. (On the other hand, as Nick Beckstead points out in a different comment and as is perhaps under-emphasized in the current version of the main post, neither has Karnofsky made a general claim that Bayesian adjustment defeats existential risk charity. He has only explicitly come out against "if there's even a chance" arguments. But I think that in the context of his posts being reposted here on LW, many are likely to have interpreted them as providing a general argument that way, and I think it's likely that the reasoning in the posts has at least something to do with why Karnofsky treats the category of existential risk charity as merely promising rather than as a main focus. For MIRI in particular, Karnofsky has specific criticisms that aren't really related to the points here.)

"In particular, valuing potential persons at 0 negates many arguments that rely on speculative numbers to pump expected utility into the present, and I'm not even sure if it's not right."

While valuing potential persons at 0 makes existential risk versus other charities a closer call than if you included astronomical waste, I think the case is still fairly strong that the best existential risk charities save more expected currently-existing lives than the best other charities. The estimate from Anna Salamon's talk linked in the main post makes investment into AI risk research roughly 4 orders of magnitude better for preventing the deaths of currently existing people than international aid charities. At the risk of anchoring, my guess is that the estimate is likely to be an overestimate, but not by 4 orders of magnitude. On the other hand, there may be non-existential risk charities that achieve greater returns in present lives but that also have factors barring them from being recommended by GiveWell.

Comment author: steven0461 16 March 2013 01:38:55AM *  1 point

I agree: the argument given here doesn't address whether existential risk charities are likely to be helpful or actively harmful. The fourth paragraph of the conclusion and various caveats like "basically competent" were meant to limit the scope of the discussion to only those whose effects were mostly positive rather than negative. Carl Shulman suggested in a feedback comment that one could set up an explicit model where one multiplies (1) a normal variable centered on zero, or with substantial mass below zero, intended to describe uncertainty about whether the charity has mostly positive or mostly negative effects, with (2) a thicker-tailed and always positive variable describing uncertainty about the scale the charity is operating on.
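A rough sketch of what such a product model might look like, purely as an illustration: the direction variable is normal with some mass below zero, and the scale variable is taken to be lognormal, which is one possible choice of a thick-tailed, always-positive distribution and not something specified in the suggestion. All names and parameter values are assumptions.

```python
import random


def sample_impact(rng, direction_mean=0.5, direction_sd=1.0,
                  log_scale_mean=0.0, log_scale_sd=2.0):
    """Draw one impact value as (direction) * (scale).

    direction: normal; negative draws stand for mostly harmful effects.
    scale: lognormal (assumed here), always positive and thick-tailed.
    """
    direction = rng.gauss(direction_mean, direction_sd)
    scale = rng.lognormvariate(log_scale_mean, log_scale_sd)
    return direction * scale


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [sample_impact(rng) for _ in range(100_000)]

# Fraction of draws in which the charity's net effect is negative.
share_negative = sum(s < 0 for s in samples) / len(samples)
print(f"share of draws with negative impact: {share_negative:.3f}")
```

Because the scale is always positive, the sign of each draw comes entirely from the direction variable, so the share of negative outcomes is set by how much of the normal's mass lies below zero.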

Comment author: steven0461 16 March 2013 01:32:46AM 0 points

Yes. There's a choice as to what to put into the prior and what to put into the likelihood. This makes it more difficult to make claims like "this number is a reasonable prior and this one is not". Instead, one has to specify the population the prior is about, and this in turn affects what likelihood ratios are reasonable.

Comment author: steven0461 16 March 2013 01:28:46AM *  20 points

The project was initially described as synthesizing some of the comments on Karnofsky's post into a response mentioning counterintuitive implications of the approach, or into whichever synthesis of responses I thought was accurate.

Comment author: steven0461 06 January 2013 10:04:22PM 2 points

What if it is not, in fact, one of the things you're uncertain about?

Comment author: steven0461 06 January 2013 09:53:02PM *  2 points

This and models similar to it.

Comment author: steven0461 06 January 2013 09:41:25PM *  2 points

I'd be curious to see someone reply to this on behalf of parliamentary models, whether applied to preference aggregation or to moral uncertainty between different consequentialist theories. Do the choices of a parliament reduce to maximizing a weighted sum of utilities? If not, which axiom out of 1-3 do parliamentary models violate, and why are they viable despite violating that axiom?
