Who said my utility function was unbounded? (Which, BTW, is the same as my reply to the Pascal's Mugger in the wording “create 3^^^3 units of disutility”.)
No one - he just said you don't have infinite confidence that your utility function is bounded.
I am cross-posting this GiveWell Blog post, a follow-up to an earlier cross-post I made. Here I provide a slightly more fleshed-out model that helps clarify the implications of Bayesian adjustments to cost-effectiveness estimates. It illustrates how it can be rational to take a "threshold" approach to cost-effectiveness, asking that actions/donations meet a minimum bar for estimated cost-effectiveness but otherwise focusing on robustness of evidence rather than magnitude of estimated impact.
We've recently been writing about the shortcomings of formal cost-effectiveness estimation (i.e., trying to estimate how much good, as measured in lives saved, DALYs or other units, is accomplished per dollar spent). After conceptually arguing that cost-effectiveness estimates can't be taken literally when they are not robust, we found major problems in one of the most prominent sources of cost-effectiveness estimates for aid, and generalized from these problems to discuss major hurdles to usefulness faced by the endeavor of formal cost-effectiveness estimation.
Despite these misgivings, we would be determined to make cost-effectiveness estimates work, if we thought this were the only way to figure out how to allocate resources for maximal impact. But we don't. This post argues that when information quality is poor, the best way to maximize cost-effectiveness is to examine charities from as many different angles as possible - looking for ways in which their stories can be checked against reality - and support the charities that have a combination of reasonably high estimated cost-effectiveness and maximally robust evidence. This is the approach GiveWell has taken since our inception, and it is more similar to investigative journalism or early-stage research (other domains in which people look for surprising but valid claims in low-information environments) than to formal estimation of numerical quantities.
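The rest of this post lays out a conceptual illustration of this argument and then discusses how it relates to the GiveWell approach.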
Conceptual illustration
I previously laid out a framework for making a "Bayesian adjustment" to a cost-effectiveness estimate. I stated (and posted the mathematical argument) that when considering a given cost-effectiveness estimate, one must also consider one's prior distribution (i.e., what is predicted for the value of one's actions by other life experience and evidence) and the variance of the estimate error around the cost-effectiveness estimate (i.e., how much room for error the estimate has). This section works off of that framework to illustrate the potential importance of examining charities from multiple angles - relative to formally estimating their cost-effectiveness - in low-information environments.
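For reference, the adjustment described above can be written in the standard normal-normal form. This is a sketch under the assumption that both the prior and the estimate error are normally distributed; the symbols are my own notation and are not taken from the earlier post: $\mu_0$ and $\sigma_0$ are the prior mean and standard deviation, $x$ is the cost-effectiveness estimate, $\sigma$ is the standard deviation of the estimate error, and $\theta$ is the true value.

```latex
\[
\mathbb{E}[\theta \mid x]
  = \frac{\mu_0/\sigma_0^{2} + x/\sigma^{2}}{1/\sigma_0^{2} + 1/\sigma^{2}},
\qquad
\operatorname{Var}[\theta \mid x]
  = \frac{1}{1/\sigma_0^{2} + 1/\sigma^{2}}.
\]
```

The noisier the estimate (the larger $\sigma$), the less weight it gets relative to the prior.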
I don't wish to present this illustration either as official GiveWell analysis or as "the reason" that we believe what we do. This is more of an illustration/explication of my views than a justification; GiveWell has implicitly (and intuitively) operated consistently with the conclusions of this analysis, long before we had a way of formalizing these conclusions or the model behind them. Furthermore, while the conclusions are broadly shared by GiveWell staff, the formal illustration of them should only be attributed to me.
The model
Suppose that:
The implications
I use "initial estimate" to refer to the formal cost-effectiveness estimate you create for a charity - along the lines of the DCP2 estimates or Back of the Envelope Guide estimates. I use "final estimate" to refer to the cost-effectiveness you should expect, after considering your initial estimate and making adjustments for the key other factors: your prior distribution and the "estimate error" variance around the initial estimate. The following chart illustrates the relationship between your initial estimate and final estimate based on the above assumptions.
Note that the curve peaks at X=1 (the "inflection point" referred to in the rest of this post), past which your final estimate falls as your initial estimate rises. With such a rough estimate, your final estimate can never exceed 0.5, no matter how high your initial estimate says the value is. Once your initial estimate goes "too high," raising it further only lowers the final estimated cost-effectiveness.
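The relationship described here can be reproduced with a few lines of Python. This is a minimal sketch, not the original chart's source: it assumes a normal prior with mean 0 and standard deviation 1 (an assumption on my part, chosen because it matches the stated peak of 0.5 at X=1) and an initial estimate X whose error has standard deviation X. The function name `final_estimate` is hypothetical.

```python
def final_estimate(x, prior_mean=0.0, prior_sd=1.0):
    """Posterior mean for one initial estimate x whose error sd equals x.

    Assumes a normal prior (default N(0, 1)) and normal estimate error.
    """
    if x <= 0:
        raise ValueError("x must be positive, since it doubles as a standard deviation")
    prior_precision = 1.0 / prior_sd ** 2
    estimate_precision = 1.0 / x ** 2  # error sd is x, so the variance is x^2
    weighted_sum = prior_mean * prior_precision + x * estimate_precision
    return weighted_sum / (prior_precision + estimate_precision)


# With the default prior this reduces to x / (1 + x^2).
for x in (0.5, 1.0, 2.0, 10.0, 100.0):
    print(f"initial estimate {x:>6}: final estimate {final_estimate(x):.4f}")
```

Under these assumptions an initial estimate of 100 yields a final estimate of about 0.01, far below the 0.5 reached at X=1.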
This is in some ways a counterintuitive result. A couple of ways of thinking about it:
Now suppose that you make another, independent estimate of the good accomplished by your $1000, for the same charity. Suppose that this estimate is equally rough and comes to the same conclusion: it again has a value of X and a standard deviation of X. So you have two separate, independent "initial estimates" of good accomplished, and both are N(X,X). Properly combining these two estimates into one yields an estimate with the same average (X) but less "estimate error" (standard deviation = X/sqrt(2)). Now the relationship between X and adjusted expected value changes:
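A short sketch of this combination step, using inverse-variance weighting for independent normal estimates. The N(0, 1) prior is again my assumption, and the helper names `combine_independent` and `adjust_to_prior` are hypothetical.

```python
import math


def combine_independent(estimates):
    """Inverse-variance weighting of independent (mean, sd) normal estimates."""
    precisions = [1.0 / sd ** 2 for _, sd in estimates]
    total_precision = sum(precisions)
    mean = sum(m * p for (m, _), p in zip(estimates, precisions)) / total_precision
    return mean, math.sqrt(1.0 / total_precision)


def adjust_to_prior(mean, sd, prior_mean=0.0, prior_sd=1.0):
    """Posterior mean after weighing a normal estimate against a normal prior."""
    prior_precision = 1.0 / prior_sd ** 2
    estimate_precision = 1.0 / sd ** 2
    weighted_sum = prior_mean * prior_precision + mean * estimate_precision
    return weighted_sum / (prior_precision + estimate_precision)


x = 2.0
combined_mean, combined_sd = combine_independent([(x, x), (x, x)])
print(combined_mean, combined_sd)                    # same mean, sd shrinks to x / sqrt(2)
print(adjust_to_prior(x, x))                         # one estimate
print(adjust_to_prior(combined_mean, combined_sd))   # two estimates
```

With X = 2, the final estimate under these assumptions rises from 0.4 with one estimate to about 0.67 with two.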
Now you have a higher maximum (for the final estimated good accomplished) and a later inflection point - higher estimates can be taken more seriously. But it's still the case that "too high" initial estimates lead to lower final estimates.
The following charts show what happens if you manage to collect even more independent cost-effectiveness estimates, each one as rough as the others, each one with the same midpoint as the others (i.e., each is N(X,X)).
The pattern here is that when you have many independent estimates, the key figure is X, or "how good" your estimates say the charity is. But when you have very few independent estimates, the key figure is K - how many different independent estimates you have. More broadly - when information quality is good, you should focus on quantifying your different options; when it isn't, you should focus on raising information quality.
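The pattern can be checked numerically. The sketch below assumes a N(0, 1) prior (my assumption) and K independent estimates that are each N(X, X); under those assumptions the final estimate works out to KX / (X^2 + K), which peaks at X = sqrt(K) with a value of sqrt(K)/2. The function names are hypothetical, and the grid search is only there to confirm the closed form.

```python
import math


def final_estimate(x, k):
    """Posterior mean under a N(0, 1) prior given k independent N(x, x) estimates."""
    return k * x / (x ** 2 + k)


def peak(k, grid_max=20.0, steps=20000):
    """Grid-search the initial estimate x that maximizes the final estimate."""
    xs = [grid_max * i / steps for i in range(1, steps + 1)]
    best_x = max(xs, key=lambda x: final_estimate(x, k))
    return best_x, final_estimate(best_x, k)


for k in (1, 2, 4, 16, 100):
    best_x, best_value = peak(k)
    print(f"K={k:>3}: peak near X={best_x:.3f} (sqrt(K)={math.sqrt(k):.3f}), "
          f"max final estimate {best_value:.3f}")
```

With few estimates the ceiling on the final estimate is low and set by K; with many, the ceiling is far enough away that X dominates.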
A few other notes:
Instead, when I think about how to improve the robustness of evidence and thus reduce the variance of "estimate error," I think about examining a charity from different angles - asking critical questions and looking for places where reality may or may not match the basic narrative being presented. As one collects more data points that support a charity's basic narrative (and weren't known to do so prior to investigation), the variance of the estimate falls, which is the same thing that happens when one collects more independent estimates. (Though it doesn't fall as much with each new data point as it would with one of the idealized "fully independent cost-effectiveness estimates" discussed above.)
While other distributions may involve later/higher inflection points than normal distributions, the general point that there is a threshold past which higher initial estimates no longer translate to higher final estimates holds for many distributions.
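The earlier note that a new data point reduces variance less than a fully independent estimate would can be sketched by letting the estimates be correlated. Everything specific here is an illustrative assumption of mine rather than part of the model above: the K estimates are equally rough (standard deviation X), share a common pairwise correlation `rho`, are combined by a simple average, and are weighed against a N(0, 1) prior.

```python
import math


def sd_of_average(x, k, rho):
    """Sd of the mean of k estimates, each with sd x and pairwise correlation rho."""
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1] for this sketch")
    variance = x ** 2 * (1.0 + (k - 1) * rho) / k
    return math.sqrt(variance)


def final_estimate(x, k, rho):
    """Posterior mean of the averaged estimate under an assumed N(0, 1) prior."""
    variance = sd_of_average(x, k, rho) ** 2
    return x / (1.0 + variance)


x, k = 2.0, 4
for rho in (0.0, 0.5, 1.0):
    print(f"rho={rho}: sd {sd_of_average(x, k, rho):.3f}, "
          f"final estimate {final_estimate(x, k, rho):.3f}")
```

At `rho = 0` this matches the fully independent case; at `rho = 1` the extra estimates add nothing and the result is the same as with a single estimate.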
The GiveWell approach
Since the beginning of our project, GiveWell has focused on maximizing the amount of good accomplished per dollar donated. Our original business plan (written in 2007 before we had raised any funding or gone full-time) lays out "ideal metrics" for charities such as
Early on, we weren't sure whether we would find good enough information to quantify these sorts of things. After some experience, we came to the view that most cost-effectiveness analysis in the world of charity is extraordinarily rough, and we then began using a threshold approach, preferring charities whose cost-effectiveness is above a certain level but not distinguishing past that level. This approach is conceptually in line with the above analysis.
It has been remarked that "GiveWell takes a deliberately critical stance when evaluating any intervention type or charity." This is true, and in line with how the above analysis implies one should maximize cost-effectiveness. We generally investigate charities whose estimated cost-effectiveness is quite high in the scheme of things, and so for these charities the most important input into their actual cost-effectiveness is the robustness of their case and the number of factors in their favor. We critically examine these charities' claims and look for places in which they may turn out not to match reality; when we investigate these and find confirmation rather than refutation of charities' claims, we are finding new data points that support what they're saying. We're thus doing something conceptually similar to "increasing K" according to the model above. We've recently written about all the different angles we examine when strongly recommending a charity.
We hope that the content we've published over the years, including recent content on cost-effectiveness (see the first paragraph of this post), has made it clear why we think we are in fact in a low-information environment, and why, therefore, the best approach is the one we've taken, which is more similar to investigative journalism or early-stage research (other domains in which people look for surprising but valid claims in low-information environments) than to formal estimation of numerical quantities.
As long as the impacts of charities remain relatively poorly understood, we feel that focusing on robustness of evidence holds more promise than focusing on quantification of impact.
*This implies that the variance of your estimate error depends on the estimate itself. I think this is a reasonable thing to suppose in the scenario under discussion. Estimating cost-effectiveness for different charities is likely to involve using quite disparate frameworks, and the value of your estimate does contain information about the possible size of the estimate error. In our model, what stays constant across back-of-the-envelope estimates is the probability that the "right estimate" would be 0; this seems reasonable to me.
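One way to see what stays constant: if an estimate is N(X, X), zero always sits exactly one standard deviation below the estimate, so the chance that the true value is zero or below is the same for every X. The sketch below reads "the probability that the right estimate would be 0" as P(value <= 0), which is my interpretation; the function name is hypothetical.

```python
import math


def prob_at_or_below_zero(x):
    """P(value <= 0) when value ~ Normal(mean=x, sd=x), for x > 0."""
    z = (0.0 - x) / x  # always -1: zero is one sd below the estimate
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


for x in (0.5, 1.0, 10.0, 1000.0):
    print(f"X={x:>7}: P(value <= 0) = {prob_at_or_below_zero(x):.4f}")
```

Every line prints the same probability, roughly 0.16, regardless of how large the initial estimate is.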