Given that the standard argument for diversification is to avoid risk, an argument based on a model which assumes no risk isn't going to be convincing. You need a model which does assume risk, and you need to actually show that the risk makes no difference, rather than assuming it away at the outset.
Diversification makes sense in a preservation context, where any asset you have could devalue by a near-arbitrary factor - and then it only matters because the utility of going near 0 assets is really really negative, so you try to steer around that by not letting one catastrophe sink you.
When handing out goodies, there is no such extreme penalty for it ending up worthless. It would be bad, but since it's only linearly bad, that's taken into account fully by taking the expected return.
The linked article (and, to an extent, this one) makes two critical assumptions that undermine its relevance.
I don't think the argument is parallel. Instead, consider:
If you're giving to charity anyway, give to the charity that has the highest expected impact. If you're voting anyway, vote for the candidate with the highest expected impact.
Here, you have optimal philanthropy plus voting against lizards.
But there is no analog to splitting up your vote. To the extent that there can be (say, when an election fills multiple co-equal seats on a council, you get multiple votes, and you can apply more than one of them to the same candidate) and several candidates have similar merit, the same arguments about charity splitting apply.
But there is no analog to splitting up your vote
Sure (to the extent that we are considering the effects of “what if everyone used the algorithm I'm using”): you vote for the Greens with probability p and for the Blues with probability 1 - p.
Downvoted for horribly obfuscating a trivial and uninteresting statement: donate everything to the charity with the highest bang for the buck, because (the quote is from the link to Landsburg):
no matter how much you give to CARE, you will never make a serious dent in the problem of starving children. The problem is just too big; behind every starving child is another equally deserving child.
One problem with this argument is that you only guess that CARE is the best charity to donate to, and make no allowance for how certain your guess is.
I agree that this is a simple statement that shouldn't need this kind of elaboration. Unfortunately, some people don't agree with the statement. I'm hoping that when written out in such excruciating detail, the argument gets a better chance of finally getting communicated, in those cases where a much more lightweight explanation, such as the one you've cited, doesn't do the trick. (Added more disclaimers to the first paragraph, describing when there is probably no point in reading the post.)
As I noted in my other comment, this argument just makes a more precise version of the original mistake. You could just as well say that:
no matter how much you care about the election, your vote will never make a serious dent in the outcome. There are just too many other voters. Therefore, you shouldn't bother voting against the lizards who just agreed to reduce permitted human lifespans to 34 years.
Incidentally, Landsburg advises against voting, for exactly the same reason, so it's worth pointing out that if you don't accept that argument there, you shouldn't accept it here, either.
I should also add that this doesn't mean the argument is wrong; if you agree with not-voting and not-charity-splitting, fine. But you should make it with knowledge of the parallel.
But all that the longer argument adds over the short one is obfuscation of the assumptions that go into it.
Unfortunately, some people don't agree with the statement. I'm hoping that when written out in such excruciating detail, the argument gets a better chance of finally getting communicated, in those cases where a much more lightweight explanation, such as the one you've cited, doesn't do the trick.
The keyword in the grandparent was "obfuscating." I've done linear programming for half of my life and I couldn't tell that's what you were getting at in the OP.
TL;DR: Taking money from your top preference to give to your second preference only makes sense if your contribution is so huge that you're individually changing the whole situation in a noticeable way.
Does the analysis change if one is uncertain about the effectiveness of the charities, or can any uncertainty just be rolled up into a calculation of expected effectiveness?
To take an extreme example, given a charity that I am sure is producing one QALY per dollar donated (I consider QALYs a better measure than "lives saved", since all lives are lost in the end), and one which I think might be creating 3 QALY/$ but might equally likely be a completely useless effort, which should I donate to? Assume I've already taken all reasonable steps to collect evidence and this is the best assessment I can make.
Thinking further about my own question, it would depend on whether one values not just QALYs, but confidence that one had indeed bought some number of QALYs -- perhaps parameterised by the mean and standard deviation respectively of the effectiveness estimates. But that leads to an argument for diversification, for the same reasons as in investing: uncorrelated uncertainties tend to cancel out.
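To make that concrete, here's a rough Monte Carlo sketch of the usual variance-reduction argument; the effectiveness figures and noise level are invented purely for illustration.

```python
import random

# If each charity's QALYs-per-dollar estimate has independent noise, spreading a fixed
# donation over n charities keeps the expected QALYs the same but shrinks the spread
# of outcomes roughly as 1/sqrt(n). All figures here are invented for illustration.
MEAN, SD, DONATION = 1.0, 0.5, 1_000.0

def qalys_bought(n_charities, trials=50_000):
    """Simulated total QALYs bought when the donation is split evenly over n charities."""
    results = []
    for _ in range(trials):
        per_charity = DONATION / n_charities
        results.append(sum(per_charity * random.gauss(MEAN, SD) for _ in range(n_charities)))
    return results

for n in (1, 4, 16):
    outcomes = qalys_bought(n)
    mean = sum(outcomes) / len(outcomes)
    sd = (sum((x - mean) ** 2 for x in outcomes) / len(outcomes)) ** 0.5
    print(n, round(mean), round(sd))  # mean stays ~1000; sd falls from ~500 to ~250 to ~125
```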
Thinking further about my own question, it would depend on whether one values not just QALYs, but confidence that one had indeed bought some number of QALYs
In other words, it depends whether you donate to help people, or to make yourself feel good.
The function U need not be based on what a third party thinks one's values should be. Donating to make oneself feel good is a perfectly rational reason, provided one values the warm fuzzy feelings more than the money.
Fair enough, the argument does not hold in that case. If you are donating to make yourself feel good then you should diversify.
However, if you are donating to make yourself feel good, i.e. if you value confidence as well as QALYs, then your preference relation is no longer given by U, as this implies that you care differently depending on whether you bought the QALYs or someone else did, so your preferences are not a function solely of the number of antelope and the number of babies.
The only qualification of U is that its values map to my preferences and that it has transitive values, such that if U(a1,b1)>U(a2,b2)>U(a3,b3), then U(a3,b3)<U(a1,b1). There is no requirement that the arguments of U be measured in terms of dollars; the arguments could easily be the non-real sum of the monies provided by others and the monies provided by me.
U is a function of the number of antelope and the number of babies. By the law of transparency, it doesn't care whether there are 100 antelope because you saved them or because someone else did. If you do care, then your preference function cannot be described as a function on this domain.
As defined in the original post, U is a function of the total amount of money given to charities A and B. There is no restriction that more money results in more antelope or babies saved, nor that the domain of the function is limited to positive real numbers.
Or are you saying that if I care about whether I help do something important, then my preferences must be non-transitive?
He writes U(A, B), where A is the number of antelope saved and B is the number of babies saved. If you care about anything other than the number of antelope saved or the number of babies saved then U does not completely describe your preferences. Caring about whether you save the antelope or someone else does counts as caring about something other than the number of antelope saved. Unless you can exhibit a negative baby or a complex antelope, you must accept that this domain is limited to positive numbers.
He later derives from U a function of the amount of money given; strictly speaking this is a completely different function, which is only denoted by U for convenience. However, the fact that U was initially defined in the previous way means it may have constraints other than transitivity.
To give an example, let f be any function on the real numbers. f currently has no constraints. We can make f into a function of vectors by saying f(x) = f(|x|), but it is not a fully general function of vectors: it has a constraint that it must satisfy, namely that it is constant on the surface of any sphere surrounding the origin.
Fair cop- I was mistaken about the definition of U.
If there is no function U(a,b) which maps to my preferences across the region which I have control, then the entire position of the original post is void.
Yes, I do not think we actually have a disagreement. The rule that you shouldn't diversify only applies if your aim is to help people. It obviously doesn't apply for all possible aims, as it is possible to imagine an agent with a terminal value for diversified charitable donations.
More specifically, it only applies if your goal is to help people, and your donation is not enough to noticeably change the marginal returns on investment.
Only if you assume that you can't easily self modify.
If you're trying to optimize how you feel instead of something out there in the territory, then you're wireheading. If you're going to wirehead, then do it right and feel good without donating.
If you aren't going to wirehead, then realize that you aren't actually being effective, and self-modify so that you feel good when you maximize expected QALYs instead.
How I feel IS real. The judgments about the value of my feelings are mostly consistent and transitive, and I choose not to change how my perceptions affect my feelings except for good reasons.
If you care about confidence that you bought the QALYs, you should diversify. If you only care about confidence that the QALYs exist, you should not. This is because, due to the already high uncertainty, the utility of confidence changes linearly with the amount of money donated.
If you only care about the uncertainty of what you did, then that portion of utility would change with the square of the amount donated, since whether you donated to or stole from the charity it would increase uncertainty. If you care about total uncertainty, then the amount of uncertainty changes linearly with the donation, and since it's already high, your utility function changes linearly with it.
Of course, if you really care about all the uncertainty you do, you have to take into account the butterfly effect. It seems unlikely that saving a few lives or equivalent would compare with completely changing the future of the world.
Tell me if I understand this argument correctly: under reasonable values for marginal dollar effectiveness and utility of each charity, your donation probably won't push one of the charities to the point where the marginal utility of donating to it drops below that of donating to the other. That is, your donation, even if spent all at once, will probably not pass the point of relative diminishing returns.
If I have it right, please add the nonmathematical summary.
This post leaves no room for the standard caveats, e.g. you may want to diversify if you are not trying to maximize something like "people helped" but to express affiliation with a bunch of groups, or have a bunch of charities you can talk about if one succeeds, etc. I suggest adding a disclaimer section about the applicability of the result.
You're assuming consistent preferences over the whole relevant set of (A,B)s, which violates things like sacred-unsacred value tradeoff taboos. You're further assuming continuity of preference levels, which... is probably true with a bit of interpolation. You're also assuming that the amount to divide between charities is fixed.
Overall I'm not sure this convinces anyone who wasn't previously convinced.
since we won't be considering uncertainty,
this model is not useful for making decisions in the real world.
Seriously, why this idiosyncratic position on the diversification of charity donations? How is it different from diversification of investments?
It is common knowledge that diversification is a strategy used by risk-averse agents to counter the negative effects of uncertainty. If there is no uncertainty, it's obviously true that you should invest everything in the one thing that gives the highest utility (as long as the amount of money you invest is small enough that you don't run into saturation effects, that is, as long as you can make the local linearity approximation).
Why would charities behave any differently than profit-making assets? Do you think that charities have less uncertainty? That's far from obvious. In fact, typical charities might well have more uncertainty, since they seem to be more difficult to evaluate.
The logic requires that your donations are purely altruistically motivated and you only care for good outcomes.
E.g. take donating to one of two cancer-research organizations, A or B. If your donations are purely altruistic and the consequences are the same, you should have no preference on which of the organizations finds a new treatment. You have no reason to distinguish the case of you personally donating $1,000 to both organizations and someone else doing the same from you donating $2,000 to A and someone else donating $2,000 to B. And once the donations are made you should have no preference between A or B finding the new treatment.
So the equivalent of your personal portfolio when making investments isn't your personal donations, but the aggregate donations of everyone. And since you aren't the only one making donations, the donations are already diversified, so you are free to pick something underrepresented with high yield (which will almost certainly still be underrepresented afterwards). If you manage 0.1% of a $10,000,000 portfolio with 90% in government bonds, it makes no sense to invest any of that 0.1% in government bonds in the name of diversification.
Makes sense, but it seems to me that if there are many underrepresented high yield charities, you should still diversify among them.
Why would charities behave any differently than profit-making assets? Do you think that charities have less uncertainty?
The confusion concerns whose risk is relevant. When you invest in stocks, you want to minimize the risk to your assets. So, you will diversify your holdings.
When you contribute to charities, if rational you should (with the caveats others have mentioned) minimize the risk that a failing charity will prove crucial, not the risk that your individual contribution will be wasted. If you take a broad, utilitarian overview, you incorporate the need for diversified charities in your utility judgment. If charities A and B are equally likely to pay off but charity A is a lot smaller and should receive more contributions to avoid risk to whatever cause, then you take that into account at the time of deciding between A and B, leading you to contribute everything to A for the sake of diversification. (It's this dialectical twist that confuses people.)
If your contribution is large enough relative to the distinctions between charities, then diversification makes sense but only because your contribution is sufficient to tip the objective balance concerning the desirable total contributions to the charities.
leading you to contribute everything to A for the sake of diversification. (It's this dialectical twist that confuses people.)
This is the most insightful thing I've read on LW today.
There's an additional issue concerning imperfect evaluation.
Suppose we made a charity evaluator based on Statistical Prediction Rules, which perform pretty well. There is an issue, though. The charities will try to fake the signals that the SPR evaluates. An SPR is too crude to resist deliberate cheating. Diversification then decreases payoff for such cheating; sufficient diversification can make it economically non viable for selfish parties to fake the signals. The same goes for any imperfect evaluation scheme, especially for elaborate processing of the information (statements, explanations, suggestions on how to perform the evaluation, et cetera) originating from the donation recipient.
You just can not abstract the imperfect evaluation as 'uncertainty' any more than you can abstract a backdoor in a server application as noise in the wire.
Diversification then decreases payoff for such cheating; sufficient diversification can make it economically non viable for selfish parties to fake the signals.
Diversification reduces the payoff for appearing better. Therefore it reduces the payoff of investing in fake signals of being better. But it also reduces the payoff of investments in actually being better! If a new project that would increase humanitarian impact increases donations enough, then charities can afford to expand those efforts. If donations are insensitive to improvement, then the new project will be unaffordable.
Thus, e.g. GiveWell overwhelmingly channels funding to its top pick at a given time, partly to increase the expected direct benefit, and partly because they think that this creates incentives for improvement that dominate incentives for fake improvement. If the evaluation methods are worth using, they will include various signals that are costlier to fake than to honestly signal.
In the limit, if donors ignored quality indicators, spreading donations evenly among all charities, all this would do is incentivize the formation of lots of tiny charities that don't do anything at all, just collect most of the diversification donations. If you can't distinguish good from bad, you should focus on improving your ability to distinguish between them, not blindly diversify.
Diversification reduces the payoff for appearing better. Therefore it reduces the payoff of investing in fake signals of being better. But it also reduces the payoff of investments in actually being better!
Good charities are motivated by their objective. It's rather the bad charities for which actually performing better is simply one of the means to looking good for the sake of some entirely different terminal goal. You are correct about the latter.
If you can't distinguish good from bad, you should focus on improving your ability to distinguish between them, not blindly diversify.
I do concede that under unusually careful and secure (in the software security sense) evaluation it may be sufficiently resistant to cheating.
However, if you were parsing potentially Turing-complete statements by the prospective charity, verifying the statements for approximate internal consistency, and then as a result of this clearly insecure process obtained an enormously high number of, say, 8 lives per dollar, that's an entirely different story. If your evaluation process has a security hole, the largest number that falls through will be a scam.
edit:
In the limit, if donors ignored quality indicators, spreading donations evenly among all charities, all this would do is incentivize the formation of lots of tiny charities that don't do anything at all, just collect most of the diversification donations. If you can't distinguish good from bad, you should focus on improving your ability to distinguish between them, not blindly diversify.
Wrong limit. The optimum amount of diversification depends on how secure the evaluation process is (how expensive it is for someone to generate a 'donation basilisk' output, which, upon reading, compels the reader to donate). Yes, ideally you should entirely eliminate the possibility of such 'donation basilisk' data, and then donate to the top charity. Practically, the degree of basilisk-proofness is a given that is very difficult to change, and you are making the donation decision now.
Diversification then decreases payoff for such cheating; sufficient diversification can make it economically non viable for selfish parties to fake the signals.
I find this nonobvious; could you elaborate?
I agree with the arguments against diversification (mainly due to its effect on lowering the incentive for becoming more efficient), but here's a concrete instance of how diversification could make cheating nonviable.
Example: Cheating to fake the signals costs $5,000 (in other words, $5,000 to make it look like you're the best charity). There are $10,000 of efficient altruism funds that will be directed to the most efficient charity. By faking signals, you net $5,000.
Now if diversification is used, let's say at most 1/4 of the efficient altruism funds will be directed to a given charity (maybe evenly splitting the funds among the top 4 charities). Faking the signals now nets -$2,500. Thus, diversification would lower the incentive to cheat by reducing the expected payoff.
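A minimal sketch of that arithmetic, using the hypothetical figures above:

```python
# Hypothetical figures from the example above.
FAKE_COST = 5_000   # cost of faking the "best charity" signals, in dollars
FUND_POOL = 10_000  # efficient-altruism funds directed by those signals, in dollars

def cheating_payoff(max_share_per_charity):
    """Net payoff to a charity that fakes the signals, if donors cap the share of the
    pool that any single charity can receive at max_share_per_charity."""
    return FUND_POOL * max_share_per_charity - FAKE_COST

print(cheating_payoff(1.0))   # no diversification: 10,000 - 5,000 = 5,000, so cheating pays
print(cheating_payoff(0.25))  # split among the top 4: 2,500 - 5,000 = -2,500, cheating loses money
```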
Suppose we made a charity evaluator based on Statistical Prediction Rules, which perform pretty well.
Is that just vanilla linear regression?
Diversification then decreases payoff for such cheating; sufficient diversification can make it economically non viable for selfish parties to fake the signals.
Even without cheating, evaluation is still problematic:
Suppose you have a formula that computes the expected marginal welfare (QALYs, etc.) of a charity given a set of observable variables. You run it on a set of charities and the two top charities get very close scores, one slightly greater than the other. But the input variables are all affected by noise, and the formula contains several approximations, so you perform an error propagation analysis and it turns out that the difference between these scores is within the margin of error. Should you still donate everything to the top scoring charity even if you know that the decision is likely based on noise?
Should you still donate everything to the top scoring charity even if you know that the decision is likely based on noise?
If the charities are this close then you only expect to do very slightly better by giving only to the better scoring one. So it doesn't matter much whether you give to one, the other, or both.
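A rough Monte Carlo sketch of that point; the effectiveness numbers and the noise level are made up, chosen so that the measured difference is well within the margin of error.

```python
import random

# Hypothetical true effectiveness (QALYs per dollar) of two charities whose measured
# scores come out nearly equal, plus the (much larger) measurement noise.
TRUE_A, TRUE_B = 1.00, 1.02  # charity B is in fact slightly better
NOISE_SD = 0.10              # measurement error dwarfs the true difference

def average_regret(trials=100_000):
    """Average QALYs-per-dollar lost by donating everything to whichever charity
    happens to score higher after noise is added."""
    total = 0.0
    for _ in range(trials):
        score_a = TRUE_A + random.gauss(0, NOISE_SD)
        score_b = TRUE_B + random.gauss(0, NOISE_SD)
        chosen = TRUE_A if score_a > score_b else TRUE_B
        total += max(TRUE_A, TRUE_B) - chosen
    return total / trials

print(average_regret())  # ~0.009: under half the 0.02 gap, and tiny next to the ~1.0 you buy either way
```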
Systematic errors are the problem.
Ideally, you run your charity-evaluator function on a huge selection of charities, and the one for which it gives the largest value is, in some sense, the best, regardless of the noise.
More practically, imagine an imperfect evaluation function that, due to a bug in its implementation, multiplies by a Very Huge Number the value of any charity whose description includes some string S which the evaluation function mis-processes in some dramatic way. Now, if the selection of charities is sufficiently big as to include at least one charity with such an S in its description, you are essentially donating at random. Or worse than random, because the people who run in their heads the computation that produces such an S tend not to be the ones you can trust.
Normally, I would expect people who know about human biases to not assume that evaluation would resemble the ideal and to understand that the output of some approximate evaluation will not have the exact properties of expected value.
Why would charities behave any differently than profit-making assets?
The difference (for some) isn't in uncertainty, it's in utility, which isn't really made clear in the OP.
Risk aversion for personal investment stems from diminishing marginal utility: Going from $0 to $1,000,000 of personal assets is a significantly greater gain in personal welfare than going from $1,000,000 to $2,000,000. You use the first million to buy things like food, shelter, and such, while the second million goes to less urgent needs. So it makes sense to diversify into multiple investments, reducing the chance of severe falls in wealth even if this reduces expected value. E.g. for personal consumption one should take a sure million dollars rather than a 50% chance of $2,200,000.
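A quick sketch of that last comparison under logarithmic utility of wealth; the starting wealth of $50,000 is an arbitrary assumption, just to make the numbers concrete.

```python
import math

STARTING_WEALTH = 50_000  # assumed existing personal assets, in dollars

def log_utility(wealth):
    return math.log(wealth)

sure_million = log_utility(STARTING_WEALTH + 1_000_000)
gamble = 0.5 * log_utility(STARTING_WEALTH + 2_200_000) + 0.5 * log_utility(STARTING_WEALTH)

print(sure_million, gamble)  # ~13.86 vs ~12.72: the sure million wins despite its lower expected dollar value
```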
If one assesses charitable donations in terms of something like "people helped by anyone" rather than something like "log of people helped by me" then there isn't diminishing utility (by that metric): saving twice as many people is twice as good. And if your donations are small relative to the cause you are donating to, then there shouldn't be significantly diminishing returns to money in terms of lives saved: if you donate $1,000 and increase the annual budget for malaria prevention from $500,000,000 to $500,001,000 you shouldn't expect that you are moving to a new regime with much lower marginal productivity.
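To see how little a small donation moves marginal productivity, here's a toy model with a deliberately concave production function, lives(M) = k·sqrt(M); the functional form is an assumption made only for illustration.

```python
import math

def marginal_lives_per_dollar(budget, k=1.0):
    """Marginal productivity of one more dollar under the toy model lives(M) = k * sqrt(M)."""
    return k / (2 * math.sqrt(budget))

before = marginal_lives_per_dollar(500_000_000)
after = marginal_lives_per_dollar(500_001_000)
print((before - after) / before)  # ~1e-6: a $1,000 donation changes marginal returns by about 0.0001%
```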
But you might care about "log of lives saved by me" or "not looking stupid after the fact" or "affiliating with several good causes" or other things besides the number of people helped in your charitable donations. Or you might be donating many millions of dollars, so that diminishing impact of money matters.
It is common knowledge that diversification is a strategy used by risk-averse agents to counter the negative effects of uncertainty.
When one is risk averse, one trades some expected gain to minimize potential loss. The relevant question is whether it makes any sense to be risk averse with respect to your charity donations.
I'd say no. Risk aversion for my own pile of money comes largely from decreasing marginal utility of each dollar in my pile when spent on me and mine. My first and last dollar to most charities are two drops in a bucket, with the same marginal "problem solving" power.
This doesn't take into account the other benefits of charitable giving, such as signaling and good feelings. In both cases, I'd say that others and you respond more favorably to you the more charities you donate to. In that respect, at least, there is decreasing marginal utility for each dollar more spent on a particular charity. But I think that feel good aspect was not part of the assumed utility calculation. If your goal is to best solve problems, take aim at what you consider the best target, and shoot your wad.
There are many charities that provide goods or services that their donors can use, think of the Wikimedia Foundation or the Free Software Foundation or even the Singularity Institute (which operates Less Wrong). You can donate to these charities for non-altruistic motives other than mere signalling or good feelings, and these motives will likely have diminishing returns, naturally resulting in risk aversion. (Or you may reason that since your small donation isn't going to make a difference, you can as well freeload, but that is the same argument against voting).
But let's assume that we are considering only "save the starving children" type of charities, where the set of donors and the set of beneficiaries don't overlap, and your donations can only buy welfare (measured in QALYs or some other metric) for distant individuals you don't personally know. Are you not risk averse?
Consider the following scenario: There are two possible charities. For every 100,000 euros of donations, charity A saves the lives of 50 children (that is, allows them to reach adulthood in a condition that enables them to provide for themselves). Charity B either saves 101 children per 100,000 euros or fails, completely wasting all the donated money, with a 50-50 chance. You have got 100 euros to donate. How do you split them?
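For what it's worth, the expected values in that scenario, computed under the risk-neutral "children saved by anyone" framing (the framing itself is the thing in dispute):

```python
DONATION = 100  # euros

# Expected children saved per euro for the two hypothetical charities in the scenario.
rate_a = 50 / 100_000           # charity A: 50 children per 100,000 euros, guaranteed
rate_b = 0.5 * (101 / 100_000)  # charity B: 101 children per 100,000 euros, half the time

print(DONATION * rate_a)  # 0.0500 expected children saved
print(DONATION * rate_b)  # 0.0505 expected children saved: B comes out (slightly) ahead in expectation
```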
This is not intended as a complete argument, rather it's an elaboration of a point whose understanding might be helpful in understanding a more general argument. If this point, which is a simplified special case, is not understood, then understanding the more general argument would be even less likely. (Clarified this intent in the first paragraph.)
This results in the change of preference level dU = (∂U/∂A)·dA+(∂U/∂B)·dB... What you want is to maximize the value of U+dU
dU is a linear approximation to the change in preference level, not the change itself. By assuming the linear approximation is good enough, you beg the question. Consider that if that is to be assumed, you could just take U, A and B to be linear functions to begin with, and maximize the values explicitly without bothering to write out any partial derivatives.
By assuming the linear approximation is good enough, you beg the question. Consider that if that is to be assumed, you could just take U, A and B to be linear functions to begin with...
It's not so much an assumption, as an intermediate conclusion that follows from the dependence of U on A and B being smooth enough (across the kind of change that dM is capable of making) and nothing else. My only claim is that under this assumption, the non-diversification conclusion follows. U(A,B) being linear is a hugely more restrictive assumption.
that follows from the dependence of U on A and B being smooth enough (across the kind of change that dM is capable of making) and nothing else.
This is incorrect. Being smooth is not a license to freely replace function values with linear approximation values; that's not what smoothness means. You have to analyze the error term, most easily presented as the higher-order remainder in the Taylor approximation. Such an analysis is what I'd tried to supply in the post I linked to.
Now, create the field of (A,B). Consider the line segment in that field which you can choose; that line goes from (Ma+dM,Mb) to (Ma,Mb+dM)
What aspect of the field requires that most such line segments have a maximum at one endpoint?
Problems with non-diversification:
How do you choose between multiple things that are all necessary, when leaving out one of them means disaster? For instance, clean air vs. clean water. Humanity needs both, or it dies. There must be more than one charity that's necessary.
How do you choose between multiple risks when any of them can kill you and they're equally likely? For instance: According to a TED video, there's around a 1 in 20,000 chance of a meteor hitting Earth, and according to some research I did a long time ago, the chance that the Yellowstone caldera will erupt in our lifetimes and destroy Earth is about 1 in 20,000.
If all of your favorite charities are likely to make their donation goals, why not donate to them all?
Sometimes one cause is dependent on another. For instance, consider how many charity websites are hosted on Linux / Apache, open source software. If Linux were in desperate need of programmers to solve some security flaw, it might make more sense to donate to that than to the charities that depend on it.
There is a standard argument against diversification of donations, popularly explained by Steven Landsburg in the essay Giving Your All. This post is an attempt to communicate a narrow special case of that argument in a form that resists misinterpretation better, for the benefit of people with a bit of mathematical training. Understanding this special case in detail might be useful as a stepping stone to the understanding of the more general argument. (If you already agree that one should donate only to the charity that provides the greatest marginal value, and that it makes sense to talk about the comparison of marginal value of different charities, there is probably no point in reading this post.)1
Suppose you are considering two charities, one that accomplishes the saving of antelopes, and the other the saving of babies. Depending on how much funding these charities secure, they are able to save respectively A antelopes and B babies, so the outcome can be described by a point (A,B) that specifies both pieces of data.
Let's say you have a complete transitive preference over possible values of (A,B), that is you can make a comparison between any two points, and if you prefer (A1,B1) over (A2,B2) and also (A2,B2) over (A3,B3), then you prefer (A1,B1) over (A3,B3). Let's further suppose that this preference can be represented by a sufficiently smooth real-valued function U(A,B), such that U(A1,B1)>U(A2,B2) precisely when you prefer (A1,B1) to (A2,B2). U doesn't need to be a utility function in the standard sense, since we won't be considering uncertainty, it only needs to represent ordering over individual points, so let's call it "preference level".
Let A(Ma) be the dependence of the number of antelopes saved by the Antelopes charity if it attains the level of funding Ma, and B(Mb) the corresponding function for the Babies charity. (For simplicity, let's work with U, A, B, Ma and Mb as variables that depend on each other in specified ways.)
You are considering a decision to donate, and at the moment the charities have already secured Ma and Mb amounts of money, sufficient to save A antelopes and B babies, which would result in your preference level U. You have a relatively small amount of money dM that you want to distribute between these charities. dM is such that it's small compared to Ma and Mb, and if donated to either charity, it will result in changes of A and B that are small compared to A and B, and in a change of U that is small compared to U.
Let's say you split the sum of money dM by giving its part dMa=s·dM (0≤s≤1) to A and the remaining part dMb=(1−s)·dM to B. The question is then what value of s should you choose. Donating everything to A corresponds to s=1 and donating everything to B corresponds to s=0, with values in between corresponding to splitting of the donation.
Donating s·dM to A results in its funding level becoming Ma+dMa, or differential funding level of dMa, and in A+dA = A+(∂A/∂Ma)·dMa = A+(∂A/∂Ma)·s·dM antelopes getting saved, with differential number of antelopes saved being (∂A/∂Ma)·s·dM, correspondingly the differential number of babies saved is (∂B/∂Mb)·(1−s)·dM. This results in the change of preference level dU = (∂U/∂A)·dA+(∂U/∂B)·dB = (∂U/∂A)·(∂A/∂Ma)·s·dM+(∂U/∂B)·(∂B/∂Mb)·(1−s)·dM. What you want is to maximize the value of U+dU, and since U is fixed, you want to maximize the value of dU.
Let's interpret some of the terms in that formula to make better sense of it. (∂U/∂A) is current marginal value of more antelopes getting saved, according to your preference U, correspondingly (∂U/∂B) is the marginal value of more babies getting saved. (∂A/∂Ma) is current marginal efficiency of the Antelopes charity at getting antelopes saved for a given unit of money, and (∂B/∂Mb) is the corresponding value for the Babies charity. Together, (∂U/∂A)·(∂A/∂Ma) is the value you get out of donating a unit of money to charity A, and (∂U/∂B)·(∂B/∂Mb) is the same for charity B. These partial derivatives depend on the current values of Ma and Mb, so they reflect only the current situation and its response to relatively small changes.
The parameter you control is s, and dM is fixed (it's all the money you are willing to donate to both charities together) so let's rearrange the terms in dU a bit: dU = (∂U/∂A)·(∂A/∂Ma)·s·dM+(∂U/∂B)·(∂B/∂Mb)·(1−s)·dM = (s·((∂U/∂A)·(∂A/∂Ma)−(∂U/∂B)·(∂B/∂Mb))+(∂U/∂B)·(∂B/∂Mb))·dM = (s·K+L)·dM, where K and L are not controllable by your actions (K = (∂U/∂A)·(∂A/∂Ma)−(∂U/∂B)·(∂B/∂Mb), L = (∂U/∂B)·(∂B/∂Mb)).
Since dM is positive and 0≤s≤1, we have two relevant cases in the maximization of dU=(s·K+L)·dM: when K is positive, and when it's negative. If it's positive, then dU is maximized by boosting K's influence as much as possible by setting s=1, that is donating all of dM to charity A. If it's negative, then dU is maximized by reducing K's influence as much as possible by setting s=0, that is donating all of dM to charity B.
What does the value of K mean? It's the difference between (∂U/∂A)·(∂A/∂Ma) and (∂U/∂B)·(∂B/∂Mb), that is between the marginal value you get out of donating a unit of money to A and the marginal value of donating to B. The result is that if the marginal value of charity A is greater than the marginal value of charity B, you donate everything to A, otherwise you donate everything to B.
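A numerical sketch of the argument, if it helps: the marginal values below are made up, and the only point is that dU is linear in s, so it is maximized at one of the endpoints s=0 or s=1.

```python
# Made-up marginal values, just to exercise the formula dU = (s*K + L)*dM from the post.
dM = 100.0                    # total donation
dU_dA, dU_dB = 2.0, 1.0       # ∂U/∂A, ∂U/∂B: marginal value of one more antelope / baby saved
dA_dMa, dB_dMb = 0.01, 0.015  # ∂A/∂Ma, ∂B/∂Mb: current marginal efficiency of each charity

K = dU_dA * dA_dMa - dU_dB * dB_dMb  # difference in marginal value per dollar between A and B
L = dU_dB * dB_dMb

def dU(s):
    """Change in preference level if a fraction s of dM goes to charity A, the rest to B."""
    return (s * K + L) * dM

best_s = max((i / 10 for i in range(11)), key=dU)
print(K, best_s)  # K > 0 with these numbers, so the best split is s = 1.0: everything goes to charity A
```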
1: This started as a reply to Anatoly Vorobey, but grew into an explanation that I thought might be useful to others in the future, so I turned it into a post.