Related to: Confidence levels inside and outside an argument, Making your explicit reasoning trustworthy
A mode of reasoning that sometimes comes up in discussion of existential risk is the following.
Person 1: According to model A (e.g. some Fermi calculation with probabilities coming from certain reference classes), pursuing course of action X will reduce existential risk by 10^-5; existential risk has an opportunity cost of 10^25 DALYs (*), therefore model A says the expected value of pursuing course of action X is 10^20 DALYs. Since course of action X requires 10^9 dollars, the number of DALYs saved per dollar invested in course of action X is 10^11. Hence course of action X is 10^10 times as cost-effective as the most cost-effective health interventions in the developing world.
Person 2: I reject model A; I think that appropriate probabilities involved in the Fermi calculation may be much smaller than model A claims; I think that model A fails to incorporate many relevant hypotheticals which would drag the probability down still further.
Person 1: Sure, it may be that model A is totally wrong, but there's nothing obviously very wrong with it. Surely you'd assign at least a 10^-5 chance that it's on the mark? More confidence than this would seem to indicate overconfidence bias; after all, plenty of smart people believe in model A, and it can't be that likely that they're all wrong. So unless you think that the side-effects of pursuing course of action X are systematically negative, even your own implicit model gives a figure of at least 10^6 DALYs saved per dollar, and that's a far better investment than any other philanthropic effort that you know of, so you should fund course of action X even if you think that model A is probably wrong.
(*) As Jonathan Graehl mentions, DALY stands for Disability-adjusted life year.
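Person 1's arithmetic can be made explicit in a few lines (a minimal sketch; all quantities are the dialogue's hypothetical figures, and the variable names are mine, not anything from an actual cost-effectiveness model):

```python
# A sketch of Person 1's Fermi arithmetic, using only the hypothetical
# figures from the dialogue above (none of these are empirical estimates).

risk_reduction = 1e-5      # claimed reduction in existential risk from X
stake_dalys = 1e25         # opportunity cost of existential catastrophe
cost_dollars = 1e9         # cost of course of action X

expected_dalys = risk_reduction * stake_dalys        # 10^20 DALYs
dalys_per_dollar = expected_dalys / cost_dollars     # 10^11 DALYs per dollar

# Person 1's fallback move: even a 10^-5 credence that model A is right
# still leaves an enormous implicit cost-effectiveness figure, which is
# what makes the argument hard to escape by merely doubting the model.
credence_in_model_a = 1e-5
implicit_dalys_per_dollar = credence_in_model_a * dalys_per_dollar  # 10^6
```

The striking feature is the last step: doubting the model only multiplies the answer by one's credence in it, so any astronomically large stake survives an astronomically large discount.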
I feel very uncomfortable with the sort of argument that Person 1 advances above. My best attempt at a summary of where my discomfort comes from is that it seems like one could make this sort of argument to advance any number of courses of action, many of which would be at odds with one another.
I have difficulty pinpointing where my discomfort comes from in more detail. There may be underlying game-theoretic considerations; there may be underlying considerations based on the anthropic principle; it could be that the probability one ascribes to model A being correct should be much lower than 10^-5 on account of humans' poor ability to construct accurate models, so that I shouldn't take it too seriously when some people subscribe to such models; it could be that I'm irrationally influenced by social pressures against accepting unusual arguments that most people wouldn't feel comfortable accepting; it could be that in such extreme situations I value certainty over utility maximization; it could be some combination of all of these. I'm not sure how to disentangle the relevant issues in my mind.
One case study that I think may be useful to consider in juxtaposition with the above is as follows. In Creating Infinite Suffering: Lab Universes, Alan Dawrst says:
Abstract. I think there's a small but non-negligible probability that humans or their descendants will create infinitely many new universes in a laboratory. Under weak assumptions, this would entail the creation of infinitely many sentient organisms. Many of those organisms would be small and short-lived, and their lives in the wild would often involve far more pain than happiness. Given the seriousness of suffering, I conclude that creating infinitely many universes would be infinitely bad.
One may not share Dawrst's intuition that pain would outweigh happiness in such universes, but regardless, the hypothetical of lab universes raises the possibility that all of the philanthropy that one engages in with a view toward utility maximization should be focused on creating, or preventing the creation of, infinitely many lab universes (according to whether one views the expected value of such a universe as positive or negative). This example is in the spirit of Pascal's wager, but I prefer it because the premises are less metaphysically dubious.
One can argue that if one is willing to accept the argument given by Person 1 above, one should be willing to accept the argument that one should devote all of one's resources to studying and working toward or against lab universes.
Here various attempts at counterarguments seem uncompelling:
Counterargument #1: The issue here is with the infinite; we should ignore infinite ethics on the grounds that it's beyond the range of human comprehension, and focus on finite ethics.
Response: The issue here doesn't seem to be with infinities: one can replace "infinitely many lab universes" with "3^^^3 lab universes" (or a sufficiently large number) and one would be faced with essentially the same conundrum.
Counterargument #2: The hypothetical upside of a lab universe perfectly cancels out the hypothetical downside of such a universe, so we can treat lab universes as having expected value zero.
Response: If this is true it's certainly not obviously true; there are physical constraints on the sorts of lab universes that could arise, and it's probably not the case that for every universe there's an equal and opposite universe. Moreover, it's not the case that we have no means of investigating the expected utility of a lab universe. We have our own universe as a model: we can contemplate whether it has aggregate positive or negative utility, and refine this understanding by researching fundamental physics, hypothesizing the variation in initial conditions and physical laws among lab universes, and attempting to extrapolate what the utility or disutility of an average such universe would be.
Counterargument #3: Even if one's focus should be on lab universes, such a focus reduces to a focus on creating a Friendly AI, since such an entity would be much better than us at reasoning about whether or not lab universes are a good thing and how to go about affecting their creation.
Response: Here too, if this is true it's not obvious. Even if one succeeds in creating an AGI that's sympathetic to human values, such an AGI may not subscribe to utilitarianism; after all, many humans don't, and it's not clear that this is because their volitions have not been coherently extrapolated. Maybe some humans have volitions which coherently extrapolate to being heavily utilitarian whereas others don't. If one is in the latter category, one may do better to focus on lab universes than one would do in focusing on FAI (for example, if one believes that lab universes would have average negative utility, one might work to increase existential risk so as to avert the possibility that a nonutilitarian FAI creates infinitely many universes in a lab because some people find it cool).
Counterargument #4: The universes so created would be parallel universes and parallel copies of a given organism should be considered equivalent to a single such organism, thus their total utility is finite and the expected utility of creating a lab universe is smaller than the expected utility in our own universe.
Response: Regardless of whether one considers parallel copies of a given organism equivalent to a single organism, there's some nonzero chance that the universes created would diverge in a huge number of ways; this could make the expected value of the creation of universes arbitrarily large, depending on how the probability that one assigns to the creation of n essentially distinct universes varies with n (this is partially an empirical/mathematical question; I'm not claiming that the answer goes one way or the other).
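Whether divergence rescues an unboundedly large expected value really does hinge on the tail of the distribution over n. A toy illustration (the two distributions below are hypothetical, chosen only to contrast convergent and divergent behavior, not proposed as actual models of universe creation):

```python
import math

def truncated_expectation(prob, n_max):
    """Partial sum of n * P(n): the expected number of essentially
    distinct universes, truncated at n_max."""
    return sum(n * prob(n) for n in range(1, n_max + 1))

# Exponentially decaying tail: the expectation converges (to 2 here).
geometric = lambda n: 0.5 ** n

# Tail decaying like 1/n^2 (normalized via sum 1/n^2 = pi^2/6): the
# partial sums grow like log(n_max), so the expectation is unbounded.
inverse_square = lambda n: (6 / math.pi ** 2) / n ** 2

light_tail = truncated_expectation(geometric, 10 ** 5)        # ~2.0
heavy_tail_small = truncated_expectation(inverse_square, 10 ** 4)
heavy_tail_large = truncated_expectation(inverse_square, 10 ** 6)
# heavy_tail_large noticeably exceeds heavy_tail_small: no finite limit.
```

So the empirical/mathematical question above amounts to asking which regime the real probability assignment falls into: a thin enough tail caps the expected value, while anything with a sufficiently heavy tail lets it grow without bound.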
Counterargument #5: The statement "creating infinitely many universes would be infinitely bad" is misleading; as humans we experience diminishing marginal utility with respect to helping n sentient beings as n grows, and this is not exclusively due to scope insensitivity; rather, the concavity of the function at least partially reflects terminal values.
Response: Even if one decides that this is true, one still has the question of how quickly the diminishing marginal utility sets in, and any choice here seems somewhat arbitrary, so this line of reasoning seems unsatisfactory. Depending on the choice that one makes, one may reject Person 1's argument on the grounds that after a certain point one just doesn't care very much about helping additional people.
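To see how much the choice matters, compare a linear aggregate utility with a logarithmic one (both hypothetical, picked only to bracket the possibilities; the population figures are arbitrary):

```python
import math

n_small = 10 ** 6    # beings helped by a mundane intervention
n_huge = 10 ** 20    # beings helped in a Person 1-style scenario

# Linear aggregation: the huge outcome is 10^14 times better, so even a
# tiny probability of achieving it dominates the expected-value calculation.
linear_ratio = n_huge / n_small

# Logarithmic aggregation (diminishing marginal utility setting in fast):
# the huge outcome is only about 3.3 times better, and small probabilities
# of it no longer swamp everything else.
log_ratio = math.log(n_huge) / math.log(n_small)
```

Any concavity between these two extremes gives an intermediate answer, which is exactly why the "how quickly does it set in" question, and hence the verdict on Person 1's argument, feels arbitrary.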
I'll end with a couple of questions for Less Wrong:
1. Is the suggestion that one's utilitarian efforts should be primarily focused on the possibility of lab universes an example of "explicit reasoning gone nuts"? (cf. Anna's post Making your explicit reasoning trustworthy).
2. If so, is the argument advanced by Person 1 above also an example of "explicit reasoning gone nuts"? If the two cases are different, why?
3. If one rejects one or both of Person 1's argument and the argument that utilitarian efforts should be focused on lab universes, how does one reconcile this with the idea that one should assign some probability to the notion that one's model is wrong (or that somebody else's model is right)?