Model Uncertainty, Pascalian Reasoning and Utilitarianism

multifoliaterose

14th Jun 2011

A mode of reasoning that sometimes comes up in discussion of existential risk is the following.

Person 1: According to model A (e.g. some Fermi calculation with probabilities coming from certain reference classes), pursuing course of action X will reduce existential risk by 10⁻⁵; existential risk has an opportunity cost of 10²⁵ DALYs (*), therefore model A says the expected value of pursuing course of action X is 10²⁰ DALYs. Since course of action X requires 10⁹ dollars, the number of DALYs saved per dollar invested in course of action X is 10¹¹. Hence course of action X is 10¹⁰ times as cost-effective as the most cost-effective health interventions in the developing world.

Person 2: I reject model A; I think that appropriate probabilities involved in the Fermi calculation may be much smaller than model A claims; I think that model A fails to incorporate many relevant hypotheticals which would drag the probability down still further.

Person 1: Sure, it may be that model A is totally wrong, but there's nothing obviously very wrong with it. Surely you'd assign at least a 10⁻⁵ chance that it's on the mark? More confidence than this would seem to indicate overconfidence bias; after all, plenty of smart people believe in model A and it can't be that likely that they're all wrong. So unless you think that the side-effects of pursuing course of action X are systematically negative, even your own implicit model gives a figure of at least 10⁵ DALYs saved per dollar, and that's a far better investment than any other philanthropic effort that you know of, so you should fund course of action X even if you think that model A is probably wrong.

(*) As Jonathan Graehl mentions, DALY stands for Disability-adjusted life year.
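Person 1's arithmetic, including the discount applied in the final step, can be checked directly. The sketch below uses only the figures quoted in the dialogue; the variable names are illustrative, and the last step assumes (as Person 1 implicitly does) that course of action X is worth nothing if model A is wrong.

```python
from fractions import Fraction

# Figures quoted by Person 1 (exact rational arithmetic avoids float rounding).
risk_reduction = Fraction(1, 10**5)   # reduction in existential risk under model A
stake_dalys = 10**25                  # opportunity cost of existential risk, in DALYs
cost_dollars = 10**9                  # cost of course of action X

expected_dalys = risk_reduction * stake_dalys        # expected DALYs under model A
dalys_per_dollar = expected_dalys / cost_dollars     # cost-effectiveness under model A

# Credence that model A is "on the mark", as suggested by Person 1.
# Assumption: X has zero expected value if model A is wrong.
credence_in_model_a = Fraction(1, 10**5)
discounted_dalys_per_dollar = credence_in_model_a * dalys_per_dollar

print(expected_dalys, dalys_per_dollar, discounted_dalys_per_dollar)
```

With these inputs the discounted figure comes out at 10⁶ DALYs per dollar, which is consistent with Person 1's "at least 10⁵". The whole conclusion rests on the 10⁻⁵ credence floor and on the zero-if-wrong assumption.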

I feel very uncomfortable with the sort of argument that Person 1 advances above. My best attempt at a summary of where my discomfort comes from is that it seems like one could use this sort of argument to advance a whole number of courses of action, many of which would be at odds with one another.

I have difficulty parsing where my discomfort comes from in more detail. There may be underlying game-theoretic considerations, there may be underlying considerations based on the anthropic principle, it could be that the probability that one ascribes to model A being correct should be much lower than 10⁻⁵ on account of humans' poor ability to construct accurate models and that I shouldn't take it too seriously when some people subscribe to them, it could be that I'm irrationally influenced by social pressures against accepting unusual arguments that most people wouldn't feel comfortable accepting, it could be that in such extreme situations I value certainty over utility maximization, it could be some combination of all of these; I'm not sure how to disentangle the relevant issues in my mind.

One case study that I think may be useful to consider in juxtaposition with the above is as follows. In "Creating Infinite Suffering: Lab Universes", Alan Dawrst says:

Abstract. I think there's a small but non-negligible probability that humans or their descendants will create infinitely many new universes in a laboratory. Under weak assumptions, this would entail the creation of infinitely many sentient organisms. Many of those organisms would be small and short-lived, and their lives in the wild would often involve far more pain than happiness. Given the seriousness of suffering, I conclude that creating infinitely many universes would be infinitely bad.

One may not share Dawrst's intuition that pain would outweigh happiness in such universes, but regardless, the hypothetical of lab universes raises the possibility that all of the philanthropy that one engages in with a view toward utility maximizing should be focused on creating or preventing the creation of infinitely many lab universes (according to whether one views the expected value of such a universe as positive or negative). This example is in the spirit of Pascal's wager but I prefer it because the premises are less metaphysically dubious.

One can argue that if one is willing to accept the argument given by Person 1 above, one should be willing to accept the argument that one should devote all of one's resources to studying and working toward or against lab universes.

Here various attempts at counterarguments seem to be uncompelling:

Counterargument #1: The issue here is with the infinite; we should ignore infinite ethics on the grounds that they're beyond the range of human comprehension and focus on finite ethics.

Response: The issue here doesn't seem to be with infinities, one can replace "infinitely many lab universes" with "3^^^3 lab universes" (or a sufficiently large number) and would be faced with essentially the same conundrum.
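For readers unfamiliar with the notation, 3^^^3 is written in Knuth's up-arrow notation, where each additional arrow iterates the previous operation. A minimal recursive sketch (the function name `up_arrow` is illustrative) gives a feel for how fast this grows; only tiny inputs are computable.

```python
def up_arrow(a: int, arrows: int, b: int) -> int:
    """Compute a (up-arrow repeated `arrows` times) b. Tiny inputs only."""
    if arrows == 1:
        return a ** b          # one arrow is ordinary exponentiation
    if b == 0:
        return 1               # base case for the iterated operation
    # With n arrows: a ^(n) b = a ^(n-1) (a ^(n) (b - 1))
    return up_arrow(a, arrows - 1, up_arrow(a, arrows, b - 1))

print(up_arrow(3, 2, 3))   # 3^^3 = 3**(3**3) = 3**27
print(up_arrow(2, 3, 3))   # 2^^^3 = 2^^4 = 2**16
```

3^^^3 equals 3^^(3^^3), a power tower of 3s whose height is 3^^3 = 7625597484987, so `up_arrow(3, 3, 3)` should not be called; it would never finish.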

Counterargument #2: The hypothetical upside of a lab universe perfectly cancels out the hypothetical downside of such a universe, so we can treat lab universes as having expected value zero.

Response: If this is true it's certainly not obviously true; there are physical constraints on the sorts of lab universes that could arise, it's probably not the case that for every universe there's an equal and opposite universe. Moreover, it's not the case that we don't have a means of investigating the expected utility of a lab universe. We do have our own universe as a model, can contemplate whether it has aggregate positive or negative utility and refine this understanding by researching fundamental physics, hypothesizing the variation among initial conditions and physical laws among lab universes and attempting to extrapolate what the utility/disutility of an average such universe would be.

Counterargument #3: Even if one's focus should be on lab universes, such a focus reduces to a focus on creating a Friendly AI, since such an entity would be much better than us at reasoning about whether or not lab universes are a good thing and how to go about affecting their creation.

Response: Here too, if this is true it's not obvious. Even if one succeeds in creating an AGI that's sympathetic to human values, such an AGI may not subscribe to utilitarianism; after all, many humans don't, and it's not clear that this is because their volitions have not been coherently extrapolated; maybe some humans have volitions which coherently extrapolate to being heavily utilitarian whereas others don't. If one is in the latter category, one may do better to focus on lab universes than one would do in focusing on FAI (for example, if one believes that lab universes would have average negative utility, one might work to increase existential risk so as to avert the possibility that a nonutilitarian FAI creates infinitely many universes in a lab because some people find it cool).

Counterargument #4: The universes so created would be parallel universes and parallel copies of a given organism should be considered equivalent to a single such organism, thus their total utility is finite and the expected utility of creating a lab universe is smaller than the expected utility in our own universe.

Response: Regardless of whether one considers parallel copies of a given organism equivalent to a single organism, there's some nonzero chance that the universes created would diverge in a huge number of ways; this could make the expected value of the creation of universes arbitrarily large, depending on how the probability that one assigns to the creation of n essentially distinct universes varies with n (this is partially an empirical/mathematical question; I'm not claiming that the answer goes one way or the other).
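The dependence on how the probability varies with n can be made concrete. The sketch below is purely illustrative: it assumes the value of n distinct universes is proportional to n, and it compares two made-up, unnormalised weightings (normalising would rescale the sums without changing whether they converge). With weights falling like 1/n² the partial sums of n × weight keep growing; with weights falling like 1/n³ they level off.

```python
def partial_expected_value(weight, cutoff: int) -> float:
    """Sum of n * weight(n) for n = 1..cutoff (weights are unnormalised)."""
    return sum(n * weight(n) for n in range(1, cutoff + 1))

def heavy_tail(n: int) -> float:
    return 1 / n**2    # n * weight = 1/n: harmonic series, diverges

def light_tail(n: int) -> float:
    return 1 / n**3    # n * weight = 1/n**2: converges (to pi**2 / 6)

for cutoff in (10, 1_000, 100_000):
    print(cutoff,
          round(partial_expected_value(heavy_tail, cutoff), 4),
          round(partial_expected_value(light_tail, cutoff), 4))
```

Which regime applies to lab universes is exactly the empirical/mathematical question left open above; the code takes no position on it.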

Counterargument #5: The statement "creating infinitely many universes would be infinitely bad" is misleading; as humans we experience diminishing marginal utility with respect to helping n sentient beings as n varies. This is not exclusively due to scope insensitivity; rather, the concavity of the function at least partially reflects terminal values.

Response: Even if one decides that this is true, one still has a question of how quickly the diminishing marginal utility sets in, and any choice here seems somewhat arbitrary, so this line of reasoning seems unsatisfactory. Depending on the choice that one makes, one may reject Person 1's argument on the grounds that after a certain point one just doesn't care very much about helping additional people.
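To see how much hinges on where the diminishing returns set in, consider a toy bounded utility function. The form n / (n + half_point) and the parameter name `half_point` are assumptions made for illustration, not something proposed in the post; `half_point` is the number of beings at which utility reaches half of its maximum of 1.

```python
from fractions import Fraction

def bounded_utility(n: int, half_point: int) -> Fraction:
    """Concave, bounded utility of helping n beings: n / (n + half_point)."""
    return Fraction(n, n + half_point)

small, huge = 10**6, 10**25   # e.g. a large charity vs. Person 1's stakes

for exponent in (3, 12, 20):
    half_point = 10**exponent
    gain = bounded_utility(huge, half_point) - bounded_utility(small, half_point)
    print(f"half_point=10^{exponent}: extra utility from the huge outcome = {float(gain):.6f}")
```

If utility saturates early (`half_point` around 10³), going from 10⁶ to 10²⁵ beings adds almost nothing and Person 1's argument loses its force; if it saturates late, the huge outcome still dominates. Nothing in the setup singles out one value of `half_point`, which is the arbitrariness the response points to.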

I'll end with a few questions for Less Wrong:

1. Is the suggestion that one's utilitarian efforts should be primarily focused on the possibility of lab universes an example of "explicit reasoning gone nuts"? (cf. Anna's post "Making your explicit reasoning trustworthy").

2. If so, is the argument advanced by Person 1 above also an example of "explicit reasoning gone nuts?" If the two cases are different then why?

3. If one rejects one or both of the argument by Person 1 and the argument that utilitarian efforts should be focused around lab universes, how does one reconcile this with the idea that one should assign some probability to the notion that one's model is wrong (or that somebody else's model is right)?

155 Comments

JenniferRM15y300

The more I think about it, the more I'm tempted to just bite the bullet and accept that my "empirically observed utility function" (to the degree that such a thing even makes sense) may be bounded, finite, with a lot of its variation spent measuring relatively local things like the prosaic well being of myself and my loved ones, so that there just isn't much left over to cover anyone outside my monkey sphere except via a generic virtue-ethical term for "being a good citizen n'stuff".

A first order approximation might be mathematically modeled by taking all the various utilities having to do with "weird infinite utilities", normalizing all those scenarios by "my ability to affect those outcomes" (so my intrinsic concern for things decreased when I "gave up" on affecting them... which seems broken but also sorta seems like how things might actually work) and then run what's left through a sigmoid function so their impact on my happiness and behavior is finite and marginal... claiming maybe 1% of my consciously strategic planning time and resource expenditures under normal circumstances.

Under this model, the real meat of my utility function would actually be characterized by the finite number of sigmoid terms that sum together, what each one handles, and the multiplicative factors attached to each term. All the weird "infinity arguments" are probably handled by a generic term for "other issues" that is already handling political tragedies playing out on different continents and the ongoing mass extinction event and looming singularity issues and so on. In comparison, this scheme would need quite a few terms for things like "regular bowel movements", that are usually near optimal and have multiplicative factors such that any of them can dominate pretty much the entire utility function if anything goes wrong in these domains.
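A rough sketch of the model described in this comment, with entirely made-up terms, weights and inputs: utility is a weighted sum of sigmoid terms, so any single term, including the generic "other issues" term that absorbs the infinity arguments, can contribute at most its weight.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic function written via tanh, which does not overflow for large |x|."""
    return 0.5 * (1.0 + math.tanh(x / 2.0))

def total_utility(terms) -> float:
    """Weighted sum of sigmoid terms; each term is (name, weight, raw_input)."""
    return sum(weight * sigmoid(raw) for _name, weight, raw in terms)

# Hypothetical numbers throughout; "other issues" has a small weight on purpose.
normal_day = [("own health", 10.0, 5.0), ("loved ones", 5.0, 5.0), ("other issues", 0.2, 0.0)]
astronomical_stakes = [("own health", 10.0, 5.0), ("loved ones", 5.0, 5.0), ("other issues", 0.2, 1e30)]
upset_tummy = [("own health", 10.0, -5.0), ("loved ones", 5.0, 5.0), ("other issues", 0.2, 1e30)]

for label, terms in [("normal day", normal_day),
                     ("astronomical stakes", astronomical_stakes),
                     ("upset tummy", upset_tummy)]:
    print(f"{label}: {total_utility(terms):.4f}")
```

However large the raw input to "other issues" becomes, it moves the total by no more than 0.2, whereas a local term with a large weight can swing most of the total when it goes wrong.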

Spelling this out as "a possible description of how my 'utility function' actually works" it occurs to me to wonder how far from optimal an agent that was built to work this way would be?

...

"No one can save the world with an upset tummy!"

Compare and contrast from Adam Smith's Theory of Moral Sentiments: "And when all this fine philosophy was over, when all these humane sentiments had been once fairly expressed, he would pursue his business or his pleasure, take his repose or his diversion, with the same ease and tranquillity, as if no such accident had happened. The most frivolous disaster which could befall himself would occasion a more real disturbance. If he was to lose his little finger to-morrow, he would not sleep to-night; but, provided he never saw them, he will snore with the most profound security over the ruin of a hundred millions of his brethren, and the destruction of that immense multitude seems plainly an object less interesting to him, than this paltry misfortune of his own."

moridinamael15y00

I think the use of both DALYs and dollars in the main article is worth talking about, in context of some of the things you have mentioned. Being a stupid human, I find that it is generally useful for me to express utility to myself in dollars, because I possess a pragmatic faculty for thinking about dollars. I might not bend over to pick up one dollar. I might spend a couple of hours working for $100. There isn't much difference between one billion and two billion dollars, from my current perspective.

When you ask me how many dollars I would spend to av...

7CarlShulman15y

I largely buy the framework of this comment, as I've said elsewhere. It does still leave the question of how to go about "being a good citizen n'stuff" with the limited portion of your efforts you want to invest in doing so. Most of multifoliaterose's questions could be reframed in those terms.

5multifoliaterose15y

Thanks for your thoughtful comment. I agree that it's unclear that it makes sense to talk about humans having utility functions; my use of the term was more a manner of speaking than anything else. It sounds like you're going with something like Counterargument #5, with something like the Dunbar number determining the point at which your concern for others caps off, augmented by some desire to "be a good citizen n'stuff". Something similar may be true of me, but I'm not sure. I know that I derive a lot of satisfaction from feeling like I'm making the world a better place and am uncomfortable with the idea that I don't care about people who I don't know (in light of my abstract belief in space and time independence of moral value); but maybe the intensity of the relevant feelings is sufficiently diminished when the magnitude of uncertainty becomes huge so that other interests predominate. I feel like if I could prove that course X maximizes expected utility then my interest in pursuing course X would increase dramatically (independently of how small the probabilities are and of the possibility of doing more harm than good) but that having a distinctive sense that I'll probably change my mind about whether pursuing course X was a good idea significantly decreases my interest in pursuing course X. I find it difficult to determine whether this reflects my "utility function" or whether there's a logical argument coming from utilitarianism against pursuing courses that one will probably regret (e.g. probable burnout and disillusionment repelling potentially utilitarian bystanders). Great Adam Smith quotation; I've seen it before, but it's good to have a reference.
