I don't know if this solves very much. As you say, if we use the number 1, then we shouldn't wear seatbelts, get fire insurance, or eat healthy to avoid getting cancer, since all of those can be classified as Pascal's Muggings. But if we start going for less than one, then we're just defining away Pascal's Mugging by fiat, saying "this is the level at which I am willing to stop worrying about this".
Also, as some people elsewhere in the comments have pointed out, this makes probability non-additive in an awkward sort of way. Suppose that eating unhealthy increases your risk of each of one million different diseases by one in a million. Suppose also that eating healthy is a mildly unpleasant sacrifice, but getting a disease is much worse. If we calculate this out disease-by-disease, each disease is a Pascal's Mugging and we should choose to eat unhealthy. But if we calculate it out for the broad category of "getting some disease or other", then our chances are quite high and we should eat healthy. It's very strange that our ontology/categorization scheme should affect our decision-making like this, and it becomes much more dangerous when we start talking about AIs.
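The aggregation effect is easy to check numerically. A quick sketch, using the one-in-a-million and one-million figures from the example (and assuming the disease risks are independent):

```python
# Per-disease risk is tiny -- below any plausible "ignore it" threshold --
# but the aggregate risk of "some disease or other" is large.
n_diseases = 1_000_000
p_each = 1e-6

# Probability of at least one disease, assuming independence:
p_any = 1 - (1 - p_each) ** n_diseases
print(f"per-disease risk:     {p_each}")
print(f"risk of some disease: {p_any:.3f}")  # ~0.632, i.e. about 1 - 1/e
```

So whether you treat this as a million negligible risks or one ~63% risk depends entirely on how you carve up the event space, which is the categorization problem above.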
Also, does this create weird nonlinear thresholds? For example, suppose that you live 80 years on average. If some event that causes you near-infinite disutility happens once every 80.01 years, you should ignore it; if it happens once every 79.99 years, then preventing it becomes the entire focus of your existence. But it seems nonsensical for your behavior to change so drastically based on whether an event happens every 79.99 years or every 80.01 years.
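The discontinuity can be made concrete. A minimal sketch, where I've assumed the policy is "ignore any event expected to occur less than once per lifetime" (the exact cutoff rule is my stand-in for the proposal, not something from the original):

```python
LIFETIME = 80.0  # average lifespan in years, from the example

def should_prevent(years_between_events: float) -> bool:
    # Hypothetical hard-threshold policy: act only if the event is
    # expected at least once within an average lifetime.
    expected_occurrences = LIFETIME / years_between_events
    return expected_occurrences >= 1.0

print(should_prevent(79.99))  # True: entire focus of your existence
print(should_prevent(80.01))  # False: ignore it completely
```

An arbitrarily small change in the event's frequency flips the decision from "ignore" to "all-consuming priority", with nothing in between.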
Also, a world where people follow this plan is a world where I make a killing on the Inverse Lottery (rules: 10,000 people take tickets; each ticket holder gets paid $1, except one randomly chosen "winner", who must pay $20,000).
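The arithmetic behind the killing, as a sketch using the rules above (the "round tiny probabilities to zero" policy is the one under discussion; any cutoff above 1/10,000 produces the same behavior):

```python
n_tickets = 10_000
payout = 1         # every ticket holder gets paid $1...
penalty = 20_000   # ...except the one "winner", who pays $20,000

# True expected value of a ticket: slightly negative.
ev = ((n_tickets - 1) * payout - penalty) / n_tickets
print(f"true EV per ticket: {ev}")  # -1.0001 dollars

# Under the small-probability-discounting policy, the 1-in-10,000
# chance of losing $20,000 gets rounded to zero, so a ticket looks
# like a guaranteed +$1 and everyone takes one. The operator then
# collects the penalty and pockets the difference.
operator_profit = penalty - (n_tickets - 1) * payout
print(f"operator profit per round: {operator_profit}")  # 10001 dollars
```

Each ticket is worth about negative a dollar, but to anyone who discards 1-in-10,000 risks it looks strictly positive, so the operator profits on every round.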
I like this because it's something to point to when arguing with somebody with an obvious bias toward anthropomorphizing the agents.
You show them a model like this, then you say, "Oh, the agent can reduce its movement penalty if it first consumes this other orange glowing box. The orange glowing box in this case is 'humanity' but the agent doesn't care."
edit: I don't normally care about downvotes, but my model of LW does not predict 4 downvotes for this post. Am I missing something?
I was also surprised to see your comment downvoted.
That said, I don't think I see the value of the thing you proposed saying, since the framing of reducing the movement penalty by consuming an orange box which represents humanity doesn't seem clarifying.
Why does consuming the box reduce the movement penalty? Is it because, outside of the analogy, in reality humanity could slow down or get in the way of the AI? Then why not just say that?
I wouldn't have given you a downvote for it, but maybe others also thought your analogy seemed forced and are just harsher critics than I am.