philh comments on Open thread, Mar. 2 - Mar. 8, 2015 - Less Wrong Discussion

4 Post author: MrMind 02 March 2015 08:19AM

Comment author: Houshalter 02 March 2015 11:00:47AM 0 points

In Pascal's Mugging, the problem seems to be the use of expected values, which are highly distorted by even a single outlier.

The post led to a huge number of proposed solutions. Most seem pretty bad, and address only the specific thought experiment rather than the problem itself. Others, like bounding the utility function, are OK but not really elegant. We don't really want to disregard high utility futures, we just don't want them to highly distort our decision process. But if we make decisions based on expected utility, they inevitably do.

So why is it taken as a given that we decide based on expected utility? Why not "median expected utility"? That is, if you look at the space of all possible outcomes, and select the point where exactly 50% of them are better, and exactly 50% are worse. Choose actions so that this median future is the best.
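To make the contrast concrete, here is a minimal sketch of the two decision rules applied to a Mugging-style gamble. The function names and all of the numbers (a cost of 5, a payoff of 1e15, a probability of 1e-9) are made up for illustration and are not part of the original thought experiment.

```python
def expected_value(outcomes):
    """Probability-weighted mean of (utility, probability) pairs."""
    return sum(u * p for u, p in outcomes)


def median_value(outcomes):
    """Smallest utility u such that P(utility <= u) >= 0.5."""
    cumulative = 0.0
    for u, p in sorted(outcomes):
        cumulative += p
        if cumulative >= 0.5:
            return u
    raise ValueError("probabilities should sum to 1")


# Hypothetical numbers: paying costs 5 utility, and with a tiny
# probability the mugger delivers an enormous reward.
pay = [(-5, 1 - 1e-9), (1e15, 1e-9)]
refuse = [(0, 1.0)]

for rule in (expected_value, median_value):
    choice = "pay" if rule(pay) > rule(refuse) else "refuse"
    print(rule.__name__, "->", choice)
```

Under these assumed numbers the expected-value rule pays, because the single outlier dominates the sum, while the median rule refuses, because the outlier never reaches the 50% mark.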

I'm not certain that this would generate consistent behavior, although you could possibly fix that by making it self-referential. That is, predetermine your future actions now so that they lead to the future you desire, or modify your decision making algorithm to the same effect.

I'm more concerned that there are also weird edge cases where this doesn't line up with our own decision making. It solves the outlier problem by giving outliers absolutely zero weight. If you had the choice to buy a one dollar lottery ticket with a 20% chance of giving you millions, you would pass it up. (Although, if you expected to encounter many such opportunities in the future, you would predetermine yourself to take them, but only up to a certain point. And this intuitively seems to me the sort of reasoning humans use when they choose to obey expected utility calculations.) The same goes for avoiding large risks: any risk with a probability below 50% gets ignored.
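The lottery case, and the way it changes once you commit to many tickets in advance, can be worked through exactly. This is only a sketch: the prize of 1,000,000 stands in for "millions", and the helper names are hypothetical. It treats a policy of buying k independent tickets as a single gamble and takes the median of the total profit.

```python
from math import comb


def median_wins(k, p):
    """Smallest w such that P(wins <= w) >= 0.5 for Binomial(k, p)."""
    cumulative = 0.0
    for w in range(k + 1):
        cumulative += comb(k, w) * p**w * (1 - p) ** (k - w)
        if cumulative >= 0.5:
            return w
    return k


def median_profit(k, p=0.2, prize=1_000_000, price=1):
    """Median total profit from committing to k independent tickets."""
    return median_wins(k, p) * prize - k * price


for k in (1, 3, 4):
    print(k, "tickets: median profit", median_profit(k))
```

With a 20% chance per ticket, the chance of at least one win first exceeds 50% at four tickets (1 - 0.8^4 ≈ 0.59), so that is where the median profit of the precommitted policy turns positive under these assumptions.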

But not all is lost. There was never any a priori reason to believe that was the ideal human decision algorithm either. There are an infinite number of possible algorithms for converting a distribution to a single value. Granted, most of them aren't elegant like these, but who says humans are?

We should expect this from evolution. Not just because evolution is messy, but because any creature that actually followed expected utility calculations in extreme cases would almost certainly die. The best strategy would be to follow them in everyday circumstances but break from them in the extremes.

The point is just that the utility function isn't the only thing we need to worry about. I think refusing to pay the Mugger, or declining to worship the Christian God, are perfectly valid options, even if you really have an unbounded utility function and non-balancing priors. And most likely we will be fine if we do that.

Comment author: philh 02 March 2015 11:15:43AM 6 points

"That is, if you look at the space of all possible outcomes, and select the point where exactly 50% of them are better, and exactly 50% are worse. Choose actions so that this median future is the best."

This seems vulnerable to the following bet: I roll a d6. If I roll 3+, I give you a dollar. Otherwise I shoot you.
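The bet can be checked by enumerating the six die faces. In this sketch the utility assigned to being shot is an arbitrary large negative number chosen for illustration, and declining the bet is assumed to be worth 0.

```python
import statistics

SHOT = -1_000_000_000  # hypothetical utility of being shot
DECLINE = 0            # assumed utility of not taking the bet

# One entry per equally likely face: 1-2 means shot, 3-6 means a dollar.
outcomes = [SHOT if roll < 3 else 1 for roll in range(1, 7)]

median_outcome = statistics.median(outcomes)
mean_outcome = statistics.mean(outcomes)

print("median rule accepts:", median_outcome > DECLINE)
print("mean rule accepts:", mean_outcome > DECLINE)
```

Four of the six faces pay a dollar, so the median outcome is the dollar and a pure median rule accepts the bet, whereas the mean is dominated by the one-in-three chance of being shot.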

Comment author: Houshalter 02 March 2015 11:32:51AM * 2 points

I mention that vulnerability further down. Obviously it doesn't fit human decision making either, but I think it's qualitatively closer.

An example of an algorithm that's closer to the desired behavior would be to sample n counterfactuals from your probability distribution and take the average of these n outcomes. Then take the median of that average over all the ways the sampling could come out, i.e. the value such that 50% of the time the average of the n outcomes is higher, and 50% of the time it's lower.

As n approaches infinity this becomes equivalent to expected utility, and at n = 1 it is exactly median expected utility. A reasonable value is probably a few hundred, so that you select outcomes where you come out ahead the vast majority of the time, but still take low probability risks or ignore low probability rewards.
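A rough Monte Carlo sketch of this rule follows. The function name `median_of_means`, the number of trials, the fixed seed, and the utilities in the two example gambles are all assumptions made for illustration; the estimate is approximate, not exact.

```python
import random
import statistics


def median_of_means(sample_outcome, n, trials=2001, rng=None):
    """Estimate the median, over repeated draws, of the mean of n outcomes."""
    rng = rng or random.Random(0)  # fixed seed so runs are repeatable
    means = [
        statistics.fmean([sample_outcome(rng) for _ in range(n)])
        for _ in range(trials)
    ]
    return statistics.median(means)


def dice_bet(rng):
    # Roll 3+ on a d6: gain a dollar. Otherwise: shot (hypothetical utility).
    return 1 if rng.randint(1, 6) >= 3 else -1_000_000_000


def mugging(rng):
    # Pay 5 for a one-in-a-billion chance at a huge reward (hypothetical).
    return 1e15 if rng.random() < 1e-9 else -5


print("dice bet, n=1:  ", median_of_means(dice_bet, 1))
print("dice bet, n=100:", median_of_means(dice_bet, 100))
print("mugging,  n=200:", median_of_means(mugging, 200, trials=501))
```

With n = 1 the dice bet still looks attractive, since this is just the plain median. With n = 100 nearly every batch of samples contains at least one shooting, so the value turns sharply negative and the bet is rejected. The one-in-a-billion payoff in the mugging almost never shows up in a batch of 200 samples, so it is still ignored.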