
halcyon comments on Open Thread Feb 22 - Feb 28, 2016 - Less Wrong Discussion

Post author: Elo 21 February 2016 09:14PM


Comment author: Kaj_Sotala 26 February 2016 05:39:02PM, 1 point

From the CEV paper:

Different classes of satisfactory initial definitions may fall into different self-consistent attractors for optimal definitions of volition. Or they may all converge to essentially the same endpoint. A CEV might survey the “space” of initial dynamics and self-consistent final dynamics, looking to see if one alternative obviously stands out as best; extrapolating the opinions humane philosophers might have of that space. But if there are multiple, self-consistent, satisficing endpoints, each of them optimal under their own criterion—okay. Whatever. As long as we end up in a Nice Place to Live.

And yes, the programmers’ choices may have a huge impact on the ultimate destiny of the human species. Or a bird, chirping in the programmers’ window. Or a science fiction novel, or a few lines spoken by a character in an anime, or a webcomic. Life is chaotic, small things have large effects. So it goes.

Which you could sum up as "CEV doesn't get around that problem; it treats it as irrelevant. The point isn't to find a particular good solution that's unique and totally non-arbitrary, it's just to find even one of the good solutions. If arbitrary reasons shift us from Good World #4 to Good World #36, who cares, as long as they both really are good worlds."

Comment author: halcyon 28 February 2016 03:21:21PM, 1 point

The real difficulty is that when you combine two sets of preferences, each of which makes sense on its own, you can get an aggregate set of preferences that is not coherent at all (for example, an intransitive one): http://plato.stanford.edu/entries/economics/#5.2 https://www.google.com/search?q=site%3Aplato.stanford.edu+social+choice&ie=utf-8&oe=utf-8

There is no easy way to resolve this problem. Nor is there any known method that takes such an inconsistent set of preferences as input and produces a consistent set as output, such that either of the parties who contributed an original set of preferences would recognize the output as furthering any of their original goals. Cases without unanimous agreement arise so often that, in practice, there would be a large component of arbitrariness every single time CEV tried to arrive at a uniform set of preferences by extrapolating the volitions of multiple agents into the future.
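The aggregation failure being described is the classic Condorcet cycle from the social choice literature linked above. A minimal sketch (the three rankings are illustrative, not from the original comment): each of three agents holds a perfectly transitive ranking over three options, yet pairwise majority vote over them yields a cycle rather than a consistent ordering.

```python
# Three agents, each with an internally consistent (transitive) ranking
# of options A, B, C, listed from most to least preferred.
rankings = [
    ["A", "B", "C"],
    ["B", "C", "A"],
    ["C", "A", "B"],
]

def majority_prefers(x, y, rankings):
    """True if a strict majority of agents ranks x above y."""
    votes = sum(1 for r in rankings if r.index(x) < r.index(y))
    return votes > len(rankings) / 2

# Pairwise majority vote produces a cycle: A beats B, B beats C,
# yet C beats A -- so the "aggregate preference" is not an ordering at all.
for x, y in [("A", "B"), ("B", "C"), ("C", "A")]:
    print(f"majority prefers {x} over {y}: {majority_prefers(x, y, rankings)}")
```

Every individual input here is consistent; the inconsistency appears only in the aggregate, which is exactly why no simple post-processing of the combined preferences can recover something each agent would recognize as their own.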

This doesn't mean the problem is unresolvable, just that it's an AI problem in its own right. But given these difficulties, wouldn't it be better to pick whichever Nice Place to Live is safest to reach, instead of bothering with CEV? I say this because I'm not sure a Nice Place to Live can be defined in terms of CEV, i.e., as any CEV-approved output. Because of the preference aggregation problem, I'm not certain that a world that is provably CEV-abiding also provably avoids flagrant immorality. Two moral frameworks, when aggregated by a non-smart algorithm, might give rise to an immoral framework, so I'm not sure the essence of the problem is resolved just by CEV as explained in the paper.