
Wei_Dai comments on Harsanyi's Social Aggregation Theorem and what it means for CEV

21 points | Post author: AlexMennen | 05 January 2013 09:38PM



You are viewing a single comment's thread.

Comment author: Wei_Dai 07 January 2013 08:33:43PM * 2 points

> I'm okay with some agents being worse off with the FAI, if that's the kind of agents they are.

Do you see CEV as about altruism, instead of cooperation/bargaining/politics? It seems to me the latter is more relevant, since if it's just about altruism, you could use CEV<the FAI Builders> instead of CEV<humanity>. So, if you don't want anyone to have an incentive to shut down an FAI project, you need to make sure they are not made worse off by an FAI. Of course you could limit this to people who actually have the power to shut you down, but my point is that it's not entirely up to you which agents the FAI can make worse off.

> Luckily, I think people, given time to reflect and grow and learn, are not like that.

Right, this could be another way to solve the problem: show that, for the people you do have to make sure are not made worse off, their actual values (given the right definition of "actual values") are such that a VNM-rational FAI would suffice to avoid making them worse off. But even if you can do that, it might still be interesting and productive to look into why VNM-rationality doesn't seem to be "closed under bargaining".
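Here's a minimal sketch of what I mean by non-closure (my own toy construction, not anything from the post): aggregate two VNM agents with the Nash bargaining product, and the resulting group preference over lotteries violates the VNM independence axiom.

```python
# Toy sketch (my construction): group preferences built from the Nash
# bargaining product over two VNM agents are not themselves VNM-rational --
# they violate the independence axiom.
# Setup: outcomes A, B; u1(A)=1, u1(B)=0; u2(A)=0, u2(B)=1; disagreement
# point (0, 0). A lottery is fully described by p = P(A).

def nash_product(p):
    """Nash bargaining score of the lottery that yields A with prob p."""
    eu1 = p          # agent 1's expected utility
    eu2 = 1 - p      # agent 2's expected utility
    return eu1 * eu2

def mix(p, r, w=0.5):
    """VNM mixture w*p + (1-w)*r of two lotteries."""
    return w * p + (1 - w) * r

p, q, r = 0.6, 0.5, 0.1

# Unmixed comparison: the group prefers q to p ...
print(nash_product(p), nash_product(q))          # 0.24 < 0.25, so q > p

# ... but after mixing both with the same third lottery r, the preference
# reverses, which independence forbids.
print(nash_product(mix(p, r)), nash_product(mix(q, r)))  # 0.2275 > 0.21, so p' > q'
```

The reversal happens because the Nash product is nonlinear in the outcome probabilities; by Harsanyi's theorem, any Pareto-respecting VNM aggregate has to be a weighted sum of the individual utilities, and bargaining solutions generally aren't.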

Also, suppose I personally (according to my sense of altruism) do not want to make anyone among <some set of people> worse off by my actions. Depending on their actual utility functions, it seems that my preferences may not be VNM-rational. So maybe it's not safe to assume that the inputs to this process are VNM-rational either?
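To make that concrete, here's a toy sketch (my construction, with made-up utility numbers) of how a "don't make anyone in the set worse off in expectation" constraint can break the independence axiom:

```python
# Toy sketch (my construction): preferences of the form "maximize my own
# utility, but never accept a lottery that makes anyone in the protected
# set worse off in expectation" violate the VNM independence axiom.
# Outcomes are (u_self, u_other) pairs; the other person's baseline is 0.

X = (10, -1)   # great for me, slightly bad for the other person
Y = (0, 0)     # status quo
Z = (0, 1)     # neutral for me, good for the other person

def expected(lottery):
    """Expected (u_self, u_other) of a list of (prob, outcome) pairs."""
    eu_self = sum(pr * out[0] for pr, out in lottery)
    eu_other = sum(pr * out[1] for pr, out in lottery)
    return eu_self, eu_other

def prefer(l1, l2):
    """True iff l1 is preferred: acceptability first, own utility second."""
    (s1, o1), (s2, o2) = expected(l1), expected(l2)
    ok1, ok2 = o1 >= 0, o2 >= 0
    if ok1 != ok2:
        return ok1            # an acceptable lottery beats an unacceptable one
    return s1 > s2            # otherwise compare own expected utility

p = [(1.0, X)]                # unacceptable: hurts the other in expectation
q = [(1.0, Y)]                # acceptable
print(prefer(q, p))           # True: q > p

# Mix both with Z at weight 0.5: p's harm is now exactly offset, so it
# becomes acceptable and its high u_self wins -- the preference reverses.
p_mix = [(0.5, X), (0.5, Z)]  # E[u_other] = 0, acceptable; E[u_self] = 5
q_mix = [(0.5, Y), (0.5, Z)]  # acceptable; E[u_self] = 0
print(prefer(p_mix, q_mix))   # True: now p' > q', violating independence
```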

Comment author: AlexMennen 07 January 2013 09:05:49PM 2 points

Even if it's about bargaining rather than about altruism, it's still okay to have someone worse off under the FAI, so long as they would not be able to predict ahead of time that they would get the short end of the stick. It's possible for everyone to benefit in expectation by creating an AI that is willing to make some people (whose identity humans cannot predict ahead of time) worse off if it brings sufficient gain to the others.
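A tiny made-up numeric illustration of this ex-ante point:

```python
# Made-up numbers (not from the thread): an AI policy that flips a fair
# coin and makes the loser somewhat worse off can still benefit both
# agents in expectation, so both prefer it ex ante.
# Utility changes relative to no AI: winner +10, loser -1.

p_lose = 0.5
eu_change = (1 - p_lose) * 10 + p_lose * (-1)
print(eu_change)   # 4.5 > 0: each agent gains in expectation, even though
                   # one of them ends up worse off ex post
```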

Comment author: Wei_Dai 07 January 2013 10:39:36PM 1 point

I agree with this, which is why I said "worse off in expected utility" at the beginning of the thread. But I think you need "would not be able to predict ahead of time" in a fairly strong sense, namely that they would not be able to predict it even if they knew all the details of how the FAI worked. Otherwise they'd want to adopt the conditional strategy "learn more about the FAI design, and try to shut it down if I learn that I will get the short end of the stick". It seems like the easiest way to accomplish this is to design the FAI to explicitly not make certain people worse off, rather than depend on that happening as a likely side effect of other design choices.
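As a toy sketch (my construction) of why the strong sense is needed: if the published design pins down who loses, the ex-ante calculation stops being available to the agents who read it.

```python
# Toy sketch (my construction): the ex-ante guarantee above evaporates once
# an agent can condition on the published design. Suppose the design
# document itself determines which of two groups gets the short end.

import random

def expected_utility(p_i_lose):
    """A group's expected utility change given its probability of losing."""
    return (1 - p_i_lose) * 10 + p_i_lose * (-1)

# Before reading the design: each group assigns 0.5 to being the loser.
print(expected_utility(0.5))   # 4.5 > 0, so no one objects yet

# After reading the design: the identity of the loser is pinned down.
loser = random.choice(["group_0", "group_1"])  # fixed by the design details
for group in ["group_0", "group_1"]:
    p = 1.0 if group == loser else 0.0
    print(group, expected_utility(p))  # the loser sees -1 < 0 and now
                                       # prefers shutting the project down
```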