If the majority and the minority are so fundamentally different that their killing each other is not forbidden by the universal human CEV, then no.
I don't understand what you mean by "fundamentally different". You said the AI would not do anything not backed by an all-human consensus. If a majority of humanity wishes to kill a minority, obviously there won't be a consensus to stop the killing, and AI will not interfere. I prefer to live in a universe whose living AI does interfere in such a case.
On what moral grounds would it do the prevention?
Libertarianism is one moral principle that would argue for prevention. So would most varieties of utilitarianism (ignoring utility monsters and such). Again, I would prefer living with an AI hard-coded to one of those moral ideologies (though it's not ideal) over your view of CEV.
Until everybody agrees that this new AGI is not good after all. Then the original AGI will interfere and dismantle the new one (the original is still the first and the strongest).
Forever keeping this capability in reserve is most of what being a singleton means. But think of the practical implications: it has to be omnipresent, omniscient, and prevent other AIs from ever being as powerful as it is - which restricts those other AIs' abilities in many endeavors. All the while it does little good itself. So from my point of view, the main effect of successfully implementing your view of CEV may be to drastically limit the opportunities for future AIs to do good.
And yet it doesn't limit the opportunity to do evil, at least evil of the mundane death & torture kind. Unless you can explain why it would prevent even a very straightforward case like 80% of humanity voting to kill the other 20%.
But I can be sure that CEV fixes values that are based on false factual beliefs - this is a part of the definition of CEV.
But you said it would only do things that are approved by a strong human consensus. And I assure you that, to take an example, the large majority of the world's population who today believe in the supernatural will not consent to having that belief "fixed". Nor have you demonstrated that their extrapolated volition would want for them to be forcibly modified. Maybe their extrapolated volition simply doesn't value objective truth highly (because they today don't believe in the concept of objective truth, or believe that it contradicts everyday experience).
I have no idea what your FAI will do. But you can be sure that it is something about which you (and everybody) would agree, either directly or if you were more intelligent and knew more.
Yes, but I don't know what I would approve of if I were "more intelligent" (a very ill-defined term). And if you calculate that something, according to your definition of intelligence, and present me with the result, I might well reject that result even if I believe in your extrapolation process. I might well say: the future isn't predetermined. You can't calculate what I necessarily will become. You just extrapolated a creature I might become, which also happens to be more intelligent. But there's nothing in my moral system that says I should adopt the values of someone else because they are more intelligent. If I don't like the values I might say, thank you for warning me, now I shall be doubly careful not to evolve into that kind of creature! I might even choose to forgo the kind of increased intelligence that causes such an undesired change in my values.
Short version: "what I would want if I were more intelligent (according to some definition)" isn't the same as "what I will likely want in the future", because there's no reason for me to grow in intelligence (by that definition) if I suspect it would twist my values. So you can't apply the heuristic of "if I know what I'm going to think tomorrow, I might as well think it today".
~Endorses(A1, CEV<A2>) ~Endorses(A2, CEV<A1>) Endorses(A1, CEV<A1+A2>) Endorses(A2, CEV<A1+A2>)
I think you may be missing a symbol there? If not, I can't parse it... Can you spell out for me what it means to just write the last three Endorses(...) clauses one after the other?
Does it look like the most rational behavior?
It may be quite rational for everyone individually, depending on projected payoffs. Unlike a PD, starting positions aren't symmetrical and players' progress/payoffs are not visible to other players. So saying "just cooperate" doesn't immediately apply.
Wouldn't it be better for everybody to cooperate in this Prisoner's Dilemma, and do it with a credible precommitment?
How can a state or military precommit to not having a supersecret project to develop a private AGI?
And while it's beneficial for some players to join in a cooperative effort, it may well be that a situation of several competing leagues (or really big players working alone) develops and is also stable. It's all laid over the background of existing political, religious and personal enmities and rivalries - even before we come to actual disagreements over what the AI should value.
So you're OK with the FAI not interfering if they want to kill them for the "right" reasons? Such as "if we kill them, we will benefit by dividing their resources among ourselves"?
So you're saying your version of CEV will forcibly update everyone's beliefs and values to be "factual" and forbid people to believe in anything not supported by appropriate Bayesian evidence? Even if it has to modify those people by force, and even if the result is unlike the original in many respects that they and many other people value and see as identity-forming, etc.? And it will do this not because it's backed by a strong consensus of actual desires, but because post-modification there will be a strong consensus of people happy that the modification was made?
If your answer is "yes, it will do that", then I would not call your AI a Friendly one at all.
My understanding of the CEV doc differs from yours. It's not a precise or complete spec, and it looks like both readings can be justified.
The doc doesn't (on my reading) say that the extrapolated volition can totally conform to objective truth. The EV is based on an extrapolation of our existing volition, not of objective truth itself. One of the ways it extrapolates is by adding facts the original person was not aware of. But that doesn't mean it removes all non-truth or all beliefs that "aren't even wrong" from the original volition. If the original person effectively assigns 0 or 1 "non-updateable probability" to some belief, or honestly doesn't believe in objective reality, or believes in "subjective truth" of some kind, CEV is not necessarily going to "cure" them of it - especially not by force.
But as long as we're discussing your vision of CEV, I can only repeat what I said above - if it's going to modify people by force like this, I think it's unFriendly and, if it were up to me, I would not launch such an AI.
Understood. But I don't see how this partial ordering changes what I had described.
Let's say I'm A1 and you're A2. We would both prefer a mutual CEV to a CEV of the other only. But each of us would prefer even more a CEV of himself only. So each of us might try to bomb the other first if he expected to get away without retaliation. That there exists a possible compromise that is better than total defeat doesn't mean total victory wouldn't be much better than any compromise.
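To make the preference ordering concrete, here is a rough sketch in Python. The payoff numbers, the "mutual_ruin" outcome and the function names are all my own illustrative assumptions; only the ordering (own CEV > mutual CEV > other's CEV) comes from the argument above.

```python
# Hypothetical ordinal payoffs for A1 (higher is better).
# Only the ordering own > mutual > other is taken from the discussion;
# the numbers and the "mutual_ruin" outcome are assumptions for illustration.
PAYOFF = {
    "own_cev": 3,      # CEV<A1> only
    "mutual_cev": 2,   # CEV<A1+A2>
    "other_cev": 1,    # CEV<A2> only
    "mutual_ruin": 0,  # assumed result of a first strike that is retaliated against
}


def expected_strike_value(p_retaliation: float) -> float:
    """Expected payoff of striking first, given an assumed retaliation probability."""
    win = (1.0 - p_retaliation) * PAYOFF["own_cev"]
    lose = p_retaliation * PAYOFF["mutual_ruin"]
    return win + lose


def prefers_first_strike(p_retaliation: float) -> bool:
    """True if striking first beats settling for the mutual CEV."""
    return expected_strike_value(p_retaliation) > PAYOFF["mutual_cev"]


if __name__ == "__main__":
    for p in (0.0, 0.1, 0.5, 0.9):
        print(p, expected_strike_value(p), prefers_first_strike(p))
```

Under these made-up numbers, a player who expects little or no retaliation prefers to strike, while one who expects likely retaliation settles for the compromise. The existence of the compromise alone does not remove the incentive.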
If you think so you must have evidence relating to how to actually solve this problem. Otherwise they'd both look equally mysterious. So, what's your idea?
I wouldn't like it. But if the alternative is, for example, to have FAI directly enforce the values of the minority on the majority (or vice versa) - the values that would make them kill in order to satisfy/prevent - then I prefer FAI not interfering.
If the resources are so scarce that dividing them is so important that even CEVs agree on the necessity of killing, then again, I prefer humans to decide who gets them.
No. CEV does not update anyone's beliefs. It is calculated by extrapolating values in the presence of full knowledge and sufficient intelligence.
As I said elsewhere, if a person's beliefs are THAT incompatible with truth, I'm ok with ignoring their volition. Note that their CEV is undefined in this case. But I don't believe there exist such people (excluding the totally insane).
But the total loss would be correspondingly worse. PD reasoning says you should cooperate (assuming cooperation is precommittable).
Off the top of my head, adoption of total transparency for everybody of all governmental and military matters.