Is friendly AI "trivial" if the AI cannot rewire human values?

Alerus

I put "trivial" in quotes because some exceptionally large technical achievements would obviously still need to occur to get here. But suppose we had an AI with a utilitarian utility function of maximizing subjective human well-being (meaning well-being is not something as simple as the physical sensation of "pleasure", and depends on the mental facts of each person), and let us also assume the AI can model this "well" (let's say at least as well as the best of us can deduce what another person values for their well-being). Finally, we will also assume that the AI does not possess the ability to manually rewire the human brain to change what a human values. In other words, the AI's ability to manipulate another person's values is limited to what we as humans are capable of today. Given all this, is there any concern we should have about making this AI? Would it succeed in being a friendly AI?

One argument I can imagine for why this fails as a friendly AI is that the AI would wire people up to virtual reality machines. However, I don't think that works very well, because a person (except Cypher from The Matrix) wouldn't appreciate being wired into a virtual reality machine and having their autonomy forcibly removed. This means the action does not succeed in maximizing their well-being.

But I am curious to hear what arguments exist for why such an AI might still fail as a friendly AI.

Thanks for the link, I'll give it a read.

Creating new people is potentially a problem, but I'm not entirely convinced. Let me elaborate. When you say:

What you need to do is program it so that it does what people would like if they were smarter, faster, and more the people they wish they were. In other words, use CEV.

Doesn't this just restate, in different words, that the AI models human well-being and tries to maximize it? I imagine that, phrased this way, such an AI wouldn't create new people who are easier to maximize, because that isn't what humans would want. And if that's not what humans would want, doesn't that just mean it counts negatively toward their well-being, so my original definition suffices? Assuming humans don't want the AI to make new people that are simply easier to maximize, if it created a new person, all people on the earth view this negatively and their well-being drops. In fact, it may lead to humans shutting the AI down, so the AI deduces that it cannot create new people that are easier to maximize. The only possible hole I see in that is if the AI could suddenly create an enormous number of people at once.

Also, it's very hard to define what exactly constitutes "rewiring a human brain". If you make it too general, the AI can't do anything, because that would affect human brains. If you make it too specific, the AI would have some slight limitations on how exactly it messes with people's minds.

Indeed it's difficult to say precisely, that's why I used what we can do now as analogy. I can't really rewire a person's values at all except through persuasion or other such methods. Even our best neuroscientists can't do that, unless I'm ignorant of some profound advances. The most we can really do is tweak pleasure centers (which, as I stated, isn't the metric for well-being) or effectively break the brain so that the person is non-operational. But I'd argue that non-operational humans have effectively zero well-being anyway (for reasons similar to why I'd say a bug has a lower scale of well-being than a human does).

Assuming humans don't want the AI to make new people that are simply easier to maximize, if it created a new person, all people on the earth view this negatively and their well-being drops.

I'm not sure how common it is, but I at least consider total well-being to be important. The more people the better. The easier to make these people happy, the better.

Indeed it's difficult to say precisely, that's why I used what we can do now as analogy. I can't really rewire a person's values at all except through persuasion or other such methods.

An AI is much b...
