The Friendliness Problem is, at its root, about forming a workable set of values which are acceptable to society.
No, that's the special bonus round after you solve the real friendliness problem. If that were the real deal, we could just tell an AI to enforce Biblical values or the values of Queen Elizabeth II or the US Constitution or something, and although the results would probably be unpleasant they would be no worse than the many unpleasant states that have existed throughout history.
As opposed to the current problem of having a very high likelihood that the AI will kill everyone in the world.
The Friendliness problem is, at its root, about communicating values to an AI and keeping those values stable. If we tell the AI "do whatever Queen Elizabeth II wants" - which I expect would be a perfectly acceptable society to live in - the Friendliness problem is how to get the AI to properly translate that into statements like "Queen Elizabeth wants a more peaceful world" and not things more like "INCREASE LEVEL OF DOPAMINE IN QUEEN ELIZABETH'S REWARD CENTER TO 3^^^3 MOLES" or "ERROR: QUEEN ELIZABETH NOT AN OBVIOUSLY CLOSED SYSTEM, CONVERT EVERYTH...
When I imagine a Friendly AI, I imagine a hands-off benefactor who permits people to do anything they wish that won't result in harm to others.
Yeah, I like personal freedom, too, but you have to realize that this is massively, massively underspecified. What exactly constitutes "harm", and what specific mechanisms are in place to prevent it? Presumably a punch in the face is "harm"; what about an unexpected pat on the back? What about all other possible forms of physical contact that you don't know how to consider in advance? If loud verbal abuse is harm, what about polite criticism? What about all other possible ways of affecting someone via sound waves that you don't know how to consider in advance? &c., ad infinitum.
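One way to feel how underspecified "no harm to others" is: try writing the rule down as code. A toy sketch only - every name here is hypothetical, and the entire hard part hides in the one predicate nobody knows how to define:

```python
# Toy sketch: the "hands-off benefactor" rule, made explicit.
# The whole Friendliness problem is buried in causes_harm(),
# which is exactly the part we don't know how to specify.

def permitted(action):
    """The benefactor allows anything that won't harm others."""
    return not causes_harm(action)

def causes_harm(action):
    # A punch? A pat on the back? Polite criticism? All other ways of
    # affecting someone via sound waves? No definition covers all the
    # cases in advance.
    raise NotImplementedError("'harm' is massively underspecified")
```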
Does anybody envisage a Friendly AI which doesn't correspond more or less directly with their own political beliefs?
I'm starting to think this entire idea of "having political beliefs" is crazy. There are all sorts of possible forms of human social organization, which result in various outcomes for the humans involved; how am I supposed to know which one is best for people? From what I know about economics, I can point out some ...
I'm starting to think this entire idea of "having political beliefs" is crazy.
Most of my "political beliefs" are awareness of specific failures in other people's beliefs.
Which is where I think politics offers a pretty strong hint to the possibility that the Friendliness Problem has no resolution:
We can't agree on which political formations are more Friendly. That's what "Politics is the Mindkiller" is all about: our inability to come to an agreement on political matters. It's not merely a matter of the rules - which is to say, it's not a matter of the output: We can't even come to an agreement about which values should be used to form the rules.
I'm pretty sure this is a problem with human reasoning abilities, and not a problem with friendliness itself. Or in other words, I think this is only very weak evidence that friendliness is unresolvable.
They will agree on what values they have, and what the best action is relative to those values, but they still might have different values.
There are some analogies between politics and friendliness, but the differences are also worth mentioning.
In politics, you design a system which must be implemented by humans. Many systems fail because of some property of human nature. Whatever rules you give to humans, if they have incentives to act otherwise, they will. Also, humans have limited intelligence and attention, a lot of biases and hypocrisy, and their brains are not designed to work in communities with over 300 members, or to resist all the superstimuli of modern life.
If you construct a friendly AI, you don't have a problem with humans, besides the problem of extracting human values.
Politics is a harder problem than friendliness: politics is implemented with agents. Not only that, but largely self-selected agents, who are thus usually not ideal choices for implementing politics.
Friendliness is implemented (inside an agent) with non-agents you can build to task.
(edited for grammarz)
We can't agree on which political formations are more Friendly.
We also can't agree on, say, the correct theory of quantum gravity. But reality is there and it works in some particular way, which we may or may not be able to discover.
The values of a friendly AI are usually assumed to be an idealization of universal human values. More precisely: when someone makes a decision, it is because their brain performs a particular computation. To the extent that this computation is the product of a specific cognitive architecture universal to our species (and no...
That's what "Politics is the Mindkiller" is all about; our inability to come to an agreement on political matters.
In a sense, but most would not agree. I think all would agree that motivated cognition about strongly held values accounts for some of the mindkilling.
I agree with what I take as your basic point, that people have different preferences, and Friendliness, political or AI, will be a trade-off between them. But many here don't. In a sense, you and I believe they are mindkilled, but in a different way - structural commitment to an incorre...
The real political question is: should the US government invest money in creating FAI, preventing existential risks, and extending life?
There's a value, call it "weak friendliness", that I view as a prerequisite to politics: it's a function that humans already implement successfully, and is the one that says "I don't want to be wire-headed, drugged into a stupor, a victim of a nuclear winter, or see Earth turned into paperclips".
A hands-off AI overlord can prevent all of that, while still letting humanity squabble over gay rights and which religion is correct.
And, well, the whole point of an AI is that it's smarter than us, and thus has a chance of solving harder problems.
Part of the problem is the many factors involved in political issues. People explain things through their own specialty, but lack knowledge of other specialties.
Why do you restrict Strong Friendliness to human values? Is there some value which an intelligence can have that can never be a human value?
You're making the perfect the enemy of the good.
I'm fine with at least a thorough framework for Weak Friendliness. That's not gonna materialize out of nothing. There are no actual Turing Machines (infinite tape required), yet the Turing Machine is a useful model and its study yields useful results for real-world applications.
Studying Strong Friendliness is a useful activity in finding a heuristic for best-we-can-do friendliness, which is way better than nothing.
Politics as a process doesn't generate values; they're strictly an input,
Politics is partly about choosing goals/values. (E.g., do we value equality or total wealth?) It is also about choosing the means of achieving those goals. And it is also about signaling power. Most of these are not relevant to designing a future Friendly AI.
Yes, a polity is an "optimizer" in some crude sense, optimizing towards a weighted sum of the values of its members with some degree of success (a toy formalization below). Corporations and economies have also been described as optimizers. But I don't see too much similarity to AI design here.
...and no, it's not because of potential political impact on its goals. Although that's also a thing.
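To make that "weighted sum" reading concrete, here is a toy formalization - purely illustrative, with hypothetical weights, value functions, and candidate rule sets; it is not a claim about how real polities aggregate values:

```python
# Toy model: a polity as a crude optimizer that picks whichever candidate
# rule set maximizes a weighted sum of its members' value functions.

def polity_choice(members, candidates):
    """members: (weight, value_fn) pairs; value_fn scores a rule set.
    candidates: iterable of rule sets under consideration."""
    def aggregate(rules):
        return sum(w * v(rules) for (w, v) in members)
    return max(candidates, key=aggregate)

# Hypothetical example: members who weight "equality" vs "total wealth"
# differently (echoing the goals/values question above).
members = [
    (0.7, lambda r: r["equality"]),
    (0.3, lambda r: r["wealth"]),
]
candidates = [
    {"equality": 0.9, "wealth": 0.4},
    {"equality": 0.3, "wealth": 1.0},
]
print(polity_choice(members, candidates))  # -> {'equality': 0.9, 'wealth': 0.4}
```

Even in this toy form, the contested part is visible: the weights and the value functions are inputs the model cannot supply - which is the Friendliness problem again.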
The Politics Problem is, at its root, about forming a workable set of rules by which society can operate, and which society can agree on.
The Friendliness Problem is, at its root, about forming a workable set of values which are acceptable to society.
Politics as a process (I will use "politics" to refer to the process of politics henceforth) doesn't generate values; they're strictly an input, which the process converts into rules intended to maximize them. The process itself is value-agnostic; it doesn't care what the values are, or where they come from. Which is to say, provided you solve the Friendliness Problem, its solution provides a valuable input into politics.
Politics is also an intelligence. Not in the "self-aware" sense, or even in the "capable of making good judgments" sense, but in the sense of an optimization process. We're each nodes in this alien intelligence, and we form what looks, to me, suspiciously like a neural network.
The Friendliness Problem is as applicable to Politics as it is to any other intelligence. Indeed, provided we can provably solve the Friendliness Problem, we should be capable of creating Friendly Politics. Now, there are some issues with this - politics is composed of unpredictable hardware, namely, people - and it may be that this neural architecture is fundamentally incompatible with Friendliness. But that is a question about the -output- of the process. Friendliness is first an input, before it can be an output.
More, we already have various political formations, and can assess their Friendliness levels, merely in terms of the values that went -into- them.
Which is where I think politics offers a pretty strong hint to the possibility that the Friendliness Problem has no resolution:
We can't agree on which political formations are more Friendly. That's what "Politics is the Mindkiller" is all about: our inability to come to an agreement on political matters. It's not merely a matter of the rules - which is to say, it's not a matter of the output: We can't even come to an agreement about which values should be used to form the rules.
This is why I think political discussion is valuable here, incidentally. Less Wrong, by and large, has been avoiding the hard problem of Friendliness, by labeling its primary functional outlet in reality as a mindkiller, not to be discussed.
Either we can agree on what constitutes Friendly Politics, or not. If we can't, I don't see much hope of arriving at a Friendliness solution more broadly. Friendly to -whom- becomes the question, if it was ever anything else. Which suggests a division in types of Friendliness: Strong Friendliness, which is a fully generalized set of human values, acceptable to just about everyone; and Weak Friendliness, which isn't fully generalized, and is perhaps acceptable merely to a plurality. Weak Friendliness survives the political question. I do not see that Strong Friendliness can.
(Exemplified: When I imagine a Friendly AI, I imagine a hands-off benefactor who permits people to do anything they wish that won't result in harm to others. Why, look, a libertarian/libertine dictator. Does anybody envisage a Friendly AI which doesn't correspond more or less directly with their own political beliefs?)