
Why Politics Is Important to Less Wrong...

Post author: OrphanWilde 21 February 2013 04:24PM 6 points

...and no, it's not because of potential political impact on its goals.  Although that's also a thing.

The Politics problem is, at its root, about forming a workable set of rules by which society can operate, which society can agree with.

The Friendliness Problem is, at its root, about forming a workable set of values which are acceptable to society.

Politics as a process (I will use "politics" to refer to this process henceforth) doesn't generate values; values are strictly an input, by which the values of society are converted into rules intended to maximize them.  And the process is value-agnostic; it doesn't care what the values are, or where they come from.  Which is to say, provided you solve the Friendliness Problem, its solution provides a valuable input into politics.

Politics is also an intelligence.  Not in the "self-aware" sense, or even in the "capable of making good judgments" sense, but in the sense of an optimization process.  We're each nodes in this alien intelligence, and together we form what looks, to me, suspiciously like a neural network.

The Friendliness Problem is as applicable to politics as to any other intelligence.  Indeed, provided we can provably solve the Friendliness Problem, we should be capable of creating Friendly Politics.  Friendliness should, in principle, be equally applicable to both.  Now, there are some issues with this - politics runs on unpredictable hardware, namely, people.  And it may be that the neural architecture is fundamentally incompatible with Friendliness.  But that is a question about the -output- of the process.  Friendliness is first an input, before it can be an output.

What's more, we already have various political formations, and we can assess their Friendliness merely in terms of the values that went -into- them.

Which is where I think politics offers a pretty strong hint that the Friendliness Problem has no resolution:

We can't agree on which political formations are more Friendly.  That's what "Politics is the Mindkiller" is all about; our inability to come to an agreement on political matters.  It's not merely a matter of the rules - which is to say, it's not a matter of the output: We can't even come to an agreement about which values should be used to form the rules.

This is why I think political discussion is valuable here, incidentally.  Less Wrong, by and large, has been avoiding the hard problem of Friendliness, by labeling its primary functional outlet in reality as a mindkiller, not to be discussed.

Either we can agree on what constitutes Friendly Politics, or we can't.  If we can't, I don't see much hope of arriving at a Friendliness solution more broadly.  Friendly to -whom- becomes the question, if it was ever anything else.  Which suggests a division into types of Friendliness: Strong Friendliness, a fully generalized set of human values acceptable to just about everyone; and Weak Friendliness, which isn't fully generalized, and is perhaps acceptable merely to a plurality.  Weak Friendliness survives the political question.  I do not see that Strong Friendliness can.

(Exemplified: When I imagine a Friendly AI, I imagine a hands-off benefactor who permits people to do anything they wish that won't result in harm to others.  Why, look, a libertarian/libertine dictator.  Does anybody envisage a Friendly AI which doesn't correspond more or less directly with their own political beliefs?)

Comments (96)

Comment author: Adele_L 21 February 2013 04:38:54PM 10 points [-]

Which is where I think politics offers a pretty strong hint to the possibility that the Friendliness Problem has no resolution:

We can't agree on which political formations are more Friendly. That's what "Politics is the Mindkiller" is all about; our inability to come to an agreement on political matters. It's not merely a matter of the rules - which is to say, it's not a matter of the output: We can't even come to an agreement about which values should be used to form the rules.

I'm pretty sure this is a problem with human reasoning abilities, and not a problem with friendliness itself. Or in other words, I think this is only very weak evidence that friendliness is unresolvable.

Comment author: Benito 21 February 2013 05:13:15PM 3 points [-]

Indeed. If we were perfect Bayesians with unlimited introspective access, and we STILL couldn't agree after an unconscionable amount of argument and discussion, then we'd have a bigger problem.

Comment author: OrphanWilde 21 February 2013 05:25:28PM 4 points [-]

Are perfect Bayesians with unlimited introspective access more inclined to agree on matters of first principles?

I'm not sure. I've never met one, much less two.

Comment author: Plasmon 21 February 2013 05:29:53PM 1 point [-]
Comment author: Adele_L 21 February 2013 05:33:30PM 12 points [-]

They will agree on what values they have, and what the best action is relative to those values, but they still might have different values.

Comment author: Benito 22 February 2013 11:47:59PM 1 point [-]

My point exactly. Only if we are sure agents are best representing themselves can we be sure that their disagreement means their values are not the same. If an agent is unsure of zir values, or extrapolates them incorrectly, then there will be disagreement that doesn't imply different values.

With seven billion people, none of whom are best representing themselves (they certainly aren't perfect Bayesians!), we should expect massive disagreement. This is not an argument for fundamentally different values.

Comment author: OrphanWilde 21 February 2013 05:23:03PM -1 points [-]

I disagree with the first statement, but agree with the second. That is, I disagree with a certainty that the problem is with our reasoning abilities, but agree that the evidence is very weak.

Comment author: Adele_L 21 February 2013 05:24:39PM 1 point [-]

Um, I said I was "pretty sure". Not absolutely certain.

Comment author: OrphanWilde 21 February 2013 06:54:11PM 0 points [-]

Upvoted, and I'll consider it fair if you downvote my reply. Sorry about that!

Comment author: Adele_L 21 February 2013 10:24:01PM 1 point [-]

No worries!

Comment author: paper-machine 21 February 2013 08:33:01PM 1 point [-]

I'm amused that you've retracted the post in question after posting this.

Comment author: Viliam_Bur 21 February 2013 10:05:30PM *  7 points [-]

There are some analogies between politics and friendliness, but the differences are also worth mentioning.

In politics, you design a system which must be implemented by humans. Many systems fail because of some property of human nature. Whatever rules you give to humans, if they have incentives to act otherwise, they will. Also, humans have limited intelligence and attention, a lot of biases and hypocrisy, and their brains are not designed to work in communities with over 300 members, or to resist all the superstimuli of modern life.

If you construct a friendly AI, you don't have a problem with humans, besides the problem of extracting human values.

Comment author: OrphanWilde 21 February 2013 10:37:02PM 5 points [-]

I fully agree. I don't think even a perfect Friendliness theorem would suffice to make politics well and truly Friendly. Such an expectation is like expecting Friendly AI to work even while it's being bombarded with ionizing radiation (or whatever) that is randomly flipping bits in its working memory.

Comment author: ikrase 22 February 2013 11:04:54AM 2 points [-]

Actually it's worse: It's like expecting to build a Friendly AI using a computer with no debugging utilities, an undocumented program interpreter, and a text editor that has a sense of humor. You have to implement it.

Comment author: Zack_M_Davis 21 February 2013 07:31:15PM 18 points [-]

When I imagine a Friendly AI, I imagine a hands-off benefactor who permits people to do anything they wish that won't result in harm to others.

Yeah, I like personal freedom, too, but you have to realize that this is massively, massively underspecified. What exactly constitutes "harm", and what specific mechanisms are in place to prevent it? Presumably a punch in the face is "harm"; what about an unexpected pat on the back? What about all other possible forms of physical contact that you don't know how to consider in advance? If loud verbal abuse is harm, what about polite criticism? What about all other possible ways of affecting someone via sound waves that you don't know how to consider in advance? &c., ad infinitum.

Does anybody envisage a Friendly AI which doesn't correspond more or less directly with their own political beliefs?

I'm starting to think this entire idea of "having political beliefs" is crazy. There are all sorts of possible forms of human social organization, which result in various outcomes for the humans involved; how am I supposed to know which one is best for people? From what I know about economics, I can point out some reasons to believe that market-like systems have some useful properties, but that doesn't mean I should run around shouting "Yay Libertarianism Forever!" because then what happens when someone implements some form of libertarianism, and it turns out to be terrible?

Comment author: Viliam_Bur 21 February 2013 09:56:22PM 11 points [-]

I'm starting to think this entire idea of "having political beliefs" is crazy.

Most of my "political beliefs" are awareness of specific failures in other people's beliefs.

Comment author: ikrase 22 February 2013 11:02:49AM 2 points [-]

That's fairly common, and rarely realized, I think.

Comment author: Viliam_Bur 25 February 2013 08:54:15PM 1 point [-]

Fairly common among rational (I don't mean LW-style) people. But I also know people who really believe things, and it's kind of scary.

Comment author: RomeoStevens 22 February 2013 04:33:19AM 4 points [-]

All formulations of human value are massively underspecified.

I agree that expecting humans to know what sorts of things would be good for humans in general is a terrible idea. The problem is that we also can't get an honest report of what people think would be good for them personally, because lying is too useful and humans value things hypocritically.

Comment author: Vladimir_Nesov 21 February 2013 07:40:27PM *  5 points [-]

These examples also only compare things with status quo. Status quo is most likely itself "harm" when compared to many of the alternatives.

Comment author: OrphanWilde 21 February 2013 08:01:30PM 6 points [-]

There are many more ways to arrange things in a defective manner than an effective one. I'd consider deviations from the status quo to be harmful until proven otherwise.

Comment author: torekp 22 February 2013 12:19:29AM 3 points [-]

Or in other words: most mutations are harmful.

Comment author: Vladimir_Nesov 22 February 2013 12:26:34AM 1 point [-]

(Fixed the wording to better match the intended meaning: "compared to the many alternatives" -> "compared to many of the alternatives".)

Comment author: whowhowho 21 February 2013 07:48:57PM -3 points [-]

Compare:

There are all sorts of possible forms of human social organization, which result in various outcomes for the humans involved; how am I supposed to know which one is best for people?

with:

what happens when someone implements some form of libertarianism, and it turns out to be terrible?

Comment author: AlexMennen 21 February 2013 08:16:29PM 4 points [-]

It was pretty clearly a hypothetical. As in, he doesn't see enough evidence to justify high confidence that libertarianism would not be terrible, which is perfectly in line with his statement that he doesn't know which system is best.

Comment author: whowhowho 27 February 2013 12:57:31AM *  0 points [-]

It's a hypothetical about libertarianism. Other approaches have been tried, so the single data point does not generalise into anything like "no one ever has any evidential basis for choosing a political system or party". To look at it from the other extreme, someone voting in a typical democracy is typically choosing between N parties (for a small N), each of which has been in power within living memory.

Comment author: Yvain 22 February 2013 12:46:20AM *  27 points [-]

The Friendliness Problem is, at its root, about forming a workable set of values which are acceptable to society.

No, that's the special bonus round after you solve the real friendliness problem. If that were the real deal, we could just tell an AI to enforce Biblical values or the values of Queen Elizabeth II or the US Constitution or something, and although the results would probably be unpleasant they would be no worse than the many unpleasant states that have existed throughout history.

As opposed to the current problem of having a very high likelihood that the AI will kill everyone in the world.

The Friendliness problem is, at its root, about communicating values to an AI and keeping those values stable. If we tell the AI "do whatever Queen Elizabeth II wants" - which I expect would be a perfectly acceptable society to live in - the Friendliness problem is how to get the AI to properly translate that into statements like "Queen Elizabeth wants a more peaceful world" and not things more like "INCREASE LEVEL OF DOPAMINE IN QUEEN ELIZABETH'S REWARD CENTER TO 3^^^3 MOLES" or "ERROR: QUEEN ELIZABETH NOT AN OBVIOUSLY CLOSED SYSTEM, CONVERT EVERYTHING TO COMPUTRONIUM TO DEVELOP AIRTIGHT THEORY OF PERSONAL IDENTITY" or "ERROR: FUNCTION SWITCH_TASKS NOT FOUND; TILE ENTIRE UNIVERSE WITH CORGIS".

This is hard to explain in a way that doesn't sound silly at first, but Creating Friendly AI does a good job of it.

If we can get all of that right, we could start coding in a complete theory of politics. Or we could just say "AI, please develop a complete theory of politics that satisfies the criteria OrphanWilde has in his head right now" and it would do it for us, because we've solved the hard problem of cashing out human desires. The second way sounds easier.

Comment author: Eugine_Nier 22 February 2013 03:20:00AM 4 points [-]

The Friendliness problem is, at its root, about communicating values to an AI and keeping those values stable. If we tell the AI "do whatever Queen Elizabeth II wants" - which I expect would be a perfectly acceptable society to live in

That depends on whether we mean 2013!Queen Elizabeth II or Queen Elizabeth after the resulting power goes to her head.

Comment author: OrphanWilde 22 February 2013 02:25:07PM 0 points [-]

I don't think you get the same thing from that document that I do. (Incidentally, I disagree with a lot of the design decisions inherent in that document, such as self-modifying AI, which I regard as inherently and uncorrectably dangerous. When you stop expecting the AI to make itself better, the "Keep your ethics stable across iterations" part of the problem goes away.)

Either that or I'm misunderstanding you. Because my current understanding of your view of the Friendliness problem has less to do with codifying and programming ethics and more to do with teaching the AI to know exactly what we mean and not to misinterpret what we ask for. (Which I hope you'll forgive me if I call "Magical thinking." That's not necessarily a disparagement; sufficiently advanced technology and all that. I just think it's not feasible in the foreseeable future, and such an AI makes a poor target for us as we exist today.)

Comment author: Luke_A_Somers 21 February 2013 08:44:24PM *  5 points [-]

Politics is a harder problem than friendliness: politics is implemented with agents. Not only that, but largely self-selected agents who are thus usually not the ideal selections for implementing politics.

Friendliness is implemented (inside an agent) with non-agents you can build to task.

(edited for grammarz)

Comment author: OrphanWilde 21 February 2013 10:23:08PM 0 points [-]

Friendliness can only be implemented after you've solved the problem of what, exactly, you're implementing.

Comment author: Luke_A_Somers 22 February 2013 04:06:48AM 1 point [-]

Right, but the point is you don't need to get everyone to agree what's right (there's always going to be someone out there who's going to hate it no matter what you do). You just need it to actually be friendly... and, as hard as that is, at least you don't have to work with only corrupted hardware.

Comment author: buybuydandavis 22 February 2013 11:09:23AM 2 points [-]

That's what "Politics is the Mindkiller" is all about; our inability to come to an agreement on political matters.

In a sense, but most would not agree. I think all would agree that motivated cognition on strongly held values makes for some of the mindkilling.

I agree with what I take as your basic point: that people have different preferences, and Friendliness, political or AI, will be a trade-off between them. But many here don't. In a sense, you and I believe they are mindkilled, but in a different way - a structural commitment to an incorrect model that says there is One Right Answer. For you and me, that isn't the answer. Politics isn't a search for truth; it's an assertion of preferences, trying to persuade others to do what you want them to do.

Comment author: Mitchell_Porter 21 February 2013 10:19:12PM 3 points [-]

We can't agree on which political formations are more Friendly.

We also can't agree on, say, the correct theory of quantum gravity. But reality is there and it works in some particular way, which we may or may not be able to discover.

The values of a friendly AI are usually assumed to be an idealization of universal human values. More precisely: when someone makes a decision, it is because their brain performs a particular computation. To the extent that this computation is the product of a specific cognitive architecture universal to our species (and not just the contingencies of their life), we could speak of "the human decision procedure", an unknown universal algorithm of decision-making implicit in how our brains are organized.

This human decision procedure includes a method of generating preferences - preferring one possibility over another. So we can "ask" the human decision procedure "what would be the best decision procedure for humans to follow?" This produces an idealized decision procedure: a human ideal for how humans should be. That idealized decision procedure is what human ethics has been struggling towards, and that is where a friendly AI should get its values, and perhaps its methods, from.

It may seem that I am assuming rather a lot about how human decision-making cognition works, but what I just described is the simplest version of the idea. There may be multiple identifiable decision procedures in the human gene pool; the genetically determined part of the human decision procedure may be largely a template with values set by experience and culture; there may be multiple conflicting equilibria at the end of the idealization process, depending on how it starts.

For example, egoism and altruism may be different computational attractors, both a possible end result of reflective idealization of the human decision procedure; in which case a "politicization" of the value-setting process is certainly possible - a struggle over initial conditions. Or it may be that once you really know how humans think - as opposed to just guessing on the basis of folk psychology and very incomplete scientific knowledge - it's apparent that this is a false opposition.

Either way, what I'm trying to convey here is a particular spirit of approach to the problem of values in friendly AI: that the answers should come from a scientific study of how humans actually think, that the true ideals and priorities of human beings are to be found by a study of the computational particulars of human thought, and that all our ideologies and moralities are just a flawed attempt by this computational process to ascertain its own nature.

Comment author: OrphanWilde 21 February 2013 10:35:20PM 1 point [-]

If such an idealization exists, that would of course be preferable.

I suspect it doesn't, which may color my position here, but I think it's important to consider the alternatives if there isn't a generalizable ideal. Specifically, we should work from the opposing end and try to generalize from the specific instances; even if we can't arrive at Strong Friendliness (the fully generalized ideal of human morality), we might still be able to arrive at Weak Friendliness (some generalized ideal that is at least acceptable to a majority of people).

Because the alternative for those of us who aren't neurologists, as far as I can tell, is to wait.

Comment author: JoshuaFox 21 February 2013 07:03:19PM 1 point [-]

Politics as a process doesn't generate values; they're strictly an input,

Politics is partly about choosing goals/values. (E.g., do we value equality or total wealth?) It is also about choosing the means of achieving those goals. And it is also about signaling power. Most of these are not relevant to designing a future Friendly AI.

Yes, a polity is an "optimizer" in some crude sense, optimizing towards a weighted sum of the values of its members with some degree of success. Corporations and economies have also been described as optimizers. But I don't see too much similarity to AI design here.

Comment author: [deleted] 21 February 2013 09:41:25PM 2 points [-]

Deciding what we value isn't relevant to friendliness? Could you explain that to me?

Comment author: Larks 22 February 2013 10:18:10AM 2 points [-]

The whole point of CEV is that we give the AI an algorithm for educing our values, and let it run. At no point do we try to work them out ourselves.

Comment author: [deleted] 25 February 2013 10:00:09PM *  0 points [-]

I mentally responded to you and forgot to, you know, actually respond.

I'm a bit confused by this and since it was upvoted I'm less sure I get CEV....

It might clear things up to point out that I'm making a distinction between goals or preferences vs. values. CEV could be summarized as "fulfill our ideal rather than actual preferences", yeah? As in, we could be empirically wrong about what would maximize the things we care about, since we can't really be wrong about what to care about. So I imagine the AI needing to be programmed with our values- the meta wants that motivate our current preferences- and it would extrapolate from them to come up with better preferences, or at least it seems that way to me. Or does the AI figure that out too somehow? If so, what does an algorithm that figures out our preferences and our values contain?

Comment author: Larks 26 February 2013 10:43:28AM 2 points [-]

Ha, yes, I often do that.

The motivation behind CEV also includes the idea that we might be wrong about what we care about. Instead, you give your FAI an algorithm for

  • Locating people
  • Working out what they care about
  • Working out what they would care about if they knew more, etc.
  • Combining these preferences

I'm not sure what distinction you're trying to draw between values and preferences (perhaps a moral vs non-moral one?), but I don't think it's relevant to CEV as currently envisioned.
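The steps listed above read naturally as a pipeline. A toy sketch, purely to illustrate the structure (the helper functions here are hypothetical stand-ins for the genuinely hard sub-problems of elicitation, extrapolation, and aggregation - nothing in the CEV literature specifies them this way):

```python
from typing import Callable, Dict, List

Person = str
Preferences = Dict[str, float]  # option -> strength of preference

def extrapolate(prefs: Preferences) -> Preferences:
    """Stand-in for 'what they would care about if they knew more'.
    Here it just returns the stated preferences unchanged."""
    return dict(prefs)

def combine(all_prefs: List[Preferences]) -> Preferences:
    """Stand-in for 'combining these preferences': a simple sum,
    one of many possible (and contested) aggregation rules."""
    combined: Preferences = {}
    for prefs in all_prefs:
        for option, weight in prefs.items():
            combined[option] = combined.get(option, 0.0) + weight
    return combined

def cev(people: List[Person],
        elicit: Callable[[Person], Preferences]) -> Preferences:
    # 1. Locate people (here: given as a list)
    # 2. Work out what they care about (elicit)
    # 3. Work out what they'd care about if they knew more (extrapolate)
    # 4. Combine the extrapolated preferences
    return combine([extrapolate(elicit(p)) for p in people])

prefs = {"alice": {"peace": 0.9, "wealth": 0.4},
         "bob": {"peace": 0.6, "wealth": 0.8}}
print(cev(["alice", "bob"], prefs.get))
```

The point of the sketch is where the difficulty lives: every disagreement in this thread about values vs. preferences is hidden inside `elicit`, `extrapolate`, and `combine`.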

Comment author: JoshuaFox 22 February 2013 11:48:45AM 1 point [-]

Actually, when I said "most" in "most of these are not relevant to designing a future Friendly AI," I was thinking that values are the exception, they are relevant.

Comment author: [deleted] 22 February 2013 08:52:51PM 0 points [-]

Oh. Then yeah ok I think I agree.

Comment author: turchin 22 February 2013 06:37:23AM 1 point [-]

The real political question is: should the US government invest money in creating FAI, preventing existential risks, and extending life?

Comment author: NancyLebovitz 23 February 2013 07:13:20PM 1 point [-]

Why just the US government?

Comment author: turchin 24 February 2013 05:39:44AM 0 points [-]

Of course, not only the US government, but the governments of all other countries which have the potential to influence AI research and existential risks. For example, North Korea could play an important role in existential risks, as it is said to be developing smallpox bioweapons. In my opinion, we need a global government to address existential risks, and an AI which takes over the world would be a form of global government. I was routinely downvoted for such posts and comments on LW, so it's probably not an appropriate place to discuss these issues.

Comment author: NancyLebovitz 24 February 2013 01:02:11PM 0 points [-]

Smallpox isn't an existential risk-- existential risks affect the continuation of the human race. So far as I know, the big ones are UFAI and asteroid strike.

I don't know of classifications for very serious but smaller risks.

Comment author: turchin 24 February 2013 09:55:29PM 1 point [-]

Look, common smallpox is not an existential risk, but biological weapons could be if they were specially designed to be one. The simplest way to do it is the simultaneous use of many different pathogens. If we have 10 viruses, each with 50 percent mortality, that would mean a roughly 1000-fold reduction of the human population, and the last few million people would be very scattered and unadapted, so they could continue on to extinction. North Korea is said to be developing 8 different bioweapons, but with the progress of biotechnology it could be hundreds. But my main idea here was not a classification of existential risks; it was to address the idea that preventing them is a question of global politics - or at least it should be, if we want to survive.
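The roughly 1000-fold figure quoted above does follow from the stated numbers, under a toy model that assumes every person is exposed to every pathogen and the mortalities are independent (real epidemics would satisfy neither assumption):

```python
# 10 independent pathogens, each killing 50% of those exposed:
# the surviving fraction halves with each pathogen.
pathogens = 10
mortality = 0.5

survival_fraction = (1 - mortality) ** pathogens
print(survival_fraction)  # 0.0009765625, i.e. 1/1024 -- about a 1000-fold reduction

survivors = 7_000_000_000 * survival_fraction
print(survivors)  # about 6.8 million people left out of 7 billion
```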

Comment author: OrphanWilde 26 February 2013 06:42:27PM 2 points [-]

Infectious agents with high mortality rates tend to weed themselves out of the population. There's a sweet spot for infectious disease: prolific enough to pass itself on, not so prolific as to kill the host before it gets the opportunity. Additionally, there's a strong negative feedback on particularly nasty diseases in the form of quarantine.

A much bigger risk to my mind actually comes from healthcare, which can push that sweet spot further into the "mortal peril" section. Healthcare provokes an arms race with infectious agents; the better we are at treating disease and keeping it from killing people, the more dangerous an infectious agent can be and still successfully propagate.

Comment author: handoflixue 21 February 2013 09:18:15PM 1 point [-]

There's a value, call it "weak friendliness", that I view as a prerequisite to politics: it's a function that humans already implement successfully, and it's the one that says "I don't want to be wire-headed, drugged into a stupor, the victim of a nuclear winter, or to see Earth turned into paperclips".

A hands-off AI overlord can prevent all of that, while still letting humanity squabble over gay rights and which religion is correct.

And, well, the whole point of an AI is that it's smarter than us, and thus has a chance of solving harder problems.

Comment author: TimS 22 February 2013 12:55:06AM 2 points [-]

[weak friendliness is] a function that humans already implement successfully

I'm not sure this is true in any useful sense. Louis XIV probably agrees with me that "I don't want to be wire-headed, drugged in to a stupor, victim of a nuclear winter, or see Earth turned in to paperclips."

But I think it is pretty clear that the Sun King was not implementing my moral preferences, and I am not implementing his. Either one of us is not "weak friendly", or "weak friendly" is barely powerful enough to answer really easy moral questions like "should I commit mass murder for no reason at all?" (Hint: no).

If weak friendly morality is really that weak, then I have no confidence that a weak-FAI would be able to make a strong-FAI, or even would want to. In other words, I suspect that what most people mean by weak friendly is highly generalized applause lights that widely diverging values could agree with without any actual agreement on which actions are more moral.

Comment author: RomeoStevens 22 February 2013 04:35:53AM 0 points [-]

I think a lower bound on weak friendliness is whether or not entities living within the society consider their lives worthwhile. Of course this opens up debate about house elves and such but it's a useful starting point.

Comment author: Document 22 February 2013 04:33:43PM *  1 point [-]

That (along with this semi-recent exchange) reminds me of a stupid idea I had for a group decision process a while back.

  • Party A dislikes the status quo. To change it, they declare to the sysop that they would rather die than accept it.
  • The sysop accepts this and publicly announces a provisionally scheduled change.
  • Party B objects to the change and declares that they'd rather die than accept A's change.
  • If neither party backs down, a coin is flipped and the "winner" is asked to kill the loser in order for their preference to be realized; face-to-face to make it as difficult as possible, thereby maximizing the chances of one party or the other backing down.
  • If the parties consist of multiple individuals, the estimated weakest-willed person on the majority side has to kill (or convince to forfeit) the weakest person on the minority side; then the next-weakest, until the minority side is eliminated. If they can't or won't, then they're out of the fight, and replaced with the next-weakest person, et cetera until the minority is eliminated or the majority becomes the minority.

Basically, formalized war, only done in the opposite way of the strawman version in A Taste of Armageddon; making actual killing more difficult rather than easier.

A few reasons it's stupid:

  • People will tolerate conditions much worse than death (for themselves, or for others unable to self-advocate) rather than violate the taboo against killing or against "threatening" suicide.
  • The system may make bad social organizations worse by removing the most socially enlightened and active people first.
  • People have values outside themselves, so they'll stay alive and try to work for change rather than dying pointlessly and leaving things to presumably get worse and worse from their perspective.
  • Prompting people to kill or die for their values will galvanize them and make reconciliation less likely.
  • Real policy questions aren't binary, and how a question is framed or what order questions are considered in will probably strongly affect the outcome and who lives or dies, which will further affect future outcomes.
  • A side might win after initially taking casualties, or even be vindicated a long time after their initial battle. They'd want their people back, but keeping backups of people killed in a battle would make "killing" them much easier psychologically. It might also put them at risk of being restored in a dystopia that no longer respects their right to die. (Of course, people might still be reconstructed from records and others' memories even if they weren't stored anywhere in their entirety.)
  • The system assumes that there's a well-defined notion of an individual by which groups can be counted, and that individuals can't be created at will to try to outnumber opponents (possibly relevant: 1, 2, 3, 4).
  • People will immediately reject the system, so the first thing anyone "votes" for will be to abolish it, regardless of how much worse the result might be.
  • If there's an afterlife (i.e. simulation hypothesis), we might just be passing the buck.
  • I'm not sure it's a good idea to even public(al)ly discuss things like this.
Comment author: Document 22 February 2013 08:37:16PM *  0 points [-]

Actually, I think I'm now remembering a better (or better-sounding) idea that occurred to me later: rather than something as extreme as deletion, let people "vote" by agreeing to be deinstantiated, giving up the resources that would have been spent instantiating them. It might be essentially the same as death if they stayed that way til the end of the universe, but it wouldn't be as ugly. Maybe they could be periodically awakened if someone wants to try to persuade them to change or withdraw their vote.

That would hopefully keep people from voting selfishly or without thorough consideration. On the other hand, it might insulate them from the consequences of poor policies.

Also, how to count votes is still a problem; where would "the resources that would have been spent instantiating them" come from? Is this a socialist world where everyone is entitled to a certain income, and if so, what happens when population outstrips resources? Or, in a laissez-faire world where people can run out of money and be deinstantiated, the idea amounts to plain old selling of votes to the rich<strike>, like we have now</strike>.

Basically, both my ideas seem to require a eutopia already in place, or at least a genuine 100% monopoly on force. I think that might be my point. Or maybe it's that a simple-sounding, socially acceptable idea like "If someone would rather die than tolerate the status quo, that's bad, and the status quo should be changed" isn't socially acceptable once you actually go into details and/or strip away the human assumptions.

Comment author: RomeoStevens 22 February 2013 08:30:24PM 0 points [-]

Can this be set up in a round robin fashion with sets of mutually exclusive values such that everyone who is willing to kill for their values kills each other?

Comment author: Document 22 February 2013 08:44:18PM 0 points [-]

Maybe if the winning side's values mandated their own deaths. But then it would be pointless for the sysop to respond to their threat of suicide to begin with, so I don't know. I'm not sure if there's something you're getting at that I'm not seeing.

Comment author: OrphanWilde 26 February 2013 06:45:39PM 0 points [-]

"I'm not going to live there. There's no place for me there... any more than there is for you. Malcolm... I'm a monster.What I do is evil. I have no illusions about it, but it must be done. "

  • The Operative, from Serenity. (On the off-chance that somebody isn't familiar with that quote.)
Comment author: RomeoStevens 22 February 2013 09:31:56PM *  0 points [-]

I'm thinking if you do the matchups correctly you only wind up with one such person at the end, whom all the others secretly precommit to killing.

...maybe this shouldn't be discussed publicly.

Comment author: Document 22 February 2013 10:14:12PM 0 points [-]

I don't think the system works in the first place without a monopoly on lethal force. You could work within the system by "voting" for his death, but then his friends (if any) get a chance to join in the vote, and their friends, til you pretty much have a new war going. (That's another flaw in the system I could have mentioned.)

Comment author: handoflixue 22 February 2013 01:03:53AM 0 points [-]

I think the vast majority of the population would agree that genocide and mass murder are bad, same as wireheading and turning the earth into paperclips. A single exception isn't terribly noteworthy - I'm sure there's at least a few pro-wireheading people out there, and I'm sure at least a few people have gotten enraged enough at humanity to think paperclips would be a better use of the space.

If you have a reason to suspect that "mass murder" is a common preference, that's another matter.

Comment author: TimS 22 February 2013 01:07:22AM *  1 point [-]

Mass murder is an easy question.

Is the Sun King (who doesn't particularly desire pointless mass murder) more moral than I am? Much harder, and your articulation of "weak Friendliness" seems incapable of even trying to answer. And that doesn't even get into actual moral problems society actually faces every day (e.g. what is the most moral taxation scheme?).

If weak-FAI can't solve those types of problems, or even suggest useful directions to look, why should we believe it is a step on the path to strong-FAI?

Comment author: handoflixue 22 February 2013 01:29:58AM 0 points [-]

Mass murder is an easy question.

That's my point. I'm not sure where the confusion is, here. Why would you call it useless to prevent wireheading, UFAI, and nuclear winter, just because it can't also do your taxes?

If it's easier to solve the big problems first, wouldn't we want to do that? And then afterwards we can take our sweet time figuring out abortion and gay marriage and tax codes, because a failure there doesn't end the species.

Comment author: TimS 22 February 2013 02:47:09AM 2 points [-]

For reasons related to Hidden Complexity of Wishes, I don't think weak-FAI actually is likely to prevent "wireheading, UFAI, and nuclear winter." At best, it prohibits the most obvious implementations of those problems. And it is terribly unlikely to be helpful in creating strong-FAI.

And your original claim was that common human preferences already implement weak-FAI preferences. I think that the more likely reason why we haven't had the disasters you reference is that for most of human history, we lacked the capacity to cause those problems. As actual society shows, the hidden complexity of wishes makes implementing social consensus hopeless, much less whatever smaller set of preferences is weak-FAI preferences.

Comment author: handoflixue 22 February 2013 07:37:00PM 1 point [-]

As actual society shows, hidden complexity of wishes make implementing social consensus hopeless

My basic point was that we shouldn't worry about politics, at least not yet, because politics is a wonderful example of all the hard questions in CEV, and we haven't even worked out the easy questions like how to prevent nuclear winter. My second point was that humans do seem to have a much clearer CEV when it comes to "prevent nuclear winter", even if it's still not unanimous.

Implicit in that should have been the idea that CEV is still ridiculously difficult. Just like intelligence, it's something humans seem to have and use despite being unable to program for it.

So, then, summarized, I'm saying that we should perhaps work out the easy problems first, before we go throwing ourselves against harder problems like politics.

Comment author: TimS 23 February 2013 03:11:01AM *  1 point [-]

There's not a clear dividing line between "easy" moral questions and hard moral questions. The Cold War, which massively increased the risk of nuclear winter, was a rational expression of Great Power relations between two powers.

Until we have mutually acceptable ways of resolving disputes when both parties are rationally protecting their interests, we can't actually solve the easy problems either.

Comment author: handoflixue 25 February 2013 07:25:25PM 0 points [-]

from you:

we can't actually solve the easy problems either.

and from me:

Implicit in that should have been the idea that CEV is still ridiculously difficult.

So, um, we agree, huzzah? :)

Comment author: fubarobfusco 23 February 2013 06:17:39PM -1 points [-]

I think the vast majority of the population would agree that genocide and mass murder are bad

Sure, genocide is bad. That's why the Greens — who are corrupting our precious Blue bodily fluids to exterminate pure-blooded Blues, and stealing Blue jobs so that Blues will die in poverty — must all be killed!

Comment author: gwern 22 February 2013 12:50:27AM 1 point [-]

A hands-off AI overlord can prevent all of that, while still letting humanity squabble over gay rights and which religion is correct.

We usually call that the 'sysop AI' proposal, I think.

Comment author: OrphanWilde 21 February 2013 10:05:08PM *  1 point [-]

There's a bootstrapping problem inherent to handing AI the friendliness problem to solve.

Edit: Unless you're suggesting we use a Weakly Friendly AI to solve the hard problem of Strong Friendliness?

Comment author: handoflixue 22 February 2013 12:11:42AM 4 points [-]

Your edit pretty much captures my point, yes :) If nothing else, a Weak Friendly AI should eliminate a ton of the trivial distractions like war and famine, and I'd expect that humans have a much more unified volition when we're not constantly worried about scarcity and violence. There's not a lot of current political problems I'd see being relevant in a post-AI, post-scarcity, post-violence world.

Comment author: Dre 22 February 2013 05:21:43PM 1 point [-]

The problem is that we have to guarantee that the AI doesn't do something really bad while trying to stop these problems; what if it decides it really needs more resources suddenly, or needs to spy on everyone, even briefly? And it seems (to me at least) that stopping it from having bad side effects is pretty close to, if not equivalent to, Strong Friendliness.

Comment author: handoflixue 22 February 2013 07:20:25PM 0 points [-]

I should have made that more clear: I still think Weak-Friendliness is a very difficult problem. My point is simply that we only need an AI that solves the big problems, not an AI that can do our taxes. My second point was that humans seem to already implement weak-friendliness, barring a few historical exceptions, whereas so far we've completely failed at implementing strong-friendliness.

I'm using Weak vs Strong here in the sense of Weak being a "SysOP" style AI that just handles catastrophes, whereas Strong is the "ushers in the Singularity" sort that usually gets talked about here, and can do your taxes :)

Comment author: OrphanWilde 22 February 2013 12:41:13AM *  1 point [-]

This... may be an amazing idea. I'm noodling on it.

Comment author: Rukifellth 22 February 2013 05:35:15AM 0 points [-]

I know this wasn't the spirit of your post, but I wouldn't refer to war and famine as "trivial distractions".

Comment author: Rukifellth 22 February 2013 01:39:29AM *  0 points [-]

Wait, if you're regarding the elimination of war, famine and disease as consolation prizes for creating a wFAI, what are people expecting from a sFAI?

Comment author: Fadeway 22 February 2013 03:43:59AM 1 point [-]

God. Either with or without the ability to bend the currently known laws of physics.

Comment author: Rukifellth 22 February 2013 05:17:41AM 1 point [-]

No, really.

Comment author: RichardKennaway 22 February 2013 02:26:16PM 3 points [-]

Really. That really is what people are expecting of a strong FAI. Compared with us, it will be omniscient, omnipotent, and omnibenevolent. Unlike currently believed-in Gods, there will be no problem of evil because it will remove all evil from the world. It will do what the Epicurean argument demands of any God worthy of the name.

Comment author: Rukifellth 22 February 2013 02:44:20PM 0 points [-]

Are you telling me that if a wFAI were capable of eliminating war, famine and disease, it wouldn't be developed first?

Comment author: RichardKennaway 22 February 2013 06:13:38PM 2 points [-]

Well, I don't take seriously any of these speculations about God-like vs. merely angel-like creations. They're just a distraction from the task of actually building them, which no-one knows how to do anyway.

Comment author: Rukifellth 22 February 2013 06:40:17PM 0 points [-]

But still, if a wFAI was capable of eliminating those things, why be picky and try for sFAI?

Comment author: Mimosa 22 February 2013 08:56:32PM 0 points [-]

Part of the problem is the many factors involved in political issues. People explain things through their own specialty, but lack knowledge of other specialties.

Comment author: Decius 22 February 2013 05:10:09AM 0 points [-]

Why do you restrict Strong Friendliness to human values? Is there some value which an intelligence can have that can never be a human value?

Comment author: OrphanWilde 22 February 2013 02:26:21PM 0 points [-]

Because we're the ones who have to live with the thing, and I don't know, but my inclination is that the answer is "Yes."

Comment author: Decius 23 February 2013 10:13:33AM 0 points [-]

Implication: A Strongly Friendly (paperclip maximizer) AI is actually a meaningful phrase. (As opposed to all Strongly Friendly AIs being compatible with everyone)

Why all human values?

Comment author: Kawoomba 21 February 2013 08:38:23PM 0 points [-]

You're making the perfect the enemy of the good.

I'm fine with at least a thorough framework for Weak Friendliness. That's not gonna materialize out of nothing. There are no actual Turing Machines (infinite tapes required), yet it is a useful model and its study yields useful results for real-world applications.

Studying Strong Friendliness is a useful activity in finding a heuristic for best-we-can-do friendliness, which is way better than nothing.