Comment author: Kaj_Sotala 01 March 2016 07:55:57AM 1 point [-]

My recent paper touches upon preference aggregation a bit in section 8, BTW, though it's mostly focused on the question of figuring out a single individual's values. (Not sure how relevant that is for your comments, but thought maybe a little.)

Comment author: halcyon 05 March 2016 09:58:10PM 0 points [-]

Thanks, I'll look into it.

(And all my ranting still didn't address the fundamental difficulty: There is no rational way to choose from among different projections of values held by multiple agents, projections such as Rawlsianism and utilitarianism.)

Comment author: Kaj_Sotala 26 February 2016 05:39:02PM *  1 point [-]

From the CEV paper:

Different classes of satisfactory initial definitions may fall into different self-consistent attractors for optimal definitions of volition. Or they may all converge to essentially the same endpoint. A CEV might survey the “space” of initial dynamics and self-consistent final dynamics, looking to see if one alternative obviously stands out as best; extrapolating the opinions humane philosophers might have of that space. But if there are multiple, self-consistent, satisficing endpoints, each of them optimal under their own criterion—okay. Whatever. As long as we end up in a Nice Place to Live.

And yes, the programmers’ choices may have a huge impact on the ultimate destiny of the human species. Or a bird, chirping in the programmers’ window. Or a science fiction novel, or a few lines spoken by a character in an anime, or a webcomic. Life is chaotic, small things have large effects. So it goes.

Which you could sum up as "CEV doesn't get around that problem, it treats it as irrelevant - the point isn't to find a particular good solution that's unique and totally non-arbitrary, it's just to find even one of the good solutions. If arbitrary reasons shift us from Good World #4 to Good World #36, who cares as long as they both really are good worlds".

Comment author: halcyon 28 February 2016 08:43:52PM *  0 points [-]

Although what if we told each party to submit goals rather than non-goal preferences? If the AI has access to a model specifying which actions lead to which consequences, then it can search for the actions that maximize the number of goals fulfilled regardless of which party submitted them, or perhaps take a Rawlsian approach and maximize the number of fulfilled goals belonging to whichever party would have the fewest goals fulfilled if that sequence of actions were taken, etc. That seems very imaginable to me. You can then have heuristics that constrain the search space and so on. You can also accept non-goal preferences in addition to goals if the parties have any.
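The two aggregation rules described above can be sketched in a few lines. Everything here is a toy stand-in: the "world model" is just a substring check, and the goals and actions are made-up strings, since the comment leaves the actual model unspecified.

```python
def fulfilled(action, goals, model):
    """Count how many of `goals` the world-model says `action` fulfils."""
    return sum(1 for g in goals if model(action, g))

def utilitarian_choice(actions, parties, model):
    """Action maximizing the total number of goals fulfilled across all parties."""
    return max(actions, key=lambda a: sum(fulfilled(a, p, model) for p in parties))

def rawlsian_choice(actions, parties, model):
    """Action maximizing the fulfilled-goal count of the worst-off party."""
    return max(actions, key=lambda a: min(fulfilled(a, p, model) for p in parties))

# Toy world model: a goal counts as fulfilled if its name appears in the action.
model = lambda action, goal: goal in action
parties = [{"park", "museum", "cinema"}, {"chocolate"}]
actions = ["park+museum+cinema", "park+chocolate"]

print(utilitarian_choice(actions, parties, model))  # park+museum+cinema (3 goals total)
print(rawlsian_choice(actions, parties, model))     # park+chocolate (worst-off party gets 1)
```

Note how the two rules already disagree on this tiny example: the utilitarian rule sacrifices the chocolate party entirely for a higher total, while the Rawlsian rule gives up one goal of total welfare to protect it.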

In that light, it seems to me that the problem was inferring goals from a set of preferences which were not purely non-goal preferences but were actually presented with some unspecified goals in mind. E.g. one party wanted chocolate, but said "I want to go to the store" instead. If that was the source of the original problem, then we can see why we might need an AI to solve it, since it calls for some lightweight mind reading. Of course, a CEV-implementing AI would have to be a mind reader anyway, since we don't really know what our goals ultimately are, given everything we could know about reality.

This still does not guarantee basic morality, but parties should at least recognize some of their ultimate goals in the end result. They might still grumble about the result not being exactly what they wanted, but we can at least scold them for lacking a spirit of compromise.

All this presupposes that enough of our actions can be reduced to ultimate goals that can be discovered, and I don't think this process guarantees we will be satisfied with the results. For example, this might erode personal freedom to an unpleasant degree. If we would choose to live in some world X if we were wiser and nicer than we are, then it doesn't necessarily follow that X is a Nice Place to Live as we are now. Changing ourselves to reach that level of niceness and wisdom might require unacceptably extensive modifications to our actual selves.

Comment author: Kaj_Sotala 26 February 2016 05:45:29PM 1 point [-]

What is the current replacement for CEV anyway?

There isn't really one as far as I know; "The Value Learning Problem" discusses some of the questions involved, but seems to be mostly at the point of defining the problem rather than trying to answer it. (This seems appropriate to me; trying to answer the problem at this point seems premature.)

Comment author: halcyon 28 February 2016 03:32:58PM 1 point [-]

Thanks. That makes sense to me.

Comment author: halcyon 28 February 2016 03:21:21PM *  1 point [-]

The real difficulty is that when you combine two sets of preferences, each of which makes sense on its own, you get a set of preferences that makes no sense whatsoever: http://plato.stanford.edu/entries/economics/#5.2 https://www.google.com/search?q=site%3Aplato.stanford.edu+social+choice&ie=utf-8&oe=utf-8

There is no easy way to resolve this problem. There is also no known method that takes such an inconsistent set of preferences as input and gives a consistent set of preferences as output, such that the output would be recognizable to either contributing party as furthering any of their original goals. Such arbitrary decisions are required so often in cases where there isn't unanimous agreement that, in practice, a large component of arbitrariness would enter every single time CEV tried to arrive at a uniform set of preferences by extrapolating the volitions of multiple agents into the future.
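The classic illustration of this inconsistency is the Condorcet paradox: three agents with perfectly consistent individual rankings whose pairwise-majority aggregate is cyclic, so no consistent group ranking exists. A minimal sketch (the three-agent profile is the standard textbook one, not anything from this thread):

```python
rankings = [
    ["A", "B", "C"],  # agent 1: A > B > C
    ["B", "C", "A"],  # agent 2: B > C > A
    ["C", "A", "B"],  # agent 3: C > A > B
]

def majority_prefers(x, y, rankings):
    """True if a strict majority of agents ranks x above y."""
    return sum(r.index(x) < r.index(y) for r in rankings) * 2 > len(rankings)

print(majority_prefers("A", "B", rankings))  # True
print(majority_prefers("B", "C", rankings))  # True
print(majority_prefers("C", "A", rankings))  # True -- the aggregate is cyclic
```

Each individual ranking is transitive, but the majority relation A > B > C > A is not, which is exactly the "makes no sense whatsoever" output described above.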

This doesn't mean the problem is unsolvable, just that it's an AI problem in its own right. But given these difficulties, wouldn't it be better to pick whichever Nice Place to Live is safest to reach instead of bothering with CEV? I say this because I'm not sure a Nice Place to Live can be defined in terms of CEV, i.e. as any CEV-approved output. Because of the preference aggregation problem, I'm not certain that a world that is provably CEV-abiding also provably avoids flagrant immorality. Two moral frameworks, when aggregated by a non-smart algorithm, might give rise to an immoral framework, so I'm not sure the essence of the problem is resolved just by CEV as explained in the paper.

Comment author: ChristianKl 24 February 2016 12:46:30AM 1 point [-]

I think that's on the list of MIRI open research problems.

Comment author: halcyon 24 February 2016 02:24:17PM 0 points [-]

Interesting. In that case, would you say an AI that provably implements CEV's replacement is, for that reason, provably Friendly? That is, AIs implementing CEV's replacement form an analytical subset of Friendly AIs? What is the current replacement for CEV anyway? Having some technical material would be even better. If it's open to the public, then I'd like to understand how EY proposes to install a general framework similar to CEV at the "initial dynamic" stage that can predictably generate a provably Friendly AI without explicitly modeling the target of its Friendliness.

Comment author: Manfred 24 February 2016 01:11:09AM 1 point [-]

Any proofs will be like... assuming that some laws of aerodynamics hold over a range of conditions, and proving that a certain plane design will fly under them. Which of course runs into trouble because we don't yet know the equivalent of the laws of aerodynamics either.

Comment author: halcyon 24 February 2016 02:12:25PM 0 points [-]

That would seem to be the best possible solution, but I have never heard aeroplane engineers claim that their designs are "provably airworthy". If you take the aeroplane design approach, then isn't "provably Friendly" a somewhat misleading claim to make, especially when you're talking about pushing conditions to extremes that you yourself admit are beyond your powers of prediction?

The aeroplane equivalent would be designing a plane so powerful that its flight changes the atmospheric conditions of the entire planet, which then uses a complicated assembly of gyroscopes or something to continue flying in a straight line. If you yourself cannot predict which specific changes the plane's flight will make, then how can you claim to prove that a particular assembly of gyroscopes is sufficient to keep the plane on the preplanned path? On the other hand, if you can prove which specific flight-relevant changes the plane's flight will make, then you already have a mathematical definition of the target atmosphere at a sufficient depth of resolution to design such an assembly.

Does MIRI think it can come up with an equivalent mathematical model of humanity with respect to AI?

Comment author: Lumifer 23 February 2016 09:05:15PM 1 point [-]

That's the reason EY came up with the concept of CEV -- Coherent Extrapolated Volition.

Comment author: halcyon 23 February 2016 10:49:42PM *  2 points [-]

The SEP says that preferences cannot be aggregated without additional constraints on how the aggregation is to be done, and the end result changes depending on things like the order of aggregation, so these additional constraints take on the quality of arbitrariness. How does CEV get around that problem?
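The order-dependence the SEP describes can be made concrete with sequential pairwise majority voting over the classic cyclic profile: when the group's pairwise preferences cycle, the "group choice" is determined entirely by the agenda, i.e. the order in which alternatives are compared. A minimal sketch (the profile and agendas are illustrative, not from the SEP entry):

```python
rankings = [
    ["A", "B", "C"],  # agent 1
    ["B", "C", "A"],  # agent 2
    ["C", "A", "B"],  # agent 3
]

def pairwise_winner(x, y, rankings):
    """Majority vote between two options (no ties with an odd number of agents)."""
    x_votes = sum(r.index(x) < r.index(y) for r in rankings)
    return x if x_votes * 2 > len(rankings) else y

def aggregate(agenda, rankings):
    """Fold pairwise majority votes over the agenda; each winner meets the next option."""
    winner = agenda[0]
    for challenger in agenda[1:]:
        winner = pairwise_winner(winner, challenger, rankings)
    return winner

for agenda in (["A", "B", "C"], ["B", "C", "A"], ["C", "A", "B"]):
    print(agenda, "->", aggregate(agenda, rankings))
# Every agenda yields a different "group choice": C, A, and B respectively.
```

The individual rankings never change; only the order of aggregation does, yet any of the three outcomes can be produced. That is the arbitrariness carried by the "additional constraints".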

Comment author: halcyon 23 February 2016 08:52:13PM *  0 points [-]

I have a question: It seems to me that Friendliness is a function of more than just an AI. To determine whether an AI is Friendly, it would seem necessary to answer the question: Friendly to whom? If that question is unanswered, then "Friendly" seems like an unsaturated function like "2+". In the LW context, the answer to that question is probably something along the lines of "humanity". However, wouldn't a mathematical definition of "humanity" be too complex to let us prove that some particular AI is Friendly to humanity? Even if the answer to "To whom?" is "Eliezer Yudkowsky", even that seems like it would be a rather complicated proof to say the least.
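The "unsaturated function" point can be put in code via partial application: fixing only the AI leaves a function still waiting for the answer to "Friendly to whom?", just like "2+" waits for its second argument. The `friendly` predicate and the names below are hypothetical toys, not a real Friendliness test.

```python
from functools import partial

def friendly(ai, target):
    """Hypothetical two-place predicate: is `ai` Friendly to `target`?
    The lookup table is a toy stand-in for an (unknown) real definition."""
    return (ai, target) in {("AI-1", "humanity")}

# Saturating only the first argument leaves an unsaturated function,
# analogous to "2+": well-formed, but not yet evaluable to a verdict.
is_ai1_friendly_to = partial(friendly, "AI-1")

print(is_ai1_friendly_to("humanity"))            # True
print(is_ai1_friendly_to("Eliezer Yudkowsky"))   # False
```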

Comment author: ChristianKl 01 June 2015 02:06:34PM 1 point [-]

In the 19th century, the German idea was about not having wars between German states. It was about not having borders, but being unified under shared law. It was cosmopolitan in its nature. "Deutschland, Deutschland über alles", when it was written, meant having something bigger than the individual states.

The European idea is given credit for preventing European nations from waging war against each other after WWII.

Kant is commonly admitted to be a romantic philosopher

When reading Kant in a school philosophy study group, our teacher told us that discussing whether or not someone is a romantic philosopher is an Anglo thing. German intellectual discourse usually doesn't focus on putting those kinds of labels on people but tries to be more discerning.

I also think that you overrate the impact of philosophers. A lot of important thought isn't done by philosophers. Today the Bertelsmann Stiftung produces more ideas that are relevant for political policy than Habermas.

Comment author: halcyon 02 June 2015 03:55:39AM *  -2 points [-]

Oh well, I agree with the English that Kant was a romantic philosopher. Rousseau was a primary source of inspiration for him. (I agree with Dewey that writers (Goethe) and philosophers (Kant) give expression to popular views more than shaping them. OTOH, as much as I admire Goethe, I think Oswald Spengler went too far in trying to interpret him as a universal philosopher.)

"In the 19th century the German idea was about not having wars between German states," is a true statement, but it leaves out crucial details. For example, there are many people who agree that European nations should not war against each other, but are bitterly critical of the details of how that general plan was implemented in practice.

I think it follows that the European idea is not reducible to the notion that European states should not fight. If you do not agree, then I apologize for using terms like "European idea" and "German idea" in a sense you didn't intend, but my point can be easily reworded using "implementation of the German idea" in place of "German idea".

The point I'm trying to make is that, like I said, Germany is currently outcompeting the Anglo-American world on the terms of the Anglo-American world, not on the original terms of Germany. Arguably, England wanted to end European wars in the 19th century as well. Who would you say got their way in the end, England or Germany?

Comment author: ChristianKl 01 June 2015 01:07:35PM 1 point [-]

That is satire, but notice how progressive Germans were accused of imitating the English in EXACTLY the same way that Islamists accuse progressive Arabs of copying the West.

You call the collapse of democracy in 1933 a collapse of Germany, but that democracy was mostly an American idea. After mostly losing to the US in WWI, Germans spent a decade wanting to copy the US.

You can't label ceasing to copy other countries' systems a collapse and at the same time call copying another country's system a collapse.

I don't want you to think I'm putting German culture down or anything, but proposing an interpretation of "the German idea" that has the figure of Faust expurgated from it

The phrase "the German idea" refers to something particular, the same way the phrase "the German question" does. Neither of them has anything to do with Faust. Faust is part of German culture, but it's not about the German idea. Goethe would have had political problems publishing in favor of the German idea in his time, because that would have meant questioning the authority of his government.

Faust is still part of German culture. It gets read in schools.

Even the notion of a "European idea" including Britain is an oversimplification, because if you ask Europeans, many of them will tell you that England has a different culture from the rest of Europe.

The European idea is an ideal. It's a wish for the future. It's a wish for the future in the same way the German idea was a wish for the future in the early 19th century.

Nevertheless, England gets partly governed by Brussels. The English might not like it, but Brussels has power. The referendum is going to be interesting: does the British public choose to consent to being governed by Brussels or not?

Comment author: halcyon 01 June 2015 01:35:53PM *  -2 points [-]

Look, the collapse of a state is the collapse of a state regardless of ideological roles. (Modern Germany is fundamentally Anglo-American in design and very successful. That is the point, since you were citing the success of contemporary Germany.)

(...Nah, it would take far too long to discuss the state of Germany prior to WWI.)

Faust really was a central figure in the German idea, I'm afraid. I don't know how consciously complicit Goethe was in this, and it has nothing to do with what he would have had problems for saying when he published Faust.

Of course Faust is still a part of German culture. He's part of world culture, a typically German vision of the universal man. (I am personally a huge fan of Faust.)

I don't understand the contradiction in saying that X and Y have different wishes for the future owing to cultural differences. (And I don't understand what Habermas' Europe has to do with the 19th century German idea. Habermas has openly stated that the German intellectual tradition is inadequate for criticizing fascism and consciously borrowed from Anglophone thinkers. The most striking difference between thinkers who have gained a standing in the Anglophone world and thinkers from the rest of the world is their careful, deliberate anti-existentialism.)

(Kant is commonly admitted to be a romantic philosopher, and I found this link: http://philosophyisnotaluxury.com/2010/08/12/romanticism-and-existential-philosophy/)
