Richard_Hollerith2 comments on That Tiny Note of Discord - Less Wrong

17 Post author: Eliezer_Yudkowsky 23 September 2008 06:02AM

Comment author: Richard_Hollerith2 24 September 2008 06:17:56PM 0 points

If Eliezer had not abandoned the metaethics he adopted in 1997 or so by the course described in this blog entry, he might have abandoned it later in the design of the seed AI when it became clear to him that the designer of the seed must choose the criterion the AI will use to recognize objective morality when it finds it. In other words, there is no way to program a search for objective morality or for any other search target without the programmer specifying or defining what constitutes a successful conclusion of the search.
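To make the point about search concrete, here is a minimal sketch in Python. It is only an illustration, not anything taken from a seed AI design: the names `breadth_first_search`, `neighbors`, `is_goal` and the toy `step` function are all hypothetical. What it shows is that a generic search routine cannot even be called until the caller hands it a predicate that says what counts as success.

```python
from collections import deque


def breadth_first_search(start, neighbors, is_goal):
    """Generic breadth-first search (illustrative sketch).

    The routine knows nothing about what it is looking for; the caller
    must supply `is_goal`, the criterion for a successful conclusion.
    Returns the first state satisfying `is_goal`, or None if the
    reachable states are exhausted.
    """
    frontier = deque([start])
    seen = {start}
    while frontier:
        state = frontier.popleft()
        if is_goal(state):  # success is defined entirely by the caller
            return state
        for nxt in neighbors(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return None


def step(n):
    """Hypothetical toy state space: from n, move to n + 1 or n * 2."""
    return [n + 1, n * 2] if n < 100 else []


# The programmer has to say what "found it" means before the search can run.
found = breadth_first_search(1, step, lambda n: n == 10)
never = breadth_first_search(1, step, lambda n: False)
```

With the criterion `n == 10` the search halts on 10; with a criterion that nothing satisfies it exhausts the space and returns `None`. Either way the stopping condition came from the programmer, not from the search.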

The reason a human seems to be able to search for things without being able to define clearly at the start of the search what he or she is searching for is that humans have preferences and criteria that no one can articulate fully. Well, the reader might be thinking, why not design the AI so that it, too, has criteria that no one can articulate? My answer has two parts: one part explains that CEV is not such an unarticulatable design; the other part asserts that any truly unarticulatable design would be irresponsible.

Although it is true that no one currently in existence can articulate the volition of the humans, it is possible for some of us to specify or define with enough precision and formality what the volition of the humans is and how the AI should extrapolate it. In turn, a superintelligent AI in possession of such a definition can articulate the volition of the humans.

The point is that although it is a technically and scientifically challenging problem, it is not outside the realm of current human capability to define what is meant by the phrase "coherent extrapolated volition" with sufficient precision, reliability and formality to bet the outcome of the intelligence explosion on it.

Like I said, humans have systems of value and systems of goals that no one can articulate. The only thing that keeps it from being completely unethical to rely on humans for any important purpose is that we have no alternative means of achieving the important purpose. In contrast, it is possible to design a seed AI whose goal system is "articulatable", which means that some human or some team or community of humans can understand it utterly, the way that some humans can currently understand relativity theory utterly. An agent with an articulatable goal system is vastly preferable to the alternative because it is vastly desirable for the designer of the agent to do his or her best in choosing the optimization target of the agent, and choosing an unarticulatable goal system is simply throwing away that ability to choose -- leaving the choice up to "chance".

To switch briefly to a personal note, when I found Eliezer's writings in 2001 his home page still linked to his explanation of the metaethics he adopted in 1997 or so, which happened to coincide with my metaethics at the time (which coincidence made me say to myself, "What a wonderful young man!"). I can for example recall using the argument Eliezer gives below in a discussion of ethics with my roommate in 1994:

In the event that life is meaningless, nothing is the "right" thing to do; therefore it wouldn't be particularly right to respect people's preferences in this event.

Anyway, I have presented the argument against the metaethics to which Eliezer and I used to subscribe that I find the most persuasive.