Another important step I wish would be done on problems like these: explain them. Don't just define the problem; build up the network of inferences that lead up to it, so that someone can get the Level 2 (see link) understanding necessary to grasp the problem at a "gut" level. Make it so that someone can build up from the prerequisite knowledge across the inferential gap just as easily as (well, more easily than) you can here.
That, I think, would expose the problem to a lot more people capable of contributing, and reveal connections to similar problems.
A related post, 'Friendly AI Research and Taskification':
I don't even see how one would start to research the problem of getting a hypothetical AGI to recognize humans as distinguished beings. Solving this problem would seem to require as a prerequisite an understanding of the makeup of the hypothetical AGI, something people don't seem to have a clear grasp of at the moment. Even if one does have a model for a hypothetical AGI, writing code conducive to it recognizing humans as distinguished beings seems like an intractable task.
If the nature of ethical properties, statements, attitudes, and judgments does ultimately correlate with facts about human brains, it might be possible to derive mathematical models of moral terms or judgments from brain data. The problem with arriving at the meaning of morality solely by means of contemplation is that you risk introducing new meanings based on higher-order cognition and intuitions, rather than figuring out what humans as a whole mean by morality.
Two possible steps towards friendly AI/CEV (just some quick ideas):
1.) We want the AGI (CEV) to extrapolate our volition in a certain, ethical way. That is, it shouldn't, for example, create models of humans and hurt them just to figure out what we dislike. But in the end it won't be enough to write blog posts in English. We might have to put real people into brain scanners and derive mathematically precise thresholds for states like general indisposition and unethical behavior. Such models could then be implemented in the utility function of an AGI, whereas blog posts written in natural language can't be. (A rough sketch of what that could look like follows point 2 below.)
2.) We don't know if CEV is itself wished for and considered ethical by most humans. If you do not assume that all humans are alike, what makes you think that your personal solution, your answer to those questions, will be universally accepted? A rich white atheist male living in a western country who is interested in topics like philosophy and mathematics does not seem to be someone who can speak for the rest of the world. If we are very concerned with the ethics of CEV in and of itself, we might have to come up with a way to execute an approximation of CEV before AGI is invented. We might need massive, large-scale social experiments and surveys to see if something like CEV is even desirable. Writing a few vague blog posts about it doesn't seem to get us the certainty we need before altering the universe irrevocably.
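A minimal, purely illustrative sketch of what "implementing such a model in a utility function" could mean, assuming a learned distress predictor and an empirically derived threshold that do not currently exist; every name and number here is hypothetical:

```python
# Purely illustrative: a utility function penalized whenever a (hypothetical)
# brain-data-derived model predicts that an action pushes someone past an
# empirically measured distress threshold.

DISTRESS_THRESHOLD = 0.7  # imagined as estimated from brain-scan data; made up here


def predict_distress(person_state, action):
    """Stand-in for a learned model mapping (person, action) to a distress
    score in [0, 1]. No such model currently exists."""
    raise NotImplementedError


def constrained_utility(base_utility, people, action):
    """Return the agent's utility for an action, heavily penalized if anyone
    is predicted to cross the distress threshold."""
    penalty = 0.0
    for person in people:
        distress = predict_distress(person, action)
        if distress > DISTRESS_THRESHOLD:
            penalty += 1e6 * (distress - DISTRESS_THRESHOLD)
    return base_utility(action) - penalty
```

The point of the sketch is only that a threshold measured in the lab can be written into a utility calculation directly, whereas a natural-language description of "general indisposition" cannot.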
If CEV encounters a large proportion of the population that wishes it were not run and will continue to do so after extrapolation, it simply stops and reports that fact. That's one of the points of the method. It is, in and of itself, a large-scale social survey of present and future humanity. And if the groups that wouldn't want it run now would after extrapolation, I'm fine with running it against their present wishes, and hope that if I were part of a group under similar circumstances someone else would do the same: "past me" is an idiot, I'm not much better, and "future me" is hopefully an even bigger improvement, while "desired future me" almost certainly is.
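In other words, the stop-and-report rule amounts to something like the following sketch, where the opposition threshold and the extrapolate() function are placeholders rather than anything specified by CEV itself:

```python
# Illustrative only: the "stop and report" condition described above.
# The opposition threshold and the extrapolate() function are placeholders.

OPPOSITION_THRESHOLD = 0.25  # arbitrary stand-in for "a large proportion"


def should_run_cev(population, extrapolate):
    """Run CEV only if the extrapolated population does not contain a large
    proportion of people who would still oppose running it."""
    opposed = sum(1 for person in population if extrapolate(person).opposes_cev)
    fraction_opposed = opposed / len(population)
    if fraction_opposed > OPPOSITION_THRESHOLD:
        return False, f"{fraction_opposed:.0%} of extrapolated volitions oppose running CEV"
    return True, None
```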
Is the intention of this project to approximate the list of questions which, if answered, would let someone build an FAI? (That seems incompatible with SIAI's traditional, more secretive approach. Surely the Manhattan Project would never have published a list of questions which, if answered, would let someone build a nuclear bomb.)
If yes, is this a strategic (i.e., considered) change of plans? If so, what were the arguments that changed people's minds?
Or is the intention just to define some open problems that are FAI-related, but whose answers, if made public, would pose no significant danger?
I hope not to publish problem definitions that increase existential risk. I plan to publish problem definitions that decrease existential risk.
Your response seems rather defensive. Perhaps it would help to know that I just recently started thinking that some seemingly innocuous advances in FAI research might turn out to be dangerous, and that Eliezer's secretive approach might be a good idea after all. So I'm surprised to see SIAI seeming to turn away from that approach, and would like to know whether that is actually the case, and if so what is the reason for it. With that in mind, perhaps you could consider giving more informative answers to my questions?
Sorry for my brevity, it's just that I don't have more substantive answers right now. These are discussions that need to be ongoing. I'm not aware of any strategy change.
There seem to be a number of different possible approaches to building a Friendly AI, each with its own open problems. For example, we could design a system of uploads with safety checks and then punt all other questions to them, or figure out how we solve confusing philosophical problems and program a de novo AI with those methods, or solve them ourselves and just code up a decision process together with a set of preferences. Does this effort to define open problems assume a particular approach?
I will try to be clear as I go along about which problems are relevant for which scenarios. Also, solving certain problems will mean that we don't have to solve other problems. The trouble is that we don't know which scenarios will play out and which problems we'll solve first.
I think the Millennium Prize Problems aren't the best example in this context, because for the one problem in that set that has been solved, the prize was rejected.
This is probably also a good opportunity for people concerned about existential risk to shape the perception of the scientific and mathematical public.
(Inspired by.)
Einstein's four fundamental papers of 1905 were inspired by Henri Poincaré's statement of three open problems in science.
A few years earlier, David Hilbert had published 23 open problems in mathematics, about half of which were solved during the 20th century.
More recently, Timothy Gowers has used his blog to promote open problems in mathematics that might be solved collaboratively, online. After just seven weeks, the first problem was "probably solved," resulting in some published papers under the pseudonym 'D.H.J. Polymath.'
The Clay Mathematics Institute offers a $1 million prize for the solution to any of 7 particularly difficult problems in mathematics. One of these problems has now been solved.
In 2006, researchers defined 14 open problems in artificial life, and their paper continues to guide research in that field.
And of course there are many more open problems. Many more.
One problem with Friendly AI research is that even those who could work on the project often don't have a clear picture of what the open problems are and how they interact with each other. There are a few papers that introduce readers to the problem space, but more work could be done to (1) define each open problem with some precision, (2) discuss how each open problem interacts with other open problems, (3) point readers to existing research on the problem, and (4) suggest directions for future research. Such an effort might even clarify the problem space for those who think they understand it.
(This is, in fact, where my metaethics sequence is headed.)
Defining a problem is the first step toward solving it.
Defining a problem publicly can bring it to the attention of intelligent minds who may be able to make progress on it.
Defining a problem publicly and offering a reward for its solution can motivate intelligent minds to work on that problem instead of some other problem.