The Open Thread posted at the beginning of the month has gotten really, really big, so I've gone ahead and made another one. Post your new discussions here!
This thread is for the discussion of Less Wrong topics that have not appeared in recent posts. If a discussion gets unwieldy, celebrate by turning it into a top-level post.
Maybe I'm crazy, but all that doesn't sound so hard.
More precisely, there's one part whose solution should require nothing more than steady hard work, and another part so nebulous that even the problems themselves are still fuzzy.
The first part - requiring just steady hard work - is everything that can be reduced to existing physics and mathematics. We're supposed to take the human brain as input and get a human-friendly AI as output. The human brain is a decision-making system: a genetically encoded decision architecture, or decision-architecture schema, with the parameters of the schema set in each individual by genetic and environmental contingencies. CEV (Coherent Extrapolated Volition) is all about answering the question: if a superintelligence appeared in our midst, what would the human race want its decision architecture to be, given time enough to think things through and arrive at a stable answer? So it boils down to asking: if you had a number of instances of the specific decision architecture 'human brain', and they were asked to choose a decision architecture for an entity of arbitrarily high intelligence that was to be introduced into their environment, what would their asymptotically stable preference be? That just doesn't sound like a mind-bogglingly difficult problem. It's certainly a question that should be answerable for much simpler classes of decision architecture.
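To make 'asymptotically stable preference' concrete for one such simpler class, here's a toy sketch in Python. Everything in it is an illustrative assumption of mine, not anything CEV specifies: the agents are just utility vectors over a few outcomes, and 'extrapolation' is DeGroot-style opinion pooling standing in for idealized deliberation.

```python
import numpy as np

# Toy model: a few simple agents, each a utility vector over outcomes,
# deliberate by repeatedly pooling opinions (DeGroot-style averaging --
# my crude stand-in for idealized reflection, not part of CEV itself).

def deliberate(utilities, influence):
    """One deliberation round: each agent's new utilities are a weighted
    average of everyone's current utilities (rows of `influence` sum to 1)."""
    return influence @ utilities

def stable_preference(utilities, influence, tol=1e-9, max_rounds=100_000):
    """Iterate deliberation to (approximate) convergence and return the
    consensus utility vector -- the toy 'asymptotically stable preference'
    the agents would install in the new agent."""
    for _ in range(max_rounds):
        updated = deliberate(utilities, influence)
        if np.abs(updated - utilities).max() < tol:
            break
        utilities = updated
    return updated.mean(axis=0)

# Three agents, four outcomes; agent 2 is stubborn (weights itself heavily).
influence = np.array([[0.6, 0.2, 0.2],
                      [0.2, 0.6, 0.2],
                      [0.1, 0.1, 0.8]])
utilities = np.random.default_rng(0).normal(size=(3, 4))
print(stable_preference(utilities, influence))
```

The averaging rule itself is beside the point; what matters is that for decision architectures this simple, 'what preference would they stabilize on?' is a question with a computable answer - a fixed point of the deliberation dynamics.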
So it seems to me that the main challenge is simply to understand what the human decision architecture is. And again, that shouldn't be beyond us at all. The human genome is completely sequenced, we know the physics of the brain down to nucleons, and there's only a finite number of cell types in the body - yes, it's complicated, but it's really just a matter of sticking with the problem. (Or it would be, if there were no time factor. But how to do all this quickly is a separate problem.)
So, to sum up, all we need to do is solve the decision-theory problem 'if agents X, Y, Z... get to determine the value system and cognitive architecture of a new, superintelligent agent A that will be introduced into their environment, what would their asymptotic preference be?'; correctly identify the human decision architecture; and then substitute it for X, Y, Z... in the preceding problem.
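Stated compactly (with the caveat that the extrapolation operator E and the scoring function U are placeholders I'm introducing here, not defined terms of CEV): if $P_0 = (P_1, \dots, P_n)$ are the agents' initial preferences, the problem is to find

$$P^{*} = \lim_{t \to \infty} E^{t}(P_0), \qquad A^{*} = \operatorname{arg\,max}_{A \in \mathcal{A}} U_{P^{*}}(A),$$

where E models one step of idealized deliberation and $\mathcal{A}$ is the space of candidate decision architectures for the new agent. In these terms, the 'easy' part splits into showing that the limit exists and doesn't depend too sensitively on the choice of E, and pinning down $P_0$ by reverse-engineering the human brain.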
That's the first part, the 'easy' part. What's the second part, the hard but nebulous part? Everything to do with consciousness, inconceivable future philosophy problems, and so forth. Now, what's peculiar about this situation is that the existence of nebulous hard problems suggests the thinker is missing something big about the nature of reality, and yet the easy part of the problem seems almost completely specified. How can the easy part appear closed - an exactly specified problem simply awaiting solution - while other aspects of the overall task seem so far beyond understanding? This contradiction is itself something of a nebulous hard problem.
Anyway, achieving the CEV agenda seems to require a combination of steady work on a well-defined problem for which we already have everything we need, and rumination on nebulous imponderables in the hope of achieving clarity - including clarity about the relationship between the imponderables and the well-defined problem. I think that is very doable - the combination of steady work and contemplation, that is. And the contemplation is itself another form of steady work: steadily thinking about the nebulous problems until they resolve themselves.
So long as there are still enigmas in the existential equation, we can't be sure of the outcome; but I think we can know, right now, that it's possible to work on the problem (easy and hard aspects alike) in a systematic and logical way.