First, I'm grateful for this thoughtful engagement and pushback.
Let's call your second dystopia the Universal Chinese Turing Factory, since it's sort of a mash-up of the factory variant of Searle's Chinese Room argument and a universal Turing Machine.
I claim that the Universal Chinese Turing Factory, if put to some generic task like solving satisfiability puzzles, will not be favored by a supercontroller with the function I've specified.
Why? Because if we look at the representations computed by the Universal Chinese Turing Factory, they may be very logically deep. But their depth will not be especially due to the fact that humanity is mechanically involved in the machine. In terms of the ratio of depth-relative-to-humanity to absolute-depth, there's going to be hardly any gains there.
(You could argue, by the way, that if you employed all of humanity in the task of solving math puzzles, that would be much like the Universal Chinese Turing Factory you describe.)
Let's consider an alternative world, the Montessori World, where all of humanity is employed creating ever more elaborate artistic representations of the human condition. Arguably, this is the condition of those who dream up a post-scarcity economy where everybody gets to be a quasi-professional creative doing improv comedy, interpretative dance, and abstract expressionist basket-weaving. A utopia, I'm sure you'd agree.
The thing is that these representations will be making use of humanity's condition h as an integral part of the computing apparatus that produces the representations. So, humanity is not just a cog in a universal machine implementing other algorithms. It is the algorithm, which the supercontroller then has the interest of facilitating.
That's the vision anyway. Now that I'm writing it out, I'm thinking maybe I got my math wrong. Does the function I've proposed really capture these intuitions?
For example, maybe a simpler way to get at this would be to look at the Kolmogorov complexity of the universe relative to humanity, K(u/h). That could be a better Montessori world. Then again, maybe Montessori world isn't as much of a utopia as I've been saying it is.
If I sell the idea of these kinds of complexity-relative-to-humanity functions as a way of designing supercontrollers that are not existential threats, I'll consider this post a success. The design of the particular function is I think an interesting ethical problem or choice.
Well, certainly I've been coming from the viewpoint that it doesn't capture these intuitions :P Human values are complicated (I'm assuming you've read the relevant posts here (e.g.)), both in terms of their representation and in terms of how they are cashed out from their representation into preferences over world-states. Thus any solution that doesn't have very good evidence that it will satisfy human values, will very likely not do so (small target in a big space).
...In terms of the ratio of depth-relative-to-humanity to absolute-depth, there's going to
I attended Nick Bostrom's talk at UC Berkeley last Friday and got intrigued by these problems again. I wanted to pitch an idea here, with the question: Have any of you seen work along these lines before? Can you recommend any papers or posts? Are you interested in collaborating on this angle in further depth?
The problem I'm thinking about (surely naively, relative to y'all) is: What would you want to program an omnipotent machine to optimize?
For the sake of avoiding some baggage, I'm not going to assume this machine is "superintelligent" or an AGI. Rather, I'm going to call it a supercontroller, just something omnipotently effective at optimizing some function of what it perceives in its environment.
As has been noted in other arguments, a supercontroller that optimizes the number of paperclips in the universe would be a disaster. Maybe any supercontroller that was insensitive to human values would be a disaster. What constitutes a disaster? An end of human history. If we're all killed and our memories wiped out to make more efficient paperclip-making machines, then it's as if we never existed. That is existential risk.
The challenge is: how can one formulate an abstract objective function that would preserve human history and its evolving continuity?
I'd like to propose an answer that depends on the notion of logical depth as proposed by C.H. Bennett and outlined in section 7.7 of Li and Vitanyi's An Introduction to Kolmogorov Complexity and Its Applications which I'm sure many of you have handy. Logical depth is a super fascinating complexity measure that Li and Vitanyi summarize thusly:
The mathematics is fascinating and better read in the original Bennett paper than here. Suffice it presently to summarize some of its interesting properties, for the sake of intuition.