I attended Nick Bostrom's talk at UC Berkeley last Friday and got intrigued by these problems again. I wanted to pitch an idea here, with the question: Have any of you seen work along these lines before? Can you recommend any papers or posts? Are you interested in collaborating on this angle in further depth?
The problem I'm thinking about (surely naively, relative to y'all) is: What would you want to program an omnipotent machine to optimize?
For the sake of avoiding some baggage, I'm not going to assume this machine is "superintelligent" or an AGI. Rather, I'm going to call it a supercontroller, just something omnipotently effective at optimizing some function of what it perceives in its environment.
As has been noted in other arguments, a supercontroller that optimizes the number of paperclips in the universe would be a disaster. Maybe any supercontroller that was insensitive to human values would be a disaster. What constitutes a disaster? An end of human history. If we're all killed and our memories wiped out to make more efficient paperclip-making machines, then it's as if we never existed. That is existential risk.
The challenge is: how can one formulate an abstract objective function that would preserve human history and its evolving continuity?
I'd like to propose an answer that depends on the notion of logical depth, as proposed by C.H. Bennett and outlined in section 7.7 of Li and Vitanyi's *An Introduction to Kolmogorov Complexity and Its Applications*, which I'm sure many of you have handy. Logical depth is a super fascinating complexity measure that Li and Vitanyi summarize thusly:
Logical depth is the necessary number of steps in the deductive or causal path connecting an object with its plausible origin. Formally, it is the time required by a universal computer to compute the object from its compressed original description.
The mathematics is fascinating and better read in the original Bennett paper than here. Suffice it presently to summarize some of its interesting properties, for the sake of intuition.
- "Plausible origins" here are incompressible, i.e. algorithmically random.
- As a first pass, the depth D(x) of a string x is the least amount of time it takes to output the string from an incompressible program.
- There's a free parameter that has to do with precision that I won't get into here.
- A string of length n consisting entirely of 1s and a string of length n of independent random bits are both shallow. The first is shallow because it can be produced by a constant-sized program in time n. The second is shallow because there is an incompressible program, namely the output string plus a constant-sized print function, that produces the output in time n.
- An example of a deeper string is the string of length n whose ith digit encodes the answer to the ith enumerated satisfiability problem. Very deep strings can involve diagonalization.
- Like Kolmogorov complexity, depth comes in an absolute and a relative version. Let D(x/w) be the least time it takes to output x from a program that is incompressible relative to w.
- The function I propose the supercontroller optimize is the ratio D(x/h_t)/D(x), where x is the state of the universe it perceives and h_t is a representation of human history up to time t.
- It can be updated with observed progress in human history at time t' by replacing h_t with h_t'. You could imagine generalizing this to something that dynamically updates in real time.
- This is a quite conservative function, in that it severely punishes computation that does not depend on human history for its input. It is so conservative that it might result in, just to throw it out there, unnecessary militancy against extra-terrestrial life.
- There are lots of devils in the details. The precision parameter I glossed over. The problem of representing human history and the state of the universe. The incomputability of logical depth (of course it's incomputable!). My purpose here is to contribute to the formal framework for modeling these kinds of problems. The difficult work, as in most machine learning problems, is feature representation, sensing, and efficient convergence on the objective.
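To make the shallow/deep contrast above concrete, here is a toy sketch of my own (not from Bennett's paper): a crude fixed enumeration of 3-variable CNF formulas, with bit i of the string recording whether formula i is satisfiable. The generating program is constant-sized, but it does brute-force work per bit, which is the flavor of a deep string: short description, long computation. By contrast, the all-1s string costs essentially nothing per bit.

```python
import itertools

K = 3              # Boolean variables per formula
CODES = 3 ** K     # clause codes: for each variable, {0: absent, 1: positive, 2: negated}

def clause_satisfied(code, assignment):
    """Decode a clause from its base-3 code and evaluate it under an assignment."""
    has_literal = False
    for v in range(K):
        mode = (code // 3 ** v) % 3
        if mode == 1:                 # positive literal on variable v
            has_literal = True
            if assignment[v]:
                return True
        elif mode == 2:               # negated literal on variable v
            has_literal = True
            if not assignment[v]:
                return True
    return not has_literal            # treat the empty clause as vacuously satisfied

def satisfiable(i):
    """Formula i is the conjunction of the clauses given by i's base-CODES digits."""
    clauses = []
    while i:
        clauses.append(i % CODES)
        i //= CODES
    return any(
        all(clause_satisfied(c, a) for c in clauses)
        for a in itertools.product([False, True], repeat=K)
    )

# Bit i answers the ith enumerated satisfiability problem: cheap to describe, slow to produce.
sat_string = "".join("1" if satisfiable(i) else "0" for i in range(256))

# The shallow contrast: a constant-sized program producing its output in linear time.
ones = "1" * 256
```

For instance, `sat_string[55]` is "0", because formula 55 decodes to the two clauses (x0) and (not x0), which contradict each other. Real logical depth is defined over a universal machine and is incomputable; this is only an intuition pump.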
Unsure if sarcasm or just a surprisingly good prediction, since I happen to have it on my desktop at the moment. (Thanks, /u/V_V).
Anyhow, why should this state contain humans at all? The rest of the universe out-entropies us by a lot. Think of all the radical high-logical-depth things the Sun can do that would happen to be bad for humans.
Even if we just considered a human brain, why should a high-relative-logical-depth future look like the human brain continuing to think? Why not overwrite it with a SAT solver? Or a human brain whose neuron firings encode the running times of Turing machines that halt in fewer than a googolplex steps? Or turn the mass-energy into photons to encode a much larger state?
A good prediction :)
Logical depth is not entropy.
The function I've proposed is to maximize depth-of-universe-relative-to-humanity-divided-by-depth-of-universe.
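Spelled out in the notation of the post (my transcription; D is Bennett's depth, x the state of the universe, h human history):

```latex
f(x) = \frac{D(x/h)}{D(x)}
```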
Consider the decision to kill off people and overwrite them with a very fast SAT solver. That would surely increase depth-of-universe, which is in the denominator; increasing the denominator decreases the favorability of this outcome.
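Plugging toy numbers into the ratio makes the direction of the argument explicit. The depths below are invented for illustration only, not computed:

```python
def objective(depth_rel_humanity, depth_abs):
    """The proposed ratio: D(universe/humanity) / D(universe)."""
    return depth_rel_humanity / depth_abs

# Hypothetical values: the universe computing "through" human history...
status_quo = objective(depth_rel_humanity=1000, depth_abs=1200)

# ...versus humanity overwritten by a fast SAT solver: D(x) rises, D(x/h) collapses.
sat_solver_world = objective(depth_rel_humanity=50, depth_abs=5000)

assert sat_solver_world < status_quo   # the supercontroller disfavors the overwrite
```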
What increases the favorability of the outcome, in light of that function, is the computation of representations that take humanity as an input. You could imagine the s...