Re: Generality.
Yes, I agree a toy setup and a proof are needed here. In case it wasn't clear, my intention with this post was to suss out whether there was related work already out there (it looks like there isn't) and then do some intuition pumping in preparation for a deeper formal effort, in which you are instrumental and for which I am grateful. If you would be interested in working with me on this in a more formal way, I'm very open to collaboration.
Regarding your specific case, I think we may both be confused about the math. I think you are right that there's something seriously wrong with the formulas I've proposed.
If the string y is incompressible and shallow, then whatever x is, D(x) ~ D(x/y), because D(x) (at least in the version I'm using for this argument) is the minimum computational time of producing x from an incompressible program. If there is a minimum-running-time program P that produces x, then appending y as noise at the end isn't going to change the running time.
I think this case with incompressible y is like your Ongoing Tricky Procession.
On the other hand, say w is a string with high depth. Which is to say, whether or not it is compressible in space, it is compressible in time: you get it by starting with something incompressible and shallow and letting it run in time. Then there are going to be some strings x such that D(x/w) + D(w) ~ D(x). There will also be a lot of strings x such that D(x/w) ~ D(x), because D(w) is finite and there are tons of things the universe can compute that are deeper. So for a given x, D(x) > D(x/w) > D(x) - D(w), roughly speaking.
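The rough bounds in the paragraph above can be written out as follows. This is only a restatement of the informal claim, not a proven inequality. The symbol \gtrsim ("greater than or approximately equal to") is a substitution for the strict > in the text, since both extreme cases are stated there as approximate equalities.

```latex
% Sketch only: D(.) is logical depth and D(x/w) is the depth of x given w,
% as used informally in this thread. Not a proven result.
\[
  \underbrace{D(x)}_{\text{$w$ does not help}}
  \;\gtrsim\; D(x/w) \;\gtrsim\;
  \underbrace{D(x) - D(w)}_{\text{$w$ accounts for as much depth as it can}}
\]
```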
I'm saying that h, the humanity data, is logically deep, like w, not incompressible and shallow, like y or the Ongoing Tricky Procession.
Hmm, it looks like I messed up the formula yet again.
What I'm trying to figure out is how to select for universes u such that h is responsible for a maximal amount of the total depth. Maybe that's a matter of minimizing D(u/h). Only that would perhaps lead to globe-flattening shallowness.
What if we tried to maximize D(u) - D(u/h)? That's like the opposite of what I originally proposed.
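Written out, the candidate objective in the line above might look like the following. The set U of candidate universes and the name u^* are assumptions introduced here for illustration; the thread does not define them.

```latex
% Sketch only: U (the set of candidate universes) and u^* are hypothetical
% names; D(u) and D(u/h) are used as in the rest of this thread.
\[
  u^{*} \;=\; \operatorname*{arg\,max}_{u \in U} \bigl[\, D(u) - D(u/h) \,\bigr]
\]
```

The bracketed difference is meant to capture how much of the depth of u is accounted for by h. It is near zero when h does not help compute u, and, if the rough bound given earlier for w carries over, it is at most about D(h).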
I'm still confused as to what D(u/h) means. It looks like it should refer to the number of logical steps you need to predict the state of the universe - exactly, or up to a certain precision - given only knowledge of human history up to a certain point. But then any event you can't predict without further information, such as the AI killing everyone using some astronomical phenomenon we didn't include in the definition of "human history", would have infinite or undefined D(u/h).
I attended Nick Bostrom's talk at UC Berkeley last Friday and got intrigued by these problems again. I wanted to pitch an idea here, with the question: Have any of you seen work along these lines before? Can you recommend any papers or posts? Are you interested in collaborating on this angle in further depth?
The problem I'm thinking about (surely naively, relative to y'all) is: What would you want to program an omnipotent machine to optimize?
For the sake of avoiding some baggage, I'm not going to assume this machine is "superintelligent" or an AGI. Rather, I'm going to call it a supercontroller, just something omnipotently effective at optimizing some function of what it perceives in its environment.
As has been noted in other arguments, a supercontroller that optimizes the number of paperclips in the universe would be a disaster. Maybe any supercontroller that was insensitive to human values would be a disaster. What constitutes a disaster? An end of human history. If we're all killed and our memories wiped out to make more efficient paperclip-making machines, then it's as if we never existed. That is existential risk.
The challenge is: how can one formulate an abstract objective function that would preserve human history and its evolving continuity?
I'd like to propose an answer that depends on the notion of logical depth, as proposed by C.H. Bennett and outlined in section 7.7 of Li and Vitanyi's An Introduction to Kolmogorov Complexity and Its Applications, which I'm sure many of you have handy. Logical depth is a fascinating complexity measure that Li and Vitanyi summarize thusly:
The mathematics is fascinating and better read in the original Bennett paper than here. Suffice it presently to summarize some of its interesting properties, for the sake of intuition.