I'm interested in thinking formally about AI risk. I believe that a proper mathematization of the problem is important to making intellectual progress in that area.
I have been trying to understand the rather critical notion of optimization power. I was hoping that I could find a clear definition in Bostrom's Superintelligence. But having looked at all the references to optimization power listed in the index, as far as I can tell he defines it nowhere. The closest he gets is an implicit definition via the relation that the rate of change in intelligence equals optimization power divided by recalcitrance (pp. 62-77). This is an empty definition: it is tautological, characterizing optimization power only in terms of other, equally vague terms.
Looking around, this post by Yudkowsky, "Measuring Optimization Power", doesn't directly formalize optimization power either. He does, however, discuss how one would predict or identify whether a system were the result of an optimization process, in a Bayesian way:
The quantity we're measuring tells us how improbable this event is, in the absence of optimization, relative to some prior measure that describes the unoptimized probabilities. To look at it another way, the quantity is how surprised you would be by the event, conditional on the hypothesis that there were no optimization processes around. This plugs directly into Bayesian updating: it says that highly optimized events are strong evidence for optimization processes that produce them.
This is not, however, a definition that can be used to help identify the pace of AI development, for example. Rather, it is just an expression of how one would infer anything in a Bayesian way, applied to the vague 'optimization process' phenomenon.
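That said, Yudkowsky's measure can at least be made concrete. Here is a minimal sketch of how I read it (my own rendering, with an invented toy example, not his code): count the bits of optimization as the negative log, base 2, of the prior probability mass on outcomes at least as preferred as the one observed.

```python
# Sketch of Yudkowsky's measure as I read it (my own rendering):
# bits of optimization = -log2 of the 'unoptimized' probability mass
# on outcomes at least as good as the one actually achieved.
import numpy as np

def optimization_bits(outcome_utility, utilities, prior):
    """How surprised you'd be by an outcome this good, absent optimization.

    utilities: utility of each possible outcome
    prior:     'unoptimized' probability of each outcome
    """
    mass_as_good = prior[utilities >= outcome_utility].sum()
    return -np.log2(mass_as_good)

# Toy example: 16 equally likely outcomes; hitting the single best one
# is 4 bits of optimization (improbability 1/16 under the null).
utilities = np.arange(16.0)
prior = np.full(16, 1 / 16)
print(optimization_bits(15.0, utilities, prior))  # -> 4.0
```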
Alex Altair has a promising attempt at formalization here, but it looks inconclusive. He points out the difficulty of identifying optimization power with just the shift in probability mass over utility, according to some utility function. I may be misunderstanding, but my gloss on this is that defining optimization power purely in terms of differences in the probability of utility says nothing substantive about how a process has power, which matters if it is going to be related to some other concept, like recalcitrance, in a useful way.
Has there been any further progress in this area?
It's notable that this discussion makes zero references to computational complexity, formally or otherwise. That's striking, because the informal discussion of 'optimization power' is about speed and capacity to compute, whether in brains, chips, or whatever else. There is a very well-developed formal theory of computational complexity at the heart of contemporary statistical learning theory. I would think that the tools for specifying optimization power would be in there somewhere.
Those of you interested in the historical literature on this sort of thing may be interested in the cyberneticists Rosenblueth, Wiener, and Bigelow's 1943 paper "Behavior, Purpose and Teleology", one of the first papers to discuss machine 'purpose'. They associate purpose with optimization, but in the particular sense of a process driven by a negative feedback loop as it approaches its goal. That does not exactly square with an 'explosive' teleology, which is one indication that explosively purposeful machines might be quite rare or bizarre. In general, the 20th-century cybernetics movement has a lot in common with the contemporary AI research community, which is interesting, because its literature is rarely directly referenced. I wonder why.
I see two main ways to deal mathematically with these optimization processes:
1) The first is a 'whatever-it-takes' process that realizes a goal function ideally (in the limit). To get a feel for how the mathematics looks, I suggest a look at the comparable mathematics of the operational amplifier ('op-amp' for short).
An ideal op-amp also does whatever it takes to realize the transfer function applied to the input. Non-ideal, i.e. real, op-amps fail this goal, but one can give operating ranges by comparing the parameters of the transfer-function elements with the parameters of the op-amp (mostly its open-loop gain A_OL).
I think this is a good model for the limiting case, because we abstract the 'optimization process' as a black box and look at what it does to its goal function, namely realize it. We can make this mathematically precise; a sketch follows below.
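To make the analogy concrete, here is a minimal sketch (my own illustration, with assumed numbers) of the standard non-inverting amplifier: with feedback fraction β, the closed-loop gain is A_OL / (1 + A_OL·β), which tends to the goal function 1/β as A_OL grows without bound.

```python
# The op-amp analogy made concrete (my own illustration, not from the
# original post). A non-inverting amplifier with feedback fraction beta
# has closed-loop gain G = A_OL / (1 + A_OL * beta). As the open-loop
# gain A_OL -> infinity, G -> 1/beta: the circuit 'does whatever it
# takes' to realize the goal function 1/beta, whatever A_OL's exact value.

def closed_loop_gain(a_ol: float, beta: float) -> float:
    """Gain of a non-inverting amplifier with open-loop gain a_ol."""
    return a_ol / (1.0 + a_ol * beta)

beta = 0.01  # feedback fraction; ideal goal gain = 1/beta = 100
for a_ol in (1e3, 1e5, 1e7):
    g = closed_loop_gain(a_ol, beta)
    error = 1 / beta - g  # shortfall of the real device from the ideal
    print(f"A_OL={a_ol:.0e}: gain={g:.4f}, error={error:.4f}")
```

The point of the exercise is that the 'power' of the black box shows up as how closely it realizes its goal function, and the operating range tells you when the ideal abstraction breaks down.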
2) My second model tries to capture the differential equations implied by EY's description of Recursive Self-Improvement (RSI), namely the PDEs relating "Optimization slope", "Optimization resources", and "Optimization efficiency" to actual physical quantities. I started to write the equations down and put a few into Wolfram Alpha, but didn't have time to do a comprehensive analysis. I'd think, though, that the resulting equations form classes of functions which could be classified by their associated complexity and risk; a toy version is sketched below.
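As a toy rendering of that idea (my own guess at the functional form, not EY's actual equations): let y(t) be optimization power and suppose the optimization slope makes dy/dt = k·y^p, i.e. power is reinvested into self-improvement. Then p < 1 gives sub-exponential growth, p = 1 exponential growth, and p > 1 hyperbolic growth with a finite-time singularity, which is one way the solution classes sort by risk.

```python
# Toy RSI dynamics (my own guess at the form, not EY's actual equations).
# dy/dt = k * y**p: optimization power reinvested into self-improvement.
# p = 1 gives exponential growth; p > 1 gives hyperbolic growth with a
# finite-time singularity ("FOOM"); p < 1 stays sub-exponential.
import numpy as np
from scipy.integrate import solve_ivp

def rsi(t, y, k, p):
    return k * y**p

k, y0 = 0.5, [1.0]
for p in (0.5, 1.0, 2.0):
    sol = solve_ivp(rsi, (0.0, 1.9), y0, args=(k, p))
    print(f"p={p}: y(1.9) ~ {sol.y[0, -1]:.2f}")
# For p=2 the exact solution y0/(1 - k*y0*t) blows up at t = 1/(k*y0) = 2,
# so the solution classes really can be sorted by qualitative risk:
# sub-exponential, exponential, singular.
```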
And when searching for RSI, look what I found:
Mathematical Measures of Optimization Power
1) This is an interesting approach. It looks very similar to the approach taken by the mid-20th century cybernetics movement--namely, modeling social and cognitive feedback processes with the metaphors of electrical engineering. Based on this response, you in particular might be interested in the history of that intellectual movement.
My problem with this approach is that it considers the optimization process as a black box. That seems particularly unhelpful when we are talking about the optimization process acting on itself as a cognitive process. It's eas...