I'm interested in thinking formally about AI risk. I believe that a proper mathematization of the problem is important to making intellectual progress in that area.
I have been trying to understand the rather critical notion of optimization power. I was hoping that I could find a clear definition in Bostrom's Superintelligence. But having followed every reference to optimization power in the index, as far as I can tell he defines it nowhere. The closest he gets is defining it in terms of rate of change and recalcitrance (pp.62-77). This is an empty definition--it just tautologically defines optimization power in terms of other, equally vague terms.
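For reference, rendered as an equation in my own notation, the relation in that chapter is essentially

$$\frac{dI}{dt} = \frac{\text{Optimization power}}{\text{Recalcitrance}},$$

where $I$ stands for the system's intelligence or capability. Neither term on the right-hand side gets an independent operational definition, which is the circularity I am complaining about.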
Looking around, this post by Yudkowsky, "Measuring Optimization Power", doesn't directly formalize optimization power either. He does discuss how one would predict or identify whether a system is the result of an optimization process in a Bayesian way:
The quantity we're measuring tells us how improbable this event is, in the absence of optimization, relative to some prior measure that describes the unoptimized probabilities. To look at it another way, the quantity is how surprised you would be by the event, conditional on the hypothesis that there were no optimization processes around. This plugs directly into Bayesian updating: it says that highly optimized events are strong evidence for optimization processes that produce them.
This is not, however, a definition that can be used to help identify the pace of AI development, for example. Rather, it is just an expression of how one would infer anything in a Bayesian way, applied to the vague 'optimization process' phenomenon.
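For concreteness, as I read the post, the measure it gestures at is roughly the improbability, under the unoptimized prior, of hitting an outcome at least as preferred as the one actually achieved:

$$\mathrm{OP}(x) = -\log_2 \mu\{\, y : y \succeq x \,\},$$

where $\mu$ is the prior measure over outcomes with no optimizer present and $\succeq$ is the preference ordering (my notation, not a quote). This gives "bits of optimization", but it says nothing about what generates those bits or how fast.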
Alex Altair has a promising attempt at formalization here, but it looks inconclusive. He points out the difficulty of identifying optimization power with just the shift in the probability mass of utility according to some utility function. I may be misunderstanding, but my gloss on this is that defining optimization power purely in terms of differences in the probability of utility doesn't say anything substantive about how a process has power. Which is important if it is going to be related to some other concept, like recalcitrance, in a useful way.
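To make that objection concrete, a definition of that shape would be something like the shift in expected utility between the outcome distribution without the optimizer and the one it actually induces (again my notation, not Altair's):

$$\mathrm{OP} = \mathbb{E}_{x \sim q}[U(x)] - \mathbb{E}_{x \sim p}[U(x)],$$

where $p$ is the distribution over outcomes absent the optimizer and $q$ is the distribution with it. Nothing in that expression connects to the process's resources, speed, or recalcitrance, which is the gap I'm pointing at.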
Has there been any further progress in this area?
It's notable that this discussion makes zero reference to computational complexity, formally or otherwise. That matters because the informal discussion of 'optimization power' is about speed and capacity to compute--whether in brains, chips, or whatever. There is a very well-developed formal theory of computational complexity at the heart of contemporary statistical learning theory, and I would expect the tools for specifying optimization power to be in there somewhere.
Those of you interested in the historical literature on this sort of thing may want to look at cyberneticists Rosenblueth, Wiener, and Bigelow's 1943 paper "Behavior, Purpose and Teleology", one of the first papers to discuss machine 'purpose'. They associate purpose with optimization, but in the particular sense of a process driven by a negative feedback loop as it approaches its goal. That does not exactly square with an 'explosive' teleology, which is one indicator that explosively purposeful machines might be quite rare or bizarre. In general, the 20th-century cybernetics movement has a lot in common with the contemporary AI research community. Which is interesting, because its literature is rarely directly referenced. I wonder why.
Interesting. I know a bit about cybernetics but wasn't consciously aware of a clear analog between cognitive and electrical processes. Maybe I'm missing some background. Could you give a reference I could follow up on?
That is a plausible interpretation. Fooming is actually the only valid interpretation given an ideal black-box AI modelled this way. We have to look into the box, which is comparable to looking at non-ideal op-amps. Fooming (on human time-scales) may still be possible, but to determine that we have to get a handle on the math going on inside the box(es).
One could formulate discrete analogs to the continuous equations relating self-optimization steps. But I don't think this gains much, as we are not interested in the efficiency of any specific optimization step. It wouldn't work anyway, simply because the effect of each optimization step isn't known precisely--not even its timing.
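For illustration (my own sketch, not anything from EY's post), such a discrete analog would be a recurrence like

$$I_{n+1} = I_n + \frac{O(I_n)}{R(I_n)},$$

with $O(I_n)$ the optimization power applied at step $n$ and $R(I_n)$ the recalcitrance at that capability level. The problem is exactly as stated: we don't know the functional forms of $O$ and $R$ for any particular step, nor the wall-clock time a step takes, so the recurrence buys little over the continuous version.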
But maybe your proposal to use complexity results from combinatorial optimization theory for specific feedback types (between the optimization stages outlined by EY) could provide better approximations to possible speedups.
Maybe we can approximate the black-box as a set of nested interrelated boxes.
Norbert Wiener is where it all starts. This book has a lot of essays. It's interesting--he's talking about learning machines before "machine learning" was a household word, but envisioning them as electrical circuits.
http://www.amazon.com/Cybernetics-Second-Edition-Control-Communication/dp/026273009X
I think that it's important to look inside the boxes. We know a lot about the mathematical limits of boxes, which could help us understand whether and how they might go foom.
Thank you for introducing me to that Concrete Mathematics book. That looks cool.