Ok, then I did not understand exactly what you meant, but I still don't think this is a counterexample to the problem Paul's idea tries to get around.
The problem is that logical systems have trouble reasoning about their own behavior; it is not a claim that no other logical system can reason about them. In particular, we are interested in whether an optimization process can model itself as an optimization process: accurately predicting that its future decisions are likely to achieve outcomes that score well on its optimization criteria, that the score will be better if it has more resources, and that the score will become much worse if its representation of its criteria gets corrupted (all using abstract reasoning, in much less time than fully simulating its future decisions). Can program analysis do that?
ETA: Also, I should note that this is a good question and I'm glad you asked it!
If your question is whether a program analyzer can, when given itself as input, produce sensible results, the answer is yes. Program analyzers are meant to run on arbitrary code, so in particular they can be run on themselves as a special instance. (Actually, nothing particularly special happens in this case as far as I can tell.)
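To make the "run it on itself" point concrete, here is a minimal sketch in Python. It is only a toy: the analyzer `undocumented_functions` and the property it checks ("every function has a docstring") are made up for illustration and are far weaker than what a real program analyzer handles. The analyzer's source is kept in a string so that the same text can be both executed and passed to the analyzer as input.

```python
import ast

# Hypothetical toy analyzer, stored as text so it can be given to itself as input.
ANALYZER_SOURCE = '''
import ast

def undocumented_functions(source):
    """Return the names of functions in `source` that lack a docstring."""
    tree = ast.parse(source)
    return [
        node.name
        for node in ast.walk(tree)
        if isinstance(node, ast.FunctionDef) and ast.get_docstring(node) is None
    ]
'''

# Load the analyzer from its own source text.
namespace = {}
exec(ANALYZER_SOURCE, namespace)
undocumented_functions = namespace["undocumented_functions"]

# Run the analyzer on itself: just another input program.
self_report = undocumented_functions(ANALYZER_SOURCE)

# Run it on an unrelated program that violates the toy property.
other_report = undocumented_functions("def f(x):\n    return x\n")

print("analyzer on itself:", self_report)
print("analyzer on other program:", other_report)
```

Nothing in the analyzer needs to know that its input happens to be its own source; the self-application goes through the same code path as any other input.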
Now, a key point is the formalism we are working in: a program analyzer takes in a program P and some specification S, and checks whether P obeys spe...
Previously: Why Neglect Big Topics.
Why was there no serious philosophical discussion of normative uncertainty until 1989, given that all the necessary ideas and tools were present at the time of Jeremy Bentham?
Why did no professional philosopher analyze I.J. Good’s important “intelligence explosion” thesis (from 1959[1]) until 2010?
Why was reflectively consistent probabilistic metamathematics not described until 2013, given that the ideas it builds on go back at least to the 1940s?
Why did it take until 2003 for professional philosophers to begin updating causal decision theory for the age of causal Bayes nets, and until 2013 to formulate a reliabilist metatheory of rationality?
By analogy to financial market efficiency, I like to say that “theoretical discovery is fairly inefficient.” That is: there are often large, unnecessary delays in theoretical discovery.
This shouldn’t surprise us. For one thing, there aren’t necessarily large personal rewards for making theoretical progress. But it does mean that those who do care about certain kinds of theoretical progress shouldn’t necessarily think that progress will be hard. There is often low-hanging fruit to be plucked by investigators who know where to look.
Where should we look for low-hanging fruit? I’d guess that theoretical progress may be relatively easy where:
These guesses make sense of the abundant low-hanging fruit in much of MIRI’s theoretical research, with the glaring exception of decision theory. Our September decision theory workshop revealed plenty of low-hanging fruit, but why should that be? Decision theory is widely applied in multi-agent systems, and in philosophy it’s clear that visible progress in decision theory is one way to “make a name” for oneself and advance one’s career. Tons of quality-adjusted researcher hours have been devoted to the problem. Yes, new theoretical advances (e.g. causal Bayes nets and program equilibrium) open up promising new angles of attack, but they don’t seem necessary to much of the low-hanging fruit discovered thus far. And progress in decision theory is definitely not valuable only to those with unusual views. What gives?
Anyway, three questions:
[1] Good (1959) is the earliest statement of the intelligence explosion: “Once a machine is designed that is good enough… it can be put to work designing an even better machine. At this point an ‘explosion’ will clearly occur; all the problems of science and technology will be handed over to machines and it will no longer be necessary for people to work. Whether this will lead to a Utopia or to the extermination of the human race will depend on how the problem is handled by the machines. The important thing will be to give them the aim of serving human beings.” The term itself, “intelligence explosion,” originates with Good (1965). Technically, artist and philosopher Stefan Themerson wrote a “philosophical analysis” of Good’s intelligence explosion thesis called Special Branch, published in 1972, but by “philosophical analysis” I have in mind a more analytic, argumentative kind of philosophical analysis than is found in Themerson’s literary Special Branch.