I've just posted an analysis to MIRI's blog called Transparency in Safety-Critical Systems. Its aim is to explain a common view about transparency and system reliability, and then open a dialogue about which parts of that view are wrong, or don't apply well to AGI.
The "common view" (not universal by any means) explained in the post is, roughly:
> Black box testing can provide some confidence that a system will behave as intended, but if a system is built such that it is transparent to human inspection, then additional methods of reliability verification are available. Unfortunately, many of AI’s most useful methods are among its least transparent. Logic-based systems are typically more transparent than statistical methods, but statistical methods are more widely used. There are exceptions to this general rule, and some people are working to make statistical methods more transparent.
Three caveats / open problems listed at the end of the post are:
- How does the transparency of a method change with scale? A 200-rule logic-based AI might be more transparent than a 200-node Bayes net, but what if we’re comparing 100,000 rules vs. 100,000 nodes? At least we can query the Bayes net to ask “what it believes about X,” whereas we can’t necessarily do so with the logic-based system (see the sketch after this list).
- Do the categories above really “carve reality at its joints” with respect to transparency? Does a system’s status as a logic-based system or a Bayes net reliably predict its transparency, given that in principle we can use either one to express a probabilistic model of the world?
- How much of a system’s transparency is “intrinsic” to the system, and how much of it depends on the quality of the user interface used to inspect it? How much of a “transparency boost” can different kinds of systems get from excellently designed user interfaces?
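To make the “query the Bayes net” point concrete, here is a toy sketch. It is not from the post: the rain/sprinkler/grass-wet network and every probability in it are invented for illustration, and the query is answered by brute-force enumeration rather than any real inference library.

```haskell
-- Toy Bayes net: Rain -> Sprinkler, and (Rain, Sprinkler) -> GrassWet.
-- We can ask the net what it believes about Rain given wet grass.
module Main where

type Prob = Double

-- Joint probability of one full assignment, factored along the network.
joint :: Bool -> Bool -> Bool -> Prob
joint rain sprinkler wet =
  pRain rain * pSprinkler rain sprinkler * pWet rain sprinkler wet
  where
    pRain True  = 0.2
    pRain False = 0.8
    pSprinkler True  s = if s then 0.01 else 0.99  -- sprinkler rarely on in rain
    pSprinkler False s = if s then 0.4  else 0.6
    pWet r s w =
      let pTrue = case (r, s) of
            (False, False) -> 0.0
            (False, True ) -> 0.9
            (True , False) -> 0.8
            (True , True ) -> 0.99
      in if w then pTrue else 1 - pTrue

-- P(Rain = True | GrassWet = True): sum out the unobserved variable.
pRainGivenWet :: Prob
pRainGivenWet = numer / denom
  where
    numer = sum [ joint True s True | s <- [False, True] ]
    denom = sum [ joint r    s True | r <- [False, True], s <- [False, True] ]

main :: IO ()
main = print pRainGivenWet  -- ~0.36 with the made-up numbers above
```

The same enumeration works mechanically for any variable in the net, which is the sense in which we can ask the system “what it believes about X.” A pile of 100,000 logical rules offers no comparably uniform query: what it “believes” about a proposition is implicit in whichever rules happen to mention it.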
The MIRI blog has only recently begun to regularly host substantive, non-news content, so it doesn't get much commenting action yet. Thus, I figured I'd post here and try to start a dialogue. Comment away!
Re: formal verification, I wonder if FAI developers will ultimately end up having to resort to improved programming environments to reduce bug rates, as described by DJB (HT Aaron Swartz). If this approach is taken, it probably makes sense to start developing the padded-wall programming environment on non-FAI test programming tasks well before the development of actual FAI begins, so that more classes of bugs can be identified and guarded against. See also.
Such an environment would most likely be based on Haskell, if not something more esoteric.
Assuming performance is a concern, Haskell probably strikes the best balance between a crazy-powerful type system and compiler maturity.
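For concreteness, here is one hedged guess at what the type-level “padding” of such an environment could look like in Haskell. `Padded`, `logLine`, `readConfig`, and `runPadded` are hypothetical names invented for this sketch, not an existing library.

```haskell
-- Hypothetical padded-wall sketch: a restricted monad that whitelists
-- effects, so whole classes of misbehavior are unrepresentable.
module Padded (Padded, logLine, readConfig, runPadded) where

-- The constructor is NOT exported: client code can only combine the
-- whitelisted actions below, never smuggle in arbitrary IO.
newtype Padded a = Padded { unPadded :: IO a }

instance Functor Padded where
  fmap f (Padded io) = Padded (fmap f io)

instance Applicative Padded where
  pure = Padded . pure
  Padded f <*> Padded x = Padded (f <*> x)

instance Monad Padded where
  Padded io >>= k = Padded (io >>= unPadded . k)

-- The entire effect vocabulary available to untrusted code:
logLine :: String -> Padded ()
logLine = Padded . putStrLn

readConfig :: Padded String
readConfig = Padded (readFile "config.txt")

-- Only the trusted harness may actually run a Padded computation.
runPadded :: Padded a -> IO a
runPadded = unPadded
```

Because the constructor stays hidden, stray file writes or network calls in untrusted code become type errors rather than runtime surprises; the “walls” are enforced by the compiler, not by code review.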
As an anecdote, the exact string-escaping issue you've mentioned is a (repeatedly) solved problem in that ecosystem; the type system is expressive enough to avoid most of the verbosity and non-genericity you'd get by trying the same thing in, say, C.
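As a sketch of that idea (the names here are invented, not any particular library's API, though libraries such as blaze-html build on the same trick): raw strings and escaped markup get distinct types, so forgetting to escape user input fails to typecheck.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
-- Escaped markup and raw strings are different types; the only public
-- ways to build Html are `escape` (always escapes) and `rawMarkup`
-- (for trusted literals), so unescaped user input is unrepresentable.
module Escape (Html, escape, rawMarkup, render) where

newtype Html = Html String
  deriving (Semigroup, Monoid)  -- safe concatenation of already-escaped pieces

escape :: String -> Html
escape = Html . concatMap esc
  where
    esc '<' = "&lt;"
    esc '>' = "&gt;"
    esc '&' = "&amp;"
    esc '"' = "&quot;"
    esc c   = [c]

-- For markup the programmer wrote, not for user data.
rawMarkup :: String -> Html
rawMarkup = Html

render :: Html -> String
render (Html s) = s
```

Now `render (rawMarkup "<b>" <> escape userInput <> rawMarkup "</b>")` typechecks, while splicing a bare `String` into `Html` does not: escaping is enforced by the compiler rather than programmer discipline, and the one `Html` type works generically wherever markup is consumed.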