a close coupling of representation syntax and semantics is necessary for a discovery program to prosper in a given domain
This is a really interesting point; it seems related to the idea that to be an expert in something, you need a vocabulary close to the domain in question.
It doesn't seem that interesting to me: it's just a restatement that "data compression = data prediction". When you have a vocabulary "close to the domain", that simply means that common concepts are compactly expressed. Once you've maximally compressed a domain, you have discovered all its regularities, and decompressing even a short random string will yield something useful.
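To make the "compression = prediction" point concrete, here is a minimal sketch of my own, not anything from Lenat's papers. It assumes an ideal coder that spends -log2 p bits on a symbol the model predicted with probability p. The names `code_length_bits`, `uniform` and `make_bigram_predictor` are made up for illustration, and the bigram model with add-one smoothing is just one convenient choice of predictor.

```python
import math
from collections import Counter, defaultdict

ALPHABET = "ab"  # toy alphabet, assumed for this example


def code_length_bits(text, predict):
    """Bits an ideal coder would need if predict(context, symbol) = P(symbol | context)."""
    total = 0.0
    for i, ch in enumerate(text):
        total += -math.log2(predict(text[:i], ch))
    return total


def uniform(context, symbol):
    """A model that knows nothing about the domain: every symbol is equally likely."""
    return 1.0 / len(ALPHABET)


def make_bigram_predictor(training_text, alphabet=ALPHABET):
    """A model that has learned which symbol tends to follow which."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(training_text, training_text[1:]):
        counts[prev][nxt] += 1

    def predict(context, symbol):
        if not context:
            return 1.0 / len(alphabet)  # no context yet, fall back to uniform
        seen = counts[context[-1]]
        # Add-one smoothing so unseen transitions never get probability zero.
        return (seen[symbol] + 1) / (sum(seen.values()) + len(alphabet))

    return predict


text = "ab" * 32
bigram = make_bigram_predictor("ab" * 100)

uniform_bits = code_length_bits(text, uniform)
bigram_bits = code_length_bits(text, bigram)

print(f"uniform model: {uniform_bits:.1f} bits")
print(f"bigram model:  {bigram_bits:.1f} bits")
```

The model that has picked up the one regularity in this toy domain needs far fewer bits for the same string, which is all "a vocabulary close to the domain" amounts to here. It says nothing about how to find such a model in general.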
How do you find which concepts are common and how do you represent them? Aye, there's the rub.
It also immediately raises the question of what the expert vocabulary of vocabulary formation/acquisition is, i.e. the domain of learning.
So my guess would be that the expert vocabulary of vocabulary formation is the vocabulary of data compression. I don't know how to make any use of that, though, because the No Free Lunch Theorems seem to say that there's no general algorithm that is the best across all domains, and so there's no algorithmic way to find which is the best compressor for this universe.
(ETA: multiple quick edits)
In the early 1980s Douglas Lenat wrote EURISKO, a program Eliezer called "[maybe] the most sophisticated self-improving AI ever built". The program reportedly had some high-profile successes in various domains, like becoming world champion at a certain wargame or designing good integrated circuits.
Despite requests, Lenat never released the source code. You can download an introductory paper: "Why AM and EURISKO appear to work" [PDF]. Honestly, reading it leaves a programmer still mystified about the internal workings of the AI: for example, what does the main loop look like? Such questions are supposedly answered in a more detailed publication, "EURISKO: A program that learns new heuristics and domain concepts," Artificial Intelligence (21): pp. 61-98. I couldn't find that paper available for download anywhere, and being in Russia I found it quite tricky to get a paper version. Maybe you Americans will have better luck with your local library? And to the best of my knowledge, no one has ever confirmed Lenat's EURISKO results, or even seriously tried to.
Today in 2009 this state of affairs looks laughable. A 30-year-old pivotal breakthrough in a large and important field... that never even got reproduced. What if it was a gigantic case of Clever Hans? How do you know? You're supposed to be a scientist, little one.
So my proposal to the LessWrong community: let's reimplement EURISKO!
We have some competent programmers here, don't we? We have open source tools and languages that weren't around in 1980. We can build an open source implementation available for all to play with. In my book this counts as solid progress in the AI field.
Hell, I'd do it on my own if I had the goddamn paper.
Update: RichardKennaway has put Lenat's detailed papers up online, see the comments.