Yes, now that I think about it, I guess their formalism tends towards incredibly low-signal environments, where the actions are primarily simple "tokens" that can be named suggestively but can't actually reveal the data needed for the kind of sophistication I'm thinking of. That is, the environment is generally incapable of displaying a tag that would suggest "novel action X (unlike novel actions Y or Z) could be dramatic and irreversible".
The only way to acquire such insight in a totally "from scratch" game context is to gain experience of having "died" after choosing X (probably several times), or else to have substantially richer environmental cues than is normal for systems like this. With richer cues, concepts like "reversibility" and "predictors of payoff size" could be worked out in trivial contexts and then correctly applied to more significant contexts later on, allowing model-based inference of both potential irreversibility and great importance in moderately novel situations.
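To make the "low-signal" point concrete, here is a minimal toy sketch in Python. It is my own illustration and not anything from the paper: the class name `TrapEnvironment`, the action labels and the reward values are all hypothetical. The action "X" leads to an absorbing state in which every later reward is fixed regardless of what the agent does, and nothing the agent sees beforehand distinguishes "X" from the other tokens.

```python
class TrapEnvironment:
    """Toy environment (hypothetical): one action is an irreversible trap."""

    def __init__(self, trap_action="X", dead_reward=0.0):
        self.trap_action = trap_action
        self.dead_reward = dead_reward
        self.dead = False

    def step(self, action):
        # Once "dead", rewards no longer depend on the action chosen.
        if self.dead:
            return self.dead_reward
        if action == self.trap_action:
            self.dead = True
            return self.dead_reward
        # Ordinary, reversible actions: "a" is good, anything else is bad.
        return 1.0 if action == "a" else -1.0


env = TrapEnvironment()
rewards = [env.step(a) for a in ["a", "b", "X", "a", "a"]]
```

The only feedback channel is the reward itself, so an agent starting from scratch can learn that "X" is fatal only by having chosen it.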
"Measuring universal intelligence: Towards an anytime intelligence test"; abstract:
http://www.csse.monash.edu.au/~dld/Publications/HernandezOrallo+DoweArtificialIntelligenceJArticle.pdf
Example popular media coverage: http://www.sciencedaily.com/releases/2011/01/110127131122.htm
The group's homepage: http://users.dsic.upv.es/proy/anynt/
(There's an applet, but it seems to be about constructing a simple agent and stepping through various environments; there's no working IQ test.)
The basic idea, if you already know your AIXI*, is to start with simple programs** and then test the subject on increasingly harder ones. To save time, boring games such as random environments or ones where the agent can 'die'*** are excluded, and a few rules are added to prevent gaming the test (by, say, deliberately failing on harder tests so as to be given only easy tests which one scores perfectly on) and to take into account how slowly or quickly the subject makes predictions.
* there are apparently no good overviews of AIXI as a whole topic, but you could start at http://www.hutter1.net/ai/aixigentle.htm or http://www.hutter1.net/ai/uaibook.htm
** simple as defined by Kolmogorov complexity; since KC is uncomputable, one of the computable variants - which put bounds on resource usage - is used instead
*** make a mistake which turns any future rewards into fixed rewards with no connection to future actions
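
The "start simple, then get harder" loop can be sketched in a few lines of Python. This is only a rough illustration under my own assumptions, not the paper's actual procedure: the "environment" is a repeating bit pattern whose length stands in crudely for complexity (real Kolmogorov-style measures are nothing this simple), rewards are +1/-1 per prediction, and the names `make_environment`, `run_episode`, `ContextAgent`, `adaptive_test` and the `pass_mark` threshold are all hypothetical. It also leaves out the anti-gaming and timing rules.

```python
import random


def make_environment(complexity, rng):
    """Hypothetical stand-in: a repeating bit pattern of length `complexity`."""
    return [rng.randint(0, 1) for _ in range(complexity)]


def run_episode(agent_factory, pattern, steps):
    """Average reward in [-1, 1]: +1 per correct prediction, -1 per miss."""
    agent = agent_factory()
    total = 0
    for t in range(steps):
        target = pattern[t % len(pattern)]
        guess = agent.predict()
        total += 1 if guess == target else -1
        agent.observe(target)
    return total / steps


class ContextAgent:
    """Predicts whichever bit last followed the most recent `context` bits."""

    def __init__(self, context=3):
        self.context = context
        self.history = []
        self.table = {}

    def predict(self):
        key = tuple(self.history[-self.context:])
        return self.table.get(key, 0)

    def observe(self, bit):
        key = tuple(self.history[-self.context:])
        self.table[key] = bit
        self.history.append(bit)


def adaptive_test(agent_factory, rounds=20, steps=200, pass_mark=0.5, seed=0):
    """Raise complexity after a pass, lower it (never below 1) after a fail."""
    rng = random.Random(seed)
    complexity = 1
    results = []
    for _ in range(rounds):
        pattern = make_environment(complexity, rng)
        score = run_episode(agent_factory, pattern, steps)
        results.append((complexity, score))
        if score >= pass_mark:
            complexity += 1
        else:
            complexity = max(1, complexity - 1)
    return results


results = adaptive_test(ContextAgent, rounds=10)
```

Each entry in `results` is a `(complexity, score)` pair, so you can watch the difficulty climb until the agent's short memory stops being enough. Note that a subject could keep this naive loop at low complexity by deliberately failing, which is exactly the kind of gaming the extra rules are meant to prevent.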