I think Shane Legg's universal intelligence itself involves Kolmogorov complexity, so it's not computable and will not work here. (Also, it involves a function V, encoding our values; if human values are irreducibly complex, that should add a bunch of bits.)
In general, I think this approach seems too good to be true? An intelligent agent is one which performs well in its environment. But don't the "no free lunch" theorems show that you need to know what the environment is like in order to do that? Intuitively, that's what should cause the Kolmogorov complexity to go up.
He made an actual test of it that involved generating random brainfuck programs. He then ran various reinforcement learning algorithms on it to measure their intelligence, and even tested humans.
That is an actual computable test that can be run.
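To make "computable" concrete, here is a rough sketch of the one ingredient mentioned above: generating random brainfuck programs and running them under a step cap so that every run terminates. This is not the actual test; the names `random_program` and `run_bf`, the 64-cell wrapping tape, and the 1000-step limit are all assumptions made for illustration.

```python
import random

OPS = "+-<>[].,"  # the eight brainfuck instructions

def random_program(rng, length):
    """Draw a program uniformly at random (brackets may be unbalanced)."""
    return "".join(rng.choice(OPS) for _ in range(length))

def match_brackets(prog):
    """Return a jump table for [ and ], or None if they are unbalanced."""
    stack, jumps = [], {}
    for i, c in enumerate(prog):
        if c == "[":
            stack.append(i)
        elif c == "]":
            if not stack:
                return None
            j = stack.pop()
            jumps[i], jumps[j] = j, i
    return None if stack else jumps

def run_bf(prog, inputs=(), max_steps=1000, tape_len=64):
    """Run prog with a step cap; return its outputs, or None if invalid."""
    jumps = match_brackets(prog)
    if jumps is None:
        return None
    tape, ptr, pc, steps, out = [0] * tape_len, 0, 0, 0, []
    inputs = list(inputs)
    while pc < len(prog) and steps < max_steps:
        c = prog[pc]
        if c == "+":
            tape[ptr] = (tape[ptr] + 1) % 256
        elif c == "-":
            tape[ptr] = (tape[ptr] - 1) % 256
        elif c == ">":
            ptr = (ptr + 1) % tape_len  # assumed: tape wraps around
        elif c == "<":
            ptr = (ptr - 1) % tape_len
        elif c == ".":
            out.append(tape[ptr])
        elif c == ",":
            tape[ptr] = inputs.pop(0) if inputs else 0
        elif c == "[" and tape[ptr] == 0:
            pc = jumps[pc]
        elif c == "]" and tape[ptr] != 0:
            pc = jumps[pc]
        pc += 1
        steps += 1
    return out  # the step cap guarantees we always get here

rng = random.Random(0)
programs = [random_program(rng, 12) for _ in range(5)]
outputs = [run_bf(p, max_steps=200) for p in programs]
```

The step cap is what keeps the whole thing computable: a program that would loop forever just gets cut off instead of hanging the test.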
The no free lunch theorems apply to a completely uninformative prior. We have a prior: the Solomonoff prior, where you assume the environment was generated by a computer program and that simpler programs are more likely than more complex ones. With that, some AI programs will be objectively better than others....
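A toy sketch of that last point, under heavy assumptions: the real Solomonoff prior is uncomputable, so here a "program" is just a short bit pattern that the environment repeats forever, and its weight is 2^-length. The two agents and all function names are made up for illustration.

```python
from itertools import product

def environments(max_len):
    """Yield every bit pattern up to max_len; each stands in for a program."""
    for n in range(1, max_len + 1):
        for bits in product((0, 1), repeat=n):
            yield bits

def prior_weight(pattern):
    # Simplicity prior (toy version): shorter patterns get more weight.
    return 2.0 ** -len(pattern)

def score(agent, pattern, steps=20):
    """Fraction of correct next-bit predictions on the repeating pattern."""
    history, correct = [], 0
    for t in range(steps):
        bit = pattern[t % len(pattern)]
        correct += agent(history) == bit
        history.append(bit)
    return correct / steps

def always_zero(history):
    """Baseline agent that ignores the history entirely."""
    return 0

def shortest_period(history):
    """Predict by extending the shortest period consistent with the history."""
    n = len(history)
    for p in range(1, n + 1):
        if all(history[i] == history[i - p] for i in range(p, n)):
            return history[n - p]
    return 0  # empty history: arbitrary guess

def weighted_score(agent, max_len=4):
    """Average score over all environments, weighted by the toy prior."""
    envs = list(environments(max_len))
    total = sum(prior_weight(e) for e in envs)
    return sum(prior_weight(e) * score(agent, e) for e in envs) / total

baseline = weighted_score(always_zero)
learner = weighted_score(shortest_period)
```

Under this weighting the agent that looks for simple structure scores strictly higher than the one that ignores its observations, which is the sense in which a simplicity prior lets you rank agents at all.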
The Kolmogorov complexity ("K") of a string ("S") specifies the size of the smallest Turing machine that can output that string. If a Turing machine (equivalently, by the Church-Turing thesis, any AI) has size smaller than K, then no matter how much it rewrites its own code, it won't be able to output S. To be specific, of course it can output S by enumerating all possible strings, but it won't be able to decide on S and output it exclusively among the options available. Now suppose that S is the source code for an intelligence strictly better than all those with complexity <K. Now, we are left with 3 options: