Solomonoff induction is generally given as the correct way to penalise more complex hypotheses when calculating priors. A great introduction can be found here.
My question is, how is this actually calculated in practice?
As an example, say I have 2 hypotheses:
A. The probability distribution of the output is given by the same normal distribution for all inputs, with mean and standard deviation .
B. The probability distribution of the output is given by a normal distribution depending on an input with mean and standard deviation .
It is clear that hypothesis B is more complex (using an additional input [], having an additional parameter [] and requiring 2 additional operations to calculate) but how does one calculate the actual penalty that B should be given vs A?
AIXI is related. Being based on Solomonoff induction, it is also incomputable. However, there's AIXItl which is an approximation (with memory and time contraints), and it's okay at something like first person pac-man, so there probably is an approximation to Solomonoff induction. I don't know how useful it is, and I've never seen it used.