The PAC-learning scenario doesn't have any parameter that goes to infinity, so I'm not sure why you dismiss "fixed" overheads :-)
Once you've chosen them, they're fixed, and don't run off to infinity.
This definition-only-up-to-a-constant is one of the weaknesses of minimum description length. (The other is its uncomputability. Shalizi somewhere else remarks that in discussions of algorithmic complexity, it is traditional to solemnly take out Kolmogorov complexity, exhibit its theoretical properties, remark on its uncomputability, and put it away again before turning to practical matters.)
This is precisely the point where I'd like to hear LW's informed replies.
FWIW, this is mine, informed or otherwise. Anyone else have light to shed?
No, let me try nailing this jelly to the wall once again. The definition-only-up-to-a-constant is a weakness of MDL, but this weakness isn't relevant to my question at all! Even if we had some globally unique variant of MDL derived from some nice mathematical idea, learning theory still doesn't use description lengths, and would be perfectly happy with rules that have long descriptions as long as we delineate a small set of those rules. To my mind this casts doubt on the importance of MDL.
I declare this Open Thread open for discussion of Less Wrong topics that have not appeared in recent posts.