I imagine there could be two different compression strategies that both happen to produce a result of the same length, but cannot be merged.
I think this is correct, but I see it as similar to chirality: multiple mirror-image versions of the same essential information. It probably also depends on the description language you use, so maybe something has multiple versions in one language but only one in another?
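For what it's worth, the language dependence is bounded, at least as far as lengths go. The standard invariance theorem says that for any two universal description languages $U$ and $V$ there is a constant $c_{UV}$ that does not depend on the string $x$. The symbols $K_U$ and $K_V$ are just my notation for the Kolmogorov complexity measured in each language:

```latex
\[
  \lvert K_U(x) - K_V(x) \rvert \le c_{UV} \quad \text{for all strings } x
\]
```

Note that this only constrains the length of the shortest description. It says nothing about how many distinct descriptions reach that length, so the number of "versions" could still differ from one language to another.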
To me, it really looks like brains and LLMs are both using embedding spaces to represent information. Embedding spaces ground symbols by automatically relating all concepts they contain, including the grammar for manipulating these concepts.
I don't know, just how compressible are we? I agree that the lead in my 36 molar is a part of my description, but anomalies such as these are always going to be the hardest part of compression since noise is not compressible. So maybe a complete description would look more like "all of the usual teeth, with xyz lead anomalies".
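Here is a rough sketch of the "usual teeth plus anomalies" idea, using zlib as a stand-in for an ideal compressor. That is a big assumption: zlib only gives a crude upper bound on description length, not the real Kolmogorov complexity, and the pattern, sizes, and anomaly positions are all made up for illustration.

```python
import os
import zlib


def compressed_size(data: bytes) -> int:
    """Length after zlib compression: a crude upper bound, not true Kolmogorov complexity."""
    return len(zlib.compress(data, 9))


n = 100_000
pattern = b"all of the usual teeth, "

# Highly regular data: one short pattern repeated until we have n bytes.
regular = (pattern * (n // len(pattern) + 1))[:n]

# Pure noise: n bytes from the OS random source.
noise = os.urandom(n)

# Regular data with 20 isolated anomalies (hypothetical "lead atoms").
buf = bytearray(regular)
for k in range(20):
    buf[k * 5_000 + 7] = 0xFF  # positions 7, 5007, ..., 95007
anomalous = bytes(buf)

regular_size = compressed_size(regular)
anomalous_size = compressed_size(anomalous)
noise_size = compressed_size(noise)

print("regular:  ", regular_size)
print("anomalous:", anomalous_size)
print("noise:    ", noise_size)
```

The regular and anomalous inputs should both shrink to a tiny fraction of their original size, with the anomalies costing only a little extra, while the noise should barely shrink at all.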
The "noise" of lead atoms in your teeth is among the least important bits in your Kolmogorov string, and would be the first thing dropped if you decided to allow a lossy representation. This actually reminds me of overfitting. The first things a model learns are the actually useful bits; only later, when you train too long, does it start to memorize the random noise in the dataset.
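A toy version of "drop the least important bits first", again with zlib as a very rough proxy for description length. Everything here is a made-up illustration: the "signal" is a slow sawtooth stored in the high 4 bits of each byte, and the "noise" is random low 4 bits.

```python
import os
import zlib


def compressed_size(data: bytes) -> int:
    """Length after zlib compression: a crude proxy for description length."""
    return len(zlib.compress(data, 9))


n = 50_000

# Hypothetical signal: a slow sawtooth held in the high 4 bits of each byte.
signal = bytes(((i // 100) % 16) << 4 for i in range(n))

# Noise: random values in the low 4 bits of each byte.
random_bytes = os.urandom(n)
noisy = bytes(s | (r & 0x0F) for s, r in zip(signal, random_bytes))

# Lossy representation: throw away the low 4 bits.
lossy = bytes(b & 0xF0 for b in noisy)

noisy_size = compressed_size(noisy)
lossy_size = compressed_size(lossy)
max_error = max(abs(a - b) for a, b in zip(noisy, lossy))

print("noisy compressed size:", noisy_size)
print("lossy compressed size:", lossy_size)
print("largest per-byte error:", max_error)
```

The noisy version cannot get much below half its original size, since each byte carries 4 random bits. The lossy version recovers the signal exactly and compresses to almost nothing, at the cost of an error of at most 15 per byte.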