Your three criteria are remarkably similar to the three criteria David Deutsch uses to distinguish between good and bad explanations. He argues via negativa, stating what makes explanations bad.
From The Logic of Experimental Tests:
[A]n explanation is bad (or worse than a rival or variant explanation) to the extent that...
(i) it seems not to account for its explicanda; or
(ii) it seems to conflict with explanations that are otherwise good; or
(iii) it could easily be adapted to account for anything (so it explains nothing).
The first principle would correspond to beliefs paying rent, although unlike BPR the implication seems to be that you're setting the scope of the phenomena to be explained rather than deriving expectations from your belief. But this ought not to be a problem, since a theory, once established, would also imply what further potential explicanda fall under it.
The third principle corresponds to having our models constrain possibility rather than being able to account for any possibility, which is something Deutsch and Eliezer would actually agree on.
And the second principle would correspond more weakly to what it would mean to make something truly a part of you. A good explanation or gears-level model would not be in conflict with the rest of your knowledge in any meaningful way, and a great test of this would be if one explanation/model were derivable from another.
This suggests to me that Deutsch is heading in a direction concordant with what you're getting at, and could be another point of reference for developing this idea. Note, though, that he's not actually a Bayesian. However, most of his refutations concern the logical models that are supposed to populate probabilistic terms, and they have been known to Bayesians for a while; they amount to saying that logical omniscience doesn't actually hold in the real world, only in toy models such as those used for AI. This is precisely what logical induction is supposed to solve, so I figure that once its kinks are worked out, all will be well.
Additionally, if we adopt Deutsch's terms, using value judgements of "good" and "bad" also gets you gradients for free, since you can set up a partial ordering as a preference ranking among your models/explanations. Otherwise I find the notion of "degrees of gears-ness" a bit incoherent without reintroducing probability to something that's supposed to be deterministic.
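To make the "gradients for free" point concrete, here's a minimal sketch (my own illustration, not anything Deutsch or you actually propose) of how pairwise "worse than" judgements on the three criteria above induce a partial order over explanations, with some pairs simply remaining incomparable:

```python
from dataclasses import dataclass

# Hypothetical sketch: score each explanation on Deutsch's three failure modes.
# One explanation dominates another only if it is at least as good on every
# criterion and strictly better on at least one (Pareto dominance), which
# yields a partial order rather than a total one.

@dataclass
class Explanation:
    name: str
    fails_to_account: bool      # (i) doesn't account for its explicanda
    conflicts_with_good: bool   # (ii) conflicts with otherwise-good explanations
    explains_anything: bool     # (iii) could be adapted to account for anything

def dominates(a: Explanation, b: Explanation) -> bool:
    """True if `a` is no worse than `b` on every criterion and strictly
    better on at least one."""
    a_flaws = (a.fails_to_account, a.conflicts_with_good, a.explains_anything)
    b_flaws = (b.fails_to_account, b.conflicts_with_good, b.explains_anything)
    no_worse = all(af <= bf for af, bf in zip(a_flaws, b_flaws))
    strictly_better = any(af < bf for af, bf in zip(a_flaws, b_flaws))
    return no_worse and strictly_better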
The problem might go all the way up to the notion of correspondence to reality itself. There's no simple way of stating that we're accessing reality without that also constituting a representation.
Your mutual information is between representations passed from one part of your mind to other parts; likewise, what counts as "accurate" information about reality is really just accuracy relative to some idealized set of expectations from some other model that would take the photos as evidence.
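For reference, the standard definition of mutual information only ever relates two random variables, so in this setting both X and Y would have to be representations (say, the output of one sub-model and the input of another), never un-represented reality directly:

$$ I(X;Y) \;=\; H(X) - H(X \mid Y) \;=\; \sum_{x,y} p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)} $$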