I could be drawing too long a bow, but this seems to recall the distinction Marvin Minsky makes between Logic and Common-Sense thinking. Logic is a single "thin" chain of true-or-false propositions: if any single link in the chain is false, the whole chain collapses. Common sense, in his parlance, is less discrete: we can have degrees of belief in any part of a chain, and some parts of the chain will be deeper and stronger than others.
He also greatly admired a passage in Aristotle's De Anima that shows how a single object can be represented in multiple ways, which Minsky saw as very significant for operating in the world.
"Thus the essence of a house is assigned in such a formula as 'a shelter against destruction by wind, rain, and heat'; the physicist would describe it as 'stones, bricks, and timbers'; but there is a third possible description which would say that it was that form in that material with that purpose or end. Which, then, among these is entitled to be regarded as the genuine physicist? The one who confines himself to the material, or the one who restricts himself to the formulable essence alone? Is it not rather the one who combines both in a single formula?"
Am I conflating different things by saying this reads as similar to the idea of favoring Cross-Entropy rather than the shortest program?
Minsky extended the idea of multiple representations into what he called Papert's Principle: that how we administer and use these multiple representations together, or when we opt for one and exclude others, is the most important part of 'mental growth'.
"Some of the most crucial steps in mental growth are based not simply on acquiring new skills, but on acquiring new administrative ways to use what one already knows."
Returning to replacing axioms and how this relates to Minsky's ideas about multiple representations, take for example making an omelette. I may use a stone bench-top, a tiled backsplash, a spoon, or any sort of 'hard' surface to crack the egg. The "crack the egg" step of the recipe stays the same, with the same anticipated result, but what it relies on can be swapped among mental representations of the perceived hardness of many different objects.
Does any of this seem relevant or have I made some crude, tenuous connections?
Epistemic status: hand waving conjecture. Let me know what I got right and wrong.
I was thinking about reflective reasoning and "strange loops through the meta level", and it led me to this intuition: even if you have to make some assumptions to support your beliefs, those beliefs gain credence if they can be justified by various different sets of assumptions, and not just one specific set.
What do I mean by justification by different sets of assumptions? Let's take Peano arithmetic (PA) as an example (though any other formal system can work for this purpose as well).
If (1) is correct, then it's not necessary to accept that axiom as long as you're willing to accept its replacement (and assuming that you're fine with PA but just don't like assuming stuff, you should be fine with the replacement statement as well).
If (2) is correct, then the same is true for any PA axiom.
If (3) is correct, then the same is true for any set of PA axioms. This means that if you think PA really does prove only correct things, you don't actually have to accept the axioms as mere assumptions, because you can prove them from their substitutes, which you already accept.
(Question: Is this actually true about Peano arithmetic?)
So though at any moment you would be using axioms to define what you're talking about and prove statements, none of them could be said to be required and permanent assumptions. There would be no assumptions you can be blamed for always assuming.
To bring it back to reflective reasoning, it would match the intuition that no belief, not even the most fundamental ones like the inductive and Occamian priors, is beyond scrutiny under the full power of our reasoning, and therefore none can be said to be merely assumed.
This also reminds me of @So8res' cross-entropy idea - that instead of just prioritizing the hypothesis which has the shortest code, we should prioritize the hypothesis which has many different short codes.
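To make the contrast concrete, here is a toy sketch in Python. It is my own reading of the idea, not necessarily the exact formulation: I assume each code of length n bits contributes a weight of 2^-n, and that a hypothesis is scored by the total weight of all its codes. The function names and the code lengths are made up for illustration.

```python
from math import log2


def shortest_code_bits(code_lengths):
    """Score a hypothesis by its single shortest code (in bits)."""
    return min(code_lengths)


def many_codes_bits(code_lengths):
    """Score a hypothesis by the total weight of all its codes.

    Assumption: a code of length n bits carries weight 2**-n.
    The result is the effective cost in bits (lower is better).
    """
    total_weight = sum(2.0 ** -n for n in code_lengths)
    return -log2(total_weight)


# Hypothetical code lengths in bits, invented for this example.
hypothesis_a = [10]       # one short code
hypothesis_b = [12] * 8   # eight slightly longer codes

for name, lengths in (("A", hypothesis_a), ("B", hypothesis_b)):
    print(name, shortest_code_bits(lengths), many_codes_bits(lengths))
```

Under the shortest-code rule, A wins (10 bits against 12). Once every code counts, B's eight 12-bit codes add up to the weight of a single 9-bit code, so B comes out ahead of A.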
In the same way, the more assumptions a belief requires, and the more complex these assumptions are, the more we discount that belief. But we should also look at how many different sets of assumptions can support that belief, and prioritize beliefs which can rely on more sets of assumptions rather than on ones that require very specific assumptions.
This also applies to beliefs supported by circular justifications. If a belief requires a very specific circle, it's less likely to be true than if it can fit in many different circles.
If there's no particular set of assumptions my worldview permanently depends on, if it can spring from many different sets of assumptions, then it's a stronger worldview, even if it still requires assumptions or even circular reasoning.
I think this can be a step towards, or a part of, a solution to the regress problem or even The Problem of the Criterion, because it alleviates the need to permanently rely on just one criterion.