The usual explanation of probability theory goes like this:

There is this thing called Probability Space, which consists of three other things:

  1. Sample Space - some non-empty set
  2. Event Space - a set of subsets of the Sample Space
  3. Probability Function - a measure function over the elements of the Event Space.

Then several examples are given of how this mathematical model can be mapped onto real-world situations.

For instance, for a dice roll the appropriate sample space would be {1; 2; 3; 4; 5; 6}. For the Event Space we can use the power set of the Sample Space, and the probability function has to give every elementary event equal value:

P(1) = P(2) = P(3) = P(4) = P(5) = P(6) = 1/6
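As a minimal sketch of this construction in Python, the three components of the probability space for a fair die can be written out explicitly. The names `sample_space`, `event_space`, `power_set` and `probability` are illustrative choices, not taken from any library, and the event space is assumed to be the full set of subsets of the sample space.

```python
from fractions import Fraction
from itertools import chain, combinations

# Sample space: the six faces of the die.
sample_space = frozenset({1, 2, 3, 4, 5, 6})

def power_set(s):
    """Return every subset of s as a frozenset."""
    items = sorted(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(subset) for subset in subsets}

# Event space: all subsets of the sample space (2**6 = 64 events).
event_space = power_set(sample_space)

def probability(event):
    """Uniform measure: each elementary event gets weight 1/6."""
    if event not in event_space:
        raise ValueError("not an event of this probability space")
    return Fraction(len(event), len(sample_space))

print(len(event_space))                   # 64
print(probability(frozenset({6})))        # 1/6
print(probability(frozenset({2, 4, 6})))  # 1/2
print(probability(sample_space))          # 1
```

Nothing in this code says why the sample space should have six elements rather than five; swapping in `frozenset({1, 2, 3, 4, 5})` produces an equally well-formed probability space.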

The point of such examples is to give students an intuitive understanding of how to apply the math of set theory to reasoning under uncertainty, and I think they generally work fine for that purpose. But they also obscure an interesting and very important question: How do we select a sample space for a given problem?

Intuition aside, how can we know that the correct sample space is {1; 2; 3; 4; 5; 6} and not, say {1; 2; 3; 4; 5}? Both of these sets fit the formal definition of the sample space - they are non-empty sets. How can we know which one of them is the sample space for the exact problem we are talking about? How do we logically pinpoint the notion of sample space for a very specific problem, instead of sample space for at least some problem?

By Definition?

Surely, we can simply declare that semantic statement "Sample space for a dice roll" means exactly {1; 2; 3; 4; 5; 6}. And so the answer to our question: "How can we know which is the sample space for the problem we are talking about?" is simple - by definition. 

Be careful with this kind of reasoning. As a wise man once said:

In cases like these, it is futile to try to settle the problem by coming up with some new definition of the word “rational” and saying, “Therefore my preferred answer, by definition, is what is meant by the word ‘rational.’ ” This simply raises the question of why anyone should pay attention to your definition. I’m not interested in probability theory because it is the holy word handed down from Laplace. I’m interested in Bayesian-style belief-updating (with Occam priors) because I expect that this style of thinking gets us systematically closer to, you know, accuracy, the map that reflects the territory.

If we can just define "Sample space for a dice roll" as {1; 2; 3; 4; 5; 6}, we can likewise define it as {1; 2; 3; 4; 5} or anything else and none of these definitions will be superior to any other, so we are essentially back to square one.

The only difference is that now we've replaced the initial question with "How can we know which definition is the right one?"

The second issue is that now we have to do the exact same thing for every conceivable probability theory problem. We managed to pinpoint a sample space for a dice roll and now we can use probability theory to answer questions about dice rolls. But in order to solve problems involving coin tosses we need to axiomatically define a sample space for a coin toss, and in order to solve problems involving picking a marble from a bag with n marbles we need to axiomatically define such a sample space for every n. And so on.

In other words, this approach treats the knowledge about the sample space for a problem as completely non-generalizable. And this doesn't seem to be the case in practice. It's enough to show a couple examples of sample spaces to human students so that they grasp the idea, and then can apply probability theory to different problems. There is some rational principle that allows our intuition to work this way. What is it? Even if defining "Sample space for a dice roll" as {1; 2; 3; 4; 5; 6} is the right thing to do, what makes it so?

Betting Argument?

The other approach that quickly comes to mind is trying to construct a betting argument. 

The problem here is that this "betting" stuff is simply too advanced for us at this point. We need to cover a lot of ground before we can talk about it coherently at all, let alone have it help us in any way. Let's look at an example.

Albert: So you believe that outcome 6 can't possibly happen?

Barry: Exactly.

Albert: That's crazy!

Barry: On the contrary, this is completely obvious.

Albert: Are you putting your money where your words are?

Barry: Sure. Bet you a dollar at 1:1000 odds that next time I roll the die the outcome isn't 6.

Albert: Deal.

[The die is rolled. Its top side shows four dots]

Barry: Four. Have you learned your lesson?

Albert: You just got lucky this time. Your betting odds are completely ridiculous.

Barry: We made a bet. I won it. You should be updating in favor of my position as a proper rationalist.

Albert: No! I'm winning in expectation. Here, let me calculate expected utility for you... See? I may be losing a dollar five times out of six, but when I win, I win big, which compensates for all the losses and much more.

Barry: You are begging the question, assuming that your winning outcome could've happened, which is exactly the crux of our disagreement. Besides, you can come up with whatever rationalizations you want, but the fact of the matter is that you are one dollar poorer. Isn't that the whole point of betting arguments?

Albert: Whatever. If you are so confident, then let's bet one more time.

Barry: Okay... give me another dollar then.

Albert: What? You haven't even rolled the die yet!

Barry: I have. It's four. We've been discussing it for the last couple of minutes. Have you forgotten? I suppose memory problems would explain your crazy beliefs...

Albert: Are you mocking me? That was a previous dice roll. I'm talking about a new one!

Barry: That was the dice roll that we agreed to bet on. Now you wanted to repeat the bet...

Albert: Argh! Of course I meant the next dice roll! Why would I want to bet on a roll that has already happened and I know that the outcome is not 6?

Barry: Frankly? Because you are insane. So you want to make a new bet on a new dice roll?

Albert: Yes!

Barry: I don't think I'm interested. You didn't update at all based on our previous bet. Seems that you are not arguing in good faith here.

At the very least we need to understand a concept of some procedure that includes multiple dice rolls, with multiple outcomes. And that we are not just talking about one particular dice roll but about this procedure as a whole. But this is not all.

To formally construct a betting argument you need to have well-defined events over which you could define probability and utility functions. Which means that you already need to have an agreement on the sample space. To define a monetary reward for an outcome we first need to understand what the outcomes are, which is the whole question that is being investigated.
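To make this dependence concrete, here is a small sketch of Albert's expected-value calculation carried out under both candidate sample spaces. It assumes the stakes from the dialogue (Albert loses $1 unless a six comes up, in which case he wins $1000) and a uniform measure over whichever sample space is used; the function names are hypothetical.

```python
from fractions import Fraction

def expected_payoff(sample_space, payoff):
    """Expected payoff under a uniform measure on sample_space (an assumption)."""
    weight = Fraction(1, len(sample_space))
    return sum(weight * payoff(outcome) for outcome in sample_space)

def albert_payoff(outcome):
    """Albert wins $1000 on a six and loses $1 otherwise."""
    return 1000 if outcome == 6 else -1

albert_model = [1, 2, 3, 4, 5, 6]
barry_model = [1, 2, 3, 4, 5]

print(expected_payoff(albert_model, albert_payoff))  # 995/6
print(expected_payoff(barry_model, albert_payoff))   # -1
```

The same bet comes out as a large expected gain for Albert under one sample space and a guaranteed loss under the other, so the calculation cannot settle which sample space is right.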

This isn't just a technicality. To see this, let's suppose Albert managed to persuade Barry to keep rolling the die and making bets.

[The die is rolled. It shows one dot]

Barry: One. [Collects another dollar from Albert] Are you sure you still want to keep going?

Albert: Absolutely. Roll again.

[The die is rolled, but bumps into a teacup and ends on a corner in an undefined state. Barry re-rolls. Four dots again.]

Albert: [Gives another dollar] Again.

[Another roll. The top side shows three dots]

Albert: Again.

[The die falls from the table and rolls under the couch. Barry reaches under the couch to get the die back and re-rolls it. This time it clearly shows six dots.]

Albert: Aha! Six! Now... wait, what are you doing?

[Barry quickly re-rolls the die again. Three dots are on the top side]

Barry: No idea what you are talking about. It's three - pay up.

Albert: What the Hell?! It was six! You just re-rolled it. You owe me a grand! 

Barry: Just like I re-rolled it when it landed on a corner or fell under the couch. Those obviously do not count. 

Albert: Of course those didn't count! This one does!

Barry: I don't remember agreeing to it. As far as I'm concerned only when the top side of the die clearly shows 1, 2, 3, 4 or 5 dots it counts as an outcome of the toss. 

Albert: You swindler! That's not what people usually mean! You should have specified!

Barry: I explicitly said that outcome 6 never happens. It's you who jumped to conclusions, before figuring out what's going on. By the way, you still owe me another dollar.

Albert: It's you who owe me a thousand! Pay up!

Barry: Quite a sore loser you are. I knew there was no point in continuing arguing with you after our first bet.

Map and Territory

There are multiple lessons that could be learned here. But what is specifically relevant for our case is this.

When Albert says "Dice roll" he means this procedure:

  1. Roll a die, wait until it stops.
  2. If it landed on a corner or in a place with low visibility, go back to step 1.
  3. Add together the number of dots on the top side of the die. The sum corresponds to the outcome that has just happened.

While Barry has a different procedure in mind:

  1. Roll a die, wait until it stops.
  2. If it landed on a corner or in a place with low visibility or there are 6 dots on the top side, go back to step 1.
  3. Add together the number of dots on the top side of the die. The sum corresponds to the outcome that has just happened.

Both procedures include some amount of arbitrariness. There is no particular reason why we count the number of dots on the top side and not, say, on the bottom side, beyond general human agreement. Nor is there a reason why the mapping between the sum of the dots and the outcome in the sample space has to be direct, or why we have to re-roll when the die lands on an edge instead of counting it as some other outcome.

But also both procedures are connected to reality. They are entangled with a physical object - a die and an action of throwing it.

The disagreement between Albert and Barry is purely semantic. As soon as we've replaced the mental paintbrush handle "Dice roll" with a full description of the procedures, there is nothing to argue about. 

For the first procedure the sample space is {1; 2; 3; 4; 5; 6}. 

For the second, it's {1; 2; 3; 4; 5}.

How do we know that? The same way we can have any map reflecting a territory. We go outside and look. We conduct an experiment. We follow the described procedure and see what outcomes we get according to it. This is the helpful part of the betting argument - actually rolling the die and observing what happens, regardless of how money changes hands.
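The experiment can be mimicked with a short simulation of both procedures. This is only a sketch: the fair die, the 5% chance of an unreadable throw, and all function names are illustrative assumptions rather than anything fixed by the procedures themselves.

```python
import random

def physical_roll(rng, p_unreadable=0.05):
    """One physical throw: returns None if the die lands on a corner or
    out of sight, otherwise the number of dots on the top side.
    The 5% unreadable rate is an arbitrary illustrative assumption."""
    if rng.random() < p_unreadable:
        return None
    return rng.randint(1, 6)

def albert_procedure(rng):
    """Re-roll only when the result is unreadable."""
    while True:
        top = physical_roll(rng)
        if top is not None:
            return top

def barry_procedure(rng):
    """Re-roll when the result is unreadable or shows six dots."""
    while True:
        top = physical_roll(rng)
        if top is not None and top != 6:
            return top

def observed_outcomes(procedure, trials=10_000, seed=0):
    """Run the procedure many times and collect the distinct outcomes."""
    rng = random.Random(seed)
    return sorted({procedure(rng) for _ in range(trials)})

print(observed_outcomes(albert_procedure))  # [1, 2, 3, 4, 5, 6]
print(observed_outcomes(barry_procedure))   # [1, 2, 3, 4, 5]
```

The physical throw is identical in both cases; only the written-down procedure differs, and that difference alone determines which outcomes are ever recorded.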

To validate any map we need to compare it to the territory. But to do that we need to be able to talk about the territory at all. We need to take a step back and conceptualize the iterated procedure of dice rolling instead of a singular roll.

If probability is in the map, what is the territory?


I don't think I'm interested. You didn't update at all based on our previous bet.

That should make you more interested (financially) in betting against the person.

You may assume that this is how Albert managed to persuade Barry to continue :)

A less arbitrary way to define a sample space is to take the set of all possible worlds. Each event, e.g. a die roll, corresponds to the disjunction of possible worlds where that event happens. The possible worlds can differ in a lot of tiny details, e.g. the exact position of a die on the table. Even just an atom being different at the other end of the galaxy would constitute a different possible world. A possible world is a maximally specific way the world could be. So two possible worlds are always mutually exclusive. And the set of all possible worlds includes every possible way reality could be. There are no excluded possibilities like a die falling on the floor.

But for subjective probability theory a "sample space" isn't even needed at all. A probability function can simply be defined over a Boolean algebra of propositions. Propositions ("events") are taken to be primary instead of being defined via primary outcomes of a sample space. We just have beliefs in some propositions, and there is nothing psychological corresponding to outcomes of a sample space. We only need outcomes if probabilities are defined to be ratios of frequencies of outcomes. Likewise, "random variables" or "partitions" don't make sense for subjective probability theory: there are just propositions.

A less arbitrary way to define a sample space is to take the set of all possible worlds.

And how would you know which worlds are possible and which are not?

How would Albert and Barry use the framework of "possible worlds" to help them resolve their disagreement?

But for subjective probability theory a "sample space" isn't even needed at all. A probability function can simply be defined over a Boolean algebra of propositions. Propositions ("events") are taken to be primary instead of being defined via primary outcomes of a sample space.

This simply passes the buck of the question from "What is the sample space corresponding to a particular problem?" to "What is the event space corresponding to a particular problem?". You've renamed your variables, but the substance of the issue is still the same.

How would you know, whether

or

for a dice roll?

And how would you know which worlds are possible and which are not?

Yes, that's why I only said "less arbitrary".

Regarding "knowing": In subjective probability theory, the probability over the "event" space is just about what you believe, not about what you know. You could theoretically believe to degree 0 in the propositions "the die comes up 6" or "the die lands at an angle". Or that the die comes up as both 1 and 2 with some positive probability. There is no requirement that your degrees of belief are accurate relative to some external standard. It is only assumed that the beliefs we do have compose in a way that adheres to the axioms of probability theory. E.g. P(A)≥P(A and B). Otherwise we are, presumably, irrational.

Yes, that's why I only said "less arbitrary".

I don't think I can agree even with that. 

Previously we arbitrarily assumed that a particular sample space corresponds to a problem. Now we are arbitrarily assuming that a particular set of possible worlds corresponds to a problem. In the best case we are exactly as arbitrary as before and have simply renamed our set. In the worst case we are making a lot of extra unfalsifiable assumptions about metaphysics.

You could theoretically believe to degree 0 in the propositions "the die comes up 6" or "the die lands at an angle". Or that the die comes up as both 1 and 2 with some positive probability. There is no requirement that your degrees of belief are accurate relative to some external standard. It is only assumed that the beliefs we do have compose in a way that adheres to the axioms of probability theory. E.g. P(A)≥P(A and B). Otherwise we are, presumably, irrational.

Well, technically P(Ω)=1 is an axiom, so you do need a sample space if you want to adhere to the axioms.

But sure, if you do not care about accurate beliefs and systematic ways to arrive at them at all, then the question is, indeed, not interesting. Of course then it's not clear what use probability theory is to you in the first place.

Well, technically P(Ω)=1 is an axiom, so you do need a sample space if you want to adhere to the axioms.

For a propositional theory this axiom is replaced with P(⊤)=1, i.e. a tautology in classical propositional logic receives probability 1.

But sure, if you do not care about accurate beliefs and systematic ways to arrive at them at all, then the question is, indeed, not interesting. Of course then it's not clear what use probability theory is to you in the first place.

Degrees of belief adhering to the probability calculus at any point in time rules out things like "Mary is a feminist and a bank teller" to simultaneously receive a higher degree of belief than "Mary is a bank teller". It also requires e.g. that if and then . That's called "probabilism" or "synchronic coherence".

Another assumption is typically that P_new(B) = P_old(B | A) after "observing" A. This is called "conditionalization" or sometimes "diachronic coherence".

Degrees of belief adhering to the probability calculus at any point in time rules out things like "Mary is a feminist and a bank teller" to simultaneously receive a higher degree of belief than "Mary is a bank teller". It also requires e.g. that if  and  then . That's called "probabilism" or "synchronic coherence".

What is even the motivation for it? If you are not interested in your map representing a territory, why demand that your map be coherent?

And why not assume some completely different axioms? Surely there are many potential ways to logically pinpoint things. Why this one in particular? Why not allow

P(Mary is a feminist and bank teller) > P(Mary is a feminist)?

Why not simply remove all the limitations from the function P?

Not to remove all limitations: I think the probability axioms are a sort of "logic of sets of beliefs". If the axioms are violated the belief set seems to be irrational. (Or at least the smallest incoherent subset that, if removed, would make the set coherent.) Conventional logic doesn't work as a logic for belief sets, as the preface and lottery paradox show, but subjective probability theory does work. As a justification for the axioms: that seems a similar problem to justifying the tautologies / inference rules of classical logic. Maybe an instrumental Dutch book argument works. But I do think it does come down to semantic content: If someone says "P(A and B)>P(A)" it isn't a sign of incoherence if he means with "and" what I mean with "or".

Regarding the map representing the territory: That's a more challenging thing to formalize than just logic or probability theory. It would amount to a theory of induction. We would need to formalize and philosophically justify at least something like Ockham's razor. There are some attempts, but I think no good solution.

I think the probability axioms are a sort of "logic of sets of beliefs". If the axioms are violated the belief set seems to be irrational.

Well yes, they are. But how do you know which axioms are the correct axioms for the logic of sets of beliefs? How come violation of some axioms seems to be irrational, while violation of other axioms does not? What do you even mean by "rational" if not "a systematic way to arrive at map-territory correspondence"?

You see, in any case you have to ground your mathematical model in reality. Natural numbers may be logically pinpointed by arithmetical axioms, but the question of whether some action with particular objects behaves like addition of natural numbers is a matter of empiricism. The reason we came up with the notion of natural numbers in the first place is that we've encountered a lot of stuff in reality whose behavior generalizes this way. And the same goes for the logic of beliefs. First we encounter some territory, then we try to approximate it with a map.

What I'm trying to say is that if you are already trying to make a map that corresponds to some territory, why not make one that corresponds better? You can declare that any consistent map is "good enough" and stop your inquiry there, but surely you can do better. You can declare that any consistent map following several simple conditions is good enough - that's a step in the right direction, but there is still a lot of room for improvement. Why not figure out the most accurate map that we can come up with?

That's a more challenging thing to formalize than just logic or probability theory.

Well, yes, it's harder than the subjective probability approach you are talking about. We are trying to pinpoint a more specific target: a probabilistic model for a particular problem, instead of just some probabilistic model.

It would amount to a theory of induction. We would need to formalize and philosophically justify at least something like Ockham's razor. 

No, not really. We can do a lot before we go down this particular rabbit hole. I hope my next post will make it clear enough.

It seems clear to me that statements expressing logical or probabilistic laws like or are "analytic". Similar to "Bachelors are unmarried".

The truth of a statement in general is determined by two things: its meaning and what the world is like. But for some statements the latter part is irrelevant, and their meanings alone are sufficient to determine their truth or falsity.

As soon as you have your axioms you can indeed analytically derive theorems from them. However, the way you determine which axioms to pick, is entangled with reality. It's an especially clear case with probability theory where the development of the field was motivated by very practical concerns. 

The reason why some axioms appear to us appropriate for logic of beliefs and some don't, is because we know what beliefs are from experience. We are trying to come up with a mathematical model approximating this element of reality - an intensional definition for an extensional referent that we have.

Being Dutch-bookable is considered irrational because you systematically lose your bets. Likewise, continuing to believe that a particular outcome can happen in a setting where it, in fact, can't and another agent could've already figured it out with the same limitations you have, is irrational for the same reason.

Similar to "Bachelors are unmarried".

Indeed. There are, in fact, some real-world reasons why the words "bachelor" and "unmarried" have these meanings in the English language, in both the "why these particular words for these particular meanings?" and the "why did these meanings deserve words at all?" senses: the etymology of the English language and the existence of the institution of marriage in the first place, both of which are the results of the social dynamics of humans whose psyche has evolved in a particular way.

The truth of a statement in general is determined by two things: its meaning and what the world is like.

I hope the previous paragraph does a good enough job of showing how the meaning of a statement is, in fact, connected to what the world is like.

Truth is a map-territory correspondence. We can separately talk about its two components: validity and soundness. As long as we simply conceptualize some mathematical model, logically pinpointing it for no particular reason, then we are simply dealing with tautologies and there is only validity. Drawing maps for the sake of drawing maps, without thinking about territory. But the moment we want our model to be about something, we encounter soundness. Which requires some connection to the outside world. And then there is a natural question of having a more accurate map and how to have it.

Yes, the meaning of a statement depends causally on empirical facts. But this doesn't imply that the truth value of "Bachelors are unmarried" depends less than completely on its meaning. Its meaning (M) screens off the empirical facts (E) and its truth value (T). The causal graph looks like this:

E —> M —> T

If this graph is faithful, it follows that E and T are conditionally independent given M: P(T | M, E) = P(T | M). So if you know M, E gives you no additional information about T.

And the same is the case for all "analytic" statements, where the truth value only depends on its meaning. They are distinguished from synthetic statements, where the graph looks like this:

E —> M —> T
|_________^

That is, we have an additional direct influence of the empirical facts on the truth value. Here E and T are no longer conditionally independent given M.
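As a rough numeric illustration of the two graphs, the sketch below builds a joint distribution over binary E, M and T for each case and compares P(T=1 | M=1) with P(T=1 | M=1, E). All the conditional probabilities are made-up values chosen for the example, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

half = Fraction(1, 2)

def joint(p_t_given):
    """Joint distribution over binary (E, M, T) with P(E=1) = 1/2,
    P(M=1 | E) depending on E, and P(T=1 | E, M) given by p_t_given."""
    table = {}
    for e, m, t in product((0, 1), repeat=3):
        p_m1 = Fraction(3, 4) if e else Fraction(1, 4)
        p_m = p_m1 if m else 1 - p_m1
        p_t1 = p_t_given(e, m)
        p_t = p_t1 if t else 1 - p_t1
        table[(e, m, t)] = half * p_m * p_t
    return table

def p_t1_given(table, m, e=None):
    """P(T=1 | M=m) or, if e is given, P(T=1 | M=m, E=e)."""
    rows = [(k, p) for k, p in table.items()
            if k[1] == m and (e is None or k[0] == e)]
    total = sum(p for _, p in rows)
    return sum(p for k, p in rows if k[2] == 1) / total

# Chain E -> M -> T: T depends on M only.
chain = joint(lambda e, m: Fraction(9, 10) if m else Fraction(1, 10))
# Extra edge E -> T: T depends on both M and E.
with_edge = joint(lambda e, m: Fraction(1 + 4 * m + 4 * e, 10))

for name, table in (("chain", chain), ("with_edge", with_edge)):
    print(name,
          p_t1_given(table, m=1),
          p_t1_given(table, m=1, e=0),
          p_t1_given(table, m=1, e=1))
# chain 9/10 9/10 9/10
# with_edge 4/5 1/2 9/10
```

In the chain, learning E changes nothing once M is known; with the extra edge, the three conditional probabilities differ.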

I think that logical and probabilistic laws are analytic in the above sense, rather than synthetic. Including axioms. There are often alternative axiomatizations of the same laws. So and are equally analytic, even though only the latter is used as an axiom.

Being Dutch-bookable is considered irrational because you systematically lose your bets.

I think the instrumental justification (like Dutch book arguments) for laws of epistemic rationality (like logic and probability) is too weak. Because in situations where there happens to be in fact no danger of being exploited by a Dutch book (because there is nobody who would do such an exploit) it is not instrumentally irrational to be epistemically irrational. But you continue to be epistemically irrational if you have e.g. incoherent beliefs. So epistemic rationality cannot be grounded in instrumental rationality. Epistemic rationality laws being true in virtue of their meaning alone (being analytic) therefore seems a more plausible justification for epistemic rationality.

Yes, the meaning of a statement depends causally on empirical facts. But this doesn't imply that the truth value of "Bachelors are unmarried" depends less than completely on its meaning.

I think we are in agreement here.

My point is that if your picking of particular axioms is entangled with reality, then you are already using a map to describe some territory. And then you can just as well describe this territory more accurately.

I think the instrumental justification (like Dutch book arguments) for laws of epistemic rationality (like logic and probability) is too weak. Because in situations where there happens to be in fact no danger of being exploited by a Dutch book (because there is nobody who would do such an exploit) it is not instrumentally irrational to be epistemically irrational. But you continue to be epistemically irrational if you have e.g. incoherent beliefs.

Rationality is about systematic ways to arrive at correct map-territory correspondence. Even if in your particular situation no one is exploiting you, the fact that you are exploitable in principle is bad. But to know what is exploitable in principle we generalize from all the individual acts of exploitation. It all has to be grounded in reality in the end.

Epistemic rationality laws being true in virtue of their meaning alone (being analytic) therefore seems a more plausible justification for epistemic rationality.

You've said yourself, meaning is downstream of experience. So in the end you have to appeal to reality while trying to justify it.

My point is that if your picking of particular axioms is entangled with reality, then you are already using a map to describe some territory. And then you can just as well describe this territory more accurately.

I think picking axioms is not necessary here and in any case inconsequential. "Bachelors are unmarried" is true whether or not I regard it as some kind of axiom or not. It seems the same holds for tautologies and probabilistic laws. Moreover, I think neither of them is really "entangled" with reality, in the sense that they are compatible with any possible reality. They merely describe what's possible in the first place. That bachelors can't be married is not a fact about reality but a fact about the concept of a bachelor and the concept of marriage.

Rationality is about systematic ways to arrive at correct map-territory correspondence. Even if in your particular situation no one is exploiting you, the fact that you are exploitable in principle is bad. But to know what is exploitable in principle we generalize from all the individual acts of exploitation. It all has to be grounded in reality in the end.

Suppose you are not instrumentally exploitable "in principle", whatever that means. Then it arguably would still be epistemically irrational to believe that "Linda is a feminist and a bank teller" is more likely than "Linda is a bank teller". Moreover, it is theoretically possible that there are cases where it is instrumentally rational to be epistemically irrational. Maybe someone rewards people with (epistemically) irrational beliefs. Maybe theism has favorable psychological consequences. Maybe Pascal's Wager is instrumentally rational. So epistemic irrationality can't in general be explained with instrumental irrationality as the latter may not even be present.

You've said yourself, meaning is downstream of experience. So in the end you have to appeal to reality while trying to justify it.

I don't think we have to appeal to reality. Suppose the concept of bachelorhood and marriage had never emerged. Or suppose humans had never come up with logic and probability theory, and not even with language at all. Or humans had never existed in the first place. Then it would still be true that all bachelors are necessarily unmarried, and that tautologies are true. Moreover, it's clear that long before the actual emergence of humanity and arithmetic, two dinosaurs plus three dinosaurs already were five dinosaurs. Or suppose the causal history had only been a little bit different, such that "blue" means "green" and "green" means "blue". Would it then be the case that grass is blue and the sky is green? Of course not. It would only mean that we say "grass is blue" when we mean that it is green.

I think picking axioms is not necessary here and in any case inconsequential.

By picking your axioms you logically pinpoint what you are talking about in the first place. Have you read Highly Advanced Epistemology 101 for Beginners? I'm noticing that our inferential distance is larger than it should be otherwise.

"Bachelors are unmarried" is true whether or not I regard it as some kind of axiom or not.

No, you are missing the point. I'm not saying that this phrase has to be an axiom itself. I'm saying that you need to somehow axiomatically define your individual words, assign them meaning, and only then, with regard to these language axioms, is the phrase "Bachelors are unmarried" valid.

Moreover, I think neither of them is really "entangled" with reality

You've drawn the graph yourself, how meaning is downstream of reality. This is the kind of entanglement we are talking about. The choice of axioms is motivated by our experience with stuff in the real world. Everything else is beside the point.

Suppose you are not instrumentally exploitable "in principle", whatever that means. Then it arguably would still be epistemically irrational to believe that "Linda is a feminist and a bank teller" is more likely than "Linda is a bank teller".

Yes. That's, among other things, what not being instrumentally exploitable "in principle" means. Epistemic rationality is a generalisation of instrumental rationality in the same way that arithmetic is a generalisation from the behaviour of individual objects in reality. It describes the kind of beliefs that are not exploitable in any case other than literally adversarial ones, such as a mindreader specifically rewarding people who do not have such beliefs.

I don't think we have to appeal to reality. Suppose the concept of bachelorhood and marriage had never emerged. Or suppose humans had never come up with logic and probability theory, and not even with language at all. Or humans had never existed in the first place. Then it would still be true that all bachelors are necessarily unmarried, and that tautologies are true.

I think the problem is that you keep using the word Truth to mean both Validity and Soundness and therefore do not notice when you switch from one to another.

Validity depends only on the axioms. As long as you are talking about some set of axioms in which P is defined in such a way that P(A) ≥ P(A&B) is a valid theorem, no appeal to reality is needed.

Likewise, you can talk about a set of axioms where P(A) ≤ P(A&B). These two statements remain valid in regards to their axioms.

But the moment you claim that this has something to do with the way beliefs - a thing from reality - are supposed to behave you start talking about soundness, and therefore require a connection to reality. As soon as pure mathematical statements mean something you are in the domain of map-territory relations.

Moreover, it's clear that long before the actual emergence of humanity and arithmetic, two dinosaurs plus three dinosaurs already were five dinosaurs.

The territory behaved in a way that we can now describe in the map as 2+3=5. But no maps existed back then. If we are in agreement about that, there is nothing substantial to argue about.

I think picking axioms is not necessary here and in any case inconsequential.

By picking your axioms you logically pinpoint what you are talking about in the first place. Have you read Highly Advanced Epistemology 101 for Beginners? I'm noticing that our inferential distance is larger than it should be otherwise.

I read it a while ago, but he overstates the importance of axiom systems. E.g. he wrote:

You need axioms to pin down a mathematical universe before you can talk about it in the first place. The axioms are pinning down what the heck this 'NUM-burz' sound means in the first place - that your mouth is talking about 0, 1, 2, 3, and so on.

That's evidently not true. Mathematicians studied arithmetic for two thousand years before it was axiomatized by Dedekind and Peano. Likewise, mathematical statisticians have studied probability theory long before it was axiomatized by Kolmogorov in the 1930s. Advanced theorems preceded these axiomatizations. Mathematicians rarely use axiom systems in their work even if they are theoretically available. That's why it is hard to translate proofs into Lean code. Mathematicians just use well-known mathematical facts (that are considered obvious or already sufficiently established by others) as assumptions for their proofs.

No, you are missing the point. I'm not saying that this phrase has to be an axiom itself. I'm saying that you need to somehow axiomatically define your individual words, assign them meaning, and only then, with regard to these language axioms, is the phrase "Bachelors are unmarried" valid.

That's obviously not necessary. We neither do nor need to "somehow axiomatically define" our individual words for "Bachelors are unmarried" to be true. What would these axioms even be? Clearly the sentence has meaning and is true without any axiomatization.
