Discussion about this post

Rafael Kaufmann

The "Bayesianism" being argued against here seems to actually be "Yudkowskianism", and clicking through your references digs up a lot of LessWrong posts. If so, I mostly agree with the substance of your criticism. In particular, I agree that attempting to tag probabilities onto propositions without making explicit and understanding the model that translates context into an application of Bayes' rule (which I call the "Yudkowskian vice", by analogy with the "Ricardian vice" of economics) makes many if not most attempts at Internet rationalism fail before they have even started. However, common Bayesian practice outside of Internet forums (going back at least to Jaynes's seminal book; I won't make claims about anything earlier than that, as I'm not a historian of science) does give models exactly the primacy you mention. (This is also not a philosophical innovation, but rather an ipsis litteris implementation of Quinean holism, an idea from the 1940s. And Quine was a *popular philosopher!*) Indeed, for a few decades already we've gone past that and into explicitly conditioning first-order variables on model-valued variables, and then performing higher-order inference on model space, which lets us solve (approximately) any kind of problem. For a particularly clear exposition of modern Bayesianism, check out Richard McElreath's "Statistical Rethinking", both the book and the accompanying lectures on YouTube.

On your first gripe, with "degrees of truth", I claim it's a fundamentally misguided concept. "P='The Earth is a sphere' is mostly true" is not a statement of a scalar attribute of a proposition that's just waiting to be quantified; it's a statement about the applicability of the proposition -- under which conditions it's OK to make this approximation. It's a convenient way to say "If you're trying to use the truth value of P to make claims/decisions about astronomy, then it's True; if you're trying to use it to make claims/decisions about some specific kind of engineering that cares about the exact distances to the center of the Earth, gravity, etc., then it's False." To say that "P is 99.99% true" may be logically possible in principle by somehow summing over model space, but it is not useful, because it misses the all-important fact that makes the "mostly true" statement useful -- under which conditions it's to be taken as true!
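
For concreteness, "summing over model space" could be written roughly as follows. This is only a sketch in notation of my own choosing (none of these symbols come from the post): $A$ is the proposition, $m$ ranges over candidate models, and $D$ is the available data.

```latex
\Pr(A \mid D) = \sum_{m} \Pr(A \mid m, D)\,\Pr(m \mid D),
\qquad
\Pr(m \mid D) \propto \Pr(D \mid m)\,\Pr(m)
```

The left-hand side is the single number; the information about which models $m$ make $A$ true lives only in the individual terms $\Pr(A \mid m, D)$, and it is exactly what gets averaged away.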

Richard Futrell

Regarding vagueness and pragmatic interpretation of things like "large" and "water in the fridge", it's worth pointing out that the best models of these phenomena (in terms of accurately predicting how people will interpret utterances) are in fact probabilistic Bayesian models. See, for example, the outline at https://www.problang.org/chapters/05-vagueness.html

These are models where utterance interpretation is a process whereby a listener does Bayesian inference about what an *informative* speaker would say, where that speaker is reasoning about a listener, who is in turn reasoning about a speaker, and so on recursively. The recursion bottoms out in a base case consisting of truth-value semantics. For example, for a word like "large" the semantics is "x is large if x is larger than a threshold θ", where θ is a free parameter whose value ends up being inferred probabilistically as part of the recursive reasoning process. The effect is that the (distribution on the) threshold θ ends up being set with respect to a reference class that would make the utterance informative based on the world models of the simulated listeners. For example, if I say "Jupiter is large" then you will end up inferring a threshold θ that would make sense for planets, and if I say "my thumb is large" you'll end up inferring a threshold θ that would make sense for bodily appendages.
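
To make the recursion concrete, here is a minimal sketch of one level of it (literal listener, then speaker, then pragmatic listener) for the word "large". Everything specific in it is an assumption of mine for illustration: the function names, the Gaussian priors and their numbers, the uniform prior over θ, the single alternative of saying nothing, and the values of `alpha` and `cost`. It is not the code from the linked chapter.

```python
import numpy as np

def interpret_large(sizes, prior, alpha=4.0, cost=1.0):
    """Joint posterior over (size, theta) after hearing "x is large".

    Rows index candidate sizes, columns index candidate thresholds.
    alpha (speaker rationality) and cost (of saying "large" rather than
    nothing) are illustrative values, not fitted ones.
    """
    sizes = np.asarray(sizes, dtype=float)
    prior = np.asarray(prior, dtype=float)
    prior = prior / prior.sum()
    thetas = sizes  # assumption: thresholds live on the same grid as sizes

    # Truth-value semantics: "large" is true iff size > theta.
    is_large = sizes[:, None] > thetas[None, :]

    # Literal listener: prior restricted to sizes that make "large" true.
    weighted = prior[:, None] * is_large
    mass = weighted.sum(axis=0, keepdims=True)
    l0_large = np.divide(weighted, mass,
                         out=np.zeros_like(weighted), where=mass > 0)
    # Literal listener after silence: just the prior.
    l0_silent = np.broadcast_to(prior[:, None], weighted.shape)

    # Speaker: soft-max choice between saying "large" and saying nothing.
    with np.errstate(divide="ignore"):
        score_large = np.exp(alpha * (np.log(l0_large) - cost))
    score_silent = np.exp(alpha * np.log(l0_silent))
    s1_large = score_large / (score_large + score_silent)

    # Pragmatic listener: prior on size, uniform prior on theta.
    joint = prior[:, None] * s1_large
    return joint / joint.sum()

def gaussian_prior(sizes, mean, sd):
    """Unnormalised Gaussian weights over a size grid."""
    return np.exp(-0.5 * ((sizes - mean) / sd) ** 2)

# Two reference classes with made-up size distributions.
thumb_cm = np.linspace(2.0, 10.0, 81)
planet_km = np.linspace(2_000.0, 150_000.0, 75)

thumb_post = interpret_large(thumb_cm, gaussian_prior(thumb_cm, 6.0, 1.0))
planet_post = interpret_large(
    planet_km, gaussian_prior(planet_km, 40_000.0, 30_000.0))

# Expected threshold under each posterior (marginalising over size).
thumb_theta = float((thumb_post.sum(axis=0) * thumb_cm).sum())
planet_theta = float((planet_post.sum(axis=0) * planet_km).sum())
print(f"inferred threshold for thumbs:  {thumb_theta:.1f} cm")
print(f"inferred threshold for planets: {planet_theta:.0f} km")
```

The same function is called for both reference classes; only the prior over sizes changes, and the inferred threshold moves with it.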

Which is all to say: a more complex, but still fully Bayesian and probabilistic process, grounding out in definite truth values, actually provides a very good model for how people use vague expressions like this. There's a pretty big academic literature on models like this, of which the size threshold above is just one example.
