If probability is in the map, then what is the territory? What are we mapping when we apply probability theory?

"Our uncertainty about the world, of course."

Uncertainty, yes. And sure, every map is, in a sense, a map of the world. But can we be more specific? Say, for a fair coin toss, what particular part of the world do we map with probability theory? Surely it's not the whole world at the same time, is it?

"It is. You map the whole world. Multiple possible worlds, in fact. In some of them the coin is Heads in the others it's Tails, and you are uncertain which one is yours."

Wouldn't that mean that I need to believe in some kind of multiverse to reason about probability? That doesn't sound right. Even if those "possible worlds" existed, how am I supposed to know that's the case?

"Well, you don't necessary have to believe that there are parallel worlds as real as ours in which the coin comes differently, though it's a respectable position about the nature of counterfactuals. Probability is in the mind, remember? You can simply imagine alternative worlds that are logically consistent with your observations."

Even so, this can't be the way things work. Humans are not, in fact, able to hold the whole world in their minds and validate its logical consistency, let alone multiple worlds. If that were actually required, then probability-theoretic reasoning would only be accessible to literally galaxy-brained superintelligences.

"Well, in practice we simply imagine that we've imagined the whole world, do not notice any contradictions and call it a day without thinking too hard about the matter. It's a standard practice in philosophy." 

Why does this not sound reassuring at all, I wonder? Wait a second, it's even worse than that. What about uncertainty about logic? Say I try to guess whether the 1,253,725,569th digit of pi is even or odd. Only one answer is actually logically consistent with my observations, but I don't know which one. Does that mean that I can't use probability theory to reason about it?

"Oh yes, formalizing logical uncertainty is a well known unsolved problem. We kind of pretend that we can still use probability theory with it as usual, even though it doesn't make sense in the framework of possible worlds, and it seems to work out just fine."

If, in practice, probability-theoretic reasoning works the same way with logical and physical uncertainty, but only one of them makes sense in the framework of possible worlds, doesn't that mean we need to throw away this framework and come up with something better?

"We've tried with the framework of centred possible worlds, which, in addition to all the worlds, specify the observer's place in them. Didn't really help with logical uncertainty. But it highlighted that we have troubles with indexical uncertainty too. It's now also a respected framework among philosophers."

Well, that went about as well as one might expect from adding extra complexity to a system just for the sake of extra complexity, without fixing the problems already in it. Maybe we are better off starting from scratch?

"Feel free to try."

Map of Uncertainty

The first thing that comes to mind is that this framework of possible worlds is not how map-territory relations usually work. Usually a map represents some part of the territory, not multiple territories, some of them imaginary.

"But probability theory needs to represent our uncertainty about the territory, not just state facts about it. That's the source of this difference."

Maps can already represent our uncertainty. Consider two towns with a similar overall layout. Buildings are positioned in the same places relative to the center of each town, but are used differently. For example, the building that is a grocery store in the first town is a theater in the second. Suppose we have a not-very-detailed map of the first town. It captures the layout but doesn't specify the function of each building. Such a map represents the second town just as much as the first. As a matter of fact, it would represent any town with the same building layout, whether it was built in the past, will be built in the future, or never existed at all. The map pinpoints the layout.

"And what does it have to do with representing our uncertainty?"

The map represents our uncertainty between all the towns with this particular layout. The level of detail on the map corresponds to our knowledge: the less specific the map is, the less we know, and vice versa. Learning a new fact about the town adds a new detail to the map and changes what kind of towns it represents. So if I know that all buildings have red roofs, I'm uncertain between all the towns with this layout and red roofs on all of the buildings, and my map now includes an additional detail: the roof color of the buildings.

"I see. But we still need to be able to talk about all the individual towns."

Do we, though? The whole point of math is to logically pinpoint a general principle, instead of talking about every individual example separately. We say 2+2=4, as a thing in itself, without having to hold in mind all the individual objects and processes from the real world for which it's true. We simply imply that for any objects and processes that work exactly like addition of natural numbers, two and two of such objects put through such a process will result in four of such objects. We can do the exact same thing here.

"But that's because we have a formal model of arithmetic. While in this case, you are just vaguely gesturing to map-territory relations. This is not the same thing. Also Sample Space is a set in our mathematical model. It has elements. If these are not worlds, then what are they?"

Fair enough. Let's get there. 

Definition of Probability Experiment

Addition of natural numbers is an abstraction over multiple objects and processes in the physical world. It represents both throwing rocks in a bucket and letting sheep out to pasture - very different things. Likewise, we need some abstraction for multiple tosses of a coin, which are, in fact, different physical processes. The properties of the coin, the way force is applied to it and how much, environmental factors - lots of things can vary. They are just irrelevant to our knowledge state. All that we know is that some coin tosses result in Heads and some in Tails, and no matter how many tosses we've already observed, we can't guess the outcome of the next one better than chance...

"Still sounds like vague gesturing to me."

...thankfully there is already a mathematical model for what we need! A function. So let there be a function

$$E: \mathbb{N} \to \Omega$$

For every natural number, it has one specific value from the sample space $\Omega$. We will call this function a probability experiment. And we will say that

$$E(i)$$

is the $i$-th iteration (or trial) of the probability experiment. And that

$$E(i) = \omega$$

means that $\omega$ is the outcome of the $i$-th iteration of the probability experiment.

And we also demand that the outcomes of different trials are statistically independent of each other.

"So how does it help us?"

We now have a mathematical model that corresponds to actual experiments in the real world. When I toss the coin and observe that it's Heads, instead of saying that I observe a world in which the coin is Heads, I can say: "The outcome of the 1st iteration of the coin tossing probability experiment is Heads", where 1 corresponds to exactly this trial. I don't need to talk about "possible worlds" at all. And then I can toss the coin more times to learn the outcomes for trials 2, 3, 4 and so on. Which gives me a pretty good idea about the properties of the function, and therefore about the general act of tossing a coin.
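To make this concrete, here is a minimal sketch in Python (the function name and the use of per-trial seeded randomness are my own illustrative choices, not part of the formalism): the experiment is literally a function, each trial index has one fixed outcome, and tossing the coin amounts to evaluating the function at the next index.

```python
import random

SAMPLE_SPACE = ("Heads", "Tails")

def coin_experiment(i: int) -> str:
    """The i-th iteration of the coin tossing probability experiment.

    A probability experiment is a function from trial indices to outcomes:
    every natural number i is mapped to one specific value from the sample
    space. The per-trial seeded RNG below stands in for all the physical
    details (the coin, the force, the environment) that we abstract away.
    """
    return random.Random(i).choice(SAMPLE_SPACE)

print(coin_experiment(1))                          # outcome of trial 1
print([coin_experiment(i) for i in range(2, 8)])   # outcomes of further trials
```

Note that the function itself is perfectly deterministic; the seeded randomness only models our ignorance about which outcome a given trial has.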

Probability Experiment is in the Mind

"Isn't it just objective probabilities and Frequentism all over again?"

It does have all the advantages of Frequentism, but not its problems. The notion of a probability experiment is not claimed to be about "objective probabilities". It's very explicitly a map approximating a territory to the level of our knowledge. It accounts for our uncertainty in a manner similar to Bayesianism. It just doesn't frame this as uncertainty about "possible worlds", but as uncertainty about the outcome of a particular iteration of the experiment instead.

"Can you give me a specific example how your approach to probability differs from Frequentism, and how it doesn't perform worse than Bayesianism?"

Sure. Consider this problem:

  1. There is a coin that may be biased in any way
  2. It's biased
  3. It's biased 1:2
  4. It's biased 1:2 in favor of Heads

For each of your knowledge states 1-4, what is the probability that each next toss of this coin results in Heads?

"A Bayesian would answer 1/2, 1/2, 1/2 and 2/3."

Naturally. A Frequentist who believes that probability is an inherent property of the coin, meanwhile, has problems with the first three questions. If the coin may be (or indeed is) biased, then it seems that its probability of coming up Heads can't be 1/2, and until they toss this particular coin they can't have a coherent estimate. Only after they know exactly how the coin is biased can they give an answer.

"So how does your notion of probability experiment help here?"

As soon as you understand that a probability experiment approximates your knowledge state about the coin, instead of being about its objective properties, it all adds up to normality - that is, to Bayesianism. The probability experiment doesn't consist of the tosses of this particular coin. Instead, it consists of all the tosses of all the kinds of coins about which our knowledge works the same way as it does for this one.

So for knowledge state 1, it's simply all kinds of coins, fair and biased in all kinds of ways. Our coin toss is one of them, but we have no idea which, so we are indifferent between all the iterations of this experiment. All that is left is to reason about the ratio of Heads. Coins biased in opposite ways cancel each other out, and so we get an equal ratio of Heads and Tails. Therefore, the probability is 1/2.

After we've learned that the coin is biased, we know that our coin toss can't be a toss of a fair coin. So, to capture our new knowledge state, we remove the tosses of fair coins from the experiment. Now we are indifferent between all the tosses of all kinds of unfair coins. This, however, doesn't affect the ratio between Heads and Tails, and so the probability is again 1/2.

For 3, our probability experiment consists only of the tosses of two coins: one biased 1:2 in favor of Tails and the other likewise biased in favor of Heads. Once again, about half of the coin tosses in such an experiment are Heads, which corresponds to a probability of Heads of 1/2.

And in case 4, we need to exclude the coin biased in favor of Tails. So only the tosses of a coin biased 1:2 in favor of Heads are left and therefore P(Heads) = 2/3.
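Here is a sketch of this walkthrough in code (the particular grid of candidate biases is my own illustrative choice; any hypothesis set symmetric between Heads and Tails gives the same first three answers): each knowledge state is the set of coins we remain indifferent between, and P(Heads) is the ratio of Heads among all tosses of all those coins.

```python
from fractions import Fraction

# Each coin is represented by its chance of Heads. The grid of candidate
# biases below is illustrative; any Heads/Tails-symmetric set works.
all_coins    = [Fraction(k, 10) for k in range(11)]           # state 1
biased_coins = [p for p in all_coins if p != Fraction(1, 2)]  # state 2
ratio_1_2    = [Fraction(1, 3), Fraction(2, 3)]               # state 3
heads_1_2    = [Fraction(2, 3)]                               # state 4

def p_heads(coins):
    """Ratio of Heads among all the tosses of all the coins in the experiment."""
    return sum(coins) / len(coins)

for name, coins in [("may be biased", all_coins), ("biased", biased_coins),
                    ("biased 1:2", ratio_1_2), ("1:2 for Heads", heads_1_2)]:
    print(f"{name}: P(Heads) = {p_heads(coins)}")
# -> 1/2, 1/2, 1/2, 2/3
```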

"I see. That's kind of neat, actually. It provides an intuitive-for-a-Frequentist explanation of Bayesianism, and has a decent chance to switch them to the light side.  But I'm already a Bayesian. What is the value of this framework for me? What are these advantages of Frequentism you were talking about?"

It provides a principled way to assign a sample space to a given problem: we can simply perform the experiment multiple times and observe what the outcomes are. Likewise, we can always infer the probabilities of events from their frequencies.

But beyond that, it allows us to get rid of these "possible worlds" which were leading everyone astray. Now instead of speculating about some weird metaphysics that we have no idea about, we explicitly approximate some process in the real world. This provides a unified way to reason about any uncertainty, be it physical, logical or indexical, which, as far as I can tell, solves all the paradoxes.

Logical Uncertainty

"And how does it solve problems with logical uncertainty? Let's go back to your example with not knowing whether 1,253,725,569th digit of pi is even or odd. No matter how many times we check it, the answer is still the same. So it's a deterministic experiment with only one outcome."

By the same logic tossing a coin is also deterministic, because if we toss the same coin exactly the same way in exactly the same conditions, the outcome is always the same. But that's not how we reason about it. Just like we've generalized a coin tossing probability experiment from multiple individual coin tosses, we can generalize a "checking whether some previously unknown digit of pi is even or odd" probability experiment from multiple individual checks of different unknown digits of pi. About half of those digits are even and about half odd, so the initial probability estimate of 1/2 is justified. As soon as we've properly accounted for our uncertainty, a deterministic experiment with only one outcome has turned into a probability experiment with multiple outcomes, and everything adds up to normality.
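We can check the claimed ratio empirically. A minimal sketch (using mpmath as the source of digits is my own choice; any arbitrary-precision table of pi's digits would do):

```python
from mpmath import mp

# The first 10,000 decimal digits of pi; mpmath is just one convenient source.
mp.dps = 10_010                # working precision, in significant digits
digits = str(mp.pi)[2:10_002]  # drop the leading "3."

# The probability experiment: checking whether some digit of pi we know
# nothing about is even. Count the outcome frequencies across iterations:
even = sum(int(d) % 2 == 0 for d in digits)
print(f"ratio of even digits: {even / len(digits):.4f}")  # close to 0.5
```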

"But aren't you supposed to account for all the information you have? Here you clearly know that we are talking specifically about 1,253,725,569th digit of pi. Why do you simply ignore this information and start talking about some unknown digit of pi instead?"

I'm supposed to account for all the relevant information and ignore all the irrelevant. This is how mathematical models work in principle. I may know a lot of facts about apples, but when I'm reasoning about addition, I abstract away from them. Addition of apples works the same way as addition of other objects, so apple-specific facts do not matter and I can simply apply the general principle. Likewise, my ignorance about the 1,253,725,569th digit of pi works exactly the same way as my ignorance about any other digit of pi that I know nothing about, so I generalize in the same way as well.

"But the fact that it's specifically the 1,253,725,569th digit of pi can be relevant. Suppose that digit is written on a card and shown to you. Now you are certain whether it's even or odd. But this certainty doesn't generalize to other digits of pi. If you were shown this card while wondering about, say, the 1,000,000,000,011th digit of pi, you wouldn't know any better."

Naturally, as soon as I know something about the 1,253,725,569th digit of pi, my knowledge state about it can't be represented by checking whether some previously unknown digit of pi is even or odd. But that's not some unique property of this particular digit; the same principle applies to all the other digits as well. So it does, indeed, generalize. For any digit of pi that is unknown to me, written on a card and able to be shown to me, my uncertainty works the same way.

The Territory

"Okay, I'll need to think more about it."

Please do. Now we're almost ready to answer the initial question. We can notice that coin tossing is, in fact, similar to not knowing whether some digit of pi is even or odd. There are two outcomes with an equal ratio among the iterations of the probability experiment. I can use the model from coin tossing, apply it to the evenness of some digit of pi unknown to me, and get a correct result. So we can generalize even further and call both of them, and any other probability experiment with the same properties, a probability experiment with two equiprobable outcomes.

Likewise, we can talk about a probability experiment with $n$ equiprobable outcomes - a generalized notion of any probability experiment whose $n$ outcomes all occur in an equal ratio.

"A lot of probability experiments are not like that, though. Oftentimes outcomes are not equiprobable, like in the example with a coin biased 1:2 in favor of Heads."

True. So to capture them as well, we need to generalize even further and introduce the concept of a weighted sample of $n$ elements: a probability experiment with outcomes $\omega_1, \dots, \omega_n$ and weights $w_1, \dots, w_n$,

where $w_i$ is the weight of the $i$-th outcome - its ratio to all the other outcomes of the probability experiment.
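A sketch of this definition in code (representing the weights as integers is my own illustrative choice):

```python
from fractions import Fraction

def probabilities(outcomes, weights):
    """Weighted sample of n elements: the probability of the i-th outcome
    is the ratio of its weight to the total weight of all outcomes."""
    total = sum(weights)
    return {o: Fraction(w, total) for o, w in zip(outcomes, weights)}

# The coin biased 1:2 in favor of Heads as a weighted sample of 2 elements:
print(probabilities(["Heads", "Tails"], [2, 1]))
# -> {'Heads': Fraction(2, 3), 'Tails': Fraction(1, 3)}
```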

"So the territory that probability is in the map of is..."

All the processes in the real world, such that my knowledge state about them works like a weighted sample of n elements.


17 comments

I'm supposed to account for all the relevant information and ignore all the irrelevant.

Is there a formal way you'd define this? My first attempt is something like "information that, if it were different, would change my answer". E.g. knowing the coin is biased 2:1 vs 3:1 doesn't change your probability, so it's irrelevant; knowing the coin is biased 2:1 for heads vs 2:1 for tails changes your probability, so it's relevant.

Or maybe it should be considered from the perspective of reducing the sample space? Is knowing the coin is biased vs knowing it's biased 2:1 a change in relevant information, even though your probability remains at 1/2, because you removed all other biased coins from the sample space? (Intuitively this feels less correct, but I'm writing it out in the interest of spitballing ideas)

Unrelatedly, would you agree that there's not really a meaningful difference between logical and physical uncertainties? I see both of them as stemming from a lack of knowledge - logical uncertainty is where you could find the answer in principle but haven't done so; physical uncertainty is where you don't know how to find the answer. But in practice there's a continuum of how much you know about the answer-finding process for any given problem, so they blur together in the middle.

Is there a formal way you'd define this? My first attempt is something like "information that, if it were different, would change my answer"

I'd say that the rule is: "To construct a probability experiment, use the minimal generalization that still allows you to model your uncertainty".

In the case of the 1,253,725,569th digit of pi, if I try to construct a probability experiment consisting only of checking this particular digit, I fail to model my uncertainty, as I don't yet know the value of this digit.

So instead I use a more general probability experiment of checking any digit of pi that I don't know. This allows me to account for my uncertainty.

Now I may worry that I overdid it and have abstracted away some relevant information, so I check:

- Does knowing that the digit in question is specifically 1,253,725,569 affect my credence?

- Not until I receive some evidence about the state of specifically the 1,253,725,569th digit of pi.

- So until then, this information is not relevant.

Unrelatedly, would you agree that there's not really a meaningful difference between logical and physical uncertainties?

Yes. I'm making this point here:

We can notice that coin tossing is, in fact, similar to not knowing whether some digit of pi is even or odd. There are two outcomes with an equal ratio among the iterations of the probability experiment. I can use the model from coin tossing, apply it to the evenness of some digit of pi unknown to me, and get a correct result. So we can generalize even further and call both of them, and any other probability experiment with the same properties, a probability experiment with two equiprobable outcomes.

There is no particular need to talk about logical and physical uncertainty as different things. It's just a historical artifact of the confused philosophical approach of possible worlds, and I'm presenting a better way.

logical uncertainty is where you could find the answer in principle but haven't done so; physical uncertainty is where you don't know how to find the answer.

Even this difference is not real. Consider:

A coin is tossed and put into an opaque box, without showing you the result. What is the probability that the result of this particular toss was Heads?

This is physical uncertainty. And yet I do know how to find the answer: all I need is to remove the opaque box and look. Nevertheless, I can talk about my credence before I looked at the coin.

The exact same situation goes for not knowing a particular digit of pi. Yes, I do know a way to find the answer: google an algorithm for calculating any digit of pi and feed it my digit as an input. Nevertheless, I can still talk about my credence before I've performed all these actions.

In the case of the 1,253,725,569th digit of pi, if I try to construct a probability experiment consisting only of checking this particular digit, I fail to model my uncertainty, as I don't yet know the value of this digit.

Ok, let me see if I'm understanding this correctly: if the experiment is checking the X-th digit specifically, you know that it must be a specific digit, but you don't know which, so you can't make a coherent model. So you generalize up to checking an arbitrary digit, where you know that the results are distributed evenly among {0...9}, so you can use this as your model.

The first part about not having a coherent model sounds a lot like the frequentist idea that you can't generate a coherent probability for a coin of unknown bias - you know that it's not 1/2 but you can't decide on any specific value. 

Now I may worry that I overdid it and have abstracted away some relevant information, so I check:

- Does knowing that the digit in question is specifically 1,253,725,569 affect my credence?

This seems equivalent to my definition of "information that would change your answer if it was different", so it looks like we converged on similar ideas?

This is physical uncertainty.

I'd argue that it's physical uncertainty before the coin is flipped, but logical uncertainty after. After the flip, the coin's state is unknown the same way the X-th digit of pi is unknown - the answer exists and all you need to do is look for it.

Ok, let me see if I'm understanding this correctly: if the experiment is checking the X-th digit specifically, you know that it must be a specific digit, but you don't know which, so you can't make a coherent model. So you generalize up to checking an arbitrary digit, where you know that the results are distributed evenly among {0...9}, so you can use this as your model.

Basically yes. Strictly speaking, it's not just any arbitrary digit, but any digit about whose value your knowledge works the same way as it does about the value of X.

For any digit, you can execute this algorithm:

- Check whether you know more (or less) about it than you know about X.

- Yes: go to the next digit.

- No: add it to the probability experiment.

As a result, you get a bunch of digits about whose values you know as much as you know about X, and so you can use them to estimate your credence for X, as in the sketch below.
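A sketch of that algorithm in code (representing "what you know" as a map from digit positions to already-revealed values is my own simplification):

```python
X = 1_253_725_569   # the position whose digit we are uncertain about

def build_experiment(candidate_positions, known_digits):
    """Collect the positions whose digits we know exactly as much about
    as we know about the digit at position X (here: nothing at all)."""
    trials = []
    for i in candidate_positions:
        if i in known_digits:   # we know more about this digit than about X
            continue            # -> go to the next digit
        trials.append(i)        # -> add it to the probability experiment
    return trials

# Suppose the first two decimal digits of pi (1 and 4) have been revealed:
experiment = build_experiment(range(1, 101), {1: 1, 2: 4})
# Our credence that the digit at X is even is then the ratio of even
# digits across the iterations of this experiment.
```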

The first part about not having a coherent model sounds a lot like the frequentist idea that you can't generate a coherent probability for a coin of unknown bias - you know that it's not 1/2 but you can't decide on any specific value. 

Yes. As I say in the post:

By the same logic tossing a coin is also deterministic, because if we toss the same coin exactly the same way in exactly the same conditions, the outcome is always the same. But that's not how we reason about it. Just like we've generalized a coin tossing probability experiment from multiple individual coin tosses, we can generalize a "checking whether some previously unknown digit of pi is even or odd" probability experiment from multiple individual checks of different unknown digits of pi.

The way a lot of Bayesians mock Frequentists for being unable to conceptualize the probability of a coin of unknown fairness, and then make the exact same mistake by being unable to conceptualize the probability of a specific digit of pi whose value is unknown, has always struck me as quite ironic.

This seems equivalent to my definition of "information that would change your answer if it was different", so it looks like we converged on similar ideas?

I think we did!

I'd argue that it's physical uncertainty before the coin is flipped, but logical uncertainty after. After the flip, the coin's state is unknown the same way the X-th digit of pi is unknown - the answer exists and all you need to do is look for it.

That's not how people usually use these terms. The uncertainty about the state of the coin after the toss is describable within the framework of possible worlds, just like uncertainty about a future coin toss; but uncertainty about a digit of pi isn't.

Moreover, isn't it the same before the flip? It's not that the coin toss is "objectively random". At the very least, the answer also exists in the future, and all you need is to wait a bit for it to be revealed.

The core principle is the same: there is, in fact, some value that the probability experiment function takes in this iteration, but you don't know which. You can take some actions - look under the box, do some computation, just wait for a couple of seconds - to learn the answer. But you can also reason from your current state of uncertainty, before these actions are taken.

That's not how people usually use these terms. The uncertainty about the state of the coin after the toss is describable within the framework of possible worlds, just like uncertainty about a future coin toss; but uncertainty about a digit of pi isn't.

Oops, that's my bad for not double-checking the definitions before I wrote that comment. I think the distinction I was getting at was more like known unknowns vs unknown unknowns, which isn't relevant in platonic-ideal probability experiments like the ones we're discussing here, but is useful in real-world situations where you can look for more information to improve your model.

Now that I'm cleared up on the definitions, I do agree that there doesn't really seem to be a difference between physical and logical uncertainty.

It was pointed out a long time ago that (a form of) probability being in the mind doesn't imply that (a form of) it isn't in the territory as well.

Armchair arguments can't prove anything about the territory...you have to look.

The people whose job it is to investigate this sort of thing, physicists, have been unable to decide the issue.

By the same logic tossing a coin is also deterministic, because if we toss the same coin exactly the same way in exactly the same conditions, the outcome is always the same.

That's not true unless fundamental determinism is true, or effective determinism at the macroscopic level is true.

But beyond that, it allows us to get rid of these “possible worlds” which were leading everyone astray. Now instead of speculating about some weird metaphysics that we have no idea about, we explicitly approximate some process in the real world

You may be beating a dead horse there. Talk of possible worlds doesn't have to imply realism about possible worlds, just as mathematical anti-realists can talk about numbers without committing to their mind independent existence.

"In philosophy, possible worlds are usually regarded as real but abstract possibilities (i.e., Platonism),[4] or sometimes as a mere metaphor, abbreviation, or as mathematical devices, or a mere combination of propositions" -- WP.

Tl;Dr: Talk of probabilities and possible worlds doesn't have to be talk about the territory. But it can be.

It was pointed out a long time ago that (a form of) probability being in the mind doesn't imply that (a form of) it isn't in the territory as well.

 

That's not true unless fundamental determinism is true, or effective determinism at the macroscopic level is true.

This is beside the point that I'm making, which is: even if we grant that the universe is utterly deterministic and therefore probability is fully in the map, this map still has to correspond to the territory, for which you have to go and look. And we still have to be able to construct a meaningful framework for it.

Armchair arguments can't prove anything about the territory...you have to look.

Exactly.

You may be beating a dead horse there. Talk of possible worlds doesn't have to imply realism about possible worlds, just as mathematical anti-realists can talk about numbers without committing to their mind independent existence.

I'm not saying that it does. For instance, here I specifically outline the alternative option:

"Well, you don't necessary have to believe that there are parallel worlds as real as ours in which the coin comes differently, though it's a respectable position about the nature of counterfactuals. Probability is in the mind, remember? You can simply imagine alternative worlds that are logically consistent with your observations."

What I'm saying is that even the talk itself about "possible worlds" - without the assumption of their realism - is harmful, as this framework leaves us unable to reason about logical uncertainty and doesn't provide proper guardrails against absurdities, most noticeably with indexical uncertainty.

A better way is to talk about iterations of a probability experiment, which solves all these issues.

even if we grant that the universe is utterly deterministic and therefore probability is fully in the map, this map *still* has to correspond to the territory, for which you have to go and look

The map that corresponds to a deterministically branching multiverse has possible worlds. The map that corresponds to a Copenhagen universe has inherent indeterminism.

What I'm saying is that even the talk itself about "possible worlds" - without the assumption of their realism - is harmful, as this framework leaves us unable to reason about logical uncertainty

Refusing to ever talk about possible worlds is dangerous, because they might exist (they do in MWI) and they might be useful otherwise. What you really have is an argument that they are a poor match for logical uncertainty, which they are, but you are allowed to use different tools for different jobs.

Having dogmatic, non-updatable assumptions is bad (see rationality, passim), and it's still bad when they are in the direction of determinism, reductionism, etc.

You keep missing the point. It's as if you haven't even read the post and simply noticed a couple of key words.

The map that corresponds to a deterministically branching multiverse has possible worlds.

Some do.

I'm proposing a better map, capable of talking about knowledge states and uncertainty in any circumstances, having all the advantages of maps that use the concept of possible worlds, without their weak points.

If you think that the framework of probability experiment that I'm outlining in the post fails to account for something that the frameworks of possible worlds manage to account for - please specify it. Bring up a setting in which you think my framework fails to describe the territory.

Refusing to ever talk about possible worlds is dangerous, because they might exist (they do in MWI) and they might be useful otherwise.

Possible world is a term from a map. There may be a referent for it in a territory, true. But it doesn't mean that we have to use this particular term to talk about this referent. We may have a better term, instead. 

You keep missing the point

The point is that a map has to represent the territory.

"And sure, every map is, in a sense, a map of the world", as you out it.,

So if the territory is branching, the map should, too. (A map may include aspects of human knowledge as well.)

I'm proposing a better map, capable of talking about knowledge states and uncertainty in any circumstances

That's a disadvantage, because the same map can't represent every territory.

There may be an ontologically neutral way of doing probability calculations, but it's not a map, for that reason... more of a tool.

If you think that the framework of probability experiment that I’m outlining in the post fails to account for something that the frameworks of possible worlds manage to account for

The problem is the implied ontology. You haven't actually proven that probability is only in the mind, and you can't prove it using this methodology, because it's a statement about the territory, not just about probability calculations.

Possible world is a term from a map. There may be a referent for it in a territory

If there is a referent for it in the territory, it is entirely reasonable to say "possible worlds exist".

But it doesn’t mean that we have to use this particular term to talk about this referent. We may have a better term, instead.

Is it really a win to admit the substance of existing possible worlds, but under a different name?

So if the territory is branching, the map should, too.

Of course not. The territory can be made of rocks and dirt, but it doesn't mean that the map also has to be.

That's a disadvantage, because the same map can't represent every territory.

I'm not saying that it represents every territory. I'm saying that it represents a more general class of territories without losing any advantages of the framework of possible worlds.

In the post, I've even specifically outlined which territories my framework can represent:

"So the territory that probability is in the map of is..."

All the processes in the real world, such that my knowledge state about them works like a weighted sample of n elements.

 

(There may be an ontologically neutral way of doing probability calculations, but it's not a map, for that reason... more of a tool.)

Now it seems that you are finally starting to get it. It is a map, of course, but that's really beside the point. If you want to put it in a separate category of "tools" (as if a map is not a tool?), then whatever suits your needs. Yes, what I'm doing is providing an ontologically neutral framework for probability theory that works better than the framework of possible worlds.

The problem is the implied ontology.

Once again, no ontology is actually implied. It's absolutely trivial to describe the behavior of indeterministic processes in terms of a probability experiment. I'm concentrating on deterministic cases simply because they are trickier.

If there is a referent for it in the territory, it is entirely reasonable to say "possible worlds exist".

I'm not saying that it's unreasonable to say, conditional on using this term. I'm saying that using the term to begin with is a bad idea, as it keeps leading people doing probability theory astray.

Is it really a win to admit the substance of existing possible worlds, but under a different name?

When your goal is to separate the substance from the harmful rubbish, it absolutely is a win.

Once again, no ontology is actually implied. It's absolutely trivial to describe the behavior of indeterministic processes in terms of a probability experiment. I'm concentrating on deterministic cases simply because they are trickier

If that's what you actually think, the first line should read something like "under circumstances where probability is in the mind".

I think the problem here is that you do not quite understand the problem.

It's not that we "imagine that we've imagined the whole world, do not notice any contradictions and call it a day". It's that we know there exists an idealized procedure which doesn't produce stupid answers - for example, it can't be money-pumped. We also know that if we try to approximate this procedure harder (consider more hypotheses, compute more inferences), we are going to get better results in expectation. This is not, say, a property of null hypothesis testing - the more hypotheses you consider, the more likely you are to either p-hack or drive the p-value into statistical insignificance due to excessive multiple testing correction.

The whole computationally unbounded Bayesian business is more about "here is an idealized procedure X, and if we don't do anything visibly-for-us stupid from the perspective of X, then we can hope that our losses won't be unbounded, for a certain notion of boundedness". It is not obvious that your procedure can be understood this way.

I think the problem here is that you do not quite understand the problem.

There is definitely some kind of misunderstanding going on, and I'd like to figure it out.

It's not that we "imagine that we've imagined the whole world, do not notice any contradictions and call it a day". 

How is it not the case? Citing you from here:

When you are conditioning on an empirical fact, you are imagining a set of logically consistent worlds where this empirical fact is true and asking yourself about the frequency of other empirical facts inside this set.

How do you know which worlds are logically consistent with your observations and which are not? For that, you need to hold them in your mind one by one, with all their details, and check for inconsistencies. That would require you to be a logically omniscient supercomputer with unlimited memory. And none of us is that.

So you have to be doing something else: validating the consistency only to the best of your cognitive resources - in other words, "imagine that we've imagined the whole world, do not notice any contradictions and call it a day".

It's that we know there exists an idealized procedure which doesn't produce stupid answers - for example, it can't be money-pumped.

Well, yes. That's the goal. What I'm doing is trying to pinpoint this procedure without the framework of possible worlds, which, among other things, doesn't allow reasoning about logical uncertainty. I replace it with a better framework - iterations of a probability experiment - that does allow it.

The whole computationally unbounded Bayesian business is more about "here is an idealized procedure X, and if we don't do anything visibly-for-us stupid from the perspective of X, then we can hope that our losses won't be unbounded, for a certain notion of boundedness". It is not obvious that your procedure can be understood this way.

The Bayesian procedure is the same; we've just gotten rid of all the bizarre metaphysics and are now explicitly talking about the values of a function approximating something in the real world. What is not obvious to you here? Do you expect that there is some case in which my framework fails where the framework of possible worlds doesn't? If so, I'd like to see this example. But I'm also curious where such a belief would even come from, considering that, once again, we simply talk about iterations of a probability experiment instead of possible worlds.

Humans are not, in fact, able to hold the whole world in their minds and validate its logical consistency

I don't see problems here. When I go to the supermarket and think about whether there is milk there or not, I imagine an empty shelf, then a shelf with milk, and then I start to think about relevant things. For example, is there a trade war, are there sales, etc. You should imagine a part of the world, not the whole world, including orbits of stars in another galaxy.

As a side effect, you may not remember a related fact that you already know - but empiricism isn't perfect either. Maybe there was milk in the supermarket all my life, but there were also no trade wars all my life, and the paper for milk packaging is produced in China.

I don't see problems here.

The problem is that, according to the framework of "possible worlds", we technically need to be able to do a thing that we can't do. The solution to this problem is to use a better framework - the one of probability experiments.

You should imagine a part of the world, not the whole world, including orbits of stars in another galaxy.

My point exactly. In actuality, we simply approximate a particular process of our universe to the best of our knowledge, instead of imagining all the universal permutations and then filtering them for the ones logically consistent with our knowledge.

[anonymous]

When I go to the supermarket and think about whether there is milk there or not, I imagine an empty shelf

Yes, you indeed imagine it. And people also imagine a world that macroscopically looks just like ours on a human scale, but instead follows the laws of classical mechanics (in fact, for centuries, this was the mainstream conception of reality among top physicists).

The problem is that such a world cannot exist. The classical picture of a ball-like electron orbiting a proton inside a hydrogen atom cannot happen; classically, an orbiting electrically charged particle radiates away energy (and, to ensure conservation of energy, must lose potential energy as a result). As such, the electron would collapse into the center of the proton a billion times faster than the blink of an eye.[1]

At a deep, fundamental level, the fact that we imagine such a world is a fact about us, not about the world. It's not that there are some weird phenomena around the edges that appear when you move really fast or you fly millions of light-years away from Earth or you get down to quarks, so that you're forced to use QM or relativity because they are "appropriate," but as long as you only think about sufficiently "mundane" matters, you are ok. No! The "weirdness" is baked into everything that exists. Even a bar magnet, the likes of which you might stick onto a fridge, cannot be explained by classical mechanics; ferromagnetism is a fundamentally quantum phenomenon.[2]

Logic runs on (appropriately named) logical deductions. But as soon as you have any contradiction whatsoever, the principle of explosion makes everything... well, explode. Every statement inside the system becomes both true and false, i.e., it becomes useless. If you can prove "observation-X implies observation-Y a time-step later" but also hold both "observation-X" and "not observation-Y" (i.e., you reason about a logical counterfactual), then you are fully within the ambit of the principle of explosion.

So that's the problem with the milk-free supermarket. And it's a genuine problem, since naive resolutions break the laws of math and physics typically used to parse through such matters. Nevertheless, I suppose there are solutions, at least if you believe in the MWI and the notion of different quantum "branches," for example.

1. You can run the calculations yourself, if you want to; a single-semester course of E&M would suffice to give you all the background you need.

2. More precisely, classical-inspired Stat Mech models (like the Ising model) might serve as fine approximations of reality, but the assumptions lurking in the background (i.e., what causes the dipole moments to be systematically oriented a certain way?) necessitate QM as a justification.