Is Determinism A Special Case Of Randomness?
I was trying to reconcile the existence of life with free will in a deterministic universe, but I am coming full circle and starting to think that everything is really random; otherwise I don't see how there could be free will in a deterministic universe.
If mathematicians measure randomness with probability, then there must be some events that occur with 100% probability (in the current universe, above atomic scales, I presume). I now see these as special cases of randomness rather than as opposites of randomness, and it is these that lead us to think there is determinism.
I think we may have this cognitive bias (a deterministic view of reality) because it is extremely helpful to use these 100%-probability occurrences to model the universe rationally, to learn, and to predict the future, but it is not the whole story, or at least not a complete description of reality.
What do you think?
EDIT 1: Thank you all for the comments below. I recognize I am naive in this topic.
Although I am not convinced yet, I think my possible argumentative error is:
P1: I observe free will in the behavior of living things.
P2: Deterministic physical mechanical processes don't permit free will.
C: Therefore physics must include random processes.
I think I only see a solution for free will in randomness, but maybe there are other solutions (I will read the Free Will Sequence here on LW!).
EDIT 2: After reading some articles of the Free Will Sequence, I see the problem with investing energy in free-will questions if free will is just a mistake in our thinking process.
It is something like asking about time travel when time doesn't exist; or exploring the mechanics of randomness vs. determinism when randomness doesn't exist, so that the dichotomy "randomness vs. determinism" doesn't arise in the first place.
The difference between Determinism & Pre-determination
1. Scope
There are two arm-waving views often expressed about the relationship between “determinism/causality” on the one hand and “predetermination/predictability in principle” on the other. The first treats them as essentially interchangeable: what is causally determined from instant to instant is thereby predetermined over any period - the Laplacian view. The second view is that this is a confusion, and they are two quite distinct concepts. What I have never seen thoroughly explored (and therefore propose to make a start on here) is the range of different cases which give rise to different relationships between determinism and predetermination. I will attempt to illustrate that, indeed, determinism is neither a necessary nor a sufficient condition for predetermination in the most general case.
To make the main argument clear, I will relegate various pedantic qualifications, clarifications and comments to [footnotes].
Most of the argument relates to cases of a physically classical, pre-quantum world (which is not as straightforward as often assumed, and certainly not without relevance to the world we experience). The difference that quantum uncertainty makes will be considered briefly at the end.
2. Instantaneous determinism
To start with it is useful to define what exactly we mean by an (instantaneously) determinist system. In simple terms this means that how the system changes at any instant is fully determined by the state of the system at that instant [1]. This is how physical laws work in a Newtonian universe. The arm-waving argument says that if this is the case, we can derive the state of the system at any future instant by advancing through an infinite number of infinitesimal steps. Since each step is fully determined, the outcome must be as well. However, as it stands this is a mathematical over-simplification. It is well known that an infinite number of infinitesimals is indeterminate as such, and so we have to look at this process more carefully - and this is where there turn out to be significant differences between different cases.
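Instantaneous determinism can be sketched numerically. The toy law below (exponential decay, with an invented rate and step size) is purely illustrative, but it shows the essential point: since each infinitesimal step is fixed by the current state, repeating the same run reproduces exactly the same trajectory.

```python
# Toy instantaneously-deterministic system: dx/dt = f(x).
# The state at each instant fully determines the change at that instant.

def f(x):
    return -0.5 * x  # example law: decay toward zero (rate is arbitrary)

def evolve(x0, dt=0.001, steps=10_000):
    """Advance the state through many small (Euler) steps."""
    x = x0
    for _ in range(steps):
        x = x + f(x) * dt
    return x

run1 = evolve(1.0)
run2 = evolve(1.0)
assert run1 == run2  # determinism: identical inputs give identical outcomes
```

Of course, as the section argues, this guarantee of repeatability is not yet a guarantee of *predictability* from imperfectly measured initial conditions.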
3. Convergent and divergent behaviour
To illustrate the first difference that needs to be recognized, consider two simple cases - a snooker ball just about to collide with another snooker ball, and a snooker ball heading towards a pocket. In the first case, a small change in the starting position of the ball (assuming the direction of travel is unchanged) results in a steadily increasing change in the positions at successive instants after impact - that is, neighbouring trajectories diverge. In the second case, a small change in the starting position has no effect on the final position hanging in the pocket: neighbouring trajectories converge. So we can call these "divergent" and "convergent" cases respectively. [1.1]
Now consider what happens if we try to predict the state of some system (e.g. the position of the ball) after a finite time interval. Any attempt to find the starting position will involve a small error. The effect on the accuracy of prediction differs markedly in the two cases. In the convergent case, small initial errors will fade away with time. In the divergent case, by contrast, the error will grow and grow. Of course, if better instruments were available we could reduce the initial error and improve the prediction - but that would also increase the accuracy with which we could check the final error! So the notable fact about this case is that no matter how accurately we know the initial state, we can never predict the final state to the same level of accuracy - despite the perfect instantaneous determinism assumed, the last significant figure that we can measure remains as unpredictable as ever. [2]
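The contrast between the two cases can be caricatured with a one-line error model (the growth and decay factors are invented for illustration): in the convergent case the separation between neighbouring trajectories shrinks at each step, while in the divergent case it is multiplied up until it swamps any finite initial accuracy.

```python
# How an initial measurement error evolves in each case.
initial_error = 1e-9  # illustrative: a very accurate initial measurement

def propagate(error, factor, steps):
    """Multiply the error by a fixed factor at each step."""
    for _ in range(steps):
        error *= factor
    return error

convergent = propagate(initial_error, 0.5, 60)  # error halves each step
divergent = propagate(initial_error, 2.0, 60)   # error doubles each step

# After 60 steps the convergent error is utterly negligible...
assert convergent < 1e-26
# ...while the divergent error has grown by a factor of 2**60 (~1e18),
# far beyond the initial accuracy of the measurement.
assert divergent > 1e8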
One possible objection that might be raised to this conclusion is that with "perfect knowledge" of the initial state, we can predict any subsequent state perfectly. This is philosophically contentious - rather analogous to arguments about what happens when an irresistible force meets an immovable object. For example, philosophers who believe in "operational definitions" may doubt whether there is any operation that could be performed to obtain "the exact initial conditions". I prefer to follow the mathematical convention that says that exact, perfect, or infinite entities are properly understood as the limiting cases of more mundane entities. On this convention, if the last significant figure of the most accurate measure we can make of an outcome remains unpredictable for any finite degree of accuracy, then we must say that the same is true for "infinite accuracy".
The conclusion that there is always something unknown about the predicted outcome places a “qualitative upper limit”, so to speak, on the strength of predictability in this case, but we must also recognize a “qualitative lower limit” that is just as important, since in the snooker impact example whatever the accuracy of prediction that is desired after whatever time period, we can always calculate an accuracy of initial measurement that would enable it. (However, as we shall shortly see [3], this does not apply in every case.) The combination of predictability in principle to any degree, with necessary unpredictability to the precision of the best available measurement, might be termed “truncated predictability”.
4. More general cases
The two elementary cases considered so far illustrate the importance of distinguishing convergent from divergent behaviour, and so provide a useful paradigm to be kept in mind, but of course, most real cases are more complicated than this.
To take some examples, a system can have both divergent parts and convergent parts at any instant - such as different balls on the same snooker table; an element whose trajectory is behaving divergently at one instant may behave convergently at another instant; convergent movement along one axis may be accompanied by divergent movement relative to another; and, significantly, divergent behaviour at one scale may be accompanied by convergent behaviour at a different scale. Zoom out from that snooker table, round positions to the nearest metre or so, and the trajectories of all the balls follow that of the adjacent surface of the earth.
There is also the possibility that a system can be potentially divergent at all times and places. A famous case of such behaviour is the chaotic behaviour of the atmosphere, first clearly understood by Edward Lorenz in 1961. This story comes in two parts, the second apparently much less well known than the first.
5. Chaotic case: discrete
The equations normally used to describe the physical behaviour of the atmosphere formally describe a continuum, an infinitely divisible fluid. As there is no algebraic “solution” to these equations, approximate solutions have to be found numerically, which in turn require the equations to be “discretised”, that is adapted to describe the behaviour at, or averaged around, a suitably large number of discrete points.
The well-known part of Lorenz’s work [4] arose from an accidental observation, that a very small change in the rounding of the values at the start of a numerical simulation led in due course to an entirely different “forecast”. Thus this is a case of divergent trajectories from any starting point, or “sensitivity to initial conditions” as it has come to be known.
The part of “chaos theory” that grew out of this initial insight describes the divergent trajectories from any starting point: they diverge exponentially, with a time constant known as the Kolmogorov constant for the particular problem case [5]. Thus we can still say, as we said for the snooker ball, that whatever the accuracy of prediction that is desired after whatever time period, we can always calculate an accuracy of initial measurement that would enable it.
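Lorenz's observation is easy to reproduce with his 1963 three-variable convection model (standard parameters σ = 10, ρ = 28, β = 8/3) and plain Euler stepping - a rough numerical sketch, not a faithful weather model: two runs whose starting points differ in the ninth decimal place end up on entirely different parts of the attractor.

```python
def lorenz_step(x, y, z, dt=0.001, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One Euler step of the Lorenz 1963 system."""
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return x + dx * dt, y + dy * dt, z + dz * dt

def run(x, y, z, steps=40_000):  # about 40 model time units
    for _ in range(steps):
        x, y, z = lorenz_step(x, y, z)
    return x, y, z

a = run(1.0, 1.0, 1.0)
b = run(1.0 + 1e-9, 1.0, 1.0)  # same start, perturbed in the 9th decimal
separation = max(abs(p - q) for p, q in zip(a, b))

# The initial difference of 1e-9 has been amplified by many orders of
# magnitude - "sensitivity to initial conditions".
assert separation > 1e-6
```

Halving the perturbation only buys a fixed extra interval of agreement, which is the "truncated predictability" of the discretised case.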
6. Chaotic case: continuum
Other researchers might have dismissed the initial discovery of sensitivity to initial conditions as an artefact of the computation, but Lorenz realised that even if the computation had been perfect, exactly the same consequences would flow from disturbances in the fluid in the gaps between the discrete points of the numerical model. This is often called the “Butterfly Effect” because of a conference editor's colourful summary that “the beating of a butterfly’s wings in Brazil could cause a tornado in Texas”.
It is important to note that the Butterfly Effect is not strictly the same as “Sensitivity to Initial Conditions” as is often reported [6], although they are closely related. Sensitivity to Initial Conditions is an attribute of some discretised numerical models. The Butterfly Effect describes an attribute of the equations describing a continuous fluid, so is better described as “sensitivity to disturbances of minimal extent”, or in practice, sensitivity to what falls between the discrete points modelled.
Since, as noted above, there is no algebraic solution to the continuous equations, the only way to establish the divergent characteristics of the equations themselves is to repeatedly reduce the scale of discretisation (the typical distance between the points on the grid of measurements) and observe the trend. In fact, this was done for a very practical reason: to find out how much benefit would be obtained, in terms of the durability of the forecast [7], by providing more weather stations. The result was highly significant: each doubling of the number of stations increased the durability of the forecast by a smaller amount, so that (by extrapolation) as the number of imaginary weather stations was increased without limit, the forecast durability of the model converged to a finite value[8]. Thus, beyond this time limit, the equations that we use to describe the atmosphere give indeterminate results, however much detail we have about the initial conditions. [9]
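The extrapolation described can be sketched as a geometric series. The numbers below are invented for illustration (they are not Lorenz's figures): if each doubling of the number of stations adds only half the previous gain in forecast durability, the total converges to a finite horizon that no amount of further refinement can push back.

```python
# Hypothetical durability gains: the first doubling of stations buys
# 12 hours of forecast, and each further doubling buys half as much.
def durability(doublings, first_gain=12.0):
    total, gain = 0.0, first_gain
    for _ in range(doublings):
        total += gain
        gain /= 2.0
    return total

# The gains 12 + 6 + 3 + ... approach a finite limit of 24 hours,
# however many (imaginary) weather stations we add.
assert durability(5) < durability(50) < 24.0
assert 24.0 - durability(50) < 1e-9
```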
Readers will doubtless have noticed that this result does not strictly apply to the earth’s atmosphere, because that is not the infinitely divisible fluid that the equations assumed (and a butterfly is likewise finitely divisible). Nevertheless, the fact that there are perfectly well-formed, familiar equations which by their nature have unpredictable outcomes after a finite time interval vividly exposes the difference between determinism and predetermination.
With hindsight, the diminishing returns in forecast durability from refining the scale of discretisation is not too surprising: it is much quicker for a disturbance on a 1 km scale to have effects on a 2 km scale than for a disturbance on a 100 km scale to have effects on a 200 km scale.
7. Consequences of quantum uncertainty
It is often claimed that the Uncertainty Principle of quantum mechanics [10] makes the future unpredictable [11], but in the terms of the above analysis this is far from the whole story.
The effect of quantum mechanics is that at the scale of fundamental particles [12] the laws of physical causality are probabilistic. As a consequence, there is certainly no basis, for example, to predict whether an unstable nucleus will disintegrate before or after the expiry of its half-life.
However, in the case of a convergent process at ordinary scales, the unpredictability at quantum scale is immaterial, and at the scale of interest predictability continues to hold sway. The snooker ball finishes up at the bottom of the pocket whatever the energy levels of its constituent electrons. [13]
It is in the case of divergent processes that quantum effects can make for unpredictability at large scales. In the case of the atmosphere, for example, the source of that tornado in Texas could be a cosmic ray in Colombia, and cosmic radiation is strictly non-deterministic. The atmosphere may not be the infinitely divisible fluid considered by Lorenz, but a molecular fluid subject to random quantum processes has just the same lack of predictability.
[EDIT] How does this look in terms of the LW-preferred Many Worlds interpretation of quantum mechanics?[14] In this framework, exact "objective prediction" is possible in principle, but the prediction is of an ever-growing array of equally real states. We can speak of the "probability" of a particular outcome in the sense of the probability of that outcome being present in a state chosen at random from the set. In a convergent process the cases become so similar that there appears to be only one outcome at the macro scale (despite continued differences at the micro scale); whereas in a divergent process the "density of probability" (in the above sense) becomes so vanishingly small for some states that at a macro scale the outcomes appear to split into separate branches. (They have become decoherent.) Any one such branch appears, to an observer within that branch, to be the only outcome, so such an observer could not have known what to "expect" - only the probability distribution of what to expect. This can be described as a condition of subjective unpredictability, in the sense that no subjective expectation formed before the divergent process can be relied upon to coincide with an observation made after it. [END of EDIT]
8. Conclusions
What has emerged from this review of different cases, it seems to me, is that it is the convergent/divergent dichotomy that has the greatest effect on the predictability of a system’s behaviour, not the deterministic/quantised dichotomy at subatomic scales.
More particularly, in short-hand:-
Convergent + deterministic => full predictability
Convergent + quantised => predictability at all super-atomic scales
Divergent + deterministic + discrete => “truncated predictability”
Divergent + deterministic + continuous => unpredictability
[EDIT] Divergent + quantised => objective predictability of the multiverse but subjective unpredictability
Footnotes
1. The “state” may already include time derivatives of course, and in the case of a continuum, the state includes spatial gradients of all relevant properties.
1.1 For simplicity I have ignored the case between the two where neighbouring trajectories are parallel. It should be obvious how the argument applies to this case. Convergence/divergence is clearly related to (in)stability, and less directly to other properties such as (non)-linearity and (a)periodicity, but as convergence defines the characteristic that matters in the present context it seems better to focus on that.
2. In referring to a “significant figure” I am of course assuming that decimal notation is used, and that the initial error has diverged by at least a factor of 10.
3. In section 6.
4. For example, see Gleick, “Chaos”, "The Butterfly Effect" chapter.
5. My source for this statement is a contribution by Eric Kvaalen to the New Scientist comment pages.
6. E.g. by Gleick or Wikipedia.
7. By durability I mean the period over which the required degree of accuracy is maintained.
8. This account is based on my recollection, and notes made at the time, of an article in New Scientist, volume 42, p290. If anybody has access to this or knows of an equivalent source available on-line, I would be interested to hear!
9. I am referring to predictions of the conditions at particular locations and times. It is, of course, possible to predict average conditions over an area on a probabilistic basis, whether based on seasonal data, or the position of the jetstream etc. These are further examples of how divergence at one scale can be accompanied by something nearer to convergence on another scale.
10. I am using “quantum mechanics” as a generic term to include its later derivatives such as quantum chromodynamics. As far as I understand it these later developments do not affect the points made here. However, this is certainly well outside my professional expertise in aspects of Newtonian mechanics, so I will gladly stand corrected by more specialist contributors!
11. E.g. by Karl Popper in an appendix to The Poverty of Historicism.
12. To be pedantic, I’m aware that this also applies to greater scales, but to a vanishingly small extent.
13. In such cases we could perhaps say that predictability is effectively an “emergent property” that is not present in the reductionist laws of the ultimate ingredients but only appears in the solution space of large scale aggregates.
14. Thanks to the contributors of the comments below as at 30 July 2013 which I have tried to take into account. The online preview of "The Emergent Multiverse: Quantum Theory According to the Everett Interpretation" by David Wallace has also been helpful to understanding the implications of Many Worlds.
Higher than the most high
In an earlier post, I talked about how we could deal with variants of the Heaven and Hell problem - situations where you have an infinite number of options, and none of them is a maximum. The solution for a (deterministic) agent was to try and implement the strategy that would reach the highest possible number, without risking falling into an infinite loop.
Wei Dai pointed out that in the cases where the options are unbounded in utility (ie you can get arbitrarily high utility), then there are probabilistic strategies that give you infinite expected utility. I suggested you could still do better than this. This started a conversation about choosing between strategies with infinite expectation (would you prefer a strategy with infinite expectation, or the same plus an extra dollar?), which went off into some interesting directions as to what needed to be done when the strategies can't sensibly be compared with each other...
Interesting though that may be, it's also helpful to have simple cases where you don't need all these subtleties. So here is one:
Omega approaches you and Mrs X, asking you each to name an integer to him, privately. The person who names the highest integer gets 1 utility; the other gets nothing. In practical terms, Omega will reimburse you all utility lost during the decision process (so you can take as long as you want to decide). The first person to name a number gets 1 utility immediately; they may then lose that 1 depending on the eventual response of the other. Hence if one person responds and the other doesn't, they get the 1 utility and keep it. What should you do?
In this case, a strategy that gives you a number with infinite expectation isn't enough - you have to beat Mrs X, but you also have to eventually say something. Hence there is a duel of (likely probabilistic) strategies, implemented by bounded agents, with no maximum strategy, and each agent trying to compute the maximal strategy they can construct without falling into a loop.
Naturalism versus unbounded (or unmaximisable) utility options
There are many paradoxes with unbounded utility functions. For instance, consider whether it's rational to spend eternity in Hell:
Suppose that you die, and God offers you a deal. You can spend 1 day in Hell, and he will give you 2 days in Heaven, and then you will spend the rest of eternity in Purgatory (which is positioned exactly midway in utility between heaven and hell). You decide that it's a good deal, and accept. At the end of your first day in Hell, God offers you the same deal: 1 extra day in Hell, and you will get 2 more days in Heaven. Again you accept. The same deal is offered at the end of the second day.
And the result is... that you spend eternity in Hell. There is never a rational moment to leave for Heaven - that decision is always dominated by the decision to stay in Hell.
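The dominance structure can be made concrete with toy numbers (invented here, but consistent with Purgatory being midway between the extremes): a day in Hell is worth −1, a day in Heaven +1, Purgatory 0. Stopping after n days in Hell banks −n + 2n = n, so stopping one day later is always strictly better, and there is no optimal stopping day - yet the "always accept" policy leaves you in Hell forever.

```python
# Total utility if you take the deal n times and then stop:
# n days in Hell at -1 each, then 2n days in Heaven at +1 each,
# then Purgatory (0 per day) for the rest of eternity.
def utility_if_stop_after(n):
    return -1 * n + 1 * (2 * n)  # = n

# Each extra day in Hell strictly improves the eventual total...
assert all(utility_if_stop_after(n + 1) > utility_if_stop_after(n)
           for n in range(1000))
# ...so no finite stopping point is optimal, even though the limit of
# the dominating policy (never stop) is an eternity at -1 per day.
```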
Or consider a simpler paradox:
You're immortal. Tell Omega any natural number, and he will give you that much utility. On top of that, he will give you any utility you may have lost in the decision process (such as the time wasted choosing and specifying your number). Then he departs. What number will you choose?
Again, there's no good answer to this problem - any number you name, you could have got more by naming a higher one. And since Omega compensates you for extra effort, there's never any reason to not name a higher number.
It seems that these are problems caused by unbounded utility. But that's not the case, in fact! Consider:
You're immortal. Tell Omega any real number r > 0, and he'll give you 1-r utility. On top of that, he will give you any utility you may have lost in the decision process (such as the time wasted choosing and specifying your number). Then he departs. What number will you choose?
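This bounded version has exactly the same structure: the achievable utilities 1 − r fill the open interval below 1, so every choice is dominated by a smaller r, yet the bound itself is unattainable. A few lines make the point (the particular sequence of r values is just an illustration):

```python
# Utility for naming r > 0 is 1 - r: bounded above by 1, never equal to it.
def utility(r):
    assert r > 0
    return 1.0 - r

choices = [10.0 ** -k for k in range(1, 10)]  # r = 0.1, 0.01, 0.001, ...
utilities = [utility(r) for r in choices]

# Strictly improving at every step, yet always below the bound:
assert all(u2 > u1 for u1, u2 in zip(utilities, utilities[1:]))
assert all(u < 1.0 for u in utilities)
```

So the paradox is about an unmaximisable option set, not about unboundedness as such.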
Compatibilism in action
A practical albeit fictional application of the philosophical conclusion that free will is compatible with determinism came up today in a discussion about a setting element from the role-playing game Exalted:
(5:31:44 PM) Nekira Sudacne: So during the primordial war, one Yozi got his fetch killed and he reincarnated as Sachervell, He Who Knows The Shape of Things To Come. And he reincarnated asleep, and he has remained asleep. And the other primordials do all in their power to keep him asleep, and he wants to be asleep.
For you see, for as long as he sleeps, he dreams only of the present. Should he awaken, he will see the totality of existence, all things past and future exactly as they will happen. Quantumly speaking, he will lock the universe into a single shape. All things that happen will happen as he sees them happen and there will be no chance for anyone to change it, effectively nullifying any chance for change. Even he cannot alter his vision, for his vision takes into account all attempts to alter it.
And there's a big debate over whether or not this is a game-ending thing. Essentially, does predestination negate free will or not?
(5:32:17 PM) Nekira Sudacne: and this is important, because one of the requirements for Exaltation to function is free will. If Sachervell is able to negate free will, then Exaltations will cease to function.
(5:32:44 PM) Nekira Sudacne: and maddeningly enough the game authors are also on the thread arguing, because THEY don't agree where to go with it either :)
(5:38:02 PM) rw271828: ah, well I happen to know the answer :-)
(5:39:23 PM) rw271828: one of the most important discoveries of 20th-century mathematics is that in general the behavior of a complex system cannot be predicted -- or rather, there is no easier way to predict it than to run it and see what happens. Note in particular:
(5:39:41 PM) rw271828: 1. This is a mathematical fact, so it applies in all possible universes, including Exalted
(5:40:01 PM) rw271828: 2. Humans and other sentient lifeforms are complex systems in the relevant sense
(5:41:33 PM) rw271828: so if you postulate an entity that can actually see the future (as opposed to just extrapolate what is likely to happen unless something intervenes), the only way to do that is for that entity to run a perfect simulation, a complete copy of the universe
(5:42:50 PM) rw271828: if you're willing to postulate that, well fine, continue the game, and just note that you are running it in the copy the entity is using to make the prediction - the people in the setting still have free will, it is their actions that determine the future, and thus the result of the prediction ^.^
(5:43:04 PM) Nekira Sudacne: Hah. nice one
Deep Structure Determinism
Sort of a response to: Collapse Postulate
Abstract: There are phenomena in mathematics where certain structures are distributed "at random"; that is, statistical statements can be made and probabilities can be used to predict the outcomes of certain totally deterministic calculations. These calculations have a deep underlying structure which leads a whole class of problems to behave in the same way statistically, in a way that appears random while being entirely deterministic. If quantum probabilities worked in this way, it would not require collapse or superposition.

This is a post about physics, and I am not a physicist. I will reference a few technical details from my (extremely limited) research in mathematical physics, but they are not necessary to the fundamental concept. I am sure that I have seen similar ideas somewhere in the comments before, but searching the site for "random + determinism" didn't turn much up, so if anyone recognizes it I would like to see other posts on the subject. However, my primary purpose here is to expose the name "Deep Structure Determinism" that jasonmcdowell used for it when I explained it to him on the ride back from the Berkeley Meetup yesterday.
Again I am not a physicist; it could be that there is a one or two sentence explanation for why this is a useless theory--of course that won't stop the name "Deep Structure Determinism" from being aesthetically pleasing and appropriate.
For my undergraduate thesis in mathematics, I collected numerical evidence for a generalization of the Sato-Tate Conjecture. The conjecture states, roughly, that if you take the right set of polynomials, compute the number of solutions to them over finite fields, and scale by a consistent factor, these results will have a probability distribution that is precisely a semicircle.
The reason that this is the case has something to do with the solutions being symmetric (in the way that y = x² if and only if y = (−x)² is a symmetry of the first equation) and their group of symmetries being a circle. And stepping back one step, the conjecture more properly states that the numbers of solutions will be roots of a certain polynomial, which will be the minimal polynomial of a random matrix in SU(2).
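The phenomenon can be seen by brute force in the elliptic-curve case that motivated Sato-Tate. The curve y² = x³ + x + 1 below is just an example (not the family from my thesis): count its points over each prime field, normalize the trace by 2√p, and the resulting numbers are confined to [−1, 1] by Hasse's bound, scattering across it with (conjecturally) semicircular density.

```python
import math

def is_prime(n):
    if n < 2:
        return False
    return all(n % d for d in range(2, math.isqrt(n) + 1))

def count_points(p, a=1, b=1):
    """Points on y^2 = x^3 + a*x + b over F_p, including the point at infinity."""
    squares = {}
    for y in range(p):
        s = y * y % p
        squares[s] = squares.get(s, 0) + 1
    affine = sum(squares.get((x**3 + a * x + b) % p, 0) for x in range(p))
    return affine + 1

normalized = []
for p in range(5, 500):
    if is_prime(p):
        a_p = p + 1 - count_points(p)            # trace of Frobenius
        normalized.append(a_p / (2 * math.sqrt(p)))

# Hasse's bound guarantees every value lies in [-1, 1]; Sato-Tate says
# that as p grows, a histogram of these values fills out a semicircle.
assert all(-1.0 <= t <= 1.0 for t in normalized)
```

Each value is the output of a completely deterministic computation, yet the collection behaves statistically.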
That is at least as far as I follow the mathematics, if not further. However, it's far enough for me to stop and do a double take.
A "random matrix?" First, what does it mean for a matrix to be random? And given that I am writing up a totally deterministic process to feed into a computer, how can you say that the matrix is random?
A sequence of matrices is called "random" if, when you integrate over the sequence, your integral converges to the integral over the entire group of matrices. Because matrix groups are often smooth manifolds, they are well suited to being integrated over, and this ends up being sensible. However, a more practical characterization, and the one that I used in the write-up for my thesis, is that a histogram of the points you are measuring should converge in shape to the distribution determined by the group - that is, if you're looking at the matrices that determine a circle, your histogram should look more and more like a semicircle as you do more computing. In other words, you have a probability distribution over the matrix space for where your matrix is likely to show up.
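The histogram characterization can be tried directly for SU(2), as a toy illustration (this is not my thesis computation): a Haar-random SU(2) element corresponds to a uniformly random unit quaternion (w, x, y, z), and its trace 2w then has exactly the semicircle density on [−2, 2].

```python
import math
import random

def haar_su2_trace():
    """Trace of a Haar-random SU(2) matrix, via a uniform unit quaternion."""
    w, x, y, z = (random.gauss(0, 1) for _ in range(4))
    norm = math.sqrt(w * w + x * x + y * y + z * z)
    return 2 * w / norm

random.seed(0)  # deterministic pseudo-randomness, for reproducibility
traces = [haar_su2_trace() for _ in range(100_000)]

# All traces lie in [-2, 2], and the sample is symmetric about 0.
assert all(-2.0 <= t <= 2.0 for t in traces)
assert abs(sum(traces) / len(traces)) < 0.05

# Semicircle density piles up in the middle: about 61% of samples have
# |trace| < 1, versus 50% for a uniform distribution on [-2, 2].
central = sum(1 for t in traces if abs(t) < 1) / len(traces)
assert 0.55 < central < 0.67
```

Note that the sample here comes from a pseudo-random generator, itself a deterministic computation, which is rather in the spirit of the post.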
The actual computation that I did involved computing solutions to a polynomial equation - a trivial and highly deterministic procedure. I then scaled them and placed them in the histogram. If I had not known that these numbers were each coming from a specific equation, I would have said that they were random; they jumped around through the possibilities, but they were concentrated around the areas of higher probability.
So bringing this back to quantum physics: I am given to understand that quantum mechanics involves a lot of random matrices. These random matrices give the impression of being "random" in that it seems like there are lots of possibilities, and one must get "chosen" at the end of the day. One simple way to deal with this is to postulate many worlds, wherein no one choice has a special status.
However my experience with random matrices suggests that there could just be some series of matrices, which satisfies the definition of being random, but which is inherently determined (in the way that the Jacobian of a given elliptic curve is "determined.") If all quantum random matrices were selected from this list, it would leave us with the subjective experience of randomness, and given that this sort of computation may not be compressible, the expectation of dealing with these variables as though they are random forever. It would also leave us in a purely deterministic world, which does not branch, which could easily be linear, unitary, differentiable, local, symmetric, and slower-than-light.