The difference between Determinism & Pre-determination

3 RogerS 25 July 2013 11:41AM

1. Scope

 

There are two arm-waving views often expressed about the relationship between “determinism/causality” on the one hand and “predetermination/predictability in principle” on the other. The first treats them as essentially interchangeable: what is causally determined from instant to instant is thereby predetermined over any period - the Laplacian view. The second view is that this is a confusion, and they are two quite distinct concepts. What I have never seen thoroughly explored (and therefore propose to make a start on here) is the range of different cases which give rise to different relationships between determinism and predetermination. I will attempt to illustrate that, indeed, determinism is neither a necessary nor a sufficient condition for predetermination in the most general case.

To make the main argument clear, I will relegate various pedantic qualifications, clarifications and comments to [footnotes].

Most of the argument relates to cases of a physically classical, pre-quantum world (which is not as straightforward as often assumed, and certainly not without relevance to the world we experience). The difference that quantum uncertainty makes will be considered briefly at the end.

 

2. Instantaneous determinism

To start with it is useful to define what exactly we mean by an (instantaneously) determinist system. In simple terms this means that how the system changes at any instant is fully determined by the state of the system at that instant [1]. This is how physical laws work in a Newtonian universe. The arm-waving argument says that if this is the case, we can derive the state of the system at any future instant by advancing through an infinite number of infinitesimal steps. Since each step is fully determined, the outcome must be as well. However, as it stands this is a mathematical over-simplification: a sum of infinitely many infinitesimal steps is, in itself, an indeterminate form, so we have to look at the limiting process more carefully - and this is where there turn out to be significant differences between different cases.

 

3. Convergent and divergent behaviour

To illustrate the first difference that needs to be recognized, consider two simple cases - a snooker ball just about to collide with another snooker ball, and a snooker ball heading towards a pocket. In the first case, a small change in the starting position of the ball (assuming the direction of travel is unchanged) results in a steadily increasing change in the positions at successive instants after impact - that is, neighbouring trajectories diverge. In the second case, a small change in the starting position has no effect on the final position hanging in the pocket: neighbouring trajectories converge. So we can call these “convergent” and “divergent” cases respectively. [1.1]

Now consider what happens if we try to predict the state of some system (e.g. the position of the ball) after a finite time interval. Any attempt to find the starting position will involve a small error. The effect on the accuracy of prediction differs markedly in the two cases. In the convergent case, small initial errors will fade away with time. In the divergent case, by contrast, the error will grow and grow. Of course, if better instruments were available we could reduce the initial error and improve the prediction - but that would also increase the accuracy with which we could check the final error! So the notable fact about this case is that no matter how accurately we know the initial state, we can never predict the final state to the same level of accuracy - despite the perfect instantaneous determinism assumed, the last significant figure that we can measure remains as unpredictable as ever. [2]
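The contrast can be put in miniature. The sketch below uses invented numbers and the simplest possible model of the two cases - an initial measurement error that shrinks or grows exponentially:

```python
import math

# Toy model of the two cases (all numbers invented): an initial measurement
# error evolves as error * exp(k * t), with k < 0 for a convergent process
# (errors fade away) and k > 0 for a divergent one (errors grow and grow).
def error_at(t, initial_error, k):
    """Measurement error after time t under exponential convergence/divergence."""
    return initial_error * math.exp(k * t)

initial_error = 1e-3  # say, the limit of our measuring instruments (metres)

for label, k in [("convergent", -1.0), ("divergent", +1.0)]:
    final = error_at(5.0, initial_error, k)
    print(f"{label}: {initial_error:.0e} m initial error -> {final:.2e} m after 5 s")
```

Better instruments reduce `initial_error`, but in the divergent case the final error is always magnified by the same factor - which is the point made above about the last significant figure.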

One possible objection that might be raised to this conclusion is that with “perfect knowledge” of the initial state, we can predict any subsequent state perfectly. This is philosophically contentious - rather analogous to arguments about what happens when an irresistible force meets an immovable object. For example, philosophers who believe in “operational definitions” may doubt whether there is any operation that could be performed to obtain “the exact initial conditions”. I prefer to follow the mathematical convention that says that exact, perfect, or infinite entities are properly understood as the limiting cases of more mundane entities. On this convention, if the last significant figure of the most accurate measure we can make of an outcome remains unpredictable for any finite degree of accuracy, then we must say that the same is true for “infinite accuracy”.

 

The conclusion that there is always something unknown about the predicted outcome places a “qualitative upper limit”, so to speak, on the strength of predictability in this case, but we must also recognize a “qualitative lower limit” that is just as important, since in the snooker impact example whatever the accuracy of prediction that is desired after whatever time period, we can always calculate an accuracy of initial measurement that would enable it. (However, as we shall shortly see [3], this does not apply in every case.)  The combination of predictability in principle to any degree, with necessary unpredictability to the precision of the best available measurement, might be termed “truncated predictability”.
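The “qualitative lower limit” amounts to a simple back-calculation. Assuming (hypothetically) that errors grow exponentially with some divergence time constant tau, the initial accuracy needed to hit any target accuracy at any finite time is itself finite:

```python
import math

# Back-calculation sketch with assumed exponential error growth exp(t / tau):
# for ANY target accuracy at ANY finite time, a finite (if demanding)
# initial measurement accuracy suffices. The constants are hypothetical.
def required_initial_accuracy(target_accuracy, t, tau):
    """Initial error that grows to exactly target_accuracy after time t."""
    return target_accuracy * math.exp(-t / tau)

tau = 2.0       # hypothetical divergence time constant (seconds)
target = 1e-3   # desired prediction accuracy (metres)
for t in (1.0, 10.0, 100.0):
    needed = required_initial_accuracy(target, t, tau)
    print(f"t = {t:6.1f} s: measure the initial state to within {needed:.3e} m")
```

The required accuracy becomes astronomically demanding as t grows, but it never becomes impossible in principle - which is exactly what “truncated predictability” asserts.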

 

4. More general cases

The two elementary cases considered so far illustrate the importance of distinguishing convergent from divergent behaviour, and so provide a useful paradigm to be kept in mind, but of course, most real cases are more complicated than this.

To take some examples, a system can have both divergent parts and convergent parts at any instant - such as different balls on the same snooker table; an element whose trajectory is behaving divergently at one instant may behave convergently at another instant; convergent movement along one axis may be accompanied by divergent movement relative to another; and, significantly, divergent behaviour at one scale may be accompanied by convergent behaviour at a different scale. Zoom out from that snooker table, round positions to the nearest metre or so, and the trajectories of all the balls follow that of the adjacent surface of the earth.

There is also the possibility that a system can be potentially divergent at all times and places. A famous case of such behaviour is the chaotic behaviour of the atmosphere, first clearly understood by Edward Lorenz in 1961. This story comes in two parts, the second apparently much less well known than the first.

 

5. Chaotic case: discrete

The equations normally used to describe the physical behaviour of the atmosphere formally describe a continuum, an infinitely divisible fluid. As there is no algebraic “solution” to these equations, approximate solutions have to be found numerically, which in turn requires the equations to be “discretised” - that is, adapted to describe the behaviour at, or averaged around, a suitably large number of discrete points.

 

The well-known part of Lorenz’s work [4] arose from an accidental observation, that a very small change in the rounding of the values at the start of a numerical simulation led in due course to an entirely different “forecast”. Thus this is a case of divergent trajectories from any starting point, or “sensitivity to initial conditions” as it has come to be known.

 

The part of “chaos theory” that grew out of this initial insight quantifies the divergent trajectories from any starting point: they diverge exponentially, with a time constant for the particular problem case (more commonly called the Lyapunov time) [5]. Thus we can still say, as we said for the snooker ball, that whatever the accuracy of prediction that is desired after whatever time period, we can always calculate an accuracy of initial measurement that would enable it.

 

6. Chaotic case: continuum

Other researchers might have dismissed the initial discovery of sensitivity to initial conditions as an artefact of the computation, but Lorenz realised that even if the computation had been perfect, exactly the same consequences would flow from disturbances in the fluid in the gaps between the discrete points of the numerical model.  This is often called the “Butterfly Effect” because of a conference editor's colourful summary that “the beating of a butterfly’s wings in Brazil could cause a tornado in Texas”.

 

It is important to note that the Butterfly Effect is not strictly the same as “Sensitivity to Initial Conditions” as is often reported [6], although they are closely related. Sensitivity to Initial Conditions is an attribute of some discretised numerical models. The Butterfly Effect describes an attribute of the equations describing a continuous fluid, so is better described as “sensitivity to disturbances of minimal extent”, or in practice, sensitivity to what falls between the discrete points modelled.

 

Since, as noted above, there is no algebraic solution to the continuous equations, the only way to establish the divergent characteristics of the equations themselves is to repeatedly reduce the scale of discretisation (the typical distance between the points on the grid of measurements) and observe the trend. In fact, this was done for a very practical reason: to find out how much benefit would be obtained, in terms of the durability of the forecast [7], by providing more weather stations. The result was highly significant: each doubling of the number of stations increased the durability of the forecast by a smaller amount, so that (by extrapolation) as the number of imaginary weather stations was increased without limit, the forecast durability of the model converged to a finite value[8]. Thus, beyond this time limit, the equations that we use to describe the atmosphere give indeterminate results, however much detail we have about the initial conditions. [9]
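The shape of that extrapolation can be sketched with a toy geometric series. All the numbers below are invented for illustration (they are not the figures from the article):

```python
# Hedged reconstruction of the extrapolation described above, with made-up
# numbers: suppose the first doubling of weather stations extends the
# forecast durability by 5 days, and each further doubling gains only half
# as much. The total gain then converges however many stations we add.
baseline = 10.0     # baseline forecast durability in days (hypothetical)
first_gain = 5.0    # days gained by the first doubling (hypothetical)
ratio = 0.5         # diminishing-returns factor per doubling (hypothetical)

durability = baseline
gain = first_gain
for _ in range(30):        # 30 doublings = about a billion times more stations
    durability += gain
    gain *= ratio

# Geometric series: the limit is finite, a hard ceiling on forecast durability.
limit = baseline + first_gain / (1 - ratio)
print(f"after 30 doublings: {durability:.6f} days; limit {limit:.1f} days")
```

However fast the stations multiply, the durability never passes the limit - the analogue of the finite forecast horizon found by the extrapolation.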

 

Readers will doubtless have noticed that this result does not strictly apply to the earth’s atmosphere, because that is not the infinitely divisible fluid that the equations assumed (and a butterfly is likewise finitely divisible). Nevertheless, the fact that there are perfectly well-formed, familiar equations which by their nature have unpredictable outcomes after a finite time interval vividly exposes the difference between determinism and predetermination.

 

With hindsight, the diminishing returns in forecast durability from refining the scale of discretisation is not too surprising: it is much quicker for a disturbance on a 1 km scale to have effects on a 2 km scale than for a disturbance on a 100 km scale to have effects on a 200 km scale.

 

7. Consequences of quantum uncertainty

It is often claimed that the Uncertainty Principle of quantum mechanics [10] makes the future unpredictable [11], but in the terms of the above analysis this is far from the whole story.

 

The effect of quantum mechanics is that at the scale of fundamental particles [12] the laws of physical causality are probabilistic. As a consequence, there is certainly no basis, for example, to predict whether an unstable nucleus will disintegrate before or after the expiry of its half-life.
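That point can be checked in miniature. Sampling decay times from the exponential decay law (the standard statistics of radioactive decay), about half of a large sample decays before one half-life and half after - a probability, not a prediction:

```python
import math
import random

# Monte Carlo sketch: decay times follow an exponential distribution with
# decay constant lambda = ln(2) / half_life, so by construction a single
# nucleus has exactly a 50% chance of decaying before one half-life.
random.seed(0)

half_life = 1.0
rate = math.log(2) / half_life     # decay constant
n = 100_000
early = sum(random.expovariate(rate) < half_life for _ in range(n))
print(f"fraction decayed before one half-life: {early / n:.3f}")
```

The simulated fraction sits close to 0.5, but nothing in the model says *which* nuclei go early - which is the sense in which there is “no basis to predict”.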

 

However, in the case of a convergent process at ordinary scales, the unpredictability at quantum scale is immaterial, and at the scale of interest predictability continues to hold sway. The snooker ball finishes up at the bottom of the pocket whatever the energy levels of its constituent electrons. [13]

 

It is in the case of divergent processes that quantum effects can make for unpredictability at large scales. In the case of the atmosphere, for example, the source of that tornado in Texas could be a cosmic ray in Colombia, and cosmic radiation is strictly non-deterministic. The atmosphere may not be the infinitely divisible fluid considered by Lorenz, but a molecular fluid subject to random quantum processes has just the same lack of predictability.

 

[EDIT] How does this look in terms of the LW-preferred Many Worlds interpretation of quantum mechanics?[14] In this framework, exact "objective prediction" is possible in principle but the prediction is of an ever-growing array of equally real states. We can speak of the "probability" of a particular outcome in the sense of the probability of that outcome being present in any state chosen at random from the set. In a convergent process the cases become so similar that there appears to be only one outcome at the macro scale (despite continued differences on the micro scale); whereas in a divergent process the "density of probability" (in the above sense) becomes so vanishingly small for some states that at a macro scale the outcomes appear to split into separate branches. (They have become decoherent.) Any one such branch appears to an observer within that branch to be the only outcome, and so such an observer could not have known what to "expect" - only the probability distribution of what to expect. This can be described as a condition of subjective unpredictability, in the sense that there is no subjective expectation that can be formed before the divergent process which can be reliably expected to coincide with an observation made after the process. [END of EDIT]

 

8. Conclusions

What has emerged from this review of different cases, it seems to me, is that it is the convergent/divergent dichotomy that has the greatest effect on the predictability of a system’s behaviour, not the deterministic/quantised dichotomy at subatomic scales.

 

More particularly, in short-hand:-

Convergent + deterministic => full predictability

Convergent + quantised => predictability at all super-atomic scales

Divergent + deterministic + discrete => “truncated predictability”

Divergent + deterministic + continuous => unpredictability

[EDIT] Divergent + quantised => objective predictability of the multiverse but subjective unpredictability

 

Footnotes

1. The “state” may already include time derivatives of course, and in the case of a continuum, the state includes spatial gradients of all relevant properties.

1.1 For simplicity I have ignored the case between the two where neighbouring trajectories are parallel. It should be obvious how the argument applies to this case. Convergence/divergence is clearly related to (in)stability, and less directly to other properties such as (non)-linearity and (a)periodicity, but as convergence defines the characteristic that matters in the present context it seems better to focus on that.

2. In referring to a “significant figure” I am of course assuming that decimal notation is used, and that the initial error has diverged by at least a factor of 10.

3. In section 6.

4. For example, see Gleick, “Chaos”, "The Butterfly Effect" chapter.

5. My source for this statement is a contribution by Eric Kvaalen to the New Scientist comment pages.

6. E.g. by Gleick or Wikipedia.

7. By durability I mean the period over which the required degree of accuracy is maintained.

8. This account is based on my recollection, and notes made at the time, of an article in New Scientist, volume 42, p290. If anybody has access to this or knows of an equivalent source available on-line, I would be interested to hear!

9. I am referring to predictions of the conditions at particular locations and times. It is, of course, possible to predict average conditions over an area on a probabilistic basis, whether based on seasonal data, or the position of the jetstream etc. These are further examples of how divergence at one scale can be accompanied by something nearer to convergence on another scale.

10. I am using “quantum mechanics” as a generic term to include its later derivatives such as quantum chromodynamics. As far as I understand it these later developments do not affect the points made here. However, this is certainly well outside my professional expertise in aspects of Newtonian mechanics, so I will gladly stand corrected by more specialist contributors!

11. E.g. by Karl Popper in an appendix to The Poverty of Historicism.

12. To be pedantic, I’m aware that this also applies to greater scales, but to a vanishingly small extent.

13. In such cases we could perhaps say that predictability is effectively an “emergent property” that is not present in the reductionist laws of the ultimate ingredients but only appears in the solution space of large scale aggregates. 

14. Thanks to the contributors of the comments below as at 30 July 2013 which I have tried to take into account. The online preview of "The Emergent Multiverse: Quantum Theory According to the Everett Interpretation" by David Wallace has also been helpful to understanding the implications of Many Worlds.


The real difference between Reductionism and Emergentism

2 RogerS 15 April 2013 10:09PM

After trying to discover why the LW wiki “definition” of Reductionism appeared so biased, I concluded from the responses that it was never really intended as a definition of the Reductionist position itself, but as a summary of what is considered to be wrong with positions critical of Reductionism.

The argument goes like this. “Emergentism”, as the critical view is often called, points out the properties that emerge from a system when it is assembled from its elements, which do not themselves show such a property. From such considerations it identifies various ways in which research programmes based on a reductionist approach may distort priorities and underestimate difficulties. So far, this is all a matter of degree and eventually each case must be settled on its merits. However, it gets philosophically sensitive when Emergentists claim that a Reductionist approach may be unable in principle to 'explain' certain emergent properties.

The response to this claim (I think) goes like this. (1) The explanatory power of a model is a function of its ingredients. (2) Reductionism includes all the ingredients that actually exist in the real world. Therefore (3) Emergentists must be treating the “emergent properties” as extra ingredients, thereby confusing the “map” with the “territory”. So Reductionism is defined by EY and others as not treating emergent properties as extra ingredients (in effect).

At this point it is important to distinguish “Mind theory” from other fields where Reductionism is debated. In this field, Reductionists apparently regard Emergentism as a form of disguised Vitalism/Dualism - if emergent properties can’t be explained by the physical ingredients, they must exist in some non-physical realm. However, Emergentism can apply equally well to everything from chess playing programs to gearbox vibrations, neither of which involve anything like mysterious spiritual substances, so this can hardly be the whole story. And in fact I would argue that the reverse is the case: Vitalists or “substance Dualists” are actually unconscious Reductionists as well: when they assume an extra ingredient is necessary to account for the things which they believe Physicalism cannot explain, they are still reducing a system to its ingredients. Emergentists by contrast reject premise (1) of the previous paragraph, that the explanatory power of a model is a function of its ingredients. Thus it seems to me that the real difference between Reductionists & Emergentists is a difference over the nature of explanation. So it seems worthwhile looking into some of the different things that can be meant by “explanation”.

For simplicity, let us illustrate this by the banal example of a brickwork bridge. The elements are the bricks and their relative positions. Our reductionist R points out that these are the only elements you need - after all, if you remove all the bricks there is nothing left - and so proposes to become an expert in bricks. Our (Physicalist) Emergentist E suggests that this won’t be of much use without a knowledge of the Arch (an emergent feature). R isn't stupid and agrees that this would be extremely useful but points out that if no expert in Arch Theory is to hand, given the very powerful computer available, such expertise isn’t strictly necessary: it's not an inherent requirement. Simply solving the force balance equations for each brick will establish whether a given structure will fall into the river. Isn’t that an explanation?

Not in my sense, says E, as to start with it doesn’t tell me how the bridge will be designed, only how an existing design will be analysed. So R explains that the computer will generate structures randomly until one is found that satisfies the requirements of equilibrium. When E enquires how stability will be checked, R replies that the force balance will be checked under all possible small deviations from the design position.

E isn’t satisfied. To claim understanding, R must be able to apply the results of the first design to new bridges of different span, but all (s)he can do is repeat the process again every time.

On the contrary, replies R, this being the age of Big Data, the computer can generate solutions in a large number of cases and then use pattern recognition software to extract rules that can be applied to new cases.

Ah, says E, but explaining these rules means hypothesising more general rules from which these rules can be derived, using appropriate Bayesian reasoning to confirm your hypothesis.

OK, replies R, my program has a heuristic feature that has passed the Turing Test. So anything you can do along these lines, it can do just as well.

So using R’s approach, explanation even in E’s most general sense can always be arrived at by a four-stage process: (1) construct a model using the basic elements applicable to the situation, (2) fill a substantial chunk of solution space, (3) use pattern recognition to extract pragmatic rules, (4) use hypothesis generation and testing to derive general principles from the rules. It may be a trivial illustration, but it seems to me that in a broad sense this sort of process must be applicable in almost any situation.
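The four stages can be run in miniature. The “physics” below is an invented rule standing in for the force-balance equations, so this is a toy of the process, not of real bridges:

```python
# Toy run of the four-stage process. Step 1 (the model): suppose a brick
# arch of span s stands iff its depth d satisfies d >= s*s/100 - an
# invented rule standing in for solving the force-balance equations.
def stands(span, depth):
    return depth >= span * span / 100.0

# Step 2: fill a chunk of solution space by brute force, recording the
# smallest working depth (in steps of 0.1) for each surveyed span.
min_depth = {}
for span in range(1, 21):
    for tenths in range(1, 1000):
        if stands(span, tenths / 10.0):
            min_depth[span] = tenths / 10.0
            break

# Step 3: "pattern recognition" - the data suggest min_depth grows as span**2.
k = min_depth[20] / 20**2     # estimate the constant from the largest case

# Step 4: hypothesise the general rule depth = k * span**2 and apply it to a
# span outside the surveyed range - something mere enumeration never gave us.
prediction = k * 30**2
print(f"rule: depth = {k:.3f} * span^2; span 30 needs depth {prediction:.1f}")
```

Step (4) is where the approach stops being mere enumeration: the rule extracted from the surveyed spans extrapolates to spans that were never brute-forced.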

How should we interpret this conclusion? R would say that it proves that “explanation” can be arrived at using a Reductionist model. E would say it proves the inadequacy of Reductionism, since Reductionist steps (1) & (2) have to be supplemented by Integrationist steps (3) & (4): the rules found at step (3) are precisely “emergent features” of the solution space. Moreover, pattern recognition is not a closed-form process with repeatable results. (Is it?) On the other hand the patterns identified in solution space might well be derivable in closed form directly from higher-level characteristics of the system in question (such as constraints in the system).

I would say that the choice of interpretation is a matter of convention, though I own up that I find the Emergentist mind-set more helpful in the fields I have learnt something about. What really matters is a recognition of the huge difference between “providing a solution” and “generalising from solution space” as types of explanation. The “Emergentist” label is a reminder of that difference. But call yourself a “Reductionist” if you like so long as you acknowledge the difference.

It seems to me that the sort of argument sketched here provides useful pointers to help recognize when “Reductionism” becomes “Greedy Reductionism”(A). For example, consider the claim that mapping the Human Connectome will enable the workings of the brain to be explained. Clearly, the mapping is just step (1). Consider the size of the Connectome, and then consider the size of the solution space of its activity. That makes step (1) sound utterly trivial compared with step (2). This leaves the magnitude of steps (3) & (4) to be evaluated. That doesn’t mean the project won’t be extremely valuable, but it puts the time-frame of the claim to provide real “understanding” into a very different light, and underlines the continued value of working at other scales as well.

(A): See e.g. fubarobfusco's comment on my earlier discussion.

Removing Bias From the Definition of Reductionism

1 RogerS 27 March 2013 06:06PM

The test for an unbiased definition of a philosophical position is (surely) that it is equally acceptable to critics and defenders of the position. I think the definition of reductionism in the wiki blatantly fails this test. The same bias is apparent in the old Sequence posting dealing with reductionism. (Some comments called it a “straw man” without spelling out why.)

Consider the definition:-

Reductionism is a disbelief that the higher levels of simplified multilevel models are out there in the territory, that concepts constructed by mind in themselves play a role in the behaviour of reality. This doesn't contradict the notion that the concepts used in simplified multilevel models refer to the actual clusters of configurations of reality.

 

The unavoidable implication is that critics of reductionism believe that the higher levels of simplified multilevel models are out there in the territory.

Certainly, nobody but the flimsiest of straw men could possibly believe this, since all parts of the models are by definition not part of the territory: they are part of the map. It might be possible to believe, by contrast, that the territory actually has built into it things that correspond in some sense to higher levels of a hierarchical map, whether simplified or no; or that whether there are or are not such things is not decidable or meaningful; or one could believe that there are definitely no such things, that hierarchical higher levels of organisation (clusters) are meaningful only as mental artefacts.

The second disbelief, that concepts constructed by mind in themselves play a role in the behaviour of reality, also appears to be unimpeachable. However, by attaching this claim to the first claim, it is implied (without examination of the implication) that this statement only applies to the higher levels of simplified models. Yet logically, it cannot be confined to the higher levels but equally applies to the lowest level of the map: it is not the pixels of even the best possible map that themselves play a role in the behaviour of reality, but only something in some way corresponding to them.

In the Sequence discussion of reductionism we read:

So is the 747 made of something other than quarks?  No...

And bigjeff5 later comments

The territory is only quarks (or whatever quarks may be made of).

But baryons, which are part of the map, are made of quarks, which are therefore also part of the map. In fact they are the pixels of the best available map today. The map is not the territory. Therefore quarks are not part of the territory. So nothing is made of quarks except in our current map. To say that reality is made of quarks is an acceptable shorthand in many contexts, but in a discussion whose whole point is to emphasize the need not to confuse the map with the territory, disregarding the distinction at quark level is at least prima facie evidence of a biased approach!

TWO KINDS OF SIMPLIFIED MAPS

The wiki definition I began with includes the word simplified for reasons that are not clear. The Sequence discussion seems to me to confuse two different senses of the term: simplification by approximation and simplification by selection. Newton’s theory is now regarded as a simplified version of Special Relativity that serves as an excellent approximation in certain contexts. In this case the key is that the simplification is an approximation. Treating the forces on an aircraft wing at the aggregate level is leaving out internal details that per se do not affect the result. There will certainly be approximations involved, of course, but they don’t stem from the actual process of aggregation, which is essentially a matter of combining all the relevant force equations algebraically, eliminating internal forces, before solving them; rather than combining the calculated forces numerically. So is the definition addressing approximate maps or selective maps?

WHICH TERRITORY?

A question raised in the discussion but never answered as far as I can see is whether the belief referred to applies to our particular universe or to any universe one could conceive (so is more like a belief about the nature of explanation). While much of the discussion is focussed on our universe, various analogies advanced by commenters suggest that applicability to all universes is intended. I will assume the latter. A further confusion is that whereas the usual definition of reductionism refers to reduction of any system to its elements, and thus, for example, covers the reduction of lifeforms to their genetic recipes, the focus on “the territory” seems to confine the definition to physics: genes, being “made of quarks”, are just a level of the map.

DEFINITIONS

A definition that consists of disbelieving a contradiction in terms and then disguising a selective application of a truism is clearly biased, and leads to apparently biased thinking, so I will attempt an unbiased definition.

Of course, we could simply import a definition from Wikipedia or elsewhere, but I am trying to capture the particular approach and terminology of this site (from initial impressions) in an unbiased way.

If I understand it correctly, reductionists on this site believe that, for the purposes of causal explanation, any “territory” in the sense of physical reality is best characterised as corresponding only to the lowest hierarchical level of our best map of it, higher levels of organisation existing only in the map. Is that right?

The summary in the SEQ_RERUN is also worth repeating:

We build models of the universe that have many different levels of description. But so far as anyone has been able to determine, the universe itself has only the single level of fundamental physics - reality doesn't explicitly compute protons, only quarks.

Which seems to mean much the same as my definition (if it means anything to say that “the universe computes”).

Psy-Kosh implicitly criticises the reference to quarks in claiming:

Reductionism does _NOT_ mean "reduction to particles", just "reduction to simple principles that are the basic thing that give rise to everything else".

But that doesn’t exclude “simple principles” that emerge from higher levels of organization, so doesn’t really fit the bill either.

The most relevant corresponding wording in Wikipedia is interesting because it makes no reference to the model/reality (map/territory) distinction which Eliezer seems to think makes ontological reductionism intelligible:

In a reductionist framework, [a phenomenon] that can be explained completely in terms of relations between other more fundamental phenomena  exerts no causal agency on the fundamental phenomena that explain it.

This assumes that the term “more fundamental” is defined, and (like my definition) doesn’t distinguish sequential causation from structural causation. Hmm, maybe this needs a separate post sometime.

FOOTNOTE: WHAT ABOUT THE HEURISTIC SENSE?

As Perplexed pointed out, the discussion of reductionism in the Sequences clearly refers to ontological reductionism by contrast with methodological reductionism. The same applies to the wiki definition and my proposed correction.

I entirely support this distinction. I am focussing here on ontological reductionism because so many contributors define themselves as “reductionist materialists” as a matter of belief. One would no more define oneself as a heuristic reductionist than one would define oneself as a hammer user (rather than a screwdriver user, say) as a matter of conviction.

TEASER

This quarrel about definition is not mere pedantry, since it hints at an unconscious bias. Moreover, if there are signs of agreement with this definition (or improved versions of it) and my newbie’s karma doesn’t suffer, I hope to build on this beginning to suggest that the real difference between proponents and critics of ontological reductionism is that they are using two subtly different conceptions of what we mean by “the territory” and “the map”, both consistent with the definitions.  (Both conceptions have been implied by different contributors but without considering that the difference may be one of convention).