Common sense quantum mechanics

11 dvasya 15 May 2014 08:10PM

Related to: Quantum physics sequence.

TLDR: Quantum mechanics can be derived from the rules of probabilistic reasoning. The wavefunction is a mathematical vehicle to transform a nonlinear problem into a linear one. The Born rule that is so puzzling for MWI results from the particular mathematical form of this functional substitution.

This is a brief overview of a recent paper in Annals of Physics (recently mentioned in Discussion):

Quantum theory as the most robust description of reproducible experiments (arXiv)

by Hans De Raedt, Mikhail I. Katsnelson, and Kristel Michielsen. Abstract:

It is shown that the basic equations of quantum theory can be obtained from a straightforward application of logical inference to experiments for which there is uncertainty about individual events and for which the frequencies of the observed events are robust with respect to small changes in the conditions under which the experiments are carried out.

In a nutshell, the authors use the "plausible reasoning" rules (as in, e.g., Jaynes' Probability Theory) to recover the quantum-physical results for the EPR and Stern–Gerlach experiments by adding a notion of experimental reproducibility in a mathematically well-formulated way and without any "quantum" assumptions. Then they show how the Schrodinger equation (SE) can be obtained from the nonlinear variational problem on the probability P for the particle-in-a-potential problem when the classical Hamilton–Jacobi equation holds "on average". The SE allows one to transform the nonlinear variational problem into a linear one, and in the course of this transformation the (real-valued) probability P and the action S are combined into a single complex-valued function ψ ~ P^(1/2) exp(iS), which becomes the argument of the SE (the wavefunction).

This casts the "serious mystery" of Born probabilities in a new light. Instead of the observed frequency being the square(d amplitude) of the "physically fundamental" wavefunction, the wavefunction is seen as a mathematical vehicle to convert a difficult nonlinear variational problem for inferential probability into a manageable linear PDE, where it so happens that the probability enters the wavefunction under a square root.

Below I will excerpt some math from the paper, mainly to show that the approach actually works, but outlining just the key steps. This will be followed by some general discussion and reflection.

1. Plausible reasoning and reproducibility

The authors start from the usual desiderata that are well laid out in Jaynes' Probability Theory and elsewhere, and add to them another condition:

  • There may be uncertainty about each event.
  • The conditions under which the experiment is carried out may be uncertain.
  • The frequencies with which events are observed are reproducible and robust against small changes in the conditions.

Mathematically, this is the requirement that the probability P(x|θ,Z) of observation x, given an uncertain experimental parameter θ and the rest of our knowledge Z, is maximally robust to small changes in θ. Using log-probabilities, this amounts to minimizing the "evidence"

for any small ε so that |Ev| is not a function of θ (but the probability is).

2. The Einstein–Podolsky–Rosen–Bohm experiment

There is a source S that, when activated, sends a pair of signals to two routers R1 and R2. Each router then sends the signal to one of its two detectors Di± (i = 1, 2). Each router can be rotated, and we denote by θ the angle between them. The experiment is repeated N times, yielding the data set {x1,y1}, {x2,y2}, ..., {xN,yN}, where x and y are the outcomes from the two detectors (+1 or –1). We want to find the probability P(x,y|θ,Z).

After some calculations it is found that the single-trial probability can be expressed as P(x,y|θ,Z) = (1 + xy E12(θ)) / 4, where E12(θ) = Σ_{x,y=±1} xy P(x,y|θ,Z) is a periodic function.

From the properties of Bernoulli trials it follows that, for a data set of N trials with nxy total outcomes of each type {x,y},

and expanding this in a Taylor series it is found that

The expression in the sum is the Fisher information IF for P. The maximum robustness requirement means it must be minimized. Writing it as IF = (dE12(θ)/dθ)^2 / (1 – E12(θ)^2), one finds that E12(θ) = cos(θ IF^(1/2) + φ), and since E12 must be periodic in the angle, IF^(1/2) must be a natural number, so the smallest possible value is IF = 1. Choosing φ = π, it is found that E12(θ) = –cos(θ), and we obtain the result that

which is the well-known correlation of two spin-1/2 particles in the singlet state.
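The closed-form result is easy to verify numerically (my own illustrative sketch, not code from the paper): with E12(θ) = –cos(θ), the Fisher information (dE12/dθ)^2 / (1 – E12^2) equals 1 at every angle, and the single-trial probabilities form a valid distribution:

```python
import math

def E12(theta):
    # Correlation function found by minimizing the Fisher information
    return -math.cos(theta)

def fisher_information(theta, h=1e-6):
    # IF = (dE12/dtheta)^2 / (1 - E12^2), with a central-difference derivative
    dE = (E12(theta + h) - E12(theta - h)) / (2 * h)
    return dE**2 / (1 - E12(theta)**2)

def P(x, y, theta):
    # Single-trial probability P(x,y|theta,Z) = (1 + x*y*E12(theta)) / 4
    return (1 + x * y * E12(theta)) / 4

theta = 1.0  # any angle away from multiples of pi
assert abs(fisher_information(theta) - 1.0) < 1e-5
# The four outcome probabilities are non-negative and sum to 1
total = sum(P(x, y, theta) for x in (+1, -1) for y in (+1, -1))
assert abs(total - 1.0) < 1e-12
```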

Needless to say, our derivation did not use any concepts of quantum theory. Only plain, rational reasoning strictly complying with the rules of logical inference and some elementary facts about the experiment were used.

3. The Stern–Gerlach experiment

This case is analogous to, and simpler than, the previous one. The setup contains a source emitting a particle with magnetic moment S, a magnet with field in the direction a, and two detectors D+ and D–.

Similarly to the previous section, P(x|θ,Z) = (1 + x E(θ)) / 2, where E(θ) = P(+|θ,Z) – P(–|θ,Z) is an unknown periodic function. By complete analogy we seek the minimum of IF and find that E(θ) = ±cos(θ), so that

In quantum theory, [this] equation is in essence just the postulate (Born’s rule) that the probability to observe the particle with spin up is given by the square of the absolute value of the amplitude of the wavefunction projected onto the spin-up state. Obviously, the variability of the conditions under which an experiment is carried out is not included in the quantum theoretical description. In contrast, in the logical inference approach, [equation] is not postulated but follows from the assumption that the (thought) experiment that is being performed yields the most reproducible results, revealing the conditions for an experiment to produce data which is described by quantum theory.

To repeat: there are no wavefunctions in the present approach. The only assumption is that a dependence of outcome on particle/magnet orientation is observed with robustness/reproducibility.
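As a quick consistency check (my own sketch, not from the paper), the inference result P(+|θ,Z) = (1 + cos θ)/2 coincides with the familiar quantum expression cos^2(θ/2) for a spin-1/2 particle, by the half-angle identity:

```python
import math

def P_up(theta):
    # Logical-inference result: P(+|theta,Z) = (1 + cos(theta)) / 2
    return (1 + math.cos(theta)) / 2

def born_rule(theta):
    # Textbook quantum result for spin-1/2: |<up|psi>|^2 = cos^2(theta/2)
    return math.cos(theta / 2) ** 2

# The two expressions agree identically, by cos^2(t/2) = (1 + cos t)/2
for theta in [0.0, 0.3, 1.0, 2.0, math.pi]:
    assert abs(P_up(theta) - born_rule(theta)) < 1e-12
```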

4. Schrodinger equation

A particle is located at an unknown position θ on a line segment [–L, L]. Another line segment [–L, L] is uniformly covered with detectors. A source emits a signal and the particle's response is detected by one of the detectors.

After going to the continuum limit of infinitely many infinitesimally small detectors and accounting for translational invariance, it is possible to show that the position of the particle θ and that of the detector x can be interchanged, so that ∂P(x|θ,Z)/∂θ = –∂P(x|θ,Z)/∂x.
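The interchange property is just the statement that P depends on x and θ only through x – θ; a quick numerical sketch (my own, with an arbitrary illustrative Gaussian profile) confirms the sign relation:

```python
import math

def P(x, theta, sigma=0.5):
    # Any translation-invariant model P(x|theta) = f(x - theta); a Gaussian
    # profile is used here purely for illustration
    z = (x - theta) / sigma
    return math.exp(-z * z / 2) / (sigma * math.sqrt(2 * math.pi))

h = 1e-6
x, theta = 0.3, 0.1
dP_dtheta = (P(x, theta + h) - P(x, theta - h)) / (2 * h)
dP_dx = (P(x + h, theta) - P(x - h, theta)) / (2 * h)
# Translational invariance forces dP/dtheta = -dP/dx
assert abs(dP_dtheta + dP_dx) < 1e-6
```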

In exactly the same way as before we need to minimize Ev by minimizing the Fisher information, which is now

However, simply solving this minimization problem will not give us anything new, because nothing so far has accounted for the fact that the particle moves in a potential. This needs to be built into the problem, which can be done by requiring that the classical Hamilton–Jacobi equation holds on average. Using the Lagrange multiplier method, we now need to minimize the functional

Here S(x) is the action (Hamilton's principal function). This minimization yields solutions for the two functions P(x|θ,Z) and S(x). It is a difficult nonlinear minimization problem, but it is possible to find a matching solution in a tractable way using a mathematical "trick". It is known that standard variational minimization of the functional

yields the Schrodinger equation for its extrema. On the other hand, if one makes the substitution combining two real-valued functions P and S into a single complex-valued ψ,

Q is immediately transformed into F, concluding the derivation of the Schrodinger equation. Incidentally, ψ is constructed so that P(x|θ,Z) = |ψ(x|θ,Z)|^2, which is the Born rule.
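The Born-rule identity can be checked mechanically (a sketch with arbitrary illustrative choices of P and S, not taken from the paper): whatever real action S(x) is folded into the phase, squaring the modulus of ψ = P^(1/2) exp(iS) returns exactly the probability P:

```python
import numpy as np

x = np.linspace(-1.0, 1.0, 201)
P = np.exp(-x**2)              # an arbitrary probability profile (illustrative)
P /= P.sum() * (x[1] - x[0])   # normalize so that P integrates to ~1
S = np.sin(3 * x)              # an arbitrary real-valued action S(x) (illustrative)
psi = np.sqrt(P) * np.exp(1j * S)  # the substitution psi = P^(1/2) * exp(i*S)

# The Born rule is built into the functional form: |psi|^2 recovers P
# exactly, for any real S, so no separate postulate is needed
assert np.allclose(np.abs(psi)**2, P)
```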

Summing up the meaning of the Schrodinger equation in the present context:

Of course, a priori there is no good reason to assume that on average there is agreement with Newtonian mechanics ... In other words, the time-independent Schrodinger equation describes the collective of repeated experiments ... subject to the condition that the averaged observations comply with Newtonian mechanics.

The authors then proceed to derive the time-dependent SE (independently from the stationary SE) in a largely similar fashion.

5. What it all means

Classical mechanics assumes that everything about the system's state and dynamics can be known (at least in principle). It starts from axioms and proceeds to derive its conclusions deductively (as opposed to inductive reasoning). In this respect quantum mechanics is to classical mechanics what probabilistic logic is to classical logic.

Quantum theory is viewed here not as a description of what really goes on at the microscopic level, but as an instance of logical inference:

in the logical inference approach, we take the point of view that a description of our knowledge of the phenomena at a certain level is independent of the description at a more detailed level.

and

quantum theory does not provide any insight into the motion of a particle but instead describes all what can be inferred (within the framework of logical inference) from or, using Bohr’s words, said about the observed data

Such a treatment of QM is similar in spirit to Jaynes' Information Theory and Statistical Mechanics papers (I, II). Traditionally statistical mechanics/thermodynamics is derived bottom-up from the microscopic mechanics and a series of postulates (such as ergodicity) that allow us to progressively ignore microscopic details under strictly defined conditions. In contrast, Jaynes starts with minimum possible assumptions:

"The quantity x is capable of assuming the discrete values xi ... all we know is the expectation value of the function f(x) ... On the basis of this information, what is the expectation value of the function g(x)?"

and proceeds to derive the foundations of statistical physics from the maximum entropy principle. Of course, these papers deserve a separate post.
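Jaynes' setup can be made concrete with the classic dice example (my own illustrative sketch, not from his papers verbatim): given only that a die's mean roll is 4.5 rather than the fair 3.5, maximum entropy yields probabilities p_i proportional to exp(–λ x_i), with λ fixed by the constraint:

```python
import math

# Faces of the die and the single known expectation value
xs = [1, 2, 3, 4, 5, 6]
target_mean = 4.5

def mean_for(lam):
    # Mean roll under the maxent family p_i ~ exp(-lam * x_i)
    w = [math.exp(-lam * x) for x in xs]
    Z = sum(w)
    return sum(x * wi for x, wi in zip(xs, w)) / Z

# mean_for is monotone decreasing in lam, so plain bisection finds lam
lo, hi = -5.0, 5.0
for _ in range(100):
    mid = (lo + hi) / 2
    if mean_for(mid) > target_mean:
        lo = mid
    else:
        hi = mid
lam = (lo + hi) / 2

w = [math.exp(-lam * x) for x in xs]
Z = sum(w)
p = [wi / Z for wi in w]
# The resulting distribution reproduces the known expectation exactly
assert abs(sum(x * pi for x, pi in zip(xs, p)) - target_mean) < 1e-9
```

Note that for a mean above 3.5 the solution tilts probability toward the high faces, without assuming anything else about the die.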

This community should be particularly interested in how this all aligns with the many-worlds interpretation. Obviously, any conclusions drawn from this work can only apply to the "quantum multiverse" level and cannot rule out or support any other many-worlds proposals.

In quantum physics, MWI quite naturally resolves some difficult issues in the "wavefunction-centric" view. However, we see that the wavefunction is not really central to quantum mechanics. This removes the whole problem of wavefunction collapse that MWI seeks to resolve.

The Born rule is arguably a big issue for MWI. But here it essentially boils down to "x is quadratic in t where t = sqrt(x)". Without the wavefunction (only probabilities) the problem simply does not appear.

Here is another interesting conclusion:

if it is difficult to engineer nanoscale devices which operate in a regime where the data is reproducible, it is also difficult to perform these experiments such that the data complies with quantum theory.

In particular, this relates to the decoherence of a system via random interactions with the environment. Thus decoherence is seen not as a physical, intrinsically quantum phenomenon of "worlds drifting apart", but as a property of experiments that are not well isolated from the influence of the environment and therefore not reproducible. Well-isolated experiments are robust (and described by "quantum inference"); poorly isolated experiments are not (hence quantum inference does not apply).

In sum, it appears that quantum physics when viewed as inference does not require many-worlds any more than probability theory does.

To become more rational, rinse your left ear with cold water

3 dvasya 29 May 2013 11:32PM

A recent paper in Cortex describes how caloric vestibular stimulation (CVS), i.e., rinsing of the ear canal with cold water, reduces unrealistic optimism. Here are some bits from the paper:

Participants were 31 healthy right-handed adults (15 men, 20–40 years)...

Participants were oriented in a supine position with the head inclined 30° from the horizontal and cold water (24 °C) was irrigated into the external auditory canal on one side (Fitzgerald and Hallpike, 1942). After both vestibular-evoked eye movements and vertigo had stopped, the procedure was repeated on the other side...

Participants were asked to estimate their own risk, relative to that of their peers (same age, sex and education), of contracting a series of illnesses. The risk rating scale ranged from −6 (lower risk) to +6 (higher risk). ... Each participant was tested in three conditions, with 5 min rest between each: baseline with no CI (always first), left-ear CI and right-ear CI (order counterbalanced). In the latter conditions risk-estimation was initiated after 30 sec of CI, when nystagmic response had built up. Ten illnesses were rated in each condition and the average risk estimate per condition (mean of 10 ratings) was calculated for each participant. The 30 illnesses used in this study (see Table 1) were selected from a larger pool of illnesses pre-rated by a separate group of 30 healthy participants.

Overall, our participants were unrealistically optimistic about their chances of contracting illnesses at baseline ... and during right-ear CI. ... Post-hoc tests using the Bonferroni correction revealed that, compared to baseline, average risk estimates were significantly higher during left-ear CI (p = .016), whereas they remained unchanged during right-ear CI (p = .476). Unrealistic optimism was thus reduced selectively during left-ear stimulation.

(CI stands for caloric irrigation which is how CVS was performed.)

It is not clear how close the participants came to being realistic in their estimates after CVS, but they definitely became more pessimistic, which is the right direction to go in the context of numerous biases such as the planning fallacy.

The paper:

Vestibular stimulation attenuates unrealistic optimism

  • Ryan McKay
  • Corinne Tamagni
  • Antonella Palla
  • Peter Krummenacher
  • Stefan C.A. Hegemann
  • Dominik Straumann
  • Peter Brugger

(paywalled, but a pre-publication version is available)

Risk aversion vs. concave utility function

1 dvasya 31 January 2012 06:25AM

In the comments to this post, several people independently stated that being risk-averse is the same as having a concave utility function. There is, however, a subtle difference here. Consider the example proposed by one of the commenters: an agent with a utility function

u = sqrt(p) utilons for p paperclips.

The agent is being offered a choice between making a bet with a 50/50 chance of receiving a payoff of 9 or 25 paperclips, or simply receiving 16.5 paperclips. The expected payoff of the bet is a full 9/2 + 25/2 = 17 paperclips, yet its expected utility is only 3/2 + 5/2 = 4 = sqrt(16) utilons which is less than the sqrt(16.5) utilons for the guaranteed deal, so our agent goes for the latter, losing 0.5 expected paperclips in the process. Thus, it is claimed that our agent is risk averse in that it sacrifices 0.5 expected paperclips to get a guaranteed payoff.

Is this a good model for the cognitive bias of risk aversion? I would argue that it's not. Our agent ultimately cares about utilons, not paperclips, and in the current case it does perfectly fine at rationally maximizing expected utilons. A cognitive bias should be, instead, some irrational behavior pattern that can be exploited to take utility (rather than paperclips) away from the agent. Consider now another agent, with the same utility function as before, but who just has this small additional trait that it would strictly prefer a sure payoff of 16 paperclips to the above bet. Given our agent's utility function, 16 is the point of indifference, so could there be any problem with its behavior? It turns out there is. For example, we could follow the post on Savage's theorem (see Postulate #4). If the sure payoff of

16 paperclips = 4 utilons

is strictly preferred to the bet

{P(9 paperclips) = 0.5; P(25 paperclips) = 0.5} = 4 utilons,

then there must also exist some finite δ > 0 such that the agent must strictly prefer a guaranteed 4 utilons to betting on

{P(9) = 0.5 - δ; P(25) = 0.5 + δ} = 4 + 2δ utilons

- all at the loss of 2δ expected utilons! This is also equivalent to our agent being willing to pay a finite amount of paperclips to replace the bet with the sure deal of the same expected utility.
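The arithmetic in this example is easy to check in a few lines (an illustrative sketch; the numbers are those from the text):

```python
import math

def u(p):
    # Concave utility: u = sqrt(p) utilons for p paperclips
    return math.sqrt(p)

# The 50/50 bet on 9 or 25 paperclips
bet_expected_paperclips = 0.5 * 9 + 0.5 * 25     # 17 paperclips
bet_expected_utilons = 0.5 * u(9) + 0.5 * u(25)  # 0.5*3 + 0.5*5 = 4 utilons

assert bet_expected_paperclips == 17
assert bet_expected_utilons == 4 == u(16)  # 16 paperclips is the indifference point
# The rational agent prefers the sure 16.5 paperclips: more expected utilons
assert u(16.5) > bet_expected_utilons

# The risk-averse agent, by contrast, forgoes the tilted bet worth
# 4 + 2*delta utilons for a sure 4 utilons, losing 2*delta expected utilons
delta = 0.1
tilted = (0.5 - delta) * u(9) + (0.5 + delta) * u(25)
assert abs(tilted - (4 + 2 * delta)) < 1e-12
```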

What we have just seen falls pretty nicely within the concept of a bias. Our agent has a perfectly fine utility function, but it also has this other thing - let's name it "risk aversion" - that makes the agent's behavior fall short of being perfectly rational, and that is independent of its concave utility function for paperclips. (Note that our agent has linear utility for utilons, but is still willing to pay some amount of those to achieve certainty.) Can we somehow fix our agent? Let's see if we can redefine its utility function u'(p) in some way so that it gives us a consistent preference of

guaranteed 16 paperclips

over the

 {P(9) = 0.5; P(25) = 0.5}

bet, but we would also like to request that the agent would still strictly prefer the bet

{P(9 + δ) = 0.5; P(25 + δ) = 0.5}

to {P(16) = 1} for some finite δ > 0, so that our agent is not infinitely risk-averse. Can we say anything about this situation? Well, if u'(p) is continuous, there must also exist some number δ' such that 0 < δ' < δ and our agent will be indifferent between {P(16) = 1} and

{P(9 + δ') = 0.5; P(25 + δ') = 0.5}.

And, of course, being risk-averse (in the above-defined sense), our supposedly rational agent will prefer - no harm done - the guaranteed payoff to the bet of the same expected utility u'... Sounds familiar, doesn't it?

I would like to stress again that, although our first agent does have a concave utility function for paperclips, which causes it to reject bets with some expected payoff of paperclips in favor of guaranteed payoffs of fewer paperclips, it still maximizes its expected utilons, for which it has linear utility. Our second agent, however, has this extra property that causes it to sacrifice expected utilons to achieve certainty. And it turns out that with this property it is impossible to define a well-behaved utility function! Therefore it seems natural to distinguish being rational with a concave utility function, on the one hand, from being risk-averse and not being able to have a well-behaved utility function at all, on the other. The latter case seems much more subtle at first sight, but causes a more fundamental kind of problem. This is why I feel that a clear, even if minor, distinction between the two situations is still worth making explicit.

A rational agent can have a concave utility function. A risk-averse agent cannot be rational.

(Of course, even in the first case the question of whether we want a concave utility function is still open.)

[link] Innocentive challenge: $8000 for examples promoting altruistic behavior

4 dvasya 20 December 2011 09:17PM

A challenge recently posted on Innocentive seemed to me like something that may interest many LWers: "Models Motivating and Supporting Altruism Within Communities", with a grand prize of $8000. To quote from the challenge:

We are interested in looking at novel concepts from nature, business, or other areas that may elucidate the dynamics that help promote and maintain altruistic behaviors.

Further details are available on innocentive.com. I think that it would be a nice opportunity for our LW decision theory experts.

[For anybody who decides to participate: the links I provided contain a referral string so that, in case you win a prize, I can match your donation to the SIAI with the same fraction of my referral award ;) Please use them to register.]


Review article on Bayesian inference in physics

6 dvasya 19 September 2011 11:45PM

A nice article just appeared in Reviews of Modern Physics. It offers a brief coverage of the fundamentals of Bayesian probability theory, the practical numerical techniques, a diverse collection of real-world examples of applications of Bayesian methods to data analysis, and even a section on Bayesian experimental design. The PDF is available here.

The abstract:

Rev. Mod. Phys. 83, 943–999 (2011)

Bayesian inference in physics

Udo von Toussaint* 
Max-Planck-Institute for Plasmaphysics, Boltzmannstrasse 2, 85748 Garching, Germany

Received 8 December 2009; published 19 September 2011

Bayesian inference provides a consistent method for the extraction of information from physics experiments even in ill-conditioned circumstances. The approach provides a unified rationale for data analysis, which both justifies many of the commonly used analysis procedures and reveals some of the implicit underlying assumptions. This review summarizes the general ideas of the Bayesian probability theory with emphasis on the application to the evaluation of experimental data. As case studies for Bayesian parameter estimation techniques examples ranging from extra-solar planet detection to the deconvolution of the apparatus functions for improving the energy resolution and change point estimation in time series are discussed. Special attention is paid to the numerical techniques suited for Bayesian analysis, with a focus on recent developments of Markov chain Monte Carlo algorithms for high-dimensional integration problems. Bayesian model comparison, the quantitative ranking of models for the explanation of a given data set, is illustrated with examples collected from cosmology, mass spectroscopy, and surface physics, covering problems such as background subtraction and automated outlier detection. Additionally the Bayesian inference techniques for the design and optimization of future experiments are introduced. Experiments, instead of being merely passive recording devices, can now be designed to adapt to measured data and to change the measurement strategy on the fly to maximize the information of an experiment. The applied key concepts and necessary numerical tools which provide the means of designing such inference chains and the crucial aspects of data fusion are summarized and some of the expected implications are highlighted.

© 2011 American Physical Society

Rationality Quotes September 2011

7 dvasya 02 September 2011 07:38AM

Here's the new thread for posting quotes, with the usual rules:

  • Please post all quotes separately, so that they can be voted up/down separately.  (If they are strongly related, reply to your own comments.  If strongly ordered, then go ahead and post them together.)
  • Do not quote yourself.
  • Do not quote comments/posts on LW/OB.
  • No more than 5 quotes per person per monthly thread, please.

Rationality Quotes August 2011

3 dvasya 02 August 2011 08:24PM

Here's the new quotes thread, for all those quotes you were going to post.

Rules:

  • Please post all quotes separately, so that they can be voted up/down separately.  (If they are strongly related, reply to your own comments.  If strongly ordered, then go ahead and post them together.)
  • Do not quote yourself.
  • Do not quote comments/posts on LW/OB.
  • No more than 5 quotes per person per monthly thread, please.

A study in Science on memory conformity

8 dvasya 15 July 2011 05:30PM

I believe this may be a good addition to the cognitive bias literature:

Following the Crowd: Brain Substrates of Long-Term Memory Conformity

  • Micah Edelson (1,*)
  • Tali Sharot (2)
  • Raymond J. Dolan (2)
  • Yadin Dudai (1)

1. Department of Neurobiology, Weizmann Institute of Science, Israel.
2. Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, London, UK.

ABSTRACT

Human memory is strikingly susceptible to social influences, yet we know little about the underlying mechanisms. We examined how socially induced memory errors are generated in the brain by studying the memory of individuals exposed to recollections of others. Participants exhibited a strong tendency to conform to erroneous recollections of the group, producing both long-lasting and temporary errors, even when their initial memory was strong and accurate. Functional brain imaging revealed that social influence modified the neuronal representation of memory. Specifically, a particular brain signature of enhanced amygdala activity and enhanced amygdala-hippocampus connectivity predicted long-lasting but not temporary memory alterations. Our findings reveal how social manipulation can alter memory and extend the known functions of the amygdala to encompass socially mediated memory distortions.

http://www.sciencemag.org/content/333/6038/108.full

http://ifile.it/v76wsi5