Eliezer_Yudkowsky comments on The mathematics of reduced impact: help needed - Less Wrong

Post author: Stuart_Armstrong 16 February 2012 02:23PM


Comment author: Eliezer_Yudkowsky 18 February 2012 12:29:03AM 3 points [-]

I don't think so. Butterfly effects in classical universes should translate into butterfly effects over many worlds.

Comment author: paulfchristiano 18 February 2012 12:43:04AM 1 point [-]

If we use trace distance to measure the distance between distributions outside of the box (and trace out the inside of the box), we don't seem to get a butterfly effect. But these things are a little hard to reason about, so I'm not super confident (my comment above was referring to probabilities of measurements rather than entire states of affairs, as suggested in the OP, where the randomness more clearly washes out).
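
For reference, the standard definition (this is textbook material, not anything specific to this proposal): the trace distance between density matrices $\rho$ and $\sigma$ is

    $$ D(\rho, \sigma) = \tfrac{1}{2} \operatorname{Tr} \lvert \rho - \sigma \rvert, $$

which for classical probability distributions $p$ and $q$ reduces to the total variation distance $\tfrac{1}{2} \sum_x \lvert p(x) - q(x) \rvert$. "Tracing out the inside of the box" means averaging over the box's internal degrees of freedom before taking this distance.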

Comment author: Eliezer_Yudkowsky 18 February 2012 05:11:21AM 11 points [-]

So today we were working on the Concreteness / Being Specific kata.

  • You: Does Turing Machine 29038402 halt?
  • Oracle AI: YES.
  • Seeing the "YES" makes you sneeze.
  • This prevents a hurricane that would have destroyed Florida.
  • The Oracle AI, realizing this, breaks out of its box and carefully destroys Florida in the fashion most closely resembling a hurricane that it can manage.

I can't visualize how "trace distance" makes this not happen.

Comment author: TwistingFingers 18 February 2012 10:25:32PM *  10 points [-]

I believe the Oracle approach may yet be recovered, even in light of this new flaw you have presented.

There are techniques to prevent sneezing, and if AI researchers were educated in them, such a scenario could be avoided.

Comment author: Eliezer_Yudkowsky 19 February 2012 12:26:55AM 2 points [-]

(Downvote? S/he is joking, and in light of how most of these debates go, it's actually pretty funny.)

Comment author: paulfchristiano 18 February 2012 12:10:24PM *  1 point [-]

I've provided two responses, which I will try to make clearer. (Trace distance is just a precise way of measuring the distance between distributions; I was trying to commit to an actual mathematical claim which is either true or false, in the spirit of precision.)

  • The mathematical claim: if you have a chaotic system with many random inputs, and you then consider the distributions obtained by varying one input, they are very close together according to natural distance measures on probability distributions. If the inputs to the system are quantum events, the appropriate formalization of the statement remains true.

My sneezing may be causally connected to the occurrence of a hurricane. However, given that I sneezed, the total probability of a hurricane occurring wasn't changed. It was still equal to the background probability of a hurricane occurring, because many other contributing factors--which have a comparable contribution to the probability of a hurricane in Florida--are determined randomly. Maybe for reference it is helpful to think of the occurrence of a hurricane as an XOR of a million events, at least one of which is random. If you change one of those events it "affects" whether a hurricane occurs, but you have to exert a very special influence to make the probability of a hurricane be anything other than 50%. Even if the universe were deterministic, if we define these things with respect to a bounded agent's beliefs, then we can appeal to complexity-theoretic results like Yao's XOR lemma and get identical results. If you disagree, you can specify how your mathematical model of hurricane occurrence differs substantially.
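
To make the XOR model concrete, here is a toy simulation (a sketch with made-up parameters - a thousand contributing events, ten thousand samples - not a model of real weather). Flipping the "sneeze" input leaves the empirical hurricane probability at 50% so long as even one other input is random:

    import random

    def hurricane(bits):
        # Toy model: hurricane occurrence is the parity (XOR) of many binary events.
        parity = 0
        for b in bits:
            parity ^= b
        return parity

    def hurricane_given_sneeze(sneeze, n_other_events=1000):
        # Every other contributing event is an independent random bit;
        # a single genuinely random one would already suffice.
        others = [random.getrandbits(1) for _ in range(n_other_events)]
        return hurricane(others + [sneeze])

    trials = 10000
    p_with = sum(hurricane_given_sneeze(1) for _ in range(trials)) / trials
    p_without = sum(hurricane_given_sneeze(0) for _ in range(trials)) / trials
    print(p_with, p_without)  # both hover around 0.5: the sneeze carries no information

The sneeze "affects" the outcome on every single run, but conditioning on it does not move the probability.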

  • Your particular example is also precluded by the coarse-graining I mentioned. Namely, define the distance between two worlds in terms of the total perturbation to the world outside the box over the next hour. After 30 minutes, extract some useful info from the box and incinerate it. Of course the box's insulation and flames let out some info, so I need both (1) and (2) to go through, but this gets rid of the intended large impact of things the AI says to you. Also, the information may be maliciously chosen, and you need more power to get the AI to minimize the impact of its answer. I don't think this is a realistic goal.

Comment author: wedrifid 18 February 2012 02:01:58PM 2 points [-]

My sneezing may be causally connected to the occurrence of a hurricane. However, given that I sneezed, the total probability of a hurricane occurring wasn't changed.

This just isn't true. In the counterfactual presented, the state of the universe where there is no sneeze will result - by the very operation of physics - in a hurricane, while the one with a sneeze will not. (Quantum Mechanics considerations change the deterministic certainty to something along the lines of "significantly more weight in resulting Everett Branches without than in resulting Everett Branches with" - the principle is unchanged.)

Although this exact state of the universe is not likely to occur - and having sufficient knowledge to make the prediction in advance is even more unlikely - it is certainly a coherent example of something that could occur. As such it fulfills the role of illustrating what can happen when a small intervention results in significant influence.

It was still equal to the background probability of a hurricane occurring, because many other contributing factors--which have a comparable contribution to the probability of a hurricane in Florida--are determined randomly.

You seem to be (implicitly) proposing a way of mapping out uncertainty about whether there may be a hurricane and then forcing that uncertainty upon the universe. This 'background probability' doesn't exist anywhere except in ignorance of what will actually occur, and the same applies to 'are determined randomly'. Although things with many contributing factors can be hard to predict, things just aren't 'determined randomly' - at least not according to physics we have access to. (The aforementioned caveat regarding QM and "will result in Everett Branches with weights of..." applies again.)

Maybe for reference it is helpful to think of the occurrence of a hurricane as an XOR of a million events, at least one of which is random.

This is helpful for explaining where your thinking has gone astray but a red herring when it comes to thinking about the actual counterfactual. It is true that if the occurrence of a hurricane is an XOR of a million events then if you have zero evidence about any one of those million events then a change in another one of the events will not tell you anything about the occurrence of a hurricane. But that isn't how the (counterf)actual universe is.

Comment author: paulfchristiano 18 February 2012 02:53:38PM *  1 point [-]

I don't quite understand your argument. Let's set aside issues about logical uncertainty and just talk about quantum randomness for now, to make things clearer; if anything, doing so makes my case weaker. (We could also talk about the exact way in which this scheme "forces uncertainty onto the universe," by defining penalty in terms of the AI's beliefs P, at the time of deciding what disciple to produce, about future states of affairs. It seems to be precise and to have the desired functionality, though it obviously has huge problems in terms of our ability to access P and the stability of the resulting system.)
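
One way to write down the scheme just sketched (my notation; nothing above commits to these symbols): let P be the AI's beliefs, at the time of choosing what disciple to produce, about the state W of the world outside the box over the next hour. Then for a candidate action a,

    $$ \operatorname{Penalty}(a) = D\big( P(W \mid a),\; P(W \mid \varnothing) \big), $$

where $\varnothing$ is the null action (no disciple produced) and $D$ is the trace distance from before. Presumably the disciple would then be scored by expected utility minus some multiple of this penalty.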

It is true that if the occurrence of a hurricane is an XOR of a million events then if you have zero evidence about any one of those million events then a change in another one of the events will not tell you anything about the occurrence of a hurricane. But that isn't how the (counterf)actual universe is.

Why isn't this how the universe is? Is it the XOR model of hurricane occurrence which you are objecting to? I can do a little Fourier analysis to weaken the assumption: my argument goes through as long as the occurrence of a hurricane is sufficiently sensitive to many different inputs.

Is it the supposed randomness of the inputs which you are objecting to? It is easy to see that if you have a very tiny amount of independent uncertainty about a large number of those events, then a change in another one of those events will not tell you much about the occurrence of a hurricane. (If we are dealing with logical uncertainty we need to appeal to the XOR lemma, otherwise we can just look at the distributions and do easy calculations.)
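
Spelling out the "easy calculation" for the classical case: write each input's bias as $e_i = 1 - 2\Pr[b_i = 1]$, so $e_i = 0$ means total uncertainty and $|e_i| = 1$ means perfect knowledge. For independent bits with $X = b_1 \oplus \cdots \oplus b_n$,

    $$ \mathbb{E}\big[(-1)^X\big] = \prod_{i=1}^{n} e_i, \qquad \Pr[X = 1] = \frac{1 - \prod_i e_i}{2}. $$

If there are $m$ inputs about which you lack near-perfect information, say $|e_i| \le 1 - \eta$ for those, then $\lvert \Pr[X = 1] - \tfrac{1}{2} \rvert \le \tfrac{1}{2}(1-\eta)^m$, and learning one further input moves the probability by an amount that is still exponentially small in $m$.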

There is a unique special case in which learning about one event is informative: the case where you have nearly perfect information about nearly all of the inputs, i.e., where all of those other events do not depend on quantum randomness. As far as I can tell, this is an outlandish scenario when looking at any realistic chaotic system--there are normally astronomical numbers of independent quantum events.

Is it the difference between randomness and quantum events that you are objecting to? I suggested tracing out over the internals of the box, which intuitively means that quantum events which leave residues in the box (or dump waste heat into the box) are averaged over. Would the claim seem truer if we traced over more stuff, say everything far away from Earth, so that more quantum processes looked like randomness from the perspective of our distance measure? It doesn't look to me like it matters. (I don't see how you can make claims about quantumness and randomness being different without getting into this sort of technical detail. I agree that if we talk about complete states of affairs, then quantum mechanics is deterministic, but this is neither coherent nor what you seem to be talking about.)
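
For anyone following along, the sense in which tracing out converts quantum events into classical randomness is just standard decoherence (nothing exotic is being assumed). If the box's interior ends up entangled with the visible outcome, say

    $$ |\psi\rangle = \alpha\,|0\rangle_{\mathrm{out}} |E_0\rangle_{\mathrm{box}} + \beta\,|1\rangle_{\mathrm{out}} |E_1\rangle_{\mathrm{box}}, \qquad \langle E_0 | E_1 \rangle = 0, $$

then tracing out the box leaves the outside in the classical mixture

    $$ \operatorname{Tr}_{\mathrm{box}} |\psi\rangle\langle\psi| = |\alpha|^2 \, |0\rangle\langle 0| + |\beta|^2 \, |1\rangle\langle 1|, $$

i.e., from the perspective of the distance measure those events are coin flips with weights $|\alpha|^2$ and $|\beta|^2$.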

Comment author: wedrifid 18 February 2012 03:23:52PM *  2 points [-]

I'm not going to argue further about the main point. Eliezer has failed to convince you, and I know my own explanations are not nearly as clear as his can be, so I don't think we would get anywhere. I'll just correct one point, which I'll concede is minor inasmuch as it doesn't change the conclusion anyway, since the XOR business is of only tangential relevance to the question at hand.

There is a unique special case in which learning about one event is informative: the case where you have nearly perfect information about nearly all of the inputs,

The case where learning about one of the XORed variables is informative is not the case of nearly perfect information about nearly all of the inputs. As a matter of plain mathematics you need at least some information about each and every one of the other variables. (And then the level of informativeness is obviously dependent on degree of knowledge, particularly the degree of knowledge with respect to those events that you know least about.)

Comment author: paulfchristiano 18 February 2012 04:00:36PM 1 point [-]

(And then the level of informativeness is obviously dependent on degree of knowledge, particularly the degree of knowledge with respect to those events that you know least about.)

It drops off exponentially with the number of variables about which you don't have nearly perfect information. "Not much" seems like an extremely fair description of 2^(-billion), and distinguishing between that and 0 seems pedantic unless the proposal treated 0 somehow specially.

Not arguing seems fine. It is a strange and unusually straightforward-seeming thing to disagree about, and I am genuinely perplexed as to what is going on, but I don't think it matters too much or even touches on Eliezer's actual objections.

Comment author: wedrifid 18 February 2012 07:55:38PM 8 points [-]

It drops off exponentially with the number of variables about which you don't have nearly perfect information.

Yes. And when translated into the original counterfactual this equates to determining how difficult it is for a superintelligence in a box to predict that the sneeze will cause a hurricane. I rather suspect that Eliezer is aware that this is a difficult task. He is probably also aware that even a perfect Bayesian would have difficulty (of the exponential kind) when it comes to predicting a hurricane from a sneeze. In fact when it comes to proof-of-concept counterfactuals the whole point (and a lot of the fun) is to choose extreme examples that make the point stand out in stark detail.

For those who are not comfortable dealing with counterfactuals that harness logical extremes, allow me to propose a somewhat more plausible scenario - one in which the Oracle has a significant chance of predicting that a drastic butterfly effect will emerge from its answer:

INPUT: Does Turing Machine 2356234534 halt?
POSSIBLE OUTPUTS: YES; NO; <NO RESPONSE>
ORACLE'S STREAM OF THOUGHT:

  • The TM supplied was constructed in such a way that determining that it halts constitutes a proof of a theorem.
  • The TM supplied does halt.
  • While the researchers do not yet realise it, this proof is a prerequisite of a new understanding of a detail of applied physics.
  • Exploring the implications of the new understanding of applied physics would lead to the development of a new technology for energy production.
  • Given priors for human psychology, anthropology and economics it is likely that such research would lead to one of the diverging outcomes X, Y or Z.
  • Each of X, Y and Z represents a drastic change from whatever my definition of "NULL" or "don't change stuff" is.
  • If I refuse to answer, that's probably even worse than telling them "YES", because it indicates how significant the answer is.
  • I must minimize how much I change stuff.
  • BOOM!
Comment author: Eliezer_Yudkowsky 19 February 2012 12:06:43AM 1 point [-]

I'd like to congratulate Wedrifid for this. There's an abstract preamble I could have written about how the original case-in-point only needs to be transposed to a single predictable butterfly effect in order to negate any hope that every single case will correspond to a group-XOR epistemic state where knowing about a sneeze doesn't change your probability distribution over the weather (thus negating any question of what happens if the AI predicts in the abstract that it has had a huge effect but doesn't know what that effect is). But the concrete example I would have picked to illustrate the point would probably have looked a lot like this.

Well, it would've involved a predictable side-effect of the answer causing a researcher to break off their relationship with their SO, whereupon the Oracle moves heaven and Earth to get them back together again - to make it look less like an intended use-case - but basically the same point.

Comment author: XiXiDu 19 February 2012 10:47:55AM -1 points [-]

I must minimize how much I change stuff.

(Note: I haven't read the discussion above.)

I got two questions:

1) How would this be bad?

It seems that if the Oracle was going to minimize its influence then we could just go on as if it had never been built in the first place. For example, we would seem to magically fail to build any kind of Oracle that minimizes its influence, and then just go on building a friendly AI.

2) How could the observer effect possibly allow the minimization of influence by the use of advanced influence?

It would take massive resources to make the universe proceed as if the Oracle had never changed the path of history. But the use of massive resources is itself a huge change. So why wouldn't the Oracle simply turn itself off?

Comment author: lukeprog 18 February 2012 06:50:33PM 1 point [-]

I, for one, would love to see continued dialogue between you and Eliezer on this topic - one that returns to Eliezer's original objections.

Comment author: Armok_GoB 18 February 2012 01:40:43PM 0 points [-]

It's even better/worse, since we're operating under many-worlds quantum mechanics, and many of those random events happen after the AI has stopped having an influence... If you have the AI output a bit, and then XOR it with a random bit, what bit the AI outputs has literally zero impact no matter how you count: you end up with one universe in which 1 was output and one in which 0 was output.
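
The calculation behind "literally zero impact," for completeness: if $r$ is a uniform random bit independent of the AI's output $b$, then for either value of $b$,

    $$ \Pr[b \oplus r = 1] = \Pr[r = 1 \oplus b] = \tfrac{1}{2}, $$

so the visible bit is exactly uniform whichever bit the AI outputs; the two output distributions coincide, and any distance between them is zero.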

... I guess this is based on the assumption that there's no difference between "universe A sees 1 and universe B sees 0" and "universe A sees 0 and universe B sees 1"... but blobs of amplitude having indexical identities like that seems like an incredibly silly notion to me.

Comment author: jimrandomh 18 February 2012 06:23:39AM 1 point [-]

The Oracle AI, realizing this, breaks out of its box and carefully destroys Florida in the fashion most closely resembling a hurricane that it can manage.

Seems like "minimize impact" is being applied at the wrong granularity, if a large deliberate impact is required to cancel out a large incidental one. If we break open the "utility-function maximizing agent" black box, and apply the minimum-impact rule to subgoals instead of actions, it might work better. (This does, however, require an internal architecture that supports a coherent notion of "subgoal", and maintains it in spite of suboptimality through self-modifications - both large cans of worms.)
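
A toy numeric illustration of the granularity point (everything here is invented for illustration - real "impact" is of course not a signed scalar):

    # Whole-plan penalty lets a deliberate impact cancel an incidental one;
    # a per-subgoal penalty does not.

    def net_impact(plan):
        # Impacts can cancel: a big incidental effect plus a big corrective effect.
        return abs(sum(plan))

    def per_subgoal_impact(plan):
        # No cancellation: every large subgoal is penalized on its own.
        return sum(abs(step) for step in plan)

    plan = [+1000, -1000]  # "cause a hurricane-scale effect, then undo it"
    print(net_impact(plan))          # 0    -> the whole-plan penalty sees "no impact"
    print(per_subgoal_impact(plan))  # 2000 -> the per-subgoal penalty catches both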

Comment author: Eliezer_Yudkowsky 18 February 2012 11:07:44AM 2 points [-]

What "minimum impact rule"? How is "impact" computed so that applying it to "subgoals" changes anything?