gRR comments on Decision Theories: A Less Wrong Primer - Less Wrong

Post author: orthonormal, 13 March 2012 11:31PM (69 points)

Comment author: gRR 12 March 2012 12:07:38PM *  4 points [-]

Thanks for the recap. It still doesn't answer my question, though:

If X is a causal decision theorist, the choice is clear: whatever Omega decided, it decided already

This appears to be incorrect if the CDT agent knows that Omega always makes correct predictions.

the problem looks much the same if Omega has a 90% success rate rather than 100%.

And this appears to be incorrect in all cases. The right decision depends on the exact nature of the noise. If Omega makes its decision by analyzing psychological tests the agent took in childhood, then the agent should two-box. And if Omega runs a perfect simulation and then adds random noise, the agent should one-box.

Comment author: cousin_it 12 March 2012 01:29:57PM 1 point [-]

If Omega makes the decision by analyzing the agent's psychological tests taken in childhood, then the agent should two-box.

Sorry, could you explain this in more detail?

Comment author: gRR 12 March 2012 02:22:08PM *  0 points [-]

Hmm, I'm not sure this is an adequate formalization, but:

Let's assume there is an evolved population of agents. Each agent has an internal parameter p, 0<=p<=1, and implements a decision procedure p*CDT + (1-p)*EDT. That is, given a problem, the agent tosses a pseudorandom p-biased coin and decides according to either CDT or EDT, depending on the result of the toss.

Assume further that there is a test set of a hundred binary decision problems, and that Omega knows the test results for every agent and nothing else about them. Then Omega can estimate
P(agent's p = q | test results)
and predict "two box" if the maximum likelihood estimate of p is > 1/2 and "one box" otherwise. [Here I assume for the sake of argument that CDT always two-boxes.]

Given the right distribution of p's in the population, Omega can be made to predict with any given accuracy. Yet there appears to be no reason to one-box...
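This setup can be turned into a small simulation (an editorial sketch with made-up numbers, not part of the original comment — the population distribution of p and the test size are illustrative assumptions): Omega sees only each agent's hundred test answers, estimates p by maximum likelihood, and predicts the agent's statistically more likely choice.

```python
import random

def simulate(num_agents=10_000, num_tests=100, seed=0):
    """Omega predicts from test results alone; its accuracy depends only
    on the population's distribution of p, with no causal link to the
    agent's actual choice in Newcomb's problem."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(num_agents):
        # Hypothetical population skewed toward extreme p, so that the
        # hundred test answers are highly informative about p.
        p = rng.choice([0.05, 0.95])
        # Each test answer: CDT-style (two-box) with probability p.
        two_box_answers = sum(rng.random() < p for _ in range(num_tests))
        # Maximum-likelihood estimate of p: the fraction of CDT-style answers.
        p_hat = two_box_answers / num_tests
        prediction = "two-box" if p_hat > 0.5 else "one-box"
        # The agent's actual Newcomb decision: an independent p-biased toss.
        actual = "two-box" if rng.random() < p else "one-box"
        correct += prediction == actual
    return correct / num_agents

print(simulate())  # well above 0.9 for this population
```

With this population, Omega is right roughly 95% of the time, even though its prediction is a statistical summary of past tests rather than anything the agent's present choice can influence.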

Comment author: cousin_it 12 March 2012 02:38:06PM *  0 points [-]

Wait, are you deriving the uselessness of UDT from the fact that the population doesn't contain UDT? That looks circular, unless I'm missing something...

Comment author: gRR 12 March 2012 02:58:03PM *  0 points [-]

Err, no, I'm not deriving the uselessness of either decision theory here. My point is that only the "pure" Newcomb's problem, where Omega always predicts correctly and the agent knows it, is well-defined. The "noisy" problem, where Omega is known to sometimes guess wrong, is underspecified. The correct solution (that is, whether one-boxing or two-boxing is the utility-maximizing move) depends on exactly how and why Omega makes mistakes. Simply saying "probability 0.9 of correct prediction" is insufficient.

But in the "pure" Newcomb's problem, it seems to me that CDT would actually one-box, reasoning as follows:
1. Since Omega always predicts correctly, I can assume that it makes its predictions using a full simulation.
2. Then this situation in which I find myself now (making the decision in Newcomb's problem) can be either outside or within the simulation. I have no way to know, since it would look the same to me either way.
3. Therefore I should decide assuming 1/2 probability that I am inside Omega's simulation and 1/2 that I am outside.
4. So I one-box.
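Under that reasoning, and assuming the standard payoffs ($1M in the opaque box, $1K in the transparent one — these figures are the conventional ones, not taken from this comment), the comparison is easy to sketch for an agent that cares about what its non-simulated instance receives:

```python
# Both the simulated and the real instance run the same algorithm, so
# they necessarily make the same choice; the real instance's payoff is
# therefore fixed by that one choice (hypothetical standard payoffs).
payoff_if_one_box = 1_000_000  # simulated copy one-boxes too, so the box is filled
payoff_if_two_box = 1_000      # simulated copy two-boxes, so the box is empty

assert payoff_if_one_box > payoff_if_two_box
```

Note that this accounting only goes through for a utility function over the real world; that caveat is exactly the one orthonormal raises later in the thread.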

Comment author: XiXiDu 12 March 2012 02:27:18PM *  0 points [-]

If Omega makes the decision by analyzing the agent's psychological tests taken in childhood, then the agent should two-box.

Sorry, could you explain this in more detail?

Humans are time-inconsistent decision makers. Why would Omega choose to fill the boxes according to a certain point in configuration space rather than some average measure? Most of your life you would have two-boxed, after all. Therefore, if Omega were to predict whether you (a space-time worm) will take both boxes or not when it meets you at an arbitrary point in configuration space, it might predict that you are going to two-box, even if for the short remainder of your life you would consistently choose to one-box.

ETA: It doesn't really matter when a superintelligence meets you. What matters is for how long a period you adopted which decision procedure, and what kind of exploitation you were susceptible to. Even if you only changed your mind about a decision procedure for 0.01% of your life, it might still be worth acting on that acausally.

Comment author: Gabriel 12 March 2012 03:03:19PM *  7 points [-]

I think the idea is that even if Omega always predicted two-boxing, it still could be said to predict with 90% accuracy if 10% of the human population happened to be one-boxers. And yet you should two-box in that case. So basically, the non-deterministic version of Newcomb's problem isn't specified clearly enough.
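Gabriel's scenario can be checked with a few lines (a hypothetical sketch, assuming the conventional $1M/$1K payoffs):

```python
# Omega always predicts "two-box", so the opaque box is always empty.
# Against a population that one-boxes 10% of the time, this constant
# prediction is nonetheless correct 90% of the time.
one_boxer_fraction = 0.10
accuracy = 1 - one_boxer_fraction
assert accuracy == 0.9

# Payoffs for an individual agent facing this "90% accurate" Omega:
opaque = 0            # always empty, regardless of the agent's choice
transparent = 1_000
payoff_one_box = opaque                # $0
payoff_two_box = opaque + transparent  # $1,000
assert payoff_two_box > payoff_one_box  # two-boxing dominates here
```

So a bare "90% accuracy" figure is compatible with situations where two-boxing is clearly correct, which is the sense in which the non-deterministic problem is underspecified.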

Comment author: cousin_it 12 March 2012 03:09:11PM 0 points [-]

Very nice, thanks!

Comment author: gRR 12 March 2012 03:33:14PM 1 point [-]

Far better explanation than mine, thanks!

Comment author: orthonormal 12 March 2012 04:22:00PM 0 points [-]

Good point. I don't think this is worth going into within this post, but I introduced a weasel word to signify that the circumstances of a 90% Predictor do matter.

Comment author: ksvanhorn 13 March 2012 06:47:23PM 2 points [-]

I disagree. To be at all meaningful to the problem, the "90% accuracy" has to mean that, given all the information available to you, you assign a 90% probability to Omega correctly predicting your choice. This is quite different from correctly predicting the choices of 90% of the human population.

Comment author: drnickbone 13 March 2012 07:37:03PM 0 points [-]

I don't think this works in the example given, where Omega always predicts 2-boxing. We agree that the correct thing to do in that case is to 2-box. And if I've decided to 2-box then I can be > 90% confident that Omega will predict my personal actions correctly. But this still shouldn't make me 1-box.

I've commented on Newcomb in previous threads... in my view it really does matter how Omega makes its predictions, and whether they are perfectly reliable or just very reliable.

Comment author: jimmy 14 March 2012 06:18:16PM *  0 points [-]

Agreed for that case, but perfect reliability still isn't necessary (consider a 99.99% accurate Omega and 10% one-boxers, for example).

What matters is that your uncertainty in Omega's prediction is tied to your uncertainty in your actions. If you're 90% confident that Omega gets it right conditional on deciding to one-box, and 90% confident that Omega gets it right conditional on deciding to two-box, then you should one-box (0.9 * $1M > $1K + 0.1 * $1M).
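jimmy's inequality, spelled out with the conventional $1M/$1K payoffs:

```python
acc = 0.9  # P(Omega predicts correctly | either decision)

# One-box: the opaque box is full whenever Omega predicted correctly.
eu_one_box = acc * 1_000_000 + (1 - acc) * 0

# Two-box: with probability acc, Omega predicted two-boxing (box empty);
# with probability 1 - acc, it wrongly predicted one-boxing (box full).
eu_two_box = acc * 1_000 + (1 - acc) * (1_000_000 + 1_000)

print(eu_one_box)  # approx. $900,000
print(eu_two_box)  # approx. $101,000
assert eu_one_box > eu_two_box
```

The key assumption is that the 90% figure holds conditional on each decision separately, which rules out Gabriel's constant-prediction counterexample.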

Comment author: Gabriel 12 March 2012 02:58:50PM -1 points [-]

If X is a causal decision theorist, the choice is clear: whatever Omega decided, it decided already

This appears to be incorrect if the CDT agent knows that Omega always makes correct predictions.

If a CDT agent A is told about the problem before Omega makes its prediction and fills the boxes, then A will want to stop being a CDT agent for the duration of the experiment. Maybe that's what you mean?

Comment author: gRR 12 March 2012 03:37:18PM 0 points [-]

No, I mean I think CDT can one-box within the regular Newcomb's problem situation, if its reasoning capabilities are sufficiently strong. In detail: here and in the thread here.

Comment author: orthonormal 12 March 2012 04:27:18PM 0 points [-]

And as I replied there, this depends on its utility function being such that "filling the box for my non-simulated copy" has utility comparable to "taking the extra box when I'm not simulated". There are utility functions for which this works (e.g. maximizing paperclips in the real world) and utility functions for which it doesn't (e.g. maximizing hedons in my personal future, whether I'm being simulated or not), and Omega can slightly change the problem (simulate an agent with the same decision algorithm as X but a different utility function) in a way that makes CDT two-box again. (That trick wouldn't stop TDT/UDT/ADT from one-boxing.)

Comment author: gRR 12 March 2012 05:18:09PM 1 point [-]

I think you missed my point.

Omega can slightly change the problem (simulate an agent with the same decision algorithm as X but a different utility function)

This is irrelevant. The agent is actually outside, thinking about what to do in Newcomb's problem. But only we know this; the agent itself doesn't. All the agent knows is that Omega always predicts correctly. Which means the agent can model Omega as a perfect simulator. The actual method that Omega uses to make predictions does not matter; the world would look the same to the agent regardless.

Comment author: orthonormal 13 March 2012 04:47:52AM 1 point [-]

Unless Omega predicts without simulating: for instance, this formulation of UDT can be formally proved to one-box without simulating.

Comment author: gRR 13 March 2012 07:32:28AM 0 points [-]

Errrr. The agent does not simulate anything in my argument. The agent has a "mental model" of Omega, in which Omega is a perfect simulator. It's about the representation of the problem within the agent's mind.

In your link, Omega (the function U()) is a perfect simulator. It calls the agent function A() twice: once to get its prediction, and once for the actual decision.
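That world-program structure can be sketched as follows (a simplified, hypothetical rendering of the call-the-agent-twice formulation, not the linked post's actual code; the payoffs are the conventional $1M/$1K):

```python
def world(agent):
    """Newcomb's problem as a world program that calls the agent twice:
    once as Omega's 'prediction', once for the actual decision."""
    prediction = agent()                            # Omega's perfect simulation
    opaque = 1_000_000 if prediction == 1 else 0    # fill the box per prediction
    choice = agent()                                # the agent's real decision
    return opaque if choice == 1 else opaque + 1_000

def one_boxer():
    return 1  # 1 = take only the opaque box

def two_boxer():
    return 2  # 2 = take both boxes

print(world(one_boxer))  # 1000000
print(world(two_boxer))  # 1000
```

Since the same function is called for both the prediction and the decision, the prediction is correct by construction, which is gRR's point about U() acting as a perfect simulator.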

Comment author: orthonormal 13 March 2012 09:37:12PM 0 points [-]

The problem would work as well if the first call went not to A directly but to an oracle queried about whether A()=1. There are ways of predicting that aren't simulation, and if that's the case, then your idea falls apart.

Comment author: wedrifid 13 March 2012 05:13:48AM 2 points [-]

No, I mean I think CDT can one-box within the regular Newcomb's problem situation, if its reasoning capabilities are sufficiently strong. In detail: here and in the thread here.

No, if you have an agent that is one-boxing, then either it is not a CDT agent or the game it is playing is not Newcomb's problem. More specifically, in your first link you describe a game that is not Newcomb's problem, and in the second link you describe an agent that does not implement CDT.

Comment author: gRR 13 March 2012 07:39:31AM 1 point [-]

More specifically, in your first link you describe a game that is not Newcomb's problem and in the second link you describe an agent that does not implement CDT

It would be a little more helpful, although probably not quite as cool-sounding, if you explained in what way the game is not Newcomb's in the first link, and the agent not a CDT in the second. AFAIK, the two links describe exactly the same problem and exactly the same agent, and I wrote both comments.

Comment author: wedrifid 13 March 2012 08:14:36AM 2 points [-]

It would be a little more helpful, although probably not quite as cool-sounding,

That doesn't seem to make helping you appealing.

if you explained in what way the game is not Newcomb's in the first link,

The agent believes that it has a 50% chance of being in an actual Newcomb's problem and a 50% chance of being in a simulation which will be used to present another agent with a Newcomb's problem some time in the future.

and the agent not a CDT in the second.

Orthonormal already explained this in the context.

Comment author: gRR 13 March 2012 08:37:32AM 0 points [-]

That doesn't seem to make helping you appealing.

Yes, I have this problem, working on it. I'm sorry, and thanks for your patience!

The agent believes that it has a 50% chance of being in an actual Newcomb's problem and a 50% chance of being in a simulation which will be used to present another agent with a Newcomb's problem some time in the future.

Yes, except for s/another agent/itself/. In what way is this not a correct description of a pure Newcomb's problem from the agent's point of view? This is my original, still unanswered question.

Note: in the usual formulations of Newcomb's problem for UDT, the agent knows exactly that: it is called twice, and when it is running it does not know which of the two calls is being evaluated.

Orthonormal already explained this in the context.

I answered his explanation in the context, and he appeared to agree. His other objection seems to be based on a mistaken understanding.

Comment author: orthonormal 13 March 2012 09:43:19PM 0 points [-]

This is worth writing into its own post: a CDT agent with a non-self-centered utility function (like a paperclip maximizer) and a certain model of anthropics (in which, if it knows it's being simulated, it views itself as possibly within the simulation), when faced with a Predictor that predicts by simulating (which is not always the case), one-boxes on Newcomb's Problem.

This is a novel and surprising result in the academic literature on CDT, not the prediction they expected. But it seems to me that if you violate any of the conditions above, one-boxing collapses back into two-boxing; and furthermore, such an agent won't cooperate in the Prisoner's Dilemma against a CDT agent with an orthogonal utility function. That, at least, is inescapable given the independence assumption.

Comment author: Will_Newsome 13 March 2012 08:03:00PM *  9 points [-]

This might not satisfactorily answer your confusion but: CDT is defined by the fact that it has incorrect causal graphs. If it has correct causal graphs then it's not CDT. Why bother talking about a "decision theory" that is arbitrarily limited to incorrect causal graphs? Because that's the decision theory that academic decision theorists like to talk about and treat as default. Why did academic decision theorists never realize that their causal graphs were wrong? No one has a very good model of that, but check out Wei Dai's related speculation here. Note that if you define causality in a technical Markovian way and use Bayes nets then there is no difference between CDT and TDT.

I used to get annoyed because CDT with a good enough world model should clearly one-box, yet people stipulated that it wouldn't; only later did I realize that it's mostly a rhetorical thing, and no one thinks that an AGI actually implemented with "CDT" would be as dumb as academia's/Less Wrong's version of CDT.

If I'm wrong about any of the above then someone please correct me as this is relevant to FAI strategy.