
Comment author: lackofcheese 01 October 2014 01:38:31PM *  1 point [-]

AIXI world programs have no inputs, they just run and produce sequences of triples in the form: (action, percept, reward).

No, that isn't true. See, for example, page 7 of this article. The environments (q) accept inputs from the agent and output the agent's percepts.

As such (as per my discussion with private_messaging), there are only three relevant classes of world programs:
(1) Opaque box contains $1,000,000
(2) Opaque box is empty
(3) Contents of the box are determined by my action input

For any and all such environment programs that are consistent with AIXI's observations to date, AIXI will evaluate the reward for both the OneBox and TwoBox actions. As long as classes (1) and (2) win out over class (3), which they should due to being simpler, AIXI will determine that E(TwoBox) > E(OneBox) and therefore AIXI will TwoBox. In fact, as long as AIXI is smart enough to predict Omega's reasoning, world programs of type (2) should win out over type (1) as well, and so AIXI will already be pretty sure that the opaque box is empty when it two-boxes.
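To make the comparison concrete, here is a minimal sketch (Python) of the dominance calculation being described; the class probabilities are made-up illustrative numbers rather than anything derived from a Solomonoff prior, and AIXI itself is of course uncomputable.

# Expected reward of each action under the three classes of world program above.
# M = opaque box payout, T = transparent box payout (values from the standard problem).
M, T = 1_000_000, 1_000

def expected_reward(action, p1, p2, p3):
    if action == "TwoBox":
        # (1) box already full, (2) box already empty, (3) contents react to the action input
        return p1 * (M + T) + p2 * T + p3 * T
    else:  # "OneBox"
        return p1 * M + p2 * 0 + p3 * M

p1, p2, p3 = 0.2, 0.7998, 0.0002  # hypothetical weights, with class (3) nearly negligible
print(expected_reward("TwoBox", p1, p2, p3))  # about 201,000
print(expected_reward("OneBox", p1, p2, p3))  # about 200,200 -> AIXI two-boxes

Two-boxing only comes out ahead while p3 < (p1 + p2) * T / (M - T), i.e. while class (3) carries less than roughly a thousandth of the probability mass; whether it does is precisely what the reply below disputes.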

Comment author: V_V 01 October 2014 04:03:43PM *  0 points [-]

The environments (q) accept inputs from the agent and output the agent's percepts.

Yes, but the programs that AIXI maintains internally in its model ensemble are defined as input-less programs that generate all the possible histories.
AIXI filters them for the one observed history and then evaluates the expected (discounted) reward over the future histories, for each possible choice of its next action.
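(As a toy illustration of that filtering-and-evaluation step: the hand-written "world programs" and their weights below are hypothetical stand-ins for the real, uncomputable Solomonoff ensemble.)

# Each candidate world program is represented as (prior weight, full history it generates);
# real AIXI enumerates all programs and weights them by 2^-length.
observed = [("act1", "percept1", 0)]  # the history observed so far (hypothetical)

candidates = [
    (0.5, [("act1", "percept1", 0), ("OneBox", "box full", 1_000_000)]),
    (0.3, [("act1", "percept1", 0), ("TwoBox", "box empty", 1_000)]),
    (0.2, [("act1", "percept2", 0), ("TwoBox", "box full", 1_001_000)]),  # inconsistent history
]

# Keep only the programs whose output starts with the observed history.
consistent = [(w, h) for (w, h) in candidates if h[:len(observed)] == observed]

def expected_reward(action):
    # Posterior-weighted reward of the next step, among programs whose next action is `action`.
    matches = [(w, h) for (w, h) in consistent if h[len(observed)][0] == action]
    total = sum(w for w, _ in matches)
    return sum(w * h[len(observed)][2] for w, h in matches) / total if total else 0.0

print(max(["OneBox", "TwoBox"], key=expected_reward))  # the action with the higher expected reward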
Anyway, that's a technical detail.

As long as classes (1) and (2) win out over class (3), which they should due to being simpler

How can they be simpler, given that you have explained to AIXI what Newcomb's problem is and provided it with enough evidence so that it really believes that it is going to face it?

Maybe Newcomb's problem is simply inconceivable to AIXI, in a way that no amount of evidence can ever lead it to expect that the content of the box, and thus the reward, is correlated to its action.
That's a possibility, but I find it not very plausible: AIXI world programs contain embeddings of all human minds, and all super-human computable AIs. If we assume that the agent is experienced, world programs embedding these very very smart AIs will get most of the probability mass, since they are very good sequence predictors. So if a human can understand Newcomb's problem, I think that a super-human AI would understand it as well.

Anyway, if we stipulate that it is indeed possible to provide AIXI with enough evidence that it is facing Newcomb's problem, then it seems to me that it will OneBox.

Comment author: lackofcheese 28 September 2014 04:45:53PM *  0 points [-]

What do you mean by "programs where the content of the box, however internally represented, is correlated to the next decision"? Do you mean world programs that output $1,000,000 when the input is "one-box" and output $1000 when the input is "two-box"? That seems to contradict the setup of Newcomb's to me; in order for Newcomb's problem to work, the content of the box has to be correlated to the actual next decision, not to counterfactual next decisions that don't actually occur.

As such, as far as I can see it's important for AIXI's probability mass to focus down to models where the box already contains a million dollars and/or models where the box is already empty, rather than models in which the contents of the box are determined by the input to the world program at the moment AIXI makes its decision.

Comment author: V_V 01 October 2014 01:12:14PM *  0 points [-]

AIXI world programs have no inputs, they just run and produce sequences of triples in the form: (action, percept, reward).

So, let's say AIXI has been just subjected to Newcomb's problem. Assuming that the decision variable is always binary ("OneBox" vs "TwoBox"), of all the programs which produce a sequence consistent with the observed history, we distinguish five classes of programs, depending on the next triple they produce:
1: ("OneBox", "Opaque box contains $1,000,000", 1,000,000)
2: ("TwoBox", "Opaque box is empty", 1,000)
3: ("OneBox", "Opaque box is empty", 0)
4: ("TwoBox", "Opaque box contains $1,000,000", 1,001,000)
5: Anything else (eg. ("OneBox", "A pink elephant appears", 42)).

Class 5 should have a vanishing probability, since we assume that the agent already knows physics.
Therefore:
E("OneBox") = 1,000,000 * p(class1) + 0 * p(class3)
E("TwoBox") = 1,000 * p(class2) + 1,001,000 * p(class4)

Classes 1 and 2 are consistent with the setup of Newcomb's problem, while classes 3 and 4 aren't.
Hence I would say that if AIXI has been presented with enough evidence to believe that it is facing Newcomb's problem, then by definition of "enough evidence", p(class1) >> p(class3) and p(class2) >> p(class4), implying that AIXI will OneBox.
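A quick sketch of that calculation, plugging invented numbers into the two formulas exactly as written; the probabilities are chosen only so that p(class1) >> p(class3) and p(class2) >> p(class4), as stipulated.

def E_onebox(p1, p3):
    return 1_000_000 * p1 + 0 * p3

def E_twobox(p2, p4):
    return 1_000 * p2 + 1_001_000 * p4

# "Enough evidence" for Newcomb's problem: classes 1 and 2 dominate classes 3 and 4.
p1, p2, p3, p4 = 0.49, 0.49, 0.01, 0.01
print(E_onebox(p1, p3))  # about 490,000
print(E_twobox(p2, p4))  # about 10,500 -> AIXI one-boxes

Even with class 4 holding a full 1% of the mass, its $1,001,000 payout is nowhere near enough to make two-boxing competitive.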

Comment author: lackofcheese 28 September 2014 04:27:17PM *  1 point [-]

It takes, as input, a description of the agent it's predicting; typically source code, but in the case of AIXI, it gets the AIXI equation and a sequence of prior observations for AIXI.

As for what it does, it spends some period of time (maybe a very long one) on whatever kind of deductive and/or inductive reasoning it chooses to do in order to establish with a reasonable level of confidence what the agent it's trying to predict will do.

Yes, AIXI being uncomputable means that Omega can't simply run the equation for itself, but there is no need for a perfect prediction here. On the whole, it just needs to be able to come up with a well-reasoned argument for why AIXI will take a particular action, or perhaps run an approximation of AIXI for a while. Moreover, anyone in this thread arguing for either one-boxing or two-boxing has already implicitly agreed with this assumption.

Comment author: V_V 01 October 2014 12:12:05PM *  0 points [-]

Yes, AIXI being uncomputable means that Omega can't simply run the equation for itself, but there is no need for a perfect prediction here. On the whole, it just needs to be able to come up with a well-reasoned argument for why AIXI will take a particular action, or perhaps run an approximation of AIXI for a while.

This opens up the possibility that AIXI figures out that Omega is going to mispredict it, which would make TwoBoxing the best decision.

Moreover, anyone in this thread arguing for either one-boxing or two-boxing has already implicitly agreed with this assumption.

I think it is generally assumed that, even if Omega is not a perfect predictor, the agent can't outsmart it and predict its errors. But if Omega is computable and the agent is uncomputable, this doesn't necessarily hold true.

Comment author: lackofcheese 28 September 2014 01:35:12PM *  1 point [-]

I think it's implicit in the Newcomb's problem scenario that it takes place within the constraints of the universe as we know it. Obviously we have to make an exception for AIXI itself, but I don't see a reason to make any further exceptions after that point. Additionally, it is explicitly stated in the problem setup that the contents of the box are supposed to be predetermined, and that the agent is made aware of this aspect of the setup. As far as the epistemic states are concerned, this would imply that AIXI has been presented with a number of prior observations that provide very strong evidential support for this fact.

I agree that AIXI's universe programs are general Turing machines rather than explicit physics simulations, but I don't think that's a particularly big problem. Unless we're talking about a particularly immature AIXI agent, it should already be aware of the obvious physics-like nature of the real world; it seems to me that the majority of AIXI's probability mass should be occupied by physics-like Turing machines rather than by thunking. Why would AIXI come up with world programs that involve Omega making money magically appear or disappear after being presented with significant evidence to the contrary?

I can agree that in the general case it would be rather difficult indeed to predict AIXI, but in many specific instances I think it's rather straightforward. In particular, I think Newcomb's problem is one of those cases.

I guess that in general Omega could be extremely complex, but unless there is a reason Omega needs to be that complex, isn't it much more sensible to interpret the problem in a way that is more likely to comport with our knowledge of reality? Insofar as there exist simpler explanations for Omega's predictive power, those simpler explanations should be preferred.

I guess you could say that AIXI itself cannot exist in our reality and so we need to reinterpret the problem in that context, but that seems like a flawed approach to me. After all, the whole point of AIXI is to reason about its performance relative to other agents, so I don't think it makes sense to posit a different problem setup for AIXI than we would for any other agent.

Comment author: V_V 28 September 2014 04:12:34PM 1 point [-]

If AIXI has been presented with sufficient evidence that Newcomb's problem works as advertised, then it must be assigning most of its model probability mass to programs where the content of the box, however internally represented, is correlated to the next decision.
Such programs exist in the model ensemble, hence the question is how much probability mass AIXI assigns to them. If it is not enough to dominate its choice, then by definition AIXI has not been presented with enough evidence.

Comment author: lackofcheese 28 September 2014 03:34:39PM 0 points [-]

Alternatively, what about a version of Newcomb's problem where the predictor's source code is shown to AIXI before it makes its decision?

Comment author: V_V 28 September 2014 04:07:07PM 1 point [-]

What would the source code of an Omega able to predict an AIXI look like?

Comment author: lackofcheese 28 September 2014 03:08:43PM *  1 point [-]

That would require the AIXI agent to have been pretrained to understand English (or some language as expressive as English) and have some experience at solving problems given a verbal explanation of the rules.

I don't think someone posing Newcomb's problem would be particularly interested in excuses like "but what if the agent only speaks French!?" Obviously as part of the setup of Newcomb's problem AIXI has to be provided with an epistemic background that is comparable to that of its intended target audience. This means it doesn't just have to be familiar with English, it has to be familiar with the real world, because Newcomb's problem takes place in the context of the real world (or something very much like it).

I think you're confusing two different scenarios:
- Someone training an AIXI agent to output problem solutions given problem specifications as inputs.
- Someone actually physically putting an AIXI agent into the scenario stipulated by Newcomb's problem.

The second one is Newcomb's problem; the first is the "what is the optimal strategy for Newcomb's problem?" problem.

It's the second one I'm arguing about in this thread, and it's the second one that people have in mind when they bring up Newcomb's problem.

Comment author: V_V 28 September 2014 03:37:18PM *  0 points [-]

Then the AIXI ensemble will be dominated by programs which associate "real world" percepts and actions with predicted rewards.

The point is that there is no way, short of actually running the (physically impossible) experiment, that we can tell whether the behavior of this AIXI agent will be consistent with CDT, EDT, or something else entirely.

Comment author: lackofcheese 28 September 2014 01:44:59PM *  1 point [-]

The problem you're discussing is not Newcomb's problem; it's a different problem that you've decided to apply the same name to.

It is a crucial part of the setup of Newcomb's problem that the agent is presented with significant evidence about the nature of the problem. This applies to AIXI as well; at the beginning of the problem AIXI needs to be presented with observations that give it very strong evidence about Omega and about the nature of the problem setup. From Wikipedia:
"By the time the game begins, and the player is called upon to choose which boxes to take, the prediction has already been made, and the contents of box B have already been determined. That is, box B contains either $0 or $1,000,000 before the game begins, and once the game begins even the Predictor is powerless to change the contents of the boxes. Before the game begins, the player is aware of all the rules of the game, including the two possible contents of box B, the fact that its contents are based on the Predictor's prediction, and knowledge of the Predictor's infallibility. The only information withheld from the player is what prediction the Predictor made, and thus what the contents of box B are."

It seems totally unreasonable to withhold information from AIXI that would be given to any other agent facing the Newcomb's problem scenario.

Comment author: V_V 28 September 2014 02:39:27PM 1 point [-]

That would require the AIXI agent to have been pretrained to understand English (or some language as expressive as English) and have some experience at solving problems given a verbal explanation of the rules.

In this scenario, the AIXI internal program ensemble concentrates its probability mass on programs which associate each pair of one English specification and one action to a predicted reward. Given the English specification, AIXI computes the expected reward for each action and outputs the action that maximizes the expected reward.

Note that in principle this can implement any computable decision theory. Which one it would choose depends on the agent's history and the intrinsic bias of its UTM.
It can be CDT, EDT, UDT, or, more likely, some approximation of them that worked well for the agent so far.

Comment author: So8res 25 September 2014 06:38:50PM *  4 points [-]

CDT + Precommitments is not pure CDT -- I agree that CDT over time (with the ability to make and keep precommitments) does pretty well, and this is part of what I mean when I talk about how an agent using pure CDT to make every decision would self-modify to stop doing that (e.g., to implement precommitments, which is trivially easy when you can modify your own source code).

Consider the arguments of CDT agents as they twobox, when they claim that they would have liked to precommit but they missed their opportunity -- we can do better by deciding to act as we would have precommitted to act, but this entails using a different decision theory. You can minimize the number of missed opportunities by allowing CDT many opportunities to precommit, but that doesn't change the fact that CDT can't retrocommit.

If you look at the decision-making procedure of something which started out using CDT after it self-modifies a few times, the decision procedure probably won't look like CDT, even though it was implemented by CDT making "precommitments".

And while CDT mostly does well when the games are repeated, there are flaws that CDT won't be able to self-correct (roughly corresponding to CDT's inability to make retrocommitments); these will be the subject of future posts.

Comment author: V_V 28 September 2014 01:54:52PM 1 point [-]

Consider the arguments of CDT agents as they twobox, when they claim that they would have liked to precommit but they missed their opportunity

Why would they do that?
CDT two-boxes because CDT simply fails to understand that the content of the box is influenced by its decision. It deliberately uses an incorrect epistemic model.
So when the agent two-boxes and obtains a reward different from what it had predicted, it will simply think it has been lied to, or, if it is one hundred percent certain that the model was correct, it will experience a logical contradiction, halt and catch fire.

Comment author: lackofcheese 26 September 2014 06:50:43PM *  1 point [-]

Actually, I think AIXI is basically CDT-like, and I suspect that it would two-box on Newcomb's problem.

At a highly abstract level, the main difference between AIXI and a CDT agent is that AIXI has a generalized way of modeling physics (but it has a built-in assumption of forward causality), whereas the CDT agent needs you to tell it what the physics is in order to make a decision.

The optimality of the AIXI algorithm is predicated on viewing itself as a "black box" as far as its interactions with the environment are concerned, which is more or less what the CDT agent does when it makes a decision.

Comment author: V_V 28 September 2014 01:36:33PM 1 point [-]

Actually, I think AIXI is basically CDT-like, and I suspect that it would two-box on Newcomb's problem.

AIXI is a machine learning (hyper-)algorithm, hence we can't expect it to perform better than a random coin toss on a one-shot problem.

If you repeatedly pose Newcomb's problem to an AIXI agent, it will quickly learn to one-box.
Trivially, AIXI doesn't model the problem's acausal structure in any way. For AIXI, this is just a matter of setting a bit and getting a reward, and AIXI will easily figure out that setting its decision bit to "one-box" yields a higher expected reward than setting it to "two-box".
In fact, you don't even need an AIXI agent to do that: any reinforcement learning toy agent will be able to do that.
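As a toy illustration of that last claim, here is a bare-bones epsilon-greedy bandit (nothing AIXI-specific) facing a repeated Newcomb's problem; the assumption that Omega's prediction is correct on every round, and the reward function below, are mine rather than anything specified in the thread.

import random

random.seed(0)
values = {"OneBox": 0.0, "TwoBox": 0.0}   # running average reward per action
counts = {"OneBox": 0, "TwoBox": 0}

def reward(action):
    # Omega is assumed to have predicted `action`, so the opaque box is full iff the agent one-boxes.
    return 1_000_000 if action == "OneBox" else 1_000

for step in range(1000):
    if random.random() < 0.1:                      # explore
        action = random.choice(["OneBox", "TwoBox"])
    else:                                          # exploit
        action = max(values, key=values.get)
    counts[action] += 1
    values[action] += (reward(action) - values[action]) / counts[action]

print(max(values, key=values.get))  # "OneBox": the agent settles on one-boxing

The agent never represents Omega or the prediction at all; it just learns which bit setting has paid better so far.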

Comment author: lackofcheese 24 September 2014 10:01:06PM *  2 points [-]

The term "max-payoff equilibrium" is ill-defined.

Consider this pair of bots:
1) ExtortionBot, who defects for 80 rounds, and then cooperates with you for 20 rounds if and only if you cooperated for all of the first 80 (otherwise it defects for those rounds as well).
2) WeakBot, who always defects for the last 20 rounds, and cooperates with you for the first 80 if and only if it simulates that you will cooperate for the last 20 rounds iff WeakBot cooperates with you for the first 80 (otherwise it defects for the first 80).

The maximum score you can get vs ExtortionBot is 100 points, which is how many points WeakBot gets.
The maximum score you can get vs WeakBot is 400 points, which is how many points ExtortionBot gets.

Ergo ExtortionBot/WeakBot forms a Nash Equilibrium. Is that a max-payoff equilibrium, or is it not?
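For reference, the 100 and 400 figures check out under the usual prisoner's-dilemma payoffs (T, R, P, S) = (5, 3, 1, 0) per round; the payoff matrix isn't restated in this thread, so those values are an assumption.

T, R, P, S = 5, 3, 1, 0  # temptation, reward, punishment, sucker payoffs (assumed)

def payoff(my_move, their_move):
    if my_move == "C":
        return R if their_move == "C" else S
    return T if their_move == "C" else P

# ExtortionBot defects for rounds 1-80, then cooperates for rounds 81-100 (because
# WeakBot cooperated throughout the first 80); WeakBot cooperates for rounds 1-80
# and defects for rounds 81-100.
extortion_moves = ["D"] * 80 + ["C"] * 20
weak_moves      = ["C"] * 80 + ["D"] * 20

extortion_score = sum(payoff(e, w) for e, w in zip(extortion_moves, weak_moves))
weak_score      = sum(payoff(w, e) for e, w in zip(extortion_moves, weak_moves))
print(extortion_score, weak_score)  # 400 100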

Comment author: V_V 28 September 2014 12:54:38PM *  2 points [-]

That means that this game is a symmetric bargaining problem.
According to Wikipedia, proposed solutions are symmetric, Pareto-optimal (i.e. "max-payoff") equilibria.

It seems to me that VOFB or something similar to it is a strategy leading to one of these equilibria (do other symmetric Pareto-optimal equilibria exist?)
