Comment author: pjeby 01 July 2013 12:02:41AM 3 points

the question for the two-boxer will be whether the decision causally influences the result of this brain scan. If yes, then the two-boxer will one-box (odd as that sounds). If no, the two-boxer will two-box.

How would it not causally influence the brain scan? Are you saying two-boxers can make decisions without using their brains? ;-)

In any event, you didn't answer the question I asked, which was at what point in time does the two-boxer label the decision "irrational". Is it still "irrational" in their estimation to two-box, in the case where Omega decides after they do?

Notice that in both cases, the decision arises from information already available: the state of the chooser's brain. So even in the original Newcomb's problem, there is a causal connection between the chooser's brain state and the boxes' contents. That's why I and other people are asking what role time plays: if you are using the correct causal model, where your current brain state has causal influence over your future decision, then the only distinction two-boxers can base their "irrational" label on is time, not causality.

The alternative is to argue that it is somehow possible to make a decision without using your brain, i.e., without past causes having any influence on your decision. You could maybe do that by flipping a coin, but then, is that really a "decision", let alone "rational"?

If a two-boxer argues that their decision cannot cause a past event, they have the causal model wrong. The correct model is one of a past brain state influencing both Omega's decision and your own future decision.

For me, the simulation argument made it obvious that one-boxing is the rational choice, because it makes clear that your decision is algorithmic. "Then I'll just decide differently!" is, you see, still a fixed algorithm. There is no such thing as submitting one program to Omega and then running a different one, because you are the same program in both cases -- and it's that program that is causal over both Omega's behavior and the "choice you would make in that situation". Separating the decision from the deciding algorithm is incoherent.
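
To see why, here is a toy sketch in Python (my own illustration; the function name is hypothetical) of the point that Omega's prediction and your actual choice are outputs of one and the same procedure:

    # Omega's prediction and your actual choice come from running the
    # same decision procedure; there is no second program to swap in.

    def my_decision_procedure(situation):
        # All of your reasoning, including "then I'll just decide
        # differently!", happens inside this function.
        return "one-box"

    # Omega predicts by running (a perfect model of) the procedure...
    prediction = my_decision_procedure("newcomb")
    box_filled = (prediction == "one-box")

    # ...and at decision time, you *are* that same procedure.
    actual_choice = my_decision_procedure("newcomb")

    assert prediction == actual_choice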

As someone else mentioned, the only way the two-boxer's statements make any sense is if you can separate a decision from the algorithm used to arrive at that decision. But nobody has presented any concrete theory by which one can arrive at a decision without using some algorithm, and whatever algorithm that is, is your "agent type". It doesn't make any sense to say that you can be the type of agent who decides one way, but when it actually comes to deciding, you'll decide another way.

How does your hypothetical two-boxer respond to simulation or copy arguments? If you have no way of knowing whether you're the simulated version of you, or the real version of you, which decision is rational then?

To put it another way, a two-boxer is arguing that they ought to two-box while simultaneously not being the sort of person who would two-box -- an obvious contradiction. The two-boxer is either arguing for this contradiction, or arguing about the definitions of words by saying "yes, but that's not what 'rational' means".

Indeed, most two-boxers I've seen around here seem to alternate between those two positions, falling back to the other whenever one is successfully challenged.

In response to comment by pjeby on Why one-box?
Comment author: PhilosophyStudent 01 July 2013 12:40:58AM 0 points

In any event, you didn't answer the question I asked, which was at what point in time does the two-boxer label the decision "irrational". Is it still "irrational" in their estimation to two-box, in the case where Omega decides after they do?

Time is irrelevant to the two-boxer except as a proof of causal independence, so there's no interesting answer to this question. The two-boxer is concerned with causal independence. If the decision cannot help but causally influence the brain scan, then the two-boxer will one-box.

Notice that in both cases, the decision arises from information already available: the state of the chooser's brain. So even in the original Newcomb's problem, there is a causal connection between the chooser's brain state and the boxes' contents. That's why I and other people are asking what role time plays: if you are using the correct causal model, where your current brain state has causal influence over your future decision, then the only distinction two-boxers can base their "irrational" label on is time, not causality.

Two-boxers use a causal model in which your current brain state has causal influence on your future decisions. They are interested in the causal effects of the decision, not of the brain state, so in their view the causal-independence criterion does distinguish the two cases and they need not appeal to time.

If a two-boxer argues that their decision cannot cause a past event, they have the causal model wrong. The correct model is one of a past brain state influencing both Omega's decision and your own future decision.

They have the right causal model. They just disagree about which downstream causal effects we should be considering.

For me, the simulation argument made it obvious that one-boxing is the rational choice, because it makes clear that your decision is algorithmic. "Then I'll just decide differently!" is, you see, still a fixed algorithm. There is no such thing as submitting one program to Omega and then running a different one, because you are the same program in both cases -- and it's that program that is causal over both Omega's behavior and the "choice you would make in that situation". Separating the decision from the deciding algorithm is incoherent.

No-one denies this. Everyone agrees about what the best program is. They just disagree about what this implies about the best decision. The two-boxer says that, unfortunately, the best program leads us to make a non-optimal decision (a cost worth paying, because the benefits outweigh it). But, they say, this doesn't change the fact that two-boxing is the optimal decision (while acknowledging that the optimal program one-boxes).

How does your hypothetical two-boxer respond to simulation or copy arguments? If you have no way of knowing whether you're the simulated version of you, or the real version of you, which decision is rational then?

I suspect that different two-boxers would respond differently, as anthropic-style puzzles tend to elicit disagreement.

To put it another way, a two-boxer is arguing that they ought to two-box while simultaneously not being the sort of person who would two-box -- an obvious contradiction. The two-boxer is either arguing for this contradiction, or arguing about the definitions of words by saying "yes, but that's not what 'rational' means".

Well, they're saying that the optimal algorithm is a one-boxing algorithm while the optimal decision is two-boxing. They can explain why as well (algorithms have different causal effects from decisions). There is no immediate contradiction here (it would take serious argument to show a contradiction: for example, an argument showing that decisions and algorithms are the same thing). For example, imagine a game where I choose a colour and then later choose a number between 1 and 4. With regards to the number: if you pick n, you get $n. With regards to the colour: if you pick red, you get $0; if you pick blue, you get $5 but then don't get a choice about the number (you are presumed to have picked 1). It is not contradictory to say that the optimal number to pick is 4 but the optimal colour to pick is blue, even though picking blue commits you to the number 1. The two-boxer is saying something pretty similar here.
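
A quick worked version of that game (my own sketch; the helper name is made up) makes the payoffs explicit and shows how "the best number is 4" and "the best colour is blue" answer different questions:

    def total_payoff(colour, number):
        # Blue pays $5 but forces the number to 1; red pays $0 and
        # leaves the number choice free.
        if colour == "blue":
            return 5 + 1
        return 0 + number

    best_red = max(total_payoff("red", n) for n in range(1, 5))  # $4, picking 4
    blue = total_payoff("blue", 1)                               # $6
    print(best_red, blue)  # the optimal free number is 4, yet blue beats red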

What "ought" you do, according to the two-boxer. Well that depends what decision you're facing. If you're facing a decision about what algorithm to adopt, then adopt the optimal algorithm (which one-boxers on all future versions of NP though not ones where the prediction has occurred). If you are not able to choose between algorithms but are just choosing a decision for this occasion then choose two-boxing. They do not give contradictory advice.

Comment author: pjeby 30 June 2013 11:01:06PM 2 points

the two-boxer says that you should precommit to later making an irrational decision

I think the piece that this hypothetical two-boxer is missing is that they are acting as though the problem is cheating, or alternatively, that the premises can be cheated; that is, that you are able to make a decision that wasn't predictable beforehand. If your decision is predictable, two-boxing is irrational, even considered as a single decision.

Try this analogy: instead of predicting your decision in advance, Omega simply scans your brain to determine what to put in the boxes, at the very moment you make the decision.

Does your hypothetical two-boxer still argue that one-boxing in this scenario is "irrational"?

If so, I cannot make sense of their answer. But if not, then the burden falls on the two-boxer to explain how this scenario is any different from a prediction made a fraction of a millisecond sooner. How far before or after the point of decision does the decision become "rational" or "irrational" in their mind? (I use quotes here because I cannot think of any coherent definition of those terms that's still consistent with the hypothetical usage.)

In response to comment by pjeby on Why one-box?
Comment author: PhilosophyStudent 30 June 2013 11:38:03PM 1 point

The two-boxer never assumes that the decision isn't predictable. They just say that the prediction can no longer be influenced, and so you may as well gain the $1,000 from the transparent box.

In terms of your hypothetical scenario, the question for the two-boxer will be whether the decision causally influences the result of this brain scan. If yes, then the two-boxer will one-box (odd as that sounds). If no, the two-boxer will two-box.

In response to Why one-box?
Comment author: Ronak 30 June 2013 07:12:47PM 2 points

[Saying same thing as everyone else, just different words. Might work better, might not.]

Suppose once Omega explains everything to you, you think 'now either the million dollars are there or they aren't, and my decision doesn't affect shit.' True, your decision now doesn't affect it, but your 'source code' (neural wiring) contains the information 'will in this situation think thoughts that support two-boxing and accept them.' So choosing to one-box is the same as being the type of agent who'll one-box.
The distinction between agent type and decision is artificial. If your decision is to two-box, you are the agent type who will two-box. There's no two ways about it. (As others have pointed out, this has been formalised by Anna Salamon.)

The only way you can get out of this is if you believe in free will as something that exists in some metaphysical sense. Then to you, Omega being this accurate is beyond the realm of possibility and therefore the question is unfair.

In response to comment by Ronak on Why one-box?
Comment author: PhilosophyStudent 30 June 2013 11:35:38PM 0 points

Two-boxing definitely entails that you are a two-boxing agent type. That's not the same as the claim that the decision and the agent type are the same thing. See also my comment here. I would be interested to know your answer to my questions there (particularly the second one).

Comment author: Robert_Unwin 30 June 2013 08:43:55PM 2 points

The LW approach has focused on finding agent types that win on decision problems. Lots of the work has been in trying to formalize TDT/UDT, providing sketches of computer programs that implement these informal ideas. Having read a fair amount of the philosophy literature (including some of the recent stuff by Egan, Hare/Hedden and others), I think that this agent/program approach has been extremely fruitful. It has not only given compelling solutions to a large number of problems in the literature (Newcomb's, trivial coordination problems like Stag Hunt that CDT fails on, the PD played against a selfish copy of yourself) but has also elucidated the deep philosophical issues that the Newcomb problem dramatizes (concerning pre-commitment, free will/determinism, and uncertainty about purely a priori/logical questions). The focus on agents as programs has brought to light the intricate connection between decision making, computability and logic (esp. Gödelian issues), something merely touched on in the philosophy literature.

These successes provide a sufficient reason to push the agent-centered approach (even if there were no compelling foundational argument that the 'decision' centered approach was incoherent). Similarly, I think there is no overwhelming foundational argument for Bayesian probability theory but philosophers should study it because of its fruitfulness in illuminating many particular issues in the philosophy of science and the foundations of statistics (not to mention its success in practical machine learning and statistics).

This response may not be very satisfying, but I can only recommend the UDT posts (http://wiki.lesswrong.com/wiki/Updateless_decision_theory) and the recent MIRI paper (http://intelligence.org/files/RobustCooperation.pdf).

Rough arguments against the decision-centered approach:

Point 1

Suppose I win the lottery after playing 10 times. My decision about which numbers to pick in the last lottery was the cause of winning money (whereas my previous number choices produced only disutility). But it's not clear there's anything interesting about this distinction. If I lost money on average, the important lesson is the failing of my agent type (i.e. the way my decision algorithm makes decisions on lottery problems).

And yet in many practical cases that humans face, it is very useful to look back at which decisions led to high utility. If we compare different algorithms playing casino games, or compare following the advice of a poker expert vs. a newbie, we'll get useful information by looking at the utility caused by each decision. But this investigation of decisions that cause high utility is completely explainable from the agent-centered approach. When simulation and logical correlations between agents are not part of the problem, the optimal agent will make decisions that cause the most utility. UDT/TDT and variants all (afaik) act like CDT in these simple decision problems. If we came upon a Newcomb problem without being told the setup (and without any familiarity with these decision theory puzzles), we would see that the CDTer's decisions were causing utility and the EDTer's decisions were not causing any utility. The EDTer would look like a lunatic with bizarrely good luck. Here we are following a local causal criterion in comparing actions. While that is usually fine, in the Newcomb problem we would clearly be missing out on an important part of the story.
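
As a rough illustration of how "decisions that cause utility" and "agents that win" come apart here, consider this sketch (my own; the standard $1,000,000/$1,000 payoffs are assumed) of expected payoffs against a predictor with accuracy p:

    def expected_payoff(action, p):
        # p = probability that Omega predicted this action correctly.
        if action == "one-box":
            return p * 1_000_000            # correct prediction: full box
        # Two-boxing: a correct prediction empties the opaque box.
        return p * 1_000 + (1 - p) * 1_001_000

    for p in (0.5, 0.9, 0.99, 1.0):
        print(p, expected_payoff("one-box", p), expected_payoff("two-box", p))

At p = 0.5 the prediction carries no information and two-boxing wins; for any sufficiently accurate predictor, one-boxers end up richer, even though each two-boxing act still causes a $1,000 gain.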

Point 2

In AI, we want to build decision making agents that win. In life, we want to improve our decision making so that we win. Thinking about the utility caused by individual decisions may be a useful subgoal in coming up with winning agents, but it seems hard to see it as the central issue. The Newcomb problem (and the counterfactual mugging and Parfit's Hitchhiker) make clear that a local Markovian criterion (e.g. choose the action that will cause the highest utility, ignoring all previous actions/commitments) is inadequate for winning.
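
A minimal sketch of that inadequacy (my own toy numbers) on Parfit's Hitchhiker: the driver rescues you only if he predicts you will pay once in town, and the myopic "choose the act that causes the most utility now" rule refuses to pay:

    def pays_in_town(agent):
        # Once in town, paying only costs money, so the myopic rule refuses.
        return agent != "myopic"

    def outcome(agent):
        rescued = pays_in_town(agent)   # the driver predicts the in-town choice
        return -100 if rescued else -1_000_000

    print(outcome("myopic"), outcome("committed"))  # -1000000 vs -100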

Point 3

The UDT one-boxer's agent type does not cause utility in the NP. However it does logically determine the utility. (More specifically, we could examine the one-boxing program as a formal system and try to isolate which rules/axioms lead to its one boxing in this type of problem). Similarly, if two people were using different sets of axioms (where one set is inconsistent), we might point to one of the axioms and say that its inclusion is what determines the inconsistency of the system. This is a mere sketch, but it might be possible to develop a local criterion by which "responsibility" for utility gains can be assigned to particular aspects of an agent.

It's clear that we can learn about good agent types by examining particular decisions. We don't have to always work with a fully specified program. (And we don't have the code of any AI that can solve decision problems the way humans can). So the more local approach may have some value.

In response to comment by Robert_Unwin on Why one-box?
Comment author: PhilosophyStudent 30 June 2013 11:31:55PM 0 points [-]

Generally agree. I think there are good arguments for focusing on decision types rather than decisions. A few comments:

Point 1: That's why the rationality of decisions is evaluated in terms of expected outcome, not actual outcome. So actually, it wasn't just your agent type that was flawed here but also your decisions. But yes, I agree with the general point that agent type is important.

Point 2: Agree

Point 3: Yes. I agree that there could be ways other than causation to attribute utility to decisions, and that these ways might be superior. However, I also think that the causal approach is one natural way to do this, and so I think claims that the proponent of two-boxing doesn't care about winning are false. I also think it's false to say they have a twisted definition of winning. Their view may be wrong, but I think it takes work to show that (I don't think they are just obviously coming up with absurd definitions of winning).

Comment author: Creutzer 30 June 2013 06:23:20PM 1 point

The question is, how much of this utility can be attributed to the agent's decision rather than type.

That's the wrong question, because it presupposes that the agent's decision and type are separable.

In response to comment by Creutzer on Why one-box?
Comment author: PhilosophyStudent 30 June 2013 11:24:02PM 0 points

By decision, the two-boxer means something like a proposition that the agent can make true or false at will (decisions don't need to be analysed in terms of propositions, but doing so makes the point fairly clearly). In other words, a decision is a thing that an agent can bring about with certainty.

By agent type, in the case of Newcomb's problem, the two-boxer is just going to mean "the thing that Omega based their prediction on". Let's say the agent's brain state at the time of prediction.

Why think these are the same thing?

If these are the same thing, CDT will one-box. Given that, is there any reason to think that the LW view is best presented as requiring a new decision theory rather than as requiring a new theory of what constitutes a decision?

Comment author: Strilanc 30 June 2013 12:49:25PM 0 points

In that case: the two-boxer isn't just wrong, they're double-wrong. You can't just come up with some related-but-different function ("caused gain") to maximize. The problem is about maximizing the money you receive, not "caused gain".

For example, I've seen some two-boxers justify two-boxing as a moral thing. They're willing to pay $999,000 for the benefit of, somehow, throwing the prediction back in the predictor's face. Fundamentally, they're making the same mistake: fighting the hypothetical by saying the payoffs are different from what was stated in the problem.

In response to comment by Strilanc on Why one-box?
Comment author: PhilosophyStudent 30 June 2013 11:15:49PM -1 points

The two-boxer is trying to maximise money (utility). They are interested in the additional question of which bits of that money (utility) can be attributed to which things (decisions/agent types). "Caused gain" is a view about how we should attribute the gaining of money (utility) to different things.

So they agree that the problem is about maximising money (utility) and not "caused gain". But they are interested not just in which agents end up with the most money (utility) but also in which aspects of those agents are responsible for them receiving the money. Specifically, they are interested in whether the decisions the agent makes are responsible for the money they receive. This does not mean they are trying to maximise something other than money (utility). It means they are interested in maximising money and then also in how you can maximise money via different mechanisms.

Comment author: Qiaochu_Yuan 30 June 2013 09:40:25AM 0 points

The question is, how much of this utility can be attributed to the agent's decision rather than type.

To many two-boxers, this isn't the question. At least some two-boxing proponents in the philosophical literature seem to distinguish between winning decisions and rational decisions, the contention being that winning decisions can be contingent on something stupid about the universe. For example, you could live in a universe that specifically rewards agents who use a particular decision theory, and that says nothing about the rationality of that decision theory.

In response to comment by Qiaochu_Yuan on Why one-box?
Comment author: PhilosophyStudent 30 June 2013 09:50:30AM 1 point

I'm not convinced this is actually the appropriate way to interpret most two-boxers. I've read papers that say things that sound like this claim, but I think the distinction that is generally being gestured at is the distinction I'm making here (with different terminology). I even think we get hints of that in the last sentence of your post, where you start to talk about agents being rewarded for their decision theory rather than their decision.

Comment author: Strilanc 30 June 2013 09:27:25AM 0 points

I still don't follow. The causal effect of two-boxing is getting $1,000 instead of $1,000,000. That's bad. How are you interpreting it, so that it's good? Because they're following a rule of thumb that's right under different circumstances?

In response to comment by Strilanc on Why one-box?
Comment author: PhilosophyStudent 30 June 2013 09:31:54AM 0 points

One-boxers end up with 1,000,000 utility.
Two-boxers end up with 1,000 utility.

So everyone agrees that one-boxers are the winning agents (1,000,000 > 1,000).

The question is, how much of this utility can be attributed to the agent's decision rather than type. The two-boxer says that to answer this question we ask what utility the agent's decision caused them to gain. So they say that we can attribute the following utility to the decisions:

One-boxing: 0
Two-boxing: 1,000

And the following utility to the agent's type (there will be some double counting because of overlapping causal effects):

One-boxing type: 1,000,000
Two-boxing type: 1,000

So the proponent of two-boxing says that the winning decision is two-boxing and the winning agent type is a one-boxing type.

I'm not interpreting it so that it's good (for a start, I'm not necessarily a proponent of this view, I'm just outlining it). All I'm discussing is the two-boxer's response to the accusation that they don't win. They say they are interested not in winning agents but winning decisions and that two boxing is the winning decision (because 1000 > 0).
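
To make the two-boxer's accounting concrete, here is a small sketch (my own illustration; a perfect predictor is assumed) of the two attributions just described:

    def payoff(box_filled, action):
        opaque = 1_000_000 if box_filled else 0
        return opaque + (1_000 if action == "two-box" else 0)

    # Attribution to the decision: hold the boxes' contents fixed (the
    # prediction has already happened) and vary only the action.
    for filled in (True, False):
        print(payoff(filled, "two-box") - payoff(filled, "one-box"))  # +1000 either way

    # Attribution to the agent type: the type fixes both the prediction
    # and the action, so compare the whole packages.
    print(payoff(True, "one-box"), payoff(False, "two-box"))  # 1000000 vs 1000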

In response to Why one-box?
Comment author: Strilanc 30 June 2013 08:45:53AM 0 points

[Two boxers] are interested in what aspects of the agent's winning can be attributed to their decision and they say that we can attribute the agent's winning to their decision if this is caused by their decision. This strikes me as quite a reasonable way to apportion the credit for various parts of the winning.

What do you mean by "the agent's winning can be attributed to their decision"? The agent isn't winning! Calling losing winning strikes me as a very unreasonable way to apportion credit for winning.

It would be helpful to me if you defined how you're attributing winning to decisions. Maybe taboo the words winning and decision. At the moment I really can't get my head around what you're trying to say.

In response to comment by Strilanc on Why one-box?
Comment author: PhilosophyStudent 30 June 2013 09:20:22AM 0 points

I was using winning to refer to something that comes in degrees.

The basic idea is that each agent ends up with a certain amount of utility (or money), and the question is which bits of this utility you can attribute to the decision. So let's say you wanted to determine how much of this utility you can attribute to the agent having blue hair. How would you do so? One possibility (the one used by the two-boxer) is to ask what causal effect the agent's blue hair had on the amount of utility received. This doesn't seem an utterly unreasonable way of determining how the utility received should be attributed to the agent's hair type.

Comment author: Creutzer 30 June 2013 07:52:57AM 4 points

But the very point is that you can't submit one piece of code and run another. You have to run what you submitted. That, again, is the issue that decisions don't fall from the sky uncaused. The reason why CDT can't get Newcomb's right is that, due to its use of surgery on the action node, it cannot conceive of its own choice as predetermined. You are precommitted already just in virtue of what kind of agent/program you are.

In response to comment by Creutzer on Why one-box?
Comment author: PhilosophyStudent 30 June 2013 08:28:55AM 1 point

But the very point is that you can't submit one piece of code and run another. You have to run what you submitted.

Yes. So the two-boxer says that you should precommit to later making an irrational decision. This does not require them to say that the decision you are precommitting to is later rational. So the two-boxer would submit the one-boxing code despite the fact that one unfortunate effect of this would be that they would later, irrationally by their lights, one-box when running that code (this is worth it because there are other effects which counteract the cost).

I'm not saying your argument is wrong (nor am I saying it's right). I'm just saying that the analogy is too close to the original situation to pump intuitions. If people don't already have the one-boxing intuition in Newcomb's problem then the submitting code analogy doesn't seem to me to make things any clearer.
