Comment author: dankane 25 September 2014 04:20:34PM 5 points [-]

Newcomblike problems occur whenever knowledge about what decision you will make leaks into the environment. The knowledge doesn't have to be 100% accurate, it just has to be correlated with your eventual actual action.

This is far too general. The way in which information is leaking into the environment is what separates Newcomb's problem from the smoking lesion problem. For your argument to work you need to argue that whatever signals are being picked up on would change if the subject changed their disposition, not merely that these signals are correlated with the disposition.

Comment author: dankane 25 September 2014 05:59:48PM 2 points [-]

Relatedly, with your interview example, I think that perhaps a better model is that whether a person is confident or shy is not depending on whether they believe that they will be bold or not, but upon the degree to which they care about being laughed at. If you are confident, you don't care about being laughed at and might as well be bold. If you are afraid of being laughed at, you already know that you are shy and thus do not gain anything by being bold.

Comment author: dankane 25 September 2014 04:20:34PM 5 points [-]

Newcomblike problems occur whenever knowledge about what decision you will make leaks into the environment. The knowledge doesn't have to be 100% accurate, it just has to be correlated with your eventual actual action.

This is far too general. The way in which information is leaking into the environment is what separates Newcomb's problem from the smoking lesion problem. For your argument to work you need to argue that whatever signals are being picked up on would change if the subject changed their disposition, not merely that these signals are correlated with the disposition.

Comment author: dankane 25 September 2014 05:44:12PM 2 points [-]

I think my bigger point is that you don't seem to make any real argument as to which case we are in. For example, consider the following model of how people's perception of my trustworthiness might be correlated to my actual trustworthiness: There are two causal chains: My values -> Things I say -> Peoples' perceptions My values -> My actions So if I value trustworthiness, I will not, for example talk much about wanting to avoid being sucker (in contexts where it would refer to be doing trustworthy things). This will influence peoples' perceptions of whether or not I am trustworthy. Furthermore, if I do value trustworthiness, I will want to be trustworthy.

This setup makes things look very much like the smoking lesion problem. A CDT agent that values trustworthiness will be trustworthy because they place intrinsic value in it. A CDT agent that does not value trustworthiness will be perceived as being untrustworthy. Simply changing their actions will not alter this perception, and therefore they will fail to be trustworthy in situations where it benefits them, and this is the correct decision.

Now you might try to break the causal link: My values -> Things that I say And doing so is certainly possible (I mean you can have spies that successfully pretend to be loyal for extended periods without giving themselves away). On the other hand, it might not happen often for several possible reasons: A) Maintaining a facade at all times is exhausting (and thus imposes high costs) B) Lying consistently is hard (as in too computationally expensive) C) The right way to lie consistently, is to simulate the altered value set, but this may actually lead to changing your values (standard advice for become more confident is pretending to be confident, right?).

So yes, in this model an non-trust-valuing and self-modifying CDT agent will self-modify, but it will need to self-modify its values rather than its decision theory. Using a decision theory that is trustworthy despite not intrinsically valuing it doesn't help.

Comment author: dankane 25 September 2014 04:20:34PM 5 points [-]

Newcomblike problems occur whenever knowledge about what decision you will make leaks into the environment. The knowledge doesn't have to be 100% accurate, it just has to be correlated with your eventual actual action.

This is far too general. The way in which information is leaking into the environment is what separates Newcomb's problem from the smoking lesion problem. For your argument to work you need to argue that whatever signals are being picked up on would change if the subject changed their disposition, not merely that these signals are correlated with the disposition.

Comment author: [deleted] 19 September 2014 12:52:53AM 2 points [-]

Correlation-by-congruent-logic can show up in situations that don't necessarily have to do with minds, particularly the agent's mind, but the agent needs to either have an epistemology capable of noticing the correlations and equating them logically within its decision-making procedure -- TDT reaches in that direction.

Comment author: dankane 19 September 2014 01:37:26AM 0 points [-]

Sorry. I'm not quite sure what you're saying here. Though, I did ask for a specific example, which I am pretty sure is not contained here.

Though to clarify, by "reading your mind" I refer to any situation in which the scenario you face (including the given description of that scenario) depends directly on which program you are running and not merely upon what that program outputs.

Comment author: hairyfigment 18 September 2014 04:13:02PM 1 point [-]

No. BOT^CDT = DefectBot. It defects against any opponent. CDT could not cause it to cooperate by changing what it does.

Maybe I've misread you, but this sounds like an assertion that your counterfactual question is the right one by definition, rather than a meaningful objection.

Comment author: dankane 19 September 2014 12:59:57AM 0 points [-]

Well, yes. Then again, the game was specified as PD against BOT^CDT not as PD against BOT^{you}. It seems pretty clear that for X not equal to CDT that it is not the case that X could achieve the result CC in this game. Are you saying that it is reasonable to say that CDT could achieve a result that no other strategy could just because it's code happens to appear in the opponent's program?

I think that there is perhaps a distinction to be made between things that happen to be simulating your code and this that are causally simulating your code.

Comment author: nshepperd 18 September 2014 11:00:05PM *  0 points [-]

Well I can map different epistemic states to different outputs, I can implement the strategy cooperate if you are painted blue and defect if you are painted red. Of course UDT^Blue will reason the same way and they will fail to cooperate with each other.

No, because that's a silly thing to do in this scenario. For one thing, UDT will see that they are reasoning the same way (because they are selfish and only consider "my color" vs "other color"), and therefore will both do the same thing. But also, depending on the setup, UDT^Red's prior should give equal probability to being painted red and painted blue anyway, which means trying to make the outcome favour red is silly.

Compare to the version of newcomb's where the bot in the room is UDT^Red, while Omega simulates UDT^Blue. UDT can implement the conditional strategy {Red => two-box, Blue => one-box}. This is obviously unlikely, because the point of the Newcomb thought experiment is that Omega simulates (or predicts) you. So he would clearly try to avoid adding such information that "gives the game away".

However in this scenario you say that BOT simulates UDT "by coincidence", not by mind reading. So it is far more likely that BOT simulates (the equivalent) of UDT^Blue, while the UDT actually playing is UDT^Red. And you are passed the code of BOT as input, so UDT can simply implement the conditional strategy {cooperate iff the color inside BOT is the same as my color}.

Comment author: dankane 19 September 2014 12:42:57AM 0 points [-]

OK. Fine. Point taken. There is a simple fix though.

MBOT^X(Y) = X'(MBOT^X) where X' is X but with randomized irrelevant experiences.

In order to produce this properly, MBOT only needs to have your prior (or a sufficiently similar probability distribution) over irrelevant experiences hardcoded. And while your actual experiences might be complicated and hard to predict, your priors are not.

Comment author: nshepperd 18 September 2014 08:46:42AM 3 points [-]

I take it back, the scenario isn't that weird. But your argument doesn't prove what you think it does:

Consider the analogous scenario, where CDT plays against BOT = CDT(BOT). CDT clearly does the wrong thing here - it defects. If it cooperated, it would get CC instead of DD. Note that if CDT did cooperate, UDT would be able to freeload by defecting (against BOT = CDT(BOT)). But CDT doesn't care about that because the prisoner's dilemma is defined such that we don't care about freeloaders. Nevertheless CDT defects and gets a worse result than it could.

CDT does better than UDT against BOT = UDT(BOT) because UDT (correctly) doesn't care that CDT can freeload, and correctly cooperates to gain CC.

If you are claiming that you want a UDT that differs from the encoding in BOT because, of some irrelevant details in its memory...

Depending on the exact setup, "irrelevant details in memory" are actually vital information that allow you to distinguish whether you are "actually playing" or are being simulated in BOT's mind.

Comment author: dankane 18 September 2014 02:50:13PM 0 points [-]

No. BOT^CDT = DefectBot. It defects against any opponent. CDT could not cause it to cooperate by changing what it does.

If it cooperated, it would get CC instead of DD.

Actually if CDT cooperated against BOT^CDT it would get $3^^^3. You can prove all sorts of wonderful things once you assume a statement that is false.

Depending on the exact setup, "irrelevant details in memory" are actually vital information that allow you to distinguish whether you are "actually playing" or are being simulated in BOT's mind.

OK... So UDT^Red and UDT^Blue are two instantiations of UDT that differ only in irrelevant details. In fact the scenario is a mirror matchup, only after instantiation one of the copies was painted red and the other was painted blue. According to what you seem to be saying UDT^Red will reason:

Well I can map different epistemic states to different outputs, I can implement the strategy cooperate if you are painted blue and defect if you are painted red.

Of course UDT^Blue will reason the same way and they will fail to cooperate with each other.

Comment author: nshepperd 18 September 2014 07:10:47AM *  2 points [-]

Okay, you're saying here that BOT has a perfect copy of the UDT player's mind in its own code (otherwise how could it calculate UDT(BOT) and guarantee that the output is the same?). It's hard to see how this doesn't count as "reading your mind".

Yes, sometimes its advantageous to not control the output of computations in the environment. In this case UDT is worse off because it is forced to control both its own decision and BOT's decision; whereas CDT doesn't have to worry about controlling BOT because they use different algorithms. But this isn't due to any intrinsic advantage of CDT's algorithm. It's just because they happen to be numerically inequivalent.

An instance of UDT with literally any other epistemic state than the one contained in BOT would do just as well as CDT here.

Comment author: dankane 18 September 2014 07:54:58AM 1 point [-]

It's hard to see how this doesn't count as "reading your mind".

So... UDT's source code is some mathematical constant, say 1893463. It turns out that UDT does worse against BOT^1893463. Note that it does worse against BOT^1893463 not BOT^{you}. The universe does not depend on the source code of the person playing the game (as it does in mirror PD). Furthermore, UDT does not control the output of its environment. BOT^1893463 always cooperates. It cooperates against UDT. It cooperates against CDT. It cooperates everything.

But this isn't due to any intrinsic advantage of CDT's algorithm. It's just because they happen to be numerically inequivalent.

No. CDT does at least as well as UDT against BOT^CDT. UDT does worse when there is this numerical equivalence, but CDT does not suffer from this issue. CDT does at least as well as UDT against BOT^X for all X, and sometimes does better. In fact, if you only construct counterfactuals this way, CDT does at least as well as anything else.

An instance of UDT with literally any other epistemic state than the one contained in BOT would do just as well as CDT here.

This is silly. A UDT that believes that it is in a mirror matchup also loses. A UDT that believes it is facing Newcomb's problem does something incoherent. If you are claiming that you want a UDT that differs from the encoding in BOT because, of some irrelevant details in its memory... well then it might depend upon implementation, but I think that most attempted implementations of UDT would conclude that these irrelevant details are irrelevant and cooperate anyway. If you don't believe this then you should also think that UDT will defect in a mirror matchup if it and its clone are painted different colors.

Comment author: lackofcheese 18 September 2014 06:54:27AM *  1 point [-]

Actually, you've oversimplified and missed something critical. In reality, the only way you can force BOT^UDT(X) = UDT(BOT^UDT) = C is if the universe does, in fact, read your mind. In general, UDT can map different epistemic states to different actions, so as long as BOT^UDT has no clue about the epistemic state of the UDT agent it has no way of guaranteeing that its output is the same as that of the UDT agent. Consequently, it's possible for the UDT agent to get DC as well. The only way BOT^UDT would be able to guarantee that it gets the same output as a particular UDT agent is if the universe was able to read the UDT agent's mind.

Comment author: dankane 18 September 2014 06:58:37AM *  0 points [-]

Actually, I think that you are misunderstanding me. UDT's current epistemic state (at the start of the game) is encoded into BOT^UDT. No mind reading involved. Just a coincidence. [Really, your current epistemic state is part of your program]

Your argument is like saying that UDT usually gets $1001000 in Newcomb's problem because whether or not the box was full depended on whether or not UDT one-boxed when in a different epistemic state.

Comment author: lackofcheese 18 September 2014 04:02:05AM 2 points [-]

What is your point, exactly?

It's clear that UDT can't do better vs "BOT" than by cooperating, because if UDT defects against BOT then BOT defects against UDT. Given that dependency, you clearly can't call it CooperateBot, and it's clear that UDT makes the right decision by cooperating with it because CC is better than DD.

Comment author: dankane 18 September 2014 05:39:06AM *  1 point [-]

OK. Let me say this another way that involves more equations.

So let's let U(X,Y) be the utility that X gets when it plays prisoner's dilemma against Y. For a program X, let BOT^X be the program where BOT^X(Y) = X(BOT^X). Notice that BOT^X(Y) does not depend on Y. Therefore, depending upon what X is BOT^X is either equivalent CooperateBot or equivalent to DefectBot.

Now, you are claiming that UDT plays optimally against BOT_UDT because for any strategy X U(X, BOT^X) <= U(UDT, BOT^UDT) This is true, because X(BOT^X) = BOT^X(X) by the definition of BOT^X. Therefore you cannot do better than CC. On the other hand, it is also true that for any X and any Y that U(X,BOT^Y) <= U(CDT, BOT^Y) This is because BOT^Y's behavior does not depend on X, and therefore you do optimally by defecting against it (or you could just apply the Theorem that says that CDT wins if the universe cannot read your mind).

Our disagreement here stems from the fact that we are considering different counterfactuals here. You seem to claim that UDT behaves correctly because U(UDT,BOT^UDT) > U(CDT,BOT^CDT) While I claim that CDT does because U(CDT, BOT^UDT) > U(UDT, BOT^UDT)

And in fact, given the way that I phrased the scenario, (which was that you play BOT^UDT not that you play BOT^{you} (i.e. the mirror matchup)) I happen to be right here. So justify it however you like, but UDT does lose this scenario.

View more: Prev | Next