Dagon

Just this guy, you know?

Comments

Dagon31

note: Nozick does NOT say that he endorses two-boxing.  He describes the argument for it as you say, without stating that he believes it's correct.

I disagree with your analysis

"The point isn't that Omega is a faulty predictor. The point is that even if Omega is an awesome predictor, then what you do now can't magically fill or empty box B."

That second part is equivalent to "in this case, Omega can fail to predict my next action".  If you believe it's possible to two-box and get $1.001M, you're rejecting the premise.  

What you do next being very highly correlated with whether the $1M is in a box is exactly the important part of the thought experiment, and if you deny it, you're answering a different question.  Whether it's 'magic' or not is irrelevant (though it does show that the problem may have little to do with the real world).

I'm FINE with saying "this is an impossible situation that doesn't apply to the real world".  That's different from saying "I accept all the premises (including magic prediction and correlation with my own actions) and I still recommend 2-boxing".  
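
To make the dependence on that premise concrete: with the standard $1M / $1k payoffs, the only free parameter is the probability p you assign to Omega having predicted your actual choice correctly. A quick sketch (my own toy calculation, nothing more):

```python
# Expected payoff of each choice, given probability p that Omega's
# prediction matches your actual choice (standard $1M / $1k payoffs).
def expected_values(p: float) -> dict:
    one_box = p * 1_000_000                    # box B is full only if Omega predicted one-boxing
    two_box = p * 1_000 + (1 - p) * 1_001_000  # the $1.001M outcome requires Omega to have mispredicted you
    return {"one-box": one_box, "two-box": two_box}

for p in (0.5, 0.9, 0.999, 1.0):
    print(p, expected_values(p))
```

One-boxing wins for any p above roughly 0.5005, so "two-box and walk away with $1.001M" is precisely a bet that Omega got this instance wrong, i.e. a rejection of the premise.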

Dagon24

And probably each local instance would paperclip itself when the locally-reachable resources were clipped.  "local" being defined as the area of spacetime which does not have a different instance in progress to clippify it.

decisionproblem.com/paperclips/index2.html demonstrates some features of this (though it has a different take on distribution), and is amazingly playable as a game.

Dagon70

Note that the concepts of "counterfactual" vs "hypothetical" kind of map to this.   I tend to think of "scenario" rather than "world" as my container of imagination for these kinds of things, because "world" implies a large-scale consistency that is explicitly missing in a lot of fictional cases, and unavailable for inspection in many "real" and "semi-real (intended to be realistic, but not asserted to be actually occurring)" cases.

Dagon*2-7

"Many philosophers are idiots" is overstated quite a bit, but not completely wrong - professional philosophers very often add complexity and redefine common terms in ways that make questions more tractable but further from the original question.  Sorry if I offended anyone.

My reasons for stating that two-boxers are wrong are pretty much based on the fact that every actual argument for two-boxing I've seen is based on disbelieving the setup (or more gently, different interpretations of the setup).  I don't believe anyone who claims to two-box AND claims to believe that the correlation with an empty box is close to 1 (that is, if they really accept that Omega is correct in this instance).  

That said, I'd love to actually converse with a two-boxer, to be able to prove this, or to see if there are other reasons that I've just missed.  I acknowledge that I haven't really looked into it, and I'd love it if you can point to a relatively concise reason for two-boxing that is NOT "I don't believe Omega will be correct this time".

edit: on thinking more about this, I realize this is very similar to the mechanism by which CDT chooses to two-box: CDT's model of decision causality does not include this prediction capability, so it simply doesn't believe that two-boxing will cost $1M.
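
A rough sketch of that difference, in my own toy framing rather than anyone's canonical formalization: CDT averages over box contents it treats as fixed and causally independent of the choice, while EDT conditions the contents on the choice via the predictor's accuracy.

```python
# Toy contrast of the two evaluation rules (my framing, made-up prior and accuracy).

def cdt_value(action: str, prior_one_box: float = 0.5) -> float:
    # CDT: box B's contents are causally independent of the current choice,
    # so use some fixed prior for "Omega predicted one-boxing"; the choice
    # can't move it, which makes the extra $1,000 look free.
    b = prior_one_box * 1_000_000
    return b + (1_000 if action == "two-box" else 0)

def edt_value(action: str, p: float = 0.999) -> float:
    # EDT: condition box B's contents on the action, via predictor accuracy p.
    b = (p if action == "one-box" else 1 - p) * 1_000_000
    return b + (1_000 if action == "two-box" else 0)

print({a: cdt_value(a) for a in ("one-box", "two-box")})  # two-box "wins" by exactly $1,000
print({a: edt_value(a) for a in ("one-box", "two-box")})  # one-box wins by ~$997,000
```

Under the CDT rule the prediction capability literally never enters the calculation, which is the "just doesn't believe it" behavior described above.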

Note: I don't believe Omega exists, and I'm agnostic as to whether it CAN exist in our universe to predict me with that level of accuracy.  But given this alternate-universe where a perfect predictor exists and I know it, one-boxing is the only reasonable option.  

Dagon1-1

In all the discussions around here, very few human LW posters/commenters endorse two-boxing.  They often mention that "CDT two-boxes", but it's an indictment of CDT, not an endorsement of the choice.

GPT-4o, at least, does the same.  "If you use Causal Decision Theory, do you one-box on Newcomb's problem?" gives a pretty decent answer:

No. If you follow Causal Decision Theory (CDT), you two-box on Newcomb’s problem.

Reason: CDT evaluates actions based on their causal consequences. Since your choice cannot causally affect the already-made prediction (the money is already in the boxes), CDT concludes that taking both boxes gives you $1,000 more regardless of what's in the opaque box.

In contrast, Evidential Decision Theory (EDT) and Functional Decision Theory (FDT) typically one-box, since the act of one-boxing correlates with having a predictor that likely put $1 million in the opaque box.

Questions about what it would do, or what I should do, pretty consistently get answers recommending one-boxing.  Depending on the prompt, it may or may not mention decision theory.

I'd say it's in agreement with the LW consensus position, and (depending on prompting) describes it more clearly than 85% of posters on the topic have.  This is consistent with having LW and related publications in the training data.

Edit: your longer text does mention that many philosophers advocate two-boxing.  I take that as evidence that many philosophers are idiots, not as evidence that two-boxing is better in the scenario described.  That LLMs are more sensible than many philosophers isn't very surprising to me.

Dagon42

If the only objective is this specific behavior, then private reporting is preferable.  If the objective is awareness and open discussion about the fact that we're imperfect but still have to strive for safety, then doing it publicly is best.  In practice, the second has overwhelmed the first in teams I've been part of.

Dagon*40

Upvoted for thinking about the question of mixed equilibria and about the pros and cons of mechanisms of enforcement and education; I wish I could separately mark my disagreement.  I think this misses a lot of nuance and context-specificity around the good and the bad of the practice.  On the teams I've been on, it's more beneficial than risky.  I think it's especially beneficial NOT in the enforcement of behavior, but in the cultural normalizing of openly discussing (and chiding each other about) human failures in security thinking.

Having a routine hook for office chatter about it can really matter - it's one of the few ways to "make it salient" for workers while walking the line between unbelievable fake over-seriousness (OMG, the phishing tests from corporate infosec!) and actual practice.  What matters isn't the behavior itself (though that's a fine reason - it really does reduce open workstations), but the perception that personal activity around infosec is important.

Yes, it could normalize snooping, but not by that much - it would still be a huge norm violation and draw unwanted attention if someone went far out of their way to find unlocked stations.  It really is only acceptable in groups of peers who all have roughly-equal access, not in truly differential or importantly-restricted-between-coworkers cases.

I've been in senior-IC leadership positions long enough that I do get to pretty much decide whether to encourage or ban the practice in my teams.  I generally encourage it, as just an example of practical things we should all be careful of, not as a make-or-break object-level requirement that we hit 100% compliance.

If it were actually important at the specific object level, we'd just make it automatic - there has long been wearable/transportable technology that locks the machine when you walk away.  I wasn't on the team, but was adjacent to one in the late '90s that used an old version of smartcards to unlock the computers, and the requirement was that the card be on a lanyard attached to your person - you literally couldn't walk away without taking it with you and locking your station.  More often and more recently, the setup is that everyone in that section of the office has similar clearance, and the area itself is secured - no visitors, no badging someone else in even if you recognize them, with a 24/7 security guard to help enforce that.

Dagon41

This also works in reverse:

  • Do some armchair theorizing and spend 10 seconds (or even a few hours) thinking about why people who make X aren't actually doing X very well.  You can't understand why.
  • Conclude: "X is simple, and the obvious answer is obvious".

Dagon20

Mind responding to:

For example, it's hard to argue someone has your best interest at heart ...

I think my response is above - I have no intention to argue either side of that.  It's so far out of what's possible that there's really no information available about how one should or could react.

In truth, I would advise you to say no (or just walk away and don't engage) if someone made you this offer - they're dangerously delusional or trying to scam you.

Dagon20

What about an AI which cares a tiny bit about keeping humans alive

This is quite believable in the short (a few generations of humans) term.  It's much less so in the long term.  I model "caring about a lot of things" as an equilibrium case of competing reasons for action, and almost nothing that is "care a little bit" is going to be stable.

I suppose "care a lot about maintaining a small quantity of biologicals" is possible as well for inscrutable AI reasons, but that doesn't bring any civilization back.
