All of eapi's Comments + Replies

the order-dimension of its preference graph is not 1 / it passes up certain gains

If the order dimension is 1, then the graph is a total order, right? Why the conceptual detour here?

Loosely related to this, it would be nice to know if systems which reliably don't turn down 'free money' must necessarily have almost-magical levels of internal coordination or centralization. If the only things which can't (be tricked into) turn(ing) down free money when the next T seconds of trade offers are known are Matrioshka brains at most T light-seconds wide, does that tell us anything useful about the limits of that facet of dominance as a measure of agency?

I'm not convinced that the specifics of "why" someone might consider themselves a plural smeared across a multiverse are irrelevant. MWI and the dynamics of evolving amplitude are a straightforward implication of the foundational math of a highly predictive theory, whereas the different flavors of classical multiverse are a bit harder to justify as "likely to be real", and it's also harder to be confident about their implications.

If I do the electron-spin thing I can be fairly confident of the future existence of a thing-which-claims-to-be-me experiencing both ou... (read more)

1TAG
No, they are not straightforward; MWI is controversial and subject to ongoing research.
4JBlack
Fully agreed, I wasn't trying to say that there are just as good justifications for a classical multiverse as a quantum multiverse. Just that it's the "multiverse" part that's more relevant than the "quantum" part. If you accept multiverses at all, most types include the possibility that there may be indistinguishable pre-flip versions of 'you' that experience different post-flip outcomes.

If I push the classical uncertainty into the past by, say, shaking a box with the coin inside and sticking it in a storage locker and waiting a year (or seeding a PRNG a year ago and consulting that) then even though the initial event might have branched nicely, right now that cluster of sufficiently-similar Everett branches is facing the same situation as in the original question, right? Assuming enough chaotic time has passed that the various branches from the original random event aren't using that randomness for the same reason.

I understand from things like this that it doesn't take a lot of (classical) uncertainty or a lot of time for a system to become unpredictable at scale, but for me that pushes the question down to annoying concrete follow-ups like:

  • My brain and arm muscles have thermal noise, but they must be somewhat resilient to noise, so how long does it take for noise at one scale (e.g. ATP in a given neuron) to be observable at another scale (e.g. which word I say, what thought I have, how my arm muscle moves)?
  • More generally, how effective are "noise control" mechanisms
... (read more)
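For what it's worth, the "small perturbations blow up fast" intuition is easy to sketch with a toy chaotic system. This is purely illustrative (a logistic map, not a model of neurons or muscles; all names and constants are invented for the sketch):

```python
# Toy sketch only: the logistic map at r = 4 is chaotic, so two
# trajectories that initially differ by 1e-12 diverge to order-1
# separation within a few dozen iterations (Lyapunov exponent ~ ln 2).

def logistic(x, r=4.0):
    return r * x * (1.0 - x)

def steps_to_diverge(x0, eps=1e-12, threshold=0.5, max_steps=1000):
    """First step at which trajectories from x0 and x0 + eps differ
    by more than `threshold`."""
    a, b = x0, x0 + eps
    for step in range(max_steps):
        if abs(a - b) > threshold:
            return step
        a, b = logistic(a), logistic(b)
    return max_steps

n = steps_to_diverge(0.2)
print(n)  # a few dozen steps, despite the 1e-12 initial difference
```

The open question in the bullets above is how well biological "noise control" lengthens this horizon, not whether chaotic amplification happens at all.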
2Viliam
I think the quantum uncertainty can propagate to large scale relatively fast, like on the scale of minutes. If we take an identical copy of you (in an identical copy of the room, isolated from the rest of the universe), and five minutes later you flip a coin, the result will be random, as the quantum uncertainty has propagated through your neurons and muscle fibers. (Not sure about this. I am not an expert, I just vaguely remember reading this somewhere.) Usually we do not notice this, because for non-living things, such as rocks, a few atoms moved here or there does not matter on the large scale; on the other hand, living things have feedback and homeostasis, keeping them in some reasonable range. However, things like "flipping a coin" are designed to be sensitive to noise. The same is true for pinball.

For me the only "obvious" takeaway from this re. quantum immortality is that you should be more willing to play quantum Russian roulette than classical Russian roulette. Beyond that, the topic seems like something where you could get insights by just Sitting Down and Doing The Math, but I'm not good enough at math to do the math.

...wait, you were just asking for an example of an agent being "incoherent but not dominated" in those two senses of being money-pumped? And this is an exercise meant to hint that such "incoherent" agents are always dominatable?

I continue to not see the problem, because the obvious examples don't work. If I have  as incomparable to  that doesn't mean I turn down the trade of  (which I assume is what you're hinting at re. foregoing free money).
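To sketch that point concretely (a toy in Python, with made-up bundle names): an agent whose preferences are incomplete, and which only accepts strictly preferred trades, refuses every swap in an attempted pump and so never leaks money:

```python
# Toy agent with incomplete preferences over (good, money) bundles:
# more money with the *same* good is strictly preferred; bundles with
# different goods are incomparable. Accepting only strict improvements,
# it can't be cycled A -> B -> A at a fee.

def strictly_prefers(new, old):
    good_new, money_new = new
    good_old, money_old = old
    return good_new == good_old and money_new > money_old

def run_offers(start, offers):
    state = start
    for proposed in offers:
        if strictly_prefers(proposed, state):
            state = proposed
    return state

# Adversary alternates "swap your good for the other one, minus a fee".
final = run_offers(("A", 10), [("B", 9), ("A", 8), ("B", 7), ("A", 6)])
print(final)  # ("A", 10): every offer was incomparable, none accepted
```

Whether refusing all of those swaps counts as "foregoing free money" is exactly the disagreement here.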

If one then sa... (read more)

Hmm, I was going to reply with something like "money-pumps don't just say something about adversarial environments, they also say something about avoiding leaking resources" (e.g. if you have circular preferences between proximity to apples, bananas, and carrots, then if you encounter all three of them in a single room you might get trapped walking between them forever) but that's also begging your original question - we can always just update to enjoy leaking resources, transmuting a "leak" into an "expenditure".
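A minimal sketch of that "trapped walking forever" picture (toy Python, invented names): circular strict preferences plus a per-move cost means resources leak without bound:

```python
# Circular preferences: apple < banana < carrot < apple. A greedy agent
# always moves toward the fruit preferred over its current one, paying
# 1 unit of energy per move, and never settles.

PREFERS_NEXT = {"apple": "banana", "banana": "carrot", "carrot": "apple"}

def greedy_walk(start, steps):
    position, energy_spent = start, 0
    for _ in range(steps):
        position = PREFERS_NEXT[position]
        energy_spent += 1
    return position, energy_spent

pos, spent = greedy_walk("apple", 9)
print(pos, spent)  # back at "apple", 9 units of energy gone
```

Of course, as noted above, nothing stops us from declaring the walking itself an "expenditure" the agent endorses rather than a "leak".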

Another frame here is that if you make/enco... (read more)

(I'm not EJT, but for what it's worth:)

I find the money-pumping arguments compelling not as normative arguments about what preferences are "allowed", but as engineering/security/survival arguments about what properties of preferences are necessary for them to be stable against an adversarial environment (which is distinct from what properties are sufficient for them to be stable, and possibly distinct from questions of self-modification).

1keith_wynroe
Yeah I agree that even if they fall short of normative constraints there’s some empirical content around what happens in adversarial environments. I think I have doubts that this stuff translates to thinking about AGIs too much though, in the sense that there’s an obvious story of how an adversarial environment selected for (partial) coherence in us, but I don’t see the same kinds of selection pressures being a force on AGIs. Unless you assume that they’ll want to modify themselves in anticipation of adversarial environments which kinda begs the question

The rock doesn't seem like a useful example here. The rock is "incoherent and not dominated" if you view it as having no preferences and hence never acting out of indifference, it's "coherent and not dominated" if you view it as having a constant utility function and hence never acting out of indifference, OK, I guess the rock is just a fancy Rorschach test.

IIUC a prototypical Slightly Complicated utility-maximizing agent is one with, say, , and a prototypical Slightly Complicated not-obviously-pumpable non-utility... (read more)

This is pretty unsatisfying as an expansion of "incoherent yet not dominated" given that it just uses the phrase "not coherent" instead.

I find money-pump arguments to be the most compelling ones since they're essentially tiny selection theorems for agents in adversarial environments, and we've got an example in the post of (the skeleton of) a proof that a lack-of-total-preferences doesn't immediately lead to you being pumped. Perhaps there's a more sophisticated argument that Actually No, You Still Get Pumped but I don't think I've seen one in the comments... (read more)

5Eliezer Yudkowsky
Things are dominated when they forego free money and not just when money gets pumped out of them.

Say more about what counts as incoherent yet not dominated? I assume "incoherent" is not being used here as an alias for "non-EU-maximizing" because then this whole discussion is circular.

6Eliezer Yudkowsky
Suppose I describe your attempt to refute the existence of any coherence theorems:  You point to a rock, and say that although it's not coherent, it also can't be dominated, because it has no preferences.  Is there any sense in which you think you've disproved the existence of coherence theorems, which doesn't consist of pointing to rocks, and various things that are intermediate between agents and rocks in the sense that they lack preferences about various things where you then refuse to say that they're being dominated?

Here's a very late follow-up: the rationale behind linearity for Shapley values seems closely related to the rationale behind the independence axiom of VNM rationality, and under some decision theories we apparently can dispense with the latter.

This gives me the vocabulary for expressing why I find linearity constraining: if I'm about to play game  or game  with probabilities  and  respectively, and my payout of  is lower, maybe I would prefer to get a lower payout in  in exchange for a hi... (read more)
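To make the linearity (additivity) property concrete, here is a brute-force Shapley computation. This is a sketch in Python; the two games `v` and `w` are arbitrary toy examples, and the check is that phi(v + w) = phi(v) + phi(w):

```python
# Shapley value by averaging marginal contributions over all player
# orderings, then checking additivity: phi(v + w) = phi(v) + phi(w).

from itertools import permutations

def shapley(players, v):
    """`v` maps a frozenset coalition to its worth; returns each
    player's average marginal contribution across all orderings."""
    phi = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            phi[p] += v(coalition | {p}) - v(coalition)
            coalition = coalition | {p}
    return {p: total / len(orders) for p, total in phi.items()}

players = ["a", "b"]
worths = {frozenset(): 0, frozenset({"a"}): 1,
          frozenset({"b"}): 2, frozenset({"a", "b"}): 4}
v = lambda S: worths[S]
w = lambda S: len(S) ** 2          # second toy game
vw = lambda S: v(S) + w(S)         # the summed game

phi_v, phi_w, phi_vw = shapley(players, v), shapley(players, w), shapley(players, vw)
assert all(abs(phi_vw[p] - (phi_v[p] + phi_w[p])) < 1e-9 for p in players)
print(phi_v, phi_w, phi_vw)
```

The analogy above is then: just as independence forbids preferences over how outcomes are spread across lotteries, additivity forbids payment rules that care about how worth is spread across component games.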

...huh. So UDT in general gets to just ignore the independence axiom because:

  • UDT's whole shtick is credibly pre-committing to seemingly bad choices in some worlds in order to get good outcomes in others, and/or
  • UDT is optimizing over policies rather than actions, and I guess there's nothing stopping us having preferences over properties of the policy like fairness (instead of only ordering policies by their "ground level" outcomes).
    • And this is where  comes in, it's one way of encoding something-like-fairness.

Sound about right?

7Scott Garrabrant
yep

I'm confused by the "no dutch book" argument. Pre-California-lottery-resolution, we've got , but post-California-lottery-resolution we simultaneously still have  and "we refuse any offer to switch from  to ", which makes me very uncertain what  means here.

Is this just EDT vs UDT again, or is the post-lottery  subtly distinct from the pre-lottery one,  or is "if you see yourself about to be dutch-booked, just suck it up and be sad" a generally accepted solution to otherwise being DB'd, or ... (read more)

4Scott Garrabrant
I think it is EDT vs UDT. We prefer B to A, but we prefer CA to CB, not because of dutch books, but because CA is good enough for Bob to be fair, and A is not good enough for Bob.

But if your utility function is bounded, as it apparently should be, then you're one affine transform away from being able to use geometric rationality, no?

1[comment deleted]
5Eric Neyman
How much should you shift things by? The geometric argmax will depend on the additive constant.
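Eric Neyman's point is easy to demonstrate with a toy sketch (Python, arbitrary made-up numbers): shifting all utilities by a constant leaves the arithmetic argmax alone but can flip the geometric one:

```python
# Two options, each paying out across two equally likely worlds.
# Arithmetic argmax is invariant to adding a constant c to every
# utility; the geometric argmax is not.

from math import prod

def arith_best(options):
    return max(options, key=lambda name: sum(options[name]) / len(options[name]))

def geo_best(options):
    return max(options, key=lambda name: prod(options[name]) ** (1 / len(options[name])))

options = {"X": (1, 9), "Y": (4, 4)}
shifted = {name: tuple(u + 10 for u in us) for name, us in options.items()}

print(arith_best(options), arith_best(shifted))  # "X" both times
print(geo_best(options), geo_best(shifted))      # "Y" before the shift, "X" after
```

So "just apply an affine transform to a bounded utility function" underdetermines the answer: each choice of additive constant gives a different geometric argmax.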

I agree my "fix" is insufficient - in fact I'd go so far as agreeing with JBlack below that including it was net negative to the question.

I'd like to pin down what you mean by your description of a more complete model, I hope you don't mind.

Let me flesh out the restaurant story. The actors are  (me) and  (my friend). The restaurants are  and . There are two events we care about: the first is me and my friend choosing the lottery parameter , and the second is actually running the lottery.

After picking ... (read more)

Fair point re. focusing on a specific formula, I'll remove that from the post.

Hmm, I'm not sure what I should be taking away from that. You've pointed out that the morning and evening lotteries are materially different, but that's not contentious to me: if uncertainty has costs then those costs have to show up as differences in the world compared to a world without that uncertainty.

I guess the restaurant story failed to focus on the-bit-that's-weird-to-me, which is that if my friend and I were negotiating over the lottery parameter , then my mental model of the expected utility boundary as  varies is not a straight ... (read more)

4JBlack
Yes, I'm not contending against your fundamental point. In fact, I think that the curve from 0 to 1 can be even stranger than that with discontinuities in it, and that under some circumstances it can even have parts that go above the straight line. Focusing on a specific formula based on entropy doesn't really match reality and detracts from the main point.

I wasn't asking "what payment rules still satisfy the three remaining properties", I was asking "what other payment rules are there which satisfy the three remaining properties but not additivity" (with bonus questions "what other properties of Shapley values do we still get just from those three properties" and "what properties other than additivity can we add to those three properties which again pin down a unique rule").

My aim here, which I admit is nebulous, is to get a rough overview of the space of different payment rules (for example, this answer on... (read more)

Yep, that's pretty much it, but with the added bonus of a concrete motivating example. Thanks!