All of Scott Garrabrant's Comments + Replies

I proposed this same voting system here: https://www.lesswrong.com/s/gnAaZtdwjDBBRpDmw


It is not strategy proof. If it were, that would violate https://en.wikipedia.org/wiki/Gibbard–Satterthwaite_theorem [Edit: I think, for some version of the theorem. It might not literally violate it, but I also believe you can make a small example that demonstrates it is not strategy proof. This is because the equilibrium sometimes extracts all the value from a voter until they are indifferent, and if they lie about their preferences less value can be extracted.]

Further, […]

This post does not prove Maximal Lottery Lotteries exist. Instead, it redefines MLL to be equivalent to the Nash bargaining solution (in a way that is obscured by using the same language as the MLL proposal), and then claims that under the new definition MLL exist (because the Nash bargaining solution exists).

I like Nash bargaining, and I don't like majoritarianism, but the MLL proposal is supposed to be a steelman of majoritarianism, and Nash bargaining is not only not MLL, but it is not even majoritarian. (If a majority of voters have the same favorite candidate, this is not sufficient to make this candidate win in the Nash bargaining solution.)
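The parenthetical claim can be made concrete with a toy electorate (my own numbers, not from the original thread; I take Nash bargaining to mean maximizing the product of utilities with disagreement point 0, which is an assumption):

```python
# Toy electorate: voters 1 and 2 narrowly favor A; voter 3 gets nothing from A.
# Nash bargaining here = maximize the product of utilities (disagreement point 0)
# over lotteries, parameterized by p = P(A wins).
utilities = [(1.0, 0.9), (1.0, 0.9), (0.0, 1.0)]  # (u(A), u(B)) per voter

def nash_product(p):
    prod = 1.0
    for uA, uB in utilities:
        prod *= p * uA + (1 - p) * uB
    return prod

# Grid search over lotteries between A and B.
best_p = max((i / 1000 for i in range(1001)), key=nash_product)
print(best_p)  # 0.0: the Nash solution elects B outright, against the majority
```

A 2-of-3 majority has A as its favorite, yet the product-maximizing lottery puts probability 0 on A, which is exactly the sense in which Nash bargaining is not majoritarian.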

5Lorxus
Dang. I wasn't entirely sure whether you were firm on the definition of lottery-lottery dominance or if that was more speculative. I guess I wasn't clear that MLLs were specifically meant to be "majoritarianism but better"? Given that you meant for it to be, this post sure doesn't prove that they exist. You're absolutely right that you can cook up electorates where the majority-favored candidate isn't the Nash bargaining/Geometric MLL favored candidate.

More like illuminating ontologies than great predictions, but yeah.

I think Chris Langan and the CTMU are very interesting, and I think there is an interesting and important challenge for LW readers to figure out how (and whether) to learn from Chris. Here are some things I think are true about Chris (and about me) and relevant to this challenge. (I do not feel ready to talk about the object-level CTMU here, I am mostly just talking about Chris Langan.)

  1. Chris has a legitimate claim of being approximately the smartest man alive according to IQ tests.
  2. Chris wrote papers/books that make up a bunch of words that are defined circularly […]

In particular, I think this manifests in part as an extreme lack of humility.

I just want to note that, based on my personal interactions with Chris, I experience Chris's "extreme lack of humility" similarly to how I experience Eliezer's "extreme lack of humility": 

  1. in both cases, I think they have plausibly calibrated beliefs about having identified certain philosophical questions that are of crucial importance to the future of humanity, that most of the world is not taking seriously,[1] leading them to feel a particular flavor of frustration that
[…]
2David Udell
Would you kindly explain this? Because you think some of his world-models independently throw out great predictions, even if other models of his are dead wrong?

So, I am trying to talk about the preferences of the couple, not the preferences of either individual. You might reject that the couple is capable of having preference, if so I am curious if you think Bob is capable of having preferences, but not the couple, and if so, why?

I agree that if you can do arbitrary utility transfers between Alice and Bob at a given exchange rate, then they should maximize the sum of their utilities (at that exchange rate) and do a side transfer. However, I am assuming here that efficient compensation is not possible. I specifically made it a relatively big decision, so that compensation would not obviously be possible.
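The distinction can be sketched with hypothetical payoffs (my numbers; the Nash-product rule with disagreement point 0 stands in for the couple's compromise):

```python
# Hypothetical payoffs for two options (e.g. two cities the couple could move to):
#             Alice  Bob
options = {"X": (10, 0), "Y": (4, 5)}

# With free utility transfers: maximize the sum, then compensate on the side.
best_sum = max(options, key=lambda o: sum(options[o]))  # "X" (total 10 vs 9)

# Without transfers: a Nash-bargaining-style compromise (product of utilities,
# disagreement point 0 -- an assumption) avoids zeroing anyone out.
best_nash = max(options, key=lambda o: options[o][0] * options[o][1])  # "Y"

print(best_sum, best_nash)  # X Y
```

When side payments are possible, X plus a transfer Pareto-dominates Y; when they are not, the sum-maximizing option leaves Bob with nothing, which is why the compromise rule picks differently.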

1Closed Limelike Curves
Whether the couple is capable of having preferences probably depends on your definition of “preferences.” The more standard terminology for preferences by a group of people is “social choice function.” The main problem we run into is that social choice functions don’t behave like preferences.
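One standard way social choice functions fail to behave like preferences is Condorcet's paradox: pairwise majority "preference" can be cyclic, hence not a preference order at all. The classic three-voter example:

```python
# Classic Condorcet electorate: each row is one voter's ranking, best first.
ballots = [("A", "B", "C"), ("B", "C", "A"), ("C", "A", "B")]

def majority_prefers(x, y):
    """True if a majority of voters rank x above y."""
    votes = sum(1 for b in ballots if b.index(x) < b.index(y))
    return votes > len(ballots) / 2

# Pairwise majority is cyclic: A beats B, B beats C, and C beats A,
# so it cannot come from a transitive preference relation.
print(majority_prefers("A", "B"),  # True
      majority_prefers("B", "C"),  # True
      majority_prefers("C", "A"))  # True
```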

Here are the most interesting things about these objects to me that I think this post does not capture. 

Given a distribution over non-negative non-identically-zero infrafunctions, up to a positive scalar multiple, the pointwise geometric expectation exists, and is an infrafunction (up to a positive scalar multiple).

(I am not going to give all the math and be careful here, but hopefully this comment will provide enough of a pointer if someone wants to investigate this.)

This is a bit of a miracle. Compare this with the arithmetic expectation of utility functions […]
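A numeric sanity check of the closure property in a simple special case (my own construction, with two arbitrary functions): the pointwise geometric mean of two nonnegative concave functions stays concave.

```python
import math
import random

random.seed(0)

# Two arbitrary nonnegative concave functions on [0, 1] (my choices).
f = lambda x: 1.0 - x * x   # concave, >= 0 on [0, 1]
g = lambda x: 0.5 + x       # linear (hence concave), > 0

def geo(x):
    """Pointwise geometric mean, i.e. the geometric expectation of {f, g}
    with equal weights."""
    return math.sqrt(f(x) * g(x))

# Midpoint-concavity check on random mixtures. Arithmetic means of concave
# functions are trivially concave; the geometric mean staying concave is the
# less obvious closure property.
for _ in range(10_000):
    x, y, t = random.random(), random.random(), random.random()
    mix = t * x + (1 - t) * y
    assert geo(mix) >= t * geo(x) + (1 - t) * geo(y) - 1e-9
print("concavity check passed")
```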

I have been thinking about this same mathematical object (although with a different orientation/motivation) as where I want to go with a weaker replacement for utility functions.

I get the impression that for Diffractor/Vanessa, the heart of a concave-value-function-on-lotteries is that it represents the worst case utility over some set of possible utility functions. For me, on the other hand, a concave value function represents the capacity for compromise -- if I get at least half the good if I get what I want with 50% probability, then I have the capacity […]

Then it is equivalent to the thing I call B2 in edit 2 in the post (Assuming A1-A3).

In this case, your modified B2 is my B2, and your B3 is my A4, which follows from A5 assuming A1-A3 and B2, so your suspicion that these imply C4 is stronger than my Q6, which is false, as I argue here.

However, without A5, it is actually much easier to see that this doesn't work. The counterexample here satisfies my A1-A3, your weaker version of B2, your B3, and violates C4.

Your B3 is equivalent to A4 (assuming A1-3).

Your B2 is going to rule out a bunch of concave functions. I was hoping to only use axioms consistent with all (continuous) concave functions.

2Vanessa Kosoy
Oops. What if instead of "for any p" we go with "there exists p"?

I am skeptical that it will be possible to salvage any nice VNM-like theorem here that makes it all the way to concavity. It seems like the jump necessary to fix this counterexample will be hard to express in terms of only a preference relation.

The answers to Q3, Q4 and Q6 are all no. I will give a sketchy argument here.

Consider the one dimensional case, where the lotteries are represented by real numbers in the interval , and consider the function  given by . Let  be the preference order given by  if and only if .

 is continuous and quasi-concave, which means  is going to satisfy A1, A2, A3, A4, and B2. Further, since  is monotonically increasing up to the unique argmax, and […]


You can also think of A5 in terms of its contrapositive: For all , if , then for all 

This is basically just the strict version of A4. I probably should have written it that way instead. I wanted to use  instead of , because it is closer to the base definition, but that is not how I was natively thinking about it, and I probably should have written it the way I think about it.

Alex's counterexample as stated is not a counterexample to Q4, since it is in fact concave.
 

I believe your counterexample violates A5, taking , and .

1James Payor
Seems right, oops! A5 is here saying that if any part of my u is flat it had better stay flat! I think I can repair my counterexample but looks like you've already found your own.

That does not rule out your counterexample. The condition is never met in your counterexample.

6AlexMennen
Oh, derp. You're right.

The answer to Q1 is no, using the same counterexample here. However, the spirit of my original question lives on in Q4 (and Q6).

Claim: A1, A2, A3, A5, and B2 imply A4.

Proof: Assume we have a preference ordering that satisfies A1, A2, A3, A5, and B2, and consider lotteries , and , with . Let . It suffices to show . Assume not, for the purpose of contradiction. Then (by axiom A1), . Thus by axiom B2 there exists a  such that . By axiom A3, we may assume  for some . Observe that  where  is positive, since otherwise […]

Oh, nvm, that is fine, maybe it works.

Oh, no, I made a mistake, this counterexample violates A3. However, the proposed fix still doesn't work, because you just need a function that is decreasing in probability of x, but does not hit 0, and then jumps to 0 when probability of x is 1.
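The kind of function described here can be sanity-checked numerically; the instance below is my own construction: u(p) = 1 - p/2 for p < 1 and u(1) = 0, where p is the probability of x.

```python
def u(p):
    """A value function on lotteries over {x, y}, where p is the probability
    of x: decreasing, never hitting 0 before p = 1, then jumping to 0 at
    p = 1 (one instance of the construction in the comment above)."""
    return 0.0 if p == 1.0 else 1.0 - p / 2

# Concavity holds as an inequality even across the jump...
steps = [i / 100 for i in range(101)]
for a in steps:
    for b in steps:
        for t in (0.25, 0.5, 0.75):
            mix = t * a + (1 - t) * b
            assert u(mix) >= t * u(a) + (1 - t) * u(b) - 1e-9

# ...but continuity fails at p = 1: the left limit is 1/2, the value is 0.
print(u(0.999999), u(1.0))
```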


I haven't actually thought about whether A5 implies A4, though. It is plausible that it does (together with A1-A3, or some other simple axioms).

When , we get A4 from A5, so it suffices to replace A4 with the special case that . If , and , a mixture of  and , then all we need to do is have any Y such that , then we can get  between  and  by A3, and then  will also be a mixture of  and , contradicting A5, since .

A1, A2, A3, A5 do […]

(and everywhere you say "good" and "bad", they are the non-strict versions of the words)

1James Payor
yep!

Your understanding of A4 is right. In A5, "good" should be replaced with "bad."

1James Payor
Okay, I now think A5 implies: "if moving by Δ is good, then moving by any negative multiple −nΔ is bad". Which checks out to me re concavity.
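The reading "if moving by Δ is good, then moving by any negative multiple −nΔ is bad" can be checked numerically for a concave function (a sketch with an arbitrary concave function of my choosing; per the note above, "good" and "bad" are the non-strict versions):

```python
import random

random.seed(1)
f = lambda x: -(x - 2.0) ** 2  # an arbitrary concave function

# If f is concave and moving by delta is (weakly) good, then moving by any
# negative multiple -n*delta is (weakly) bad: x is a convex combination of
# x + delta and x - n*delta, so f(x) is at least the weighted average of the
# values at those endpoints.
for _ in range(10_000):
    x = random.uniform(-5, 5)
    delta = random.uniform(-1, 1)
    if f(x + delta) >= f(x):                      # moving by delta is good...
        for n in (1, 2, 5):
            assert f(x - n * delta) <= f(x) + 1e-9  # ...so -n*delta is bad
print("check passed")
```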

You have the inequality backwards. You can't apply A5 when the mixture is better than the endpoint, only when the mixture is worse than the endpoint.

1James Payor
Got it, thanks!

That proposed axiom to add does not work. Consider the function on lotteries over {x,y,z} that gives utility 1 if z is supported, and otherwise gives utility equal to the probability of x. This function is concave but not continuous, satisfies A1-A5 and the extra axiom I just proposed, and cannot be made continuous.


I edited the post to remove the continuity assumption from the main conclusion. However, my guess is that if we get a VNM-like result, we will want to add back in another axiom that gives us continuity.

I meant the conclusions to all be adding to the previous one, so this actually also answers the main question I stated, by violating continuity, but not the main question I care about. I will edit the post to say that I actually care about concavity, even without continuity.


Nice! This, of course, seems like something we should salvage, by e.g. adding an axiom that if A is strictly preferred to B, there should be a lottery strictly between them.

4AlexMennen
I think the way I would rule out my counterexample is by strengthening A3 to if A≻B and B≻C then there is p∈(0,1)...

To see why A1-A4 is not enough to prove C4 on its own, consider the preference relation on the space of lotteries between two outcomes X and Y such that all lotteries are equivalent if , and if , higher values of  are preferred. This satisfies A1-A4, but cannot be expressed with a concave function, since we would have to have , contradicting concavity. We can, however, express it with a quasi-concave function: .

I believe using A4 (and maybe also A5) in multiple places will be important to proving a positive result. This is because A1, A2, and A3 are extremely weak on their own.

A1-A3 is not even enough to prove C1. To see a counterexample, take any well ordering on , and consider the preference ordering over the space of lotteries on a two element set of deterministic outcomes. If two lotteries have probabilities of the first outcome that differ by a rational number, they are equivalent; otherwise, you compare them according to your well ordering. […]

Even if EUM doesn't get "utility", I think it at least gets "utility function", since "function" implies cardinal utility rather than ordinal utility and I think people almost always mean EUM when talking about cardinal utility.

I personally care about cardinal utility, where the magnitude of the utility is information about how to aggregate rather than information about how to take lotteries, but I think this is a very small minority usage of cardinal utility, so I don't think it should change the naming convention very much. 

I think UDT as you specified it has utility functions. What do you mean by doesn't have independence? I am advocating for an updateless agent model that might strictly prefer a mixture between outcomes A and B to either A or B deterministically. I think an agent model with this property should not be described as having a "utility." Maybe I am conflating "utility" with expected utility maximization/VNM and you are meaning something more general? 

If you mean by utility something more general than utility as used in EUM, then I think it is mostly a terminological […]


Although I note that my flavor of rejecting utility functions is trying to replace them with something more general, not something incompatible.

I feel like reflective stability is what caused me to reject utility. Specifically, it seems like it is impossible to be reflectively stable if I am the kind of mind that would follow the style of argument given for the independence axiom. It seems like there is a conflict between reflective stability and Bayesian updating. 

I am choosing reflective stability, in spite of the fact that losing updating is making things very messy and confusing (especially in the logical setting), because reflective stability is that important.

When I lose updating, the independence axiom goes along with it, and thus so does utility.

4Wei Dai
UDT still has utility functions, even though it doesn't have independence... Is it just a terminological issue? Like you want to call the representation of value in whatever the correct decision theory turns out to be something besides "utility"? If so, why?

I think the short statement would be a lot weaker (and better IMO) if "inability" were replaced with "inability or unwillingness". "Inability" is implying a hierarchy where falsifiable statements are better than the poetry, since the only reason why you would resort to poetry is if you are unable to turn it into falsifiable statements.

4Duncan Sabien (Deactivated)
I changed it to say "aren't doing so (or can't)."

I would also love a more personalized/detailed description of how I made this list, and what I do poorly. 

I think I have imposter syndrome here. My top guess is that I do actually have some skill in communication/discourse, but my identity/inside view really wants to reject this possibility. I think this is because I (correctly) think of myself as very bad at some of the subskills related to passing people's ITTs.

From listening to that podcast, it seems like even she would not advocate for preferring a lottery between two outcomes to either of the pure components.

This underrated post is pretty good at explaining how to translate between FFSs and DAGs.

Hmm, examples are hard. Maybe the intuitions contribute to the concept of edge instantiation?

I note that EU maximization has this baggage of never strictly preferring a lottery over outcomes to the component outcomes, and your steelmen appear to me not to carry that baggage. I think that baggage is actually doing work in some people's reasoning and intuitions.

1rotatingpaguro
I think you are referring to the case where an agent wishes to be unpredictable in an adversarial situation, right? (I genuinely do not feel confident I understand what you said.) If so, isn't this lottery on a different, let's say ontological, level, instead of the level of "lotteries" that define its utility?
2Wei Dai
Do you have any examples of this?

I am not sure if there is any disagreement in this comment. What you say sounds right to me. I agree that UDT does not really set us up to want to talk about "coherence" in the first place, which makes it weird to have it be formalized in term of expected utility maximization.

This does not make me think intelligent/rational agents will/should converge to having utility.

5Vladimir_Nesov
I think coherence of unclear kind is an important principle that needs a place in any decision theory, and it motivates something other than pure updatelessness. I'm not sure how your argument should survive this. The perspective of expected utility and the perspective of updatelessness both have glaring flaws, respectively unwarranted updatefulness and lack of a coherence concept. They can't argue against each other in their incomplete forms. Expected utility is no more a mistake than updatelessness.

Yeah, I don't have a specific UDT proposal in mind. Maybe instead of "updateless" I should say "the kind of mind that might get counterfactually mugged" as in this example.
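For concreteness, the counterfactual-mugging payoffs can be written out (a sketch; the numbers are the conventional ones from the thought experiment, not from this thread):

```python
# Counterfactual mugging: a fair coin is flipped. On tails you are asked to
# pay $100; on heads you are paid $10,000 iff a reliable predictor says you
# would have paid on tails. Policies are evaluated before the flip.
def expected_value(pays_on_tails):
    heads = 10_000 if pays_on_tails else 0  # reward depends on your policy
    tails = -100 if pays_on_tails else 0
    return 0.5 * heads + 0.5 * tails

print(expected_value(True), expected_value(False))  # 4950.0 0.0
# The kind of mind that commits in advance pays (EV $4950); a mind that
# updates on seeing tails refuses, since paying is then a sure loss of $100.
```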

FDT and UDT are formulated in terms of expected utility. I am saying that they advocate for a way of thinking about the world that makes it so that you don't just Bayesian update on your observations and forget about the other possible worlds.

Once you take on this worldview, the Dutch books that made you believe in expected utility in the first place are less convincing, so maybe we want to rethink utility.

I don't know what the FDT authors were thinking, but it seems like they did not propagate the consequences of the worldview into reevaluating what preferences over outcomes look like.

No, at least probably not at the time that we lose all control. 

However, I expect that systems that are self-transparent and can easily self-modify might quickly converge to reflective stability (and thus updatelessness). They might not, but I think the same arguments that might make you think they would develop a utility function can also be used to argue that they would develop updatelessness (and thus possibly also not develop a utility function).

Here is a situation where you make an "observation" and can still interact with the other possible worlds. Maybe you do not want to call this an observation, but if you don't call it an observation, then true observations probably never really happen in practice.

I was not trying to say that it is relevant to the coin flip directly. I was trying to say that the move used to justify the coin flip is the same move that is rejected in other contexts, and so we should be open to the idea of agents that refuse to make that move, and thus might not have utility.

1Optimization Process
  Ah, that's the crucial bit I was missing! Thanks for spelling it out.

I think UDT is as you say. I think it is also important to clarify that you are not updating on your observations when you decide on a policy. (If you did, it wouldn't really be a function from observations to actions, but it is important to emphasize in UDT.)

Note that I am using "updateless" differently than "UDT". By updateless, I mostly mean anything that is not performing Bayesian updates and forgetting the other possible worlds when it makes observations. UDT is more of a specific proposal. "Updateless" is more of a negative property, defined by lack of […]

7EOC
Thanks, the clarification of UDT vs. "updateless" is helpful. But now I'm a bit confused as to why you would still regard UDT as "EU maximisation, where the thing you're choosing is policies". If I have a preference ordering over lotteries that violates independence, the vNM theorem implies that I cannot be represented as maximising EU. In fact, after reading Vladimir_Nesov's comment, it doesn't even seem fully accurate to view UDT taking in a preference ordering over lotteries. Here's the way I'm thinking of UDT: your prior over possible worlds uniquely determines the probabilities of a single lottery L, and selecting a global policy is equivalent to choosing the outcomes of this lottery L. Now different UDT agents may prefer different lotteries, but this is in no sense expected utility maximisation. This is simply: some UDT agents think one lottery is the best, other might think another is the best. There is nothing in this story that resembles a cardinal utility function over outcomes that the agents are multiplying with their prior probabilities to maximise EU with respect to. It seems that to get an EU representation of UDT, you need to impose coherence on the preference ordering over lotteries (i.e. over different prior distributions), but since UDT agents come with some fixed prior over worlds which is not updated, it's not at all clear why rationality would demand coherence in your preference between lotteries (let alone coherence that satisfies independence).

You could take as an input parameter to UDT a preference ordering over lotteries that does not satisfy the independence axiom, but is a total order (or total preorder if you want ties). Each policy you can take results in a lottery over outcomes, and you take the policy that gives your favorite lottery. There is no need for the assumption that your preferences over lotteries is vNM.

Note that I don't think that we really understand decision theory, or have a coherent proposal. The only thing I feel like I can say confidently is that if you are convinced by […]
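The scheme in the first paragraph can be sketched concretely (my toy setup, not from the thread): each policy induces a lottery over outcomes, and the agent picks its favorite lottery under a total preorder that need not satisfy independence, e.g. a worst-case value over two stakeholders.

```python
# Each policy induces a lottery; here a lottery over {A, B} is summarized by
# p_A = P(outcome A). The policies and values are hypothetical.
policies = {"always_A": 1.0, "always_B": 0.0, "flip_coin": 0.5}

def value(p_A):
    """A total (pre)order over lotteries given by worst-case expected utility
    over two stakeholders (u1(A)=1, u1(B)=0; u2(A)=0, u2(B)=1). It is concave
    but not expected-utility: it violates the independence axiom."""
    return min(p_A, 1.0 - p_A)

# UDT-style choice: rank whole policies by the lotteries they induce.
best = max(policies, key=lambda pol: value(policies[pol]))
print(best)  # flip_coin: the mixture is strictly preferred to both pure outcomes
```

This agent strictly prefers a 50/50 lottery to either pure outcome, so no vNM representation exists, yet policy selection over lotteries goes through without one.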

3EOC
Okay this is very clarifying, thanks!  If the preference ordering over lotteries violates independence, then it will not be representable as maximising EU with respect to the probabilities in the lotteries (by the vNM theorem). Do you think it's a mistake then to think of UDT as "EU maximisation, where the thing you're choosing is policies"? If so, I believe this is the most common way UDT is framed in LW discussions, and so this would be a pretty important point for you to make more visibly (unless you've already made this point before in a post, in which case I'd love to read it).

Also, if by "have a utility function" you mean something other than "try to maximize expected utility," I don't know what you mean. To me, the cardinal (as opposed to ordinal) structure of preferences that makes me want to call something a "utility function" is about how to choose between lotteries.

1EOC
Yeah by "having a utility function" I just mean "being representable as trying to maximise expected utility".

Note that I am not saying here that rational agents can't have a utility function. I am only saying that they don't have to.

That depends on what you mean by "suitably coherent." If you mean they need to satisfy the vNM independence axiom, then yes. But the point is that I don't see any good argument why updateless agents should satisfy that axiom. The argument for that axiom passes through wanting to have a certain relationship with Bayesian updating.

3EOC
Ah okay, interesting. Do you think that updateless agents need not accept any separability axiom at all? And if not, what justifies using the EU framework for discussing UDT agents?  In many discussions on LW about UDT, it seems that a starting point is that agent is maximising some notion of expected utility, and the updatelessness comes in via the EU formula iterating over policies rather than actions. But if we give up on some separability axiom, it seems that this EU starting point is not warranted, since every major EU representation theorem needs some version of separability. 