All of zulupineapple's Comments + Replies

Maybe I should just let you tell me what framework you are even using in the first place.

I'm looking at the Savage theory from your own https://plato.stanford.edu/entries/decision-theory/ and I see U(f) = ∑_i u(f(s_i))P(s_i), so at least they have no problem with the domains (O and S) being different. Now I see the confusion is that to you Omega=S (and also O=S), but to me Omega=dom(u)=O.

Furthermore, if O={o_0, o_1}, then I can group the terms into u(o_0)P("we're in a state where f evaluates to o_0") + u(o_1)P("we're in a state where f evaluates to o_1"), I'm just movin... (read more)
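For reference, this is just the standard regrouping of the Savage sum by outcome (a restatement of the argument above, in the same notation):

$$U(f) \;=\; \sum_i u(f(s_i))\,P(s_i) \;=\; \sum_{o \in O} u(o) \sum_{s_i : f(s_i)=o} P(s_i) \;=\; \sum_{o \in O} u(o)\,P\big(f^{-1}(o)\big),$$

so with O = {o_0, o_1} the two terms are u(o_0)P(f^{-1}(o_0)) + u(o_1)P(f^{-1}(o_1)).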

2abramdemski
(Just to be clear, I did not write that article.) I think the interpretation of Savage is pretty subtle. The objects of preference ("outcomes") and objects of belief ("states") are treated as distinct sets. But how are we supposed to think about this?

* The interpretation Savage seems to imply is that both outcomes and states are "part of the world", but the agent has somehow segregated parts of the world into matters of belief and matters of preference. But however the agent has done this, it seems to be fundamentally beyond the Savage representation; clearly within Savage, the agent cannot represent meta-beliefs about which matters are matters of belief and which are matters of preference. So this seems pretty weird.
* We could instead think of the objects of preference as something like "happiness levels" rather than events in the world. The idea of the representation theorem then becomes that we can peg "happiness levels" to real numbers. In this case, the picture looks more like standard utility functions; S is the domain of the function that gives us our happiness level (which can be represented by a real-valued utility).
* Another approach which seems somewhat common is to take the Savage representation but require that S=O. Savage's "acts" then become maps from world to world, which fits well with other theories of counterfactuals and causal interventions.

So even within a Savage framework, it's not entirely clear that we would want the domain of the utility function to be different from the domain of the belief function. I should also have mentioned the super-common VNM picture, where utility has to be a function of arbitrary states as well. The question is, what math-speak is the best representation of the things we actually care about?

A classical probability distribution over Ω with a utility function understood as a random variable can easily be converted to the Jeffrey-Bolker framework, by taking the JB algebra as the sigma-algebra, and V as the expected value of U.

Ok, you're saying that JB is just a set of axioms, and U already satisfies those axioms. And in this construction "event" really is a subset of Omega, and "updates" are just updates of P, right? Then of course U is not more general, I had the impression that JB is a more distinct and specific thing.

Regarding the o... (read more)
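As a concrete reading of the conversion quoted above (a sketch of the standard construction, written for the discrete case; nothing here goes beyond the quote): start from a probability space (Ω, Σ, P) and a utility function U: Ω → R treated as a random variable, take the JB algebra to be Σ, and let the value of an event be its conditional expected utility,

$$V(A) \;=\; \mathbb{E}[U \mid A] \;=\; \frac{\sum_{\omega \in A} U(\omega)\,P(\omega)}{P(A)}, \qquad A \in \Sigma,\ P(A) > 0.$$

Updating on evidence E then just replaces P with P(· | E), which is the sense in which "updates" are just updates of P.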

2abramdemski
Ah, if you don't see 'worlds' as meaning any such thing, then I wonder, are we really arguing about anything at all? I'm using 'worlds' that way in reference to the same general setup which we see in propositions-vs-models in model theory, or in Ω vs the σ-algebra in the Kolmogorov axioms, or in Kripke frames, and perhaps some other places. We can either start with a basic set of "worlds" (eg, Ω) and define our "propositions" or "events" as sets of worlds, where that proposition/event 'holds' or 'is true' or 'occurs'; or, equivalently, we could start with an algebra of propositions/events (like a σ-algebra) and derive worlds as maximally specific choices of which propositions are true and false (or which events hold/occur).

Maybe I should just let you tell me what framework you are even using in the first place.

There are two main alternatives to the Jeffrey-Bolker framework which I have in mind: the Savage axioms, and also the thing commonly seen in statistics textbooks where you have a probability distribution which obeys the Kolmogorov axioms and then you have random variables over that (random variables being defined as functions of type Ω→R). A utility function is then treated as a random variable. It doesn't sound like your notion of utility function is any of those things, so I just don't know what kind of framework you have in mind.

Answering out of order:

<...> then I think the Jeffrey-Bolker setup is a reasonable formalization.

Jeffrey is a reasonable formalization, it was never my point to say that it isn't. My point is only that U is also reasonable, and possibly equivalent or more general. That there is no "case against" it. Although, if you find Jeffrey more elegant or comfortable, there is nothing wrong with that.

do you believe that any plausible utility function on bit-strings can be re-represented as a computable function (perhaps on some other representation, rather than

... (read more)
2abramdemski
I do agree that my post didn't do a very good job of delivering a case against utility functions, and actually only argues that there exists a plausibly-more-useful alternative to a specific view which includes utility functions as one of several elements.  Utility functions definitely aren't more general. A classical probability distribution over Ω with a utility function understood as a random variable can easily be converted to the Jeffrey-Bolker framework, by taking the JB algebra as the sigma-algebra, and V as the expected value of U. Technically the sigma-algebra needs to be atomless to fit JB exactly, but Zoltan Domotor (Axiomatization of Jeffrey Utilities) generalizes this considerably. I've heard people say that there is a way to convert in the other direction, but that it requires ultrafilters (so in some sense it's very non-constructive). I haven't been able to find this construction yet or had anyone explain how it works. So it seems to me, but I recognize that I haven't shown in detail, that the space of computable values is strictly broader in the JB framework; computable utility functions + computable probability gives us computable JB-values, but computable JB-values need not correspond to computable utility functions. Thus, the space of minds which can be described by the two frameworks might be equivalent, but the space of minds which can be described by computations does not seem to be; the JB space, there, is larger. Well, the Jeffrey-Bolker kind of explanation is as follows: agents really only need to consider and manipulate the probabilities and expected values of events (ie, propositions in the agent's internal language). So it makes some sense to assume that these probabilities and expected values are computable. But this does not imply (as far as I know) that we can construct 'worlds' as maximal specifications of which propositions are true/false and then define a utility function on those worlds which is consistent with the computable

If you actually do want to work on AI risk, but something is preventing you, you can just say "personal reasons", I'm not going to ask for details.

I understand that my style is annoying to some. Unfortunately, I have not observed polite and friendly people getting interesting answers, so I'll have to remain like that.

3Mitchell_Porter
Your questions opened multiple wounds, but I'll get over it.  I "work on" AI risk, in the sense that I think about it when I can. Under better circumstances, I suspect I could make important contributions. I have not yet found a path to better circumstances. 

OK, there are many people writing explanations, but if all of them are rehashing the same points from Superintelligence book, then there is not much value in that (and I'm tired of reading the same things over and over). Of course you don't need new arguments or new evidence, but it's still strange if there aren't any.

Anyone who has read this FAQ and others, but isn't a believer yet, will have some specific objections. But I don't think everyone's objections are unique, a better FAQ should be able to cover them, if their refutations exist to begin with.

Als... (read more)

4Mitchell_Porter
etc.  I presume you have no idea how enraging these questions are, because you know less than nothing about my life.  I will leave it to you to decide whether this "Average Redditor" style of behavior (look it up, it's a Youtube character) is something you should avoid in future. 

Stampy seems pretty shallow, even more so than this FAQ. Is that what you meant by it not filling "this exact niche"?

By the way, I come from AGI safety from first principles, where I found your comment linking to this. Notably, that sequence says "My underlying argument is that agency is not just an emergent property of highly intelligent systems, but rather a set of capabilities which need to be developed during training, and which won’t arise without selection for it." which is reasonable and seems an order of magnitude more conservative than this FAQ, which doesn't really touch the question of agency at all.

I'm talking specifically about discussions on LW. Of course in reality Alice ignores Bob's comment 90% of the time, and that's a problem in its own right. It would be ideal if people who have distinct information would choose to exchange that information.

I picked a specific and reasonably grounded topic, "x-risk", or "the probability that we all die in the next 10 years", which is one number, so not hard to compare, unless you want to break it down by cause of death. In contrived philosophical discussions, it can certainly be hard to determine who agrees ... (read more)

I want neither. I observe that Raemon cannot find an up to date introduction that he's happy with, and I point out that this is really weird. What I want is an explanation to this bizarre situation.

Is your position that Raemon is blind, and good, convincing explanations are actually abundant? If so, I'd like to see them, it doesn't matter where from.

2Mitchell_Porter
Expositions of AI risk are certainly abundant. There have been numerous books and papers. Or just go to Youtube and type in "AI risk". As for whether any given exposition is convincing, I am no connoisseur. For a long time, I have taken it for granted that AI can be both smarter than humans and dangerous to humans. I'm more interested in details, like risk taxonomies and alignment theories.

But whether a given exposition is convincing depends on the audience as well as on the author. Some people have highly specific objections. In our discussion, you questioned whether adversarial relations between AI and human are likely to occur, and with Raemon you bring up the topic of agency, so maybe you specifically need an argument that AIs would ever end up acting against human interests?

As for Raemon, I suspect he would like a superintelligence FAQ that acknowledges the way things are in 2023 - e.g. the rise of a particular AI paradigm to dominate discussion (deep learning and large language models), and the existence of a public debate about AI safety, all the way up to the UN Security Council.

I don't know if you know, but after being focused for 20 years on rather theoretical issues of AI, MIRI has just announced it will be changing focus to "broad public communication". If you look back at their website, in the 2000s their introductory materials were mostly aimed at arguing that smarter-than-human AI is possible and important. Then in the 2010s (which is the era of Less Wrong), the MIRI homepage was more about their technical papers and workshops and so on, and didn't try to be accessible to a general audience. Now in the mid-2020s, they really will be aiming at a broader audience.

"The world is full of adversarial relationships" is pretty much the weakest possible argument and is not going to convince anyone.

Are you saying that the MIRI website has convincing introductory explanations of AI risk, the kind that Raemon wishes he had? Surely he would have found them already? If there aren't any, then, again, why not?

2Mitchell_Porter
Let me first clarify something. Are you asking because you want to understand MIRI's specific model of AI risk; or do you just want a simple argument that AI risk is real, and it doesn't matter who makes the argument?  You're writing as if the reality of AI risk depends on whether or not there's an up-to-date FAQ about it, on this website. But you know that Less Wrong does not have a monopoly on AI doom, right? Everyone from the founders of deep learning to the officials of the deep state are worried about AI now, because it has become so powerful. This issue is somewhere in the media every day now; and it's just common sense, given the way of the world, that entities which are not human and smarter than human potentially pose a risk to the human race. 

If our relationship to them is adversarial, we will lose. But you also need to argue that this relationship will (likely) be adversarial.

Also, I'm not asking you to make the case here; I'm asking why the case is not being made on the front page of LW and on every other platform. Would that not help with advocacy and recruitment? No idea what "keeping up with current events" means.

5Mitchell_Porter
The world is full of adversarial relationships, from rivalry among humans, to machines that resist doing what we want them to do. There are many ways in which AIs and humans might end up clashing.  Superintelligent AI is of particular concern because you probably don't get a second chance. If your goals clash with the goals of a superintelligent AI, your goals lose. So we have a particular incentive to get the goals of superintelligent AI correct in advance.  Less Wrong was set up to be a forum for discussion of rationality, not a hub of AI activism specifically. Eliezer's views on AI form just a tiny part of his "Sequences" here. People wanting to work on AI safety could go to the MIRI website or the "AI Alignment" forum.  Certainly Less Wrong now overflows with AI news and discussion. It wasn't always like that! Even as recently as 2020, I think there was more posting about Covid than there was about AI. A turning point was April last year, when the site founder announced that he thought humanity was on track to fail at the challenge of AI safety. Then came ChatGPT, and ecstasy and dread about AI became mainstream. If the site is now all AI, all the time, that simply reflects the state of the world. 

I certainly don't evaluate my U on quarks. Omega is not the set of worlds, it is the set of world models, and we are the ones who decide what that model should be. In the "procrastination" example you intentionally picked a bad model, so it proves nothing (if the world only has one button we care about, then maybe |Omega|=2 and everything is perfectly computable).

Further on, it seems to me that if we set our model to be a list of "events" we've observed, then we get the exact thing you're talking about. Although you're imprecise and inconsistent about what an ... (read more)
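A minimal sketch of the coarse-model point above (the two-element Omega and the names are illustrative assumptions, not anything from the original exchange):

```python
# A deliberately coarse world model for the one-button world: Omega is chosen
# by the modeller to contain just two "world models", not maximally specific worlds.
OMEGA = ("button eventually pressed", "button never pressed")

def U(world_model: str) -> float:
    """Utility over the coarse model; trivially computable because |Omega| = 2."""
    return 1.0 if world_model == "button eventually pressed" else 0.0

for w in OMEGA:
    print(w, U(w))
```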

2abramdemski
I agree that it makes more sense to suppose "worlds" are something closer to how the agent imagines worlds, rather than quarks. But on this view, I think it makes a lot of sense to argue that there are no maximally specific worlds -- I can always "extend" a world with an extra, new fact which I had not previously included. IE, agents never "finish" imagining worlds; more detail can always be added (even if only in separate magisteria, eg, imagining adding epiphenomenal facts). I can always conceive of the possibility of a new predicate beyond all the predicates which a specific world-model discusses. If you buy this, then I think the Jeffrey-Bolker setup is a reasonable formalization.

If you don't buy this, my next question would be whether you really think that the sort of "world" ("world model", as you called it) which an agent attaches value to always are "closed off" (ie specify all the facts one way or the other; do not admit further detail) -- or, perhaps, you merely want to argue that this can sometimes be the case but not always. (Because if it's sometimes the case but not always, this argues against both the traditional view where Omega is the set which the probability is a measure over & the utility function is a function of, and against the Jeffrey-Bolker picture.)

I find it implausible that the sort of "world model" which we can model humans as having-values-as-a-function-of is "closed off" -- we can appreciate ideas like atoms and quarks, adding these to our ontology, without necessarily changing other aspects of our world-model. Perhaps sometimes we can "close things off" like this -- we can consider the possibility that there "is nothing else" -- but even so, I think this is better-modeled as an additional assertion which we add to the set of propositions defining a possibility rather than modeling us as having bottomed out in an underlying set of "worlds" which inherently decide all propositions.

You seem to be suggesting that any such example cou

Seems like a red flag. How can there not be a more up-to-date one? Is advocacy and recruitment not a goal of AI-risk people? Are they instrumentally irrational? What is preventing you from writing such a post right now?

Most importantly, could it be that people struggle to write a good case for AI-risk, because the case for it is actually pretty weak, when you think about it?

7Raemon
People have made tons of slightly-different things all tackling this sort of goal (for example: https://stampy.ai/ ), they just didn't happen to fill this exact niche. I do think maybe it'd actually just be good for @Scott Alexander to write an up-to-date one.  A lot of why I like this one is Scott's prose, which I feel awkward completely copying and making changes to, and writing a new thing from scratch is a pretty high skill operation.
5Mitchell_Porter
The case for AI risk, is the same as the case for computers beating humans at chess. If the fate of the world depended on unaided humans being able to beat the best chess computers, we would have fought and lost about 25 years ago. Computers long ago achieved supremacy in the little domain of chess. They are now going to achieve supremacy in the larger domain of everyday life. If our relationship to them is adversarial, we will lose as surely as even the world champion of human chess loses to a moderately strong chess program.  If this FAQ is out of date, it might be because everyone is busy keeping up with current events. 

The link is broken. I was only able to find the article here, with the wayback machine.

In the examples, sometimes the problem is people having different goals for the discussion, sometimes it is having different beliefs about what kinds of discussions work, and sometimes it might be about almost object-level beliefs. If "frame" refers to all of that, then it's way too broad and not a useful concept. If your goal is to enumerate and classify the different goals and different beliefs people can have regarding discussions, that's great, but possibly too broad to make any progress.

My own frustration with this topic is lack of ... (read more)

Making long term predictions is hard. That's a fundamental problem. Having proxies can be convenient, but it's not going to tell you anything you don't already know.

That's what I think every time I hear "history repeats itself". I wish Scott had considered the idea.

The biggest claim Turchin is making seems to be about the variance of the time intervals between "bad" periods. A random walk would imply that it is high, and "cycles" would imply that it is low.
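A toy illustration of that variance comparison (the distributions and numbers are invented for the sketch, not taken from Turchin or from Scott's review):

```python
# Compare the spread of inter-event intervals under a "cycle" model
# versus a memoryless "random" model with the same mean.
import numpy as np

rng = np.random.default_rng(0)
cycle_intervals = rng.normal(50, 5, size=1000)      # ~50-year cycle with small jitter
random_intervals = rng.exponential(50, size=1000)   # memoryless process, same mean

for name, x in [("cycle", cycle_intervals), ("random", random_intervals)]:
    # Coefficient of variation: low for the cycle model, close to 1 for the random one.
    print(name, round(x.std() / x.mean(), 2))
```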

For example, say I wanted to know how good/enjoyable a specific movie would be.

My point is that "goodness" is not a thing in the territory. At best it is a label for a set of specific measures (ratings, revenue, awards, etc). In that case, why not just work with those specific measures? Vague questions have the benefit of being short and easy to remember, but beyond that I see only problems. Motivated agents will do their best to interpret the vagueness in a way that suits them.

Is your goal to find a method to generate specific interpretations an... (read more)

1ozziegooen
Hm... At this point I don't feel like I have a good intuition for what you find intuitive. I could give more examples, but don't expect they would convince you much right now if the others haven't helped. I plan to eventually write more about this, and eventually hopefully we should have working examples up (where people are predicting things). Hopefully things should make more sense to you then. Short comments back and forth are a pretty messy communication medium for such work.
"What is the relative effectiveness of AI safety research vs. bio risk research?"

If you had a precise definition of "effectiveness" this shouldn't be a problem. E.g. if you had predictions for "will humans go extinct in the next 100 years?" and "will we go extinct in the next 100 years, if we invest 1M into AI risk research?" and "will we go extinct, if we invest 1M in bio risk research?", then you should be able to make decisions with that. And these questions should work fine in existing forecasting ... (read more)
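A minimal sketch of the decision procedure this suggests (all of the probabilities below are invented purely for illustration):

```python
# Choose between two grants by comparing conditional extinction forecasts.
p_baseline = 0.10          # P(extinction within 100 years), hypothetical
p_if_ai_grant = 0.099990   # P(extinction | $1M into AI risk research), hypothetical
p_if_bio_grant = 0.099995  # P(extinction | $1M into biorisk research), hypothetical

# "Effectiveness" of each grant = reduction in extinction probability it buys.
effect = {
    "AI risk research": p_baseline - p_if_ai_grant,
    "biorisk research": p_baseline - p_if_bio_grant,
}
best = max(effect, key=effect.get)
print(effect, best)  # fund whichever option buys the larger reduction
```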

1Tetraspace
There's something of a problem with sensitivity; if the x-risk from AI is ~0.1, and the difference in x-risk from some grant is ~10^-6, then any difference in the forecasts is going to be completely swamped by noise. (while people in the market could fix any inconsistency between the predictions, they would only be able to look forward to 0.001% returns over the next century)
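For concreteness, the arithmetic behind that parenthetical, using the ~0.1 and ~10^-6 figures above:

$$\frac{10^{-6}}{0.1} \;=\; 10^{-5} \;=\; 0.001\%,$$

so even a trader who perfectly corrects the grant-sized mispricing earns only about a 0.001% relative return on the base question.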
3ozziegooen
Coming up with a precise definition is difficult, especially if you want multiple groups to agree. Those specific questions are relatively low-level; I think we should ask a bunch of questions like that, but think we may also want some more vague things as well. For example, say I wanted to know how good/enjoyable a specific movie would be. Predicting the ratings according to movie reviewers (evaluators) is an approach I'd regard as reasonable. I'm not sure what a precise definition for movie quality would look like (though I would be interested in proposals), but am generally happy enough with movie reviews for what I'm looking for. Agreed that that itself isn't a forecast, I meant in the more general case, for questions like, "How much value will this organization create next year" (as you pointed out). I probably should have used that more specific example, apologies. Can you be more explicit about your definition of "clearly"? I'd imagine that almost any proposal at a value function would have some vagueness. Certificates of Impact get around this by just leaving that for the review of some eventual judges, kind of similar to what I'm proposing. The goal for this research isn't fixing something with prediction markets, but just finding more useful things for them to predict. If we had expert panels that agreed to evaluate things in the future (for instance, they are responsible for deciding on the "value organization X has created" in 2025), then prediction markets and similar could predict what they would say.

While it's true that preferences are not immutable, the things that change them are not usually debate. Sure, some people can be made to believe that their preferences are inconsistent, but then they will only make the smallest correction needed to fix the problem. Also, sometimes debate will make someone claim to have changed their preferences, just so that they can avoid social pressures (e.g. "how dare you not care about starving children!"), but this may not reflect in their actions.

Regardless, my claim is that many (or most) people discount a lot, and that this would be stable under reflection. Otherwise we'd see more charity, more investment and more work on e.g. climate change.

Ok, that makes the real incentives quite different. Then, I suspect that these people are navigating facebook using the intuitions and strategies from the real world, without much consideration for the new digital environment.

Yes, and you answered that question well. But the reason I asked for alternative responses, was so that I could compare them to unsolicited recommendations from the anime-fan's point of view (and find that unsolicited recommendations have lower effort or higher reward).

Also, I'm not asking "How did your friend want the world to be different", I'm asking "What action could your friend have taken to avoid that particular response?". The friend is a rational agent, he is able to consider alternative strategies, but he shouldn't expect that other people will change their behavior when they have no personal incentive to do so.

What is the domain of U? What inputs does it take? In your papers you take a generic Markov Decision Process, but which one will you use here? How exactly do you model the real world? What is the set of states and the set of actions? Does the set of states include the internal state of the AI?

You may have been referring to this as "4. Issues of ontology", but I don't think the problem can be separated from your agenda. I don't see how any progress can be made without answering these questions. Maybe you can start with naive answers, an... (read more)

Answer by zulupineapple00

Discounting. There is no law of nature that can force me to care about preventing human extinction years from now, more than eating a tasty sandwich tomorrow. There is also no law that can force me to care about human extinction much more than about my own death.

There are, of course, more technical disagreements to be had. Reasonable people could question how bad unaligned AI will be or how much progress is possible in this research. But unlike those questions, the reasons for discounting are not debatable.
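For concreteness, the standard exponential-discounting arithmetic (the discount factor and horizon are illustrative, not a claim about anyone's actual values):

$$V_{\text{now}} \;=\; \delta^{T}\, V_{\text{future}}, \qquad \text{e.g. } \delta = 0.95,\ T = 50 \;\Rightarrow\; 0.95^{50} \approx 0.08,$$

so a payoff 50 years out gets less than a tenth of the weight of the same payoff today, and a longer horizon or a smaller δ shrinks it further.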

2Adam Scholl
"Not debatable" seems a little strong. For example, one might suspect both that some rational humans disprefer persisting, and also that most who think this would change their minds upon further reflection.

I do things my way because I want to display my independence (not doing what others tell me) and intelligence (ability to come up with novel solutions), and because I would feel bored otherwise (this is a feature of how my brain works, I can't help it).

"I feel independent and intelligent", "other people see me as independent and intelligent", "I feel bored" are all perfectly regular outcomes. They can be either terminal or instrumental goals. Either way, I disagree that these cases somehow don't fit in the usual preference model. You're only having this problem because you're interpreting "outcome" in a very narrow way.

Yes. The latter seems to be what OP is asking about: "If one wanted it to not happen, how would one go about that?". I assume OP is taking the perspective of his friends, who are annoyed by this behavior, rather than the perspective of the anime-fans, who don't necessarily see anything wrong with the situation.

2DanielFilan
In the literal world, I'm an anime fan, but the situation seems basically futile: the people recommending anime seem like they're accomplishing nothing but generating frustration. More metaphorically, I'm mostly interested in how to prevent the behaviour either as somebody complaining about anime or as a third party, and secondarily interested in how to restrain myself from recommending anime.
2Matt Goldenberg
Note that my response was responding to this original question: It wasn't obvious to me that this was asking "How did your friend want the world to be different such that the incentives were to respond differently?"

That sounds reasonable, but the proper thing is not usually the easy thing, and you're not going to make people do the proper thing just by saying that it is proper.

If we want to talk about this as a problem in rationality, we should probably talk about social incentives, and possible alternative strategies for the anime-hater (you're now talking about a better strategy for the anime-fan, but it's not good to ask other people to solve your problems). Although I'm not sure to what extent this is a problem that needs solving.

2Raemon
It sounds like you two are currently talking about two different problems: mr-hire is asking "how do I avoid being That Guy Who Pressures People about Anime" and you're asking the question "If I want to avoid people pestering me with anime questions, or people in general to stop this behavior, what would have to change?"

And then the other person says "no thanks", and you both stand in awkward silence? My point is that offering recommendations is a natural thing to say, even if not perfect, and it's nice to have something to say. If you want to discourage unsolicited recommendations, then you need to propose a different trajectory for the conversation. Changing topic is hard, and simply going away is rude. People give unsolicited recommendations because it seems to be the best option available.

3DanielFilan
At this juncture, it seems important to note that all examples I can think of took place on Facebook, where you can just end interactions like this without it being awkward.
9Matt Goldenberg
I think I would probably change the subject in a case like this. Good "vibing" conversation skill here is to "fractionate" the conversation, frequently cut topics before they reach their natural conclusion so that when you reach a conversation dead end like this, you have somewhere to go back to. Ditto with being able to make situational observations to restart a conversation, and having in your back pocket a list of topics and questions to go to. I don't think the proper thing to do here is to make someone else feel awkward or annoyed so that you feel less awkward, the proper thing to do is to learn the conversational skills to make people not feel awkward.

Sure, but it remains unclear what response the friend wanted from the other person. What better options are there? Should they just go away? Change topic? I'm looking for specific answers here.

2Matt Goldenberg
My response in this case would be to say something like "Well, I've got some shows that might change your mind if you're ever interested." Then leave it to them to continue that thread if interested. This goes with my general policy to try to avoid giving unsolicited advice.
a friend of mine observed that he couldn’t talk about how he didn’t like anime without a bunch of people rushing in to tell him that anime was actually good and recommending anime for him to watch

What response did your friend want? The reaction seems very natural to me (especially from anime fans). Note that your friend has at some point tried watching anime, and he has now chosen to talk about anime, which could easily mean that on some level he wants to like anime, or at least understand why others like it.

2Matt Goldenberg
Possible scenario where this comes up: Your friends are talking about anime, they ask you if you watch anime, you say "I don't like anime," they say "well you just haven't watched the right shows, have you tried..."
I got this big impossibility result

That's a part of the disagreement. In the past you clearly thought that Occam's razor was an "obvious" constraint that might work. Possibly you thought it was a unique such constraint. Then you found this result, and made a large update in the other direction. That's why you say the result is big - rejecting a constraint that you already didn't expect to work wouldn't feel very significant.

On the other hand, I don't think that Occam's razor is a unique such constraint. So when I ... (read more)

So it seems that there was progress in applied rationality and in AI. But that's far from everything LW has talked about. What about more theoretical topics, general problems in philosophy, morality, etc? Do you feel that discussing some topics resulted in no progress and was a waste of time?

There's some debate about which things are "improvements" as opposed to changes.

Important question. Does the debate actually exist, or is this a figure of speech?

1 is trivial, so yes. But I don't agree with 2. Maybe the disagreement comes from "few" and "obvious"? To be clear, I count evaluating some simple statistic on a large data set as one constraint. I'm not so sure about "obvious". It's not yet clear to me that my simple constraints aren't good enough. But if you say that more complex constraints would give us a lot more confidence, that's reasonable.

From OP I understood that you want to throw out IRL entirely. e.g.

If we give up the assumption of human ra
... (read more)
4Stuart_Armstrong
Ok, we strongly disagree on your simple constraints being enough. I'd need to see these constraints explicitly formulated before I had any confidence in them. I suspect (though I'm not certain) that the more explicit you make them, the more tricky you'll see that it is. And no, I don't want to throw IRL out (this is an old post), I want to make it work. I got this big impossibility result, and now I want to get around it. This is my current plan: https://www.lesswrong.com/posts/CSEdLLEkap2pubjof/research-agenda-v0-9-synthesising-a-human-s-preferences-into
But it's not like there are just these five preferences and once we have four of them out of the way, we're done.

My example test is not nearly as specific as you imply. It discards large swaths of harmful and useless reward functions. Additional test cases would restrict the space further. There are still harmful Rs in the remaining space, but their proportion must be much lower than in the beginning. Is that not good enough?

What you're seeing as "adding enough clear examples" is actually "hand-crafting R(0) in totality".
... (read more)
2Stuart_Armstrong
We may not be disagreeing any more. Just to check, do you agree with both these statements:

1. Adding a few obvious constraints rules out many different R, including the ones in the OP.
2. Adding a few obvious constraints is not enough to get a safe or reasonable R.

This is true, but it doesn't fit well with the given example of "When will [country] develop the nuclear bomb?". The problem isn't that people can't agree what "nuclear bomb" means or who already has them. The problem is that people are working from different priors and extrapolating them in different ways.

Are you going to state your beliefs? I'm asking because I'm not sure what that looks like. My concern is that the statement will be very vague or very long and complex. Either way, you will have a lot of freedom to argue that actually your actions do match your statements, regardless of what those actions are. Then the statement would not be useful.

Instead I suggest that you should be accountable to people who share your beliefs. Having someone who disagrees with you try to model your beliefs and check your actions against that model seems like a source of conflict. Of course, stating your beliefs can be helpful in recognizing these people (but it is not the only method).

What's the motivation? In what case is lower accuracy for higher consistency a reasonable trade off? Especially consistency over time sounds like something that would discourage updating on new evidence.

3ozziegooen
I attempted to summarize some of the motivation for this here: https://www.lesswrong.com/posts/Df2uFGKtLWR7jDr5w/?commentId=tdbfBQ6xFRc7j8nBE
3Elizabeth
Some examples are where people care more about fairness, such as criminal sentencing and enterprise software pricing. However you're right that implicit in the question was "without new information appearing", although you'd want the answer to update the same way every time the same new information appeared.
3ChristianKl
If every study on depression used its own metric for depression that's optimal for the specific study, it would be hard to learn from the studies and aggregate information from them. It's much better when you have a metric that has consistency. Consistent measurements allow reacting to how a metric changes over time, which is often very useful for evaluating interventions.

Evaluating R on a single example of human behavior is good enough to reject R(2), R(4) and possibly R(3).

Example: this morning I went to the kitchen and picked up a knife. Among possible further actions, I had A - "make a sandwich" and B - "stab myself in the gut". I chose A. R(2) and R(4) say I wanted B and R(3) is indifferent. I think that's enough reason to discard them.

Why not do this? Do you not agree that this test discards dangerous R more often than useful R? My guess is that you're asking for very strong formal guarantees from the assumptions that you consider and use a narrow interpretation of what it means to "make IRL work".
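A sketch of the filtering step being described (the candidate reward functions and their labels are stand-ins for illustration; they are not the actual R(0)-R(4) from the paper):

```python
# Keep only the candidate reward functions consistent with one observed choice:
# the human picked "make a sandwich" (A) over "stab self in the gut" (B).
candidates = {
    "R_plausible":    {"make_sandwich": 1.0,  "stab_self": -10.0},
    "R_antirational": {"make_sandwich": -1.0, "stab_self": 10.0},   # wants B, like R(2)/R(4)
    "R_indifferent":  {"make_sandwich": 0.0,  "stab_self": 0.0},    # indifferent, like R(3)
}
observed, rejected = "make_sandwich", "stab_self"

# Strict inequality: rewards that prefer B, or are indifferent, are discarded.
surviving = {name: r for name, r in candidates.items() if r[observed] > r[rejected]}
print(surviving)  # only the reward that prefers the observed action remains
```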

2Stuart_Armstrong
Rejecting any specific R is easy - one bit of information (at most) per specific R. So saying "humans have preferences, and they are not always rational or always anti-rational" rules out R(1), R(2), and R(3). Saying "this apparent preference is genuine" rules out R(4). But it's not like there are just these five preferences and once we have four of them out of the way, we're done. There are many, many different preferences in the space of preferences, and many, many of them will be simpler than R(0). So to converge to R(0), we need to add huge amounts of information, ruling out more and more examples. Basically, we need to include enough information to define R(0) - which is what my research project is trying to do. What you're seeing as "adding enough clear examples" is actually "hand-crafting R(0) in totality". For more details see here: https://arxiv.org/abs/1712.05812

The point isn't that there is nothing wrong or dangerous about learning biases and rewards. The point is that the OP is not very relevant to those concerns. The OP says that learning can't be done without extra assumptions, but we have plenty of natural assumptions to choose from. The fact that assumptions are needed is interesting, but it is by no means a strong argument against IRL.

What if in reality due to effects currently beyond our understanding, our actions are making the future more likely to be dystopian in some way than if we took rando
... (read more)
6Stuart_Armstrong
You'd think so, but nobody has defined these assumptions in anything like sufficient detail to make IRL work. My whole research agenda is essentially a way of defining these assumptions, and it seems to be a long and complicated process.

I feel like there are several concerns mixed together, that should be separated:

1. Lack of communication, which is the central condition of the usual Schelling points.

2. Coordination (with some communication), where we agree to observe x41 because we don't trust the rest of the group to follow a more complex procedure.

3. Limited number of observations (or costly observations). In that case you may choose to only observe x41, even if you are working alone, just to lower your costs.

I don't think 2 and 3 have much to do with Schelling. These considera... (read more)

2Pattern
I think a "Theory" heading, and a "Example" heading would make for a nice compromise.

Is this ad hominem? Reasonable people could say that clone of saturn values ~1000 self-reports way too little. However it is not reasonable to claim that he is not at all skeptical of himself, and not aware of his biases and blind spots, and is just a contrarian.

"If I, clone of saturn, were wrong about Double Crux, how would I know? Where would I look to find the data that would disconfirm my impressions?"

Personally, I would go to a post about Double Crux, and ask for examples of it actually working (as Said Achmiz did). Alternatively, I would li... (read more)

The problem is that with these additional and obvious constraints, humans cannot be assigned arbitrary values, unlike the title of the post suggests. Sure there will be multiple R that pass any number of assumptions and we will be uncertain about which to use. However, because we don't perfectly know π(h), we had that problem to begin with. So it's not clear why this new problem matters. Maybe our confidence in picking the right R will be a little lower than expected, but I don't see why this reduction must be large.

4Rohin Shah
If we add assumptions like this, they will inevitably be misspecified, which can lead to other problems. For example, how would you operationalize that π is good at optimizing R? What if in reality due to effects currently beyond our understanding, our actions are making the future more likely to be dystopian in some way than if we took random actions? Should our AI infer that we prefer that dystopia, since otherwise we wouldn't be better than random? (See also the next three posts in this sequence.)
I learned a semester worth of calculus in three weeks

I'm assuming this is a response to my "takes years of work" claim. I have a few natural questions:

1. Why start counting time from the start of that summer program? Maybe you had never heard of calculus before that, but you had been learning math for many years already. If you learned calculus in 3 weeks, that simply means that you already had most of the necessary math skills, and you only had to learn a few definitions and do a little practice in applying them. Many people don't alre... (read more)

7Jay Molstad
1) True, but by the time that roommate took the class he had had comparable math foundations to what I had had when I took the class. Considering the extra years, arguably rather more. (Upon further thought I realized that I had taken the class in 1988 at the age of 15)

2) That was first-semester calc, Purdue's Math 161 class (for me and the roommate). Intro calc. Over the next two years I took two more semesters of calc, one of differential equations, and one of matrix algebra. By the time I met my freshman roommate (he was a bit older than me) and he started the calc class, I'd had five semesters of college math (which was all I ever took b/c I don't enjoy math). Also, that roommate was a below-average college student, but there are people in the world with far less talent than he had.

3) Because time is the only thing you can't buy. Time in college can be bought, but not cheaply even then. I got through school with good grades and went on to grad school as planned; his plans didn't work out. Of course time marched on and I had failures of my own.

I agree that there's more to success than one particular kind of intelligence. Persistence, looks, money, luck, and other factors matter. But my roommate's calculus aptitude was a showstopper for his engineering ambitions, and I don't think his situation was terribly uncommon.

The worst case scenario is if two people both decide that a question is settled, but settle it in opposite ways. Then we're only moving from a state of "disagreement and debate" to a state of "disagreement without debate", which is not progress.

I appreciate the concrete example. I was expecting more abstract topics, but applied rationality is also important. Double Cruxes pass the criteria of being novel and the criteria of being well known. I can only question if they actually work or made an impact (I don't think I see many examples of them in LW), and if LW actually contributed to their discovery (apart from promoting CFAR).

Answer by zulupineapple10

The fact that someone does not understand calculus, does not imply that they are incapable of understanding calculus. They could simply be unwilling. There are many good reasons not to learn calculus. For one, it takes years of work. Some people may have better things to do. So I suggest that your entire premise is dubious - the variance may not be as large as you imagine.

2Jay Molstad
Personally, I learned a semester worth of calculus in three weeks for college credit at a summer program (the Purdue College Credit Program circa 1989, specifically) when I was 16. Out of 20ish students (pre-selected for academic achievement), about 15% (see note 1) aced it while still goofing around, roughly 60% got college credit but found the experience difficult, and some failed. Two years later, my freshman roommate (note 2) took the same Purdue course over 16 weeks and failed it. The question isn't "why don't some people understand calculus", but "why do some people learn it easily while others struggle, often failing". Note 1: This wasn't a statistically robust sample. "About 15%" means "Chris, Bill, and I". Note 2: That roommate wanted to be an engineer and was well aware that he could only achieve that goal by passing calculus. He was often working on his homework at 1:30 am, much to my annoyance. He worked harder on that course than I had, despite being 18 years old and having a (presumably) more mature brain.

That's a measly one in a billion. Why would you believe that this is enough? Enough for what? I'm talking about the preferences of a foreign agent. We don't get to make our own rules about what the agent prefers, only the agent can decide that.

Regarding practical purposes, sure you could treat the agent as if it was indifferent between A, B and C. However, given the binary choice, it will choose A over B, every time. And if you offered to trade C to B, B to A and A to C, at no cost, then the agent would gladly walk the cycle any number of times (if we can ignore the inherent costs of trading).
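A toy version of the trading cycle described above (purely illustrative; the pairwise preferences are just the A over B, B over C, C over A pattern from the comment):

```python
# An agent with cyclic strict preferences accepts every free trade, forever.
prefers = {("A", "B"), ("B", "C"), ("C", "A")}   # (x, y) means x is strictly preferred to y

holding, trades = "A", 0
for offered in ["C", "B", "A", "C", "B", "A"]:   # trades offered at no cost
    if (offered, holding) in prefers:            # agent strictly prefers the offered item
        holding, trades = offered, trades + 1
print(holding, trades)  # back to "A" after 6 accepted trades; the cycle never settles
```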

Defecting in Prisoner's dilemma sounds morally bad, while defecting in Stag hunt sounds more reasonable. This seems to be the core difference between the two, rather than the way their payoff matrices actually differ. However, I don't think that viewing things in moral terms is useful here. Defecting in Prisoner's dilemma can also be reasonable.

Also, I disagree with the idea of using "resource" instead of "utility". The only difference the change makes is that now I have to think, "how much utility is Alexis getting from 10 resources?" and come up with my own value. And if his utility function happens not to be monotone increasing, then the whole problem may change drastically.

This is all good, but I think the greatest problem with prediction markets is low status and low accessibility. To be fair though, improved status and accessibility are mostly useful in that they bring in more "suckers".

There is also a problem of motivation - the ideal of futarchy is appealing, but it's not clear to me how we go from betting on football to impacting important decisions.

Note that the key feature of the log function used here is not its slow growth, but the fact that it takes negative values on small inputs. For example, if we take the function u(r) = log(r+1), so that u(0)=0, then RC holds.

Although there are also solutions that prevent RC without taking negative values, e.g. u(r) = exp(-1/r).
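If the intended setup is the usual one where a fixed resource total R is split evenly among N people and total value is N·u(R/N) (an assumption on my part, not stated in the thread), the difference is easy to check:

$$N\,\log\!\Big(1+\tfrac{R}{N}\Big) \;\nearrow\; R \quad (N \to \infty), \qquad\qquad N\,e^{-N/R} \;\to\; 0 \quad (N \to \infty),$$

so with u(r) = log(r+1) total value keeps rising as the population grows (RC holds), while with u(r) = exp(-1/r) it peaks at a finite population even though u is positive everywhere.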

a longer time horizon

Now that I think of it, a truly long-term view would not bother with such mundane things as making actual paperclips with actual iron. That iron isn't going anywhere, it doesn't matter whether you convert it now or later.

If you care about maximizing the number of paperclips at the heat death of the universe, your greatest enemies are black holes, as once some matter has fallen into them, you will never make paperclips from that matter again. You may perhaps extract some energy from the black hole, and convert that into matter... (read more)
