All of Joe Rogero's Comments + Replies

Chaos in complex systems is guaranteed but also bounded. I cannot know what the weather will be like in New York City one month from now. I can, however, predict that it probably won't be "tornado" and near-certainly won't be "five hundred simultaneous tornadoes level the city". We know it's possible to build buildings that can withstand ~all possible weather for a very long time. I imagine that a thing you're calling a puppet-master could build systems that operate within predictable bounds robustly and reliably enough to more or less guarantee broad cont... (read more)

3WillPetillo
Verifying my understanding of your position: you are fine with the puppet-master and psychohistorian categories and agree with their implications, but you put the categories on a spectrum (systems are not either chaotic or robustly modellable, chaos is bounded and thus exists in degrees) and contend that ASI will be much closer to the puppet-master category.  This is a valid crux. To dig a little deeper, how does your objection sustain in light of my previous post, Lenses of Control?  The basic argument there is that future ASI control systems will have to deal with questions like: "If I deploy novel technology X, what is the resulting equilibrium of the world, including how feedback might impact my learning and values?"  Does the level chaos in such contexts remain narrowly bounded? EDIT for clarification: the distinction between the puppet-master and psychohistorian metaphors is not the level of chaos in the system they are dealing with, but rather is about the extent of direct control that the control system of the ASI has on the world, where the control system is a part of the AI machinery as a whole (including subsystems that learn) and the AI is a part of the world.  Chaos factors in as an argument for why human-compatible goals are doomed if AI follows the psychohistorian metaphor.

I think the whole concept of labeling goods as "fungible" or "non-fungible" is a category error. Everything trades off against something. 

Either you value your fingers more than what [some specific amount of money] will buy you or you don't. If you value your fingers more, then keeping them is the right call for you. 

Lots of things have a value that we might call "infinite" according to this argument. Everything from a human life to reading a book spoiler counts as "something you cannot buy back if you regret it later." 

Even if we choose to label some things as "non-fungible", we must often weigh them against each other nevertheless. I claim, not that the choice never hurts, but that there is no need to feel guilty about it. 

2dr_s
Well, yes, it's true, and obviously those things do not necessarily all have genuinely infinite value. I think what this really means in practice is not that all non-fungible things have infinite value, but that because they are non-fungible, most judgements involving them are not as easy or straightforward as simple numerical comparisons. Preferences end up being expressed anyway, but just because practical needs force a square peg into a round hole doesn't make it fit any better. I think in practice this manifests in high rates of hesitation or regret for decisions involving such things, and in the general difficulty of really squaring decisions like these.

We can agree in one sense that several trillion dollars in charity is a much greater good than someone not having their fingers cut off, and yet we generally wouldn't call that person "evil" for picking the latter option, because we understand perfectly how, to someone, their own fingers might feel more valuable. If we were talking about fungible goods we'd feel very differently. Replace cutting off one's fingers with, e.g., demolishing their house.

True, it can always hurt. I note, however, that's not quite the same thing as feeling like you made a terrible deal, and also that feeling pain at the loss of a treasured thing is not the same as feeling guilty about the choice. 

6Jozdien
Many deals in the real world have a lot of positive surplus. Most deals I would like to make have positive surplus. I would still make a deal trading something less valuable for something more valuable, but if the margins are very thin (or approaching zero), then I wouldn't like the deal even as I make it. I can feel like it's a terrible deal because the deals I want would have a lot more surplus to them, ideally involving a less painful cost.

Does this also mean there is no such thing as "inherent good"? 

Yes. 

If so, then one cannot say "X is good"; they would have to say "I think that X is good", for "good" would be a fact of their mind, not the environment.

One can say all sorts of things. People use the phrase "X is good" to mean lots of things: "I'm cheering for X", "I value X", "X has consequences most people endorse", etc. I don't recommend we abandon the phrase, for many phrases are similarly ambiguous but still useful. I recommend keeping this ambiguity in mind, however, and di... (read more)

What happens then when a non-thinking thing feels happy? Is that happiness valued? To whom? Or do you think this is impossible?

When a baby feels happy, it feels happy. Nothing else happens. 

There are differences among wanting, liking, and endorsing something. 

A happy blob may like feeling happy, and might even feel a desire to experience more of it, but it cannot endorse things if it doesn't have agency. Human fulfillment and wellbeing typically involve some element of all three. 

An unthinking being cannot value even its own happiness, beca... (read more)

4Myles H
Does this also mean there is no such thing as "inherent good"? If so, then one cannot say "X is good"; they would have to say "I think that X is good", for "good" would be a fact of their mind, not the environment. This is what I thought the whole field of morality is about: defining what is "good" in an objective, fundamental sense. And if "inherent good" can exist but not "inherent value", how would "good" be defined, for it wouldn't be allowed to use "value" in its definition?

I think you have correctly noticed an empirical fact about emotions (they tend to be preferred or dispreferred by animals who experience them) but are drawing several incorrect conclusions therefrom. 

First and foremost, my model of the universe leaves no room for it valuing anything. "Values" happen to be a thing possessed by thinking entities; the universe cares not one whit more for our happiness or sadness than the rules of the game of chess care whether the game is won by white or black. Values happen inside minds; they are not fundamental to the ... (read more)

1Myles H
What happens then when a non-thinking thing feels happy? Is that happiness valued? To whom? Or do you think this is impossible? I can imagine it possible for a fetus in the womb, without any thoughts, sense of self, or ability to move, to still be capable of feeling happiness. Now try to imagine a hypothetical person with a severe mental disability preventing them from having any cohesive thoughts, sense of self, or ability to move. Could they still feel happiness? What happens when the dopamine receptors get triggered?

It is my hypothesis that the mechanism by which emotions are felt does not require a "thinking" agent. This could be false, and I now see how this is an assumption which many of my arguments rely on. Thank you for catching that. It just seems so clear to me: when I feel pain or pleasure, I don't need to "think" about it for the emotion to be felt. I just immediately feel the pain or pleasure.

Anyway, if you assume that it is possible for a non-thinker to still be a feeler, then there is nothing logically inconceivable about a hypothetical happy rock. Then if you also say that happiness is good, and that good implies value, one must ask: who or what is valuing the happiness? The rock? The universe? Ok, maybe not "the universe" as in the collection of all objects within the universe. I'm more trying to say "the fabric of reality". Like there must be some physical process by which happiness is valued. Maybe a dimension by which emotional value is expressed?

You are partly correct about this. When I said I terminally value the making of kinetic sculptures, I was definitely making a simplification. I don't value the making of all kinetic sculptures, and I also value the making of things which aren't kinetic sculptures. I don't, however, do it because I think it is "fun". I can't formally define what the actual material terminal goal is, but it is something more along the lines of, "something that is challenging, and requires a certain kind of

I'm assuming the Cosmic Flipper is offering, not a doubling of the universe's current value, but a doubling of its current expected value (including whatever you think the future is worth) plus a little more. If it's just doubling current niceness or something, then yeah, that's not nearly enough. 
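Spelled out as a quick worked calculation (my own sketch; $V$ for the universe's current expected value and $\varepsilon$ for the "little more" are placeholder symbols, not terms from the original thought experiment), the 50/50 flip is favorable by construction:

$$\mathbb{E}[\text{take the flip}] = \tfrac{1}{2}(2V + \varepsilon) + \tfrac{1}{2}\cdot 0 = V + \tfrac{\varepsilon}{2} > V = \mathbb{E}[\text{decline}].$$

That is the only sense in which I mean the offer is good "by definition", whatever $V$ actually turns out to be.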

1Logos Of The Meta Agent
I'd missed that, thank you for pointing that out. If "expected" effectively means you're being offered a bet that is good by definition, such that even at 50/50 odds you take the bet, then I suppose that's true. If the bet is static for a second flip, it wouldn't be a good deal, but if it dynamically altered so that it was once again a good bet by definition, I suppose you keep taking the bet.

If you're engaging with the uncertainty that people are bad at evaluating things like "expected utility", then at least some of the point is that our naive intuitions are probably missing some of the math, and costs, and the bet is likely a bad bet. If I were trying to give credence to that second possibility, I'd say that the word "expected" is now doing a bunch of hidden heavy lifting in the payoff structure, and you don't really know what lifting it's doing.
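A rough simulation of the "keep taking the bet" case (a sketch under assumed 50/50 double-or-nothing-plus-a-little odds; the 2.01 multiplier, trial counts, and function name are illustrative assumptions, not details from the thought experiment):

```python
import random

WIN_MULTIPLIER = 2.01  # stand-in for "double the expected value plus a little more"

def survival_probability(rounds: int, trials: int = 100_000) -> float:
    """Estimate the chance of still having anything after `rounds`
    consecutive 50/50 all-in bets at WIN_MULTIPLIER."""
    survived = 0
    for _ in range(trials):
        value = 1.0
        for _ in range(rounds):
            if random.random() < 0.5:
                value *= WIN_MULTIPLIER  # win: value more than doubles
            else:
                value = 0.0              # lose: everything is gone
                break
        if value > 0:
            survived += 1
    return survived / trials

# Each individual bet has positive expected value (0.5 * 2.01 = 1.005 > 1),
# yet the chance of surviving n rounds is (1/2)**n:
for n in (1, 5, 10, 20):
    print(n, survival_probability(n))
```

Each flip is favorable in expectation, yet the probability of still having anything after n flips shrinks as (1/2)^n, which is the usual tension between maximizing expected value and near-certain eventual ruin when betting everything repeatedly.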

Alas, I am not familiar with Lara Buchak's arguments, and the high-level summary I can get from Googling them isn't sufficient to tell me how they're supposed to capture something that utility maximization can't. Was there a specific argument you had in mind? 

Did he really? If true, that's actually much dumber than I thought, but I couldn't find anything saying that when I looked. 

I wouldn't characterize that as a "commitment to utilitarianism", though; you can be a perfect utilitarian and have value that is linear in matter and energy (and presumably number of people?), or be a perfect utilitarian and have some other value function. 

The possible redundancy of conscious patterns was one of the things I was thinking about when I wrote:

Secondly, and more importantly, I question whether it is possible ev

... (read more)

I don't actually mean the thing you're calling the motte at all, and I'm not sure I agree with the bailey either. The thought experiment as I understand it was never quite a St. Petersburg Paradox, because both the payout ("double universe value") and the method of choosing how to play (a single initial payment vs. repeatedly choosing to bet everything each time) are different. It also can't literally be applied to the real world at all; part of the point is that I don't even know what it would look like for this scenario to be possible in the real world, there a... (read more)

Heard of it, but this particular application is new. There's a difference, though, between "this formula can be a useful strategy to get more value" and "this formula accurately reflects my true reflectively endorsed value function." 

Thanks for your thoughts, Cam! The confusion as I see it comes from sneaking in assumptions with the phrase "what they are trained to do". What are they trained to do, really? Do you, personally, understand this? 

Consider Claude's Constitution. Look at the "principles in full" - all 60-odd of them. Pick a few at random. Do you wholeheartedly endorse them? Are they really truly representative of your values, or of total human wellbeing? What is missing? Would you want to be ruled by a mind that squeezed these words as hard as physically possible, to th... (read more)

Love this post. I've also used the five-minute technique at work, especially when facilitating meetings. In fact, there's a whole technique called think-pair-share that goes something like: 

  1. Everyone think about it for X minutes. Take notes. 
  2. Partner up and talk about your ideas for 2X minutes. 
  3. As a group, discuss the best ideas and takeaways for 4X minutes. 

There's an optional step involving groups of four, but I'd rarely bother with that one unless it's a really huge meeting (and at that point I'm actively trying to shrink it because huge committees are shit decision-makers). 

2Screwtape
Thank you for the addition! Pairing up to talk through the ideas seems like a good group technique, I like the sequence you outline!

This was a good post, and shifted my view slightly on accelerating vs halting AI capabilities progress.

I was confused by your "overhang" argument all the way until footnote 9, but I think I have the gist. You're saying that even if absolute progress in capabilities increases as a result of earlier investment, progress relative to safety will be slower.

A key assumption seems to be that we are not expecting doom immediately; i.e., it is deemed nearly impossible that the next major jump in capabilities will kill us all via misaligned AI. I'm not sure I buy this assumpt... (read more)

I found this a very useful post. I would also emphasize how important it is to be specific, whether one's project involves a grand x-risk moonshot or a narrow incremental improvement. 

  • There are approximately X vegans in America; estimates of how many might suffer from nutritional deficiencies range from Y to Z; this project would...
  • An improvement in epistemic health on [forum] would potentially affect X readers, which include Y donors who gave at least $Z to [forum] causes last year...
  • A 1-10% gain in productivity for the following people and organizat
... (read more)

Greetings from The Kingdom of Lurkers Below. Longtime reader here with an intro and an offer. I'm a former Reliability Engineer with expertise in data analysis, facilitation, incident investigation, technical writing, and more. I'm currently studying deep learning and cataloguing EA projects and AI safety efforts, as well as facilitating both formal and informal study groups for AI Safety Fundamentals. 

I have, and am willing to offer to EA or AI Safety focused individuals and organizations, the following generalist skills:

  • Facilitation. Organize and ru
... (read more)