MinusGix

Programmer.

I define rationality as "more in line with your overall values". There are problems here, because people do profess social values that they don't really hold (in some sense), but roughly it is what they would reflect on and come up with.
Someone could value the short-term more than the long-term, but I think that most don't. I'm unsure if this is a side-effect of Christianity-influenced morality or just a strong tendency of human thought.

Locally optimal is probably the correct framing, with the addition that it is irrational relative to whatever idealized values the individual would have. Just like how a hacky approximation of a chess engine is irrational relative to Stockfish: both can roughly be considered to have the same goal, but one has various heuristics and short-term thinking that hamper it. These heuristics can be essential, since it runs with less processing power, but in the human mind they can be trained and tuned.

I do agree that smoking isn't always irrational; I would say, though, that it is irrational for the supermajority of human minds. The social negativity around smoking may be what primarily influences them, but I'd consider that just another fragment of the irrationality: more than 90% of them place value on their health, but they are poor, to varying degrees, at weighing the costs, and the social-disapproval response is easier for the mind to emulate. Especially since they might see people walking past them while they're out having a cigarette. (Of course, the social approval is in part a real value too, though people have preferences about which social values they give in to.)

Answer by MinusGix

An important question here is "what is the point of being 'more real'?". Does having a higher measure give you a better acausal bargaining position? Do you terminally value more realness? Less vulnerable to catastrophes? Wanting to make sure your values are optimized harder?

I consider these, except for the terminal sense, to be rather weak as far as motivations go.

Acausal Bargaining: Imagine a bunch of nearby universes with instances of 'you'. They all have variations, some very similar, others with directions that seem a bit strange to the others. Still identifiably 'you' by a human notion of identity. Some of them became researchers, others investors, a few artists, writers, and a handful of CEOs.

You can model these as being variations on some shared utility function: $U_i = U_s + \varepsilon_i$, where $U_s$ is shared, and $U_i$ is the individual utility function. Some of them are more social, others cynical, and so on. A believable amount of human variation that won't necessarily converge to the same utility function on reflection (but quite close).
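Spelled out (this notation is just my own way of writing the model above, matching the symbols used in the rest of this answer):

```latex
% Each branch i: a shared core plus an individual remainder.
U_i = U_s + \varepsilon_i
% "Forgetting" (as discussed below) deletes the remainder,
% collapsing the branch onto the shared core:
\mathrm{forget}(U_i) = U_s
```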

For a human, losing memories so that you are more real is akin to each branch chopping off the $\varepsilon_i$. They lose memories of a wonderful party which changed their opinion of them, they no longer remember the horrors of a war, and so on.

Everyone may take the simple step of losing all their minor memories, which has no effect on the utility function, but then if you want more bargaining power, do you continue? The hope is that this would make your coalition easier to locate, more visible to "logical sight", and that this increased bargaining power would thus ensure that, at the least, your important shared values are optimized harder than they could be if you were a disparate group of branches.

I think this is sometimes correct, but often not.
From a simple computationalist perspective, increasing the measure of the 'overall you' is of little matter. The part that bargains, your rough algorithm and your utility function, is already shared: $U_s$ is common to all your instances already; some of you just have considerations that pull in other directions (the $\varepsilon_i$). This is the same core idea as the FDT explanation of why people should vote: despite the other voters not being clones of you, there is a group of people who share reasoning similar to yours. Getting rid of your memories in the voting case does not help you!
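To make that concrete, here is a deliberately toy sketch of my own (the `Branch` class and its fields are made-up illustrations, not anything from the post or from FDT itself): each branch is a shared core plus an individual quirk, coordinated decisions read only the shared core, and "forgetting" just deletes the quirk.

```python
from dataclasses import dataclass

# Toy model: a branch of "you" is a shared utility core (U_s) plus an
# individual remainder (epsilon_i). Coordinated, FDT-style decisions depend
# only on the shared core.

@dataclass(frozen=True)
class Branch:
    shared_core: str       # stands in for U_s, common to every branch
    individual_quirk: str  # stands in for epsilon_i: memories, local tastes

    def forget(self) -> "Branch":
        """Model 'forgetting': drop the individual remainder entirely."""
        return Branch(self.shared_core, individual_quirk="")

    def coordinated_decision(self) -> str:
        """The bargaining/voting-style choice reads only the shared core."""
        return f"optimize {self.shared_core}"

branches = [
    Branch("U_s", "researcher memories"),
    Branch("U_s", "investor memories"),
    Branch("U_s", "artist memories"),
]

# Before forgetting: three distinct whole algorithms, one shared decision.
print(len(set(branches)))                                # -> 3
print({b.coordinated_decision() for b in branches})      # -> {'optimize U_s'}

# After forgetting: one identical algorithm instantiated three times
# ("more real"), but the coordinated decision is unchanged.
forgotten = [b.forget() for b in branches]
print(len(set(forgotten)))                               # -> 1
print({b.coordinated_decision() for b in forgotten})     # -> {'optimize U_s'}
```

The number of identical copies goes up after forgetting, but nothing that the coordinated decision depends on has changed, which is the sense in which I don't think the extra measure buys bargaining power.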

For the Acausal Bargaining case, there is presumably some value in being simpler. But that suggests you should bargain 'nearby' in order to present a computationally cheaper value function 'far away'. So, similar to forgetting, you appear as if you have some shared utility function, but without actually forgetting, and thus you remain able to optimize for your $\varepsilon_i$ in your local universe. As well, the bargained utility function presented far away (where there is less logical sight into your cluster of universes) is unlikely to be exactly $U_s$ anyway.


So, overall, my argument would be that forgetting does give you more realness. If at 7:59 AM a large chunk of universes decide to replace part of their algorithm with a specific coordinated one (like removing a memory), then that algorithm is instantiated across more universes. But from a decision-theoretic perspective, I don't think that matters too much? You already share the important decision-theoretic parts, even if the whole algorithm is not shared.

From a human perspective we may care about this as a value of wanting to 'exist more' in some sense. I think this is a reasonable enough value to have, but it is often satisfied by considering that sharing your decision methods and 99.99% of your personality is enough.

My main question about whether this is useful beyond a terminal value for existing more concerns quantum immortality, of which I am more uncertain.

Beliefs and predictions that influence wants may be false or miscalibrated, but the feeling itself, the want itself, just is what it is, the same way sensations of hunger or heat just are what they are.

I think this may be part of the disconnect between me and the article. I often view the short jolt preferences (that you get from seeing an ice-cream shop) as heuristics, as effectively predictions paired with some simpler preference for "sweet things that make me feel all homey and nice". These heuristics can be trained to know how to weigh the costs, though I agree just having a "that's irrational" / "that's dumb" is a poor approach to it. Other preferences, like "I prefer these people to be happy" are not short-jolts but rather thought about and endorsed values that would take quite a bit more to shift—but are also significantly influenced by beliefs too.

Other values like "I enjoy this aesthetic" seem more central to your argument than short-jolts or considered values.

This is why you could view a smoker's preference for another cigarette as irrational: the 'core want' is just a simple preference for the general feel of smoking a cigarette, but the short-jolt preference has the added prediction of "and this will be good to do". But that added prediction is false and inconsistent with everything they know. The usual statement of "you would regret this in the future". Unfortunately, the short-jolt preference often has enough strength to get past the other preferences, which is why you want to downweight it.

So, I agree that there are various preferences whose possession is disentangled from whether you're rational or not, but I also think most preferences are quite entangled with predictions about reality.

“inconsistent preferences” only makes sense if you presume you’re a monolithic entity, or believe your "parts" need to all be in full agreement all the time… which I think very badly misunderstands how human brains work.

I agree that humans can't manage this, but it does still make sense for a non-monolithic entity: you'd take there being an inconsistency as a sign that there's a problem, which is what people tend to do, even if it can't be fixed.

Finally, the speed at which you communicate vibing means you're communicating almost purely from System 1, expressing your actual felt beliefs. It makes deception both of yourself and others much harder. Its much more likely to reveal your true colors. This allows it to act as a values screening mechanism as well.

I'm personally skeptical of this. I've found I'm far more likely to lie than I'd endorse when vibing: saying "sure, I'd be happy to join you on X event" when it is clear with some thought that I'd end up disliking it, or exaggerating stories because it fits the vibe.
I view System 1 as less concerned with truth here; it is the one more likely to produce a fake argument in response to a suggested problem, and more likely to play social games regardless of whether they make sense.

I agree that it is easy to automatically lump the two concepts together.

I think another important part of this is that there are limited methods for most consumers to coordinate against companies to lower their prices. There's shopping elsewhere, leaving a bad review, or moral outrage. The last may have a chance of blowing up socially, such as becoming a boycott (but boycotts are often considered ineffective), or it may encourage the government to step in. In our current environment, the government often operates as the coordination method to punish companies for behaving in ways that people don't want. In a much more libertarian society we would want this replaced with other methods, so that consumers can make it harder to put themselves in a prisoner's dilemma or stag hunt against each other.
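As a toy illustration of that last point (the payoff numbers and move names here are my own invention, purely to show the structure): two consumers each choose whether to hold out against a price hike or quietly keep buying, with stag-hunt payoffs where holding out only pays if the other does too.

```python
# Toy stag hunt between two consumers: "coordinate" = hold out / support a
# boycott, "defect" = keep buying. Payoffs are illustrative only.
payoffs = {
    ("coordinate", "coordinate"): 3,  # enough pressure: the price comes down
    ("coordinate", "defect"):     0,  # I go without and nothing changes
    ("defect",     "coordinate"): 2,  # I keep buying while the other holds out
    ("defect",     "defect"):     2,  # status quo: everyone pays more
}

def best_response(their_move: str) -> str:
    """My best move if I already know what the other consumer will do."""
    return max(("coordinate", "defect"),
               key=lambda my_move: payoffs[(my_move, their_move)])

# Two stable outcomes: coordinating is best only if the other coordinates.
print(best_response("coordinate"))  # -> coordinate
print(best_response("defect"))      # -> defect
```

With no way to assure each other, consumers can get stuck in the (defect, defect) outcome, which is the gap that boycotts, consumer organizations, or the state currently fill.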

If we had common organizations for milder coordination than the state stepping in, then I believe this would improve the default mentality, because there would be more options.

It has also led to many shifts in power between groups based on how well they exploit reality. From hunter-gatherers to agriculture, to grand armies spreading an empire, to ideologies changing the fates of entire countries, and to economic & nuclear super-powers making complex treaties.

This reply is perhaps a bit too long, oops.


Having a body that does things is part of your values and is easily described in them. I don't see deontology or virtue ethics as giving any more fundamentally adequate solution to this (beyond the trivial 'define a deontological rule about ...', or 'it is virtuous to do interesting things yourself', but why not just do that with consequentialism?).
My attempt at interpreting what you mean is that you're drawing a distinction between morality about world-states vs. morality about process, internal details, experiencing it, 'yourself'. To give them names: "global" values (you just want them Done) & "indexical"/"local" values (preferences about your experiences, what you do, etc.). Global would be reducing suffering, avoiding heat death and whatnot. Local would be that you want to learn physics from the ground up and try to figure out XYZ interesting problem as a challenge by yourself, that you would like to write a book rather than having an AI do it for you, and so on.

I would say that, yes, for Global you should/would have an amorphous blob that doesn't necessarily care about the process. That's your (possibly non-sentient) AGI designing a utopia while you run around doing interesting Local things. Yet I don't see why you think only Global is naturally described in consequentialism.

I intrinsically value having solved hard problems—or rather, I value feeling like I've solved hard problems, which is part of overall self-respect, and I also value realness to varying degrees. That I've actually done the thing, rather than taken a cocktail of exotic chemicals. We could frame this in a deontological & virtue ethics sense: I have a rule about realness, I want my experiences to be real. / I find it virtuous to solve hard problems, even if in a post-singularity world.
But do I really have a rule about realness? Uh, sort of? I'd be fine playing a simulation where I forget about the AGI world and am in some fake sci-fi game world solving hard problems. In reality, my value has a lot more edge cases that will be explored than many deontological rules would prefer. My real value isn't really a rule; it is just sometimes easy to describe it that way. Similar to how "do not lie" or "do not kill" is usually not a true rule.
Like, we could describe my actual value here as a rule, but that seems more alien to the human mind. My actual value for realness is some complicated function of many aspects of my life, preferences, current mood to some degree, second-order preferences, and so on. Describing that as a rule is extremely reductive.
And 'realness' is not adequately described as a complete virtue either. I don't always prefer realness: if playing a first-person shooter game, I prefer that my enemies are not experiencing realistic levels of pain! So there are intricate trade-offs here as I continue to examine my own values.


Another aspect I'm objecting to mentally when I try to apply those stances is that there are two ways of interpreting deontology & virtue ethics that I think are common on LW. You can treat them as actual philosophical alternatives to consequentialism, like following the rule "do not lie". Or you can treat them as essentially fancy words, with deontology => "a strong prior that this rule is generally correct and also a good coordination point" and virtue ethics => "acting according to a good Virtue consistently as a coordination scheme/culture modification scheme, and/or because you also think that Virtue is itself a Good".
Like, there's a difference between talking about something using the language commonly associated with deontology and actually practicing deontology. I think conflating the two is unfortunate.

The overarching argument here is that consequentialism properly captures a human's values, and that you can use the basic language of "I keep my word" (deontology flavored) or "I enjoy solving hard problems because they are good to solve" (virtue ethics flavored) without actually operating within those moral theories. You would have the ability to unfold these into consequentialist statements of whatever form you prefer.


In your reply to cubefox, "respect this person's wishes" is not a deontological rule. Well, it could be, but I expect your actual values don't fulfill that. Just because your native internal language suggestively calls it that doesn't mean you should shoehorn it into the category of rule!
"play with this toy" still strikes me as natively a heuristic/approximation to the goal of "do things I enjoy". The interlinking parts of my brain that decided to bring that forward is good at its job, but also dumb because it doesn't do any higher order thinking. I follow that heuristic only because I expect to enjoy it—the heuristic providing that information. If I had another part of my consideration that pushed me towards considering whether that is a good plan, I might realize that I haven't actually enjoyed playing with a teddy bear in years despite still feeling nostalgia for that. I'm not sure I see the gap between consequentialism and this. I don't have the brain capacity to consider every impulse I get, but I do want to consider agents other than AIXI to be a consequentialist.
I think there's a space in there for a theory of minds, but I expect it would be more mechanistic or descriptive than a moral theory. À la shard theory.

Alternatively, even if you don't buy my view that the majority of my heuristics can be cast as approximations of consequentialist propositions, deontology/virtue ethics are not natural theories either, by your own descriptions. They miss a lot of complexity even within their usual remit.

MinusGix

I think there are two parts to the argument here:

  • Issues of expressing our values in a consequentialist form
  • Whether or not consequentialism is the ideal method for humans

The first I consider not a major problem. Mountain climbing is not what you put into the slot to maximize, but you do put happiness/interest/variety/realness/etc. into that slot. This then falls back into questions of "what are our values". Consequentialism provides an easy answer here: mountain climbing is preferable along important axes to sitting inside today. This isn't always entirely clear to us, and we don't always think natively in terms of consequentialism, but I disagree with:

There are many reasons to do things - not everything has to be justified by consequences.

We just don't usually think in terms of consequences; we think in terms of the emotional feeling of "going mountain climbing would be fun". This is a heuristic, but it is ultimately about consequences: that we would enjoy the outcome of mountain climbing more than the alternatives immediately available to our thoughts.

This segues into the second part. Is consequentialism what we should be considering? There have been posts about this before, on whether our values are actually best represented in a consequentialist framework.
For mountain climbing, despite the heuristic of "I feel like mountain climbing today", if I learned that I would actually enjoy going for an hour's run and then heading back home more, then I would do that instead. When I'm playing with some project, part of that is driven by in-the-moment desires, but ultimately it comes from a sense that this would be an enjoyable route. This is part of why I view the consequentialist lens as a natural extension of most if not all of our heuristics.
An agent that really wanted to go in circles doesn't necessarily have to stop, but for humans we do care about that.
There's certainly a possible better language/formalization for talking about agents that are mixes of consequentialist and non-consequentialist parts, which would be useful for describing humans, but I am also skeptical of your arguments for non-consequentialist elements of human desires.

If I value a thing at one period of life and turn away from it later, I have not discovered something about my values. My values have changed. In the case of the teenager we call this process “maturing”. Wine maturing in a barrel is not becoming what it always was, but simply becoming, according to how the winemaker conducts the process.

Your values change according to the process of reflection - the grapes mature into wine through fun chemical reactions.
From what you wrote, it feels like you are mostly considering your 'first-order values'. However, you also have an updating process that you have values about. For example, I wouldn't respect simple mind control that alters my first-order values, because my values treat mind control as disallowed. Similarly, I wouldn't take a very potent drug even if I know my first-order values would rank the feeling very highly, because I don't endorse that specific sort of change.

I have never eaten escamoles. If I try them, what I will discover is what they are like to eat. If I like them, did I always like them? That is an unheard-falling-trees question.

Then we should split the question. Do you have a value for escamoles specifically before eating them? No. Do you have a system of thought (of updating your values) that would ~always result in liking escamoles? Well, no, not in full generality. You might end up with some disease that affects your tastebuds permanently. But in some reasonably large class of normal scenarios, your values would consistently update in a way that ends up liking escamoles were you to ever eat them. (But really, the value for escamoles here is more an instrumental value for [insert escamole flavor, texture, etc.], of which escamoles are learned to be a good instance.)

What johnwentworth mentions would then be the question of "Would this approved process of updating my values converge to anything"; or tend to in some reasonable reference class; or at least have some guaranteed properties that aren't freely varying. I don't think he is arguing that the values are necessarily fixed and always persistent (I certainly don't always handle my values according to my professed beliefs about how I should update them), but that they're constrained. That the brain also models them as reasonably constrained, and that you can learn important properties of them.
