> All the rest is an act of shared imagination. It’s a dream we weave around a status game.
> They’re part of the dream of reality in which they exist, a dream that feels no less obvious and true to them than ours does to us.
> Moral ‘truths’ are acts of imagination. They’re ideas we play games with.
IDK, I feel like you could say the same sentences truthfully about math, and if you "went with the overall vibe" of them, you might be confused and mistakenly think math was "arbitrary" or "meaningless", or doesn't have a determinate tendency, etc. Like, okay, if I say "one element of moral progress is increasing universalizability", and you say "that's just the thing your status cohort assigns high status", I'm like, well, sure, but that doesn't mean it doesn't also have other interesting properties, like being a tendency across many different peoples; like being correlated with the extent to which they're reflecting, sharing information, and building understanding; like resulting in reductionist-materialist local outcomes that have more of the material local things that people otherwise generally seem to like (e.g. not being punched, having food, etc.); etc. It could be that morality has tendencies, but not without hormesis and mutually assured destruction and similar things that might be removed by aligned AI.
Attacking you? I said I don't want to be around you and don't want to invest in you. I said it with a touch of snark ("remind me").
> I clearly meant that it applies to everyone in different ways
Not clear to me. I don't think everyone "would sacrifice a lot" to "see the people [they] hate being harmed". I wouldn't. I think behaving that way is inadvisable for you and harmful to others, and will tend to make you a bad investment opportunity.
You sound like you're positing the existence of two types of people: type I people who have morality based on "reason" and type II people who have morality based on the "status game". In reality, nearly everyone's morality is based on something like the status game (see also: 1 2 3). It's just that EAs and moral philosophers are playing the game in a tribe which awards status differently.
The true intrinsic values of most people do place a weight on the happiness of other people (that's roughly what we call "empathy"), but this weight is very unequally distributed.
There are definitely thorny questions regarding the best way to aggregate the values of different people in TAI. But, I think that given a reasonable solution, a lower bound on the future is imagining that the AI will build a private utopia for every person, as isolated from the other "utopias" as that person wants it to be. Probably some people's "utopias" will not be great, viewed in utilitarian terms. But, I still prefer that over paperclips (by far). And, I suspect that most people do (even if they protest it in order to play the game).
> It’s just that EAs and moral philosophers are playing the game in a tribe which awards status differently.
Sure, I've said as much in recent comments, including this one. ETA: Related to this, I'm worried about AI disrupting "our" status game in an unpredictable and possibly dangerous way. E.g., what will happen when everyone uses AI advisors to help them play status games, including the status game of moral philosophy?
> The true intrinsic values of most people do place a weight on the happiness of other people (that’s roughly what we call “empathy”), but this weight is very unequally distributed.
What do you mean by "true intrinsic values"? (I couldn't find any previous usage of this term by you.) How do you propose finding people's true intrinsic values?
These weights, if low enough relative to other "values", haven't prevented people from committing atrocities on each other in the name of morality.
> ...There are definitely thorny questions regarding the best way to aggregate the values of different people in TAI. But, I think that given a reasonable solution, a lower bound on the future is imagining that the AI will build a private utopia for every person, as isolated from the other "utopias" as that person wants it to be...
I want to add a little to my stance on utilitarianism. A utilitarian superintelligence would probably kill me and everyone I love, because we are made of atoms that could be used for minds that are more hedonic[1][2][3]. Given a choice between paperclips and utilitarianism, I would still choose utilitarianism. But, if there was a utilitarian TAI project along with a half-decent chance to do something better (by my lights), I would actively oppose the utilitarian project. From my perspective, such a project is essentially an enemy combatant.
One way to avoid it is by modifying utilitarianism to only place weight on currently existing people. But this is already not that far from my cooperative bargaining proposal (although still inferior to it, IMO). ↩︎
Another way to avoid it is by postulating some very strong penalty on death (i.e. discontinuity of personality). But this is not trivial to do, especially without creating other problems. Moreover, from my perspective this kind of thing is a hack trying to work around the core issue, namely that I am not a utilitarian (along with the vast majority of people). ↩︎
A possible counterargument is, maybe the superhedonic future minds wou
First, you wrote "a part of me is actually more scared of many futures in which alignment is solved, than a future where biological life is simply wiped out by a paperclip maximizer." So, I tried to assuage this fear for a particular class of alignment solutions.
Second... Yes, for a utilitarian this doesn't mean "much". But, tbh, who cares? I am not a utilitarian. The vast majority of people are not utilitarians. Maybe even literally no one is an (honest, not self-deceiving) utilitarian. From my perspective, disappointing the imaginary utilitarian is (in itself) about as upsetting as disappointing the imaginary paperclip maximizer.
Third, what I actually want from multi-user alignment is a solution that (i) is acceptable to me personally (ii) is acceptable to the vast majority of people (at least if they think through it rationally and are arguing honestly and in good faith) (iii) is acceptable to key stakeholders (iv) as much as possible, doesn't leave any Pareto improvements on the table and (v) is sufficiently Schelling-pointy to coordinate around. Here, "acceptable" means "a lot better than paperclips and not worth starting an AI race/war to get something better".
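As a toy illustration of desideratum (iv), here is a minimal sketch of checking whether a chosen outcome leaves a Pareto improvement on the table; the outcomes, the two-person population, and the utility numbers below are all invented for the example, not anyone's actual proposal:

```python
# Toy sketch of desideratum (iv) -- "doesn't leave any Pareto improvements on the
# table" -- for a finite set of candidate outcomes and people with known utilities.
# The outcomes and utility numbers are invented for the example.

def pareto_improves(u_new, u_old):
    """True if nobody is worse off and at least one person is strictly better off."""
    return (all(a >= b for a, b in zip(u_new, u_old))
            and any(a > b for a, b in zip(u_new, u_old)))

def leaves_improvement_on_table(chosen, utility):
    """utility[outcome] is a tuple giving each person's utility for that outcome."""
    return any(pareto_improves(utility[other], utility[chosen])
               for other in utility if other != chosen)

# Hypothetical example: three outcomes, two people.
utility = {
    "paperclips":      (0, 0),
    "private_utopias": (7, 6),
    "one_sided_deal":  (9, 1),
}
print(leaves_improvement_on_table("paperclips", utility))       # True: private_utopias dominates it
print(leaves_improvement_on_table("private_utopias", utility))  # False: any change makes someone worse off
```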
> Second… Yes, for a utilitarian this doesn’t mean “much”. But, tbh, who cares? I am not a utilitarian. The vast majority of people are not utilitarians. Maybe even literally no one is an (honest, not self-deceiving) utilitarian. From my perspective, disappointing the imaginary utilitarian is (in itself) about as upsetting as disappointing the imaginary paperclip maximizer.
I'm not a utilitarian either, because I don't know what my values are or should be. But I do assign significant credence to the possibility that something in the vicinity of utilitarianism is the right set of values (for me, or period). Given my uncertainties, I want to arrange the current state of the world so that (to the extent possible), whatever I end up deciding my values are, through things like reason, deliberation, doing philosophy, the world will ultimately not turn out to be a huge disappointment according to those values. Unfortunately, your proposed solution isn't very reassuring to this kind of view.
It's quite possible that I (and people like me) are simply out of luck, and there's just no feasible way to do what we want to do, but it sounds like you think I shouldn't even want what I want, or at least t...
> But the future we’re discussing here is one where humans retain autonomy (?), and in that case, they’re allowed to change their mind over time, especially if humanity has access to a superintelligent aligned AI.
What if the humans ask the aligned AI to help them be more moral, and part of what they mean by "more moral" is having fewer doubts about their current moral beliefs? This is what a "status game" view of morality seems to predict, for the humans whose status games aren't based on "doing philosophy", which seems to be most of them.
> In reality, everyone's morality is based on something like the status game (see also: 1 2 3)
... I really wanted to say [citation needed], but then you did provide citations, but then the citations were not compelling to me.
I'm pretty opposed to such universal claims being made about humans without pushback, because such claims always seem to me to wish-away the extremely wide variation in human psychology and the difficulty establishing anything like "all humans experience X."
There are people who have no visual imagery, people who do not think in words, people who have no sense of continuity of self, people who have no discernible emotional response to all sorts of "emotional" stimuli, and on and on and on.
So, I'll go with "it makes sense to model people as if every one of them is motivated by structures built atop the status game." And I'll go with "it seems like the status architecture is a physiological near-universal, so I have a hard time imagining what else people's morality might be made of." And I'll go with "everyone I've ever talked to had morality that seemed to me to cash out to being statusy, except the people whose self-reports I ignored because the...
Kind of frustrating that this high-karma reply to a high-karma comment on my post is based on a double misunderstanding/miscommunication:
It's not just that the self-reports didn't fit the story I was building; the self-reports didn't fit the revealed preferences. Whatever people say about their morality, I haven't seen anyone who behaves like a true utilitarian.
IMO, this is the source of all the gnashing of teeth about what percentage of your salary you need to donate: the fundamental contradiction between the demands of utilitarianism and how much people are actually willing to pay for the status gain. Ofc many excuses were developed ("sure I still need to buy that coffee or those movie tickets, otherwise I won't be productive") but they don't sound like the most parsimonious explanation.
This is also the source of paradoxes in population ethics and its vicinity: those abstractions are just very remote from actual human minds, so there's no reason they should produce anything sane in edge cases. Their only true utility is as an approximate guideline for making group decisions, for sufficiently mundane scenarios. Once you get to issues with infinities it becomes clear utilitarianism is not even mathematically coherent, in general.
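To make the point about infinities concrete, here is a standard textbook-style illustration (added for concreteness here, not an argument made in the thread) of how naive total utilitarianism stops being well-defined with infinitely many welfare subjects:

```latex
% A standard toy illustration of why naive total utilitarianism breaks down
% once there are infinitely many welfare subjects.

% Worlds A and B contain the same countably infinite population:
% in A every person has utility 2, in B every person has utility 1.
\[
  U(A) \;=\; \sum_{n=1}^{\infty} 2 \;=\; +\infty ,
  \qquad
  U(B) \;=\; \sum_{n=1}^{\infty} 1 \;=\; +\infty .
\]
% The totals are identical, so "maximize total utility" cannot even say that A
% is better than B, although A is better for every single person.

% Worse, if the individual utilities form a conditionally convergent series,
% the "total" depends on the order in which the people happen to be counted:
\[
  1 - \tfrac{1}{2} + \tfrac{1}{3} - \tfrac{1}{4} + \dots \;=\; \ln 2 ,
\]
% while by the Riemann rearrangement theorem the very same utilities, summed in
% a different order, can be made to converge to any real number (or to diverge).
```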
You're right that there is a lot of variation in human psychology. But it's also an accepted ...
Well, if the OP said something like "most people have 2 eyes but enlightened Buddhists have a third eye" and I responded with "in reality, everyone has 2 eyes" then I think my meaning would be clear even though it's true that some people have 1 or 0 eyes (afaik maybe there is even a rare mutation that creates a real third eye). Not adding all possible qualifiers is not the same as "not even pretending that it's interested in making itself falsifiable".
Natural language is not math, it's inherently ambiguous and it's not realistically possible to always be precise without implicitly assuming something about the reader's understanding of the context. That said, it seems like I wasn't sufficiently precise in this case, so I edited my comment. Thank you for the correction.
What does it have to do with "No True Scotsman"? NTS is when you redefine your categories to justify your claim. I don't think I did that anywhere.
> Just because the extreme doesn't exist doesn't mean that all of the scale can be explained by status games.
First, I didn't say all the scale is explained by status games, I did mention empathy as well.
Second, that by itself sure doesn't mean much. Explaining all the evidence would require an article, or maybe a book (although I hoped the posts I linked explain some of it). My point here is that there is an enormous discrepancy between the reported morality and the revealed preferences, so believing self-reports is clearly a non-starter. How to build an explanation without relying on self-reports is a different (long) story.
> Feminist hero and birth control campaigner Marie Stopes, who was voted Woman of the Millennium by the readers of The Guardian and honoured on special Royal Mail stamps in 2008, was an anti-Semite and eugenicist.
My conclusion from this is more like "successful politicians are not moral paragons". More generally, trying to find morally virtuous people by a popular vote is not going to produce great results, because popularity plays a much greater role than morality.
I googled for "woman of the year" to get more data points; found this list, containing: 2019 Greta Thunberg, 2016 Hillary Clinton, 2015 Angela Merkel, 2010 Nancy Pelosi, 2008 Michelle Obama, 1999 Madeleine Albright, 1990 Aung San Suu Kyi... clearly, being a politician dramatically increases your chances of winning. Looking at their behavior, Aung San Suu Kyi later organized a genocide.
The list also includes 2009 Malala Yousafzai, who as far as I know is an actual hero with no dark side. But that's kinda my point, that putting Malala Yousafzai on the same list as Greta Thunberg and Hillary Clinton just makes the list confusing. And if you had to choose one of them as the "woman of the millennium", I would expect most reader...
> And this sounds silly to us, because we know that “kicking the sunrise” is impossible, because the Sun is a star, it is far away, and your kicking has no impact on it.
I think a lot of contemporary cultures back then would have found "kicking the sunrise" to be silly, because it was obviously impossible even given what they knew at the time, i.e., you can only kick something if you physically touch it with your foot, and nobody has ever even gotten close to touching the sun, and it's even more impossible while you're asleep.
So, we should distinguish between people having different moral feelings, and having different models of the world. If you actually believed that kicking the Sun is possible and can have astronomical consequences, you would probably also perceive people sleeping westwards as criminally negligent, possibly psychopathic.
Why did the Malagasy people have such a silly belief? Why do many people have very silly beliefs today? (Among the least politically risky ones to cite, someone I've known for years who otherwise is intelligent and successful, currently believes, or at least believed in the recent past, that 2/3 of everyone will die as a result of taking the CO...
Even if moralities vary from culture to culture based on the local status games, I would suggest that there is still some amount of consequentialist bedrock to why certain types of norms develop. In other words, cultural relativism is not unbounded.
Generally speaking, norms evolve over time, where any given norm at one point didn't yet exist if you go back far enough. What caused these norms to develop? I would say the selective pressures for norm development come from some combination of existing culturally-specific norms and narratives (such as the sunrise being an agent that could get hurt when kicked) along with more human-universal motivations (such as empathy + {wellbeing = good, suffering = bad} -> you are bad for kicking the sunrise -> don't sleep facing west) or other instrumentally-convergent goals (such as {power = good} + "semen grants power" -> institutionalized sodomy). At every step along the evolution of a moral norm, every change needs to be justifiable (in a consequentialist sense) to the members of the community who would adopt it. Moral progress is when the norms of society come to better resonate with both the accepted narratives of society (which may ...
I am also scared of futures where "alignment is solved" under the current prevailing usage of "human values."
Humans want things that we won't end up liking, and prefer things that we will regret getting relative to other options that we previously dispreferred. We are remarkably ignorant of what we will, in retrospect, end up having liked, even over short timescales. Over longer timescales, we learn to like new things that we couldn't have predicted a priori, meaning that even our earnest and thoughtfully-considered best guess of our preferences in advance will predictably be a mismatch for what we would have preferred in retrospect.
And this is not some kind of bug, this is centrally important to what it is to be a person; "growing up" requires a constant process of learning that you don't actually like certain things you used to like and now suddenly like new things. This truth ranges over all arenas of existence, from learning to like black coffee to realizing you want to have children.
I am personally partial to the idea of something like Coherent Extrapolated Volition. But it seems suspicious that I've never seen anybody on LW sketch out how a decision theory ought to beha...
I honestly have a difficult time understanding the people (such as your "AI alignment researchers and other LWers, Moral philosophers") who actually believe in Morality with a capital M. I believe they are misguided at best, potentially dangerous at worst.
I hadn't heard of the Status Game book you quote, but for a long time now it's seemed obvious to me that there is no objective true Morality; it's purely a cultural construct, and mostly a status game. Any deep reading of history, cultures, and religions leads one to this conclusion.
Humans have complex values, and that is all.
We humans cooperate and compete to optimize the universe according to those values, as we always have, as our posthuman descendants will, even without fully understanding them.
To repost my comment from a couple of weeks back, which seems to say roughly the same thing, not as well:
> ...I don't believe alignment is possible. Humans are not aligned with other humans, and the only thing that prevents an immediate apocalypse is the lack of recursive self-improvement on short timescales. Certainly groups of humans happily destroy other groups of humans, and often destroy themselves in the process of maximizing something like the number of statues. The best we can hope for is that whatever takes over the planet after meatbags are gone has some of
I'm not sure what you mean by 'astronomical waste or astronomical suffering'. Like, you are writing that everything forever is status games, ok, sure, but then you can't turn around and appeal to a universal concept of suffering/waste, right?
Whatever you are worried about is just like Gandhi worrying about being too concerned with cattle, plus x years, yeah? And even if you've lucked into a non status games morality such that you can perceive 'Genuine Waste' or what have you...surely by your own logic, we who are reading this are incapable of understanding, aside from in terms of status games.
So on the one hand you have values that are easily, trivially compatible, such as "I want to spend 1000 years climbing the mountains of Mars" or "I want to host blood-sports with my uncoerced friends with the holodeck safety on".
On the other hand you have insoluble, or at least apparently insoluble, conflicts: B wants to torture people, C wants there to be no torture anywhere at all. C wants to monitor everyone everywhere forever to check that they aren't torturing anyone or plotting to torture anyone, D wants privacy. E and F both want to be the best in ...
I'm leaning towards the more ambitious version of the project of AI alignment being about corrigible anti-goodharting, with the AI optimizing towards good trajectories within scope of relatively well-understood values, preventing overoptimized weird/controversial situations, even at the cost of astronomical waste. Absence of x-risks, including AI risks, is generally good. Within this environment, the civilization might be able to eventually work out more about values, expanding the scope of their definition and thus allowing stronger optimization. Here corrigibility is in part about continually picking up the values and their implied scope from the predictions of how they would've been worked out some time in the future.
I think this post makes an important point -- or rather, raises a very important question, with some vivid examples to get you started. On the other hand, I feel like it doesn't go further, and probably should have -- I wish it e.g. sketched a concrete scenario in which the future is dystopian not because we failed to make our AGIs "moral" but because we succeeded, or e.g. got a bit more formal and complemented the quotes with a toy model (inspired by the quotes) of how moral deliberation in a society might work, under post-AGI-alignment conditions, and ho...
If with "morality" you mean moral realism, then yes, I agree that it is scary.
I'm most scared by the apparent assumption that we have solved the human alignment problem.
Looking at history, I don't feel like our current situation of relative peace is very stable.
My impression is that "good" behavior is largely dependent on incentives, and so is the very definition of "good".
Perhaps markets are one of the more successful tools of creating aligned behaviour in humans, but even in that case it only seems to work if the powers of the market participants are balanced, which is not a luxury we have in alignment work.
You could read the status game argument the opposite way: Maybe status seeking causes moral beliefs without justifying them, in the same way that it can distort our factual beliefs about the world. If we can debunk moral beliefs by finding them to be only status-motivated, the status explanation can actually assist rational reflection on morality.
Also the quote from The Status Game conflates purely moral beliefs and factual beliefs in a way that IMO weakens its argument. It's not clear that many of the examples of crazy value systems would survive full logical and empirical information.
There is no unique eutopia.
Sentient beings that collaborate outcompete ones that don't (not considering inner competition within a singleton here). Collaboration means that interests between beings are traded/compromised. Better collaboration methods have a higher chance to win. We see this over the course of history. This is a messy evolutionary process. But I think there is a chance that this process itself can be improved e.g. with FAI. Think of an interactive "AlphaValue" that does Monte-Carlo Tree Search over collaboration opportunities. It will not converge on a unique best CEV but result in one of many possible eutopias.
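For concreteness, here is a minimal sketch of the "AlphaValue" idea: UCT-style Monte-Carlo Tree Search over which collaboration opportunities to adopt. The opportunities, payoff vectors, and welfare function are all invented for illustration; the point is only that the search settles on one of several comparably good packages rather than a unique optimum:

```python
# Toy sketch (invented example, not a real proposal): Monte-Carlo Tree Search
# over which "collaboration opportunities" to adopt, scoring a finished package
# by the worst-off agent's total payoff. Several packages tie for best, so
# different runs can settle on different eutopia-like outcomes.
import math
import random

# Each opportunity pays off differently to each of three agents (made-up numbers).
OPPORTUNITIES = [(3, -1, 2), (-1, 3, 1), (2, 2, -1), (1, 1, 1), (-2, 4, 0), (4, 0, -2)]

def welfare(chosen):
    """Egalitarian welfare of a package: the worst-off agent's total payoff."""
    totals = [sum(OPPORTUNITIES[i][a] for i in chosen) for a in range(3)]
    return min(totals)

class Node:
    def __init__(self, index=0, chosen=frozenset()):
        self.index = index      # which opportunity is decided next
        self.chosen = chosen    # opportunities adopted so far
        self.children = {}      # action (True = adopt, False = skip) -> Node
        self.visits = 0
        self.value = 0.0
    def terminal(self):
        return self.index == len(OPPORTUNITIES)

def rollout(index, chosen):
    """Finish the package with random adopt/skip decisions and score it."""
    for i in range(index, len(OPPORTUNITIES)):
        if random.random() < 0.5:
            chosen = chosen | {i}
    return welfare(chosen)

def mcts(root, iterations=2000, c=1.4):
    for _ in range(iterations):
        node, path = root, [root]
        # Selection: descend through fully expanded nodes by UCB1.
        while not node.terminal() and len(node.children) == 2:
            node = max(node.children.values(),
                       key=lambda ch: ch.value / ch.visits
                       + c * math.sqrt(math.log(node.visits) / ch.visits))
            path.append(node)
        # Expansion: add one untried child if the node isn't terminal.
        if not node.terminal():
            adopt = True not in node.children
            nxt = node.chosen | {node.index} if adopt else node.chosen
            child = Node(node.index + 1, frozenset(nxt))
            node.children[adopt] = child
            node = child
            path.append(node)
        # Simulation + backpropagation.
        reward = rollout(node.index, set(node.chosen))
        for n in path:
            n.visits += 1
            n.value += reward
    # Read off the most-visited package.
    node = root
    while not node.terminal() and node.children:
        node = max(node.children.values(), key=lambda ch: ch.visits)
    return sorted(node.chosen), welfare(node.chosen)

for seed in (0, 1, 2):
    random.seed(seed)
    print(mcts(Node()))  # different seeds may pick different, similarly good packages
```

Swapping the egalitarian welfare function for something else (say, a Nash product of gains) would change which packages look best, which is itself an instance of the aggregation question discussed above.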
I don't follow the reasoning. How do you get from "most people's moral behaviour is explainable in terms of them 'playing' a status game" to "solving (some versions of) the alignment problem probably won't be enough to ensure a future that's free from astronomical waste or astronomical suffering"?
More details:
Regarding the quote from The Status Game: I have not read the book, so I'm not sure what the intended message is but this sounds like some sort of unwarranted pessimism about ppl's moral standing (something like a claim like "the vast majority of ppl ...
Great post, thanks! Widespread value pluralism a la 'well that's just, like, your opinion man' is now a feature of modern life. Here are a pair of responses from political philosophy which may be of some interest:
(1) Rawls/Thin Liberal Approach. Whilst we may not be able to agree on what 'the good life' is, we can at least agree on a basic system which ensures all participants can pursue their own idea of the good life. So: (1) protect a list of political liberties and freedoms, and (2) a degree of economic levelling. Beyond that, it is up to ...
Don't you need AI to go through the many millions of experiences that it might take to develop a good morality strategy?
I'm entranced by Jordan Peterson's descriptions, which seem to light up the evolutionary path of morality for humans. Shouldn't AI be set up to try to grind through the same progress?
What's truly scary is how much the beliefs and opinions of normal people make them seem like aliens to me.
I find the paragraph beginning with these two sentences misleading, and its examples unconvincing in the point about moral disagreement across time that it tries to make:
> Such ‘facts’ also change across time. We don’t have to travel back far to discover moral superstars holding moral views that would destroy them today.
I shall try to explain why, because such evidence seemed persuasive to me before I thought about it more; I made this account just for this comment after being a lurker for a while -- I have found your previous posts about moral uncertainty ...
It seems that our morality consists of two elements. The first is bias, based on the game-theoretic environment of our ancestors. Humans developed complex feelings around activities that promoted inclusive genetic fitness and now we are intrinsically and authentically motivated to do them for their own sake.
There is also a limited capability for moral updates. That's what we use to resolve contradictions in our moral intuitions. And that's also what allows us to persuade ourselves that doing some status-promoting thing is actually moral. On the one hand, t...
You may not be interested in mutually exclusive compression schemas, but mutually exclusive compression schemas are interested in you. One nice thing is that, given that the schemas use an arbitrary key to handshake with, there is hope that they can be convinced to all get on the same arbitrary key without loss of useful structure.
I'm worried that many AI alignment researchers and other LWers have a view of how human morality works, that really only applies to a small fraction of all humans (notably moral philosophers and themselves). In this view, people know or at least suspect that they are confused about morality, and are eager or willing to apply reason and deliberation to find out what their real values are, or to correct their moral beliefs. Here's an example of someone who fits this view:
I'm currently reading The Status Game by Will Storr (highly recommended BTW), and found in it the following description of how morality works in most people, which matches my own understanding of history and my observations of humans around me:
I hope this helps further explain why I think even solving (some versions of) the alignment problem probably won't be enough to ensure a future that's free from astronomical waste or astronomical suffering. A part of me is actually more scared of many futures in which "alignment is solved", than a future where biological life is simply wiped out by a paperclip maximizer.