I wasn't convinced of this ten years ago and I'm still not convinced.
When I look at people who have contributed most to alignment-related issues - whether directly, like Eliezer Yudkowsky and Paul Christiano - or theoretically, like Toby Ord and Katja Grace - or indirectly, like Sam Bankman-Fried and Holden Karnofsky - what all of these people have in common is focusing mostly on object-level questions. They all seem to me to have a strong understanding of their own biases, in the sense that gets trained by natural intelligence, really good scientific work, and talking to other smart and curious people like themselves. But as far as I know, none of them have made it a focus of theirs to fight egregores, defeat hypercreatures, awaken to their own mortality, refactor their identity, or cultivate their will. In fact, all of them (except maybe Eliezer) seem like the kind of people who would be unusually averse to thinking in those terms. And if we pit their plumbing or truck-maneuvering skills against those of an average person, I see no reason to think they would do better (besides maybe high IQ and general ability).
It's seemed to me that the more that people talk about "rationality trai...
I think your pushback is ignoring an important point. One major thing the big contributors have in common is that they tend to be unplugged from the stuff Valentine is naming!
So even if folks mostly don't become contributors by asking "how can I come more truthfully from myself and not what I'm plugged into", I think there is an important cluster of mysteries here. Examples of related phenomena:
I think Val's correct on the point that our people and organizations are plugged into some bad stuff, and that it's worth examining that.
But as far as I know, none of them have made it a focus of theirs to fight egregores, defeat hypercreatures
Egregore is an occult concept representing a distinct non-physical entity that arises from a collective group of people.
I do know one writer who talks a lot about demons and entities from beyond the void. It's you, and it happens in some of, IMHO, the most valuable pieces you've written.
...I worry that Caplan is eliding the important summoner/demon distinction. This is an easy distinction to miss, since demons often kill their summoners and wear their skin.
That civilization is dead. It summoned an alien entity from beyond the void which devoured its summoner and is proceeding to eat the rest of the world.
https://slatestarcodex.com/2016/07/25/how-the-west-was-won/

And Ginsberg answers: “Moloch”. It’s powerful not because it’s correct – nobody literally thinks an ancient Carthaginian demon causes everything – but because thinking of the system as an agent throws into relief the degree to which the system isn’t an agent.
But the current rulers of the universe – call them what you want, Moloch, Gnon, whatever – want us dead, and with us everything we value. Art, science, love, ph
I sadly don't have time to really introspect on what is going on in me here, but something about this comment feels pretty off to me. I think in some sense it provides an important counterpoint to the OP, but I also feel like it stretches the truth quite a bit:
I wasn't convinced of this ten years ago and I'm still not convinced.
Given the link, I think you're objecting to something I don't care about. I don't mean to claim that x-rationality is great and has promise to Save the World. Maybe if more really is possible and we do something pretty different to seriously develop it. Maybe. But frankly I recognize stupefying egregores here too and I don't expect "more and better x-rationality" to do a damn thing to counter those for the foreseeable future.
So on this point I think I agree with you… and I don't feel whatsoever dissuaded from what I'm saying.
The rest of what you're saying feels like it's more targeting what I care about though:
When I look at people who have contributed most to alignment-related issues […] what all of these people have in common is focusing mostly on object-level questions.
Right. And as I said in the OP, stupefaction often entails alienation from object-level reality.
It's also worth noting that LW exists mostly because Eliezer did in fact notice his own stupidity and freaked the fuck out. He poured a huge amount of energy into taking his internal mental weeding seriously in order to never ever ever be that st...
Maybe. It might be that if you described what you wanted more clearly, it would be the same thing that I want, and possibly I was incorrectly associating this with the things at CFAR you say you're against, in which case sorry.
But I still don't feel like I quite understand your suggestion. You talk of "stupefying egregores" as problematic insofar as they distract from the object-level problem. But I don't understand how pivoting to egregore-fighting isn't also a distraction from the object-level problem. Maybe this is because I don't understand what fighting egregores consists of, and if I knew, then I would agree it was some sort of reasonable problem-solving step.
I agree that the Sequences contain a lot of useful deconfusion, but I interpret them as useful primarily because they provide a template for good thinking, and not because clearing up your thinking about those things is itself necessary for doing good work. I think of the cryonics discussion the same way I think of the Many Worlds discussion - following the motions of someone as they get the right answer to a hard question trains you to do this thing yourself.
I'm sorry if "cultivate your will" has the wrong connotations,...
There's also the skulls to consider. As far as I can tell, this post's recommendations are that we, who are already in a valley littered with a suspicious number of skulls, turn right towards a dark cave marked 'skull avenue' whose mouth is a giant skull, and whose walls are made entirely of skulls that turn to face you as you walk past them deeper into the cave.

https://forum.effectivealtruism.org/posts/ZcpZEXEFZ5oLHTnr9/noticing-the-skulls-longtermism-edition
https://slatestarcodex.com/2017/04/07/yes-we-have-noticed-the-skulls/
The success rate of movements aimed at improving the long-term future or improving rationality has historically been... not great, but there are at least solid, concrete empirical reasons to think specific actions will help, and we can pin our hopes on that.
The success rate of "let's build a movement to successfully uncouple ourselves from society's bad memes and become capable of real action, and then our problems will be solvable" is 0. Not just in that thinking that way didn't help, but in that with near 100% success you just end up possessed by worse memes if you make that your explicit final goal (rather than ending up doing it as a side effect of trying to get good at something). And there's also no concrete path to action to pin our hopes on.
The success rate of developing and introducing better memes into society is indeed not 0. The key thing there is that the scientific revolutionaries weren't just thinking in the abstract "we must uncouple from society first, and then we'll know what to do". Rather, they wanted to understand how objects fell, how animals evolved, and lots of other specific problems, and they developed good memes to achieve those ends.
Now that I've had a few days to let the ideas roll around in the back of my head, I'm gonna take a stab at answering this.
I think there are a few different things going on here which are getting confused.
1) What does "memetic forces precede AGI" even mean?
"Individuals", "memetic forces", and "that which is upstream of memetics" all act on different scales. As an example of each, I suggest "What will I eat for lunch?", "Who gets elected POTUS?", and "Will people eat food?", respectively.
"What will I eat for lunch?" is an example of an individual decision because I can actually choose the outcome there. While sometimes things like "veganism" will tell me what I should eat, and while I might let that have influence me, I don't actually have to. If I realize that my life depends on eating steak, I will actually end up eating steak.
"Who gets elected POTUS" is a much tougher problem. I can vote. I can probably persuade friends to vote. If I really dedicate myself to the cause, and I do an exceptionally good job, and I get lucky, I might be able to get my ideas into the minds of enough people that my impact is noticeable. Even then though, it's a drop in the bucket and pretty far outside ...
Don't have the time to write a long comment just now, but I still wanted to point out that describing either Yudkowsky or Christiano as doing mostly object-level research seems incredibly wrong. So much of what they're doing and have done has focused explicitly on which questions to ask, which questions not to ask, which paradigm to work in, how to criticize that kind of work... They rarely published posts that are only about the meta-level (although Arbital does contain a bunch of pages along those lines, and Prosaic AI Alignment is also meta), but it pervades their writing and thinking.
More generally, when you're creating a new field of science or research, you tend to do a lot of philosophy-of-science-type stuff, even if you don't label it explicitly that way. Galileo, Carnot, Darwin, Boltzmann, Einstein, and Turing all did it.
(To be clear, I'm pointing at meta-stuff in the sense of "philosophy of science for alignment" type things, not necessarily the more hardcore stuff discussed in the original post)
When I look at people who have contributed most to alignment-related issues - whether directly... or indirectly, like Sam Bankman-Fried
Perhaps I have missed it, but I’m not aware that Sam has funded any AI alignment work thus far.
If so, this sounds like giving him a large amount of credit in advance of the work being done, which is generous, but not the order in which credit allocation should go.
My attempt to break down the key claims here:
Putting this in a separate comment, because Reign of Terror moderation scares me and I want to compartmentalize. I am still unclear about the following things:
I ~entirely agree with you.
At some point (maybe from the beginning?), humans forgot the raison d'être of capitalism — encourage people to work towards the greater good in a scalable way. It’s a huge system that has fallen prey to Goodhart’s Law, where a bunch of Powergamers have switched from “I should produce the best product in order to sell the most” to “I should alter the customer’s mindset so that they want my (maybe inferior) product”. And the tragedy of the commons has forced everyone to follow suit.
Not only that, the system that could stand in the way — the government — has been captured by the same forces. A picture of an old man wearing mittens that was shared millions of times likely had a larger impact on how people vote than actual action or policy.
I don’t know what to do about these things. I’ve tried hard to escape the forces myself, but it’s a constant battle to not be drawn back in. The thing I’d recommend to anyone else willing to try is to think of who your enemy is, and work hard to understand their viewpoint and how they came to it. For most people in the US, I imagine it’s the opposite political party. You’ll pr...
I really liked this post, though I somewhat disagree with some of the conclusions. I think that in fact aligning an artificial digital intelligence will be much, much easier than working on aligning humans. To point towards why I believe this, think about how many "tech" companies (Uber, crypto, etc) derive their value, primarily, from circumventing regulation (read: unfriendly egregore rent seeking). By "wiping the slate clean" you can suddenly accomplish much more than working in a field where the enemy already controls the terrain.
If you try to tackle "human alignment", you will be faced with the coordinated resistance of all the unfriendly demons that human memetic evolution has to offer. If you start from scratch with a new kind of intelligence, a system that doesn't have to adhere to the existing hostile terrain (doesn't have to have the same memetic weaknesses as humans that are so optimized against, doesn't have to go to school, grow up in a toxic media environment etc etc), you can, maybe, just maybe, build something that circumvents this problem entirely.
That's my biggest hope with alignment (which I am, unfortunately, not very optimistic about, but I am even ...
Keeping your identity small posits that most of your attack surface is in something you maintain yourself. It would make sense, then, that as the sophistication of these entities increase, they would eventually start selecting for causing you to voluntarily increase your attack surface.
Tim Ferriss' biggest surprise while doing interviews for his Tools of Titans book was that 90% of the people he interviewed had some sort of meditation practice. I think that contemplative tech is already mostly a requirement for high performance in an adversarially optimizing environment.
I think "Statistical physics of human cooperation" is the best overview of one method of studying the emergence of such hyperobjects; it's basically a nascent field right now.
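A minimal sketch of the kind of model that line of work studies (my own toy illustration under assumed parameters, not something taken from that review): agents on a small lattice play a public goods game with their neighbors and copy better-performing neighbors via a noisy imitation ("Fermi") rule. The lattice size, synergy factor r, noise K, and step count here are assumptions picked for illustration.

```python
# Toy spatial public goods game with Fermi-rule imitation -- a minimal sketch of
# the kind of model studied in the "statistical physics of human cooperation"
# literature. Parameter values are illustrative assumptions, not from any paper.
import numpy as np

rng = np.random.default_rng(0)
L = 30          # lattice side length
r = 3.8         # synergy factor: pooled contributions are multiplied by r
K = 0.5         # noise in the imitation (Fermi) rule
strategy = rng.integers(0, 2, size=(L, L))   # 1 = cooperator, 0 = defector

def neighbors(i, j):
    # von Neumann neighborhood with periodic boundaries
    return [((i + 1) % L, j), ((i - 1) % L, j), (i, (j + 1) % L), (i, (j - 1) % L)]

def payoff(i, j):
    # agent (i, j) takes part in the game centered on itself and on each neighbor
    total = 0.0
    for ci, cj in [(i, j)] + neighbors(i, j):
        group = [(ci, cj)] + neighbors(ci, cj)
        pool = sum(strategy[x] for x in group)      # cooperators contribute 1 each
        total += r * pool / len(group) - strategy[i, j]
    return total

for _ in range(20 * L * L):
    i, j = rng.integers(0, L, size=2)
    ni, nj = neighbors(i, j)[rng.integers(0, 4)]
    # adopt the neighbor's strategy with probability given by the Fermi rule
    p = 1.0 / (1.0 + np.exp((payoff(i, j) - payoff(ni, nj)) / K))
    if rng.random() < p:
        strategy[i, j] = strategy[ni, nj]

print("fraction of cooperators:", strategy.mean())
```

Roughly, that literature sweeps a parameter like r and watches cooperation collapse or survive in phase-transition-like fashion; that's the sense in which "statistical physics" applies to collective behavior.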
In a Facebook post I argued that it’s fair to view these things as alive.
Just a note, unlike in the recent past, Facebook post links seem to now be completely hidden unless you are logged into Facebook when opening them, so they are basically broken as any sort of publicly viewable resource.
Well, that's just terrible.
Here's the post:
...I think the world makes more sense if you recognize humans aren't on the top of the food chain.
We don't see this clearly, kind of like ants don't clearly see anteaters. They know something is wrong, and they rush around trying to deal with it, but it's not like any ant recognizes the predator in much more detail than "threat".
There's a whole type of living being "above" us the way animals are "above" ants.
Esoteric traditions sometimes call these creatures "egregores".
Carl Jung called a special subset of them "archetypes".
I often refer to them as "memes" — although "memeplex" might be more accurate. Self-preserving clusters of memes.
We have a hard time orienting to them because they're not made of stuff we're used to thinking of as living — in basically the same way that anteaters are tricky for ants to orient to as ant-like. Wrong pheromones, wrong size, more like reality than like members of this or another colony, etc.
We don't see a fleshy body, or cells, or a molecular mechanism. So there's no organism, right?
But we have a clear intuition for life without molecular mechanisms. That's why we refer to "computer viruses" as such: the analo
“Sure, cried the tenant men, but it’s our land…We were born on it, and we got killed on it, died on it. Even if it’s no good, it’s still ours….That’s what makes ownership, not a paper with numbers on it."
"We’re sorry. It’s not us. It’s the monster. The bank isn’t like a man."
"Yes, but the bank is only made of men."
"No, you’re wrong there—quite wrong there. The bank is something else than men. It happens that every man in a bank hates what the bank does, and yet the bank does it. The bank is something more than men, I tell you. It’s the monster. Men made it, but they can’t control it.”
― John Steinbeck, The Grapes of Wrath
The part about hypercreatures preventing coordination sounds very true to me, but I'm much less certain about this part:
Who is aligning the AGI? And to what is it aligning?
This isn't just a cute philosophy problem.
A common result of egregoric stupefaction is identity fuckery. We get this image of ourselves in our minds, and then we look at that image and agree "Yep, that's me." Then we rearrange our minds so that all those survival instincts of the body get aimed at protecting the image in our minds.
How did you decide which bits are "you"? Or what can threaten "you"?
I'll hop past the deluge of opinions and just tell you: It's these superintelligences. They shaped your culture's messages, probably shoved you through public school, gripped your parents to scar you in predictable ways, etc.
It's like installing a memetic operating system.
If you don't sort that out, then that OS will drive how you orient to AI alignment.
It seems to me that you can think about questions of alignment from a purely technical mindset, e.g. "what kind of a value system does the brain have, and what would the AI need to be like in order to understand that", and that this kind of technical thinking is much less affe...
I agree with most of what I think you're saying, for example that the social preconditions for unfriendly non-human AGI are at least on the same scale of importance and attention-worthiness as technical problems for friendly non-human AGI, and that alignment problems extend throughout ourselves and humanity. But also, part of the core message seems to be pretty incorrect. Namely:
Anything else is playing at the wrong level. Not our job. Can't be our job. Not as individuals, and it's individuals who seem to have something mimicking free will.
This sounds like you're saying, it's not "our" (any of our?) job to solve technical problems in (third person non-human-AGI) alignment. But that seems pretty incorrect because it seems like there are difficult technical obstacles to making friendly AGI, which take an unknown possibly large amount of time. We can see that unfriendly non-human very-superhuman AGI is fairly likely by default given economic incentives, which makes it hard for social conditions to be so good that there isn't a ticking clock. Solving technical problems is very prone to be done in service of hostile / external entities; but that doesn't mean you can get good outcomes without solving technical problems.
You are 200% right. This is the problem we have to solve, not making sure a superintelligent AI can be technologically instructed to serve the whims of its creators.
Have you read Scott Alexander's Meditations on Moloch? It's brilliant, and is quite adjacent to the claims you are making. It has received too little follow-up in this community.
https://www.lesswrong.com/posts/TxcRbCYHaeL59aY7E/meditations-on-moloch
...The implicit question is – if everyone hates the current system, who perpetuates it? And Ginsberg answers: “Moloch”. It’s powerful not b
This is an interesting idea. Note that superforecasters read more news than the average person, and so are online a significant amount of time, yet they seem unaffected (this could be for many reasons, but is weak evidence against your theory). I’d like to know whether highly or moderately successful people, especially in the EA-sphere, avoid advertising and other info characterized as malicious by your theory. Elon Musk stands out as very online, yet very successful, but the way he is spending his money certainly is not optimized to prevent his fears of e...
Note that superforecasters read more news than the average person, and so are online a significant amount of time, yet they seem unaffected (this could be for many reasons, but is weak evidence against your theory).
I like this example.
Superforecasters are doing something real. If you make a prediction and you can clearly tell whether it comes about or not, this makes the process of evaluating the prediction mostly immune to stupefaction.
Much like being online a lot doesn't screw with your ability to shoot hoops, other than maybe taking time away from practice. You can still tell whether the ball goes in the basket.
This is why focusing on real things is clarifying. Reality reflects truth. Is truth, really, although I imagine that use of the word "truth" borders on heresy here.
Contrast superforecasters with astrologers. They're both mastering a skillset, but astrologers are mastering one that has no obvious grounding. Their "predictions" slide all over the place. Absolutely subject to stupefaction. They're optimizing for something more like buy-in. Actually testing their art against reality would threaten what they're doing, so they throw up mental fog and invite you to do the same.
W...
Who is aligning the AGI? And to what is it aligning?
Generally, I tend to think of "how do we align an AGI to literally anyone at all whatsoever instead of producing absolutely nothing of value to any human ever" as being a strict prerequisite to "who to align to"; the former without the latter may be suboptimal, but the latter without the former is useless.
My guess is, it's a fuckton easier to sort out Friendliness/alignment within a human being than it is on a computer. Because the stuff making up Friendliness is right there.
I don't think this is a given....
Ok. So suppose we build a memetic bunker. We protect ourselves from the viral memes. A handful of programmers, aligned within themselves, working on AI. Then they solve alignment. The AI is very powerful and fixes everything else.
My conclusion: Let's start the meme that Alignment (the technical problem) is fundamentally impossible (maybe it is? why think you can control something supposedly smarter than you?) and that you will definitely kill yourself if you get to the point where finding a solution to Alignment is what could keep you alive. Pull a Warhammer 40k, start banning machine learning, and for that matter, maybe computers (above some level of performance) and software. This would put more humans in the loop for the same tasks we have now, which offers more opportunities to...
I can't upvote this enough. This is exactly how I think about it, and why I have always called myself a mystic. I have an unusual brain and I am prone to ecstatic possession experiences, particularly while listening to certain types of music. The worst thing is, people like me used to become shamans and it used to be obvious to everybody that egregores - spirits - are the most powerful force in the world - but Western culture swept that under the rug and now they are able to run amok with very few people able to perceive them. I bet if you showed a tribal ...
Gosh, um…
I think I see where you are, and by my judgment you're more right than wrong, but from where I stand it sure looks like pain is still steering the ship. That runs the risk of breaking your interface to places like this.
(I think you're intuiting that. Hence the "crazy alert".)
I mean, vividly apropos of what you're saying, it looks to me like you've rederived a lot of the essentials of how symbiotic egregores work, what it's like to ally with them, and why we have to do so in order to orient to the parasitic egregores.
But the details of what you mean by "religion" and "cult" matter a lot, and in most interpretations of "extremely missionary" I just flat-out disagree with you on that point.
…the core issue being that symbiotic memes basically never push themselves onto potential hosts.
You actually hint at this:
And it must be able to do this while they know it perfectly well, and consent before joining to begin with to its doing so - as obviously one which does so without consent is not aligned to true human values, even though, ironically, it has to be so good at rhetoric that consent is almost guaranteed to be given.
But I claim the core strategy cannot be rhetoric. The ...
I think I follow what you're saying, and I think it's consistent with my own observations of the world.
I suspect that there's a particular sort of spirituality-adjacent recreational philosophy whose practice may make it easier to examine the meta-organisms you're describing. Even with it, they seem to often resist being named in a way that's useful when speaking to mixed company.
Can you point out some of the existing ones that meet your definition of Friendly?
I didn't get the 'first person' thing at first (and the terminal diagnosis metaphor wasn't helpful to me). I think I do now.
I'd rephrase it as "In your story about how the Friendly hypercreature you create gains power, make sure the characters are level one intelligent". That means creating a hypercreature you'd want to host. Which means you will be its host.
To ensure it's a good hypercreature, you need to have good taste in hypercreatures. Rejecting all hypercreatures doesn't work—you need to selectively reject bad hypercreatures.
This packs real emotional punch! Well done!
A confusion: in what way is our little project here not another egregore, or at least a meta-egregore?
What kind of thing is wokism? Or Communism? What kind of thing was Naziism in WWII? Or the flat Earth conspiracy movement? Q Anon?
I'd say they are alliances, or something like it.
You can't achieve much on your own; you need other people to specialize in getting information about topics you haven't specialized in, to handle numerous object-level jobs that you haven't specialized in, and to lead/organize all of the people you are dependent on.
But this dependency on other people requires trust, particularly in the leaders. So first of all, in order for you to...
Note A- I assert that what the original author is getting at is extremely important. A lot of what's said here is something I would have liked to say but couldn't find a good way to explain, and I want to emphasize how important this is.
Note B- I assert that a lot of politics is the question of how to be a good person. Which is also adjacent to religion and more importantly, something similar to religion but not religion, which is basically, which egregore should you worship/host. I think that the vast majority of a person's impact in this world is what hy...
3 points:
I don't know if there are superior entities to us playing these games, or if such memes are just natural collective tendencies. I don't think any of us know or can know, at least with current knowledge.
I agree that aligning humanity is our only chance. Aligning AGI takes, in fact, superhuman technical ability, so that, considering current AGI timelines vs current technical alignment progress, I'd give a less than 1% probability that we make it on time. In fact some even say that technical alignment is impossible, just look at anything Yalmpo
I think this is a useful abstraction.
But I think the word you're looking for is "god". In the "Bicameral Consciousness" sense - these egregores you refer to are gods that speak to us, whose words we know. There's another word, zeitgeist, that refers to something like the same thing.
If you look in your mind, you can find them; just look for what you think the gods would say, and they will say it. Pick a topic you care about. What would your enemy say about that topic? There's a god, right there, speaking to you.
Mind, in a sense...
Back in 2016, CFAR pivoted to focusing on xrisk. I think the magic phrase at the time was:
I was against this move. I also had no idea how power works. I don't know how to translate this into LW language, so I'll just use mine: I was secret-to-me vastly more interested in being victimized at people/institutions/the world than I was in doing real things.
But the reason I was against the move is solid. I still believe in it.
I want to spell that part out a bit. Not to gripe about the past. The past makes sense to me. But because the idea still applies.
I think it's a simple idea once it's not cloaked in bullshit. Maybe that's an illusion of transparency. But I'll try to keep this simple-to-me and correct toward more detail when asked and I feel like it, rather than spelling out all the details in a way that turns out to have been unneeded.
Which is to say, this'll be kind of punchy and under-justified.
The short version is this:
We're already in AI takeoff. The "AI" is just running on human minds right now. Sorting out AI alignment in computers is focusing entirely on the endgame. That's not where the causal power is.
Maybe that's enough for you. If so, cool.
I'll say more to gesture at the flesh of this.
What kind of thing is wokism? Or Communism? What kind of thing was Naziism in WWII? Or the flat Earth conspiracy movement? Q Anon?
If you squint a bit, you might see there's a common type here.
In a Facebook post I argued that it's fair to view these things as alive. Well, really, I just described them as living, which kind of is the argument. If your woo allergy keeps you from seeing that… well, good luck to you. But if you're willing to just assume I mean something non-woo, you just might see something real there.
These hyperobject creatures are undergoing massive competitive evolution. Thanks Internet. They're competing for resources. Literal things like land, money, political power… and most importantly, human minds.
I mean something loose here. Y'all are mostly better at details than I am. I'll let you flesh those out rather than pretending I can do it well.
But I'm guessing you know this thing. We saw it in the pandemic, where friendships got torn apart because people got hooked by competing memes. Some "plandemic" conspiracy theorist anti-vax types, some blind belief in provably incoherent authorities, the whole anti-racism woke wave, etc.
This is people getting possessed.
And the… things… possessing them are highly optimizing for this.
To borrow a bit from fiction: It's worth knowing that in their original vision for The Matrix, the Wachowski siblings wanted humans to be processors, not batteries. The Matrix was a way of harvesting human computing power. As I recall, they had to change it because someone argued that people wouldn't understand their idea.
I think we're in a scenario like this. Not so much the "in a simulation" part. (I mean, maybe. But for what I'm saying here I don't care.) But yes with a functionally nonhuman intelligence hijacking our minds to do coordinated computations.
(And no, I'm not positing a ghost in the machine, any more than I posit a ghost in the machine of "you" when I pretend that you are an intelligent agent. If we stop pretending that intelligence is ontologically separate from the structures it's implemented on, then the same thing that lets "superintelligent agent" mean anything at all says we already have several.)
We're already witnessing orthogonality.
The talk of "late-stage capitalism" points at this. The way greenwashing appears for instance is intelligently weaponized Goodhart. It's explicitly hacking people's signals in order to extract what the hypercreature in question wants from people (usually profit).
The way China is drifting with a social credit system and facial recognition tech in its one party system, it appears to be threatening a Shriek. Maybe I'm badly informed here. But the point is the possibility.
In the USA, we have to file income taxes every year even though we have the tech to make it a breeze. Why? "Lobbying" is right, but that describes the action. What's the intelligence behind the action? What agent becomes your intentional opponent if you try to change this? You might point at specific villains, but they're not really the cause. The CEO of TurboTax doesn't stay the CEO if he doesn't serve the hypercreature's hunger.
I'll let you fill in other examples.
If the whole world were unified on AI alignment being an issue, it'd just be a problem to solve.
The problem that's upstream of this is the lack of will.
Same thing with cryonics really. Or aging.
But AI is particularly acute around here, so I'll stick to that.
The problem is that people's minds aren't clear enough to look at the problem for real. Most folk can't orient to AI risk without going nuts or numb or spitting out gibberish platitudes.
I think this is part accidental and part hypercreature-intentional.
The accidental part is like how advertisements do a kind of DDOS attack on people's sense of inherent self-worth. There isn't even a single egregore to point at as the cause of that. It's just that many, many such hypercreatures benefit from the deluge of subtly negative messaging and therefore tap into it in a sort of (for them) inverse tragedy of the commons. (Victory of the commons?)
In the same way, there's a very particular kind of stupid that (a) is pretty much independent of g factor and (b) is super beneficial for these hypercreatures as a pathway to possession.
And I say "stupid" both because it's evocative but also because of ties to terms like "stupendous" and "stupefy". I interpret "stupid" to mean something like "stunned". Like the mind is numb and pliable.
It so happens that the shape of this stupid keeps people from being grounded in the physical world. Like, how do you get a bunch of trucks out of a city? How do you fix the plumbing in your house? Why six feet for social distancing? It's easier to drift to supposed-to's and blame minimization. A mind that does that is super programmable.
The kind of clarity that you need to de-numb and actually goddamn look at AI risk is pretty anti all this. It's inoculation to zombiism.
So for one, that's just hard.
But for two, once a hypercreature (of this type) notices this immunity taking hold, it'll double down. Evolve weaponry.
That's the "intentional" part.
This is where people — having their minds coopted for Matrix-like computation — will pour their intelligence into dismissing arguments for AI risk.
This is why we can't get serious enough buy-in to this problem.
Which is to say, the problem isn't a need for AI alignment research.
The problem is current hypercreature unFriendliness.
From what I've been able to tell, AI alignment folk for the most part are trying to look at this external thing, this AGI, and make it aligned.
I think this is doomed.
Not just because we're out of time. That might be.
But the basic idea was already self-defeating.
Who is aligning the AGI? And to what is it aligning?
This isn't just a cute philosophy problem.
A common result of egregoric stupefaction is identity fuckery. We get this image of ourselves in our minds, and then we look at that image and agree "Yep, that's me." Then we rearrange our minds so that all those survival instincts of the body get aimed at protecting the image in our minds.
How did you decide which bits are "you"? Or what can threaten "you"?
I'll hop past the deluge of opinions and just tell you: It's these superintelligences. They shaped your culture's messages, probably shoved you through public school, gripped your parents to scar you in predictable ways, etc.
It's like installing a memetic operating system.
If you don't sort that out, then that OS will drive how you orient to AI alignment.
My guess is, it's a fuckton easier to sort out Friendliness/alignment within a human being than it is on a computer. Because the stuff making up Friendliness is right there.
And by extension, I think it's a whole lot easier to create/invoke/summon/discover/etc. a Friendly hypercreature than it is to solve digital AI alignment. The birth of science was an early example.
I'm pretty sure this alignment needs to happen in first person. Not third person. It's not (just) an external puzzle, but is something you solve inside yourself.
A brief but hopefully clarifying aside:
Stephen Jenkinson argues that most people don't know they're going to die. Rather, they know that everyone else is going to die.
That's what changes when someone gets a terminal diagnosis.
I mean, if I have a 100% reliable magic method for telling how you're going to die, and I tell you "Oh, you'll get a heart attack and that'll be it", that'll probably feel weird but it won't fill you with dread. If anything it might free you because now you know there's only one threat to guard against.
But there's a kind of deep, personal dread, a kind of intimate knowing, that comes when the doctor comes in with a particular weight and says "I've got some bad news."
It's immanent.
You can feel that it's going to happen to you.
Not the idea of you. It's not "Yeah, sure, I'm gonna die someday."
It becomes real.
You're going to experience it from behind the eyes reading these words.
From within the skin you're in as you witness this screen.
When I talk about alignment being "first person and not third person", it's like this. How knowing your mortality doesn't happen until it happens in first person.
Any kind of "alignment" or "Friendliness" or whatever that doesn't put that first person ness at the absolute very center isn't a thing worth crowing about.
I think that's the core mistake anyway. Why we're in this predicament, why we have unaligned superintelligences ruling the world, and why AGI looks so scary.
It's in forgetting the center of what really matters.
It's worth noting that the only scale that matters anymore is the hypercreature one.
I mean, one of the biggest things a single person can build on their own is a house. But that's hard, and most people can't do that. Mostly companies build houses.
Solving AI alignment is fundamentally a coordination problem. The kind of math/programming/etc. needed to solve it is literally superhuman, the way the four color theorem was (and still kind of is) superhuman.
"Attempted solutions to coordination problems" is a fine proto-definition of the hypercreatures I'm talking about.
So if the creatures you summon to solve AI alignment aren't Friendly, you're going to have a bad time.
And for exactly the same reason that most AGIs aren't Friendly, most emergent egregores aren't either.
As individuals, we seem to have some glimmer of ability to lean toward resonance with one hypercreature or another. Even just choosing what info diet you're on can do this. (Although there's an awful lot of magic in that "just choosing" part.)
But that's about it.
We can't align AGI. That's too big.
It's too big the way the pandemic was too big, and the Ukraine/Putin war is too big, and wokeism is too big.
When individuals try to act on the "god" scale, they usually just get possessed. That's the stupid simple way of solving coordination problems.
So when you try to contribute to solving AI alignment, what egregore are you feeding?
If you don't know, it's probably an unFriendly one.
(Also, don't believe your thoughts too much. Where did they come from?)
So, I think raising the sanity waterline is upstream of AI alignment.
It's like we've got gods warring, and they're threatening to step into digital form to accelerate their war.
We're freaking out about their potential mech suits.
But the problem is the all-out war, not the weapons.
We have an advantage in that this war happens on and through us. So if we take responsibility for this, we can influence the terrain and bias egregoric/memetic evolution to favor Friendliness.
Anything else is playing at the wrong level. Not our job. Can't be our job. Not as individuals, and it's individuals who seem to have something mimicking free will.
Sorting that out in practice seems like the only thing worth doing.
Not "solving xrisk". We can't do that. Too big. That's worth modeling, since the gods need our minds in order to think and understand things. But attaching desperation and a sense of "I must act!" to it is insanity. Food for the wrong gods.
Ergo why I support rationality for its own sake, period.
That, at least, seems to target a level at which we mere humans can act.