Mitchell_Porter


In the title you say AI was "aligned by default", which to me makes it sound like any sufficiently advanced AI is automatically moral, but in the story you have a particular mechanism - explicit simulation of an aligned AI, which bootstraps that AI into being. Did I misinterpret the title? 

Said is right ... about the epistemic standards of this site being low

Could you, or someone who agrees with you, be specific about this? What exactly are the higher standards of discussion that are not being met? What are the endemic epistemic errors that are being allowed to flourish unchecked?

I got o3 to compare Eliezer's metaethics with that of Brand Blanshard (who has some similar ideas), with particular attention to whether morality is subjective or objective. The result...

What's the relationship between consciousness and intelligence?

why ASI is near certain in the immediate future

He doesn't say that? Though plenty of other people do. 

The world will get rich.

Economists say the world or the West already "became rich". What further changes are you envisioning? 

Did you notice a few months ago, when Grok 3 was released and people found it could be used for chemical weapons recipes, assassination planning, and so on? The xAI team had to scramble to fix its behavior. If it had been open source, that would not even be an option; it would just be out there now, helping to boost any psychopath or gang who got hold of it toward criminal-mastermind status.

First let me say that, with respect to the world of alignment research or the AI world in general, I am nothing. I don't have a job in those areas, and I am physically remote from where the action is. My contribution consists of posts and comments here. This is a widely read site, so in principle a thought posted here can have consequences, but a priori my likely impact is small compared to that of people already closer to the center of things.

I mention this because you're asking rationalists and effective altruists to pay more attention to your scenario, and I'm giving it attention, but who's listening? Nonetheless... 

Essentially, you are asking us to pay more attention to the risk that small groups of people, super-empowered by user-aligned AI, will deliberately use that power to wipe out the rest of the human race; and you consider this a reason to favor (in the words of your website) "rejecting AI" - which to me means a pause or a ban - rather than working to "align" it. 

Now, from my own situation of powerlessness, I do two things. First, I focus on the problem of ethical alignment or civilizational alignment - how one would impart values to an AI such that, even as an autonomous superintelligent being, it would be "human-friendly". Second, I try to talk frankly about the consequences of AI. For me, that means insisting, not that it will necessarily kill us, but that it will necessarily rule us - or at least rule the world, ordering it according to its purposes.

I focus on ethical alignment, rather than on just trying to stop AI, because we could be extremely close to the creation of superintelligence, and in that case, there is neither an existing social mechanism that can stop the AI race, nor is there time to build one. As I said, I do not consider human extinction a certain outcome of superintelligence - I don't know the odds - but I do consider human disempowerment to be all but certain. A world with superintelligent AI will be a world ruled by superintelligent AI, not by human beings. 

There is some possibility that superintelligence emerging from today's AI will be adequately human-friendly, even without further advances in ethical alignment. Perhaps we have enough pieces of the puzzle already to make that a possible outcome. But we don't have all the pieces yet, and the more we collect, the better the chance of a happy outcome. So, I speak up in favor of ideas like CEV, I share promising ideas when I come across them, and I encourage people to try to solve this big problem.

As for talking frankly about the consequences of AI, it's apparent that no one in power is stating that the logical endpoint of an AI race is the creation of humanity's successors. Therefore I like to emphasize that, in order to restore some awareness of the big picture. 

OK, now on to your take on everything. Superficially, your scenario deviates from mine. Here I am insisting that superintelligence means the end of human rule, whereas you're talking about humans still using AI to shape the world, albeit destructively. When I discuss the nature of superintelligent rule with more nuance, I do say that rule by entirely nonhuman AI is just one form. Another form is rule by some combination of AIs and humans. However, if we're talking about superintelligence, even if humans are nominally in control, the presence of superintelligence as part of the ruling entity means that most of the "ruling" will be done by the AI component, because the vast majority of the cognition behind decision-making will be AI cognition, not human cognition.

You also ask us to consider scenarios in which destructive humans are super-empowered by something less than superintelligence. I'm sure it's possible, but in general, any scenario with AI that is "agentic" but less than superintelligent will have a tendency to give rise to superintelligence, because that is a capability that would empower the agent (if it can solve the problems of user-alignment, where the AI agent is itself the user).

Now let's think for a bit about where "asymmetric AI risk", in which most but not all of the human race is wiped out, belongs in the taxonomy of possible futures, how much it should affect humanity's planning, and so forth. 

A classic taxonomic distinction is between x-risk (extinction risk, "existential risk") and s-risk. "S" here most naturally stands for "suffering", but I think s-risk also just denotes a future where humanity isn't extinct, but nonetheless something went wrong. There are s-risk scenarios where AI is in charge, but instead of killing us, it just puts us in storage, or wireheads us. There are also s-risk scenarios where humans are in charge and abuse power. An endless dictatorship is an obvious example. I think your scenario also falls into this subcategory (though it does border on x-risk). Finally, there are s-risk scenarios where things go wrong, not because of a wrong or evil decision by a ruling entity, but because of a negative-sum situation in which we are all trapped. This could include scenarios in which there is an inescapable trend of disempowerment, or dehumanization, or relentlessly lowered expectations. Economic competition is the usual villain in these scenarios. 

Finally, zooming in on the specific scenario in which some little group uses AI to kill off the rest of the human race, we could distinguish between scenarios in which the killers are nihilists who just want to "watch the world burn", and scenarios in which the killers are egoists who want to live and prosper, and who are killing off everyone else for that reason. We can also scale things down a bit and consider the possibility of AI-empowered war or genocide. That actually feels more likely than some clique using AI to literally wipe out the rest of humanity. It would also be in tune with the historical experience of humanity, which is that we don't completely die out, but we do suffer a lot.

If you're concerned about human well-being in general, you might consider the prospect of genocidal robot warfare (directed by human politicians or generals), as something to be opposed in itself. But from a perspective in which the rise of superintelligence is the endgame, such a thing still just looks like one of the phenomena that you might see on your way to the true ending - one of the things that AI makes possible while AI is still only at "human level" or less, and humans are still in charge. 

I feel myself running out of steam here, a little. I do want to mention, at least as a curiosity, an example of something like your scenario, from science fiction. Vernor Vinge is known for raising the topic of superintelligence in his fiction, under the rubric of the "technological singularity". That is a theme of his novel Marooned in Realtime. But the precursor to that book, The Peace War, is set in a depopulated world, in which there's a ruling clique with an overwhelming technology (not AI or nanotechnology, just a kind of advanced physics) that allows it to dominate everyone else. Its premise is that in the world of the late 20th century, humanity was flirting with extinction anyway, thanks to nuclear and biological warfare. "The Peace", the ruling clique, are originally just a bunch of scientists and managers from an American lab which had this physics breakthrough. They first used it to seize power from the American and Russian governments, by disabling their nuclear and aerospace strengths. Then came the plagues, which killed most of humanity and which were blamed on rogue biotechnologists. In the resulting depopulated world, the Peace keeps a monopoly on high technology, so that humanity will not destroy itself again. The depopulation is blamed on the high-tech madmen who preceded the Peace. But I think it is suggested inconclusively, once or twice, that the Peace itself might have had a hand in releasing the plagues.

We see here a motivation for a politicized group to depopulate the world by force, a very Hobbesian motivation: let us be the supreme power, and let us do whatever is necessary to remain in that position, because if we don't do that, the consequences will be even worse. (In terms of my earlier taxonomy, this would be an "egoist" scenario, because the depopulating clique intends to rule; whereas an AI-empowered attempt to kill off humanity for the sake of the environment or the other species would be a "nihilist" scenario, where the depopulating clique just wants to get rid of humanity. Perhaps this shows that my terminology is not ideal, because in both these cases, depopulation is meant to serve a higher good.)

Presumably the same reasoning could occur in service of (e.g.) national survival rather than species survival. So here we could ask: how likely is it that one of the world's great powers would use AI to depopulate the world, in the national interest? That seems pretty unlikely to me. The people who rise to the top in great powers may be capable of contemplating terrible actions, but they generally aren't omnicidal. What might be a little more likely is a scenario in which, having acquired the capability, they decide to permanently strip all other nations of high technology, and they act ruthlessly in service of this goal. The leaders of today's great powers don't want to see the rest of humanity exterminated, but they might well want to see it reduced to a peasant's life, especially if the alternative is an unstable arms race and the risk of being subjugated themselves.

However, even this is something of a geopolitical dream. In the real world of history so far, no nation gets an overwhelming advantage like that. There's always a rival hot on the leader's trail, or there are multiple powers who are evenly matched. No leader ever has the luxury to think: what if I just wiped out all other centers of power, how good would that be? Geopolitics is far more often just a struggle to survive recurring situations in which all choices are bad.

On the other hand, we're discussing the unprecedented technology of AI, which, it is argued, could actually deliver that unique overwhelming advantage to whoever goes furthest fastest. I would argue that the world's big leaders, as ruthless as they can be, would aim at disarming all rival nations rather than outright exterminating them, if that relative omnipotence fell into their hands. But I would also suggest that the window for doing such a thing would be brief, because AI should lead to superintelligent AI, and a world in which AIs, not humans, are in charge. 

Possibly I should say something about scenarios in which it's not governments, but rather corporate leaders, who are the humans who rule the world via their AIs. Vinge's Peace is also like this - it's not the American government that takes over the world, it's one particular DARPA physics lab that achieved the strategic breakthrough. The personnel of that lab (and the allies they recruited) became the new ruling clique of the world. The idea that Altman, or Musk, or Sutskever, or Hassabis, and the trusted circles around them, could become the rulers of Earth is something to think about. However, once again I don't see these people as exterminators of humanity - despite paranoia about billionaires buying up bomb shelters in New Zealand, and so forth. That's just the billionaires trying to secure their own survival in the event of global disaster; it doesn't mean they're planning to trigger that disaster... And once again, anyone who is achieving world domination via AI is likely to end up in a sorcerer's apprentice situation, in which they get dominated by their own tools, no matter how good their theory of AI user-alignment is; because agentic AI naturally leads to superintelligence, and the submergence of the human component in the tide of AI cognition.

I think I'm done. Well, one more thing: although I am not fighting for a pause or a ban myself, pragmatically, I advise you to cultivate ties with those who are, because that is your inclination. You may not be able to convince anyone to change their priorities, but you can at least team up with those who already share them. 

I'm confused about what your bounty is asking exactly

From the post: 

the goal is to promote broadly changing the status of this risk from "unacknowledged" ... to "examined and assigned objective weight"
