I started a draft of this post some days ago, but then a lot of things happened and so I'm rewriting it from scratch. Most importantly, a TIME editorial appeared in which Eliezer Yudkowsky talks of bombing AI training data centres, which both makes AI doom discourse fairly mainstream and throws a giant rock through the Overton window on the topic. This has elicited some ridicule, and enough worry that a few clarifications may be needed.
A lot of the discourse here usually focuses on alignment: is it easy, is it hard, what happens if we can't achieve it, and how do we achieve it. I want to make a broader point that I feel has not received as much focus. Essentially, my thesis is that from the viewpoint of the average person, developing and deploying agentic AGI at all might be viewed as a hostile act. I think that the AI industry risks being regulated into economic unviability, and/or that MAD-like equilibria such as the one Eliezer suggested might form not because everyone is scared of unaligned AGI, but because everyone is almost equally scared of aligned AGI, and not for no reason. As such, I think that perhaps the sanest position for AI-safety-minded people to take in the upcoming public debate is "We should just not build AGI (and rather focus on more specialised, interpretable, non-agentic AI tools that merely empower humans, but leave the power to define their goals always firmly in our hands)". In other words, I think at this stage of things powerful friendly AGI is simply a mirage that holds us back from supporting the solutions that would have the best chance of success, and makes us look hostile or untrustworthy to a large part of the public, including potential allies.
The fundamental steps for my thesis are:
- building AGI probably comes with a non-trivial existential risk. This, in itself, is enough for most to consider it an act of aggression;
- even if the powerful AGI is aligned, there are many scenarios in which its mere existence transforms the world in ways that most people don't desire or agree with; whatever value system it encodes gets an immense boost and essentially Wins Culture; very basic evidence from history suggests that people don't like that;
- as a result of this, lots of people (and institutions, and countries, possibly of the sort with nukes) might turn out to be willing to resort to rather extreme measures to prevent an aligned AGI take off, simply because it's not aligned with their values.
Note that the second and third points can hold even if, for whatever reason, AGI doesn't trigger a takeoff that leads to an intelligence explosion and ASI. The stakes are less extreme in that case, but there are still plenty of potentially very undesirable outcomes which might trigger instability and violent attempts to prevent its rise.
I'll go through the steps one by one in more detail.
Non-aligned AGI is bad
First comes, obviously, the existential risk. This one I think is pretty straightforward. If you want to risk your life on some cockamamie bet that will make you a fortune if you win, go ahead. If you want to also risk my life on the same bet that will make you a fortune, we may need to have words. I think that is a pretty sound principle that even the most open-minded people on the planet would agree with. There's a name for what happens when the costs of your enterprise fall on other people, even just in expectation: we call them negative externalities.
"But if I won the bet, you would benefit too!," you could say. "Aligned AGI would make your life so much better!". But that doesn't fly either. First, if you're a for-profit company trying to build AGI it still seems like even success will benefit you far more than me. But more importantly, it's just not a good way to behave in general. I wear glasses; I am short-sighted. If you grabbed me by force in a dark alley, drugged me, then gave me LASIK while I am unconscious, I wouldn't exactly be happy with it as long as it turned out fine. What if the operation went badly? What if you overdosed me with the anaesthetic and killed me? There are obvious reasons why this kind of pseudo-utilitarian thinking doesn't work, mainly that however positive the outcomes on my material well-being, simply by doing that you have taken away my ability to choose for myself, and that is in itself a harm you visited upon me. Whether things go well or badly doesn't change that.
If you bet on something that could cause the destruction of the world, you are betting the lives of every single living being on this planet. Every old man, every child, every animal. Everyone who never even heard of AI or owned a computer, everyone who never asked you to do this nor consented to it, but was thrown onto the table as a wager regardless. You are also risking the destruction of human heritage, of the biosphere and of its potential to ever spawn intelligent life again, all things that many agree have intrinsic value above and beyond that of even our own personal survival (if I had to die, I'd rather do so knowing the rest of humanity will live; if I had to die along with the rest of humanity, I'd rather do so knowing that at least maybe one day something else will look at the ruins we left behind and wonder, and maybe think of us). That is one mighty hefty bet.
But I am no deontologist, and I can imagine that perhaps the odds of extinction are low enough, and the benefits of winning the bet so spectacular, that maybe you could make a case that they offset that one harm (and it's a big harm!) and make it at best a necessary evil. Unfortunately, I really don't think that's the case, because...
Aligned AGI is not necessarily that good either
If you want to change the world, your best bet is probably to invent something useful. Technology gets to change the world even from very humble beginnings - sometimes a few people and resources are enough to get the ball rolling, and at that point, if the conditions are right, nothing can stop it. Investors will sniff the opportunity and fund it, early adopters will get into it for the advantage it gives them; eventually it spreads enough that the world itself reshapes around the new thing, and the last holdouts have to either adapt or be left hopelessly behind. You could live in 1990 without the internet, but in 2023 you would likely have trouble finding a job, a house or a date without it. Moloch made sure of that.
Summoning Moloch to change the world on your behalf is a seductive proposition. It is also a dangerous one. There is no guarantee that the outcomes will be precisely what you hoped for, regardless of your intentions; in fact, there is no guarantee that the outcomes will be good at all. You might well just trigger a race to the bottom in which any benefits are only temporary, and eventually everything settles on an equilibrium where everyone is worse off. What will it be, penicillin or nuclear weapons? When you open your Pandora's Box, you've just decided to change the world for everyone, for good or for bad, billions of people who had absolutely no say in what now will happen around them. We can't just hold a worldwide referendum every time we want to invent something, of course, so there's no getting around that. But while your hand is on the lid, at least, you ought to give it a think.
AGI has been called the last invention that humanity will ever need to make. It is thus very appropriate that it comes with all these warnings turned up to eleven: it promises to be more transformative than any other invention, and it promises to spread more quickly and more irreversibly than any other invention (in fact, it would be able to spread itself). And if you are the one creating it, you have the strongest and possibly last word on what the world will become. Powerful AGI isn't like any other invention. Regular inventions are usually passive tools, separate from the will of their creator (I was about to make a snide remark about how the inventor of the guillotine died by it, but apparently that's a myth). Some inventions, like GMOs, are agents in their own way, but much less smart than us, and so we engineer ways to control them and prevent them from spreading too much. AGI, however, would be a smart agent; aligned AGI would be a smart agent imbued with the full set of values of its creator. It would change the world with absolute fidelity to that vision.
Let's go over some possible visions that such an AGI might spread into the world:
- the creator is an authoritarian state that wants to simply rule everything with an iron fist;
- the creator is a private corporation that comes up with some set of poorly thought out rules by committee that are mostly centred around its profit;
- the creator is a strong ideologue who believes imposing their favourite set of values on everyone on Earth will be the best for everyone regardless of their opinion;
- the creator is a genuinely well-intentioned person who only wishes for everyone to have as much freedom as allowed, but regardless of that has blind spots that they fail to identify and that slip their way into the rules;
- the creator is a genuinely well-intentioned person who somehow manages the nigh-superhuman task of coming up with the minimal and sufficient set of rules that do indeed optimally satisfy everyone's preferences, to such a degree that it offsets any harms done in the process of unilaterally changing the world.
I believe some people might class some of these scenarios as cases of misalignment, but here I want to stress the difference between not being able to determine what the AI will do, and being able to determine it but just being evil (or incompetent). I think we can all agree that the last scenario feels like the one possible lucky outcome at the end of a long obstacle course of pitfalls. I also suspect (though I've not really tried to formalize it) that there is a fundamental advantage to encoding something as simple and lacking in nuance as "make Dave the God-King of Earth and execute his every order, caring for no one else" over something much more sophisticated, which gives the worst possible actors another leg up in this race (Dave of course might then paperclip the Earth by mistake by giving a wrongly worded order, which makes the scenario even worse).
So from my point of view, as a person who's not creating the AGI, many aligned AGI scenarios might still be less than ideal. In some cases, the material benefits might be somewhat lessened by these effects, but not so much that the outcome isn't still a net positive for me (silly example: in the utopia in which I'm immortal and have all I wish for, but I am no longer allowed to say the word "fuck", I might be slightly miffed but I'll take what I get). In other cases the restrictions might be so severe and oppressive that, to me, they essentially make life a net negative, which would actually turn even immortality into a curse (not so silly example: in the dystopia in which everyone is a prisoner in a fascistic panopticon there might be no escape at all from compliance or torture). Still, I think that on net, I and most people reading this would overall be more OK than not with most of the non-blatantly-oppressive varieties of this sort of takeoff. There are a lot of the oppressive variety, though, and my guess is that they are more likely than the other kind (both because many powerful actors lack the insight and/or moral fibre to actually succeed at creating a good one, and because the bad ones might be easier to create).
It gets even worse, though. Among relatively like-minded peers, we might at least roughly agree on which scenarios count as bad and which as good, and perhaps even on how likely the latter are. But that all crumbles on a global scale, because in the end...
People are people
“It may help to understand human affairs to be clear that most of the great triumphs and tragedies of history are caused, not by people being fundamentally good or fundamentally bad, but by people being fundamentally people.”
Good Omens
Suppose you had your aligned powerful AGI, ready to be deployed and change the world at the push of a big red button. Suppose then someone paraded in front of you each and every one of the eight billion people in this world, calmly explained the situation to them and what would happen if you push the button, then gave them a gun and told them that if they want to stop you from pushing the button, the only way is to shoot you, and that they will suffer no consequences for it. You're not allowed to push the button until every single last person has left.
My guess is that you'd be dead before the hundredth person.
I'd be very surprised if you reached one thousand.
There are A Lot of cultures and systems of belief in this world[1]. Many of these are completely at odds with each other on very fundamental matters. Many will certainly be at odds with yours in one way or another. There are people who will oppose making work obsolete. There are people who will oppose making death obsolete. Lots of them, in fact. You can think that some of these beliefs are stupid or evil, but that doesn't change the fact that they think the same of yours, and will try to stop you if they can. You don't need to look far into history to see how many people have regularly put their lives on the line, sometimes explicitly put them second, when it came to defending some identity or belief they held dearly; it's a very obvious revealed preference. If you are about to simply override all those values with an act of force, by using a powerful AGI to reshape the world in your image, they'll feel that is an act of aggression - and they will be right.
There are social structures and constructs born of these beliefs. Religions, institutions, states. You may conceptualize them as memetic superorganisms that have a kind of symbiotic (or parasitic) relationship with their human hosts. Even if their hosts might be physically fine, your powerful AGI is like a battery of meme-tipped ICBMs aimed to absolutely annihilate them. To these social constructs, an aligned AGI might as well be as much of an existential threat as a misaligned one, and they'll react and defend themselves to avoid being destroyed. They'll strike pre-emptively, if that's the only thing they can do. Even if you think that people might eventually grow to like the post-singularity state of affairs, they won't necessarily be of that opinion beforehand, because they believe strongly in the necessity and goodness of those constructs, and that's all that matters.
If enough people feel threatened enough, regardless of whether the alignment problem was solved, AGI training data centres might get bombed anyway.
I think we're beginning to see this; talk of AGI has already started taking on the tones of geopolitics. "We can't let China get there first!" is a common argument in favour of spurring a faster race and against slowing down. I can imagine similar arguments on the other side. To a democracy, an autocracy ruling the world would be a tragedy; to an autocracy, democracy winning would be equally repulsive. We might think neither outcome is worth destroying the world over, but that's not necessarily a shared sentiment either; just like in the Cold War someone might genuinely think "better dead than red".
I'm not saying here that I have no opinion, that I think all value systems are equally valid, or any other strawman notion of perfect centrism. I am saying it doesn't much matter who's right if all sides feel cornered enough and are armed well enough to lash out. If you start a fight, someone else might finish it, and seeking to create powerful AGI is effectively starting a fight. Until now, it seems to me like the main plan from people involved in this research has been "let's look like a bunch of innocuous overenthusiastic nerds tinkering with software right until the very end, when it's conquerin' the world time... haha just kidding... unless...", which honestly strikes me as offensively naïve and more than a bit questionable. But that ship might as well have sailed for good. Now AI risk is in the news, Italy has banned ChatGPT over privacy concerns (with more EU countries possibly to follow), and people are pushing the matter to the Federal Trade Commission. If anyone had been sleeping until now, it's wake-up time.
Not everyone will believe that AGI can trigger an intelligence explosion, of course. But even if for some reason it didn't, it might still be enough to create plenty of tensions, externally and internally. From an international viewpoint, a country with even just regular human-level AGI would command an immense amount of cognitive labour, might field an almost entirely robotic army, and perhaps sophisticated intelligent defence systems able to shield it effectively from a nuclear strike. The sheer increase in productivity and available intelligence would be an insurmountable strategic and economic advantage. On the internal front, of course, AGI could have a uniquely disruptive impact on the economy; automation usually has a way of displacing the freed labour towards higher-level tasks, but with AGI there would be no task left to displace workers to. The best value a human worker might have left to offer would be that their body is still cheaper than a robot's, and that's really not a great bargaining position. A country with "simple" human-level AGI thus might face challenges on both the external and internal fronts, and those might materialize even before AGI itself does. The dangers would be lesser than with superintelligence, but the benefits would be proportionally reduced too, so I think it still roughly cancels out.
I don't think that having a peaceful, coordinated path to powerful aligned AGI is completely hopeless, overall. But I just don't think that as a society we're nearly there yet. Even beyond the technical difficulties of alignment, we lack the degree of cooperation and harmonization on a global scale that would allow us to organize the transition to a post-ASI future with enough shared participation that no one feels like they're getting such a harsh deal they'd rather blow everyone up than suffer the future to come. As things stand, a race to AGI is a race to supremacy: the only ways it can end are with everyone dead, with the suppression of one side (if we're lucky, via powerful aligned AGI; if we're not, via nuclear weapons), or with all sides begrudgingly acknowledging that the situation is too dangerous for all involved and somehow slowly de-escalating, possibly leading to a MAD-like equilibrium in which AGI is simply banned for all parties involved. The only way to accept that you can't have it, after all, is if no one else can have it either.
Conclusion
The usual argument from people who are optimistic about AGI alignment is that even if there's <insert percentage> of X-risk, the upsides in case of success are so spectacular that they are worth the risk. Here I am taking a somewhat more sombre view, suggesting that if you want to weigh the consequences of AGI you also have to consider the harms to the agency of the many people who would be impacted by it without having had a say in its creation. These harms might be so acute that some people might expect an AGI future to be a net negative for them, and thus actively seek to resist or stop the creation of AGI; states might get particularly dangerous if they feel existentially threatened by it. This then compounds the potential harms of AGI for everyone else, since if you get caught in a nuclear strike before it's deployed, you don't get to enjoy whatever comes afterwards anyway.
As AGI discourse becomes more mainstream, it's important to appreciate perspectives beyond our own and not fall into the habit of downplaying or ignoring them. This is necessary both morally (revealed preferences matter and are about the only window we have into other people's utility!) and strategically: AI research and development still exists embedded in the social and political realities of this world, however much it may wish to transcend them via a quick electronic apotheosis.
The good news is that if you believe that AI will likely destroy the world, this actually opens a possible path to survival. Everyone's expectation of AI's value will be different, but it's becoming clear that many, many people see it as a net negative. In general people place themselves at different spots on the "expected AI power" axis based on their knowledge, experience, and general feelings; some don't expect AI to get any worse than a tool to systematically concentrate value produced by individuals (e.g. art) into the hands of corporations via scraping and inference training. Others fear its misinformation potential, or its ability to rob people of their jobs on a massive scale, or its deployment as a weapon of war. Others believe its potential to be great enough to eventually be an extinction event. Some worry about AI being out of control, others about it being controlled far too well but for bad goals. Different expected levels of power affect people's expectations about how much good or bad it can do, but in the end, many seem to fall on the belief that it will still cause mostly harm, not because of technical reasons involved in the AI's workings but because the social structures within which the AI is being created don't allow for a good outcome. The same holds for powerful AGI: aligning it wouldn't just be a prodigious technical challenge, but a social one on a global scale. Trying to race to it as a way to etch one's supremacy into eternity is just about the worst reason and the worst way to go about it. We should be clear about this to both others and ourselves, avoid the facile trap of hoping for an outcome so arbitrarily good that it entirely offsets its improbability, and focus on a more realistic short-term goal and path for humanity. We're not yet quite willing or ready to hand off the reins of our future to something else, and perhaps we never will be.
[1] Citation needed
- building AGI probably comes with a non-trivial existential risk. This, in itself, is enough for most to consider it an act of aggression;
1. I don't see how aligned AGI comes with existential risk to humanity. It might pose an existential risk to groups opposing the value system of the group training the AGI, this is true. For example, Al-Qaeda would view it as an existential risk to itself, but there is no probable existential risk for the groups that are more aligned with the training.
2. There are several more steps from aligned AGI to existential risk to any group of people. You don't only need an AGI; you need to weaponize it, and establish a physical presence that will monitor the execution of this AGI's value system. Deploying an army of robots that will enforce the value system of an AGI is very different from just inventing an AGI, just as bombing civilians from planes is very different from inventing flight or bombs. We can argue about where the act of aggression takes place, but most of us will place it in the hands of the people who have the resources to build an army of robots for this purpose and who invest their resources with the intention of enforcing their value system. Just as Marie Curie can't be blamed for atomic weapons, and her discovery is not an act of aggression, the Wright brothers can't be blamed for all the bombs dropped on civilians from planes.
3. I would expect most deployed robots based on AGI to be protective in nature, not aggressive. That means that nations will use those robots to *defend* themselves and their allies from invaders, not to attack. So any measure of aggression in the invading sense - forcing, invading and breaking the existing social boundaries we created - will contradict the majority of humanity's values, and therefore will mean this AGI is not aligned. Yes, some aggressive nations might create invading AGIs, but they will probably be a minority, and the invention and deployment of an AGI can't be considered by itself an act of aggression. If aggressive people teach an AGI to be aggressive, and not aligned with the majority of humanity which is protective but not aggressive, then this is on their hands, not the AGI inventor's.
- even if the powerful AGI is aligned, there are many scenarios in which its mere existence transforms the world in ways that most people don't desire or agree with; whatever value system it encodes gets an immense boost and essentially Wins Culture; very basic evidence from history suggests that people don't like that;
1. I would argue that initially there would be a lot of different alternatives, all meant to this or that extent to serve the best interest of a collective. Some of the benefits are universal - say, reducing deaths from starvation, homelessness, traffic accidents, environmental issues like pollution and waste, diseases, lack of education resources or access to healthcare advice. Avoiding the deployment of an AGI means you don't care about people who have those problems. I would say most people would like to solve those social issues, and if you don't, you can't force people to continue dying from starvation and diseases just because you don't like an AGI. You need to bring something more substantial; otherwise, just don't use this technology.
2. The idea that an AGI is somehow enforced on people in order to "Win Culture" is not based on anything substantial. Like any technology - and this is the secret of its success - it is a choice. You can go live in a forest and avoid any technology, and find a like-minded Amish-inspired community of people. Most people do enjoy technological advancements and the benefits that come with them. Using force based on an AGI is a moral choice, a choice which is made by the community of people training the AGI, and this kind of aggression will most probably be both unpopular and forbidden by law. Providing a chatbot with some value system, by contrast, is part of freedom of speech.
3. If by "Win Culture" you mean automating jobs that are done today by hand - I wouldn't call it enforcing a value system. Currently jobs are necessary evil, and are enforced on people to otherwise not be able to get their basic needs met. Solving problems, and stopping forcing people to do jobs most of them don't like, is not an act of aggression. This is an act of kindness that stops the current perpetual aggression we are used to. If someone is using violence, and you come and stop him from using violence, you are not committing an act of aggression, you are preventing aggression. Preventing the act of aggression might be not desired by the aggressor, but we somehow learned to deal with people who think they can be violent and try to use force to get what they want. This is a very delicate balance, and as long as AGI services are provided by choice, with several alternatives, I don't see how this is an act of aggression.
4. If someone "Win Culture" then good for him. I would not say that today's culture is so good, I would bet on superhuman culture to be better than what we have today. Some people might not like it, some people might not love cars and planes, and continue to use horses, but you can't force everyone around you to continue to use horses because sometimes car accidents happens, and you could become a victim of a car accident, this is not a claim that should stop any technology from being developed or integrated into society.
- as a result of this, lots of people (and institutions, and countries, possibly of the sort with nukes) might turn out to be willing to resort to rather extreme measures to prevent an aligned AGI take off, simply because it's not aligned with their values.
Terrorism and sabotage are a common strategy that can't be eliminated completely, but I would say most of the time they don't manage to reach their goals. Why would people try to bomb anything, instead of, for example, paying someone to train an AGI that will be aligned with their values? How is this even specific to AGI, rather than to any human community with a different value system? Why would you wait for an AGI to commit these acts of aggression? If some community doesn't deserve to live in your opinion, you will not wait for an AGI; and if it does, then you have learned to coexist with people different from yourself. They will not take over the world just because they have an AGI. There will be plenty of alternative AGIs, of different strengths and trained with different values. It takes time for an AGI to take over the world - far longer than it takes to reinvent the same technology several times over and use alternative AGIs that can compete. And as most of us are protectors and not aggressors, and we have established boundaries balancing our forces, I would expect this basic balance to continue.
- "When you open your Pandora's Box, you've just decided to change the world for everyone, for good or for bad, billions of people who had absolutely no say in what now will happen around them."
Billions of people have no say today in many social issues. People are dying, people are forced to do labor, people are homeless. Reducing those hazards, almost to zero, is not something we should refrain from attempting in the name of "liberty". Many more people suffered a thousand years ago than now, and much of that improvement is due to the development of technology. There is no "only good" technology, but most of us prefer the benefits that come with technology over going without it. You also can't force other people to stop using technology that makes them healthier and risks their lives less, or insist that jobs are good even though they are forced on everyone and basic necessities are conditioned on them.
I can imagine larger pockets of the population preferring to avoid the use of modern technology, like larger Amish-inspired communities. This is possible - and then we should respect those people's choices, avoid forcing our values upon them, and let them live as they want. Yet you can't force people who do want the progress and all the benefits that come with it to just stop that progress out of respect for the rights of people who fear it.
Notice that we are not talking here about the development of a weapon, but about the development of a technology that promises to solve a lot of our current problems. This, at the least, should leave you agnostic. That means the decision to take some risks for humanity - in order to save hundreds of millions of lives and reduce suffering to an extent never seen before in history - is not a trivial one. I agree we should be cautious, and we should be mindful of the consequences, but we also should not be paralyzed by fear; we have a lot to lose if we stop and avoid AGI development.
- aligned AGI would be a smart agent imbued with the full set of values of its creator. It would change the world with absolute fidelity to that vision.
A more realistic estimate is that many aligned AGIs will change the world according to the common denominator of humanity, like reducing diseases, and will continue to keep the power balance between different communities, as everyone would be able to build an AGI with power proportional to their available resources - just as today there is a power balance between different communities and between the community and the individual.
Let me take an extreme example. Let's say I build an AGI for my fantasies. But as part of global regulation, I will promise to keep this AGI inside the boundaries of my property. I will not force my vision on the world; I will not want or force everyone to live in my fantasy land. I just want to be able to do it myself, inside my borders, without harming anyone who wants to live differently. Why would you want to stop me? As I see it, once again, most people are protectors, not aggressors: they want to have their values in their own space, and they will not want to forcefully and unilaterally spread their ideas without consent. My home-made AGI will probably be much weaker than any state AGI, so I wouldn't be able to do much harm anyway. Today countries enforce their laws on everyone, even if you disagree with some of them; why do you see the future being any different? If anything, I expect private spaces to be much more versatile than today, providing more choices and with less aggression than governments exercise today.
- the creator is an authoritarian state that wants to simply rule everything with an iron fist;
I agree this is a concern.
- the creator is a private corporation that comes up with some set of poorly thought out rules by committee that are mostly centred around its profit;
Not probable. It will more probably be focused on a good level of safety first and then on profit. Corporations are concerned about their image, not to mention that the people who develop it will simply not want to bring about the extinction of the human race.
- the creator is a genuinely well-intentioned person who only wishes for everyone to have as much freedom as allowed, but regardless of that has blind spots that they fail to identify and that slip their way into the rules;
This doesn't sound like something that is impossible to solve with newer, improved versions once the blind spot is discovered. In the case of an aligned AGI, the blind spot will not be the end of humanity, but more likely some bias in the data misrepresenting some ideas or groups. As long as there is an extremely low probability of extinction - and this property is almost identical with the definition of alignment - the margin of error increases significantly. There was no technology in history we got right on the first attempt. So I expect a lot of variability in AGIs: I expect some of them to be weaker or stronger, some of them to fit this or that value system of different communities. And I would expect local accidents too, with limited damage, just like terrorists and mass shooters can do today.
- many powerful actors lack the insight and/or moral fibre to actually succeed at creating a good one, and because the bad ones might be easier to create.
We actually don't need to guess anymore. We have had this technology for a while; the reason it caught on now, and was released only relatively recently, is that without providing ethical standards to those models, the backlash against large corporations is too strong. So even if I might agree that the worst ones are easier to create, and that some powerful actors could do some damage, they will be forced by a larger community (of investors, users, media and governments) to invest the effort to make the harder and safer option. I think this claim is true of many technologies today: it's cheaper and easier to make unsafe cars, trains and planes, but we managed to put regulation procedures in place, both by government and by independent testers, to make sure our vehicles are relatively safe.
You can see that RLHF, which is the main key to safety today, is incorporated by the larger players, and that alignment datasets and networks are provided for free and opened to the public precisely because we all want this technology to mostly benefit humanity. It's possible to add a more nation-centric set of values that will be more aggressive, or some leader will want to make his countrymen slaves, but this is not the point here. The main idea is that we are already creating mechanisms that encourage everyone to easily create pretty good ones, as part of our cultural norms and mechanisms that prevent bad AIs from being exposed to the public and coming to market to make a profit for the further development of even stronger AIs that eventually become an AGI. So although the initial development of AI safety might be harder, it is crucial, it's clear to most of the actors that it is crucial, and the tools that provide safety will be available and simple to use. Thus, in the long run, creating an AGI which is not aligned will be harder - because of the social environment of norms and best practices those models were developed with.
- There are people who will oppose making work obsolete.
Work is forced on us; it's not a choice. Opposing making it obsolete is an obvious act of aggression. As long as it's a necessary evil, it has a right to exist, but the moment you demand that other people work because you're afraid of technology, you become the cause of a lot of suffering that could potentially be avoided.
- There are people who will oppose making death obsolete.
Death is forced on us; it's not a choice. Opposing making it obsolete is also an act of aggression against people who would choose not to die.
- If you are about to simply override all those values with an act of force, by using a powerful AGI to reshape the world in your image, they'll feel that is an act of aggression - and they will be right.
I don't think anyone forces them to join. As a liberal, I don't believe you have the right to come to me and say "you must die, or I will kill you". This, at the least, can't be viewed as legitimate behavior that we should encourage or legitimize. If you want to work, you want to die, you want to live in 2017, you have the full right to do so. But wanting to exterminate everyone who is not like you, forcing people to suffer, die, work etc., is an obvious act of aggression toward other people, and should not be legitimized by portraying AGI as an act of aggression against them. "You don't let me force my values on you" doesn't come out as a legitimate act of self-defense. It is very reminiscent of Al Bundy, who claimed in court that another man's face got in the way of his fist, harming his hand, and demanded compensation. If you want to be stuck in time and live your life - be my guest, but legitimizing the use of force in order to avoid progress that saves millions and improves our lives significantly can't be justified within a liberal set of values.
- If enough people feel threatened enough...AGI training data centres might get bombed anyway.
This is true. And if enough people think it's OK to be extreme Islamists, they will be, and may even try to build a state like ISIS. The hope is that with enough good reasoning, and enough rational analysis of the situation, most thinking people will not feel threatened, and will see the vast potential benefits - enough not to try to bomb the AGI computer centers.
- just like in the Cold War someone might genuinely think "better dead than red".
I could believe this is possible. But once again, most of us are not aggressors; therefore most of us will try to protect our homeland and our way of life without trying to aggressively propagate it to other places that have their own social preferences.
- The best value a human worker might have left to offer would be that their body is still cheaper than a robot's
Do you truly believe that in a world where all problems are solved by automation, full of robots whose whole purpose is to serve humans, people will try to justify their existence by the jobs they can do? And that this justification will be that their body has more value than robotic parts?
I would propose an alternative: in a world where all robots serve humans and everything is automated, humans will be valued intrinsically, provided with all their needs, and given a basic income just because they are humans. The default assumption that a human is worth nothing without his job will be outdated and seen the way we see slavery today.
--------
In summary, I would say there is one major problem I see running through most of your claims: the assumption that there would be a very limited number of AGIs, forcing a minority's value system upon everyone and aggressively expanding this value system onto everyone else who thinks differently.
I would claim the more probable future is a wide variety of AGIs, each improving slowly at its own pace, while all the development teams both do something unique and learn from the lessons of other teams. For every good technology there come dozens of copycats; they will all be based on somewhat different value systems, with the common denominator of trying to benefit humanity - discovering new drugs, fixing starvation, reducing road accidents, climate change, and tedious labor which is basically forced labor. While the common problems of humanity are solved, the moral and ethical variety will continue to coexist with a power balance similar to what we have today. This pattern of technology's influence on society has held throughout all of human history up to AGI, and as of today, now that we know how to align LLMs, this tendency of power balances between nations, and inside each nation, is expected to propagate into a world where AGI is a technology available to everyone to download and train their own. If AGI turns out to be an advanced LLM, we see all those trends today already, and they are not expected to suddenly change.
Although it's hard to predict the possible good or bad sides of aligned AGIs now, it's clear that aligned networks do not pose a threat to humanity as a whole, leaving a large margin of error. Nonetheless, there remains a considerable risk of amplifying current societal problems like inequality, totalitarianism and wars to an alarming extent.
People who are not willing to be part of this progress exist today as well, as a minority. If they become a majority, that's an interesting futuristic scenario, but it's both implausible and it would be immoral to forcefully stop those who do want to use this life-saving technology, as long as they don't force anything on those who don't.
- I meant as a risk of failure to align
Today alignment is so popular that aligning a new network is probably easier than training it. It has become so much the norm and part of the training of LLMs that it's like saying some car company runs the risk of forgetting to add wheels to its cars.
This doesn't imply that all alignments are the same or that no one could potentially do it wrong, but generally speaking the fear of a misaligned AGI is very similar to the fear of having a car on the road with square wheels. Today's models aren't AGI and all the new ones are trained with...