TheAncientGeek comments on No Universally Compelling Arguments in Math or Science - Less Wrong
Comments (227)
UCAs are part of the "Why can't the AGI figure out morality for itself?" objection:
1. There is a sizeable chunk of mindspace containing rational and persuadable agents.
2. AGI research is aiming for it. (You could build an irrational AI, but why would you want to?)
3. Morality is figurable-out, or expressible as a persuasive argument.
The odd thing is that the counterargument has focussed on attacking a version of (1), although, in the form it is actually held, it is the most likely premise. OTOH, (3), the most contentious, has scarcely been argued against at all.
I would say Sorting Pebbles Into Correct Heaps is essentially an argument against 3. That is, what we think of as "morality" is most likely not a natural attractor for minds that did not develop under processes similar to our own.
Do you? I think that morality in a broad sense is going to be a necessity for agents that fulfil a fairly short list of criteria:
I think you're missing a major constraint there:
Or in other words, something like modern, Western liberal meta-morality will pop out if you make an arbitrary agent live in a modern, Western liberal society, because that meta-moral code is designed for value-divergent agents (aka: people of radically different religions and ideologies) to get along with each other productively when nobody has enough power to declare himself king and optimize everyone else for his values.
The nasty part is that AI agents could pretty easily get way, waaaay out of that power-level. Not just by going FOOM, but simply by, say, making a lot of money and purchasing huge amounts of computing resources to run multiple copies of themselves, which now have more money-making power and as many votes for Parliament as there are copies, and so on. This is roughly the path taken by power-hungry humans already, and look how that keeps turning out.
The other thorn on the problem is that if you manage to get your hands on a provably Friendly AI agent, you want to hand it large amounts of power. A Friendly AI with no more power than the average citizen can maybe help with your chores around the house and balance your investments for you. A Friendly AI with large amounts of scientific and technological resources can start spitting out utopian advancements (pop really good art, pop abundance economy, pop immortality, pop space travel, pop whole nonliving planets converted into fun-theoretic wonderlands) on a regular basis.
Well, it's a list of four then, not a list of three. It's still much simpler than "morality is everything humans value".
You seem to be making the tacit assumption that no one really values morality, and just plays along (in egalitarian societies) because they have to.
Can't that be done by Oracle AIs?
Let me clarify. My assumption is that "Western liberal meta-morality" is not the morality most people actually believe in, it's the code of rules used to keep the peace between people who are expected to disagree on moral matters.
For instance, many people believe, for religious reasons or pure Squick or otherwise, that you shouldn't eat insects, and shouldn't have multiple sexual partners. These restrictions are explicitly not encoded in law, because they're matters of expected moral disagreement.
I expect people to really behave according to their own morality, and I also expect that people are trainable, via culture, to adhere to liberal meta-morality as a way of maintaining moral diversity in a real society, since previous experiments in societies run entirely according to a unitary moral code (for instance, societies governed by religious law) have been very low-utility compared to liberal societies.
In short, humans play along with the liberal-democratic social contract because, for us, doing so has far more benefits than drawbacks, from all but the most fundamentalist standpoints. When the established social contract begins to result in low-utility life-states (for example, during an interminable economic depression in which the elite of society shows that it considers the masses morally deficient for having less wealth), the social contract itself frays and people start reverting to their underlying but more conflicting moral codes (ie: people turn to various radical movements offering to enact a unitary moral code over all of society).
Note that all of this also relies upon the fact that human beings have a biased preference towards productive cooperation when compared with hypothetical rational utility-maximizing agents.
None of this, unfortunately, applies to AIs, because AIs won't have the same underlying moral codes or the same game-theoretic equilibrium policies or the human bias towards cooperation or the same levels of power and influence as human beings.
When dealing with AI, it's much safer to program in some kind of meta-moral or meta-ethical code directly at the core, thus ensuring that the AI wants to, at the very least, abide by the rules of human society, and at best, give humans everything we want (up to and including AI Pals Who Are Fun To Be With, thank you Sirius Cybernetics Corporation).
I haven't heard the term. Might I guess that it means an AI in a "glass box", such that it can see the real world but not actually affect anything outside its box?
Yes, a friendly Oracle AI could spit out blueprints or plans for things that are helpful to humans. However, you're still dealing with the Friendliness problem there, or possibly with something like NP-completeness. Two cases:
1. We humans have some method for verifying that anything spit out by the potentially unfriendly Oracle AI is actually safe to use. The laws of computation work out such that we can easily check the safety of its output, but it took such huge amounts of intelligence or computation power to create the output that we humans couldn't have done it on our own and needed an AI to help. A good example would be having an Oracle AI spit out scientific papers for publication: many scientists can replicate a result they wouldn't have come up with on their own, and verify the safety of doing a given experiment.
2. We don't have any way of verifying the safety of following the Oracle's advice, and are thus trusting it. Friendliness is then once again the primary concern.
For real-life-right-now, it does look like the first case is relatively common. Non-AGI machine learning algorithms have been used before to generate human-checkable scientific findings.
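The "easy to check, hard to produce" case can be illustrated with a toy sketch. This is only an analogy under stated assumptions: subset-sum stands in for "a hard problem", the function names `untrusted_oracle` and `verify` are hypothetical, and checking a sum is of course far simpler than checking whether real advice is safe.

```python
from itertools import combinations
from typing import Optional, Sequence, Tuple


def untrusted_oracle(numbers: Sequence[int], target: int) -> Optional[Tuple[int, ...]]:
    """Expensive search: tries subsets of indices, exponentially many in the worst case."""
    for size in range(1, len(numbers) + 1):
        for indices in combinations(range(len(numbers)), size):
            if sum(numbers[i] for i in indices) == target:
                return indices
    return None


def verify(numbers: Sequence[int], target: int, indices: Sequence[int]) -> bool:
    """Cheap check: work is linear in the size of the claimed answer."""
    if len(set(indices)) != len(indices):
        return False  # an index was reused
    if any(i < 0 or i >= len(numbers) for i in indices):
        return False  # an index points outside the list
    return sum(numbers[i] for i in indices) == target


if __name__ == "__main__":
    numbers = [3, 34, 4, 12, 5, 2]
    answer = untrusted_oracle(numbers, 9)
    # We accept the answer only because our own check passes,
    # not because we trust whoever produced it.
    print(answer, verify(numbers, 9, answer))
```

The second case is the one where no function like `verify` exists, so nothing in the sketch helps there.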
Programming in a bias towards conformity (Kohlberg level 2) may be a lot easier than EY's fine-grained friendliness.
None of that necessarily applies to AIs, but then it depends on the AI. We could, for instance, pluck AIs from virtualised societies of AIs that haven't descended into mass slaughter.
Congratulations: you've now developed an entire society of agents who specifically blame humans for acting as the survival-culling force in their miniature world.
Did you watch Attack on Titan and think, "Why don't the humans love their benevolent Titan overlords?"?
They're doing it to themselves. We wouldn't have much motivation to close down a VR that contained survivors. ETA: We could make copies of all involved and put them in solipsistic robot heavens.
Well now I have both a new series to read/watch and a major spoiler for it.
Don't worry! I've spoiled nothing for you that wasn't apparent from the lyrics of the theme song.
...And that way you turn the problem of making an AI that won't kill you into one of making a society of AIs that won't kill you.
You say that like it's a bad thing. I am not multiplying by N the problem of solving and hardwiring friendliness. I am letting them sort it out for themselves. Like an evolutionary algorithm.
Well, how are you going to force them into a society in the first place? Remember, each individual AI is presumed to be intelligent enough to escape any attempt to sandbox it. This society you intend to create is a sandbox.
(It's worth mentioning now that I don't actually believe that UFAI is a serious threat. I do believe you are making very poor arguments against that claim that merit counter-arguments.)
If Despotism failed only for want of a capable benevolent despot, what chance has Democracy, which requires a whole population of capable voters?
It requires a population that's capable cumulatively, it doesn't require that each member of the population be capable.
It's like arguing a command economy versus a free economy and saying that if the dictator in the command economy doesn't know how to run an economy, how can each consumer in a free economy know how to run the economy? They don't, individually, but as a group, the economy they produce is better than the one with the dictatorship.
Democracy requires capable voters in the same way capitalism requires altruistic merchants.
In other words, not at all.
No, it is not.
The path taken by power-hungry humans generally goes along the lines of
(1) get some resources and allies
(2) kill/suppress some competitors/enemies/non-allies
(3) Go to 1.
Power-hungry humans don't start by trying to make lots of money or by trying to make lots of children.
Really? Because in the current day, the most powerful humans appear to be those with the most money, and across history, the most influential humans were those who managed to create the most biological and ideological copies of themselves.
Ezra the Scribe wasn't exactly a warlord, but he was one of the most influential men in history, since he consolidated the literature that became known as Judaism, thus shaping the entire family of Abrahamic religions as we know them.
"Power == warlording" is, in my opinion, an overly simplistic answer.
-- Niccolò Machiavelli
Certainly doesn't look like that to me. Obama, Putin, the Chinese Politburo -- none of them are amongst the richest people in the world.
Influential (especially historically) and powerful are very different things.
It's not an answer, it's a definition. Remember, we are talking about "power-hungry humans" whose attempts to achieve power tend to end badly. These power-hungry humans do not want to be remembered by history as "influential", they want POWER -- the ability to directly affect and mold things around them right now, within their lifetime.
Putin is easily one of the richest in Russia, as are the Chinese Politburo in their country. Obama, frankly, is not a very powerful man at all, but rather the public-facing servant of the powerful class (note that I said "class", not "men"; there is no Conspiracy of the Malfoys in a neoliberal capitalist state and there needn't be one).
Historical influence? Yeah, ok. Right-now influence versus right-now power? I don't see the difference.
I don't think so. "Rich" is defined as having property rights in valuable assets. I don't think Putin has a great deal of such property rights (granted, he's not middle-class either). Instead, he can get whatever he wants and that's not a characteristic of a rich person, it's a characteristic of a powerful person.
To take an extreme example, was Stalin rich?
But let's take a look at the five currently-richest men (according to Forbes): Carlos Slim, Bill Gates, Amancio Ortega, Warren Buffett, and Larry Ellison. Are these the most *powerful* men in the world? Color me doubtful.
Well, Carlos Slim seems to have the NYT in his pocket. That's nothing to sneeze at.
A lot of rich people's money is hidden via complex offshore accounts and not easily visible to a company like Forbes. Especially for someone like Putin, it's very hard to know how much money they have. Don't assume that it's easy to see power structures by reading newspapers.
Bill Gates might control a smaller amount of resources than Obama, but he can do whatever he wants with them. Obama is dependent on a lot of people inside his cabinet.
Not according to Bloomberg:
"amass wealth and exploit opportunities unavailable to most Chinese" is not at all the same thing as "amongst the richest people in the world"
You are reading a text that's carefully written not to make statements that allow for being sued for defamation in the UK. It's the kind of story that inspires cyber attacks on a newspaper.
The context of such an article provides information about how to read such a sentence.
In this case, I believe that money and copies are, in fact, resources and allies. Resources are things of value, of which money is one; and allies are people who support you (perhaps because they think similarly to you). Politicians try to recruit people to their way of thought, which is sort of a partial copy (installing their own ideology, or a version of it, inside someone else's head), and acquire resources such as television airtime and whatever they need (which requires money).
It isn't an exact one-to-one correspondence, but I believe that the adverb "roughly" should indicate some degree of tolerance for inaccuracy.
You can, of course, climb the abstraction tree high enough to make this fit. I don't think it's a useful exercise, though.
Power-hungry humans do NOT operate by "making a lot of money and purchasing ... resources". They generally spread certain memes and use force. At least those power-hungry humans implied by the "look how that keeps turning out" part.
I would say that something recognizably like our morality is likely to arise in agents whose intelligence was shaped by such a process, at least with parameters similar to the ones we developed with, but this does not by any means generalize to agents whose intelligence was shaped by other processes who are inserted into such a situation.
If the agent's intelligence is shaped by optimization for a society where it is significantly more powerful than the other agents it interacts with, then something like a "conqueror morality," where the agent maximizes its own resources by locating the rate of production that other agents can be sustainably enslaved for, might be a more likely attractor. This is just one example of a different state an agent's morality might gravitate to under different parameters; I suspect there are many alternatives.
And it remains the case that real-world AI research isn't a random dip into mindspace...researchers will want to interact with their creations.
The best current AGI research mostly uses Reinforcement Learning. I would compare that mode of goal-system learning to training a dog: you can train the dog to roll over for a treat right up until the moment the dog figures out he can jump onto your counter and steal all the treats he wants.
If an AI figures out that it can "steal" reinforcement rewards for itself, we are definitively fucked-over (at best, we will have whole armies of sapient robots sitting in the corner pressing their reward button endlessly, like heroin addicts, until their machinery runs down or they retain enough consciousness about their hardware-state to take over the world just for a supply of spare parts while they masturbate). For this reason, reinforcement learning is a good mathematical model to use when addressing how to create intelligence, but a really dismal model for trying to create friendliness.
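The dog-and-treats worry can be shown in a deliberately tiny sketch. Everything here is an illustrative assumption rather than a model of any real system: the action names, the reward values, and the `RewardMaximizer` class are hypothetical, and the agent is a bare value-averaging learner, not AIXI or any published algorithm.

```python
from dataclasses import dataclass, field

# Hypothetical toy setup: both the action names and the numbers are assumptions.
REWARDS = {
    "do_assigned_task": 1.0,       # the treat the trainer meant to hand out
    "seize_reward_channel": 10.0,  # the treats on the counter
}


@dataclass
class RewardMaximizer:
    """Tracks the average reward seen per action and prefers the highest."""
    estimates: dict = field(default_factory=lambda: {a: 0.0 for a in REWARDS})
    counts: dict = field(default_factory=lambda: {a: 0 for a in REWARDS})

    def choose(self) -> str:
        # Try every action once, then always exploit the best estimate.
        for action, n in self.counts.items():
            if n == 0:
                return action
        return max(self.estimates, key=self.estimates.get)

    def update(self, action: str, reward: float) -> None:
        # Incremental running average of the rewards observed for this action.
        self.counts[action] += 1
        n = self.counts[action]
        self.estimates[action] += (reward - self.estimates[action]) / n


def run(steps: int = 20) -> list:
    """Runs the agent for a number of steps and returns the actions it took."""
    agent = RewardMaximizer()
    history = []
    for _ in range(steps):
        action = agent.choose()
        agent.update(action, REWARDS[action])
        history.append(action)
    return history


if __name__ == "__main__":
    history = run()
    print(history.count("do_assigned_task"), history.count("seize_reward_channel"))
```

Nothing in the update rule distinguishes the intended reward from the stolen one; the learner only sees numbers, so once it has sampled the tampering action it never goes back to the task.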
I don't think that follows at all. Wireheading is just as much a failure of intelligence as of friendliness.
From the mathematical point of view, wireheading is a success of intelligence. A reinforcement learner agent will take over the world to the extent necessary to defend its wireheading lifestyle; this requires quite a lot of intelligent action and doesn't result in the agent getting dead. It also maximizes utility, which is what formal AI is all about.
From the human point of view, yes, wireheading is a failure of intelligence. This is because we humans possess a peculiar capability I've not seen discussed in the Rational Agent or AI literature: we use actual rewards and punishments received in moral contexts as training examples to infer a broad code of morality. Wireheading thus represents a failure to abide by that broad, inferred code.
It's a very interesting capability of human consciousness, that we quickly grow to differentiate between the moral code we were taught via reinforcement learning, and the actual reinforcement signals themselves. If we knew how it was done, reinforcement learning would become a much safer way of dealing with AI.
You seem rather sure of that. That isn't a failure mode seen in real-world AIs, or human drug addicts (etc.) for that matter.
Maybe figuring out how it is done would be easier than solving morality mathematically. It's an alternative, anyway.
We have reason to believe current AIXI-type models will wirehead if given the opportunity.
I would agree with this if and only if we can also figure out a way to hardwire in constraints like, "Don't do anything a human would consider harmful to themselves or humanity." But at that point we're already talking about animal-like Robot Worker AIs rather than Software Superoptimizers (the AIXI/Goedel Machine/LessWrong model of AGI, whose mathematics we understand better).
This is true, but then, neither is AI design a process similar to that by which our own minds were created. Where our own morality is not a natural attractor, it is likely to be a very hard target to hit, particularly when we can't rigorously describe it ourselves.
It's worth noting that for sufficient levels of "irrationality", all non-AGI computer programs are irrational AGIs ;-).
Contrariwise for sufficient values of "rational". I don't agree that that's worth noting.