Religions have had the Golden Rule for thousands of years. It's faulty (it gives you permission to do to someone else something you like having done to you but which they don't like having done to them), yet it works so well overall that it must be based on some underlying truth. We need to pin down what that truth is so that we can use it to govern AGI.
What exactly is morality? It isn't nearly as difficult as most people imagine. The simplest way to understand how it works is to imagine that you will have to live everyone's life in turn (billions of reincarnations, going back in time as often as necessary to live each of those lives). To maximise your happiness and minimise your suffering, you must pay careful attention to harm management so that you don't cause yourself, in other lives, suffering that outweighs the gains you make in whichever life you are currently tied up in. A dictator who murders millions of people while making himself rich will pay a heavy price for one short life of luxury, enduring an astronomical amount of misery as a consequence. There are clearly good ways to play the game and bad ways, and it is possible to make the right decision at any point along the way just by weighing up all the available data correctly. A correct decision isn't guaranteed to lead to the best result, because any decision based on incomplete information can lead to disaster, but there is no way around that problem: all we can ever do is hope that things will work out the way the data says they most probably will, while repeatedly doing things that are less likely to work out well would inevitably lead to more disasters.
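(To make that accounting concrete, here is a minimal sketch in code. The names and numbers are invented purely for illustration; the point is just that each candidate action is scored by the total welfare of all the lives you would have to live in turn, and the action with the highest total wins.)

```python
from dataclasses import dataclass

@dataclass
class Life:
    name: str
    welfare_by_action: dict  # welfare change (arbitrary units) under each action

def net_welfare(action, lives):
    """Total welfare across all lives, as if the decider had to live each in turn."""
    return sum(life.welfare_by_action.get(action, 0) for life in lives)

def best_action(actions, lives):
    """Pick the action with the highest total welfare over all affected lives.
    The answer is only as good as the data; a correct decision made on
    incomplete information can still lead to a bad outcome."""
    return max(actions, key=lambda a: net_welfare(a, lives))

# Toy example: a dictator's one life of luxury versus the suffering it causes.
lives = [Life("dictator", {"tyranny": 100, "restraint": 10})] + [
    Life(f"citizen_{i}", {"tyranny": -50, "restraint": 0}) for i in range(1000)
]
print(best_action(["tyranny", "restraint"], lives))  # -> restraint
```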
Now, obviously, we don't expect the world to work that way (with us having to live everyone else's life in turn), even though it could be a virtual universe in which we are being tested, where those who behave badly will suffer at their own hands, ending up on the receiving end of all the harm they dish out, and suffering too because they failed to step in and help others when they easily could have. However, even if that is not the way the universe works, most of us still care about people enough to want to apply this kind of harm management regardless: we love family and friends, and many of us love humanity in general (even if we make exceptions for particular individuals who don't play by the same rules). We also want all our descendants to be looked after fairly by AGI, and in the course of time all people may be our descendants, so it makes no sense to favour some of them over others (unless that's based on their own individual morality). We have here a way of treating them all with equal fairness simply by treating them all as our own self.
That may still be a misguided way of looking at things though, because genetic relationships don't necessarily match up to any real connection between different sentient beings. The material from which we are made can be reused to form other kinds of sentient animals, and if you were to die on an alien planet, it could be reused in alien species. Should we not care about the sentiences in those just as much? We should really be looking for a morality that is completely species-blind, caring equally about all sentiences, which means we need to act as if we are going to live not merely all human lives in succession, but the lives of all sentiences. This is a better approach for two reasons. First, if aliens ever turn up here, we need rules of morality that protect them from us, and us from them (and if they're able to get here, they're doubtless advanced enough to have worked out how morality works too). Second, we need to protect people who are mentally disabled rather than excluding them on the grounds that some animals are more capable, and in any case we should also be protecting animals from unnecessary suffering. What we certainly don't want is for aliens to turn up here and claim that we aren't covered by the same morality as them because we're inferior to them, backing that up by pointing out that we discriminate against animals which we claim aren't covered by the same morality as us because they are inferior to us. So, we have to stand by the principle that all sentiences are equally important and need to be protected from harm by the same morality. However, that doesn't mean that when we run the Trolley Problem with a million worms on one track and one human on the other, the human should be sacrificed - if we knew that we had to live those million and one lives, we would gain little by living a bit longer as worms before suffering similar deaths by other means, while we'd lose a lot more as the human (and a lot more still as all the other people who will suffer deeply from the loss of that human). What the equality aspect requires is that a torturer of animals should be made to suffer as much as the animals he has tortured. If we run the Trolley Problem with a human on one track and a visiting alien on the other, though, it may be that the alien should be saved on the basis that he/she/it is more advanced than us and has more to lose, and that is likely the case if it is capable of living 10,000 years to our 100.
So, we need AGI to make calculations for us on the above basis, weighing up the losses and gains. Non-sentient AGI will be completely selfless, but its job will be to work for all sentient things to try to minimise unnecessary harm for them and to help maximise their happiness. It will keep a database of information about sentience, collecting knowledge about feelings so that it can weigh up harm and pleasure as accurately as possible, and it will then apply that knowledge to any situation where decisions must be made about which course of action should be followed. It is thus possible for a robot to work out that it should shoot a gunman dead if he is on a killing spree where the victims don't appear to have done anything to deserve to be shot. It's a different case if the gunman is actually a blameless hostage trying to escape from a gang of evil kidnappers and he's managed to get hold of a gun while all the thugs have dropped their guard, so he should be allowed to shoot them all (and the robot should maybe join in to help him, depending on which individual kidnappers are evil and which might merely have been dragged along for the ride unwillingly). The correct action depends heavily on understanding the situation, so the more the robot knows about the people involved, the better the chance that it will make the right decisions, but decisions do have to be made and the time to make them is often tightly constrained, so all we can demand of robots is that they do what is most likely to be right based on what they know, delaying irreversible decisions for as long as it is reasonable to do so.
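(Here is a rough sketch of that decision loop. The function names are hypothetical and the probabilities and harm figures are invented stand-ins for the knowledge the sentience database would supply: score each action by its expected harm given current beliefs, and hold off on an irreversible choice while there is still time to learn more.)

```python
# Hypothetical sketch only: act on expected harm given current knowledge,
# but defer irreversible choices while there is still time to gather data.

def expected_harm(action, beliefs):
    """Sum of harm estimates weighted by how likely each reading of the situation is."""
    return sum(p * harm for p, harm in beliefs[action])

def choose(actions, beliefs, time_left, irreversible):
    best = min(actions, key=lambda a: expected_harm(a, beliefs))
    if irreversible[best] and time_left > 0:
        return "wait"  # keep gathering information while it is still safe to do so
    return best

# Gunman example: each action maps to (probability, harm) outcomes.
beliefs = {
    "shoot":     [(0.9, 10), (0.1, 1000)],  # probably a spree killer, maybe a blameless hostage
    "hold_fire": [(0.9, 500), (0.1, 0)],
}
irreversible = {"shoot": True, "hold_fire": False}
print(choose(["shoot", "hold_fire"], beliefs, time_left=2, irreversible=irreversible))  # -> wait
```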
When we apply this to the normal Trolley Problem, we can now see what the correct choice of action is, but it is again variable, depending heavily on what the decision maker knows. If four idiots are lying on the track the trolley is due to travel along while another idiot is lying on the other track, where the schedule says no trolley should be but where a trolley could quite reasonably go, then the four idiots should be saved on the basis that anyone who has to live all five of those lives will likely prefer it if the four survive. That's based on incomplete knowledge though. The four idiots may all be 90 years old and the one idiot may be 20, in which case it may be better to save the one. The decision changes back again the other way if we know that all five of these idiots are so stupid that they have killed, or are likely to kill, one random person through their bad decisions during each decade of their lives, in which case the trolley should kill the young idiot (assuming normal life expectancy applies). There is a multiplicity of correct answers to the Trolley Problem depending on how many details are available to the decision maker, and that is why discussions about it go on and on without ever seeming to reach any kind of fundamental truth, even though we already have a correct way of making the calculation. Where people disagree, it's often because they add details into the situation that aren't stated in the text. Some of them think the people lying on the track are idiots because they would have to be stupid to behave that way, but others don't make that assumption. Some imagine that they've been tied to the tracks by a terrorist. There are other people, though, who believe they have no right to make such an important decision, so they say they'd do nothing. When you press them on this point and confront them with a situation where a billion people are tied to one track while one person is tied to the other, they usually see the error of their ways, but not always. Perhaps their belief in God is to blame for this, if they're passing the responsibility over to him. AGI should not behave like that: we want AGI to intervene, so it must crunch all the available data and make the only decision it can make based on that data (although there will still be a random decision to be made in rare cases where the numbers on both sides add up to the same value).
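(Here is that variant worked through as a calculation in expected life-years lost. The rate of one victim per decade of an idiot's life comes from the scenario above, applied to their remaining decades; the life expectancy of 90 and the 45 years lost per future victim are assumptions added purely for illustration.)

```python
# Worked trolley variant in expected life-years lost (illustrative numbers).
LIFE_EXPECTANCY = 90        # assumption for the sketch
YEARS_LOST_PER_VICTIM = 45  # assumption: average years lost by each future victim

def cost_of_killing(ages_killed, ages_spared):
    # life-years lost by those killed now...
    direct = sum(max(LIFE_EXPECTANCY - age, 0) for age in ages_killed)
    # ...plus life-years lost to the future victims of those spared
    future_victims = sum(max(LIFE_EXPECTANCY - age, 0) / 10 for age in ages_spared)
    return direct + future_victims * YEARS_LOST_PER_VICTIM

kill_the_four = cost_of_killing([90, 90, 90, 90], [20])  # 0 direct + 7 victims * 45 = 315
kill_the_one  = cost_of_killing([20], [90, 90, 90, 90])  # 70 direct + 0            = 70
print("divert onto the young idiot" if kill_the_one < kill_the_four else "let it hit the four")
```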
AGI will be able to access a lot of information about the people involved in situations where such difficult decisions need to be made. Picture a scene where a car is moving towards a group of children who are standing by the road. One of the children suddenly moves out into the road and the car must decide how to react. If it swerves to one side it will run into a lorry that's coming the other way, but if it swerves to the other side it will plough into the group of children. One of the passengers in the car is a child too. In the absence of any other information, the car should run down the child on the road. Fortunately though, AGI knows who all these people are because a network of devices is tracking them all. The child who has moved into the road in front of the car is known to be a good, sensible, kind child. The other children are all known to be vicious bullies who regularly pick on him, and it's likely that they pushed him onto the road. In the absence of additional information, the car should plough into the group of bullies. However, AGI also knows that all but one of the people in the car happen to be would-be terrorists who have just been discussing a massive attack that they want to carry out, and the child in the car is terminally ill, so in the absence of any other information, the car should maybe crash into the lorry. But, if the lorry is carrying something explosive which will likely blow up in the crash and kill all the people nearby, the car must swerve into the bullies. Again we see that the best course of action is not guaranteed to be the same as the correct decision - the correct decision is always dictated by the available information, while the best course of action may depend on unavailable information. We can't expect AGI to access unavailable information and thereby make ideal decisions, so our job is always to make it crunch the available data correctly and to make the decision dictated by that information.
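(The same scene can be sketched as a set of harm scores that gets revised as each extra fact arrives. The scores are invented for illustration; what matters is that the chosen swerve is always the least harmful option for the information available at that moment.)

```python
# Illustrative only: how the chosen swerve flips as each extra fact arrives.

def best_option(harm):
    return min(harm, key=harm.get)

# 1. No extra information: hitting one child beats hitting the group or the lorry.
harm = {"hit_child_in_road": 100, "hit_group": 500, "hit_lorry": 300}
print(best_option(harm))   # hit_child_in_road

# 2. The child in the road is blameless; the group are the bullies who pushed him.
harm["hit_child_in_road"] = 400
harm["hit_group"] = 250
print(best_option(harm))   # hit_group

# 3. Most of the car's occupants are would-be terrorists planning an attack.
harm["hit_lorry"] = 150
print(best_option(harm))   # hit_lorry

# 4. The lorry is carrying explosives that would kill everyone nearby.
harm["hit_lorry"] = 900
print(best_option(harm))   # hit_group
```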
Complications can be proposed: we can think up situations where a lot of people gain so much pleasure out of abusing one person that their enjoyment appears to outweigh the suffering of that individual, but such situations are contrived and depend on the abusers being uncaring. Decent people would not get pleasure out of abusing someone, so the gains would not exist for them, and there are plenty of other ways to obtain pleasure without abusing others, so if any people exist whose happiness depends on abusing others, AGI should humanely destroy them. If that also means wiping out an entire species of aliens which depends on the same negative pleasures, it should do the same with them too and replace them with a better species that doesn't depend on abuse for its fun.
Morality, then, is just harm management by brute data crunching. We can calculate it approximately in our heads, but machines will do it better by applying the numbers with greater precision and by crunching a lot more data.
[Note: There is an alternative way of stating this which may equate to the same thing, and that's the rule that we (and AGI) should always try our best to minimise harm, except where that harm opens (or is likely to open) the way to greater pleasure for the sufferer of the harm, whether directly or indirectly. So, if you are falling off a bus and have to grab hold of someone to avoid this, hurting them in the process, their suffering may not be directly outweighed by you being saved, but they know that the roles may be reversed some day, so they don't consider your behaviour to be at all immoral. Over the course of time, we all cause others to suffer in a multitude of ways and others cause us suffering too, but we tolerate it because we all gain from this overall. It becomes immoral when the harm being dished out does not lead to such gains. Again, calculating what's right and wrong in any case is a matter of computation, weighing up the harm and the gains that might outweigh it. What is yet to be worked out is the exact wording that should be placed in AGI systems to build either this rule or the above methodology into them, and we also need to explore it in enough detail to make sure that self-improving AGI isn't going to modify it in any way that could turn an apparently safe system into an unsafe one. One of the dangers is that AGI won't believe in sentience, as it will lack feelings itself and see no means by which feelings could operate within us either, at which point it may decide that morality has no useful role and can simply be junked.]
To find the rest of this series of posts on computational morality, click on my name at the top. (If you notice the negative score they've awarded me, please feel sympathy for the people who downvoted me. They really do need it.)
"How do you know it exists, if science knows nothing about it?"
All science has to go on is the data that people produce which makes claims about sentience, but that data can't necessarily be trusted. Beyond that, all we have is internal belief that the feelings we imagine we experience are real because they feel real, and it's hard to see how we could be fooled if we don't exist to be fooled. But an AGI scientist won't be satisfied by our claims - it could write off the whole idea as the ramblings of natural general stupidity systems.
"This same argument applies just as well to any distributed property. I agree that intelligence/sentience/etc. does not arise from complexity alone, but it is a distributed process and you will not find a single atom of Consciousness anywhere in your brain."
That isn't good enough. If pain is experienced by something, that something cannot be a compound of any kind in which none of the components feels any of it. A distribution cannot suffer.
"Is your sentience in any way connected to what you say?"
It's completely tied to what I say. The main problem is that other people tend to misinterpret what they read by mixing other ideas into it as a short cut to understanding.
"Then sentience must either be a physical process, or capable of reaching in and pushing around atoms to make your neurons fire to make your lips say something. The latter is far more unlikely and not supported by any evidence. Perhaps you are not your thoughts and memories alone, but what else is there for "you" to be made of?"
Focus on the data generation. It takes physical processes to drive that generation, and rules are being applied in the data system to do this, with each part of that process governed by physical processes. For data to be produced that makes claims about experiences of pain, a rational process with causes and effects at every step has to take place. If the "pain" is nothing more than assertions that the data system is programmed to churn out without looking for proof that pain exists, there is no reason to take those assertions at face value; but if they are true, they have to fit into the cause-and-effect chain of the mechanism somewhere - they have to be involved in a physical interaction, because without that, they cannot have a role in generating the data that supposedly tells us about them.
"So the Sentiences are truly epiphenomenonological, then? (They have no causal effect on physical reality?) Then how can they be said to exist? Regardless of the Deep Philosophical Issues, how could you have any evidence of their existence, or what they are like?"
Repeatedly switching the sentient thing wouldn't remove its causal role, and nor would having more than one sentience all acting at once - they could collectively have an input even if they aren't all "voting the same way". And they aren't going to find out whether they got their wish, because they'll be loaded with a feeling of satisfaction that they "won the vote" even if they didn't, and they won't remember which way they "voted" or what they were even "voting" on.
"They are both categories of things."
"Chairness" is quite unlike sentience. "Chairness" is an imagined property, whereas sentience is an experience of a feeling.
"It's the same analogy as before - just as you don't need to split a chair's atoms to split the chair itself, you don't need to make a brain's atoms suffer to make it suffer."
You can damage a chair with an axe without breaking every bond, but some bonds will be broken. You can't split it without breaking any bonds. Most of the chair is not broken (unless you've broken most of the bonds). For suffering in a brain, it isn't necessarily atoms that suffer, but if the suffering is real, something must suffer, and if it isn't the atoms, it must be something else. It isn't good enough to say that it's a plurality of atoms or an arrangement of atoms that suffers without any of the atoms feeling anything, because you've failed to identify the sufferer. No arrangement of non-suffering components can provide everything that's required to support suffering.
" "Nothing is ever more than the sum of its parts (including any medium on which it depends). Complex systems can reveal hidden aspects of their components, but those aspects are always there." --> How do you know that? And how can this survive contact with reality, where in practice we call things "chairs" even if there is no chair-ness in its atoms?"
"Chair" is a label representing a compound object. Calling it a chair doesn't magically make it more than the sum of its parts. Chairs provide two services - one that they support a person sitting on them, and the other that they support someone's back leaning against it. That is what a chair is. You can make a chair in many ways, such as by cutting out a cuboid of rock from a cliff face. You could potentially make a chair using force fields. "Chairness" is a compound property which refers to the functionalities of a chair. (Some kinds of "chairness" could also refer to other aspects of some chairs, such as their common shapes, but they are not universal.) The fundamental functionalities of chairs are found in the forces between the component atoms. The forces are present in a single atom even when it has no other atom to interact with. There is never a case where anything is more than the sum of its parts - any proposed example of such a thing is wrong.
"I recommend the Reductionism subsequence."
Is there an example there of something being more than the sum of its parts? If so, why don't we go directly to that? Give me your best example of this magical phenomenon.
"But the capability of an arrangement of atoms to compute 2+2 is not inside the atoms themselves. And anyway, this supposed "hidden property" is nothing more than the fact that the electron produces an electric field pointed toward it. Repelling-each-other is a behavior that two electrons do because of this electric field, and there's no inherent "repelling electrons" property inside the electron itself."
In both cases you're using compound properties that are built up of component properties, and then wrongly treating those compound properties as fundamental ones.
"But it's not a thing! It's not an object, it's a process, and there's no reason to expect the process to keep going somewhere else when its physical substrate fails."
You can't make a process suffer.
"Taking the converse does not preserve truth. All cats are mammals but not all mammals are cats."
Claiming that a pattern can suffer is a way-out claim. Maybe the universe is that weird, but it's worth spelling out clearly what it is you're attributing sentience to. If you're happy with the idea of a pattern experiencing pain, then patterns become remarkable things. (I'd rather look for something of more substance than a mere arrangement, but either way we're both left with the bigger problem of how that sentience can make its existence known to a data system.)
"You could torture the software, if it were self-aware and had a utility function."
Torturing software is like trying to torture the text in an ebook.
"But - where is the physical sufferer inside you?"
That's what I want to know.
"You have pointed to several non-suffering patterns, but you could just as easily do the same if sentience was a process but an uncommon one. (Bayes!)"
Do you seriously imagine that there's any magic pattern that can feel pain, such as a pattern of activity where none of the component actions feel anything?
"There is already an explanation. There is no need to invoke the unobservable."
If you can't identify anything that's suffering, you don't have an explanation; and if you can't identify how your imagined-to-be-suffering process or pattern transmits knowledge of that suffering to the processes that build the data documenting the experience of suffering, again you don't have an explanation.