Human Minds are Fragile
We are familiar with the thesis that Value is Fragile. This is why we are researching how to impart values to an AGI.
Embedded Minds are Fragile
Besides values, it may be worth remembering that human minds too are very fragile.
A little magnetic tampering with your amygdalae, and suddenly you are a wannabe serial killer. A small dose of LSD can get you to believe you can fly, or that the world will end in four hours. Remove part of your ventromedial prefrontal cortex, and suddenly you are so utilitarian that even Joshua Greene would call you a psycho.
It takes very little material change to substantially modify a human being's behavior. The same holds for other animals with embedded brains, crafted by evolution and made of squishy matter modulated by glands and molecular gates.
A Problem for Paul-Boxing and CEV?
One assumption underlying Paul-Boxing and CEV is that:
It is easier to specify and simulate a human-like mind than to impart values to an AGI by teaching it values directly via code or human language.
We usually assume this because, as we know, value is fragile. But so are embedded minds. Very little tampering is required to profoundly transform people's moral intuitions. A large fraction of the inmate population in the US has frontal lobe or amygdala malfunctions.
Finding the simplest description of a human brain that, when simulated, continues to act as that human brain would act in the real world may turn out to be as fragile as, or even more fragile than, concept learning for AGIs.
... And Everyone Loses Their Minds
Christopher Nolan's Joker is a very clever guy, almost Monroesque in his ability to identify hypocrisy and inconsistency. In one of his most interesting scenes in the film, he points out how people judge horrible things differently depending on whether they're part of what's "normal" and "expected", rather than on how inherently horrifying they are or how many people are involved.
People soon extrapolated this observation to other apparent inconsistencies in human judgment, where a behaviour that was once acceptable becomes, with a simple tweak or change in context, the subject of a much more serious reaction.
I think there's rationalist merit in giving these inconsistencies a serious look. I intuit that there's some sort of underlying pattern to them, something that makes psychological sense, in the roundabout way that most irrational things do. I think that much good could come out of figuring out what that root cause is, and how to predict this effect and manage it.
Phenomena that come to mind include, from an Effective Altruism point of view, the expenses incurred in counter-terrorism (including some wars that were very expensive in treasure and lives) and the number of lives those expenses save, compared with the number of lives that could be saved by spending the same amount on improving road safety, increasing public healthcare spending where it would do the most good, building better lightning rods (in the USA you're four times more likely to be struck by lightning than by terrorists), or legalizing drugs.
What do y'all think? Why do people have their priorities all jumbled up? How can we predict these effects? How can we work around them?
Other minds and bats: the vampire Turing test
Thoughts inspired by Yvain's philosophical role-playing post.
Thomas Nagel wrote a famous philosophical essay, "What Is It Like to Be a Bat?" In it, he argued that a reductionist understanding of consciousness is insufficient, since there exist beings - bats - that have conscious experiences that humans cannot understand. We cannot know what "it is like to be a bat", and looking reductively at bat brains, bat neurones, or the laws of physics cannot (allegedly) grant us any understanding of this subjective experience. Therefore there remains an unavoidable subjective component to the problem of consciousness.
I won't address this issue directly (see for instance this, on the closely related subject of qualia), but instead look at the question: suppose someone told us that they actually knew what it was like to be a bat (as well as what it was like to be a human). Call such a being a vampire, for obvious reasons. So if someone claimed they were a vampire, how would we test this?
We can't simply ask them to describe what it's like to be a bat - it's perfectly possible they know what it's like to be a bat but cannot describe it in human terms (just as we often fail to describe certain types of experiences to those who haven't experienced them). Could we run a sort of Turing test - maybe implant the putative vampire's brain into a bat body and see how bat-like it behaves? But, as Nagel pointed out, this could be a test of whether they know how to behave like a bat, not whether they know what it's like to be a bat.
I posit that one possible solution is to use the approach laid out in my post "the flawed Turing test". We need to pay attention to how the "vampire" got their knowledge. If the vampire is a renowned expert on bat behaviour and social interactions, who is also interested in sonar and paragliding, then their functioning as a bat is weak evidence that they actually know what it is like to be a bat. But suppose instead that their knowledge comes from another source - maybe the vampire is a renowned brain expert who has grappled with the philosophy of mind and spent many years examining the functioning of bat brains. But, crucially, they have never seen a full living bat in the wild or in the lab, they've never watched a nature documentary on bats, they've never even seen a photo of a bat. In that case, if they behave correctly when transplanted into a bat body, that is strong evidence that they actually understand what it's like to be a bat.
Similarly, maybe they got their knowledge after a long conversation with another "vampire". We have the recording of the conversation, and it's all about mental states, imagery, emotional descriptions and visualisation exercises - but not about physical descriptions or bat behaviour. In that case, as above, if they can function successfully as a bat, this is evidence of them really "getting it".
In summary, we can say "that person likely knows what it is like to be a bat" if "knowing what it's like to be a bat" is the most likely explanation for what we see. If they behave exactly like a bat when in a bat body, and we know they have no prior experience that teaches them how to behave like a bat (but a lot about the bat's mental states), then we can conclude that it's likely that they genuinely know what it's like to be a bat, and are implementing this knowledge, rather than imitating behaviour.