“Misbehaving Machines: The Emulated Brains of Transhumanist Dreams”, by Corry Shores (grad student; Twitter, blog) is another recent JET paper. Abstract:
Enhancement technologies may someday grant us capacities far beyond what we now consider humanly possible. Nick Bostrom and Anders Sandberg suggest that we might survive the deaths of our physical bodies by living as computer emulations. In 2008, they issued a report, or “roadmap,” from a conference where experts in all relevant fields collaborated to determine the path to “whole brain emulation.” Advancing this technology could also aid philosophical research. Their “roadmap” defends certain philosophical assumptions required for this technology’s success, so by determining the reasons why it succeeds or fails, we can obtain empirical data for philosophical debates regarding our mind and selfhood. The scope ranges widely, so I merely survey some possibilities, namely, I argue that this technology could help us determine
- if the mind is an emergent phenomenon,
- if analog technology is necessary for brain emulation, and
- if neural randomness is so wild that a complete emulation is impossible.
Whole brain emulation succeeds if it merely replicates human neural functioning. Yet for Nick Bostrom and Anders Sandberg, its success increases when it perfectly replicates a specific person’s brain…In 2008, Nick Bostrom and Anders Sandberg compiled the findings from a conference of philosophers, technicians and other experts who had gathered to formulate a “roadmap” of the individual steps and requirements that could plausibly develop this technology…As I proceed, I will look more closely at these philosophical assumptions individually. For now let it suffice to say that I will adopt the basic framework of their philosophy of mind.
…I will explore research that calls into question certain other ones. For example, although the authors diminish the importance of analog computation and noise interference, there are findings and compelling arguments that suggest otherwise. As well, there is reason to think that the brain’s computational dynamics would not call for Bostrom’s and Sandberg’s hierarchical model for the mind’s emergence. And finally, I will argue on these bases that if brain emulation were to be carried out to its ultimate end of replicating some specific person’s mind, the resulting replica would still over time develop divergently from its original.
Moravec believes that our minds can be transferred this way, because he does not adopt what he calls the body-identity position, which holds that the human individual can only be preserved if the continuity of its “body stuff” is maintained. He proposes instead what he terms the pattern-identity theory, which defines the essence of personhood as “the pattern and the process going on in my head and body, not the machinery supporting that process. If the process is preserved, I am preserved. The rest is mere jelly” (Moravec 1988, 108–109). He explains that over the course of our lives, our bodies regenerate themselves
…For N. Katherine Hayles, Moravec’s description of mind transfer is a nightmare. She observes that mental uploading presupposes a cybernetic concept. Our selfhood extends into intersubjective systems lying beyond our body’s bounds (Hayles 1999, 2). For example, Picasso in a sense places himself into his paintings, and then they reflect and communicate his identity to other selves. This could have been more fully accomplished if we precisely emulated his brain processes…These thinkers whom Krueger refers to as posthumanists would like to overcome the realms of matter and corporeality in which the body resides so as to enter into a pure mental sphere that secures their immortality. They propose that the human mind be “scanned as a perfect simulation” so it may continue forever inside computer hardware (Krueger 2005, 77). In fact, Krueger explains, because posthumanist philosophy seeks the annihilation of biological evolution in favor of computer and machine evolution, their philosophy necessitates there be an immortal existence, and hence, “the idea of uploading human beings into an absolute virtual existence inside the storage of a computer takes the center stage of the posthumanist philosophy” (Krueger 2005, 80).
Bostrom and Sandberg do not favor Moravec’s “invasive” sort of mind replication that involves surgery and the destruction of brain tissue (Bostrom and Sandberg 2008, 27). They propose instead whole brain emulation. To emulate someone’s neural patterns, we first scan a particular brain to obtain precise detail of its structures and their interactions. Using this data, we program an emulation that will behave essentially the same as the original brain…The emulation will mimic the human brain’s functioning on the cellular level, and then automatically, higher and higher orders of organization should spontaneously arise. Finally human consciousness might emerge at the highest level of organization.
…There are various levels of successfully attaining a functionally isomorphic mind, beginning with a simple “parts list” of the brain’s components along with the ways they interact. Yet, the highest levels are the most philosophically interesting, write Bostrom and Sandberg. When the technology achieves individual brain emulation, it produces emergent activity characteristic of that of one particular (fully functioning) brain. It is more similar to the activity of the original brain than any other brain. The highest form is a personal identity emulation: “a continuation of the original mind; either as numerically the same person, or as a surviving continuer thereof,” and we achieve such an emulation when it becomes rationally self-concerned for the brain it emulates (Bostrom and Sandberg 2008, 11).
Bostrom’s and Sandberg’s “Roadmap” presupposes a physicalist standpoint…Bostrom and Sandberg write that “sufficient apparent success with [whole brain emulation] would provide persuasive evidence for multiple realizability” (Bostrom and Sandberg 2008, 14).
…Our minds emerge from the complex dynamic pattern of all our neurons communicating and computing in parallel. Roger Sperry offers compelling evidence. There are “split brain” patients whose right and left brain hemispheres are disconnected from one another, and nonetheless, they have maintained unified consciousness. However, there is no good account for this on the basis of neurological activity, because there is no longer normal communication between the two brain-halves (Clayton 2006, 20). For this reason, Sperry concludes that mental phenomena are emergent properties that “govern the flow of nerve impulse traffic.” According to Sperry, “Individual nerve impulses and other excitatory components of a cerebral activity pattern are simply carried along or shunted this way and that by the prevailing overall dynamics of the whole active process” (Sperry quoted in Clayton 2006, 20). Yet it works the other way as well:
The conscious properties of cerebral patterns are directly dependent on the action of the component neural elements. Thus, a mutual interdependence is recognized between the sustaining physico-chemical processes and the enveloping conscious qualities. The neurophysiology, in other words, controls the mental effects, and the mental properties in turn control the neurophysiology. (Sperry quoted in Clayton 2006, 20)
…Now let’s suppose that whole brain emulation continually fails to produce emergent mental phenomena, despite having developed incredible computational resources for doing so. This might lead us to favor Todd Feinberg’s argument that the mind does not emerge from the brain to a higher order. He builds his argument in part upon Searle’s distinction between two varieties of conscious emergence. Searle first has us consider a system made of a set of components, for example, a rock made up of a conglomerate of molecules. The rock will have features not found in any individual molecule; its weight of ten pounds is not found entirely in any molecular part. However, we can deduce or calculate the weight of the rock on the basis of the weights of its molecules. Yet, what about the solidity of the rock? This is an example of an emergent property that can be explained only in terms of the interactions among the elements (Searle 1992, 111). Consciousness, he argues, is an emergent property based on the interactions of neurons, but he disputes a more “adventurous conception,” which holds that emergent consciousness has capacities not explainable on the basis of the neurons’ interactivity: “the naïve idea here is that consciousness gets squirted out by the behaviour of the neurons in the brain, but once it has been squirted out, then it has a life of its own” (Searle 1992, 112). Feinberg will build from Searle’s position in order to argue for a non-hierarchical conception of mental emergence. So while Feinberg does in fact think consciousness results from the interaction of many complex layers of neural organization, no level emerges to a superior status. He offers the example of visual recognition and has us consider when we recognize our grandmother. One broad layer of neurons transmits information about the whole visual field. Another more selective layer picks-out lines. Then an even narrower layer detects shapes. 
Finally the information arrives at the “grandmother cell,” which only fires when she is the one we see. But this does not make the grandmother cell emergently higher. Rather, all the neural layers of organization must work together simultaneously to achieve this recognition. The brain is a vast network of interconnected circuits, so we cannot say that any layer of organization emerges over-and-above the others (Feinberg 2001, 130–31)…Nonetheless, his objection may still be problematic for whole brain emulation, because Bostrom and Sandberg write:
An important hypothesis for [whole brain emulation] is that in order to emulate the brain we do not need to understand the whole system, but rather we just need a database containing all necessary low-level information about the brain and knowledge of the local update rules that change brain states from moment to moment. (Bostrom and Sandberg 2008, 8)
…If consciousness emerges from neural activity, perhaps it does so in a way that is not perfectly suited to the sort of emergentism that Bostrom and Sandberg use in their roadmap. Hence, pursuing the development of whole brain emulation might provide evidence indicating whether and how our minds relate to our brains.
…One notable advantage of analog is its “density” (Goodman 1968, 160–161). Between any two variables can be found another, but digital variables will always have gaps between them. For this reason, analog can compute an infinity of different values found within a finite range, while digital will always be missing variables between its units. In fact, Hava Siegelmann argues that analog is capable of a hyper-computation that no digital computer could possibly accomplish (Siegelmann 2003, 109).
…A “relevant property” of an audiophile’s brain is its ability to discern analog from digital, and prefer one to the other. However, a digital emulation of the audiophile’s brain might not be able to share its appreciation for analog, and also, perhaps digital emulations might even produce a mental awareness quite foreign to what humans normally experience. Bostrom’s and Sandberg’s brain emulation exclusively uses digital computation. Yet, they acknowledge that some argue analog and digital are qualitatively different, and they even admit that implementing analog in brain emulation could present profound difficulties (Bostrom and Sandberg 2008, 39). Nonetheless, they think there is no need to worry.
They first argue that brains are made of discrete atoms that must obey quantum mechanical rules, which force the atoms into discrete energy states. Moreover, these states could be limited by a discrete time-space (Bostrom and Sandberg 2008, 38). Although I am unable to comment on issues of quantum physics, let’s presume for argument’s sake that the world is fundamentally made-up of discrete parts. Bostrom and Sandberg also say that whole brain emulation’s development would be profoundly hindered if quantum computation were needed to compute such incredibly tiny variations (Bostrom and Sandberg 2008, 39); however, this is where analog now already has the edge (Siegelmann 2003, 111).
Yet their next argument calls even that notion into question. They pose what is called “the argument from noise.” Analog devices always take some physical form, and it is unavoidable that interferences and irregularities, called noise, will make the analog device imprecise. So analog might be capable of taking on an infinite range of variations; however, it will never be absolutely accurate, because noise always causes it to veer-off slightly from where it should be…However, soon the magnitude between digital’s smallest values will equal the magnitude that analog veers away from its proper course. Digital’s blind spots would then be no greater than analog’s smallest inaccuracies. So, we only need to wait for digital technology to improve enough so that it can compute the same values with equivalent precision. Both will be equally inaccurate, but for fundamentally different reasons.
…Note first that our nervous system’s electrical signals are discrete pulses, like Morse code. In that sense they are digital. However, the frequency of the pulses can vary continuously (Jackendoff 1987, 33); for, the interval between two impulses may take any value (Müller et al. 1995, 5). This applies as well to our sense signals: as the stimulus varies continuously, the signal’s frequency and voltage changes proportionally (Marieb and Hoehn 2007, 401). As well, there are many other neural quantities that are analog in this way. Recent research suggests that the signal’s amplitude is also graded and hence is analog (McCormick et al. 2006, 761). Also consider that our brains learn by adjusting the “weight” or computational significance of certain signal channels. A neuron’s signal-inputs are summed, and when it reaches a specific threshold, the neuron fires its own signal, which then travels to other neurons where the process is repeated. Another way the neurons adapt is by altering this input threshold. Both these adjustments may take on a continuous range of values; hence analog computation seems fundamental to learning (Mead 1989, 353–54).
At this point, Shores summarizes an argument from Fred Dretske which seems to me so blatantly false that it could not possibly be what Dretske meant, so I will refrain from excerpting it and passing on my own misconceptions. Continuing on:
…and yet like Bostrom and Sandberg, Schonbein critiques analog using the argument from noise (Schonbein 2005, 60). He says that analog computers are more powerful only in theory, but as soon as we build them, noise from the physical environment diminishes their accuracy (Schonbein 2005, 65–66). Curiously, he concludes that we should not for that reason dismiss analog but instead claims that analog neural networks, “while not offering greater computational power, may nonetheless offer something else” (2005, 68). However, he leaves it for another effort to say exactly what might be the unique value of analog computation.
A.F. Murray’s research on neural-network learning supplies an answer: analog noise interference is significantly more effective than digital at aiding adaptation, because being “wrong” allows neurons to explore new possibilities for weights and connections (Murray 1991, 1547). This enables us to learn and adapt to a chaotically changing environment. So using digitally-simulated neural noise might be inadequate. Analog is better, because it affords our neurons an infinite array of alternate configurations (1991, 1547–1548). Hence in response to Bostrom’s and Sandberg’s argument from noise, I propose this argument for noise. Analog’s inaccuracies take the form of continuous variation, and in my view, this is precisely what makes it necessary for whole brain emulation.
This is worth examining in more depth. The citation Murray 1991 is “Analogue noise-enhanced learning in neural network circuits”; the PDF is not online and the abstract is not particularly helpful:
Experiments are reported which demonstrate that, whereas digital inaccuracy in neural arithmetic, in the form of bit-length limitation, degrades neural learning, analogue noise enhances it dramatically. The classification task chosen is that of vowel recognition within a multilayer perceptron network, but the findings seem to be perfectly general in the neural context, and have ramifications for all learning processes where weights evolve incrementally, and slowly.
Fortunately, Shores has excerpted key bits on his blog in “Deleuze’s & Guattari’s Neurophysiology and Neurocomputation”; I will reproduce them below:
Analogue techniques allow the essential neural functions of addition and multiplication to be mapped elegantly into small analogue circuits. The compactness introduced allows the use of massively parallel arrays of such operators, but analogue circuits are noise-prone. This is widely held to be tolerable during neural computation, but not during learning. In arriving at this conclusion, parallels are drawn between analogue ‘noise’ uncertainty, and digital inaccuracy, limited by bit length. This, coupled with dogma which holds that high (16 bit) accuracy is needed during neural learning, has discouraged attempts to develop analogue learning circuitry, although some recent work on hybrid analogue/digital systems suggests that learning is possible, if perhaps not optimal, in low (digital) precision networks. In this Letter results are presented which demonstrate that analogue noise, far from impeding neural learning, actually enhances it. (Murray 1546.1a, emphasis mine)
Learning is clearly enhanced by the presence of noise at a high level (around 20%) on both synaptic weights and activities. This result is surprising, in light of the normal assertion alluded to above that back propagation requires up to 16 bit precision during learning. The distinction is that digital inaccuracy, determined by the significance of the least significant bit (LSB), implies that the smallest possible weight change during learning is 1 LSB. Analogue inaccuracy is, however, fundamentally different, in being noise-limited. In principle, infinitesimally small weight changes can be made, and the inaccuracy takes the form of a spread of ‘actual’ values of that weight as noise enters the forward pass. The underlying ‘accurate’ weight does, however, maintain its accuracy as a time average, and the learning process is sufficiently slow to effectively ‘see through’ the noise in an analogue system. (1547.2.bc, emphasis mine)
The further implication is that drawing parallels in this context between digital inaccuracy and analogue noise is extremely misleading. The former imposes constraints (quantisation) on allowable weight values, whereas the latter merely smears a continuum of allowable weight values. (1547–1548, emphasis mine)
While it’s hard to say without the full paper, this seems to be the same analogue argument as before: analogue supposedly gives a system more degrees of freedom and can answer more questions about similarly free systems, and the brain may be such a free system. Parts of the quotes undermine the idea that analogue offers any additional power in concrete practice (“the learning process is sufficiently slow to effectively ‘see through’ the noise in an analogue system”), and extending this to brains is unwarranted for the same anti-analogue reasons as before: in a quantized universe, you only need more bits to get as much precision as exists.
In the paper, the neural network apparently uses 8-bit or 16-bit words; perhaps 32 bits would be enough to reach the point where the bit-length quantization is only as bad as the analogue noise, and if not, then perhaps 64 bits (now standard on commodity computers in 2011) or 128 bits would be enough (commodity computers already have 128-bit special-purpose vector registers, and past and present architectures have used 128-bit lengths).
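The contrast between quantisation and noise, and the "just add more bits" reply, can be made concrete with a toy sketch. This is not Murray's experiment: the [0, 1] weight range, the update size of 1e-4, the noise level, and the names `quantize`, `train_quantized`, and `noisy_read` are all assumptions chosen purely for illustration.

```python
import random

def quantize(x, bits):
    """Round x to the nearest level of a uniform grid with 2**bits levels on [0, 1]."""
    levels = 2**bits - 1  # number of steps (LSBs) between 0 and 1
    return round(x * levels) / levels

def train_quantized(bits, update=1e-4, steps=1000):
    """Apply many small weight updates, re-quantizing after each one."""
    w = 0.0
    for _ in range(steps):
        w = quantize(w + update, bits)
    return w

def noisy_read(w, rng, sigma=0.02):
    """Hypothetical analogue read: the stored weight plus Gaussian noise."""
    return w + rng.gauss(0.0, sigma)

w8 = train_quantized(8)    # 1 LSB = 1/255, far larger than each update
w32 = train_quantized(32)  # 1 LSB = 1/(2**32 - 1), far smaller than each update

# Analogue model (an assumption): the stored weight accumulates exactly,
# and only the reads are noisy.
rng = random.Random(0)
w_analogue = sum(1e-4 for _ in range(1000))
mean_read = sum(noisy_read(w_analogue, rng) for _ in range(10_000)) / 10_000

print(w8, w32, mean_read)
```

In this toy setup the 8-bit weight never leaves zero because every update is rounded away, while the 32-bit weight and the time-averaged noisy reads both end up close to the intended total of 0.1, which is the sense in which a longer word length might close the gap.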
Neural noise can result from external interferences like magnetic fields or from internal random fluctuations (Ward 2002, 116–117). According to Steven Rose, our brain is an “uncertain” system on account of “random, indeterminate, and probabilistic” events that are essential to its functioning (Rose 1976, 93). Alex Pouget and his research team recently found that the mind’s ability to compute complex calculations has much to do with its noise. Our neurons transmit varying signal-patterns even for the same stimulus (Pouget et al. 2006, 356), which allows us to probabilistically estimate margins of error when making split-second decisions, as for example when deciding what to do if our brakes fail as we speed toward a busy intersection (Pouget et al. 2008, 1142). Hence the brain’s noisy irregularities seem to be one reason that it is such a powerful and effective computer.
Some also theorize that noise is essential to the human brain’s creativity. Johnson-Laird claims that creative mental processes are never predictable (Johnson-Laird 1987, 256). On this basis, he suggests that one way to make computers think creatively would be to have them alter their own functioning by submitting their own programs to artificially-generated random variations (Johnson-Laird 1993, 119–120). This would produce what Ben Goertzel refers to as “a complex combination of random chance with strict, deterministic rules” (Goertzel 1994, 119). According to Daniel Dennett, this indeterminism is precisely what endows us with what we call free will (Dartnall 1994, 37). Likewise, Bostrom and Sandberg suggest we introduce random noise into our emulation by using pseudo-random number generators. They are not truly random, because eventually the pattern will repeat. However, if it takes a very long time before the repetitions appear, then probably it would be sufficiently close to real randomness.
…Lawrence Ward reviews findings that suggest we may characterize our neural irregularities as pink noise, which is also called 1/f noise (Ward 2002, 145–153). Benoit Mandelbrot classifies such 1/f noise as what he terms “wild randomness” and “wild variation” (Mandelbrot and Hudson 2004, 39–41). This sort of random might not be so easily simulated, and Mandelbrot gives two reasons for this.
- In wild randomness, there are events that defy the normal random distribution of the bell curve. He cites a number of stock market events that are astronomically improbable, even though such occurrences in fact happen quite frequently in natural systems despite their seeming impossibility. There is no way to predict when they will happen or how drastic they will be (Mandelbrot and Hudson 2004, 4). And
- each event is random and yet it is not independent from the rest, like each toss of a coin is.
The pink noise point seems entirely redundant with the previous 2 paragraphs. If PRNGs are adequate for the kinds of noise discussed there, they are adequate for pink noise, which is merely one of many distributions statisticians employ all the time besides the ‘normal distribution’. As well, PRNGs are something of a red herring here: genuine quantum random-number generators for computers are old school, and other kinds of hardware can produce staggering quantities of randomness. (I read a few years ago of laser RNGs producing a gigabyte or two per second.)
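As a rough illustration that an ordinary seeded PRNG can drive correlated, non-white noise, here is a sketch of the Voss-McCartney scheme, which sums several random rows refreshed at halving rates and only approximates a 1/f spectrum over a limited band. The function names, the row count, and the sample sizes are assumptions for the example, and nothing here is taken from Bostrom and Sandberg's roadmap.

```python
import random

def voss_pink_noise(n_samples, n_rows=8, seed=0):
    """Approximate 1/f noise by summing rows refreshed at halving rates."""
    rng = random.Random(seed)
    rows = [rng.uniform(-1.0, 1.0) for _ in range(n_rows)]
    samples = []
    for i in range(1, n_samples + 1):
        k = (i & -i).bit_length() - 1  # number of trailing zero bits of i
        if k < n_rows:
            # Row k is refreshed once every 2**(k+1) steps.
            rows[k] = rng.uniform(-1.0, 1.0)
        samples.append(sum(rows))
    return samples

def lag1_autocorrelation(xs):
    """Sample correlation between each value and its successor."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs)
    cov = sum((a - mean) * (b - mean) for a, b in zip(xs, xs[1:]))
    return cov / var

pink = voss_pink_noise(4096, seed=1)
white_rng = random.Random(1)
white = [white_rng.uniform(-1.0, 1.0) for _ in range(4096)]

print(lag1_autocorrelation(pink), lag1_autocorrelation(white))
```

Successive pink samples are strongly correlated, unlike coin tosses or the white-noise baseline, yet the whole sequence is reproduced exactly from the seed.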
The problem is that the brain’s 1/f noise is wildly random. So suppose we emulate some person’s brain perfectly, and suppose further that the original person and her emulation identify so much that they cannot distinguish themselves from one another. Yet, if both minds are subject to wild variations, then their consciousness and identity might come to differ more than just slightly. They could even veer off wildly. So, to successfully emulate a brain, we might need to emulate this wild neural randomness. However, that seems to remove the possibility that the emulation will continue on as the original person. Perhaps our very effort to emulate a specific human brain results in our producing an entirely different person altogether.
If the inputs are the same to a perfect emulation, the outputs will be the same by definition. This is just retreading the old question of divergence; 1/f noise adds nothing new to the question. If there is a difference in the emulation or in the two inputs, of course there may be arbitrarily small or large differences in outputs. This is easy to see with simple thought-experiments that owe nothing to noise.
Imagine that the brain is completely devoid of chaos or noise or anything previously suggested. We can still produce arbitrarily large divergences based on arbitrarily small differences, right down to a single bit. Here’s an example thought-experiment: the subject resolves, before an uploading procedure, that he will recall a certain memory in which he looks at a bit, and if the bit is 1 he will become a Muslim and if a 0 he will become an atheist. He is uploaded, and the procedure goes perfectly except for one particular bit, which just happens to be the very bit stored in that memory; the original and the upload then, per their resolution, examine their memories and become an atheist and a Muslim. One proceeds to blow himself up in the local mall and the other spends its time ranting online about the idiocy of theism. Quite a divergence, but one can imagine greater divergences if one must. Now, are they the same people after carrying out their resolution? Or different? An answer to this would seem to cover noise just as well.
And let’s not forget the broader picture. We obviously want the upload to initially be as close as possible to the original. But the absence of any eventual difference would completely eliminate the desirability of uploads: even if we mirror the upload and original in lockstep, what do we do with the upload when the original dies? Faithfully emulate the process of dying and then erase it, to preserve the moment-to-moment isomorphism? Of course not - we’d keep running it, at which point it diverges quite a bit. (‘What do you mean, I’m an upload and the original has just died?!’) One could say, with great justice, that for transhumanists, divergence is not an obstacle to personal identity/continuity but the entire reason uploading is desirable in the first place.
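The thought-experiment can be written down as a deliberately trivial sketch: a fully deterministic rule, with no noise anywhere, whose outcome hinges on one stored bit. The 1024-byte "brain state", the bit position 4242, and the function names are hypothetical stand-ins invented for illustration.

```python
def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)

def carry_out_resolution(memory: bytes, bit_index: int) -> str:
    """Deterministic rule from the thought-experiment: 1 -> Muslim, 0 -> atheist."""
    bit = (memory[bit_index // 8] >> (bit_index % 8)) & 1
    return "Muslim" if bit else "atheist"

def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the bits in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

RESOLUTION_BIT = 4242                      # hypothetical position of the remembered bit
original = bytes(1024)                     # stand-in brain state: 8192 zero bits
perfect_copy = bytes(original)             # identical input, so identical output
flawed_copy = flip_bit(original, RESOLUTION_BIT)  # one-bit scanning error

print(carry_out_resolution(original, RESOLUTION_BIT),
      carry_out_resolution(perfect_copy, RESOLUTION_BIT),
      carry_out_resolution(flawed_copy, RESOLUTION_BIT))
```

The perfect copy always agrees with the original, while the copy that differs in one bit out of 8192 ends up on the other branch, without any appeal to randomness.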
Line segment AC corresponds exactly to line segment AB, and an infinite number of line segments can be drawn this way. This means that although the circles have different diameters, an infinite number of corresponding line segments can be drawn between an infinite number of corresponding points. Each circle represents a different sized infinity. A binary (digital) system can theoretically generate these infinite sets as an ongoing process of constant computation, but the computing resources required to do this are immense. So it seems that the differences between consciousness resulting from a diachronic pattern emergence embodied on a digital system might be quite severe when compared to an emergence resulting from the same set of rules run on an analog or a flesh-and-bone computer.
Perhaps it’s just me, but I think the author doesn’t grasp Cantorian cardinality at all (different sized infinity? No, they’re all the same cardinality, in the same way ‘all powers of 2’ is the same size as ‘the even integers’), and the rest doesn’t read much more sensibly.
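To spell out the parenthetical: two infinite sets have the same cardinality when their elements can be paired off one-to-one with nothing left over. A minimal sketch of such a pairing for the powers of 2 and the even integers follows. It is restricted to positive even integers for simplicity (an assumption), the function names are made up, and a finite prefix can only illustrate the rule, not prove it.

```python
def power_of_two(n: int) -> int:
    """The n-th power of 2."""
    return 2 ** n

def even_integer(n: int) -> int:
    """The n-th positive even integer."""
    return 2 * n

def pair_up(count: int):
    """Pair the n-th power of 2 with the n-th positive even integer, n = 1..count."""
    return [(power_of_two(n), even_integer(n)) for n in range(1, count + 1)]

pairs = pair_up(10)
for power, even in pairs:
    print(power, "<->", even)
```

Every index n picks out exactly one element on each side and no element is used twice, so neither set is "bigger". The radial pairing of points on the two circles works on the same principle: each point on the smaller circle corresponds to exactly one point on the larger.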