Brain-centredness and mind uploading

14 gedymin 02 January 2015 12:23PM

The naïve way of understanding mind uploading is "we take the connectome of a brain, including synaptic connection weights and characteristics, and emulate it in a computer". However, people want their personalities to be uploaded, not just their brains, and that is more than just replicating the functionality of their brains in silico.

This nuance has led to some misunderstandings - for example, to experts wondering [1] why anyone would think that brain-centredness [2] (the idea that brains are "sufficient" in some vague sense) is a necessary prerequisite for successful whole brain emulation. Of course, brain-centredness is not required for brain uploading to be technically successful; the question is whether it is sufficient for mind uploading in the sense that people actually care about.

 

The first obvious extension that may be required is the chemical environment of the brain. Here are some examples:

  • Are you familiar with someone whose personality is radically (and often predictably) altered under the influence of alcohol or drugs? This is not the exception but the rule: most people are affected, just to a smaller extent. Only the transiency of the effects allows us to label them as simple mood changes.
  • I have observed that my personal level of neuroticism varies depending on the pharmaceutical drugs I'm using. Nootropics make me more nervous, while anti-hypertension drugs have the reverse effect.
  • The levels of hormones in the blood act as long-term personality modulators. Some neurotransmitters are themselves slow-acting, for example, nitric oxide [3].
  • Artificially enhanced levels of serotonin in the brain cause it to "adapt" to this environment - this is how some antidepressants (namely, SSRIs) work [4].

Whole Brain Emulation - A Roadmap includes a short section about the "Body chemical environment" and concludes that for "WBE, the body chemistry model, while involved, would be relatively simple", unless protein interactions have to be modelled.

The technical aspects notwithstanding, what are the practical and moral implications? I think there is not only a problem here, but also an opportunity. Why keep the accidental chemistry we have developed during our lifetimes, one that presumably has little relation to what we would really like to be - if we could choose? Imagine that it is possible to create carefully improved and tailored versions of the neurotransmitter "soup" in the brain. There are new possibilities here for personal growth in ways that have not been possible before - ways completely orthogonal to the intelligence-enhancement opportunities commonly associated with uploading.

The question of personal identity is more difficult, and there appears to be a grey zone here. A fictional example, the protagonist of Planescape: Torment, comes to mind - is he the same person in each of his incarnations?

 

The second extension required to upload our personalities in the fullest sense might be the peripheral nervous system. Most of us think the brain alone is responsible for emotions, but that is a simplified picture. Here are some hints why:

  • The 19th-century James-Lange theory of emotions proposed that we experience emotion in response to physiological changes in our body: for example, we feel sad because we cry, rather than crying because we are sad [5]. While the modern understanding of emotions is significantly different, these ideas have not completely gone away, either from academic research [5] or from everyday life. For example, to calm down, we are advised to take deep, slow breaths. Paraplegics and quadriplegics with severe spinal cord injuries typically experience less intense emotions than other people [6].
  • Endoscopic thoracic sympathectomy (ETS) is a surgical procedure in which a portion of the sympathetic nerve trunk in the thoracic region is destroyed [7]. It is typically used against excessive hand sweating. However, "a large study of psychiatric patients treated with this surgery [also] showed significant reductions in fear, alertness and arousal [..] A severe possible consequence of thoracic sympathectomy is corposcindosis (split-body syndrome) [..] In 2003 ETS was banned in Sweden due to overwhelming complaints by disabled patients." The complaints include not being able to lead an emotional life as fully as before the operation.
  • The enteric nervous system, embedded in the lining of the gastrointestinal tract, "governs the function of the gastrointestinal system" [8]. I'm not sure how solid the research is, but many articles on the Web mention the importance of this system for our mood and well-being [9]. Serotonin is "the happiness neurotransmitter", and "in fact 95 percent of the body's serotonin is found in the bowels", as is 50% of the body's dopamine [8]. "Gut bacteria may influence thoughts and behaviour" [10] via the serotonin mechanism. Also, "irritable bowel syndrome is associated with psychiatric illness" [10].

 

In short, different chemistry in the brain changes what we are, as does the peripheral nervous system. To upload someone in the fullest sense, their chemistry and PNS have to be uploaded as well.

[1] Randal Koene on whole brain emulation

[2] Anders Sandberg, Nick Bostrom, Future of Humanity Institute, Whole Brain Emulation - A Roadmap.

[3] Bradley Voytek's (Ph.D. neuroscience) Quora answer to Will human consciousness ever be transferrable?

[4] Selective serotonin reuptake inhibitors

[5] Bear et al. Neuroscience: Exploring the Brain, 3rd edition. Page 564.

[6] Michael W. Eysenck, Perspectives On Psychology, page 100.

[7] Endoscopic thoracic sympathectomy

[8] Enteric nervous system

[9] Scientific American, 2010. Think Twice: How the Gut's "Second Brain" Influences Mood and Well-Being

[10] The Guardian, 2012. Microbes manipulate your mind

An additional problem with Solomonoff induction

2 gedymin 22 January 2014 11:34PM

Let's continue from my previous post and look at how Solomonoff induction fails to adequately deal with hypercomputation.

 

You may have heard of the Physical Church-Turing thesis. It's the idea that the Universe can, in perfect detail, be simulated on a Turing machine. (No problem if the Universe turns out to be infinite - the thesis only requires that each finite portion of it can be simulated.) A corollary to the Physical CTT is the idea that there are no physically realizable uncomputable processes. We can talk about hypercomputers as mathematical abstractions, but we'll never be able to build (or see) one.

 

We don't have very strong scientific evidence for the Physical CTT yet - no one has built the perfect simulator, for example. On the other hand, we do have something: all known laws of physics (including quantum physics) allow arbitrary-precision simulation on a computer. Even though a complete unified theory of all fundamental forces isn't there yet, the Standard Model already makes pretty accurate predictions for almost all environments. (Singularities are the only known exception, as their neighborhoods are the only locations where the role of quantum gravity is not negligible.)

 

So the Physical CTT does not contradict any known laws of physics. Of course, these laws are equally consistent with a multitude of alternative hypotheses: all the hypotheses which suggest that the universe cannot be simulated on a Turing machine. We prefer the Physical CTT solely because it's the simplest one - because Occam's razor says so.

 

There are multiple levels and kinds of uncomputability. None of the problems placed on any of these levels is uncomputable in the absolute sense; all of them can be computed by some hypothetical device, and all such devices are called hypercomputers. A corollary to the "Universe is uncomputable" position is that we may someday be able to build a hypercomputer embedded in the physical world, or at least access some of Nature's mystical, uncomputable processes as a black box.

 

Now, the "standard" Solomonoff induction uses the Universal prior, which in turn is related to Kolmogorov complexity (KC). An uncomputable process formally has undefined Kolmogorov complexity. Informally, the KC of such as process is infinitely large, as it must be larger than the KC of any computable process.

 

As discussed in the comments to the previous post, Solomonoff induction is by no means restricted to the Universal prior; it can use other priors, including a prior (i.e. a probability distribution) defined over a universal hypercomputer. An example of such a hypercomputer is a Universal Turing machine combined with a Halting problem oracle. Another example is a Universal Turing machine combined with a true random number oracle. An upgraded form of Solomonoff induction which uses a prior defined over the first kind of universal hypercomputer is going to treat the halting problem as a very simple, computable process. An upgraded form over the second kind is going to treat random number generation as a very simple, computable process. And so on.

 

Now here's the problem. Mathematically, a Universal Turing machine equipped with a Halting oracle and a Universal Turing machine equipped with a Random Number oracle are in the same class: they are both universal hypercomputers. Physically and practically, they are miles away from each other.

 

A Random Number oracle is just that: something that gives you random numbers. Its outputs' statistical properties won't even be particularly better than those of a good pseudorandom number generator; they simply are, in a sense, "true", and therefore uncomputable. However, quantum physics suggests that Random Number oracles might in fact be real, i.e. that there are sources of true randomness in the Universe. This is QM-interpretation-dependent, of course, but any deterministic, non-random interpretation of quantum mechanics involves things like faster-than-light interaction, which frankly are much less intuitive.
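The point that a true-random source looks no better than a good PRNG to simple statistics can be illustrated with a naive frequency test. This is only an illustration (the difference between the two is one of principle, not of observable bit frequencies), and the choice of Python's `random` and `secrets` modules as stand-ins is my own:

```python
import random
import secrets

N = 100_000
prng = random.Random(42)                             # deterministic PRNG
pseudo = [prng.getrandbits(1) for _ in range(N)]
true_ish = [secrets.randbits(1) for _ in range(N)]   # OS entropy source

# A naive frequency test can't tell them apart: both fractions are ~0.5.
print(sum(pseudo) / N, sum(true_ish) / N)
```

No finite battery of statistical tests separates them; only the (unobservable) fact that one stream is generated by a short program distinguishes the two.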

 

A Halting oracle, in contrast, can solve an infinite number of hugely important problems. It's magic. In my view, the a priori probability that the Universe contains some sort of Halting oracle is tiny. Only a huge amount of proper scientific testing could convince me to change this conclusion.

 

On the other hand, mathematically the two hypercomputers are siblings. Both can be approximated by a Turing machine. Both can even be computed by a deterministic algorithm (I think?), if the Turing machine that does the computation is allowed to run forever.

 

There is one significant mathematical difference between the two oracles (which Solomonoff induction nevertheless fails to take into account). The Halting oracle has power on its own: it can be used to solve problems even without the accompanying Turing machine, while the Random Number oracle cannot be used for anything but random number generation. (To solve any computable decision problem P with the Halting oracle, we can reformulate it as program source code - "if P, then halt; otherwise, loop forever" - and feed this program to the oracle. In this way the Halting oracle can tell that 3<7 is true - the program halts - while 10<5 is false - it loops forever.)
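The shape of this reduction can be sketched in a few lines of Python. A real halting oracle is of course physically unrealizable, so the stand-in below cheats by reading an annotation instead of actually deciding halting; everything here is a toy of my own construction, meant only to show how one oracle query decides an arbitrary problem:

```python
def make_program(claim: bool):
    """Encode a decision problem as a program that halts iff the
    claim is true ("if P, then halt; otherwise, loop forever")."""
    def program():
        if claim:
            return            # halts
        while True:           # loops forever
            pass
    program.halts = claim     # annotation our fake oracle reads
    return program

def halting_oracle(program) -> bool:
    # Stand-in for the hypothetical oracle: it just reads the
    # annotation rather than magically deciding halting.
    return program.halts

def decide(claim: bool) -> bool:
    """Solve a (here: already-evaluated) decision problem with a
    single query to the halting oracle."""
    return halting_oracle(make_program(claim))

print(decide(3 < 7))    # True  - the program halts
print(decide(10 < 5))   # False - it loops forever
```

The Random Number oracle admits no analogous trick: there is no way to encode "is 3<7?" as a question about random bits.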

 

Solomonoff induction can be fixed if we assume that the input tape of the Universal prior's Turing machine contains an infinite number of random bits. However, this idea needs an explicit justification, and its implications are not at all obvious. Does it mean that Occam's razor should be "prefer the simplest hypothesis, together with an infinite source of random numbers", instead of "prefer the simplest hypothesis"?

 

So to sum up, the problem is:

  • Intuitively, the probability that we are living in a Universe that includes True Random numbers is much larger than the probability that we are living in a Universe that allows Halting problem oracles;
  • Solomonoff induction cannot reliably distinguish between these two cases.

 

The consequences? When you hear someone claim - again - that "computers are not capable of true creativity/consciousness/whatever, because creativity/consciousness/whatever requires human intuition, which is uncomputable, etc.", remember that it might be a bad idea to respond with an appeal to Solomonoff induction.

 

Interestingly, quite a few people praise their intuition and view it as an almost mystical power, yet no one is surprised by their own ability to name a few random numbers :)

 

 


A related question: how does a finite, bounded universe fit into this? Does it make sense to use the Universal Turing machine as the reference machine for Kolmogorov complexity when the actual model of computation required to simulate the universe is much simpler?

Understanding and justifying Solomonoff induction

1 gedymin 15 January 2014 01:16AM

I've been trying to understand the uses and limitations of Solomonoff induction. Following the principle that in order to fully understand something you should explain it to others, here's a try. I prefer to write such things in the form of a dialogue, as that better reflects the thought process.

This is not a very in-depth technical article - for example, cousin_it and Wei Dai have obviously spent more time pondering SI. (I'm not a long-time LW reader, but I have skimmed through the existing LW and wiki articles on related topics before posting this.)



Alice. Hi, I'm interested in the question of why and when I should prefer simpler hypotheses. I've heard about Occam's razor and I've read about Solomonoff induction and the Universal Prior. Now I'm looking for a philosophical justification of the math. I'd like to have something akin to de Finetti's justification of probability theory: "if you don't believe in the axioms, you're going to be Dutch-booked!".

Bob. You're welcome. Do you have any problems with the formulations?

A. I'm trying to understand how to connect the informal concept of Occam's razor with the mathematical formula of the Universal Prior in a meaningful way. Informally, a hypothesis is something that explains the data. Occam's razor tells us to prefer simpler hypotheses.

B. Well, yes.

A. But the Universal Prior seems to claim something even stronger: that all shorter hypotheses are more likely! Clearly, that's not the case: if F and G are hypotheses, then "F OR G" is a hypothesis as well, and no less likely than either F or G individually! What's more, there exists a whole set of language and domain pairs where longer hypotheses have the same average probability. For one particular example, consider propositional logic with AND, OR, and NOT operators. Because of the symmetry between conjunction and disjunction, a well-formed 10-element formula is as likely to be a tautology (i.e. always true) as a 100-element formula!

B. You're confused. In this formalism, you should interpret all hypotheses as programs for a Universal Turing Machine. The machine reads a hypothesis from one of its tapes, executes it, and outputs the result of the computation to another tape. The result is the data - the actual string whose prefix we're observing, and whose remaining part we're trying to predict. The hypothesis is the input string - the string we're not observing. In order to predict future output, our best option is to guess which input program the machine is using. Some programs can be ruled out because they don't correspond to the observed output, but an infinite number of possible programs always remains. The Universal Prior says that shorter programs are more likely. So a hypothesis is just a bunch of Turing Machine instructions. It makes sense to speak about the "joint hypothesis of F and G" - it's a program that does both what F and G do. On the other hand, it makes no sense to speak about "F OR G" in the way you were doing it before.
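Bob's picture can be made concrete in a toy model. Below, the Universal Turing Machine is replaced by a deliberately trivial "machine" that copies its program cyclically to the output tape; the machine, the program encoding, and the `max_len` cutoff are all simplifying assumptions of mine, not part of the formal definition - but the mechanics (weight each program 2^-length, keep the ones consistent with the observed prefix, sum their predictions) are exactly the Universal Prior's:

```python
from itertools import product

def toy_output(program, n):
    """Toy 'machine': the program bit-tuple is copied cyclically to
    the output tape. (A stand-in for a universal Turing machine -
    the part this sketch deliberately simplifies.)"""
    return [program[i % len(program)] for i in range(n)]

def predict_next(prefix, max_len=12):
    """Weight each program 2^-length (the universal prior restricted
    to this toy machine), keep programs whose output starts with
    `prefix`, and return the posterior probability that the next
    bit is 1."""
    weight = {0: 0.0, 1: 0.0}
    for length in range(1, max_len + 1):
        for program in product([0, 1], repeat=length):
            out = toy_output(program, len(prefix) + 1)
            if out[:len(prefix)] == list(prefix):
                weight[out[len(prefix)]] += 2.0 ** (-length)
    return weight[1] / (weight[0] + weight[1])

p1 = predict_next([0, 1, 0, 1])
print(p1)   # < 0.5: most of the prior mass predicts the pattern continues with 0
```

The short program `01` dominates the posterior, so the machine bets on the pattern continuing - a miniature version of "shorter programs are more likely".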

A. I see, right! Actually, I'm not a mathematician or an AI designer; I just want to reason about the physical universe. Let's fix it as the domain of our conversation. It seems to me that hypotheses in some sense correspond to the laws of physics, and the output corresponds to the physical universe itself - everything that we can observe using our senses. But what does the Turing Machine correspond to?

B. You're looking at it from the wrong angle. The justification of Occam's razor is an epistemic, not an ontological issue. It's not essential whether there is or isn't a specific Turing Machine "out there" computing our universe; it's essential that our universe behaves as if it were computable! Let's quote Scholarpedia: "It is clear that, in a world with computable processes, patterns which result from simple processes are relatively likely, while patterns that can only be produced by very complex processes are relatively unlikely."

A. But when we're calculating the complexity of hypotheses, we're doing it with respect to a particular model of computation!

B. The Church-Turing thesis tells us that the choice of a particular machine is not very important - any Universal Turing Machine can simulate any other model of computation.

A. But who says it's a universal machine? It may be anything, really! For example, who says that the universe doesn't work as a quasi-Turing machine that outputs a huge number of "1"s with 99% probability every time after it outputs a single "0"? The Universal Prior relies on the relation between the inputs and outputs of the machine; if this relation is changed, the probabilities are going to be wrong.

On the other hand, the physical model may be strictly more powerful than a Universal Turing machine - it may be a hypercomputer! If it is, the universe is uncomputable.

B. Well, Occam's razor says that the model should be the simplest possible... Ah, I see - appealing to the razor here is a kind of circular reasoning.

At the moment I have another line of thought: consider what happens when we throw in the Simulation Hypothesis. The probability of being in a simulation depends on the computational power available to the simulated universes. If that power is significantly smaller than in the parent universe, the recursive chain of simulations is shorter; if it is closer to the parent's power, the chain is longer. Therefore, the probability of being in a simulation is proportional to the length of the chain. This implies that if we're living in a simulation, then our universe is almost certainly not significantly less powerful than our "parent" universe: either both our universe and the alien universe are computable, or they are both hypercomputable (in identical senses of the word). Since it seems that we cannot create hypercomputers in our universe, it's reasonable that the aliens cannot do so either. So this is evidence that our universe is computable.

A. Still, there is a small probability that uncomputable oracles are present in the physical world, and we have just failed to recognize them. Perhaps we could learn about them in the future and harness their power somehow.

B. Sure - but the point is that we have yet to see any evidence for them. And there's evidence against them: the fact that our computer simulations match reality very well, as long as we have the computing power they require! We've been looking for strange messages in the sky and haven't found them. We've been looking for messages in our cells and haven't found them.

A. In the worst case we can still justify Occam's razor for the physical universe by induction on empirical experience, right?

B. Hmm... I thought for a while and now I've got a better justification! See, even if the universe itself is uncomputable, there's still a MYRIAD of processes in it that ARE COMPUTABLE. We know that gravity and electromagnetism do not behave in random ways; they are at least approximately computable. Molecular dynamics is approximately computable. The cell is approximately computable. The nervous system is approximately computable. Evolution should be approximately computable; we can compute some of its basic elements astonishingly well. When Mendel was determining how heredity works by doing his pea-hybridization experiments, he was investigating a discrete, beautifully computable process.

Nature fights against randomness. Pure, uncontrolled randomness does not allow complex systems to emerge. Even if the space-time of physics itself is completely continuous and uncomputable (which is far from given), Nature at higher levels favors computable, discrete processes. In a sense, Nature has an infinite number of monkeys - but these monkeys are not sitting at a typewriter writing the whole damn thing. Instead, they are sitting at a computer, writing input programs. As Seth Lloyd says (source):

Quantum mechanics supplies the universe with “monkeys” in the form of random quantum fluctuations, such as those that seeded the locations of galaxies. The computer into which they type is the universe itself. From a simple initial state, obeying simple physical laws, the universe has systematically processed and amplified the bits of information embodied in those quantum fluctuations. The result of this information processing is the diverse, information-packed universe we see around us: programmed by quanta, physics gave rise first to chemistry and then to life; programmed by mutation and recombination, life gave rise to Shakespeare; programmed by experience and imagination, Shakespeare gave rise to Hamlet. You might say that the difference between a monkey at a typewriter and a monkey at a computer is all the difference in the world.

A. Fine. This really has convinced me that Occam's razor is in some sense justified. I agree that the majority of the processes in the universe are computable or computationally approximable. I agree that for these processes, the Universal Prior follows. On the other hand, Occam's razor isn't justified for the ultimate laws of physics, because our universe might be uncomputable.

B. Well, there might be no such thing as "laws" in the ontological sense. Look, it's probable that there were an infinite number of inflationary bubbles expanding in the primordial Multiverse. Random regularities generated in a particular subset of them led to the appearance of laws. In this case the Universal Prior is exactly what is needed to sort out which sets of physical laws are probable and which are improbable. I think that the existence of this simple physical model is evidence for a simple computational model, and therefore evidence against hypercomputation and the kind of probabilistic quasi-Turing machines you mentioned before.

A. Maybe. However, I still doubt that Occam's razor is always the best option to use in practice, even if it is theoretically justified.

Solomonoff induction guarantees good results when we are able to observe the prefix of the output data completely and precisely. Real-world reasoning, in contrast, requires working with data that has measurement errors, random noise, missing samples, and so on. Furthermore, we often need not so much a single precise hypothesis as something broader - a type or class of hypotheses - what machine learning people call "a model". Indeed, it turns out that machine learning researchers are working precisely on these problems, and they are not terribly excited about Occam's razor. Let me quote from [Domingos 1999]:

"First [version of Occam's] razor: Given two models with the same generalization error, the simpler one should be
preferred because simplicity is desirable in itself.
[..]
Second razor: Given two models with the same training-set error, the simpler one should be preferred because it is likely to have lower generalization error.

We believe that it is important to distinguish clearly between these two versions of Occam's razor. The first one is largely uncontroversial, while the second one, taken literally, is false.

Several theoretical arguments and pieces of empirical evidence have been advanced to support it, but each of these is reviewed below and found wanting."

B. So what do you suggest?

A. It seems to me that Occam's razor surely is a good option when we need to explain data from controlled scientific experiments. It's the best option if we know how to computationally simulate the process and are able to isolate it from the environment. If we don't know the mechanics of the process, we cannot even approximately calculate the computational complexity of the data. If we cannot isolate the system from the environment (and the system doesn't have good enough built-in error-correction facilities), then the results of the simulations won't match real-world data. (For example, we can simulate the heredity of peas: we can get the probability distribution of recessive and dominant traits across pea hybrids in silico, and then get nearly the same results in vivo. On the other hand, there is little hope that we'll ever be able to simulate the unfolding of large-scale evolution and replicate its past results in our computers. Even though we understand the mathematical basis of evolution, simulating the environmental impact is beyond our reach.)
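Alice's pea example really can be simulated in a few lines. Assuming the standard textbook monohybrid cross of two heterozygotes (Aa × Aa - the specific allele names are illustrative), Mendel's 3:1 dominant-to-recessive phenotype ratio falls out of the simulation:

```python
import random

def cross(parent1, parent2, rng):
    """One offspring: each parent contributes one randomly chosen allele."""
    return (rng.choice(parent1), rng.choice(parent2))

rng = random.Random(0)
offspring = [cross("Aa", "Aa", rng) for _ in range(10_000)]

# Dominant phenotype: genotype carries at least one 'A' allele.
dominant = sum("A" in genotype for genotype in offspring)
print(dominant / len(offspring))   # ~0.75, Mendel's 3:1 ratio
```

This is precisely the "isolated, mechanically understood process" case: the in-silico distribution matches the in-vivo one because nothing outside the model affects the outcome.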

B. We also should not forget that there are other motivations behind the razor. When evaluating a theory, its universal a priori probability is not the only thing that counts: theories with lower cognitive complexity are preferable because they are easier to reason about, and theories that lead to algorithms with lower time complexity are preferable because they save processing power. The universal prior is uncomputable, after all; it can only be approximated. So it's OK to trade off marginal gains in probability - the apparent gain may be an error anyway. It's perfectly rational to trade off Kolmogorov complexity against other kinds of complexity.

A. So the bottom line is that the set of situations in which Occam's razor (in the informal sense) is justified is neither a superset nor a subset of the situations in which Solomonoff induction is justified. Besides, the question of the nature of the universe remains open - as always...


References:


[Domingos 1999] P. Domingos (1999). The role of Occam's razor in knowledge discovery. Springer. [PDF]

[Scholarpedia 2007] Marcus Hutter et al. (2007), Scholarpedia, 2(8):2572. [link]