[Link]: KIC 8462852, aka WTF star, "the most mysterious star in our galaxy", ETI candidate, etc.
KIC 8462852, or the WTF (Where's the Flux?) star, is an F-type main sequence star about 1,480 ly away. It's a little larger and more massive than the sun, and a few times brighter. Age is uncertain, but probably older rather than younger.
Kepler observations over the past few years reveal strange, large, aperiodic flux variations (dips of up to 20%) - of the general form predicted by some ETI megastructure models. However, there doesn't appear to be any infrared excess.
The star's fluctuations were discovered by the Planet Hunters team. In the WTF paper they review a large number of unlikely natural explanations and settle on an unusual comet swarm as the most likely scenario.
Abstract of the WTF paper:
Over the duration of the Kepler mission, KIC 8462852 was observed to undergo irregularly shaped, aperiodic dips in flux down to below the 20% level. The dipping activity can last for between 5 and 80 days. We characterize the object with high-resolution spectroscopy, spectral energy distribution fitting, and Fourier analyses of the Kepler light curve. We determine that KIC 8462852 is a main-sequence F3 V/IV star, with a rotation period ~0.88 d, that exhibits no significant IR excess. In this paper, we describe various scenarios to explain the mysterious events in the Kepler light curve, most of which have problems explaining the data in hand. By considering the observational constraints on dust clumps orbiting a normal main-sequence star, we conclude that the scenario most consistent with the data is the passage of a family of exocomet fragments, all of which are associated with a single previous breakup event. We discuss the necessity of future observations to help interpret the system.
From "Comets or Aliens?", on the Planet Hunters blog: " However, so far over 100 professional scientists have had a look at the lightcurves and not managed to come up with a working solution."
In another recent paper, Jason Wright et al. discuss the WTF star in more detail and critique the comet theory.
The Search for Extraterrestrial Civilizations with Large Energy Supplies. IV: The Signatures and Information Content of Transiting Megastructures:
Arnold (2005), Forgan (2013), and Korpela et al. (2015) noted that planet-sized artificial structures could be discovered with Kepler as they transit their host star. We present a general discussion of transiting megastructures, and enumerate ten potential ways their anomalous silhouettes, orbits, and transmission properties would distinguish them from exoplanets. We also enumerate the natural sources of such signatures.
Several anomalous objects, such as KIC 12557548 and CoRoT-29, have variability in depth consistent with Arnold's prediction and/or an asymmetric shape consistent with Forgan's model. Since well motivated physical models have so far provided natural explanations for these signals, the ETI hypothesis is not warranted for these objects, but they still serve as useful examples of how nonstandard transit signatures might be identified and interpreted in a SETI context. Boyajian et al. 2015 recently announced KIC 8462852, an object with a bizarre light curve consistent with a "swarm" of megastructures. We suggest this is an outstanding SETI target.
We develop the normalized information content statistic M to quantify the information content in a signal embedded in a discrete series of bounded measurements, such as variable transit depths, and show that it can be used to distinguish among constant sources, interstellar beacons, and naturally stochastic or artificial, information-rich signals. We apply this formalism to KIC 12557548 and a specific form of beacon suggested by Arnold to illustrate its utility.
Jason Wright discusses WTF here on his blog.
Big reddit discussion on r/askscience here.
The Unfriendly Superintelligence next door
Markets are powerful decentralized optimization engines - it is known. Liberals see the free market as a kind of optimizer run amuck, a dangerous superintelligence with simple non-human values that must be checked and constrained by the government - the friendly SI. Conservatives just reverse the narrative roles.
In some domains, where the incentive structure aligns with human values, the market works well. In our current framework, the market works best for producing gadgets. It does not work so well for pricing intangible information, and most specifically it is broken when it comes to health.

We treat health as just another gadget problem: something to be solved by pills. Health is really a problem of knowledge; it is a computational prediction problem. Drugs are useful only to the extent that you can package the results of new knowledge into a pill and patent it. If you can't patent it, you can't profit from it.
So the market is constrained to solve human health by coming up with new patentable designs for mass-producible physical objects which go into human bodies. Why did we add that constraint - thou shalt solve health, but thou shalt only use pills? (Ok, technically the solutions don't have to be ingestible, but that's a detail.)
The gadget model works for gadgets because we know how gadgets work - we built them, after all. The central problem with health is that we do not completely understand how the human body works - we did not build it. Thus we should be using the market to figure out how the body works - completely - and arguably we should be allocating trillions of dollars towards that problem.
The market optimizer analogy runs deeper when we consider the complexity of instilling values into a market. Lawmakers cannot program the market with goals directly, so instead they attempt to engineer desirable behavior through ever more layers of constraints. Lawmakers are deontologists.
As an example, consider the regulations on drug advertising. Big pharma is unsafe - its profit function does not encode anything like "maximize human health and happiness" (which of course is itself an oversimplification). If left to its own devices, it has strong incentives to sell subtly addictive drugs, to create elaborately hyped false advertising campaigns, etc. Thus all the deontological injunctions. I take that as a strong indicator of a poor solution - a value alignment failure.
What would healthcare look like in a world where we solved the alignment problem?
To solve the alignment problem, the market's profit function must encode long term human health and happiness. This really is a mechanism design problem - it's not something lawmakers are even remotely trained or qualified for. A full solution is naturally beyond the scope of a little blog post, but I will sketch out the general idea.
To encode health into a market utility function, first we create financial contracts with an expected value which captures long-term health. We can accomplish this with a long-term contract that generates positive cash flow when a human is healthy, and negative when unhealthy - basically an insurance contract. There is naturally much complexity in getting those contracts right, so that they measure what we really want. But assuming that is accomplished, the next step is pretty simple - we allow those contracts to trade freely on an open market.
There are some interesting failure modes and considerations that are mostly beyond scope but worth briefly mentioning. This system probably needs to be asymmetric. The transfers on poor health outcomes should partially go to cover medical payments, but it may be best to have a portion of the wealth simply go to nobody/everybody - just destroyed.
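To make the mechanism slightly more concrete, here is a minimal sketch in Python. The payout schedule, health metric, discount rate, and burn fraction are all illustrative assumptions - placeholders for the real mechanism design work, not a proposal:

```python
# Minimal sketch of a tradeable long-term health contract.
# All names and numbers are illustrative assumptions.

def yearly_cash_flow(health_score, premium=1000.0, payout_scale=5000.0,
                     burn_fraction=0.5):
    """Cash flow to the contract holder for one insured person-year.
    health_score: 1.0 = fully healthy, 0.0 = severely ill."""
    flow = premium - (1.0 - health_score) * payout_scale
    if flow < 0.0:
        # Asymmetry: only part of the loss goes to medical payments;
        # the rest is simply destroyed (paid to nobody/everybody).
        return (1.0 - burn_fraction) * flow
    return flow

def contract_price(predicted_health, discount=0.97):
    """Fair price ~ discounted expected cash flows, so better long-term
    health predictions translate directly into trading profits."""
    return sum(discount ** t * yearly_cash_flow(h)
               for t, h in enumerate(predicted_health))

# A trader who believes some intervention (say, more sunlight) will
# improve this person's trajectory buys the contract while it is cheap:
baseline = [0.9, 0.85, 0.80, 0.70]   # predicted decline, no intervention
improved = [0.9, 0.92, 0.93, 0.95]   # predicted trajectory with it
print(contract_price(baseline), contract_price(improved))
```

The key property is the pricing function: the contract's market value tracks discounted expected health, so whoever predicts long-term health best profits most.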
In this new framework, designing and patenting new drugs can still be profitable, but it is now put on even footing with preventive medicine. More importantly, the market can now actually allocate the correct resources towards long term research.
To make all this concrete, let's use an example of a trillion dollar health question - one that our current system is especially ill-equipped to solve:
What are the long-term health effects of abnormally low levels of solar radiation? What levels of sun exposure are ideal for human health?
This is a big important question, and you've probably read some of the hoopla and debate about vitamin D. Below I will briefly summarize a general abstract theory, one that I would bet heavily on if we lived in a more rational world where such bets were possible.
In a sane world where health is solved by a proper computational market, I could make enormous - ridiculous really - amounts of money if I happened to be an early researcher who discovered the full health effects of sunlight. I would bet on my theory simply by buying up contracts for individuals/demographics who had the most health to gain by correcting their sunlight deficiency. I would then publicize the theory and evidence, and perhaps even raise a pile of money to create a strong marketing engine to help ensure that my investments - my patients - were taking the necessary actions to correct their sunlight deficiency. Naturally I would use complex machine learning models to guide the trading strategy.
Now, just as an example, here is the brief 'pitch' for sunlight.

If we go back and look across all of time, there is a mountain of evidence which more or less screams - proper sunlight is important to health. Heliotherapy has a long history.
Humans, like most mammals, and most other earth organisms in general, evolved under the sun. A priori we should expect that organisms will have some 'genetic programs' which take approximate measures of incident sunlight as an input. The serotonin -> melatonin mediated blue-light pathway is an example of one such light detecting circuit which is useful for regulating the 24 hour circadian rhythm.
The vitamin D pathway has existed since the time of algae such as the Coccolithophore. It is a multi-stage pathway that can measure solar radiation over a range of temporal frequencies (a toy simulation follows the list below). It starts with synthesis of fat-soluble cholecalciferol, which has a very long half life measured in months. [1] [2]
- Cholecalciferol (HL ~ months) becomes
- 25(OH)D (HL ~ 15 days) which finally becomes
- 1,25(OH)2 D (HL ~ 15 hours)
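As a toy illustration of how a cascade of decays with different half-lives acts as a multi-timescale solar clock, here is a minimal simulation sketch. Only the half-lives follow the list above; the synthesis and conversion constants are made-up:

```python
import math

# Toy model: the vit D cascade as a set of low-pass filters over sunlight.
DT = 1.0  # timestep: 1 day
half_lives = {'D3': 60.0, '25(OH)D': 15.0, '1,25(OH)2D': 15.0 / 24.0}  # days
decay = {k: 0.5 ** (DT / hl) for k, hl in half_lives.items()}
levels = {k: 0.0 for k in half_lives}

def step(sun):
    """One day: synthesize D3 from sunlight, convert down the chain, decay."""
    levels['D3'] = levels['D3'] * decay['D3'] + sun
    levels['25(OH)D'] = levels['25(OH)D'] * decay['25(OH)D'] + 0.1 * levels['D3']
    levels['1,25(OH)2D'] = (levels['1,25(OH)2D'] * decay['1,25(OH)2D']
                            + 0.1 * levels['25(OH)D'])

# Seasonal sunlight: the slow D3 pool integrates over months (a seasonal
# clock), while the fast 1,25(OH)2D pool tracks only the last day or so.
for day in range(3 * 365):
    sun = max(0.0, math.sin(2 * math.pi * day / 365.0))
    step(sun)
print({k: round(v, 1) for k, v in levels.items()})
```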
The main recognized role for this pathway in regards to human health - at least according to the current Wikipedia entry - is to enhance "the internal absorption of calcium, iron, magnesium, phosphate, and zinc". Ponder that for a moment.
Interestingly, this pathway still works as a general solar clock and radiation detector for carnivores - as they can simply eat the precomputed measurement in their diet.
So, what is a long term sunlight detector useful for? One potential application could be deciding appropriate resource allocation towards DNA repair. Every time an organism is in the sun it is accumulating potentially catastrophic DNA damage that must be repaired when the cell next divides. We should expect that genetic programs would allocate resources to DNA repair and various related activities dependent upon estimates of solar radiation.
I should point out - just in case it isn't obvious - that this general idea does not imply that cranking up the sunlight hormone to insane levels will lead to much better DNA/cellular repair. There are always tradeoffs, etc.
One other obvious use of a long term sunlight detector is to regulate general strategic metabolic decisions that depend on the seasonal clock - especially for organisms living far from the equator. During the summer when food is plentiful, the body can expect easy calories. As winter approaches calories become scarce and frugal strategies are expected.
So first off, we'd expect to see a huge range of complex effects showing up as correlations between low vit D levels and various illnesses - specifically illnesses connected to DNA damage (such as cancer) and/or BMI.
Now it turns out that BMI itself is also strongly correlated with a huge range of health issues. So the first key question to focus on is the relationship between vit D and BMI. And - perhaps not surprisingly - there is pretty good evidence for such a correlation[3][4], and this has been known for a while.
Now we get into the real debate. Numerous vit D supplement intervention studies have now been run, and the results are controversial. In general the vit D experts (such as my father, who started the Vitamin D Council and publishes some related research[5]) say that the only studies that matter are those that supplement at doses high enough to elevate vit D levels into a 'proper' range which substitutes for sunlight - in general around 5,000 IU/day on average, depending completely on genetics and lifestyle (to the point that any one-size-fits-all recommendation is probably terrible).
The mainstream basically ignores all that and funds studies at tiny RDA doses - say 400 IU or less - and then runs meta-analyses over those studies, which unsurprisingly fail to show a statistically significant effect. However, these studies still show small effects. Often the meta-analysis is corrected for BMI, which also tends to remove any vit D effect, to the extent that low vit D/sunlight is a cause of both weight gain and a bunch of other outcomes.
So let's look at two studies for vit D and weight loss.
First, this recent 2015 study of 400 overweight Italians (sorry, the actual paper doesn't appear to be available yet) tested vit D supplementation for weight loss. The 3 groups were (0 IU/day, ~1,000 IU/day, ~3,000 IU/day). The observed average weight loss was (1 kg, 3.8 kg, 5.4 kg). I don't know if the 0 IU group received a placebo. Regardless, it looks promising.
On the other hand, this 2013 meta-analysis of 9 studies with 1651 adults total (mainly women) supposedly found no significant weight loss effect for vit D. However, the studies used between 200 IU/day and 1,100 IU/day, with most between 200 and 400 IU. Five of the studies used calcium, and five also showed weight loss (not necessarily the same five - it's unclear). This does not show - at all - what the study claims in its abstract.
In general, medical researchers should not be doing statistics. That is a job for the tech industry.
Now the vit D and sunlight issue is complex, and it will take much research to really work out all of what is going on. The current medical system does not appear to be handling this well - why? Because there is insufficient financial motivation.
Is Big Pharma interested in the sunlight/vit D question? Well yes - but only to the extent that they can create a patentable analogue! The various vit D analogue drugs developed or in development are evidence that Big Pharma is at least paying attention. But assuming that the sunlight hypothesis is mainly correct, there is very little profit in actually fixing the real problem.
There is probably more to sunlight than just vit D and serotonin/melatonin. Consider the interesting correlation between birth month and a number of disease conditions[6]. Perhaps there is a little grain of truth to astrology after all.
Thus concludes my little vit D pitch.
In a more sane world I would have already bet on the general theory. In a really sane world it would have been solved well before I would expect to make any profitable trade. In that rational world you could actually trust health advertising, because you'd know that health advertisers are strongly financially motivated to convince you of things actually truly important for your health.
Instead of charging by the hour or per treatment, like a mechanic, doctors and healthcare companies should literally invest in their patients long-term health, and profit from improvements to long term outcomes. The sunlight health connection is a trillion dollar question in terms of medical value, but not in terms of exploitable profits in today's reality. In a properly constructed market, there would be enormous resources allocated to answer these questions, flowing into legions of profit motivated startups that could generate billions trading on computational health financial markets, all without selling any gadgets.
So in conclusion: the market could solve health, but only if we allowed it to, and only if we set up appropriate financial mechanisms to encode the correct value function. This is the UFAI problem next door.
Analogical Reasoning and Creativity
This article explores analogism and creativity, starting with a detailed investigation into IQ-test style analogy problems and how both the brain and some new artificial neural networks solve them. Next we analyze concept map formation in the cortex and the role of the hippocampal complex in establishing novel semantic connections: the neural basis of creative insights. From there we move into learning strategies, and finally conclude with speculations on how a grounded understanding of analogical creative reasoning could be applied towards advancing the art of rationality.

- Introduction
- Under the Hood
- Conceptual Abstractions and Cortical Maps
- The Hippocampal Association Engine
- Cultivate memetic heterogeneity and heterozygosity
- Construct and maintain clean conceptual taxonomies
- Conclusion
Introduction
The computer is like a bicycle for the mind.
-- Steve Jobs
The kingdom of heaven is like a mustard seed, the smallest of all seeds, but when it falls on prepared soil, it produces a large plant and becomes a shelter for the birds of the sky.
-- Jesus
Sigmoidal neural networks are like multi-layered logistic regression.
-- various
The threat of superintelligence is like a tribe of sparrows who find a large egg to hatch and raise. It grows up into a great owl which devours them all.
-- Nick Bostrom (see this video)
Analogical reasoning is one of the key foundational mechanisms underlying human intelligence, and perhaps a key missing ingredient in machine intelligence. For some - such as Douglas Hofstadter - analogy is the essence of cognition itself.[1]
Steve Jobs's bicycle analogy is clever because it encapsulates the whole cybernetic idea of computers as extensions of the nervous system into a single memorable sentence using everyday terms.
A large chunk of Jesus's known sayings are parables about the 'Kingdom of Heaven': a complex enigmatic concept that he explains indirectly through various analogies, of which the mustard seed is perhaps the most memorable. It conveys the notions of exponential/sigmoidal growth of ideas and social movements (see also the Parable of the Leaven), while also hinting at greater future purpose.
In a number of fields, including the technical, analogical reasoning is key to creativity: most new insights come from establishing mappings between or with concepts from other fields or domains, or from generalizing existing insights/concepts (which is closely related). These abilities all depend on deep, wide, and well organized internal conceptual maps.
Under the Hood

You can think of the development of IQ tests as a search for simple tests which have high predictive power for g-factor in humans, while being relatively insensitive to specific domain knowledge. That search process resulted in a number of problem categories, many of which are based on verbal and mathematical analogies.
The image to the right is an example of a simple geometric analogy problem. As an experiment, start a timer before having a go at it. For bonus points, attempt to introspect on your mental algorithm.
Solving this problem requires first reducing the images to simpler compact abstract representations. The first rows of images then become something like sentences describing relations or constraints (Z is to ? as A is to B and C is to D). The solution to the query sentence can then be found by finding the image which best satisfies the likely analogous relations.
Imagine watching a human subject (such as your previous self) solve this problem while hooked up to a future high resolution brain imaging device. Viewed in slow motion, you would see the subject move their eyes from location to location through a series of saccades, while various vectors or mental variable maps flowed through their brain modules. Each fixation lasts about 300ms[2], which gives enough time for one complete feedforward pass through the ventral vision stream and perhaps one backwards sweep.

The output of the ventral stream in inferior temporal cortex (TE on the bottom) results in abstract encodings which end up in working memory buffers in prefrontal cortex. From there some sort of learned 'mental program' implements the actual analogy evaluations, probably involving several more steps in PFC, cingulate cortex, and various other cortical modules (coordinated by the Basal Ganglia and PFC). Meanwhile the frontal eye fields and various related modules are computing the next saccade decision every 300ms or so.
If we assume that visual parsing requires one fixation on each object and 50ms saccades, this suggests that solving this problem would take a typical brain a minimum of about 4 seconds (and much longer on average). The minimum estimate assumes - probably unrealistically - that the subject can perform the analogy checks or mental rotations near instantly without any backtracking to help prime working memory. Of course faster times are also theoretically possible - but not dramatically faster.
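The back-of-envelope arithmetic, assuming (illustratively) around 11 panels to fixate - 3 example pairs plus the query plus 4 answer choices, an assumption about this particular puzzle's layout:

```python
# Minimum solve time under the stated assumptions: one fixation per panel.
n_panels = 11
fixation_ms, saccade_ms = 300, 50
print(n_panels * (fixation_ms + saccade_ms) / 1000.0)  # ~3.85 s, i.e. about 4 s
```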
These types of visual analogy problems test a wide set of cognitive operations, which by itself can explain much of the correlation with IQ or g-factor: speed and efficiency of neural processing, working memory, module communication, etc.
However once we lay all of that aside, there remains a core dependency on the ability for conceptual abstraction. The mapping between these simple visual images and their compact internal encodings is ambiguous, as is the predictive relationship. Solving these problems requires the ability to find efficient and useful abstractions - a general pattern recognition ability which we can relate to efficient encoding, representation learning, and nonlinear dimension reduction: the very essence of learning in both man and machine[3].
The machine learning perspective can help make these connections more concrete when we look into state of the art programs for IQ tests in general and analogy problems in particular. Many of the specific problem subtypes used in IQ tests can be solved by relatively simple programs. In 2003, Sanghi and Dowe created a simple Perl program (less than 1000 lines of code) that can solve several specific subtypes of common IQ problems[4] - but not analogies. It scored an IQ of a little over 100, simply by excelling in a few categories and making random guesses for the remaining harder problem types. Thus its score is highly dependent on the test's particular mix of subproblems, but that is also true for humans to some extent.

The IQ test sub-problems that remain hard for computers are those that require pattern recognition combined with analogical reasoning and/or inductive inference. Precise mathematical inductive inference is easier for machines, whereas humans excel at natural reasoning - inference problems involving huge numbers of variables that can only be solved by scalable approximations.

The word vector embedding is learned as a component of an ANN trained via backprop on a large corpus of text data (Wikipedia). This particular model is rather complex: it combines a multi-sense word embedding, a local sliding window prediction objective, task-specific geometric objectives, and relational regularization constraints. Unlike the recent crop of general linguistic modeling RNNs, this particular system doesn't model full sentence structure or longer term dependencies - as those aren't necessary for answering these specific questions. Surprisingly all it takes to solve the verbal analogy problems typical of IQ/SAT/GRE style tests are very simple geometric operations in the word vector space - once the appropriate embedding is learned.
As a trivial example: "Uncle is to Aunt as King is to ?" literally reduces to:
Uncle + X = Aunt, King + X = ?, and thus X = Aunt-Uncle, and:
? = King + (Aunt-Uncle).
The (Aunt-Uncle) expression encapsulates the concept of 'femaleness', which can be combined with any male version of a word to get the female version. This is perhaps the simplest example, but more complex transformations build on this same principle. The embedded concept space allows for easy mixing and transforms of memetic sub-features to get new concepts.
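Here is a toy sketch of the trick, with hand-made 2-d vectors standing in for learned embeddings; a real system would use learned vectors with hundreds of dimensions:

```python
import numpy as np

# Hypothetical toy "embeddings"; dims read roughly as (royalty, maleness).
emb = {
    'uncle': np.array([1.0,  1.0]),
    'aunt':  np.array([1.0, -1.0]),
    'king':  np.array([5.0,  1.0]),
    'queen': np.array([5.0, -1.0]),
    'man':   np.array([0.0,  1.0]),
}

def solve_analogy(a, b, c):
    """a : b :: c : ?  ->  nearest word to c + (b - a) by cosine similarity."""
    target = emb[c] + (emb[b] - emb[a])
    def cos(u, v):
        return u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9)
    # Exclude the query words themselves, as is standard practice.
    candidates = {w: v for w, v in emb.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cos(candidates[w], target))

print(solve_analogy('uncle', 'aunt', 'king'))  # -> 'queen'
```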
Conceptual Abstractions and Cortical Maps
The success of these simplistic geometric transforms operating on word vector embeddings should not come as a huge surprise to one familiar with the structure of the brain. The brain is extraordinarily slow, so it must learn to solve complex problems via extremely simple and short mental programs operating on huge wide vectors. Humans (and now convolutional neural networks) can perform complex visual recognition tasks in just 10-15 individual computational steps (150 ms), or 'cortical clock cycles'. The entire program that you used to solve the earlier visual analogy problem probably took on the order of a few thousand cycles (assuming it took you a few dozen seconds). Einstein solved general relativity in - very roughly - around 10 billion low level cortical cycles.
The core principle behind word vector embeddings, convolutional neural networks, and the cortex itself is the same: learning to represent the statistical structure of the world by an efficient low complexity linear algebra program (consisting of local matrix vector products and per-element non-linearities). The local wiring structure within each cortical module is equivalent to a matrix with sparse local connectivity, optimized heavily for wiring and computation such that semantically related concepts cluster close together.
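In code, the shared motif is just this - a sketch with illustrative sizes, and random weights standing in for learned, wiring-optimized connectivity:

```python
import numpy as np

# One "module step": a sparse matrix-vector product plus a per-element
# nonlinearity. Sizes and sparsity level are illustrative assumptions.
rng = np.random.default_rng(0)
n = 1024
W = rng.normal(0, 1 / np.sqrt(n), (n, n))
W *= rng.random((n, n)) < 0.05      # keep ~5%: sparse local connectivity

def module_step(x):
    return np.maximum(0.0, W @ x)   # linear mix, then per-element nonlinearity

# A whole fast recognition pass is only ~10-15 such steps:
x = rng.normal(size=n)
for _ in range(15):
    x = module_step(x)
```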

(Concept mapping the cortex, from this research page)
The image above is from the paper "A Continuous Semantic Space Describes the Representation of Thousands of Object and Action Categories across the Human Brain" by Huth et al.[5] They used fMRI to record activity across the cortex while subjects watched annotated video clips, and then used that data to find out roughly what types of concepts each voxel of cortex responds to. It correctly identifies the FFA region as specializing in people-face things and the PPA as specializing in man-made objects and buildings. A limitation of the above image visualizations is that they don't show response variance or breadth, so the voxel colors are especially misleading for lower level cortical regions that represent generic local features (such as gabor edges in V1).
The power of analogical reasoning depends entirely on the formation of efficient conceptual maps that carve reality at the joints. The visual pathway learns a conceptual hierarchy that builds up objects from their parts: a series of hierarchical has-a relationships encoded in the connections between V1, V2, V4 and so on. Meanwhile the semantic clustering within individual cortical maps allows for fast computations of is-a relationships through simple local pooling filters.
An individual person can be encoded as a specific active subnetwork in the face region, and simple pooling over a local cluster of neurons across the face region can then compute the presence of a face in general. Smaller local pooling filters with more specific shapes can then compute the presence of a female or male face, and so on - all starting from the full specific feature encoding.
The pooling filter concept has been extensively studied in the lower levels of the visual system, where 'complex' cells higher up in V1 pool over 'simple' cell features: abstracting away gabor edges at specific positions to get edges OR'd over a range of positions (CNNs use this same technique to gain invariance to small local translations).
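A minimal sketch of the pooling idea, with hand-made responses standing in for real simple cells:

```python
import numpy as np

# 'Complex cell' pooling: max over position-specific responses gives a
# position-tolerant detector. Values are illustrative.
simple_cells = np.array([0.1, 0.9, 0.2, 0.0])  # same edge at 4 positions
complex_cell = simple_cells.max()  # fires if the edge appears anywhere
# Shift the input by one position: the pooled response barely changes.
shifted = np.roll(simple_cells, 1)
print(complex_cell, shifted.max())  # both 0.9: invariance to small shifts
```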
This key semantic organization principle is used throughout the cortex: is-a relations and more general abstractions/invariances are computed through fast local intramodule connections that exploit the physical semantic clustering on the cortical surface, and more complex has-a relations and arbitrary transforms (ex: mapping between an eye centered coordinate basis and a body centered coordinate basis) are computed through intermodule connections (which also exploit physical clustering).
The Hippocampal Association Engine

The hippocampus is a tubular, seahorse-shaped module located in the center of the brain, to the exterior side of the central structures (basal ganglia, thalamus). It is the brain's associative database and search engine, responsible for storing, retrieving, and consolidating patterns and declarative memories (those which we are consciously aware of and can verbally declare) over long time scales beyond the reach of short term memory in the cortex itself.
A human (or animal) unfortunate enough to suffer complete loss of hippocampal functionality basically loses the ability to form and consolidate new long term episodic and semantic memories. They also lose more recent memories that have not yet been consolidated down the cortical hierarchy. In rats and humans, problems in the hippocampal complex can also lead to spatial navigation impairments (forgetting current location or recent path), as the HC is used to compute and retrieve spatial map information associated with current sensory impressions (a specific instance of the HC's more general function).
In terms of module connectivity, the hippocampal complex sits on top of the cortical sensory hierarchy. It receives inputs from a number of cortical modules, largely in the nearby associative cortex, which collectively provide a summary of the recent sensory stream and overall brain state. The HC then has several sub circuits which further compress the mental summary into something like a compact key which is then sent into a hetero-auto-associative memory circuit to find suitable matches.
If a good match is found, it can then cause retrieval: reactivation of the cortical subnetworks that originally formed the memory. As the hippocampus can't know for sure which memories will be useful in the future, it tends to store everything with emphasis on the recent, perhaps as a sort of slow exponentially fading stream. Each memory retrieval involves a new decoding and encoding to drive learning in the cortex through distillation/consolidation/retraining (this also helps prevent ontological crisis). The amygdala is a little cap on the edge of the hippocampus which connects to the various emotion subsystems and helps estimate the importance of current memories for prioritization in the HC.
A very strong retrieval of an episodic memory causes the inner experience of reliving the past (or imagining the future), but more typical weaker retrievals (those which load information into the cortex without overriding much of the existing context) are a crucial component in general higher cognition.
In short the computation that the HC performs is that of dynamic association between the current mental pattern/state loaded into short term memory across the cortex and some previous mental pattern/state. This is the very essence of creative insight.
Associative recall can be viewed as a type of pattern recognition with the attendant familiar tradeoffs between precision/recall or sensitivity/specificity. At the extreme of low recall high precision the network is very conservative and risk averse: it only returns high confidence associations, maximizing precision at the expense of recall (few associations found, many potentially useful matches are lost). At the other extreme is the over-confident crazy network which maximizes recall at the expense of precision (many associations are made, most of which are poor). This can also be viewed in terms of the exploitation vs exploration tradeoff.
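A small sketch of the tradeoff, with random vectors standing in for stored memory patterns and a similarity threshold standing in for the network's selectivity:

```python
import numpy as np

rng = np.random.default_rng(1)
memories = rng.normal(size=(1000, 64))               # stored patterns
memories /= np.linalg.norm(memories, axis=1, keepdims=True)

def recall(cue, threshold):
    """Return indices of all stored patterns 'similar enough' to the cue."""
    cue = cue / np.linalg.norm(cue)
    return np.flatnonzero(memories @ cue > threshold)

cue = memories[42].copy()
cue[32:] = 0.0   # a partial cue: half the pattern is missing
print(recall(cue, 0.6))        # conservative: few, high-confidence matches
print(len(recall(cue, 0.2)))   # 'loose' network: many matches, mostly spurious
```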
This general analogy or framework - although oversimplified - also provides a useful perspective for understanding both schizotypy and hallucinogenic drugs. There is a large body of accumulated evidence in the form of use cases or trip reports, with a general consensus that hallucinogens can provide occasional flashes of creative insight at the expense of pushing one farther towards madness.
From a skeptical stance, using hallucinogenic drugs in an attempt to improve the mind is like doing surgery with butter-knives. Nonetheless, careful exploration of the sanity border can help one understand more on how the mind works from the inside.
Cannabis in particular is believed - by many of its users - to enhance creativity via occasional flashes of insight. Most of its main mental effects: time dilation, random associations, memory impairment, spatial navigation impairment, etc appear to involve the hippocampus. We could explain much of this as a general shift in the precision/recall tradeoff to make the hippocampus less selective. Mainly that makes the HC just work less effectively, but it also can occasionally lead to atypical creative insights, and appears to elevate some related low level measures such as schizotypy and divergent thinking[7]. The tradeoff is one must be willing to first sift through a pile of low value random associations.
Cultivate memetic heterogeneity and heterozygosity
Fluid intelligence is obviously important, but in many endeavors net creativity is even more important.
Of all the components underlying creativity, improving the efficiency of learning, the quality of knowledge learned, and the organizational efficiency of one's internal cortical maps are probably the most profitable dimensions of improvement: the low hanging fruits.
Our learning process is largely automatic and subconscious: we do not need to teach children how to perceive the world. But this just means it takes some extra work to analyze the underlying machinery and understand how best to utilize it.
Over long time scales humanity has learned a great deal about how to improve on natural innate learning: education is more or less learning-engineering. The first obvious lesson from education is the need for curriculum: acquiring concepts in stages of escalating complexity and order-dependency (which of course is now increasingly a thing in machine learning).
In most competitive creative domains, formal education can only train you up to the starting gate. This of course is to be expected, for the creation of novel and useful ideas requires uncommon insights.
Memetic evolution is similar to genetic evolution in that novelty comes more from recombination than mutation. We can draw some additional practical lessons from this analogy: cultivate memetic heterogeneity and heterozygosity.
The first part - cultivate memetic heterogeneity - should be straightforward, but it is worth examining some examples. If you possess only the same baseline memetic population as your peers, then the chances of your mind evolving truly novel creative combinations are substantially diminished. You have no edge - your insights are likely to be common.
To illustrate this point, let us consider a few examples:
Geoffrey Hinton is one of the most successful researchers in machine learning - which itself is a diverse field. He first formally studied psychology, and then artificial intelligence. His roughly 200 research publications integrate ideas from statistics, neuroscience, and physics. His work on Boltzmann machines and variants in particular imports concepts from statistical physics whole cloth.
Before founding DeepMind (now one of the premier DL research groups in the world), Demis Hassabis studied the brain and hippocampus in particular at the Gatsby Computational Neuroscience Unit, and before that he worked for years in the video game industry after studying computer science.
Before the Annus Mirabilis, Einstein worked at the patent office for four years, during which time he was exposed to a large variety of ideas relating to the transmission of electric signals and electrical-mechanical synchronization of time, core concepts which show up in his later thought experiments.[8]
Creative people also tend to have a diverse social circle of creative friends to share and exchange ideas across fields.
Genetic heterozygosity is the quality of having two different alleles at a gene locus; summed over the organism this leads to a different but related concept of diversity.
Within developing fields of knowledge we often find key questions or subdomains for which there are multiple competing hypotheses or approaches. Good old fashioned AI vs Connectionism, Ray tracing vs Rasterization, and so on.
In these scenarios, it is almost always better to understand both viewpoints or knowledge clusters - at least to some degree. Each cluster is likely to have some unique ideas which are useful for understanding the greater truth or at the very least for later recombination.
This then is memetic heterozygosity. It invokes the Jain version of the blind men and the elephant.
Construct and maintain clean conceptual taxonomies
Formal education has developed various methods and rituals which have been found to be effective through a long process of experimentation. Some of these techniques are still quite useful for autodidacts.
When one sets out to learn, it is best to start with a clear goal. The goal of high school is just to provide a generalist background. In college one then chooses a major suitable for a particular goal cluster: do you want to become a computer programmer? a physicist? a biologist? etc. A significant amount of work then goes into structuring a learning curriculum most suitable for these goal types.
Once out of the educational system we all end up creating our own curriculums, whether intentionally or not. It can be helpful to think strategically as if planning a curriculum to suit one's longer term goals.
For example, about four years ago I decided to learn how the brain works, and in particular how AGI could be built. When starting on this journey, I had a background mainly in computer graphics, simulation, and game related programming. I decided to focus about equally on mainstream AI, machine learning, computational neuroscience, and the AGI literature. I quickly discovered that my statistics background was a little weak, so I had to shore that up. Doing it all over again, I might have started with a statistics book. Instead I started with AI: A Modern Approach (though I mostly learn from the online research literature).
Learning works best when it is applied. Education exploits this principle and it is just as important for autodidactic learning. The best way to learn many math or programming concepts is learning by doing, where you create reasonable subtasks or subgoals for yourself along the way.
For general knowledge, application can take the form of writing about what you have learned. Academics are doing this all the time as they write papers and textbooks, but the same idea applies outside of academia.
In particular a good exercise is to imagine that you need to communicate all that you have learned about the domain. Imagine that you are writing a textbook or survey paper for example, and then you need to compress all that knowledge into a summary chapter or paper, and then all of that again down into an abstract. Then actually do write up a summary - at least in the form of a blog post (even if you don't show it to anybody).
The same ideas apply on some level to giving oral presentations or just discussing what you have learned informally - all of which are also features of the academic learning environment.
Early on, your first attempts to distill what you have learned into written form will be ... poor. But doing this process forces you to attempt to compress what you have learned, and thus it helps encourage the formation of well structured concept maps in the cortex.
A well structured conceptual map can be thought of as a memetic taxonomy. The point of a taxonomy is to organize all the invariances and 'is-a' relationships between objects so that higher level inferences and transformations can generalize well across categories.
Explicitly asking questions which probe the conceptual taxonomy can help force said structure to take form. For example in computer science/programming the question: "what is the greater generalization of this algorithm?" is a powerful tool.
In some domains, it may even be possible to semi-automate or at least guide the creative process using a structured method.
For example, consider sci-fi/fantasy genre novels. Many of the great works have a general analogical structure based on real history ported over into a more exotic setting. The Foundation series uses the model of the fall of the Roman Empire. Dune is like Lawrence of Arabia in space. Stranger in a Strange Land is like the Mormon version of Jesus the space alien, but from Mars instead of Kolob. A Song of Ice and Fire is partly a fantasy port of the War of the Roses. And so on.
One could probably find some new ideas for novels just by creating and exploring a sufficiently large table of historical events and figures and comparing it to a map of the currently colonized space of ideas. Obviously having an idea for a novel is just the tiniest tip of the iceberg in the process, but a semi-formal method is interesting nonetheless for brainstorming and applies across domains (others have proposed similar techniques for generating startup ideas, for example).
Conclusion
We are born equipped with sophisticated learning machinery and yet lack innate knowledge on how to use it effectively - for this too we must learn.
The greatest constraint on creative ability is the quality of conceptual maps in the cortex. Understanding how these maps form doesn't automagically increase creativity, but it does help ground our intuitions and knowledge about learning, and could pave the way for future improved techniques.
In the meantime: cultivate memetic heterogeneity and heterozygosity, create a learning strategy, develop and test your conceptual taxonomy, continuously compress what you learn by writing and summarizing, and find ways to apply what you learn as you go.
The Brain as a Universal Learning Machine
This article presents an emerging architectural hypothesis of the brain as a biological implementation of a Universal Learning Machine. I present a rough but complete architectural view of how the brain works under the universal learning hypothesis. I also contrast this new viewpoint - which comes from computational neuroscience and machine learning - with the older evolved modularity hypothesis popular in evolutionary psychology and the heuristics and biases literature. These two conceptions of the brain lead to very different predictions for the likely route to AGI, the value of neuroscience, the expected differences between AGI and humans, and thus any consequent safety issues and dependent strategies.

(The image above is from a recent mysterious post to r/machinelearning, probably from a Google project that generates art based on a visualization tool used to inspect the patterns learned by convolutional neural networks. I am especially fond of the weird figures riding the cart in the lower left.)
- Intro: Two viewpoints on the Mind
- Universal Learning Machines
- Historical Interlude
- Dynamic Rewiring
- Brain Architecture (the whole brain in one picture and a few pages of text)
- The Basal Ganglia
- Implications for AGI
- Conclusion
Intro: Two Viewpoints on the Mind
Few discoveries are more irritating than those that expose the pedigree of ideas.
-- Lord Acton (probably)
Less Wrong is a site devoted to refining the art of human rationality, where rationality is based on an idealized conceptualization of how minds should or could work. Less Wrong and its founding sequences draw heavily on the heuristics and biases literature in cognitive psychology and related work in evolutionary psychology. More specifically, the sequences build upon a specific cluster in the space of cognitive theories, which can be identified in particular with the highly influential "evolved modularity" perspective of Cosmides and Tooby.
From Wikipedia:
Evolutionary psychologists propose that the mind is made up of genetically influenced and domain-specific[3] mental algorithms or computational modules, designed to solve specific evolutionary problems of the past.[4]
From "Evolutionary Psychology and the Emotions":[5]
An evolutionary perspective leads one to view the mind as a crowded zoo of evolved, domain-specific programs. Each is functionally specialized for solving a different adaptive problem that arose during hominid evolutionary history, such as face recognition, foraging, mate choice, heart rate regulation, sleep management, or predator vigilance, and each is activated by a different set of cues from the environment.
If you imagine these general theories or perspectives on the brain/mind as points in theory space, the evolved modularity cluster posits that much of the machinery of human mental algorithms is largely innate. General learning - if it exists at all - exists only in specific modules; in most modules learning is relegated to the role of adapting existing algorithms and acquiring data; the impact of the information environment is de-emphasized. In this view the brain is a complex messy kludge of evolved mechanisms.
The universal learning hypothesis proposes that all significant mental algorithms are learned; nothing is innate except for the learning and reward machinery itself (which is somewhat complicated, involving a number of systems and mechanisms), the initial rough architecture (equivalent to a prior over mindspace), and a small library of simple innate circuits (analogous to the operating system layer in a computer). In this view the mind (software) is distinct from the brain (hardware). The mind is a complex software system built out of a general learning mechanism.
Additional indirect support comes from the rapid unexpected success of Deep Learning[7], which is entirely based on building AI systems using simple universal learning algorithms (such as Stochastic Gradient Descent or other various approximate Bayesian methods[8][9][10][11]) scaled up on fast parallel hardware (GPUs). Deep Learning techniques have quickly come to dominate most of the key AI benchmarks including vision[12], speech recognition[13][14], various natural language tasks, and now even ATARI [15] - proving that simple architectures (priors) combined with universal learning is a path (and perhaps the only viable path) to AGI. Moreover, the internal representations that develop in some deep learning systems are structurally and functionally similar to representations in analogous regions of biological cortex[16].
To paraphrase Feynman: to truly understand something you must build it.
In this article I am going to quickly introduce the abstract concept of a universal learning machine, present an overview of the brain's architecture as a specific type of universal learning machine, and finally I will conclude with some speculations on the implications for the race to AGI and AI safety issues in particular.
Universal Learning Machines
A universal learning machine is a simple yet very powerful and general model for intelligent agents. It is an extension of a general computer - such as a Turing Machine - amplified with a universal learning algorithm. Do not view this as my 'big new theory' - it is simply an amalgamation of a set of related proposals by various researchers.
An initial untrained seed ULM can be defined by 1.) a prior over the space of models (or equivalently, programs), 2.) an initial utility function, and 3.) the universal learning machinery/algorithm. The machine is a real-time system that processes an input sensory/observation stream and produces an output motor/action stream to control the external world using a learned internal program that is the result of continuous self-optimization.
There is of course always room to smuggle in arbitrary innate functionality via the prior, but in general the prior is expected to be extremely small in bits in comparison to the learned model.
The key defining characteristic of a ULM is that it uses its universal learning algorithm for continuous recursive self-improvement with regards to the utility function (reward system). We can view this as second (and higher) order optimization: the ULM optimizes the external world (first order), and also optimizes its own internal optimization process (second order), and so on. Without loss of generality, any system capable of computing a large number of decision variables can also compute internal self-modification decisions.
Conceptually the learning machinery computes a probability distribution over program-space that is proportional to the expected utility distribution. At each timestep it receives a new sensory observation and expends some amount of computational energy to infer an updated (approximate) posterior distribution over its internal program-space: an approximate 'Bayesian' self-improvement.
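As a schematic and purely illustrative sketch of the loop described above - with toy 'programs' standing in for points in program-space and a multiplicative reweighting standing in for the approximate Bayesian update:

```python
import random

def run_ulm(prior, utility_fn, observations, lr=0.1):
    posterior = dict(prior)              # program -> weight (starts at prior)
    for obs in observations:
        # First order: sample an internal program by weight and act with it.
        programs = list(posterior)
        weights = [posterior[p] for p in programs]
        action = random.choices(programs, weights)[0](obs)
        # Second order: self-modify by shifting posterior mass toward
        # programs whose actions earn utility on this observation.
        for p in programs:
            posterior[p] *= 1.0 + lr * utility_fn(obs, p(obs))
    return posterior

# Toy world: utility rewards echoing the observation.
echo, negate = (lambda x: x), (lambda x: -x)
posterior = run_ulm({echo: 1.0, negate: 1.0},
                    lambda obs, act: 1.0 if act == obs else -0.5,
                    observations=[1, 2, 3, 4, 5])
print(posterior[echo] > posterior[negate])  # True: mass moved to 'echo'
```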
The above description is intentionally vague in the right ways to cover the wide space of possible practical implementations and current uncertainty. You could view AIXI as a particular formalization of the above general principles, although it is also as dumb as a rock in any practical sense and has other potential theoretical problems. Although the general idea is simple enough to convey in the abstract, one should beware of concise formal descriptions: practical ULMs are too complex to reduce to a few lines of math.
A ULM inherits the general property of a Turing Machine that it can compute anything that is computable, given appropriate resources. However a ULM is also more powerful than a TM. A Turing Machine can only do what it is programmed to do. A ULM automatically programs itself.
If you were to open up an infant ULM - a machine with zero experience - you would mainly just see the small initial code for the learning machinery. The vast majority of the codestore starts out empty - initialized to noise. (In the brain the learning machinery is built in at the hardware level for maximal efficiency).
Theoretical Turing Machines are all qualitatively alike, and are all qualitatively distinct from any non-universal machine. Likewise for ULMs. Theoretically a small ULM is just as general/expressive as a planet-sized ULM. In practice quantitative distinctions do matter, and can become effectively qualitative.
Just as the simplest possible Turing Machine is in fact quite simple, the simplest possible Universal Learning Machine is also probably quite simple. A couple of recent proposals for simple universal learning machines include the Neural Turing Machine[16] (from Google DeepMind), and Memory Networks[17]. The core of both approaches involve training an RNN to learn how to control a memory store through gating operations.
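The shared core of both - soft content-based addressing over a memory matrix, so every read and write stays differentiable and trainable by backprop - can be sketched as follows. Dimensions and constants are illustrative; the real NTM adds location-based addressing, sharpening, and learned erase/add vectors:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 128, 20                        # memory slots x slot width
memory = 0.01 * rng.normal(size=(N, M))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def address(key, beta=20.0):
    # Soft content-based addressing: attention weights over all slots,
    # sharper for larger beta - a differentiable 'gating' operation.
    return softmax(beta * memory @ key)

def read(key):
    return address(key) @ memory      # blended read across attended slots

def write(key, value, gate=0.5):
    w = address(key)
    # Gated soft write: move attended content toward the new value.
    memory[:] += gate * np.outer(w, value - w @ memory)

key, value = np.ones(M), np.arange(M, dtype=float)
for _ in range(30):
    write(key, value)
print(np.corrcoef(read(key), value)[0, 1])  # ~1: reads at this key now track 'value'
```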
Historical Interlude
At this point you may be skeptical: how could the brain be anything like a universal learner? What about all of the known innate biases/errors in human cognition? I'll get to that soon, but let's start by thinking of a couple of general experiments to test the universal learning hypothesis vs the evolved modularity hypothesis.
In a world where the ULH is mostly correct, what do we expect to be different than in worlds where the EMH is mostly correct?
One type of evidence that would support the ULH is the demonstration of key structures in the brain along with associated wiring such that the brain can be shown to directly implement some version of a ULM architecture.
From the perspective of the EMH, it is not sufficient to demonstrate that there are things that brains can not learn in practice - because those simply could be quantitative limitations. Demonstrating that an Intel 486 can't compute some known computable function in our lifetimes is not proof that the 486 is not a Turing Machine.
Nor is it sufficient to demonstrate that biases exist: a ULM is only 'rational' to the extent that its observational experience and learning machinery allows (and to the extent one has the correct theory of rationality). In fact, the existence of many (most?) biases intrinsically depends on the EMH - based on the implicit assumption that some cognitive algorithms are innate. If brains are mostly ULMs then most cognitive biases dissolve, or become learning biases - for if all cognitive algorithms are learned, then evidence for biases is evidence for cognitive algorithms that people haven't had sufficient time/energy/motivation to learn. (This does not imply that intrinsic limitations/biases do not exist or that the study of cognitive biases is a waste of time; rather the ULH implies that educational history is what matters most)
The genome can only specify a limited amount of information. The question is then how much of our advanced cognitive machinery for things like facial recognition, motor planning, language, logic, planning, etc. is innate vs learned. From evolution's perspective there is a huge advantage to preloading the brain with innate algorithms so long as said algorithms have high expected utility across the expected domain landscape.
On the other hand, evolution is also highly constrained in a bit coding sense: every extra bit of code costs additional energy for the vast number of cellular replication events across the lifetime of the organism. Low code complexity solutions also happen to be exponentially easier to find. These considerations seem to strongly favor the ULH but they are difficult to quantify.

Neuroscientists have long known that the brain is divided into physical and functional modules. These modular subdivisions were discovered a century ago by Brodmann. Every time neuroscientists opened up a new brain, they saw the same old cortical modules in the same old places doing the same old things. The specific layout of course varied from species to species, but the variations between individuals are minuscule. This evidence seems to strongly favor the EMH.
Throughout most of the 90's and into the 2000's, computational neuroscience models and AI were heavily influenced by - and unsurprisingly, largely supported - the EMH. Neural nets and backprop were of course known since the 1980's and worked on small problems[18], but at the time they didn't scale well - and there was no theory to suggest they ever would.
Theory of the time also suggested local minima would always be a problem (now we understand that local minima are not really the main problem[19], and modern stochastic gradient descent methods combined with highly overcomplete models and stochastic regularization[20] are effectively global optimizers that can often handle obstacles such as local minima and saddle points[21]).
The other related historical criticism rests on the lack of biological plausibility for backprop style gradient descent. (There is as of yet little consensus on how the brain implements the equivalent machinery, but target propagation is one of the more promising recent proposals[22][23].)
Many AI researchers are naturally interested in the brain, and we can see the influence of the EMH in much of the work before the deep learning era. HMAX is a hierarchical vision system developed in the late 90's by Poggio et al as a working model of biological vision[24]. It is based on a preconfigured hierarchy of modules, each of which has its own mix of innate features such as gabor edge detectors along with a little bit of local learning. It implements the general idea that complex algorithms/features are innate - the result of evolutionary global optimization - while neural networks (incapable of global optimization) use hebbian local learning to fill in details of the design.
Dynamic Rewiring
In a groundbreaking study from 2000 published in Nature, Sharma et al successfully rewired ferret retinal pathways to project into the auditory cortex instead of the visual cortex.[25] The result: auditory cortex can become visual cortex, just by receiving visual data! Not only does the rewired auditory cortex develop the specific gabor features characteristic of visual cortex; the rewired cortex also becomes functionally visual. [26] True, it isn't quite as effective as normal visual cortex, but that could also possibly be an artifact of crude and invasive brain rewiring surgery.
The ferret study was popularized by the book On Intelligence by Hawkins in 2004 as evidence for a single cortical learning algorithm. This helped percolate the evidence into the wider AI community, and thus probably helped set the stage for the deep learning movement of today. The modern view of the cortex is that of a mostly uniform set of general purpose modules which slowly become recruited for specific tasks and filled with domain specific 'code' as a result of the learning (self optimization) process.
The next key set of evidence comes from studies of atypical human brains with novel extrasensory powers. In 2009 Vuillerme et al showed that the brain could automatically learn to process sensory feedback rendered onto the tongue[27]. This research was developed into a complete device that allows blind people to develop primitive tongue based vision.
In the modern era some blind humans have apparently acquired the ability to perform echolocation (sonar), similar to cetaceans. In 2011 Thaler et al used MRI and PET scans to show that human echolocators use diverse non-auditory brain regions to process echo clicks, predominantly relying on re-purposed 'visual' cortex.[27]
The echolocation study in particular helps establish the case that the brain is actually doing global, highly nonlocal optimization - far beyond simple hebbian dynamics. Echolocation is an active sensing strategy that requires very low latency processing, involving complex timed coordination between a number of motor and sensory circuits - all of which must be learned.
Somehow the brain is dynamically learning how to use and assemble cortical modules to implement mental algorithms: everyday tasks such as visual counting, comparing images or sounds, reading, etc. - all are tasks which require simple mental programs that can shuffle processed data between modules (some or any of which can also function as short term memory buffers).
To explain this data, we should be on the lookout for a system in the brain that can learn to control the cortex - a general system that dynamically routes data between different brain modules to solve domain specific tasks.
But first let's take a step back and start with a high level architectural view of the entire brain to put everything in perspective.
Brain Architecture
Below is a circuit diagram for the whole brain. Each of the main subsystems works together and is best understood together. You can probably get a good - if extremely coarse - high level understanding of the entire brain in less than one hour.

(there are a couple of circuit diagrams of the whole brain on the web, but this is the best. From this site.)
The human brain has ~100 billion neurons and ~100 trillion synapses, but ultimately it evolved from the bottom up - from organisms with just hundreds of neurons, like the tiny brain of C. elegans.
We know that evolution is code complexity constrained: much of the genome codes for cellular metabolism, all the other organs, and so on. For the brain, most of its bit budget needs to be spent on all the complex neuron, synapse, and even neurotransmitter level machinery - the low level hardware foundation.
For a tiny brain with 1000 neurons or less, the genome can directly specify each connection. As you scale up to larger brains, evolution needs to create vastly more circuitry while still using only about the same amount of code/bits. So instead of specifying connectivity at the neuron layer, the genome codes connectivity at the module layer. Each module can be built from simple procedural/fractal expansion of progenitor cells.
So the size of a module has little to nothing to do with its innate complexity. The cortical modules are huge - V1 alone contains 200 million neurons in a human - but there is no reason to suspect that V1 has greater initial code complexity than any other brain module. Big modules are built out of simple procedural tiling patterns.
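As a back-of-envelope illustration of this bit-budget argument (my own arithmetic; the module count is a made-up order-of-magnitude figure), compare the cost of specifying wiring per synapse versus per module:

```python
import math

neurons = 1e11   # rough human neuron count
synapses = 1e14  # rough human synapse count
modules = 1000   # hypothetical order-of-magnitude module count

# Direct encoding: each synapse needs ~log2(neurons) bits to name its target.
per_synapse_bits = synapses * math.log2(neurons)
print(f"per-synapse wiring: ~{per_synapse_bits:.1e} bits")  # ~3.7e15 bits

# Module-level encoding: an inter-module connectivity matrix plus a small
# procedural growth rule (assume ~10 kbits) per module.
per_module_bits = modules**2 + modules * 1e4
print(f"per-module wiring:  ~{per_module_bits:.1e} bits")   # ~1.1e7 bits

# The entire genome is only ~6e9 bits, so only the second scheme fits.
```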
Very roughly the brain's main modules can be divided into six subsystems (there are numerous smaller subsystems):
- The neocortex: the brain's primary computational workhorse (blue/purple modules at the top of the diagram). Kind of like a bunch of general purpose FPGA coprocessors.
- The cerebellum: another set of coprocessors with a simpler feedforward architecture. Specializes more in motor functionality.
- The thalamus: the orangish modules below the cortex. Kind of like a relay/routing bus.
- The hippocampal complex: the apex of the cortex, and something like the brain's database.
- The amygdala and limbic reward system: these modules specialize in something like the value function.
- The Basal Ganglia (green modules): the central control system, similar to a CPU.
In the interest of space/time I will focus primarily on the Basal Ganglia and will just touch on the other subsystems very briefly and provide some links to further reading.
The neocortex has been studied extensively and is the main focus of several popular books on the brain. Each neocortical module is a 2D array of neurons (technically 2.5D, with a depth of a few dozen neurons arranged in 5 to 6 layers).
Each cortical module is something like a general purpose RNN (recurrent neural network) with 2D local connectivity. Each neuron connects to its neighbors in the 2D array. Each module also has nonlocal connections to other brain subsystems, and these connections follow the same local 2D connectivity pattern, in some cases with some simple affine transformations. Convolutional neural networks use the same general architecture (but they are typically not recurrent).
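As a rough computational sketch of such a module (my own illustration; sizes and weights are arbitrary), consider a 2D sheet of units, each recurrently driven by its 3x3 neighborhood through shared local weights, plus a topographically mapped external input:

```python
import numpy as np

H = W = 32
rng = np.random.default_rng(0)
state = np.zeros((H, W))
w = rng.normal(0, 0.1, (3, 3))  # shared local recurrent weights

def step(state, inp):
    new = np.zeros_like(state)
    pad = np.pad(state, 1)
    for dy in range(3):  # accumulate the 3x3 neighborhood contributions
        for dx in range(3):
            new += w[dy, dx] * pad[dy:dy + H, dx:dx + W]
    return np.tanh(new + inp)  # topographic external drive

for _ in range(10):
    state = step(state, rng.normal(0, 1, (H, W)))
```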
Cortical modules - like artificial RNNs - are general purpose and can be trained to perform various tasks. There are a huge number of models of the cortex, varying across the tradeoff between biological realism and practical functionality.
Perhaps surprisingly, any of a wide variety of learning algorithms can reproduce cortical connectivity and features when trained on appropriate sensory data[27]. This is a computational proof of the one-learning-algorithm hypothesis; furthermore it illustrates the general idea that data determines functional structure in any general learning system.
There is evidence that cortical modules learn automatically (unsupervised) to some degree, and there is also some evidence that cortical modules can be trained to relearn data from other brain subsystems - namely the hippocampal complex. The dark knowledge distillation technique in ANNs[28][29] is a potential natural analog/model of hippocampus -> cortex knowledge transfer.
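For reference, a minimal sketch of the distillation loss from the dark knowledge papers (the hippocampus -> cortex analogy is the speculation here, not the loss itself): the student is trained to match the teacher's temperature-softened output distribution.

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    e = np.exp(z - np.max(z))  # numerically stable softmax
    return e / e.sum()

def distill_loss(student_logits, teacher_logits, T=4.0):
    p_teacher = softmax(teacher_logits, T)  # softened "dark knowledge" targets
    p_student = softmax(student_logits, T)
    return -np.sum(p_teacher * np.log(p_student))  # cross-entropy

print(distill_loss(np.array([1.0, 0.5, -1.0]), np.array([2.0, 1.0, -2.0])))
```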
Module connections are bidirectional, and feedback connections (from high level modules to low level) outnumber forward connections. We can speculate that something like target propagation could also be used to guide or constrain the development of cortical maps.
The hippocampal complex is the root or top level of the sensory/motor hierarchy. This short youtube video gives a good seven minute overview of the HC. It is like a spatiotemporal database. It receives compressed scene descriptor streams from the sensory cortices, it stores this information in medium-term memory, and it supports later auto-associative recall of these memories. Imagination and memory recall seem to be basically the same.
The 'scene descriptors' take the sensible form of things like 3D position and camera orientation, as encoded in place, grid, and head direction cells. This is basically the logical result of compressing the sensory stream, comparable to the networking data stream in a multiplayer video game.
Imagination/recall is basically just the reverse of the forward sensory coding path - in reverse mode a compact scene descriptor is expanded into a full imagined scene. Imagined/remembered scenes activate the same cortical subnetworks that originally formed the memory (or would have if the memory was real, in the case of imagined recall).
The amygdala and associated limbic reward modules are rather complex, but look something like the brain's version of the value function for reinforcement learning. These modules are interesting because they clearly rely on learning, but clearly the brain must specify an initial version of the value/utility function that has some minimal complexity.
As an example, consider taste. Infants are born with basic taste detectors and a very simple initial value function for taste. Over time the brain receives feedback from digestion and various estimators of general mood/health, and it uses this to refine the initial taste value function. Eventually the adult sense of taste becomes considerably more complex. Acquired tastes for bitter substances - such as coffee and beer - are good examples.
The amygdala appears to do something similar for emotional learning. For example, infants are born with a simple version of a fear response, which is later refined through reinforcement learning. The amygdala sits on the end of the hippocampus, and it is also involved heavily in memory processing.
See also these two videos from Khan Academy: one on the limbic system and amygdala (10 mins), and another on the midbrain reward system (8 mins).

The Basal Ganglia
The Basal Ganglia is a weird looking complex of structures located in the center of the brain. It is a conserved structure found in all vertebrates, which suggests a core functionality. The BG is proximal to and connects heavily with the midbrain reward/limbic systems. It also connects to the brain's various modules in the cortex/hippocampus, thalamus, and the cerebellum - basically everything.
All of these connections form recurrent loops between associated compartmental modules in each structure: thalamocortical/hippocampal-cerebellar-basal_ganglial loops.


Just as the cortex and hippocampus are subdivided into modules, there are corresponding modular compartments in the thalamus, basal ganglia, and the cerebellum. The set of modules/compartments in each main structure are all highly interconnected with their correspondents across structures, leading to the concept of distributed processing modules.
Each DPM forms a recurrent loop across brain structures (the local networks in the cortex, BG, and thalamus are also locally recurrent, whereas those in the cerebellum are not). These recurrent loops are mostly separate, but each sub-structure also provides different opportunities for inter-loop connections.
The BG appears to be involved in essentially all higher cognitive functions. Its core functionality is action selection via subnetwork switching. In essence action selection is the core problem of intelligence, and it is also general enough to function as the building block of all higher functionality. A system that can select between motor actions can also select between tasks or subgoals. More generally, low level action selection can easily form the basis of a Turing Machine via selective routing: deciding where to route the output of thalamocortical-cerebellar modules (some of which may specialize in short term memory as in the prefrontal cortex, although all cortical modules have some short term memory capability).
There are now a number of computational models for the Basal Ganglia-Cortical system that demonstrate possible biologically plausible implementations of the general theory[28][29]; integration with the hippocampal complex leads to larger-scale systems which aim to model/explain most of higher cognition in terms of sequential mental programs[30] (of course fully testing any such models awaits sufficient computational power to run very large-scale neural nets).
For an extremely oversimplified model of the BG as a dynamic router, consider an array of N distributed modules controlled by the BG system. The BG control network expands these N inputs into an NxN matrix. There are N^2 potential intermodular connections, each of which can be individually controlled. The control layer reads a compressed, downsampled version of each module's hidden units as its main input, and is also recurrent. Each output node in the BG has a multiplicative gating effect which selectively enables/disables an individual intermodular connection. If the control layer is naively fully connected, this would require (N^2)^2 = N^4 connections, which is only feasible for N ~ 100 modules, but sparse connectivity can substantially reduce those numbers.
It is unclear (to me) whether the BG actually implements NxN style routing as described above, or something more like 1xN or Nx1 routing, but there is general agreement that it implements cortical routing.

Of course in actuality the BG architecture is considerably more complex, as it also must implement reinforcement learning, and the intermodular connectivity map itself is also probably quite sparse/compressed (the BG may not control all of cortex, certainly not at a uniform resolution, and many controlled modules may have a very limited number of allowed routing decisions). Nonetheless, the simple multiplicative gating model illustrates the core idea.
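Here is a minimal code sketch of that toy router (mirroring the description above; all sizes are arbitrary placeholders and the weights are untrained):

```python
import numpy as np

N, D, C = 8, 64, 16  # modules, units per module, compressed state size
rng = np.random.default_rng(0)

modules = rng.normal(0, 1, (N, D))           # hidden states of N modules
W_down = rng.normal(0, 0.1, (D, C))          # downsamples each module's state
W_ctrl = rng.normal(0, 0.1, (N * C, N * N))  # naively dense control net

def route(modules):
    compressed = (modules @ W_down).reshape(-1)       # BG's compressed input
    gates = 1 / (1 + np.exp(-(compressed @ W_ctrl)))  # N*N gate values in (0,1)
    G = gates.reshape(N, N)
    # each module receives the gated sum of every module's output
    return np.tanh(G @ modules)

modules = route(modules)  # one routing step
```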
This same multiplicative gating mechanism is the core principle behind the highly successful LSTM (Long Short-Term Memory)[30] units that are used in various deep learning systems. The simple version of the BG's gating mechanism can be considered a wide, parallel, hierarchical extension of the basic LSTM architecture: a parallel array of N memory cells instead of 1, where each memory cell is a large vector instead of a single scalar value.
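For comparison, a sketch of the standard LSTM cell update, which applies the same multiplicative gating to a single memory vector:

```python
import numpy as np

def lstm_cell(c, f, i, g, o):
    c_new = f * c + i * g       # forget/input gates modulate the memory cell
    h_new = o * np.tanh(c_new)  # output gate modulates the readout
    return c_new, h_new
```

In the BG analogy above, you would have N such memory cells updated in parallel, with the gates computed by a shared hierarchical controller rather than per-cell weights.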
The main advantage of the BG architecture is parallel hierarchical approximate control: it allows a large number of hierarchical control loops to update and influence each other in parallel. It also reduces the huge complexity of general routing across the full cortex down into a much smaller-scale, more manageable routing challenge.
Implications for AGI
These two conceptions of the brain - the universal learning machine hypothesis and the evolved modularity hypothesis - lead to very different predictions for the likely route to AGI, the expected differences between AGI and humans, and thus any consequent safety issues and strategies.
In the extreme case imagine that the brain is a pure ULM, such that the genetic prior information is close to zero or is simply unimportant. In this case it is vastly more likely that successful AGI will be built around designs very similar to the brain, as the ULM architecture in general is the natural ideal, vs the alternative of having to hand engineer all of the AI's various cognitive mechanisms.
In reality learning is computationally hard, and any practical general learning system depends on good priors to constrain the learning process (essentially taking advantage of previous knowledge/learning). The recent and rapid success of deep learning is strong evidence for how much prior information is ideal: just a little. The prior in deep learning systems takes the form of a compact, small set of hyperparameters that control the learning process and specify the overall network architecture (an extremely compressed prior over the network topology and thus the program space).
The ULH suggests that most everything that defines the human mind is cognitive software rather than hardware: the adult mind (in terms of algorithmic information) is 99.999% a cultural/memetic construct. Obviously there are some important exceptions: infants are born with some functional but very primitive sensory and motor processing 'code'. Most of the genome's complexity is used to specify the learning machinery, and the associated reward circuitry. Infant emotions appear to simplify down to a single axis of happy/sad; differentiation into the more subtle vector space of adult emotions does not occur until later in development.
If the mind is software, and if the brain's learning architecture is already universal, then AGI could - by default - end up with a similar distribution over mindspace, simply because it will be built out of similar general purpose learning algorithms running over the same general dataset. We already see evidence for this trend in the high functional similarity between the features learned by some machine learning systems and those found in the cortex.
Of course an AGI will have little need for some specific evolutionary features: emotions that are subconsciously broadcast via the facial muscles are a quirk unnecessary for an AGI - but that is a rather specific detail.
The key takeaway is that the data is what matters - and in the end it is all that matters. Train a universal learner on image data and it just becomes a visual system. Train it on speech data and it becomes a speech recognizer. Train it on ATARI and it becomes a little gamer agent.
Train a universal learner on the real world in something like a human body and you get something like the human mind. Put a ULM in a dolphin's body and echolocation is the natural primary sense, put a ULM in a human body with broken visual wiring and you can also get echolocation.
Control over training is the most natural and straightforward way to control the outcome.
To create a superhuman AI driver, you 'just' need to create a realistic VR driving sim and then train a ULM in that world (better training and the simple power of selective copying leads to superhuman driving capability).
So to create benevolent AGI, we should think about how to create virtual worlds with the right structure, how to educate minds in those worlds, and how to safely evaluate the results.
One key idea - which I proposed five years ago - is that the AI should not know it is in a sim.
New AI designs (world design + architectural priors + training/education system) should be tested first in the safest virtual worlds: which, simplifying, means low-tech worlds without computer technology. Design combinations that work well in safe low-tech sandboxes are promoted to less safe high-tech VR worlds, and then finally to the real world.
A key principle of a secure code sandbox is that the code you are testing should not be aware that it is in a sandbox. If you violate this principle then you have already failed. Yudkowsky's AI box thought experiment assumes the violation of the sandbox security principle a priori and thus is something of a distraction. (The virtual sandbox idea was most likely discussed elsewhere previously, as Yudkowsky indirectly critiques a strawman version of the idea via this sci-fi story.)
The virtual sandbox approach also combines nicely with invisible thought monitors, where the AI's thoughts are automatically dumped to searchable logs.
Of course we will still need a solution to the value learning problem. The natural route with brain-inspired AI is to learn the key ideas behind value acquisition in humans to help derive an improved version of something like inverse reinforcement learning and/or imitation learning[31] - an interesting topic for another day.
Conclusion
Ray Kurzweil has been predicting for decades that AGI will be built by reverse engineering the brain, and this particular prediction is not especially unique - this has been a popular position for quite a while. My own investigation of neuroscience and machine learning led me to a similar conclusion some time ago.
The recent progress in deep learning, combined with the emerging modern understanding of the brain, provides further evidence that AGI could arrive around the time when we can build and train ANNs with similar computational power, as measured very roughly in terms of neuron/synapse counts. In general the evidence from the last four years or so supports Hanson's viewpoint from the Foom debate. More specifically, his general conclusion:
Future superintelligences will exist, but their vast and broad mental capacities will come mainly from vast mental content and computational resources. By comparison, their general architectural innovations will be minor additions.
The ULH supports this conclusion.
Current ANN engines can already train and run models with around 10 million neurons and 10 billion (compressed/shared) synapses on a single GPU, which suggests that the goal could soon be within the reach of a large organization. Furthermore, Moore's Law for GPUs still has some steam left, and software advances are currently improving simulation performance at a faster rate than hardware. These trends imply that anthropomorphic/neuromorphic AGI could be surprisingly close, and may appear suddenly.
What kind of leverage can we exert on a short timescale?
[Link] Word-vector based DL system achieves human parity in verbal IQ tests
A research team in China has created a system for answering verbal analogy questions of the type found on the GRE and IQ tests that scores a little above the average human score, perhaps corresponding to an IQ of around 105 or so. This improves substantially on the reported SOTA in AI for these types of problems.
This work builds on deep word-vector embeddings which have led to large gains in translation and many NLP tasks. One of their key improvements involves learning multiple vectors per word, where the number of specific word meanings is simply grabbed from a dictionary. This is important because verbal analogy questions often use more rare word meanings. They also employ modules specialized for the different types of questions.
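A toy sketch of the multi-vector idea (my own illustration; the paper's models and data differ): score a word pair by the best-matching combination of per-sense vectors, so that a rare sense is not swamped by the dominant one.

```python
import numpy as np

def cos(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# hypothetical per-sense embeddings: "bank" gets two vectors, one per sense
senses = {
    "bank": [np.array([1.0, 0.1]), np.array([0.1, 1.0])],  # finance, river
    "money": [np.array([0.9, 0.1])],
    "river": [np.array([0.1, 0.9])],
}

def best_similarity(w1, w2):
    # max over sense combinations instead of a single averaged vector
    return max(cos(u, v) for u in senses[w1] for v in senses[w2])

print(best_similarity("bank", "money"))  # high: matches the finance sense
print(best_similarity("bank", "river"))  # high: matches the river sense
```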
I vaguely remember reading that AI systems are already fairly strong at solving visual Raven's-matrix style IQ questions, although I haven't looked into that in detail.
The multi-vector technique is probably the most important takeaway for future work.
Even if subsequent follow up work reaches superhuman verbal IQ in a few years, this of course doesn't immediately imply AGI. These types of IQ tests measure specific abilities which are correlated with general intelligence in humans, but these specific abilities are only a small subset of the systems/abilities required for general intelligence, and probably rely on a smallish subset of the brain's circuitry.
Resolving the Fermi Paradox: New Directions
Our sun appears to be a typical star: unremarkable in age, composition, galactic orbit, or even in its possession of many planets. Billions of other stars in the Milky Way have similar general parameters and orbits that place them in the galactic habitable zone. Extrapolations of recent exoplanet surveys reveal that most stars have planets, removing yet another potential unique dimension for a great filter in the past.
According to Google, there are 20 billion earth-like planets in the Galaxy.
A paradox indicates a flaw in our reasoning or our knowledge, which upon resolution, may cause some large update in our beliefs.
Ideally we could resolve this through massive multiscale Monte Carlo computer simulations to approximate Solomonoff Induction on our current observational data. If we survive and create superintelligence, we will probably do just that.
In the meantime, we are limited to constrained simulations, Fermi estimates, and other shortcuts to approximate the ideal Bayesian inference.
The Past
While there is still obvious uncertainty concerning the likelihood of the series of transitions along the path from the formation of an earth-like planet around a sol-like star up to an early tech civilization, the general direction of the recent evidence flow favors a strong Mediocrity Principle.
Here are a few highlight developments from the last few decades relating to an early filter:
- The time window between the formation of earth and the earliest life has been narrowed to a brief interval. Panspermia has also gained ground, with some recent complexity arguments favoring a common origin of life around 9 billion years ago.[1]
- Discovery of various extremophiles indicate life is robust to a wider range of environments than the norm on earth today.
- Advances in neuroscience and studies of animal intelligence lead to the conclusion that the human brain is not nearly as unique as once thought. It is just an ordinary scaled up primate brain, with a cortex enlarged to 4x the size of a chimpanzee's. Elephants and some cetaceans have cortical neuron counts similar to the chimpanzee, and demonstrate similar or greater levels of intelligence in terms of rituals, problem solving, tool use, communication, and even understanding rudimentary human language. Elephants, cetaceans, and primates are widely separated lineages, indicating robustness and inevitability in the evolution of intelligence.
The Future(s)
When modelling the future development of civilization, we must recognize that the future is a vast cloud of uncertainty compared to the past. The best approach is to focus on the most key general features of future postbiological civilizations, categorize the full space of models, and then update on our observations to determine what ranges of the parameter space are excluded and which regions remain open.
An abridged taxonomy of future civilization trajectories:
Collapse/Extinction:
Civilization is wiped out by an existential catastrophe that sterilizes the planet sufficiently to kill most large multicellular organisms, essentially resetting the evolutionary clock by a billion years. Given the potential dangers of nanotech/AI/nuclear weapons - and later aliens - I believe this possibility is significant, i.e. in the 1% to 50% range.
Biological/Mixed Civilization:
This is the old-skool sci-fi scenario. Humans or our biological descendants expand into space. AI is developed but limited to human intelligence, like C-3PO. No or limited uploading.
This leads to slow colonization, terraforming, and perhaps eventually Dyson spheres, etc.
This scenario is almost not worth mentioning: prior < 1%. Unfortunately SETI in its current form is still predicated on a world model that assigns a high prior to these futures.
PostBiological Warm-tech AI Civilization:
This is Kurzweil/Moravec's sci-fi scenario. Humans become postbiological, merging with AI through uploading. We become a computational civilization that then spreads out at some fraction of the speed of light to turn the galaxy into computronium. This particular scenario is based on the assumption that energy is a key constraint, and that civilizations are essentially stellavores which harvest the energy of stars.
One of the very few reasonable assumptions we can make about any superintelligent postbiological civilization is that higher intelligence involves increased computational efficiency. Advanced civs will upgrade into physical configurations that maximize computational capabilities given the local resources.
Thus to understand the physical form of future civs, we need to understand the physical limits of computation.
One key constraint is the Landauer Limit, which states that the erasure (or cloning) of one bit of information requires a minimum of kT ln 2 joules. At room temperature (293 K), this corresponds to a minimum of about 0.017 eV to erase one bit. Minimum is however the keyword here: at the limit, the probability of the erasure succeeding is only 50%. Reliable erasure requires some large multiple of the minimal expenditure - a reasonable estimate is on the order of 1 eV (several tens of kT) per bit erasure at today's levels of reliability.
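A quick check of these numbers (my own arithmetic): kT ln 2 per bit erasure at a few temperatures, in eV:

```python
import math

k_eV = 8.617e-5  # Boltzmann constant in eV/K

def landauer_eV(T):
    return k_eV * T * math.log(2)  # minimum energy per bit erasure

print(landauer_eV(293))   # ~0.0175 eV at room temperature
print(landauer_eV(2.7))   # ~1.6e-4 eV at the cosmic background temperature
print(landauer_eV(0.01))  # ~6.0e-7 eV at 0.01 K
print(landauer_eV(293) / landauer_eV(2.7))   # ~108x gain from cooling to CMB
print(landauer_eV(293) / landauer_eV(0.01))  # ~29,300x gain at 0.01 K
```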
Now, the second key consideration is that Landauer's Limit does not include the cost of interconnect, which already dominates the energy cost in modern computing. Just moving bits around dissipates energy.
Moore's Law is approaching its asymptotic end in a decade or so due to these hard physical energy constraints and the related miniaturization limits.
I assign a prior to the warm-tech scenario that is about the same as my estimate of the probability that the more advanced cold-tech (reversible quantum computing, described next) is impossible: < 10%.
From Warm-tech to Cold-tech
There is a way forward to vastly increased energy efficiency, but it requires reversible computing (to increase the ratio of computations to bit erasures) and fully superconducting interconnect to reduce interconnect losses to near zero.
The path to enormously more powerful computational systems necessarily involves transitioning to very low temperatures, and the lower the better, for several key reasons:
- There is the obvious immediate gain from lowering the cost of bit erasures: a bit erasure at room temperature costs about 100 times more than a bit erasure at the cosmic background temperature, and roughly 30,000 times more than an erasure at 0.01 K (the current achievable limit for large objects).
- Low temperatures are required for most superconducting materials regardless.
- The delicate coherence required for practical quantum computation requires, or at least works best at, ultra-low temperatures.
Assuming large scale quantum computing is possible, then the ultimate computer is thus a reversible massively entangled quantum device operating at absolute zero. Unfortunately, such a device would be delicate to a degree that is hard to imagine - even a single misplaced high energy particle could cause enormous damage.
Stellar Escape Trajectories
The Great Game
If two civs both discover each other's locations around the same time, then MAD (mutually assured destruction) dynamics take over and cooperation has stronger benefits. The vast distances involved suggest that one-sided discoveries are more likely.
Spheres of Influence
Conditioning on our Observational Data
Observational Selection Effects
All advanced civs will have strong instrumental reasons to employ deep simulations to understand and model developmental trajectories for the galaxy as a whole and for civilizations in particular. A very likely consequence is the production of large numbers of simulated conscious observers, à la the Simulation Argument. Universes with the more advanced low temperature reversible/quantum computing civilizations will tend to produce many more simulated observer moments and are thus intrinsically more likely than one would otherwise expect - perhaps massively so.
Rogue Planets
We estimate that there may be up to ∼10^5 compact objects in the mass range 10^−8 to 10^−2 M⊙ per main sequence star that are unbound to a host star in the Galaxy. We refer to these objects as nomads; in the literature a subset of these are sometimes called free-floating or rogue planets.
Although the error range is still large, it appears that free floating planets outnumber planets bound to stars, and perhaps by a rather large margin.
Assuming the galaxy is colonized: It could be that rogue planets form naturally outside of stars and then are colonized. It could be they form around stars and then are ejected naturally (and colonized). Artificial ejection - even if true - may be a rare event. Or not. But at least a few of these options could potentially be differentiated with future observations - for example if we find an interesting discrepancy in the rogue planet distribution predicted by simulations (which obviously do not yet include aliens!) and actual observations.
Also: if rogue planets outnumber stars by a large margin, then it follows that rogue planet flybys are more common in proportion.
Conclusion
SETI to date allows us to exclude some regions of the parameter space for alien civs, but the regions excluded correspond to low prior probability models anyway, based on the postbiological perspective on the future of life. The most interesting regions of the parameter space probably involve advanced stealthy aliens in the form of small compact cold objects floating in the interstellar medium.
The upcoming WFIRST telescope should shed more light on dark matter and enhance our microlensing detection abilities significantly. Sadly, its planned launch date isn't until 2024. Space development is slow.
Transhumanist Nationalism and AI Politics
From this article by Zoltan Istvan on the looming global AI arms race:
As the 2016 US Presidential candidate for the Transhumanist Party, I don't mind going out on a limb and saying the obvious: I also want AI to belong exclusively to America. Of course, I would hope to share the nonmilitary benefits and wisdom of a superintelligence with the world, as America has done for much of the last century with its groundbreaking innovation and technology. But can you imagine for a moment if AI was developed and launched in, let's say, North Korea, or Iran, or increasingly authoritarian Russia? What if another national power told that superintelligence to break all the secret codes and classified material that America's CIA and NSA use for national security? What if this superintelligence was told to hack into the mainframe computers tied to nuclear warheads, drones, and other dangerous weaponry? What if that superintelligence was told to override all traffic lights, power grids, and water treatment plants in Europe? Or Asia? Or everywhere in the world except for its own country? The possible danger is overwhelming.
Now, to some extent I expect many Americans, on reflection, would at least partly agree with the above statement - and that should be concerning.
Consider the issue from the perspective of Russian, Chinese (or really any foreign) readers with similar levels of national pride.
One equivalent, positionally reflected statement from a foreign perspective might read like this:
I also want AI to belong exclusively to China. Of course, I would hope to share the nonmilitary benefits and wisdom of a superintelligence with the world, as China has done for much of this century with its groundbreaking innovation and technology. But can you imagine for a moment if AI was developed and launched by, let's say, the US NSA, or Israel, or India? ...
On a related note, there was an interesting panel recently with Robin Li (CEO of Baidu), Bill Gates, and Elon Musk. They spent a little time discussing AI superintelligence. Robin Li mentioned that his new head of research - Andrew Ng - doesn't believe superintelligence is an immediate threat. In particular Ng said: "Worrying about AI risk now is like worrying about overpopulation on Mars." Li also mentioned that he has been advocating for a large Chinese government investment in AI.
Resurrection through simulation: questions of feasibility, desirability and some implications
Could a future superintelligence bring back the already dead? This discussion came up a while back (and see the somewhat related); I'd like to resurrect the topic because ... it's potentially quite important.
Algorithmic resurrection is a possibility if we accept the same computational patternist view of identity that suggests cryonics and uploading will work. I see this as the only consistent view of my observations, but if you don't buy this argument/belief set then the rest may not be relevant.
The general implementation idea is to run a forward simulation over some portion of earth's history, constrained to enforce compliance with all recovered historical evidence. The historical evidence would consist mainly of all the scanned brains and the future internet.
The thesis is that to the extent that you can retrace historical reality complete with simulated historical people and their thoughts, memories, and emotions, to this same extent you actually recreate/resurrect the historical people.
So the questions are: is it feasible? Is it desirable/ethical/utility-efficient? And finally, why might this matter?
Simulation Feasibility
A few decades ago Pong was a technical achievement; now we have Avatar. The trajectory suggests we are on track for photorealistic simulations fairly soon (decades). Offline graphics for film are arguably already photoreal, real-time rendering is close behind, and the biggest remaining problem is the uncanny valley - which really is just the AI problem by another name. Once we solve that (which we are assuming), the Matrix follows. Superintelligences could help.
There are some general theorems in computer graphics that suggest that simulating an observer optimized world requires resources only in proportion to the observational power of the observers. Video game and film renderers in fact already rely heavily on this strategy.
Criticism from Chaos: We can't even simulate the weather more than a few weeks in advance.
Response: Simulating the exact future state of specific chaotic systems may be hard, but simulating chaotic systems in general is not. In this case we are not simulating the future state, but the past. We already know something of the past state of the system, to some level of detail, and we can simulate the likely (or multiple likely) paths within this configuration space, filling in detail.
Physical Reversibility Criticism: The AI would have to rewind time, it would have to know the exact state of every atom on earth and every photon that has left earth.
Response: Yes, the most straightforward brute-force way to infer the past state of earth would be to compute the reverse of all physical interactions, and this would require ridiculously impractical amounts of information and computation. But the best algorithm for a given problem is usually not brute force. The specifying data of a human mind is infinitesimal in comparison, and even a random guessing algorithm would probably require fewer resources than fully reversing history.
Constrained simulation converges much faster to perfectly accurate recovery, but by no means is full perfect recovery even required for (partial) success. The patternist view of identity is fluid and continuous.
If resurrecting a specific historical person is better than creating a hypothetical person, then creating a somewhat historical person is also better - and the closer the match, the better.
Simulation Ethics
Humans appear to value other humans, but each human appears to value some more than others. In general humans value themselves the most, then kin and family, followed by past contacts, tribal affiliations, and the vaguely similar.
We can generalize this as a valuation in person-space which peaks at the self identity-pattern and then declines in some complex fashion as we move away to more distant locales and less related people.
If we extrapolate this to a future where humans have the power to create new humans and/or recreate past humans, we can infer that the distribution of created people may follow the self-centered valuation distribution.
Thus recreating specific ancestors or close relations is better than recreating vaguely historical people which is better than creating non-specific people in general.
Suffering Criticism: An ancestral simulation would recreate a huge amount of suffering.
Response: Humans suffer and live in a world that seems to suffer greatly, and yet very few humans prefer non-existence over their suffering. Evolution culls existential pessimists.
Recreating a past human will recreate their suffering, but it could also grant them an afterlife filled with tremendous joy. The relatively small finite suffering may not add up to much in this consideration. The initial suffering could even enhance, by contrast, the subsequent elevation to a joyful state - but this is speculative.
The utilitarian calculus seems to be: create non-suffering generic people whom we value somewhat less, vs recreate initially suffering specific historical people whom we value more. In some cases (such as lost loved ones), the moral calculus weighs heavily in favor of recreating specific people. Many other historicals may be brought along for the ride.
Closed Loops
The vast majority of the hundred billion or so humans who have ever lived share the singular misfortune of simply being born too early in earth's history to be saved by cryonics and uploading.
Recreating history up to 2012 would require one hundred billion virtual brains. Simulating history into the phase when uploading and virtual brains become common could vastly increase the simulation costs.
The simulations have the property that they become more accurate as time progresses. If a person is cryonically preserved and then scanned and uploaded, this provides exact information, and the simulations will converge to perfect accuracy at that particular moment in time. In addition, the cryonic brain will be unconscious and inactive for a stretch.
Thus the moment of biological death, even if the person is cryonically preserved, could be an opportune time to recycle simulation resources, as there is no loss of unique information (threads converged).
How would such a scenario affect the Simulation Argument? It would seem to shift probabilities such that more (most?) observer moments are in pre-uploading histories, rather than in posthuman timelines. I find this disquieting for some reason, even though I don't suspect it will affect my observational experience.
The Generalized Anti-Pascal Principle: Utility Convergence of Infinitesimal Probabilities
Edit: Added clarification of the limit in response to gwern's comment.
For recent examples, see this post by MileyCyrus, or this post from XiXiDu (where I reply with unbounded utility functions, which is not the general solution).
I encountered this issue again while reading through a fascinating discussion thread on John Baez's blog from earlier this year where Greg Egan jumped in with a "Yudkowsky/Bostrom" criticism:
The Yudkowsky/Bostrom strategy is to contrive probabilities for immensely unlikely scenarios, and adjust the figures until the expectation value for the benefits of working on — or donating to — their particular pet projects exceed the benefits of doing anything else. Combined with the appeal to vanity of “saving the universe”, some people apparently find this irresistible, but frankly, their attempt to prescribe what rational altruists should be doing with their time and money is just laughable, and it’s a shame you’ve given it so much air time.
In short, Egan is indirectly accusing SIAI and FHI of Pascal's Mugging (among other things): something serious indeed. Egan in particular presents the following (presumably Yudkowsky) quote as evidence:
Anyway: In terms of expected utility maximization, even large probabilities of jumping the interval between a universe-history in which 95% of existing biological species survive Earth’s 21st century, versus a universe-history where 80% of species survive, are just about impossible to trade off against tiny probabilities of jumping the interval between interesting universe-histories, versus boring ones where intelligent life goes extinct, or the wrong sort of AI self-improves.
Yudkowsky responds with his Pascal's Wager Fallacy Fallacy, and points out that in fact he agrees there is no case for investing in defense against highly improbable existential risks:
And I don’t think the odds of us being wiped out by badly done AI are small. I think they’re easily larger than 10%. And if you can carry a qualitative argument that the probability is under, say, 1%, then that means AI is probably the wrong use of marginal resources – not because global warming is more important, of course, but because other ignored existential risks like nanotech would be more important. I am not trying to play burden-of-proof tennis. If the chances are under 1%, that’s low enough, we’ll drop the AI business from consideration until everything more realistic has been handled.
The rest of the thread makes for an entertaining read, but the takeaway I'd like to focus on is the original source of Egan's criticism: the apparent domination of immensely unlikely scenarios of immensely high utility.
It occurred to me that the expected value of any action - properly summed over subsets of integrated futures - necessarily converges to zero as the probability of those considered subsets goes to zero. Critically this convergence occurs for *all* utility functions, as it is not dependent on any particular utility assignments. Alas LW is vast enough that there may be little new left under the sun: In researching this idea, I encountered an earlier form of it in a post by SilasBart here, as well as some earlier attempts by RichardKennaway, Komponisto, and jimrandomh.
Now that we've covered the background, I'll jump to the principle:
The Infinitesimal Probability Utility Convergence Principle (IPUP): For any action A, utility function U, and subset of possible post-action futures F, EU(F) -> 0 as p(F) -> 0.
In Pascal's Mugging scenarios we are considering possible scenarios (futures) that have some low probability. It is important to remember that rational agents compute expected reward over all possible futures, not just the one scenario we may be focusing on.
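In symbols (my restatement, not a formal proof): writing the contribution of a future-set F to the expected utility of action A as

$$EU(F) = \sum_{f \in F} p(f)\, U(f),$$

the claim is that as $p(F) \to 0$ the futures in F become dominated by chaotic universes whose rewards are, by symmetry, as likely to be hugely negative as hugely positive, so the signed sum cancels toward zero rather than being dominated by any single astronomical-utility story.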
The principle can be formalized in the theoretical context of perfect omniscience-approaching agents running on computers approaching infinite power.
The AIXI formalization provides a simple mathematical model of such agents. Its single-line equation has a concise English summary:
If the environment is modeled by a deterministic program q, then the future perceptions o_k r_k ... o_m r_m = U(q, a_1..a_m) can be computed, where U is a universal (monotone Turing) machine executing q given a_1..a_m. Since q is unknown, AIXI has to maximize its expected reward, i.e. average r_k + ... + r_m over all possible future perceptions created by all possible environments q that are consistent with past perceptions. The simpler an environment, the higher is its a-priori contribution 2^-l(q), where simplicity is measured by the length l of program q. AIXI effectively learns by eliminating Turing machines q once they become inconsistent with the progressing history. Since noisy environments are just mixtures of deterministic environments, they are automatically included.
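For reference, the single-line equation being summarized (in Hutter's notation, with k the current step and m the horizon) is:

$$a_k := \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m} \left[ r_k + \cdots + r_m \right] \sum_{q\,:\,U(q, a_1 \ldots a_m) = o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}$$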
AIXI is just a mathematical equation. We must be very careful in mapping it to abstract scenarios lest we lose much in translation. It is best viewed as a family of agent-models; the reward observations it seeks to maximize could be anything.
When one ponders "What would AIXI/Omega do?", there are a couple of key points to keep in mind:
- AIXI-like models (probably) simulate the entire infinitely branching multiverse from the beginning of time to infinity (as particular simulation programs). This is often lost in translation.
- AIXI-like models compute (1) - the infinite totality of existence - not once, but for each of an infinite number of programs (corresponding to what we would call universal physics: theories of everything) in parallel. Thus AIXI computes (in parallel) the entire Tegmark multiverse: every possible universe that could exist in principle.
- AIXI 'learns' by eliminating sub-universes (and theories) that do not perfectly agree with its observation history to date. Of course this is only ever a finite reduction; it never collapses the multiverse from an infinite set into a finite set.
- AIXI finally picks the action A that maximizes expected reward. It computes this measure by summing, for each observation-valid universe (computed by a particular theory-program, as in (1)) in the multiverse ensemble (2), the total accumulated reward in the sub-universes branching off from that action, weighted by a scoring term for each valid universe that decreases with the negative exponent of the theory's program length.
In other words, the perfectly rational agent considers everything that could possibly happen as a consequence of its action in every possible universe it could be in, weighted by an exponential penalty against high-complexity universes.
Here is a sketch of how the limit convergence (IPUP above) can be derived: When considering a possible action A, such as giving $5 to a Pascal Mugger, an optimal agent considers all possible dependent futures for all possible physics-universes. As we advance into scenarios of infinitesimal probability, we are advancing up the complexity ladder into increasingly chaotic universes which feature completely random rewards which approach positive/negative infinity. As we advance into this regime of infinitesimal probability, causality itself breaks down completely and expected reward of any action goes to zero.
The convergence principle can be derived from the program length prior 2^-l(q). An agent which has accumulated P perception bits so far can fully explain those perceptions with completely random programs of length P, so 2^-P forms a probability limit at which the agent's perceptions start becoming irrelevant and chaotic, non-causal physics dominate. Chaos should dominate expected reward for actions where p(A) << 2^-P.
Thinking as limited humans, we impose abstractions and collapse all extremely similar (to us) futures: all the tiny random quantum-dependent variations of a particular future corresponding to "giving the Mugger $5" get collapsed into a single set of futures, to which we assign a probability by counting the subinstances in that set as a fraction of the whole.
AIXI does not do this: it actually computes each individual future path.
But as we can't hope to think that way, we have to think in terms of probability categorizations. Fine. Imagine collapsing any futures that are sufficiently indistinguishable that humans would consider them identical: described by the same natural language. We then get subsets of futures to which we assign probabilities as relative size measures.
Now consider ranking all of those future-sets in decreasing probability order. Most of the early list is dominated by scenarios where the Mugger is joking/lying/crazy/etc. Farther down the list you get into scenarios where we do live in a multi-level Simulation (AIXI only ever considers itself to be in some simulation), but the Mugger is still joking/lying/crazy/etc.
By the time you get down the list to scenarios described where the Mugger says "Or else I will use my magic powers from outside the Matrix to run a Turing machine that simulates and kills 3^^^^3 people" and what the Mugger says actually happens, we are almost certainly down in infinitesimal probability land.
Infinitesimal probability land is a weird place. It is a regime where the physics that we commonly accept is wrong - which is to say simply that the exponential complexity penalty no longer rules out ultra-complex universes. It is dominated by chaos: universes of every possible fancy, where nothing is what it seems, where everything you ever thought is completely wrong, where there is no causality, etc.
At the complete limit of improbability, we just get universes where our entire observation history is completely random - generated by programs more complex than our observations. You give the mugger $5 and the universe simply dissolves in white noise and nothing happens (or god appears and gives you infinite heaven, or infinite hell, or the speed of light goes to zero, or a black hole forms near your nose, or the Mugger turns into jellybeans, etc. etc., an infinite number of stories, over which the net reward summation necessarily collapses to zero.)
Remember that AIXI doesn't consider the mugger's words as 'evidence'; they are simply observations. In the more complex universes they are completely devoid of meaning, as causality itself collapses.
Feasibility of Creating Non-Human or Non-Sentient Machine Intelligence
What defines a human mind? Or a sentient mind in general?
From a computational perspective, a human mind is one of some particular class of complex programs within the overall large space of general intelligences, or minds in general.
The most succinct delineator of the human sub-category of minds is simply that of having a human ontology. This entails information such as: a large body of embodiment-derived knowledge we call common sense, one or more human languages, memories, beliefs, ideas, values, and so on.
Take a human infant - for a dramatic example, a genetic clone of Einstein. Raise this clone among wild animals and the developmental result is not anything remotely resembling Einstein; in fact it is not a human mind at all, but something much closer to a primate mind. The brain is the hardware, the mind is software, and the particular mind of Einstein was the unique result of a particular mental developmental history and observation sequence.
If the mind is substrate independent, this raises the question of to what extent it is also algorithm independent. If an AGI has a full human ontology, on what basis can we say that it is not human? If it can make the same inferences on the same knowledge base, understands one of our languages, and has similar memories, ideas, and values, in what way is it not human?
Substrate and algorithm independence show us that it doesn't really matter in the slightest *how* something thinks internally; all that matters is the end functional behavior, the end decisions.
Surely there are some classes of AGI designs that would exhibit thought patterns and behaviors well outside the human norm, but these crucially all involve changing some aspect of the knowledge base. For example, AGIs based solely on reinforcement learning algorithms would appear to be incapable of abstract model-based value decisions. This would show up as a glaring contrast between the AGI's decisions and its linguistically demonstrable understanding of terms such as 'value', 'good', 'moral', and so on. Of course humans' actual decisions are often at odds with their stated values in a similar fashion, but most humans make important decisions they believe are 'good' most of the time.
A reinforcement-learning based AGI with no explicit connection between value concepts such as 'good' and its reward-maximizing utility function would not necessarily be inclined to make 'good' decisions. It may even be completely aware of this feature of its design and be quite capable of verbalizing it.
But RL techniques are just one particular class of algorithms, and we can probably do much better with designs that form model-based utility functions which actually incorporate high order learned values encoded into the ontology itself.
Such a design would make 'good' decisions, and if truly of human or surpassing intelligence, would be fully capable of learning, refining, and articulating complex ethical/moral frameworks which in turn refine its internal concept of 'good'. It would consistently do 'good' things, and would be fully capable of explaining why they were good.
Naturally the end values and thus decisions of any such system would depend on what human knowledge it learned - what it read or absorbed, and in what order - but is that really different from any of the alternatives?
And how could we say that such a system would not be human? How does one make a non-arbitrary division between a human-capable AGI and say a human upload?