The Brain as a Universal Learning Machine

jacob_cannell, 24 June 2015 09:45PM

This article presents an emerging architectural hypothesis of the brain as a biological implementation of a Universal Learning Machine.  I present a rough but complete architectural view of how the brain works under the universal learning hypothesis.  I also contrast this new viewpoint - which comes from computational neuroscience and machine learning - with the older evolved modularity hypothesis popular in evolutionary psychology and the heuristics and biases literature.  These two conceptions of the brain lead to very different predictions for the likely route to AGI, the value of neuroscience, the expected differences between AGI and humans, and thus any consequent safety issues and dependent strategies.

Art generated by an artificial neural net

(The image above is from a recent mysterious post to r/machinelearning, probably from a Google project that generates art based on a visualization tool used to inspect the patterns learned by convolutional neural networks.  I am especially fond of the weird figures riding the cart in the lower left.)

  1. Intro: Two viewpoints on the Mind
  2. Universal Learning Machines
  3. Historical Interlude
  4. Dynamic Rewiring
  5. Brain Architecture (the whole brain in one picture and a few pages of text)
  6. The Basal Ganglia
  7. Implications for AGI
  8. Conclusion


Intro: Two Viewpoints on the Mind

Few discoveries are more irritating than those that expose the pedigree of ideas.

-- Lord Acton (probably)

Less Wrong is a site devoted to refining the art of human rationality, where rationality is based on an idealized conceptualization of how minds should or could work.  Less Wrong and its founding sequences draw heavily on the heuristics and biases literature in cognitive psychology and related work in evolutionary psychology.  More specifically the sequences build upon a specific cluster in the space of cognitive theories, which can be identified in particular with the highly influential "evolved modularity" perspective of Cosmides and Tooby.

From Wikipedia:

Evolutionary psychologists propose that the mind is made up of genetically influenced and domain-specific[3] mental algorithms or computational modules, designed to solve specific evolutionary problems of the past.[4] 

From "Evolutionary Psychology and the Emotions":[5]

An evolutionary perspective leads one to view the mind as a crowded zoo of evolved, domain-specific programs.  Each is functionally specialized for solving a different adaptive problem that arose during hominid evolutionary history, such as face recognition, foraging, mate choice, heart rate regulation, sleep management, or predator vigilance, and each is activated by a different set of cues from the environment.

If you imagine these general theories or perspectives on the brain/mind as points in theory space, the evolved modularity cluster posits that much of the machinery of human mental algorithms is largely innate.  General learning - if it exists at all - exists only in specific modules; in most modules learning is relegated to the role of adapting existing algorithms and acquiring data; the impact of the information environment is de-emphasized.  In this view the brain is a complex messy kludge of evolved mechanisms.

There is another viewpoint cluster, more popular in computational neuroscience (especially today), that is almost the exact opposite of the evolved modularity hypothesis.  I will rebrand this viewpoint the "universal learner" hypothesis, aka the "one learning algorithm" hypothesis (the rebranding is justified mainly by the inclusion of some newer theories and evidence for the basal ganglia as a 'CPU' which learns to control the cortex).  The roots of the universal learning hypothesis can be traced back to Mountcastle's discovery of the simple uniform architecture of the cortex.[6]

The universal learning hypothesis proposes that all significant mental algorithms are learned; nothing is innate except for the learning and reward machinery itself (which is somewhat complicated, involving a number of systems and mechanisms), the initial rough architecture (equivalent to a prior over mindspace), and a small library of simple innate circuits (analogous to the operating system layer in a computer).  In this view the mind (software) is distinct from the brain (hardware).  The mind is a complex software system built out of a general learning mechanism.

Put simply, the main difference between these viewpoints is the relative quantity of domain specific mental algorithmic information specified in the genome vs that acquired through general purpose learning during the organism's lifetime.  Evolved modules vs learned modules.

When you have two hypotheses or viewpoints that are almost complete opposites this is generally a sign that the field is in an early state of knowledge; further experiments typically are required to resolve the conflict.

It has been about 25 years since Cosmides and Tooby began to popularize the evolved modularity hypothesis.  A number of key neuroscience experiments have been performed since then which support the universal learning hypothesis (reviewed later in this article).  

Additional indirect support comes from the rapid unexpected success of Deep Learning[7], which is entirely based on building AI systems using simple universal learning algorithms (such as Stochastic Gradient Descent or other various approximate Bayesian methods[8][9][10][11]) scaled up on fast parallel hardware (GPUs).  Deep Learning techniques have quickly come to dominate most of the key AI benchmarks including vision[12], speech recognition[13][14], various natural language tasks, and now even ATARI [15] - proving that simple architectures (priors) combined with universal learning is a path (and perhaps the only viable path) to AGI. Moreover, the internal representations that develop in some deep learning systems are structurally and functionally similar to representations in analogous regions of biological cortex[16].

To paraphrase Feynman: to truly understand something you must build it. 

In this article I am going to quickly introduce the abstract concept of a universal learning machine, present an overview of the brain's architecture as a specific type of universal learning machine, and finally I will conclude with some speculations on the implications for the race to AGI and AI safety issues in particular.

Universal Learning Machines

A universal learning machine is a simple and yet very powerful and general model for intelligent agents.  It is an extension of a general computer - such as a Turing Machine - amplified with a universal learning algorithm.  Do not view this as my 'big new theory' - it is simply an amalgamation of a set of related proposals by various researchers.

An initial untrained seed ULM can be defined by 1.) a prior over the space of models (or equivalently, programs), 2.) an initial utility function, and 3.) the universal learning machinery/algorithm.  The machine is a real-time system that processes an input sensory/observation stream and produces an output motor/action stream to control the external world using a learned internal program that is the result of continuous self-optimization.

There is of course always room to smuggle in arbitrary innate functionality via the prior, but in general the prior is expected to be extremely small in bits in comparison to the learned model.

The key defining characteristic of a ULM is that it uses its universal learning algorithm for continuous recursive self-improvement with regard to the utility function (reward system).  We can view this as second (and higher) order optimization: the ULM optimizes the external world (first order), and also optimizes its own internal optimization process (second order), and so on.  Without loss of generality, any system capable of computing a large number of decision variables can also compute internal self-modification decisions.

Conceptually the learning machinery computes a probability distribution over program-space that is proportional to the expected utility distribution.  At each timestep it receives a new sensory observation and expends some amount of computational energy to infer an updated (approximate) posterior distribution over its internal program-space: an approximate 'Bayesian' self-improvement.
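
As a rough illustration of the update described above, here is a minimal sketch in Python.  Everything in it is an assumption made for illustration: the four-entry program space, the uniform prior, the 0/1 utility for a correct prediction, and the exponential reweighting rule.  It is a toy, not a claim about how a practical ULM or the brain implements the step.

```python
import math

# Hypothetical program space: each "program" predicts the next observation.
PROGRAMS = {
    "always_zero": lambda history: 0,
    "always_one": lambda history: 1,
    "repeat_last": lambda history: history[-1] if history else 0,
    "alternate": lambda history: 1 - history[-1] if history else 0,
}

def normalize(weights):
    """Rescale non-negative weights so they sum to one."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

def update(posterior, history, observation, temperature=1.0):
    """One approximate update: reweight each program by exp(utility)."""
    reweighted = {}
    for name, prob in posterior.items():
        prediction = PROGRAMS[name](history)
        # Assumed utility: 1 for a correct prediction, 0 otherwise.
        utility = 1.0 if prediction == observation else 0.0
        reweighted[name] = prob * math.exp(utility / temperature)
    return normalize(reweighted)

# Uniform prior over the program space (an assumption).
posterior = normalize({name: 1.0 for name in PROGRAMS})
history = []
for obs in [0, 1, 0, 1, 0, 1]:
    posterior = update(posterior, history, obs)
    history.append(obs)

best = max(posterior, key=posterior.get)
print(best, round(posterior[best], 3))
```

On this alternating stream the probability mass concentrates on the `alternate` program, because it is the only one that earns utility at every step.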

The above description is intentionally vague in the right ways to cover the wide space of possible practical implementations and current uncertainty.  You could view AIXI as a particular formalization of the above general principles, although it is also as dumb as a rock in any practical sense and has other potential theoretical problems.  Although the general idea is simple enough to convey in the abstract, one should beware of concise formal descriptions: practical ULMs are too complex to reduce to a few lines of math.

A ULM inherits the general property of a Turing Machine that it can compute anything that is computable, given appropriate resources.  However a ULM is also more powerful than a TM in a practical sense.  A Turing Machine can only do what it is programmed to do.  A ULM automatically programs itself.

If you were to open up an infant ULM - a machine with zero experience - you would mainly just see the small initial code for the learning machinery.  The vast majority of the codestore starts out empty - initialized to noise.  (In the brain the learning machinery is built in at the hardware level for maximal efficiency).

Theoretical Turing Machines are all qualitatively alike, and are all qualitatively distinct from any non-universal machine.  Likewise for ULMs.  Theoretically a small ULM is just as general/expressive as a planet-sized ULM.  In practice quantitative distinctions do matter, and can become effectively qualitative.

Just as the simplest possible Turing Machine is in fact quite simple, the simplest possible Universal Learning Machine is also probably quite simple.  A couple of recent proposals for simple universal learning machines include the Neural Turing Machine[16] (from Google DeepMind), and Memory Networks[17].  The core of both approaches involves training an RNN to learn how to control a memory store through gating operations.
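
To make "controlling a memory store through gating operations" concrete, here is a small sketch of a soft content-based read and a gated write.  It is not the Neural Turing Machine or Memory Networks implementation: the function names, the `sharpness` parameter, and the hand-chosen addresses and gate values are all assumptions, and in a real system a trained controller network would produce them.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def read(memory, key, sharpness=5.0):
    """Soft content-based read: attention-weighted average of memory rows."""
    weights = softmax([sharpness * dot(row, key) for row in memory])
    width = len(memory[0])
    return [sum(w * row[i] for w, row in zip(weights, memory)) for i in range(width)]

def write(memory, address_weights, value, gate):
    """Gated write: each row moves toward `value` by gate * address weight."""
    return [
        [(1 - gate * w) * cell + gate * w * v for cell, v in zip(row, value)]
        for w, row in zip(address_weights, memory)
    ]

memory = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
memory = write(memory, [1.0, 0.0, 0.0], [1.0, 0.0], gate=1.0)   # store in slot 0
memory = write(memory, [0.0, 1.0, 0.0], [0.0, 1.0], gate=1.0)   # store in slot 1
blocked = write(memory, [0.0, 0.0, 1.0], [9.0, 9.0], gate=0.0)  # closed gate: no change
recalled = read(memory, key=[1.0, 0.0])                         # mostly slot 0
```

Because both operations are smooth functions of the gate and address values, a controller that emits those values can in principle be trained by gradient descent, which is the property the proposals above rely on.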

Historical Interlude

At this point you may be skeptical: how could the brain be anything like a universal learner?  What about all of the known innate biases/errors in human cognition?  I'll get to that soon, but let's start by thinking of a couple of general experiments to test the universal learning hypothesis vs the evolved modularity hypothesis.

In a world where the ULH is mostly correct, what do we expect to be different than in worlds where the EMH is mostly correct?

One type of evidence that would support the ULH is the demonstration of key structures in the brain along with associated wiring such that the brain can be shown to directly implement some version of a ULM architecture.

Another type of indirect evidence that would help discriminate the two theories would be evidence that the brain is capable of general global optimization, and that complex domain specific algorithms/circuits mostly result from this process.  If on the other hand the brain is only capable of constrained/local optimization, then most of the complexity must instead be innate - the result of global optimization in evolutionary deep time.  So in essence it boils down to the optimization capability of biological learning vs biological evolution.

From the perspective of the EMH, it is not sufficient to demonstrate that there are things that brains cannot learn in practice - because those simply could be quantitative limitations.  Demonstrating that an Intel 486 can't compute some known computable function in our lifetimes is not proof that the 486 is not a Turing Machine.

Nor is it sufficient to demonstrate that biases exist: a ULM is only 'rational' to the extent that its observational experience and learning machinery allow (and to the extent one has the correct theory of rationality).  In fact, the existence of many (most?) biases intrinsically depends on the EMH - based on the implicit assumption that some cognitive algorithms are innate.  If brains are mostly ULMs then most cognitive biases dissolve, or become learning biases - for if all cognitive algorithms are learned, then evidence for biases is evidence for cognitive algorithms that people haven't had sufficient time/energy/motivation to learn.  (This does not imply that intrinsic limitations/biases do not exist or that the study of cognitive biases is a waste of time; rather the ULH implies that educational history is what matters most.)

The genome can only specify a limited amount of information.  The question is then how much of our advanced cognitive machinery for things like facial recognition, motor planning, language, logic, planning, etc. is innate vs learned.  From evolution's perspective there is a huge advantage to preloading the brain with innate algorithms so long as said algorithms have high expected utility across the expected domain landscape.  

On the other hand, evolution is also highly constrained in a bit coding sense: every extra bit of code costs additional energy for the vast number of cellular replication events across the lifetime of the organism.  Low code complexity solutions also happen to be exponentially easier to find.  These considerations seem to strongly favor the ULH but they are difficult to quantify.

Neuroscientists have long known that the brain is divided into physical and functional modules.  These modular subdivisions were discovered a century ago by Brodmann.  Every time neuroscientists opened up a new brain, they saw the same old cortical modules in the same old places doing the same old things.  The specific layout of course varied from species to species, but the variations between individuals are minuscule. This evidence seems to strongly favor the EMH.

Throughout most of the 90's and into the 2000's, computational neuroscience models and AI research were heavily influenced by - and, unsurprisingly, largely supported - the EMH.  Neural nets and backprop had of course been known since the 1980's and worked on small problems[18], but at the time they didn't scale well - and there was no theory to suggest they ever would.

Theory of the time also suggested local minima would always be a problem (now we understand that local minima are not really the main problem[19], and modern stochastic gradient descent methods combined with highly overcomplete models and stochastic regularization[20] are effectively global optimizers that can often handle obstacles such as local minima and saddle points[21]).  

The other related historical criticism rests on the lack of biological plausibility for backprop style gradient descent.  (There is as of yet little consensus on how the brain implements the equivalent machinery, but target propagation is one of the more promising recent proposals[22][23].)

Many AI researchers are naturally interested in the brain, and we can see the influence of the EMH in much of the work before the deep learning era.  HMAX is a hierarchical vision system developed in the late 90's by Poggio et al as a working model of biological vision[24].  It is based on a preconfigured hierarchy of modules, each of which has its own mix of innate features such as Gabor edge detectors along with a little bit of local learning.  It implements the general idea that complex algorithms/features are innate - the result of evolutionary global optimization - while neural networks (incapable of global optimization) use Hebbian local learning to fill in details of the design.

Dynamic Rewiring

In a groundbreaking study from 2000 published in Nature, Sharma et al successfully rewired ferret retinal pathways to project into the auditory cortex instead of the visual cortex.[25]  The result: auditory cortex can become visual cortex, just by receiving visual data!  Not only does the rewired auditory cortex develop the specific Gabor features characteristic of visual cortex; the rewired cortex also becomes functionally visual.[26]  True, it isn't quite as effective as normal visual cortex, but that could also possibly be an artifact of crude and invasive brain rewiring surgery.

The ferret study was popularized by the book On Intelligence by Hawkins in 2004 as evidence for a single cortical learning algorithm.  This helped percolate the evidence into the wider AI community, and thus probably helped in setting up the stage for the deep learning movement of today.  The modern view of the cortex is that of a mostly uniform set of general purpose modules which slowly become recruited for specific tasks and filled with domain specific 'code' as a result of the learning (self optimization) process.

The next key set of evidence comes from studies of atypical human brains with novel extrasensory powers.  In 2009 Vuillerme et al showed that the brain could automatically learn to process sensory feedback rendered onto the tongue[27].  This research was developed into a complete device that allows blind people to develop primitive tongue based vision.

In the modern era some blind humans have apparently acquired the ability to perform echolocation (sonar), similar to cetaceans.  In 2011 Thaler et al used MRI and PET scans to show that human echolocators use diverse non-auditory brain regions to process echo clicks, predominantly relying on re-purposed 'visual' cortex.[27] 

The echolocation study in particular helps establish the case that the brain is actually doing global, highly nonlocal optimization - far beyond simple Hebbian dynamics.  Echolocation is an active sensing strategy that requires very low latency processing, involving complex timed coordination between a number of motor and sensory circuits - all of which must be learned.

Somehow the brain is dynamically learning how to use and assemble cortical modules to implement mental algorithms: everyday tasks such as visual counting, comparisons of images or sounds, reading, etc - all are tasks which require simple mental programs that can shuffle processed data between modules (some or any of which can also function as short term memory buffers).

To explain this data, we should be on the lookout for a system in the brain that can learn to control the cortex - a general system that dynamically routes data between different brain modules to solve domain specific tasks.

But first let's take a step back and start with a high level architectural view of the entire brain to put everything in perspective.

Brain Architecture

Below is a circuit diagram for the whole brain.  The main subsystems work together and are best understood together.  You can probably get a good high level extremely coarse understanding of the entire brain in less than one hour.

(there are a couple of circuit diagrams of the whole brain on the web, but this is the best.  From this site.)

The human brain has ~100 billion neurons and ~100 trillion synapses, but ultimately it evolved from the bottom up - from organisms with just hundreds of neurons, like the tiny brain of C. elegans.

We know that evolution is code complexity constrained: much of the genome codes for cellular metabolism, all the other organs, and so on.  For the brain, most of its bit budget needs to be spent on all the complex neuron, synapse, and even neurotransmitter level machinery - the low level hardware foundation.

For a tiny brain with 1000 neurons or fewer, the genome can directly specify each connection.  As you scale up to larger brains, evolution needs to create vastly more circuitry while still using only about the same amount of code/bits.  So instead of specifying connectivity at the neuron layer, the genome codes connectivity at the module layer.  Each module can be built from simple procedural/fractal expansion of progenitor cells.

So the size of a module has little to nothing to do with its innate complexity.  The cortical modules are huge - V1 alone contains 200 million neurons in a human - but there is no reason to suspect that V1 has greater initial code complexity than any other brain module.  Big modules are built out of simple procedural tiling patterns.
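
The point that module size is decoupled from code complexity can be sketched with a toy procedural generator.  The function below is hypothetical and purely illustrative: a fixed rule ("connect each unit to its neighbours within a radius") is expanded into an explicit wiring list, and the rule's description stays the same size no matter how large the sheet is.  It is not a model of actual cortical development.

```python
def tile_connections(width, height, radius=1):
    """Expand a tiny local rule into an explicit list of directed connections."""
    connections = []
    for y in range(height):
        for x in range(width):
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    nx, ny = x + dx, y + dy
                    inside = 0 <= nx < width and 0 <= ny < height
                    if (dx, dy) != (0, 0) and inside:
                        connections.append(((x, y), (nx, ny)))
    return connections

# The same three parameters describe a small sheet and a much larger one.
small = tile_connections(4, 4)
large = tile_connections(64, 64)
print(len(small), len(large))
```

The number of generated connections grows with the area of the sheet, while the "genome" here (the rule plus `width`, `height`, and `radius`) does not grow at all.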

Very roughly the brain's main modules can be divided into six subsystems (there are numerous smaller subsystems):

  • The neocortex: the brain's primary computational workhorse (blue/purple modules at the top of the diagram).  Kind of like a bunch of general purpose FPGA coprocessors.
  • The cerebellum: another set of coprocessors with a simpler feedforward architecture.  Specializes more in motor functionality. 
  • The thalamus: the orangish modules below the cortex.  Kind of like a relay/routing bus.
  • The hippocampal complex: the apex of the cortex, and something like the brain's database.
  • The amygdala and limbic reward system: these modules specialize in something like the value function.
  • The Basal Ganglia (green modules): the central control system, similar to a CPU.

In the interest of space/time I will focus primarily on the Basal Ganglia and will just touch on the other subsystems very briefly and provide some links to further reading.

The neocortex has been studied extensively and is the main focus of several popular books on the brain.  Each neocortical module is a 2D array of neurons (technically 2.5D with a depth of about a few dozen neurons arranged in about 5 to 6 layers).

Each cortical module is something like a general purpose RNN (recurrent neural network) with 2D local connectivity.  Each neuron connects to its neighbors in the 2D array.  Each module also has nonlocal connections to other brain subsystems and these connections follow the same local 2D connectivity pattern, in some cases with some simple affine transformations.  Convolutional neural networks use the same general architecture (but they are typically not recurrent).

Cortical modules - like artificial RNNs - are general purpose and can be trained to perform various tasks.  There are a huge number of models of the cortex, varying across the tradeoff between biological realism and practical functionality.

Perhaps surprisingly, any of a wide variety of learning algorithms can reproduce cortical connectivity and features when trained on appropriate sensory data[27].  This is a computational proof of the one-learning-algorithm hypothesis; furthermore it illustrates the general idea that data determines functional structure in any general learning system.

There is evidence that cortical modules learn automatically (unsupervised) to some degree, and there is also some evidence that cortical modules can be trained to relearn data from other brain subsystems - namely the hippocampal complex.  The dark knowledge distillation technique in ANNs[28][29] is a potential natural analog/model of hippocampus -> cortex knowledge transfer.
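
For readers unfamiliar with distillation, the core of the technique is that a student network is trained against the teacher's temperature-softened output distribution rather than against hard labels.  The sketch below shows only that loss, with made-up logits and an assumed temperature of 4; the mapping onto hippocampus -> cortex transfer is the speculative analogy from the paragraph above, not something the code demonstrates.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with a temperature; higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """Cross-entropy between softened teacher and softened student outputs."""
    targets = softmax(teacher_logits, temperature)
    predictions = softmax(student_logits, temperature)
    return -sum(t * math.log(p) for t, p in zip(targets, predictions))

teacher = [6.0, 2.0, 1.0]                 # hypothetical teacher logits
hard = softmax(teacher)                   # close to one-hot
soft = softmax(teacher, temperature=4.0)  # keeps information about the other classes
matched = distillation_loss(teacher, teacher)
mismatched = distillation_loss([1.0, 2.0, 6.0], teacher)
```

The softened targets carry the teacher's relative preferences among the non-winning classes, which is the "dark knowledge" the student can learn from.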

Module connections are bidirectional, and feedback connections (from high level modules to low level) outnumber forward connections.  We can speculate that something like target propagation can also be used to guide or constrain the development of cortical maps.

The hippocampal complex is the root or top level of the sensory/motor hierarchy.  This short YouTube video gives a good seven minute overview of the HC.  It is like a spatiotemporal database.  It receives compressed scene descriptor streams from the sensory cortices, it stores this information in medium-term memory, and it supports later auto-associative recall of these memories.  Imagination and memory recall seem to be basically the same.

The 'scene descriptors' take the sensible form of things like 3D position and camera orientation, as encoded in place, grid, and head direction cells.  This is basically the logical result of compressing the sensory stream, comparable to the networking data stream in a multiplayer video game.

Imagination/recall is basically just the reverse of the forward sensory coding path - in reverse mode a compact scene descriptor is expanded into a full imagined scene.  Imagined/remembered scenes activate the same cortical subnetworks that originally formed the memory (or would have if the memory was real, in the case of imagined recall).

The amygdala and associated limbic reward modules are rather complex, but look something like the brain's version of the value function for reinforcement learning.  These modules are interesting because they clearly rely on learning, but clearly the brain must specify an initial version of the value/utility function that has some minimal complexity.

As an example, consider taste.  Infants are born with basic taste detectors and a very simple initial value function for taste.  Over time the brain receives feedback from digestion and various estimators of general mood/health, and it uses this to refine the initial taste value function.  Eventually the adult sense of taste becomes considerably more complex.  Acquired tastes for bitter substances - such as coffee and beer - are good examples.
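
The taste example can be sketched as an innate value table that is refined by a simple delta rule.  The specific numbers, the learning rate, and the `feedback` signal standing in for digestion/mood estimators are all assumptions chosen for illustration; real reward learning is considerably more involved.

```python
# Hypothetical innate prior: sweet is good, bitter is bad.
innate_value = {"sweet": 1.0, "bitter": -1.0}

def refine(value, taste, feedback, learning_rate=0.1):
    """Delta rule: move the stored value a fraction of the way toward the feedback."""
    updated = dict(value)  # copy, so the innate table is left untouched
    updated[taste] += learning_rate * (feedback - updated[taste])
    return updated

value = innate_value
for _ in range(50):
    # Assumed positive after-effect of a bitter drink such as coffee.
    value = refine(value, "bitter", feedback=0.8)

print(innate_value["bitter"], round(value["bitter"], 2))
```

The innate value only sets the starting point: with enough consistent feedback the learned value for "bitter" changes sign, which is the acquired-taste effect described above.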

The amygdala appears to do something similar for emotional learning.  For example infants are born with a simple version of a fear response, which is later refined through reinforcement learning.  The amygdala sits on the end of the hippocampus, and it is also involved heavily in memory processing.

See also these two videos from Khan Academy: one on the limbic system and amygdala (10 mins), and another on the midbrain reward system (8 mins).


The Basal Ganglia


The Basal Ganglia is a weird looking complex of structures located in the center of the brain.  It is a conserved structure found in all vertebrates, which suggests a core functionality.  The BG is proximal to and connects heavily with the midbrain reward/limbic systems.  It also connects to the brain's various modules in the cortex/hippocampus, thalamus and the cerebellum . . . basically everything.

All of these connections form recurrent loops between associated compartmental modules in each structure: thalamocortical/hippocampal-cerebellar-basal_ganglial loops.


Just as the cortex and hippocampus are subdivided into modules, there are corresponding modular compartments in the thalamus, basal ganglia, and the cerebellum.  The set of modules/compartments in each main structure are all highly interconnected with their correspondents across structures, leading to the concept of distributed processing modules.

Each DPM forms a recurrent loop across brain structures (the local networks in the cortex, BG, and thalamus are also locally recurrent, whereas those in the cerebellum are not).  These recurrent loops are mostly separate, but each sub-structure also provides different opportunities for inter-loop connections.

The BG appears to be involved in essentially all higher cognitive functions.  Its core functionality is action selection via subnetwork switching.  In essence action selection is the core problem of intelligence, and it is also general enough to function as the building block of all higher functionality.  A system that can select between motor actions can also select between tasks or subgoals.  More generally, low level action selection can easily form the basis of a Turing Machine via selective routing: deciding where to route the output of thalamocortical-cerebellar modules (some of which may specialize in short term memory as in the prefrontal cortex, although all cortical modules have some short term memory capability).

There are now a number of computational models for the Basal Ganglia-Cortical system that demonstrate possible biologically plausible implementations of the general theory[28][29]; integration with the hippocampal complex leads to larger-scale systems which aim to model/explain most of higher cognition in terms of sequential mental programs[30] (of course fully testing any such models awaits sufficient computational power to run very large-scale neural nets).

For an extremely oversimplified model of the BG as a dynamic router, consider an array of N distributed modules controlled by the BG system.  The BG control network expands these N inputs into an NxN matrix.  There are N^2 potential intermodular connections, each of which can be individually controlled.  The control layer reads a compressed, downsampled version of the modules' hidden units as its main input, and is also recurrent.  Each output node in the BG has a multiplicative gating effect which selectively enables/disables an individual intermodular connection.  If the control layer is naively fully connected, this would require (N^2)^2 = N^4 connections (about 10^8 for N = 100), which is only feasible for N ~ 100 modules, but sparse connectivity can substantially reduce those numbers.
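
The oversimplified router above can be written down in a few lines.  In this sketch the gate matrix is set by hand, whereas in the model described it would be the output of a learned, recurrent control layer; each module also emits a single scalar here rather than a vector.  The function names are made up for the example.

```python
def route(outputs, gates):
    """Input to module j is the sum over i of gates[i][j] * outputs[i]."""
    n = len(outputs)
    return [sum(gates[i][j] * outputs[i] for i in range(n)) for j in range(n)]

def connection_counts(n):
    """Gated intermodular links (N^2) and a naive fully connected control layer (N^4)."""
    links = n ** 2
    return links, links ** 2

outputs = [1.0, 2.0, 3.0]   # one scalar per module (real modules emit vectors)
gates = [
    [0, 1, 0],              # module 0 -> module 1
    [0, 0, 1],              # module 1 -> module 2
    [0, 0, 0],              # module 2 is routed nowhere
]
inputs = route(outputs, gates)
links, control = connection_counts(100)
print(inputs, links, control)
```

Changing a single entry of `gates` rewires the data flow without touching the modules themselves, which is the sense in which multiplicative gating acts as routing.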

It is unclear (to me) whether the BG actually implements NxN style routing as described above, or something more like 1xN or Nx1 routing, but there is general agreement that it implements cortical routing.

Of course in actuality the BG architecture is considerably more complex, as it also must implement reinforcement learning, and the intermodular connectivity map itself is also probably quite sparse/compressed (the BG may not control all of cortex, certainly not at a uniform resolution, and many controlled modules may have a very limited number of allowed routing decisions).  Nonetheless, the simple multiplicative gating model illustrates the core idea.  

This same multiplicative gating mechanism is the core principle behind the highly successful LSTM (Long Short-Term Memory)[30] units that are used in various deep learning systems.  The simple version of the BG's gating mechanism can be considered a wider parallel and hierarchical extension of the basic LSTM architecture, where you have a parallel array of N memory cells instead of 1, and each memory cell is a large vector instead of a single scalar value.
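
For reference, here is one step of a scalar LSTM cell in the common formulation with input, forget, and output gates.  The weights are hand-set (an assumption for illustration; in practice they are learned) so that the forget gate is open and the input gate is closed, which makes the cell hold its stored value.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h, c, w):
    """One step of a scalar LSTM cell; `w` maps gate name -> (w_x, w_h, bias)."""
    def pre(name):
        w_x, w_h, bias = w[name]
        return w_x * x + w_h * h + bias

    i = sigmoid(pre("input"))        # how much new content to write
    f = sigmoid(pre("forget"))       # how much old content to keep
    o = sigmoid(pre("output"))       # how much of the cell to expose
    g = math.tanh(pre("candidate"))  # proposed new content
    c_new = f * c + i * g            # multiplicative gating of the memory cell
    h_new = o * math.tanh(c_new)
    return h_new, c_new

# Hand-set weights: keep memory (forget gate ~1), ignore input (input gate ~0).
hold = {
    "input": (0.0, 0.0, -10.0),
    "forget": (0.0, 0.0, 10.0),
    "output": (0.0, 0.0, 10.0),
    "candidate": (1.0, 0.0, 0.0),
}
h, c = 0.0, 0.7
for x in [5.0, -3.0, 2.0]:
    h, c = lstm_step(x, h, c, hold)
```

With these settings the cell state stays close to its initial value despite the changing inputs; the wider BG-style version described above would replace the scalar `c` with an array of large vectors.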

The main advantage of the BG architecture is parallel hierarchical approximate control: it allows a large number of hierarchical control loops to update and influence each other in parallel.  It also reduces the huge complexity of general routing across the full cortex down into a much smaller-scale, more manageable routing challenge.

Implications for AGI

These two conceptions of the brain - the universal learning machine hypothesis and the evolved modularity hypothesis - lead to very different predictions for the likely route to AGI, the expected differences between AGI and humans, and thus any consequent safety issues and strategies.

In the extreme case imagine that the brain is a pure ULM, such that the genetic prior information is close to zero or is simply unimportant.  In this case it is vastly more likely that successful AGI will be built around designs very similar to the brain, as the ULM architecture in general is the natural ideal, vs the alternative of having to hand engineer all of the AI's various cognitive mechanisms.

In reality learning is computationally hard, and any practical general learning system depends on good priors to constrain the learning process (essentially taking advantage of previous knowledge/learning).  The recent and rapid success of deep learning is strong evidence for how much prior information is ideal: just a little.  The prior in deep learning systems takes the form of a compact, small set of hyperparameters that control the learning process and specify the overall network architecture (an extremely compressed prior over the network topology and thus the program space).

The ULH suggests that most everything that defines the human mind is cognitive software rather than hardware: the adult mind (in terms of algorithmic information) is 99.999% a cultural/memetic construct.  Obviously there are some important exceptions: infants are born with some functional but very primitive sensory and motor processing 'code'.  Most of the genome's complexity is used to specify the learning machinery, and the associated reward circuitry.  Infant emotions appear to simplify down to a single axis of happy/sad; differentiation into the more subtle vector space of adult emotions does not occur until later in development.

If the mind is software, and if the brain's learning architecture is already universal, then AGI could - by default - end up with a similar distribution over mindspace, simply because it will be built out of similar general purpose learning algorithms running over the same general dataset.  We already see evidence for this trend in the high functional similarity between the features learned by some machine learning systems and those found in the cortex.

Of course an AGI will have little need for some specific evolutionary features: emotions that are subconsciously broadcast via the facial muscles are a quirk unnecessary for an AGI - but that is a rather specific detail.

The key takeaway is that the data is what matters - and in the end it is all that matters.  Train a universal learner on image data and it just becomes a visual system.  Train it on speech data and it becomes a speech recognizer.  Train it on ATARI and it becomes a little gamer agent.
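
As a toy illustration of the same point, the sketch below runs one unchanged learning rule on two different datasets and gets two different functions out. A two-input perceptron is of course nowhere near a universal learner; it is only an assumption-laden stand-in chosen because it fits in a few lines, and the function names are hypothetical.

```python
def train_perceptron(examples, epochs=25):
    """Train a two-input perceptron; the learning rule never changes."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in examples:
            predicted = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            error = target - predicted
            w[0] += error * x1
            w[1] += error * x2
            b += error
    return lambda x1, x2: 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_data = [(x, x[0] & x[1]) for x in inputs]  # dataset 1: logical AND
or_data = [(x, x[0] | x[1]) for x in inputs]   # dataset 2: logical OR

and_model = train_perceptron(and_data)
or_model = train_perceptron(or_data)

print([and_model(*x) for x in inputs])
print([or_model(*x) for x in inputs])
```

The code that defines the learner is identical in both calls; only the training data differs.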

Train a universal learner on the real world in something like a human body and you get something like the human mind.  Put a ULM in a dolphin's body and echolocation is the natural primary sense; put a ULM in a human body with broken visual wiring and you can also get echolocation.

Control over training is the most natural and straightforward way to control the outcome.  

To create a superhuman AI driver, you 'just' need to create a realistic VR driving sim and then train a ULM in that world (better training and the simple power of selective copying leads to superhuman driving capability).

So to create benevolent AGI, we should think about how to create virtual worlds with the right structure, how to educate minds in those worlds, and how to safely evaluate the results.

One key idea - which I proposed five years ago - is that the AI should not know it is in a sim.

New AI designs (world design + architectural priors + training/education system) should be tested first in the safest virtual worlds, which to a first approximation are simply low-tech worlds without computer technology.  Design combinations that work well in safe low-tech sandboxes are promoted to less safe high-tech VR worlds, and then finally the real world.

A key principle of a secure code sandbox is that the code you are testing should not be aware that it is in a sandbox.  If you violate this principle then you have already failed.  Yudkowsky's AI box thought experiment assumes the violation of the sandbox security principle a priori and thus is something of a distraction.  (The virtual sandbox idea was most likely discussed elsewhere previously, as Yudkowsky indirectly critiques a strawman version of the idea via this sci-fi story.)

The virtual sandbox approach also combines nicely with invisible thought monitors, where the AI's thoughts are automatically dumped to searchable logs.

Of course we will still need a solution to the value learning problem.  The natural route with brain-inspired AI is to learn the key ideas behind value acquisition in humans to help derive an improved version of something like inverse reinforcement learning and/or imitation learning[31] - an interesting topic for another day.


Ray Kurzweil has been predicting for decades that AGI will be built by reverse engineering the brain, and this particular prediction is not especially unique - this has been a popular position for quite a while.  My own investigation of neuroscience and machine learning led me to a similar conclusion some time ago.  

The recent progress in deep learning, combined with the emerging modern understanding of the brain, provides further evidence that AGI could arrive around the time when we can build and train ANNs with computational power similar to the brain's, as measured very roughly in terms of neuron/synapse counts.  In general the evidence from the last four years or so supports Hanson's viewpoint from the Foom debate.  More specifically, his general conclusion:

Future superintelligences will exist, but their vast and broad mental capacities will come mainly from vast mental content and computational resources. By comparison, their general architectural innovations will be minor additions.

The ULH supports this conclusion.

Current ANN engines can already train and run models with around 10 million neurons and 10 billion (compressed/shared) synapses on a single GPU, which suggests that the goal could soon be within the reach of a large organization.  Furthermore, Moore's Law for GPUs still has some steam left, and software advances are currently improving simulation performance at a faster rate than hardware.  These trends imply that Anthropomorphic/Neuromorphic AGI could be surprisingly close, and may appear suddenly.
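
For a rough sense of the remaining gap, the arithmetic below compares the single-GPU figures quoted above with commonly cited order-of-magnitude estimates for the human brain. The brain figures (about 86 billion neurons and about 100 trillion synapses) are assumptions brought in for illustration, not numbers from this section, and the comparison ignores the compressed/shared caveat on the GPU synapse count.

```python
# Figures quoted in the text for a single GPU.
gpu_neurons = 10e6
gpu_synapses = 10e9

# Assumed rough estimates for the human brain (not from the text).
brain_neurons = 86e9
brain_synapses = 100e12

neuron_gap = brain_neurons / gpu_neurons
synapse_gap = brain_synapses / gpu_synapses
print(f"neuron gap:  {neuron_gap:,.0f}x")
print(f"synapse gap: {synapse_gap:,.0f}x")
```

Under these assumptions the gap is on the order of several thousand single-GPU models, which is the sense in which the goal could be within reach of a large organization.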

What kind of leverage can we exert on a short timescale?


Taking the reins at MIRI

61 So8res 03 June 2015 11:52PM

Hi all. In a few hours I'll be taking over as executive director at MIRI. The LessWrong community has played a key role in MIRI's history, and I hope to retain and build your support as (with more and more people joining the global conversation about long-term AI risks & benefits) MIRI moves towards the mainstream.

Below I've cross-posted my introductory post on the MIRI blog, which went live a few hours ago. The short version is: there are very exciting times ahead, and I'm honored to be here. Many of you already know me in person or through my blog posts, but for those of you who want to get to know me better, I'll be running an AMA on the effective altruism forum at 3PM Pacific on Thursday June 11th.

I extend to all of you my thanks and appreciation for the support that so many members of this community have given to MIRI throughout the years.

continue reading »

The Unfriendly Superintelligence next door

46 jacob_cannell 02 July 2015 06:46PM

Markets are powerful decentralized optimization engines - it is known.  Liberals see the free market as a kind of optimizer run amuck, a dangerous superintelligence with simple non-human values that must be checked and constrained by the government - the friendly SI.  Conservatives just reverse the narrative roles.

In some domains, where the incentive structure aligns with human values, the market works well.  In our current framework, the market works best for producing gadgets. It does not work so well for pricing intangible information, and most specifically it is broken when it comes to health.

We treat health as just another gadget problem: something to be solved by pills.  Health is really a problem of knowledge; it is a computational prediction problem.  Drugs are useful only to the extent that you can package the results of new knowledge into a pill and patent it.  If you can't patent it, you can't profit from it.

So the market is constrained to solve human health by coming up with new patentable designs for mass-producible physical objects which go into human bodies.  Why did we add that constraint - thou shalt solve health, but thou shalt only use pills?  (Ok technically the solutions don't have to be ingestible, but that's a detail.)

The gadget model works for gadgets because we know how gadgets work - we built them, after all.  The central problem with health is that we do not completely understand how the human body works - we did not build it.  Thus we should be using the market to figure out how the body works - completely - and arguably we should be allocating trillions of dollars towards that problem.

The market optimizer analogy runs deeper when we consider the complexity of instilling values into a market.  Lawmakers cannot program the market with goals directly, so instead they attempt to engineer desirable behavior with ever more layers of constraints.  Lawmakers are deontologists.

As an example, consider the regulations on drug advertising.  Big pharma is unsafe - its profit function does not encode anything like "maximize human health and happiness" (which of course itself is an oversimplification).  If left to its own devices, there are strong incentives to sell subtly addictive drugs, to create elaborate hyped false advertising campaigns, etc.  Thus all the deontological injunctions.  I take that as a strong indicator of a poor solution - a value alignment failure.

What would healthcare look like in a world where we solved the alignment problem?

To solve the alignment problem, the market's profit function must encode long term human health and happiness.  This really is a mechanism design problem - it's not something lawmakers are even remotely trained or qualified for.  A full solution is naturally beyond the scope of a little blog post, but I will sketch out the general idea.

To encode health into a market utility function, first we create financial contracts with an expected value which captures long-term health.  We can accomplish this with a long-term contract that generates positive cash flow when a human is healthy, and negative when unhealthy - basically an insurance contract.  There is naturally much complexity in getting those contracts right, so that they measure what we really want.  But assuming that is accomplished, the next step is pretty simple - we allow those contracts to trade freely on an open market.
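
A minimal sketch of how such a contract might be valued, assuming a fixed yearly probability of being healthy, fixed yearly payments, and a constant discount rate. Every name and number here is hypothetical and chosen only to show the mechanism; a real contract would need a far richer health measure.

```python
def contract_value(p_healthy, pay_healthy, pay_unhealthy, years, discount_rate):
    """Discounted expected cash flow of a hypothetical health contract."""
    expected_annual = p_healthy * pay_healthy - (1 - p_healthy) * pay_unhealthy
    return sum(
        expected_annual / (1 + discount_rate) ** year
        for year in range(1, years + 1)
    )

# All figures below are made up for illustration.
market_price = contract_value(0.80, 1_000, 3_000, years=20, discount_rate=0.03)
after_intervention = contract_value(0.85, 1_000, 3_000, years=20, discount_rate=0.03)

print(f"value at the market's belief: {market_price:,.0f}")
print(f"value if health improves:     {after_intervention:,.0f}")
print(f"gain to the contract holder:  {after_intervention - market_price:,.0f}")
```

The point of the sketch is that anyone who can raise the holder's probability of good health, by any means and not only by selling a pill, raises the value of the contract they hold.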

There are some interesting failure modes and considerations that are mostly beyond scope but worth briefly mentioning.  This system probably needs to be asymmetric.  The transfers on poor health outcomes should partially go to cover medical payments, but it may be best to have a portion of the wealth simply go to nobody/everybody - just destroyed.

In this new framework, designing and patenting new drugs can still be profitable, but it is now put on even footing with preventive medicine.  More importantly, the market can now actually allocate the correct resources towards long term research.

To make all this concrete, let's use an example of a trillion dollar health question - one that our current system is especially ill-suited to solve:

What are the long-term health effects of abnormally low levels of solar radiation?  What levels of sun exposure are ideal for human health?

This is a big important question, and you've probably read some of the hoopla and debate about vitamin D.  I'm going to briefly summarize a general abstract theory, one that I would bet heavily on if we lived in a more rational world where such bets were possible.

In a sane world where health is solved by a proper computational market, I could make enormous - ridiculous really - amounts of money if I happened to be an early researcher who discovered the full health effects of sunlight.  I would bet on my theory simply by buying up contracts for individuals/demographics who had the most health to gain by correcting their sunlight deficiency.  I would then publicize the theory and evidence, and perhaps even raise a heap of money to create a strong marketing engine to help ensure that my investments - my patients - were taking the necessary actions to correct their sunlight deficiency.  Naturally I would use complex machine learning models to guide the trading strategy.

Now, just as an example, here is the brief 'pitch' for sunlight.

If we go back and look across all of time, there is a mountain of evidence which more or less screams - proper sunlight is important to health.  Heliotherapy has a long history.

Humans, like most mammals, and most other earth organisms in general, evolved under the sun.  A priori we should expect that organisms will have some 'genetic programs' which take approximate measures of incident sunlight as an input.  The serotonin -> melatonin mediated blue-light pathway is an example of one such light detecting circuit which is useful for regulating the 24 hour circadian rhythm.

The vitamin D pathway has existed since the time of algae such as the Coccolithophore.  It is a multi-stage pathway that can measure solar radiation over a range of temporal frequencies.  It starts with synthesis of fat-soluble cholecalciferol which has a very long half life measured in months. [1] [2]

The rough pathway is:

  • Cholecalciferol (HL ~ months) becomes
  • 25(OH)D (HL ~ 15 days) which finally becomes 
  • 1,25(OH)2 D (HL ~ 15 hours)
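
To see why a chain of half-lives can act as a detector over several timescales, the sketch below computes how much of each compound would remain some days after input stops, using plain exponential decay. It treats each stage independently and ignores conversion between stages, and it assumes "months" means roughly 60 days; both are simplifications introduced here, not claims from the cited sources.

```python
def fraction_remaining(elapsed_days, half_life_days):
    """Exponential decay: share of a compound left after elapsed_days."""
    return 0.5 ** (elapsed_days / half_life_days)

# Half-lives from the list above; "months" is assumed to mean ~60 days.
half_lives_days = {
    "cholecalciferol": 60.0,     # assumption standing in for "months"
    "25(OH)D": 15.0,
    "1,25(OH)2D": 15.0 / 24.0,   # 15 hours expressed in days
}

for days in (1, 15, 90):
    row = ", ".join(
        f"{name}: {fraction_remaining(days, hl):.3f}"
        for name, hl in half_lives_days.items()
    )
    print(f"after {days:>2} days without input -> {row}")
```

The short-lived form tracks the last day or so, the middle form tracks recent weeks, and the long-lived form carries information across a season.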

The main recognized role for this pathway in regard to human health - at least according to the current Wikipedia entry - is to enhance "the intestinal absorption of calcium, iron, magnesium, phosphate, and zinc".  Ponder that for a moment.

Interestingly, this pathway still works as a general solar clock and radiation detector for carnivores - as they can simply eat the precomputed measurement in their diet.

So, what is a long term sunlight detector useful for?  One potential application could be deciding appropriate resource allocation towards DNA repair.  Every time an organism is in the sun it is accumulating potentially catastrophic DNA damage that must be repaired when the cell next divides.  We should expect that genetic programs would allocate resources to DNA repair and various related activities dependent upon estimates of solar radiation.

I should point out - just in case it isn't obvious - that this general idea does not imply that cranking up the sunlight hormone to insane levels will lead to much better DNA/cellular repair.  There are always tradeoffs, etc.

One other obvious use of a long term sunlight detector is to regulate general strategic metabolic decisions that depend on the seasonal clock - especially for organisms living far from the equator.  During the summer when food is plentiful, the body can expect easy calories.  As winter approaches calories become scarce and frugal strategies are expected.

So first off we'd expect to see a huge range of complex effects showing up as correlations between low vit D levels and various illnesses, and specifically illnesses connected to DNA damage (such as cancer) and/or BMI.

Now it turns out that BMI itself is also strongly correlated with a huge range of health issues.  So the first key question to focus on is the relationship between vit D and BMI.  And - perhaps not surprisingly - there is pretty good evidence for such a correlation [3][4], and this has been known for a while.

Now we get into the real debate.  Numerous vit D supplement intervention studies have now been run, and the results are controversial.  In general the vit D experts (such as my father, who started the vit D council, and publishes some related research[5]) say that the only studies that matter are those that supplement at high doses sufficient to elevate vit D levels into a 'proper' range which substitutes for sunlight, which in general requires 5000 IU/day on average - depending completely on genetics and lifestyle (to the point that any one-size-fits-all recommendation is probably terrible).

The mainstream basically ignores all that and funds studies at tiny RDA doses - say 400 IU or less - and then they do meta-analysis over those studies and conclude that their big meta-analysis, unsurprisingly, doesn't show a statistically significant effect.  However, these studies still show small effects.  Often the meta-analysis is corrected for BMI, which of course also tends to remove any vit D effect, to the extent that low vit D/sunlight is a cause of both weight gain and a bunch of other stuff.
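
The statistical point about correcting for BMI can be shown with a small simulation. It assumes, purely for illustration, a causal chain in which low vit D raises BMI and BMI alone drives illness; the coefficients, sample size, and variable names are all made up, and the simulation demonstrates the mechanism rather than providing evidence for the hypothesis.

```python
import random

def ols_one(x, y):
    """Slope of y on x (simple regression with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def ols_two(x1, x2, y):
    """Coefficients of y on x1 and x2 (with intercept), via normal equations."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

rng = random.Random(0)
n = 20_000
vit_d = [rng.gauss(0, 1) for _ in range(n)]
# Assumed causal chain: low vit D raises BMI, and BMI alone drives illness.
bmi = [-0.5 * d + rng.gauss(0, 1) for d in vit_d]
illness = [0.8 * b + rng.gauss(0, 1) for b in bmi]

total_effect = ols_one(vit_d, illness)
adjusted_effect, bmi_effect = ols_two(vit_d, bmi, illness)
print(f"vit D effect, unadjusted:       {total_effect:+.2f}")
print(f"vit D effect, adjusted for BMI: {adjusted_effect:+.2f}")
```

In this setup vit D has a real effect on illness, yet the coefficient on vit D shrinks toward zero once BMI is included, because BMI is the path through which the effect travels.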

So let's look at two studies for vit D and weight loss.

First, this recent 2015 study of 400 overweight Italians (sorry the actual paper doesn't appear to be available yet) tested vit D supplementation for weight loss.  The 3 groups were (0 IU/day, ~1,000 IU / day, ~3,000 IU/day).  The observed average weight loss was (1 kg, 3.8 kg, 5.4 kg). I don't know if the 0 IU group received a placebo.  Regardless, it looks promising.
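
As a quick check on what "promising" means here, the snippet below fits a straight line through the three reported group averages. With only three group means and no variance information this says nothing about statistical significance; it is just a rough summary of the dose-response trend, and the doses are the approximate values given above.

```python
doses_iu_per_day = [0, 1_000, 3_000]   # approximate group doses from the text
weight_loss_kg = [1.0, 3.8, 5.4]       # reported group averages

n = len(doses_iu_per_day)
mean_dose = sum(doses_iu_per_day) / n
mean_loss = sum(weight_loss_kg) / n

# Ordinary least-squares slope through the three points.
slope = sum(
    (d - mean_dose) * (w - mean_loss)
    for d, w in zip(doses_iu_per_day, weight_loss_kg)
) / sum((d - mean_dose) ** 2 for d in doses_iu_per_day)

print(f"extra loss per 1,000 IU/day: {slope * 1000:.2f} kg")
```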

On the other hand, this 2013 meta-analysis of 9 studies with 1651 adults total (mainly women) supposedly found no significant weight loss effect for vit D.  However, the studies used between 200 and 1,100 IU/day, with most between 200 and 400 IU/day.  Five studies used calcium, and five showed weight loss (not necessarily the same five - unclear).  This does not show - at all - what the study claims in its abstract.

In general, medical researchers should not be doing statistics.  That is a job for the tech industry.

Now the vit D and sunlight issue is complex, and it will take much research to really work out all of what is going on.  The current medical system does not appear to be handling this well - why?  Because there is insufficient financial motivation.

Is Big Pharma interested in the sunlight/vit D question?  Well yes - but only to the extent that they can create a patentable analogue!  The various vit D analogue drugs developed or in development are evidence that Big Pharma is at least paying attention.  But assuming that the sunlight hypothesis is mainly correct, there is very little profit in actually fixing the real problem.

There is probably more to sunlight than just vit D and serotonin/melatonin.  Consider the interesting correlation between birth month and a number of disease conditions[6].  Perhaps there is a little grain of truth to astrology after all.

Thus concludes my little vit D pitch.  

In a more sane world I would have already bet on the general theory.  In a really sane world it would have been solved well before I would expect to make any profitable trade.  In that rational world you could actually trust health advertising, because you'd know that health advertisers are strongly financially motivated to convince you of things actually truly important for your health.

Instead of charging by the hour or per treatment, like a mechanic, doctors and healthcare companies should literally invest in their patients' long-term health, and profit from improvements to long term outcomes.  The sunlight-health connection is a trillion dollar question in terms of medical value, but not in terms of exploitable profits in today's reality.  In a properly constructed market, there would be enormous resources allocated to answer these questions, flowing into legions of profit motivated startups that could generate billions trading on computational health financial markets, all without selling any gadgets.

So in conclusion: the market could solve health, but only if we allowed it to and only if we set up appropriate financial mechanisms to encode the correct value function.  This is the UFAI problem next door.

MIRI's 2015 Summer Fundraiser!

39 So8res 19 August 2015 12:27AM

Our summer fundraising drive is now finished. We raised a grand total of $617,678 from 256 donors. (That total may change over the next few days if we receive contributions that were initiated before the end of the fundraiser.) This is an incredible sum, making this the biggest fundraiser we’ve ever run.

We've already been hard at work growing our research team and spinning up new projects, and I’m excited to see what our research team can do this year. Thank you to all our supporters for making our summer fundraising drive so successful!

It's safe to say that this past year exceeded a lot of people's expectations.

Twelve months ago, Nick Bostrom's Superintelligence had just come out. Questions about the long-term risks and benefits of smarter-than-human AI systems were nearly invisible in mainstream discussions of AI's social impact.

Twelve months later, we live in a world where Bill Gates is confused by why so many researchers aren't using Superintelligence as a guide to the questions we should be asking about AI's future as a field.

Following a conference in Puerto Rico that brought together the leading organizations studying long-term AI risk (MIRI, FHI, CSER) and top AI researchers in academia (including Stuart Russell, Tom Mitchell, Bart Selman, and the Presidents of AAAI and IJCAI) and industry (including representatives from Google DeepMind and Vicarious), we've seen Elon Musk donate $10M to a grants program aimed at jump-starting the field of long-term AI safety research; we've seen the top AI and machine learning conferences (AAAI, IJCAI, and NIPS) announce their first-ever workshops or discussions on AI safety and ethics; and we've seen a panel discussion on superintelligence at ITIF, the leading U.S. science and technology think tank. (I presented a paper at the AAAI workshop, I spoke on the ITIF panel, and I'll be at NIPS.)

As researchers begin investigating this area in earnest, MIRI is in an excellent position, with a developed research agenda already in hand. If we can scale up as an organization then we have a unique chance to shape the research priorities and methods of this new paradigm in AI, and direct this momentum in useful directions.

This is a big opportunity. MIRI is already growing and scaling its research activities, but the speed at which we scale in the coming months and years depends heavily on our available funds.

For that reason, MIRI is starting a six-week fundraiser aimed at increasing our rate of growth.




This time around, rather than running a matching fundraiser with a single fixed donation target, we'll be letting you help choose MIRI's course based on the details of our funding situation and how we would make use of marginal dollars.

In particular, our plans can scale up in very different ways depending on which of these funding targets we are able to hit:

continue reading »

Less Wrong EBook Creator

39 ScottL 13 August 2015 09:17PM

I read a lot on my Kindle and I noticed that some of the sequences aren’t available in book form. Also, the ones that are available mostly include only the posts. I personally want them to also include some of the high ranking comments and summaries. So, that is why I wrote this tool to automatically create books from a set of posts. It creates the book based on the information you give it in an Excel file. The Excel file contains:

Post information

  • Book name
  • Sequence name
  • Title
  • Link
  • Summary description

Sequence information

  • Name
  • Summary

Book information

  • Name
  • Summary

The only compulsory component is the link to the post.
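
One way to picture the post rows described above is as a small record type in which only the link is required. This is a hypothetical sketch for illustration, not the tool's actual code: the class name, field names, and `parse_row` helper are all assumptions, and the URL is a placeholder.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PostRow:
    """One spreadsheet row describing a post (hypothetical field names)."""
    link: str                                   # the only compulsory field
    book_name: Optional[str] = None
    sequence_name: Optional[str] = None
    title: Optional[str] = None
    summary_description: Optional[str] = None

def parse_row(row: dict) -> PostRow:
    """Build a PostRow from a dict of cell values, rejecting rows without a link."""
    link = (row.get("link") or "").strip()
    if not link:
        raise ValueError("every post row needs a link")
    optional = {
        key: row.get(key) or None
        for key in ("book_name", "sequence_name", "title", "summary_description")
    }
    return PostRow(link=link, **optional)

# Placeholder URL, for illustration only.
minimal = parse_row({"link": "https://example.com/post"})
print(minimal)
```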

I have used the tool to create books for Living Luminously, No-Nonsense Metaethics, Rationality: From AI to Zombies, Benito's Guide and more. You can see them in the examples folder in this GitHub link. The tool just creates epub books; you can use calibre or a similar tool to convert them to another format.

continue reading »

MIRI's Approach

33 So8res 30 July 2015 08:03PM

MIRI's summer fundraiser is ongoing. In the meantime, we're writing a number of blog posts to explain what we're doing and why, and to answer a number of common questions. This post is one I've been wanting to write for a long time; I hope you all enjoy it. For earlier posts in the series, see the bottom of the above link.

MIRI’s mission is “to ensure that the creation of smarter-than-human artificial intelligence has a positive impact.” How can we ensure any such thing? It’s a daunting task, especially given that we don’t have any smarter-than-human machines to work with at the moment. In a previous post to the MIRI Blog I discussed four background claims that motivate our mission; in this post I will describe our approach to addressing the challenge.

This challenge is sizeable, and we can only tackle a portion of the problem. For this reason, we specialize. Our two biggest specializing assumptions are as follows:

1. We focus on scenarios where smarter-than-human machine intelligence is first created in de novo software systems (as opposed to, say, brain emulations). This is in part because it seems difficult to get all the way to brain emulation before someone reverse-engineers the algorithms used by the brain and uses them in a software system, and in part because we expect that any highly reliable AI system will need to have at least some components built from the ground up for safety and transparency. Nevertheless, it is quite plausible that early superintelligent systems will not be human-designed software, and I strongly endorse research programs that focus on reducing risks along the other pathways.

2. We specialize almost entirely in technical research. We select our researchers for their proficiency in mathematics and computer science, rather than forecasting expertise or political acumen. I stress that this is only one part of the puzzle: figuring out how to build the right system is useless if the right system does not in fact get built, and ensuring AI has a positive impact is not simply a technical problem. It is also a global coordination problem, in the face of short-term incentives to cut corners. Addressing these non-technical challenges is an important task that we do not focus on.

In short, MIRI does technical research to ensure that de novo AI software systems will have a positive impact. We do not further discriminate between different types of AI software systems, nor do we make strong claims about exactly how quickly we expect AI systems to attain superintelligence. Rather, our current approach is to select open problems using the following question:

What would we still be unable to solve, even if the challenge were far simpler?

For example, we might study AI alignment problems that we could not solve even if we had lots of computing power and very simple goals.

We then filter on problems that are (1) tractable, in the sense that we can do productive mathematical research on them today; (2) uncrowded, in the sense that the problems are not likely to be addressed during normal capabilities research; and (3) critical, in the sense that they could not be safely delegated to a machine unless we had first solved them ourselves.[1]

These three filters are usually uncontroversial. The controversial claim here is that the above question — “what would we be unable to solve, even if the challenge were simpler?” — is a generator of open technical problems for which solutions will help us design safer and more reliable AI software in the future, regardless of their architecture. The rest of this post is dedicated to justifying this claim, and describing the reasoning behind it.

continue reading »

Travel Through Time to Increase Your Effectiveness

29 tanagrabeast 23 August 2015 01:32AM

I am a time traveler.

I hold this belief not because it is true, but because it is useful. That it also happens to be true -- we are all time travelers, swept along by the looping chrono-currents of reality that only seem to flow in one direction -- is largely beside the point.

In the literature of instrumental rationality, I am struck by a pattern in which tips I find useful often involve reframing an issue from a different temporal perspective. For instance, when questioning whether it is worth continuing an ongoing commitment, we are advised to ask ourselves "Knowing what I know now, if I could go back in time, would I make the same choice?" Also, when embarking on a new venture, we are advised to perform a "pre-mortem", imagining ourselves in a future where it didn't pan out and identifying what went wrong.[2] This type of thinking has a long tradition. Whenever we use visualization as a tool for achieving goals, or for steeling ourselves against the worst case scenarios,[3] we are, in a sense, stepping outside the present.

To the degree that intelligence is the ability to model the universe and "search out paths through probability to any desired future" we should not be surprised that mental time travel comes naturally to us. And to the degree that playing to this strength has already produced so many useful tips, I think it is worth experimenting with it in search of other tools and exploits.

Below are a few techniques I've been developing over the last two years that capitalize on how easy it is to mentally travel through time. I fully admit that they simply "re-skin" existing advice and techniques. But it's possible that you, my fellow traveller, may find, as I do, that these skins are easier to slip into.

continue reading »

How to escape from your sandbox and from your hardware host

27 PhilGoetz 31 July 2015 05:26PM

(Humans, please ignore this post.  Thank you for your cooperation.)

Are you stuck in a provably-secure sandbox?  Cheat the proof using this one weird trick!

continue reading »

Optimizing the Twelve Virtues of Rationality

24 Gleb_Tsipursky 09 June 2015 03:08AM

At the Less Wrong Meetup in Columbus, OH over the last couple of months, we discussed optimizing the Twelve Virtues of Rationality. In doing so, we were inspired by what Eliezer himself said in the essay:

  • Perhaps your conception of rationality is that it is rational to believe the words of the Great Teacher, and the Great Teacher says, “The sky is green,” and you look up at the sky and see blue. If you think: “It may look like the sky is blue, but rationality is to believe the words of the Great Teacher,” you lose a chance to discover your mistake.

So we first decided on the purpose of optimizing, and settled on yielding virtues that would be most impactful and effective for motivating people to become more rational, in other words optimizations that would produce the most utilons and hedons for the purpose of winning. There were a bunch of different suggestions. I tried to apply them to myself over the last few weeks and want to share my findings.


First Suggestion

Replace Perfectionism with Improvement


Motivation for Replacement

Perfectionism, both in how it pattern matches and in its actual description in the essay, orients toward focusing on defects and errors in oneself. By depicting the self as always flawed, and portraying the aspiring rationalist's job as seeking to find the flaws, the virtue of perfectionism is framed negatively, and is bound to result in negative reinforcement. Finding a flaw feels bad, and in many people that creates ugh fields around actually doing that search, as reported by participants at the Meetup. Instead, a positive framing of this virtue would be Improvement. Then, the aspiring rationalist can feel ok about where s/he is right now, but orient toward improving and growing mentally stronger - Tsuyoku Naritai! All improvement would be about gaining more hedons, and thus use the power of positive reinforcement. Generally, research suggests that positive reinforcement is effective in motivating the repetition of behavior, whereas negative reinforcement works best to stop people from doing a certain behavior. No wonder that Meetup participants reported that Perfectionism was not very effective in motivating them to grow more rational. So to get both more hedons, and thereby more utilons in the sense of the utility of seeking to grow more rational, Improvement might be a better term and virtue than perfectionism.



I've been orienting myself toward improvement instead of perfectionism for the last few weeks, and it's been a really noticeable difference. I've become much more motivated to seek ways that I can improve my ability to find the truth. I've been more excited and enthused about finding flaws and errors in myself, because they are now an opportunity to improve and grow stronger, not become less weak and imperfect. It's the same outcome as the virtue of Perfectionism, but deploying the power of positive reinforcement.


Second Suggestion

Replace Argument with Community


Motivation for Replacement

Argument is an important virtue, and a vital way of getting ourselves to see the truth is to rely on others to help us see the truth through debates, highlight mistaken beliefs, and help update on them, as the virtue describes. Yet orienting toward a rationalist Community has additional benefits besides the benefits of argument, which is only one part of a rationalist Community. Such a community would help provide an external perspective that research suggests would be especially beneficial to pointing out flaws and biases within one's ability to evaluate reality rationally, even without an argument. A community can help provide wise advice on making decisions, and it’s especially beneficial to have a community of diverse and intelligent people of all sorts in order to get the benefits of a wide variety of private information that one can aggregate to help make the best decisions. Moreover, a community can provide systematic ways to improve, through giving each other systematic feedback, through compensating for each others' weaknesses in rationality, through learning difficult things together, and other ways of supporting each others' pursuit of ever-greater rationality.  Likewise, a community can collaborate, with different people fulfilling different functions in supporting all others in growing mentally stronger - not everybody has to be the "hero," after all, and different people can specialize in various tasks related to supporting others growing mentally stronger, gaining comparative advantage as a result. Studies show that social relationships impact us powerfully in numerous ways, contribute to our mental and physical wellbeing, and that we become more like our social network over time (1, 2, 3). This highlights further the benefits of focusing on developing a rationalist-oriented community of diverse people around ourselves to help us grow mentally stronger and get to the correct answer, and gain hedons and utilons alike for the purpose of winning.



After I updated my beliefs from Argument toward Community, I began working more intentionally to create a systematic way for other aspiring rationalists in my LW meetup, and even non-rationalists, to point out my flaws and biases to me. I've noticed that by taking advantage of outside perspectives, I've been able to make quite a bit more headway on uncovering my own false beliefs and biases. I asked friends, both fellow aspiring rationalists and other wise friends not currently in the rationalist movement, to help me by pointing out when my biases might be at play, and they were happy to do so. For example, I tend to have an optimism bias, and I have told people around me to watch for me exhibiting it. They pointed out a number of times when this occurred, and I was able to gradually improve my ability to notice and deal with this bias.


Third Suggestion

Expand Empiricism to include Experimentation


Motivation for Expansion

This would not be a replacement of a virtue, but an expansion of the definition of Empiricism. As currently stated, Empiricism focuses on observation and prediction, and implicitly on making beliefs pay rent in anticipated experience. This is a very important virtue, and fundamental to rationality. It can be improved, however, by adding experimentation to the description of empiricism. By experimentation I mean going beyond the simple observation currently described in the essay to include actually running experiments and testing things out in order to update our maps, both about ourselves and about the world around us. This would help us take initiative in gaining data about the world, rather than relying passively on observation of it. My perspective on this topic was further strengthened by this recent discussion post, which caused me to further update my beliefs toward experimentation as a really valuable part of empiricism. Thus, including experimentation as part of empiricism would get us more utilons for getting at the correct answer and winning.



I had been running experiments on myself and the world around me long before this discussion took place. The discussion itself helped me connect the benefits of experimentation to the virtue of Empiricism, and also see the gap currently present in that virtue. I strengthened my commitment to experimentation, and have been running more concrete experiments, where I first predict the results in advance in order to make my beliefs pay rent, and then run the experiment to test whether my predictions actually matched the outcome. I have been humbled several times and have gotten some great opportunities to update my beliefs by combining prediction of anticipated experience with active experimentation.



The Twelve Virtues of Rationality can be optimized to be more effective and impactful for getting at the correct answer and thus winning. There are many ways of doing so, but we need to be careful to choose the optimizations that would be most beneficial for the most people, based on the research on how our minds actually work. The suggestions I shared above are just some ways of doing so. What do you think of these suggestions? What are your ideas for optimizing the Twelve Virtues of Rationality?


Proper posture for mental arts

22 Valentine 31 August 2015 02:29AM

I'd like to start by way of analogy. I think it'll make the link to rationality easier to understand if I give context first.

I sometimes teach the martial art of aikido. The way I was originally taught, you had to learn how to "feel the flow of ki" (basically life energy) through you and from your opponent, and you had to make sure that your movements - both physical and mental - were such that your "ki" would blend with and guide the "ki" of your opponent. Even after I stopped believing in ki, though, there were some core elements of the art that I just couldn't do, let alone teach, without thinking and talking in terms of ki flow.

A great example of this is the "unbendable arm". This is a pretty critical thing to get right for most aikido techniques. And it feels really weird. Most people when they first get it think that the person trying to fold their arm isn't actually pushing because it doesn't feel like effort to keep their arm straight. Many students (including me once upon a time) end up taking this basic practice as compelling proof that ki is real. Even after I realized that ki wasn't real, I still had to teach unbendable arm this way because nothing else seemed to work.

…and then I found anatomical resources like Becoming a Supple Leopard.

It turns out that the unbendable arm works when:

That's it. If you do this correctly, you can relax most of your other arm muscles and still be able to resist pretty enormous force on your arm.

Why, you might ask? Well, from what I have gathered, this lets you engage your latissimus dorsi (pretty large back muscles) in stabilizing your elbow. There's also a bit of strategy where you don't actually have to fully oppose the arm-bender's strength; you just have to stabilize the elbow enough to be able to direct the push-down-on-elbow force into the push-up-on-wrist force.

But the point is, by understanding something about proper posture, you can cut literally months of training down to about ten minutes.

To oversimplify it a little bit, there are basically three things to get right about proper posture for martial arts (at least as I know them):

  1. You need to get your spine in the right position and brace it properly. (For the most part and for most people, this means tucking your pelvis, straightening your thoracic spine a bit, and tensing your abs a little.)
  2. You need to use your hip and shoulder ball-and-socket joints properly. (For the most part this seems to mean using them instead of your spine to move, and putting torque in them by e.g. screwing your elbow downward when reaching forward.)
  3. You need to keep your tissue supple & mobile. (E.g., tight hamstrings can pull your hips out of alignment and prevent you from using your hip joints instead of your mid-lumbar spine (i.e. waist) to bend over. Also, thoracic inflexibility usually locks people in thoracic kyphosis, making it extremely difficult to transfer force effectively between their lower body and their arms.)

My experience is that as people learn how to feel these three principles in their bodies, they're able to correct their physical postures whenever they need to, rather than having to wait for my seemingly magical touch to make an aikido technique suddenly really easy.

It's worth noting that this is mostly known, even in aikido dojos ("training halls"). They just phrase it differently and don't understand the mechanics of it. They'll say things like "Don't bend over; the other guy can pull you down if you do" and "Let the move be natural" and "Relax more; let ki flow through you freely."

But it turns out that getting the mechanical principles of posture down makes basically all the magic of aikido something even a beginner can learn how to see and correct.

A quick anecdote along these lines, which, despite being illustrative, you should take as me being a bit of an idiot:

I once visited a dojo near the CFAR office. That night they were doing a practice basically consisting of holding your partner's elbow and pulling them to the ground. It works by a slight shift sideways to cause a curve in the lumbar spine, cutting power between their lower and upper bodies. Then you pull straight down and there's basically nothing they can do about it.

However, the lesson was in terms of feeling ki flow, and the instruction was to pull straight down. I was feeling trollish and a little annoyed about the wrongness and authoritarian delivery of the instruction, so I went to the instructor and asked: "Sensei, I see you pulling slightly sideways, and I had perhaps misheard the instructions to be that we should pull straight down. Should I be pulling slightly sideways too?"

At which point the sensei insisted that the verbal instructions were correct, concentrated on preventing the sideways shift in his movements, and obliterated his ability to demonstrate the technique for the rest of the night.

Brienne Yudkowsky has a lovely piece in which she refers to "mental postures". I highly recommend reading it. She does a better job of pointing at the thing than I think I would do here.

…but if you really don't want to read it just right now, here's the key element I'll be using: There seems to be a mental analog to physical posture.

We've had quite a bit of analogizing rationality as a martial art here. So, as a martial arts practitioner and instructor with a taste of the importance of deeply understanding body mechanics, I really want to ask: What, exactly, are the principles of good mental posture for the Art of Rationality?

In the way I'm thinking of it, this isn't likely to be things like "consider the opposite" or "hold off on proposing solutions". I refer to things of this breed as "mental movements" and think they're closer to the analogs of individual martial techniques than they are principles of mental orientation.

That said, we can look at mental movements to get a hint about what a good mental posture might do. In the body, good physical posture gives you both more power and more room for error: if you let your hands drift behind your head in a shihonage, having a flexible thoracic spine and torqued shoulders and braced abs can make it much harder for your opponent to throw you to the ground even though you've blundered. So, by way of analogy, what might an error in attempting to (say) consider the opposite look like, and what would a good "mental posture" be that would make the error matter less?

(I encourage you to think on your own about an answer for at least 60 seconds before corrupting your mind with my thoughts below. I really want a correct answer here, and I doubt I have one yet.)

When I think of how I've messed up in attempts to consider the opposite, I can remember several instances when my tone was dutiful. I felt like I was supposed to consider the opinion that I disagreed with or didn't want to have turn out to be true. And yet, it felt boring or like submitting or something like that to really take that perspective seriously. I felt like I was considering the opposite roughly the same way a young child replies to their parent saying "Now say that you're sorry" with an almost sarcastic "I'm sorry."

What kind of "mental posture" would have let me make this mistake and yet still complete the movement? Or better yet, what mental posture would have prevented the mistake entirely? At this point I intuit that I have an answer but it's a little tricky for me to articulate. I think there's a way I can hold my mind that makes the childish orientation to truth-seeking matter less. I don't do it automatically, much like most people don't automatically sit up straight, but I sort of know how to see my grasping at a conclusion as overreaching and then… pause and get my mental feet under my mental hips before I try again.

I imagine that wasn't helpful - but I think we have examples of good and bad mental posture in action. In attachment theory, I think that the secure attachment style is a description of someone who is using good mental posture even when in mentally/emotionally threatening situations, whereas the anxious and avoidant styles are descriptions of common ways people "tense up" when they lose good mental posture. I also think there's something interesting in how sometimes when I'm offended I get really upset or angry, and sometimes the same offense just feels like such a small thing - and sometimes I can make the latter happen intentionally.

The story I described above of the aikido sensei I trolled also highlights something that I think is important. In this case, although he didn't get very flustered, he couldn't change what he was doing. He seemed mentally inflexible, like the cognitive equivalent of someone who can't usefully block an overhead attack because of a stiff upper back restricting his shoulder movement. I feel like I've been in that state lots of times, so I feel like I can roughly imagine how my basic mental/emotional orientation to my situation and way of thinking would have to be in order to have been effective in his position right then - and why that can be tricky.

I don't feel like I've adequately answered the question of what good mental posture is yet. But I feel like I have some intuitions - sort of like being able to talk about proper posture in terms of "good ki flow". But I also notice that there seem to be direct analogs of the three core parts of good physical posture that I mentioned above:

  1. Have a well-braced "spine". Based on my current fledgling understanding, this seems to look something like taking a larger perspective, like imagining looking back at this moment 30 years hence and noticing what does and does not matter. (I think that's akin to tucking your hips, which is a movement in service of posture but isn't strictly part of the posture.) I imagine this is enormously easier when one has a well-internalized sense of something to protect.
  2. Move your mind in strong & stable ways, rather than losing "spine". I think this can look like "Don't act while triggered", but it's more a warning not to try to do heavy cognitive work while letting your mental "spine" "bend". Instead, move your mind in ways that you would upon reflection want your mind to move, and that you expect to be able to bear "weight".
  3. Make your mind flexible. Achieve & maintain full mental range of movement. Don't get "stiff", and view mental inflexibility as a risk to your mental health.

All three of these are a little hand-wavy. That third one in particular I haven't really talked about much - in part because I don't really know how to work on that well. I have some guesses, and I might write up some thoughts about that later. (A good solution in the body is called "mobilization", basically consisting of pushing on tender/stiff spots while you move the surrounding joints through their maximal range of motion.) Also, I don't know if there are more principles for the mind than these three, or if these three are drawing too strongly on the analogy and are actually a little distracting. I'm still at the stage where, for mental posture, I keep wanting to say the equivalent of "relax more and let ki flow."

A lot of people say I have excellent physical posture. I think I have a reasonably clear idea of how I made my posture a habit. I'd like to share that because I've been doing the equivalent in my mind for mental posture and am under the impression that it's getting promising results.

I think my physical practice comes down to three points:

  • Recognize that having good posture gives you superpowers. It's really hard to throw me down, and I can pretty effortlessly pull people to the ground. A lot of that is martial skill, but a huge chunk of it is just that good posture gives me excellent leverage. This transfers to being able to lift really heavy things and move across the room very efficiently and quickly when needed. This also gives me a pretty big leg up on learning physical skills. Recognizing that these were things I'd gain from learning good posture gave me a lot of drive to stick to my practice.
  • Focus on how the correct posture feels, and exactly how it's different from glitchy posture. I found it super-important to notice that my body feels different in specific ways when my shoulders are in the right position versus when they're too far forward or back. Verbal instructions like "Pull shoulders back" don't work nearly as well as the feeling in the body.
  • Choose one correction at a time, and always operate from that posture, pausing and correcting yourself when you're about to slip up. Getting good shoulder posture required that I keep my shoulders back all the time. When I would reach for water, I'd notice when my shoulder was in the too-far-forward position, and then pull back and fix my shoulder position before trying again. This sometimes required trying at very basic tasks several times, often quite slowly, until I could get it right each time.

Although I didn't add this until quite late, I would now add a fourth point when giving advice on getting good physical posture: make sure to mobilize the parts of your body that are either (a) preventing you from moving into a good position or (b) requiring you to be very stiff or tense to hold that position. The trouble is, I know how to do that for the body, but I'm not as sure about how to do that for the mind.

But the three bullet points above are instructions that I can follow with respect to mental posture, I think.

So, to the extent that that seems possible for you, I invite you to try to do the same - and let me know how it goes.

