There is no such thing as strength: a parody

25 ZoltanBerrigomo 05 July 2015 11:44PM

The concept of strength is ubiquitous in our culture. It is commonplace to hear one person described as "stronger" or "weaker" than another. And yet the notion of strength is a pernicious myth which reinforces many of our social ills and should be abandoned wholesale.

 

1. Just what is strength, exactly? Few of the people who use the word can provide an exact definition. 

On first try, many people would say that  strength is the ability to lift heavy objects. But this completely ignores the strength necessary to push or pull on objects; to run long distances without exhausting oneself; to throw objects with great speed; to balance oneself on a tightrope, and so forth. 

When this is pointed out, people often try to incorporate all of these aspects into the definition of strength, with a result that is long, unwieldy, ad-hoc, and still missing some acts commonly considered to be manifestations of strength. 

 

Attempts to solve the problem by referring to the supposed cause of strength -- for example, by saying that strength is just a measure of  muscle mass -- do not help. A person with a large amount of muscle mass may be quite weak on any of the conventional measures of strength if, for example, they cannot lift objects due to injuries or illness. 

 

 

2. The concept of strength has an ugly history. Indeed, strength is implicated in both sexism and racism. Women have long been held to be the "weaker sex," consequently needing protection from the "stronger" males, resulting in centuries of structural oppression. Myths about racialist differences in strength have informed pernicious stereotypes and buttressed inequality.

 

3. There is no consistent way of grouping people into strong and weak. Indeed, what are we to make of the fact that some people are good at running but bad at lifting and vice versa? 

 

One might think that we can talk about different strengths - the strength in one's arms and one's legs, for example. But what, then, should we make of the person who is good at arm-wrestling but poor at lifting? Arms can move in many ways; what will we make of someone who can move their arms one way with great force, but not another? It is not hard to see that potential concepts such as "arm strength" or "leg strength" are problematic as well.

 

4. When people are grouped into strong and weak according to any number of criteria, the amount of variation within each group is far larger than the amount of variation between groups. 

 

5. Strength is a social construct. Thus no one is inherently weak or strong. Scientifically, anthropologically, we are only human.

 

6. Scientists are rapidly starting to understand the illusory nature of strength, and one needs only to glance at any of the popular scientific periodicals to encounter refutations of this notion. 

 

In one experiment, respondents from two different cultures were asked to lift a heavy object as high as they could. In one of the cultures, the respondents lifted the object higher. Furthermore, the manner in which the respondents attempted to lift the object depended on their culture. This shows that tests of strength cannot be considered culture-free and that there may be no such thing as a universal test of strength.

 

7. Indeed, to even ask "what is strength?" is to assume that there is a quality, or essence, of humans with essential, immutable qualities. Asking the question begins the process of reifying strength... (see page 22 here).

 

---------------------------------------

 

For a serious statement of what the point of this was supposed to be, see this comment.

 

Seeking geeks interested in bioinformatics

17 bokov 22 June 2015 01:44PM

I work on a small but feisty research team whose focus is biomedical informatics, i.e. mining biomedical data, especially anonymized hospital records pooled over multiple healthcare networks. My personal interest is ultimately life extension, and my colleagues are warming up to the idea as well. But the short-term goal that will be useful to many different research areas is building infrastructure to massively accelerate hypothesis testing on, and modelling of, retrospective human data.

 

We have a job posting here (permanent, non-faculty, full-time, benefits):

https://www.uthscsajobs.com/postings/3113

 

If you can program, want to work in an academic research setting, and can relocate to San Antonio, TX, I invite you to apply. Thanks.

Note: The first step of the recruitment process will be a coding challenge, which will include an arithmetical or string-manipulation problem to solve in real-time using a language and developer tools of your choice.

edit: If you tried applying and were unable to access the posting, it's because the link has changed, our HR has an automated process that periodically expires the links for some reason. I have now updated the job post link.

The Brain as a Universal Learning Machine

82 jacob_cannell 24 June 2015 09:45PM

This article presents an emerging architectural hypothesis of the brain as a biological implementation of a Universal Learning Machine.  I present a rough but complete architectural view of how the brain works under the universal learning hypothesis.  I also contrast this new viewpoint - which comes from computational neuroscience and machine learning - with the older evolved modularity hypothesis popular in evolutionary psychology and the heuristics and biases literature.  These two conceptions of the brain lead to very different predictions for the likely route to AGI, the value of neuroscience, the expected differences between AGI and humans, and thus any consequent safety issues and dependent strategies.

Art generated by an artificial neural net

(The image above is from a recent mysterious post to r/machinelearning, probably from a Google project that generates art based on a visualization tool used to inspect the patterns learned by convolutional neural networks.  I am especially fond of the weird figures riding the cart in the lower left.)

  1. Intro: Two viewpoints on the Mind
  2. Universal Learning Machines
  3. Historical Interlude
  4. Dynamic Rewiring
  5. Brain Architecture (the whole brain in one picture and a few pages of text)
  6. The Basal Ganglia
  7. Implications for AGI
  8. Conclusion

 

Intro: Two Viewpoints on the Mind

Few discoveries are more irritating than those that expose the pedigree of ideas.

-- Lord Acton (probably)

Less Wrong is a site devoted to refining the art of human rationality, where rationality is based on an idealized conceptualization of how minds should or could work.  Less Wrong and its founding sequences draw heavily on the heuristics and biases literature in cognitive psychology and related work in evolutionary psychology.  More specifically, the sequences build upon a specific cluster in the space of cognitive theories, which can be identified in particular with the highly influential "evolved modularity" perspective of Cosmides and Tooby.

From Wikipedia:

Evolutionary psychologists propose that the mind is made up of genetically influenced and domain-specific[3] mental algorithms or computational modules, designed to solve specific evolutionary problems of the past.[4] 

From "Evolutionary Psychology and the Emotions":[5]

An evolutionary perspective leads one to view the mind as a crowded zoo of evolved, domain-specific programs.  Each is functionally specialized for solving a different adaptive problem that arose during hominid evolutionary history, such as face recognition, foraging, mate choice, heart rate regulation, sleep management, or predator vigilance, and each is activated by a different set of cues from the environment.

If you imagine these general theories or perspectives on the brain/mind as points in theory space, the evolved modularity cluster posits that much of the machinery of human mental algorithms is innate.  General learning - if it exists at all - exists only in specific modules; in most modules learning is relegated to the role of adapting existing algorithms and acquiring data; the impact of the information environment is de-emphasized.  In this view the brain is a complex, messy kludge of evolved mechanisms.

There is another viewpoint cluster, more popular in computational neuroscience (especially today), that is almost the exact opposite of the evolved modularity hypothesis.  I will rebrand this viewpoint the "universal learner" hypothesis, aka the "one learning algorithm" hypothesis (the rebranding is justified mainly by the inclusion of some newer theories and evidence for the basal ganglia as a 'CPU' which learns to control the cortex).  The roots of the universal learning hypothesis can be traced back to Mountcastle's discovery of the simple uniform architecture of the cortex.[6]

The universal learning hypothesis proposes that all significant mental algorithms are learned; nothing is innate except for the learning and reward machinery itself (which is somewhat complicated, involving a number of systems and mechanisms), the initial rough architecture (equivalent to a prior over mindspace), and a small library of simple innate circuits (analogous to the operating system layer in a computer).  In this view the mind (software) is distinct from the brain (hardware).  The mind is a complex software system built out of a general learning mechanism.

In simplification, the main difference between these viewpoints is the relative quantity of domain specific mental algorithmic information specified in the genome vs that acquired through general purpose learning during the organism's lifetime.  Evolved modules vs learned modules.

When you have two hypotheses or viewpoints that are almost complete opposites this is generally a sign that the field is in an early state of knowledge; further experiments typically are required to resolve the conflict.

It has been about 25 years since Cosmides and Tooby began to popularize the evolved modularity hypothesis.  A number of key neuroscience experiments have been performed since then which support the universal learning hypothesis (reviewed later in this article).  

Additional indirect support comes from the rapid unexpected success of Deep Learning[7], which is entirely based on building AI systems using simple universal learning algorithms (such as Stochastic Gradient Descent or other various approximate Bayesian methods[8][9][10][11]) scaled up on fast parallel hardware (GPUs).  Deep Learning techniques have quickly come to dominate most of the key AI benchmarks including vision[12], speech recognition[13][14], various natural language tasks, and now even ATARI[15] - proving that simple architectures (priors) combined with universal learning are a path (and perhaps the only viable path) to AGI. Moreover, the internal representations that develop in some deep learning systems are structurally and functionally similar to representations in analogous regions of biological cortex[16].
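
To make "simple universal learning algorithm" concrete, here is a minimal stochastic gradient descent sketch in Python/numpy. The synthetic data, linear model, and learning rate are placeholder assumptions rather than anything from the cited benchmarks; the point is only how little machinery the core update requires.

```python
import numpy as np

# Minimal SGD sketch: fit a linear model to synthetic data.
# Everything here (data, model, step size) is an illustrative assumption.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))                 # synthetic inputs
true_w = rng.normal(size=10)
y = X @ true_w + 0.1 * rng.normal(size=1000)    # noisy targets

w = np.zeros(10)                                # model parameters
lr = 0.01                                       # learning rate
for step in range(5000):
    i = rng.integers(0, len(X))                 # pick one random example
    err = X[i] @ w - y[i]                       # prediction error
    w -= lr * err * X[i]                        # gradient step on the squared loss
```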

To paraphrase Feynman: to truly understand something you must build it. 

In this article I am going to quickly introduce the abstract concept of a universal learning machine, present an overview of the brain's architecture as a specific type of universal learning machine, and finally I will conclude with some speculations on the implications for the race to AGI and AI safety issues in particular.

Universal Learning Machines

A universal learning machine is a simple yet very powerful and general model for intelligent agents.  It is an extension of a general computer - such as a Turing Machine - amplified with a universal learning algorithm.  Do not view this as my 'big new theory' - it is simply an amalgamation of a set of related proposals by various researchers.

An initial untrained seed ULM can be defined by 1.) a prior over the space of models (or equivalently, programs), 2.) an initial utility function, and 3.) the universal learning machinery/algorithm.  The machine is a real-time system that processes an input sensory/observation stream and produces an output motor/action stream to control the external world using a learned internal program that is the result of continuous self-optimization.
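
As a rough illustration (my own sketch of the three ingredients above, not a formalism from the literature), a seed ULM can be written down as a prior over programs, a utility function, and a learning update driving the observe/act loop. The environment, program, and sampler interfaces here are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class SeedULM:
    """Illustrative sketch of an untrained 'seed' universal learning machine.

    The three ingredients named in the text: (1) a prior over programs/models,
    represented here by a sampler that returns an initial internal program,
    (2) an initial utility (reward) function, and (3) the universal learning
    update.  All interfaces are hypothetical, not a real proposal.
    """
    sample_prior: Callable[[], Any]            # returns an initial internal program
    utility: Callable[[Any], float]            # maps an outcome to a reward signal
    learn: Callable[[Any, Any, float], Any]    # (program, observation, reward) -> program

    def run(self, environment, steps: int):
        program = self.sample_prior()          # start from the prior over program space
        action = None
        for _ in range(steps):
            observation, outcome = environment.step(action)   # hypothetical env interface
            reward = self.utility(outcome)
            # continuous self-optimization: update the internal program
            # in light of the new observation and its reward
            program = self.learn(program, observation, reward)
            action = program.act(observation)  # the learned program produces the output stream
        return program
```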

There is of course always room to smuggle in arbitrary innate functionality via the prior, but in general the prior is expected to be extremely small in bits in comparison to the learned model.

The key defining characteristic of a ULM is that it uses its universal learning algorithm for continuous recursive self-improvement with regards to the utility function (reward system).  We can view this as second (and higher) order optimization: the ULM optimizes the external world (first order), and also optimizes its own internal optimization process (second order), and so on.  Without loss of generality, any system capable of computing a large number of decision variables can also compute internal self-modification decisions.

Conceptually the learning machinery computes a probability distribution over program-space that is proportional to the expected utility distribution.  At each timestep it receives a new sensory observation and expends some amount of computational energy to infer an updated (approximate) posterior distribution over its internal program-space: an approximate 'Bayesian' self-improvement.

The above description is intentionally vague in the right ways to cover the wide space of possible practical implementations and current uncertainty.  You could view AIXI as a particular formalization of the above general principles, although it is also as dumb as a rock in any practical sense and has other potential theoretical problems.  Although the general idea is simple enough to convey in the abstract, one should beware of concise formal descriptions: practical ULMs are too complex to reduce to a few lines of math.

A ULM inherits the general property of a Turing Machine that it can compute anything that is computable, given appropriate resources.  However a ULM is also more powerful than a TM.  A Turing Machine can only do what it is programmed to do.  A ULM automatically programs itself.

If you were to open up an infant ULM - a machine with zero experience - you would mainly just see the small initial code for the learning machinery.  The vast majority of the codestore starts out empty - initialized to noise.  (In the brain the learning machinery is built in at the hardware level for maximal efficiency).

Theoretical Turing Machines are all qualitatively alike, and are all qualitatively distinct from any non-universal machine.  Likewise for ULMs.  Theoretically a small ULM is just as general/expressive as a planet-sized ULM.  In practice quantitative distinctions do matter, and can become effectively qualitative.

Just as the simplest possible Turing Machine is in fact quite simple, the simplest possible Universal Learning Machine is also probably quite simple.  A couple of recent proposals for simple universal learning machines include the Neural Turing Machine[16] (from Google DeepMind) and Memory Networks[17].  The core of both approaches involves training an RNN to learn how to control a memory store through gating operations.
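
As a rough sketch of the shared idea, the controller attends over a memory matrix by content and applies gated writes. This is a deliberately simplified illustration of that mechanism, not the actual Neural Turing Machine or Memory Network equations; the gating scheme below is an assumption for clarity.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def memory_step(memory, key, write_gate, write_vec):
    """One simplified content-addressed read/write step.

    memory:     (slots, width) matrix of stored vectors
    key:        (width,) query emitted by the controller RNN
    write_gate: scalar in [0, 1] controlling how strongly to overwrite
    write_vec:  (width,) new content to blend in
    """
    similarity = memory @ key                     # content-based addressing
    weights = softmax(similarity)                 # soft attention over slots
    read = weights @ memory                       # differentiable read vector
    # gated write: nudge each slot toward write_vec in proportion to
    # its attention weight and the write gate
    memory = memory + write_gate * weights[:, None] * (write_vec - memory)
    return read, memory
```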

Historical Interlude

At this point you may be skeptical: how could the brain be anything like a universal learner?  What about all of the known innate biases/errors in human cognition?  I'll get to that soon, but let's start by thinking of a couple of general experiments to test the universal learning hypothesis vs the evolved modularity hypothesis.

In a world where the ULH is mostly correct, what do we expect to be different than in worlds where the EMH is mostly correct?

One type of evidence that would support the ULH is the demonstration of key structures in the brain along with associated wiring such that the brain can be shown to directly implement some version of a ULM architecture.

Another type of indirect evidence that would help discriminate the two theories would be evidence that the brain is capable of general global optimization, and that complex domain specific algorithms/circuits mostly result from this process.  If on the other hand the brain is only capable of constrained/local optimization, then most of the complexity must instead be innate - the result of global optimization in evolutionary deeptime.  So in essence it boils down to the optimization capability of biological learning vs biological evolution.

From the perspective of the EMH, it is not sufficient to demonstrate that there are things that brains can not learn in practice - because those could simply be quantitative limitations.  Demonstrating that an Intel 486 can't compute some known computable function in our lifetimes is not proof that the 486 is not a Turing Machine.

Nor is it sufficient to demonstrate that biases exist: a ULM is only 'rational' to the extent that its observational experience and learning machinery allows (and to the extent one has the correct theory of rationality).  In fact, the existence of many (most?) biases intrinsically depends on the EMH - based on the implicit assumption that some cognitive algorithms are innate.  If brains are mostly ULMs then most cognitive biases dissolve, or become learning biases - for if all cognitive algorithms are learned, then evidence for biases is evidence for cognitive algorithms that people haven't had sufficient time/energy/motivation to learn.  (This does not imply that intrinsic limitations/biases do not exist or that the study of cognitive biases is a waste of time; rather the ULH implies that educational history is what matters most)

The genome can only specify a limited amount of information.  The question is then how much of our advanced cognitive machinery for things like facial recognition, motor planning, language, logic, planning, etc. is innate vs learned.  From evolution's perspective there is a huge advantage to preloading the brain with innate algorithms so long as said algorithms have high expected utility across the expected domain landscape.  

On the other hand, evolution is also highly constrained in a bit coding sense: every extra bit of code costs additional energy for the vast number of cellular replication events across the lifetime of the organism.  Low code complexity solutions also happen to be exponentially easier to find.  These considerations seem to strongly favor the ULH but they are difficult to quantify.

Neuroscientists have long known that the brain is divided into physical and functional modules.  These modular subdivisions were discovered a century ago by Brodmann.  Every time neuroscientists opened up a new brain, they saw the same old cortical modules in the same old places doing the same old things.  The specific layout of course varied from species to species, but the variations between individuals are minuscule. This evidence seems to strongly favor the EMH.

Throughout most of the 90's and up into the 2000's, evidence from computational neuroscience models and AI was heavily influenced by - and, unsurprisingly, largely supported - the EMH.  Neural nets and backprop had of course been known since the 1980's and worked on small problems[18], but at the time they didn't scale well - and there was no theory to suggest they ever would.

Theory of the time also suggested local minima would always be a problem (now we understand that local minima are not really the main problem[19], and modern stochastic gradient descent methods combined with highly overcomplete models and stochastic regularization[20] are effectively global optimizers that can often handle obstacles such as local minima and saddle points[21]).  

The other related historical criticism rests on the lack of biological plausibility for backprop style gradient descent.  (There is as of yet little consensus on how the brain implements the equivalent machinery, but target propagation is one of the more promising recent proposals[22][23].)

Many AI researchers are naturally interested in the brain, and we can see the influence of the EMH in much of the work before the deep learning era.  HMAX is a hierarchical vision system developed in the late 90's by Poggio et al as a working model of biological vision[24].  It is based on a preconfigured hierarchy of modules, each of which has its own mix of innate features such as Gabor edge detectors along with a little bit of local learning.  It implements the general idea that complex algorithms/features are innate - the result of evolutionary global optimization - while neural networks (incapable of global optimization) use Hebbian local learning to fill in details of the design.
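
For reference, the kind of innate feature an HMAX-style model hard-wires is very cheap to specify: a Gabor edge detector is just a sinusoid windowed by a Gaussian. Here is a minimal numpy sketch; the parameter values and the four-orientation bank are arbitrary illustrative choices, not HMAX's actual configuration.

```python
import numpy as np

def gabor_kernel(size=15, wavelength=5.0, theta=0.0, sigma=3.0, phase=0.0, gamma=0.5):
    """Build one Gabor filter: a cosine carrier under a Gaussian envelope.

    In HMAX-style models a small bank of these (varying orientation and
    wavelength) is fixed by design rather than learned.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_r = x * np.cos(theta) + y * np.sin(theta)     # rotate coordinates by theta
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_r**2 + (gamma * y_r)**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * x_r / wavelength + phase)
    return envelope * carrier

# a tiny 'innate' filter bank at four orientations
bank = [gabor_kernel(theta=t) for t in (0, np.pi/4, np.pi/2, 3*np.pi/4)]
```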

Dynamic Rewiring

In a groundbreaking study from 2000 published in Nature, Sharma et al successfully rewired ferret retinal pathways to project into the auditory cortex instead of the visual cortex.[25]  The result: auditory cortex can become visual cortex, just by receiving visual data!  Not only does the rewired auditory cortex develop the specific gabor features characteristic of visual cortex; the rewired cortex also becomes functionally visual. [26] True, it isn't quite as effective as normal visual cortex, but that could also possibly be an artifact of crude and invasive brain rewiring surgery.

The ferret study was popularized by the book On Intelligence by Hawkins in 2004 as evidence for a single cortical learning algorithm.  This helped percolate the evidence into the wider AI community, and thus probably helped set the stage for the deep learning movement of today.  The modern view of the cortex is that of a mostly uniform set of general purpose modules which slowly become recruited for specific tasks and filled with domain specific 'code' as a result of the learning (self optimization) process.

The next key set of evidence comes from studies of atypical human brains with novel extrasensory powers.  In 2009 Vuillerme et al showed that the brain could automatically learn to process sensory feedback rendered onto the tongue[27].  This research was developed into a complete device that allows blind people to develop primitive tongue based vision.

In the modern era some blind humans have apparently acquired the ability to perform echolocation (sonar), similar to cetaceans.  In 2011 Thaler et al used MRI and PET scans to show that human echolocators use diverse non-auditory brain regions to process echo clicks, predominantly relying on re-purposed 'visual' cortex.[27] 

The echolocation study in particular helps establish the case that the brain is actually doing global, highly nonlocal optimization - far beyond simple Hebbian dynamics.  Echolocation is an active sensing strategy that requires very low latency processing, involving complex timed coordination between a number of motor and sensory circuits - all of which must be learned.

Somehow the brain is dynamically learning how to use and assemble cortical modules to implement mental algorithms: everyday tasks such as visual counting, comparisons of images or sounds, reading, etc - all are tasks which require simple mental programs that can shuffle processed data between modules (some or any of which can also function as short term memory buffers).

To explain this data, we should be on the lookout for a system in the brain that can learn to control the cortex - a general system that dynamically routes data between different brain modules to solve domain specific tasks.

But first let's take a step back and start with a high level architectural view of the entire brain to put everything in perspective.

Brain Architecture

Below is a circuit diagram for the whole brain.  Each of the main subsystems works together with the others, and they are best understood together.  You can probably get a good - if extremely coarse - high level understanding of the entire brain in less than one hour.

(there are a couple of circuit diagrams of the whole brain on the web, but this is the best.  From this site.)

The human brain has ~100 billion neurons and ~100 trillion synapses, but ultimately it evolved from the bottom up - from organisms with just hundreds of neurons, like the tiny brain of C. elegans.

We know that evolution is code complexity constrained: much of the genome codes for cellular metabolism, all the other organs, and so on.  For the brain, most of its bit budget needs to be spent on all the complex neuron, synapse, and even neurotransmitter level machinery - the low level hardware foundation.

For a tiny brain with 1000 neurons or less, the genome can directly specify each connection.  As you scale up to larger brains, evolution needs to create vastly more circuitry while still using only about the same amount of code/bits.  So instead of specifying connectivity at the neuron layer, the genome codes connectivity at the module layer.  Each module can be built from simple procedural/fractal expansion of progenitor cells.

So the size of a module has little to nothing to do with its innate complexity.  The cortical modules are huge - V1 alone contains 200 million neurons in a human - but there is no reason to suspect that V1 has greater initial code complexity than any other brain module.  Big modules are built out of simple procedural tiling patterns.

Very roughly the brain's main modules can be divided into six subsystems (there are numerous smaller subsystems):

  • The neocortex: the brain's primary computational workhorse (blue/purple modules at the top of the diagram).  Kind of like a bunch of general purpose FPGA coprocessors.
  • The cerebellum: another set of coprocessors with a simpler feedforward architecture.  Specializes more in motor functionality. 
  • The thalamus: the orangish modules below the cortex.  Kind of like a relay/routing bus.
  • The hippocampal complex: the apex of the cortex, and something like the brain's database.
  • The amygdala and limbic reward system: these modules specialize in something like the value function.
  • The Basal Ganglia (green modules): the central control system, similar to a CPU.

In the interest of space/time I will focus primarily on the Basal Ganglia and will just touch on the other subsystems very briefly and provide some links to further reading.

The neocortex has been studied extensively and is the main focus of several popular books on the brain.  Each neocortical module is a 2D array of neurons (technically 2.5D, with a depth of a few dozen neurons arranged in roughly 5 to 6 layers).

Each cortical module is something like a general purpose RNN (recurrent neural network) with 2D local connectivity.  Each neuron connects to its neighbors in the 2D array.  Each module also has nonlocal connections to other brain subsystems, and these connections follow the same local 2D connectivity pattern, in some cases with some simple affine transformations.  Convolutional neural networks use the same general architecture (but they are typically not recurrent).
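
A minimal sketch of what "2D local recurrent connectivity" means computationally: each unit's next state depends only on a small neighborhood of the sheet. This illustrates the connectivity pattern only; the shared 3x3 kernel and tanh nonlinearity are simplifying assumptions, not a model of cortical dynamics.

```python
import numpy as np

def local_recurrent_step(h, w, bias=0.0):
    """One recurrent update on a 2D sheet with 3x3 local connectivity.

    h: (H, W) current activations of the module
    w: (3, 3) shared local weight kernel (a simplification; real cortical
       connectivity is neither weight-shared nor limited to 3x3)
    """
    H, W = h.shape
    padded = np.pad(h, 1)                       # zero-pad the borders
    new_h = np.zeros_like(h)
    for i in range(H):
        for j in range(W):
            patch = padded[i:i + 3, j:j + 3]    # the unit's local neighborhood
            new_h[i, j] = np.tanh(np.sum(w * patch) + bias)
    return new_h
```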

Cortical modules - like artificial RNNs - are general purpose and can be trained to perform various tasks.  There are a huge number of models of the cortex, varying across the tradeoff between biological realism and practical functionality.

Perhaps surprisingly, any of a wide variety of learning algorithms can reproduce cortical connectivity and features when trained on appropriate sensory data[27].  This is a computational proof of the one-learning-algorithm hypothesis; furthermore it illustrates the general idea that data determines functional structure in any general learning system.

There is evidence that cortical modules learn automatically (unsupervised) to some degree, and there is also some evidence that cortical modules can be trained to relearn data from other brain subsystems - namely the hippocampal complex.  The dark knowledge distillation technique in ANNs[28][29] is a potential natural analog/model of hippocampus -> cortex knowledge transfer.
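
For readers unfamiliar with it, distillation trains a student network to match a teacher's temperature-softened output distribution (the "dark knowledge"). Below is a minimal sketch of that loss in numpy; the temperature value is an arbitrary choice, and the hippocampus-to-cortex analogy remains the article's speculation, not something this code establishes.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """Cross-entropy between the teacher's softened outputs and the student's.

    A minimal sketch of the generic distillation objective; practical recipes
    add details (e.g. loss scaling, a hard-label term) omitted here.
    """
    teacher_probs = softmax(teacher_logits, temperature)
    student_log_probs = np.log(softmax(student_logits, temperature) + 1e-12)
    return -np.sum(teacher_probs * student_log_probs, axis=-1).mean()
```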

Module connections are bidirectional, and feedback connections (from high level modules to low level) outnumber forward connections.  We can speculate that something like target propagation can also be used to guide or constrain the development of cortical maps (speculation).

The hippocampal complex is the root or top level of the sensory/motor hierarchy.  This short YouTube video gives a good seven minute overview of the HC.  It is like a spatiotemporal database.  It receives compressed scene descriptor streams from the sensory cortices, it stores this information in medium-term memory, and it supports later auto-associative recall of these memories.  Imagination and memory recall seem to be basically the same.

The 'scene descriptors' take the sensible form of things like 3D position and camera orientation, as encoded in place, grid, and head direction cells.  This is basically the logical result of compressing the sensory stream, comparable to the networking data stream in a multiplayer video game.

Imagination/recall is basically just the reverse of the forward sensory coding path - in reverse mode a compact scene descriptor is expanded into a full imagined scene.  Imagined/remembered scenes activate the same cortical subnetworks that originally formed the memory (or would have if the memory was real, in the case of imagined recall).

The amygdala and associated limbic reward modules are rather complex, but look something like the brain's version of the value function for reinforcement learning.  These modules are interesting because they clearly rely on learning, but clearly the brain must specify an initial version of the value/utility function that has some minimal complexity.

As an example, consider taste.  Infants are born with basic taste detectors and a very simple initial value function for taste.  Over time the brain receives feedback from digestion and various estimators of general mood/health, and it uses this to refine the initial taste value function.  Eventually the adult sense of taste becomes considerably more complex.  Acquired tastes for bitter substances - such as coffee and beer - are good examples.

The amygdala appears to do something similar for emotional learning.  For example, infants are born with a simple version of a fear response, which is later refined through reinforcement learning.  The amygdala sits on the end of the hippocampus, and it is also involved heavily in memory processing.

See also these two videos from Khan Academy: one on the limbic system and amygdala (10 mins), and another on the midbrain reward system (8 mins).

 

The Basal Ganglia

 

The Basal Ganglia is a weird-looking complex of structures located in the center of the brain.  It is a conserved structure found in all vertebrates, which suggests a core functionality.  The BG is proximal to and connects heavily with the midbrain reward/limbic systems.  It also connects to the brain's various modules in the cortex/hippocampus, thalamus, and the cerebellum . . . basically everything.

All of these connections form recurrent loops between associated compartmental modules in each structure: thalamocortical/hippocampal-cerebellar-basal_ganglial loops.

 

Just as the cortex and hippocampus are subdivided into modules, there are corresponding modular compartments in the thalamus, basal ganglia, and the cerebellum.  The set of modules/compartments in each main structure are all highly interconnected with their correspondents across structures, leading to the concept of distributed processing modules.

Each DPM forms a recurrent loop across brain structures (the local networks in the cortex, BG, and thalamus are also locally recurrent, whereas those in the cerebellum are not).  These recurrent loops are mostly separate, but each sub-structure also provides different opportunities for inter-loop connections.

The BG appears to be involved in essentially all higher cognitive functions.  Its core functionality is action selection via subnetwork switching.  In essence action selection is the core problem of intelligence, and it is also general enough to function as the building block of all higher functionality.  A system that can select between motor actions can also select between tasks or subgoals.  More generally, low level action selection can easily form the basis of a Turing Machine via selective routing: deciding where to route the output of thalamocortical-cerebellar modules (some of which may specialize in short term memory as in the prefrontal cortex, although all cortical modules have some short term memory capability).

There are now a number of computational models for the Basal Ganglia-Cortical system that demonstrate possible biologically plausible implementations of the general theory[28][29]; integration with the hippocampal complex leads to larger-scale systems which aim to model/explain most of higher cognition in terms of sequential mental programs[30] (of course fully testing any such models awaits sufficient computational power to run very large-scale neural nets).

For an extremely oversimplified model of the BG as a dynamic router, consider an array of N distributed modules controlled by the BG system.  The BG control network expands these N inputs into an NxN matrix.  There are N² potential intermodular connections, each of which can be individually controlled.  The control layer reads a compressed, downsampled version of the module's hidden units as its main input, and is also recurrent.  Each output node in the BG has a multiplicative gating effect which selectively enables/disables an individual intermodular connection.  If the control layer is naively fully connected, this would require (N²)² connections, which is only feasible for N ~ 100 modules, but sparse connectivity can substantially reduce those numbers.

It is unclear (to me) whether the BG actually implements NxN style routing as described above, or something more like 1xN or Nx1 routing, but there is general agreement that it implements cortical routing.

Of course in actuality the BG architecture is considerably more complex, as it also must implement reinforcement learning, and the intermodular connectivity map itself is also probably quite sparse/compressed (the BG may not control all of cortex, certainly not at a uniform resolution, and many controlled modules may have a very limited number of allowed routing decisions).  Nonetheless, the simple multiplicative gating model illustrates the core idea.  
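
A minimal sketch of the oversimplified routing model just described: N module outputs, an NxN gate matrix in [0, 1] produced by a toy control network, and a multiplicative mask that enables or disables each intermodular connection. The sizes, the sigmoid gating, and the "control network" below are placeholder assumptions for illustration only.

```python
import numpy as np

def route(module_outputs, gates):
    """Multiplicative gating of intermodular connections.

    module_outputs: (N, D) array, one D-dimensional output per module
    gates:          (N, N) array in [0, 1]; gates[i, j] controls how much
                    of module j's output is routed into module i
    Returns the (N, D) routed inputs for the next step.
    """
    return gates @ module_outputs

# illustrative use with arbitrary sizes
rng = np.random.default_rng(0)
N, D = 8, 16
outputs = rng.normal(size=(N, D))
# toy 'control network': gates computed from a compressed summary
# of each module's state (here, simply its mean activation)
summary = outputs.mean(axis=1)                              # (N,) downsampled view
gates = 1.0 / (1.0 + np.exp(-np.outer(summary, summary)))   # sigmoid gate matrix
routed_inputs = route(outputs, gates)
```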

This same multiplicative gating mechanism is the core principle behind the highly successful LSTM (Long Short-Term Memory)[30] units that are used in various deep learning systems.  The simple version of the BG's gating mechanism can be considered a wider parallel and hierarchical extension of the basic LSTM architecture, where you have a parallel array of N memory cells instead of 1, and each memory cell is a large vector instead of a single scalar value.

The main advantage of the BG architecture is parallel hierarchical approximate control: it allows a large number of hierarchical control loops to update and influence each other in parallel.  It also reduces the huge complexity of general routing across the full cortex down into a much smaller-scale, more manageable routing challenge.

Implications for AGI

These two conceptions of the brain - the universal learning machine hypothesis and the evolved modularity hypothesis - lead to very different predictions for the likely route to AGI, the expected differences between AGI and humans, and thus any consequent safety issues and strategies.

In the extreme case imagine that the brain is a pure ULM, such that the genetic prior information is close to zero or is simply unimportant.  In this case it is vastly more likely that successful AGI will be built around designs very similar to the brain, as the ULM architecture in general is the natural ideal, vs the alternative of having to hand engineer all of the AI's various cognitive mechanisms.

In reality learning is computationally hard, and any practical general learning system depends on good priors to constrain the learning process (essentially taking advantage of previous knowledge/learning).  The recent and rapid success of deep learning is strong evidence for how much prior information is ideal: just a little.  The prior in deep learning systems takes the form of a compact, small set of hyperparameters that control the learning process and specify the overall network architecture (an extremely compressed prior over the network topology and thus the program space).
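
To see how compact that prior is, the following is the kind of hyperparameter set that fully specifies a typical deep learning system's architecture and training process; the particular names and values are arbitrary examples, not taken from any cited system.

```python
# The entire 'prior' of a typical deep learning system fits in a handful
# of numbers like these (values are arbitrary illustrative choices);
# everything else about the final model is learned from data.
hyperparameters = {
    "num_layers": 8,
    "hidden_units_per_layer": 512,
    "activation": "relu",
    "weight_init_scale": 0.01,
    "learning_rate": 1e-3,
    "batch_size": 128,
    "dropout_rate": 0.5,
    "training_steps": 100_000,
}
```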

The ULH suggests that most everything that defines the human mind is cognitive software rather than hardware: the adult mind (in terms of algorithmic information) is 99.999% a cultural/memetic construct.  Obviously there are some important exceptions: infants are born with some functional but very primitive sensory and motor processing 'code'.  Most of the genome's complexity is used to specify the learning machinery, and the associated reward circuitry.  Infant emotions appear to simplify down to a single axis of happy/sad; differentiation into the more subtle vector space of adult emotions does not occur until later in development.

If the mind is software, and if the brain's learning architecture is already universal, then AGI could - by default - end up with a similar distribution over mindspace, simply because it will be built out of similar general purpose learning algorithms running over the same general dataset.  We already see evidence for this trend in the high functional similarity between the features learned by some machine learning systems and those found in the cortex.

Of course an AGI will have little need for some specific evolutionary features: emotions that are subconsciously broadcast via the facial muscles are a quirk unnecessary for an AGI - but that is a rather specific detail.

The key takeaway is that the data is what matters - and in the end it is all that matters.  Train a universal learner on image data and it just becomes a visual system.  Train it on speech data and it becomes a speech recognizer.  Train it on ATARI and it becomes a little gamer agent.

Train a universal learner on the real world in something like a human body and you get something like the human mind.  Put a ULM in a dolphin's body and echolocation is the natural primary sense; put a ULM in a human body with broken visual wiring and you can also get echolocation.

Control over training is the most natural and straightforward way to control the outcome.  

To create a superhuman AI driver, you 'just' need to create a realistic VR driving sim and then train a ULM in that world (better training and the simple power of selective copying leads to superhuman driving capability).

So to create benevolent AGI, we should think about how to create virtual worlds with the right structure, how to educate minds in those worlds, and how to safely evaluate the results.

One key idea - which I proposed five years ago - is that the AI should not know it is in a sim.

New AI designs (world design + architectural priors + training/education system) should be tested first in the safest virtual worlds: which in simplification are simply low tech worlds without computer technology.  Design combinations that work well in safe low-tech sandboxes are promoted to less safe high-tech VR worlds, and then finally the real world.

A key principle of a secure code sandbox is that the code you are testing should not be aware that it is in a sandbox.  If you violate this principle then you have already failed.  Yudkowsky's AI box thought experiment assumes the violation of the sandbox security principle a priori and thus is something of a distraction.  (The virtual sandbox idea was most likely discussed elsewhere previously, as Yudkowsky indirectly critiques a strawman version of the idea via this sci-fi story.)

The virtual sandbox approach also combines nicely with invisible thought monitors, where the AI's thoughts are automatically dumped to searchable logs.

Of course we will still need a solution to the value learning problem.  The natural route with brain-inspired AI is to learn the key ideas behind value acquisition in humans to help derive an improved version of something like inverse reinforcement learning and/or imitation learning[31] - an interesting topic for another day.

Conclusion

Ray Kurzweil has been predicting for decades that AGI will be built by reverse engineering the brain, and this particular prediction is not especially unique - this has been a popular position for quite a while.  My own investigation of neuroscience and machine learning led me to a similar conclusion some time ago.  

The recent progress in deep learning, combined with the emerging modern understanding of the brain, provides further evidence that AGI could arrive around the time when we can build and train ANNs with computational power similar to the brain's, as measured very roughly in terms of neuron/synapse counts.  In general the evidence from the last four years or so supports Hanson's viewpoint from the Foom debate.  More specifically, his general conclusion:

Future superintelligences will exist, but their vast and broad mental capacities will come mainly from vast mental content and computational resources. By comparison, their general architectural innovations will be minor additions.

The ULH supports this conclusion.

Current ANN engines can already train and run models with around 10 million neurons and 10 billion (compressed/shared) synapses on a single GPU, which suggests that the goal could soon be within the reach of a large organization.  Furthermore, Moore's Law for GPUs still has some steam left, and software advances are currently improving simulation performance at a faster rate than hardware.  These trends imply that Anthropomorphic/Neuromorphic AGI could be surprisingly close, and may appear suddenly.

What kind of leverage can we exert on a short timescale?

 

In praise of gullibility?

23 ahbwramc 18 June 2015 04:52AM

I was recently re-reading a piece by Yvain/Scott Alexander called Epistemic Learned Helplessness. It's a very insightful post, as is typical for Scott, and I recommend giving it a read if you haven't already. In it he writes:

When I was young I used to read pseudohistory books; Immanuel Velikovsky's Ages in Chaos is a good example of the best this genre has to offer. I read it and it seemed so obviously correct, so perfect, that I could barely bring myself to bother to search out rebuttals.

And then I read the rebuttals, and they were so obviously correct, so devastating, that I couldn't believe I had ever been so dumb as to believe Velikovsky.

And then I read the rebuttals to the rebuttals, and they were so obviously correct that I felt silly for ever doubting.

And so on for several more iterations, until the labyrinth of doubt seemed inescapable.

He goes on to conclude that the skill of taking ideas seriously - often considered one of the most important traits a rationalist can have - is a dangerous one. After all, it's very easy for arguments to sound convincing even when they're not, and if you're too easily swayed by argument you can end up with some very absurd beliefs (like that Venus is a comet, say).

This post really resonated with me. I've had several experiences similar to what Scott describes, of being trapped between two debaters who both had a convincingness that exceeded my ability to discern truth. And my reaction in those situations was similar to his: eventually, after going through the endless chain of rebuttals and counter-rebuttals, changing my mind at each turn, I was forced to throw up my hands and admit that I probably wasn't going to be able to determine the truth of the matter - at least, not without spending a lot more time investigating the different claims than I was willing to. And so in many cases I ended up adopting a sort of semi-principled stance of agnosticism: unless it was a really really important question (in which case I was sort of obligated to do the hard work of investigating the matter to actually figure out the truth), I would just say I don't know when asked for my opinion.

[Non-exhaustive list of areas in which I am currently epistemically helpless: geopolitics (in particular the Israel/Palestine situation), anthropics, nutrition science, population ethics]

All of which is to say: I think Scott is basically right here, in many cases we shouldn't have too strong of an opinion on complicated matters. But when I re-read the piece recently I was struck by the fact that his whole argument could be summed up much more succinctly (albeit much more pithily) as:

"Don't be gullible."

Huh. Sounds a lot more obvious that way.

Now, don't get me wrong: this is still good advice. I think people should endeavour to not be gullible if at all possible. But it makes you wonder: why did Scott feel the need to write a post denouncing gullibility? After all, most people kind of already think being gullible is bad - who exactly is he arguing against here?

Well, recall that he wrote the post in response to the notion that people should believe arguments and take ideas seriously. These sound like good, LW-approved ideas, but note that unless you're already exceptionally smart or exceptionally well-informed, believing arguments and taking ideas seriously is tantamount to...well, to being gullible. In fact, you could probably think of gullibility as a kind of extreme and pathological form of lightness; a willingness to be swept away by the winds of evidence, no matter how strong (or weak) they may be.

There seems to be some tension here. On the one hand we have an intuitive belief that gullibility is bad; that the proper response to any new claim should be skepticism. But on the other hand we also have some epistemic norms here at LW that are - well, maybe they don't endorse being gullible, but they don't exactly not endorse it either. I'd say the LW memeplex is at least mildly friendly towards the notion that one should believe conclusions that come from convincing-sounding arguments, even if they seem absurd. A core tenet of LW is that we change our mind too little, not too much, and we're certainly all in favour of lightness as a virtue.

Anyway, I thought about this tension for a while and came to the conclusion that I had probably just lost sight of my purpose. The goal of (epistemic) rationality isn't to not be gullible or not be skeptical - the goal is to form correct beliefs, full stop. Terms like gullibility and skepticism are useful to the extent that people tend to be systematically overly accepting or dismissive of new arguments - individual beliefs themselves are simply either right or wrong. So, for example, if we do studies and find out that people tend to accept new ideas too easily on average, then we can write posts explaining why we should all be less gullible, and give tips on how to accomplish this. And if on the other hand it turns out that people actually accept far too few new ideas on average, then we can start talking about how we're all much too skeptical and how we can combat that. But in the end, in terms of becoming less wrong, there's no sense in which gullibility would be intrinsically better or worse than skepticism - they're both just words we use to describe deviations from the ideal, which is accepting only true ideas and rejecting only false ones.

This answer basically wrapped the matter up to my satisfaction, and resolved the sense of tension I was feeling. But afterwards I was left with an additional interesting thought: might gullibility be, if not a desirable end point, then an easier starting point on the path to rationality?

That is: no one should aspire to be gullible, obviously. That would be aspiring towards imperfection. But if you were setting out on a journey to become more rational, and you were forced to choose between starting off too gullible or too skeptical, could gullibility be an easier initial condition?

I think it might be. It strikes me that if you start off too gullible you begin with an important skill: you already know how to change your mind. In fact, changing your mind is in some ways your default setting if you're gullible. And considering that like half the freakin sequences were devoted to learning how to actually change your mind, starting off with some practice in that department could be a very good thing.

I consider myself to be...well, maybe not more gullible than average in absolute terms - I don't get sucked into pyramid scams or send money to Nigerian princes or anything like that. But I'm probably more gullible than average for my intelligence level. There's an old discussion post I wrote a few years back that serves as a perfect demonstration of this (I won't link to it out of embarrassment, but I'm sure you could find it if you looked). And again, this isn't a good thing - to the extent that I'm overly gullible, I aspire to become less gullible (Tsuyoku Naritai!). I'm not trying to excuse any of my past behaviour. But when I look back on my still-ongoing journey towards rationality, I can see that my ability to abandon old ideas at the (relative) drop of a hat has been tremendously useful so far, and I do attribute that ability in part to years of practice at...well, at believing things that people told me, and sometimes gullibly believing things that people told me. Call it epistemic deferentiality, or something - the tacit belief that other people know better than you (especially if they're speaking confidently) and that you should listen to them. It's certainly not a character trait you're going to want to keep as a rationalist, and I'm still trying to do what I can to get rid of it - but as a starting point? You could do worse I think.

Now, I don't pretend that the above is anything more than a plausibility argument, and maybe not a strong one at that. For one, I'm not sure how well this idea carves reality at its joints - after all, gullibility isn't quite the same thing as lightness, even if they're closely related. For another, if the above were true, you would probably expect LWers to be more gullible than average. But that doesn't seem quite right - while LW is admirably willing to engage with new ideas, no matter how absurd they might seem, the default attitude towards a new idea on this site is still one of intense skepticism. Post something half-baked on LW and you will be torn to shreds. Which is great, of course, and I wouldn't have it any other way - but it doesn't really sound like the behaviour of a website full of gullible people.

(Of course, on the other hand it could be that LWers really are more gullible than average, but they're just smart enough to compensate for it.)

Anyway, I'm not sure what to make of this idea, but it seemed interesting and worth a discussion post at least. I'm curious to hear what people think: does any of the above ring true to you? How helpful do you think gullibility is, if it is at all? Can you be "light" without being gullible? And for the sake of collecting information: do you consider yourself to be more or less gullible than average for someone of your intelligence level?

Update on the Brain Preservation Foundation Prize

26 Andy_McKenzie 26 May 2015 01:47AM

Brain Preservation Foundation President Kenneth Hayworth just wrote a synopsis of the recent goings-on from the two major competitors for the BPF prizes. Here is the summary:

Brain Preservation Prize competitor Shawn Mikula just published his whole mouse brain electron microscopy protocol in Nature Methods (paper, BPF interview), putting him close to winning the mouse phase of our prize.

Brain Preservation Prize competitor 21st Century Medicine has developed a new “Aldehyde-Stabilized Cryopreservation” technique–preliminary results show good ultrastructure preservation even after storage of a whole rabbit brain at -135 degrees C.

This work was funded in part by donations from LW users. In particular, a grant that the BPF was able to provide to support the work of LW user Robert McIntyre at 21st Century Medicine has been instrumental.

In order to continue this type of research and to bolster it, BPF welcomes your support in a variety of different ways, including awareness-raising, donations, and volunteering. Please reach out if you would like to volunteer, or you can PM me and I will help put you in touch. And if you have any suggestions for the BPF, please feel free to discuss them in the comments below. 

Communicating via writing vs. in person

4 adamzerner 22 May 2015 04:58AM

There's a lot that I really like about communicating via writing. Communicating in person is sometimes frustrating for me, and communicating via writing addresses a lot of those frustrations:

1) I often want to make a point that depends on the other person knowing X. In person, if I always paused and did the following, it'd add a lot of friction to conversations: "Wait, do you know X? If yes, good, I'll continue. If no, let me think about how to explain it briefly. Or do you want me to explain it in more depth? Or do you want to try to proceed without knowing X and see how it goes?". But if I don't do so, then it risks miscommunication (because the other person may not have the dependency X).

In writing, I could just link to an article. If the other person doesn't have the dependency, they have options. They could try to proceed without knowing X and see how it goes. If it doesn't work out, they could come back and read the link. Or they could read the link right away. And in reading the link, they have their choice of how deeply they want to read. Ie. they could just skim if they want to.

Alternatively, if you don't have something to link to, you could add a footnote. I think that a UI like Medium's side comments is very preferable to putting the footnotes at the bottom of the page. I hope to see this adopted across the internet some time in the next 5 years or so.

2) I think that in general, being precise about what you're saying is actually quite difficult/time consuming*. For example, I don't really mean what I just said. I'm actually not sure how often that it's difficult/time consuming to be precise with what you're saying. And I'm not sure how often it's useful to be precise about what you're saying (or really, more precise...whatever that means...). I guess what I really mean is that it happens often enough where it's a problem. Or maybe just that for me, it happens enough where I find it to be a problem.

Anyway, I find that putting quotes around what I say is a nice way to mitigate this problem.

Ex. It's "in my nature" to be strategic.

The quotes show that the word inside them isn't precisely what I mean, but that it's close enough to what I mean that it should communicate the gist of it. I sense that this communication often happens through empathetic inference.

*I also find that I feel internal and external pressure to be consistent with what I say, even if I know I'm oversimplifying. This is a problem and has negatively affected me. I recently realized what a big problem it is, and will try very hard to address it (or really, I plan on trying very hard but I'm not sure blah blah blah blah blah...).

Note 1: I find internal conversation/thinking as well as interpersonal conversation to be "chaotic". (What follows is rant-y and not precisely what I believe. But being precise would take too long, and I sense that the rant-y tone helps to communicate without detracting from the conversation by being uncivil.) It seems that a lot of other people (much less so on LW) have more "organized" thinking patterns. I can't help but think that that's BS. Well, maybe they do, but I sense that they shouldn't. Reality is complicated. People seem to oversimplify things a lot, and to think in terms of black-white. When you do that, I could see how ones thoughts could be "organized". But when you really try to deal with the complexities of reality... I don't understand how you could simultaneously just go through life with organized thoughts.

Note 2: I sense that this post somewhat successfully communicates my internal thought process and how chaotic it could be. I'm curious how this compares to other people. I should note that I was diagnosed with a mild-moderate case of ADHD when I was younger. But that was largely based off of iffy reporting from my teachers. They didn't realize how much conscious thought motivated my actions. Ie. I often chose to do things that seem impulsive because I judged it to be worth it. But given that my mind is always racing so fast, and that I have a good amount of trouble deciding to pay attention to anything other than the most interesting thing to me, I'd guess that I do have ADHD to some extent. I'm hesitant to make that claim without ever having been inside someone else's mind before though (how incredibly incredibly cool would that be!!!) - appearances could be deceiving.

3) It's easier to model and traverse the structure of a conversation/argument when it's in writing. You could break things into nested sections (which isn't always a perfect way to model the structure, but is often satisfactory). In person, I find that it's often quite difficult for two people (let alone multiple people) to stay in sync with the structure of the conversation. The outcome of this is that people rarely veer away from extremely superficial conversations. Granted, I haven't had the chance to talk to many smart people in real life, and so I don't have much data on how deep a conversation between two smart people could get. My guess is that it could get a lot deeper than what I'm used to, but that it'd be pretty hard to make real progress on a difficult topic without outlining and diagramming things out. (Note: I don't mean "deep as in emotional", I mean "deep as in nodes in a graph")
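To make the "deep as in nodes in a graph" idea a bit more concrete, here's a minimal sketch of one way a conversation's structure could be modeled as a tree of points and sub-points, where depth literally means how many levels down a point sits. This is just an illustration with made-up names, not a claim about how anyone actually does this:

# Purely illustrative sketch (Python), with made-up names: a conversation
# modeled as a tree of points, where "deep" means distance from the root topic.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Point:
    text: str
    children: List["Point"] = field(default_factory=list)

    def add(self, text: str) -> "Point":
        # Attach a sub-point and return it so you can keep nesting.
        child = Point(text)
        self.children.append(child)
        return child

def max_depth(node: Point, depth: int = 0) -> int:
    # Depth of the deepest sub-point reachable from this node.
    if not node.children:
        return depth
    return max(max_depth(child, depth + 1) for child in node.children)

root = Point("Is writing better than talking for hard topics?")
claim = root.add("Writing makes the structure of the argument explicit")
claim.add("e.g. nested sections, footnotes, links")
print(max_depth(root))  # prints 2: this conversation is two levels deep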


There are also a lot of other things to say about communicating in writing vs. in person, including:

  • The value of the subtle things like nonverbal communication and pauses.
  • The value of a conversation being continuous. When it isn't, you have to load the context back into your head over and over again.
  • How much time you have to think things through before responding.
  • I sense that people are way more careful in writing, especially when there's a record of it (rather than, say, a PM).

This is a discussion post, so feel free to comment on these things too (or anything else in the ballpark).

How my social skills went from horrible to mediocre

29 JonahSinick 19 May 2015 11:29PM

Over the past few months, I've become aware that my understanding of social reality had been distorted to an extreme degree. It took me 29 years to figure out what was going on, but now I finally understand.

The situation is very simple: the amount of time that I put into interacting within typical social contexts was very small, so I didn't get enough feedback to realize, as I otherwise would have, that I had a major blind spot.

Now that I've identified the blind spot, I can work on it, and my social awareness has been increasing at a very rapid clip. I had no idea that I had so much potential for social awareness. I had been in a fixed mindset rather than a growth mindset: I had thought, "social skills will never be my strong point, so I shouldn't spend time trying to improve them; instead I should focus on what I'm best at." I'm astonished by how much my relationships have improved over a span of mere weeks.

I give details below.


Emotional Basilisks

-2 OrphanWilde 28 June 2013 09:10PM

Suppose it is absolutely true that atheism has a negative impact on your happiness and lifespan.  Suppose furthermore that you are the first person in your society of relatively happy theists to happen upon the idea of atheism, that you found absolute proof of its correctness, that you quietly studied its effects on a small group of people kept isolated from the general population, and that you discovered it has negative effects on happiness and lifespan.  Suppose that it -does- free people from a considerable amount of time wasted - from your perspective as a newfound atheist - in theistic theater.

Would you spread the idea?

This is, in our theoretical society, the emotional equivalent of a nuclear weapon; the group you tested it on is now comparatively crippled with existentialism and doubt, and many are beginning to doubt that the continued existence of human beings is even a good thing.  This is, for all intents and purposes, a basilisk, the mere knowledge of which causes its knower severe harm.  Is it, in fact, a good idea to go around talking about this revolutionary new idea, which makes everybody who learns it slightly less happy?  Would it be a -better- idea to form a secret society to go around talking to bright people likely to discover it themselves to try to keep this new idea quiet?

(Please don't fight the hypothetical here.  I know the evidence that atheism does in fact cause harm is nowhere near so clear-cut; all the studies I've personally seen which suggest as much have some methodological flaws.  This is merely a question of whether "That which can be destroyed by the truth should be" is, in fact, a useful position to take, in view of ideas which may actually be harmful.)

Could you tell me what's wrong with this?

1 Algon 14 April 2015 10:43AM

Edit: Some people have misunderstood my intentions here. I do not in any way expect this to be the NEXT GREAT IDEA. I just couldn't see anything wrong with this, which almost certainly meant there were gaps in my knowledge. I thought the fastest way to see where I went wrong would be to post my idea here and see what people say. I apologise for any confusion I caused. I'll try to be more clear next time.

(I really can't think of any major problems in this, so I'd be very grateful if you guys could tell me what I've done wrong). 

So, a while back I was listening to a discussion about the difficulty of making an FAI. One of the ways suggested to circumvent this was to go down the route of programming an AGI to solve FAI. Someone else pointed out the problems with this. Amongst other things, one would have no idea what the AI will do in pursuit of its primary goal. Furthermore, it would already be a monumental task to program an AI whose primary goal is to solve the FAI problem; still, doing this would be easier than solving FAI ourselves, I should think.

So, I started to think about this for a little while, and I thought, 'how could you make this safer?' Well, first off, you don't want an AI that completely outclasses humanity in terms of intellect. If things went Wrong, you'd have little chance of stopping it. So, you want to limit the AI's intellect to genius level, so that if something did go Wrong, the AI would not be unstoppable. It may do quite a bit of damage, but a large group of intelligent people with a lot of resources on their hands could stop it.

Therefore, the AI must be prevented from modifying parts of its own source code. You must try to stop an intelligence explosion from taking off. So: limited access to its source code, and a limit on how much computing power it can have on hand. This is problematic, though, because the AI would not be able to solve FAI very quickly. After all, we have a few genius-level people trying to solve FAI, and they're struggling with it, so why should a genius-level computer do any better? Well, an AI would have fewer biases, and could accumulate much more expertise relevant to the task at hand. It would be about as capable of solving FAI as the most capable human could possibly be; perhaps even more so. Essentially, you'd get someone like Turing, Von Neumann, Newton and others all rolled into one working on FAI.

But there's still another problem. The AI, if left working on FAI for 20 years, let's say, would have accumulated enough skills that it would be able to cause major problems if something went wrong. Sure, it would be as intelligent as Newton, but it would be far more skilled. Humanity fighting against it would be like sending a young Miyamoto Musashi against his future self at his zenith, i.e. completely one-sided.

What must be done, then, is to give the AI a time limit of a few years (or less); after that time has passed, it is put to sleep. We look at what it accomplished, see what worked and what didn't, boot up a fresh version of the AI with any required modifications, and tell it what the old AI did. Repeat the process for a few years, and we should end up with FAI solved.

After that, we just make an FAI, and wake up the originals, since there's no point in killing them off at this point. 
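To lay out the shape of what I'm proposing, here's a rough schematic sketch in Python-like pseudocode. Every name in it is made up, and the "AI" is just a stand-in stub; the only point is the control structure: capped intellect, no self-modification, a hard time limit, then a fresh start informed by human review.

# Purely schematic sketch; all names are made up and the AI is a stub.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class RoundResult:
    solved: bool     # did this instance solve the FAI problem?
    notes: str       # what it tried, for human review

def run_capped_ai(prior_notes: List[str], years: float) -> RoundResult:
    # Stand-in for: boot a fresh, compute-capped AI that cannot modify its
    # own source code, tell it what earlier instances did, let it work until
    # the time limit, then put it to sleep.
    return RoundResult(solved=False, notes=f"worked for {years} years")

def iterate_on_fai(max_rounds: int, years_per_round: float) -> Optional[RoundResult]:
    notes: List[str] = []
    for _ in range(max_rounds):
        result = run_capped_ai(notes, years_per_round)
        notes.append(result.notes)   # humans review what worked and what didn't
        if result.solved:
            return result            # FAI solved; wake up the original instances
    return None

iterate_on_fai(max_rounds=10, years_per_round=2.0)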

But there are still some problems. One: time. Why try this when we could solve FAI ourselves? Well, I would only try to implement something like this if it were clear that AGI will be solved before FAI is. A backup plan, if you will. Second: what if FAI is just too much for people at our current level? Sure, we have guys who are one in ten thousand and better working on this, but what if we need someone who's one in a hundred billion? Someone who represents the peak of human ability? We shouldn't just wait around for them, since some idiot would probably just make an AGI thinking it would love us all anyway.

So, what do you guys think? As a plan, is this reasonable? Or have I just overlooked something completely obvious? I'm not saying that this would be easy in any way, but it would be easier than solving FAI.

Translating bad advice

16 Sophronius 14 April 2015 09:20AM

While writing my Magnum Opus I came across this piece of writing advice by Neil Gaiman:

“When people tell you something’s wrong or doesn’t work for them, they are almost always right. When they tell you exactly what they think is wrong and how to fix it, they are almost always wrong.”

And it struck me how true it was, even in other areas of life. People are terrible at giving advice on how to improve yourself, or on how to improve anything really. To illustrate this, here is what you would expect advice from a good rationalist friend to look like:

1)      “Hey, I’ve noticed you tend to do X.”

2)      “It’s been bugging me for a while, though I’m not really sure why. It’s possible other people think X is bad as well, you should ask them about it.”

3)      Paragon option: “Maybe you could do Y instead? I dunno, just think about it.”  

4)      Renegade option: “From now on I will slap you every time you do X, in order to help you stop being retarded about X.”

I wish I had more friends who gave advice like that, especially the renegade option. Instead, here is what I get in practice:

1)      Thinking: Argh, he is doing X again. That annoys me, but I don’t want to be rude.

2)      Thinking: Okay, he is doing Z now, which is kind of like X and a good enough excuse to vent my anger about X.

3)      *Complains about Z in an irritated manner, and immediately forgets that there’s even a difference between X and Z*

4)      Thinking: Oh shit, that was rude. I better give some arbitrary advice on how to fix Z so I sound more productive.

As you can see, social rules and poor epistemology really get in the way of good advice, which is incredibly frustrating if you genuinely want to improve yourself! (Needless to say, ignoring badly phrased advice is incredibly stupid and you should never do this. See HPMOR for a fictional example of what happens if you try to survive on your wits alone.) A naïve solution is to tell everybody that you are the sort of person who loves to hear criticism in the hope that they will tell you what they really think. This never works because A) Nobody will believe you since everyone says this and it’s always a lie, and B) It’s a lie, you hate hearing real criticism just like everybody else.

The best solution I have found is to make it a habit to translate bad advice into good advice, in the spirit of what Neil Gaiman said above: Always be on the lookout for people giving subtle clues that you are doing something wrong, and ask them about it (preferably without making yourself sound insecure in the process, or they'll just tell you that you need to be more confident). When they give you some bullshit response that is designed to sound nice, keep at it and convince them to give you their real reasons for bringing it up in the first place. Once you have recovered the original information that led them to give the poor advice, you can rewrite it as good advice in the format used above. Here is an example from my own work experience:

1)      Bad advice person: “You know, you may have your truth, but someone else may have their own truth.”

2)      Me, confused and trying not to be angry at bad epistemology: “That’s interesting. What makes you say that?”

3)      *5 minutes later*. “Holy shit, my insecurity is being read as arrogance, and as a result people feel threatened by my intelligence which makes them defensive? I never knew that!”

Seriously, apply this lesson. And get a good friend to slap you every time you don’t.
