Eris

Currently an independent AI Safety researcher. Ex software developer, ex QA. 

Prior to working in industry, I was involved in academic research on cognitive architectures (the old ones). I'm a generalist with a focus on human-like AIs (I know a couple of things about developmental psychology, cognitive science, ethology, and computational models of the mind).

Personal research vectors: ontogenetic curriculum and the narrative theory. The primary theme is consolidating insights from various mind-related areas into a plausible explanation of human value dynamics.

A long-time lesswronger (~8 years). Mostly been active in the local LW community (as a consumer and as an org).

Recently I've organised a sort of peer-to-peer accelerator for anyone who wants to become an AI Safety researcher. Right now there are 17 of us.

Was a part of AI Safety Camp 2023 (Positive Attractors team). 
 

Sequences

The shape of the solution
You can (not) advance
Ontogenetic Curriculum
Narrative Theory


Comments

Eris10

Agreed. That said, some efforts in this direction do exist. For example, Ekdeep Singh Lubana and his Explaining Emergence in NN with Model Systems Analysis:

https://ekdeepslubana.github.io/

Answer by Eris21

(My day-to-day job is literally to tackle the 'generality' of intelligence)

While having high IQ/g is useful, it is not what lies at the core of great performance. Having developed 'intelligences' around the task you're tackling, + determination/commitment/obsession, + agency is what creates great results. 

I think it's better to focus on things one could change/train; sadly, IQ/g is not one of those things.

Answer by Eris20

There is a book called The Culture Map. It maps behavioral differences across cultures, including those related to genuineness. For example, in cultures with a direct attitude to criticism/feedback you can be more certain that a given comment is truthful than in cultures with indirect feedback. (And more so if the comment is harsh.)

Eris10

A Thousand Narratives. Theory of Cognitive Morphogenesis
Part 6/20. Artificial Neural Networks

“Alan Turing started off by wanting to 'build the brain' and ended up with a computer”
- Henry Markram, The Blue Brain Project

Recently I’ve come to terms with the idea that I have to publish my research even if it feels unfinished or slightly controversial. The mind is too complex (who would have thought): each time you think you get something, a new bit comes up and crushes your model. Time after time after time. So, waiting for at least remotely good answers is not an option. I have to “fail fast” even though it’s not a widely accepted approach among scientists nowadays.

With that, Reinforcement learning and in-depth analysis of the mentioned models will be covered later. The goal of this part is to explain the reasoning behind the choice of the surface area.

Artificial Neural Networks are the face of modern artificial intelligence and the most successful branch of it too. But success unfortunately doesn’t mean biological plausibility. Even though most ML algorithms have been inspired by aspects of biological neural networks, the final models end up pretty far from the source material. This makes their usefulness for the quest of reverse engineering the mind questionable. What I mean here is that almost no insights can be directly brought back to neuroscience to help with the research. I’ll explain why in a bit. (Note: this doesn’t mean that they cannot serve as an inspiration. This is very much possible and, I’m sure, a good idea.)

There are three main show-stoppers:

(Reason #1) is the use of an implausible learning algorithm (read: backpropagation). There have been numerous attempts at finding a biological analogue of backpropagation, but all of them fell short as far as I know. The core objection to the biological plausibility of backpropagation is that weight updates in multi-layered networks require access to information that is non-local (i.e. error signals generated by units many layers downstream). In contrast, plasticity in biological synapses depends primarily on local information (i.e., pre- and post-synaptic neuronal activity)[1].
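To make the locality objection concrete, here is a minimal Python sketch contrasting the two kinds of update on a tiny two-layer network. The network shape, the function names, and the plain Hebbian rule are illustrative assumptions, not taken from the cited paper. The point is only to show which quantities each update needs to read.

```python
import math

def forward(x, w1, w2):
    """Tiny 2-layer net: tanh hidden units (rows of w1), one linear output."""
    h = [math.tanh(sum(wi * xi for wi, xi in zip(row, x))) for row in w1]
    y = sum(w * hj for w, hj in zip(w2, h))
    return h, y

def backprop_update_w1(x, target, w1, w2, lr=0.1):
    """Gradient step on first-layer weights for loss 0.5 * (y - target)**2.
    Non-local: needs the output error and the downstream weights w2."""
    h, y = forward(x, w1, w2)
    err = y - target  # produced at the output, one layer downstream
    return [[wij - lr * err * w2[j] * (1 - h[j] ** 2) * x[i]
             for i, wij in enumerate(row)]
            for j, row in enumerate(w1)]

def hebbian_update_w1(x, w1, lr=0.1):
    """Plain Hebbian step (an assumed stand-in for local plasticity):
    uses only pre-synaptic x[i] and post-synaptic h[j]."""
    h = [math.tanh(sum(wi * xi for wi, xi in zip(row, x))) for row in w1]
    return [[wij + lr * h[j] * x[i] for i, wij in enumerate(row)]
            for j, row in enumerate(w1)]
```

Note that `hebbian_update_w1` does not even take `w2` or the target as arguments, while `backprop_update_w1` cannot be computed without them. That asymmetry is the non-locality described above.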

(Reason #2) is the fact that ANNs are being used to solve “synthetic” problems. The vast majority of ANNs originated from industry, designed to solve some practical real-world problem. For us, this means that the training data used for these models would have almost nothing in common with the human ontogenetic curriculum (or part of it) and hence not allow us to use it for this kind of research.

(Reason #3) is the use of implausible building blocks and morphology of the network, resulting in implausible neural dynamics. (e.g. use of point neurons instead of full-blown multi-compartment neurons, the use of all types of neural interaction instead of just STDP). We still don’t know how crucial those alternative modes are, but the consensus on this matter is “we need more than we use right now”.

However, there are three notable exceptions:

(The first exception) is convolutional neural networks and their successors. They have been copied from the mammalian visual cortex and are considered sufficiently biologically plausible. The success of convNets is based on the utilization of design principles specific to the visual cortex, specifically shared weights and pooling[2]. The area of applicability of these principles is an open question.
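The two design principles named above, shared weights and pooling, can be illustrated with a small pure-Python sketch. The 1-D setting, the function names, and the edge-detector kernel are assumptions chosen for brevity; real convNets work on 2-D images with learned kernels.

```python
def conv1d(signal, kernel):
    """Valid 1-D cross-correlation: the same kernel weights are reused
    at every position of the input (weight sharing)."""
    k = len(kernel)
    return [sum(kernel[j] * signal[i + j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def max_pool1d(features, size):
    """Non-overlapping max pooling: keeps the strongest response
    in each window, discarding its exact position."""
    return [max(features[i:i + size])
            for i in range(0, len(features) - size + 1, size)]

edge_kernel = [-1, 1]                      # responds to a rising edge
signal_a = [0, 0, 0, 1, 1, 1, 1, 1, 1]     # edge between index 2 and 3
signal_b = [0, 0, 1, 1, 1, 1, 1, 1, 1]     # same edge, shifted by one

pooled_a = max_pool1d(conv1d(signal_a, edge_kernel), size=4)
pooled_b = max_pool1d(conv1d(signal_b, edge_kernel), size=4)
```

Because one kernel is slid over the whole input, the detector has two parameters regardless of input length, and because pooling keeps only the maximum per window, `pooled_a` and `pooled_b` come out identical even though the edge moved by one position.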

(The second) is highly biologically plausible networks like Izhikevich’s, The Blue Brain project, and others. Izhikevich’s model is built from multi-compartment high-fidelity neurons displaying all the alternative modes of neural/ganglia interaction[3]. Among the results, my personal favourite is “Network exhibits sleeplike oscillations, gamma (40 Hz) rhythms, conversion of firing rates to spike timings, and other interesting regimes. Due to the interplay between the delays and STDP, the spiking neurons spontaneously self-organize into groups and generate patterns of stereotypical polychronous activity. To our surprise, the number of coexisting polychronous groups far exceeds the number of neurons in the network, resulting in an unprecedented memory capacity of the system.”

(The third) is Hierarchical Temporal Memory by Jeff Hawkins. It’s a framework inspired by the principles of the neocortex. It claims that the role of the neocortex is to integrate the upstream sensory data and then find patterns within the combined stream of neural activity. It views the neocortex as an auto-association machine (a view I at least partially endorse). HTM was developed almost two decades ago but, to the best of my knowledge, failed to earn much recognition. Still, it’s the best model of this type, so it is worth considering.

  1. ^

    Demis Hassabis. Neuroscience-Inspired Artificial Intelligence. https://www.sciencedirect.com/science/article/pii/S0896627317305093

  2. ^

    Y. Lecun, Y. Bengio. Gradient-based learning applied to document recognition. https://ieeexplore.ieee.org/abstract/document/726791

  3. ^
Eris10

A Thousand Narratives. Theory of Cognitive Morphogenesis
Part 4/20. Neural Darwinism

"if the problems are the same, it (evolution) often finds the same solution"
- Richard Dawkins, The Blind Watchmaker

Neural Darwinism, also known as the theory of neuronal group selection, is a theory that proposes that the development and organisation of the brain is similar to the process of biological evolution. According to this theory, the brain is composed of a large number of neural networks that compete with each other for resources and survival, much like biological organisms competing for resources in their environment.

The main similarity between Neural Darwinism and evolution is that they both involve a process of variation, selection, and adaptation. In biological evolution, organisms with advantageous traits are more likely to survive and reproduce, passing those traits on to their offspring. Similarly, in Neural Darwinism, neural networks that are better able to compete for resources and perform necessary functions are more likely to be preserved and strengthened, while weaker or less effective networks are pruned away.

The core claims of Neural Darwinism[1]:

  1. Neuronal groups, or populations of neurons that are functionally connected, compete with one another for resources and influence within the brain. Neuronal groups that are better adapted to a particular task or context are more likely to survive and thrive, while those that are less well-adapted are more likely to be eliminated or suppressed. The process of selection and adaptation occurs through a combination of genetic factors and experience-dependent modifications to neural connections. The brain is able to generate highly specific and adaptive responses to a wide range of stimuli through the dynamic interactions of neuronal populations.
  2. Spatiotemporal coordination of the neural activity underlying these selectional events is achieved mainly by a process of reentry. Reentry is the synchronous entrainment of reciprocally connected neuronal groups within sensorimotor maps into ensembles of coherent global activity.
  3. Neural Darwinism proposes that the brain uses degenerate coding, which means that multiple neural populations can respond to the same stimulus, allowing for redundancy and flexibility in neural processing.
  4. The initial population of groups is known as the primary repertoire and is formed during prenatal development.
  5. The connections which are modified during development are between neuronal groups, rather than between specific cells.
  6. The primary repertoire and selection are together responsible for the creation of a secondary repertoire, which is involved in the subsequent behavior of the organism.
  7. The operation of selection in Neural Darwinism is manifested through the selective stabilisation of neural connections that are relevant to the task at hand. Connections that are not relevant or are redundant are eliminated through a process of competitive interaction, while connections that are relevant are strengthened and stabilised.
  8. The relationship of external events to specific operations of selection in Neural Darwinism is that external events provide the stimuli and experiences that drive the selective stabilisation process. The brain is constantly adapting and modifying its neural connections based on the environmental stimuli and experiences that it encounters. Therefore, the specific operations of selection are driven by the external events that the brain is exposed to.
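The selection loop described in the claims above can be caricatured in a few lines of Python. This is a toy sketch under loud assumptions, not Edelman's model: each "neuronal group" is reduced to a single tuning value and a strength, the best-matching group for each stimulus is strengthened, all others decay, and groups that fall below a threshold are pruned. All names and parameter values are hypothetical.

```python
def select_groups(groups, stimuli, gain=0.2, decay=0.1, prune_below=0.05):
    """groups: dict name -> {"tuning": float, "strength": float}
    (a stand-in for the primary repertoire).
    Returns the surviving groups (a stand-in for the secondary repertoire)."""
    groups = {name: dict(g) for name, g in groups.items()}  # leave input intact
    for s in stimuli:
        if not groups:
            break
        # The group whose tuning best matches the stimulus wins this round.
        winner = min(groups, key=lambda n: abs(groups[n]["tuning"] - s))
        for name in list(groups):
            g = groups[name]
            if name == winner:
                g["strength"] += gain * (1.0 - g["strength"])  # stabilise
            else:
                g["strength"] *= 1.0 - decay                   # weaken
            if g["strength"] < prune_below:
                del groups[name]                               # eliminate
    return groups

primary = {
    "a": {"tuning": 0.1, "strength": 0.3},
    "b": {"tuning": 0.5, "strength": 0.3},
    "c": {"tuning": 0.9, "strength": 0.3},
}
# Experience dominated by stimuli near 0.1, with occasional ones near 0.9.
experience = [0.1, 0.15, 0.9, 0.1] * 10
secondary = select_groups(primary, experience)
```

With this made-up stream of experience, group "b" never wins a round and is eventually pruned, while "a" and "c" survive, with "a" ending up stronger because its preferred stimuli are more frequent. External events drive which groups get stabilised, which is the relationship claim 8 describes.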

ND has little to say about how exactly cognitive processes such as decision-making, problem-solving, and other executive functions occur, but it provides a plausible basis for future developments. It has been mostly accepted (except for the criticism that it lacks “units of evolution”, replicators capable of hereditary variation[2]. I personally do not endorse this criticism and will address it in the Narrative Theory section) and has become part of a fruitful direction of research.

 

  1. ^

    Neural Darwinism: The theory of neuronal group selection. GM Edelman. https://psycnet.apa.org/record/1987-98537-000

  2. ^

    The Neuronal Replicator Hypothesis. Chrisantha Fernando, Richard Goldstein, Eörs Szathmáry. https://direct.mit.edu/neco/article-abstract/22/11/2809/7586/The-Neuronal-Replicator-Hypothesis

Eris10

A Thousand Narratives. Theory of Cognitive Morphogenesis. Part 3/20
Simplest to succeed

"Evolution is a tinkerer, not an engineer. It works with what is already there and takes the path of least resistance. 
 It is not always the most efficient solution, but it is the dumbest solution that works." 
 -François Jacob, "The Logic of Life: A History of Heredity"

Reverse engineering complex systems is a tricky problem. Look, for example, at the design of modern microprocessors: how easy would it be to see the underlying principle of the Turing machine behind all the caches, branch prediction, thread balancing, and the rest? Not that easy, I would say. This example might also apply to reverse engineering the mind. After peeling away all the evolutionary optimizations, the underlying principles of brain design might turn out to be fairly simple. Some of these optimizations have already been identified and well studied[1] (e.g. mechanisms of translating chemical signals into electrical ones, numerous structural decisions dedicated to spending as little wire and energy as possible, ways of getting information from different sensors to the same frequency). While we have not yet succeeded in this quest, the idea of simple but powerful core principles should be part of our strategy for getting there.

Biology offers one more piece of the strategy. Although the overall amount of “design work” done by evolution is incredible, only a fraction of it is directly associated with the decision-making circuitry of the mind. The relatively slow pace of cognitive evolution means that there was not much time for reinventing cognitive architecture between subsequent species. This means that most of the necessary parts of the apparatus were already present in primates, a somewhat smaller part was already present in mammals, and so on.

The reasoning above, together with a liberal application of the Lindy Effect[2], justifies taking a certain stance towards reverse engineering the mind - the longer some specific design principle has been around, the more it is represented in the construction of the system and the more emphasis we should put on it while building our models.

By this logic, we should expect the bulk of the design to be implemented via the use of a tiny set of the oldest mechanisms (if we take into account the timeframes of the introduction of all of them). The prime suspects for that set are:

  • Reflexes aka direct stimuli-action reactions
  • Associations. Extension of stimulus-reaction dynamics into conditioned ones
  • Inhibition and locality-based transmission of signals (e.g. neuromodulators)
  • Body modes like emotional states
  • Specialised circuits for different types of information
  1. ^

    Principles of Neural Design. Peter Sterling and Simon Laughlin. https://mitpress.mit.edu/9780262534680/principles-of-neural-design/

  2. ^
Eris10

A Thousand Narratives. Theory of Cognitive Morphogenesis. Part 2/20
A new way of doing the same thing

"Is an ant colony an organism, or is an organism a colony?" 
- Mark A. Changizi

As of now, there are two kinds of evolution: genetic evolution and memetic evolution. The first one is your usual evolution concerned with "change in the heritable characteristics of biological populations over successive generations", responsible for all the biological diversity that we know, and happening on the scale of at least hundreds of years. Memetic evolution, strictly speaking, is just a particularly powerful set of adaptations that appeared in primates (and unique only to them), that enabled the accumulation of adaptations during a lifetime, responsible for the cultural progress of humanity, and happening on the scale from minutes to years depending on definition.

The meme as a concept was coined by biologist Richard Dawkins in his 1976 book "The Selfish Gene"[1] and refers to units of cultural information that are transmitted from person to person through imitation or other forms of cultural transmission. Like genes in biological evolution, memes can undergo processes of variation, selection, and transmission that can lead to their spread or decline within a population.

Memetic evolution became possible after the introduction of several key mechanisms: the obvious suspects such as language and social learning; their dependencies like signaling (prelinguistic communication), {niche construction, extended phenotype}[2], scaffolded upbringing, theory of mind; and the development of the necessary neural substrates enabling all these mechanisms (whatever they are).

The main benefit that the development of memetic evolution has brought is a drastic increase in problem-solving capacity (both on the level of the population and, more importantly for this post, on the individual level).

While general dynamics remained the same (organisms being innovation aggregators) the details have changed:

  • Selective pressure has also changed, now being pointed at storing knowledge and social interactions. This might be viewed as the last big impact of genetic evolution - the creation of the adaptation loop driving memetic evolution, which resulted in humans having bigger brains more suitable for both.
  • From that moment onward species switched (arguably fully) towards the accumulation of innovations through memes instead of genes. Individuals that were for some reason deprived of access to the meme pool effectively got thrown back to primates in terms of environmental success.
  1. ^

    Richard Dawkins. The Selfish Gene. https://www.goodreads.com/book/show/61535.The_Selfish_Gene?from_search=true&from_srp=true&qid=oWwQlQJHhQ&rank=1

  2. ^

    Richard Dawkins. The Extended Phenotype. https://www.goodreads.com/book/show/61538.The_Extended_Phenotype?from_search=true&from_srp=true&qid=Ko5sX4zBtL&rank=1

Eris10

A Thousand Narratives. Theory of Cognitive Morphogenesis. Part 1/20. Intro

The ultimate goal of this line of research is to gain a better understanding of how the human value system operates. The problem I see with current approaches to studying values is that we cannot study {values/desires/preferences} in isolation from the rest of the cognitive mechanisms, because according to the latest theories values are just a part of a broader system governing behaviour in general. Given that, you have to have a decent model of human behaviour first to then be able to explain value dynamics.

To get a good theory of the mind you have to meet multiple requirements:

  1. A good theory of the mind must span at least four different timescales: (genetic evolution) for the billion years in which our brains have evolved; (memetic evolution) for the centuries of cultural accumulation of ideas through history; (personal) for the individual development during lifetime; and (neuronal) milliseconds during which cognitive inference happens.
  2. A good theory must explain the behaviour of the system on each of Marr’s three levels of analysis[1]: (1) the computational problem the system is solving; (2) the algorithm the system uses to solve that problem; and (3) how that algorithm is implemented in the “physical hardware” of the system. And, the part I think Marr is missing, the third level also has to include an explanation of how the learning environment affects the agent.
  3. A good theory must at least make an attempt at answering the main questions: how is the generality of intelligence achieved?; what is the neural substrate of memory?; etc.

To meet these requirements I’ve combined insights from several fields: Developmental Psychology, Neuroscience, Ethology and Computational models of the mind. The result is the Narrative Theory. The research is still far from completion but there are already interesting insights to be shared.

At this moment NT is similar to Shard Theory in many ways, but it also differs from it in several others: (1) NT is trying to integrate “more distant” but still crucial perspectives (like ethology and linguistics). (2) It is concerned with the flow of development of human behaviour as a whole instead of focusing on values. And (3) NT is only concerned with human intelligence, for now ignoring the topic of artificial agents entirely.

It’s pretty audacious to say that one can make progress on something as big as a computational theory of human behaviour, but there are two things giving me hope of succeeding: (1) It’s been quite a while since the last wave of overarching psychological theories. (2) The last decades were sort of a divergent period of scientific inquiry (when it comes to mind studies): efforts have mostly been focused on puzzling out the smaller pieces of The Problem, and there have been no serious attempts at updating previous theories with newly found evidence (or even integrating those theories with each other). Together these suggest that there is now room for improvement.

Note on vocabulary. Each mentioned theory has its own unique language. This may present a problem for unprepared readers. While I will unpack and rephrase convoluted terms when possible, not everything can be stripped away.

This post is structured as follows: (the first section) is a list of constraints discovered by various mind-related fields that are crucial for building an overarching theory of mind; (the second section) presents the first claims of Narrative Theory built in accordance with the known constraints; and (the third section) covers implications of the theory, problems and future work directions.

  1. ^

    David Marr.  Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. https://academic.oup.com/mit-press-scholarship-online/book/13528

Answer by Eris60

Nug and Yeb by Exploring Egregors
Ars Longa, Vita Brevis by Scott Alexander
