This reminds me of this post about the encoding of "fear" of snakes: https://www.lesswrong.com/posts/bgqmv8YF6HA3mvrPM/attention-to-snakes-not-fear-of-snakes-evolution-encoding
Does this fit? Do you have other examples?
It fits perfectly, thanks!
Yes, there's a bunch of other mechanisms/phenomena, such as
- the developmental windows for learning speech and language,
- the spectrum of reactions to distress (anger, fear, etc.),
- the palmar grasp reflex.
Basically I'm interested in all biological mechanisms that control our learning, not just affects, even the ones that seem irrelevant for AI purposes. As can be seen from Kaj's post there, the way to get these systems to work might be nonintuitive, so every little hint will help in the end.
I think another post might be in order to fully explore the list of all of these biological mechanisms at some point, maybe as a pitstop before going into the full deal.
I have found a source of some more plausible mechanisms tied to common emotions here: Dares, costly signals, and psychopaths (which references The Psychopath Code; see the raw text on GitHub). These sources focus on psychopaths but give extremely well-suited descriptions of several classes of emotions.
Some examples:
Hunger [...] Your digestion slows. Your vision and hearing gets sharper and you focus on distinguishing prey from threats. You feel the need to move, yet you are careful to stay invisible. You walk without haste, and keep your posture relaxed. Your breathing is regular, slow.
Euphoria [...] Your hearing switches off and your vision tunnels in on your target. Your breathing and heartbeat accelerate. Blood flows to your muscles, and glucose feeds into your blood. Your eyes widen, your mouth opens, and you bare your teeth.
Surprise [...] "startle response." You flinch away from the threat, and raise your arms in self-defense. You lift your eyebrows and open your eyes wide to see better. Your hearing gets sharp. You exhale hard to clear your lungs of carbon dioxide. Your heart accelerates and you breathe in deep to oxygenate your body for action.
Love - [...] We establish "closeness" by mutual physical contact. The kinds of contact depend on the relationship. The closer you are to another person the more you feel the emotion. Your eyebrows rise, your pupils widen, you smile and laugh and feel happy. You use open and dominant body language. You are more childlike: playful and uninhibited. You seek more contact. You need less sleep.
All of the descriptions are like this, and I think they are an excellent source when looking for mechanisms that facilitate the recognition of the more abstract patterns.
Other things that are candidates:
There's one issue I don't have an answer to yet: how would the visual system detect "height"?
Could we presume there is a spatial engine that needs to be taught first, and then linked to this phobia?
Or would it make sense to have a direct link to a spatial predictive system instead, one that triggers when the system predicts the agent might suddenly need more space to maneuver, and that space is instead occupied by a void? At least *I* cannot simply look up when the fear of heights triggers and get a sudden sensation of vertigo: I need to know where the closest brace-point is when I know falling might be imminent.
The visual system wouldn't detect the abstract concept of height; that would be the brain's job to figure out, primed by when the trigger fires and what else correlates with it.
I imagine the visual system would detect visual depth from binocular vision. Babies learn this in the first few months, and it is one of the things that cause them distress when it gets activated in the brain. I don't know the research papers, but these might be starting points:
https://www.beltz.de/fileadmin/beltz/leseproben/978-3-621-27926-0.pdf (picture 2.2, German)
https://www.thewonderweeks.com/babys-mental-leaps-first-year/ (week 26)
So visual depth you get without much learning (or with other priming steps ahead of it; I understand these are well researched). What is left is the vertical component, and I guess that comes from the vestibular system. Looking down + visual depth = height trigger.
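The combination above can be sketched as a hypothetical trigger function. All names and thresholds here are illustrative assumptions, not claims about the actual neural wiring:

```python
def height_fear_trigger(gaze_pitch_deg: float,
                        perceived_depth_m: float,
                        depth_threshold_m: float = 3.0) -> bool:
    """Sketch: the fear-of-heights affect fires when the vestibular/gaze
    signal says 'looking down' AND the visual system reports a large
    depth below the agent. Thresholds are made up for illustration."""
    looking_down = gaze_pitch_deg < -30.0   # gaze pitched well below horizon
    large_drop = perceived_depth_m > depth_threshold_m
    return looking_down and large_drop
```

The point of the sketch is only that neither signal alone suffices: depth without the downward vestibular cue (a long corridor) or looking down without depth (the floor) should not trigger the affect.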
It is funny that you mention the need to grasp something, and maybe that is the hard-wired cue: Close the hand.
If I understood correctly, babies cannot focus their eyes properly for the first two months, which may indicate they are learning some universal 3D-spatial models, as a prerequisite for many of the other instincts that activate in later developmental windows. So there has to be some thread of signals stringing this system to the later affects/instincts, such as the fear of heights.
It is also funny to note the ability of many ungulate newborns to walk immediately at birth, meaning there has to be some seriously robust set of instincts coordinating this for them. This blurs the boundary between instinctual and learned coordination, but I believe that in the end all cortex-having brains benefit from moving away from instincts and towards learned models.
Related: Gene for upright walk in humans discovered: https://www.theguardian.com/science/2008/jun/02/genetics.medicalresearch
... Or more specifically, a post on how, and why, to encode emotions to find out more about goals, rationality and safe alignment in general.
If one tries to naively fit reinforcement learning’s reward functions back onto the human mind, the closest equivalent one may find is emotions. Delving deeper into the topic, though, one finds a mish-mash of other “reward signals” and auxiliary mechanisms in the human brain (such as the face-tracking reflex, which aids us in social development) and ends up hearing about affects, the official term in the study of emotions.
At least that is what approximately happened with me.
Affects and reward functions seem to have a common functional purpose in agents, in that they both direct the agent’s attention towards what is relevant.
Both:
This means that if we can map and write all of the human affects into reward functions, we can compare various constellations of affects and see which ones produce which human-like behaviours. This in turn may lead not only to solutions for inducing human-like biases into AI, but also to ways of investigating our own values and rationality from a new perspective.
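As a very rough illustration of what "writing affects into reward functions" might look like, here is a sketch in which each affect contributes a weighted term to a composite reward, so that different "constellations" can be compared by swapping weights. All affect names, signals, and weights are illustrative assumptions:

```python
from typing import Callable, Dict

# Each affect maps an observation to a scalar "affective" signal.
AffectFn = Callable[[dict], float]

def make_composite_reward(affects: Dict[str, AffectFn],
                          weights: Dict[str, float]) -> Callable[[dict], float]:
    """Combine affect channels into one scalar reward, so different
    constellations of affects can be compared by swapping weights."""
    def reward(observation: dict) -> float:
        return sum(weights[name] * fn(observation)
                   for name, fn in affects.items())
    return reward

# Illustrative affect channels (pure assumptions, not real models):
affects = {
    "surprise": lambda obs: obs.get("prediction_error", 0.0),
    "fear_of_heights": lambda obs: -1.0 if obs.get("looking_down_a_drop") else 0.0,
}
constellation_a = make_composite_reward(
    affects, {"surprise": 0.5, "fear_of_heights": 2.0})
```

Running agents under different weight dictionaries would then be one crude way to "compare constellations" as described above.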
The purpose of this post is to introduce a few ways to proceed with the task of encoding affects. First, there will be a rudimentary definition of the components and some motivational points on what this all could mean. After that there will be an introduction to three distinct levels of representation for various use cases, from philosophical to ontological, and finally pseudocode.
Disclaimers:
This post is meant to act as a conversation starter, so many points might be alarmingly concise.
The formalities are meant to replicate the functionality of human behaviour, and are not claimed to be exact copies of the neurological mechanisms themselves. Tests should be performed to find out what emerges in the end. Some points might be controversial, so discussion is welcome.
Definitions
Alright, let's define the two a bit more in depth and see how they compare.
Reward functions are part of reinforcement learning, where a “computer program interacts with a dynamic environment in which it must perform a certain goal (sic, goal here is the programmer’s) (such as driving a vehicle or playing a game against an opponent). As it navigates its problem space, the program is provided feedback that's analogous to rewards, which it tries to maximize.”
- Wikipedia, reinforcement learning
Specific points about reward functions that will conveniently compare well with my argument:
Affect theory is “the idea that feelings and emotions are the primary motives for human behaviour, with people desiring to maximise their positive feelings and minimise their negative ones. Within the theory, affects are considered to be innate and universal responses that create consciousness and direct cognition.”
- APA, affect theory
Specific points about affects (citations pending):
Disclaimer: The claim here is not that all values and goals necessarily come from emotions later in life, when they can be based on other values and existing knowledge; rather, that our very first values came from affects during infancy and childhood, and thus the ultimate source of all values is, in the end, affects.
Further elaboration can also be found in appraisal theory and affective neuroscience.
So what is common?
Both frameworks define what the agent can learn, what it values, and what goals it can create based on those values. I will posit even further that neither humans nor AI would “learn what to do” if there weren’t any criteria towards which to learn, leaving only reflexive and random actions. We can see this clearly from the definition of RL agents: remove their reward function, and they cannot learn the "relevant" connections from the environment they work in. With humans we could study brain lesion patients and birth defects, but more on that later. What I have found thus far was inconclusive, but the search continues.
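The RL side of this claim can be shown with a minimal toy sketch (not any specific agent from the literature): with the reward zeroed out, a tabular Q-learning update never moves the value estimates away from their initialization, so nothing is ever marked "relevant".

```python
def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step: Q(s,a) += alpha * TD-error."""
    best_next = max(q[next_state].values())
    td_error = reward + gamma * best_next - q[state][action]
    q[state][action] += alpha * td_error
    return q

# Two states, two actions, Q initialized to zero.
q = {s: {a: 0.0 for a in ("left", "right")} for s in (0, 1)}

# With the reward function removed (identically zero), the TD-error
# is always zero and no amount of experience changes the table.
for _ in range(1000):
    q = q_update(q, state=0, action="left", reward=0.0, next_state=1)
```

After a thousand updates every entry is still exactly 0.0: without a reward signal the agent has no criterion for learning.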
But what does it all mean?
Meanwhile, let’s discuss a number of beliefs I have regarding the topic; some are more certain than others. All of these could have a discussion of their own, but I will simply list them here for now.
Alright, back to business.
Examples on representation
Here are three different levels of representation for encoding affects and their interplay within consciousness, from philosophical ponderings to actual pseudocode.
Philosophical:
Surprise was already partially developed with TD(λ), and has been further refined by DeepMind with episodic curiosity.
In practice, the agent could be constantly predicting what it might perceive next, and if this prediction is wildly different, the prediction error could be directly proportional to the amount of surprise.
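A minimal sketch of that idea, assuming the agent's predictions and percepts are simple feature vectors (the distance metric is an arbitrary choice for illustration):

```python
import math

def surprise(predicted: list, observed: list) -> float:
    """Sketch: surprise as the magnitude of the prediction error,
    here simply the Euclidean distance between what the agent
    predicted it would perceive and what it actually perceived."""
    return math.dist(predicted, observed)

# Accurate prediction -> little surprise; wild mismatch -> large surprise.
low = surprise([0.9, 0.1], [1.0, 0.0])
high = surprise([0.9, 0.1], [-3.0, 4.0])
```

The surprise affect would then feed this scalar into attention and learning, rather than being a learned quantity itself.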
Ontology card:
Reciprocity violation
An example of a more fleshed out ontology with the theoretical fields addressed.
This was a card we designed with an affective psychologist some time ago; it basically outlines the interface of the affect within the whole system.
Pseudocode: Learning the concept ‘agent’
Affects also have implicit requirements for their functionality, such as the concept “another agent” to tie all of the social affects to. Sorry, this one is a bit of a mess, but the idea should be visible.
This pseudocode addresses a developmental window we humans have, which helps us generate the concept of "agents" faster and more reliably. It is a learned feature within our consciousness, but because many of our social affects are based on the concept of an agent, this is something the genes just have to make sure the brain has learned (and then linked to via a preprogrammed pointer, meaning the 'affect'). We can see this mechanism partially breaking in some cases of severe autism, where the child doesn't look people in the eyes.
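Since the original pseudocode is hard to reproduce here, the following is a loose Python sketch of the idea only; every name, threshold, and mechanism is an illustrative assumption. The genes provide a time-limited window that boosts learning from face-like stimuli, and a fixed "pointer" that attaches social affects once the learned concept is reliable:

```python
class AgentConceptLearner:
    """Loose sketch of a developmental window for learning 'agent'."""

    WINDOW_STEPS = 1000     # illustrative length of the window
    RELIABLE = 0.8          # illustrative confidence threshold

    def __init__(self):
        self.step = 0
        self.agent_concept_confidence = 0.0
        self.social_affects_linked = False

    def observe(self, stimulus_is_face_like: bool) -> None:
        self.step += 1
        in_window = self.step <= self.WINDOW_STEPS
        # Innate reflex: strong learning signal for face-like stimuli,
        # but only while the developmental window is open.
        boost = 0.01 if (in_window and stimulus_is_face_like) else 0.0005
        if stimulus_is_face_like:
            self.agent_concept_confidence = min(
                1.0, self.agent_concept_confidence + boost)
        # The preprogrammed 'pointer': once the learned concept is
        # reliable enough, social affects attach to it.
        if self.agent_concept_confidence >= self.RELIABLE:
            self.social_affects_linked = True
```

The failure mode mentioned above maps onto the sketch directly: if face-like stimuli never receive attention during the window, the concept never reaches the threshold and the social affects have nothing to link to.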
These pseudocodes could eventually be combined into a framework of actual code, and tested in places such as OpenAI's multi-agent environment.
Alright, that's it for this one. I'll just end this with a repetition of the disclaimer:
Tests should be performed to find out what emerges in the end. Some points might be controversial, so discussion is welcome, and this material will eventually be revised accordingly.