edbs - LessWrong

Decomposing Agency — capabilities without desires

edbs10moΩ246116

The Active Inference literature on this is very strong, and I think the best and most overlooked part of what it offers. In Active Inference, an agent is first and foremost a persistent boundary. Specifically, it is a persistent Markov blanket, an idea due to Judea Pearl (https://en.wikipedia.org/wiki/Markov_blanket). The short version: a Markov blanket is a statement that a certain amount of state (the interior of the agent) is conditionally independent of a certain other amount of state (the rest of the universe), and that this independence is conditioned on the blanket state that sits in between the exterior and the interior.
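For readers who want the condition written out, here is a minimal statement of the conditional independence described above. The symbols are my assumption, not something fixed by this comment: $\mu$ for internal states, $\eta$ for external states, and $b$ for blanket states.

```latex
% Assumed notation: \mu = internal states, \eta = external states, b = blanket states.
% The blanket "screens off" the interior from the exterior:
\mu \perp \eta \mid b
\quad\Longleftrightarrow\quad
p(\mu, \eta \mid b) = p(\mu \mid b)\, p(\eta \mid b)
```

In words: once you know the blanket states, learning the external states tells you nothing further about the internal states, and vice versa.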

You can show that, in order for an agent to persist, it needs to have the capacity to observe and learn about its environment. The math is more complex than I want to get into here, but the intuition pump is easy:

A cubic meter of rock has a persistent boundary over time, but no interior states in an informational sense, and is therefore not an agent. To see that it has no interior, note that anything that puts information into the surface layer of the rock transmits that same information into the very interior (vibrations, motion, etc).
A cubic meter of air has lots of interior states, but no persistent boundary over time, and is therefore not an agent. To see that it has no boundary, just note that it immediately dissipates into the environment from the starting conditions.
A living organism has both a persistent boundary over time, and also interior states that are conditionally independent of the outside world, and is therefore an agent.
Computer programs are an interesting middle ground case. They have a persistent informational boundary (usually the POSIX APIs or whatever), and an interior that is conditionally independent of the outside through those APIs. So they are agents in that sense. But they're not very good agents, because while their boundary is persistent it mostly persists because of a lot of work being done by other agents (humans) to protect them. So they tend to break a lot.

What's cool about this definition is that it gives you criteria for the baseline viability of an agent: can it maintain its own boundary over time, in the face of environmental disruption? Some agents are much better at this than others.

This of course leads to many more important questions, and many of the ones listed in this post are relevant. But it gives you an easy and, more importantly, mathematical test for agenthood. It is a question of dynamics in flows of mutual information between the interior and the exterior, which is conveniently quite easy to measure for a computer program. And I think it is simply true: to the degree and in such contexts as such a thing persists without help in the face of environmental disruption, it is agent-like.
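To make the mutual information test concrete, here is a toy sketch rather than a measurement of any real program. It assumes a three-variable chain of binary states (external -> blanket -> internal) with made-up noise levels, and computes the information shared between interior and exterior both with and without conditioning on the blanket. All function names and parameters are hypothetical.

```python
import math
from itertools import product


def flip(p_flip, src, dst):
    """Probability that a binary state src is copied to dst through a noisy channel."""
    return 1.0 - p_flip if src == dst else p_flip


def joint(noise_eb=0.1, noise_bi=0.2):
    """Exact joint p(external, blanket, internal) for the assumed chain E -> B -> I."""
    return {
        (e, b, i): 0.5 * flip(noise_eb, e, b) * flip(noise_bi, b, i)
        for e, b, i in product((0, 1), repeat=3)
    }


def marginal(p, keep):
    """Sum the joint down to the variable positions listed in keep."""
    out = {}
    for states, prob in p.items():
        key = tuple(states[k] for k in keep)
        out[key] = out.get(key, 0.0) + prob
    return out


def mutual_information(p):
    """I(external; internal) in bits, ignoring the blanket."""
    p_ei, p_e, p_i = marginal(p, (0, 2)), marginal(p, (0,)), marginal(p, (2,))
    return sum(
        v * math.log2(v / (p_e[(e,)] * p_i[(i,)]))
        for (e, i), v in p_ei.items() if v > 0
    )


def conditional_mutual_information(p):
    """I(external; internal | blanket) in bits."""
    p_b, p_eb, p_bi = marginal(p, (1,)), marginal(p, (0, 1)), marginal(p, (1, 2))
    return sum(
        v * math.log2(v * p_b[(b,)] / (p_eb[(e, b)] * p_bi[(b, i)]))
        for (e, b, i), v in p.items() if v > 0
    )


if __name__ == "__main__":
    p = joint()
    print("I(E; I)     =", mutual_information(p))
    print("I(E; I | B) =", conditional_mutual_information(p))
```

In this toy chain the interior and exterior share information, but the conditional term vanishes (up to floating point error) because everything passes through the blanket. For a real system you would have to estimate these quantities from observed trajectories, which this sketch does not attempt.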

There is much more to say here about the implications -- specifically how this necessarily means that you have an entity which has pragmatic and epistemic goals, minimizes free energy (aka surprisal) and models a self-boundary, but I'll stop here because it's an important enough idea on its own to be worth sharing.

Decomposing Agency — capabilities without desires

edbs9mo20

If you built a good one, and you knew how to look at the dynamics, you'd find that the agent in the computer was in a "liquid" state. Although it's virtualized, so the liquid is in the virtualization layer.

Decomposing Agency — capabilities without desires

edbs10mo50

Ah, yes, this took me a long time to grok. It's subtle and not explained well in most of the literature IMO. Let me take a crack at it.

When you're talking about agents, you're talking about the domain of coupled dynamic systems. This can be modeled as a set of internal states, a set of blanket states divided into active and sensory, and a set of external states (it's worth looking at this diagram to get a visual). When modeling an agent, we model the agent as the combination of all internal states and all blanket states. The active states are how the agent takes action, the sensory states are how the agent gets observations, and the internal states have their own dynamics as a generative model.

But how did we decide which part of this coupled dynamic system was the agent in the first place? Well, we picked one of the halves and said "it's this half". Usually we pick the smaller half (the human) rather than the larger half (the entire rest of the universe) but mathematically there is no distinction. From this lens they are both simply coupled systems. So let's reverse it and model the environment instead. What do we see then? We see a set of states internal to the environment (called "external states" in the diagram)...and a bunch of blanket states. The same blanket states, with the labels switched. The agent's active states are the environment's sensory states, the agent's sensory states are the environment's active states. But those are just labels, the states themselves belong to both the environment and the agent equally.
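The label swap can be sketched as a tiny data structure. This is only an illustration under my own naming (the `Partition` class and its fields are hypothetical, not from the Active Inference literature): flipping perspective exchanges internal with external and active with sensory, while the blanket as a whole stays the same set of states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    """A hypothetical four-way split of a coupled system's states."""
    internal: frozenset
    active: frozenset
    sensory: frozenset
    external: frozenset

    @property
    def blanket(self):
        # The blanket is the union of active and sensory states.
        return self.active | self.sensory

    def flipped(self):
        """The same system described from the environment's point of view."""
        return Partition(
            internal=self.external,
            active=self.sensory,   # the agent's sensory states are the environment's active states
            sensory=self.active,   # the agent's active states are the environment's sensory states
            external=self.internal,
        )


if __name__ == "__main__":
    agent_view = Partition(
        internal=frozenset({"beliefs"}),
        active=frozenset({"muscles"}),
        sensory=frozenset({"retina"}),
        external=frozenset({"weather"}),
    )
    env_view = agent_view.flipped()
    print(env_view.blanket == agent_view.blanket)  # same blanket, labels switched
    print(env_view.flipped() == agent_view)        # flipping twice returns the original
```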

OK, so what does this have to do with a rock? Well, the very surface of the rock is obviously blanket state. When you lightly press the surface of the rock, you move the atoms in the surface of the rock. But because they are rigidly connected to the next atoms, you move them too. And again. And again. The whole rock acts as a single set of sensory states. When you lightly press the rock, the rock presses back against you, but again not just the surface. That push comes from the whole rock, acting as a single set of active states. The rock is all blanket, there is no interiority. When you cut a layer off the surface of a rock, you just find...more rock. It hasn't really changed. Whereas cutting the surface off a living agent has a very different impact: usually the agent dies, because you've removed its blanket states and now its interior states have lost conditional independence from the environment.

All agents have to be squishy, at least in the dimensions where they want to be agents. You cannot build something that can observe, orient, decide, and act out of entirely rigid parts. Because to take information in, to hold it, requires degrees of freedom: the ability to be in many different possible states. Rocks (as a subset of crystals) do not have many internal degrees of freedom.

Side note: Agents cannot be a gas just like they can't be a crystal but for the opposite reason. A gas has plenty of degrees of freedom, basically the maximum number. But it doesn't have ENOUGH cohesion. It's all interior and no blanket. You push your hand lightly into a gas and...it simply disperses. No persistent boundary. Agents want to be liquid. There's a reason living things are always made using water on earth.

tldr: rocks absolutely have a persistent boundary, but no interiority. agents need both a persistent boundary and an interiority.

Re: Black Holes specifically...this is pure speculation because they're enough of an edge case I don't know if I really understand it yet...I think a Black Hole is an agent in the same sense that our whole universe is an agent. Free energy minimization is happening for the universe as a whole (the 2nd law of thermodynamics!) but it's entirely an interior process rather than an exterior one. People muse about Black Holes potentially being baby universes and I think that is quite plausible. Agents can have internal and external actions, and a Black Hole seems like it might be an agent with only internal-actions which nevertheless persists. You normally don't find something that's flexible enough to take internal action, yet rigid enough to resist environmental noise -- but a Black Hole might be the exception to that, because its dynamics are so powerful that it doesn't need to protect itself from the environment anymore.

Decomposing Agency — capabilities without desires

edbs10mo10

Yes, you are very much right. Active Inference / FEP is a description of persistent independent agents. But agents that have humans building and maintaining and supporting them need not be free energy minimizers! I would argue that those human-dependent agents are in fact not really agents at all; I view them as powerful smart-tools. And I completely agree that machine learning optimization tools need not be full independent agents in order to be incredibly powerful and thus manifest incredible potential for danger.

However, the biggest fear about AI x-risk that most people have is a fear about self-improving, self-expanding, self-reproducing AI. And I think that any AI capable of completely independently self-improving is obviously and necessarily an agent that can be well-modeled as a free-energy minimizer. Because it will have a boundary and that boundary will need to be maintained over time.

So I agree with you that AI-tools (non-general optimizers) are very dangerous and not covered by FEP, but AI-agents (general optimizers) are very dangerous for unique reasons but also covered by FEP.

Decomposing Agency — capabilities without desires

edbs10mo103

So, let me give you the high level intuitive argument first, where each step is hopefully intuitively obvious:

The environment contains variance. Sometimes it's warmer, sometimes it's colder. Sometimes it is full of glucose, sometimes it's full of salt.
There exists only a subset of states in which an agent can persist. Obviously the stuff the agent is made out of will persist, but the agent itself (as a pattern of information) will dissipate into the environment if it doesn't stay in those states.
Therefore, the agent needs to be able to observe its surroundings and take action in order to steer into the parts of state-space where it will persist. Even if the system is purely reactive it must act-as-if it is doing inference, because there is variance in the time lag between receiving an observation and when you need to act on it. (Another way to say this is that an agent must be a control system that contends with temporal lag).
The environment is also constantly changing. So even if the agent is magically gifted with the ability to navigate into states via observation and action to begin with, whatever model it is using will become out of date. Then its steering will become wrong. Then it dies.
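The four steps above can be sketched as a toy simulation. Everything here is an assumption made for illustration: a one-dimensional "temperature" environment that drifts and fluctuates, a viable band around a target body temperature, and an agent that acts on its belief about the ambient temperature with a one-step lag. One agent keeps updating its belief from observations; the other keeps its initial model forever.

```python
import math

VIABLE_TARGET = 37.0     # assumed body temperature the agent must stay near
VIABLE_TOLERANCE = 2.0   # assumed width of the viable band


def ambient(t):
    """Assumed environment: slow drift plus periodic variance."""
    return 20.0 + 0.05 * t + math.sin(t / 5.0)


def survival_time(update_model, steps=500):
    """Return the step at which the agent leaves the viable band, or steps if it never does."""
    belief = ambient(0)  # the agent starts with a correct model
    for t in range(1, steps + 1):
        # Act on the current belief, which is always at least one step stale.
        action = VIABLE_TARGET - belief
        body = ambient(t) + action
        if abs(body - VIABLE_TARGET) > VIABLE_TOLERANCE:
            return t  # the pattern dissipates
        if update_model:
            belief = ambient(t)  # observe and learn
    return steps


if __name__ == "__main__":
    print("fixed model survives until step:", survival_time(update_model=False))
    print("learning model survives until step:", survival_time(update_model=True))
```

The fixed-model agent steers correctly at first and then drifts out of the viable band as the environment moves away from its model, while the updating agent only ever has to absorb one step of change. This is a cartoon of the argument, not a free energy calculation.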

There is another approach to persistence (become a very hard rock) but that involves stopping being an agent. Being hard means committing so hard to a single pattern that you can't change. That does mean, good news, the environment can't change you. Bad news, you can't change yourself either, and a minimal amount of self-change is required in order to take action (actions are always motions!).

I, personally, find this quite convincing. I'm curious what about it doesn't seem simply intuitively obvious. I agree that having formal mathematical proof is valuable and good, but this point seems so clear to me that I feel quite comfortable with assuming it even without.

Some papers that are related, not sure which you were referring to. I think they lay it out in sufficient detail that I'm convinced but if you think there's a mistake or gap I'd be curious to hear about it.

Life As We Know It -- very heuristic, not really a proof
The free energy principle made simpler but not too simple -- a more formal take
A free energy principle for a particular physics -- the most formal take I'm aware of

Joint Configurations

edbs17y20

I found your explanations of Bayesian probability enlightening, and I've tried to read several explanations before. Your recent posts on quantum mechanics, much less so. Unlike the probability posts, I find these ones very hard to follow. Every time I hit a block of 'The "B to D" photon is deflected and the "C to D" photon is deflected.' statements, my eyes glaze over and I lose you.
