I agree this is a big blindspot. My take on the intellectual history here is that (crudely speaking) MIRI et al. have mostly pursued a 'top-down' approach to agency, starting with agents such as AIXI representing the limit of unbounded rationality and compute, and then attempted to 'downsize' them such that they can actually fit in our universe (e.g. logical inductors merely need ridiculously large amounts of compute, rather than hypercomputers). This seems like a reasonable strategy a priori; there's already a well-developed theory of idealized rationality in agents that you can start with and try to 'perturb' down to fit in the actual universe, and it's plausible that a superintelligence will bear a closer resemblance to such agents than to amoebae. The 'amoeba-first' strategy is difficult in that a naïve approach will just lead you to learn a bunch of irrelevant details about amoebae, not generalizing usefully to higher intelligences; a large part of the problem consists in figuring out what about amoebae (or whatever other system) you actually want to study, which is somewhat nebulous in contrast to the idealized-agents-first approach. Nevertheless, it does seem that the idealized agents plan has stalled out to a certain degree in recent years, and MIRI (e.g. finite factored set stuff) and other alignment researchers (e.g. johnswentworth's natural abstraction stuff) have shifted more towards the amoeba side of things. I think the 'amoeba approach' has some big advantages in that you can more readily test your ideas or get new ones by examining natural systems, plus physics seems to be the only part of the universe that really cleanly obeys mathematical laws, so a concept of agency starting from physics seems more likely to generalize to arbitrarily powerful intelligences.
The difference between the two is quantitative, not qualitative, unless you explicitly or implicitly subscribe to the "humans are the only real agents" idea,
I would like to push back gently on this point. It's entirely possible that what we mean by "agency" does not emerge until a certain threshold of complexity is passed, and that amoebas aren't at the necessary level of complexity. Otherwise, why limit yourself to amoebas? Why not talk about the agency of electrons?
Why not talk about the agency of electrons?
Indeed, why not? Where is the emergence threshold, or a zone? I would think this is where one would want to start understanding the concept of agency.
An explanation that I've seen before of "where agency begins" is when an entity executes OODA loops. I don't know if OODA loops are a completely accurate map to reality, but they've been a useful model so far. If someone were going to explore "where agency begins" OODA loops might be a good starting point.
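To make the idea concrete, here is a minimal sketch of what "executing an OODA loop" could look like for a very simple system. Everything in it is an assumption made for illustration: the one-dimensional world, the function name `run_ooda`, and the `max_step` parameter are hypothetical, not taken from the military-theory literature.

```python
from typing import List


def run_ooda(position: float, goal: float,
             max_step: float = 1.0, ticks: int = 10) -> List[float]:
    """Toy OODA loop: an entity on a line repeatedly moves toward a goal.

    Hypothetical illustration only; the four phases are labeled in comments.
    """
    trajectory = [position]
    for _ in range(ticks):
        # Observe: read the current state of the world.
        observation = position
        # Orient: interpret the observation relative to the goal.
        error = goal - observation
        # Decide: pick a move toward the goal, capped at max_step per tick.
        move = max(-max_step, min(max_step, error))
        # Act: apply the move, which changes the world for the next pass.
        position = position + move
        trajectory.append(position)
    return trajectory


if __name__ == "__main__":
    print(run_ooda(3.5, 0.0, ticks=5))
```

A loop this small arguably goes through all four phases, so it may also be a handy test case for deciding how much the OODA criterion is supposed to rule in.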
I feel like an article about "what agency is" must've already been written here, but I don't remember it. In any case, that article on agency in Conway's Life sounds like my next stop, thank you for linking it!
I didn't pick it up from any reputable sources. The white paper on military theory that created the term was written many years ago, and since then I've only seen that explanation tossed around informally in various places, not investigated with serious rigor. OODA loops seem to be seldom discussed on this site, which I find kinda weird, but a good full explanation of them can be found here: Training Regime Day 20: OODA Loop
I tried to figure out on my own whether executing an OODA loop was a necessary & sufficient condition for something to be an intelligent agent (part of an effort to determine what the smallest & simplest thing which could still be considered true AGI might be), and I found that while executing OODA loops seems necessary for something to have meaningful agency, doing so is not sufficient for something to be an intelligent agent.
Thank you for your interest, though! I wish I could just reply with a link, but I don't think the paper I would link to has been written yet.
I asked because that's a reasonable one-line approximation of my own tentative theory of agency. I'm happy to hear that other people have similar intuitions! Alas that there isn't a fleshed out paper I can go read. Do you have any... nonreputable sources to link me to, that I might benefit from reading?
There should be "more focus" on a lot of stuff including amoebas (this claim is almost contradictory if taken literally; I'm saying "focus" isn't as subject to tradeoffs as it might seem like). But, I think you're missing that bigger / higher / complexer / abstract / refineder things can be in some key ways simpler or easier to understand in their essences. Compare: "You can't understand digital addition without understanding Mesopotamian clay token accounting". There's a lot of interesting stuff to be learned by studying the evolution of a sublunary instance of an abstract thing, but that doesn't mean you can't understand the abstract thing, possibly faster, by some other method. For example, you can try to read amoebas as participating in abstract logical structures, and that can be fruitful, but the humans are convenient in that sometimes they actually literally write out formal expressions of the logical structures.
Compare: "You can't understand digital addition without understanding Mesopotamian clay token accounting".
Well, if we didn't understand digital addition and were only observing some strange electrical patterns on a mysterious blinking board, going back to the clay token accounting might not have been a bad idea. And we do not understand agency, so why not go back to basics?
I'm not arguing against studying amoebas, I'm arguing for also studying higher-level things including agency without first studying amoebas. Amoebas are simpler, which makes them easier to study, but they are also less agenty, and in some ways *less* simple *as agents*. It would be easier to understand an abstractly written program to perform addition, than to understand register readouts from a highly optimized program, even if the former never appears "in the wild" because it's too computationally expensive.
going back to the clay token accounting might not have been a bad idea
I agree, as I said. But it would be a mistake to not also think at the abstract level; you can learn/invent digital addition just by trying to count stuff.
It's a good point that there are trade-offs, and highly optimized programs, even if they perform a simple function, are hard to understand without "being inside" one. That's one reason I linked a post about an even simpler and well understood potentially "agentic" system, the Game of Life, though it focuses on a different angle, not "let's see what it takes to design a simple agent in this game".
"You can't understand digital addition without understanding Mesopotamian clay token accounting"
That's sort of exactly correct? If you fully understand digital addition, then there's going to be something at the core of clay token accounting that you already understand. Complex systems tend to be built on the same concepts as simpler systems that do the same thing. If you fully understand an elevator, then there's no way that ropes & pulleys can still be a mystery to you, right? And to my knowledge, studying ropes & pulleys is a step in how we got to elevators, so it would make sense to me that going "back to basics", i.e. simpler real models, could help us make something we're still trying to build.
Even if I disagree with you, thank you for posing the example!
What do you disagree about? I agree that understanding addition implies that you understand something important about token accounting. I think there's something about addition that is maybe best learned by studying token accounting or similar (understanding how minds come to practice addition). I also think much of the essence of [addition as addition itself] is best and most easily understood in a more normal way--practicing counting and computing things in everyday life--and *not* by studying anything specifically about Mesopotamian clay token accounting, because relative to much of the essence of addition, historical accounting systems are baroque with irrelevant detail, and are a precursor or proto form of practicing addition, hence don't manifest the essence of addition in a refined and clear way.
I like your elevator example. I think it's an open question whether / how amoebas are manifesting the same principles as (human, say) agency / mind / intelligence, i.e. to what extent amoebas are simpler models of the same thing (agent etc.) vs. models of something else (such as deficient agency, a subset of agency, etc.). I mean, my point isn't that there's some "amount" that amoebas "are agents" or whatever, that's not exactly well-defined or interesting, my point is that the reasons we're interested in agency make human agency much more interesting than amoeba agency, and this is not primarily a mistake; amoebas pretty much just don't do fictive learning, logical inference, etc., even though if you try you can read into amoebas a sort of deficient/restricted form of these things.
is best and most easily understood in a more normal way--practicing counting and computing things in everyday life
Good advice for learning in general.
What do you disagree about?
I don't know. Possibly something, probably nothing.
the essence of [addition as addition itself]...
The "essence of cognition" isn't really available for us to study directly (so far as I know), except as a part of more complex processes. Finding many varied examples may help determine what is the "essence" versus what is just extraneous detail.
While intelligent agency in humans is definitely more interesting than in amoebas, knowing exactly why amoebas aren't intelligent agents would tell you one detail about why humans are, and may thus tell you a trait that a hypothetical AGI would need to have.
I'm glad you liked my elevator example!
knowing exactly why amoebas aren't intelligent agents would tell you one detail about why humans are
Exactly, yeah; I think in the particular case of amoebas the benefit looks more like this, and it doesn't so much look like amoebas positively exemplifying much that's key about the kind of agency we're interested in re/ alignment. Which is why I disagree with the OP.
This is supposed to be a trivially true statement, and yet it sounds controversial somehow, doesn't it?
It seems far from trivially true to me. Compare: "you can't understand human intelligence without understanding amoeba intelligence". Yet the people who know the most about human intelligence may never have studied amoebas; nor does studying amoebas seem likely to be on the shortest path to AGI.
Do you think
the people who know the most about human intelligence
understand how neurons work? Or might they be operating at a different level (of abstraction) like behavior?
"you can't understand human intelligence without understanding amoeba intelligence"
That does sound less trivially true, I agree. I am not sure what the difference is exactly...
nor does studying amoebas seem likely to be on the shortest path to AGI.
I don't see how this follows. Not studying amoebas, per se, but the basic blocks of intelligence starting somewhere around the level of an amoeba, whatever they might turn out to be.
I've wondered this a lot too. There is a lot of focus on and discussion about "superintelligent" AGI here, or even human-level AGI, but I wonder what about "stupid" AGI? When superintelligent AGI is still out of reach, is there not something still to be learned from a hypothetical AGI with the intelligence level of, say, a crow?
Right, something like that. A crow is smart though. That's why I picked an example of a single-cell organism.
I think Friston's free energy principle has a lot to offer here in terms of generalizing agency to include everything from amoebas to humans (although, ironically, maybe not AIXI).
Basically, rational agents, and living systems more generally, are built to regulate themselves in resistance to entropy by minimizing the free energy (essentially, "conflict") between their a priori models of themselves as living or rational systems and what they actually experience through their senses. To do this well, they need to have some sort of internal model of their environment (https://en.m.wikipedia.org/wiki/Good_regulator) that they use for responding to changes in what they sense in order to increase the likelihood of their survival.
For human minds, this internal model is encoded in the neural circuitry of our brains. For amoebas, this internal model could be encoded in the states of the promoters and inhibitors in its gene regulatory network.
I am having trouble understanding the "free energy principle" as being anything more than a control system that tries to minimize prediction error. If that's all it is, there is nothing special about living systems; engineers have been building control systems for a long time. By that definition a Boston Dynamics walking robot is definitely a living system...
That's not unreasonable as a quick summary of the principle.
I would say there is more to what makes a living system alive than just following the free energy principle per se. For instance, the robot would also need to scavenge for material and energy resources to incorporate into itself for maintenance, repair, and/or reproduction. Just correcting its gait when thrown off balance allows it to minimize a sort of behavioral free energy, but that's not enough to count as alive.
But if you want to put amoebas and humans in the same qualitative category of "agency", then you need a framework that is general enough to capture the commonalities of interest. And yes, under such a broad umbrella, artificial control systems and dynamically balancing walking robots would be included.
The free energy principle applies to a lot of systems, not just living or agentic ones. I see it more as a way to systematize our approach to understanding a system or process rather than an explanation in and of itself. By focusing on how a system maintains set points (e.g., homeostasis) and minimizes prediction error (e.g., unsupervised learning), I think we would be better positioned to figure out what real agents are actually doing in a way that could inform both the design and alignment of AGI.
I've never been able to make sense of the "Good Regulator" theorem, and in the original paper (linked in that Wikipedia article) I cannot map their terminology onto any control system I can think of. Can you explain it? Because it seems obvious to me that a room thermostat contains no model of anything, and I can find no way of mapping the components of that system to Conant and Ashby's terminology. Their own example of a hunter shooting pheasants is just as opaque to me.
The model may be implicit, but it's embedded in the structure of the whole thermostat system, from the thermometer that measures temperature to the heating and cooling systems that it controls. For instance, it "knows" that turning on the heat is the appropriate thing to do when the temperature it reads falls below its set point. There is an implication there that the heater causes temperature to rise, or that the AC causes it to fall, even though it's obviously not running simulations (unless it's a really good thermostat) on how the heating/cooling systems affect the dynamics of temperature fluctuations in the building.
The engineers did all the modeling beforehand, then built the thermostat to activate the heating and cooling systems in response to temperature fluctuations according to the rules that they precomputed. Evolution did just this in building the structure of the amoeba's gene networks and the suite of human instincts (heritable variation + natural selection is how information is transferred from the environment into a species' genome). Lived experience pushes further information from the environment to the internal model, upregulating or downregulating various genes in response to stimuli or learning to reinforce certain behaviors in certain contexts. But environmental information was already there in the structure to begin with, just like it is in more traditional artificial control systems.
The example with the hunter and pheasants was just to show how "regulating" (i.e., consistently achieving a state in the desirable set = pheasant successfully shot) requires the hunter to have a good mental model of the system (pheasant behavior, wind disturbances, etc.). Again, this model does not have to be explicit in general but could be completely innate.
I can't match any of that up to Conant and Ashby's paper, though.
As you say, the engineers designing a thermostat have a model of the system. But the thermostat does not. It simply compares the temperature with that set on the dial and turns a switch on and off. There is no trace of any model, prediction, expectation, knowledge of what its actions do, and so on. The engineers do have these things, the proof of which is that you can elicit their knowledge. The thermostat does not, the proof of which is that nowhere in the thermostat can any of these things be found.
The hunter is an obscure example, because no-one knows how humans accomplish such things, and instead we mostly make up stories based on what the process feels like from within. This method has a poor track record. More illuminating would be to look at a similar but man-made system: an automatic anti-aircraft gun shooting at a plane. Whether the control systems inside this device contain models is an empirical question, to be answered by looking at how it works. Maybe it does, and maybe it doesn't. There is such a thing as model-based control, and there is such a thing as PID controllers (which do not contain models).
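For readers who have not looked inside such devices, here is a rough sketch of the two kinds of controller mentioned above: a compare-and-switch thermostat and a textbook PID controller. The names `thermostat`, `PID`, and the gain parameters are illustrative assumptions, not code from any real product.

```python
from typing import Optional


def thermostat(temperature: float, set_point: float) -> bool:
    """Compare-and-switch control: heater is on iff the reading is below the dial."""
    return temperature < set_point


class PID:
    """Textbook PID controller (illustrative sketch, gains chosen by the caller)."""

    def __init__(self, kp: float, ki: float, kd: float) -> None:
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0                        # running sum of error * dt
        self.prev_error: Optional[float] = None    # last error, for the derivative

    def update(self, set_point: float, measurement: float, dt: float) -> float:
        error = set_point - measurement
        self.integral += error * dt
        # No derivative term on the first call, since there is no previous error.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


if __name__ == "__main__":
    print(thermostat(18.0, 20.0))        # reading below the dial
    pid = PID(kp=1.0, ki=0.5, kd=0.0)
    print(pid.update(10.0, 8.0, dt=1.0))
```

Neither piece of code stores any description of the room, the heater, or what its output will do; each only reacts to the current error and, for PID, its history. Whether the hand-chosen gains should nonetheless be read as an implicit model is the very point in dispute here, and the sketch does not settle it.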
Seems right to me. For example, I think by most natural notions of "agency" that don't introduce anything crazy, we should probably think of thermostats as agents since they go about making the world different based on inputs. But such deflationary notions of agency seem deeply uncomfortable to a lot of people because they violate the very human-centric notion that lots of simple things don't have "real" agency because we understand their mechanism, whereas things with agency seem to be complex in a way that we can't easily understand how they work.
But such deflationary notions of agency seem deeply uncomfortable to a lot of people because they violate the very human-centric notion that lots of simple things don't have "real" agency because we understand their mechanism, whereas things with agency seem to be complex in a way that we can't easily understand how they work.
Yeah, that seems like a big part of it. I remember posting to that effect some years ago https://www.lesswrong.com/posts/NptifNqFw4wT4MuY8/agency-is-bugs-and-uncertainty
But given that we want to understand "real" agency, not some "mysterious agency" stemming from not understanding inner workings of some glorified thermostat, would it not make sense to start with something simple?
Maybe it's better to start with something we do understand, then, to make the contrast clear. Can we study the "real" agency of a thermometer, and if we can, what would that research program look like?
My sense is that you can study the real agency of a thermometer, but that it's not helpful for understanding amoebas. That is, there isn't much to study in "abstract" agency, independent of the substrate it's implemented on. For the same reason I wouldn't study amoebas to understand humans; they're constructed too differently.
But it's possible that I don't understand what you're trying to do.
That is, there isn't much to study in "abstract" agency, independent of the substrate it's implemented on
Yeah, that's the question, is agency substrate-independent or not, and if it is, does it help to pick a specific substrate, or would one make more progress by doing it more abstractly, or maybe both?
This is supposed to be a trivially true statement, and yet it sounds controversial somehow, doesn't it?
The difference between the two is quantitative, not qualitative, unless you explicitly or implicitly subscribe to the "humans are the only real agents" idea, which is not even one step removed from "unlike other animals, humans possess souls". And yet when I browse through research on (embedded) agency, most of it is about humans, not anything less complicated. Sure, studying humans has an apparent advantage over studying other animals because we, as humans, have an inside view not readily accessible in other species. Whether it is a real advantage depends on how misleading the inside view is. After all, it comes with a dangling node of free will, with the feeling of qualia and ineffable redness of red, and with a host of other evolutionary artifacts that once may have been useful for survival in a small tribe, but now are just along for the ride. And in any case, evolution satisfices for survival; it does not optimize the inside view for accuracy.
I understand why philosophy focuses almost exclusively on humans, but I don't understand why, say, AI research does not focus on simple creatures first, and maybe even on non-living agents before that. Well, that's not entirely true, posts like this one do appear:
https://www.lesswrong.com/posts/3SG4WbNPoP8fsuZgs/agency-in-conway-s-game-of-life
but they seem to be a small minority, as far as I can tell. What am I missing here?