The natural abstraction hypothesis claims (in part) that a wide variety of agents will learn to use the same abstractions and concepts to reason about the world.
What's the simplest possible setting where we can state something like this formally? Under what conditions is it true?
One necessary condition for saying that agents use 'the same abstractions' is that their decisions depend on the same coarse-grained information about the environment. Motivated by this, we consider a setting where an agent needs to decide which information to pass through an information bottleneck, and ask: when do different utility functions prefer to preserve the same information?
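To make the question concrete, here is a minimal toy sketch (not from the post; the states, partitions, and payoffs are all illustrative assumptions). The bottleneck is modeled as a partition of the world states: the agent only learns which cell the state falls in, then picks the action with the best expected utility on that cell. We can then ask whether two different utility functions rank the partitions the same way.

```python
# Toy world: 4 equally likely states; the agent picks one of two actions,
# but sees the state only through a coarse-graining (a partition of states).
states = [0, 1, 2, 3]
actions = ["a", "b"]

# Two illustrative utility functions u(state, action). Both happen to care
# only about whether the state is "low" (0, 1) or "high" (2, 3), though
# they assign different payoffs.
def u1(s, a):
    return 1.0 if (a == "a") == (s < 2) else 0.0

def u2(s, a):
    return 2.0 if (a == "b") == (s >= 2) else 0.5

def value(partition, u):
    """Best expected utility when the agent sees only its cell of the partition."""
    total = 0.0
    for cell in partition:
        # Choose the action that is best on average within this cell.
        total += max(sum(u(s, a) for s in cell) for a in actions)
    return total / len(states)

partitions = {
    "low/high": [[0, 1], [2, 3]],
    "even/odd": [[0, 2], [1, 3]],
    "no info":  [[0, 1, 2, 3]],
}

for name, p in partitions.items():
    print(f"{name}: u1 -> {value(p, u1)}, u2 -> {value(p, u2)}")
```

In this toy case both utilities strictly prefer the low/high partition over the others, so they 'agree on the abstraction' despite having different payoffs; the interesting question is under what conditions on the environment and utility class this kind of agreement holds in general.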
As we develop this setting, we:
- unify and extend the Blackwell Informativeness
...
A while back I was looking for toy examples of environments with different amounts of 'naturalness' to their abstractions, and along the way noticed a connection between this version of Gooder Regulator and the Blackwell order.
Inspired by that, I expanded a fair bit on this perspective of preferences-over-models / abstraction here.
It includes, among other things:
Personally I think these results are pretty neat; I hope they might be of interest.
I also make one...