I stand corrected. Suppose you want to predict the behavior of an agent. To make the prediction, as a predictor you need:
- observations of the agent
- the capacity to model the agent to a sufficient degree of accuracy
"Sufficient accuracy" here is a threshold on, for example, KL divergence or perhaps some measure that depends on utilities of predictions in the more complex case.
When we talk about the intelligence of a system, or the relative intelligence of agents, one way to think about it is in terms of the ability of one agent to predict another.
Consider a game where an agent, A, acts on the basis of an arbitrarily chosen polynomial function of degree k. A predictor, P, can observe A and build predictive models of it. Predictor P has the capacity to represent predictive models that are polynomial functions of degree j.
If j > k, then predictor P will in principle be able to predict A with perfect accuracy. If j < k, then there will in general be cases where P predicts inaccurately. If we say (just for the sake of argument) that perfect predictive accuracy is the test for sufficient capacity, we could say that in the j < k case P does not have sufficient capacity to represent A.
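A small sketch of this prediction game, assuming A's coefficients are drawn at random and P fits its degree-j model to its observations by least squares (numpy's polyfit); the observation count, the random seed, and the scoring by maximum absolute error are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def play_prediction_game(k, j, n_obs=50):
    """Agent A acts on a randomly chosen degree-k polynomial; predictor P
    fits the best degree-j polynomial to its observations of A and is
    scored on held-out points. Returns P's maximum absolute error."""
    agent_coeffs = rng.normal(size=k + 1)      # A's arbitrarily chosen polynomial
    x_train = rng.uniform(-1, 1, n_obs)        # inputs on which P observes A
    x_test = rng.uniform(-1, 1, n_obs)         # inputs on which P is scored
    observed = np.polyval(agent_coeffs, x_train)
    p_model = np.polyfit(x_train, observed, deg=j)  # P's best degree-j model
    errors = np.polyval(p_model, x_test) - np.polyval(agent_coeffs, x_test)
    return float(np.max(np.abs(errors)))

print(play_prediction_game(k=3, j=5))  # j > k: error ~ 0, up to float noise
print(play_prediction_game(k=5, j=3))  # j < k: irreducible error remains
```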
When we talk about the relative intelligence between agents in an adversarial context, this is one way to think about the problem. One way that an agent can have a decisive strategic advantage over another is if it has the capacity to predict the other agent and not vice-versa.
The expressive power of the model space available to P is only one of the ways in which P might have or not have capacity to predict A. If we imagine the prediction game extended in time, then the computational speed of P--what functions it can compute within what span of real time--relative to the computational speed of A could be a factor.
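As a rough illustration of the time-extended version of the game, here is a sketch in which a prediction only counts if P produces it within a wall-clock deadline; the deadline, the toy models, and the simulated delay are all hypothetical stand-ins for "A has already acted by the time P finishes computing."

```python
import time

def timed_prediction(predict_fn, observation, deadline_s):
    """Return P's prediction only if it arrives within `deadline_s`
    seconds of wall-clock time; otherwise it is useless (None)."""
    start = time.perf_counter()
    prediction = predict_fn(observation)
    elapsed = time.perf_counter() - start
    return prediction if elapsed <= deadline_s else None

# A fast but crude model can beat a slower, more expressive one under a tight deadline.
fast_model = lambda x: 2 * x                          # cheap linear guess
slow_model = lambda x: (time.sleep(0.1), x ** 3)[1]   # more expressive, but slow (simulated delay)

print(timed_prediction(fast_model, 3.0, deadline_s=0.01))  # 6.0
print(timed_prediction(slow_model, 3.0, deadline_s=0.01))  # None: too slow to count
```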
Note that these are ways of thinking about the relative intelligence between agents that do not have anything explicitly to do with "optimization power" or a utility function over outcomes. It is merely about the capacity of agents to represent each other.
One nice thing about representing intelligence in this way is that it does not require an agent's utility function to be stable. In fact, it would be strange for an agent that became more intelligent to have a stable utility function, because the range of possible utility functions available to a more intelligent agent is greater. We would expect that an agent that grows in its understanding would change its utility function--if only because doing so would make it less predictable to adversarial agents that would exploit its simplicity.
True, and even much wider than the educational system, but I would probably rephrase to say that this makes human intelligence predictable within narrow domains. A math class strives to make students predictable when attempting to solve math problems, a legal system hopes to make humans predictable in the domain of violent conflict resolution, a religion hopes to make humans predictable in metaphysical inquiry, etc.
But human intelligence itself is fully general (in that we defined 'fully general' to mean 'like me'), so there's not really any form of training or education that can make, or attempts to make, human intelligence predictable across all domains.