Imagine aliens on a distant world. They have values very different from ours. However, they also have complicated values, and they don't exactly know their own values.
Imagine these aliens are doing well at AI alignment. They are just about to boot up a friendly (to them) superintelligence.
Now imagine we get to see all their source code and research notes. How helpful would this be for humans solving alignment?
Agreed it is natural.
To describe 'limited optimization' in my words: The teacher implements an abstract function whose optimization target is not {the outcome of a system containing a copy of this function}, but {criteria about the isolated function's own output}. The input to this function is not {the teacher's entire world model}, but some simple data structure whose units map to schedule-related abstractions. The output of this function, when interpreted by the teacher, then maps back to something like a possible schedule ordering. (Of course, this is an idealized case; I don't claim that actual human brains are so neat.)
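A minimal sketch of what I mean, in Python. The task representation and the lateness criterion are just illustrative assumptions, not claims about what a teacher actually computes:

```python
from itertools import permutations

def order_tasks(tasks):
    """Return the ordering of (name, duration, deadline) tuples that minimizes
    total lateness, a criterion about this function's own output."""
    def total_lateness(ordering):
        elapsed, lateness = 0, 0
        for _name, duration, deadline in ordering:
            elapsed += duration
            lateness += max(0, elapsed - deadline)
        return lateness
    return min(permutations(tasks), key=total_lateness)

# The input is a simple schedule-like data structure, not a world model;
# the teacher interprets the output back into a possible schedule ordering.
tasks = [("grading", 2, 3), ("lecture prep", 1, 2), ("office hours", 1, 5)]
print(order_tasks(tasks))
```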
The optimization target of an agent, though, is {the outcome of a system containing a copy of this function} (here, 'this function' refers to the agent). If agentic functions were themselves implemented entirely by further agentic functions, the result would be infinite recursion; so all agents in sufficiently complex worlds must, at some point in the course of answering their broader agent-question[1], ask 'domain limited' sub-questions.
(note that 'domain limited' and 'agentic' are not fundamental-types; the fundamental thing would be something like "some (more complex) problems have sub-problems which can/must be isolated")
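To make the regress concrete (an illustration, not a formalism; every name here is hypothetical): an 'agentic' evaluation that scores {the outcome of a system containing a copy of this function} by simulating that copy would call itself without bottoming out, which is why the broader question has to hand off to isolated sub-questions somewhere.

```python
def agentic_choice(world_model, actions):
    # Scoring an action means simulating a system that contains a copy of
    # agentic_choice itself; taken literally, the simulation calls this
    # function again, and again, without bottoming out.
    def simulated_outcome(action):
        return world_model.step(action, policy=agentic_choice)  # recurses
    return max(actions, key=simulated_outcome)

def ordering_subquestion(tasks):
    # A domain-limited sub-question: its target is a criterion about its own
    # output (an ordering of the given tuples), not about any system that
    # contains the asker.
    return sorted(tasks, key=lambda task: task[2])  # order by deadline field
```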
I think humans have deep assumptions conducive to their 'embedded agency' which can make it harder to see this the first time. It may be automatic to view 'the world' as a referent which a 'goal function' can somehow be about naturally. I once noticed I had a related confusion, and asked "wait, how can a mathematical function 'refer to the world' at all?". The answer is that there is no mathematically default 'world' object to refer to; you have to construct a structural copy to refer to instead (which, being a copy, contains a copy of the agent, so the actions of the real agent and its copy logically correspond). That is a specific, non-default thing which nearly all functions do not do.
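A concrete toy version of the 'structural copy' point (the dict world model and the scoring are stand-ins I made up, nothing more):

```python
def agent_policy(model):
    """Pick the action whose modeled outcome scores highest."""
    return max(model["actions"], key=model["outcome_score"])

world_model = {
    "actions": ["left", "right"],
    "outcome_score": lambda a: {"left": 0.2, "right": 0.9}[a],
    # The model contains a copy of the agent itself; there is no primitive
    # 'world' object for agent_policy to refer to, only this structure.
    "embedded_agent": agent_policy,
}

# The real agent's choice and the embedded copy's choice coincide because
# they are the same function applied to the same structure.
assert agent_policy(world_model) == world_model["embedded_agent"](world_model)
print(agent_policy(world_model))  # -> 'right'
```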
(This totally doesn't answer your clarified question, I'm just writing a related thing to something you wrote in hopes of learning)
[1] From possible outputs, which meets some criteria about {the outcome of a system containing a copy of this function}?