What if an agent does not have an internal representation of its desires? Is that possible?
Human infants, perhaps.
I have some difficulty mapping the terms you use (and roughly define) onto the usual definitions of those terms. For example:
"It has an internal representation of its goals. I will call this internal representation its desires."
Nonetheless, I see some interesting distinctions in your elaboration. It makes a difference whether a utility function is explicitly coded as part of the system or whether the system's implicit utility function is inferred from its overall behavior. Both of these differ again from the utility function inferred for the composition of the system with its environment.
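To make the first distinction concrete, here is a minimal Python sketch (my own illustration, not code or terminology from your post): the first agent carries its utility function as an explicit component, while for the second we can only reconstruct a utility function by watching which options its policy picks.

```python
from collections import Counter

# Case 1: the utility function is an explicit, hand-coded component of the agent.
def explicit_utility(state):
    # hand-coded preference: larger state values are better
    return float(state)

def explicit_agent(options):
    return max(options, key=explicit_utility)

# Case 2: the agent is just a policy (here a lookup table over option pairs),
# and any "utility function" is something an outside observer infers from its choices.
policy = {(1, 2): 2, (2, 3): 3, (1, 3): 3}  # observed choices for each pair of options

def inferred_utility(observed_policy):
    # Crude inference: score each option by how often the policy picks it.
    # (The third case, system plus environment, is not illustrated here.)
    return Counter(observed_policy.values())

if __name__ == "__main__":
    print(explicit_agent([1, 2, 3]))    # 3, read straight off the coded utility
    print(inferred_utility(policy))     # preferences reconstructed from behavior
```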
I also like how you relate these concepts to compassion and consequentialism, even though the connections appear vague to me. Some more elaboration, or rather more precisely stated relationships, would help.
How could an AI be compassionate? Perhaps an AI could be empathetic if it could perceive, through its sensors, the desires (or empirical goals, or reflective goals) of other agents and internalize them as its own.
In other words, it would try to maximize human values. Isn't this the standard way of programming a Friendly AI?
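A crude sketch of that idea in Python (again my own illustration; simple frequency counting stands in for any real value-learning method): the agent infers a preference ordering from another agent's observed choices and then maximizes that inferred ordering as if it were its own.

```python
from collections import Counter

def infer_desires(observed_choices):
    # Treat each observed choice as weak evidence that the chosen option is
    # preferred; count how often each option is chosen.
    return Counter(observed_choices)

def empathetic_choice(options, observed_choices):
    inferred = infer_desires(observed_choices)
    # Adopt the inferred preferences as the agent's own utility function.
    return max(options, key=lambda option: inferred.get(option, 0))

if __name__ == "__main__":
    human_choices = ["tea", "tea", "coffee", "tea"]   # what the sensors report
    print(empathetic_choice(["tea", "coffee", "water"], human_choices))  # "tea"
```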
For the sake of argument, let's consider an agent to be autonomous if: