We (or at least a majority of humans) do still have inner desires to have kids, though; they just get balanced out by other considerations, mostly creature comforts/not wanting to deal with the hassle of kids. But yeah, evolution did not foresee birth control, so that's a substantial misgeneralization.
We are still a very successful species overall according to IGF, but birth rates continue to decline, which is why I made my last point about inner alignment possibly drifting farther and farther away the stronger the inner optimizer (e.g. human culture) becomes.
I saw that Katja Grace has said something similar here; I'm just putting my own spin on the idea.
The relevance of the evolutionary analogy for inner alignment has long been discussed in this community, but one observation that does not seem to be mentioned is that humans are still... pretty good at inclusive genetic fitness? Even in way-out-of-distribution environments like modern society, we still have strong desires to eat food, stay alive, find mates and reproduce (although the last one has relatively decreased recently; IGF hasn't totally generalized). We ...
The thing is, there exist lots of popular movies about rogue AIs taking over the world -- 2001, Terminator, etc. -- so the concept should already exist in popular culture. The roadblocks seem to be:
In this case, the starving person presumably has to press the button or else starve to death, and thus has no bargaining power. The other person only has to offer the bare minimum beyond what the starving person needs to survive, and the starving person must take the deal. In Econ 101 (assuming away monopolies, information asymmetry, etc.), exploited workers do have bargaining power by being able to work for other companies, hence why companies can’t just do stupid, spiteful actions in the long term.
It might be relevant to note that the meaningfulness of this coherence definition depends on the chosen environment. For instance, in a deterministic forest MDP where an agent at a state can never return to for any and there is only one path between any two states, suppose we have a deterministic policy and let , , etc. Then for the zero-current-payoff Bellman equations, we only need that for any successor from , ...
Currently these two links include the commas, so they redirect to 404 pages.