Is this from the school of "If you don't make it hungry, it will starve to death even if it has goals and knows it needs to eat to live and live to accomplish the goals"?
Is there a name for that fallacy?
Is this from the school of "If you don't make it hungry, it will starve to death even if it has goals and knows it needs to eat to live and live to accomplish the goals"?
No, it's from the school of "If it starves to death while otherwise trying to accomplish its goals, it will count this as a success, just as much as if it actually had accomplished its goals. So it won't bother to eat".
I just noticed that LessWrong has not yet linked to FHI researcher Stuart Armstrong's brief technical report, Utility Indifference (2010). It opens: