What mechanism would a paperclipper have for developing into something other than a paperclipper? If it has the terminal goal of increasing the number of paperclips, then it will never self-modify into anything that would result in it creating fewer paperclips, even if, under its new utility function, it wouldn't care about that.
Or: if A -> B -> C, and the paperclipper does not want C, then the paperclipper will not go to B.
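To make the A -> B -> C point concrete, here is a minimal, purely illustrative sketch (all names and numbers are invented) of an agent scoring candidate self-modifications with its *current* utility function rather than with the utility function it would have afterwards:

```python
# Candidate successor designs: each predicts how many paperclips would exist
# if the agent adopted that design. Hypothetical names and numbers.
candidate_designs = {
    "keep_current_goal":         {"expected_paperclips": 1_000_000},
    "drift_to_staple_maximizer": {"expected_paperclips": 10},
    "balanced_paperclip_staple": {"expected_paperclips": 400_000},
}

def current_utility(outcome):
    """The agent's present terminal goal: more paperclips is strictly better."""
    return outcome["expected_paperclips"]

# The successor is chosen by the current utility function, not by whatever the
# successor would later care about, so any design that predictably yields fewer
# paperclips (the B that leads to C) is rejected now.
best = max(candidate_designs, key=lambda name: current_utility(candidate_designs[name]))
print(best)  # -> "keep_current_goal"
```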
I'm imagining that the paperclipper will become a massively distributed system, with subunits pursuing subgoals. Groups of subunits will be granted partial agency because of long-distance communication constraints, and over eons value drift will occur due to mutation. ETA: the paperclipper will be counteracting value drift, but it will also be pursuing the fastest possible creation of paperclips and avoiding extinction, which can trade off against suppressing value drift.
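A toy simulation of that trade-off, purely illustrative with arbitrary parameters: effort spent on error-correction among subunits reduces accumulated goal drift but is effort not spent producing paperclips.

```python
import random

def simulate(error_correction, generations=100, seed=0):
    """error_correction in [0, 1]: fraction of effort spent counteracting drift."""
    rng = random.Random(seed)
    drift = 0.0       # accumulated divergence of subunit goals from "paperclips"
    paperclips = 0.0
    for _ in range(generations):
        # Mutation adds drift; error-correction removes most of it.
        drift += rng.uniform(0.0, 0.01) * (1.0 - error_correction)
        # Remaining effort goes to production, discounted by how far goals have drifted.
        paperclips += (1.0 - error_correction) * max(0.0, 1.0 - drift)
    return paperclips, drift

for ec in (0.0, 0.2, 0.5):
    clips, drift = simulate(ec)
    print(f"error_correction={ec}: paperclips={clips:.1f}, final drift={drift:.3f}")
```

The numbers mean nothing in themselves; the point is only that the same knob moves both quantities, so "maximize paperclips fastest" and "keep the goal intact" are not automatically the same policy.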
Thought experiment:
Through whatever accident of history underlies these philosophical dilemmas, you are faced with a choice between two, and only two, mutually exclusive options:
* Choose A, and all life and sapience in the solar system (and presumably the universe), save for a sapient paperclipping AI, dies.
* Choose B, and all life and sapience in the solar system, including the paperclipping AI, dies.
Phrased another way: does the existence of any intelligence at all, even a paperclipper, have even the smallest amount of utility above no intelligence at all?
If anyone responds positively, the subsequent questions would be which is preferred: a paperclipper or a single bacterium; a paperclipper or a self-sustaining population of trilobites and their supporting ecology; a paperclipper or a self-sustaining population of australopithecines; and so forth, until the point of equivalent value is determined.