This story was originally posted as a response to this thread.
It might help to imagine a hard takeoff scenario using only known sorts of NN & scaling effects...
In A.D. 20XX. Work was beginning. "How are you gentlemen !!"... (Work. Work never changes; work is always hell.)
Specifically, a MoogleBook researcher has gotten a pull request from Reviewer #2 on his new paper in evolutionary search in auto-ML, for error bars on the auto-ML hyperparameter sensitivity like larger batch sizes, because more can be different and there's high variance in the old runs with a few anomalously high performance values. ("Really? Really? That's what you're worried about?") He can't see why worry, and wonders what sins he committed to deserve this asshole Chinese (given the Engrish) reviewer, as he wearily kicks off yet another HQU experiment...
Suppose you've got an AI with a big old complicated world model, which outputs a compressed state to the reward function. There are two compressed states. The reward function is +1 if you're in state one each turn, and -1 if you aren't. I guess you could try to perform a Pascal's mugging by suggesting that if you help humanity, they're willing to put the world in state one forever as a quid pro quo. But that doesn't seem like a high-probability offer, and the potential reward is still bounded via discounting, so I don't think that would work.
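To make the "bounded via discounting" step concrete (a minimal sketch, assuming a standard discount factor $\gamma < 1$, which the setup above implies but doesn't state): even if the mugger promises state one forever, the best possible discounted return is

$$\sum_{t=0}^{\infty} \gamma^t \cdot (+1) = \frac{1}{1-\gamma},$$

a finite ceiling. A low-probability offer therefore can't be made arbitrarily attractive the way it could with an unbounded utility, which is why the mugging fails here.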