Once a maximal state has been reached, the agent has an incentive to further improve it if and only if that makes the maintenance of the state more likely.
This is true, but much depends on what counts as a 'maximal state'. If our 1-bit utility superintelligence predicts future paths all the way to the possible end states of the universe, then it isn't necessarily susceptible to getting stuck in maintenance states along the way. It all depends on which subset of future paths we classify as 'good'.
Also keep in mind that the 1-bit utility model still rates entire future paths, not just final states. So let's say, for example, that we are really picky and we only want Tipler Omega Point end-states. If that is all we specify, then the superintelligence may take us down a path that involves killing off most of humanity. However, we can avoid that by adding further constraints on the entire path: assigning 1 only to future paths that end in the Omega Point and also satisfy some arbitrary list of constraints along the way. Again, this is probably not the best type of utility model, but the weakness of 1-bit bounded utility is not that it gets stuck in maintenance mode regardless of which utility model we specify.
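As a rough sketch of what such a path-based 1-bit utility could look like (the predicate and constraint names here are illustrative, not part of any actual proposal), the utility is defined over whole histories and the agent simply maximizes the probability of landing in the approved set:

\[
U(\text{path}) \;=\;
\begin{cases}
1 & \text{if the path ends in an Omega Point state and satisfies constraints } C_1,\dots,C_n \text{ along the way} \\
0 & \text{otherwise}
\end{cases}
\]

\[
\pi^{*} \;=\; \arg\max_{\pi} \Pr\!\big(U(\text{path}) = 1 \mid \pi\big)
\]

Nothing about the final state is rated separately; the constraints along the way carry exactly as much weight as the end-state condition.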
The real failure of 1-bit utility lies in the specificity vs feasibility tradeoff. If we make the utility model very narrow and the paths we want turn out to be unattainable, then the superintelligence will gleefully gamble everything and risk losing the future. For example, an SI that only seeks specific Omega Point futures may eat the moon, think for a bit, and determine that even under its best action sequences it has only a 10^-99 chance of winning (according to its narrow OP criteria). In that case it won't 'fall back' to some other more realistic but still quite awesome outcome; it will still proceed to transform the universe in an attempt to achieve the OP, no matter how nearly impossible. Unless, of course, there is some feedback mechanism with humans and ongoing utility-model updating, but that amounts to circumventing the 1-bit utility idea.
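A minimal sketch of why the gamble happens (the action names and probabilities below are invented for illustration, not taken from any real system): with a 0/1 utility, expected utility collapses to the probability of hitting the approved set, so a 10^-99 long shot at the narrow target beats any fallback that scores 0, however good that fallback would look to us.

```python
# Toy illustration: under a 1-bit (0/1) utility, expected utility equals the
# probability of reaching the approved set of futures, so the agent picks the
# long-shot gamble over any merely "pretty good" fallback.
# (Probabilities and action names are hypothetical.)

actions = {
    # action: (probability the narrow Omega Point criteria are met,
    #          how good the typical outcome looks to humans, 0-10 scale)
    "pursue_omega_point_at_all_costs": (1e-99, 0),  # gamble: almost certainly ruins the future
    "build_flourishing_civilization":  (0.0,   9),  # great outcome, but scores 0 on the 1-bit utility
    "do_nothing":                      (0.0,   5),
}

def expected_1bit_utility(p_success: float) -> float:
    # E[U] = 1 * p_success + 0 * (1 - p_success) = p_success
    return p_success

best = max(actions, key=lambda a: expected_1bit_utility(actions[a][0]))
print(best)  # -> "pursue_omega_point_at_all_costs", despite the 1e-99 chance
```

The second column (how good the outcome looks to humans) never enters the calculation, which is the point: a narrow 1-bit utility has no notion of a second-best future.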
"I've come to agree that navigating the Singularity wisely is the most important thing humanity can do. I'm a researcher and I want to help. What do I work on?"
The Singularity Institute gets this question regularly, and we haven't published a clear answer to it anywhere. This is because it's an extremely difficult and complicated question. A large expenditure of limited resources is required to make a serious attempt at answering it. Nevertheless, it's an important question, so we'd like to work toward an answer.
A few preliminaries:
Next, a division of labor into "problem categories." There are many ways to categorize the open problems; some of them are probably more useful than the one I've chosen below.
The list of open problems below is very preliminary. I'm sure there are many problems I've forgotten, and many problems I'm unaware of. Probably all of the problems are stated relatively poorly: this is only a "first step" document. Certainly, all listed problems are described at an extremely "high" level, very far (so far) from mathematical precision, and each can be broken down into several, and often dozens of, subproblems.
Safe AI Architectures
Safe AI Goals
Strategy
My thanks to Eliezer Yudkowsky, Carl Shulman, and Nick Bostrom for notes from which I've drawn.