What about The Lifespan Dilemma and Pascal's Mugging?
These are really only problems for agents with unbounded utility functions. This is a great example of over-theorizing without considering practical computational limitations. If your AI design requires double-precision (or even much higher-precision) arithmetic just to evaluate its internal utility functions, you have probably already failed.
Consider the extreme example of bounded utility functions: 1-bit utilities. A 1-bit utility function can only categorize futures into two possible shades: good or bad. This by itself is not a crippling limitation if the AI considers a large number of potential futures and computes a final probability-weighted decision score with higher precision. For example, when considering two action paths A and B, a Monte Carlo design could evaluate a couple hundred futures branching from each of A and B, assign each future a 0 or 1, and then add them up into a tally whose bit-width grows with the logarithm of the number of futures evaluated (in this case, around 8 bits).
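A minimal sketch of that Monte Carlo tally, for illustration only. The world model (`simulate_future`), the threshold inside `one_bit_utility`, and the outcome distributions are all hypothetical stand-ins; a real design would put nearly all of its complexity into those two pieces.

```python
import random


def one_bit_utility(outcome: float) -> int:
    """Hypothetical classifier: 1 for a 'good' future, 0 for a 'bad' one."""
    return 1 if outcome > 0.0 else 0


def simulate_future(action: str, rng: random.Random) -> float:
    """Hypothetical toy world model: sample one future following `action`.

    Assumption for this sketch: action "A" tends to lead to better outcomes
    than any other action.
    """
    mean = 1.0 if action == "A" else -1.0
    return rng.gauss(mean, 1.0)


def tally(action: str, n_futures: int, rng: random.Random) -> int:
    """Count how many sampled futures are classified as good (each is 0 or 1)."""
    return sum(one_bit_utility(simulate_future(action, rng)) for _ in range(n_futures))


def choose(actions, n_futures=200, seed=0):
    """Pick the action whose sampled futures receive the most 1s."""
    rng = random.Random(seed)
    scores = {a: tally(a, n_futures, rng) for a in actions}
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    best, scores = choose(["A", "B"])
    # Each tally lies in 0..200, so it fits in 8 bits.
    print(best, scores, max(scores.values()).bit_length())
```

The utility of each individual future carries one bit; all of the extra precision lives in the tally, which is just the sample estimate of the probability of a good outcome scaled by the number of futures.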
This extremely bounded design would need to do far more future-simulation to compensate for its extremely low-granularity utility rankings: for example, when playing chess, it could only categorize board states as 'likely win' or 'likely loss'. Thus it would need a higher ply-depth than algos that use higher bit-depth evaluations. But even so, this only amounts to a performance efficiency disadvantage, not a fundamental limitation.
If we extrapolate a 1-bit friendly AI to planning humanity's future, it would collapse all futures that humanity found largely 'desirable' into the same score of 1, with everything else being 0. If its utility classifier and future modelling are powerful enough, this design can still work.
And curiously, a 1-bit utility function gives more intuitively reasonable results in the Lifespan Dilemma or Pascal's Mugging. Assuming dying in an hour is a 0-utility outcome and living for at least a billion years is a 1, it would never take any wager that increases its probability of death. And it would be just as unsusceptible to Pascal's Mugging.
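To make the Lifespan Dilemma case concrete, here is a small sketch comparing an unbounded (linear in years) utility with the 1-bit utility described above. The specific probabilities and lifespans are made up for illustration; only the billion-year threshold comes from the text.

```python
def expected_utility(lottery, utility):
    """Probability-weighted utility of a lottery given as (probability, years_lived) pairs."""
    return sum(p * utility(years) for p, years in lottery)


def unbounded_utility(years: float) -> float:
    """Utility grows without bound: one unit per year lived."""
    return years


def one_bit_utility(years: float) -> float:
    """1 if you live at least a billion years, 0 otherwise (e.g. dying in an hour)."""
    return 1.0 if years >= 1e9 else 0.0


# Hypothetical numbers: the wager trades a slightly higher chance of
# near-immediate death for a vastly longer life if you survive.
status_quo = [(0.80, 1e9), (0.20, 0.0)]
wager = [(0.79, 1e12), (0.21, 0.0)]


def accepts_wager(utility) -> bool:
    """True if the agent strictly prefers the wager to the status quo."""
    return expected_utility(wager, utility) > expected_utility(status_quo, utility)


if __name__ == "__main__":
    print("unbounded agent accepts:", accepts_wager(unbounded_utility))
    print("1-bit agent accepts:", accepts_wager(one_bit_utility))
```

Under these assumed numbers the unbounded agent takes the wager, and would keep taking similar wagers as long as the lifespan grows faster than the survival probability shrinks. The 1-bit agent sees only the drop from 0.80 to 0.79 in the probability of a good outcome and declines.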
Just to be clear, I'm not really advocating simplistic 1-bit utilities. What is clear is that humans' internal utility evaluations are bounded. This probably comes from practical computational limitations, and future AIs will likely have practical limitations as well.
Bounded utilities -- especially strongly bounded ones like your 1-bit probability-weighted utility function -- give you outcomes that depend crucially on the probability of a world-state's human-relative improvement versus the probability of degeneration. Once a maximal state has been reached, the agent has an incentive to further improve it if and only if that makes the maintenance of the state more likely. That's not really a bad outcome if we've chosen our utility terms well (i.e. not foolishly ignored the hedonic treadmill or something), but it's sub...
"I've come to agree that navigating the Singularity wisely is the most important thing humanity can do. I'm a researcher and I want to help. What do I work on?"
The Singularity Institute gets this question regularly, and we haven't published a clear answer to it anywhere. This is because it's an extremely difficult and complicated question. A large expenditure of limited resources is required to make a serious attempt at answering it. Nevertheless, it's an important question, so we'd like to work toward an answer.
A few preliminaries:
Next, a division of labor into "problem categories." There are many ways to categorize the open problems; some of them are probably more useful than the one I've chosen below.
The list of open problems below is very preliminary. I'm sure there are many problems I've forgotten, and many problems I'm unaware of. Probably all of the problems are stated relatively poorly: this is only a "first step" document. Certainly, all listed problems are described at an extremely "high" level, still very far from mathematical precision, and each can be broken down into several, and often dozens of, subproblems.
Safe AI Architectures
Safe AI Goals
Strategy
My thanks for some notes written by Eliezer Yudkowsky, Carl Shulman, and Nick Bostrom, from which I've drawn.