Possibly somewhat off-topic: my hunch is that the actual motivation of the initial AGI will be random, rather than orthogonal to anything.
Consider this: how often has a difficult task been accomplished right the first time, even with careful preparation beforehand? For example, how many rockets blew up, killing people in the process, before the first successful lift-off? People were careless but lucky with the first nuclear reactor, though note: "Fermi had convinced Arthur Compton that his calculations were reliable enough to rule out a runaway chain reaction or an explosion, but, as the official historians of the Atomic Energy Commission later noted, the 'gamble' remained in conducting 'a possibly catastrophic experiment in one of the most densely populated areas of the nation!'"
I doubt that one can count on luck in AGI development, but I would bet on unintentional carelessness (and other manifestations of Murphy's law).
The bottom line (nothing new here) is that no matter how much you research things beforehand, the first AGI will have bugs, with unpredictable consequences for its actual motivation. If we are lucky, there will be a chance to fix the bugs. It is way too early to tell whether it is even possible to constrain the severity of bugs, given how little is currently known about the topic.
One of the most annoying arguments when discussing AI is the perennial "But if the AI is so smart, why won't it figure out the right thing to do anyway?" It's often the ultimate curiosity stopper.
Nick Bostrom has defined the "Orthogonality thesis" as the principle that motivation and intelligence are essentially unrelated: superintelligences can have nearly any type of motivation (at least, nearly any utility-function-based motivation). We're trying to get some rigorous papers out so that when that question comes up, we can point people to standard, published arguments. Nick has had a paper accepted that points out that the orthogonality thesis is compatible with a lot of philosophical positions that would seem to contradict it.
I'm hoping to complement this with a paper laying out the positive arguments in favour of the thesis. So I'm asking you for your strongest arguments for (or against) the orthogonality thesis. Think of trying to convince a conservative philosopher who's caught a bad case of moral realism - what would you say to them?
Many thanks! Karma and acknowledgements will shower on the best suggestions, and many puppies will be happy.