You don't know how much you privilege a hypothesis by picking the arbitrary unbounded goal G from among the goals that we humans can easily state in English. It is very easy to say 'maximize the paperclips or something'; it is very hard to formally define what paperclips are, even without any run-time constraints, and it is very dubious that you can forbid solutions like those a Soviet factory would employ if it were tasked with maximizing paperclip output (a lot of very tiny paperclips, or just falsified output numbers, or making the paperclips and then re-melting them). Likewise, it is really easy for us to say 'self', but defining self formally is very difficult as well, if you want the AI's self-improvement not to amount to suicide.
Furthermore, the AI starts out stupid. It had better care about itself before it is able to invent self-preservation through self-foresight. Defining the goals in terms of some complexity metric yields goals that have something to do with life.
My argument doesn't require that anybody be able to formally define "self" or "maximize paperclips"; it doesn't require the goal G to be picked among those that are easily defined in English.
An agent capable of reasoning about the world should be able to make an inference like "if all copies of me are destroyed, it makes it much less likely that goal G would be reached"; it may not have exactly that form, but it should be something analogous. It doesn't matter if I can't formalize that; the agent may not have a completely formal version either, only one that is sufficient for its purposes.
Here's my draft document, "Concepts are Difficult, and Unfriendliness is the Default" (Google Docs, commenting enabled). Despite the name, it's still informal and would need a lot more references, but it could be written up into a proper paper if people felt that the reasoning was solid.
Here's my introduction:
And here's my conclusion:
For the actual argumentation defending the various premises, see the linked document. I have a feeling that there are still several conceptual distinctions that I should be making but am not, but I figured that the easiest way to find the problems would be to have people tell me what points they find unclear or disagreeable.