What's the point? Are you going to nitpick that my goals aren't formal enough, even though I'm not making any claim at all about what kind of goals those could be?
Are you claiming that it's impossible for an agent to have goals? That the set of goals that it's even conceivable for an AI to have (without immediately wireheading or something) is much narrower than what most people here assume?
I'm not even sure what this disagreement is about right now, or even if there is a disagreement.
Ya, I think the set of goals is very narrow. The AI imagined here starts off as a Descartes-level genius and proceeds to preserve itself, understand the map-territory distinction so as to avoid wireheading, foresee the possibility that instrumental goals which look good may destroy the terminal goal, and so on.
The AI I imagine starts off stupid and has some really narrowly (edit: or should I say, short-sightedly) self-improving, non-self-destructive goal, likely having to do with maximization of complexity in some way. Think evolution, don't think fully grown Descartes waking ...
Here's my draft document, Concepts are Difficult, and Unfriendliness is the Default. (Google Docs, commenting enabled.) Despite the name, it's still informal and would need a lot more references, but it could be written up as a proper paper if people felt that the reasoning was solid.
Here's my introduction:
And here's my conclusion:
For the actual argumentation defending the various premises, see the linked document. I have a feeling that there are still several conceptual distinctions that I should be making but am not, but I figured that the easiest way to find the problems would be to have people tell me what points they find unclear or disagreeable.