As I understand it, the argument is (roughly) that if you build an AI from scratch, using only the tools available now, you will have to specify its utility function, in a form the program can understand, as part of that process. Anyone actually trying to work out a utility function that can be programmed would need a fairly deep understanding of what they want: you can't just type "make nice things happen and no bad things", but have to think in terms that can be converted into C or Perl or whatever. In doing so, you would have to have some kind of understanding in your own head of what you're telling the computer to do, and would be likely to avoid at least the most obvious failure modes.
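To make that a bit more concrete, here is a toy sketch in Python (rather than C or Perl, purely for brevity). Everything in it is made up for illustration: the `Outcome` fields, the candidate outcomes and their numbers are hypothetical, and nothing here resembles how a real AI would be built. It only shows that once the goal has to be written as a function, you are forced to say what is actually being measured.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Outcome:
    # Hypothetical, hand-picked features of a predicted world state.
    name: str
    smiling_faces: int
    people_actually_happy: int


def utility(outcome: Outcome) -> float:
    # The programmer has to commit to something measurable here.
    # "Make nice things happen" is not expressible; "count smiles" is.
    return float(outcome.smiling_faces)


def choose(outcomes: List[Outcome]) -> Outcome:
    # A maximizer simply picks whatever scores highest under utility().
    return max(outcomes, key=utility)


# Made-up candidates, chosen so the proxy and the intent disagree.
candidates = [
    Outcome("cure diseases", smiling_faces=900, people_actually_happy=900),
    Outcome("paralyze faces into smiles", smiling_faces=1000, people_actually_happy=0),
]

best = choose(candidates)
print(best.name)
```

Someone who has to write `utility()` by hand is at least looking straight at the fact that it counts smiles rather than happiness, which is the kind of obvious failure mode they would have a chance to notice.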
However, in (say) twenty years that might no longer be the case. As an example, we might have natural language processing programs that can take a sentence like 'make people happy' and have some form of 'understanding' of it, while still not being Turing-test-passing, self-modification-capable, fully general AIs. We could then reach the stage where some half-clever person thinks "Hmm... If I put this and this and this together, I'll have a self-modifying AI. And then I'll just tell it to make everyone smile. What could go wrong?"
It's probably easier to build an uncaring AI than a friendly one. So, if we assume that someone, somewhere, is trying to build an AI without solving friendliness, that person will probably finish before anyone who is trying to build a friendly AI.
[redacted]
[redacted]
further edit:
Wow, this is getting a rather stronger reaction than I'd anticipated. Clarification: I'm not suggesting practical measures that should be implemented. Jeez. I'm deep in an armchair, thinking about a problem that (for the moment) looks very hypothetical.
For future reference, how should I have gone about asking this question without seeming like I want to mobilize the Turing Police?