Maybe? But consider that the opposite of what you just claimed sounds just as plausible to an outside observer. "Do what I mean" doesn't sound all that complicated -- even to someone with a background in computer science or AI specifically. "Do what I mean" translates as "accurately determine the principles which constrain my own actions and use those to constrain the AI's, or otherwise build a model of my thinking which the AI can use to evaluate options." Sub-goals such as verifying that the model matches reality fall easily out of this definition.
Even to a practitioner in the field, it's not at all clear that this expansion fails, if in fact it does.
It's not necessarily that the AI would have difficulty understanding what "do what humans mean" means, even before being told to do what humans mean.
It just has no reason to obey "do what humans mean" unless we program it to do what humans mean.
"Do what humans mean" is telling the AI to do something that we can currently only specify vaguely. "Figure out what we intend by "do what humans mean", and then do that" is also vaguely specified. It doesn't solve the problem.
If it's worth saying, but not worth its own post, then it goes here.
Notes for future OT posters:
1. Please add the 'open_thread' tag.
2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)
3. Open Threads should start on Monday and end on Sunday.
4. Unflag the two options "Notify me of new top level comments on this article" and "