"Betrayal" is not the main worry. Given that you prevent the AGI from understanding what people want, it's likely that it won't do what people want.
Have you read Bostroms book Superintelligence?
Yes, that's actually the reason why I wanted to tackle the "treacherous turn" first, to look for a general design that would allow us to trust the results from tests and then build on that. I'm seeing as order of priority: 1) make sure we don't get tricked, so that we can trust the results of what we do; 2) make the AI do the right things. I'm referring to 1) in here. Also, as mentioned in another comment to the main post, part of the AI's utility function is evolving to understand human values, so I still don't quite see why exactly it shouldn't...
If it's worth saying, but not worth its own post (even in Discussion), then it goes here.
Notes for future OT posters:
1. Please add the 'open_thread' tag.
2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)
3. Open Threads should be posted in Discussion, and not Main.
4. Open Threads should start on Monday, and end on Sunday.