Not sure what allowing a small chance of false negatives does: you presumably could just repeat all your questions?
In this case the result would or could be the same, so long as the AI didn't sufficiently update its internal state in between. But the detail isn't important; please ignore it. I include it because it makes the device tractable. To achieve perfect detection would require a more powerful computer than the AI being analyzed, which seems impractical. But achieving even infinitesimal error rates appears to be doable (I had a specific construction in mind when writing this post).
More substantially, I don't know how easy 'deception' would be to define - any presentation of information would be selective.
Deception in this case means giving false or incomplete descriptions of its thought processes. It's okay for the AI to think "how do I present this in a way that the human will accept it?" only so long as the AI tells the human it had that thought. E.g. you ask "why do you recommend this action?" and the answer must be the actual, 100% complete justification of both the specified choice and its alternatives, and a calculation showing higher expected utility for the chosen action, as well as a whole slew of meta-information such as a description of the search strategy and cutoff thresholds for giving up on generating alternatives, which cached computations were available for use, etc. If any of this is falsified, or a single detail omitted, the red light goes off.
Interesting and useful post!
But on your last bullet, you seem to be conflating 'leadership' with 'people presenting the idea'. I'm not sure they are always the same thing: the 'leaders' of any group are quite often going to be there because they're good at forging consensus and/or because they have general social/personal skills that stop them appearing like cranks.
Take a fringe political party: I would guess that people promoting that party down the pub or in online comments on newspaper websites or whatever are more likely to be the sort of advocate you describe. But in all but the smallest fringe parties, you'd expect the actual leadership to have rather more political skill.