Followup To: Are You Anosognosic?, The Strangest Thing An AI Could Tell You
Over this past weekend I listened to an episode of This American Life titled Pro Se. Although the episode is nominally about people defending themselves in court, the first act was about a man who pretended to be insane in order to avoid a prison sentence for assault. There doesn't appear to be a transcript, so I'll summarize here first.
A man, we'll call him John, was arrested in the late 1990s for assaulting a homeless man. Given that there was plenty of evidence to prove him guilty, he was looking for a way to avoid the likely jail sentence of five to seven years. The other prisoners he was being held with suggested that he plead insanity: he'd be put up at a hospital for several months with hot food and TV and released once they considered him "rehabilitated". So he took bits and pieces about how insane people are supposed to act from movies he had seen and used them to form a case for his own insanity. The court believed him, but rather than sending him to a cushy hospital, they sent him to a maximum security asylum for the criminally insane.
Within a day of arriving, John realized the mistake he had made and looked for a way out. He tried a variety of techniques: engaging in therapy, not engaging in therapy, dressing like a sane person, acting like a sane person, acting like an incurably insane person, but none of it worked. Over a decade later he is still being held.
As the story unfolds, we learn that although John makes a convincing case that he faked his way in and is being held unjustly, the psychiatrists at the asylum already know that he faked his way in and continue to hold him anyway, though John is not aware that they know. The reason: through his long years of documented behavior John has made it clear to the psychiatrists that he is a psychopath/sociopath and is not safe to return to society without therapy. John is aware that this is his diagnosis, but continues to believe himself sane.
Similar to trying to determine if you are anosognosic, how do you determine if you are insane? Some kinds of insanity can be self-diagnosed, but in John's case he has lots of evidence (he has access to all of his own medical records) that he is insane, yet continues to believe himself not to be. To me this seems a level trickier than anosognosia, since there are no physical tests you can perform, but perhaps it's a difference that is significant only to people and not to an AI.
Edited to add a footnote: By "sane" I simply mean normative human reasoning: the way you expect, all else being equal, a human to think about things. While the discussion in the comments about how to define sanity might be of some interest, it really gets away from the point of the post unless you want to argue that "sanity" is creating a question here that is best solved by dissolving the question (as at least one commenter does).
Defining a function's wackiness is easy. If you don't believe me, suffer through the following paragraph, in which I demonstrate my cleverness in a remarkably economist-like manner.
Let's say a utility function maps world-states to real numbers on the interval [-1, 1], where -1 is the worst thing you can imagine and 1 is the best. Your function periodically re-normalizes as the best or worst thing you can imagine changes. To compute two utility functions' wackiness with respect to each other, take the root-mean-square difference between them across all world-states on which both are defined. Define a function A(world-state) on every world-state for which at least half of humanity's revealed utility functions are defined and for which the standard deviation of their values is less than 0.2. Its value for any given world-state is the average of the values of all humanity's revealed utility functions that are defined for that world-state. Then compute every human's wackiness with respect to A, and designate as "insane" those whose wackiness is more than two standard deviations above the average.
In other words, insane people want really different things than other people.
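For anyone who would rather read that as code, here is a minimal sketch of the definition above. It assumes each revealed utility function is stored as a dict from world-state labels to values already normalized to [-1, 1], and it uses the population standard deviation throughout; the function names, the dict representation, and the toy population are illustrative assumptions, not anything measured.

```python
import math
from statistics import mean, pstdev


def wackiness(u, v):
    """Root-mean-square difference over world-states both functions define."""
    shared = u.keys() & v.keys()
    if not shared:
        raise ValueError("the two functions share no world-states")
    return math.sqrt(sum((u[s] - v[s]) ** 2 for s in shared) / len(shared))


def consensus(population, max_spread=0.2):
    """Build A: the mean utility on every world-state that at least half the
    population defines and on which their values have std dev < max_spread."""
    states = set().union(*(u.keys() for u in population))
    a = {}
    for s in states:
        values = [u[s] for u in population if s in u]
        if 2 * len(values) >= len(population) and pstdev(values) < max_spread:
            a[s] = mean(values)
    return a


def insane(population, k=2.0):
    """Flag each person whose wackiness with respect to A is more than
    k standard deviations above the population's average wackiness."""
    a = consensus(population)
    scores = [wackiness(u, a) for u in population]
    cutoff = mean(scores) + k * pstdev(scores)
    return [score > cutoff for score in scores]


# Toy population (made up): 99 people who agree, plus one who wants the opposite.
population = [{"peace": 0.5, "war": -0.5}] * 99 + [{"peace": -0.9, "war": 0.9}]
flags = insane(population)
```

One thing the sketch makes visible: in a small population a single strong dissenter pushes the standard deviation on a world-state past 0.2, which removes that world-state from A altogether, so the definition only behaves as intended when the population is large.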
That's probably not a good definition though, because those people are more likely to be called weird than insane. A definition based on how fast the revealed utility function changes is probably better. For that you'd sample a person's revealed utility function every five minutes over the last four days, compute the wackiness of each sample with respect to the one from five minutes earlier, and add all the wackinesses together. If the total is more than four standard deviations above the average, the person can be considered insane.
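The fast-changing version can be sketched the same way. This assumes each person's history is a list of utility-function snapshots taken five minutes apart (four days would be 1,153 snapshots), each a dict from world-state labels to values in [-1, 1]; the names and the snapshot representation are hypothetical.

```python
import math
from statistics import mean, pstdev


def wackiness(u, v):
    """Root-mean-square difference over world-states both functions define."""
    shared = u.keys() & v.keys()
    if not shared:
        raise ValueError("the two functions share no world-states")
    return math.sqrt(sum((u[s] - v[s]) ** 2 for s in shared) / len(shared))


def instability(snapshots):
    """Sum of the wackiness of each snapshot with respect to the previous one."""
    return sum(wackiness(prev, cur) for prev, cur in zip(snapshots, snapshots[1:]))


def unstable(histories, k=4.0):
    """Flag each person whose summed wackiness is more than k standard
    deviations above the average across all histories."""
    totals = [instability(h) for h in histories]
    cutoff = mean(totals) + k * pstdev(totals)
    return [total > cutoff for total in totals]
```

Note that with a population standard deviation, nobody can be four standard deviations above the average unless there are at least 18 people in the comparison group.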
I'm not sure that really captures most of what passes for 'insane'.