What I meant was "for the top three answers, the confidences would sometimes sum to > 100, so how does that work?"
Is the procedure defined as well calibrated only for the top answer, or is there something I'm missing?
The confidence level compares an answer to the answers Watson has given in the past, based on how strongly the answer is supported by the evidence Watson has and uses. All the candidate answers are generated and scored in parallel. The score is not a comparison among the answers generated for a specific question, so the confidences need not add up to 100.
Quote from Chris Welty at last night's panel: "When [Watson] says 'this is my answer, 50% sure,' half the time he's right about that, and half the time he's wrong. When he says 80%, 20% of the time he's wrong."
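To make that notion of calibration concrete, here is a minimal Python sketch. It is not Watson's actual scoring code; the function name `observed_accuracy`, the made-up `history`, and the `top_three` confidences are all assumptions for illustration. It checks calibration against a record of past answers, and shows that nothing in that check forces the confidences for a single question to sum to 100.

```python
from collections import defaultdict

def observed_accuracy(history):
    """Map each stated confidence (integer percent) to the fraction of
    past answers given at that confidence that turned out to be correct."""
    outcomes = defaultdict(list)
    for stated_percent, was_correct in history:
        outcomes[stated_percent].append(was_correct)
    return {p: sum(results) / len(results) for p, results in outcomes.items()}

# Made-up history (an assumption, not real Watson data):
# ten answers given at 50% confidence, ten given at 80%.
history = (
    [(50, True)] * 5 + [(50, False)] * 5
    + [(80, True)] * 8 + [(80, False)] * 2
)
accuracy = observed_accuracy(history)

# Hypothetical confidences for the top three candidates on one question.
# Each is scored on its own evidence, so the sum is unconstrained.
top_three = [80, 50, 30]
total = sum(top_three)

print(accuracy)
print(total)
```

In this toy history the system is right half the time when it says 50% and wrong 20% of the time when it says 80%, which matches the description in the quote. Calibration here is a property of the track record across many questions, not a constraint on the candidates within one question.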
It was mentioned before on LessWrong, but I feel people might appreciate a reminder:
http://www-03.ibm.com/innovation/us/watson/what-is-watson/countdown-to-jeopardy.html
It's a bit of a cheesy PR thing - I'd be a lot more interested if they connected the program to the Internet and allowed anyone to ask it general questions, rather than mixing the program with voice recognition and (heh) buzzer-pushing. Trivia tests are also probably one of the easier challenges to deal with, since keyword filtering alone is very efficient at narrowing down the candidate space.
Still, I'm going to watch it if I can: if anybody knows of a streaming link that is accessible to non-US viewers, that would be appreciated.
(Silly aside: is anyone else annoyed by how "Jeopardy" pretends to invert the traditional question-answer format, when all it does is move the "what is" from the former to the latter, even when the result makes no sense? I suppose to US people this is a rather old complaint, but I learnt about the show today and I'm rather bugged by this feature.)