Followup: I was able to attend a panel discussion tonight with several members of the team working on Watson. (My university is hosting panels for all three nights, as many of the team members were once students here. See watson.rpi.edu for recordings of the panel discussions.)
I spoke with one person from IBM after the episode aired, and confirmed that Watson is programmed with statistics from every Jeopardy episode. That allows it to search for the daily double efficiently, and afterward it specifically turns to the lowest-point questions in each category in order to learn which categories it might do best in. It also employs game theory to determine how much to bet on daily doubles and final jeopardy, and which questions to pick when it has control.
They explained to us why Watson missed the final jeopardy in tonight's game. The category was "U.S. Cities" and the answer was "Its largest airport was named for a World War II hero; its second for a World War II battle." Watson learned that the category names don't always strictly imply the answer type, so it didn't consider that to be a strong indicator. It recognized that the clue was in two parts, but the second part was missing the noun and verb from the first, so Watson couldn't really get anything from it. Toronto's largest airport is named after a WWII vet, and there are cities named Toronto in the U.S. We were told that its confidence on Toronto was ~13%, and its second choice was Chicago (the correct answer) with a confidence of ~11%.
We were also told that its confidences are very well calibrated, so that, e.g., it will be right on average 9 out of 10 of the times that it displays 90% confidence.
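To make "well calibrated" concrete, here is a minimal sketch of how one could check it: group answers by stated confidence and compare each group's average confidence with how often it was actually right. The function name `calibration_table` and the sample data are made up for illustration; this is not Watson's code or IBM's evaluation method.

```python
from collections import defaultdict

def calibration_table(predictions, n_bins=10):
    """predictions: iterable of (confidence in [0, 1], was_correct) pairs.
    Returns {bin_index: (mean_confidence, accuracy, count)}."""
    bins = defaultdict(list)
    for conf, correct in predictions:
        # Clamp so that a confidence of exactly 1.0 falls in the top bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, correct))
    table = {}
    for idx, items in sorted(bins.items()):
        mean_conf = sum(c for c, _ in items) / len(items)
        accuracy = sum(1 for _, ok in items if ok) / len(items)
        table[idx] = (mean_conf, accuracy, len(items))
    return table

# Hypothetical data: ten answers given at 90% confidence, nine of them right.
sample = [(0.9, True)] * 9 + [(0.9, False)]
table = calibration_table(sample)
for idx, (mean_conf, accuracy, count) in table.items():
    print(f"bin {idx}: mean confidence {mean_conf:.2f}, "
          f"accuracy {accuracy:.2f}, n={count}")
```

A system is well calibrated when mean confidence and accuracy roughly agree in every bin, given enough answers per bin.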
The confidences are supposed to be probabilities? But they often summed to more than 100%.
Or is it "the procedure for generating the confidences is such that it'll be well calibrated for the highest ranking answer"?
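One way the numbers could sum past 100% without being nonsense is if each candidate answer gets its own independent "is this one right?" estimate. The sketch below assumes exactly that, with a logistic function over made-up evidence scores; the candidate names and scores are hypothetical, and nothing here is confirmed to be how Watson computes its confidences.

```python
import math

def sigmoid(x):
    """Map an unbounded evidence score to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical evidence scores for three candidate answers.
scores = {"candidate A": 1.2, "candidate B": 0.8, "candidate C": -0.5}

# Each candidate is scored on its own, so nothing forces a total of 1.
independent = {name: sigmoid(s) for name, s in scores.items()}
total = sum(independent.values())

# Dividing by the total would turn them into a distribution over candidates.
normalised = {name: p / total for name, p in independent.items()}

print(f"sum of independent confidences: {total:.2f}")
print(f"sum after normalising: {sum(normalised.values()):.2f}")
```

Under this assumption, calibration could still hold for each displayed number separately (or only for the top-ranked answer), even though the set of numbers is not a probability distribution.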
It was mentioned before on LessWrong, but I feel people might appreciate a reminder:
http://www-03.ibm.com/innovation/us/watson/what-is-watson/countdown-to-jeopardy.html
It's a bit of a cheesy PR thing - I'd be a lot more interested if they connected the program to the Internet and allowed anyone to try asking it general questions, rather than mixing the program with voice recognition and (heh) buzzer-pushing. Trivia tests are also probably one of the easier challenges to deal with, since keyword filtering alone is very efficient in narrowing down the candidate space.
Still, I'm going to watch it if I can: if anybody knows of a streaming link that is accessible to non-US viewers, that would be appreciated.
(Silly aside: is anyone else annoyed by how "Jeopardy" pretends to invert the traditional question-answer format, when all it does is move the "what is" from the former to the latter, even if the result makes no sense? I suppose to US people this is a rather old complaint, but I learnt about the show today and I'm rather bugged by this feature.)