Tuesday, February 15, 2011

Watson 9000

I’ve been watching Jeopardy – The IBM Challenge for the last two evenings, and I watched the Nova documentary “Smartest Machine on Earth” the week before. What the folks at the IBM Watson team have pulled together is really quite impressive. Not only is Watson in the lead so far – and by a huge margin – it also represents a giant leap forward for natural language processing in computers.

Ever since the early days of the Eliza program by Joseph Weizenbaum, it has been a challenge for computers to recognize and process human language (there is a JavaScript version of Eliza that you can interact with). While we’ve made remarkable progress in speech recognition in the past couple of years, the actual ability to understand and interpret language has eluded even the most sophisticated computer systems.

Humans just have a tendency to use colorful phrases, idioms, pop-culture references, and mix it all with humor in a way that is difficult to grasp for a machine. Nevertheless the Watson team seems to have made great strides in tackling these difficult problems.

It was immediately obvious that Watson did best when the question was directly related to an encyclopedic fact, such as the various illnesses in the “Don’t worry about it” category tonight. But even with humorous categories like “Church & State”, Watson did fine. In fact, Watson didn’t just do fine tonight: he (it?) dominated this second day of the Jeopardy challenge, going into Final Jeopardy with a crazy total of $36,681, compared to $5,400 for Rutter and $2,400 for Jennings.

The big surprise, however, came in the Final Jeopardy round tonight, when the category was “U.S. Cities”. In response to the answer “This city’s largest airport is named for a World War II hero, and its second largest airport is named for a World War II battle”, Watson came up with “What is Toronto”, which is clearly not a US city, while the two human contestants both responded with the correct answer (What is Chicago? The airports are O’Hare and Midway). However, Watson was quite unsure about its answer and only wagered $947, so its loss was kept nicely under control. Clearly, there is something amiss in the interpretation of categories in Watson’s algorithms. It could potentially be as simple as a missing entry in a synonym table that equates “U.S.” with “US” and “USA”…
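To make that guess a little more concrete, here is a minimal Python sketch of what such a synonym table could look like. To be clear, this is only my illustration of the idea and not how Watson actually works: the table `COUNTRY_SYNONYMS`, the toy lookup `CITY_COUNTRY`, and the helpers `category_country` and `fits_category` are all hypothetical names I made up for this example.

```python
import re

# Hypothetical synonym table: every spelling maps to one canonical token.
COUNTRY_SYNONYMS = {
    "u.s.": "US",
    "us": "US",
    "usa": "US",
    "u.s.a.": "US",
    "united states": "US",
}

# Toy lookup of which country a city is in (illustrative data only).
CITY_COUNTRY = {"Chicago": "US", "Toronto": "CA"}


def category_country(category):
    """Return the canonical country named in a category, or None."""
    text = category.lower()
    # Try longer spellings first so "u.s.a." is checked before "u.s.".
    for variant in sorted(COUNTRY_SYNONYMS, key=len, reverse=True):
        # Match the variant only when it is not embedded in a longer word.
        pattern = r"(?<![a-z])" + re.escape(variant) + r"(?![a-z])"
        if re.search(pattern, text):
            return COUNTRY_SYNONYMS[variant]
    return None


def fits_category(city, category):
    """True if the city is compatible with the country the category names."""
    wanted = category_country(category)
    # A category that names no country places no constraint on the answer.
    return wanted is None or CITY_COUNTRY.get(city) == wanted


print(fits_category("Toronto", "U.S. Cities"))  # False
print(fits_category("Chicago", "US Cities"))    # True
```

If the “u.s.” entry were missing from the table, `category_country("U.S. Cities")` would return `None` and Toronto would slip through as an acceptable answer, which is exactly the kind of failure I am speculating about.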

But we’ve seen that same weakness in category interpretation before, in several of the test games shown in the Nova documentary.

It remains to be seen how Watson fares tomorrow in the final round of Jeopardy. I will definitely be watching…

In any case, now it is just a simple matter of time until IBM shifts its company name one letter to the left and comes out with the next release of Watson, which will probably be called version 9000:

HAL 9000 responding to Dave Bowman in “2001: A Space Odyssey”


More commentary on Watson can be found on Techmeme, and in particular I recommend this article on All Things Digital.

UPDATE: For an explanation of the “Toronto” incident by David Ferrucci, project manager for Watson, please see “The Confusion over an Airport Clue” on the IBM Smarter Planet blog.
