Watson 9000

I’ve been watching Jeopardy – The IBM Challenge for the last two evenings and the Nova documentary “Smartest Machine on Earth” the week before. What the folks at the IBM Watson team have pulled together is really quite impressive. Not only is Watson in the lead so far – and by a huge margin – it has also taken a giant leap forward for natural language processing in computers.

Ever since the early days of Joseph Weizenbaum’s Eliza program, it has been a challenge for computers to recognize and process human language (click here for a JavaScript version of Eliza that you can interact with). While we’ve made remarkable progress in speech recognition in the past couple of years, the actual ability to understand and interpret language has eluded even the most sophisticated computer systems.

Humans just have a tendency to use colorful phrases, idioms, and pop-culture references, and to mix it all with humor in a way that is difficult for a machine to grasp. Nevertheless, the Watson team seems to have made great strides in tackling these difficult problems.

It was immediately obvious that Watson did best when the question was directly related to an encyclopedic fact, such as the various illnesses in the “Don’t worry about it” category tonight. But even with humorous categories like “Church & State”, Watson did fine. In fact, Watson didn’t just do fine tonight: he (it?) dominated this second day of the Jeopardy challenge, going into Final Jeopardy with a crazy lead of $36,681, compared to $5,400 for Rutter and $2,400 for Jennings.

The big surprise, however, came in the Final Jeopardy round tonight, when the category was “U.S. Cities”. In response to the answer “This city’s largest airport is named for a World War II hero, and its second largest airport is named for a World War II battle”, Watson came up with “What is Toronto”, which is clearly not a US city, while the two human contestants both responded with the correct answer (What is Chicago? The airports are O’Hare and Midway). However, Watson was reasonably unsure about its answer and only wagered $947, so its loss was kept nicely under control. Clearly, there is something amiss in the interpretation of categories in Watson’s algorithms. It could potentially be as simple as a missing entry in a synonym table that equates “U.S.” with “US” and “USA”…

But we had already seen that same weakness in category interpretation in several of the test games shown in the Nova documentary.
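To make the synonym-table idea concrete, here is a minimal sketch in Python. It is only an illustration of the speculation above, not a description of how Watson actually works: the table contents, the city lookup, and the function names are all made up for this example.

```python
import re

# Hypothetical synonym table: every spelling maps to one canonical token.
COUNTRY_SYNONYMS = {
    "u.s.": "US",
    "us": "US",
    "usa": "US",
    "u.s.a.": "US",
    "united states": "US",
}

# Hypothetical knowledge-base lookup from candidate answer to country.
CITY_COUNTRY = {"Chicago": "US", "Toronto": "CA"}


def category_country(category):
    """Return the canonical country token named in a category, or None."""
    text = category.lower()
    # Try longer spellings first so "u.s.a." is checked before "u.s.".
    for spelling in sorted(COUNTRY_SYNONYMS, key=len, reverse=True):
        # Require that the spelling is not embedded in a longer word.
        pattern = r"(?<![a-z])" + re.escape(spelling) + r"(?![a-z])"
        if re.search(pattern, text):
            return COUNTRY_SYNONYMS[spelling]
    return None


def filter_candidates(category, candidates):
    """Drop candidates whose known country contradicts the category."""
    country = category_country(category)
    if country is None:
        # The category names no country, so nothing can be ruled out.
        return list(candidates)
    return [c for c in candidates if CITY_COUNTRY.get(c) == country]


print(filter_candidates("U.S. Cities", ["Toronto", "Chicago"]))
```

With such a table, “U.S. Cities”, “US Cities”, and “USA Cities” would all rule out Toronto. A real system would need something far more careful, of course: this naive version would also treat the pronoun “us” in a category title as a country name.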

It remains to be seen how Watson fares tomorrow in the final round of Jeopardy. I will definitely be watching…

In any case, now it is just a simple matter of time until IBM shifts its company name one letter to the left and comes out with the next release of Watson, which will probably be called version 9000:

HAL 9000 responding to Dave Bowman in “2001: A Space Odyssey”


More commentary on Watson can be found on Techmeme, and in particular I recommend this article on All Things Digital.

UPDATE: For an explanation of the “Toronto” incident by David Ferrucci, project manager for Watson, please see “The Confusion over an Airport Clue” on the IBM Smarter Planet blog.

LANSA middleware integration builds on MapForce

Here is a cool story about an Altova partner that recently integrated the MapForce mapping and data transformation user interface into its product.

LANSA is a development environment and suite of eBusiness solutions that organizations use to rapidly implement business systems that make effective use of new technologies. From its beginnings as a 4th generation language and repository-based development environment, LANSA has evolved to a family of products and solutions that support IBM iSeries (AS/400), Windows, UNIX and Linux platforms.

LANSA Composer is built on top of LANSA Integrator, the company’s integration toolkit that offers bi-directional data integration through XML, SOAP, and Java services, on IBM System i and other middleware platforms.

At its core, LANSA Composer utilizes the MapForce application as its transformation component:

LANSA Composer showing a MapForce transformation

For more details, see the LANSA Case Study on the Altova website. This integration is also getting great reviews in the press. For an example, read this article in Database Trends and Applications.

Whitepaper on using Altova Tools with IBM DB2

Altova and IBM jointly published a whitepaper that shows how the integration of Altova tools with DB2 allows users to:

The solutions to the business problems presented in the whitepaper show how DBAs and developers working with real-world XML applications can benefit from the integration of Altova tools with IBM DB2.

Click here to download the whitepaper (PDF).

IBM Information-On-Demand XML Highlights

I am writing this quick report from the airport lounge in San Francisco, as I am waiting for my flight to Frankfurt and onward to Vienna for a week of meetings at Altova GmbH's headquarters in Austria. You just have to love Wi-Fi access - how did we ever get anything done before the Internet, laptops, Wi-Fi networks, and blogs?

With a week of conference sessions and trade show work in Las Vegas behind me, here is a quick summary of what I perceived to be some of the highlights of IBM's Information-On-Demand show from an XML Aficionado's perspective:

IBM announced DB2 version 9.5 (scheduled to ship October 31st), which contains several enhancements to the pureXML functionality, including inlining of XML, compression, several performance improvements for transactional XML, and the first implementation of XQuery Update in a major database. The last bit is probably the most interesting, because the original XQuery specification lacked updating capabilities and was focused exclusively on queries (analogous to the SELECT statement in SQL). XQuery Update provides the ability to insert, delete, or update any node, i.e. potentially just a single element or attribute in the database. That should be a huge improvement over current implementations, where the entire XML document typically needs to be written back to a column in the database.
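To illustrate what node-level granularity means, here is a small sketch in Python using the standard library's ElementTree. This is an in-memory analogy only: it is neither XQuery Update syntax nor a DB2 API, and the order document and its element names are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, standing in for one value of an XML column.
ORDER_XML = """<order id="1001">
  <customer>Acme Corp</customer>
  <status>pending</status>
  <item sku="A-17" qty="2"/>
</order>"""


def apply_node_updates(xml_text):
    """Change individual nodes and leave the rest of the document alone."""
    root = ET.fromstring(xml_text)

    # Update the text of a single element.
    root.find("status").text = "shipped"

    # Update a single attribute.
    root.find("item").set("qty", "3")

    # Insert a new element as the last child of the root.
    note = ET.SubElement(root, "note")
    note.text = "rush delivery"

    # Delete one element.
    root.remove(root.find("customer"))

    return ET.tostring(root, encoding="unicode")


print(apply_node_updates(ORDER_XML))
```

Each of the four operations names only the node it touches. The whole-document alternative would have the application build the complete new order document itself and write all of it back, even if only the status had changed.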

Altova's Nick Nagel spoke on XML-Driven Data Management in a well-attended Developer Den session on Wednesday. Nick's presentation addressed a "top-down" approach to data modeling using XSD (XML Schema Definition language) as a data modeling language, with implications for data storage and retrieval as pureXML in DB2. Nick spoke on how XSD in turn can drive process implementation by serving as a formal design document, and how XML facilitates process development by enabling automated data binding, data mapping, as well as storage and retrieval with XPath 2.0 and XQuery. He also showed several screenshots of how Altova's tools can make working with XML in DB2 easier for developers.

IBM's Berni Schiefer conducted a Birds of a Feather session on Performance Tips, Conundrums and Experiences with DB2, where he answered customer questions and spoke in-depth about performance tuning of DB2, including tuning for pureXML. He also gave a few other interesting talks, but unfortunately I did not have time to attend those.

There were also plenty of interesting customer talks about how they are using XML, as well as more in-depth sessions on various aspects of pureXML in DB2, but my flight is boarding in a few minutes, so I don't have time left to report on those.

XMLSpy awarded Certificate of Excellence for IBM CTO Innovation award

XMLSpy 2008 was selected today as one of only three finalists for the IBM Information-On-Demand CTO Innovation award.

It was a very diverse list of nominations for the CTO Innovation award and a very competitive judging process - with Altova being the only company in the XML and developer tools space among the semi-finalists and finalists.

At the award breakfast today the actual "winner" didn't even show up, but we received the above Certificate of Excellence. The judging was done by a panel of CTOs, experts, and IBM executives, and they were positively impressed by the new DB2-specific functionality of our XML editor.

What are you doing with XML?

We are exhibiting at the IBM Information-On-Demand conference and trade show in Las Vegas this week, where we are demoing Altova’s deep integration with IBM DB2 pureXML, and I just noticed a profound change in the answers I get to my standard question that I’ve asked every booth visitor for the last 6-7 years.

For the past several years, when I asked my standard “So, what are you doing with XML?” question, I always received very diverse answers, ranging from “Oh, I’m just getting started” to more elaborate descriptions of the key role XML plays as part of an entire information management infrastructure. And – depending on how general the show audience was (i.e. whether it was a pure XML-specific event or a more general developer conference or industry event) – there was always a fair share of “XML? What is that?” responses.

What struck me only today – at the end of the second day of this show – is that I haven’t gotten a single “What is XML?” answer. Every single person I’ve talked with is using XML for something! And this is not an XML-specific event, but rather a very broad audience that goes way beyond just developers. I think that is a significant change – and a very positive one.

So let me use this observation as grounds to proclaim what I’ve predicted all along: XML is now ubiquitous. It is all-pervasive, all-encompassing. It is the lingua franca of how systems talk to one another, how data is transported, how content is stored, reused, and manipulated. And it only took a little over 8 years for XML to conquer the world.