XSLT

The Power of RaptorXML - now available in XMLSpy 2014

The RaptorXML Story

As you probably know by now, RaptorXML, our new 3rd generation high-performance XML, XPath, XSLT, XQuery, and XBRL processing engine, was a little over 2 years in the making. When we embarked on this mission in 2011, we set out to create a new processor that was highly optimized for multi-core CPUs, high throughput, and a reduced memory footprint. As part of that redesign we incorporated all our experience with the evolution of XML over the last decade and focused on adding support for all the latest standards, including XML 1.1, XML Schema 1.1, XPath 3.0, XSLT 3.0, XQuery 3.0, XBRL 2.1, XBRL Dimensions, XBRL Formula, and many others.

Launching the new RaptorXML engine as a stand-alone server product in June, rather than waiting for our annual product release in the fall, was certainly not an easy decision to make. We knew that all our existing XMLSpy and MissionKit customers were eagerly awaiting support for XML Schema 1.1 as well as the 3.0 versions of XPath, XSLT, and XQuery.

At the same time we knew that the engine was ready for large-scale production use, while the refactoring of our existing tools and integration of the new engine would still take another 3-4 months, so we decided to introduce RaptorXML as a stand-alone server product first. The initial RaptorXML announcement happened at the XBRL International conference in Dublin, Ireland, in May this year and commercial availability of the server followed in June, when RaptorXML joined the growing family of Altova Server products.

And it turns out we made the right decision. At this time RaptorXML+XBRL Server is already being used by over 50 customers, including a major banking regulator in Asia, to validate large amounts of XBRL data on high-end servers using XBRL 2.1 and XBRL Formula validation.


Altova MissionKit 2014 Launch

Now the long-awaited day has finally come and we are very excited to introduce our new Altova MissionKit 2014 product line that incorporates the RaptorXML engine in XMLSpy and many other MissionKit tools. This means you get immediate access to XML Schema 1.1, XPath 3.0, XSLT 3.0, XQuery 3.0, and XBRL Formula validation - in addition to all the previous standards - right from within the new XMLSpy 2014 XML Editor and you get a huge performance boost for all your projects due to the faster engine.

And we have, of course, extended our graphical XML Schema editor to support XML Schema 1.1 and added the powerful SmartFix validation to it, so XMLSpy will now make intelligent suggestions on how to fix common XML Schema errors directly in the graphical schema editor.

Due to the new capabilities of RaptorXML you also get a cool new feature that has been requested by many users: the ability to display multiple validation errors at once!

And for all XBRL users we have included better XBRL Formula support as well as XBRL Concept Types in the XBRL Taxonomy editor. This makes XMLSpy Enterprise Edition the single most powerful XBRL development tool with taxonomy editing and powerful RaptorXML-based XBRL instance validation all available in one easily affordable tool.

Bottom line: having RaptorXML inside of XMLSpy is just awesome and you will love the speed as well as the improved standards conformance! And if you want to include RaptorXML-based validation in your own projects, you can now deploy RaptorXML Server on Windows, Linux, and Mac OS X. Check out the RaptorXML Server datasheet for more information.

But the good news doesn't stop with XMLSpy. Version 2014 of the Altova MissionKit includes several new features in MapForce as well, such as support for XML wildcards <xs:any> and <xs:anyAttribute> as well as the ability to generate comments and processing instructions in any XML output files.

Additional new features that are available across all major Altova MissionKit 2014 tools include integration with Eclipse 4.3 as well as updated support for new database versions: SQL Server 2012; MySQL 5.5.28; PostgreSQL 9.0.10, 9.1.6, and 9.2.1; Sybase ASE 15.7; IBM DB2 9.5, 9.7, and 10.1; Informix 11.70; and Access 2013.

Last, but not least, to help you take advantage of the powerful new XML Schema 1.1 capabilities, such as assertions, conditional type alternatives, default attributes, and open content, we are also launching a comprehensive new FREE XML Schema 1.1 online training course today.

For more information, please check out today's announcement on the Altova blog as well as the What's New page on the Altova website.

Altova StyleVision In-Depth Review

Dave Gash published an in-depth review of Altova StyleVision 2010 on the WritersUA website this past week and says in his introduction:

Altova calls StyleVision a "stylesheet designer," but that technically accurate designation doesn't really do the software justice. They could have called it a "schema-based WYSIWYG drag-and-drop XML / XBRL / database visual page editor and XSLT / XSL-FO / HTML / RTF / PDF / Word / e-forms generator," but I'm guessing that wouldn't have made it past the suits in Marketing.

I like that new product description. It’s a bit of a mouthful, but it certainly gets to the point. Really, we couldn’t have said it any better…

Dave follows this introduction with a detailed review of the design method, user interface, formatting, and output options, and covers all the exciting new capabilities of version 2010, such as the new blueprint capability.

And after going over all the relevant features Dave comes to the following conclusion:

StyleVision is one of the most interesting software applications I've seen in years. Without question, it offers a new and unique approach to XSLT transform authoring, a skill formerly reserved for beanie-wearing, pocket-protector using, syntax-obsessing code jockeys such as your humble reviewer. It allows more of the tech pubs workforce than ever to transform raw data into aesthetic, useful pages.

While some coders might lament the loss of a previously proprietary skill set to non-programmers, the fact is that spreading knowledge around is a good thing. Make no mistake: as more people use a technology, the better that technology becomes, and StyleVision's application of the WYSIWYG concept to XSLT is a shining example.

We are delighted to hear that! Please check out Dave’s review and then download a free 30-day eval version to see for yourself.

JSON and XML

From listening to bloggers over the last couple of years, and from talking to several developers in the industry, it appeared to me that the JSON vs. XML debate is primarily a “religious” conflict, much like “PC vs. Mac” or “Java vs. .NET” are to a large extent. I call it a religious conflict simply because there are fanatics on either side who try to convert the masses to their point of view. But if you look at both XML and JSON with an unbiased mind and take each approach seriously, you’ll quickly find that there are tons of applications where XML makes more sense and tons of other applications where JSON makes more sense. I would argue that XML has the better infrastructure of surrounding standards (like XML Schema, XSLT, XPath, etc.), which makes it a much richer platform and provides more flexibility, but there is something to be said for the elegance, efficiency, and simplicity of JSON, too.

Instead of viewing JSON and XML as competing technologies, we decided in our v2010 release to support them side by side and give our users the choice: XMLSpy 2010 now includes a JSON editor and supports JSON editing in its text and grid views with full syntax coloring, syntax checking, etc. This makes XMLSpy the first and only XML Editor to support JSON.

Just as we find some of our customers using XMLSpy as a plug-in within Eclipse and doing code generation for Java, whereas other customers use XMLSpy embedded within Visual Studio and primarily generate code for C#, we expect that some of our customers will use our tools for JSON work, and others won’t. As a standards-focused developer tools vendor we simply want to give them the choice.

So let’s talk a bit more about the JSON vs. XML debate. For some background reading, I’d recommend doing a Google search for “JSON vs XML”, where you will find a variety of opinions.

Here is my take on the matter: XML by itself appears to be similar to JSON only when you ignore all the surrounding XML-related standards. At their core, both JSON and XML are used to capture and describe structured and unstructured data. JSON mainly focuses on storing or transmitting that data efficiently, i.e. with very little overhead, whereas XML focuses on a rich environment that includes entities and a mechanism to support metadata and extensibility. The extensibility is really the huge difference: XML includes concepts for how existing XML data can be augmented, extended, and enriched by additional data and metadata from other domains using XML Namespaces. Furthermore, XML data can be processed with XSLT, queried with XQuery, and addressed and extracted via XPath; none of those supporting technologies is really available for JSON in such rich diversity. However, when it comes to just capturing simple structured data and expressing it either in files or in a transmission between client and server, JSON shines with its simplicity and, some argue, better human readability.

So once we made the decision to support JSON in XMLSpy in addition to XML, the next question was obviously how to get data from one format into the other, and so we added a JSON <=> XML conversion option to our Convert menu.

For example, take the following bit of JSON data:

And you can now easily convert that into an equivalent piece of XML data:
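To make the idea of such a conversion concrete, here is a minimal Python sketch. It is not XMLSpy's conversion and makes no claim about the mapping XMLSpy uses. The mapping is an assumption chosen for illustration: objects become child elements named after their keys, arrays become repeated `item` elements, and scalars become text. The sample data, the `root` element name, and the `json_to_xml` helper are all hypothetical, and the sketch assumes every key is a valid XML element name.

```python
import json
import xml.etree.ElementTree as ET


def json_to_xml(value, tag):
    """Convert a parsed JSON value into an Element named `tag`.

    Assumed mapping (not XMLSpy's): objects become child elements named
    after their keys, arrays become repeated <item> children, and scalars
    become element text.
    """
    element = ET.Element(tag)
    if isinstance(value, dict):
        for key, child in value.items():
            element.append(json_to_xml(child, key))
    elif isinstance(value, list):
        for child in value:
            element.append(json_to_xml(child, "item"))
    elif value is None:
        pass  # JSON null becomes an empty element
    elif isinstance(value, bool):
        # Checked before the generic branch so True/False use JSON spelling.
        element.text = "true" if value else "false"
    else:
        element.text = str(value)
    return element


# Hypothetical sample data, not the example from the original post.
sample = '{"name": "XMLSpy", "version": 2010, "formats": ["XML", "JSON"]}'
root = json_to_xml(json.loads(sample), "root")
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Even this small sketch shows why the conversion involves design choices: JSON has no attributes or namespaces, and XML has no native arrays, so any converter has to pick a convention for each.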

Our main reason for including easy one-click conversion was that we figured people would sometimes want to experiment to see what approach works best for their application or their data. They can simply take existing XML data and convert it to JSON to test with their new app. Conversely, if they run into an issue with a JSON-based application and need to convert existing data files to XML because they need the extensibility, metadata, attributes, or the processing capabilities of XSLT and XQuery, they can easily do that with XMLSpy now.

But make no mistake: we are not saying that JSON is better than XML. On the contrary, I continue to be a huge fan of XML, which is why this blog is called the XML Aficionado, not the JSON Aficionado!

So the fact that XMLSpy now has a JSON <=> XML conversion function doesn’t mean that people should necessarily convert from one format to the other. Rather, I see people who use either or both formats occasionally wanting to move data from one world into the other, or to experiment with the other format, and we’ve added the conversion to make that process convenient whenever you need it. But mainly we expect people will work either with JSON files or with XML files, depending on what is most suitable for their particular application or use case.

So go ahead, use the comment section and let me know how you feel about the JSON vs. XML debate and what you think of our approach to support both…

P.S. My thanks go to Peter Zschunke, because a lot of material for this blog post came from a discussion and e-mail interview we recently conducted for his German blog XML-Ecke.

Creating Open XML documents from XML and database data

The latest release of StyleVision, 2008r2, gives users important new functionality for creating advanced stylesheets to publish XML and database data in Word 2007, which uses the new Open XML (OOXML) data format, as well as simpler processes for publishing the same source content in other formats. And, to further ease the transition for developers and designers working with OOXML, we have just reduced the price of StyleVision considerably. As adoption of Open XML increases, StyleVision developers will be ready with a powerful tool for publishing XML and database data in what is sure to be the predominant end-user document format, now that Open XML has been approved as an ISO standard.

Here is how the process works:

  1. Open your existing XML document or connect to an existing relational database to populate the source pane in StyleVision:
    [Image: Sources]
  2. Drag & drop elements from the source pane into the design pane and apply styles to them, thereby creating a meta stylesheet for producing the desired output formatting:
    [Image: DragDrop]
  3. Click on one of the preview tabs underneath the design pane to preview the output in any of the supported output formats (Open XML for Word 2007, HTML, PDF, and RTF) - all outputs are automatically created from one and the same visual design:
    [Image: OpenXMLpreview]
  4. Save the generated output file(s) as well as the specific stylesheets that have been auto-generated to render your data in the desired output formats again and again...

StyleVision can access data from database tables or views, or you can directly enter a SQL SELECT statement to query only particular data from a database. This makes StyleVision ideal for flexible database reporting, too.

If you are interested in further details, you can read more about the new features of StyleVision 2008r2 here.

Content reuse with Open XML and XSLT

While Open XML may not yet be an ISO standard, it is already standardized by ECMA and - even more important - all documents created by Office 2007 are already stored in Open XML by default, so there is an abundance of documents whose content you can now reuse much more easily and productively than ever before. So instead of waiting for the ISO vote or paying too much attention to all the political battles being fought around it, I want to show you how you can already take advantage of Open XML (sometimes also called OOXML or Office Open XML) today.

This is the first article in a series of blog postings that I plan to write about practical Open XML tips & tricks, so I encourage you to subscribe to my XML Aficionado blog (via RSS or via e-mail), if you haven't already done so. This will ensure that you get future articles from this series automatically as soon as I post them.

So let's look at an Open XML document in our favorite XML Editor. For this example I am going to use a WordprocessingML document (.docx) that I have created with Microsoft Office Word 2007. When I open the .docx file in XMLSpy, I immediately get to see the contents of the package file, which is structured according to the Open Packaging Convention.

That's a fancy way of saying that it is a ZIP file that contains specific files and directories that make up the content, structure, styles, relationships, and other parts of the document. Using XMLSpy's built-in capability to open any ZIP-formatted archive, I can directly browse any directory structures inside the ZIP package, add new files to the package, or open any existing XML file contained in the package:

[Image: OOXML1]
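Because the package is an ordinary ZIP archive, you can also inspect it outside of XMLSpy. The following Python sketch illustrates that using only the standard library. To stay runnable anywhere it builds a tiny stand-in package in memory. The part names follow the usual WordprocessingML layout, but the contents are placeholders and not a valid Word document.

```python
import io
import zipfile

# Build a tiny stand-in for a .docx package in memory so the example runs
# anywhere. The part names follow the usual WordprocessingML layout, but
# the contents are placeholders, not a valid Word document.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as package:
    package.writestr("[Content_Types].xml", "<Types/>")
    package.writestr("_rels/.rels", "<Relationships/>")
    package.writestr("word/document.xml", "<document/>")
buffer.seek(0)

# Open the package like any other ZIP archive and list the parts inside.
with zipfile.ZipFile(buffer) as package:
    part_names = package.namelist()
    document_xml = package.read("word/document.xml").decode("utf-8")

for name in part_names:
    print(name)
```

With a real document you would pass the path of the .docx file to `zipfile.ZipFile` instead of the in-memory buffer.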

For the purpose of reusing the content from this WordprocessingML example file, I am going to open the 'document.xml' file, which contains the content of the document.

As soon as I double-click the file in the ZIP archive, the XML is displayed in a separate window just like any other XML document and I can use the powerful grid view or text view features of XMLSpy to view or edit the XML data (sometimes it may be useful to invoke the pretty-print function in text view to make the file more easily readable):

[Image: OOXML2]

This is, of course, a live editing view, so you can not only view the Open XML data, but make any changes to the XML and save it back into the package file.

But now let's look at how we can easily reuse content from this Open XML document using XSLT. XMLSpy ships with a few Open XML example documents as well as example XSLT stylesheets for just that purpose. Let's look at the 'docx2html.xslt' stylesheet, which takes a WordprocessingML document and extracts all paragraphs to turn them into HTML. This example stylesheet is by no means intended to be a full-featured conversion tool from .docx to HTML. Instead it serves as a blueprint of how to reuse content from a .docx file and hopefully will serve as a starting point for your stylesheet development efforts.

At the core of that XSLT stylesheet we need an <xsl:for-each> loop to iterate over all the <docx:p> elements and turn them into simple HTML <p> paragraphs. The text inside the paragraphs is grouped into runs of characters that share common attributes, so we need an inner <xsl:for-each> loop to iterate over those <docx:r> elements and extract the text from their <docx:t> child elements. Thus the most primitive content reuse that only extracts the text of all paragraphs looks like this:

[Image: XSLT1]
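For readers without an XSLT processor at hand, the same nested iteration can be sketched in Python. This is not the 'docx2html.xslt' stylesheet, only an illustration of the loop structure described above: an outer loop over paragraphs, an inner loop over runs, and text taken from the text elements. The sample XML is a hand-written fragment standing in for 'document.xml', and the `paragraphs_to_html` helper is hypothetical.

```python
import html
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, bound here to the "docx" prefix.
NS = {"docx": "http://schemas.openxmlformats.org/wordprocessingml/2006/main"}

# A hand-written fragment standing in for word/document.xml.
DOCUMENT_XML = """
<docx:document xmlns:docx="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <docx:body>
    <docx:p>
      <docx:r><docx:t>Hello </docx:t></docx:r>
      <docx:r><docx:t>world</docx:t></docx:r>
    </docx:p>
    <docx:p>
      <docx:r><docx:t>Second paragraph</docx:t></docx:r>
    </docx:p>
  </docx:body>
</docx:document>
"""


def paragraphs_to_html(document_xml):
    """Return one HTML <p> per docx:p, containing only the run text."""
    root = ET.fromstring(document_xml.strip())
    html_parts = []
    # Outer loop: every paragraph in the document.
    for paragraph in root.iterfind(".//docx:p", NS):
        text = ""
        # Inner loop: every run of characters inside the paragraph.
        for run in paragraph.iterfind("docx:r", NS):
            for text_node in run.iterfind("docx:t", NS):
                text += text_node.text or ""
        html_parts.append("<p>" + html.escape(text) + "</p>")
    return "\n".join(html_parts)


html_output = paragraphs_to_html(DOCUMENT_XML)
print(html_output)
```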

Once we have constructed those loops, we can start to think about perhaps extracting and reusing some style information. To do that, we now emit a <span> HTML element for every <docx:r> run of characters and give it a style attribute, whose value will depend on the <docx:rPr> element, so we use <xsl:apply-templates> to decide what HTML style we want to apply to the <span> elements:

[Image: XSLT2]

The corresponding templates for the three most common styles (bold, italic, underline) are trivially easy to construct and look like this:

[Image: XSLT3]
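The style mapping can be sketched in Python in the same spirit. Again, this is an illustration under stated assumptions rather than the shipped stylesheet: it looks for bold, italic, and underline markers inside the run properties and emits the corresponding CSS. The `run_to_span` helper and the `STYLE_RULES` table are hypothetical names. As a simplification, the sketch treats the mere presence of a marker as "on" and ignores any attribute that might switch it off.

```python
import html
import xml.etree.ElementTree as ET

NS = {"docx": "http://schemas.openxmlformats.org/wordprocessingml/2006/main"}

# Run property element -> CSS declaration (plays the role of the templates).
STYLE_RULES = [
    ("docx:b", "font-weight:bold"),
    ("docx:i", "font-style:italic"),
    ("docx:u", "text-decoration:underline"),
]


def run_to_span(run):
    """Turn one docx:r element into an HTML <span> with inline styles."""
    text = "".join(t.text or "" for t in run.iterfind("docx:t", NS))
    properties = run.find("docx:rPr", NS)
    styles = []
    if properties is not None:
        for tag, css in STYLE_RULES:
            # Simplification: presence of the marker counts as "on".
            if properties.find(tag, NS) is not None:
                styles.append(css)
    if styles:
        return '<span style="' + ";".join(styles) + '">' + html.escape(text) + "</span>"
    return "<span>" + html.escape(text) + "</span>"


# Hand-written sample run: bold and italic text.
RUN_XML = (
    '<docx:r xmlns:docx="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    "<docx:rPr><docx:b/><docx:i/></docx:rPr>"
    "<docx:t>Important</docx:t>"
    "</docx:r>"
)
span_html = run_to_span(ET.fromstring(RUN_XML))
print(span_html)
```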

With just a few lines of XSLT and a few templates we have already written a stylesheet that extracts the basic paragraphs and most important styles from a WordprocessingML document and turns them into HTML that can be viewed in the browser view - here is the result produced from running the above XSLT stylesheet on the example WordprocessingML document that you can find in the XMLSpy examples directory:

[Image: OOXML4]

Similarly, it is quite easy to extend the stylesheet to extract meta information, other styles, or image information from the WordprocessingML document and reuse the content for any modern application scenario, from web publishing via HTML, RSS, or social media formats to mobile web applications and beyond.

"But wait! How can I apply an XSLT stylesheet to an XML document that is stored within a ZIP file?", you might ask.

You can, of course, extract all the XML files using a regular ZIP expander, but there is a much better solution: when you use the document() function in XSLT 2.0 within XMLSpy or with our royalty-free XSLT engine AltovaXML, you can directly access files contained in a ZIP archive by using the "|zip" pipe operator within the filename, e.g. "MyDocument.docx|zip\_rels\.rels" addresses the Relationship file ".rels" in the archive directory "\_rels" inside the ZIP package named "MyDocument.docx".
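To illustrate how such an address breaks down into an archive name and a path inside the archive, here is a small Python sketch. It only mimics the addressing idea outside of XMLSpy and is not how the Altova engine is implemented. The `split_zip_address` helper is hypothetical, and the archive is a stand-in built in memory so that the example is self-contained.

```python
import io
import zipfile


def split_zip_address(address):
    """Split 'archive|zip\\member' into (archive path, ZIP member name)."""
    archive_path, separator, member = address.partition("|zip\\")
    if not separator:
        raise ValueError("no |zip\\ part in address: " + address)
    # ZIP member names use forward slashes, so normalize the separators.
    return archive_path, member.replace("\\", "/")


archive_path, member_name = split_zip_address("MyDocument.docx|zip\\_rels\\.rels")

# Stand-in package built in memory; a real script would open archive_path.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as package:
    package.writestr("_rels/.rels", "<Relationships/>")
buffer.seek(0)

# Read the addressed part straight out of the archive, without unpacking it.
with zipfile.ZipFile(buffer) as package:
    rels_xml = package.read(member_name).decode("utf-8")

print(archive_path, member_name, rels_xml)
```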

The benefits of using XSLT to reuse content from Open XML documents are obvious: because XSLT is a cornerstone of the core set of XML standards from the W3C, you can apply all your existing XML, XPath, and XSLT know-how and you can use the excellent tools support that is available for these standards. For example, you can easily develop and debug your XSLT stylesheet using the powerful XSLT debugger in XMLSpy, which allows you to single-step through the transformation, set breakpoints on XSLT instructions or even on data nodes in your Open XML document, view the partially generated output, and inspect the state of the XSLT processor in detail as the output document is constructed:

[Image: OOXML3]

Using the XSLT Debugger eliminates a lot of the pain that is normally associated with XSLT stylesheet development and allows for a very iterative approach to creating and improving stylesheets that facilitate content reuse and repurposing.

To sum it up, reusing content from Open XML documents for a variety of web applications, mobile scenarios, or social media and Web 2.0 contexts is very easy and can be achieved with standard XML-related technologies, such as XSLT.

For additional information on Open XML and how to take advantage of all the content that is now already available in that format, please refer to the following sites: