Thursday, December 10, 2015

XML Transformation: XSLT vs Groovy

In this post I show how Groovy can be used for XML transformation instead of XSLT. The idea is to combine XmlSlurper with MarkupBuilder to get a powerful and concise syntax for XML transformation.
Everything presented in this post is available on GitHub.

XML Transformation Example

As a demo XML input, I use the Sky Sports Football News Feed. This contains a list of news items, as shown below.

Let's say we would like to transform this XML into an HTML file containing a two-column table: the first column holds the timestamp (news:publication_date element) and the second holds the link (loc element) accompanied by the title (news:title element), like the following HTML snippet.

The XSLT Case

You could do this transformation with the following XSL file.

So that you can check the effect of this XSL stylesheet yourself, I have downloaded the XML input into sitemap_news_football.xml and added the following line
<?xml-stylesheet type="text/xsl" href="skyFeed.xsl" ?>
in order to point to the skyFeed.xsl stylesheet. Clone the GitHub repository and open sitemap_news_football.xml in the browser of your choice.

The Groovy Case

You can do the same transformation with the following Groovy script.

To run this script you need to have Groovy installed and the groovy binary in your path. Run:
> cd xslt-vs-groovy 
> skyfeed sitemap_news_football.xml

This will create the output file out.html in the same directory. Open it in your browser to verify that the result is the same as when you open sitemap_news_football.xml in the browser. You can also run the script with no input file argument:
> skyfeed

In that case the script takes the current content of the live XML feed as its input.

The magic happens by combining XmlSlurper with MarkupBuilder.

XmlSlurper's parse method returns a GPathResult object (e.g. the object referenced by root in this example), which lets you navigate over the XML elements inside your Groovy code with GPath. GPath is a path expression language integrated into Groovy. Its syntax is very similar to XPath but has an object-oriented flavor, e.g. it uses '.' instead of '/'. For instance, root matches the XML root element, which is the urlset element. Similarly, root.url matches all the url elements that are children of urlset. Groovy evaluates root.url into a NodeChildren object, which can then be iterated over with a Groovy closure, as shown in the script.
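To make the GPath navigation above concrete, here is a minimal sketch. The two-item sample document and its example.com links are made up for illustration and only mimic the shape of a news sitemap; they are not the real feed content. The import line assumes Groovy 3 or later; on Groovy 2.x XmlSlurper lives in groovy.util and needs no import.

```groovy
import groovy.xml.XmlSlurper  // Groovy 3+; on Groovy 2.x drop this line (groovy.util.XmlSlurper is imported by default)

// Hypothetical sample shaped like a news sitemap (not the real feed content).
def sample = '''<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:news="http://www.google.com/schemas/sitemap-news/0.9">
  <url>
    <loc>http://example.com/story-1</loc>
    <news:news>
      <news:publication_date>2015-12-10T09:00:00Z</news:publication_date>
      <news:title>First story</news:title>
    </news:news>
  </url>
  <url>
    <loc>http://example.com/story-2</loc>
    <news:news>
      <news:publication_date>2015-12-10T10:30:00Z</news:publication_date>
      <news:title>Second story</news:title>
    </news:news>
  </url>
</urlset>'''

// parseText returns a GPathResult positioned at the root element.
def root = new XmlSlurper().parseText(sample)

println root.name()      // urlset
println root.url.size()  // 2

// root.url matches every url child; prefixed elements such as news:title
// are addressed here by their local name.
root.url.each { item ->
    println "${item.news.publication_date.text()}  ${item.loc.text()}  ${item.news.title.text()}"
}

// GPath results are iterable, so the usual collection methods apply.
def titles = root.url.collect { it.news.title.text() }
println titles
```

Note that no namespace declaration is needed on the Groovy side in this sketch: the slurper matches elements by local name unless you explicitly declare namespaces.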

MarkupBuilder allows you to produce HTML (or XML) by using normal Groovy constructs, like closures and method calls. More precisely, these are calls to methods that the builder does not actually implement; they are caught through Groovy's dynamic method interception (the methodMissing and invokeMethod hooks) and turned into elements of the same name. For more details on builders, you can read "Chapter 7: Building a Builder" of "Groovy for Domain-Specific Languages".
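As a rough illustration of how the two pieces fit together, the sketch below parses a tiny sample with XmlSlurper and emits the two-column table with MarkupBuilder. It is not the script from the repository: the one-item sample, the example.com link, the column headings and the variable names are all assumptions made for this example. The XmlSlurper import again assumes Groovy 3 or later.

```groovy
import groovy.xml.MarkupBuilder
import groovy.xml.XmlSlurper  // Groovy 3+; on Groovy 2.x drop this line

// Hypothetical one-item sample shaped like a news sitemap (not the real feed).
def sample = '''<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:news="http://www.google.com/schemas/sitemap-news/0.9">
  <url>
    <loc>http://example.com/story-1</loc>
    <news:news>
      <news:publication_date>2015-12-10T09:00:00Z</news:publication_date>
      <news:title>First story</news:title>
    </news:news>
  </url>
</urlset>'''

def root = new XmlSlurper().parseText(sample)

// The builder writes its markup into this writer.
def writer = new StringWriter()
def builder = new MarkupBuilder(writer)

// None of html, body, table, tr, th, td or a is a real method:
// each call is intercepted and emitted as an element of that name.
builder.html {
    body {
        table {
            tr {
                th('Published')  // assumed column headings
                th('Story')
            }
            // Plain Groovy iteration mixed into the markup.
            root.url.each { item ->
                tr {
                    td(item.news.publication_date.text())
                    td {
                        // A map argument becomes attributes, a string becomes the text content.
                        a(href: item.loc.text(), item.news.title.text())
                    }
                }
            }
        }
    }
}

def output = writer.toString()
println output
```

To write a file instead of a string, you could hand the builder a file-backed writer in place of the StringWriter.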

One should keep in mind, though, that browsers do not run Groovy, while they do ship with XSLT processors. This means that XML transformation with Groovy can only happen on the server side, and you then need to send the output over HTTP to the client. If you need to do the XML transformation on the client side, you have to stick with XSLT.