Introduction to XSL
XSL gives your XML some style
By Michael Ball
Use XSL and servlets to style your XML data
Wow, you've come so far. A year ago you didn't know what XML was, but now it's everywhere. You're storing XML in databases and using it in middleware, and now you want to present that data to a browser. That's where XSL can help: it can turn your XML into HTML. Moreover, servlets provide a powerful medium for performing those translations, because they reside on the server and have access to all the features of server-side Java.
In this article, we'll cover the basics of XSL and XSL processors. If you don't know much about XML, you may want to first read Mark Johnson's excellent XML article, "Programming XML in Java, Part 1." Our example will use a servlet to turn well-formed XML into HTML. If you need to learn more about servlets, please refer to Sun's servlets tutorial (see Resources).
XML and XSL
The process of transforming and formatting information into a rendered result is called styling. Two recommendations from the W3C come together to make styling possible: XSL Transformations (XSLT), which allows for a reorganization of information, and XSL, which specifies the formatting of the information for rendering. With those two technologies, when you put your XML and XSL stylesheet into an XSL processor, you don't just get a prettied-up version of your XML. You get a result tree that can be expanded, modified, and rearranged.
An XSL processor takes a stylesheet consisting of a set of XSL commands and uses it to transform an input XML document. Let's take a look at a simple example. Below we see a small piece of XML describing an employee. It includes his name and title. Let's assume that we would like to present that in HTML.
Joe Shmo
If we wanted our HTML to look like this:
Joe Shmo: Manager
Then we could use a stylesheet, such as the one below, to generate the HTML above. The stylesheet could reside in a file or database entry:
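As a minimal sketch of such a stylesheet in action, the Java class below runs a small stylesheet against an employee document. The employee markup (a name element, a title element, and an id attribute) is an assumption inferred from the paths used later in this article, and the sketch uses the standard javax.xml.transform API rather than the Lotus processor discussed further on. The class name EmployeeStyleSketch is hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class EmployeeStyleSketch {
    // Assumed input document: element and attribute names are illustrative.
    public static final String XML =
        "<employee id=\"03432\"><name>Joe Shmo</name><title>Manager</title></employee>";

    // Assumed stylesheet: matches the root node and writes "name: title" as HTML.
    public static final String XSL =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"html\"/>"
      + "<xsl:template match=\"/\">"
      + "<html><body><p>"
      + "<xsl:value-of select=\"employee/name\"/>: <xsl:value-of select=\"employee/title\"/>"
      + "</p></body></html>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet text to the XML text and returns the serialized result.
    public static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // The HTML written here contains the text "Joe Shmo: Manager".
        System.out.println(transform(XML, XSL));
    }
}
```

Keeping the stylesheet as a string is only a convenience for the sketch; a StreamSource can just as well wrap a file or a stream read from a database entry.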
Common XSL stylesheet commands
Stylesheets are defined by a set of XSL commands and are themselves valid XML documents. Stylesheets use pattern matching to locate elements and attributes. There are also expressions that can be used to call extensions -- either Java objects or JavaScript. Let's look at some XSL commands.
Stylesheet declaration
The stylesheet declaration consists of a version and a namespace. The namespace declares the prefix for the tags that will be used in the stylesheet and where the definition of those tags is located:
version="1.0"> . . .
If any extensions are referenced, their namespace must be specified as well. For example, if you were going to use Java, you would specify this namespace:
xmlns:java="http://xml.apache.org/xslt/java"
Pattern matching
When selecting in a stylesheet, a pattern is used to denote which element or attribute we want to access. The syntax is simple: specify the node you want, using / to separate the elements. Notice that in the sample XML code above we matched our template on /, which is the root node. We could have, however, matched on the employee node. Then a select statement could just refer to the name node instead of employee/name. For example, if we had the following XML:
Joe Shmo
Attributes can also be selected. The employee id could be accessed with employee/@id. Groups of nodes can be accessed with employee/*. A specific employee could be located with employee[@id='03432']. Pattern matching allows us to select specific values out of our XML document.
Accessing elements versus attributes
Command Result
Joe Shmo
03432
Templates
Templates provide a way to match nodes in an XML document and perform operations on them. The syntax for a template is:
. . .
The template is matched on a node name, then all the stylesheet commands in that template are applied. We can call templates in our stylesheet by using the apply-templates command:
An example using our employee XML above would be:
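One way to picture a template being matched and applied is the hedged Java sketch below. It assumes the same illustrative employee markup (name and title elements), defines a template that matches the name node, and triggers it from the root template with apply-templates. The class name TemplateSketch and the "Name: " label are hypothetical; the API used is the standard javax.xml.transform package.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class TemplateSketch {
    // Assumed input document; the element names are illustrative.
    public static final String XML =
        "<employee id=\"03432\"><name>Joe Shmo</name><title>Manager</title></employee>";

    // The root template hands the name node to whichever template matches it.
    public static final String XSL =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:template match=\"/\">"
      + "<xsl:apply-templates select=\"employee/name\"/>"
      + "</xsl:template>"
      + "<xsl:template match=\"name\">Name: <xsl:value-of select=\".\"/></xsl:template>"
      + "</xsl:stylesheet>";

    // Runs the stylesheet over the document and returns the text output.
    public static String render() {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(XML)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render()); // prints "Name: Joe Shmo"
    }
}
```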
We can call this template anywhere there is a name node to be referenced, using this:
Logical commands
There are a few structures available for doing ifs and loops. Let's take a look at the syntax.
Choose command
The choose command provides a structure to test different situations.
stylesheet commands stylesheet commands
The first successful test will result in that block's stylesheet commands executing. If all the tests fail, the otherwise block is executed. You may have as many when blocks as you want. The XSLT 1.0 Recommendation makes the otherwise block optional; if you include one but don't want it to do anything, just put:
If command
The if command provides only a single test and doesn't have any kind of else structure available. If you need to have an else, use the choose command.
...
Loops (for-each command)
Unlike most languages with for and while structures, XSL offers only the for-each command. As
such, you can loop on a set of nodes or you can select the nodes you want to loop on, using a pattern match:
...
For example, if you had more than one employee in your XML document, and you wanted to loop through all the managers, you could use a statement such as this:
...
Variables
The variable command provides a way to set a variable and access it later. The extension mechanism uses variables to store the values retrieved from extensions:
assign value to count here
The variable count can be accessed by using $count later in the stylesheet:
Parameters
You can pass parameters to your stylesheet, using the param tag. You can also specify a default value in a select statement. The default is a string, so it must be in single quotes:
You can set the parameters you are passing to your stylesheet on your XSLProcessor object:
processor.setStylesheetParam("param1", processor.createXString("value"));
Extensions
Extensions add functionality to a stylesheet. XSL comes with some basic functions:
sum() -- Sums the values in the designated nodes
count() -- Counts the nodes
position() -- Returns the position of the current node in a loop
last() -- Returns the position of the last node in a loop; compare it with position() to test whether the current node is the last one
If you want additional functionality, you need to use extensions. Extensions can be called anywhere a value can be selected. Extensions to a stylesheet can be written in Java or JavaScript, among other languages. We'll concentrate on Java extensions in this article. In order to call extensions in Java, the java namespace must be specified in your stylesheet declaration:
xmlns:java="http://xml.apache.org/xslt/java"
Any calls to Java extensions would be prefaced with java:. (Note: You don't have to call your namespace java; you can call it whatever you want.)
You can do three things with Java extensions: create instances of classes, call methods on those classes, or just call static methods on classes. Table 2 shows syntax that can be used to reference Java objects.
Instantiate a class: prefix:class.new (args)
Example: variable myVector select="java:java.util.Vector.new()"
Call a static method: prefix:class.methodName (args)
Example: variable myString select="java:java.lang.String.valueOf(@quantity)"
Call a method on an object: prefix:methodName (object, args)
Example: variable myAdd select="java:addElement($myVector, string(@id))"
Table 2. Three ways to use Java objects
(For more on XSL, see Elliotte Harold's The XML Bible in Resources.)
The XSL Processor API
For our example, we will use Lotus' implementation of Apache's XSL processor Xalan (see Resources). We'll use the following classes in our servlet example:
Class com.lotus.xsl.XSLProcessor
com.lotus.xsl.XSLProcessor is the processor that implements the functionality defined in org.apache.xalan.xslt.XSLTProcessor. The default constructor can be used and processing can take place using the process() method, as seen below:
void process(XSLTInputSource inputSource, XSLTInputSource stylesheetSource, XSLTResultTarget outputTarget)
The process() method transforms the source tree to the output in the given result tree target. The void reset() method, to be used after process(), resets the processor to its original state.
The process() method is overloaded 18 times. Each signature provides a different way to process your XML and XSL. Some return an org.w3c.dom.Document object. I have found that the above process() method is the handiest; the documentation recommends its use because of the XSLTInputSource (used to read in XML or XSL) and XSLTResultTarget (used to write out the results) classes, which we examine next in turn.
Class com.lotus.xsl.
XSL stands for EXtensible Stylesheet Language. The World Wide Web Consortium (W3C) started to develop XSL because there was a need for an XML-based stylesheet language. Thus it is a language for expressing stylesheets. A stylesheet specifies the presentation of XML information using two basic categories of techniques:
•
An optional transformation of the input document into another structure.
•
A description of how to present the transformed information.
The components of the XSL language
The full XSL language logically consists of three component languages, which are described in three W3C (World Wide Web Consortium) Recommendations:
1. XPath: XML Path Language is an expression language used by XSLT to access or refer to specific parts of an XML document.
2. XSLT: XSL Transformations is a language for describing how to transform one XML document (represented as a tree) into another.
3. XSL-FO: Extensible Stylesheet Language Formatting Objects is a language for formatting XML documents, built on formatting objects and formatting properties.
Understanding XSL Stylesheet Structure
(a) XSLT namespace
The XSL stylesheet starts with the root element <xsl:stylesheet> or <xsl:transform>, which declares the document to be an XSL style sheet. The correct way to declare an XSL style sheet according to the W3C XSLT Recommendation is:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
or:
<xsl:transform version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
Since an XSL style sheet is itself an XML document, it normally begins with an XML declaration. To get access to the XSLT elements, attributes, and features, we must declare the XSLT namespace at the top of the document. The xmlns:xsl="http://www.w3.org/1999/XSL/Transform" declaration points to the official W3C XSLT namespace. If you use this namespace, you must also include the attribute version="1.0". The XSLT specification uses a prefix of xsl: for referring to elements in the XSLT namespace. However, XSLT stylesheets are free to use any prefix.
Now set it up to produce HTML-compatible output:
<xsl:stylesheet ... > <xsl:output method="html"/> ...
(b) Stylesheet Element
The <xsl:template> Element
An XSL style sheet consists of a set of rules that are called templates. Each template "matches" some set of elements in the source tree and then describes the contribution that the matched element makes to the result tree. Most templates have the following form:
<xsl:template match="/"> <xsl:apply-templates/> </xsl:template>
Before processing can begin, the part of the XML document with the information to be copied to the output must be selected with an XPath expression. The selected section of the document is called a node and is normally selected with the match attribute. In the above statements, the <xsl:template> element defines a template. The match="/" attribute associates the template with the root node of the XML source document. Another approach is to match the document element (the element that includes the entire document).
The <xsl:apply-templates> Element
The <xsl:apply-templates> element applies a template to the current element or to the current element's child nodes. If we add a select attribute to the <xsl:apply-templates> element, it will process only the child elements that match the value of the attribute. We can use the select attribute to specify the order in which the child nodes are processed.
The <xsl:value-of> Element
The <xsl:value-of> element can be used to extract the value of an XML element and add it to the output stream of the transformation. For example, the expression below selects the value of the Emp_Id child element of the current element and writes it to the output:
<xsl:value-of select="Emp_Id"/>
or
<xsl:value-of select="Employee-Detail/Employee/Emp_Id"/>
Note: The value of the select attribute is an XPath expression. An XPath expression works like navigating a file system, where a forward slash (/) selects subdirectories.
The <xsl:for-each select="elementName"> Element
The for-each instruction is a loop that applies the same instructions to each selected element. The <xsl:for-each> element can be used to select every XML element of a specified node-set. For example, the expression below selects all 'Employee' elements when the current node is the 'Employee-Detail' element; starting from the root node, the equivalent XPath expression is 'Employee-Detail/Employee'.
<xsl:for-each select="Employee"> <xsl:value-of select="Emp_Id"/> <xsl:value-of select="Emp_Name"/> </xsl:for-each>
Extensible Stylesheet Language Transformations (XSLT) is an XML-based language that transforms an XML document and generates output in a different format, such as HTML, XML, or another document type recognized by a browser, such as WML or XHTML. XSLT is a part of XSL, which is a stylesheet definition language for XML. With XSLT you can add elements and attributes to the output file, or remove them from it.
You can also rearrange and sort elements, and make decisions about which elements to hide and display. XSLT uses XPath to find information in an XML document. XPath is used to navigate through elements and attributes in XML documents.
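To make the XPath navigation concrete, here is a minimal Java sketch that evaluates path expressions with the standard javax.xml.xpath API, outside of any stylesheet. The sample document reuses the Employee-Detail element names from this tutorial; the class name XPathSketch and its helper methods are hypothetical.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathSketch {
    // Small sample document using the element names from this tutorial.
    public static final String XML =
        "<Employee-Detail>"
      + "<Employee><Emp_Id>E-001</Emp_Id><Emp_Name>Nisha</Emp_Name></Employee>"
      + "<Employee><Emp_Id>E-002</Emp_Id><Emp_Name>Amit</Emp_Name></Employee>"
      + "<Employee><Emp_Id>E-003</Emp_Id><Emp_Name>Deepak</Emp_Name></Employee>"
      + "</Employee-Detail>";

    // Returns the string value of the first node selected by the expression.
    public static String valueOf(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath expression: " + expression, e);
        }
    }

    // Returns how many nodes the expression selects.
    public static int countOf(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath expression: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Slashes step from parent to child, much like directories in a path.
        System.out.println(valueOf("/Employee-Detail/Employee[2]/Emp_Name")); // prints "Amit"
        System.out.println(countOf("/Employee-Detail/Employee"));             // prints 3
    }
}
```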
The following figure shows the working process of XSLT:
XSLT Processors
The job of an XSLT processor is to apply an XSLT stylesheet to an XML source document and produce a result document (for example, an HTML document). Several XSLT processors are available; a few good ones, some of them open source, are MSXML4, Saxon, Xalan, XT, and Oracle's processor. Most of them can be downloaded free from Web sites.
Apache's Xalan XSLT engine
Xalan is the Apache XML Project's XSLT engine. This processor is available at http://xml.apache.org/xalan/. We will concentrate on using this engine to transform the XML document we have developed into an output document in HTML format. Once the Xalan .zip or .gzip file is downloaded, unpack it and add these files to your CLASSPATH. These files include the .jar file for the Xerces parser and the .jar file for the Xalan stylesheet engine itself. The .jar files are named xercesImpl.jar and xalan.jar.
Working with XSLT APIs
The XSLT API relies on three components to transform an XML document into the required format. These components are:
•
An instance of the TransformerFactory
•
An instance of the Transformer
•
The predefined transformation instruction
TransformerFactory is an abstract class used to create an instance of the Transformer class that is responsible for transforming a source object to a result object.
The process of XML transformation starts when you create an instance of the TransformerFactory class. An instance of the Transformer class is then created using the instance of the TransformerFactory class. This Transformer instance uses the XML document as a source object and optionally uses the predefined instructions required for transformation to generate the formatted output as a result object. The source XML document can be supplied through SAX, DOM, or an input stream. The result object of the transformation process takes the form of a SAX event handler, a DOM, or an output stream.
During the transformation process, the original document is not changed; rather, a new document is created based on the content of the existing one. The new document may be serialized (output) by the processor in standard XML syntax or in another format, such as HTML or plain text. The following figure shows the working process of XSLT APIs:
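In code, the same flow can be sketched roughly as follows. This is a minimal illustration, not the tutorial's own program: it parses the XML into a DOM to act as the source object, obtains a Transformer from a TransformerFactory, and writes the result to an output stream held in memory. Passing no stylesheet yields an identity transformation, which illustrates the "optionally" above. The class name TransformFlowSketch, the sample XML, and the sample stylesheet are assumptions made for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class TransformFlowSketch {
    // Illustrative input using the element names from this tutorial.
    public static final String XML =
        "<Employee-Detail>"
      + "<Employee><Emp_Name>Nisha</Emp_Name></Employee>"
      + "<Employee><Emp_Name>Amit</Emp_Name></Employee>"
      + "</Employee-Detail>";

    // Illustrative instructions: list every employee name as plain text.
    public static final String XSL =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:template match=\"/\">"
      + "<xsl:for-each select=\"Employee-Detail/Employee\">"
      + "<xsl:value-of select=\"Emp_Name\"/><xsl:text>;</xsl:text>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Transforms xml with xsl; a null xsl requests the identity transformation.
    public static String run(String xml, String xsl) {
        try {
            // Source object: a DOM tree built from the XML text.
            DocumentBuilderFactory builders = DocumentBuilderFactory.newInstance();
            builders.setNamespaceAware(true);
            Document source = builders.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

            // Step 1: the factory. Step 2: a Transformer created from it.
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer = (xsl == null)
                ? factory.newTransformer()
                : factory.newTransformer(new StreamSource(new StringReader(xsl)));

            // Result object: a character stream collected in memory.
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(source), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run(XML, XSL));  // prints "Nisha;Amit;"
        System.out.println(run(XML, null)); // identity copy of the input as XML
    }
}
```

The DOM passed in as the source is only read; the output is a new document, which matches the point that the original is left unchanged.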
The XSLT Packages
The XSLT APIs are defined in the following packages:
Package | Description
javax.xml.transform | Defines the TransformerFactory and Transformer classes. These classes are used to get an object for doing transformations. After creating a transformer object, its transform() method is invoked. This method takes an input (source) and an output (result).
javax.xml.transform.dom | Defines classes used to create input and output objects from a DOM.
javax.xml.transform.sax | Defines classes used to create input from a SAX parser and output objects from a SAX event handler.
javax.xml.transform.stream | Defines classes used to create input and output objects from an I/O stream.
In this tutorial, we will convert a simple XML file into HTML using XSLT APIs. To develop this program, follow these steps:
1. Create an XML file
The code for the emp.xml file is given below:
<Employee-Detail>
  <Employee>
    <Emp_Id>E-001</Emp_Id>
    <Emp_Name>Nisha</Emp_Name>
    <Emp_E-mail>[email protected]</Emp_E-mail>
  </Employee>
  <Employee>
    <Emp_Id>E-002</Emp_Id>
    <Emp_Name>Amit</Emp_Name>
    <Emp_E-mail>[email protected]</Emp_E-mail>
  </Employee>
  <Employee>
    <Emp_Id>E-003</Emp_Id>
    <Emp_Name>Deepak</Emp_Name>
    <Emp_E-mail>[email protected]</Emp_E-mail>
  </Employee>
</Employee-Detail>
2. Create an XSL Stylesheet
Let's see the source code of the XSL stylesheet (emp.xsl) that provides templates to transform the XML document:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="html" indent="yes"/>
  <xsl:template match="/">
    <html>
      <head>
        <title>XSLT Style Sheet</title>
      </head>
      <body>
        <h1>Employee Details</h1>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>
  <xsl:template match="Employee-Detail">
    <table>
      <tr>
        <th>Emp_Id</th>
        <th>Emp_Name</th>
        <th>Emp_E-mail</th>
      </tr>
      <xsl:for-each select="Employee">
        <tr>
          <td><xsl:value-of select="Emp_Id"/></td>
          <td><xsl:value-of select="Emp_Name"/></td>
          <td><xsl:value-of select="Emp_E-mail"/></td>
        </tr>
      </xsl:for-each>
    </table>
  </xsl:template>
</xsl:stylesheet>
3. Create a Java program using XSLT APIs
Now we will develop a Java class that takes both an XML file and an XSL file as input and transforms them to generate a formatted HTML file. Here is the source code of SimpleXMLTransform.java:
import javax.xml.transform.ErrorListener;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
public class SimpleXMLTransform {
    public static void main(String[] arg) {
        if (arg.length != 3) {
            System.err.println("Usage: SimpleXMLTransform " + "