© 2001 First Hop
Java and XML
Janne Kalliola
Director, Product Development
First Hop Ltd
Java and XML
- About XML
- Basic XML manipulation
  - DOM
  - SAX
- Transforming and outputting XML documents
  - XSLT
  - XSL-FO
- New generation technologies
  - Java XML Bindings
  - Java XML Messaging
  - JAX-RPC
- Case – configuration information
About XML
- Extensible Markup Language (XML) is a W3C recommendation for representing information in electronic form
- XML is a metalanguage
  - it does not provide any semantics, only the basic syntax and conformance tests
  - everything else can be decided by the provider of the XML document
- There are dozens of initiatives to create languages on top of XML
  - MathML
  - XHTML
  - SOAP
Using XML
- In the beginning, XML was thought of strictly as a textual document
  - the document was represented as a file containing elements and body text
- Nowadays XML is used programmatically, so documents can also be
  - data structures
  - event streams
- In this presentation, the word "document" covers all of these aspects
XML as Objects
- XML documents can be manipulated in object-oriented software using the Document Object Model (DOM) interfaces
- DOM is a W3C (www.w3.org) recommendation
  - the interfaces are independent of programming language and implementation
  - bindings are available for several languages, including Java, C++, and Python, and the interfaces themselves are specified in IDL
  - DOM is also used in browsers to manipulate HTML content
- The basic idea in DOM is to read an XML document into a tree and manipulate the tree
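The read-into-a-tree idea can be sketched with the JAXP `DocumentBuilder` API that ships with the JDK. This is a minimal illustration, not a recommended design: the sample document, its element names and the class name `DomSketch` are assumptions made for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class DomSketch {
    // Hypothetical sample document; the element names are illustrative only.
    static final String XML =
        "<library><book title=\"XML Basics\"/><book title=\"Java IO\"/></library>";

    /** Parses the text into a DOM tree using the JAXP factory API. */
    static Document parse(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Appends a new book element to the root and returns the new book count. */
    static int addBook(Document doc, String title) {
        Element book = doc.createElement("book");
        book.setAttribute("title", title);
        doc.getDocumentElement().appendChild(book);
        return doc.getElementsByTagName("book").getLength();
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        // Walk the tree: list the title attribute of every book element.
        NodeList books = doc.getElementsByTagName("book");
        for (int i = 0; i < books.getLength(); i++) {
            System.out.println(((Element) books.item(i)).getAttribute("title"));
        }
        // Manipulate the tree in memory.
        System.out.println("Books after insert: " + addBook(doc, "DOM in Practice"));
    }
}
```

The whole document lives in memory as a tree, which is what makes both the lookup and the insertion simple.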
DOM Levels and Implementations
- DOM currently has three levels:
  1. basic manipulation interfaces
  2. event-based XML document manipulation, tree traversal, and style sheets (CSS)
  3. XML Schemas, XPath
- There are several implementations available, such as JAXP from Sun Microsystems and Xerces from the Apache Software Foundation
  - the implementations are usually called XML parsers
  - they contain implementations of the DOM interfaces and supporting code
  - support for the different levels varies from one implementation to another
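Because level support varies, a program can ask the implementation what it claims to support. The sketch below uses the standard `DOMImplementation.hasFeature` call. The class name `DomFeatureSketch` and the choice of features to query are assumptions for illustration, and the answers depend on the parser that JAXP selects at run time.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMImplementation;

class DomFeatureSketch {
    /** Asks the DOM implementation behind JAXP whether it claims a feature. */
    static boolean supports(String feature, String version) {
        try {
            DOMImplementation impl = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .getDOMImplementation();
            return impl.hasFeature(feature, version);
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
    }

    public static void main(String[] args) {
        // Feature names and versions as defined by the DOM specifications.
        String[][] queries = {
            {"XML", "1.0"}, {"Core", "2.0"}, {"Events", "2.0"},
            {"Traversal", "2.0"}, {"XPath", "3.0"}
        };
        for (String[] q : queries) {
            System.out.println(q[0] + " " + q[1] + ": " + supports(q[0], q[1]));
        }
    }
}
```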
Event-based Manipulation
- If the XML documents are large, or only a part of a document needs to be read, DOM can consume too much memory and other resources
  - the solution is to read the XML document as a series of events
  - every XML element triggers an event
  - the program reading the document catches the events and reacts accordingly
  - the document is read element by element, so no large data structures are created
- Event-based manipulation is done with SAX (Simple API for XML)
  - SAX implementations are usually bundled with DOM implementations
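A minimal sketch of the event model, using the SAX parser reachable through JAXP. The handler only records the name of each element as its start event arrives; the class name `SaxSketch` and the sample input are assumptions for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class SaxSketch {
    /** Returns the element names in the order their start events are fired. */
    static List<String> elementNames(String xml) {
        final List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                // Called once per opening tag; no tree is built.
                names.add(qName);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return names;
    }

    public static void main(String[] args) {
        // Hypothetical input document.
        System.out.println(elementNames("<order><item/><item/><total/></order>"));
    }
}
```

The handler keeps only what it needs (here a list of names), so memory use does not grow with the depth or size of the document tree.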
Transforming XML
- If the source XML document comes from outside the system that uses it, the document may be in a format that is not optimal or even usable for the system
- The document has to be transformed into another XML syntax
  - the transformation is hard to program in general-purpose programming languages
  - instead, a special-purpose language, Extensible Stylesheet Language Transformations (XSLT), has been created to solve this problem
XSLT
- XSLT is a rule-based language
  - the XSLT syntax is based on XML, i.e. XSLT programs (stylesheets) are XML documents
  - a stylesheet contains rules that transform elements, attributes and subdocuments into another form
- An XSLT stylesheet is run inside an XSLT processor
  - the processor takes the source document and the stylesheet as input and produces another XML document as output
  - the processor reads a part of the source document and tries to find matching rules
  - if a rule is found, it is executed and produces a part of the output document
  - there are default rules for parts that no rule matches
  - the process continues recursively until the source document has been completely processed
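The rule matching described above can be tried with the `javax.xml.transform` interfaces of JAXP. In this sketch the stylesheet has explicit rules for `people` and `person` only, so the document root is handled by XSLT's built-in default rule, which simply continues with the children. The element names, the stylesheet and the class name `XsltSketch` are assumptions for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltSketch {
    // Hypothetical stylesheet: two rules, one per element type.
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
        + "<xsl:template match=\"people\">"
        + "<names><xsl:apply-templates/></names>"
        + "</xsl:template>"
        + "<xsl:template match=\"person\">"
        + "<name><xsl:value-of select=\"@first\"/></name>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Runs the stylesheet over the source document and returns the output. */
    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical source document in a "foreign" format.
        String source = "<people><person first=\"Ada\"/><person first=\"Alan\"/></people>";
        System.out.println(transform(source, STYLESHEET));
    }
}
```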
XSL Formatting Objects
- XSL-FO is an XML-based language for describing layout information
  - an XSL-FO document contains the instructions needed to render itself, like the documents of DTP programs
  - the document is rendered using an XSL-FO processor
  - the processor outputs a document in a format such as PDF or PostScript
- XSL-FO is a young recommendation, finalised in October 2001
  - programmatic support is still at beta level
Outputting Human-readable Documents
- If the XML document has to be presented to a person, there are two basic ways to proceed:
  - convert the document to XHTML and show the output document in a WWW browser
    - the transformation is done using XSLT and the output is readily available for presentation
  - convert the document to XSL-FO and format it into the desired output format
    - the transformation is done using XSLT and the output is rendered into the final output format by an XSL-FO processor
Transformations in Java
- Both XSLT and XSL-FO can be used inside Java programs
- XSLT interfaces are included in JAXP
  - XSLT processors are available from, for example, IBM, Sun Microsystems and the Apache Software Foundation
- XSL-FO support is currently still in beta phase
  - the best Java-based processor is FOP, provided by the Apache Software Foundation, xml.apache.org/fop
  - FOP provides output as PDF, PS, PCL, SVG
  - FOP can also be used to render documents inside a Java program, using the supplied AWT component
New Generation Technologies
- The first uses of XML in Java were very document-oriented
  - reading in and writing out documents
  - using XML in WWW applications for content presentation, or as a configuration file format
  - transforming documents into other forms
- New APIs are geared towards using XML in a programmatic manner
  - programmers are shielded from the bare XML documents
  - APIs are provided instead, and the XML formats are kept in the background
Java XML Bindings
- If XML documents contain only computer-readable information, DOM and SAX are cumbersome technologies for reading the documents
  - the semantics have to be provided by the software itself
- Sun Microsystems has proposed a solution to the problem
  - Java XML Bindings (JAXB, the Java Architecture for XML Binding) is an API and a collection of tools for automating the mapping between XML documents and Java classes
  - JAXB is currently in development; for more information: http://java.sun.com/xml/jaxb/
JAXB Mechanisms
- JAXB provides a compiler that creates Java classes from XML DTDs (document type definitions)
  - XML Schema support will be available shortly
- The DTD is converted to a binding schema that is used to create the classes
- Generated classes contain the error and validity checks stated in the DTDs
- Classes can both read in and generate XML documents
JAXB Benefits & Problems
- Easier programming model
  - some parts of the program logic are written in a description language
- Smaller footprint for XML parsing
  - no need to keep redundant information in memory during parsing
- Program logic is divided between two locations
  - there are some versioning problems when the DTDs change
- Automation may generate poor code
Java XML Messaging
- Java XML Messaging (JAXM, the Java API for XML Messaging) is a set of APIs for sending and receiving XML-formatted messages
  - implements Simple Object Access Protocol (SOAP) 1.1 with attachments
- JAXM is used to exchange XML business documents over the Internet
- Transport is usually done over HTTP
  - FTP and SMTP can be used, too
  - messages can be both synchronous and asynchronous (with or without acknowledgements)
  - one message can be sent to several recipients
- For more information: http://java.sun.com/xml/jaxm/
JAXM Benefits
- Helps developers concentrate on the core features of the program
  - messaging details are left to JAXM components
  - provides several transports for exchanging messages
- Messages can be exchanged with non-Java applications, too
JAX-RPC
- Java API for XML-Based Remote Procedure Calls (JAX-RPC) enables Java applications to communicate with other programs using an RPC mechanism
  - the API is based on SOAP 1.1
- Usually, JAX-RPC is used in a client program to connect to a remote server
  - the client initiates a remote procedure call on the server
  - the server has defined a set of calls that are available to clients (the remote API of the server)
- JAX-RPC is thus similar to RMI (remote method invocation) or CORBA (common object request broker architecture)
JAX-RPC Functionality
- The remote procedure call is represented using an XML-based protocol (for instance SOAP 1.1)
- The server can define, describe and export a web service as an RPC-based service
  - the service is described using Web Services Description Language (WSDL)
  - WSDL is an XML-based specification that describes a service as a set of endpoints operating on messages
  - WSDL 1.1 is published by the World Wide Web Consortium (W3C) as a Note
- For more information: http://java.sun.com/xml/jaxrpc/
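To make "represented using an XML-based protocol" concrete, the sketch below builds the kind of SOAP 1.1 envelope that could carry one call. It deliberately uses plain DOM, not the JAX-RPC API, which would generate and hide such a message. The envelope namespace is the one defined by SOAP 1.1; the service namespace `urn:example:quotes`, the method `getQuote`, the parameter `symbol` and the class name are hypothetical.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class SoapCallSketch {
    // Envelope namespace defined by SOAP 1.1.
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";
    // Hypothetical namespace of the remote service.
    static final String SERVICE_NS = "urn:example:quotes";

    /** Builds Envelope > Body > method > parameter for a single call. */
    static Document buildCall(String method, String paramName, String value) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
            Element envelope = doc.createElementNS(SOAP_NS, "soap:Envelope");
            Element body = doc.createElementNS(SOAP_NS, "soap:Body");
            // The procedure name becomes an element; its arguments become children.
            Element call = doc.createElementNS(SERVICE_NS, "m:" + method);
            Element param = doc.createElement(paramName);
            param.setTextContent(value);
            call.appendChild(param);
            body.appendChild(call);
            envelope.appendChild(body);
            doc.appendChild(envelope);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
    }

    /** Serialises the tree to text with an identity transformation. */
    static String serialize(Document doc) {
        try {
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Could not serialise message", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize(buildCall("getQuote", "symbol", "FHOP")));
    }
}
```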
Java XML Packs
- Sun Microsystems has collected all the available APIs into a Java XML Pack
  - the pack contains a set of interoperable Java XML technologies
  - Sun Microsystems releases a new version of the pack quarterly
  - http://java.sun.com/xml/javaxmlpack.html
- The packs should not be confused with JAXP (Java API for XML Processing)
  - JAXP contains DOM and SAX parsers and an XSLT processor
  - JAXP forms the base of the Java XML Pack
Case – Description
- The configuration information of First Hop products is stored in an XML file
  - the file is read in by the main controller of the product
  - the configurations are grouped by the components of the product
    - this eases maintenance of the configuration
  - the contents of the file are split into several parts and given to the appropriate components
    - every component gets only its own configuration
  - configurations can be embedded if a component uses another component internally
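The splitting step might look roughly like the sketch below. The slides do not show the real file format, so the `configuration` and `component` elements, the `name` attribute and the class name are all assumptions; only the idea of handing each component its own subtree comes from the description above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class ConfigDissectSketch {
    // Hypothetical layout; the logger embeds the configuration of a formatter.
    static final String CONFIG =
        "<configuration>"
        + "<component name=\"router\"><timeout>30</timeout></component>"
        + "<component name=\"logger\"><level>debug</level>"
        + "<component name=\"formatter\"><pattern>short</pattern></component>"
        + "</component>"
        + "</configuration>";

    /** Maps each top-level component name to its own configuration subtree. */
    static Map<String, Element> dissect(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Invalid configuration", e);
        }
        Map<String, Element> parts = new LinkedHashMap<>();
        NodeList children = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Only direct children are split off; an embedded component stays
            // inside the subtree of the component that uses it.
            if (child instanceof Element && "component".equals(child.getNodeName())) {
                Element component = (Element) child;
                parts.put(component.getAttribute("name"), component);
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        // A main controller would pass each subtree to the matching component.
        for (Map.Entry<String, Element> part : dissect(CONFIG).entrySet()) {
            System.out.println(part.getKey() + " receives <"
                + part.getValue().getNodeName() + "> with "
                + part.getValue().getElementsByTagName("*").getLength()
                + " descendant element(s)");
        }
    }
}
```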
Case – Benefits
- The configuration is kept in a single file
  - reduces the number of files to be manipulated
- The configuration is flexible
  - the syntax of a component's configuration is specified by the component
  - the configurations can be nested, if required
- The configuration is checked automatically
  - if the syntax of the configuration is wrong, the parser raises errors while building the DOM tree
  - no need to write custom configuration parsers
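The automatic check can be seen with a few lines of JAXP: a configuration that is not well-formed XML makes the parser throw a `SAXException` before any component sees the data. The class name `ConfigCheckSketch` and the sample inputs are assumptions for illustration. The sketch checks well-formedness only; validating against a DTD would need a validating parser.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class ConfigCheckSketch {
    /** Returns true if the text can be parsed into a DOM tree. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Raised by the parser for syntax errors such as unclosed elements.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException("Parser could not be used", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical configurations: the second one never closes <component>.
        String good = "<configuration><component name=\"router\"/></configuration>";
        String bad = "<configuration><component name=\"router\"></configuration>";
        System.out.println("good: " + isWellFormed(good));
        System.out.println("bad:  " + isWellFormed(bad));
    }
}
```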
Summary
- There are two levels of XML manipulation
  - the first is based on the document aspects of XML
    - older standards, more implementations
    - generic solutions
  - the second is based on programmatic use of XML
    - new standards, usually one or a few implementations
    - some are still in a draft phase
    - specific solutions
- Java and XML provide a wide range of possibilities for application programmers
Questions & Comments?
- For more information about First Hop: www.firsthop.com
- For information about XML:
  - http://java.sun.com/xml/
  - http://www.w3.org/
  - http://www.xml.com/
- You can reach me at [email protected]