Xml Document

  • November 2019

XML was designed to describe data and to focus on what data is. HTML was designed to display data and to focus on how data looks.

What You Should Already Know
Before you continue, you should have a basic understanding of the following:

  • HTML / XHTML
  • JavaScript or VBScript

If you want to study these subjects first, find the tutorials on our Home page.

What is XML?

  • XML stands for EXtensible Markup Language
  • XML is a markup language much like HTML
  • XML was designed to describe data
  • XML tags are not predefined. You must define your own tags
  • XML uses a Document Type Definition (DTD) or an XML Schema to describe the data
  • XML with a DTD or XML Schema is designed to be self-descriptive
  • XML is a W3C Recommendation

XML is a W3C Recommendation
The Extensible Markup Language (XML) became a W3C Recommendation on 10 February 1998. You can read more about XML standards in our W3C tutorial.

The Main Difference Between XML and HTML
XML was designed to carry data. XML is not a replacement for HTML. XML and HTML were designed with different goals: XML was designed to describe data and to focus on what data is. HTML was designed to display data and to focus on how data looks. HTML is about displaying information, while XML is about describing information.

XML Does not DO Anything
XML was not designed to DO anything. Maybe it is a little hard to understand, but XML does not DO anything. XML was created to structure, store and to send information.

The following example is a note to Tove from Jani, stored as XML:

<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

The note has a header and a message body. It also has sender and receiver information. But still, this XML document does not DO anything. It is just pure information wrapped in XML tags. Someone must write a piece of software to send, receive or display it.

XML is Free and Extensible
XML tags are not predefined. You must "invent" your own tags. The tags used to mark up HTML documents and the structure of HTML documents are predefined. The author of HTML documents can only use tags that are defined in the HTML standard (like <p>, <h1>, etc.). XML allows the author to define their own tags and their own document structure. The tags in the example above (like <to> and <from>) are not defined in any XML standard. These tags are "invented" by the author of the XML document.

XML is a Complement to HTML
XML is not a replacement for HTML. It is important to understand that XML is not a replacement for HTML. In future Web development it is most likely that XML will be used to describe the data, while HTML will be used to format and display the same data. My best description of XML is this: XML is a cross-platform, software and hardware independent tool for transmitting information.

XML in Future Web Development
XML is going to be everywhere. We have been participating in XML development since its creation. It has been amazing to see how quickly the XML standard has been developed and how quickly a large number of software vendors have adopted the standard. We strongly believe that XML will be as important to the future of the Web as HTML has been to the foundation of the Web and that XML will be the most common tool for all data manipulation and data transmission.

XML Joke
Question: When should I use XML?
Answer: When you need a buzzword in your resume.

It is important to understand that XML was designed to store, carry, and exchange data. XML was not designed to display data.

XML can Separate Data from HTML
With XML, your data is stored outside your HTML. When HTML is used to display data, the data is stored inside your HTML. With XML, data can be stored in separate XML files. This way you can concentrate on using HTML for data layout and display, and be sure that changes in the underlying data will not require any changes to your HTML. XML data can also be stored inside HTML pages as "Data Islands". You can still concentrate on using HTML only for formatting and displaying the data.

XML is Used to Exchange Data
With XML, data can be exchanged between incompatible systems. In the real world, computer systems and databases contain data in incompatible formats. One of the most time-consuming challenges for developers has been to exchange data between such systems over the Internet. Converting the data to XML can greatly reduce this complexity and create data that can be read by many different types of applications.

XML and B2B
With XML, financial information can be exchanged over the Internet. Expect to see a lot about XML and B2B (Business To Business) in the near future. XML is going to be the main language for exchanging financial information between businesses over the Internet. A lot of interesting B2B applications are under development.

XML Can be Used to Share Data
With XML, plain text files can be used to share data. Since XML data is stored in plain text format, XML provides a software- and hardware-independent way of sharing data. This makes it much easier to create data that different applications can work with. It also makes it easier to expand or upgrade a system to new operating systems, servers, applications, and new browsers.

XML Can be Used to Store Data
With XML, plain text files can be used to store data. XML can also be used to store data in files or in databases. Applications can be written to store and retrieve information from the store, and generic applications can be used to display the data.

XML Can Make your Data More Useful
With XML, your data is available to more users. Since XML is independent of hardware, software and application, you can make your data available to clients other than standard HTML browsers. Other clients and applications can access your XML files as data sources, just as they access databases. Your data can be made available to all kinds of "reading machines" (agents), and it is easier to make your data available for blind people, or people with other disabilities.

XML Can be Used to Create New Languages
XML is the mother of WAP and WML. The Wireless Markup Language (WML), used to mark up Internet applications for handheld devices like mobile phones, is written in XML. You can read more about WML in our WML tutorial.

If Developers Have Sense
If they DO have sense, all future applications will exchange their data in XML. The future might give us word processors, spreadsheet applications and databases that can read each other's data in a pure text format, without any conversion utilities in between. We can only pray that Microsoft and all the other software vendors will agree.

The syntax rules of XML are very simple and very strict. The rules are very easy to learn, and very easy to use. Because of this, creating software that can read and manipulate XML is very easy.

An Example XML Document
XML documents use a self-describing and simple syntax.

<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

The first line in the document - the XML declaration - defines the XML version and the character encoding used in the document. In this case the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.

The next line describes the root element of the document (like it was saying: "this document is a note"):

<note>

The next 4 lines describe 4 child elements of the root (to, from, heading, and body):

<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>

And finally the last line defines the end of the root element:

</note>

Can you detect from this example that the XML document contains a Note to Tove from Jani? Don't you agree that XML is pretty self-descriptive?
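As a rough illustration of how a piece of software might read the note, the sketch below parses it with Python's standard xml.etree.ElementTree module and lists the root element and its children. Python is not used elsewhere in this tutorial, and the variable names are chosen only for this example.

```python
import xml.etree.ElementTree as ET

# The note document from this chapter, given as bytes so that the parser
# reads the encoding from the XML declaration itself.
NOTE_XML = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>"""

root = ET.fromstring(NOTE_XML)

# Iterating over an element yields its child elements in document order.
children = [(child.tag, child.text) for child in root]

print("root element:", root.tag)
for tag, text in children:
    print(f"{tag}: {text}")
```

The declaration is consumed by the parser and does not show up as an element, which matches the point that it is not part of the document's element tree.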

All XML Elements Must Have a Closing Tag
With XML, it is illegal to omit the closing tag. In HTML some elements do not have to have a closing tag. The following code is legal in HTML:

<p>This is a paragraph
<p>This is another paragraph

In XML all elements must have a closing tag, like this:

<p>This is a paragraph</p>
<p>This is another paragraph</p>

Note: You might have noticed from the previous example that the XML declaration did not have a closing tag. This is not an error. The declaration is not a part of the XML document itself. It is not an XML element, and it should not have a closing tag.

XML Tags are Case Sensitive
Unlike HTML, XML tags are case sensitive. With XML, the tag <Message> is different from the tag <message>.

Opening and closing tags must therefore be written with the same case:

<Message>This is incorrect</message>
<message>This is correct</message>

XML Elements Must be Properly Nested
Improper nesting of tags makes no sense to XML. In HTML some elements can be improperly nested within each other like this:

<b><i>This text is bold and italic</b></i>

In XML all elements must be properly nested within each other like this:

<b><i>This text is bold and italic</i></b>

XML Documents Must Have a Root Element
All XML documents must contain a single tag pair to define a root element. All other elements must be within this root element. All elements can have sub elements (child elements). Sub elements must be correctly nested within their parent element:

<root>
  <child>
    <subchild>.....</subchild>
  </child>
</root>
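The rules covered so far (closing tags, matching case, proper nesting, a single root element) can be checked mechanically. The sketch below is one possible way to do that in Python with the standard library parser; the helper name is_well_formed is made up for this example.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False if it reports an error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# One candidate document per rule discussed above.
candidates = {
    "closed paragraph": "<p>This is a paragraph</p>",
    "missing closing tag": "<p>This is a paragraph",
    "case mismatch": "<Message>This is incorrect</message>",
    "improper nesting": "<b><i>This text is bold and italic</b></i>",
    "two root elements": "<first/><second/>",
    "nested children": "<root><child><subchild>.....</subchild></child></root>",
}

for label, text in candidates.items():
    print(f"{label}: {is_well_formed(text)}")
```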

XML Attribute Values Must be Quoted
With XML, it is illegal to omit quotation marks around attribute values. XML elements can have attributes in name/value pairs just like in HTML. In XML the attribute value must always be quoted. Study the two XML documents below. The first one is incorrect, the second is correct:

<?xml version="1.0" encoding="ISO-8859-1"?>
<note date=12/11/2002>
<to>Tove</to>
<from>Jani</from>
</note>

<?xml version="1.0" encoding="ISO-8859-1"?>
<note date="12/11/2002">
<to>Tove</to>
<from>Jani</from>
</note>

The error in the first document is that the date attribute in the note element is not quoted. This is correct: date="12/11/2002". This is incorrect: date=12/11/2002.

With XML, White Space is Preserved
With XML, the white space in your document is not truncated. This is unlike HTML. With HTML, a sentence like this:

Hello              my name is Tove,

will be displayed like this:

Hello my name is Tove,

because HTML reduces multiple, consecutive white space characters to a single white space.

With XML, CR / LF is Converted to LF
With XML, a new line is always stored as LF. Do you know what a typewriter is? Well, a typewriter is a mechanical device which was used last century to produce printed documents. :-) After you have typed one line of text on a typewriter, you have to manually return the printing carriage to the left margin position and manually feed the paper up one line. In Windows applications, a new line is normally stored as a pair of characters: carriage return (CR) and line feed (LF). The character pair bears some resemblance to the typewriter actions of setting a new line. In Unix applications, a new line is normally stored as a LF character. Macintosh applications use only a CR character to store a new line. An XML parser converts both the CR LF pair and a lone CR to a single LF before it passes the text to the application.
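Both behaviors, preserved white space and line breaks normalized to LF, can be observed with a short Python sketch using the standard library parser. The element names greeting and text are invented for this example.

```python
import xml.etree.ElementTree as ET

# Several consecutive spaces inside element content are kept exactly as written.
spaced = ET.fromstring("<greeting>Hello      my name is Tove,</greeting>")

# A Windows-style CR LF pair in the input is handed to the application as one LF.
windows_style = ET.fromstring(b"<text>line one\r\nline two</text>")

print(repr(spaced.text))
print(repr(windows_style.text))
```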

Comments in XML
The syntax for writing comments in XML is similar to that of HTML.

<!-- This is a comment -->

There is Nothing Special About XML
There is nothing special about XML. It is just plain text with the addition of some XML tags enclosed in angle brackets. Software that can handle plain text can also handle XML. In a simple text editor, the XML tags will be visible and will not be handled specially. In an XML-aware application however, the XML tags can be handled specially. The tags may or may not be visible, or have a functional meaning, depending on the nature of the application.

XML Elements are extensible and they have relationships. XML Elements have simple naming rules.

XML Elements are Extensible
XML documents can be extended to carry more information. Look at the following XML NOTE example:

<note>
<to>Tove</to>
<from>Jani</from>
<body>Don't forget me this weekend!</body>
</note>

Let's imagine that we created an application that extracted the <to>, <from>, and <body> elements from the XML document to produce this output:

MESSAGE
To: Tove
From: Jani

Don't forget me this weekend!

Imagine that the author of the XML document added some extra information to it:

<note>
<date>2002-08-01</date>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

Should the application break or crash? No. The application should still be able to find the <to>, <from>, and <body> elements in the XML document and produce the same output. XML documents are Extensible.
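A minimal sketch of such an application, written in Python with the standard library, might look like the following. The function name format_message is hypothetical; the point is only that looking elements up by name keeps working when new elements are added.

```python
import xml.etree.ElementTree as ET

ORIGINAL_NOTE = """<note>
<to>Tove</to>
<from>Jani</from>
<body>Don't forget me this weekend!</body>
</note>"""

EXTENDED_NOTE = """<note>
<date>2002-08-01</date>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>"""

def format_message(xml_text):
    """Build the MESSAGE output from the <to>, <from> and <body> elements only."""
    note = ET.fromstring(xml_text)
    return (
        "MESSAGE\n"
        f"To: {note.findtext('to')}\n"
        f"From: {note.findtext('from')}\n\n"
        f"{note.findtext('body')}"
    )

# The extra <date> and <heading> elements are simply ignored.
print(format_message(EXTENDED_NOTE))
```

An application that instead relied on element positions (first child, second child, and so on) would break when the date element was added, which is why lookup by name is the safer design.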

XML Elements have Relationships
Elements are related as parents and children. To understand XML terminology, you have to know how relationships between XML elements are named, and how element content is described. Imagine that this is a description of a book:

My First XML

Introduction to XML
  • What is HTML
  • What is XML

XML Syntax
  • Elements must have a closing tag
  • Elements must be properly nested

Imagine that this XML document describes the book:

<book>
<title>My First XML</title>
<prod id="33-657" media="paper"></prod>
<chapter>Introduction to XML
<para>What is HTML</para>
<para>What is XML</para>
</chapter>
<chapter>XML Syntax
<para>Elements must have a closing tag</para>
<para>Elements must be properly nested</para>
</chapter>
</book>

Book is the root element. Title, prod, and chapter are child elements of book. Book is the parent element of title, prod, and chapter. Title, prod, and chapter are siblings (or sister elements) because they have the same parent.

Elements have Content
Elements can have different content types. An XML element is everything from (including) the element's start tag to (including) the element's end tag. An element can have element content, mixed content, simple content, or empty content. An element can also have attributes.

In the example above, book has element content, because it contains other elements. Chapter has mixed content because it contains both text and other elements. Para has simple content (or text content) because it contains only text. Prod has empty content, because it carries no information.

In the example above only the prod element has attributes. The attribute named id has the value "33-657". The attribute named media has the value "paper".
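The four content types can be told apart programmatically. The following Python sketch classifies elements of the book document; the function content_type is an illustrative helper, and it treats white space between child elements as insignificant, which is an assumption of this example rather than a rule of XML.

```python
import xml.etree.ElementTree as ET

BOOK_XML = """<book>
<title>My First XML</title>
<prod id="33-657" media="paper"></prod>
<chapter>Introduction to XML
<para>What is HTML</para>
<para>What is XML</para>
</chapter>
<chapter>XML Syntax
<para>Elements must have a closing tag</para>
<para>Elements must be properly nested</para>
</chapter>
</book>"""

def content_type(element):
    """Classify an element as having element, mixed, simple, or empty content."""
    has_children = len(element) > 0
    # Text that belongs directly to this element: before the first child,
    # plus whatever follows each child (stored by ElementTree as the child's tail).
    own_text = (element.text or "") + "".join(child.tail or "" for child in element)
    has_text = own_text.strip() != ""
    if has_children and has_text:
        return "mixed"
    if has_children:
        return "element"
    if has_text:
        return "simple"
    return "empty"

book = ET.fromstring(BOOK_XML)
for name in ("title", "prod", "chapter", "chapter/para"):
    print(name, "->", content_type(book.find(name)))
print("book ->", content_type(book))
print("prod attributes:", book.find("prod").attrib)
```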

Element Naming
XML elements must follow these naming rules:

  • Names can contain letters, numbers, and other characters
  • Names must not start with a number or punctuation character
  • Names must not start with the letters xml (or XML, or Xml, etc)
  • Names cannot contain spaces

Take care when you "invent" element names and follow these simple rules:

Any name can be used, no words are reserved, but the idea is to make names descriptive. Names with an underscore separator are nice. Examples: <first_name>, <last_name>.

Avoid "-" and "." in names. For example, if you name something "first-name," it could be a mess if your software tries to subtract name from first. Or if you name something "first.name," your software may think that "name" is a property of the object "first."

Element names can be as long as you like, but don't exaggerate. Names should be short and simple, like this: <book_title> not like this: <the_title_of_the_book>.

XML documents often have a corresponding database, in which fields exist corresponding to elements in the XML document. A good practice is to use the naming rules of your database for the elements in the XML documents.

Non-English letters like éòá are perfectly legal in XML element names, but watch out for problems if your software vendor doesn't support them.

The ":" should not be used in element names because it is reserved to be used for something called namespaces (more later).

XML elements can have attributes in the start tag, just like HTML. Attributes are used to provide additional information about elements.

XML Attributes
XML elements can have attributes. From HTML you will remember this: <IMG SRC="computer.gif">. The SRC attribute provides additional information about the IMG element. In HTML (and in XML) attributes provide additional information about elements:

<img src="computer.gif">

Attributes often provide information that is not a part of the data. In the example below, the file type is irrelevant to the data, but important to the software that wants to manipulate the element:

<file type="gif">computer.gif</file>

Quote Styles, "female" or 'female'?
Attribute values must always be enclosed in quotes, but either single or double quotes can be used. For a person's sex, the person tag can be written like this:

<person sex="female">

or like this:

<person sex='female'>

Note: If the attribute value itself contains double quotes, enclose it in single quotes (or write the inner quotes as &quot;). If the attribute value itself contains single quotes, enclose it in double quotes (or write the inner quotes as &apos;).

Use of Elements vs. Attributes
Data can be stored in child elements or in attributes. Take a look at these examples:

<person sex="female">
  <firstname>Anna</firstname>
  <lastname>Smith</lastname>
</person>

<person>
  <sex>female</sex>
  <firstname>Anna</firstname>
  <lastname>Smith</lastname>
</person>

In the first example sex is an attribute. In the last, sex is a child element. Both examples provide the same information. There are no rules about when to use attributes, and when to use child elements. My experience is that attributes are handy in HTML, but in XML you should try to avoid them. Use child elements if the information feels like data.

My Favorite Way
I like to store data in child elements. The following three XML documents contain exactly the same information.

A date attribute is used in the first example:

<note date="12/11/2002">
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

A date element is used in the second example:

<note>
<date>12/11/2002</date>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

An expanded date element is used in the third: (THIS IS MY FAVORITE):

<note>
<date>
  <day>12</day>
  <month>11</month>
  <year>2002</year>
</date>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

Avoid using attributes?
Should you avoid using attributes? Some of the problems with using attributes are:

  • attributes cannot contain multiple values (child elements can)
  • attributes are not easily expandable (for future changes)
  • attributes cannot describe structures (child elements can)
  • attributes are more difficult to manipulate by program code
  • attribute values are not easy to test against a Document Type Definition (DTD) - which is used to define the legal elements of an XML document

If you use attributes as containers for data, you end up with documents that are difficult to read and maintain. Try to use elements to describe data. Use attributes only to provide information that is not relevant to the data. Don't end up like this (this is not how XML should be used):

<note day="12" month="11" year="2002"
to="Tove" from="Jani" heading="Reminder"
body="Don't forget me this weekend!">
</note>

An Exception to my Attribute Rule
Rules always have exceptions. My rule about attributes has one exception: Sometimes I assign ID references to elements. These ID references can be used to access XML elements in much the same way as the NAME or ID attributes in HTML. This example demonstrates this:

<messages>
<note id="p501">
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
<note id="p502">
  <to>Jani</to>
  <from>Tove</from>
  <heading>Re: Reminder</heading>
  <body>I will not!</body>
</note>
</messages>

The ID in these examples is just a counter, or a unique identifier, to identify the different notes in the XML file, and not a part of the note data. What I am trying to say here is that metadata (data about data) should be stored as attributes, and that data itself should be stored as elements.
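To show how an id attribute can be used to reach a particular element, here is a small Python sketch built on the standard library's limited XPath support. The helper find_note is an invented name for this illustration.

```python
import xml.etree.ElementTree as ET

MESSAGES_XML = """<messages>
<note id="p501">
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
<note id="p502">
  <to>Jani</to>
  <from>Tove</from>
  <heading>Re: Reminder</heading>
  <body>I will not!</body>
</note>
</messages>"""

messages = ET.fromstring(MESSAGES_XML)

def find_note(root, note_id):
    """Return the <note> child whose id attribute equals note_id, or None."""
    return root.find(f"note[@id='{note_id}']")

reply = find_note(messages, "p502")
print(reply.findtext("heading"))
print(reply.findtext("body"))
```

The id is used only to select the note; the data itself is still read from child elements, in line with the metadata-versus-data distinction made above.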

XML Validations:
XML with correct syntax is Well Formed XML. XML validated against a DTD is Valid XML.

Well Formed XML Documents
A "Well Formed" XML document has correct XML syntax. A "Well Formed" XML document is a document that conforms to the XML syntax rules that were described in the previous chapters:

  • XML documents must have a root element
  • XML elements must have a closing tag
  • XML tags are case sensitive
  • XML elements must be properly nested
  • XML attribute values must always be quoted

<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

Valid XML Documents
A "Valid" XML document also conforms to a DTD. A "Valid" XML document is a "Well Formed" XML document, which also conforms to the rules of a Document Type Definition (DTD):

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE note SYSTEM "note.dtd">
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

XML DTD
A DTD defines the legal elements of an XML document. The purpose of a DTD is to define the legal building blocks of an XML document. It defines the document structure with a list of legal elements. You can read more about DTD, and how to validate your XML documents in our DTD tutorial.

Documentation of DTD Tutorial:
A Document Type Definition defines the legal building blocks of an XML document. It defines the document structure with a list of legal elements. A DTD can be declared inline in your XML document, or as an external reference.

Internal DOCTYPE Declaration
If the DTD is included in your XML source file, it should be wrapped in a DOCTYPE definition with the following syntax:

<!DOCTYPE root-element [element-declarations]>

Example XML document with a DTD: (Open it in IE5, and select view source):

<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (to,from,heading,body)>
  <!ELEMENT to      (#PCDATA)>
  <!ELEMENT from    (#PCDATA)>
  <!ELEMENT heading (#PCDATA)>
  <!ELEMENT body    (#PCDATA)>
]>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend</body>
</note>

The DTD above is interpreted like this:

!DOCTYPE note (in line 2) defines that this is a document of the type note.
!ELEMENT note (in line 3) defines the note element as having four elements: "to,from,heading,body".
!ELEMENT to (in line 4) defines the to element to be of the type "#PCDATA".
!ELEMENT from (in line 5) defines the from element to be of the type "#PCDATA"
and so on.....

External DOCTYPE Declaration
If the DTD is external to your XML source file, it should be wrapped in a DOCTYPE definition with the following syntax:

<!DOCTYPE root-element SYSTEM "filename">

This is the same XML document as above, but with an external DTD: (Open it in IE5, and select view source)

<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "note.dtd">
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

And this is a copy of the file "note.dtd" containing the DTD:

<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>

Why Use a DTD?
With DTD, each of your XML files can carry a description of its own format with it. With a DTD, independent groups of people can agree to use a common DTD for interchanging data. Your application can use a standard DTD to verify that the data you receive from the outside world is valid. You can also use a DTD to verify your own data.

The main building blocks of both XML and HTML documents are tags like <body>....</body>.

The Building Blocks of XML Documents
Seen from a DTD point of view, all XML documents (and HTML documents) are made up by the following simple building blocks:

  • Elements
  • Attributes
  • Entities
  • PCDATA
  • CDATA

The following is a brief explanation of each of the building blocks:

Elements
Elements are the main building blocks of both XML and HTML documents. Examples of HTML elements are "body" and "table". Examples of XML elements could be "note" and "message". Elements can contain text, other elements, or be empty. Examples of empty HTML elements are "hr", "br" and "img". Examples:

<body>body text in between</body>
<message>some message in between</message>

Attributes
Attributes provide extra information about elements. Attributes are always placed inside the starting tag of an element. Attributes always come in name/value pairs. The following "img" element has additional information about a source file:

<img src="computer.gif" />

The name of the element is "img". The name of the attribute is "src". The value of the attribute is "computer.gif". Since the element itself is empty it is closed by a " /".

Entities
Entities are variables used to define common text. Entity references are references to entities. Most of you will know the HTML entity reference: "&nbsp;". This "no-breaking-space" entity is used in HTML to insert an extra space in a document. Entities are expanded when a document is parsed by an XML parser. The following entities are predefined in XML:

Entity References    Character
&lt;                 <
&gt;                 >
&amp;                &
&quot;               "
&apos;               '

PCDATA
PCDATA means parsed character data. Think of character data as the text found between the start tag and the end tag of an XML element. PCDATA is text that WILL be parsed by a parser. The text will be examined by the parser for entities and markup. Tags inside the text will be treated as markup and entities will be expanded. However, parsed character data should not contain any &, <, or > characters; these need to be represented by the &amp; &lt; and &gt; entities, respectively.
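As a small illustration of the escaping rule for parsed character data, the Python sketch below uses the standard library helpers escape and unescape from xml.sax.saxutils. The element name condition is invented for the example.

```python
from xml.sax.saxutils import escape, unescape
import xml.etree.ElementTree as ET

raw_text = "if a < b & b > c"

# escape() replaces &, < and > with the predefined entity references.
escaped_text = escape(raw_text)

# Once escaped, the text can safely be placed between a start tag and an end tag.
element = ET.fromstring(f"<condition>{escaped_text}</condition>")

print(escaped_text)
print(element.text)
```

The parser expands the entity references again, so the application sees the original characters.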

CDATA
CDATA means character data. CDATA is text that will NOT be parsed by a parser. Tags inside the text will NOT be treated as markup and entities will not be expanded.

In a DTD, XML elements are declared with a DTD element declaration.

Declaring an Element
In the DTD, XML elements are declared with an element declaration. An element declaration has the following syntax:

<!ELEMENT element-name category>
or
<!ELEMENT element-name (element-content)>

Empty Elements
Empty elements are declared with the category keyword EMPTY:

<!ELEMENT element-name EMPTY>

example:
<!ELEMENT br EMPTY>

XML example:
<br />

Elements with Only Parsed Character Data
Elements with only parsed character data are declared with #PCDATA inside parentheses:

<!ELEMENT element-name (#PCDATA)>

example:
<!ELEMENT from (#PCDATA)>

Elements with any Contents
Elements declared with the category keyword ANY, can contain any combination of parsable data:

<!ELEMENT element-name ANY>

example:
<!ELEMENT note ANY>

Elements with Children (sequences)
Elements with one or more children are defined with the name of the children elements inside parentheses:

<!ELEMENT element-name (child-element-name)>
or
<!ELEMENT element-name (child-element-name,child-element-name,.....)>

example:
<!ELEMENT note (to,from,heading,body)>

When children are declared in a sequence separated by commas, the children must appear in the same sequence in the document. In a full declaration, the children must also be declared, and the children can also have children. The full declaration of the "note" element will be:

<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>

Declaring Only One Occurrence of an Element
<!ELEMENT element-name (child-name)>

example:
<!ELEMENT note (message)>

The example declaration above declares that the child element message must occur once, and only once inside the "note" element.

Declaring Minimum One Occurrence of an Element
<!ELEMENT element-name (child-name+)>

example:
<!ELEMENT note (message+)>

The + sign in the example above declares that the child element message must occur one or more times inside the "note" element.

Declaring Zero or More Occurrences of an Element
<!ELEMENT element-name (child-name*)>

example:
<!ELEMENT note (message*)>

The * sign in the example above declares that the child element message can occur zero or more times inside the "note" element.

Declaring Zero or One Occurrences of an Element
<!ELEMENT element-name (child-name?)>

example:
<!ELEMENT note (message?)>

The ? sign in the example above declares that the child element message can occur zero or one times inside the "note" element.

Declaring either/or Content
example:
<!ELEMENT note (to,from,header,(message|body))>

The example above declares that the "note" element must contain a "to" element, a "from" element, a "header" element, and either a "message" or a "body" element.
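The occurrence signs +, * and ? behave much like the same operators in regular expressions. The Python sketch below makes that analogy concrete: each content model above is hand-translated into a regular expression over the sequence of child element names. This is not a DTD validator (Python's standard library does not include one); the MODELS table and the function names are assumptions made for this illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hand-written regular expressions mirroring the content models above.
# Every child name is followed by a comma so that names cannot run together.
MODELS = {
    "(message)": r"message,",
    "(message+)": r"(message,)+",
    "(message*)": r"(message,)*",
    "(message?)": r"(message,)?",
    "(to,from,header,(message|body))": r"to,from,header,(message|body),",
}

def child_sequence(xml_text):
    """Return the child element names of the root, e.g. 'to,from,header,body,'."""
    root = ET.fromstring(xml_text)
    return "".join(child.tag + "," for child in root)

def matches(model, xml_text):
    """Check whether the children of the root fit the given content model."""
    return re.fullmatch(MODELS[model], child_sequence(xml_text)) is not None

two_messages = "<note><message/><message/></note>"
print(matches("(message)", two_messages))
print(matches("(message+)", two_messages))
print(matches("(message?)", "<note/>"))
```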

Declaring Mixed Content
example:
<!ELEMENT note (#PCDATA|to|from|header|message)*>

The example above declares that the "note" element can contain zero or more occurrences of parsed character data, "to", "from", "header", or "message" elements.

In a DTD, Attributes are declared with an ATTLIST declaration.

Declaring Attributes
An attribute declaration has the following syntax:

<!ATTLIST element-name attribute-name attribute-type default-value>

DTD example:
<!ATTLIST payment type CDATA "check">

XML example:
<payment type="check" />

The attribute-type can have the following values:

Value           Explanation
CDATA           The value is character data
(en1|en2|..)    The value must be one from an enumerated list
ID              The value is a unique id
IDREF           The value is the id of another element
IDREFS          The value is a list of other ids
NMTOKEN         The value is a valid XML name token
NMTOKENS        The value is a list of valid XML name tokens
ENTITY          The value is an entity
ENTITIES        The value is a list of entities
NOTATION        The value is a name of a notation
xml:            The value is a predefined xml value

The default-value can have the following values:

Value           Explanation
value           The default value of the attribute
#REQUIRED       The attribute value must be included in the element
#IMPLIED        The attribute does not have to be included
#FIXED value    The attribute value is fixed

Specifying a Default Attribute Value
DTD:
<!ELEMENT square EMPTY>
<!ATTLIST square width CDATA "0">

Valid XML:
<square width="100" />

In the example above, the "square" element is defined to be an empty element with a "width" attribute of type CDATA. If no width is specified, it has a default value of 0.

#IMPLIED
Syntax:
<!ATTLIST element-name attribute-name attribute-type #IMPLIED>

Use the #IMPLIED keyword if you don't want to force the author to include an attribute, and you don't have an option for a default value.

#REQUIRED
Syntax:
<!ATTLIST element-name attribute-name attribute-type #REQUIRED>

Use the #REQUIRED keyword if you don't have an option for a default value, but still want to force the attribute to be present.

#FIXED
Syntax:
<!ATTLIST element-name attribute-name attribute-type #FIXED "value">

Example DTD:
<!ATTLIST sender company CDATA #FIXED "Microsoft">

Valid XML:
<sender company="Microsoft" />

Invalid XML:
<sender company="W3Schools" />

Use the #FIXED keyword when you want an attribute to have a fixed value without allowing the author to change it. If an author includes another value, the XML parser will return an error.

Enumerated Attribute Values
Syntax:
<!ATTLIST element-name attribute-name (en1|en2|..) default-value>

DTD example:
<!ATTLIST payment type (check|cash) "cash">

XML example:
<payment type="check" />
or
<payment type="cash" />

Use enumerated attribute values when you want the attribute values to be one of a fixed set of legal values.

Entities are variables used to define shortcuts to common text.



  • Entity references are references to entities
  • Entities can be declared internal or external

An Internal Entity Declaration
Syntax:
<!ENTITY entity-name "entity-value">

XML example:
&writer;&copyright;
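To see internal entities being expanded, the Python sketch below declares two entities in an internal DTD subset and lets the standard library parser resolve the references. The entity names writer and copyright come from the example above; the entity values and the author element are placeholders invented for this illustration.

```python
import xml.etree.ElementTree as ET

# Entity values and the <author> element are hypothetical placeholders.
DOCUMENT = b"""<?xml version="1.0"?>
<!DOCTYPE author [
  <!ENTITY writer "Jane Example. ">
  <!ENTITY copyright "Copyright Example Org.">
]>
<author>&writer;&copyright;</author>"""

author = ET.fromstring(DOCUMENT)

# The parser replaces each entity reference with its declared value.
print(author.text)
```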

An External Entity Declaration
Syntax:
<!ENTITY entity-name SYSTEM "URI/URL">

XML example:
&writer;&copyright;

Internet Explorer 5.0 can validate your XML against a DTD.

Validating With the XML Parser
If you try to open an XML document, the XML Parser might generate an error. By accessing the parseError object, the exact error code, the error text, and even the line that caused the error can be retrieved.

Note: The load( ) method is used for files, while the loadXML( ) method is used for strings.

// Runs only in Internet Explorer, which provides the Microsoft.XMLDOM ActiveX parser
var xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
xmlDoc.async = false;            // wait until the whole file has been loaded
xmlDoc.validateOnParse = true;   // check the document against its DTD while parsing
xmlDoc.load("note_dtd_error.xml");

// parseError describes the first problem the parser found
document.write("<br />Error Code: ");
document.write(xmlDoc.parseError.errorCode);
document.write("<br />Error Reason: ");
document.write(xmlDoc.parseError.reason);
document.write("<br />Error Line: ");
document.write(xmlDoc.parseError.line);
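For readers without Internet Explorer, a comparable report can be produced in Python with the standard library. Note the difference in scope: this sketch reports well-formedness errors only, because the standard library parser does not validate against a DTD. The broken document below is made up for the example.

```python
import xml.etree.ElementTree as ET

# A deliberately broken note: <from> is closed with </form> on line 3.
BROKEN_NOTE = """<note>
<to>Tove</to>
<from>Jani</form>
</note>"""

try:
    ET.fromstring(BROKEN_NOTE)
    error = None
except ET.ParseError as exc:
    error = exc

if error is not None:
    line, column = error.position   # 1-based line, 0-based column
    print("Error Code:", error.code)
    print("Error Reason:", error)
    print("Error Line:", line)
```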

Turning Validation Off
Validation can be turned off by setting the XML parser's validateOnParse="false".

// Runs only in Internet Explorer, which provides the Microsoft.XMLDOM ActiveX parser
var xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
xmlDoc.async = false;             // wait until the whole file has been loaded
xmlDoc.validateOnParse = false;   // skip DTD validation; syntax is still checked
xmlDoc.load("note_dtd_error.xml");

document.write("<br />Error Code: ");
document.write(xmlDoc.parseError.errorCode);
document.write("<br />Error Reason: ");
document.write(xmlDoc.parseError.reason);
document.write("<br />Error Line: ");
document.write(xmlDoc.parseError.line);

A General XML Validator
To help you validate your xml files, we have created this link so that you can Validate any XML file.

The parseError Object
You can read more about the parseError object in our XML DOM tutorial.

DTD Examples:

TV Schedule DTD
By David Moisan. Copied from his Web: http://www.davidmoisan.org/

<!DOCTYPE TVSCHEDULE [
<!ELEMENT TVSCHEDULE (CHANNEL+)>
<!ELEMENT CHANNEL (BANNER,DAY+)>
<!ELEMENT BANNER (#PCDATA)>
<!ELEMENT DAY (DATE,(HOLIDAY|PROGRAMSLOT+)+)>
<!ELEMENT HOLIDAY (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT PROGRAMSLOT (TIME,TITLE,DESCRIPTION?)>
<!ELEMENT TIME (#PCDATA)>
<!ELEMENT TITLE (#PCDATA)>
<!ELEMENT DESCRIPTION (#PCDATA)>

<!ATTLIST TVSCHEDULE NAME CDATA #REQUIRED>
<!ATTLIST CHANNEL CHAN CDATA #REQUIRED>
<!ATTLIST PROGRAMSLOT VTR CDATA #IMPLIED>
<!ATTLIST TITLE RATING CDATA #IMPLIED>
<!ATTLIST TITLE LANGUAGE CDATA #IMPLIED>
]>

Newspaper Article DTD Copied from http://www.vervet.com/

]>

Product Catalog DTD Copied from http://www.vervet.com/



NAME CDATA #IMPLIED CATEGORY (HandTool|Table|Shop-Professional) "HandTool" PARTNUM CDATA #IMPLIED PLANT (Pittsburgh|Milwaukee|Chicago) "Chicago" INVENTORY (InStock|Backordered|Discontinued) "InStock"> ]>

DTD Summary
This tutorial has taught you how to describe the structure of an XML document. You have learned how to use a DTD to define the legal elements of an XML document, and how the DTD can be declared inside your XML document, or as an external reference. You have learned how to declare the legal elements, attributes, entities, and CDATA sections for XML documents. You have also seen how to validate an XML document against a DTD.

Now You Know DTD, What's Next?
The next step is to learn about XML Schema. XML Schema is used to define the legal elements of an XML document, just like a DTD. We think that very soon XML Schemas will be used in most Web applications as a replacement for DTDs. XML Schema is an XML-based alternative to DTD. Unlike DTDs, XML Schemas support data types and namespaces.


XML Schema XML Schema is an XML based alternative to DTD. W3C supports an alternative to DTD called XML Schema. You can read more about XML Schema in our Schema tutorial.

XML Validator: XML Errors will Stop you Errors in XML documents will stop your XML program. The W3C XML specification states that a program should not continue to process an XML document if it finds an error. The reason is that XML software should be easy to write, and that all XML documents should be compatible. With HTML it was possible to create documents with lots of errors (for example, by forgetting an end tag). One of the main reasons that HTML browsers are so big and incompatible is that each has its own way of working out what a document should look like when it encounters an HTML error. With XML this should not be possible.
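The "errors stop processing" rule can be observed with any conforming parser. The sketch below uses Python's standard xml.etree.ElementTree module rather than the Microsoft parser discussed in this tutorial; the helper name check_well_formed and the sample note are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# A note whose <to> element is never closed, so it is not well-formed XML.
BROKEN_NOTE = "<note><to>Tove<from>Jani</from></note>"
GOOD_NOTE = "<note><to>Tove</to><from>Jani</from></note>"


def check_well_formed(text):
    """Return None if the text parses, otherwise the parser's error message."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        # The parser refuses to go on; it does not try to guess a repair.
        return str(err)
    return None


error = check_well_formed(BROKEN_NOTE)
ok = check_well_formed(GOOD_NOTE)
print("broken note:", error)
print("good note:", ok)
```

Unlike an HTML browser, the parser does not try to repair the missing end tag; it reports the position of the error and stops.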

Syntax-check your XML - IE Only To help you syntax-check your XML, we have used Microsoft's XML parser to create an XML validator. Paste your XML in the text area below, and syntax-check it by pressing the "Validate" button.

Syntax-check your XML File - IE Only You can also syntax-check your XML file by typing the URL of your file into the input field below and then pressing the "Validate" button.

If you want to syntax-check an error-free XML file, you can paste the following address into the filename field: http://www.w3schools.com/xml/cd_catalog.xml Note: If you get the error "Access denied" when accessing this file, it is because your Internet Explorer security settings do not allow access across domains!

XML Browser Support Nearly all major browsers have support for XML and XSLT.

Mozilla Firefox As of version 1.0.2, Firefox has support for XML and XSLT (and CSS).

Mozilla Mozilla includes Expat for XML parsing and has support to display XML + CSS. Mozilla also has some support for Namespaces. Mozilla is available with an XSLT implementation.

Netscape As of version 8, Netscape uses the Mozilla engine, and therefore it has the same XML / XSLT support as Mozilla.

Opera As of version 9, Opera has support for XML and XSLT (and CSS). Version 8 supports only XML + CSS.

Internet Explorer As of version 6, Internet Explorer supports XML, Namespaces, CSS, XSLT, and XPath. Note: Internet Explorer 5 also has XML support, but the XSL part is NOT compatible with the official W3C XSL Recommendation!

Viewing XML Files Raw XML files can be viewed in Mozilla, Firefox, Opera, Internet Explorer, and Netscape 6+. However, to make XML documents display as nice web pages, you will have to add some display information.

Viewing XML Files In Firefox and Internet Explorer: Open the XML file (typically by clicking on a link). The XML document will be displayed with color-coded root and child elements. A plus (+) or minus (-) sign to the left of the elements can be clicked to expand or collapse the element structure. To view the raw XML source (without the + and - signs), select "View Page Source" or "View Source" from the browser menu.

In Netscape 6: Open the XML file, then right-click in the XML file and select "View Page Source". The XML document will then be displayed with color-coded root and child elements.

In Opera 7: Open the XML file, then right-click in the XML file and select "Frame" / "View Source". The XML document will be displayed as plain text.

In Opera 8: Open the XML file, then right-click in the XML file and select "Source". The XML document will be displayed as plain text.

Look at this XML file: note.xml

Note: Do not expect XML files to be formatted like HTML documents!

Viewing an Invalid XML File If an erroneous XML file is opened, the browser will report the error. Look at this XML file: note_error.xml

Other XML Examples Viewing some XML documents will help you get the XML feeling. An XML CD catalog This is my father's CD collection, stored as XML data (old and boring titles I guess... :-)). An XML plant catalog This is a plant catalog from a plant shop, stored as XML data. A Simple Food Menu This is a breakfast food menu from a restaurant, stored as XML data.

Why Does XML Display Like This? XML documents do not carry information about how to display the data. Since XML tags are "invented" by the author of the XML document, browsers do not know if a tag like <table> describes an HTML table or a dining table. Without any information about how to display the data, most browsers will just display the XML document as it is. In the next chapters, we will take a look at different solutions to the display problem, using CSS, XSL, JavaScript, and XML Data Islands.

Displaying XML with CSS: With CSS (Cascading Style Sheets) you can add display information to an XML document.


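Displaying your XML Files with CSS? It is possible to use CSS to format an XML document. Below is an example of how to use a CSS style sheet to format an XML document: Take a look at this XML file: The CD catalog Then look at this style sheet: The CSS file Finally, view: The CD catalog formatted with the CSS file Below is a fraction of the XML file. The second line, the <?xml-stylesheet?> processing instruction, links the XML file to the CSS file:

<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/css" href="cd_catalog.css"?>
<CATALOG>
  <CD>
    <TITLE>Empire Burlesque</TITLE>
    <ARTIST>Bob Dylan</ARTIST>
    <COUNTRY>USA</COUNTRY>
    <COMPANY>Columbia</COMPANY>
    <PRICE>10.90</PRICE>
    <YEAR>1985</YEAR>
  </CD>
  <CD>
    <TITLE>Hide your heart</TITLE>
    <ARTIST>Bonnie Tyler</ARTIST>
    <COUNTRY>UK</COUNTRY>
    <COMPANY>CBS Records</COMPANY>
    <PRICE>9.90</PRICE>
    <YEAR>1988</YEAR>
  </CD>
. . . .
</CATALOG>

Note: Formatting XML with CSS is NOT the future of how to style XML documents. XML documents should be styled by using the W3C's XSL standard!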

Displaying XML with XSL With XSL you can add display information to your XML document.

Displaying XML with XSL XSL is the preferred style sheet language of XML.

XSL (the eXtensible Stylesheet Language) is far more sophisticated than CSS. One way to use XSL is to transform XML into HTML before it is displayed by the browser, as demonstrated in these examples: View the XML file, the XSL style sheet, and View the result. Below is a fraction of the XML file. The second line, the <?xml-stylesheet?> processing instruction, links the XML file to the XSL file:

<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/xsl" href="simple.xsl"?>
<breakfast_menu>
  <food>
    <name>Belgian Waffles</name>
    <price>$5.95</price>
    <description>
      two of our famous Belgian Waffles
    </description>
    <calories>650</calories>
  </food>
</breakfast_menu>

If you want to learn more about XSL, please visit our XSL tutorial.

XSL Languages It started with XSL and ended up with XSLT, XPath, and XSL-FO.

It Started with XSL XSL stands for EXtensible Stylesheet Language. The World Wide Web Consortium (W3C) started to develop XSL because there was a need for an XML-based Stylesheet Language.

CSS = HTML Style Sheets HTML uses predefined tags, and the meaning of the tags is well understood. The <table> element in HTML defines a table, and a browser knows how to display it. Adding styles to HTML elements is simple: telling a browser to display an element in a special font or color is easy with CSS.

XSL = XML Style Sheets XML does not use predefined tags (we can use any tag names we like), and the meaning of these tags is not well understood. A <table> element could mean an HTML table, a piece of furniture, or something else, and a browser does not know how to display it. XSL describes how the XML document should be displayed!

XSL - More Than a Style Sheet Language XSL consists of three parts:

• XSLT - a language for transforming XML documents
• XPath - a language for navigating in XML documents
• XSL-FO - a language for formatting XML documents

This Tutorial is About XSLT The rest of this tutorial is about XSLT - the language for transforming XML documents. But you can also study our XPath Tutorial and our XSL-FO Tutorial.

Introduction to XSLT XSLT is a language for transforming XML documents into XHTML documents or to other XML documents. XPath is a language for navigating in XML documents.

What You Should Already Know Before you continue you should have a basic understanding of the following:

• HTML / XHTML
• XML / XML Namespaces
• XPath

If you want to study these subjects first, find the tutorials on our Home page.

What is XSLT?

• XSLT stands for XSL Transformations
• XSLT is the most important part of XSL
• XSLT transforms an XML document into another XML document
• XSLT uses XPath to navigate in XML documents
• XSLT is a W3C Recommendation

XSLT = XSL Transformations XSLT is the most important part of XSL.


XSLT is used to transform an XML document into another XML document, or another type of document that is recognized by a browser, like HTML and XHTML. Normally XSLT does this by transforming each XML element into an (X)HTML element. With XSLT you can add/remove elements and attributes to or from the output file. You can also rearrange and sort elements, perform tests and make decisions about which elements to hide and display, and a lot more. A common way to describe the transformation process is to say that XSLT transforms an XML source-tree into an XML result-tree.

XSLT Uses XPath XSLT uses XPath to find information in an XML document. XPath is used to navigate through elements and attributes in XML documents. If you want to study XPath first, please read our XPath Tutorial.

How Does it Work? In the transformation process, XSLT uses XPath to define parts of the source document that should match one or more predefined templates. When a match is found, XSLT will transform the matching part of the source document into the result document.

XSLT is a W3C Recommendation XSLT became a W3C Recommendation 16. November 1999. To read more about the XSLT activities at W3C, please read our W3C Tutorial.

XSLT Browsers Nearly all major browsers have support for XML and XSLT.

Mozilla Firefox As of version 1.0.2, Firefox has support for XML and XSLT (and CSS).

Mozilla Mozilla includes Expat for XML parsing and has support to display XML + CSS. Mozilla also has some support for Namespaces. Mozilla is available with an XSLT implementation.

Netscape As of version 8, Netscape uses the Mozilla engine, and therefore it has the same XML / XSLT support as Mozilla.

Opera As of version 9, Opera has support for XML and XSLT (and CSS). Version 8 supports only XML + CSS.

Internet Explorer As of version 6, Internet Explorer supports XML, Namespaces, CSS, XSLT, and XPath. Version 5 is NOT compatible with the official W3C XSL Recommendation.

XSLT – Transformation Example study: How to transform XML into XHTML using XSLT. The details of this example will be explained in the next chapter.

XSLT defines 37 elements, which break down into 3 overlapping categories:

Two root elements:

xsl:stylesheet
xsl:transform

12 top-level elements. These elements may appear as immediate children of the root and are the following:

xsl:attribute-set
xsl:decimal-format
xsl:import
xsl:include
xsl:key
xsl:namespace-alias
xsl:output
xsl:param
xsl:preserve-space
xsl:strip-space
xsl:template
xsl:variable

23 instruction elements. These elements appear in the content of elements that contain templates. Here we don't mean the xsl:template element. We mean the content of that and several other elements, such as xsl:for-each and xsl:message, which are composed of literal result elements, character data, and XSLT instructions that are processed to produce part of the result tree. These elements are as follows:

xsl:apply-imports
xsl:apply-templates
xsl:attribute
xsl:call-template
xsl:choose
xsl:comment
xsl:copy
xsl:copy-of
xsl:element
xsl:fallback
xsl:for-each
xsl:if
xsl:message
xsl:number
xsl:otherwise
xsl:param
xsl:processing-instruction
xsl:sort
xsl:text
xsl:value-of
xsl:variable
xsl:when
xsl:with-param

Correct Style Sheet Declaration The root element that declares the document to be an XSL style sheet is <xsl:stylesheet> or <xsl:transform>. Note: <xsl:stylesheet> and <xsl:transform> are completely synonymous and either can be used! The correct way to declare an XSL style sheet according to the W3C XSLT Recommendation is:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

or:

<xsl:transform version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

To get access to the XSLT elements, attributes and features we must declare the XSLT namespace at the top of the document. The xmlns:xsl="http://www.w3.org/1999/XSL/Transform" declaration points to the official W3C XSLT namespace. If you use this namespace, you must also include the attribute version="1.0".
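What the namespace declaration does can be seen by parsing a style sheet with a namespace-aware parser. The sketch below uses Python's standard xml.etree.ElementTree (not an XSLT processor) and a deliberately empty style sheet; the variable names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

XSLT_NS = "http://www.w3.org/1999/XSL/Transform"

# A deliberately minimal style sheet: one empty template for the document root.
STYLESHEET = """<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/"/>
</xsl:stylesheet>"""

root = ET.fromstring(STYLESHEET)

# ElementTree expands a prefixed name to "{namespace-uri}local-name", so the
# prefix "xsl" itself is irrelevant; only the namespace URI identifies XSLT.
is_xslt_root = root.tag in (
    "{%s}stylesheet" % XSLT_NS,
    "{%s}transform" % XSLT_NS,
)
version = root.get("version")
print(root.tag, version, is_xslt_root)
```

The check accepts either root element because the two names are synonymous; a processor identifies the elements by the namespace URI, not by the literal prefix.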


Start with a Raw XML Document We want to transform the following XML document ("cdcatalog.xml") into XHTML:

<?xml version="1.0" encoding="ISO-8859-1"?>
<catalog>
  <cd>
    <title>Empire Burlesque</title>
    <artist>Bob Dylan</artist>
    <country>USA</country>
    <company>Columbia</company>
    <price>10.90</price>
    <year>1985</year>
  </cd>
. . .
</catalog>

Viewing XML Files in Firefox and Internet Explorer: Open the XML file (typically by clicking on a link). The XML document will be displayed with color-coded root and child elements. A plus (+) or minus (-) sign to the left of the elements can be clicked to expand or collapse the element structure. To view the raw XML source (without the + and - signs), select "View Page Source" or "View Source" from the browser menu.

Viewing XML Files in Netscape 6: Open the XML file, then right-click in the XML file and select "View Page Source". The XML document will then be displayed with color-coded root and child elements.

Viewing XML Files in Opera 7: Open the XML file, then right-click in the XML file and select "Frame" / "View Source". The XML document will be displayed as plain text.

View "cdcatalog.xml"

Create an XSL Style Sheet Then you create an XSL Style Sheet ("cdcatalog.xsl") with a transformation template:

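<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
  <html>
  <body>
    <h2>My CD Collection</h2>
    <table border="1">
      <tr>
        <th>Title</th>
        <th>Artist</th>
      </tr>
      <xsl:for-each select="catalog/cd">
      <tr>
        <td><xsl:value-of select="title"/></td>
        <td><xsl:value-of select="artist"/></td>
      </tr>
      </xsl:for-each>
    </table>
  </body>
  </html>
</xsl:template>
</xsl:stylesheet>

View "cdcatalog.xsl"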

Link the XSL Style Sheet to the XML Document Add the XSL style sheet reference to your XML document ("cdcatalog.xml"):

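<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/xsl" href="cdcatalog.xsl"?>
<catalog>
  <cd>
    <title>Empire Burlesque</title>
    <artist>Bob Dylan</artist>
    <country>USA</country>
    <company>Columbia</company>
    <price>10.90</price>
    <year>1985</year>
  </cd>
. . .
</catalog>

If you have an XSLT-compliant browser, it will transform your XML into XHTML. View the result The details of the example above will be explained in the next chapters.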

XSLT <xsl:template> Element An XSL style sheet consists of one or more sets of rules that are called templates. Each template contains rules to apply when a specified node is matched.

The <xsl:template> Element The <xsl:template> element is used to build templates. The match attribute is used to associate a template with an XML element. The match attribute can also be used to define a template for the entire XML document. The value of the match attribute is an XPath expression (e.g. match="/" defines the whole document). Let's look at a simplified version of the XSL file from the previous chapter:



<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
  <html>
  <body>
    <h2>My CD Collection</h2>
    <table border="1">
      <tr>
        <th>Title</th>
        <th>Artist</th>
      </tr>
      <tr>
        <td>.</td>
        <td>.</td>
      </tr>
    </table>
  </body>
  </html>
</xsl:template>
</xsl:stylesheet>

Since an XSL style sheet is an XML document itself, it always begins with the XML declaration: <?xml version="1.0" encoding="ISO-8859-1"?>. The next element, <xsl:stylesheet>, defines that this document is an XSLT style sheet document (along with the version number and XSLT namespace attributes). The <xsl:template> element defines a template. The match="/" attribute associates the template with the root of the XML source document. The content inside the <xsl:template> element defines some HTML to write to the output. The last two lines define the end of the template and the end of the style sheet. The result of the transformation above will look like this:

My CD Collection

Title   Artist
.       .

View the XML file, View the XSL file, and View the result The result from this example was a little disappointing, because no data was copied from the XML document to the output. In the next chapter you will learn how to use the <xsl:value-of> element to select values from the XML elements.

XSLT <xsl:value-of> Element The <xsl:value-of> element is used to extract the value of a selected node.

The <xsl:value-of> Element The <xsl:value-of> element can be used to extract the value of an XML element and add it to the output stream of the transformation:

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
  <html>
  <body>
    <h2>My CD Collection</h2>
    <table border="1">
      <tr>
        <th>Title</th>
        <th>Artist</th>
      </tr>
      <tr>
        <td><xsl:value-of select="catalog/cd/title"/></td>
        <td><xsl:value-of select="catalog/cd/artist"/></td>
      </tr>
    </table>
  </body>
  </html>
</xsl:template>
</xsl:stylesheet>

Note: The value of the select attribute is an XPath expression. An XPath expression works like navigating a file system, where a forward slash (/) selects subdirectories. The result of the transformation above will look like this:

My CD Collection

Title              Artist
Empire Burlesque   Bob Dylan

View the XML file, View the XSL file, and View the result The result from this example was also a little disappointing, because only one line of data was copied from the XML document to the output: when the expression selects several nodes, <xsl:value-of> in XSLT 1.0 outputs only the first one. In the next chapter you will learn how to use the <xsl:for-each> element to loop through the XML elements, and display all of the records.
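The "only the first match" behaviour can be mirrored outside XSLT. The sketch below uses Python's standard xml.etree.ElementTree, which is not an XSLT processor but supports a small XPath subset; the two-CD sample and the variable names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Two records in the same shape as cdcatalog.xml.
CATALOG = """<catalog>
  <cd><title>Empire Burlesque</title><artist>Bob Dylan</artist></cd>
  <cd><title>Hide your heart</title><artist>Bonnie Tyler</artist></cd>
</catalog>"""

# fromstring() returns the <catalog> element itself, so the path starts at
# "cd" rather than "catalog/cd" as it does in the style sheet.
catalog = ET.fromstring(CATALOG)

# find() returns only the first matching element, much as <xsl:value-of>
# outputs only the first node selected by its expression in XSLT 1.0.
first_title = catalog.find("cd/title").text
first_artist = catalog.find("cd/artist").text
print(first_title, "-", first_artist)
```

Both CDs match the path, but only the first one is read, which is why the transformed table contains a single row.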

XSLT <xsl:for-each> Element The <xsl:for-each> element allows you to do looping in XSLT.

The <xsl:for-each> Element The XSL <xsl:for-each> element can be used to select every XML element of a specified node-set:

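<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
  <html>
  <body>
    <h2>My CD Collection</h2>
    <table border="1">
      <tr>
        <th>Title</th>
        <th>Artist</th>
      </tr>
      <xsl:for-each select="catalog/cd">
      <tr>
        <td><xsl:value-of select="title"/></td>
        <td><xsl:value-of select="artist"/></td>
      </tr>
      </xsl:for-each>
    </table>
  </body>
  </html>
</xsl:template>
</xsl:stylesheet>

Note: The value of the select attribute is an XPath expression. An XPath expression works like navigating a file system, where a forward slash (/) selects subdirectories. Inside the loop, the paths title and artist are evaluated relative to the current cd element. The result of the transformation above will look like this: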

My CD Collection

Title                      Artist
Empire Burlesque           Bob Dylan
Hide your heart            Bonnie Tyler
Greatest Hits              Dolly Parton
Still got the blues        Gary Moore
Eros                       Eros Ramazzotti
One night only             Bee Gees
Sylvias Mother             Dr.Hook
Maggie May                 Rod Stewart
Romanza                    Andrea Bocelli
When a man loves a woman   Percy Sledge
Black angel                Savage Rose
1999 Grammy Nominees       Many
For the good times         Kenny Rogers
Big Willie style           Will Smith
Tupelo Honey               Van Morrison
Soulsville                 Jorn Hoel
The very best of           Cat Stevens
Stop                       Sam Brown
Bridge of Spies            T`Pau
Private Dancer             Tina Turner
Midt om natten             Kim Larsen
Pavarotti Gala Concert     Luciano Pavarotti
The dock of the bay        Otis Redding
Picture book               Simply Red
Red                        The Communards
Unchain my heart           Joe Cocker

View the XML file, View the XSL file, and View the result
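The loop performed by <xsl:for-each select="catalog/cd"> can be mirrored in a general-purpose language. The sketch below uses Python's standard xml.etree.ElementTree rather than an XSLT processor; the two-CD sample and the name rows are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

CATALOG = """<catalog>
  <cd><title>Empire Burlesque</title><artist>Bob Dylan</artist></cd>
  <cd><title>Hide your heart</title><artist>Bonnie Tyler</artist></cd>
</catalog>"""

catalog = ET.fromstring(CATALOG)

# findall("cd") plays the role of select="catalog/cd": it yields every <cd>
# child in document order. Inside the loop, "title" and "artist" are looked
# up relative to the current <cd>, just like the select paths in the XSL.
rows = []
for cd in catalog.findall("cd"):
    rows.append((cd.findtext("title"), cd.findtext("artist")))

for title, artist in rows:
    print(title, "|", artist)
```

Each pass through the loop corresponds to one table row in the transformed output.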

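Filtering the Output We can also filter the output from the XML file by adding a criterion to the select attribute in the <xsl:for-each> element.

<xsl:for-each select="catalog/cd[artist='Bob Dylan']">

Legal filter operators are:

• = (equal)
• != (not equal)
• &lt; (less than)
• &gt; (greater than)

Inside a style sheet the less-than operator must be written as &lt;, because a literal < is not allowed in an XML attribute value. The greater-than operator may be written either as > or as &gt;.

Take a look at the adjusted XSL style sheet: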

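<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
  <html>
  <body>
    <h2>My CD Collection</h2>
    <table border="1">
      <tr>
        <th>Title</th>
        <th>Artist</th>
      </tr>
      <xsl:for-each select="catalog/cd[artist='Bob Dylan']">
      <tr>
        <td><xsl:value-of select="title"/></td>
        <td><xsl:value-of select="artist"/></td>
      </tr>
      </xsl:for-each>
    </table>
  </body>
  </html>
</xsl:template>
</xsl:stylesheet>

The result of the transformation above will look like this:

My CD Collection

Title              Artist
Empire Burlesque   Bob Dylan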
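The predicate in square brackets is ordinary XPath, so the same filter can be tried outside a style sheet. The sketch below uses Python's standard xml.etree.ElementTree, which supports the [child='text'] form of predicate (it does not implement the full XPath language); the sample data and variable names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

CATALOG = """<catalog>
  <cd><title>Empire Burlesque</title><artist>Bob Dylan</artist></cd>
  <cd><title>Hide your heart</title><artist>Bonnie Tyler</artist></cd>
</catalog>"""

catalog = ET.fromstring(CATALOG)

# Same predicate as in the style sheet: keep only the <cd> elements that
# have an <artist> child whose text is exactly "Bob Dylan".
dylan_cds = catalog.findall("cd[artist='Bob Dylan']")
dylan_titles = [cd.findtext("title") for cd in dylan_cds]
print(dylan_titles)
```

The comparison is an exact string match, so a value such as "bob dylan" would not be selected.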

XSLT <xsl:sort> Element The <xsl:sort> element is used to sort the output.

Where to put the Sort Information To sort the output, simply add an <xsl:sort> element inside the <xsl:for-each> element in the XSL file:

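<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
  <html>
  <body>
    <h2>My CD Collection</h2>
    <table border="1">
      <tr>
        <th>Title</th>
        <th>Artist</th>
      </tr>
      <xsl:for-each select="catalog/cd">
      <xsl:sort select="artist"/>
      <tr>
        <td><xsl:value-of select="title"/></td>
        <td><xsl:value-of select="artist"/></td>
      </tr>
      </xsl:for-each>
    </table>
  </body>
  </html>
</xsl:template>
</xsl:stylesheet>

Note: The select attribute indicates what XML element to sort on. The <xsl:sort> element must come first inside <xsl:for-each>, before any output. The result of the transformation above will look like this: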

My CD Collection

Title                      Artist
Romanza                    Andrea Bocelli
One night only             Bee Gees
Empire Burlesque           Bob Dylan
Hide your heart            Bonnie Tyler
The very best of           Cat Stevens
Greatest Hits              Dolly Parton
Sylvias Mother             Dr.Hook
Eros                       Eros Ramazzotti
Still got the blues        Gary Moore
Unchain my heart           Joe Cocker
Soulsville                 Jorn Hoel
For the good times         Kenny Rogers
Midt om natten             Kim Larsen
Pavarotti Gala Concert     Luciano Pavarotti
1999 Grammy Nominees       Many
The dock of the bay        Otis Redding
When a man loves a woman   Percy Sledge
Maggie May                 Rod Stewart
Stop                       Sam Brown
Black angel                Savage Rose
Picture book               Simply Red
Bridge of Spies            T`Pau
Red                        The Communards
Private Dancer             Tina Turner
Tupelo Honey               Van Morrison
Big Willie style           Will Smith

View the XML file, View the XSL file, and View the result
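Sorting on a child element, as <xsl:sort select="artist"/> does, amounts to sorting the selected nodes by a key before they are written out. The sketch below shows the idea with Python's standard library rather than an XSLT processor; the three-CD sample and the variable names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

CATALOG = """<catalog>
  <cd><title>Empire Burlesque</title><artist>Bob Dylan</artist></cd>
  <cd><title>Hide your heart</title><artist>Bonnie Tyler</artist></cd>
  <cd><title>Romanza</title><artist>Andrea Bocelli</artist></cd>
</catalog>"""

catalog = ET.fromstring(CATALOG)

# The key function plays the role of select="artist": each <cd> is ordered
# by the text of its <artist> child. sorted() compares strings by code
# point, whereas an XSLT processor may apply language-specific collation.
sorted_cds = sorted(catalog.findall("cd"), key=lambda cd: cd.findtext("artist"))
titles_by_artist = [cd.findtext("title") for cd in sorted_cds]
print(titles_by_artist)
```

The source document is not changed; only the order in which the records are emitted differs.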

XSLT <xsl:if> Element The <xsl:if> element is used to put a conditional test against the content of the XML file.

The <xsl:if> Element To put a conditional if test against the content of the XML file, add an <xsl:if> element to the XSL document.

Syntax

<xsl:if test="expression">
  ...some output if the expression is true...
</xsl:if>

Where to Put the <xsl:if> Element To add a conditional test, add the <xsl:if> element inside the <xsl:for-each> element in the XSL file:

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
  <html>
  <body>
    <h2>My CD Collection</h2>
    <table border="1">
      <tr>
        <th>Title</th>
        <th>Artist</th>
      </tr>
      <xsl:for-each select="catalog/cd">
      <xsl:if test="price > 10">
      <tr>
        <td><xsl:value-of select="title"/></td>
        <td><xsl:value-of select="artist"/></td>
      </tr>
      </xsl:if>
      </xsl:for-each>
    </table>
  </body>
  </html>
</xsl:template>
</xsl:stylesheet>

Note: The value of the required test attribute contains the expression to be evaluated. The code above will only output the title and artist elements of the CDs that have a price higher than 10. The result of the transformation above will look like this:

My CD Collection

Title                  Artist
Empire Burlesque       Bob Dylan
Still got the blues    Gary Moore
One night only         Bee Gees
Romanza                Andrea Bocelli
Black angel            Savage Rose
1999 Grammy Nominees   Many

View the XML file, View the XSL file, and View the result
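The test price > 10 compares the element's text as a number. The sketch below mirrors that check in Python with the standard xml.etree.ElementTree module, not an XSLT processor; the two-CD sample uses the prices quoted earlier in this tutorial, and the variable names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

CATALOG = """<catalog>
  <cd><title>Empire Burlesque</title><artist>Bob Dylan</artist><price>10.90</price></cd>
  <cd><title>Hide your heart</title><artist>Bonnie Tyler</artist><price>9.90</price></cd>
</catalog>"""

catalog = ET.fromstring(CATALOG)

expensive = []
for cd in catalog.findall("cd"):
    # XPath converts the text of <price> to a number before comparing;
    # in Python the conversion has to be written out explicitly.
    if float(cd.findtext("price")) > 10:
        expensive.append(cd.findtext("title"))

print(expensive)
```

A record that fails the test simply produces no output, which is why the result table is shorter than the full catalog.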

XSLT <xsl:choose> Element The <xsl:choose> element is used in conjunction with <xsl:when> and <xsl:otherwise> to express multiple conditional tests.

The <xsl:choose> Element Syntax

<xsl:choose>
  <xsl:when test="expression">
    ... some output ...
  </xsl:when>
  <xsl:otherwise>
    ... some output ....
  </xsl:otherwise>
</xsl:choose>

Where to put the Choose Condition To insert a multiple conditional test against the XML file, add the <xsl:choose>, <xsl:when>, and <xsl:otherwise> elements to the XSL file:

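<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
  <html>
  <body>
    <h2>My CD Collection</h2>
    <table border="1">
      <tr>
        <th>Title</th>
        <th>Artist</th>
      </tr>
      <xsl:for-each select="catalog/cd">
      <tr>
        <td><xsl:value-of select="title"/></td>
        <xsl:choose>
          <xsl:when test="price > 10">
            <td bgcolor="#ff00ff"><xsl:value-of select="artist"/></td>
          </xsl:when>
          <xsl:otherwise>
            <td><xsl:value-of select="artist"/></td>
          </xsl:otherwise>
        </xsl:choose>
      </tr>
      </xsl:for-each>
    </table>
  </body>
  </html>
</xsl:template>
</xsl:stylesheet>

The code above will add a pink background color to the "Artist" column WHEN the price of the CD is higher than 10. The result of the transformation will look like this: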

My CD Collection

Title                      Artist
Empire Burlesque           Bob Dylan
Hide your heart            Bonnie Tyler
Greatest Hits              Dolly Parton
Still got the blues        Gary Moore
Eros                       Eros Ramazzotti
One night only             Bee Gees
Sylvias Mother             Dr.Hook
Maggie May                 Rod Stewart
Romanza                    Andrea Bocelli
When a man loves a woman   Percy Sledge
Black angel                Savage Rose
1999 Grammy Nominees       Many
For the good times         Kenny Rogers
Big Willie style           Will Smith
Tupelo Honey               Van Morrison
Soulsville                 Jorn Hoel
The very best of           Cat Stevens
Stop                       Sam Brown
Bridge of Spies            T`Pau
Private Dancer             Tina Turner
Midt om natten             Kim Larsen
Pavarotti Gala Concert     Luciano Pavarotti
The dock of the bay        Otis Redding
Picture book               Simply Red
Red                        The Communards
Unchain my heart           Joe Cocker

View the XML file, View the XSL file, and View the result

Another Example Here is another example that contains two <xsl:when> elements:

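<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
  <html>
  <body>
    <h2>My CD Collection</h2>
    <table border="1">
      <tr>
        <th>Title</th>
        <th>Artist</th>
      </tr>
      <xsl:for-each select="catalog/cd">
      <tr>
        <td><xsl:value-of select="title"/></td>
        <xsl:choose>
          <xsl:when test="price > 10">
            <td bgcolor="#ff00ff"><xsl:value-of select="artist"/></td>
          </xsl:when>
          <xsl:when test="price > 9">
            <td bgcolor="#cccccc"><xsl:value-of select="artist"/></td>
          </xsl:when>
          <xsl:otherwise>
            <td><xsl:value-of select="artist"/></td>
          </xsl:otherwise>
        </xsl:choose>
      </tr>
      </xsl:for-each>
    </table>
  </body>
  </html>
</xsl:template>
</xsl:stylesheet>

The code above will add a pink background color to the "Artist" column WHEN the price of the CD is higher than 10, and a grey background color WHEN the price of the CD is higher than 9 and lower than or equal to 10. The <xsl:when> tests are evaluated in order and only the first one that is true is used, so the order of the two tests matters. The result of the transformation will look like this: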

My CD Collection

Title                      Artist
Empire Burlesque           Bob Dylan
Hide your heart            Bonnie Tyler
Greatest Hits              Dolly Parton
Still got the blues        Gary Moore
Eros                       Eros Ramazzotti
One night only             Bee Gees
Sylvias Mother             Dr.Hook
Maggie May                 Rod Stewart
Romanza                    Andrea Bocelli
When a man loves a woman   Percy Sledge
Black angel                Savage Rose
1999 Grammy Nominees       Many
For the good times         Kenny Rogers
Big Willie style           Will Smith
Tupelo Honey               Van Morrison
Soulsville                 Jorn Hoel
The very best of           Cat Stevens
Stop                       Sam Brown
Bridge of Spies            T`Pau
Private Dancer             Tina Turner
Midt om natten             Kim Larsen
Pavarotti Gala Concert     Luciano Pavarotti
The dock of the bay        Otis Redding
Picture book               Simply Red
Red                        The Communards
Unchain my heart           Joe Cocker

View the XML file, View the XSL file, and View the result
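An <xsl:choose> with several <xsl:when> branches behaves like an if / elif / else chain: the first test that is true wins. The sketch below mirrors the two-test example in Python, not in an XSLT processor. The function name artist_background, the colour codes standing in for pink and grey, and the third CD with its price are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# The third record is hypothetical; it is there to exercise the
# <xsl:otherwise> branch.
CATALOG = """<catalog>
  <cd><title>Empire Burlesque</title><price>10.90</price></cd>
  <cd><title>Hide your heart</title><price>9.90</price></cd>
  <cd><title>Demo Disc</title><price>8.50</price></cd>
</catalog>"""


def artist_background(price):
    """Return the background colour for the Artist cell, or None for no colour."""
    if price > 10:      # first <xsl:when>
        return "#ff00ff"
    elif price > 9:     # second <xsl:when>, reached only if the first failed
        return "#cccccc"
    else:               # <xsl:otherwise>
        return None


catalog = ET.fromstring(CATALOG)
backgrounds = {
    cd.findtext("title"): artist_background(float(cd.findtext("price")))
    for cd in catalog.findall("cd")
}
print(backgrounds)
```

Swapping the two tests would colour every CD above 9 grey, because the broader test would match first.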

XSLT <xsl:apply-templates> Element The <xsl:apply-templates> element applies a template to the current element or to the current element's child nodes.

The <xsl:apply-templates> Element The <xsl:apply-templates> element applies a template to the current element or to the current element's child nodes. If we add a select attribute to the <xsl:apply-templates> element it will process only the child element that matches the value of the attribute. We can use the select attribute to specify the order in which the child nodes are processed. Look at the following XSL style sheet:

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="/">
  <html>
  <body>
    <h2>My CD Collection</h2>
    <xsl:apply-templates/>
  </body>
  </html>
</xsl:template>

<xsl:template match="cd">
  <p>
    <xsl:apply-templates select="title"/>
    <xsl:apply-templates select="artist"/>
  </p>
</xsl:template>

<xsl:template match="title">
  Title: <span style="color:#ff0000">
  <xsl:value-of select="."/></span>
  <br />
</xsl:template>

<xsl:template match="artist">
  Artist: <span style="color:#00ff00">
  <xsl:value-of select="."/></span>
  <br />
</xsl:template>

</xsl:stylesheet>

The result of the transformation will look like this:

My CD Collection

Title: Empire Burlesque
Artist: Bob Dylan
Title: Hide your heart
Artist: Bonnie Tyler
Title: Greatest Hits
Artist: Dolly Parton
Title: Still got the blues
Artist: Gary Moore
Title: Eros
Artist: Eros Ramazzotti
Title: One night only
Artist: Bee Gees
Title: Sylvias Mother
Artist: Dr.Hook
Title: Maggie May
Artist: Rod Stewart
Title: Romanza
Artist: Andrea Bocelli
Title: When a man loves a woman
Artist: Percy Sledge
Title: Black angel
Artist: Savage Rose
Title: 1999 Grammy Nominees
Artist: Many
Title: For the good times
Artist: Kenny Rogers
Title: Big Willie style
Artist: Will Smith
Title: Tupelo Honey
Artist: Van Morrison
Title: Soulsville
Artist: Jorn Hoel
Title: The very best of
Artist: Cat Stevens
Title: Stop
Artist: Sam Brown
Title: Bridge of Spies
Artist: T`Pau
Title: Private Dancer
Artist: Tina Turner
Title: Midt om natten
Artist: Kim Larsen
Title: Pavarotti Gala Concert
Artist: Luciano Pavarotti
Title: The dock of the bay
Artist: Otis Redding
Title: Picture book
Artist: Simply Red
Title: Red
Artist: The Communards
Title: Unchain my heart
Artist: Joe Cocker

View the XML file, View the XSL file, and View the result.
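One way to picture <xsl:apply-templates> is as a dispatch step: for each selected node, the processor looks up the template whose match pattern fits and runs it. The sketch below imitates that with a Python dictionary of functions. It is a simplification under stated assumptions (real XSLT also has built-in templates, priorities and modes), and every name in it is hypothetical.

```python
import xml.etree.ElementTree as ET

CATALOG = """<catalog>
  <cd><title>Empire Burlesque</title><artist>Bob Dylan</artist></cd>
  <cd><title>Hide your heart</title><artist>Bonnie Tyler</artist></cd>
</catalog>"""


def title_template(node):
    # Stands in for <xsl:template match="title">.
    return ["Title: " + node.text]


def artist_template(node):
    # Stands in for <xsl:template match="artist">.
    return ["Artist: " + node.text]


def cd_template(node):
    # Stands in for <xsl:template match="cd">: titles are processed first,
    # then artists, because that is the order of the two select attributes.
    return apply_templates(node.findall("title")) + apply_templates(node.findall("artist"))


TEMPLATES = {"cd": cd_template, "title": title_template, "artist": artist_template}


def apply_templates(nodes):
    """Run the template registered for each node's tag and collect the output."""
    output = []
    for node in nodes:
        template = TEMPLATES.get(node.tag)
        if template is not None:
            output.extend(template(node))
    return output


catalog = ET.fromstring(CATALOG)
lines = apply_templates(list(catalog))  # the children of <catalog>
print("\n".join(lines))
```

Reordering the two calls inside cd_template would print the artist before the title, which is the sense in which the select attribute controls processing order.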

XSLT - On the Client If your browser supports it, XSLT can be used to transform the document to XHTML in your browser.

A JavaScript Solution In the previous chapters we explained how XSLT can be used to transform a document from XML to XHTML. We did this by adding an XSL style sheet reference to the XML file and letting the browser do the transformation. Even though this works fine, it is not always desirable to include a style sheet reference in an XML file (e.g. it will not work in a browser that is not XSLT-aware). A more versatile solution is to use JavaScript to do the transformation. By using JavaScript, we can:

• do browser-specific testing
• use different style sheets according to browser and user needs

That is the beauty of XSLT! One of the design goals for XSLT was to make it possible to transform data from one format to another, supporting different browsers and different user needs. XSLT transformation on the client side is bound to be a major part of the browsers' work in the future, as we will see growth in the specialized browser market (Braille, aural browsers, Web printers, handheld devices, etc.).

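The XML File and the XSL File Look at the XML document that you have seen in the previous chapters:

<?xml version="1.0" encoding="ISO-8859-1"?>
<catalog>
  <cd>
    <title>Empire Burlesque</title>
    <artist>Bob Dylan</artist>
    <country>USA</country>
    <company>Columbia</company>
    <price>10.90</price>
    <year>1985</year>
  </cd>
. . .
</catalog>

View the XML file. And the accompanying XSL style sheet:

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
  <html>
  <body>
    <h2>My CD Collection</h2>
    <table border="1">
      <tr>
        <th>Title</th>
        <th>Artist</th>
      </tr>
      <xsl:for-each select="catalog/cd">
      <tr>
        <td><xsl:value-of select="title" /></td>
        <td><xsl:value-of select="artist" /></td>
      </tr>
      </xsl:for-each>
    </table>
  </body>
  </html>
</xsl:template>
</xsl:stylesheet>

View the XSL file. Notice that the XML file does not have a reference to the XSL file. IMPORTANT: Because the XML file carries no style sheet reference, the same XML file can be transformed using many different XSL style sheets.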

Transforming XML to XHTML in the Browser Here is the source code needed to transform the XML file to XHTML on the client:

<html>
<body>
<script type="text/javascript">
// Load XML
var xml = new ActiveXObject("Microsoft.XMLDOM");
xml.async = false;
xml.load("cdcatalog.xml");

// Load XSL
var xsl = new ActiveXObject("Microsoft.XMLDOM");
xsl.async = false;
xsl.load("cdcatalog.xsl");

// Transform
document.write(xml.transformNode(xsl));
</script>
</body>
</html>

Tip: If you don't know how to write JavaScript, you can study our JavaScript tutorial. The first block of code creates an instance of the Microsoft XML parser (XMLDOM), and loads the XML file into memory. The second block of code creates another instance of the parser and loads the XSL file into memory. The last line of code transforms the XML document using the XSL document, and displays the result as XHTML in your browser. Because the script relies on ActiveXObject, it runs only in Internet Explorer. See how it works in IE.

XSLT - On the Server Since not all browsers support XSLT, one solution is to transform the XML to XHTML on the server.

A Cross Browser Solution In the previous chapter we explained how XSLT can be used to transform a document from XML to XHTML in the browser. We created a JavaScript that used an XML parser to do the transformation. The JavaScript solution will not work in a browser that doesn't have an XML parser. To make XML data available to all kinds of browsers, we must transform the XML document on the SERVER and send it as XHTML back to the browser. That's another beauty of XSLT. One of the design goals for XSLT was to make it possible to transform data from one format to another on a server, returning readable data to all kinds of browsers.

The XML File and the XSLT File Look at the XML document that you have seen in the previous chapters:

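<?xml version="1.0" encoding="ISO-8859-1"?>
<catalog>
  <cd>
    <title>Empire Burlesque</title>
    <artist>Bob Dylan</artist>
    <country>USA</country>
    <company>Columbia</company>
    <price>10.90</price>
    <year>1985</year>
  </cd>
. . .
</catalog>

View the XML file. And the accompanying XSL style sheet:

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
  <html>
  <body>
    <h2>My CD Collection</h2>
    <table border="1">
      <tr>
        <th>Title</th>
        <th>Artist</th>
      </tr>
      <xsl:for-each select="catalog/cd">
      <tr>
        <td><xsl:value-of select="title" /></td>
        <td><xsl:value-of select="artist" /></td>
      </tr>
      </xsl:for-each>
    </table>
  </body>
  </html>
</xsl:template>
</xsl:stylesheet>

View the XSL file. Notice that the XML file does not have a reference to the XSL file. IMPORTANT: Because the XML file carries no style sheet reference, the same XML file can be transformed using many different XSL style sheets.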

Transforming XML to XHTML on the Server Here is the ASP source code needed to transform the XML file to XHTML on the server:

<%
'Load XML
set xml = Server.CreateObject("Microsoft.XMLDOM")
xml.async = false
xml.load(Server.MapPath("cdcatalog.xml"))

'Load XSL
set xsl = Server.CreateObject("Microsoft.XMLDOM")
xsl.async = false
xsl.load(Server.MapPath("cdcatalog.xsl"))

'Transform file
Response.Write(xml.transformNode(xsl))
%>

Tip: If you don't know how to write ASP, you can study our ASP tutorial. The first block of code creates an instance of the Microsoft XML parser (XMLDOM), and loads the XML file into memory. The second block of code creates another instance of the parser and loads the XSL file into memory. The last line of code transforms the XML document using the XSL document, and sends the result as XHTML to your browser. Nice! See how it works.

XSLT - Editing XML Data stored in XML files can be edited from an Internet browser.

Open, Edit and Save XML Now, we will show how to open, edit, and save an XML file that is stored on the server. We will use XSL to transform the XML document into an HTML form. The values of the XML elements will be written to HTML input fields in an HTML form. The HTML form is editable. After editing the data, the data is going to be submitted back to the server and the XML file will be updated (this part is done with ASP).

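The XML File and the XSL File First, look at the XML document that will be used ("tool.xml"):

<tool>
  <field id="prodName">
    <value>HAMMER HG2606</value>
  </field>
  <field id="prodNo">
    <value>32456240</value>
  </field>
  <field id="price">
    <value>$30.00</value>
  </field>
</tool>

Then, take a look at the following style sheet ("tool.xsl"):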

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
  <html>
  <body>
  <form method="post" action="edittool.asp">
    <h2>Tool Information (edit):</h2>
    <table border="0">
      <xsl:for-each select="tool/field">
      <tr>
        <td><xsl:value-of select="@id"/></td>
        <td>
          <input type="text">
            <xsl:attribute name="id">
              <xsl:value-of select="@id" />
            </xsl:attribute>
            <xsl:attribute name="name">
              <xsl:value-of select="@id" />
            </xsl:attribute>
            <xsl:attribute name="value">
              <xsl:value-of select="value" />
            </xsl:attribute>
          </input>
        </td>
      </tr>
      </xsl:for-each>
    </table>
    <br />
    <input type="submit" id="btn_sub" name="btn_sub" value="Submit" />
    <input type="reset" id="btn_res" name="btn_res" value="Reset" />
  </form>
  </body>
  </html>
</xsl:template>
</xsl:stylesheet>

The XSL file above loops through the elements in the XML file and creates one input field for each XML "field" element. The value of the XML "field" element's "id" attribute is added to both the "id" and "name" attributes of each HTML input field. The value of each XML "value" element is added to the "value" attribute of each HTML input field. The result is an editable HTML form that contains the values from the XML file.

Then, we have a second style sheet: "tool_updated.xsl". This is the XSL file that will be used to display the updated XML data. This style sheet will not result in an editable HTML form, but a static HTML table:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
  <html>
  <body>
    <h2>Updated Tool Information:</h2>
    <table border="1">
      <xsl:for-each select="tool/field">
      <tr>
        <td><xsl:value-of select="@id" /></td>
        <td><xsl:value-of select="value" /></td>
      </tr>
      </xsl:for-each>
    </table>
  </body>
  </html>
</xsl:template>
</xsl:stylesheet>

The ASP File The HTML form in the "tool.xsl" file above has an action attribute with a value of "edittool.asp". The "edittool.asp" page contains two functions: The loadFile() function loads and transforms the XML file for display and the updateFile() function applies the changes to the XML file:

<%
function loadFile(xmlfile,xslfile)
  Dim xmlDoc,xslDoc
  'Load XML file
  set xmlDoc = Server.CreateObject("Microsoft.XMLDOM")
  xmlDoc.async = false
  xmlDoc.load(xmlfile)
  'Load XSL file
  set xslDoc = Server.CreateObject("Microsoft.XMLDOM")
  xslDoc.async = false
  xslDoc.load(xslfile)
  'Transform file
  Response.Write(xmlDoc.transformNode(xslDoc))
end function

function updateFile(xmlfile)
  Dim xmlDoc,rootEl,f
  Dim i
  'Load XML file
  set xmlDoc = Server.CreateObject("Microsoft.XMLDOM")
  xmlDoc.async = false
  xmlDoc.load(xmlfile)
  'Set the rootEl variable equal to the root element
  Set rootEl = xmlDoc.documentElement
  'Loop through the form collection
  for i = 1 To Request.Form.Count
    'Eliminate button elements in the form
    if instr(1,Request.Form.Key(i),"btn_")=0 then
      'The selectSingleNode method queries the XML file for a
      'single node that matches a query. This query requests
      'the value element that is the child of a field element
      'that has an id attribute which matches the current key
      'value in the Form Collection. When there is a match
      'set the text property equal to the value of the current
      'field in the Form Collection.
      set f = rootEl.selectSingleNode("field[@id='" & _
        Request.Form.Key(i) & "']/value")
      f.Text = Request.Form(i)
    end if
  next
  'Save the modified XML file
  xmlDoc.save xmlfile
  'Release all object references
  set xmlDoc=nothing
  set rootEl=nothing
  set f=nothing
  'Load the modified XML file with a style sheet that
  'allows the client to see the edited information
  loadFile xmlfile,server.MapPath("tool_updated.xsl")
end function

'If the form has been submitted update the
'XML file and display result - if not,
'transform the XML file for editing
if Request.Form("btn_sub")="" then
  loadFile server.MapPath("tool.xml"),server.MapPath("tool.xsl")
else
  updateFile server.MapPath("tool.xml")
end if
%>

Tip: If you don't know how to write ASP, you can study our ASP tutorial.

Note: We are doing the transformation and applying the changes to the XML file on the server. This is a cross-browser solution. The client will only get HTML back from the server, which will work in any browser.

XML Editors If you are serious about XML, you will benefit from using a professional XML Editor.

XML is Text-based XML is a text-based markup language. One great thing about XML is that XML files can be created and edited using a simple text-editor like Notepad. However, when you start working with XML, you will soon find that it is better to edit XML documents using a professional XML editor.

Why Not Notepad? Many web developers use Notepad to edit both HTML and XML documents because Notepad is included with the most common OS and it is simple to use. Personally I often use Notepad for quick editing of simple HTML, CSS, and XML files. But, if you use Notepad for XML editing, you will soon run into problems. Notepad does not know that you are writing XML, so it will not be able to assist you.


Why an XML Editor? Today XML is an important technology, and development projects use XML-based technologies like:

• XML Schema to define XML structures and data types
• XSLT to transform XML data
• SOAP to exchange XML data between applications
• WSDL to describe web services
• RDF to describe web resources
• XPath and XQuery to access XML data
• SMIL to describe multimedia presentations

To be able to write error-free XML documents, you will need an intelligent XML editor!

XML Editors Professional XML editors will help you to write error-free XML documents, validate your XML against a DTD or a schema, and force you to stick to a valid XML structure. An XML editor should be able to:

• Add closing tags to your opening tags automatically
• Force you to write valid XML
• Verify your XML against a DTD
• Verify your XML against a Schema
• Color code your XML syntax

Altova's XMLSpy At W3Schools we have been using XMLSpy for many years. XMLSpy is our favorite XML editor. These are some of the features we especially like:

• Easy to use
• Syntax coloring
• Automatic tag completion
• Context-sensitive entry helpers
• Automatic well-formedness check
• Built in DTD and/or XML Schema-based validation
• Easy switching between text view and grid view
• Built in graphical XML Schema editor
• Powerful conversion utilities
• Database import and export
• Built in templates for most XML document types
• Built in XPath 1.0/2.0 analyzer
• XSLT 1.0/2.0 editor, profiler, and debugger
• XQuery editor, profiler, and debugger
• SOAP client and debugger
• Graphical WSDL editor
• Powerful project management capabilities
• Code generation in Java, C++, and C#

You Have Learned XSLT, Now What? XSLT Summary This tutorial has taught you how to use XSLT to transform XML documents into other formats, like XHTML. You have learned how to add/remove elements and attributes to or from the output file. You have also learned how to rearrange and sort elements, perform tests and make decisions about which elements to hide and display. For more information on XSLT, please look at our XSLT reference.

Now You Know XSLT, What's Next? XSL includes 3 languages: XSLT, XPath and XSL-FO, so the next step is to learn about XPath and XSL-FO.

XPath XPath is used to navigate through elements and attributes in an XML document. XPath is a major element in the W3C's XSL standard. An understanding of XPath is fundamental for advanced use of XML. Without any XPath knowledge, you will not be able to create XSLT documents. If you want to learn more about XPath, please visit our XPath tutorial.

XSL-FO XSL-FO describes the formatting of XML data for output to screen, paper or other media. XSL-FO documents are XML files with information about the output layout and output content. If you want to learn more about XSL-FO, please visit our XSL-FO tutorial.

XML Data Island: With Internet Explorer, the unofficial <xml> tag can be used to create an XML data island.


XML Data Embedded in HTML An XML data island is XML data embedded into an HTML page. Here is how it works; assume we have the following XML document ("note.xml"):

<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

Then, in an HTML document, you can embed the XML file above with the <xml> tag. The id attribute of the <xml> tag defines an ID for the data island, and the src attribute points to the XML file to embed:

<xml id="note" src="note.xml"></xml>

However, the embedded XML data is, up to this point, not visible to the user. The next step is to format and display the data in the data island by binding it to HTML elements.

Bind Data Island to HTML Elements In the next example, we will embed an XML file called "cd_catalog.xml" into an HTML file. The HTML file looks like this:

<html>
<body>
<xml id="cdcat" src="cd_catalog.xml"></xml>
<table border="1" datasrc="#cdcat">
  <tr>
    <td><span datafld="ARTIST"></span></td>
    <td><span datafld="TITLE"></span></td>
  </tr>
</table>
</body>
</html>

Example explained: The datasrc attribute of the <table> tag binds the HTML table element to the XML data island. The datasrc attribute refers to the id attribute of the data island.

XML in Real Life: Example: XML News XMLNews is a specification for exchanging news and other information. Using such a standard makes it easier for both news producers and news consumers to produce, receive, and archive any kind of news information across different hardware, software, and programming languages. An example XMLNews document:

<nitf>
  <head>
    <title>Colombia Earthquake</title>
  </head>
  <body>
    <headline>
      <hl1>143 Dead in Colombia Earthquake</hl1>
    </headline>
    <byline>
      <bytag>By Jared Kotler, Associated Press Writer</bytag>
    </byline>
    <dateline>
      <location>Bogota, Colombia</location>
      <date>Monday January 25 1999 7:28 ET</date>
    </dateline>
  </body>
</nitf>

XML Parser: To read and update, create and manipulate an XML document, you will need an XML parser.


Examples Parse an XML file - Crossbrowser example This example is a cross-browser example that loads an existing XML document ("note.xml") into the XML parser. Parse an XML string - Crossbrowser example This example is a cross-browser example on how to load and parse an XML string.

Parsing XML Documents To manipulate an XML document, you need an XML parser. The parser loads the document into your computer's memory. Once the document is loaded, its data can be manipulated using the DOM. The DOM treats the XML document as a tree. To learn more about the XML DOM, please read our XML DOM tutorial. There are some differences between Microsoft's XML parser and the XML parser used in Mozilla browsers. In this tutorial we will show you how to create cross browser scripts that will work in both Internet Explorer and Mozilla browsers.

Microsoft's XML Parser Microsoft's XML parser is a COM component that comes with Internet Explorer 5 and higher. Once you have installed Internet Explorer, the parser is available to scripts. Microsoft's XML parser supports all the necessary functions to traverse the node tree, access the nodes and their attribute values, insert and delete nodes, and convert the node tree back to XML. To create an instance of Microsoft's XML parser, use the following code:

JavaScript:

var xmlDoc=new ActiveXObject("Microsoft.XMLDOM");

VBScript:

set xmlDoc=CreateObject("Microsoft.XMLDOM")

ASP:

set xmlDoc=Server.CreateObject("Microsoft.XMLDOM")

The following code fragment loads an existing XML document ("note.xml") into Microsoft's XML parser:

var xmlDoc=new ActiveXObject("Microsoft.XMLDOM");
xmlDoc.async=false;
xmlDoc.load("note.xml");

The first line of the script above creates an instance of the XML parser. The second line turns off asynchronous loading, to make sure that the script does not continue executing before the document is fully loaded. The third line tells the parser to load an XML document called "note.xml".

XML Parser in Mozilla, Firefox, and Opera Mozilla's XML parser supports all the necessary functions to traverse the node tree, access the nodes and their attribute values, insert and delete nodes, and convert the node tree back to XML. To create an instance of the XML parser in Mozilla browsers, use the following code:

JavaScript:

var xmlDoc=document.implementation.createDocument("ns","root",null);

The first parameter, ns, defines the namespace used for the XML document. The second parameter, root, is the name of the XML root element. The third parameter is the document type (doctype) of the new document; null is passed here because no doctype is needed.

The following code fragment loads an existing XML document ("note.xml") into Mozilla's XML parser:

var xmlDoc=document.implementation.createDocument("","",null);
xmlDoc.load("note.xml");

The first line of the script above creates an instance of the XML parser. The second line tells the parser to load an XML document called "note.xml".

Parsing an XML File - A Cross browser Example The following example is a cross browser example that loads an existing XML document ("note.xml") into the XML parser:

<html>
<head>
<script type="text/javascript">
var xmlDoc;
function loadXML()
{
  // code for IE
  if (window.ActiveXObject)
  {
    xmlDoc=new ActiveXObject("Microsoft.XMLDOM");
    xmlDoc.async=false;
    xmlDoc.load("note.xml");
    getmessage();
  }
  // code for Mozilla, Firefox, Opera, etc.
  else if (document.implementation && document.implementation.createDocument)
  {
    xmlDoc=document.implementation.createDocument("","",null);
    // attach the handler before loading, so the load event is not missed
    xmlDoc.onload=getmessage;
    xmlDoc.load("note.xml");
  }
  else
  {
    alert('Your browser cannot handle this script');
  }
}
function getmessage()
{
  document.getElementById("to").innerHTML=
    xmlDoc.getElementsByTagName("to")[0].childNodes[0].nodeValue;
  document.getElementById("from").innerHTML=
    xmlDoc.getElementsByTagName("from")[0].childNodes[0].nodeValue;
  document.getElementById("message").innerHTML=
    xmlDoc.getElementsByTagName("body")[0].childNodes[0].nodeValue;
}
</script>
</head>
<body onload="loadXML()">
<h1>W3Schools Internal Note</h1>
<p><b>To:</b> <span id="to"></span><br />
<b>From:</b> <span id="from"></span><br />
<b>Message:</b> <span id="message"></span>
</p>
</body>
</html>

Output:

W3Schools Internal Note
To: Tove
From: Jani
Message: Don't forget me this weekend!

Important Note To extract the text (Jani) from an XML element like <from>Jani</from>, the correct syntax is:

getElementsByTagName("from")[0].childNodes[0].nodeValue

IMPORTANT: getElementsByTagName returns an array-like list of nodes. The list contains all elements with the specified name within the XML document. In this case there is only one "from" element, but you still have to specify the index ( [0] ). The text itself is stored in a text node that is the first child of the element, which is why childNodes[0].nodeValue is needed.

Parsing an XML String - A Cross browser Example The following code is a cross-browser example on how to load and parse an XML string:

<script type="text/javascript">
var text="<note>";
text=text+"<to>Tove</to>";
text=text+"<from>Jani</from>";
text=text+"<heading>Reminder</heading>";
text=text+"<body>Don't forget me this weekend!</body>";
text=text+"</note>";
var doc;
// code for IE
if (window.ActiveXObject)
{
  doc=new ActiveXObject("Microsoft.XMLDOM");
  doc.async=false;
  doc.loadXML(text);
}
// code for Mozilla, Firefox, Opera, etc.
else
{
  var parser=new DOMParser();
  doc=parser.parseFromString(text,"text/xml");
}
// documentElement always represents the root node
var x=doc.documentElement;
document.write("Text of first child element: ");
document.write(x.childNodes[0].childNodes[0].nodeValue);
document.write("<br />");
document.write("Text of second child element: ");
document.write(x.childNodes[1].childNodes[0].nodeValue);
</script>

Output:

Text of first child element: Tove
Text of second child element: Jani

Note: Internet Explorer uses the loadXML() method to parse an XML string, while Mozilla browsers use the DOMParser object.

XML Namespaces: XML Namespaces provide a method to avoid element name conflicts.

Name Conflicts Since element names in XML are not predefined, a name conflict will occur when two different documents use the same element names. This XML document carries information in a table:

Note on the data island example: <td> tags cannot be bound to data, so we are using <span> tags. The <span> tag allows the datafld attribute to refer to the XML element to be displayed. In this case, it is datafld="ARTIST" for the <ARTIST> element and datafld="TITLE" for the <TITLE> element in the XML file. As the XML is read, additional rows are created for each <CD> element. If you are running IE 5.0 or higher, you can try it yourself.

<table>
  <tr>
    <td>Apples</td>
    <td>Bananas</td>
  </tr>
</table>

This XML document carries information about a table (a piece of furniture):

<table>
  <name>African Coffee Table</name>
  <width>80</width>
  <length>120</length>
</table>

If these two XML documents were added together, there would be an element name conflict because both documents contain a <table> element with different content and definition.

Solving Name Conflicts Using a Prefix This XML document carries information in a table:

<h:table>
  <h:tr>
    <h:td>Apples</h:td>
    <h:td>Bananas</h:td>
  </h:tr>
</h:table>

This XML document carries information about a piece of furniture:

<f:table>
  <f:name>African Coffee Table</f:name>
  <f:width>80</f:width>
  <f:length>120</f:length>
</f:table>

Now there will be no name conflict because the two documents use a different name for their <table> element (<h:table> and <f:table>). By using a prefix, we have created two different types of <table> elements.

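Using Namespaces This XML document carries information in a table:

<h:table xmlns:h="http://www.w3.org/TR/html4/">
  <h:tr>
    <h:td>Apples</h:td>
    <h:td>Bananas</h:td>
  </h:tr>
</h:table>

This XML document carries information about a piece of furniture:

<f:table xmlns:f="http://www.w3schools.com/furniture">
  <f:name>African Coffee Table</f:name>
  <f:width>80</f:width>
  <f:length>120</f:length>
</f:table>

Instead of using only prefixes, we have added an xmlns attribute to the <table> tag to give the prefix a qualified name associated with a namespace.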

The XML Namespace (xmlns) Attribute The XML namespace attribute is placed in the start tag of an element and has the following syntax:

xmlns:namespace-prefix="namespaceURI"

When a namespace is defined in the start tag of an element, all child elements with the same prefix are associated with the same namespace. Note that the address used to identify the namespace is not used by the parser to look up information. The only purpose is to give the namespace a unique name. However, very often companies use the namespace as a pointer to a real Web page containing information about the namespace. Try to go to http://www.w3.org/TR/html4/.
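The way a prefix is tied to a namespace name can be sketched in a few lines of JavaScript. This is only an illustration of the idea, not how any particular parser is implemented: the function expandName() and the bindings object are hypothetical names, and the furniture URI is an assumed example value.

```javascript
// Hypothetical helper: resolve a prefixed element name against the
// prefix-to-URI bindings that xmlns:prefix="..." attributes would declare.
function expandName(qualifiedName, bindings) {
  var parts = qualifiedName.split(":");
  if (parts.length === 1) {
    // No prefix: use the default namespace if one is bound (key ""), else null.
    return { namespaceURI: bindings[""] || null, localName: parts[0] };
  }
  var prefix = parts[0];
  if (!Object.prototype.hasOwnProperty.call(bindings, prefix)) {
    throw new Error("Undeclared namespace prefix: " + prefix);
  }
  return { namespaceURI: bindings[prefix], localName: parts[1] };
}

// Bindings as two different <table> start tags might declare them
// (the second URI is an assumption made for this example).
var bindings = {
  h: "http://www.w3.org/TR/html4/",
  f: "http://www.w3schools.com/furniture"
};

var htmlTable = expandName("h:table", bindings);
var furnitureTable = expandName("f:table", bindings);

// Same local name, different namespace: the two elements do not conflict.
console.log(htmlTable.localName === furnitureTable.localName);       // true
console.log(htmlTable.namespaceURI === furnitureTable.namespaceURI); // false
```

Note that the URI is only compared as a string; nothing is ever downloaded from it.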

Uniform Resource Identifier (URI) A Uniform Resource Identifier (URI) is a string of characters which identifies an Internet Resource. The most common URI is the Uniform Resource Locator (URL) which identifies an Internet domain address. Another, not so common type of URI is the Uniform Resource Name (URN). In our examples we will only use URLs.

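Default Namespaces Defining a default namespace for an element saves us from using prefixes in all the child elements. It has the following syntax:

xmlns="namespaceURI"

This XML document carries information in a table:

<table xmlns="http://www.w3.org/TR/html4/">
  <tr>
    <td>Apples</td>
    <td>Bananas</td>
  </tr>
</table>

This XML document carries information about a piece of furniture:

<table xmlns="http://www.w3schools.com/furniture">
  <name>African Coffee Table</name>
  <width>80</width>
  <length>120</length>
</table>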

Namespaces in Real Use When you start using XSL, you will soon see namespaces in real use. XSL style sheets are used to transform XML documents into other formats, like HTML. If you take a close look at the XSL document below, you will see that most of the tags are HTML tags. The tags that are not HTML tags have the prefix xsl, identified by the namespace "http://www.w3.org/1999/XSL/Transform":

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
  <html>
  <body>
    <h2>My CD Collection</h2>
    <table border="1">
      <tr>
        <th>Title</th>
        <th>Artist</th>
      </tr>
      <xsl:for-each select="catalog/cd">
      <tr>
        <td><xsl:value-of select="title"/></td>
        <td><xsl:value-of select="artist"/></td>
      </tr>
      </xsl:for-each>
    </table>
  </body>
  </html>
</xsl:template>
</xsl:stylesheet>


XML CDATA: All text in an XML document will be parsed by the parser. Only text inside a CDATA section will be ignored by the parser.

Parsed Data XML parsers normally parse all the text in an XML document. When an XML element is parsed, the text between the XML tags is also parsed:

<message>This text is also parsed</message>

The parser does this because XML elements can contain other elements, as in this example, where the <name> element contains two other elements (first and last):

<name><first>Bill</first><last>Gates</last></name>

and the parser will break it up into sub-elements like this:

<name>
  <first>Bill</first>
  <last>Gates</last>
</name>

Escape Characters Illegal XML characters have to be replaced by entity references. If you place a character like "<" inside an XML element, it will generate an error because the parser interprets it as the start of a new element. You cannot write something like this:

<message>if salary < 1000 then</message>

To avoid this, you have to replace the "<" character with an entity reference, like this:

<message>if salary &lt; 1000 then</message>

There are 5 predefined entity references in XML:

Entity    Character   Meaning
&lt;      <           less than
&gt;      >           greater than
&amp;     &           ampersand
&apos;    '           apostrophe
&quot;    "           quotation mark

Note: Only the characters "<" and "&" are strictly illegal in XML. Apostrophes, quotation marks and greater than signs are legal, but it is a good habit to replace them.
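As a rough sketch of how the five predefined entity references can be applied in a script, the following JavaScript function replaces each special character in a text value before it is placed inside an element. The function name escapeXml() is an assumption made for this example; it is not part of any parser API.

```javascript
// Replace the five special characters with their predefined entity references.
// The ampersand must be replaced first; otherwise the "&" that starts
// "&lt;" and the other references would be escaped a second time.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/'/g, "&apos;")
    .replace(/"/g, "&quot;");
}

var xml = "<message>" + escapeXml("if salary < 1000 then") + "</message>";
console.log(xml); // <message>if salary &lt; 1000 then</message>
```

The markup itself (the <message> tags) is added after escaping, so only the text content is changed.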

CDATA Everything inside a CDATA section is ignored by the parser: it is passed on as plain text and is not interpreted as markup. If your text contains a lot of "<" or "&" characters - as program code often does - the XML element can be defined as a CDATA section. A CDATA section starts with "<![CDATA[" and ends with "]]>":

<script>
<![CDATA[
function matchwo(a,b)
{
if (a < b && a < 0)
  {
  return 1
  }
else
  {
  return 0
  }
}
]]>
</script>

In the example above, everything inside the CDATA section is ignored by the parser.

Notes on CDATA sections: A CDATA section cannot contain the string "]]>", therefore, nested CDATA sections are not allowed. Also make sure there are no spaces or line breaks inside the "]]>" string.
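Because a CDATA section cannot contain "]]>", a script that wraps arbitrary text in CDATA has to deal with that string somehow. One common workaround, sketched below, is to end the section in the middle of "]]>" and start a new one. The function name toCdata() is hypothetical and only used for this illustration.

```javascript
// Wrap text in a CDATA section. Every "]]>" in the text is split so that
// "]]" closes one section and ">" opens the next one; a parser joins the
// adjacent sections back into the original text.
function toCdata(text) {
  return "<![CDATA[" + String(text).split("]]>").join("]]]]><![CDATA[>") + "]]>";
}

console.log(toCdata("if (a < b && a < 0) { return 1 }"));
console.log(toCdata("a]]>b"));
```

The first call needs no escaping at all, which is the main attraction of CDATA for program code; the second call produces two adjacent sections.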

XML Encoding: XML documents may contain foreign characters, like Norwegian æ ø å , or French ê è é. To let your XML parser understand these characters, you should save your XML documents as Unicode.

Windows 2000 Notepad Windows 2000 Notepad can save files as Unicode. Save the XML file below as Unicode (note that the document does not contain any encoding attribute):

<?xml version="1.0"?>
<note>
  <from>Jani</from>
  <to>Tove</to>
  <message>Norwegian: æøå. French: êèé</message>
</note>

The file above, note_encode_none_u.xml will NOT generate an error in IE 5+, Firefox, or Opera, but it WILL generate an error in Netscape 6.2.

Windows 2000 Notepad with Encoding Windows 2000 Notepad files saved as Unicode use "UTF-16" encoding. If you add an encoding attribute to XML files saved as Unicode, windows encoding values will generate an error. The following encoding (open it), will NOT give an error message:




The following encoding (open it), will NOT give an error message:

The following encoding (open it), will NOT give an error message:

The following encoding (open it), will NOT generate an error in IE 5+, Firefox, or Opera, but it WILL generate an error in Netscape 6.2.



Error Messages If you try to load an XML document into Internet Explorer, you can get two different errors indicating encoding problems:

An invalid character was found in text content. You will get this error message if a character in the XML document does not match the encoding attribute. Normally you will get this error message if your XML document contains "foreign" characters, and the file was saved with a single-byte encoding editor like Notepad, and no encoding attribute was specified.

Switch from current encoding to specified encoding not supported. You will get this error message if your file was saved as Unicode/UTF-16 but the encoding attribute specified an 8-bit based encoding like Windows-1252, ISO-8859-1 or UTF-8. You can also get this error message if your document was saved with an 8-bit based encoding, but the encoding attribute specified a 16-bit based encoding like UTF-16.

Conclusion The conclusion is that the encoding attribute has to specify the encoding used when the document was saved. My best advice to avoid errors is:

• Use an editor that supports encoding
• Make sure you know what encoding it uses
• Use the same encoding attribute in your XML documents
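The rule that the encoding attribute must match the encoding the file was saved in can be checked by a small script. The sketch below assumes the file starts with a byte order mark (BOM), which editors often write for Unicode files, and only recognizes UTF-8 and UTF-16; the function names detectBom(), declaredEncoding() and encodingMatches() are made up for this example.

```javascript
// Guess how a file was saved by looking at its first bytes (the BOM).
function detectBom(bytes) {
  if (bytes.length >= 3 && bytes[0] === 0xEF && bytes[1] === 0xBB && bytes[2] === 0xBF) {
    return "UTF-8";
  }
  if (bytes.length >= 2 &&
      ((bytes[0] === 0xFF && bytes[1] === 0xFE) ||   // little-endian
       (bytes[0] === 0xFE && bytes[1] === 0xFF))) {  // big-endian
    return "UTF-16";
  }
  return null; // no BOM found
}

// Read the encoding attribute from the XML declaration, if there is one.
function declaredEncoding(xmlText) {
  var m = /^<\?xml[^>]*\sencoding\s*=\s*["']([^"']+)["']/.exec(xmlText);
  return m ? m[1] : null;
}

// True when the declaration agrees with the BOM, or when one of them is missing.
function encodingMatches(bytes, xmlText) {
  var saved = detectBom(bytes);
  var declared = declaredEncoding(xmlText);
  if (saved === null || declared === null) { return true; }
  return saved.toLowerCase() === declared.toLowerCase();
}

// First bytes of a file saved as Unicode (UTF-16, little-endian).
var savedAsUnicode = [0xFF, 0xFE, 0x3C, 0x00];
console.log(encodingMatches(savedAsUnicode, '<?xml version="1.0" encoding="UTF-16"?>'));
console.log(encodingMatches(savedAsUnicode, '<?xml version="1.0" encoding="windows-1252"?>'));
```

A file without a BOM cannot be checked this way, so the sketch simply reports no mismatch in that case.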

XML on the Server: XML can be generated on a server without installing any XML controls.

Storing XML on the Server XML files can be stored on an Internet server exactly the same way as HTML files. Start Windows Notepad and write the following lines:

<note>
  <from>Jani</from>
  <to>Tove</to>
  <message>Remember me this weekend</message>
</note>

Save the file on your web server with a proper name like "note.xml".

Generating XML with ASP XML can be generated on a server without any installed XML software. To generate an XML response from the server - simply write the following code and save it as an ASP file on the web server:

<%
response.ContentType="text/xml"
response.Write("<?xml version='1.0' encoding='ISO-8859-1'?>")
response.Write("<note>")
response.Write("<from>Jani</from>")
response.Write("<to>Tove</to>")
response.Write("<message>Remember me this weekend</message>")
response.Write("</note>")
%>

Note that the content type of the response must be set to "text/xml". If you don't know how to write ASP, please visit our ASP tutorial.

Getting XML From a Database XML can be generated from a database without any installed XML software. To generate an XML database response from the server, simply write the following code and save it as an ASP file on the web server:

<%
response.ContentType = "text/xml"
set conn=Server.CreateObject("ADODB.Connection")
conn.provider="Microsoft.Jet.OLEDB.4.0;"
conn.open server.mappath("/db/database.mdb")
sql="select fname,lname from tblGuestBook"
set rs=conn.Execute(sql)
rs.MoveFirst()
response.write("<?xml version='1.0' encoding='ISO-8859-1'?>")
response.write("<guestbook>")
while (not rs.EOF)
  response.write("<guest>")
  response.write("<fname>" & rs("fname") & "</fname>")
  response.write("<lname>" & rs("lname") & "</lname>")
  response.write("</guest>")
  rs.MoveNext()
wend
rs.close()
conn.close()
response.write("</guestbook>")
%>

The example above uses ASP with ADO. If you don't know how to use ADO, please visit our ADO tutorial.

XML Application: This chapter demonstrates a small framework for an XML application. Note: This example uses a Data Island, which only works in Internet Explorer.

The XML Example Document Look at the following XML document ("cd_catalog.xml"), that represents a CD catalog:

<CATALOG>
  <CD>
    <TITLE>Empire Burlesque</TITLE>
    <ARTIST>Bob Dylan</ARTIST>
    <COUNTRY>USA</COUNTRY>
    <COMPANY>Columbia</COMPANY>
    <PRICE>10.90</PRICE>
    <YEAR>1985</YEAR>
  </CD>
  . . . more . . .
</CATALOG>

Load the XML Document Into a Data Island A Data Island can be used to access the XML file. To get your XML document "inside" an HTML page, add an XML Data Island to the HTML page:

<xml src="cd_catalog.xml" id="xmldso" async="false"></xml>

With the example code above, the XML file "cd_catalog.xml" will be loaded into an "invisible" Data Island called "xmldso". The async="false" attribute is added to make sure that all the XML data is loaded before any other HTML processing takes place.

Bind the Data Island to an HTML Table To make the XML data visible on the HTML page, you must "bind" the Data Island to an HTML element. To bind the XML data to an HTML table, add a datasrc attribute to the table element, and add datafld attributes to the span elements inside the table data:

<table datasrc="#xmldso" border="1">
  <thead>
    <tr>
      <th>Title</th>
      <th>Artist</th>
      <th>Year</th>
    </tr>
  </thead>
  <tr>
    <td><span datafld="TITLE"></span></td>
    <td><span datafld="ARTIST"></span></td>
    <td><span datafld="YEAR"></span></td>
  </tr>
</table>

If you have IE 5.0 or higher: See how the XML data is displayed inside an HTML table.

Bind the Data Island to <span> or <div> Elements <span> or <div> elements can be used to display XML data. You don't have to use the HTML table element to display XML data. Data from a Data Island can be displayed anywhere on an HTML page. All you have to do is to add some <span> or <div> elements to your page. Use the datasrc attribute to bind the elements to the Data Island, and the datafld attribute to bind each element to an XML element, like this:

<br />Title:
<span datasrc="#xmldso" datafld="TITLE"></span>
<br />Artist:
<span datasrc="#xmldso" datafld="ARTIST"></span>
<br />Year:
<span datasrc="#xmldso" datafld="YEAR"></span>

or like this:

<br />Title:
<div datasrc="#xmldso" datafld="TITLE"></div>
<br />Artist:
<div datasrc="#xmldso" datafld="ARTIST"></div>
<br />Year:
<div datasrc="#xmldso" datafld="YEAR"></div>

If you have IE 5.0 or higher: See how the XML data is displayed inside the HTML elements. Note that if you use an HTML <div> element, the data will be displayed on a new line. With the examples above, you will only see one line of your XML data. To navigate to the next line of data, you have to add some scripting to your code.

Add a Navigation Script Navigation has to be performed by a script. To add navigation to the XML Data Island, create a script that calls the movenext() and moveprevious() methods of the Data Island's recordset. Each function first checks the current position, so that the script never moves past the last record or before the first one.

<script type="text/javascript">
function movenext()
{
  var x=xmldso.recordset;
  if (x.absoluteposition < x.recordcount)
  {
    x.movenext();
  }
}
function moveprevious()
{
  var x=xmldso.recordset;
  if (x.absoluteposition > 1)
  {
    x.moveprevious();
  }
}
</script>

If you have IE 5.0 or higher: See how you can navigate through the XML records.

All Together Now With a little creativity you can create a full application. If you use what you have learned on this page, and a little imagination, you can easily develop this into a full application. If you have IE 5.0 or higher: See how you can add a little fancy to this application.

The XMLHttpRequest Object: The XMLHttpRequest object is supported in Internet Explorer 5.0+, Safari 1.2, Mozilla 1.0 / Firefox, Opera 9, and Netscape 7.

What is an HTTP Request?


With an HTTP request, a web page can make a request to, and get a response from a web server without reloading the page. The user will stay on the same page, and he or she will not notice that scripts might request pages, or send data to a server in the background. By using the XMLHttpRequest object, a web developer can change a page with data from the server after the page has loaded. Google Suggest is using the XMLHttpRequest object to create a very dynamic web interface: When you start typing in Google's search box, a JavaScript sends the letters off to a server and the server returns a list of suggestions.

Is the XMLHttpRequest Object a W3C Standard? The XMLHttpRequest object is a JavaScript object, and is not specified in any W3C recommendation. However, the W3C DOM Level 3 "Load and Save" specification contains some similar functionality, but these are not implemented in any browsers yet. So, at the moment, if you need to send an HTTP request from a browser, you will have to use the XMLHttpRequest object.

Creating an XMLHttpRequest Object For Mozilla, Firefox, Safari, Opera, and Netscape:

var xmlhttp=new XMLHttpRequest()

For Internet Explorer:

var xmlhttp=new ActiveXObject("Microsoft.XMLHTTP")

Example

<script type="text/javascript">
var xmlhttp
function loadXMLDoc(url)
{
  xmlhttp=null
  // code for Mozilla, etc.
  if (window.XMLHttpRequest)
  {
    xmlhttp=new XMLHttpRequest()
  }
  // code for IE
  else if (window.ActiveXObject)
  {
    xmlhttp=new ActiveXObject("Microsoft.XMLHTTP")
  }
  if (xmlhttp!=null)
  {
    xmlhttp.onreadystatechange=state_Change
    xmlhttp.open("GET",url,true)
    xmlhttp.send(null)
  }
  else
  {
    alert("Your browser does not support XMLHTTP.")
  }
}
function state_Change()
{
  // if xmlhttp shows "loaded"
  if (xmlhttp.readyState==4)
  {
    // if "OK"
    if (xmlhttp.status==200)
    {
      // ...some code here...
    }
    else
    {
      alert("Problem retrieving XML data")
    }
  }
}
</script>

The syntax is a little bit different in VBScript.

Note: An important property in the example above is the onreadystatechange property. This property is an event handler which is triggered each time the state of the request changes. The states run from 0 (uninitialized) to 4 (complete). By having the function state_Change() check for the state changing, we can tell when the process is complete and continue only if it has been successful.
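The state check can be tried out without a browser or a server by driving the handler with a stand-in object. In the sketch below, makeStateHandler() and fakeRequest are hypothetical names invented for this illustration; the stand-in only carries the readyState, status and responseText properties that the handler reads, and is not a real XMLHttpRequest.

```javascript
// Build an onreadystatechange handler for an XMLHttpRequest-like object.
// It does nothing until the request is complete (readyState 4) and then
// reports success only for status 200.
function makeStateHandler(request, onSuccess, onFailure) {
  return function () {
    if (request.readyState !== 4) { return; } // still in progress
    if (request.status === 200) {
      onSuccess(request.responseText);
    } else {
      onFailure(request.status);
    }
  };
}

// Stand-in object with the same property names as the real request object.
var fakeRequest = { readyState: 0, status: 0, responseText: "" };
var log = [];
fakeRequest.onreadystatechange = makeStateHandler(
  fakeRequest,
  function (text) { log.push("ok: " + text); },
  function (status) { log.push("failed: " + status); }
);

// Simulate the intermediate states 1 to 3: the handler fires but ignores them.
for (var state = 1; state <= 3; state++) {
  fakeRequest.readyState = state;
  fakeRequest.onreadystatechange();
}

// Simulate a completed, successful request.
fakeRequest.readyState = 4;
fakeRequest.status = 200;
fakeRequest.responseText = "<note/>";
fakeRequest.onreadystatechange();

console.log(log);
```

With a real request object, the same handler would be assigned to onreadystatechange before open() and send() are called.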

Why are we Using Async in our Examples? All the examples here use the async mode (the third parameter of open() set to true). The async parameter specifies whether the request should be handled asynchronously or not. true means that the script continues to run after the send() method, without waiting for a response from the server. false means that the script waits for a response before continuing script processing. By setting this parameter to false, you run the risk of having your script hang if there is a network or server problem. If the request takes a long time, the UI locks while the request is being made, and a user may even see the "Not Responding" message. It is safer to send asynchronously and design your code around the onreadystatechange event!

More Examples Load a textfile into a div element with XML HTTP (JavaScript) Make a HEAD request with XML HTTP (JavaScript) Make a specified HEAD request with XML HTTP (JavaScript) List data from an XML file with XML HTTP (JavaScript)

The XMLHttpRequest Object Reference

Methods

Method                                       Description
abort()                                      Cancels the current request
getAllResponseHeaders()                      Returns the complete set of HTTP headers as a string
getResponseHeader("headername")              Returns the value of the specified HTTP header
open("method","URL",async,"uname","pswd")    Specifies the method, URL, and other optional attributes of a request. The method parameter can have a value of "GET", "POST", or "PUT" (use "GET" when requesting data and "POST" when sending data, especially if the length of the data is greater than 512 bytes). The URL parameter may be either a relative or a complete URL. The async parameter specifies whether the request should be handled asynchronously: true means that script processing carries on after the send() method without waiting for a response; false means that the script waits for a response before continuing script processing
send(content)                                Sends the request
setRequestHeader("label","value")            Adds a label/value pair to the HTTP header to be sent

Properties

Property              Description
onreadystatechange    An event handler for an event that fires at every state change
readyState            Returns the state of the object: 0 = uninitialized, 1 = loading, 2 = loaded, 3 = interactive, 4 = complete
responseText          Returns the response as a string
responseXML           Returns the response as XML. This property returns an XML document object, which can be examined and parsed using W3C DOM node tree methods and properties
status                Returns the status as a number (e.g. 404 for "Not Found" or 200 for "OK")
statusText            Returns the status as a string (e.g. "Not Found" or "OK")
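The way readyState and status work together can be modelled outside the browser. The following is a minimal sketch in Python, not browser code; READY_STATES and on_state_change are hypothetical names introduced here for illustration and are not part of the XMLHttpRequest object.

```python
# Names for the readyState values listed in the properties table.
READY_STATES = {
    0: "uninitialized",
    1: "loading",
    2: "loaded",
    3: "interactive",
    4: "complete",
}


def on_state_change(ready_state, status):
    """Mimic the state_Change() handler: act only on a finished request."""
    if ready_state != 4:
        # The handler fires on every state change, so ignore the early ones.
        return "waiting (" + READY_STATES[ready_state] + ")"
    if status == 200:
        return "OK: process the response"
    return "Problem retrieving XML data"


# Simulate the sequence of calls a successful request would produce.
for state in range(5):
    print(state, on_state_change(state, 200))
```

The handler is called several times per request, which is why the check for state 4 comes before the check of the HTTP status.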

Save Data to an XML File: Usually, we save data in databases. However, if we want to make the data more portable, we can store the data in an XML file.

Create and Save an XML File


Storing data in XML files is useful if the data is to be sent to applications on non-Windows platforms. Remember that XML is portable across all platforms and the data will not need to be converted! First we will learn how to create and save an XML file. The XML file below will be named "test.xml" and will be stored in the root of the c: drive on the server. We will use ASP and Microsoft's XMLDOM object to create and save the XML file:

<%
Dim xmlDoc, rootEl, child1, child2, p

'Create an XML document
Set xmlDoc = Server.CreateObject("Microsoft.XMLDOM")

'Create a root element and append it to the document
Set rootEl = xmlDoc.createElement("root")
xmlDoc.appendChild rootEl

'Create and append child elements
Set child1 = xmlDoc.createElement("child1")
Set child2 = xmlDoc.createElement("child2")
rootEl.appendChild child1
rootEl.appendChild child2

'Add an XML processing instruction
'and insert it before the root element
Set p=xmlDoc.createProcessingInstruction("xml","version='1.0'")
xmlDoc.insertBefore p,xmlDoc.childNodes(0)

'Save the XML file to the c directory
xmlDoc.Save "c:\test.xml"
%>

If you open the saved XML file it will look something like this ("test.xml"):

<?xml version="1.0"?>
<root>
  <child1 />
  <child2 />
</root>
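For readers without an ASP server, the same steps (create a root element, append two empty children, add the XML declaration) can be sketched with Python's standard xml.etree.ElementTree module. This is an illustrative equivalent, not the tutorial's own method, and the output file name in the final comment is an assumption.

```python
import xml.etree.ElementTree as ET

# Create the root element and append two empty child elements.
root = ET.Element("root")
ET.SubElement(root, "child1")
ET.SubElement(root, "child2")

# Prepend an XML declaration, as the processing instruction does in the ASP code.
xml_text = '<?xml version="1.0"?>' + ET.tostring(root, encoding="unicode")
print(xml_text)

# To write a file instead of printing (the file name is an assumption):
# ET.ElementTree(root).write("test.xml", xml_declaration=True, encoding="utf-8")
```

The serialized document is written on a single line; the indentation in a pretty-printed file is only whitespace and does not change the element structure.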



Real Form Example

Now, we will look at a real HTML form example. We will first look at the HTML form that will be used in this example. The HTML form below asks for the user's name, country, and e-mail address. This information will then be written to an XML file for storage.

"customers.htm":

Enter your contact information

First Name:
Last Name:
Country:
Email:



The action for the HTML form above is set to "saveForm.asp". The "saveForm.asp" file is an ASP page that will loop through the form fields and store their values in an XML file:

<%
dim xmlDoc
dim rootEl,fieldName,fieldValue,attID
dim p,i

'Do not stop if an error occurs
On Error Resume Next

Set xmlDoc = server.CreateObject("Microsoft.XMLDOM")
xmlDoc.preserveWhiteSpace=true

'Create a root element and append it to the document
Set rootEl = xmlDoc.createElement("customer")
xmlDoc.appendChild rootEl

'Loop through the form collection
for i = 1 To Request.Form.Count
  'Eliminate button elements in the form
  if instr(1,Request.Form.Key(i),"btn_")=0 then
    'Create a field and a value element, and an id attribute
    Set fieldName = xmlDoc.createElement("field")
    Set fieldValue = xmlDoc.createElement("value")
    Set attID = xmlDoc.createAttribute("id")
    'Set the value of the id attribute equal to the name of
    'the current form field
    attID.Text = Request.Form.Key(i)
    'Append the id attribute to the field element
    fieldName.setAttributeNode attID
    'Set the value of the value element equal to
    'the value of the current form field
    fieldValue.Text = Request.Form(i)
    'Append the field element as a child of the root element
    rootEl.appendChild fieldName
    'Append the value element as a child of the field element
    fieldName.appendChild fieldValue
  end if
next

'Add an XML processing instruction
'and insert it before the root element
Set p = xmlDoc.createProcessingInstruction("xml","version='1.0'")
xmlDoc.insertBefore p,xmlDoc.childNodes(0)

'Save the XML file
xmlDoc.save "c:\Customer.xml"

'Release all object references
set xmlDoc=nothing
set rootEl=nothing
set fieldName=nothing
set fieldValue=nothing
set attID=nothing
set p=nothing

'Test to see if an error occurred
if err.number<>0 then
  response.write("Error: No information saved.")
else
  response.write("Your information has been saved.")
end if
%>

Note: If the XML file name specified already exists, it will be overwritten!

The XML file that will be produced by the code above will look something like this ("Customer.xml"):

<customer> Hege Refsnes Norway [email protected]
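The structure that the ASP loop builds is one field element per form field, each with an id attribute and a nested value element. A minimal Python sketch makes that shape visible. The function name form_to_xml and the form field names used below are hypothetical; the real names come from "customers.htm".

```python
import xml.etree.ElementTree as ET


def form_to_xml(form):
    """Build <customer><field id="..."><value>...</value></field>...</customer>."""
    root = ET.Element("customer")
    for name, value in form.items():
        # Skip button elements, as the ASP code does for names containing "btn_".
        if "btn_" in name:
            continue
        field = ET.SubElement(root, "field", id=name)
        ET.SubElement(field, "value").text = value
    return root


# Hypothetical form data used only for illustration.
form = {"firstName": "Hege", "country": "Norway", "btn_sub": "Submit"}
print(ET.tostring(form_to_xml(form), encoding="unicode"))
```

Storing the field name in an attribute rather than in the element name keeps the document well formed even when a form field name is not a legal XML element name.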

XML DHTML Behaviors: Internet Explorer 5 introduced DHTML behaviors. Behaviors are a way to add DHTML functionality to HTML elements with the ease of CSS.

Behaviors - What are They?

IE5 introduced DHTML behaviors. Behaviors are a way to add DHTML functionality to HTML elements with the ease of CSS.

How do behaviors work?

By using XML we can link behaviors to any element in a web page and manipulate that element. DHTML behaviors do not use a <script> tag. Instead, they use a CSS property called "behavior". This property specifies the URL of an HTC file which contains the actual behavior (the HTC file is written in XML).

Syntax

behavior: url(some_filename.htc)

Note: The behavior property is only supported by IE 5 and higher; all other browsers will ignore it. This means that Mozilla, Firefox, Netscape and other browsers will only see the regular content, while IE 5+ can see the DHTML behaviors.

Example

The following HTML file has a <style> element that defines a behavior for the <h1> element:

<style type="text/css">
h1 { behavior: url(behave.htc) }
</style>

<h1>Mouse over me!!!</h1>

The XML document "behave.htc" is shown below:

<script type="text/javascript">
function hig_lite()
{
  element.style.color='red'
}
function low_lite()
{
  element.style.color='blue'
}
</script>

The behavior file contains a script and the event handlers for the script.

The following HTML file has a <style> element that defines a behavior for elements with an id of "typing":

<style type="text/css">
#typing
{
  behavior:url(typing.htc);
  font-family:'courier new';
}
</style>

<span id="typing" speed="100">IE5 introduced DHTML behaviors. Behaviors are a way to add DHTML functionality to HTML elements with the ease of CSS.
How do behaviors work?
By using XML we can link behaviors to any element in a web page and manipulate that element.</span>

The XML document "typing.htc" is shown below:

<method name="type" />
<script type="text/javascript">
var i,text1,text2,textLength,t

function beginTyping()
{
  i=0
  text1=element.innerText
  textLength=text1.length
  element.innerText=""
  text2=""
  t=window.setInterval(element.id+".type()",speed)
}

function type()
{
  text2=text2+text1.substring(i,i+1)
  element.innerText=text2
  i=i+1
  if (i==textLength){clearInterval(t)}
}
</script>

XML Editors: If you are serious about XML, you will benefit from using a professional XML Editor.

XML is Text-based

XML is a text-based markup language. One great thing about XML is that XML files can be created and edited using a simple text editor like Notepad. However, when you start working with XML, you will soon find that it is better to edit XML documents using a professional XML editor.

Why Not Notepad?

Many web developers use Notepad to edit both HTML and XML documents because Notepad is included with the most common OS and it is simple to use. Personally I often use Notepad for quick editing of simple HTML, CSS, and XML files. But if you use Notepad for XML editing, you will soon run into problems. Notepad does not know that you are writing XML, so it will not be able to assist you.

Why an XML Editor?


Today XML is an important technology, and development projects use XML-based technologies like:

• XML Schema to define XML structures and data types
• XSLT to transform XML data
• SOAP to exchange XML data between applications
• WSDL to describe web services
• RDF to describe web resources
• XPath and XQuery to access XML data
• SMIL to define multimedia presentations

To be able to write error-free XML documents, you will need an intelligent XML editor!

XML Editors

Professional XML editors will help you to write error-free XML documents, validate your XML against a DTD or a schema, and force you to stick to a valid XML structure. An XML editor should be able to:

• Add closing tags to your opening tags automatically
• Force you to write valid XML
• Verify your XML against a DTD
• Verify your XML against a Schema
• Color code your XML syntax

Altova's XMLSpy

At W3Schools we have been using XMLSpy for many years. XMLSpy is our favorite XML editor. These are some of the features we especially like:

• Easy to use
• Syntax coloring
• Automatic tag completion
• Context-sensitive entry helpers
• Automatic well-formedness check
• Built in DTD and/or XML Schema-based validation
• Easy switching between text view and grid view
• Built in graphical XML Schema editor
• Powerful conversion utilities
• Database import and export
• Built in templates for most XML document types
• Built in XPath 1.0/2.0 analyzer
• XSLT 1.0/2.0 editor, profiler, and debugger
• XQuery editor, profiler, and debugger
• SOAP client and debugger
• Graphical WSDL editor
• Powerful project management capabilities
• Code generation in Java, C++, and C#


XML Quiz:

1. What does XML stand for?
   • eXtra Modern Link
   • Example Markup Language
   • eXtensible Markup Language
   • X-Markup Language

2. There is a way of describing XML data, how?
   • XML uses XSL to describe data
   • XML uses a DTD to describe the data
   • XML uses a description node to describe data

3. XML's goal is to replace HTML
   • False
   • True

4. What is the correct syntax of the declaration which defines the XML version?
   • <xml version="1.0" />

5. What does DTD stand for?
   • Direct Type Definition
   • Document Type Definition
   • Do The Dance
   • Dynamic Type Definition

6. Is this a "well formed" XML document?

<note> Tove Jani Reminder Don't forget me this weekend!

   • No
   • Yes

7. Is this a "well formed" XML document?

Tove Jani Reminder Don't forget me this weekend!

   • No
   • Yes

8. Which statement is true?
   • All XML elements must be lower case
   • All XML elements must be properly closed
   • All XML documents must have a DTD
   • All the statements are true

9. Which statement is true?
   • XML documents must have a root tag
   • XML elements must be properly nested
   • XML tags are case sensitive
   • All the statements are true


10. XML preserves white spaces
   • True
   • False

11. Is this a "well formed" XML document?

<note> Tove Jani

   • No
   • Yes

12. Is this a "well formed" XML document?

<note> Tove Jani

   • No
   • Yes

13. XML elements cannot be empty
   • False
   • True

14. Which is not a correct name for an XML element?
   • <1dollar>
   • All 3 names are incorrect

15. Which is not a correct name for an XML element?
   • All 3 names are incorrect

16. Which is not a correct name for an XML element?
   • All 3 names are incorrect
   • <xmldocument>
   • <7eleven>

17. XML attribute values must always be enclosed in quotes
   • True
   • False

18. What does XSL stand for?
   • eXtra Style Language
   • eXtensible Stylesheet Language
   • eXpandable Style Language
   • eXtensible Style Listing

19. What is a correct way of referring to a stylesheet called "mystyle.xsl"?
   • <stylesheet type="text/xsl" href="mystyle.xsl" />

XPath Introduction: XPath is a language for finding information in an XML document. XPath is used to navigate through elements and attributes in an XML document.

What You Should Already Know

Before you continue you should have a basic understanding of the following:

• HTML / XHTML
• XML / XML Namespaces

If you want to study these subjects first, find the tutorials on our Home page.

What is XPath?

• XPath is a syntax for defining parts of an XML document
• XPath uses path expressions to navigate in XML documents
• XPath contains a library of standard functions
• XPath is a major element in XSLT
• XPath is a W3C Standard

XPath Path Expressions

XPath uses path expressions to select nodes or node-sets in an XML document. These path expressions look very much like the expressions you see when you work with a traditional computer file system.

XPath Standard Functions

XPath 2.0 includes over 100 built-in functions. There are functions for string values, numeric values, date and time comparison, node and QName manipulation, sequence manipulation, Boolean values, and more.

XPath is Used in XSLT

XPath is a major element in the XSLT standard. Without XPath knowledge you will not be able to create XSLT documents. You can read more about XSLT in our XSLT tutorial. XQuery and XPointer are both built on XPath expressions. XQuery 1.0 and XPath 2.0 share the same data model and support the same functions and operators. You can read more about XQuery in our XQuery tutorial.

XPath is a W3C Standard

XPath became a W3C Recommendation on 16 November 1999. XPath was designed to be used by XSLT, XPointer and other XML parsing software. You can read more about the XPath standard in our W3C tutorial.

In XPath, there are seven kinds of nodes: element, attribute, text, namespace, processing-instruction, comment, and document (root) nodes.


XPath Terminology

Nodes

In XPath, there are seven kinds of nodes: element, attribute, text, namespace, processing-instruction, comment, and document (root) nodes. XML documents are treated as trees of nodes. The root of the tree is called the document node (or root node).

Look at the following XML document:

<bookstore>
<book>
  <title lang="en">Harry Potter</title>
  <author>J K. Rowling</author>
  <year>2005</year>
  <price>29.99</price>
</book>
</bookstore>

Example of nodes in the XML document above:

<bookstore> (root element node)
<author>J K. Rowling</author> (element node)
lang="en" (attribute node)

Atomic values

Atomic values are simple values with no children or parent.

Example of atomic values:

J K. Rowling
"en"

Items

Items are atomic values or nodes.

Relationship of Nodes

Parent

Each element and attribute has one parent. In the following example, the book element is the parent of the title, author, year, and price elements:

<book>
  <title>Harry Potter</title>
  <author>J K. Rowling</author>
  <year>2005</year>
  <price>29.99</price>
</book>

Children

Element nodes may have zero, one or more children. In the following example, the title, author, year, and price elements are all children of the book element:

<book>
  <title>Harry Potter</title>
  <author>J K. Rowling</author>
  <year>2005</year>
  <price>29.99</price>
</book>

Siblings

Nodes that have the same parent. In the following example, the title, author, year, and price elements are all siblings:

<book>
  <title>Harry Potter</title>
  <author>J K. Rowling</author>
  <year>2005</year>
  <price>29.99</price>
</book>

Ancestors

A node's parent, parent's parent, etc. In the following example, the ancestors of the title element are the book element and the bookstore element:

<bookstore>
<book>
  <title>Harry Potter</title>
  <author>J K. Rowling</author>
  <year>2005</year>
  <price>29.99</price>
</book>
</bookstore>

Descendants

A node's children, children's children, etc. In the following example, the descendants of the bookstore element are the book, title, author, year, and price elements:

<bookstore>
<book>
  <title>Harry Potter</title>
  <author>J K. Rowling</author>
  <year>2005</year>
  <price>29.99</price>
</book>
</bookstore>
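These relationships can be checked programmatically. The sketch below uses Python's standard xml.etree.ElementTree module on a retyped copy of the bookstore example; the helper names parent_of and ancestors are introduced here for illustration only.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<bookstore><book>"
    "<title>Harry Potter</title><author>J K. Rowling</author>"
    "<year>2005</year><price>29.99</price>"
    "</book></bookstore>"
)

# ElementTree elements do not store their parent, so build a child -> parent map.
parent_of = {child: parent for parent in doc.iter() for child in parent}


def ancestors(node):
    """Return the tags of the parent, the parent's parent, and so on."""
    tags = []
    while node in parent_of:
        node = parent_of[node]
        tags.append(node.tag)
    return tags


title = doc.find("book/title")
book = parent_of[title]
print("parent:", book.tag)
print("siblings:", [n.tag for n in book if n is not title])
print("ancestors:", ancestors(title))
print("descendants:", [n.tag for n in doc.iter() if n is not doc])
```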


XPath uses path expressions to select nodes or node-sets in an XML document. The node is selected by following a path or steps.

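The XML Example Document

We will use the following XML document in the examples below.

<bookstore>
<book>
  <title lang="eng">Harry Potter</title>
  <price>29.99</price>
</book>
<book>
  <title lang="eng">Learning XML</title>
  <price>39.95</price>
</book>
</bookstore>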

Selecting Nodes

XPath uses path expressions to select nodes in an XML document. The node is selected by following a path or steps. The most useful path expressions are listed below:

Expression    Description
nodename      Selects all child elements of the current node that are named nodename
/             Selects from the root node
//            Selects nodes in the document from the current node that match the selection no matter where they are
.             Selects the current node
..            Selects the parent of the current node
@             Selects attributes

Examples

In the table below we have listed some path expressions and the result of the expressions:

Path Expression    Result
bookstore          Selects all bookstore elements that are children of the current node
/bookstore         Selects the root element bookstore. Note: If the path starts with a slash ( / ) it always represents an absolute path to an element!
bookstore/book     Selects all book elements that are children of bookstore
//book             Selects all book elements no matter where they are in the document
bookstore//book    Selects all book elements that are descendants of the bookstore element, no matter where they are under the bookstore element
//@lang            Selects all attributes that are named lang
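A few of these expressions can be tried with Python's standard xml.etree.ElementTree module, which implements only a subset of XPath. The sketch below is illustrative: the sample document is retyped with lang="eng" on each title, which is an assumption based on the predicate examples, and paths are written relative to the root element because that is what fromstring() returns.

```python
import xml.etree.ElementTree as ET

bookstore = ET.fromstring(
    '<bookstore>'
    '<book><title lang="eng">Harry Potter</title><price>29.99</price></book>'
    '<book><title lang="eng">Learning XML</title><price>39.95</price></book>'
    '</bookstore>'
)

# bookstore/book: book elements that are children of bookstore.
books = bookstore.findall("book")

# //book: book elements anywhere below the root element.
all_books = bookstore.findall(".//book")

# //@lang: ElementTree cannot select attribute nodes, so collect the values.
langs = [el.get("lang") for el in bookstore.iter() if "lang" in el.attrib]

print(len(books), len(all_books), langs)
```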

Predicates

Predicates are used to find a specific node or a node that contains a specific value. Predicates are always embedded in square brackets.

Examples

In the table below we have listed some path expressions with predicates and the result of the expressions:

Path Expression                       Result
/bookstore/book[1]                    Selects the first book element that is the child of the bookstore element
/bookstore/book[last()]               Selects the last book element that is the child of the bookstore element
/bookstore/book[last()-1]             Selects the last but one book element that is the child of the bookstore element
/bookstore/book[position()<3]         Selects the first two book elements that are children of the bookstore element
//title[@lang]                        Selects all the title elements that have an attribute named lang
//title[@lang='eng']                  Selects all the title elements that have an attribute named lang with a value of 'eng'
/bookstore/book[price>35.00]          Selects all the book elements of the bookstore element that have a price element with a value greater than 35.00
/bookstore/book[price>35.00]/title    Selects all the title elements of the book elements of the bookstore element that have a price element with a value greater than 35.00
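Positional and attribute predicates can be sketched with Python's standard xml.etree.ElementTree module. Its XPath subset supports [1], [last()], [last()-1] and attribute tests but not numeric comparisons, so the price test below is done in plain Python. The sample document is retyped here, with lang="eng" assumed from the table above.

```python
import xml.etree.ElementTree as ET

bookstore = ET.fromstring(
    '<bookstore>'
    '<book><title lang="eng">Harry Potter</title><price>29.99</price></book>'
    '<book><title lang="eng">Learning XML</title><price>39.95</price></book>'
    '</bookstore>'
)

first = bookstore.findall("book[1]")               # /bookstore/book[1]
last = bookstore.findall("book[last()]")           # /bookstore/book[last()]
second_last = bookstore.findall("book[last()-1]")  # /bookstore/book[last()-1]
eng_titles = bookstore.findall(".//title[@lang='eng']")

# /bookstore/book[price>35.00]/title: the comparison is done in Python
# because ElementTree predicates only test for equality.
pricey = [
    book.find("title").text
    for book in bookstore.findall("book")
    if float(book.find("price").text) > 35.00
]

print(first[0].find("title").text)
print(last[0].find("title").text)
print(pricey)
```

Positions are counted from 1, as in the W3C standard.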

Selecting Unknown Nodes

XPath wildcards can be used to select unknown XML elements.

Wildcard    Description
*           Matches any element node
@*          Matches any attribute node
node()      Matches any node of any kind

Examples

In the table below we have listed some path expressions and the result of the expressions:

Path Expression    Result
/bookstore/*       Selects all the child element nodes of the bookstore element
//*                Selects all elements in the document
//title[@*]        Selects all title elements which have any attribute
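The wildcard examples can be approximated in Python with the standard xml.etree.ElementTree module. This is a sketch on a retyped copy of the sample document (lang="eng" is an assumption); ElementTree has no @* test, so that case is emulated by checking each element's attribute dictionary.

```python
import xml.etree.ElementTree as ET

bookstore = ET.fromstring(
    '<bookstore>'
    '<book><title lang="eng">Harry Potter</title><price>29.99</price></book>'
    '<book><title lang="eng">Learning XML</title><price>39.95</price></book>'
    '</bookstore>'
)

# /bookstore/*: every child element of bookstore, whatever its name.
children = bookstore.findall("*")

# //*: every element in the document, including the root element.
everything = list(bookstore.iter())

# //title[@*]: title elements that carry at least one attribute.
titles_with_attributes = [t for t in bookstore.iter("title") if t.attrib]

print([el.tag for el in children])
print(len(everything), len(titles_with_attributes))
```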

Selecting Several Paths

By using the | operator in an XPath expression you can select several paths.

Examples

In the table below we have listed some path expressions and the result of the expressions:

Path Expression                    Result
//book/title | //book/price        Selects all the title AND price elements of all book elements
//title | //price                  Selects all the title AND price elements in the document
/bookstore/book/title | //price    Selects all the title elements of the book element of the bookstore element AND all the price elements in the document
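Python's xml.etree.ElementTree module does not implement the | operator, but its effect (one node-set without duplicates) can be emulated by combining two result lists. The sketch below is an illustration of //title | //price on a retyped copy of the sample document, not a full XPath implementation.

```python
import xml.etree.ElementTree as ET

bookstore = ET.fromstring(
    '<bookstore>'
    '<book><title lang="eng">Harry Potter</title><price>29.99</price></book>'
    '<book><title lang="eng">Learning XML</title><price>39.95</price></book>'
    '</bookstore>'
)

# Collect both selections into one set so that no node appears twice.
wanted = set(bookstore.findall(".//title")) | set(bookstore.findall(".//price"))

# Walk the tree once to list the selected nodes in document order.
union = [el for el in bookstore.iter() if el in wanted]

print([el.tag for el in union])
```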

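The XML Example Document

We will use the following XML document in the examples below.

<bookstore>
<book>
  <title lang="eng">Harry Potter</title>
  <price>29.99</price>
</book>
<book>
  <title lang="eng">Learning XML</title>
  <price>39.95</price>
</book>
</bookstore>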

XPath Axes

An axis defines a node-set relative to the current node.

AxisName              Result
ancestor              Selects all ancestors (parent, grandparent, etc.) of the current node
ancestor-or-self      Selects all ancestors (parent, grandparent, etc.) of the current node and the current node itself
attribute             Selects all attributes of the current node
child                 Selects all children of the current node
descendant            Selects all descendants (children, grandchildren, etc.) of the current node
descendant-or-self    Selects all descendants (children, grandchildren, etc.) of the current node and the current node itself
following             Selects everything in the document after the closing tag of the current node
following-sibling     Selects all siblings after the current node
namespace             Selects all namespace nodes of the current node
parent                Selects the parent of the current node
preceding             Selects everything in the document that is before the start tag of the current node
preceding-sibling     Selects all siblings before the current node
self                  Selects the current node

Location Path Expression

A location path can be absolute or relative. An absolute location path starts with a slash ( / ) and a relative location path does not. In both cases the location path consists of one or more steps, each separated by a slash:

An absolute location path: /step/step/...
A relative location path: step/step/...

Each step is evaluated against the nodes in the current node-set. A step consists of:

• an axis (defines the tree-relationship between the selected nodes and the current node)
• a node-test (identifies a node within an axis)
• zero or more predicates (to further refine the selected node-set)

The syntax for a location step is:

axisname::nodetest[predicate]

Examples

Example                   Result
child::book               Selects all book nodes that are children of the current node
attribute::lang           Selects the lang attribute of the current node
child::*                  Selects all children of the current node
attribute::*              Selects all attributes of the current node
child::text()             Selects all text child nodes of the current node
child::node()             Selects all child nodes of the current node
descendant::book          Selects all book descendants of the current node
ancestor::book            Selects all book ancestors of the current node
ancestor-or-self::book    Selects all book ancestors of the current node - and the current as well if it is a book node
child::*/child::price     Selects all price grandchildren of the current node
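To make the idea of an axis concrete, the sketch below emulates three axes (child, descendant, ancestor) as small Python functions over an ElementTree document. The function names mirror the axis names but are hypothetical helpers written for this illustration, and the sample document is a retyped copy with the attributes left out.

```python
import xml.etree.ElementTree as ET

bookstore = ET.fromstring(
    "<bookstore>"
    "<book><title>Harry Potter</title><price>29.99</price></book>"
    "<book><title>Learning XML</title><price>39.95</price></book>"
    "</bookstore>"
)

# ElementTree elements do not store their parent, so build a child -> parent map.
parent_of = {c: p for p in bookstore.iter() for c in p}


def child(node, test="*"):
    """Emulate child::test."""
    return [c for c in node if test == "*" or c.tag == test]


def descendant(node, test="*"):
    """Emulate descendant::test (the node itself is excluded)."""
    return [d for d in node.iter()
            if d is not node and (test == "*" or d.tag == test)]


def ancestor(node, test="*"):
    """Emulate ancestor::test, nearest ancestor first."""
    found = []
    while node in parent_of:
        node = parent_of[node]
        if test == "*" or node.tag == test:
            found.append(node)
    return found


# child::*/child::price, evaluated with bookstore as the current node.
grandchild_prices = [p for c in child(bookstore) for p in child(c, "price")]
print([p.text for p in grandchild_prices])

first_title = descendant(bookstore, "title")[0]
print([a.tag for a in ancestor(first_title)])
```

Each function takes the current node and a node test, which matches the axisname::nodetest shape of a location step.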

An XPath expression returns either a node-set, a string, a Boolean, or a number.


XPath Operators

Below is a list of the operators that can be used in XPath expressions:

Operator    Description                     Example                      Return value
|           Computes the union of two node-sets    //book | //cd         Returns a node-set with all book and cd elements
+           Addition                        6 + 4                        10
-           Subtraction                     6 - 4                        2
*           Multiplication                  6 * 4                        24
div         Division                        8 div 4                      2
=           Equal                           price=9.80                   true if price is 9.80, false if price is 9.90
!=          Not equal                       price!=9.80                  true if price is 9.90, false if price is 9.80
<           Less than                       price<9.80                   true if price is 9.00, false if price is 9.80
<=          Less than or equal to           price<=9.80                  true if price is 9.00, false if price is 9.90
>           Greater than                    price>9.80                   true if price is 9.90, false if price is 9.80
>=          Greater than or equal to        price>=9.80                  true if price is 9.90, false if price is 9.70
or          or                              price=9.80 or price=9.70     true if price is 9.80, false if price is 9.50
and         and                             price>9.00 and price<9.90    true if price is 9.80, false if price is 8.50
mod         Modulus (division remainder)    5 mod 2                      1
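The arithmetic and comparison operators behave much like their counterparts in general-purpose languages. The sketch below shows rough Python equivalents; it is an analogy rather than an XPath evaluator. XPath 1.0 numbers are double-precision floating-point values, so div corresponds to floating-point division, and mod keeps the sign of the dividend, which math.fmod reproduces.

```python
import math

price = 9.80  # stands in for the value of a price element

print(6 + 4)             # XPath: 6 + 4
print(6 - 4)             # XPath: 6 - 4
print(6 * 4)             # XPath: 6 * 4
print(8 / 4)             # XPath: 8 div 4 (floating-point division)
print(math.fmod(5, 2))   # XPath: 5 mod 2 (sign follows the dividend)

print(price == 9.80 or price == 9.70)   # XPath: price=9.80 or price=9.70
print(9.00 < price < 9.90)              # XPath: price>9.00 and price<9.90
```

Python's own % operator takes the sign of the divisor instead, so it only agrees with mod when both operands are positive.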

Let's try to learn some basic XPath syntax by looking at some examples.

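The XML Example Document

We will use the following XML document in the examples below.

"books.xml":

<bookstore>
<book>
  <title>Everyday Italian</title>
  <author>Giada De Laurentiis</author>
  <year>2005</year>
  <price>30.00</price>
</book>
<book>
  <title>Harry Potter</title>
  <author>J K. Rowling</author>
  <year>2005</year>
  <price>29.99</price>
</book>
<book>
  <title>XQuery Kick Start</title>
  <author>James McGovern</author>
  <author>Per Bothner</author>
  <author>Kurt Cagle</author>
  <author>James Linn</author>
  <author>Vaidyanathan Nagarajan</author>
  <year>2003</year>
  <price>49.99</price>
</book>
<book>
  <title>Learning XML</title>
  <author>Erik T. Ray</author>
  <year>2003</year>
  <price>39.95</price>
</book>
</bookstore>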

Selecting Nodes

We will use the Microsoft XMLDOM object to load the XML document and the selectNodes() function to select nodes from the XML document:

set xmlDoc=CreateObject("Microsoft.XMLDOM")
xmlDoc.async=false
xmlDoc.load("books.xml")
xmlDoc.selectNodes(path expression)

Select all book Nodes

The following example selects all the book nodes under the bookstore element:

xmlDoc.selectNodes("/bookstore/book")

Select the First book Node

The following example selects only the first book node under the bookstore element:

xmlDoc.selectNodes("/bookstore/book[0]")

Note: IE5 and later treat [0] as the first node, but according to the W3C standard the first node is [1]!

A Workaround!

To solve the [0] and [1] problem in IE5+, you can set the SelectionLanguage to XPath. The following example selects only the first book node under the bookstore element:

xmlDoc.setProperty "SelectionLanguage", "XPath"
xmlDoc.selectNodes("/bookstore/book[1]")

Select the prices

The following example selects the text from all the price nodes:

xmlDoc.selectNodes("/bookstore/book/price/text()")

Selecting price Nodes with Price>35

The following example selects all the price nodes with a price higher than 35:

xmlDoc.selectNodes("/bookstore/book[price>35]/price")

Selecting title Nodes with Price>35

The following example selects all the title nodes with a price higher than 35:

xmlDoc.selectNodes("/bookstore/book[price>35]/title")
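The same selections can be tried without Internet Explorer. The sketch below uses Python's standard xml.etree.ElementTree module on an abbreviated copy of "books.xml" (authors and years are left out, which is a simplification made here). ElementTree supports only a subset of XPath, so the price comparison is written in Python.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of books.xml: only titles and prices are kept.
BOOKS_XML = """
<bookstore>
  <book><title>Everyday Italian</title><price>30.00</price></book>
  <book><title>Harry Potter</title><price>29.99</price></book>
  <book><title>XQuery Kick Start</title><price>49.99</price></book>
  <book><title>Learning XML</title><price>39.95</price></book>
</bookstore>
"""

bookstore = ET.fromstring(BOOKS_XML)

all_books = bookstore.findall("book")       # /bookstore/book
first_book = bookstore.find("book[1]")      # /bookstore/book[1], counted from 1
prices = [p.text for p in bookstore.findall("book/price")]  # .../price/text()

# /bookstore/book[price>35]/title, with the comparison done in Python.
titles_over_35 = [
    b.find("title").text
    for b in all_books
    if float(b.find("price").text) > 35
]

print(len(all_books))
print(first_book.find("title").text)
print(prices)
print(titles_over_35)
```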

XPath Summary

This tutorial has taught you how to find information in an XML document. You have learned how to use XPath to navigate through elements and attributes in an XML document. You have also learned how to use some of the standard functions that are built into XPath. For more information on XPath, please look at our XPath Reference.

Now You Know XPath, What's Next?

The next step is to learn about XSLT, XQuery, XLink, and XPointer.

XSLT

XSLT is the style sheet language for XML files. With XSLT you can transform XML documents into other formats, like XHTML. If you want to learn more about XSLT, please visit our XSLT tutorial.

XQuery

XQuery is about querying XML data. XQuery is designed to query anything that can appear as XML, including databases. If you want to learn more about XQuery, please visit our XQuery tutorial.

XLink and XPointer

Linking in XML is divided into two parts: XLink and XPointer. XLink and XPointer define a standard way of creating hyperlinks in XML documents. If you want to learn more about XLink and XPointer, please visit our XLink and XPointer tutorial.

Introduction to XSL-FO: XSL-FO is about formatting XML data for output.

What You Should Already Know

Before you study XSL-FO you should have a basic understanding of XML and XML Namespaces. If you want to study these subjects first, please read our XML Tutorial.

What is XSL-FO?

• XSL-FO is a language for formatting XML data
• XSL-FO stands for Extensible Stylesheet Language Formatting Objects
• XSL-FO is a W3C Recommendation
• XSL-FO is now formally named XSL

XSL-FO is About Formatting

XSL-FO is an XML-based markup language describing the formatting of XML data for output to screen, paper or other media.

XSL-FO is Formally Named XSL

Why this confusion? Are XSL-FO and XSL the same thing?

Yes, they are, but we will give you an explanation: Styling is about both transforming and formatting information. When the World Wide Web Consortium (W3C) made their first XSL Working Draft, it contained the language syntax for both transforming and formatting XML documents. Later, the XSL Working Group at W3C split the original draft into separate Recommendations:

• XSLT, a language for transforming XML documents
• XSL or XSL-FO, a language for formatting XML documents
• XPath, a language for navigating through elements and attributes in XML documents

The rest of this tutorial is about formatting XML documents: XSL-FO, also called XSL. You can read more about XSLT in our XSLT Tutorial. You can read more about XPath in our XPath Tutorial.

XSL-FO is a Web Standard

XSL-FO became a W3C Recommendation on 15 October 2001, formally named XSL. To read more about the XSL activities at W3C please read our W3C Tutorial.

XSL-FO documents are XML files with output information.

XSL-FO Documents

XSL-FO documents are XML files with output information. They contain information about the output layout and output contents. XSL-FO documents are stored in files with a .fo or a .fob file extension. It is also quite common to see XSL-FO documents stored with an .xml extension, because this makes them more accessible to XML editors.

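XSL-FO Document Structure

XSL-FO documents have a structure like this:

<?xml version="1.0" encoding="ISO-8859-1"?>

<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">

<fo:layout-master-set>
  <fo:simple-page-master master-name="A4">
    <!-- Page template goes here -->
  </fo:simple-page-master>
</fo:layout-master-set>

<fo:page-sequence master-reference="A4">
  <!-- Page content goes here -->
</fo:page-sequence>

</fo:root>

Structure explained

XSL-FO documents are XML documents, and must always start with an XML declaration:

<?xml version="1.0" encoding="ISO-8859-1"?>

The <fo:root> element is the root element of XSL-FO documents. The root element also declares the namespace for XSL-FO:

<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <!-- The full XSL-FO document goes here -->
</fo:root>

The <fo:layout-master-set> element contains one or more page templates:

<fo:layout-master-set>
  <!-- All page templates go here -->
</fo:layout-master-set>

Each <fo:simple-page-master> element contains a single page template. Each template must have a unique name (master-name):

<fo:simple-page-master master-name="A4">
  <!-- One page template goes here -->
</fo:simple-page-master>

One or more <fo:page-sequence> elements describe the page contents. The master-reference attribute refers to the simple-page-master template with the same name:

<fo:page-sequence master-reference="A4">
  <!-- Page content goes here -->
</fo:page-sequence>

Note: The master-reference "A4" does not actually describe a predefined page format. It is just a name. You can use any name like "MyPage", "MyTemplate", etc.

XSL-FO uses rectangular boxes (areas) to display output.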

XSL-FO Areas

The XSL formatting model defines a number of rectangular areas (boxes) to display output. All output (text, pictures, etc.) will be formatted into these boxes and then displayed or printed to a target media. We will take a closer look at the following areas:

• Pages
• Regions
• Block areas
• Line areas
• Inline areas

XSL-FO Pages

XSL-FO output is formatted into pages. Printed output will normally go into many separate pages. Browser output will often go into one long page. XSL-FO Pages contain Regions.

XSL-FO Regions

Each XSL-FO Page contains a number of Regions:

• region-body (the body of the page)
• region-before (the header of the page)
• region-after (the footer of the page)
• region-start (the left sidebar)
• region-end (the right sidebar)

XSL-FO Regions contain Block areas.

XSL-FO Block Areas

XSL-FO Block areas define small block elements (the ones that normally start with a new line) like paragraphs, tables and lists. XSL-FO Block areas can contain other Block areas, but most often they contain Line areas.

XSL-FO Line Areas

XSL-FO Line areas define text lines inside Block areas. XSL-FO Line areas contain Inline areas.

XSL-FO Inline Areas

XSL-FO Inline areas define text inside Lines (bullets, single characters, graphics, and more).

XSL-FO defines output inside <fo:page-sequence> elements.

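XSL-FO Page, Flow, and Block

"Blocks" of content "flow" into "Pages" and then to the output media. XSL-FO output is normally nested inside <fo:block> elements, nested inside <fo:flow> elements, nested inside <fo:page-sequence> elements:

<fo:page-sequence>
  <fo:flow flow-name="xsl-region-body">
    <fo:block>
      <!-- Output goes here -->
    </fo:block>
  </fo:flow>
</fo:page-sequence>

XSL-FO Example

It is time to look at a real XSL-FO example:

<?xml version="1.0" encoding="ISO-8859-1"?>

<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">

<fo:layout-master-set>
  <fo:simple-page-master master-name="A4">
    <fo:region-body />
  </fo:simple-page-master>
</fo:layout-master-set>

<fo:page-sequence master-reference="A4">
  <fo:flow flow-name="xsl-region-body">
    <fo:block>Hello W3Schools</fo:block>
  </fo:flow>
</fo:page-sequence>

</fo:root>

The output from this code would be something like this:

Hello W3Schools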
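The nesting of root, layout-master-set, page-sequence, flow and block can also be produced programmatically. The sketch below builds a minimal document with Python's standard xml.etree.ElementTree module; the helper fo() is a hypothetical convenience defined here, and the master name "A4" is simply the name used in this tutorial.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)  # serialize with the fo: prefix


def fo(tag):
    """Return the namespace-qualified name that ElementTree expects."""
    return "{" + FO_NS + "}" + tag


root = ET.Element(fo("root"))

# Page template: one simple-page-master with a body region.
masters = ET.SubElement(root, fo("layout-master-set"))
page = ET.SubElement(masters, fo("simple-page-master"), {"master-name": "A4"})
ET.SubElement(page, fo("region-body"))

# Page content: a block inside a flow inside a page-sequence.
sequence = ET.SubElement(root, fo("page-sequence"), {"master-reference": "A4"})
flow = ET.SubElement(sequence, fo("flow"), {"flow-name": "xsl-region-body"})
ET.SubElement(flow, fo("block")).text = "Hello W3Schools"

fo_text = ET.tostring(root, encoding="unicode")
print(fo_text)
```

The result is only the XSL-FO source; turning it into pages still requires an XSL-FO formatter.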

XSL-FO Flow

XSL-FO pages are filled with data from <fo:flow> elements.

XSL-FO Page Sequences

XSL-FO uses <fo:page-sequence> elements to define output pages. Each output page refers to a page master which defines the layout. Each output page has a <fo:flow> element defining the output. Each output page is printed (or displayed) in sequence.

XSL-FO Flow

XSL-FO pages are filled with content from the <fo:flow> element. The <fo:flow> element contains all the elements to be printed to the page. When the page is full, the same page master will be used over (and over) again until all the text is printed.

Where To Flow?

The <fo:flow> element has a "flow-name" attribute. The value of the flow-name attribute defines where the content of the <fo:flow> element will go. The legal values are:

• xsl-region-body (into the region-body)
• xsl-region-before (into the region-before)
• xsl-region-after (into the region-after)
• xsl-region-start (into the region-start)
• xsl-region-end (into the region-end)

XSL-FO Pages: XSL-FO uses page templates called "Page Masters" to define the layout of pages.

XSL-FO Page Templates XSL-FO uses page templates called "Page Masters" to define the layout of pages. Each template must have a unique name:




In the example above, three <fo:simple-page-master> elements define three different templates. Each template (page-master) has a different name. The first template is called "intro". It could be used as a template for introduction pages. The second and third templates are called "left" and "right". They could be used as templates for even and odd page numbers.
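As a hedged sketch, three page masters with the names mentioned above might be declared as in the string below. The empty region-body elements and the helper name `master_names` are assumptions; the Python code only checks that the names are unique, which is the rule stated for page templates.

```python
import xml.etree.ElementTree as ET

FO = "{http://www.w3.org/1999/XSL/Format}"

# Assumed layout-master-set with three uniquely named page masters.
MASTERS = """\
<fo:layout-master-set xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:simple-page-master master-name="intro">
    <fo:region-body/>
  </fo:simple-page-master>
  <fo:simple-page-master master-name="left">
    <fo:region-body/>
  </fo:simple-page-master>
  <fo:simple-page-master master-name="right">
    <fo:region-body/>
  </fo:simple-page-master>
</fo:layout-master-set>
"""


def master_names(source: str) -> list[str]:
    """Collect the master-name of each fo:simple-page-master, in order."""
    root = ET.fromstring(source)
    return [m.get("master-name") for m in root.findall(FO + "simple-page-master")]


names = master_names(MASTERS)
# Each template must have a unique name.
if len(names) != len(set(names)):
    raise ValueError("duplicate master-name found")
print(names)
```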

XSL-FO Page Size XSL-FO uses the following attributes to define the size of a page:

• page-width defines the width of a page
• page-height defines the height of a page

XSL-FO Page Margins XSL-FO uses the following attributes to define the margins of a page:

• margin-top defines the top margin
• margin-bottom defines the bottom margin
• margin-left defines the left margin
• margin-right defines the right margin
• margin defines all four margins

XSL-FO Page Regions XSL-FO uses the following elements to define the regions of a page:

• region-body defines the body region
• region-before defines the top region (header)
• region-after defines the bottom region (footer)
• region-start defines the left region (left sidebar)
• region-end defines the right region (right sidebar)

Note that the region-before, region-after, region-start, and region-end are parts of the body region. To prevent text in the body region from overwriting text in these regions, the body region must have margins at least the size of these regions.

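[Figure: page layout. The page is surrounded by Margin Top, Margin Bottom, Margin Left, and Margin Right. Inside the margins, REGION BEFORE is at the top, REGION AFTER at the bottom, REGION START on the left, REGION END on the right, and REGION BODY in the center.]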

XSL-FO Example This is an extract from an XSL-FO document:

The code above defines a "Simple Page Master Template" with the name "A4". The width of the page is 297 millimeters and the height is 210 millimeters. The top, bottom, left, and right margins of the page are all 1 centimeter. The body has a 3 centimeter margin (on all sides). The before, after, start, and end regions (of the body) are all 2 centimeters.

The width of the body in the example above can be calculated by subtracting the left and right margins and the region-body margins from the width of the page itself:

297mm - (2 x 1cm) - (2 x 3cm) = 297mm - 20mm - 60mm = 217mm

Note that the regions (region-start and region-end) are not a part of the calculation. As described earlier, these regions are parts of the body.
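The calculation above can be sketched in Python against a page master built from the values given in the text. The markup string is a reconstruction under that assumption, and the helper names `to_mm` and `body_width_mm` are hypothetical; only millimeter and centimeter units are handled.

```python
import xml.etree.ElementTree as ET

FO = "{http://www.w3.org/1999/XSL/Format}"

# Assumed page master using the sizes described in the text.
PAGE_MASTER = """\
<fo:simple-page-master xmlns:fo="http://www.w3.org/1999/XSL/Format"
    master-name="A4" page-width="297mm" page-height="210mm"
    margin-top="1cm" margin-bottom="1cm"
    margin-left="1cm" margin-right="1cm">
  <fo:region-body margin="3cm"/>
  <fo:region-before extent="2cm"/>
  <fo:region-after extent="2cm"/>
  <fo:region-start extent="2cm"/>
  <fo:region-end extent="2cm"/>
</fo:simple-page-master>
"""


def to_mm(length: str) -> float:
    """Convert a length such as '297mm' or '1cm' to millimeters."""
    for suffix, factor in (("mm", 1.0), ("cm", 10.0)):
        if length.endswith(suffix):
            return float(length[: -len(suffix)]) * factor
    raise ValueError(f"unsupported unit in {length!r}")


def body_width_mm(source: str) -> float:
    """Page width minus left/right page margins minus region-body margins."""
    master = ET.fromstring(source)
    body = master.find(FO + "region-body")
    page_width = to_mm(master.get("page-width"))
    page_margins = to_mm(master.get("margin-left")) + to_mm(master.get("margin-right"))
    body_margins = 2 * to_mm(body.get("margin"))
    # region-start and region-end lie inside the body margins, so they are not subtracted.
    return page_width - page_margins - body_margins


print(body_width_mm(PAGE_MASTER))  # 217.0
```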

XSL-FO Blocks: XSL-FO output goes into blocks.

XSL-FO Pages, Flow, and Block "Blocks" of content "Flow" into "Pages" of the output media. XSL-FO output is normally nested inside <fo:block> elements, nested inside <fo:flow> elements, nested inside <fo:page-sequence> elements.



Block Area Attributes Blocks are sequences of output in rectangular boxes:

This block of output will have a one millimeter border around it.

Since block areas are rectangular boxes, they share many common area properties:

• space before and space after
• margin
• border
• padding

[Figure: nested boxes. From the outside in: space before, margin, border, padding, content; space after follows the block.]

The space before and space after are the empty spaces separating the block from the other blocks. The margin is the empty area on the outside of the block. The border is the rectangle drawn around the external edge of the area. It can have different widths on all four sides. It can also be filled with different colors and background images. The padding is the area between the border and the content area. The content area contains the actual content like text, pictures, graphics, or whatever.

Block Margin

• margin
• margin-top
• margin-bottom
• margin-left
• margin-right

Block Border Border style attributes:

• border-style
• border-before-style
• border-after-style
• border-start-style
• border-end-style
• border-top-style (same as border-before)
• border-bottom-style (same as border-after)
• border-left-style (same as border-start)
• border-right-style (same as border-end)

Border color attributes:

• border-color
• border-before-color
• border-after-color
• border-start-color
• border-end-color
• border-top-color (same as border-before)
• border-bottom-color (same as border-after)
• border-left-color (same as border-start)
• border-right-color (same as border-end)

Border width attributes:

• border-width
• border-before-width
• border-after-width
• border-start-width
• border-end-width
• border-top-width (same as border-before)
• border-bottom-width (same as border-after)
• border-left-width (same as border-start)
• border-right-width (same as border-end)

Block Padding

• padding
• padding-before
• padding-after
• padding-start
• padding-end
• padding-top (same as padding-before)
• padding-bottom (same as padding-after)
• padding-left (same as padding-start)
• padding-right (same as padding-end)

Block Background

• background-color
• background-image
• background-repeat
• background-attachment (scroll or fixed)

Block Styling Attributes Blocks are sequences of output that can be styled individually:

This block of output will be written in a 12pt sans-serif font.

Font attributes:

• font-family
• font-weight
• font-style
• font-size
• font-variant

Text attributes:

• text-align
• text-align-last
• text-indent
• start-indent
• end-indent
• wrap-option (defines word wrap)
• break-before (defines page breaks)
• break-after (defines page breaks)
• reference-orientation (defines text rotation in 90-degree increments)

Example

W3Schools

At W3Schools you will find all the Web-building tutorials you need, from basic HTML and XHTML to advanced XML, XSL, Multimedia and WAP.

Result:

W3Schools

At W3Schools you will find all the Web-building tutorials you need, from basic HTML and XHTML to advanced XML, XSL, Multimedia and WAP.

When you look at the example above, you can see that it will take a lot of code to produce a document with many headers and paragraphs. Normally XSL-FO documents do not combine formatting information and content like we have done here. With a little help from XSLT we can put the formatting information into templates and write cleaner content.

You will learn more about how to combine XSL-FO with XSLT templates in a later chapter in this tutorial.

XSL-FO Lists: XSL-FO uses List Blocks to define lists.

XSL-FO List Blocks There are four XSL-FO objects used to create lists:

• fo:list-block (contains the whole list)
• fo:list-item (contains each item in the list)
• fo:list-item-label (contains the label for the list-item - typically an <fo:block> containing a number, character, etc.)
• fo:list-item-body (contains the content/body of the list-item - typically one or more <fo:block> objects)

An XSL-FO list example:

* Volvo
* Saab

The output from this code would be:

* Volvo
* Saab
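A minimal sketch of list markup that would give this output, using the four list objects named above, is shown in the string below. The exact markup and the helper name `list_items` are assumptions for illustration; the Python code only pairs each label with its body.

```python
import xml.etree.ElementTree as ET

FO = "{http://www.w3.org/1999/XSL/Format}"

# Assumed list: each item has a label block ("*") and a body block.
LIST_FO = """\
<fo:list-block xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:list-item>
    <fo:list-item-label>
      <fo:block>*</fo:block>
    </fo:list-item-label>
    <fo:list-item-body>
      <fo:block>Volvo</fo:block>
    </fo:list-item-body>
  </fo:list-item>
  <fo:list-item>
    <fo:list-item-label>
      <fo:block>*</fo:block>
    </fo:list-item-label>
    <fo:list-item-body>
      <fo:block>Saab</fo:block>
    </fo:list-item-body>
  </fo:list-item>
</fo:list-block>
"""


def list_items(source: str) -> list[str]:
    """Return 'label body' for every fo:list-item in the list block."""
    root = ET.fromstring(source)
    items = []
    for item in root.findall(FO + "list-item"):
        label = item.find(FO + "list-item-label/" + FO + "block").text
        body = item.find(FO + "list-item-body/" + FO + "block").text
        items.append(f"{label} {body}")
    return items


print("\n".join(list_items(LIST_FO)))
```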

XSL-FO Tables: XSL-FO uses the <fo:table-and-caption> element to define tables.

XSL-FO Tables The XSL-FO table model is not very different from the HTML table model. There are nine XSL-FO objects used to create tables:

• fo:table-and-caption
• fo:table
• fo:table-caption
• fo:table-column
• fo:table-header
• fo:table-footer
• fo:table-body
• fo:table-row
• fo:table-cell

XSL-FO uses the <fo:table-and-caption> element to define a table. It contains a <fo:table> and an optional <fo:table-caption> element. The <fo:table> element contains optional <fo:table-column> elements, an optional <fo:table-header> element, a <fo:table-body> element, and an optional <fo:table-footer> element. Each of these elements has one or more <fo:table-row> elements, with one or more <fo:table-cell> elements:

Car Price Volvo $50000 SAAB $48000

The output from this code would be something like this:

Car      Price
Volvo    $50000
SAAB     $48000
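As a hedged sketch, the table structure described above could be written as in the string below. The column widths, the bold header cells, and the helper name `table_rows` are assumptions for illustration; the Python code walks the rows in document order and collects the cell text.

```python
import xml.etree.ElementTree as ET

FO = "{http://www.w3.org/1999/XSL/Format}"

# Assumed table: two columns, one header row, two body rows.
TABLE_FO = """\
<fo:table-and-caption xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:table>
    <fo:table-column column-width="25mm"/>
    <fo:table-column column-width="25mm"/>
    <fo:table-header>
      <fo:table-row>
        <fo:table-cell><fo:block font-weight="bold">Car</fo:block></fo:table-cell>
        <fo:table-cell><fo:block font-weight="bold">Price</fo:block></fo:table-cell>
      </fo:table-row>
    </fo:table-header>
    <fo:table-body>
      <fo:table-row>
        <fo:table-cell><fo:block>Volvo</fo:block></fo:table-cell>
        <fo:table-cell><fo:block>$50000</fo:block></fo:table-cell>
      </fo:table-row>
      <fo:table-row>
        <fo:table-cell><fo:block>SAAB</fo:block></fo:table-cell>
        <fo:table-cell><fo:block>$48000</fo:block></fo:table-cell>
      </fo:table-row>
    </fo:table-body>
  </fo:table>
</fo:table-and-caption>
"""


def table_rows(source: str) -> list[list[str]]:
    """Return the cell text of every fo:table-row, header row first."""
    root = ET.fromstring(source)
    rows = []
    for row in root.iter(FO + "table-row"):
        cells = row.findall(FO + "table-cell")
        rows.append([cell.find(FO + "block").text for cell in cells])
    return rows


for row in table_rows(TABLE_FO):
    print(row)
```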

XSL-FO and XSLT: XSL-FO and XSLT can help each other.

Remember this Example?

W3Schools

At W3Schools you will find all the Web-building tutorials you need, from basic HTML and XHTML to advanced XML, XSL, Multimedia and WAP.

Result:

W3Schools

At W3Schools you will find all the Web-building tutorials you need, from basic HTML and XHTML to advanced XML, XSL, Multimedia and WAP.

The example above is from the chapter about XSL-FO Blocks.

With a Little Help from XSLT Remove the XSL-FO information from the document:

<header> W3Schools </header>
<paragraph> At W3Schools you will find all the Web-building tutorials you need, from basic HTML and XHTML to advanced XML, XSL, Multimedia and WAP. </paragraph>

Add an XSLT transformation:

<xsl:template match="header">
  <xsl:apply-templates/>
</xsl:template>

<xsl:template match="paragraph">
  <xsl:apply-templates/>
</xsl:template>

And the result will be the same.
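To illustrate the idea of moving formatting into templates, the following Python sketch (a stand-in, not XSLT) wraps each header and paragraph element in an fo:block that carries the formatting. The <document> wrapper element, the attribute values in `TEMPLATES`, and the function name `apply_templates` are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)

# Content only, with no formatting information (wrapper element is assumed).
CONTENT = """\
<document>
  <header>W3Schools</header>
  <paragraph>At W3Schools you will find all the Web-building tutorials you need.</paragraph>
</document>
"""

# One entry per "template": element name -> fo:block attributes (assumed values).
TEMPLATES = {
    "header": {"font-size": "14pt", "font-family": "verdana", "color": "red"},
    "paragraph": {"font-size": "12pt", "font-family": "verdana", "text-indent": "5mm"},
}


def apply_templates(source: str) -> list[ET.Element]:
    """Turn each content element into an fo:block with the matching formatting."""
    blocks = []
    for element in ET.fromstring(source):
        block = ET.Element(f"{{{FO_NS}}}block", TEMPLATES[element.tag])
        block.text = element.text
        blocks.append(block)
    return blocks


for block in apply_templates(CONTENT):
    print(ET.tostring(block, encoding="unicode"))
```

Keeping the formatting in one table means a document with many headers and paragraphs repeats only the content, not the attributes.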

XQuery Tutorial: The best way to explain XQuery is to say that XQuery is to XML what SQL is to database tables. XQuery is designed to query XML data - not just XML files, but anything that can appear as XML, including databases.

What You Should Already Know Before you continue you should have a basic understanding of the following:

• HTML / XHTML
• XML / XML Namespaces
• XPath

If you want to study these subjects first, find the tutorials on our Home page.

What is XQuery?

• XQuery is the language for querying XML data
• XQuery for XML is like SQL for databases
• XQuery is built on XPath expressions
• XQuery is supported by all the major database engines (IBM, Oracle, Microsoft, etc.)
• XQuery is a W3C Recommendation

XQuery is About Querying XML XQuery is a language for finding and extracting elements and attributes from XML documents. Here is an example of a question that XQuery could solve: "Select all CD records with a price less than $10 from the CD collection stored in the XML document called cd_catalog.xml"

XQuery and XPath XQuery 1.0 and XPath 2.0 share the same data model and support the same functions and operators. If you have already studied XPath you will have no problems with understanding XQuery. You can read more about XPath in our XPath Tutorial.

XQuery - Examples of Use XQuery can be used to:

• Extract information to use in a Web Service
• Generate summary reports
• Transform XML data to XHTML
• Search Web documents for relevant information

Let's try to learn some basic XQuery syntax by looking at an example.

The XML Example Document We will use the following XML document in the examples below. "books.xml":

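<bookstore>

<book category="COOKING">
  <title lang="en">Everyday Italian</title>
  <author>Giada De Laurentiis</author>
  <year>2005</year>
  <price>30.00</price>
</book>

<book category="CHILDREN">
  <title lang="en">Harry Potter</title>
  <author>J K. Rowling</author>
  <year>2005</year>
  <price>29.99</price>
</book>

<book category="WEB">
  <title lang="en">XQuery Kick Start</title>
  <author>James McGovern</author>
  <author>Per Bothner</author>
  <author>Kurt Cagle</author>
  <author>James Linn</author>
  <author>Vaidyanathan Nagarajan</author>
  <year>2003</year>
  <price>49.99</price>
</book>

<book category="WEB">
  <title lang="en">Learning XML</title>
  <author>Erik T. Ray</author>
  <year>2003</year>
  <price>39.95</price>
</book>

</bookstore>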

How to Select Nodes From "books.xml"?

Functions XQuery uses functions to extract data from XML documents. The doc() function is used to open the "books.xml" file:

doc("books.xml")

Path Expressions XQuery uses path expressions to navigate through elements in an XML document. The following path expression is used to select all the title elements in the "books.xml" file:

doc("books.xml")/bookstore/book/title

(/bookstore selects the bookstore element, /book selects all the book elements under the bookstore element, and /title selects all the title elements under each book element)

The XQuery above will extract the following:

<title lang="en">Everyday Italian</title>
<title lang="en">Harry Potter</title>
<title lang="en">XQuery Kick Start</title>
<title lang="en">Learning XML</title>

Predicates XQuery uses predicates to limit the extracted data from XML documents. The following predicate is used to select all the book elements under the bookstore element that have a price element with a value that is less than 30:

doc("books.xml")/bookstore/book[price<30]

The XQuery above will extract the following:

<book category="CHILDREN">
  <title lang="en">Harry Potter</title>
  <author>J K. Rowling</author>
  <year>2005</year>
  <price>29.99</price>
</book>
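For readers without an XQuery processor at hand, the same two selections can be approximated in Python with the standard library. This is only a sketch: `BOOKS_XML` is a trimmed stand-in for "books.xml" (authors and years omitted), and ElementTree's limited path support means the price test is written as an ordinary Python condition rather than a predicate.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for books.xml: only category, title and price are kept.
BOOKS_XML = """\
<bookstore>
  <book category="COOKING"><title lang="en">Everyday Italian</title><price>30.00</price></book>
  <book category="CHILDREN"><title lang="en">Harry Potter</title><price>29.99</price></book>
  <book category="WEB"><title lang="en">XQuery Kick Start</title><price>49.99</price></book>
  <book category="WEB"><title lang="en">Learning XML</title><price>39.95</price></book>
</bookstore>
"""

bookstore = ET.fromstring(BOOKS_XML)

# Roughly: doc("books.xml")/bookstore/book/title
titles = [title.text for title in bookstore.findall("book/title")]

# Roughly: doc("books.xml")/bookstore/book[price<30]
cheap_books = [
    book for book in bookstore.findall("book")
    if float(book.findtext("price")) < 30
]

print(titles)
print([book.findtext("title") for book in cheap_books])
```

Note that the book priced at exactly 30.00 is not selected, because the comparison is strictly "less than".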

The XML Example Document We will use the "books.xml" document in the examples below (same XML file as in the previous chapter).

How to Select Nodes From "books.xml" With FLWOR Look at the following path expression:

doc("books.xml")/bookstore/book[price>30]/title

The expression above will select all the title elements under the book elements that are under the bookstore element that have a price element with a value that is higher than 30.

The following FLWOR expression will select exactly the same as the path expression above:

for $x in doc("books.xml")/bookstore/book
where $x/price>30
return $x/title

The result will be:

<title lang="en">XQuery Kick Start</title>
<title lang="en">Learning XML</title>

With FLWOR you can sort the result:

for $x in doc("books.xml")/bookstore/book
where $x/price>30
order by $x/title
return $x/title

FLWOR is an acronym for "For, Let, Where, Order by, Return".

The for clause selects all book elements under the bookstore element into a variable called $x.

The where clause selects only book elements with a price element with a value greater than 30.

The order by clause defines the sort order. Here the result is sorted by the title element.

The return clause specifies what should be returned. Here it returns the title elements.

The result of the XQuery expression above will be:

<title lang="en">Learning XML</title>
<title lang="en">XQuery Kick Start</title>
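The for, where, order by, and return clauses map fairly directly onto a loop, a filter, a sort, and a projection. The Python sketch below mirrors the sorted FLWOR expression; `BOOKS_XML` is a trimmed stand-in for "books.xml" and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for books.xml: only title and price are kept.
BOOKS_XML = """\
<bookstore>
  <book><title lang="en">Everyday Italian</title><price>30.00</price></book>
  <book><title lang="en">Harry Potter</title><price>29.99</price></book>
  <book><title lang="en">XQuery Kick Start</title><price>49.99</price></book>
  <book><title lang="en">Learning XML</title><price>39.95</price></book>
</bookstore>
"""


def expensive_titles_sorted(source: str) -> list[str]:
    """Mimic: for ... where price>30 order by title return title."""
    books = ET.fromstring(source).findall("book")                 # for $x in .../book
    matching = [b for b in books
                if float(b.findtext("price")) > 30]               # where $x/price>30
    ordered = sorted(matching, key=lambda b: b.findtext("title")) # order by $x/title
    return [b.findtext("title") for b in ordered]                 # return $x/title


print(expensive_titles_sorted(BOOKS_XML))
```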

The XML Example Document We will use the "books.xml" document in the examples below (same XML file as in the previous chapters).

Present the Result In an HTML List Look at the following XQuery FLWOR expression:

for $x in doc("books.xml")/bookstore/book/title
order by $x
return $x

The expression above will select all the title elements under the book elements that are under the bookstore element, and return the title elements in alphabetical order.

Now we want to list all the book titles in our bookstore in an HTML list. We add <ul> and <li> tags to the FLWOR expression:

<ul>
{
for $x in doc("books.xml")/bookstore/book/title
order by $x
return <li>{$x}</li>
}
</ul>

The result of the above will be:

<ul>
<li><title lang="en">Everyday Italian</title></li>
<li><title lang="en">Harry Potter</title></li>
<li><title lang="en">Learning XML</title></li>
<li><title lang="en">XQuery Kick Start</title></li>
</ul>

Now we want to eliminate the title element, and show only the data inside the title element:

<ul>
{
for $x in doc("books.xml")/bookstore/book/title
order by $x
return <li>{data($x)}</li>
}
</ul>

The result will be (an HTML list):

<ul>
<li>Everyday Italian</li>
<li>Harry Potter</li>
<li>Learning XML</li>
<li>XQuery Kick Start</li>
</ul>
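The same list can be produced outside XQuery. The Python sketch below sorts the title text and wraps each entry in list-item tags, which corresponds to the variant that shows only the data inside the title element; `BOOKS_XML` is a trimmed stand-in for "books.xml" and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Trimmed stand-in for books.xml: only the titles are kept.
BOOKS_XML = """\
<bookstore>
  <book><title lang="en">Everyday Italian</title></book>
  <book><title lang="en">Harry Potter</title></book>
  <book><title lang="en">XQuery Kick Start</title></book>
  <book><title lang="en">Learning XML</title></book>
</bookstore>
"""


def titles_as_html_list(source: str) -> str:
    """Return an HTML unordered list of the title text, alphabetically sorted."""
    titles = sorted(t.text for t in ET.fromstring(source).findall("book/title"))
    # escape() guards against titles that contain characters such as & or <.
    items = "\n".join(f"<li>{escape(title)}</li>" for title in titles)
    return f"<ul>\n{items}\n</ul>"


print(titles_as_html_list(BOOKS_XML))
```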
In XQuery, there are seven kinds of nodes: element, attribute, text, namespace, processing-instruction, comment, and document (root) nodes.

XQuery Terminology

Nodes XML documents are treated as trees of nodes. The root of the tree is called the document node (or root node). Look at the following XML document:

<bookstore>
<book>
  <title lang="en">Harry Potter</title>
  <author>J K. Rowling</author>
  <year>2005</year>
  <price>29.99</price>
</book>
</bookstore>

Example of nodes in the XML document above:

<bookstore> (document node)
<author>J K. Rowling</author> (element node)
lang="en" (attribute node)

Atomic values Atomic values are nodes with no children or parent. Example of atomic values:

J K. Rowling
"en"

Items Items are atomic values or nodes.

Relationship of Nodes

Parent Each element and attribute has one parent. In the following example, the book element is the parent of the title, author, year, and price:

<book>
  <title>Harry Potter</title>
  <author>J K. Rowling</author>
  <year>2005</year>
  <price>29.99</price>
</book>

Children Element nodes may have zero, one or more children. In the following example, the title, author, year, and price elements are all children of the book element:

<book>
  <title>Harry Potter</title>
  <author>J K. Rowling</author>
  <year>2005</year>
  <price>29.99</price>
</book>

Siblings Nodes that have the same parent. In the following example, the title, author, year, and price elements are all siblings:

<book>
  <title>Harry Potter</title>
  <author>J K. Rowling</author>
  <year>2005</year>
  <price>29.99</price>
</book>

Ancestors A node's parent, parent's parent, etc. In the following example, the ancestors of the title element are the book element and the bookstore element:

<bookstore>
<book>
  <title>Harry Potter</title>
  <author>J K. Rowling</author>
  <year>2005</year>
  <price>29.99</price>
</book>
</bookstore>

Descendants A node's children, children's children, etc. In the following example, descendants of the bookstore element are the book, title, author, year, and price elements:

<bookstore>
<book>
  <title>Harry Potter</title>
  <author>J K. Rowling</author>
  <year>2005</year>
  <price>29.99</price>
</book>
</bookstore>

XQuery is case-sensitive and XQuery elements, attributes, and variables must be valid XML names.

XQuery Basic Syntax Rules Some basic syntax rules:

• XQuery is case-sensitive
• XQuery elements, attributes, and variables must be valid XML names
• An XQuery string value can be in single or double quotes
• An XQuery variable is defined with a $ followed by a name, e.g. $bookstore
• XQuery comments are delimited by (: and :), e.g. (: XQuery Comment :)

XQuery Conditional Expressions "If-Then-Else" expressions are allowed in XQuery. Look at the following example:

for $x in doc("books.xml")/bookstore/book
return if ($x/@category="CHILDREN")
  then <child>{data($x/title)}</child>
  else <adult>{data($x/title)}</adult>

Notes on the "if-then-else" syntax: parentheses around the if expression are required. else is required, but it can be just else ().

The result of the example above will be (in document order):

<adult>Everyday Italian</adult>
<child>Harry Potter</child>
<adult>XQuery Kick Start</adult>
<adult>Learning XML</adult>
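The branching logic can be sketched in Python as a conditional expression inside a loop. `BOOKS_XML` is a trimmed stand-in for "books.xml"; the category values other than "CHILDREN", the output element names child and adult, and the function name are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for books.xml: only category and title are kept.
BOOKS_XML = """\
<bookstore>
  <book category="COOKING"><title lang="en">Everyday Italian</title></book>
  <book category="CHILDREN"><title lang="en">Harry Potter</title></book>
  <book category="WEB"><title lang="en">XQuery Kick Start</title></book>
  <book category="WEB"><title lang="en">Learning XML</title></book>
</bookstore>
"""


def classify_books(source: str) -> list[str]:
    """Wrap each title in <child> or <adult>, depending on the book category."""
    result = []
    for book in ET.fromstring(source).findall("book"):
        # if ($x/@category="CHILDREN") then child else adult
        tag = "child" if book.get("category") == "CHILDREN" else "adult"
        result.append(f"<{tag}>{book.findtext('title')}</{tag}>")
    return result


print("\n".join(classify_books(BOOKS_XML)))
```

Like the XQuery if-then-else, the Python conditional expression always needs its else branch.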

XQuery Comparisons In XQuery there are two ways of comparing values.

1. General comparisons: =, !=, <, <=, >, >=
2. Value comparisons: eq, ne, lt, le, gt, ge

The difference between the two comparison methods is shown below. Look at the following XQuery expressions:

$bookstore//book/@q > 10

The expression above returns true if any q attributes have values greater than 10.

$bookstore//book/@q gt 10

The expression above returns true if there is only one q attribute returned by the expression, and its value is greater than 10. If more than one q is returned, an error occurs.
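The difference can be modelled in a few lines of Python. This is a sketch of the semantics only, not of any XQuery engine: the function names are hypothetical, the q values are plain numbers, and an empty sequence is represented by returning None from the value comparison.

```python
from typing import Optional, Sequence


def general_gt(values: Sequence[float], threshold: float) -> bool:
    """Model of '>': true if ANY value in the sequence exceeds the threshold."""
    return any(value > threshold for value in values)


def value_gt(values: Sequence[float], threshold: float) -> Optional[bool]:
    """Model of 'gt': compares a single value; more than one value is an error."""
    if len(values) > 1:
        raise TypeError("value comparison requires at most one item")
    if not values:
        return None  # assumed stand-in for an empty result
    return values[0] > threshold


print(general_gt([5, 12], 10))   # True: at least one value is greater than 10
print(value_gt([12], 10))        # True: exactly one value, and it is greater than 10
try:
    value_gt([5, 12], 10)
except TypeError as error:
    print("error:", error)       # two values were supplied to a value comparison
```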
