Introduction to XML Version 1.0
Introduction
•Extensible Markup Language (XML) is a language that represents data using a set of tags.
•Unlike HTML, which supports only a limited set of tags, XML lets us define our own tags.
•XML is used for most of the configuration files in a J2EE application.
•Example: John Doe
TCS Internal
September 3, 2009
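As a minimal sketch of defining our own tags, the snippet below builds a small document with Python's standard `xml.etree.ElementTree` module. The tag names `employee`, `name` and `department` are invented for illustration and do not come from the slides.

```python
import xml.etree.ElementTree as ET

# Tag names are hypothetical; XML lets the author choose them freely.
employee = ET.Element("employee")
ET.SubElement(employee, "name").text = "John Doe"
ET.SubElement(employee, "department").text = "Engineering"

# Serialize the element tree to a text string.
xml_text = ET.tostring(employee, encoding="unicode")
print(xml_text)
```

Reading the string back with `ET.fromstring(xml_text)` gives an element whose children carry the same tag names and text.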
Rules for well-formed XML documents
•Every start-tag must have a matching end-tag, or the element must be a self-closing tag.
  – Examples:
    •Start and end tag: John
    •Self-closing tag: <document-end />
•Tags can't overlap; elements must be properly nested.
•An XML document can have only one root element.
•Element names must obey XML naming conventions.
•XML is case sensitive.
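The rules above can be checked with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`, which raises `ParseError` on input that is not well formed. The helper `is_well_formed` and the sample strings (including the `name` tag) are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True

# Hypothetical samples, one per rule.
samples = {
    "matching end tag": "<name>John</name>",
    "self-closing tag": "<document-end />",
    "missing end tag": "<name>John",
    "overlapping tags": "<b><i>text</b></i>",
    "two root elements": "<a/><b/>",
    "case mismatch": "<Name>John</name>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'well formed' if ok else 'rejected'}")
```

Only the first two samples are accepted; each of the others breaks exactly one rule.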
Attributes in Tags
•An XML tag can also include attributes.
•Attributes are simple name/value pairs associated with a tag.
•Attributes are used to give additional information about a tag.
•Example: Manmohan Singh
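A short sketch of reading attributes as name/value pairs, again with `xml.etree.ElementTree`. The element `person` and its attributes `role` and `id` are hypothetical names chosen for this illustration.

```python
import xml.etree.ElementTree as ET

# Attribute values must be quoted; the names here are hypothetical.
person = ET.fromstring('<person role="manager" id="42">John Doe</person>')

attributes = person.attrib          # all name/value pairs as a dict
role = person.get("role")           # one attribute by name
missing = person.get("location")    # None when the attribute is absent

print(attributes, role, missing)
```

Attribute values are always strings, so `id` comes back as `"42"` and must be converted if a number is needed.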
XML Declaration
•The typical XML declaration looks like:
•The XML declaration labels a document as XML and gives parsers some additional information, such as the XML version and the character encoding.
•XML parsers are programs that parse an XML document and extract information from it.
•JAXP (Java API for XML Processing), which is part of J2EE, is used for XML parsing.
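The sketch below shows one common form of the declaration and how a parser exposes it, using Python's standard `xml.dom.minidom` rather than JAXP. The declaration shown is an assumed example, not the one from the original slide, and the `note` element is hypothetical.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# An assumed, commonly seen declaration; it must be the very first thing in the document.
source = b'<?xml version="1.0" encoding="UTF-8"?><note>hello</note>'
doc = minidom.parseString(source)
version = doc.version
encoding = doc.encoding
print(version, encoding)

# Anything before the declaration, even a single space, makes the parser reject the document.
try:
    minidom.parseString(b' <?xml version="1.0"?><note/>')
    misplaced_accepted = True
except ExpatError:
    misplaced_accepted = False
print(misplaced_accepted)
```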
Special Characters
•Characters that form part of XML markup, such as '<' and '&', cannot be used directly as data; '>' is normally escaped as well.
•Within a document these characters are represented by entity references:
  – &amp; the & character
  – &lt; the < character
  – &gt; the > character
  – &apos; the ' character
  – &quot; the " character
•Example:
  – To represent tom & jerry we use: tom &amp; jerry
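Escaping is usually left to a library rather than done by hand. A minimal sketch with Python's standard `xml.sax.saxutils.escape`, which replaces `&`, `<` and `>` by default and takes an optional mapping for the quote characters. The `pair` element is a hypothetical wrapper.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# '&', '<' and '>' are escaped by default.
escaped = escape("tom & jerry")
comparison = escape("a < b > c")

# Quote characters are escaped only when a mapping is supplied.
quoted = escape("it's \"ok\"", {"'": "&apos;", '"': "&quot;"})

# A parser turns the entity references back into the original characters.
round_trip = ET.fromstring(f"<pair>{escaped}</pair>").text

print(escaped, comparison, quoted, round_trip, sep="\n")
```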
Special Characters
•If special characters occur frequently, the text can be enclosed in a CDATA section, written as <![CDATA[ ... ]]>.
•Example: <script language="JavaScript">
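A sketch of a CDATA section in use. The script body is invented for illustration; only the `script` start-tag comes from the slide. The parser hands back the text inside the section unchanged, so `<`, `>` and `&` need no escaping there.

```python
import xml.etree.ElementTree as ET

# The JavaScript body is a hypothetical example containing '<', '>' and '&'.
source = (
    '<script language="JavaScript">'
    "<![CDATA[if (a < b && b > 0) { swap(a, b); }]]>"
    "</script>"
)

script = ET.fromstring(source)
body = script.text
print(body)
```

The one sequence that cannot appear inside a CDATA section is `]]>`, because it marks the end of the section.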
XML Namespaces
•With the namespace mechanism, more than one person can use the same tag name without conflict.
•In the following example, the tags person and name are used by two different XML documents.
•These tags are differentiated by the namespace prefixes mypers and yourpers.
•Example: <mypers:person> <mypers:name>
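A sketch of how a parser tells the two `person` tags apart. The prefixes `mypers` and `yourpers` come from the slide; the namespace URIs, the `people` root element and the names in the data are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# The URIs below are placeholders; each prefix must be bound to some URI.
source = """
<people xmlns:mypers="http://example.com/mypers"
        xmlns:yourpers="http://example.com/yourpers">
  <mypers:person><mypers:name>John Doe</mypers:name></mypers:person>
  <yourpers:person><yourpers:name>Jane Roe</yourpers:name></yourpers:person>
</people>
"""

root = ET.fromstring(source)
prefixes = {
    "mypers": "http://example.com/mypers",
    "yourpers": "http://example.com/yourpers",
}

# The same local name 'person' resolves to two different qualified names.
my_name = root.find("mypers:person/mypers:name", prefixes).text
your_name = root.find("yourpers:person/yourpers:name", prefixes).text
qualified_tags = [child.tag for child in root]

print(my_name, your_name, qualified_tags)
```

The parser identifies each element by its namespace URI plus local name, so the prefix itself is only shorthand.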
Document Type Definitions (DTD)
•A document type definition lets the developer create a set of rules that specify the legal contents of an XML file and place restrictions on it.
•If the XML document does not follow the rules, a validating XML parser generates errors.
•An XML document that conforms to its DTD is said to be a valid XML document.
DTD Advantages
•A single DTD ensures a common format for every XML document that references it.
•An application can use a DTD to validate the data (XML documents) it receives from outside.
•A DTD helps XML data interoperate between various applications.
Anatomy of DTD
•A DTD contains:
  – Element declarations
    •Used to define tags.
  – Attribute declarations
    •Used to define the attributes of a tag.
  – Notation declarations
    •Used to associate external resources with the document.
  – Entity declarations
    •Used to represent replacement text.
Element Declarations
•An element declaration consists of three parts:
  – The ELEMENT declaration
  – The element name
  – The element content model
•Example:
•The content may be:
  – Empty
  – Element
  – Mixed
  – Any
Examples of Element Declarations
Example 1: Elements with an empty declaration
Declaration:
Usage:
Example 2: Elements with data
Declaration: <Month>This is a month <Month> <January>Jan <March>March
Examples of Element Declarations
•Example 3: Elements with children
To specify that an element must have a single child element, put the child element's name in parentheses.
1345 Preston Ave Charlottesville Va 22903
An element can have multiple children. A DTD describes multiple children using a sequence: a list of element names separated by commas. The XML file must contain one of each element, in the specified order.
John Doe <street>1234 Preston Ave. Charlottesville, Va 22903
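The declarations described above might look like the internal DTD in the sketch below, which covers an empty element, elements with data and a sequence of children. All element names (`contact`, `verified` and so on) are assumptions, since the original declarations are not reproduced here. Python's standard library parser does not validate against a DTD, so the code only parses the document and prints the child order that a validating parser would enforce.

```python
import xml.etree.ElementTree as ET

# Hypothetical DTD: EMPTY content, #PCDATA content, and comma-separated sequences.
source = """<!DOCTYPE contact [
  <!ELEMENT contact (name, address, verified)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT address (street, city, state, zip)>
  <!ELEMENT street (#PCDATA)>
  <!ELEMENT city (#PCDATA)>
  <!ELEMENT state (#PCDATA)>
  <!ELEMENT zip (#PCDATA)>
  <!ELEMENT verified EMPTY>
]>
<contact>
  <name>John Doe</name>
  <address>
    <street>1234 Preston Ave.</street>
    <city>Charlottesville</city>
    <state>Va</state>
    <zip>22903</zip>
  </address>
  <verified/>
</contact>"""

contact = ET.fromstring(source)

# A sequence requires exactly this order of children.
contact_children = [child.tag for child in contact]
address_children = [child.tag for child in contact.find("address")]

print(contact_children)
print(address_children)
```

Enforcing the DTD would need a validating parser, such as the one JAXP provides or a third-party Python library.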
Attribute Declarations
•Used to declare the list of allowable attributes for a given element.
•Example: the declaration names the element, the attribute name, and the attribute type.
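A sketch of an attribute declaration and its three labeled parts, using Python's standard `xml.parsers.expat`. The element `person`, the attribute `title` and its default value are hypothetical. The `AttlistDeclHandler` callback receives the element name, attribute name and attribute type for each declaration, and expat fills in the declared default when the attribute is omitted from the document.

```python
import xml.parsers.expat

# Hypothetical declaration: element 'person', attribute 'title', type CDATA, default "Mr."
source = """<!DOCTYPE person [
  <!ELEMENT person (#PCDATA)>
  <!ATTLIST person title CDATA "Mr.">
]>
<person>John Doe</person>"""

declarations = []   # (element, attribute name, attribute type, default)
elements = {}       # element name -> attributes reported by the parser

def on_attlist(elname, attname, att_type, default, required):
    declarations.append((elname, attname, att_type, default))

def on_start(name, attrs):
    elements[name] = attrs

parser = xml.parsers.expat.ParserCreate()
parser.AttlistDeclHandler = on_attlist
parser.StartElementHandler = on_start
parser.Parse(source, True)

print(declarations)
print(elements)
```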
Notation Declarations
•Used to associate external (typically non-XML) resources with the document.
•Example:
Entity Declarations
•Entities are used to refer to sections of replacement text, other XML markup, and even external files.
•Example:
  – <!ENTITY asap "as soon as possible">
•Whenever the parser finds the reference &asap; in the document, it automatically replaces it with 'as soon as possible'.
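The replacement can be observed with a small sketch using `xml.etree.ElementTree`, which expands entities declared in the document's internal DTD. The `memo` element and the surrounding sentence are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# The entity 'asap' is declared in the internal DTD and referenced as &asap; in the content.
source = """<!DOCTYPE memo [
  <!ENTITY asap "as soon as possible">
]>
<memo>Please reply &asap;.</memo>"""

memo = ET.fromstring(source)
expanded = memo.text
print(expanded)
```

Because entity expansion can be abused with deeply nested declarations, it is safer not to parse documents from untrusted sources this way.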
Document Object Model (DOM)
•DOM is an interface that programmers use to create XML documents, navigate through them, and add, modify or delete parts of them.
•DOM provides a logical view of the in-memory structure that represents an XML document as a hierarchy of nodes.
•The node is the primary object; it has a set of properties and methods that programmers use to manipulate a node in the XML document.
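The operations listed above (create, navigate, add, modify, delete) can be sketched with Python's standard `xml.dom.minidom`, one implementation of the DOM interface. The `team` and `member` elements and the `role` attribute are hypothetical.

```python
from xml.dom import minidom

# Parse a small document into an in-memory tree of nodes.
doc = minidom.parseString("<team><member>John Doe</member></team>")
root = doc.documentElement

# Add: create an element node with a text node child and attach it.
new_member = doc.createElement("member")
new_member.appendChild(doc.createTextNode("Jane Roe"))
root.appendChild(new_member)

# Modify: change the text and set an attribute on the first member.
first = root.getElementsByTagName("member")[0]
first.firstChild.data = "John Q. Doe"
first.setAttribute("role", "lead")

# Navigate: walk the member nodes and read their text.
names = [m.firstChild.data for m in root.getElementsByTagName("member")]

# Delete: remove the node that was added above.
root.removeChild(new_member)
result = root.toxml()

print(names)
print(result)
```

Every item in the tree (element, text, attribute) is a node, which is why the same small set of methods covers all four kinds of change.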
Reference:
•Beginning XML, 3rd Edition, David Hunter et al., Wrox Publication, 2005.