Session04 Xml Validation Schema

XML Schema

Neeraj Singh
October 2009

© 2008 MindTree Consulting

Agenda XML Validation Introduction to XML Schema Examples / Demo

Slide 2

XML Validation


An Introduction to XML Validation

One of the important innovations of XML is the ability to place preconditions on the data the programs read, and to do this in a simple declarative way.

XML allows you to say that every Order element must contain exactly one Customer element, that each Customer element must have an id attribute that contains an XML name token,

that every ShipTo element must contain one or more Streets, one City, one State, and one Zip, and so forth.

Checking an XML document against this list of conditions is called validation.

Validation is an optional step but an important one. Slide 4
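As a rough illustration, the conditions listed above could be written in W3C XML Schema along the following lines. The element and attribute names come from the text; placing ShipTo inside Order, giving Customer text content, and using xs:string for the leaf elements are assumptions made only for this sketch.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="Order">
    <xs:complexType>
      <xs:sequence>
        <!-- exactly one Customer (minOccurs and maxOccurs default to 1) -->
        <xs:element name="Customer">
          <xs:complexType>
            <xs:simpleContent>
              <xs:extension base="xs:string">
                <!-- id must be present and must be an XML name token -->
                <xs:attribute name="id" type="xs:NMTOKEN" use="required"/>
              </xs:extension>
            </xs:simpleContent>
          </xs:complexType>
        </xs:element>
        <!-- assumption: ShipTo is a child of Order -->
        <xs:element name="ShipTo">
          <xs:complexType>
            <xs:sequence>
              <!-- one or more Street elements, then one each of City, State, Zip -->
              <xs:element name="Street" type="xs:string" maxOccurs="unbounded"/>
              <xs:element name="City" type="xs:string"/>
              <xs:element name="State" type="xs:string"/>
              <xs:element name="Zip" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

A validating parser given this schema would reject, for example, an Order with two Customer elements or a ShipTo without a City.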

Validation There are many reasons and opportunities to validate an XML document: When we receive one, before importing data into a legacy system When we receive one, before importing data into a legacy system, when we have produced or hand-edited one

To test the output of an application, etc.

Validation as “firewall” to serve as actual firewalls when we receive documents from the external world (as is commonly the case with Web Services and other XML communications),

to provide check points when we design processes as pipelines of transformations.

Validation can take place at several levels. Structural validation Data validation

Slide 5

Schema Languages There is more than one language in which you can express such validation conditions. Generically, these are called schema languages, and the documents that list the constraints are called schemas.

Different schema languages have different strengths and weaknesses.

The document type definition (DTD) is the only schema language built into most XML parsers and endorsed as a standard part of XML.

The W3C XML Schema Language (schemas for short, though it’s hardly the only schema language) addresses several limitations of DTDs.

Many other schema languages have been invented that can easily be integrated with your systems.

Slide 6

XML Schema


XML Schema Introduction

W3C XML Schema (Schema) is an XML-based technology that is considered a replacement for DTDs. Just like DTDs, schemas are used for defining the constraints of an XML document. But unlike DTDs, they provide strong data typing and support for namespaces -- and since they are based on XML, they are also extensible.

Advantages of XML Schema over DTD

• Schemas are written in XML instance document syntax, using tags, elements, and attributes.
• Schemas are fully namespace aware.
• Schemas can assign data types like integer and date to elements, and validate documents based not only on the element structure but also on the contents of the elements.
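To make the data typing point concrete, here is a small sketch. The Invoice, Quantity, and Issued names are hypothetical and chosen only for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Invoice">
    <xs:complexType>
      <xs:sequence>
        <!-- content must be a valid integer, e.g. 12 -->
        <xs:element name="Quantity" type="xs:integer"/>
        <!-- content must be a valid date in YYYY-MM-DD form, e.g. 2009-10-15 -->
        <xs:element name="Issued" type="xs:date"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

With this schema, a Quantity element containing the text "twelve" fails validation. A DTD could only declare both elements as character data and would accept it.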

Slide 8

Schema definition

A schema is defined in a separate file and generally stored with the .xsd extension.

Every schema definition has a schema root element that belongs to the http://www.w3.org/2001/XMLSchema namespace. The schema element can also contain optional attributes.

For example: The following example indicates that the elements used in the schema come from the http://www.w3.org/2001/XMLSchema namespace. <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

Slide 9

Schema Linking when document root element is from null namespace

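Let's start with our first document. It must have a single "root" element, and this element can contain only text. The element is from the null namespace (it belongs to no namespace). Valid document:

<root>aaa</root>

If you want to validate this document with XML Schema, you have to associate a schema document with it. If the root element is from the null namespace, you use the "noNamespaceSchemaLocation" attribute, which belongs to the http://www.w3.org/2001/XMLSchema-instance namespace. Its value is the location of the schema file (elided here as "..."):

<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="...">test</root>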
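The schema file that the attribute points to is not shown on the slide. A minimal sketch of what it could contain, assuming the text-only "root" element described above and xs:string as its type:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- no targetNamespace attribute: the declared element is in the null namespace -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="root" type="xs:string"/>
</xs:schema>
```

Because the schema has no target namespace, the instance document refers to it with noNamespaceSchemaLocation rather than schemaLocation.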

Slide 10

Schema Linking when document root element from some particular namespace

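Now, let's take the same document as in the previous example, but the "root" element must be from a concrete namespace, let's say "http://foo". Valid document:

<root xmlns="http://foo">aaa</root>

If the root element is from a particular namespace, you associate the schema using the "schemaLocation" attribute, which also belongs to the http://www.w3.org/2001/XMLSchema-instance namespace. Its value is a whitespace-separated pair: the first part is the target namespace, the second the URL of the schema file (elided here as "..."):

<root xmlns="http://foo" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://foo ...">test</root>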
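For the namespaced case, the schema itself has to declare the same namespace as its target namespace. A minimal sketch, again assuming a text-only "root" element of type xs:string:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- targetNamespace must match the first part of the schemaLocation value -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://foo">
  <!-- a global element declaration belongs to the target namespace -->
  <xs:element name="root" type="xs:string"/>
</xs:schema>
```

If the namespace of the root element and the targetNamespace of the schema differ, the validator finds no declaration for the element and reports the document as invalid.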

Slide 11

Examples / Demo

01_FirstXMLSchema.xsd
Writing your first XML Schema and a valid XML file based on it. This also demonstrates how to link an XML file with an XML schema.

02_FirstNameSpace.xsd
This example demonstrates the use of namespaces: if you have an XML document that belongs to a certain namespace, how to connect it to an XML Schema.

Slide 12

Schema elements

A schema file contains definitions for elements and attributes, as well as data types for elements and attributes. It is also used to define the structure or the content model of an XML document.

Elements in a schema file can be classified as either simple or complex.

Schema elements: Simple type

A simple type element is an element that cannot contain any attributes or child elements; it can only contain text of the data type specified in its declaration. The syntax for defining a simple element is:

<xs:element name="ELEMENT_NAME" type="DATA_TYPE" default/fixed="VALUE" />

where DATA_TYPE is one of the built-in schema data types (or a simple type derived from them), and default/fixed stands for either the default attribute or the fixed attribute, not both.

Slide 13

Schema elements: Simple type Contd…

You can also specify default or fixed values for an element. You do this with either the default or fixed attribute and specify a value for the attribute. Note: Specifying a fixed or default attribute is optional.

An example of a simple type element is: <xs:element name="Author" type="xs:string" default="Whizlabs"/>

All attributes are simple types, so they are defined in the same way that simple elements are defined. For example:

<xs:attribute name="title" type="xs:string" />

Slide 14
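The declarations on the previous slide can be put together in one small schema. In this sketch the Author and title declarations are taken from the slide; the Currency and Book elements are hypothetical additions used to show fixed and to give the attribute a place to live.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- default: an empty Author element is treated as if it contained "Whizlabs" -->
  <xs:element name="Author" type="xs:string" default="Whizlabs"/>

  <!-- fixed (hypothetical element): if the element has content, it must be exactly "INR" -->
  <xs:element name="Currency" type="xs:string" fixed="INR"/>

  <!-- an attribute is declared inside the complex type of the element that carries it -->
  <xs:element name="Book">
    <xs:complexType>
      <xs:attribute name="title" type="xs:string"/>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Note that an element which carries an attribute is no longer a simple element; its type has to be complex, as in the Book declaration.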

Schema data types

All data types in schema inherit from anyType. This includes both simple and complex data types. You can further classify simple types into built-in primitive types and built-in derived types.

Built-in datatype hierarchy

[Figure: the complete built-in datatype hierarchy diagram from the XML Schema Datatypes Recommendation. Legend: ur types, built-in primitive types, built-in derived types, complex types; derived by restriction, derived by list, derived by extension or restriction.]

Slide 15

Schema elements: Complex types  Complex types are elements that either:    

Contain other elements Contain attributes Are empty (empty elements) Contain text

 To define a complex type in a schema, use a complexType element.  You can specify the order of occurrence and the number of times an element can occur (cardinality) by using the order and occurrence indicators, respectively.

 For example: <xs:element name="Book"> <xs:complexType> <xs:sequence> <xs:element name="Name" type="xs:string" /> <xs:element name="Author" type="xs:string" maxOccurs="4"/> <xs:element name="ID" type="xs:string"/> <xs:element name="Price" type="xs:string"/>

 In this example, the order indicator is xs:sequence, and the occurrence indicator is maxOccurs in the Author element name.

Slide 16

Schema elements: Complex types (Mixed content)

W3C XML Schema supports mixed content through the mixed attribute of the xs:complexType element. Consider:

<xs:element name="book">
  <xs:complexType mixed="true">
    <xs:all>
      <xs:element name="title" type="xs:string"/>
      <xs:element name="author" type="xs:string"/>
    </xs:all>
    <xs:attribute name="isbn" type="xs:string"/>
  </xs:complexType>
</xs:element>

It will validate an XML element such as:

<book>Funny book by <author>Charles M. Schulz</author>. Its title (<title>Being a Dog Is a Full-Time Job</title>) says it all!</book>

Slide 17

Examples / Demo

07_ComplexType01.xsd
Your first complex type: an element that can contain a mixture of elements. We want the element "root" to contain the elements "aaa", "bbb", and "ccc" in any order, so we use the "all" element.

11_EmptyElementUsingAnyType.xsd
Empty element. We want the root element to be named "AAA", to be from the null namespace, and to be empty. The empty element is defined as a "complexType" with a "complexContent" that is a restriction of "anyType", but without any elements.
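The demo file itself is not reproduced here. A sketch of the empty-element technique it describes, written from the description above rather than copied from the file, could look like this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="AAA">
    <xs:complexType>
      <xs:complexContent>
        <!-- restricting anyType without listing any elements or attributes
             leaves an empty content model -->
        <xs:restriction base="xs:anyType"/>
      </xs:complexContent>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Against this schema the empty element AAA is valid, while an AAA element that contains text or child elements is not.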

Slide 18

Occurrence indicators

Occurrence indicators specify the number of times an element can occur in an XML document. You specify them with the minOccurs and maxOccurs attributes of the element in the element definition.

As the names suggest, minOccurs specifies the minimum number of times an element can occur in an XML document while maxOccurs specifies the maximum number of times the element can occur.

It is possible to specify that an element might occur any number of times in an XML document. This is determined by setting the maxOccurs value to unbounded.

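The default value for both minOccurs and maxOccurs is 1, which means that by default an element must appear exactly once. (Attributes do not take these indicators; whether an attribute is optional or required is controlled by its use attribute.)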
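The common combinations of minOccurs and maxOccurs are sketched below. The Playlist element and its children are hypothetical names used only to show each case.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Playlist">
    <xs:complexType>
      <xs:sequence>
        <!-- no indicators given: exactly once (minOccurs=1, maxOccurs=1) -->
        <xs:element name="Title" type="xs:string"/>
        <!-- optional: zero or one -->
        <xs:element name="Description" type="xs:string" minOccurs="0"/>
        <!-- one or more -->
        <xs:element name="Track" type="xs:string" maxOccurs="unbounded"/>
        <!-- between zero and three -->
        <xs:element name="Tag" type="xs:string" minOccurs="0" maxOccurs="3"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```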

Slide 19

Order indicators

Order indicators define the order or sequence in which elements can occur in an XML document. Three types of order indicators are:

All: If All is the order indicator, then the defined elements can appear in any order and each occurs only once (or not at all, if its minOccurs is 0). Remember that for the All group itself maxOccurs is always 1 and minOccurs can only be 0 or 1.

Sequence: If Sequence is the order indicator, then the elements must appear in the order specified.

Choice: If Choice is the order indicator, then exactly one of the elements specified must appear in the XML document.

Slide 20

Example: Occurrence and order indicators

<xs:element name="Book">
  <xs:complexType>
    <xs:all>
      <xs:element name="Name" type="xs:string" />
      <xs:element name="ID" type="xs:string"/>
      <xs:element name="Authors" type="authorType"/>
      <xs:element name="Price" type="priceType"/>
    </xs:all>
  </xs:complexType>
</xs:element>

<xs:complexType name="authorType">
  <xs:sequence>
    <xs:element name="Author" type="xs:string" maxOccurs="4"/>
  </xs:sequence>
</xs:complexType>

<xs:complexType name="priceType">
  <xs:choice>
    <xs:element name="dollars" type="xs:double" />
    <xs:element name="pounds" type="xs:double" />
  </xs:choice>
</xs:complexType>

The xs:all indicator specifies that the Book element, if present, must contain exactly one instance of each of the following four elements, in any order: Name, ID, Authors, Price.

The xs:sequence indicator in the authorType declaration specifies that elements of this particular type (Authors element) contain at least one Author element and can contain up to four Author elements.

The xs:choice indicator in the priceType declaration specifies that elements of this particular type (Price element) can contain either a dollars element or a pounds element, but not both.
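An instance document that should satisfy the Book declaration above might look like the following. All element values are made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Book>
  <!-- xs:all: the four children may come in any order, each exactly once -->
  <ID>B-101</ID>
  <Name>Learning XML Schema</Name>
  <!-- xs:choice: either dollars or pounds, not both -->
  <Price>
    <pounds>12.5</pounds>
  </Price>
  <!-- xs:sequence with maxOccurs="4": one to four Author elements -->
  <Authors>
    <Author>First Author</Author>
    <Author>Second Author</Author>
  </Authors>
</Book>
```

Adding a fifth Author, or putting both dollars and pounds inside Price, would make the document invalid.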

Slide 21

Restriction

A main advantage of schema is that you have the ability to control the value of XML attributes and elements.

A restriction, which applies to all of the simple data elements in a schema, allows you to define your own data type according to the requirements by modifying the facets available for a particular simple type.

To achieve this, use the restriction element defined in the schema namespace.

W3C XML Schema defines 12 facets for simple data types: enumeration, maxExclusive, minExclusive, maxInclusive, minInclusive, maxLength, minLength, pattern, length, whiteSpace, fractionDigits, and totalDigits.

Slide 22

Example - To restrict the length of the text node

The following example restricts the length of the text node to 255 characters. The base type tokenWithLangAndNote is defined elsewhere in the schema.

<xs:element name="title">
  <xs:complexType>
    <xs:simpleContent>
      <xs:restriction base="tokenWithLangAndNote">
        <xs:maxLength value="255"/>
        <xs:attribute name="lang" type="xs:language"/>
        <xs:attribute name="note" type="xs:token"/>
      </xs:restriction>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>

Slide 23

Example - Remove an attribute from the element

To remove the note attribute from the element title, we declare note to be prohibited in the list of attributes in the restriction:

<xs:element name="title">
  <xs:complexType>
    <xs:simpleContent>
      <xs:restriction base="tokenWithLangAndNote">
        <xs:maxLength value="255"/>
        <xs:attribute name="lang" type="xs:language"/>
        <xs:attribute name="note" use="prohibited"/>
      </xs:restriction>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>
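Both restriction examples derive from a base type named tokenWithLangAndNote, which the slides do not show. A plausible definition, assumed here from the name and from the attributes used in the restrictions, is a complex type with simple content: a token value plus optional lang and note attributes.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- assumed definition of the base type; not taken from the slides -->
  <xs:complexType name="tokenWithLangAndNote">
    <xs:simpleContent>
      <!-- text content is an xs:token; extension adds the two attributes -->
      <xs:extension base="xs:token">
        <xs:attribute name="lang" type="xs:language"/>
        <xs:attribute name="note" type="xs:token"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:schema>
```

With a base like this, a restriction can tighten the text content (maxLength) and narrow the attribute set (use="prohibited"), but it cannot add anything new.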

Slide 24

Facets

enumeration - Value of the data type is constrained to a specific set of values.

<xs:simpleType name="Subjects">
  <xs:restriction base="xs:string">
    <xs:enumeration value="History"/>
    <xs:enumeration value="Geology"/>
    <xs:enumeration value="Biology"/>
  </xs:restriction>
</xs:simpleType>

maxExclusive - Numeric value of the data type is less than the value specified.

minExclusive - Numeric value of the data type is greater than the value specified.

<xs:simpleType name="id">
  <xs:restriction base="xs:integer">
    <xs:maxExclusive value="101"/>
    <xs:minExclusive value="1"/>
  </xs:restriction>
</xs:simpleType>

Slide 25

Facets Contd…

maxInclusive - Numeric value of the data type is less than or equal to the value specified.

minInclusive - Numeric value of the data type is greater than or equal to the value specified.

<xs:simpleType name="id">
  <xs:restriction base="xs:integer">
    <xs:minInclusive value="0"/>
    <xs:maxInclusive value="100"/>
  </xs:restriction>
</xs:simpleType>

maxLength - Specifies the maximum number of characters or list items allowed in the value.

minLength - Specifies the minimum number of characters or list items allowed in the value.

pattern - Value of the data type is constrained to a specific sequence of characters, expressed using a regular expression.

<xs:simpleType name="nameFormat">
  <xs:restriction base="xs:string">
    <xs:minLength value="3"/>
    <xs:maxLength value="10"/>
    <xs:pattern value="[a-z][A-Z]*"/>
  </xs:restriction>
</xs:simpleType>

Slide 26

Facets Contd… length - Specifies the exact number of characters or list items allowed in the value. <xs:simpleType name="secretCode"> <xs:restriction base="xs:string"> <xs:length value="5"/>

whiteSpace - Specifies the method for handling white space. Allowed values for the value attribute are preserve, replace, and collapse. <xs:simpleType name="FirstName"> <xs:restriction base="xs:string"> <xs:whiteSpace value="preserve"/>

fractionDigits - Constrains the maximum number of decimal places allowed in the value.

totalDigits - The number of digits allowed in the value. <xs:simpleType name="reducedPrice"> <xs:restriction base="xs:float"> <xs:totalDigits value="4"/> <xs:fractionDigits value="2"/>



Slide 27

Multiple Restriction using 'Union'

A union is applied to the two embedded simple types to allow values from both data types: the new data type accepts either a string of exactly 10 digits or one of the two enumerated values TBD and NA.

<xs:simpleType name="isbnType">
  <xs:union>
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:pattern value="[0-9]{10}"/>
      </xs:restriction>
    </xs:simpleType>
    <xs:simpleType>
      <xs:restriction base="xs:NMTOKEN">
        <xs:enumeration value="TBD"/>
        <xs:enumeration value="NA"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:union>
</xs:simpleType>

Slide 28

Examples / Demo

03_RestrictSimpleType01.xsd
This example restricts a simple type. Here we require the value of the element "root" to be an integer less than 25.

04_RestrictUsingUnion01.xsd
We want the value of the element "root" to be in the range 0-100 or 300-400 (including the border values). We make a union of the two intervals.

06_RestrictUnionEnum02.xsd
Element can contain a string from an enumerated set. Now, we want the element "root" to have the value "N/A" or "#REF!".

14_RestrictionOfSequence.xsd
The schema declares the type "AAA", which can contain up to two sequences of "x" and "y" elements. Then we declare the type "BBB", which is a restriction of the type "AAA" and contains only one x-y sequence.

Slide 29
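The union-of-intervals idea from 04_RestrictUsingUnion01.xsd can be sketched as follows. This is written from the description above, not copied from the demo file, and xs:integer as the base type is an assumption.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="root">
    <xs:simpleType>
      <xs:union>
        <!-- first interval: 0 to 100, border values included -->
        <xs:simpleType>
          <xs:restriction base="xs:integer">
            <xs:minInclusive value="0"/>
            <xs:maxInclusive value="100"/>
          </xs:restriction>
        </xs:simpleType>
        <!-- second interval: 300 to 400, border values included -->
        <xs:simpleType>
          <xs:restriction base="xs:integer">
            <xs:minInclusive value="300"/>
            <xs:maxInclusive value="400"/>
          </xs:restriction>
        </xs:simpleType>
      </xs:union>
    </xs:simpleType>
  </xs:element>
</xs:schema>
```

A value is valid if it matches at least one member type of the union, so 100 and 300 are accepted while 200 is rejected.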

Extension The extension element defines complex types that might derive from other complex or simple types.

If the base type is a simple type, then the complex type can only add attributes. If the base type is a complex type, then it is possible to add attributes and elements.

To derive from a complex type, you have to use the complexContent element in conjunction with the base attribute of the extension element.

Extensions are particularly useful when you need to reuse complex element definitions in other complex element definitions.

For example, it is possible to define a Name element that contains two child elements (First and Last) and then reuse it in other complex element definitions.
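The simple-type case, where extension can only add attributes, uses simpleContent instead of complexContent. A minimal sketch, with the hypothetical names PriceWithCurrency, Price, and currency:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="PriceWithCurrency">
    <xs:simpleContent>
      <!-- base is a simple type, so the extension may only add attributes -->
      <xs:extension base="xs:decimal">
        <xs:attribute name="currency" type="xs:string" use="required"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>

  <xs:element name="Price" type="PriceWithCurrency"/>
</xs:schema>
```

A Price element with a currency attribute and the text 19.99 would be valid here; the text content is still checked as a decimal.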

Slide 30

An example of extensions

<xs:complexType name="Name">
  <xs:sequence>
    <xs:element name="First"/>
    <xs:element name="Last"/>
  </xs:sequence>
</xs:complexType>

<xs:complexType name="Student">
  <xs:complexContent>
    <xs:extension base="Name">
      <xs:sequence>
        <xs:element name="school" type="xs:string"/>
        <xs:element name="year" type="xs:string"/>
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>

<xs:complexType name="Customer">
  <xs:complexContent>
    <xs:extension base="Name">
      <xs:sequence>
        <xs:element name="phone" type="xs:string"/>
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>

Slide 31

Examples / Demo

12_ExtensionOfSequence.xsd
Extension of a sequence. When we extend a complexType that contains a sequence A with a sequence B, sequence B is appended to sequence A.
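The appending behaviour can be seen in an instance of the Student type from the extension example. The sketch below assumes a hypothetical element named student declared with type="Student"; the values are made up.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- assumes a hypothetical declaration: xs:element name="student" type="Student" -->
<student>
  <!-- inherited from the base type Name (sequence A) -->
  <First>Asha</First>
  <Last>Rao</Last>
  <!-- added by the extension (sequence B), always after the inherited elements -->
  <school>SICSR</school>
  <year>2009</year>
</student>
```

Placing school before First would be invalid, because the effective content model is sequence A followed by sequence B.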

Slide 32

Groups W3C XML Schema also allows the definition of groups W3C XML Schema also allows the of elements and attributes.

 These groups are not datatypes but containers holding a

definition of groups of elements and attributes.

set of elements or attributes that can be used to describe <xs:complexType name="bookType"> complex types.

<xs:group name="mainBookElements"> <xs:sequence> <xs:element name="title" type="nameType"/> <xs:element name="author" type="nameType"/>

<xs:sequence> <xs:group ref="mainBookElements"/> <xs:element name="character" type="characterType" minOccurs="0" maxOccurs="unbounded"/>







<xs:attributeGroup ref="bookAttributes"/>

<xs:attributeGroup name="bookAttributes"> <xs:attribute name="isbn" type="isbnType" use="required"/>



<xs:attribute name="available" type="xs:string"/> Slide 33

Examples / Demo

08_AttributeGroup01.xsd
Defining a group of attributes. Let's say we want to define a group of common attributes that will be reused. The root element is named "root"; it must contain the "aaa" and "bbb" elements, and these elements must have the attributes "x" and "y".

12_SequenceChoiceGroup.xsd
Element that contains two "patterns" (sequences), in any order. We want the root element to be named "AAA", to be from the null namespace, and to contain two patterns in any order. The first pattern is a sequence of "BBB" and "CCC" elements, the second one is a sequence of "XXX" and "YYY" elements. The element "choice" allows one of two cases: either the sequence "myFirstSequence"-"mySecondSequence" or "mySecondSequence"-"myFirstSequence".
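A sketch of the attribute-group reuse described for 08_AttributeGroup01.xsd follows. It is written from the description, not copied from the demo file; the group name commonAttributes and the xs:integer attribute types are assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- the reusable group: both attributes are mandatory wherever it is referenced -->
  <xs:attributeGroup name="commonAttributes">
    <xs:attribute name="x" type="xs:integer" use="required"/>
    <xs:attribute name="y" type="xs:integer" use="required"/>
  </xs:attributeGroup>

  <xs:element name="root">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="aaa">
          <xs:complexType>
            <xs:attributeGroup ref="commonAttributes"/>
          </xs:complexType>
        </xs:element>
        <xs:element name="bbb">
          <xs:complexType>
            <xs:attributeGroup ref="commonAttributes"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Changing the group in one place changes the attributes of both aaa and bbb, which is the main reason to use a group instead of repeating the declarations.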

Slide 34

List Datatypes The definition of a list datatype can List datatypes are special cases in which a structure is defined within the content of a single attribute or element.

IDREFS, ENTITIES, and NMTOKENS are predefined list datatypes

As we have seen with these three datatypes, all the list datatypes that can be defined must be whitespaceseparated. No other separator is accepted.

The definition of a list datatype by reference to an existing type is done through a itemType attribute: <xs:simpleType name="integerList"> <xs:list itemType="xs:integer"/>

also be done by embedding a xs:simpleType element: <xs:simpleType name="myIntegerList"> <xs:list> <xs:simpleType> <xs:restriction base="xs:integer"> <xs:maxInclusive value="100"/>

This datatype can be used to define attributes or elements that accept a whitespace-separated list of integers smaller than or equal to 100 such as: "1 -25000 100." Slide 35

Examples / Demo

09_ListDataType01.xsd
Attribute contains a list of values. Now, we want the "root" element to have an attribute "xyz" that contains a list of three integers. We define a general list (element "list") of integers and then restrict it (element "restriction") to an exact length (element "length") of three items.

10_ListDataType02.xsd
Element contains a list of values. Now, we want the "root" element to contain a list of three integers. We define a general list (element "list") of integers and then restrict it (element "restriction") to an exact length (element "length") of three items.
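The two-step approach described for the list demos (define a list, then restrict its length) can be sketched like this for the element case. The type names integerList and threeIntegers are assumptions; the demo files may name them differently.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- step 1: a general whitespace-separated list of integers -->
  <xs:simpleType name="integerList">
    <xs:list itemType="xs:integer"/>
  </xs:simpleType>

  <!-- step 2: restrict the list; for list types, length counts items, not characters -->
  <xs:simpleType name="threeIntegers">
    <xs:restriction base="integerList">
      <xs:length value="3"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:element name="root" type="threeIntegers"/>

</xs:schema>
```

A root element containing "1 2 3" is valid, while "1 2" or "1 2 3 4" is not. For the attribute case, the same threeIntegers type would be used as the type of the "xyz" attribute.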

Slide 36

Examples / Demo

More Examples


Examples / Demo

15_CustomSimpleType.xsd
Definition of a custom simpleType: temperature must be greater than -273.15. The element "T" must contain a number greater than -273.15. We define a custom type for temperature named "Temperature" and require the element "T" to be of that type.

16_PatternElement.xsd
String must contain an e-mail address. The element "A" must contain an e-mail address. We define a custom type that checks the validity of the address at least approximately, using the "pattern" element to restrict the string with a regular expression.
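Both custom types can be sketched in one schema. This is an illustration of the two descriptions above, not the content of the demo files: xs:decimal as the base of Temperature, the type name Email, and the particular regular expression are all assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- any number strictly greater than -273.15 -->
  <xs:simpleType name="Temperature">
    <xs:restriction base="xs:decimal">
      <xs:minExclusive value="-273.15"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:element name="T" type="Temperature"/>

  <!-- a rough e-mail check: something, an @ sign, something, a dot, something;
       schema patterns always match the whole value -->
  <xs:simpleType name="Email">
    <xs:restriction base="xs:string">
      <xs:pattern value="[^@\s]+@[^@\s]+\.[^@\s]+"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:element name="A" type="Email"/>

</xs:schema>
```

The pattern is deliberately loose. It rejects values with spaces or without an @ sign, but it does not try to implement the full e-mail address grammar.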

Slide 38

Summary

W3C XML Schema has become the de facto standard for defining the structure of an XML document and for checking the validity of XML documents. Using schema, it is possible to define:

Elements (simple and complex) Attributes Facets for XML elements The structure of a document (order indicators) The allowable number of elements (occurrence indicators) in an XML document

Slide 39

References

ibm.com/developerWorks IBM XML certification success, Part 1:

W3schools.com www.Xml.com XML Schema by OReilly http://www.zvon.org/xxl/XMLSchemaTutorial Examples used in the presentation are attached here XML-Schema-Project.zip

Slide 40

Questions

Slide 41

Thank you

XML Technology, Semester 4 SICSR Executive MBA(IT) @ MindTree, Bangalore, India

By Neeraj Singh (toneeraj(AT)gmail(DOT)com)

Slide 42
