Ensuring Consistency of Data in XML Documents
Objectives
In this lesson, you will learn to:
☛ Declare elements and attributes in a Document Type Definition (DTD)
☛ Create an XML Schema
©NIIT
eXtensible Markup Language/Lesson 2/Slide 1 of 49

Problem Statement 2.D.1
☛ The head office of CyberShoppe sends information about its products to the branch offices. The product details must be stored in a consistent format at all branches. Restrictions must be placed on the kind of data that can be saved in the data store to ensure uniformity and consistency of information. The products sold by CyberShoppe are organized into two categories, toys and books. The product details comprise the name of the product, a brief description of it, its price, and the quantity available in stock. Every product is uniquely identified by a product ID.

Task List
☛ Identify the elements required for storing structured data.
☛ Identify the attributes.
☛ Identify the method for storing consistent data.
☛ Identify the method for declaring elements to be used for storing structured data.
☛ Identify the method for declaring attributes.
☛ Identify the method to validate the structure of data.

Task List (Contd.)
☛ Declare elements and attributes.
☛ Store data.
☛ Validate the structure of data.

Task 1: Identify the elements required for storing structured data.
Result:
☛ The elements required to store the details about products sold at CyberShoppe are as follows:
Element | Description
PRODUCTDATA | Indicates that data specific to various products is being stored in the document. Acts as the root element for all other elements.
PRODUCT | Represents the details (product name, description, price, and quantity) for each product.
PRODUCTNAME | Represents the name of each product.
DESCRIPTION | Represents the description of each product.
PRICE | Represents the price of each product.
QUANTITY | Represents the quantity of each product.

Task 2: Identify the attributes.
Result:
☛ In the case of CyberShoppe, you need to store all details about products in an XML document.
☛ Each product needs a unique identification number so that a particular product can be identified easily. Therefore, PRODUCTID can be defined as an attribute of the PRODUCT element.
☛ The category classifies a product as Book or Toy. Therefore, CATEGORY can also be defined as an attribute of the PRODUCT element.

Task 2: Identify the attributes. (Contd.)
☛ The following table specifies the attributes to be used in the XML document that stores product details:
Attribute | Description
PRODUCTID | Represents a unique identification value for each product. It must be specified for every product.
CATEGORY | Represents the category of a product, and specifies whether a product is a TOY or BOOK.

Task 3: Identify the method for storing consistent data.
Document Type Definition
☛ A DTD defines the structure of the content of an XML document, thereby allowing you to store data in a consistent format.
☛ XML allows you to create your own DTDs for applications.
✓ You can check an XML document against a DTD.
✓ This checking process is called validation.
✓ XML documents that conform to a DTD are considered valid documents.

Task 3: Identify the method for… (Contd.)
Result:
☛ As a DTD allows you to specify the structure and type of data elements, a DTD can be created to specify the structure of the document.

Task 4: Identify the method for declaring elements to be used for storing structured data.
☛ In a DTD, elements are declared by using the following syntax:
<!ELEMENT elementname (content-type or content-model)>
☛ Elements can be of the following types:
✓ Empty
✓ Unrestricted
✓ Container
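
As a rough illustration of the three element types, the declarations below sketch one of each. DISCONTINUED and NOTES are hypothetical names that do not appear in the CyberShoppe scenario, and the PRODUCT content model is an assumption based on the element list from Task 1.

```xml
<!-- Empty element: carries no content at all (hypothetical name). -->
<!ELEMENT DISCONTINUED EMPTY>

<!-- Unrestricted element: may hold text and any declared elements (hypothetical name). -->
<!ELEMENT NOTES ANY>

<!-- Container element: holds other elements, listed in its content model. -->
<!ELEMENT PRODUCT (PRODUCTNAME, DESCRIPTION, PRICE, QUANTITY)>
```

An unrestricted element is convenient while a structure is still changing, but it gives a validating parser very little to check, so container declarations are usually the better fit for consistent data.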

Task 4: Identify the method for… (Contd.)
☛ While declaring elements in a DTD, different symbols can be used to specify whether an element is mandatory or optional, and whether it can occur more than once.
☛ The following list gives the symbols that can be used while defining the DTD:
Symbol: ,
Meaning: "and", in a specific order.
Example: PRODUCTNAME, DESCRIPTION
Description: PRODUCTNAME and DESCRIPTION must both occur, in that order.
Symbol: |
Meaning: "or".
Example: PRODUCTNAME | DESCRIPTION
Description: Either PRODUCTNAME or DESCRIPTION must occur, but not both.
Symbol: ?
Meaning: "optional"; can occur at most once.
Example: DESCRIPTION?
Description: DESCRIPTION need not be present, but if it is present, it can occur only once.
Symbol: *
Meaning: An element can occur zero or more times.
Example: (PRODUCTNAME | DESCRIPTION)*
Description: Any number of PRODUCTNAME or DESCRIPTION elements can be present in any order.
Symbol: +
Meaning: An element must occur at least once. There can be multiple occurrences.
Example: DESCRIPTION+
Description: DESCRIPTION must occur at least once and can occur multiple times.
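
The occurrence symbols can be compared side by side in a small set of declarations. This is only a sketch: the element names ITEM_A to ITEM_E are hypothetical and exist purely to give each symbol its own content model.

```xml
<!-- Comma: PRODUCTNAME followed by DESCRIPTION, both required, in this order. -->
<!ELEMENT ITEM_A (PRODUCTNAME, DESCRIPTION)>

<!-- Vertical bar: exactly one of the two alternatives. -->
<!ELEMENT ITEM_B (PRODUCTNAME | DESCRIPTION)>

<!-- Question mark: DESCRIPTION may be omitted, or appear once. -->
<!ELEMENT ITEM_C (PRODUCTNAME, DESCRIPTION?)>

<!-- Asterisk: zero or more of either element, in any order. -->
<!ELEMENT ITEM_D (PRODUCTNAME | DESCRIPTION)*>

<!-- Plus sign: one or more DESCRIPTION elements. -->
<!ELEMENT ITEM_E (DESCRIPTION+)>

<!-- The child elements themselves hold plain text. -->
<!ELEMENT PRODUCTNAME (#PCDATA)>
<!ELEMENT DESCRIPTION (#PCDATA)>
```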

Task 4: Identify the method for… (Contd.)
☛ As per the given scenario, the type of content for each element is given in the following table:
Element | Content Type | Description
PRODUCTDATA | Element content | Contains one or more PRODUCT elements.
PRODUCT | Element content | Contains the details of a product, and hence will contain other elements like PRODUCTNAME, DESCRIPTION, PRICE, and QUANTITY.
PRODUCTNAME | Data content | Contains regular text that represents the name of a product.
DESCRIPTION | Data content | Contains regular text that represents the description of a product.
PRICE | Data content | Contains regular text that represents the price of a product.
QUANTITY | Data content | Contains regular text that represents the quantity of a product.

Task 4: Identify the method for… (Contd.)
☛ You need to use the <!ELEMENT> statement for declaring elements in a DTD.
☛ For example, the PRODUCTNAME element used in the CyberShoppe scenario can be declared as follows:
<!ELEMENT PRODUCTNAME (#PCDATA)>
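
Applying the content types from the Task 4 table to every element in the scenario, one possible set of declarations is sketched below. The order of the children inside PRODUCT, and the decision to make each of them mandatory, are assumptions; the lesson only states which elements PRODUCT contains.

```xml
<!-- Element content: the root holds one or more PRODUCT elements. -->
<!ELEMENT PRODUCTDATA (PRODUCT+)>

<!-- Element content: assumed order, every child required exactly once. -->
<!ELEMENT PRODUCT (PRODUCTNAME, DESCRIPTION, PRICE, QUANTITY)>

<!-- Data content: #PCDATA means parsed character data, that is, plain text. -->
<!ELEMENT PRODUCTNAME (#PCDATA)>
<!ELEMENT DESCRIPTION (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!ELEMENT QUANTITY (#PCDATA)>
```

Note that a DTD treats PRICE and QUANTITY as plain text, so it cannot check that they contain numbers.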

Task 5: Identify the method for declaring attributes.
☛ The syntax for declaring attributes in a DTD is as follows:
<!ATTLIST elementname attributename valuetype [attributetype] ["default"]>
✓ The attributename valuetype [attributetype] ["default"] section is repeated as often as necessary to create multiple attributes for any given element.

Task 5: Identify the method for declaring attributes. (Contd.)
☛ The value types that can be specified for attributes in a DTD include:
✓ CDATA
✓ ID
✓ (enumerated)
☛ The attribute types are:
✓ #REQUIRED
✓ #FIXED
✓ #IMPLIED

Task 5: Identify the attribute types and… (Contd.)
Result:
☛ In the case of CyberShoppe, the attributes and their value types will be as follows:
Attribute | Attribute Type | Value Type | Description
PRODUCTID | #REQUIRED | ID | Product ID must have a unique value and has to be specified for every product.
CATEGORY | #REQUIRED | (enumerated) | Category must be TOYS or BOOKS.
☛ You need to use the <!ATTLIST> statement for declaring attributes in a DTD.
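
The two attributes in the result table could be declared along the following lines. This is a sketch rather than the lesson's own listing; the enumerated values TOYS and BOOKS are taken from the table above.

```xml
<!-- Both attributes belong to PRODUCT and must be supplied for every product. -->
<!ATTLIST PRODUCT
          PRODUCTID ID             #REQUIRED
          CATEGORY  (TOYS | BOOKS) #REQUIRED>
```

Because the value of an ID attribute must be a valid XML name, it cannot begin with a digit: a value such as P001 is acceptable, whereas 001 is not.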

Task 6: Identify the method to validate the structure of data.
☛ To validate the structure of data in an XML document, you need to use parsers.
☛ Parsers are software programs that check the syntax used in an XML file. There are two types of parsers:
✓ Non-validating parsers: Check whether an XML document is well-formed.
✓ Validating parsers: Check for well-formedness and validity of an XML document.

Task 6: Identify the method to validate… (Contd.)
Result:
☛ In order to check whether the data sent by the branches of CyberShoppe conforms to the structure specified in the DTD, you need a validating parser.

Task 7: Declare elements and attributes.
☛ Internal and External DTDs
✓ You can declare elements and attributes in a DTD.
✓ A DTD can be classified into two types:
➤ Internal DTD
➤ External DTD

Task 7: Declare elements and attributes. (Contd.)
☛ Differences between internal and external DTDs are given in the following table:
Internal DTD | External DTD
This DTD is a part of the XML document. | This DTD is maintained as a separate file. A reference to this file is included in the XML document.
This DTD can be used only by the document in which it is created and cannot be used across multiple documents. | This DTD can be used across multiple documents.

Task 7: Declare elements and attributes. (Contd.)
☛ To ensure that the structure of an XML document conforms to the DTD, you must associate the DTD with the XML document.
☛ The <!DOCTYPE> declaration is used to define the internal DTD. It can also be used to reference an external DTD.
☛ The syntax for defining an internal DTD in an XML document is as follows:
<!DOCTYPE rootelement [ element and attribute declarations ]>
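
To show where an internal DTD sits in a document, here is a deliberately cut-down sketch. It declares only three of the scenario's elements so that the placement of the declarations stays easy to see; the product name is invented sample data.

```xml
<?xml version="1.0"?>
<!-- The name after DOCTYPE must match the root element of the document. -->
<!DOCTYPE PRODUCTDATA [
  <!ELEMENT PRODUCTDATA (PRODUCT+)>
  <!ELEMENT PRODUCT (PRODUCTNAME)>
  <!ELEMENT PRODUCTNAME (#PCDATA)>
]>
<PRODUCTDATA>
  <PRODUCT>
    <PRODUCTNAME>Toy Train</PRODUCTNAME>
  </PRODUCT>
</PRODUCTDATA>
```

The declarations travel with the document, which is convenient for a single file but means they cannot be shared with other documents.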

Task 7: Declare elements and attributes. (Contd.)
☛ The syntax for referencing an external DTD in the XML document is as follows:
<!DOCTYPE rootelement SYSTEM "filename.dtd">
Action:
☛ Type the code for creating the DTD.
☛ Save the file as products.dtd.
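
Collecting the element and attribute decisions from Tasks 4 and 5, the contents of products.dtd might look like the sketch below. The child order inside PRODUCT is an assumption, and this listing is an illustration, not the lesson's original code.

```xml
<!-- products.dtd: structure of the CyberShoppe product data (illustrative). -->
<!ELEMENT PRODUCTDATA (PRODUCT+)>
<!ELEMENT PRODUCT (PRODUCTNAME, DESCRIPTION, PRICE, QUANTITY)>
<!ELEMENT PRODUCTNAME (#PCDATA)>
<!ELEMENT DESCRIPTION (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!ELEMENT QUANTITY (#PCDATA)>

<!-- PRODUCTID must be unique; CATEGORY is limited to the two listed values. -->
<!ATTLIST PRODUCT
          PRODUCTID ID             #REQUIRED
          CATEGORY  (TOYS | BOOKS) #REQUIRED>
```

An external DTD file contains only declarations and comments; it has no <!DOCTYPE> wrapper of its own.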

Task 8: Store data.
Action:
☛ Write the code for creating the XML document.
☛ Save the file as products.xml.
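
A products.xml file that references the external DTD could be sketched as follows. All product values here are invented sample data, and the structure assumes that products.dtd declares PRODUCTNAME, DESCRIPTION, PRICE, and QUANTITY as children of PRODUCT in that order, with PRODUCTID and CATEGORY as required attributes.

```xml
<?xml version="1.0"?>
<!-- SYSTEM points to the DTD file, here assumed to be in the same folder. -->
<!DOCTYPE PRODUCTDATA SYSTEM "products.dtd">
<PRODUCTDATA>
  <PRODUCT PRODUCTID="P001" CATEGORY="TOYS">
    <PRODUCTNAME>Toy Train</PRODUCTNAME>
    <DESCRIPTION>A battery-operated train set.</DESCRIPTION>
    <PRICE>25</PRICE>
    <QUANTITY>40</QUANTITY>
  </PRODUCT>
  <PRODUCT PRODUCTID="P002" CATEGORY="BOOKS">
    <PRODUCTNAME>World Atlas</PRODUCTNAME>
    <DESCRIPTION>An illustrated atlas for children.</DESCRIPTION>
    <PRICE>15</PRICE>
    <QUANTITY>12</QUANTITY>
  </PRODUCT>
</PRODUCTDATA>
```

A validating parser would reject this document if two products shared the same PRODUCTID, or if a CATEGORY value fell outside the enumerated list.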

Task 9: Validate the structure of data.
Action:
☛ Open index.htm in Internet Explorer.
☛ Click the DTD Validator link.
☛ Type the name of the XML document that you want to parse in the text box.
☛ Click the Validate button.

Just a Minute…
The branches of CyberShoppe send information about books sold by them to the head office. The book details must be stored in a consistent format. Restrictions must be placed on the kind of data that can be saved in the data store to ensure uniformity and consistency of information. The details of the books sold by CyberShoppe consist of the name of the book, the ISBN of the book, the first and last names of the author of the book, and the price of the book. The ISBN should be unique for each book. In addition, you need to ensure that the book category contains HISTORY, SCIENCE, or FICTION as its valid values. Create a DTD for declaring the elements to be used for storing book details in an XML document.

Introduction to XML Schemas
☛ An XML schema is used to define the structure of an XML document.
☛ The World Wide Web Consortium (W3C) has standardized a language that is used to define the schema of an XML document. This language is called the XML Schema Definition (XSD) language.

Advantages of XML Schemas over DTDs
☛ Some of the advantages of an XML schema created by using XSD over a DTD are as follows:
✓ XSD provides more control over the type of data that can be assigned to elements and attributes as compared to a DTD.
✓ A DTD does not enable you to define your own customized data types. XSD enables you to create your own data types.
✓ XSD also allows you to specify restrictions on data.

Advantages of XML Schemas over DTDs (Contd.)
✓ The syntax for defining a DTD is different from the syntax used for creating an XML document. However, the syntax for defining an XSD is the same as the syntax of the XML document.

Problem Statement 2.D.2
The head office of CyberShoppe sends information about its products to its branch offices. The product details must be stored in a consistent format. Restrictions must be placed on the kind of data that can be saved in the data store to ensure uniformity and consistency of information. The product details comprise the name of the product, a brief description of it, the price of the product, and the quantity available in stock. The price of the product must always be greater than zero.

Task List
☛ Identify the elements required to store data.
☛ Identify the data type of the contents of an element.
☛ Identify the method for declaring a simple type element.
☛ Identify the method for declaring a complex type element.
☛ Create the XML schema.
☛ Create an XML document conforming to the schema.
☛ Validate an XML document against the schema.

Task 1: Identify the elements required to store data.
Result:
☛ As per the problem, the elements required in the XML document are:
Element | Description
PRODUCTDATA | Indicates that data specific to various products is being stored in the document. Therefore, it contains more elements and acts as the root element.
PRODUCT | Represents the details (product name, description, price, and quantity) for each product.
PRODUCTNAME | Represents the name of each product.
DESCRIPTION | Represents the description of each product.
PRICE | Represents the price of each product.
QUANTITY | Represents the quantity of each product.

Task 2: Identify the data type of the contents of an element.
☛ Every element declared in XSD must be associated with a data type.
☛ XSD provides a list of predefined data types:
✓ Primitive Data Types: Fundamental data types of XSD, such as string, decimal, float, and boolean.
✓ Derived Data Types: Defined by using other data types.
✓ Atomic Data Types: Data types that cannot be broken down further.
✓ List Data Types: Contain a set of values.
✓ Union Data Types: Derived from list and atomic data types.

Task 2: Identify the data type of the… (Contd.)
☛ XSD also allows the definition of custom data types. These custom data types can be classified as follows:
✓ Simple data type: A data type that contains only values.
✓ Complex data type: A data type that can contain child elements, attributes, and mixed content.

Task 2: Identify the data type of the… (Contd.)
Result:
☛ The data type for the contents of the elements will be:
Element | Data Type | Description
PRODUCTDATA | Complex data type | A complex type element that can hold other elements, attributes, and mixed content. This element will hold a complex data type, which will be defined later.
PRODUCT | Complex data type | A complex type element that can hold other elements, attributes, and mixed content. This element will hold a complex data type, which will be defined later.
PRODUCTNAME | string | A simple type element that contains values of the string data type.
DESCRIPTION | string | A simple type element that contains values of the string data type.
PRICE | positiveInteger | A simple type element that contains values of the positiveInteger data type (the product price must be greater than zero).
QUANTITY | integer | A simple type element that contains values of an integer data type (declared as nonNegativeInteger in the Task 3 result).

Task 3: Identify the method for declaring a simple type element.
☛ A simple element does not contain any child elements or attributes. Simple elements contain only values such as numbers, strings, and dates.
☛ The syntax for declaring elements with a simple data type is as follows:
<xsd:element name="element-name" type="data type" />

Task 3: Identify the method for declaring a simple type element. (Contd.)
☛ You can associate an element with a user-defined simple data type. To do so, you must define the new simple data type.
☛ You can use the simpleType element of XSD to create a user-defined simple data type.
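
As an illustration of a user-defined simple data type, the sketch below restricts the built-in decimal type so that only values greater than zero are accepted. The type name priceType is hypothetical, and this is an alternative way of expressing the "greater than zero" rule, not the approach the lesson itself takes for PRICE.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- User-defined simple type: a decimal that must be strictly above 0. -->
  <xsd:simpleType name="priceType">
    <xsd:restriction base="xsd:decimal">
      <xsd:minExclusive value="0"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- An element is associated with the new type through its type attribute. -->
  <xsd:element name="PRICE" type="priceType"/>

</xsd:schema>
```

Unlike positiveInteger, a type built this way would also accept fractional prices such as 9.99.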

Task 3: Identify the method for declaring a simple type element. (Contd.)
Result:
☛ As per the problem, the simple elements can be declared in the XSD as follows:
<xsd:element name="PRODUCTNAME" type="xsd:string"/>
<xsd:element name="DESCRIPTION" type="xsd:string"/>
<xsd:element name="PRICE" type="xsd:positiveInteger"/>
<xsd:element name="QUANTITY" type="xsd:nonNegativeInteger"/>

Task 4: Identify the method for declaring a complex type element.
☛ A complex type element is one that can contain other markup elements, attributes, and mixed content.
☛ To declare a complex type element, you need to first define a complex data type. After you define a complex data type, you can declare a complex element by associating this data type with the element.
☛ You can define a complex data type by using the syntax given below:
<xsd:complexType name="data type name">
Content model declaration
</xsd:complexType>

Task 4: Identify the method for declaring a complex type element. (Contd.)
☛ To declare an element as a complex type element, the element must be associated with a complex data type.
☛ For example, to declare the element PRODUCT as a complex type element, you can associate this element with the prdt data type as shown below:
<xsd:element name="PRODUCT" type="prdt"/>
Result:
☛ In the CyberShoppe scenario, you require two complex type elements, PRODUCTDATA and PRODUCT.

Task 4: Identify the method for declaring a complex type element. (Contd.)
☛ You can create complex type elements by associating them with complex data types.
☛ You can use the element element of XSD to declare a complex type element.
☛ You can use the complexType element of XSD to create the complex data type.
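
The prdt data type mentioned in this task might be defined along the following lines. The use of xsd:sequence, and therefore the fixed order of the child elements, is an assumption; the lesson names the type but does not show its content model.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Complex data type: the children must appear once each, in this order. -->
  <xsd:complexType name="prdt">
    <xsd:sequence>
      <xsd:element name="PRODUCTNAME" type="xsd:string"/>
      <xsd:element name="DESCRIPTION" type="xsd:string"/>
      <xsd:element name="PRICE" type="xsd:positiveInteger"/>
      <xsd:element name="QUANTITY" type="xsd:nonNegativeInteger"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- Complex type element: PRODUCT is associated with the prdt type. -->
  <xsd:element name="PRODUCT" type="prdt"/>

</xsd:schema>
```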

Task 5: Create the XML Schema.
☛ The Schema element
✓ The integration of the various components of the XSD is done using the schema element.
✓ The declaration of an XML schema starts with the <schema> element.
✓ The <schema> element uses the xmlns attribute to specify the namespace associated with the document.
Action:
✓ Type the XML Schema in Notepad.
✓ Save the file as product.xsd.
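
Putting the pieces together, a product.xsd for this problem could be sketched as shown below. The type name prdata, the use of xsd:sequence, and the maxOccurs setting are assumptions made for illustration; prdt is the type name used earlier in the lesson.

```xml
<?xml version="1.0"?>
<!-- The xmlns:xsd attribute binds the xsd prefix to the XML Schema namespace. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Root element of the instance document. -->
  <xsd:element name="PRODUCTDATA" type="prdata"/>

  <!-- PRODUCTDATA holds one or more PRODUCT elements (hypothetical type name). -->
  <xsd:complexType name="prdata">
    <xsd:sequence>
      <xsd:element name="PRODUCT" type="prdt" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- Each PRODUCT holds the four simple elements, in a fixed order. -->
  <xsd:complexType name="prdt">
    <xsd:sequence>
      <xsd:element name="PRODUCTNAME" type="xsd:string"/>
      <xsd:element name="DESCRIPTION" type="xsd:string"/>
      <xsd:element name="PRICE" type="xsd:positiveInteger"/>
      <xsd:element name="QUANTITY" type="xsd:nonNegativeInteger"/>
    </xsd:sequence>
  </xsd:complexType>

</xsd:schema>
```

Since minOccurs defaults to 1, the PRODUCT declaration above requires at least one product while placing no upper limit on how many may follow.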

Task 6: Create an XML document conforming to the schema.
☛ To create a data structure that conforms to the XML schema, you should create an XML document and associate it with the XML schema.
☛ An XML file cannot be directly associated with the XML schema file. The XML file can be associated with the XML schema only through a validator.
Action:
✓ Type the code in Notepad.
✓ Save the file as products.xml.
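
A products.xml document intended for validation against the schema might look like the sketch below. The product values are invented sample data, and the layout assumes a schema in which PRODUCTDATA contains PRODUCT elements whose children appear in the order PRODUCTNAME, DESCRIPTION, PRICE, QUANTITY. In keeping with the approach described in this task, the document carries no reference to the schema file; the pairing is made in the validator.

```xml
<?xml version="1.0"?>
<PRODUCTDATA>
  <PRODUCT>
    <PRODUCTNAME>Toy Train</PRODUCTNAME>
    <DESCRIPTION>A battery-operated train set.</DESCRIPTION>
    <!-- PRICE must be a whole number greater than zero (positiveInteger). -->
    <PRICE>25</PRICE>
    <!-- QUANTITY may be zero but not negative (nonNegativeInteger). -->
    <QUANTITY>40</QUANTITY>
  </PRODUCT>
  <PRODUCT>
    <PRODUCTNAME>World Atlas</PRODUCTNAME>
    <DESCRIPTION>An illustrated atlas for children.</DESCRIPTION>
    <PRICE>15</PRICE>
    <QUANTITY>0</QUANTITY>
  </PRODUCT>
</PRODUCTDATA>
```

Changing a PRICE to 0 or to a non-numeric value should cause schema validation to fail, which is exactly the kind of check a DTD cannot perform.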

Task 7: Validate an XML document against the schema.
Action:
✓ Open index.htm.
✓ Click the Schema Validator link.
✓ Type the name of the XML document and the XSD file.
✓ Click the Validate button.

Problem Statement 2.P.2
☛ The details of the books sold by CyberShoppe consist of the name of the book, the ISBN of the book, the first and last names of the author of the book, and the price of the book. The ISBN must start with the letter I and be followed by three digits. This data must be validated to ensure that it conforms to the standards specified in order to maintain data integrity. Also, the data types used for the data must be compatible with those used in databases. All data must be stored in a consistent format.

Summary
In this lesson, you learned that:
☛ A Document Type Definition (DTD) is a method for defining the structure of the data in an XML document.
☛ There are two types of DTD:
✓ Internal DTD: It is included as a part of the document.
✓ External DTD: It is stored as a separate file that holds the declarations of all elements and attributes that can be used in an XML document.
☛ There are three types of elements: empty, unrestricted, and container.

Summary (Contd.)
☛ The <!ELEMENT> statement is used to declare an element in a DTD.
☛ The <!ATTLIST> statement is used to declare a list of attributes for an element in a DTD.
☛ The

Summary (Contd.)
☛ A schema can be used to specify the list of elements and the order in which these elements must appear in the XML document.
☛ The language that is used to describe the structure of the elements in a schema is called the XML Schema Definition (XSD) language.
☛ The data types supported by a schema are of the following types:
✓ Primitive
✓ Derived
✓ Atomic
✓ List

Summary (Contd.)
☛ The simpleType element of XSD allows you to create user-defined simple data types.
☛ The complexType element of XSD allows you to create complex data types.
☛ The restriction element can be used to specify constraints on values that can be stored in elements and attributes.