XML Impacting the Enterprise Tapping into the Power of XML: Five Success Stories
Table of Contents
Introduction and XML Servers: Enhancing and Improving the XML Experience
Five Success Stories: XML Servers in the Real World
Elsevier Case Study
Reed Business Information (RBI) Case Study
JPMorgan Chase & Co. Case Study
Datamonitor Case Study
JetBlue Case Study
The Benefits of XML Servers
Conclusion
XML: A Brief (and Successful) History

XML was introduced 10 years ago to improve the exchange of content over the web and enable linking of siloed content repositories. Today it is experiencing a surge in popularity. Both businesses and government agencies are embracing XML in large numbers. In a 2008 survey of 700 software development professionals using XML in multiple industries worldwide, 64% said their companies are expanding the overall amount of data they store in XML. Sixty-nine percent said the number of XML files in their organization is growing.

In just a few years, we’ve seen the arrival, and widespread adoption, of a number of new and innovative uses for XML:

Office Open XML: With the advent of Microsoft Office 2007, a version of XML called Office Open XML or Open XML became the underlying format for all Office documents.

Standard XML Schemas: XML schemas, which describe particular types of XML documents, are now available for many industries. These include the Financial products Markup Language (FpML), the Darwin Information Typing Architecture (DITA) for technical information, and the Keyhole Markup Language (KML) for geographic information.

Web 2.0: XML enables popular Web 2.0 capabilities like tagging, rating, annotating, and commenting by allowing users to easily mark up, add to, and enhance content.

SOA/Web Services: Service-oriented architectures (SOA) and web services are increasingly popular because they simplify the exchange of data among heterogeneous systems. SOA and web services are based on XML standards, including the Simple Object Access Protocol (SOAP), a specification for exchanging structured information in the implementation of web services, and the Web Services Description Language (WSDL), which provides a model for describing web services.

XQuery: The World Wide Web Consortium (W3C) adopted XQuery as the XML query standard in January 2007. Increasing numbers of major software vendors, like Oracle and IBM, support this standard, leading to more XQuery books, training, and classes.

Federal Government Initiatives: Numerous initiatives require federal agencies to employ XML. XML is a key component of the emerging federal enterprise architecture (FEA), which provides a common methodology for IT acquisition, use, and disposal in the federal government. XML is a focus of the Emerging Technology group of the Federal CIO Architecture and Infrastructure Committee. The E-Government Act of 2002, which recommends the use of standards and guidelines for interconnectivity and interoperability, has also included XML among its recommended standards.
| mark logic whitepaper
These developments have helped influence XML’s growing popularity. But there is another key factor in the XML success story: the XML server.

XML Servers: Enhancing and Improving the XML Experience

It’s now even easier to take advantage of XML thanks to XML servers designed specifically for applications that rely on XML content. An XML server is built specifically for XML content and uses that content in its native format. Organizations can load all relevant content into an XML server, creating a content platform that allows them to rapidly build and deploy applications to meet a wide variety of information access and delivery needs. The platform helps them increase use of content through dynamic publishing applications that can be tied into business processes.

Some of the key capabilities of an XML server include:

Native XML storage: An XML server uses XML as its native storage format and is built specifically for XML content. It allows organizations to load any XML in any structure directly into the repository. This eliminates the need for administrators to maintain multiple structures and indexes for managing content, reducing maintenance costs and improving performance.

Comprehensive search: Accessing the right content starts with search. An XML server combines full-text search, XML search, and geospatial search to find content. XQuery support allows users to retrieve the content at a granular level (paragraph, chapter, caption, section) for reassembly and delivery.

Powerful content analysis: Before organizations can exploit content, they need to understand what they have. XML servers offer analytic capabilities that characterize an organization’s content, such as number of authors, pages, or images. The analysis also feeds rich visual interfaces for exploring content, such as faceted navigation, tag clouds, or heat maps, that guide users to information more quickly.

High performance: XML servers combine various techniques to improve performance, including advanced clustering, parallel processing of queries, and threading to take advantage of multi-core architectures. Delivering content to a browser in XML via an application is much more efficient than having to transform content into other formats or storage models. A solution that includes a web server can serve HTML or XHTML directly to the browser, avoiding potential bottlenecks or latency issues. Additionally, a universal index that indexes text and structure allows users to receive specific information quickly. The index can be optimized automatically to improve performance.

Ease of integration: XML servers can easily integrate with any third-party product either natively through XML and XQuery or through Java and .NET, becoming a “hub” in a system architecture and allowing different applications that speak XML to work together. This means an organization could use a content management system with capabilities such as sign in/out, versioning, and workflow to create content, store content in the XML server, and pull content directly into a design tool like FrameMaker.

We’ve described some of the critical components of an XML server and how it combines with XML to increase its overall flexibility. But how does an XML server work in real organizations facing complex, real-world challenges?
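To make the idea of an index over both text and structure more concrete, here is a minimal sketch in Python. It is an illustration only, not how MarkLogic Server or any particular XML server builds its index: the sample document, the `build_index` helper, and the (element path, word) key layout are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample document, invented for this sketch.
DOC = """<article>
  <title>Cooling systems</title>
  <section>
    <caption>Fan assembly</caption>
    <para>The cooling fan draws air across the radiator.</para>
  </section>
  <section>
    <para>Replace the fan belt every two years.</para>
  </section>
</article>"""


def build_index(xml_text):
    """Map (element path, word) to the texts of elements containing that word."""
    index = defaultdict(list)

    def walk(elem, path):
        here = f"{path}/{elem.tag}"
        text = (elem.text or "").strip()
        # Index each distinct word under the structural path of its element.
        for word in set(text.lower().replace(".", "").split()):
            index[(here, word)].append(text)
        for child in elem:
            walk(child, here)

    walk(ET.fromstring(xml_text), "")
    return index


index = build_index(DOC)

# A lookup can combine a word with a structural constraint:
# only paragraphs inside sections that mention "fan".
hits = index[("/article/section/para", "fan")]
for text in hits:
    print(text)
```

Because the key carries the element path as well as the word, the same term can be searched for in captions only, in paragraphs only, or anywhere, without scanning the documents again.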
Five Success Stories: XML Servers in the Real World

Today many of the world’s top organizations are using XML servers to solve their toughest information access and delivery challenges. Following are the success stories of five industry leaders that are increasing agility, among other benefits, with the help of MarkLogic Server.
Elsevier

A world-leading publisher of scientific, technical, and medical information products and services, Elsevier is the science and medical publishing division of Reed Elsevier Group plc. Elsevier supplies more than 30 million scientists, students, and health information professionals worldwide with 20,000+ products and services that help them conduct research, perform experiments, aid patients, and more.

Challenge

Elsevier has long invested in digitizing content, amassing vast repositories of medical and scientific information, and making it available via a range of online database-driven solutions. As Elsevier’s content grew, its customers began spending more time refining searches to find relevant content. Elsevier wanted to help customers extract only the pieces of content they needed and maximize its value by letting them flexibly combine procedures, techniques, and best practices.

Elsevier had stored its vast content repositories (five million full text articles from 1,800 journals; more than 60 million citations and abstracts; 20,000 print books; 9,000 out-of-print books; and thousands of pamphlets) in multiple databases in 35 different file formats. To help customers find content more easily, the company decided to migrate it to a single platform.

Initially, Elsevier moved the content to XML and used an RDBMS to create a centralized document repository. This allowed the company to deconstruct and synthesize documents into content-specific results. But to obtain reasonable performance from the RDBMS, it needed to pre-define schemas and access paths, both time-consuming tasks.

Solution

In the end, Elsevier decided to replace its RDBMS with MarkLogic Server. This allowed the company to develop applications that store all of its data in a large content repository, extract exactly the information needed, and present the content as a new, automatically created document. Now Elsevier can build new applications and create value-added services from the repository very quickly.

Benefits

Because the XML server allows Elsevier to import content “as is” from many sources, the company has eliminated the lengthy process of preparing and normalizing content for the repository, slashing time-to-availability by two thirds. The new system also enables rapid application development without the need for schemas or DTDs, allowing Elsevier to more quickly deliver new products to customers.

A complete XQuery implementation delivers high performance against the multi-terabyte dataset. The server can search deep inside documents to access precise sections or paragraphs, rather than large numbers of possible documents, allowing users to find information five to nine times faster than before.

These role- and task-aware applications help guide users to results by presenting information to them in a way that is consistent with their diagnostic process. They can work across summary level information drawn from metadata and then drill into more specific details when necessary, and all of the content is assembled in real time as they navigate.

Most importantly, physicians can work faster and with greater confidence. By using Elsevier applications to facilitate the diagnostic process, physicians spend less time looking for information and more time evaluating and healing. And because physicians can more easily compare procedures, they can also achieve better diagnoses.
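The Elsevier case describes loading differently structured content “as is” and then retrieving precise paragraphs rather than whole documents. The sketch below illustrates that idea with Python’s standard library only; it is not Elsevier’s application or MarkLogic’s XQuery engine. The two sample documents, their element names, and the `find_paragraphs` helper are hypothetical and exist only for this example.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents with different structures, loaded without
# any shared schema or normalization step.
JOURNAL = """<article><title>Knee arthroscopy</title>
<body><section><heading>Technique</heading>
<para>Insert the arthroscope through the anterolateral portal.</para></section>
<section><heading>Recovery</heading>
<para>Most patients resume walking within days.</para></section></body></article>"""

BOOK = """<book><chapter><name>Shoulder procedures</name>
<para>The arthroscope is introduced posteriorly.</para></chapter></book>"""


def find_paragraphs(documents, term):
    """Return (root tag, paragraph text) for each <para> that mentions term."""
    results = []
    for doc in documents:
        root = ET.fromstring(doc)
        # iter() walks the whole tree, so nesting depth does not matter.
        for para in root.iter("para"):
            text = "".join(para.itertext())
            if term.lower() in text.lower():
                results.append((root.tag, text))
    return results


matches = find_paragraphs([JOURNAL, BOOK], "arthroscope")
for source, text in matches:
    print(f"{source}: {text}")
```

The query returns only the matching paragraphs, one from the journal article and one from the book, even though the two documents nest their paragraphs differently.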
1 www.wikipedia.org [Schema] A schema is a way to define the structure, content and, to some extent, the semantics of XML documents
2 www.wikipedia.org [Index] (publishing) A detailed list, usually arranged alphabetically, of the specific information in a publication
RBI France

Reed Business Information (RBI) is a business-to-business division of Reed Elsevier Group plc, the world’s leading publisher and information provider. RBI provides business professionals across five continents with unrivaled access to communication and information channels ranging from magazines to directories, conferences, and market research. RBI France is one of nine international operations hubs and stewards some of the company’s most popular branded products in Europe.

Challenge

Because of its laborious content transformation and production process, RBI France was struggling to get new products to market quickly and provide value-added services to its customers. The company had been using XML to tag and index content and SQL Server as its content repository. When it needed to reuse content, it was forced to export selected content into XML and reprocess it for publishing. RBI France would then export the new pages out of XML for re-storage in the database. This meant the production department had to print content first and then load it on the internet, significantly slowing the update process.

RBI France resells its professional content to business partners, such as banks, consumer sites, retail outlets, and technology companies, who in turn send this custom, industry-specific content to their end customers. But due to the flawed production process, it was difficult and expensive for RBI France to repurpose content for its partners.

Solution

To streamline its production processes, RBI France worked with MarkLogic to create a custom, enterprise-wide content management and production system. The solution stored all of RBI France’s XML content in a centralized repository, enabled deep, fine-grained queries across publishing verticals, and preserved and enriched metadata to improve image archive management and reuse.

The system includes workflow functionality to speed and ease production and content management of the publisher’s books, magazines, website applications, and partner offerings. And it allows for direct connections to and from the editors’ preferred in-house layout and design tools, including Adobe FrameMaker and InDesign.

Benefits

Because all the content is XML, RBI France has eliminated needless transformations, creating a cleaner, faster, more streamlined process that allows the print production team to leverage the latest information. Editors now publish in XML and launch new material on the web in seconds when ready. The new platform also makes it easier to repurpose digital content so RBI France can deliver even more value-added services to its partners.
JPMorgan Chase & Co.

JPMorgan Chase & Co. is a leading global financial services firm with assets of $2.0 trillion and operations in more than 60 countries. The firm is a leader in investment banking, financial services for consumers, commercial banking, financial transaction processing, asset management, and private equity.

Challenge

JPMorgan Chase provides financial research to customers on a subscription basis. Because every second counts in the fast-paced world of stock trading, the firm needed to deliver new research to its subscribers as quickly as possible to help them make better decisions about their trades.

Unfortunately, these efforts were hampered by the firm’s legacy infrastructure. Because of shortcomings in the existing tool, the firm could not easily respond to new requirements or fully leverage the XML that was being created. Additionally, it could not meet its goals for delivering alerts in a timely fashion.

Solution

JPMorgan Chase replaced its legacy system with MarkLogic Server. Now the firm can take full advantage of the research information already being stored in Research Information Exchange Markup Language (RIXML) format. RIXML is an open industry standard for categorizing, aggregating, comparing, sorting, and distributing global financial research. The solution drastically reduces alert latency and delivers information to the customer’s portal and email.

Benefits

Thanks to the new system, JPMorgan Chase delivers timely research to 80,000 users worldwide, improving customer satisfaction and competitive advantage. Because customers are alerted to the availability of critical new research more quickly, financial traders gain a definite edge in the office and on the trading floor.
Datamonitor

Datamonitor is a leading industry analyst. More than 5,000 customers use its subscription-based and custom research and consulting offerings, including industry analysis, company profiles, news products, and detailed market analysis, to develop strategic industry and competitive intelligence. The company is set apart from the competition thanks to industry-specific expertise and comprehensive databases that manage acquisition and revenue figures for each company it covers.

Challenge

Datamonitor believed it could achieve an advantage in a highly competitive market by combining quality information into high-value products. To accomplish this goal, it planned to hire more content authors and implement tools that would allow it to create better products with less effort.

However, Datamonitor faced technical obstacles. The company was using Microsoft Access for content authoring and Microsoft SQL Server as its data repository. While this architecture allowed for well-structured and consistent content products, it was difficult to modify the system to accommodate new products or extend existing products.

Also, while it could create content on a local network, Datamonitor was unable to move to a browser-based infrastructure. This was critical as the company increased its content authoring team from a few dozen people to well over 600 by opening an office in India.

Additional challenges included content distributed across multiple siloed systems, lack of an integrated architecture to support content reuse, cumbersome content creation workflow and processes, and the need to load and process many content formats, including information scraped from the web.

Solution

Datamonitor deployed MarkLogic Server as the basis for a content application to manage and deliver content. The content is all XML and a mix of structured data (financials and statistics about companies) and content for analysis. The application integrates with browser-based authoring and editing tools that support a team spanning three continents, with members in the U.S., U.K., and India. These tools work with a custom-built content management application that includes check-in, check-out, and versioning, implemented within a workflow system. As users complete editing, new versions are automatically created; as they complete tasks, documents are automatically checked in and out.

All of the editing screens and external application interactions are executed in XQuery. The user interface features a number of rich tools to allow authors to look up content ranging from Yahoo! ticker systems to 10-K filings to geography and industry taxonomies. The tool is linked to the Factiva information dataset, which allows Datamonitor to leverage the company name and organization relationships in the Factiva feed.

Benefits

By creating content as XML, managing it as XML, and delivering it as XML, Datamonitor has improved the quality of its content by incorporating different types of information into new high-value products. Because the company creates highly customized products for its customers, it increases customer intimacy and fosters a stronger business relationship. And because the application improves the speed with which Datamonitor can deliver new products, the company is frequently first to market.
JetBlue

New York-based JetBlue Airways has created a new airline category based on value, service, and style. Known for its award-winning service, free TV, and low fares, JetBlue serves 52 cities with 600 daily flights. It is also one of the first airlines to offer its own Customer Bill of Rights, with meaningful and specific compensation for customers inconvenienced by service disruptions within JetBlue’s control.

Problem

Previously, JetBlue employed a manually accessed and maintained document management system that used the file system to manage intellectual property such as regulatory compliance procedures, policies, and educational materials spanning multiple departments. But because this intellectual property resided in multiple disparate systems, when new regulatory guidance was released, JetBlue was unable to quickly update its operating manuals to meet new FAA, IATA, and/or ICAO requirements. Failure to correct this situation would have exposed JetBlue to a range of regulatory actions, including fines, sanctions, and potential loss of its operating certificate.

Solution

JetBlue created a content assembly application based on MarkLogic Server and integrated it into its Office 2007/SharePoint infrastructure. The centralized, scalable XML repository accommodates a diverse range of content types, including Microsoft Office documents and PDF files.

With this platform, subject matter experts using Microsoft Word can author policies and procedures, re-using information and assembling documents at any level of granularity to develop content for different audiences. The same content can be created once and personalized for users based on their role, task, location, and area of responsibility. Workflow within SharePoint is used to manage the approval process. As the FAA releases new guidance, JetBlue can also easily track and change the required topics and publish new information to the right people. Ultimately, the company can dynamically deliver the content online and offline in a variety of formats to various devices, including ground servers, PDAs, and disconnected pilot laptops.

Benefits

With the new system, ground crews, air crews, training departments, and regulatory agencies now have reliable and consistent access to the right information at the right time. And because users query live servers rather than fixed content, they are guaranteed the most current information. Prior to the new system, a pilot in rough turbulence might not have time to find Chapter 7, Section 16 in a 200-page procedure manual. But with the new system, pilots can quickly find the correct procedure needed to deal with a possible engine failure at a particular altitude.
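The JetBlue case describes creating content once and assembling documents for users based on their role. A minimal sketch of that kind of role-based assembly follows. It is an illustration under stated assumptions, not JetBlue’s application: the topic texts, the `roles` attribute, and the `assemble_manual` helper are all hypothetical and invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical topic store: each topic is written once and tagged with
# the roles that should see it. None of this text is real guidance.
TOPICS = """<topics>
  <topic id="deicing" roles="pilot ground">
    <title>De-icing procedure</title>
    <para>Confirm de-icing is complete before taxi.</para>
  </topic>
  <topic id="engine-failure" roles="pilot">
    <title>Engine failure in flight</title>
    <para>Follow the engine failure checklist.</para>
  </topic>
  <topic id="baggage" roles="ground">
    <title>Baggage loading</title>
    <para>Balance the load across holds.</para>
  </topic>
</topics>"""


def assemble_manual(topics_xml, role):
    """Build a <manual> element holding only the topics tagged for role."""
    source = ET.fromstring(topics_xml)
    manual = ET.Element("manual", {"role": role})
    for topic in source.findall("topic"):
        if role in topic.get("roles", "").split():
            manual.append(topic)
    return manual


pilot_manual = assemble_manual(TOPICS, "pilot")
titles = [t.findtext("title") for t in pilot_manual.findall("topic")]
print(titles)
```

Because each topic lives in one place, updating the shared de-icing topic would change both the pilot and the ground manual the next time they are assembled.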
The Benefits of XML Servers

The organizations profiled here have implemented just a few of the innovative solutions made possible with XML servers. And while each company’s solution is unique, these organizations have one thing in common: they’re all reaping the benefits associated with XML servers.

Centralized information: XML servers store all content in one place. By administering and maintaining a single content repository, rather than multiple siloed systems, organizations reduce hardware and maintenance costs. At the same time, users need to search only in one location to find all relevant information, rather than looking in numerous content databases.

Dramatically improved content reuse: Using modular document capabilities, XML servers make it easy to re-use content in many documents to avoid re-creating information. For example, a single description of a cooling fan can be used in the manuals for every part that uses that fan. When the fan is modified, the description only needs to be changed in one place.

Fast application development and reduced costs: Whether an organization wishes to build new applications or embed XML content capabilities into existing products, an XML server like MarkLogic Server provides a single, comprehensive infrastructure from which to build and deploy applications. It includes an XML repository, full text and XML search capabilities, an XQuery engine, and a web server. So organizations can develop applications more quickly. Those applications run more efficiently and effectively because they are on a single platform, saving time and money.

Delivery of customized content to employees, customers, and partners: With an XML server, you can tailor information to the individual user. And when the information is delivered in XML, it can be displayed across a wide range of devices and formats. For example, an organization might wish to make different parts of its overall training manual available to different types of employees on different devices according to their roles: sales reviews product descriptions on a PDA, manufacturing sees product specifications on a laptop, and marketing receives competitive analysis via desktop.

Enhanced agility: Because XML servers offer the flexibility to deal with any content structure without having to know the schema upfront, organizations can more quickly and easily load content and run data or full text queries on the fly, without having to build queries into the application. This lets organizations quickly react to market changes, easily aggregate new content, and build new information products in response to changing demands and market conditions. Organizations can experiment with content, rapidly develop and deploy new content applications that solve a wide array of needs, and find new opportunities to exploit existing content to increase revenue and operational efficiency.

Conclusion

The adoption of XML has increased significantly since it was first introduced 10 years ago. This is partly due to the development of new and innovative uses for XML such as Office Open XML, Web 2.0, SOA/web services, and more. But its popularity can also be attributed to the arrival of XML servers, like MarkLogic Server.

The combination of XML and XML servers provides a new level of flexibility that helps a growing number of industry leaders around the world meet a wide range of business challenges and create new business opportunities. These innovators are tapping into the raw power of XML servers to benefit from centralized information, improved content reuse, faster application development, and the ability to furnish highly customized content to employees, partners, and customers. As a result, these organizations improve their competitive advantage and become more agile via their ability to provide value-added products and services to their customers in a timely manner.
MarkLogic Server: The Leading XML Server

Each of the five businesses in this white paper solved its information challenges with the help of one XML server in particular: MarkLogic Server. MarkLogic Server is the industry’s leading XML server. Organizations around the world use it to store, search, analyze, and deliver XML content.

MarkLogic Server uses XML as its native data format and is built specifically for content. It has the capabilities of a large enterprise-class DBMS: storage, query, update, failover, backup/restore, and administrative tools. It also provides a way to manage how XML datasets are identified, collected, stored, shared, and built into applications or distributed online or via printed documents or other media.

The platform lets organizations build XML-based applications to suit their specific information access needs. MarkLogic Server is also open for integration with enterprise systems involved in the management of content and publishing processes, including content creation, editing, and workflow.
Mark Logic Corporation
www.marklogic.com
Headquarters: 999 Skyway Road, Suite 200, San Carlos, CA 94070, +1 650 655 2300
New York: +1 646 378 2104
United Kingdom: +44 (0) 207 643 1712
Version 1 March 2009
© Copyright 2009 Mark Logic Corporation. Mark Logic is a registered trademark and MarkLogic Server is a trademark of Mark Logic Corporation, all rights reserved. All other product names mentioned herein are the property of their respective owners.