Oracle Warehouse Builder 10g Three business reasons to move to release 10g An Oracle White Paper February 2004
Three business reasons to move to release 10g
EXECUTIVE OVERVIEW
Like most software products, Oracle Warehouse Builder is continuously updated and extended, and the 10g release is no exception. The key for customers is to understand the benefits of a new release in order to determine their upgrade path. This paper guides customers in that decision by describing the new functionality in the 10g release of Warehouse Builder, with a strong focus on the business benefits that customers can achieve by using Warehouse Builder 10g instead of older versions.

Note that the 9.2 version of Warehouse Builder is the same as the 10g version, with one exception: Warehouse Builder 10g is also supported and certified on the 10g release of the database. Apart from this, you can read 9.2 wherever this paper says 10g.

The main business themes for the Warehouse Builder 10g release are:
• Increase data quality with advanced, embedded data quality capabilities
• Enhance productivity in the Extraction, Transformation and Load process with debugging and data viewing capabilities
• Enhanced metadata comparison to evolve metadata rather than redevelop it
• Enhanced version management to allow object level version control for developers and designers
The major improvements within the new release, which solve these business issues, make the Warehouse Builder 10g release a compelling one for all customers who want to increase their productivity and improve the quality of their data. This will lead to faster and better implementations of your warehouse system.
INTRODUCTION
As the cornerstone of Oracle’s Business Intelligence offering, Oracle9i Warehouse Builder solves the complex problem of data integration between various dispersed data sources and targets. Oracle9i Warehouse Builder is the tool to design and manage both data and metadata integration for the Oracle Business Intelligence solutions.

By now many organizations understand the necessity of a central data and metadata store. This central store allows companies to make better decisions faster. On a departmental level this can be classified as a Phase 1, or departmental ad-hoc, implementation, which indicates that the need is perceived and a solution is crafted.

Many of these Phase 1 projects struggle with growth problems. Some are so successful that the business demand for new information overpowers the delivery capabilities. Some projects are too restrictive and create more problems than they solve. Thus many first attempts struggle to evolve into solid, continuous and extensible information delivery mechanisms on a departmental level, or Phase 2 of data warehousing. Many of the projects trying to move into Phase 2 of data warehousing struggle with some or all of the issues below:
• Low implementation productivity because of:
  o Developer productivity that is too low to meet the aggressive demand for new extensions and information
  o Poor metadata change capabilities that reduce the delivery speed
• Low trust in the solution due to poor data quality
• Decoupled metadata between the ETL, schema design and end user tools
In order to move from Phase 1 ad-hoc implementations to Phase 2 tactical implementations customers must solve these problems. To move to the 3rd phase - data warehouse implementations that directly impact the bottom line of the organization or strategic warehouses - the customer must ensure his data is not only accurate but also valuable. In other words he must add value to the data to make use of this data to either decrease the cost or increase the profitability of an organization. The 10g release of Oracle Warehouse Builder adds specific capabilities to address these problems to an even greater extent and allow customers to move from Phase 1 to Phase 2 and 3.
INCREASING IMPLEMENTATION PRODUCTIVITY
To increase overall productivity the main focus should go to the tasks that consume the most time in a development cycle. Within a data warehouse project the most time consuming part is, beyond any doubt, the development of Extraction, Transformation and Load (ETL) routines. Being able to reduce the amount of effort and time in this phase of the project will give the biggest gain in overall productivity. A second area to improve is the ability of developers to handle changes in both data and metadata.

How does Warehouse Builder 10g address the reduction of time in the ETL process?
One of the harder parts in development of ETL is the creation of transformations that move data flows from the source to the target systems. Many times these transformations are complex and take a lot of time to develop. To reduce the development time, Warehouse Builder introduces the Mapping Debugger, allowing developers to debug their logical ETL processes.
Figure 1 Debugging a complex mapping
Developers can already create a set of transformations in the graphical user interface, but with the debugger they can also step through the process and see the effect of their constructs on the individual data rows. This functionality increases the productivity of ETL design in Warehouse Builder. Compared to hand-coding of ETL, the mapping debugger is a quantum leap forward. The experienced data warehouse professional will point out that there are many tools on the market that offer this capability. While this is true, none of these tools has the granularity of debugging that Warehouse Builder can offer.
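The idea of stepping a row through a mapping can be illustrated with a minimal Python sketch. This is not Warehouse Builder code or its API; the operator functions, the `MAPPING` list and the `trace_row` helper are all hypothetical names chosen for illustration. It only shows the concept of recording what each step of a mapping does to an individual data row.

```python
from typing import Callable, Iterable, Optional

Row = dict
# Hypothetical operator type: takes a row, returns a (possibly changed) row,
# or None when the row is filtered out.
Operator = Callable[[Row], Optional[Row]]


def only_active(row: Row) -> Optional[Row]:
    """Filter operator: keep rows whose status is 'ACTIVE'."""
    return row if row["status"] == "ACTIVE" else None


def add_full_name(row: Row) -> Optional[Row]:
    """Expression operator: derive a new column from two source columns."""
    return {**row, "full_name": f"{row['first']} {row['last']}"}


def trace_row(row: Row, operators: Iterable[tuple]) -> list:
    """Push one row through each operator and record the result after every step."""
    steps = []
    current = row
    for name, op in operators:
        current = op(current)
        steps.append((name, current))
        if current is None:  # row was filtered out; later operators never see it
            break
    return steps


# A two-step illustrative mapping: filter first, then derive a column.
MAPPING = [("filter_active", only_active), ("add_full_name", add_full_name)]

if __name__ == "__main__":
    for source_row in [
        {"first": "Ann", "last": "Lee", "status": "ACTIVE"},
        {"first": "Bob", "last": "Ray", "status": "CLOSED"},
    ]:
        for step_name, result in trace_row(source_row, MAPPING):
            print(step_name, "->", result)
```

Because each activity is a separate step, the trace shows exactly where a row changes or is dropped, which is the kind of visibility the paragraph above describes.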
Figure 2 Granular modeling in the mapping editor
Compared to other graphical ETL tools, Warehouse Builder offers modeling per activity, and not per SQL statement. For example, joins and filters on a set of source tables are not modeled in a query but as individual operators, allowing the developer to see the individual as well as the combined effect of operations on the data. Because of this unique design, Warehouse Builder can look into queries modeled in Warehouse Builder and offer robust debugging results.

How does Warehouse Builder 10g assist in managing changes?
Any warehouse developer will attest to the fact that a data warehouse is continuously changed and updated to reflect new or changing business requirements. To address this need, Warehouse Builder has from the start supported sophisticated change management capabilities like incremental metadata deployment and advanced lineage and impact analysis. With the 10g release of Warehouse Builder a new dimension to warehouse metadata management is added to the equation - the ability to inspect differences between multiple versions of the metadata objects and act upon these before any changes are made.
Figure 3 The Change Manager
By doing this Warehouse Builder takes the guesswork out of version management and allows developers to confidently update their metadata with new or existing versions.
Figure 4 Inspecting the difference between two snapshots
Adding this capability to the already extensive change capabilities puts Warehouse Builder in a class of its own regarding change management.
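To make the idea of inspecting differences between two metadata versions concrete, here is a minimal Python sketch. It is an illustration only and does not reflect how Warehouse Builder stores or compares snapshots; the snapshot layout (object name mapped to a dictionary of properties), the sample objects and the `diff_snapshots` function are assumptions made for this example.

```python
def diff_snapshots(old: dict, new: dict) -> dict:
    """Compare two metadata snapshots keyed by object name.

    Each snapshot maps an object name to a dict of its properties.
    Returns which objects were added, removed, or changed, and for changed
    objects the (old, new) value of every property that differs.
    """
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = {}
    for name in sorted(old.keys() & new.keys()):
        props = old[name].keys() | new[name].keys()
        delta = {
            p: (old[name].get(p), new[name].get(p))
            for p in sorted(props)
            if old[name].get(p) != new[name].get(p)
        }
        if delta:
            changed[name] = delta
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical snapshots of a small design, before and after a change.
SNAPSHOT_V1 = {
    "CUSTOMERS": {"CUST_ID": "NUMBER", "NAME": "VARCHAR2(50)"},
    "ORDERS": {"ORDER_ID": "NUMBER"},
}
SNAPSHOT_V2 = {
    "CUSTOMERS": {"CUST_ID": "NUMBER", "NAME": "VARCHAR2(100)", "EMAIL": "VARCHAR2(80)"},
    "PRODUCTS": {"PRODUCT_ID": "NUMBER"},
}

if __name__ == "__main__":
    report = diff_snapshots(SNAPSHOT_V1, SNAPSHOT_V2)
    for kind, detail in report.items():
        print(kind, detail)
```

Reviewing such a report before applying a version is what removes the guesswork: the developer sees which objects would be added, dropped or altered before any change is made.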
INCREASING TRUST BY IMPROVING DATA QUALITY
In previous releases of the Warehouse Builder, Oracle introduced integrated and advanced Name and Address cleansing capabilities. Name and Address cleansing achieves better quality of your most important data, your customer data. With the 10g release this is extended to provide more capabilities by incorporating mapping debugging capabilities in Name and Address cleansing maps.
Figure 5 Matching and Merging data
However, this addresses the quality of customer data only. While customer data is, as stated, very important, many data quality problems exist outside of customer data. Product lists, for example, are often just as critical and just as dirty.

How does Warehouse Builder help to create more trust in non-customer data?
To help customers deal with critical data that is not as structured as names and addresses, Warehouse Builder 10g has advanced matching and merging capabilities. Unlike name and address cleansing, match/merge, as it is commonly known, allows you to find duplicates in more or less unstructured data. Once these duplicates are identified you can merge the rows into single instances. By doing this, your data becomes more accurate, and your aggregations and forecasts become more reliable and more valuable to the business community. The match/merge implementation in Warehouse Builder allows you to specify your own matching and merging rules, letting you handle both simple and advanced cases within the same user interface.
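The match-then-merge idea described above can be sketched in a few lines of Python. This is a conceptual illustration, not the Warehouse Builder match/merge operator: the match rule (same product code, or names that are at least 85% similar), the merge rule (the first row carrying a code survives), the sample product rows and all function names are assumptions made for this example. Similarity is computed with the standard library's `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lower-case and collapse whitespace so cosmetic differences are ignored."""
    return " ".join(text.lower().split())


def is_match(a: dict, b: dict, threshold: float = 0.85) -> bool:
    """Assumed match rule: same product code, or very similar names."""
    if a.get("code") and a.get("code") == b.get("code"):
        return True
    ratio = SequenceMatcher(None, normalize(a["name"]), normalize(b["name"])).ratio()
    return ratio >= threshold


def merge(rows: list) -> dict:
    """Assumed merge rule: the first row that carries a code survives, else the first row."""
    survivor = next((r for r in rows if r.get("code")), rows[0])
    return dict(survivor)


def match_merge(rows: list) -> list:
    """Group each row with the first group containing a matching member, then merge groups."""
    groups = []
    for row in rows:
        for group in groups:
            if any(is_match(row, member) for member in group):
                group.append(row)
                break
        else:
            groups.append([row])
    return [merge(g) for g in groups]


# Hypothetical, slightly dirty product list.
PRODUCTS = [
    {"name": "Acme Widget 10mm", "code": "W-10"},
    {"name": "ACME widget 10 mm", "code": None},
    {"name": "Acme Gasket", "code": "G-01"},
    {"name": "Gasket (Acme)", "code": "G-01"},
]

if __name__ == "__main__":
    for product in match_merge(PRODUCTS):
        print(product)
```

With these assumed rules the four input rows collapse into two products. Changing `is_match` or `merge` changes the outcome, which mirrors the point that the rules are yours to specify.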
Can Warehouse Builder achieve worldwide coverage?
As you would expect from an enterprise class tool, Warehouse Builder is capable of covering names and addresses from all over the world. To achieve this, Warehouse Builder takes the opposite approach from most vendors. Instead of locking you into proprietary technology, Warehouse Builder allows you and your local data vendor to plug specific country information into the Warehouse Builder tool. This way you remain flexible and can choose the data vendor¹ that best solves your name and address requirements.
Figure 6 The API architecture for data quality
Doing this also increases your developers’ productivity, as they can develop within the familiar and user-friendly Warehouse Builder user interface while still getting specialized data cleansing support.
1 Please visit our website to see the current partners, or contact us on the OTN Technical forum for more information.
INCREASING OPERATIONAL VALUE
Once you have achieved trust in your data and your data warehouse, you can start looking at moving into Phase 3 of warehouse development. As you will recall, in Phase 3 the warehouse becomes strategic and indispensable to your daily business.

How does Warehouse Builder help to create operational value?
The newest addition to the capabilities in this area is Householding. Once you have data quality under control, you can start using your data to add value, for example through cost reductions in your business. Householding is an effective method to increase the value of customer information by adding meaning to the individual records. In the case of Householding, this meaning comes from grouping the individual records into households; in other words, determining who belongs together in a logical group. So why is this interesting for your bottom line? Simply put, it reduces the amount of money required to reach a number of people by ensuring that you send only one copy of your materials to a household instead of sending a copy to each of, say, three people in that household. This is the obvious advantage; however, the opposite might be true as well. Sometimes a shared address just means roommates, and you might actually want to send multiple copies to increase your rate of success. Warehouse Builder includes this advanced and important functionality in its 10g release, leveraging your investment in the technology while providing you with the means to increase the operational value of your warehouse.
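As a rough illustration of the grouping described above, the following Python sketch assigns customer records to households. It is not Warehouse Builder functionality; the grouping rule (same surname at the same normalized address), the record fields and the function names are assumptions chosen for this example. Under this rule, people sharing an address but not a surname, such as roommates, end up in separate households.

```python
from collections import defaultdict


def household_key(record: dict) -> tuple:
    """Assumed grouping rule: same surname at the same normalized address."""
    address = " ".join(record["address"].lower().replace(".", "").split())
    return (record["last_name"].lower(), address)


def build_households(records: list) -> dict:
    """Group first names by household key, preserving input order within a household."""
    households = defaultdict(list)
    for record in records:
        households[household_key(record)].append(record["first_name"])
    return dict(households)


# Hypothetical customer records with inconsistent address formatting.
CUSTOMERS = [
    {"first_name": "Maria", "last_name": "Jansen", "address": "12 Elm St."},
    {"first_name": "Peter", "last_name": "Jansen", "address": "12 elm st"},
    {"first_name": "Lucas", "last_name": "Jansen", "address": "12  Elm St"},
    {"first_name": "Sam", "last_name": "Okoye", "address": "12 Elm St"},
]

if __name__ == "__main__":
    for key, members in build_households(CUSTOMERS).items():
        print(key, members)
```

In this sample, four customer records become two mailings: one for the three Jansen records and one for the roommate at the same address.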
CONCLUSION
While this white paper only covered a small cross-section of the functionality of Warehouse Builder, it is clear that the product offers tremendous business value to customers facing data and metadata integration issues. By using Warehouse Builder in your data warehouse development, you can address a number of common problems in your organization. The 10g release specifically addresses the issues of:
• Low implementation productivity
• Low trust in the solution due to poor data quality
• Decoupled metadata between the ETL, schema design and end user tools
Addressing these business problems will make your projects more successful and productive and add value to the organization.
Three business reasons to move to release 10g
February 2004
Author: Jean-Pierre Dijcks

Oracle Corporation
World Headquarters
500 Oracle Parkway
Redwood Shores, CA 94065
U.S.A.

Worldwide Inquiries:
Phone: +1.650.506.7000
Fax: +1.650.506.7200
www.oracle.com

Oracle is a registered trademark of Oracle Corporation. Various product and service names referenced herein may be trademarks of Oracle Corporation. All other product and service names mentioned may be trademarks of their respective owners.

Copyright © 2004 Oracle Corporation
All rights reserved.