Informatica User Group
PowerCenter: Differences Between v7 & v8
Mark Murray - Senior Sales Consultant
October 19th, 2006

Informatica confidential. For discussion purposes only.

Goals for New Architecture

• Enterprise Deployment
  • Improved Service Orientation
  • High Availability
  • Grid Deployments

• Centralized Services
  • Administration
  • Logging & Auditing

• Single Point of Administration
  • Traditional Configuration
  • HA Configuration
  • Grid Configuration

What do customers want?

• High Availability and Failover was a top-10 request in the 2004 User Group surveys
• Database Pushdown Optimization ranked 10th out of 66 features in the 2005 surveys
• Improved logging capabilities ranked 2nd out of more than 60 feature requests in the 2004 surveys
• Looping support within the Designer

Informatica Data Integration Platform: Continually Raising the Bar

[Figure: product roadmap]
• PowerCenter 7: Advanced Edition (One Product, Single Install)
• PowerCenter 8.1.1 (Now): Mission-Critical Enterprise Deployment
• Hercules (2007): On-Demand Platform for the Enterprise

Informatica Delivers Continuous Innovation

“With PowerCenter continually leapfrogging on performance and scalability, we are never concerned about our ability to handle increasingly large data volumes in our data integration environment.”
--- Kevin Smith, CRM Strategies Manager, AAA Carolina

[Figure: 1 TB Transform and Load Test (Hr:Min), by release. Plotted values: 6:35, 3:36, 0:37, <18 min. Each release carries forward the capabilities of the previous one and adds:]
• V4.x: Pipelining, ERP Connectivity, UNICODE
• V5.x: Partitioning, Debugger, XML, Metadata connectivity
• V6.x: Realtime, Workflow, Data quality, 3-tier architecture, Enterprise metadata
• V7.x: SOA Web services, Grid, 64-bit, Team development, Enterprise security, Mainframe Data Server and CDC, Impact analysis
• V8.x: Session On Grid, Adaptive Load Balancing, High Availability, Dynamic Partitioning, Pushdown Optimization, Unstructured Data, Data Federation

What else is in the Informatica product family?

[Figure: PowerCenter editions and options; legend tags: New, Updated, Broader]
• PowerCenter 8 Advanced Edition: Metadata Manager, Data Analyzer, Team Based Development
• PowerCenter 8 Standard Edition
• PowerCenter Options: Data Cleanse and Match, Data Federation (EII), Enterprise Grid, High Availability, Pushdown Optimization, Unstructured Data, Mapping Generation, Data Profiling, Partitioning, Real-Time, PowerCenter Connects, Metadata Exchange

PowerCenter 8 Base Improvements: Delivering Value for Installed Base Customers

[Figure: PowerCenter Advanced Edition (Metadata Manager, Data Analyzer, Team Based Development) and PowerCenter Standard Edition]

Reduce Time To Results
• Java transformation support
• User defined functions
• Extended expression library
• Mapping generation and templates
• Improved Data Profiling

Cost Effectively Scale
• Centralized administration web-based console
• Extended recovery options
• Connection resilience (RDBMS, Network, PC)
• Flat File Performance Optimization
• Enhanced, centralized logging
• Enhanced Team-Based Development
• Unicode repository option

PowerCenter 8 Release Themes

• Service Oriented Architecture
• 24x7 Availability of PowerCenter services
• Order of magnitude performance improvements
• Unlimited scalability
• Improved developer productivity

PowerCenter 8.x Update – Setting the Standard for Data Integration across the Enterprise

• Infrastructure and Server Enhancements
  • Services based Architecture
  • High Availability
  • Grid Enhancements
  • Easy Grid Configuration
  • Centralized administration web-based console
  • Centralized configuration
  • Connection resilience (RDBMS, Network, PC)

• Performance Enhancements
  • Pushdown Optimization
  • Flat Files
  • Partitioning
  • Auto Cache

• Developer Enhancements
  • Functions and Expressions
  • User Defined Functions
  • Java Transformation
  • Dynamic Target Creation
  • Visio Template – mapping generation and templates
  • Upgrade Wizard

• Expand the definition of universal data access
  • Data Federation Option
  • Unstructured Data Option
  • Data Quality Option
  • Extended PowerExchange

PowerCenter 8 Architecture

PowerCenter 6 and 7 Architecture

[Figure: architecture diagram]
• Client Tools: Repository Manager, Designer, Workflow Manager, Workflow Monitor, Repository Server Admin Console
• Web Services Hub
• PowerCenter Connects
• Repository Server
• Repository Database
• Data Servers (pmserver)
• PowerExchange
• Machine

PowerCenter 8 Architecture

[Figure: architecture diagram]
• Client Tools: Repository Manager, Designer, Workflow Manager, Workflow Monitor, Administration Console
• Application Services: Integration Service, Repository Service, Web Services Hub, SAP BW Service
• PowerCenter Connects
• PowerExchange
• Repository Database
• Core Services: Repository Service, Log Service, Domain/Gateway Services (Administration & Authorization, Configuration, Domain, Licensing)
• Node & Domain

PowerCenter 8 Terminology

• Services
  • A service is a resource that provides specialized functions.
  • PowerCenter has two types of services: Application Services and Core Services.
  • PowerCenter Application Services represent server-based functions such as the Repository, Integration, SAP BW, and Web Services Hub services.
  • PowerCenter Core Services represent functions that manage and maintain the environment in which PowerCenter operates.

Introducing PowerCenter 8 Terminology

• Node
  • A node is a logical representation of a physical machine. It has physical attributes such as a hostname and port number.
  • Each node runs a Service Manager, which is responsible for the application and core services.
  • The Service Manager is started when you start “Informatica Services”.

• Domain
  • A domain is the fundamental unit of PowerCenter Services administration.
  • A domain is a logical collection of nodes and services that you can group in a “folder-like” deployment.

PowerCenter 8 Terminology

• Service Manager
  • On the gateway node, the Service Manager is responsible for:
    • Controlling the domain
    • Managing services running on the domain
    • Providing service lookup
  • On all nodes, the Service Manager controls the core services and application services.

PowerCenter Services Framework

[Figure: PowerCenter Domain]
• Client Tools: Designer, Repository Manager, Workflow Manager, Workflow Monitor
• Administration Console
• Master Gateway (Domain Controller): Logs, Domain Metadata
• Repository Service
• Integration Service
• Repository Database
• Check point

High Availability (HA)

High Availability in PC8

• Failover
  • Restart for data integration, repository and other services
  • Primary and backup servers

• Recovery
  • Workflows and sessions will be recovered on running servers on the grid during server failure
  • Checkpoint recovery
  • Repository recovery

• Resilience
  • PowerCenter jobs will sustain transient failures
    • Network errors
    • DB connection failures

Resilience

• DB Connection Resilience
  • When connecting to or disconnecting from a DB
  • Oracle, DB2, Sybase, SQL Server and Teradata
  • Retry interval based on timeout setting

• FTP Resilience
  • For connections to an FTP server
  • Read/write will recover if the connection is lost, based on the timeout parameter

• Internal Resilience
  • PowerCenter components (Integration Service, clients, etc.) are resilient to Repository Service failure
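
The retry behavior described above can be pictured with a small sketch. This is not PowerCenter code: `connect_with_resilience`, `retry_interval` and `resilience_timeout` are hypothetical names, and the slides do not spell out the exact retry policy. The sketch assumes retries at a fixed interval until a timeout elapses.

```python
import time


class TransientConnectionError(Exception):
    """Stands in for a dropped network or database connection (illustrative)."""


def connect_with_resilience(connect, resilience_timeout, retry_interval,
                            clock=time.monotonic, sleep=time.sleep):
    """Call connect() until it succeeds or resilience_timeout seconds pass.

    connect: zero-argument callable that returns a connection or raises
             TransientConnectionError.
    Returns a (connection, attempts) tuple.
    """
    deadline = clock() + resilience_timeout
    attempts = 0
    while True:
        attempts += 1
        try:
            return connect(), attempts
        except TransientConnectionError:
            # Give up once the next retry would start past the deadline.
            if clock() + retry_interval > deadline:
                raise
            sleep(retry_interval)


def make_flaky_connect(failures_before_success):
    """Build a fake connect() that fails a fixed number of times, then succeeds."""
    state = {"remaining": failures_before_success}

    def connect():
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise TransientConnectionError("connection lost")
        return "connection"

    return connect
```

The `clock` and `sleep` parameters exist only so the loop can be exercised without real waiting; a job that outlasts the timeout still fails, which matches the idea that resilience covers transient failures only.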


Simple High Availability/Failover Scenario

• Simple environment
  • 1 domain, which consists of:
    • 2 nodes for Integration Services
      • node01 - Primary
      • node02 - Backup
    • 1 server for the repository

[Figure: Node01 (Int_Svc01) and Node02 (Int_Svc02) connected to the Repository DB]

Simple High Availability/Failover Scenario

• The node01 Integration Service goes down
• The node01 Integration Service “fails over” to node02

[Figure: component failure (HW/SW) on node01 (Int_Svc01); node02 (Int_Svc02) and the Repository DB remain available. Labels: Automatic Failover, Restart, Recovery]

Grid Enhancements

Domain Overview Dashboard: Simplified, Web-based Administration

[Figure: Administration Console screenshot. Callouts: Services Configuration (remember the pmserver config file?); Domain Example; Primary & Backup Repository Service; Nodes; Services]

Mission-critical Enterprise Deployment: Cost-effective Scalability with PowerCenter on a Grid

[Figure: PowerCenter Domain on Server Grid. Labels: PowerCenter Domain Controller; distributed processing of sessions; failed hardware server; automatically recover, restart on live server]

Grid Enhancements

• Grid Object
  • Configured from admin console
  • Services can be assigned to grid
  • Workflows are assigned to be run by services

• Workflow distributed on Grid (WOnG)
  • Same as version 7
  • Distribute sessions of a workflow across multiple nodes

• Session distributed on Grid (SOnG)
  • New in version 8
  • Can partition sessions to run on multiple nodes

• Dynamic Partitioning
  • Number of partitions dynamically determined at runtime
  • Less configuration for users

• Resource Maps
  • Configure available resources on nodes in grid through admin console
  • Load balancer dispatches jobs based on resource availability on nodes

Grid – PC 7 vs. PC 8

PowerCenter 7
• ServerGrid is a collection of pmservers
• Work is directed to individual pmservers
• Work is distributed across the Grid in round-robin manner
• Session/task is the lowest unit of work

Grid Capabilities in 7.x vs. 8.x

7.x
• ServerGrid object
  • Collection of pmservers
• Workflows explicitly assigned to pmservers
• Pmservers belonging to a ServerGrid will dispatch to other pmservers
• Pmservers could fail, causing workflows to fail
• Can’t split sessions across multiple nodes
• Load balancer is round robin only

8.x
• Grid object
  • Collection of nodes
• Workflows assigned to Integration Service
• Integration Service assigned to Grid (can run on any node in grid)
• If one node fails, another Integration Service process on another node in the grid takes over running the workflow
• A session can be partitioned across nodes
• Load balancer takes into account resource availability on nodes and resource requirements of sessions for dispatch
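
The difference between the two load-balancing styles above can be sketched in a few lines. This is an illustration under stated assumptions, not the PowerCenter load balancer: the function names are hypothetical, "resources" are reduced to a single made-up memory number, and the tie-breaking policy (most free capacity wins) is just one possible choice.

```python
from itertools import cycle


def round_robin_dispatch(sessions, nodes):
    """7.x-style idea: hand sessions to nodes in turn, ignoring load."""
    node_cycle = cycle(nodes)
    return {session: next(node_cycle) for session in sessions}


def resource_aware_dispatch(session_needs, node_capacity):
    """8.x-style idea: place each session on a node with enough free capacity.

    session_needs: dict of session name -> memory units required (hypothetical unit).
    node_capacity: dict of node name -> memory units available.
    Returns a dict of session name -> node name; sessions that fit nowhere map to None.
    """
    free = dict(node_capacity)  # copy so the caller's dict is not modified
    placement = {}
    for session, need in session_needs.items():
        candidates = [node for node, avail in free.items() if avail >= need]
        if not candidates:
            placement[session] = None
            continue
        # Pick the node with the most free capacity (one of many possible policies).
        chosen = max(candidates, key=lambda node: free[node])
        free[chosen] -= need
        placement[session] = chosen
    return placement
```

With two nodes of unequal size, round robin can send a large session to the small node, while the resource-aware version only considers nodes that can actually hold it.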


Performance Improvements


Pushdown Optimization

Introduction

• What is pushdown optimization?
  • Push transformation processing to data sources and targets without moving data out

• Benefits
  • Reduce movement of data when source and target are the same database instance
  • Utilize database-specific processing that may be more optimal
  • Maintain metadata and lineage in PowerCenter

Pushdown Optimization

• Full Pushdown:
  • Source and target are in the same RDBMS
  • All transformations can be processed in the database

• Partial Source:
  • One or more transformations can be processed in the source database

• Partial Target:
  • One or more transformations can be processed in the target database

• Generated SQL:
  • INSERT INTO t (…) VALUES (?+1, SOUNDEX(?))

[Figure: Extract (Source DB), Transform, Load (Target DB)]
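
To make the full-pushdown case concrete, the sketch below compares moving rows out to a client for transformation with letting the database do the work in one `INSERT ... SELECT`. It uses Python's built-in `sqlite3` with an in-memory database purely as a stand-in. The table names (`src`, `tgt_client`, `tgt_pushdown`) and the expressions (`qty + 1`, `UPPER(name)`) are invented for illustration, `UPPER` is substituted for `SOUNDEX` because not every SQLite build includes it, and none of this is the SQL that PowerCenter itself generates.

```python
import sqlite3


def load_without_pushdown(conn):
    """Extract rows to the client, transform in Python, load them back."""
    rows = conn.execute("SELECT qty, name FROM src ORDER BY qty").fetchall()
    transformed = [(qty + 1, name.upper()) for qty, name in rows]
    conn.executemany("INSERT INTO tgt_client (qty, name) VALUES (?, ?)", transformed)


def load_with_full_pushdown(conn):
    """Let the database apply the same transformation in one statement."""
    conn.execute(
        "INSERT INTO tgt_pushdown (qty, name) "
        "SELECT qty + 1, UPPER(name) FROM src ORDER BY qty"
    )


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE src (qty INTEGER, name TEXT);
    CREATE TABLE tgt_client (qty INTEGER, name TEXT);
    CREATE TABLE tgt_pushdown (qty INTEGER, name TEXT);
    INSERT INTO src VALUES (1, 'smith'), (2, 'jones');
    """
)
load_without_pushdown(conn)
load_with_full_pushdown(conn)

client_rows = conn.execute("SELECT qty, name FROM tgt_client ORDER BY qty").fetchall()
pushdown_rows = conn.execute("SELECT qty, name FROM tgt_pushdown ORDER BY qty").fetchall()
```

Both targets end up with the same rows; the difference is that in the pushdown version no row ever leaves the database.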


Example – Full Pushdown SQL & Business Logic Maintained in Repository


Flat File Performance & Parameter and Variable Enhancements

Flat file enhancements

• FF Reader and Writer have been rewritten to optimize for performance
  • Delimited files with lots of decimal data will see the most significant performance improvements
  • Out-of-the-box performance improvements should be between 30% and 300%

• Append to flat file targets
  • Session output can be appended to an existing flat file

• Flat file source/target command support
  • Sources: use a command to generate source data or a file list that references multiple source files.
  • Targets: use a command to process the target data or process data for all partitioned targets in a session.

Parameters and Variables Enhancements

• Parameter Enhancements
  • Table owner name for relational sources/targets
  • E-mail address
  • FTP remote file name

• Global section specification in parameter files for use across different workflows / sessions

Partitioning Enhancements

Partitioning Enhancements

• Flat File Partitioning
  • FF targets can now be partitioned
  • All partitions can write to a single file; alternatively, a merge file can be created, or a file list that contains the names of the individual files that were written

• Database Partitioning
  • Partitioned Oracle and DB2 sources can be read in parallel
  • No changes to targets. DB2 can be written to in parallel.

• Dynamic Partitioning
  • Based on the number of partitions in the database
  • Based on the number of nodes in a Grid
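
As a rough picture of dynamic partitioning, the sketch below chooses the partition count at run time from one of the two bases listed above. The function names, the `basis` values, the single-partition fallback and the round-robin row assignment are all assumptions made for illustration; the slides do not describe how PowerCenter implements this.

```python
def dynamic_partition_count(basis, db_partitions=None, grid_nodes=None):
    """Pick a partition count at run time instead of fixing it at design time.

    basis: "database" to follow the source table's partition count,
           "grid" to follow the number of nodes in the grid.
    """
    if basis == "database":
        count = db_partitions
    elif basis == "grid":
        count = grid_nodes
    else:
        raise ValueError(f"unknown basis: {basis!r}")
    if count is None or count < 1:
        return 1  # assumed fallback: run as a single partition
    return count


def assign_rows(rows, partition_count):
    """Spread rows over partitions round-robin (illustrative only)."""
    partitions = [[] for _ in range(partition_count)]
    for index, row in enumerate(rows):
        partitions[index % partition_count].append(row)
    return partitions
```

The same mapping could then run with four partitions against a four-way partitioned table and three on a three-node grid, without anyone editing the session.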


Auto Cache

© Informatica Corporation, 2006. All rights reserved.

AutoCache Overview

• Cache in PowerCenter v7
  • Default cache settings not adequate for all situations.
  • Default settings can underestimate new chip technologies.
  • Sometimes necessary to hand tune individual transformations.
  • Development did not always scale when deployed to different production machines.

• Auto Cache in PowerCenter v8.x
  • Automatically distribute session memory to transformations.
  • Automatically scale memory usage based on resources available.
  • Automatically scale memory usage based on mapping complexity.

Memory Attributes

• PowerCenter has two types of memory attributes:
  • Transformation Memory Attributes
  • Session Memory Attributes

• Transformation Memory Attributes are for individual transformations:
  • Lookup, Aggregator, Rank, Joiner
    • Index and Data Cache Size
  • Sorter Cache Size
  • XML Target Cache Size

• Session Memory Attributes are for the session:
  • Default Buffer Block Size
  • DTM Buffer Size

New Memory Attribute Specification

• Previously, only integer byte values were allowed for memory attributes, e.g. 1000000 or 2000000.
• Now shortcuts are also allowed: “KB”, “MB”, and “GB”, e.g. 100MB
• The value “Auto” is also allowed
  • This indicates that the user wants PowerCenter to automatically find a good value for that memory attribute
  • “Auto” is supported for both session memory attributes (e.g. DTM buffers/buffer block size) and transformation memory attributes (e.g. lookup caches)
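
The three accepted forms (plain byte count, a number with a KB/MB/GB suffix, and “Auto”) can be illustrated with a small parser. This is a sketch, not PowerCenter's own parsing: `parse_memory_attribute` is a hypothetical name, and treating 1 KB as 1024 bytes is an assumption, since the slides do not say which convention the product uses.

```python
import re

# Assumption: binary multiples (1 KB = 1024 bytes); the slides do not say
# which convention applies.
UNIT_BYTES = {"KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}
AUTO = "Auto"


def parse_memory_attribute(value):
    """Return a byte count, or the AUTO marker for the value "Auto"."""
    text = str(value).strip()
    if text.lower() == "auto":
        return AUTO
    match = re.fullmatch(r"(\d+)\s*(KB|MB|GB)?", text, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"not a valid memory attribute: {value!r}")
    number, unit = match.groups()
    multiplier = UNIT_BYTES[unit.upper()] if unit else 1
    return int(number) * multiplier
```

Under these assumptions, `parse_memory_attribute("1000000")` gives the old-style byte count unchanged, while `"Auto"` is returned as a marker to be resolved later rather than as a number.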


AutoCache

• Allows the user to leave the calculations to PowerCenter
• User specifies the total amount of memory AutoCache is allowed to use
• Automatically computes a value for ALL memory attributes that have the value “Auto”
• Will NOT affect any memory attributes where the value is not “Auto”
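
The rule above (fill in every “Auto” attribute from a user-supplied budget, leave explicit values alone) can be sketched as follows. The function and attribute names are hypothetical, and the equal split is a deliberate simplification: the slides say the real feature also scales with available resources and mapping complexity, which this sketch does not attempt.

```python
def resolve_auto_attributes(attributes, auto_budget_bytes):
    """Fill in every "Auto" attribute; leave explicit values untouched.

    attributes: dict of attribute name -> byte count (int) or the string "Auto".
    auto_budget_bytes: total memory the "Auto" attributes may share.
    Returns a new dict; the input dict is not modified.
    """
    auto_names = [name for name, value in attributes.items() if value == "Auto"]
    resolved = dict(attributes)
    if not auto_names:
        return resolved
    # Assumption: an equal split. The real product also weighs mapping complexity.
    share = auto_budget_bytes // len(auto_names)
    for name in auto_names:
        resolved[name] = share
    return resolved
```

A hand-tuned sorter cache of 8,000,000 bytes stays at exactly that value, while two “Auto” lookup caches sharing a 100,000,000-byte budget would each receive 50,000,000 in this simplified model.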


Cache Calculator

• Click the drop-down
• Calculate based on the number of rows and the ports going into the object
• Value is propagated into the Cache value

Developer Improvements


Functions and Expressions

Function Enhancements

• Over 20 new functions added in the 8.x release
  • Financial functions, regular expression parsing/matching, IN(), compression, encryption, CRC, MD5 and more

• Custom Functions
  • Extend the functionality of the Expression Transformation via a C API
  • All 20+ functions above were added via this API

Function Enhancements

• User Defined Functions (UDF)
  • Ability for Designer users to create reusable functions entirely within the Expression Language
  • UDFs are folder-level objects
  • UDFs can use any valid functions (except aggregation functions) as well as other UDFs (in the same folder)

Java & SQL Transformations

Java Transformation Use Cases

• Looping over data
• Walking data hierarchies
• Calling third-party APIs (Java based)
  • Calling RMI/EJB etc.
  • Other Java packages
• Calling an expression/UDF/unconnected widget (like a lookup) from a Custom Transformation
• Simple “Custom Transformation”

Improved Developer Productivity: Java Inline Coding Sample

SQL Transformation Use Cases

• New SQL Transformation
  • Allows PowerCenter developers to execute SQL statements midstream in a mapping.
  • You can insert, delete, update, and retrieve rows from a database; the transformation also returns database errors.
  • The SQL that is executed can be static, or it can be dynamic, where the SQL statement is itself created on a row-by-row basis.
  • The SQL transformation can also be used to execute SQL scripts from within a mapping, e.g. to leverage SQL scripts that already exist.
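
The static versus dynamic distinction above can be illustrated outside PowerCenter. The sketch below uses Python's built-in `sqlite3` as a stand-in database; the `customers` table, the row layout and both function names are invented for illustration, and the way errors are returned alongside each row is an assumption about the general idea, not a description of the SQL transformation's actual ports.

```python
import sqlite3


def run_static_sql(conn, rows):
    """Static SQL: one fixed statement; each row only supplies parameter values."""
    results = []
    for row in rows:
        try:
            conn.execute(
                "INSERT INTO customers (id, name) VALUES (?, ?)",
                (row["id"], row["name"]),
            )
            results.append((row["id"], None))
        except sqlite3.Error as error:
            # Pass the database error downstream instead of stopping the run.
            results.append((row["id"], str(error)))
    return results


def run_dynamic_sql(conn, rows):
    """Dynamic SQL: the statement text itself arrives with each row."""
    results = []
    for row in rows:
        try:
            fetched = conn.execute(row["sql"]).fetchall()
            results.append((fetched, None))
        except sqlite3.Error as error:
            results.append(([], str(error)))
    return results


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")

static_results = run_static_sql(
    conn,
    [{"id": 1, "name": "Ann"}, {"id": 1, "name": "Duplicate"}, {"id": 2, "name": "Bob"}],
)
dynamic_results = run_dynamic_sql(
    conn,
    [
        {"sql": "SELECT name FROM customers ORDER BY id"},
        {"sql": "SELECT name FROM no_such_table"},
    ],
)
```

In this sketch the duplicate key and the missing table each produce an error string for that row while the remaining rows are still processed. Statement text that arrives in the data should come from a trusted source, since it is executed as-is.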


XML

XML Enhancements

• Filter data with query predicate
• Create a default namespace
• Import part of an XML schema
• Use anySimpleType

Metadata Enhancements

Metadata Exchange Enhancements

• New Data Model Support
  • Sybase Power Designer – bi-directional
  • Oracle Designer – bi-directional
  • ER Studio Design Tool – uni-directional (same as before)
  • CA Erwin – bi-directional

• Business Intelligence Support
  • Business Objects (bi-directional) – added 6.5, XI and XI R2 XConnects
  • Cognos ReportNet Framework Manager (bi-directional) – added 2.0
  • MicroStrategy (bi-directional) – added 8.0

Dynamic Target Creation

Dynamic Target creation

• Ability to dynamically create a target based on a transformation in the workspace or navigator
  • Right-click on a transformation in the workspace and select Create and Add Target
  • Drag a transformation and drop it in the Target folder
• Has the same port definitions as the transformation from which it was created
• Target type is the same as the repository you are using
• Can edit the target definition to change type or ports
• Creation dialog will be added in an upcoming release

Improved Developer Productivity: Target Generation

• Simply right-click on an object…
• …and the target is created! All you need to do is Auto link and you are ready to go

Mapping Generation Option Visio Client for PowerCenter

Mapping Generation Option

• Bi-directional “engine” for automatically generating mappings from Visio templates or reverse engineering PowerCenter mappings into Visio templates
• Leverages the Informatica Data Stencil and Velocity templates for Visio

Visio Client for PowerCenter

[Figure: Visio Client screenshot. Callouts: Mapping Template; Template Inputs]

Upgrade Wizard

PowerCenter Upgrade to 8.1

• A new Upgrade wizard in Admin Console
  • Integrated UI that takes the user through the various steps in the upgrade
  • Provides a detailed upgrade summary report at the end
  • Allows the user to switch in and out of the Upgrade UI to perform any other administrative activities
  • Can handle multiple repositories (global/local) and multiple PowerCenter Servers in one shot
  • Live feedback during repository upgrade as the user goes through the upgrade process

• A new post-upgrade reference guide

Summary

Summary - PC 7 vs. PC 8

PC 7.x
• 3 Tier Architecture
• Basic Grid Deployment
• Introduction to Profiling
• Added Transformations
  • Union
  • XML
• Web Services
• Team Based Development

PC 8.x
• Services Oriented Architecture
• Enhanced Grid Deployment
  • High Availability
  • Session on Grid
  • Resilience
• Enhanced Profiling
• Added Transformations
  • Java
  • SQL
• Enhanced Productivity
  • Mapping Generation
  • User Defined Functions

Thank You

Questions at the break
