http://architecture-soa-bpm-eai.blogspot.com/ Tushar Jain
Standards & Guidelines for TIBCO Business Works
Table of Contents

1 Introduction
  1.1 Objective
  1.2 Tibco BW Concepts
  1.3 Methodology
2 TIBCO Business Works 5.x Performance Architecture
  2.1 Main TIBCO Business Works Performance Considerations
    2.1.1 Engine Processing
    2.1.2 Determining the Available Memory
    2.1.3 Flow Control
    2.1.4 Paging Jobs
    2.1.5 Paging Waiting Jobs
    2.1.6 Enabling Paging
3 Best Practices
  3.1 TIBCO Business Works Engine Tuning Guidelines
    3.1.1 Calculating MaxJobs or Flow Limit
    3.1.2 Memory Management Issues
    3.1.3 Tuning Engine
    3.1.4 Fault Tolerance and Load Balancing
  3.2 JVM Tuning Guidelines
    3.2.1 Specifying JVM Heap Size
    3.2.2 Setting JVM and Processor Affinity
    3.2.3 JVM Garbage Collection
  3.3 TIBCO Business Works Transport and Palette Guidelines
    3.3.1 HTTP/S
    3.3.2 SOAP
    3.3.3 JMS
    3.3.4 FTP
    3.3.5 JDBC
    3.3.6 Transactional Activities
    3.3.7 General Activities
  3.4 Process and Activity Design Guidelines
    3.4.1 Data and Caching
    3.4.2 Checkpoints
    3.4.3 Grouping Activities
    3.4.4 Testing BW Process Definitions
    3.4.5 Tracing in BW
  3.5 Message, Payload and Schema Guidelines
    3.5.1 Data Representation Guidelines for Tibco Rv and JMS
    3.5.2 TIBCO Business Works Mapper Performance
  3.6 Processing Large Sets of Data
    3.6.1 Iterating Through Large Sets of Data Using a Mapper Activity
    3.6.2 Sorting and Grouping Data
    3.6.3 Large Document/Record Use Cases
    3.6.4 SOA Best Practices
4 Configurations
  4.1 Configuring Persistent Connections
  4.2 Configuring HTTP/JMS Servers
  4.3 Enabling SSL Configurations via BW
5 Global Variables
  5.1 Grouping & Sub-grouping Rules
  5.2 Naming Conventions
6 Logging
  6.1 Logging Rules
  6.2 Logging Parameters (details, frequency, etc.)
1 Introduction

1.1 Objective
The purpose of this document is to provide standards, guidelines and a methodology for implementing TIBCO BW projects. It helps in defining project-specific BW guidelines and is intended for the people who implement a BW project, mostly architects, functional consultants and developers. It is also useful for customer architecture and production readiness reviews. Depending on the project, some portions may be more relevant than others.

1.2 Tibco BW Concepts
TIBCO BusinessWorks (TIB-BW) is a scalable, extensible and easy-to-use event-driven integration platform for developing Enterprise Application Integration (EAI) projects. TIBCO BusinessWorks includes a graphical user interface (GUI) for designing business processes and an engine that executes those processes. TIBCO BW is extensible in that it provides a platform to plug in other available TIBCO products (BPM solutions, complex event processing, messaging, monitoring and management, BAM solutions, and so on) and use their features to integrate systems that are decentralized or that exist on different platforms (.NET, J2EE, legacy systems or custom applications), to exchange data across protocols, or to connect applications such as SAP, Oracle and JDE in B2B and A2A scenarios. TIBCO BusinessWorks also works with TIBCO Administrator, a web-based GUI for monitoring and managing run-time components. The current version of TIB-BW allows a process to be configured as an exposable service adhering to SOA standards, and its design also supports BPEL standards.

1.3 Methodology
A TIB-BW integration project is developed in phases. The methodology focuses on the nature of the project, whether it is in a development or a migration phase, on following a short development cycle, and on scalability, extensibility and ease of use. It has been observed that sequential execution of these phases results in a fast deployment.

Analysis - Define & analyze problem
Detailed problem analysis involves clearly identifying and analyzing the existing problem and defining the problem statement. The analysis leads to a solution definition that covers the transport and communication layer, the protocols involved in data transformation or data exchange, the business processes involved, the common universal interfaces to be built, the partners exchanging data, and the business rules that must be obeyed. The required interfaces and the associated end-point data elements are recorded.

Domain Setup - Install software & configure domain
Decide the TIBCO infrastructure requirements, i.e. the software that best suits integration using BW as the integration tool. Evaluate the hardware and software needed at the different phases of integration. Defining the administration domain setup and planning the management of the domain (defining an access control list, ACL) are the essential activities.
Fig: Typical Domain
Decide on a software-to-hardware map, determining which software component(s) are deployed on which machine(s), and ensure that features such as load balancing, failover and backup strategies are well defined.
Services Configuration - Configure adapters
Standards & Guidelines for TIBCO Business Works
Page 4 of 24
http://architecture-soa-bpm-eai.blogspot.com/ Tushar Jain
[email protected],
[email protected]
Different kinds of services exist in a TIBCO BW environment, and configuring these services plays a key role. A business process that is available as a service implementing adapter behavior, publish-subscribe behavior, request-reply or request-response behavior, or web services (SOA) can be accessed within the TIB-BW environment for integration.
Process Design - Implement & test business processes
The artifacts, data structures, business process execution paths, end points of the business process, and the nature of the work involved (automated, batch, manual, etc.) are recorded and finalized. The BW project defines the activities, the associated sub-activities and the transitions involved. The processes and the inter-process communication are well defined and can be validated for design-time errors using the BW Designer. TIB-BW can design and configure the required business process and perform a dry run (a design-time test) using its Test Mode environment.
Deployment - Deploy to runtime engine
The project archives contain the processes; an archive can be an Enterprise Archive or an Adapter Archive and is created using the TIB-BW Designer interface. The configuration needs to be re-adjusted for the stage of deployment, i.e. testing, acceptance, production, etc. The post-deployment run-time environment always picks up the newly defined configuration. The deployed process is visible and manageable through the TIBCO Administration domain.
Production - Manage & monitor deployments
User management is achieved by defining an ACL (Access Control List), whereby the TIBCO Administration domain obeys a security (authorization and authentication) policy. A user with a given role controls and manages the deployed processes and their properties. The TIBCO Administration domain can be monitored for product and run-time availability and for CPU and disk usage. The deployed processes can also be monitored and traced using run-time logging.
2 TIBCO Business Works 5.x Performance Architecture

2.1 Main TIBCO Business Works Performance Considerations
• Connections: opening and closing of connections, retrying of failed/suspended processes, job recovery.
• Number of processes: configuring a service to run in single-threaded or multi-threaded mode directly impacts memory usage.
• Hardware: sizing of the hardware based on the TPC-C rating of the processor.
• Runtime variables: deciding on the number of configurable runtime variables; the more GVs used, the more memory a process consumes.
• Checkpoint/Reliable: defining checkpoints or configuring a reliable mode for a process usually takes more processing time, as time is spent refreshing a cache or a file system.
• Logging: extensive logging takes more processing time because file descriptors may be unavailable at the time of need, causing delays.
• JVM: the memory allocated to the JVM should be sized realistically. A process times out and goes into error when talking to a non-responsive third-party application.
• Network-specific filtering: using multicasting provides the lowest level of traffic filtering.
• Transport-specific filtering: at the messaging layer (RV: subject-based, JMS: topic-based).
• Process-engine-specific filtering: at the process management level (RVFT and RVCMQ), transport-based fault tolerance and load balancing.
• Process-specific filtering: short-living master, content-based filter, condition, child process (one can also aggregate from multiple sources).
• Flow control: use the flow control provided by the deployment, keeping its limitations in mind. Jobs in a wait or suspended state are still counted toward the limit; if this is unacceptable, semaphore logic needs to be built in Java code.
• Max Jobs and Activation Limit: memory, CPU, threads, connections and the number of processes need primary attention.
• Complexity of data validation and transformation, and complexity of XPath-defined transitions.
• Event sources (starter processes) are usually resource hungry.
2.1.1 Engine Processing
BusinessWorks process execution always depends on the state of the process, the buffer associated with that state, and the memory available to the JVM (heap size) to execute the process.
Each spawned thread behaves as a unique instance of the process, and CPU cycles are allocated to each thread. Each instance consumes a processor slot and is comparable to a tiny CPU, so CPU sizing is always a must. The job pool holds jobs in a queue and processes them in sequence. A pile-up of jobs causes paging out to the system (database, file, memory), which can be adjusted using parameters such as Flow Limit, Max Jobs and Activation Limit. A process can be in a ready state, a dispatched state, or a wait/blocked state; setting Flow Limit, Max Jobs and Activation Limit helps in reducing paging out. Once a job is submitted to a thread, the process executes 20 steps and then goes back to the dispatch queue, leading to a thread switch that allots the thread to the next available job. If the steps are executed but the thread is not released, the process goes to a blocked state and remains incomplete, which prevents the next job from paging in. The BW deployment configuration should be well adjusted with the parameters Max Jobs (in memory), Thread Count and Flow Limit (maximum jobs in memory and paged) using TIBCO Administrator at deployment time.

2.1.2 Determining the Available Memory
The size of memory and the number of processes it can handle at a given time is part of the tuning activity. To reach a correct figure for the number of services, a load test with a variety of services (request-reply, adapter, normal sequential) can be carried out on the specified system with the given heap size, i.e. the JVM size. Parameters such as Initial Heap Size (MB), Maximum Heap Size (MB), Java Thread Stack Size (KB) and Thread Count are adjusted to achieve optimal memory usage.

2.1.3 Flow Control
Deployed archives usually have a number of starter processes configured with an adapter, JMS publisher/subscriber, or HTTP/SOAP. The rate at which processes complete depends on the number of threads and the Max Jobs configured for them. To improve processing, jobs are paged/buffered using the Flow Limit or the Activation Limit for the process. Restricting the flow of incoming events is necessary to avoid memory overflow when processing lags behind. The Flow Limit property limits the number of jobs created; this parameter is specified on the job starters. Setting Max Jobs to more than one with Activation Limit enabled (which gives priority to new jobs) limits processing to n jobs as a batch at one time. This is useful when downstream resources for concurrent processing are limited. Max Jobs and Flow Limit are the two attributes that control flow into a TIBCO BusinessWorks process, whereas JVM parameters are used to control TIBCO BusinessWorks engine memory.

2.1.4 Paging Jobs
An existing job gets paged out when the job is blocked, the Max Jobs limit is exceeded, or the Activation Limit is disabled. A new job can be paged when the Max Jobs limit is exceeded and the Activation Limit is enabled, or when all paged-in jobs are unblocked. A paged job is returned to the dispatch queue provided the job is unblocked and the Max Jobs limit is not exceeded, or when some other paged-in job is blocked and can be paged out to make room.

2.1.5 Paging Waiting Jobs
Flow control enables a waiting process to be placed into the paged process pool. It also frees space for another process to enter the process pool.
If the task that waits for an event is a Signal In or Sleep task, the process paging feature is used to write the process to disk, allowing a new job to run on the thread. When the event occurs, for example when a Rendezvous message arrives for the Signal In task, the process automatically re-enters the process pool and becomes a ready process.
2.1.6 Enabling Paging
The Activation Limit setting in the deployment configuration is used to control process paging. By enabling paging, the system allows the number of active and waiting jobs to be greater than the Max Jobs/Flow Control setting.
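To make the interaction of these settings concrete, the following sketch shows how the three deployment-level knobs discussed above might look for one process starter. The values are purely illustrative assumptions, not recommendations from this document; the actual fields are edited per service instance in TIBCO Administrator rather than in a file.

    Max Jobs          50        (at most 50 jobs held in memory; further jobs are paged to disk)
    Activation Limit  enabled   (a paged-in job keeps its memory slot until it completes)
    Flow Limit        200       (the process starter is suspended once about 200 jobs are in flight)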
3 Best Practices 3.1 TIBCO Business Works Engine Tuning Guidelines
Allocate significant time for selecting the JVM vendor, version, and the necessary tuning parameters. The number of threads that an engine will allocate is set in the TIBCO Administrator Edit Service Instance dialog box, under the Server Settings tab, General section. Each job runs on a separate Java thread, so the number of worker threads controls how many jobs can run simultaneously. Throughput can be increased by measuring the available CPU and memory resources, which can be determined from a stress/load test. Increasing the thread count consumes more resources and can degrade performance, so proper thread sizing needs to be done. There are two ways to take advantage of CPU utilization: optimize the engine threads, or increase the number of TIBCO BusinessWorks engines on a single node or distribute them across different nodes based on an optimal configuration. Typical numbers of worker threads range between 8 and 32. Specifying too low a value can cause lower engine throughput even though spare CPU resources exist; specifying too high a value can cause CPU thrashing. If the rate of incoming processes exceeds the number of threads available to run them, an alternate approach is to control the processes using a delay, sequencing, or flow control. Processes with heavy nesting/hierarchy of sub-process calls should be refined or partitioned into simple modules to avoid run-time overhead.

3.1.1 Calculating MaxJobs or Flow Limit
MaxJobs and FlowLimit are the two attributes that control flow into a TIBCO BusinessWorks process, whereas JVM parameters are used to control TIBCO BusinessWorks engine memory. To size them:
Do a load test/stress test
List out how many jobs and the type of jobs that can be accommodated in the given JVM
The MaxJobs and FlowLimit configuration for a process engine depends on the amount of memory reserved for the engine JVM, the maximum size of a process instance object, and the number of process instances per engine. Normally MaxJobs is 0 and FlowLimit is 0, which allows the process engine to create an unbounded number of services and eliminates the overhead of paging.

3.1.2 Memory Management Issues and Tracing
Memory-related issues sometimes occur outside the JVM. Third-party software integrated with the BW process may consume memory to store its state, leaving insufficient memory for the process; if the process is not designed to handle this scenario, it causes a choke or instability. The exception gets reported to the Hawk monitor if one is available and configured in the infrastructure. It is always advisable that the event sink processes events at the same rate at which the event source produces them.

3.1.3 Tuning Engine
Parameter: Engine.StepCount
This TIBCO BusinessWorks property controls the maximum number of execution steps (unless inside a transaction) for a job before an engine thread switch occurs. The default value is 20, which prevents frequent thread switches (which can slow down the engine) in processes with a large number of steps.

Parameter: EnableMemorySavingMode=True
Turning this parameter on makes the engine release references to unused process data so that it can be garbage collected by the JVM, thus improving the performance of processes with large amounts of data. This feature is very useful when a large amount of memory is occupied by a variable defined in a process, for example a variable that reads in a very big XML file. In this scenario, it is best to release memory right after the process instance completes.

3.1.4 Fault Tolerance and Load Balancing
The TIBCO BusinessWorks process engine can be configured to be fault-tolerant. One can start several engines; in the event of a failure, another engine restarts the process starters and the corresponding services. One engine is configured as the master, and it creates and executes services. The second engine is a secondary engine, and it stands by in case of failure of the master. The engines send heartbeats to notify each other that they are operating normally.
Normal operation: master processing while secondary stands by
In the event the master process engine fails, the secondary engine detects the stop in the master's heartbeat and resumes operation in place of the master. All process starters are restarted on the secondary, and services are restarted to the state of their last checkpoint.
Fault-tolerant failover scenario
The expected deployment is for master and secondary engines to reside on separate machines. There can be multiple secondary engines, with a specified weight for each engine. The weight determines the type of relationship between the fault-tolerant engines. A master and its secondary engines are known as a fault-tolerant group. The group can be configured with several advanced configuration options, such as the heartbeat interval and the weight of each group member.
Peer or Master and Secondary Relationships
Members of a fault-tolerant group can be configured as peers or as master and secondary engines. If all engines are peers, when the machine containing the currently active process engine fails, another peer process engine resumes processing for the first engine and continues processing until its own machine fails. If the engines are configured as master and secondary, the secondary engine resumes processing when the master fails. The secondary engine continues processing until the master recovers; once the master recovers, the secondary engine shuts down and the master takes over processing again. The Fault Tolerance tab of the process engine deployment resource allows one to specify the member weight of each member of a fault-tolerant group. The member with the highest weight is the master. One can select Peer in the first field on the tab to configure all engines as peers (that is, they all have the same weight), select Primary/Secondary to configure the engines as master and secondary, or select Custom to specify one's own values for the weight of each member of the group.
Process Starters and Fault Tolerance
When a master process engine fails, its process starters are restarted on the secondary engine. This may not be possible with all process starters. For example, the HTTP Receiver process starter listens for HTTP requests on a specified port on the machine where the process engine resides. If a secondary engine resumes operation for a master engine, the new machine is now listening for HTTP requests on the specified port. HTTP requests always specify the machine name, so incoming HTTP requests will not automatically be redirected to the new machine. Each process starter has different configuration requirements, and not all process starters may gracefully resume on a different machine. Additional hardware or software may be required to redirect incoming events to the appropriate place in the event of a failure.
Sometimes servers may not have all of the necessary software for restarting all instances. For example, a database may reside on the same machine as the master process engine; if that server goes down, any JDBC activities will not be able to execute. Therefore, it may not be possible to load process definitions that use JDBC activities in the secondary process engine. One can specify that the secondary process engine loads different process definitions than the master. It is advisable to load only the process definitions that can gracefully migrate to a new server during a failure.
Setting Fault Tolerant Options
The FT Group Settings panel is displayed only if the TIBCO BusinessWorks process has been added to at least two different machines. If the domain includes components that were deployed as part of a fault-tolerant group, the display includes information about the group. It is possible to start one or more process engines in the group. If more than one engine has started, only one is displayed as Running and all other engines are displayed as Standing By (or, initially, as Starting Up). Any change in the status of a component that has been deployed as part of an FT group affects all other members of the group. After deploying the process engines, it is most efficient to select all process engines and start them together. After the primary and secondary engines have communicated, the master displays as Running and all other engines as Standby. If only the primary is started, it first goes to Standby mode while it checks the status of the other engines, and then changes to Running. Upon shutdown of a process engine, the appropriate secondary engine starts automatically. The fault-tolerant mode of running a process is configured in the administrator.

Because fault-tolerant engines are expected to be on separate machines, it is advisable to use a database for storage for each process engine. This allows the same JDBC Connection resource to be specified for the master and secondary engines, so that all engines can share the information stored for process instance checkpoints. If all engines share the checkpoint information, the secondary engines can recover process instances up to their last checkpoint. If engines do not share the checkpoint information, process instances are not restarted.

Load-Balancing of Incoming Messages
One common application of a JMS queue is to distribute queue messages across multiple receivers, thereby balancing the processing of the messages on the queue. To achieve this, both the JMS server and TIBCO BusinessWorks must be properly configured. The JMS server must allow the queue messages to be distributed across multiple receivers; for example, in TIBCO Enterprise Message Service, the exclusive property on the queue controls whether messages can be delivered across receivers. In TIBCO BusinessWorks, the process definition containing the JMS Queue Receiver must be deployed across multiple process engines. This creates multiple queue receivers for the same queue. When balancing incoming messages across TIBCO BusinessWorks engines, it should be ensured that one engine does not accept and confirm a large number of incoming messages before other engines can receive them. In general, most JMS servers balance the load by distributing messages in a round-robin fashion to all queue receivers. However, there are situations that can cause an uneven distribution of messages across queue receivers. If the Acknowledge Mode field is set to Auto on the Configuration tab of the JMS Queue Receiver, the process starter confirms messages as it receives them. When process engines are started at different times, this can lead to one process engine receiving all queue messages and paging them to disk, depending on how the engine's Max Jobs and Activation Limit properties are set when the engine is deployed. With TIBCO Enterprise Message Service, this problem can be avoided by setting the acknowledge mode to TIBCO EMS Explicit and then using the Flow Limit property in the deployment configuration to control the number of process instances created by the process starter. If not using TIBCO Enterprise Message Service, set the Acknowledge Mode field to Client. In this mode, a process engine can only receive as many messages as it has sessions specified in the Max Sessions field. Once a process engine reaches the maximum number of sessions, another process engine can begin to accept incoming messages. A process engine cannot receive more messages until the earlier messages have been acknowledged using the Confirm activity, which can be considered a drawback. Once a message is acknowledged, the session is released and the process engine can accept a new message.

3.2 JVM Tuning Guidelines
Each TIBCO BusinessWorks engine runs as a multi-threaded Java server application. Processes and other objects used internally by TIBCO BusinessWorks are Java objects that consume memory while the engine is running. Java provides some useful parameters for tuning memory usage.
TIBCO recommends that TIBCO BusinessWorks customers consider various factors when selecting a JVM. Besides JVM version and vendor, the most relevant tuning parameters are:
• JVM heap size
• Server VM vs. Client VM setting
• Garbage collection settings
The HotSpot Server JVM will generally give better performance with TIBCO BusinessWorks; Sun also claims better performance for the HotSpot Server JVM. Consider using the multi-threaded garbage collector, which is available in JDK 1.4.x. Experimenting with processor sets when running multiple engines is a good idea.

3.2.1 Specifying JVM Heap Size
The default Java heap size, which varies according to platform, is a conservative estimate made by the developers of the particular JVM being used. To calculate the amount of memory needed for a TIBCO BusinessWorks engine, one should determine the largest heap size that can reside in physical memory; for best engine performance, paging to disk should be avoided. Heap size mostly depends on the memory available. The recommended heap size for a small workload is 256 MB, for a medium workload 512 MB, and for a large workload 1 GB or more. The maximum heap size per engine can be configured and saved using TIBCO Administrator. TIBCO Runtime Agent™ allows one to modify the Java heap size, JVM selection, and JVM settings that the TIBCO BusinessWorks engine uses when it is started. This memory is committed for engine activities and job processing. To set the available JVM memory, use the following parameters:
• The Initial JVM Size parameter sets the minimum amount of memory used.
• Maximum JVM Size sets the maximum.
The total amount of JVM memory needed to operate a TIBCO BusinessWorks engine should account for the memory for each process and the maximum number of processes that can be in the process pool. If flow control is enabled, the process pool can contain up to the MaxJobs value.

3.2.2 Setting JVM and Processor Affinity
If the machine has multiple processors, TIBCO Software recommends assigning processor affinity to the different instances of TIBCO BusinessWorks engines:
• On Windows, use the Task Manager facility to assign affinity.
• On the Solaris platform, create a processor set using psrset/psradm/pbind.
• On Linux, use the taskset command to set processor affinity.

3.2.3 JVM Garbage Collection
The recommendation is not to perform explicit garbage collection directly; the JVM handles GC internally with its built-in algorithms. However, tuning garbage collection requires a good understanding of garbage collection frequency, message size, and object longevity (young, tenured, and perm generations).

3.3 TIBCO Business Works Transport and Palette Guidelines
Choose an appropriate transport that adheres to the interoperability, reliability, and security requirements. XML over HTTP, XML over JMS, SOAP over HTTP, SOAP over JMS, TIBCO Rendezvous messages and TIBCO Active Enterprise™ messages are a few combinations to choose from based on the requirements and infrastructure. It is observed that SOAP over HTTP performs better than SOAP over JMS. The SOAP protocol adds significant overhead whether used over HTTP or JMS, since there is a cost to creating and parsing the SOAP envelope. The following observations can be made based on various comparisons of HTTP clients, HTTP servers, SOAP clients, and SOAP servers:
• As the payload increased from 1KB to 5KB, the throughput dropped.
• There was no significant increase in latency when the payload increased from 1KB to 5KB.
• As the number of concurrent HTTP/SOAP client requests increased, the throughput increased.
• As the number of HTTP/SOAP client requests increased, the CPU utilization increased.

3.3.1 HTTP/S
There are two important tuning considerations related to the HTTP client:
• HTTP/S Client Thread Pool
• Persistent Connection Manager (PCM)
HTTP/S Client Thread Pool
The rule of thumb is to tune the process so that the HTTP server and HTTP client run at the same rate: the rate at which requests are produced by the client should match the rate at which they are consumed by the server. Each request/response activity that uses the HTTP protocol (for example, Send HTTP Request or SOAP Request Reply) is associated with a unique thread pool, and each request is executed in a separate thread belonging to the thread pool associated with the activity. The value of the property bw.plugin.http.client.ResponseThreadPool controls the size of the thread pool. The number of threads in the pool determines the maximum number of concurrent requests a request/response activity can execute; the default value is 10. The thread pool is created when the engine starts, so be careful to set the value of this property to a reasonable number for the system. If the value is set too high, it may result in extra resources being allocated that are never used. The optimal thread pool count can be determined with a test and by monitoring the test behavior.
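As a concrete illustration, the property named above is typically supplied as an engine/custom property (for example in the engine's .tra properties file; the exact mechanism depends on the BW version). The value 32 below is an assumption for a host with spare CPU, not a recommendation from this document.

    # Size of the HTTP client response thread pool (default 10); controls how many
    # concurrent requests a single HTTP/SOAP request-reply activity can execute.
    bw.plugin.http.client.ResponseThreadPool=32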
Persistent Connection Manager
By default, HTTP uses non-persistent connections. The Persistent Connections Pool (PCP) is an important feature when the client calls the same server very frequently, the cost of the HTTP connection is deemed important, and state management of the connection is well understood. In TIBCO BusinessWorks 5.2, the Send HTTP Request-Reply and SOAP/HTTP Request-Reply activities can share connections. The HTTP (SOAP/HTTP) request-reply activity configuration is enhanced to allow users to set up a pool of persistent HTTP connections with the server. However, this feature should be used only after fully understanding its implications. If persistent connections are chosen, performance can improve because instead of creating and destroying a connection for each job, a pool of reusable shared persistent connections is maintained.

Decision to use persistent vs. non-persistent connections
Persistent HTTP connections have a number of advantages:
• By opening and closing fewer TCP connections, CPU time is saved in routers and hosts (clients, servers, proxies, gateways, tunnels, or caches), and memory used for TCP protocol control blocks can be saved in hosts.
• Network congestion is reduced by reducing the number of packets caused by TCP opens and by allowing TCP sufficient time to determine the congestion state of the network.
• Latency on subsequent requests is reduced since no time is spent in TCP's connection-opening handshake.
• HTTP can evolve more gracefully, since errors can be reported without the penalty of closing the TCP connection.
The most important disadvantages of using persistent connections are:
• The HTTP (SOAP/HTTP) client may report exceptions that are not usually reported when using non-persistent connections.
• The HTTP server may end up receiving duplicate requests from the same client.
Usually, these issues occur when the client is not configured to check for stale connections. For various reasons, the persisted connections may become stale. When attempting to use a stale connection, the underlying HTTP client application may fail, throwing an exception that may be propagated to the business process layer. Best practices for using persistent connections are as follows:
• The client should check the connections before using them. TIBCO BusinessWorks has the ability to check for stale connections (see the property bw.plugin.http.client.checkForStaleConnections).
When the underlying HTTP client application detects a stale connection, it closes it and gets another connection from the pool. After three unsuccessful attempts to get a non-stale connection, the client fails, throwing an exception. This option should be used with caution, since it causes overall performance to drop significantly, potentially making a non-persistent connection a better choice.
• Open and close fewer TCP connections to save CPU time.
• Use persistent connections when making repeated connections to the same endpoint, when stale connections are not an issue, and when the server handles or supports duplicate messages.
• When using HTTP over SSL, note that this adds overhead and that overall request/response throughput may drop.
When the client cannot check for stale connections, it is very likely that the server will receive duplicate messages. If one cannot rely on the client to detect stale connections, the next option is to make sure the server provides support for duplicate message detection.
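A corresponding sketch for the stale-connection check mentioned above, again shown as an engine/custom property with an assumed placement (e.g. the engine's .tra file); keep in mind the throughput penalty noted above before enabling it.

    # Have the HTTP client test pooled persistent connections before reuse.
    # This avoids duplicate requests caused by stale connections, at the cost
    # of lower overall throughput.
    bw.plugin.http.client.checkForStaleConnections=true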
Flow Control in HTTP
HTTP properties such as FlowLimit, bw.plugin.http.server.maxProcessors and bw.plugin.http.server.minProcessors are tuned to control the process flow and achieve good results. Details about configuring the Persistent Connection Manager are described in the section "Configuring Persistent Connections."
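A hedged example of the server-side properties cited above; the numbers are illustrative only and should come out of a load test, and FlowLimit itself remains a deployment setting rather than an engine property.

    # Bound the HTTP server's request-processing threads (illustrative values).
    bw.plugin.http.server.minProcessors=10
    bw.plugin.http.server.maxProcessors=75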
HTTP/SOAP Server
The following important considerations apply to HTTP and SOAP over HTTP servers:
• Configure minProcessors and maxProcessors as described in the configuration section.
• HTTP servers can be deployed in various load-balanced configurations. However, deploying multiple HTTP servers that listen on the same port of the same machine is not possible, since the HTTP port is bound to a specific machine. Alternatives are available, such as creating a reverse proxy solution, but they are beyond the scope of this document.
• HTTP traffic that arrives from external sources should be evaluated carefully. Content-aware XML accelerators or routers such as Cisco® Application-Oriented Networking (AON), Sarvega (now part of Intel), or Reactivity help in evaluating the content.

3.3.2 SOAP
Simple Object Access Protocol (SOAP) is a lightweight protocol for the exchange of information between web services. Currently, the supported transports for SOAP are HTTP and JMS. The performance characteristics of SOAP are closely tied to the performance of the HTTP and JMS implementations in TIBCO BusinessWorks, and the only way to load balance SOAP over HTTP is to include a local director that directs requests to a set of configured engines. The major influences on SOAP performance are similar to the generic influences:
• Message complexity. The more complex the message, the greater the impact of parsing (packing and unpacking) the message as it travels through the various stages of the SOAP protocol.
• Security. The decision to use basic authentication will affect performance. This is straightforward and the impact is easily measured.
• Load balancing. The use of a local director to distribute incoming requests.

3.3.3 JMS
Recommendations for using Java Message Service (JMS) are as follows:
• In TIBCO BusinessWorks 5.1.3 and prior versions, transports such as JMS can be throttled by limiting the number of sessions. JMS does not deliver more messages until some of the sessions have been acknowledged.
• In TIBCO BusinessWorks 5.2, significant improvements have been made to this mechanism. New for this release is the combination of the TIBCO Enterprise Message Service features Explicit Acknowledge and FlowLimit. In this case, the location of the acknowledgement in the process makes no difference.
• When using Client Ack, the JMS session cannot receive a new message until the current one is acknowledged.
• TIBCO BusinessWorks allows one to configure multiple sessions to receive messages faster, and to set the number of sessions higher than the number of engine threads.
• Acknowledge (confirm) messages as soon as possible to improve throughput.
• Holding the Client ack until the end of the process blocks that session. This slows down the rate at which TIBCO BusinessWorks pulls messages from the JMS server, which then has to hold messages for a longer period of time.
• With TIBCO Enterprise Message Service Explicit Ack, a single session is used to receive all messages. This mode allows for more efficient resource utilization and provides a more even load distribution across multiple engines.
• The best way to increase performance beyond the capability of a single engine is to distribute the load over multiple engines using a load-balanced transport such as a JMS queue or the TIBCO Rendezvous CMQ transport. External mechanisms exist to allow HTTP to be used for this purpose as well.
• Use a hardware/software load balancer to improve HTTP performance and load distribution.
• If possible, choose NON_PERSISTENT as the delivery mode when replying to a JMS message.
• Using a JMS Queue Sender and Wait For JMS Queue Message instead of the combined JMS Queue Requestor activity may improve throughput.
• Simple and Text JMS message types have the lowest processing overhead.
• To design a long-running process that fetches and processes messages, use the Get JMS Queue Message activity in a loop instead of Wait For JMS Queue Message. In most cases, a JMS starter is sufficient in this scenario.

3.3.4 FTP
FTP activities that share the same host, port, username, and password within the same process definition can share the same session. The Quit (post-command) field specifies that the command should close the session. Keeping the session open for all activities can improve performance because there is significant overhead in creating a session for each activity.

3.3.5 JDBC
JDBC activities are the most frequently used data access activities and have significant performance implications. It is recommended to use certified, high-performance JDBC drivers or the recommended drivers available with BW; a test can be done to determine which driver type gives the best results. The JDBC Query activity can fetch batches of records at a time instead of retrieving the entire result set. The JDBC Update activity has been enhanced to allow multiple statements to be executed; use batching instead of individual statements when possible for better latency. A general rule of thumb is to initially set the maximum number of database connections to slightly less than the engine thread count. If there are not enough database connections allocated, some jobs may be blocked waiting for a connection to be freed. One should continue to monitor the number of database connections using database tools. There are a few tools one should be familiar with: the TIBCO trace tool, described in this paper, and specific DB tools to monitor the overall number of database sessions.

3.3.6 Transactional Activities
Avoid development misuses:
• Inefficient queries: sending SQL that asks the database to do more work than necessary; for example, querying tables without indexes.
• Excessive querying: efficient queries called too frequently.
• Large data sets: processing large sets of data in ResultSets.
• Procedure overriding is not allowed.
• Define a time-out when executing a block of SQL code.

3.3.7 General Activities
3.4 Process and Activity Design Guidelines

3.4.1 Data and Caching
Data at process run time can be static, semi-static, dynamic, or shared. The rule of thumb is not to use too many global variables (GVs): they stay in memory in the form of a tree, and to retrieve a bound value the process has to traverse the tree. If unneeded GVs are present, the cost of traversal is still incurred.
Static data
These data are typically kept as a local copy; TIBCO BusinessWorks holds on to them in memory. One example is global variables, implemented internally as the XML structure known as $_globalVariables. Unused variables can increase latencies, especially for repeat-until-true groups whose iteration condition XPath formulas reference global variables instead of constants, although the cache keeps a most-frequently-used (MFU) to least-frequently-used (LFU) list of GVs for its use.
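To illustrate the cost described above, compare a repeat-until-true condition that walks the $_globalVariables tree on every evaluation with one that compares against a value copied once into a process variable before the loop. The path below ($_globalVariables/ns:GlobalVariables/Retry/MaxAttempts) is a hypothetical variable reference, shown only for structure.

    Condition evaluated on every iteration (tree traversal each time):
        $index < $_globalVariables/ns:GlobalVariables/Retry/MaxAttempts
    Condition against a value assigned once to a process variable before the loop:
        $index < $MaxAttempts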
Near static data • Keep a local copy, hang on to it in memory, and lazily check for updates. Example: XML Documents (Name Value Pair)
Dynamic data
• Work on local data, cache carefully, and use optimistic locking.
• TimesTen is an in-memory database that can improve the performance of checkpoints and wait/notify activities.
• TIBCO® BusinessWorks Smart Mapper is another example, used extensively for cross-referencing (1-1 or 1-n) data. It provides a data caching option which removes the performance overhead of looking up data from the database.
• The Java Global Instance shared configuration resource allows one to specify a Java object that can be shared across all process instances in the JVM.
• A shared Java object (for example, a shared variable used to implement a wait-and-notify style of inter-process communication) should be instantiated at engine startup time and shared by all activities that access it.
• If an object can be shared across multiple process instances, instantiate the shared data only once to improve performance and reduce overhead.

3.4.2 Checkpoints
High-volume applications that make significant use of checkpoints have to be continuously aware of the performance cost of writing data to the checkpoints. These applications frequently need faster access to checkpoint data, replication of checkpoint data between different systems for failover, and protection against database server failure.

3.4.3 Grouping Activities
Signal-In and Group Do NOT use Signal-in within a loop group to receive/process one message at a time. It is ineffective because the signal-in’s internal queue is unbounded for out-of-band events waiting to be matched up with a job key.
Local Only Field
The Notify configuration has been enhanced with a Local Only field to allow an in-memory notification when the Wait and Notify activities are performed on the same machine. See TIBCO BusinessWorks Palette Reference, chapter "General Activities Palette", the Local Only field in the Notify configuration, for more information.

Call Process
Every time a Call Process (CP) is invoked, input data, including process instance data, is bound to the sub-process input data structure. In most cases a reference is made to the original entity; however, whenever data is mapped, a copy of the data is created. Keeping this cost in mind, the designer should always be mindful of the use of CP and continuously evaluate the data-binding cost versus the benefits of CP modularization.
• Call Process (CP)
  - Consider using a "Spawn" CP for heavy asynchronous processing (such as logging). This can decrease response time, though not CPU usage.
  - Be aware of the tradeoff between data binding cost and modularization.
  - A CP should be a frequently reusable sub-process with reasonable complexity.
  - Use shared process variables to avoid data copying cost in a CP.
  - Review the use of large strings that require manipulation, and evaluate the need for passing large strings, especially between CPs; any change within a CP will create a new copy.
  - Parse structures up front and keep string references out.
• FTP
  - Use session caching.
• Wait for Notify
  - Do NOT use Wait for Notify within a loop group to receive/process one message at a time. It is ineffective because the Wait for Notify's internal queue is unbounded for out-of-band events waiting to be matched up with a job key.
• Notify Configuration (Local Only)
  - Use the Local Only field in the Notify configuration. It allows an in-memory notification when the Wait and Notify activities are performed on the same machine.

3.4.4 Testing BW Process Definitions
Each process has to be unit tested against a use case, followed by capturing the test proofs. The integration-time test should be certified with minimal activity-oriented logging. TIBCO BusinessWorks provides a testing environment for stepping through process models and determining the sources of errors. Entering the testing environment starts a TIBCO BusinessWorks engine. The engine starts process instances based on the process definitions stored in the project. One can select one of the running process instances to display in the design panel, and the currently executing activity is highlighted as the process instance runs. In general, testing should be done during the design and development phase of a project. Testing a deployed project is possible, but might be difficult depending on the volume of the system's workload. Also, testing usually involves setting breakpoints in the process model to stop the running process instances at desired points. This is not possible in a production environment, so a development system should be used for testing purposes. Testing a process definition typically involves these steps:
• Select the process definition to test in the project panel.
• Set breakpoints in the process definition at points where the running process should stop so its state can be examined.
• If the process begins with a Start activity and the Start activity has a schema defined, supply input data to the process before executing it.
• Click the Tester tab on the left of the project panel. The project panel becomes the test panel. From the test panel one can start process instances or load more process definitions. See Process Instances During Testing for more information about process instances in the test panel.
• Examine the data of the process by selecting any of the activities in the process. The current state of the process data is displayed on the Process Data tab of each activity.
• Use the toolbar buttons (Pause Testing, Step to Next Activity, and so on) in the test panel to either continue through the process instance or stop the current process instance.
Once in testing mode, changes to process definitions are not reflected in the running process instances; return to design mode before changing process definitions.

3.4.5 Tracing in BW
Tracing is needed in order to manage and monitor a deployment: view the status of components and generate tracing information.
Start and stop process engines and adapters. Use custom activities; use logging, i.e. the system log and application log, and configure a Hawk microagent on the logs. Write a custom microagent to monitor specific scenarios. Define breakpoints at design time and test. The TIBCO Administrator GUI allows monitoring of the running project at different levels of detail and can collect tracing information for later analysis.
For the example discussed in this manual, the process engine could perform these tasks: receive data from an application server via JMS, data from a PeopleSoft Order Management system via the appropriate adapter, and data from a shipping service via SOAP; enter data into a PeopleSoft Order Management system and into a Siebel customer service system via the appropriate adapters; and send certain orders out for credit approval and receive approval or refusal. All components are monitored and managed through TIBCO Administrator, which also provides security and repository management. Users access TIBCO Administrator using the TIBCO Administrator GUI. Specify tracing information if desired. TIBCO Designer allows one to specify simple tracing to a file or to standard out directly in the configuration panel. One can also specify advanced tracing, such as tracing to a network sink.

3.5 Message, Payload and Schema Guidelines
There is no single specific guideline, but criteria such as message size or payload size become performance factors. The general guidelines are as follows:
• Be aware of the performance cost of a message; keep message size as small as possible.
• If using binary content or large documents, consider using a file reference instead.
• If the scenario includes a large memory space occupied by a variable defined in the process (for example, a variable that reads in a very big XML file), plan to release memory right after the process instance completes by setting EnableMemorySavingMode=True. This parameter significantly improves the memory footprint of processes that manipulate a large amount of data for a small part of their lifetime. This TIBCO BusinessWorks property allows the engine to release references to unused process data so that it can be garbage collected by the JVM, thus improving the performance of processes with large amounts of data.
• Keep XML structure and schema complexity as simple as possible.
• Review the use of large strings that require manipulation; any change within a CP will create a new copy. Parse structures up front and keep string references out.
• Avoid or re-architect XML schemas with XML elements that contain deeply nested elements. Many Open Applications Group Business Object Documents (OAG BODs) have this issue.

3.5.1 Data Representation Guidelines for Tibco Rv and JMS
When using JMS as a transport for TIBCO BusinessWorks, the following matrix should be considered as a very high-level performance guideline. Use it as a general rule, but one should certainly experiment with different options, especially in high-throughput environments.

Table 1. Comparing Performance of Different Data Types
Message Type | Performance Rating | Comments
Simple       | Very High          | Without a body portion there is no extra parsing required. This is by far the best performing message type.
Text         | High               | Treated as a single text field. No parsing needed.
Bytes        | High               | Bytes require parsing downstream in the process. This influences the rate at which the process starter is able to receive the JMS message.
Map          | Medium             | A set of name-value pairs that can be accessed sequentially or by name. It requires a level of parsing in the incoming activity, which has an adverse effect on the incoming rate.
Stream       | Medium             | A set of ordered values. No parsing needed.
XML Text*    | Medium             | A single text field containing XML, which will be parsed by the process starter. This is a major change in the TIBCO Active Enterprise™ 5 wire format. When the XML Text format is chosen, TIBCO Rendezvous uses the internal TIBRVMSG_XML type and compresses the message on the wire.
Object       | Low                | A serialized Java object. The object is deserialized in the process starter. Deserializing requires significant CPU and memory resources.
3.5.2 TIBCO Business Works Mapper Performance
TIBCO BusinessWorks 5.x provides a full-featured XSLT editor. Although the TIBCO BusinessWorks mapper provides great flexibility, granularity and support for XSLT 2.0 features (such as for-each-group), it is important to keep in mind that improper use of drag-and-drop may end up creating inefficient XSLT. Many TIBCO BusinessWorks mapper performance issues arise from such inefficient XSLT; a proper understanding of the mapper and code optimization may be necessary for the best performance. In general, users should pay attention to the following guidelines:
• Use the Mapper activity (parsed at startup) instead of the Transform XML activity wherever possible for better performance. Be aware that the Transform XML activity invokes the XSL parser at runtime and hence may add performance overhead every time a process runs. The Mapper, on the other hand, dynamically loads definitions and evaluates the input document and mapping rules at runtime. However, there are some cases where the Transform XML activity is appropriate, such as when the input and style sheets are different every time the Transform XML activity is invoked.
• Use the tib:render-xml function instead of the Render XML activity if the XML structure is not being modified.
• Think of the process or source data as an XML tree.
• Note that large data sets (thousands of records) take time to traverse.
• Note that XPath expressions are evaluated to take one to a specific element (node) or group of nodes.
• Do not let the same expression be evaluated many times (especially if the data is large); use variables and evaluate them only once.
• Because of XML tree traversal, always use the "iteration element" in a loop if there are more than a small number of records.
This section focuses on providing better design practices for improved performance using TIBCO BusinessWorks.

3.6 Processing Large Sets of Data
A common use of a TIBCO BusinessWorks process is to retrieve a large set of data (for example, a JDBC result set) and then process it through a series of activities. Care should be exercised to limit this data to a manageable "chunk" before processing. It is important to note that the default behavior of the JDBC Query activity, and of other activities such as Parse Data or Parse XML, is to parse/retrieve the entire result into memory for downstream processing. This has two effects on performance:
• Memory. Large parsed objects can consume considerable memory very quickly, so allowing an unspecified number of records to be retrieved at once can cause memory usage problems.
• Indexing of records. If the data has a large number of repeated records/elements and one plans to iterate through them, performance will degrade in a non-linear (quadratic) fashion as the number of records grows. This is due to the increased lookup time as the data "tree" is traversed from the start record/element to the current node being processed.
This effect can be amplified by referencing that node in many places in mapping/binding activities; the use of variables can minimize this impact. The solution to both of these problems is to retrieve the data in smaller sets/chunks and use a group to iterate through the entire set. The way to do this varies based on the input data source, but two common inputs are the results of Parse Data and JDBC Query activities. The Parse Data activity can easily be configured to retrieve a set number of records and process them before retrieving more. One feature of the Parse Data activity is that it does not buffer the whole file from disk; therefore, it can be used to process extremely large (1 GB+) files with a controlled amount of memory usage. For the JDBC Query activity, it is possible to retrieve the results in chunks by using more selective SQL and surrounding the JDBC Query activity with an iterate group. For example, if the table holds 100K rows and 1K rows are retrieved at a time over 100 iterations, the total processing time will be significantly less than retrieving all 100K rows at once. Some experimentation can be used to optimize the size of the chunk depending on the type of data and processing required.

3.6.1 Iterating Through Large Sets of Data Using a Mapper Activity
The mapping technology in TIBCO BusinessWorks (XSLT) has mechanisms for iteration and looping, which make it possible to iterate through data within the mapper using the xsl:for-each or xsl:for-each-group statement instead of using a group around the mapper. This obviously does not apply if multiple activities must execute within each iteration. If the mapper is the sole activity, it is much more efficient to use the mapper and xsl:for-each instead of placing the mapper in a group and using the "accumulate output" feature of the group.
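The following is a minimal sketch of what the mapper's generated XSLT looks like when the iteration happens inside the mapper with xsl:for-each rather than in a surrounding group. The element and variable names ($JDBC-Query, resultSet, Record, Orders, field1, field2) are illustrative placeholders, not taken from a real project.

    <!-- Build the whole output list in a single mapper activity -->
    <Orders>
      <xsl:for-each select="$JDBC-Query/resultSet/Record">
        <Order>
          <id><xsl:value-of select="field1"/></id>
          <amount><xsl:value-of select="field2"/></amount>
        </Order>
      </xsl:for-each>
    </Orders>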
Use of Variables
One of the single most important optimizations in the mapper activity, and in any activity that transforms data, is the use of variables. The mapper retrieves data at runtime by evaluating XPath expressions and traversing the input tree to reach the matching data. If the tree is large and the mappings perform many lookups of related or identical data, storing that data in variables can improve performance significantly.
Consider a large number of records in an input tree that are processed one at a time. The XPath for one output value in the mapper (contained in an iteration group) might look like this:
$JDBC-Query/resultSet/Record[$index]/field1
If this expression is repeated for many fields, it is more efficient to create a variable at the top of the mapper input pane (and of the generated XSLT) that evaluates the repeated portion of the XPath ($JDBC-Query/resultSet/Record[$index]) and then use the variable in all subsequent XPath expressions, such as:
$var1/field1
$var1/field2
$var1/field3
The benefit of using variables within the mapper depends on how large the data is, how complex the expression is, and how many times it is repeated. The mapper GUI makes this easy to use: once you create the variable in the input pane, it appears on the left side with the appropriate schema representation to allow drag-and-drop mapping. In TIBCO BusinessWorks 5.2, you can also store the current iteration element in a process variable for faster access during iterations.
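A minimal sketch of the generated XSLT for this pattern, reusing the $JDBC-Query and $index names from the example above; the Output/Field element names are hypothetical:

  <!-- Evaluate the repeated portion of the XPath once per iteration -->
  <xsl:variable name="var1" select="$JDBC-Query/resultSet/Record[$index]"/>
  <Output>
    <!-- Each field lookup now starts from the cached Record node -->
    <Field1><xsl:value-of select="$var1/field1"/></Field1>
    <Field2><xsl:value-of select="$var1/field2"/></Field2>
    <Field3><xsl:value-of select="$var1/field3"/></Field3>
  </Output>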
3.6.2 Sorting and Grouping Data
A common mapping requirement that can result in very inefficient XSLT is sorting or grouping a list-style output by the value of some data in the input. For example, you may need to create a list from an input list where every item with a unique value is created on the output and duplicates are dropped, or a list in which every item with a matching itemType is grouped on the output. One way to do this is to use the preceding-sibling axis to check all previously processed items for a specific value before processing the current item; this gets progressively slower as the number of items grows.
The solution is to use for-each-group, an XSLT 2.0 statement supported in TIBCO BusinessWorks 5.1.2 and later for converting a list into a grouped list. It can achieve dramatic performance improvements for these use cases.
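A minimal sketch of the grouped-list pattern using xsl:for-each-group and current-group(). The itemType grouping key comes from the example above; the $Input/Items/Item structure and the output element names are hypothetical:

  <!-- Produce one ItemGroup per distinct itemType in the input list -->
  <xsl:for-each-group select="$Input/Items/Item" group-by="itemType">
    <ItemGroup type="{current-grouping-key()}">
      <!-- current-group() holds every Item that shares this itemType -->
      <xsl:for-each select="current-group()">
        <Item><xsl:value-of select="name"/></Item>
      </xsl:for-each>
    </ItemGroup>
  </xsl:for-each-group>

Unlike the preceding-sibling approach, each input item is examined once, so the cost grows linearly with the number of items.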
3.6.3 Large Document/Record Use Cases
Large documents or messages may arrive through the SOAP, HTTP, or Mail process starters. They can consume considerable memory and degrade the performance of the system. You can specify a threshold size for incoming documents and messages so that items exceeding the threshold are written to disk instead of being held in memory. Once a large document or message has been written to disk, the Read File activity can be used to obtain its contents when needed, bearing in mind that a large amount of I/O is involved.
To process large incoming documents:
• SOAP, HTTP, Email: use the process starter's streaming (write-to-file) option.
• Read the content later using Read File followed by Parse Data.
3.6.4 SOA Best Practices
These are the best practices for working with Service Oriented Architecture (SOA). BW 5.x supports designs in which a service and its implementation (the BW process) are isolated from each other. BW can derive a process skeleton framework (with no implementation) from a third-party WSDL (port, message, service) and, in the reverse direction, generate a WSDL from a plain process definition so that the process can be exposed as a service using the generated WSDL.
• Eliminate frequently repeated XML tags and elements.
• Consider an asynchronous invocation style.
• Use synchronous or fine-grained invocation for frequent back-and-forth communication between nodes.
• Keep in mind that exception handling and document passing requirements will increase, as in Universal Application Network (UAN).
• Carefully consider the pros and cons of compression techniques (binary compression such as gzip) and of reducing overly frequent validation.
• Investigate parsing technology: pull parsers (XPP3) and hardware-based solutions (Cisco AON, Sarvega, Reactivity).
• Reduce overly chatty protocol use, such as WS-Discovery.
4 Configurations
4.1 Configuring Persistent Connections
Persistent connections are created for each HTTP server that Send HTTP Request activities in process instances communicate with. Each HTTP client holds a persistent connection until the HTTP server sends the response message; the connection is then released by the client and returned to the pool. You can specify the maximum number of connections to create in the persistent connection pool, as well as the maximum number of persistent connections for each HTTP server. Connections for each HTTP server are created in the pool until the maximum is reached. When a Send HTTP Request activity requires a connection, the pool is searched for a connection to the corresponding HTTP server. If an unused connection to that server is found, it is used. If none is found, a new connection is created, provided the maximum pool size has not been reached. If the maximum number of connections for that particular server has already been reached, the request must wait for a connection to be released before it can use one.
bw.plugin.http.client.usePersistentConnectionManager This property specifies that a pool of HTTP connections to each HTTP server should be created so that connections can be reused by Send HTTP Request activities. Not all HTTP servers support persistent connections; refer to your HTTP server documentation for more information. When this property is set to true, a pool of connections is created for each HTTP server that the HTTP (SOAP/HTTP) Request-Reply activities connect to. The total number of connections in the pool is limited by the bw.plugin.http.client.maxTotalConnections property, and the number of connections for each host is limited by the bw.plugin.http.client.maxConnectionsPerHost property. The default value of this property is false.
bw.plugin.http.client.maxConnectionsPerHost The value of this property is ignored unless the bw.plugin.http.client.usePersistentConnectionManager property is set to true. This property specifies the maximum number of persistent connections to each remote HTTP server. The default value for this property is 20.
bw.plugin.http.client.maxTotalConnections The value of this property is ignored unless the bw.plugin.http.client.usePersistentConnectionManager property is set to true. This property specifies the maximum number of persistent connections to create for all HTTP servers. The default value for this property is 200.
bw.plugin.http.client.checkForStaleConnections The value of this property is ignored unless the bw.plugin.http.client.usePersistentConnectionManager property is set to true. When using persistent connections, a connection can become stale. When this property is set to true, a persistent connection is checked to determine if it is stale before it is used by an HTTP (SOAP/HTTP) Request-Reply activity. Checking for stale connections adds significant processing overhead, but it does improve reliability. The default value for this property is false.
bw.plugin.http.client.ResponseThreadPool The HTTP (SOAP/HTTP) client uses a thread pool for sending the HTTP messages. Each HTTP request is sent out in a separate thread in order to not keep the engine’s thread blocked while waiting for the response message. These threads are taken from a thread pool. Each HTTP (SOAP/HTTP) Request-Reply activity has its own thread pool. The thread pool’s size can be configured using this property. The default value for this property is 10.
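As an illustration only, a sketch of how these client-side entries might appear in the bwengine.tra (or deployment .tra) file, assuming the usual name=value property format; the values simply mirror the defaults described above, with the connection manager switched on:

  # Enable a persistent connection pool per HTTP server
  bw.plugin.http.client.usePersistentConnectionManager=true
  # Pool limits (documented defaults shown)
  bw.plugin.http.client.maxConnectionsPerHost=20
  bw.plugin.http.client.maxTotalConnections=200
  # Optional stale-connection checking (adds overhead, improves reliability)
  bw.plugin.http.client.checkForStaleConnections=false
  # Threads used per HTTP Request-Reply activity for response handling
  bw.plugin.http.client.ResponseThreadPool=10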
There are several public-domain connection-monitoring tools available. TIBCO Software recommends that, during performance testing, you monitor the progress of connections using one of these tools.
Configuration Considerations for Non-Persistent Connections
Many operating systems are, by default, tuned to a limited number of user ports. Generally, this parameter corresponds to the number of sockets that can be open at a time. For example, on Windows, MaxUserPort specifies the ephemeral port range (by default, Windows allows 1024-5000). Another important parameter is TcpTimedWaitDelay, which is commonly tuned down to 30 seconds; it determines the length of time that a connection stays in the TIME_WAIT state when being closed. While a connection is in the TIME_WAIT state, the socket pair cannot be reused. This is also known as the 2MSL state because the value should be twice the maximum segment lifetime on the network.
For scenarios where an HTTP server uses non-persistent connections (such as a web server prior to HTTP 1.1, or HTTP 1.1 without persistent connection support), tuning should be approached carefully on the Windows platform. For example, on Windows 2000 the following two parameters should be tuned with care:
MaxUserPort
Key: Tcpip\Parameters
Value Type: REG_DWORD - maximum port number
Valid Range: 5000-65534 (decimal)
Default: 0x1388 (5000 decimal)
This parameter controls the maximum port number used when an application requests any available user port from the system. Normally, short-lived ports are allocated in the range 1024 through 5000. Setting this parameter to a value outside the valid range causes the nearest valid value (5000 or 65534) to be used.
TcpTimedWaitDelay
Key: Tcpip\Parameters
Value Type: REG_DWORD - time in seconds
Valid Range: 30-300 (decimal)
Default: 0xF0 (240 decimal)
This parameter determines the length of time that a connection stays in the TIME_WAIT state when being closed. While a connection is in the TIME_WAIT state, the socket pair cannot be reused. See RFC 793 for further details.
Note that other operating systems may have similar limiting parameters, and you should consider tuning them when similar limitations are encountered.
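For reference, a sketch of where these values live in the registry; the full key path is given for convenience, and the example settings (the permissive end of the documented ranges) are illustrative only and should be validated against the relevant Windows documentation before use:

  Key: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
    MaxUserPort        REG_DWORD  65534   (raises the ephemeral port ceiling)
    TcpTimedWaitDelay  REG_DWORD  30      (shortens TIME_WAIT to the documented minimum)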
4.2 Configuring HTTP/JMS Servers
In some situations you may wish to alter the configuration of the HTTP server that receives incoming HTTP requests for TIBCO BusinessWorks. Properties such as bw.plugin.http.server.minProcessors and bw.plugin.http.server.maxProcessors control how incoming HTTP requests are handled. Increasing or reducing the processor count can itself cause performance problems, so the values should be optimized through a number of tests, taking the available hardware, processors, and memory into account. A connection may be refused if no resources are available to service an incoming request.
To change these values, add the appropriate properties to the bwengine.tra file. When you deploy your project using TIBCO Administrator, these properties automatically become part of the deployment configuration.
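For illustration only, a sketch of how such entries might look in bwengine.tra; the numbers are placeholders rather than recommendations, and the right values can only be determined by load testing on the target hardware:

  # Threads available to service incoming HTTP/SOAP requests (illustrative values)
  bw.plugin.http.server.minProcessors=10
  bw.plugin.http.server.maxProcessors=75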
4.3 Enabling SSL Configurations via BW
SSL can be enabled on transports such as JMS, HTTP, RV, SOAP, and FTP, in adapters, and at the process level. TIBCO Designer provides the tools to import the certificate and identity and make them available at runtime, preventing connections from unauthorized users and ensuring that the engine connects to an authentic system. Passing a username and password, or importing an identity or certificate, is what controls unauthorized access.
5 Global Variables
Global variables are used within the deployment process to maintain environment-specific values. Initially these will include values such as port numbers and subject names, but the set may be expanded based on project requirements, for example database or mainframe connectivity information. Having different port numbers and subject names per environment ensures that messages from a project in one environment are not inadvertently picked up by the same project in another environment. All global variables are listed in the bwengine.tra or "processengine.tra" file, so there is one central place where you can review the configuration parameters, which makes them easy to administer.
Issues: We strongly recommend defining a well-defined naming and usage convention for global variables across your project(s). Take the following issues into consideration before designing the "Global Variable" naming convention:
1. Once a global variable is set in the repository, it is loaded when the business process engine starts. BW currently provides no facility to modify global variables after the process engine has started.
2. Location independence/dependence: special care must be given to certain parameters such as RV subjects, the RVCM ledger file, and the HTTP port, because these may depend on the migration environment or machine.
3. Deployment does not yet allow "Global Variable" substitution.
4. Grouping related global variables simplifies deployment and maintenance of a process.
Process Variables
Although this concept is not available out of the box, the utility function introduced with "BWTemplate" provides a mechanism to specify process-specific variables. This service reads process-specific properties into memory at initialization time; these properties can then be used throughout the process using XPath functions. Two properties, PTY_LogLevel and PTY_PublishLogLevel, are crucial in defining how logging is done by the Logging service; they are initialized from the property file, or default to global variables if they are not defined in the property file. For more details, see TIBCO BusinessWorks Common Utility Services(1).
5.1 Grouping & Sub-grouping Rules
Grouping related global variables simplifies deployment and maintenance of a process.
When grouping data in the mapper, choose the repeating element in the activity input schema that holds the grouped data. For-Each-Group is a shortcut that creates a For-Each-Group statement with the repeating element as a child element and a Grouping statement to contain the element you wish to group by. Adding the Grouping statement creates the $=current-group() element in the Process Data area. The Grouping statement creates the list grouped by the desired element, and the current-group() function allows you to access the items in the repeating element (Requests, in this example) that correspond to the group currently being processed. In the reverse exercise, if a repeating element is received and the index is known, the element at that index of the repeating element can be determined using an XPath expression.
5.2 Naming Conventions
Any BW project should carefully define its global variables. In general, we recommend categorizing them and naming them accordingly: the naming convention should consist of a prefix, which provides the category, and a suffix, which conveys the meaning of the variable.
Naming Conventions for the Prefix Element
The configuration parameters used inside business processes for connectivity to external applications change frequently: when development hands the processes over to QA for testing, when QA moves from one system to another, or when developers move between systems for end-to-end testing. The prefix categories are:
• MIG_ (Migration Variables): variables that change from one environment to another (DEV, STAGE, PROD), such as ENV, username and, in some cases, the connection string. The migration script can simply look for any global variable starting with MIG_ and supply the property file corresponding to the environment.
• GLB_ (Global Across Projects): variables that are global across the project and do not vary between migration environments. They are typically used in processes.
• CFG_ (Deployment/Configuration Specific Variables): variables that are global across the project and do not vary between migration environments. They are typically used in adapter configuration and deployment.
• PTY_ (Process Level Property): not strictly global variables, but they can be set as global variables and managed as properties associated with a specific process. This allows finer granularity at the process level.
Naming Convention for the Suffix Element
Global variables should have unique names.
• Use leading upper case for resources (e.g. ProductNameType), including the first letter.
• The name should not contain spaces.
• The first letter must be alphabetical and upper case.
By default, all names should follow the <Suffix> convention. If the referenced table contains a required field and the resource context information is available, the developer should use those fields. For example, if [Source] [Target] is required and only [Source] is available, the developer should use [Source] in the respective area. [Source] and [Target] can be an application name such as ERP or CRM, or a transport such as "JMS" or "RV"; however, a transport name should only be used when the application detail is not available.
• The name should describe the functionality of the component; valid examples are starter, main, validation, and transform.
• The name should uniquely identify the project in the domain.
• The first letter of each word in the project name should be capitalized.
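To make the convention concrete, a few illustrative variable names (PTY_LogLevel comes from the utility services discussed earlier; the others are hypothetical examples rather than prescribed names):

  MIG_DBConnectionURL          - environment-specific JDBC connection string
  MIG_JMSProviderURL           - environment-specific JMS/EMS server URL
  GLB_CompanyName              - constant value shared across all environments
  CFG_FileAdapterPollInterval  - adapter/deployment configuration value
  PTY_LogLevel                 - process-level logging property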
The purpose of this "File System" structure is to provide a uniform, referenceable location for external resources such as property files, logs, input/output data, static maps, schemas, unit tests, and message correction files.
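As a purely illustrative sketch (the directory names are hypothetical and should be adapted to the project's own conventions), such a structure might look like:

  /<ProjectName>/properties   - property files
  /<ProjectName>/logs         - application and process logs
  /<ProjectName>/data/in      - inbound data files
  /<ProjectName>/data/out     - outbound data files
  /<ProjectName>/maps         - static maps
  /<ProjectName>/schemas      - XSD and WSDL schemas
  /<ProjectName>/test         - unit test data
  /<ProjectName>/correction   - message correction files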
6 Logging
6.1 Logging Rules
• Application-specific successes and failures should be logged.
• System-specific events (startup, shutdown, errors) should be logged.
• Process-specific logs should be written for each process instance.
• Follow the agreed logging format.
• Avoid extensive logging, as it hinders performance.
6.2 Logging Parameters (Details, frequency, etc.)
Logging needs to cover the application process name, timestamp, stack trace, activity, inbound data, outbound data, failure code, failure description, success code and description, and so on. Each major activity, or group of activities, should be preceded and followed by a log entry. If the inbound or outbound data is very large, only a smaller portion of it should be logged, because high I/O is always associated with logging.
Logging for Third-Party Components
TIBCO BusinessWorks can use a variety of third-party components. For example, the Apache Tomcat server is used to accept incoming HTTP or SOAP requests, and the Arjuna Transaction Service can be used as a transaction manager. Many third-party components use the standard log4j logging services. TIBCO BusinessWorks provides the bw//lib/log4j.properties file for configuring logging services for third-party components. The properties defined in bw//lib/log4j.properties are required by the components used by TIBCO BusinessWorks; the supplied file contains comments describing property usage. You can alter the properties in this file to configure logging for your environment, but do not remove any required properties. It is a good idea to create a backup copy of log4j.properties before altering it, so that you can return to the original configuration if your changes result in errors. There can be only one log4j.properties file per Java VM. If you wish to use properties from a different log4j.properties file, either add the properties to bw//lib/log4j.properties or alter the bwengine.tra file to point to the location of your own log4j.properties file. If you use your own log4j.properties file, it must include all of the required properties from the file supplied with TIBCO BusinessWorks.
7 Utility Services
Utility services cover the most granular processes and are utilised/referenced from the main business processes. Example: a common query interface used by different business processes within a project.
8 Exception, Audit & Debug
Errors or faults can be categorized mainly into internal exceptions, system exceptions, business exceptions, and component exceptions. Each exception is maintained with a code and description in an XML flat file or in a database. The exception, the process ID that caused it, and the timestamp at which it was generated can then be traced and monitored. A utility service can be designed to report errors (system, application, activity, or component related) or exceptions to an audit interface, which typically mails and stores the details so that an audit report can be generated. Fault messages can be audited and, depending on requirements, remediated further. TIBCO Hawk and its microagents can monitor the processes and the generated logs, report exceptions as a mail or alert, or restart a failed job.