ITCAM Implementation Case Study at Depository Trust and Clearing Corp. by Jason Meiers, Chief Technology Officer of CAM Solutions
ITCAM at DTCC Depository Trust and Clearing Corporation
Agenda / Table of contents
IBM Tivoli Composite Application Manager for J2EE @ DTCC
1. Welcome and scope of implementation
2. Depository Trust and Clearing Corporation & CAM
3. Deployment, Configuration, Challenges & Solutions
4. ITCAM application management tools and benefits
5. Monitoring-as-a-Service(TM)
Why Application Management?
Welcome
Scope of implementation
• Hundreds of application servers in development, staging and production environments
• 300+ developers writing code, deploying applications and restarting application servers
• Transaction volumes over $1 trillion
• High-value, high-volume transactions for risk, fixed income, stocks and bonds clearing
• Distributed systems and management
• Highly secure access to the building as well as to development, staging and production systems
IBM Tivoli Composite Application Management at DTCC
Challenges and solutions for DTCC Application Management
• Challenge: “Flying blind” when the application is having problems in production
  Solution: Heap dump and thread analysis for composite applications
• Challenge: Application-specific problem determination and isolation of production incidents to the root cause
  Solution: Application management down to the method level within the code of the application, with correlation across multiple servers and applications
• Challenge: Enterprise-wide application management using SSH
  Solution: Provide problem determination enterprise-wide from one console
• Challenge: Reactive “fire-fighting” production support
  Solution: Trend analysis for pre-emptive monitoring of JVM memory, CPU and threads
• Challenge: Massive deployment across hundreds of application servers
  Solution: Provisioning and automation of data collector deployment
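The trend-analysis idea above can be illustrated with a minimal sketch. It is not how ITCAM computes trends. It simply fits a least-squares line through JVM heap samples and projects when a heap limit would be reached. The function names, the sample values and the 1024 MB limit are all hypothetical.

```python
from typing import List, Optional, Tuple

Sample = Tuple[float, float]  # (minutes since start, heap used in MB)


def linear_trend(samples: List[Sample]) -> Tuple[float, float]:
    """Least-squares fit heap = slope * t + intercept over the samples."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must cover more than one point in time")
    cov = sum((t - mean_t) * (y - mean_y) for t, y in samples)
    slope = cov / var_t
    return slope, mean_y - slope * mean_t


def minutes_until_limit(samples: List[Sample], limit_mb: float) -> Optional[float]:
    """Minutes after the last sample until the fitted line reaches limit_mb.

    Returns None when heap usage is flat or falling, i.e. no projected breach.
    """
    slope, intercept = linear_trend(samples)
    if slope <= 0:
        return None
    time_of_breach = (limit_mb - intercept) / slope
    return max(0.0, time_of_breach - samples[-1][0])


# Hypothetical heap readings taken every 10 minutes (values are made up).
heap_samples = [(0, 400.0), (10, 450.0), (20, 500.0), (30, 550.0)]
eta = minutes_until_limit(heap_samples, limit_mb=1024.0)
print(f"Projected minutes until heap limit: {eta:.1f}")
```

A pre-emptive alert could fire when the projected time drops below an agreed lead time, instead of waiting for an out-of-memory incident.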
Implementation Support
Application Management support goes beyond install and configuration
Recommendation
“Application monitoring as a service for our enterprise applications is essential, especially when working in a highly secure environment for financial transactions. It is good to have someone on your side who knows from experience how to make IT Service Management/ITIL work. CAM Solutions is a valuable asset to anyplace they provide consulting services”
Depository Trust and Clearing Corporation, August 13, 2008
Planning and Architecture
Ports, Servers, Applications, Scheduling and Security
Architecture
• Discovery of existing application servers and IT Operations environment
• Understand application-specific architecture, e.g. EJB, Web Services, SAP, Mainframe
• Communication protocols and connection points
• Server IP addresses and subnets
• Visual implementation of ITCAM
Planning
• Integration points with other groups and monitoring tools, e.g. SNMP or TEC events
• Alerts and thresholds required by the operators, business and service levels
• Sizing of the ITCAM infrastructure and trend database
Deployment
• Scheduling deployment times with the operations team
• Development of automation processes and documentation
• Successful verification of deployment in production
• Hand-off and training
Architecture
Application infrastructure for financial composite applications
• Discover existing applications
• Identify firewalls and gateways for change control process
• Build architecture diagram for logical ITCAM implementation
Planning
Scheduling, Sizing, Alerts and Thresholds
Scheduling
• What is the best time to deploy, restart and test application servers in development, staging and production?
• Who is going to be affected by this change and who is to be notified?
Sizing
• How many application servers are in our environment?
• What type of monitoring data would you like to collect?
• Number of end users of the application management console
• Integration into other monitoring systems?
Alerts and Thresholds
• What type of metrics are to be monitored, e.g. AppServer up/down, JVM heap sizes, response times for transactions?
• Who is alerted for a specific threshold violation?
• What is the operator response for this event?
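The answers to the alert and threshold questions above can be captured as data. The sketch below is illustrative only and does not use ITCAM configuration syntax. The metric names, limits, notification groups and operator responses are hypothetical placeholders for whatever the planning sessions decide.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Threshold:
    metric: str                        # what is monitored
    violated: Callable[[float], bool]  # when the threshold is breached
    notify: str                        # who is alerted
    response: str                      # expected operator response


# Hypothetical rules; real limits come from operators, business and service levels.
THRESHOLDS = [
    Threshold("appserver_up", lambda v: v < 1,
              "operations-oncall", "restart or fail over the application server"),
    Threshold("jvm_heap_used_pct", lambda v: v > 85.0,
              "application-support", "capture a heap dump and review the memory trend"),
    Threshold("response_time_ms", lambda v: v > 2000.0,
              "application-support", "inspect in-flight transactions"),
]


def evaluate(sample: Dict[str, float]) -> List[str]:
    """Return one alert line per threshold violated by this metric sample."""
    alerts = []
    for rule in THRESHOLDS:
        value = sample.get(rule.metric)
        if value is not None and rule.violated(value):
            alerts.append(f"{rule.metric}={value} -> notify {rule.notify}: {rule.response}")
    return alerts


# Made-up sample: server is up, heap is high, response time is fine.
alerts = evaluate({"appserver_up": 1, "jvm_heap_used_pct": 91.0, "response_time_ms": 350.0})
for line in alerts:
    print(line)
```

Keeping the metric, the recipient and the response in one record makes gaps visible, such as a threshold nobody is assigned to act on.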
Deployment
Documentation and Roll-out
The deployment and roll-out process was documented specifically for the Depository Trust and Clearing Corporation
• Each systems management deployment step was documented for reproduction purposes
• Known issues were documented with solutions in case of recurrence
The roll-out into production was tested in all critical stages, including development and staging
• Scheduling was a challenge even in development, due to the high number of developers accessing the systems
• Zero downtime was expected, so failover of the application servers needed to be included in the process
ITCAM Best Practices Application Management Tools
Application Management Example
1. Developer creates a method for risk calculation
2. Customer service has received an alert that a risk customer has an issue
3. Application analyst sees there is a hanging risk transaction from the customer
4. Application analyst selects the customer request for more details
ITCAM In-flight Analysis
Tools for problem determination at the method level, in real time
IT Operations Events & Applications
Problems, Known Issues and Solutions
Application Events
• Depending on your infrastructure and enterprise applications, the majority of alerts come from the application
• Without application management, production incidents are not fed back to development to be fixed
• The ITIL service support standard is to identify known issues and add incident fixes to change management and release
Event sources shown: Network, Systems, Database, Security, Application
Application Management Events
From unqualified alerts to qualified events
Unqualified Alerts
• Single point where events are received
• All events are accepted at the top of the funnel
Filter
• Development of known issues and improved problem determination
• Correlation of incidents and events
Qualified Alerts
• Quality events with real IT Management value
• More sleep and fewer pager calls
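The funnel above can be sketched in a few lines. This is an illustration under stated assumptions, not the filtering logic of ITCAM or TEC. The event fields, the known-issue list and the 300-second correlation window are hypothetical. Events matching a known issue are dropped, and repeats of the same symptom for the same application, on any server, are correlated into the first event of a burst.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Event:
    timestamp: float  # seconds since some reference point
    server: str
    application: str
    symptom: str


Key = Tuple[str, str]  # (application, symptom)


def qualify(events: Iterable[Event], known_issues: Set[Key],
            window_s: float = 300.0) -> List[Event]:
    """Reduce unqualified alerts to qualified events.

    1. Filter: drop events that match a documented known issue.
    2. Correlate: suppress an event when the same (application, symptom)
       was seen, on any server, within window_s seconds before it.
    """
    last_seen: Dict[Key, float] = {}
    qualified: List[Event] = []
    for ev in sorted(events, key=lambda e: e.timestamp):
        key = (ev.application, ev.symptom)
        if key in known_issues:
            continue
        previous = last_seen.get(key)
        last_seen[key] = ev.timestamp  # suppressed repeats extend the burst
        if previous is not None and ev.timestamp - previous <= window_s:
            continue
        qualified.append(ev)
    return qualified


# Made-up event stream for a hypothetical "riskapp" application.
KNOWN_ISSUES = {("riskapp", "nightly-batch-slow")}
raw_events = [
    Event(0, "srv1", "riskapp", "hung-thread"),
    Event(60, "srv2", "riskapp", "hung-thread"),        # correlated with the first
    Event(90, "srv1", "riskapp", "nightly-batch-slow"),  # known issue, filtered
    Event(1000, "srv1", "riskapp", "hung-thread"),       # new burst
]
qualified_events = qualify(raw_events, KNOWN_ISSUES)
print([(e.timestamp, e.server) for e in qualified_events])
```

Each incident that gets a documented fix adds an entry to the known-issue list, so the share of qualified events should rise over time.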
CAM Solutions Strategy & Value Add
From managing systems to managing services
Service & Value
• Linking of IT to business metrics
• Service Level Management
• Capacity Management
Proactive
• Availability Management
• Problem Management
• Change Management
• Configuration Management
Reactive
• Tactical fire-fighting up/down
• Service Desk
• Release Management
Chaotic
• Multiple Help Desks
• Minimal standards
• User call-driven
• Incident Management
Diagram labels: Service Management, Systems Management
J2EE & WebSphere Best Practices
Application management usage guide for composite applications
• Installation, Configuration, and Best Practices
• Distributed and Mainframe examples
• Tools ITCAM provides for application management
Old School vs. Application Mgmt Today
Beyond log file analysis
Comparison table (columns: Application Management, New, Old) covering:
• Log file parsing
• Heap Analysis
• Thread Analysis
• Method Level Monitoring
• Monitoring-on-Demand (TM)
• Monitoring-as-a-Service (TM)
• Custom JMX Applications
• Built-in JMX Services
• Software Consistency Check
• In-Flight Transaction Monitoring
• Enterprise System Resource Dashboard
• Dashboard
Monitoring-as-a-Service (TM)
Gartner Says 25 Percent of New Business Software is SaaS
25% Enterprise Software to be SaaS
ITCAM as a Service
• CAM Solutions provides ITCAM as a Service
• Data Center SAS 70 certified
• Save implementation cost and management
Log file to application management
Workflow process to get to enterprise application management
Step 1: Logfile scraping
Step 2: Contact IBM Tivoli
Step 3: Participate in TUG
Step 4: Live the dream
Step 5: SME Deployment
The Composite Application Management Solution
“Where rubber meets the road”
• ITCAM for WebSphere, ITCAM for RTT, ITM for Applications
• IBM System x, IBM System p, IBM System z, Blade Center
• CAM Solutions, IBM
• ITIL, ITSM
Do You Have Any Questions? We would be happy to help.