Managing Grid and Web Services and Their Exchanged Messages
OGF19 Workshop on Reliability and Robustness, Friday Center, Chapel Hill, NC, January 31, 2007
Authors: Harshawardhan Gadgil (his PhD topic), Geoffrey Fox, Shrideep Pallickara, Marlon Pierce
Community Grids Lab, Indiana University
Presented by Geoffrey Fox
Management Problem I
Characteristics of today’s (Grid) applications:
– Increasing complexity
– Components widely dispersed and disparate in nature and access
  • Span different administrative domains, under differing network / security policies
  • Limited access to resources due to the presence of firewalls, NATs, etc. (a major focus in the prototype)
– Dynamic: components (nodes, network, processes) may fail
Services must meet:
– General QoS and life-cycle features
– (User-defined) application-specific criteria
Need to “manage” services to provide these capabilities:
– Dynamic monitoring and recovery
– Static configuration and composition of systems from subsystems
Management Problem II
Management Operations* include:
– Configuration and lifecycle operations (CREATE, DELETE)
– Handling RUNTIME events
– Monitoring status and performance
– Maintaining system state (according to user-defined criteria)
Protocols like WS-Management / WS-DM define inter-service negotiation and how to transfer metadata.
We are designing/prototyping a system that will manage a general worldwide collection of services and their network links
– Need to address fault tolerance, scalability, performance, interoperability, generality, usability
We are starting with our messaging infrastructure because:
– we need it to be robust in the Grids we are using it in (sensor and material science)
– we are using it in the management system itself
– it has critical network requirements

* From WS-Distributed Management, http://devresource.hp.com/drc/slide_presentations/wsdm/index.jsp
Core Features of Management Architecture
Remote Management
– Allow management irrespective of the location of the resource (as long as that resource is reachable via some means)
Traversal of firewalls and NATs
– Firewalls complicate management by disabling access to some transports and to internal resources
– Utilize the tunneling capabilities and multi-protocol support of the messaging infrastructure
Extensible
– Management capabilities evolve with time; we use a service-oriented architecture to provide extensibility and interoperability
Scalable
– The management architecture should scale as the number of managees increases
Fault-tolerant
– Management itself must be fault-tolerant: failure of transports OR management components should not cause the management architecture to fail
Management Architecture built in terms of:
Hierarchical Bootstrap System – itself made robust by replication
– Managees in different domains can be managed with separate policies for each domain
– Periodically spawns a System Health Check that ensures components are up and running
Registry for metadata (a distributed database) – made robust by standard database techniques, and by our system itself for its service interfaces
– Stores managee-specific information (user-defined configuration / policies, and external state required to properly manage a managee)
– Generates a unique ID per instance of registered component (sketch below)
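As a rough illustration of the registry's ID generation (the slides do not show the actual implementation, so this is a minimal sketch under our own assumptions), a Java class issuing monotonically increasing instance IDs:

```java
import java.util.concurrent.atomic.AtomicLong;

// Minimal sketch (our assumption, not the registry's actual code): issuing
// a monotonically increasing Instance ID (IID) per registered component.
// A production registry would persist the counter (e.g. a database sequence)
// so that IIDs keep increasing across registry restarts.
final class InstanceIdIssuer {
    private final AtomicLong counter = new AtomicLong(0);

    /** Returns a fresh IID; later registrations always receive larger IDs. */
    long nextInstanceId() {
        return counter.incrementAndGet();
    }
}
```

The monotonicity is what the consistency scheme later in the deck relies on: a component registered later is guaranteed a larger IID.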
Architecture: Scalability via Hierarchical Distribution
[Figure: a replicated ROOT bootstrap node with regional children (EUROPE, US) and leaf domains such as CGL, FSU and CARDIFF (e.g. /ROOT/EUROPE/CARDIFF); each node spawns missing children and ensures they are up and running]
Passive bootstrap nodes
• Only ensure that all child bootstrap nodes are always up and running
Active bootstrap nodes (e.g. /ROOT/EUROPE/CARDIFF)
• Responsible for maintaining a working set of management components in the domain
• Always the leaf nodes in the hierarchy
Management Architecture built in terms of:
Messaging Nodes – form a scalable messaging substrate
– Provide message delivery between managers and managees
– Provide transport-protocol-independent messaging between distributed entities
– Can provide secure delivery of messages
Managers – active stateless agents that manage managees
– Both general and managee-specific management threads perform the actual management
– Multi-threaded to improve scalability with many managees
Managees – what you are managing (the managee / service to manage) – our system makes these robust
– There is NO assumption that the managed system uses messaging nodes
– Wrapped by a Service Adapter which provides a Web Service interface
– It is assumed that ONLY modest state needs to be stored/restored externally; the managee could front-end ...
Architecture: Conceptual Idea (Internals)
[Figure: several Managers and several managees (each Resource to Manage wrapped by a Service Adapter) connect to a Messaging Node for sending and receiving messages; Managers speak WS-Management to the Service Adapters. A Bootstrap Service periodically spawns a System Health Check, which ensures that the Registry, the Messaging Node and the Managers are always up and running. The user writes the system configuration to the Registry.]
– Manager processes periodically check for available managees to manage
– Managers also read/write managee-specific external state from/to the Registry
Architecture: User Component
“Managee characteristics” are determined by the user
– Events generated by the managees are handled by the manager
– Event processing is determined via WS-Policy constructs
  • E.g. wait for the user’s decision on handling specific conditions
  • “Auto-instantiate” a failed service – but the service responsible for doing this must act consistently even when the failed service is not failed but merely unreachable
Administrators can set up services (managees) by defining their characteristics and writing this information to the registry
Issues in the Distributed System: Consistency
Examples of inconsistent behavior:
– Two or more managers managing the same managee
– Old messages / requests arriving after new requests
– Multiple copies of a managee existing at the same time / orphan managees, leading to inconsistent system state
Use a Registry-generated, monotonically increasing Unique Instance ID (IID) to distinguish between new and old instances of Managers, Managees and Messages
– Requests from manager thread A are considered obsolete IF IID(A) < IID(B)
– The Service Adapter stores the last known MessageID (IID:seqNo), allowing it to differentiate between duplicate AND obsolete messages
– Periodic renewal with the registry: IF IID(manageeInstance_1) < IID(manageeInstance_2) THEN manageeInstance_1 is deemed OBSOLETE, SO EXECUTE policy (e.g. instruct manageeInstance_1 to silently shut down) – see the filtering sketch below
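A minimal Java sketch of the IID:seqNo filtering described above, on the service-adapter side. The class names are ours, not the prototype's (which targeted JDK 1.4; Long.compare below requires Java 7+):

```java
// Illustrative sketch of duplicate/obsolete request filtering using
// registry-issued, monotonically increasing instance IDs (IIDs).
public final class MessageId implements Comparable<MessageId> {
    final long iid;    // registry-issued Instance ID of the sending manager
    final long seqNo;  // per-instance message sequence number

    MessageId(long iid, long seqNo) { this.iid = iid; this.seqNo = seqNo; }

    public int compareTo(MessageId other) {
        if (iid != other.iid) return Long.compare(iid, other.iid);
        return Long.compare(seqNo, other.seqNo);
    }
}

class ServiceAdapterFilter {
    private MessageId lastSeen = new MessageId(0, 0);

    /** Returns true if the incoming request should be processed. */
    synchronized boolean accept(MessageId incoming) {
        int cmp = incoming.compareTo(lastSeen);
        if (cmp == 0) return false;   // duplicate: same IID and seqNo already seen
        if (cmp < 0) return false;    // obsolete: older manager instance, or a replay
        lastSeen = incoming;          // newer message: remember it and accept
        return true;
    }
}
```

Because IIDs only increase, any request carrying a smaller IID than the last one seen must come from a superseded manager instance and can be safely rejected.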
Issues in the Distributed System: Security
Security – provide secure communication between communicating parties (e.g. Manager <-> Managee)
– Publish/subscribe: provenance, lifetime, unique topics
– Secure discovery of endpoints
– Prevent unauthorized users from accessing the Managers or Managees
– Prevent malicious users from modifying messages (thus message interactions are secure when passing through insecure intermediaries)
Utilize NaradaBrokering’s Topic Creation and Discovery* and Security Scheme#

* NB-Topic Creation and Discovery (Grid2005): http://grids.ucs.indiana.edu/ptliupages/publications/NB-TopicDiscovery-IJHPC
# NB-Security (Grid2006)
Implemented: WS Specifications
– WS-Management (June 2005), in part: WS-Transfer [Sep 2004], WS-Enumeration [Sep 2004] and WS-Eventing (could have used WS-DM instead)
– WS-Eventing (leveraged from the WS-Eventing capability implemented in OMII)
– WS-Addressing [Aug 2004] and SOAP v1.2 (needed for WS-Management)
Used XmlBeans 2.0.0 for manipulating XML in a custom container.
Currently implemented using JDK 1.4.2 but will switch to JDK 1.5. Released on http://www.naradabrokering.org in February 2007.
Performance Evaluation Results
[Figure: response time vs. number of concurrent managees]
– Extreme case with many catastrophic failures
– Response time increases with an increasing number of concurrent requests
– Response time is MANAGEE-DEPENDENT; the times shown are typical
– MAY involve one or more registry accesses, which increase the overall response time
– Increases rapidly as the number of managees exceeds roughly 150–200
Performance Evaluation: How much infrastructure is required to manage N managees?
– N = number of managees to manage
– M = max. number of entities connected to a single messaging node
– D = max. number of managees managed by a single manager process
– R = min. number of registry service instances required to provide fault tolerance
Assume every leaf domain has 1 messaging node; hence we have N/M leaf domains. Further, the number of managers required per leaf domain is M/D.
Total components in the lowest level
  = (R registries + 1 bootstrap service + 1 messaging node + M/D managers) * (N/M leaf domains)
  = (2 + R + M/D) * (N/M)
Thus, dividing by N, the percentage of additional infrastructure is
  = [(2 + R)/M + 1/D] * 100%
Performance Evaluation Research Question: How much infrastructure is required to manage N managees?
Additional infrastructure = [(2 + R)/M + 1/D] * 100%
A few cases (a numerical check follows below):
– Typical values of D and M are 200 and 800; assuming R = 4:
  Additional infrastructure = [(2+4)/800 + 1/200] * 100% ≈ 1.2%
– Shared registry => one registry interface per domain, R = 1:
  Additional infrastructure = [(2+1)/800 + 1/200] * 100% ≈ 0.87%
– If NO messaging node is used (assume D = 200):
  Additional infrastructure = [(R registries + 1 bootstrap node + N/D managers)/N] * 100%
  = [(1 + R)/N + 1/D] * 100% ≈ 100/D % (for N >> R) ≈ 0.5%
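A quick numerical check of the slide's formula, as a small Java sketch (our illustration, not part of the prototype):

```java
// Evaluates the additional-infrastructure formula from this slide:
// overhead% = ((2 + R)/M + 1/D) * 100
public class InfrastructureCost {
    static double overheadPercent(double r, double m, double d) {
        return ((2 + r) / m + 1.0 / d) * 100.0;
    }

    public static void main(String[] args) {
        System.out.println(overheadPercent(4, 800, 200)); // 1.25  (slide: ≈ 1.2 %)
        System.out.println(overheadPercent(1, 800, 200)); // 0.875 (slide: ≈ 0.87 %)
    }
}
```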
Performance Evaluation Research Question: How does cost vary with max. managees per manager (D)?
[Chart reconstructed as a table; for R = 1, M = 800]

  D (max. managees per manager):    1      3     5     7    10    25   50   100  150  200
  Additional infrastructure (%): 100.4  33.7  20.4  14.7  10.4  4.4  2.4  1.4  1.0  0.9
Performance Evaluation: XML Processing Overhead
XML processing overhead is measured as the total marshalling and un-marshalling time required.
– For broker management interactions, the typical processing time (including validation against schema) is ≈ 5 ms
– Broker management operations are invoked only during initialization and recovery from failure
– Reading broker state using a GET operation involves the ≈ 5 ms overhead and is invoked periodically (e.g. every 1 minute, depending on policy)
– Further, for most operations dealing with changing broker state, the actual operation processing time >> 5 ms, hence the 5 ms XML overhead is acceptable
Prototype: Managing Grid Messaging Middleware
We illustrate the architecture by managing our distributed messaging middleware, NaradaBrokering
– This example is motivated by the presence of a large number of dynamic peers (brokers) that need configuration and deployment in specific topologies
– Runtime metrics provide dynamic hints on improving routing, which leads to redeployment of the messaging system, possibly using a different configuration and topology
– Can use (dynamically) optimized protocols (UDP vs. TCP vs. Parallel TCP) and go through firewalls
Broker Service Adapter (interface sketch below)
– Note that NB illustrates an electronic entity that didn’t start off with an administrative Web Service interface
– So we add a wrapper over the basic NB BrokerNode object that provides a WS-Management front-end
– Allows CREATION, CONFIGURATION and MODIFICATION of brokers and broker topologies
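To make the wrapper concrete, here is a hypothetical Java view of the Broker Service Adapter's management surface. The operation names mirror the cost table later in the deck, but the signatures are our assumption, not NaradaBrokering's actual API:

```java
import java.util.Properties;

// Hypothetical Broker Service Adapter interface (names/signatures assumed).
// A WS-Management front-end would map these operations onto WS-Transfer verbs.
interface BrokerManagementAdapter {
    void setConfiguration(Properties config);                      // "Set Configuration"
    void createBroker();                                           // "Create Broker"
    void createLink(String destinationAddress, String transport);  // "Create Link", e.g. "tcp" or "ssl"
    void deleteLink(String destinationAddress);                    // "Delete Link"
    void deleteBroker();                                           // "Delete Broker"
    Properties getBrokerState();                                   // periodic GET of broker state
}
```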
Messaging (NaradaBrokering) Architecture
[Figure: a messaging substrate connecting heterogeneous endpoints – compute servers (one behind a firewall), a file server, media devices and a media server, an audio/video conferencing client, and user devices such as a laptop, a PDA, and a slow client behind a modem]
Typical use of Grid Messaging in NASA
[Figure: a Sensor Grid implemented using NB, with NB brokers linking a Datamining Grid and a GIS Grid (CLADE 2006, June 19, 2006, Community Grids Lab, Bloomington IN)]
NaradaBrokering Management Needs
The NaradaBrokering distributed messaging system consists of peers (brokers) that collectively form a scalable messaging substrate. Optimizations and configurations include:
– Where should brokers be placed and how should they be connected? E.g. RING, BUS, TREE, HYPERCUBE, etc.; each TOPOLOGY has a varying degree of resource utilization, routing cost and fault-tolerance characteristics (a ring-deployment sketch follows below)
– Static topologies, or topologies created using static rules, may be inefficient in some cases. E.g. in CAN and Chord a new incoming peer joins nodes in the network randomly; network distances are not taken into account, so some lookup queries may span the entire diameter of the network
– Runtime metrics provide dynamic hints on improving routing, which leads to redeployment of the messaging system, possibly using a different configuration and topology
– Can use (dynamically) optimized protocols (UDP vs. TCP vs. Parallel TCP) and go through firewalls, but there is no good way to make these choices dynamically
These actions are collectively termed “Managing the Messaging”
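For instance, deploying a RING topology (one outgoing link per node, as in the recovery analysis later) could look like the following sketch, reusing the hypothetical BrokerManagementAdapter interface from earlier; broker addresses are illustrative:

```java
// Sketch: deploy a RING topology of n brokers through the management layer.
class RingDeployer {
    static void deployRing(BrokerManagementAdapter[] adapters, String[] addresses) {
        for (BrokerManagementAdapter adapter : adapters) {
            adapter.createBroker();
        }
        // One outgoing link per node: broker i -> broker (i+1) mod n,
        // giving N nodes and N links as in the Ring recovery case later.
        int n = adapters.length;
        for (int i = 0; i < n; i++) {
            adapters[i].createLink(addresses[(i + 1) % n], "tcp");
        }
    }
}
```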
Prototype: Costs (individual managees are NaradaBrokering brokers)

Time (msec, average values)
Operation           Un-initialized (first time)   Initialized (later modifications)
Set Configuration   778 ± 5                        33 ± 3
Create Broker       610 ± 6                        57 ± 2
Create Link         160 ± 2                        27 ± 2
Delete Link         104 ± 2                        20 ± 1
Delete Broker       142 ± 1                       129 ± 2
Recovery: Typical Time

Topology   Managee-specific configuration entries
Ring       N nodes, N links (1 outgoing link per node); 2 managee objects per node
Cluster    N nodes, links per broker varying from 0 to 3; 1 to 4 managee objects per node

Recovery Time = T(read state from registry) + T(bring managee up to speed)
              = T(read state) + T(SetConfig + CreateBroker + CreateLink(s))

Assuming a 5 ms registry read time per managee object (first link at the un-initialized cost, later links at the initialized cost):
– Typical (Ring): 10 + (778.0 + 610.1 + 160.5) ≈ 1559 msec
– Min:            5 + 778.0 + 610.1 ≈ 1393 msec
– Max:            20 + 778.0 + 610.1 + 160.5 + 2 × 26.67 ≈ 1622 msec
Prototype: Observed Recovery Cost per Managee

Operation                       Average (msec)
*Spawn Process                  2362 ± 18
Read State                      8 ± 1
Restore (1 Broker + 1 Link)     1421 ± 9
Restore (1 Broker + 3 Links)    1616 ± 82

– Time for Create Broker depends on the number and type of transports opened by the broker; e.g. an SSL transport requires negotiation of keys and hence more time than simply establishing a TCP connection
– If brokers connect to other brokers, the destination broker MUST be ready to accept connections, else topology recovery takes more time (see the retry sketch below)
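Since a link cannot be created until the destination broker is up, a recovery manager would plausibly retry link creation with backoff. A speculative sketch under that assumption, again using the hypothetical adapter interface (we assume createLink throws a RuntimeException while the destination is down):

```java
// Speculative sketch: retry link creation until the destination broker
// is ready to accept connections, rather than failing the whole recovery.
class LinkRecovery {
    static void createLinkWithRetry(BrokerManagementAdapter adapter,
                                    String destination, String transport)
            throws InterruptedException {
        final int maxAttempts = 10;
        long backoffMillis = 250;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                adapter.createLink(destination, transport);
                return;                                   // link established
            } catch (RuntimeException destinationNotReady) {
                if (attempt == maxAttempts) throw destinationNotReady;
                Thread.sleep(backoffMillis);              // wait for the destination broker
                backoffMillis = Math.min(backoffMillis * 2, 4000); // capped exponential backoff
            }
        }
    }
}
```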
Management Console: Creating Nodes and Setting Properties
Management Console: Creating Links
Management Console: Policies
Management Console: Creating Topologies
Conclusion
We have presented a scalable, fault-tolerant management framework that:
– Adds acceptable cost in terms of the extra resources required (about 1%)
– Provides a general framework for the management of distributed entities
– Is compatible with existing Web Service specifications
We have applied our framework to manage managees that are loosely coupled and have modest external state (important to improve the scalability of the management process).
An outside effort is developing a Grid Builder, which combines BPEL and this management system to manage initial specification, ...