Thesis Proposal
Hierarchically-Synthesized Network Services

An-Cheng Huang
Computer Science Department
Carnegie Mellon University
[email protected]
Abstract

Most existing network services are vertically integrated by service providers and delivered to users as-is. Such an approach is inflexible because it is difficult to customize a service to satisfy each user's unique requirements, and the resulting services are often restricted to certain operational environments due to decisions made at the design/implementation stage. We propose a hierarchically-synthesized service model: a synthesizer dynamically composes a service instance at runtime according to each user's specific requirements and the runtime network characteristics. A service provider only needs to implement the service-specific part of the synthesizer since the generic supporting infrastructure can be shared by all providers. In addition, services can be composed hierarchically using lower-level components. Therefore, service development and deployment costs can be greatly reduced, and the resulting services can be more efficient and deliver better user-perceived quality because the runtime conditions are taken into consideration. We have implemented two prototype services and performed a preliminary evaluation of the development cost, synthesizer performance, and quality of composed services through implementation experience and experiments. The proposed work includes two parts, which are outlined in this proposal. First, we will design and implement a generic synthesizer architecture that allows service providers to specify their service-specific knowledge using a generic representation. Second, we will develop optimization techniques used by a synthesizer to produce better composed services, optimize component binding, and adapt composed service instances to runtime environment changes.
1 Introduction

The use of network applications has evolved from academic/research activities to part of everyone's everyday life. More and more people are using web browsers, FTP, video conferencing, instant messaging, file sharing, on-line gaming, and so on. From a user's perspective, these applications provide access to services offered by service providers through the network. For example, when a user uses a web browser to access a web site, she is accessing a service provided by the entity that creates the content and sets up the web server. A Napster [32] client can be used to access the service provided by Napster, which implements the applications and sets up the Napster servers. In a departmental network, the network administrator can
provide the video conferencing service by setting up an H.323 Multipoint Control Unit (MCU) [23] in the network so that three or more users can hold a video conference using H.323-compliant video conferencing applications such as Microsoft NetMeeting [33].

From the above examples, we can see that most existing services are vertically-integrated, i.e., a service provider needs to determine what functionalities it wants to provide, implement and/or integrate all the necessary software components, and install and/or set up all the necessary hardware components (computation servers, network bandwidth, etc.). Most importantly, all this work is done before the service can start serving the first user. For example, Napster decides it wants to provide a central server-based music sharing service, so it implements the client and server applications, sets up the servers with sufficient network bandwidth, and starts serving customers. In the video conferencing example above, the network administrator finds an existing MCU software package, sets up a server that runs the MCU, makes sure that NetMeeting is installed on everyone's machine, and then announces the availability of the service.

This vertically-integrated approach has a number of limitations:

• Lack of customizability: since the service provider designs and implements the service, the service may not be exactly what the users want. However, it is not straightforward to customize a vertically-integrated service to fit a user's needs. There are three main reasons. First, it is common that a service provider implements all functionalities of its service within a monolithic application. As a result, customizing a service means modifying the application, which is not an easy task. Second, even if (for example) a user's special needs can be satisfied by adding a new component, such customization would require the service provider to change the configuration of the service, which is not much different from going through the whole process of designing, implementing, and deploying a new service. Third, although a service provider may improve its service based on users' feedback, finer-grained customization (such as customizing a video conferencing service by adding a movie provider requested by the participants in a particular session) is not practical in a vertically-integrated service.

• High development/deployment costs: providers of vertically-integrated services usually do not design and implement their services with reusability or interoperability in mind (in fact, services are often provided using proprietary solutions, for example, a video conferencing solution that consists of equipment/software from a single vendor). As a result, few services are reusable (i.e., few can be used to compose a richer service). For example, although the widely-used FTP clients/servers already provide the file transfer functionality, some video conferencing applications still have built-in file transfer. The development cost can be reduced if such functionalities can be added by simply using existing services. Furthermore, the deployment cost is also high because a service provider needs to deploy all the hardware/software components before serving users, and the configuration needs to be maintained even when no users are using the service.
• Restrictive operational environment: in the vertically-integrated approach, since service providers and/or developers do not know the runtime network characteristics, they need to make certain restrictive assumptions about the runtime environment. For example, a video conferencing application developer does not know whether all users will be in multicast-capable networks, so she has to design
and implement a unicast-only video conferencing application. A game hosting service provider does not know whether all users will have broadband access, so he has to optimize the service for users with 56K modem access. Such an approach may work fine for simple services. However, as services become more and more sophisticated, their performance or feasibility will greatly depend on many network characteristics. The worst-case assumptions usually made in the vertically-integrated approach can severely limit the operational environment of a service.

In this thesis proposal, we propose a fundamentally different approach to developing and deploying network services. A service provider translates its service-specific knowledge into a generic representation called a service recipe, which describes how to compose a service instance given the requirements of a particular user and the runtime network characteristics. Service composition is handled by a synthesizer, and a generic infrastructure provides supporting functionalities such as component discovery and network measurement. Service instances can be composed hierarchically: a component of a richer service can itself be a service instance composed by another synthesizer. We will devise a general representation for service recipes and design and implement a generic synthesizer to translate service recipes into actions to compose, optimize, and adapt service instances. We will also develop techniques for the synthesizer to perform static, dynamic, and runtime optimizations.

Our approach addresses the problems associated with the vertically-integrated service model. The development and deployment costs are lowered because (1) service providers can share the same infrastructure that deals with generic issues such as network measurement for location-based server selection, and (2) the hierarchical model can hide the details of a component from its user (a higher-level service). The separation of the service recipe from the synthesizer allows a service provider to offer an innovative service with nothing but its service-specific knowledge. By delaying the generation of the service configuration and the binding of physical components/resources until runtime, a synthesizer can compose a service instance that is customized and optimized for a particular user according to the user's requirements and the runtime network characteristics.

The rest of this thesis proposal is organized as follows. In Section 2, we describe our motivating example, the video conferencing service, and how service composition can be used to address the limitations of the traditional vertically-integrated model. We present the architecture of the hierarchically-synthesized model in Section 3, and the supporting runtime infrastructure is described in Section 4. Our current implementation and preliminary evaluation results are presented in Sections 5 and 6. The proposed work is discussed in Section 7, and the expected contributions are summarized in Section 8. Finally, the timetable is shown in Section 9, and the related work is discussed in Section 10.
2 Motivating Example: Video Conferencing Service

In this proposal, we will use the "video conferencing service" as our motivating example. First, let's look at how this service can be provided using a vertically-integrated approach.
2.1 Vertically-integrated approaches

Here are four examples of how the video conferencing service can be provided today:

• Conferencing facilities: a company that provides the video conferencing service can build video conferencing facilities in a number of large cities and build a dedicated network among these facilities. A user of this service can schedule a time for the conference, and each participant of the conference can go to a nearby facility at the scheduled time.

• Conferencing rooms: if a corporation needs the video conferencing service among its offices in different cities, a video conferencing service provider can provide a complete video conferencing solution by building conferencing rooms in the different offices and connecting them together. Such a solution is often proprietary (i.e., uses software and equipment from a single vendor), and even if it claims to be standards-based, there are often features that discourage interoperation with other software/equipment (e.g., "high quality audio is only available if all participants use our products").

• NetMeeting: Microsoft NetMeeting [33] is an application for video conferencing based on the H.323 standard [23]. It requires a central server to coordinate conferences with three or more participants, and it can only handle one video stream at any time and cannot handle multicast. To provide the video conferencing service based on NetMeeting, the network administrator needs to set up a machine to run the H.323 MCU or gatekeeper and make sure the NetMeeting applications know where to find the MCU/gatekeeper.

• vic/SDR: vic [47] is a video conferencing application that uses IP multicast for communication among participants. SDR [43] is an application for establishing conference sessions using the Session Initiation Protocol (SIP) [20], and after the session is established, SDR invokes vic for the actual video conference. To hold a video conference using vic/SDR, the network administrator (or one of the participants) needs to make sure all participants' computers are within the same "IP multicast zone", i.e., they can reach each other using IP multicast (e.g., they are all on MBone).

In these approaches, the services are vertically-integrated by the service providers and/or the application developers, i.e., at the design and implementation phase, the application developers determine what features are needed and implement them, and the service providers acquire and set up the necessary software and hardware components in the network. At runtime, a user accesses the service provided by a service provider (or the functionalities provided by an application without a service provider, for example, the vic/SDR case above). As mentioned earlier, such vertically-integrated services are not flexible and have many limitations. The main reason for the limitations is that most decisions must be made at the design and implementation phase, i.e., before runtime, when users request services. In other words, when designing and implementing the service, the service providers and application developers do not know, for example, the specific requirements of each user, the network location of all the participants in a session, and the availability of IP multicast support. Therefore, the resulting services may be inefficient or infeasible for certain users under certain network conditions.
Figure 1: A video conferencing scenario
2.2 Service composition

We propose a hierarchically-synthesized service model: service composition is used at runtime to provide services that are more flexible and sophisticated than vertically-integrated ones. Let us use the following video conferencing scenario as an example to illustrate how a service can be composed at runtime (Figure 1): Suppose Alice, who is at W University, wants to hold a video conference that includes two participants at W University, two participants at X Corporation, two participants at Y Corporation, and one participant at Z Airport. The participants at W and X all use the vic/SDR applications, but W and X are not in the same IP multicast zone (i.e., participants at W cannot reach participants at X using IP multicast). The participants at Y are using the NetMeeting application, and the participant at Z is using a receive-only handheld device (which does not do protocol negotiation).

Supporting this scenario is difficult because of the heterogeneity of the systems involved. For example, different participants are using different video conferencing tools (NetMeeting [33] versus the MBone tools vic [47] and SDR [43]) that use different protocols (H.323 [23] versus SIP [20]). Similarly at the network layer, IP multicast is not universally supported, and while application-level multicast alternatives exist, they are more difficult to set up. For this reason, video conferencing services today typically require users to use specific software. For example, the service may set up an H.323 server and require all participants to use NetMeeting. Alternatively, it may require that all participants use the MBone tools and be connected to MBone [30]. Obviously, this solution is not very convenient.

The challenge in supporting this scenario is not that the software does not exist. We will later describe a prototype service that supports the scenario of Figure 1, and we make use of the following existing software packages: (1) a SIP/H.323-translation gateway [21], which helps vic/SDR and NetMeeting applications establish a joint session by translating the protocol negotiations (and also handles the video demultiplexing for NetMeeting users); (2) End System Multicast (ESM) [8] proxies, which implement multicast functionality using an overlay over a unicast-only network; and (3) a handheld proxy that performs protocol negotiation on behalf of the handheld device and forwards the video streams, optionally performing transcoding to reduce bandwidth consumption. We also make use of vic/SDR and NetMeeting clients and IP multicast. The challenges are instead in putting the existing software together: (1) how can we get these separately-
developed packages to work together to deliver a higher-level service, (2) how can we manage the deployment of the components in the infrastructure so that the video conferencing service can be efficient and of high quality (to the user), and (3) how can we automate the service composition process so that user requests can be handled efficiently without manual intervention. Our hierarchically-synthesized service model directly addresses these challenges. Existing packages are configured as services that can be invoked by higher-level services. At runtime, an entity called the synthesizer automatically composes a customized service instance for each user's request using these reusable and interoperable service components. Specifically, the synthesizer would handle Alice's request of Figure 1 in the following way. A handheld proxy is installed for the participant at Z, and a video translation gateway is installed to bridge the different conferencing clients. An ESM session is established for the participants at W and X and also for the video gateway and the handheld proxy. The ESM session uses a set of proxies, including a proxy that handles interoperation with IP multicast at site X. The whole process is automatic and transparent to the user.

Figure 2: Entities in the service model
3 Architecture

3.1 Network services model

We assume a network service model in which "providers" fall into two classes (Figure 2). Network providers provide the resources necessary to deliver services, i.e., bandwidth on network links, computing cycles and memory on network nodes, and possibly specialized devices. Service providers deliver more advanced services such as distance learning or backup services to end-users. Such services are built by combining a set of service components and by executing them on a set of resources that is leased from a network provider. Service components can be implemented internally by the service provider or can be provided by other service providers. This model assumes that network services will be developed and delivered in a competitive market. Service providers get paid by their customers (end-users and higher-level service providers) for the services they deliver; network providers get revenue from the users of their infrastructure (service providers and
possibly end-users). As in any competitive market, all the providers will want to be able to differentiate their product (resources/services) from their competitors', and they will want to bring services to the market faster. Note that in practice the distinction between network providers and service providers may not be that clear. Service providers may own some communication and computational resources, and similarly, network providers may deliver some higher-level services.
3.2 Service Composition

Hierarchically-synthesized services go through three phases in their lifetime. The first phase is the design and implementation phase. During this phase, the service provider determines the specification of the service and what components and resources may be needed to meet the different users' requirements. Some components may have to be implemented by the service provider, and other components/resources will be provided by other service providers and network providers. The outcome of this phase is a service recipe describing, for a specific type of user request, what components/resources are needed, and how they fit together.

The second phase is the deployment phase. During this phase, the service provider needs to make sure that the necessary components and resources will be available at runtime. The service provider may decide specifically what suppliers will be used for the different components/resources (note that some may have multiple suppliers), or the service provider can maintain a directory of available suppliers for components and resources so that such information can be used at execution time for service composition. The service provider can also arrange for a separate provider to offer a "service discovery" service that supplies such information at execution time.

The third phase is the execution phase. During this phase, customers submit requests to the service provider, and the synthesizer uses the service recipe and information on component and resource availability to decide how to best satisfy the requests. One of the advantages of the hierarchically-synthesized approach over the vertically-integrated approach is that more decisions are made at runtime, allowing the service to be more adaptive and flexible, e.g., adapting to network load and addressing specific customer requirements.
3.3 Synthesizer architecture

The synthesizer performs the following tasks at runtime. (1) The synthesizer receives the user's service request, which specifies the desired service. For example, for the video conferencing service, the user request will specify the conference participants, the conferencing applications used by the participants, and so on. (2) The synthesizer generates one or more abstract solutions for the user according to the service provider's service recipe. An abstract solution describes one possible combination of components and resources and how they should be put together so that the user's requirements can be satisfied. (3) The synthesizer then needs to "realize" the abstract solutions by finding the actual components in the network, i.e., it "binds" each of the abstract components to an actual component in the network. For example, if the abstract solution specifies "a video transcoder is needed", then the synthesizer needs to locate a physical machine that is running the transcoding software. An actual component can be an existing service owned by the service
provider, a service provided by another service provider, or a newly instantiated service on an available computation node (possibly provided by a network provider). In addition, there will typically be many possible realizations of an abstract solution, corresponding to a set of feasible solutions. The synthesizer has to use some optimization criteria to pick the "best" binding, considering both the service quality for the user and the efficient use of the infrastructure (cost). (4) Finally, the synthesizer needs to configure the components to actually start the service instance. How to configure the components is specified in the service recipe. For example, in the video conferencing service, the synthesizer needs to instruct the transcoding service to establish connections to the appropriate video source and sink.

We can see that the functionalities needed to perform task (3) are fairly generic; for example, a component discovery infrastructure can be shared by many synthesizers. Therefore, in our synthesizer architecture, we implement these generic functionalities as libraries that can be used by different service providers to implement their synthesizers. As a result, service providers can concentrate on developing their service recipes, which specify how to perform the service-specific tasks (2) and (4).

Ingredient:
• Handheld proxy: if some participants use receive-only handheld devices, use a handheld proxy for each of them.
• SIP/H.323-translation gateway
• End system multicast (ESM) service: ask the ESM service to establish an ESM session among the vic/SDR endpoints, the gateway, and the handheld proxies. (The ESM service returns the data path entry points for those in the ESM session.)
Instruction:
1. Give the gateway the list of NetMeeting endpoints and the list of vic/SDR endpoints and handheld proxies (along with their data path entry points) so that the gateway can call the endpoints to establish the conference session.
Optimization:
1. Use the generic optimization support

Figure 3: A service recipe for the video conferencing service
3.4 Service recipe

In our current prototype implementation, the service recipe is implemented as the service-specific code in the synthesizer. A service recipe includes the ingredient part, which generates the abstract solutions (task 2), the instruction part, which configures the components (task 4), and the optional optimization part, in which the service provider can implement some customized optimization policies if necessary. To illustrate what a service recipe looks like, Figure 3 sketches (in English) a service recipe for the video conferencing service (the actual service recipe is the video conferencing synthesizer code). This recipe also illustrates that a component of a service can in fact be a service provided by another service provider. In this recipe, the end system multicast (ESM) component is a service provided by an ESM service provider. Therefore, the ESM service provider will implement an ESM synthesizer using the ESM service recipe in Figure 4, for example. Currently, we provide a set of generic optimization mechanisms (described later) in the library for synthesizers
so that the service providers do not have to implement service-specific optimizations if the generic ones are sufficient (for example, the two examples above both use the generic ones). We will talk about optimizations later in Section 4.3.2.

Ingredient:
• ESM proxy: figure out whether the participants are in the same IP multicast zone, and use an ESM proxy for each participant of the ESM session. (All participants in the same IP multicast zone can share the same proxy.)
Instruction:
1. For each ESM proxy, choose a multicast address and a port which will be used by the ESM proxy to communicate with its associated participant (or participants in the associated zone). This is the data path entry point for the participant(s).
2. Set up the ESM session among the proxies.
3. Return the list of data path entry points to the entity that requested this ESM session.
Optimization:
1. Use the generic optimization support

Figure 4: A service recipe for the end system multicast (ESM) service
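To make the recipe-as-code structure of Sections 3.3 and 3.4 concrete, the following is a minimal C++ sketch of how a specialized synthesizer might tie the ingredient and instruction parts of a recipe to a generic binding library. All class, function, and component names here are illustrative assumptions, not the prototype's actual interfaces.

    #include <string>
    #include <vector>

    // Illustrative types standing in for the generic library (Section 4);
    // none of these names are the prototype's actual interfaces.
    struct Request          { std::vector<std::string> participants; };
    struct Component        { std::string type; std::string controlUrl; };
    struct AbstractSolution { std::vector<Component> components; };

    // Generic component binding shared by all synthesizers (task 3), stubbed here.
    Component bindComponent(const std::string& type,
                            const std::vector<std::string>& nearClients) {
        return {type, "http://component.example.net/" + type};
    }

    // A recipe expressed as service-specific code, as in the current prototype.
    class VideoConfRecipe {
    public:
        // "Ingredient" part (task 2): decide which components are needed.
        AbstractSolution ingredient(const Request& req) {
            AbstractSolution abs;
            abs.components.push_back({"sip-h323-gateway", ""});
            // In the real recipe, a handheld proxy is added only if some
            // participant uses a receive-only handheld device.
            abs.components.push_back({"handheld-proxy", ""});
            abs.components.push_back({"esm-service", ""});  // a complex component
            return abs;
        }
        // "Instruction" part (task 4): configure the bound components,
        // e.g., hand the gateway the participant list and ESM entry points.
        void instruction(const std::vector<Component>& bound, const Request& req) {}
    };

    // The specialized synthesizer ties the recipe parts to the generic library.
    void handleRequest(VideoConfRecipe& recipe, const Request& req) {
        AbstractSolution abs = recipe.ingredient(req);              // task 2
        std::vector<Component> bound;
        for (const Component& c : abs.components)                   // task 3
            bound.push_back(bindComponent(c.type, req.participants));
        recipe.instruction(bound, req);                             // task 4
    }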
4 Runtime Infrastructure

In this section we look in more detail at the runtime infrastructure needed for hierarchically-synthesized services. Specifically we look at components, resources, and the synthesizer. We will use a prototype implementation of the video conferencing service as the basis for our discussion. The low-level details of the implementation will be described in Section 5.1.
4.1 Components

So far we have described service composition under the assumption that the components in our service architecture are reusable and interoperable. That is, the synthesizer must be able to find the necessary components and put them together to construct the service instance. To achieve this, we need to solve two subproblems. First, we need a service discovery mechanism so that the synthesizer can find the components. Second, in order to use a component, the synthesizer must know the interface of the component (i.e., how to access its functionality).

There are many proposed solutions to the service discovery problem. For example, the Service Location Protocol (SLP) [19] is a directory-based approach, and extensions have been proposed to handle wide-area scenarios. Other examples include the Java-based Jini [24], hierarchical-hashing based SDS [10], and so on. In our current prototype implementation, we use a simple directory approach based on SLP. We use a central directory to store the information (e.g., locations, properties, etc.) of all available services. When a synthesizer needs a certain component, it sends a query to the directory, which returns information about available services that match the query. The synthesizer can then use one of the available services as a component. This solution assumes that the naming of services is standardized, and it also assumes that the query language
is rich enough for the synthesizer to express the properties of the needed component. Our experience is that SLP's query format is sufficient. Let us look at one example from the video conferencing service implementation. The handheld proxy has a standardized service name "handheld-proxy", and also has a property "out-codec", which indicates its output video codec. When the synthesizer needs to find a handheld proxy for a handheld device that can only receive H.261 [22] video streams, it will issue a query that looks like "servicetype=handheld-proxy AND out-codec=H.261" to the directory.

In addition to the attribute-based service discovery, we extend SLP to take the network location into consideration using the GNP [34] service. GNP is a mechanism that computes coordinates for network nodes based on their network locations. After the coordinates are computed, the network distance between two nodes can be easily computed. We extend SLP so that when the synthesizer tries to find a component, it can specify the optimization criteria. For example, in the video conferencing service, there may be multiple video gateways available, so the synthesizer will want to find one that is "close" to all the NetMeeting clients (since they only use unicast). Suppose there are two NetMeeting clients, A and B; then, when asking the SLP directory for a video gateway, the synthesizer can specify "clients=A,B" in addition to the service type and other desired attributes of the video gateway. The SLP directory will first find the candidates (gateways with the desired attributes) and then use the GNP optimization code to select the candidate that is optimal for A and B (e.g., having the shortest average distance to A and B). Finally, the directory returns the selected gateway to the synthesizer.

The second question about components has to do with their interfaces. While many services are developed with a human user in mind, service components must provide an interface that can be called by other services, or, specifically, by a synthesizer. We distinguish between the control and data interfaces. The control interface is used to, for example, query the capabilities of a component or configure it. In the video conferencing service, the synthesizer uses an XML over HTTP communication interface to communicate with other components, and legacy components can be easily modified to use the interface (or a simple wrapper can be used). The data interface is used to transfer data between components at runtime. For example, the vic conferencing application sends/receives RTP [42] video streams in certain formats at a specified address and port. The synthesizer (and the recipe) needs to make sure the data interfaces of the components can fit together. If not, additional components will be needed; for example, a handheld proxy is needed to transcode the video format for the handheld device.
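As an illustration of the discovery and GNP-based selection just described, here is a hedged C++ sketch. The Candidate structure, the three-dimensional coordinates, and the commented-out directory call are assumptions for illustration and do not reflect the actual SLP wrapper API.

    #include <cmath>
    #include <limits>
    #include <string>
    #include <vector>

    // Hypothetical GNP coordinate for a host (the dimension is chosen arbitrarily).
    struct GnpCoord { double x[3]; };

    double gnpDistance(const GnpCoord& a, const GnpCoord& b) {
        double s = 0.0;
        for (int i = 0; i < 3; ++i) s += (a.x[i] - b.x[i]) * (a.x[i] - b.x[i]);
        return std::sqrt(s);   // Euclidean distance in GNP coordinate space
    }

    struct Candidate { std::string url; GnpCoord coord; };

    // Pick the candidate with the smallest average distance to the clients,
    // mirroring the "clients=A,B" optimization criterion described above.
    const Candidate* closestToClients(const std::vector<Candidate>& cands,
                                      const std::vector<GnpCoord>& clients) {
        const Candidate* best = nullptr;
        double bestAvg = std::numeric_limits<double>::max();
        for (const Candidate& c : cands) {
            double sum = 0.0;
            for (const GnpCoord& cl : clients) sum += gnpDistance(c.coord, cl);
            double avg = sum / clients.size();
            if (avg < bestAvg) { bestAvg = avg; best = &c; }
        }
        return best;
    }

    // A directory query as a synthesizer might issue it; "directory" is a
    // hypothetical stand-in for the prototype's SLP wrapper:
    //   std::vector<Candidate> cands =
    //       directory.find("servicetype=handheld-proxy AND out-codec=H.261");
    //   const Candidate* proxy = closestToClients(cands, clientCoords);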
4.2 Resources

As mentioned earlier, a synthesizer may need to acquire computational and communication resources to realize an abstract solution. Computational resources may be needed to start a primitive component that is not available (i.e., it cannot be found through service discovery, or existing instances are too heavily loaded), and communication resources may be needed to provide QoS guarantees to satisfy the user's requirements. The issues are again (1) how to discover the resources and (2) how to acquire them. So far we have only integrated computational resources into our service composition. The allocation of computational resources is done through a proxy provider service, which is a provider that owns a number of computation servers (proxies) in the network. The synthesizer can ask the proxy
provider to provide an appropriate proxy; the proxy provider will select a proxy for the synthesizer (potentially based on its "location" and load), and the synthesizer can then start a primitive component on the proxy. In our prototype, the proxy provider maintains a directory similar to the one described above for component discovery. To start a component on a proxy, the code for the component can be downloaded to the proxy and started in an on-demand fashion. Again, the discovery service is integrated with the GNP optimization code so that the synthesizer can specify optimization criteria based on network location (see Section 4.1).
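A minimal sketch of the fallback path described in this subsection, assuming hypothetical discovery, proxy-provider, and on-demand loading interfaces (the names do not correspond to the prototype's actual API):

    #include <string>
    #include <vector>

    struct Component { std::string controlUrl; };
    struct Proxy     { std::string host; };

    // Hypothetical stand-ins for the discovery directory, the proxy provider,
    // and on-demand code loading; stubbed so that the sketch is self-contained.
    std::vector<Component> discover(const std::string& query) { return {}; }
    Proxy allocateProxy(const std::vector<std::string>& nearClients) {
        return {"proxy.example.net"};   // provider picks by location and load
    }
    Component startOnProxy(const Proxy& p, const std::string& componentType) {
        return {"http://" + p.host + "/" + componentType};  // pretend it was launched
    }

    // If no suitable running component is found, lease a proxy chosen by the
    // provider and start the component there on demand (as done on ABone nodes).
    Component bindOrStart(const std::string& query,
                          const std::vector<std::string>& clients) {
        std::vector<Component> found = discover(query);
        if (!found.empty()) return found.front();
        return startOnProxy(allocateProxy(clients), query);
    }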
4.3 Synthesizer

To construct service instances for user requests, the synthesizer performs service composition, which involves three tasks: generating an abstract solution, component binding, and configuring the components and resources.

4.3.1 Generating an abstract solution
An abstract solution is the blueprint of a particular service instance. It describes the specific components needed to create the service instance and how to connect them together. For example, Figure 5 shows an abstract solution generated by a video conferencing synthesizer for the scenario in Figure 1. Basically, a handheld proxy is needed for the participant at Z, a gateway is also needed, and the ESM service is needed to connect the participants at W, the participants at X, the handheld proxy, and the gateway. As described above, the ESM service is provided by another service provider, so the video conferencing synthesizer will send a request (which lists the participants of the requested ESM session) to the ESM synthesizer. As a result, the ESM synthesizer will need to generate an abstract solution for the ESM session requested by the video conferencing service. This abstract solution is shown in Figure 6, which says that one ESM proxy is needed for the participants at W, one is needed for the participants at X, one is needed for the handheld proxy, and another is needed for the gateway. In fact, as we can see from the previous section, the ingredient part of the service recipe describes how to generate the abstract solution for a particular user request. After the synthesizer generates an abstract
solution, the synthesizer needs to perform component binding, i.e., finding the components needed to realize the abstract solution.

Figure 5: An abstract solution for the video conferencing scenario in Figure 1

Figure 6: An abstract solution for the ESM session requested by the video conferencing service

4.3.2 Component binding
There are two types of components in an abstract solution: primitive and complex. A primitive component provides its functionality in a self-contained fashion and does not require further service composition. In other words, it is usually a piece of software running on a piece of hardware that provides an interface so that others can access its functionality. In contrast, a complex component is itself an instance of a service provided by another service provider (i.e., the component is a service instance composed by another synthesizer). In the video conferencing service example, the gateway and the handheld proxy are primitive components, while the ESM service is a complex component.

In order to realize the abstract solution (to construct the service instance), each component in the abstract solution needs to be "mapped" to an "actual component" in the network. Figure 7 shows the component binding in the video conferencing example. For each primitive component (in this case the gateway and the handheld proxy), the synthesizer needs to find a piece of hardware running a piece of software that provides the functionality of the component. The ESM service is a complex service composed by an ESM synthesizer, so the ESM synthesizer needs to perform its own component binding (the ESM proxies) as shown in the figure. As mentioned earlier, if a component cannot be found, a synthesizer can also locate an appropriate computation server (proxy) through a proxy provider service and then start the component on the proxy dynamically.

Optimization
As mentioned earlier, there may be more than one way to realize the abstract solution. Therefore, the synthesizer needs some criteria to optimize the constructed service instance. As described in Sections 4.1 and 4.2, we currently provide a set of generic optimization mechanisms that allow the synthesizer to specify a set of clients and get a component (or computation server) that is "optimal" (in terms of network
location) for those clients. For example, in the video conferencing service, the synthesizer can try to find a video gateway that is close to the NetMeeting clients, a handheld proxy that is close to a handheld client, and so on. There are a number of directions we are currently exploring in this area. We are investigating more sophisticated optimization criteria, for example, an objective function involving both the network location and the monetary cost of a component/resource. In addition, the synthesizer currently only performs "local optimization" (optimizing each component separately), not global optimization. We want to explore the trade-off between the optimality of the solution and the scalability of the optimization mechanism. Previously, the Xena [6] resource broker in the Darwin project [5] utilized linear programming to optimize resource allocation; similar approaches may be useful in the context of service composition. Finally, some services may have very specific optimization policies of their own, in which case a set of generic optimization mechanisms will not be sufficient. We are looking at the possibility of allowing service providers to customize the optimization mechanisms used at runtime or even specify their own service-specific optimization schemes.

Figure 7: Component binding in the video conferencing example

Hierarchical service composition
The component binding process described above also brings out another key property of our service framework: hierarchical service composition. That is, a synthesizer composes a service instance using the components specified in the recipe. Each of these first-level (highest-level) components may be a complex component and can be composed by a second-level synthesizer using second-level components, and so on. We believe such a hierarchical structure is important because it allows a service provider to hide its service-specific knowledge and thus allows a higher-level user (which may be another service provider) to use the service without knowing the implementation details. For example, in the video conferencing service example, the video conferencing synthesizer does not need to know how to establish an ESM session since it
can simply ask the lower-level ESM synthesizer for the service. Therefore, hierarchical service composition allows greater flexibility and enables each service provider to "do one thing and do it well".

4.3.3 Configuring the components
After the component binding step, the components specified in the abstract solution have all been acquired. Therefore, the final step in service composition is to configure these components to actually construct the service instance for the user. As we can see from the service recipes in Figures 3 and 4, the "instruction" part of a service recipe specifies how to configure the components. For example, the ESM synthesizer needs to choose a multicast address and a port for each of the four ESM proxies, set up the ESM session among the proxies (by giving them the addresses, ports, other members, etc.), and return the list of data path entry points to the video conferencing synthesizer. Then the video conferencing synthesizer executes its instruction: it constructs a request that includes a list of all participants and sends the request to the video gateway. At this point, the service composition is complete, and the gateway invites the participants to start the conferencing session.
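To illustrate the configuration step, the following sketch shows how a synthesizer might assemble a control message for the gateway over the XML/HTTP interface. The XML element names and the sendHttpPost helper are purely illustrative, since the actual message format is not specified in this proposal.

    #include <sstream>
    #include <string>
    #include <vector>

    // e.g., kind = "netmeeting", "vic-sdr", or "esm-entry" (hypothetical labels)
    struct Endpoint { std::string address; std::string kind; };

    // Build a hypothetical configuration request listing all participants and
    // data path entry points; the schema below is purely illustrative.
    std::string buildGatewayRequest(const std::vector<Endpoint>& endpoints) {
        std::ostringstream xml;
        xml << "<configure service=\"video-conference\">\n";
        for (const Endpoint& e : endpoints)
            xml << "  <participant kind=\"" << e.kind << "\" address=\""
                << e.address << "\"/>\n";
        xml << "</configure>\n";
        return xml.str();
    }

    // The synthesizer would then POST this document to the gateway's control
    // interface, e.g., sendHttpPost(gatewayUrl, buildGatewayRequest(endpoints));
    // sendHttpPost is a stand-in for the prototype's minimal HTTP implementation.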
5 Current Implementation

Now we describe two services we have implemented using the hierarchically-synthesized service model.
5.1 Video conferencing service

Most aspects of the video conferencing service implementation have been described in earlier sections. We set up a web page as the front-end, and a user can request a video conferencing session by filling out the form on the web page (entering the participants and the application/platform they use). The front-end then forwards the list to the video conferencing synthesizer, which composes the customized service instance. We use the XML library libxml2 [28] and a minimal implementation of the HTTP protocol to implement an XML over HTTP library for communication between components. Our service discovery mechanism is based on the SLP implementation from the OpenSLP project [37], and it is integrated with the GNP network measurement service. For computational resources we utilize nodes on the Active Network Backbone (ABone) [1], so primitive components can be started on demand using an active network approach. The major service components include the following: (1) the translation gateway [21] is implemented using libraries from the OpenH323 project [36], (2) the handheld device is an iPAQ device running the Familiar Linux distribution [12], and the application for the device is a modified version of vic, (3) the handheld proxy utilizes a modified version of SDR to perform protocol negotiation and borrows the video codecs from vic for transcoding, and (4) for End System Multicast (ESM) we use Narada ESM proxies [8].
5.2 Simulated multiplayer gaming service

The second service we implemented is a "simulated" multiplayer gaming service. We focus on one specific model of multiplayer game: multiple players (clients) join a gaming session hosted by a server, and each
client sends its actions (mouse clicks, position, keyboard input, etc.) to the server, which processes the actions and forwards the results to the appropriate clients. A multiplayer gaming service lets players meet online, and when a group of players decides to start a gaming session, the gaming service provides a server to host the session. Currently, such a service is provided using a centralized approach (for example, Blizzard's Battle.net service [4]). That is, players go to a central server (or a cluster of servers) that hosts gaming sessions. Such an approach has three major drawbacks: it is expensive to set up a dedicated cluster and a dedicated link to the cluster, the location of the server hosting a gaming session may not be good in terms of the network distances to the players in the session, and adding more servers to the cluster can only improve the player-perceived performance to a certain point, since the link connecting the cluster will eventually become the bottleneck.

In our service model, when a group of players decides to start a gaming session, a "multiplayer gaming synthesizer" (similar to the synthesizers presented earlier) will compose a customized service instance for the participating players. Although in the current implementation the synthesizer only needs to locate an appropriate proxy to host the session, the service provider can easily add more functionalities (such as multicast support based on user/network multicast capability for more efficient communication) to the service recipe and provide a richer service, which is not easily done in the vertically-integrated approach described above.

Our current implementation of the multiplayer gaming service is "simulated", i.e., we do not actually use any real games. Instead, we use a simulated game client, which periodically (e.g., every 100 milliseconds) sends small UDP packets to the game server, and the game server forwards the packets to the other participants. The client and server code can be loaded dynamically to ABone [1] nodes using an active network approach.

(a) Code size comparison

                                 Lines
    VConf. synthesizer             482
    ESM synthesizer                359
    Supporting infrastructure     3659

(b) Performance of the synthesizers

                           Avg. (ms)    Std. dev.
    Client clock time          21959       439.65
    VConf. synth. utime          730       126.96
    VConf. synth. stime          260        54.95
    ESM synth. utime             949       135.77
    ESM synth. stime             324        56.87

Table 1: Evaluation results for the video conferencing service
6 Preliminary Evaluation

In this section, we use the video conferencing and gaming services to evaluate both the development effort and performance of hierarchically-synthesized services.
6.1 Video conferencing service

Let us first look at the development effort for the video conferencing service. Most of the effort was in developing the synthesizer since the actual video conferencing functionality was based on existing software packages. Within the synthesizer, most of the code is service-specific (i.e., the service recipe) because the generic functionalities are implemented as libraries. Table 1(a) shows the size of the synthesizer code (mostly the service recipe) and the supporting synthesizer infrastructure in terms of lines of (C++) code (including headers). The supporting infrastructure includes an XML over HTTP communication library, an SLP wrapper that provides a simplified interface, the GNP optimization code, and so on, all of which are shared by both synthesizers. (Note that the 3659 lines do not include the XML library libxml2 and the SLP implementation from the OpenSLP project.) We can see that the service-specific part in our implementation is in fact fairly small, suggesting that the development cost for service providers is low because they can share the same supporting infrastructure.

Let us next look at the performance of the synthesizer, i.e., how fast it can perform service composition. We evaluate two aspects of the synthesizer performance: (1) the user-perceived setup latency, i.e., the time to the start of the video conferencing session, and (2) the throughput of the synthesizer, i.e., how many service compositions the synthesizer can perform per second. Our experiment consists of a client issuing a stream of back-to-back requests to the synthesizer. A request is considered finished when the video gateway hosting the requested conferencing session sends the invitation messages to all the participants, since at that point the participants see an invitation pop up on their screens. Each request specifies a conferencing session that involves three vic/SDR participants, one NetMeeting participant, and one handheld participant. The synthesizers run on a machine with a 400MHz Pentium II CPU and 128MBytes of RAM. All the service components and supporting infrastructure (e.g., the SLP directory) are running on machines in the same LAN segment, so the network latency is not significant. We measure the clock time of the client issuing 100 back-to-back requests, as well as the user time and system time for the video conferencing synthesizer and the ESM synthesizer processing these 100 requests. The results (averages of 20 runs) can be seen in Table 1(b). We can derive that the average user-perceived latency for each session is 220 ms. In addition, the video conferencing synthesizer consumes 9.9 ms of CPU time for processing each request, and the ESM synthesizer consumes 12.73 ms. Therefore, we can derive that the upper bounds for the throughput of the video conferencing and ESM synthesizers are 101 and 78 sessions per second, respectively.
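For clarity, the arithmetic behind these figures, derived from the totals over 100 back-to-back requests reported in Table 1(b), is:

\[ \text{setup latency} \approx \frac{21959\ \text{ms}}{100\ \text{requests}} \approx 220\ \text{ms per session} \]

\[ \text{VConf. CPU per request} = \frac{(730 + 260)\ \text{ms}}{100} = 9.9\ \text{ms} \;\Rightarrow\; \text{throughput} \le \frac{1000}{9.9} \approx 101\ \text{sessions/s} \]

\[ \text{ESM CPU per request} = \frac{(949 + 324)\ \text{ms}}{100} = 12.73\ \text{ms} \;\Rightarrow\; \text{throughput} \le \frac{1000}{12.73} \approx 78\ \text{sessions/s} \]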
6.2 Multiplayer gaming service

We use the multiplayer gaming service to evaluate how well a synthesizer-based solution can optimize the game performance perceived by the users. In the vertically-integrated approach, all gaming sessions are hosted by the server cluster(s). In our approach, each gaming session is a service instance dynamically composed and customized for the participants of the session. Therefore, the key issue in our approach is how the multiplayer gaming synthesizer finds a server (or a proxy to start a new server) to host each gaming session. In our evaluation, we compare four different server selection schemes: (1) central: all sessions are hosted by the same server, (2) random: the synthesizer randomly selects one of the available servers, (3)
latency: the server selection is based on the output of the GNP service, i.e., the synthesizer gives the GNP service a list of candidates and asks it to return the one that is "optimal" for the participants in a session, and (4) latency+load: this is similar to the latency scheme, except that the synthesizer applies a constraint on how many sessions a server can host, i.e., if the best server is already hosting many sessions, then the second-best server is examined, and so on.

Our experiments are conducted using 13 nodes on the ABone [1] and two local CMU nodes. We divide the nodes into a server set (6 nodes) and a client set (9 nodes). Note that these nodes are connected by the Internet, so our measurements are affected by the presence of random, uncontrolled cross traffic. Also, we do not have any information on the configuration of the ABone nodes (CPU, memory, etc.). While we did some benchmarking to establish a rough estimate of the speed of each ABone node, these results are also subject to dynamic load conditions. The server set consists of the faster nodes. For each experiment, we generate a list of sessions, each of which consists of four players randomly selected from the client set. Then for each session, we use each of the four schemes above to select a server. Each session participant sends a small UDP packet to the server every 100 ms, and the server replicates this packet to the other players. The server also sends an ack, which is used by the client to measure the latency perceived by the client. Our evaluation is based on three metrics that are important in a multiplayer gaming context: (1) the average latency of each session, which is important because it represents the overall picture of player-perceived performance, (2) the maximum latency of each session, which is important because if one player in a session experiences particularly high latency, it will affect (or at least confuse) the other players, and (3) the average packet loss rate in each session, which is important because if too many packets are lost, the session becomes basically unplayable. We examine the performance of the four schemes under different load conditions: from 10 concurrent sessions to 60 concurrent sessions. The results for average latency, maximum latency, and packet loss rate are shown in Figure 8, Figure 9, and Figure 10, respectively. Each data point represents the average of 60 sessions.

Figure 8: Average latency
Figure 9: Maximum latency
Figure 10: Packet loss rate

The results show that the central scheme performs very well when the load is not too high (10 to 20 concurrent sessions), but when the number of concurrent sessions is increased to 30 or higher, the central scheme performs poorly. Of course, this comparison is not fair because the other schemes can use six servers, while the centralized scheme has only one server. However, to level the playing field a bit, we chose as the centralized server a node that is at least a factor of two faster than the other server nodes, according to our experimental data. The latency results show that the ability to dynamically select a server based on the location of the players pays off (since the random scheme does not perform as well). The high loss rate for the centralized approach is probably caused by the fact that the centralized server is more heavily loaded. Given these results, we wonder why the latency+load scheme does not have any significant performance advantage over the latency scheme. We speculate that the reason is that the load is not high enough to affect the selected nodes. Therefore, we decreased the size of the server set to five and increased the load to 80 concurrent sessions. The following table compares the performance of the latency and latency+load
schemes (the results are the average of 60 sessions):

                          latency    latency+load
    Avg. latency (ms)      158.54           70.62
    Max. latency (ms)      214.05          127.68
    Packet loss rate       0.0930          0.0085
The results indicate that under such load conditions, the synthesizer can compose better service instances for the users if it takes the dynamic load into consideration. The evaluation presented here is not meant to provide a conclusive study of load balancing and server selection schemes. Instead, the purpose is to demonstrate that in our hierarchically-synthesized service model, service providers can easily apply different server selection schemes to optimize user service quality without having to “own” many machines and bandwidth in the network.
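As a concrete illustration of the latency and latency+load policies compared above, the following C++ sketch shows the kind of selection logic a gaming synthesizer could apply. The data structures, the session cap, and the use of GNP-estimated latencies are illustrative assumptions rather than the prototype's actual code.

    #include <limits>
    #include <string>
    #include <vector>

    struct Server {
        std::string host;
        int activeSessions;                // load reported by the proxy provider
        std::vector<double> rttToPlayers;  // GNP-estimated latency to each player
    };

    // "latency": pick the server with the lowest average latency to the players.
    // "latency+load": the same, but skip servers already at the session cap,
    // falling back to the next-best candidate. A cap of 0 means no constraint.
    const Server* selectServer(const std::vector<Server>& servers, int maxSessions) {
        const Server* best = nullptr;
        double bestAvg = std::numeric_limits<double>::max();
        for (const Server& s : servers) {
            if (s.rttToPlayers.empty())
                continue;
            if (maxSessions > 0 && s.activeSessions >= maxSessions)
                continue;                              // load constraint
            double sum = 0.0;
            for (double rtt : s.rttToPlayers) sum += rtt;
            double avg = sum / s.rttToPlayers.size();
            if (avg < bestAvg) { bestAvg = avg; best = &s; }
        }
        return best;   // nullptr if every server is at its cap
    }

    // selectServer(servers, 0) approximates the "latency" scheme (no cap);
    // selectServer(servers, k) approximates "latency+load" with cap k.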
7 Proposed Research Directions

This section describes the two key areas of the proposed work. First, we will design and implement a generic synthesizer architecture. Second, we will develop techniques used by a synthesizer for static, dynamic, and runtime optimizations.
7.1 Toward a generic synthesizer architecture

As described above, in our current prototype each service provider needs to implement its own synthesizer by combining service-specific code (the service recipe) and function calls to our libraries that provide a set of generic functionalities. The resulting synthesizer is a specialized synthesizer, i.e., it can only be used to compose instances of the particular service. Our preliminary evaluation results demonstrate that implementing such a specialized synthesizer is fairly straightforward, since most of the work is done by our generic libraries. However, we can further simplify the service providers' job by using a generic synthesizer architecture (Figure 11). When a generic synthesizer is used, all the service provider has to do is to transform the service-specific knowledge into a service recipe that can be understood by the generic synthesizer (such
a service recipe is more abstract than the recipe in our current implementation, which is part of the actual program code). For example, given a video conferencing service recipe designed by a video conferencing service provider, a generic synthesizer will be able to compose video conferencing service instances according to users' requests. Similarly, given a multi-player gaming service recipe, the synthesizer can compose multi-player gaming service instances. Therefore, a generic synthesizer can be implemented by (for example) a network provider and can perform service compositions for many different service providers.

We believe the generic synthesizer model has a number of potential advantages:

• Simplify the job of the service provider: The main tasks of a synthesizer are generating the abstract solution, component/resource binding, and configuring the components/resources. In our current prototype, a service provider implements its specialized synthesizer by combining service-specific code (generating abstract solutions and configuring components/resources) and function calls to the generic libraries (for component/resource binding). In contrast, a generic synthesizer implements the generic functionalities of the synthesizer and can be shared by many service providers. Therefore, each service provider can concentrate on designing a good service recipe that describes how to perform the service-specific tasks.

• Better composition: since the recipe and the synthesizer are separated, the implementor of the synthesizer can focus on how service composition should be done (specifically, how to do component/resource binding), which involves a number of issues such as optimization, hierarchical composition, etc. Therefore, the resulting synthesizer is likely to do a better job in service composition.

• Supporting infrastructure can be shared: to perform service composition, a synthesizer will need a supporting infrastructure that consists of many elements, for example, service discovery, computational resources, network measurement/monitoring, etc. Ideally, these elements will be openly available so that everyone can use them. However, some of them may be proprietary (for example, network providers may not want to reveal or make available their network monitoring data or methodology), and even if they are all available, it may not be straightforward to integrate them into an infrastructure to support a synthesizer. In the generic synthesizer architecture, the integration is done by the entity (for example, a network provider) that implements the generic synthesizer. Therefore, the service providers can utilize the infrastructure simply by using the generic synthesizer.

• Network-specific knowledge: if a service provider has to implement a specialized synthesizer, the synthesizer may not be able to utilize all available network information in service composition since the service provider probably does not control, cannot afford, or does not know how to gain access to network information. In the generic synthesizer architecture, the implementor of the synthesizer can arrange access to an infrastructure that provides network information (especially if the network provider implements the synthesizer) and implement the synthesizer so that such explicit network information can be used to compose service instances that are more efficient.

Figure 11: A generic synthesizer performs service composition
In order to utilize the generic synthesizer architecture, one key issue needs to be addressed: how to represent the service-specific knowledge (i.e., the service recipe). This is the key because the recipe representation
determines what services can be described, what a service provider needs to do, how a generic synthesizer can be customized for a service provider, and so on.
Service recipe
To use a generic synthesizer, a service provider needs to transform its service-specific knowledge (i.e., how to construct service instances that can satisfy different users’ requests) into a service recipe. Then the generic synthesizer can use the recipe to construct service instances for users requesting service from the service provider. There are two issues regarding the service recipe. First, we need to determine what a service provider might need to specify in a service recipe (i.e., the semantics of service recipes). Since our goal is to let service providers invent novel services as easily as possible, we do not want the service recipe to limit the services that can be provided. In other words, the service recipe should be sufficiently general that most service providers will be able to describe their conceptual services. Secondly, we also need a language that can be used to express everything that may appear in the service recipe (i.e., the syntax of service recipes). We believe that in a service recipe, a service provider will likely need to specify three types of entities:
• Components: in the ingredient part of the service recipe, the primitive and complex components needed in the abstract solution are specified. For example, the video conferencing service recipe (Figure 3) needs to specify the handheld proxy, the translation gateway, and the ESM service. This is closely related to the component discovery mechanism. Currently we use the Service Location Protocol for discovery; therefore, the components are specified using a standardized service type and attribute-value pairs. This requires that if we want to use a component, we need to have certain knowledge about it (e.g., the communication protocol used). Another possible approach is to use a lower-level and strongly-typed specification that specifies the interface of a component. The potential benefit is that since it requires less external knowledge about the components, we may be able to perform certain parts of the service composition process without service-specific knowledge.
• Actions: the service recipe specifies the actions that need to be taken to configure the components or resources (i.e., the instruction part of the recipe). In addition, actions may also be needed in the ingredient part, since it describes how to construct an abstract solution. For example, in the ESM service recipe (Figure 4), the ingredient part says that in order to construct an abstract solution, the synthesizer needs to figure out whether the participants are in the same IP multicast zone. There are two possible approaches for specifying actions. One is to define a set of high-level actions that can be used in a service recipe, i.e., similar to defining an API. The second approach is to define a scripting language that the service provider can use to write a service recipe. Therefore, the main issue here is the trade-off between the flexibility a service provider has and the complexity of a generic synthesizer (e.g., allowing a full-blown scripting language will probably have major security implications).
• Optimization policies: as mentioned earlier, service providers may need to specify, for example, how to select a component if there are several eligible candidates. These policies may include QoS constraints that must be satisfied, objective functions to be minimized/maximized, or even code that performs service-specific optimizations. Again, there is a trade-off between flexibility and complexity. Since optimization is itself a complicated issue, we discuss it in the next section.
To devise a representation for service recipes, we will first look at the recipes of the video conferencing service and the ESM service and explore how they can be separated from the current synthesizers. We can leverage previous work on architectural description languages (for example, Acme [16]) to describe the components and how they are inter-connected. Optimization policies may be specified as special architectural constraints. As for actions, a simple scripting language with limited capabilities may be used. If we want to allow more general optimization policies and actions, we may need to use a “sandbox” approach with a more general language, for example, Java.
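Purely as an illustration of the kind of information a recipe would have to carry (and not as a proposed recipe language), the video conferencing example might be written down roughly as follows; every attribute name, action string, and policy expression below is invented.

# Illustration only: plain data standing in for a video conferencing recipe.
video_conferencing_recipe = {
    # Ingredients: primitive and complex components of the abstract solution,
    # written the way an SLP-style directory expects them (type + attributes).
    "components": {
        "handheld_proxy":      {"type": "service:handheld-proxy",
                                "attrs": {"codec": "H.261"}},
        "translation_gateway": {"type": "service:video-gateway",
                                "attrs": {"protocols": ["SIP", "H.323"]}},
        "esm_service":         {"type": "service:esm",   # complex component,
                                "attrs": {}},            # composed by a lower-level synthesizer
    },
    # Instructions: actions for building the abstract solution and for
    # configuring the bound components.
    "actions": [
        "group participants by IP multicast zone",
        "use one esm_service proxy per zone",
        "attach each handheld participant to a handheld_proxy",
    ],
    # Optimization policies: constraints and an objective that guide
    # component/resource binding.
    "policies": {
        "constraints": ["path_bandwidth(translation_gateway, client) >= 128kbps"],
        "objective": "minimize total network distance to participants",
    },
}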
7.2 Optimizing service composition
When composing a service instance for a particular user request, there may be many possible instances that can satisfy the request. Of course, we want to be able to choose the “best” one using certain optimization criteria, for example, the cost for the user, the cost for the service provider, resource utilization, runtime network characteristics, user-perceived service quality, and so on. We have identified three types of optimizations: static (optimizations in the recipe), dynamic (optimizations of components/resources binding), and runtime (optimizations after the service instance is started).
7.2.1 Static optimization
Static optimization is done by the service provider when the service recipe is designed. The service provider can put rules into the recipe to optimize the composed service instance for a particular user request. For example, in the ESM service recipe (Figure 4), the service provider tells the synthesizer to figure out whether the participants are in the same IP multicast zone and to use one ESM proxy for each zone (instead of simply using one proxy for each participant without checking the zones). Since participants in the same zone can share one ESM proxy, this optimization can reduce the overall cost of the composed service instance (compared to an instance that uses one ESM proxy for every participant). Another example is that a video conferencing service provider can specify (in the service recipe): “if a conference participant is behind a 56K modem, a video transcoder should be used to reduce the bandwidth of the video stream before it reaches the modem link”. Since static optimizations are specified in the service recipe, the expressiveness of service recipes determines what types of static optimization are possible. For example, to specify the second optimization described above, we need to specify conditional commands (“if”), QoS constraints (“below 100Kbps”), components (“video transcoder”), locations (“before the bottleneck link”), and so on. Therefore, this is closely related to the service recipe representation problem.
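As a rough sketch of the information the second rule above must express (the condition, the QoS threshold, the component, and its location), such a rule might look like the following; the helper names (access_bandwidth_kbps, add_component, insert_before) are hypothetical and stand in for whatever the recipe representation eventually provides.

def low_bandwidth_rule(abstract_solution, participant):
    """If a participant is behind a slow access link (e.g., a 56K modem),
    place a video transcoder in front of that link."""
    if participant.access_bandwidth_kbps < 100:                     # QoS condition
        transcoder = abstract_solution.add_component(               # component
            "service:video-transcoder",
            attrs={"max_output_kbps": participant.access_bandwidth_kbps})
        abstract_solution.insert_before(participant.access_link,    # location
                                        transcoder)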
7.2.2 Dynamic optimization
Dynamic optimizations refer to the optimizations that are done in the components/resources binding process, i.e., when a synthesizer performs components or resources binding, it often needs to select one from a set of eligible choices. For example, suppose a video conferencing synthesizer is trying to find a handheld proxy for a handheld device, and it discovers two candidates that use different pricing schemes (e.g., one is seven
cents per minute, and the other is twenty minutes for 99 cents). The synthesizer can then choose one of them to minimize the cost. As another example, suppose a synthesizer is trying to find a path with at least a certain amount of bandwidth, and there are two eligible paths provided by different ISPs. The synthesizer can then choose one based on some optimization criterion (e.g., cost, reliability, etc.). As described earlier, in our current prototype the only dynamic optimization performed by the synthesizer is to use the network distance information provided by the GNP network measurement service to choose the component with the optimal network location (e.g., closest to all participants). Here are three directions we plan to explore for dynamic optimizations:

Figure 12: Flat composition vs. hierarchical composition ((a) flat and (b) hierarchical arrangements of the VConf synthesizer, ESM synthesizer, handheld proxy, ESM proxy, and video gateway; diagram omitted)

• Flat vs. hierarchical composition: as described earlier, one key property of our service framework is hierarchical service composition, i.e., each of the first-level components may be a complex component that is itself composed by a second-level synthesizer using second-level components, and so on. The advantage of this approach is that a service provider can use another service as a component directly without knowing its details. However, it may sometimes be useful to perform service composition in a “flat” fashion. For example, in Figure 12, suppose the participants of a video conferencing session use a number of different video codecs, so transcoders are needed so that every participant can see the video streams of all the others. In addition, the participants are not in the same IP multicast zone, so an ESM session needs to be established among them. With hierarchical composition, the transcoder placement is done by the first-level synthesizer, while the ESM session is established by a second-level synthesizer. However, a better solution may be found with a flat composition, i.e., the transcoder placement and the ESM session are considered together, since the optimality of the transcoder placement may depend on where the ESM proxies are placed and how the overlay tree is formed, and vice versa.
• Iterative composition: with hierarchical service composition, each lower-level synthesizer performs
local optimization and then returns its composed service instance to a higher-level synthesizer. Therefore, the final global solution is essentially a combination of all the locally optimal solutions. Of course, a combination of locally optimal solutions is not necessarily a globally optimal solution (it may not even be a feasible solution). Another problem is that optimizations may be expensive, and the synthesizers may not have the time or resources to find the absolute optimal solution. One potential solution is to use an iterative approach; a sketch of one such iteration is given after this list. For example, suppose a synthesizer needs two complex components to compose a service instance that has a constraint on end-to-end delay. When the two second-level synthesizers return their composed service instances, the first-level synthesizer discovers that the total end-to-end delay (the sum of the two plus the connection between them) exceeds the constraint. Therefore, another iteration is needed, and this can be repeated until the two returned service instances combined satisfy the delay constraint of the global solution (instead of randomly searching the solution space, the first-level synthesizer may want to provide guidance to the lower-level ones so that they will find solutions that are more likely to result in a better global solution). Furthermore, if optimizations are expensive, the synthesizers can compute approximate solutions and iteratively invest more resources for better solutions.
• Customizable optimization: in our current prototype, the components/resources binding is done using the generic optimization policy provided by our libraries (i.e., use GNP to choose the optimal one). Of course, some service providers (or even users) may have service-specific requirements and prefer special optimization policies. As mentioned earlier, such optimization policies can be specified in the service recipe. For example, in the video conferencing example, the service provider may determine that it is better to choose a video gateway that is not only close to the NetMeeting clients (the default generic optimization) but also has sufficient bandwidth on the paths to those clients. In this case, the service provider can specify such constraints in the service recipe so that the synthesizer can perform the optimization accordingly. There is a spectrum of ways to specify optimization policies in a service recipe. One way is to define a set of QoS constraints that can be used in a recipe to instruct the synthesizer that the composed service instance must satisfy certain constraints. We can also allow a service provider to specify an objective function that the synthesizer should try to minimize/maximize. If the service recipe representation is sufficiently expressive, we can even allow a service provider to write a procedure that is invoked by the synthesizer to perform service-specific optimizations.
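The following sketch illustrates the iterative idea for the end-to-end delay example: a first-level synthesizer splits a delay budget between two second-level synthesizers and shifts the split between iterations as a simple form of guidance. The interfaces (compose_with_budget, delay) are hypothetical, and the reallocation heuristic shown is only one of many possibilities.

def compose_with_delay_budget(synth_a, synth_b, link_delay, budget_ms,
                              max_iterations=5):
    """First-level synthesizer: split an end-to-end delay budget between two
    second-level synthesizers and retry until the combined instance fits."""
    share_a = 0.5                                   # initial split of the budget
    for _ in range(max_iterations):
        remaining = budget_ms - link_delay
        inst_a = synth_a.compose_with_budget(remaining * share_a)
        inst_b = synth_b.compose_with_budget(remaining * (1 - share_a))
        if inst_a.delay + inst_b.delay + link_delay <= budget_ms:
            return inst_a, inst_b                   # feasible global solution
        # Guidance to the lower levels: shift budget toward whichever side
        # overshot its share by more, then try again.
        overshoot_a = inst_a.delay - remaining * share_a
        overshoot_b = inst_b.delay - remaining * (1 - share_a)
        share_a += 0.1 if overshoot_a > overshoot_b else -0.1
        share_a = min(max(share_a, 0.1), 0.9)
    return None                                     # constraint could not be met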
7.2.3 Runtime optimization
After the user starts using the composed service instance (i.e., at runtime), the environment may change. For example, the network conditions may change (e.g., less available bandwidth), a participant may join, leave, or become unreachable, and a user’s preferences may change (e.g., the user now prefers higher-quality video). When the environment changes, what was the optimal solution when the service started may no longer be optimal, and the perceived performance of the service may degrade. Therefore, we may be able to maintain better perceived performance if we perform runtime optimizations. One possible approach is to re-do the dynamic optimization process described above. The problem with this approach is that it may be
too disruptive (the new solution may have a different topology, different nodes, etc.), and restarting the service may become necessary, which is not feasible or desirable for many services. One solution to this problem is to limit runtime optimizations to a set of adaptations that can be performed without restarting the service session. In other words, they should be transparent to the users of the composed service instance. Such adaptations can be specified in the service recipe; for example, a video conferencing service provider can specify that “if the available bandwidth for the handheld device becomes too low, the handheld proxy should reduce the frame rate”. Again, in this case the possible adaptations will be limited by the expressiveness of the service recipe. If an adaptation is not service-specific, it may be implemented as part of a generic synthesizer. Furthermore, some components may already have built-in adaptation mechanisms, so the synthesizer may need to come up with a global adaptation strategy based on these local adaptations and the service-specific adaptations. Note that all these adaptations are “static strategies”, i.e., a service provider and/or a synthesizer specify the adaptation strategies to the components before runtime, and the components adapt accordingly at runtime. This leads to another interesting issue: should the synthesizer be involved in runtime optimizations? There is a trade-off: if the synthesizer is not involved, it can forget about a service instance after the user starts using the instance; if the synthesizer is involved (e.g., the synthesizer can change the adaptation strategies at runtime), it may be able to generate a better adaptation strategy and even recover a service instance if some core components fail. We will start with application-specific adaptations and look at how a service provider can customize the adaptation strategies of the components. Then we will define a set of adaptations that are sufficiently generic that they can be moved into the synthesizer, i.e., a synthesizer can generate these generic adaptation strategies without hints from service providers. Finally, we will experiment with the approach in which a synthesizer keeps track of a composed service instance at runtime, and we will look at what state of a service instance should be kept by the synthesizer to enable certain adaptation strategies, and when the synthesizer should execute a certain adaptation strategy.
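As a sketch of how such recipe-specified, non-disruptive adaptations might be applied at runtime (for example, by a synthesizer that keeps track of the instance), consider the following monitoring loop; all interfaces (is_active, measure, component, set_frame_rate) and the 40 kbps threshold are invented for illustration.

import time

def run_adaptation_loop(instance, rules, poll_interval_s=5.0):
    """instance: a running composed service; rules: (condition, action) pairs
    in which every action only reconfigures an existing component (no restart)."""
    while instance.is_active():
        snapshot = instance.measure()       # e.g., per-link bandwidth, loss rates
        for condition, action in rules:
            if condition(snapshot):
                action(instance)            # apply the pre-specified adaptation
        time.sleep(poll_interval_s)

# Example of a rule a video conferencing provider might put in its recipe:
handheld_rules = [
    (lambda s: s["handheld_downlink_kbps"] < 40,                      # "bandwidth too low"
     lambda inst: inst.component("handheld_proxy").set_frame_rate(5)),
]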
8 Expected Contributions
We expect two main contributions in this thesis. First, we will develop the synthesizer architecture and its supporting infrastructure to support the hierarchically-synthesized service model. We expect to show that this approach can address the limitations of the traditional vertically-integrated service model. By using reusable components and sharing the generic infrastructure, the service development and deployment cost can be significantly lowered. Therefore, service providers can deliver more flexible and sophisticated services to users more easily. In addition, users can receive more efficient and higher-quality services because the synthesizer takes individual users’ requirements and runtime network characteristics into consideration when composing services. We will show the advantages of our approach by evaluating a number of different services and comparing our approach with the traditional approach in terms of costs (e.g., programming effort), performance (e.g., user-perceived quality), and so on. The second contribution is the techniques and algorithms we will devise and evaluate for the service composition problem. We expect to derive a representation that is sufficiently general for defining service recipes. We will also develop techniques for generating optimal abstract solutions, evaluate the effects of
different service composition approaches on component and resource binding, devise generic and service-specific optimization algorithms, and investigate adaptation strategies for runtime optimizations. Again, we will use a number of different services as examples to evaluate the effectiveness of the techniques and algorithms we develop. We believe the results will be applicable to the general problems involved in service composition.
9 Work Items and Time Table
So far we have designed and implemented a prototype synthesizer architecture and libraries for the generic infrastructure. As described earlier, we have also implemented a video conferencing service and a simulated gaming service and performed some preliminary evaluations. The major parts of my proposed work are presented below:
• Optimizing service composition: We will implement and evaluate some static optimization strategies. We believe the implementation experience and evaluation results will be useful for designing an initial recipe representation, as described below. For dynamic optimizations, we will first experiment with the different composition strategies proposed in Section 7.2.2. Then we will investigate different approaches for service-specific optimizations. Finally, we will explore runtime optimizations, starting with simple adaptation strategies as described in Section 7.2.3.
• Syntax and semantics of service recipes: Since the expressiveness of service recipes will determine, for example, what services can be defined and what optimization policies can be specified, we will tackle the service recipe representation problem after we have some initial results from the work on optimization. After we design and implement a prototype representation, we will work on the optimization problems and revise the recipe representation accordingly, and vice versa.
The time table for the work items described above is shown in Figure 13.
10 Related Work
In this section, we summarize related work in three areas: service composition, resource management, and software engineering approaches for component-based systems.
10.1 Service composition
The Xena service broker [6] in the Darwin project [5] takes an input graph (roughly equivalent to our abstract solution) submitted by the application and selects each component in the graph. It can also automatically insert certain semantics-preserving transformations (e.g., video transcoding to reduce bandwidth consumption). Compared with our work, the main difference is that Xena only performs the second part of the service composition (it takes the abstract solution specified by the application), and it does not utilize service-specific knowledge for optimizations (except for semantics-preserving transformations).
Figure 13: Time table (chart omitted; rows cover the pre-proposal prototype, the video conferencing and simulated gaming services, and the preliminary evaluation, plus the proposed work on optimizing service composition, service recipe syntax and semantics, and writing, scheduled over the terms Su01 through Su04)
The AS1 active service framework [2] allows clients to instantiate services on clusters within the network. Ninja [18] is a similar cluster computing environment that supports runtime adaptation and service composition [29]. Iceberg [48] utilizes the Ninja platform to construct an architecture for sophisticated communication services. The Sahara project [39] is developing an architecture to support composition and management of services across independent providers. It provides runtime failure detection and session recovery by constructing an overlay network among clusters [40]. From the service composition perspective, these frameworks utilize the same “service path” model. That is, service composition means finding a path and appropriate components along the path so that the service provided by the provider at one end of the path can be accessed by the client at the other end. Using this model greatly reduces the complexity of the component selection and optimization aspects of service composition. In contrast, we look at the composition and optimization of more general services and utilize both generic techniques and service-specific knowledge. The Panda project [41] explores the automatic application of adaptations for network applications. It also utilizes a path model similar to the one described above, but its focus is on how to remedy a set of problems that occur in a data path between a sender and a receiver. It does not address the more general service composition problem. The SWORD toolkit [38] allows service developers to quickly create and deploy composite web services. Base services (the components) are defined in terms of inputs and outputs, and a developer can specify the inputs and outputs of the desired composed service. A rule-based plan generator is then used to find a feasible “plan”, which can be used by the service developer to deploy the composed service. In contrast to our work, the focus of this work is on composing services for the “providers” (as opposed to
composing services for the “users”). Therefore, a provider still needs to set up the components specified in the plan and provide an integrated service instance to users. The Paths architecture [25] developed by the same group provides a mediation infrastructure to support ad-hoc, any-to-any communication between two heterogeneous endpoints in a ubiquitous computing environment. Their composition model is similar to the path model described above. They also look at how a composition can be manipulated to enhance the performance and reliability of the service [26]. There have also been some specialized architectures that utilize the concept of service composition. The Da CaPo++ architecture [46] uses a set of protocol modules to generate a customized communication protocol that meets the requirements of a particular communication session. TINA [11] is an architecture for composing telecommunication services using components with CORBA [35] IDL interfaces. xbind [27] is a CORBA-based architecture that uses low-level components (e.g., kernel services, display, camera, etc.) to compose services. CitiTime [3] is an architecture that dynamically creates communication sessions by downloading/activating an appropriate service module according to the caller’s and callee’s requirements. Gbaguidi et al. [17] propose a Java-based architecture that allows the creation of hybrid telecommunication services using a set of JavaBeans components. Compared with our work, these architectures are designed for specific service types and platforms, and their main focus is on how to support service composition, not how to perform the composition. In contrast to these efforts, our work will focus on how a service provider can describe a sophisticated service using a generic representation to reduce the costs of development and deployment, and, given this description, how a synthesizer can compose a service instance to satisfy the users’ requirements and adapt to runtime changes. Of course, we can reuse many techniques and mechanisms developed in this previous work to support service composition, for example, service discovery, execution environments for components, fault-tolerant platforms, etc.
10.2 Resource allocation and management
The resource management architecture [9] in the Globus project [13] addresses the problem of resource allocation for applications in a distributed metacomputing environment. Applications specify their resource requirements using a Resource Specification Language (RSL), and application-specific resource brokers translate the high-level abstract requirements into concrete specifications (e.g., 10 nodes with 1GB memory). Such specifications are further “specialized” until the locations of the required resources are completely specified (e.g., 10 nodes at site X). Then the QoS resource manager [15] at each site handles the resource reservation and runtime adaptation. In contrast to our work, the main “components” for their targeted applications are CPU cycles, memory, storage, bandwidth, etc., and the issues of resource selection and optimization are left to the developers of the resource brokers. The Open Grid Services Architecture (OGSA) [14] built on top of the Globus toolkit defines interfaces that allow applications to work together and access resources across multiple platforms. The task of service composition is left to the users. Similar to the Globus resource management architecture, the QoS broker architecture described in [31] allows applications to specify their QoS parameters and translates such parameters into network QoS requirements. However, it does not address specific resource selection and optimization issues.
Since we envision service composition to include low-level resources such as computation servers and network bandwidth, we can utilize many aspects of these frameworks, for example, resource reservation, admission control, translating high-level application requirements into low-level resource specifications, etc.
10.3 Software engineering approaches to component-based systems
Task-driven computing [49] provides a framework that allows a user to interact with computers by specifying, in an application/environment-independent way, the task she wants to accomplish. The framework can then analyze the task and choose appropriate software components to support it. This work does not specifically address the service composition problem; instead, it focuses on task and service management. The Prism task manager in the Aura project [44] extends the task-driven computing concept by supporting task migration, runtime adaptation, and context awareness. Again, this work does not focus on service composition. Spitznagel and Garlan propose a compositional approach for constructing connectors between software components [45]. The model is similar to path-based service composition: they compose the desired connector by applying a series of transformations to the original connector. However, their approach is more general than path-based service composition since the set of supported transformations includes aggregation of multiple connectors and adding a new party to the communication. Acme [16] is an architecture description language (ADL) for component-based systems. It allows one to specify the structure, properties, constraints, and types and styles of an architecture. Using this formal architectural description, the Rainbow project [7] constructs a framework for software architecture-based adaptation. It allows applications to specify repair strategies that can be used to adapt the applications by changing the architectural configuration at runtime if certain architectural constraints are violated. As described earlier, formal descriptions and specifications are needed in several important parts of our proposed work so that a synthesizer can automatically parse them and translate them into actions to compose, optimize, and adapt services. Therefore, the software engineering work described above can help us in these areas: how to describe service components in a systematic way to allow automated service discovery, how to translate service-specific knowledge into generic service recipes, how to let users easily and precisely specify their desired services, and how to transform these specifications into composition, optimization, and adaptation strategies.
References
[1] Active Network Backbone (ABone). http://www.isi.edu/abone/.
[2] E. Amir, S. McCanne, and R. H. Katz. An Active Service Framework and Its Application to Real-Time Multimedia Transcoding. SIGCOMM ’98, 1998.
[3] F. Anjum, F. Caruso, R. Jain, P. Missier, and A. Zordan. CitiTime: a system for rapid creation of portable next-generation telephony services. Computer Networks, 35(5), Apr. 2001.
[4] Battle.net. http://www.battle.net/.
[5] P. Chandra, Y.-H. Chu, A. Fisher, J. Gao, C. Kosak, T. E. Ng, P. Steenkiste, E. Takahashi, and H. Zhang. Darwin: Customizable Resource Management for Value-Added Network Services. IEEE Network, 15(1), Jan. 2001.
[6] P. Chandra, A. Fisher, C. Kosak, and P. Steenkiste. Network Support for Application-Oriented Quality of Service. Sixth IEEE/IFIP International Workshop on Quality of Service, May 1998.
[7] S.-W. Cheng, D. Garlan, B. Schmerl, J. P. Sousa, B. Spitznagel, and P. Steenkiste. Using Architectural Style as a Basis for System Self-repair. The Working IEEE/IFIP Conference on Software Architecture 2002, Aug. 2002.
[8] Y. Chu, S. Rao, and H. Zhang. A Case for End System Multicast. In Proceedings of ACM Sigmetrics, June 2000.
[9] K. Czajkowski, I. Foster, N. Karonis, C. Kesselman, S. Martin, W. Smith, and S. Tuecke. A Resource Management Architecture for Metacomputing Systems. IPPS/SPDP ’98 Workshop on Job Scheduling Strategies for Parallel Processing, 1998.
[10] S. E. Czerwinski, B. Y. Zhao, T. Hodes, A. D. Joseph, and R. Katz. An Architecture for a Secure Service Discovery Service. MobiCOM ’99, Aug. 1999.
[11] F. Dupuy, G. Nilsson, and Y. Inoue. The TINA Consortium: Toward Networking Telecommunications Information Services. IEEE Communications Magazine, 33(11), Nov. 1995.
[12] The Familiar Project. http://familiar.handhelds.org/.
[13] I. Foster and C. Kesselman. Globus: A Metacomputing Infrastructure Toolkit. Intl J. Supercomputer Applications, 11(2):115–128, 1997.
[14] I. Foster, C. Kesselman, J. M. Nick, and S. Tuecke. Grid Services for Distributed System Integration. IEEE Computer, 35(6), June 2002.
[15] I. Foster, A. Roy, and V. Sander. A Quality of Service Architecture that Combines Resource Reservation and Application Adaptation. 8th International Workshop on Quality of Service, 2000.
[16] D. Garlan, R. T. Monroe, and D. Wile. Acme: Architectural Description of Component-Based Systems. In G. T. Leavens and M. Sitaraman, editors, Foundations of Component-Based Systems, pages 47–68. Cambridge University Press, 2000.
[17] C. Gbaguidi, J.-P. Hubaux, G. Pacifici, and A. N. Tantawi. Integration of Internet and Telecommunications: An Architecture for Hybrid Services. IEEE Journal on Selected Areas in Communications, 17(9), Sept. 1999.
[18] S. D. Gribble, M. Welsh, R. von Behren, E. A. Brewer, D. Culler, N. Borisov, S. Czerwinski, R. Gummadi, J. Hill, A. Joseph, R. Katz, Z. Mao, S. Ross, and B. Zhao. The Ninja Architecture for Robust Internet-Scale Systems and Services. IEEE Computer Networks, Special Issue on Pervasive Computing, 35(4), Mar. 2001.
[19] E. Guttman, C. Perkins, J. Veizades, and M. Day. Service Location Protocol, Version 2. RFC 2608, IETF, June 1999.
[20] M. Handley, H. Schulzrinne, E. Schooler, and J. Rosenberg. SIP: Session Initiation Protocol. RFC 2543, IETF, Mar. 1999.
[21] J.-C. Hu and J.-M. Ho. A Conference Gateway Supporting Interoperability Between SIP and H.323 Clients. Master’s thesis, Carnegie Mellon University, Mar. 2000.
[22] ITU-T Recommendation H.261. Video Codec for Audiovisual Services at p x 64 kbit/s, Mar. 1993.
[23] ITU-T Recommendation H.323. Packet-based Multimedia Communications Systems, Nov. 2000.
[24] Jini[tm] Network Technology. http://wwws.sun.com/software/jini/.
[25] E. Kiciman and A. Fox. Using Dynamic Mediation to Integrate COTS Entities in a Ubiquitous Computing Environment. In Proceedings of the Second International Symposium on Handheld and Ubiquitous Computing 2000 (HUC2k), Sept. 2000.
[26] E. Kiciman and A. Fox. Separation of Concerns in Networked Service Composition. Position Paper, Workshop on Advanced Separation of Concerns in Software Engineering at ICSE 2001, May 2001.
[27] A. Lazar, S. Bhonsle, and K. Lim. A Binding Architecture for Multimedia Networks. Journal of Parallel and Distributed Systems, 30(2), Nov. 1995.
[28] The XML C library for Gnome. http://www.xmlsoft.org/.
[29] Z. M. Mao and R. H. Katz. Achieving Service Portability in ICEBERG. IEEE GlobeCom 2000, Workshop on Service Portability (SerP-2000), 2000.
[30] Introduction to the MBone. http://www-itg.lbl.gov/mbone/.
[31] K. Nahrstedt and J. M. Smith. The QOS Broker. IEEE Multimedia, 2(1):53–67, 1995.
[32] Napster. http://www.napster.com/ (website operational as of September 6, 2002).
[33] Microsoft Windows NetMeeting. http://www.microsoft.com/windows/netmeeting/.
[34] T. S. E. Ng and H. Zhang. Predicting Internet Network Distance with Coordinates-Based Approaches. INFOCOM ’02, June 2002.
[35] Object Management Group. CORBA: The Common Object Request Broker Architecture, Revision 2.0, July 1995.
[36] OpenH323 Project. http://www.openh323.org/.
[37] OpenSLP Home Page. http://www.openslp.org/.
[38] S. R. Ponnekanti and A. Fox. SWORD: A Developer Toolkit for Web Service Composition. The Eleventh World Wide Web Conference (Web Engineering Track), May 2002.
[39] B. Raman, S. Agarwal, Y. Chen, M. Caesar, W. Cui, P. Johansson, K. Lai, T. Lavian, S. Machiraju, Z. M. Mao, G. Porter, T. Roscoe, M. Seshadri, J. Shih, K. Sklower, L. Subramanian, T. Suzuki, S. Zhuang, A. D. Joseph, R. H. Katz, and I. Stoica. The SAHARA Model for Service Composition Across Multiple Providers. International Conference on Pervasive Computing (Pervasive 2002), Aug. 2002.
[40] B. Raman and R. H. Katz. Emulation-based Evaluation of an Architecture for Wide-Area Service Composition. International Symposium on Performance Evaluation of Computer and Telecommunication Systems (SPECTS 2002), July 2002.
[41] P. Reiher, R. Guy, M. Yarvis, and A. Rudenko. Automated Planning for Open Architectures. In Proceedings for OPENARCH 2000 – Short Paper Session, pages 17–20, Mar. 2000.
[42] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson. RTP: A Transport Protocol for Real-Time Applications. RFC 1889, IETF, Jan. 1996.
[43] Session Directory. http://www-mice.cs.ucl.ac.uk/multimedia/software/sdr/.
[44] J. P. Sousa and D. Garlan. Aura: an Architectural Framework for User Mobility in Ubiquitous Computing Environments. The Working IEEE/IFIP Conference on Software Architecture 2002, Aug. 2002.
[45] B. Spitznagel and D. Garlan. A Compositional Approach for Constructing Connectors. The Working IEEE/IFIP Conference on Software Architecture (WICSA’01), Aug. 2001.
[46] B. Stiller, C. Class, M. Waldvogel, G. Caronni, and D. Bauer. A Flexible Middleware for Multimedia Communication: Design, Implementation, and Experience. IEEE Journal on Selected Areas in Communication, Special Issue on Middleware, 17(9), Sept. 1999.
[47] vic - Video Conferencing Tool. http://www-nrg.ee.lbl.gov/vic/.
[48] H. J. Wang, B. Raman, C.-N. Chuah, R. Biswas, R. Gummadi, B. Hohlt, X. Hong, E. Kiciman, Z. Mao, J. S. Shih, L. Subramanian, B. Y. Zhao, A. D. Joseph, and R. H. Katz. ICEBERG: An Internet-core Network Architecture for Integrated Communications. IEEE Personal Communications Special Issue on IP-based Mobile Telecommunications Networks, Aug. 2000.
[49] Z. Wang and D. Garlan. Task-Driven Computing. Technical Report CMU-CS-00-154, Carnegie Mellon University School of Computer Science, May 2000.