A UI-driven Approach to Facilitating Effective Development of Rich and Composite Web Applications
Jin Yu
A dissertation submitted in fulfillment of the requirements for the degree of
Doctor of Philosophy
School of Computer Science and Engineering
University of New South Wales
Sydney, NSW 2052, Australia
Supervisor: Prof. Boualem Benatallah
October 31, 2008
Originality Statement

‘I hereby declare that this submission is my own work and to the best of my knowledge it contains no materials previously published or written by another person, or substantial proportions of material which have been accepted for the award of any other degree or diploma at UNSW or any other educational institution, except where due acknowledgement is made in the thesis. Any contribution made to the research by others, with whom I have worked at UNSW or elsewhere, is explicitly acknowledged in the thesis. I also declare that the intellectual content of this thesis is the product of my own work, except to the extent that assistance from others in the project's design and conception or in style, presentation and linguistic expression is acknowledged.’
Jin Yu October 31, 2008
Acknowledgements

It has been a great experience to work as an external PhD student in the School of Computer Science and Engineering, the University of New South Wales. As an off-campus student, I faced numerous challenges in the last three years, due to the physical distance (I reside in California) and time zone differences. I am most grateful to my supervisor, Prof. Boualem Benatallah, who made all this possible.

Foremost, I would like to thank Prof. Benatallah for accepting and accommodating me as an external research student, for he had to spend countless hours on email communications with me. As a dedicated researcher and inspiring mentor, he has taught me many valuable lessons, including key research disciplines and methodologies.

I would like to give special thanks to my co-supervisor, Prof. Fabio Casati, with whom I interacted frequently (when he was at HP Labs in Palo Alto) and had many interesting and fruitful discussions. Prof. Casati was also instrumental in many of my research papers, as he not only gave insightful comments but also directly edited some of the papers.

In addition, I would like to thank my research collaborators Regis Saint-Paul, Florian Daniel, Maristella Matera, and the many students who have participated in the research projects. I enjoyed the collaboration very much and gained invaluable skills by working with them. Among the research students, I wish to express my sincere appreciation to Evi Syukur, who spent many hours helping me with proofreading and with the printing and binding of thesis hardcopies.

Finally, I am grateful to my parents and my wife for their endless support and encouragement throughout my PhD study. I express special thanks to my wife, who took on the dreadful task of packing and unpacking when we moved to a bigger house, allowing me to concentrate on the thesis and finish it on time.
Abstract

It is well recognized that the development of user interfaces is one of the most time-consuming tasks in the overall application development process. At the same time, there is an increasing demand for rich and fluid user interfaces from web users. As a result, developers are facing increasing challenges in delivering web applications, especially those with rich UI requirements. In this thesis we present two solutions to facilitate the execution and rapid development of web applications with rich user interfaces. The first solution is a rich internet application (RIA) framework aimed at providing high usability and productivity to web applications, while the second solution is a UI integration framework that simplifies web application development by facilitating the composition of reusable UI components.

The foundation of our RIA framework is an XML-based high-level protocol for communicating asynchronous events and incremental UI updates on the web. The protocol facilitates rich and highly interactive UI, while eliminating frequent and slow page refreshes and providing a more responsive user experience. Built on top of the protocol, a server-side runtime allows UI logic code to be executed on the server side, while a set of server-side event-driven APIs enables developers to implement sophisticated application-specific UI behavior. On the client side, a thin client renders UI and processes native events, but leaves application-specific logic to the server side. The thin client thus allows end users to enjoy a rich UI experience in a safe client environment, without executing any downloaded code.

The proposed UI integration framework includes an abstract UI component model which allows UI components to be programmatically manipulated via events, operations, and properties, essentially exposing UI as services.
To facilitate component interactions, the framework offers an event-based composition model, which allows integration logic to be specified in the form of event listeners.
Composite applications are executed via a lightweight runtime middleware, which provides component adapters that allow the middleware to communicate with native UI components implemented in a variety of languages and platforms. Finally, a graphical development environment allows composite applications to be built in a drag-and-drop fashion.
Future Work
Publications
Project Web Site
Bibliography
List of Figures

Figure 2.1 Standard dialog
Figure 2.2 Architecture of AJAX applications
Figure 2.3 X Window System
Figure 3.1 Protocol message exchange
Figure 3.2 Before button click
Figure 3.3 After button click
Figure 3.4 Before clicking "Add to list" button
Figure 3.5 After clicking "Add to list" button
Figure 3.6 Selecting list item "Chocolate Chip"
Figure 4.1 Example application: XCat
Figure 4.2 OpenXUP framework architecture
Figure 4.3 XComponent class
Figure 4.4 Classes related to events
Figure 4.5 XApplication class
Figure 4.6 XUserSession class
Figure 4.7 SUL classes
Figure 4.8 SUL events
Figure 4.9 SUL event masks
Figure 4.10 XHTML classes
Figure 4.11 XML parsing performance
Figure 4.12 Request processing performance
Figure 5.1 XUPClient architecture
Figure 5.2 Open URL dialog
Figure 5.3 Initial window
Figure 5.4 After button click
Figure 5.5 XUPClient's system menu
Figure 6.1 Component integration at different layers
Figure 6.2 HousingMaps
Figure 7.1 The National Park Guide
Figure 7.2 Component-defined event vs. native UI event
Figure 7.3 Relationship between UI components and application logic / data services
Figure 8.1 National Park Guide (event-based model)
Figure 9.1 Architecture of the UI integration framework, Mixup
Figure 9.2 New York Times example
Figure 9.3 Event automation
Figure 9.4 Component adapters
Figure 9.5 Architecture of Mixup's development environment
Figure 9.6 Eclipse-based Mixup Editor
Figure 9.7 AJAX-based Mixup Editor
List of Tables

Table 2.1 Comparison of web-based UI technologies
Table 3.1 XUP protocol elements for incremental UI updates
Table 3.2 Managing XUP status requests
Table 3.3 Protocol comparison
Table 4.1 Server code size
Table 4.2 XML parsing performance
Table 4.3 Request processing performance
Table 4.4 Code size comparison (classic HTML vs. OpenXUP)
Table 5.1 Client code size
Table 6.1 Comparison of UI integration approaches
Table 9.1 Application startup performance
Table 9.2 User study: composition development time
Code Listings

Listing 3.1 UI model example (in SUL)
Listing 3.2 XUP request for Example 1
Listing 3.3 XUP response for Example 1
Listing 3.4 XUP request for Example 2 (button click)
Listing 3.5 XUP response for Example 2 (after button click)
Listing 3.6 XUP request for Example 2 (selecting list item)
Listing 4.1 UI template example
Listing 4.2 Multiple UI languages example
Listing 4.3 C# source code fragment for XCat
Listing 4.4 UI template "catalog.xml"
Listing 4.5 "Selection-changed" event request
Listing 5.1 Event selector example
Listing 5.2 Startup XUP request
Listing 5.3 Startup XUP response
Listing 5.4 Event request for button click
Listing 5.5 Event response for button click
Listing 7.1 UISDL descriptors (National Park Guide)
Listing 7.2 Binding for the park listing component
Listing 7.3 UISDL descriptor with multiple bindings
Listing 8.1 Composition model description (National Park Guide)
Listing 8.2 Process-based integration logic
Listing 8.3 Component descriptors for credit application and vehicle registration
Listing 8.4 XPIL fragment for content consolidation
Listing 8.5 Component descriptors with value-changed events
Listing 8.6 Component descriptors for YouTube and the news report
Listing 8.7 Composition model for the multimedia news report
Listing 8.8 Component descriptors for Google Maps and PImage
Listing 8.9 Composition model for the real estate application
Listing 8.10 Composite component for the original park guide
Listing 8.11 Composition model for the multimedia park guide
Listing 9.1 UISDL descriptors for Yahoo pipe news feed and YouTube video
Listing 9.2 XML response from the "getWeatherInfo" operation
Listing 9.3 Generated JavaScript object for the XML response
Listing 9.4 Composition model for the New York Times example
Chapter 1. Introduction
1. Introduction

Web-based applications and services are now the dominant form of software development and deployment. It is well recognized that a significant amount of the time spent developing an application goes into its user interface (UI) [Mye92]. Therefore, UI development has become a key element in the web application development process [Bri02][Gar02].

Traditionally, web applications use HTML to render their user interface. UI is delivered as HTML pages coupled with some simple JavaScript for data validation. This form of UI is very simple, but lacks true interactivity, since HTML was originally designed to render text with embedded images and therefore does not offer rich UI controls [Jel04]. In addition, web pages are always reloaded after each user action (e.g., a mouse click). This is very annoying to end users, since it takes time to reload a web page even if only a very small portion of the user interface needs to be updated.

In response, developing rich internet applications (RIA) has become an essential part of the recent Web 2.0 [ORe05] trend. In RIA, interactions are no longer page-based; that is, user actions do not always result in page refreshes, and only the changed UI elements are re-rendered. Thus, web UI becomes as fluid and as rich as traditional desktop UI. RIA benefits not only end users but also the overall network infrastructure, as it requires fewer round trips to the servers and the data involved are typically small (i.e., not entire pages of data). At the same time, it presents new challenges to developers, because the shift from the traditional page-based programming model to RIA may require additional skills and development effort.

With the added rich UI requirement, the already time-consuming task of web UI development becomes even more daunting. To save development time and effort, the reusability of UI is of critical importance. There are many UI toolkits, such as Java Swing¹ and .NET Windows Forms², which provide pre-packaged classes modeling fine-grained UI controls (e.g., buttons and menus). However, the reusability of the UI remains low since the unit of reuse is low-level UI controls such as buttons and panels, not high-level UI components encapsulating real application functionalities.

High-level UI components are essentially the presentation front-ends of web applications or services. Examples of such UI components are: a stock quote portlet that retrieves stock prices from Yahoo, a weather gadget that displays weather information from WeatherBug's weather database³, and street maps such as Google Maps⁴ and Yahoo Maps⁵. The availability of these high-level UI components will dramatically shorten the UI development time of web applications and services.

Once the required UI components are in place, they need to be put together to form an integrated web application. This process is what we call UI integration or composition, where UI components from various sources are assembled to behave in a synchronized fashion in the composite web application [Yu07b][Yu07c]. This is similar to traditional application integration and web service composition, where components (i.e., services) are integrated to form composite applications or services. With the availability of high-level UI components and appropriate UI integration middleware, web applications can be quickly put together, without having to redevelop UI from scratch every time.

In this chapter, we first discuss the requirements and challenges in developing UI for web applications. We then outline our major contributions. Finally, we describe the structure of this dissertation.
1.1 Requirements and Challenges

In this section we outline the requirements and challenges in facilitating web UI development. We separate the challenges into two areas: rich internet applications and UI integration.
³ http://www.weatherbug.com
⁴ http://maps.google.com/
⁵ http://maps.yahoo.com/
1.1.1 Developing Rich Internet Applications

To provide end users with a rich and responsive user interface, rich internet applications face the following challenges:

High usability and productivity. Traditional HTML-based web user interfaces lack true interactivity, because HTML was originally designed to render hypertext, not rich UI. This reduces both the web application's usability and the end user's productivity. What's needed is a set of rich UI controls equivalent to those found in desktop applications, so that the web UI can be as fluid and as rich as its desktop counterpart. For developers, this calls for a rich UI toolset, with a familiar programming model similar to those found in desktop GUI toolkits. Essentially, developers should be able to develop web applications the same way they develop desktop applications.

Asynchronous and incremental UI updates. HTML's page-based model requires the entire page to be refreshed for each user action, even though typically only a small portion of the UI needs to be updated. However, users expect fast response times from their applications. This implies the use of an asynchronous mechanism to perform UI operations while computing in the background. Additionally, UI updates should be applied incrementally, eliminating the latency and annoyance associated with slow but frequent page refreshes. Furthermore, both asynchronous and incremental UI functionalities should be provided by the framework, without the need for complex programming techniques such as multi-threading and callback functions.

Secure client environment. While many rich web client technologies have tried to address the limitations of HTML, their common approach is to download code to be executed on the client side (i.e., in the browser). The downloaded code could be binary (e.g., applet byte code) or text (e.g., JavaScript), both of which pose security risks. Therefore, with the increasing number of internet-based security risks (e.g., identity theft⁶), the client-side environment should only execute safe UI code (i.e., markup), without compromising the security of the end user's computer.

Development complexity. Many rich web client technologies leverage JavaScript to manage and extend the client-side UI. However, compared to code written in traditional programming languages, large amounts of JavaScript code remain much harder to develop, test, and maintain. Although browser incompatibilities have been mitigated by the availability of sophisticated JavaScript toolkits, the performance of the JavaScript interpreter still varies greatly among browsers [Wei08]. In addition, developers now need to be concerned about intellectual property protection, as downloaded code can be easily viewed and copied on the client side.

With traditional HTML applications, developers do not need to worry about network issues since both UI logic and business logic code reside on the server side. However, with rich web client technologies, UI logic is now on the client side, and therefore the developers are responsible for the network communication between the client-side UI logic and the server-side business logic (e.g., catching network exceptions). Ideally, the development environment should allow developers to be oblivious of the deployment method. The burden of managing client/server communications should be handled by the runtime framework, not by the developers.
1.1.2 UI Integration

To facilitate the rapid development of web-based user interfaces, UI integration frameworks face the following challenges:

Reusable high-level UI components. Application componentization has long been a common practice in web application development; that is, application functionalities are packaged as modules or components to be reused. When developing a web application, the developer first selects application components containing the desired functionality and then glues them together appropriately. After that, she builds a user interface for the integrated web application. As a result, user interfaces are typically redeveloped from scratch every time. To simplify the development of web-based user interfaces, it is important to componentize them so that they can be reused, just like application functionalities.
⁶ http://en.wikipedia.org/wiki/Identity_theft
The granularity of reuse should be high-level UI components encapsulating real application functionalities, not low-level UI controls such as buttons and panels. A high-level UI component can be regarded as the presentation tier of a web application component or module. Examples of such UI components are: a news ticker that displays headlines from CNN, a stock quote portlet that retrieves stock prices from Yahoo, and a weather gadget that displays weather information from WeatherBug. A high-level UI component may in fact consist of many low-level controls. The weather gadget, for example, may contain a panel which houses a graph control for displaying the barometer, two text labels for displaying high and low temperatures, a text field for inputting a zip code, and a few buttons to cycle through weather forecasts for different days or ranges of days. Obviously, the desired unit of reuse is the entire weather gadget, not the low-level UI controls within it.

Component heterogeneity. There are a variety of component technologies that can be used to develop UI components. Those technologies (and their underlying languages and platforms) are typically incompatible with one another. In addition, components are typically built by different developers before the development of the composite application. As a result, the composite application developer must be able to cope with existing heterogeneous UI components from a variety of sources. For example, when putting together a PIM (Personal Information Management) application, the developer may need to integrate a .NET calendar component, a Java applet task list, and an AJAX-based address book. All three components were built with incompatible component technologies⁷. This is an overwhelming task, as the composition developer must possess intimate knowledge of multiple component technologies in order to reuse components built with them.
Therefore, what's needed is a facility that allows components built with different technologies to be reused in the same composite application, while at the same time hiding the platform and language differences from the composition developer. Hence, the key is the ability to facilitate communication among components from different technologies so that they can work seamlessly in the same composite application.

⁷ The three components may in fact be available in the same component technology. For illustration purposes, we assume they were inaccessible to the developer for reasons such as license and distribution restrictions.

UI composition. While the composition techniques in application integration (e.g., service composition) have been well researched, there has been little study in the UI area. As we have already discussed the importance of reusing user interfaces, proper UI composition techniques are needed to assemble UI components into new, value-added composite applications. Currently, web-based UI composition exists mostly in the form of combining page clips, where several HTML fragments enclosed in <div> or <iframe>⁸ are combined to produce a new page. This form of UI composition is very limited in functionality. For example, the page clips cannot effectively communicate or interact with one another; they just sit side by side in the final page.

Therefore, what's needed is a composition model that facilitates communication and interaction among UI components, while at the same time providing a consistent visual layout. This ensures that the UI components will behave in a synchronized fashion in the composite application. In addition, the composition model needs to be simple yet effective. Simplicity will facilitate quick adoption by developers and tech-savvy business users, and may eventually enable end-user compositions.
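To make the interaction requirement above concrete, the following is a minimal Python sketch of two UI components that stay synchronized through an event listener, in contrast to page clips that merely sit side by side. It is an illustration only, under stated assumptions: the class names (`UIComponent`, `ParkList`, `MapView`), the methods `on`, `emit`, `select` and `show_location`, and the event name `park_selected` are all hypothetical and are not taken from the framework described in this thesis.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, Dict, List

# A listener receives the event payload (hypothetical convention).
Listener = Callable[[Dict[str, Any]], None]


class UIComponent:
    """Hypothetical stand-in for a high-level UI component."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.state: Dict[str, Any] = {}
        self._listeners: DefaultDict[str, List[Listener]] = defaultdict(list)

    def on(self, event: str, listener: Listener) -> None:
        """Register integration logic to run when `event` is raised."""
        self._listeners[event].append(listener)

    def emit(self, event: str, payload: Dict[str, Any]) -> None:
        """Notify every listener registered for `event`."""
        for listener in self._listeners[event]:
            listener(payload)


class ParkList(UIComponent):
    def select(self, park: str) -> None:
        # A user interaction changes the component's state and raises an event.
        self.state["selected"] = park
        self.emit("park_selected", {"park": park})


class MapView(UIComponent):
    def show_location(self, place: str) -> None:
        # An operation that other components can invoke to request a state change.
        self.state["location"] = place


park_list = ParkList("park_list")
map_view = MapView("map_view")

# The composition itself: one listener wiring an event to an operation.
park_list.on("park_selected", lambda payload: map_view.show_location(payload["park"]))

park_list.select("Yosemite")
```

After the last call, the map component's state reflects the selection made in the list component, even though neither component refers to the other directly; the wiring lives entirely in the listener registration.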
1.2 Contributions Overview

Our goal is to facilitate and simplify the development of web-based, rich user interfaces. To achieve this goal, we propose an RIA framework to develop highly interactive web user interfaces, and a UI integration framework to facilitate the development of composite applications by reusing existing, heterogeneous UI components.
⁸ These are HTML container elements that allow any web content to be embedded.
1.2.1 Developing Rich Internet Applications

To provide a rich web UI experience to end users and to simplify the development of highly interactive user interfaces, we propose an RIA framework (called OpenXUP), consisting of the following ingredients.

UI transport protocol. The foundation of our RIA framework is the Extensible User Interface Protocol (XUP) [YC02], an XML-based high-level protocol for communicating events and incremental user interface changes on the web. We chose SOAP/HTTP [Gud07a][Gud07b] as the default binding protocol because it is well understood, and its implementations on different platforms are widely available. Alternative bindings such as REST⁹ [Fie00] are also possible. In XUP, user actions result in UI events, which are sent as requests to the server side for processing. Event requests can be delivered from the client to the server asynchronously, so end users will find applications to be much more responsive. After processing the events, the server sends back a response containing the necessary UI updates. Since the UI updates are incremental, not one full page at a time, end users will no longer experience slow page refreshes.

Server-side runtime and development environment. OpenXUP offers a server-side runtime environment similar to that of traditional web applications. That is, UI logic and behavior are programmed on the server side. This allows the applications to be centrally managed, without the hassles of client-side software maintenance. Unlike traditional runtimes, which return a full HTML page in each response, our runtime keeps track of UI changes and only returns UI deltas in each response. This is achieved by maintaining a UI model that corresponds to the exact user interface rendered to the end user.

OpenXUP's development environment includes a set of event-driven APIs, which enable developers to implement sophisticated application-specific UI behavior on the server side. OpenXUP APIs are designed to be familiar, closely resembling the APIs of desktop GUI toolkits. This allows developers to quickly migrate their existing desktop-based applications. In addition, since all application code resides on the server side, it makes web applications easier to debug and maintain, without the need to worry about issues arising from distributed computing.

OpenXUP is very extensible in that it does not dictate a particular UI model. It places no restriction on the UI control set, the properties or events associated with each control, or the style or appearance of the UI. That is, OpenXUP can work with practically any UI model that has an XML-based representation.

Finally, web applications built with OpenXUP are fully compatible with existing backend technologies (EJB¹⁰, CORBA [OMG08], COM¹¹, etc.), since OpenXUP's server side is designed to run within existing, established web application servers. This allows OpenXUP-based applications to leverage all existing backend data and business logic components.

Thin client. OpenXUP employs a thin client design while at the same time providing rich UI to end users. Similar to browsers, OpenXUP's client side remains thin in terms of application logic; that is, no application code is executed on the client side. The client renders UI and processes native events, but leaves application-specific logic to the server side. However, the client takes advantage of the desktop's computing power to enable a rich and interactive user experience for end users. It fully leverages the rich UI capability offered by native desktop GUI toolkits such as Windows Forms and Java Swing, while at the same time maintaining a small footprint. As the client is thin, the security risks associated with executing downloaded code, whether binary or script, are avoided altogether. This ensures that end users will enjoy a rich UI experience in a safe client environment.

Implementation.
To validate our approach, we provide an implementation of the proposed RIA framework, which includes a full implementation of the XUP protocol, a .NET-based server-side runtime environment, a set of event-driven APIs for application development, and a thin client that can be executed either as a standalone application or as an Internet Explorer plugin. The implementation prototype fully leverages industry standards such as SOAP and XML. We also developed a lightweight UI modeling language called Simple User Interface Language12 (SUL) in order to build some sample applications. However, since OpenXUP is completely independent of the actual UI model, any UI model with an XML representation can be used (e.g., XUL13 [Goo01], XAML [MS07b], and UIML14 [Abr99]).

10 http://java.sun.com/products/ejb/
11 http://www.microsoft.com/com
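The delta-based update cycle described for XUP above can be illustrated with a minimal sketch. It is not the OpenXUP runtime and does not use the XUP message format (which is XML over SOAP/HTTP); plain Python dictionaries stand in for messages, and the names UIModel, handle_event, counterLabel, and incrementButton are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class UIModel:
    """Server-side mirror of the UI currently rendered on the client (hypothetical)."""
    properties: dict = field(default_factory=dict)  # (control_id, prop) -> value
    pending: list = field(default_factory=list)     # changes not yet sent to the client

    def set(self, control_id, prop, value):
        key = (control_id, prop)
        # Record a change only if the rendered value would actually differ.
        if key not in self.properties or self.properties[key] != value:
            self.properties[key] = value
            self.pending.append(
                {"control": control_id, "property": prop, "value": value})

    def flush(self):
        """Return the accumulated UI delta and start a new, empty one."""
        delta, self.pending = self.pending, []
        return delta


def handle_event(model, event):
    """Server-side event handler: runs application logic, returns only the delta."""
    if event["type"] == "click" and event["control"] == "incrementButton":
        current = model.properties.get(("counterLabel", "text"), "0")
        model.set("counterLabel", "text", str(int(current) + 1))
    return model.flush()


model = UIModel()
model.set("counterLabel", "text", "0")
model.set("incrementButton", "label", "Add one")
initial_ui = model.flush()  # first response: every property is new
delta = handle_event(model, {"type": "click", "control": "incrementButton"})
# delta contains a single entry for counterLabel; the button is not resent
```

The point of the sketch is that the response to the click carries one changed property rather than a full page, which is the behavior the text attributes to the server-side UI model.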
1.2.2 UI Integration

To further simplify the development of web applications with sophisticated user interfaces, we propose a UI integration framework (called Mixup) aimed at the development of composite applications by reusing heterogeneous UI components.

UI component model. Aiming at combining simplicity with effectiveness, we propose a UI component model to represent the presentation front-ends of existing web applications or modules. The key observations are that UI components require 1) a conceptual, application-specific notion of state (e.g., the location and the zoom level for street maps), 2) operations to request state changes, 3) events to notify of state changes that are primarily caused by user interactions, and 4) layout and appearance characteristics to give a consistent look and feel to the composite application.

The proposed model is abstract, meaning that it is not tied to specific implementation technologies. As a result, it can be used to describe existing UI components developed with heterogeneous component technologies. This allows us to model UI components as services, where an abstract UI component may have multiple bindings to native component implementations.

To describe UI components, we propose the UI Service Description Language (UISDL), which models a UI component as a service by describing both the abstract component model and its bindings to concrete implementations. That is, a UISDL document contains the events, operations, and properties of a UI component, as well as their bindings to native implementations. The design of UISDL closely follows that of WSDL [Chr01][Chi07a].

UI composition model. Aiming at integration in the presentation layer, we propose an event-based composition model, as we believe that UI integration is mostly event-driven (i.e., driven by user interactions). The composition model includes event subscription information to facilitate the communication among UI components, in the form of event listeners, where each listener maps an event from one component to an operation of another component. For cases where event-based one-to-one mapping is insufficient, additional integration logic (e.g., sequencing and flow control) may also be specified in the form of simple scripts or references to external code. This allows script-based process logic to complement event-based, declarative integration logic. In addition, when direct mappings between event parameters and operation inputs are infeasible, additional data mappings and transformations can be specified in XSLT [Cla99b] within event listeners. Finally, the composition model also includes layout and positioning information so that the UI components can be positioned properly in the composite application.

To describe the UI composition model, we propose the eXtensible Presentation Integration Language (XPIL), which is an XML-based language for modeling composite applications. XPIL is declarative, since UI integration logic is primarily event-based; this in turn makes it easy to author and interpret.

UI consolidation and embedding. Since composite applications are built with components from various sources, very often two or more UI components in the same composite application may overlap in content or functionality. Therefore, to ensure a coherent presentation of the composite application, we have devised a content consolidation mechanism to merge, propagate, or hide semantically identical UI content from different UI components.

12 http://openxup.org/TR/sul.pdf
13 http://www.mozilla.org/projects/xul
14 http://www.uiml.org
In a composite application, it is often necessary to place a component inside another. For example, in a real estate application, it is desirable to place a small thumbnail image of the selected property on a map showing the property location. Here, the thumbnail image is provided by a real estate listing component and the map is provided by a map service such as Google Maps. Hence, we propose a component embedding mechanism that allows one component to be embedded into another, where the two components may not be aware of each other a priori.

Runtime middleware. In conjunction with the UI composition model, we provide a lightweight middleware for the execution of composite web applications. The runtime middleware interprets the XPIL document containing the composition logic and the UISDL documents that represent the involved UI components. At runtime, the middleware facilitates the interactions among the components by capturing events fired by one component and dispatching them to operations of subscribing components.

Component adapters and inspectors. In order to support heterogeneous components, the runtime middleware supports the notion of component adapters, which allow the middleware to communicate with components built with different component technologies. Using these adapters, the middleware permits the integration of UI components developed using a wide variety of technologies, as long as the corresponding component adapters are available. This reinforces the notion of UI as a service, where component adapters facilitate the bindings from the abstract UI component model to concrete native component implementations. That is, the abstract model of a UI component is bound to one of its bindings through an appropriate component adapter at runtime.

On the development side, we introduce the notion of component inspectors, which allow the automatic generation of component descriptors (i.e., UISDL documents) from native UI components. As long as a component technology (e.g., Java, .NET) offers an appropriate meta-language facility (e.g., reflection), a component inspector can be used to discover a legacy component's native events, operations, and properties and then generate a component descriptor with appropriate bindings to the native component implementation.
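To make the idea of a component inspector concrete, the following sketch uses Python's standard inspect module as the reflection facility. It is only an illustration under stated assumptions: the Slider class is a hypothetical stand-in for a native component, the convention of listing event names in a class attribute called events is our own, and the output is a plain dictionary rather than a UISDL document.

```python
import inspect


class Slider:
    """Hypothetical native UI component used only for illustration."""
    events = ("valueChanged",)  # assumed convention for declaring events

    def __init__(self):
        self._value = 0

    @property
    def value(self):
        return self._value

    def set_value(self, value):
        self._value = value


def inspect_component(cls):
    """Discover operations, properties, and events of a component class."""
    operations, properties = [], []
    for name, member in inspect.getmembers(cls):
        if name.startswith("_"):
            continue  # skip private and special members
        if isinstance(member, property):
            properties.append(name)
        elif inspect.isfunction(member):
            # Drop the implicit `self` parameter from the operation inputs.
            inputs = list(inspect.signature(member).parameters)[1:]
            operations.append({"name": name, "inputs": inputs})
    return {
        "component": cls.__name__,
        "events": list(getattr(cls, "events", ())),
        "operations": operations,
        "properties": properties,
    }


descriptor = inspect_component(Slider)
```

A real inspector would additionally emit the binding information that ties each abstract operation or event to its native counterpart; that part is omitted here.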
Implementation. To validate our approach, we provide an implementation of the proposed UI integration framework, Mixup, which includes a runtime middleware for the execution of composite applications and a development environment to facilitate the application design and development process.

The runtime middleware is a lightweight JavaScript library that can be executed in any standard browser. It instantiates UI components defined in UISDL documents and coordinates their interactions according to the composition logic specified in the XPIL document.

We provide two development tools to assist the building of composite applications. The first tool is an Eclipse GEF15 based visual editor. It allows developers to efficiently create component descriptors (i.e., UISDL documents) and composition models (i.e., XPIL documents) in a drag-and-drop fashion. The second tool is web-based (AJAX) and serves the same purpose. Since it can be executed in the browser, no software installation is necessary. As a result, the AJAX version is more suitable for tech-savvy end users who demand quick and easy access, whereas the Eclipse version may be ideal for developers who require maximum features with efficiency.
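The event-to-operation wiring that the middleware performs can be sketched as follows. This is a simplified illustration, not the Mixup runtime: the actual middleware is a JavaScript library driven by XPIL and UISDL documents, whereas here listeners are registered programmatically, and a Python function stands in for the XSLT data transformation. All class, method, and event names are hypothetical.

```python
class UIComponent:
    """Abstract UI component with state, operations, and events (hypothetical API)."""

    def __init__(self, name):
        self.name = name
        self.state = {}
        self._subscribers = {}  # event name -> list of callbacks

    def subscribe(self, event, callback):
        self._subscribers.setdefault(event, []).append(callback)

    def fire(self, event, **params):
        for callback in self._subscribers.get(event, []):
            callback(params)


class PropertyList(UIComponent):
    def select(self, address):
        """Operation: change state, then notify listeners of the change."""
        self.state["selected"] = address
        self.fire("propertySelected", address=address)


class MapView(UIComponent):
    def show_location(self, location, zoom=15):
        self.state.update(location=location, zoom=zoom)


def add_listener(source, event, target, operation, transform=None):
    """Map an event of `source` to an operation of `target`."""
    def dispatch(params):
        # Optional data mapping between event parameters and operation inputs.
        args = transform(params) if transform else params
        getattr(target, operation)(**args)
    source.subscribe(event, dispatch)


listings = PropertyList("listings")
street_map = MapView("map")
add_listener(listings, "propertySelected", street_map, "show_location",
             transform=lambda p: {"location": p["address"]})
listings.select("1 Example Street")
# street_map.state is now {"location": "1 Example Street", "zoom": 15}
```

Neither component refers to the other; the listener is the only place where the two are connected, which is what keeps the composition declarative.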
1.3 Thesis Organization

The remainder of this thesis is structured into two parts. In Part 1, we discuss our proposed RIA framework, OpenXUP, and in Part 2, we present our UI integration framework, Mixup.

In Part 1, we start with a discussion of the current state of the art of rich internet applications in Chapter 2. We identify several dimensions that can be used to characterize different RIA approaches, followed by a survey of related work using the dimensions as a guideline. Next, we present the details of the proposed RIA framework, OpenXUP. In particular, we discuss the UI protocol, XUP, in Chapter 3, the server-side runtime / development environment and its implementation in Chapter 4, and the design and implementation of the thin client in Chapter 5.
15 http://www.eclipse.org/gef/
Part 2 focuses on the proposed UI integration framework, Mixup. Chapter 6 discusses the current state of the art of UI integration, in both research and commercial development. We identify several dimensions that can be used to characterize different UI integration techniques, followed by a survey of several representative approaches in the field, using the dimensions as a guideline. Next, we present the details of the proposed UI integration framework, Mixup. First, we describe the abstract UI component model together with the concept of UI as services in Chapter 7, followed by a thorough discussion of the event-based UI composition model in Chapter 8. The UI composition model also includes content consolidation and component embedding mechanisms to enable a seamlessly integrated UI. After that, we describe the overall UI integration framework in Chapter 9, which includes a lightweight runtime middleware to execute composite applications, as well as a graphical development environment to facilitate rapid UI composition. In the same chapter, we also present the notion of component adapters to support legacy, heterogeneous components, and the notion of component inspectors to support the automatic creation of component descriptors from native component implementations. We then conclude the chapter with a discussion of the framework implementation.

Finally, in Chapter 10, we give concluding remarks on this thesis and discuss possible directions for future work.
Part 1: OpenXUP – an RIA Framework
Part 1: OpenXUP
Chapter 2. Background and State of Art in RIA
2. Background and State of Art in RIA

RIA has become a key component of Web 2.0, as more and more users demand rich web user interfaces that can greatly improve the usability and productivity of web applications. As a result, numerous RIA technologies with varying architectures have emerged in both research and industry, resulting in different features and functionalities. In this chapter we present the current state of the art in rich internet application development by illustrating the features and differences, strengths and weaknesses of several leading RIA approaches. First, we give an overview of the field, followed by a set of dimensions that can be used to compare and characterize related research and technologies in this area. Finally, we discuss related work in this area using the dimensions as guidelines.
2.1 Dimensions for Characterizing RIA Technologies

To discuss related technologies in the RIA field, we need a set of dimensions to compare them. First, RIA technologies have varying degrees of usability and productivity. Second, the location of UI code dictates the architecture of the solutions, with impact ranging from security and communications to intellectual property protection. In addition, the support of asynchronous UI allows user interfaces to be more interactive, without having users "wait" for the UI. Finally, the development environment dictates how easily RIA applications can be developed; this includes, for example, choices of programming languages and APIs.
2.1.1 Usability and Productivity

The primary purpose of rich internet applications is to improve the usability and productivity of web applications. By usability, we mean whether a user interface is easy to learn and operate. A highly usable interface is thus intuitive and allows users to start using the application without extensive initial training [BV03]. Similarly, productivity refers to whether a user interface allows users to work with the web application efficiently [Bai88]; i.e., to achieve a task with the minimum number of mouse clicks and/or keystrokes.
UI conventions and metaphors improve a user interface's usability and productivity and reduce its learning curve. There are many established UI conventions in desktop UI [Mye00]. For example, if a dialog box in Windows contains a "Yes" and a "No" button, users can either click on the buttons to invoke the desired functionality, or they can use keyboard shortcuts: the "Enter" key for "Yes" and the "Escape" key for "No" (see Figure 2.1). Similarly, many keyboard shortcuts help users navigate menus and edit text (e.g., "control-c" and "control-v" to copy / paste text).
Figure 2.1 Standard dialog

This type of convention is much less established in web UI. Very few web applications support keyboard shortcuts. Some newer AJAX-based [Gar05] applications do support them, but their conventions differ from one application to another; for example, GMail16 and Yahoo Mail17 each define their own set of shortcuts. As a result, the keyboard shortcuts offered by these applications are only used by a limited number of power users.

Desktop UI is very mature as a result of many years of research and development. Desktop-based UI toolkits offer many rich, powerful, and standardized UI controls with which users are very familiar. Examples are tree, combo box (editable list), slider, and context menu. Therefore, to provide a high level of usability and productivity in web applications, users should be able to interact with a set of rich UI controls equivalent to those found in desktop applications.
16 http://gmail.com
17 http://mail.yahoo.com
2.1.2 UI Code Location

We use the term UI code to refer to the code that renders UI (i.e., UI definition) as well as the code that handles UI events caused by user actions (i.e., UI logic). For web applications, the code that renders UI is usually in the form of markup (e.g., HTML), whereas UI logic code must be programmed in scripting or traditional programming languages.

UI code can be located either on the client side or on the server side. UI definition is processed on the client side, since the rendering of UI markup is usually performed by the web browser. It is the location of UI logic code that determines the architecture of an RIA solution.

When UI logic is located on the client side, the code needs to be downloaded to be executed by the client. The execution of downloaded code may pose security risks to the end user's computer. Therefore, both .NET and Java have sophisticated security sandboxes to restrict the execution of downloaded code. In addition, for trusted execution, downloaded code must be certified by a public certification authority (CA). On the other hand, if UI logic code is located on the server side, client-side security is no longer a concern, as no downloaded code needs to be executed by the client.

UI logic code on the client side needs to handle issues arising from client/server communications. Typically, the client-side UI logic code employs some form of RPC mechanism to communicate with the server-side application logic code. During this process, the UI code must catch any exceptions caused by network issues. On the other hand, if UI logic code is located on the server side, the communication between UI logic and application logic is much more reliable, so developers do not need to worry about network issues in the UI logic code.

Finally, if UI logic code is downloaded to the client side, the intellectual property (IP) associated with that code needs to be carefully evaluated.
If the downloaded code is script-based, IP cannot be protected since anyone can view the source code in the browser. Binary-based byte code (e.g., Java classes and .NET assemblies) could be decompiled, so the associated IP could also be revealed. However, IP protection becomes a non-issue if UI logic code is on the server side, as the code is never exposed.
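The communication concern raised in this section can be illustrated with a small sketch of a client-side UI event handler that calls the server through an RPC-style function. Everything here is hypothetical: ServerUnavailable, on_save_clicked, and make_flaky_remote are names introduced for illustration, and the fake remote function simply fails once to simulate a network problem.

```python
class ServerUnavailable(Exception):
    """Hypothetical error raised when a remote call cannot reach the server."""


def on_save_clicked(remote_save, form_data, show_message):
    """Client-side UI handler: it must anticipate network failures."""
    try:
        order_id = remote_save(form_data)
    except ServerUnavailable:
        show_message("Could not reach the server. Please try again.")
        return None
    show_message(f"Saved as order {order_id}")
    return order_id


def make_flaky_remote():
    """Fake remote call that fails once, then succeeds (illustration only)."""
    calls = {"count": 0}

    def remote_save(form_data):
        calls["count"] += 1
        if calls["count"] == 1:
            raise ServerUnavailable("connection reset")
        return 1000 + calls["count"]

    return remote_save


messages = []
remote = make_flaky_remote()
on_save_clicked(remote, {"item": "book"}, messages.append)  # first call fails
on_save_clicked(remote, {"item": "book"}, messages.append)  # retry succeeds
```

When the same handler runs on the server side next to the application logic, the call to save the order is an ordinary local call and the try/except around the network boundary is not needed in the UI logic.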
2.1.3 Asynchronous UI

Users expect their applications to be very responsive. However, many applications block users from further interactions while performing computations (e.g., by presenting an hourglass cursor to the users). To make applications more responsive and interactive, some form of asynchronous mechanism must be employed so that the UI remains operational while computation proceeds in the background. Asynchronous UI makes applications appear more responsive, but it adds complexity to application development. Applications must support background computation via multi-threading or callback functions, both of which require additional effort on the developer's part.
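As a rough illustration of the multi-threading and callback approach mentioned above, the following sketch runs a lengthy task on a worker thread and hands the result back to the UI thread through a queue. The UIThread class is a toy stand-in for a GUI toolkit's event loop and is entirely hypothetical; real toolkits provide their own mechanisms for marshalling calls onto the UI thread.

```python
import queue
import threading


class UIThread:
    """Toy event loop standing in for a GUI toolkit's UI thread (hypothetical)."""

    def __init__(self):
        self._callbacks = queue.Queue()

    def run_in_background(self, task, on_done):
        def worker():
            result = task()
            # Hand the result back instead of touching the UI from the worker.
            self._callbacks.put(lambda: on_done(result))
        threading.Thread(target=worker, daemon=True).start()

    def process_one(self, timeout=5.0):
        """Run the next queued callback on the calling (UI) thread."""
        self._callbacks.get(timeout=timeout)()


ui = UIThread()
log = []
ui.run_in_background(lambda: sum(range(1_000_000)), log.append)
log.append("still responsive")  # the UI thread was not blocked by the task
ui.process_one()                # the result callback now runs on the UI thread
# log == ["still responsive", 499999500000]
```

The extra moving parts (a worker thread, a queue, and a callback) are the additional developer effort the text refers to.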
2.1.4 Development Environment

An application development environment offers languages and APIs to create rich internet applications. Both traditional languages such as Java or C# and scripting languages such as JavaScript can be used to program rich UI behavior. Traditional languages may be more appropriate for large-scale development, since they are more structured and have mature tool support. In addition, it is likely that developers can use the same programming language to code both UI logic and application logic, allowing an easier integration. Scripting languages are less verbose and easy to get started with. However, a large amount of script code is hard to maintain, and scripting may therefore not be suitable for applications with complex UI logic.

APIs play an important role in creating rich UI behavior. Since UI logic mainly consists of the handling of UI events caused by user actions, the API should be event-driven, reflecting the nature of rich user interfaces. In order to support rich and highly interactive UI, the API should support the manipulation of rich UI controls and the handling of their associated events, similar to those found in desktop-based GUI toolkits. This also brings familiarity to developers who have experience in desktop GUI programming. Finally, the API should be extendable, so that developers can create their own custom UI controls.
2.2 RIA Technologies

A large number of RIA technologies exist to aid the development and deployment of web applications with rich and interactive UI. In this section, we attempt to group them into several categories and compare their strengths and weaknesses along the dimensions introduced earlier. Although classic HTML-based web applications cannot be regarded as rich internet applications, they serve as a baseline against which RIA technologies can be compared. After all, the primary goal of RIA technologies is to overcome the limitations of classic HTML-based applications.
2.2.1 Classic HTML

Traditionally, web applications are built with HTML pages. We call this the classic HTML approach, where web pages are generated on the server side, with some manually inserted JavaScript code for data validation. Each user interaction (e.g., clicking on a link or a button) in the browser results in an HTTP request, which in turn causes a new page to be generated and returned.

This page-based model lacks true user interactivity, since HTML was designed explicitly for presenting passive documents and therefore lacks many useful user interface controls. In addition, frequent page refreshes are very annoying to end users, since it takes time to reload a web page even if only a very small portion of the user interface needs to be updated. This also wastes network resources, since a full web page is always downloaded after each user interaction. Overall, user interfaces developed using the classic HTML approach lack both usability and productivity.

Any programming language can be used in this approach. However, the API is page-based. That is, the API revolves around how to efficiently generate HTML pages, mostly based on a combination of a templating mechanism with embedded code. Examples are JavaServer Pages18 (JSP), Active Server Pages19 (ASP), and PHP20. The page-based programming model does not capture the notion of UI events, and therefore cannot provide a high level of interactivity to end users.

Since UI logic is programmed on the server side, no code needs to be downloaded and executed by the client browser. Therefore, client-side security is not a concern. In addition, since both UI logic code and application logic code reside on the server side, developers do not need to worry about communication issues between the two. Finally, IP protection is not an issue because UI logic code is kept on the server side and never exposed to the users.

The classic HTML approach does not support asynchronous UI. Users must not act on the UI before the browser finishes the ongoing request / response cycle. If the user attempts to interact with the application (e.g., clicking on a link or pressing a button), the current request / response cycle will be interrupted, resulting in adverse effects on the application. While terminating a search may not cause much damage, interrupting a credit card transaction may result in the user's credit card being charged without the order being booked into the seller's inventory system.

18 http://java.sun.com/products/jsp
2.2.2 Rich Client Technologies

To overcome the limitations of classic HTML-based applications, a slew of client-side technologies have emerged. They are typically deployed as browser plug-ins and require downloaded UI code to be executed on the client side.

ActiveX is one of the first technologies that allow downloadable code to be executed inside the browser. An ActiveX control is typically packaged in a CAB [MS05] file that is downloaded on demand, while the browser renders the HTML file containing a reference to the CAB file. Once downloaded and instantiated, an ActiveX control may execute any native Windows code. Therefore, an ActiveX control may contain any UI controls provided by Windows, resulting in a rich and interactive UI that is identical to Windows' desktop UI experience.
19 http://www.asp.net/
20 http://www.php.net/
Since an ActiveX control allows any native code to be executed on the user's computer, it can cause many harmful effects. Therefore, Internet Explorer requires downloaded binaries to be signed by a trusted public certificate authority (CA). At runtime, the certificate issued to the ActiveX control is presented to the user, asking for her permission to install / execute the control. However, the user usually has no idea of what is inside the control, so she can either reject it or blindly grant the permission. Once permission is granted, the downloaded code can do practically anything to the user's computer.

ActiveX mixes the UI definition code that renders the UI and the UI logic code that provides UI behavior in response to user actions. Both are programmed in whatever programming language is used to implement the ActiveX control. This makes UI appearance customization less flexible, as it requires UI source code to be changed and recompiled. Since ActiveX controls are on the client side, developers must explicitly handle client / server communications. Although ActiveX controls are downloaded by the browser and exposed to the end users, IP protection is not a concern since they are compiled machine binaries.

On the development side, ActiveX controls can be created using many traditional programming languages, such as C++ and VB. ActiveX controls can access any APIs provided by the desktop operating system, including Windows' graphical libraries. The ActiveX API itself is extendable, allowing developers to create custom controls. Finally, ActiveX does not offer any explicit mechanism for asynchronous UI, so developers must use callback functions or multiple threads to support it.

Java Applet is Java's alternative to ActiveX controls. It allows Java code to be executed inside the browser. Instead of native machine code, a Java applet consists of Java classes made of high-level, platform-neutral byte code. Java applets must be executed by the Java browser plug-in, which is available as part of the Java Runtime Environment (JRE) installation.

Since Java applets have access to most of Java's desktop GUI APIs, they may provide the same level of usability and productivity as desktop Java applications. Specifically, the APIs are event-driven and include a comprehensive set of UI controls for building rich and interactive user interfaces. The APIs are also extendable: developers can build custom UI controls based on one or more built-in Java UI controls. Java has good support for callback methods and multi-threading, so developers can take advantage of those facilities to implement asynchronous UI.

On the security side, Java Applet provides a sandbox mechanism in addition to signatures from a trusted public CA. The security sandbox restricts which APIs can be called by the downloaded code, so that applets cannot perform harmful operations (e.g., removing local files) on the user's computer.

Similar to ActiveX controls, Java applets mix UI definition code and UI logic code, both of which must be programmed in Java. As a result, applet source code must be changed and recompiled in order to alter or customize the UI appearance of an applet. Since applets reside on the client side, developers must manage client / server communications through RMI or other messaging facilities. Although applets are in binary form, they can be easily decompiled. Therefore, developers must ensure that valuable IP is not exposed through downloadable applets.

.NET Smart Client [Hil04] offers several improvements to ActiveX. It allows .NET code to be executed inside the browser. Similar to Java applets, a .NET smart client contains high-level byte code rather than native machine code. The byte code is stored in one or more .NET assemblies (DLL files), which are executed by the .NET Common Language Runtime (CLR). However, the .NET CLR is only pre-installed on newer versions of Windows (i.e., Vista), so users with older versions of Windows must manually install the CLR.
Smart clients have access to the .NET class libraries, including .NET Windows Forms. As the APIs provided by Windows Forms are event-driven and include a comprehensive set of UI controls, smart clients can offer a rich and interactive web UI on par with desktop UI. In addition, developers can create custom UI controls by extending the built-in UI controls provided by Windows Forms. Since .NET has comprehensive support for callback methods and multi-threading, developers can leverage those facilities to implement asynchronous UI, providing a more responsive user experience.

To mitigate client-side security risks, .NET provides Code Access Security (CAS) through a complex set of security APIs to restrict what can be done by the downloaded code. However, since the security APIs are very complex, they add further challenges to the already challenging task of rich web UI development.

Similar to ActiveX controls and Java applets, smart clients mix UI definition code and UI logic code, both of which must be programmed in languages supported by .NET (e.g., C#, VB.NET, C++). As a result, the source code of a smart client must be changed and recompiled in order to alter or customize the UI appearance of the smart client. Since smart clients reside on the client side, developers must manage client / server communications through messaging facilities such as .NET Remoting [OH01] or ASP.NET Web Services [How01]. Similar to Java applets, although in binary form, .NET assemblies can be easily decompiled. Therefore, developers must ensure that valuable IP is not exposed through downloadable .NET assemblies.

Flash is a popular browser plug-in that was originally designed to render animations (i.e., vector graphics) in browsers. Rich UI functionality was added later. Due to its ubiquitous status among browsers, quite a few RIA frameworks have been built on top of Flash; for example, Flex and OpenLaszlo. All these frameworks leverage the Flash browser plug-in to render rich UI and provide user interactions.

Although Flash-based frameworks provide many rich UI controls, the look and feel is very different from standard desktop UI controls such as those from Windows Forms and Java Swing. A new look and feel may appear attractive, but the disadvantage is that users have to learn a new set of UI conventions and metaphors. In addition, keyboard shortcuts are not adequately supported, which negatively impacts the user's productivity.

Developing Flash-based UI involves creating two types of files: MXML [Coe03] (or LZX30 for OpenLaszlo), an XML-based declarative language for modeling UI, and ActionScript31, a JavaScript-based language provided by Adobe (previously Macromedia). MXML files contain the UI definition and ActionScript files contain the UI logic. Since the Flash browser plug-in cannot directly interpret MXML or ActionScript, they must be compiled into an SWF32 file, which can then be executed by the Flash plug-in. Separating UI definition from UI logic allows an application's UI appearance to be changed without affecting its UI logic. However, since MXML and ActionScript are typically compiled into a single SWF file, changing either one requires a recompilation of both.

With Flash-based applications, both UI definition and UI logic code (in compiled form) reside on the client side. As the UI logic code may contain any ActionScript, client-side security becomes an issue just as with Java applets and ActiveX controls. In addition, developers must manage client / server communication via messaging facilities such as BlazeDS33. Although the SWF file format is binary, it can be easily decompiled to reveal ActionScript source code. Therefore, developers must ensure that valuable IP is not exposed through compiled ActionScript in downloadable SWF files.
30 http://www.openlaszlo.org/lps/docs/reference/
31 http://www.adobe.com/devnet/actionscript/
32 http://www.adobe.com/devnet/swf/
33 http://opensource.adobe.com/wiki/display/blazeds/
Chapter 2. Background and State of Art in RIA Flex allows UI definition to be specified in MXML and UI logic to be programmed, whereas OpenLaszlo uses a combination of LZX and JavaScript. The APIs are extendable so that custom UI controls can be added by developers. Since Flash runtime does not support multi-threading, developers must manually divide lengthy computation tasks and use callbacks to achieve asynchronous UI. Silverlight
34
is Microsoft's alternative to Flash. It is deployed as a browser plug-in and
provides rich user interactions with sophisticated graphical and multimedia capabilities. Comparing to Flash, Silverlight has a relatively small deployment base, since it is still in early product lifecycle and has limited platform / browser support. Developing Silverlight applications involves coding in XAML for UI definition and JavaScript for UI logic. Unlike Flash, Silverlight can directly interpret XAML and JavaScript. JavaScript code can be either embedded in XAML document or can be referenced externally. Separating UI definition from UI logic allows an application's UI appearance to be changed without affecting its UI logic. XAML provides a rich set of UI controls. In addition, starting from version 2.0, Silverlight can execute any downloaded .NET assemblies (subject to the same security restrictions as .NET smart clients). This enables developers to use rich UI controls from both XAML and .NET Windows Forms libraries, resulting a high level of usability and productivity similar to desktop applications. With Silverlight-based applications, both UI definition and UI logic code reside on the client side. As the UI code may consist of JavaScript and downloaded .NET assemblies, client-side security becomes an issue just like Flash and .NET Smart Client. In addition, developers must manage client / server communication via messaging facilities such as .NET Remoting or ASP.NET Web Services. Since JavaScript is in source code form and .NET assemblies can be easily decompiled, developers must ensure that valuable IP is not exposed through downloaded JavaScript and .NET assemblies. On the development side, Silverlight supports JavaScript for coding the application's UI logic. With the ability to execute .NET assemblies in Silverlight 2.0, any .NET
Part 1: OpenXUP
languages can be used to develop rich UI in Silverlight. As for the API, applications can use JavaScript to access and manipulate XAML's Document Object Model (DOM) [Hor00], as well as the HTML DOM [Ste03] in the containing browser. In addition, Silverlight 2.0 applications may call any .NET libraries, including those providing rich user interactions. Both the JavaScript API and the .NET API are event-driven and can be extended to create custom UI controls. For Silverlight 2.0 applications, .NET's comprehensive support for callback methods and multi-threading can be leveraged to implement asynchronous UI, providing a more responsive user experience.
34 http://silverlight.net/
2.2.3 Server-side Approaches
Since the client-side technologies mentioned above all require users to install browser plug-ins to execute downloaded code, several server-side approaches have emerged that allow UI logic to be programmed at the server side, just like classic HTML applications. Improving upon the page-based programming model, they offer abstractions that represent rich UI controls and a set of APIs resembling their desktop counterparts.
ASP.NET Web Forms [She01] provides an event-driven programming model for page-based applications. In addition to the standard UI controls offered by HTML, it provides a set of extended UI controls resembling the ones found in the desktop environment. The goal is to provide developers with an API similar to desktop UI toolkits. Since ASP.NET runs on top of the .NET framework, any .NET programming language can be used to develop ASP.NET Web Forms applications. Its API is extensible, so developers can easily create custom UI controls.
In ASP.NET Web Forms, UI definition is provided in ASP pages, while UI logic primarily consists of event handlers that can be programmed in any .NET language. The separation of UI logic from UI definition allows the UI appearance to be changed or customized without altering the UI logic code (and vice versa). Client-side security is no longer an issue since the UI logic code is on the server side and consequently no downloaded code needs to be executed on the client side. In addition, developers do not need to manage client / server communications since both UI logic and application logic reside on the server side. IP protection is also not an issue because no source code is exposed to the users.
Although ASP.NET Web Forms provides many desktop-like UI controls and an associated event-driven programming model, it is still limited by the basic controls offered by HTML browsers, because the rich UI controls are composed of basic HTML controls. For example, the calendar control is made of an HTML table containing links to each day of the month. In addition, most user actions result in a request / response cycle that returns a full HTML page, causing an annoying page refresh. Furthermore, asynchronous UI is hard to implement because the UI logic runs on the server side, so the developer has no control over the client side without resorting to advanced JavaScript techniques (e.g., AJAX). As a result, although ASP.NET Web Forms improves the web UI development process by offering an event-driven API with desktop-like UI controls, the end users' usability and productivity remain unchanged compared to classic HTML-based applications.
JavaServer Faces35 (JSF) provides a Java-based, extensible, event-driven API for developing web applications on the server side. A unique feature of JSF is that it offers a set of abstract UI controls that can be bound to multiple concrete UI controls. For example, the "UISelectBoolean" control represents a single boolean (true or false) value. It can be rendered as a check box for one application and as a toggle button or switch for another. Similarly, the "UISelectOne" control allows users to choose one item from a collection of items, and it can be rendered as a dropdown list, a single selection box, or even a menu. In other words, JSF UI controls are abstract and can be rendered in a variety of forms depending on the binding implementations.
35 http://java.sun.com/javaee/javaserverfaces/
With JSF, UI logic code resides on the server side, so client-side security is not an issue since the client does not execute downloaded code. In addition, developers do not need to manage client / server communications as both UI logic and application logic reside on the server side. Since source code is never exposed to users, IP protection is no longer an issue. In JSF, UI definition is provided in JSP pages, while UI logic primarily consists of event handling code in Java. The separation of UI logic from UI definition allows the UI appearance to be changed or customized without altering the UI logic code (and vice versa).
JSF's default HTML / JSP based implementation is very similar to ASP.NET Web Forms, and therefore it suffers from the same HTML-based limitations, resulting in low usability and productivity. Similarly, asynchronous UI is hard to implement because the UI logic runs on the server side, so the developer has no control over the client side without using advanced JavaScript techniques (e.g., AJAX). Since JSF only standardizes the API (the abstract UI controls), it could be used with other rich UI languages (e.g., XUL) and rich client technologies (e.g., AJAX). However, similar to ASP.NET Web Forms, JSF still maintains the "page" concept, which is not ideal for highly interactive user interfaces that are fluid and without page boundaries.
XForms [Boy07] improves upon traditional HTML Forms36 by providing better client-side events and data validation. XForms applications can be developed in a variety of programming languages at the server side. However, the APIs remain page-based, as in classic HTML applications. As XForms provides comprehensive client-side events and data validation, very little JavaScript is required on the client side. Consequently, UI logic code resides on the server side, so developers do not need to worry about client-side security, client / server communication, or IP protection issues. With XForms, UI definition is provided in XML, and UI logic is programmed in whatever language is supported by the server runtime, so it is easy to change or customize an application's UI appearance without affecting its UI logic. XForms inherits HTML's page-based model; i.e., the response to a form submission replaces either the current form or the entire containing page.
In addition, it only offers a limited set of UI controls and asynchronous UI is not supported. Therefore, for end users, the usability and productivity remain low, while for developers, it is still hard to build rich web user interfaces with XForms.
36 http://www.w3.org/TR/html401/interact/forms.html
2.2.4 AJAX
AJAX (Asynchronous JavaScript and XML) [Gar05] is a development technique that heavily utilizes client-side JavaScript to enhance the richness and interactivity of HTML-based applications. Its core is the XMLHttpRequest37 object (XHR), which allows web applications to retrieve data from the server asynchronously in the background without blocking users from interacting with the existing page. At runtime, the base HTML page is first rendered by the browser. Subsequent user interactions trigger client-side JavaScript code that updates the UI by manipulating the HTML DOM. At the same time, data is retrieved behind the scenes (i.e., asynchronously) using the XHR object, in the form of XML or JSON38.
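The asynchronous retrieval pattern described above can be illustrated outside the browser. The following is a minimal Python sketch using asyncio rather than JavaScript and XHR; the function `fetch_data`, the 50 ms delay, and the returned payload are hypothetical stand-ins for a real server request. It only shows that user interactions continue to be handled while a request is pending, and that a small part of the UI is updated once the data arrives.

```python
import asyncio

async def fetch_data(delay):
    """Hypothetical stand-in for an XHR call: waits without blocking the event loop."""
    await asyncio.sleep(delay)
    return {"items": ["a", "b", "c"]}

async def main():
    log = []
    # Start the background request; control returns to the caller immediately.
    request = asyncio.create_task(fetch_data(0.05))
    # The "UI" keeps handling user input while the request is still pending.
    for click in range(3):
        log.append(f"handled click {click}")
        await asyncio.sleep(0)  # yield to the event loop
    # Once the data arrives, only the affected part of the UI is updated.
    data = await request
    log.append(f"updated list with {len(data['items'])} items")
    return log

log = asyncio.run(main())
```

In a browser the same roles are played by the XHR object and its completion callback; the sketch does not model DOM manipulation.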
Figure 2.2 Architecture of AJAX applications [Gar05]
37 http://www.w3.org/TR/XMLHttpRequest/
38 http://www.json.org/
Another advantage of AJAX is that the UI can now be updated incrementally, without annoying page refreshes. In fact, most user interactions result in only a small portion of the page being updated; this is achieved through DOM manipulation via client-side JavaScript. Using XHR, data communication with the server happens behind the scenes, so users are not blocked from further interacting with the UI. As a result, AJAX-based applications offer improved usability and productivity over classic HTML applications.
There are a large number of AJAX frameworks and development toolkits, offering many different APIs for building rich web user interfaces; some remain page-based, while others are more event-driven. The dominant approach is to provide client-side JavaScript libraries that offer rich UI controls and simplify some common tasks (e.g., hiding browser incompatibilities and facilitating communications with the server). Developers then write client-side JavaScript code that calls methods or functions from these libraries when needed. Since JavaScript is hard to maintain, several AJAX toolkits allow applications to be developed in traditional programming languages but executed on the client side. For example, Google Web Toolkit39 (GWT) allows developers to create AJAX applications in Java. When the applications are deployed, GWT employs a cross-compiler to translate the Java code into optimized JavaScript code that can then be executed on the client side.
With AJAX, UI logic is programmed in JavaScript and resides on the client side, while UI definition is provided in HTML and CSS [Bos98]. Although toolkits such as GWT allow UI logic to be programmed in Java, it is still executed as JavaScript on the client side at runtime. As a result, AJAX applications are subject to the same security risks faced by other rich client technologies due to client-side code execution and JavaScript vulnerabilities.
Since the UI logic resides on the client side while the application logic is on the server side, developers must manage client / server communications either directly using XHR or through wrapper functions provided by AJAX toolkit libraries. In addition, developers must take IP protection into consideration as JavaScript code can be viewed by any users.
39 http://code.google.com/webtoolkit/
Although AJAX offers improved usability and productivity over classic HTML applications, it still cannot compare with desktop-based UI. First, AJAX has no consistent user interface guidelines. Applications built with one AJAX toolkit may differ significantly from ones built with another toolkit. As a result, users face the challenge of learning a variety of new UI conventions and metaphors. In addition, keyboard shortcut support is quite limited due to potential conflicts with the built-in keyboard shortcuts of different browsers. Although many AJAX applications provide limited keyboard shortcuts to improve productivity, users often ignore them because of inconsistencies among applications and browser focus issues. Finally, the lack of consistent UI guidelines also hampers application accessibility.
While AJAX promises to improve interactivity through asynchronous UI, JavaScript's performance remains a major obstacle [Wei08]. For example, using JavaScript to perform drawing tasks (e.g., vector graphics) or to process large amounts of XML data will likely make the application less responsive to user input. In addition, browser incompatibility remains an issue due to the lack of standardization of JavaScript support across different web browsers and even across different versions of the same browser. As an AJAX toolkit may not cover all browsers (or all versions of a particular browser) required by the target application, the developer inevitably needs to perform manual tweaking, resulting in high development cost.
2.2.5 Remote Display Technologies
Remote display technologies can be used to display remote UI and graphics on the local desktop. Mature technologies such as X1140, NeWS [Gos90], and Display PostScript (DPS) [Ado93] offer low-level UI protocols to transport bitmaps and low-level primitives. In X11, everything is based on bitmaps and windows, so the protocol does not offer higher-level abstractions such as buttons or scrollbars. Rich UI controls are instead provided by toolkit libraries (e.g., Motif41) that are built on top of the low-level primitives. Usability and productivity are both high because these are essentially desktop UI technologies. From the network's point of view, the local desktop is actually the "server", since it must open a port to receive graphical instructions from remote programs.
40 http://www.x.org/
41 http://www.opengroup.org/motif/
Applications can be developed in traditional programming languages like C and C++, and for NeWS and DPS, PostScript42 can be used directly. APIs are event-driven and extensible. UI logic and UI definition are programmed together; there is no easy way to separate the two. Asynchronous UI must be supported manually (e.g., by dividing up long-running tasks and using callback functions).
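To make the manual approach to asynchronous UI more concrete, the following Python sketch shows one way a long-running task can be divided into chunks that reschedule themselves on a single-threaded event loop. It is an illustration only and is not taken from any of the toolkits discussed; the `EventLoop` class and the `sum_in_chunks` function are hypothetical names introduced here.

```python
from collections import deque

class EventLoop:
    """Tiny single-threaded loop (hypothetical): runs queued callbacks one at a time."""
    def __init__(self):
        self.queue = deque()

    def call_soon(self, callback, *args):
        self.queue.append((callback, args))

    def run(self):
        while self.queue:
            callback, args = self.queue.popleft()
            callback(*args)

def sum_in_chunks(loop, numbers, chunk_size, on_done, start=0, total=0):
    """Process one chunk, then reschedule itself so other callbacks can run in between."""
    end = min(start + chunk_size, len(numbers))
    total += sum(numbers[start:end])
    if end < len(numbers):
        loop.call_soon(sum_in_chunks, loop, numbers, chunk_size, on_done, end, total)
    else:
        on_done(total)  # completion callback delivers the final result

trace = []
loop = EventLoop()
loop.call_soon(sum_in_chunks, loop, list(range(10)), 4,
               lambda result: trace.append(("done", result)))
loop.call_soon(trace.append, "ui event")  # a UI event queued behind the first chunk
loop.run()
```

Because the task yields after each chunk, the queued UI event is handled before the computation finishes, which is the effect the callback-based style aims for.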
Figure 2.3 X Window System [GE06]
Since UI logic code resides remotely, there are no client-side code execution risks or IP protection issues. In addition, since both UI logic and application logic reside remotely, developers do not need to manage client / server communications themselves.
42 http://en.wikipedia.org/wiki/PostScript
However, since the local desktop must open a server port to receive remote graphical instructions, it is exposed to a wide range of network attacks. In addition, the port may be blocked by network firewalls, making these technologies unsuitable for wide area networks such as the Web. Furthermore, these protocols have very high network bandwidth requirements, as typically every keystroke and mouse movement is transported over the network. Therefore, these protocols only work well in a local area network environment, where network resources are plentiful.
Remote desktop technologies allow the entire remote desktop to be displayed on the local desktop. Examples are Virtual Network Computing43 (VNC) and Remote Desktop Protocol (RDP) [MS07a]. Their protocols have similarly high bandwidth requirements. However, unlike the remote display technologies mentioned above, they are not designed for UI development, so they do not offer any development toolkits or UI libraries. Their primary goal is desktop sharing and thin client computing. Desktop sharing facilitates remote administration (e.g., server monitoring and technical support), whereas thin client computing allows multiple thin clients to connect to the same server to share its computing resources.
RemoteJFC44 [Lok02] is a distributed user interface toolkit based on the JFC (Java Swing) API. It facilitates the development of server-side applications by leveraging a JFC-like remote API that displays its UI output on the local desktop. RemoteJFC has two components: the client is a viewer with the Java Swing look and feel; the server contains the application code, which uses an API that mimics JFC. Through the server-side API, the application code processes UI events from the client and sends UI updates to the client. Following the JFC model, UI logic and UI definition are programmed together on the server side, so there is no easy way to separate the two. RemoteJFC offers a high level of usability and productivity since the client renders the UI using Swing, so its look and feel are identical to those of desktop-based Swing applications. However, as applications have no direct control over the client, asynchronous UI cannot be easily supported.
43 http://www.realvnc.com/
Since all UI code resides on the server side, RemoteJFC is not subject to client-side code execution risks or IP protection issues. However, it uses RMI for network communication, which is firewall unfriendly. In addition, it requires two-way RMI servers. When the client sends UI events to the server, the server behaves as an RMI server; when the server sends UI updates to the client, the client also becomes an RMI server. This architecture poses serious security problems. Finally, RemoteJFC has very high bandwidth requirements, as the client may send a large number of UI events to the server, including high-frequency ones such as mouse movements. Therefore, it is unsuitable for deployment on wide area networks.
2.2.6 Rich UI Languages
There are a number of XML-based UI markup languages that aim to provide desktop-like UI controls not found in HTML. Examples are XUL, UIML, XIML45, XAML, MXML, and LZX. They typically provide a set of rich UI controls with comprehensive event support. These UI languages allow UI definition to be specified declaratively in XML. To specify UI logic, most of them allow scripts to be embedded in the markup to provide the necessary UI behavior (i.e., handling events). For more extensive UI logic, the rich client technologies mentioned earlier also allow external code (script or binary) to be downloaded and executed. These UI languages alone are not sufficient to implement rich UI experiences. They must be used in conjunction with some client-side or server-side software (or protocol), such as the various RIA technologies mentioned earlier in this section.
2.2.7 Comparisons
In this section we summarize and compare different web-based UI development approaches. The following table provides a comparison of the aforementioned approaches along the characteristic dimensions introduced earlier.
Table 2.1 Comparison of web-based UI technologies

Technology | Usability and productivity | UI logic code location | Asynchronous UI | Development environment
-----------|----------------------------|------------------------|-----------------|------------------------
Classic HTML | Low | Server; separate UI definition and UI logic | No | Page-based API
Rich clients (ActiveX, Java applet) | High (desktop-like) | Client; mixed UI definition and UI logic | Using callbacks or multiple threads | Event-driven API with rich UI controls; traditional languages
Rich clients (Flash, Silverlight) | High (desktop-like) | Client; separate UI definition and UI logic | Using callbacks or multiple threads | Event-driven API with rich UI controls
Server-side approaches | Low | Server; separate UI definition and UI logic | No | Event-driven API
AJAX | Medium to high | Client; separate UI definition and UI logic | Using XHR object | Ad-hoc scripting API
Remote display technologies | High (desktop-like) | Remote; mixed UI definition and UI logic | Using callbacks or multiple threads | Event-driven API with rich UI controls; traditional languages

The main advantage of the rich client technologies is the improved usability and productivity of their user interfaces. However, since they must download and execute code on the client side (whether binary or script), they all suffer from client-side code execution risks and face IP protection challenges. As a result, many end users choose not to install downloaded code (e.g., applets) when prompted, even when it is certified by public CAs. In addition, due to the same security concerns, many organizations do not allow the installation of downloaded code, with a few exceptions for well-known applications such as Acrobat PDF Reader.
The recommended practice with these rich client technologies is to keep pure UI logic code on the client side and application logic code on the server side. However, without a hard limit, developers tend to mix UI and application logic code on the client side, making the client-side code (and the overall application) hard to maintain. As a complex UI implies a large UI logic code size (not to mention any client-side application logic code), the downloaded code can quickly become bulky as the application evolves. Slow downloads are annoying to end users, both for the first-time download and for any subsequent code updates.
With the server-side approaches, there are no client-side security or IP protection issues, as no code is downloaded and executed by the client. In addition, compared to classic HTML, they offer better development support since event-driven APIs are more natural than page-based APIs. However, since they rely on the basic controls in HTML, usability and productivity remain low.
Compared with the rich client technologies, AJAX's major advantage is that no plug-in needs to be installed. However, it is subject to similar client-side code execution and IP protection issues, since UI logic code (i.e., JavaScript) must be downloaded and executed on the client side. As with the rich client technologies, developers also tend to mix UI logic and application logic code on the client side, making the application hard to maintain. Although numerous AJAX toolkits are available, the large amount of JavaScript code remains a development challenge. Finally, AJAX's usability and productivity cannot compare with those of the rich client technologies or traditional desktop UI, due to the lack of consistent UI guidelines and JavaScript's performance limitations.
The remote display technologies are very mature and have been widely deployed. They offer desktop-grade UI with high usability. However, their high bandwidth requirements and security restrictions call for a well-controlled local network environment (e.g., an enterprise LAN), making them unsuitable for wide area networks such as the Web.
2.3 Summary
In this chapter we have introduced some basic concepts in rich internet applications and presented the state of the art in current research and industrial efforts. In particular, we have provided an overview of the representative approaches by illustrating their strengths and weaknesses along a set of characteristic dimensions. Our findings indicate that, although tremendous effort has been invested and many results have been obtained in this area, RIA technologies are far from mature and still face open research challenges. In the next few chapters, we will discuss and present our solution to RIA.
Chapter 3. UI Transport Protocol
3. UI Transport Protocol
Traditional web applications are built with HTML pages. This page-based programming model lacks true user interactivity, since HTML was designed explicitly for presenting passive documents and therefore lacks many useful user interface controls. In addition, the de-facto web UI protocol, a form submission protocol46 ("application/x-www-form-urlencoded"), is very primitive. The request consists of a list of URL-encoded name / value string pairs, and the response is always a full page of markup. As a result, in HTML-based server-side applications, web pages are always reloaded after each user click. This is very annoying to end users, since it takes time to reload a web page even if only a very small portion of the user interface needs to be updated. It also wastes network resources, since a full web page is always downloaded after each user interaction.
Many RIA technologies have emerged to improve user interactivity. The client-side approaches do not need any standardized protocol for delivering UI, as the UI is now programmed at the client side. Typically some form of RPC is used by the client-side UI logic code to communicate with the server-side application logic; however, this communication has to be handled manually by the developer. Since these rich client technologies (including AJAX) all resort to executing large amounts of UI logic code in the browser, client-side code execution security and IP protection become major issues. On the other hand, the server-side RIA approaches rely on the same form submission protocol as traditional HTML applications, and therefore suffer from the same set of limitations.
Hence, in this chapter we propose an alternative UI transport protocol, the Extensible User Interface Protocol (XUP), an XML-based protocol for communicating events and user interface changes on the web. It lays a foundation for a framework that would enable the development and consumption of highly interactive web applications.
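To illustrate how little structure the form submission protocol carries, the short Python sketch below encodes and decodes a request body using the standard library. The field names and values are hypothetical; the point is only that every value, including the number, reaches the server as a plain string in a flat list of name / value pairs.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical form fields as a browser might collect them.
form_fields = {"name": "Ada Lovelace", "quantity": 3, "subscribe": "on"}

# Body of an "application/x-www-form-urlencoded" request.
body = urlencode(form_fields)

# What the server-side code gets back: strings only, no event or type information.
decoded = parse_qs(body)
```

Nothing in the decoded data says which control was activated or what kind of user action occurred, which is the information a richer event protocol needs to convey.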
With XUP, rich UI events are delivered from client to server as XML messages, rather than URL-encoded name / value strings. Programmers implement event handlers on the server side. They no longer need to process form data as URL-encoded values. User interface changes are delivered from server to client as incremental updates, so end users will no longer experience slow page refreshes, and network bandwidth is conserved.
The remainder of the chapter is organized as follows. The next section describes the concept of user interface model. After that we discuss the detailed design of the protocol. Finally, we illustrate two simple examples to further clarify the concept.
46 http://www.w3.org/TR/html401/interact/forms.html
3.1 User Interface Model
A user interface model (UI model) is a representation of the user interface which the end user perceives and interacts with. Common UI models include .NET Windows Forms, Java Swing, XUL, XAML, UIML, etc. Some UI models have XML-based representations, such as XUL and UIML. We call them declarative UI modeling languages (or just UI languages). That is, the UI model is declared, not programmed.
A UI model typically consists of a tree of UI controls (e.g., buttons, panels), a set of events (e.g., mouse click), and a list of resources (e.g., button images, files to be downloaded or uploaded). Therefore, a declarative UI model can be described by a tree of XML elements, with UI controls mapping to XML elements and the properties (e.g., color, size) of the controls mapping to XML attributes47. One major advantage of XML-based UI languages is that they can be edited with existing XML authoring tools. It is also easy to develop UI design tools to further simplify XML-based UI model creation.
Listing 3.1 shows an example UI model in SUL48. The root of the UI model is a window; within it there is a panel which contains two push buttons.

    <window id='w1' position="width:200;height:100" xmlns="http://www.openxup.org/2004/02/sul">
      <panel id='p1' border='single'...>
        ......
        ......

Listing 3.1 UI model example (in SUL)

47 Within the context of XML, we will use the terms UI elements and UI attributes to refer to UI controls and their properties.
48 http://www.openxup.org/TR/sul.pdf

XHTML and WML [WAP01] can also be regarded as UI modeling languages, although they are more suitable for presenting textual information than rich user interfaces.
XUP is a protocol for delivering events and user interface updates. Specifically, its payload includes rich UI events caused by user actions and instructions to manipulate the UI model. With extensibility in mind, XUP has been designed to work with any UI model that has an XML-based representation. In fact, it is independent of the actual UI model, and it places no restriction on the UI control set, or on the properties or events associated with each control.
An event model is a representation of events triggered by user interactions with the UI model. The event model defines how events are fired, what types of events are fired, and from which UI controls the events are fired in the UI model. The event model is often part of or closely associated with the UI model. XUP supports both delegation (e.g., Java Swing) [Sun97] and capturing/bubbling (e.g., DOM) [Pix00] event models, as long as they have XML-based representations.
In a delegation-based event model, an event can be sent to and processed by any interested party. To receive the event, one simply registers with the UI control that will be firing the event. This type of event model is also called event subscription/notification. In a capturing/bubbling based event model, an event propagates up and down the UI tree, starting from the source UI control that fired the event. Any UI control along the propagation path may receive and process the event.
The delegation-based event model is often used in desktop UI frameworks, such as Java Swing and .NET Windows Forms, whereas the capturing/bubbling based event model is typically used in document-based applications, such as HTML/XML DOM. To accommodate a wide range of application scenarios, XUP has been designed to work with both types of event models.
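The difference between the two event models can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not code from Swing, the DOM, or XUP: the `Control` class and its methods are hypothetical, the control id `b1` is invented for the example, and only the bubbling (upward) half of capturing/bubbling is modeled.

```python
class Control:
    """Hypothetical UI control supporting both event dispatch styles."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.listeners = []         # delegation: subscribers registered on this control
        self.bubble_handler = None  # bubbling: optional handler on the propagation path

    def add_listener(self, listener):
        self.listeners.append(listener)

    def fire_delegated(self, event):
        # Only parties that registered with the source control are notified.
        for listener in self.listeners:
            listener(self.name, event)

    def fire_bubbling(self, event):
        # The event travels from the source control up through its ancestors.
        node = self
        while node is not None:
            if node.bubble_handler is not None:
                node.bubble_handler(node.name, event)
            node = node.parent

log = []
window = Control("w1")
panel = Control("p1", parent=window)
button = Control("b1", parent=panel)

button.add_listener(lambda source, event: log.append(("delegated", source, event)))
button.fire_delegated("click")

panel.bubble_handler = lambda at, event: log.append(("bubbled", at, event))
window.bubble_handler = lambda at, event: log.append(("bubbled", at, event))
button.fire_bubbling("click")
```

In the delegated case only the registered listener sees the click, whereas in the bubbling case the panel and the window each get a chance to process it.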
In addition, it does not dictate any event details, such as what types of events may occur on which UI controls, or the syntax and semantics of any particular event.
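The mapping of UI controls to XML elements and of control properties to XML attributes, described earlier in this section, can be sketched with Python's standard XML parser. The `window` and `panel` elements and the namespace follow Listing 3.1; the `button` element name and its `text` attribute are assumptions made here for illustration, since the listing elides the panel's contents.

```python
import xml.etree.ElementTree as ET

# SUL-like fragment. The <button> elements and their 'text' attribute are
# hypothetical; only <window> and <panel> appear in Listing 3.1.
document = """
<window id='w1' position='width:200;height:100'
        xmlns='http://www.openxup.org/2004/02/sul'>
  <panel id='p1' border='single'>
    <button id='b1' text='OK'/>
    <button id='b2' text='Cancel'/>
  </panel>
</window>
"""

def describe(element, depth=0):
    """Flatten the UI tree into (depth, control name, properties) tuples."""
    tag = element.tag.split('}', 1)[-1]  # drop the '{namespace}' prefix
    rows = [(depth, tag, dict(element.attrib))]
    for child in element:
        rows.extend(describe(child, depth + 1))
    return rows

tree = describe(ET.fromstring(document.strip()))
```

Each tuple corresponds to one UI control, with its nesting depth reflecting the containment hierarchy and its dictionary holding the control's properties.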
3.2 Protocol Design
To avoid client-side code execution security risks and IP protection issues, the protocol must allow all application-specific code (i.e., UI logic code) to reside on the server side. The client-side software should only render declarative UI definition code (i.e., markup). This leads to the following design goals:
• Facilitating server-side event processing: protocol requests should contain rich UI events encoded in XML. In addition, to minimize network event traffic, the protocol needs to support an event selection/filtering mechanism. To provide a responsive UI experience, it must support the asynchronous delivery of UI events.
• Delivering UI updates: protocol responses should contain instructions to update the client-side UI model. In addition, to enable a fluid user interface, the protocol needs to support the incremental delivery of UI updates.
In the following we provide an overview of the various elements of XUP and the interactions among them.
3.2.1 Definitions
Before describing protocol operations, we need to make the following definitions:
An XUP client is a program that sends event requests and processes server responses on behalf of the end user. The client renders user interfaces and interacts with the end user.
An XUP server is a program that processes event requests from the client and sends back UI updates in responses. It processes UI events and generates UI updates by executing application-supplied code.
An XUP application is application-supplied code that resides in the server and provides the desired functionality to the end user through the client. The application uses an API provided by the server to process events and manipulate the UI.
An event handler is part of the XUP application. The server invokes the application by calling registered event handlers upon receiving an event request from the client.
An event selector defines event selection criteria in the client. To minimize network traffic, not all client-side UI events are delivered to the server; only those matched by event selectors are sent.
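The role of event selectors can be sketched as a simple client-side filter. The Python code below is a hypothetical illustration: the `Event` and `EventSelector` classes, their fields, and the exact-match rule are assumptions made for this sketch and do not reflect the actual XUP selector syntax.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    source_id: str   # id of the UI control that fired the event
    event_type: str  # e.g. "click" or "mousemove"

@dataclass(frozen=True)
class EventSelector:
    """Hypothetical selection criterion: one control id and one event type."""
    source_id: str
    event_type: str

    def matches(self, event):
        return (self.source_id == event.source_id
                and self.event_type == event.event_type)

def events_to_send(events, selectors):
    """Keep only the events matched by at least one selector."""
    return [e for e in events if any(s.matches(e) for s in selectors)]

selectors = [EventSelector("b1", "click")]
fired = [Event("b1", "mousemove"), Event("b1", "click"), Event("b2", "click")]
outgoing = events_to_send(fired, selectors)
```

Here the high-frequency mouse movement and the click on an unselected control never leave the client, so only one of the three events generates network traffic.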
3.2.2 Protocol Operations
An XUP client communicates with an XUP server by sending events encapsulated in XUP requests. The server then handles the events and returns UI updates (and event selectors) in XUP responses.
Figure 3.1 Protocol message exchange
An end user interacts with an XUP application by operating a client which sends events to the XUP server hosting the XUP application. XUP applications are developed by programmers to provide the desired application functionality for end users. An XUP server may host one or more XUP applications, which may be running concurrently.
Similarly, an end user may use a client to interact with more than one application from one or more servers at the same time.
In this section, we describe protocol message exchanges and the interactions among different components as a result of the message exchanges (Figure 3.1). First, the end user starts by establishing a session with an XUP application. Second, the end user interacts with the application, which causes events and UI updates to be exchanged between the client and the server. Finally, the end user terminates the session with the application.
3.2.2.1 Startup
The client starts an application by sending a request to the URL identifying the server. This request includes the name of the application and a list of UI model namespaces supported by the client. The server responds with a session ID used to identify the client in subsequent requests, the initial UI model, and an initial list of event selectors. The server maintains a user session for each client, identified by the session ID [49].
The namespace of the initial UI model must be in the list of UI model namespaces supported by the client. The purpose of the UI model namespace negotiation is to allow the client and the server to agree upon a common UI language. Both the client and the server may support more than one UI language (e.g., the client may support SUL and XAML, while the server may support UIML and SUL), but they must agree on one of them in order to execute the application. The client interprets this startup UI model and renders its initial UI accordingly. The event selectors will be used by the client to deliver events in subsequent requests.
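The namespace negotiation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the OpenXUP implementation: the function name `negotiate_ui_namespace` and the namespace URIs are hypothetical, and the rule of picking the server's first preference that the client also supports is an assumption (the protocol only requires that the chosen namespace be in the client's list).

```python
from typing import Optional, Sequence


def negotiate_ui_namespace(client_supported: Sequence[str],
                           server_supported: Sequence[str]) -> Optional[str]:
    """Return a UI model namespace that both sides support, or None.

    Assumption: the server walks its own list in preference order and picks
    the first entry that the client also listed in its startup request.
    """
    client_set = set(client_supported)
    for namespace in server_supported:
        if namespace in client_set:
            return namespace
    return None


# Hypothetical namespace URIs, used only for illustration.
SUL = "urn:example:sul"
XAML = "urn:example:xaml"
UIML = "urn:example:uiml"

# The client supports SUL and XAML; the server supports UIML and SUL.
chosen = negotiate_ui_namespace(client_supported=[SUL, XAML],
                                server_supported=[UIML, SUL])
print(chosen)
```

If no common namespace exists, the function returns None; how a server would report that condition to the client is not covered by this sketch.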
[49] The protocol does not specify how the server or the application should maintain user sessions, nor the contents of user sessions.
3.2.2.2 User Interactions
After the startup phase, the end user interacts with the application by manipulating the UI model, which triggers events to be sent to the server. An event is sent to the server only when it matches an event selector in the client. Together with the event, a list of UI data is also sent. These UI data reflect changes made by the end user to the UI model, such as text entered in a text box or a check box being unchecked.
Upon receiving an event request from the client, the server locates the XUP application and invokes the event handlers registered by the application for the event. The UI data sent with the event are also available to the event handlers. Event handlers execute the necessary UI logic code and generate UI model updates (and selector updates, if any) to be returned to the client. If an error occurs, a SOAP [50] fault is returned in place of the UI model updates.
The client processes the event response by rendering the UI changes and updating its list of event selectors. The UI model changes include both UI element [51] level and attribute [52] level changes, and the UI elements may be simple UI controls or complex UI containers. This approach allows applications to perform both macro and micro UI updates, sparing end users the visual discomfort of the frequent page refreshes found in HTML-based applications. As shown in Table 3.1, XUP supports fine-grained updates to the UI model. For example, <xup:addUIElement> may be used to add a list item to a list box, and <xup:updateUIAttr> may be used to update the background color of a button.
Table 3.1 XUP protocol elements for incremental UI updates
<xup:addUIElement>       Add a UI control under a specified parent control
<xup:updateUIElement>    Update the content of the specified UI control
<xup:removeUIElement>    Remove the specified UI control
<xup:moveUIElement>      Reposition the specified UI control under its parent
<xup:updateUIAttr>       Update the specified UI property of the UI control
[50] We chose SOAP/HTTP as the default transport binding for XUP, since it is well understood and its implementations on different platforms are widely available. Another good transport binding is REST.
[51] We use the terms "UI element" and "UI control" interchangeably in this context, since a UI control is represented by an XML element.
[52] Similarly, we also use the terms "UI attribute" and "UI property" interchangeably.
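To make the incremental updates of Table 3.1 concrete, the following sketch shows how a client might apply them to an in-memory UI tree. It is an illustration only: the dictionary-based encoding of each update, the helper names, and the sample UI (including the "Oatmeal" and "Sugar" items) are assumptions made for this example and do not reflect the actual XUP wire format. The <xup:updateUIElement> operation is omitted for brevity.

```python
import xml.etree.ElementTree as ET


def find_by_id(root, element_id):
    """Return the element whose 'id' attribute equals element_id."""
    for element in root.iter():
        if element.get("id") == element_id:
            return element
    raise KeyError(element_id)


def find_parent(root, child):
    """Return the parent of child (ElementTree keeps no parent pointers)."""
    for parent in root.iter():
        if child in list(parent):
            return parent
    raise KeyError("element has no parent")


def apply_update(root, op):
    """Apply one update; 'op' is a hypothetical dict encoding of an XUP element."""
    kind = op["op"]
    if kind == "addUIElement":
        parent = find_by_id(root, op["parent"])
        # Insert at the requested position, or append when none is given.
        parent.insert(op.get("position", len(parent)), ET.fromstring(op["xml"]))
    elif kind == "removeUIElement":
        target = find_by_id(root, op["id"])
        find_parent(root, target).remove(target)
    elif kind == "moveUIElement":
        target = find_by_id(root, op["id"])
        parent = find_parent(root, target)
        parent.remove(target)
        parent.insert(op["position"], target)
    elif kind == "updateUIAttr":
        find_by_id(root, op["id"]).set(op["attr"], op["value"])
    else:
        raise ValueError(f"unsupported update: {kind}")


# Hypothetical client-side UI model.
ui = ET.fromstring(
    '<window id="win1">'
    '<listbox id="list1">'
    '<listitem id="item1" label="Oatmeal"/>'
    '<listitem id="item2" label="Sugar"/>'
    '</listbox>'
    '<button id="btn1" label="Add to list"/>'
    '</window>'
)

updates = [
    {"op": "addUIElement", "parent": "list1", "position": 1,
     "xml": '<listitem id="item3" label="Chocolate Chip"/>'},
    {"op": "updateUIAttr", "id": "btn1", "attr": "background", "value": "blue"},
]
for update in updates:
    apply_update(ui, update)

labels = [item.get("label") for item in find_by_id(ui, "list1")]
print(labels)
```

Only the affected list item and button attribute change; the rest of the tree is left untouched, which is the property that distinguishes this style of update from a whole-page refresh.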
3.2.2.3 Session Termination
There are multiple ways for an end user to exit an XUP application. First, XUP defines an optional shutdown message for the client to terminate the session with an XUP application. Alternatively, the client may simply close the transport connection and expect the server or application to purge the user session after some time interval. In addition, a particular event on a UI control may be interpreted by the application as a termination notice. For example, a click event on a button labeled "quit" may be used by the application to terminate the user session.
3.2.3 Network Event Delivery
In XUP, events are triggered by user interactions on the UI model in the client. Once an event is fired, the client sends a request encapsulating the full event detail and its associated UI data to the server for further processing. The server sends back UI changes and selector updates after handling the event.
The client maintains a list of event selectors, which are updated by the server through XUP responses. An event is sent to the server when a selector matches the event in the following manner:
• In delegation event models
  o The selector's event type matches the event's type, and
  o The element to which the selector is attached is the same as the element at which the event is targeted. In this model, the element is the source of the event.
• In capturing/bubbling event models
  o The selector's event type matches the event's type, and
  o The element to which the selector is attached is the same as the element at which the event is targeted, and
  o If the selector specifies a source element, it must match the source element of the event, and
  o The selector's phase attribute (i.e., capturing or bubbling phase) must match that of the event, or both attributes must be unspecified.
Together with the event, a list of UI data is sent. These data reflect UI changes made by the user, such as text entered in a text box or a check box being unchecked. The client will send the event without any UI data if the user did not change the UI model. Note that an event will not be sent if there are no selectors matching the event. If multiple event selectors match an event, the client will send the event to the server only once.
After the server receives the event, it calls one or more event handlers registered by the XUP application. The event handlers process the event and perform any necessary computation for the application. For capturing/bubbling event models, the application may also indicate in the response whether to stop the propagation of the event in the client. By default, events propagate in the client according to their types (e.g., some types of events bubble, others do not), as defined in the event model.
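The selector matching rules above can be expressed compactly in code. The sketch below is illustrative only: the `Selector` and `Event` classes and their field names are hypothetical stand-ins for the protocol's XML structures, chosen to mirror the wording of the rules.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Selector:
    event_type: str
    element_id: str                    # element the selector is attached to
    source_id: Optional[str] = None    # capturing/bubbling models only
    phase: Optional[str] = None        # "capturing", "bubbling", or unspecified


@dataclass(frozen=True)
class Event:
    event_type: str
    target_id: str                     # element at which the event is targeted
    source_id: Optional[str] = None
    phase: Optional[str] = None


def matches_delegation(selector: Selector, event: Event) -> bool:
    """Delegation model: event type and target element must both match."""
    return (selector.event_type == event.event_type
            and selector.element_id == event.target_id)


def matches_capture_bubble(selector: Selector, event: Event) -> bool:
    """Capturing/bubbling model: adds the source and phase conditions."""
    if not matches_delegation(selector, event):
        return False
    if selector.source_id is not None and selector.source_id != event.source_id:
        return False
    # Equal phases, which also covers "both unspecified" (None == None).
    return selector.phase == event.phase


def should_send(selectors: Iterable[Selector], event: Event,
                matcher: Callable[[Selector, Event], bool]) -> bool:
    """The event is sent at most once, however many selectors match it."""
    return any(matcher(selector, event) for selector in selectors)


click = Event("click", target_id="btn1")
selectors = [Selector("click", "btn1"), Selector("click", "btn1")]
print(should_send(selectors, click, matches_delegation))
```

Because `should_send` returns a single boolean, two selectors matching the same event still result in one request, consistent with the rule that the client sends a matched event only once.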
3.2.3.1 Asynchronous Events
To support a more responsive user interface, the client should not always block user interactions while sending events (and waiting for server responses). Therefore, XUP allows events to be delivered either synchronously or asynchronously. During a synchronous network event delivery, the client will prevent the end user from interacting with the UI model until the event response is received and processed. On the other hand, during an asynchronous event delivery, the user may continue interacting with the UI model while the client is sending the previous event (and waiting for the server response). To denote the mode of event delivery, XUP provides the "async" attribute on the <selector> element. When set to true, it instructs the client to send the matching events in an asynchronous fashion.
3.2.3.2 Minimize Event Traffic
It may appear that XUP is a fairly verbose protocol, since UI events are sent over the network and there are many types of UI events. However, XUP has been designed to minimize the amount of network event traffic. First, to conserve network bandwidth, the client does not send certain types of high-frequency events, such as mouse movement. Second, XUP's event selector mechanism allows an application to select the client-side UI events it is interested in, based on their event types. For example, an application may want to receive mouse events but not keyboard events from a button. The client will only send the events selected by the application. Finally, to filter events of the same type, XUP supports the notion of an event mask, which further refines the event selection criteria. For example, for keyboard events on a tree node, an event mask may specify that an event should be sent only if the "delete" key or the "return" key is pressed, so no other keystrokes will cause the client to send the event to the server.
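The effect of an event mask can be illustrated with a short sketch. The representation used here is an assumption made for the example: the mask is modeled as a set of key names on a hypothetical `KeySelector` class, whereas the actual mask syntax is defined by the protocol specification.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class KeySelector:
    event_type: str
    element_id: str
    # Hypothetical mask: the set of keys that may trigger delivery.
    # None means no mask, so any key of the selected event type passes.
    key_mask: Optional[FrozenSet[str]] = None


def passes(selector: KeySelector, event_type: str, element_id: str, key: str) -> bool:
    """Return True if the keystroke should be sent to the server."""
    if selector.event_type != event_type or selector.element_id != element_id:
        return False
    return selector.key_mask is None or key in selector.key_mask


# Keyboard events on a tree node, restricted to "delete" and "return".
selector = KeySelector("keydown", "node1", frozenset({"delete", "return"}))
keystrokes = ["a", "delete", "b", "return"]
sent = [key for key in keystrokes if passes(selector, "keydown", "node1", key)]
print(sent)
```

Of the four keystrokes in the example, only two would produce network requests; the other two are handled (or ignored) locally by the client.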
3.2.4 Server-side Notification
Due to the nature of the HTTP/SOAP request/response model, it is not easy for the server to send asynchronous notifications to the client. Such a facility is needed for reporting server-side events and the status of long-running server-side operations.
To simulate server-side notification, XUP provides status messages between the client and the server. The client may send periodic status requests (i.e., polling messages) to the server, and the server may then send asynchronous notifications (in the form of UI model updates) in status responses. For example, the server may send back a dialog box alerting the user of a disk failure. Of course, the frequency of the status messages (and whether to use status messages at all) is up to the XUP application. Table 3.2 shows the XUP protocol elements that allow applications to manage the way the client sends status requests.
Table 3.2 Managing XUP status requests
<xup:startStatusMonitor>     Instruct the client to start sending periodic status requests according to the specified time interval
<xup:updateStatusMonitor>    Instruct the client to change the frequency of the status requests
<xup:stopStatusMonitor>      Instruct the client to stop sending periodic status requests
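The polling behavior controlled by the elements in Table 3.2 can be sketched as a small client-side state machine. This is a hypothetical illustration: the `StatusMonitor` class, its method names, and the use of seconds as the interval unit are assumptions, and time is passed in explicitly so that the example runs without real delays.

```python
from typing import Optional


class StatusMonitor:
    """Client-side polling state, driven by instructions from the server."""

    def __init__(self) -> None:
        self.interval: Optional[float] = None   # None means polling is off
        self.next_due: Optional[float] = None

    def handle(self, instruction: str, now: float,
               interval: Optional[float] = None) -> None:
        """Apply a start/update/stop instruction received at time 'now'."""
        if instruction == "startStatusMonitor":
            self.interval = interval
            self.next_due = now + interval
        elif instruction == "updateStatusMonitor":
            if self.interval is not None:       # ignored when not polling
                self.interval = interval
                self.next_due = now + interval
        elif instruction == "stopStatusMonitor":
            self.interval = None
            self.next_due = None
        else:
            raise ValueError(instruction)

    def poll_due(self, now: float) -> bool:
        """Return True (and schedule the next poll) if a status request is due."""
        if self.next_due is None or now < self.next_due:
            return False
        self.next_due = now + self.interval
        return True


monitor = StatusMonitor()
monitor.handle("startStatusMonitor", now=0.0, interval=5.0)
# Simulated clock: check once per second for 20 seconds.
sent_at = [t for t in range(0, 21) if monitor.poll_due(float(t))]
print(sent_at)
```

In a real client, each time `poll_due` returns True the client would issue a status request, and any UI model updates in the status response would be rendered like those of an ordinary event response.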
3.3 Examples
This section illustrates how XUP works by showing two examples. Both of them are very simple, but much richer user interfaces can be built with the same techniques. The examples use XUL as the UI modeling language, with a simple delegation-based event model (namespace URI: "http://www.openxup.org/event/delegation"). The UI controls in the examples all have XML IDs, which are used by XUP to uniquely identify the UI elements and their attributes. Note that the examples can be replicated in other UI models and event models in a similar fashion.
3.3.1 Example 1
Figure 3.2 Before button click
In Example 1, the user clicks on "Button1" (Figure 3.2). The XUP request in Listing 3.2 describes the button click event. The XUP <event> element specifies a client-side UI event. Note that this event matches the selector [53] "btn1Click", which is specified in the "selector" attribute of <event>.
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope"
    xmlns:xup="http://www.openxup.org/2004/02/xup">
<soap:Header>
Listing 3.2 XUP request for Example 1
[53] The selector "btn1Click" selects the mouse click event from the button. The content of the selector would have been sent in an earlier XUP response, which is not shown in this example.
In the server response, the XUP application instructs the client to update the text of the label element with ID "lbl1". This is achieved by using the XUP protocol element <xup:updateUIAttr>, as in Listing 3.3. The content of the element specifies the value of the attribute "xul:value" of the label element "lbl1". Figure 3.3 shows the result after the client processes the response and updates the text of "lbl1" to "Button1 clicked."
Figure 3.4 Before clicking "Add to list" button
In Example 2, the user types "Chocolate Chip" into the text box with ID "text1" and then clicks the button "Add to list" (Figure 3.4). The XUP request in Listing 3.4 describes the button click event. Note that this request includes a piece of UI data: the content of the text box "text1", which is the string "Chocolate Chip". This is achieved by using the XUP protocol element <xup:updateUIAttr>, similar to the protocol response in the previous example [54]. The content of <xup:updateUIAttr> specifies the value of the text box "text1".
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope"
    xmlns:xup="http://www.openxup.org/2004/02/xup"
    xmlns:xul="http://www.mozilla.org/keymaster/gatekeeper/there.is.only.xul">
<soap:Header>
<xup:session soap:mustUnderstand="1" application="sample2" sid="3876"/>
<soap:Body>
<xup:event type="ev:click" selector="addBtnClick"
    xmlns:ev="http://www.openxup.org/event/delegation">
<detail>
<ev:mouseClick button="0" clientX="18" clientY="6"/>
Chocolate Chip
Listing 3.4 XUP request for Example 2 (button click)
[54] The <xup:updateUIAttr> element can be used in both XUP requests and responses. In a request, it specifies the UI data entered by the end user; in a response, it specifies the incremental UI update resulting from programmatic manipulation of the UI model on the server side.
The XUP application processes the event and creates a list item based on the string "Chocolate Chip", which is the value of the text box "text1". It then inserts the list item before the second item in the list box, which has the XML ID "list1". This is achieved by the XUP element <xup:addUIElement>, as shown in Listing 3.5. The XUP element <xup:addUIElement> describes a single UI element to be added to the client UI. In our example, the UI element is a simple list item, but it may contain nested children in more complex scenarios. In addition, the application also clears the text box "text1", so that the user may input further text. This is accomplished via an empty <xup:updateUIAttr>, which essentially clears out the value of the "xul:value" attribute of the UI element "text1". The resulting UI is shown in Figure 3.5.
Figure 3.5 After clicking "Add to list" button
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope"
    xmlns:xup="http://www.openxup.org/2004/02/xup"
    xmlns:xul="http://www.mozilla.org/keymaster/gatekeeper/there.is.only.xul">
<soap:Body>
<xup:eventResponse>
Listing 3.5 XUP response for Example 2 (after button click)
In the XUP response in Listing 3.5, the application also adds an event selector "listSelector1" to the list box "list1". This means that all list item selection events will be sent to the server. (By default, no list selection event is sent to the server, since an XUP client will not send any event unless there is a matching selector.)
Figure 3.6 Selecting list item "Chocolate Chip"
In Figure 3.6, the user clicks the list item "Chocolate Chip", which triggers the list selection event to be sent to the server. Listing 3.6 describes the XUP event request. The XUP element <xup:updateUIAttr> specifies the value of the "xul:selectionIndex" attribute of the list box "list1". This value ("1") is the index of the selected list item ("Chocolate Chip").
Listing 3.6 XUP request for Example 2 (selecting list item)
The above examples are intentionally simple, since code listings for more complicated UI interaction scenarios would be rather lengthy. Even so, these simple examples show the advantage of XUP over the traditional form submission protocol used by classic HTML-based web applications.
3.4 Summary
In this chapter we have presented an alternative web UI protocol, aimed at the development of web applications with highly interactive user interfaces. The proposed protocol, XUP, has the following features:
• To facilitate server-side UI programming, it delivers rich UI events from the client to the server in XML format.
• To minimize network traffic, it supports the notion of event selector for event selection and filtering.
• It transports UI update instructions from the server to the client, thus allowing the server-side application to directly control the UI perceived by the end user.
• To provide end users with a fluid UI, it supports incremental UI updates, thereby avoiding frequent whole page/screen refreshes.
• It supports asynchronous event delivery to enable a responsive UI experience.
• It provides status messages to support server-side notifications.
As a final note, we provide a comparison between XUP and the traditional form / page-based protocol in classic HTML-based applications:
Table 3.3 Protocol comparison
                     form-urlencoded (on top of HTTP)    XUP (on top of SOAP/HTTP)
Request              Form submission                     UI event
Response             Whole page                          Incremental UI updates
UI sophistication    Form / page-based UI                Fluid, desktop-like UI
Table 3.3 shows the major differences. The traditional form-based protocol only carries application-defined name / value string pairs; it has no notion of UI event, which is essential for highly interactive user interfaces. In addition, each protocol response includes a full HTML page, which causes frequent and annoying page refreshes. XUP, on the other hand, carries rich UI events and delivers incremental UI updates. Therefore, XUP can enable fluid, desktop-like user interfaces. Further details about the protocol can be found in the protocol specifications [YC02] [Yu06a].
4. Developing and Executing RIAs
The web infrastructure offers unprecedented opportunities for quickly deploying applications on a large scale, while maintaining them on the server in a centralized way. The trend in enterprise application development is the use of web browsers to provide access to complex and feature-rich applications (e.g., CRM, ERP) used in production environments. Ideally, web applications should have the same level of productivity and usability as traditional desktop applications. By productivity, we mean how efficiently end users can perform their job functions. High productivity requires the applications to provide highly usable user interfaces.
As previously mentioned, the traditional desktop-based model offers rich and powerful user interface controls, but it requires client-side software administration. The web page based model has no client software administration cost, as it has a universal client (the web browser), but it lacks the sophisticated user interface controls required by highly interactive web applications. While many client-side RIA technologies provide a richer UI, they all suffer from client-side code execution security risks and IP protection issues. Therefore, our goal is to provide a user interface programming model that combines the advantages of both the desktop and web page based models; that is, to offer the benefit of zero client administration cost as well as better, richer, and more powerful user interactivity.
To satisfy the requirements set out in Chapter 1 and overcome the limitations of existing approaches, we propose an alternative framework for web UI development, called OpenXUP. The framework is built on top of XUP and provides a UI programming model that combines the advantages of both web page-based applications and desktop applications.
For developers, the framework provides a programming environment that closely resembles that of desktop applications. That is, rich user interface controls, such as those found in Windows Forms and Java Swing, can now be used in building web applications. At the same time, OpenXUP maintains one of the most important benefits of web applications, zero client-side administration, since all application code resides on the server side. Consequently, OpenXUP has no IP protection issues.
For end users, the framework aims at bringing web applications closer to their desktop counterparts. That is, web applications built with OpenXUP will appear identical to desktop applications. In addition, OpenXUP's asynchronous event delivery and incremental UI update mechanisms enable fast application responses without annoying page refreshes. Finally, the framework provides a safe client environment, without the common security risks arising from client-side code execution.
OpenXUP offers a set of server-side event-driven APIs, which enable developers to implement sophisticated application-specific UI behaviors. The OpenXUP APIs are designed to be familiar, closely resembling the APIs of desktop GUI toolkits. In addition, since all application code resides on the server side, web applications become easier to debug and maintain, without the need to worry about issues arising from distributed computing. As for the backend infrastructure, web applications built with OpenXUP can leverage all existing backend components (EJB, CORBA, COM, etc.), since OpenXUP's server side is designed to run within existing web application servers.
The outline of this chapter is as follows. In the next section we describe an example scenario to illustrate the requirements of rich web user interfaces. We then present the detailed design of the runtime framework, focusing on the server side (the client side will be discussed in the next chapter). After that we describe the development environment and show how the example application can be developed. Finally, we discuss the implementation of the framework, including the server-side runtime environment and the associated development tools.
4.1 Example Scenario
To illustrate the requirements of a rich web UI environment, we show an example application, XCat, which is a hosted multi-buyer, multi-seller e-commerce solution. XCat is typically hosted by an ASP (application service provider) or a marketplace operator and used by consumers (buyers) and retailers (sellers). XCat has two interfaces that deal with consumers and retailers, respectively. The retailer part of XCat includes functionalities such as electronic catalog editing, order processing, and logistics.
In this example we focus on the electronic catalog editing functionality, which allows multiple users within a retailer's organization to concurrently and efficiently edit their product information stored in server-side databases. It also has a workflow-based preview and approval process, which allows product managers to perform quality assurance on the accuracy of the product information.
Since each retailer may have thousands of products, the editing, preview, and approval user interfaces must be very efficient. Without high usability, the productivity of data entry specialists and product managers may suffer, and many of the retailer's products may not get into the electronic catalog on time. Hence, XCat should provide a rich and efficient UI experience analogous to that of desktop applications.
Figure 4.1 Example application: XCat
Figure 4.1 shows the desired user interface for XCat. In this screen, the left-hand side is a tree interface, containing a hierarchy of product categories; this allows efficient catalog navigation and selection. The right-hand side is a tabbed panel, with two tabs showing the list of attributes and the list of products under the currently selected category. The products tab (not shown) also has scrolling and pagination controls, allowing thousands of products to be displayed in an incremental fashion.
To allow maximum efficiency and usability, the end user should be able to select a category by clicking on the tree node, using the arrow keys, or typing the tree node's label while the tree control has the focus. Similarly, to expand a category, the end user may click on the "+" icon, double-click on the category name, or press the right arrow key. In addition, the product information editing interface should support full keyboard navigation. The end user should be able to go to a particular input field with a keyboard shortcut (without using the mouse), and traverse input fields via the tab key. This is the behavior that we would expect from a well-designed desktop application.
4.2 Runtime Framework
In this section we illustrate the design of the runtime framework by providing an overview of the various elements of the framework and the interactions among them. Before discussing the details of our framework, we need to provide some additional background information.
We begin by reiterating the concepts of user interface model (UI model), user interface language (UI language), UI definition, and UI logic. As mentioned in the last chapter, a UI model is a representation of the user interface which the end user perceives and interacts with. UI models may have XML-based representations. We call these representations UI languages. Common UI languages include XUL, XAML, UIML, and SUL.
A UI model typically consists of a tree of UI controls (e.g., buttons, panels), a set of events (e.g., mouse click), and a list of resources (e.g., button images, files to be downloaded or uploaded). To handle user interactions, developers process events emitted from the UI tree and manipulate the UI model to update the user interface. The code that handles events and manipulates the UI model is called UI logic, whereas the code that renders the UI model to end users is called UI definition.
UI definition is a description of the user interface, specifically, the UI controls and their properties. UI definition is typically coded in declarative languages such as XUL and XHTML. In this case, UI controls can be described by a tree of XML elements, with the UI controls mapping to XML elements and their properties (e.g., color, size) mapping to XML attributes. On the other hand, UI logic contains code that handles UI events and communicates with application logic code. UI logic is typically coded in a programming language.
Consistent with the protocol, OpenXUP is also independent of the actual UI language. It places no restriction on the UI control set, or the properties or events associated with each control. That is, OpenXUP is designed to work with any UI model that has an XML-based representation.
4.2.1 OpenXUP Architecture
In order to illustrate how OpenXUP works, we first need to describe the different components that are part of the framework. Since OpenXUP is built on top of XUP, the framework components expand from the basic protocol components discussed in the last chapter. The framework's components and the overall architecture are depicted in Figure 4.2.
Figure 4.2 OpenXUP framework architecture
4.2.1.1 OpenXUP Client
The OpenXUP client has two main tasks: one is to manage the interaction with the end user (display UI controls and capture UI events) and the other is to communicate with the server (transform UI events to XML messages and then send them to the server; receive UI updates in XML messages from the server and then render them). These tasks are performed by the view manager and the XML serializer, respectively.
The view manager is responsible for managing the native desktop user interface. First, it renders UI controls using the appropriate native desktop GUI toolkit. Second, it captures native UI events caused by end user actions, and then delivers them to the server through the XML serializer. The client-side XML serializer is responsible for XML serialization. First, it parses and generates XUP protocol messages. Second, it turns a UI control to and from its XML representation. For example, let us assume that a push button UI control has an object representation SButton and an XML representation