Digital Radio Broadcasting and Java
Chapter 1 Introduction
1.1 What is DAB ?
1.2 A Software Platform for Downloading Applications using the DAB System
1.3 Examples of DAB Applications
1.4 A Brief Description of the Contents of this Thesis
Chapter 2 Mobile Code
2.1 A Brief Introduction on Communication Mechanisms
2.2 Distributed Computing Paradigms
    Client/Server Architecture
    3-tier Architecture
    Remote Evaluation & Execution
    Code on Demand
    Mobile Agent
2.3 Classification of Mobile Code Mechanisms
2.4 Programming Languages Concepts for Mobile Code
    Language Level
    Abstract Machine Level
    Library Level
2.5 Security & Safety of Mobile Code Technologies
Chapter 3 Technologies
Mobile Agent
3.9 AGLETS
3.10 Telescript
Scripting Languages
3.11 Tcl/Tk
3.12 Internet Scripting: JavaScript, Jscript, VBScript
3.13 Comparisons & Comments
3.14 A Technology for the DAB System
Chapter 4 Java
4.1 Java Architecture
4.2 Virtual Machine
4.3 The Class File Format
4.4 Loader: How Code Born !
4.5 Garbage Collector
4.6 Java Language: A Deep Sight
    Exceptions
    Concurrency
    Serialization
    RMI
    Java Family: JRE, SDK, Personal Java, Embedded Java, Java Card
4.7 Java API
4.8 Web Integration: Applet
4.9 Native Interface
4.10 Does only one Java exist ? Microsoft Way: DCOM/Java, J/Direct, RNI
4.11 Java Extension Framework: How to Expand the Java API
4.12 Java vs. Other technologies
4.13 Java Implementation: Window Platform
4.14 java.net.*: a Case of Study
4.15 Future Development: a Faster and Secure Architecture
Chapter 5 Download of DAB Applications using Java
5.1 DAB Architecture
5.2 Concepts for Application Download in DAB
5.3 Request of an Extension of the Eureka 147 Specification
5.4 Proposal for a Security Model: Java/DAB Security
Chapter 6 Simulation
6.1 Overview of the Simulation
6.2 Integration of the Java Virtual Machine: MS COM
6.3 Software Implementation: the Main Java Component
Chapter 7 J-DAB Package: A Complete Java Solution
7.1 JDAB: Java DAB API
7.2 JDAB: A Conceptual Framework
Conclusion
References
Chapter 1 Introduction
“The quest to know the unknown and see the unseen is inherent in human nature. It is this restlessness that has propelled mankind to ever higher pinnacles and ever deeper depths. This insatiable desire led to the discovery of light as being electromagnetic, paving the way to the discovery of the radio.” (Saleh Faruque, Cellular Mobile Systems Engineering)
Nowadays the technologies of three major industries (entertainment, computing and telecommunication) are converging. In the new era of digital communication, many different systems are being integrated. In recent years Internet applications and services have been the leading technologies for new advanced digital communication infrastructures, and many fields related to telecommunication are now considering integration with the "mother of all networks", the Internet.
DAB, Digital Audio Broadcasting, is the advanced successor of FM broadcast radio systems. This new system gives listeners interference-free reception of high quality sound, easy-to-use radios, reliable mobile reception and, above all, the potential of a pure digital channel able to carry all kinds of data (MPEG audio, video, images, text and code). Digital data opens up a wide area of broadcast applications because it is the base language of the computing environment, and it can link the world of computer and telecom services (Internet, database services, multimedia applications, telephony, GSM) with the world of broadcast radio and video communication (GPS, Video On Demand). Moreover, broadcast radio systems are well suited to, and more flexible for, large-scale data distribution and general data services such as weather forecasts, traffic information and news. Other broadcast systems for digital radio transmission are also available on the market today, namely DSR (Digital Satellite Radio) and ADR (Astra Digital Radio); as we will see, they are neither as flexible nor as smart as DAB.
[Figure 1: Converging fields in DAB System. Diagram labels: Video on Demand, Entertainment, Music, Broadcast, Multimedia, MPEG-2 Encoding, Database, Distributed Systems, Telecommunication, GPS, Single Frequency Network, GSM, Computing, Internet, Mobile Code, Virtual Machine.]
1.1 What is DAB ?
Digital Audio Broadcasting (DAB) is the most fundamental advance in radio technology since the introduction of FM stereo radio. In the new era of digital communication, it is another step toward the global integration of all networks. Broadcast radio systems, until now relegated to a peripheral role in multimedia and digital data communication applications, are now ready to compete and to integrate with other distributed multimedia computing technologies.
♣♣♣
We can summarize the main capabilities of traditional radio broadcast systems in the following list:
* unidirectional
* scalable
* ubiquitous
* low capacity data channel (FM-RDS)
Unidirectionality and scalability are the main characteristics of broadcast systems: signals are broadcast by a single source and received by a large number of users, and the amount of traffic (occupied bandwidth) is independent of the number of users. These aspects make radio suitable for low-cost, large-scale data broadcasting. Ubiquity derives from the physical medium used to transmit radio signals: wherever a wire is unavailable or unnecessary, and wherever mobility is a major constraint, radio systems are the best solution (boats, cars, small cabins, historical buildings, mobile stations, PDAs, Walkmans, etc.).
Besides these positive aspects, traditional AM and FM radio systems have some major disadvantages: noise interference, multipath propagation in mobile reception, inter-frequency interference, low audio quality (compared with CD or DAT), a small number of extra services (FM-RDS), etc.
What are the new features of the DAB system, and what kind of services can be developed? The DAB transmission signal (see chapter 5 for details) carries a multiplex of several digital services simultaneously; it uses advanced digital audio compression algorithms to achieve a spectrum efficiency equivalent to or higher than that of conventional FM radio. Its overall bandwidth is 1.536 MHz, providing a useful bit-rate capacity of approximately 1.5 Mbit/s in a complete "ensemble". Each service is independently error protected with a coding overhead ranging from about 33% to 300% (200% for sound), the amount of which depends on the requirements of the broadcasters (coverage, reception quality, number of services). The ensemble contains audio programmes, data related to audio programmes, and independent data services. Usually the receiver will decode several of these services in parallel.
A specific part of the multiplex contains information on how the multiplex is actually configured, so that the receiver can decode the signal correctly. The DAB system is the best gateway for moving digital data towards mobile devices, and it is suitable for integration with other communication architectures (Internet, GSM, satellite networks).
There are other digital systems for broadcasting information: DSR (Digital Satellite Radio) and ADR (Astra Digital Radio) are two services provided over satellite channels. Both provide CD-quality radio (with no compression or with some kind of compression mechanism) using a large bandwidth, but both perform poorly on mobile devices and neither is suitable for local radio services.
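The coding overhead figures quoted above (about 33% to 300%, 200% for sound) can be turned into code rates with simple arithmetic. The sketch below assumes that "overhead" means redundant bits expressed as a percentage of the useful bits; the class name, the method names and the 64 kbit/s example rate are illustrative choices, not taken from the DAB specification.

```java
// Hypothetical helper: not part of any DAB specification or library.
final class CodingOverhead {
    private CodingOverhead() {}

    /** Code rate = useful bits / transmitted bits, with the overhead given in percent of the useful bits. */
    static double codeRate(double overheadPercent) {
        return 1.0 / (1.0 + overheadPercent / 100.0);
    }

    /** Channel bit rate needed to carry netKbps of useful data with the given overhead. */
    static double grossKbps(double netKbps, double overheadPercent) {
        return netKbps * (1.0 + overheadPercent / 100.0);
    }

    public static void main(String[] args) {
        double[] overheads = {33.0, 200.0, 300.0};
        for (double overhead : overheads) {
            // 64 kbit/s is an arbitrary example rate chosen for this sketch.
            System.out.printf("overhead %.0f%%: code rate %.2f, 64 kbit/s of data occupies %.0f kbit/s%n",
                    overhead, codeRate(overhead), grossKbps(64.0, overhead));
        }
    }
}
```

Under this reading, the weakest protection (33%) corresponds to a code rate of roughly 3/4 and the strongest (300%) to 1/4, which shows how a broadcaster trades the number of services against coverage and reception quality.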
DAB, in brief, is characterized by the following attributes:
* a small quantity of information that interests a large number of users
* access time independent of the number of users (scalable)
* designed for mobile reception (cars, wireless devices such as radios and Walkmans)
* one system standard for the whole European market
* unidirectional, scalable data delivery
* low cost of the medium
* reduced electromagnetic emissions (electrosmog)
Some attributes are shared with traditional broadcast systems, but DAB adds new ones. High quality audio and design for mobile reception are among the fundamental innovations of the DAB system: special coding techniques optimise reception on mobile devices (car radio, boat radio), and special compression techniques permit the transmission of CD-quality audio. DAB is also a cheaper way to transmit radio signals than FM, because its transmitters require less power than traditional high-frequency transmitters. For the same reason, electromagnetic field emissions (electrosmog) are reduced.
DAB has been under development since 1981 at the Institut für Rundfunktechnik (IRT), with contributions from broadcasters, research institutes, network providers/operators and consumer electronics firms. The DAB system is a European standard (ETS 300 401, February 1995) adopted by the European Telecommunications Standards Institute (ETSI). Services or pilot projects are running in many countries, including Australia, Belgium, Canada, Finland, France, Germany, Italy, Norway, Sweden and Switzerland [8a].
[Figure 2: DAB in Europe (see [8b]). Map legend: DAB active services; DAB experimental services; interested countries; information not available.]
1.2 A Software Platform for Downloading Applications using the DAB System
At present two DAB applications have been developed: Broadcast Web Site and Slide Show. Both use static data (HTML pages and GIF images) to implement new DAB services. Both content providers (BBC, Swedish Radio) and receiver manufacturers (Sony, Mitsubishi, Bosch) need more flexible DAB applications. Dynamic content is the basis of application download: downloading a complete application lets users receive dynamic services and lets providers offer a large number of modern services (in competition with the Internet or mobile phones). DAB opens new opportunities for service providers and new facilities for users. This sophisticated platform permits the introduction of new data services on the radio channel. Telematics, the integration of communication and computer science, finds in DAB a fertile field for creating new services.
The search for new software solutions for the DAB system is the leading topic of this thesis: virtual machine architecture, dynamic applications, distributed computing, mobile code, software components and Internet services are new terms in the context of broadcast services. The Internet is a big laboratory for new solutions in networking, software engineering and multimedia; for that reason we will refer continuously to Internet technologies and applications. For example, virtual machines are important components of a modern portable computing environment (such as the Internet) because they provide an architecture-independent representation of executable code. Their performance is critical to the success of such environments and to the development of general purpose services, but they are difficult to design because they are subject to conflicting goals. On the one hand, they hide differences between hardware devices and give programmers a standardized software layer; on the other hand, they must be implemented efficiently on a variety of different machines.
Since radio receivers are becoming increasingly complex and microprocessors cheaper and cheaper, we think that in the near future receivers will become more like computers, and computers will incorporate receivers [9]. In recent months some of the leading software companies have introduced complete operating systems for the embedded devices market (car radios, set-top boxes, handheld computers): Windows CE from Microsoft and Java OS from Sun, but also Linux RTOS and Inferno from Lucent, are only a few examples of the strong interest in this direction.
Before building a distributed computing platform, we have to investigate the possible approaches and the right technologies to implement them; we then have to identify the basic requirements of our specific platform and develop appropriate solutions.
1.3 Examples of DAB Applications
Digital information opens up a wide range of new applications not available in traditional FM/AM broadcast systems. Digital data also enables integration with other kinds of communication networks: computer networks, GSM networks, home networks, etc. Both traditional services (like audio programmes) and modern ones (like downloaded applications) can coexist in DAB. We list a number of possible applications.
High Quality Audio. The advanced digital compression techniques (MPEG 1 and 2) used in the DAB system allow the broadcast of high quality radio services. A flexible audio bit rate (from 32 kbit/s to 384 kbit/s) allows the DAB multiplex to carry anything from 20 restricted-quality mono programmes to 6 high-quality stereo programmes. DAB radio services are also identified not by frequency but by additional information such as the name and the type of programme: DAB receivers are smarter than ordinary FM receivers, and users can interact with the services in a modern way. For example, some broadcasters transmit extra data along with the radio programme identifying what sort of programme it is (Classical Music or News), so users can search for all DAB radio services that match given requirements. DAB also provides extra features to support pay radio or data services: broadcasters might offer special services (audio or data) available only on payment, to which users would subscribe on an individual or long-term basis.
Dynamic Label. DAB provides a flexible way to associate and synchronize a data channel with audio programmes. This additional data has a variable capacity, from a minimum of 667 bit/s up to 65 kbit/s, and can be used to implement various new services: dynamic label messages, karaoke-like transmission of lyrics, multilingual information.
Travel/Traffic Information. All the information needed for travelling can be retrieved from a DAB service. Using text, images, maps and applications, we can give car drivers every kind of information (from hotel prices to traffic conditions) needed for a good and safe trip. The DAB system is a perfect medium for communicating this kind of information.
Broadcast Web Site. Once we have built a new communication medium, we can develop a Web browser application that receives HTML pages, Java DABApplets and images as if over a normal network link. DAB is a high bit-rate downlink medium for providing a selected number of Internet services: traffic information, weather forecasts, sport news, emergency messages, entertainment, etc.
Commercial Applications and Data Download. For a large company that has to distribute common information (price reports, software patches, internal communications) to a large number of branches or customers, DAB offers low-cost and secure services. For example, a car company may have to inform thousands of car dealers across a large area such as Europe about new price lists or new offers; each dealer can then use a feedback channel (Internet, phone, mail) to order specific items. A software company can offer software updates to its customers. Realizing this kind of closed-user-group service requires means of controlling access to potentially confidential data: conditional access systems and cryptography.
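The closed-user-group idea in the last paragraph can be illustrated with a short sketch: the payload is broadcast to everyone, but only receivers holding the group key can read it. The example below uses the standard Java cryptography classes (`javax.crypto`) with AES in GCM mode; this is a present-day choice made for the sketch, not the conditional access mechanism of the DAB specification or of this thesis. The class `ClosedGroupChannel` and its methods are hypothetical, and key distribution to subscribers is deliberately left out.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.Optional;
import javax.crypto.AEADBadTagException;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical sketch of a broadcast payload readable only by key holders.
final class ClosedGroupChannel {
    private static final int IV_BYTES = 12;   // nonce length commonly used with GCM
    private static final int TAG_BITS = 128;  // authentication tag length
    private static final SecureRandom RANDOM = new SecureRandom();

    private ClosedGroupChannel() {}

    /** Creates a fresh AES key to be shared by the members of one user group. */
    static SecretKey newGroupKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Broadcaster side: returns the random nonce followed by ciphertext and tag. */
    static byte[] seal(SecretKey groupKey, byte[] payload) {
        try {
            byte[] iv = new byte[IV_BYTES];
            RANDOM.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, groupKey, new GCMParameterSpec(TAG_BITS, iv));
            byte[] sealed = cipher.doFinal(payload);
            byte[] packet = Arrays.copyOf(iv, IV_BYTES + sealed.length);
            System.arraycopy(sealed, 0, packet, IV_BYTES, sealed.length);
            return packet;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Receiver side: returns the payload, or empty if the key does not match or the packet was altered. */
    static Optional<byte[]> open(SecretKey key, byte[] packet) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, packet, 0, IV_BYTES));
            return Optional.of(cipher.doFinal(packet, IV_BYTES, packet.length - IV_BYTES));
        } catch (AEADBadTagException e) {
            return Optional.empty();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Every receiver in range gets the same packet; a dealer holding the group key recovers the price list, while a receiver with any other key gets an empty result. How the key reaches the subscribers (the conditional access part) is a separate problem that this sketch does not address.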
[Figure 3: Example of DAB/Internet integration. Train/flight timetable information arrives on the DAB downlink path; the user's request ("I want to reserve a ticket ..") travels on the uplink path through the Internet; the reply is "Booked !!".]
Entertainment. The combination of sound transmission and application content can be used to create a new form of broadcast radio entertainment. One example is the Slide Show: the broadcaster transmits additional data that may include comprehensive information about the piece of music being played, such as the song title, composer, singer, album name, album picture, video frames and so on.
Brokering Information & Stock Quote Systems. Information about sport results or stock prices can be delivered using DAB and presented by a Java application. A brokering agency can deliver information to users through a ubiquitous medium like DAB.
Polling & Surveying. Agencies that collect information can create a polling service on the DAB channel, asking general questions and receiving feedback from users by means of the GSM mail service, Internet mail, etc.
Multimedia Home Network. The DAB receiver will be just one node in a home multimedia network, together with hi-fi systems, digital TVs, digital cameras, set-top boxes, etc.; Java would be the glue for all these different platforms.
[Figure 4: Another example of a possible DAB application. A question ("What do you think about the new Italian parliament ?") is broadcast over DAB, and users reply through GSM or the Internet.]
Web/DAB Integration. Once we have created a software bridge between Java and DAB services, the computer can act as a hardware bridge to the Internet: we can think of a global network in which we can move data and optimize data exchanges. We can use the characteristics of the DAB channel to reduce traffic congestion on the Internet for certain kinds of information.
Games. Application download also makes it possible to distribute games. Users can receive games on their DAB car radio: the runtime environment, with its graphical output, provides all the facilities needed to create such services.
1.4 A Brief Description of the Contents of this Thesis
The goal of this work is to give a theoretical and practical introduction to the problems of distributed programming environments in general, and to state the basic requirements for an application download service using the DAB system in particular. Understanding the Java platform is another central topic of this thesis: up to now Java has been a platform for network applications, and we analyze in detail its potential for a DAB system. Finally, we have applied some of the Java mechanisms, both to investigate them in detail and to simulate a simple application download service (Applets), in order to observe the behavior of some Internet-like services (Code on Demand, teledatacommunication) in a simulated DAB system. All the information used in this thesis comes from specification papers, academic articles (see the references for a detailed list of documents), and freely available documentation on the Web [8]. Here is a brief description of the chapters of this thesis.
Chapter 1, Introduction. This chapter introduces the reader to the new aspects and future developments of DAB systems. We point out the new kinds of services available and the need to build a complete computing environment to meet the future challenges of the DAB service. We also focus on the differences and similarities between the DAB system and the Internet world. At the end we give a brief overview of possible future applications using DAB.
Chapter 2, Mobile Code. This chapter is a theoretical introduction to the paradigms of distributed computing: the material is a collection of general concepts that provides a working framework for understanding the problems of distributed computing. The goal of this chapter is to create a background and to collect a series of theoretical tools for analyzing the contents of the following chapters. We point out some issues related to distributed computing paradigms, security mechanisms, abstract machines and programming language principles.
Chapter 3, Technologies. This chapter is a brief description of the main distributed computing technologies now available. The chapter is not exhaustive, but we have tried to mention the majority of the products available both on the market and from the academic world. Inferno from Lucent, Java from Sun, ActiveX/DCOM from Microsoft and Obliq from DEC are only a few examples of complete programming environments for distributed applications. At the end of the chapter we compare all these technologies and try to choose the right platform for application download in DAB. These first chapters are the basis for the development platform of this project: the Java technology.
Chapter 4, Java. Java is the winning technology for developing an application download service in the DAB system. We describe the Java technology, moving from the abstract machine implementation to the API library. We also describe some of the main characteristics of this development platform: abstract machine architecture, programming language features, security mechanisms, Java API, native interface, Applet model, etc.
Chapter 5, Download of DAB Applications using Java. This chapter explains in detail the principles and services of the DAB architecture. We introduce some security concepts related to application download using Java in the DAB system, and the need for some extensions to the DAB architecture to make the system more flexible for Java integration.
Chapter 6, Simulation. In this project we have integrated the Java VM inside a C++ application, trying to understand the inner mechanisms of this integration. This chapter describes in detail the simulation and the software components used: we have integrated the MS Java VM inside an MFC project that simulates the DAB Navigator.
Chapter 7, J-DAB Package: A Complete Java Solution. In this chapter we sketch the basic steps towards a complete Java solution for the DAB system: a conceptual framework for future development using the Java platform. The contents are also the basis for a discussion of the possibility of using Java as a DAB platform.
Chapter 2 Mobile Code
“Independence of the design paradigm from the underlying technology is a major point in general software engineering practice” (A. Carzaniga, G.P. Picco, G. Vigna, “Designing Distributed Applications with Mobile Code Paradigms”)
Telecommunication networks are evolving at a fast pace, and this evolution proceeds along several lines. The size of networks is increasing rapidly, and this phenomenon is not confined to the Internet, whose tremendous growth rate is well known, but extends to many fields of application related to advanced telecommunication systems: from mobile phones to PDAs, from satellite systems to mobile computers. The DAB system is a new way to transfer digital information over radio channels using advanced techniques. It is a new opportunity that opens up to researchers a wide area for developing new services for the radio market.
Before starting our investigation into the possibility of implementing application download on the DAB channel, we need to recall some theoretical and technical concepts of distributed computing systems. The Internet world has been an experimental platform for many new distributed technologies: we want to use all this experience to start developing new solutions for radio services using the DAB system.
♣♣♣
In a heterogeneous computing environment we need new approaches to develop efficient applications, and we cannot use the classical paradigms or technologies designed for stand-alone machines: that is why we begin this thesis by analyzing the new features of the distributed computing environment and the new methods of mobile code technology. A global network links together many different systems, so we need new ways of exchanging information: solutions to the new problems come both from the academic world and from industrial research centers. We start with a brief general introduction and a classification of mobile code architectures, to obtain some theoretical tools for comparing the different implementations now available on the market; a more complete survey is available in [1], [2]. The expression "mobile code" is used with various meanings in the technical literature; for our purposes we define mobile code "as software that travels on a heterogeneous network, crossing protection domains and is automatically executed upon arrival at the destination" [1]. By protection domains we mean both wide area networks and small embedded systems (DAB radio receivers, Personal Digital Assistants, Smart Cards, etc.); the only requirement is that all nodes are linked via a network medium that can carry digital information.
♣♣♣
"Code mobility" is not a recent discovery: we already have some examples. (For an introduction to mobile code see reference [1].)
* PostScript: a client sends the printer a series of commands describing the pages to be printed; the printer has a complete special-purpose computing environment to evaluate these "programs".
* SQL scripts: an efficient way to place the database processing load on the server machine where the data resides; the language is tailored to database queries, updates and data manipulation.
* Web browser applications: to give HTML pages dynamic behavior, many proprietary solutions have introduced mechanisms for transmitting executable content over the network (the well-known Java Applet mechanism of Sun Microsystems, ActiveX controls of Microsoft, embedded multimedia content in the Netscape browser); the code is automatically fetched from network sites and executed on the client machine, where it uses all or part of the resources (memory, microprocessor, hard disks, etc.).
* Software distribution and installation: this application has produced many proprietary solutions, such as Inferno from Lucent Technologies, a mobile-code-enabled network operating system for media providers and telecommunication systems; managing software installation is a typical application where code mobility can give a better solution than the old strategies, thanks to its particular features: scalability, customizability, etc.
These are only a few examples of the possible applications of mobile code, but as you can see they all try to satisfy specific constraints in order to build systems that are efficient in terms of scalability, network traffic reduction, security, minimal overhead and performance. Mobile code technology is not a new approach, but until now it was relegated to very specific applications; these new models are starting to be investigated in depth, and several university centers have begun to formalize the new paradigms. In order to evaluate the mobile code technologies available on the market or under development, a set of criteria and theoretical tools needs to be established. The following are important issues for a full understanding of the complex and dynamic world of mobile code applications.
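As a toy illustration of the PostScript-style example in the list above, the sketch below treats a small program as data: the sender ships a postfix expression as plain text, and the receiver evaluates it on arrival with its own special-purpose interpreter. The class `PostfixReceiver`, its tiny instruction set (`add`, `sub`, `mul`) and the program text are all invented for this sketch; real PostScript is far richer.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical receiver-side interpreter: the "mobile code" is the program string.
final class PostfixReceiver {
    private PostfixReceiver() {}

    /** Evaluates a whitespace-separated postfix program such as "3 4 add 2 mul". */
    static long execute(String program) {
        Deque<Long> stack = new ArrayDeque<>();
        for (String token : program.trim().split("\\s+")) {
            switch (token) {
                case "add": {
                    long right = stack.pop();
                    long left = stack.pop();
                    stack.push(left + right);
                    break;
                }
                case "sub": {
                    long right = stack.pop();
                    long left = stack.pop();
                    stack.push(left - right);
                    break;
                }
                case "mul": {
                    long right = stack.pop();
                    long left = stack.pop();
                    stack.push(left * right);
                    break;
                }
                default:
                    // Anything that is not an operator must be an integer literal.
                    stack.push(Long.parseLong(token));
            }
        }
        if (stack.size() != 1) {
            throw new IllegalArgumentException("program must leave exactly one value on the stack");
        }
        return stack.pop();
    }
}
```

The sender needs no knowledge of the receiver's hardware, only of the small language the interpreter accepts, which is the same separation that an abstract machine provides for full mobile code systems.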
2.1 A Brief Introduction on Communication Mechanisms
Without entering into a detailed description, it is useful to introduce some mechanisms used to implement communication tasks: these are basic concepts used in the rest of this work during the analysis of mobile code technologies. The goal of this chapter is to give a basic theoretical understanding of distributed architectures and a set of concepts for comparing different mechanisms. Since we are going to discuss the principles of distributed computing architectures, we first introduce some of the techniques used to exchange information. From an abstract point of view we can distinguish different communication mechanisms, summarized in the following scheme:
Communication mechanisms
    Point-to-Point
        Messages
        RPC
        Stream
    Point-to-Multipoint
        Events
        Group Communication
        Shared Memory
        Tuple Spaces
        Multicasting/Broadcasting
Basically we can distinguish the number of objects that interact and the mechanism of the interaction. For the first, we have components communicating inside the same machine (processes, threads inside a process) or remotely (different machines on the network); the second is the group of mechanisms used to exchange information between these components: anonymous notifier events, packets exchanged over connection-oriented stream channels (sockets), messages posted on a specific site, etc.
Point-to-Point. This refers to communication between two Execution Units, either locally or remotely. The simplest and most primitive mechanism is message passing, used in the client/server paradigm (HTTP, X Window); Remote Procedure Call is more advanced but still based on client/server mechanisms (UNIX, Windows); streams are channels used to transfer data continuously between two entities (pipes in UNIX, TCP socket connections).
Point-to-Multipoint. Shared memory is the mechanism most frequently used to achieve multipoint intraprocess communication inside a single operating system. Event-based mechanisms define an event bus, the logical channel through which events are dispatched: a supplier-consumer architecture makes it possible to send events and to subscribe to the events we are interested in (for example, the Java event implementation). In this case we distinguish a set of specific types of information (events) and use them to communicate anonymously: the goal is to know what the message is (type of service), not who sent it. With the tuple space technique, each execution unit communicates either by inserting a tuple containing the information to be communicated into a shared tuple space, or by searching the space for a tuple using some form of pattern matching.
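The tuple space technique described above can be sketched in a few lines. This is a minimal, in-memory illustration under stated assumptions: the class `TupleSpace`, the method names `put` and `take`, and the convention that a `null` field in a template acts as a wildcard are choices made for this sketch, not an existing library API.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Optional;

// Hypothetical in-memory tuple space shared by several execution units.
final class TupleSpace {
    private final List<Object[]> tuples = new ArrayList<>();

    /** Inserts a tuple into the shared space. */
    synchronized void put(Object... tuple) {
        tuples.add(tuple.clone());
    }

    /** Removes and returns the first tuple matching the template; null fields match anything. */
    synchronized Optional<Object[]> take(Object... template) {
        Iterator<Object[]> iterator = tuples.iterator();
        while (iterator.hasNext()) {
            Object[] candidate = iterator.next();
            if (matches(template, candidate)) {
                iterator.remove();
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    /** Pattern matching: same length, and every non-null template field equals the tuple field. */
    private static boolean matches(Object[] template, Object[] tuple) {
        if (template.length != tuple.length) {
            return false;
        }
        for (int i = 0; i < template.length; i++) {
            if (template[i] != null && !template[i].equals(tuple[i])) {
                return false;
            }
        }
        return true;
    }
}
```

A producer might call `space.put("weather", "Oslo", "snow")` and a consumer `space.take("weather", null, null)`: neither needs to know the other, because the only thing they share is the space and the shape of the tuple.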
Among the point-to-multipoint communication mechanisms, the notion of group communication is essential for the development of cooperative software in both distributed and autonomous systems. These concepts can be important in a broadcast environment, in which group communication is intrinsic to the topology of the architecture.
2.2 Distributed Computing Paradigms
In traditional applications the computing environment was closed inside a single machine (single-task OSes). As computer and network capabilities grew, the next step was to develop communication mechanisms to interface applications on the same machine (multitask OSes) or on other connected machines (network and distributed OSes). Some examples come from UNIX systems (pipes, signals, sockets), from Windows systems (event mechanism, Winsock, queues, shared files or variables) or from the Inferno OS (chain mechanism). With these techniques all applications were built on the simple Client/Server or Peer-to-Peer paradigm, in which a single entity (a single-threaded process or a thread in a multithreaded application) calls specific services on another entity (located locally on the same machine, or remotely on another machine in the network); in general the exchanged information was only data formatted according to specific rules (variables, constants, data files, etc.). This mechanism determines the basic behavior of the majority of classical distributed applications: FTP, HTTP, RPC. Nowadays some object-oriented mechanisms have been added to the Client/Server paradigm (CORBA, RMI), but essentially the paradigm remains the same. The positive feature of the Client/Server mechanism is its simplicity, especially in small and homogeneous networks. But in a growing world where a large number of heterogeneous machines are linked together, this is not enough to create efficient, scalable, secure and customizable applications. The number of interactions between clients and servers, the delays introduced in WANs and the lack of flexibility in a non-homogeneous network are a few examples of the problems of the client/server approach.
Today new distributed paradigms have been introduced to implement modern distributed systems and to increase performance in a complex and heterogeneous environment: a conceptual framework is needed to understand the mobile code scenario. We consider some basic components of a distributed computing environment. We distinguish three architectural concepts: components, interactions and sites. A complete treatment of the subject can be found in [2], [3] and [5].
Components are the constituents of a software architecture and are divided into code, resource and computational components. The first encapsulates the know-how to perform a particular task (in traditional systems this component always ran on the single machine where the code was stored); the second are elements used during the computation (shared files or variables, hard disk, video resources from a software point of view); the third are active executors capable of carrying out a computation (OSes, interpreters, virtual machines). Interactions are the data exchanged to communicate between two or more components (for example a message between two processes in a UNIX system, an event thrown by a window application, a message between a client and a server machine through the network in an X Window system). Sites are the physical locations of these components on a specific network. We also distinguish the services, that is, the results of the computation.
Figure 5: Traditional non-distributed system: all components reside in a single machine (diagram labels: Interaction; Computation Component (CPU, memory, VM); Code Component; Resources Component; Service)
By using these abstractions we can introduce the main design paradigms used for implementing distributed applications (in the pictures of this chapter we use computer-like entities for simplicity, but these systems can refer to any possible network architecture). We distinguish:
* Client/Server architecture
* 3-tier architecture (N-tier architecture)
* Remote Evaluation
* Code on Demand
* Mobile Agents
To analyze these different paradigms we should consider some parameters that point out the main advantages of a particular approach. In brief, these parameters could be:
* Number of interactions (generated network traffic)
* CPU costs in terms of calculation
* Security
* Performance & Simplicity of the implementation
* Scalability
* Latency Optimization
* Push and Pull capabilities
Client/Server Architecture. This is the classical implementation of traditional distributed applications. The mechanisms used to exchange information are very simple, and only data is exchanged between two machines connected by a network. All management and security mechanisms are centered in a high-performance server machine that provides an "a priori fixed set of services accessible through a statically defined interface" [2]; the server can communicate with a number of low-capability clients (mainframe architecture) or with a group of medium-powered machines (LAN architecture). This architecture has run into many scalability problems, especially now that network applications are spreading from LANs to WANs in a heterogeneous environment and the number of users is growing. A few examples of this kind of implementation are X Window systems, the POP, FTP and HTTP protocols, Remote Procedure Call, etc. Newer approaches to distributed computing such as CORBA, DCOM and RMI try to combine this architectural paradigm (RPC) with object-oriented programming, in which clients and servers are "object entities" distributed in the network. There is a big difference between these computing models and the mobile code paradigms: in the mobile code paradigm we explicitly model the concept of a separate execution environment and how computations are moved between these environments, while in object-oriented distributed applications the computational environment is an abstraction layer that hides the whole network system.
Figure 6: Client/Server (pure) Architecture (diagram labels: Client; Server with Code and Computation Environment (CPU, memory, VM); message; data; Service. Examples: RPC, HTTP, FTP, X Window)
In C/S systems the generated traffic is large, and both the latency of the medium and the number of interactions can be a problem: clients and servers exchange a large number of messages for coordination, control and the execution of specific tasks. That is why this architecture is in general used only for small networks with high bandwidth (LANs). Many Internet applications use this paradigm for its simplicity, but the information overhead it produces and the lower reliability of large networks decrease the performance of the system. The security policy is centralized on the server: the gain in security is paid for with a loss of flexibility (static set of controlled services) and a decrease in performance (high overhead).
3-tier Architecture. This architecture is the natural evolution of the C/S model. To balance the computing load between different computers with the same medium-to-high capabilities, especially in commercial web-based applications, a multi-layer chained layout has been introduced; each stage deals with a particular aspect of the distributed application. For example, we can separate the presentation tasks from the processing tasks and from data access: in this way we reduce the amount of information exchanged in the network and increase the performance of the entire system. The layering can be done both at the hardware level (different machines implement different tasks) and at the software level (inside the same machine we can have many processes that communicate with each other).
Remote Evaluation & Execution. A client machine has all the know-how (code) and sends it to a server machine to be run: in this case we transfer code through the network link. The result of the computation may come back to the client (evaluation, for example an SQL query) or not (execution, for example a PostScript transmission to a printer or a remote shell script in UNIX). This solution applies well to those cases in which the exchanged information or the complexity of the computation would otherwise saturate the bandwidth of the links. The server offers a service that is programmable with a complete computation language, and the client can customize its own service. In this case the server receives and interprets commands: it is the first basic step toward sending code through the network.
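The idea of a server that "receives and interprets commands" can be sketched with a toy interpreter. The example below is only an illustration under stated assumptions: RemoteEvalServer, its postfix mini-language (add, mul, data) and the serverData array are invented for the example, and the network transfer is replaced by a plain method call. The postfix notation is chosen because PostScript, mentioned above, is also a stack language.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical "server" that evaluates postfix programs shipped by a client. */
class RemoteEvalServer {
    /** Resource that exists only on the server side (assumed for the example). */
    private final long[] serverData = {4, 8, 15, 16, 23, 42};

    /** Interprets the client's program and returns the single value left on the stack. */
    long evaluate(String program) {
        Deque<Long> stack = new ArrayDeque<>();
        for (String token : program.trim().split("\\s+")) {
            switch (token) {
                case "add":
                    stack.push(stack.pop() + stack.pop());
                    break;
                case "mul":
                    stack.push(stack.pop() * stack.pop());
                    break;
                case "data":
                    // Pops an index and pushes the server-side value stored there.
                    stack.push(serverData[stack.pop().intValue()]);
                    break;
                default:
                    stack.push(Long.parseLong(token));
            }
        }
        if (stack.size() != 1) {
            throw new IllegalArgumentException("malformed program");
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        RemoteEvalServer server = new RemoteEvalServer();
        // The client ships one small program instead of asking for each value separately.
        System.out.println(server.evaluate("0 data 5 data add"));
    }
}
```

The client customizes the service by writing a different program; only the program and the final result cross the link, while the data it reads stays on the server.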
Code on Demand. Many upcoming Internet applications are based on this paradigm. In this case a client already has all the components necessary to execute some service, but it does not have the know-how; the provider machine gives the necessary know-how "on demand", that is, in reply to a request from the client machine (in general this is automatically controlled by applications); the client machine must be able to download, link and run the code automatically. Each client has a complete computing engine; for the distributed code to be portable, all machines that interact with the server must have the same computational capabilities: for that reason, to implement this kind of paradigm we need a complete virtual machine (a machine with the same computational characteristics in spite of different platforms). Among the many existing proposals, the best known is the Java Applet technology of Sun Microsystems; other technologies are ActiveX from Microsoft, Obliq from DEC Lab, etc. In all of them the code is fetched from a source in the network and, once it is on the host, it is executed.
Figure 9: Code on Demand (diagram labels: Client with Computation Environment (CPU, memory, VM) and Service; Server; message; code. Examples: ActiveX/IE4.0, Applet, Web services)
Mobile Agents. This is the most advanced distributed computing paradigm, and today many research centers and universities are trying to design more efficient frameworks for implementing mobile agent applications. Mobile Agents are computer programs which may migrate from one computer to another on the network [10, p. 3]. They are said to be intelligent because they can perform autonomous actions in the network depending on the external conditions: on each host the agent application can exchange information (through events, local variables, etc.) and can choose how to continue its travel. The advantages of this kind of approach are clear: host interactions are reduced to a minimum, flexibility and scalability are high, and new solutions become possible for unsolved problems (for example in network management); the disadvantage is the need for a complex computing environment. The executing process moves among the nodes of the network autonomously. The intelligent agent seems to be the best way toward intelligent networks and seems to resolve problems related to efficiency, customization and management: "mobile agent" has become a huge industry buzzword, especially in business applications! A complete process (a running application) can be moved from one site to another with its execution data (variables) or completely with its execution state (stack image, register values). This approach is more complex than the others and needs a complete framework that manages the migration.
Figure 10: Mobile Agent Architecture (diagram labels: three Computation Environments (CPU, memory, VM), each with a Service; code moving between them. Examples: Aglet, Obliq)
Let me explain an advantage of this technology with a simple example: if a client application needs to communicate with a server machine through a high number of interactions, a mobile agent can perform the same service while reducing the number of interactions on the network: it migrates from the client machine to the server machine, executes all the interactions locally inside the server in a fast way, and then comes back to the client with the result of the service.
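The saving described in this example can be made concrete with a deliberately simplified cost model. The class InteractionCost and its methods are hypothetical, and the model assumes that every network message costs the same one-way latency; it ignores the size of the agent, which in practice makes a migration more expensive than an ordinary message.

```java
/** Counts network messages for a number of request/reply interactions (simplified cost model). */
class InteractionCost {
    /** Pure client/server: every interaction is one request plus one reply on the network. */
    static int clientServerMessages(int interactions) {
        return 2 * interactions;
    }

    /** Mobile agent: one migration to the server and one trip back; the interactions are local. */
    static int mobileAgentMessages(int interactions) {
        return interactions == 0 ? 0 : 2;
    }

    /** Time spent on the network, assuming every message costs the same one-way latency. */
    static long networkTimeMillis(int messages, long oneWayLatencyMillis) {
        return messages * oneWayLatencyMillis;
    }

    public static void main(String[] args) {
        int interactions = 100;
        long latency = 50; // milliseconds, an arbitrary value for the example
        System.out.println("client/server: "
                + networkTimeMillis(clientServerMessages(interactions), latency) + " ms");
        System.out.println("mobile agent:  "
                + networkTimeMillis(mobileAgentMessages(interactions), latency) + " ms");
    }
}
```

With 100 interactions and a latency of 50 ms, the client/server exchange spends 10000 ms on the network against 100 ms for the agent; the gap narrows as the agent grows larger or the interactions become fewer.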
2.3 Classification of Mobile Code Mechanisms
At this point we need some reference abstractions to classify the different mechanisms that allow applications to move code and state across a network: the reader interested in a more detailed analysis can refer to [2], [3], [5] and [6]. Traditional Network Computing Systems use a software layer to offer the programmer a homogeneous programming environment: programmers do not need to care about the presence of an underlying network. CORBA and DCOM are a few modern examples of this kind of implementation. Technologies supporting code mobility take a different perspective: the structure of the underlying computer network is not hidden from the programmer, but rather made manifest. In the first case the layer presented to the programmer is called a True Distributed System, while in the second it is called a Computational Environment (CE) (see Figure 11).
Figure 11: Traditional System Architecture vs. Mobile Code Architecture (diagram labels: Network Apps Components; True Distributed System; CE; NOS; COS; Hardware; Host. Legend: COS = Core Operating System, NOS = Network Operating System, CE = Computational Environment)
Let us go further. In the mobile code architecture we divide the components hosted by the CE into Execution Units (EUs) and resources. Execution Units represent sequential flows of computation. Typical examples of EUs are single-threaded processes or individual threads of a multi-threaded process. Resources represent entities that can be shared among multiple EUs, such as a file in the file system, an object shared by threads in a multi-threaded object-oriented language, or an operating system variable. In a conventional system, each EU is bound to a single CE for its entire lifetime, and the binding between the EU and its code segment is generally static. On the contrary, in Mobile Computing Systems the code segment, the execution state and the data space of an EU can be relocated to a different CE, and in principle each of these EU constituents might move independently.
These new concepts allow us to classify two main forms of mobility: Strong Mobility and Weak Mobility. Strong Mobility is the ability to migrate both the code and the execution state of an EU to a different CE. Weak Mobility is the ability to move code across different CEs; the code may be accompanied by some initialization data or some state data, but no migration of execution state is involved. Let me give some examples to explain these concepts: in the case of strong mobility the execution of the application is frozen, and the host engine takes care of capturing all the information inside the CPU registers and all the bound resources and of moving everything to another machine; in the case of weak mobility only the code and the data are moved: programmers can use special flags and variables to save the state of the execution, but only at the application level.
Strong mobility is supported by two mechanisms: migration and cloning. The migration mechanism suspends an EU, moves it to the destination CE, and then resumes it. The remote cloning mechanism creates a copy of an EU at a remote CE. Remote cloning differs from migration because the original EU is not detached from its current CE.
Mechanisms supporting weak mobility provide the capability to transfer code across CEs and either link it dynamically to a running EU or use it as the code segment for a new EU. Such mechanisms can be classified according to the direction of the code transfer, the nature of the code being moved, the synchronization involved, and the time when the code is actually executed at the destination site. An EU can either fetch the code to be dynamically linked and/or executed, or ship such code to another CE.
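The weak mobility case, where the programmer saves the progress of the computation in ordinary variables, can be sketched with standard Java serialization. This is a minimal illustration and not the mechanism of any particular agent platform: SummingTask and WeakMobilityDemo are hypothetical names, the network transfer is replaced by a byte array inside one JVM, and the class file is assumed to be already available at the destination.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical task whose progress is kept in ordinary fields (application-level state). */
class SummingTask implements Serializable {
    private static final long serialVersionUID = 1L;
    private final int limit; // the task sums the integers 1..limit
    private int next = 1;    // the "flag" variable: where to resume
    private long sum = 0;

    SummingTask(int limit) { this.limit = limit; }

    /** Runs at most maxSteps additions, then returns so that the task can be moved. */
    void run(int maxSteps) {
        for (int i = 0; i < maxSteps && next <= limit; i++) {
            sum += next++;
        }
    }

    boolean isDone() { return next > limit; }

    long result() { return sum; }
}

class WeakMobilityDemo {
    /** Packs the data state into bytes, as would be done before sending it to another CE. */
    static byte[] pack(SummingTask task) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(task);
        }
        return bytes.toByteArray();
    }

    /** Rebuilds the task at the "destination"; no stack or register state is restored. */
    static SummingTask unpack(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (SummingTask) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        SummingTask task = new SummingTask(100);
        task.run(40);                           // partial work on the first CE
        SummingTask moved = unpack(pack(task)); // stand-in for the network transfer
        moved.run(Integer.MAX_VALUE);           // resumes from the saved fields
        System.out.println(moved.result());
    }
}
```

The task can be moved only between two calls to run: an interruption in the middle of the loop could not be captured this way, which is exactly the difference from strong mobility.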
The code migrates either as stand-alone code or as a code fragment: the first is a self-contained piece of code that is used to instantiate a new EU on the destination site (Java Applet); conversely, a code fragment must be linked into the context of already running code and eventually executed (Safe-Tcl). The mechanisms supporting weak mobility can be either synchronous or asynchronous, depending on whether or not the EU requesting the transfer suspends until the code is executed. The next figure compares the network paradigms of the previous section with the concepts about code mobility just introduced.
Figure 13: A complete view of mobility in a Distributed Computing System (diagram labels: data mobility: only data; code mobility: data + code, code + data state, code + data + exec state; paradigms: Client/Server, Remote Execution, Code on Demand, Weak Migration, Strong Migration)
The previous concepts can be considered a base framework from which to start an analysis and a comparison of the technologies for distributed computing systems. The choice of a paradigm for the design of a distributed application is a critical decision. Much of the success of the development process may depend on it, but no paradigm is better than the others in absolute terms. The choice must be made on a case-by-case basis, according to the type of application and the characteristics of the physical system: a management application on a computer network, an information retrieval system in a radio broadcast environment, a smart card solution for Automatic Teller Machines, etc. From [6] we see that paradigms and technologies are not completely orthogonal. Although it is possible to use a single technology to implement many kinds of computing paradigms, some technologies are more suitable for implementing a particular design. Technologies are sometimes too powerful or too limited to implement a particular architecture: in the first case the use of resources is inefficient; in the second, programmers have to code all the mechanisms, use extra structures and implement special policies that the technology does not provide. In the following section we go into more detail about the technology attributes that characterize the computational infrastructure of a mobile code system.
2.4 Programming Languages Concepts for Mobile Code
From a less abstract point of view, we now try to emphasize the technical concerns in the implementation of code mobility. In the last section we have seen the basic paradigms for implementing a distributed system; here we introduce some technical concepts related to the development of such a system. Mobile code technologies include specific new programming languages (or special capabilities added to old ones) and a corresponding run-time support. For the sake of clarity, we introduce three conceptual levels to understand the major constraints of a mobile architecture and the implementation of some specific technologies [1]. We distinguish:
* Language Level
* Abstract Machine Level
* Library Level
This distinction is valid in general, but sometimes specific problems of the distributed system are not limited to a single level: for example security and safety are global properties, so a security model must take into account all aspects of the system support and the execution of the code.
Library Level: Graphics support; Network; Encryption; Hierarchical Class Support
Language Level: Interpreted or Compiled; Strongly Typed or Typeless; Programming Paradigm
Abstract Machine Level: Concurrency; Security Model; Memory Management; Memory Architecture; Platform Architecture; Mobility Mechanism; Shipping or Fetching
Figure 14: Security conceptual layer
Language Level.
Interpreted or Compiled. The language level is the first step and the main resource toward code mobility. The first characteristic is the nature of the produced code: interpreted or compiled. Compiled languages (C, C++, Pascal) are translated directly into the machine language of the executing hardware; they perform better and are more efficient, but they lack portability, and we have to recompile the code each time we want to move it to another platform. Interpreted languages (Visual Basic, JavaScript, JScript) are instead based on special execution engines, called interpreters (in general developed for each platform), which translate the source code directly into the operations of the particular executing machine: this intermediate software level yields worse performance, but it gives a completely portable architecture ("write once, run everywhere"). Some new languages support a third way: they compile the source code into a platform-independent format for an abstract machine architecture and execute it with an interpreter engine or sometimes with a so-called Just-In-Time compiler (Java in the Microsoft browser, Limbo of Lucent Technologies, etc.): this solution is a compromise between performance and portability.
Strongly Typed versus Typeless. Languages can be defined to be typeless or strongly typed. In the first case all data belong to an untyped universe. All values are interpreted when manipulated by different operators; there are no restrictions on how data can be manipulated (more flexibility), but there is the problem of nonsensical or erroneous manipulations: this can be a great disadvantage for the security policy and for the readability and maintainability of the code. Examples close to this extreme are Tcl, where every value is a string, and, to a lesser degree, C, whose casts allow a value to be reinterpreted freely. In the latter case, programmers can use only a secure set of typed variables, and each manipulation of variables can be checked at compile time or, where necessary, at run time. Example: Java. In a mobile code system strong typing can be practically impossible to achieve for two main reasons: first, code can be downloaded and linked at run time from a remote host; second, resources may be bound to a program while it is executing. Between these two extremes many technologies have tried to find the right balance (Java, Limbo).
Programming Paradigm. The majority of modern languages use the Object Oriented Programming paradigm (OOP) for its basic characteristics (reusability, inheritance, encapsulation, security mechanisms) (Java, Obliq, Telescript); others use different paradigms and less complex instruments, but in general they still provide some form of code modularity and security. Some are based on the more traditional imperative (or procedural) and functional styles, with specific additional security features (Limbo, Objective Caml).
Mobility. At the Language Level, mobility is sometimes implemented in a transparent and automatic way directly in the Abstract Machine Level, and sometimes through explicit mechanisms for triggering the migration in the network. Some implementations use only specific network mechanisms (TCP, UDP, HTTP connections on computer networks), while others provide loading mechanisms for general-purpose code mobility.
Abstract Machine Level.
Platform and Memory Architecture. Fundamentally the Abstract Virtual Machine is a concrete computer architecture implemented in software: we have all the physical components of a hardware machine (registers, memory, devices), but they are all built virtually using software components (stacks, variables, handles, etc.). The characteristics of this machine depend on the goals of its developers: the Java VM looks like the Intel 4004 processor (used for embedded applications) and needs few hardware resources, while the VM of Lucent Technologies, Inferno's Dis, is a RISC-like implementation that can be configured for anything from high-end systems to embedded phones. From a programming point of view nothing changes, but if we are looking for specific performance or want a flexible architecture we have to consider the nature of the implementation design. The Memory Architecture also follows hardware-like designs, implemented entirely in software: stack-based design, memory-to-memory design, etc.; in these cases we have to consider whether the design "idea" can be implemented on the underlying hardware platform (virtual memory mechanism, security features, etc.). We have to be careful about memory protection: some memory protection mechanisms are implemented directly in hardware in a platform-dependent way, while in some abstract machine implementations all protection mechanisms are implemented in software. We can see again that the goal of the platform design is either a general-purpose portable design or specific efficient features for embedded systems. In general the memory design includes automatic memory management, the so-called Garbage Collector. This is done for many reasons:
• it is neither efficient nor simple to control memory allocation in a distributed computing system where some objects can reside on remote systems;
• it is better, for security reasons and for the automatic binding of resources, that the VM completely controls memory access; moreover, memory dominates the cost of small systems, so VMs should be designed to keep memory usage as small as possible;
• memory bugs are difficult to control, so automatic memory management helps to build safe applications.
All Virtual Machines for mobile code follow these rules in some way.
Security mechanisms. General protection of running processes is usually performed by hardware memory mechanisms, which confine the code in separate memory spaces; but this can be too expensive a solution for small devices, so some security features are implemented at the Abstract Machine Level in order to control the executed code. During loading and linking we can verify the contents of the code itself and check it before execution; a sort of Security Monitor can be implemented to control access to some resources or to manage trusted and untrusted references (the Security Manager in Java). (A more complete study of security mechanisms is reported in the next section.)
Concurrency. Here we point out the capacity of the Virtual Machine to manage two or more activities (single-threaded processes or threads of a multi-threaded process) at the same time in order to improve the use of CPU resources.
In general a thread is an independent execution resource managed and scheduled by the operating system; the OS associates the CPU, a set of instructions and a stack context (including local variables) with a thread. The first responsibility of the Operating System is to manage computing resources, and one of the ways it does so is by scheduling access to the CPU among the various programs that are executing at the same time. Programs can in turn be made up of multiple threads. Threads have become a fundamental part of the operating system and nowadays they are widely used. In general we use a collection of threads ruled by preemptive or cooperative scheduling. The first is better for strong thread control, because the scheduler gives each thread a fixed time slice to execute its code; the second is better for multimedia or real-time applications, in which it is the programmer who manages the CPU resources depending on the needs of the application. Some Abstract Machines make it possible to use built-in concurrency techniques regardless of the underlying platform (Java on Win 3.1); others use the specific instruments of the platform on which they are running.
Mobility: Shipping or Fetching code? We have talked about moving the code: but who is moving the code? In some cases clients ask a server directly to download the necessary code (fetching), in others the server moves the code toward different clients (shipping), and sometimes the code can move by itself (mobile agents). The first two mechanisms seem to be exactly the same (it is just a matter of symmetry), but they hide different mechanisms for loading, linking, resolving and running applications; the third is a more sophisticated way to move around the network and needs a more complete framework. And what about the structure of the code itself? In general we encounter two basic ways to move code: as a single block and in an applet-like way.
In the first case we have a single block of code that is run completely from beginning to end (a PostScript file, a UNIX remote shell script). In the second case a more object-oriented solution is applied: a set of methods is used to control and trigger the behavior of the application dynamically (init(), start(), .., stop() for the Applet; OnDispatch(), OnCloning(), .., OnArrival() for the IBM Aglets; Objective Caml applets from INRIA).
Linking Mechanism. One of the main characteristics of modern operating systems is the possibility of calling function libraries or external code at run time and linking this code directly into the calling process: these techniques allow programmers to produce modular, flexible and smaller applications. Some examples are the DLL mechanism in the Windows environment, class loading and linking in Java, module loading in the Inferno system, etc.
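The applet-like structure described above, in which the host triggers the behavior of the moved code through a fixed set of methods, can be sketched as follows. The method names init(), start() and stop() come from the text; the interface MobileUnit and the classes LoggingUnit and UnitHost are hypothetical and do not reproduce the real Applet or Aglet classes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical lifecycle contract, modeled on the init()/start()/stop() methods named above. */
interface MobileUnit {
    void init();
    void start();
    void stop();
}

/** Example unit that records which callbacks the host has invoked. */
class LoggingUnit implements MobileUnit {
    final List<String> log = new ArrayList<>();

    public void init()  { log.add("init"); }
    public void start() { log.add("start"); }
    public void stop()  { log.add("stop"); }
}

/** Hypothetical host: the host, not the unit, decides when each phase runs. */
class UnitHost {
    private final List<MobileUnit> running = new ArrayList<>();

    /** Called once the code of the unit has arrived and has been linked. */
    void load(MobileUnit unit) {
        unit.init();
        unit.start();
        running.add(unit);
    }

    /** Stops every unit, for example before the host shuts down. */
    void shutdown() {
        for (MobileUnit unit : running) {
            unit.stop();
        }
        running.clear();
    }

    public static void main(String[] args) {
        LoggingUnit unit = new LoggingUnit();
        UnitHost host = new UnitHost();
        host.load(unit);
        host.shutdown();
        System.out.println(unit.log);
    }
}
```

Compared with the single-block style, the moved code has no entry point of its own: it only supplies the callbacks, and the receiving environment keeps control of when they run.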
Library Level. Since it is important to build secure systems, control over the implementation of external libraries is of fundamental importance. In general there is a set of ready-to-use, tested libraries, so that the behavior of the entire programming environment can be controlled. Some implementations include in the libraries extra features that control access to critical resources. For example, inside the Java libraries a security check is performed each time a critical method is used, in order to verify the permissions of the calling application.
Graphics. The existence of a complete set of packages for developing a graphical user interface is a strong advantage when choosing a platform for mobile code: an easy, ready-to-use library lets developers focus their attention on mobility on the one hand, and gives them an integrated, secure and stable graphics interface on the other. Java and Inferno/Limbo, for example, provide these features.
Security instruments. Some complex mechanisms for building specific security implementations are enclosed in ready-to-use libraries: security interfaces for authentication and certification, security algorithms, etc.
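The library-level check mentioned above, where a critical method verifies the permissions of its caller before acting, can be sketched in a few lines. The sketch does not use the real Java Security Manager: SecurityMonitor, GuardedFileStore, the caller names and the permission strings are all hypothetical, and the caller identity is passed explicitly instead of being derived from the call stack.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical security monitor: every critical operation asks it for permission first. */
class SecurityMonitor {
    private final Map<String, Set<String>> grants = new HashMap<>();

    /** Grants a named permission (for example "file.read") to a code source. */
    void grant(String codeSource, String permission) {
        grants.computeIfAbsent(codeSource, k -> new HashSet<>()).add(permission);
    }

    /** Throws SecurityException unless the permission was granted; unknown code gets nothing. */
    void check(String codeSource, String permission) {
        Set<String> allowed = grants.get(codeSource);
        if (allowed == null || !allowed.contains(permission)) {
            throw new SecurityException(codeSource + " may not " + permission);
        }
    }
}

/** Library-level resource that consults the monitor before doing anything critical. */
class GuardedFileStore {
    private final SecurityMonitor monitor;
    private final Map<String, String> files = new HashMap<>();

    GuardedFileStore(SecurityMonitor monitor) {
        this.monitor = monitor;
    }

    String read(String caller, String name) {
        monitor.check(caller, "file.read");
        return files.get(name);
    }

    void write(String caller, String name, String content) {
        monitor.check(caller, "file.write");
        files.put(name, content);
    }
}
```

A host could, for instance, grant both permissions to local applications and only "file.read" to downloaded code; since the check lives inside the library, the application programmer cannot forget it.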
2.5 Security & Safety of Mobile Code Technologies
Code mobility paradigms can be implemented only with the help of a complex framework: basically a programming language, an abstract machine (interpreter or just-in-time compiler) and sometimes a complete library. One of the major issues in a distributed environment is the security of the system. In an open network security becomes a major constraint, and the increase in system complexity is a big concern for security as well: the more complex a system is, the weaker it is! We need to pay special attention to how we can build a secure and safe system, and to where the controls are implemented. In general the degree of security is determined by the weakest link in the chain: "You can build a 100 billion dollar secure system to prevent access to private password in your system, but are you sure that your system manager will not give this information for less?" (A. Lioy, Security and Distributed Programming Professor at the Politecnico of Turin). Another aspect is related to the nature of code mobility. As mobile code crosses protection domains, special care must be taken to protect it against the external computing environment, and special care must be taken by the computing environment against malicious mobile code. A good introduction to these issues can be found in [1]. We distinguish four security properties:
* Confidentiality (or Secrecy)
* Integrity (or Accuracy)
* Availability
* Authenticity
Confidentiality concerns the absence of leakage of private information (which often occurs through a covert channel). Integrity means that data should not be modifiable by unauthorized parties. Availability is the property whose negation is known as denial of service: the attacker prevents the normal use of shared resources by legitimate users. Authenticity means that the identity of a communication partner can be trusted. We can divide a global mobile code computing system into four subsystems, or levels, to focus in detail on some specific aspects:
* Communication Level
* Operating System Level
* Abstract Machine Level
* Programming Language Level
Communication Level. At this level the network is a collection of computers connected by hardware networking technology; the requirements at this level generally concern the robustness and efficiency of the protocol. But how is information protected from eavesdropping, and how is the confidentiality of the exchanged information controlled? Only in recent years have security features been introduced at the network layer (IPv6); Secure HTTP and SSL (Secure Sockets Layer) are implementations that sit directly on top of the HTTP and socket mechanisms. These protocols, based on cryptographic techniques, control and manage confidentiality, integrity and authentication, but they do little for availability: there are many examples of denial-of-service problems on the Internet.
Operating System Level. Safety and security at the communication level are not sufficient in general. Handling safety and security is also a primary concern at the operating system level. Historically, memory protection and privilege levels have been implemented in hardware: memory protection via base/limit registers, segments, or pages; and privilege levels via user/kernel mode bits or rings. Recent mobile code systems, however, rely on software rather than hardware for protection. From a more general point of view, network hosts can be either high-end machines or smaller embedded systems with no hardware memory protection mechanisms. The switch to software mechanisms is being driven by two technological trends: portability and performance. Portability gives a platform-independent security mechanism; performance improves because software protection offers significantly cheaper cross-domain calls. Confidentiality and integrity can be achieved by controlling process access to information and communication channels; to control availability we can limit access to all the needed resources (disk space, memory usage, number of processes, graphics access). Authentication is usually established through an initial identification of the user and maintained by protected OS structures.
Abstract Machine Level. Safety can also be obtained at the abstract machine level: the virtual machine can implement specific features that are generally implemented by the OS. In fact, using a language-independent abstract machine retains all the language independence of the operating system solution without its portability problems. At the abstract machine level access to resources can be controlled through a set of secure APIs, and the internal structures of the VM can be protected.
Programming Language Level. At this level each security mechanism is paid for with some of the flexibility of the programming language: the majority of modern programming languages have controls against low-level errors through mechanisms such as typing, restricted pointers, automatic memory management, and array bounds checking. It is possible to go even further and use the language scope and access rules to protect the interfaces of resources.
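The language-level controls mentioned above can be observed directly in Java, where every array access is range checked at run time. The following is a minimal sketch only; the names BoundsCheckDemo and safeRead are ours and purely illustrative.

```java
// Hypothetical demo class; the names are illustrative only.
class BoundsCheckDemo {

    // Reads buffer[index], relying on the run-time range check that the
    // Java language performs on every array access.
    static String safeRead(int[] buffer, int index) {
        try {
            return "value=" + buffer[index];
        } catch (ArrayIndexOutOfBoundsException e) {
            // The access was refused before any memory outside the
            // array could be read.
            return "rejected";
        }
    }

    public static void main(String[] args) {
        int[] buffer = {10, 20, 30};
        System.out.println(safeRead(buffer, 1));
        System.out.println(safeRead(buffer, 3));
    }
}
```

In a language with unrestricted pointers the second call could silently read a neighbouring memory location; here the error surfaces as an exception that the host can handle.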
As an optimization, the high-level program can be compiled and type checked before being shipped as mobile code, but can we be sure that the object code really is the untampered output of a correct compiler? Three techniques have been proposed [2]:
• using cryptographic signatures to reduce the problem to one of trusting the author (ActiveX);
• using cryptographic signatures to trust compilers, in which case only a small number of trusted sites are used for compilation and certification (Inferno/Limbo);
• compiling to an intermediate language which can be checked to verify the same constraints that are imposed on the source language (Java).
These techniques are not mutually exclusive: they can be combined to strengthen the overall defenses. For example, cryptographic signatures can be combined with trusted compilers; this seems easily feasible, as the two require much of the same technology and infrastructure [1]. As we have seen, "Security is a global property, so a security model must take into account all aspects of the system supporting the execution of the code. This includes in particular the hardware, operating system, the abstract machine, the modules libraries, the security manager, and the browser (in this case it means the container of the mobile code). A security weakness in just one of these endangers the security of the whole system" [1].
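As a minimal sketch of the signature-based techniques, assuming that the receiver already holds the signer's public key, the standard java.security API can sign a block of code bytes and later detect tampering. The class SignedCodeDemo and its method names are hypothetical, and the choice of RSA with SHA256withRSA is our assumption, not the scheme of any system cited above.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.security.Signature;

// Hypothetical helper; not the mechanism used by ActiveX or Inferno.
class SignedCodeDemo {

    // Creates a key pair for the party that vouches for the code.
    static KeyPair newSignerKeys() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Signs the code bytes with the signer's private key.
    static byte[] sign(byte[] code, KeyPair keys) {
        try {
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(keys.getPrivate());
            signer.update(code);
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns true only if the signature matches these exact code bytes.
    static boolean verify(byte[] code, byte[] signature, PublicKey key) {
        try {
            Signature verifier = Signature.getInstance("SHA256withRSA");
            verifier.initVerify(key);
            verifier.update(code);
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            return false;
        }
    }
}
```

A valid signature tells the receiver who vouched for the code and that it was not modified in transit; it says nothing about whether the code is well behaved, which is why the techniques above are usefully combined.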
“Consider the past and you shall know the future.” (Chinese Proverb)
Chapter 3 Technologies
"Strong typing is for people with weak memories" (Tom Van Vleck)
A number of mobile code solutions have been developed in recent years by the main software companies (Sun, Microsoft, IBM, Lucent, DEC) and by a number of academic research centers (Berkeley, Stuttgart, Zurich). As we have seen, mobile code applications can follow different implementation approaches depending on the specific goals of the designers, and can use different mechanisms to exchange information. Our purpose in this chapter is to give readers an overview of the basic characteristics of the products now available on the market, following the architectural framework (paradigms, techniques and instruments) explained in the previous chapter.
Using the theoretical tools of the previous chapter, we also want to identify the differences and similarities among the distributed computing technologies and find the most suitable solution for our particular domain (see the next chapters). Talking about technologies means investigating the basic practical tools used to implement a certain kind of engineering solution: that is, talking about the architecture, the software components and tools, the qualitative performance, etc. A starting point for a complete overview of mobile code technologies can be found in [1]. In the following subsections we classify the different technologies into 4 main groups:
• Virtual Machine/OS
• Architecture for Object Components
• Mobile Agents
• Scripting Languages
The first group consists of complete programming environments for creating mobile code solutions: Java from Sun and Inferno from Lucent Technologies are good examples with which to explain the main concepts of operating systems and virtual machines for mobile code. The second group is a more general object oriented approach to mobile technology; in the previous chapter we discussed true distributed systems, and here we give some examples of such solutions. CORBA, DCOM, Obliq and Java Beans are just some examples of this approach to distributed systems: some are only specifications, others are ready-to-use tools for implementing object component systems. Mobile agents are the most advanced architecture for building a distributed system: some solutions are already available on the market, but a lot of work has still to be done in this direction. Finally, scripting languages are a collection of very simple and efficient solutions for implementing distributed systems: they lack much of the flexibility of other, more sophisticated technologies, but they are productive, cross-platform, and specialized tools.
Virtual Machine / OSes
3.1 Java
Java was originally called Oak and was proposed by Sun as a language for embedded non-computer devices, mostly consumer devices. When the Web became explosively popular, it was a natural fit to apply this technology to the Web environment, and this product is now the leading technology for implementing distributed mobile code applications on the Internet. Today there are two main versions available on the market, SDK 1.x and SDK 1.1.x (see footnote 2); JDK 1.2 is due to ship within a few months. All the information in this section refers to [2], [3], and [5].
2 In this chapter we refer to the JDK 1.1 implementation of Sun Microsystems and the JDK 2.0 of Microsoft, both developed for the Win32 platform.
What is Java under the hood? Java, first of all, is a class-based, cross-platform, network-based, object oriented language created by Sun Microsystems, with an emphasis on portability and security. Java is a complete programming environment: it is not only a programming language, but also a complete virtual machine specification and a set of general purpose libraries that form the API of the Java system. Today Java is the most dynamic technology for Internet applications and for distributed computing: initially it lacked performance and efficiency, but nowadays the advantages of this platform are clear and tangible. Portability is one of the main advantages, but so is flexibility (loading mechanism, security policy, API extensibility, specification).
It is hard to give a brief overview of the Java family; in this section we provide some basic concepts and practical implementations in order to see both the advantages and the disadvantages of this architecture, following the theoretical framework built in section 2.5. The best known applications of the Java technology are applets: secure, easy to develop, portable, object oriented programs for embedding dynamic behavior inside HTML pages, the basic means of communication on the Web. Applets are Java classes that are loaded automatically from the net and executed inside a secure "sandbox", following the code on demand paradigm. Applets have transformed the way we think of the Internet and of HTML pages: code is moved to any point of the net, whatever the OS or the hardware platform.
Figure 15: Java Architecture (layered diagram: a Java Application in bytecode; the Java API with java.io, java.net, java.math and java.awt; the Java VM with class loader, linker, native math and thread management; and the native operating system with device management, memory and graphics)
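The code on demand behavior of applets rests on the ability of the Java run time to locate, load and link a class that was not known when the host program was compiled. The following minimal sketch shows this with the standard reflection API; it loads a class by name from the local class path rather than from the network, which is a simplifying assumption, and the name OnDemandLoader is hypothetical.

```java
// Hypothetical helper illustrating run-time loading by class name.
class OnDemandLoader {

    // Loads the named class at run time and creates an instance through
    // its no-argument constructor.
    static Object instantiate(String className) {
        try {
            Class<?> loaded = Class.forName(className);
            return loaded.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("cannot load " + className, e);
        }
    }

    public static void main(String[] args) {
        Object instance = instantiate("java.util.ArrayList");
        System.out.println(instance.getClass().getName());
    }
}
```

A browser does essentially the same thing with a class loader that fetches the bytes over HTTP and runs the result under a restrictive security policy.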
Language Level. The Java programming language is based on a simplified variant of C++ (the development language most used on the market and in the academic world) with all unsafe and most complicated language features removed: unsafe operations such as pointer arithmetic, unrestricted casts and unions, and features that lead to unmaintainable programs, such as the C preprocessor, unstructured goto, operator overloading, and multiple inheritance. In computer science terms, Java is a strongly typed, late-bound programming language with dynamic linking. It supports encapsulation, which reduces complexity; it is strongly object oriented, with inheritance, polymorphism and delegation helping code reuse; array and string types are built in, with range checks on all accesses. Exception handling has also been added, in order to permit the creation of robust programs. Concurrency is provided at the language level with threads and synchronized methods, which use a mutex lock on the corresponding object. Java includes a novel notion of interface types. An interface defines a collection of abstract methods and constants with their associated types. A class can be declared to implement an interface, in which case it must implement all the methods of the interface. Wherever a value of an interface type is expected, an instance of a class implementing this interface can be used. Interfaces are useful for a number of purposes: they can be used to hide the implementation of a class and to group classes with a common functionality without forcing them into a class hierarchy. Java also uses the notion of package. A package groups a number of class and interface definitions. A class can be declared final (disallowing subclasses), abstract (disallowing instances), or non-public (limiting the scope of the class declaration to the containing package). Attributes have four levels of visibility: private, default, protected, and public. Private attributes are visible only within the class that declares them; default visibility extends this to the package in which the class is defined; protected visibility further extends it to subclasses of the defining class, possibly defined in another package; finally, public attributes are visible everywhere.
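Several of the language features listed above fit in a few lines. In the sketch below, an interface hides the implementation, a final class with a private attribute implements it, and the synchronized keyword takes the lock of the object so that concurrent threads cannot corrupt the count. All names (Counter, SafeCounter, InterfaceDemo) are hypothetical.

```java
// Interface type: clients see only these two operations.
interface Counter {
    void increment();
    int value();
}

// final: no subclass can be derived; the class is not public, so it is
// visible only inside its package.
final class SafeCounter implements Counter {
    private int count; // private: visible only inside this class

    // synchronized: the calling thread holds the lock of this object
    // for the duration of the method.
    public synchronized void increment() {
        count++;
    }

    public synchronized int value() {
        return count;
    }
}

class InterfaceDemo {
    // Starts the given number of threads, each incrementing a shared
    // counter perThread times, and returns the final count.
    static int countWithThreads(int threads, int perThread) {
        Counter counter = new SafeCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.value();
    }
}
```

Without the synchronized keyword the increments of different threads could interleave and some of them would be lost.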
As we can see, Java is more than a traditional programming language: it is an instrument built to help programmers develop object oriented applications and to provide a secure platform for managing code in a distributed computing environment. All these aspects make Java a complete systems programming language, like C++.
Abstract Machine Level. Basically the virtual machine of Java is a virtual hardware platform. Today's processors have high computing capabilities and can simulate a complete hardware platform in software, building a software layer between the different specific platforms and user applications. From an architectural point of view, the Java VM is like the Intel 4004 processor, with a minimal number of registers (4) and a stack-based memory architecture; these requirements allow the VM to be implemented on almost all hardware platforms, from embedded systems to high-powered computers. The VM has its own machine language, the so-called bytecode, which is used unchanged on every VM: it has both the efficiency of compiled code and the portability of interpreted code. The VM is not a monolithic piece of code. Since it is designed with portability and scalability in mind, it makes heavy use of separate subsystems for threading, memory management (garbage collector), native method interfacing, native system calls, loading, verifying and linking code, etc. Each block can be used and configured for specific purposes and can be added or removed to adapt to specific software and hardware requirements. For example, some embedded implementations of the JVM do not need the GC module or the verifier module, because memory management is done manually and the Java classes are embedded in ROM. Automatic memory management (garbage collector) has been added, protecting execution from memory allocation/deallocation errors and pointer errors due to manual memory management; as we have seen, automatic memory management is a key point in distributed systems, both for controlling and managing resources and for security reasons. The most interesting part of the VM is the execution engine itself, which comes in two different flavors: interpreter and "just in time" compiler. The first simply translates and executes the bytecode instructions one at a time, while the second executes Java code more efficiently by compiling it to platform-dependent native code and caching frequently used blocks of code. The second solution is surely the more efficient, but it lacks portability, because the VM implementation is strictly dependent on the specific machine. A JVM is available for a huge number of different hardware platforms and OSes: from RISC-based processors to the Intel Pentium, from embedded systems to PDAs, from the MS Windows platform to Sun's Solaris, etc.
Library Level. From the beginning of its life, Java has been shipped with a set of libraries (JAPI) that give programmers a complete and secure class-based environment. The system classes are a complete set of reusable pieces of code (reused through inheritance) that let programmers focus on their specific tasks without worrying about every part of the system. The libraries of the Java Development Kit (JDK) from Sun range from GUI components to HTTP communication, from complex mathematical calculations to remote procedure calls. The importance of the Java class libraries lies in their role in building a secure system: they are written in such a way that all sensitive operations call into a centralized object, the security manager, to check whether the caller is allowed to invoke the operation. Programmers can build their own security manager to tailor the security of their environment. Initially the JDK was the only Java technology available for application development, because Java was positioned for the Internet market. Sun is now looking seriously at other specific market segments.
A Java system can be adapted to different platforms: Personal Java for network computers, smart phones and handheld PDAs; Embedded Java for embedded systems; and Java Card for smart card technologies. They are all designed to be upwardly compatible, so applications can run on the larger API implementations.
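The centralized check performed by the class libraries can be sketched as a pattern. The code below does not use the real java.lang.SecurityManager API; AccessPolicy and GuardedStore are hypothetical stand-ins that show only the idea that a library method consults one policy object before performing a sensitive operation.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical policy object standing in for the role of the security
// manager: one place where every access decision is taken.
final class AccessPolicy {
    private final Set<String> readable = new HashSet<>();

    void allowRead(String resource) {
        readable.add(resource);
    }

    // Throws SecurityException unless reading the resource was allowed.
    void checkRead(String resource) {
        if (!readable.contains(resource)) {
            throw new SecurityException("read denied: " + resource);
        }
    }
}

// Hypothetical library class: the sensitive operation always calls into
// the policy first, so callers cannot bypass the check.
final class GuardedStore {
    private final AccessPolicy policy;

    GuardedStore(AccessPolicy policy) {
        this.policy = policy;
    }

    String read(String resource) {
        policy.checkRead(resource);
        return "contents of " + resource;
    }
}
```

Replacing the policy object changes what the whole library permits without touching the library code, which is the property that lets an environment install its own security manager.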
Figure 16: An applet executed in Internet Explorer and in Netscape Navigator: the results are not identical!
Applications. Java is an environment for implementing complex distributed applications. Since its birth Java has gathered a huge number of different applications: from commercially oriented applications to agent-based platforms, from embedded device management to software components. A complete OS, JavaOS, has been built. JDBC is a package for database interfacing that follows the ODBC model. Java Beans is a component architecture for Java that enables reusable software components. Java 3D is a set of libraries for 3D imaging. JavaChips are hardware implementations of the Java VM for embedded applications, and JavaCard is a complete set of APIs for building smart card applications. The flexibility of the package structure of Java has allowed programmers to develop all kinds of mobile code applications: the RMI packages make it possible to create traditional distributed, object oriented client/server systems; Aglets, Voyager and Odyssey are specific implementations that use some features of the Java language (RMI, serialization, networking, multithreading) or some additional techniques to create weak mobile code applications. Java is a promising language with tremendous market acceptance. Much of this popularity stems from Java's unique combination of characteristics: it is close to C++, safe, portable, easy and concurrent, and it supplies a rich base library. There is still much work to do, especially on the security architecture and on the efficiency of the execution engine; Java lacks some characteristics needed for real-time applications, where the performance of the executing application has to be predictable. Many other technologies are converging towards Java: Tcl/Tk is thought of as the scripting twin of Java; DCOM interfacing with Java is a goal of the Microsoft JVM; the Inferno OS developers at Lucent are considering the possibility of embedding a JVM inside their system; etc.
♣♣♣
Java is used as a base technology for implementing different kinds of mobile code architectures and paradigms. Java and applets have revolutionized the Web, and "executable content" has become a common term in the Web and network glossary. We have seen that applets are essentially program code that can be downloaded (code on demand paradigm), instantiated, and executed in a Web browser. Recently this concept has been matched by the introduction of the servlet, which moves in the opposite direction to the applet: it allows the client program to upload additional program code to the server; the servlet's code is then instantiated and executed on the server (remote execution paradigm). In the next sections we will analyze some additional implementations that use the Java technology: a mobile agent architecture, Aglets, and an object component architecture, Java Beans. These implementations are Java solutions, but they have particular aspects that have to be considered separately.
3.2 Inferno/Limbo
The same folks that brought us UNIX and C are promising something even better for network communication: the Inferno OS and the Limbo language. Inferno, by Lucent Technologies (Bell Labs Innovations), is a network operating system designed to suit the constraints and needs of this environment; Limbo is the systems programming language used to build the majority of the OS and all applications. All information about Inferno is available on the Lucent Web site and in some papers also available from Lucent [7]. Inferno is a commercial product intended to be flexible enough to be employed on devices as diverse as intelligent telephones, hand-held computers, personal digital assistants, television set-top boxes, home video game consoles, and inexpensive network computers; it can also be used on servers such as Internet servers, financial servers, and video-on-demand servers. It is portable across processors (Intel, Sparc, MIPS, ARM, HPPA, AMD 29K) and across platforms (WinNT, Win95, UNIX: Irix, Solaris, Linux, AIX, HP/UX). The design of Inferno is based largely on the Plan 9 OS, a network operating system from Lucent, but emphasizes portability, versatility, and economical implementation. Economical here refers to the computing resources required: Inferno can run in little memory and does not require virtual memory hardware. Portability has two dimensions in Inferno. The first is that the operating system itself can run directly on bare hardware (native mode), or on top of an existing operating system such as UNIX, WinNT or Plan 9 as a VM (emulation environment). In the latter case, the services provided by Inferno are mapped to the native services of the underlying operating system. The second dimension is the portability of Inferno applications. Applications (as we have already mentioned) are written in Limbo, an integral part of Inferno, and are compiled to a binary format that is portable between all Inferno implementations.
The Inferno system is based on the application of three basic principles:
* resources as files: system resources are represented as files in a hierarchical file system;
* namespaces: the application view of the network is a single, coherent namespace that appears as a hierarchical file system but may represent physically separated resources;
* standard communication protocol: a standard communication protocol, called Styx, is used to access all resources, both remote and local.
Figure 17 gives an overview of the architecture of the Inferno/Limbo system.
Figure 17: Inferno/Limbo Architecture (layered diagram: an Application Layer with applications written in Limbo, and possibly Java, and a graphics library; a Kernel Layer with the Dis virtual machine, namespaces, security, memory management, process management and Styx communication, written in C and assembler; a Hardware Layer with the host OS, device drivers and network)
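The first two principles above can be illustrated with a toy model: every resource, whatever it really is, is reached by reading a path in one hierarchical namespace. The sketch is written in Java rather than Limbo, does not implement Styx, and all names in it (FileNamespace, bind, read and the paths used in the test) are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Supplier;

// Toy namespace: a path maps to something that can be read like a file.
// Whether the resource behind a path is local or remote is invisible to
// the caller.
final class FileNamespace {
    private final Map<String, Supplier<String>> entries = new TreeMap<>();

    // Makes a resource visible under the given path.
    void bind(String path, Supplier<String> resource) {
        entries.put(path, resource);
    }

    // Reads the resource bound at the path, as if it were a file.
    String read(String path) {
        Supplier<String> resource = entries.get(path);
        if (resource == null) {
            throw new IllegalArgumentException("no such file: " + path);
        }
        return resource.get();
    }
}
```

In this model a clock device and a status query to a remote machine are both just paths to read, and an application is given its view of the network by choosing which paths are bound.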
Language Level. Limbo is a safe imperative language. Its main inspiration is C, but it adds declarations as in Pascal, abstract data types (ADTs), first class modules, first class channels, automatic memory management, and preemptively scheduled threads. It excludes pointer arithmetic and casts. The declaration of a module identifies the types of the exported functions and contains the exported declarations of ADTs, simple type declarations, and constants. In order to use a module, it must be instantiated by loading an implementation of the module at run time. The loading is done with the built-in function load, which takes a module type and a path to the module implementation and returns the instantiated module (or null if unsuccessful). This allows programs to choose among several implementations at run time. A Limbo channel allows the communication of any value of its declared type. A channel can be connected directly to another process or, through a library call, to a named destination. Channels are the only built-in primitive for interprocess communication, but more complicated mechanisms can be built upon them. Memory is managed automatically by a garbage collector.
Abstract Machine Level. Limbo programs are compiled to a RISC-like abstract machine called Dis. Dis is designed for just-in-time compilation to efficient native code by the Limbo run-time system. The machine has a memory-to-memory architecture, not a stack architecture, which translates easily to native instruction sets. The Inferno kernel provides preemptive scheduling (with very low overhead) of the processes that are responsible for managing and servicing protocol stacks, media copies, alarms, interrupts, and the like. The kernel schedules processes with multiple priority run queues (8 levels) using a round-robin mechanism. Scheduling is on a fixed time slice, or quantum, basis, with each quantum being set by the local system clock.
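The scheduling policy just described, several priority run queues served round-robin, can be sketched as follows. This is a Java illustration under stated assumptions, not Inferno kernel code: we assume that level 0 is the highest priority, that a lower level is served only when all higher levels are empty, and that processes never terminate. The name RoundRobinScheduler is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy scheduler with 8 priority run queues (assumption: 0 is highest).
final class RoundRobinScheduler {
    static final int LEVELS = 8;
    private final List<Deque<String>> runQueues = new ArrayList<>();

    RoundRobinScheduler() {
        for (int level = 0; level < LEVELS; level++) {
            runQueues.add(new ArrayDeque<>());
        }
    }

    // Appends a process to the run queue of the given priority level.
    void add(String process, int priority) {
        if (priority < 0 || priority >= LEVELS) {
            throw new IllegalArgumentException("bad priority: " + priority);
        }
        runQueues.get(priority).addLast(process);
    }

    // Chooses the process that receives the next quantum: the head of
    // the first non-empty queue, which is then moved to the tail of the
    // same queue. Returns null when nothing is runnable.
    String next() {
        for (Deque<String> queue : runQueues) {
            String process = queue.pollFirst();
            if (process != null) {
                queue.addLast(process);
                return process;
            }
        }
        return null;
    }
}
```

Under these assumptions two processes at the same level alternate quantum by quantum, while a process at a lower level waits until the higher levels are empty; a real kernel would also remove processes that block or exit.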
Inferno has a two-level memory allocation mechanism: the lower level maintains control of large blocks of contiguous memory, while the higher level divides memory into pools to allow applications to control the behavior of the system when resources become scarce. Memory is stored as blocks in an unbalanced B-tree with the leaves sorted by size. The interface to all devices is through the Inferno file system interface (as is partially done in UNIX), and each device driver is a kernel-resident file system.
Figure 18: Inferno Emulator on WinNT platform
Library Level. A rich set of standard modules is provided, including modules for network communication, secure and encrypted communication, and graphics. Two user interface libraries are available: one, based on Tk, is intended for traditional window-based user interfaces; the other is a set of ready-made interface components for typical embedded applications, such as interactive TV. The specialized design allows for minimal memory requirements.
Security. Safety is achieved through a safe language with restricted pointers and automatic memory management; unlike in Java, this safety is not enforced by the abstract machine. Inferno relies on applications being signed by trusted authorities who check their validity and behavior: this is a big constraint on the diffusion of the environment for Internet applications, where a huge number of programmers are at work.
Applications. The application domain of Inferno is focused on applications for service providers. Inferno has been compared to Java by many in the press. The lack of an object oriented structure in Limbo is one of the reasons why the Inferno developers are thinking of integrating Java into Inferno [4]. The two platforms are briefly compared in the following table.
Table 1: Inferno vs. Java - a Comparison
Feature | Inferno/Limbo | JavaOS/Java
Security | Built-in authentication and encryption at OS level; no automatic machine protection security | Machine protection security is built in; encryption has been added
Resources access | One file system accesses everything, from data to network | File system accesses local data; network data must be accessed through the server
Minimum size machine to run applications | 512 KB of RAM, 512 KB of ROM | 128 of RAM, 512 of ROM
Object Oriented | No | Yes
Virtual Machine | Dis | Java Virtual Machine
3.3 OSes as Virtual Machines?
Special hardware devices need OSes built directly upon them. For some special purpose architectures we cannot implement an abstract virtual machine on top of the main OS. Tight constraints on hardware resources such as memory and CPU, special communication protocols or special video devices leave us no choice but to develop a very thin software layer and use it as the only platform. There are some advantages in using a platform-dependent environment:
• fast performance and efficient code
• direct access to the hardware resources
• a complete and optimized set of development tools
• a well defined API
• real time capabilities
Nowadays there are a few platforms for distributed applications that follow this route. Apart from some peculiar characteristics, the following can be included in this group of OSes: Windows CE from Microsoft, JavaOS from Sun, Inferno from Lucent, but also all the UNIX flavors and the real time OSes. Portability is a big problem for these systems, and in many quarters there are attempts to add tools that open the platform to a wider environment: the Inferno programmers are developing a way to include a Java VM inside their system; some real time OS vendors are thinking about Java as an additional module to include in their systems; etc.
Architecture for Object Components
Modern programming languages employ the object paradigm to structure computation within a single operating system. The next logical step is to distribute a computation over multiple processes on a single machine or even on different machines. Because object orientation has proven to be an adequate means for developing and maintaining large scale applications, it seems reasonable to apply the object paradigm to distributed computation as well. Objects are distributed over the machines of a networked environment and communicate with each other [9]. As a fact of life, the computers within a networked environment differ in hardware architecture, operating system software, and the programming languages used to implement the objects. This is what we call a heterogeneous distributed environment. To allow communication between components in such an environment one needs a rather complex piece of software called a middleware platform.
3.4 CORBA
Currently there are two major approaches to distributed-object technology, namely Microsoft's OLE/ActiveX technologies and the Object Management Group's (OMG) CORBA specification. CORBA stands for Common Object Request Broker Architecture. Let us imagine the internal hardware architecture of a PC. Inside we have different blocks that implement different mechanisms: one or more CPUs, hard disks, a floppy controller, a video controller, an audio card. All we need to coordinate these separate blocks is a common "bus" that coordinates and rules the exchange of information. CORBA provides the software equivalent: a "bus" specification for connecting different software components from different platforms. CORBA addresses the following issues:
* object orientation
* distribution transparency
* hardware, operating system and language independence
* vendor independence.
CORBA is an open standard in the sense that anybody can obtain the specification and implement it; besides its technical features, this is considered one of CORBA's main advantages over proprietary solutions. To summarize, CORBA is a standard specification for building a multiplatform distributed computing environment using an object oriented paradigm. Proprietary solutions are also available on the market (DCOM from MS), but CORBA is becoming the standard and the reference model for large distributed applications. The biggest software companies use CORBA as a reference model for their applications (IBM, HP, ORACLE), and many other companies are building bridges to this specification (Java/CORBA, DCOM/CORBA). From the point of view of the distributed paradigm, CORBA is an evolution of the client/server RPC model used on many platforms (UNIX, Windows, Solaris): the revolutionary idea is to combine it with the object oriented paradigm and to give a specification that is separate from any particular technology: OSes, programming languages, communication mechanisms. In CORBA there is no mobile code: this technology is a coordinating framework that extends object communication from a single address space to a distributed environment (both locally and remotely). Like other technologies, CORBA is Internet oriented; that is, it is a complete specification for building sophisticated systems for large networks of high-powered computers and for large scale applications. It can be considered a reference model for building general distributed computing environments.
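The "software bus" idea and distribution transparency can be illustrated with a toy broker. The sketch below is not the CORBA API: it involves no IDL and no network, and runs inside one process. It only shows the shape of the idea, in which a client obtains a stub for a named object and calls it through an ordinary interface while the broker forwards each call to the registered implementation. ToyBroker and Greeter are hypothetical names; java.lang.reflect.Proxy is the standard Java dynamic proxy facility.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

// Hypothetical service interface shared by client and implementation.
interface Greeter {
    String greet(String name);
}

// Toy in-process broker: maps object names to implementations.
final class ToyBroker {
    private final Map<String, Object> servants = new HashMap<>();

    // Makes an implementation reachable under a name.
    void register(String objectName, Object servant) {
        servants.put(objectName, servant);
    }

    // Returns a client-side stub. Every call on the stub goes through
    // the broker, which looks up the implementation and forwards it.
    <T> T resolve(String objectName, Class<T> iface) {
        InvocationHandler forward = (proxy, method, args) -> {
            Object servant = servants.get(objectName);
            if (servant == null) {
                throw new IllegalStateException("unknown object: " + objectName);
            }
            return method.invoke(servant, args);
        };
        Object stub = Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, forward);
        return iface.cast(stub);
    }
}
```

The client code is the same wherever the implementation lives; in a real request broker the forwarding step would marshal the call and send it to another process or machine.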
3.5 ActiveX & DCOM
To explain the Microsoft technology, we have to make a brief historical digression. All material regarding COM technology can be found on the Microsoft Web site (http://www.microsoft.com/COM/). Object Linking and Embedding (OLE), which was introduced in 1991, began as a way to create compound documents that would serve as containers for blocks of highly structured data: a spreadsheet, a piece of clip art, a document, for example. These containers could embed a fragment of a spreadsheet in the middle of a word processing document. But along the way Microsoft added one function after another, placing them all under that one well-known label. By continually expanding OLE's core capabilities, Microsoft bloated OLE components and made them inefficient for delivering "active" content to bandwidth-choked dial-in Internet clients. In contrast with the monolithic OLE, ActiveX defines a family of interfaces, with each member of that family meeting a more narrowly focused set of needs. The basis of these technologies is the Component Object Model (COM) and Distributed COM (DCOM) standards. DCOM (the network side of COM), the base technology for ActiveX and OLE, is a way for software components to communicate with each other both locally and remotely; it is the distributed extension that moves data from inside the machine to the Internet world. It is a binary and network standard that allows any two components to communicate regardless of what machines they are running on (as long as the machines are connected), what OS the machines are running (as long as it supports COM), and what language the components are written in (as long as it supports pointer dereferencing and virtual function tables). This technology was developed by Microsoft for its own OSes; recently, under the impulse of the CORBA technology, Microsoft has started a standardization process to export it to other platforms (UNIX, Solaris, etc.).
On the other hand, DCOM technology is the natural evolution of the well-known DLL technique and of COM components. DLLs are dynamic-link libraries that allow Windows developers to reuse pieces of code inside Win32 applications and to load them at run time (dynamically). DCOM components are pieces of code that have a standard interface, like their COM counterparts (so you can call a specific method without knowing the internal structure of the component), and that can be called from any application, both remotely and locally. At the moment the only platform that uses DCOM is the Windows platform (WinNT, Win95, Win98, etc.). As we can see from this introduction, the technology is suitable for distributed object-oriented programming applications. Its use inside the Internet Explorer Web browser is of particular interest: in that case it implements a code on demand paradigm, because while an HTML page is being loaded the browser can automatically download, register and run an application object (an ActiveX component).
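The idea of a component that is used only through a standard interface, without knowledge of its internal structure, can be illustrated with a minimal Java sketch. This is an analogy and not the COM API: the names Unknown, Spreadsheet, SpreadsheetComponent and createInstance are hypothetical, and the three methods of Unknown are only loosely modeled on the interface discovery and reference counting offered by COM's root interface IUnknown.

```java
import java.util.Optional;

public class ComponentSketch {

    // Hypothetical root interface: every component can be asked which
    // other interfaces it supports, and keeps a count of its clients.
    interface Unknown {
        <T> Optional<T> queryInterface(Class<T> type);
        int addRef();
        int release();
    }

    // Hypothetical application-level interface published by a component.
    interface Spreadsheet extends Unknown {
        int sum(int a, int b);
    }

    // The implementation stays hidden behind the interfaces above.
    static final class SpreadsheetComponent implements Spreadsheet {
        private int refCount = 0;

        @Override
        public <T> Optional<T> queryInterface(Class<T> type) {
            if (type.isInstance(this)) {
                addRef(); // each interface handed out counts as one reference
                return Optional.of(type.cast(this));
            }
            return Optional.empty(); // interface not supported
        }

        @Override
        public int addRef() { return ++refCount; }

        @Override
        public int release() { return --refCount; }

        @Override
        public int sum(int a, int b) { return a + b; }
    }

    // Stand-in for a system-provided factory: the client receives only
    // the root interface, never the concrete class.
    static Unknown createInstance() {
        SpreadsheetComponent component = new SpreadsheetComponent();
        component.addRef();
        return component;
    }

    public static void main(String[] args) {
        Unknown component = createInstance();

        // Ask for a supported interface and call through it.
        Optional<Spreadsheet> sheet = component.queryInterface(Spreadsheet.class);
        sheet.ifPresent(s -> {
            System.out.println(s.sum(2, 3));
            s.release();
        });

        // Ask for an interface the component does not implement.
        System.out.println(component.queryInterface(Runnable.class).isPresent());
        component.release();
    }
}
```

The client code in main depends only on the two interfaces. In a real COM or DCOM system the object behind them could live in another process or on another machine, with the call forwarded transparently; that part is not modeled here.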
Language Level. The DCOM architecture is not tied to a particular language. The only required feature is the ability to dereference pointers and to create virtual function tables, which provide the framework needed to create components and let them communicate with each other. General purpose object-oriented languages like C++, Java and Delphi could use this technology, but today only a restricted number of tools (all belonging to Microsoft) are its concrete instruments: Visual C++, Visual J++ and Visual Basic. The first two are more suitable for system-level programming; the third is a faster development tool for simple, user-friendly front-end applications. Software components are objects that implement a specific standard interface for each particular kind of application (all interfaces inherit from a single interface called IUnknown): ActiveX controls, ActiveX documents, ActiveX containers, etc.
Abstract Machine Level. The complete framework needed to build a DCOM system is provided by the Microsoft OSes: each component, when instantiated, is registered in the registry database (a central database that collects all the software and hardware configuration information of a machine); this registration mechanism can also be used for remote objects. ActiveX Controls can reside locally on a client machine or they can be downloaded from the Internet. You can use ActiveX Controls to handle client-side interactions with the user or server-side computation (three-tier architecture). An ActiveX Control may be as simple as a button or as complex as a reporting tool. Since the same ActiveX Controls can also be used inside Visual Basic, you can expect to see a proliferation of ActiveX Controls that you'll be able to use in your Web pages. In Internet applications, ActiveX components are usually used to create dynamic behavior inside HTML pages: each time a page contains a special tag (