Object-Relational Database Systems: an Introduction Divya Kommineni 1. Introduction: The relational data model has proven very successful at solving most business data processing problems. However, there are many engineering applications like CAD systems, programming environments, geographic data, and graphics for which a conventional relational system is not suitable. The relational databases and object technology can be used in conjunction which involves bringing relational data into objects, where the data can be navigated and applications built. Then, the altered object data needs to be brought back to the relational database where it can be securely stored [1]. 2. Strengths and Weaknesses of Relational Model: The strengths of a relational model are its strong theoretical foundation based on firstorder predicate logic, simplicity, reliability, suitability for online transaction processing and its support for data independence [2], relatively static repositories for business data [3]. However, RDBMSs have their own weaknesses. They are: Poor representation of „real world‟ entities Semantic overloading Poor support for integrity and enterprise constraints Homogenous data structure Limited operations Difficulty handling recursive queries Impedance mismatch Problems with RDBMSs associated with concurrency, schema changes and poor navigational access The RDBMS has domains, keys, multi-valued and join dependencies Expressing many to many relationship is difficult Normalisation may sometimes lead to relations which do not correspond to entities in the real world. 3. Object Relational Database Management Systems: Many of the advanced database management systems make use of object oriented features like User extensible type system Encapsulation Inheritance Polymorphism Dynamic binding of methods Complex objects including non-1NF objects Object identity
One way to overcome the weaknesses of the relational model is to extend the relational model with the above mentioned features. The RDBMSs which have incorporated the object oriented features are now referred as Object Relational DBMSs (ORDBMSs). Some analysts predict ORDBMS will have 50% larger share of market than RDBMS. Object-relational DBMSs add support for object-oriented data modeling by extending both the relational data model and the query language while keeping the already successful technology (especially the SQL) of a relational DBMS relatively intact. There are two classes of object-relational DBMSs in the market: ORDBMSs that have been built from scratch (e.g., Illustra, UniSQL) and ORDBMSs that are built by extending existing relational DBMSs (e.g.: DB2, Informix, Oracle, and Sybase). This approach is essentially a bottom-up approach, being data (or database) centric [4]. RDBMSs are currently the dominant database technology with estimated sales of $6 $10 billion per year ($25 billion with tools sales included). OODBMS market is still small, but it still finds new applications areas such as Web. Some analysts expect OODBMS market to grow at a faster rate than total database market, but unlikely to overtake relational systems. But the vendors of RDBMSs are conscious of the threat and promise of OODBMS. They also agree that the current RDBMSs are not suitable for the advanced applications and that additional functionality is required. But they reject the claim that RDBMSs do not provide sufficient functionality or will be too slow to cope adequately with new complexity [2]. The shortcomings of relational model can be overcome by extending the model with object oriented features. 3.1 Design methodology: Developers using ORDBMS get the most benefit from the technology when taking a more holistic approach to database analysis and design than is typical with relational DBMS technology. A good way to think about an ORDBMS is as a kind of software "back-plane": a framework within which you can embed software modules (object classes) corresponding to the things the application is interested in. What distinguishes an object-relational DBMS from more conventional software frameworks-such as pure object-oriented DBMSs, application-servers, or TP-monitor middleware-is that the embedded object classes are deployed within an abstracted or logical data model. As a result, object-relational database development teams need to work at two levels. One team of developers implements the objects in the database using languages like C or Java, while another team combines these objects to address the high-level requirements of the application's problem domain [3]. 4. Stonebraker’s view: Stonebraker has proposed a four-quadrant view of the database world, as shown in Figure1. The lower left quadrant contains the applications that process simple data and have no need for querying the data. These types of applications use standard text processing packages like Word, WordPerfect and Framemaker which can make use of the underlying operating system to obtain the essential DBMS functionality of persistence. In the lower-right quadrant contains the applications with complex data but require no significant data querying. For these
types of applications, computer-aided design packages may be an appropriate choice of DBMS. In the top-left quadrant, the applications with complex querying but simple data are present. Many of the traditional business applications fall in to this quadrant and a relational DBMS may be the most appropriate DBMS for those types of applications. The top-right quadrant has the applications that require complex querying with complex data. These applications require some advanced database features and ORDBMS is the most appropriate one in this case. This is a very simplistic classification and also it is not easy to distinguish the database applications in to one of these cases. 5. Advantages of ORDBMSs: Apart from overcoming the weaknesses of relational models, ORDBMSs have the following advantages. In ORDBMS, the server performs the standard functions centrally instead of having it coded in each application which increases the productivity. This can be referred to as reuse. It gives rise to increased productivity both for developer and end-user. This ORDBMS preserves the knowledge and experience that is gained in developing the relational applications. Able to handle and query large and complex applications. 6. Disadvantages of ORDBMSs: Though ORDBMS have some advantages by the way of using object oriented features, it is also not fully solving the problems. It also has some disadvantages. Proponents of relational approach believe that the simplicity and purity of relational model are lost. Some believe that RDBMS is being extended for what will be a minority of applications. OO purists are also not attracted by extensions [2]. ORDBMSs have some disadvantages like Complexity Increased costs Simplicity and purity of the relational model are lost with these extensions. SQL is now extremely complex. 7. Postgres – An early ORDBMS: The Ingres relational database management system was implemented at the University of California. Many extensions of that were proposed later and it became more difficult to include any more substantial new functions to it. Because of the earlier design decisions and the problem of integrating the previously proposed ideas to the new system, it became difficult to extend the existing system. Hence, a new database system called Postgres(„Post INGRES‟) was built. Postgres attempts to extend the relational model with abstract data types, procedures, and rules [5]. The objectives of this project were [5]: 1. Providing better support for complex objects: Engineering data is more complex and dynamic in contrast to the business data. Although the required data types can be simulated in the relational system, the performance of the system is unacceptable.
2. Providing user extensibility for data types, operators, and access methods: This helps in the system being used for new application domains. A conventional DBMS has a small set of built-in data types and access methods. Many applications require specialized data types like geometric data types for CAD/CAM or a latitude or longitude position data type for mapping applications. Though these data types can be simulated on the built-in data types, the resulting queries are confusing and the performance is poor. Such applications would be best served by the ability to add new data types and new operators to a DBMS. Moreover, easy-to-use interfaces should be preserved for any code that will be written by a user. 3. Supporting active databases and rules: Many applications are most easily programmed using alerters and triggers. A database alerter is needed to bring attention to a particular problem. Triggers can be used in the database to propagate updates to maintain consistency. 4. Simplifying the DBMS code for crash recovery: Most DBMSs have large amount of crash recovery code that is tricky to write, full of special cases, and very difficult to test and debug. The model for crash recovery should be as simple as possible and easily extendible. The approach proposed in [5] is to treat log as normal data managed by the DBMS which will simplify the recovery code and simultaneously provide support for access to the historical data. 5. Making use of new technology whenever possible: A system design with optical disks in its hierarchy and also the usage of multiple-processor work-stations, and customdesigned VLSI chips will have an advantage. 6. Making a few changes as possible (preferably none) to the relational model: Rather than building a system that is based on a large, complex data model, a new system should be built on a small, simple model that is extendible. Some features are added to the relational model in order to support aggregation, generalization, complex objects with shared sub-objects and attributes that reference tuples in other relation. They are: Abstract data types, Data of type „procedure‟, Rules. Postgres provide a set of predefined atomic types: int2, int4, float4, float8, bool, char, and data. Users can add new atomic types and structured types. An Abstract Data Type definition includes a type name, its length in bytes, procedures for converting a value from internal to external representation (and vice versa) and a default value. For example, the type int4 is internally defined as: DEFINE TYPE int4 IS ( InternalLength = 4, InputProc = CharToInt4, OutputProc = Int4ToChar, Default = “0”) The procedures Int4ToChar and CharToInt4 are implemented using some high level programming language like „C‟ and they are made known to the system using DEFINE PROCEDURE command. An operator on ADTs is defined by specifying the number and type of operand, the return type, the precedence and associativity of the operator, and the procedure that
implements it. Structured types are defined using type constructors for arrays and procedures. A variable or fixed length array is defined using an array constructor. The procedure constructor allows values of type „procedure‟ in an attribute, where a procedure is a series of commands written in Postquel, the query language of Postgres and the corresponding data type is called the postquel data type. 7.1 Relations and Inheritance: The declaration of a relation in Postgres looks like: CREATE TableName( columnName1 = type1, columnName2 = type2, …) [KEY (listOfColumnNames)] [INHERITS (listOfTableNames)] Multiple inheritance is supported and the key specifications are also inherited. The following declaration shows how to create an entity Staff that inherits the attributes of a Person: CREATE Person ( fName = char[15], lName = char[15], sex = char, dateOfBirth = date) KEY (lName, dateOfBirth) CREATE Staff( staffNo = char[5], position = char[10], salary = float4, branchNo = char[4], manager = postquel) INHERITS (Person) The relation Staff now includes the attributes declared explicity for that and also the attributes declared for Person. In this case, the key is also inherited for the Staff relation which is same as that of the Person. The manager attribute is defined as a type postquel to indicate that it is a Postquel query. The following declaration shows how a tuple can be added to the Staff relation using the APPEND command: APPEND Staff ( staffNo = “SG37”, fName = “Ann”, lName = “Beech”, sex = “F”, dateOfBirt = “10-Nov-60”, position = “Assistant”, salary = 12,000, branchNo = “B003”, manager = “ RETRIEVE (s.staffNo) FROM s IN Staff WHERE position = „Manager‟ AND branchNo = „B003‟”) Accessing the manager attribute can be done in two ways: One using RETRIEVE command and the other using EXECUTE command. RETRIEVE ( s.staffNo, s.lName, s.manager.staffNo) FROM s IN Staff EXECUTE ( s.staffNo, s.lName, s.manager.staffNo) FROM s IN Staff Parameterized procedures can be used where the query parameters can be taken from other attributes in the tuple. The $ sign is used to refer to a tuple in which a query is stored. The above query can be redefined using parameterized procedure type as DEFINE TYPE Manager IS RETRIEVE ( staffNumber = s.staffNo) FROM s IN Staff WHERE position = “Manager” AND branchNO = $.branchNo
and this can be used in the creation of the table as: CREATE Staff ( staffNo = char[15], position = char[10], salary = float4, branchNo = char[4], manater = Manager) INHERITS (Person) And the query to get the staff details is written as: RETRIEVE ( s.staffNo, s.lName, s.manager.staffNumber) FROM s IN Staff The mechanism of ADT in Postgres is limited in comparison with OODBMSs. In Postgres, the objects are composed based on ADTs whereas the objects themselves are treated as ADTs in OODBMSs. Hence the concept of encapsulation is not fully satisfied. Moreover, the inheritance mechanism is associated with tables and not with ADTs. 7.2 Object Identity: Postgres implicitly creates and maintains an attribute called oid for every relation that contains tuple‟s unique identifier. The user queries can only access oid but cannot update it. The oid can be used as a mechanism to simulate attribute types that reference tuples in other relations. A type that references a tuple in the Staff relation can be defined as DEFINE TYPE Staff(int4) IS RETRIEVE ( Staff.all) WHERE Staff.oid = $1 Same name can be used for both relations and type name because each of them have different name spaces. A relation that uses this reference type can be created as: CREATE PropertyForRent( propertyNo = char[15], street = char[25], city = char[15], postcode = char[8], type = char[1], rooms = int2, rent = float4, ownerNo = char[5], branchNo = char[4], staffNo = Staff) KEY (propertyNo) A property to the database can be added using the following query: APPEND PropertyForRent( propertyNo = “PA14”, street = “16 Holhead”, city = “Aberdeen”, postcode = “AB7 5SU”, type = “H”, rooms = 6 , rent = 650, ownerNo = “C046”, branchNo = “B007”, staffNo = Staff(s.oid)) FROM s IN Staff WHERE s.staffNo = “SA9”) 8. Conclusion: Though ORDBMS tries to extend the relational DBMS with object oriented concepts, it still does not support advanced transaction models compared to OODBMS. But the features of security, integrity and viewing mechanism that are supported by ORBDMS are not fully supported in OODBMS. There is no single extended relational data model but the degree to
which these extensions are made varies. Hence, the choice of a database system is made based on the features demanded by the particular application in which the database system is used.
References: [1] Vermeulen R. (1997). Upgrading Relational Databases Using Objects. Cambridge University Press. [2] Thomas Connolly and Carolyn Begg, Database Systems: A Practical Approach to Design, Implementation and Management, Fourth Edition, Addison-Wesley, 2005. [3] Developing Object-Relational Database Applications, Part 1 The article was posted in the IBM website under developer works. This is the first article in a two-part series on developing object-relational. The article was written by Paul Brown. URL: http://www.ibm.com/developerworks/db2/library/techarticle/0206brown/0206brown1.html [4] Object Persistence in Object Oriented Applications This paper was posted in Journals home page of IBM. The paper was written by V. Srinivasan and D. T. Chang. URL: http://researchweb.watson.ibm.com/journal/sj/361/srinivasan.html [5] “THE DESIGN OF POSTGRES” The paper was posted in the Database Group section of the
Computer Science Division at the University of California, Berkeley. URL: http://db.cs.berkeley.edu//papers/ERL-M85-95.pdf
Figures:
Figure 1: Four-quadrant view of the database world[1].