
Object Databases


April 1994 / State Of The Art / Object Databases

The best way to store the complex data used in object-oriented systems is with a DBMS that understands objects--something that relational databases don't do well

Richard Marlon Stein

Object-oriented database management systems, or ODBMSes, represent the latest addition to the modern software engineer's toolbox. Many new applications are being designed with object-oriented techniques and programming languages--primarily C++ and, to a lesser extent, Smalltalk. These applications get much of their power from manipulating objects that include multiple, complex data types and associated methods and functions.

But what happens to that data when the application is not running? Only the ODBMS knows for sure. The synergy between OOP (object-oriented programming) and ODBMS interfaces generates a powerful and expedient mechanism with which to express, manipulate, and store--in what can be called an objectbase--the complex objects that are routinely created today. (A sampling of object-based applications using ODBMSes is presented in "Objects in Use" on page 99.)

Why Objects?

One of the principal reasons developers are turning increasingly to the object approach is that older techniques--procedural languages and relational databases--simply can't handle complex data very well. Developers have long recognized the shortcomings of the relational data model and commercial RDBMS (relational database management system) products with multimedia applications; economic models; document management systems; cooperative groupware products; client/server systems; and CAD, engineering, and manufacturing systems. These applications require the definition and manipulation of complex, abstract, articulated entities that defy representation with the relational data model.

RDBMSes lose their efficacy as storage systems when objects must be explicitly and tediously transformed (often losing some of their attributes and certainly their methods) before an object-oriented application program can store or retrieve them. (See reference 1. Also, for more information on the problems of mapping object data onto a relational database, see "The Great Debate" on page 85.)

ODBMS technology has gained momentum with the industry's recent adoption of the ODMG-93 standard (see the text box "The Object Database Standard" on page 82). The existence of an ODBMS standard simplifies the process of making applications portable, much as the SQL standard has let software developers migrate many applications between platforms without having to rewrite them.

Integrity, Reliability, and Consistency

Integrity and reliability are important concerns for any database user. Commercially available ODBMS products satisfy these needs, though sometimes with a reduction in overall performance. But the advantages of the object approach mean that both mission-critical and noncritical applications of ODBMS technology frequently pay a small performance penalty in return for reduced application engineering and maintenance costs.

The integrity of ODBMS transactions is essential. A transaction here is characterized as an inviolable sequence of operations--that is, all the operations that constitute a transaction either execute completely or not at all. An ODBMS transaction implies that an object is committed to storage and confirmed as stored.
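
The all-or-nothing behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ObjectStore` and `Transaction` are hypothetical names, the "objectbase" is an in-memory dictionary, and the rollback works by restoring a snapshot, which a real ODBMS would replace with logging and durable storage.

```python
import copy


class ObjectStore:
    """Toy in-memory stand-in for an objectbase (hypothetical, not a real ODBMS API)."""

    def __init__(self):
        self.objects = {}  # OID -> dict of attributes

    def transaction(self):
        return Transaction(self)


class Transaction:
    """Applies a sequence of operations completely or not at all."""

    def __init__(self, store):
        self.store = store
        self.snapshot = None

    def __enter__(self):
        # Remember the state to restore if any operation fails.
        self.snapshot = copy.deepcopy(self.store.objects)
        return self.store.objects

    def __exit__(self, exc_type, exc, tb):
        if exc_type is not None:
            # A failure anywhere undoes every operation in the sequence.
            self.store.objects = self.snapshot
        return False  # let the exception, if any, propagate


store = ObjectStore()

# This transaction completes, so its write is kept.
with store.transaction() as objs:
    objs["oid-1"] = {"balance": 100}

# This transaction fails midway, so its partial write is discarded.
try:
    with store.transaction() as objs:
        objs["oid-1"]["balance"] = 0
        raise RuntimeError("simulated failure mid-transaction")
except RuntimeError:
    pass
```

After both blocks run, the store still holds the state left by the first transaction; the half-finished second one leaves no trace.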

Reliability refers to how the storage system retains objects in the event of computer malfunction. Under some high-stress conditions, an ODBMS may degrade unpredictably. In a multimedia system, the component object attributes (e.g., sound, video, graphics, and text) may differ substantially in their respective extents; one attribute stream may be much larger than the others. For a synchronized playback to occur, all media streams must be uniformly retrieved and recorded. Reliable ODBMS storage operations ensure that objects reach the storage system even if an error condition interrupts normal services.

ODBMSes store objects' attributes, not the methods that affect object state. Executable images for object methods are typically loaded by an ODBMS client, which retrieves ODBMS objects from the server through a tightly coupled network protocol, similar to an RPC (remote procedure call). The server provides lock management to prevent object inconsistencies from contaminating the objectbase. Access to an object's attributes is afforded by its methods. Each method possesses a signature that identifies the names and types of the arguments, as well as the names and types of any return values. Method signatures are specified for objects by the ODBMS's object definition language.
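
A rough Python sketch of this division of labor follows: only attributes are written to the store, while the client supplies the class (and therefore the methods) when it loads the object back. The `Account` class, the JSON record layout, and the helper names are assumptions made for illustration; an actual ODBMS would declare signatures in its object definition language rather than in Python annotations.

```python
import inspect
import json


class Account:
    """Client-side class: the methods live here, not in the store."""

    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

    def deposit(self, amount: int) -> int:
        self.balance += amount
        return self.balance


def store_attributes(obj):
    """Serialize the object's state only; no executable code is stored."""
    return json.dumps({"class": type(obj).__name__, "attrs": vars(obj)})


def load_object(record, classes):
    """Rebuild an object by attaching stored attributes to a client-side class."""
    data = json.loads(record)
    cls = classes[data["class"]]
    obj = cls.__new__(cls)  # skip __init__; the state comes from the store
    obj.__dict__.update(data["attrs"])
    return obj


record = store_attributes(Account("Ada", 10))
restored = load_object(record, {"Account": Account})

# The signature names the arguments, their types, and the return type.
signature = inspect.signature(Account.deposit)
```

The stored record contains `owner` and `balance` but no trace of `deposit`; the method becomes available again only because the client loaded the `Account` class.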

Persistence

ODBMSes implicitly support the notion that objects have a definable lifetime that can extend beyond an executing program (see reference 2). This persistence characteristic is important for applications that may interact with objects over varying spans of time.

For example, an economic forecasting model may require objects that reflect instantaneous stock-market conditions. The state of a stock-market object, as described by its attributes (e.g., the Dow Jones or Wilshire 5000 indexes), may persist for a few seconds at most before being superseded. But monthly economic indicators, such as the consumer price index and number of housing starts, have much greater longevity. Fundamentally, persistent objects outlive the procedures and processes that create them.

Object persistence is declared as part of the ODBMS schema. The ODMG-93 standard specifies three types of persistence attributes, one of which is assigned to an object when it is declared, and this persistence attribute is immutable during the object's lifetime.

The most ephemeral persistence type, or lifetime, is called coterminous with procedure. Object storage for this lifetime is obtained from the run-time call frame stack and is similar to an automatic variable. When the procedure returns, the object passes out of scope and is deleted.

The next type of lifetime is coterminous with process. The application assigns memory resources for a particular object instantiation and returns them to the heap when the process exits.

Last, objects that have coterminous with database lifetimes are stored on disk under ODBMS run-time control. These tenacious objects remain in the store until the database is deleted.
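
The three lifetimes can be summarized in a small Python model. It is a sketch, not ODMG-93 syntax: the enum names, the `Event` type, and the `survives` helper are all invented here, and the read-only property merely imitates the rule that the persistence attribute cannot change after declaration.

```python
from enum import IntEnum


class Lifetime(IntEnum):
    PROCEDURE = 1  # coterminous with procedure (call-stack storage)
    PROCESS = 2    # coterminous with process (heap storage)
    DATABASE = 3   # coterminous with database (disk storage)


class Event(IntEnum):
    PROCEDURE_RETURN = 1
    PROCESS_EXIT = 2
    DATABASE_DELETED = 3


def survives(lifetime, event):
    """An object outlives every event that ranks below its declared lifetime."""
    return int(lifetime) > int(event)


class DeclaredObject:
    """Object whose lifetime is fixed when it is declared."""

    def __init__(self, name, lifetime):
        self.name = name
        self._lifetime = lifetime

    @property
    def lifetime(self):
        # No setter is defined, so the lifetime cannot be reassigned.
        return self._lifetime


index = DeclaredObject("monthly-cpi", Lifetime.DATABASE)
```

Here `survives(Lifetime.PROCESS, Event.PROCEDURE_RETURN)` is true, while `survives(Lifetime.PROCESS, Event.PROCESS_EXIT)` is false, matching the ordering in the text.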

An ODBMS can implement object persistence with a file system, although accessing disk media is far slower than reading memory. Also, a persistent object may contain references to other objects through a pointer. This is typical for a complex model that represents a highly articulated system or a collection of behaviors inherited from multiple objects. For optimal performance in accessing attributes, therefore, an object and any references it possesses must be resident in memory (see the figure "Managing Persistent Objects").

An efficient ODBMS will fetch the entire object, including dependent references, from secondary storage and place it in a cache; RDBMSes perform an analogous operation by loading whole tables in response to certain queries. Persistent-object cache management is partially controlled by the particular object's lifetime as declared by the ODBMS schema definition. The host operating system lies underneath any cache management strategy.
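
As an illustration of fetching an object together with its dependent references, the following Python sketch walks the reference graph and copies everything reachable into a cache. The record layout (`attrs` plus a list of referenced OIDs) and the dictionaries standing in for disk and cache are assumptions; real products decide how much to prefetch by their own policies.

```python
def fetch_closure(root_oid, disk, cache):
    """Load the root object and every object reachable from it into the cache."""
    pending = [root_oid]
    while pending:
        oid = pending.pop()
        if oid in cache:
            continue  # already resident; this also stops cycles
        record = disk[oid]  # simulated read from secondary storage
        cache[oid] = record
        pending.extend(record["refs"])
    return cache[root_oid]


# Hypothetical objectbase contents: OID -> attributes plus referenced OIDs.
disk = {
    "car": {"attrs": {"model": "sedan"}, "refs": ["engine", "wheel"]},
    "engine": {"attrs": {"cylinders": 4}, "refs": ["car"]},  # back-reference
    "wheel": {"attrs": {"diameter_in": 15}, "refs": []},
    "unrelated": {"attrs": {}, "refs": []},
}
cache = {}
root = fetch_closure("car", disk, cache)
```

Only `car`, `engine`, and `wheel` end up in the cache; `unrelated` stays on disk because nothing fetched refers to it.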

Keeping Track of Objects

Each time an object is created, a unique OID (object identifier) is added to the ODBMS identifier table. The OID is independent of the object's state. Coherent operation of the ODBMS hinges on the maintenance of this table, which retains and tracks OIDs as objects evolve. Currently, the ODMG-93 standard specifies that the OID table is a single flat structure that precludes, for the time being, any extensions into distributed tables. Future generations of the ODMG-93 standard will eventually support distributed ODBMS structures.

When an application references an object via its OID, the ODBMS must convert this into a virtual memory address before any object attributes can be modified. This conversion of OID to a memory address or address to an OID is called a swizzle (see reference 3).

Swizzle operations are important for efficient storage and application access. An ODBMS swizzles object pointers to speed access to data in memory. Because an object hierarchy includes pointers to other objects, the use of swizzles speeds retrieval and facilitates the update of object attributes.
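
A minimal Python sketch of swizzling is shown below, with in-memory object references standing in for virtual memory addresses. The `PersistentObject` class and the `resident` table are hypothetical; the point is only the two-way conversion between a stored OID and a direct reference.

```python
class PersistentObject:
    def __init__(self, oid, attrs, refs):
        self.oid = oid
        self.attrs = attrs
        # name -> OID string (unswizzled) or PersistentObject (swizzled)
        self.refs = refs


def swizzle(obj, resident):
    """Replace stored OIDs with direct references to memory-resident objects."""
    for name, target in obj.refs.items():
        if not isinstance(target, PersistentObject):
            obj.refs[name] = resident[target]


def unswizzle(obj):
    """Convert direct references back to OIDs before writing to storage."""
    for name, target in obj.refs.items():
        if isinstance(target, PersistentObject):
            obj.refs[name] = target.oid


b = PersistentObject("oid-b", {"value": 2}, {})
a = PersistentObject("oid-a", {"value": 1}, {"next": "oid-b"})
resident = {"oid-a": a, "oid-b": b}  # objects currently in memory, keyed by OID

swizzle(a, resident)
value_via_pointer = a.refs["next"].attrs["value"]  # no table lookup needed now
unswizzle(a)
```

Once swizzled, following `a.refs["next"]` is a plain pointer dereference; the table lookup is paid once instead of on every access.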

Object Locks and Concurrent Use

As with any database system, sharing objects by multiple users raises the important issue of maintaining consistency among the user copies. If two or more users simultaneously access an object and add to or alter any of the object's attributes, inconsistencies will arise unless the transactions are serialized, ensuring that a consistent, predictable order is applied to the modification of stored objects. Concurrency control mechanisms prevent interleaved transactions on the same object. Exclusive write-lock mechanisms are applied to object storage and usually suffice to prevent changes by more than one person at a time.
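
The effect of an exclusive write lock can be sketched with Python threads. This is an assumption-laden toy: `LockedObject` is a hypothetical class, one lock guards the whole object, and threads stand in for concurrent users.

```python
import threading


class LockedObject:
    """Object whose attribute updates are serialized by an exclusive write lock."""

    def __init__(self, **attrs):
        self.attrs = dict(attrs)
        self._write_lock = threading.Lock()

    def update(self, name, fn):
        # Only one writer at a time may read-modify-write an attribute.
        with self._write_lock:
            self.attrs[name] = fn(self.attrs[name])


def run_concurrent_updates(obj, name, n_threads, n_updates):
    """Have several 'users' increment the same attribute at once."""

    def worker():
        for _ in range(n_updates):
            obj.update(name, lambda value: value + 1)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return obj.attrs[name]


total = run_concurrent_updates(LockedObject(count=0), "count", 4, 500)
```

Because every read-modify-write happens under the lock, no increment is lost and `total` equals the number of updates requested.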

Unfortunately, locking mechanisms can also impede the normal use of a database for which long-duration transactions dominate. In many cases, a transaction consists of a short sequence of operations that are completed in a few seconds or less. This is typical of automatic teller machines or credit-card authorizations.

But not all applications are so short and sweet. For example, CAD applications in IC manufacturing may involve several teams of engineers working simultaneously on different parts of a chip design. Some of the chip's cells will undoubtedly cross workgroup boundaries, and it may take weeks for the different teams to figure out how to route wires and position the cells to minimize signal propagation delays, power dissipation, and electrical impedance.

During this protracted development period, other groups of engineers may need to update their design data for a write-locked cell held by a particular user. In this case, the ODBMS must provide the capability to queue up a lock request--and perhaps notify the current lock holder to allow the current changes--or provide the option to abort the lock-request operation. The ODMG-93 standard is silent on the object-locking issue; each ODBMS vendor furnishes its own set of locking options and capabilities.
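
Since the standard leaves locking to vendors, the following Python sketch is purely hypothetical. It shows one way a long-duration lock could offer the options mentioned above: queue the request and notify the holder, or abort immediately. The `LongLock` class and its return strings are invented for illustration.

```python
from collections import deque


class LongLock:
    """Write lock for long transactions with queue-or-abort semantics."""

    def __init__(self):
        self.holder = None
        self.waiting = deque()  # users queued for the lock, in arrival order
        self.notices = []       # (holder, requester) notifications sent

    def request(self, user, wait=True):
        if self.holder is None:
            self.holder = user
            return "granted"
        if not wait:
            return "aborted"  # caller chose not to queue
        self.waiting.append(user)
        # Tell the current holder that someone else needs the object.
        self.notices.append((self.holder, user))
        return "queued"

    def release(self, user):
        if user != self.holder:
            raise ValueError("only the current holder may release the lock")
        # Hand the lock to the next queued requester, if any.
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder


cell_lock = LongLock()
first = cell_lock.request("team-a")
second = cell_lock.request("team-b")               # queues, notifies team-a
third = cell_lock.request("team-c", wait=False)    # gives up instead
next_holder = cell_lock.release("team-a")
```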

Distributed Objects--Data in ORBit

Another goal of most ODBMS technology is the notion of a multidatabase (see the figure "Inside a Multidatabase"), which can transparently integrate physically distributed ODBMSes into a single logical structure. To achieve this, the ODBMS must maintain the OID table as a distributed entity (see reference 4). Message passing is used to convey OIDs between peer tables to ensure consistency and to exchange objects between processes for distributed processing. The system addresses the multidatabase as a logical, global entity; the user has no knowledge of the underlying object distribution.
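
The idea of one logical structure over several physical stores can be sketched as follows. Note the simplification: this Python toy keeps a single OID table in one place, whereas the text calls for the table itself to be distributed and kept consistent by message passing. The `Multidatabase` class, site names, and record contents are all assumptions.

```python
class Multidatabase:
    """Presents several physically separate stores as one logical objectbase."""

    def __init__(self, sites):
        self.sites = sites  # site name -> {OID -> object}
        # OID table: records which site currently holds each object.
        self.oid_table = {
            oid: name for name, objects in sites.items() for oid in objects
        }

    def lookup(self, oid):
        """Callers pass only an OID; the location stays hidden from them."""
        return self.sites[self.oid_table[oid]][oid]

    def relocate(self, oid, destination):
        """Move an object between sites and update the OID table to match."""
        source = self.oid_table[oid]
        self.sites[destination][oid] = self.sites[source].pop(oid)
        self.oid_table[oid] = destination


mdb = Multidatabase({
    "site-east": {"oid-1": {"kind": "reservation"}},
    "site-west": {"oid-2": {"kind": "invoice"}},
})
before = mdb.lookup("oid-1")
mdb.relocate("oid-1", "site-west")
after = mdb.lookup("oid-1")
```

The caller's `lookup` call is identical before and after the move, which is the transparency the multidatabase is meant to provide.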

What makes the multidatabase concept work is the ORB (object request broker), which mediates client access to distributed objects. The ORB must perform many of the functions and capabilities found in current operating systems, in addition to network administration and communications, object format conversions between different processors, heterogeneous access control, native memory access, security via encryption and decryption, and memory allocation (see the figure "Object Request Brokers").

An ORB executes a client request by effectively trapping all references to an OID issued by a method or ODBMS primitive (i.e., function). If the object is local, the method will operate on a cache-local object; a remote object will force the process to suspend until the object is relocated to the client's address space. Since network messages are generally slower than local memory accesses, large objects will require more time to relocate than smaller ones. One way to minimize a process's latency is to organize multiple client threads. If the object possesses many attributes, you could dedicate separate threads to fetch each of the object attributes simultaneously. By overlapping computation with communication, the process idle time is reduced and a better load balance is achieved.
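
The multithreaded fetch suggested above might look like this in Python, using the standard `concurrent.futures` thread pool. The remote object is simulated by a dictionary and the network delay by `time.sleep`; the function names and the 10 ms latency are assumptions chosen only to make the overlap visible.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_attribute(remote, name, latency=0.01):
    """Simulate one network round trip that returns a single attribute."""
    time.sleep(latency)
    return name, remote[name]


def fetch_object(remote, max_workers=4):
    """Fetch all attributes of a remote object on separate threads."""
    names = list(remote)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # The round trips overlap instead of running one after another.
        pairs = list(pool.map(lambda n: fetch_attribute(remote, n), names))
    return dict(pairs)


# Hypothetical multimedia object with one attribute per media stream.
remote_object = {"sound": b"...", "video": b"...", "graphics": b"...", "text": "caption"}
local_copy = fetch_object(remote_object)
```

With four workers, the four simulated round trips run concurrently, so the total wait is closer to one latency period than to four.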

Client/servers operating over WANs, such as airline and hotel reservation systems, may be among the first types of applications to avail themselves of the multidatabase/ORB mechanism. Here, relatively fine-grained objects will be communicated between ORBs located on separate processors interconnected by a network. The amount of information needed to represent a single hotel or airline reservation is quite modest. A customer who phones to inquire about a reservation will then wait for the reservation object to move from some host computer system to the clerk's workstation.

Once at the workstation, the clerk may alter the reservation object by changing, say, the time-of-departure attribute for the customer's outbound flight. When completed, a local transaction will confirm the reservation object. Another workstation or mainframe might request all reservation objects periodically--say at the end of each business day--to compute daily income or other corporate measures.

The ORB model mimics the dynamic load-balancing techniques already familiar to practitioners of massively parallel computing. The principal difference is that an ORB facilitates the relocation of coarse-grained objects--instances that consume several megabytes of storage--whereas massively parallel systems are better suited to finer-grained messages--10 to 100 KB. ORBs are clearly better suited to highly asynchronous application domains, where object relocation is either infrequent--less than 10 object relocations each second--or the object attributes are divisible into smaller chunks and can be retrieved by multithreaded client ORB recipients. The message traffic associated with object relocation will subside after the object arrives in the requesting ORB's address space.

NASA plans to incorporate ORBs into its Earth Observing System Data and Information System, or EOSDIS. This data archive and distribution system will be used by space and environmental scientists to access and analyze the 300 GB per day expected from 18 satellites examining global warming, greenhouse gases, ozone depletion, and natural resource exploitation (see reference 5). Several DAACs (distributed active archive centers) will be established to serve scientists and provide data to public and private interests. The gigantic volumes of data, expected to reach petabytes (10^15 bytes) by mission completion, will most easily be accessed from DAACs via computer networks aided by ORB agents.
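
To put the quoted data rate in perspective, a short calculation shows how long 300 GB per day takes to accumulate one petabyte. It assumes decimal units (1 GB = 10^9 bytes, 1 PB = 10^15 bytes) and a constant rate; the function name is made up for this example.

```python
GB = 10**9   # bytes, decimal gigabyte (assumed)
PB = 10**15  # bytes, decimal petabyte (assumed)


def days_to_accumulate(target_bytes, bytes_per_day):
    """Days needed to archive target_bytes at a constant daily rate."""
    return target_bytes / bytes_per_day


days = days_to_accumulate(PB, 300 * GB)
years = days / 365.25
```

At that rate the archive gains a petabyte roughly every 3,333 days, or a little over nine years.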

The trickiest aspect of CORBA (Common Object Request Broker Architecture) lies in assuring the delivery of all messages that describe objects. Nondeterminism arises from the asynchronous nature of communicating sequential processes; two communicating ORBs exemplify this configuration. But interspersed between the processes' network I/O connections are additional processes dedicated to encryption, access control, accounting, and so on. Each process has the potential to interject an exception condition between the two ORBs, and this exception may corrupt the data transmission or force one of the peer ORBs to terminate, thereby creating a state of unrecoverable deadlock.

Tomorrow's Objects

ODBMSes possess a rich set of operations and primitives that couple seamlessly into the semantics of OOP languages. ODBMSes simplify the design process through their enhanced modeling capability. Developers who incorporate an ODBMS into their products are likely to realize substantial savings from reduced software maintenance and engineering costs.

ODBMSes and their ORBs can produce scalable, reusable architectures that map easily onto client/server topologies, where enterprise-wide multiprocessor object servers distribute and manage objects over a LAN. However, the ORB mechanism may impede ODBMSes from efficient exploitation by message-passing parallel computer systems such as the Cray T3D, KSR1, or nCUBE-2. Background message traffic is required to maintain a coherent OID table for distributed environments.

Nonetheless, these computation systems will inherit a leadership role in information commerce. ODBMS vendors should recognize this important opportunity and begin to engineer CORBAs that minimize message traffic.

ACKNOWLEDGMENTS

My thanks to Drew Wade of Objectivity, Inc. (Menlo Park, CA), and Won Kim of UniSQL, Inc. (Austin, TX), for their assistance during the preparation of this article.


Commercial ODBMS Packages



DEC Object/DB
Digital Equipment Corp.
55 Northeastern Blvd.
Nashua, NH 03062
(603) 884-0828
fax: (603) 884-0828


GemStone
Servio Corp.
2085 Hamilton Ave., Suite 200
San Jose, CA 95125
(408) 879-6200
fax: (408) 369-0422


IDB Object Database
Persistent Data Systems
P.O. Box 28415
Pittsburgh, PA 15238
(412) 963-1843
fax: (412) 963-1846


Itasca ODBMS
Itasca Systems, Inc.
7850 Metro Pkwy.
Minneapolis, MN 55425
(612) 851-3155


M.A.T.I.S.S.E.
ADB, Inc.
238 Broadway
Cambridge, MA 02139
(617) 354-4220
fax: (617) 354-5420


O2
O2 Technology, Inc.
7, rue du Parc de Clagny
78035 Versailles Cedex,
France
+33 1 30 84 77 98
fax: +33 1 30 84 77 90


Scientific Services, Inc.
10001 Derekwood Lane, Suite 204
Lanham, MD 20706
(301) 577-3606
fax: (301) 577-0831


Objectivity/DB
Objectivity, Inc.
800 El Camino Real
Menlo Park, CA 94025
(800) 767-6259
(415) 688-8000


Object Skipjack
MAK Software Consultants
1 East Chase St., Suite 1113
Baltimore, MD 21202
(800) 625-7547
(410) 783-2913
fax: (410) 783-2912


ObjectStore 
Object Design, Inc.
1 New England Executive Park
Burlington, MA 01803
(617) 270-9797
fax: (617) 229-2451


ODBMS 2.0
VC Software, Inc.
3 Christina Centre
201 North Walnut St., Suite 1000
Wilmington, DE 19801


VC Software Construction GmbH
Petritorwall 28
38118 Braunschweig, Germany
+49 531 24 24 0-0
fax: +49 531 24 0-24


Ontos DB
Ontos, Inc.
3 Burlington Woods
Burlington, MA 01803
(617) 272-7110
fax: (617) 272-8101


OpenODB
Hewlett-Packard Co.
P.O. Box 58059
Santa Clara, CA 95052
(800) 637-7740


Poet
Poet Software Co.
4633 Old Ironsides Dr., Suite 110
Santa Clara, CA 95054
(800) 950-8845
(408) 970-4640
fax: (408) 970-4630


Raima Object Manager
Raima Corp.
1605 Northwest Sammamish Rd.
Issaquah, WA 98027
(800) 327-2462
(206) 557-0200
fax: (206) 557-5200


Tensegrity OO Database for Smalltalk
Tensegrity
1091 Industrial Rd., Suite 220
San Carlos, CA 94070
(415) 592-6301
fax: (415) 592-6302


UniSQL/X Database Management System
UniSQL, Inc.
9390 Research Blvd., II-20
Austin, TX 78759
(800) 451-3267
(512) 343-7297
fax: (512) 343-7383


Versant Object Database Management System
Versant Object Technology Corp.
1380 Willow Rd., Suite 201
Menlo Park, CA 94025
(800) 837-7268
(415) 329-7500
fax: (415) 329-2380


Illustration: Managing Persistent Objects The ODBMS manages object persistence. Encapsulated object references are dynamically examined to determine if dependent references must be fetched from secondary storage into memory. The dark nodes imply recently modified object attributes; the clear nodes imply no changes to other objects.
Illustration: Inside a Multidatabase A multidatabase organizes many distributed ODBMSes into a single logical structure. The OID (object identifier) table is distributed along with the objects. Message passing is used to relocate objects under control of an ORB (object request broker).
Illustration: Object Request Brokers In an ORB architecture, ORBs perform a variety of functions in mediating access to distributed objects by passing messages back and forth. In addition to some operating-system functions, ORBs can cover network communications and administration, object format conversions between different processors, heterogeneous access control, native memory access, security via methods for encryption/decryption, and memory allocation.
Richard Marlon Stein is a freelance writer and multicomputer systems technologist with the Parallel Software Group of Santa Clara, California. You can reach him on the Internet at rms@well.com or BIX c/o "editors."
