OSF's Distributed Computing Environment standard makes distributed computing usable and invisible
Michael D. Millikin
Distributed computing deals with higher-level issues than physical media and interconnection. The distributed environments must give users and applications transparent access to data, computation, and other resources across collections of multivendor, heterogeneous systems. The strategic architectures of every major system vendor are now based on some form of distributed computing environment.
The key to realizing the theoretical benefit of such an architecture is transparency. Users can't spend their time thrashing about trying to figure out where something is. Nor should developers have to code into their applications the locations of resources over the network; it is in no one's interest to force applications developers to become communications gurus. Nor should business users have to worry about mounting remote volumes. And from the MIS viewpoint, the network should be manageable. The final picture is one of a "virtual" network: a collection of workgroup, departmental, enterprise, and inter-enterprise LANs that appears to the end user or client application to be a seamless and easily accessed whole.
One technology that will figure prominently in the future of distributed computing is the Open Software Foundation's DCE (Distributed Computing Environment). OSF's DCE is an integrated set of operating-system- and network-independent services that support the development, use, and maintenance of distributed applications. Because of its ability to enable a manageable, transparent, and interoperable network of multivendor, multiplatform systems, DCE could prove to be one of the most important technologies of the decade.
DCE is a technology most users will likely end up licensing from their system vendors. Most of the major players have committed to delivering DCE in future versions of system and network software. For example, IBM, which provides a number of AIX-based DCE products, has broadened its DCE offerings to include PC LANs. In September 1993, IBM shipped a DCE SDK (Software Development Kit) for OS/2 and Windows and, at the same time, made available its first DCE product aimed at PC end users, the DCE Client for OS/2. In February, after months of testing, a Windows cousin appeared. By the end of this year, IBM plans to begin beta testing a new OS/2 LAN Server that will include "snap-on" modules for specific DCE services.
Some large end users, however, went directly to OSF to license the early versions of the software. Those are the sophisticated users--such as Citibank, the Argonne National Laboratory, and the Jet Propulsion Laboratory--that began to develop distributed applications in-house.
For example, Citibank developed a prototype application in which a Sun workstation makes calls to an IBM RS/6000 server, which in turn executes APPC (Advanced Program-to-Program Communications) LU6.2 calls that execute on an MVS host. In the past, Citibank developers would have spent months building some of the network application infrastructure (e.g., security, support for APPC, and transactional extensions) for such an application.
In this prototype, however, a client Motif application running on a Sun workstation makes transactional RPCs (remote procedure calls) to a server running on an RS/6000, using a third-party vendor's (Transarc's) transactional extensions to DCE. The RPC server there executes APPC LU6.2 calls (using the Transarc APPC run-time library), which execute on an MVS host. Everything participates in the same two-phase commit. Interoperability in this union of MVS and Open Systems is transparent. Citibank developers get to concentrate on the application rather than the application enablers. That is the promise of DCE in action.
DCE is constructed on a layered architecture, from the most basic providers of services (e.g., operating systems) up to higher-end clients of services (e.g., applications). Security and management are essential to all layers in the model. Currently, DCE consists of seven tools and services that are divided into fundamental distributed services and data-sharing services.
The fundamental distributed services include threads, RPCs, directory service, time service, and security service. The data-sharing services build on top of the fundamental services and include DFS (Distributed File System) and diskless support. The OSF has reserved space for possible future services, such as spooling, transaction services, and distributed object-oriented environments.
Threads
Traditionally, applications deal with processes, each of which has a single thread of control. In this model, multiple tasks within applications are divided among multiple communicating processes.
Essentially, threads extend the process model to multiple threads of control that share a single address space and set of resources. A multithreaded program decomposes a single program into multiple threads of execution. Threads are an important emerging model for expressing parallelism within a process, especially within a distributed environment.
For example, this threading capability becomes particularly important within the context of an RPC. The RPC is synchronous by nature: A client makes a call for a remote function and then waits around until the call is fulfilled. With threads, however, one thread can make the request, but another can begin to process the data from a different request. Threading can therefore greatly improve the performance of a distributed application.
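The effect is easy to see in miniature. The following is only a sketch in Python, not DCE code: `remote_call` is a hypothetical stand-in for a synchronous RPC (it simply sleeps to simulate network latency), and the thread pool plays the role of the threads package. Each thread still makes an ordinary blocking call; the overlap comes from running several of them at once.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def remote_call(name, delay=0.05):
    """Hypothetical stand-in for a synchronous RPC: blocks until the 'server' replies."""
    time.sleep(delay)  # simulated network and server latency
    return f"result of {name}"


def call_sequentially(names):
    """One thread of control: each call waits for the previous one to finish."""
    return [remote_call(n) for n in names]


def call_concurrently(names):
    """One thread per outstanding call; each thread still sees a plain blocking call."""
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        return list(pool.map(remote_call, names))  # map() preserves request order


if __name__ == "__main__":
    requests = ["a", "b", "c", "d"]
    for runner in (call_sequentially, call_concurrently):
        start = time.perf_counter()
        runner(requests)
        print(runner.__name__, round(time.perf_counter() - start, 2), "seconds")
```

Both functions return the same results in the same order; only the elapsed time differs, because the waits overlap in the threaded version.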
The threads model puts less demand on the skill of a programmer than other parallelism alternatives, such as explicit asynchronous operations or shared memory. Asynchronous interfaces, although they've existed in some environments for some time, can be complex to implement. In the commercial world, the less retraining a new technology causes, the better: retraining programmers can be a major cost. Threads preserve a traditional, synchronous view of the world.
Because of threading's obvious performance benefits, most modern operating systems are multithreaded. Much of the installed base is not, however. To provide threads to support distributed applications on those systems that do not support threading natively, OSF DCE offers a threads package. This is essentially a library of threads routines.
Compared to kernel-based threads, library-based threads have some functional restrictions, but they are really the best choice for broad heterogeneous interoperability at this time. The presence of threads libraries provides a common denominator of functionality across different operating-system platforms that might or might not have native support. The DCE RPC; the security, directory, and time services; and DFS all use the threads service.
The RPC
A well-known mechanism for implementing distributed processing, the RPC extends a familiar programming model (the procedure call) across the network. The RPC can handle the nuts and bolts of distribution (e.g., the semantics of the call, binding to the server, or communication failures). In theory, the programmer does not have to become a communications expert to write a distributed network application. Programmers will use an interface specification language to specify the operations. Compiling this then produces code for both client and server. To enable this type of function, the RPC must be simple, transparent, and reliable and must perform well.
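To make the division of labor concrete, here is a minimal sketch in Python of what generated client and server code does. It is not DCE's interface definition language or its wire protocol: the `INTERFACE` dictionary, the JSON encoding, and every function name below are assumptions chosen for illustration. The point is only that the caller invokes what looks like a local procedure while the stub handles marshalling and dispatch.

```python
import json

# Hypothetical interface description: operation name -> parameter names.
# In DCE this role is played by an interface definition file.
INTERFACE = {"add": ["a", "b"], "greet": ["name"]}


class Server:
    """Server side: the real implementations of the operations."""

    def add(self, a, b):
        return a + b

    def greet(self, name):
        return f"hello, {name}"


def server_dispatch(impl, request_bytes):
    """Server stub: unmarshal the request, call the implementation, marshal the reply."""
    request = json.loads(request_bytes)
    op = request["op"]
    if op not in INTERFACE:
        return json.dumps({"error": f"unknown operation {op}"}).encode()
    result = getattr(impl, op)(*request["args"])
    return json.dumps({"result": result}).encode()


def make_client_stub(transport):
    """Client stub: one local-looking function per operation in the interface."""

    class Stub:
        pass

    stub = Stub()
    for op, params in INTERFACE.items():

        def call(*args, _op=op, _params=params):
            if len(args) != len(_params):
                raise TypeError(f"{_op} expects {len(_params)} argument(s)")
            request = json.dumps({"op": _op, "args": list(args)}).encode()
            reply = json.loads(transport(request))  # blocks until the reply arrives
            if "error" in reply:
                raise RuntimeError(reply["error"])
            return reply["result"]

        setattr(stub, op, call)
    return stub


# The "network" here is a direct function call; a real transport would go in its place.
server = Server()
client = make_client_stub(lambda request: server_dispatch(server, request))
```

A call such as `client.add(2, 3)` reads like a local procedure call, yet the arguments travel as bytes through `transport` and the result comes back the same way.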
The DCE RPC offers simplicity. It adheres to the local procedure model as closely as possible while providing the distributed aspects of applications in a straightforward manner. It imposes less of a conceptual change on developers, thereby reducing retraining time. This is especially important for in-house corporate development teams.
Consistent protocol is another DCE RPC hallmark. The RPC protocol is clearly specified and is not subject to user (i.e., developer) modification. This guaranteed core is an important consideration in a heterogeneous environment requiring interoperability. It's a specific design philosophy that the OSF has chosen; proponents of other RPC tools think that flexibility and the ability for developers to customize by adding their own functional extensions are more important.
Regardless of the transport protocol it runs on, the DCE RPC provides identical behavior and keeps the management of connections invisible. The RPC interface supports a variety of transports simultaneously, and it allows the introduction of new transports and protocols without affecting the coding of the application.
The DCE RPC fits in well with the other needs of DCE: naming, DFS, security, and time service. The DCE RPC integrates with the authentication system to enable secure communications. It integrates with client and server threads, preserving the synchronous interface while allowing both client and server to exploit concurrency. Its ability to send and receive indeterminate-length streams of typed data supports DFS.
The DCE RPC is integrated with the DCE Security Service to guarantee authenticity, integrity, and privacy of communications. And it supports double-byte character sets, such as those used in Japanese and other Asian languages.
Distributed Directory Service
Finding things (e.g., users, resources, data, or applications) in a distributed network is the task of the directory service. Name or directory services must map large numbers of system objects (e.g., users, organizations, groups, computers, printers, files, processes, and services) to user-oriented names. The problem is difficult enough in a homogeneous LAN environment, given personnel and equipment moves and changes to names, locations, and so forth. In a heterogeneous global WAN (wide-area network) environment, the directory task becomes considerably more complex, given the need to synchronize different directory databases. Furthermore, as distributed applications appear on the network, the directories have to begin tracking all those objects and their components as well.
A good name service makes use of a distributed computing environment transparent to the user. Users should not have to know the location of a remote printer, file, or application, for example, nor should they have to key in the X.400 mail address for a distant colleague.
OSF specified a two-tier architecture for the name service to address both intracell and worldwide communications. The cell is a fundamental organizational unit for systems in OSF's DCE. Cells can map to social, political, or organizational boundaries, and consist of computers that must communicate frequently with one another (workgroups, departments, or divisions of companies, for example). Generally, computers in a cell are geographically close. Cells range in size from two to thousands of computers, although OSF cites tens to hundreds as being the most common range.
Some vendors and users have pushed for the implementation of X.500 as a common directory service at all levels. But the OSF believed that using X.500 at the workgroup (i.e., cell) level would have been cumbersome because of the software and performance requirements--especially when more nimble cell-level directory services already existed in the market.
There are four elements in the DCE directory service:
CDS (Cell Directory Service). A network cell is a group of systems administered as a single entity. The CDS is optimized for local access. The bulk of directory service queries asks about resources within the query originator's cell. Each network cell needs at least one CDS.
GDA (Global Directory Agent). The GDA is a naming gateway that connects the DCE domain to other administrative domains through the X.500 worldwide directory service and DNS (Domain Name Service). The GDA takes queries for names that it cannot find in the local cell and passes them to another cell service or to the Global Directory Service (depending on the location of the name). To look up a name, a client queries the local GDA. The GDA then passes an interdomain name query to the X.500 service. This service returns the response to the GDA, which in turn responds to the client. The OSF GDA can be compatible with any global naming scheme.
GDS (Global Directory Service). Based on the X.500 standard, the GDS functions as a higher level of the directory hierarchy in order to connect multiple cells in multiple organizations.
XDS (X/Open Directory Service). Support for the X/Open API for directory service calls allows developers to write applications independent of the underlying directory service architecture. An XDS-compliant application will work unmodified with both DCE and X.500 directory services.
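The two-tier lookup described above can be sketched in a few lines of Python. This is a toy model, not the DCE API: the class names, the `resolve` function, and the cell and host names in the usage note are all hypothetical, and the global tier is reduced to a table that maps cell names to cell directories.

```python
class CellDirectory:
    """Stand-in for a CDS: resolves names that belong to its own cell."""

    def __init__(self, cell_name, entries):
        self.cell_name = cell_name
        self.entries = dict(entries)

    def lookup(self, name):
        return self.entries.get(name)


class GlobalDirectory:
    """Stand-in for the global tier: maps cell names to their cell directories."""

    def __init__(self):
        self.cells = {}

    def register(self, cell):
        self.cells[cell.cell_name] = cell

    def find_cell(self, cell_name):
        return self.cells.get(cell_name)


def resolve(local_cell, global_directory, cell_name, name):
    """Answer from the local cell when possible; otherwise go through the global tier."""
    if cell_name == local_cell.cell_name:
        return local_cell.lookup(name)  # the common case stays inside the cell
    # The role the article assigns to the GDA: forward names the cell cannot resolve.
    remote_cell = global_directory.find_cell(cell_name)
    return remote_cell.lookup(name) if remote_cell else None
```

With a `finance` cell and a `research` cell registered in the global directory, a client in `finance` resolves its own printer without leaving the cell, and reaches a plotter in `research` only by way of the global tier.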
Distributed Security Service
There are two broad categories of security services: authentication and authorization. Authentication verifies the identity of an entity (i.e., a user or a service). Authorization (or access control) grants privileges to the entity, such as access to a file.
Authorization alone is only a partial solution, however. Authentication services must exist within a distributed network environment where a workstation cannot be trusted to identify itself or its users correctly to shared network services. An authentication service is a mechanism for providing trusted third-party verification of user identities. An authentication service, which basically requires the user to prove his or her identity for each required service, must be secure, reliable, transparent, and scalable.
OSF security is based on the Kerberos authentication system (developed at MIT's Project Athena), augmented by security components (see "Distributed and Secure" on page 165). Kerberos uses private-key encryption to provide three levels of protection. The lowest level requires only that user authenticity be established at the initiation of a connection, assuming that subsequent network messages flow from the authenticated principal. The next level up requires the authentication of each network message. Beyond these safe messages, the highest level provides private messages, in which each message is encrypted as well as authenticated.
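The middle level, in which every message is authenticated, can be illustrated with a keyed checksum. The sketch below is in Python and uses HMAC-SHA256 from the standard library as a modern stand-in; it is an assumption made for illustration and not the algorithm Kerberos or DCE uses. The functions `protect` and `verify` and the notion of a pre-shared `session_key` are likewise hypothetical. The private-message level would additionally encrypt the payload, which is not shown.

```python
import hashlib
import hmac

TAG_LENGTH = 32  # size in bytes of an HMAC-SHA256 digest


def protect(session_key: bytes, message: bytes) -> bytes:
    """Prefix the message with a tag that only holders of the session key can compute."""
    tag = hmac.new(session_key, message, hashlib.sha256).digest()
    return tag + message


def verify(session_key: bytes, packet: bytes) -> bytes:
    """Recompute the tag and reject the packet if it does not match."""
    tag, message = packet[:TAG_LENGTH], packet[TAG_LENGTH:]
    expected = hmac.new(session_key, message, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        raise ValueError("message failed authentication")
    return message
```

A receiver that shares the key accepts an untouched packet and raises `ValueError` if even one byte of the message or tag has been altered in transit.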
End users should be minimally affected by the network-based service. In other words, you shouldn't have to memorize dozens of passwords or codes. A great deal of the security benefits stem from this network service's managing a user's access--in other words, authorization.
OSF added a registry service and an authorization service as well. OSF is also including authorization checks based on Posix-conformant ACLs (access control lists) and an authentication interface to the RPC.
There is growing de facto support in the industry for public-key encryption systems such as the one provided by the RSA (Rivest-Shamir-Adleman) method. (Microsoft and Apple, for example, are working with RSA technology.) OSF also intends that applications for DCE be portable from Kerberos to public-key authentication schemes such as that provided by RSA.
Distributed File System
The OSF DFS is intended to provide transparent access to any file sitting on any node on the network (security permitting, of course). A major concern in such a distributed file system is making it simple for users. Vendors must address a number of other issues in delivering such a file system.
For example, a distributed file system should have a uniform name space. Files should have the same name, regardless of platform and location. Other features to consider are integrated security, data consistency and availability, reliability and recovery, performance and scalability to very large configurations without performance degradation, and coherent, location-independent management and administration.
The OSF DFS, which is based on AFS (Andrew File System) from Transarc (Pittsburgh, PA), uses four principal components to address these needs: the DCE Logical File System, a protocol exporter (file server), a cache manager (client), and a token manager.
The implementation of DFS provides an excellent example of how the various components of DCE work together. DFS software resides on each node of the network. DFS integrates the node file systems with the DCE directory services, ensuring a uniform naming convention for all files stored in DFS. It uses the DCE security system, with ACLs to control access to individual files. The RPC streaming function allows DFS to move large amounts of data through a WAN in one operation rather than dribbling it across in smaller packets; this capability is very important because of the latencies inherent in a WAN.
To maximize file-access performance, DFS caches frequently accessed files on a workstation's local drive. When a user accesses data on the file server, a copy of the data is cached locally. When the user is finished working with the data, the file is written back to the server. The result is rapid user access to distributed files.
To prevent problems from arising when multiple users on different computers access and modify the same data, DFS uses a token management scheme to coordinate file modification. This prevents unintentional corruption of distributed files through multiple out-of-sync updates.
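A toy version of such a scheme, sketched in Python, shows the essential rule: many clients may hold a read token for a file, but granting a write token first revokes every other client's token, so stale cached copies are discarded before the file changes. The class and method names are hypothetical, and the real DFS token manager is considerably more elaborate.

```python
class TokenManager:
    """Toy token manager: many readers or one writer per file."""

    def __init__(self):
        self.readers = {}  # path -> set of clients holding a read token
        self.writer = {}   # path -> client holding the write token
        self.revoked = []  # log of (client, path) revocations, in order

    def _revoke(self, client, path):
        # A real system would call back to the client's cache manager here.
        self.revoked.append((client, path))

    def acquire_read(self, client, path):
        holder = self.writer.get(path)
        if holder is not None and holder != client:
            self._revoke(holder, path)  # the writer must flush and give up its token
            del self.writer[path]
        self.readers.setdefault(path, set()).add(client)

    def acquire_write(self, client, path):
        # Every other reader must drop its cached copy before the write proceeds.
        for other in sorted(self.readers.get(path, set()) - {client}):
            self._revoke(other, path)
        holder = self.writer.get(path)
        if holder is not None and holder != client:
            self._revoke(holder, path)
        self.readers[path] = set()
        self.writer[path] = client
```

If clients `a` and `b` both read a file and `a` then asks to write it, the manager revokes `b`'s token first; when `b` later reads again, `a`'s write token is revoked in turn.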
DFS allows system administrators to subdivide file-system partitions into filesets (logical collections of files). Filesets are not mounted in the local file-system name space but are spliced into the DCE global-directory name space instead. The fileset is referenced by its global directory name, so its name is independent of its location. A fileset moved from one physical file-system partition to another maintains its global name.
Thus, filesets make for easy administration. If a disk partition is getting close to capacity, an administrator can move filesets to another partition or file server.
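The location independence that makes this possible amounts to one level of indirection, which the following Python sketch illustrates. The class, the fileset name, and the server and partition labels are all hypothetical; the sketch models only the mapping from a global name to a physical location, not how DFS actually stores or moves data.

```python
class FilesetLocationDB:
    """Toy mapping from a fileset's global name to its current physical location."""

    def __init__(self):
        self.location = {}  # global name -> (server, partition)

    def create(self, name, server, partition):
        self.location[name] = (server, partition)

    def move(self, name, server, partition):
        """Relocate a fileset; the global name that users see does not change."""
        if name not in self.location:
            raise KeyError(name)
        self.location[name] = (server, partition)

    def locate(self, name):
        return self.location[name]


# Hypothetical usage: a partition fills up, so the administrator moves the fileset.
db = FilesetLocationDB()
db.create("/fs/projects/alpha", "server1", "part-a")
db.move("/fs/projects/alpha", "server2", "part-c")
```

Users and applications keep referring to `/fs/projects/alpha` throughout; only the entry in the location table changes.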
Distributed Time Service
Distributed network systems need a consistent time service. Many distributed services, such as distributed file systems and authentication services, compare dates generated on different computers. For the comparison to be meaningful, DCE must support a consistent time stamp.
In OSF DCE, a time server is a system that provides time to other systems for the purpose of synchronization. Any system that is not a time server is called a clerk. The DTS (Distributed Time Service) uses three types of servers to coordinate network time. A local server synchronizes with other local servers on the same LAN. A global server is available across an extended LAN or a WAN. A courier is a designated local server that regularly coordinates with global servers. Servers can obtain the official Coordinated Universal Time (UTC) from standards organizations (e.g., the U.S. Naval Observatory) via short-wave radio, dial-up lines, or satellite.
At periodic intervals, servers synchronize with every other local server on the LAN via the DTS protocol.
The OSF DTS synchronization protocol is interoperable with NTP (Network Time Protocol), the time-synchronization protocol used on the Internet.
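The article does not spell out how a system combines the values it collects from several servers. One common approach, assumed here purely for illustration, treats each reading as a time plus an inaccuracy bound, intersects the resulting intervals, and takes the midpoint of the overlap. The Python sketch below implements only that arithmetic; the function name and the input format are hypothetical.

```python
def synchronize(readings):
    """Combine (time, inaccuracy) readings by intersecting their intervals.

    Each reading claims the true time lies in [time - inaccuracy, time + inaccuracy].
    Returns the midpoint of the common overlap and its remaining inaccuracy.
    """
    low = max(t - e for t, e in readings)
    high = min(t + e for t, e in readings)
    if low > high:
        raise ValueError("server intervals do not overlap")
    return ((low + high) / 2, (high - low) / 2)


# Three hypothetical servers, times in seconds:
# intervals [98, 102], [100, 102], and [96.5, 102.5] overlap in [100, 102].
estimate, inaccuracy = synchronize([(100.0, 2.0), (101.0, 1.0), (99.5, 3.0)])
```

The combined estimate is never less precise than the best single reading, and a reading whose interval shares no overlap with the others shows up as an error rather than silently skewing the result.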
Extending and Using DCE
The premise that DCE lays a foundation for extension is already being tested and proven. Transarc, which provided the basis for DFS to DCE, has released its Encina extensions for transaction processing based on DCE. Major user corporations such as Amoco have committed to DCE, and major system vendors such as IBM, DEC, and Hewlett-Packard are busy implementing and delivering elements of the technology.
The University of Massachusetts at Amherst is using DCE as a foundation for providing an advanced computing environment called Project Pilgrim. The leaders of Project Pilgrim decided that DCE best met their needs for an integrated and comprehensive distributed computing environment. Project Pilgrim is completing its own distributed printing, mail, and event-notification services to layer on top of DCE.
OSF DCE 1.1
Currently, DCE is in release 1.0.3. One of the major goals of DCE 1.1 is improved administrative function. Earlier versions lacked some functions, so it was difficult to configure or administer a DCE cell from a single log-in session. OSF is rectifying this in release 1.1 by providing a new user-extensible control program and a new server that will be able to start servers directly under a variety of circumstances and provide better control over (and information about) what services are running on a host. The new server will also maintain the configuration files that current DCE programs require.
Another major area for enhancement is security. Release 1.1 will see the addition of GSSAPI (Generic Security Service Application Program Interface). The current scope of GSSAPI is establishing security contexts, performing peer-entity authentication, and yielding shared keys. One of its key goals is to provide non-RPC applications operating within a DCE environment with the ability to use the DCE Shared Secret authentication protocol. New audit subsystems in DCE 1.1 will track security-related events.
DCE 1.1 will be further internationalized to handle differing character sets or encodings for text data. The new version will also add support for hierarchical cell naming and extended registry attributes.
Interoperability
Although DCE has been built from many standard technologies and is designed to promote interoperability, it is, in general, an extensive, interrelated environment. That is one of its strengths and also one of its weaknesses. The ability to switch in and out of DCE usage or to work in a mixed DCE/non-DCE environment doesn't appear likely right now. For example, although a DCE Kerberos server supports Kerberos 5, it is not compatible with MIT Kerberos 5. DCE Kerberos runs over the DCE RPC; MIT Kerberos 5 does not. OSF promises full Kerberos compatibility in DCE 1.1, but in the meantime users can solve the problem by taking a dual-stack approach and running both.
There are some exceptions to this tight integration of services within an extensive environment. Some of the DTS, security, and GDS features are stand-alone. In addition, GSSAPI will be in release 1.1. But the essential philosophy behind DCE views tight integration as a feature, not a bug. Accordingly, a burden is placed on the user or implementor in choosing another design path.
This is not an insurmountable barrier. As another example, DCE and Windows NT are said to be compatible, even though NT is not a DCE platform. Compatibility is claimed by virtue of the RPCs' ability to interoperate. After some careful work at the source code level, developers can create DCE servers that communicate with Windows NT clients, NT servers that work with DCE clients, and DCE servers that communicate with DOS clients.
Areas for Extension
For DCE to fulfill its promise of becoming the foundation for widespread heterogeneous distributed computing, it must deliver support in two key general areas: TP (transaction processing) and object orientation.
Support for TP is fundamental to success in the commercial market as a production system. Transaction integrity must be a given for businesses that cannot afford any loss or inconsistency in data. Some of these sites have had gigantic centralized TP systems running for years. The base DCE technology is insufficient to provide the qualities expected in a standard TP system: the so-called ACID properties (atomicity, consistency, isolation, and durability). Transarc provides one solution for this with Encina. IBM offers another with an implementation of its CICS on top of DCE on AIX. IBM actually offers its customers a choice of either Encina or CICS for open TP solutions.
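Atomicity across several servers is commonly achieved with a two-phase commit, the protocol in which all the parts of the Citibank prototype participate. The Python sketch below shows only the voting logic and is not Encina's or CICS's interface: the `Participant` class and the `two_phase_commit` function are hypothetical, and a production coordinator would also log its decisions to survive crashes.

```python
class Participant:
    """Toy resource manager that votes on, and then applies, a transaction outcome."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        """Phase 1: promise to commit if asked, or vote no."""
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Commit everywhere only if every participant votes yes; otherwise abort everywhere."""
    votes = [p.prepare() for p in participants]  # phase 1: collect all votes
    if all(votes):
        for p in participants:                   # phase 2: unanimous yes, so commit
            p.commit()
        return True
    for p in participants:                       # phase 2: any no vote aborts all
        p.abort()
    return False
```

Either every participant ends in the `committed` state or every participant ends in `aborted`; no mixture is possible, which is the atomicity that the ACID list refers to.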
Object orientation will prove fundamental to the rapid proliferation of network-based applications, for some of the same reasons that are propelling the transparency of DCE to developers: It is too hard to write a network-based application without either extensive retraining or a technology that camouflages the intricacies of the network. Object orientation provides this necessary transparency as ease of development. Much of Novell's AppWare family, for example, is based on object-oriented technology. Object orientation will provide the necessary capabilities of reuse and customization required in today's business-oriented computing climate.
OSF is exploring extensions to its RPC interface definition language that will add object-oriented functionality. Once implemented, such features will support the Object Management Group's CORBA (Common Object Request Broker Architecture) on top of the DCE infrastructure. Much of DCE's future success will depend on this sort of extensibility, as well as on the success of the organizational management of this collection of enabling technologies.
What the DCE RPC Offers
-- Simplicity
-- Consistent protocol
-- Consistent behavior over multiple transports
-- Fulfillment of other DCE needs
-- Security
-- Internationalization
Illustration: The OSF DCE Architecture
The DCE architecture follows a layered model that integrates a set of technologies. The most basic, or supplier, services--such as the operating system--are at the bottom, with the highest level being service consumers, or applications. Security and management apply to all layers.
Illustration: The OSF DCE Directory Service
The CDS is optimized for local access. The bulk of directory service queries will ask about resources within the query originator's cell. The GDA takes queries for names that it cannot find in the local cell and passes them to another cell service or to the GDS, depending on the location of the name. Based on the X.500 standard, GDS functions as a higher level of the directory hierarchy in order to connect multiple cells in multiple organizations.
Michael D. Millikin is vice president of programs for NetWorld+Interop (Mountain View, CA). He is a longtime follower of the OSF technology and process. You can contact him on the Internet at mdm@polaris.interop.com or on BIX c/o "editors."