
The Quest to Standardize Metadata


November 1997 / Core Technologies / The Quest to Standardize Metadata  

A standard for database objects lets disparate data-manipulation tools exchange information.

Stephen R. Gardner

Metadata is popularly defined as "data about data." For the IT manager, this is more concisely stated as "information about the enterprise." In the context of data warehousing, the term refers to anything that defines a data-warehouse object, such as a table, query, report, business rule, or transformation algorithm. Metadata management gives users greater control of corporate data by providing a map of the locations where that data is stored. It also supplies a blueprint that shows how one type of information is derived from another.

The current crop of data manipulation and management tools has resulted in IT products that all process metadata differently, with little consideration for sharing the information. This situation highlights the need for a metadata standard.

Efforts are under way by a consortium of companies to standardize metadata interchange among products from diverse vendors. Six companies -- Arbor Software, Business Objects, Cognos, Evolutionary Technologies International, Platinum Technology, and Texas Instruments Software -- formed the Metadata Council in July 1995.

The Council launched the Metadata Interchange Specification (MDIS) to address issues relating to the exchange, sharing, and management of metadata. The Council released version 1.0 of MDIS in June 1996. The standards document can be downloaded from http://www.he.net/~metadata/standards/toc.html.

Initial Steps

MDIS consists of components that represent the minimum common set of metadata elements and the minimum integration points that must be incorporated into database tools for compliance. MDIS also provides standards for optional and extension components that are relevant only to a particular class of tool.

A common language must be developed before a standard can be constructed. This involves setting up well-understood and well-communicated processes for naming metadata elements, standardizing data types and lengths, and maintaining descriptive glossaries.

This development of a common definition and terminology involves two entirely different information models. First, there's the application metamodel. This model is application-specific and describes the tables and objects that contain the metadata for schemata particular to a given application. Second is the metadata metamodel, which is the set of objects that MDIS describes. These objects reflect information common to one or more classes of tools, such as database servers and data-discovery and data-extraction tools. For MDIS to succeed, the metadata metamodel must be independent of any application metamodel. It must have a unique definition for each object, and it should be character-based so that it's platform independent.

Since metadata is stored in different types of storage or data formats -- such as relational tables, ASCII files, and customized repositories -- the MDIS access methodology must be very flexible. This requires a framework that translates a tool's metamodel access request to match the MDIS syntax and format, as shown in the figure "A Unifying Framework."

To establish a bidirectional data flow, the standard uses three types of information. First, the metadata files include a header with version information. Second, a Tool Profile file contains character-based information that describes the type of metadata elements that the tool manipulates. Finally, a character-based Configuration Profile file describes the mapping of data to specific metadata objects. It also describes what flows of the metadata are legitimate: A tool might be prohibited from using a later version of a metadata object because of major changes to source-to-target mappings of the metadata.
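The interplay of the header, the Tool Profile, and the Configuration Profile can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, field names, and the `may_import` function are hypothetical and are not taken from the MDIS document, which defines its own file formats.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Header:
    """Version information carried at the top of a metadata file."""
    version: str


@dataclass
class ToolProfile:
    """Hypothetical stand-in for a Tool Profile file."""
    tool_name: str
    element_types: set[str]  # kinds of metadata elements the tool manipulates


@dataclass
class ConfigurationProfile:
    """Hypothetical stand-in for a Configuration Profile file.

    Maps an element type to the versions this tool is allowed to import.
    """
    allowed_versions: dict[str, set[str]] = field(default_factory=dict)


def may_import(header: Header, element_type: str,
               tool: ToolProfile, config: ConfigurationProfile) -> bool:
    """Return True if this flow of metadata is legitimate for the tool."""
    if element_type not in tool.element_types:
        return False  # the tool does not handle this kind of element at all
    allowed = config.allowed_versions.get(element_type, set())
    return header.version in allowed  # e.g. a later version may be prohibited


tool = ToolProfile("extractor", {"table", "column"})
config = ConfigurationProfile({"table": {"1.0"}})
print(may_import(Header("1.0"), "table", tool, config))
print(may_import(Header("1.1"), "table", tool, config))
```

In this sketch the second call is rejected because version 1.1 is not listed for the "table" element, which mirrors the case of a tool being barred from a later version of a metadata object.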

MDIS uses a text-based tag language that resembles HTML. The mechanism that makes MDIS extensible is similar to Lisp's properties object, a character field of arbitrary length that's composed of identifiers and a value. The tool's import function uses the identifier to recognize the metadata type and to locate the data within the field. The value is the metadata itself.
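A minimal Python sketch of this identifier/value idea follows. The `identifier=value;` layout, the `acme.` identifiers, and both function names are assumptions made for illustration; they are not the actual MDIS tag syntax.

```python
from __future__ import annotations


def parse_properties(field: str) -> dict[str, str]:
    """Split a character field into identifier/value pairs.

    Assumes a hypothetical 'identifier=value; identifier=value' layout.
    """
    props: dict[str, str] = {}
    for entry in field.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing separator
        identifier, sep, value = entry.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in entry: {entry!r}")
        props[identifier.strip()] = value.strip()
    return props


def import_known(field: str, known_identifiers: set[str]) -> dict[str, str]:
    """Keep only the extensions whose identifiers this tool recognizes."""
    parsed = parse_properties(field)
    return {k: v for k, v in parsed.items() if k in known_identifiers}


extension_field = "acme.refresh=daily; acme.owner=finance; other.hint=x"
print(import_known(extension_field, {"acme.refresh", "acme.owner"}))
```

A tool that does not recognize an identifier simply skips it, which is what lets one vendor add extensions without breaking another vendor's import function.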

In Search of a Standard

The Metadata Council examined several ways to implement the standardized MDIS model, as shown in the figure "Four Approaches to an MDIS Implementation." The ASCII batch approach relies on an ASCII file format. The file contains descriptions of the common metadata components and standardized access requirements that make up the MDIS model. The file loads whenever a tool accesses metadata via the common API.

This approach does not require updating the tool when the metadata model changes; modifications to the standard are made to the file instead. However, since using an object requires loading the entire MDIS framework, this approach is processor intensive.
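The trade-off of the ASCII batch approach can be sketched as follows. The one-line-per-object format and the names `MODEL_TEXT`, `load_model`, and `get_object` are invented for this example and do not reproduce MDIS syntax.

```python
from __future__ import annotations

# Hypothetical one-line-per-object description file, not MDIS syntax.
MODEL_TEXT = """\
# kind name
table customers
column customers.name
report monthly_sales
"""


def load_model(text: str) -> dict[str, str]:
    """Parse the whole description file into a name -> kind mapping."""
    model: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        kind, name = line.split(maxsplit=1)
        model[name] = kind
    return model


def get_object(text: str, name: str) -> str:
    """Batch-style access: parse everything, then return one object."""
    return load_model(text)[name]


print(get_object(MODEL_TEXT, "monthly_sales"))
```

Changing the model means editing only the text, not the tool, but every call to `get_object` pays for parsing the full file, which is the processing cost described above.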

The procedural approach requires that the intelligence to communicate with the MDIS standard be built into the tool. This approach needs only a modification to the API to accommodate changes and additions to the metamodel schema and/or access parameters. But it also requires a great deal of up-front effort on the part of tool vendors to retrofit this logic to achieve MDIS compliance.

The hybrid approach combines the ASCII-batch and procedural approaches. It follows a data-driven model. A tool loads a set of tables that define the MDIS API. The tool interacts with the API through the MDIS framework and retrieves just the needed object. This eliminates the need for reading the entire schema.

Changes to the standard are reflected in the table data so that the tools don't have to be modified to maintain compliance with the MDIS specification. But loading the tables can be time consuming, which is unacceptable in information-intensive applications.
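One way to picture the data-driven idea is a small Python sketch in which table data, rather than tool code, describes each kind of object. The table contents and the names `API_TABLE` and `read_object` are hypothetical; this is one possible reading of the hybrid approach, not the Council's design.

```python
from __future__ import annotations

# Hypothetical table data: object kind -> ordered field names.
API_TABLE: dict[str, tuple[str, ...]] = {
    "table": ("name", "owner"),
    "report": ("name", "query"),
}


def read_object(api_table: dict[str, tuple[str, ...]],
                kind: str, raw_values: list[str]) -> dict[str, str]:
    """Interpret one object's raw values using the loaded table data."""
    fields = api_table[kind]
    if len(raw_values) != len(fields):
        raise ValueError(f"{kind!r} expects {len(fields)} values")
    return dict(zip(fields, raw_values))


# The tool retrieves just the object it needs.
print(read_object(API_TABLE, "report", ["monthly_sales", "q_sales_by_month"]))

# A change to the standard is a change to the table data only.
extended = {**API_TABLE, "transformation": ("name", "source", "target")}
print(read_object(extended, "transformation", ["to_usd", "amount", "amount_usd"]))
```

`read_object` is unchanged when a new kind is added, but the table has to be loaded before any object can be read, which corresponds to the start-up cost noted above.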

A fourth approach is to develop the MDIS standard within the Electronics Industries Association's CASE (Computer Aided Software and Systems Engineering) Data Interchange Format (CDIF). The CDIF standards support multiple semantic layers and transfer formats for CASE tools. Adopting this approach carries two obligations: the Metadata Coalition must appoint

For version 1.0 of the MDIS, the Council recommends the ASCII batch approach because vendors can implement support for the specification with minimum overhead and a shorter time to market.

Into the Future

There will continue to be a lack of integration among metadata tools for the next several years. Integrated metadata won't be readily available until at least the 1998/1999 time frame, when repository-based solutions should begin to emerge.

Furthermore, integrating new objects that consist of video, audio, and spatial data types will pose additional challenges to the Metadata Council -- as well as to anyone who's looking for integration of metadata tools.


A Unifying Framework


Profile information, an API, and a standard set of objects allow diverse metadata tools to interoperate.


Four Approaches to an MDIS Implementation


The Metadata Council chose the ASCII batch approach because it's easy to implement and offers a shorter time to market.


Stephen R. Gardner (Seattle, WA) is the director of advanced technology research at NCR Corp. You can reach him by sending e-mail to stephen.gardner@sanfranciscoca.ncr.co .
