Archives
 
 
 
  Special
 
 
 
  About Us
 
 
 

Newsletter
Free E-mail Newsletter from BYTE.com

 
    
           
Visit the home page Browse the four-year online archive Download platform-neutral CPU/FPU benchmarks Find information for advertisers, authors, vendors, subscribers

ArticlesOLAP Omnipresent


February 1998 / Features / OLAP Omnipresent

Once strictly a big-iron application, OLAP is moving to desktop spreadsheets, business software, and the Web.

Nigel Pendse

Which came first, on-line analytical processing (OLAP) or relational databases? Award yourself 10 points if you answered OLAP. Contrary to what many people think, multidimensional analysis tools have been around for much longer than relational databases: Today's market leader is now 28 years old. Whether or not their developers used the term "OLAP," many budgeti ng, sales analysis, consolidation, and planning applications have used fundamental OLAP concepts over the years. And now OLAP functions are popping up in non-OLAP applications that will probably bring them to a bigger audience.

Despite their widespread presence, OLAP tools failed to capture the attention of the mainstream user -- until recently. This failure was not because of a lack of functionality: Most of the advances in OLAP have nothing to do with analytical or statistical fun ctionality. Indeed, the earliest OLAP tools probably had more sophisticated calculation capabilities than many of the modern ones.

So, if it was not analytical functionality that kept OLAP from being as popular as 1-2-3, what was it? As the table "OLAP's Former Difficulties" shows, OLAP had problems collecting data and storing it, and problems integrating with other applications. OLAP also lacked standards, severing the market into proprietary fiefdoms. Setting up and running OLAP required expert help, and only high-level or guru types had access to its capabilities. Plus, it was all very expensive.

These negative characteristics have changed. Now that OLAP is no longer strictly a niche technology, it is worth examining how it has evolved to fit more closely with contemporary business needs in organizations large and small.

Since OLAP products cover a wide spectrum of applications, no single architecture could work optimally for the full range. Data volumes range from a few megabytes to terabytes; functionality ranges from simple aggregation to complex financial models; access ranges from read-only to multiuser concurrent read/write; and user populations range from single user to thousands of users.

So OLAP vendors have used many different approaches for how and where their products store the data for multidimensional processing, and where the processing itself occurs. Some products are more versatile than others, and inevitably, the vendors of the l ess versatile products attack the approaches that they don't support. But this is unnecessary because each type of architecture is suitable for different applications.

Types of OLAP

OLAP products and vendors resist attempts to be neatly pigeonholed or compared. On a technical basis, there are a surprisingly large number of variables that differentiate the products, so no two products are identical -- even in architecture.

Relational OLAP (ROLAP) uses SQL to extract data from a relational database and create multidimensional views on the fly. Multidimensional OLAP (MOLAP) creates and stores multidimensional databases beforehand, for faster access. Hybrid OLAP (HOLAP) combines ROLAP and MOLAP by creating multidimensional models that userelational databases. But the terms "ROLAP," "MOLAP," and "HOLAP" are, at best, only approximate indicators of how products work.

To try to bring some clarity to our analysis, it is handy to use a two-dimensional grid (see the figure "Different Styles of OLAP" ). This grid positions products by where the majority of the multidimensional processing occurs and by where they store the bulk of their multidimensional data. For each of these two dimensions, there are three basic options. Regardless of where the active multidimensional data originally came from, there are essentially three places to store it for analysis: an RDBMS (with no external storage), a shared multidimensional database on a server, or local data files on a client PC. Similarly, there are three styles of "engine" that can do all or most of the multidimensional consolidations and other calculations: multipass SQL, a multiuser multidimensional application server, and the client PC itself. (Multipass SQL performs one query, then retains the result of that query in temporary storage while it performs one or more other queries. Then it joins the results of the queries for a final multidimensional result.)

To some extent, these options are independent, so several comb inations are possible. However, only some of the nine combinations make sense. For example, it would be crazy to use SQL to do multidimensional processing of data held in a multidimensional database. In practice, six of the options make sense, and in most cases, several products provide those combinations.

If a product is normally used in only one way, it appears in that box only. However, several of the products are capable of working in more than one way, which is one reason why even the most ingenious architectural classification schemes often fail. Although products that are genuinely usable in more than one way appear in all the relevant boxes, this does not mean that they are equally at home in each. Some products really belong in one category, but they have capabilities that allow them to be included in another.

If you prefer the normal bandied-about terms, you can map them to the grid. Relational OLAP products are in the first column only (squares 1, 2, and 3). Multidimensional database (a lso known as MDB, MDDB, or MOLAP) products are in the second column only (squares 4 and 5). Desktop OLAP products are in square 6. Hybrid OLAP products are in both squares 2 and 4 and are shown in italics.

The fact that several products are in the same square, and therefore have similar architectures, does not necessarily mean that they are very similar products. For instance, DB2 OLAP Server and Information Advantage DecisionSuite are actually very different: They just happen to share certain storage and processing characteristics; technically, they both still count as ROLAP products.

Even though this grid is probably the simplest and clearest way of illustrating the architectural differences among the different OLAP approaches, it is still relatively complex. It's hardly surprising that many people continue to be baffled by the differences among OLAP technologies and select grossly inappropriate products for their applications.

In general, products that access multidimensional data direct ly from SQL pay a heavy performance penalty, as there is nearly always much more I/O and CPU involved. Their retrieval performance is usually two to four times slower because SQL is not good for multidimensional data retrieval and manipulation. However, they can handle lots more data. So, if the key issue is performance, multidimensional database technology remains essential, but if capacity is the limiting factor, an RDBMS must hold at least the base-level data.

The hybrid architecture is becoming the most popular for current products because it can combine the capacity of ROLAP tools with the superior performance of multidimensional databases. Still, most of the initial hybrid OLAP products have acquired a reputation for complexity. The new Microsoft OLAP Server will use a wizard-driven interface with automatic tuning to avoid the 4GL programming that has been required until now.

Most of the products that include client-based processing need an extra tier for Web deployment. In particular, the d esktop OLAP tools will all need a server component to work successfully with browsers. This additional server takes over the processing that would have occurred on the client -- so three-tier OLAP becomes four-tier if deployed on the Web. This can sometimes work out as significantly more expensive than the conventional three-tier client/server solution.

Spreadsheet OLAP Clients

Traditional OLAP tools used dedicated, proprietary client tools, and most OLAP products on the market can still work this way. However, in the early 1990s, vendors began to provide add-ins to present multidimensional data via industry-standard spreadsheets (see the table "OLAP for Spreadsheets" ). Initially vendors supported both Lotus 1-2-3 and Microsoft Excel, but all the more recent implementations have worked with Excel only. OLAP vendors report that, quite apart from the obvious marketing reasons, it is easier to develop Excel add-ins than 1-2-3 add-ins.

The key advantage of the spreadshee t approach is that it combines the flexible display, formatting strengths, and ad hoc calculations of spreadsheets with the data management, calculations, and performance of multidimensional database technology. No data is stored in the spreadsheet, no macros are required, and the spreadsheet's notorious maintenance hole is eliminated. Conversely, the OLAP vendors don't have to keep up with relentless spreadsheet GUI advances. Of course, vendors still usually have to produce different versions of their add-ins for each new release of the spreadsheet (for example, Excel 97 needs a different version than Excel 7.0, even though both are 32-bit programs).

Many OLAP applications require users to enter plans, budgets, and comments, so an increasing number of OLAP servers now allow read/write spreadsheet access. This approach to OLAP data input is particularly attractive. It allows end users to stick with their familiar spreadsheet data entry without having to manage or upload the captured data -- which reside s directly in a properly secure multiuser database. The OLAP engine, not the spreadsheet, manages access controls and multiwrite concurrency.

Using a spreadsheet grid is very popular with end users, many of whom routinely copy data from other analysis tools into spreadsheets anyway. Financial OLAP applications tend to use the spreadsheet approach the most. But in organizations where wider populations of users are spreadsheet-literate, it is a great approach for sales and marketing applications as well. Because the spreadsheet add-ins are often "free," with licensing based on concurrent server connections, buyers don't have to account for each desktop installation of the add-in. In fact, the economics of this approach often beats Web deployment because the hardware and software costs are usually much lower, even though the administrative effort is a little greater. (See the figure "OLAP Add-Ins to Spreadsheets" .)

Other OLAP Clients

Spreadsheets are by no means the onl y OLAP clients. Typically there are three other options: standard desktop OLAP products, specialized applications, and Web browsers.

Several vendors produce desktop OLAP tools that literally hundreds of other application providers resell. In some cases, they "own label" them by removing the original product's name. However, in most cases, they do not change the product itself, and they add value by integrating data from their own application into the OLAP data structure. Typically the application vendor implements functions to generate the desktop cubes nearly automatically using the application's metadata, then resells a standard version of the desktop OLAP product.

Cognos, in particular, has been very successful in signing up both horizontal and vertical application vendors: For example, most accounting systems now come with an optional Cognos PowerPlay front end. Other vendors that provide OLAP clients on an OEM basis include Business Objects, Brio Technology, and AppSource.

Embedded OLA P

The other form of OLAP embedding comes when specialized applications include licensed off-the-shelf OLAP servers. For example, Comshare and Hyperion both have a history of building their own multidimensional engines, but they now license third-party OLAP servers for some of their applications.

Normally it is necessary to add an extra layer between the server component of the application and the standard OLAP server. Plus, another layer on the "output" side of the generic server handles post-OLAP processing such as exception scanning or data mining. The server performs both of these much more efficiently than the client. Sometimes there are also application components on the client, but generally it is also possible to access the server using its generic client support.

Thus, although it may seem easy to simply license a standard OLAP server, in practice a significant amount of development effort is required on the server, the client, and even the client/server communications if the appli cation is to work properly. (See the figure "Embedding OLAP" .) The work should be distributed efficiently between the client and the server and between the generic and the specialized components. Despite these complexities, the inexpensive new Microsoft OLAP Server is likely to make this a popular architecture for the many applications that can use multidimensional processing. A host of new OLAP applications that embed the Microsoft server is likely to appear next year.

Using embedded OLAP servers allows complex business applications such as budgeting, consolidation, and retail systems to be "OLAP-enabled." Buyers may hardly be aware that a key component of their solution is an OLAP engine. The main multidimensional databases currently used for such systems are Arbor Essbase and Applix TM1, while Oracle Express is the most commonly embedded hybrid OLAP tool, and MicroStrategy DSS Server is the fashionable ROLAP choice.

OLAP and the Web

There has been a huge amount of hype about using Web browsers for OLAP access, but few sites are yet in production with Web OLAP deployment. This is one of the more difficult applications to deliver successfully in a browser environment, and most of the currently available offerings -- despite all the excited announcements -- unfortunately are very disappointing. Today's better OLAP client tools offer a very rich GUI environment, with plenty of direct manipulation of screen objects, ad hoc multidimensional calculations, read/write data access, and navigation through potentially very large hierarchies.

All of this is tough to deliver through a browser. Regardless of the technology used, the current crop of products is usually slow, clumsy, and low-featured. This may be the only option for users who simply cannot gain access via a conventional client/server interface, but it is a poor substitute for users familiar with sophisticated Windows OLAP clients. In addition, many products require building special Web pages. As a result, mixed client/server and Web deployments are not easy to administer. Only the more basic reporting-oriented products allow existing client/server applications to be deployed via Web browsers with little or no change.

Since 1996, various pundits have regularly predicted a rapid movement to the Web for OLAP applications, but it now seems that, like so much else to do with the Web, their headline-making proclamations were naïve. While Web OLAP could indeed become ubiquitous, OLAP applications currently seem like some of the least suitable for the general Internet. Still, a few simple multidimensional analyses are already becoming available to customers and suppliers on a small scale. If it finally takes off, Web OLAP would probably spread first on intranets and extranets, for two reasons. First is the type of applications that use OLAP. Second is the probable technical requirement to standardize on particular browsers and plug-ins for sophisticated deployment.

The Next Pivot?

It has taken a l ong time, but OLAP has finally hit the big time. Whether lurking behind a deceptively flat spreadsheet, embedded within a budgeting application, invisibly spewing out dynamic Web pages, or providing the glitz for your ledgers, OLAP technologies are infiltrating all branches of business computing. And these are not trivial technologies: OLAP applications can stretch GUI interfaces to the limit, swamp the biggest servers, and cause mysterious effects like database explosion. Despite the smartest sparsity suppression -- sparsity is a condition in which a region of a multidimensional database has few entries, wasting space -- megabytes of input data can legitimately explode into gigabytes of fully calculated data. (For more information on this subject, see http://www.olapreport.com/DatabaseExplosion .htm .)

Yet even after all these years, OLAP innovation is far from exhausted. Having tackled the functionality issues long ago, the surviving OLAP vendors now have to confront the more difficult problems of mass deployment. Their applications must become easier to implement, data integration must cease to be an issue, end-user training needs should be minimal, and tuning should become automated. Then OLAP could become as common as a spreadsheet.


Where to Find

Applix
Westboro, MA 
Phone:    508-870-0300
Phone:    800-8-APPLIX
Internet: http://www.applix.com/tm1/index.htm

AppSource
Orlando, FL
Phone:    407-888-8050
Internet: http://www.appsource.com

Arbor Software
Sunnyvale, CA
Phone:    408-744-9500
Phone:    800-858-1666
Internet: http://www.arborsoft.com

Brio Technology
Palo Alto, CA
Phone:    415-856-8000 
Internet: http://www.brio.com

Business Objects
San Jose, CA
Phone:    408-953-6000
Phone:    800-527-0580 
Internet: http://www.businessobjects.com

Cognos
Ottawa, Ontario, Canada
Phone:    613-738-1440
Internet: http://www.cognos.com

Comshare
Ann Arbor, MI
Phone:    313-994-4800
Internet: http://www.comshare.com

Gentia Software
Wakefield, MA
Phone:    781-224-0750
Internet: http://www.gentia.com

Hyperion Software
Stamford, CT
Phone:    203-703-3000
Phone:    800-286-8000
Internet: http://www.hysoft.com

Information Advantage
Eden Prairie, MN
Phone:    612-833-3700 
Internet: http://www.infoadvan.com

Informix Software
Menlo Park, CA
Phone:    650-926-6300
Phone:    800-331-1763
Internet: http://www.informix.com

Microsoft
Redmond, WA
Phone:    206-882-8080
Internet: http://www.microsoft.com

MicroStrategy
Vienna, VA
Phone:    703-848-8600
Internet: http://www.strategy.com

Oracle
Redwood Shores, CA
Phone:    415-506-7000
Internet: http://www.oracle.com/products/olap/html

Pilot Software
Cambridge, MA
Phone:    617-374-9400
Internet: http://www.pilotsw.com

Seagate Software
Scotts Valley, CA
Phone:    408-438-6550
Internet: http://www.seagatesoftware.com/holos

Speedware
Montreal, Quebec, Canada
Phone:    514-747-9494
Phone:    800-361-6782
Internet: http://www.speedware.com


OLAP for Spreadsheets

OLAP product Lotus 1-2-3 Microsoft Excel Type of access
Applix TM1 * * Read/write
Arbor Essbase * * Read/write
Gentia OLAP Database * Read/write
Hyperion Pillar * Read/write
IBM DB2 OLAP Server * * Read/write
Informix MetaCube * Read-only
Microsoft OLAP Server * Read-only
MicroStrategy DSS Server * Read-only
Oracle Expr ess * Read/write
Pilot Analysis Server * Read-only
Seagate Holos * Read/write
Key: * = Yes


OLAP's Former Difficulties

Data problems Collecting data from multiple sources.
Hardware limits Multidimensional application data requires vast storage and memory resources.
Skills OLAP tools were complex to implement, especially for designing adequate database structures.
Integration Usually proprietary. Necessitated complete solution, including database, engine, application, and client tool.
Standards Non e.
Costs Priced for the elite, not for volume sales.
Application limits Not designed to cope with large data volumes; could not do calculations fast enough; database index limits were too small; succumbed to database explosion (when sparse multidimensional structures were fully precalculated).
Business style Seemed to be for specialists only.
Administration Intended for use only on a departmental basis: lacked the necessary administration tools.


Different Styles of OLAP

illustration_link (21 Kbytes)

You can categorize OLAP products by how they perform calculations and also by where they store their data.


Embedding OLAP

illustration_link (18 Kbytes)

Embedding a generic OLAP server engine requires new interfaces for the existing application server-based logic and the client software.


OLAP Add-Ins to Spreadsheets

illustration_link (17 Kbytes)

Spreadsheet add-ins can connect to both local and server-based multidimensional engines, with no data in spreadsheet files.


Nigel Pendse is lead author of The OLAP Report ( http://www.olapreport.com), published by Business Intelligence, Inc. An independent consultant to buyers and sellers of OLAP, he has been involved with OLAP products for over 20 years, as both user and vendor. You can reach him at nigelp@compuserve.com.

Up to the Features section contentsGo to previous article: Bean BuildersGo to next article: OLAP APIs
Flexible C++
Matthew Wilson
My approach to software engineering is far more pragmatic than it is theoretical--and no language better exemplifies this than C++.

more...

BYTE Digest

BYTE Digest editors every month analyze and evaluate the best articles from Information Week, EE Times, Dr. Dobb's Journal, Network Computing, Sys Admin, and dozens of other CMP publications—bringing you critical news and information about wireless communication, computer security, software development, embedded systems, and more!

Find out more

BYTE.com Store

BYTE CD-ROM
NOW, on one CD-ROM, you can instantly access more than 8 years of BYTE.
 
The Best of BYTE Volume 1: Programming Languages
The Best of BYTE
Volume 1: Programming Languages
In this issue of Best of BYTE, we bring together some of the leading programming language designers and implementors...

Copyright © 2005 CMP Media LLC, Privacy Policy, Your California Privacy rights, Terms of Service
Site comments: webmaster@byte.com
SDMG Web Sites: BYTE.com, C/C++ Users Journal, Dr. Dobb's Journal, MSDN Magazine, New Architect, SD Expo, SD Magazine, Sys Admin, The Perl Journal, UnixReview.com, Windows Developer Network