y to meaningfully analyze the data; in effect, transforming raw data into useful business information.
Business intelligence tools sit
on top of the data warehouse and provide this service. Because they're the main point of contact between your warehouse application and the people who use it, these tools also can be the difference between a warehouse's success or failure. Business intelligence tools have become the successor to decision support systems, but they have a broader scope: They not only support decision making but, in many cases, these tools support operational, mission-critical functions of the company, too. However, business intelligence products aren't foolproof: You get the most out of your data warehouse only as long as you match the
right tool
to the needs of each end user.
Beyond Queries
The simplest of these tools are basic querying and reporting products. They provide graphical front ends to SQL generators (or, more accurately, database access-call generators). Rather than requiring you to learn SQL or write a program to access database information, the querying tool lets you
point-and-click menus and buttons to specify data elements, conditions, grouping criteria, and other attributes of an information request (see the figure
"A View to Your Data"
). The query tool then generates a database call, extracts the relevant data, performs additional calculation and data manipulation if necessary, and presents the results in a clear format.
You can store queries and report requests for subsequent submission, either as is or with modifications. Statistical processing is usually limited to averages, summations, standard deviations, and other basic analysis functions. Although the capabilities vary from one product to another, querying and reporting tools are most appropriate when you need an answer to the question "What happened?" -- as in "How do last month's sales for products X, Y, and Z compare with the previous month's sales and the same month's sales last year?"
To make queries more accessible to nontechnical people, products like Seagate's Crystal Repo
rts, Cognos' Impromptu, Borland's Reportsmith, IQ Software's Intelligent Query, Software AG's Esperant, and Andyne's GQL offer graphical interfaces for drag-and-drop selections. The more advanced of these products will steer you away from queries that have bad syntax or that return unintended results. Access to data has also improved with new versions of these products as vendors ship standard drivers such as ODBC and native 32-bit drivers to popular data sources.
In general, data warehouse managers who are going to use these types of products should be prepared to spend time resolving setup tasks like managing directories and libraries, installing connectivity software, establishing English-like names, and precalculating "virtual data fields." Once you create your SQL front end, you may need to develop a set of standard queries and reports, although some products offer libraries of prebuilt templates and predefined reports that you can quickly modify.
On-Line Analysis
Report gene
rators fall short when end users need more than a single, static view of data that is not subject to further manipulation. For these users, on-line analytical processing (OLAP) tools provide "slice and dice" capabilities that go beyond answering "what happened?" to analyzing why the results are the way they are (see
"Two Ways to Slice and Dice Data"
). The first OLAP solutions were based on multidimensional databases (MDDBs). A structural cube (dubbed a hypercube or a multidimensional array) stored the data so you could manipulate it intuitively and clearly see the associations across multiple dimensions. Pioneering products such as Arbor Software's Essbase directly support the various views and dimensional manipulations required by OLAP.
But the MDDB approach has two limitations. First, new data-storage structures require proprietary databases. There are no truly open standards for accessing multidimensional data. Vendors such as Arbor saw this as an opportunity to create de facto st
andards by publishing MDDB APIs, encouraging third-party tools, and establishing strategic partnerships. Many third-party query tools and data-mining solutions directly support Essbase, Oracle Express, and other popular MDDB formats. Commander OLAP, Comshare's client/server tool, sits on top of an Essbase multidimensional data store and supports dynamic access and manipulation of data.
The second limitation of MDDB concerns populating the data structure. Companies typically store enterprise data in relational databases, which means that someone has to extract, format, and load this data into the hypercube. This process can be complex and time-consuming but, again, vendors are searching for solutions. Data pumps and other tools automate the process by mapping relational fields into the multidimensional structure and populating the MDDB on-the-fly.
Some vendors are now evangelizing relational OLAP (ROLAP), which taps into the data warehouse directly using standard SQL calls. The front-end tools let you
submit multidimensional requests, but the ROLAP engine transforms the queries into SQL routines. You then receive cross-tabulated results as a multidimensional spreadsheet or in some other form that supports rotating, drilling down, and slicing.
Even with a data pump, developing and populating the MDDB structure can be challenging. ROLAP administrators must face the sometimes daunting task of developing the SQL routines to aggregate and index ROLAP data as well as to ensure correct translation of multidimensional requests into SQL command sets.
Advocates of ROLAP argue that it uses open standards (SQL) and that it makes atomic (detail-level) data more readily accessible. On the other hand, advocates of MDDB argue that a native multidimensional structure achieves better performance and flexibility once you populate the data store. The good news is that both of these technologies are maturing rapidly, and either can provide a strong OLAP solution. Some product examples are Cognos' PowerPlay, Business Ob
jects' software of the same name, Brio Technology's BrioQuery, and MicroStrategy's DSS Agent/DSS Server suite.
The administrative and development challenges of OLAP, while similar to those encountered with query and reporting tools, are generally more complex. Setting up the OLAP and the data access software requires a clear understanding of the corporation's data models and the analytical functions required by executives, managers, and other data analysts. Commercial development products can ease the pain, but OLAP is rarely a turnkey solution; the architecture must be tuned to support your data sources and requirements. But once you've set up an OLAP system, ongoing end-user support is minimal.
For the Busy Executive
The concept of the executive information system (EIS) is simple: Executives have neither the time nor, in many cases, the expertise to perform OLAP analysis of large volumes of data. The EIS presents simplified, highly consolidated, and mostly static views of the da
ta.
There are two major categories of EIS environments: the briefing book and the command center. The briefing book is an electronic, on-line version of its paper counterpart that many executives use in staff meetings. Electronic slides present a concise picture of an organizational initiative or perhaps data to reveal the current status of a major project.
The command center is basically a collection of ports into a wide array of reports, newsgroup retrievals from the Internet, and other materials that provide insight into the organization. Command-center reports may be accessed daily, or more often if information changes frequently, or only when exceptions warrant. Some products generate alarms when specified exceptions occur.
When appropriate, each briefing-book slide or command-center screen should permit the executive to receive additional information if desired (and if available). Unlike the OLAP model, where increasing levels of information are revealed as the analyst manipulates the data,
an executive expects the overview up front. He or she should not have to dig for answers. Therefore, when the executive requests more information from briefing-book slides or command-center screens, the display should be carefully constructed to present primarily supplemental, amplifying information. The executive should be able to bypass every prompt for "more information" without missing any critical information.
Savvy executives may administer their own briefing book and command center, or administrators could maintain and modify the EIS according to the executive's specifications. EISes typically have a programming or scripting capability that will vary in complexity from one product to another. Pioneers in the EIS market include Comshare, makers of Commander EIS, and Pilot Software, developers of Pilot Command Center.
Digging for Data
Data mining is a category of open-ended analysis tools. Instead of asking them questions, you hand these tools reams of data and ask for someth
ing "interesting" in return -- a trend or a peculiar grouping, for instance. The process of data mining extracts hidden knowledge or predictive information from the data warehouse without requiring specific questions or requests.
Mining tools use some of the most advanced computing techniques -- neural networks, rule discovery, deviation detection, predictive modeling, and genetic programming -- to derive patterns and associations. Mining is a data-driven, rather than an application-driven, approach.
IBM's Intelligent Miner for AIX supports sophisticated mining techniques as well as data preparation functions to extract information from Oracle or Sybase databases and load it into DB2 for mining. With its Data Mine option for the Red Brick Warehouse 5.0 engine, Red Brick integrates data mining functionality and a warehousing architecture. Other examples of commercial data mining tools include Darwin from Thinking Machines, data visualization tools in SAS Institute's MDDB, SGI MineSet, and Focus 6 Analy
sis and Visualization suite from Information Builders.
The Right Fit
There are some obvious rules to follow when choosing analysis tools. Match tools to end-user needs, corporate technical expertise, and your existing data sources. If you selected a warehouse vendor that also offers integrated tools, you will probably save significant development time by choosing a compatible tool set. Otherwise, select a tool set that supports your data sources natively. Without such support, you should opt for a relational OLAP solution since it does provide a more open architecture.
After you've selected a tool set compatible with your data sources, determine how much analysis you really need. If you just need to know "how much" or "how many," a basic query and reporting tool should be sufficient. If you require more advanced analysis that explains the cause and effect of occurrences and trends, look to an OLAP solution. Sophisticated data mining tools require expertise in data analysis techniq
ues and are needed for advanced forecasting, classification, and pattern discovery.
As with any technology, no single solution or set of solutions can make your company perform better. Your staff must understand the requirements of technology, develop custom solutions that meet those requirements, and effectively maintain and upgrade the systems. Business intelligence software is only a tool. You still need managers and executives who can grasp the knowledge derived and make insightful decisions. In other words, business intelligence software still requires basic business intelligence.
Where to Find
Arbor Software
Sunnyvale, CA
Phone: (800) 858-1666 or (408) 727-5800
Fax: (408) 727-7140
Internet:
http://www.arbo
rsoft.com