on making, and it enables new business-automation services that respond automatically to specific business events. In this article, I explore these RT/DSSes.
Decision-Support Services
The advent of relational databases enabled general-purpose DSSes. With these tools, you could simply ask decision-support queries instead of having to program them. Similarly, the advent of event-driven technology enables new decision-support tools that can consume and analyze events almost instantaneously. With these tools you can ask powerful, real-time queries, using a high-level query language, instead of having to program them.
General-purpose tools that provide real-time query processing, referred to as RT/DSSes, are becoming commercially available; Vitria's Martini product is one example. Now, the manager of a large shipping hub for an express package-shipping company can request "to monitor shipment volume and changes thereto as new packages are picked up and existing packages rerouted." Unlike database queries, real-time queries are typically long-lived: a query may live from hours to months, depending on the information being monitored. Also, an RT/DSS must support concurrent query processing, because it's common to have thousands of real-time queries active all at once.
RT/DSS query processing is fundamentally different from traditional query processing. Rather than optimizing the bulk evaluation of a single on-line query across a large number of records, an RT/DSS optimizes the incremental evaluation of a single event (i.e., a single data change, say, a transaction) against a large number of continuous queries. Hence, an RT/DSS has to solve two hard query-optimization problems not found in traditional query processing.
The first problem is incremental query optimization, which consists of algorithms for optimizing the ongoing evaluation of a single query (e.g., track the sales of red shirts sold in Copenhagen). The second problem is multiquery optimization, which consists of algorithms for optimizing the simultaneous, incremental evaluation of multiple queries (e.g., separately track, for each market in Europe, the sales of all red shirts).
To illustrate just how an RT/DSS works, consider a real-time query. Using the package-shipping example, you monitor package information that changes on a real-time basis, particularly as shipments are delayed or rerouted. At every destination city, it is important to monitor package volume and weight to properly allocate delivery equipment. Consider the real-time query: "Monitor the total weight, per destination city, of all large, priority packages."
This is expressed in SQL as:
select city.name, sum(package.weight)
from package /*real-time*/, city/*stored*/
where package.weight>100
and package.service = 'priority'
and package.zip = city.zip
group by city.name
Although it's simply expressed, this query is subtly complex. It joins real-time, dynamically changing shipping-information events (the packages en route) with stored information (cities) that resides in a traditional database system. It contains a number of query constraints (priority service and weight over 100 pounds). It requires the grouping of this information by city and also the incremental computation of weight.
Real-time query optimization consists of three steps, as shown in the figure "Anatomy of an RT/DSS."
The first step is to build a discrimination network that evaluates the query constraints. When a real-time event is received, the RT/DSS evaluates the event information against the discrimination network to efficiently identify all real-time queries whose constraints match the new information.
Note again that a potentially large number of constraints from a large number of concurrent queries might be tested. Hence, the discrimination network must be optimized so that constraints are tested in a way that quickly identifies matching queries and discards those that don't match.
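The idea of testing shared constraints once can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the algorithm any particular RT/DSS product uses: the class name `DiscriminationNetwork`, the constraint keys, and the query IDs are all hypothetical, and a production network would also order and prune its tests rather than scan a flat index.

```python
from collections import defaultdict


class DiscriminationNetwork:
    """Toy constraint index: each distinct constraint is tested once per event."""

    def __init__(self):
        self.constraints = {}                # constraint key -> predicate function
        self.subscribers = defaultdict(set)  # constraint key -> ids of queries using it
        self.needed = {}                     # query id -> number of constraints it has

    def add_query(self, query_id, constraints):
        # constraints maps a key to a predicate; queries using the same key share one test
        self.needed[query_id] = len(constraints)
        for key, predicate in constraints.items():
            self.constraints.setdefault(key, predicate)
            self.subscribers[key].add(query_id)

    def match(self, event):
        hits = defaultdict(int)
        for key, predicate in self.constraints.items():
            if predicate(event):  # evaluated once, however many queries share it
                for query_id in self.subscribers[key]:
                    hits[query_id] += 1
        # A query matches only when every one of its constraints held
        return sorted(q for q, n in self.needed.items() if hits[q] == n)


def is_heavy(event):
    return event["weight"] > 100


def is_priority(event):
    return event["service"] == "priority"


net = DiscriminationNetwork()
net.add_query("heavy_priority", {"weight>100": is_heavy, "service=priority": is_priority})
net.add_query("all_heavy", {"weight>100": is_heavy})

# A heavy package sent by ground service matches only the second query
matched = net.match({"weight": 150, "service": "ground"})
```

Here the `weight>100` test runs once per event even though two queries depend on it, which is the saving that matters when thousands of queries share constraints.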
Step two is to derive incremental algorithms for computing the final query result. Once a query's constraints have been matched against the incoming event, the query's result needs to be incrementally evaluated. Continuing with the shipping example, if a package is rerouted to a new destination city, its weight must be subtracted from the old destination city's total and added to the total for the new destination city.
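The rerouting case can be sketched as a delta update on per-city running totals. The following Python is a minimal sketch, assuming each event carries a package ID, a destination city, and a weight; the class `CityWeightTotals` and the sample package IDs are hypothetical names used only for illustration.

```python
from collections import defaultdict


class CityWeightTotals:
    """Maintains sum(weight) per destination city incrementally."""

    def __init__(self):
        self.totals = defaultdict(float)  # city -> running total weight
        self.current = {}                 # package id -> (city, weight) last counted

    def on_event(self, package_id, city, weight):
        # Apply one pickup or reroute event as a delta; never rescan all packages
        previous = self.current.get(package_id)
        if previous is not None:
            old_city, old_weight = previous
            self.totals[old_city] -= old_weight  # subtract from the old destination
        self.totals[city] += weight              # add to the new destination
        self.current[package_id] = (city, weight)


totals = CityWeightTotals()
totals.on_event("pkg-1", "Chicago", 120.0)  # new pickup
totals.on_event("pkg-2", "Chicago", 150.0)  # new pickup
totals.on_event("pkg-1", "Denver", 120.0)   # pkg-1 rerouted to Denver
```

Each event costs a constant amount of work regardless of how many packages are in transit, in contrast to re-running the grouped sum over every package.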
Step three is to prefetch and precompute, whenever possible, query expressions involving stored data and to cache the results in memory. This is done because directly accessing a database on each receipt of a real-time event is expensive, and, at high data rates, a database system simply can't keep up. Prefetching and precomputation let the RT/DSS overcome this bottleneck, speeding up both constraint matching and incremental result computations.
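For the shipping query, the stored side of the join is the mapping from zip code to city, so prefetching could look like the Python sketch below. It assumes the city table is small enough to hold in memory and changes rarely; `load_all_cities` is a hypothetical stand-in for one bulk database read, and the zip codes are sample data.

```python
class CityCache:
    """Holds the stored side of the join (zip -> city name) in memory."""

    def __init__(self, load_all_cities):
        # load_all_cities is a hypothetical callable that performs one bulk read
        self.zip_to_city = dict(load_all_cities())

    def city_for(self, zip_code):
        # Pure in-memory lookup; no database round trip per event
        return self.zip_to_city.get(zip_code)


db_reads = {"count": 0}  # counts how often the "database" is touched


def load_all_cities():
    db_reads["count"] += 1
    return [("60601", "Chicago"), ("80202", "Denver")]


cache = CityCache(load_all_cities)
events = [{"zip": "60601"}, {"zip": "80202"}, {"zip": "60601"}]
cities = [cache.city_for(e["zip"]) for e in events]
```

Three events are joined against the city data with a single database read. A real system would also need some way to refresh the cache when the stored data changes, which this sketch leaves out.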
Scaling
An RT/DSS service must be scalable in two dimensions: the number of concurrent queries it can process, and the number of consuming applications and users to which it can distribute query results. An RT/DSS uses the underlying ECS both to receive the events that drive real-time query processing and to deliver the results of those queries to interested consumers.
For the package-shipping example, the RT/DSS processes a large number of queries (evaluating the number of heavy packages being shipped from various cities) and distributes the results (to all destination cities). Hence, as an RT/DSS computes the query results, it simply publishes the results on the underlying ECS.
A beneficial side effect of using an ECS is that it enables real-time queries to be composed. That is, the results of one query can be the input to another query, as shown in the figure "Building Complex Queries." Such daisy chaining of results permits the building of complex query results incrementally by leveraging the results of simpler queries.
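Composition through a publish/subscribe channel can be sketched as follows. This Python is an in-process stand-in, not an actual ECS: the `EventBus` class, the topic names, and the 250-pound alert threshold are assumptions chosen for illustration.

```python
from collections import defaultdict


class EventBus:
    """Stand-in for the ECS: in-process, topic-based publish/subscribe."""

    def __init__(self):
        self.handlers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.handlers[topic].append(handler)

    def publish(self, topic, message):
        for handler in list(self.handlers[topic]):
            handler(message)


bus = EventBus()
city_totals = defaultdict(float)
alerts = []


def total_by_city(pkg):
    # First query: running weight per city, republished as a result event
    city_totals[pkg["city"]] += pkg["weight"]
    bus.publish("city_totals", {"city": pkg["city"], "total": city_totals[pkg["city"]]})


def overloaded_cities(update):
    # Second query: consumes the first query's results, not the raw packages
    if update["total"] > 250:  # hypothetical threshold
        alerts.append(update["city"])


bus.subscribe("packages", total_by_city)
bus.subscribe("city_totals", overloaded_cities)

bus.publish("packages", {"city": "Chicago", "weight": 120.0})
bus.publish("packages", {"city": "Chicago", "weight": 150.0})
```

The second query never sees package events. It is built entirely on the published output of the first, which is the daisy chaining described above.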
To scale the number of concurrent real-time queries, an RT/DSS uses a federated architecture. If more processing power is required, you simply deploy new RT/DSS servers and rebalance the concurrent queries among the new servers.
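One simple way to picture rebalancing is a round-robin reassignment of queries to servers. The article does not say which placement policy an RT/DSS uses, so the Python below is only an assumed illustration; the function `rebalance` and the server names are hypothetical, and the sketch ignores per-query load and the cost of moving query state between servers.

```python
def rebalance(query_ids, servers):
    """Spread queries evenly across servers (round-robin over a sorted list)."""
    assignment = {server: [] for server in servers}
    for i, query_id in enumerate(sorted(query_ids)):
        assignment[servers[i % len(servers)]].append(query_id)
    return assignment


queries = [f"q{n}" for n in range(6)]

# Two servers carry three queries each
before = rebalance(queries, ["rtdss-1", "rtdss-2"])

# Deploying a third server drops the load to two queries per server
after = rebalance(queries, ["rtdss-1", "rtdss-2", "rtdss-3"])
```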
Real-Time Reaction
One of the most practical results of an RT/DSS is that it enables business automation. For package shipping, an application with the proper event hooks could immediately respond to too many heavy packages delayed at Chicago by rerouting an aircraft there. Also, if the problem persists, the program could notify a supervisor to investigate, so that the company could change certain shipping routes.
In either case, responses to problems are immediate, rather than taking days or weeks for the problem to be discovered, much less resolved. By using Internet-based, distributed object standards, these new RT/DSS technologies can change the speed of business computing.
Figure: Real-time query optimization uses three mechanisms (green) to expedite processing.
Figure: Because an ongoing query's results can be sent to other servers, complex queries can be built out of them.
Dr. Dale Skeen (skeen@vitria.com) is CTO and cofounder of Vitria Technology, Inc.