Publish or Perish


September 1997 / Features / Publish or Perish

Solutions to overworked networks and unruly software distribution are just part of P&S.

Richard Hackathorn

A large stock-brokerage firm on Wall Street has difficulty getting the right data to its traders. It implements a publish and subscribe (P&S) trading system to distribute general news and stock information. A successful software vendor of CAD tools has a big problem with handling customer support. It designs a P&S problem-tracking system to manage the daily flow of thousands of customer requests. A global manufacturer of oil-drilling equipment has a messy situation tracking material and finished goods on its shop floor. It installs a P&S materials-handling system that tracks jobs at each step in the manufacturing process.

Notice a pattern? All these involve complex business processes with ever-changing objectives. Sounds a bit like your business? If so, P&S may solve your problems.

No More R&R

For 30 years, the basic paradigm of computing has been request and reply (R&R). An application requests specific data or services, and a subroutine replies with it.

But the R&R paradigm is running out of gas. In the dynamic and uncontrolled environments of present-day enterprise systems, an application no longer has the luxury of knowing when and what to request.

P&S coordinates the components of distributed applications. The concept started hundreds of years ago with newspaper publishing. Recently, it has been applied to a variety of products that coordinate complex distributed applications or replicate diverse information content. P&S is a connectivity paradigm that separates the role of producer from consumer via an intermediary, called the broker (see the figure "R&R vs. P&S"). The broker manages the interactions so that neither the producer nor consumer need know much about the other. The architecture is decoupled or loosely coupled.

With P&S, a relationship is maintained by the broker to couple producers with consumers, as contrasted with a momentary interaction of R&R. This relationship is called a channel (or subscription, subject, or buffer). By reversing the ordering from R&R, the producer initiates the interaction by publishing a message to the broker.

The traditional way of linking data producers with data consumers is to design the system so that those links are static -- hard-wired into module linkages and procedural calls. As we move into increasingly dynamic and complex environments, we no longer have the luxury of hard-wiring those links. Producers and consumers often appear and vanish. We need a mechanism to efficiently match producers with consumers in a dynamic fashion. Adding flexibility and adaptation to system architectures is the role that P&S is fulfilling.

Three Business Problems

Because the P&S arena is emerging from several complex technologies, it is very confusing. The terminology is nonstandard, and everyone uses the terms message, channel, and event with subtle variations in meanings.

To understand the technologies and terminology better, you must understand the business problems that P&S aims to solve: coordinate processes, replicate content, and inform people (see the table "P&S Solves Business Problems").

Coordinate processes. Typically, this means tracking a business activity. Cutting quite deep, P&S has become an alternative to traditional application-development methodologies -- a different way of thinking about system architectures. Rather than catering to a logically centralized database, P&S is used as the event-driven coordination of applications through the distribution of messages. The focus is on significant changes that occur in business processes, such as a customer ordering a product. Once the message flows from the producer to the consumer, it is treated as nonpersistent (i.e., thrown away).

Replicate content. Somewhere there exists a persistent data store, such as a relational database or Web site. The information stream is closely linked to some part of that persistent store and represents the changes that are occurring within it.

Inform people. This is the essence of newspaper publishing, but P&S shifts the activity into a global scope and customizable context.

These three areas are similar in many respects. "We distinguish between content-push versus process-push," says Mike Kennedy of the Meta Group.

Coordinating processes often assumes that a database is part of the system. Fulfilling a customer order assumes a database of customers and inventory. Replicating content is driven by events that change the database. The inventory database changes because customer orders are being fulfilled. And informing people assumes a common knowledge base and a world that is constantly changing.

Coordinating processes usually requires a strongly typed message structure and may or may not be closely associated with a common database. Replicating content usually assumes a strong linkage to a database of some sort. It can vary greatly in the degree of message structure (from SQL INSERT statements to refreshed Web pages). Finally, informing people usually has a low message structure, which may or may not require a common database to understand messages.

The Emerging Architecture

P&S is a coordination mechanism that matches and links producers with consumers mediated by a broker. Producers are sometimes called providers or publishers, and consumers are sometimes called subscribers.

"The role of broker is critical with P&S," states Mitch Kramer of the Patricia Seybold Group. "It decouples producers from consumers so that they don't need to know about each other."

The broker establishes a channel to manage a stream of similar messages. Channels relieve the producer or consumer of the burden of staying current (see the figure "Free Subscription"). The broker maintains a channel as long as a producer publishes or a consumer subscribes. This duration may last from a few seconds up to a few years.

By decoupling the producer-consumer relationship, the security of both parties can be enhanced, allowing either one to participate anonymously. Producers could also share or transfer subscriptions to balance loads or specialize in certain areas. Further, the P&S mechanism could form multilevel value-added chains in which a consumer can add value to the data and republish the result to another group of consumers.

A message is usually divided into a header -- structured data common to all message types -- and a body -- variable data specific to a certain message type. The body may contain free-form text, HTML Web pages, attribute-value pairs, and such.
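
The header/body split can be pictured with a small data structure. This is only a sketch: the field names (subject, sender, sequence) and the sample subjects are assumptions chosen for illustration, not taken from any product discussed here.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Header:
    # Structured fields common to every message type (names are illustrative).
    subject: str
    sender: str
    sequence: int


@dataclass(frozen=True)
class Message:
    header: Header
    # Variable part: free-form text, an HTML page, or attribute-value pairs.
    body: Any


# Same header layout, two very different bodies.
quote = Message(Header("quotes.IBM", "feed-1", 42), {"bid": 101.5, "ask": 101.75})
note = Message(Header("news.markets", "desk-7", 43), "Markets opened higher.")
```

A broker can route both messages by reading only the header, without understanding either body.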

Finally, a market is formed when a high level of activity occurs among a group of producers and consumers over one or more channels. Like the dynamics of normal markets, the dynamics of markets in P&S systems are a major indicator of the directions in which these systems should evolve.

Basic Interactions

Here's how it works (see the figure "The Publish and Subscribe Architecture"). A producer registers with the broker for a specific channel. This action may cause the broker to create the channel and establish its characteristics. Consumers inquire about available channels. If a desired one is found, the consumer subscribes to it. Later, the producer publishes a message to a channel. The broker delivers the message to the proper set of consumers subscribing to the channel.
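
The register, inquire, subscribe, publish, and deliver steps can be sketched as a toy in-process broker. The class and method names are hypothetical, and a real broker would sit on a network rather than call functions directly.

```python
from typing import Callable, Dict, List


class Broker:
    """Toy in-process broker: each channel maps to a list of consumer callbacks."""

    def __init__(self) -> None:
        self._channels: Dict[str, List[Callable[[str], None]]] = {}

    def register(self, channel: str) -> None:
        # A producer registers; the broker creates the channel if it is new.
        self._channels.setdefault(channel, [])

    def list_channels(self) -> List[str]:
        # Consumers inquire about available channels.
        return sorted(self._channels)

    def subscribe(self, channel: str, deliver: Callable[[str], None]) -> None:
        if channel not in self._channels:
            raise KeyError(f"no such channel: {channel}")
        self._channels[channel].append(deliver)

    def publish(self, channel: str, message: str) -> int:
        # The broker, not the producer, fans the message out to subscribers.
        consumers = self._channels[channel]
        for deliver in consumers:
            deliver(message)
        return len(consumers)


broker = Broker()
broker.register("orders.new")                  # producer registers
inbox_a: List[str] = []
inbox_b: List[str] = []
if "orders.new" in broker.list_channels():     # consumers inquire
    broker.subscribe("orders.new", inbox_a.append)
    broker.subscribe("orders.new", inbox_b.append)
delivered = broker.publish("orders.new", "order 1001 placed")
```

Note that the producer never learns who the consumers are; it sees only the channel name.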

In some situations, direct interaction between a producer-consumer pair is desirable. Such a direct link is required for highly volatile or massive data, along with applications requiring efficient high-volume transactional semantics.

A final aspect of the above interactions is the possible monetary exchange among producers, consumers, and the broker. As critical systems extend beyond the boundaries of a company, an explicit financial incentive must be established to ensure stable operations. Although electronic commerce is rapidly increasing in various areas, no examples of monetary exchange within P&S systems have yet appeared.

Practical P&S

There are several key issues to resolve in any practical application of P&S. How do you define channels? Set level of service? Privileges?

Channels and namespace. The first issue is defining channels, especially the namespace. A channel represents a stream of important business events or information resources. Defining your channels implies defining your business processes. Likewise, naming (or addressing) your channels implies how the P&S applications will support your business processes. Most argue that the naming should be federated, so that there is a shared responsibility among producers and the broker, similar to domain names in the Internet.
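
One way to picture federated naming is a dot-separated namespace in which prefixes are delegated to owners, much as DNS zones are. The sketch below assumes such a scheme; the class, the channel names, and the owner names are all hypothetical.

```python
from typing import Dict, Optional, Tuple


class Namespace:
    """Dot-separated channel names with per-prefix owners (hypothetical scheme)."""

    def __init__(self) -> None:
        self._owners: Dict[Tuple[str, ...], str] = {}

    def delegate(self, prefix: str, owner: str) -> None:
        # Hand responsibility for everything under `prefix` to `owner`.
        self._owners[tuple(prefix.split("."))] = owner

    def owner_of(self, channel: str) -> Optional[str]:
        parts = tuple(channel.split("."))
        # The longest matching prefix wins, as with delegated DNS zones.
        for end in range(len(parts), 0, -1):
            owner = self._owners.get(parts[:end])
            if owner is not None:
                return owner
        return None

    def may_create(self, channel: str, requester: str) -> bool:
        # Only the owner of the enclosing prefix may create a channel there.
        return self.owner_of(channel) == requester


ns = Namespace()
ns.delegate("acme", "broker-admin")
ns.delegate("acme.sales", "sales-team")
```

Here the broker administrator owns the top of the tree, while the sales group names its own channels beneath acme.sales.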

The message header usually contains a structured field for a subject (or object type name). If the naming of messages uses this subject field, the P&S mechanism is subject-based. In contrast, if the naming is dependent on the content of the message body, it is content-based. Subject-based is more efficient, while content-based enables more flexibility for the consumer to specify which messages are processed. Content-based naming may also imply that the message body has some self-defining format so that the broker can filter on various equivalence operators in addition to simple string matches.
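
The difference between the two styles shows up in what the broker has to inspect. In this sketch, which assumes messages are plain dictionaries with a subject field and an attribute-value body, a subject-based subscription compares one header string, while a content-based subscription evaluates an operator against the body. All names are illustrative.

```python
from typing import Any, Callable, Dict, List, Tuple

Message = Dict[str, Any]  # assumed shape: {"subject": str, "body": dict}
Filter = Callable[[Message], bool]


def subject_match(subject: str) -> Filter:
    # Subject-based: the broker looks at a single header field.
    return lambda msg: msg["subject"] == subject


def content_match(attr: str, op: str, value: Any) -> Filter:
    # Content-based: the broker inspects a self-describing body.
    ops = {
        "==": lambda a, b: a == b,
        ">": lambda a, b: a > b,
        "<": lambda a, b: a < b,
    }
    compare = ops[op]
    return lambda msg: attr in msg["body"] and compare(msg["body"][attr], value)


def route(msg: Message, subscriptions: List[Tuple[str, Filter]]) -> List[str]:
    # Return the names of all subscriptions whose filter accepts the message.
    return [name for name, accepts in subscriptions if accepts(msg)]


subscriptions = [
    ("ticker", subject_match("quotes.IBM")),
    ("big-trades", content_match("volume", ">", 10_000)),
]
msg = {"subject": "quotes.IBM", "body": {"price": 101.5, "volume": 25_000}}
small = {"subject": "quotes.DEC", "body": {"price": 40.0, "volume": 500}}
```

The content-based filter costs more per message because the body must be parsed, which is the efficiency trade-off described above.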

QoS. The second issue is the level (or quality) of service, usually dependent on the reliability of message delivery. The typical levels of service are best-effort, reliable, guaranteed, and transactional. Best-effort implies that the broker uses an efficient (but without error correction) transport, such as UDP. Reliable implies that the broker uses a less efficient (but with error correction) transport, such as TCP. Guaranteed implies that the broker queues the message on permanent storage until it is ensured that the consumer has received the message.

Finally, transactional implies that the broker manages a transaction among the producers and consumers, so that any actions by all parties are committed or aborted in unison. Among the various products, the scope of the transaction boundary is confusing and depends on whether the perspective is from the producer or consumer viewpoint.
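
Guaranteed delivery, for instance, comes down to the broker holding each message until every subscriber has acknowledged it. A minimal sketch follows, with an in-memory dictionary standing in for permanent storage; the class and method names are assumptions, not any vendor's API.

```python
from typing import Dict, List, Set, Tuple


class GuaranteedChannel:
    """Keeps each message until every subscriber has acknowledged it."""

    def __init__(self, subscribers: List[str]) -> None:
        self._subscribers = list(subscribers)
        # Stand-in for permanent storage: id -> (payload, subscribers still waiting).
        self._store: Dict[int, Tuple[str, Set[str]]] = {}
        self._next_id = 0

    def publish(self, payload: str) -> int:
        msg_id = self._next_id
        self._next_id += 1
        self._store[msg_id] = (payload, set(self._subscribers))
        return msg_id

    def pending_for(self, subscriber: str) -> List[Tuple[int, str]]:
        # What the broker must send or resend, e.g. after a reconnect.
        pending = []
        for msg_id in sorted(self._store):
            payload, waiting = self._store[msg_id]
            if subscriber in waiting:
                pending.append((msg_id, payload))
        return pending

    def ack(self, subscriber: str, msg_id: int) -> None:
        payload, waiting = self._store[msg_id]
        waiting.discard(subscriber)
        if not waiting:
            del self._store[msg_id]  # everyone has it, so it can be forgotten
```

A best-effort channel would skip the store entirely, which is why it is cheaper and why a crashed consumer simply misses messages.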

Pull vs. push. The topic of push protocols has received much industry visibility recently as the preferred alternative. The problem occurs when the broker sends a message to the consumers that have subscribed to that message. If the number of consumers is small, each one can pull its message from the broker via periodic polling, or the broker can send the message multiple times, once for each consumer. As the number of consumers rises to millions, both approaches rapidly degrade network performance. In other words, approaches using pull and also simple push do not scale. Mark Bowles of TIBCO notes, "Scalability is poor for simple point-to-point solutions."

The essence of true push for P&S is twofold. First, the consumer receives its message asynchronously. An interrupt occurs at some level to switch the consumer's attention to the new message; there is no background polling by the consumer. Second, the message is multicast by the broker to many consumers. The broker initiates a message that is efficiently distributed to the proper consumers. Efficient multicasting implies hardware assistance buried in network routers, hence limiting networks to homogeneous equipment. At the heart of the debate over efficient multicasting is IP multicasting for TCP/IP (see "Multicast to the Masses," June BYTE).
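
A rough count of the packets the broker must handle makes the scaling argument concrete. The model below is a deliberate simplification: it ignores acknowledgments, retransmissions, and multicast group management, and the function name and the sample numbers are assumptions.

```python
from typing import Dict


def broker_packets(consumers: int, messages: int, polls_per_consumer: int) -> Dict[str, int]:
    """Packets handled by the broker over some period, under a simplified model."""
    return {
        # Pull: every poll is a request plus a reply, whether or not there is news.
        "pull": 2 * consumers * polls_per_consumer,
        # Simple push: one unicast copy per consumer per message.
        "unicast_push": consumers * messages,
        # Multicast push: the broker sends each message once; routers replicate it.
        "multicast_push": messages,
    }


# One million consumers, ten messages in an hour, each consumer polling once a minute.
load = broker_packets(consumers=1_000_000, messages=10, polls_per_consumer=60)
# load == {"pull": 120000000, "unicast_push": 10000000, "multicast_push": 10}
```

Under these assumptions, pull and unicast push both grow with the number of consumers, while the multicast load on the broker depends only on the number of messages.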

Privileges. The fourth issue is specifying and managing privileges for producers and consumers. As in a database system, it is necessary to have a secured environment in which all parties are authenticated and then assume a set of privileges that limit their actions.

Configuration. The final issue is the configuration for the P&S architecture. Vendors typically describe their implementations in terms of bus, hub and spoke, and snowflake (see the figure "Lay of the Land").

Key Players

P&S is emerging from many diverse product categories (see the figure "Where They Fit"). There is a rapid blurring among categories caused by normal market pressures. As P&S matures, these categories may become useful only for historical background.

Messaging. Messaging transports (also called message-oriented middleware, or MOM) start with simple protocols for sending a message packet from point A to point B in a reliable and efficient manner. The inherent store-and-forward mechanism of message transports has been extended in numerous ways, one of which is P&S. As an outgrowth of sending one message from point A to many point Bs, the idea of shared buffers and subscribers emerged.

Some products are TIB/Rendezvous, from TIBCO; Velociti, from Vitria; SmartSockets, from Talarian; NEONet, from New Era of Networks; and ActiveWeb, from Active Software.

Since 1986, TIBCO (formerly Teknekron, now part of Reuters) has established a client base in the trading systems of Wall Street with its middleware, The Information Bus (TIB). Using a subject-based naming scheme, TIB/Rendezvous multicasts packets so that only selected destinations receive the packet, usually by hardware-assisted IP multicasting.

Vitria's Velociti is a newcomer that takes direct aim at TIBCO. It broadens protocol support beyond IP multicasting by adding support for Common Object Request Broker Architecture (CORBA) IIOP. "The key issues are defining the channel and event scheme, along with specifying the required quality of service," states Dale Skeen, cofounder and chief technologist.

SmartSockets, from Talarian, an industry veteran since 1989, emphasizes its ability to provide fault tolerance and unlimited scalability in traditional mainframe and Unix environments, along with NT. Tom Laffey, cofounder and CTO of Talarian, says, "For load balancing, SmartSockets can push messages to the subscriber that is least busy."

NEONet has a message broker controlled by a rule-driven engine that transforms the message flow. Consumers create subscriptions that are based on message content, rather than predefined naming or categorization by the publisher.

At the fringes of messaging is ActiveWeb, which adopts a strong Web flavor with Java-based tools. Rafael Bracho of Active Software says, "The focus of ActiveWeb is on attacking the heterogeneity problem by integrating diverse legacy systems and loosely coupled information resources." The configuration is hub and spoke, with the spokes as a variety of adapters into information resources.

As Evan Bauer of Giga notes, "The most frequent implementation of P&S à la messaging is the homegrown variety using IBM MQ." A popular alternative for many knowledgeable customers is to take a mature messaging product and add P&S functionality in the application itself. IBM recognizes this situation and is increasing the P&S services in MQ.

Distributed objects. The concept of a broker achieved industry visibility with the Object Management Group's (OMG) specification for CORBA, which built on the classical remote procedure call (RPC). The OMG extended CORBA to include a large set of object services, two of which are relevant to P&S: Object Event Notification Service (push/pull events to/from channels) and Object Naming Service (bind IDL-like [interface definition language] names to a context similar to a Unix directory tree). The OMG is considering several proposals to flesh out these services for full P&S support.

Examples of object request brokers (ORBs) include Orbix, from Iona Technologies; Entera, from Borland Open Environment; and DataBroker, from I-Kinetics. Several have extended the Object Event Notification Service within CORBA to support P&S, such as Open Horizon's Ambrosia.

Nicholas Zaldastani, CEO of Open Horizon, emphasizes that its focus is on handling significant events that affect your business. "Developers must learn to exploit event-based infrastructures and properly design the namespace for event routing," he remarks.

Another contender for distributed objects is Microsoft's Distributed Component Object Model (DCOM), which forms the foundation for the ActiveX technology. Currently, there is no indication that ActiveX is adopting the P&S approach, although Microsoft's MSMQ (formerly Falcon), SQL Server Replication, and CDF-based Webcasting (described below) are close.

Transaction monitors. Transaction monitors evolved from database and large transaction-processing systems, such as IBM's IMS and Customer Information Control System (CICS) suites. The focus is on distributed transactions across multiple sites based on two-phase commit protocols (2PCs).

Tuxedo, now from BEA Systems, is a classic example of this category. In a way similar to CORBA object services, the event management of Tuxedo has been extended. "The 2PC is integrated into P&S and can coordinate among a variety of resources," states Ed Felt of BEA. "The provider can post a message to a broker that acts as a consensus taker. If all subscribers agree, the provider is allowed to commit its transaction." In addition, BEA has partnered with Digital Equipment to incorporate MessageQ, ObjectBroker, and SAP R/3 Wrapper into its product line.
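
The consensus-taker idea in that quote follows the general shape of two-phase commit: collect a vote from every subscriber, then announce a single outcome. The sketch below illustrates that general pattern only; it is not Tuxedo's API, and every class and function name in it is hypothetical.

```python
from typing import Callable, List


class Subscriber:
    def __init__(self, name: str, can_accept: Callable[[str], bool]) -> None:
        self.name = name
        self._can_accept = can_accept
        self.state = "idle"

    def prepare(self, message: str) -> bool:
        # Phase 1: vote on whether this subscriber can honor the message.
        ok = self._can_accept(message)
        self.state = "prepared" if ok else "refused"
        return ok

    def finish(self, commit: bool) -> None:
        # Phase 2: apply the broker's single decision.
        self.state = "committed" if commit else "aborted"


def post(message: str, subscribers: List[Subscriber]) -> bool:
    """Broker as consensus taker: True means the provider may commit."""
    votes = [s.prepare(message) for s in subscribers]  # ask everyone, no short-circuit
    decision = all(votes)
    for s in subscribers:
        s.finish(decision)
    return decision


billing = Subscriber("billing", lambda m: True)
shipping = Subscriber("shipping", lambda m: "rush" not in m)
ok = post("order 1001", [billing, shipping])
rejected = post("rush order 1002", [billing, shipping])
```

A production transaction monitor also has to survive crashes between the two phases, which this sketch leaves out.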

Application Integration Server from Intermezzo Systems has a message broker driven by a transaction-processing monitor that coordinates several applications to accomplish a business activity.

Newsgroups. Lest we forget, good old e-mail has had P&S elements for a long time. Via group mailing lists, a producer (sender) can multicast a message to multiple consumers (recipients), who receive the message asynchronously. Add to that the concept of a BBS, and we have the Internet newsgroups, which are alive and healthy amid Web frenzy. Newsgroup creation and threaded messages are important concepts to be absorbed into P&S.

Work flow. Work-flow (or groupware) systems track a work item as it flows through the functional units of an organization. Through some combination of a centralized control database and structured e-mail messages, the responsibility for a work item passes from one person to another. There is now a strong convergence of traditional work-flow systems with messaging and distributed objects, thus solving the problem of implementing large-scale work-flow systems in an adaptive and incremental fashion.

As a P&S pioneer, Apple designed its Interapplication Communication (IAC) around a P&S variation for document management. A publisher shares a section of a document (e.g., a spreadsheet). A subscriber obtains this content for another document. The Edition Manager maintains the shared section within an edition container. Thus, users can change a document, and the changes are propagated to subscriber documents.

Another product is NewsStand from Lotus. It extends Notes onto the Web by publishing Notes templates and managing the security and approval of subscriptions. Several publications, such as BNS's Banking Report, use NewsStand for their electronic distribution.

Webcasting. This category has been "pushed" into the industry's limelight recently. Webcasting (or Web publishing) is using Web technology to deliver recurring information through a push protocol. Products are PointCast, Marimba's Castanet, BackWeb, I-Fusion, and DataManager.

DataManager, from DataChannel, is adopting TIBCO's technology and emphasizes its ability to efficiently multicast IP packets, thus allowing scalability within large intranet environments. David Pool, president of DataChannel, says, "But what is very elegant architecture is the way that TIBCO sends out one packet and everybody listens for it. It is very lightweight and economical."

Recognizing the importance of standardizing Webcasting, Microsoft has submitted a proposal to the World Wide Web Consortium (W3C) for its Channel Definition Format (CDF) technology that uses the Extensible Markup Language (XML). The proposal separates Webcasting into three levels: basic, managed, and true. Basic is simply the periodic probing (crawling) of specific sites of interest. Managed and true Webcasting use a CDF file, so that a consumer has a road map to the site as defined by the content provider. As stated in the Microsoft CDF white paper, "The CDF allows an author to optimize, personalize, and fully control how a site is Webcast." To Webcast a site, the content provider would create a CDF file at the root Web directory to sketch a road map to key topics at the site.

Database replication. Distributed database systems that need to synchronize with a primary version require a replication scheme that reliably distributes a mixture of full-image and delta-image copies. Products are Data Propagator, from IBM; Replication Server, from Sybase; and SQL Server Replication, from Microsoft.
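
The mixture of full-image and delta-image copies can be sketched as a replica that is first loaded from a snapshot and then kept in step by a stream of changes. The operation names and the row layout are assumptions for illustration, not the format of any product listed above.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, object]
Delta = Tuple[str, int, Optional[Row]]  # (operation, primary key, new row or None)


def apply_full_image(snapshot: Dict[int, Row]) -> Dict[int, Row]:
    # Full image: the replica is replaced by a copy of the primary.
    return {key: dict(row) for key, row in snapshot.items()}


def apply_deltas(replica: Dict[int, Row], deltas: List[Delta]) -> Dict[int, Row]:
    # Delta image: replay only the changes, in the order they were published.
    for op, key, row in deltas:
        if op == "delete":
            replica.pop(key, None)
        elif op in ("insert", "update"):
            if row is None:
                raise ValueError(f"{op} needs a row")
            replica[key] = dict(row)
        else:
            raise ValueError(f"unknown operation: {op}")
    return replica


primary = {1: {"sku": "A", "qty": 5}}
replica = apply_full_image(primary)
replica = apply_deltas(replica, [
    ("update", 1, {"sku": "A", "qty": 3}),
    ("insert", 2, {"sku": "B", "qty": 9}),
    ("delete", 1, None),
])
```

In P&S terms, the log of deltas is the channel, the primary's log manager is the producer, and each replica is a subscriber.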

Software distribution. This category is a major thorn in the side for network and PC managers. As the number of workstation software suites soars, the need for effective software distribution enterprise-wide also soars. P&S seems to be an appropriate paradigm for software distribution, because a channel is a specific package while a consumer would subscribe to the software operating on its workstation. One Webcast product, Castanet, handles software distribution like Web content. A tuner at a workstation polls the transmitter server for differential updates to software modules (even to the tuner software itself).
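
A differential update can be computed by comparing a manifest of content hashes on each side and transferring only what differs. This is a generic sketch of that idea, not Castanet's actual protocol; the function names and file names are hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple


def manifest(files: Dict[str, bytes]) -> Dict[str, str]:
    # Map each file name to a hash of its contents.
    return {name: hashlib.sha256(data).hexdigest() for name, data in files.items()}


def diff(local: Dict[str, str], remote: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Return (files to fetch, files to delete) to bring local in line with remote."""
    fetch = sorted(name for name, digest in remote.items() if local.get(name) != digest)
    delete = sorted(name for name in local if name not in remote)
    return fetch, delete


workstation = manifest({"app.bin": b"v1", "help.txt": b"old", "legacy.dll": b"x"})
server = manifest({"app.bin": b"v1", "help.txt": b"new", "plugin.bin": b"p"})
to_fetch, to_delete = diff(workstation, server)
```

Only the changed and new modules cross the network; unchanged ones, such as app.bin here, are skipped.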

Data warehousing. P&S has a big potential with information delivery in data warehousing. The issue goes beyond delivering the proper information to the right people. The issue is how to sustain a flow of the proper information and let any consumer add value and republish the information. Applying P&S to data warehousing will move us into a whole new market-driven dynamic for information dissemination. Products are deliveryManager, from VIT; Tapestry, from D2K; and Aclue, from Decision·ism.

VIT's deliveryManager reaches beyond the data warehouse to any information source in the enterprise. "The focus must be on the consumer," remarks Subhash Chowdary, founder and CEO of VIT. "The consumer creates the demand and drives the content from any persistent store, like that of an information supply chain."

Tapestry has a Subscriber Interface with which analysts can examine the meta-catalog and place subscriptions via the Web. Content can be delivered in a variety of formats (e.g., Excel, Word, Lotus 1-2-3, and Java chart) and scheduled periodically (see the screen).

Tapestry has a unique separation in the producer roles. A supplier acts as a data administrator and maps available data sources into one or more data marts. A publisher acts as a business analyst and specifies various views to which people can subscribe. For example, a supplier could build data marts from host databases, while the publisher would publish views from those data marts.

Aclue focuses on the Arbor Essbase community, using P&S to distribute cubes consistently across the enterprise.

Electronic commerce. At first analysis, it may seem that electronic commerce has little to do with P&S; however, both share common technology (e.g., reliable and secure messaging) and common objectives (e.g., matching producers with consumers). P&S can benefit from the experiences with easy and reliable monetary exchange, and electronic commerce can benefit from the mechanisms for recurring transactions to similar interest groups (like that of the Book of the Month Club).

Where to Now?

As a coordination mechanism for distributed systems and people, P&S has tremendous potential for flexibility, adaptation, and evolution. In complex, large-scale situations where requirements are constantly changing, P&S may provide the fertile ground on which to grow those systems. Also, the standardization and commercialization of P&S technology have the potential to create global markets for information exchange and commerce, far beyond what we can presently imagine.

P&S, however, needs a few years to mature. First, the OMG and other standards groups must get serious about defining what it is. Second, the infrastructure of P&S is not all there yet. We still need to put into place the supporting technologies for reliable messaging outside the limited intranet context, efficient multicasting transport protocols, and universal monetary exchange. Third, the critical weakness is the lack of system management across the enterprise. "It's easy to add a little at a time, but who is going to watch over it [the P&S system]," remarks Ian MacFadyen, vice president of technology management for Chase Retail Banking Systems. "There is no place in the organization responsible, since P&S intermingles the host, servers, network, and who knows what else."

Even when mature, P&S of itself is not a turnkey solution. There is still the difficult work of understanding your business processes, specifying an effective representation for events, and designing the proper database schemes. P&S will only provide more powerful tools and enlarge the set of possible options. "There is not a lot of experience with this stuff; it will probably take 10 years to absorb, like the batch to on-line transition," predicts Roy Schulte of the Gartner Group. "The big vendors will start to play [in the P&S marketplace] in two years."

Any P&S solution still requires skilled professionals who can appropriately apply it. For many years to come, the education of these professionals will be the limiting factor in the adoption of P&S.


Where to Find


Active Software

Santa Clara, CA
Phone:    408-988-0414
Internet: 
http://www.activesw.com


BEA Systems, Inc.

Sunnyvale, CA
Phone:    408-743-4000
Internet: 
http://www.beasys.com


Borland Open Environment

Boston, MA
Phone:    617-562-0900
Internet: 
http://www.openenv.com


Decision·ism, Inc.

Boulder, CO
Phone:    303-938-8805
Internet: 
http://www.decisionism.com


D2K, Inc.

San Jose, CA
Phone:    408-451-2010
Internet: 
http://www.d2k.com


I-Kinetics, Inc.

Burlington, MA
Phone:    617-270-1300
Internet: 
http://www.i-kinetics.com


Intermezzo Systems, Inc.

Boulder, CO
Phone:    303-440-5410

Iona Technologies, Inc.

Cambridge, MA
Phone:    617-949-9000
Internet: 
http://www.iona.com


Lotus Development Corp.

Cambridge, MA
Phone:    617-577-8500
Internet: 
http://www.lotus.com


New Era of Networks, Inc.

Englewood, CO
Phone:    800-815-6366
Internet: 
http://www.neonsoft.com


Talarian Corp.

Mountain View, CA
Phone:    415-965-8050
Internet: 
http://www.talarian.com


TIBCO, Inc.

Palo Alto, CA
Phone:    415-846-5000
Internet: 
http://www.tibco.com


VIT

Cupertino, CA
Phone:    408-342-0882
Internet: 
http://www.vit.com


Vitria Technology, Inc.

Mountain View, CA
Phone:    415-237-6900
Internet: 
http://www.vitria.com



P&S Solves Business Problems

                     | Coordinate Processes | Replicate Content | Inform People
Information stream   | Messages representing significant business events | Change statements to synchronize persistent data stores | Information items having a common subject or topic
Producers            | Applications that detect and capture business events | Log manager for updated database | Content provider
Subscribers          | Applications that should react to business events | Replication agent for database copies | Knowledge worker
Level of reliability | Low to high | High (transactional) | Low
Level of security    | Medium in an intranet environment | Low | Low to high

R&R vs. P&S

Instead of forcing the client to ask whenever it needs something, P&S enables it to ask once and keep receiving.


Free Subscription

Channels or subscriptions are logical groupings of similar messages kept current by the broker.


The Publish and Subscribe Architecture

In a six-step process, a consumer can find out what a producer offers and start receiving it.


Where They Fit

P&S software is described by how structured its messages are and how tight its database integration is.


Lay of the Land

P&S architectures come in three main flavors.


Weave Your Schedule in Tapestry

D2K's Tapestry has a Web interface for scheduling delivery of subscription information.


Richard Hackathorn (richardh@bolder.com) is president and founder of Bolder Technology, Inc. (Boulder, CO), a company specializing in enterprise connectivity and data warehousing. You can get a copy of the complete technology report on P&S via http://www.bolder.com.
