PU-intensive your application is, you need to ask: How I/O intensive is it? Then ask: How reliable do I need the system to be?
RISC chips like Digital Equipment's Alpha and MIPS' R10000 have been holding the high-performance end of the applications server market. In Unix shops, x86-based systems have generally been consigned to file, print, and light-duty applications service. But today, the widespread availability of symmetric multiprocessing (SMP) systems that are built around Intel's Pentium Pro chip is changing how x86 systems are deployed for running centralized applications.
SMP is hardly a new technology. What is new are low-cost SMP systems from companies like Compaq, Dell, and ALR that marry Windows NT with the top members of Intel's CPU family.
To see a performance increase, however, both the server OS and the application must support SMP. Even then, performance does not typically scale linearly with the number of processors added (see the figure "Scalability by Number of CPUs"). Depending on the application and the OS, a two-CPU x86 server will improve by about 70 to 80 percent over a single-CPU system. Upgrading from two to four CPUs produces about the same enhancement. RISC-based Unix systems often offer more linear scaling up to six or eight CPUs. After that the system's performance starts leveling off.
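To put rough numbers on the x86 scaling figures above, here is a minimal Python sketch. It assumes that "about the same enhancement" means each doubling of the CPU count adds another 70 to 80 percent, which is one reading of the text rather than a measured result. The function name smp_speedup is made up for illustration.

```python
def smp_speedup(cpus, gain_per_doubling):
    """Estimate speedup over a single CPU.

    Assumption: every doubling of the CPU count adds `gain_per_doubling`
    (0.70 means 70 percent) to throughput. Real servers will vary with
    the application and the OS.
    """
    if cpus < 1 or cpus & (cpus - 1):
        raise ValueError("cpus must be a power of two (1, 2, 4, ...)")
    doublings = cpus.bit_length() - 1  # 1 -> 0, 2 -> 1, 4 -> 2
    return (1.0 + gain_per_doubling) ** doublings


for cpus in (1, 2, 4):
    low = smp_speedup(cpus, 0.70)
    high = smp_speedup(cpus, 0.80)
    print(f"{cpus} CPU(s): {low:.2f}x to {high:.2f}x (linear would be {cpus}x)")
```

Under that assumption, a four-CPU server lands at roughly 2.9 to 3.2 times the single-CPU throughput, well short of the 4x that linear scaling would give.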
RISC-based Unix systems may deliver more raw horsepower -- more CPUs, more RAM -- than x86-based systems. The hardware is usually more expensive, however. If you need outright speed, consider a RISC/Unix combo. If you need to keep an eye on the budget, x86 servers look a lot better.
Adding processors seems like an easy way to gain performance. But as with most fixes, it shifts the spotlight to other bottlenecks. Additional processors increase performance only until the number of processors contending for memory access and bus space creates bottlenecks. Beyond four processors, for example, the memory throughput of a typical Intel-based server becomes the limiting factor for scalability.
The most common configuration for SMP applications servers employs a separate L2 cache for each processor. This distinguishes them from multiple-CPU desktop systems that use a shared cache design. The Pentium Pro's internal nonblocking L2 cache makes it significantly faster than an equivalent external-cache system, such as the Pentium. The Pentium Pro with the 512-KB cache performs better in benchmark tests than the 256-KB cache version.
At a primitive level, scalability also means the ability to add components to a server system, and that means counting buses and open slots. A Silicon Graphics Challenge S, for example, has three SCSI buses, but two of them are differential SCSI -- which adds significantly to the cost of the peripherals you'll buy.
For multiprocessor x86 applications servers, the PCI bus is now the high-performance standard and has replaced the EISA bus. But electrical limitations keep the number of slots that a PCI bus can support to four or fewer. To increase the number of PCI adapter slots, system designers are adding PCI bridges -- circuits that connect distinct buses. Using a bridge, a second PCI bus can be added to the system. How the bridge is connected can have a significant impact on the performance of the server. For example, in a cascade or hierarchical configuration, the second PCI bus is connected (via a PCI-to-PCI bridge) to the first PCI bus. In addition to its own load, the first PCI bus must also transfer the data load for the second bus. The effect is to share the 132-MBps system bandwidth of the first bus between both buses. The PCI bus may be bridged to an EISA bus in a similar fashion, further limiting throughput.
Peer bus design uses an alternative approach, bridging the first and second PCI buses individually to the system bus. Because data can flow independently to either bus, total system I/O can go as high as 264 MBps. This is the better alternative for high-performance servers.
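The bandwidth arithmetic behind the two bridge layouts can be sketched in a few lines of Python. The 132-MBps figure corresponds to a 33-MHz clock and a 32-bit (4-byte) data path, which is assumed here. The function names are hypothetical, and the results are theoretical peaks, not sustained throughput.

```python
# Assumed PCI parameters: 33-MHz clock, 32-bit (4-byte) data path.
PCI_CLOCK_MHZ = 33
PCI_WIDTH_BYTES = 4


def pci_peak_mbps(clock_mhz=PCI_CLOCK_MHZ, width_bytes=PCI_WIDTH_BYTES):
    """Theoretical peak of one PCI bus in MBps (one transfer per clock)."""
    return clock_mhz * width_bytes


def cascade_peak_mbps(bus_count):
    """Cascaded buses: all traffic funnels through the first bus,
    so the total never exceeds one bus's peak."""
    if bus_count < 1:
        raise ValueError("need at least one bus")
    return pci_peak_mbps()


def peer_peak_mbps(bus_count):
    """Peer buses: each bus is bridged to the system bus on its own,
    so the peaks add (ignoring any limit of the system bus itself)."""
    if bus_count < 1:
        raise ValueError("need at least one bus")
    return bus_count * pci_peak_mbps()


print("cascade, 2 buses:", cascade_peak_mbps(2), "MBps")
print("peer, 2 buses:   ", peer_peak_mbps(2), "MBps")
```

With two buses, the cascade layout still tops out at 132 MBps, while the peer layout reaches the 264 MBps quoted above.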
Reliability and Availability
Downtime. It's the bane of every IT manager's existence. Fault-tolerant solutions generally fall into two categories depending on their level of protection.
Server reliability solutions focus on making any single server as fault-tolerant as possible, using approaches such as redundant power supplies and RAID technology.
High-availability solutions address reliability at the level of whole servers rather than individual components. Failover, one aspect of clustering, ensures that if the primary server is lost, a standby server takes over. After the problem server is repaired, the system should provide a simple way to bring it back into the network.
Fault-tolerant solutions should satisfy several criteria. They must work in real time to minimize the period when services are not delivered to users. Better systems such as IBM's AS/400 will make the switch transparently, allowing users to continue work without losing network connections. Any system that duplicates data between the primary and the standby server should be transaction-based. Any data committed to disk at the time of the crash should be available on the standby server. Failover must be automatic and work without requiring manual monitoring or intervention. It should not require that the servers be identical.
You can set up your standby server to act as a passive backup machine only. It would monitor your primary server and receive data continuously but perform no other functions on your network. A more cost-effective approach would be to cast the backup server in the role of utility server, where it could run printer, database, or communications services while in its standby mode. In the event of a failure, the standby server would automatically take over the functions of the primary server in addition to the utility services.
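The standby arrangement described above can be illustrated with a small Python simulation. It is only a sketch, not how any particular clustering product works: the StandbyServer class, the missed-heartbeat limit of three, and the service names are all hypothetical. The heartbeats are simulated booleans rather than real network traffic.

```python
class StandbyServer:
    """Toy model of a standby that runs utility services and takes over
    the primary's services after several missed heartbeats."""

    def __init__(self, utility_services, missed_limit=3):
        self.services = list(utility_services)  # what the standby runs now
        self.missed_limit = missed_limit        # assumed failover threshold
        self.missed = 0
        self.active = False                     # True once failover happens

    def observe(self, heartbeat_received, primary_services):
        """Process one monitoring interval."""
        if self.active:
            return  # already took over; nothing more to do
        if heartbeat_received:
            self.missed = 0  # primary is alive, so reset the counter
            return
        self.missed += 1
        if self.missed >= self.missed_limit:
            # Automatic failover: add the primary's services to the
            # utility services without manual intervention.
            self.services = list(primary_services) + self.services
            self.active = True


primary_services = ["applications"]
standby = StandbyServer(["print", "communications"])

# Simulated heartbeats: the primary fails after the second interval.
for beat in (True, True, False, False, False):
    standby.observe(beat, primary_services)

print("failover occurred:", standby.active)
print("standby now runs:", standby.services)
```

Waiting for several consecutive misses keeps a single dropped heartbeat from triggering a takeover, at the price of a slightly longer outage before the standby steps in.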
The final measure of any system is vendor support. Keeping your server operating may be crucial. But is it as important to your supplier? Be sure you can get service and support at the level you require around the clock or around the world.
Which system provides the best performance and resilience? Right now, the scales tip toward RISC/Unix systems and the AS/400 with their better-developed clustering and SMP technologies. An x86-based SMP system running Windows NT provides solid performance at a relatively reasonable price, but there are few SMP systems that have more than four CPUs, and the third-party clustering technology is still an unknown.
Before you purchase your hardware, you should buy or rent a test system and assess the performance of your application running on it -- how much RAM it needs, how well it scales across multiple CPUs, and so on. Then you'll be able to make an informed decision about how to balance price and performance.
90 percent reliability means you're down about three days a month; 99.99 percent reliability means you're down about four minutes a month.
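Downtime figures like these follow from simple arithmetic, sketched here in Python. The 30-day month is an assumption, and the function name is made up for illustration.

```python
def downtime_minutes_per_month(availability_percent, days_in_month=30):
    """Minutes of downtime per month implied by an availability percentage.

    Assumes a 30-day month unless told otherwise.
    """
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100")
    minutes_in_month = days_in_month * 24 * 60
    return minutes_in_month * (100.0 - availability_percent) / 100.0


for pct in (90.0, 99.0, 99.99):
    minutes = downtime_minutes_per_month(pct)
    print(f"{pct}% available: {minutes:.1f} minutes "
          f"({minutes / (24 * 60):.2f} days) down per month")
```

For a 30-day month, 90 percent works out to 4,320 minutes (three days) of downtime, and 99.99 percent to a little over four minutes.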
A Sun SPARC system offers more linear scaling than a Compaq ProLiant for small numbers of CPUs.
Robert L. Hummel is an electrical engineer, programmer, and consultant. You can reach him at rhummel@monad.net.