
The Server's Helper


October 1996 / Core Technologies / The Server's Helper

RISC powers Intel's intelligent I/O controller for servers.

Tom Thompson

The changing nature of information that office workers use has had an impact on the role the network server plays. Today's server must pump out digital video, audio, and 3-D graphics on demand; handle complex database queries; and manage Web transactions. As a consequence, a server requires a more efficient architecture than just a souped-up desktop-computer design. A faster processor and system bus aren't enough. The computer must manage the vast amounts of I/O this type of data demands. Radical architectures from Tandem and Sequent address this problem on high-end server designs, as described in "The Network in the Server" (July BYTE).

Moderate-size servers must also work with the same type of data, on a smaller scale. When it comes to handling I/O, such servers must work smarter, not harder. Even a multiprocessor server can grind to a halt if each processor is supervising a peripheral or -- worse -- waiting to access a congested system bus.

To this end, Intel just started sampling the i960 RP, a 32-bit RISC processor that functions as an intelligent I/O controller. It supports DMA transfers, address translation, various memory types, and multiprocessor interrupt control. Thus, it can manage most of a server's peripheral I/O traffic without CPU intervention, eliminating many throughput bottlenecks.

The i960 RP also acts as a PCI-to-PCI bridge unit, which lets you add more slots to the design. At the same time, the bridge unit reduces the number of components required to build a server that uses state-of-the-art high-speed PCI peripherals. This enables a server to provide high throughput yet remain affordable.

RISC at the Core

As other chip vendors have done with their smart I/O processors, Intel took the core of its tried-and-true i960 JF embedded processor and wrapped I/O support logic around it. Our tour of the processor begins with this RISC core. The i960 JF core has 32 32-bit registers. Sixteen of them are local (general-purpose) registers; the other sixteen are global registers used for parameter passing or storing critical variables. An on-chip local-register cache stores up to eight copies of the local registers. This provides hardware support for the rapid entry and exit of function calls, a useful feature for time-critical interrupt handlers.
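To see why a local-register cache speeds up calls and returns, consider the following minimal sketch in Python. It is a toy model, not Intel's implementation: the class name, the spill-to-memory policy, and the assumption that all eight cached frames belong to callers are illustrative choices that the article does not spell out.

```python
from collections import deque

CACHE_FRAMES = 8        # copies of the local registers held on chip (from the article)
LOCAL_REGISTERS = 16    # local registers per frame (from the article)

class LocalRegisterCache:
    """Toy model: a call saves the caller's locals on chip instead of in memory."""

    def __init__(self):
        self.current = [0] * LOCAL_REGISTERS  # locals of the running function
        self.cached = deque()                 # saved caller frames, newest on the right
        self.spilled = []                     # frames pushed out to (simulated) memory
        self.spill_count = 0                  # how often the slow path was taken

    def call(self):
        # Assumed policy: when the cache is full, the oldest frame goes to memory.
        if len(self.cached) == CACHE_FRAMES:
            self.spilled.append(self.cached.popleft())
            self.spill_count += 1
        self.cached.append(self.current)
        self.current = [0] * LOCAL_REGISTERS

    def ret(self):
        if self.cached:
            self.current = self.cached.pop()   # fast path: restore from the cache
        elif self.spilled:
            self.current = self.spilled.pop()  # slow path: reload from memory
        else:
            raise RuntimeError("return without a matching call")
```

In this model, call chains up to eight deep never touch memory to save or restore locals. Only a ninth nested call forces a spill, which is why shallow, time-critical interrupt handlers benefit.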

The core consists of 700,000 transistors and has a four-stage pipeline. It has several independent execution units (EUs): one for instruction processing and address generation, a multiply-divide unit (MDU) for 32-bit math computations, and a memory interface unit that handles load/store operations. The core can scoreboard individual registers, so the processor can execute certain instructions in parallel, or out of order, to maintain single-cycle instruction execution.

The core includes a 4-KB two-way set-associative instruction cache, a 1-KB direct-mapped data cache, and 1 KB of on-chip data RAM. You can enhance interrupt processing by locking sections of interrupt-handler code within the instruction cache and by storing a number of interrupt vectors in the on-chip memory. At 33 MHz, the core delivers 31 VAX MIPS.

A bus-control unit (BCU) supports 8-, 16-, and 32-bit memory addressing, plus big-endian and little-endian addressing modes. This lets the i960 connect to a large variety of memory and peripherals. Up to eight sections of memory, each 512 MB in length, can be defined with different memory-width attributes. Regions of memory ranging in size from 4 KB to 4 GB can also be defined as cacheable (typically for program memory) or noncacheable (typically for I/O devices).
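Eight sections of 512 MB each add up to the full 4-GB space of a 32-bit address, so a section can be identified by the top three address bits. The Python sketch below illustrates that arithmetic. The table of bus widths is hypothetical, and the function names are illustrative; none of this reflects the actual BCU register layout.

```python
REGION_SIZE = 512 * 1024 * 1024   # 512 MB per section, as stated in the article
REGION_COUNT = 8                  # eight sections cover 8 x 512 MB = 4 GB

# Hypothetical bus width, in bits, assigned to each section (8, 16, or 32).
region_bus_width = [32, 32, 16, 8, 32, 32, 32, 8]

def region_of(address: int) -> int:
    """Return the index of the 512-MB section that contains a 32-bit address."""
    if not 0 <= address < REGION_SIZE * REGION_COUNT:
        raise ValueError("address is outside the 32-bit address space")
    return address // REGION_SIZE   # equivalent to address >> 29

def bus_width_for(address: int) -> int:
    """Look up the (hypothetical) bus width configured for an address."""
    return region_bus_width[region_of(address)]
```

With this table, an access to address 0x60000000 falls in section 3 and would use an 8-bit bus, which is the kind of setting a designer might choose for a boot ROM.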

Device Interfaces

The i960 RP provides a wealth of device interfaces and control functions, such as a memory controller, DMA controller, and PCI-to-PCI bridge unit, as shown in the figure "The i960 RP Architecture." These features can be used both to simplify a server design and to improve system throughput.

The built-in memory controller generates the appropriate timing and signals for three different RAM types: fast page-mode (FPM), extended data out (EDO), and burst extended data out (BEDO). The controller supports memory interleaving for FPM RAM. The controller also handles 8- or 32-bit-wide ROM, static-RAM (SRAM), and flash-memory devices.

The i960's integrated DMA controller has three DMA channels that perform high-speed transfers between PCI peripherals and local memory (i.e., memory directly managed by the memory controller). Each DMA channel has a hardware packing and unpacking unit that can handle unaligned data transfers.
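One way to picture what handling an unaligned transfer involves is to split it into leading bytes, whole 32-bit words, and trailing bytes. The Python sketch below shows only that bookkeeping. It is an illustration of the general problem, not a description of how the i960 RP's packing hardware works, and the function name is made up for this example.

```python
WORD = 4  # bytes in a 32-bit word

def split_unaligned(address: int, length: int) -> tuple:
    """Split a transfer into (leading bytes, aligned words, trailing bytes)."""
    if address < 0 or length < 0:
        raise ValueError("address and length must be non-negative")
    # Bytes needed to reach the next word boundary, capped by the transfer size.
    head = min((-address) % WORD, length)
    words = (length - head) // WORD
    tail = length - head - words * WORD
    return head, words, tail

print(split_unaligned(0x1003, 10))  # (1, 2, 1)
print(split_unaligned(0x1000, 8))   # (0, 2, 0)
```

A 10-byte transfer starting at 0x1003 needs one byte to reach a word boundary, then two full words, then one leftover byte. Doing this in hardware means software never has to align its buffers first.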

The DMA controller also implements chain descriptors. A chain descriptor is a data block that describes a DMA transfer: the amount of data to move, the source and destination addresses, a control value, and a pointer to the next descriptor. The pointers let you link descriptors into a "chain" of operations that can gather scattered blocks of data and transfer them in one chunk to the destination. These chains can implement sophisticated data transfers, perhaps moving data from a hard drive into memory and then to a network device, as shown in the figure "An Intelligent I/O Operation." Such a chain can supervise this type of complex transfer without interrupting the host processor, unless an error occurs.
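The chain-descriptor idea can be sketched as a linked list that a loop walks from head to tail. The Python model below uses the fields the article lists (byte count, source, destination, control value, next pointer), but the class, field names, and the `run_chain` function are illustrative assumptions. The real descriptor layout and control bits are not given here, and a bytearray stands in for memory.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ChainDescriptor:
    source: int                                # address to read from
    destination: int                           # address to write to
    byte_count: int                            # amount of data to move
    control: int = 0                           # control value (unused in this sketch)
    next: Optional["ChainDescriptor"] = None   # pointer to the next descriptor

def run_chain(head: Optional[ChainDescriptor], memory: bytearray) -> int:
    """Walk the chain, performing each copy in turn; return total bytes moved."""
    moved = 0
    desc = head
    while desc is not None:
        src_end = desc.source + desc.byte_count
        dst_end = desc.destination + desc.byte_count
        if min(desc.source, desc.destination, desc.byte_count) < 0 \
                or src_end > len(memory) or dst_end > len(memory):
            raise ValueError("descriptor points outside memory")
        memory[desc.destination:dst_end] = memory[desc.source:src_end]
        moved += desc.byte_count
        desc = desc.next
    return moved

# Gather three scattered 4-byte blocks into one contiguous 12-byte buffer.
memory = bytearray(64)
memory[0:4], memory[16:20], memory[32:36] = b"AAAA", b"BBBB", b"CCCC"
third = ChainDescriptor(source=32, destination=56, byte_count=4)
second = ChainDescriptor(source=16, destination=52, byte_count=4, next=third)
first = ChainDescriptor(source=0, destination=48, byte_count=4, next=second)
total = run_chain(first, memory)
print(total, bytes(memory[48:60]))  # 12 b'AAAABBBBCCCC'
```

The host only has to build the list and hand over the address of the first descriptor. Everything after that proceeds without its involvement, which is the point of the design.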

A Built-In Bridge

The most versatile feature of the i960 RP is its PCI-to-PCI bridging capability. It supports two PCI buses: a primary PCI bus, which connects to the host CPU, and a secondary PCI bus that's maintained by the i960. These interfaces let you add the i960 to a PCI-based server design without using additional glue logic. This secondary bus complies with the 5-V PCI standard, and at 33 MHz, it provides nine extra PCI loads. This lets the server design offer more PCI devices or card slots. You can attach additional i960 RP processors to the secondary bus to build a hierarchy of PCI buses, so that the system can handle a large number of network interfaces and storage devices.

The i960's PCI bridge logic can forward memory, I/O, and command transfers between the two PCI buses. However, you can program the bridge logic to "filter" certain PCI transactions. This reduces traffic on other buses in the server and aids in the implementation of intelligent I/O subsystems. For example, suppose a hard drive is streaming video data to an Ethernet interface, and both these PCI-based devices reside on the secondary PCI bus. The bridge logic keeps these transfers off the primary PCI bus, so that the primary bus can independently handle a different set of I/O operations.
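The filtering decision can be modeled with a single address window, which is how PCI-to-PCI bridges generally decide what to pass along: addresses inside the window belong to devices behind the bridge. The Python sketch below assumes one memory window and uses made-up names and addresses. It does not represent the i960 RP's actual bridge registers.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BridgeWindow:
    base: int    # lowest address claimed by devices on the secondary bus
    limit: int   # highest such address (inclusive)

    def contains(self, address: int) -> bool:
        return self.base <= address <= self.limit

def forwards(window: BridgeWindow, origin: str, address: int) -> bool:
    """Return True if the bridge passes the transaction to the other bus."""
    inside = window.contains(address)
    if origin == "primary":
        return inside        # host traffic aimed at a secondary-bus device
    if origin == "secondary":
        return not inside    # peer-to-peer traffic stays on the secondary bus
    raise ValueError("origin must be 'primary' or 'secondary'")

# Hypothetical window covering the devices behind the bridge.
window = BridgeWindow(base=0x80000000, limit=0x8FFFFFFF)
print(forwards(window, "secondary", 0x80001000))  # False
print(forwards(window, "primary", 0x80001000))    # True
```

In the disk-to-Ethernet example, both devices sit inside the window, so their transfers start on the secondary bus and are never forwarded. The primary bus sees none of that traffic.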

The DMA controller works in tandem with the PCI-to-PCI bridge unit to boost throughput. Also, properly written descriptor chains can add smarts to low-cost PCI peripherals, so that their use improves performance while minimizing CPU overhead.

As you can see, the i960 RP offers many capabilities to a server design. Certain functions, such as the DMA controller, allow the systems designer to hand off data transfers between memory and peripherals to the i960, thus relieving the server's CPUs of this chore. Other functions, such as the memory controller and the PCI-to-PCI bridging capability, allow the engineer to eliminate some parts from the design, thus reducing the server's cost and complexity.


Product Information


Intel Corp.

Santa Clara, CA
phone:    (800) 628-8686
Fax:      (503) 264-6835
Internet: http://www.intel.com/DESIGN/IIO




The i960 RP Architecture


This intelligent I/O controller manages DMA transfers and memory timing. It also acts as a PCI-to-PCI bridge.


An Intelligent I/O Operation


Sequences of DMA commands can manage complex I/O transfers without CPU support.


Tom Thompson is a BYTE senior technical editor at large. He has a B.S.E.E. degree from the University of Memphis. You can reach him on the Internet at tom_thompson@bix.com.
