How to Make Pentium Pros Cooperate


April 1996 / Core Technologies / How to Make Pentium Pros Cooperate

Intel's Pentium Pro has support for a four-processor configuration, which lets you do jobs that are too big for a single processor

John Hyde

As the decade has progressed, so have the power and capacity of desktop computers. As a consequence, they're assigned ever-larger jobs. While Intel's Pentium Pro processor has remarkable computational prowess at 200 MHz, certain jobs are so big that a single processor can't handle them in a reasonable amount of time.

However, you can deal with such work by using extra processors to divide and conquer the job. These multiprocessor systems require special support from the hardware and OS, so that each processor can share resources without conflict. Intel kept this strategy in mind while designing the Pentium Pro processor: Its bus has built-in support for a four-processor system.

Implementing a four-way multiprocessing environment isn't easy. For multiple processors to work in concert and share resources effectively, you must resolve many issues (e.g., how they interact during system reset, system initialization, and the OS boot). The Pentium Pro mechanism uses a combination of embedded hardware, processor-resident microcode, and firmware to produce a reliable yet extensible multiprocessor building block.

Bus Organization

To achieve this goal, Intel bused together all four processors' signals (as shown in the figure "A Multiprocessor System Bus"). This design uses two of the five buses: the arbitration bus and the advanced programmable interrupt controller (APIC) bus. (The other three are the control, data, and address buses.) The reset operation makes heavy use of both these buses. We'll show how they assist in establishing the multiprocessing environment.

During reset, some power-on circuitry pulls one of the arbitration lines low. The board's hardware for the arbitration bus implements a rotating bit pattern on these bus lines, which creates a unique configuration for each processor. This configuration defines a processor's ID, which is used for all subsequent bus transactions. During normal (i.e., nonreset) processor operation, the processors use the arbitration lines to control access to the control, data, and address buses.
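The rotating bit pattern can be pictured with a small model. This is only a sketch under stated assumptions: the wiring rule (pin p of the processor in slot s connects to bus line (p + s) mod 4), the choice of line 0 as the one pulled low, and the names `sample_pins` and `processor_id` are all hypothetical and are not taken from Intel's pin assignments. What it illustrates is the principle that identical chips, each seeing a rotated view of the same lines, derive different IDs.

```python
NUM_AGENTS = 4  # the bus supports up to four processors


def sample_pins(slot, asserted_line=0):
    """Return what the processor in `slot` sees on its four arbitration pins.

    Assumed wiring (hypothetical): pin p of slot s is tied to bus line
    (p + s) % NUM_AGENTS, so each slot sees the lines rotated by one.
    """
    return [((pin + slot) % NUM_AGENTS) == asserted_line
            for pin in range(NUM_AGENTS)]


def processor_id(slot):
    """The pin on which the asserted line shows up becomes the processor ID."""
    return sample_pins(slot).index(True)


ids = [processor_id(slot) for slot in range(NUM_AGENTS)]
print(ids)  # [0, 3, 2, 1] under the assumed wiring
```

The IDs do not have to follow slot order; they only have to be unique, which the rotation guarantees.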

The APIC bus supports delivery of targeted or broadcast interrupt messages in a multiprocessor-system environment. During a reset operation, the processors send interprocessor interrupts (IPIs) to each other using the APIC bus. I/O devices or processors can place IPI messages onto this bus to be received by one or more processors. System software sets up the interrupt priorities for these messages, and the OS can use various delivery schemes for them. All APIC devices communicate using a three-wire bus.

This APIC bus differs slightly from the two-processor Pentium design described in the article "Pentium Chip's Dual Personality" (December 1994 BYTE). There, one of the lines served as an APIC enable, another acted as a chip select, and the third handled a clock signal. Here, two of the wires are wired-OR data lines, and the third wire is a common clock signal.

Dueling Processors

All processors must be connected to the APIC bus. The systems designer also provides an APIC clock signal. This bus is required for a hardware reset of the multiprocessor environment, even if it's not used after reset. (Intel strongly recommends that a multiprocessor system use the APIC interrupt scheme.)

A processor first checks that the APIC bus is not busy before initiating a data transfer; it then drives the APIC data lines low during a common clock phase to initiate the transfer. If two or more processors try to initiate an IPI message during the same clock, the processors negotiate by driving their unique arbitration ID (derived from the processor ID) onto the data lines.

The processor with the highest-priority ID wins the arbitration, and the losing processors back off and wait for the APIC bus to fall idle. All devices now increment their arbitration ID. This puts the winner at the end of the priority queue for the next arbitration cycle. This round-robin scheduling algorithm guarantees that one--and only one--device sends IPI messages on the APIC bus at any time. It also ensures that each device has equal access to the bus bandwidth.
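The rotation of priorities can be sketched in a few lines. This is a simplified model, not the APIC's actual bit-level protocol: the function name `arbitrate`, the device labels, and the exact update rule (the winner drops to priority 0 and every device that ranked below it moves up by one) are assumptions chosen so that priorities stay unique. When every device contends in every round, the rule reduces to the "everyone increments, winner wraps to the end" behavior described above.

```python
def arbitrate(priority, requesters):
    """Pick a winner among `requesters` and rotate the priorities.

    priority: dict mapping device name -> unique arbitration priority
              (a higher number wins).
    Returns (winner, new_priority).
    """
    winner = max(requesters, key=lambda dev: priority[dev])
    won_at = priority[winner]
    new_priority = {}
    for dev, p in priority.items():
        if dev == winner:
            new_priority[dev] = 0          # winner goes to the back of the queue
        elif p < won_at:
            new_priority[dev] = p + 1      # everyone behind the winner moves up
        else:
            new_priority[dev] = p          # devices ahead of the winner keep their place
    return winner, new_priority


priority = {"cpu0": 0, "cpu1": 1, "cpu2": 2, "cpu3": 3}
winners = []
for _ in range(8):
    # Worst case: all four devices want the bus in every round.
    winner, priority = arbitrate(priority, list(priority))
    winners.append(winner)

print(winners)
# ['cpu3', 'cpu2', 'cpu1', 'cpu0', 'cpu3', 'cpu2', 'cpu1', 'cpu0']
```

Under saturation each device wins exactly once in every four rounds, which is the equal-access property the text describes.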

Following this arbitration sequence, an APIC device drives more serial bits onto the two data-bus lines, so that all the other devices on the APIC bus receive this IPI message. The APIC bus supports four categories of messages, as determined by the serial bits. Each message also has multiple subtypes to match the needs of various priority schemes. During reset, BOOT IPI messages are used, and WAKEUP and INIT IPI messages may be used.

Once a RESET signal is recognized, all the processors execute identical microcode (as shown in the figure "Which Processor Takes Control?"). Each processor checks its INIT pin. If low (which is recommended), the processor executes a built-in self test (BIST). A processor executing a BIST drives the reset-not-complete pin active, which prevents other processors from moving to the next phase until all the processors have completed BISTs.

The final parts of the reset stage set the processor's CS register to selector 0F000h with a base of 0FFFF0000h, and the EIP register to 0FFF0h. This forces the first code fetch from the RESTART vector at physical address 0FFFFFFF0h, 16 bytes below 4 GB. Alternatively, the systems designer can arrange for the Pentium Pro processor to start execution at 0F000:FFF0h (physical address 0FFFF0h), 16 bytes below 1 MB. Intel provides this 286-compatible alternate scheme so that systems with more than 4 GB of memory need not have a "hole" in the address space to accommodate the RESTART vector. The microcode also clears the bootstrap processor (BSP) register. As its name implies, the BSP is a machine-specific register that identifies the bootstrap processor.
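The two start addresses are easy to check by arithmetic. The sketch below assumes the first fetch address is the code-segment base plus EIP, with a base of FFFF0000h for the default scheme and F0000h for the alternate one; the helper name `first_fetch` is hypothetical.

```python
GIB = 1 << 30
MIB = 1 << 20


def first_fetch(cs_base, eip):
    """Physical address of the first instruction fetch after reset."""
    return cs_base + eip


default_vector = first_fetch(0xFFFF0000, 0xFFF0)    # default RESTART vector
alternate_vector = first_fetch(0x000F0000, 0xFFF0)  # alternate, below 1 MB

print(hex(default_vector), 4 * GIB - default_vector)    # 0xfffffff0 16
print(hex(alternate_vector), MIB - alternate_vector)    # 0xffff0 16
```

In both cases the vector sits 16 bytes below the boundary, which leaves room for little more than a jump to the real firmware entry point.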

The next stage of initialization involves selecting a bootstrap processor from the available processors. All the processors are eligible to become the single bootstrap processor, rather than defining that a processor with, say, an ID of 0 becomes the bootstrap processor. This eliminates a single-point failure situation, where a system boot sequence stalls because that particular processor fails to operate. The processors continue to execute from microcode and implement a multiprocessor boot protocol.

Each processor broadcasts a BOOT IPI onto the APIC bus--note that the APIC bus serializes these requests--and each processor receives n BOOT IPIs, one from each of the n processors in the system. Each processor checks these incoming APIC IPIs. If the first one received has the same ID as the processor itself, this processor becomes the bootstrap processor.

Simply put, the fastest processor wins this arbitration round, and it sets the BSP register to 1. If the first ID doesn't match, that processor executes a wait loop in microcode. This essentially puts the losing processors to sleep because they don't perform external bus accesses. The bootstrap processor fetches code pointed to by the RESTART vector and starts executing the system firmware. This code is typically the system BIOS.
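The selection rule can be modeled as follows. This is an illustrative sketch only: the function name `select_bootstrap` and the representation of the bus as a Python list are assumptions. The list stands for the order in which the APIC bus happened to serialize the BOOT IPIs, and every processor sees that same order.

```python
def select_bootstrap(bus_order):
    """Return each processor's BSP flag after the BOOT IPI round.

    bus_order: processor IDs in the order the APIC bus serialized
               their BOOT IPIs (identical for every receiver).
    """
    first_seen = bus_order[0]
    bsp_flag = {}
    for cpu_id in bus_order:
        # A processor becomes the bootstrap processor only if the first
        # BOOT IPI it received carries its own ID; otherwise it waits.
        bsp_flag[cpu_id] = 1 if cpu_id == first_seen else 0
    return bsp_flag


flags = select_bootstrap([2, 0, 3, 1])
print(flags)  # {2: 1, 0: 0, 3: 0, 1: 0}
```

Because all processors observe the same serialized order, exactly one of them sets its flag, with no need for a fixed "processor 0" to be present or working.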

Design Issues

There might be a hardware reason why a systems designer would want a specific processor to serve as the bootstrap processor, rather than one randomly chosen by the bootstrap algorithm. DOS-compatible hardware, for example, might be connected only to a particular processor. In this case, the current bootstrap processor, if it isn't handling the compatibility signals, sends a WAKEUP IPI to the required processor. It also sends an INIT IPI to itself.

The bootstrap processor enters a wait-loop microcode sequence, effectively putting itself to sleep. The processor that receives the WAKEUP IPI extracts an embedded RESET vector from this IPI message and starts executing firmware code. This new RESET vector lets the awakened processor execute different firmware from the bootstrap processor. The original bootstrap processor clears its BSP flag, and the awakened processor sets its BSP flag to 1. This sequence transfers the responsibility of booting the OS to the newly anointed bootstrap processor.
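The handoff can be viewed as a small state change on two processors. The sketch below is a model under assumptions, not firmware: the `Processor` class, its field names, the `hand_off` function, and the example vector values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Processor:
    cpu_id: int
    bsp: int = 0                   # BSP flag: 1 marks the bootstrap processor
    state: str = "waiting"         # "waiting" (microcode loop) or "running"
    vector: Optional[int] = None   # address the processor started executing from


def hand_off(current, target, reset_vector):
    """Transfer the bootstrap role from `current` to `target`."""
    if current.bsp != 1:
        raise ValueError("only the current bootstrap processor can hand off")
    # WAKEUP IPI: the target takes the embedded vector and starts running.
    target.state = "running"
    target.vector = reset_vector
    target.bsp = 1
    # INIT IPI to itself: the old bootstrap processor goes back to waiting.
    current.state = "waiting"
    current.vector = None
    current.bsp = 0


old = Processor(cpu_id=2, bsp=1, state="running", vector=0xFFFFFFF0)
new = Processor(cpu_id=0)
hand_off(old, new, reset_vector=0x000F0000)  # example vector, chosen arbitrarily
print(old.bsp, old.state, new.bsp, new.state)  # 0 waiting 1 running
```

At every point in the exchange one processor, and only one, ends up holding the BSP flag.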

The BIOS typically executes a system self test, and the other processors may be turned on for testing purposes using WAKEUP IPIs. The initiating processor can remain active to perform multiprocessor testing or can turn itself off by sending itself an INIT IPI. Following the successful completion of the power-on self tests (POSTs), the systems programmer should switch off all the processors except one. He or she must take care while switching processors on or off: The last processor left on must have its BSP flag set (indicating that it's the bootstrap processor).
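The condition at the end of POST is simple enough to state as a check. This is a sketch with a hypothetical function name, `ready_to_boot`, and a hypothetical representation of each processor as an (is_active, bsp_flag) pair.

```python
def ready_to_boot(processors):
    """True if exactly one processor is still on and it holds the BSP flag.

    processors: list of (is_active, bsp_flag) tuples, one per processor.
    """
    active = [p for p in processors if p[0]]
    return len(active) == 1 and active[0][1] == 1


print(ready_to_boot([(True, 1), (False, 0), (False, 0), (False, 0)]))  # True
print(ready_to_boot([(True, 0), (False, 1), (False, 0), (False, 0)]))  # False
print(ready_to_boot([(True, 1), (True, 0), (False, 0), (False, 0)]))   # False
```

The second case is the one to watch for: a processor was left running, but it is not the one marked as the bootstrap processor.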

Each processor in a multiprocessor system must be initialized consistently. They must, for example, have a common view of the system memory map that defines which areas are cacheable, noncacheable, I/O, and so forth. Other multiprocessor initialization, such as system management mode and machine check architecture, should be completed at this stage.

The bootstrap processor interrogates the system hardware and builds a table that describes the hardware configuration. This standardized table contains information about each processor, expansion buses, I/O APIC descriptions, I/O interrupt assignments, and local interrupt mappings. The OS may use this resource list to support plug and play. Full details of this table and its parameter passing are described in The Multiprocessor Specification, which is available from Intel's World Wide Web site (http://www.intel.com) or by contacting the Intel Literature Center at (800) 548-4725 and requesting packet #242016-004.
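As a rough picture of how an OS might consume such a table, the sketch below tags each entry with one of the five kinds listed above and counts them. The numeric codes follow my reading of the entry types in The Multiprocessor Specification and should be checked against that document; the payload field names and the example entries are hypothetical.

```python
from collections import Counter
from enum import IntEnum


class EntryType(IntEnum):
    # Codes as I understand them from the specification; verify before relying on them.
    PROCESSOR = 0
    BUS = 1
    IO_APIC = 2
    IO_INTERRUPT = 3
    LOCAL_INTERRUPT = 4


def summarize(entries):
    """Count table entries by kind. Each entry is (type_code, payload)."""
    return Counter(EntryType(kind).name for kind, _payload in entries)


# A made-up table for a four-processor board (payload fields are illustrative).
table = [
    (0, {"apic_id": 0, "bootstrap": True}),
    (0, {"apic_id": 1, "bootstrap": False}),
    (0, {"apic_id": 2, "bootstrap": False}),
    (0, {"apic_id": 3, "bootstrap": False}),
    (1, {"bus_id": 0, "kind": "PCI"}),
    (2, {"apic_id": 4}),
    (3, {"source_irq": 1, "dest_apic": 4}),
    (4, {"source_irq": 0, "dest_local_pin": 0}),
]

summary = summarize(table)
print(summary["PROCESSOR"])  # 4
```

An OS would walk the entries once at startup, using the processor entries to decide how many sleeping processors it can wake.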

The last act of the bootstrap processor is to load the OS and pass control to it. The OS is now in control and turns on the sleeping processors as required.


A Multiprocessor System Bus

The basic multiprocessor Pentium Pro schematic. The arbitration bus and advanced programmable interrupt controller (APIC) bus are used to set up the processors during the system boot.


Which Processor Takes Control?

The interprocessor interrupt (IPI) messages determine the bootstrap processor and help coordinate processor activity during the bootstrap stage. The first processor to receive its own ID back from the bus wins.


John Hyde is the technical manager for Intel's Enterprise Server group. You can contact him on the Internet or BIX at editors@bix.com.
