IBM and Motorola rev up the 603e and 604 and reduce the chips' hunger for power
Tom Thompson
IBM and Motorola have cranked up the performance of the PowerPC 603e and 604 CPUs. The companies recently disclosed an enhanced version of the 603e, called the 166-MHz 603e, that sports a number of significant improvements. Its predecessor, the 100-MHz 603e, peaks at 120 MHz. The companies have also revealed an enhanced PowerPC 604, called the 166-MHz 604e. Its 100-MHz sibling tops out at 133 MHz. The new processors not only operate at higher clock rates -- they also run certain operations faster.
IBM and Motorola have now exceeded the performance targets they set for themselves in February. Not only that, but they've accomplished this without approaching any limits in these processor designs.
Both chips are made with a 0.35-micron five-layer-metal CMOS process. A 0.5-micron version of this same process dramatically shrank the original PowerPC 601's die (which used a 0.65-micron four-layer-metal process) from 121 mm(2) to 74 mm(2) and enabled it to run at 120 MHz. While this process costs slightly more, the size reduction confers important benefits. Smaller circuits result in a smaller die, which raises the yields per wafer and can result in savings that more than offset the increased process cost. Or, the designers can pack more features on the same-size die.
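To get a feel for how a smaller die raises the number of chips per wafer, here is a rough sketch using a common first-order dies-per-wafer approximation. The 200-mm wafer diameter is an assumption (the article doesn't state one), and the estimate counts candidate dies only; it ignores defect density, scribe lines, and test yield.

```python
import math

def gross_dies_per_wafer(die_area_mm2: float, wafer_diameter_mm: float = 200.0) -> int:
    """First-order estimate of whole dies that fit on a round wafer.

    wafer_diameter_mm is an assumed value, not a figure from the article.
    """
    wafer_area = math.pi * (wafer_diameter_mm / 2) ** 2
    # Correction term for partial dies lost around the wafer's edge.
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(wafer_area / die_area_mm2 - edge_loss)

# PowerPC 601 die areas quoted in the article.
old_601 = gross_dies_per_wafer(121.0)  # 0.65-micron process
new_601 = gross_dies_per_wafer(74.0)   # 0.5-micron process

print(old_601, new_601, round(new_601 / old_601, 2))
```

Under these assumptions, the 74-mm(2) die gives roughly 70 percent more candidate dies per wafer than the 121-mm(2) die, which is the kind of gain that can offset a modestly higher process cost.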
Size reduction can also mean a boost in the processor's performance. The reduced size of the processor's circuits means that signals travel shorter distances between logic gates. It also lets the circuits operate at lower voltages. These lower voltage levels allow the logic gates to switch faster while consuming less power. Thus, the 166-MHz 603e and the 166-MHz 604e can run at higher frequencies yet dissipate less or the same amount of power as their predecessors. For more details on die size, check the table at right.
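The trade-off between voltage and clock rate can be sketched with the textbook rule of thumb that dynamic CMOS power scales with capacitance times voltage squared times frequency. This is a simplification, not the vendors' power model: it ignores leakage and the 3.3-V I/O ring, and the cap_ratio parameter (switched capacitance of the new part relative to the old) is a hypothetical knob left at 1.0.

```python
def relative_dynamic_power(v_old: float, f_old: float,
                           v_new: float, f_new: float,
                           cap_ratio: float = 1.0) -> float:
    """Ratio of new to old dynamic power under the P ~ C * V^2 * f rule of thumb.

    cap_ratio is an assumed value; the article gives no capacitance figures.
    """
    return cap_ratio * (v_new / v_old) ** 2 * (f_new / f_old)

# Core voltages and clock rates from the article: 3.3 V at 100 MHz vs. 2.5 V at 166 MHz.
ratio = relative_dynamic_power(3.3, 100.0, 2.5, 166.0)
print(f"{ratio:.2f}")
```

The ratio comes out slightly below 1, so under this simple model the drop to a 2.5-V core roughly cancels the cost of the 66 percent higher clock rate.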
Despite operating at the higher clock rate, the 166-MHz 603e consumes only 3 W (typical) at 166 MHz, the same as a 100-MHz 603e running at 100 MHz. Simulations show a dramatic performance improvement: At its named clock rate, the 166-MHz 603e posts an estimated 3.0 to 4.0 SPECint95 and 2.5 to 3.3 SPECfp95 -- about the same as a 100-MHz 604. At its named clock rate, the 166-MHz 604e typically dissipates an estimated 10 W, significantly less than a 133-MHz 604. Preliminary estimates by IBM peg the 166-MHz 604e at 5.0 to 6.0 SPECint95 and 4.0 to 5.0 SPECfp95 when running at 166 MHz. On top of the capabilities bestowed by a new fabrication process, each chip has key features added to its design that also boost performance.
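A quick bit of arithmetic on these figures shows the efficiency gain for the 604e. The sketch below uses the SPECint95 and power numbers from the article's table (4.55 at 14 W for the 604 at 133 MHz; an estimated 5.0 to 6.0 at 10 W for the 604e at 166 MHz). The 604e values are estimates, so the result is only as good as they are.

```python
# (SPECint95, watts) pairs from the article; the 604e scores are estimates.
PPC604_133 = (4.55, 14.0)
PPC604E_166_LOW = (5.0, 10.0)
PPC604E_166_HIGH = (6.0, 10.0)

def specint_per_watt(chip: tuple) -> float:
    """Integer performance delivered per watt dissipated."""
    score, watts = chip
    return score / watts

baseline = specint_per_watt(PPC604_133)
low_gain = specint_per_watt(PPC604E_166_LOW) / baseline
high_gain = specint_per_watt(PPC604E_166_HIGH) / baseline

print(f"{low_gain:.2f}x to {high_gain:.2f}x")
```

On these numbers, the 604e delivers roughly 1.5 to 1.8 times the integer performance per watt of the 604.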
603e Little Endian
The 166-MHz 603e contains 2.6 million transistors, approximately the same as the 100-MHz 603e. The design reduces power consumption by running the processor core at 2.5 V, while the bus and I/O interface still operate at 3.3 V. It's pin-compatible with the 100-MHz 603e. The new chip, with its higher clock speed, supports a wider range of clock multipliers (2:1, 5:2, 3:1, 7:2, 4:1, 9:2, 5:1, 11:2, and 6:1). This enables PC designers to build notebook systems that use modest bus clock speeds (such as 25 or 33 MHz) to conserve power, while the CPU runs at 150 MHz or 166 MHz to meet performance goals.
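The relationship between bus speed, multiplier, and core speed is simple multiplication, as this small sketch shows. The multiplier list comes from the article; the function name, the 1-MHz tolerance, and the choice to model a "33-MHz" bus as exactly 100/3 MHz are assumptions made for illustration.

```python
from fractions import Fraction

# Processor-to-bus clock multipliers listed for the 166-MHz 603e.
MULTIPLIERS = [Fraction(2), Fraction(5, 2), Fraction(3), Fraction(7, 2),
               Fraction(4), Fraction(9, 2), Fraction(5), Fraction(11, 2),
               Fraction(6)]

def multipliers_for(bus_mhz: Fraction, target_mhz: int,
                    tolerance_mhz: int = 1) -> list:
    """Return the multipliers that bring the core within tolerance of the target."""
    return [m for m in MULTIPLIERS
            if abs(bus_mhz * m - target_mhz) <= tolerance_mhz]

slow_bus = multipliers_for(Fraction(25), 150)      # 25-MHz bus, 150-MHz core
fast_bus = multipliers_for(Fraction(100, 3), 166)  # ~33-MHz bus, 166-MHz core

print(slow_bus, fast_bus)
```

A 25-MHz bus needs the 6:1 setting to reach 150 MHz, and a 33-MHz bus reaches 166 MHz at 5:1. Neither combination would be possible without the higher multipliers.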
A modification to the 166-MHz 603e's load/store logic provides better performance and support for little-endian addressing modes under Windows NT. Formerly, when the PowerPC operated in little-endian mode and software accessed misaligned data (such as when a 32-bit word straddled a 32-bit word boundary), an exception would occur and a millicode exception handler would field the access; see the sidebar "What the Heck Is Millicode?". Put another way, the processor first had to perform two accesses to read data crossing a word boundary. The chip would access the lower-address word first, regardless of the memory-addressing mode. The processor then spent additional cycles in a millicode handler that determines the endian order of the data.
With the 166-MHz 603e, the hardware keeps track of the data order. With the overhead of a millicode handler absent, misaligned data accesses complete several cycles faster. As a result, load/store operations now take the same number of cycles regardless of the endian addressing mode.
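The following sketch illustrates why a misaligned word costs two aligned accesses, and why the byte order then has to be sorted out separately. It is a conceptual model only: memory is a plain byte string, and the function names are hypothetical. It does not attempt to reproduce the PowerPC's actual little-endian address handling or the millicode handler.

```python
WORD = 4  # bytes in a 32-bit word

def load_aligned_word(mem: bytes, addr: int) -> bytes:
    """Model of the only access the memory interface performs: one aligned word."""
    assert addr % WORD == 0, "hardware access must be word aligned"
    return mem[addr:addr + WORD]

def load_word(mem: bytes, addr: int, byteorder: str):
    """Load a 32-bit value at any byte address.

    Returns (value, number of aligned accesses needed).
    """
    offset = addr % WORD
    base = addr - offset
    raw = load_aligned_word(mem, base)            # lower-address word first
    accesses = 1
    if offset:                                    # value straddles a word boundary
        raw += load_aligned_word(mem, base + WORD)
        accesses = 2
    data = raw[offset:offset + WORD]
    # Interpreting the bytes in the right order is the step that the
    # article says once fell to a millicode handler.
    return int.from_bytes(data, byteorder), accesses

mem = bytes(range(0x10, 0x20))  # 16 bytes holding 0x10 through 0x1F

aligned = load_word(mem, 4, "big")
misaligned_le = load_word(mem, 6, "little")
misaligned_be = load_word(mem, 6, "big")
```

The aligned load needs one access. The load at address 6 needs two, and the same four bytes yield a different value depending on which byte order is in effect.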
166-MHz 604e: Faster Fetching
Even with its smaller die, the 166-MHz 604e packs additional transistors that not only add new features but also enlarge its on-chip cache size from 32 KB to 64 KB. The new 604e has 5.6 million transistors, of which 3.8 million implement the on-chip caches. The 166-MHz 604e has separate code and data caches, each 32 KB in size, while the 100-MHz 604 had two separate 16-KB caches. The caches are logically organized as four-way set associative using 256 sets, instead of the 128 sets on the 604. By keeping the cache organization as four-way, the 166-MHz 604e is pin-compatible with the 100-MHz 604. The processor core operates at 2.5 V, and it supports processor-to-bus frequency ratios of 1:1, 3:2, 2:1, 5:2, 3:1, and 4:1, which can simplify a system design.
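The cache figures above hang together arithmetically: capacity equals sets times ways times line size. The sketch below derives the line size implied by the article's numbers and shows how a generic set-associative cache picks a set from an address. The set_index function is the textbook indexing scheme and is an assumption here; the article doesn't say which address bits the 604e actually uses.

```python
def line_size_bytes(cache_bytes: int, ways: int, sets: int) -> int:
    """Line size implied by capacity = sets * ways * line size."""
    return cache_bytes // (ways * sets)

def set_index(addr: int, line_bytes: int, sets: int) -> int:
    """Textbook set selection: drop the offset bits, then wrap by the set count."""
    return (addr // line_bytes) % sets

KB = 1024
line_604 = line_size_bytes(16 * KB, ways=4, sets=128)   # 100-MHz 604, per cache
line_604e = line_size_bytes(32 * KB, ways=4, sets=256)  # 166-MHz 604e, per cache

# Two addresses 4 KB apart: do they compete for the same set?
same_set_604 = set_index(0x0000, line_604, 128) == set_index(0x1000, line_604, 128)
same_set_604e = set_index(0x0000, line_604e, 256) == set_index(0x1000, line_604e, 256)

print(line_604, line_604e, same_set_604, same_set_604e)
```

Both designs work out to the same line size, so the 604e doubles capacity purely by doubling the number of sets. Under the assumed indexing, addresses 4 KB apart collide in the same set on the 604 but land in different sets on the 604e.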
The CPU designers beefed up the logic of the load/store unit to reduce the number of cycles spent fetching and writing data. The cache logic forwards a subsequent nonspeculative load operation immediately to the load/store unit, rather than waiting for the cache fill to complete (as it does on the 100-MHz 604). Like the 166-MHz 603e, the 166-MHz 604e provides improved hardware support for little-endian misaligned data accesses.
Room to Grow
These new processors offer performance benefits beyond just faster clock speeds. The reduction of a few clock cycles here and there on load operations might not seem like much of an improvement. However, because a processor spends its time either executing instructions or shipping data in and out, these faster operations add up to a significant performance boost.
The improved little-endian addressing support makes these processors capable of hosting any operating system, regardless of its addressing mode, without performance degradation. This is especially important for Windows NT, which operates in little-endian mode. You can expect to see the 166-MHz 603e and 166-MHz 604e at the heart of any system based on the PowerPC Common Hardware Reference Platform (CHRP).
For notebook computers, a 166-MHz 603e will mean 604-level performance but with longer battery time. (At BYTE we tested the battery life of an Apple PowerBook 5300 equipped with a 100-MHz 603e CPU, active-matrix color display, and lithium-ion battery. It ran for nearly 7 hours.)
A 166-MHz PowerPC 604e, armed with both the 604's speculative execution and branch prediction logic, and the improved load/store instruction performance, should endow a desktop system with processing power beyond that of any system based on Intel's new P6. It's important to note that the 166-MHz clock speed is only the starting point. IBM and Motorola engineers say these enhanced processors have the potential to reach a clock speed of 200 MHz.
PROCESSOR                            100-MHZ 603E     166-MHZ 603E
========================================================================
Die size                             98 mm(2)         81 mm(2)
Number of transistors                2.6 million      2.6 million
On-chip cache size                   32 KB            32 KB
Current maximum clock speed          120 MHz          166 MHz
Voltage                              3.3 V            2.5 V (core)
Power dissipation (at max. speed)    3 W              3 W
SPECint95 (max. clock speed)         2.5-3.3*         3.0-4.0*
SPECfp95 (max. clock speed)          2.1-2.8*         2.5-3.3*
PROCESSOR                            100-MHZ 604      166-MHZ 604E
========================================================================
Die size                             196 mm(2)        148 mm(2)
Number of transistors                3.6 million      5.6 million
On-chip cache size                   32 KB            64 KB
Current maximum clock speed          133 MHz          166 MHz
Voltage                              3.3 V            2.5 V (core)
Power dissipation (at max. speed)    14 W             10 W
SPECint95 (max. clock speed)         4.55             5.0-6.0*
SPECfp95 (max. clock speed)          3.31             4.0-5.0*
* estimated
Tom Thompson is a BYTE senior technical editor. He's the author of Power Macintosh Programming Starter Kit (Hayden Books, 1994). You can contact him on the Internet at tom_thompson@bix.com.