Next generation: Alpha 21164A
DOSSIER:
The Alpha adheres more closely to the RISC philosophy than its competitors by stripping away every ounce of fat from the hardware and instruction set in favor of the fastest possible data path. Alpha designers believe that a faster clock will buy you what other chips achieve through fancy hardware. The principle appears to work: The 21164 was the fastest single-chip microprocessor in the world when it was launched in 1995, with three times the integer performance of a Pentium/100 and faster FPU performance than Mips' R8000 supercomputer chip set. The next-generation 21164A's layout has not changed, but the chip will apply a process shrink and compiler enhancements to deliver even more SPECmarks.
The 21164 family eschews out-of-order execution, relying instead on smart compilers that can sequence code to minimize pipeline stalls. At 9.3 million transistors, these chips have the largest transistor budget of any CPUs made so far, but most of them are used for cache-memory cells. The 21164 design has relatively small (8-KB) direct-mapped primary caches, but it's the first CPU to place a large (96-KB) secondary cache actually on-chip for minimum latency. Intel's P6 has its secondary cache in the same package, but not on the same silicon slice.
The 21164 family has four execution units (two integer and two floating-point) and can issue two instructions of each kind per cycle. It has a four-stage instruction pipeline that feeds into separate integer, floating-point, and memory-execution pipelines. Compared to other next-generation RISC chips, the 21164 has pipelines that are relatively deep and simple, for high clock speeds.
The instruction pipeline doesn't do any complex checking for data dependencies or issue any instructions out of order. If the current four instructions cannot all be issued into different execution units, the instruction pipeline stalls until they can. Unlike its competitors, the 21164 doesn't use register renaming; instead, it updates architectural registers directly once a result has reached the final write-back stage of the pipeline. To avoid delaying any dependent instructions, the chip has extensive register-bypass routes so that shared operands are available well before the write-back stage.
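To make the stall behavior concrete, here is a toy Python model of in-order issue from groups of four. It is a sketch under loud assumptions, not a description of Digital's actual issue logic: the `Instr` record, the register names, and the rule that a result becomes usable one cycle after issue (standing in for the bypass routes) are all invented for illustration. Only the group size of four and the two integer plus two floating-point units come from the text above.

```python
from collections import namedtuple

# Hypothetical instruction record (not from the article):
# unit is "int" or "fp"; dst is the register written; srcs are registers read.
Instr = namedtuple("Instr", "unit dst srcs")

UNITS_PER_CYCLE = {"int": 2, "fp": 2}  # two integer, two floating-point units
GROUP_SIZE = 4                         # instructions are considered four at a time


def issue_cycles(program):
    """Count issue cycles for a strictly in-order, four-wide toy machine.

    Assumption: a result is available to dependents one cycle after issue.
    The next group of four is not touched until the current group has
    completely issued.
    """
    cycles = 0
    for start in range(0, len(program), GROUP_SIZE):
        pending = list(program[start:start + GROUP_SIZE])
        while pending:
            free = dict(UNITS_PER_CYCLE)
            written = set()  # registers produced earlier in this same cycle
            issued = 0
            for ins in pending:
                # In-order issue: the first instruction that cannot go
                # blocks everything behind it for this cycle.
                if free[ins.unit] == 0 or written & set(ins.srcs):
                    break
                free[ins.unit] -= 1
                written.add(ins.dst)
                issued += 1
            pending = pending[issued:]
            cycles += 1
    return cycles


# Two integer and two floating-point instructions, all independent.
BALANCED = [
    Instr("int", "r1", ("r2", "r3")),
    Instr("int", "r4", ("r5", "r6")),
    Instr("fp", "f1", ("f2", "f3")),
    Instr("fp", "f4", ("f5", "f6")),
]

# Four independent integer instructions: only two integer units exist.
ALL_INT = [
    Instr("int", "r1", ("r10", "r11")),
    Instr("int", "r2", ("r12", "r13")),
    Instr("int", "r3", ("r14", "r15")),
    Instr("int", "r4", ("r16", "r17")),
]

# Each instruction needs the result of the one before it.
CHAIN = [
    Instr("int", "r1", ("r0",)),
    Instr("int", "r2", ("r1",)),
    Instr("int", "r3", ("r2",)),
    Instr("int", "r4", ("r3",)),
]

print(issue_cycles(BALANCED), issue_cycles(ALL_INT), issue_cycles(CHAIN))
```

In this model the balanced group issues in one cycle, the all-integer group takes two, and the dependent chain takes four, which is the kind of gap the compiler is expected to close by reordering code.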
Digital intends to push the Alpha for NT servers rather than for the more traditional Unix. Coupled with keener pricing, this could become a winning strategy.
OFFICIAL INTRODUCTION DATE:
During 1996
CURRENT STATUS:
Sampling in first quarter of 1996
LIKELIHOOD INTRODUCTION DATE WILL BE MET:
Unknown
TARGET CLOCK SPEED:
Over 300 MHz
ESTIMATED PERFORMANCE:
500 SPECint92; 700 SPECfp92
FABRICATION PROCESS/FEATURE SIZE:
CMOS/0.35-micron
TECHNOLOGICAL ADVANTAGES:
The Alpha 21164A retains the microarchitecture of the 21164, a four-way superscalar design with extremely simple, stripped-down data paths that allow it to be clocked faster than other vendors' chips.
TECHNOLOGICAL DISADVANTAGES:
At around $3000, the Alpha 21164 is the most expensive RISC chip on the market, and the 21164A is likely to retain this distinction. The 21164A also depends more heavily than some competitors on compiler quality to achieve maximum throughput. For optimum throughput, the chip must cluster instructions into groups of four that execute together, and this task is left entirely to the compiler.
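The grouping job left to the compiler can be sketched as a simple greedy scheduler. The Python below is illustrative only and is not Digital's compiler algorithm: the `Instr` record, the instruction names, and the hazard rules are assumptions made for the example. It packs a straight-line sequence into groups of at most four, with at most two integer and two floating-point instructions per group, and never places an instruction in the same group as, or ahead of, one it depends on.

```python
from collections import namedtuple

# Hypothetical instruction record (not from the article).
Instr = namedtuple("Instr", "name unit dst srcs")

LIMITS = {"int": 2, "fp": 2}  # execution units available per group
GROUP_SIZE = 4


def pack_groups(program):
    """Greedily pack instructions into issue groups, preserving dependencies."""
    remaining = list(program)
    groups = []
    while remaining:
        group, skipped = [], []
        free = dict(LIMITS)
        writes = set()         # dsts of every earlier instruction seen this pass
        skipped_reads = set()  # srcs of earlier instructions left for later groups
        for ins in remaining:
            hazard = (
                bool(writes & set(ins.srcs))   # needs a result not yet available
                or ins.dst in writes           # would reorder two writes
                or ins.dst in skipped_reads    # would clobber a value still needed
            )
            if len(group) < GROUP_SIZE and free[ins.unit] > 0 and not hazard:
                group.append(ins)
                free[ins.unit] -= 1
            else:
                skipped.append(ins)
                skipped_reads.update(ins.srcs)
            writes.add(ins.dst)
        groups.append(group)
        remaining = skipped
    return groups


# A made-up sequence with two short dependency chains.
SAMPLE = [
    Instr("i1", "int", "r1", ("r2", "r3")),
    Instr("i2", "int", "r4", ("r1", "r5")),   # needs i1
    Instr("i3", "int", "r6", ("r7", "r8")),
    Instr("f1", "fp", "f1", ("f2", "f3")),
    Instr("f2", "fp", "f4", ("f1", "f5")),    # needs f1
    Instr("f3", "fp", "f6", ("f7", "f8")),
    Instr("i4", "int", "r9", ("r4", "r6")),   # needs i2 and i3
    Instr("f4", "fp", "f9", ("f4", "f6")),    # needs f2 and f3
]

for number, group in enumerate(pack_groups(SAMPLE), start=1):
    print(number, [ins.name for ins in group])
```

For the sample sequence the scheduler fills the first group completely by hoisting the independent instructions past the dependent ones, then needs two partly filled groups for the rest. A production compiler would also weigh instruction latencies and memory operations, which this sketch ignores.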
PRIMARY MARKET:
Scientific and engineering workstations and high-end Windows NT and Unix servers.
WHERE TO FIND:
Digital Equipment
Hudson, MA
(800) 332-2717
(508) 568-6868
semiconductor@digital.com
Inside the World's Fastest RISC CPU
The 21164A improves upon the existing Alpha design with a process shrink and a better compiler.