Designing Alpha-Based Systems


June 1995 / Core Technologies / Designing Alpha-Based Systems

Digital's trio of processors offers different design possibilities

Bruce Faust

In a world where speed is king, not all RISC PCs are created equal. Currently, there is a marketing battle over which of the industry titans' RISC PCs is the fastest. However, there's one unmistakable truth concerning RISC PCs: If you have ever used one and run a "native" Windows NT application (an application compiled for the RISC processor, not something running in an x86 emulator), you'll never want to go back to an x86-based system.

Consider, for example, Digital Equipment's Alpha AXP family of microprocessors. Digital's Semiconductor Operation (Hudson, MA) has developed CPUs for many years, and the Alpha comes from the MicroVAX family of CPUs. However, the Alpha is rather unique when compared to other RISC processors. It was designed from the beginning as a 64-bit processor, which differs from other 64-bit RISC processors that have evolved from 32-bit implementations. It has 64-bit address and data lines, pipelined both in and out of the processor. Furthermore, the Alpha is not only superpipelined but superscalar as well. In a superscalar design, the CPU issues more than one instruction per clock tick. Digital's newest Alpha design, the 21164, issues four instructions per clock tick. This super-superscalar approach, coupled with a 300-MHz clock speed, yields a mind-bending peak rate of 1.2 billion instructions per second.
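The 1.2-billion figure is a peak issue rate: the clock rate multiplied by the issue width. A minimal sketch of that arithmetic in Python follows. The function name is made up for illustration, and sustained throughput on real code will be lower whenever the chip can't find four instructions to issue in a tick.

```python
def peak_instructions_per_second(clock_mhz: float, issue_width: int) -> float:
    """Upper bound on throughput: clock ticks per second times instructions per tick."""
    return clock_mhz * 1e6 * issue_width


# Figures from the article: a 300-MHz 21164 issuing four instructions per tick.
peak = peak_instructions_per_second(clock_mhz=300, issue_width=4)
print(f"{peak / 1e9:.1f} billion instructions per second")
```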

There are essentially three types of Alpha CPUs: the 21066, the 21064 (with two varieties), and the 21164. The table on page 240 compares the Alpha family of processors. As the table shows, the processors are quite similar in makeup. However, there are variations in the internal cache sizes, clock speeds, and external glue logic required. With that in mind, let's start with the first member of the Alpha family.

The 21066 (PCA, PC Alpha)

The 21066's strength lies in its ease of integration into a PC system, hence the moniker PC Alpha. That's because the 21066 provides all cache, DRAM, and PCI (Peripheral Component Interconnect) logic signals on-chip. The PCI bus interface is 32 bits wide, which offers transfer rates of up to 132 MBps. Put another way, the designer doesn't have to design the external glue logic for the cache, main memory, or a PCI interface. A computer architect can lay out the motherboard and then attach the multiplexed address and data lines for the write-back cache and main memory. Digital added an on-board PLL (phase-locked loop) that further simplifies the implementation of the PCI interface. You supply the 21066 with an external 33-MHz clock signal, and the PLL multiplies it internally to give the processor a clock speed of either 166 MHz or 233 MHz. Meanwhile, the external hardware, such as the PCI bus, cache, and memory, continues to operate at 33 MHz, simplifying design and reducing component costs.
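The relationship between the external clock and the two core speeds can be checked with a little arithmetic. The sketch below assumes that the "33 MHz" clock is nominally 33 1/3 MHz and that the PLL uses whole-number multipliers; neither detail is stated above, and the helper name is hypothetical.

```python
from fractions import Fraction

# Assumption: the "33 MHz" external clock is nominally 33 1/3 MHz.
EXTERNAL_CLOCK_MHZ = Fraction(100, 3)


def pll_multiplier(core_mhz: int, external_mhz: Fraction = EXTERNAL_CLOCK_MHZ) -> int:
    """Nearest whole-number multiplier that takes the external clock to the core clock."""
    return round(Fraction(core_mhz) / external_mhz)


for core in (166, 233):
    m = pll_multiplier(core)
    print(f"{core}-MHz core: about {m} x {float(EXTERNAL_CLOCK_MHZ):.2f} MHz")
```

Under those assumptions the two speed grades work out to multipliers of 5 and 7, while everything outside the chip still sees only the slow clock.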

In some tests, the 21066 can outperform the faster 21064 family of CPUs in PCI I/O, simply because the former processor's PCI interface is efficient. The cache and DRAM bus is 64 bits wide, giving the processor bandwidth of up to 264 MBps. However, because the cache and DRAM interface are time-multiplexed, the 21066 takes a performance hit relative to the 21064 and 21164 processors on memory accesses.
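Both bandwidth figures quoted for the 21066 (132 MBps for PCI, 264 MBps for cache and DRAM) fall out of the same simple formula: bus width in bytes times clock rate. Here is a rough sketch, assuming one full-width transfer on every 33-MHz clock, which is a best case that real buses rarely sustain. The function name is illustrative only.

```python
def peak_bandwidth_mbps(bus_width_bits: int, clock_mhz: int) -> int:
    """Peak transfer rate in MBps, assuming one full-width transfer per clock."""
    return (bus_width_bits // 8) * clock_mhz


pci_mbps = peak_bandwidth_mbps(bus_width_bits=32, clock_mhz=33)
memory_mbps = peak_bandwidth_mbps(bus_width_bits=64, clock_mhz=33)
print(f"32-bit PCI bus:       {pci_mbps} MBps")
print(f"64-bit cache/DRAM bus: {memory_mbps} MBps")
```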

While the 21066's integer performance is bested by Intel's Pentium (94 SPECint92 at 233 MHz versus 112 SPECint92 for a 100-MHz Pentium), the Alpha's floating-point performance is quite impressive (110 SPECfp92 versus the Pentium's 82 SPECfp92). Floating-point computations are extremely important for rendering, animation, CAD, and other scientific applications. The strengths of the 21066 are evident in low-cost 64-bit RISC applications. If you want good floating-point performance as well as good I/O performance in a low-cost workstation, the 21066-based workstation is for you. Users of 21066-based systems enjoy about 25 percent better floating-point performance than Pentium 100 users. Base prices for 21066-based machines are under $4000.
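The size of the gap depends on which score you treat as the baseline. The short sketch below uses only the four SPEC figures quoted above; the dictionary layout and function name are illustrative, not part of any benchmark tool.

```python
# SPEC92 scores as quoted in the article.
SPEC92 = {
    "21066 at 233 MHz": {"int": 94, "fp": 110},
    "Pentium at 100 MHz": {"int": 112, "fp": 82},
}


def percent_difference(score: float, baseline: float) -> float:
    """How far `score` sits above (+) or below (-) `baseline`, in percent."""
    return (score - baseline) / baseline * 100


alpha = SPEC92["21066 at 233 MHz"]
pentium = SPEC92["Pentium at 100 MHz"]

fp_alpha_vs_pentium = percent_difference(alpha["fp"], pentium["fp"])
fp_pentium_vs_alpha = percent_difference(pentium["fp"], alpha["fp"])
int_alpha_vs_pentium = percent_difference(alpha["int"], pentium["int"])

print(f"Alpha floating point relative to Pentium: {fp_alpha_vs_pentium:+.0f}%")
print(f"Pentium floating point relative to Alpha: {fp_pentium_vs_alpha:+.0f}%")
print(f"Alpha integer relative to Pentium:        {int_alpha_vs_pentium:+.0f}%")
```

Measured against the Alpha's score, the Pentium trails by about a quarter in floating point; measured against the Pentium's score, the Alpha's lead is closer to a third.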

The 21064

The 21064 was the first Alpha processor to arrive on the market, originally running at 150 MHz. Now the chip ticks along at 275 MHz. However, the 21064 requires external glue logic to interface the cache, DRAM, and PCI. The 21064 uses separate (nonmultiplexed) address and data lines; therefore, memory accesses are more efficient than in the 21066. This bus arrangement also allows such enhancements as doubling the data paths from 64 bits to 128 bits, which offers a more effective method for minimizing wait states and maximizing cache efficiency. However, designing the cache technology to minimize the wait states from the CPU to cache memory is somewhat difficult, because a 21064 running at 275 MHz has a cycle time of under 4 ns (nanoseconds). As a result, even using the currently available 15-ns, 1-Mb static RAMs yields four wait states per memory access at best.
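To see where the four wait states come from, compare the SRAM access time with the CPU's cycle time. The sketch below uses a deliberately simplified model: an access occupies a whole number of CPU cycles, and every cycle after the first counts as a wait state. It ignores address setup, tag comparison, and board delays, so treat the result as a lower bound. The function names are hypothetical.

```python
import math


def cycle_time_ns(clock_mhz: float) -> float:
    """Length of one CPU clock cycle in nanoseconds."""
    return 1000.0 / clock_mhz


def wait_states(access_time_ns: float, clock_mhz: float) -> int:
    """Cycles spent waiting beyond the first, under the simplified model above."""
    cycles_needed = math.ceil(access_time_ns * clock_mhz / 1000.0)
    return max(cycles_needed - 1, 0)


print(f"Cycle time at 275 MHz: {cycle_time_ns(275):.2f} ns")
for sram_ns in (15, 10):
    print(f"{sram_ns}-ns SRAM: {wait_states(sram_ns, 275)} wait states")
```

With 15-ns parts the model agrees with the four wait states cited above, and it shows why faster 10-ns parts are attractive at this clock speed.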

Using such cache techniques as two-way set associativity and synchronous static RAMs greatly improves cache performance. For sequential data applications, it is sometimes better to operate a smaller yet faster cache, such as one 512-KB cache using 10-ns parts. In applications where the data might be accessed randomly, having a larger yet slower cache offers better performance.
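A toy simulation makes the size-versus-speed trade-off concrete. This is a sketch under stated assumptions, not a model of any real 21064 board: the cache is direct-mapped rather than two-way set associative, and the 32-byte line size, the 100-ns miss penalty, the 2-MB size of the slower cache, and the two access patterns are all invented for illustration. Only the 10-ns and 15-ns part speeds and the 512-KB size come from the text.

```python
import random

LINE_BYTES = 32          # assumption: cache line size, for illustration only
WORD_BYTES = 8           # 64-bit data accesses
MISS_PENALTY_NS = 100.0  # assumption: hypothetical cost of going to main memory
KB, MB = 1024, 1024 * 1024


def hit_rate(addresses, cache_bytes, line_bytes=LINE_BYTES):
    """Run a byte-address trace through a direct-mapped cache; return hits / accesses."""
    num_slots = cache_bytes // line_bytes
    resident = [None] * num_slots          # which memory line each slot holds
    hits = 0
    for addr in addresses:
        line = addr // line_bytes
        slot = line % num_slots
        if resident[slot] == line:
            hits += 1
        else:
            resident[slot] = line          # miss: fill the slot
    return hits / len(addresses)


def average_access_ns(hit_time_ns, rate, miss_penalty_ns=MISS_PENALTY_NS):
    """Average access time = hit time + miss rate * miss penalty."""
    return hit_time_ns + (1.0 - rate) * miss_penalty_ns


caches = {"512 KB, 10 ns": (512 * KB, 10.0), "2 MB, 15 ns": (2 * MB, 15.0)}

# One pass over 1 MB, word by word, versus scattered reads within a 1-MB region.
sequential = list(range(0, 1 * MB, WORD_BYTES))
rng = random.Random(1995)  # fixed seed so the run is repeatable
scattered = [rng.randrange(0, 1 * MB, WORD_BYTES) for _ in range(200_000)]

results = {}
for name, (size, hit_ns) in caches.items():
    for label, trace in (("sequential", sequential), ("random", scattered)):
        rate = hit_rate(trace, size)
        results[(name, label)] = (rate, average_access_ns(hit_ns, rate))
        print(f"{name:14s} {label:10s} hit rate {rate:.2f}, "
              f"average {results[(name, label)][1]:.1f} ns")
```

In the sequential pass both caches hit at the same rate, because only the line size matters when data streams through once, so the faster parts win. In the scattered pattern the larger cache holds the whole region and comes out ahead despite its slower parts.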

Newer 21064-based designs that offer cache SIMM modules are on the way. These cache SIMM modules are densely populated and can use fast 10-ns, 256-KB or 1-MB parts. These modules can be expanded from 2 MB up to 8 MB, allowing the 21064-based system to gain the best of both worlds: fast 10-ns access time for sequential applications and a deep cache for random-access applications. However, this makes the 21064 design more complex than developing a 21066-based machine.

Although the 21064 might be more of a design challenge for engineers, users who like the more-power approach to computing love this class of machine. Running native Windows NT applications, a 275-MHz 21064 machine is about twice the speed of a Pentium 100 system, and floating-point performance is roughly four times faster than that of the Pentium. Emulated 16-bit x86 applications on the Alpha run at about the speed of a 50-MHz 486DX2. So, if you run many 16-bit applications, you might want a Pentium system instead.

The 21164

The 21164 is the newest in a series of Alpha CPUs from Digital. And this one really screams, especially when it can operate at 300 MHz. At this speed, it posts 330 SPECint92 and 500 SPECfp92. The key to this blazing performance is that the processor has a level 2 cache on-chip and issues four instructions per clock cycle. Because the level 2 cache is latched to the speed of the microprocessor, it offers zero wait states. The only exception, of course, is if the next set of data is not cached in either the level 1 or level 2 cache and must be fetched from an off-chip cache or from main memory. With cycle times now less than 4 ns, and using cache module SIMMs with 10-ns speeds, the 21164 will probably have at least four wait states. However, silicon that glues this chip to a third-level cache, DRAM, and PCI interface is not yet available. Such chip sets are expected to be released later this summer. Also, the planned PCI interface is expected to be expanded to 64 bits, adding to the complexity of the ASIC design of the glue interface. Early versions of systems based on this chip will be expensive. Such systems will have complex designs and will require costly high-speed parts to keep the 21164 running at full speed. Also, the 21164 alone has a price tag of $3000, higher than the price of some PC systems.

Market Outlook

Despite the Alpha's lead in the clock and performance race, Digital clearly has a number of obstacles it must overcome to make the processor pervasive. First and foremost, more software vendors need to embrace the Alpha. To date, over 1500 vendors have already ported applications to the processor, Word and Excel among them. However, it's the ports of such software as Microstation, Pro Engineer, and NewTek's Lightwave 3D that have fueled a boom in the Alpha-based workstation market. These applications are heavily floating-point intensive, and the native versions of these applications run circles around Pentium and even other competing RISC architectures.

Another obstacle is support silicon. Glue logic chip sets are crucial for system designers to develop hardware capable of harnessing the Alpha technology. Without this, I doubt many designers would be interested in developing programmable array logic-based motherboards. However, DeskStation (Lenexa, KS) has recently developed a chip set for the 21064 and one for the 21164. These should be available early this summer. Other vendors should follow suit.

Finally, pricing for the Alpha chip technology must entice users to make the switch from Intel or its clones over to Alpha. Time will tell if Digital has made the right gamble.


ALPHA FAMILY FEATURES

Each processor's features target specific price/performance markets.

                                
21066    21066A   21064    21064A   21164


Total on-chip cache size        16 KB    32 KB    16 KB    32 KB    112 KB

On-chip secondary cache size    n/a      n/a      n/a      n/a      96 KB
                                                                    (unified
                                                                    data
                                                                    and
                                                                    instruc-

                                                                    tion)

Die technology (micron size)    .68      .50      .75      .50      .50
                                                                    to .40

Clock speed (MHz)               166      233      150,     233,     266,
                                                  166,     275      300,
                                                  200               300+

Transistor count (millions)     2.2      2.4      1.6      1.8      9.6

External data bus               64 bit   64 bit   128 bit  128 bit  256 bit

External cache (level 2)        256      256      128      128      n/a
                                KB to    KB to    KB to    KB to
                                1 MB     1 MB     16 MB    16 MB

External cache (level 3)        n/a      n/a      n/a      n/a      2 MB
                                                                    to 64 MB

Build in DRAM interface         X        X        O        O        O

Build in cache RAM interface    X        X
        O        O        O

Build in PCI interface (32 bit) X        X        O        O        O

External chip set required?     O        O        X        X        X

External chip set PCI interface 32 bit   32 bit   32 bit   32 bit   64 bit

Pin grid array count            287 pin  287 pin  431 pin  431 pin  499 pin

Power dissipation               22.5 W   27 W     27 W     32 W     43.5 W

X  yes;
O = no;
n/a = not applicable



DEC's Alpha AXP Family



Bruce Faust holds a graduate degree and is founder of Carrera Computers and NekoTech. Both NekoTech and Carrera manufacture RISC PCs based on Alpha technology.
