
Is Cache Losing Its Cachet?


June 1995 / State Of The Art / Fast, Smart RAM / Is Cache Losing Its Cachet?
David F. Bacon and Peter Wayner

Adding cache memory is well recognized by computer buyers as a reasonable way to turbocharge a system's performance. Nowadays, however, the need for separate caches is disappearing, for two reasons: newer microprocessors put more cache directly on the CPU die, and multitasking OSes fragment memory demands, eroding much of the performance advantage that cache memory is supposed to provide.

Recent generations of CPU chips have had enough silicon real estate to include a small on-chip cache. These caches have generally been in the range of 8 to 32 KB, which is too small to help many applications. As a result, many computer systems have been built with a larger, off-chip L2 (Level 2) cache to supplement the on-chip L1 cache.

However, on-chip caches are getting larger. Intel's newly announced P6, for example, has 256 KB of on-board L2 cache, while Digital Equipment's Alpha 21164 has 96 KB of on-chip L2 cache memory. With large on-chip caches like these, the complexity and expense of adding a separate L2 cache to a PC or workstation make less sense, so we can expect to see fewer machines with external caches in years to come.

Large software packages and multitasking OSes like OS/2 Warp can destroy the value of a cache if it isn't large enough to hold all the code being executed. When the CPU switches between jobs, it can't find the information it needs in the cache, and it must request it from the substantially slower main memory. Users of Microsoft Windows, for example, may notice this effect already when they ask their system to print in the background. Many machines can't keep both the printing code and the Windows code in the cache simultaneously, so the constant switching makes the system run at the slowest memory speeds.
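The effect described above can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the cache is modeled as fully associative with least-recently-used replacement, each task loops over its own block of memory lines, and the sizes (64 lines per task, a 64-access time slice) are made up for illustration rather than taken from any real machine. The function name `hit_rate` and its parameters are hypothetical.

```python
from collections import OrderedDict


def hit_rate(cache_lines, working_sets, time_slice, rounds):
    """Fraction of accesses that hit in a shared LRU cache.

    Tasks run round-robin; each task cycles through its own working set
    for `time_slice` accesses before the next task takes over.
    """
    cache = OrderedDict()                 # keys are cached line addresses, oldest first
    hits = accesses = 0
    cursor = [0] * len(working_sets)      # where each task left off
    for _ in range(rounds):
        for task, lines in enumerate(working_sets):
            for _ in range(time_slice):
                addr = lines[cursor[task] % len(lines)]
                cursor[task] += 1
                accesses += 1
                if addr in cache:
                    hits += 1
                    cache.move_to_end(addr)        # mark as most recently used
                else:
                    if len(cache) >= cache_lines:
                        cache.popitem(last=False)  # evict least recently used
                    cache[addr] = True
    return hits / accesses


# Two hypothetical tasks, each looping over 64 distinct memory lines.
windows_code = list(range(0, 64))
print_code = list(range(1000, 1064))

roomy = hit_rate(128, [windows_code, print_code], time_slice=64, rounds=10)
cramped = hit_rate(64, [windows_code, print_code], time_slice=64, rounds=10)
print(f"cache holds both tasks: {roomy:.2f}")
print(f"cache holds one task:   {cramped:.2f}")
```

Under these assumptions, the cache that holds both working sets misses only on the first pass through each task, while the cache that holds just one is emptied by every task switch and never gets a hit. Real caches and real workloads are far less tidy, but the direction of the effect is the same.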

Look for innovations in cache design driven by the growing presence of multiprocessors. Multiprocessors are just beginning to break into the mainstream server market, and with the demands of desktop conferencing and high-end multimedia applications, multiprocessors are likely to become the platform of choice for power users before too long.

Cache design for multiprocessors is considerably more complicated. If processor A wishes to update a memory location cached by processor B, B's copy must be either invalidated or updated by A. Even worse, if B has already modified its copy, then before A can proceed, B's data must be either flushed back to main memory or transmitted directly to processor A. So far, we've seen two different approaches to solving this. Either all the processors monitor all the memory traffic, looking for potential conflicts with their locally cached data (a "snoopy" cache), or the main memory controller keeps track of which processors have cached which memory locations (a "directory-based" cache).
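A minimal sketch of the snoopy approach may make the sequence of events easier to follow. It assumes a write-invalidate policy (the other option mentioned above, updating B's copy, is not shown), it flushes modified data to main memory rather than passing it directly between processors, and it broadcasts on every write for simplicity. The class names `Bus` and `SnoopyCache` are hypothetical and do not model any particular processor.

```python
class Bus:
    """Shared bus: every cache attached here sees every broadcast."""

    def __init__(self):
        self.memory = {}   # main memory: address -> value
        self.caches = []   # all caches snooping this bus


class SnoopyCache:
    def __init__(self, name, bus):
        self.name = name
        self.bus = bus
        self.lines = {}    # address -> [value, dirty flag]
        bus.caches.append(self)

    def snoop(self, addr, is_write):
        """React to another processor's access to `addr`."""
        line = self.lines.get(addr)
        if line is None:
            return                           # not cached here, nothing to do
        if line[1]:
            self.bus.memory[addr] = line[0]  # flush modified data to memory
            line[1] = False
        if is_write:
            del self.lines[addr]             # our copy is about to go stale

    def _broadcast(self, addr, is_write):
        for cache in self.bus.caches:
            if cache is not self:
                cache.snoop(addr, is_write)

    def read(self, addr):
        if addr not in self.lines:
            self._broadcast(addr, is_write=False)
            self.lines[addr] = [self.bus.memory.get(addr, 0), False]
        return self.lines[addr][0]

    def write(self, addr, value):
        self._broadcast(addr, is_write=True)  # invalidate all other copies
        self.lines[addr] = [value, True]


bus = Bus()
a = SnoopyCache("A", bus)
b = SnoopyCache("B", bus)

b.write(0x10, 7)       # B holds a modified copy
print(a.read(0x10))    # B flushes first, so A sees B's value
a.write(0x10, 9)       # B's copy is invalidated
print(0x10 in b.lines)
```

Note that every read miss and every write touches every cache on the bus, which is exactly why this scheme depends on all memory traffic being visible to all processors.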

Each scheme has its advantages and disadvantages. Snoopy caches are generally easier to implement, but they require that all memory traffic go over a shared bus. Directory-based caches require extra memory to keep track of the outstanding copies, but they can be used with more sophisticated processor-interconnection networks that provide higher bandwidth and scale to a larger number of processors.
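The trade-off can be sketched in a few lines. The example below assumes one common way of organizing a directory, a record per memory block of which processors hold a copy, stored as one presence bit per processor; the article does not say which organization any given machine uses. The names `Directory` and `directory_overhead`, and the figures of 16 processors and 32-byte blocks, are hypothetical.

```python
class Directory:
    """Memory-controller bookkeeping: who has cached which block."""

    def __init__(self):
        self.sharers = {}   # block address -> set of processor ids

    def read(self, proc, addr):
        """Record that `proc` now holds a copy of `addr`."""
        self.sharers.setdefault(addr, set()).add(proc)

    def write(self, proc, addr):
        """Return only the processors that must invalidate their copies."""
        targets = self.sharers.get(addr, set()) - {proc}
        self.sharers[addr] = {proc}      # the writer is now the sole holder
        return sorted(targets)


def directory_overhead(num_processors, block_bytes):
    """Extra memory as a fraction of data, one presence bit per processor."""
    return num_processors / (block_bytes * 8)


d = Directory()
d.read(1, 0x40)
d.read(2, 0x40)
print(d.write(0, 0x40))              # only processors 1 and 2 are notified
print(directory_overhead(16, 32))    # fraction of extra memory needed
```

Unlike the snoopy scheme, a write here generates messages only to the processors that actually hold the block, so no shared bus is needed. The price is the bookkeeping memory, which in this sketch grows with the number of processors.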

Multiprocessor systems have been the subject of research for the past 30 years, but it's only in the last five or 10 years that they have managed to capture a significant portion of the high-end supercomputing market. Now, as multiprocessors make their way into the high-volume PC and workstation businesses, that research will come face-to-face with the real world.


Copyright © 2005 CMP Media LLC