oring the 16-bit exponent portion. This yields eight 64-bit logical registers without significantly altering the x86 architecture.
MMX instructions can pack several data types into these 64-bit registers: packed bytes (eight per register), packed words (four per register), packed doublewords (two per register), and a quadword (one 64-bit value per register). These data types are useful because multimedia programs typically work on small units of data. For example, a color pixel in TrueColor mode, the highest commonly used color resolution, uses 24 bits: 1 byte for each RGB color. This mode allows up to 16.7 million colors, more than the human eye can discern. In HiColor mode, only 16 bits are needed for a pixel. For many graphics applications, 16 bits is more than enough.
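The packed layouts can be illustrated with a short software model. This is only a sketch that uses plain Python integers to stand in for a 64-bit register; the helper names `pack` and `unpack` are hypothetical, and placing lane 0 in the least significant bits is an assumption made for the illustration.

```python
# Software model of the packed layouts of one 64-bit register.
# pack() and unpack() are hypothetical helpers, not MMX instructions.
# Assumption: lane 0 occupies the least significant bits.

def pack(lanes, width):
    """Pack equal-width unsigned lanes into one 64-bit integer."""
    assert width * len(lanes) == 64, "lanes must fill the register exactly"
    value = 0
    for i, lane in enumerate(lanes):
        assert 0 <= lane < (1 << width), "lane does not fit its width"
        value |= lane << (i * width)
    return value

def unpack(value, width):
    """Split a 64-bit integer into unsigned lanes of the given width."""
    lane_mask = (1 << width) - 1
    return [(value >> shift) & lane_mask for shift in range(0, 64, width)]

packed_bytes = pack([1, 2, 3, 4, 5, 6, 7, 8], 8)            # eight per register
packed_words = pack([0x1111, 0x2222, 0x3333, 0x4444], 16)   # four per register
packed_dwords = pack([0xAAAAAAAA, 0x55555555], 32)          # two per register

print(hex(packed_bytes))          # 0x807060504030201
print(unpack(packed_words, 16))   # [4369, 8738, 13107, 17476]
print(2 ** 24)                    # 16777216 distinct 24-bit TrueColor values
```

The last line shows where the 16.7 million figure comes from: three 8-bit color components give 2^24 combinations.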
New x86 processors that support MMX will address the new registers as MM0 through MM7. Instead of treating the registers as a stack -- as FP instructions do -- MMX instructions can access the registers directly. When switching back and forth between FP and MMX instructions, the existing FSAVE instruction saves the state of the registers, and the usual FRSTOR instruction restores the values. This keeps MMX technology compatible with existing OSes, which frequently must save and restore the registers when context-switching between multitasking applications.
The downside is that programmers can't mix FP and MMX instructions together because they need the same registers. But this is not as significant as it sounds, since multimedia programs typically perform their FP operations before displaying the data. (Rendering relies more heavily on integer instructions.)
MMX introduces a set of general-purpose integer instructions that use the single instruction/multiple data (SIMD) paradigm. One instruction processes the multiple data in the packed registers. This parallelism increases performance. Incidentally, this concept is not new at Intel. Years ago, the now-obsolete i860 RISC family featured a similar technology, called Pixel Addressing Extension (PAX).
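To make the SIMD idea concrete, the sketch below adds eight byte-sized values with a handful of whole-register operations instead of a loop over individual bytes. It is a software emulation on ordinary integers, not a description of how the hardware works; the function name `packed_add_bytes` is hypothetical, and the wraparound (modulo 256) behavior is chosen to resemble a wraparound packed-byte add such as MMX's PADDB.

```python
# Emulate a packed add of eight unsigned bytes held in one 64-bit integer.
# packed_add_bytes() is a hypothetical helper for illustration only.

LOW7 = 0x7F7F7F7F7F7F7F7F    # low seven bits of every byte lane
HIGH1 = 0x8080808080808080   # top bit of every byte lane

def packed_add_bytes(a, b):
    """Add eight byte lanes at once; each lane wraps modulo 256."""
    low_sum = (a & LOW7) + (b & LOW7)    # any carry stops at bit 7 of its lane
    return low_sum ^ ((a ^ b) & HIGH1)   # fold in the top bits, no carry out

def lanes(value):
    """Split a 64-bit integer into eight byte lanes, lowest lane first."""
    return [(value >> shift) & 0xFF for shift in range(0, 64, 8)]

a = 0x10_20_30_40_50_60_70_FF
b = 0x01_02_03_04_05_06_07_02
result = packed_add_bytes(a, b)

print(hex(result))   # 0x1122334455667701
```

The lowest lane shows the wraparound: 0xFF + 0x02 yields 0x01, and the overflow does not spill into the neighboring lane.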
Another feature of the new instruction set, parallel-compare operations, could improve performance by eliminating branches. (Modern processors try to predict branches, but a misprediction means a penalty of several processor cycles.) Combined with packed data features, parallel-compare operations are useful when, for example, you want to combine or overlay two images.
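The overlay case can be sketched as a compare followed by a mask-and-merge. The model below assumes that a parallel compare writes all ones (0xFF) into each byte lane where the operands match and zeros elsewhere, and that the foreground image marks transparent pixels with a key value of 0; the key value and all function names are hypothetical. The Python model itself uses a conditional to build the mask, but the merge step, which is the part the compare result feeds, needs no per-pixel branch.

```python
# Model of a parallel compare plus branch-free merge of two pixel rows.
# All names and the transparent key value are assumptions for illustration.

def lanes(value):
    """Split a 64-bit integer into eight byte lanes, lowest lane first."""
    return [(value >> shift) & 0xFF for shift in range(0, 64, 8)]

def from_lanes(values):
    """Build a 64-bit integer from eight byte lanes, lowest lane first."""
    return sum((v & 0xFF) << (8 * i) for i, v in enumerate(values))

def packed_compare_equal(a, b):
    """Return 0xFF in each byte lane where a and b match, else 0x00."""
    return from_lanes([0xFF if x == y else 0x00
                       for x, y in zip(lanes(a), lanes(b))])

def overlay(foreground, background, key):
    """Show the background wherever a foreground byte equals the key."""
    mask = packed_compare_equal(foreground, key)
    # Merge with bitwise operations only: no branch per pixel.
    return (mask & background) | (~mask & foreground)

KEY = from_lanes([0] * 8)                      # assumed transparent value
foreground = from_lanes([0, 9, 0, 7, 0, 0, 5, 0])
background = from_lanes([1, 2, 3, 4, 5, 6, 7, 8])

merged = overlay(foreground, background, KEY)
print(lanes(merged))   # [1, 9, 3, 7, 5, 6, 5, 8]
```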
The MMX instructions are similar to those in Sun's Visual Instruction Set (VIS) for the UltraSparc. VIS also packs registers and uses the FP registers. But it has a lot more to offer than MMX: 32 new registers (compared to Intel's eight), accelerated video decompression with discrete cosine transformations, more-powerful addressing modes, pixel masking, and a highly specialized set of operations that greatly accelerates motion estimation when compressing MPEG video streams.
MMX isn't Intel's only new approach to accelerating 3-D. Another new extension for 3-D accelerators is the Accelerated Graphics Port (AGP). To evenly distribute main-processor tasks and graphics-chip tasks, AGP creates a new data path for transfers between main memory and the graphics card's frame buffer. By skipping the PCI bus altogether, AGP can theoretically allow read and write transfers at speeds up to 400 MBps, according to Intel.
