Video for Free


February 1995 / Reviews / Video for Free

Thanks to new hardware and software technologies, accelerated motion-video playback is no longer a premium, and MPEG is on the move

Stanford Diehl and Greg Loveria

A very short time ago, accelerated playback of digital-video files was a value-added feature that differentiated the commodity market for Windows graphics cards. A number of technological and market developments promise to drive motion-video playback to the mass market. Given the power of today's mainstream hardware, video bandwidth can now be negotiated across a local bus instead of the slower system bus, and high-end CPUs can crunch more sophisticated decompression algorithms. Video-enhanced titles for training, reference, education, and entertainment are in high demand. And virtually every graphics-chip vendor, enabled by Microsoft Windows' DCI (Display Control Interface), has announced a graphics architecture to support full-motion playback of digital video.

Digital-Video Playback

A number of factors affect video playback quality under Windows. The first is frame rate, measured in frames per second. To ensure quality, the video must be captured at an acceptable frame rate. The standard for TV-quality, full-motion video is 30 fps.

Video for Windows will drop frames to match the capability of the playback hardware, producing a fluid or jerky motion depending on the system it's played back on. The number of colors the sequence was captured at also affects quality because more data flows across the video adapter's data bus. If the video sequence was captured at 24-bit color depth, you have three times more data to move across the display bus than with an 8-bit (256 colors) video clip.

An uncompressed 24-bit video file, recorded at 640- by 480-pixel resolution and at 30 fps, would require a throughput rate of over 26 MBps. Clearly, the video data must be compressed. Compression not only allows more video to be stored on your computer's hard drive, it also lowers the bandwidth requirements for video playback.
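
The throughput figure is simple arithmetic. Here is a minimal sketch in Python, assuming 3 bytes per pixel for 24-bit color and reading a megabyte as 2^20 bytes; the function name is ours, for illustration only:

```python
def raw_video_rate(width, height, bits_per_pixel, fps):
    """Return the uncompressed data rate in bytes per second."""
    bytes_per_frame = width * height * bits_per_pixel // 8
    return bytes_per_frame * fps

rate_24bit = raw_video_rate(640, 480, 24, 30)
rate_8bit = raw_video_rate(640, 480, 8, 30)

print(rate_24bit)               # bytes per second at 24-bit color
print(rate_24bit / 2**20)       # the same rate in megabytes per second
print(rate_24bit / rate_8bit)   # how much more data 24-bit needs than 8-bit
```

The same function shows why color depth matters: the 24-bit clip moves three times the data of an 8-bit clip at the same size and frame rate.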

While compression algorithms significantly reduce bandwidth requirements, they demand intensive computational resources. Dedicated hardware has been required to decompress the video data at an acceptable rate while the host CPU took care of other chores, such as color space conversion (converting the video data from the compressed YUV format used for motion video to the RGB format necessary for display on computer monitors) and video scaling (scaling algorithms help maintain video quality when the video window is stretched beyond the captured size).
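
To give a sense of the per-pixel work involved in color space conversion, here is a minimal sketch in Python. It assumes 8-bit, full-range samples and the widely used ITU-R BT.601 coefficients; the codecs discussed here may use different ranges or fixed-point math, and the function names are ours.

```python
def clamp8(x):
    """Round to the nearest integer and limit to the 0..255 range."""
    return max(0, min(255, int(round(x))))

def yuv_to_rgb(y, u, v):
    """Convert one YUV sample to RGB (assumed BT.601, full range)."""
    d = u - 128   # blue-difference chroma, centered on zero
    e = v - 128   # red-difference chroma, centered on zero
    r = y + 1.402 * e
    g = y - 0.344136 * d - 0.714136 * e
    b = y + 1.772 * d
    return clamp8(r), clamp8(g), clamp8(b)

# With neutral chroma (u = v = 128), the pixel is a shade of gray.
print(yuv_to_rgb(128, 128, 128))
```

Even this simple form costs several multiplications per pixel, which adds up quickly at full-motion frame rates.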

Video Playback's First Pass

Digital-video boards for Windows have been available for quite some time, but they have been expensive and difficult to install and use. Sigma Designs was the first company to successfully bring digital-video playback to a mainstream audience with its RealMagic board (dubbed ReelMagic at the time). RealMagic provided hardware-based decompression of MPEG video files. Sigma employed C-Cube's CL450 video processor along with its own proprietary video-acceleration chip (called Piccolo) to perform all pixel interpolation, line doubling, smoothing, and scaling algorithms.

RealMagic proved that there was a market for MPEG decompression boards, even at a time when few MPEG titles were available, but the board suffered from some limitations. RealMagic had no on-board VGA. Digital-video boards have typically relied on VGA pass-through, routing the VGA signal across a feature connector. Given the bandwidth limitations of a standard VGA feature connector, the graphics subsystem is confined to a maximum 640- by 480-pixel screen resolution. The feature-connector architecture has also been plagued by performance and compatibility problems.

Color shifts and shimmering problems afflict some models of VL-Bus cards when connected to RealMagic through the VGA feature connector. According to Sigma, this problem is caused by the way the VESA (Video Electronics Standards Association) pass-through specification delivers MPEG 1 video's 15-bit color depth (the same color depth as NTSC TV) to a VL-Bus display adapter working in higher color modes.

Sigma recently announced RealMagic Rave, its MPEG playback adapter with an on-board graphics accelerator. It will continue to offer RealMagic Lite as well, but if you opt for the feature-connector solution, you should call Sigma first and make sure your VL-Bus adapter works correctly with RealMagic. Despite the limitations, RealMagic continues to be a driving force in pushing MPEG 1 as a major digital-video standard. With a compatible graphics adapter, the quality of video is outstanding. However, RealMagic does not accelerate more common software codecs such as Microsoft's Video 1, Intel Indeo, and SuperMac Cinepak.

VideoLogic's AVI Accelerator

One of the first single-board solutions for graphics and video acceleration is VideoLogic's 928Movie. The card uses S3's 86C928 graphics accelerator with 32-bit memory interleaving. The primary purpose of the 928Movie is to accelerate motion-video playback of Indeo, Cinepak, and Microsoft Video 1 digital-video files.

The 928Movie uses VideoLogic's custom PowerPlay32 Digital Movie Accelerator ASIC (Application-Specific IC) and SmoothScale algorithm for YUV-to-RGB color space conversion and video scaling. The result is excellent full-motion playback even for video stretched beyond a 320- by 240-pixel window. Subjectively, though, we found that the AVI clips--even with the 928Movie's help--did not approach the quality of MPEG digital video. The motion is smooth, but the picture quality is somewhat blocky because of codec limitations. An announced upgrade to the Indeo codec may help.

The 928Movie also hosts a VMC (VESA Media Channel) architecture. The VMC provides an optimized data path for passing video data to other video components, such as capture cards, codec accelerators, or scan converters. By using the VMC, video components avoid passing data across the host system bus. Unfortunately, the general market has not embraced the VMC.

One of the VMC options VideoLogic offers is a hardware MPEG decoder. The $349 MPEG Player occupies a second slot and, like Sigma's RealMagic, uses C-Cube's CL450 acceleration chip. VideoLogic also employs its own Powerstream ASIC, a video-acceleration chip that works in conjunction with the CL450. Color palette shifts didn't affect video data passed across the VMC, even when we used the same graphics accelerator that caused problems with the RealMagic adapter.

We found the quality of VideoLogic's 928Movie, coupled with the MPEG Player adapter, simply superb. A 928Movie matched with the MPEG Player delivers a unified, expandable solution for accelerating AVI (Audio Video Interleave) and MPEG digital video. VideoLogic is now shipping a PCI (Peripheral Component Interconnect) adapter, the PCIMovie, with the PowerPlay32 video accelerator, a Weitek P9100 graphics accelerator, and the VMC architecture.

The DCI Interface

Before Intel and Microsoft released the software DCI layer, video accelerators such as the PowerPlay were very limited in what they were able to do. This limitation was not inherent to the chips themselves; video-playback software that could not take advantage of the specialized hardware imposed it.

Before DCI, Video for Windows would use the host CPU for decompression and YUV-to-RGB conversion and then pass the RGB data to the video subsystem. Under this scenario, a specialized motion-video chip would get the data only after it was converted to RGB format. The only video function left for it to accelerate was video scaling.

DCI is a low-level interface that allows the video-playback software direct access to hardware-specific capabilities of the video subsystem. DCI coordinates with the Windows GDI (Graphics Device Interface), allowing the GDI to be bypassed for video playback when appropriate. DCI-compliant applications can check for the presence of specialized video hardware through the hardware's DCI driver. The DCI driver can then directly access the video frame buffer to dramatically improve throughput. With DCI, the video accelerator's driver can instruct the playback software to pass YUV data to it, allowing the video chip to perform color space conversion instead of the host CPU (see the figure ``Hardware Video Acceleration'').
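
The division of labor that this negotiation produces can be sketched in a few lines of Python. This is only an illustration of the idea, not the DCI programming interface: the `DisplayCaps` class, its fields, and `plan_playback` are hypothetical names invented for this sketch.

```python
from dataclasses import dataclass

@dataclass
class DisplayCaps:
    """Hypothetical capability report from a display driver (not the DCI API)."""
    accepts_yuv: bool = False        # hardware can take unconverted YUV data
    hardware_scaling: bool = False   # hardware can scale the video window

def plan_playback(caps):
    """Decide which playback stages run on the host CPU and which in hardware."""
    cpu_stages = ["decompress"]      # the codec always runs on the CPU here
    hardware_stages = []
    if caps.accepts_yuv:
        hardware_stages.append("yuv_to_rgb")
    else:
        cpu_stages.append("yuv_to_rgb")
    if caps.hardware_scaling:
        hardware_stages.append("scale")
    else:
        cpu_stages.append("scale")
    return {"cpu": cpu_stages, "hardware": hardware_stages}

# Scaling-only hardware that receives RGB data, versus a YUV-aware accelerator.
print(plan_playback(DisplayCaps(accepts_yuv=False, hardware_scaling=True)))
print(plan_playback(DisplayCaps(accepts_yuv=True, hardware_scaling=True)))
```

In the first case, the accelerator is left with scaling alone. In the second, the CPU only has to decompress the stream.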

Windows Accelerators Do the Video Thing

The DCI design enables a device-independent way for digital-video codecs to access specialized hardware features. DCI promises to drive innovation from both the software end and the hardware end. The graphics subsystem can now request raw YUV data and then process the video data totally within the confines of the graphics architecture. The graphics-chip vendors have responded with a flurry of announcements, heralding optimized motion video and graphics acceleration within a coordinated architecture. Some architectures are already in place but will benefit greatly from the DCI initiative.

Weitek Corp. (Sunnyvale, CA, (408) 738-8400) uses a dedicated chip--the Video Power coprocessor--for video scaling, color space conversion, and dithering (to emulate high-color video in 256-color mode). And yet Weitek also integrates video and graphics acceleration into a single architecture. Unlike earlier feature-connector solutions, the Weitek Power 9100 graphics controller and the Video Power coprocessor share a single video-memory frame buffer (see the figure ``Video Architectures''). The shared frame buffer not only reduces the cost by requiring less video memory, it also enhances performance by passing video data along the frame-buffer bus instead of the system bus.

The Tseng Labs (Newtown, PA, (215) 968-0502) architecture relies on a single frame buffer. A shared frame buffer requires two memory controllers that must negotiate for access to the video memory. Instead of arbitrating the frame buffer between the video processor and graphics accelerator, the Tseng Labs W32p graphics chip uses a ``multiport cache'' design. A fast cache sits on the front end of the W32p. YUV data flows to a Viper entry port of the VGA (Viper is the Tseng Labs video-acceleration chip). The Viper accepts the data, converts and scales it, and then loads it into the multiport cache. All the display data--video and graphics--is then stored in the frame buffer. The single-frame-buffer design avoids any latency caused by arbitration between two controllers for buffer access.

Tseng Labs' latest video accelerator, the Viper f/x, supports screen resolutions of up to 1024 by 768 pixels by 24-bit color. Tseng Labs has also announced a single-chip solution for graphics and video acceleration. The company will continue to market a dedicated video processor as well, claiming that a dedicated video accelerator can support a wider range of video formats and functionality. A dedicated processor does not have to make as many size and cost trade-offs as a single-chip architecture, so it can support a wider range of YUV conversions, for instance.

By the same logic, a dedicated processor could support more sophisticated interpolation algorithms than is possible with a single-chip architecture. To scale video beyond native size, video chips must add pixels to enlarge the video window. These pixels can be created by replication (simply replicating an adjacent pixel) or by interpolation (using an algorithm to determine the optimum characteristics of the pixel). Clearly, interpolation is the preferred method, but interpolation algorithms vary widely. At the most basic level, the chip could simply average the color values of two adjacent pixels and create the new pixel with the resulting color value. Very little memory would be required to process this logical operation. But as more sophisticated algorithms are employed for pixel interpolation, more memory and chip complexity are required as well. Again, these requirements may exceed the size and cost limitations of a single-chip solution.
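
The difference between the two approaches is easy to see on a single scan line. The Python sketch below doubles the width of a row of pixels both ways. To keep it short, each pixel is a single gray value rather than a color triple, and the function names are ours; real chips do this in hardware, typically in both dimensions.

```python
def double_by_replication(row):
    """Widen a scan line 2x by repeating each pixel."""
    out = []
    for pixel in row:
        out.extend([pixel, pixel])
    return out

def double_by_interpolation(row):
    """Widen a scan line 2x by inserting the average of adjacent pixels."""
    out = []
    for i, pixel in enumerate(row):
        # At the right edge there is no neighbor, so reuse the last pixel.
        neighbor = row[i + 1] if i + 1 < len(row) else pixel
        out.append(pixel)
        out.append((pixel + neighbor) // 2)
    return out

row = [0, 100, 200]
print(double_by_replication(row))    # [0, 0, 100, 100, 200, 200]
print(double_by_interpolation(row))  # [0, 50, 100, 150, 200, 200]
```

Replication leaves hard steps between values, which is what makes stretched video look blocky. Averaging smooths the transitions at the cost of one addition and one division per new pixel.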

Jazz Multimedia's Jakarta board uses the Tseng Labs chip combination--the Viper video accelerator and the ET4000/W32P graphics accelerator--as a base platform to build a modular video solution. The standard Jakarta board delivers video-playback acceleration, including hardware MPEG decompression and graphics acceleration. Snap-on modules add a TV tuner and NTSC/PAL output. The Jakarta represents a strategy many graphics vendors will adopt: Deliver a standard video-playback solution on the graphics card and add higher-end functionality through modular components. The latest version of MGA Impression Plus starts with a 64-bit graphics accelerator and DCI driver; a snap-on module includes the new 64-bit PowerPlay64 and a VMC connector to support any other VMC-compatible video hardware.

On-the-Fly Video

Alliance Semiconductor (San Jose, CA, (408) 383-4900) takes a similar approach to the Tseng Labs single-buffer design, but the Alliance ProMotion-3210 chip performs scaling and color space conversion as the video data shifts out of memory and to the screen. As the screen is being refreshed, the chip can switch color depth on-the-fly as it scans across the screen, sending 256 colors to the graphical desktop and 24-bit color to a video window. The single-chip solution supports full-motion, 24-bit video acceleration along with 1024- by 768-pixel by 256-color graphics acceleration within a single megabyte of DRAM. Alliance claims that its chip can enable motion-video acceleration for an additional cost of less than $10 per system.

Once again, DCI is the key to this technology. DCI creates a surface in video memory that can be on-screen or off-screen. This surface is an area that the video codec can write to directly. Different vendors take advantage of this capability in different ways. Currently, many implementations perform video scaling and color space conversion before sending the processed data to the frame buffer. Scaling and conversion in real time requires high-speed circuitry that can match the refresh rate of the computer monitor, but overall cost is lower because a small amount of DRAM can be used effectively. In addition, the Alliance chip delivers true high-color video, instead of resorting to dithering to simulate high color in the video window.

All-in-One Chips

The trend is clearly toward integrating all video- and graphics-acceleration components onto a single slab of silicon. S3 (Santa Clara, CA, (408) 980-5400) has introduced the Vision868 (DRAM-based) and Vision968 (video-memory-based) Multimedia accelerators. The Vision series integrates a 64-bit graphics engine, color space conversion, scaling, and dithering on a single chip. The latest version of Diamond MultiMedia's Stealth 64 VRAM series will soon offer an extensible architecture featuring the Vision968. The baseline adapter comes with graphics and video acceleration; add-on modules enrich the architecture with MPEG playback and video capture.

Similarly, Cirrus Logic (Fremont, CA, (510) 623-8300) has announced its MotionVideo Architecture. Cirrus goes even further than S3: The company not only integrates a graphics engine and video accelerator into its CL-GD5440 chip but also packs in a 24-bit DAC (D/A converter) for good measure. Internally, the chip uses a single frame buffer that supports different color depths between video and graphics. The company has also announced an 800- by 600-pixel LCD VGA controller with integrated video acceleration.

Perhaps the most ambitious new architecture is Brooktree's MediaStream (see ``Packetized Multimedia''). MediaStream sends multimedia packets to a specialized DAC that decodes the packets on-the-fly. Brooktree, along with other chip vendors, is already shipping a video-enabled DAC that performs on-the-fly color conversion and scaling. These video DACs are pin-compatible with existing DACs; theoretically, a board maker could simply plug in the video DAC to video-enable an existing graphics adapter. In operation, though, the graphics accelerator must be able to let the video DAC know that YUV data is being passed to the DAC for conversion and scaling. Not all graphics chips deliver the signaling requirements to support a pin-compatible video DAC.

And the Winner is . . .

After all these new digital-video solutions come to market, the big winner could be the MPEG video codec. MPEG is generally accepted as a higher-quality codec than Indeo or Cinepak. In fact, MPEG 1 was specifically designed for high-quality playback from a single-speed CD-ROM (150 KBps). Unfortunately, the high compression ratios supported by MPEG (up to 200 to 1) require sophisticated algorithms and, hence, intensive computational resources. The demands of MPEG decompression created the market (that RealMagic currently owns) for MPEG boards.
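
To put those numbers in perspective, the short Python calculation below compares an uncompressed 640- by 480-pixel, 24-bit, 30-fps stream with the 150-KBps transfer rate of a single-speed CD-ROM. It assumes a kilobyte of 1024 bytes and that the entire drive bandwidth goes to video. It compares only these two figures; actual titles may be encoded at a smaller frame size and must share the bandwidth with audio.

```python
# Uncompressed rate: width x height x 3 bytes per pixel x frames per second.
RAW_BYTES_PER_SEC = 640 * 480 * 3 * 30

# Single-speed CD-ROM transfer rate (assuming 1 KB = 1024 bytes).
CDROM_BYTES_PER_SEC = 150 * 1024

ratio = RAW_BYTES_PER_SEC / CDROM_BYTES_PER_SEC
print(f"required compression ratio: about {ratio:.0f} to 1")
```

Under these assumptions, the required ratio falls just inside the upper limit quoted for MPEG.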

But these new video-playback architectures present a clear threat to hardware MPEG decompression boards. With other processor-intensive tasks such as color space conversion and video scaling being off-loaded to mainstream graphics adapters, high-end host CPUs can now handle real-time MPEG decompression. In fact, many of the graphics chip makers plan to ship a software-based MPEG player from Xing Technology (Arroyo Grande, CA, (805) 473-0145) with the new video-enabled graphics accelerators. Consumers will then be able to play MPEG-1 CD-ROM titles without dedicated MPEG hardware.

Commodity Video

By midyear, graphics adapters with accelerated playback of digital video will be in the same commodity market as today's Windows accelerators. This is great news if you appreciate the latest applications and multimedia titles that feature motion-video clips. But it will require more understanding of video technologies. A vendor's claim of ``video accelerated'' will not necessarily translate into high-quality, full-motion video playback. The video tag will be somewhat like the claims of ``all natural'' on supermarket shelves. More than ever, you'll need to do your homework to make sure you're getting what you think you are.


About the Products


RealMagic Lite                                  $349


RealMagic Controller with audio playback        $449


RealMagic CD-ROM Upgrade Kit                    $799


RealMagic Rave (with graphics accelerator)      $489

Sigma Designs, Inc.
46501 Landing Pkwy.
Fremont, CA 94538
(800) 845-8086
fax: (510) 770-2640


928Movie                                        $349


2MB VRAM without audio                          $449


2MB VRAM with audio                             $549


1MB VRAM Upgrade Kit                            $100


MPEG Player                                     $299


PCIMovie                                        $399

VideoLogic, Inc.
245 First Street, Suite 1403
Cambridge, MA 02142
(617) 494-0530
fax: (800) 203-8587


Jazz Jakarta                                    $499

Jazz Multimedia
1040 Richard Ave.
Santa Clara, CA 95050
(408) 727-8900
fax: (408) 727-9092


Stealth 64 VRAM                                 $399


4MB VRAM                                        $599

Diamond MultiMedia
1130 East Arques Ave.
Sunnyvale, CA 94086
(408) 736-2000
fax: (408) 730-5750




Hardware Video Acceleration


Before the release of DCI, a specialized video accelerator could only provide scaling services for digital-video clips. Video for Windows required the CPU to perform decompression and color space conversion, passing RGB data on to the graphics subsystem. A DCI-compliant video codec can check for the presence of video hardware and, if a video accelerator is present, can pass unconverted YUV data directly to the video subsystem for color space conversion and video scaling. With more control over video playback, graphics-chip vendors have devised innovative architectures for efficient video acceleration within Windows.


Video Architectures


a) Dual Frame Buffer
In a dual-frame-buffer architecture, the video-acceleration board plugs into the host system's I/O bus and connects to an existing graphics adapter via the feature connector. Each accelerator uses its own video memory and DAC (D/A converter). The feature connector limits screen resolution to 640 by 480 pixels and suffers from incompatibilities with some board combinations.
b) Shared Frame Buffer
With a shared-frame-buffer interface, the graphics accelerator and video processor share one video-memory buffer, lowering memory requirements. Both accelerators feed the buffer, and each requires its own controller to arbitrate access to video memory.
c) Single Frame Buffer
A single-frame buffer routes converted video data through the graphics controller. All the display data--video and graphics--is then stored in the frame buffer. No buffer arbitration is needed because the graphics controller alone feeds the buffer. The single-frame-buffer architecture also requires only a single communications port to video memory, so inexpensive DRAM can be used instead of dual-ported video memory.


Jakarta


Both the Diamond Stealth 64 VRAM and Jazz Jakarta start from a baseline graphics engine with integrated motion-video acceleration, and both offer snap-on upgrade modules. The Jakarta includes a Tseng Labs graphics engine, Viper f/x video accelerator, and hardware MPEG decompression on-board. Upgrade modules add a cable tuner and NTSC/PAL output.


Stealth


The Stealth 64 VRAM includes S3's new single-chip graphics/video accelerator. Add-on modules support MPEG decompression and video capture.


RealMagic and 928Movie


Two early harbingers of the coming wave of low-cost PC-based video accelerators: Sigma Designs' RealMagic MPEG decoder (below) and VideoLogic's 928Movie (above), one of the first cards dedicated to accelerated playback of AVI files. The 928Movie was also first to implement the VESA Media Channel.


Stanford Diehl is director of BYTE reviews. You can reach him on the Internet or BIX at sdiehl@bix.com. Greg Loveria can be reached at loveria@bix.com.

Copyright © 2005 CMP Media LLC