Making Personal Video a Reality


December 1997 / Core Technologies / Making Personal Video a Reality

A one-chip MPEG-2 codec makes DVD authoring on a PC possible.

Les Kohn and Greg Efland

As early as next year, certain PCs will have recordable Digital Versatile Disc (DVD) drives. Even at 4.7 GB per single-sided, single-layer disc, a recordable DVD stores only 4 minutes of high-quality, uncompressed digital video. Fortunately, the latest compression standard from MPEG, MPEG-2, can compress a digital video stream so that a DVD holds over 2 hours of quality video.
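The arithmetic behind these figures can be checked with a short calculation. The sketch below assumes 720 x 480 frames at 30 frames per second with 2 bytes per pixel (4:2:2 sampling) and an average MPEG-2 rate of 5 Mbps. None of these parameters is stated in the article, so treat them as illustrative.

```python
# Rough playing-time estimate for a 4.7-GB disc.
# Assumed (not from the article): 720x480 frames, 30 frames/s,
# 2 bytes per pixel (4:2:2), and a 5-Mbps average MPEG-2 stream.

DISC_BYTES = 4.7e9  # single-sided, single-layer DVD


def uncompressed_bytes_per_second(width=720, height=480, fps=30, bytes_per_pixel=2):
    """Data rate of raw video under the assumed format."""
    return width * height * fps * bytes_per_pixel


def playing_time_seconds(disc_bytes, bytes_per_second):
    """How long a disc lasts at a constant data rate."""
    return disc_bytes / bytes_per_second


raw_rate = uncompressed_bytes_per_second()
raw_minutes = playing_time_seconds(DISC_BYTES, raw_rate) / 60

mpeg2_rate = 5e6 / 8  # assumed 5 Mbps, converted to bytes per second
mpeg2_hours = playing_time_seconds(DISC_BYTES, mpeg2_rate) / 3600

print(f"uncompressed: {raw_minutes:.1f} minutes")
print(f"MPEG-2 at 5 Mbps: {mpeg2_hours:.1f} hours")
```

Under these assumptions the disc holds a little under 4 minutes of uncompressed video and just over 2 hours of compressed video, which is consistent with the figures quoted above.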

In theory, a PC equipped with such a drive could become a video authoring system. Practically, MPEG-2's own capabilities have hampered its deployment. That's because the technology is asymmetric. Decompressing an MPEG-2 video stream requires only modest processing power (ideal for consumer devices), but encoding (or compressing) a video stream requires lots of processing power. Until recently, you needed costly multiprocessor arrays or custom hardware to achieve MPEG-2 encoding and editing in real time.

A low-cost processor from C-Cube Microsystems, the DVx, changes the situation. It is a 0.35-micron, 3.3-V part that contains 5.4 million transistors, packaged in a 352-pin ball-grid array (BGA). While the DVx operates at a modest 100 MHz, it performs professional-quality, real-time MPEG-2 encoding using only one-fourth the data rate of today's DV and M-JPEG video formats.

This lets a PC capture, encode, and store digital video on its standard hard disk or a recordable DVD, rather than use a dedicated disk array. Because the DVx combines MPEG-2 encoding/decoding and video-effects functions on a single chip, it makes MPEG-based, frame-accurate video editing available to the serious consumer for the first time.

DVx Architecture

The DVx architecture builds on experience gained from three previous encoder generations. Internally, the DVx consists of several semi-independent units, as shown in the figure "The DVx Microarchitecture." A SPARC RISC core performs high-level processing, complemented by motion estimation (ME) and video digital signal processor (DSP) units that handle compute-intensive, low-level processing. All these parts work concurrently to perform the operations required for real-time encoding.

The core acts as a microcontroller and operates at 80 MIPS. This software-based architecture lets you add new features or correct bugs without changing the hardware. A 16-KB instruction cache ensures that no cache misses occur in major processing loops. The DVx has an on-chip 8-KB data memory that's managed by overlapped, software-controlled DMA transfers. This memory replaces the traditional data cache to guarantee real-time performance.

The video DSP is a high-level coprocessor that extends the SPARC instruction set with image-processing and encoding operations. Its nearly autonomous operation lets the DVx use a smaller, less complex single-scalar core. The video DSP coprocessor consists of a DMA unit and a DSP unit, each connected to a double-buffered working memory composed of two banks of 4 KB each. At any given moment, the DMA unit is both loading new operands into one memory bank and storing prior results from it, while the DSP unit processes data in the other bank.

When the DMA and DSP units complete their tasks, the roles of the two banks are reversed. This lets video DSP operations overlap with the synchronous DRAM (SDRAM) data transfers necessary to sustain their throughput.
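The bank-swapping scheme can be pictured with a small sequential simulation. This is only a sketch: the function names and the stand-in "DSP operation" are hypothetical, and on the chip the two units run concurrently rather than one after the other.

```python
# Ping-pong (double-buffered) working memory, simulated sequentially.
# Hypothetical names throughout; the real units run concurrently in hardware.

def process(strip):
    """Stand-in for a video DSP operation: here, just double each sample."""
    return [2 * sample for sample in strip]


def run_pipeline(strips):
    """Stream strips through two banks, swapping roles after every step."""
    banks = [None, None]  # each bank holds operands, then results
    dma_bank, dsp_bank = 0, 1
    results = []

    # Two extra iterations drain the strips still in flight at the end.
    for step in range(len(strips) + 2):
        # DMA unit: store prior results out of its bank, then load new operands.
        if banks[dma_bank] is not None:
            results.append(banks[dma_bank])
        banks[dma_bank] = strips[step] if step < len(strips) else None

        # DSP unit: process the other bank in place.
        if banks[dsp_bank] is not None:
            banks[dsp_bank] = process(banks[dsp_bank])

        # Both units are done: reverse the roles of the two banks.
        dma_bank, dsp_bank = dsp_bank, dma_bank
    return results


print(run_pipeline([[1, 2], [3], [4, 5, 6]]))
```

Within each step the two units touch different banks, so the order in which the simulation runs them does not matter. That independence is what lets the hardware overlap them.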

DMA-unit instructions load and store rectangular subsections (i.e., strips) of an image between working memory and the external SDRAMs. One strip-load instruction implements the various flavors of motion compensation defined in the MPEG standard. The DMA unit converts motion vectors generated by the ME unit into image-strip addresses, while the SDRAM controller performs alignment and subpixel interpolation on the reference data.
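As a rough software analogy for a motion-compensated strip load, the sketch below turns a motion vector into a strip address and applies half-pixel interpolation. It folds the address conversion and the interpolation into one hypothetical function, whereas the chip splits them between the DMA unit and the SDRAM controller. The half-pixel vector units and round-half-up averaging follow MPEG-2 practice; the function name and the list-of-rows frame layout are assumptions.

```python
# Fetch a motion-compensated strip from a reference frame.
# Motion vectors are in half-pixel units, as in MPEG-2; the function name
# and the frame layout (a list of rows) are assumed for illustration.

def load_strip(reference, x, y, width, height, mv_x, mv_y):
    """Return the width x height strip at (x, y) displaced by (mv_x, mv_y).

    The caller must keep the displaced strip (plus one extra row/column
    when interpolating) inside the reference frame; no clipping is done.
    """
    # Integer part of the vector selects the strip's address in the frame.
    x0 = x + (mv_x >> 1)
    y0 = y + (mv_y >> 1)
    # Fractional part (0 or 1 half-pixel) selects the interpolation mode.
    half_x = mv_x & 1
    half_y = mv_y & 1

    strip = []
    for row in range(y0, y0 + height):
        out_row = []
        for col in range(x0, x0 + width):
            # Average the 1, 2, or 4 neighbouring pixels, rounding half up.
            total = 0
            count = 0
            for dy in range(half_y + 1):
                for dx in range(half_x + 1):
                    total += reference[row + dy][col + dx]
                    count += 1
            out_row.append((total + count // 2) // count)
        strip.append(out_row)
    return strip
```

A vector of (2, 0) shifts the strip one whole pixel to the right, while (1, 1) averages each output pixel from a 2 x 2 neighbourhood of reference pixels.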

Image Encoding and Performance

MPEG-2 encoding works by examining a succession of images (or frames) and removing redundant information from them (e.g., the blank wall in a scene's background can be stored once and reused in subsequent frames until the scene's point of view changes). This requires the DVx to have a high-throughput, robust ME mechanism to determine what image information has changed between frames.

A list of commands, generated by the core and stored in SDRAM, controls the programmable ME search engine. The engine fetches search commands from memory and writes the results back into it. As each command executes, the ME unit loads the appropriate target and reference image data from SDRAM into its on-chip target and reference window memories. These memories are double-buffered to allow the next target's SDRAM accesses to overlap with the search for the current target.
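To make the idea of a search command concrete, here is a minimal block-matching sketch. It uses an exhaustive search and the sum of absolute differences (SAD) as the matching criterion. Both are common choices assumed here for illustration; the article does not say which criterion or search pattern the ME unit uses.

```python
# Exhaustive block-matching search using the sum of absolute differences (SAD).
# SAD and full search are assumptions; frames are lists of rows of pixel values.

def sad(target, reference, tx, ty, rx, ry, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for row in range(size):
        for col in range(size):
            total += abs(target[ty + row][tx + col] - reference[ry + row][rx + col])
    return total


def search(target, reference, tx, ty, size, search_range):
    """Find the motion vector (dx, dy) that minimises SAD for one target block.

    Returns a tuple (sad, dx, dy), or None if no candidate fits in the frame.
    """
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = tx + dx, ty + dy
            # Skip candidates that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(target, reference, tx, ty, rx, ry, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best
```

In this picture, one "search command" corresponds to one call to `search`: a target block position, a block size, and a search range, with the best vector and its cost written back as the result.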

After all the search commands have been processed, an interrupt notifies the core. The command results might generate more search commands for the next level in a hierarchical search or perform motion compensation in the video DSP. Although the ME unit off-loads much of the burden from the core, microcode retains full control of the critical search parameters. This gives the flexibility of a CPU-controlled search engine, but with the performance of a hard-wired engine.
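A hierarchical search of the kind mentioned above can be sketched in two levels: a coarse search on half-resolution images, whose result seeds a narrow refinement at full resolution. The two-level structure, the 2 x 2 averaging filter, the SAD criterion, and all names below are assumptions for illustration, not the DVx microcode.

```python
# Two-level hierarchical motion search: coarse search on half-resolution
# images, then a +/-1 pixel refinement at full resolution.
# Illustrative only; names and the two-level structure are assumptions.

def downsample(image):
    """Halve resolution by averaging each 2x2 block of pixels."""
    return [[(image[r][c] + image[r][c + 1]
              + image[r + 1][c] + image[r + 1][c + 1]) // 4
             for c in range(0, len(image[0]) - 1, 2)]
            for r in range(0, len(image) - 1, 2)]


def best_vector(target, reference, tx, ty, size, centre, radius):
    """Full search within `radius` of `centre`; returns (sad, dx, dy)."""
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(centre[1] - radius, centre[1] + radius + 1):
        for dx in range(centre[0] - radius, centre[0] + radius + 1):
            rx, ry = tx + dx, ty + dy
            # Skip candidates that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sum(abs(target[ty + r][tx + c] - reference[ry + r][rx + c])
                       for r in range(size) for c in range(size))
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


def hierarchical_search(target, reference, tx, ty, size, coarse_radius):
    """Use the coarse-level result to seed the search at the finer level."""
    coarse = best_vector(downsample(target), downsample(reference),
                         tx // 2, ty // 2, size // 2, (0, 0), coarse_radius)
    # A coarse vector covers twice the distance at full resolution.
    seed = (2 * coarse[1], 2 * coarse[2])
    return best_vector(target, reference, tx, ty, size, seed, 1)
```

The coarse level covers a wide area cheaply, so the full-resolution level only has to test the nine candidates around the seed. The trade-off is that a poor coarse match can steer the refinement to the wrong neighbourhood.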

To encode high-resolution formats such as HDTV, multiple DVx processors can operate in parallel to divvy up the processing task. Previously, video-processing chips were interconnected by a globally shared bus. However, as the number of chips on a shared bus increases, their combined traffic reaches the limit of the bus's bandwidth. This prevents further scaling of performance.

Instead, the DVx uses a point-to-point architecture that scales directly with the number of chips. The DVx chip's interprocess communications (IPC) channels can be interconnected to build multiprocessor arrays, as shown in the figure "Point-to-Point Communications." Through the IPC ports, multiple DVx chips coordinate processing operations so as to encode all proposed digital HDTV formats. Two DVx chips can encode the 525P format, and eight to 10 chips are necessary to encode an HDTV 1080I format. (It takes only two DVx chips to decode all HDTV video formats.)

System Configuration

The DVx provides glueless interfaces to several of the PC's subsystems. It has a 32-bit PCI host bus interface (revision 2.1-compliant), a programmable CCIR-656 (parallel D1) video interface, and an eight-channel I2S-compatible audio I/O interface. The DVx uses 8 MB of SDRAM, made up of four 16-Mb parts. Because it doesn't need external first-in/first-out (FIFO) buffers or other logic, the DVx costs less to add to a PC. To make systems capable of encoding and manipulating HDTV video formats, you simply add as many DVx chips as the target format requires. With recordable DVD, a DVx PC offers professional-quality MPEG-2 video recording and authoring at entry-level prices.


The DVx Microarchitecture

The various units operate concurrently to capture and encode digital video on the fly.


Point-to-Point Communications

A specialized bus lets processors work in parallel to encode HDTV video formats.


Les Kohn (editors@bix.com) is chief architect of C-Cube Microsystems' DVx family of processors. Greg Efland (editors@bix.com) is the chief architect of the DVx.
