ColdFire: A Hot Architecture


May 1995 / Core Technologies / ColdFire: A Hot Architecture

This new implementation renews a proven embedded architecture

Joe Circello

Motorola's 68000 family of microprocessors has served both the computer and the embedded markets well. Now the PowerPC has created an opportunity for the 68000 family to refocus entirely on embedded systems, making it possible to redefine the architecture to achieve dramatic improvements in both cost and performance relative to the older 68000-family designs. A new architecture, called ColdFire, is the result of such a refocus and represents an approach targeted at the emerging needs of advanced consumer-electronics applications.

ColdFire's designers set several requirements for the architecture they would create for this new class of cost-sensitive embedded applications. First, they demanded a low-cost architecture, which they achieved largely by keeping the core small. A small die also let them integrate on-chip memories, system modules, and peripherals cost-effectively. ColdFire also offers abundant processing power, so it can tackle compute-intensive jobs while consuming relatively little electrical power.

Finally, ColdFire employs a high-density ISA (instruction-set architecture), especially important because in many embedded systems the memory subsystem's cost far exceeds the processor's cost. A high-density ISA minimizes the application's storage requirements, which thus reduces overall system cost. The original 68000 processor ISA provided the starting point for ColdFire's ISA. Like the 68000 ISA, ColdFire defines a variable-length ISA to achieve optimum code density. This is accomplished in a RISC-based implementation that provides a very efficient silicon design.

Changes in Instruction

Other important changes were made to the 68000 instruction set while still maintaining the original programming model. Certain operations either received reduced support or were eliminated, which makes for a simpler and smaller core. Examples include reduced support for byte- and word-size operands, reduced support for RMW (read-modify-write) instructions, and removal of instructions used primarily by desktop applications, such as the trap-on-overflow exception and BCD (binary-coded decimal) arithmetic.

Let's look at these changes in more detail. For byte- and word-size operands, the instructions supporting arithmetic and logical operations were removed. However, ColdFire keeps those op codes performing simple assignments (e.g., move) and the Test and Clear functions. While support for RMW operations was reduced, ColdFire retains the op codes performing arithmetic and logical functions using a program-visible register and memory. A number of instructions, including those involving BCD operands, rotate op codes, and integer divides, were simply deleted. The Divide instructions were eliminated because the transistor count needed to support them could not be justified. A software Divide routine has been developed that actually uses fewer machine cycles than the 68000 does for most operands.
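The article does not reproduce Motorola's software Divide routine. Purely as an illustration of how division can be done without a hardware divider, here is a minimal sketch of the textbook shift-and-subtract (restoring) method for unsigned 32-bit operands. The name `udiv32` is hypothetical, and nothing here should be read as the actual ColdFire routine or its cycle count.

```python
MASK32 = 0xFFFFFFFF


def udiv32(dividend, divisor):
    """Restoring shift-and-subtract division of unsigned 32-bit integers.

    Returns (quotient, remainder). Textbook algorithm, used here only as
    an illustration; it is not Motorola's software Divide routine.
    """
    if not (0 <= dividend <= MASK32 and 0 <= divisor <= MASK32):
        raise ValueError("operands must fit in 32 bits")
    if divisor == 0:
        raise ZeroDivisionError("division by zero")

    quotient = 0
    remainder = 0
    for bit in range(31, -1, -1):
        # Shift the next dividend bit (most significant first) into the
        # partial remainder.
        remainder = (remainder << 1) | ((dividend >> bit) & 1)
        # If the divisor fits, subtract it and set this quotient bit.
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1 << bit
    return quotient, remainder
```

The loop uses only shifts, compares, and subtracts, all of which a core without a Divide instruction still provides.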

A number of extensions were made to the original 68000 ISA when the 68020 microprocessor was introduced. The ColdFire architecture implements several important instructions from these additions, including a 32- by 32-bit integer multiply that produces a 32-bit result, a complete set of register sign-extension instructions, scale factors (x1, x2, x4) for indexed addressing modes, and multiple-word NOP (No Operation) instructions. Compilers use the latter to remove branch instructions.
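To make these retained 68020 additions concrete, the following sketch models their arithmetic on 32-bit values: sign extension of a byte or word, a multiply that keeps only the low 32 bits of the product, and an indexed effective address with a scale factor of 1, 2, or 4. The function names and the optional `displacement` parameter are illustrative assumptions. This is a model of the arithmetic, not of ColdFire's instruction encodings.

```python
MASK32 = 0xFFFFFFFF


def sign_extend(value, bits):
    """Sign-extend the low `bits` bits of `value` to a 32-bit register image."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):   # sign bit of the narrow operand is set
        value -= 1 << bits
    return value & MASK32


def mul32(a, b):
    """32 x 32-bit multiply that keeps only the low 32 bits of the product."""
    return (a * b) & MASK32


def indexed_address(base, index, scale, displacement=0):
    """Effective address = base + index * scale + displacement, modulo 2**32."""
    if scale not in (1, 2, 4):
        raise ValueError("scale factor must be 1, 2, or 4")
    return (base + index * scale + displacement) & MASK32
```

For example, `indexed_address(0x1000, 3, 4)` addresses element 3 of an array of 32-bit values starting at 0x1000, which is why compilers find scaled indexing useful for array accesses.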

Taking Exception

In addition to these areas of instruction-set simplification, the ColdFire exception processing model is streamlined. The architecture defines a single 8-byte frame created for all exception types on a self-aligning system stack (i.e., the stack pointer automatically compensates for misaligned data before creating an exception frame). After ColdFire creates the stack frame, it fetches an exception vector from a 1024-byte table that defines the location of the first instruction of the service routine. Thus, the processing of system calls and external interrupts remains exactly compatible with previous 68000-family designs. As a result of these simplifications, exception processing times are very fast. For most exceptions, the time from the faulting instruction until the first instruction in the service routine is a mere 12 cycles.
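The numbers in this description can be checked with a little arithmetic. The sketch below takes the 1024-byte table and the 8-byte frame from the text. It assumes 4-byte vector entries (one 32-bit address each, as in earlier 68000-family parts) and 4-byte stack alignment. The `table_base` parameter and the function names are hypothetical.

```python
VECTOR_TABLE_BYTES = 1024  # from the article
VECTOR_ENTRY_BYTES = 4     # assumption: one 32-bit address per vector
FRAME_BYTES = 8            # from the article
STACK_ALIGN = 4            # assumption: frames are aligned to 4 bytes


def vector_count():
    """Number of exception vectors that fit in the table."""
    return VECTOR_TABLE_BYTES // VECTOR_ENTRY_BYTES


def vector_entry_address(vector_number, table_base=0):
    """Address of the table entry holding the service routine's address."""
    if not 0 <= vector_number < vector_count():
        raise ValueError("vector number out of range")
    return table_base + vector_number * VECTOR_ENTRY_BYTES


def push_exception_frame(sp):
    """Align the stack pointer downward, then reserve one 8-byte frame.

    Returns (new_sp, padding), where padding is the number of bytes
    skipped to reach the aligned address.
    """
    padding = sp % STACK_ALIGN
    aligned_sp = sp - padding
    return aligned_sp - FRAME_BYTES, padding
```

Under these assumptions the table holds 256 vectors, and a stack pointer of 0x2002 is first pulled down to 0x2000 before the frame is placed at 0x1FF8.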

The resulting ColdFire ISA then represents a balance between the core size and code expansion, while retaining the 68000-family programming model with its powerful set of basic addressing modes. The static size of embedded applications in the ColdFire ISA is typically 20 percent to 40 percent smaller than with fixed-length instruction sets. In relation to its predecessor, the ColdFire ISA produces object images that are considerably smaller than 68000 object images, but not as compact as objects targeted for the 68040.

A Tale of Two Pipelines

The hardware implementation of the ColdFire architecture uses a synthesis-driven, tools-based design philosophy. This allows the addition of optional hardware modules that provide custom functions and tune the processor's performance. It also provides design independence across different process technologies that target a range of operating frequencies and voltages. Finally, this approach also produces quick design cycles.

Two decoupled pipelines implement the ColdFire processor core: an IFP (Instruction Fetch Pipeline) and an OEP (Operand Execution Pipeline). A 12-byte FIFO (first-in/first-out) instruction buffer decouples the two pipelines (see the figure "ColdFire Processor Block Diagram"). Note that the core features a non-Harvard implementation to minimize die size and bus complexity. Studies indicate a full Harvard architecture provides only a minimal improvement in performance.

As the figure shows, the IFP itself consists of two stages, an IAG (Instruction Address Generation) stage and an IC (Instruction Fetch Cycle) stage. The OEP also consists of two stages, each of which can perform multiple functions, depending on the instruction type. The first stage of the OEP is the DSOC (Decode and Select/Operand Fetch Cycle), and the second stage is the operand AGEX (Address Generation/Execute Cycle).

The IFP calculates the next instruction address and then fetches 32 bits of instruction data using the single-cycle processor/memory bus. Typically, the processor is connected to an on-chip memory, either in the form of a RAM/ROM structure or a unified cache. As the fetched instruction enters the processor, it is loaded into the FIFO instruction buffer. If the OEP is waiting for instruction data, the prefetched instruction is also gated directly into its instruction registers. The connection between the two pipelines is a 48-bit interface, a ColdFire instruction's maximum size. The ColdFire architecture's variable-length instructions include a 16-bit op code, an optional 16-bit extension word 1, and an optional 16-bit extension word 2. The IFP connected to the FIFO instruction buffer provides a very efficient mechanism for loading the variable-length ColdFire instructions into the OEP with a minimum of idle cycles.
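The buffer sizes quoted here fit together neatly: a 12-byte FIFO holds exactly two maximum-length (48-bit) instructions, or three 32-bit fetches. The sketch below is a simplified bookkeeping model of that arrangement, not a description of the real hardware. The class and method names are hypothetical, the word values are placeholders, and the number of extension words is supplied by the caller, whereas the real core derives it by decoding the op code.

```python
from collections import deque

WORD_BYTES = 2    # op code and extension words are 16 bits each
FETCH_BYTES = 4   # the IFP fetches 32 bits at a time (from the article)
FIFO_BYTES = 12   # FIFO instruction buffer size (from the article)


def instruction_length(extension_words):
    """Length in bytes: one op code word plus 0, 1, or 2 extension words."""
    if extension_words not in (0, 1, 2):
        raise ValueError("an instruction has 0, 1, or 2 extension words")
    return WORD_BYTES * (1 + extension_words)


class InstructionFifo:
    """Simplified model of the buffer between the IFP and the OEP."""

    def __init__(self):
        self.words = deque()

    def free_bytes(self):
        return FIFO_BYTES - WORD_BYTES * len(self.words)

    def fetch(self, two_words):
        """Accept one 32-bit fetch (two words) if there is room."""
        if self.free_bytes() < FETCH_BYTES:
            return False            # buffer full: the IFP must wait
        self.words.extend(two_words)
        return True

    def issue(self, extension_words):
        """Hand one whole instruction to the OEP, or None if incomplete."""
        needed = 1 + extension_words
        if len(self.words) < needed:
            return None             # the OEP waits for more instruction data
        return [self.words.popleft() for _ in range(needed)]
```

Because fetches arrive in fixed 32-bit units while instructions leave in 16-, 32-, or 48-bit units, the buffer absorbs the mismatch between the two pipelines.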

The OEP is based on the traditional RISC compute-engine structure, with a dual read-ported register file feeding an ALU. Register-to-register instructions execute in a single pipeline cycle: the operands are fetched during the OC (Operand Fetch Cycle) phase of the OEP, and the actual execution is performed in the EX (execute) phase.

The ColdFire ISA is not a pure load/store architecture, so there are numerous compound instructions that combine a load operation with some type of arithmetic or logical operation. These "embedded-load" instructions essentially pass through the OEP twice. This type of instruction begins by selecting the components needed to form the operand address in the DS (decode and select) phase of the OEP's first stage. Next, the ALU sums the components to form the operand address during the AG (address generation) phase in the pipeline's second stage. During the third cycle, memory is read and the desired operand returned to the core. At the same time, any required register operand is fetched during the OC phase in the OEP pipeline. Finally, the instruction is actually executed in the ALU during the EX phase in the OEP pipeline. Register store operations perform both functions (DS + OC, and AG + EX) simultaneously in each stage of the OEP to execute the instruction in a single cycle.
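The three cases described above can be summarized by listing which phase each OEP stage performs on each pass. The following sketch is only a bookkeeping model built from the text. The instruction-class names and function names are hypothetical, and it makes no claim about exact cycle counts or memory timing.

```python
OEP_STAGES = ("DSOC", "AGEX")

# Phase performed in each OEP stage, one tuple per pass through the
# pipeline, for the three instruction classes described in the text.
PHASES = {
    "register_to_register": [("OC", "EX")],
    "embedded_load": [("DS", "AG"), ("OC", "EX")],
    "register_store": [("DS+OC", "AG+EX")],
}


def oep_passes(kind):
    """Number of times an instruction of this class passes through the OEP."""
    return len(PHASES[kind])


def oep_trace(kind):
    """Ordered list of (pass_number, stage, phase) for an instruction class."""
    return [
        (pass_number, stage, phase)
        for pass_number, phases in enumerate(PHASES[kind], start=1)
        for stage, phase in zip(OEP_STAGES, phases)
    ]
```

Calling `oep_trace("embedded_load")` yields DS and AG on the first pass and OC and EX on the second, matching the two-pass flow described for compound instructions.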

The results of the ColdFire design can be seen in the table "Comparing ColdFire to Other 68000 Designs." It compares today's 68000 design (the 68EC000), the latest 68040 design, and a possible ColdFire implementation. The ColdFire architecture provides 68040 levels of performance at a given frequency in a core size smaller than the original 68000 design. The RISC-based implementation approach provides higher operating frequencies while still maintaining the advantages of a variable-length ISA. For cost-driven embedded systems, this variable-length ISA can provide substantial benefits over a fixed-length approach. Additionally, this new architecture maintains compatibility with the substantial 68000-family embedded development tool sets and preserves the knowledge base of engineers and programmers.


COMPARING COLDFIRE TO OTHER 68000 DESIGNS

                        
                        68EC000         68040V                  COLDFIRE

Process technology      0.8             0.5                     0.5
                        3.3 V, DLM      3.3 V, TLM              3.3 V, TLM
Core size (sq mm)       11.8            18.4                    4.4
Frequency (MHz)         16.67           25                      50
On-chip cache           None            4 KB instruction,       4 KB unified
                                        4 KB data
External bus (bits)     16              32                      32

Performance
Embedded code           1.0x            11.6x                   20.2x
Dhrystone MIPS          2.1             24.6                    44.3
MIPS/watt               42              36                      197
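Two ratios derived from the table make the comparison easier to read: Dhrystone MIPS per MHz (performance at a given frequency) and Dhrystone MIPS per square millimeter of core. The sketch below computes both from the table's own figures. The dictionary layout and function names are illustrative, and the derived ratios do not appear in the article.

```python
# Figures copied from the comparison table.
DESIGNS = {
    "68EC000": {"mips": 2.1, "mhz": 16.67, "core_sq_mm": 11.8},
    "68040V": {"mips": 24.6, "mhz": 25.0, "core_sq_mm": 18.4},
    "ColdFire": {"mips": 44.3, "mhz": 50.0, "core_sq_mm": 4.4},
}


def mips_per_mhz(name):
    """Dhrystone MIPS delivered per MHz of clock frequency."""
    design = DESIGNS[name]
    return design["mips"] / design["mhz"]


def mips_per_sq_mm(name):
    """Dhrystone MIPS delivered per square millimeter of core area."""
    design = DESIGNS[name]
    return design["mips"] / design["core_sq_mm"]


if __name__ == "__main__":
    for name in DESIGNS:
        print(f"{name:10s} {mips_per_mhz(name):6.3f} MIPS/MHz "
              f"{mips_per_sq_mm(name):7.3f} MIPS/sq mm")
```

By these measures the ColdFire column comes out at roughly 0.89 MIPS per MHz against about 0.98 for the 68040V, while delivering several times more performance per unit of core area.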



ColdFire Processor Block Diagram

To improve performance, the ColdFire processor uses two pipelines. Note that the OEP's output is routed in such a way that compound instructions can pass through this pipeline twice.


Joe Circello is an advanced microprocessor architect for Motorola's High Performance Embedded Systems Division. You can reach him on the Internet at circello@oakhill.sps.mot.com or on BIX c/o "editors."
