
The Mac Goes Multiprocessor


February 1997 / Core Technologies / The Mac Goes Multiprocessor

A new library and API enable preemptive multitasking and multiprocessing on Mac OS systems.

Tom Thompson

Today, desktop computer designs are flirting with clock speeds of 300 MHz, unheard of just a year ago. However, to keep a computer's design both simple and affordable, its system bus -- where the memory and certain peripherals hang out -- typically dawdles along at a fraction of the processor's speed. Because of the design and cost constraints, it's going to be difficult for even faster systems to realize significant gains in performance.

There's an alternative design that can achieve large performance boosts in spite of the slower bus. This is multiprocessing (MP), where the system has two or more processors that improve throughput by working in concert to divide and conquer a job. Of course, there's a catch: Some hardware modifications are necessary so the processors can properly share the system bus and the peripherals. Also, the OS software requires changes so it can operate the multiple processors. Furthermore, it may be necessary to modify the application software so it divvies a task into sections for use by the various processors.
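To make "divvying a task into sections" concrete, here is a minimal sketch in Python. It is not Mac OS code: the function names are hypothetical, and Python threads merely stand in for processors (in CPython they will not deliver a real speedup for this kind of work). The point is only the shape of the approach: split the data, give each worker one section, then combine the partial results.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_sections(items, n_sections):
    """Split items into n_sections contiguous slices of near-equal length."""
    base, extra = divmod(len(items), n_sections)
    sections, start = [], 0
    for i in range(n_sections):
        # The first `extra` sections each take one leftover item.
        end = start + base + (1 if i < extra else 0)
        sections.append(items[start:end])
        start = end
    return sections


def parallel_sum_of_squares(items, n_workers=4):
    """Hand one section to each worker, then combine the partial results."""
    sections = split_into_sections(items, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(lambda s: sum(x * x for x in s), sections))
    return sum(partials)
```

Each worker touches only its own section, so no locking is needed until the partial results are combined at the end.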

Technical Issues Solved

MP's cost and performance advantages led Mac OS vendors to tackle these hardware and software issues. In May of 1995, Apple Computer and DayStar Digital (a Mac OS licensee and hardware designer) announced the joint development of an API, the Apple MP API, that resolved the software situation. In October 1995, DayStar addressed the hardware situation by shipping a four-processor Mac-compatible, called the Genesis MP, followed by two-processor systems in June 1996. This was followed by Apple's two-processor Power Mac 9500/MP in August. Another Mac OS vendor, UMAX, has begun shipping multiprocessor systems, too. Both Apple and UMAX have licensed MP hardware designs from DayStar.

It's worth examining how DayStar and Apple dealt with some of the technical difficulties in implementing MP on a Mac OS system. For the hardware, changes to the existing system architecture to add MP support were rather small. Most of the PowerPC processor family has on-chip support for an n-processor MP architecture. Surprisingly, the ASICs used in the original Power Mac 9500 (introduced in 1995) had bus arbitration support for a two-processor MP design built in. Four-processor designs such as DayStar's Genesis MP require extra glue logic.

The hardware model the Apple MP API uses assumes that the processors share the same block of memory. This simplifies the software design because it makes it easy to share data and code libraries. The API also assumes a cache-coherent model, which relieves the programmer of the chore of updating the processor caches. The model assumes that only one processor needs access to I/O devices, timers, and external interrupts (although the processors can interrupt one another).

While this design's shared memory sounds like symmetric multiprocessing (SMP), it isn't. An SMP architecture assumes that everything is shared, including the I/O devices, which isn't the case with the Apple MP API. Another difference is that in an SMP system, portions of the OS can migrate to other processors to balance the load. This isn't possible in a Mac OS MP design because much of the OS code is nonreentrant. The Apple MP API overcomes these problems by restricting what code certain processors call.

The Apple MP API

The Apple MP API consists of a shared library that implements the API functions and a hardware abstraction layer (HAL) that manages the low-level MP hardware for both the programmer and the API itself. When an MP-aware application uses MP services, the Mac OS Code Fragment Manager automatically connects it to the MP library. The MP library next locates the appropriate HAL for the given hardware configuration. The processor that's already executing code at this point is anointed as the main processor, while the other processors are designated attached processors. The main processor runs the 680x0 emulator and the Mac OS and manages device I/O. The MP API uses the HAL to bootstrap the other processors and install a lightweight preemptive scheduler on all the processors (as shown in the figure "Mac OS MP Architecture"). The MP API provides kernel services that implement MP task coordination and messaging. Note that the MP kernel isn't an executing task like a daemon; it is simply a set of service calls.

The Apple MP API provides calls that query the system for the number of processors, create/terminate MP tasks, allocate memory, and manage task synchronization. When an application creates an MP task, the MP kernel assigns it to a global task queue. When a currently executing task gets rescheduled, the processor's scheduler checks this queue and runs the next pending MP task for a maximum interval of 10 milliseconds. This permits the kernel to perform load-balancing for MP tasks. Task coordination is accomplished through supplied queue, semaphore, and critical region API calls. You should use these calls, since they help the kernel schedule and control MP tasks. Because the main processor also runs a scheduler and executes MP tasks, the Apple MP API performs symmetric processing even if the OS doesn't.
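The global task queue described above can be sketched roughly as follows. This is an analogy in Python, not the Apple MP API: run_tasks and scheduler_loop are hypothetical names, worker threads stand in for the per-processor schedulers, and a lock stands in for a critical region. The sketch also runs each task to completion, so it does not model the 10-millisecond preemption interval.

```python
import queue
import threading


def run_tasks(tasks, n_processors=2):
    """Run callables from one shared queue on n_processors worker threads."""
    pending = queue.Queue()              # stands in for the global task queue
    for task_id, task in enumerate(tasks):
        pending.put((task_id, task))

    results = {}
    results_lock = threading.Lock()      # stands in for a critical region

    def scheduler_loop():
        # Each "processor" repeatedly takes the next pending task.
        while True:
            try:
                task_id, task = pending.get_nowait()
            except queue.Empty:
                return                   # nothing pending: this worker is done
            value = task()
            with results_lock:           # one worker at a time updates results
                results[task_id] = value

    workers = [threading.Thread(target=scheduler_loop)
               for _ in range(n_processors)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return [results[i] for i in range(len(tasks))]
```

Because every worker draws from the same queue, a worker that finishes early simply picks up the next pending task, which is the load-balancing effect the paragraph above describes.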

MP Limits

When writing a Mac application to use MP tasks, keep in mind the Apple MP API's limitations. First, an MP task can't execute 680x0 processor code. That's because the 680x0 emulator runs only on the main processor. Also, an MP task can't make direct calls to the Mac OS or Toolbox because of the nonreentrant code problem and because some of these functions consist of 680x0 code. This also explains why the main processor handles all the I/O: The File Manager and certain low-level I/O code use 680x0 code.

From these restrictions, it becomes obvious that MP tasks are most suitable for PowerPC compute-intensive code. Fortunately, a lot of work, such as image editing, digital video effects, 3-D modeling, and simulation, fits into this category. Furthermore, an MP-aware application isn't locked out of using the OS. The application's main task executes on the main processor and can avail itself of OS services. The MP tasks executing on the attached processors would use the synchronization calls to notify the main task when data should, say, be spooled to disk or placed on the screen.
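The division of labor described above, in which compute tasks do no I/O and instead notify the main task, might look like this in outline. Again this is a Python analogy under stated assumptions: render_rows is a hypothetical name, threads stand in for MP tasks, a queue stands in for the API's notification mechanism, and an in-memory buffer stands in for the disk or screen.

```python
import io
import queue
import threading


def render_rows(n_rows, n_tasks=2, out=None):
    """Workers compute rows; only the main thread writes them to `out`."""
    out = out if out is not None else io.StringIO()
    done = queue.Queue()                 # workers post finished rows here

    def compute(rows):
        for row in rows:
            done.put((row, row * row))   # pure computation, no I/O
        done.put(None)                   # tells the main task this worker is finished

    threads = [threading.Thread(target=compute,
                                args=(range(i, n_rows, n_tasks),))
               for i in range(n_tasks)]
    for thread in threads:
        thread.start()

    finished, results = 0, {}
    while finished < n_tasks:            # main task waits for notifications
        message = done.get()
        if message is None:
            finished += 1
        else:
            results[message[0]] = message[1]
    for thread in threads:
        thread.join()

    for row in sorted(results):          # all output happens on the main thread
        out.write(f"{row}:{results[row]}\n")
    return out
```

Keeping every write on the main thread mirrors the rule that only the main processor may call the OS; the workers never need to know where their results end up.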

It's up to you to determine how to best partition the job so that MP tasks make the best use of system resources. Ideally, you want each MP task accessing memory at different times to make the most efficient use of the system bus. To help in this area, version 11 of Metrowerks CodeWarrior provides source-level debugging of MP tasks (as shown in the screenshot). While this involves some extra work on your part, the results can make the effort worthwhile: On a four-processor system, the performance of MP applications can be boosted by 2.5 to 3.5 times. Despite the formidable limits placed upon the MP architecture by the Mac OS, the Apple MP API offers a practical MP solution. It's important to know that Apple plans to carry over the MP API into future versions of the Mac operating system. This will preserve your MP coding efforts and will provide better performance because these future releases will offer OS-level support for symmetric multiprocessing.
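One way to read the 2.5x to 3.5x figure quoted above is through Amdahl's law, which the article itself does not invoke. The sketch below assumes the only limit on speedup is the fraction of work that stays serial (on the main processor, for example); it ignores bus contention and scheduling overhead, so treat the numbers as a rough guide rather than a measurement.

```python
def amdahl_speedup(parallel_fraction, n_processors):
    """Speedup predicted by Amdahl's law for a given parallel fraction."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)


def parallel_fraction_for(speedup, n_processors):
    """Invert Amdahl's law: the parallel fraction needed for a given speedup."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / n_processors)
```

Under this simplified model, a 2.5x speedup on four processors corresponds to about 80 percent of the work running in MP tasks, and 3.5x to roughly 95 percent, which suggests how much of an application must be moved off the main task to reach the upper end of the range.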


The author thanks David Sowell, Chris Cooksey, and David Methven of DayStar Digital for their help with this article.


Where to Find


Apple Computer
Cupertino, CA
Phone:    (408) 996-1010
Internet: http://www.apple.com/

DayStar Digital
Flowery Branch, GA
Phone:    (770) 967-2077
E-mail:   mp@daystar.com
Internet: http://www.daystar.com/



Mac OS MP Architecture


Multiprocessing tasks execute preemptively, even on a single-processor machine.


Metrowerks Does Battle with Bugs


Metrowerks CodeWarrior provides source-level debugging on MP tasks.


Tom Thompson is a BYTE senior technical editor at large. You can reach him by sending e-mail to tom_thompson@bix.com.
