ware requires changes so it can operate the multiple processors. Furthermore, it may be necessary to modify the application software so it divvies a task into sections for use by the various processors.
Technical Issues Solved
MP's cost and performance advantages led Mac OS vendors to tackle these hardware and software issues. In May 1995, Apple Computer and DayStar Digital (a Mac OS licensee and hardware designer) announced the joint development of an API, the Apple MP API, that resolved the software situation. In October 1995, DayStar addressed the hardware situation by shipping a four-processor Mac-compatible, called the Genesis MP, followed by two-processor systems in June 1996. This was followed by Apple's two-processor Power Mac 9500/MP in August. Another Mac OS vendor, UMAX, has begun shipping multiprocessor systems, too. Both Apple and UMAX have licensed MP hardware designs from DayStar.
It's worth examining how DayStar and Apple dealt with some of the technical difficulties in implementing MP on a Mac OS system. For the hardware, changes to the existing system architecture to add MP support were rather small. Most of the PowerPC processor family has on-chip support for an n-processor MP architecture. Surprisingly, the ASICs used in the original Power Mac 9500 (introduced in 1995) had bus arbitration support for a two-processor MP design built in. Four-processor designs such as DayStar's Genesis MP require extra glue logic.
The hardware model the Apple MP API uses assumes that the processors share the same block of memory. This simplifies the software design because it makes it easy to share data and code libraries. The API also assumes a cache-coherent model, which relieves the programmer of the chore of updating the processor caches. The model assumes that only one processor needs access to I/O devices, timers, and external interrupts (although the processors can interrupt one another).
While this design's shared memory sounds like symmetric multiprocessing (SMP), it isn't. An SMP architecture assumes that everything is shared, including the I/O devices, which isn't the case with the Apple MP API. Another difference is that in an SMP system, portions of the OS can migrate to other processors to balance the load. This isn't possible in a Mac OS MP design because much of the OS code is nonreentrant. The Apple MP API overcomes these problems by restricting what code certain processors call.
The Apple MP API
The Apple MP API consists of a shared library that implements the API functions and a hardware abstraction layer (HAL) that manages the low-level MP hardware for both the programmer and the API itself. When an MP-aware application uses MP services, the Mac OS Code Fragment Manager automatically connects it to the MP library. The MP library next locates the appropriate HAL for the given hardware configuration. The processor that's already executing code at this point is anointed as the main processor, while the other processors are designated attached processors. The main processor runs the 680x0 emulator and the Mac OS and manages device I/O. The MP API uses the HAL to bootstrap the other processors and install a lightweight preemptive scheduler on all the processors (as shown in the figure "Mac OS MP Architecture"). The MP API provides kernel services that implement MP task coordination and messaging. Note that the MP kernel isn't an executing task like a daemon; it is simply a set of service calls.
The Apple MP API provides calls that query the system for the number of processors, create/terminate MP tasks, allocate memory, and manage task synchronization. When an application creates an MP task, the MP kernel assigns it to a global task queue. When a currently executing task gets rescheduled, the processor's scheduler checks this queue and runs the next pending MP task for a maximum interval of 10 milliseconds. This permits the kernel to perform load-balancing for MP tasks. Task coordination is accomplished through supplied queue, semaphore, and critical region API calls. You should use these calls, since they help the kernel schedule and control MP tasks. Because the main processor also runs a scheduler and executes MP tasks, the Apple MP API performs symmetric processing even if the OS doesn't.
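To get a feel for how a shared queue and a 10-millisecond slice spread work across processors, consider a toy model. The sketch below is plain C, not the Apple MP API: `SimTask`, `simulate`, and `MAX_TASKS` are hypothetical names, and the model assumes that all processors reschedule in lockstep, which a real preemptive kernel would not do.

```c
#include <assert.h>
#include <stddef.h>

#define QUANTUM_MS 10   /* maximum slice per task, as described in the article */
#define MAX_TASKS  16   /* capacity of this toy queue (assumption) */

/* Hypothetical model of an MP task: just the CPU time it still needs. */
typedef struct {
    int remaining_ms;
} SimTask;

/*
 * Simulate nprocs processors draining one shared FIFO queue of tasks.
 * In each round, every processor takes the task at the head of the queue,
 * runs it for up to QUANTUM_MS, and requeues it at the tail if unfinished.
 * Returns the elapsed time in milliseconds under the lockstep assumption.
 */
int simulate(const int *work_ms, size_t ntasks, int nprocs)
{
    SimTask queue[MAX_TASKS];
    size_t head = 0, count = 0;
    int elapsed = 0;

    assert(ntasks <= MAX_TASKS && nprocs > 0);
    for (size_t i = 0; i < ntasks; i++) {
        assert(work_ms[i] > 0);
        queue[count++].remaining_ms = work_ms[i];
    }

    while (count > 0) {
        size_t running = count < (size_t)nprocs ? count : (size_t)nprocs;
        SimTask batch[MAX_TASKS];
        int round_ms = 0;

        /* Each processor dequeues the next pending task. */
        for (size_t p = 0; p < running; p++) {
            batch[p] = queue[head];
            head = (head + 1) % MAX_TASKS;
            count--;
        }

        /* Run each task for at most one quantum; requeue unfinished work. */
        for (size_t p = 0; p < running; p++) {
            int slice = batch[p].remaining_ms < QUANTUM_MS
                            ? batch[p].remaining_ms : QUANTUM_MS;
            batch[p].remaining_ms -= slice;
            if (slice > round_ms)
                round_ms = slice;
            if (batch[p].remaining_ms > 0) {
                queue[(head + count) % MAX_TASKS] = batch[p];
                count++;
            }
        }
        elapsed += round_ms;   /* the round lasts as long as its longest slice */
    }
    return elapsed;
}
```

With four 25-ms tasks, `simulate` reports 25 ms on four processors, 50 ms on two, and 100 ms on one. Because every processor pulls from the same queue, no task is tied to a particular processor, which is the property that lets the kernel balance the load.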
MP Limits
When writing a Mac application to use MP tasks, keep in mind the Apple MP API's limitations. First, an MP task can't execute 680x0 processor code. That's because the 680x0 emulator runs only on the main processor. Also, an MP task can't make direct calls to the Mac OS or Toolbox because of the nonreentrant code problem and because some of these functions consist of 680x0 code. This also explains why the main processor handles all the I/O: The File Manager and certain low-level I/O code use 680x0 code.
From these restrictions, it becomes obvious that MP tasks are most suitable for PowerPC compute-intensive code. Fortunately, a lot of work, such as image editing, digital video effects, 3-D modeling, and simulation, fits into this category. Furthermore, an MP-aware application isn't locked out of using the OS. The application's main task executes on the main processor and can avail itself of OS services. The MP tasks executing on the attached processors would use the synchronization calls to notify the main task when data should, say, be spooled to disk or placed on the screen.
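The division of labor described above, where compute code only posts a notification and the main task alone touches the OS, can be modeled in a few lines. This is a single-threaded sketch in portable C, not Apple MP API code: `ReadyMsg`, `MsgQueue`, `compute_block`, and `drain_to_file` are hypothetical names, and the queue has no locking. A real MP application would rely on the API's own queue calls for synchronization between processors.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define QUEUE_CAP 8   /* toy capacity (assumption) */

/* Hypothetical message: "block N is finished and ready for output". */
typedef struct {
    int      block;
    unsigned checksum;
} ReadyMsg;

typedef struct {
    ReadyMsg slots[QUEUE_CAP];
    size_t   head, count;
} MsgQueue;

void queue_init(MsgQueue *q)
{
    assert(q != NULL);
    q->head = 0;
    q->count = 0;
}

/* Called from compute code. Returns 0 on success, -1 if the queue is full. */
int queue_post(MsgQueue *q, ReadyMsg m)
{
    if (q->count == QUEUE_CAP)
        return -1;
    q->slots[(q->head + q->count) % QUEUE_CAP] = m;
    q->count++;
    return 0;
}

/* Called from the main task. Returns 1 if a message was removed, else 0. */
int queue_poll(MsgQueue *q, ReadyMsg *out)
{
    if (q->count == 0)
        return 0;
    *out = q->slots[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    return 1;
}

/* Stand-in for an MP task: pure computation on memory, then a notification.
   It makes no OS or I/O calls, mirroring the restriction on MP tasks. */
int compute_block(MsgQueue *q, int block, const unsigned char *data, size_t len)
{
    unsigned sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += data[i];
    ReadyMsg m = { block, sum };
    return queue_post(q, m);
}

/* Stand-in for the main task: the only code here that performs I/O. */
size_t drain_to_file(MsgQueue *q, FILE *out)
{
    ReadyMsg m;
    size_t written = 0;
    while (queue_poll(q, &m)) {
        fprintf(out, "block %d checksum %u\n", m.block, m.checksum);
        written++;
    }
    return written;
}
```

The point of the split is that `compute_block` could run anywhere, while `drain_to_file` stays with the one task that is allowed to call the OS.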
It's up to you to determine how to best partition the job so that MP tasks make the best use of system resources. Ideally, you want each MP task accessing memory at different times to make the most efficient use of the system bus. To help in this area, version 11 of Metrowerks CodeWarrior provides source-level debugging of MP tasks (as shown in the screenshot). While this involves some extra work on your part, the results can make the effort worthwhile: On a four-processor system, the performance of MP applications can be boosted by 2.5 to 3.5 times. Despite the formidable limits placed upon the MP architecture by the Mac OS, the Apple MP API offers a practical MP solution. It's important to know that Apple plans to carry over the MP API into future versions of the Mac operating system. This will preserve your MP coding efforts and will provide better performance because these future releases will offer OS-level support for symmetric multiprocessing.
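As a starting point for partitioning, here is a minimal sketch in plain C (not Apple MP API code) that splits a job, such as the scan lines of an image, into contiguous bands, one per MP task. It also includes a rough speedup estimate based on Amdahl's law. The names `Band`, `band_for_task`, and `amdahl_speedup` are hypothetical, and the article doesn't attribute its 2.5-to-3.5 range to Amdahl's law; bus contention and other effects play a part too, so treat the estimate as one simplified reading.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one task's share of the work. */
typedef struct {
    size_t start;   /* index of the first item in the band */
    size_t count;   /* number of items in the band */
} Band;

/*
 * Split `total` items into `parts` contiguous, near-equal bands and
 * return the band for task `index`. When the split isn't even, the
 * first (total % parts) bands each get one extra item.
 */
Band band_for_task(size_t total, size_t parts, size_t index)
{
    assert(parts > 0 && index < parts);
    size_t base  = total / parts;
    size_t extra = total % parts;
    Band b;
    b.start = index * base + (index < extra ? index : extra);
    b.count = base + (index < extra ? 1 : 0);
    return b;
}

/*
 * Amdahl's law: speedup on nprocs processors when only the fraction
 * `parallel_fraction` (between 0.0 and 1.0) of the run time can be
 * spread across them and the rest stays serial.
 */
double amdahl_speedup(double parallel_fraction, int nprocs)
{
    assert(nprocs > 0);
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / nprocs);
}
```

For 10 scan lines and four tasks, the bands hold 3, 3, 2, and 2 lines. Under this model, a speedup of 2.5 on four processors corresponds to a parallel fraction of 0.8, and a speedup of 3.5 to roughly 0.95, which suggests how much of the job has to move into MP tasks to reach the upper end of the range.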
The author thanks David Sowell, Chris Cooksey, and David Methven of DayStar Digital for their help with this article.
Where to Find
Apple Computer
Cupertino, CA
Phone: (408) 996-1010
Internet: http://www.apple.com/