The Software Stopwatch


April 1995 / Core Technologies / The Software Stopwatch

It doesn't take special hardware to achieve timing resolution in the microseconds on the PC or the Mac. You just have to know where to look.

Rick Grehan

Optimization is often the name of the programming game. But before you can optimize, you have to find that which is not optimal. Bill Atkinson, a member of the original Mac design team, once said: "Optimization without measuring is wasted time. Find out where the application's really spending time and go whump on that code" (February 1984 BYTE, page 76).

So, to optimize, you must whump, and before you whump, you must measure. But measurement may not be an easy task with today's processors running at 200 MHz and higher. A person with a Rolodex won't even begin to cut it. Some type of hardware/software assistance would be nice, but how do you safely get at all those timers and hardware things that the operating system works so hard to insulate you from?

Time on the PC

At the heart of the old IBM PC--and still beating in PC clones today--is a hardware timer that issues an interrupt about once every 55 milliseconds. This works out to about 18.21 clock ticks per second and is the basis for the time-of-day clock on most PCs.

It's also the reason "generic" time functions on a PC may be inadequate for high-accuracy timing. Suppose you've written a program whose performance is highly dependent on a sorting routine. You have two candidate sort algorithms to put at the core of that routine. In your tests, the first sort algorithm takes a twentieth of a second; the second takes only a fortieth of a second (not unheard of in these days of 90- and 100-MHz Pentiums).

If you simply test the candidate algorithms by running an iteration of each and timing the duration, a DOS clock-based timer will report that they both take the same amount of time. The problem is one of resolution; a clock that ticks only 18 times a second can't "see" any events shorter than a tick.
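To see why the two sorts look identical, here is a minimal sketch of what a tick-counting stopwatch can report. The helper name ticks_observed and the rounded 55,000-microsecond tick period are our own assumptions for illustration; they are not part of any DOS library.

```c
#include <stdint.h>

/* Tick period of the DOS time-of-day clock, rounded to 55 ms (assumption). */
#define TICK_PERIOD_US 55000ULL

/* Hypothetical helper: number of clock ticks that occur while an event
   starting at start_us and lasting duration_us is running. This count is
   all that a tick-based timer can report about the event. */
uint64_t ticks_observed(uint64_t start_us, uint64_t duration_us)
{
    uint64_t first = start_us / TICK_PERIOD_US;
    uint64_t last  = (start_us + duration_us) / TICK_PERIOD_US;
    return last - first;
}
```

Started right after a tick, the 50-ms sort and the 25-ms sort both report zero ticks. Started 40 ms after a tick, both report one. Either way, the timer cannot tell them apart.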

Brief Tangent

Before we discuss fixing this problem, stop for a moment for a warning. The clock() function in many DOS C compilers will appear to return results with resolutions to the hundredths or thousandths of a second. The warning: This is only partly true. The true part is that the number returned by clock() has the proper dimension. The not-so-true part is the fact that the clock() routine doesn't advance one unit per tick. It lurches forward in time, skipping multiple milliseconds at a tick.

For example, you can write a short C program that repeatedly calls the clock() routine, displaying a value only when there's a change. (Try this with your favorite DOS compiler.) We tried this with one DOS compiler and the value of clock() advanced by 5 or 6 ticks. That compiler's time.h header file told us that the clock() function presumes 100 ticks per second. Sure enough, 55 ms works out to between five- and six-hundredths of a second, given that clock() will exhibit a 5-ms "jitter."
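A test program along those lines might look like the following sketch. It uses only the standard clock() function and CLOCKS_PER_SEC; the helper names clock_step and print_clock_steps are ours, and some older DOS compilers spell the constant CLK_TCK instead.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: spin until clock() changes and return the size of
   the jump in clock() units. Returns 0 if clock() is unavailable. */
long clock_step(void)
{
    clock_t start = clock();
    clock_t now;

    if (start == (clock_t)-1)
        return 0;
    do {
        now = clock();
    } while (now == start);
    return (long)(now - start);
}

/* Print the units-per-second constant and the next `count` jumps. */
void print_clock_steps(int count)
{
    int i;

    printf("CLOCKS_PER_SEC = %ld\n", (long)CLOCKS_PER_SEC);
    for (i = 0; i < count; i++)
        printf("clock() advanced by %ld\n", clock_step());
}
```

Call print_clock_steps(10) from main and compare the jumps with the units-per-second constant. If the jumps are larger than 1, the function's apparent resolution is finer than its real one.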

Higher Resolution

There is a route to higher resolution on a PC running DOS. It appeared in the January 1987 BYTE in Byron Sheppard's article "High-Performance Software Analysis on the IBM PC." The article included the source code listing for assembly language routines that could easily be modified for calling from high-level language programs.

Sheppard's approach involved reprogramming the PC's timer 0, which turns out to be the timer that generates the 55-ms clock ticks. It also turns out that timer 0 is really ticking away with 840-nanosecond pulses; the BIOS programs it to count 65,536 pulses before generating the interrupt. In a nutshell, Sheppard's routine reads the count in the timer, which--with some math--you can use to determine the number of 840-ns ticks since the last timer interrupt. This gives you better-than-microsecond accuracy. If you want to check out Sheppard's listing, you can dig up your back issues of BYTE or download the file JAN87.ARC from the listings/frombyte87 area on BIX.
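The "some math" can be sketched as follows. This is not Sheppard's code, only the arithmetic, written in modern C. It assumes the standard PC timer input frequency of 1,193,182 Hz (about 838 ns per pulse, which rounds to the 840 ns quoted above), a counter that counts down by one per pulse from 65,536, and a caller that keeps its own count of timer interrupts. The function names are hypothetical.

```c
#include <stdint.h>

#define PIT_HZ               1193182ULL  /* timer input pulses per second */
#define PULSES_PER_INTERRUPT 65536ULL    /* count programmed by the BIOS */

/* Pulses elapsed, given the number of timer interrupts seen so far and the
   value read from the down-counter. A reading of 0 means the counter has
   just been reloaded, so no pulses have elapsed in the current period. */
uint64_t pit_pulses(uint32_t interrupts, uint16_t counter)
{
    uint64_t into_period = (PULSES_PER_INTERRUPT - counter) & 0xFFFFULL;
    return (uint64_t)interrupts * PULSES_PER_INTERRUPT + into_period;
}

/* Convert a pulse count to nanoseconds (safe for runs of a few hours). */
uint64_t pit_pulses_to_ns(uint64_t pulses)
{
    return pulses * 1000000000ULL / PIT_HZ;
}
```

One full period of 65,536 pulses converts to roughly 54.9 ms, which is the familiar 55-ms tick.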

Virtual Timers

Unfortunately, using a hardware timer isn't always an available option. Within operating systems (e.g., Windows, which virtualizes hardware), directly accessing hardware can yield bizarre results. We tried Sheppard's high-resolution timer code from within a Windows program: Sometimes it worked, and sometimes it didn't.

Fortunately, Windows has a kind of back door into a timer that is as good as Sheppard's. Specifically, Windows' VTD (virtual timer device) provides access to a timer with a resolution of--guess what--840 ns. You can get to it using two assembly language routines lashed to your main code.

One routine calls INT 2Fh, which is a kind of clearinghouse interrupt that returns the API entry point for all the device drivers Windows knows of. You just plug a virtual device ID into the BX register (the ID of the VTD is 05) and call the interrupt; ES:DI returns holding the entry-point address in segment:offset form (or all zeros if the VxD has no entry point).

The other routine actually calls the VTD. Plug 100h into the AX register (this is the function code) and then call the entry point that the first routine returned. The VTD returns, in EDX:EAX, the number of 840-ns ticks since Windows was started. The listing "A Short Assembly Language Routine" shows both routines.
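Once you have two EDX:EAX readings, turning them into a duration is simple arithmetic. The sketch below is modern C rather than 16-bit Windows code, and its function names are hypothetical. It assumes the VTD tick is the same pulse as timer 0 (1,193,182 per second, or roughly 840 ns each).

```c
#include <stdint.h>

/* Assumed tick rate: the same roughly 840-ns pulse as hardware timer 0. */
#define VTD_TICKS_PER_SEC 1193182ULL

/* Combine the low dword (from EAX) and high dword (from EDX) of a reading. */
uint64_t vtd_ticks(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Microseconds between a start reading and a stop reading. */
uint64_t vtd_elapsed_us(uint32_t start_lo, uint32_t start_hi,
                        uint32_t stop_lo, uint32_t stop_hi)
{
    uint64_t ticks = vtd_ticks(stop_lo, stop_hi) - vtd_ticks(start_lo, start_hi);
    return ticks * 1000000ULL / VTD_TICKS_PER_SEC;
}
```

Taking the difference on the full 64-bit value means a carry out of the low dword between the two readings does no harm.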

Time on the Mac

Suppose you're a Mac developer, and you want to do some high-resolution timing. There are no hardware timers to reprogram here. You first might try to use the timer global variable residing at 016Ah (referred to as LMGetTicks in the standard header files for the Mac). This location is updated 60 times a second and holds the number of ticks since the Mac was started. At about 16.7 ms per tick, it is at least better than the DOS clock.

But the revised time manager--provided with the Mac OS 6.0.3 and higher--can provide timing with accuracy to 20 microseconds. (The Mac OS 7.0 has the extended time manager, which does everything the revised time manager can and more.)

The time manager's real job is to schedule tasks to run at predetermined future times. This lets you set up tasks that run at regular intervals, an important feature for multimedia and real-time activities that need routines run at precise intervals.

What we want, however, is the ability to measure durations of time. With the above description of the time manager, it seems that the logical approach to this is to create a routine that wakes up every millisecond or so and updates a global variable; sort of a high-resolution form of LMGetTicks.

But Apple has built into the time manager a way to measure without submitting a task for the manager to run. Sounds weird, but it's true. The time manager keeps a list of all the tasks scheduled to run on a queue, a linked list of data structures. Each queue member contains a pointer to the task that you want awakened in the future, as well as the delay (i.e., how far into the future the manager must go before waking up the task). You can specify the delay in milliseconds (for a long delay and not much accuracy) or microseconds (for the reverse). The time manager knows whether you want milliseconds or microseconds by examining the sign of the delay field: A negative value indicates microseconds, and positive indicates milliseconds. You put an item onto the queue using the InsTime routine (after building the appropriate data structure, of course) and "arm" it with a call to PrimeTime. This latter function is the one that tells the time manager when to run the task. You remove an item from the queue with a call to RmvTime.
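The sign convention for the delay can be sketched as a small helper that builds the value you would hand to PrimeTime. The name prime_time_count is hypothetical, and the sketch uses a fixed 32-bit type to stand in for the Mac's long.

```c
#include <stdint.h>

/* Hypothetical helper: encode a delay, given in microseconds, as a
   Time Manager count. Delays that fit are passed as negative microseconds;
   longer ones fall back to positive milliseconds and lose precision. */
int32_t prime_time_count(uint64_t microseconds)
{
    uint64_t milliseconds;

    if (microseconds <= (uint64_t)INT32_MAX)
        return -(int32_t)microseconds;      /* negative: microseconds */

    milliseconds = microseconds / 1000ULL;
    if (milliseconds > (uint64_t)INT32_MAX)
        milliseconds = (uint64_t)INT32_MAX; /* clamp very long delays */
    return (int32_t)milliseconds;           /* positive: milliseconds */
}
```

A 250-microsecond delay becomes -250, while a 50-minute delay, too long to express in microseconds, becomes 3,000,000 (milliseconds).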

If you place an item on the queue that has a NIL value in the task pointer field, the time manager never starts the task (which makes sense, because you've basically told the time manager that there is no task to start). But if you issue a PrimeTime call on such a queue element, the time manager keeps track of how much time is left were the hypothetical task started (it places this duration in a field of the queue element called tmCount). You then call RmvTime and examine tmCount to determine the time left, from which you can compute the time that has elapsed since the call to PrimeTime. The rest is obvious: Bracket the code you want to time between a call to PrimeTime and RmvTime, and you have a software stopwatch (see the listing "Mac Time Manager as Stopwatch").
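The subtraction itself can be sketched in portable C. The function names here are hypothetical, and the sketch assumes that tmCount reports the time left with the same sign convention as the delay: negative for microseconds, positive for milliseconds.

```c
#include <stdint.h>

/* Time left on the hypothetical task, converted to microseconds.
   Assumes negative tmCount = microseconds, positive = milliseconds. */
int64_t remaining_us(int32_t tm_count)
{
    return tm_count < 0 ? -(int64_t)tm_count : (int64_t)tm_count * 1000;
}

/* Elapsed time between PrimeTime and RmvTime, less the measured overhead
   of making the two calls back to back. All arguments in microseconds
   except tm_count, which is the raw tmCount field after RmvTime. */
int64_t stopwatch_elapsed_us(int32_t delay_us, int32_t tm_count,
                             int32_t overhead_us)
{
    return (int64_t)delay_us - remaining_us(tm_count) - overhead_us;
}
```

For example, with a 1,000,000-microsecond delay, a tmCount of -998,750, and 50 microseconds of call overhead, the timed code took 1,200 microseconds.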

Watch the Watch

Getting at high-resolution timers doesn't have to be a programming nightmare. Of course, you've got to do some exploring. We had to dig through Inside Macintosh and various Microsoft developer CDs--not a carefree jaunt. But it paid off. Now we've got stopwatches accurate into the microseconds.

Whump away.


A Short Assembly Language Routine

A short assembly language routine to fetch the entry point of the
VTD. It returns the address in DX:AX.

getapi_ proc near
  mov  ax,1684h  ;Subcode to return API
  mov  bx,05h    ;VTD ID
  xor  di,di     ;Clear ES:DI
  mov  es,di
  int  2fh       ;Call interrupt
  mov  dx,es     ;Return results
  mov  ax,di
  ret
getapi_ endp

Call the VTD. The entry point is passed to the routine in the DX:AX
register pair. Note that the routine fakes a CALL using a far return.
Also, this routine presumes tarray is a two-element array of double
words.

gettick_ proc near
  push cs
  mov  bx,offset retspot
  push bx
  push dx      ;Push VTD segment
  push ax      ;Push VTD offset
  mov  ax,100h ;Function code
  retf         ;Make the far call
retspot:
  mov  dword ptr _tarray,eax
  mov  dword ptr _tarray+4,edx
  ret
gettick_ endp



Mac Time Manager as Stopwatch

Using the time manager on the Mac as a software stopwatch. This code
can time events up to 10 minutes. Note the calculation of ohead,
which factors out the overhead of the time manager routine calls.

struct TMTask myTMT;
long delay, ohead, rslt;

/* Clear TMTask struct */
memset((void *)&myTMT,0,sizeof(TMTask));
delay=600L*1000L*1000L; /* 10 minute delay, in microsecs */

/* Put task on queue */
InsTime((QElemPtr)&myTMT);

/* Calculate overhead */
PrimeTime((QElemPtr)&myTMT,-delay);
RmvTime((QElemPtr)&myTMT);
ohead=delay+myTMT.tmCount;

/* Time something */
InsTime((QElemPtr)&myTMT);
PrimeTime((QElemPtr)&myTMT,-delay);
/* ...insert stuff to be timed here... */
RmvTime((QElemPtr)&myTMT);

/* rslt has duration in microsecs */
rslt=delay+myTMT.tmCount-ohead;



Rick Grehan is the technical director of the BYTE Lab. He has a B.S. in physics and applied mathematics and an M.S. in mathematics/computer science. He can be reached on the Internet or BIX at rick_g@bix.com.
