
Embedded Reliability: Bet Your Life


April 1998 / Cover Story / Crash-Proof Computing / Embedded Reliability: Bet Your Life

Your life literally depends on millions of invisible computers that control everything from commercial airliners and antilock braking systems to traffic lights and medical equipment. It's a good thing those computers don't crash as often as PCs, because real life does not let you undo.

Embedded control systems far outnumber PCs, and they're multiplying faster than AOL disks. Occasionally they do fail, sometimes with catastrophic results. In 1996, an Ariane 5 rocket exploded after a program tried to stuff a 64-bit value into a 16-bit variable. In 1991, an Iraqi Scud missile killed 28 Americans when a computer's clock drift prevented a Patriot missile battery from tracking the target accurately. In 1986 and 1987, three cancer patients died when a pair of Therac-25 radiation-therapy machines accidentally blasted them with lethal doses of radiation.
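
The Ariane failure is an instance of an unchecked narrowing conversion. The Python sketch below is illustrative only and is not the flight code: it assumes the 64-bit value was a floating-point number and the 16-bit variable a signed integer (the article specifies neither), and all function names are hypothetical. It contrasts a conversion that silently wraps with two that detect or bound the overflow.

```python
import struct

INT16_MIN, INT16_MAX = -32768, 32767


def wrap_to_int16(value: float) -> int:
    """Unchecked narrowing: keep only the low 16 bits (two's-complement wrap)."""
    low_bits = int(value) & 0xFFFF
    # Reinterpret the 16 unsigned bits as a signed 16-bit integer.
    return struct.unpack("<h", struct.pack("<H", low_bits))[0]


def checked_to_int16(value: float) -> int:
    """Checked narrowing: reject values that do not fit instead of corrupting them."""
    truncated = int(value)
    if not INT16_MIN <= truncated <= INT16_MAX:
        raise OverflowError(f"{value!r} does not fit in a signed 16-bit integer")
    return truncated


def saturating_to_int16(value: float) -> int:
    """Saturating narrowing: clamp out-of-range values to the nearest limit."""
    return max(INT16_MIN, min(INT16_MAX, int(value)))


if __name__ == "__main__":
    reading = 40000.0  # hypothetical sensor value, larger than INT16_MAX
    print(wrap_to_int16(reading))        # -25536: plausible-looking but wrong
    print(saturating_to_int16(reading))  # 32767: bounded, and visibly at the limit
    try:
        checked_to_int16(reading)
    except OverflowError as exc:
        print("rejected:", exc)
```

Which of the two safer variants is appropriate depends on the system: raising an error is only useful if something is prepared to handle it, while saturation keeps running with a value that is wrong but bounded.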

But those kinds of failures make news precisely because they're rare. Millions of vehicles and other devices work flawlessly, day after day. What makes embedded systems so reliable?

Experts cite three factors: Reliability is a high priority; developers try to keep embedded systems as simple as possible; and developers and customers alike resist making extensive changes to smoothly running systems.

IBM was the prime contractor for many of the software systems on the Space Shuttle. It took eight years to write the first programs, says Dr. Barry Feigenbaum, senior software engineer for IBM network-computing solutions. Neither IBM nor NASA is eager to change the code. "Old vintage code tends to be more reliable than new, fresh code that hasn't aged yet," he points out.

The microkernel in QNX Software Systems' embedded OS has not changed at all since 1991, notes Greg Bergsma, corporate communications manager for QNX. The QNX OS is found in the monitoring equipment at nuclear power plants, medical-imaging devices, chemical-processing systems, the Space Shuttle's "Canadarm," and the Shuttle's new payload bay vision system. Some QNX systems have been running without a reboot for three years.

QNX keeps the microkernel small -- just 10 KB -- and it contains only 14 calls. Just the kernel and the interrupt-service routines run in ring 0 (x86 terminology for a supervisor, kernel, or executive mode). Everything else -- the file system, device managers, network services, the optional GUI, and other pieces of system software -- runs as independent processes in separate partitions. One process is a "software watchdog," dedicated to handling memory violations.
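
The idea of running each service as its own process, with a watchdog that reacts when one faults, can be sketched in a few lines. This is not the QNX API or its watchdog: it is a minimal Python illustration in which the "service" is a hypothetical child script that simulates a fault by raising an uncaught exception, and `supervise` is a made-up name. The point is only that a fault in an isolated process is seen by the supervisor as an exit status, not as a crash of the whole system.

```python
import subprocess
import sys
from typing import Tuple

# Hypothetical service: faults on its first two starts, then runs normally.
SERVICE_CODE = """
import sys
attempt = int(sys.argv[1])
if attempt < 2:
    raise MemoryError("simulated memory violation")
print("service ok on attempt", attempt)
"""


def supervise(max_restarts: int = 5) -> Tuple[int, str]:
    """Run the service in its own process and restart it whenever it faults.

    Returns the number of restarts that were needed and the service's output.
    """
    for attempt in range(max_restarts + 1):
        result = subprocess.run(
            [sys.executable, "-c", SERVICE_CODE, str(attempt)],
            capture_output=True,
            text=True,
        )
        if result.returncode == 0:
            return attempt, result.stdout.strip()
        # The fault stayed inside the child process; the supervisor only
        # observes a nonzero exit status and starts a fresh copy.
    raise RuntimeError("service kept failing; giving up")


if __name__ == "__main__":
    restarts, output = supervise()
    print(f"restarts needed: {restarts}; output: {output}")
```

A real watchdog would also have to decide what state to restore after a restart and when to stop retrying; the fixed retry limit here is a placeholder for that policy.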

To minimize complexity, some embedded systems shun multithreaded code, which is thorny to debug. NASA almost lost control of the Mars Pathfinder last year when a thread-priority conflict, a problem known as priority inversion, caused the lander's computer to repeatedly reboot itself. Engineers at the Jet Propulsion Laboratory traced the problem to a wrongly initialized Boolean parameter in Wind River's VxWorks OS. Luckily, they were able to upload a patch; on-site service wasn't an option.
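
A toy simulation shows how a thread-priority conflict of this kind can starve an urgent task, and how a single Boolean can change the outcome. The sketch assumes the conflict was a priority inversion and that the Boolean enabled priority inheritance on a shared lock; the article states neither in those terms. The three tasks, their workloads, and the one-tick cost of each lock and unlock are invented for illustration, and nothing here is VxWorks or Pathfinder code.

```python
def simulate(inherit: bool) -> int:
    """Return the tick at which the high-priority task finishes.

    A single CPU runs one operation per tick, always picking the runnable
    task with the highest priority. "lock" and "unlock" act on one shared
    mutex and each cost one tick in this toy model.
    """
    tasks = {
        "low":    {"prio": 1, "release": 0, "ops": ["lock", "run", "run", "run", "unlock"]},
        "high":   {"prio": 3, "release": 1, "ops": ["lock", "run", "unlock"]},
        "medium": {"prio": 2, "release": 2, "ops": ["run"] * 10},
    }
    owner = None  # name of the task currently holding the mutex
    tick = 0
    while tasks["high"]["ops"]:
        ready = [n for n, t in tasks.items() if t["release"] <= tick and t["ops"]]
        # A task is blocked if it wants the mutex while another task holds it.
        blocked = [n for n in ready
                   if tasks[n]["ops"][0] == "lock" and owner not in (None, n)]
        runnable = [n for n in ready if n not in blocked]

        priority = {n: tasks[n]["prio"] for n in runnable}
        if inherit and owner in priority and blocked:
            # Priority inheritance: the lock holder temporarily runs at the
            # priority of the most urgent task waiting for the lock.
            priority[owner] = max(priority[owner],
                                  max(tasks[b]["prio"] for b in blocked))

        if runnable:
            current = max(runnable, key=priority.get)
            op = tasks[current]["ops"].pop(0)
            if op == "lock":
                owner = current
            elif op == "unlock":
                owner = None
        tick += 1
    return tick


if __name__ == "__main__":
    print(simulate(inherit=False))  # 18: "high" waits out all of "medium"'s work
    print(simulate(inherit=True))   # 8: "low" is boosted and releases the lock quickly
```

Without inheritance, the medium-priority task preempts the low-priority lock holder, so the high-priority task is delayed by work that has nothing to do with the lock. In a system with a deadline monitor, a delay like that can look like a hang and trigger a reset.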

That tale and other famous failures should raise a red flag for PC developers, who hurry larger programs to market with less testing. Unfortunately, the cold, hard realities of the marketplace make it almost impossible for PC developers to borrow much wisdom from their embedded-systems brethren.

