As software complexity grows, new testing tools help developers control bugs and costs alike.
Derek Jones
Software testing may be one of the least attractive of all software-development activities. Yet, simply put, it's one of the most important parts of the development process.
Typically, it costs 10 times more to fix a bug at any stage of the development cycle than it does to fix it during the previous stage. But the cost of formal, comprehensive testing procedures can be prohibitive and often delays the time to market unacceptably. Automated software-testing tools can be a cost-efficient solution for developers looking to enhance the quality of their products. In addition, as application development becomes more intricate -- especially with the growing complexity created by the move to applets and components -- tools that systematically check source code or detect unexpected behavior at run time are becoming increasingly important for developers.
Software testing involves more than just debugging code. It encompasses the validation and verification of software throughout the whole development process.
Static vs. Dynamic Testing
So-called static analysis tools analyze the source code of an application without running it. Static checks deal with such details as unreachable code, the misuse of pointers, undeclared variables, and variables used before initialization. The information gathered by these techniques serves two main purposes: to help developers understand the source code (e.g., by displaying a graphical representation of the control flow) and to act as an automated proofreader, looking for constructs that are likely to cause problems.
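As a rough illustration of the "automated proofreader" idea, the sketch below flags one of the constructs mentioned above, unreachable code. It is written in Python and uses the standard `ast` module to find statements that follow a `return`, `raise`, `break`, or `continue` in the same block. The function name `find_unreachable` and the `SAMPLE` source are invented for this example; commercial checkers perform far deeper analysis than this.

```python
import ast

def find_unreachable(source):
    """Return sorted line numbers of statements that can never execute
    because an earlier statement in the same block always leaves it."""
    unreachable = []
    for node in ast.walk(ast.parse(source)):
        # Statement blocks live in these fields of compound statements.
        for field in ("body", "orelse", "finalbody"):
            block = getattr(node, field, None)
            if not isinstance(block, list):
                continue  # e.g. a lambda body is an expression, not a block
            terminated = False
            for stmt in block:
                if terminated:
                    unreachable.append(stmt.lineno)
                elif isinstance(stmt, (ast.Return, ast.Raise,
                                       ast.Break, ast.Continue)):
                    terminated = True
    return sorted(unreachable)

# Hypothetical code under inspection: line 3 can never run.
SAMPLE = """\
def scale(x):
    return x * 2
    print("never runs")

def sign(x):
    if x < 0:
        return -1
    return 1
"""

print(find_unreachable(SAMPLE))  # [3]
```

Note that the checker never runs `SAMPLE`; it only inspects the program's structure, which is what distinguishes static from dynamic testing.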
Commercially available tools either support a variety of programming languages or are specifically designed for a certain language. Generally, tools that focus on a specific language can better handle that language's special cases. Generic tools that support multiple languages, on the other hand, are more appropriate for projects that mix several programming languages. Generic tools include IPL's (Bath, U.K.) Cantata; Verilog's (Toulouse, France) Logiscope; and Battlemap, from McCabe and Associates (High Wycombe, U.K.). Tools for specific languages include PC-Lint, from Gimpel Software (Collegeville, PA), and QA-C and QA-C++, both from Programming Research (Hersham, U.K.).
PC-Lint offers an entry-level method of quality assurance on PCs. It detects a large number of common programming errors in C and C++ and lets developers customize its checks. Programming Research's QA series of tools is geared to large corporate development groups. These tools offer comprehensive customization features and can also generate metric information as well as graphical representations of code. In addition, both PC-Lint and the QA series of tools support cross-module checks.
Find Bad Pointers
Dynamic testing tools help you test a program's individual units and modules and, gradually, as modules come together, complete applications. Their big strengths include checking for bad pointers and memory leakage. In addition, they allow you to test the concurrent operation and integration of modules, check operation after a bug has been fixed, and perform stress tests.
There are two approaches to dynamic testing. One class of tools includes a compiler and uses source-code information to flag problems as it compiles. Another type of dynamic-testing tool checks the run-time behavior of executables.
Of course, a tool that checks source code has access to more information and can therefore do checks that are more thorough.
A good example of such a tool is Bounds-Checking GCC, which was developed at Imperial College (London); see http://www-dse.doc.ic.ac.uk/~rj3/bounds-checking.html. It's an extension to the GNU C compiler that adds pointer checks. By working at the source-code level, it can detect when, for example, a pointer has "walked" off the end of an object into another one. The disadvantage of this approach is that it requires access to source code, which may not be available if you use third-party libraries.
Tools that work at the executable level don't need information about the source. A case in point for testing executables is Purify, from Pure Atria (Cupertino, CA). Purify doesn't even need to know what language the original application was written in. Although it checks uninitialized variables down to the byte level, its pointer-checking capabilities don't go down to the level that the GNU C bounds-checking tools do. Instead, it treats contiguously allocated storage as a single memory item.
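The difference between the two levels of pointer checking can be pictured with a toy model. The Python sketch below is only an illustration of the distinction described above, not a description of how either tool is implemented: the address space, the two objects `a` and `b`, and both function names are assumptions made for the example.

```python
# Two objects laid out back to back in a toy address space:
# name -> (start address, size in bytes).
OBJECTS = {"a": (0, 4), "b": (4, 4)}

def address_level_ok(addr):
    """Executable-level view: is the address inside any allocated storage?
    Adjacent objects look like one contiguous region."""
    return any(start <= addr < start + size
               for start, size in OBJECTS.values())

def object_level_ok(base_name, offset):
    """Source-level view: does the access stay inside the object
    the pointer was originally derived from?"""
    _start, size = OBJECTS[base_name]
    return 0 <= offset < size

# A pointer derived from "a" that has walked one byte into "b" (address 5).
print(address_level_ok(5))       # True: the address is allocated storage
print(object_level_ok("a", 5))   # False: it is no longer inside "a"
```

In this model the address-level check accepts the access because the byte belongs to some allocated object, while the object-level check rejects it because the pointer has left the object it started in.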
Is More Testing Needed?
How do you know how much testing has been done and how much more is still needed? There are a couple of techniques for measuring test coverage.
Statement coverage counts how many of the total statements in the application are executed. Achieving 100 percent statement coverage shows that the tests are exercising every statement in the application, although it doesn't give you any measure of how much of the program structure was tested.
An interesting side effect of 100 percent statement coverage is that it helps programmers find statements that can never be executed. This is because defining test procedures to execute a given statement makes developers look at an application in a new light. It unavoidably leads to the question: Is this statement necessary? Some programmers say they remove almost 30 percent of the statements they find through this procedure.
Path coverage, on the other hand, measures how many of the possible paths through an application are executed. Because the number of paths grows rapidly with every additional decision in the code, achieving 100 percent path coverage for any but the simplest code requires enormous resources. In practice, you can use path-coverage measures only to check the testing level of the critical portions of an application.
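A small example shows why full statement coverage says little about program structure. In the Python sketch below, the function `classify` and the helper names are invented for illustration. A single test executes every statement, yet it exercises only one of the four possible paths.

```python
from itertools import product

def classify(a, b):
    """Two independent decisions give 2 * 2 = 4 possible paths."""
    label = ""
    if a > 0:
        label += "A"
    if b > 0:
        label += "B"
    return label

def path_taken(a, b):
    """Identify a path by the outcome of each decision in classify()."""
    return (a > 0, b > 0)

def path_coverage(tests):
    """Fraction of the four possible paths exercised by the given inputs."""
    all_paths = set(product([False, True], repeat=2))
    covered = {path_taken(a, b) for a, b in tests}
    return len(covered) / len(all_paths)

# The input (1, 1) runs every statement in classify() ...
print(classify(1, 1))            # AB
# ... but covers only one path out of four.
print(path_coverage([(1, 1)]))   # 0.25
```

Each additional independent decision doubles the number of paths, so a routine with 20 such decisions already has more than a million of them.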
To obtain coverage information, some tools insert code that flags when a statement or path has been executed, a process called instrumentation. The big problem with instrumentation is that it involves memory and CPU overhead. Fully instrumented programs typically run up to three times slower than noninstrumented ones. A sensible strategy, therefore, is to fully instrument only the critical portions of a program.
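Instrumentation can be sketched by hand. In the Python example below, the `probe` calls stand in for the code a coverage tool would insert automatically; the names `HITS`, `TOTAL_PROBES`, `probe`, and `coverage` are assumptions made for this illustration and do not come from any particular tool.

```python
HITS = set()        # IDs of the probes reached so far
TOTAL_PROBES = 3    # number of probes inserted into absolute() below

def probe(probe_id):
    """Record that the instrumented point was executed."""
    HITS.add(probe_id)

def absolute(x):
    probe(1)                # inserted: function entered
    if x < 0:
        probe(2)            # inserted: negative branch taken
        x = -x
    probe(3)                # inserted: return reached
    return x

def coverage():
    """Fraction of instrumented points executed at least once."""
    return len(HITS) / TOTAL_PROBES

absolute(5)
print(sorted(HITS))   # [1, 3]  (the negative branch is still untested)
absolute(-5)
print(coverage())     # 1.0
```

Every probe is an extra function call and an extra entry in memory, which is where the run-time and storage overhead of instrumented programs comes from.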
A case in point is ATAC, a publicly available tool written at Bell Communications Research for instrumenting C code. ATAC (http://www.clark.net/pub/dickey/atac/atac_961112.tgz) gives developers comprehensive coverage information by counting various bits of code being executed at run time.
Large sources and complex applications are naturally harder to test than software with simple functionality.
Software metrics are one way of measuring the volume and complexity of an application. They look at the structure of, and the linkage between, the larger software building blocks rather than at the level of statements and expressions.
The two most often used metrics are the Halstead software science counts, which calculate code volume based on abstract operator and operand counts, and McCabe's cyclomatic complexity measure, which is derived from the number of nodes and edges in a program's control-flow graph.
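Both metrics reduce to short formulas. The Python sketch below uses the standard definitions: cyclomatic complexity is E - N + 2P for a control-flow graph with E edges, N nodes, and P connected components, and Halstead volume is the program length (total operators plus total operands) multiplied by the base-2 logarithm of the vocabulary (distinct operators plus distinct operands). The function names and the small worked inputs are chosen for illustration; real tools differ in exactly what they count as an operator or operand.

```python
import math

def cyclomatic_complexity(edges, nodes, components=1):
    """McCabe's measure for a control-flow graph: E - N + 2P."""
    return edges - nodes + 2 * components

def halstead_volume(distinct_operators, distinct_operands,
                    total_operators, total_operands):
    """Halstead volume: length * log2(vocabulary)."""
    vocabulary = distinct_operators + distinct_operands
    length = total_operators + total_operands
    return length * math.log2(vocabulary)

# A single if/else: four nodes (decision, then, else, exit) and four edges.
print(cyclomatic_complexity(edges=4, nodes=4))   # 2

# The statement "x = x + 1": operators {=, +}, operands {x, 1},
# with x occurring twice, so 2 operators and 3 operands in total.
print(halstead_volume(2, 2, 2, 3))               # 10.0
```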
If you're new to testing and have a complicated application to test, then metrics might help you find the most likely trouble spots within your code. The most complex and hardest-to-test modules of a program typically have the highest metric values. Once you locate potential trouble spots, you can break these modules down or rewrite the parts of a program that exceed predefined metric limits.
However, don't overestimate the value of metrics to software testing. Interestingly, many metrics correlate very highly to the number of noncommented lines of code. When evaluating a new software metric, always ask how accurate its predictions have been on historical data and to what extent it correlates with simple lines of code measurements. (Why use a complicated formula that doesn't give significantly better results than you can get from simply counting lines of code?)
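One way to run the second check is to compute the correlation between a metric's values and plain line counts across a set of modules. The Python sketch below implements the Pearson correlation coefficient; the per-module numbers are made up purely for illustration and do not come from any real project.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical figures for five modules (illustration only).
LINES_OF_CODE = [120, 340, 80, 510, 260]   # noncommented lines per module
METRIC_VALUE = [14, 37, 9, 58, 30]         # some complexity metric per module

r = pearson(LINES_OF_CODE, METRIC_VALUE)
print(round(r, 3))
```

A coefficient close to 1 would suggest that the metric is telling you little that a simple line count does not already reveal.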
You can also use metrics to estimate such things as the number of bugs likely remaining in code, the ease of software portability, or future maintenance costs.
However, all major testing tools can generate metric information, so it doesn't make sense to spend a lot of money on a tool that calculates only metrics.
GUI Testing
A critical component of all software testing is the GUI. In GUI-based applications, users have control over several concurrent input devices, and applications can output results to multiple windows.
GUI testing is usually a three-stage process. First, the tool records all keystrokes, mouse movements, and button clicks into a script file. This recorded script can be played back to drive the application in test mode. This works well when an application behaves as expected.
The second stage involves adding checks into the recorded script for handling extraordinary conditions. These checks can include waiting for specific events to appear. It's also possible to add synchronizing points to handle cases where an application runs at different speeds.
In the third stage, the modified script runs against the application. The extent to which the script can reliably handle unexpected behavior depends on the effort you put into manually programming the script. The more sophisticated programs can graphically display differences between what the script expects and what the application sends to the screen. It's also possible to single-step through a script and display the values of variables, as simple debuggers do.
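The three stages can be sketched in a few lines of Python. Everything here is hypothetical: `ToyApp` stands in for the application under test, the script format is invented, and `replay` is a bare-bones playback loop. Commercial tools capture real input events and compare real screen output, and they also support synchronization points, which this sketch leaves out.

```python
class ToyApp:
    """Hypothetical stand-in for an application under test."""
    def __init__(self):
        self.text = ""
        self.status = "idle"

    def send(self, event, value):
        if event == "type":
            self.text += value
        elif event == "click" and value == "save":
            self.status = "saved" if self.text else "error: empty"

def replay(script, app):
    """Run the script against the app and return a list of
    (step index, expected, actual) tuples for every failed check."""
    mismatches = []
    for index, (action, arg) in enumerate(script):
        if action == "check":
            field, expected = arg
            actual = getattr(app, field)
            if actual != expected:
                mismatches.append((index, expected, actual))
        else:
            app.send(action, arg)
    return mismatches

# Stage 1: the recorded user input.
RECORDED = [("type", "hello"), ("click", "save")]
# Stage 2: a check added by hand to the recorded script.
WITH_CHECKS = RECORDED + [("check", ("status", "saved"))]
# Stage 3: play the modified script back against the application.
print(replay(WITH_CHECKS, ToyApp()))   # []
```

An empty list means the application behaved as the script expected; any entry in the list is a difference between expected and actual behavior that a tool could then display to the tester.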
GUI testing tools, such as Mercury Interactive's (Or Yehuda, Israel) WinRunner for Windows and XRunner for Unix, enable developers to visually create test scenarios and verifications using a point-and-click method of selecting objects on-screen. They handle application changes automatically and maintain object-specific data independently of individual scripts to ensure that the same scripts can be reused even when an application changes during development.
Internet Adds Complexity
Current trends in software development challenge programmers as well as designers of testing tools. The Internet is dramatically increasing the pace of the software industry, and many developers believe that there is now even less time available for testing. Object-oriented design techniques, such as polymorphism, encapsulation, and inheritance, increase the complexity of testing through the extensive use of information hiding.
In addition, Java's "build once, run anywhere" paradigm complicates matters because it requires developers to test applications not only on all platforms but also on each virtual machine (VM) they might run on, including the VMs used in Web browsers. As Java testers say, "build once, test everywhere."
Where to Find
Gimpel Software
Collegeville, PA
Phone: +1 610 584 4261
Fax: +1 610 584 4266
Internet:
http://www.gimpel.com
IPL
Bath, U.K.
Phone: +44 1225 475000
Fax: +44 1225 444400
Internet:
http://www.iplbath.com
McCabe and Associates
High Wycombe, U.K.
Phone: +44 1494 463233
Fax: +44 1494 463288
Internet:
http://www.mccabe.com
Mercury Interactive
Or Yehuda, Israel
Phone: +972 3 538 8888
Fax: +972 3 533 1617
Internet:
http://www.merc-int.com
Programming Research
Hersham, U.K.
Phone: +44 1932 888080
Fax: +44 1932 888081
Internet:
http://www.prqa.co.uk
Pure Atria Corp.
Cupertino, CA
Phone: 408-863-9900
Internet:
http://www.pureatria.com
Reliable Software Technologies
Sterling, VA
Phone: +1 703 404 9293
Fax: +1 703 404 9295
Internet:
http://www.rstcorp.com
SunTest
Mountain View, CA
Phone: +1 650 336 2005
Internet:
http://www.suntest.com
Verilog
Toulouse, France
Phone: +33 5 61 19 29 39
Fax: +33 5 61 40 84 52
Internet:
http://www.verilogusa.com

GUI testing tools compare expected and real application behavior.
Derek Jones, a former compiler writer, is a testing expert with Knowledge Software, Ltd. (Farnborough, U.K.). You can reach him by sending e-mail to derek@knosof.co.uk.