ine attempts to dereference a pointer that
is pointing to a deleted object and the whole program crashes spectacularly.
It's difficult to keep track of all the objects you have to delete. Consequently, during design time, you're forced to repeatedly ask: "When is it safe to delete this object?" Because this is often a tough question to answer, this ongoing wrestling match with memory almost always complicates your design. And even the most meticulous programmer makes an occasional error, leading either to a global protection fault (or its equivalent) or memory that's never freed.
Pandora's Box Closed
The solution to all this is a garbage collector. With a garbage collector, you never have to delete anything. The collector automatically deletes an object when your program no longer references that object. If you've seen garbage collectors, you're probably concerned about their performance. Some work by "waking up" intermittently and searching through memory for objects that can be deleted. This can bog
down the application.
The speed of a garbage collector depends on its type. The most general types of garbage collectors -- which do, in fact, search through memory -- are slow if you compare them to "manual" memory management (i.e., explicitly using delete). However, most of the time your program won't need such a complicated collector. A reference-counting garbage collector is sufficient.
A reference-counting garbage collector is fast because it doesn't have to search through memory. Instead, for every object, it keeps a running count of the number of pointers that point to that object (this running count is the object's "reference count"). Every time a pointer is assigned to point to an object, that object's count is incremented. Every time a pointer is reassigned to point to a different object, the original object's count is decremented, and the count on the object to which the pointer is newly assigned is incremented. Also, every time a pointer goes out of scope, the count for the object to
which it pointed is decremented.
Because the garbage collector keeps track of the number of pointers referencing the object, it knows that when the reference count for an object reaches zero, the object can be safely deleted (if there are no pointers to the object, your program can't access the object).
Limitation
The only drawback to such a garbage collector is that it can't "collect" cyclical structures. For example, suppose you have three objects, A, B, and C. They are tied to one another in the form of a circular list: A contains a pointer to B, B contains a pointer to C, and C contains a pointer to A. Initially, some other pointer may have pointed to A, but that pointer has gone out of scope and, consequently, this circular structure is no longer accessible from your program. It should be put in the garbage.
However, since each object points to the next member of the loop, each of these objects still has one outstanding reference. As a result, a reference-counting
garbage collector won't collect the members of this structure; they will be "leaked."
If your program does include a circularly linked list, that doesn't mean you cannot use a reference-counting garbage collector. You can arrange to have the garbage collector properly delete the structure, using what I call the "linchpin" method. With this technique, you pick one object in the cycle to be the linchpin object. When your program no longer needs to use a circular structure, it pulls the linchpin; that is, your program sets all pointers inside the linchpin object to null, thereby breaking the loop. Since the loop has now become a linear list, the garbage collector can delete the remaining objects appropriately.
C++ to the Rescue
At this point, though you may be inclined to agree that a reference-counting garbage collector would be a fine programming tool, you're probably concerned about the time you'd have to spend to build one. And if you do decide to build one, you must be exc
eedingly careful in its construction: Any errors in the reference-counting code of the collector would produce an unacceptably leaky garbage collector.
Happily, C++ operator overloading permits you to implement a completely transparent and foolproof means of reference counting. I've used this concept to create the reference-counting garbage collection system accompanying this article.
RGC, my reference-counting garbage collector, defines two template classes. These template classes emulate pointers by overloading the =, ==, *, and -> operators. Behind the scene these operators do the reference counting for you. Whenever the reference count of a particular object falls to zero, these operators delete the object.
You needn't be aware of the reference counting at all -- it's entirely transparent. And because you're unaware of the ongoing reference counting, you can't make any mistakes regarding it. This ensures the accuracy of the reference counts and hence the reliability of the garbage col
lector.
Using RGC
Incorporating RGC in your next C++ development project is simple. First, you'll need to include RGC.h in your source files. This file defines three classes: TRGCObj, TRGCRef, and TRGCRefObj. The first, TRGCObj, should be declared as the virtual public superclass of any object to be managed by the garbage collector. TRGCRef is a template class. Its parameter is the type for which it will act as a pointer. This type must inherit from TRGCObj. The third class -- TRGCRefObj -- is also a template class. TRGCRefObj is just like TRGCRef, but it inherits from TRGCObj so that it can be tracked and deleted by the collector. TRGCRefObj exists so that the system can support the equivalent of pointers to pointers.
Inside RGC
TRGCObj includes two member functions, RegisterReference and DeregisterReference. These functions respectively increment and decrement an object's reference count. Additionally, they return the updated number of outstanding refe
rences.
Garbage collection takes place inside the TRGCRef template class. At any given time, one of its instances (which I'll call a reference from now on) either points to a valid object in memory or is null. TGRCRef overloads the assignment operator so that every time a reference is assigned a value, two things happen. If the reference had previously been non-null, DeregisterReference is called on the object to which it had been pointing. Then, if the reference is being assigned to a non-null value, RegisterReference is called on the object to which it is to point. This keeps an object's reference count accurate as assignment operations occur.
Additionally, the TRGCRef destructor handles the situation of a reference's pointing to an object when that reference is destroyed. The destructor calls DeregisterReference on the object to which the reference points. This keeps an object's reference count accurate when references go out of scope or are deleted.
Finally, if DeregisterReference ever r
eturns a 0, indicating that there are no outstanding references to a particular object, the reference will delete the object.
An Example
Examine the following listing fragment:
#include "rgc.h"
class myclass : public virtual TRGCobj
{ };
typedef TRGCRef
Pmyclass;
void main()
{
Pmyclass A = new myclass;
Pmyclass B = new myclass;
A=B;
}
When A is assigned to the pointer returned by new, the TRGCRef::setpointer function is called. This stores the pointer to which A is being assigned and calls RegisterReference on the object to which it points. The second assignment (of B) does the same thing.
But when A is assigned to B (the third assignment), a series of calls within the overloaded assignment operator cause DeregisterReference to be called. DeregisterReference will return a 0, since A had been the last reference to the object it had been pointing to. The garbage collector will delete this object (the result of the first new).
This as
signment also causes RegisterReference to be called on the object to which A is being assigned. This increments the reference count in the object.
When the main function concludes, the destructors for the local variables (A and B) are called. The destructor for B is called first, which causes DeregisterReference to be invoked on the object to which B refers. DeregisterReference returns a 1, since A still points to the object. But when the destructor for A is called, DeregisterReference will be invoked on that same object and this time will return a 0, causing the object to be deleted.
Note that it is important not to use "regular" pointers to an object that is to be garbage-collected. A regular pointer will not be counted in the reference count, so it's possible that the object will be deleted while your program still holds an outstanding pointer to it.
Keep It Clean
RGC also handles pointers to pointers. For a reference that is a "second level" pointer -- i.e., a refer
ence that points to another object and that is itself pointed to -- you must use TRGCRefObj. This allows the garbage collector to work on references just as it works on objects; when there are no references to the reference, the reference will be garbage-collected.
The above applies to arbitrary depths of pointers. In general, the outermost layers of pointers may be implemented as either TRGCRef or TRGCRefObj types; inner layers must be implemented as TRGCRefObj types.
RGC is a simple, efficient, and robust reference-counting garbage collector that releases you from having to keep track of when to delete an object. It thereby ends memory leaks, ensures that you never free an object too soon, and simplifies your code. In short, it will make you more productive and make your C++ programs more reliable.
Circular References Remain
illustration_link (3 Kbytes)

A reference-counting garbage collector's weakness is a circular structure. Even though all outside references to the structure are gone, each member still has an outstanding reference pointing to it.
Take RCG for a Test Drive
screen_link (26 Kbytes)

RGC is available from the BYTE Web Site at
http://www.byte.com
. RGC includes RGC.h, a test-driver program, and a self-tracing program that prints messages wheneve
r an object is created or destroyed or its reference count changes.
Justin Miller is studying electrical engineering and computer science at the Massachusetts Institute of Technology. You can reach him on the Internet at
jcmiller@mit.edu
.