AI is becoming a reality as pattern-recognition programs can now prove
Peter Wayner
You know the three great lies, right? The check's in the mail, I'll call you right back, and artificial intelligence is right around the corner. It's easy to slam all forms of AI. In the world of computer science, AI research is often the most far-flung, blue-sky, and dreamy combination of mathematics and philosophy you can imagine.
But not all AI research is reaching for impossible goals. Scientists throughout the field have created many algorithms that successfully learn straightforward abilities. If the context is well-defined and the bounds of the problem can be correctly encoded for the computer, then these algorithms can often
pick up a pattern
and learn to predict it successfully. These
techniques can be used in applications that include agriculture, medicine, economics, and engineering.
Finding a Pattern
Databases rich with patterns are becoming common in all industries. Engineering firms now have good data on all parts of the engineering process and can use that data to produce better designs. Manufacturing companies need to be aware of sharp changes that might occur in any product (e.g., when another company introduces a significant competing product). Information suppliers can track the use of Web browsers to determine how people are reacting and can make choices about which information to choose next.
The most important stumbling block to using any of these algorithms is defining the logical structure of the problem so that the problem can, in essence, be explained to the computer. This means that large, vaguely defined problems, such as achieving peace in the Middle East, cannot be solved with pattern recognition. But more reasonable questions ca
n be answered. For instance, is there a relationship between the disease that a set of patients has and their symptoms? If all the details are recorded, many machine-learning algorithms can identify the connections.
One of the groups successfully bridging the gap between abstract theory and applications is MLI (Machine Learning and Inference Center) at George Mason University in Fairfax, Virginia. The center, headed by Ryszard Michalski, produced a number of successful applications by working with well-defined problems, categorizing diseases, identifying objects in images, and combing databases for information.
Much of the work emerging from the MLI concentrates on finding the best logical rules for the data. Such learning programs strive to generate knowledge such as "Cars have four wheels, and bicycles have two." More statistically based learning algorithms might examine a set of bikes and cars and determine that the threshold that separates the two is three wheels. The difference is subtle. L
earning algorithms aimed at finding rules shine on problems that have a fixed set of solutions. Statistical approaches do better where the dividing line is less obvious (e.g., the difference between tall people and short people).
Learning Systems
One of the better examples of the MLI approach to machine learning is the Inlen system. Inlen searches through large databases to find significant patterns. Supermarkets, for instance, encourage shoppers to join discount clubs so that they can generate profiles of their consumption. The IRS is widely believed to use pattern-recognition algorithms to identify suspicious tax returns. And federal law enforcement agencies are also believed to scan financial transactions to identify money laundering.
The Inlen system works with a relational database. Patterns are extracted from the database using a variety of different learning algorithms. The Inlen system is, in a sense, merely a CUI (common user interface) to some of these algorith
ms.
One of the algorithms used in the system is called Sparc (no relation to the chip). Sparc finds patterns in sequences of data and attempts to predict the next element. For instance, if the sequence were "circle, triangle, square, circle, triangle," then Sparc would predict that the next element was square.
The Sparc algorithm works by analyzing the sequence with three different submodules. One submodule looks for periodic patterns like the one in our "circle, triangle, square" pattern by trying all potential period lengths and looking for alignment.
Another submodule searches for dependencies that may not be periodic. For instance, only squares come after triangles in the following sequence, "circle, circle, triangle, square, circle, square, circle, triangle, square, circle." The algorithm starts by generating hypothetical rules from single items in the series and then tries to see which rules hold generally. Normally, the algorithm will pick the shortest explanation possible.
Sparc's third submodule tries to combine several simple rules in what is called disjunctive normal form--that is, the submodule will look at a more complicated pattern and realize that one rule holds in some parts of the series and another rule holds in others. The submodule would combine the two rules with an OR and produce a rule that is general to the series.
The Inlen system also includes many other modules. One called Cluster identifies groups of similar data. Another called Eventree generates decision tree rules built up of IFTHEN clauses.
Joining Results
The
process of combining the results
from Inlen's modules can be complicated. Michalski's center is exploring several different methods for combining these rules. The classic method, deductive logic, forms new results by recognizing that if A implies B and B implies C, then A implies C. It's straightforward and easy to implement. But the Inlen system also includes processes for
generalizing
, specializing, abstracting,
and
concretizing
, the knowledge built up by the modules. These are major components of Michalski's Inductive Theory of Learning.
Generalization, for instance, refers to an inference (e.g., if three students in the same class were assigned homework, then all students of this class were probably assigned homework). The system applies specific knowledge to a more general set. Complementary to generalization is specialization (e.g., if all students were assigned homework, then a specific student probably was).
Getting Answers
One of the big hurdles for all machine-learning algorithms is solving problems in an acceptable amount of time. Many of the problems fall into a class known as NP-complete, which means that there is no known efficient algorithm for finding the correct answer. This is because the number of potential answers can grow exponentially with the size and complexity of the answer.
The common solution in m
achine learning is to limit the size of potential answers, usually by creating hypothetical rules and ranking them. At each step in the recognition process, the number of potential new rules may grow significantly, so the algorithms rank new ones and eliminate the worst. Ultimately, this process may exclude the best final answer because preliminary versions of it don't make the cut along the way.
Evaluating the quality of machine-learning algorithms can be complicated by the fact that many algorithms are designed to perform well on particular types of problems. The rule-based systems emerging from the MLI are adept at picking up logically structured patterns (e.g., "a baseball player is a Yankee if the uniform has pinstripes"). More ambiguous relationships like the distinction between a tall and a short person are often hard to characterize with this more logical approach. In such cases, algorithms involving neural networks shine.
The machine-learning community recently held a competition to tes
t learning algorithms. There were three different problems presented to the algorithms. The first was intended to be easy for symbolic systems--the solutions were generated from simple symbolic rules. The second test was tuned for neural networks; it required the algorithms to identify sets that might be described by a pattern. Here's an example: If a man satisfies two of these three attributes, then he's an acceptable date: tall, dark, and handsome. These problems can be difficult for symbolic learning algorithms to grasp because their description can grow exponentially complex. The third test involved a more symbolic approach with added noise.
The AQ17 program from the MLI was the only one to score 100 percent on all the tests. It succeeded on the more difficult second test because it is not a typical rule-generating algorithm. The winning AQ-17 algorithm used a feedback mechanism to generate synthetic attrib-utes that could reduce the complexity (e.g., tall and dark). This synthetic approach really
shines in cases where the number of different attributes in the set grows.
On to Applications
The machine-learning algorithms are really just abstractions and are not tuned to any particular problem. The trouble is finding a domain that is structured enough to allow computer representation. The scientists at the MLI are experimenting with engineering problems--a domain rich in mathematical structure.
The MLI is investigating automotive design with Chrysler's Technology Center (Auburn Hills, MI). The team analyzed an automobile suspension system and tried to determine what features affected its manufacturability. The MLI reports that the initial results are promising.
Bridge design also presents another complicated, but well-structured, set of problems. Unfortunately, there are a limited number of training examples available to tune the system.
These applications for machine learning might not have the flash of the robots created by Hollywood scriptwriters,
but they may grow to fill valuable niches. At this stage, they require a well-defined problem to operate, but this isn't a horrible limitation. The world is exploding with data, and even the simplest patterns could be quite valuable. Machine-learning algorithms can help find these patterns.
illustration_link (23 Kbytes)

Finding a method in the madness takes a bit of persistence.
Here we have a group of talking heads. Some of them are in a club,
some don't get to be in the club. It's up to a machine-learning
algorithm to figure out why some of the heads get in and why some
don't. The algorithm takes a set of photos and figures out
a rule it
can use in the future. It uses concrete features, such as the shape
of the eyes, hat, and mouth, and generates some rules.
Has a head First, our algorithm creates some hypothetical
Has a hat rules from the first example it comes across.
Has eyes It doesn't know yet whether one of these rules
Has a smile is right or whether it's some combination--it
Has a round head has only one example to go on.
Has a triangular hat
Has round eyes
Has a head Has a round head
Has a hat Has a triangular hat
Has eyes Has round eyes
Has a smile
The algorithm runs its examples by the rest of the "in" crowd, ruling
some out in the process. (In the example, it rules out only one, but
in a real example, there would probably be more potential rules.)
Has a triangular hat Has a head and
Has a head and Has a hat Has a round head
Has a hea
d and Has eyes .
Has a head and Has a smile .
.
Now it knows what it takes to get into the club, but it's a long
list. So the algorithm looks at the heads that were excluded to see
what they don't have in common with the members of the club. At this
point, it starts to put the rules together with "and"s and "or"s, and
the list gets long. We aren't showing all of it.
Has round eyes and
Has a smile
Finally, after analyzing the entire set of examples, the algorithm
deduces the simplest rule. It might not be the only rule or even the
correct rule for all cases, but it describes the set of data that it
had to work with.
Challenge INDUCE to Deduce
screen_link (30 Kbytes)

Bored? Challenge an AI to figure out your pattern. Here, a program called INDUCE takes some basic train types that you put together and figures out what makes a safe train "safe" by comparing it to a series of "unsafe" trains. Just click on the button, and it will figure it out.
Peter Wayner is a BYTE consulting editor and the author of Agents Unleashed (AP Professional, 1995). He can be reached at
pcw@access.digex.net
.