Archives
 
 
 
  Special
 
 
 
  About Us
 
 
 

Newsletter
Free E-mail Newsletter from BYTE.com

 
    
           
Visit the home page Browse the four-year online archive Download platform-neutral CPU/FPU benchmarks Find information for advertisers, authors, vendors, subscribers Request free information on products written about or advertised in BYTE Submit a press release, or scan recent announcements Talk with BYTE's staff and readers about products and technologies

ArticlesMachine Learning Grows Up


August 1995 / Features / Machine Learning Grows Up

AI is becoming a reality as pattern-recognition programs can now prove

Peter Wayner

You know the three great lies, right? The check's in the mail, I'll call you right back, and artificial intelligence is right around the corner. It's easy to slam all forms of AI. In the world of computer science, AI research is often the most far-flung, blue-sky, and dreamy combination of mathematics and philosophy you can imagine.

But not all AI research is reaching for impossible goals. Scientists throughout the field have created many algorithms that successfully learn straightforward abilities. If the context is well-defined and the bounds of the problem can be correctly encoded for the computer, then these algorithms can often pick up a pattern and learn to predict it successfully. These techniques can be used in applications that include agriculture, medicine, economics, and engineering.

Finding a Pattern

Databases rich with patterns are becoming common in all industries. Engineering firms now have good data on all parts of the engineering process and can use that data to produce better designs. Manufacturing companies need to be aware of sharp changes that might occur in any product (e.g., when another company introduces a significant competing product). Information suppliers can track the use of Web browsers to determine how people are reacting and can make choices about which information to choose next.

The most important stumbling block to using any of these algorithms is defining the logical structure of the problem so that the problem can, in essence, be explained to the computer. This means that large, vaguely defined problems, such as achieving peace in the Middle East, cannot be solved with pattern recognition. But more reasonable questions ca n be answered. For instance, is there a relationship between the disease that a set of patients has and their symptoms? If all the details are recorded, many machine-learning algorithms can identify the connections.

One of the groups successfully bridging the gap between abstract theory and applications is MLI (Machine Learning and Inference Center) at George Mason University in Fairfax, Virginia. The center, headed by Ryszard Michalski, produced a number of successful applications by working with well-defined problems, categorizing diseases, identifying objects in images, and combing databases for information.

Much of the work emerging from the MLI concentrates on finding the best logical rules for the data. Such learning programs strive to generate knowledge such as "Cars have four wheels, and bicycles have two." More statistically based learning algorithms might examine a set of bikes and cars and determine that the threshold that separates the two is three wheels. The difference is subtle. L earning algorithms aimed at finding rules shine on problems that have a fixed set of solutions. Statistical approaches do better where the dividing line is less obvious (e.g., the difference between tall people and short people).

Learning Systems

One of the better examples of the MLI approach to machine learning is the Inlen system. Inlen searches through large databases to find significant patterns. Supermarkets, for instance, encourage shoppers to join discount clubs so that they can generate profiles of their consumption. The IRS is widely believed to use pattern-recognition algorithms to identify suspicious tax returns. And federal law enforcement agencies are also believed to scan financial transactions to identify money laundering.

The Inlen system works with a relational database. Patterns are extracted from the database using a variety of different learning algorithms. The Inlen system is, in a sense, merely a CUI (common user interface) to some of these algorith ms.

One of the algorithms used in the system is called Sparc (no relation to the chip). Sparc finds patterns in sequences of data and attempts to predict the next element. For instance, if the sequence were "circle, triangle, square, circle, triangle," then Sparc would predict that the next element was square.

The Sparc algorithm works by analyzing the sequence with three different submodules. One submodule looks for periodic patterns like the one in our "circle, triangle, square" pattern by trying all potential period lengths and looking for alignment.

Another submodule searches for dependencies that may not be periodic. For instance, only squares come after triangles in the following sequence, "circle, circle, triangle, square, circle, square, circle, triangle, square, circle." The algorithm starts by generating hypothetical rules from single items in the series and then tries to see which rules hold generally. Normally, the algorithm will pick the shortest explanation possible.

Sparc's third submodule tries to combine several simple rules in what is called disjunctive normal form--that is, the submodule will look at a more complicated pattern and realize that one rule holds in some parts of the series and another rule holds in others. The submodule would combine the two rules with an OR and produce a rule that is general to the series.

The Inlen system also includes many other modules. One called Cluster identifies groups of similar data. Another called Eventree generates decision tree rules built up of IFTHEN clauses.

Joining Results

The process of combining the results from Inlen's modules can be complicated. Michalski's center is exploring several different methods for combining these rules. The classic method, deductive logic, forms new results by recognizing that if A implies B and B implies C, then A implies C. It's straightforward and easy to implement. But the Inlen system also includes processes for generalizing , specializing, abstracting, and concretizing , the knowledge built up by the modules. These are major components of Michalski's Inductive Theory of Learning.

Generalization, for instance, refers to an inference (e.g., if three students in the same class were assigned homework, then all students of this class were probably assigned homework). The system applies specific knowledge to a more general set. Complementary to generalization is specialization (e.g., if all students were assigned homework, then a specific student probably was).

Getting Answers

One of the big hurdles for all machine-learning algorithms is solving problems in an acceptable amount of time. Many of the problems fall into a class known as NP-complete, which means that there is no known efficient algorithm for finding the correct answer. This is because the number of potential answers can grow exponentially with the size and complexity of the answer.

The common solution in m achine learning is to limit the size of potential answers, usually by creating hypothetical rules and ranking them. At each step in the recognition process, the number of potential new rules may grow significantly, so the algorithms rank new ones and eliminate the worst. Ultimately, this process may exclude the best final answer because preliminary versions of it don't make the cut along the way.

Evaluating the quality of machine-learning algorithms can be complicated by the fact that many algorithms are designed to perform well on particular types of problems. The rule-based systems emerging from the MLI are adept at picking up logically structured patterns (e.g., "a baseball player is a Yankee if the uniform has pinstripes"). More ambiguous relationships like the distinction between a tall and a short person are often hard to characterize with this more logical approach. In such cases, algorithms involving neural networks shine.

The machine-learning community recently held a competition to tes t learning algorithms. There were three different problems presented to the algorithms. The first was intended to be easy for symbolic systems--the solutions were generated from simple symbolic rules. The second test was tuned for neural networks; it required the algorithms to identify sets that might be described by a pattern. Here's an example: If a man satisfies two of these three attributes, then he's an acceptable date: tall, dark, and handsome. These problems can be difficult for symbolic learning algorithms to grasp because their description can grow exponentially complex. The third test involved a more symbolic approach with added noise.

The AQ17 program from the MLI was the only one to score 100 percent on all the tests. It succeeded on the more difficult second test because it is not a typical rule-generating algorithm. The winning AQ-17 algorithm used a feedback mechanism to generate synthetic attrib-utes that could reduce the complexity (e.g., tall and dark). This synthetic approach really shines in cases where the number of different attributes in the set grows.

On to Applications

The machine-learning algorithms are really just abstractions and are not tuned to any particular problem. The trouble is finding a domain that is structured enough to allow computer representation. The scientists at the MLI are experimenting with engineering problems--a domain rich in mathematical structure.

The MLI is investigating automotive design with Chrysler's Technology Center (Auburn Hills, MI). The team analyzed an automobile suspension system and tried to determine what features affected its manufacturability. The MLI reports that the initial results are promising.

Bridge design also presents another complicated, but well-structured, set of problems. Unfortunately, there are a limited number of training examples available to tune the system.

These applications for machine learning might not have the flash of the robots created by Hollywood scriptwriters, but they may grow to fill valuable niches. At this stage, they require a well-defined problem to operate, but this isn't a horrible limitation. The world is exploding with data, and even the simplest patterns could be quite valuable. Machine-learning algorithms can help find these patterns.


MIX `N' MATCH

illustration_link (23 Kbytes)

Finding a method in the madness takes a bit of persistence. Here we have a group of talking heads. Some of them are in a club, some don't get to be in the club. It's up to a machine-learning algorithm to figure out why some of the heads get in and why some don't. The algorithm takes a set of photos and figures out a rule it can use in the future. It uses concrete features, such as the shape of the eyes, hat, and mouth, and generates some rules. Has a head First, our algorithm creates some hypothetical Has a hat rules from the first example it comes across. Has eyes It doesn't know yet whether one of these rules Has a smile is right or whether it's some combination--it Has a round head has only one example to go on. Has a triangular hat Has round eyes Has a head Has a round head Has a hat Has a triangular hat Has eyes Has round eyes Has a smile The algorithm runs its examples by the rest of the "in" crowd, ruling some out in the process. (In the example, it rules out only one, but in a real example, there would probably be more potential rules.) Has a triangular hat Has a head and Has a head and Has a hat Has a round head Has a hea d and Has eyes . Has a head and Has a smile . . Now it knows what it takes to get into the club, but it's a long list. So the algorithm looks at the heads that were excluded to see what they don't have in common with the members of the club. At this point, it starts to put the rules together with "and"s and "or"s, and the list gets long. We aren't showing all of it. Has round eyes and Has a smile Finally, after analyzing the entire set of examples, the algorithm deduces the simplest rule. It might not be the only rule or even the correct rule for all cases, but it describes the set of data that it had to work with.


Challenge INDUCE to Deduce

screen_link (30 Kbytes)

Bored? Challenge an AI to figure out your pattern. Here, a program called INDUCE takes some basic train types that you put together and figures out what makes a safe train "safe" by comparing it to a series of "unsafe" trains. Just click on the button, and it will figure it out.


Peter Wayner is a BYTE consulting editor and the author of Agents Unleashed (AP Professional, 1995). He can be reached at pcw@access.digex.net .

Up to the Features section contentsGo to next article: Patterns In The CodeSearchSend a comment on this articleSubscribe to BYTE or BYTE on CD-ROM  
Flexible C++
Matthew Wilson
My approach to software engineering is far more pragmatic than it is theoretical--and no language better exemplifies this than C++.

more...

BYTE Digest

BYTE Digest editors every month analyze and evaluate the best articles from Information Week, EE Times, Dr. Dobb's Journal, Network Computing, Sys Admin, and dozens of other CMP publications—bringing you critical news and information about wireless communication, computer security, software development, embedded systems, and more!

Find out more

BYTE.com Store

BYTE CD-ROM
NOW, on one CD-ROM, you can instantly access more than 8 years of BYTE.
 
The Best of BYTE Volume 1: Programming Languages
The Best of BYTE
Volume 1: Programming Languages
In this issue of Best of BYTE, we bring together some of the leading programming language designers and implementors...

Copyright © 2005 CMP Media LLC, Privacy Policy, Your California Privacy rights, Terms of Service
Site comments: webmaster@byte.com
SDMG Web Sites: BYTE.com, C/C++ Users Journal, Dr. Dobb's Journal, MSDN Magazine, New Architect, SD Expo, SD Magazine, Sys Admin, The Perl Journal, UnixReview.com, Windows Developer Network