David Essex
The Doonesbury strip. It haunts every conversation about handwriting recognition--and also serves as a milestone that marks where the technology has been and where it needs to go. When cartoonist Garry Trudeau ridiculed the Apple Newton's comical mangling of simple phrases, he gave voice to the public's unspoken verdict on the fledgling PDA market: Sorry, but your act isn't ready for prime time.
As a result of the Doonesbury debacle, the leading developers of handwriting-recognition software learned two lessons: do a better job of recognizing handwriting and do a better job of managing users' expectations. Developers say that users shouldn't expect a computer to recognize writing that even a person can barely recognize.
Selling into vertical markets helps, as Apple has learned by repositioning the Newton MessagePad. Pen-based computers deployed by a corporate MIS department are more likely to be accompanied by training, as well as the expectation that users must learn to work with the devices. Vertical-market applications are also more likely to involve the frequent use of forms, where constrained text entry greatly simplifies the difficult task of recognition.
PDA developers have also learned to downplay the emphasis on handwriting recognition as a key component of the user interface. The latest designs make use of point-and-tap selections and digital ink (unrecognized drawing and jotting stored as bit maps). But digital ink needs more memory and disk storage--which PDAs tend to have in short supply.
Perennial Challenges
Although the key issues are almost all software-related, some hardware challenges remain. The ice-like surfaces of LCD screens on early pen computers caused people to write less legibly than usual, and the weight and thickness of the stylus did not always mimic those of traditional pens. In response, we're starting to see writing surfaces with more friction from vendors such as CalComp, as well as redesigned styluses. Hardware vendors are also working to minimize parallax (i.e., how a user perceives the separation between the writing point and where the point appears on the digitizer).
Meanwhile, research aimed at improving recognition continues. There are two broad approaches: throwing many specialized algorithms at basic pattern recognition (a trend also seen in AI techniques such as neural networks and expert systems), and applying contextual and grammatical post-processing, which is common in speech recognition.
Most first-generation recognizers compared each newly written character to a set of similar ones. Now, research is focused on using computational and statistical methods to spot deviations from the model characters. An example is the process of analyzing the different ways of writing an uppercase A, as explained by John S. Ostrem, vice president of R&D at Communication Intelligence (Redwood Shores, CA), the maker of PenDOS and Handwriter.
The right downstroke of the A might end at the bottom, or it might barely pass the crossbar. To account for such differences, recognizers use a technique called elastic matching. By measuring perhaps six or eight Fourier coefficients plotted between representative points on both the unknown and reference characters, an elastic matching algorithm calculates whether the coefficients fall within a permissible range. One study found the resulting error rate to be half that of linear (nonelastic) matching.
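To make the difference between elastic and linear matching concrete, here is a minimal Python sketch. It is not the Fourier-coefficient method Ostrem describes; it swaps in dynamic-programming alignment, a common textbook form of elastic matching, because it fits in a few lines. The function names and the sample stroke coordinates are invented for illustration.

```python
import math

def linear_distance(unknown, reference):
    """Nonelastic matching: compare the strokes point by point, in lockstep."""
    return sum(math.dist(p, q) for p, q in zip(unknown, reference))

def elastic_distance(unknown, reference):
    """Elastic matching by dynamic programming (an assumed stand-in technique).

    Each point may pair with one or more points on the other stroke, as long
    as the order along the stroke is preserved. The result is the cheapest
    total distance over all such pairings.
    """
    n, m = len(unknown), len(reference)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(unknown[i - 1], reference[j - 1])
            # Advance along the unknown stroke, the reference stroke, or both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

# Hypothetical samples of a downstroke: the writer lingers at the top and
# then hurries past the middle, so the points are unevenly spaced.
reference = [(0, 4), (0, 3), (0, 2), (0, 1), (0, 0)]
unknown = [(0, 4), (0, 4), (0, 3), (0, 1), (0, 0)]

print(linear_distance(unknown, reference))   # 2.0
print(elastic_distance(unknown, reference))  # 1.0
```

Because the elastic version may stretch or compress one stroke against the other, uneven pen speed costs less than it does in the lockstep comparison. A real recognizer would then compare the resulting distance with a permissible range.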
Grammatical and contextual analysis methods try to guess the likelihood of certain letters or words occurring near each other, based on language rules. If the pattern-recognition algorithms are uncertain about the identity of a Q, for example, a contextual analyzer might check whether the next character is a U and whether the uncertain character falls at the beginning of a word. When the analyzer's confidence exceeds a particular threshold, the recognizer interprets the character in question as a Q.
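The Q-and-U check can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the weights are invented rather than measured from real text, the function and table names are hypothetical, and the word-position test is left out to keep the example short.

```python
# Hypothetical weights for how plausibly one letter is followed by another.
# These numbers are illustrative assumptions, not measured language statistics.
CONTEXT_WEIGHTS = {
    ("Q", "U"): 0.9,
    ("O", "U"): 0.3,
    ("O", "N"): 0.6,
}
DEFAULT_WEIGHT = 0.1  # assumed weight for any pair not listed above

def resolve_with_context(shape_scores, next_char, threshold=0.6):
    """Combine pattern-recognition scores with a next-letter check.

    shape_scores maps each candidate letter to the score the pattern
    recognizer gave it. Returns (letter, confidence), with letter set to
    None when the confidence stays below the threshold.
    """
    weighted = {
        cand: score * CONTEXT_WEIGHTS.get((cand, next_char), DEFAULT_WEIGHT)
        for cand, score in shape_scores.items()
    }
    total = sum(weighted.values())
    best = max(weighted, key=weighted.get)
    confidence = weighted[best] / total
    if confidence >= threshold:
        return best, confidence
    return None, confidence

# The shape alone slightly favors O, but a following U tips the decision to Q.
uncertain = {"Q": 0.45, "O": 0.55}
print(resolve_with_context(uncertain, "U"))
```

With these made-up weights, a following U is enough to resolve the character as a Q, while an uninformative next letter leaves the confidence below the threshold and the character unresolved.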
Lookup dictionaries have also become standard fare. They are frequently augmented by contextual algorithms or are narrowed to domains relevant to the user's special interests. One vendor, Lexicus (Palo Alto, CA), claims the dictionary in its Longhand cursive handwriting recognizer correctly guesses an unidentified word about 80 percent of the time (see the screen). Users choose their intended word from a list of candidates.
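A dictionary lookup that produces a pick list might look something like the following Python sketch. It relies on the string-similarity matcher in Python's standard difflib module, which is only a stand-in: it says nothing about how Longhand ranks its guesses. The tiny word list and the function name are hypothetical.

```python
import difflib

# A tiny stand-in dictionary; a real recognizer would use thousands of words,
# possibly narrowed to the user's domain.
DICTIONARY = [
    "handwriting", "handwritten", "handrail",
    "recognition", "recognize", "stylus",
]

def candidate_words(raw_guess, dictionary=DICTIONARY, limit=3):
    """Return up to `limit` dictionary words that resemble the raw guess,
    best match first, for the user to pick from."""
    return difflib.get_close_matches(raw_guess.lower(), dictionary, n=limit, cutoff=0.6)

# The recognizer misread the second "i" as an "l".
print(candidate_words("handwrlting"))
```

The cutoff keeps wildly different words off the list, so a hopeless scrawl yields no candidates at all rather than a misleading guess.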
Using so many advanced techniques at once requires greater CPU power and memory capacity than is typical of today's hand-held devices. Accordingly, complicated neural-net approaches like those in Longhand work only on larger pen computers. Developers expect low-cost, high-speed RISC chips such as the StrongARM from Digital Equipment and Advanced RISC Machines to bring these advanced techniques to smaller devices.
Palm's Alternative
Another approach is Graffiti, a cross-platform recognition engine from Palm Computing (Los Altos, CA). Graffiti requires users to print with a simplified version of the English alphabet. All but six of the 26 letters are the same as their traditional uppercase and lowercase equivalents. The rest are generally based on parts of traditional characters (see the figure).
The idea, says Palm Computing, is to make each character more distinguishable so that it won't be confused with others. (A special shift key lets you specify numerals.) Recognition is reportedly close to 100 percent, and Palm says that most people become competent with the new alphabet in about 20 minutes.
However, Palm's competitors--and many users--are skeptical about the idea of adopting a new alphabet. They believe that the computer should adapt to the user, not the other way around.
Most developers continue to concentrate on the challenge of recognizing existing writing styles. All they ask is to be judged by a fair benchmark. "If you write something and no one else can read it, don't expect a computer to," says Madeline Duva, Communication Intelligence's director of business development. "These things are not magicians."
illustration_link (10 Kbytes)
Palm Computing's Graffiti requires users to learn a modified version of the alphabet. When combined with Caps Lock and a separate numbers key, Graffiti purportedly achieves close to 100 percent accuracy.
screen_link (57 Kbytes)
Lexicus Longhand uses neural networks to recognize raw images and patterns in cursive handwriting. It compares unrecognized words to a 25,000-word dictionary and uses statistical methods to generate a list of best guesses from which you pick the intended word. Lexicus says the process takes only 1 or 2 seconds and that overall recognition exceeds 90 percent.
David Essex is a BYTE technical editor for reviews. You can contact him on the Internet or BIX at dessex@bix.com.