looking for are sophisticated, multilingual translation tools that can help reduce these costs.
Machine translation (MT), the use of computers to automate translation, is one of the computer industry's oldest areas of interest -- and one of its most frustrating. Computer scientists and linguists have been working on MT techniques for decades. The results are disappointing: Perfect translation without human intervention is still a dream that will not be realized for the next 10 years or so.
MT has come a long way from simple word-to-word translation. Products can now deduce subtle contextual differences in languages. But even the most sophisticated systems on the market are far from automatic. They are, however, useful support tools for professional human translators. Some systems are based on huge dictionaries that translators use to more efficiently look up words and phrases. Other systems take a first look at a document and produce a rough draft that is then edited by a human translator. The best of these tools can deliver about 80 percent accuracy, experts reckon.
Real MT breakthroughs are rare and developments are slow because of the complexity and continuous evolution of language. To be truly effective, a translation system must take into account the formation and use of words, syntax, and semantics. Furthermore, it must be able to recognize colloquial phrases, acronyms, and contractions -- not to mention incorrect grammar and misspelled words. That's why many vendors of language processing tools are now changing strategy. Rather than trying to keep up with constant additions to their products, they are making them customizable, enabling users to update dictionaries and also extend context sensitivity.
Globalink's new generation of technology, dubbed Barcelona, allows translators -- rather than programmers -- to add their own rules for how words should translate in a certain context. That means the front-line translator can efficiently deal with new word fields and idioms that might occur in a specific project. Globalink says this new technology will soon incorporate a "wizard" that allows nonexperts to implement rules in a comprehensive high-level language. Other Windows programs will be able to access the Barcelona translation service via OLE Automation or an API.
Another example of a tool that lets users fine-tune the translation process with an expandable context-sensitive dictionary is Logos' Semantha. According to Mark Andrews, a Logos product marketing manager, this new generation of customizable translation tools enables users who systematically track new phrases and idioms to get to a point where they can push a button and generate close-enough translations for internal company documents and other communications.
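To picture what a user-extendable, context-sensitive dictionary might look like, here is a minimal Python sketch. It is an illustration only: the rule format, the class and method names, and the English/German sample entries are all hypothetical, and nothing here reflects the actual rule language of Globalink's Barcelona or the internals of Logos' Semantha.

```python
from dataclasses import dataclass


@dataclass
class ContextRule:
    """Hypothetical user rule: translate `source` as `target` when a cue word is nearby."""
    source: str
    target: str
    cues: frozenset = frozenset()


class UserDictionary:
    def __init__(self, defaults):
        self.defaults = dict(defaults)  # context-free fallback translations
        self.rules = []                 # context rules added by the translator

    def add_rule(self, source, target, cues):
        self.rules.append(ContextRule(source, target, frozenset(cues)))

    def translate_word(self, word, sentence_words):
        context = set(sentence_words)
        # The first rule whose cue words appear in the sentence wins.
        for rule in self.rules:
            if rule.source == word and rule.cues & context:
                return rule.target
        # No rule applies: fall back to the default entry, or leave the word as-is.
        return self.defaults.get(word, word)


dictionary = UserDictionary({"bank": "Bank"})
dictionary.add_rule("bank", "Ufer", cues={"river"})

print(dictionary.translate_word("bank", ["the", "river", "bank"]))    # Ufer
print(dictionary.translate_word("bank", ["the", "bank", "account"]))  # Bank
```

The point of the design is that the translator changes data (rules and entries), not program code, which is what lets a nonprogrammer keep the system current.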
So how do you keep up with ever-changing contexts and the variety of technical terms that occur in new projects? One answer comes from the Rank Xerox Research Center (RXRC) in Grenoble, France. RXRC's Terminology Extraction Project aims to make dictionaries easier to build. It compares a document in the original language with its translation in the target language and aligns the two texts sentence by sentence. Then it extracts the multiword expressions from both versions and produces a list of paired terms that can be incorporated in a dictionary. The technology currently works with Dutch, English, German, French, Italian, Spanish, and Portuguese.
A dictionary of 20,000 terms can take 1,000 hours to build and cost up to $600,000, RXRC researchers say. With well-designed extraction tools, such costs can to a large extent be eliminated, they say.
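The article does not say how RXRC pairs the terms once the sentences are aligned, so the following Python sketch swaps in a simple, well-known heuristic: count how often each two-word expression in the source co-occurs with each two-word expression in the aligned target sentence, and keep the pairing with the highest Dice coefficient. The function names, the `min_count` threshold, and the English/French sample sentences are assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def bigrams(sentence):
    """Return the set of two-word sequences in a sentence."""
    words = sentence.lower().split()
    return {" ".join(words[i:i + 2]) for i in range(len(words) - 1)}


def extract_term_pairs(aligned_pairs, min_count=2):
    """Pair each source bigram with the target bigram it co-occurs with most strongly."""
    src_count, tgt_count, joint = Counter(), Counter(), Counter()
    for src, tgt in aligned_pairs:
        s_grams, t_grams = bigrams(src), bigrams(tgt)
        src_count.update(s_grams)
        tgt_count.update(t_grams)
        joint.update(product(s_grams, t_grams))

    best = {}
    for (s, t), n in joint.items():
        if n < min_count:
            continue  # ignore pairings seen too rarely to trust
        dice = 2 * n / (src_count[s] + tgt_count[t])
        if s not in best or dice > best[s][1]:
            best[s] = (t, dice)
    return {s: t for s, (t, _) in best.items()}


aligned = [
    ("replace the print head", "remplacez la tête d'impression"),
    ("clean the print head", "nettoyez la tête d'impression"),
    ("the print head is hot", "la tête d'impression est chaude"),
    ("each print head", "chaque tête d'impression"),
]
pairs = extract_term_pairs(aligned)
print(pairs["print head"])  # tête d'impression
```

On this toy data the heuristic also pairs "the print" with "la tête", which is not a term at all. A practical tool would presumably filter candidates linguistically before pairing them; pure co-occurrence counting is only the starting point.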
The Terminology Extraction Project is part of the Xerox Lexical Development Architecture (XeLDA). This translation framework includes tools that can detect phrases. For example, if you click on the word "sweep" in the phrase "to sweep it under the rug," you don't get the translation of "to sweep"; you get the translation of the complete phrase. This happens even if the idiom is split up, as in "to sweep that crime under the nearest rug," because the system is designed to detect basic idioms.
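One way to think about detecting an idiom that has been split up is to store only its anchor words and allow a limited number of other words between them. The Python sketch below does exactly that. It is not XeLDA's method, which the article does not describe; the idiom table, the entry label, and the `max_gap` limit are hypothetical, and the sketch ignores inflected forms such as "swept."

```python
# Hypothetical idiom table: anchor words in order, mapped to a dictionary entry label.
IDIOMS = {
    ("sweep", "under", "rug"): "sweep_under_rug",
}


def match_anchors(words, anchors, max_gap):
    """Find the anchors in order, allowing up to max_gap other words between neighbors."""
    for start, word in enumerate(words):
        if word != anchors[0]:
            continue
        positions = [start]
        for anchor in anchors[1:]:
            first = positions[-1] + 1
            window = words[first:first + max_gap + 1]
            if anchor not in window:
                break
            positions.append(first + window.index(anchor))
        else:
            return positions  # every anchor was found
    return None


def find_idiom(words, clicked_index, idioms=IDIOMS, max_gap=4):
    """Return the idiom entry covering the clicked word, or None."""
    for anchors, label in idioms.items():
        positions = match_anchors(words, anchors, max_gap)
        if positions and clicked_index in positions:
            return label
    return None


sentence = "to sweep that crime under the nearest rug".split()
print(find_idiom(sentence, sentence.index("sweep")))  # sweep_under_rug
```

A real system would return the translation stored under that entry rather than a label, and it would need morphological analysis to recognize the idiom in other tenses.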
Although the XeLDA services are prototypes, RXRC is planning to make these kinds of services commercially available in corporate LAN environments or over the Internet. "In the long term," says Monica Beltrametti, director of the Grenoble RXRC, "we aim to provide our Translation Aid Network Services as general-purpose translation tools to any networked computer user faced with multiple languages at work."
Personal MT
The market for MT tools has traditionally been professional translators in large corporations, international organizations, or governments. However, a new, more casual market for translation tools is emerging. Much of the information being passed around the globe doesn't require the precise translation that a novel or a technical manual might. "There is an increasing need for quick multilingual information scanning," says Ann-Marie Derouault, IBM's worldwide speech and translation marketing executive. "No one would pay a professional translator to translate an e-mail message because a quick translation that gives you a rough idea of its content is all that's required."
New products in this area are nevertheless context-aware, and some also use sophisticated syntactical analysis. They integrate with standard word processors and are priced at less than DM 500. Here are some examples:
IBM Europe now offers Windows and OS/2 versions of its host-based translation technology, Personal Translator. This program comes in a basic package with a vocabulary of 160,000 words and 440,000 phrases and in an advanced version with approximately 200,000 words and 550,000 phrases. In Italy this technology is used in Synthema's PeTra English/Italian translation product, which runs under OS/2. And in Germany, IBM has worked with v.Rheinbaben & Busch Electronic Publishing to create a Windows-based German/English version of the Personal Translator technology.
Accent Software offers Accent Duo with Translation, which integrates translation with word processing capabilities. This Windows system is available in English to Spanish, German, French, or Italian versions (it works bidirectionally). The program features a spelling checker and a thesaurus in both languages and lets users translate documents automatically or work interactively.
Logos' Remote Client is a Windows application that lets users dial into a Unix-based translation server. You can choose from a multitude of dictionaries for several subjects, then send the job to the server, which returns a translated version. Users can maintain their own translation server or call the Logos corporate server, which costs $0.04 per translated word. Logos' goal is to make machine translation available to smaller businesses and freelance translators who can't afford a high-end system.
Another force driving translation technology is on-line chat and communication in newsgroups. CompuServe, for example, offers English/French and English/German translation in some of its help forums. These translations are often very rough, but their value is immediacy, because in the context of a support forum, messages can lose their relevance if they are delayed. As a CompuServe manager puts it, "The purpose is to quickly provide translations that otherwise would take hours to understand."
Multilingual translation is also reaching the World Wide Web. Globalink, for example, provides an add-on to Netscape Navigator 2.0 that translates Web sites in Spanish, French, or German into English, and vice versa, at the click of a button. Called Web Translator, the software allows users to translate on-line, or to save pages to be translated off-line, while maintaining the original page's hot links, graphics, and formatting.
The development of multilingual translation tools is key for most companies. Many of these systems support at least three of the main European languages -- English, French, German, and Spanish. However, there is no such thing as a one-size-fits-many translation technology. The experience of developing a translation system that works from language A to language B is in most cases of little help when developing a system for languages C and D. Merely replacing dictionaries is not enough because it does not reflect the grammatical structure or different semantic classes of words.
This famous example illustrates the difficulties of MT. Use any standard translation system to translate the old saying "The spirit is willing but the flesh is weak" to French and then back to English and you will get something like "The alcohol is strong but the meat is weak."
Where to Find
Accent
Jerusalem, Israel
Phone: +972 2 793 723 243
Fax: +972 2 793 731
E-Mail: normank@accent.co.il
Internet: http://www.accentsoft.com