s into Chinese. Translations aren't instantaneous, generally requiring around one minute per page of text on a fast Pentium machine; the program does run in the background, however, so you can continue surfing the Internet while it works.
Of more interest to English-speaking BYTE readers, who may have stumbled upon Web pages of Chinese characters coded in double-byte format, are Otek's plans to develop a Chinese-to-English version of the program. Like the English-to-Chinese version, the upcoming program, due for release by the end of the year, will require either a Chinese version of Windows or a Chinese environment manager--from a company such as Twinbridge--to actually display Chinese characters. Development should be fairly simple, says Otek vice president Anno Huang, because the company already has a full-fledged Chinese-to-English translation product that's designed to work with word processing software rather than with Web browsers.
At the heart of all of Otek's translation applications is the same knowledge-based translation engine. Functional differences between Chinese-to-English and English-to-Chinese versions are minimal.
Web Page Translator incorporates a 70,000-word basic dictionary. (Supplemental dictionaries are available, covering fields such as computing, medicine, law, and finance.) As well as holding the Chinese equivalent of each English word, the dictionary contains additional information that is used to improve translation accuracy. Nouns, for example, can be tagged as referring to a person or place, or to a number of other broad categories; verbs are marked as transitive, intransitive, and so on. The difference between Otek's translation engine and a simple dictionary-based word-replacement system is the program's database of 10,000 grammatical rules. Expressed in a conditional syntax devised by the company, each rule defines how the translation of a particular word can be changed by words preceding or following it.
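To make the idea of a tagged dictionary concrete, here is a minimal Python sketch. It is not Otek's format, which the company has not published here: the `Entry` class, the tag names, the sample words, and the angle-bracket glosses standing in for Chinese words are all assumptions made for illustration. The `word_for_word` function shows the simple word-replacement baseline that the grammatical rules are meant to improve on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One hypothetical dictionary entry (field names are assumptions)."""
    translation: str                 # basic target-language equivalent
    pos: str                         # part of speech: "noun", "verb", ...
    tags: frozenset = frozenset()    # e.g. {"person"}, {"place"}, {"transitive"}


# Illustrative entries only; the glosses in angle brackets stand in for Chinese words.
DICTIONARY = {
    "teacher": Entry("<teacher>", "noun", frozenset({"person"})),
    "shanghai": Entry("<Shanghai>", "noun", frozenset({"place"})),
    "take": Entry("<pick up/carry>", "verb", frozenset({"transitive"})),
    "sleep": Entry("<sleep>", "verb", frozenset({"intransitive"})),
}


def word_for_word(sentence):
    """Baseline: swap each known word for its basic translation, ignoring context."""
    output = []
    for word in sentence.lower().split():
        entry = DICTIONARY.get(word)
        # Unknown words pass through unchanged.
        output.append(entry.translation if entry else word)
    return output
```

A rule engine would consult the `pos` and `tags` fields when deciding whether a rule applies; the baseline above ignores them, which is exactly why it cannot tell one sense of a word from another.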
The English word "take," for example, is represented in the program's dictionary by a Chinese word meaning "pick up" or "carry." This might be described as the word's basic definition, and in the majority of cases, it is substituted directly for "take." However, when "take" is followed by the noun "bath" or "shower," one of the grammatical rules is activated, overriding the dictionary definition and substituting a Chinese verb meaning "wash." The ability to handle regular tense and plural suffixes is hardwired into the program. So variations, such as "took a bath" and "taking baths," can be handled by the same rule.
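The override can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Otek's rule syntax: the rule tuple layout, the suffix list, the `IRREGULAR` table (added so that "took" resolves to "take"), the `SKIP` set of articles, and the angle-bracket glosses standing in for Chinese verbs are all hypothetical.

```python
# Hypothetical data; glosses in angle brackets stand in for Chinese words.
BASIC = {"take": "<pick up/carry>"}           # basic dictionary definitions
VOCAB = {"take", "bath", "shower"}            # base forms known to this sketch
RULES = [("take", {"bath", "shower"}, "<wash>")]  # (verb, following nouns, override)

IRREGULAR = {"took": "take", "taken": "take"}  # assumption: irregular forms are listed
SUFFIXES = ("ing", "es", "s", "ed")            # regular tense and plural endings
SKIP = {"a", "an", "the"}                      # articles ignored when looking ahead


def base_form(word):
    """Reduce an inflected word to a base form found in VOCAB, if possible."""
    word = word.lower()
    if word in VOCAB:
        return word
    if word in IRREGULAR:
        return IRREGULAR[word]
    for suffix in SUFFIXES:
        if word.endswith(suffix):
            stem = word[: -len(suffix)]
            # Try the bare stem, then the stem with a restored final "e" (taking -> take).
            for candidate in (stem, stem + "e"):
                if candidate in VOCAB:
                    return candidate
    return word


def translate_verb(words, index):
    """Translate words[index], letting a context rule override the basic definition."""
    verb = base_form(words[index])
    following = next(
        (base_form(w) for w in words[index + 1:] if w.lower() not in SKIP), None
    )
    for head, nouns, override in RULES:
        if verb == head and following in nouns:
            return override
    return BASIC.get(verb, words[index])
```

Because both the verb and the following noun are reduced to base forms before the rule is tested, `translate_verb("took a bath".split(), 0)` and `translate_verb("taking baths".split(), 0)` trigger the same rule, while `translate_verb("take the book".split(), 0)` falls back to the basic definition.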
What about a phrase such as "he takes a bath in the house"? A further grammatical rule checks for the presence of prepositions such as "in" and ensures that the correct translation of "take" is used.
Successful translation beyond basic word-for-word replacement depends upon the integrity of the grammatical rule database. Although Otek's software development team includes one linguistics specialist who majored in English, none of the programmers are native English speakers. It's possible that some of the program's translation errors--which, as is typical of this kind of application, are not infrequent--actually reflect its developers' imperfect grasp of English.
All computer translation products should be marked with a warning: "Not for mission-critical applications." Otek's Web Page Translator is no exception. It handles straightforward sentences quite well, sometimes perfectly, in fact, but its performance with complex idioms and slang can vary from borderline acceptability to unintentional hilarity. However, the forthcoming Chinese-to-English version of Web Page Translator is well worth consideration by anyone who has ever wondered what those mysterious hieroglyphics on that site in Shanghai or Xian might mean.
Where to Find
Otek International
Taipei, Taiwan
Phone: +886 2 760 9468
Phone: +886 2 765 5777
Internet: http://www.transperfect.com.tw