's speech or commands, it will be able to turn the content into text, then translate it or rephrase it as synthesized speech.
The goal of the project is to produce a system that recognizes any one of the 12 most widespread languages. English, French, and German linguistic databases are already available. In addition, the GlobalPhone team has collected high-quality databases of samples in Arabic, Chinese, Japanese, Croatian, Korean, Portuguese, Russian, Spanish, and Turkish.
For each language, the GlobalPhone researchers asked about 100 native speakers to read 20 minutes of newspaper text. They recorded each session digitally and described each speaker's session in terms of speaker characteristics and environmental conditions.
"The data collection is now done," says Schultz. The next step is to train the recognition engine on the collected acoustic samples.
The GlobalPhone engine uses a phoneme-based algorithm, and its dictionary contains all known words from each language in a multilingual phoneme set. "Our phonemes are no longer language-specific but shared by several languages," explains Schultz.
When up and running, the GlobalPhone engine will produce a list of the most pertinent candidate word strings, separated by language. A scoring procedure will then narrow the candidates down to the single best-matching word string. Schultz expects to have a running version with this functionality next spring.
The number of potential applications is huge. It includes all sorts of multilingual information and ordering systems, automatic telephone operators, and translation services.
