o language-independent, since all human languages share universal linguistic properties. (Thank you, Noam Chomsky.) All you have to do is select the specific dictionary you want LinguistX to use.
Its stemmer can tell that "survived" is related to "survive," is a verb, and is in the past tense. When indexing, it can therefore group related words together, producing a smaller index that is faster to search; astonishingly, LinguistX can represent up to eight English words with a single bit. This also helps you make more precise queries and find what you're looking for through a word's deep meaning, not just its surface form. On a higher level, the LinguistX thesauri find words with similar meanings: that can broaden a search, but it can also lead to serendipitous connections.
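To make the indexing idea concrete, here is a minimal sketch of how grouping words by stem shrinks an index and lets one query form match its relatives. It is not LinguistX's method: the suffix list and the names toy_stem, build_index, and search are hypothetical, and a real morphological analyzer would work from a dictionary and also report part of speech and tense.

```python
from collections import defaultdict

# Toy suffix rules, purely illustrative (an assumption, not LinguistX's rules).
SUFFIXES = ("ing", "ed", "es", "s", "e")


def toy_stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(docs):
    """Map each stem to the set of document ids that contain any of its forms."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            index[toy_stem(token.strip('.,;:!?"'))].add(doc_id)
    return index


def search(index, query_word):
    """Look up a query by its stem, so any inflected form finds the same entry."""
    return sorted(index.get(toy_stem(query_word), set()))


docs = {
    1: "She survived the storm.",
    2: "Few survive such storms.",
    3: "The house was built.",
}
index = build_index(docs)
print(search(index, "surviving"))
```

Because "survived," "survive," and "surviving" all collapse to one key in this sketch, the index stores a single entry for them, and a query in any of those forms reaches both documents.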
Taggers decide parts of speech. If you're looking for a saw, LinguistX can tell the hand tool from the past tense of the verb "see." This is in stark contrast to search engines that treat words as mere strings of ASCII and would make no distinction between the tool and the verb. Tagging a word properly means examining a little of the surrounding text. The context window must be as small as possible to preserve speed, yet wide enough to do the job.
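The trade-off between a small and a wide context window can be sketched in a few lines. This is only an illustration under stated assumptions: the word lists and the function tag_saw are hypothetical, and a production tagger would rely on far richer evidence than the words immediately to the left.

```python
# Hypothetical cue words, chosen only for this illustration.
DETERMINERS = {"a", "an", "the", "this", "my", "your"}
PRONOUNS = {"i", "you", "he", "she", "we", "they", "it"}


def tag_saw(tokens, i, window=1):
    """Guess whether the word at tokens[i] is a noun or a verb.

    Only the `window` words to the left are examined, nearest first.
    """
    left = [t.lower() for t in tokens[max(0, i - window): i]]
    for word in reversed(left):
        if word in DETERMINERS:
            return "NOUN"   # "a saw", "the saw": the hand tool
        if word in PRONOUNS:
            return "VERB"   # "she saw": past tense of "see"
    return "UNKNOWN"


print(tag_saw("I bought a saw".split(), 3))
print(tag_saw("She saw the game".split(), 1))
print(tag_saw("He quickly saw it".split(), 2, window=1))
print(tag_saw("He quickly saw it".split(), 2, window=2))
```

With a one-word window, "He quickly saw it" stays undecided because the adverb hides the pronoun; widening the window to two words resolves it, at the cost of reading more text per token.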
LinguistX especially shines in handling phrases. If you search for "home run records" on most search services, you'll get a lot of dross about building a house, athletic footwear, and music albums. But LinguistX can tell that you are looking for exceptional batting performances in baseball.
All this has come out of years of linguistic research, turned into practical software. The result is a collection of ANSI C libraries that are platform-agnostic and eminently portable. For implementors, LinguistX saves time and space and delivers sharper query results. If that sounds good to you, be sure to ask whether a commercial search engine incorporates LinguistX.