In his "Web Search" article (September), Jon Udell talks about freeWAIS (the NT port of it) and says that "since multiple search terms combine with OR . . . you depend on the selective power of a single term." This is only partly right: WAIS produces a ranking of result documents where the first entries fit the query better and the later entries don't fit it as well. Thus, you will find documents containing all search terms near the beginning of the ranking list, and documents containing few search terms near the end. In fact, if you access a WAIS server using a WAIS client, you will find that, in addition to the document title, you get a score indicating the match between the document and the query.
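To make the ranking idea concrete, here is a minimal Python sketch of an OR query whose results are ordered by score. It is only an illustration: the function name `rank_or_query`, the sample documents, and the scoring rule (count of distinct query terms present) are assumptions made for this example, and the real freeWAIS scoring is more elaborate and is not reproduced here.

```python
def rank_or_query(documents, query):
    """Rank documents for an OR query by how many distinct query terms they contain.

    Toy scoring rule assumed for illustration; not the actual WAIS formula.
    """
    terms = set(query.lower().split())
    scored = []
    for title, text in documents.items():
        words = set(text.lower().split())
        score = len(terms & words)
        if score > 0:  # OR semantics: a single matching term is enough to be listed
            scored.append((score, title))
    # Best matches first; ties are broken alphabetically by title
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored


# Hypothetical sample collection
docs = {
    "a.txt": "web search with wais",
    "b.txt": "wais server setup",
    "c.txt": "cooking recipes",
}
ranking = rank_or_query(docs, "wais search")
for score, title in ranking:
    print(score, title)
```

The document containing both terms comes first, the document containing only one term follows, and the document containing neither is not listed at all.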
I would also like to direct your attention to freeWAIS-sf (sf = structured fields). It improves somewhat upon the standard indexing and retrieval functions of freeWAIS. A very important improvement of freeWAIS-sf is the ability to process structured fields. A document is separated into fields specified at run time (based on regular expressions), so there is no hard-wired restriction on what fields there are and how to recognize them. Here's where to find freeWAIS-sf:
http://ls6-www.informatik.uni-dortmund.de/freeWAIS-sf/freeWAIS-sf.html
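As a rough illustration of fields defined at run time by regular expressions, the following Python sketch pulls named fields out of a document. It does not use the freeWAIS-sf format-definition syntax; the field names, the patterns, and the helper `extract_fields` are hypothetical and serve only to show the general idea.

```python
import re

# Hypothetical field definitions supplied at run time:
# field name -> regular expression whose first group captures the field value
FIELD_PATTERNS = {
    "author": re.compile(r"^Author:\s*(.+)$", re.MULTILINE),
    "title": re.compile(r"^Title:\s*(.+)$", re.MULTILINE),
}


def extract_fields(text, patterns):
    """Return a dict of field name -> value for every pattern that matches."""
    fields = {}
    for name, pattern in patterns.items():
        match = pattern.search(text)
        if match:
            fields[name] = match.group(1).strip()
    return fields


# Made-up sample document
sample = "Title: Web Search\nAuthor: Jon Udell\n\nBody text here."
fields = extract_fields(sample, FIELD_PATTERNS)
print(fields)
```

Because the patterns live in an ordinary dictionary, adding or changing a field means editing data, not code, which is the point of not hard-wiring the field set.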
Kai Grossjohann
grossjoh@dusty.informatik.uni-dortmund.de
The arbitrary fielded capability of freeWAIS-sf sounds particularly handy. I like the fact that the Simple Web Indexing System for Humans (SWISH) can look within HTML tags, but it only knows about certain of these kinds of "fields." I, in fact, have an application that wants more specific fielded capability, so I will give freeWAIS-sf a try. Your clarification of the behavior of freeWAIS is also very helpful. -- Jon Udell, executive editor
Flexible C++
Matthew Wilson
My approach to software engineering is far more pragmatic than it is theoretical--and no language better exemplifies this than C++.