Although standards are in flux and product categories are blurred, one of these eight packages may be just right for electronically managing your most important documents
David Seachrist
Interest in using desktop computers to manage intellectual property has taken the personal computer industry by storm. In many organizations, documents are replacing applications as the central user-interface metaphor.
Not surprisingly, a broad category of document management software has arisen to serve this growing demand. However, despite some marketing claims, no single product can meet everyone's document management needs. In fact, we can divide document management into three subcategories:
-- Dynamic document management software (sometimes called ad hoc document management) focuses largely on managing documents that are still in the creation, editing, or production stages. Products such as PC Docs Open, SoftSolutions, Visual Recall, and WorldView fall into this category.
-- Document image management deals with managing documents that are already complete, making content creation and formatting control moot.
-- Document exchange programs focus largely on making electronic documents more portable. The goal is to reduce handling of paper documents by using formatted electronic versions that you can open and print in different applications. Adobe Acrobat, Envoy, Replica, and Common Ground are in this category.
For this software roundup, NSTL focused on document image management for Windows environments. To qualify, each reviewed product had to run as a native application under Windows, support scanning on a Hewlett-Packard ScanJet IIc, perform OCR, allow search and retrieval using the Boolean and operator, provide a go-to-page tool for quick document navigation, and allow printing.
Rating Usability
NSTL used a document library of 17 issues of Software Digest Ratings Report as the basis for testing the eight programs' ease of learning and use in basic functions, such as scanning, OCR, indexing, searching and retrieving, and printing. In addition, NSTL rated the programs for usability in installation, documentation, library setup, and maintenance. In general, the programs that provided the best interface design and documentation were the ones that rated highest in NSTL's usability evaluations. Westbrook Technologies' FileMagic Plus 4 Professional, Caere's PageKeeper 1.1, MindWorks' Recollect 2.11, and Watermark Software's Watermark Professional Edition 1.0 all achieved better-than-average usability.
Recollect was the easiest to learn. It has an intuitive interface and documentation that is short, well designed, and to the point. The biggest factor in its ease of learning is that many of its processes are automatic and don't require separate steps.
A lack of functions also makes some programs easier to learn. Reduced feature sets often translate into shorter learning curves while causing a corresponding drop in ease of use. Such is the case with both Watermark Professional and Alacrity Systems' Equip+.
PageKeeper is like Recollect in combining OCR and indexing in a single process. However, installing PageKeeper requires more decision points than seems necessary, and its library setup is not as easy as Recollect's.
The tutorial in Newport Canyon Associates' Fileflo 3.6 is too bound to the manual and gives short shrift to key topics like OCR and document indexing. Furthermore, Fileflo's installation process is longer and requires the most decision making of any of the tested programs.
FileMagic Plus has the best learning materials, with separate tutorials for systems administrators, database administrators, and users. However, its problematic installation included false low-memory reports and an incorrect password sequence. (Westbrook Technologies promptly provided a maintenance release to fix these problems, but we had to reinstall the software.) In addition, library setup and indexing of documents were harder to learn than in the other programs.
The tutorial in ImageFast Software Systems' ImageFast 2.0 covers starting the program, opening and creating drawers and folders, scanning, and indexing via standard forms, but it offers no coverage of OCR. The biggest learning difficulties result from the program's two separate indexing and search engine options. This requires more study than the programs with a single indexing and search engine.
PaperClip Imaging Software's PaperClip for Windows has a tutorial that covers scanning, setting up applications to work with PaperClip, viewing images, and retrieving documents. It offers no coverage of OCR and little on how to set up folders and drawers. Both OCR and library setup are difficult to learn.
Ease of Use
FileMagic Plus's interface amenities and flexibility make it one of the two easiest programs to use. We were particularly impressed with the program's search interface and OCR options, both of which let you select the storage location of text within document files.
Watermark Professional's user interface also makes it a winner in ease of use. Although its search engine and limited indexing options are weak, its help system and OCR interface are the best of any of the programs.
We found Recollect and PageKeeper to be slightly harder to use than Watermark Professional or FileMagic Plus. Neither program allows naming and saving of search criteria the way Watermark Professional and FileMagic Plus do.
Fileflo's indexing capabilities are less flexible than those of some of the other programs, and although it lets you save search criteria, its search interface is not as intuitive as the interfaces of other programs.
With ImageFast, you have to switch search engines, choosing either the internal search engine or the run-time version of Microsoft Access's Database Wizard, depending on the type and speed requirements of the search. This is cumbersome.
Equip+'s lack of a search engine and its scanning and OCR interface make it less flexible for volume document scanning and indexing than most of the other programs. Finally, the lowest rated product for ease of use, PaperClip for Windows, has a cumbersome OCR function, and its indexing routine takes too many steps.
Scanning Horizons
Scanning is the method by which these programs transform documents into images, graphical bit maps that act as snapshots of the documents' pages. Two issues are important here: breadth of support of scanner types and image formats and the ability to control scan quality.
Although some programs ship with more scanner drivers than others, support for different brands of scanners is generally good in all the programs. The Scanning and File Management Features table shows the approximate number of scanners supported by each program.
Features like image file compression, document feeder support, and the option to designate a separator page (usually a blank page) as the division between documents when performing batch scanning are available in all the products. These features are important in environments that require scanning multipage documents or that have a high volume of single-page documents.
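The separator-page idea can be illustrated with a short sketch. It assumes a batch arrives as a list of pages and that some test can recognize a blank page; the function name, the sample pages, and the blank-page test are all hypothetical and do not describe how any reviewed product works.

```python
def split_batch(pages, is_separator):
    """Group a scanned batch into documents, dropping separator pages."""
    documents, current = [], []
    for page in pages:
        if is_separator(page):
            if current:              # close the document in progress
                documents.append(current)
                current = []
        else:
            current.append(page)
    if current:                      # last document has no trailing separator
        documents.append(current)
    return documents


# Hypothetical batch: each page is a string, and "" stands in for a blank page.
batch = ["invoice p1", "invoice p2", "", "memo p1", "", "", "report p1"]
docs = split_batch(batch, is_separator=lambda page: page == "")
```

Consecutive separator pages are treated as a single division, so a stray extra blank sheet does not create an empty document.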
Support for scanning gray scales and color images into documents varies among the programs, and only Recollect and FileMagic Plus currently support both (Watermark Software planned to release a version adding color support to Watermark Professional in April). Regardless, many of the OCR modules only work with black-and-white images.
Watermark Professional, ImageFast, and PaperClip for Windows all correct skewed pages. Deskewing slightly shifts pages that are placed improperly on the scanner flatbed.
PageKeeper does not permit adjusting the scanner resolution, which determines the level of image detail; the higher the resolution, the greater the image detail. Resolution also affects image storage size and quality, so it is useful to be able to alter this setting.
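To see why resolution matters for storage, consider the size of an uncompressed bitmap. The sketch below assumes a letter-size page (8.5 by 11 inches) and ignores compression and file headers, both of which change the real figures; the function name is illustrative only.

```python
def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Raw bitmap size in bytes for a page scanned at the given resolution."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8


# Black-and-white (1 bit per pixel) scans of a letter-size page.
for dpi in (150, 300, 600):
    size = uncompressed_bytes(8.5, 11, dpi, bits_per_pixel=1)
    print(f"{dpi} dpi: {size / 1024:.0f} KB")
```

Because the pixel count grows with the square of the resolution, doubling the dots per inch roughly quadruples the raw image size.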
PageKeeper and Equip+ lack the ability to view individual pages during the scan operation. This option is useful for seeing each page during a batch scan to determine whether any pages need to be rescanned.
More than OCR
The ability to change a page image into editable text directly affects the types of documents a document image manager can process. The most important feature for streamlining the OCR process is a program's ability to differentiate between the graphical and text portions of a page.
Some programs' OCR modules come with the ability to detect snaking columns and maintain the reading sequence of the text. Of the eight, ImageFast and FileMagic Plus do not. It is also useful to be able to compensate for tabular matter, such as tables of numbers, by inserting tabs or other delimiting characters between values. PaperClip for Windows, ImageFast, and FileMagic Plus lack this ability.
Beyond page formatting is text-character formatting. Some programs attempt to maintain formatting characteristics, such as boldfacing, and others simply maintain the text and lose all the formatting attributes. The OCR modules in Equip+, Fileflo, PageKeeper, and Watermark Professional come with file format conversion filters that allow saving text and formatting in popular word processor file formats.
Despite the fact that virtually every computer user has a word processing program with a spelling checker, Equip+, PageKeeper, and Watermark Professional offer an optional spelling checker as a part of their OCR operations. Watermark Professional's TextProofer program is especially powerful, containing an integrated spelling checker and text editor.
Indexing: The Crux of the Matter
Once a document has been scanned and saved as an image file and OCR has been performed, the next step is to identify the document so it will be easy to find. This process is called indexing.
The simplest indexing option is to name the document and attach fields that contain information about the document. Such manual indexing is available, using varying methods, in all the programs.
Fileflo, FileMagic Plus, ImageFast, PaperClip for Windows, and Watermark Professional provide multiple fields that users can customize with names and data types. Watermark Professional is the least flexible in this regard, leaving most of this function to a database server that can be connected at an additional cost.
To perform manual indexing in Recollect and PageKeeper, you assign tag notes, the electronic equivalent of sticky notes. The problem with tag notes is that they are a haphazard means of indexing compared to fields, which prompt users for specific types of information. Equip+ allows filenames that are restricted to the DOS 8.3 naming scheme. In addition, each file has a single description field that can contain up to 127 characters.
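The limits described for Equip+ are easy to picture as a validation rule. The following is a simplified sketch, not Equip+'s actual logic: it accepts only letters, digits, underscores, and hyphens, whereas DOS itself permits a few more punctuation characters. The names are hypothetical.

```python
import re

# Simplified 8.3 pattern: 1 to 8 characters, then an optional dot and
# 1 to 3 characters. Real DOS allows some additional punctuation.
NAME_8_3 = re.compile(r"[A-Za-z0-9_-]{1,8}(\.[A-Za-z0-9_-]{1,3})?")
MAX_DESCRIPTION = 127  # description length limit stated in the review


def valid_entry(filename, description):
    """Check a filename against the simplified 8.3 rule and the length limit."""
    return (NAME_8_3.fullmatch(filename) is not None
            and len(description) <= MAX_DESCRIPTION)
```

A name such as INVOICE1.TIF passes, while INVOICE_1995.TIF fails because its base name is longer than eight characters.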
The way OCR modules and indexing functions communicate is key to the types of documents suited for processing by each program. NSTL separates OCR and indexing integration into two distinct levels of functionality: OCR-driven and automatic full-text indexing.
Products with OCR-driven indexing are capable of capturing OCR data and placing it in appropriate fields during indexing. Because OCR-driven indexing uses fields as its organizing technique, it is the best indexing method for forms-based applications. Equip+, Fileflo, FileMagic Plus, ImageFast, and PaperClip for Windows provide OCR-driven indexing. Equip+ is hampered by the fact that it has only one field into which it can receive data. The other programs can direct OCR-processed text into multiple fields.
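As a rough illustration of OCR-driven indexing, the sketch below pulls values out of recognized text and drops them into named fields. The field names, the patterns, and the sample form are assumptions made for the example; commercial products typically rely on zones defined on the form image rather than on text patterns like these.

```python
import re

# Hypothetical form layout: each index field is paired with a pattern
# that locates its value in the OCR text.
FIELD_PATTERNS = {
    "invoice_no": re.compile(r"Invoice\s*(?:No\.?|#)\s*:?\s*(\w+)", re.I),
    "date": re.compile(r"Date\s*:?\s*(\d{1,2}/\d{1,2}/\d{2,4})", re.I),
    "customer": re.compile(r"Customer\s*:?\s*(.+)", re.I),
}


def index_from_ocr(ocr_text):
    """Fill each index field from the OCR text; None marks a missing value."""
    record = {}
    for field, pattern in FIELD_PATTERNS.items():
        match = pattern.search(ocr_text)
        record[field] = match.group(1).strip() if match else None
    return record


sample = "Invoice No: A1047\nDate: 3/14/95\nCustomer: Acme Supply Co.\n"
record = index_from_ocr(sample)
```

A field left as None signals that the operator needs to key the value in by hand, which is the fallback manual indexing provides.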
Automatic full-text indexing involves indexing all the text captured during the OCR operation without using fields. Indexing an entire document's content for retrieval is best done with full-text indexing. FileMagic Plus, ImageFast, PageKeeper, PaperClip for Windows (via its bundled full-text indexing add-on, Isys), and Recollect offer this option.
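Full-text indexing is commonly built on an inverted index, which maps every word to the documents that contain it. The sketch below shows the idea in its simplest form; it is an assumption about the general technique, not a description of the engine inside any of the reviewed programs, and the sample library is invented.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index


# Hypothetical library of OCR output, keyed by document name.
library = {
    "doc1": "Scanner resolution affects image quality.",
    "doc2": "The OCR module reads the scanned image.",
}
index = build_index(library)
```

Looking up a word is then a single dictionary access, which is why full-text retrieval can stay fast even when no fields were filled in.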
Search and Retrieve
To locate documents, document image managers need powerful search tools that are easy to use. Programs like PageKeeper and Recollect can be helpful to researchers because their search modules rank results and use graphics to home in on appropriate data. This approach is not as useful for form retrieval, where it's important to locate the exact numbered form, not a range of close approximations.
In programs with full-text indexing, you should look for features like fuzzy searching, proximity searching, and ranking and sorting of search results. Fuzzy searching is the ability to set how closely the search results must adhere to the spelling of the search criteria. Proximity searching looks only for documents that contain two words within a certain distance of each other in the document. ImageFast and PaperClip for Windows (via Isys) are the only full-text indexing programs that support proximity searches. PageKeeper is the only full-text indexing program that lacks fuzzy searching. FileMagic Plus lacks the ability to rank search results.
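Both ideas can be sketched in a few lines. The fuzzy match below uses the similarity ratio from Python's standard difflib module as a stand-in for whatever measure a given product uses, and the proximity test counts the gap in words; the function names, the threshold, and the sample sentence are illustrative assumptions.

```python
import re
from difflib import SequenceMatcher


def words(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def fuzzy_match(term, text, threshold=0.8):
    """True if any word in text is at least `threshold` similar to term."""
    term = term.lower()
    return any(SequenceMatcher(None, term, w).ratio() >= threshold
               for w in words(text))


def within(term_a, term_b, text, max_gap):
    """True if the two terms occur at most max_gap words apart."""
    tokens = words(text)
    pos_a = [i for i, w in enumerate(tokens) if w == term_a.lower()]
    pos_b = [i for i, w in enumerate(tokens) if w == term_b.lower()]
    return any(abs(a - b) <= max_gap for a in pos_a for b in pos_b)


# Sample OCR output containing a misread word ("scaner").
text = "The scaner driver handles image compression before indexing."
```

Fuzzy matching matters for scanned libraries in particular, because OCR errors such as the misread word above would otherwise hide a document from an exact-spelling search.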
Boolean operators (and, not, and or) allow fine tuning or broadening of search criteria by including or excluding documents based on one or two search words. All eight programs allow this type of searching to some extent. However, Watermark Professional can only perform and-style Boolean searches on keyword fields and date fields. Equip+ and PageKeeper do not allow the not operator. Fileflo allows not in keyword searches but not for searches of document index fields.
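Over a full-text index, the three Boolean operators reduce to set operations on the lists of matching documents. The sketch below assumes a small hand-built index with invented document names; it shows the general principle rather than the query syntax of any reviewed program.

```python
# Hypothetical index: each word maps to the documents that contain it.
index = {
    "scanner": {"doc1", "doc3"},
    "ocr": {"doc2", "doc3"},
    "fax": {"doc3", "doc4"},
}


def lookup(term):
    """Return the documents containing term, or an empty set."""
    return index.get(term, set())


both = lookup("scanner") & lookup("ocr")      # scanner and ocr
either = lookup("scanner") | lookup("ocr")    # scanner or ocr
without = lookup("scanner") - lookup("fax")   # scanner and not fax
```

The and operator narrows a search, or broadens it, and not excludes documents, which is why a program that omits not gives up one of the main ways to trim a long result list.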
Image Manipulation
All the programs allow viewing of the OCR-processed text and the image of a page in side-by-side windows (except for PageKeeper, which offers viewing of the image or the text but not both at the same time). All also have ample magnification and navigation tools for zooming in on portions of pages or jumping to specific pages within documents. Each program also allows attaching note text to a page.
However, if you want to draw images on the page, your options are more limited. Equip+, FileMagic Plus, and Watermark Professional are the only programs to provide adequate bit-map painting tools. FileMagic Plus and Watermark Professional even offer highlighting pens to visually draw attention to important passages. ImageFast provides users with the ability to draw rectangles, and PaperClip for Windows can link to a third-party paint application.
The Distribution System
Once you view a document and perhaps attach a note to it, you would then either put the document away, print it, or pass it on to another person. Work flow is the term given to this routing of documents to other workers on your network.
ImageFast provides its own E-mail and distribution module called WorkFast. In addition to allowing documents to be distributed to other users' mailboxes, WorkFast lets you send E-mail, attach priority to documents, and mark them regarding their completion status.
A special network version of PaperClip for Windows provides the same capabilities as WorkFast minus the status feature. The status marking capability was to be a part of a PaperClip for Windows Workflow add-on planned for release in April.
FileMagic Plus and Watermark Professional offer routing and the ability to access MAPI and VIM-compatible (Vendor-Independent Messaging) software, which, in turn, allows access to Lotus Notes. PageKeeper offers routing and its own E-mail system. Equip+ uses an inbox/outbox metaphor to send and receive faxes. It is the only program to come with its own fax/modem driver software, and you can set it up to receive and send faxes unattended. All other programs use a fax driver, such as Delrina Software's WinFax, to send and receive faxes.
The Wider World
At some point, users will want other Windows applications to dynamically interact with their document image manager. But our findings in that regard are rather disappointing. Only two of the programs are able to act as OLE clients and servers: FileMagic Plus and Watermark Professional. Of these, only Watermark Professional supports OLE 2.0. DDE links are supported in Equip+, FileMagic Plus, ImageFast, and Watermark Professional.
PaperClip for Windows provides its own proprietary method of linking to other applications. Although this method works, testers found that text formatting was not maintained when linking the OCR module to a Windows word processor.
Equip+ and FileMagic Plus provide print-file drivers that allow printing from other Windows applications to a file in the document image manager's file format. After printing a Microsoft Word for Windows document to the file driver, for example, you can view the Word document in the document image manager exactly as if a scanner had scanned it.
Watermark Professional can hook up in a client/server environment. It offers an image-server product based on Microsoft SQL Server for Windows NT ($2995 for the 25-user version). And PaperClip for Windows offers an SQL Server version ranging from $595 to $995 per seat, depending on the number of seats. ImageFast and FileMagic Plus plan to offer tie-ins to database servers in future releases.
No Clear Winner
Document management standards are still too much in flux and technologies are not defined well enough to pick any of the evaluated products as the clear choice for handling all document-imaging needs. But the technology is taking off fast, and three of the tested programs are worth serious consideration for any company's initial foray into document image management.
Watermark Professional is a good introduction to the category. It has an elegant interface, well-written documentation, superior OCR, and a bevy of image manipulation tools. For example, it can easily handle price lists, and when paired with a third-party database server, it emerges as the best bet for forms processing. However, unless you already have or intend to install a database server, think twice about using Watermark Professional in a high-volume forms-processing environment: Its retrieval times are slow.
PageKeeper is worth considering for handling documents that require full-text indexing, such as customer service literature. Its streamlined processing of scanning, high-quality OCR, and full-text indexing, along with its ability to highlight the relevance of search results, makes PageKeeper well suited to applications that require information access and image retrieval.
FileMagic Plus has a good interface and documentation and a nice mix of functions. It doesn't offer the OCR quality of some of its competitors, but if you are looking for a compromise between the forms prowess of Watermark Professional and the full-text retrieval of PageKeeper, FileMagic Plus fits the bill.
PRODUCT INFORMATION
Equip+ 6.0 $249
Alacrity Systems, Inc.
Hackettstown, NJ
(800) 252-2748
(908) 813-2400
Fileflo 3.6 $795
Newport Canyon Associates
Irvine, CA
(714) 833-0333
FileMagic Plus 4 Professional $1795
Westbrook Technologies, Inc.
Branford, CT
(800) 949-3453
(203) 399-7111
ImageFast 2.0 $695
ImageFast Software Systems, Inc.
McLean, VA
(703) 893-1934
PageKeeper 1.1 $595
Caere Corp.
Los Gatos, CA
(800) 535-7226
(408) 395-5148
PaperClip for Windows and Isys 3.0 $695
PaperClip Imaging Software, Inc.
Hackensack, NJ
(800) 929-3503
(201) 487-3503
Recollect 2.11 $595
MindWorks Corp.
Sunnyvale, CA
(800) 396-6463
(408) 730-2100
Watermark Professional Edition 1.0 $295
Watermark Software, Inc.
Burlington, MA
(617) 229-2600
OVERVIEW
NSTL
RATING  VERSION                             QUALITY  VERSATILITY  PERFORMANCE
***     Watermark Professional Edition 1.0  #        X            O
**      FileMagic Plus 4 series             #        O            O
**      PageKeeper 1.1                      #        X            O
**      ImageFast 2.0                       X        O            X
**      Fileflo 3.6                         X        O            O
**      Recollect 2.11                      O        #            O
**      Equip+ 6.0                          O        O            #
**      PaperClip for Windows and Isys 3.0  O        #            X

                                            EASE OF   EASE
                                            LEARNING  OF USE  PRICE
***     Watermark Professional Edition      X         X       $295
**      FileMagic Plus                      X         X       $1795
**      PageKeeper                          X         X       $595
**      ImageFast                           O         X       $695
**      Fileflo                             X         X       $795
**      Recollect                           X         X       $595
**      Equip+                              X         O       $249
**      PaperClip for Windows and Isys      O         O       $695
KEY
***** Outstanding
**** Excellent
*** Average
** Below average
* Poor
X Good
O Fair
# Unacceptable
HIGHLIGHTS
  STRENGTHS                                         LIMITATIONS
Equip+
  -- Form fill-in feature                           -- Cannot view page during batch scans
  -- Strong OCR feature support                     -- Limited document indexing
  -- Fax send/receive driver included               -- Limited document retrieval
Fileflo
  -- Numerous OCR features                          -- No automatic full-text indexing
  -- Fast retrieval                                 -- Unsatisfactory OCR quality
  -- Good print and fax output quality              -- Limited work-flow support
FileMagic Plus
  -- Numerous indexing features                     -- Limited number of OCR features
  -- Easiest to use                                 -- Unsatisfactory OCR quality
  -- OLE 1.0 client/server support
ImageFast
  -- Numerous scanning features                     -- Confusing indexing functions
  -- Fast retrieval speed                           -- Limited OCR support
  -- Strongest work-flow support                    -- Unsatisfactory OCR quality
PageKeeper
  -- Displays relevance of search results           -- Can't set resolution during scanning
  -- Integration of scanning, OCR, and indexing     -- Limited links to other applications
  -- Excellent OCR quality                          -- Limited image-editing support
PaperClip for Windows and Isys
  -- Numerous scanning features                     -- Most difficult to learn and use
  -- Good proprietary links to other applications   -- Poor OCR quality
  -- Strong retrieval feature set                   -- Poor print quality
Recollect
  -- Easiest to learn                               -- Limited links to other applications
  -- Integration of scanning, OCR, and indexing     -- Poor OCR quality
  -- Displays relevance of search results           -- Slow retrieval
Watermark Professional Edition
  -- Good documentation and user interface          -- Limited indexing features
  -- Scanning, OCR, and image features              -- Limited retrieval features
  -- OLE 1.0 and 2.0 client/server support          -- Slow retrieval without database server
SCANNING AND FILE MANAGEMENT FEATURES
DOCUMENT EDITING AND RETRIEVAL FEATURES
Top-rated Watermark Professional Edition bests its competitors' text-processing features with TextProofer, an unusually powerful program that contains an integrated spelling checker and text editor.