
Document Image Managers


May 1995 / Reviews / Document Image Managers

Although standards are in flux and product categories are blurred, one of these eight packages may be just right for electronically managing your most important documents

David Seachrist

Interest in using desktop computers to manage intellectual property has taken the personal computer industry by storm. In many organizations, documents are replacing applications as the central user-interface metaphor.

Not surprisingly, a broad category of document management software has arisen to serve this growing demand. However, despite some marketing claims, no one product can meet everyone's document management needs. In fact, we can divide document management into three subcategories:

-- Dynamic document management software (sometimes called ad hoc document management) focuses largely on managing documents that are still in the creation, editing, or production stages. Products such as PC Docs Open, SoftSolutions, Visual Recall, and WorldView fall into this category.

-- Document image management deals with managing documents that are already complete, making content creation and formatting control moot.

-- Document exchange programs focus largely on making electronic documents more portable. The goal is to reduce handling of paper documents by using formatted electronic versions that you can open and print in different applications. Adobe Acrobat, Envoy, Replica, and Common Ground are in this category.

For this software roundup, NSTL focused on document image management for Windows environments. To qualify, a product had to run as a native application under Windows, support scanning on a Hewlett-Packard ScanJet IIc, perform OCR, allow search and retrieval using the Boolean AND operator, provide a go-to-page tool for quick document navigation, and allow printing.

Rating Usability

NSTL used a document library of 17 issues of Software Digest Ratings Report as the basis for testing the eight programs' ease of learning and use in basic functions, such as scanning, OCR, indexing, searching and retrieving, and printing. In addition, NSTL rated the programs for usability in installation, documentation, library setup, and maintenance. In general, the programs that provided the best interface design and documentation were the ones that rated highest in NSTL's usability evaluations. Westbrook Technologies' FileMagic Plus 4 Professional, Caere's PageKeeper 1.1, MindWorks' Recollect 2.11, and Watermark Software's Watermark Professional Edition 1.0 all achieved better-than-average usability.

Recollect was the easiest to learn. It has an intuitive interface and documentation that is short, well designed, and to the point. The biggest factor in its ease of learning is that many of its processes are automatic and don't require separate steps.

A lack of functions also makes some programs easier to learn. Reduced feature sets often translate into shorter learning curves while causing a corresponding drop in ease of use. Such is the case with both Watermark Professional and Alacrity Systems' Equip+.

PageKeeper is like Recollect in combining OCR and indexing in a single process. However, installing PageKeeper requires more decision points than seems necessary, and its library setup is not as easy as Recollect's.

The tutorial in Newport Canyon Associates' Fileflo 3.6 is too bound to the manual and gives short shrift to key topics like OCR and document indexing. Furthermore, Fileflo's installation process is longer and requires the most decision making of any of the tested programs.

FileMagic Plus has the best learning materials, with separate tutorials for systems administrators, database administrators, and users. However, its problematic installation included false low-memory reports and an incorrect password sequence. (Westbrook Technologies promptly provided a maintenance release to fix these problems, but we had to reinstall the software.) In addition, library setup and indexing of documents were harder to learn than in the other programs.

The tutorial in ImageFast Software Systems' ImageFast 2.0 covers starting the program, opening and creating drawers and folders, scanning, and indexing via standard forms, but it offers no coverage of OCR. The biggest learning difficulties result from the program's two separate indexing and search engine options. This requires more study than the programs with a single indexing and search engine.

PaperClip Imaging Software's PaperClip for Windows has a tutorial that covers scanning, setting up applications to work with PaperClip, viewing images, and retrieving documents. It offers no coverage of OCR and little on how to set up folders and drawers. Both OCR and library setup are difficult to learn.

Ease of Use

FileMagic Plus's interface amenities and flexibility make it one of the two easiest programs to use. We were particularly impressed with the program's search interface and OCR options, both of which let you select the storage location of text within document files.

Watermark Professional's user interface also makes it a winner in ease of use. Although its search engine and limited indexing options are weak, its help system and OCR interface are the best of any of the programs.

We found Recollect and PageKeeper to be slightly harder to use than Watermark Professional or FileMagic Plus. Neither program allows naming and saving of search criteria the way Watermark Professional and FileMagic Plus do.

Fileflo's indexing capabilities are less flexible than those of some of the other programs, and although it lets you save search criteria, its search interface is not as intuitive as the interfaces of other programs.

With ImageFast, you have to switch search engines, choosing either the internal search engine or the run-time version of Microsoft Access's Database Wizard, depending on the type and speed requirements of the search. This is cumbersome.

Equip+'s lack of a search engine and its scanning and OCR interface make it less flexible for volume document scanning and indexing than most of the other programs. Finally, the lowest rated product for ease of use, PaperClip for Windows, has a cumbersome OCR function, and its indexing routine takes too many steps.

Scanning Horizons

Scanning is the method by which these programs transform documents into images, graphical bit maps that act as snapshots of the documents' pages. Two issues are important here: breadth of support of scanner types and image formats and the ability to control scan quality.

Although some programs ship with more scanner drivers than others, support for different brands of scanners is generally good in all the programs. The Scanning and File Management Features table shows the approximate number of scanners supported by each program.

Features like image file compression, document feeder support, and the option to designate a separator page (usually a blank page) as the division between documents when performing batch scanning are available in all the products. These features are important in environments that require scanning multipage documents or that have a high volume of single-page documents.
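The separator-page idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any of the reviewed products works: each scanned page is represented by its recognized text, a blank page counts as the separator, and the function name `split_batch` is hypothetical.

```python
from typing import Iterable, List


def split_batch(pages: Iterable[str], separator: str = "") -> List[List[str]]:
    """Group a scanned batch into documents, cutting at each separator page.

    A page whose stripped text equals `separator` (blank by default) ends
    the current document and is itself discarded.
    """
    documents: List[List[str]] = []
    current: List[str] = []
    for page in pages:
        if page.strip() == separator:
            if current:  # skip leading or repeated separator pages
                documents.append(current)
                current = []
        else:
            current.append(page)
    if current:  # the last document has no trailing separator
        documents.append(current)
    return documents


# Hypothetical batch: two blank pages in a row should not create an empty document.
batch = ["invoice p1", "invoice p2", "", "memo p1", "", "", "report p1"]
for number, document in enumerate(split_batch(batch), start=1):
    print(number, document)
```

Discarding the separator itself keeps blank sheets out of the stored documents, which matters when the batch is later indexed or printed.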

Support for scanning gray scales and color images into documents varies among the programs, and only Recollect and FileMagic Plus currently support both (Watermark Software planned to release a version adding color support to Watermark Professional in April). Regardless, many of the OCR modules only work with black-and-white images.

Watermark Professional, ImageFast, and PaperClip for Windows all correct skewed pages. Deskewing slightly shifts pages that are placed improperly on the scanner flatbed.

PageKeeper does not permit adjusting the scanner resolution, which determines the level of image detail; the higher the resolution, the greater the image detail. Resolution also affects image storage size and quality, so it is useful to be able to alter this setting.
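To see why resolution has such an effect on storage, consider the size of an uncompressed bit map: it grows with the square of the resolution. The following Python sketch works through the arithmetic. The page dimensions and bit depths are assumptions chosen for illustration, and real image files will usually be smaller because all the reviewed products support compression.

```python
def uncompressed_bytes(width_in: float, height_in: float,
                       dpi: int, bits_per_pixel: int) -> int:
    """Return the raw size in bytes of a scanned page before compression."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8  # round up to a whole number of bytes


# Assumed example: a letter-size page (8.5 by 11 inches).
for dpi in (150, 300):
    for label, depth in (("black-and-white", 1), ("24-bit color", 24)):
        size = uncompressed_bytes(8.5, 11, dpi, depth)
        print(f"{dpi} dpi, {label}: {size:,} bytes")
```

Doubling the resolution from 150 to 300 dpi roughly quadruples the raw size, which is why a fixed scanner resolution limits your control over the trade-off between detail and disk space.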

PageKeeper and Equip+ lack the ability to view individual pages during the scan operation. This option is useful for seeing each page during a batch scan to determine whether any pages need to be rescanned.

More than OCR

The ability to change a page image into editable text directly affects the types of documents a document image manager can process. The most important feature for streamlining the OCR process is a program's ability to differentiate between the graphical and text portions of a page.

Some programs' OCR modules come with the ability to detect snaking columns and maintain the reading sequence of the text. Of the eight, ImageFast and FileMagic Plus do not. It is also useful to be able to compensate for tabular matter, such as tables of numbers, by inserting tabs or other delimiting characters between values. PaperClip for Windows, ImageFast, and FileMagic Plus lack this ability.
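As a rough illustration of delimiting tabular matter, the Python sketch below replaces the gaps between column values with tabs. It assumes that the recognized text preserves column gaps as runs of two or more spaces, which is a simplification; the function name `delimit_columns` is hypothetical, and the reviewed OCR modules may detect columns quite differently.

```python
import re


def delimit_columns(line: str, delimiter: str = "\t") -> str:
    """Replace each run of two or more spaces with a single delimiter."""
    return re.sub(r" {2,}", delimiter, line.strip())


# Hypothetical OCR output for two rows of a price list.
rows = [
    "Widget, large     12      4.50",
    "Widget, small      8      2.25",
]
for row in rows:
    print(delimit_columns(row).split("\t"))
```

Because a single space is left alone, multiword values such as "Widget, large" stay in one column.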

Beyond page formatting is text-character formatting. Some programs attempt to maintain formatting characteristics, such as boldfacing, and others simply maintain the text and lose all the formatting attributes. The OCR modules in Equip+, Fileflo, PageKeeper, and Watermark Professional come with file format conversion filters that allow saving text and formatting in popular word processor file formats.

Despite the fact that virtually every computer user has a word processing program with a spelling checker, Equip+, PageKeeper, and Watermark Professional offer an optional spelling checker as a part of their OCR operations. Watermark Professional's TextProofer program is especially powerful, containing an integrated spelling checker and text editor.

Indexing: The Crux of the Matter

Once a document has been scanned and saved as an image file and OCR has been performed, the next step is to identify the document so it will be easy to find. This process is called indexing.

The simplest indexing option is to name the document and attach fields that contain information about the document. Such manual indexing is available, using varying methods, in all the programs.

Fileflo, FileMagic Plus, ImageFast, PaperClip for Windows, and Watermark Professional provide multiple fields that users can customize with names and data types. Watermark Professional is the least flexible in this regard, leaving most of this function to a database server that can be connected at an additional cost.

To perform manual indexing in Recollect and PageKeeper, you assign tag notes, the electronic equivalent of sticky notes. The problem with using tag notes is they are a haphazard means of indexing compared to fields, which prompt users for specific types of information. Equip+ allows filenames that are restricted to the DOS 8.3 naming scheme. In addition, each file has a single description field that can contain up to 127 characters.

The way OCR modules and indexing functions communicate is key to the types of documents suited for processing by each program. NSTL separates OCR and indexing integration into two distinct levels of functionality: OCR-driven and automatic full-text indexing.

Products with OCR-driven indexing are capable of capturing OCR data and placing it in appropriate fields during indexing. Because OCR-driven indexing uses fields as its organizing technique, it is the best indexing method for forms-based applications. Equip+, Fileflo, FileMagic Plus, ImageFast, and PaperClip for Windows provide OCR-driven indexing. Equip+ is hampered by the fact that it has only one field into which it can receive data. The other programs can direct OCR processed text into multiple fields.
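The following Python sketch illustrates the general idea of OCR-driven indexing: recognized text is scanned for known labels, and the values that follow them are placed in named index fields. The form layout, field names, and patterns are all hypothetical, and the review does not describe how the tested products locate field data, so treat this as one possible approach rather than a description of any product.

```python
import re
from typing import Dict, Optional

# Hypothetical index fields and the label patterns that precede their values.
FIELD_PATTERNS = {
    "invoice_number": re.compile(r"Invoice\s+No\.?\s*:?\s*(\w+)", re.IGNORECASE),
    "date": re.compile(r"Date\s*:?\s*(\d{1,2}/\d{1,2}/\d{2,4})", re.IGNORECASE),
}


def index_fields(ocr_text: str) -> Dict[str, Optional[str]]:
    """Fill each index field from OCR text; a field is None if not found."""
    record: Dict[str, Optional[str]] = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(ocr_text)
        record[name] = match.group(1) if match else None
    return record


# Hypothetical recognized text from a scanned form.
page_text = "ACME Corp\nInvoice No: A1047\nDate: 5/12/95\nTotal due $310.00"
print(index_fields(page_text))
```

A program limited to a single receiving field, as Equip+ is, would correspond to a `FIELD_PATTERNS` table with only one entry.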

Automatic full-text indexing involves indexing all the text captured during the OCR operation without using fields. Indexing an entire document's content for retrieval is best done with full-text indexing. FileMagic Plus, ImageFast, PageKeeper, PaperClip for Windows (via its bundled full-text indexing add-on, Isys), and Recollect offer this option.
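A common way to implement full-text indexing is an inverted index, which maps every word to the documents that contain it. The review does not say how the tested products build their indexes, so the Python sketch below is only a minimal illustration; the document IDs and the `build_index` name are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Set


def build_index(documents: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each lowercased word to the set of document IDs containing it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(doc_id)
    return dict(index)


# Hypothetical OCR text for three scanned documents.
library = {
    "doc1": "Scanner setup and calibration",
    "doc2": "Fax driver release notes",
    "doc3": "Scanner driver troubleshooting",
}
index = build_index(library)
print(sorted(index["scanner"]))
print(sorted(index["driver"]))
```

No fields are involved: every word that OCR recovers becomes searchable, which is what makes this method suited to whole-document retrieval rather than forms.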

Search and Retrieve

To locate documents, document image managers need powerful search tools that are easy to use. Programs like PageKeeper and Recollect can be helpful to researchers because their search modules rank results and use graphics to home in on appropriate data. This approach is not as useful for form retrieval, where it's important to locate the exact numbered form, not a range of close approximations.

In programs with full-text indexing, you should look for features like fuzzy searching, proximity searching, and ranking and sorting of search results. Fuzzy searching is the ability to set how closely the search results must adhere to the spelling of the search criteria. Proximity searching finds only documents that contain two words within a certain distance of each other. ImageFast and PaperClip for Windows (via Isys) are the only full-text indexing programs that support proximity searches. PageKeeper is the only full-text indexing program that lacks fuzzy searching. FileMagic Plus lacks the ability to rank search results.
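Fuzzy and proximity searching can both be sketched briefly in Python. The fuzzy test below uses the standard library's `difflib.SequenceMatcher` similarity ratio with an adjustable threshold, and the proximity test measures distance in words. Both choices are assumptions made for illustration; the reviewed products may define spelling closeness and distance differently.

```python
import re
from difflib import SequenceMatcher
from typing import List


def tokenize(text: str) -> List[str]:
    """Split text into lowercased words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def fuzzy_match(term: str, text: str, threshold: float = 0.8) -> bool:
    """True if any word in text is at least `threshold` similar to term."""
    term = term.lower()
    return any(
        SequenceMatcher(None, term, word).ratio() >= threshold
        for word in tokenize(text)
    )


def within_proximity(first: str, second: str, text: str, max_distance: int) -> bool:
    """True if the two words occur within max_distance words of each other."""
    words = tokenize(text)
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    return any(
        abs(i - j) <= max_distance
        for i in first_positions
        for j in second_positions
    )


# An OCR misreading ("scaner") still matches when the threshold allows it.
print(fuzzy_match("scanner", "The scaner jammed"))
print(within_proximity("image", "storage", "Resolution affects image storage size", 1))
```

Fuzzy matching is particularly useful with OCR text, where recognition errors produce misspellings that an exact search would miss. Raising `threshold` toward 1.0 makes the match stricter.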

Boolean operators (AND, NOT, and OR) allow fine-tuning or broadening of search criteria by including or excluding documents based on one or two search words. All eight programs allow this type of searching to some extent. However, Watermark Professional can only perform AND-style Boolean searches on keyword fields and date fields. Equip+ and PageKeeper do not allow the NOT operator. Fileflo allows NOT in keyword searches but not in searches of document index fields.
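Boolean searching maps naturally onto set operations. In the Python sketch below, each word is associated with the set of documents that contain it, and the three operators become intersection, union, and difference. The index contents and function names are hypothetical, and the sketch says nothing about how any reviewed product stores its index.

```python
from typing import Dict, Set

# Hypothetical index: word -> IDs of the documents that contain it.
INDEX: Dict[str, Set[str]] = {
    "scanner": {"doc1", "doc2", "doc3"},
    "color": {"doc2", "doc3"},
    "fax": {"doc3", "doc4"},
}


def lookup(word: str) -> Set[str]:
    """Return the documents containing word (empty set if unknown)."""
    return INDEX.get(word.lower(), set())


def search_and(first: str, second: str) -> Set[str]:
    """Documents containing both words; narrows the result."""
    return lookup(first) & lookup(second)


def search_or(first: str, second: str) -> Set[str]:
    """Documents containing either word; broadens the result."""
    return lookup(first) | lookup(second)


def search_not(first: str, second: str) -> Set[str]:
    """Documents containing the first word but not the second."""
    return lookup(first) - lookup(second)


print(sorted(search_and("scanner", "color")))
print(sorted(search_or("color", "fax")))
print(sorted(search_not("scanner", "fax")))
```

A program without the NOT operator can include documents but has no direct way to exclude them, so its users must weed out unwanted hits by hand.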

Image Manipulation

All the programs allow viewing of the OCR processed text and the image of a page in side-by-side windows (except for PageKeeper, which offers viewing of the image or the text but not both at the same time). All also have ample magnification and navigation tools for zooming in on portions of pages or jumping to specific pages within documents. Each program also allows attaching note text to a page.

However, if you want to draw images on the page, your options are more limited. Equip+, FileMagic Plus, and Watermark Professional are the only programs to provide adequate bit-map painting tools. FileMagic Plus and Watermark Professional even offer highlighting pens to visually draw attention to important passages. ImageFast provides users with the ability to draw rectangles, and PaperClip for Windows can link to a third-party paint application.

The Distribution System

Once you view a document and perhaps attach a note to it, you would then either put the document away, print it, or pass it on to another person. Work flow is the term given to this routing of documents to other workers on your network.

ImageFast provides its own E-mail and distribution module called WorkFast. In addition to allowing documents to be distributed to other users' mailboxes, WorkFast lets you send E-mail, attach priority to documents, and mark them regarding their completion status.

A special network version of PaperClip for Windows provides the same capabilities as WorkFast minus the status feature. The status marking capability was to be a part of a PaperClip for Windows Workflow add-on planned for release in April.

FileMagic Plus and Watermark Professional offer routing and the ability to access MAPI and VIM-compatible (Vendor-Independent Messaging) software, which, in turn, allows access to Lotus Notes. PageKeeper offers routing and its own E-mail system. Equip+ uses an inbox/outbox metaphor to send and receive faxes. It is the only program to come with its own fax/modem driver software, and you can set it up to receive and send faxes unattended. All other programs use a fax driver, such as Delrina Software's WinFax, to send and receive faxes.

The Wider World

At some point, users will want other Windows applications to dynamically interact with their document image manager. But our findings in that regard are rather disappointing. Only two of the programs are able to act as OLE clients and servers: FileMagic Plus and Watermark Professional. Of these, only Watermark Professional supports OLE 2.0. DDE links are supported in Equip+, FileMagic Plus, ImageFast, and Watermark Professional.

PaperClip for Windows provides its own proprietary method of linking to other applications. Although this method works, testers found that text formatting was not maintained when linking the OCR module to a Windows word processor.

Equip+ and FileMagic Plus provide print-file drivers that allow printing from other Windows applications to a file in the document image manager's file format. After printing a Microsoft Word for Windows document to the file driver, for example, you can view the Word document in the document image manager exactly as if a scanner had scanned it.

Watermark Professional can hook up in a client/server environment. It offers an image-server product based on Microsoft SQL Server for Windows NT ($2995 for the 25-user version). And PaperClip for Windows offers an SQL Server version ranging from $595 to $995 per seat, depending on the number of seats. ImageFast and FileMagic Plus plan to offer tie-ins to database servers in future releases.

No Clear Winner

Document management standards are still too much in flux and technologies are not defined well enough to pick any of the evaluated products as the clear choice for handling all document-imaging needs. But the technology is taking off fast, and three of the tested programs are worth serious consideration for any company's initial foray into document image management.

Watermark Professional is a good introduction to the category. It has an elegant interface, well-written documentation, superior OCR, and a bevy of image manipulation tools. For example, it can easily handle price lists, and when paired with a third-party database server, it emerges as the best bet for forms processing. However, unless you already have or intend to install a database server, think twice about using Watermark Professional in a high-volume forms-processing environment: Its retrieval times are slow.

PageKeeper is worth considering for handling documents that require full-text indexing, such as customer service literature. Its streamlined processing of scanning, high-quality OCR, and full-text indexing, along with its ability to highlight the relevance of search results, makes PageKeeper well suited to applications that require information access and image retrieval.

FileMagic Plus has a good interface and documentation and a nice mix of functions. It doesn't offer the OCR quality of some of its competitors, but if you are looking for a compromise between the forms prowess of Watermark Professional and the full-text retrieval of PageKeeper, FileMagic Plus fits the bill.


PRODUCT INFORMATION


Equip+ 6.0                              $249

Alacrity Systems, Inc.
Hackettstown, NJ
(800) 252-2748
(908) 813-2400


Fileflo 3.6                             $795

Newport Canyon Associates
Irvine, CA
(714) 833-0333


FileMagic Plus 4 Professional           $1795

Westbrook Technologies, Inc.
Branford, CT
(800) 949-3453
(203) 399-7111


ImageFast 2.0                           $695

ImageFast Software Systems, Inc.
McLean, VA
(703) 893-1934


PageKeeper 1.1                          $595

Caere Corp.
Los Gatos, CA
(800) 535-7226
(408) 395-5148


PaperClip for Windows and Isys 3.0      $695

PaperClip Imaging Software, Inc.
Hackensack, NJ
(800) 929-3503
(201) 487-3503


Recollect 2.11                          $595

MindWorks Corp.
Sunnyvale, CA
(800) 396-6463
(408) 730-2100


Watermark Professional Edition 1.0      $295

Watermark Software, Inc.
Burlington, MA
(617) 229-2600


OVERVIEW


NSTL
RATING                                VERSION  QUALITY VERSATILITY PERFORMANCE

***     Watermark Professional Edition  1.0       #         X           O
**      FileMagic Plus                  4 series  #         O           O
**      PageKeeper                      1.1       #         X           O
**      ImageFast                       2.0       X         O           X
**      Fileflo                         3.6       X         O           O
**      Recollect                       2.11      O         #           O
**      Equip+                          6.0       O         O           #
**      PaperClip for Windows and Isys  3.0       O         #           X


                                         EASE OF   EASE
                                         LEARNING  OF USE    PRICE

***     Watermark Professional Edition       X       X       $295
**      FileMagic Plus                       X       X       $1795
**      PageKeeper                           X       X       $595
**      ImageFast                            O       X       $695
**      Fileflo                              X       X       $795
**      Recollect                            X       X       $595
**      Equip+                               X       O       $249
**      PaperClip for Windows and Isys       O       O       $695


KEY
*****   Outstanding
****    Excellent
***     Average
**      Below average
*       Poor
X       Good
O       Fair
#       Unacceptable



HIGHLIGHTS

                    
               STRENGTHS                         LIMITATIONS

Equip+         -- Form fill-in feature           -- Cannot view page during 
                                                      batch scans
               -- Strong OCR feature support     -- Limited document indexing
               -- Fax send/receive driver 
                    included                     -- Limited document retrieval


Fileflo        -- Numerous OCR features          -- No automatic full-text 
                                                      indexing
               -- Fast retrieval                 -- Unsatisfactory OCR quality
               -- Good print and fax output      -- Limited work-flow support
                    quality


FileMagic      -- Numerous indexing features     -- Limited number of OCR
Plus                                                  features
               -- Easiest to use                 -- Unsatisfactory OCR quality
               -- OLE 1.0 client/server 
                    support


ImageFast      -- Numerous scanning features     -- Confusing indexing 
                                                      functions
               -- Fast retrieval speed           -- Limited OCR support
               -- Strongest work-flow support    -- Unsatisfactory OCR quality


PageKeeper     -- Displays relevance of search   -- Can't set resolution 
                    results                           during scanning
               -- Integration of scanning, OCR,  -- Limited links to other 
                    and indexing                      applications
               -- Excellent OCR quality          -- Limited image-editing 
                                                      support

PaperClip      -- Numerous scanning features     -- Most difficult to learn
for                                                   and use
Windows        -- Good proprietary links to      -- Poor OCR quality
and Isys            other applications
               -- Strong retrieval feature set   -- Poor print quality


Recollect      -- Easiest to learn               -- Limited links to other 
                                                      applications
               -- Integration of scanning, OCR,  -- Poor OCR quality
                    and indexing
               -- Displays relevance of search   -- Slow retrieval
                    results


Watermark      -- Good documentation and user    -- Limited indexing features
Professional        interface
Edition        -- Scanning, OCR, and image       -- Limited retrieval features
                    features
               -- OLE 1.0 and 2.0 client/server  -- Slow retrieval without 
                    support                           database server


SCANNING AND FILE MANAGEMENT FEATURES

illustration_link (13 Kbytes)


DOCUMENT EDITING AND RETRIEVAL FEATURES

illustration_link (13 Kbytes)


Watermark Professional Edition

screen_link (28 Kbytes)

Top-rated Watermark Professional Edition bests its competitors' text-processing features with TextProofer, an unusually powerful program that contains an integrated spelling checker and text editor.

