HSM is an increasingly popular way to control the cost of networked storage. Here's how three PC LAN-based products compare.
Barry Nance
While the cost of hard drives has dropped below 50 cents per megabyte over the past year, the cost of managing LAN data has, ironically, risen. The intangible cost of LAN data management is close to $8 per megabyte per year, according to Mike Peterson, president of research firm Peripheral Strategies (Santa Barbara, CA). Even if you discount so-called intangible costs and rely only on hard figures, the out-of-pocket cost of adding storage to a LAN can mean paying for a file-server computer, the server NOS (network operating system), a backup device for the server, and other components. These costs dwarf the price of the hard drive itself. To help reduce the cost of data management, manufacturers are beginning to offer a technology known as HSM (hierarchical storage management) on PC LANs.
Previously available on mainframes and Unix-based computers, HSM lets you automate the migration of LAN data between file-server hard drives and slower but larger-capacity devices. However, it is not a substitute for reliable backup procedures: You still must implement a backup/restore mechanism for the data on your LAN. Rather, HSM extends the storage capability of file servers. It moves older, infrequently used files from primary storage (the file server's hard drives) to secondary storage (optical read/write media and magnetic tape). The figure ``The Hierarchy of Network Storage'' shows the price, speed, and capacity trade-offs for the different types of storage media.
To state the concept in different terms, HSM provides on-line storage of frequently used files and near-line storage of other files. It automatically and transparently moves files to and from near-line storage as it extends the storage capacity of file servers. A person at a LAN workstation who accesses a migrated file incurs a slight delay lasting a few seconds to half a minute while the HSM software ``demigrates'' a file. HSM is particularly well suited for situations involving many large files (e.g., images of documents) when only a subset of those files needs to be on-line at any one time.
These products are for serious LANs. Installation and setup time is hefty, and two of the reviewed products require multiple file servers with multiple volumes on each server. You might even need to install additional RAM in your file servers; NetWare doesn't offer virtual memory management, relying on physical RAM to hold all the programs running on the server.
Know Your Place in the Hierarchy
Peripheral Strategies has identified five levels of HSM that are widely accepted as guidelines by the HSM industry. Level 1 is simple automatic migration with transparent retrieval. Level 2 adds real-time, dynamic load balancing of free disk space based on predefined thresholds. Level 2 also can manage two or more layers of near-line storage (e.g., an optical jukebox and magnetic tape library). Level 3 provides for the management of three or more layers of storage hierarchy and dynamically balances the consumption of available space in each layer. Level 4 HSM products can migrate files based on data type and other criteria, through the use of policies (rules). Products that conform to level 4 preserve ownership, attribute, and location information about files, thus allowing multiplatform (DOS, Macintosh, OS/2) HSM. Finally, level 5 identifies HSM products that can work with database manager software, such as DB2/2, NT Server, or Oracle, to migrate portions of a database (rather than an entire file) to and from secondary storage. There are, as yet, no level 5 HSM products; the most advanced of the three reviewed here implements level 4 features.
Migrating old or infrequently used files onto inexpensive media such as removable optical disks or tape not only frees up primary storage space for more current files but also reduces the average cost of storage. Thus, instead of expanding your LAN in an on-line fashion, you can use HSM to begin expanding it in a more controlled, near-line manner. HSM can also increase overall network performance by optimizing access times for the data you're most likely to need. In a complex HSM setup, you might have ultrafast cached hard drives layered above slower 10-GB single-spindle drives, which are layered on top of a 20- or 40-GB optical jukebox for near-line storage, in turn layered above a 192-GB tape library.
About the Test Environment
We evaluated three HSM products for PC LANs: Storage Migrator, from Arcada; Palindrome HSM, from Palindrome; and NetSpace, from Avail. All three are actually NLMs (NetWare loadable modules) that run on a NetWare file server. Storage Migrator and NetSpace fit approximately into level 4 of HSM, while Palindrome HSM is a level 3 program. Further research revealed new HSM and HSM-like products you'll want to be aware of (see ``The Emerging Faces of HSM''), as well as a single-user HSM program reviewed in ``A Smaller Version of Infinity''.
The NetWare 3.11 environment we created for evaluating HSM products included an ADIC 1200D DAT (digital audiotape) Autochanger, which holds 12 DDS (digital data storage) tapes with a total capacity of approximately 192 GB, and a 20-GB Hewlett-Packard 20XT optical disk drive. Both units feature hands-off operation and act as robotic librarians when retrieving files. The ADIC Autochanger and HP optical drive not only provided a good platform for HSM evaluation but are typically the hardware HSM vendors recommend to customers.
We used several criteria to measure and compare these HSM products. A good HSM implementation should support several layers of media hierarchy and offer hands-off media independence. HSM software should use a rules engine that understands capacity and time thresholds, exceptions by file type, and forced migration. It should also optimize migrations in a way that minimizes the need for remigration. And HSM software should demigrate files quickly, as fast as the secondary storage allows. All three products do all these things, but in different ways and to differing degrees.
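As a rough illustration of what such a rules engine evaluates, here is a minimal Python sketch. Everything in it is hypothetical and invented for this example: the names FileInfo, MigrationPolicy, and should_migrate, the 90-day idle period, and the exempt extensions. None of the reviewed products is known to expose an interface like this.

```python
import os
from dataclasses import dataclass, field


@dataclass
class FileInfo:
    path: str
    size_bytes: int
    days_since_access: int


@dataclass
class MigrationPolicy:
    # All values below are illustrative assumptions, not vendor defaults.
    min_idle_days: int = 90                  # time threshold
    capacity_threshold_pct: float = 90.0     # capacity threshold
    exempt_extensions: set = field(default_factory=lambda: {".exe", ".nlm"})
    forced_paths: set = field(default_factory=set)   # always migrate these


def should_migrate(f: FileInfo, volume_used_pct: float,
                   policy: MigrationPolicy) -> bool:
    """Return True if the file qualifies for migration under the policy."""
    if f.path in policy.forced_paths:
        return True                          # forced migration overrides all
    ext = os.path.splitext(f.path)[1].lower()
    if ext in policy.exempt_extensions:
        return False                         # exception by file type
    if volume_used_pct < policy.capacity_threshold_pct:
        return False                         # volume is not full enough yet
    return f.days_since_access >= policy.min_idle_days
```

The order of the checks is itself a design choice in this sketch: forced migration is tested first so that an administrator's explicit request is never blocked by a file-type exception or a threshold.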
Avail's NetSpace
The first HSM product for NetWare LANs, Avail's NetSpace 3.0, is primarily a collection of NLMs. These NLMs include an HSM Engine, a Server Monitor that displays system activity, a Media Maintenance Manager that allows changing of tape library magazines, a Device Maintenance Manager for changing near-line storage devices, a Database Recovery Manager for repairing NetSpace files after a server crash, a Backup Manager for scheduling the rotation of multiple NetSpace migration sets, and a Recall Initiator that recalls files from near-line storage.
NetSpace requires a dedicated NetWare server (which it calls the Storage Server) with at least three same-size migration partitions, or NetWare volumes. A second NetWare server, termed the Domain Server, stores the NetSpace administrative programs and holds duplicate data files for the Storage Server. NetSpace stores migrated files on the Storage Server's hard disk and, optionally, on optical media and magnetic tape.
When NetSpace migrates a file from a file server it manages, it leaves a phantom file, or placeholder, behind. NetSpace stores a 420-byte link to the actual file in secondary storage in either the placeholder file or, if you load the NetWare name-space NLM and if the backup software supports name-space extended attributes, in extended attributes. At installation time, you choose whether NetSpace should change or preserve the last modified date attribute for the placeholder file left on the file server. Preserving the date makes directory comparisons easier but prevents some network backup utilities from recognizing which files have changed.
The AVRECALL component of NetSpace is an NLM that recalls a file from secondary storage when a workstation attempts to access the file. AVRECALL's command-line parameters control the maximum number of recalls per connection per hour, whether to send recall notification messages to the workstation, and other recall behaviors. To see the notification message, a workstation must load the AVRSPND TSR program, which displays a message while demigration occurs. The TSR isn't needed to notify AVRECALL of an access to a placeholder. However, without the TSR, NetSpace users can't cancel a demigration operation once it's started.
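A cap on recalls per connection per hour can be pictured with the following Python sketch. How AVRECALL actually counts recalls is not documented here; the sliding one-hour window, the RecallLimiter class, and its allow method are assumptions made purely for illustration.

```python
from collections import defaultdict, deque


class RecallLimiter:
    """Allow at most max_per_hour recalls per connection in any 3,600-second window."""

    def __init__(self, max_per_hour: int):
        self.max_per_hour = max_per_hour
        self._history = defaultdict(deque)   # connection id -> recall timestamps

    def allow(self, connection_id: int, now: float) -> bool:
        recent = self._history[connection_id]
        # Drop timestamps that have aged out of the one-hour window.
        while recent and now - recent[0] >= 3600:
            recent.popleft()
        if len(recent) >= self.max_per_hour:
            return False                     # limit reached; refuse this recall
        recent.append(now)
        return True
```

A limit of this kind keeps a single workstation, for example one running a directory-wide search, from flooding the near-line devices with recall requests.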
NetSpace ensures that sufficient space always exists for migrated files and (like Palindrome HSM) can temporarily move files to off-line storage when file-server disk space runs low. NetSpace automatically queues the off-line data for a file-restoration operation when server disk space increases. NetSpace can also allow viewing, browsing, or searching (but not altering) of migrated files without permanently recalling those files to on-line storage. After the file browse or search operation, NetSpace returns the file to hierarchical storage.
Another NetSpace NLM, AVLOGMON, runs on managed file servers and allows backup utilities and virus-protection programs to open placeholder files without causing the actual file to be recalled from secondary storage. AVLOGMON can also temporarily disable migration during backup and restore procedures, thus ensuring that a backup or restore operation occurs when the server's files are in a consistent state.
Palindrome HSM
Palindrome's Network Archivist (PNA) backup utility software is well known among LAN administrators, and the company's HSM product acts as an extension to PNA. Palindrome HSM 3.1a adds automatic file retrieval and multiple media-type support to PNA, which must be installed on the same server as the HSM components. However, the HSM component doesn't require a dedicated NetWare server.
Palindrome HSM consists of NLMs and DOS/Windows software. The Volume Monitor NLM, PALVMON, performs several tasks. It monitors disk-space use, maintains lists of files eligible for migration (Palindrome calls them prestaged lists), converts migrated files into placeholder files, and notifies an administrator of HSM error and alert conditions. The Volume Monitor delegates some tasks, such as the actual migration operation, to the HSM Engine NLM, PALVENG. Another NLM, the Recall Server (PALRECAL), receives requests to move files from secondary to primary storage. The Archivist Queue Server NLM, PALQSVR, performs the actual demigration of the file.
An administrator can configure Palindrome HSM to migrate files as soon as a NetWare volume begins to run out of space (a condition called Event Migration), and can specify the amount of disk space associated with the event. The default high watermark is 90 percent full. Migration continues until the prestaged list of eligible files is exhausted or available primary storage increases to a specified low-watermark level (the default is 80 percent).
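The watermark logic just described can be sketched in a few lines of Python. This is an illustration only, not Palindrome's code: the function name and arguments are hypothetical, the sketch assumes each migrated file frees its full size (that is, the placeholder left behind takes negligible space), and whether the real comparisons are inclusive is also an assumption.

```python
def migrate_until_low_watermark(used_bytes, capacity_bytes, prestaged,
                                high_pct=90.0, low_pct=80.0):
    """Simulate one event-migration pass.

    prestaged is a list of (name, size_bytes) pairs, already ordered by
    eligibility. Returns (names migrated, bytes still used).
    """
    migrated = []
    if 100.0 * used_bytes / capacity_bytes < high_pct:
        return migrated, used_bytes          # high watermark not reached
    for name, size in prestaged:
        if 100.0 * used_bytes / capacity_bytes <= low_pct:
            break                            # low watermark reached; stop
        used_bytes -= size                   # file's space is freed on the volume
        migrated.append(name)
    return migrated, used_bytes
```

For instance, on a 1,000-byte toy volume holding 920 bytes, with four 50-byte files on the prestaged list, this pass migrates three files and stops at 770 bytes used, because the volume is then below the 80 percent low watermark.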
The administrator can also direct Palindrome HSM to migrate files to secondary storage at a particular time of day on one or more days (a procedure termed Scheduled Migration). In this mode, Palindrome HSM begins migrating files without regard for how much disk space is left. Scheduled migration proceeds until the list of eligible files is exhausted or the amount of free space increases to the specified low watermark.
The administrator instructs Palindrome HSM to use one of three strategies in building the list of eligible files: least recently used, largest file, and most eligible. A file is most eligible if its last access date is prior to the last access date of other eligible files; most eligible status puts such a file near the top of the list.
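Two of these list-building strategies amount to simple sort orders, as the following Python sketch suggests. The Candidate type, the build_prestaged_list function, and the strategy strings are hypothetical names chosen for this example. The most eligible strategy is left out because the description above does not give enough detail to distinguish its ordering.

```python
from operator import attrgetter
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str
    size_bytes: int
    last_access_day: int   # day number of last access; smaller means older


def build_prestaged_list(files, strategy):
    """Order eligible files so the best migration candidates come first."""
    if strategy == "least_recently_used":
        # Oldest last-access date first.
        return sorted(files, key=attrgetter("last_access_day"))
    if strategy == "largest_file":
        # Biggest files first, to free the most space per migration.
        return sorted(files, key=attrgetter("size_bytes"), reverse=True)
    raise ValueError("unknown strategy: " + strategy)
```

The trade-off is the usual one: ordering by age keeps recently touched files on-line and so reduces recalls, while ordering by size reaches the low watermark after migrating fewer files.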
During migration, Palindrome HSM by default leaves a zero-byte placeholder on the file server. Palindrome HSM's demigration of files (the recall process) uses a combination of workstation and server software; the workstation software intercepts a file access operation performed on a placeholder and sends a request to the Recall Server NLM running on the server. DOS-based workstations load an 11-KB TSR agent to intercept file operations, while Windows workstations use a Windows VxD (virtual device driver) to watch for accesses to placeholders.
The TSR can be a problem in memory-constrained DOS workstations (though Palindrome notes the TSR can be loaded high) and supports only a single DOS session on OS/2-based workstations. The TSR does, nonetheless, give impatient users a chance to cancel demigration of the file. Palindrome says it is working on a version of HSM containing an NLM that notices accesses to migrated files without depending on recall notification by a TSR agent. As of now, Palindrome HSM is the only one of the three products reviewed that requires a TSR to handle demigration. (Regardless, you can optionally use Palindrome HSM's File Manager program to manually request demigration of files.)
Arcada's Storage Migrator
Previously sold by Conner Peripherals as Conner HSM, Arcada's Storage Migrator 3.0 is a modified, earlier version of Avail's NetSpace. Arcada adds to Avail's software its own InfiNET View graphical tool for tracking and managing migrated files across optical-disk and magnetic-tape media. InfiNET View provides administrators with information about file location, server use, and jukebox use (including remaining free space). The utility can show an administrator when files last migrated from one location to another, for example. Storage Migrator also includes some network management tools for producing reports and statistics on system storage operations.
As you'd expect from its ancestry, Storage Migrator is very similar to NetSpace in both architecture and daily operation. However, Storage Migrator lacks the ability to move files temporarily to removable off-line storage, and it also doesn't distinguish between mere file viewing (or searching) of migrated files and recall of a file for update purposes.
Arcada plans soon to release an upgrade that won't require a dedicated server and will add integration with the company's Backup Exec software.
Assessing the Early Crop
HSM is an emerging technology for PC LANs, and these products show the immaturity of HSM in the PC environment. As yet, NetWare-based HSM products don't take into account data management on application servers or workstation hard drives. Many HSM products lack sufficient file-by-file migration rules and don't offer centralized management of backup, HSM, and archiving procedures (excepting backup programs, such as PNA, that have recently added HSM features). More practically, perhaps, applications such as Microsoft Word for Windows can take hours to retrieve and display summary information when listing migrated files in File Open dialog boxes.
Avail's NetSpace is the best HSM implementation of the three products we evaluated. Unlike Palindrome HSM, NetSpace doesn't require a TSR recall notification agent, does a good job of supporting extended attributes (NetWare name spaces), and is easy to administer. However, if you already use the popular Network Archivist product from Palindrome, you might want to buy Palindrome HSM; it's a natural extension of PNA. But NetSpace is the clear winner if you want to use near-line storage to augment on-line storage that's growing by leaps and bounds.
ABOUT THE PRODUCTS
Infinite Disk 2.1 $129
Chili Pepper Software
1630 Pleasant Hill Rd.
Suite 180-200
Duluth, GA 30136
(800) 395-1812
(404) 339-1812
fax: (404) 513-7411
NetSpace 3.0 $2749
Avail Systems
4760 Walnut St.
Boulder, CO 80301
(800) 962-8245
(303) 444-4018
fax: (303) 546-4219
Palindrome HSM 3.1a $2995
Palindrome Corp.
600 East Diehl Rd.
Naperville, IL 60563
(800) 288-4912 ext. 375
(708) 505-3300
fax: (708) 505-7917
Storage Migrator 3.0 $7500
(for two managed NetWare servers)
Arcada Software
37 Skyline Dr.
Suite 1101
Lake Mary, FL 32746
(800) 327-2232
(407) 333-7500
fax: (407) 333-7770
TapeDisk $249.95
TapeDisk Corp.
85 Cove Lane
Oshkosh, WI 54901
(800) 827-3372
(715) 235-3388
fax: (715) 235-3818
HOW HSM PRODUCTS COMPARE
                                                            ARCADA'S
                                  AVAIL'S     PALINDROME    STORAGE
                                  NETSPACE    HSM           MIGRATOR
Demigration speed* (in seconds)
-- Optical jukebox                3.1         3.4           3.1
-- Tape library                   12.0        12.0          12.0
Temporary recall for browsing     X           O             O
Temporary off-line storage        X           X             O
Requires dedicated server         X           O             X
Needs TSR to cause demigration    O           X             O
X = yes
O = no
*In the demigration speed test, we forced a file onto the optical jukebox
(using the same disk for each test) and then onto the magnetic tape
library (the same tape cassette in the same position in the tape magazine).
We then measured retrieval times.
HSM technology lets network administrators develop a customized strategy for tiered storage. The main control screen in Arcada's Storage Migrator lets you establish rules for file migration, such as high/low ``watermarks'' for each type of storage (shown here as cylinders) and the ages and sizes of files to be migrated.
HSM's goal is to store network data on the lowest-cost device that meets performance requirements. At the top of the hierarchy are hard disks, with their fast access times, high per-megabyte costs, and relatively low capacities. HSM products automate the process of migrating less active files down the hierarchy to larger, slower, less expensive media, such as optical disks and tape drives.
Barry Nance is a consulting editor at BYTE and has been a programmer for 20 years. He is the author of Using OS/2 Warp 3.0 (Que, 1994), Introduction to Networking (Que, 1994), and Client/Server LAN Programming (Que, 1994). You can reach him via BIX or the Internet at barryn@bix.com.