Data Compression on the Macintosh


February 1994 / Cover Story / Data Compression on the Macintosh

Macintosh users have thus far escaped most of the controversy over disk compression that haunts PC users. For one thing, disk compression isn't a standard feature of the Mac operating system as it is with the latest versions of DOS. It's like the PC world before DoubleSpace: You have to buy a third-party product to get compression, so it tends to attract users who are more aware of the trade-offs.

Also, for various reasons, Mac software doesn't require as much disk space as Windows software. Many Macs are still sold with 80-MB hard drives--woefully small by today's PC standards but plenty large for the average Mac user.

Nevertheless, data compression is very much in demand. High-end Mac users tend to be graphic-arts professionals who handle truly huge files: A single scanned photograph might easily require 50 MB. File-level compressors have been popular for years, and the Mac counterparts of PKZip include StuffIt Deluxe and StuffIt SpaceSaver from Aladdin Systems (Watsonville, CA); DiskDoubler and AutoDoubler from Symantec (Cupertino, CA); More Disk Space from Alysis Software (San Francisco, CA); Now Compress from Now Software (Portland, OR); and Compact Pro, a shareware program by Bill Goodman from Cyclos (San Francisco, CA).

Unlike PKZip, however, most file-level compressors on the Macintosh can work transparently, automatically compressing and decompressing files as they're opened and closed. In fact, some of these programs constantly scan the disk for uncompressed files and automatically compress them during idle times. Control panels let you decide whether all files on a disk should be compressed or only certain files and folders.

Real-time disk compressors are fairly new on the Mac. Unlike file-level compressors, they install themselves at the device-driver level, similar to disk compressors on the PC. There's one important difference, however: On the Mac, device drivers automatically load into memory from all storage media on the SCSI chain during startup or when a removable disk is mounted. In other words, the compression software is tied to the media, not to the machine. So a service bureau, for example, can read a compressed Syquest disk without installing the compression software on its system.

There are three driver-level products available for the Macintosh: eDisk from Alysis Software; Stacker from Stac Electronics (Carlsbad, CA); and TimesTwo from Golden Triangle Computers (San Diego, CA). TimesTwo replaces the disk's existing SCSI driver with a custom driver that handles compression. Both Stacker and eDisk work with the existing SCSI driver, wedging themselves between the driver and the operating system.

The main advantages of driver-level compressors are that they're less likely to conflict with system extensions (also called INITs), and they're capable of compressing more kinds of files--even the non-ROM portions of the Mac operating system in the System Folder. On the downside, driver-level compressors may be incompatible with other SCSI device drivers.

Macintosh compressors use the same basic compression methods as those on the PC, achieving the same average compression ratio of about 2 to 1. However, the Mac file system shares a limitation of DOS that prevents either platform from exceeding that ratio on large hard drives, even with files that are highly compressible.

The problem is that both DOS and the Mac address their allocation blocks (called clusters on the PC) with a 16-bit number, so the maximum number of blocks on a drive--regardless of its capacity--is 65,536. Therefore, drives larger than 512 MB cannot use a block size of 8 KB or less, because 65,536 blocks of 8 KB cover only 512 MB. On a 1-GB drive, the block size grows to 16 KB; on a 2-GB drive, it expands to 32 KB.
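The block sizes quoted above follow from simple arithmetic. Here is a minimal sketch in Python, assuming block sizes double in powers of two starting from 512 bytes (an assumption made for illustration; the function name `min_block_size` is hypothetical and not part of any operating system):

```python
MAX_BLOCKS = 2 ** 16  # 65,536 blocks addressable with a 16-bit number

KB = 1024
MB = 1024 * KB
GB = 1024 * MB


def min_block_size(capacity_bytes, start=512):
    """Smallest block size (in bytes) that covers the whole drive
    using no more than MAX_BLOCKS blocks.

    Assumption: candidate sizes are powers of two starting at `start`.
    """
    size = start
    while size * MAX_BLOCKS < capacity_bytes:
        size *= 2
    return size


# Print the minimum block size for a few drive capacities.
for capacity in (512 * MB, 1 * GB, 2 * GB):
    print(capacity // MB, "MB drive:", min_block_size(capacity) // KB, "KB blocks")
```

Under these assumptions, a 512-MB drive is the largest that can still use 8-KB blocks; each doubling of capacity doubles the block size.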

This results in wasted space on large drives, because a block can't hold more than one file, so even a tiny file requires a whole block. One solution is to partition large drives into smaller logical drives. Each logical drive can address 65,536 blocks, so the blocks can be smaller.
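To get a feel for how much space is lost, the sketch below compares the waste for the same set of small files at two block sizes. The workload (1,000 files of 1 KB each) and the helper names are hypothetical, chosen only to illustrate the point; 32 KB corresponds to an unpartitioned 2-GB drive and 8 KB to a 512-MB partition.

```python
KB = 1024


def allocated_bytes(file_size, block_size):
    """Disk space consumed by one file when space is handed out
    in whole blocks (a block cannot be shared between files)."""
    blocks = -(-file_size // block_size)  # ceiling division
    return blocks * block_size


def wasted_bytes(file_sizes, block_size):
    """Total allocated-but-unused space across a list of file sizes."""
    return sum(allocated_bytes(size, block_size) - size for size in file_sizes)


# Hypothetical workload: 1,000 files of 1 KB each.
small_files = [1 * KB] * 1000

for block_kb in (8, 32):
    waste = wasted_bytes(small_files, block_kb * KB)
    print(f"{block_kb}-KB blocks: {waste // KB} KB wasted")
```

In this example, the smaller blocks of a 512-MB partition cut the waste from 31 KB per file to 7 KB per file.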

On a compressed drive, block sizes are variable, so less space is wasted. However, it's possible to run out of allocation blocks before running out of actual physical space. This happens when the overall compression ratio on a large drive exceeds 2 to 1. Individual files can be compressed at much higher ratios, of course, but the average compression ratio across the entire disk cannot exceed that limit.

For example, let's say you compress a 512-MB drive. Its virtual size (based on an average 2-to-1 compression ratio) is 1024 MB, or 1 GB, with 16-KB blocks. Now you start filling the drive with highly compressible files that achieve a ratio of 4 to 1. There's enough physical space on the drive to store 2 GB worth of those files, but you'll get a surprising disk-full error after 1 GB. Why? Because the drive is limited to 65,536 allocation blocks, and it would need 131,072 of those 16-KB blocks to store 2 GB.
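The same example can be checked numerically. This is a rough sketch, assuming the compressed volume presents fixed 16-KB logical blocks to the file system as described above; the function name `capacity_limits` is hypothetical.

```python
KB = 1024
MB = 1024 * KB
MAX_BLOCKS = 2 ** 16  # 65,536 allocation blocks (16-bit addressing)


def capacity_limits(physical_bytes, block_size, ratio):
    """Return two ceilings on how much uncompressed data fits:
    one set by physical space, one set by the block count."""
    by_space = int(physical_bytes * ratio)
    by_blocks = MAX_BLOCKS * block_size
    return by_space, by_blocks


# A 512-MB drive with 16-KB logical blocks, filled with 4-to-1 files.
by_space, by_blocks = capacity_limits(512 * MB, 16 * KB, ratio=4)
blocks_needed = by_space // (16 * KB)

print("physical space allows:", by_space // MB, "MB")
print("block addresses allow:", by_blocks // MB, "MB")
print("blocks needed to use all the space:", blocks_needed)
```

Whichever ceiling is lower wins. At 4 to 1, the block count runs out at 1 GB even though the physical space could hold 2 GB; at 2 to 1 or less, physical space is the limit.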

In practice, this barrier is not a serious problem because the average compression ratio for a typical mix of files rarely exceeds 2 to 1. But as drive capacities continue to climb and compression software keeps improving, this limitation is sure to be removed in future versions of DOS and the Mac OS.


Copyright © 2005 CMP Media LLC