Jon Udell
Join the Web Project conference:
news://dev4.byte.com/joncon (Usenet view)
http://dev4.byte.com/joncon/threads.html (Web view)
Recently, BYTE's publisher, Dave Egan, asked BYTE's marketing manager, Rob Mitchell, to send a thank-you message to everyone who has contributed to the Virtual Press Room (vpr), our Web-based archive of press releases. Clearly, the message had to travel as E-mail--we could hardly fax it, because we designed vpr to be a superior alternative to fax. So I began rounding up and testing Internet mail servers.
That project led to a flurry of new developments at The BYTE Site, including mail front ends to document databases, groupware applications, and an automatic failure notification system. Here's a progress report.
Building a Mailing List
To enable Rob to send Dave's message, I extracted a list of E-mail addresses from the Hypertext Markup Language (HTML) files in the vpr archive. The Perl script that did that (see http://www.byte.com/netcol/netproj.htm) used a regular expression to match Internet mail addresses in the <contact> field of each document. I have since replaced my own regular expression with the smarter one used in Earl Hood's MHonArc (see http://www.oac.uci.edu/indiv/ehood/).
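A minimal sketch of that kind of extraction follows. It is not the script from the column. It assumes the contact field is a paired <contact>...</contact> span, the helper name contact_addresses is hypothetical, and the address pattern is a deliberately simple stand-in for the MHonArc expression.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the unique, lowercased addresses found
# inside <contact>...</contact> spans of an HTML string.
# The paired-tag shape of the field is an assumption.
sub contact_addresses {
    my ($html) = @_;
    my (%seen, @found);
    while ($html =~ m{<contact>(.*?)</contact>}gis) {
        my $field = $1;
        # Simple illustrative pattern; real addresses can be messier.
        while ($field =~ /([\w.+-]+\@[\w-]+(?:\.[\w-]+)+)/g) {
            my $addr = lc $1;
            push @found, $addr unless $seen{$addr}++;
        }
    }
    return @found;
}

# Collect addresses from every file named on the command line and
# print one per line, ready to save as an :include: list file.
my %all;
for my $file (@ARGV) {
    my $fh;
    unless (open $fh, '<', $file) { warn "skip $file: $!\n"; next; }
    local $/;                 # slurp the whole file
    my $html = <$fh>;
    close $fh;
    $all{$_} = 1 for contact_addresses($html);
}
print "$_\n" for sort keys %all;
```

Run against the archive's HTML files with output redirected to a file, the result is a one-address-per-line list of the kind a list alias can include.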
How could I send a message to the resulting list of 300 addresses? That's too many to string out on the To: line of a message header. Instead, I created a list account, vprusers@byte.com, aliased to a file containing the 300 addresses. On our BSD/OS 2.0 server running sendmail, you do that by adding this line to /etc/aliases:
vprusers: :include:/usr/vpr/vprusers.list
On our Windows NT 3.51 server running post.office, you achieve the same result by pasting the list into a field on an HTML form. Either way, mail to vprusers from Rob should go to everyone on the list, and replies should come back to Rob. I tested the setup on a list of local BYTE addresses, swapped in the real list (plus my own address, for tracking), and told Rob to fire away.
The Sorcerer's Apprentice
Sendmail gurus who have bothered to read this far are probably chuckling to themselves. They won't be surprised by what happened next. Rob's message arrived promptly in my mailbox. However, smug satisfaction became horrified panic when, a few minutes later, another copy of the message showed up from Company X, one of the list members. I had created a mail loop. When Company X got Rob's message, it sent the message back to the list, one member of which was Company X, which then got another copy of the message, which it sent back to the list...each iteration fanning out to 300 recipients. Oops!
The arrival of a third message interrupted my contemplation of this hall of mirrors. I disabled the list account and went looking for the explanation. The long message header, which casual mail users may regard as a nuisance, immediately proved its worth. It's the audit trail that administrators use to debug and maintain the planetary communications system that is Internet mail. Within hours, I had found all the pieces of the puzzle (see "Anatomy of a Mail Loop").
If you look closely at this picture, you'll see that things went wrong at step 3. Just like a real letter, an Internet message comes in an envelope. In our case, there were 300 envelopes addressed to 300 destinations, each containing the same header (From: Rob Mitchell, To: vprusers@byte.com) and the same body (Dave's thank-you note).
When the envelope addressed to Company X reached Company Y, Y's sendmail correctly performed the final step of mail routing--it discarded the envelope and delivered the header and message into X's mailbox. Pullmail incorrectly performed an extra step. Lacking the now-discarded envelope address, it routed to the address in the To: field of the header. Because that field contained vprusers@byte.com, a mail loop formed.
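The difference between routing by header and routing by envelope can be sketched in a few lines of Perl. This is an illustration, not Pullmail's code. The function names are hypothetical, and the only envelope-recording field it knows about is X-Frontier-To:, the one that Frontier's servers add.

```perl
use strict;
use warnings;

# Return the value of the first header field called $name, or undef.
sub header_value {
    my ($message, $name) = @_;
    my ($head) = split /\n\n/, $message, 2;   # header ends at first blank line
    return undef unless defined $head;
    return $head =~ /^\Q$name\E:[ \t]*(.+)$/mi ? $1 : undef;
}

# What a header-driven re-router does: trust To:, which names the list.
sub naive_target { return header_value($_[0], 'To') }

# Safer: use only a field that records the envelope recipient, and
# return undef (hold the message) when the server did not add one.
sub safe_target { return header_value($_[0], 'X-Frontier-To') }

# A delivered copy of the list message: the envelope is already gone.
my $msg = join "\n",
    'From: rob_mitchell@byte.com',
    'To: vprusers@byte.com',
    '',
    'Thank you for contributing.';

printf "naive: %s\n", naive_target($msg);
printf "safe:  %s\n",
    defined(safe_target($msg)) ? safe_target($msg) : '(hold for postmaster)';
```

The naive version hands the message back to the list address, which is the loop. The safer version has nothing to go on and declines to guess.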
Postmortem
The author of Pullmail, Mark Woollard (mark@swsoft.co.uk), wrote it specifically for use with Frontier Internet Services' mail servers, which copy envelope addresses into the custom X-Frontier-To: field of message headers, thereby enabling correct routing. It wasn't meant for general-purpose mail routing, and that's what created the potential for an explosion--but I lit the match.
There are two compelling reasons not to use a list of forwarding addresses as I did. Along with the looping problem, there is the possibility of unauthorized use. Rob could send a message to vprusers@byte.com, but so could any of the tens of millions of Internet mail users from around the world. Hence the need for list managers such as majordomo, a set of Perl scripts that can, among other things, null the To: field of headers (to prevent looping) and reject messages from those who aren't list members (to prevent unauthorized use).
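Those two protections can be sketched as a small gate in Perl. This is not Majordomo's code. The function gate_message and its argument order are hypothetical, and the replacement address is whatever neutral account the site chooses (null@byte.com in the usage line).

```perl
use strict;
use warnings;

# Hypothetical gate: returns the rewritten header lines for an accepted
# message, or an empty list when the sender is not a list member.
sub gate_message {
    my ($sender, $members, $neutral_to, @header) = @_;

    my %is_member;
    $is_member{ lc $_ } = 1 for @$members;
    return () unless $is_member{ lc $sender };   # unauthorized sender

    # Replace the To: line so that a header-driven re-router cannot
    # send the message back to the list address.
    my @out;
    for my $line (@header) {
        push @out, ($line =~ /^To:/i ? "To: $neutral_to" : $line);
    }
    return @out;
}

my @members = ('user@X.com', 'rob_mitchell@byte.com');
print "$_\n" for gate_message(
    'rob_mitchell@byte.com', \@members, 'null@byte.com',
    'From: rob_mitchell@byte.com', 'To: vprusers@byte.com');
```

A real list manager does much more (subscription handling, bounce processing, archives), but the membership check and the header rewrite are the parts that would have stopped this particular loop.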
But what if you don't have a list manager? John MacFarlane, president of Software.com--whose Unix and NT mail server, post.office, which I am using, is also now available under the Netscape label--suggested the following defensive maneuver:
To: null@byte.com
From: rob_mitchell@byte.com
Bcc: vprusers@byte.com
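This setup traps inappropriate use of the address in the To: field. Recipients never see the list address in the header, so a router that resends by To: hits null@byte.com instead of the list.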
Mail and Web Synergy
I haven't installed a list manager because, until we need to do another broadcast mailing, I'm busily mining a rich vein of mail-enabled Web applications. When a Web server lacks a complementary mail server, it's at best just passively mail-enabled. It can channel user-initiated mail by means of mailto: uniform resource locators (URLs), but it cannot itself send or receive mail.
When I began mail service at our site, things began to get more interesting. For example, consider the BOMB, a feedback mechanism that solicits comments by means of a form that's linked to every page of the BYTE on-line archive (see "BOMB's Away," October 1995 BYTE, http://www.byte.com/netcol/netproj.htm).
The original version of the BOMB fed information into a database, but it didn't transmit that data to individuals responsible for particular articles. If you comment on a News & Views article, for example, Dave Andrews (who edits that section) ought to hear about it. Now he will.
The revised comment-logging script maps a set of section names ("News & Views," "Features") to a corresponding set of role definitions. Here are some role definitions in Perl:
# Each role is a list of one or more addresses.
@news_ed = ('dave.news@bix.com');
@feature_ed = ('jmontgomery@bix.com',
  'thalfhill@bix.com',
  'tom_thompson@bix.com');
@managing_ed = ('rafe@bix.com', 'mschlack@bix.com');
Here is a Perl associative array that maps section names to roles:
# Each section maps to a comma-separated recipient string.
%RoleMap = (
  'News & Views', join(',', @news_ed, @managing_ed, 'bomb@bix.com'),
  'Features', join(',', @feature_ed, @managing_ed, 'bomb@bix.com'));
When you submit a comment, the script updates the database and also routes your comment to appropriate editors with a line such as:
`mail $RoleMap{$section_name} < $comment_file`;
Mail-to-HTML Transducers
Why copy all comments to bomb@byte.com? I've been looking for a way to present the textual information the BOMB has been collecting. The original BOMB database was intended for SQL queries against numeric data. It lacked a way to view the anecdotal remarks entered into the BOMB's multiline text-input field. Once comments began funneling into the BOMB account, it became possible to review the text comments with a POP3 mail client such as Eudora or Netscape Navigator 2.0.
However, there's an even better way. Programs exist that can convert SMTP-style mailboxes into HTML archives. I mentioned two last month--Earl Hood's MHonArc and Kevin Hughes' hypermail (http://www.eit.software/hypermail). Both are excellent. They build views by subject, author, and date, and can also link replies to original messages to create threaded views.
I've used MHonArc, which is written in Perl, to convert 30 MB of mail downloaded from my BIX account over the last few years into a navigable archive. As a bonus, it decodes some kinds of Multipurpose Internet Mail Extensions (MIME) attachments. For the BOMB archive, I used hypermail, a C program that's faster and simpler to deploy than MHonArc.
Both tools produce neat piles of HTML documents--one per mail message. You can easily feed these to a Web indexer such as freeWAIS or SWISH (another of Kevin Hughes' contributions to the Internet) to make your Web-based mail archive searchable. An example of the application of these techniques is the archive of the www-talk discussion list at http://www.eit.com/www.lists/.
Mail-Enabled Data Entry
Because the BOMB is a Web application, its users construct database records in an HTML form and insert them by invoking a Common Gateway Interface (CGI) script. But what about users who are not running Web browsers? There should be an automatic way for them to enter their data.
Consider the demos application that converts staff reports on vendor demonstrations into a private Web archive (see "Global Groupware," November 1995 BYTE, http://www.byte.com/netcol/netproj.htm). BYTE staffers file these reports in our private conference on BIX. The first version of demos proved we could convert that conference into a more easily searchable and navigable Web archive.
However, the conversion wasn't automatic. I had to download the conference, massage it, and build the archive. And, of course, whatever isn't automatic tends to slip; the on-line demos archive soon went out of date. With the revised version of demos, users who post reports to the conference can at the same time mail them to a special account on our mail server. Arrival of a new report triggers hypermail, which updates the Web archive. Because I'm out of the loop now, the archive is current.
A related application is the E-mail interface to vpr that my associate Rex Baldazo is developing. Vpr's Achilles' heel is that it presumes Web access. For some of the PR agents and marcom specialists who are the primary intended users of vpr, that can be a tall order. Far more of them can readily use Internet mail than can conveniently access the Web.
Thus, Rex's rite of passage into the Perl programming fraternity is to rig vpr to accept mail input. You'll send mail to vprinfo@byte.com to retrieve a copy of the form. Then you'll send the completed form to vprsubmit@byte.com. Just like the interactive Web-based version, mail-based vpr will either report errors in case of an incomplete or incorrect form or log the data and report success. However, these reports will, in this case, travel as E-mail.
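The check-and-reply step of such a mail interface might look like the following sketch. It is not Rex's implementation. The field names (company, contact, headline), the "name: value" layout of the mailed form, and both function names are assumptions made for illustration.

```perl
use strict;
use warnings;

# Illustrative field names; the real vpr form is not shown here.
my @REQUIRED = qw(company contact headline);

# Parse "name: value" lines from a mail body into a hash.
sub parse_form {
    my ($body) = @_;
    my %form;
    for my $line (split /\n/, $body) {
        next unless $line =~ /^\s*([\w-]+)\s*:\s*(.*?)\s*$/;
        $form{ lc $1 } = $2;
    }
    return %form;
}

# Build the reply text: a list of problems, or a success note.
sub reply_for {
    my (%form) = @_;
    my @errors;
    for my $field (@REQUIRED) {
        push @errors, "missing field: $field"
            unless defined $form{$field} && length $form{$field};
    }
    push @errors, "contact is not a mail address"
        if defined $form{contact} && length $form{contact}
        && $form{contact} !~ /^[\w.+-]+\@[\w-]+(?:\.[\w-]+)+$/;

    return @errors
        ? "Your submission was not accepted:\n" . join('', map("  $_\n", @errors))
        : "Your submission was logged.\n";
}

print reply_for(parse_form("Company: Acme\nContact: nobody\n"));
```

The same validation routine could serve both the CGI script and the mail handler, so that the two front ends never disagree about what counts as a complete form.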
There are still more mail-enabled developments in the pipeline. What about a system that enables BYTE's marketing staff to update Web pages for which they're responsible by mailing them to the server? Or that enables the sales staff to mail in their ad-insertion orders? The combination of Web and Internet mail technologies puts all this within easy reach.
TOOLWATCH
DeBabelizer......$399
Equilibrium Technologies
Sausalito, CA
Phone: (415) 332-4343
Internet: http://www.equilibrium.com
Check out the nifty thumbnails that now link to the full-size images on The BYTE Site. They're courtesy of DeBabelizer, a do-everything graphics converter with the all-important batch capability that production sites need.
BOOKNOTE
Sendmail
by Bryan Costales, Eric Allman, and Neil Rickert
O'Reilly & Associates, Inc.
ISBN 1-56592-056-2
Price: $32.95
Even if you never use sendmail, but instead opt for a modern commercial reimplementation of Internet mail service such as post.office, you will benefit from this encyclopedic discussion of Internet mail technology.
Anatomy of a Mail Loop

1. Rob Mitchell sends a message to vprusers@byte.com, a mailing list on our SMTP server.
2. The message fans out to 300 recipients, including user@X.com, who receives mail at Company Y, the Internet service provider for Company X.
3. Periodically, X connects to Y and runs the utility Pullmail (pullmail@swsoft.co.uk, http://www.net-shopper.co.uk) to fetch X's mail from Y's server (using POP3) and redistribute it to the users at X.
4. The header address (To: vprusers@byte.com), now incorrectly transformed into an envelope address, forms a mail loop.
Company X's incorrect use of a mail router created the potential for an explosion. However, my own failure to protect a list account put the torch to it. I'm thankful I was able to stop it on the third iteration.
Rob Mitchell and Dave Egan were even more thankful.
[Screen shot: post.office forms-based management interface]

This forms-based management interface works identically for Unix and Windows NT versions of post.office. From any Web browser in the world, you can add or modify accounts and monitor mail queues (see http://www.software.com).
Jon Udell is BYTE's executive editor of new media. You can reach him on the Internet or BIX at judell@bix.com.