BYTE.com > Tangled in the Threads > 2001 > May

Document Namespace Issues

By Jon Udell

May 10, 2001

(Document Engineering: Page 3 of 3)



It's hard enough to figure out how to represent and exchange characters. So, it's not surprising that we're also facing tough issues when we string characters together. Lines (in text editors) and paragraphs (in word processors) are the primitives we all know and use. But in an increasingly hypertextual world, we want to be able to name, refer to, and even version these things.

Raymond Yee:

I've been wondering for a while if there are any generalizations of this concept. What I'd really be interested in is an operating system in which every document (and every part of a document) can be addressed, kind of like URLs for everything on a machine. I've been wanting a way to refer to anything on my own machine (whether it's a cell in an Excel spreadsheet, a specific entry in my BibTeX database, a specific bookmark in a PDF file, or any part of an HTML document -- whether someone has tacked an anchor onto it or not).

What systems are available to provide such fine-grained naming of documents and their parts?

I responded with a few examples I'm aware of. In Zope, when you parse an XML document into the object database, every single element is URL-addressable. This is also true in eXcelon.
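To make the idea of per-element addresses concrete, here is a minimal sketch in Python using only the standard library. It is not how Zope or eXcelon work internally; the `element_addresses` function, the `/doc` base path, and the `tag[n]` positional scheme are all assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET


def element_addresses(xml_text, base="/doc"):
    """Map a positional, path-style address to every element in a document.

    Hypothetical scheme: each element is addressed by its parent's path
    plus its tag and its 1-based position among same-tag siblings.
    """
    root = ET.fromstring(xml_text)
    addresses = {}

    def walk(elem, path):
        addresses[path] = elem
        counts = {}  # how many children with each tag we have seen so far
        for child in elem:
            counts[child.tag] = counts.get(child.tag, 0) + 1
            walk(child, f"{path}/{child.tag}[{counts[child.tag]}]")

    walk(root, f"{base}/{root.tag}")
    return addresses


memo = "<memo><title>Naming</title><para>First.</para><para>Second.</para></memo>"
addrs = element_addresses(memo)
for path in addrs:
    print(path)
```

Every element gets an address, but the addresses are purely positional: insert a paragraph at the top and `para[2]` silently starts pointing at different text.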

When you fully generalize this, you end up with Xanadu -- a non-erasable storage system that remembers (and versions) everything. But a practical UI for dealing with such a thing seems almost impossibly elusive. In practice, I'd happily settle for the kind of granularity that gets you, in the case of documents, things like tables, paragraphs, subheads, and links -- the major features of the landscape -- but not every table cell or word.

It would be very helpful for these features to carry natural names, e.g., a leading fragment of the paragraph, or the text of a title, or the label of a link -- rather than a parser-generated name like Zope's http://my.zope/doc/memo/e3454.
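One way to picture natural names is to derive them from the text of the major features themselves. The sketch below is an illustration under stated assumptions, not an existing tool: `slugify`, `natural_names`, the five-word limit, the numeric suffix for duplicates, and the set of tags treated as "major features" are all hypothetical choices.

```python
import re
import xml.etree.ElementTree as ET


def slugify(text, max_words=5):
    """Turn the leading words of some text into a URL-friendly name."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return "-".join(words[:max_words])


def natural_names(xhtml_text, tags=("h1", "h2", "h3", "p", "table", "a")):
    """Map a text-derived name to each major element, in document order."""
    root = ET.fromstring(xhtml_text)
    names = {}
    for elem in root.iter():
        local_tag = elem.tag.rsplit("}", 1)[-1]  # drop any XML namespace prefix
        if local_tag not in tags:
            continue
        # Fall back to the tag name when the element has no usable text.
        slug = slugify("".join(elem.itertext())) or local_tag
        name, n = slug, 2
        while name in names:  # disambiguate repeated leading fragments
            name = f"{slug}-{n}"
            n += 1
        names[name] = elem
    return names


doc = """<body>
<h2>Document Namespace Issues</h2>
<p>Naming is hard and important.</p>
<p>Naming is hard and important, twice.</p>
</body>"""
names = natural_names(doc)
for name in names:
    print("#" + name)
```

Names like these survive the insertion of new paragraphs elsewhere in the document, but they change whenever the leading text is edited, which is part of why naming is hard.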

It's hard to overstate the importance, and the difficulty, of naming. When I write at any length nowadays, I tend to write in XHTML.
