Net Programming For The Masses


February 1996 / Special Report / Net Programming For The Masses

Don't chuckle: TCL's creators are serious about creating the Internet's lingua franca

Peter Wayner

For the past 10 years, the vision of the future of programming has been clear: Fast, object-oriented compilers will turn C++ into lightning-quick native code. Now that compilers are becoming fast and adept at this task, a team of programmers at SunSoft is traveling back to the future by pushing an interpreted scripting language known as TCL as the next necessary tool. (TCL, often pronounced "tickle," is an acronym for Tool Command Language.) The goal is to produce a machine-independent, virus-free Visual Basic killer that will complement the Java language and allow software agents to roam from machine to machine throughout the Internet.

It's difficult to pinpoint exactly when the glory days of compilers dimmed, but it must have begun when Microsoft's Visual Basic, a stodgy, interpreted version of BASIC, soared in popularity. This software product enabled even fair-weather programmers to turn out clean and neat code.

The popularity of native compiled code took a further dive when the Internet offered everyone the chance to send programs hither and yon. Suddenly, interpreted languages such as Lisp, with built-in memory-handling ability, seemed much more elegant than combining stripped-down and hard-coded native binaries.

Missing Links

TCL's creators didn't intend for it to be a language used by agents to roam the Internet. Nor did it become popular among Unix programmers out of some nostalgia for a time when all higher-level languages were interpreted. John Ousterhout and some of his students at the University of California-Berkeley wanted to create a meta-programming language -- that is, a very high-level programming language. Their goal was to develop a language that Unix programmers and others could use to build code by binding small programs into larger applications.

Ousterhout is now a Distinguished Engineer at Sun Labs and is responsible for TCL development. The language got its name because it was used to flexibly link a number of different code modules or tools written in regular high-level languages, such as C and FORTRAN.

Programmers using TCL don't call up libraries or other blocks of code through procedure calls; instead, they issue TCL command strings. A small TCL interpreter, which can be linked into the code, interprets the strings. This interpreter acts as a flexible buffer that then makes the procedure calls.

The flexibility of this approach is important. Although object-oriented programming is supposed to make it easy to reuse code, it's useful only if you stay within the narrow framework of the class hierarchy. This structure is painfully limiting, because programmers usually create it during the first several weeks of a project, when people have only a vague idea of what's needed. If you want to add a few features to a block of code or change the way in which the code exchanges data, you often need to rip apart the old hierarchy and begin anew.

TCL's flexibility can alleviate the massive turbulence caused by adding features. You don't use data structures defined by an object-oriented hierarchy to pass information; instead, the interpreted TCL code links modules. If you add new features or change some structure, you only have to tweak the TCL code that links the modules; you don't need to recompile the basic tool code. You can write arbitrary operations in TCL to massage the data and get it to flow neatly between different code modules. You also perform reformatting work in this "glue" layer.

Here's a simple example. Say your lab partner, Bob, writes some software that runs some lab equipment. He originally took all temperatures in Fahrenheit, so you design your code to spit out Fahrenheit temperatures. Later, Bob embraces the metric system and rewrites his interface code to take temperatures in Celsius. If you're simply linking in his code as a library, you might need to rewrite all your code to use Celsius values. If you use TCL to pass the data to his code, you can rearrange the TCL code to do the conversion on the fly. You can use a filter function to do the conversion or even use raw TCL to do the work.
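A minimal sketch of such a glue layer might look like the following. The procedure names (bob_set_temperature, my_target_temperature, and f2c) are hypothetical stand-ins invented for this illustration; in a real project, the first two would be commands implemented in compiled code and registered with the interpreter.

```tcl
# Hypothetical stand-in for Bob's rewritten interface, which now expects Celsius.
proc bob_set_temperature {celsius} {
    return "equipment set to [format %.1f $celsius] C"
}

# Hypothetical stand-in for your own module, which still reports Fahrenheit.
proc my_target_temperature {} {
    return 212
}

# The glue: convert on the fly so that neither module has to change.
proc f2c {fahrenheit} {
    return [expr {($fahrenheit - 32) * 5.0 / 9.0}]
}

puts [bob_set_temperature [f2c [my_target_temperature]]]
# prints: equipment set to 100.0 C
```

Only the last line had to change when Bob switched units; anyone else who relies on the Fahrenheit module is unaffected.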

You might wonder what's so difficult about changing your code module to spit out Celsius. What if some other code written by someone else in the lab also uses that module? Then that person must update his or her module. The ripple effects of forced upgrades might be great if you're a company like Microsoft, whose business success depends on staying current. But forced upgrades are a major hassle if the software is just a tool for you to get your job done. TCL alleviates the need to delve into your old code and change everything. In the above scenario, you wouldn't have any reason to resist Bob's leap forward.

This example is a bit trivial, but it shows how nice it is to have a way to glue code with built-in intelligence. C++ links compiled code with an information-transfer mechanism that simply says, "Here's the data" (see the figure "TCL's Flexible Links"). TCL can do anything to data along the way and say, "Here's the data that I've repaginated, normalized, and converted to your specifications."

This feature can also help novice programmers. Because the language is so simple, people with only a small amount of programming experience can link modules successfully in TCL. There is, for instance, only one data type: the string. This simplistic representation slows down system code that uses data types to speed operations, but it offers no major impediment if the inefficiency is limited to code that glues things together. Simplicity is the major reason why some at SunSoft see TCL as the Visual Basic for the Internet.
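The one-type rule is easy to see in a few lines. In this short sketch (the variable names and sample values are arbitrary), the same kind of value serves as a number, as a piece of text, and as a list element, with no declarations or explicit conversions.

```tcl
set reading "42"                      ;# just a string
set total [expr {$reading + 8}]       ;# parsed as a number only when needed
set label "total=$total"              ;# and substituted back into text

# A comma-separated string becomes a list, which is itself a string.
set fields [split "12A,aisle,vegetarian" ","]

puts $label                           ;# prints total=50
puts [lindex $fields 1]               ;# prints aisle
puts [string length $total]           ;# prints 2
```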

Safer TCL

Several years ago, Nathaniel Borenstein and Marshall Rose recognized that TCL was more than just a tool for managing large programming projects. The two decided to build on the Multipurpose Internet Mail Extensions (MIME) standard for bundling multiple types of data into a single E-mail message. When they finished with that, they wondered, "What if we could send a program as well?"

Their solution was a modified version of TCL, known as Safe-TCL, which resists malicious programs, such as viruses, that might arrive via E-mail. Their modifications and suggestions to the original version of TCL are now incorporated into the latest version, 7.5, to emerge from SunSoft. These security modifications are crucial to taking the TCL language used by Unix hackers to link C code and turning it into the TCL language that links code throughout the Internet in a safe and virus-free way.

Borenstein and Rose picked TCL because it's a high-level language that they could implement with a small amount of code. They hoped to make it possible for people to ship forms and other interactive programs via E-mail. The recipient's E-mail application could fire up the programs without worrying whether they contained any malicious code.

They found that the interpreted nature of TCL was an asset for two reasons. First, interpreted code is easy to run on many different machines. An E-mail message with embedded TCL can run successfully on a Sun system, a PC, or a Mac because there's no binary code.

Second, and more important, programmers can make a TCL interpreter safe by arranging for it to watch the execution of code for errant instructions intended to change memory or overwrite the file system. A TCL interpreter can watch each step and make sure that an instruction is accessing only those strings or memory locations that it's authorized to use. A language like C, which gives a programmer infinite freedom to manipulate pointers, could never be made safe like this.

Version 7.5 of TCL contains a command called interp, which creates a new interpreter with its own name space in which code can be executed without affecting any of the outside data. Ousterhout refers to this as a padded cell. An incoming E-mail message or agent can be executed in its own interpreter without any danger of malicious code causing damage.
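The isolation can be sketched in a few lines, assuming a tclsh that provides the interp command (version 7.5 or later). The interpreter name cell and the variable names are arbitrary choices made for this illustration.

```tcl
set secret "master data"

# Create a padded cell: a safe interpreter with its own variables and procedures.
interp create -safe cell

# Code evaluated in the cell can set a variable of the same name
# and define its own procedures...
interp eval cell {set secret "changed inside the cell"}
interp eval cell {proc greet {} {return "hello from the cell"}}

# ...but neither shows up outside the cell.
set inside [interp eval cell {set secret}]
set leaked [llength [info commands greet]]

puts $secret      ;# prints master data
puts $inside      ;# prints changed inside the cell
puts $leaked      ;# prints 0

interp delete cell
```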

The original Safe-TCL built by Borenstein and Rose had just two interpreters: a safe one, used to contain untrusted code, and a free one that could access the system. An incoming agent would run in the safe interpreter and communicate with the rest of the system by issuing commands to the free interpreter.

If, for instance, the code running in the safe interpreter was that of a travel agent making plane reservations, it could access the customer's seat preferences stored in a file by issuing a command that ran in the free interpreter. This free command would send out plane information only and would not allow general access to files that might contain other private data. Programmers can determine which features are available to incoming agents by controlling which commands run in the free interpreter and what they do.

The model in TCL 7.5 is much more flexible than in previous versions. Each TCL interpreter can act as a master and create a slave interpreter. These slaves can be either safe or free. Each master decides what functions it offers to its slave. The slaves can also create their own slaves in a great hierarchy, but safe interpreters can create only new safe interpreters. This flexibility allows agents that come in the form of TCL code to exchange data with other agents by either starting them up in a slave interpreter or merely examining messages in a safe slave interpreter.
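A short sketch of this hierarchy, assuming a tclsh with the interp command: the interpreter names free, outer, and inner are arbitrary. Note that the safe interpreter does not ask for a safe slave, yet it gets one anyway.

```tcl
interp create free                         ;# an ordinary, trusted slave
interp create -safe outer                  ;# a safe slave
interp eval outer {interp create inner}    ;# created by safe code, no -safe given

set freeSafe  [interp issafe free]
set outerSafe [interp issafe outer]
set innerSafe [interp issafe {outer inner}]   ;# nested slaves are named by path

puts "free=$freeSafe outer=$outerSafe inner=$innerSafe"
# prints: free=0 outer=1 inner=1

interp delete outer    ;# deleting a slave also deletes its own slaves
interp delete free
```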

Programmers keep the interpreters safe by strictly limiting the features available to a TCL program running in them. For instance, ordinary TCL code can open, read, write, and close arbitrary files on the computer. These commands are locked away from safe interpreters.

To allow an incoming TCL program to access files, you create your own file access code in the master interpreter. You can write this code so that users can access only those files located in particular subdirectories or on particular disks.

You can then make this command available in the safe slave interpreter via the alias mechanism, which adds new functions to the safe-interpreter domain. Code running in the safe interpreter can access files only through this crippled command. These extra functions are one of the principal ways in which information can be passed out of safe interpreters.
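As a rough sketch of the alias mechanism, the following sets up a crippled read command for the travel-agent scenario. The directory agent_public, the file seats.txt, and the names restricted_read and readfile are all hypothetical, and the simple file-name check is illustrative only, not a hardened implementation.

```tcl
# Hypothetical directory holding the only files an agent may read.
set publicDir [file join [pwd] agent_public]
file mkdir $publicDir
set f [open [file join $publicDir seats.txt] w]
puts $f "aisle"
close $f

# Runs in the master with full privileges, but only opens files directly
# inside $publicDir. [file tail] drops any directory part of the requested
# name, so a name such as ../other.txt is looked up inside $publicDir.
proc restricted_read {name} {
    global publicDir
    set f [open [file join $publicDir [file tail $name]] r]
    set data [string trim [read $f]]
    close $f
    return $data
}

interp create -safe cell
# Inside the cell the command is called readfile; each call is forwarded
# to restricted_read in the master ({} names the current interpreter).
interp alias cell readfile {} restricted_read

set seat   [interp eval cell {readfile seats.txt}]
set escape [catch {interp eval cell {readfile ../no_such_file.txt}}]
set direct [catch {interp eval cell {open seats.txt r}}]
puts "seat=$seat escape-blocked=$escape open-blocked=$direct"
# prints: seat=aisle escape-blocked=1 open-blocked=1

interp delete cell
file delete -force $publicDir
```

The agent never sees the open command itself. Everything it learns about the file system passes through one procedure that the master's programmer wrote and can audit.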

Assessing the safety features of TCL can be a long, drawn-out project. The designers of TCL 7.5 left out all the obvious ways in which a TCL program could breach the walls of the interpreter and mess up a system. TCL 7.5 blocks the file operations, the exec command (which lets a TCL program execute a subprocess), and access to general information about the file system and environment variables. Information can enter and leave only through the additional functions that a programmer adds. This is why the padded-cell analogy is so accurate.
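One way to check these walls is to ask a safe interpreter to do the forbidden things and see what happens. This sketch assumes a tclsh with the interp command; the interpreter name cell and the sample arguments are arbitrary.

```tcl
interp create -safe cell

# catch returns 1 when the evaluated script raises an error.
set execBlocked [catch {interp eval cell {exec ls}}]
set openBlocked [catch {interp eval cell {open /etc/passwd r}}]

# The env array of environment variables is absent from a safe interpreter.
set envVisible [interp eval cell {info exists env}]

# Harmless computation still works.
set answer [interp eval cell {expr {6 * 7}}]

puts "exec blocked: $execBlocked, open blocked: $openBlocked"
puts "env visible: $envVisible, 6*7 = $answer"

interp delete cell
```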

Some consider this model to be limiting. The Telescript language from General Magic (Mountain View, CA) offers a more spacious structure, complete with many extra features that make life easier for programmers. Telescript, for instance, offers a place where multiple agents can come and meet. Many sections of code can enter, meet, exchange data, and then leave this "padded cell." The system also contains a permit mechanism that controls which features are available to which agents and can even limit how an agent consumes system resources, such as CPU time or memory. None of these features is in the current version of TCL.

This more sophisticated environment can make complicated rendezvous scenarios easier to create, but it might also inadvertently add security holes. For instance, it's possible that agent A and agent B might not be able to breach the security alone, but they might be able to do so when both exist in the same place. Anticipating weird and unlikely combinations like this is difficult, and no one has any great experience in structuring such situations.

The potential for danger is not from the basic TCL or Telescript software that does the interpreting -- these packages are sure to be tested extensively. The holes could emerge when programmers do a bad job of creating the functions that pass information in and out of the padded cell. Ousterhout claims that his hierarchical model offers a simpler vision that's easier for any programmer to keep secure. Jim White, the General Magic designer responsible for Telescript, believes that people want agents to communicate and that programmers will be able to successfully keep leaks from emerging. Only time and practice will determine which vision is more successful.

Programming for the Masses

TCL complements the increasingly popular Java (see "Wired on the Web," January BYTE). The latest versions run on Macs, Windows PCs, and Unix boxes. While Java is a full-featured programming language that gives programmers access to low-level details, such as threaded processes, TCL is a significantly simpler language that can act as a high-level scripting tool for linking applets. Ousterhout hopes that Java will be the tool that sophisticated programmers use to write tools and that TCL will be the high-level language that novice and sophisticated users alike will use to knit these tools together.

Ousterhout's hope that TCL 7.5 will become the Visual Basic for the Internet has plenty going for it. Anyone can incorporate TCL easily into his or her programming projects. The C code for the interpreter is freely available. The language itself is not hard to implement, nor is it particularly hard to understand. It also comes with a user-interface toolkit, called TK, that displays a consistent interface on any platform. This makes it an ideal candidate for multiplatform development.

SunSoft is actively supporting the technology, and it's only a matter of time before we see what the Internet will choose. The greatest competition for the TCL language will probably be General Magic's Telescript. While Telescript has a more sophisticated approach to agents and their interaction, TCL is free. This is a significant advantage, because the greatest advances on the Internet often come from cash-poor programmers. In any case, the short reign of the native-code-generating object-oriented compiler is about over.


WHERE TO FIND


SunSoft

Mountain View, CA
(415) 960-3200

http://www.sun.com



How They Stack Up


TCL

-- An interpreted scripting language.
-- A tool for building agents that roam the Internet.
-- Developers write code by binding small programs into larger applications.
-- Adding new features to programs doesn't require recompiling the basic code.
-- C code for the interpreter is freely available.

JAVA

-- A full-featured programming language.
-- Gives programmers access to threaded processes and other low-level details.
-- TCL complements it by acting as a high-level scripting tool for linking applets.
-- Java serves sophisticated programmers; TCL serves experienced and novice developers alike.

TELESCRIPT

-- Offers a more spacious structure and more programming features than TCL.
-- Agents interact in a secure meeting area.
-- A "permit" mechanism controls which system resources are available to each agent.



TCL's Flexible Links

The interp command can set up a Safe-TCL interpreter that won't allow code to access outside files or memory. Safe-TCL interpreters can even be nested to create a hierarchy of interpreters.


Peter Wayner is a BYTE consulting editor based in Baltimore, Maryland. He is author of Agents Unleashed (Academy Press, 1995), which is being retitled Agents At Large. His World Wide Web home page is http://access.digex.net/pcw/pcwpage.html. He can also be contacted on the Internet or BIX at pcw@access.digex.com.
