A transaction-processing rate of 1000 pages per minute translates into an effective throughput of 1.4 million pages per day. However, the total number of requests is not the real issue when planning a server configuration. Instead, you should plan for dealing with the peak request load and the peak connection rate. Another consideration is how your network handles the data load. If your client stations are connected mostly through slower ports (e.g., 56-Kbps links or 14.4-Kbps modems), the outgoing ports will be a constraining factor, not how fast your disk and processor are.
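The arithmetic behind these figures can be checked with a short sketch. The 10-KB page size and the function names are illustrative assumptions, not figures from the text, and protocol overhead is ignored.

```python
MINUTES_PER_DAY = 24 * 60


def pages_per_day(pages_per_minute):
    """Daily throughput if the per-minute rate were sustained all day."""
    return pages_per_minute * MINUTES_PER_DAY


def link_limited_pages_per_minute(link_kbps, page_kbytes):
    """Upper bound on pages per minute that one link can carry.

    Assumes 1 KB = 1000 bytes and 1 Kbps = 1000 bits per second.
    """
    bits_per_page = page_kbytes * 1000 * 8
    bits_per_minute = link_kbps * 1000 * 60
    return bits_per_minute / bits_per_page


print(pages_per_day(1000))                     # 1440000
print(link_limited_pages_per_minute(56, 10))   # 42.0
print(link_limited_pages_per_minute(14.4, 10))
```

Under these assumptions, a single 56-Kbps link carries at most a few dozen pages per minute, far below what the disk and processor could deliver.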
How many users should you expect? A recently published paper from the NCSA (National Center for Supercomputing Applications) studied the load patterns on its own WWW servers, which are probably among the most heavily used servers on the Internet. They are subject to a "typical maximal" load of approximately 600 files per minute.
Unfortunately, this study does not describe what classes of links the client stations were using. Although the NCSA site has an internal FDDI (Fiber Distributed Data Interface) network and an external T3 link, slower clients cannot receive data as quickly as faster ones can, which then delays how quickly they can request subsequent pages. In effect, the ability to ship data quickly to a large number of slower links means that the site can support a greater number of concurrent users than if the clients were all connected on faster lines.
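The effect of client link speed on the number of supportable users can be sketched with a simple model. The model is an assumption for illustration only: each user repeatedly downloads one file and then pauses before requesting the next. The 10-KB file size, the 30-second pause, and the function name are hypothetical; only the 600-files-per-minute figure comes from the text.

```python
def users_supported(server_files_per_minute, client_kbps,
                    file_kbytes=10, think_seconds=30):
    """Estimate how many active users a fixed server file rate can sustain.

    Each user is assumed to cycle through: transfer one file, then pause.
    Slower links lengthen the transfer, so each user requests fewer files
    per minute and the same server rate covers more users.
    """
    transfer_seconds = file_kbytes * 8 / client_kbps  # kilobits / Kbps
    cycle_seconds = transfer_seconds + think_seconds
    files_per_user_per_minute = 60 / cycle_seconds
    return server_files_per_minute / files_per_user_per_minute


slow = users_supported(600, client_kbps=14.4)   # modem clients
fast = users_supported(600, client_kbps=1544)   # T1-class clients
print(round(slow), round(fast))
```

In this model the modem population comes out larger than the T1 population for the same 600 files per minute, which matches the direction of the argument above. The exact numbers depend entirely on the assumed file size and pause.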
Another busy site of interest is www.playboy.com. Playboy estimates that it services 800,000 requests per day and turns away at least another 800,000.
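To put the daily figure on the same scale as a per-minute rate, it can be averaged over a day. This is only a rough sketch: it assumes requests are spread evenly across 24 hours, which real traffic is not, so actual peaks would be higher. The function name is hypothetical.

```python
MINUTES_PER_DAY = 24 * 60


def average_per_minute(requests_per_day):
    """Average per-minute rate, assuming an even spread over the day."""
    return requests_per_day / MINUTES_PER_DAY


served = 800_000
turned_away = 800_000

print(round(average_per_minute(served), 1))                # 555.6
print(round(average_per_minute(served + turned_away), 1))  # 1111.1
```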
Not many sites experience the kinds of loads placed on the NCSA and Playboy sites, however. Most sites providing services across the Internet can expect loads on the order of thousands of packets per day. An in-house site, even at a corporate headquarters supporting hundreds of people, should expect a significantly smaller load.
So, when you plan your server, consider the following four principal factors:
1. The size of your network connection.
Are your clients connected directly to multiple Ethernet ports? One Ethernet port? A T1 link? Something slower? You can provide service only as fast as your network connection allows.
2. The surfing habits of your clients.
Do they do a lot of indexing and server processing (stressful to the server's CPU)? FTP transfers (balanced between the disk and network, with the CPU used to transfer data)? Or WWW-style (World Wide Web) processing (less work for the CPU, but a lot of network-port servicing)?
3. The storage requirements of your data.
This requirement affects your choice of disks (SCSI for many gigabytes, but you can probably get away with IDE if your data is measured in hundreds of megabytes and you don't plan to expand).
4. The access patterns of your clients.
HTML (Hypertext Markup Language) and WAIS (Wide Area Information Servers) users are helped by a lot of primary memory. Random FTP access doesn't employ a memory cache effectively.