Increased bandwidth may not be the best way to improve your network's performance. Construct a mathematical model to estimate potential performance gains.
Brett Husselbaugh
Too many people have been conditioned by marketing hype and misleading information into believing that bandwidth is the true measure of performance. Adding bandwidth to a poorly performing network is like raising the speed limit of a congested highway--neither solution buys much.
When you measure in terms of bandwidth, you are measuring in the frequency domain. The frequency domain is ruled by terms like megahertz and megabits per second. These terms are applied to measure the theoretical capacity of a communications channel. In a LAN, the channel is the Ethernet, Token Ring, or other topological segment to which your servers and workstations are connected. That channel is often called the media.
When you use the frequency domain to describe the capacity of a communications channel, you consider only the channel and not necessarily what is happening at its end points. You assume that the end points are both capable of sustaining the maximum information rate of the channel. If you are talking about a synchronous point-to-point channel, like a WAN link, your choice of the frequency domain is appropriate. In a LAN scenario, however, other factors make the choice of the frequency domain inappropriate.
For instance, a LAN may have many nodes, all competing for simultaneous access to the media. Because only one node can transmit at any given instant, all other nodes must wait.
In addition, information flow on a LAN is not synchronous. Rather, information is transmitted in small bursts called packets. As it turns out, these packets require a fairly significant amount of time to form and decipher. Here, the word significant is relative, meaning that the time to service a packet at a given node on the network will often be longer than the time it takes to transmit it.
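To put rough numbers on that comparison, the wire time of a single frame follows directly from its size and the signaling rate. The sketch below is only an illustration and is not part of the article's model; the function name is hypothetical, and it ignores preamble, interframe gap, and media-access delay.

```python
def transmit_time_ms(frame_bytes: int, rate_mbps: float) -> float:
    """Time on the wire for one frame, in milliseconds.

    Assumption: only the frame's own bits are counted (no preamble,
    interframe gap, or access delay).
    """
    bits = frame_bytes * 8
    return bits / (rate_mbps * 1_000_000) * 1000.0


# A maximum-size 1518-byte Ethernet frame at 10 Mbps.
print(f"{transmit_time_ms(1518, 10):.4f} ms")
```

A maximum-size frame spends a little over 1.2 ms on a 10-Mbps wire, so a node that needs several milliseconds to form or decode each packet accounts for most of the elapsed time.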
For this reason, a LAN should be considered and measured in the time domain. In the time domain, people typically use terms like milliseconds to measure performance. It is better to describe a LAN's behavior by the time it takes to move information rather than by the rate at which information moves along the wire. By studying where the time is spent, you can locate and eliminate bottlenecks--bottlenecks that you would never find using bandwidth techniques.
LAN Read Cycles
The most common cycle seen on a LAN is the read cycle, where information is moved from a file server to a workstation. Why? LANs are generally used to serve applications from a common server. Even when that application is a large database application, more data is read from its files than written. Commonly 75 percent to 95 percent of all transactions are reads.
Data is generated only as fast as people can work, yet it can be sifted through and moved from the server to the workstation as fast as the system will allow. Therefore, focusing on the read cycle is necessary to gain a fairly accurate understanding of network performance.
The read cycle shown in the figure ``Common Network Read Cycle'' is composed of four distinct phases. It begins with the workstation forming a request packet and forwarding it to the server. The server requires time to decode the packet, obtain the requested information either from disk or from cache memory, move that information internally, form a response packet, and then wait for access to the media. Once access has been granted, the server transmits the response. The workstation then has to perform many of the same internal processes as the server, including waiting for access to the media. The cycle repeats itself until the file, or required portion of the file, is transferred.
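The cycle described above can be sketched as a simple sum of time elements. This is a minimal illustration under stated assumptions, not the article's equation: the class, field names, and sample values are all hypothetical, and each cycle is assumed to carry one full payload.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class ReadCycle:
    """One request/response exchange; all times in milliseconds.

    Field names are illustrative, not the article's symbols.
    """
    client_service: float     # workstation forms request, decodes response
    server_service: float     # server decodes request, fetches data, forms response
    media_access: float       # wait for the media, paid once per transmission
    request_transmit: float   # wire time of the request packet
    response_transmit: float  # wire time of the response packet

    def total(self) -> float:
        # Both the request and the response must wait for the media.
        return (self.client_service + self.server_service
                + 2 * self.media_access
                + self.request_transmit + self.response_transmit)


def file_read_time_ms(file_bytes: int, payload_bytes: int, cycle: ReadCycle) -> float:
    """Cycles repeat until the file is transferred, one payload per cycle."""
    cycles = ceil(file_bytes / payload_bytes)
    return cycles * cycle.total()


# Hypothetical numbers chosen only to show the arithmetic.
cycle = ReadCycle(client_service=2.0, server_service=3.0, media_access=0.1,
                  request_transmit=0.05, response_transmit=0.8)
print(file_read_time_ms(100 * 1024, 1024, cycle))
```

With these made-up values, less than one millisecond of each cycle of roughly six milliseconds is spent transmitting; the rest is spent at the nodes or waiting for the media.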
Which Parameters Matter?
When you study the characteristics of a LAN, you need to know the parameters that affect performance. Although there are many variables to consider, there are three important variables over which you have some degree of control. The three tunable parameters include the following:
1. Packet size
Different topologies support different maximum packet sizes. Ethernet supports a maximum frame size of 1518 bytes. That frame will be filled with a maximum payload of 1024 bytes of data. On the other hand, Token Ring can carry a maximum payload of 16,384 bytes of data (16-Mbps mode) and 4096 bytes of data (4-Mbps mode). FDDI (Fiber Distributed Data Interface) frames can carry a maximum payload of 4096 bytes of data.
2. Media-access time
This is a critical parameter influenced by many factors. First is the nominal access delay inherent in the chosen access methodology of the media. Token Ring uses a token-passing scheme to control and grant access to the media. On the other hand, Ethernet uses a methodology known as CSMA/CD. CSMA/CD access times are typically on the order of five to 10 times faster than token-passing access times. As you will see, this can greatly influence overall performance.
In addition to delays inherent in the different access methodologies employed, media-access time is also influenced by the amount of competition on the segment. The higher the competition, the longer the average access time.
3. Signaling speed (typically called bandwidth)
This is the speed at which individual bits of information are transmitted on the media. This is the least influential of the tunable parameters.
Modeling Network Performance
To best understand network performance, you need an appropriate model. A model will let you change each of the tunable parameters, crank the change through the model, and see the impact. Such a model is expressed in the following equation:
(This equation is not available electronically. Please see December, 1994, issue.)
Where:
a = media-access time factor
f = signaling (frequency) factor
p = packet-size (payload) factor
tc = total client service time
tm = time required to access media
ttrq = time required to transmit a request packet
ttrs = time required to transmit a response packet
trd = total time required for baseline read cycle
t'rd = total time required for modified read cycle
x = file server to workstation service time factor
y = percent fixed-to-total service time
The equation considers all elements of time contained in one network read cycle. It takes into account all three tunable network parameters. The influence each tunable parameter has on each of the time elements is considered, and a ratio is formed between a known case and a case that is created by changing one or more of the tunable parameters. This equation represents a way of calculating an expected performance gain or loss from a known situation. By changing one or more parameters, the equation calculates the expected change in performance.
Although it has been validated through several practical throughput tests, the equation is just one tool for understanding network behavior. It yields results that should be taken as upper bounds of expected performance changes, yet its predictions will be in the same ballpark as actual measured performance changes.
Using the Model
You apply the model by choosing a baseline scenario and then changing one or more of the tunable parameters. The results can be displayed as a graph showing the expected change in performance versus the average service time of the network nodes.
The first case compares 10-Mbps Ethernet with 16-Mbps Token Ring to determine the effect increased bandwidth has on performance. In this case, the bandwidth is raised by a factor of 1.6, while all other factors are held constant.
The figure ``Increased Bandwidth'' illustrates why the performance change is plotted versus the network-node service time. As service time increases, performance gain due to bandwidth decreases. This stands to reason because more and more of the time required to perform a read cycle is spent at the nodes and not in transmitting on the media. This is why you can experience poor performance on your network, yet the network monitor reports low bandwidth utilization.
If a node requires 5 ms of service time or more, the expected performance increase is less than 5 percent. So if you are planning to move from Ethernet to Token Ring because of the increase in available bandwidth, you may be about to pay 100 percent more for less than a 5 percent gain.
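The shape of that curve can be illustrated with a deliberately simplified stand-in for the article's equation, which is not reproduced here. The sketch assumes each read cycle costs a single lumped service time (all node processing and media access) plus the wire time of one 1024-byte response; the function names and this lumping are assumptions, so the numbers will not match the published figure exactly.

```python
def cycle_time_ms(service_ms: float, payload_bytes: int, rate_mbps: float) -> float:
    """Lumped service time plus wire time of one response payload (ms)."""
    wire_ms = payload_bytes * 8 / (rate_mbps * 1000.0)
    return service_ms + wire_ms


def bandwidth_gain(service_ms: float, payload_bytes: int = 1024,
                   base_mbps: float = 10.0, new_mbps: float = 16.0) -> float:
    """Fractional throughput gain from raising only the signaling rate."""
    base = cycle_time_ms(service_ms, payload_bytes, base_mbps)
    new = cycle_time_ms(service_ms, payload_bytes, new_mbps)
    return base / new - 1.0


# Gain shrinks as more of the cycle is spent at the nodes.
for s in (0.0, 1.0, 2.5, 5.0, 10.0):
    print(f"service {s:4.1f} ms -> gain {bandwidth_gain(s):6.1%}")
```

Even in this crude form, the full 60 percent gain appears only at zero service time and falls to a few percent once service time reaches several milliseconds.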
The second case again compares Ethernet and Token Ring, but this time the bandwidth and payload are changed (see the figure ``Increased Bandwidth and Payload''). Token Ring's higher packet-payload capacity is modeled with the payload size of 4096 bytes. Ethernet can carry a maximum of 1024 bytes of payload per packet.
The results of this analysis are clear: Payload capacity is a significant performance factor. If you have large volumes of data to move, you need to maximize packet size. No single factor influences performance more than payload. It dominates all other tunable parameters. Thus, if you are considering moving from Ethernet to Token Ring for the additional payload capacity, and your application can take advantage of that extra capacity, then you are about to pay 100 percent more for roughly a 100 percent gain in performance.
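One way to see why payload dominates is to count read cycles: a larger payload means fewer cycles, and every cycle pays the node and access overhead again. The sketch below is not the article's model. It assumes a fixed per-cycle overhead that does not grow with packet size, and the overhead value and function name are hypothetical.

```python
from math import ceil


def transfer_time_ms(file_bytes: int, payload_bytes: int,
                     rate_mbps: float, overhead_ms: float) -> float:
    """Total time to read a file, one payload per read cycle.

    Assumption: overhead_ms (service plus media access) is the same for
    every cycle regardless of packet size.
    """
    cycles = ceil(file_bytes / payload_bytes)
    wire_ms = payload_bytes * 8 / (rate_mbps * 1000.0)
    return cycles * (overhead_ms + wire_ms)


FILE_BYTES = 1024 * 1024  # 1 MB file
OVERHEAD_MS = 5.0         # hypothetical per-cycle overhead

ethernet = transfer_time_ms(FILE_BYTES, 1024, 10.0, OVERHEAD_MS)
token_ring = transfer_time_ms(FILE_BYTES, 4096, 16.0, OVERHEAD_MS)
print(f"Ethernet:   {ethernet:8.1f} ms in {ceil(FILE_BYTES / 1024)} cycles")
print(f"Token Ring: {token_ring:8.1f} ms in {ceil(FILE_BYTES / 4096)} cycles")
```

Because the overhead is held fixed here, this sketch likely overstates the payload effect; the article's variable list includes a fixed-to-total service time term, which suggests that part of the service time grows with packet size.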
The next case again compares Ethernet to Token Ring, but here Ethernet is modeled as having five times faster average access to the media. To isolate the effects of faster access time on performance, Ethernet and Token Ring are held to 1024-byte payloads.
The figure ``Increased Bandwidth and Media-Access Time'' shows that performance change for moving to 16-Mbps Token Ring is negative for all points on the curve. This striking result means that under this scenario Ethernet actually outperforms Token Ring, even though the Token Ring is operating at 1.6 times the bandwidth. Media-access time is a more significant factor than bandwidth, yet a less significant factor than payload. This analysis suggests that for environments that require packets of 1024 bytes and less, you can expect higher performance from the lower-cost Ethernet. Such environments would include host access, some database applications, and possibly client/server applications.
Some database applications tend to move information based on the record size of the table being accessed. Although a 2-GB table is being accessed, the database engine running on the client will request only a record at a time from the table. For example, if the record size is 256 bytes, then Ethernet will be a better performer for that database application.
The final case compares 100-Mbps FDDI with 16-Mbps Token Ring (see the figure ``Token Ring vs. FDDI''). Token Ring is modeled at its maximum 16,384 bytes of payload per packet, while FDDI is modeled at its maximum 4096 bytes of payload per packet. Media-access time is held constant between the two.
Notice how dominant payload size is in network performance and how little influence bandwidth has. Although FDDI offers 6.25 times the bandwidth of 16-Mbps Token Ring, at 2.5 ms of node-service time the move to FDDI becomes a wash. From that point forward, you can actually expect to see a decrease in performance. Even at zero service time, the expected performance increase is only 10 percent.
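The crossover can be approximated with a simplified stand-in for the article's equation. The sketch assumes each cycle costs a fixed lumped service time plus the wire time of one full payload, then solves for the service time at which both networks move a byte in the same time. The function names and the fixed-overhead assumption are not from the article.

```python
def wire_ms(payload_bytes: int, rate_mbps: float) -> float:
    """Wire time of one payload in milliseconds."""
    return payload_bytes * 8 / (rate_mbps * 1000.0)


def per_byte_ms(service_ms: float, payload_bytes: int, rate_mbps: float) -> float:
    """Average time to deliver one byte when every cycle carries a full payload."""
    return (service_ms + wire_ms(payload_bytes, rate_mbps)) / payload_bytes


def break_even_service_ms(pa: int, ra: float, pb: int, rb: float) -> float:
    """Service time at which networks A and B deliver bytes equally fast.

    Solves (s + ta) / pa == (s + tb) / pb for s; requires pa != pb.
    """
    ta, tb = wire_ms(pa, ra), wire_ms(pb, rb)
    return (pa * tb - pb * ta) / (pb - pa)


# Token Ring: 16,384-byte payload at 16 Mbps. FDDI: 4096-byte payload at 100 Mbps.
s = break_even_service_ms(16384, 16.0, 4096, 100.0)
print(f"break-even service time: {s:.2f} ms")
```

Under these assumptions the break-even lands near 2.3 ms, in the same neighborhood as the 2.5 ms cited above. The sketch does not reproduce the 10 percent figure at zero service time, where it predicts a far larger FDDI advantage, so the article's model evidently includes terms that this simplification leaves out.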
Interpreting the Model
When interpreting the information presented in this article, remember that the model used was derived considering a homogeneous network of only one transaction type and packet size--the read transaction with full packet.
A larger packet size will increase data transfer performance. How much depends on how contiguous the data to be moved is. That is, you will get much better performance moving 5 MB stored in one file than moving 5 MB stored in 50 files. This is because of the overhead of small packets required in opening a file.
Some argue that increasing average packet size on a network will increase competition, because each packet transmitted holds the media longer. You must keep in mind, however, that the increased packet size will tend to decrease the number of transmissions on the media since fewer are required to move the same volume of data. This means that while there may be some increase in segment competition, the bottom line is that there is no reason not to increase average packet size on your media.
Also remember that the model presented considers only two nodes on a LAN--the workstation and server. It remains valid in the presence of other nodes and transactions, however. The presence of other nodes and transactions does nothing more than increase the time to access the media. Therefore, you can predict performance during periods of higher network use by simply selecting a higher node-service-time point on the graph, since increased time to access the media also increases average time spent at the nodes.
Improving Network Performance
There are several steps you can take to optimize performance on your LAN. The first is to check that your packet size is maximized. Many older Token Ring drivers default to 1-KB packets. The packet size must be set at all decision points on the network, including workstations, routers, and servers. The best way to tell if your packet size is maximized is to use a network monitoring device.
After tweaking packet size, carefully choose the NIC (network interface card) and driver solution for your workstations. Remember, network performance includes service time at all nodes, not just the server. Poor performance at the workstation can yield poor overall network performance, regardless of the capabilities of the server. Do not make the decision lightly.
Be aware that most performance problems in NICs tend to be due to their drivers. Choose a manufacturer that has a strong background in the particular media you are using.
In addition, educate yourself about network competition and segmentation. High levels of competition can cause drastically reduced throughput due to increased media-access times. Competition must be controlled to achieve a high degree of performance on your LAN.
If you have done everything you can and still have poor network performance, don't just throw bandwidth at the problem. Chances are high that you won't solve the problem with that approach and will waste time and money. Perform a detailed network analysis to determine where the bottlenecks are and then draft a plan of action. In almost all cases, this will be a less costly, more effective, and more timely solution.
The Best Network for You
There are three important factors over which you have some degree of control when trying to improve the performance of your network: packet size, time to access the media, and bandwidth. Packet size has the most impact, followed by media-access time. Of the three, bandwidth has the least impact.
At 16 Mbps, Token Ring offers the highest available packet size and thus the highest performance in an environment where large volumes of data must be moved. But where packet size is of less importance, such as host access and some client/server applications, Ethernet offers the lowest-cost, highest-performance solution.
The one you choose is dependent on your specific application and environment. The model presented here is a tool to help you make that decision.
Figure: Common Network Read Cycle
Network read cycles commonly have four stages.
Figure: Increased Bandwidth and Payload
Ethernet vs. Token Ring assuming bandwidth increases from 10 to 16 Mbps, and payload goes from 1024 to 4096 bytes.
Figure: Increased Bandwidth
Ethernet vs. Token Ring assuming bandwidth increases from 10 to 16 Mbps.
Figure: Increased Media-Access Time
Ethernet vs. Token Ring assuming bandwidth increases from 10 to 16 Mbps and media-access time increases fivefold.
Figure: Token Ring vs. FDDI
Token Ring vs. FDDI assuming bandwidth increases from 16 to 100 Mbps, but Token Ring uses maximum payload (16,384 bytes) vs. FDDI (4096 bytes).
Brett Husselbaugh is president of Tobek Computer Systems, a LAN consulting firm in Dallas, Texas. He can be reached on the Internet at wbretth@aol.com or on BIX c/o ``editors.''