Second Generation Flow Control
Explicit Rate Traffic Management For ATM
Dr. Lawrence G. Roberts, May 1995
Packet switching was started in the 1960s [1] using a credit based flow control technique in the ARPANET, the predecessor of today's Internet. This simple technique sufficed for 25 years and has been used in all packet networks to date. However, with the major increase in network speed required in ATM networks, the flow control technique must evolve to one which can support very high speed data bursts responsively with very low data loss and without incurring the high costs associated with credit flow control. This was a major technical challenge which was unsolved until the ATM Forum undertook last year to agree on a flow control technique for Available Bit Rate (ABR) traffic. As a result of a major cooperative design effort of many people, an Explicit Rate Flow Control technique has been agreed on which is extremely responsive, loss free and economic. In fact it works so well, with such low delays, that it is very likely to be used for real-time traffic like video and voice as well as data traffic.
The Credit - Rate Debate
Starting in early 1994 a major split occurred in the ATM Forum Traffic Management group around whether to select a rate based flow control technique or a credit based flow control technique. Credit techniques were well understood since they had been used for 25 years, and one company had even built an ATM switch with credit flow control. Rate control methods were primitive and those which were known at that time worked very poorly. Thus, some members believed it was best to use the proven static credit technique. However, many of the members believed that credit would be far too expensive for large numbers of virtual connections at high speeds and that a rate based technique would be far more economic and work far better for ATM. It was also clear that the static credit flow control would not work well for Wide Area Network (WAN) distances and a single technique that would work at all distances was desired. The current ATM Forum specification, UNI 3.1, specifies that switches indicate congestion with a simple mark on data cells, the EFCI bit. For these EFCI switches, the logical control technique is binary rate control. Compatibility with these switches was desired. Also, the agreed control technique for the other ATM traffic types, Constant Bit Rate (CBR), and Variable Bit Rate (VBR) was rate setting. Thus, it was felt that rate control for ABR would be more consistent for ATM service.
The technical problem with the early rate proposals was that they were binary feedback techniques operating with the delay of the entire Round Trip Time (RTT) of the Virtual Connection (VC). Even in the Local Area Network (LAN) where delays are small, the binary technique requires so long to adjust the source rate that the result is not only unresponsive but requires very large buffers in the switches. Thus, in August 1994, I proposed that the Forum consider an explicit rate flow control technique. In an explicit rate system, the switches compute exactly what rate the source should operate at to just use the available bandwidth and send this information to the source directly in a special Resource Management (RM) cell. Thus, the time delay is substantially reduced and the source immediately starts to send at the correct rate and does not have to slowly step up or down to that rate. The impact is a dramatically more responsive system with far smaller buffers being required.
The explicit rate proposal was folded into the existing binary rate proposal as an option for the next round of debates. Then, upon further investigation, a major flaw in the binary rate technique was discovered: it was extremely unfair to different VC's bottlenecked at the same node but otherwise passing through a different number of nodes. The unfairness is in fact so bad that a VC traversing five nodes would get only 6% of the bandwidth of VC's traversing only one node, even though both were bottlenecked by the same overloaded trunk. Fairness is a major requirement for any such service, so the binary EFCI technique had to be made secondary and the explicit rate plan made primary. Also, the explicit rate plan was enhanced to ensure its fairness by adding intelligent marking, a concept where only those VC's bottlenecked at a node have their rates set by that node. In August 1994, I submitted this revised explicit rate proposal to the ATM Forum. At the September meeting, after much argument, the vote was taken and it was decided that the ATM Forum would recommend a rate flow control technique based on the excellent performance of this explicit rate technique.
Since then, all the members have worked together to refine the details of the explicit rate flow control technique. Many significant improvements have been added to ensure its robust operation even under adverse conditions. It has taken until May 1995 to go from the initial concept to a completely understood, detailed specification which has been fixed to stand up to any attack and all failure conditions. This specification should go to straw vote in June and become final a few months later.
The Explicit Rate Control Procedure
The basis of the technique is for the source to send out an RM cell every Nrm (32) data cells which contains the requested source rate (ER), the rate the source is sending at (CCR) and the minimum rate which was guaranteed at call setup (MCR). This RM cell travels across the network and is turned around by the destination. On its return across the network, each switch examines the requested rate and if it can not support this rate, reduces it to the rate it can support. Typically the switch would compute the rate it could support by dividing the total bandwidth available for the bottlenecked VC's by the number of VC's bottlenecked at this node. The RM cell would then return to the source with the ER field indicating the maximum rate that the network could allow this VC to operate without causing congestion. The source then sets its scheduler to send cells at this rate and proceeds to send data (and RM cells) at this rate until conditions change.
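The marking step described above can be sketched in a few lines of Python. This is a minimal illustration, not the ATM Forum algorithm: the RMCell class, the function names and the numbers are hypothetical, the equal split is the simple division the text describes, and the rule that the source never drops below its MCR is an assumption.

```python
from dataclasses import dataclass


@dataclass
class RMCell:
    """Hypothetical model of the three RM cell fields named in the text."""
    er: float   # explicit rate: requested by the source, reduced by switches
    ccr: float  # current cell rate the source is sending at
    mcr: float  # minimum cell rate guaranteed at call setup


def fair_share(available_rate: float, bottlenecked_vcs: int) -> float:
    """Equal split of the available bandwidth among VCs bottlenecked here."""
    if bottlenecked_vcs < 1:
        raise ValueError("need at least one bottlenecked VC")
    return available_rate / bottlenecked_vcs


def mark_on_return(cell: RMCell, available_rate: float, bottlenecked_vcs: int) -> None:
    """A switch lowers ER to what it can support; it never raises ER."""
    cell.er = min(cell.er, fair_share(available_rate, bottlenecked_vcs))


def source_rate(cell: RMCell) -> float:
    """Assumption: the source never drops below its guaranteed MCR."""
    return max(cell.mcr, cell.er)


# Illustrative numbers only (cells/s); not taken from the specification.
cell = RMCell(er=353_000.0, ccr=100_000.0, mcr=10_000.0)
mark_on_return(cell, available_rate=300_000.0, bottlenecked_vcs=3)  # can support 100,000
mark_on_return(cell, available_rate=200_000.0, bottlenecked_vcs=4)  # can support 50,000
new_rate = source_rate(cell)
```

In this example the second switch is the tighter bottleneck, so the RM cell returns with ER set to 50,000 cells/s and the source schedules its cells at that rate.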
Since every 32nd cell is an RM cell, there is always an RM cell going past any switch for it to mark if conditions suddenly change at that node. Thus, for example, if a previously idle VC starts up, and its data starts to arrive at a switch, the switch will mark the next RM cell going by on all current VC's, with the new lower rate which will be required to support the one additional VC. Within the round trip time from this switch to the source plus one RM cell interval, the other VC's rates will have been adjusted. During this short period, a small queue of cells proportional to the number of cells the startup VC would have sent during this period, will build up. Then, with new rates in place for all VC's at a rate slightly under 100% loading, the queue will drain off, and the system will be back to equilibrium.
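A rough upper bound on this transient queue can be estimated as follows. The sketch assumes the link was already fully loaded, so that every cell from the startup VC is excess until the other sources have been slowed; the function names and the example figures are hypothetical and chosen only for illustration.

```python
NRM = 32  # data cells per RM cell, as given in the text


def transient_queue_cells(startup_rate: float, feedback_rtt: float,
                          existing_vc_rate: float, nrm: int = NRM) -> float:
    """Cells queued before the existing VCs have been slowed down.

    Assumes the link was fully loaded, so every cell from the startup VC is
    excess until feedback reaches the other sources.
    """
    rm_interval = nrm / existing_vc_rate  # seconds between RM cells on one VC
    return startup_rate * (feedback_rtt + rm_interval)


def drain_time(queue_cells: float, link_rate: float, target_load: float) -> float:
    """Seconds to empty the queue once total load sits below 100%."""
    return queue_cells / (link_rate * (1.0 - target_load))


# Illustrative case: a VC starts at full OC-3 rate, the switch is 1 ms (round
# trip) from the sources, and each existing VC runs at 35,300 cells/s.
queue = transient_queue_cells(startup_rate=353_000.0, feedback_rtt=0.001,
                              existing_vc_rate=35_300.0)
seconds = drain_time(queue, link_rate=353_000.0, target_load=0.95)
```

With these assumed figures the queue peaks at roughly 673 cells and drains in about 38 ms once the new rates hold the link at 95% load.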
To limit the cells sent by a startup VC during this first RTT, and also to protect the network in case of destination failure, the specification has set a limit (Xrm) on the number of RM cells which may be sent without any RM cell being received. After Xrm RM cells have been sent the source must decrease its rate each RM cycle. For a switch, the number of cells an OC-3 source may send on startup before control is established in RTT seconds is thus carefully limited to typically 200-2000 cells even if the source is permitted to start up at full rate. Given this, and based on computing the number of sources that could start randomly in RTT seconds, it is possible to determine a strict bound on the buffer size required to ensure that no more than one in 10^12 cells is lost. For LANs with OC-3 startup rates and typical parameters, buffer sizes of 2,000 cells would be sufficient for 4,000 VC's. For WANs, 10,000-32,000 cells of buffer storage would be required under the same conditions with OC-3 startup rates (Figure 1). The buffer requirement is approximately equal to the delay bandwidth product for the channel, that is, RTT times the cell rate of 353,000 cells/sec for an OC-3. This storage can be in a pool supporting many ports since none of these buffers are dedicated to any VC or port.
Not all switches will have buffers large enough to permit the source to start up after an idle period at OC-3. However, even if the buffer size is limited to, say, 16,000 cells, the Initial Cell Rate, ICR, can be adjusted to ensure a cell loss of less than 10^-12, as shown in Figure 2. In this case, switches with more limited buffers can still permit the source to start up at full OC-3 rate for LAN and MAN distances, but must reduce the startup rate for WAN distances to ensure extremely low cell loss. This only slows the source down for the first round trip time of 20-100 ms. After that, the source will be told the maximum rate the network can support and it can then immediately proceed at that rate.
Figure 2. Initial Cell Rate Required to Ensure Cell Loss of Less Than 10^-12 With Explicit Rate Flow Control and a 16,000-cell buffer.
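The delay bandwidth rule of thumb is easy to check numerically. The sketch below uses the OC-3 cell rate quoted in the text; the function name and the sample RTT values are illustrative assumptions, not figures from the specification.

```python
OC3_CELL_RATE = 353_000  # cells/s, the OC-3 figure quoted in the text


def delay_bandwidth_cells(rtt_seconds: float, cell_rate: int = OC3_CELL_RATE) -> int:
    """Approximate buffer need: round trip time multiplied by the cell rate."""
    return round(rtt_seconds * cell_rate)


# Illustrative round trip times (assumed): 1 ms, 30 ms and 90 ms.
lan_cells = delay_bandwidth_cells(0.001)
wan_low_cells = delay_bandwidth_cells(0.030)
wan_high_cells = delay_bandwidth_cells(0.090)
```

Round trip times of 30 ms and 90 ms give 10,590 and 31,770 cells, which brackets the 10,000-32,000 cell WAN range given above. A 1 ms round trip gives only 353 cells, so the 2,000 cell LAN figure evidently includes headroom beyond the bare product.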
One important benefit of explicit rate control is that the buffer requirements are practically independent of the number of VC's permitted. This is because the buffer requirements are due to the probability of a group of idle VC's starting up during one RTT period and no fixed allocation of buffer space per VC is required. If more VC's are active, which would increase the probability of VC's starting up, and assuming the channel continues to be loaded the same, then the average rate of each VC must have decreased, which correspondingly lowers the probability of VC startup, resulting in approximately the same buffer requirement. This is important because the number of VC's users are requiring on user interfaces is rapidly increasing toward 4,000 per port and the number required on trunks will be whatever number can be supported while operating the trunk at 95% utilization. With the typical client-server applications, it is likely that trunks will need to support at least 4,000 VC's and perhaps more.
Buffer Comparison with Credit Proposal
The above buffer requirement for explicit rate is in stark contrast to the requirement for static credit where, in addition to the delay bandwidth product per queue, 10 cells are required to be reserved for each VC. Thus, a LAN credit switch (with RTT < 1 ms) which wants to support the same 4,000 VC's per port would require 2000 buffers for the delay bandwidth product plus 40,000 buffers for the per VC buffers for a total of 2.2 Megabytes of buffer memory per port. For OC-3 ports, the cheapest way to provide this memory is with four DRAM memory chips and a controller for each port which would cost $98/port. This is in comparison to the explicit rate technique where 2000 buffers can support 4 ports and the memory for all 4 ports could be one SSRAM chip costing $11/port. There are not likely to be any significant differences in the other (logic) costs for rate or credit since the logic for queuing and switching is far more complex than the logic for flow control. The extra $87/port for the buffering in the credit system would typically increase the end-user price by $300/port which is very significant (+30%) compared to the $1,000/port switch price point likely to exist in 1996.
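The arithmetic behind this comparison can be reproduced directly. The sketch assumes each buffer holds one whole 53-byte ATM cell, which is consistent with the 2.2 Megabyte total; the variable and function names are hypothetical, and the dollar figures are simply those quoted above.

```python
ATM_CELL_BYTES = 53  # standard ATM cell size; assumes whole cells are buffered


def credit_buffer_cells(vcs: int, per_vc_cells: int = 10,
                        delay_bandwidth_cells: int = 2_000) -> int:
    """Static credit: delay bandwidth product plus a fixed reservation per VC."""
    return delay_bandwidth_cells + vcs * per_vc_cells


credit_cells = credit_buffer_cells(4_000)              # cells per port
credit_bytes = credit_cells * ATM_CELL_BYTES           # bytes per port

rate_cells_per_port = 2_000 // 4                       # 2,000 cells shared by 4 ports
rate_bytes_per_port = rate_cells_per_port * ATM_CELL_BYTES

extra_cost_per_port = 98 - 11                          # dollars, figures from the text
```

The credit switch needs 42,000 cells, or 2,226,000 bytes per port, against 500 cells (26,500 bytes) per port for explicit rate, and the memory cost difference is the $87/port cited above.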
Cost Per Network Interface
It is true that the logic for the Network Interface Card (NIC) using the static credit technique is less complex than the logic required for the explicit rate technique because no scheduler would be required. This would only be true if no CBR and VBR support is put into the same card since they require the scheduler. However, NIC's are already being built with only one chip which does everything; the explicit rate functions, the scheduling functions, the assembly-reassembly functions and the computer bus interface. This will clearly be the future for all NIC designs and thus the cost of the NIC will not vary with the flow control technique unless it was so complex as to require two chips. However, the cost of ATM for a user interface is the sum of one NIC plus one switch port. Since the switch port for static credit would have been more expensive and the NIC cost the same, then the total cost for interfacing a user with credit would have exceeded the cost for explicit rate by the same $300/port estimated above.
Responsiveness of Rate Techniques
The main performance characteristic of a flow control technique is its responsiveness. The most critical situation is when the source has been idle and restarts, wishing to send a block of data. The binary rate techniques, which are required if the older EFCI switches are used, will need to build up the rate slowly, at one upward increment each RTT. The starting rate, ICR, will need to be set low so as to protect the network buffers because of the long time required to control a source starting up. With explicit rate switches, the rate can start at full speed and within one RTT the rate will be set exactly to the rate the network can support. In a typical MAN example with RTT=5 ms and where the network happens to be able to support 44 Mbps at the moment a source starts up, the transfer times for a 100-kilobyte block appear in Table 1.
The binary rate switch network is 16 ms slower to transfer the block mainly because it must start slow and requires 30 ms to get up to the allowed network rate. A static credit network and an explicit rate network would feed the data forward fast enough that the total block is transferred at the maximum rate the network could currently handle.
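As a rough check on the scale of these numbers, the time to send the block at the full network rate can be computed as below. This is not a reconstruction of Table 1: it assumes 100 kilobytes means 100,000 bytes, counts payload bits only, and ignores cell header overhead and propagation delay. The function name is hypothetical.

```python
def payload_transfer_ms(block_bytes: int, rate_bps: float) -> float:
    """Milliseconds to send the payload bits at a constant rate."""
    return block_bytes * 8 / rate_bps * 1000.0


# 100,000 bytes at the 44 Mbps the network can support in the example.
full_rate_ms = payload_transfer_ms(100_000, 44e6)

# The 16 ms penalty for binary rate control is the figure quoted in the text.
binary_rate_ms = full_rate_ms + 16.0
```

Under these assumptions the block takes about 18 ms at the full rate, so the 16 ms penalty of the binary rate network nearly doubles the transfer time.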
Real-Time Over ABR with Explicit Rate and Weighted Fair Queuing
It has always been intended that real-time traffic would be sent using CBR or VBR service on an ATM network. In these services, prior guarantees are made at call setup time as to the bandwidth guaranteed for a call. In ABR, a guarantee is also available, namely a guaranteed Minimum Cell Rate (MCR). Taking advantage of this option, and using weighted fair queuing where the data traffic and the real-time traffic use different queues and can be assured not to interfere with each other, explicit rate networks will be able to provide guaranteed low delay service. The bandwidth available will be continually changing and the source must be able to change its rate as requested. This is very easy for most compression techniques, which will almost always be used for video traffic. Then, instead of reserving 2 to 3 times as much bandwidth as necessary for a variable rate video signal so that cells are not lost when all the users peak at once, the network just informs the sources at those peak moments and they all slow down enough to just fit in the channel. This way, the ABR channel can always maintain 95% utilization. With a minimum bandwidth guarantee, each video source is assured of always being able to send at maximum compression and normally will be sending at its desired rate. Thus, by using the explicit rate feedback in the ABR service, the channel utilization increases by 2:1 at OC-12 and 3:1 at OC-3 for MPEG compressed video traffic.
Explicit rate for ATM networks is now almost completely defined by the ATM Forum. It is technically a very new method of flow control but has proved to be extremely fair, responsive and economic. As the specification is just being completed and will not be frozen for several months, switches incorporating explicit rate will not be available until sometime in 1996, some as early as first quarter, and others as late as fourth quarter. NICs incorporating explicit rate will become available earlier, perhaps as early as third quarter 1995, which is important because all NICs must be replaced in order to support the new ABR service. Explicit rate is a major advance over the EFCI binary rate ATM switches currently available in fairness, responsiveness and cost. It is also a major cost improvement over the static credit flow control which has been in use since the beginning of packet switching: it is particularly cheaper for WAN networks but also 25% cheaper for LAN networks. It operates with such low queuing delays that it also appears possible to use it for real-time traffic like compressed video. As a result, it is very possible that explicit rate ABR service will be the only ATM service needed in future years.
[1] L. G. Roberts, "Multiple Computer Networks and Intercomputer Communication," ACM Symposium on Operating System Principles, October 1967. (First description of the ARPANET)
Copyright © 2001 Dr. Lawrence G. Roberts