Section: Linux (8)
Updated: 16 December 2001
tc qdisc ... dev dev ( parent classid | root) [ handle major: ] cbq [ allot bytes ] avpkt bytes bandwidth rate [ cell bytes ] [ ewma log ] [ mpu bytes ]
tc class ... dev dev parent major:[minor] [ classid major:minor ] cbq allot bytes [ bandwidth rate ] [ rate rate ] prio priority [ weight weight ] [ minburst packets ] [ maxburst packets ] [ ewma log ] [ cell bytes ] avpkt bytes [ mpu bytes ] [ bounded isolated ] [ split handle & defmap defmap ] [ estimator interval timeconstant ]
Class Based Queueing is a classful qdisc that implements a rich linksharing hierarchy of classes. It contains shaping elements as well as prioritizing capabilities. Shaping is performed using link idle time calculations based on the timing of dequeue events and underlying link bandwidth.
When shaping a 10mbit/s connection to 1mbit/s, the link will be idle 90% of the time. If it isn't, it needs to be throttled so that it IS idle 90% of the time.
During operations, the effective idletime is measured using an exponential weighted moving average (EWMA), which considers recent packets to be exponentially more important than past ones. The Unix loadaverage is calculated in the same way.
The calculated idle time is subtracted from the EWMA measured one, the resulting number is called 'avgidle'. A perfectly loaded link has an avgidle of zero: packets arrive exactly at the calculated interval.
An overloaded link has a negative avgidle and if it gets too negative, CBQ throttles and is then 'overlimit'.
Conversely, an idle link might amass a huge avgidle, which would then allow infinite bandwidths after a few hours of silence. To prevent this, avgidle is capped at maxidle.
If overlimit, in theory, the CBQ could throttle itself for exactly the amount of time that was calculated to pass between packets, and then pass one packet, and throttle again. Due to timer resolution constraints, this may not be feasible, see the minburst parameter below.
When enqueueing a packet, CBQ starts at the root and uses various methods to determine which class should receive the data.
In the absence of uncommon configuration options, the process is rather easy. At each node we look for an instruction, and then go to the class the instruction refers us to. If the class found is a barren leaf-node (without children), we enqueue the packet there. If it is not yet a leaf node, we do the whole thing over again starting from that node.
The following actions are performed, in order at each node we visit, until one sends us to another node, or terminates the process.
This algorithm makes sure that a packet always ends up somewhere, even while you are busy building your configuration.
When dequeuing for sending to the network device, CBQ decides which of its classes will be allowed to send. It does so with a Weighted Round Robin process in which each class with packets gets a chance to send in turn. The WRR process starts by asking the highest priority classes (lowest numerically - highest semantically) for packets, and will continue to do so until they have no more data to offer, in which case the process repeats for lower priorities.
Classes by default borrow bandwidth from their siblings. A class can be prevented from doing so by declaring it 'bounded'. A class can also indicate its unwillingness to lend out bandwidth by being 'isolated'.
The root of a CBQ qdisc class tree has the following parameters:
A CBQ qdisc does not shape out of its own accord. It only needs to know certain parameters about the underlying link. Actual shaping is done in classes.
Classes have a host of parameters to configure their operation.
The time to wait is called the offtime. Higher values of minburst lead to more accurate shaping in the long term, but to bigger bursts at millisecond timescales. Optional.
Minidle is specified in negative microseconds, so 10 means that avgidle is capped at -10us. Optional.
The defmap specifies which priorities this class wants to receive, specified as a bitmap. The Least Significant Bit corresponds to priority zero. The split parameter tells CBQ at which class the decision must be made, which should be a (grand)parent of the class you are adding.
As an example, 'tc class add ... classid 10:1 cbq .. split 10:0 defmap c0' configures class 10:0 to send packets with priorities 6 and 7 to 10:1.
The complimentary configuration would then be: 'tc class add ... classid 10:2 cbq ... split 10:0 defmap 3f' Which would send all packets 0, 1, 2, 3, 4 and 5 to 10:1.
The actual bandwidth of the underlying link may not be known, for example in the case of PPoE or PPTP connections which in fact may send over a pipe, instead of over a physical device. CBQ is quite resilient to major errors in the configured bandwidth, probably a the cost of coarser shaping.
Default kernels rely on coarse timing information for making decisions. These may make shaping precise in the long term, but inaccurate on second long scales.
Tutoriais de Tecnologia Web