Environment

  • Red Hat Enterprise Linux 7
  • Red Hat Enterprise Linux 6
  • Network interface card (NIC) with multiple hardware transmit interrupt channels
  • Default Linux queuing discipline mq qdisc

Issue

  • What is the mq qdisc (queuing discipline) in tc Traffic Control?
  • How does Linux send packets to NICs with multiple transmit interrupt queues?
  • Does the kernel use more than one Tx channel on multi-queue network interfaces?

Resolution

Where a NIC has multiple hardware transmit queues available, the Linux kernel makes use of those multiple queues. The kernel does this by way of the multi-queue queuing discipline (mq qdisc).

The mq qdisc is a dummy qdisc which sets up an array of pfifo_fast queues as Traffic Control (tc) classes under the root mq qdisc. One pfifo_fast queue is created for each hardware queue.
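
As a rough check of how many queues to expect, the per-queue directories that the kernel exposes under /sys/class/net/<device>/queues can be counted. The following is a minimal sketch: count_tx_queues is a hypothetical helper name, and the optional second argument (an alternative sysfs root) is an assumption added only to make the function easy to try out. The number of tx-* directories may not match the number of mq classes exactly on every driver.

```shell
# Hypothetical helper: count the transmit queue directories of an interface.
# $1 = interface name, $2 = sysfs root (optional, defaults to /sys/class/net)
count_tx_queues() {
    dev=$1
    sysfs_root=${2:-/sys/class/net}
    count=0
    for q in "$sysfs_root/$dev"/queues/tx-*; do
        # An unmatched glob stays literal, so only count real directories.
        if [ -d "$q" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}
```

For example, count_tx_queues eth0 prints the number of tx-* entries found for eth0, or 0 if the interface does not exist.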

The kernel selects a queue for each packet using a hash function, which spreads traffic across the multiple queues.
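
The idea of hash-based spreading can be sketched in shell. This is an illustration only: flow_to_queue is a hypothetical name, the flow string format is made up, cksum stands in for the kernel's own packet hash, and the modulo step is an assumed way of reducing the hash to a queue index rather than the kernel's exact arithmetic.

```shell
# Hypothetical helper: map a flow description to one of N transmit queues.
# $1 = any string identifying a flow, $2 = number of queues
flow_to_queue() {
    flow=$1
    nqueues=$2
    # cksum prints "<crc> <byte count>"; keep only the CRC as the hash value.
    hash=$(printf '%s' "$flow" | cksum | cut -d ' ' -f 1)
    # Reduce the hash to a zero-based queue index.
    echo $(( hash % nqueues ))
}
```

The same flow string always maps to the same index, while different flow strings tend to land on different indexes, which is the property that lets traffic spread across queues.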

The root queue can be inspected in the usual way:

# tc -s qdisc show dev eth0
qdisc mq 0: root
Sent 22838 bytes 227 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0

Each individual queue can be inspected as a class off the root queue:

# tc -s class show dev eth0
class mq :1 root
Sent 21870 bytes 229 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
class mq :2 root
Sent 3618 bytes 21 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
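
When an interface has many queues, the class output can be condensed to one line per queue to make the distribution easier to read. The following is a minimal sketch that assumes the output layout shown above; mq_class_summary is a hypothetical helper name.

```shell
# Hypothetical helper: condense "tc -s class show" output read from stdin
# into one line per class with its packet and byte counters.
mq_class_summary() {
    awk '$1 == "class" { id = $3 }
         $1 == "Sent"  { printf "%s packets=%s bytes=%s\n", id, $4, $2 }'
}
```

It can be used as tc -s class show dev eth0 | mq_class_summary. A queue whose counters stay at zero while others grow suggests that traffic is not being spread across all queues.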

The length of the pfifo_fast queues is changed in the usual way, where <length> is the desired queue length in packets:

# ip link set dev eth0 txqueuelen <length>

It is possible to change each class to a different queuing discipline if rate limiting or prioritization is required on the interface.
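
One way to do this is sketched below. The function only prints the tc commands so they can be reviewed before being run by hand. Several details are assumptions rather than part of this article: the helper name mq_child_qdisc_cmds, the explicit handle 1: given to the root mq qdisc so that its classes can be addressed, and the choice of sfq as the example child qdisc. Class numbers are printed in hexadecimal, which is how tc interprets them.

```shell
# Hypothetical helper: print (do not run) tc commands that attach a child
# qdisc to every class of an mq root qdisc.
# $1 = interface, $2 = number of tx queues, $3 = child qdisc name
mq_child_qdisc_cmds() {
    dev=$1
    nqueues=$2
    child=$3
    # Give the root mq qdisc an explicit handle (1: is an arbitrary choice).
    echo "tc qdisc replace dev $dev root handle 1: mq"
    i=1
    while [ "$i" -le "$nqueues" ]; do
        # tc reads class minor numbers as hexadecimal, hence %x.
        printf 'tc qdisc replace dev %s parent 1:%x %s\n' "$dev" "$i" "$child"
        i=$((i + 1))
    done
}
```

For example, mq_child_qdisc_cmds eth0 2 sfq prints three commands: one for the root and one for each of the two classes. Replacing the root qdisc briefly affects traffic on the interface, so this is best tried outside production first.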

Root Cause

The mq qdisc was added to the Linux kernel in 2009 with the commit net_sched: add classful multiqueue dummy scheduler.

Also see multiqueue.txt in kernel-doc.

Diagnostic Steps

Inspect the kernel source.

When a device has multiple tx queues, the mq qdisc is initialised. This allocates an array of qdiscs, one for each real tx queue, and calls qdisc_create_dflt() for each of them:

net/sched/sch_mq.c

static int mq_init(struct Qdisc *sch, struct nlattr *opt)
{
...
    for (ntx = 0; ntx < dev->num_tx_queues; ntx++) {
        dev_queue = netdev_get_tx_queue(dev, ntx);
        qdisc = qdisc_create_dflt(dev, dev_queue, default_qdisc_ops,
                                  TC_H_MAKE(TC_H_MAJ(sch->handle),
                                            TC_H_MIN(ntx + 1)));
...
        priv->qdiscs[ntx] = qdisc;
    }
...
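
The TC_H_MAKE expression in the excerpt above explains why the classes are numbered from :1 rather than :0. A tc handle is a 32-bit value whose upper 16 bits are the major number and whose lower 16 bits are the minor number, and the minor number is set to the queue index plus one. The arithmetic can be sketched in shell; mq_child_handle is a hypothetical helper name.

```shell
# Hypothetical helper: compute the handle that mq_init passes for a tx queue.
# $1 = 32-bit handle of the mq qdisc (decimal), $2 = zero-based queue index
mq_child_handle() {
    parent=$1
    ntx=$2
    # Equivalent of TC_H_MAKE(TC_H_MAJ(parent), TC_H_MIN(ntx + 1)).
    handle=$(( (parent & 0xFFFF0000) | ((ntx + 1) & 0x0000FFFF) ))
    # Print as major:minor in hexadecimal, the notation tc uses.
    printf '%x:%x\n' $(( handle >> 16 )) $(( handle & 0xFFFF ))
}
```

With the default root handle of 0, mq_child_handle 0 0 gives 0:1, which tc displays as :1 in the class listing.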

The default qdisc is pfifo_fast:

net/sched/sch_generic.c

/* Qdisc to use by default */
struct Qdisc_ops *default_qdisc_ops = &pfifo_fast_ops;
EXPORT_SYMBOL(default_qdisc_ops);