Table of Contents

  • Configuration
  • 80-vpp.conf
  • startup.conf
  • Configurable VPP Threading Modes
  • Running Examples
  • Running VPP in non-DPDK Mode
  • Running VPP in DPDK Mode
  • Router and Switch for namespaces


Configuration

  • VPP configuration manual: https://fd.io/docs/vpp/master/gettingstarted/users/configuring/index.html

VPP has two important configuration files (their default install locations are shown in the quick check below):

  1. The VPP sysctl configuration file: 80-vpp.conf
  2. The VPP startup configuration file: startup.conf
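
On a typical package-based installation these two files live at the paths below. This is a quick check, assuming the default package layout with startup.conf under /etc/vpp/ (the sysctl path is the one described in the next section):

$ ls /etc/sysctl.d/80-vpp.conf /etc/vpp/startup.conf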

80-vpp.conf

During the VPP installation, 80-vpp.conf is copied, manually or automatically by the package, to /etc/sysctl.d/80-vpp.conf so that the kernel parameters VPP depends on are applied. After installation, the VPP daemon's lifecycle can be managed with systemctl:

$ systemctl start vpp && systemctl enable vpp && systemctl status vpp

The initial content of 80-vpp.conf is shown below; it mainly sets the number of HugePages (2MB huge pages) that the VPP daemon will use.

# Number of 2MB hugepages desired
vm.nr_hugepages=1024

# Must be greater than or equal to (2 * vm.nr_hugepages).
vm.max_map_count=3096

# All groups allowed to access hugepages
vm.hugetlb_shm_group=0

# Shared Memory Max must be greater or equal to the total size of hugepages.
# For 2MB pages, TotalHugepageSize = vm.nr_hugepages * 2 * 1024 * 1024
# If the existing kernel.shmmax setting (cat /proc/sys/kernel/shmmax)
# is greater than the calculated TotalHugepageSize then set this parameter
# to current shmmax value.
kernel.shmmax=2147483648

As shown above, by default VPP configures the operating system with 1024 × 2MB HugePages, i.e. 2GB in total, which matches the kernel.shmmax value of 1024 × 2 × 1024 × 1024 = 2147483648 bytes.

$ cat /proc/meminfo | grep Huge
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
HugePages_Total: 1024
HugePages_Free: 982
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB

Note that starting VPP overwrites the operating system's existing HugePage settings. So when running VPP inside an OpenStack VM, make sure the VM itself is configured with HugePages enabled, and ideally keep the number of huge pages configured for VPP consistent with the number configured for the VM.
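
These kernel settings take effect when the sysctl file is (re)loaded, for example at boot or manually. A minimal way to apply and verify them by hand, assuming the file is already in place under /etc/sysctl.d/:

$ sudo sysctl -p /etc/sysctl.d/80-vpp.conf
$ cat /proc/sys/vm/nr_hugepages
1024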

startup.conf

As VPP's main configuration file, startup.conf contains the following sections.

  • unix: Unix-like operating-system level settings.
  • api-trace: binary API tracing settings.
  • api-segment: binary API segment settings.
  • statseg: statistics segment settings.
  • socksvr: binary API socket server settings.
  • cpu: CPU-related settings.
  • memory: memory-related settings.
  • buffers: buffer-related settings.
  • node: graph-node-related settings.
  • dpdk: DPDK-related settings.
  • plugins: plugin-related settings.
  • l2fib: L2 FIB-related settings.
  • ipsec: IPsec-related settings.
  • logging: logging settings.

The default startup.conf is reproduced below:
unix {
# nodaemon
log /var/log/vpp/vpp.log
full-coredump
cli-listen /run/vpp/cli.sock
gid vpp

## run vpp in the interactive mode
# interactive

## do not use colors in terminal output
# nocolor

## do not display banner
# nobanner
}

api-trace {
## This stanza controls binary API tracing. Unless there is a very strong reason,
## please leave this feature enabled.
on
## Additional parameters:
##
## To set the number of binary API trace records in the circular buffer, configure nitems
##
## nitems <nnn>
##
## To save the api message table decode tables, configure a filename. Results in /tmp/<filename>
## Very handy for understanding api message changes between versions, identifying missing
## plugins, and so forth.
##
## save-api-table <filename>
}

api-segment {
gid vpp
}

socksvr {
default
}

# memory {
## Set the main heap size, default is 1G
# main-heap-size 2G

## Set the main heap page size. Default page size is OS default page
## which is in most cases 4K. if different page size is specified VPP
## will try to allocate main heap by using specified page size.
## special keyword 'default-hugepage' will use system default hugepage
## size
# main-heap-page-size 1G
#}

cpu {
## In the VPP there is one main thread and optionally the user can create worker(s)
## The main thread and worker thread(s) can be pinned to CPU core(s) manually or automatically

## Manual pinning of thread(s) to CPU core(s)

## Set logical CPU core where main thread runs, if main core is not set
## VPP will use core 1 if available
# main-core 1

## Set logical CPU core(s) where worker threads are running
# corelist-workers 2-3,18-19

## Automatic pinning of thread(s) to CPU core(s)

## Sets number of CPU core(s) to be skipped (1 ... N-1)
## Skipped CPU core(s) are not used for pinning main thread and working thread(s).
## The main thread is automatically pinned to the first available CPU core and worker(s)
## are pinned to next free CPU core(s) after core assigned to main thread
# skip-cores 4

## Specify a number of workers to be created
## Workers are pinned to N consecutive CPU cores while skipping "skip-cores" CPU core(s)
## and main thread's CPU core
# workers 2

## Set scheduling policy and priority of main and worker threads

## Scheduling policy options are: other (SCHED_OTHER), batch (SCHED_BATCH)
## idle (SCHED_IDLE), fifo (SCHED_FIFO), rr (SCHED_RR)
# scheduler-policy fifo

## Scheduling priority is used only for "real-time" policies (fifo and rr),
## and has to be in the range of priorities supported for a particular policy
# scheduler-priority 50
}

# buffers {
## Increase number of buffers allocated, needed only in scenarios with
## large number of interfaces and worker threads. Value is per numa node.
## Default is 16384 (8192 if running unprivileged)
# buffers-per-numa 128000

## Size of buffer data area
## Default is 2048
# default data-size 2048

## Size of the memory pages allocated for buffer data
## Default will try 'default-hugepage' then 'default'
## you can also pass a size in K/M/G e.g. '8M'
# page-size default-hugepage
# }

# dpdk {
## Change default settings for all interfaces
# dev default {
## Number of receive queues, enables RSS
## Default is 1
# num-rx-queues 3

## Number of transmit queues, Default is equal
## to number of worker threads or 1 if no worker threads
# num-tx-queues 3

## Number of descriptors in transmit and receive rings
## increasing or reducing number can impact performance
## Default is 1024 for both rx and tx
# num-rx-desc 512
# num-tx-desc 512

## VLAN strip offload mode for interface
## Default is off
# vlan-strip-offload on

## TCP Segment Offload
## Default is off
## To enable TSO, 'enable-tcp-udp-checksum' must be set
# tso on

## Devargs
## device specific init args
## Default is NULL
# devargs safe-mode-support=1,pipeline-mode-support=1

## rss-queues
## set valid rss steering queues
# rss-queues 0,2,5-7
# }

## Whitelist specific interface by specifying PCI address
# dev 0000:02:00.0

## Blacklist specific device type by specifying PCI vendor:device
## Whitelist entries take precedence
# blacklist 8086:10fb

## Set interface name
# dev 0000:02:00.1 {
# name eth0
# }

## Whitelist specific interface by specifying PCI address and in
## addition specify custom parameters for this interface
# dev 0000:02:00.1 {
# num-rx-queues 2
# }

## Change UIO driver used by VPP, Options are: igb_uio, vfio-pci,
## uio_pci_generic or auto (default)
# uio-driver vfio-pci

## Disable multi-segment buffers, improves performance but
## disables Jumbo MTU support
# no-multi-seg

## Change hugepages allocation per-socket, needed only if there is need for
## larger number of mbufs. Default is 256M on each detected CPU socket
# socket-mem 2048,2048

## Disables UDP / TCP TX checksum offload. Typically needed for use
## faster vector PMDs (together with no-multi-seg)
# no-tx-checksum-offload

## Enable UDP / TCP TX checksum offload
## This is the reversed option of 'no-tx-checksum-offload'
# enable-tcp-udp-checksum
# }

## node variant defaults
#node {

## specify the preferred default variant
# default { variant avx512 }

## specify the preferred variant, for a given node
# ip4-rewrite { variant avx2 }

#}


# plugins {
## Adjusting the plugin path depending on where the VPP plugins are
# path /ws/vpp/build-root/install-vpp-native/vpp/lib/vpp_plugins

## Disable all plugins by default and then selectively enable specific plugins
# plugin default { disable }
# plugin dpdk_plugin.so { enable }
# plugin acl_plugin.so { enable }

## Enable all plugins by default and then selectively disable specific plugins
# plugin dpdk_plugin.so { disable }
# plugin acl_plugin.so { disable }
# }

## Statistics Segment
# statseg {
# socket-name <filename>, name of the stats segment socket
# defaults to /run/vpp/stats.sock
# size <nnn>[KMG], size of the stats segment, defaults to 32mb
# page-size <nnn>, page size, ie. 2m, defaults to 4k
# per-node-counters on | off, defaults to none
# update-interval <f64-seconds>, sets the segment scrape / update interval
# }

## L2 FIB
# l2fib {
## l2fib hash table size.
# table-size 512M

## l2fib hash table number of buckets. Must be power of 2.
# num-buckets 524288
# }

## ipsec
# {
# ip4 {
## ipsec for ipv4 tunnel lookup hash number of buckets.
# num-buckets 524288
# }
# ip6 {
## ipsec for ipv6 tunnel lookup hash number of buckets.
# num-buckets 524288
# }
# }

# logging {
## set default logging level for logging buffer
## logging levels: emerg, alert,crit, error, warn, notice, info, debug, disabled
# default-log-level debug
## set default logging level for syslog or stderr output
# default-syslog-log-level info
## Set per-class configuration
# class dpdk/cryptodev { rate-limit 100 level debug syslog-level error }
# }
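
After changing /etc/vpp/startup.conf, the new settings only take effect when the daemon is restarted. A quick way to restart and inspect the startup log, assuming VPP is managed as a systemd service as shown earlier:

$ sudo systemctl restart vpp
$ journalctl -u vpp -b --no-pager | tail -n 20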

Configurable VPP Threading Modes

Single-Threaded:

  • Both the control plane and the forwarding engine run in a single thread.

Multi-thread with workers only:

  • The control plane (API, CLI) runs in the main thread.
  • The forwarding engine is distributed across one or more worker threads (see the configuration sketch below).

Multi-thread with IO and Workers:

  • The control plane (API, CLI) runs in the main thread.
  • IO threads handle input and dispatch packets to the worker threads.
  • Worker threads do the actual work, including interface TX.
  • RSS is in use.

Multi-thread with Main and IO on a single thread:

  • Main and IO run in a single thread.
  • Workers are distributed across different cores.
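
As a concrete example, the "multi-thread with workers only" mode maps onto the cpu and dpdk sections of startup.conf roughly as follows. This is a minimal sketch: the core numbers and queue count are illustrative and must be adapted to the actual host topology.

cpu {
    ## control plane (API, CLI) stays on the main thread, pinned to core 1
    main-core 1
    ## forwarding engine runs on two worker threads, pinned to cores 2-3
    corelist-workers 2-3
}

dpdk {
    dev default {
        ## one RX queue per worker spreads input across the workers (RSS)
        num-rx-queues 2
    }
}

Once VPP is restarted with such a configuration, the thread-to-core placement can be inspected with vppctl show threads.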

Running Examples

Running VPP in non-DPDK Mode

In non-DPDK mode, VPP uses a Linux virtual network device (e.g. a veth pair) as its host-interface.


  1. Edit the configuration for VPP1.
$ vi /etc/vpp/startup1.conf
unix {
nodaemon
cli-listen /run/vpp/cli-vpp1.sock
}

plugins {
plugin dpdk_plugin.so { disable }
}
  2. Start the VPP process.
$ vpp -c /etc/vpp/startup1.conf
vpp[10475]: clib_elf_parse_file: open `/usr/bin/vp': No such file or director
vpp[10475]: vat-plug/load: vat_plugin_register: oddbuf plugin not loaded...
  3. Enter the VPP CLI.
$ vppctl -s /run/vpp/cli-vpp1.sock
    _______    _        _   _____  ___
 __/ __/ _ \  (_)__    | | / / _ \/ _ \
 _/ _// // / / / _ \   | |/ / ___/ ___/
 /_/ /____(_)_/\___/   |___/_/  /_/

vpp# show version
vpp v21.01.0-5~g6bd1c77fd built by root on vpp-host at 2021-05-12T16:17:20
  4. Create a Linux veth pair: vpp1out will be attached to VPP as its host-interface, and vpp1host will serve as the host's network interface.
$ sudo ip link add name vpp1out type veth peer name vpp1host

$ ip l
...
4: vpp1host@vpp1out: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 22:5b:e3:6a:2c:85 brd ff:ff:ff:ff:ff:ff
5: vpp1out@vpp1host: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 22:40:3a:b0:48:a2 brd ff:ff:ff:ff:ff:ff
  5. Turn up both ends:
$ sudo ip link set dev vpp1out up
$ sudo ip link set dev vpp1host up
  6. Assign an IP address to the host side:
$ sudo ip addr add 10.10.1.1/24 dev vpp1host

$ ip addr show vpp1host
4: vpp1host@vpp1out: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 22:5b:e3:6a:2c:85 brd ff:ff:ff:ff:ff:ff
inet 10.10.1.1/24 scope global vpp1host
valid_lft forever preferred_lft forever
inet6 fe80::205b:e3ff:fe6a:2c85/64 scope link
valid_lft forever preferred_lft forever
  7. Create a VPP host-interface attached to vpp1out:
vpp# create host-interface name vpp1out
host-vpp1out

vpp# show hardware
Name Idx Link Hardware
host-vpp1out 1 up host-vpp1out
Link speed: unknown
Ethernet address 02:fe:d6:f5:de:03
Linux PACKET socket interface
local0 0 down local0
Link speed: unknown
local
  8. Turn up the VPP host-interface:
vpp# set int state host-vpp1out up

vpp# show int
Name Idx State MTU (L3/IP4/IP6/MPLS) Counter Count
host-vpp1out 1 up 9000/0/0/0
local0
  9. Assign IP address 10.10.1.2/24 to the VPP side:
vpp# set int ip address host-vpp1out 10.10.1.2/24

vpp# show int addr
host-vpp1out (up):
L3 10.10.1.2/24
local0 (dn):
  10. Test connectivity:
vpp# ping 10.10.1.1
116 bytes from 10.10.1.1: icmp_seq=1 ttl=64 time=9.1423 ms
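
Connectivity can also be verified from the Linux side, and the interface counters in VPP should increase accordingly. A quick sanity check; 10.10.1.2 is the address assigned to host-vpp1out above:

$ ping -c 3 10.10.1.2
$ vppctl -s /run/vpp/cli-vpp1.sock show int host-vpp1out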

Running VPP in DPDK Mode

In DPDK mode, VPP uses a real host NIC as its interface. The author's environment is an OpenStack virtual machine: the vNIC is an OvS vTap port and its driver is virtio.

$ lspci
...
00:07.0 Ethernet controller: Red Hat, Inc. Virtio network device

Modify the startup.conf configuration:

dpdk {
dev 0000:00:07.0 {
num-rx-queues 1
}
}
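
Before the DPDK plugin can claim the NIC, the device typically needs to be unbound from its kernel driver and bound to a userspace IO driver. A sketch using DPDK's dpdk-devbind.py helper, assuming vfio-pci is the driver in use (it should match the uio-driver setting in the dpdk stanza; the PCI address is the one reported by lspci above):

$ sudo modprobe vfio-pci
$ sudo dpdk-devbind.py --bind=vfio-pci 0000:00:07.0
$ dpdk-devbind.py --status | grep 00:07.0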

After restarting VPP, check the interface state:

vpp# show interface addr
GigabitEthernet0/7/0 (dn):
local0 (dn):
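
The DPDK interface initially shows up administratively down and without an address. It can be brought up and addressed from the vppctl CLI in the same way as the host-interface earlier; the address below is purely illustrative:

vpp# set interface state GigabitEthernet0/7/0 up
vpp# set interface ip address GigabitEthernet0/7/0 192.168.10.2/24
vpp# show interface addr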

Router and Switch for namespaces

(Figure: router and switch topology for namespaces.)