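Calico BGP RR Mode
Drawbacks of BGP Full Mesh
• Because of IBGP split horizon, every BGP router can learn the complete set of BGP routes only if all routers inside the AS form an IBGP full mesh. The AS then has to maintain a large number of BGP sessions, which degrades network performance. A route reflector (RR) relaxes the split-horizon rule and solves this problem.
• To guarantee connectivity between IBGP peers, the peers must be fully meshed. With n devices in an AS, the number of IBGP sessions is n(n-1)/2. When there are many devices, configuration becomes very complex, and the sessions consume a lot of network and CPU resources. Using a route reflector among the IBGP peers addresses both problems.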
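To make the n(n-1)/2 growth concrete, here is a minimal sketch that compares the session count of a full mesh with a route-reflector layout. The function names are illustrative, and the RR count assumes that every client peers with every RR, that the RRs are fully meshed among themselves, and that clients never peer with each other.

```python
def full_mesh_sessions(n: int) -> int:
    """IBGP sessions needed when all n routers peer with each other."""
    return n * (n - 1) // 2


def route_reflector_sessions(n: int, reflectors: int = 1) -> int:
    """Sessions when each client peers with each RR and RRs are fully meshed.

    Assumption: clients do not peer with each other.
    """
    if not 1 <= reflectors <= n:
        raise ValueError("reflectors must be between 1 and n")
    clients = n - reflectors
    return clients * reflectors + full_mesh_sessions(reflectors)


for n in (3, 10, 100):
    print(n, full_mesh_sessions(n), route_reflector_sessions(n))
# 3 3 2
# 10 45 9
# 100 4950 99
```

With a single RR the count grows linearly (n-1) instead of quadratically, which is the motivation for the RR setup in the rest of this article.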
BGP RR
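When an RR receives a BGP route:
1. If the route was learned from a non-client IBGP peer, the RR reflects it to all of its clients.
2. If the route was learned from a client, the RR reflects it to all non-client IBGP peers and to every client except the one it came from (on Huawei devices, reflection between clients can be turned off with a command).
3. If the route was learned from an EBGP peer, the RR sends it to all clients and all non-client IBGP peers.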
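The three rules above can be sketched as a small function. This is only an illustration of the decision logic, not how any router implements it; the names `Source` and `reflect_targets` are made up for this example.

```python
from enum import Enum


class Source(Enum):
    """Where the RR learned the route from."""
    NON_CLIENT = "non-client IBGP peer"
    CLIENT = "client"
    EBGP = "EBGP peer"


def reflect_targets(source, peer, clients, non_clients):
    """Return the set of IBGP peers the RR advertises the route to."""
    clients, non_clients = set(clients), set(non_clients)
    if source is Source.NON_CLIENT:
        # Rule 1: reflect to clients only.
        return clients
    if source is Source.CLIENT:
        # Rule 2: everyone except the client the route came from.
        return (clients | non_clients) - {peer}
    # Rule 3: EBGP routes go to all IBGP peers.
    return clients | non_clients


# AR2 is the RR; AR1 and AR3 are its clients, as in the eNSP lab below.
print(sorted(reflect_targets(Source.CLIENT, "AR1", {"AR1", "AR3"}, set())))
# ['AR3']
```

In the lab topology this is why AR3 still learns AR1's routes even though the two routers never peer directly.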
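Simulating BGP RR mode in eNSP
Starting from the Full Mesh setup, AR1 peers with AR2 and AR2 peers with AR3. AR1 and AR3 do not peer with each other; instead, AR1 and AR3 are configured as RR clients on AR2.
Everything else is identical to the Full Mesh configuration.
Simulating OSPF + IBGP Full Mesh in eNSP
The detailed configuration is as follows: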
[AR2]bgp 123
[AR2-bgp]router-id 2.2.2.2
[AR2-bgp]peer 1.1.1.1 as-number 123
[AR2-bgp]peer 1.1.1.1 connect-interface l0
[AR2-bgp]peer 3.3.3.3 as-number 123
[AR2-bgp]peer 3.3.3.3 connect-interface l0
[AR2-bgp]peer 1.1.1.1 reflect-client
[AR2-bgp]peer 3.3.3.3 reflect-client
[AR2-bgp]dis this
[V200R003C00]
#
bgp 123
router-id 2.2.2.2
peer 1.1.1.1 as-number 123
peer 1.1.1.1 connect-interface LoopBack0
peer 3.3.3.3 as-number 123
peer 3.3.3.3 connect-interface LoopBack0
#
ipv4-family unicast
undo synchronization
peer 1.1.1.1 enable
peer 1.1.1.1 reflect-client
peer 3.3.3.3 enable
peer 3.3.3.3 reflect-client
#
return
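Calico BGP RR
Installation and deployment
1. Disable BGP Full Mesh
First confirm that the current deployment is running in BGP Full Mesh mode.
Calico BGP Full Mesh cross-node communication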
[root@master ~]# calicoctl node status
Calico process is running.
IPv4 BGP status
+--------------+-------------------+-------+----------+-------------+
| PEER ADDRESS | PEER TYPE | STATE | SINCE | INFO |
+--------------+-------------------+-------+----------+-------------+
| 192.168.0.81 | node-to-node mesh | up | 14:40:26 | Established |
| 192.168.0.82 | node-to-node mesh | up | 14:40:25 | Established |
+--------------+-------------------+-------+----------+-------------+
IPv6 BGP status
No IPv6 peers found.
Check whether a bgpconfiguration resource already exists. If it does, export it as YAML and set nodeToNodeMeshEnabled to false.
If it does not, create a default bgpconfiguration. Note that nodeToNodeMeshEnabled must be set to false to disable BGP Full Mesh.
[root@master ~]# calicoctl get bgpconfiguration
NAME LOGSEVERITY MESHENABLED ASNUMBER
cat << EOF | calicoctl create -f -
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
  name: default
spec:
  logSeverityScreen: Info
  nodeToNodeMeshEnabled: false
  asNumber: 64512
EOF
[root@master ~]# calicoctl get bgpconfiguration default -o yaml
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
  creationTimestamp: "2022-05-24T14:46:43Z"
  name: default
  resourceVersion: "9017"
  uid: efb6991a-32d5-4e85-9aa2-df5978e7683f
spec:
  asNumber: 64512
  logSeverityScreen: Info
  nodeToNodeMeshEnabled: false
[root@master ~]# calicoctl node status
Calico process is running.
IPv4 BGP status
No IPv4 peers found.
IPv6 BGP status
No IPv6 peers found.
2. Designate the route reflector
We designate node1.whale.com as the route reflector.
[root@master ~]# kubectl get node -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
master.whale.com Ready control-plane,master 10d v1.23.5 192.168.0.80 <none> CentOS Linux 7 (Core) 5.4.193-1.el7.elrepo.x86_64 docker://20.10.16
node1.whale.com Ready <none> 10d v1.23.5 192.168.0.81 <none> CentOS Linux 7 (Core) 5.4.193-1.el7.elrepo.x86_64 docker://20.10.16
node2.whale.com Ready <none> 10d v1.23.5 192.168.0.82 <none> CentOS Linux 7 (Core) 5.4.193-1.el7.elrepo.x86_64 docker://20.10.16
Use calicoctl to make the change:
[root@master ~]# calicoctl get node node1.whale.com -o yaml --export > node1.yaml
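Add the fields below at the matching positions in node1.yaml to make node1 a route reflector. In larger clusters, designate an odd number of RR nodes (1, 3, 5, and so on) in the same way.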
metadata:
  labels:
    calico-route-reflector: "calico-route-reflector"
spec:
  bgp:
    routeReflectorClusterID: 224.0.0.1
After editing the file, apply it with calicoctl:
[root@master ~]# calicoctl apply -f node1.yaml
Successfully applied 1 'Node' resource(s)
3. Configure the RR clients
Configure the RR clients to peer with the route reflector:
peer--RR
calicoctl apply -f - <<EOF
kind: BGPPeer
apiVersion: projectcalico.org/v3
metadata:
  name: peer-to-rrs
spec:
  nodeSelector: "!has(calico-route-reflector)"
  peerSelector: has(calico-route-reflector)
EOF
Configure the route reflectors to peer with each other:
RR--RR
calicoctl apply -f - <<EOF
kind: BGPPeer
apiVersion: projectcalico.org/v3
metadata:
  name: rrs-to-rrs
spec:
  nodeSelector: has(calico-route-reflector)
  peerSelector: has(calico-route-reflector)
EOF
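To see which BGP sessions these two BGPPeer resources produce, the selectors can be modelled in a few lines of Python. This is a simplified sketch, not Calico's selector engine: it only handles `has(label)` and `!has(label)`, the helper names are hypothetical, and the node labels mirror the three-node cluster in this article (only the presence of the label matters here, not its value).

```python
def has(label):
    """Model of the selector has(label)."""
    return lambda labels: label in labels


def not_has(label):
    """Model of the selector !has(label)."""
    return lambda labels: label not in labels


RR_LABEL = "calico-route-reflector"

nodes = {
    "master.whale.com": {},
    "node1.whale.com": {RR_LABEL: "calico-route-reflector"},
    "node2.whale.com": {},
}

# name -> (nodeSelector, peerSelector)
bgp_peers = {
    "peer-to-rrs": (not_has(RR_LABEL), has(RR_LABEL)),
    "rrs-to-rrs": (has(RR_LABEL), has(RR_LABEL)),
}


def sessions(nodes, bgp_peers):
    """Return the unordered node pairs that end up peering."""
    result = set()
    for node_sel, peer_sel in bgp_peers.values():
        for a, a_labels in nodes.items():
            if not node_sel(a_labels):
                continue
            for b, b_labels in nodes.items():
                if a != b and peer_sel(b_labels):
                    result.add(frozenset((a, b)))
    return result


for pair in sorted(sorted(p) for p in sessions(nodes, bgp_peers)):
    print(" <-> ".join(pair))
# master.whale.com <-> node1.whale.com
# node1.whale.com <-> node2.whale.com
```

With a single RR, `rrs-to-rrs` selects no pairs, because a node does not peer with itself; it only takes effect once a second node carries the label.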
List the BGP peers:
[root@master ~]# calicoctl get bgppeer
NAME PEERIP NODE ASN
peer-to-rrs !has(calico-route-reflector) 0
rrs-to-rrs has(calico-route-reflector) 0
Check the BGP status:
[root@master ~]# calicoctl node status
Calico process is running.
IPv4 BGP status
+--------------+---------------+-------+----------+-------------+
| PEER ADDRESS | PEER TYPE | STATE | SINCE | INFO |
+--------------+---------------+-------+----------+-------------+
| 192.168.0.81 | node specific | up | 15:05:41 | Established |
+--------------+---------------+-------+----------+-------------+
IPv6 BGP status
No IPv6 peers found.
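If this check needs to be repeated across several nodes, the status table can be parsed with a short script. The sketch below assumes the table layout shown above; `parse_peers` is a hypothetical helper, and the sample text is copied from the output in this article rather than fetched from a live cluster.

```python
SAMPLE = """\
Calico process is running.
IPv4 BGP status
+--------------+---------------+-------+----------+-------------+
| PEER ADDRESS | PEER TYPE | STATE | SINCE | INFO |
+--------------+---------------+-------+----------+-------------+
| 192.168.0.81 | node specific | up | 15:05:41 | Established |
+--------------+---------------+-------+----------+-------------+
IPv6 BGP status
No IPv6 peers found.
"""

FIELDS = ("address", "type", "state", "since", "info")


def parse_peers(status_text):
    """Extract the peer rows from `calicoctl node status` style output."""
    peers = []
    for line in status_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders and plain text lines
        cells = [c.strip() for c in line.strip("|").split("|")]
        if len(cells) != len(FIELDS) or cells[0] == "PEER ADDRESS":
            continue  # skip the header row
        peers.append(dict(zip(FIELDS, cells)))
    return peers


peers = parse_peers(SAMPLE)
# In RR mode, no peer should still be of type "node-to-node mesh".
rr_mode_ok = bool(peers) and all(
    p["type"] == "node specific" and p["info"] == "Established" for p in peers
)
print(peers[0]["address"], rr_mode_ok)
# 192.168.0.81 True
```

On a real node the text could come from running `calicoctl node status` through `subprocess.run(..., capture_output=True, text=True)` instead of the embedded sample.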
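Double-check through the RR node's configuration
Open the calico-node configuration file on node1 and check whether it contains BGP RR settings, which confirms that the cluster is running in RR mode.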
[root@master ~]# kubectl -n kube-system exec -it calico-node-bntq7 -- bash
[root@node1 /]# cat /etc/calico/confd/config/bird.cfg
function apply_communities ()
{
}
# Generated by confd
include "bird_aggr.cfg";
include "bird_ipam.cfg";
router id 192.168.0.81;
# Configure synchronization between routing tables and kernel.
protocol kernel {
learn; # Learn all alien routes from the kernel
persist; # Don't remove routes on bird shutdown
scan time 2; # Scan kernel routing table every 2 seconds
import all;
export filter calico_kernel_programming; # Default is export none
graceful restart; # Turn on graceful restart to reduce potential flaps in
# routes when reloading BIRD configuration. With a full
# automatic mesh, there is no way to prevent BGP from
# flapping since multiple nodes update their BGP
# configuration at the same time, GR is not guaranteed to
# work correctly in this scenario.
merge paths on; # Allow export multipath routes (ECMP)
}
# Watch interface up/down events.
protocol device {
debug { states };
scan time 2; # Scan interfaces every 2 seconds
}
protocol direct {
debug { states };
interface -"cali*", -"kube-ipvs*", "*"; # Exclude cali* and kube-ipvs* but
# include everything else. In
# IPVS-mode, kube-proxy creates a
# kube-ipvs0 interface. We exclude
# kube-ipvs0 because this interface
# gets an address for every in use
# cluster IP. We use static routes
# for when we legitimately want to
# export cluster IPs.
}
# Template for all BGP clients
template bgp bgp_template {
debug { states };
description "Connection to BGP peer";
local as 64512;
multihop;
gateway recursive; # This should be the default, but just in case.
import all; # Import all routes, since we don't know what the upstream
# topology is and therefore have to trust the ToR/RR.
export filter calico_export_to_bgp_peers; # Only want to export routes for workloads.
add paths on;
graceful restart; # See comment in kernel section about graceful restart.
connect delay time 2;
connect retry time 5;
error wait time 5,30;
}
# ------------- Node-to-node mesh -------------
# This node (node1.whale.com) is configured as a route reflector with cluster ID 224.0.0.1;
# ignore node-to-node mesh setting.
# ------------- Global peers -------------
# No global peers configured.
# ------------- Node-specific peers -------------
# For peer /host/node1.whale.com/peer_v4/192.168.0.80
protocol bgp Node_192_168_0_80 from bgp_template {
neighbor 192.168.0.80 as 64512;
source address 192.168.0.81; # The local address we use for the TCP connection
rr client;
rr cluster id 224.0.0.1;
}
# For peer /host/node1.whale.com/peer_v4/192.168.0.81
# Skipping ourselves (192.168.0.81)
# For peer /host/node1.whale.com/peer_v4/192.168.0.82
protocol bgp Node_192_168_0_82 from bgp_template {
neighbor 192.168.0.82 as 64512;
source address 192.168.0.81; # The local address we use for the TCP connection
rr client;
rr cluster id 224.0.0.1;
}
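The same confirmation can be done programmatically by looking for `rr client;` inside each `protocol bgp` stanza. The sketch below is a rough regex-based check, not a BIRD parser: it assumes the stanza layout shown above (no nested braces inside a peer block), and `rr_clients` is a hypothetical helper. The sample is an excerpt of the file above.

```python
import re

SAMPLE = """\
# For peer /host/node1.whale.com/peer_v4/192.168.0.80
protocol bgp Node_192_168_0_80 from bgp_template {
  neighbor 192.168.0.80 as 64512;
  source address 192.168.0.81;
  rr client;
  rr cluster id 224.0.0.1;
}
# For peer /host/node1.whale.com/peer_v4/192.168.0.82
protocol bgp Node_192_168_0_82 from bgp_template {
  neighbor 192.168.0.82 as 64512;
  source address 192.168.0.81;
  rr client;
  rr cluster id 224.0.0.1;
}
"""

BGP_BLOCK = re.compile(r"protocol bgp (\S+) from bgp_template \{(.*?)\}", re.S)
NEIGHBOR = re.compile(r"neighbor (\S+) as (\d+);")
RR_CLIENT = re.compile(r"^\s*rr client;", re.M)


def rr_clients(bird_cfg):
    """Return neighbor addresses that this node treats as RR clients."""
    clients = []
    for _name, body in BGP_BLOCK.findall(bird_cfg):
        neighbor = NEIGHBOR.search(body)
        if neighbor and RR_CLIENT.search(body):
            clients.append(neighbor.group(1))
    return clients


print(rr_clients(SAMPLE))
# ['192.168.0.80', '192.168.0.82']
```

Both non-RR nodes (192.168.0.80 and 192.168.0.82) appear as clients of node1, matching the BGPPeer selectors configured earlier.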
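At this point BGP RR mode is set up and working. Cross-node communication behaves much as it does in BGP Full Mesh, so it is not demonstrated in this chapter; see the BGP Full Mesh chapter for details.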