LVS types and LVS scheduling methods


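LVS types:

lvs-nat: schedules by rewriting the destination IP address of the request packet; similar to DNAT with multiple targets;

lvs-dr: schedules by re-encapsulating the request packet in a new frame header (the destination is the MAC address that corresponds to the RS's RIP); because every RS also holds the VIP, the RSs must be kept from answering ARP for it, in one of three ways:

    (1) assign it statically on the front-end router;

    (2) arptables;

    (3) set kernel parameters that limit the ARP announcement and response levels;

lvs-tun: schedules by adding an outer IP header in front of the original IP header of the request packet; ipip;

lvs-fullnat: schedules by rewriting both the source IP and the destination IP address of the request packet;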


LVS scheduling methods:

Static methods: rr, wrr, sh, dh

Dynamic methods: lc (least connection), wlc, sed, nq, lblc, lblcr

lc: Overhead = Active*256 + Inactive

sed: Overhead = (Active+1)*256/weight
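The two formulas above can be tried out with a few lines of shell. This is only a sketch: the function names lc_overhead and sed_overhead are made up for illustration, and integer arithmetic is assumed, so the division truncates.

```shell
# Overhead formulas for two LVS dynamic schedulers.
# The RS with the lowest overhead receives the next connection.

# lc_overhead ACTIVE INACTIVE
lc_overhead() {
    echo $(( $1 * 256 + $2 ))
}

# sed_overhead ACTIVE WEIGHT (integer division, result is truncated)
sed_overhead() {
    echo $(( ($1 + 1) * 256 / $2 ))
}

lc_overhead 2 10     # 2*256 + 10      -> 522
sed_overhead 0 1     # (0+1)*256/1     -> 256
sed_overhead 0 3     # (0+1)*256/3     -> 85
```

With two idle RSs of weight 1 and weight 3, sed gives 256 and 85, so the first connection goes to the RS with the higher weight.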


lvs: ipvs/ipvsadm

    ipvs: runs in the kernel, attached to the netfilter INPUT hook


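ipvsadm:

    Managing cluster services:
        Add, edit, delete: -A|-E|-D
        Ways to define a cluster service:
            -t service-address (TCP, IP:PORT)
            -u service-address (UDP, IP:PORT)
            -f service-address (FWM: firewall mark)
        Scheduler: -s scheduler

    Managing the RSs of a cluster service:
        Add, edit, delete: -a|-e|-d
        Specifying the RS: -r server-address (IP[:PORT])
        LVS type: -g (dr) | -i (tun) | -m (nat)
        Weight: -w #

    Viewing:
        -L
        -n, --stats, --rate, --exact

    Clearing, saving, and reloading:
        Clear: -C
        Save: -S
        Reload: -R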
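The options above combine as in the following sketch. The addresses are hypothetical, and the run helper is an assumption added here so that the commands are only printed unless APPLY=1 is set (executing them requires root and the ipvs kernel module).

```shell
# Print-only by default; run as root with APPLY=1 to execute for real.
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

# Hypothetical addresses, for illustration only.
vip=192.0.2.10
rs1=10.0.0.11
rs2=10.0.0.12

demo_nat_service() {
    run ipvsadm -A -t "$vip:80" -s wrr               # add a TCP cluster service
    run ipvsadm -a -t "$vip:80" -r "$rs1" -m -w 1    # add an RS, lvs-nat type
    run ipvsadm -a -t "$vip:80" -r "$rs2" -m -w 2
    run ipvsadm -e -t "$vip:80" -r "$rs2" -m -w 3    # edit the weight of an RS
    run ipvsadm -E -t "$vip:80" -s wlc               # edit the scheduler
    run ipvsadm -L -n --stats                        # view, numeric, with counters
    run ipvsadm -d -t "$vip:80" -r "$rs1"            # delete one RS
    run ipvsadm -D -t "$vip:80"                      # delete the service
}

demo_nat_service
```

Saving and reloading work through standard output and input: `ipvsadm -S -n > FILE` writes the current rules, and `ipvsadm -R < FILE` loads them again.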

        

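    Session persistence:

    session sticky: bind by source IP or by cookie;

    session replication cluster: the servers "replicate" every session to one another by multicast, so each server holds all sessions; (tomcat)

    session server: introduce third-party storage dedicated to holding the shared session data; (redis, memcached)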

    

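lvs-dr:

    (1) Each RS responds to the Client directly, so every RS must be configured with the VIP; however, only the VIP on the Director may communicate directly with the local router;

    (2) The Director neither strips nor modifies the IP header of the request packet; it schedules by encapsulating a new frame header (source MAC is the Director's MAC, destination MAC is the MAC of the selected RS);


Kernels 2.4.26 and 2.6.4 introduced two kernel parameters:

    arp_announce: defines the restriction level for ARP announcements; 2: an address is announced only on the network it belongs to

    arp_ignore: defines the level at which ARP requests are ignored; 1: reply only if the target IP address is configured on the interface that received the request


Interface location: /proc/sys/net/ipv4/conf/INTERFACE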
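To see what these two parameters are currently set to, the files under that directory can simply be read. A minimal sketch follows; the function name show_arp_params and the CONF_ROOT override are assumptions made for illustration and testing, not part of any tool.

```shell
# show_arp_params IFACE...: print arp_ignore and arp_announce per interface.
# CONF_ROOT is a hypothetical override; by default the real /proc path is read.
show_arp_params() {
    root=${CONF_ROOT:-/proc/sys/net/ipv4/conf}
    for iface in "$@"; do
        for param in arp_ignore arp_announce; do
            file=$root/$iface/$param
            if [ -r "$file" ]; then
                echo "$iface $param=$(cat "$file")"
            else
                echo "$iface $param=unavailable"
            fi
        done
    done
}

show_arp_params all lo
```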


Configuration summary:

    Director:
    (1) Configure the VIP on an alias of the physical interface
    ifconfig INTERFACE:ALIAS $vip broadcast $vip netmask 255.255.255.255

    (2) Configure the route
    route add -host $vip dev INTERFACE:ALIAS

    RS:
    (1) Set the kernel parameters first
    echo 1 > /proc/sys/net/ipv4/conf/all/arp_ignore
    echo 1 > /proc/sys/net/ipv4/conf/lo/arp_ignore
    echo 2 > /proc/sys/net/ipv4/conf/all/arp_announce
    echo 2 > /proc/sys/net/ipv4/conf/lo/arp_announce

    (2) Configure the VIP on an alias of lo
    ifconfig lo:0 $vip broadcast $vip netmask 255.255.255.255 up

    (3) Configure the route
    route add -host $vip dev lo:0
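Values written to /proc with echo are lost on reboot. One way to keep the RS kernel settings from step (1) is to express them in sysctl.conf syntax; the sketch below only prints the lines, and the function name rs_arp_sysctl is made up for illustration.

```shell
# rs_arp_sysctl [IFACE]: print the RS ARP settings in sysctl.conf syntax.
# The values are the ones used in the notes (arp_ignore=1, arp_announce=2).
rs_arp_sysctl() {
    for iface in all "${1:-lo}"; do
        echo "net.ipv4.conf.$iface.arp_ignore = 1"
        echo "net.ipv4.conf.$iface.arp_announce = 2"
    done
}

rs_arp_sysctl lo
```

The output can be redirected to a file under /etc/sysctl.d/ (the file name is up to the administrator) and loaded with `sysctl -p FILE`.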


Example: [screenshots not available]


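Note: when the VIP and the DIP are not on the same network segment:

    1. Make sure the router holds the gateway address of each segment and has forwarding enabled

    2. Configure the IP address and the default gateway on each RS (RIP)

    3. The other kernel settings stay the same

    4. In ipvsadm, the RIPs and the VIP are on different segments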

        


Example scripts:

Example Director script for the DR type:

    #!/bin/bash
    #
    # Director side of lvs-dr: start|stop|status
    vip=172.16.100.7
    rip=('172.16.100.8' '172.16.100.9')
    weight=('1' '2')
    port=80
    scheduler=rr        # rr does not take the weights into account; wrr would
    ipvstype='-g'

    case $1 in
    start)
        iptables -F -t filter
        ipvsadm -C

        # VIP on an interface alias, plus a host route for it
        ifconfig eth0:0 $vip broadcast $vip netmask 255.255.255.255 up
        route add -host $vip dev eth0:0
        echo 1 > /proc/sys/net/ipv4/ip_forward

        # Define the cluster service, then add every RS with its weight
        ipvsadm -A -t $vip:$port -s $scheduler
        [ $? -eq 0 ] && echo "ipvs service $vip:$port added." || exit 2
        for i in "${!rip[@]}"; do
            ipvsadm -a -t $vip:$port -r ${rip[$i]} $ipvstype -w ${weight[$i]}
            [ $? -eq 0 ] && echo "RS ${rip[$i]} added."
        done
        touch /var/lock/subsys/ipvs
        ;;
    stop)
        echo 0 > /proc/sys/net/ipv4/ip_forward
        ipvsadm -C
        ifconfig eth0:0 down
        rm -f /var/lock/subsys/ipvs
        echo "ipvs stopped."
        ;;
    status)
        if [ -f /var/lock/subsys/ipvs ]; then
            echo "ipvs is running."
            ipvsadm -L -n
        else
            echo "ipvs is stopped."
        fi
        ;;
    *)
        echo "Usage: $(basename "$0") {start|stop|status}"
        exit 3
        ;;
    esac



Example RS script for the DR type:

    #!/bin/bash
    #
    # RS side of lvs-dr: start|stop|status
    vip=172.16.100.7
    interface="lo:0"

    case $1 in
    start)
        # Limit ARP replies and announcements before the VIP is configured
        echo 1 > /proc/sys/net/ipv4/conf/all/arp_ignore
        echo 1 > /proc/sys/net/ipv4/conf/lo/arp_ignore
        echo 2 > /proc/sys/net/ipv4/conf/all/arp_announce
        echo 2 > /proc/sys/net/ipv4/conf/lo/arp_announce

        ifconfig $interface $vip broadcast $vip netmask 255.255.255.255 up
        route add -host $vip dev $interface
        ;;
    stop)
        # Restore the kernel defaults and take the VIP down
        echo 0 > /proc/sys/net/ipv4/conf/all/arp_ignore
        echo 0 > /proc/sys/net/ipv4/conf/lo/arp_ignore
        echo 0 > /proc/sys/net/ipv4/conf/all/arp_announce
        echo 0 > /proc/sys/net/ipv4/conf/lo/arp_announce

        ifconfig $interface down
        ;;
    status)
        if ifconfig $interface | grep $vip &> /dev/null; then
            echo "ipvs is running."
        else
            echo "ipvs is stopped."
        fi
        ;;
    *)
        echo "Usage: $(basename "$0") {start|stop|status}"
        exit 1
        ;;
    esac



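There are three ways to define a cluster service in lvs:

    -t service-address
    -u service-address
    -f service-address (FWM)

    FWM: firewall mark
    iptables/netfilter tables: filter, nat, mangle, raw

    mangle: the table in which firewall marks are set; the purpose is to let requests to different ports reach the same cluster service.

    Prerequisite: define an iptables rule on a netfilter hook function that takes effect before ipvs, so that the packets carry the firewall mark;

    How to define it:

    (1) Marking: done on the Director, in the PREROUTING chain of the mangle table
    # iptables -t mangle -A PREROUTING -d $vip -p $protocol --dport $port -j MARK --set-mark [1-99]

    (2) Define the cluster service based on the FWM
    # ipvsadm -A -f FWM -s SCHEDULER
    # ipvsadm -a -f FWM -r server-address -g|-i|-m -w #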
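As an illustration of the two steps, the sketch below puts ports 80 and 443 of one VIP under the same mark and builds a single DR service on it. The VIP and RIPs are the ones from the Director script above; the mark value 10, the scheduler, and the run helper (print-only unless APPLY=1) are assumptions for this example.

```shell
# Print-only by default; run as root with APPLY=1 to execute for real.
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

vip=172.16.100.7
rs_list="172.16.100.8 172.16.100.9"
fwm=10      # assumed mark value

bind_http_https() {
    # (1) mark traffic to both ports with the same FWM
    for port in 80 443; do
        run iptables -t mangle -A PREROUTING -d "$vip" -p tcp \
            --dport "$port" -j MARK --set-mark "$fwm"
    done
    # (2) one cluster service for everything carrying that mark
    run ipvsadm -A -f "$fwm" -s rr
    for rs in $rs_list; do
        run ipvsadm -a -f "$fwm" -r "$rs" -g -w 1
    done
}

bind_http_https
```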

    

Example (IP addresses, routes, and kernel parameters are configured beforehand): [screenshots not available]



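lvs persistence: lvs persistent connections

    Regardless of the scheduling method in use, persistent connections guarantee that, within the specified time window, requests from the same IP are always directed to the same RS;

    persistence template: the record that ties a client to an RS

    PPC: per-port persistence; the scope is a single cluster service; with several cluster services, each one is persisted independently;

    PCC: per-client persistence; the scope is all services; when the cluster service is defined, its TCP or UDP destination port must be 0;

    PFWM: per-FWM persistence; the scope is all services defined under the same FWM;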


Example: [screenshots not available]


    ipvsadm -A -t|-u|-f service-address -s SCHEDULER [-p [#]]

        without -p: persistent connections are disabled

        -p #: sets the persistence timeout; if the value is omitted, the default is 300 seconds
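The three persistence scopes differ only in how the service is defined. A sketch, using the VIP from the Director script; the timeouts, the mark value 10, and the run helper (print-only unless APPLY=1) are assumptions for illustration.

```shell
# Print-only by default; run as root with APPLY=1 to execute for real.
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

vip=172.16.100.7

persistence_examples() {
    # PPC: a single service, clients stay on one RS for 600 seconds
    run ipvsadm -A -t "$vip:80" -s rr -p 600
    # PCC: port 0 covers every port of the VIP; -p without a value means 300 seconds
    run ipvsadm -A -t "$vip:0" -s rr -p
    # PFWM: every service that shares firewall mark 10
    run ipvsadm -A -f 10 -s rr -p 300
}

persistence_examples
```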

Example (continuing from above), PPC: [screenshots not available]

PCC: [screenshots not available]

PFWM: [screenshots not available]

Redo: [screenshots not available]


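DR across different network segments:

Director: the VIP and the DIP are on different segments, each pointing to its own gateway

    VIP: ifconfig eth0:0 192.168.0.2/24

    enable forwarding

    set the routes

    configure ipvs

RS: the VIP and the DIP are on different segments; point to the gateway, set the routes, start the service, and configure the firewall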





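lvs cluster:

    lvs itself does not support health checks of the RSs;

    Health: a periodic checking mechanism
    When the state changes, act on it accordingly
        up --> down: confirm at least three times before acting;
        down --> up: one successful check or more is enough;

    Taking an RS offline:
        (1) set its weight to 0;
        (2) remove the RS from the ipvs list of available RSs;

    Bringing an RS online:
        (1) restore its normal weight;
        (2) add the RS to the ipvs list of available RSs;

    Solution:
    (1) write a program that does this;

        How to do the health check:
            Three options:
                IP layer: host liveness probes such as ping;
                Transport layer: port scanners that probe whether the service is up;
                Application layer: request a resource dedicated to health checking, or some normal resource;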
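The three probe levels and the "confirm several times before declaring down" rule could look like the sketch below. All function names are made up for illustration; it assumes Linux ping (-W in seconds), coreutils timeout, a bash binary with /dev/tcp support, and curl.

```shell
# probe_ip HOST: IP layer, one ICMP echo with a 1-second wait
probe_ip() {
    ping -c 1 -W 1 "$1" > /dev/null 2>&1
}

# probe_tcp HOST PORT: transport layer, try to open a TCP connection
probe_tcp() {
    timeout 1 bash -c "exec 3<>/dev/tcp/$1/$2" > /dev/null 2>&1
}

# probe_http URL PATTERN: application layer, fetch a page and look for a marker
probe_http() {
    curl --connect-timeout 1 -s "$1" | grep -qi -- "$2"
}

# is_up TRIES PROBE [ARGS...]: up on the first success,
# down only after TRIES consecutive failures
is_up() {
    tries=$1
    shift
    n=0
    while [ "$n" -lt "$tries" ]; do
        "$@" && return 0
        n=$((n + 1))
    done
    return 1
}
```

For example, `is_up 3 probe_tcp 192.168.10.11 80 || echo "RS is down"` reports the RS as down only after three failed connection attempts, while a single success is enough to treat it as up.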

        

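        Backup server:

            sorry server, backup server

            It can be implemented directly on the Director: configure the Director as a web server that offers only limited content and is enabled only when all RSs have failed;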

        

Initial script that performs the RS health check:

    #!/bin/bash
    #
    # Probes every RS periodically and adds it to or removes it from
    # the FWM-based cluster service.
    fwm=10
    sorry_server='127.0.0.1'                   # not used yet
    lvstype='-m'
    checkloop=3
    logfile=/var/log/ipvs_health_check.log     # not used yet
    rs=('192.168.10.11' '192.168.10.12')
    rw=('1' '1')
    rsstatus=(0 0)                             # 1: online, 0: offline

    addrs() {
        # $1: rs, $2: rs weight
        ipvsadm -a -f $fwm -r $1 $lvstype -w $2
    }

    delrs() {
        # $1: rs
        ipvsadm -d -f $fwm -r $1
    }

    chkrs() {
        # $1: rs; succeeds as soon as one of $checkloop probes succeeds
        local i=1
        while [ $i -le $checkloop ]; do
            if curl --connect-timeout 1 -s http://$1/index.html | grep -i "real[[:space:]]* server" &> /dev/null; then
                return 0
            fi
            let i++
            sleep 2
        done
        return 1
    }

    initstatus() {
        # Record the state of each RS at startup; the ipvs table is not changed here
        for host in "${!rs[@]}"; do
            if chkrs ${rs[$host]}; then
                rsstatus[$host]=1
            else
                rsstatus[$host]=0
            fi
        done
    }

    initstatus

    while :; do
        for host in "${!rs[@]}"; do
            if chkrs ${rs[$host]}; then
                # RS answers: add it if it was offline
                if [ ${rsstatus[$host]} -eq 0 ]; then
                    addrs ${rs[$host]} ${rw[$host]} && rsstatus[$host]=1
                fi
            else
                # RS does not answer: remove it if it was online
                if [ ${rsstatus[$host]} -eq 1 ]; then
                    delrs ${rs[$host]} && rsstatus[$host]=0
                fi
            fi
        done
        sleep 10
    done


    Improvements to make to this script:

(1) log whenever an RS goes online or offline;

(2) enable the sorry_server when all RSs are offline;
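One possible shape for these two improvements is sketched below; it is not the author's solution. The names log_event, sync_sorry, and sorry_active are made up, and IPVSADM is a hypothetical override so the functions can be exercised without touching the real ipvs table. The other values come from the script above.

```shell
fwm=10
sorry_server='127.0.0.1'
lvstype='-m'
logfile=/var/log/ipvs_health_check.log
IPVSADM=${IPVSADM:-ipvsadm}    # hypothetical override, e.g. IPVSADM=true for a dry run
sorry_active=0                 # 1 while the sorry server is in the ipvs table

# log_event MESSAGE...: append a timestamped line to $logfile
log_event() {
    echo "$(date '+%F %T') $*" >> "$logfile"
}

# sync_sorry LIVE_COUNT: enable the sorry server when no RS is left,
# and remove it again as soon as at least one RS is back
sync_sorry() {
    if [ "$1" -eq 0 ] && [ "$sorry_active" -eq 0 ]; then
        $IPVSADM -a -f "$fwm" -r "$sorry_server" $lvstype -w 1 || return 1
        sorry_active=1
        log_event "all RSs offline, sorry server $sorry_server enabled"
    elif [ "$1" -gt 0 ] && [ "$sorry_active" -eq 1 ]; then
        $IPVSADM -d -f "$fwm" -r "$sorry_server" || return 1
        sorry_active=0
        log_event "$1 RS(s) online, sorry server $sorry_server disabled"
    fi
}
```

In the main loop, log_event would be called after each successful addrs or delrs, and sync_sorry once per pass with the number of entries in rsstatus that equal 1.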