I. Introduction
Distributed Replicated Block Device (DRBD) is a software-based, shared-nothing storage replication solution that mirrors the contents of block devices between servers.
Think of it as RAID 1 over the network: even if one of the two servers loses power or crashes, the data is unaffected, and true hot failover can be handled by Heartbeat without manual intervention.
DRBD version: DRBD-8.4.4
node1 (primary) IP: 192.168.142.130, hostname: drbd1.corp.com
node2 (secondary) IP: 192.168.142.131, hostname: drbd2.corp.com
Client host IP: 192.168.142.132 (NFS client)
II. Pre-installation preparation (node1, node2)
sed -i '41 s/^/* soft nofile 65535\n* hard nofile 65535\n* soft nproc 65535\n* hard nproc 65535/g' /etc/security/limits.conf
ulimit -n 65535
sed -i '/^SELINUX=enforcing/c#SELINUX=enforcing' /etc/selinux/config
sed -i '12 s/^/SELINUX=disabled/g' /etc/selinux/config
sed -i '/^SELINUXTYPE=targeted/c#SELINUXTYPE=targeted' /etc/selinux/config
setenforce 0
service iptables stop
chkconfig iptables off
yum install -y openssh-clients vim ntpdate
ntpdate -u cn.pool.ntp.org
echo "*/20 * * * * /usr/sbin/ntpdate -u cn.pool.ntp.org >/dev/null &" >> /var/spool/cron/root
mkdir /store
vim /etc/hosts #configure hosts resolution
192.168.142.130 drbd1.corp.com
192.168.142.131 drbd2.corp.com
vim /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=drbd1.corp.com #set to each node's own hostname
############ On each node, add an 8 GB disk and create partition sdb1 as the DRBD backing device ############
fdisk /dev/sdb
# interactive keys: n → p → 1 → <Enter> (accept default first sector/cylinder) → +8G → w
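The interactive keystrokes above can also be fed to fdisk non-interactively; a minimal sketch, assuming /dev/sdb is a fresh, empty disk (the FDISK_KEYS variable name is just for illustration):

```shell
# Keystroke sequence fdisk expects: n, p, 1, <Enter> (default first
# sector), +8G, w.
FDISK_KEYS=$(printf 'n\np\n1\n\n+8G\nw\n')

# Destructive: only run this against a fresh, empty disk.
if [ -b /dev/sdb ]; then
  echo "$FDISK_KEYS" | fdisk /dev/sdb
  partprobe /dev/sdb   # ask the kernel to re-read the partition table
fi
```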
III. Installing and configuring DRBD
1. Install and configure DRBD (node1, node2)
yum install gcc gcc-c++ make glibc flex kernel-devel kernel-headers kernel wget -y
reboot # boot into the newly installed kernel so it matches kernel-devel
wget http://oss.linbit.com/drbd/8.4/drbd-8.4.4.tar.gz
tar zxvf drbd-8.4.4.tar.gz
cd drbd-8.4.4
./configure --prefix=/usr/local/drbd --with-km
make KDIR=/usr/src/kernels/$(uname -r)/ # point KDIR at the running kernel's source tree
make install
mkdir -p /usr/local/drbd/var/run/drbd
cp /usr/local/drbd/etc/rc.d/init.d/drbd /etc/rc.d/init.d
chkconfig --add drbd
chkconfig drbd on
depmod
modprobe drbd #load the drbd kernel module
lsmod |grep drbd
vim /usr/local/drbd/etc/drbd.conf #empty the file, then add the following configuration
resource r0 {
    protocol C;
    startup { wfc-timeout 0; degr-wfc-timeout 120; }
    disk { on-io-error detach; }
    net {
        timeout 60;
        connect-int 10;
        ping-int 10;
        max-buffers 2048;
        max-epoch-size 2048;
    }
    syncer { rate 200M; }
    on drbd1.corp.com {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 192.168.142.130:7788;
        meta-disk internal;
    }
    on drbd2.corp.com {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 192.168.142.131:7788;
        meta-disk internal;
    }
}
2. Create the DRBD device, initialize the metadata for resource r0, and start the service on both nodes:
mknod /dev/drbd0 b 147 0
drbdadm create-md r0 # run on both nodes
service drbd start
service drbd status
Here ro:Secondary/Secondary means both hosts are in the secondary role; ds is the disk state, shown as "Inconsistent" because DRBD cannot yet tell which side is the primary, i.e. whose disk data should be treated as authoritative.
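The role shown in the ro: field can also be pulled out of /proc/drbd with a little sed, much as the drbddisk script does later. A sketch; the get_role helper and the overridable PROC_DRBD path are illustrative, not part of DRBD:

```shell
# Extract this node's role (left-hand side of ro:) for a given DRBD
# minor from /proc/drbd. PROC_DRBD can be pointed at a saved copy.
PROC_DRBD=${PROC_DRBD:-/proc/drbd}
get_role() {
  sed -ne "s/^ *$1: cs:[^ ]* ro:\([^/ ]*\).*/\1/p" "$PROC_DRBD"
}

if [ -r "$PROC_DRBD" ]; then
  get_role 0   # e.g. prints Secondary while both nodes are unpromoted
fi
```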
3. Promote node1 to primary, then check the state on both nodes (node1):
drbdsetup /dev/drbd0 primary --force # or: drbdadm primary --force r0
service drbd status
4. Mount the DRBD device at /store (node1):
mkfs.ext4 /dev/drbd0
mount /dev/drbd0 /store
service drbd status
Note: no operations on the DRBD device are allowed on the Secondary node, not even mounting; all reads and writes happen on the Primary node. Only when the Primary fails can the Secondary be promoted to Primary, automatically mount DRBD, and carry on serving.
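If you ever need to move the Primary role by hand, without Heartbeat, the sequence looks roughly like this. The run/DRY_RUN wrapper is illustrative so the commands can be previewed before executing anything:

```shell
# DRY_RUN=1 (default) only prints each command; set DRY_RUN=0 to execute.
DRY_RUN=${DRY_RUN:-1}
run() { if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi; }

# On the current Primary (node1): stop using the device, then demote it
run umount /store
run drbdadm secondary r0

# On the other node (node2): promote, then mount
run drbdadm primary r0
run mount /dev/drbd0 /store
```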
IV. Installing and configuring Heartbeat
1. Install Heartbeat
vim /etc/yum.repos.d/epel.repo #add the EPEL repository
[epel]
name=Extra Packages for Enterprise Linux 6 - $basearch
baseurl=http://archives.fedoraproject.org/pub/archive/epel/6/$basearch
enabled=1
gpgcheck=0
yum install heartbeat -y
2. Set up the Heartbeat configuration file and the inter-node authentication file
vim /etc/ha.d/ha.cf #node1 configuration
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 5
ucast eth0 192.168.142.131 # unicast heartbeat to the peer's NIC and IP
auto_failback off
node drbd1.corp.com drbd2.corp.com
vim /etc/ha.d/ha.cf #node2 configuration
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 5
ucast eth0 192.168.142.130 # unicast heartbeat to the peer's NIC and IP
auto_failback off
node drbd1.corp.com drbd2.corp.com
vim /etc/ha.d/authkeys #(node1, node2) inter-node authentication file authkeys
auth 1
1 crc
chmod 600 /etc/ha.d/authkeys
3. Edit the cluster resource file (node1, node2):
vim /etc/ha.d/haresources
drbd1.corp.com IPaddr::192.168.142.120/24/eth0 drbddisk::r0 Filesystem::/dev/drbd0::/store::ext4 killnfsd
IPaddr::192.168.142.120/24/eth0: use the IPaddr script to configure the floating virtual IP that clients connect to
drbddisk::r0: use the drbddisk script to promote/demote the DRBD resource between the primary and secondary nodes
Filesystem::/dev/drbd0::/store::ext4: use the Filesystem script to mount and unmount the filesystem
Note: the IPaddr, Filesystem, and similar scripts referenced in this file live under /etc/ha.d/resource.d/. You can also place service start scripts there (e.g. mysql, www) and add the same script name to the line in /etc/ha.d/haresources, so the service starts along with Heartbeat.
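Following the note above, a custom script dropped into /etc/ha.d/resource.d/ only needs to answer start, stop, and status. A hypothetical skeleton; the myservice name is made up for illustration:

```shell
#!/bin/bash
# Hypothetical /etc/ha.d/resource.d/myservice skeleton. Heartbeat
# (haresources mode) calls "<script> start" on takeover and
# "<script> stop" on release.
myservice_handler() {
  case "$1" in
    start)  echo "starting myservice" ;;   # e.g. service myservice start
    stop)   echo "stopping myservice" ;;   # e.g. service myservice stop
    status) echo "myservice: status unknown" ;;
    *)      echo "Usage: myservice {start|stop|status}" ;;
  esac
}
myservice_handler "$1"
```

Appending myservice to the end of the haresources line would then have Heartbeat start it after the virtual IP is up and the filesystem is mounted.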
4. Create the killnfsd script that restarts NFS, and the drbddisk script for DRBD (node1, node2):
vim /etc/ha.d/resource.d/killnfsd
killall -9 nfsd; /etc/init.d/nfs restart;exit 0
vim /etc/ha.d/resource.d/drbddisk
#!/bin/bash
# This script is intended to be used as a resource script by heartbeat
# Philipp Reisner, Lars Ellenberg

DEFAULTFILE="/etc/default/drbd"
DRBDADM="/sbin/drbdadm"

if [ -f $DEFAULTFILE ]; then
    . $DEFAULTFILE
fi

if [ "$#" -eq 2 ]; then
    RES="$1"
    CMD="$2"
else
    RES="all"
    CMD="$1"
fi

drbd_set_role_from_proc_drbd()
{
    local out
    if ! test -e /proc/drbd; then
        ROLE="Unconfigured"
        return
    fi

    dev=$( $DRBDADM sh-dev $RES )
    minor=${dev#/dev/drbd}
    if [[ $minor = *[!0-9]* ]] ; then
        # sh-minor is only supported since drbd 8.3.1
        minor=$( $DRBDADM sh-minor $RES )
    fi
    if [[ -z $minor ]] || [[ $minor = *[!0-9]* ]] ; then
        ROLE=Unknown
        return
    fi

    if out=$(sed -ne "/^ *$minor: cs:/ { s/:/ /g; p; q; }" /proc/drbd); then
        set -- $out
        ROLE=${5%/**}
        : ${ROLE:=Unconfigured} # if it does not show up
    else
        ROLE=Unknown
    fi
}

case "$CMD" in
    start)
        # try several times, in case heartbeat deadtime
        # was smaller than drbd ping time
        try=6
        while true; do
            $DRBDADM primary $RES && break
            let "--try" || exit 1 # LSB generic error
            sleep 1
        done
        ;;
    stop)
        # heartbeat (haresources mode) will retry failed stop
        # for a number of times in addition to this internal retry.
        try=3
        while true; do
            $DRBDADM secondary $RES && break
            let --try || exit 1 # LSB generic error
            sleep 1
        done
        ;;
    status)
        if [ "$RES" = "all" ]; then
            echo "A resource name is required for status inquiries."
            exit 10
        fi
        ST=$( $DRBDADM role $RES )
        ROLE=${ST%/**}
        case $ROLE in
            Primary|Secondary|Unconfigured)
                # expected
                ;;
            *)
                drbd_set_role_from_proc_drbd
        esac
        case $ROLE in
            Primary)
                echo "running (Primary)"
                exit 0 # LSB status "service is OK"
                ;;
            Secondary|Unconfigured)
                echo "stopped ($ROLE)"
                exit 3 # LSB status "service is not running"
                ;;
            *)
                # NOTE the "running" in below message.
                # this is a "heartbeat" resource script,
                # the exit code is _ignored_.
                echo "cannot determine status, may be running ($ROLE)"
                exit 4 # LSB status "service status is unknown"
                ;;
        esac
        ;;
    *)
        echo "Usage: drbddisk [resource] {start|stop|status}"
        exit 1
        ;;
esac
exit 0
chmod 755 /etc/ha.d/resource.d/killnfsd
chmod 755 /etc/ha.d/resource.d/drbddisk
service heartbeat start #start on node1 first, then node2
chkconfig heartbeat on
If the virtual IP 192.168.142.120 can now be pinged from other machines, the configuration is working.
V. Configuring NFS (node1, node2)
vim /etc/exports
/store *(rw,no_root_squash)
service rpcbind restart
service nfs restart
chkconfig rpcbind on
chkconfig nfs off
Note: NFS is deliberately not set to start at boot, because the /etc/ha.d/resource.d/killnfsd script controls starting NFS.
VI. Testing high availability
1. Normal hot-standby failover: mount the NFS share on the client
yum install -y nfs-utils rpcbind
service rpcbind start
service nfslock start
service nfs start
chkconfig rpcbind on
chkconfig nfslock on
chkconfig nfs on
mount -t nfs 192.168.142.120:/store /tmp
Simulate a failure by stopping the heartbeat service on the primary node node1: the standby node node2 takes over immediately and seamlessly, and reads and writes on the client's mounted NFS share continue to work.
service heartbeat stop #stop heartbeat on node1
service drbd status #on node2, the DRBD state becomes Primary
2. Failover on an unexpected crash
Force node1 to power off: node2 takes over immediately and seamlessly, and reads and writes on the client's mounted NFS share remain normal.
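A simple way to verify the takeover from the client side is to write a file through the virtual-IP mount before the crash and read it back afterwards. A sketch with a preview wrapper (the file name is illustrative; DRY_RUN=1 only prints the commands):

```shell
# DRY_RUN=1 (default) only prints each command; set DRY_RUN=0 to execute.
DRY_RUN=${DRY_RUN:-1}
run() { if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi; }

run mount -t nfs 192.168.142.120:/store /tmp
run sh -c 'echo before-failover > /tmp/failover.check'
# ... power off node1 and wait for node2 to take over ...
run cat /tmp/failover.check   # should still read back "before-failover"
```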