Deploying heartbeat on RHEL to build a simple failover cluster

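Original post by ahzzk2012, 2014-01-08 12:05:06

Tags: rhel, heartbeat, cluster configuration. Category: operations.

Copyright belongs to the author: this is an original work by 51CTO blog author ahzzk2012. Contact the author for permission before reposting; unauthorized reposting may lead to legal action.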

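Lab environment: RHEL 5.5, 64-bit

Lab requirements: VMware virtual machines and the heartbeat installation packages

Goal: automatic switchover between two samba servers backed by one shared disk, giving simple failover.

Lab plan:

HOSTA:

hostname: sev1.example.com (alias sev1); eth0: 192.168.138.10; eth1: 192.168.1.10 (heartbeat interface); GW: 192.168.138.2; primary node

HOSTB:

hostname: sev2.example.com (alias sev2); eth0: 192.168.138.20; eth1: 192.168.1.20 (heartbeat interface); GW: 192.168.138.2; standby node

Steps:

1. Open VMware and install two virtual machines, both running RHEL 5.5 64-bit. Select the samba service while installing the operating system. (Adding samba afterwards with rpm is awkward because of its tangle of dependencies.)

2. Edit the virtual machine settings of HOSTA and manually add a disk to be shared, named share for now. For both machines to be able to mount the shared storage, the disk's parameters have to be changed. In the VM's directory, locate the newly created shared disk, edit the share.vmx file, and add the following lines: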

disk.locking = "FALSE"

diskLib.dataCacheMaxSize=0

diskLib.dataCacheMaxReadAheadSize=0
diskLib.dataCacheMinReadAheadSize=0
diskLib.dataCachePageSize=4096
diskLib.maxUnsyncedWrites=0

scsi0:1.sharedBus = "virtual"
scsi0:1.shared = "true"

(scsi0:1 is the virtual device node; change it to match your own setup.)

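3. Boot HOSTA and log in as root (convenient for the later steps). Open a terminal, list the disks with fdisk -l, then format the new disk. I wanted to use the whole disk, so I did not partition it and formatted it directly as ext3. The commands:

fdisk -l    find the device name of the new disk, here /dev/sdb

fdisk /dev/sdb    optional: press m for the command list if you do want partitions (I won't go into it, look it up yourself), then reboot

mkdir -p /home/share    create the mount point

mkfs -t ext3 -c /dev/sdb    format as ext3 (-c checks for bad blocks first)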

Tip: test with a manual mount, mount /dev/sdb /home/share. It worked! (Remember to umount it afterwards.)
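The commands in step 3 can be collected into one small script. This is only a sketch: DRY_RUN, run and prepare_shared_disk are names made up here, the device and mount point are the ones from this article, and by default nothing is executed, the commands are just printed. When run for real, mkfs may ask for confirmation because /dev/sdb is a whole disk rather than a partition.

```shell
#!/bin/bash
# Sketch of step 3. DEV and MNT default to the values used in this article.
# DRY_RUN=1 (the default) only prints each command instead of running it.
DEV=${DEV:-/dev/sdb}
MNT=${MNT:-/home/share}
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: print the command in dry-run mode, run it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

prepare_shared_disk() {
    run mkdir -p "$MNT"            # create the mount point
    run mkfs -t ext3 -c "$DEV"     # whole disk, no partition table
    run mount "$DEV" "$MNT"        # test mount
    run umount "$MNT"              # leave it unmounted for heartbeat
}

prepare_shared_disk
```

Running it with DRY_RUN=0 as root on HOSTA executes the same four commands; HOSTB only needs the mkdir and the test mount.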

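4. HOSTB does not need a new disk. When adding a hard disk, choose the existing-disk option and point it at the share disk. After creating the mount point, test it; a successful mount is all that is needed.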

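5. Configure the samba server, either (a) from a terminal with vi /etc/samba/smb.conf (the main configuration file), or (b) with the graphical tool under Administration --> Server Settings --> Samba. Samba configuration is simple, so I won't go into it; the key is understanding the permissions (which I'm still a bit fuzzy on myself!).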
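Since the post skips the samba details, here is a minimal sketch of a share definition for the shared mount point. The share name, comment text and the share_stanza function are assumptions for illustration, not the author's actual settings; path, browseable and read only are standard smb.conf parameters.

```shell
# Print a minimal smb.conf share section for the directory given as $1.
# The share name "share" and the comment text are made up for this sketch.
share_stanza() {
    cat <<EOF
[share]
    comment = heartbeat failover share
    path = $1
    browseable = yes
    read only = no
EOF
}

share_stanza /home/share
```

The output could be appended to /etc/samba/smb.conf on both nodes and checked with testparm -s. File system permissions on /home/share still decide who can actually write.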

6. Install heartbeat on HOSTA

I installed from rpm: copy the packages into the virtual machine. heartbeat-2.1.3-3 needs three packages, installed in this order:

heartbeat-pils-2.1.3-3.el5.centos.i386.rpm

heartbeat-stonith-2.1.3-3.el5.centos.i386.rpm

heartbeat-2.1.3-3.el5.centos.i386.rpm

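To install: cd to the directory, ls to see the files, then run rpm -ivh heartbeat-pils-2.1.3-3.el5.centos.i386.rpm (Tab completion helps) and follow the prompts. Once all three packages are installed, it is a good habit to run rpm -q heartbeat -d, which lists the package's documentation files, including the sample configuration files used in the next step.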

7. After installation, find these three files under /usr/share/doc/heartbeat-2.1.3: authkeys, haresources and ha.cf. Copy all three to /etc/ha.d and configure them as follows:

a. ha.cf:

# There are lots of options in this file. All you have to have is a set
# of nodes listed {"node ...} one of {serial, bcast, mcast, or ucast},
# and a value for "auto_failback".
# ATTENTION: As the configuration file is read line by line,
# THE ORDER OF DIRECTIVE MATTERS!
# In particular, make sure that the udpport, serial baud rate
# etc. are set before the heartbeat media are defined!
# debug and log file directives go into effect when they
# are encountered.
# All will be fine if you keep them ordered as in this example.
# Note on logging:
# If any of debugfile, logfile and logfacility are defined then they
# will be used. If debugfile and/or logfile are not defined and
# logfacility is defined then the respective logging and debug
# messages will be logged to syslog. If logfacility is not defined
# then debugfile and logfile will be used to log messages. If
# logfacility is not defined and debugfile and/or logfile are not
# defined then defaults will be used for debugfile and logfile as
# required and messages will be sent there.
# File to write debug messages to
#debugfile /var/log/ha-debug
# File to write other messages to
logfile /var/log/ha-log
# Facility to use for syslog()/logger
logfacility local0
# A note on specifying "how long" times below...
# The default time unit is seconds
# 10 means ten seconds
# You can also specify them in milliseconds
# 1500ms means 1.5 seconds
# keepalive: how long between heartbeats?
keepalive 2
# deadtime: how long-to-declare-host-dead?
# If you set this too low you will get the problematic
# split-brain (or cluster partition) problem.
# See the FAQ for how to use warntime to tune deadtime.
deadtime 60
# warntime: how long before issuing "late heartbeat" warning?
# See the FAQ for how to use warntime to tune deadtime.
warntime 10
# Very first dead time (initdead)
# On some machines/OSes, etc. the network takes a while to come up
# and start working right after you've been rebooted. As a result
# we have a separate dead time for when things first come up.
# It should be at least twice the normal dead time.
initdead 120
# What UDP port to use for bcast/ucast communication?
#
udpport 694
# Baud rate for serial ports...
#baud 19200
# serial serialportname...
#serial /dev/ttyS0 # Linux
#serial /dev/cuaa0 # FreeBSD
#serial /dev/cuad0 # FreeBSD 6.x
#serial /dev/cua/a # Solaris
# What interfaces to broadcast heartbeats over?
bcast eth1 # Linux
#bcast eth1 eth2 # Linux
#bcast le0 # Solaris
#bcast le1 le2 # Solaris
# Set up a multicast heartbeat medium
# mcast [dev] [mcast group] [port] [ttl] [loop]
# [dev] device to send/rcv heartbeats on
# [mcast group] multicast group to join (class D multicast address
# 224.0.0.0 - 239.255.255.255)
# [port] udp port to sendto/rcvfrom (set this value to the
# same value as "udpport" above)
# [ttl] the ttl value for outbound heartbeats. this affects
# how far the multicast packet will propagate. (0-255)
# Must be greater than zero.
# [loop] toggles loopback for outbound multicast heartbeats.
# if enabled, an outbound packet will be looped back and
# received by the interface it was sent on. (0 or 1)
# Set this value to zero.
#mcast eth0 225.0.0.1 694 1 0
# Set up a unicast / udp heartbeat medium
# ucast [dev] [peer-ip-addr]
# [dev] device to send/rcv heartbeats on
# [peer-ip-addr] IP address of peer to send packets to
ucast eth1 192.168.1.20
# About boolean values...
# Any of the following case-insensitive values will work for true:
# true, on, yes, y, 1
# Any of the following case-insensitive values will work for false:
# false, off, no, n, 0
# auto_failback: determines whether a resource will
# automatically fail back to its "primary" node, or remain
# on whatever node is serving it until that node fails, or
# an administrator intervenes.
# The possible values for auto_failback are:
# on - enable automatic failbacks
# off - disable automatic failbacks
# legacy - enable automatic failbacks in systems
# where all nodes do not yet support
# the auto_failback option.
# auto_failback "on" and "off" are backwards compatible with the old
# "nice_failback on" setting.
# See the FAQ for information on how to convert
# from "legacy" to "on" without a flash cut.
# (i.e., using a "rolling upgrade" process)
# The default value for auto_failback is "legacy", which
# will issue a warning at startup. So, make sure you put
# an auto_failback directive in your ha.cf file.
# (note: auto_failback can be any boolean or "legacy")
#
auto_failback on
# Basic STONITH support
# Using this directive assumes that there is one stonith
# device in the cluster. Parameters to this device are
# read from a configuration file. The format of this line is:
# stonith <stonith_type> <configfile>
# NOTE: it is up to you to maintain this file on each node in the
# cluster!
#stonith baytech /etc/ha.d/conf/stonith.baytech
# STONITH support
# You can configure multiple stonith devices using this directive.
# The format of the line is:
# stonith_host <hostfrom> <stonith_type> <params...>
# <hostfrom> is the machine the stonith device is attached
# to or * to mean it is accessible from any host.
# <stonith_type> is the type of stonith device (a list of
# supported drives is in /usr/lib/stonith.)
# <params...> are driver specific parameters. To see the
# format for a particular device, run:
# stonith -l -t <stonith_type>
# Note that if you put your stonith device access information in
# here, and you make this file publicly readable, you're asking
# for a denial of service attack ;-)
# To get a list of supported stonith devices, run
# stonith -L
# For detailed information on which stonith devices are supported
# and their detailed configuration options, run this command:
# stonith -h
#stonith_host * baytech 10.0.0.3 mylogin mysecretpassword
#stonith_host ken3 rps10 /dev/ttyS1 kathy 0
#stonith_host kathy rps10 /dev/ttyS1 ken3 0
# Watchdog is the watchdog timer. If our own heart doesn't beat for
# a minute, then our machine will reboot.
# NOTE: If you are using the software watchdog, you very likely
# wish to load the module with the parameter "nowayout=0" or
# compile it without CONFIG_WATCHDOG_NOWAYOUT set. Otherwise even
# an orderly shutdown of heartbeat will trigger a reboot, which is
# very likely NOT what you want.
#watchdog /dev/watchdog
# Tell what machines are in the cluster
# node nodename ... -- must match uname -n
node sev1.example.com
node sev2.example.com
# Less common options...
# Treats 10.10.10.254 as a pseudo-cluster-member
# Used together with ipfail below...
# note: don't use a cluster node as ping node
ping 192.168.138.2
# Treats 10.10.10.254 and 10.10.10.253 as a pseudo-cluster-member
# called group1. If either 10.10.10.254 or 10.10.10.253 are up
# then group1 is up
# Used together with ipfail below...
#ping_group group1 10.0.0.1 10.0.0.2
# HBA ping directive for Fiber Channel
# Treats fc-card-name as pseudo-cluster-member
# used with ipfail below ...
#
# You can obtain HBAAPI from http://hbaapi.sourceforge.net. You need
# to get the library specific to your HBA directly from the vendor
# To install HBAAPI stuff, all You need to do is to compile the common
# part you obtained from the sourceforge. This will produce libHBAAPI.so
# which you need to copy to /usr/lib. You need also copy hbaapi.h to
# /usr/include.
# The fc-card-name is the name obtained from the hbaapitest program
# that is part of the hbaapi package. Running hbaapitest will produce
# a verbose output. One of the first line is similar to:
# Apapter number 0 is named: qlogic-qla2200-0
# Here fc-card-name is qlogic-qla2200-0.
#hbaping fc-card-name
# Processes started and stopped with heartbeat. Restarted unless
# they exit with rc=100
#respawn userid /path/name/to/run
#respawn root /usr/lib/heartbeat/ipfail
# Access control for client api
# default is no access
#apiauth client-name gid=gidlist uid=uidlist
#apiauth ipfail gid=root uid=root
###########################
# Unusual options.
###########################
# hopfudge maximum hop count minus number of nodes in config
#hopfudge 1
# deadping - dead time for ping nodes
#deadping 30
# hbgenmethod - Heartbeat generation number creation method
# Normally these are stored on disk and incremented as needed.
#hbgenmethod time
# realtime - enable/disable realtime execution (high priority, etc.)
# defaults to on
#realtime off
# debug - set debug level
# defaults to zero
#debug 1
# API Authentication - replaces the fifo-permissions-based system of the past
# You can put a uid list and/or a gid list.
# If you put both, then a process is authorized if it qualifies under either
# the uid list, or under the gid list.
# The groupname "default" has special meaning. If it is specified, then
# this will be used for authorizing groupless clients, and any client groups
# not otherwise specified.
# There is a subtle exception to this. "default" will never be used in the
# following cases (actual default auth directives noted in brackets)
# ipfail (uid=HA_CCMUSER)
# ccm (uid=HA_CCMUSER)
# ping (gid=HA_APIGROUP)
# cl_status (gid=HA_APIGROUP)
# This is done to avoid creating a gaping security hole and matches the most
# likely desired configuration.
#apiauth ipfail uid=hacluster
#apiauth ccm uid=hacluster
#apiauth cms uid=hacluster
#apiauth ping gid=haclient uid=alanr,root
#apiauth default gid=haclient
# message format in the wire, it can be classic or netstring,
# default: classic
#msgfmt classic/netstring
# Do we use logging daemon?
# If logging daemon is used, logfile/debugfile/logfacility in this file
# are not meaningful any longer. You should check the config file for logging
# daemon (the default is /etc/logd.cf)
# more information can be found in http://www.linux-ha.org/ha_2ecf_2fUseLogdDirective
# Setting use_logd to "yes" is recommended
use_logd yes
# the interval we reconnect to logging daemon if the previous connection failed
# default: 60 seconds
#conn_logd_time 60
# Configure compression module
# It could be zlib or bz2, depending on whether u have the corresponding
# library in the system.
#compression bz2
# Configure compression threshold
# This value determines the threshold to compress a message,
# e.g. if the threshold is 1, then any message with size greater than 1 KB
# will be compressed, the default is 2 (KB)
# compression_threshold 2
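Most of ha.cf is comments, so it is easy to lose track of which directives are actually in effect. A small sketch for listing only the active lines follows; active_directives is a name made up here, and the path in the example call is the one this article uses.

```shell
# Print only the active directives of a heartbeat-style config file,
# skipping blank lines and lines whose first non-blank character is '#'.
active_directives() {
    grep -Ev '^[[:space:]]*(#|$)' "$1"
}

# Example call, skipped quietly when the file is not present.
if [ -r /etc/ha.d/ha.cf ]; then
    active_directives /etc/ha.d/ha.cf
fi
```

Comparing this short listing on both nodes is a quick way to confirm that the two ha.cf files differ only in the ucast peer address.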

b. authkeys:

# Authentication file. Must be mode 600
# Must have exactly one auth directive at the front.
# auth send authentication using this method-id
# Then, list the method and key that go with that method-id
# Available methods: crc sha1, md5. Crc doesn't need/want a key.
# You normally only have one authentication method-id listed in this file
# Put more than one to make a smooth transition when changing auth
# methods and/or keys.
# sha1 is believed to be the "best", md5 next best.
# crc adds no security, except from packet corruption.
# Use only on physically secure networks.
auth 1
1 crc
#2 sha1 HI!
#3 md5 Hello!

Important: after editing, change the permissions of the authkeys file with chmod 600 authkeys (this step is mandatory).
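The file above uses crc, which its own comments describe as adding no security. As a hedged alternative, the sketch below writes an authkeys file that uses sha1 with a random key and sets mode 600 in the same step. write_authkeys and the key length are choices made for this example, not something taken from the original setup.

```shell
# Write an authkeys file that uses sha1 with a random 32-hex-digit key and
# restrict it to mode 600. The body runs in a subshell so the umask change
# does not leak into the calling shell.
write_authkeys() (
    target=$1
    key=$(head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n')
    umask 077
    printf 'auth 1\n1 sha1 %s\n' "$key" > "$target"
    chmod 600 "$target"
)
```

For example, write_authkeys /etc/ha.d/authkeys on HOSTA. The same file, with the same key, then has to be copied to HOSTB, because both nodes must share one key.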
c. haresources:

# This is a list of resources that move from machine to machine as
# nodes go down and come up in the cluster. Do not include
# "administrative" or fixed IP addresses in this file.
#
# The haresources files MUST BE IDENTICAL on all nodes of the cluster.
# The node names listed in front of the resource group information
# is the name of the preferred node to run the service. It is
# not necessarily the name of the current machine. If you are running
# auto_failback ON (or legacy), then these services will be started
# up on the preferred nodes - any time they're up.
# If you are running with auto_failback OFF, then the node information
# will be used in the case of a simultaneous start-up, or when using
# the hb_standby {foreign,local} command.
# BUT FOR ALL OF THESE CASES, the haresources files MUST BE IDENTICAL.
# If your files are different then almost certainly something
# won't work right.
#
# We refer to this file when we're coming up, and when a machine is being
# taken over after going down.
# You need to make this right for your installation, then install it in
# /etc/ha.d
# Each logical line in the file constitutes a "resource group".
# A resource group is a list of resources which move together from
# one node to another - in the order listed. It is assumed that there
# is no relationship between different resource groups. These
# resource in a resource group are started left-to-right, and stopped
# right-to-left. Long lists of resources can be continued from line
# to line by ending the lines with backslashes ("\").
# These resources in this file are either IP addresses, or the name
# of scripts to run to "start" or "stop" the given resource.
# The format is like this:
#node-name resource1 resource2 ... resourceN
sev1.example.com 192.168.138.23 httpd
sev1.example.com 192.168.138.24 Filesystem::/dev/sdb::/home/share::ext3 smb
# If the resource name contains an :: in the middle of it, the
# part after the :: is passed to the resource script as an argument.
# Multiple arguments are separated by the :: delimiter
# In the case of IP addresses, the resource script name IPaddr is
# implied.
# For example, the IP address 135.9.8.7 could also be represented
# as IPaddr::135.9.8.7
# THIS IS IMPORTANT!! vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
# The given IP address is directed to an interface which has a route
# to the given address. This means you have to have a net route
# set up outside of the High-Availability structure. We don't set it
# up here -- we key off of it.
# The broadcast address for the IP alias that is created to support
# an IP address defaults to the highest address on the subnet.
# The netmask for the IP alias that is created defaults to the same
# netmask as the route that it selected in the step above.
# The base interface for the IPalias that is created defaults to the
# same netmask as the route that it selected in the step above.
# If you want to specify that this IP address is to be brought up
# on a subnet with a netmask of 255.255.255.0, you would specify
# this as IPaddr::135.9.8.7/24 .
# If you wished to tell it that the broadcast address for this subnet
# was 135.9.8.210, then you would specify that this way:
# IPaddr::135.9.8.7/24/135.9.8.210
# If you wished to tell it that the interface to add the address to
# is eth0, then you would need to specify it this way:
# IPaddr::135.9.8.7/24/eth0
# And this way to specify both the broadcast address and the
# interface:
# IPaddr::135.9.8.7/24/eth0/135.9.8.210
# The IP addresses you list in this file are called "service" addresses,
# since they're the publicly advertised addresses that clients
# use to get at highly available services.
# For a hot/standby (non load-sharing) 2-node system with only
# a single service address,
# you will probably only put one system name and one IP address in here.
# The name you give the address to is the name of the default "hot"
# system.
# Where the nodename is the name of the node which "normally" owns the
# resource. If this machine is up, it will always have the resource
# it is shown as owning.
# The string you put in for nodename must match the uname -n name
# of your machine. Depending on how you have it administered, it could
# be a short name or a FQDN.
#
#-------------------------------------------------------------------
# Simple case: One service address, default subnet and netmask
# No servers that go up and down with the IP address
#just.linux-ha.org 135.9.216.110
#-------------------------------------------------------------------
# Assuming the administrative addresses are on the same subnet...
# A little more complex case: One service address, default subnet
# and netmask, and you want to start and stop http when you get
# the IP address...
#just.linux-ha.org 135.9.216.110 http
#-------------------------------------------------------------------
# A little more complex case: Three service addresses, default subnet
# and netmask, and you want to start and stop http when you get
# the IP address...
#just.linux-ha.org 135.9.216.110 135.9.215.111 135.9.216.112 httpd
#-------------------------------------------------------------------
# One service address, with the subnet, interface and bcast addr
# explicitly defined.
#just.linux-ha.org 135.9.216.3/28/eth0/135.9.216.12 httpd
#-------------------------------------------------------------------
# An example where a shared filesystem is to be used.
# Note that multiple arguments are passed to this script using
# the delimiter '::' to separate each argument.
#node1 10.0.0.170 Filesystem::/dev/sda1::/data1::ext2
# Regarding the node-names in this file:
# They must match the names of the nodes listed in ha.cf, which in turn
# must match the `uname -n` of some node in the cluster. So they aren't
# virtual in any sense of the word.
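The comments above say that the resources in a group are started left to right and stopped right to left. The sketch below only illustrates that ordering for one haresources line. It is not how heartbeat itself is implemented, and resource_order is a made-up name.

```shell
# Print the order in which the resources of one haresources line would be
# started (left to right) and stopped (right to left). The first field is
# the preferred node name and is skipped. Runs in a subshell so that
# disabling globbing and resetting the positional parameters stays local.
resource_order() (
    set -f               # no filename expansion while splitting
    set -- $1            # split the line on whitespace
    shift                # drop the node name
    start="$*"
    stop=""
    for r in "$@"; do
        stop="$r${stop:+ $stop}"
    done
    echo "start: $start"
    echo "stop:  $stop"
)

resource_order "sev1.example.com 192.168.138.24 Filesystem::/dev/sdb::/home/share::ext3 smb"
```

For the second line of this article's file, the start order is the service IP, then the Filesystem mount, then smb, and the stop order is the reverse. That is why smb is listed after the mount it depends on.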

8. Configure heartbeat on HOSTB

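I took a lazy shortcut here. The configuration is the same as on HOSTA except for one line in ha.cf: find ucast eth1 192.168.1.20 and change the address to 192.168.1.10. So I simply logged in to HOSTA over ftp and fetched the three configuration files with get.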
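The one-line change on HOSTB can also be scripted. A minimal sketch, assuming GNU sed (for -i) and the eth1 heartbeat interface used in this article; set_ucast_peer is a hypothetical helper name.

```shell
# Rewrite the ucast peer address in a copy of ha.cf. Only an active
# (uncommented) "ucast eth1 ..." line is touched.
# $1 = path to ha.cf, $2 = new peer address
set_ucast_peer() {
    sed -i "s/^ucast[[:space:]]\{1,\}eth1[[:space:]].*/ucast eth1 $2/" "$1"
}
```

On HOSTB, after fetching the three files, this would be set_ucast_peer /etc/ha.d/ha.cf 192.168.1.10. If the copied authkeys file lost its permissions in transit, chmod 600 is needed again.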

9. Start heartbeat

HOSTA: in a terminal, run service heartbeat start. OK

HOSTB: in a terminal, run service heartbeat start. OK

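If the configuration is correct and the network is reachable, an eth0:0 alias interface appears automatically; it carries the virtual IP that heartbeat brings up. Remember to check that heartbeat is running with ps -ef.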
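To see which node currently holds a service address, the address list can be checked from a script. This is a sketch that assumes the ip command from iproute is installed; has_address is a made-up name and reads the listing from standard input.

```shell
# Read `ip -o addr show` output on stdin and succeed if the given IPv4
# address is configured on some interface. -F makes grep treat the dots
# in the address literally.
has_address() {
    grep -qF "inet $1/"
}
```

For example, ip -o addr show | has_address 192.168.138.23 && echo "this node holds the service IP". Stopping heartbeat on HOSTA and running the same check on HOSTB is a simple failover test.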

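Typing is tiring and screenshots are hard to upload; I wrote all this mainly so I can look it up when I forget. I tested it on virtual machines: failover works and the smb service starts automatically (httpd is there purely for testing), and the disk mount works too. Never mount the shared disk automatically from fstab. Heartbeat has to do the mounting, otherwise failover will not work.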
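The fstab rule is easy to check before starting heartbeat. A minimal sketch follows; in_fstab is a made-up name, and entries written as LABEL=... or UUID=... are not caught by this simple comparison.

```shell
# Succeed if the device in $1 has an active (uncommented) entry in the
# fstab-style file in $2. Only the first field of each line is compared.
in_fstab() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "$2"
}

# Example call with the device from this article.
if [ -r /etc/fstab ] && in_fstab /dev/sdb /etc/fstab; then
    echo "warning: /dev/sdb is listed in /etc/fstab; let heartbeat mount it instead"
fi
```

The same check is worth running on both nodes, since either one may end up owning the Filesystem resource.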

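Summary: building a failover cluster with heartbeat takes only simple configuration, but keep the following points in mind:

1. Set the hostname, IP addresses and so on before installing heartbeat. Check /etc/hosts, /etc/sysconfig/network and the other network configuration files, and install only once they are right.

2. Most of the heartbeat configuration is in ha.cf. The main points are adding the nodes, choosing the heartbeat interface, and pinging an outside address to check connectivity. authkeys only sets the authentication method; pick one. haresources needs just one line naming the resources to run. (That line is the heart of it. It took me a week, and then I found that the comments explained everything. Poor English really hurts...)

3. The comments in Linux configuration files matter. Read them whenever you have time; they help a lot when configuring.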

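4. Clusters fall into roughly three kinds: high availability (failover, as built here, belongs to this kind), load balancing, and high-performance computing. All of them are essential for large server deployments and deserve more study. I don't know whether I will get the chance to learn Veritas and Oracle as well!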
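Point 1 matters because the node lines in ha.cf must match the output of uname -n. A small sketch to verify that before starting heartbeat; node_listed is a made-up name, and the path in the example call is the one this article uses.

```shell
# Succeed if the node name in $1 appears on an active "node" line of the
# ha.cf-style file in $2. A node line may list several names.
node_listed() {
    awk -v n="$1" '
        $1 == "node" { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }' "$2"
}

# Example call, skipped quietly when ha.cf is not present.
if [ -r /etc/ha.d/ha.cf ]; then
    if node_listed "$(uname -n)" /etc/ha.d/ha.cf; then
        echo "ok: $(uname -n) is a configured node"
    else
        echo "mismatch: $(uname -n) is not listed in ha.cf"
    fi
fi
```

On the two hosts in this article, uname -n should print sev1.example.com and sev2.example.com respectively.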
