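Environment: RHEL 5.5 64-bit

Requirements: VMware virtual machines and the heartbeat installation packages

Goal: automatic switchover between two samba servers, with shared disk storage, to achieve simple failover.

Plan:

HOSTA:

hostname: sev1.example.com sev1     eth0: 192.168.138.10  eth1: 192.168.1.10 (heartbeat interface)  GW: 192.168.138.2  primary node

HOSTB:

hostname: sev2.example.com sev2     eth0: 192.168.138.20  eth1: 192.168.1.20 (heartbeat interface)  GW: 192.168.138.2  standby node

Steps:

1. Open VMware and install two virtual hosts, both running RHEL 5.5 64-bit. Select the samba service while installing the operating system. (Installing samba afterwards with rpm is awkward because of its many dependencies.)

2. On HOSTA, edit the virtual machine settings and add a disk by hand to act as shared storage; I named it share for now. So that both machines can mount the shared storage, the disk's parameters have to be changed. In the VM's root directory, locate the newly created shared disk, edit the share.vmx file, and add the following parameters: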

disk.locking = "FALSE"

diskLib.dataCacheMaxSize=0        

diskLib.dataCacheMaxReadAheadSize=0
diskLib.dataCacheMinReadAheadSize=0
diskLib.dataCachePageSize=4096    
diskLib.maxUnsyncedWrites=0

scsi0:1.sharedBus = "virtual"    (scsi0:1 is the virtual device node; adjust it to match your setup)
scsi0:1.shared = "true"

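3. Boot HOSTA and log in as root (for convenience later). Open a terminal, run fdisk -l to find the disk, then format it. I wanted to use the whole disk, so I did not partition it and formatted it directly as ext3. The commands:

fdisk -l    find the disk's device name, here /dev/sdb

fdisk /dev/sdb, then m for help (only needed if you want to partition the disk; the options are well documented elsewhere). Reboot afterwards.

mkdir -p /home/share    create the mount point

mkfs -t ext3 -c /dev/sdb    format as ext3 (-c checks for bad blocks first)

Tip: test a manual mount with mount /dev/sdb /home/share. It worked! (Remember to umount afterwards.)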

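4. HOSTB does not need a new disk. When adding a hard disk, choose the option to use an existing disk and point it at the share disk. After creating the mount point, remember to test it; a successful mount is all that is needed.

5. Configure the samba server, either (a) from a terminal with vi /etc/samba/smb.conf (the main configuration file), or (b) through the GUI at Administration --> Server Settings --> Samba. Samba configuration is simple, so I will not go into it; the key is understanding the permissions. (I am still a bit fuzzy on those myself.)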
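
Since I skipped the samba details, here is a minimal sketch of a share definition for the shared mount point. The section name [share] and the account smbuser are my own placeholders, not something from the original setup; only the path /home/share comes from the steps above.

```shell
# Append a hypothetical [share] section to an smb.conf file.
# The file defaults to /etc/samba/smb.conf; "smbuser" is an assumed account.
add_share() {
    conf=${1:-/etc/samba/smb.conf}
    cat >> "$conf" <<'EOF'

[share]
    comment = disk mounted by heartbeat
    path = /home/share
    valid users = smbuser
    writable = yes
    browseable = yes
EOF
}
```

After calling add_share, testparm -s can be used to check the syntax, and the account has to be added to samba with smbpasswd -a smbuser. The user also needs filesystem write permission on /home/share, which is where most permission problems come from.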

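6. Install heartbeat on HOSTA

I installed from rpm after copying the packages into the virtual machine. heartbeat-2.1.3-3 needs three packages, installed in this order:

heartbeat-pils-2.1.3-3.el5.centos.i386.rpm

heartbeat-stonith-2.1.3-3.el5.centos.i386.rpm

heartbeat-2.1.3-3.el5.centos.i386.rpm

To install: cd into the directory, ls to check the files, then run rpm -ivh heartbeat-pils-2.1.3-3.el5.centos.i386.rpm (use Tab completion) and follow the prompts. Once all three packages are installed, run rpm -q heartbeat -d to list the documentation files the package installed (rpm -ql heartbeat lists every file). It is a good habit.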
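
The install order can be captured in a small loop. This is only a sketch: the function name install_heartbeat and the RPM override variable are my own additions so the loop can be dry-run; the package file names are the ones listed above.

```shell
# Install the three heartbeat packages in dependency order.
# $1 is the directory holding the rpm files (default: current directory).
# Set RPM to another command to dry-run (an assumption for illustration).
install_heartbeat() {
    dir=${1:-.}
    rpm_cmd=${RPM:-rpm}
    for pkg in heartbeat-pils heartbeat-stonith heartbeat; do
        "$rpm_cmd" -ivh "$dir/$pkg-2.1.3-3.el5.centos.i386.rpm" || return 1
    done
}
```

Stopping at the first failure matters here, because the later packages depend on the earlier ones.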

7. After heartbeat is installed, find these three files under /usr/share/doc/heartbeat-2.1.3: authkeys, haresources and ha.cf. Copy all three to /etc/ha.d. Configure them as follows:
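
A sketch of the copy step, assuming the default paths named above. The function name copy_ha_samples is hypothetical, and setting the authkeys mode here is my addition, since heartbeat requires mode 600 on that file.

```shell
# Copy the three sample files into the heartbeat config directory.
# $1 = source doc directory, $2 = destination (defaults as in this article).
copy_ha_samples() {
    src=${1:-/usr/share/doc/heartbeat-2.1.3}
    dst=${2:-/etc/ha.d}
    for f in authkeys haresources ha.cf; do
        cp "$src/$f" "$dst/" || return 1
    done
    # authkeys must be readable by root only
    chmod 600 "$dst/authkeys"
}
```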

a. ha.cf:

#       There are lots of options in this file.  All you have to have is a set
#       of nodes listed {"node ...} one of {serial, bcast, mcast, or ucast},
#       and a value for "auto_failback".
#       ATTENTION: As the configuration file is read line by line,
#                 THE ORDER OF DIRECTIVE MATTERS!
#       In particular, make sure that the udpport, serial baud rate
#       etc. are set before the heartbeat media are defined!
#       debug and log file directives go into effect when they
#       are encountered.
#       All will be fine if you keep them ordered as in this example.
#       Note on logging:
#       If any of debugfile, logfile and logfacility are defined then they
#       will be used. If debugfile and/or logfile are not defined and
#       logfacility is defined then the respective logging and debug
#       messages will be logged to syslog. If logfacility is not defined
#       then debugfile and logfile will be used to log messages. If
#       logfacility is not defined and debugfile and/or logfile are not
#       defined then defaults will be used for debugfile and logfile as
#       required and messages will be sent there.
#       File to write debug messages to
#debugfile /var/log/ha-debug
#       File to write other messages to
logfile /var/log/ha-log
#       Facility to use for syslog()/logger
logfacility   local0
#       A note on specifying "how long" times below...
#       The default time unit is seconds
#              10 means ten seconds
#       You can also specify them in milliseconds
#              1500ms means 1.5 seconds
#       keepalive: how long between heartbeats?
keepalive 2
#       deadtime: how long-to-declare-host-dead?
#              If you set this too low you will get the problematic
#              split-brain (or cluster partition) problem.
#              See the FAQ for how to use warntime to tune deadtime.
deadtime 60
#       warntime: how long before issuing "late heartbeat" warning?
#       See the FAQ for how to use warntime to tune deadtime.
warntime 10
#       Very first dead time (initdead)
#       On some machines/OSes, etc. the network takes a while to come up
#       and start working right after you've been rebooted.  As a result
#       we have a separate dead time for when things first come up.
#       It should be at least twice the normal dead time.
initdead 120
#       What UDP port to use for bcast/ucast communication?
#
udpport 694
#       Baud rate for serial ports...
#baud   19200
#       serial  serialportname ...
#serial /dev/ttyS0      # Linux
#serial /dev/cuaa0      # FreeBSD
#serial /dev/cuad0      # FreeBSD 6.x
#serial /dev/cua/a      # Solaris
#       What interfaces to broadcast heartbeats over?
bcast   eth1    # Linux
#bcast  eth1 eth2       # Linux
#bcast  le0             # Solaris
#bcast  le1 le2         # Solaris
#       Set up a multicast heartbeat medium
#       mcast [dev] [mcast group] [port] [ttl] [loop]
#       [dev]           device to send/rcv heartbeats on
#       [mcast group]   multicast group to join (class D multicast address
#                       224.0.0.0 - 239.255.255.255)
#       [port]          udp port to sendto/rcvfrom (set this value to the
#                       same value as "udpport" above)
#       [ttl]           the ttl value for outbound heartbeats.  this affects
#                       how far the multicast packet will propagate. (0-255)
#                       Must be greater than zero.
#       [loop]          toggles loopback for outbound multicast heartbeats.
#                       if enabled, an outbound packet will be looped back and
#                       received by the interface it was sent on. (0 or 1)
#                       Set this value to zero.
#mcast eth0 225.0.0.1 694 1 0
#       Set up a unicast / udp heartbeat medium
#       ucast [dev] [peer-ip-addr]
#       [dev]           device to send/rcv heartbeats on
#       [peer-ip-addr]  IP address of peer to send packets to
ucast eth1 192.168.1.20
#       About boolean values...
#       Any of the following case-insensitive values will work for true:
#              true, on, yes, y, 1
#       Any of the following case-insensitive values will work for false:
#              false, off, no, n, 0
#       auto_failback:  determines whether a resource will
#       automatically fail back to its "primary" node, or remain
#       on whatever node is serving it until that node fails, or
#       an administrator intervenes.
#       The possible values for auto_failback are:
#              on      - enable automatic failbacks
#              off     - disable automatic failbacks
#              legacy  - enable automatic failbacks in systems
#                      where all nodes do not yet support
#                      the auto_failback option.
#       auto_failback "on" and "off" are backwards compatible with the old
#              "nice_failback on" setting.
#       See the FAQ for information on how to convert
#              from "legacy" to "on" without a flash cut.
#              (i.e., using a "rolling upgrade" process)
#       The default value for auto_failback is "legacy", which
#       will issue a warning at startup.  So, make sure you put
#       an auto_failback directive in your ha.cf file.
#       (note: auto_failback can be any boolean or "legacy")
#
auto_failback on
#       Basic STONITH support
#       Using this directive assumes that there is one stonith
#       device in the cluster.  Parameters to this device are
#       read from a configuration file. The format of this line is:
#         stonith <type> <configfile>
#       NOTE: it is up to you to maintain this file on each node in the
#       cluster!
#stonith baytech /etc/ha.d/conf/stonith.baytech
#       STONITH support
#       You can configure multiple stonith devices using this directive.
#       The format of the line is:
#         stonith_host <hostfrom> <stonith_type> <params...>
#         <hostfrom> is the machine the stonith device is attached
#             to or * to mean it is accessible from any host.
#         <stonith_type> is the type of stonith device (a list of
#             supported drives is in /usr/lib/stonith.)
#         <params...> are driver specific parameters.  To see the
#             format for a particular device, run:
#           stonith -l -t <stonith_type>
#       Note that if you put your stonith device access information in
#       here, and you make this file publicly readable, you're asking
#       for a denial of service attack ;-)
#       To get a list of supported stonith devices, run
#              stonith -L
#       For detailed information on which stonith devices are supported
#       and their detailed configuration options, run this command:
#              stonith -h
#stonith_host *     baytech 10.0.0.3 mylogin mysecretpassword
#stonith_host ken3  rps10 /dev/ttyS1 kathy 0
#stonith_host kathy rps10 /dev/ttyS1 ken3 0
#       Watchdog is the watchdog timer.  If our own heart doesn't beat for
#       a minute, then our machine will reboot.
#       NOTE: If you are using the software watchdog, you very likely
#       wish to load the module with the parameter "nowayout=0" or
#       compile it without CONFIG_WATCHDOG_NOWAYOUT set. Otherwise even
#       an orderly shutdown of heartbeat will trigger a reboot, which is
#       very likely NOT what you want.
#watchdog /dev/watchdog
#       Tell what machines are in the cluster
#       node    nodename ...    -- must match uname -n
node    sev1.example.com
node    sev2.example.com
#       Less common options...
#       Treats 10.10.10.254 as a pseudo-cluster-member
#       Used together with ipfail below...
#       note: don't use a cluster node as ping node
ping 192.168.138.2
#       Treats 10.10.10.254 and 10.10.10.253 as a pseudo-cluster-member
#       called group1. If either 10.10.10.254 or 10.10.10.253 are up
#       then group1 is up
#       Used together with ipfail below...
#ping_group group1 10.0.0.1 10.0.0.2
#       HBA ping directive for Fiber Channel
#       Treats fc-card-name as pseudo-cluster-member
#       used with ipfail below ...
#
#       You can obtain HBAAPI from http://hbaapi.sourceforge.net.  You need
#       to get the library specific to your HBA directly from the vendor
#       To install HBAAPI stuff, all You need to do is to compile the common
#       part you obtained from the sourceforge. This will produce libHBAAPI.so
#       which you need to copy to /usr/lib. You need also copy hbaapi.h to
#       /usr/include.
#       The fc-card-name is the name obtained from the hbaapitest program
#       that is part of the hbaapi package. Running hbaapitest will produce
#       a verbose output. One of the first line is similar to:
#              Apapter number 0 is named: qlogic-qla2200-0
#       Here fc-card-name is qlogic-qla2200-0.
#hbaping fc-card-name
#       Processes started and stopped with heartbeat.  Restarted unless
#              they exit with rc=100
#respawn userid /path/name/to/run
#respawn root /usr/lib/heartbeat/ipfail
#       Access control for client api
#              default is no access
#apiauth client-name gid=gidlist uid=uidlist
#apiauth ipfail gid=root uid=root
###########################
#       Unusual options.
###########################
#       hopfudge maximum hop count minus number of nodes in config
#hopfudge 1
#       deadping - dead time for ping nodes
#deadping 30
#       hbgenmethod - Heartbeat generation number creation method
#              Normally these are stored on disk and incremented as needed.
#hbgenmethod time
#       realtime - enable/disable realtime execution (high priority, etc.)
#              defaults to on
#realtime off
#       debug - set debug level
#              defaults to zero
#debug 1
#       API Authentication - replaces the fifo-permissions-based system of the past
#       You can put a uid list and/or a gid list.
#       If you put both, then a process is authorized if it qualifies under either
#       the uid list, or under the gid list.
#       The groupname "default" has special meaning.  If it is specified, then
#       this will be used for authorizing groupless clients, and any client groups
#       not otherwise specified.
#       There is a subtle exception to this.  "default" will never be used in the
#       following cases (actual default auth directives noted in brackets)
#                ipfail       (uid=HA_CCMUSER)
#                ccm          (uid=HA_CCMUSER)
#                ping         (gid=HA_APIGROUP)
#                cl_status    (gid=HA_APIGROUP)
#       This is done to avoid creating a gaping security hole and matches the most
#       likely desired configuration.
#apiauth ipfail uid=hacluster
#apiauth ccm uid=hacluster
#apiauth cms uid=hacluster
#apiauth ping gid=haclient uid=alanr,root
#apiauth default gid=haclient
#       message format in the wire, it can be classic or netstring,
#       default: classic
#msgfmt  classic/netstring
#       Do we use logging daemon?
#       If logging daemon is used, logfile/debugfile/logfacility in this file
#       are not meaningful any longer. You should check the config file for logging
#       daemon (the default is /etc/logd.cf)
#       more information can be found in http://www.linux-ha.org/ha_2ecf_2fUseLogdDirective
#       Setting use_logd to "yes" is recommended
use_logd yes
#       the interval we reconnect to logging daemon if the previous connection failed
#       default: 60 seconds
#conn_logd_time 60
#       Configure compression module
#       It could be zlib or bz2, depending on whether u have the corresponding
#       library in the system.
#compression    bz2
#       Configure compression threshold
#       This value determines the threshold to compress a message,
#       e.g. if the threshold is 1, then any message with size greater than 1 KB
#       will be compressed, the default is 2 (KB)
#compression_threshold 2
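
The sample file is mostly comments, so it helps to look at only the directives that are actually in effect. A small sketch (the function name active_directives is my own):

```shell
# Print the lines of a heartbeat config file that are neither blank
# nor comments. The file defaults to /etc/ha.d/ha.cf.
active_directives() {
    # Drop blank lines and lines whose first non-blank character is '#'
    grep -Ev '^[[:space:]]*(#|$)' "${1:-/etc/ha.d/ha.cf}"
}
```

Run against the file above, this should leave only the settings I changed or kept, such as keepalive 2, deadtime 60, bcast eth1, ucast eth1 192.168.1.20, auto_failback on, the two node lines, ping 192.168.138.2 and use_logd yes.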

b. authkeys:

#       Authentication file.  Must be mode 600
#       Must have exactly one auth directive at the front.
#       auth    send authentication using this method-id
#       Then, list the method and key that go with that method-id
#       Available methods: crc sha1, md5.  Crc doesn't need/want a key.
#       You normally only have one authentication method-id listed in this file
#       Put more than one to make a smooth transition when changing auth
#       methods and/or keys.
#       sha1 is believed to be the "best", md5 next best.
#       crc adds no security, except from packet corruption.
#              Use only on physically secure networks.
auth 1
1 crc
#2 sha1 HI!
#3 md5 Hello!

Important: after editing, change the permissions of the authkeys file with chmod 600 authkeys. (This step is mandatory.)
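
I used crc because the heartbeat link is a private network, but the comments in the file say sha1 is believed to be the best method. A sketch of generating a sha1 authkeys file instead; the function name write_authkeys and the way the random key is produced are my own choices, not something heartbeat prescribes.

```shell
# Write an authkeys file that uses sha1 with a random key, mode 600.
# The file defaults to /etc/ha.d/authkeys.
write_authkeys() {
    file=${1:-/etc/ha.d/authkeys}
    # Random key: hash 512 bytes from /dev/urandom, keep the hex digest
    key=$(dd if=/dev/urandom bs=512 count=1 2>/dev/null | md5sum | cut -d' ' -f1)
    ( umask 077; printf 'auth 1\n1 sha1 %s\n' "$key" > "$file" )
    chmod 600 "$file"
}
```

Both nodes must end up with the same file, so generate it once and copy it to the other node rather than running this on each.
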
c. haresources:

#       This is a list of resources that move from machine to machine as
#       nodes go down and come up in the cluster.  Do not include
#       "administrative" or fixed IP addresses in this file.
#
#       The haresources files MUST BE IDENTICAL on all nodes of the cluster.
#       The node names listed in front of the resource group information
#       is the name of the preferred node to run the service.  It is
#       not necessarily the name of the current machine.  If you are running
#       auto_failback ON (or legacy), then these services will be started
#       up on the preferred nodes - any time they're up.
#       If you are running with auto_failback OFF, then the node information
#       will be used in the case of a simultaneous start-up, or when using
#       the hb_standby {foreign,local} command.
#       BUT FOR ALL OF THESE CASES, the haresources files MUST BE IDENTICAL.
#       If your files are different then almost certainly something
#       won't work right.
#
#       We refer to this file when we're coming up, and when a machine is being
#       taken over after going down.
#       You need to make this right for your installation, then install it in
#       /etc/ha.d
#       Each logical line in the file constitutes a "resource group".
#       A resource group is a list of resources which move together from
#       one node to another - in the order listed.  It is assumed that there
#       is no relationship between different resource groups.  These
#       resource in a resource group are started left-to-right, and stopped
#       right-to-left.  Long lists of resources can be continued from line
#       to line by ending the lines with backslashes ("\").
#       These resources in this file are either IP addresses, or the name
#       of scripts to run to "start" or "stop" the given resource.
#       The format is like this:
#node-name resource1 resource2 ... resourceN
sev1.example.com 192.168.138.23 httpd
sev1.example.com 192.168.138.24 Filesystem::/dev/sdb::/home/share::ext3 smb
#       If the resource name contains an :: in the middle of it, the
#       part after the :: is passed to the resource script as an argument.
#       Multiple arguments are separated by the :: delimiter
#       In the case of IP addresses, the resource script name IPaddr is
#       implied.
#       For example, the IP address 135.9.8.7 could also be represented
#       as IPaddr::135.9.8.7
#       THIS IS IMPORTANT!!    vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
#       The given IP address is directed to an interface which has a route
#       to the given address.  This means you have to have a net route
#       set up outside of the High-Availability structure.  We don't set it
#       up here -- we key off of it.
#       The broadcast address for the IP alias that is created to support
#       an IP address defaults to the highest address on the subnet.
#       The netmask for the IP alias that is created defaults to the same
#       netmask as the route that it selected in the step above.
#       The base interface for the IPalias that is created defaults to the
#       same netmask as the route that it selected in the step above.
#       If you want to specify that this IP address is to be brought up
#       on a subnet with a netmask of 255.255.255.0, you would specify
#       this as IPaddr::135.9.8.7/24 .
#       If you wished to tell it that the broadcast address for this subnet
#       was 135.9.8.210, then you would specify that this way:
#              IPaddr::135.9.8.7/24/135.9.8.210
#       If you wished to tell it that the interface to add the address to
#       is eth0, then you would need to specify it this way:
#              IPaddr::135.9.8.7/24/eth0
#       And this way to specify both the broadcast address and the
#       interface:
#              IPaddr::135.9.8.7/24/eth0/135.9.8.210
#       The IP addresses you list in this file are called "service" addresses,
#       since they're the publicly advertised addresses that clients
#       use to get at highly available services.
#       For a hot/standby (non load-sharing) 2-node system with only
#       a single service address,
#       you will probably only put one system name and one IP address in here.
#       The name you give the address to is the name of the default "hot"
#       system.
#       Where the nodename is the name of the node which "normally" owns the
#       resource.  If this machine is up, it will always have the resource
#       it is shown as owning.
#       The string you put in for nodename must match the uname -n name
#       of your machine.  Depending on how you have it administered, it could
#       be a short name or a FQDN.
#
#-------------------------------------------------------------------
#       Simple case: One service address, default subnet and netmask
#              No servers that go up and down with the IP address
#just.linux-ha.org      135.9.216.110
#-------------------------------------------------------------------
#       Assuming the administrative addresses are on the same subnet...
#       A little more complex case: One service address, default subnet
#       and netmask, and you want to start and stop http when you get
#       the IP address...
#just.linux-ha.org      135.9.216.110 http
#-------------------------------------------------------------------
#       A little more complex case: Three service addresses, default subnet
#       and netmask, and you want to start and stop http when you get
#       the IP address...
#just.linux-ha.org      135.9.216.110 135.9.215.111 135.9.216.112 httpd
#-------------------------------------------------------------------
#       One service address, with the subnet, interface and bcast addr
#       explicitly defined.
#just.linux-ha.org     135.9.216.3/28/eth0/135.9.216.12 httpd
#-------------------------------------------------------------------
#       An example where a shared filesystem is to be used.
#       Note that multiple arguments are passed to this script using
#       the delimiter '::' to separate each argument.
#node1  10.0.0.170 Filesystem::/dev/sda1::/data1::ext2
#       Regarding the node-names in this file:
#       They must match the names of the nodes listed in ha.cf, which in turn
#       must match the `uname -n` of some node in the cluster.  So they aren't
#       virtual in any sense of the word.
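
To make the :: syntax concrete, here is a rough sketch of how one resource entry breaks down into a script name plus arguments, following the rules in the comments above. It only prints the result; it is not heartbeat's own parser, and the function name resource_command is hypothetical.

```shell
# Print "script arg1 arg2 ... action" for one haresources resource entry.
resource_command() {
    entry=$1
    action=${2:-start}
    # A bare IP address implies the IPaddr script
    case $entry in
        [0-9]*.[0-9]*.[0-9]*.[0-9]*) entry="IPaddr::$entry" ;;
    esac
    # Split "script::arg1::arg2" on the :: delimiter
    set -- $(printf '%s' "$entry" | sed 's/::/ /g')
    echo "$* $action"
}

resource_command 'Filesystem::/dev/sdb::/home/share::ext3' start
# prints: Filesystem /dev/sdb /home/share ext3 start
resource_command 192.168.138.24 start
# prints: IPaddr 192.168.138.24 start
resource_command smb stop
# prints: smb stop
```

As far as I understand it, heartbeat looks for the named script under /etc/ha.d/resource.d and then /etc/init.d, which is why plain service names like smb and httpd work in the resource line.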

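8. Configure heartbeat on HOSTB

I took a shortcut here. The configuration is the same as on HOSTA except for one line in ha.cf: find ucast eth1 192.168.1.20 and change the address to 192.168.1.10. So I simply logged in to HOSTA by ftp and fetched the three configuration files with GET.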
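
The one-line change on HOSTB can be scripted with sed. A sketch, assuming the ucast line looks exactly like the one in my ha.cf (interface eth1, no trailing comment); the function name swap_ucast_peer is made up for illustration.

```shell
# Point the ucast directive in a copied ha.cf at the other node.
# Arguments: file, old peer IP, new peer IP.
swap_ucast_peer() {
    file=$1
    old=$2
    new=$3
    # Escape the dots so the old address is matched literally
    old_re=$(printf '%s' "$old" | sed 's/\./\\./g')
    sed -i "s/^ucast eth1 $old_re\$/ucast eth1 $new/" "$file"
}
```

On HOSTB this would be swap_ucast_peer /etc/ha.d/ha.cf 192.168.1.20 192.168.1.10. Since ftp does not preserve file modes, run chmod 600 /etc/ha.d/authkeys again after the transfer.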

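9. Start heartbeat

HOSTA: in a terminal, run service heartbeat start       OK

HOSTB: in a terminal, run service heartbeat start       OK

If the configuration is correct and the network is reachable, an eth0:0 interface appears automatically; it carries the virtual IP that heartbeat brings up. Remember to check that heartbeat is running with ps -ef.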
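
To check for the virtual IP without reading through the whole interface list, a small filter over the output of ip -o -4 addr show is enough. This is a sketch; has_ip is a hypothetical helper name, and 192.168.138.23 is the first service address from haresources.

```shell
# Succeeds if the given IPv4 address appears in `ip -o -4 addr show`
# style output read from standard input.
has_ip() {
    # -F treats the dots in the address literally
    grep -qF "inet $1/"
}
```

On the active node, ip -o -4 addr show | has_ip 192.168.138.23 && echo "VIP is up" should report the address; on the standby node it should print nothing. Stopping heartbeat on HOSTA (service heartbeat stop) and repeating the check on HOSTB is a simple failover test.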

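Typing is tiring and screenshots are hard to upload; I wrote all this mainly so I can look it up when I forget. I tested this on virtual machines: failover happens automatically and the smb service starts. The httpd service was only there for early testing. Disk mounting works too. Do not mount the shared disk automatically from fstab; heartbeat must do the mounting, otherwise the disk will not fail over properly.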
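
A quick way to make sure the shared disk has not slipped into fstab, sketched as a helper (the name not_in_fstab is my own). It only looks for the device path, so an entry written with LABEL= or UUID= would not be caught.

```shell
# Succeeds only if the device is NOT listed as an active fstab entry.
# $1 = device (default /dev/sdb), $2 = fstab file (default /etc/fstab).
not_in_fstab() {
    dev=${1:-/dev/sdb}
    fstab=${2:-/etc/fstab}
    # Commented-out lines do not count; the device must start the entry
    ! grep -Eq "^[[:space:]]*$dev([[:space:]]|\$)" "$fstab"
}
```

Usage on each node: not_in_fstab /dev/sdb || echo "remove /dev/sdb from /etc/fstab".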


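Summary: building a failover cluster with heartbeat takes only simple configuration, but a few points need attention:

1. Before installing heartbeat, set the hostname, IP addresses and so on. Pay attention to the network configuration files such as /etc/hosts and /etc/sysconfig/network, and install heartbeat only after they are configured.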
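
A sketch of the name resolution entries for this lab, using the eth0 addresses from the plan at the top. Mapping the names to the eth0 addresses rather than the heartbeat link is my assumption, and add_cluster_hosts is a hypothetical helper name.

```shell
# Append name resolution entries for both nodes to a hosts file.
# The file defaults to /etc/hosts.
add_cluster_hosts() {
    hosts=${1:-/etc/hosts}
    cat >> "$hosts" <<'EOF'
192.168.138.10  sev1.example.com sev1
192.168.138.20  sev2.example.com sev2
EOF
}
```

After setting HOSTNAME in /etc/sysconfig/network, uname -n on each machine should print exactly the name used on the node lines in ha.cf and in haresources.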

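2. Most of the heartbeat configuration is in ha.cf. The things to get right are adding the nodes, choosing the heartbeat interface, and pinging an outside address to check connectivity. authkeys only sets the authentication method; pick one. haresources needs just one line listing the resources to run. (That line is the essence. It cost me a week before I noticed that the comments explain it all. Weak English hurts...)

3. The comments in Linux configuration files matter. Read them when you have time; they help a great deal when configuring.

4. Clusters fall roughly into three kinds: high availability, load balancing and high-performance computing. (Failover belongs to high availability, not load balancing.) Large server deployments need all of these, so there is more to study. I wonder whether I will get a chance to learn Veritas and Oracle.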