With MySQL replication and keepalived, we can setup a quite robust high available MySQL environment in a few steps:

Environment:
Host1: db01.wordpress.com
Host2: db02.wordpress.com
DBVIP: mysql.wordpress.com 10.0.0.1

1. Setup MySQL Master-Master replication
Ref: Setup MySQL replication

2. Install keepalived at both hosts


--using apt-get, for Ubuntu
apt-get install keepalived
 
--using yum, for Redhat
yum install keepalive

3. Config keepalived

1) Add keepalived config file /etc/keepalived/keepalived.conf
Config file for host db01:

! Configuration File for keepalived
global_defs {
      notification_email {
        alexzeng@wordpress.com
      }
      notification_email_from alexzeng@wordpress.com
      smtp_server mx.wordpress.com
      smtp_connect_timeout 30
      router_id mysql-ha
      }
 
vrrp_script check_mysql {
   script "/mysql/keepalived_check.sh db02.wordpress.com"
   interval 2
   weight 2
}
 
vrrp_instance VI_1 {
      state BACKUP
      interface eth1
      virtual_router_id 51
      priority 100
      advert_int 1
      nopreempt  # only needed on higher priority node
      authentication {
      auth_type PASS
      auth_pass 1111
      }
 
      track_script {
        check_mysql
      }
      virtual_ipaddress {
        10.0.0.1/24 dev eth1 label eth1:1
      }
      notify_master /mysql/keepalived_master.sh
      notify_backup /mysql/keepalived_backup.sh
}


Config file of host db02:

Copy the config file in db01, and change this line:From   script "/mysql/keepalived_check.sh db02.wordpress.com"to   script "/mysql/keepalived_check.sh db01.wordpress.com"

2) Add scripts to both nodes
/mysql/keepalived_check.sh : monitor MySQL (for the host/network down, keepalived has internal mechanism to monitor them)


#!/bin/bash
# monitor mysql status
# if this node mysql is dead and its slave delay less than 120 seconds, then stop its keepalived. The other node will bind the IP.
 
export MYSQL_HOME=/mysql
export PATH=$MYSQL_HOME/bin:$PATH
 
mysql="$MYSQL_HOME/bin/mysql"
delay_file="$MYSQL_HOME/slave_delay_second.log"
slave_host=$1
 
$mysql -u root --connect_timeout=3 --execute="select version();"
 
if [ $? -ne 0 ]; then
 delayseconds=`cat $delay_file`
 if [ $delayseconds -le 120 ]; then
   /etc/init.d/keepalived stop
 fi
 exit 1 #bad
fi
 
# Get slave delay time and save it
$mysql -urepluser -prepluser -h$slave_host --connect_timeout=3 -e"select version();"
if [ $? -eq 0 ]; then
  delayseconds=`$mysql -urepluser -prepluser -h$slave_host --connect_timeout=3 -e"show slave status\G"|grep Seconds_Behind_Master|awk '{print \$2}'`
  if [[ "$delayseconds" =~ ^[0-9]+$ ]] ; then
     echo "$delayseconds" > $delay_file
  else
     echo "9999" > $delay_file
  fi
fi
exit 0 #good

/mysql/keepalived_master.sh : it will be called when the node becomes master


#!/bin/bash
 
my_host=`hostname`
current_date=`/bin/date +"%b %d %H:%M:%S"`
From="$my_host"
mail_list=alexzeng@wordpress.com
 
Subject="$my_host is MASTER"
Msgboday="$current_date : mysql.wordpress.com is online at $my_host"
echo "$Msgboday" | /usr/bin/mailx  -s "$Subject" "$mail_list"

/mysql/keepalived_backup.sh : it will be called when the node becomes slave


#!/bin/bash
 
my_host=`hostname`
current_date=`/bin/date +"%b %d %H:%M:%S"`
From="$my_host"
mail_list=alexzeng@wordpress.com
 
Subject="$my_host is BACKUP"
Msgboday="$current_date : mysql.wordpress.com is offline at $my_host"
echo "$Msgboday" | /usr/bin/mailx  -s "$Subject" "$mail_list"

4. Start keepalived at both nodes

1
2
3
4
5
service keepalived start
or
/etc/init.d/keepalived start
 
Check its log file at /var/log/messages

5. Test it
Scenarios:
A. Stop MySQL at the master node
B. Shutdown master node network
C. Shutdown master node OS
D. Split-brain (the nodes cannot connect to each other) – In my test, keepalived didn’t do anything in this situation.

Check result:
1) Check emails
2) Check IP using ifconfig at both nodes
2) Connect to DB without stop:

1
2
3
4
5
6
while true loop
do
date
mysql -urepluser -prepluser -hmysql.wordpress.com -e"select @@hostname;"
sleep 1
done;

In my test, the db cannot be connected for just 2 seconds.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
Mon Oct 29 22:30:51 GMT+7 2012
+---------------+
| @@hostname    |
+---------------+
| db01          |
+---------------+
Mon Oct 29 22:30:52 GMT+7 2012
ERROR 2003 (HY000): Can't connect to MySQL server on 'mysql.wordpress.com' (111)
Mon Oct 29 22:30:53 GMT+7 2012
ERROR 2003 (HY000): Can't connect to MySQL server on 'mysql.wordpress.com' (111)
Mon Oct 29 22:30:54 GMT+7 2012
+---------------+
| @@hostname    |
+---------------+
| db02          |
+---------------+