1. Setting up Percona XtraDB Cluster

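Installation environment:

Node 1: A: 192.168.91.18

Node 2: B: 192.168.91.20

Node 3: C: 192.168.91.21

Replication is implemented at the InnoDB storage engine layer.

server_id must be different on A, B, and C.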


On A, B, and C:

Download the software:

wget http://www.percona.com/downloads/Percona-XtraDB-Cluster-56/Percona-XtraDB-Cluster-5.6.21-25.8/binary/tarball/Percona-XtraDB-Cluster-5.6.21-rel70.1-25.8.938.Linux.x86_64.tar.gz


Install the dependency packages:

yum install -y perl-DBD-MySQL.x86_64 perl-IO-Socket-SSL.noarch socat.x86_64 nc

(nc, i.e. netcat, is a versatile networking utility)

 yum install -y http://www.percona.com/downloads/percona-release/redhat/0.1-3/percona-release-0.1-3.noarch.rpm


#Install the xtrabackup backup tool:

yum list |grep percona-xtrabackup

yum install -y percona-xtrabackup.x86_64


#rpm -qa |grep percona

percona-release-0.1-3.noarch

percona-xtrabackup-2.3.7-2.el6.x86_64


On A, B, and C:

Unpack the PXC tarball:

 tar xf Percona-XtraDB-Cluster-5.6.21-rel70.1-25.8.938.Linux.x86_64.tar.gz


Create a symlink:

ln -s /home/tools/Percona-XtraDB-Cluster-5.6.21-rel70.1-25.8.938.Linux.x86_64 /usr/local/mysql


Create the mysql user and group:

groupadd mysql

useradd -g mysql -s /sbin/nologin -d /usr/local/mysql mysql


Create the init script:

cp /usr/local/mysql/support-files/mysql.server  /etc/init.d/mysqld


Create the base directories that mysql needs:

mkdir -p /data/mysql3306/{data,logs,tmp}

chown -R mysql:mysql /data/mysql3306


Node A configuration file (these settings belong in the [mysqld] section):

vim /etc/my.cnf

#pxc

default_storage_engine=Innodb

#innodb_locks_unsafe_for_binlog=1

innodb_autoinc_lock_mode=2

wsrep_cluster_name=pxc_cluster      #cluster name

wsrep_cluster_address=gcomm://192.168.91.18,192.168.91.20,192.168.91.21

wsrep_node_address=192.168.91.18

wsrep_provider=/usr/local/mysql/lib/libgalera_smm.so

#wsrep_provider_options="gcache.size = 1G;debug = yes"

wsrep_provider_options="gcache.size = 1G;"

#wsrep_sst_method=rsync

wsrep_sst_method=xtrabackup-v2

wsrep_sst_auth=sst:147258


Node B configuration file:

#pxc

default_storage_engine=Innodb

#innodb_locks_unsafe_for_binlog=1

innodb_autoinc_lock_mode=2

wsrep_cluster_name=pxc_cluster      

wsrep_cluster_address=gcomm://192.168.91.18,192.168.91.20,192.168.91.21

wsrep_node_address=192.168.91.20

wsrep_provider=/usr/local/mysql/lib/libgalera_smm.so

#wsrep_provider_options="gcache.size = 1G;debug = yes"

wsrep_provider_options="gcache.size = 1G;"

#wsrep_sst_method=rsync

wsrep_sst_method=xtrabackup-v2

wsrep_sst_auth=sst:147258     


Node C configuration file:

#pxc

default_storage_engine=Innodb

#innodb_locks_unsafe_for_binlog=1

innodb_autoinc_lock_mode=2

wsrep_cluster_name=pxc_cluster      

wsrep_cluster_address=gcomm://192.168.91.18,192.168.91.20,192.168.91.21

wsrep_node_address=192.168.91.21

wsrep_provider=/usr/local/mysql/lib/libgalera_smm.so

#wsrep_provider_options="gcache.size = 1G;debug = yes"

wsrep_provider_options="gcache.size = 1G;"

#wsrep_sst_method=rsync

wsrep_sst_method=xtrabackup-v2

wsrep_sst_auth=sst:147258
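
The three configuration blocks above differ only in wsrep_node_address. As a minimal sketch, the block could be generated per node with a small shell function; render_pxc_conf is a hypothetical helper name, not part of PXC, and every value it prints is copied from the configurations above.

```shell
# Hypothetical helper: print the PXC block of my.cnf for one node.
# $1 = this node's own IP address; all other values are the ones used in this article.
render_pxc_conf() {
    node_ip=$1
    cat <<EOF
#pxc
default_storage_engine=Innodb
innodb_autoinc_lock_mode=2
wsrep_cluster_name=pxc_cluster
wsrep_cluster_address=gcomm://192.168.91.18,192.168.91.20,192.168.91.21
wsrep_node_address=$node_ip
wsrep_provider=/usr/local/mysql/lib/libgalera_smm.so
wsrep_provider_options="gcache.size = 1G;"
wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=sst:147258
EOF
}

# Example: print the block for node B.
render_pxc_conf 192.168.91.20
```

The output still has to be placed under the [mysqld] section of /etc/my.cnf by hand.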


On A, B, and C:

Initialize the data directory:

[root@Darren1 mysql]# ./scripts/mysql_install_db


On A:

Bootstrap the first node:

/etc/init.d/mysqld bootstrap-pxc

Bootstrapping PXC (Percona XtraDB Cluster)Starting MySQL (Percona XtraDB Cluster)......... SUCCESS!


Log in with the mysql client and run:

delete from mysql.user where user!='root' or host!='localhost';

truncate mysql.db;

drop database test;

grant all on *.* to sst@localhost identified by '147258';     #create the sst user for xtrabackup; the password must match the one in my.cnf

flush privileges;


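On B and C:

Start node 2 and node 3, after stopping the firewall and disabling SELinux:

/etc/init.d/iptables stop

sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/selinux/config

setenforce 0     #the config file change only takes effect after a reboot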


[root@Darren2 data]# /etc/init.d/mysqld start

Starting MySQL (Percona XtraDB Cluster).........State transfer in progress, setting sleep higher

... SUCCESS!

 

[root@Darren3 data]# /etc/init.d/mysqld start

 ERROR! MySQL (Percona XtraDB Cluster) is not running, but lock file (/var/lock/subsys/mysql) exists

Starting MySQL (Percona XtraDB Cluster)..................State transfer in progress, setting sleep higher

... SUCCESS!


Test:

A:

root@localhost [testdb]> create database testdb;

root@localhost [testdb]>create table t1(c1 int auto_increment not null,c2 timestamp,primary key(c1));

root@localhost [testdb]>insert into t1 select 1,now();

root@localhost [testdb]>select * from testdb.t1;

+----+---------------------+

| c1 | c2                  |

+----+---------------------+

|  1 | 2017-03-06 12:29:56 |

+----+---------------------+

B:

root@localhost [testdb]>select * from testdb.t1;

+----+---------------------+

| c1 | c2                  |

+----+---------------------+

|  1 | 2017-03-06 12:29:56 |

+----+---------------------+

C:

root@localhost [testdb]>select * from testdb.t1;

+----+---------------------+

| c1 | c2                  |

+----+---------------------+

|  1 | 2017-03-06 12:29:56 |

+----+---------------------+


Shutdown and restart:

Stop a node: /etc/init.d/mysqld stop

Restarting after all nodes have been shut down:

The first node to start: /etc/init.d/mysqld bootstrap-pxc

The other nodes: /etc/init.d/mysqld start


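SST and IST

State Snapshot Transfer (SST): full transfer

When it happens: a new node joins the cluster, or a node has been down (failed or shut down) for too long.

wsrep_sst_method = xtrabackup-v2

This parameter accepts three values:

(1) xtrabackup-v2

Transfers the data with xtrabackup. A backup user must be created beforehand, and its user name and password must be given in wsrep_sst_auth=sst:147258

(2) rsync: the fastest transfer method; wsrep_sst_auth is not needed. The donor is read-only while the data is copied (flush tables with read lock)

(3) mysqldump: not recommended because it does not cope with large data sets. The donor is read-only while the data is copied (flush tables with read lock)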


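Incremental State Transfer (IST): incremental transfer

When it happens: a node that already holds most of the data rejoins and only needs the changes it missed. Those changes are served from a cache on the donor, the gcache. IST is chosen only when the missing increment is less than or equal to what the gcache holds; if the increment is larger than the gcache, a full transfer (SST) is used instead.

wsrep_provider_options="gcache.size = 1G"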


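How do you stop one node of a PXC cluster?

Stop a node only when its wsrep_local_state_comment status is Synced, which means the three nodes are in sync. The nodes can then be restarted one at a time (rolling restart);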
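
The Synced check can be scripted before each step of a rolling restart. The following is a minimal sketch: node_is_synced is a hypothetical helper name, and it assumes the mysql client can log in without extra options (for example through ~/.my.cnf).

```shell
# Hypothetical helper: succeed only when the local node reports Synced.
# Assumes the mysql client can connect without extra options.
node_is_synced() {
    # -N drops the header row, -B prints tab-separated "name<TAB>value"
    state=$(mysql -N -B -e "SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment'" | awk '{print $2}')
    [ "$state" = "Synced" ]
}

# Usage: stop this node only if it is in sync with the cluster.
# node_is_synced && /etc/init.d/mysqld stop
```

Running the same check on the remaining nodes before moving on to the next one keeps the restart truly rolling.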


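How long can a node stay offline?

For example, to allow a node to be offline for 2 hours, estimate how much binlog is generated in 2 hours and set gcache.size to at least that amount.

For a busy order system that produces 200M of binlog every 5 minutes, that is 2.4G per hour and 4.8G in two hours, so set wsrep_provider_options="gcache.size = 6G". The gcache is really allocated (Galera memory-maps it as a ring-buffer file), so it must not be set too large either, otherwise the node may be hit by oom-kill;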
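
The sizing arithmetic above can be written down as a small shell function. This is only a sketch: gcache_mb is a hypothetical helper, and the optional headroom percentage is an assumption added here to show how 4.8G might be rounded up toward the 6G used in the text.

```shell
# gcache_mb RATE_MB WINDOW_MIN OFFLINE_MIN [HEADROOM_PCT]
# Prints an estimated gcache size in MB (integer arithmetic).
#   RATE_MB      binlog volume observed during the sample window, in MB
#   WINDOW_MIN   length of the sample window, in minutes
#   OFFLINE_MIN  how long a node should be able to stay offline, in minutes
#   HEADROOM_PCT optional safety margin in percent (assumption, default 0)
gcache_mb() {
    rate_mb=$1; window_min=$2; offline_min=$3; headroom_pct=${4:-0}
    base=$(( rate_mb * offline_min / window_min ))
    echo $(( base * (100 + headroom_pct) / 100 ))
}

gcache_mb 200 5 120       # prints 4800 (the 4.8G from the text)
gcache_mb 200 5 120 25    # prints 6000 (roughly the 6G setting)
```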


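Rejoining the cluster after a failure:

(1) If the data set is not very large, re-initialize the node and let it do a full SST;

(2) If the data set is very large, transfer it with rsync;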


PXC features and caveats:

(1) PXC automatically configures the auto-increment offset and step on every node, as in a dual-master setup, to prevent primary key conflicts;

node1:

auto_increment_offset=1

auto_increment_increment=3

node2:

auto_increment_offset=2

auto_increment_increment=3

node3:

auto_increment_offset=3

auto_increment_increment=3
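
To see why these settings avoid primary key conflicts, the sketch below lists the first few values each node would hand out, using the rule offset + N * increment. node_ids is a hypothetical helper written for illustration only.

```shell
# node_ids OFFSET INCREMENT COUNT
# Prints the first COUNT auto-increment values a node would generate.
node_ids() {
    offset=$1; increment=$2; count=$3
    i=0; ids=""
    while [ "$i" -lt "$count" ]; do
        ids="$ids $(( offset + i * increment ))"
        i=$(( i + 1 ))
    done
    # Unquoted on purpose so the leading space is dropped.
    echo $ids
}

node_ids 1 3 4    # node1: 1 4 7 10
node_ids 2 3 4    # node2: 2 5 8 11
node_ids 3 3 4    # node3: 3 6 9 12
```

The three sequences never overlap, so inserts issued on different nodes at the same time cannot pick the same key.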

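(2) PXC uses optimistic concurrency control, so transaction conflicts may surface at the commit stage. When several nodes modify the same row, only one of them succeeds; the transaction on the losing node is aborted and a deadlock error code is returned:

For example: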

A:

root@localhost [testdb]>begin;

root@localhost [testdb]>update t1 set c2=now() where c1=3;

B:

root@localhost [testdb]>begin;

root@localhost [testdb]>update t1 set c2=now() where c1=3;

root@localhost [testdb]>commit;

A:

The commit fails with a deadlock error:

root@localhost [testdb]>commit;

ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction

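(3) PXC supports only the InnoDB engine. Most tables in the mysql schema are MyISAM, so how are they transferred? PXC does not replicate MyISAM tables, but it does replicate DCL statements such as create user, drop user, grant, and revoke. MyISAM replication can be enabled with the wsrep_replicate_myisam parameter. Therefore, when data in a PXC cluster is inconsistent, first check whether a MyISAM table is involved;

For example: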

node1:

root@localhost [testdb]>show create table t2\G

*************************** 1. row ***************************

       Table: t2

Create Table: CREATE TABLE `t2` (

  `c1` int(11) NOT NULL AUTO_INCREMENT,

  `c2` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,

  PRIMARY KEY (`c1`)

) ENGINE=MyISAM AUTO_INCREMENT=3 DEFAULT CHARSET=utf8

root@localhost [testdb]>select * from t2;

+----+---------------------+

| c1 | c2                  |

+----+---------------------+

|  2 | 2017-03-08 11:41:31 |

+----+---------------------+

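The row is not visible on node2 or node3, because it was never replicated to them.

(4) Every table in PXC must have a primary key. Without one, the data in the data pages may differ between nodes, and a select with limit may return different result sets on different nodes;

(5) Table-level locks (lock table) are not supported. Every DDL operation takes an instance-level lock, so schema changes should be done with the pt-osc tool (pt-online-schema-change)

For example:

Example 1: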

node1:

root@localhost [testdb]>lock table t1 read;

root@localhost [testdb]>insert into t1 select 69,now();

ERROR 1099 (HY000): Table 't1' was locked with a READ lock and can't be updated

node2: node 2 can still insert, which shows that the read lock did not take effect there

root@localhost [testdb]>insert into t1 select 69,now(); 

Query OK, 1 row affected (0.01 sec)

Records: 1  Duplicates: 0  Warnings: 0

Example 2:

node1:

root@localhost [testdb]>lock table t1 write;

root@localhost [testdb]>insert into t1 select 1,now();

Query OK, 1 row affected (0.03 sec)

Records: 1  Duplicates: 0  Warnings: 0

root@localhost [testdb]>select * from t1;

+----+---------------------+

| c1 | c2                  |

+----+---------------------+

|  1 | 2017-03-08 14:59:46 |

+----+---------------------+

node2: node 2 is not affected by the write lock and can still read and write:

root@localhost [testdb]>insert into t1 select 2,now();

Query OK, 1 row affected (0.05 sec)

Records: 1  Duplicates: 0  Warnings: 0

root@localhost [testdb]>select * from t1;

+----+---------------------+

| c1 | c2                  |

+----+---------------------+

|  1 | 2017-03-08 14:59:46 |

|  2 | 2017-03-08 14:59:57 |

+----+---------------------+

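(6) XA transactions are not supported

(7) The query log must be written to a file, not to a table, i.e. set log_output=file;

(8) The performance/throughput of the whole cluster is determined by its slowest node (the bucket effect);

Master-slave replication, ignoring replication lag: 60,000 inserts per second,

Master-slave replication, taking lag into account: 30,000 inserts per second,

pxc: 10,000 inserts per second


(9) The number of nodes should satisfy 3<=num<=8

(10) Split brain: at least three nodes are needed so that one of them can act as arbiter and prevent split brain;

Demonstrating split brain:

Force-kill the mysql processes: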

node2:

[root@Darren1 mysql3306]# kill -9 10014   

node3:

[root@Darren3 ~]# kill -9 10115

node1:

root@localhost [(none)]>use testdb;

ERROR 1047 (08S01): Unknown command

Values before the split brain:

show global status like '%wsrep%';

wsrep_local_state_comment    | Synced

wsrep_cluster_status         | Primary

wsrep_ready                  | ON

Values after the split brain:

wsrep_local_state_comment    | Initialized

wsrep_cluster_status         | non-Primary

wsrep_ready                  | OFF   

Restarting node2 or node3 reports an error:

[root@Darren1 data]# /etc/init.d/mysqld start

 ERROR! MySQL (Percona XtraDB Cluster) is not running, but PID file exists

Solution: restart node1, then restart node2 and node3
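
The demonstration above leaves node1 alone after two of three nodes were killed, and it turns non-Primary because it no longer holds a majority. The sketch below shows that majority rule in its simplest form. It is a simplification under stated assumptions: has_quorum is a hypothetical helper, all nodes are assumed to carry equal weight, and the lost nodes are assumed to have vanished without leaving the cluster gracefully.

```shell
# has_quorum REACHABLE TOTAL
# Succeeds when the reachable nodes form a strict majority of the cluster.
# Simplified model: equal node weights, ungraceful departures only.
has_quorum() {
    [ $(( $1 * 2 )) -gt "$2" ]
}

for reachable in 3 2 1; do
    if has_quorum "$reachable" 3; then
        echo "$reachable of 3 reachable: Primary"
    else
        echo "$reachable of 3 reachable: non-Primary"
    fi
done
```

With only two nodes, losing either one leaves 1 of 2, which is not a strict majority; this is why the text asks for at least three nodes.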