MySQL集群搭建(centos6.5)
注:系统(centos6.5)、集群版本(mysql-cluster-gpl-7.3.6-linux-glibc2.5-x86_64.tar.gz)
参考网站:
https://zhuanlan.zhihu.com/p/28572091
1、安装集群版本
a、准备工作
不管是Management Server,还是Data node、SQL node,都需要先安装MySQL集群版本,然后根据不用的配置来决定当前服务器有哪几个角色。
安装之前准备好mysql用户和mysql用户组,相关命令:
groupadd mysql
useradd mysql -g mysql
为了方便测试,确定相关机器的防火墙已关闭(或者设置防火墙这几台机器之间的网络连接是畅通无阻的),相关命令:
chkconfig iptables off
service iptables stop
b、安装集群版本
上传安装包至/usr/local目录下,并解压
tar -zxvf mysql-cluster-gpl-7.3.6-linux-glibc2.5-x86_64.tar.gz
重命名文件夹
mv mysql-cluster-gpl-7.3.6-linux-glibc2.5-x86_64 mysql
授权
chown -R mysql:mysql mysql
切换mysql用户
安装MySQL
cd /usr/local/mysql
scripts/mysql_install_db --user=mysql --datadir=/usr/local/mysql/data
注:所有服务器上都需要执行上述操作来安装MySQL集群版本。
2、集群配置
a、管理节点
root用户下,创建目录和配置文件
mkdir /var/lib/mysql-cluster
cd /var/lib/mysql-cluster
vim config.ini
config.ini配置信息如下:
[NDBD DEFAULT]
NoOfReplicas=2
[NDB_MGMD]
#设置管理节点服务器
nodeid=1
HostName=10.16.8.193
DataDir=/var/lib/mysql-cluster
[NDBD]
id=2
HostName=10.16.8.193
DataDir=/usr/local/mysql/data
[NDBD]
id=3
HostName=10.16.8.194
DataDir=/usr/local/mysql/data
[MYSQLD]
id=4
HostName=10.16.8.193
[MYSQLD]
id=5
HostName=10.16.8.194
#必须有空的mysqld节点,不然数据节点断开后启动有报错
[MYSQLD]
id=6
[mysqld]
id=7
授权
chown -R mysql:mysql /var/lib/mysql-cluster
切换用户
启动管理服务
/usr/local/mysql/bin/ndb_mgmd -f /var/lib/mysql-cluster/config.ini --initial
注:命令行中的ndb_mgmd是mysql cluster的管理服务器,后面的-f表示后面的参数是启动的参数配置文件。
如果在启动后过了几天又添加了一个数据节点,这时修改了配置文件启动时就必须加上–initial参数,不然添加的节点不会作用在mysql cluster中。
b、数据节点
编辑/etc/my.cnf文件
# vim /etc/my.cnf
[mysqld]
sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES
datadir=/usr/local/mysql/data
socket=/var/lib/mysql/mysql.sock
user=mysql
default-storage-engine=ndbcluster
ndbcluster
ndb-connectstring=10.16.8.193
character_set_server=utf8
lower_case_table_names=1
[mysql_cluster]
ndb-connectstring=10.16.8.19
切换用户
启动数据节点服务
/usr/local/mysql/bin/ndbd --initial
注:第一次启动需要加参数:–initial,以后就不用加了,直接运行: /usr/local/mysql/bin/ndbd
这个参数尤其重要,因为加了–initial后会把数据库清空的
c、SQL节点
编辑/etc/my.cnf文件(数据节点和SQL节点在统一服务器时可省略)
# vim /etc/my.cnf
[mysqld]
sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES
datadir=/usr/local/mysql/data
socket=/var/lib/mysql/mysql.sock
user=mysql
ndbcluster
ndb-connectstring=192.168.3.115
[mysql_cluster]
ndb-connectstring=192.168.3.115
复制mysqld到系统服务里面去
cp /usr/local/mysql/support-files/mysql.server /etc/init.d/mysqld
切换用户
启动数据节点服务
service mysqld start
/usr/local/mysql/bin/mysqladmin -u root password 'password'
d、完成效果
ndb_mgm> show
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)] 2 node(s)
id=2 @192.168.3.116 (mysql-5.6.19 ndb-7.3.6, Nodegroup: 0, *)
id=3 @192.168.3.117 (mysql-5.6.19 ndb-7.3.6, Nodegroup: 0)
[ndb_mgmd(MGM)] 1 node(s)
id=1 @192.168.3.115 (mysql-5.6.19 ndb-7.3.6)
[mysqld(API)] 4 node(s)
id=4 @192.168.3.116 (mysql-5.6.19 ndb-7.3.6)
id=5 @192.168.3.117 (mysql-5.6.19 ndb-7.3.6)
id=6 (not connected, accepting connect from any host)
id=7 (not connected, accepting connect from any host)
3、nginx方向代理
因为需要使用nginx的tcp连接,所以需要使用到nginx的stream模块
a、安装nginx
因为是使用rpm方式进行安装,下面给出官方的依赖包下载地址:
http://vault.centos.org/6.5/os/x86_64/Packages/
在上面网站中安装必要依赖,比如gcc gcc-c++ automake autoconf libtool make glibc glibc-devel lsof
,具体需要的依赖到时安装时会提示错误,再安装即可,安装命令:
官网下载nginx,这里我使用的是如下的版本:
解压:
tar zxvf nginx-1.12.2.tar.gz
进入目录:
源码编译:
注:这里如果报错,一般是缺少某个依赖,去centos依赖库下载回来执行rpm命令安装即可
安装:
执行命令查看到的版本和依赖模块如下:
[root@kfjqrapp1 sbin]# ./nginx -V
nginx version: nginx/1.12.2
built by gcc 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC)
configure arguments: --prefix=/home/xxx/nginx --with-stream
b、nginx.conf配置文件如下:
#user nobody;
user root;
worker_processes auto;
#error_log /var/log/nginx/error.log;
#pid /var/run/nginx.pid;
error_log logs/error.log;
error_log logs/error.log notice;
error_log logs/error.log info;
pid logs/nginx.pid;
events {
worker_connections 1024;
}
stream {
upstream mysqld {
server 10.16.8.193:3306 weight=4 max_fails=3 fail_timeout=30s;
server 10.16.8.194:3306 weight=5 max_fails=3 fail_timeout=30s;
}
server {
listen 3306;
proxy_connect_timeout 30s;
proxy_timeout 43200s;
proxy_pass mysqld;
}
}
http {
include mime.types;
default_type application/octet-stream;
#log_format main '$remote_addr - $remote_user [$time_local] "$request" '
# '$status $body_bytes_sent "$http_referer" '
# '"$http_user_agent" "$http_x_forwarded_for"';
#access_log logs/access.log main;
sendfile on;
#tcp_nopush on;
#keepalive_timeout 0;
keepalive_timeout 65;
max_ranges 1;
upstream myproject {
server 10.16.8.191:8082;
server 10.16.8.192:8082;
}
server {
listen 9300;
server_name localhost;
location / {
proxy_pass http://myproject;
}
}
#gzip on;
server {
listen 80;
server_name localhost;
#charset koi8-r;
#access_log logs/host.access.log main;
max_ranges 1;
location / {
root html;
index index.html index.htm;
max_ranges 1;
}
#error_page 404 /404.html;
# redirect server error pages to the static page /50x.html
#
error_page 500 502 503 504 /50x.html;
location = /50x.html {
root html;
max_ranges 1;
}
# proxy the PHP scripts to Apache listening on 127.0.0.1:80
#
#location ~ \.php$ {
# proxy_pass http://127.0.0.1;
#}
# pass the PHP scripts to FastCGI server listening on 127.0.0.1:9000
#
#location ~ \.php$ {
# root html;
# fastcgi_pass 127.0.0.1:9000;
# fastcgi_index index.php;
# fastcgi_param SCRIPT_FILENAME /scripts$fastcgi_script_name;
# include fastcgi_params;
#}
# deny access to .htaccess files, if Apache's document root
# concurs with nginx's one
#
#location ~ /\.ht {
# deny all;
#}
}
# another virtual host using mix of IP-, name-, and port-based configuration
#
#server {
# listen 8000;
# listen somename:8080;
# server_name somename alias another.alias;
# location / {
# root html;
# index index.html index.htm;
# }
#}
# HTTPS server
#
#server {
# listen 443 ssl;
# server_name localhost;
# ssl_certificate cert.pem;
# ssl_certificate_key cert.key;
# ssl_session_cache shared:SSL:1m;
# ssl_session_timeout 5m;
# ssl_ciphers HIGH:!aNULL:!MD5;
# ssl_prefer_server_ciphers on;
# location / {
# root html;
# index index.html index.htm;
# }
#}
}
注:这里要注意的是proxy_timeout 43200s;这个参数,代表tcp连接之后再这个秒数之内,如果不进行通讯,那么会自动断开连接,这个就看看实际项目的需求了,一般看一下定时任务的时间来定。
4、常用命令以及问题整理
1) ndb_mgmd
管理命令:/usr/local/mysql/bin/ndb_mgm
执行之后就是管理控制台了,里面可以继续输入命令。(具体命令可以使用help查看)
2) 停止集群服务器的命令:/usr/local/mysql/bin/ndb_mgm -e shutdown
如果集群配置有更新了:rm /usr/local/mysql/mysql-cluster/ndb_1_config.bin.1
3) 停止SQL节点的命令:/usr/local/mysql/bin/mysqladmin -uroot shutdown
4)使用需要注意如下几点:
a、表必须用ENGINE=NDB或ENGINE=NDBCLUSTER选项创建,或用ALTER TABLE选项更改,以使用NDB Cluster存储引擎在 Cluster内复制它们。如果使用mysqldump的输出从已有数据库导入表,可在文本编辑器中打开SQL脚本,并将该选项添加到任何表创建语句,或 用这类选项之一替换任何已有的ENGINE(或TYPE)选项。
b、另外还请记住,每个NDB表必须有一个主键。如果在创建表时用户未定义主键,NDB Cluster存储引擎将自动生成隐含的主键。(注释:该隐含 键也将占用空间,就像任何其他的表索引一样。由于没有足够的内存来容纳这些自动创建的键,出现问题并不罕见)。
c、再重启mysql-cluster的时候,在重启数据节点的时候注意看一下是不是已经启动了,才开始启动最后的sql节点,有时数据量比较大的时候,启动数据节点需要一段时间的,使用show
命令查看如果是数据节点有starting
标识的话,那就是还没启动成功:
id=2 @192.168.0.15 (mysql-5.7.18 ndb-7.5.6, starting, Nodegroup: 0)
问题一:
ERROR 1297 (HY000): Got temporary error 233 ‘Out of operation records in transaction coordinator (increase MaxNoOfConcurrentOperations)’ from NDBCLUSTER
在[NDBD DEFAULT]中增加参数
MaxNoOfConcurrentOperations=300000
MaxNoOfLocalOperations=330000
问题二:
java.net.UnknownHostException: XXXX Name or service not known
修改/etc/hosts文件
127.0.0.1 主机名 localhost.localdomain localhost
或是再添加一条
问题三:
对于连表查询,mysql-cluster性能比较差,这里还没想到怎么优化,但是如果是带有IN的sql语句,可以通过如下方式进行优化:
下面语句执行可能要很长时间:
select * from userinfo where id in(select author_id from artilce where type=1);
可以进行如下的优化(使用临时表):
select id,username from userinfo where id in (select author_id from article where type = 1);
问题四:
Node 1: Forced node shutdown completed. Caused by error 2305: ‘Node lost connection to other nodes and can not form a unpartitioned cluster, please investigate if there are error(s) on other node(s)(Arbitration error). Temporary error, restart node’.
config.ini,在[ndbd]中改一个参数为StopOnError=0即可.
问题五:导入数据的时候遇到如下报错
ERROR 1005 (HY000) at line 25: Can’t create table ‘tuge.pangolin_fnc_accountverification’ (errno: 140)
Error | 1296 | Got error 738 ‘Record too big’ from NDB
分析原因:表行数据太大
解决办法:需要更改表结构,使最大单行数据的大小小于8KB!
问题六:导入大量数据的时候遇到如下报错
Error 1297: Got temporary error ‘REDO’ log overloaded.
ERROR : Got temporary error 1204 ‘Temporary failure, distribution changed’ from NDBCLUSTER
ERROR : Got temporary error 1234 ‘REDO log files overloaded (increase disk hardware)’ from NDBCLUSTER
分析原因:遇到这个错误,是表示redo log用完了,需要增加
解决办法:修改config.ini文件,更改或添加如下参数:
FragmentLogFileSize=256M
NoOfFragmentLogFiles=16
NoOfFragmentLogFiles这个参数可以更改到更大,但是初始化的时候会慢一点
问题七:在导入大量数据的时候,出现如下报错:
ERROR 1114 (HY000) at line 54: The table ‘gps_led_sendadverthistory’ is full
分析原因:你分配的内存或者硬盘空间已经用完(如果你采用磁盘表的话),需要通过ndb管理节点客户端和登录mysql查看具体的原因,通过ndb_mgm登录,执行命令:
All report memory 看下分配的内存是否使用完,如果采用磁盘表的,还需要登录mysqld节点,然后执行如下查询,来确定是否是磁盘不足:
SELECT TABLESPACE_NAME, FILE_NAME, EXTENT_SIZE*TOTAL_EXTENTS/1024/1024 AS TOTAL_MB, EXTENT_SIZE*FREE_EXTENTS/1024/1024 AS FREE_MB, EXTRA FROM information_schema.FILES WHERE FILE_TYPE="DATAFILE";
解决办法:如果是数据内存不足的情况,直接更改文件config.ini中[NDBD DEFAULT]下的
DataMemory=11480M
IndexMemory=1024M
5、在网上查询到的优化配置
config.ini
[ndb_mgmd default]
DataDir = /usr/local/mysql/data
[ndbd default]
NoOfReplicas = 2
DataMemory = 2500M
IndexMemory = 800M
DataDir = /usr/local/mysql/data
StringMemory = 50
MaxNoOfTables = 4096
MaxNoOfOrderedIndexes = 2048
MaxNoOfUniqueHashIndexes = 1024
MaxNoOfAttributes = 24576
MaxNoOfTriggers = 10240
MaxNoOfConcurrentTransactions = 409600
MaxNoOfConcurrentOperations = 3276800
###MaxNoOfLocalOperations = 55000
TimeBetweenGlobalCheckpoints = 1000
TimeBetweenEpochs = 100
TimeBetweenWatchdogCheckInitial = 60000
TransactionBufferMemory = 20M
DiskCheckpointSpeed = 20M
DiskCheckpointSpeedInRestart = 100M
TimeBetweenLocalCheckpoints = 20
SchedulerSpinTimer = 400
SchedulerExecutionTimer = 100
RealTimeScheduler = 1
BackupMaxWriteSize = 2M
BackupDataBufferSize = 32M
BackupLogBufferSize = 8M
BackupMemory = 40M
MaxNoOfExecutionThreads = 4
TransactionDeadLockDetectionTimeOut = 15000
BatchSizePerLocalScan = 512
###Increasing the LongMessageBufferb/c of a bug (20090903)
LongMessageBuffer = 16M
###Heartbeating
HeartbeatIntervalDbDb = 15000
HeartbeatIntervalDbApi = 15000
FragmentLogFileSize = 256M
NoOfFragmentLogFiles = 16
[mysqld default]
[ndb_mgmd]
NodeId = 1
HostName = 172.16.10.160
[ndbd]
NodeId = 11
HostName = 172.16.10.170
[ndbd]
NodeId = 12
HostName = 172.16.10.171
[ndbd]
NodeId = 13
HostName = 172.16.10.172
[ndbd]
NodeId = 14
HostName = 172.16.10.173
[mysqld]
NodeId = 81
HostName = 172.16.10.150
[mysqld]
NodeId = 82
HostName = 172.16.10.151
my.cnf
[mysqld]
basedir = /usr/local/mysql
datadir = /usr/local/mysql/data
default-storage-engine = ndbcluster
slow-query-log = on
slow_query_log_file = /usr/local/mysql/data/slow-query.log
long_query_time = 5
skip-external-locking
key_buffer_size = 600M
max_allowed_packet = 100M
table_open_cache = 2048
sort_buffer_size = 1024M
net_buffer_length = 8K
read_buffer_size = 400M
read_rnd_buffer_size = 200M
lower_case_table_names =1
back_log = 384
thread_stack = 256K
join_buffer_size = 500M
thread_cache_size = 200
query_cache_size = 640M
tmp_table_size = 256M
max_connections = 5000
max_connect_errors = 10000000
wait_timeout = 2880000
interactive_timeout = 2880000
thread_concurrency = 8
ndbcluster
ndb-connectstring = 172.16.10.160
[mysql_cluster]
ndb-connectstring = 172.16.10.160