1. MyCat is a database middleware; for details, see the official site http://www.mycat.org.cn/
2. Architecture diagram
As the diagram above shows, the original single-server / single-database / single-table layout is converted into multiple servers, multiple databases, and multiple tables.
The benefits of this change are:
- It removes the file-size ceiling of a single server. Whether the tablespace is shared or per-table, MySQL data files are capped by operating-system limits and cannot grow without bound.
- It removes the hardware ceiling of a single server. One machine's resources cannot be expanded indefinitely.
- It lowers cost: one expensive minicomputer costs more than 20 commodity servers.
- It relieves performance bottlenecks.
- It makes online DDL on large tables far less painful.
Implementation approach: use MyCat's sharding capability. Configure 3 data nodes for the table, each sized for 5,000,000 rows, with one of them acting as the expansion node. As soon as rows start landing on the expansion node, add another data node to the MyCat configuration, so that at least one empty data node is always held in reserve (a monitoring sketch follows below). Note: a single server can host multiple data nodes.
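To keep the "at least one empty node" rule honest, the expansion node can be polled directly. Below is a minimal monitoring sketch; the host, credentials, and database name match the schema.xml later in this article, and the alerting action is left as a simple print:

#!/usr/bin/env python
# Minimal monitor sketch: if the expansion node has started receiving
# rows, it is time to add a new empty data node to MyCat.
# Host/credentials/database follow the schema.xml below.
import MySQLdb

def rows_on_node(host, db):
    conn = MySQLdb.connect(host=host, user='root', passwd='123456', db=db)
    curs = conn.cursor()
    curs.execute("select count(1) from orders")
    (cnt,) = curs.fetchone()
    curs.close()
    conn.close()
    return cnt

if __name__ == '__main__':
    if rows_on_node('db3', 'transdb2') > 0:
        print "expansion node dn3 is in use: add dn4 and extend the range map"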
3. MyCat configuration
schema.xml
<?xml version="1.0"?>
<!DOCTYPE mycat:schema SYSTEM "schema.dtd">
<mycat:schema xmlns:mycat="http://org.opencloudb/">
<schema name="transdb" checkSQLschema="false">
<table name="orders" primaryKey="id" autoIncrement="true" rule="auto-sharding-long-orders" dataNode="dn1,dn2,dn3" />
</schema>
<dataNode name="dn1" dataHost="localhost1" database="transdb0" /> <dataNode name="dn2" dataHost="localhost2" database="transdb1" /> <dataNode name="dn3" dataHost="localhost3" database="transdb2" />
<dataHost name="localhost1" maxCon="100000" minCon="10" balance="0"
writeType="0" dbType="mysql" dbDriver="native">
<heartbeat>select user()</heartbeat>
<writeHost host="hostM1" url="db1:3306" user="root" password="123456"></writeHost>
</dataHost>
<dataHost name="localhost2" maxCon="100000" minCon="10" balance="0"
writeType="0" dbType="mysql" dbDriver="native">
<heartbeat>select user()</heartbeat>
<writeHost host="hostS1" url="db2:3306" user="root" password="123456"></writeHost>
</dataHost>
<dataHost name="localhost3" maxCon="100000" minCon="10" balance="0"
writeType="0" dbType="mysql" dbDriver="native">
<heartbeat>select user()</heartbeat>
<writeHost host="hostS2" url="db3:3306" user="root" password="123456"></writeHost>
</dataHost>
</mycat:schema>
rule.xml
Add the following to rule.xml (the sharding rule and its algorithm):
<tableRule name="auto-sharding-long-orders">
<rule>
<columns>id</columns>
<algorithm>rang-long-orders</algorithm>
</rule>
</tableRule>
<function name="rang-long-orders"
class="org.opencloudb.route.function.AutoPartitionByLong">
<property name="mapFile"> autopartition-long-orders.txt</property>
</function>
autopartition-long-orders.txt
Create a new file autopartition-long-orders.txt under the conf directory; the rang-long-orders algorithm uses it to choose the data node. Its contents are below (indexes 0, 1, 2 correspond to dn1, dn2, dn3 in schema.xml):
# range start-end ,data node index
# K=1000,M=10000.
0-500M=0
5000001-1000M=1
10000001-1500M=2
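For clarity, here is a simplified re-implementation of how AutoPartitionByLong resolves a shard key to a node index from this map file (illustration only, not MyCat's actual code):

# Illustration of the range lookup behind rang-long-orders
# (K=1000, M=10000, so 500M = 5,000,000).
RANGES = [
    (0,        5000000,  0),   # 0-500M=0         -> dn1
    (5000001,  10000000, 1),   # 5000001-1000M=1  -> dn2
    (10000001, 15000000, 2),   # 10000001-1500M=2 -> dn3
]

def node_index(key):
    for start, end, idx in RANGES:
        if start <= key <= end:
            return idx
    raise ValueError("can't find any valid datanode for id %d" % key)

print node_index(1)         # 0 (dn1)
print node_index(5000001)   # 1 (dn2)
print node_index(15000000)  # 2 (dn3)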
sequence_conf.properties
Add a global auto-increment sequence to sequence_conf.properties, used for the orders table's auto-increment id (invoked as: next value for MYCATSEQ_ORDER). Append the following:
ORDER.CURID=0
ORDER.HISIDS=
ORDER.MINID=0
ORDER.MAXID=1000000000
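As a quick sanity check (assuming the file-based sequence handler, sequnceHandlerType=0 in the server.xml below, is in effect), the sequence can be queried from any MyCat session:
mysql> select next value for MYCATSEQ_ORDER;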
server.xml is configured as follows:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE mycat:server SYSTEM "server.dtd">
<mycat:server xmlns:mycat="http://org.opencloudb/">
<system>
<property name="defaultSqlParser">druidparser</property>
<property name="sequnceHandlerType">0</property>
<property name="serverPort">3307</property> <property name="managerPort">9067</property>
</system>
<user name="mycat">
<property name="password">123456</property>
<property name="schemas">transdb</property>
</user>
<quarantine>
<whitehost>
<host host="127.0.0.1" user="mycat"/>
<host host="192.168.2.27" user="mycat"/>
<host host="192.168.2.45" user="mycat"/>
</whitehost>
<blacklist check="false"></blacklist>
</quarantine>
</mycat:server>
4. Create the database and table on every node (database naming: dn1 => transdb0, dn2 => transdb1, ...)
create database transdb0;
use transdb0;
create table orders (id int(11) NOT NULL AUTO_INCREMENT primary key, user_id int, pay decimal(22,2), goods_name char(50));
5. Start MyCat
bin/mycat start
6. Testing the sharding
Python script that inserts the test data:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import MySQLdb
import random

def mysql_exec():
    # Connection settings: the MyCat server and the logical schema from schema.xml
    host = '192.168.2.155'
    port = 3307
    user = 'mycat'
    passwd = '123456'
    dbname = 'transdb'
    # Build one random order row
    goods_name = random.choice(["IPONE6S", "APPLE", "IPONE7S", "IPONE8S", "IPONE9S",
                                "IPONE6P", "APPLE1", "IPONE7P", "IPONE8P", "IPONE9P"])
    pay = round(random.uniform(100, 10000), 2)
    user_id = random.randint(1, 20000)
    # id is supplied by MyCat's global sequence, not by MySQL AUTO_INCREMENT
    sql = "insert into orders(id,user_id,pay,goods_name) values(next value for MYCATSEQ_ORDER, %s, %s, %s)"
    param = (user_id, pay, goods_name)
    try:
        conn = MySQLdb.connect(host=host, user=user, passwd=passwd,
                               port=int(port), connect_timeout=1, charset='utf8')
        conn.select_db(dbname)
        curs = conn.cursor()
        curs.execute(sql, param)
        conn.commit()
        curs.close()
        conn.close()
    except Exception, e:
        print "mysql execute: " + str(e)

def main():
    # Insert up to 10,000,000 rows, one connection per insert
    i = 0
    while i < 10000000:
        mysql_exec()
        i = i + 1

if __name__ == '__main__':
    main()
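Opening one connection per insert is slow; the load can also be fanned out across worker processes. A hypothetical parallel driver for the mysql_exec() above (the worker count and per-worker row count are arbitrary choices):

# Hypothetical parallel driver: each worker runs its own insert loop.
from multiprocessing import Process

def worker(n):
    for _ in xrange(n):
        mysql_exec()

if __name__ == '__main__':
    procs = [Process(target=worker, args=(2500000,)) for _ in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()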
Log in to MyCat and count the rows across all nodes:
mysql -umycat -p123456 -h192.168.2.155 transdb -P3307
mysql> select count(1) from orders;
+---------+
| COUNT0  |
+---------+
| 7127006 |
+---------+
Query the data on each individual node.
Node 1:
mysql -uroot -p -hdb1 transdb0
mysql> select count(1) from orders;
+----------+
| count(1) |
+----------+
|  5000000 |
+----------+
1 row in set (1.09 sec)
mysql> select min(id),max(id) from orders;
+---------+---------+
| min(id) | max(id) |
+---------+---------+
|       1 | 5000000 |
+---------+---------+
1 row in set (0.00 sec)
Node 2:
mysql -uroot -p -hdb2 transdb1
mysql> select min(id),max(id) from orders;
+---------+---------+
| min(id) | max(id) |
+---------+---------+
| 5000001 | 7137006 |
+---------+---------+
7. Automatic expansion
Updated schema.xml; the additions are the dn4 entries:
<?xml version="1.0"?>
<!DOCTYPE mycat:schema SYSTEM "schema.dtd">
<mycat:schema xmlns:mycat="http://org.opencloudb/">
<schema name="transdb" checkSQLschema="false">
<table name="orders" primaryKey="id" autoIncrement="true" rule="auto-sharding-long-orders" dataNode="dn1,dn2,dn3 ,dn4" />
</schema>
<dataNode name="dn1" dataHost="localhost1" database="transdb0" />
<dataNode name="dn2" dataHost="localhost2" database="transdb1" />
<dataNode name="dn3" dataHost="localhost3" database="transdb2" />
<dataNode name="dn1" dataHost="localhost1" database="transdb3" />
Updated autopartition-long-orders.txt; the addition is the range mapped to index 3:
# range start-end ,data node index
# K=1000,M=10000.
0-500M=0
5000001-1000M=1
10000001-1500M=2
15000001-2000M=3
Create the database and table on the expansion node:
create database transdb3;
use transdb3;
create table orders (id int(11) NOT NULL AUTO_INCREMENT primary key, user_id int, pay decimal(22,2), goods_name char(50));
Restart MyCat. (In principle you can connect to the management port 9067 configured in server.xml and reload the configuration, but that feature seems unreliable; a restart is safer, though it drops existing client connections.)
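For reference, the reload path looks like this; reload @@config_all is the management command that re-reads schema.xml, though its behavior varies across MyCat versions, hence the restart recommendation above:
mysql -umycat -p123456 -h192.168.2.155 -P9067
mysql> reload @@config_all;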
8. Caveats for expansion nodes
- Always keep a spare empty node; once the existing nodes fill up, inserts fail with: (1064, "can't find any valid datanode :ORDERS -> ID -> 5000001")
- If the expansion node shares a server with existing nodes, make sure the buffer pool can hold the data pages of all nodes on that server.
- It is hard for any sharding algorithm to provide both easy node expansion and even distribution of hot spots. This case uses range sharding, which suffers from hot spots: perhaps 80% of queries end up concentrated on the last data node.
- The row count per node does not strictly match the ranges in autopartition-long-orders.txt; the ranges partition the sharding key's values (here, the orders table's id), not the row counts.