先通过show plugins来查看服务器是否支持分区表
源码编译的要想支持分区表,需要设置 -DWITH_PARTITION_STORAGE_ENGINE 选项
分区表有下面的限制,分区表不支持外键,如果表上存在主键,则分区列要在主键中
分区的类型
mysql的分区类型要比oracle10g的分区类型多几种,下面是mysql的分区类型

  • 范围分区
  • 列表分区
  • 列分区
  • hash分区
  • 键分区
  • 子分区
    范围分区,这个跟oracle一样,就是按列值得范围划分分区,在建表的时候,范围要从小到大
    CREATE TABLE employees (
    id INT NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    hired DATE NOT NULL DEFAULT ‘1970-01-01’,
    separated DATE NOT NULL DEFAULT ‘9999-12-31’,
    job_code INT NOT NULL,
    store_id INT NOT NULL
    )
    PARTITION BY RANGE (store_id) (
    PARTITION p0 VALUES LESS THAN (6),
    PARTITION p1 VALUES LESS THAN (11),
    PARTITION p2 VALUES LESS THAN (16),
    PARTITION p3 VALUES LESS THAN MAXVALUE
    );
    CREATE TABLE employees (
    id INT NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    hired DATE NOT NULL DEFAULT ‘1970-01-01’,
    separated DATE NOT NULL DEFAULT ‘9999-12-31’,
    job_code INT,
    store_id INT
    )
    PARTITION BY RANGE ( YEAR(separated) ) (
    PARTITION p0 VALUES LESS THAN (1991),
    PARTITION p1 VALUES LESS THAN (1996),
    PARTITION p2 VALUES LESS THAN (2001),
    PARTITION p3 VALUES LESS THAN MAXVALUE
    );
    CREATE TABLE quarterly_report_status (
    report_id INT NOT NULL,
    report_status VARCHAR(20) NOT NULL,
    report_updated TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
    )
    PARTITION BY RANGE ( UNIX_TIMESTAMP(report_updated) ) (
    PARTITION p0 VALUES LESS THAN ( UNIX_TIMESTAMP(‘2008-01-01 00:00:00’) ),
    PARTITION p1 VALUES LESS THAN ( UNIX_TIMESTAMP(‘2008-04-01 00:00:00’) ),
    PARTITION p2 VALUES LESS THAN ( UNIX_TIMESTAMP(‘2008-07-01 00:00:00’) ),
    PARTITION p3 VALUES LESS THAN ( UNIX_TIMESTAMP(‘2008-10-01 00:00:00’) ),
    PARTITION p4 VALUES LESS THAN ( UNIX_TIMESTAMP(‘2009-01-01 00:00:00’) ),
    PARTITION p5 VALUES LESS THAN ( UNIX_TIMESTAMP(‘2009-04-01 00:00:00’) ),
    PARTITION p6 VALUES LESS THAN ( UNIX_TIMESTAMP(‘2009-07-01 00:00:00’) ),
    PARTITION p7 VALUES LESS THAN ( UNIX_TIMESTAMP(‘2009-10-01 00:00:00’) ),
    PARTITION p8 VALUES LESS THAN ( UNIX_TIMESTAMP(‘2010-01-01 00:00:00’) ),
    PARTITION p9 VALUES LESS THAN (MAXVALUE)
    );

CREATE TABLE members (
firstname VARCHAR(25) NOT NULL,
lastname VARCHAR(25) NOT NULL,
username VARCHAR(16) NOT NULL,
email VARCHAR(35),
joined DATE NOT NULL
)
PARTITION BY RANGE COLUMNS(joined) (
PARTITION p0 VALUES LESS THAN (‘1960-01-01’),
PARTITION p1 VALUES LESS THAN (‘1970-01-01’),
PARTITION p2 VALUES LESS THAN (‘1980-01-01’),
PARTITION p3 VALUES LESS THAN (‘1990-01-01’),
PARTITION p4 VALUES LESS THAN MAXVALUE
);
对于范围分区,还有个扩展的范围列分区,就是不只是在1列上分区,而是在多列上范围分区,跟单列的范围分区在下面几点有所区别:

  1. 范围列不接受表达式
  2. 范围列接受1个或多个列
    3 可以接受DATE,TIMESTAMP类型列
    mysql> CREATE TABLE rcx (
    -> a INT,
    -> b INT,
    -> c CHAR(3),
    -> d INT
    -> )
    -> PARTITION BY RANGE COLUMNS(a,d,c) (
    -> PARTITION p0 VALUES LESS THAN (5,10,’ggg’),
    -> PARTITION p1 VALUES LESS THAN (10,20,’mmmm’),
    -> PARTITION p2 VALUES LESS THAN (15,30,’sss’),
    -> PARTITION p3 VALUES LESS THAN (MAXVALUE,MAXVALUE,MAXVALUE)
    -> );
    在列范围分区中,在插入数据的时候对比的是记录的大小,不是标量的大小
    .test>CREATE TABLE rc1 (
    -> a INT,
    -> b INT
    -> )
    -> PARTITION BY RANGE COLUMNS(a, b) (
    -> PARTITION p0 VALUES LESS THAN (5, 12),
    -> PARTITION p3 VALUES LESS THAN (MAXVALUE, MAXVALUE)
    -> );
    Query OK, 0 rows affected (0.13 sec)

.test>insert into rc1 values(1,2);
Query OK, 1 row affected (0.02 sec)

.test>insert into rc1 values(5,2);
Query OK, 1 row affected (0.00 sec)

test>insert into rc1 values(5,10);
Query OK, 1 row affected (0.00 sec)

.test>insert into rc1 values(5,12);
Query OK, 1 row affected (0.00 sec)

test>insert into rc1 values(5,14);
Query OK, 1 row affected (0.00 sec)

test>insert into rc1 values(2,14);
Query OK, 1 row affected (0.01 sec)

test>select * from rc1;
+——+——+
| a | b |
+——+——+
| 1 | 2 |
| 5 | 2 |
| 5 | 10 |
| 2 | 14 |
| 5 | 12 |
| 5 | 14 |
+——+——+
6 rows in set (0.01 sec)

select partition_name,table_rows from information_schema.partitions where table_name=’rc1’;
+—————-+————+
| partition_name | table_rows |
+—————-+————+
| p0 | 4 |
| p3 | 2 |
+—————-+————+
2 rows in set (0.01 sec)

.test>select (1,2)<(5,12),(5,2)<(5,12),(5,10)<(5,12);
+————–+————–+—————+
| (1,2)<(5,12) | (5,2)<(5,12) | (5,10)<(5,12) |
+————–+————–+—————+
| 1 | 1 | 1 |
+————–+————–+—————+
1 row in set (0.01 sec)

test>select (5,12)<(5,12),(5,14)<(5,12),(2,14)<(5,12);
+—————+—————+—————+
| (5,12)<(5,12) | (5,14)<(5,12) | (2,14)<(5,12) |
+—————+—————+—————+
| 0 | 0 | 1 |
+—————+—————+—————+
1 row in set (0.01 sec)

test>select (6,1)<(5,12);
+————–+
| (6,1)<(5,12) |
+————–+
| 0 |
+————–+
列表分区,这个没有maxvalue
CREATE TABLE employees (
id INT NOT NULL,
fname VARCHAR(30),
lname VARCHAR(30),
hired DATE NOT NULL DEFAULT ‘1970-01-01’,
separated DATE NOT NULL DEFAULT ‘9999-12-31’,
job_code INT,
store_id INT
)
PARTITION BY LIST(store_id) (
PARTITION pNorth VALUES IN (3,5,6,9,17),
PARTITION pEast VALUES IN (1,2,10,11,19,20),
PARTITION pWest VALUES IN (4,12,13,14,18),
PARTITION pCentral VALUES IN (7,8,15,16)
);

mysql> CREATE TABLE t (account VARCHAR(30) NOT NULL,store_id INT,PRIMARY KEY (`account`))PARTITION BY list(account)(
    -> PARTITION p1 VALUES IN ('aa','ab'), 
    -> PARTITION p2 VALUES IN ('dd'));
ERROR 1697 (HY000): VALUES value for partition 'p1' must have type INT

hash分区
CREATE TABLE employees (
id INT NOT NULL,
fname VARCHAR(30),
lname VARCHAR(30),
hired DATE NOT NULL DEFAULT ‘1970-01-01’,
separated DATE NOT NULL DEFAULT ‘9999-12-31’,
job_code INT,
store_id INT
)
PARTITION BY HASH( YEAR(hired) )
PARTITIONS 4;
hash分区的方法就是取模
MOD(YEAR(‘2005-09-01’),4)
= MOD(2005,4)
= 1
hash分区的字段需要是数字型或是日期,varchar类型的无法hash,能key分区
5.6还支持线性hash,创建线性hash的语法就是多了个linear,具体hash算法参考官方文档
CREATE TABLE employees (
id INT NOT NULL,
fname VARCHAR(30),
lname VARCHAR(30),
hired DATE NOT NULL DEFAULT ‘1970-01-01’,
separated DATE NOT NULL DEFAULT ‘9999-12-31’,
job_code INT,
store_id INT
)
PARTITION BY LINEAR HASH( YEAR(hired) )
PARTITIONS 4;
关于线性hash的优势,对于分区的添加,删除,合并,分裂速度更快,适用于海量数据的情况下,缺点是数据的分布不是很均衡。
The advantage in partitioning by linear hash is that the adding, dropping, merging, and splitting of partitions is made much faster, which can be beneficial when dealing with tables containing extremely large amounts (terabytes) of data. The disadvantage is that data is less likely to be evenly distributed between partitions as compared with the distribution obtained using regular hash partitioning.

键分区,这个跟hash分区类似,键分区的列需要是主键的列,如果表上没有主键,有唯一索引,那么唯一键被用来做分区键。如果唯一键没有被定义成not null,那么会报错。
CREATE TABLE tm1 (
s1 CHAR(32) PRIMARY KEY
)
PARTITION BY KEY(s1)
PARTITIONS 10;
CREATE TABLE k1 (
id INT NOT NULL PRIMARY KEY,
name VARCHAR(20)
)
PARTITION BY KEY()
PARTITIONS 2;
键分区这种方式,数据可能分布的不是很均匀,主要看前置字符的区别度是否足够

mysql> CREATE TABLE t (account VARCHAR(30) NOT NULL,store_id INT,PRIMARY KEY (`account`))PARTITION BY KEY(account) PARTITIONS 4;
Query OK, 0 rows affected (0.07 sec)

mysql> desc t;
+----------+-------------+------+-----+---------+-------+
| Field    | Type        | Null | Key | Default | Extra |
+----------+-------------+------+-----+---------+-------+
| account  | varchar(30) | NO   | PRI | NULL    |       |
| store_id | int(11)     | YES  |     | NULL    |       |
+----------+-------------+------+-----+---------+-------+
2 rows in set (0.03 sec)

mysql> insert into t values('aaa',1);
Query OK, 1 row affected (0.00 sec)

mysql> insert into t values('ab',1);
Query OK, 1 row affected (0.00 sec)

mysql> insert into t values('abc',1);
Query OK, 1 row affected (0.00 sec)

mysql> insert into t values('abcd',1);
Query OK, 1 row affected (0.00 sec)

mysql> insert into t values('bbb',1);
Query OK, 1 row affected (0.00 sec)

mysql> insert into t values('bbbbb',1);
Query OK, 1 row affected (0.00 sec)

mysql> SELECT PARTITION_NAME, TABLE_ROWS FROM INFORMATION_SCHEMA.PARTITIONS WHERE TABLE_NAME = 't';
+----------------+------------+
| PARTITION_NAME | TABLE_ROWS |
+----------------+------------+
| p0             |          4 |
| p1             |          0 |
| p2             |          0 |
| p3             |          2 |
+----------------+------------+
4 rows in set (0.02 sec)

mysql> SELECT * FROM t PARTITION (p0);
+---------+----------+
| account | store_id |
+---------+----------+
| aaa     |        1 |
| ab      |        1 |
| abc     |        1 |
| abcd    |        1 |
+---------+----------+
4 rows in set (0.00 sec)

mysql> SELECT * FROM t PARTITION (p1);
Empty set (0.00 sec)

mysql> SELECT * FROM t PARTITION (p2);
Empty set (0.00 sec)

mysql> SELECT * FROM t PARTITION (p3);
+---------+----------+
| account | store_id |
+---------+----------+
| bbb     |        1 |
| bbbbb   |        1 |
+---------+----------+
2 rows in set (0.00 sec)

子分区
在5.6开始,允许对于range list分区在进行hash,key分区,创建子分区可以指定分区的名字也可以不指定,直接指定个数。
语法参考:https://dev.mysql.com/doc/refman/5.6/en/partitioning-subpartitions.html

查看分区表

CREATE TABLE discount_log_discountlog_part (
id int(11) NOT NULL AUTO_INCREMENT,
user varchar(255) NOT NULL,
transaction_datetime datetime(6) NOT NULL,
amount double NOT NULL,
pos_id longtext NOT NULL,
revoked tinyint(1) NOT NULL,
store_id int(11) NOT NULL,
pos_amount double NOT NULL,
PRIMARY KEY (id,transaction_datetime),
KEY discount_log_discountlog_store_id_f697f9c4_fk_discount_store_id (store_id),
KEY idx_transaction_datetime (transaction_datetime)
) ENGINE=InnoDB AUTO_INCREMENT=185441 DEFAULT CHARSET=utf8
/*!50500 PARTITION BY RANGE COLUMNS(transaction_datetime)
(PARTITION p0 VALUES LESS THAN (‘2016-06-30 23:59:59’) ENGINE = InnoDB,
PARTITION p1 VALUES LESS THAN (‘2016-07-31 23:59:59’) ENGINE = InnoDB,
PARTITION p2 VALUES LESS THAN (‘2016-08-31 23:59:59’) ENGINE = InnoDB,
PARTITION p3 VALUES LESS THAN (‘2016-09-30 23:59:59’) ENGINE = InnoDB,
PARTITION p4 VALUES LESS THAN (‘2016-10-31 23:59:59’) ENGINE = InnoDB,
PARTITION p5 VALUES LESS THAN (‘2016-11-30 23:59:59’) ENGINE = InnoDB,
PARTITION p6 VALUES LESS THAN (‘2016-12-31 23:59:59’) ENGINE = InnoDB,
PARTITION p7 VALUES LESS THAN (‘2017-01-31 23:59:59’) ENGINE = InnoDB,
PARTITION p8 VALUES LESS THAN (‘2017-02-28 23:59:59’) ENGINE = InnoDB,
PARTITION p9 VALUES LESS THAN (‘2017-03-31 23:59:59’) ENGINE = InnoDB,
PARTITION p10 VALUES LESS THAN (‘2017-04-30 23:59:59’) ENGINE = InnoDB,
PARTITION p11 VALUES LESS THAN (‘2017-05-31 23:59:59’) ENGINE = InnoDB,
PARTITION p12 VALUES LESS THAN (‘2017-06-30 23:59:59’) ENGINE = InnoDB,
PARTITION p13 VALUES LESS THAN (‘2017-07-31 23:59:59’) ENGINE = InnoDB,
PARTITION p14 VALUES LESS THAN (‘2017-08-31 23:59:59’) ENGINE = InnoDB,
PARTITION p15 VALUES LESS THAN (‘2017-09-30 23:59:59’) ENGINE = InnoDB,
PARTITION p16 VALUES LESS THAN (‘2017-10-31 23:59:59’) ENGINE = InnoDB,
PARTITION p17 VALUES LESS THAN (‘2017-11-30 23:59:59’) ENGINE = InnoDB,
PARTITION p18 VALUES LESS THAN (‘2017-12-31 23:59:59’) ENGINE = InnoDB,
PARTITION p20 VALUES LESS THAN (‘2018-01-31 23:59:59’) ENGINE = InnoDB,
PARTITION p22 VALUES LESS THAN (‘2018-02-28 23:59:59’) ENGINE = InnoDB,
PARTITION p23 VALUES LESS THAN (‘2018-03-31 23:59:59’) ENGINE = InnoDB,
PARTITION p24 VALUES LESS THAN (‘2018-04-30 23:59:59’) ENGINE = InnoDB,
PARTITION p25 VALUES LESS THAN (‘2018-05-31 23:59:59’) ENGINE = InnoDB,
PARTITION p26 VALUES LESS THAN (‘2018-06-30 23:59:59’) ENGINE = InnoDB,
PARTITION p27 VALUES LESS THAN (‘2018-07-31 23:59:59’) ENGINE = InnoDB,
PARTITION p28 VALUES LESS THAN (‘2018-08-31 23:59:59’) ENGINE = InnoDB,
PARTITION p29 VALUES LESS THAN (MAXVALUE) ENGINE = InnoDB)

限制:1分区列要在主键中,2唯一索引要在分区列中。
正是由于这些限制,导致mysql中使用分表的方式要比分区好很多。

mysql中的分区是分区数据和索引,所以mysql中的索引都是分区索引。