目录
一、分区类型
二、支持类型
1. RANGE
2. LIST
3. HASH
4. KEY
5. COLUMNS
6. 子分区
7. 分区类型总结
三、分区示例
1. 非分区表导入分区表
step1:非分区表结构,导入新表
step2:添加分区
step3:导入数据
2. 分区表导入非分区表
step1:分区表结构,创建非分表,表结构相同
step2:新表删除分区
step3:导入数据
四、参考资料
一、分区类型
从逻辑上,只有一个表或一个索引,但是物理上可能由多个物理分区组成。每个分区都是独立的对象,可以独自处理。
分区类型
二、支持类型
无论创建何种类型的分区,表没有主键、唯一索引时,可以指定任何一列为分区列;若是表有主键、唯一索引时,则分区列必须是主键、唯一索引的一部分,若不是,会报错如下所示。
1503 - A PRIMARY KEY must include all columns in the table's partitioning function
批量INSERT数据时,遇到数据不在分区内,则不同的存储引擎处理不同。MyISAM引擎会将之前数据成功插入,之后插入失败;InnoDB存储引擎将其视为一个事务,所有数据插入失败。
1. RANGE
连续区间的列值被放入分区,关键字VALUES LESS THAN。RANGE适用于日期分区。以下代码,表是没有主键的,则任意列为分区列。把id列分为3个分区。其中p2分区的范围:20 < id < MAXVALUE(正无穷)。
CREATE TABLE `test_range_partition` (
`id` int(11) NULL DEFAULT NULL,
`name` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL
) ENGINE = InnoDB CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Dynamic PARTITION BY RANGE (id)
PARTITIONS 3
(PARTITION `p0` VALUES LESS THAN (10) ENGINE = InnoDB MAX_ROWS = 0 MIN_ROWS = 0 ,
PARTITION `p1` VALUES LESS THAN (20) ENGINE = InnoDB MAX_ROWS = 0 MIN_ROWS = 0 ,
PARTITION `p2` VALUES LESS THAN (MAXVALUE) ENGINE = InnoDB MAX_ROWS = 0 MIN_ROWS = 0 );
分区后,查看磁盘上的表空间文件,不再是一个.ibd文件,而是由各个分区的物理文件组成,如下所示:
-rw-r----- 1 mysql mysql 98304 Aug 5 07:02 test_range_partition#P#p0.ibd
-rw-r----- 1 mysql mysql 98304 Aug 5 07:03 test_range_partition#P#p1.ibd
-rw-r----- 1 mysql mysql 98304 Aug 5 07:03 test_range_partition#P#p2.ibd
information_schema.PARTITIONS可以查看表的分区情况:
- PARTITION_METHOD:分区类型
- PARTITION_NAME:分区名
- PARTITION_EXPRESSION:分区表达式
- PARTITION_DESCRIPTION:分区值
- TABLE_ROWS:分区中的记录数
mysql> SELECT
-> *
-> FROM
-> information_schema.PARTITIONS p
-> WHERE
-> p.TABLE_SCHEMA = DATABASE ()
-> AND p.TABLE_NAME = 'test_range_partition'\G;
*************************** 1. row ***************************
TABLE_CATALOG: def
TABLE_SCHEMA: test_mysql
TABLE_NAME: test_partition
PARTITION_NAME: p0
SUBPARTITION_NAME: NULL
PARTITION_ORDINAL_POSITION: 1
SUBPARTITION_ORDINAL_POSITION: NULL
PARTITION_METHOD: RANGE
SUBPARTITION_METHOD: NULL
PARTITION_EXPRESSION: id
SUBPARTITION_EXPRESSION: NULL
PARTITION_DESCRIPTION: 10
TABLE_ROWS: 1
AVG_ROW_LENGTH: 16384
DATA_LENGTH: 16384
MAX_DATA_LENGTH: NULL
INDEX_LENGTH: 0
DATA_FREE: 0
CREATE_TIME: 2022-08-05 06:39:13
UPDATE_TIME: 2022-08-05 07:02:48
CHECK_TIME: NULL
CHECKSUM: NULL
PARTITION_COMMENT:
NODEGROUP: default
TABLESPACE_NAME: NULL
*************************** 2. row ***************************
TABLE_CATALOG: def
TABLE_SCHEMA: test_mysql
TABLE_NAME: test_partition
PARTITION_NAME: p1
SUBPARTITION_NAME: NULL
PARTITION_ORDINAL_POSITION: 2
SUBPARTITION_ORDINAL_POSITION: NULL
PARTITION_METHOD: RANGE
SUBPARTITION_METHOD: NULL
PARTITION_EXPRESSION: id
SUBPARTITION_EXPRESSION: NULL
PARTITION_DESCRIPTION: 20
TABLE_ROWS: 1
AVG_ROW_LENGTH: 16384
DATA_LENGTH: 16384
MAX_DATA_LENGTH: NULL
INDEX_LENGTH: 0
DATA_FREE: 0
CREATE_TIME: 2022-08-05 06:39:13
UPDATE_TIME: 2022-08-05 07:03:15
CHECK_TIME: NULL
CHECKSUM: NULL
PARTITION_COMMENT:
NODEGROUP: default
TABLESPACE_NAME: NULL
*************************** 3. row ***************************
TABLE_CATALOG: def
TABLE_SCHEMA: test_mysql
TABLE_NAME: test_partition
PARTITION_NAME: p2
SUBPARTITION_NAME: NULL
PARTITION_ORDINAL_POSITION: 3
SUBPARTITION_ORDINAL_POSITION: NULL
PARTITION_METHOD: RANGE
SUBPARTITION_METHOD: NULL
PARTITION_EXPRESSION: id
SUBPARTITION_EXPRESSION: NULL
PARTITION_DESCRIPTION: MAXVALUE
TABLE_ROWS: 2
AVG_ROW_LENGTH: 8192
DATA_LENGTH: 16384
MAX_DATA_LENGTH: NULL
INDEX_LENGTH: 0
DATA_FREE: 0
CREATE_TIME: 2022-08-05 06:39:13
UPDATE_TIME: 2022-08-05 07:03:39
CHECK_TIME: NULL
CHECKSUM: NULL
PARTITION_COMMENT:
NODEGROUP: default
TABLESPACE_NAME: NULL
3 rows in set (0.00 sec)
2. LIST
列值离散而不是连续。所以列值只能定义,关键字VALUES IN。
CREATE TABLE `test_list_partition` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL,
`status` int(11) NOT NULL,
PRIMARY KEY (`id`, `status`) USING BTREE
) ENGINE = InnoDB AUTO_INCREMENT = 1 CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Dynamic PARTITION BY LIST (status)
PARTITIONS 2
(PARTITION `p0` VALUES IN (1,3,5,7,9) ENGINE = InnoDB MAX_ROWS = 0 MIN_ROWS = 0 ,
PARTITION `p1` VALUES IN (0,2,4,6,8) ENGINE = InnoDB MAX_ROWS = 0 MIN_ROWS = 0 );
3. HASH
数据均匀的分散到不同的分区内,保证各分区的数据量大致一样。根据用户自定义表达式的返回值来进行分区,返回值不能为负数。关键字HASH (XX),其中XX是分区列名,也可以带分区列名的一个函数,但是两者都需要返回一个整形数值,而不是负数。
CREATE TABLE `test_hash_partition` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL,
`create_time` datetime(0) NOT NULL,
PRIMARY KEY (`id`, `create_time`) USING BTREE
) ENGINE = InnoDB AUTO_INCREMENT = 1 CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Dynamic PARTITION BY HASH (YEAR(create_time))
PARTITIONS 4
(PARTITION `p0` ENGINE = InnoDB MAX_ROWS = 0 MIN_ROWS = 0 ,
PARTITION `p1` ENGINE = InnoDB MAX_ROWS = 0 MIN_ROWS = 0 ,
PARTITION `p2` ENGINE = InnoDB MAX_ROWS = 0 MIN_ROWS = 0 ,
PARTITION `p3` ENGINE = InnoDB MAX_ROWS = 0 MIN_ROWS = 0 );
确定新插入数据行在分区中的位置。
- 优点:增加、删除、合并/拆分分区操作更加快捷,大表操作效率提高
- 缺点:与HASH相比,各个分区数据分布可能不大均衡
4. KEY
使用MySQL的哈希函数。
CREATE TABLE `test_key_partition` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL,
PRIMARY KEY (`id`) USING BTREE
) ENGINE = InnoDB AUTO_INCREMENT = 7 CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Dynamic PARTITION BY KEY (`id`)
PARTITIONS 4
(PARTITION `p0` ENGINE = InnoDB MAX_ROWS = 0 MIN_ROWS = 0 ,
PARTITION `p1` ENGINE = InnoDB MAX_ROWS = 0 MIN_ROWS = 0 ,
PARTITION `p2` ENGINE = InnoDB MAX_ROWS = 0 MIN_ROWS = 0 ,
PARTITION `p3` ENGINE = InnoDB MAX_ROWS = 0 MIN_ROWS = 0 );
关键字KEY改为LINEAR KEY,则为LINEAR KEY分区,与LINEAR HASH特点相同。
5. COLUMNS
RANGE、LIST、HASH、KEY这四种类型都是整型分区,即:字段必须是整型,如果不是整型,则通过函数将其转化为整型。COLUMNS分区可以直接使用非整型的数据进行分区,同时也可以支持多列值分区。支持类型如下:
- 所有的整型:INT、SMALLINT、TINYINT、BIGINT,而FLOAT、DECIMAL不支持
- 日期类型:DATE、DATETIME,其他日期类型不支持
- 字符类型:CHAR、VARCHAR、BINARY、VARBINARY,而TEXT、BOLB不支持
RANGE COLUMNS、LIST COLUMNS是直接写字符串分区,如下所示。也可以多列分区,如:COLUMNS(id,create_time)
CREATE TABLE `test_key_partition` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL,
`create_time` datetime(0) NOT NULL,
PRIMARY KEY (`id`, `create_time`) USING BTREE
) ENGINE = InnoDB AUTO_INCREMENT = 7 CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Dynamic;
-- 添加分区
ALTER TABLE test_columns_partition PARTITION BY RANGE COLUMNS(create_time)
(
PARTITION p2017 VALUES LESS THAN ('2018-01-01'),
PARTITION p2018 VALUES LESS THAN ('2019-01-01'),
PARTITION p2019 VALUES LESS THAN ('2020-01-01'),
PARTITION p2020 VALUES LESS THAN ('2021-01-01')
)
6. 子分区
分区的基础上再分区,也称复合分区。如MySQL允许在RANGE和LIST分区上再进行HASH或KEY子分区。子分区注意问题:
- 每个子分区的数量必须相同,且所有子分区名有且唯一
- 定义任何一个子分区,则必须定义所有子分区
CREATE TABLE `test_sub_partition` (
`a` int(255) NULL DEFAULT NULL,
`b` datetime(0) NULL DEFAULT NULL
) ENGINE = InnoDB CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Dynamic PARTITION BY RANGE (YEAR(b))
PARTITIONS 2
SUBPARTITION BY HASH (TO_DAYS(b))
SUBPARTITIONS 2
(PARTITION `p0` VALUES LESS THAN (1990) ENGINE = InnoDB MAX_ROWS = 0 MIN_ROWS = 0 (SUBPARTITION `p0sp0` MAX_ROWS = 0 MIN_ROWS = 0 ,
SUBPARTITION `p0sp1` MAX_ROWS = 0 MIN_ROWS = 0 ),
PARTITION `p1` VALUES LESS THAN (2000) ENGINE = InnoDB MAX_ROWS = 0 MIN_ROWS = 0 (SUBPARTITION `p1sp0` MAX_ROWS = 0 MIN_ROWS = 0 ,
SUBPARTITION `p1sp1` MAX_ROWS = 0 MIN_ROWS = 0 ));
-rw-r----- 1 mysql mysql 8578 Jul 14 01:30 test_sub_partition.frm
-rw-r----- 1 mysql mysql 98304 Jul 14 01:30 test_sub_partition#P#p0#SP#p0sp0.ibd
-rw-r----- 1 mysql mysql 98304 Jul 14 01:30 test_sub_partition#P#p0#SP#p0sp1.ibd
-rw-r----- 1 mysql mysql 98304 Jul 14 01:30 test_sub_partition#P#p1#SP#p1sp0.ibd
-rw-r----- 1 mysql mysql 98304 Jul 14 01:30 test_sub_partition#P#p1#SP#p1sp1.ibd
7. 分区类型总结
总类型 | 类型 | 描述 | NULL值处理 |
数值分区 | RANGE | 1. 范围分区:连续区间的列值; 2. 插入数据不在分区内,抛出异常; 3. MAXVALUE:正无穷大; 4. 删除区内所有数据:alter table 表明 drop partition 分区名; 5. 适用:时间分区等; 6. MySQL5.5开始支持RANGE COLUMNS分区。 | 视NULL为最小值,放入最左边分区 |
LIST | 1. LIST分区:面向离散值,使用VALUES IN (.....); 2. 批量插入时: MyISAM将之前插入数据成功,之后插入失败; InnoDB视为一个事务,全部数据插入失败。 | 显式NULL值属于哪个分区,否则报错 | |
HASH | 1. 用户自定义表达式的返回值进行分区,返回值不能为负数; 2. 目的:数据均匀落到各个分区,区内数据量均匀; 3. 适用:自增长主键分区。 | NULL通过函数时,返回0 | |
LINNER HASH | 1. 比HASH更加复杂的算法:确定新行插入到分区表中的位置; 2. 优点:增加、删除、合并/拆分分区操作更加快捷,大表操作效率提高; 缺点:与HASH相比,各个分区数据分布可能不大均衡。 | NULL通过函数时,返回0 | |
KEY | 1. MySQL提供的哈希函数进行分区; 2. 不同存储引擎,哈希算法不同。 | NULL通过函数时,返回0 | |
LINNER KEY | 与LINNER HASH效果一样 | NULL通过函数时,返回0 | |
类型分区 | COLUMNS | 1. 根据列的类型分区,不需要转化为整型; 2. 可对多列值进行分区; 3. 支持类型:所有整型类型、日期类型、字符串类型; 不支持类型:TEXT、BLOB。 | ------ |
子分区 | 子分区 | 1. 概念:分区的基础上再分区,也称“复合分区”(SUBPARTITION); 2. MySQL允许RANGE和LIST分区上再进行HASH和KEY分区; 3. 子分区注意: 每个子分区的数量必须相同,且所有子分区名有且唯一; 定义任何一个子分区,则必须定义所有子分区; | ------ |
三、分区示例
1. 非分区表导入分区表
step1:非分区表结构,导入新表
CREATE TABLE `test_playbill` (
`playbill_id` varchar(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NOT NULL COMMENT '主键Id',
`db_id` varchar(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL,
`content_id` varchar(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL COMMENT '节目id',
`start_time` datetime(0) NULL DEFAULT NULL COMMENT '节目开始时间',
`end_time` datetime(0) NULL DEFAULT NULL COMMENT '节目结束时间',
`content_time` int(11) NULL DEFAULT NULL COMMENT '节目时长/秒',
`status` smallint(6) NULL DEFAULT 1 COMMENT '状态 0:下线 1:上线',
`create_time` datetime(0) NULL DEFAULT NULL COMMENT '创建时间',
`create_by` varchar(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL COMMENT '创建人',
`update_time` datetime(0) NULL DEFAULT NULL COMMENT '修改时间',
`update_by` varchar(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL COMMENT '创建人',
PRIMARY KEY (`playbill_id`) USING BTREE,
INDEX `idx_mgdb_id2`(`mgdb_id`, `content_id`) USING BTREE
) ENGINE = InnoDB CHARACTER SET = utf8mb4 COLLATE = utf8mb4_general_ci COMMENT = '节目单' ROW_FORMAT = Dynamic;
step2:添加分区
ALTER TABLE test_partition_playbill PARTITION BY RANGE (YEAR(create_time))
(
PARTITION `p2018` VALUES LESS THAN (2019) ENGINE = InnoDB MAX_ROWS = 0 MIN_ROWS = 0 ,
PARTITION `p2019` VALUES LESS THAN (2020) ENGINE = InnoDB MAX_ROWS = 0 MIN_ROWS = 0 ,
PARTITION `p2020` VALUES LESS THAN (2021) ENGINE = InnoDB MAX_ROWS = 0 MIN_ROWS = 0 ,
PARTITION `p2021` VALUES LESS THAN (2022) ENGINE = InnoDB MAX_ROWS = 0 MIN_ROWS = 0 ,
PARTITION `p2022` VALUES LESS THAN (2023) ENGINE = InnoDB MAX_ROWS = 0 MIN_ROWS = 0 ,
PARTITION `p2023` VALUES LESS THAN (2024) ENGINE = InnoDB MAX_ROWS = 0 MIN_ROWS = 0
)
step3:导入数据
INSERT INTO test_partition_playbill SELECT * FROM test_playbill;
mysql> SELECT
-> PARTITION_METHOD,
-> PARTITION_NAME,
-> PARTITION_EXPRESSION,
-> PARTITION_DESCRIPTION,
-> TABLE_ROWS
-> FROM
-> information_schema.`PARTITIONS`
-> WHERE
-> table_schema = 'test_mysql'
-> AND table_name = 'test_partition_playbill';
+------------------+----------------+----------------------+-----------------------+------------+
| PARTITION_METHOD | PARTITION_NAME | PARTITION_EXPRESSION | PARTITION_DESCRIPTION | TABLE_ROWS |
+------------------+----------------+----------------------+-----------------------+------------+
| RANGE | p2018 | YEAR(create_time) | 2019 | 11789 |
| RANGE | p2019 | YEAR(create_time) | 2020 | 45218 |
| RANGE | p2020 | YEAR(create_time) | 2021 | 111462 |
| RANGE | p2021 | YEAR(create_time) | 2022 | 122793 |
| RANGE | p2022 | YEAR(create_time) | 2023 | 0 |
| RANGE | p2023 | YEAR(create_time) | 2024 | 0 |
+------------------+----------------+----------------------+-----------------------+------------+
6 rows in set (0.00 sec)
2. 分区表导入非分区表
分区表的数据移动到非分区表中。该语句满足以下条件:
- 分区表与非分区表的结构相同,但不含分区
- 非分区表中的数据必须在分区表的分区定义内
- 被交换的表不能有外键
step1:分区表结构,创建非分表,表结构相同
CREATE TABLE test_playbill2 LIKE test_partition_playbill;
step2:新表删除分区
ALTER TABLE test_playbill2 REMOVE PARTITIONING;
step3:导入数据
执行下述语句,分区表test_partition_playbill中的p2019分区的数据移动到test_playbill2。此时p2019分区已没有数据。
ALTER TABLE test_partition_playbill EXCHANGE PARTITION p2019 WITH TABLE test_playbill2;