从Mysql8.0.17开始,支持在json列上添加多值索引。多值索引会为一条记录添加多条索引记录,查找时,通过索引能快速定位到记录。

要使用多值索引,先通过select version()看一下版本是否支持。

JSON字段的创建和查询

Mysql的json类型是schemaless的,可以插入任意符合json格式的数据,包括数组和对象。

例如,有一张用户表,包括用户标签(tag)、扩展信息(extInfo)都是json类型。tag存放的是字符串数组,extInfo存放的是一个对象,包含age(数值)和address(字符串)。

create table demo_user
(
		name varchar(64),
    tag     json,
    extInfo json
);

在不加索引的情况下,可以插入任意符合json格式的数据,每行记录的内容都可以不一样。

insert into demo_user(name, tag, extInfo) values ('小明', '["t1","t2"]', '{"age": 20, "address": "a1"}');
insert into demo_user(name, tag, extInfo) values ('小红', '["t1","t2"]', '{"age": 20, "address": 20}');
insert into demo_user(name, tag, extInfo) values ('小李', '["t1","t2"]', '{"age": "aaa", "address": "a1"}');
insert into demo_user(name, tag, extInfo) values ('小花', '{"age": "aaa", "address": "a1"}', '{"age": "aaa", "address": "a1"}');

select * from demo_user;
+------+-------------------------------+-------------------------------+
|name  |tag                            |extInfo                        |
+------+-------------------------------+-------------------------------+
|小明  |["t1", "t2"]                   |{"age": 20, "address": "a1"}   |
|小红  |["t1", "t2"]                   |{"age": 20, "address": 20}     |
|小李  |["t1", "t2"]                   |{"age": "aaa", "address": "a1"}|
|小花  |{"age": "aaa", "address": "a1"}|{"age": "aaa", "address": "a1"}|
+------+-------------------------------+-------------------------------+

还没添加索引,索引按照tag、age、adress查询是会走全表扫描。

explain select * from demo_user where json_contains(tag, '"t1"');
+--+-----------+---------+----------+----+-------------+----+-------+----+----+--------+-----------+
|id|select_type|table    |partitions|type|possible_keys|key |key_len|ref |rows|filtered|Extra      |
+--+-----------+---------+----------+----+-------------+----+-------+----+----+--------+-----------+
|1 |SIMPLE     |demo_user|null      |ALL |null         |null|null   |null|4   |100     |Using where|
+--+-----------+---------+----------+----+-------------+----+-------+----+----+--------+-----------+

explain select * from demo_user where json_contains(extInfo->'$.age', '20');
+--+-----------+---------+----------+----+-------------+----+-------+----+----+--------+-----------+
|id|select_type|table    |partitions|type|possible_keys|key |key_len|ref |rows|filtered|Extra      |
+--+-----------+---------+----------+----+-------------+----+-------+----+----+--------+-----------+
|1 |SIMPLE     |demo_user|null      |ALL |null         |null|null   |null|4   |100     |Using where|
+--+-----------+---------+----------+----+-------------+----+-------+----+----+--------+-----------+

explain select * from demo_user where json_contains(extInfo->'$.address', '"a1"');
+--+-----------+---------+----------+----+-------------+----+-------+----+----+--------+-----------+
|id|select_type|table    |partitions|type|possible_keys|key |key_len|ref |rows|filtered|Extra      |
+--+-----------+---------+----------+----+-------------+----+-------+----+----+--------+-----------+
|1 |SIMPLE     |demo_user|null      |ALL |null         |null|null   |null|4   |100     |Using where|
+--+-----------+---------+----------+----+-------------+----+-------+----+----+--------+-----------+

添加多值索引

添加多值索引的方式和添加普通索引类似,可以通过create table、alter table或create index添加索引,唯一的区别是对索引列的描述有差别。json是一个schemaless,在创建索引的时候要明确要为哪个key建立索引,并且告知列是什么类型。

通过cast(expr AS type [ARRAY])的方式将json转成指定类型的ARRAY,column是需要加索引的json列,type是json里key的类型。

type的可选值有很多,我们需要用到字符串和整形,其他的在官方文档里有描述[cast函数]

  • char(n),转成varchar类型的字符串
  • unsigned,转成无符号类型的bigint
-- 将tag里的值转成字符类型
alter table demo_user add index tag( (cast(tag as char(64) array)) );

-- 将extInfo里的age转成无符号整形
alter table demo_user add index age( (cast(extInfo->'$.age' as unsigned array)) );

-- 将extInfo里的address转成字符类型
alter table demo_user add index address( (cast(extInfo->'$.address' as char(255) array)) );

这时候,因为之前插入的数据可能无法转成对应的类型,就会在建立索引的时候报错。

例如,在给tag添加索引时,因为有第四条记录的tag值是{"age": "aaa", "address": "a1"},所以提示JSON Object类型无法转成array类型。

alter table demo_user add index tag( (cast(tag as char(64) array)) );
---
[42000][1235] This version of MySQL doesn't yet support 'CAST-ing JSON OBJECT type to array'

给age添加索引时,提示数据截断,因为第三条里age值为"aaa",转成整数是为空。

alter table demo_user add index age( (cast(extInfo->'$.age' as unsigned array)) );
---
[22001][3903] Data truncation: Invalid JSON value for CAST for functional index 'age'.

当我们把数据修正后就能正常添加上索引,再看看执行explain看一下查询是否走索引:

explain select * from demo_user where json_contains(tag, '"t1"');
+--+-----------+---------+----------+-----+-------------+---+-------+----+----+--------+-----------+
|id|select_type|table    |partitions|type |possible_keys|key|key_len|ref |rows|filtered|Extra      |
+--+-----------+---------+----------+-----+-------------+---+-------+----+----+--------+-----------+
|1 |SIMPLE     |demo_user|null      |range|tag          |tag|259    |null|4   |100     |Using where|
+--+-----------+---------+----------+-----+-------------+---+-------+----+----+--------+-----------+

explain select * from demo_user where json_contains(extInfo->'$.age', '20');
+--+-----------+---------+----------+-----+-------------+---+-------+----+----+--------+-----------+
|id|select_type|table    |partitions|type |possible_keys|key|key_len|ref |rows|filtered|Extra      |
+--+-----------+---------+----------+-----+-------------+---+-------+----+----+--------+-----------+
|1 |SIMPLE     |demo_user|null      |range|age          |age|9      |null|2   |100     |Using where|
+--+-----------+---------+----------+-----+-------------+---+-------+----+----+--------+-----------+

explain select * from demo_user where json_contains(extInfo->'$.address', '"a1"');
+--+-----------+---------+----------+-----+-------------+-------+-------+----+----+--------+-----------+
|id|select_type|table    |partitions|type |possible_keys|key    |key_len|ref |rows|filtered|Extra      |
+--+-----------+---------+----------+-----+-------------+-------+-------+----+----+--------+-----------+
|1 |SIMPLE     |demo_user|null      |range|address      |address|1023   |null|3   |100     |Using where|
+--+-----------+---------+----------+-----+-------------+-------+-------+----+----+--------+-----------+

可以看到3个查询语句都走到了索引。

除了json_contains能走索引,member of()和json_overlaps()也能用到索引,但是这几个函数的功能是不一样的,具体要查询文档说明。

explain select * from demo_user where 't1' member of (tag);
+--+-----------+---------+----------+----+-------------+---+-------+-----+----+--------+-----------+
|id|select_type|table    |partitions|type|possible_keys|key|key_len|ref  |rows|filtered|Extra      |
+--+-----------+---------+----------+----+-------------+---+-------+-----+----+--------+-----------+
|1 |SIMPLE     |demo_user|null      |ref |tag          |tag|259    |const|4   |100     |Using where|
+--+-----------+---------+----------+----+-------------+---+-------+-----+----+--------+-----------+

explain select * from demo_user where JSON_OVERLAPS(tag, '"t1"');
+--+-----------+---------+----------+-----+-------------+---+-------+----+----+--------+-----------+
|id|select_type|table    |partitions|type |possible_keys|key|key_len|ref |rows|filtered|Extra      |
+--+-----------+---------+----------+-----+-------------+---+-------+----+----+--------+-----------+
|1 |SIMPLE     |demo_user|null      |range|tag          |tag|259    |null|4   |100     |Using where|
+--+-----------+---------+----------+-----+-------------+---+-------+----+----+--------+-----------+

总结

json列能添加多值索引,提高了json的查询性能。同时因为索引要指定列的类型,索引列就不能使用任意类型,这会失去一部分schemaless的能力。需要注意的是,使用cast函数添加索引时,要选择合适type,不然会出现类型转换失败的异常。

参考

JSON类型文档:MySQL :: MySQL 8.3 Reference Manual :: 11.5 The JSON Data Type

多值索引文档:MySQL :: MySQL 8.3 Reference Manual :: 13.1.15 CREATE INDEX Statement

cast函数文档:MySQL :: MySQL 8.3 Reference Manual :: 12.10 Cast Functions and Operators