mysql中index_merge索引合并优化

原创

tiancityycf 2022-11-25 15:24:02 博主文章分类：面试必考 ©著作权

文章标签 字段 mysql 数据 文章分类 前端开发

©著作权归作者所有：来自51CTO博客作者tiancityycf的原创作品，请联系作者获取转载授权，否则将追究法律责任

前言

深入理解 index merge 是使用索引进行优化的重要基础之一。理解了 index merge 技术，我们才知道应该如何在表上建立索引。

MySQL在分析执行计划时发现走单个索引的过滤效果都不是很好,对多个索引分别进行条件扫描，然后将它们各自的结果进行合并(intersect/union)。

我们的 where 中可能有多个条件(或者join)涉及到多个字段，它们之间进行 AND 或者 OR，那么此时就有可能会使用到 index merge 技术。

MySQL5.0之前，一个表一次只能使用一个索引，无法同时使用多个索引分别进行条件扫描。但是从5.1开始，引入了 index merge 优化技术，对同一个表可以使用多个索引分别进行条件扫描。

index merge 算法根据合并算法的不同分成了三种：intersect, union, sort_union（先内部intersect然后在外面union）

场景

假设我们有如下表结构，数据100万

CREATE TABLE user(
  id INT NOT NULL auto_increment PRIMARY KEY,
  name VARCHAR (8) DEFAULT NULL,
  age TINYINT DEFAULT NULL,
  KEY idx_name (name),
  KEY idx_age (age)

1. index merge 之 intersect

index intersect merge就是多个索引条件扫描得到的结果进行交集运算。显然在多个索引提交之间是 AND 运算时，才会出现 index intersect merge

select * from user where name='zs' and age=12

explain分析一下发现
possible_keys   key  : 都是 idx_name,idx_age  
extra  : Using intersect(idx_name,idx_age); Using where; Using temporary; Using filesort

2. index merge 之 union

简单而言，index uion merge就是多个索引条件扫描，对得到的结果进行并集运算，显然是多个条件之间进行的是 OR 运算。

下面几种类型的 where 条件，以及他们的组合可能会使用到 index union merge算法：

1) 条件使用到复合索引中的所有字段或者左前缀字段(对单字段索引也适用)

2) 主键上的任何范围条件

3) 任何符合 index intersect merge 的where条件；

上面三种 where 条件进行 OR 运算时，可能会使用 index union merge算法。

3. index merge 之 sort_union

This access algorithm is employed when the WHERE clause was converted to several range conditions combined by OR, but for which the Index Merge method union algorithm is not applicable.

多个条件扫描进行 OR 运算，但是不符合 index union merge算法的，此时可能会使用 sort_union算法

4. index merge的局限

1）If your query has a complex WHERE clause with deep AND/OR nesting and MySQL does not choose the optimal plan, try distributing terms using the following identity laws:

5. 对 index merge 的进一步优化

index merge使得我们可以使用到多个索引同时进行扫描，然后将结果进行合并。听起来好像是很好的功能，但是如果出现了 index intersect merge，那么一般同时也意味着我们的索引建立得不太合理，因为 index intersect merge 是可以通过建立复合索引进行更一步优化的。

6. 复合索引的最左前缀原则

上面我们说到，对复合索引的非最左前缀字段进行 OR 运算，是无法使用到复合索引的

SQL如下：

select c_id, count(1) total from table1
where act_id = 100476 and deleted=0 and create_time is not null
group by c_id ;

在table1表中，act_id 和 deleted 都是独立的索引

select count(*) from table1 where act_id = 100476; 结果为20w行

select count(*) from table1 where deleted=0; 结果为100w行

从上面的解释我们可以看出来，index merge其实就是分别通过对两个独立的index进行过滤之后，将过滤之后的结果聚合在一起，然后在返回结果集。

在我们的这个例子中，由于deleted字段的过滤性不好，故返回的rows依然很多，所以造成的很多的磁盘read，导致了cpu的负载非常的高，直接就出现了延迟。

ps：其实在这个case中，并不需要加2个条件的index，只需要将deleted这个index干掉，直接使用coupon_act_id这个index即可，毕竟这个index的过滤的结果集已经很小了。

或者通过关闭index intersect功能也可以。