MySQL如何执行连接查询

mysql中的关联(join)一词所包含的意义比一般意义上理解的更广泛。总的来说,mysql认为任何一个查询都是一次关联——并不仅仅是一个查询需要到两个表匹配才叫关联 ,所以在mysql中,每一个查询,每一个片段(包括子查询,甚至基于单表的select)都可能是关联。

当前mysql关联执行的策略很简单:mysql对任何关联都执行嵌套循环关联操作,即mysql先在一个表中循环取出单条数据,然后再嵌套循环到下一个表中寻找匹配的行,依次下去直到找到所有表中匹配的行为止。然后根据各个表匹配的行,返回查询中需要的各个列。Mysql会尝试在最后一个关联表中找到所有匹配的行,如果最后一个关联表中无法找到更多的行以后,mysql返回到上一层关联表,看是否能够找到更多的匹配记录,依此类推迭代执行。

按照这样的方式查找 第一个记录,再嵌套查询下一个关联表,然后回溯到上一个表,在MySQL中是通过嵌套循环的方式实现——正如其名——嵌套循环关联。




内连接-inner join

下面这个关联查询的例子:连接字段为col3,等值连接,具体怎么连接请看伪代码的表示:

mysql> select tb1.col1,tb2.col2
    -> from tb1 inner join tb2 using (col3)
    -> where tb1.col1 in (5,6);
mysql> select tb1.col1,tb2.col2
    -> from tb1 inner join tb2 using (col3)
    -> where tb1.col1 in (5,6);

假设mysql按照查询中的表顺序进行关联操作,我们则可以使用下面的伪代码来表示mysql如何来完成这个查询:

outer_iter = iterator over tb1 where col1 in (5,6)
outer_row  = outer_iter.next
while outer_row
	inner_iter = iterator over tb2 where col3 = outer_row.col3
	inner_row  = inner_iter.next
	while inner_row
		output [outer_row.col1, inner_row.col2]
		inner_row = inner_iter.next
	end
	outer_row = outer_iter.next
end
outer_iter = iterator over tb1 where col1 in (5,6)
outer_row  = outer_iter.next
while outer_row
	inner_iter = iterator over tb2 where col3 = outer_row.col3
	inner_row  = inner_iter.next
	while inner_row
		output [outer_row.col1, inner_row.col2]
		inner_row = inner_iter.next
	end
	outer_row = outer_iter.next
end

稍微解释一个伪代码:根据条件where col1 in (5,6),迭代表tab1,得到表tab1的迭代器,遍历迭代器,在while循环里,得到tab2 的迭代器,条件是tab2 的col3等于tab1 的col3。然后就遍历tab2 的迭代器,打印出关联后的列。基本思路就是这样,我解释的不够清楚,请看伪代码。




左外连接-left outer join

上面的执行计划对于单表查询和多表关联查询都适用,如果是一个单表查询,那么只需要完成上面外层的基本操作。对于外连接上面的执行过程仍然适用。例如,我们将上面查询修改为如下:

mysql> select tb1.col1,tb2.col2
    -> from tb1 left outer join tb2 using col3
    -> where tb1.col1 in (5,6);
mysql> select tb1.col1,tb2.col2
    -> from tb1 left outer join tb2 using col3
    -> where tb1.col1 in (5,6);

对应的伪代码如下:

outer_iter = iterator over tb1 where col1 in (5,6)
outer_row  = outer_iter.next
while outer_row
	inner_iter = iterator over tb2 where col3 = outer_row.col3
	inner_row  = inner_iter.next
	if inner_row 
		while inner_row
			output [outer_row.col1, inner_row.col2]
			inner_row = inner_iter.next
		end
	else 
		output [outer_row.col1,NULL]
	end
	outer_row = outer_row.next
end
outer_iter = iterator over tb1 where col1 in (5,6)
outer_row  = outer_iter.next
while outer_row
	inner_iter = iterator over tb2 where col3 = outer_row.col3
	inner_row  = inner_iter.next
	if inner_row 
		while inner_row
			output [outer_row.col1, inner_row.col2]
			inner_row = inner_iter.next
		end
	else 
		output [outer_row.col1,NULL]
	end
	outer_row = outer_row.next
end

和上面的伪代码不同的地方在这里:

if inner_row 
	while inner_row
		output [outer_row.col1, inner_row.col2]
		inner_row = inner_iter.next
	end
else 
	output [outer_row.col1,NULL]
end
if inner_row 
	while inner_row
		output [outer_row.col1, inner_row.col2]
		inner_row = inner_iter.next
	end
else 
	output [outer_row.col1,NULL]
end

如果inner_row是空的,也就是在tb2 表中没有和outer_row.col3相等的row,所以执行else逻辑,打印出对应的空行。

下面我会以实际的例子再次解读关联查询。


建表语句:

create table tb_customer(
    id int not null auto_increment,
    name varchar(64) not null,
    city varchar(128) not null,
    gender int default 0,
    age int,
    primary key (id)
)

create table tb_item(
    id int not null auto_increment,
    item_name varchar(128) not null,
    item_price int not null,
    customer_id int not null,
    primary key(id),
    constraint tb_item_fk foreign key (customer_id) references tb_customer(id)
)

insert into tb_customer(name,city) values ('bob','taian'),('coc','beijing'),('dod','shenyang'),('eoe','laiwu');
insert into tb_customer(name,city) values ('aoa','taian'),('ioi','beijing'),('non','shenyang'),('fof','laiwu');

insert into tb_item(item_name,item_price,customer_id)values('a',1,1),('b',2,1),('c',3,1),('d',4,1);
insert into tb_item(item_name,item_price,customer_id)values('a',1,2),('b',2,2),('c',3,2),('d',4,2);
insert into tb_item(item_name,item_price,customer_id)values('a',1,3),('b',2,3),('c',3,3),('d',4,3);
insert into tb_item(item_name,item_price,customer_id)values('a',1,4),('b',2,4);
insert into tb_item(item_name,item_price,customer_id)values('e',1,5),('f',2,5);
insert into tb_item(item_name,item_price,customer_id)values('e',1,6),('g',2,6);
create table tb_customer(
    id int not null auto_increment,
    name varchar(64) not null,
    city varchar(128) not null,
    gender int default 0,
    age int,
    primary key (id)
)

create table tb_item(
    id int not null auto_increment,
    item_name varchar(128) not null,
    item_price int not null,
    customer_id int not null,
    primary key(id),
    constraint tb_item_fk foreign key (customer_id) references tb_customer(id)
)

insert into tb_customer(name,city) values ('bob','taian'),('coc','beijing'),('dod','shenyang'),('eoe','laiwu');
insert into tb_customer(name,city) values ('aoa','taian'),('ioi','beijing'),('non','shenyang'),('fof','laiwu');

insert into tb_item(item_name,item_price,customer_id)values('a',1,1),('b',2,1),('c',3,1),('d',4,1);
insert into tb_item(item_name,item_price,customer_id)values('a',1,2),('b',2,2),('c',3,2),('d',4,2);
insert into tb_item(item_name,item_price,customer_id)values('a',1,3),('b',2,3),('c',3,3),('d',4,3);
insert into tb_item(item_name,item_price,customer_id)values('a',1,4),('b',2,4);
insert into tb_item(item_name,item_price,customer_id)values('e',1,5),('f',2,5);
insert into tb_item(item_name,item_price,customer_id)values('e',1,6),('g',2,6);

下面来看一下关联查询的实际效果:

mysql> select * from tb_customer;
+----+------+----------+--------+------+
| id | name | city     | gender | age  |
+----+------+----------+--------+------+
|  1 | bob  | taian    |      0 | NULL |
|  2 | coc  | beijing  |      0 | NULL |
|  3 | dod  | shenyang |      0 | NULL |
|  4 | eoe  | laiwu    |      0 | NULL |
|  5 | aoa  | taian    |      0 | NULL |
|  6 | ioi  | beijing  |      0 | NULL |
|  7 | non  | shenyang |      0 | NULL |
|  8 | fof  | laiwu    |      0 | NULL |
+----+------+----------+--------+------+
8 rows in set (0.00 sec)

mysql> select * from tb_item;
+----+-----------+------------+-------------+
| id | item_name | item_price | customer_id |
+----+-----------+------------+-------------+
|  1 | a         |          1 |           1 |
|  2 | b         |          2 |           1 |
|  3 | c         |          3 |           1 |
|  4 | d         |          4 |           1 |
|  5 | a         |          1 |           2 |
|  6 | b         |          2 |           2 |
|  7 | c         |          3 |           2 |
|  8 | d         |          4 |           2 |
|  9 | a         |          1 |           3 |
| 10 | b         |          2 |           3 |
| 11 | c         |          3 |           3 |
| 12 | d         |          4 |           3 |
| 13 | a         |          1 |           4 |
| 14 | b         |          2 |           4 |
| 15 | e         |          1 |           5 |
| 16 | f         |          2 |           5 |
| 17 | e         |          1 |           6 |
| 18 | g         |          2 |           6 |
+----+-----------+------------+-------------+
18 rows in set (0.00 sec)
mysql> select * from tb_customer;
+----+------+----------+--------+------+
| id | name | city     | gender | age  |
+----+------+----------+--------+------+
|  1 | bob  | taian    |      0 | NULL |
|  2 | coc  | beijing  |      0 | NULL |
|  3 | dod  | shenyang |      0 | NULL |
|  4 | eoe  | laiwu    |      0 | NULL |
|  5 | aoa  | taian    |      0 | NULL |
|  6 | ioi  | beijing  |      0 | NULL |
|  7 | non  | shenyang |      0 | NULL |
|  8 | fof  | laiwu    |      0 | NULL |
+----+------+----------+--------+------+
8 rows in set (0.00 sec)

mysql> select * from tb_item;
+----+-----------+------------+-------------+
| id | item_name | item_price | customer_id |
+----+-----------+------------+-------------+
|  1 | a         |          1 |           1 |
|  2 | b         |          2 |           1 |
|  3 | c         |          3 |           1 |
|  4 | d         |          4 |           1 |
|  5 | a         |          1 |           2 |
|  6 | b         |          2 |           2 |
|  7 | c         |          3 |           2 |
|  8 | d         |          4 |           2 |
|  9 | a         |          1 |           3 |
| 10 | b         |          2 |           3 |
| 11 | c         |          3 |           3 |
| 12 | d         |          4 |           3 |
| 13 | a         |          1 |           4 |
| 14 | b         |          2 |           4 |
| 15 | e         |          1 |           5 |
| 16 | f         |          2 |           5 |
| 17 | e         |          1 |           6 |
| 18 | g         |          2 |           6 |
+----+-----------+------------+-------------+
18 rows in set (0.00 sec)




关联查询一:

select a.* , b.* from tb_customer a inner join tb_item b;
select a.* , b.* from tb_customer a inner join tb_item b using (id);
select a.* , b.* from tb_customer a inner join tb_item b;
select a.* , b.* from tb_customer a inner join tb_item b using (id);

第一个关联查询不加关联条件,实际的伪代码可以这样表示:

outer_iter = iterator over tb1
outer_row  = outer_iter.next
while outer_row
	inner_iter = iterator over tb2
	inner_row  = inner_iter.next
	while inner_row
		output [outer_row.col1, inner_row.col2]
		inner_row = inner_iter.next
	end
	outer_row = outer_iter.next
end
outer_iter = iterator over tb1
outer_row  = outer_iter.next
while outer_row
	inner_iter = iterator over tb2
	inner_row  = inner_iter.next
	while inner_row
		output [outer_row.col1, inner_row.col2]
		inner_row = inner_iter.next
	end
	outer_row = outer_iter.next
end

第二个关联查询的伪代码可以这样表示:

outer_iter = iterator over tb1
outer_row  = outer_iter.next
while outer_row
	inner_iter = iterator over tb2 where id = outer_row.id
	inner_row  = inner_iter.next
	while inner_row
		output [outer_row.col1, inner_row.col2]
		inner_row = inner_iter.next
	end
	outer_row = outer_iter.next
end
outer_iter = iterator over tb1
outer_row  = outer_iter.next
while outer_row
	inner_iter = iterator over tb2 where id = outer_row.id
	inner_row  = inner_iter.next
	while inner_row
		output [outer_row.col1, inner_row.col2]
		inner_row = inner_iter.next
	end
	outer_row = outer_iter.next
end

第二个关联查询的实际效果:

mysql> select a.* , b.* from tb_customer a inner join tb_item b using (id);
+----+------+----------+--------+------+----+-----------+------------+-------------+
| id | name | city     | gender | age  | id | item_name | item_price | customer_id |
+----+------+----------+--------+------+----+-----------+------------+-------------+
|  1 | bob  | taian    |      0 | NULL |  1 | a         |          1 |           1 |
|  2 | coc  | beijing  |      0 | NULL |  2 | b         |          2 |           1 |
|  3 | dod  | shenyang |      0 | NULL |  3 | c         |          3 |           1 |
|  4 | eoe  | laiwu    |      0 | NULL |  4 | d         |          4 |           1 |
|  5 | aoa  | taian    |      0 | NULL |  5 | a         |          1 |           2 |
|  6 | ioi  | beijing  |      0 | NULL |  6 | b         |          2 |           2 |
|  7 | non  | shenyang |      0 | NULL |  7 | c         |          3 |           2 |
|  8 | fof  | laiwu    |      0 | NULL |  8 | d         |          4 |           2 |
+----+------+----------+--------+------+----+-----------+------------+-------------+
8 rows in set (0.00 sec)
mysql> select a.* , b.* from tb_customer a inner join tb_item b using (id);
+----+------+----------+--------+------+----+-----------+------------+-------------+
| id | name | city     | gender | age  | id | item_name | item_price | customer_id |
+----+------+----------+--------+------+----+-----------+------------+-------------+
|  1 | bob  | taian    |      0 | NULL |  1 | a         |          1 |           1 |
|  2 | coc  | beijing  |      0 | NULL |  2 | b         |          2 |           1 |
|  3 | dod  | shenyang |      0 | NULL |  3 | c         |          3 |           1 |
|  4 | eoe  | laiwu    |      0 | NULL |  4 | d         |          4 |           1 |
|  5 | aoa  | taian    |      0 | NULL |  5 | a         |          1 |           2 |
|  6 | ioi  | beijing  |      0 | NULL |  6 | b         |          2 |           2 |
|  7 | non  | shenyang |      0 | NULL |  7 | c         |          3 |           2 |
|  8 | fof  | laiwu    |      0 | NULL |  8 | d         |          4 |           2 |
+----+------+----------+--------+------+----+-----------+------------+-------------+
8 rows in set (0.00 sec)

可以对照着伪代码分析一下




关联查询二:

select a.* , b.* from tb_customer a inner join tb_item b;
select a.* , b.* from tb_item a inner join tb_customer b;
select a.* , b.* from tb_customer a inner join tb_item b;
select a.* , b.* from tb_item a inner join tb_customer b;

这两个关联查询有什么不同呢?其实本质就是一样,只不过是列的输出位置不同,下面只是一部分输出结果:

mysql> select a.* , b.* from tb_customer a inner join tb_item b;
+----+------+----------+--------+------+----+-----------+------------+-------------+
| id | name | city     | gender | age  | id | item_name | item_price | customer_id |
+----+------+----------+--------+------+----+-----------+------------+-------------+
|  1 | bob  | taian    |      0 | NULL |  1 | a         |          1 |           1 |
|  2 | coc  | beijing  |      0 | NULL |  1 | a         |          1 |           1 |
|  3 | dod  | shenyang |      0 | NULL |  1 | a         |          1 |           1 |
|  4 | eoe  | laiwu    |      0 | NULL |  1 | a         |          1 |           1 |
|  5 | aoa  | taian    |      0 | NULL |  1 | a         |          1 |           1 |
|  6 | ioi  | beijing  |      0 | NULL |  1 | a         |          1 |           1 |
|  7 | non  | shenyang |      0 | NULL |  1 | a         |          1 |           1 |
|  8 | fof  | laiwu    |      0 | NULL |  1 | a         |          1 |           1 |
................................
................................
mysql> select a.* , b.* from tb_customer a inner join tb_item b;
+----+------+----------+--------+------+----+-----------+------------+-------------+
| id | name | city     | gender | age  | id | item_name | item_price | customer_id |
+----+------+----------+--------+------+----+-----------+------------+-------------+
|  1 | bob  | taian    |      0 | NULL |  1 | a         |          1 |           1 |
|  2 | coc  | beijing  |      0 | NULL |  1 | a         |          1 |           1 |
|  3 | dod  | shenyang |      0 | NULL |  1 | a         |          1 |           1 |
|  4 | eoe  | laiwu    |      0 | NULL |  1 | a         |          1 |           1 |
|  5 | aoa  | taian    |      0 | NULL |  1 | a         |          1 |           1 |
|  6 | ioi  | beijing  |      0 | NULL |  1 | a         |          1 |           1 |
|  7 | non  | shenyang |      0 | NULL |  1 | a         |          1 |           1 |
|  8 | fof  | laiwu    |      0 | NULL |  1 | a         |          1 |           1 |
................................
................................

左表:tb_customer 

右表:tb_item 




关联查询三:

select * from tb_customer outer join tb_item;
select * from tb_customer left outer join tb_item;
seldct * from tb_customer right outer join tb_item;
select * from tb_item outer join tb_customer;
select * from tb_item left outer join tb_customer;
seldct * from tb_item right outer join tb_customer;
select * from tb_customer outer join tb_item;
select * from tb_customer left outer join tb_item;
seldct * from tb_customer right outer join tb_item;
select * from tb_item outer join tb_customer;
select * from tb_item left outer join tb_customer;
seldct * from tb_item right outer join tb_customer;

这六个外关联都是错误的,为什么是错误的,为什么不加关联条件出错,都可以从伪代码上分析清楚。。

所以这里有个结论:外关联必须要有关联条件 。




关联查询四:

select * from tb_item a right outer join tb_customer b on a.customer_id = b.id;
select * from tb_item a right outer join tb_customer b on b.id = a.customer_id; 
select * from tb_customer a left outer join tb_item b on a.id = b.customer_id;
select * from tb_item a right outer join tb_customer b on a.customer_id = b.id;
select * from tb_item a right outer join tb_customer b on b.id = a.customer_id; 
select * from tb_customer a left outer join tb_item b on a.id = b.customer_id;

这三个关联查询都是正确的外连接。。。

以第一条关联查询分析:

以下是部分查询结果:

mysql> select * from tb_item a right outer join tb_customer b on a.customer_id = b.id;
+------+-----------+------------+-------------+----+------+----------+--------+------+
| id   | item_name | item_price | customer_id | id | name | city     | gender | age  |
+------+-----------+------------+-------------+----+------+----------+--------+------+
|    1 | a         |          1 |           1 |  1 | bob  | taian    |      0 | NULL |
|    2 | b         |          2 |           1 |  1 | bob  | taian    |      0 | NULL |
|    3 | c         |          3 |           1 |  1 | bob  | taian    |      0 | NULL |
|    4 | d         |          4 |           1 |  1 | bob  | taian    |      0 | NULL |
..............................
..............................
mysql> select * from tb_item a right outer join tb_customer b on a.customer_id = b.id;
+------+-----------+------------+-------------+----+------+----------+--------+------+
| id   | item_name | item_price | customer_id | id | name | city     | gender | age  |
+------+-----------+------------+-------------+----+------+----------+--------+------+
|    1 | a         |          1 |           1 |  1 | bob  | taian    |      0 | NULL |
|    2 | b         |          2 |           1 |  1 | bob  | taian    |      0 | NULL |
|    3 | c         |          3 |           1 |  1 | bob  | taian    |      0 | NULL |
|    4 | d         |          4 |           1 |  1 | bob  | taian    |      0 | NULL |
..............................
..............................

通过该查询结果,我们可以看到有部分row是重复出现的,那问题来了为什么是重复出现的,还是要从伪代码上找答案,重复出现的那行说明是外层首先循环遍历的。。

那么可以得出在这里外层首先循环遍历的表时tb_customer表,而在关联查询二:中我们看到inner join 首先遍历的表时tb_item,这不会说明什么问题,但我们可以得出内连接和外连接首先遍历的表时不同的,做出选择的是mysql的查询优化器做出的执行计划。




EXPLAIN 查询语句

explain select * from tb_item a right join tb_customer b on a.customer_id = b.id;
explain select * from tb_item a inner join tb_customer b;
explain select * from tb_item a right join tb_customer b on a.customer_id = b.id;
explain select * from tb_item a inner join tb_customer b;

结果是:

mysql> explain select * from tb_item a right join tb_customer b on a.customer_id = b.id;
+----+-------------+-------+------+---------------+------------+---------+---------------------+------+-------+
| id | select_type | table | type | possible_keys | key        | key_len | ref                 | rows | Extra |
+----+-------------+-------+------+---------------+------------+---------+---------------------+------+-------+
|  1 | SIMPLE      | b     | ALL  | NULL          | NULL       | NULL    | NULL                |    8 | NULL  |
|  1 | SIMPLE      | a     | ref  | tb_item_fk    | tb_item_fk | 4       | local_database.b.id |    1 | NULL  |
+----+-------------+-------+------+---------------+------------+---------+---------------------+------+-------+
2 rows in set (0.00 sec)

mysql> explain select * from tb_item a inner join tb_customer b;
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                                 |
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------------+
|  1 | SIMPLE      | b     | ALL  | NULL          | NULL | NULL    | NULL |    8 | NULL                                  |
|  1 | SIMPLE      | a     | ALL  | NULL          | NULL | NULL    | NULL |   18 | Using join buffer (Block Nested Loop) |
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------------+
2 rows in set (0.00 sec)

mysql>
mysql> explain select * from tb_item a right join tb_customer b on a.customer_id = b.id;
+----+-------------+-------+------+---------------+------------+---------+---------------------+------+-------+
| id | select_type | table | type | possible_keys | key        | key_len | ref                 | rows | Extra |
+----+-------------+-------+------+---------------+------------+---------+---------------------+------+-------+
|  1 | SIMPLE      | b     | ALL  | NULL          | NULL       | NULL    | NULL                |    8 | NULL  |
|  1 | SIMPLE      | a     | ref  | tb_item_fk    | tb_item_fk | 4       | local_database.b.id |    1 | NULL  |
+----+-------------+-------+------+---------------+------------+---------+---------------------+------+-------+
2 rows in set (0.00 sec)

mysql> explain select * from tb_item a inner join tb_customer b;
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                                 |
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------------+
|  1 | SIMPLE      | b     | ALL  | NULL          | NULL | NULL    | NULL |    8 | NULL                                  |
|  1 | SIMPLE      | a     | ALL  | NULL          | NULL | NULL    | NULL |   18 | Using join buffer (Block Nested Loop) |
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------------+
2 rows in set (0.00 sec)

mysql>

==================END==================