MySQL如何执行连接查询
mysql中的关联(join)一词所包含的意义比一般意义上理解的更广泛。总的来说,mysql认为任何一个查询都是一次关联——并不仅仅是一个查询需要到两个表匹配才叫关联 ,所以在mysql中,每一个查询,每一个片段(包括子查询,甚至基于单表的select)都可能是关联。
当前mysql关联执行的策略很简单:mysql对任何关联都执行嵌套循环关联操作,即mysql先在一个表中循环取出单条数据,然后再嵌套循环到下一个表中寻找匹配的行,依次下去直到找到所有表中匹配的行为止。然后根据各个表匹配的行,返回查询中需要的各个列。Mysql会尝试在最后一个关联表中找到所有匹配的行,如果最后一个关联表中无法找到更多的行以后,mysql返回到上一层关联表,看是否能够找到更多的匹配记录,依此类推迭代执行。
按照这样的方式查找 第一个记录,再嵌套查询下一个关联表,然后回溯到上一个表,在MySQL中是通过嵌套循环的方式实现——正如其名——嵌套循环关联。
内连接-inner join
下面这个关联查询的例子:连接字段为col3,等值连接,具体怎么连接请看伪代码的表示:
mysql> select tb1.col1,tb2.col2
-> from tb1 inner join tb2 using (col3)
-> where tb1.col1 in (5,6);
mysql> select tb1.col1,tb2.col2
-> from tb1 inner join tb2 using (col3)
-> where tb1.col1 in (5,6);
假设mysql按照查询中的表顺序进行关联操作,我们则可以使用下面的伪代码来表示mysql如何来完成这个查询:
outer_iter = iterator over tb1 where col1 in (5,6)
outer_row = outer_iter.next
while outer_row
inner_iter = iterator over tb2 where col3 = outer_row.col3
inner_row = inner_iter.next
while inner_row
output [outer_row.col1, inner_row.col2]
inner_row = inner_iter.next
end
outer_row = outer_iter.next
end
outer_iter = iterator over tb1 where col1 in (5,6)
outer_row = outer_iter.next
while outer_row
inner_iter = iterator over tb2 where col3 = outer_row.col3
inner_row = inner_iter.next
while inner_row
output [outer_row.col1, inner_row.col2]
inner_row = inner_iter.next
end
outer_row = outer_iter.next
end
稍微解释一个伪代码:根据条件where col1 in (5,6),迭代表tab1,得到表tab1的迭代器,遍历迭代器,在while循环里,得到tab2 的迭代器,条件是tab2 的col3等于tab1 的col3。然后就遍历tab2 的迭代器,打印出关联后的列。基本思路就是这样,我解释的不够清楚,请看伪代码。
左外连接-left outer join
上面的执行计划对于单表查询和多表关联查询都适用,如果是一个单表查询,那么只需要完成上面外层的基本操作。对于外连接上面的执行过程仍然适用。例如,我们将上面查询修改为如下:
mysql> select tb1.col1,tb2.col2
-> from tb1 left outer join tb2 using col3
-> where tb1.col1 in (5,6);
mysql> select tb1.col1,tb2.col2
-> from tb1 left outer join tb2 using col3
-> where tb1.col1 in (5,6);
对应的伪代码如下:
outer_iter = iterator over tb1 where col1 in (5,6)
outer_row = outer_iter.next
while outer_row
inner_iter = iterator over tb2 where col3 = outer_row.col3
inner_row = inner_iter.next
if inner_row
while inner_row
output [outer_row.col1, inner_row.col2]
inner_row = inner_iter.next
end
else
output [outer_row.col1,NULL]
end
outer_row = outer_row.next
end
outer_iter = iterator over tb1 where col1 in (5,6)
outer_row = outer_iter.next
while outer_row
inner_iter = iterator over tb2 where col3 = outer_row.col3
inner_row = inner_iter.next
if inner_row
while inner_row
output [outer_row.col1, inner_row.col2]
inner_row = inner_iter.next
end
else
output [outer_row.col1,NULL]
end
outer_row = outer_row.next
end
和上面的伪代码不同的地方在这里:
if inner_row
while inner_row
output [outer_row.col1, inner_row.col2]
inner_row = inner_iter.next
end
else
output [outer_row.col1,NULL]
end
if inner_row
while inner_row
output [outer_row.col1, inner_row.col2]
inner_row = inner_iter.next
end
else
output [outer_row.col1,NULL]
end
如果inner_row是空的,也就是在tb2 表中没有和outer_row.col3相等的row,所以执行else逻辑,打印出对应的空行。
下面我会以实际的例子再次解读关联查询。
建表语句:
create table tb_customer(
id int not null auto_increment,
name varchar(64) not null,
city varchar(128) not null,
gender int default 0,
age int,
primary key (id)
)
create table tb_item(
id int not null auto_increment,
item_name varchar(128) not null,
item_price int not null,
customer_id int not null,
primary key(id),
constraint tb_item_fk foreign key (customer_id) references tb_customer(id)
)
insert into tb_customer(name,city) values ('bob','taian'),('coc','beijing'),('dod','shenyang'),('eoe','laiwu');
insert into tb_customer(name,city) values ('aoa','taian'),('ioi','beijing'),('non','shenyang'),('fof','laiwu');
insert into tb_item(item_name,item_price,customer_id)values('a',1,1),('b',2,1),('c',3,1),('d',4,1);
insert into tb_item(item_name,item_price,customer_id)values('a',1,2),('b',2,2),('c',3,2),('d',4,2);
insert into tb_item(item_name,item_price,customer_id)values('a',1,3),('b',2,3),('c',3,3),('d',4,3);
insert into tb_item(item_name,item_price,customer_id)values('a',1,4),('b',2,4);
insert into tb_item(item_name,item_price,customer_id)values('e',1,5),('f',2,5);
insert into tb_item(item_name,item_price,customer_id)values('e',1,6),('g',2,6);
create table tb_customer(
id int not null auto_increment,
name varchar(64) not null,
city varchar(128) not null,
gender int default 0,
age int,
primary key (id)
)
create table tb_item(
id int not null auto_increment,
item_name varchar(128) not null,
item_price int not null,
customer_id int not null,
primary key(id),
constraint tb_item_fk foreign key (customer_id) references tb_customer(id)
)
insert into tb_customer(name,city) values ('bob','taian'),('coc','beijing'),('dod','shenyang'),('eoe','laiwu');
insert into tb_customer(name,city) values ('aoa','taian'),('ioi','beijing'),('non','shenyang'),('fof','laiwu');
insert into tb_item(item_name,item_price,customer_id)values('a',1,1),('b',2,1),('c',3,1),('d',4,1);
insert into tb_item(item_name,item_price,customer_id)values('a',1,2),('b',2,2),('c',3,2),('d',4,2);
insert into tb_item(item_name,item_price,customer_id)values('a',1,3),('b',2,3),('c',3,3),('d',4,3);
insert into tb_item(item_name,item_price,customer_id)values('a',1,4),('b',2,4);
insert into tb_item(item_name,item_price,customer_id)values('e',1,5),('f',2,5);
insert into tb_item(item_name,item_price,customer_id)values('e',1,6),('g',2,6);
下面来看一下关联查询的实际效果:
mysql> select * from tb_customer;
+----+------+----------+--------+------+
| id | name | city | gender | age |
+----+------+----------+--------+------+
| 1 | bob | taian | 0 | NULL |
| 2 | coc | beijing | 0 | NULL |
| 3 | dod | shenyang | 0 | NULL |
| 4 | eoe | laiwu | 0 | NULL |
| 5 | aoa | taian | 0 | NULL |
| 6 | ioi | beijing | 0 | NULL |
| 7 | non | shenyang | 0 | NULL |
| 8 | fof | laiwu | 0 | NULL |
+----+------+----------+--------+------+
8 rows in set (0.00 sec)
mysql> select * from tb_item;
+----+-----------+------------+-------------+
| id | item_name | item_price | customer_id |
+----+-----------+------------+-------------+
| 1 | a | 1 | 1 |
| 2 | b | 2 | 1 |
| 3 | c | 3 | 1 |
| 4 | d | 4 | 1 |
| 5 | a | 1 | 2 |
| 6 | b | 2 | 2 |
| 7 | c | 3 | 2 |
| 8 | d | 4 | 2 |
| 9 | a | 1 | 3 |
| 10 | b | 2 | 3 |
| 11 | c | 3 | 3 |
| 12 | d | 4 | 3 |
| 13 | a | 1 | 4 |
| 14 | b | 2 | 4 |
| 15 | e | 1 | 5 |
| 16 | f | 2 | 5 |
| 17 | e | 1 | 6 |
| 18 | g | 2 | 6 |
+----+-----------+------------+-------------+
18 rows in set (0.00 sec)
mysql> select * from tb_customer;
+----+------+----------+--------+------+
| id | name | city | gender | age |
+----+------+----------+--------+------+
| 1 | bob | taian | 0 | NULL |
| 2 | coc | beijing | 0 | NULL |
| 3 | dod | shenyang | 0 | NULL |
| 4 | eoe | laiwu | 0 | NULL |
| 5 | aoa | taian | 0 | NULL |
| 6 | ioi | beijing | 0 | NULL |
| 7 | non | shenyang | 0 | NULL |
| 8 | fof | laiwu | 0 | NULL |
+----+------+----------+--------+------+
8 rows in set (0.00 sec)
mysql> select * from tb_item;
+----+-----------+------------+-------------+
| id | item_name | item_price | customer_id |
+----+-----------+------------+-------------+
| 1 | a | 1 | 1 |
| 2 | b | 2 | 1 |
| 3 | c | 3 | 1 |
| 4 | d | 4 | 1 |
| 5 | a | 1 | 2 |
| 6 | b | 2 | 2 |
| 7 | c | 3 | 2 |
| 8 | d | 4 | 2 |
| 9 | a | 1 | 3 |
| 10 | b | 2 | 3 |
| 11 | c | 3 | 3 |
| 12 | d | 4 | 3 |
| 13 | a | 1 | 4 |
| 14 | b | 2 | 4 |
| 15 | e | 1 | 5 |
| 16 | f | 2 | 5 |
| 17 | e | 1 | 6 |
| 18 | g | 2 | 6 |
+----+-----------+------------+-------------+
18 rows in set (0.00 sec)
关联查询一:
select a.* , b.* from tb_customer a inner join tb_item b;
select a.* , b.* from tb_customer a inner join tb_item b using (id);
select a.* , b.* from tb_customer a inner join tb_item b;
select a.* , b.* from tb_customer a inner join tb_item b using (id);
第一个关联查询不加关联条件,实际的伪代码可以这样表示:
outer_iter = iterator over tb1
outer_row = outer_iter.next
while outer_row
inner_iter = iterator over tb2
inner_row = inner_iter.next
while inner_row
output [outer_row.col1, inner_row.col2]
inner_row = inner_iter.next
end
outer_row = outer_iter.next
end
outer_iter = iterator over tb1
outer_row = outer_iter.next
while outer_row
inner_iter = iterator over tb2
inner_row = inner_iter.next
while inner_row
output [outer_row.col1, inner_row.col2]
inner_row = inner_iter.next
end
outer_row = outer_iter.next
end
第二个关联查询的伪代码可以这样表示:
outer_iter = iterator over tb1
outer_row = outer_iter.next
while outer_row
inner_iter = iterator over tb2 where id = outer_row.id
inner_row = inner_iter.next
while inner_row
output [outer_row.col1, inner_row.col2]
inner_row = inner_iter.next
end
outer_row = outer_iter.next
end
outer_iter = iterator over tb1
outer_row = outer_iter.next
while outer_row
inner_iter = iterator over tb2 where id = outer_row.id
inner_row = inner_iter.next
while inner_row
output [outer_row.col1, inner_row.col2]
inner_row = inner_iter.next
end
outer_row = outer_iter.next
end
第二个关联查询的实际效果:
mysql> select a.* , b.* from tb_customer a inner join tb_item b using (id);
+----+------+----------+--------+------+----+-----------+------------+-------------+
| id | name | city | gender | age | id | item_name | item_price | customer_id |
+----+------+----------+--------+------+----+-----------+------------+-------------+
| 1 | bob | taian | 0 | NULL | 1 | a | 1 | 1 |
| 2 | coc | beijing | 0 | NULL | 2 | b | 2 | 1 |
| 3 | dod | shenyang | 0 | NULL | 3 | c | 3 | 1 |
| 4 | eoe | laiwu | 0 | NULL | 4 | d | 4 | 1 |
| 5 | aoa | taian | 0 | NULL | 5 | a | 1 | 2 |
| 6 | ioi | beijing | 0 | NULL | 6 | b | 2 | 2 |
| 7 | non | shenyang | 0 | NULL | 7 | c | 3 | 2 |
| 8 | fof | laiwu | 0 | NULL | 8 | d | 4 | 2 |
+----+------+----------+--------+------+----+-----------+------------+-------------+
8 rows in set (0.00 sec)
mysql> select a.* , b.* from tb_customer a inner join tb_item b using (id);
+----+------+----------+--------+------+----+-----------+------------+-------------+
| id | name | city | gender | age | id | item_name | item_price | customer_id |
+----+------+----------+--------+------+----+-----------+------------+-------------+
| 1 | bob | taian | 0 | NULL | 1 | a | 1 | 1 |
| 2 | coc | beijing | 0 | NULL | 2 | b | 2 | 1 |
| 3 | dod | shenyang | 0 | NULL | 3 | c | 3 | 1 |
| 4 | eoe | laiwu | 0 | NULL | 4 | d | 4 | 1 |
| 5 | aoa | taian | 0 | NULL | 5 | a | 1 | 2 |
| 6 | ioi | beijing | 0 | NULL | 6 | b | 2 | 2 |
| 7 | non | shenyang | 0 | NULL | 7 | c | 3 | 2 |
| 8 | fof | laiwu | 0 | NULL | 8 | d | 4 | 2 |
+----+------+----------+--------+------+----+-----------+------------+-------------+
8 rows in set (0.00 sec)
可以对照着伪代码分析一下
关联查询二:
select a.* , b.* from tb_customer a inner join tb_item b;
select a.* , b.* from tb_item a inner join tb_customer b;
select a.* , b.* from tb_customer a inner join tb_item b;
select a.* , b.* from tb_item a inner join tb_customer b;
这两个关联查询有什么不同呢?其实本质就是一样,只不过是列的输出位置不同,下面只是一部分输出结果:
mysql> select a.* , b.* from tb_customer a inner join tb_item b;
+----+------+----------+--------+------+----+-----------+------------+-------------+
| id | name | city | gender | age | id | item_name | item_price | customer_id |
+----+------+----------+--------+------+----+-----------+------------+-------------+
| 1 | bob | taian | 0 | NULL | 1 | a | 1 | 1 |
| 2 | coc | beijing | 0 | NULL | 1 | a | 1 | 1 |
| 3 | dod | shenyang | 0 | NULL | 1 | a | 1 | 1 |
| 4 | eoe | laiwu | 0 | NULL | 1 | a | 1 | 1 |
| 5 | aoa | taian | 0 | NULL | 1 | a | 1 | 1 |
| 6 | ioi | beijing | 0 | NULL | 1 | a | 1 | 1 |
| 7 | non | shenyang | 0 | NULL | 1 | a | 1 | 1 |
| 8 | fof | laiwu | 0 | NULL | 1 | a | 1 | 1 |
................................
................................
mysql> select a.* , b.* from tb_customer a inner join tb_item b;
+----+------+----------+--------+------+----+-----------+------------+-------------+
| id | name | city | gender | age | id | item_name | item_price | customer_id |
+----+------+----------+--------+------+----+-----------+------------+-------------+
| 1 | bob | taian | 0 | NULL | 1 | a | 1 | 1 |
| 2 | coc | beijing | 0 | NULL | 1 | a | 1 | 1 |
| 3 | dod | shenyang | 0 | NULL | 1 | a | 1 | 1 |
| 4 | eoe | laiwu | 0 | NULL | 1 | a | 1 | 1 |
| 5 | aoa | taian | 0 | NULL | 1 | a | 1 | 1 |
| 6 | ioi | beijing | 0 | NULL | 1 | a | 1 | 1 |
| 7 | non | shenyang | 0 | NULL | 1 | a | 1 | 1 |
| 8 | fof | laiwu | 0 | NULL | 1 | a | 1 | 1 |
................................
................................
左表:tb_customer
右表:tb_item
关联查询三:
select * from tb_customer outer join tb_item;
select * from tb_customer left outer join tb_item;
seldct * from tb_customer right outer join tb_item;
select * from tb_item outer join tb_customer;
select * from tb_item left outer join tb_customer;
seldct * from tb_item right outer join tb_customer;
select * from tb_customer outer join tb_item;
select * from tb_customer left outer join tb_item;
seldct * from tb_customer right outer join tb_item;
select * from tb_item outer join tb_customer;
select * from tb_item left outer join tb_customer;
seldct * from tb_item right outer join tb_customer;
这六个外关联都是错误的,为什么是错误的,为什么不加关联条件出错,都可以从伪代码上分析清楚。。
所以这里有个结论:外关联必须要有关联条件 。
关联查询四:
select * from tb_item a right outer join tb_customer b on a.customer_id = b.id;
select * from tb_item a right outer join tb_customer b on b.id = a.customer_id;
select * from tb_customer a left outer join tb_item b on a.id = b.customer_id;
select * from tb_item a right outer join tb_customer b on a.customer_id = b.id;
select * from tb_item a right outer join tb_customer b on b.id = a.customer_id;
select * from tb_customer a left outer join tb_item b on a.id = b.customer_id;
这三个关联查询都是正确的外连接。。。
以第一条关联查询分析:
以下是部分查询结果:
mysql> select * from tb_item a right outer join tb_customer b on a.customer_id = b.id;
+------+-----------+------------+-------------+----+------+----------+--------+------+
| id | item_name | item_price | customer_id | id | name | city | gender | age |
+------+-----------+------------+-------------+----+------+----------+--------+------+
| 1 | a | 1 | 1 | 1 | bob | taian | 0 | NULL |
| 2 | b | 2 | 1 | 1 | bob | taian | 0 | NULL |
| 3 | c | 3 | 1 | 1 | bob | taian | 0 | NULL |
| 4 | d | 4 | 1 | 1 | bob | taian | 0 | NULL |
..............................
..............................
mysql> select * from tb_item a right outer join tb_customer b on a.customer_id = b.id;
+------+-----------+------------+-------------+----+------+----------+--------+------+
| id | item_name | item_price | customer_id | id | name | city | gender | age |
+------+-----------+------------+-------------+----+------+----------+--------+------+
| 1 | a | 1 | 1 | 1 | bob | taian | 0 | NULL |
| 2 | b | 2 | 1 | 1 | bob | taian | 0 | NULL |
| 3 | c | 3 | 1 | 1 | bob | taian | 0 | NULL |
| 4 | d | 4 | 1 | 1 | bob | taian | 0 | NULL |
..............................
..............................
通过该查询结果,我们可以看到有部分row是重复出现的,那问题来了为什么是重复出现的,还是要从伪代码上找答案,重复出现的那行说明是外层首先循环遍历的。。
那么可以得出在这里外层首先循环遍历的表时tb_customer表,而在关联查询二:中我们看到inner join 首先遍历的表时tb_item,这不会说明什么问题,但我们可以得出内连接和外连接首先遍历的表时不同的,做出选择的是mysql的查询优化器做出的执行计划。
EXPLAIN 查询语句
explain select * from tb_item a right join tb_customer b on a.customer_id = b.id;
explain select * from tb_item a inner join tb_customer b;
explain select * from tb_item a right join tb_customer b on a.customer_id = b.id;
explain select * from tb_item a inner join tb_customer b;
结果是:
mysql> explain select * from tb_item a right join tb_customer b on a.customer_id = b.id;
+----+-------------+-------+------+---------------+------------+---------+---------------------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------------+---------+---------------------+------+-------+
| 1 | SIMPLE | b | ALL | NULL | NULL | NULL | NULL | 8 | NULL |
| 1 | SIMPLE | a | ref | tb_item_fk | tb_item_fk | 4 | local_database.b.id | 1 | NULL |
+----+-------------+-------+------+---------------+------------+---------+---------------------+------+-------+
2 rows in set (0.00 sec)
mysql> explain select * from tb_item a inner join tb_customer b;
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------------+
| 1 | SIMPLE | b | ALL | NULL | NULL | NULL | NULL | 8 | NULL |
| 1 | SIMPLE | a | ALL | NULL | NULL | NULL | NULL | 18 | Using join buffer (Block Nested Loop) |
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------------+
2 rows in set (0.00 sec)
mysql>
mysql> explain select * from tb_item a right join tb_customer b on a.customer_id = b.id;
+----+-------------+-------+------+---------------+------------+---------+---------------------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------------+---------+---------------------+------+-------+
| 1 | SIMPLE | b | ALL | NULL | NULL | NULL | NULL | 8 | NULL |
| 1 | SIMPLE | a | ref | tb_item_fk | tb_item_fk | 4 | local_database.b.id | 1 | NULL |
+----+-------------+-------+------+---------------+------------+---------+---------------------+------+-------+
2 rows in set (0.00 sec)
mysql> explain select * from tb_item a inner join tb_customer b;
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------------+
| 1 | SIMPLE | b | ALL | NULL | NULL | NULL | NULL | 8 | NULL |
| 1 | SIMPLE | a | ALL | NULL | NULL | NULL | NULL | 18 | Using join buffer (Block Nested Loop) |
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------------+
2 rows in set (0.00 sec)
mysql>
==================END==================