A Visual Explanation of SQL Joins

I thought Ligaya Turmelle's  as a great primer for novice developers. Since SQL joins appear to be set-based, the use of ​ ​​ to explain them seems, at first blush, to be a natural fit. However, like the commenters to her post, I found that the Venn diagrams didn't quite match the reality in my testing.

I love the concept, though, so let's see if we can make it work. Assume we have the following two tables. Table A is on the left, and Table B is on the right. We'll populate them with four records each.

id name       id  name -- ----       --  ---- 1  Pirate     1   Rutabaga 2  Monkey     2   Pirate 3  Ninja      3   Darth Vader 4  Spaghetti  4   Ninja

Let's join these tables by the name field in a few different ways and see if we can get a conceptual match to those nifty Venn diagrams. 





















SELECT * FROM TableA
INNER JOIN TableB
ON TableA.name = TableB.name

id name id name
-- ---- -- ----
1 Pirate 2 Pirate
3 Ninja 4 Ninja

Inner join produces only the set of records that match in both Table A and Table B.



sql jion_sql



SELECT * FROM TableA
FULL OUTER JOIN TableB
ON TableA.name = TableB.name

id name id name
-- ---- -- ----
1 Pirate 2 Pirate
2 Monkey null null
3 Ninja 4 Ninja
4 Spaghetti null null
null null 1 Rutabaga
null null 3 Darth Vader

Full outer join produces the set of all records in Table A and Table B, with matching records from both sides where available. If there is no match, the missing side will contain null.



sql jion_sql_02



SELECT * FROM TableA
LEFT OUTER JOIN TableB
ON TableA.name = TableB.name

id name id name
-- ---- -- ----
1 Pirate 2 Pirate
2 Monkey null null
3 Ninja 4 Ninja
4 Spaghetti null null

Left outer join produces a complete set of records from Table A, with the matching records (where available) in Table B. If there is no match, the right side will contain null.



sql jion_结果集_03



SELECT * FROM TableA
LEFT OUTER JOIN TableB
ON TableA.name = TableB.name
WHERE TableB.id IS null

id name id name
-- ---- -- ----
2 Monkey null null
4 Spaghetti null null

To produce the set of records only in Table A, but not in Table B, we perform the same left outer join, then exclude the records we don't want from the right side via a where clause



sql jion_ide_04



SELECT * FROM TableA
FULL OUTER JOIN TableB
ON TableA.name = TableB.name
WHERE TableA.id IS null
OR TableB.id IS null

id name id name
-- ---- -- ----
2 Monkey null null
4 Spaghetti null null
null null 1 Rutabaga
null null 3 Darth Vader

To produce the set of records unique to Table A and Table B, we perform the same full outer join, then exclude the records we don't want from both sides via a where clause



sql jion_sed_05


There's also a cartesian product or cross join, which as far as I can tell, can't be expressed as a Venn diagram:

SELECT * FROM TableA CROSS JOIN TableB

This joins "everything to everything", resulting in 4 x 4 = 16 rows, far more than we had in the original sets. If you do the math, you can see why this is a very dangerous join to run against large tables.



对于SQL的Join,在学习起来可能是比较乱的。我们知 有很多inner的,有outer的,有left的,有时候,对于Select出来的结果集是什么样子有点不是很清楚。Coding Horr 清楚为什么Coding Horror也被墙)通过 文氏图 释了SQL的Join。我觉得清楚易懂,转过来。

假设我们有两张表。


  • Table A 是左边的表。
  • Table B 是右边的表。

其各有四条记录,其中有两条记录是相同的,如下所示:

id name       id  name -- ----       --  ---- 1  Pirate     1   Rutabaga 2  Monkey     2   Pirate 3  Ninja      3   Darth Vader 4  Spaghetti  4   Ninja

下面让我们来看看不同的Join会产生什么样的结果。

SELECT * FROM TableA INNER JOIN TableB ON TableA.name = TableB.name  id  name       id   name --  ----       --   ---- 1   Pirate     2    Pirate 3   Ninja      4    Ninja

Inner join

产生的结果集中,是A和B的交集。

sql jion_sed_06

SELECT * FROM TableA FULL OUTER JOIN TableB ON TableA.name = TableB.name  id    name       id    name --    ----       --    ---- 1     Pirate     2     Pirate 2     Monkey     null  null 3     Ninja      4     Ninja 4     Spaghetti  null  null null  null       1     Rutabaga null  null       3     Darth Vader

Full outer join 产生A和B的并集。但是需要注意的是,对于没有匹配的记录,则会以null做为值。

sql jion_数据集_07

SELECT * FROM TableA LEFT OUTER JOIN TableB ON TableA.name = TableB.name  id  name       id    name --  ----       --    ---- 1   Pirate     2     Pirate 2   Monkey     null  null 3   Ninja      4     Ninja 4   Spaghetti  null  null

Left outer join 产生表A的完全集,而B表中匹配的则有值,没有匹配的则以null值取代。

sql jion_ide_08

SELECT * FROM TableA LEFT OUTER JOIN TableB ON TableA.name = TableB.name WHERE TableB.id IS null   id  name       id     name --  ----       --     ---- 2   Monkey     null   null 4   Spaghetti  null   null

产生在A表中有而在B表中没有的集合。

sql jion_sql_09

SELECT * FROM TableA FULL OUTER JOIN TableB ON TableA.name = TableB.name WHERE TableA.id IS null OR TableB.id IS null  id    name       id    name --    ----       --    ---- 2     Monkey     null  null 4     Spaghetti  null  null null  null       1     Rutabaga null  null       3     Darth Vader

产生A表和B表都没有出现的数据集。

sql jion_ide_10

还需要注册的是我们还有一个是“交差集” cross join, 这种Join没有办法用文式图表示,因为其就是把表A和表B的数据进行一个N*M的组合,即笛卡尔积。表达式如下:

SELECT * FROM TableA CROSS JOIN TableB

这个笛卡尔乘积会产生 4 x 4 = 16 条记录,一般来说,我们很少用到这个语法。但是我们得小心,如果不是使用嵌套的select语句,一般系统都会产生笛卡尔乘积然再做过滤。这是对于性能来说是非常危险的,尤其是表很大的时候。