1、关系数据的各种操作
并(union)
把相同字段数目和字段类型的表合并到一起
笛卡尔积(cartesian product)
内连接 inner jion
所谓连接就是在表关系的笛卡尔积数据记录中,按照相应字段值的比较条件进行选择生成一个新的关系。连接又分为内连接,外连接,交叉连接。
内连接:在表关系的笛卡尔积数据记录中,保留表关系中所有匹配的数据记录,舍弃不匹配的数据记录。
按照匹配的条件可以分成自然连接,等值连接,和不等连接
自然连接:
表关系的笛卡尔积中,首先根据表关系中相同名称的字段自动进行记录匹配,然后去掉重复的字段。
等值连接
表关系的笛卡尔积中,选择所匹配字段值相等(=符号)的数据记录。
不等连接
表关系的笛卡尔积中,选择所匹配字段值(!=)的数据记录。
外连接 outer jion
在表关系的笛卡尔积数据记录中,不仅保留表关系中所有匹配的数据记录,而且还会保留部分不匹配的数据记录。
按照保留不匹配条件数据记录来源可以分为:left outer union,right outer union ,full outer jion
左连接:
表关系的笛卡尔积中,除了选择相匹配的数据记录,还包含关联左边表中不匹配的数据记录。
右连接:
表关系的笛卡尔积中,除了选择相匹配(相等)的数据记录,还包含右边表中不匹配的数据记录。
全连接:
表关系的笛卡尔积中,除了选择相匹配(相等)的数据记录,还包含左右表中不匹配的数据记录。
交叉连接 cross jion
2、内连接查询
1、在from子句里利用逗号(,)区分多个表,在where子句里通过逻辑表达式来实现匹配条件,从而实现表的连接
2、ANSI连接语法形式,在from子句中使用"jion...on"关键字,而连接条件写在关键字on子句里,这是推荐使用的连接语法。
select field1,...,fieldn
from join_tablename1
inner join join_tablename2 [inner join join_tablenamen]
on join_condition
*等值连接
自连接:一种特殊的等值连接,自身与自身连接
为表取别名机制
select field1,...,fieldn [AS] otherfieldn
from tablename1 [as] othertablename1,...,tablenamen [AS] othertablenamen
*不等连接
可以使用的关系运算符包含">" ">=" "<" "<=" "!="
3、外连接查询
select field1,...,fieldn
from join_tablename1 left|right|full [outer] join join_tablename2
on join_condition
左外连接
新关系中执行匹配条件时,以关键字left join左边的表为参考表
select e.ename empolyee,e.job,l.ename leader
from t_emplyee e left outer join t_employee l
on e.mgr=l.ename;
+--------------+-----------+--------+
| employe | job | leader |
+--------------+-----------+--------+
| smith | clerk | ford |
| alen | salesman | black |
| ward | salesman | black |
| jones | manager | king |
| martin | salesman | black |
| ford | analyst | jones |
| black | manager | ford |
| king | president | NULL |
+--------------+-----------+--------+
8 rows in set (0.00 sec)
select e.ename empolyee,e.job,l.ename leader
from t_emplyee e inner join t_employee l
on e.mgr=l.ename;
+--------------+----------+--------+
| employee | job | leader |
+--------------+----------+--------+
| ford | analyst | jones |
| smith | clerk | ford |
| black | manager | ford |
| alen | salesman | black |
| ward | salesman | black |
| martin | salesman | black |
| jones | manager | king |
+--------------+----------+--------+
7 rows in set (0.00 sec)
观察发现:虽然等值连接sql语句也显示出雇员的相应信息,但是没有显示出雇员为king的信息。
右外连接
新关系中执行匹配条件时,以关键字right join右边的表为参考表
4、合并查询数据记录
select field1 field2,...,fieldn
from tablename1
union |union all
select field1 field2 ... fieldn
from tablename2
union | union all
select field1 field2 ... fieldn
from tablename3
...
通过union来实现并操作,即可通过其将多个select语句的查询结果合并在一起组成新的关系。
例:
mysql> create table t_cstudent(
-> name varchar(20),
-> sex varchar(5));
Query OK, 0 rows affected (0.09 sec)
mysql> create table t_mstudent(
-> name varchar(20),
-> sex varchar(5));
Query OK, 0 rows affected (0.07 sec)
创建两张表,t_cstudent,t_mstudent
select * from t_cstudent;
+----------+-------+
| name | sex |
+----------+-------+
| ccjgong1 | man |
| ccjgong2 | woman |
| ccjgong3 | man |
| ccjgong4 | woman |
| ccjgong5 | woman |
| cmcjgong | man |
+----------+-------+
select * from t_mstudent;
+----------+-------+
| name | sex |
+----------+-------+
| cmcjgong | man |
| mcjgong1 | women |
| mcjgong2 | man |
| mcjgong3 | man |
| mcjgong4 | woman |
| mcjgong5 | woman |
+----------+-------+
select * from t_cstudent
union
select * from t_mstudent
+----------+-------+
| name | sex |
+----------+-------+
| ccjgong1 | man |
| ccjgong2 | woman |
| ccjgong3 | man |
| ccjgong4 | woman |
| ccjgong5 | woman |
********| cmcjgong | man |*******
| mcjgong1 | women |
| mcjgong2 | man |
| mcjgong3 | man |
| mcjgong4 | woman |
| mcjgong5 | woman |
+----------+-------+
执行结果显示出合并后的数据记录,同时去除了重复数据记录,使新关系里没有任何重复的数据记录。
通过union all 把查询结果集直接合并在一起。
select * from t_cstudent
union all
select * from t_mstudent
+----------+-------+
| name | sex |
+----------+-------+
| ccjgong1 | man |
| ccjgong2 | woman |
| ccjgong3 | man |
| ccjgong4 | woman |
| ccjgong5 | woman |
********| cmcjgong | man |********
********| cmcjgong | man |********
| mcjgong1 | women |
| mcjgong2 | man |
| mcjgong3 | man |
| mcjgong4 | woman |
| mcjgong5 | woman |
+----------+-------+
与union相比,执行结果成功显示出合并后的数据记录,但是没有去掉了重复数据记录,即新关系里存在重复的数据记录
5、子查询
在mysql中虽然可以通过连接查询实现多表查询数据记录,但是不建议使用。这是因为连接查询的性能很差。因此出现了连接查询的替代者子查询。在具体应用中,mysql软件推荐使用子查询来实现多表查询数据记录。
两个表执行查询时,会对表先进行笛卡尔积,然后再选取符合匹配条件的数据记录。进行笛卡尔积操作时,会生成两个数表数据记录的乘积条数据记录,如果这两张表的数据记录比较大,则在进行笛卡尔积操作时就会造成死机。
对于有经验的用户,首先会通过count()函数来统计操作表笛卡尔积后的数据记录数,然后才会进行多表查询,因此多表查询一般会经过如下步骤:
1、通过统计函数查询所关联表笛卡尔积后的数据记录数:
select count(*) from t_dept,t_employee
2、如果查询到的数据记录数mysql软件可以接受,然后进行多表连接查询,否则就应该考虑通过其他方式来实现。
3、如果数据记录数mysql不能接受,则使用子查询来实现多表查询。
子查询:就是在一个查询之中嵌套了其他的若干查询,即在select查询语句的where或from子句中包含另一个select查询语句。在查询语句中,外层的select查询语句成为主查询,where子句中的select查询语句被称为子查询,也被称为嵌套查询。
通过子查询可以实现多表查询,该查询语句中可能包含in,any,all,exist等关键字,除此之外还可能包含比较运算符。理论上子查询可以出现在查询语句的任意位置,但是在实际开发中,子查询经常出现在where和from中
where:该位置的子查询一般返回单行单列,多行单列,单行多列数据记录
from:该位置的子查询一般返回多行多列数据记录,可以当做一张临时表
+-------+--------+-----------+------+---------------------+---------+--------+--------+
| empno | ename | job | mgr | hiredate | sale | comm | deptno |
+-------+--------+-----------+------+---------------------+---------+--------+--------+
| 7369 | smith | clerk | 7902 | 1981-03-12 00:00:00 | 800.00 | NULL | 20 |
| 7499 | alen | salesman | 7698 | 1982-03-12 00:00:00 | 1600.00 | 300.00 | 30 |
| 7521 | ward | salesman | 7698 | 1983-03-12 00:00:00 | 1250.00 | 500.00 | 30 |
| 7566 | jones | manager | 7839 | 1981-03-12 00:00:00 | 2893.00 | NULL | 20 |
| 7654 | martin | salesman | 7698 | 1981-03-12 00:00:00 | 1250.00 | NULL | 30 |
| 7902 | ford | analyst | 7566 | 0000-00-00 00:00:00 | 3000.00 | NULL | 30 |
| 7698 | black | manager | 7902 | 1981-03-02 00:00:00 | 2850.00 | NULL | 30 |
| 7839 | king | president | NULL | 1981-03-12 00:00:00 | 5000.00 | NULL | 10 |
+-------+--------+-----------+------+---------------------+---------+--------+--------+
a、返回结果为单行单列和单行多列子查询
1、返回结果为单行单列子查询
select sale from t_employee where ename='smith';
select * from t_employee
where sale>(
select sale
from t_employee
where ename='smith');
+-------+--------+-----------+------+---------------------+---------+--------+--------+
| empno | ename | job | mgr | hiredate | sale | comm | deptno |
+-------+--------+-----------+------+---------------------+---------+--------+--------+
| 7499 | alen | salesman | 7698 | 1982-03-12 00:00:00 | 1600.00 | 300.00 | 30 |
| 7521 | ward | salesman | 7698 | 1983-03-12 00:00:00 | 1250.00 | 500.00 | 30 |
| 7566 | jones | manager | 7839 | 1981-03-12 00:00:00 | 2893.00 | NULL | 20 |
| 7654 | martin | salesman | 7698 | 1981-03-12 00:00:00 | 1250.00 | NULL | 30 |
| 7902 | ford | analyst | 7566 | 0000-00-00 00:00:00 | 3000.00 | NULL | 30 |
| 7698 | black | manager | 7902 | 1981-03-02 00:00:00 | 2850.00 | NULL | 30 |
| 7839 | king | president | NULL | 1981-03-12 00:00:00 | 5000.00 | NULL | 10 |
+-------+--------+-----------+------+---------------------+---------+--------+--------+
2、单行多列子查询
where子句中的子查询除了是返回单行单列的数据记录外,还可以返回单行多列的数据记录,不过这种子查询很少出现
mysql> select ename,sale,job
-> from t_employee
-> where (sale,job)=(
-> select sale,job
-> from t_employee
-> where ename='smith');
+-------+--------+-------+
| ename | sale | job |
+-------+--------+-------+
| smith | 800.00 | clerk |
+-------+--------+-------+
b、返回结果为多行单列子查询
当子查询的返回结果为多行单列数据记录时,该子查询语句一般会在主查询语句的where子句里出现,通常会包含in,any,all,exist等关键字
1、带有关键字in的子查询
mysql> select * from t_employee;
+-------+--------+-----------+------+---------------------+---------+--------+--------+
| empno | ename | job | mgr | hiredate | sale | comm | deptno |
+-------+--------+-----------+------+---------------------+---------+--------+--------+
| 7369 | smith | clerk | 7902 | 1981-03-12 00:00:00 | 800.00 | NULL | 20 |
| 7499 | alen | salesman | 7698 | 1982-03-12 00:00:00 | 1600.00 | 300.00 | 30 |
| 7521 | ward | salesman | 7698 | 1983-03-12 00:00:00 | 1250.00 | 500.00 | 30 |
| 7566 | jones | manager | 7839 | 1981-03-12 00:00:00 | 2893.00 | NULL | 20 |
| 7654 | martin | salesman | 7698 | 1981-03-12 00:00:00 | 1250.00 | NULL | 30 |
| 7902 | ford | analyst | 7566 | 0000-00-00 00:00:00 | 3000.00 | NULL | 30 |
| 7698 | black | manager | 7902 | 1981-03-02 00:00:00 | 2850.00 | NULL | 30 |
| 7839 | king | president | NULL | 1981-03-12 00:00:00 | 5000.00 | NULL | 10 |
| 7676 | sandy | manager | 7839 | 1981-03-12 00:00:00 | 3500.00 | NULL | 50 |
| 7678 | edy | manager | 7839 | 1981-03-12 00:00:00 | 3500.00 | NULL | 60 |
+-------+--------+-----------+------+---------------------+---------+--------+--------+
当主查询的条件在子查询的查询结果里时,就可以通过关键字in来进行判断。相反,可以使用not in
select ename,depto
from t_employee
where deptno in (
select deptno from t_dept);
+--------+--------+
| ename | deptno |
+--------+--------+
| smith | 20 |
| alen | 30 |
| ward | 30 |
| jones | 20 |
| martin | 30 |
| ford | 30 |
| black | 30 |
| king | 10 |
+--------+--------+
可以发现sandy和edy并没有打印出来
select ename,depto
from t_employee
where deptno not in (
select deptno from t_dept);
+-------+--------+
| ename | deptno |
+-------+--------+
| sandy | 50 |
| edy | 60 |
+-------+--------+
2、带有关键字any的子查询
关键字any用来表示主查询的条件为满足子查询返回查询结果中任意一条数据记录,该关键字有三种匹配方式:
=ANY:其功能与关键字IN一样
>ANY(>=ANY):比子查询中返回数据记录中最小的还要大于(大于等于)数据记录
<ANY(<=ANY):比子查询中返回数据记录中最大的还要小于(小于等于)数据记录
例:
select sale from t_employee where job='manager';
+---------+
| sale |
+---------+
| 2893.00 |
| 2850.00 |
| 3500.00 |
| 3500.00 |
+---------+
select ename,job,sale from t_employee where sale < ANY (select sale from t_employee where job='manager');
+--------+----------+---------+
| ename | job | sale |
+--------+----------+---------+
| smith | clerk | 800.00 |
| alen | salesman | 1600.00 |
| ward | salesman | 1250.00 |
| jones | manager | 2893.00 |*****
| martin | salesman | 1250.00 |
| ford | analyst | 3000.00 |***
| black | manager | 2850.00 |
+--------+----------+---------+
通过例子得知,any的意思满足子查询的任意一条记录,而不是所有的记录。
3、带有ALL的子查询
关键字ALL用来表示主查询的条件为满足子查询返回结果中所有数据记录,该关键字有两种匹配方式,分别为:
>ALL(>=ALL):比子查询中返回数据记录中最大的还要大于(大于等于)数据记录
<ALL(<=ALL):比子查询中返回数据记录中最小的还要小于(小于等于)数据记录
select ename,job,sale from t_employee where sale < ALL (select sale from t_employee where job='manager');
+--------+----------+---------+
| ename | job | sale |
+--------+----------+---------+
| smith | clerk | 800.00 |
| alen | salesman | 1600.00 |
| ward | salesman | 1250.00 |
| martin | salesman | 1250.00 |
+--------+----------+---------+
与any进行比较。
4、带有关键字exist的子查询
EXISTS用于检查子查询是否至少会返回一行数据,该子查询实际上并不返回任何数据,而是返回值True或False。
c、返回结果为多行多列子查询
当子查询的返回结果为多行多列数据记录时,该子查询语句一般会在主查询语句的from子句里,被当做一张临时表的方式来处理。
例:执行sql语句select,于数据库company中,查询雇员表t_employee中各部门的部门号,部门名称,部门地址,雇员人数和平均工资。
select * from t_employee;
+-------+--------+-----------+------+---------------------+---------+--------+--------+
| empno | ename | job | mgr | hiredate | sale | comm | deptno |
+-------+--------+-----------+------+---------------------+---------+--------+--------+
| 7369 | smith | clerk | 7902 | 1981-03-12 00:00:00 | 800.00 | NULL | 20 |
| 7499 | alen | salesman | 7698 | 1982-03-12 00:00:00 | 1600.00 | 300.00 | 30 |
| 7521 | ward | salesman | 7698 | 1983-03-12 00:00:00 | 1250.00 | 500.00 | 30 |
| 7566 | jones | manager | 7839 | 1981-03-12 00:00:00 | 2893.00 | NULL | 20 |
| 7654 | martin | salesman | 7698 | 1981-03-12 00:00:00 | 1250.00 | NULL | 30 |
| 7902 | ford | analyst | 7566 | 0000-00-00 00:00:00 | 3000.00 | NULL | 30 |
| 7698 | black | manager | 7902 | 1981-03-02 00:00:00 | 2850.00 | NULL | 30 |
| 7839 | king | president | NULL | 1981-03-12 00:00:00 | 5000.00 | NULL | 10 |
| 7676 | sandy | manager | 7839 | 1981-03-12 00:00:00 | 3500.00 | NULL | 50 |
| 7678 | edy | manager | 7839 | 1981-03-12 00:00:00 | 3500.00 | NULL | 60 |
+-------+--------+-----------+------+---------------------+---------+--------+--------+
select * from t_dept;
+--------+------------+----------+
| deptno | dname | loc |
+--------+------------+----------+
| 10 | accounting | new york |
| 20 | researcher | dalls |
| 30 | sales | chicago |
| 40 | operation | boston |
+--------+------------+----------+
t_dept和t_employee的结构和数据如上所示。
解法一:内连接法
select d.deptno,d.dname,d.loc,count(e.ename) number,avg(e.sale) average
from t_dept d inner join t_employee e on d.deptno=e.deptno
group by d.deptno;
+--------+------------+----------+--------+-------------+
| deptno | dname | loc | number | average |
+--------+------------+----------+--------+-------------+
| 10 | accounting | new york | 1 | 5000.000000 |
| 20 | researcher | dalls | 2 | 1846.500000 |
| 30 | sales | chicago | 5 | 1990.000000 |
+--------+------------+----------+--------+-------------+
解法二:外连接法
select d.deptno,d.dname,d.loc,count(e.ename) number,avg(e.sale) average
from t_dept d left outer join t_employee e on d.deptno=e.deptno
group by d.deptno;
+--------+------------+----------+--------+-------------+
| deptno | dname | loc | number | average |
+--------+------------+----------+--------+-------------+
| 10 | accounting | new york | 1 | 5000.000000 |
| 20 | researcher | dalls | 2 | 1846.500000 |
| 30 | sales | chicago | 5 | 1990.000000 |
| 40 | operation | boston | 0 | NULL |
+--------+------------+----------+--------+-------------+
解法三:子查询
mysql> select d.deptno,d.dname,d.loc,number,average
-> from t_dept d inner join
-> (select deptno dno,count(empno) number,avg(sale) average from t_employee group by deptno desc) employee
-> on d.deptno=employee.dno;
+--------+------------+----------+--------+-------------+
| deptno | dname | loc | number | average |
+--------+------------+----------+--------+-------------+
| 30 | sales | chicago | 5 | 1990.000000 |
| 20 | researcher | dalls | 2 | 1846.500000 |
| 10 | accounting | new york | 1 | 5000.000000 |
+--------+------------+----------+--------+-------------+