sql server 从两个表进行数据计算 sql 从两个表获取两列数据

转载

mob64ca13f9a97c 2023-12-15 14:55:49

文章标签 sql 两个表列包含字段数据字符串 文章分类 SQL Server 数据库

sql server 从两个表进行数据计算 sql 从两个表获取两列数据_数据

受益于人美心善无眠小姐姐的数据分析面试必备——SQL准备好了么，按着框架一步步又重新敲了一遍，好记性离不开烂笔头。中间有些部分按着自己的喜好做了更改，条条道路通罗马，选择自己最舒服的那条就好~

一、最基本（选数据）

怎么把数据从表里选出来？

-- 从table_1中选择name列
select name from table

提取的数据分属不同表格，如何筛选？

-- table_1里有id，age；table_2里有id，sex。提取id，age，sex三列信息
select a.id,a.age,b.sex from 
(select id,age from table_1) a
join 
(selcet id,sex from table_2) b
on a.id=b.id

在此插入，表连接

sql server 从两个表进行数据计算 sql 从两个表获取两列数据_数据_02

join：默认是inner join

sql server 从两个表进行数据计算 sql 从两个表获取两列数据_字段_03

left join: 左连接，以左表为准，逐条去右表找可匹配字段，如果有多条会逐次列出，如果没有找到则是NULL

sql server 从两个表进行数据计算 sql 从两个表获取两列数据_字符串_04

right join：右连接，以右表为准，逐条去左表找可匹配字段，如果有多条会逐次列出，如果没有找到则是NULL

sql server 从两个表进行数据计算 sql 从两个表获取两列数据_数据_05

full outer join：全连接，包含两个表的连接结果，如果左表缺失或者右表缺失的数据会填充NULL（有的数据库不支持）

还有，纵向合并（两张表列数不相同，无法合并）

# union 合并后删除重复值
select * from table_1
union
select * from table_2

sql server 从两个表进行数据计算 sql 从两个表获取两列数据_数据_06

# union all 合并后保留重复值
select * from table_1
union all
select * from table_2

union和union all 均基于列合并多张表的数据，所合并的列格式必须完全一致。union的过程中会去重并降低效率，union all 直接追加数据。union 前后是两段select 语句而非结果集。

二、最常用

sql server 从两个表进行数据计算 sql 从两个表获取两列数据_字段_07

如果有千万用户数据，想知道有多少去重的用户数？—— 去重 distinct

-- 统计出不同的id
SELECT DISTINCT id from table_1

-- 统计出不同的id个数
SELECT count(DISTINCT id) id from table_1

-- 优化版本的count distinct
select count(*) FROM
(SELECT DISTINCT id from table_1) as t

distinct 会对结果集去重，对全部选择字段进行去重，并不能针对其中部分字段进行去重。使用count distinct进行去重统计会将reducer数量强制限定为1，而影响效率，因此适合改写为子查询。

分性别统计，看看男女各多少？—— 聚合函数和group by

-- 统计不同性别（F、M）中，不同的id个数
select sex,count(distinct id) from table_1
group by sex
-- 其他聚合函数例如：max/min/sum/avg

-- 统计每家公司最大/最小/平均年龄
select max(age),min(age),avg(age) from table_1
group by company

只想查看A公司的男女人数数据？—— 筛选 where/having

-- 统计A公司的男女人数
select sex,count(distinct id) from table_1
where company='A'
group by sex

--统计各公司的男性平均年龄，并且仅保留平均年龄30岁以上的公司
select company,avg(age) from table_1
group by company
having avg(age)>30

想查出A公司的女职员平均年龄？—— 多重筛选

select ,avg(age) from table_1
where company='A' and sex='F'

对收入分组 —— case when

select id,
case
when salary<500 then '[0,500)'
when salary<1000 then '[500,1000)'
when salary<1500 then '[1000,1500)'
else '1500+'
end as 分组
from table_1

case 函数的格式为（case when 条件1 then value1 else null end）, 其中else 可以省，但是end不可以省。

字符串

1.concat(A,B...)返回将A和B按顺序连接在一起的字符串

SELECT *,concat('ca','t') from table_1
-- 得到一列，内容为 cat

2. split(str, regex)?????

3.substr（str,0,len) 截取字符串从0位开始的长度为len个字符

select substr('abcde',3,2) from table_1
-- 得到cd

三、基础进阶

排序 —— row_number/rank/dense_rank

-- 按成绩排序
SELECT *,
row_number() over (order by 成绩 DESC) as 'row_number',
rank() over  (order by 成绩 DESC) as 'rank',
dense_rank() over (order by 成绩 DESC) as 'dense_rank' 
from table_1

sql server 从两个表进行数据计算 sql 从两个表获取两列数据_字符串_08

想要获取top10%的值？

SELECT * from
(select *,
row_number() over (ORDER BY `成绩` DESC) as 排名
from table_1) as t
where t.排名<(SELECT count(`学号`) from table_1)*0.25
;

sql server 从两个表进行数据计算 sql 从两个表获取两列数据_字符串_09

对时间字段进行操作？—— 时间函数

SELECT DATE_FORMAT(20200915,'%Y-%m') as 年月,
DATEDIFF(20200915,20200914) as 间隔天数,
year(20200901)-year(20190817) as 间隔年 
from table_1

sql server 从两个表进行数据计算 sql 从两个表获取两列数据_字符串_10

#时间函数补充--获取当前时间
SELECT now() as 当前时间,
DATE_FORMAT(now(),'%Y-%m-%d') as 年月日,
year(now()) as 年,--------------------------------------还有month()/day()/hour()/minute()/second()
DATE_FORMAT(now(),'%H:%i:%S') as 时分秒,
time(now()),--------------------------------------------同样提取时分秒
WEEKOFYEAR(now()) as 第几周,
dayofweek(now()) as 星期,
quarter(now()) as 季度
;

本文章为转载内容，我们尊重原作者对文章享有的著作权。如有内容错误或侵权问题，欢迎原作者联系我们进行内容更正或删除文章。