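The problem from the previous post
The second version took about 33 seconds.
With a small change on top of that version, the running time can be cut by roughly a third.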
select query_time, d, max(ts) ts from (
    select t2.query_time, ts, rn, round(rn/total,10) percent,
        case
            when .71>=round(rn/total,10) then 0.71
            when .81>=round(rn/total,10) then 0.81
            when .91>=round(rn/total,10) then 0.91
        end d
    from (
        select query_time, ts,
            -- row number within each query_time group, tracked with user variables
            case when @gid=query_time then @rn:=@rn+1 when @gid:=query_time then @rn:=1 end rn
        from (
            select * from t ,(select @gid:='',@rn:=0) vars order by query_time,ts
        ) t1
    ) t2 inner join (
        select query_time,count(*) total from t group by query_time
    ) t3 on(t2.query_time=t3.query_time)
    -- new in this version: filter by the smallest percentile before grouping
    where round(rn/total,10)>=0.71
) t6
where d is not null
group by query_time,d
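The change is the added condition where round(rn/total,10)>=0.71, that is, the rows are first filtered by the smallest percentile of interest, and only then grouped with group by.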
With this change, the query time drops to as low as 20.531 s.
Of course, this SQL still has room for further improvement.
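The position of a given percentile can be computed with the following formula:
loc = 1 + (n - 1) * p, where n is the number of elements and p is the percentile expressed as a fraction. loc always lies between 1 and n.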
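As a quick sanity check of the formula, the query below evaluates floor(1 + (n - 1) * p) for a hypothetical group of n = 10 rows and the three percentiles used in this post. The group size 10 and the aliases p, loc and pcts are assumptions chosen purely for illustration. It is a minimal sketch, not part of the tuned query.

```sql
-- Hypothetical group size n = 10; the p values are the percentiles used in this post.
select v as p,
       floor(1 + (10 - 1) * v) as loc   -- row number to pick within the group
from (select 0.71 v union all select 0.81 union all select 0.91) pcts;
-- p = 0.71 -> loc = 7   (1 + 9 * 0.71 = 7.39)
-- p = 0.81 -> loc = 8   (1 + 9 * 0.81 = 8.29)
-- p = 0.91 -> loc = 9   (1 + 9 * 0.91 = 9.19)
```

Because the floored loc is an integer between 1 and n, each (group, percentile) pair corresponds to exactly one numbered row in its group, which is what makes an equality join on the row number possible.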
The SQL can therefore be optimized as follows:
select t5.query_time, t5.ts, t2.v from (
    select query_time, total, v, floor(1+(total-1)*v) rn
    from (
        select query_time,count(*) total from t group by query_time
    ) t3, (select 0.71 v,1 seq union all select 0.81,2 union all select 0.91,3) t4
) t2 inner join (
    select
        query_time,
        case when @gid=query_time then @rn:=@rn+1 when @gid:=query_time then @rn:=1 end rn,
        ts
    from (
        select * from t ,(select @gid:='',@rn:=0) vars order by query_time,ts
    ) t1
) t5 on (t2.query_time=t5.query_time and t2.rn=t5.rn )
Besides making the SQL itself simpler, this also brings the query time down to about 15 seconds.