iposdb数据库 date函数优化

原创

宠物挂了 2017-06-19 11:46:29 博主文章分类：MySQL ©著作权

©著作权归作者所有：来自51CTO博客作者宠物挂了的原创作品，谢绝转载，否则将追究法律责任

如下4条SQL，明显是2类SQL语句。2017年5月19号

问题：走了两次全表。
select count(*) 活跃商户数量 from (select merchant_code,COUNT(DISTINCT concat(merchant_code,DATE(create_date))) merchantCodeCreateDate from pos_order where DATE(create_date)>='2017-05-12' and DATE(create_date)<='2017-05-18' and trade_status='2' GROUP BY merchant_code HAVING merchantCodeCreateDate>=4) as tmptable;

select count(*) 活跃机具数量 from (select machine_sn,COUNT(DISTINCT concat(machine_sn,DATE(create_date))) machineSnCreateDate from pos_order where DATE(create_date)>='2017-05-12' and DATE(create_date)<='2017-05-18' and trade_status='2' GROUP BY merchant_code HAVING machineSnCreateDate>=4) as tmptable;

问题：走了一次全表扫描
select count(DISTINCT merchant_code) 交易商户数量 from pos_order where DATE(create_date)>='2017-05-12' and DATE(create_date)<='2017-05-18' and trade_status='2';

select count(DISTINCT machine_sn) 交易机具数量 from pos_order where DATE(create_date)>='2017-05-12' and DATE(create_date)<='2017-05-18' and trade_status='2';

发现：
1 4个查询都是pos_order 的查询。
2 不必要的自身连接
3 count(辅助索引)快于count(*)
4 count(distinct)优化 --算了，这点没必要了。
例子：
select count(DISTINCT machine_sn) 交易机具数量 from pos_order where DATE(create_date)>='2017-05-12' and DATE(create_date)<='2017-05-18' and trade_status='2';
改写：
explain select count(machine_sn) 交易机具数量 from pos_order where machine_sn=any(select distinct machine_sn from pos_order) and DATE(create_date)>='2017-05-12' and DATE(create_date)<='2017-05-18' and trade_status='2';

any关键字，只要满足内层查询语句返回一个结果中的任何一个，就可以通过条件来执行外层查询语句。
all与any相反，只有满足内层查询语句返回的所有结果，才可以执行外层查询语句。
explain select count(machine_sn) 交易机具数量 from pos_order where machine_sn=all(select distinct machine_sn from pos_order);
machine_sn | varchar(32) | NO | MUL | NULL '机器KSN号'

第二种语法改写：
select count(DISTINCT machine_sn) 交易机具数量 from pos_order where DATE(create_date)>='2017-05-12' and DATE(create_date)<='2017-05-18' and trade_status='2';

强制走索引，但是有判断还是全表。使用union all方式。
explain select count(DISTINCT machine_sn) 交易机具数量 from pos_order union all select count(machine_sn) from pos_order force index(idx_create_date) where DATE(create_date)>='2017-05-12' and DATE(create_date)<='2017-05-18' and trade_status='2';

改写函数名称
select count(DISTINCT machine_sn) 交易机具数量 from pos_order where create_date BETWEEN DATE_SUB(curdate(),INTERVAL 1 day) AND date_sub(curdate(),INTERVAL 7 day) and trade_status='2';

很纳闷啊，怎么会没数。
业务SQL语句：
mysql> select count(DISTINCT merchant_code) 交易商户数量 from pos_order where DATE(create_date)>='2017-05-12' and DATE(create_date)<='2017-05-18' and trade_status='2';
+--------------------+
| 交易商户数量 |
+--------------------+
| 39882 |
+--------------------+
1 row in set (1 min 48.89 sec)

mysql> select date_sub(curdate(),INTERVAL 7 day),DATE_SUB(curdate(),INTERVAL 1 day);
+------------------------------------+------------------------------------+
| date_sub(curdate(),INTERVAL 7 day) | DATE_SUB(curdate(),INTERVAL 1 day) |
+------------------------------------+------------------------------------+
| 2017-05-12 | 2017-05-18 |
+------------------------------------+------------------------------------+
1 row in set (0.00 sec)

大的放前面。
mysql> select count(DISTINCT machine_sn) 交易机具数量 from pos_order where create_date >= DATE_SUB(curdate(),INTERVAL 1 day) AND create_date <= date_sub(curdate(),INTERVAL 7 day) and trade_status='2';
+--------------------+
| 交易机具数量       |
+--------------------+
|                  0 |
+--------------------+
1 row in set (0.00 sec)

数值对不上，也是没谁了。

去掉函数。从ALL变成了range 但是数据没有等价。
select count(DISTINCT merchant_code) 交易商户数量 from pos_order where create_date>='2017-05-12 00:00:00' and create_date<='2017-05-18 23:59:59' and trade_status='2';

select count(DISTINCT merchant_code) 交易商户数量 from pos_order force index(idx_create_date) where create_date>='2017-05-12 00:00:00' and create_date<='2017-05-18 23:59:59' and trade_status='2';

优化distinct最有效的方法是利用索引来做排重操作，先把排重的记录查找出来在通过count统计，这样效果更高
表量级：14745537 1474W条+数据。
目前：pos_order 表的索引情况如下：
知识点哈：如果时间类型 create_date 用来做运算的话是不走索引的（故idx_create_date索引无效）