最近遇到很多大分页的问题,服务器负载很高,SQL的执行时间也较长,非常的头痛。下面我们就说说分页问题的解决。
大家都知道对于mysql来说,select * from table where x=‘?’ limit 0,20;和select * from table where x='?' limit 100000,20;的执行效率差了非常多。
这是由于当limit m,n的时候,虽然都会扫描所有记录,但是当m越大的时候需要返回的数据就越多(从磁盘上返回的数据实际是m+n行),故消耗的IO也就越多,执行时间也就越慢。
ps:本测试场景主键为联合索引,返回结果为一个非排序字段。
- 我们首先对比一下执行时间,可以明显看出当m之间相差很大的时候,执行时间相差10倍多。
select cmt_id from cmt_timeline_9.status_cmt_1401 where status_id=3664180852723027 and vflag in (0, 6) limit 100,20;
+------------------+
| cmt_id |
+------------------+
| 3664182476231215 |
| 3664182476231953 |
| 3664182509143378 |
| 3664182509856931 |
| 3664182522387354 |
| 3664182547189538 |
| 3664182551815916 |
| 3664182555954900 |
| 3664182567872431 |
| 3664182572795837 |
| 3664182580749693 |
| 3664182581130933 |
| 3664182585326062 |
| 3664182610558327 |
| 3664182610558824 |
| 3664182622828298 |
| 3664182631220149 |
| 3664182631538407 |
| 3664182631538450 |
| 3664182643808242 |
+------------------+
20 rows in set (0.01 sec)
select cmt_id from cmt_timeline_9.status_cmt_1401 where status_id=3664180852723027 and vflag in (0, 6) limit 1000000,20;
+------------------+
| cmt_id |
+------------------+
| 3666228885570033 |
| 3666228885570295 |
| 3666228885724162 |
| 3666228885724441 |
| 3666228885724588 |
| 3666228889063746 |
| 3666228889064071 |
| 3666228889491318 |
| 3666228889765049 |
| 3666228889765102 |
| 3666228889765287 |
| 3666228889919257 |
| 3666228889919632 |
| 3666228889919681 |
| 3666228889919751 |
| 3666228893258463 |
| 3666228893258600 |
| 3666228893258942 |
| 3666228893686221 |
| 3666228893686312 |
+------------------+
20 rows in set (1.39 sec)
下面我们在看下explain的结果,结果显示都使用了主键,并且扫描行数基本等于索引命中的行数,基本都在500w左右。
explain select cmt_id from cmt_timeline_9.status_cmt_1401 where status_id=3664180852723027 and vflag in (0, 6) limit 100,20\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: status_cmt_1401
type: range
possible_keys: PRIMARY
key: PRIMARY
key_len: 9
ref: NULL
rows: 5493293
Extra: Using where; Using index
1 row in set (0.00 sec)
explain select cmt_id from cmt_timeline_9.status_cmt_1401 where status_id=3664180852723027 and vflag in (0, 6) limit 1000000,20\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: status_cmt_1401
type: range
possible_keys: PRIMARY
key: PRIMARY
key_len: 9
ref: NULL
rows: 5505085
Extra: Using where; Using index
1 row in set (0.00 sec)
最后我们看看profile的结果,查看一下时间都消耗在那里了? 我们可以很明显的看到时间消耗在sending data上,这说明从磁盘读取的数据的量级是不一样的。
show profile for query 9;
+--------------------------------+----------+
| Status | Duration |
+--------------------------------+----------+
| starting | 0.000139 |
| Waiting for query cache lock | 0.000053 |
| checking query cache for query | 0.000082 |
| checking permissions | 0.000051 |
| Opening tables | 0.000059 |
| System lock | 0.000052 |
| Waiting for query cache lock | 0.000066 |
| init | 0.000061 |
| optimizing | 0.000055 |
| statistics | 0.000127 |
| preparing | 0.000062 |
| executing | 0.000045 |
| Sending data | 0.000198 |
| end | 0.000052 |
| query end | 0.000049 |
| closing tables | 0.000051 |
| freeing items | 0.000052 |
| Waiting for query cache lock | 0.000048 |
| freeing items | 0.000059 |
| Waiting for query cache lock | 0.000048 |
| freeing items | 0.000051 |
| storing result in query cache | 0.000095 |
| logging slow query | 0.000050 |
| cleaning up | 0.000049 |
+--------------------------------+----------+
24 rows in set (0.00 sec)
show profile for query 8;
+--------------------------------+----------+
| Status | Duration |
+--------------------------------+----------+
| starting | 0.000164 |
| Waiting for query cache lock | 0.000053 |
| checking query cache for query | 0.000079 |
| checking permissions | 0.000052 |
| Opening tables | 0.000059 |
| System lock | 0.000052 |
| Waiting for query cache lock | 0.000067 |
| init | 0.000059 |
| optimizing | 0.000054 |
| statistics | 0.000131 |
| preparing | 0.000062 |
| executing | 0.000049 |
| Sending data | 1.363197 |
| end | 0.000188 |
| query end | 0.000037 |
| closing tables | 0.000041 |
| freeing items | 0.000068 |
| logging slow query | 0.000034 |
| cleaning up | 0.000034 |
+--------------------------------+----------+
19 rows in set (0.00 sec)
从上面的结果我们可以看出,实际的需求只有20行,但是在大分页的情况下mysql会多返回很多没有用的数据,最后也被抛弃掉了,这实际上就是做了很多无用功。
那么怎么解决呢? 基本上,在有条件的前提下都通过定位上一页的节点来进行解决。
第一页等于:select * from table limit 0,20;
第二页等于:select * from table where x > 20_id limit 0,20;
第三页等于:select * from table where x > 40_id limit 0,20;
依次类推,所有sql实际返回的数据都只有需要的。避免的无效的浪费,执行效率自然提高了。
ps:以上方案只是理论上的解决方法,在某些场景下由于业务需求可能无法做到定位也就无法使用这种解决方法了。(这种情况最好push产品制定规则只能一页一页翻,不许按照页号查询,这是最高大上的解决方案 ^_^)
- 同样,我们再次对比执行时间,explain结果和show profile结果,可以明显看出,使用定位条件之后,执行时间缩短,扫描行数降低,sending data的时间减少。
select cmt_id from cmt_timeline_9.status_cmt_1401 where status_id=3664180852723027 and vflag in (0, 6) and cmt_id > 3666228893686312 limit 0,20;
+------------------+
| cmt_id |
+------------------+
| 3666228893960193 |
| 3666228893960306 |
| 3666228894114323 |
| 3666228894114562 |
| 3666228894114954 |
| 3666228894114979 |
| 3666228897881511 |
| 3666228897881872 |
| 3666228898309690 |
| 3666228898309714 |
| 3666228898309957 |
| 3666228898310091 |
| 3666228901648690 |
| 3666228901648999 |
| 3666228902076531 |
| 3666228902350928 |
| 3666228902351339 |
| 3666228902504833 |
| 3666228902505082 |
| 3666228902505405 |
+------------------+
20 rows in set (0.00 sec)
select cmt_id from cmt_timeline_9.status_cmt_1401 where status_id=3664180852723027 and vflag in (0, 6) limit 1000020,20;
+------------------+
| cmt_id |
+------------------+
| 3666228893960193 |
| 3666228893960306 |
| 3666228894114323 |
| 3666228894114562 |
| 3666228894114954 |
| 3666228894114979 |
| 3666228897881511 |
| 3666228897881872 |
| 3666228898309690 |
| 3666228898309714 |
| 3666228898309957 |
| 3666228898310091 |
| 3666228901648690 |
| 3666228901648999 |
| 3666228902076531 |
| 3666228902350928 |
| 3666228902351339 |
| 3666228902504833 |
| 3666228902505082 |
| 3666228902505405 |
+------------------+
20 rows in set (0.50 sec)
explain select cmt_id from cmt_timeline_9.status_cmt_1401 where status_id=3664180852723027 and vflag in (0, 6) and cmt_id > 3666228893686312 limit 0,20\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: status_cmt_1401
type: range
possible_keys: PRIMARY,idx_cmtid
key: PRIMARY
key_len: 17
ref: NULL
rows: 2866505
Extra: Using where; Using index
1 row in set (0.00 sec)
explain select cmt_id from cmt_timeline_9.status_cmt_1401 where status_id=3664180852723027 and vflag in (0, 6) limit 1000020,20\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: status_cmt_1401
type: range
possible_keys: PRIMARY
key: PRIMARY
key_len: 9
ref: NULL
rows: 5704423
Extra: Using where; Using index
1 row in set (0.00 sec)
show profile for query 4;
+--------------------------------+----------+
| Status | Duration |
+--------------------------------+----------+
| starting | 0.000012 |
| Waiting for query cache lock | 0.000003 |
| checking query cache for query | 0.000033 |
| checking permissions | 0.000007 |
| Opening tables | 0.000014 |
| System lock | 0.000009 |
| Waiting for query cache lock | 0.000015 |
| init | 0.000017 |
| optimizing | 0.000011 |
| statistics | 0.000101 |
| preparing | 0.000015 |
| executing | 0.000003 |
| Sending data | 0.000048 |
| end | 0.000005 |
| query end | 0.000004 |
| closing tables | 0.000006 |
| freeing items | 0.000007 |
| Waiting for query cache lock | 0.000003 |
| freeing items | 0.000009 |
| Waiting for query cache lock | 0.000003 |
| freeing items | 0.000003 |
| storing result in query cache | 0.000003 |
| logging slow query | 0.000003 |
| cleaning up | 0.000003 |
+--------------------------------+----------+
24 rows in set (0.00 sec)
show profile for query 1;
+--------------------------------+----------+
| Status | Duration |
+--------------------------------+----------+
| starting | 0.000018 |
| Waiting for query cache lock | 0.000003 |
| checking query cache for query | 0.000036 |
| checking permissions | 0.000005 |
| Opening tables | 0.000013 |
| System lock | 0.000007 |
| Waiting for query cache lock | 0.000018 |
| init | 0.000014 |
| optimizing | 0.000013 |
| statistics | 0.000092 |
| preparing | 0.000015 |
| executing | 0.000004 |
| Sending data | 0.498155 |
| end | 0.000011 |
| query end | 0.000006 |
| closing tables | 0.000011 |
| freeing items | 0.000019 |
| logging slow query | 0.000003 |
| cleaning up | 0.000004 |
+--------------------------------+----------+
19 rows in set (0.01 sec)
以上情况只是一种最普通的分页,如果在加上order by排序会发生申请情况呢?
MySQL会将所有符合条件的数据全部从磁盘中读取出来进行排序,这时候如果你的结果集非常的大,那么肯定会很不幸的用到file sort,执行效率会更加低效。
- 我们可以看出,增加了order by之后,执行时间上升,explain中使用了use filesort。
select cmt_id from cmt_timeline_9.status_cmt_1401 where status_id=3664180852723027 and vflag in (0, 6) order by cmt_id DESC limit 1000000,20;
+------------------+
| cmt_id |
+------------------+
| 3666920111261527 |
| 3666920110631298 |
| 3666920110488548 |
| 3666920110488450 |
| 3666920107065370 |
| 3666920106436281 |
| 3666920106435875 |
| 3666920102442166 |
| 3666920102240747 |
| 3666920102240571 |
| 3666920098673971 |
| 3666920098245760 |
| 3666920097903462 |
| 3666920094479312 |
| 3666920094050624 |
| 3666920093850769 |
| 3666920090283766 |
| 3666920090283278 |
| 3666920089855420 |
| 3666920089513609 |
+------------------+
20 rows in set (2.15 sec)
select cmt_id from cmt_timeline_9.status_cmt_1401 where status_id=3664180852723027 and vflag in (0, 6) limit 1000000,20;
+------------------+
| cmt_id |
+------------------+
| 3666228885570033 |
| 3666228885570295 |
| 3666228885724162 |
| 3666228885724441 |
| 3666228885724588 |
| 3666228889063746 |
| 3666228889064071 |
| 3666228889491318 |
| 3666228889765049 |
| 3666228889765102 |
| 3666228889765287 |
| 3666228889919257 |
| 3666228889919632 |
| 3666228889919681 |
| 3666228889919751 |
| 3666228893258463 |
| 3666228893258600 |
| 3666228893258942 |
| 3666228893686221 |
| 3666228893686312 |
+------------------+
20 rows in set (0.51 sec)
explain select cmt_id from cmt_timeline_9.status_cmt_1401 where status_id=3664180852723027 and vflag in (0, 6) order by cmt_id DESC limit 1000000,20\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: status_cmt_1401
type: range
possible_keys: PRIMARY
key: PRIMARY
key_len: 9
ref: NULL
rows: 5758999
Extra: Using where; Using index; Using filesort
1 row in set (0.00 sec)
explain select cmt_id from cmt_timeline_9.status_cmt_1401 where status_id=3664180852723027 and vflag in (0, 6) limit 1000000,20\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: status_cmt_1401
type: range
possible_keys: PRIMARY
key: PRIMARY
key_len: 9
ref: NULL
rows: 5766865
Extra: Using where; Using index
1 row in set (0.00 sec)
从下面结果可以明显看出,增加order by之后慢就慢在了file sort,有非常高的sort result时间。
+--------------------------------+----------+
| Status | Duration |
+--------------------------------+----------+
| starting | 0.000147 |
| Waiting for query cache lock | 0.000038 |
| checking query cache for query | 0.000073 |
| checking permissions | 0.000043 |
| Opening tables | 0.000052 |
| System lock | 0.000039 |
| Waiting for query cache lock | 0.000052 |
| init | 0.000048 |
| optimizing | 0.000041 |
| statistics | 0.000108 |
| preparing | 0.000047 |
| executing | 0.000034 |
| Sorting result | 1.952326 |
| Sending data | 0.049525 |
| end | 0.000119 |
| query end | 0.000035 |
| closing tables | 0.000037 |
| freeing items | 0.000048 |
| logging slow query | 0.000034 |
| cleaning up | 0.000034 |
+--------------------------------+----------+
20 rows in set (0.00 sec)
+--------------------------------+----------+
| Status | Duration |
+--------------------------------+----------+
| starting | 0.000238 |
| Waiting for query cache lock | 0.000052 |
| checking query cache for query | 0.000083 |
| checking permissions | 0.000053 |
| Opening tables | 0.000064 |
| System lock | 0.000055 |
| Waiting for query cache lock | 0.000067 |
| init | 0.000062 |
| optimizing | 0.000056 |
| statistics | 0.000136 |
| preparing | 0.000062 |
| executing | 0.000054 |
| Sending data | 0.536962 |
| end | 0.000162 |
| query end | 0.000050 |
| closing tables | 0.000054 |
| freeing items | 0.000065 |
| logging slow query | 0.000048 |
| cleaning up | 0.000049 |
+--------------------------------+----------+
19 rows in set (0.00 sec)
针对增加了order by的场景,使用定位的方法一样可以解决,只是效果不会很明显,这是由于无论如何都需要进行排序,都会使用file sort。添加定位点之后只会降低扫描行数。
select cmt_id from cmt_timeline_9.status_cmt_1401 where status_id=3664180852723027 and vflag in (0, 6) order by cmt_id DESC limit 1000020,20 ;
+------------------+
| cmt_id |
+------------------+
| 3666920626622611 |
| 3666920626622443 |
| 3666920626211039 |
| 3666920626210922 |
| 3666920622016235 |
| 3666920622015997 |
| 3666920622015925 |
| 3666920618487944 |
| 3666920617821349 |
| 3666920614037035 |
| 3666920609842026 |
| 3666920609430825 |
| 3666920605900993 |
| 3666920605900800 |
| 3666920605647290 |
| 3666920605236000 |
| 3666920601868466 |
| 3666920601705343 |
| 3666920601452150 |
| 3666920601452003 |
+------------------+
20 rows in set (1.91 sec)
select cmt_id from cmt_timeline_9.status_cmt_1401 where status_id=3664180852723027 and vflag in (0, 6) and cmt_id > 3666228893686312 order by cmt_id DESC limit 0,20 ;
+------------------+
| cmt_id |
+------------------+
| 3671683229693241 |
| 3671683225498045 |
| 3671683175152930 |
| 3671683145598353 |
| 3671683137757342 |
| 3671683137393312 |
| 3671683108024528 |
| 3671683099634319 |
| 3671683049288683 |
| 3671682998943116 |
| 3671682985938981 |
| 3671682969575762 |
| 3671682957353115 |
| 3671682953157056 |
| 3671682927984660 |
| 3671682923426458 |
| 3671682923014799 |
| 3671682919592618 |
| 3671682915396323 |
| 3671682902810065 |
+------------------+
20 rows in set (1.21 sec)
explain select cmt_id from cmt_timeline_9.status_cmt_1401 where status_id=3664180852723027 and vflag in (0, 6) order by cmt_id DESC limit 1000020,20\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: status_cmt_1401
type: range
possible_keys: PRIMARY
key: PRIMARY
key_len: 9
ref: NULL
rows: 5321277
Extra: Using where; Using index; Using filesort
1 row in set (0.00 sec)
explain select cmt_id from cmt_timeline_9.status_cmt_1401 where status_id=3664180852723027 and vflag in (0, 6) and cmt_id > 3666228893686312 order by cmt_id DESC limit 0,20 \G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: status_cmt_1401
type: range
possible_keys: PRIMARY
key: PRIMARY
key_len: 17
ref: NULL
rows: 2628613
Extra: Using where; Using index; Using filesort
1 row in set (0.01 sec)
针对这种order by的优化就是消除use filesort,不让mysql创建tmp table,减少io消耗。
我们可以尝试针对order by的字段增加一个索引来予以解决。alter table add index idx_cmtid (cmt_id);
但是实际结果发现,单独增加cmt_id的索引之后,并没有很明显的改进,主要由于mysql的查询解析器在大分页的情况下并没有使用idx_cmtid,依然还是使用primary key。所以就没有区别。
select cmt_id from cmt_timeline_9.status_cmt_1401 where status_id=3664180852723027 and vflag in (0, 6) order by cmt_id DESC limit 1000020,20;
+------------------+
| cmt_id |
+------------------+
| 3666920823375353 |
| 3666920823375040 |
| 3666920819593858 |
| 3666920819180271 |
| 3666920814985444 |
| 3666920811641573 |
| 3666920810790571 |
| 3666920810790479 |
| 3666920810790348 |
| 3666920807446258 |
| 3666920807446048 |
| 3666920806595605 |
| 3666920806595355 |
| 3666920803250667 |
| 3666920798206103 |
| 3666920794424604 |
| 3666920789815677 |
| 3666920786469259 |
| 3666920786468699 |
| 3666920781838773 |
+------------------+
20 rows in set (1.91 sec)
explain select cmt_id from cmt_timeline_9.status_cmt_1401 where status_id=3664180852723027 and vflag in (0, 6) order by cmt_id DESC limit 1000020,20\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: status_cmt_1401
type: range
possible_keys: PRIMARY
key: PRIMARY
key_len: 9
ref: NULL
rows: 5515441
Extra: Using where; Using index; Using filesort
1 row in set (0.00 sec)
但是,增加这个索引也并不是完全没有作用。
我们发现如果不增加这个索引,在小分页的时候mysql依然会使用filesort将所有数据全部进行排序,这导致了在增加order by之后,limit 10,20和limit 100000,20的执行时间都非常慢。
- 从下面的结果我们可以看出,虽然时间消耗不同,但都是用了filesort。
select cmt_id from cmt_timeline_9.status_cmt_1401 where status_id=3664180852723027 and vflag in (0, 6) order by cmt_id DESC limit 1000020,20;
+------------------+
| cmt_id |
+------------------+
| 3666920953421278 |
| 3666920950094530 |
| 3666920950093869 |
| 3666920949934669 |
| 3666920949933693 |
| 3666920949641608 |
| 3666920949641219 |
| 3666920949226311 |
| 3666920949226309 |
| 3666920945898475 |
| 3666920945446511 |
| 3666920945446375 |
| 3666920945030784 |
| 3666920941703471 |
| 3666920941703025 |
| 3666920940836296 |
| 3666920940835649 |
| 3666920937347808 |
| 3666920937347742 |
| 3666920933312224 |
+------------------+
20 rows in set (3.46 sec)
select cmt_id from cmt_timeline_9.status_cmt_1401 where status_id=3664180852723027 and vflag in (0, 6) order by cmt_id DESC limit 20,20;
+------------------+
| cmt_id |
+------------------+
| 3671685629083937 |
| 3671685620311641 |
| 3671685620311585 |
| 3671685603529356 |
| 3671685582553113 |
| 3671685578738053 |
| 3671685578737935 |
| 3671685561837734 |
| 3671685557174711 |
| 3671685511611403 |
| 3671685507416720 |
| 3671685461266277 |
| 3671685452301297 |
| 3671685448679391 |
| 3671685444366409 |
| 3671685444366252 |
| 3671685440170408 |
| 3671685436093519 |
| 3671685427130993 |
| 3671685415115441 |
+------------------+
20 rows in set (1.80 sec)
explain select cmt_id from cmt_timeline_9.status_cmt_1401 where status_id=3664180852723027 and vflag in (0, 6) order by cmt_id DESC limit 1000020,20\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: status_cmt_1401
type: range
possible_keys: PRIMARY
key: PRIMARY
key_len: 9
ref: NULL
rows: 5192497
Extra: Using where; Using index; Using filesort
1 row in set (0.01 sec)
explain select cmt_id from cmt_timeline_9.status_cmt_1401 where status_id=3664180852723027 and vflag in (0, 6) order by cmt_id DESC limit 20,20\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: status_cmt_1401
type: range
possible_keys: PRIMARY
key: PRIMARY
key_len: 9
ref: NULL
rows: 5192497
Extra: Using where; Using index; Using filesort
1 row in set (0.00 sec)
而我们查看profile结果发现最大的差别在sort result这里,这就是由于m的不同导致。m越大,sort result的时间越长。
show profile for query 27;
+--------------------------------+----------+
| Status | Duration |
+--------------------------------+----------+
| starting | 0.000155 |
| Waiting for query cache lock | 0.000049 |
| checking query cache for query | 0.000084 |
| checking permissions | 0.000028 |
| Opening tables | 0.000112 |
| System lock | 0.000061 |
| Waiting for query cache lock | 0.000068 |
| init | 0.000062 |
| optimizing | 0.000058 |
| statistics | 0.006505 |
| preparing | 0.000067 |
| executing | 0.000049 |
| Sorting result | 3.406519 |
| Sending data | 0.048499 |
| end | 0.000180 |
| query end | 0.000037 |
| closing tables | 0.000037 |
| freeing items | 0.000050 |
| logging slow query | 0.000035 |
| cleaning up | 0.000036 |
+--------------------------------+----------+
20 rows in set (0.01 sec)
show profile for query 28;
+--------------------------------+----------+
| Status | Duration |
+--------------------------------+----------+
| starting | 0.000161 |
| Waiting for query cache lock | 0.000054 |
| checking query cache for query | 0.000084 |
| checking permissions | 0.000052 |
| Opening tables | 0.000060 |
| System lock | 0.000058 |
| Waiting for query cache lock | 0.000037 |
| init | 0.000115 |
| optimizing | 0.000056 |
| statistics | 0.000129 |
| preparing | 0.000064 |
| executing | 0.000169 |
| Sorting result | 1.798130 |
| Sending data | 0.000177 |
| end | 0.000041 |
| query end | 0.000038 |
| closing tables | 0.000042 |
| freeing items | 0.000039 |
| logging slow query | 0.000090 |
| cleaning up | 0.000036 |
+--------------------------------+----------+
20 rows in set (0.00 sec)
在增加了这个cmt_id索引之后,我们可以发现,在小分页的时候mysql会自动选择cmt_id这个索引,会让执行时间变为ms级别。
- 从下面的结果可以明显看出,小分页的执行时间非常多,explain中由于使用了cmt_id索引也不在有use filesort,并且profile中也不有sort result了。效果非常明显。
select cmt_id from cmt_timeline_9.status_cmt_1401 where status_id=3664180852723027 and vflag in (0, 6) order by cmt_id DESC limit 1000020,20;
+------------------+
| cmt_id |
+------------------+
| 3666921226995030 |
| 3666921226994136 |
| 3666921226253946 |
| 3666921226253491 |
| 3666921222798659 |
| 3666921222058888 |
| 3666921221902007 |
| 3666921221901859 |
| 3666921217863864 |
| 3666921217706833 |
| 3666921213988237 |
| 3666921213668596 |
| 3666921213668585 |
| 3666921213668410 |
| 3666921213511559 |
| 3666921210212272 |
| 3666921209473576 |
| 3666921209473377 |
| 3666921206017425 |
| 3666921206017068 |
+------------------+
20 rows in set (1.95 sec)
select cmt_id from cmt_timeline_9.status_cmt_1401 where status_id=3664180852723027 and vflag in (0, 6) order by cmt_id DESC limit 20,20;
+------------------+
| cmt_id |
+------------------+
| 3671688006880471 |
| 3671687998428664 |
| 3671687990097514 |
| 3671687990038088 |
| 3671687990037931 |
| 3671687977896250 |
| 3671687973313916 |
| 3671687973257690 |
| 3671687960727327 |
| 3671687956530995 |
| 3671687940009625 |
| 3671687931741716 |
| 3671687914959415 |
| 3671687914526127 |
| 3671687877196165 |
| 3671687847144510 |
| 3671687818455925 |
| 3671687818333168 |
| 3671687813843641 |
| 3671687805193819 |
+------------------+
20 rows in set (0.00 sec)
explain select cmt_id from cmt_timeline_9.status_cmt_1401 where status_id=3664180852723027 and vflag in (0, 6) order by cmt_id DESC limit 1000020,20\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: status_cmt_1401
type: range
possible_keys: PRIMARY
key: PRIMARY
key_len: 9
ref: NULL
rows: 5514589
Extra: Using where; Using index; Using filesort
1 row in set (0.00 sec)
explain select cmt_id from cmt_timeline_9.status_cmt_1401 where status_id=3664180852723027 and vflag in (0, 6) order by cmt_id DESC limit 20,20\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: status_cmt_1401
type: index
possible_keys: PRIMARY
key: idx_cmtid
key_len: 8
ref: NULL
rows: 454
Extra: Using where; Using index
1 row in set (0.00 sec)
show profile for query 13;
+--------------------------------+----------+
| Status | Duration |
+--------------------------------+----------+
| starting | 0.000017 |
| Waiting for query cache lock | 0.000004 |
| checking query cache for query | 0.000040 |
| checking permissions | 0.000006 |
| Opening tables | 0.000013 |
| System lock | 0.000009 |
| Waiting for query cache lock | 0.000019 |
| init | 0.000087 |
| optimizing | 0.000015 |
| statistics | 0.000090 |
| preparing | 0.000020 |
| executing | 0.000004 |
| Sorting result | 1.890079 |
| Sending data | 0.053175 |
| end | 0.000011 |
| query end | 0.000005 |
| closing tables | 0.000009 |
| freeing items | 0.000020 |
| logging slow query | 0.000003 |
| cleaning up | 0.000007 |
+--------------------------------+----------+
20 rows in set (0.00 sec)
show profile for query 14;
+--------------------------------+----------+
| Status | Duration |
+--------------------------------+----------+
| starting | 0.000057 |
| Waiting for query cache lock | 0.000004 |
| checking query cache for query | 0.000042 |
| checking permissions | 0.000007 |
| Opening tables | 0.000013 |
| System lock | 0.000008 |
| Waiting for query cache lock | 0.000019 |
| init | 0.000017 |
| optimizing | 0.000015 |
| statistics | 0.000087 |
| preparing | 0.000021 |
| executing | 0.000003 |
| Sorting result | 0.000008 |
| Sending data | 0.001283 |
| end | 0.000004 |
| query end | 0.000004 |
| closing tables | 0.000006 |
| freeing items | 0.000008 |
| Waiting for query cache lock | 0.000002 |
| freeing items | 0.000012 |
| Waiting for query cache lock | 0.000003 |
| freeing items | 0.000002 |
| storing result in query cache | 0.000004 |
| logging slow query | 0.000002 |
| cleaning up | 0.000003 |
+--------------------------------+----------+
25 rows in set (0.00 sec)
可以看出在增加了order by字段的索引之后,小分页的问题得到了解决,虽然没有解决大分页的问题,但是依然会降低服务器的消耗。
如果我们强制使用cmt_id索引的话,其扫描行数会等于m值。
explain select cmt_id from cmt_timeline_9.status_cmt_1401 FORCE INDEX (idx_cmtid) where status_id=3664180852723027 and vflag in (0, 6) order by cmt_id DESC limit 1000000,20\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: status_cmt_1401
type: index
possible_keys: NULL
key: idx_cmtid
key_len: 8
ref: NULL
rows: 1000020
Extra: Using where; Using index
1 row in set (0.00 sec)
但是时间增加了,我们可以看到主要时间消耗在sending data上,这说明从磁盘中读取的数据更多了,应该是由于对整个cmt_id排序后需要依次过滤到status_id,最终导致读取数据的增多。这么看来mysql的解析器的判断是正确的。
select cmt_id from cmt_timeline_9.status_cmt_1401 FORCE INDEX (idx_cmtid) where status_id=3664180852723027 and vflag in (0, 6) and cmt_id > 3666921377612353 order by cmt_id DESC limit 1000020,20;
+------------------+
| cmt_id |
+------------------+
| 3666921612039086 |
| 3666921608362788 |
| 3666921607844184 |
| 3666921607844029 |
| 3666921604322638 |
| 3666921604167533 |
| 3666921599615892 |
| 3666921599615198 |
| 3666921591581352 |
| 3666921591225463 |
| 3666921591063960 |
| 3666921587030689 |
| 3666921587030392 |
| 3666921587030118 |
| 3666921582674207 |
| 3666921582673920 |
| 3666921579149595 |
| 3666921578993892 |
| 3666921578478775 |
| 3666921574445235 |
+------------------+
20 rows in set (9.47 sec)
show profile for query 27;
+--------------------------------+----------+
| Status | Duration |
+--------------------------------+----------+
| starting | 0.000110 |
| Waiting for query cache lock | 0.000004 |
| checking query cache for query | 0.000047 |
| checking permissions | 0.000005 |
| Opening tables | 0.000013 |
| System lock | 0.000008 |
| Waiting for query cache lock | 0.000020 |
| init | 0.000019 |
| optimizing | 0.000014 |
| statistics | 0.000016 |
| preparing | 0.000018 |
| executing | 0.000003 |
| Sorting result | 0.000006 |
| Sending data | 8.362638 |
| end | 0.000012 |
| query end | 0.000003 |
| closing tables | 0.000006 |
| freeing items | 0.000019 |
| logging slow query | 0.000002 |
| cleaning up | 0.000003 |
+--------------------------------+----------+
20 rows in set (0.00 sec)
为了避免上面的情况,增加status_id的命中率,我们增加status_id和cmt_id的联合索引,看看结果如何。alter table add index idx_status_cmt(status_id,cmt_id);
select cmt_id from cmt_timeline_9.status_cmt_1401 where status_id=3664180852723027 and vflag in (0, 6) order by cmt_id DESC limit 1000020,20;
+------------------+
| cmt_id |
+------------------+
| 3666921910051463 |
| 3666921909886784 |
| 3666921909886768 |
| 3666921909886508 |
| 3666921905856903 |
| 3666921905691463 |
| 3666921901496667 |
| 3666921901496661 |
| 3666921901496260 |
| 3666921898006342 |
| 3666921897590963 |
| 3666921897466778 |
| 3666921897466470 |
| 3666921897466340 |
| 3666921897301633 |
| 3666921897301238 |
| 3666921893658110 |
| 3666921893271419 |
| 3666921888911815 |
| 3666921888911674 |
+------------------+
20 rows in set (0.63 sec)
xplain select cmt_id from cmt_timeline_9.status_cmt_1401 where status_id=3664180852723027 and vflag in (0, 6) order by cmt_id DESC limit 1000020,20\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: status_cmt_1401
type: ref
possible_keys: PRIMARY,idx_status_cmt
key: idx_status_cmt
key_len: 8
ref: const
rows: 4992808
Extra: Using where; Using index
1 row in set (0.01 sec)
select cmt_id from cmt_timeline_9.status_cmt_1401 where status_id=3664180852723027 and vflag in (0, 6) order by cmt_id DESC limit 1000020,20;
+------------------+
| cmt_id |
+------------------+
| 3666922094205570 |
| 3666922090997708 |
| 3666922090996574 |
| 3666922090010742 |
| 3666922086801253 |
| 3666922085815383 |
| 3666922082049661 |
| 3666922081620430 |
| 3666922078410803 |
| 3666922077996538 |
| 3666922077425771 |
| 3666922077425107 |
| 3666922073800576 |
| 3666922073659391 |
| 3666922070018704 |
| 3666922069464422 |
| 3666922065408535 |
| 3666922065269076 |
| 3666922065268964 |
| 3666922064840328 |
+------------------+
20 rows in set (4.36 sec)
explain select cmt_id from cmt_timeline_9.status_cmt_1401 where status_id=3664180852723027 and vflag in (0, 6) order by cmt_id DESC limit 1000020,20\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: status_cmt_1401
type: range
possible_keys: PRIMARY
key: PRIMARY
key_len: 9
ref: NULL
rows: 5434753
Extra: Using where; Using index; Using filesort
1 row in set (0.00 sec)
可以明显看出,增加了status_id和cmt_id的联合索引之后,mysql解析器在大分页的时候自动选择使用这个索引,并不会出现filesort,执行时间大缩短,而小分页就更不在话下了。
啰啰嗦嗦说了这么多,能看到这里的读者你一定很有耐心。
总结:
1、针对大分页带order by的查询SQL的优化方法是,添加等值限定字段+order by字段的联合索引,不要去管范围查询的字段。(status_id+cmt_id)不要管flag。
2、增加定位查询,通过对上一页的定位来查找下一页,缩小mysql需要sending data的量。