一、什么是Table Metadata Lock

在MySQL以前的版本中,存在这样一个bug:

Description:
If user1 has an active transaction on a table and then user2 drops this table, then user1 does COMMIT, then in the binlog we have something like:
DROP TABLE t;
BEGIN;
INSERT INTO t … ;
COMMIT;
which is wrong.

MySQL官方文档链接:
http://bugs.mysql.com/bug.php?id=989

这个bug大致意思是说:当一个会话在主库执行DML操作还没提交时,另一个会话对同一个对象执行了DDL操作如drop table,而由于MySQL的binlog是基于事务提交的先后顺序进行记录的,因此在从库上应用时,就出现了先drop table,然后再向table中insert的情况,导致从库应用出错。

因此,MySQL在5.5.3版本后引入了Metadata lock,只有在事务结束后才会释放Metadata lock,因此在事务提交或回滚前,是无法进行DDL操作的。

MySQL官方文档位置:
http://dev.mysql.com/doc/refman/5.6/en/metadata-locking.html

 

二、遇到的Metadata Lock问题

前几天时候,公司的开发同事让对一张表添加字段,由于该表数据量很大,因此使用了pt-online-change-schema工具进行字段的添加,在添加的过程中发现进度非常慢,通过shop processlist发现以及积累了大量的Metadata lock:Waiting for table metadata lock

这些语句很明显是被添加字段的DDL所阻塞,但是DDL又是被谁阻塞了呢?

查询当前正在进行的事务:

mysql> select * from information_schema.innodb_trx\G
*************************** 1. row ***************************
                    trx_id: 7202
                 trx_state: RUNNING
               trx_started: 2016-07-20 23:02:53
     trx_requested_lock_id: NULL
          trx_wait_started: NULL
                trx_weight: 0
       trx_mysql_thread_id: 52402
                 trx_query: NULL
       trx_operation_state: NULL
         trx_tables_in_use: 0
         trx_tables_locked: 0
          trx_lock_structs: 0
     trx_lock_memory_bytes: 360
           trx_rows_locked: 0
         trx_rows_modified: 0
   trx_concurrency_tickets: 0
       trx_isolation_level: READ COMMITTED
         trx_unique_checks: 1
    trx_foreign_key_checks: 1
trx_last_foreign_key_error: NULL
 trx_adaptive_hash_latched: 0
 trx_adaptive_hash_timeout: 10000
          trx_is_read_only: 0
trx_autocommit_non_locking: 0
1 row in set (0.00 sec)

发现一个正在运行的事务,从trx_started字段可以判断出,该事务已经运行了很久,一直没有结束,看来就是这个事务阻塞了添加字段的DDL语句。

根据查询到的trx_started时间以及trx_mysql_thread_id到MySQL的general log中查找,当然前提是开启了general log的功能,在general日志中对应的时间发现该thread执行了语句:

set autocommit=0;
  •  

关闭了自动提交,再往下看,oh my god……下面居然是一堆SELECT语句!

好了,终于找到原因,kill掉先:

mysql> kill 52402;
Query OK, 0 rows affected (0.00 sec)
  •  

之后便可以正常执行下去了。

 

三、测试验证

为了进一步确认问题的原因并验证,进行模拟测试:

会话1:显式开启事务,执行SELECT:

mysql> begin;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from test;
+----------+
| date     |
+----------+
| 20150616 |
| 20150617 |
| 20150619 |
+----------+
3 rows in set (0.00 sec)
  •  

会话2:对test表执行DDL:

mysql> alter table test add index `date`(`date`);

语句被阻塞,show processlist查看状态:

mysql> show processlist;
+-------+-------------+-----------+------+---------+--------+-----------------------------------------------------------------------------+-------------------------------------------+
| Id    | User        | Host      | db   | Command | Time   | State                                                                       | Info                                      |
+-------+-------------+-----------+------+---------+--------+-----------------------------------------------------------------------------+-------------------------------------------+
|    16 | system user |           | NULL | Connect | 540155 | Waiting for master to send event                                            | NULL                                      |
|    17 | system user |           | NULL | Connect | 529732 | Slave has read all relay log; waiting for the slave I/O thread to update it | NULL                                      |
| 51673 | root        | localhost | test | Sleep   |     55 |                                                                             | NULL                                      |
| 51681 | root        | localhost | test | Query   |      0 | init                                                                        | show processlist                          |
| 51683 | root        | localhost | test | Query   |     29 | Waiting for table metadata lock                                             | alter table test add index `date`(`date`) |
+-------+-------------+-----------+------+---------+--------+-----------------------------------------------------------------------------+-------------------------------------------+
5 rows in set (0.00 sec)

可以看到alter table语句的状态为Waiting for table metadata lock

会话3:对test表进行查询:

mysql> select * from test;

同样被阻塞:

mysql> show processlist;
+-------+-------------+-----------+------+---------+--------+-----------------------------------------------------------------------------+-------------------------------------------+
| Id    | User        | Host      | db   | Command | Time   | State                                                                       | Info                                      |
+-------+-------------+-----------+------+---------+--------+-----------------------------------------------------------------------------+-------------------------------------------+
|    16 | system user |           | NULL | Connect | 540305 | Waiting for master to send event                                            | NULL                                      |
|    17 | system user |           | NULL | Connect | 529882 | Slave has read all relay log; waiting for the slave I/O thread to update it | NULL                                      |
| 51673 | root        | localhost | test | Sleep   |    205 |                                                                             | NULL                                      |
| 51681 | root        | localhost | test | Query   |      0 | init                                                                        | show processlist                          |
| 51683 | root        | localhost | test | Query   |    179 | Waiting for table metadata lock                                             | alter table test add index `date`(`date`) |
| 51703 | root        | localhost | test | Query   |     18 | Waiting for table metadata lock                                             | select * from test                        |
+-------+-------------+-----------+------+---------+--------+-----------------------------------------------------------------------------+-------------------------------------------+
6 rows in set (0.00 sec)

接下来我们将会话1的事务提交,效果如下:

  • 会话1:
mysql> begin;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from test;
+----------+
| date     |
+----------+
| 20150616 |
| 20150617 |
| 20150619 |
+----------+
3 rows in set (0.00 sec)

mysql> commit;
Query OK, 0 rows affected (0.00 sec)
  • 会话2:
mysql> alter table test add index `date`(`date`);
Query OK, 0 rows affected (3 min 49.87 sec)
Records: 0  Duplicates: 0  Warnings: 0
  • 会话3:
mysql> select * from test;
+----------+
| date     |
+----------+
| 20150616 |
| 20150617 |
| 20150619 |
+----------+
3 rows in set (1 min 8.27 sec)
  • show processlist:
mysql> show processlist;
+-------+-------------+-----------+------+---------+--------+-----------------------------------------------------------------------------+------------------+
| Id    | User        | Host      | db   | Command | Time   | State                                                                       | Info             |
+-------+-------------+-----------+------+---------+--------+-----------------------------------------------------------------------------+------------------+
|    16 | system user |           | NULL | Connect | 540411 | Waiting for master to send event                                            | NULL             |
|    17 | system user |           | NULL | Connect | 529988 | Slave has read all relay log; waiting for the slave I/O thread to update it | NULL             |
| 51673 | root        | localhost | test | Sleep   |     55 |                                                                             | NULL             |
| 51681 | root        | localhost | test | Query   |      0 | init                                                                        | show processlist |
| 51683 | root        | localhost | test | Sleep   |    285 |                                                                             | NULL             |
| 51703 | root        | localhost | test | Sleep   |    124 |                                                                             | NULL             |
+-------+-------------+-----------+------+---------+--------+-----------------------------------------------------------------------------+------------------+
6 rows in set (0.00 sec)

可以看到,当会话1提交事务后,会话2和会话3的语句便可以正常执行了,由于被阻塞的原因,因此执行时间分别为 ( 3 min 49.87 sec )( 1 min 8.27 sec )

 

四、总结

  • 对于纯SELECT操作来说,完全没有必要添加事务,MySQL的innodb是基于MVCC多版本控制,加事务没有任何意义
  • 需要使用到事务时,也要尽量缩小事务的运行时间,一个事务中不要包含太多的语句