测试环境:
MySQL5.7
pt-online-schema-change 3.2.0
第一次加索引,使用MySQL原生的online DDL。
做一个超级慢的update,模拟长事务。
1 | mysql> update test set user_id= 'aa' where user_id like '%123%' ; |
此时开始加索引:
1 | mysql> alter table test add index idx_uid(user_id); |
查看现场状态:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 | mysql> show processlist; + -----+-------------+----------------------+---------+-------------+---------+---------------------------------------------------------------+---------------------------------------------------------+ | Id | User | Host | db | Command | Time | State | Info | + -----+-------------+----------------------+---------+-------------+---------+---------------------------------------------------------------+---------------------------------------------------------+ | 2 | pt_checksum | server-254-163:53960 | test | Sleep | 0 | | NULL | | 299 | repl | server-254-163:15492 | NULL | Binlog Dump | 1656769 | Master has sent all binlog to slave; waiting for more updates | NULL | | 308 | repl | server-254-163:31672 | NULL | Binlog Dump | 1656282 | Master has sent all binlog to slave; waiting for more updates | NULL | | 312 | root | 172.16.118.23:58883 | handong | Sleep | 6208 | | NULL | | 313 | root | 172.16.118.23:58884 | handong | Sleep | 7097 | | NULL | | 326 | root | localhost | NULL | Query | 0 | starting | show processlist | | 329 | root | localhost | handong | Query | 9 | updating | update test set user_id= 'aa' where user_id like '%123%' | | 332 | root | localhost | handong | Query | 3 | Waiting for table metadata lock | alter table test add index idx_uid(user_id) | + -----+-------------+----------------------+---------+-------------+---------+---------------------------------------------------------------+---------------------------------------------------------+ 8 rows in set (0.00 sec) |
可以看到添加索引遇到了MDL(Waiting for table metadata lock)
新开一个窗口,继续进行简单查询:
1 | mysql> select * from test limit 1; |
此查询一直等待,未返回结果,查询线程状态:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | mysql> show processlist; + -----+-------------+----------------------+---------+-------------+---------+---------------------------------------------------------------+---------------------------------------------------------+ | Id | User | Host | db | Command | Time | State | Info | + -----+-------------+----------------------+---------+-------------+---------+---------------------------------------------------------------+---------------------------------------------------------+ | 2 | pt_checksum | server-254-163:53960 | test | Sleep | 0 | | NULL | | 299 | repl | server-254-163:15492 | NULL | Binlog Dump | 1656840 | Master has sent all binlog to slave; waiting for more updates | NULL | | 308 | repl | server-254-163:31672 | NULL | Binlog Dump | 1656353 | Master has sent all binlog to slave; waiting for more updates | NULL | | 312 | root | 172.16.118.23:58883 | handong | Sleep | 6279 | | NULL | | 313 | root | 172.16.118.23:58884 | handong | Sleep | 7168 | | NULL | | 326 | root | localhost | handong | Query | 45 | Waiting for table metadata lock | select * from test limit 1 | | 329 | root | localhost | handong | Query | 80 | updating | update test set user_id= 'aa' where user_id like '%123%' | | 332 | root | localhost | handong | Query | 74 | Waiting for table metadata lock | alter table test add index idx_uid(user_id) | | 333 | root | localhost | NULL | Query | 0 | starting | show processlist | + -----+-------------+----------------------+---------+-------------+---------+---------------------------------------------------------------+---------------------------------------------------------+ 9 rows in set (0.00 sec) |
再次新开一个窗口,继续进行简单查询:
1 | mysql> select * from test limit 1; |
查看线程状态:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | mysql> show processlist; + -----+-------------+----------------------+---------+-------------+---------+---------------------------------------------------------------+---------------------------------------------------------+ | Id | User | Host | db | Command | Time | State | Info | + -----+-------------+----------------------+---------+-------------+---------+---------------------------------------------------------------+---------------------------------------------------------+ | 2 | pt_checksum | server-254-163:53960 | test | Sleep | 0 | | NULL | | 299 | repl | server-254-163:15492 | NULL | Binlog Dump | 1656925 | Master has sent all binlog to slave; waiting for more updates | NULL | | 308 | repl | server-254-163:31672 | NULL | Binlog Dump | 1656438 | Master has sent all binlog to slave; waiting for more updates | NULL | | 312 | root | 172.16.118.23:58883 | handong | Sleep | 6364 | | NULL | | 326 | root | localhost | handong | Query | 130 | Waiting for table metadata lock | select * from test limit 1 | | 329 | root | localhost | handong | Query | 165 | updating | update test set user_id= 'aa' where user_id like '%123%' | | 332 | root | localhost | handong | Query | 159 | Waiting for table metadata lock | alter table test add index idx_uid(user_id) | | 333 | root | localhost | handong | Query | 11 | Waiting for table metadata lock | select * from test limit 1 | | 334 | root | localhost | NULL | Query | 0 | starting | show processlist | + -----+-------------+----------------------+---------+-------------+---------+---------------------------------------------------------------+---------------------------------------------------------+ 9 rows in set (0.00 sec) |
可以发现:
对于原生online DDL,在开始DDL之前,如果有长事务未释放,进行DDL的事务就会处于Waiting for table metadata lock状态,后续所有关于这张表的查询都会排队Waiting for table metadata lock。如果生产环境此表访问频繁,可能就会造成应用访问超时。
这次把刚才加的索引删掉,利用pt-online-schema-change
做一次批量更新,模拟长事务:
1 | mysql> update test set user_id= 'aaa' where user_id like '%123%' ; |
使用 pt-online-schema-change删除索引:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 | [root@server-254-163 ~]# pt-online- schema -change --user=root --password=mysql -h localhost --socket=/mysql/mysql3307/mysqld.sock --port=3307 D=handong,t=test --alter "drop index idx_uid" --no-check-replication-filters --alter-foreign-keys-method=auto --recursion-method=none --print --charset=utf8 --execute No slaves found. See --recursion-method if host server-254-163 has slaves. Not checking slave lag because no slaves were found and --check-slave-lag was not specified. Operation, tries, wait: analyze_table, 10, 1 copy_rows, 10, 0.25 create_triggers, 10, 1 drop_triggers, 10, 1 swap_tables, 10, 1 update_foreign_keys, 10, 1 No foreign keys reference `handong`.`test`; ignoring --alter-foreign-keys-method. Altering `handong`.`test`... Creating new table ... CREATE TABLE `handong`.`_test_new` ( `id` int (11) NOT NULL AUTO_INCREMENT, `user_id` varchar (32) DEFAULT NULL , PRIMARY KEY (`id`), KEY `idx_uid` (`user_id`) ) ENGINE=InnoDB AUTO_INCREMENT=95418961 DEFAULT CHARSET=utf8mb4 Created new table handong._test_new OK. Altering new table ... ALTER TABLE `handong`.`_test_new` drop index idx_uid Altered `handong`.`_test_new` OK. 2020-12-17T18:31:16 Creating triggers... |
一直处于创建trigger状态,查询此时线程状态:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 | mysql> show processlist; + -----+-------------+----------------------+---------+-------------+---------+---------------------------------------------------------------+------------------------------------------------------------------------------------------------------+ | Id | User | Host | db | Command | Time | State | Info | + -----+-------------+----------------------+---------+-------------+---------+---------------------------------------------------------------+------------------------------------------------------------------------------------------------------+ | 2 | pt_checksum | server-254-163:53960 | test | Sleep | 0 | | NULL | | 299 | repl | server-254-163:15492 | NULL | Binlog Dump | 1659526 | Master has sent all binlog to slave; waiting for more updates | NULL | | 308 | repl | server-254-163:31672 | NULL | Binlog Dump | 1659039 | Master has sent all binlog to slave; waiting for more updates | NULL | | 326 | root | localhost | handong | Query | 0 | starting | show processlist | | 329 | root | localhost | handong | Query | 14 | updating | update test set user_id= 'aaa' where user_id like '%123%' | | 332 | root | localhost | handong | Sleep | 2760 | | NULL | | 333 | root | localhost | handong | Sleep | 2612 | | NULL | | 334 | root | localhost | NULL | Sleep | 2601 | | NULL | | 335 | root | localhost | handong | Query | 8 | Waiting for table metadata lock | CREATE TRIGGER `pt_osc_handong_test_del` AFTER DELETE ON `handong`.`test` FOR EACH ROW DELETE IGNORE | | 336 | root | localhost | handong | Sleep | 8 | | NULL | + -----+-------------+----------------------+---------+-------------+---------+---------------------------------------------------------------+------------------------------------------------------------------------------------------------------+ 10 rows in set (0.00 sec) |
发现create trigger出现Waiting for table metadata lock
此时批量更新依然没有结束,重新开个窗口,进行简单查询:
1 2 3 4 5 6 7 | mysql> select * from test limit 1; + ---------+---------------------+ | id | user_id | + ---------+---------------------+ | 4812551 | 1019092022432797988 | + ---------+---------------------+ 1 row in set (0.01 sec) |
发现很快就能返回结果,测试多次都没有问题。
pt-online-schema-change解决了DDL等待DML时候,后边所有会话都堆积的情况,避免影响业务。
查看删除索引全部日志:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 | [root@server-254-163 ~] # pt-online-schema-change --version pt-online-schema-change 3.2.0 [root@server-254-163 ~] # pt-online-schema-change --user=root --password=mysql -h localhost --socket=/mysql/mysql3307/mysqld.sock --port=3307 D=handong,t=test --alter "drop index idx_uid" --no-check-replication-filters --alter-foreign-keys-method=auto --recursion-method=none --print --charset=utf8 --execute No slaves found. See --recursion-method if host server-254-163 has slaves. Not checking slave lag because no slaves were found and --check-slave-lag was not specified. Operation, tries, wait: analyze_table, 10, 1 copy_rows, 10, 0.25 create_triggers, 10, 1 drop_triggers, 10, 1 swap_tables, 10, 1 update_foreign_keys, 10, 1 No foreign keys reference `handong`.` test `; ignoring --alter-foreign-keys-method. Altering `handong`.` test `... Creating new table... CREATE TABLE `handong`.`_test_new` ( ` id ` int(11) NOT NULL AUTO_INCREMENT, `user_id` varchar(32) DEFAULT NULL, PRIMARY KEY (` id `), KEY `idx_uid` (`user_id`) ) ENGINE=InnoDB AUTO_INCREMENT=95418961 DEFAULT CHARSET=utf8mb4 Created new table handong._test_new OK. Altering new table... ALTER TABLE `handong`.`_test_new` drop index idx_uid Altered `handong`.`_test_new` OK. 2020-12-17T18:31:16 Creating triggers... 2020-12-17T18:33:20 Created triggers OK. 2020-12-17T18:33:20 Copying approximately 95333225 rows... INSERT LOW_PRIORITY IGNORE INTO `handong`.`_test_new` (` id `, `user_id`) SELECT ` id `, `user_id` FROM `handong`.` test ` FORCE INDEX(`PRIMARY`) WHERE ((` id ` >= ?)) AND ((` id ` <= ?)) LOCK IN SHARE MODE /*pt-online-schema-change 189639 copy nibble*/ SELECT /*!40001 SQL_NO_CACHE */ ` id ` FROM `handong`.` test ` FORCE INDEX(`PRIMARY`) WHERE ((` id ` >= ?)) ORDER BY ` id ` LIMIT ?, 2 /*next chunk boundary*/ Copying `handong`.` test `: 4% 10:35 remain Copying `handong`.` test `: 8% 11:02 remain Copying `handong`.` test `: 12% 10:51 remain Copying `handong`.` test `: 15% 10:49 remain Copying `handong`.` test `: 19% 10:38 remain Copying `handong`.` test `: 22% 10:24 remain Copying `handong`.` test `: 25% 10:11 remain Copying `handong`.` test `: 29% 09:46 remain Copying `handong`.` test `: 32% 09:19 remain Copying `handong`.` test `: 36% 08:52 remain Copying `handong`.` test `: 39% 08:21 remain Copying `handong`.` test `: 43% 07:51 remain Copying `handong`.` test `: 46% 07:20 remain Copying `handong`.` test `: 50% 06:52 remain Copying `handong`.` test `: 54% 06:20 remain Copying `handong`.` test `: 57% 05:50 remain Copying `handong`.` test `: 61% 05:21 remain Copying `handong`.` test `: 64% 04:53 remain Copying `handong`.` test `: 68% 04:22 remain Copying `handong`.` test `: 71% 03:54 remain Copying `handong`.` test `: 75% 03:25 remain Copying `handong`.` test `: 79% 02:54 remain Copying `handong`.` test `: 82% 02:24 remain Copying `handong`.` test `: 86% 01:56 remain Copying `handong`.` test `: 89% 01:25 remain Copying `handong`.` test `: 93% 00:55 remain Copying `handong`.` test `: 96% 00:28 remain 2020-12-17T18:47:16 Copied rows OK. 2020-12-17T18:47:16 Analyzing new table... 2020-12-17T18:47:16 Swapping tables... RENAME TABLE `handong`.` test ` TO `handong`.`_test_old`, `handong`.`_test_new` TO `handong`.` test ` 2020-12-17T18:47:16 Swapped original and new tables OK. 2020-12-17T18:47:16 Dropping old table... DROP TABLE IF EXISTS `handong`.`_test_old` 2020-12-17T18:47:16 Dropped old table `handong`.`_test_old` OK. 2020-12-17T18:47:16 Dropping triggers... DROP TRIGGER IF EXISTS `handong`.`pt_osc_handong_test_del` DROP TRIGGER IF EXISTS `handong`.`pt_osc_handong_test_upd` DROP TRIGGER IF EXISTS `handong`.`pt_osc_handong_test_ins` 2020-12-17T18:47:16 Dropped triggers OK. Successfully altered `handong`.` test `. |
1、创建一张新表_xxx_new ,对其做DDL操作
2、创建3个触发器(delete\update\insert)在复制数据开始之后,将对源数据表继续进行数据修改的操作记录下来,以便在数据复制结束后执行这些操作,保证数据不会丢失
3、复制数据,从源数据表复制数据到新表(分成多个chunk,小事务提交)
4、修改外键相关的子表,根据修改后的数据,修改外键关联的子表
5、将源数据表重命名为old表,将新表更改为源表名
6、删除原表
7、删除触发器