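In the previous article, "How to Locate Blocked DDL in MySQL 5.7", we located blocked DDL mainly through the performance_schema.metadata_locks table, which was newly introduced in MySQL 5.7. That method has an "icing on the cake" feel to it, and it only applies to MySQL 5.7 and later.
In real production environments, however, MySQL 5.6 still accounts for the vast majority of deployments. Although MySQL 8.0 has already reached GA, databases are a special case, and a good many people remain "not proactive" when it comes to upgrading.
Since MySQL 5.6 has so many users, is there a way to solve this pain point on MySQL 5.6?
We reuse the test demo from the previous article.
Session 1 opens a transaction and runs three statements without committing. Session 2 then runs an alter table statement, which is blocked.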
session1> begin;
Query OK, 0 rows affected (0.00 sec)
session1> delete from slowtech.t1 where id=2;
Query OK, 1 row affected (0.00 sec)
session1> select * from slowtech.t1;
+------+------+
| id   | name |
+------+------+
|    1 | a    |
+------+------+
1 row in set (0.00 sec)
session1> update slowtech.t1 set name='c' where id=1;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
session2> alter table slowtech.t1 add c1 int; ## blocked
session3> show processlist;
+----+------+-----------+------+---------+------+---------------------------------+------------------------------------+
| Id | User | Host      | db   | Command | Time | State                           | Info                               |
+----+------+-----------+------+---------+------+---------------------------------+------------------------------------+
|  2 | root | localhost | NULL | Sleep   |   51 |                                 | NULL                               |
|  3 | root | localhost | NULL | Query   |    0 | starting                        | show processlist                   |
|  4 | root | localhost | NULL | Query   |    9 | Waiting for table metadata lock | alter table slowtech.t1 add c1 int |
+----+------+-----------+------+---------+------+---------------------------------+------------------------------------+
3 rows in set (0.00 sec)
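In fact, only two kinds of operations cause DDL to block:
1. Slow queries
2. Uncommitted transactions on the table
The first kind is easy to locate: it shows up in show processlist. The second kind is basically impossible to locate that way, because a connection holding an uncommitted transaction looks exactly like an idle connection in the show processlist output.
Take the connection with Id 2 below: although its Command is "Sleep", it actually holds an uncommitted transaction.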
mysql> show processlist;
+----+------+-----------+------+---------+------+---------------------------------+------------------------------------+
| Id | User | Host      | db   | Command | Time | State                           | Info                               |
+----+------+-----------+------+---------+------+---------------------------------+------------------------------------+
|  2 | root | localhost | NULL | Sleep   |   77 |                                 | NULL                               |
|  3 | root | localhost | NULL | Query   |    0 | starting                        | show processlist                   |
|  4 | root | localhost | NULL | Query   |   44 | Waiting for table metadata lock | alter table slowtech.t1 add c1 int |
+----+------+-----------+------+---------+------+---------------------------------+------------------------------------+
3 rows in set (0.00 sec)
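The same information can also be pulled with an ordinary query against information_schema.processlist, which is available in MySQL 5.6. The following is only a minimal sketch and not part of the original demo: it lists the sessions waiting on a metadata lock together with the sleeping sessions, any of which might be the blocker.

```sql
-- Sketch: sessions waiting on a metadata lock, plus sleeping sessions
-- that are possible (but unconfirmed) blockers, longest-running first.
SELECT id, user, host, db, command, time, state, info
FROM information_schema.processlist
WHERE state = 'Waiting for table metadata lock'
   OR command = 'Sleep'
ORDER BY time DESC;
```

Note that this query has the same limitation as show processlist itself: it cannot tell a sleeping connection that holds an open transaction from one that is truly idle.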
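So the advice found online to kill idle connections (those whose Command is Sleep) is not without merit, but it is too crude and will inevitably kill innocent connections.
In fact, since this is a transaction, there must be a record of it in information_schema.innodb_trx. For the transaction in session 1, the record in that table is as follows: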