一次死锁(Dead Lock)分析

2022-12-28 14:42:56 浏览数 (3)

代码语言:javascript复制
mysql> SHOW ENGINE INNODB STATUS G
*************************** 1. row ***************************
  Type: InnoDB
  Name: 
Status: 
=====================================
2020-10-15 23:24:00 0x7f8080732700 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 7 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 57730 srv_active, 0 srv_shutdown, 3095533 srv_idle
srv_master_thread log flush and writes: 3153263
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 2360471
OS WAIT ARRAY INFO: signal count 5901167
RW-shared spins 0, rounds 15858296, OS waits 911780
RW-excl spins 0, rounds 119668028, OS waits 495341
RW-sx spins 3879652, rounds 70898774, OS waits 419099
Spin rounds per wait: 15858296.00 RW-shared, 119668028.00 RW-excl, 18.27 RW-sx
------------------------
LATEST FOREIGN KEY ERROR
------------------------
2020-10-15 23:23:54 0x7f8080396700 Transaction:
TRANSACTION 10559562, ACTIVE 0 sec updating or deleting
mysql tables in use 1, locked 1
5 lock struct(s), heap size 1136, 3 row lock(s), undo log entries 1
MySQL thread id 38634, OS thread handle 140189883787008, query id 5849250 172.17.0.1 rabbitdb updating
DELETE FROM `t_industry` WHERE `id` = ?
Foreign key constraint fails for table `BDMS`.`t_department`:
,
  CONSTRAINT `t_department_ibfk_1` FOREIGN KEY (`industry_id`) REFERENCES `t_industry` (`id`)
Trying to delete or update in parent table, in index PRIMARY tuple:
DATA TUPLE: 5 fields;
 0: len 4; hex 0000832f; asc    /;;
 1: len 6; hex 000000a1204a; asc      J;;
 2: len 7; hex 6600000ad420e8; asc f      ;;
 3: len 10; hex 613430316b6e39733774; asc a401kn9s7t;;
 4: len 5; hex 99a79ef5f5; asc      ;;

But in child table `BDMS`.`t_department`, in index industry_id, there is a record:
PHYSICAL RECORD: n_fields 2; compact format; info bits 0
 0: len 4; hex 0000832f; asc    /;;
 1: len 4; hex 001050ac; asc   P ;;

------------------------
LATEST DETECTED DEADLOCK
------------------------
2020-10-15 23:23:35 0x7f808024c700
*** (1) TRANSACTION:
TRANSACTION 10559056, ACTIVE 1 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 6 lock struct(s), heap size 1136, 3 row lock(s), undo log entries 1
MySQL thread id 38631, OS thread handle 140189884327680, query id 5848630 172.17.0.1 rabbitdb update
INSERT INTO `t_verification_task` (`name`, `user_id`, `datasource_id`, `department_id`, `business_id`, `create_time`, `running_interval`, `single_fetch_count`, `type`, `expected_count`, `checkpoint`, `current_count`, `alarm_id`, `max_process_id`, `mode`, `lastmodified`) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1096 page no 5 n bits 248 index datasource_id of table `BDMS`.`t_verification_task` trx id 10559056 lock_mode X insert intention waiting
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex 73757072656d756d; asc supremum;;

*** (2) TRANSACTION:
TRANSACTION 10558977, ACTIVE 3 sec fetching rows
mysql tables in use 1, locked 1
135 lock struct(s), heap size 24784, 10385 row lock(s), undo log entries 10000
MySQL thread id 38630, OS thread handle 140189882435328, query id 5848556 172.17.0.1 rabbitdb updating
DELETE FROM `t_datasource_settings` WHERE `creator_user_id` = ?
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 1096 page no 5 n bits 248 index datasource_id of table `BDMS`.`t_verification_task` trx id 10558977 lock mode S
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex 73757072656d756d; asc supremum;;

*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1153 page no 471 n bits 128 index PRIMARY of table `BDMS`.`t_datasource_settings` trx id 10558977 lock_mode X waiting
Record lock, heap no 58 PHYSICAL RECORD: n_fields 27; compact format; info bits 0
 0: len 4; hex 002df89e; asc  -  ;;
 1: len 6; hex 000000a11e4e; asc      N;;
 2: len 7; hex a8000003ec0110; asc        ;;
 3: len 30; hex 5465737441626f7274566572696669636174696f6e5461736b50726f6365; asc TestAbortVerificationTaskProce; (total 68 bytes);
 4: len 9; hex 6c6f63616c686f7374; asc localhost;;
 5: len 2; hex 0cea; asc   ;;
 6: len 5; hex 6d7973716c; asc mysql;;
 7: len 4; hex 42444d53; asc BDMS;;
 8: len 17; hex 745f726f6c655f31323533363139323233; asc t_role_1253619223;;
 9: len 2; hex 6964; asc id;;
 10: len 2; hex 6964; asc id;;
 11: len 8; hex 7261626269746462; asc rabbitdb;;
 12: len 8; hex 7261626269746462; asc rabbitdb;;
 13: len 0; hex ; asc ;;
 14: len 0; hex ; asc ;;
 15: len 0; hex ; asc ;;
 16: len 4; hex 0010509e; asc   P ;;
 17: len 4; hex 00103705; asc   7 ;;
 18: len 4; hex 80000000; asc     ;;
 19: len 0; hex ; asc ;;
 20: len 0; hex ; asc ;;
 21: len 4; hex ffc22d63; asc   -c;;
 22: len 5; hex 99a79ef5e3; asc      ;;
 23: SQL NULL;
 24: len 1; hex 80; asc  ;;
 25: len 4; hex 80000000; asc     ;;
 26: len 4; hex 80000000; asc     ;;

*** WE ROLL BACK TRANSACTION (1)
------------
TRANSACTIONS
------------
Trx id counter 10560218
Purge done for trx's n:o < 10560218 undo n:o < 0 state: running
History list length 16
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 421665306893968, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421665306893048, not started flushing log
mysql tables in use 1, locked 1
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421665306892128, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
--------
FILE I/O
--------
I/O thread 0 state: waiting for completed aio requests (insert buffer thread)
I/O thread 1 state: waiting for completed aio requests (log thread)
I/O thread 2 state: waiting for completed aio requests (read thread)
I/O thread 3 state: waiting for completed aio requests (read thread)
I/O thread 4 state: waiting for completed aio requests (read thread)
I/O thread 5 state: waiting for completed aio requests (read thread)
I/O thread 6 state: waiting for completed aio requests (write thread)
I/O thread 7 state: waiting for completed aio requests (write thread)
I/O thread 8 state: waiting for completed aio requests (write thread)
I/O thread 9 state: waiting for completed aio requests (write thread)
Pending normal aio reads: [0, 0, 0, 0] , aio writes: [0, 0, 0, 0] ,
 ibuf aio reads:, log i/o's:, sync i/o's:
Pending flushes (fsync) log: 1; buffer pool: 0
2490 OS file reads, 4069765 OS file writes, 2302797 OS fsyncs
0.00 reads/s, 0 avg bytes/read, 104.70 writes/s, 66.85 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 0, seg size 2, 4 merges
merged operations:
 insert 4, delete mark 1, delete 0
discarded operations:
 insert 0, delete mark 0, delete 0
Hash table size 34679, node heap has 6 buffer(s)
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 52 buffer(s)
Hash table size 34679, node heap has 5 buffer(s)
Hash table size 34679, node heap has 4 buffer(s)
Hash table size 34679, node heap has 6 buffer(s)
Hash table size 34679, node heap has 10 buffer(s)
Hash table size 34679, node heap has 23 buffer(s)
67.99 hash searches/s, 550.64 non-hash searches/s
---
LOG
---
Log sequence number 72743908501
Log flushed up to   72743908183
Pages flushed up to 72743693137
Last checkpoint at  72743680140
1 pending log flushes, 0 pending chkp writes
1984754 log i/o's done, 63.00 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total large memory allocated 137428992
Dictionary memory allocated 1699934
Buffer pool size   8192
Free buffers       1097
Database pages     6988
Old database pages 2559
Modified db pages  371
Pending reads      0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 99526, not young 3033
0.00 youngs/s, 0.00 non-youngs/s
Pages read 1984, created 13327, written 2002306
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 6988, unzip_LRU len: 0
I/O sum[1434]:cur[0], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
0 read views open inside InnoDB
Process ID=1, Main thread ID=140190030145280, state: sleeping
Number of rows inserted 15735481, updated 221045, deleted 15526358, read 85492063
26.00 inserts/s, 9.43 updates/s, 25.00 deletes/s, 122.70 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================

1 row in set (0.02 sec)

mysql> 

当系统提示发生死锁Deadlock found when trying to get lock; try restarting transaction后,可在mysql控制台输入命令SHOW ENGINE INNODB STATUS G,并找到LATEST DETECTED DEADLOCK部分分析一下,就能大概知道死锁的原因在哪里了。

TX1: 首先看tx1里有一插入语句INSERT INTO t_verification_task... 并且在这条插入语句之后紧跟*** (1) WAITING FOR THIS LOCK TO BE GRANTED:,这是表示,这个tx1在等待这个锁被授权,然后才能接着插入流程。那这个锁是什么锁呢? tx1 等待的锁是BDMS库的t_verification_task表的行级插入意向排他锁 RECORD LOCKS space id 1096 page no 5 n bits 248 index datasource_id of tableBDMS.t_verification_tasktrx id 10559056 lock_mode X insert intention waiting 这是因为tx1要在t_verification_task进行插入,因此要申请一个插入意向排他锁,以免其他的事务对此条记录造成影响。

TX2: 再看tx2里有一删除语句DELETE FROM t_datasource_settings WHERE creator_user_id = ? 而紧跟着这条语句的*** (2) HOLDS THE LOCK(S):则说明了 tx2持有的锁是BDMS库t_verification_task表的行级共享锁 RECORD LOCKS space id 1096 page no 5 n bits 248 index datasource_id of table BDMS.t_verification_task trx id 10558977 lock mode S*** (2) WAITING FOR THIS LOCK TO BE GRANTED:则表明tx2在等待下面的这个锁被授权,然后才能接着删除流程。锁内容如下: tx2等待的锁是BDMS库的t_datasource_settings表的行级排他锁 RECORD LOCKS space id 1153 page no 471 n bits 128 index PRIMARY of table BDMS.t_datasource_settings trx id 10558977 lock_mode X waiting

这里要先说明一下出现的两张表的关系,t_verification_task拥有t_datasource_settings的id为外键。而我们知道从mysql 5.7官网对Next-Key locks的解释知道:InnoDB执行行级锁定,以使其在搜索或扫描表索引时对遇到的索引记录设置共享或排他锁。因此,行级锁实际上是索引记录锁。索引记录上的下一键锁定也会影响该索引记录之前的“间隙”。即,下一键锁定是索引记录锁定加上索引记录之前的间隙上的间隙锁定。如果一个会话R在索引中的记录上具有共享或排他锁 ,则另一会话不能R在索引顺序之前的间隙中插入新的索引记录,也就是说:

  1. tx1在获得t_verification_task的datasource_id索引即t_datasource_settings的PRIMARY主键索引的插入意向排他锁之前,同时持有了从最初的记录到当前记录的排他下一键锁(未打印出来),而tx2正好持有了这个间隙之间(tx1的id是10559056>tx2的id是10558977)的一条记录的共享锁S,因此tx1的行级插入意向排他锁无法获得,需要等到tx2持有的行级共享锁S被释放。
  2. tx2想获得t_datasource_settings的PRIMARY主键索引的行级排他锁,必须先让tx1的从最初的记录到当前记录的排他下一键锁先释放,然而这个排他下一键锁要等到tx1的插入意向排他锁获得并插入完成后才会被释放。 综上所述,1被2持有的行级共享锁阻塞,2被1持有的排他下一键锁阻塞,因此死锁。

解决方案:

  1. 保持事务精简,这个问题会发生的主要原因在于一次性想删除1000条记录,导致持有锁的时间过长而事务被回滚,因此可修改为少量删除一个事务,最小化事务的网络通信往返量,减少完成事务可能的延迟并释放锁。
  2. 降低事务隔离级别,MySQL的默认隔离级别是REPEATABLE READ会默认开启这个下一键锁防止幻读。

0 人点赞