转载

并发插入引起的死锁问题排查

2017大年初一晚上,线上报死锁了..
大家挂着VPN排查问题,也是有点意思.
不过当时并没有排查出原因

上班之后,发现是客户端的一个bug.本来应该发送一个请求,但是却发送了大量的请求,应用层面又没有做幂等设计,所以所有的请求都落到了数据库层面。
数据库是一个过程

环境
MySQL 5.6.14
事务隔离级别 读提交

引起问题的逻辑大致如下:

  1.     if not exists (select UserID from award_free_firecracker_watch_common where UserID=pUserID and AwardDate=pToday and GiftID=pGift) then
  2.       insert into award_free_firecracker_watch_common (UserID, AwardDate,GiftID) values (pUserID, pToday,pGift);
  3.     end if;
  4.     
  5.     select AwardTotal into pOld from award_free_firecracker_common where UserID=pUserID and GiftID=pGift for update;
  6.     update award_free_firecracker_common set AwardTotal=AwardTotal+1,AwardLeft=AwardLeft+1 where UserID=pUserID and GiftID=pGift;
  7.     
  8.     SELECT ROW_COUNT() into pUpdateCount;
  9.     if pUpdateCount>0 then
  10.       update award_free_firecracker_watch_common set WatchCount=WatchCount+1 where UserID=pUserID and AwardDate=pToday and GiftID=pGift ;
  11.     end if;


因为这个过程也不是我写的,大致的意思是先查询award_free_firecracker_watch_common表,如果没有这个用户的记录,则插入.
然后修改award_free_firecracker_common的记录,最后在修改award_free_firecracker_watch_common的记录.

MySQL的加锁过程
https://dev.mysql.com/doc/refman/5.7/en/innodb-locks-set.html

INSERT sets an exclusive lock on the inserted row. This lock is an index-record lock, not a next-key lock (that is, there is no gap lock) and does not prevent other sessions from inserting into the gap before the inserted row.

Prior to inserting the row, a type of gap lock called an insert intention gap lock is set. This lock signals the intent to insert in such a way that multiple transactions inserting into the same index gap need not wait for each other if they are not inserting at the same position within the gap. Suppose that there are index records with values of 4 and 7. Separate transactions that attempt to insert values of 5 and 6 each lock the gap between 4 and 7 with insert intention locks prior to obtaining the exclusive lock on the inserted row, but do not block each other because the rows are nonconflicting.

If a duplicate-key error occurs, a shared lock on the duplicate index record is set. This use of a shared lock can result in deadlock should there be multiple sessions trying to insert the same row if another session already has an exclusive lock. This can occur if another session deletes the row. Suppose that an InnoDB table t1 has the following structure:

CREATE TABLE t1 (i INT, PRIMARY KEY (i)) ENGINE = InnoDB;

Now suppose that three sessions perform the following operations in order:

Session 1:

START TRANSACTION;
INSERT INTO t1 VALUES(1);

Session 2:

START TRANSACTION;
INSERT INTO t1 VALUES(1);

Session 3:

START TRANSACTION;
INSERT INTO t1 VALUES(1);

Session 1:

ROLLBACK;

The first operation by session 1 acquires an exclusive lock for the row. The operations by sessions 2 and 3 both result in a duplicate-key error and they both request a shared lock for the row. When session 1 rolls back, it releases its exclusive lock on the row and the queued shared lock requests for sessions 2 and 3 are granted. At this point, sessions 2 and 3 deadlock: Neither can acquire an exclusive lock for the row because of the shared lock held by the other.

A similar situation occurs if the table already contains a row with key value 1 and three sessions perform the following operations in order:

Session 1:

START TRANSACTION;
DELETE FROM t1 WHERE i = 1;

Session 2:

START TRANSACTION;
INSERT INTO t1 VALUES(1);

Session 3:

START TRANSACTION;
INSERT INTO t1 VALUES(1);

Session 1:

COMMIT;

The first operation by session 1 acquires an exclusive lock for the row. The operations by sessions 2 and 3 both result in a duplicate-key error and they both request a shared lock for the row. When session 1 commits, it releases its exclusive lock on the row and the queued shared lock requests for sessions 2 and 3 are granted. At this point, sessions 2 and 3 deadlock: Neither can acquire an exclusive lock for the row because of the shared lock held by the other.


可以看到,Insert的时候,对记录上排它锁和插入意向锁.
并发的情况下,如果这个记录已经上了排它锁,则会尝试给这个记录上共享锁.
如果有三个以上的并发线程,
第一个线程上了排它锁,第二和第三个线程,看到该记录有排他锁,则尝试给这个记录上共享锁。
一旦第一个线程回滚,则第二,第三线程拥有共享锁的同时,都在申请排它锁.这时候就出现了死锁.

需要注意的是,假如第一个线程提交了,则第二个,第三个线程会报重复主键的错误,但是这时候,第二个,第三个线程,还是拥有这个记录的共享锁.第二,第三线程必须回滚.否则他们拥有的共享锁不释放.

回到最开始的问题.
三个线程同时insert award_free_firecracker_watch_common表,一个线程成功获取排它锁,其他两个线程上共享锁.
等获取排他锁的线程提交,两个上共享锁的线程,最后一步 update award_free_firecracker_watch_common表,则产生了死锁。
他们都是在获取了共享锁的同时,申请排它锁.

参考:
https://dev.mysql.com/doc/refman/5.7/en/innodb-locks-set.html
http://www.cnblogs.com/renolei/p/4673842.html
http://feiyang21687.github.io/Mysql-Transaction/
http://yeshaoting.cn/article/database/mysql%20insert%E9%94%81%E6%9C%BA%E5%88%B6/
http://blog.csdn.net/and1kaney/article/details/51214001

正文到此结束
Loading...