About enq: TX – row lock contention deadlock?
本站文章除注明转载外,均为本站原创: 转载自love wife love life —Roger的Oracle/MySQL/PostgreSQL数据恢复博客
以前一个客户今天联系我,有个死锁的问题,如下的客户提供的信息,看上去有点意思,如下:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 |
---------Blocker(s)-------- ---------Waiter(s)--------- Resource Name process session holds waits process session holds waits TX-000a001e-0001720c 35 105 X 41 142 X TX-00030028-0000c5bd 41 142 X 35 105 X session 105: DID 0001-0023-0003A7FB session 142: DID 0001-0029-0018FFCA session 142: DID 0001-0029-0018FFCA session 105: DID 0001-0023-0003A7FB Rows waited on: Session 142: obj - rowid = 0000CE48 - AAAM5IAAHAAC0fyAAC (dictionary objn - 52808, file - 7, block - 739314, slot - 2) Session 105: obj - rowid = 0000CE48 - AAAM5IAAHAAC0fzAAo (dictionary objn - 52808, file - 7, block - 739315, slot - 40) Information on the OTHER waiting sessions: Session 142: pid=41 serial=37613 audsid=4133092 user: 73/B1xxxxMANAGER O/S info: user: Administrator, term: xxxxAPP1, ospid: 3736:920, machine: WORKGROUP\xxxxAPP1 program: ISETDA.xxxx.Server.APPServer.exe application name: ISETDA.xxxx.Server.APPServer.exe, hash value=1815860971 Current SQL Statement: UPDATE MMSPLT SET LOC = :0 , CHANGED_TIME = :1 , WORKER = :2 WHERE (PLT = :3 ) End of information on OTHER waiting sessions. Current SQL statement for this session: UPDATE MMSPLT SET interface_yn = 'R' WHERE erpsnd in ('1','2') AND interface_yn = 'N' =================================================== |
可以看到这是同一个表的不同的2行数据,在进行DML操作的时候出现了死锁。下面我们来看下Mode是多少:
1 2 3 4 5 |
last wait for 'enq: TX - row lock contention' blocking sess=0x0x139250a60 seq=59 wait_time=2930837 seconds since wait started=2 name|mode=54580006, usn<<16 | slot=30028, sequence=c5bd Dumping Session Wait History for 'enq: TX - row lock contention' count=1 wait_time=2930837 name|mode=54580006, usn<<16 | slot=30028, sequence=c5bd |
这里的event是enq: TX – row lock contention,而54580006其实是分为2部分,前面是Name,后面的0006是mode。
1 2 3 4 5 6 7 8 9 10 11 12 |
============ Plan Table ============ -----------------------------------------------------+-----------------------------------+ | Id | Operation | Name | Rows | Bytes | Cost | Time | -----------------------------------------------------+-----------------------------------+ | 0 | UPDATE STATEMENT | | | | 16 | | | 1 | UPDATE | MMSPLT | | | | | | 2 | INLIST ITERATOR | | | | | | | 3 | TABLE ACCESS BY INDEX ROWID | MMSPLT | 86 | 430 | 16 | 00:00:01 | | 4 | INDEX RANGE SCAN | MMSPLT_IDX02| 124 | | 4 | 00:00:01 | -----------------------------------------------------+-----------------------------------+ |
从执行计划来看,出现了INLIST ITERATOR操作,这是一个迭代操作。说明sql是根据or或in操作有关系,这一点从前面的SQL可以看出来。
下面我在自己的10.2.0.4环境进行模拟一下,再现这个问题。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 |
<pre class="brush:plain">SQL> create table t_deadlock as select owner,object_id,object_name 2 from dba_objects where object_id < 5; Table created. SQL> select count(1) from t_deadlock; COUNT(1) ---------- 3 SQL> alter table t_deadlock minimize records_per_block; Table altered. SQL> insert into t_deadlock select owner,object_id,object_name 2 from dba_objects where object_id > 5 and object_id < 15; 9 rows created. SQL> commit; Commit complete. SQL> select object_id,dbms_rowid.rowid_object(rowid) obj#, 2 dbms_rowid.rowid_relative_fno(rowid) rfile#, 3 dbms_rowid.rowid_block_number(rowid) block#, 4 dbms_rowid.rowid_row_number(rowid) row# 5 from t_deadlock order by 4,5; OBJECT_ID OBJ# RFILE# BLOCK# ROW# ---------- ---------- ---------- ---------- ---------- 4 55870 5 2084 0 3 55870 5 2084 1 2 55870 5 2084 2 11 55870 5 2085 0 10 55870 5 2085 1 9 55870 5 2085 2 8 55870 5 2086 0 7 55870 5 2086 1 6 55870 5 2086 2 14 55870 5 2088 0 13 55870 5 2088 1 12 55870 5 2088 2 12 rows selected. |
下面分别开2个会话窗口来进行模拟:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 |
Session 1: SQL> select sid from v$Mystat where rownum < 2; SID ---------- 159 SQL> update t_deadlock set owner='killdb' where object_id in(5,6,7); 2 rows updated. Session 2: SQL> select sid from v$mystat where rownum < 2; SID ---------- 149 SQL> rollback; Rollback complete. SQL> update t_deadlock set owner='xxoo' where object_id in(9,10,11); 4 rows updated. Ssession 1: SQL> update t_deadlock set owner='xxoo' where object_id in(9,10,11); update t_deadlock set owner='xxoo' where object_id in(8,9,10,11) * ERROR at line 1: ORA-00060: deadlock detected while waiting for resource Session 2: SQL> update t_deadlock set owner='killdb' where object_id in(5,6,7); .....一直等待 |
我们可以看到,成功模拟出了这个ora-00600死锁的情况,下我们来看下trace 文件的内容:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 |
---------Blocker(s)-------- ---------Waiter(s)--------- Resource Name process session holds waits process session holds waits TX-00020013-0000024b 15 159 X 18 149 X TX-00080019-00000351 18 149 X 15 159 X session 159: DID 0001-000F-00000012 session 149: DID 0001-0012-0000000E session 149: DID 0001-0012-0000000E session 159: DID 0001-000F-00000012 Rows waited on: Session 149: obj - rowid = 0000DA3E - AAANo+AAFAAAAgmAAC (dictionary objn - 55870, file - 5, block - 2086, slot - 2) Session 159: obj - rowid = 0000DA3E - AAANo+AAFAAAAglAAC (dictionary objn - 55870, file - 5, block - 2085, slot - 2) Information on the OTHER waiting sessions: Session 149: pid=18 serial=16 audsid=320188 user: 58/ROGER O/S info: user: Roger-PC\Roger, term: ROGER-PC, ospid: 2188:4172, machine: WORKGROUP\ROGER-PC program: sqlplus.exe application name: SQL*Plus, hash value=3669949024 Current SQL Statement: update t_deadlock set owner='killdb' where object_id in(5,6,7) End of information on OTHER waiting sessions. Current SQL statement for this session: update t_deadlock set owner='xxoo' where object_id in(9,10,11) =================================================== |
接着我们来看下这个process的dump信息:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
---------------------------------------- SO: 000007FF1EB84E20, type: 4, owner: 000007FF1EA62988, flag: INIT/-/-/0x00 (session) sid: 159 trans: 000007FF1D17C3E8, creator: 000007FF1EA62988, flag: (100041) USR/- BSY/-/-/-/-/- DID: 0001-000F-00000012, short-term DID: 0000-0000-00000000 txn branch: 0000000000000000 oct: 6, prv: 0, sql: 000007FF1E552258, psql: 000007FF19EAB960, user: 58/ROGER service name: SYS$USERS O/S info: user: Roger-PC\Roger, term: ROGER-PC, ospid: 7176:7164, machine: WORKGROUP\ROGER-PC program: sqlplus.exe application name: SQL*Plus, hash value=3669949024 last wait for 'enq: TX - row lock contention' blocking sess=0x000007FF1EB77810 seq=128 wait_time=2999550 seconds since wait started=12 name|mode=54580006, usn<<16 | slot=20025, sequence=24b Dumping Session Wait History for 'enq: TX - row lock contention' count=1 wait_time=2999550 name|mode=54580006, usn<<16 | slot=20025, sequence=24b for 'enq: TX - row lock contention' count=1 wait_time=3000600 name|mode=54580006, usn<<16 | slot=20025, sequence=24b for 'enq: TX - row lock contention' count=1 wait_time=3000050 name|mode=54580006, usn<<16 | slot=20025, sequence=24b for 'enq: TX - row lock contention' count=1 wait_time=2999524 name|mode=54580006, usn<<16 | slot=20025, sequence=24b ......... |
搜索Plan关键字,可以定位到该SQL的执行计划:
1 2 3 4 5 6 7 8 9 10 11 |
============ Plan Table ============ ----------------------------------------+-----------------------------------+ | Id | Operation | Name | Rows | Bytes | Cost | Time | ----------------------------------------+-----------------------------------+ | 0 | UPDATE STATEMENT | | | | 4 | | | 1 | UPDATE | T_DEADLOCK| | | | | | 2 | INLIST ITERATOR | | | | | | | 3 | INDEX RANGE SCAN | IDX_ID_T | 9 | 270 | 2 | 00:00:01 | ----------------------------------------+-----------------------------------+ |
我们可以看到执行计划中出现了INLIST ITERATOR。至于Or的情况,大家可以自行模拟。不过我这里模拟的情况和客户的实际情况
还是有一点细微差异,他这里的执行计划中还有一个回表的操作:TABLE ACCESS BY INDEX ROWID
不过,这个模拟已经可以说明问题了。对于enq: TX – row lock contention的死锁,如果Mode=6,那么其实只有这样一样情况。
这里需要我们注意的是,有些人以为这里是row lock contention,那么肯定是因为Index的关系,其实这是错误的理解。
对于row的操作,Oracle首先会申请一个表级别的TM锁,然后再申请Tx. 这里是否存在Index其实一点关系都没有.
1 2 3 4 5 6 7 |
SID TY ID1 ID2 LOCK_TYPE REQUEST CTIME BLOCK ---------- -- ---------- ---------- ------------------- ---------- ---------- ---------- 149 TX 262168 606 None 6 156 0 159 TM 55870 0 Row Exclusive 0 195 0 149 TM 55870 0 Row Exclusive 0 189 0 149 TX 131109 587 Exclusive 0 189 0 159 TX 262168 606 Exclusive 0 195 1 |
为了说明这一点,我drop掉Index,一样会出现这个死锁,如下:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 |
Deadlock graph: ---------Blocker(s)-------- ---------Waiter(s)--------- Resource Name process session holds waits process session holds waits TX-00060008-00000267 18 149 X 15 159 X TX-0008000c-00000350 15 159 X 18 149 X session 149: DID 0001-0012-0000000A session 159: DID 0001-000F-0000000F session 159: DID 0001-000F-0000000F session 149: DID 0001-0012-0000000A Rows waited on: Session 159: obj - rowid = 0000DA3E - AAANo+AAFAAAAglAAA (dictionary objn - 55870, file - 5, block - 2085, slot - 0) Session 149: obj - rowid = 0000DA3E - AAANo+AAFAAAAgmAAB (dictionary objn - 55870, file - 5, block - 2086, slot - 1) Information on the OTHER waiting sessions: Session 159: pid=15 serial=5 audsid=310187 user: 58/ROGER O/S info: user: Roger-PC\Roger, term: ROGER-PC, ospid: 19232:14592, machine: WORKGROUP\ROGER-PC program: sqlplus.exe application name: SQL*Plus, hash value=3669949024 Current SQL Statement: update t_deadlock set owner='google' where object_id >6 and object_id < 13 End of information on OTHER waiting sessions. Current SQL statement for this session: update t_deadlock set owner='google' where object_id >6 and object_id < 13 =================================================== |
其中process dump如下:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
---------------------------------------- SO: 000007FF1EB77810, type: 4, owner: 000007FF1EA64200, flag: INIT/-/-/0x00 (session) sid: 149 trans: 000007FF1D157328, creator: 000007FF1EA64200, flag: (41) USR/- BSY/-/-/-/-/- DID: 0001-0012-0000000A, short-term DID: 0000-0000-00000000 txn branch: 0000000000000000 oct: 6, prv: 0, sql: 000007FF19DE9768, psql: 000007FF19F9D9E0, user: 58/ROGER service name: SYS$USERS O/S info: user: Roger-PC\Roger, term: ROGER-PC, ospid: 15668:4172, machine: WORKGROUP\ROGER-PC program: sqlplus.exe application name: SQL*Plus, hash value=3669949024 last wait for 'enq: TX - row lock contention' blocking sess=0x000007FF1EB84E20 seq=25 wait_time=3000153 seconds since wait started=244 name|mode=54580006, usn<<16 | slot=8000c, sequence=350 Dumping Session Wait History for 'enq: TX - row lock contention' count=1 wait_time=3000153 name|mode=54580006, usn<<16 | slot=8000c, sequence=350 for 'enq: TX - row lock contention' count=1 wait_time=3000073 name|mode=54580006, usn<<16 | slot=8000c, sequence=350 |
此时的SQL执行计划肯定是全表扫描了,如下:
1 2 3 4 5 6 7 8 9 10 |
============ Plan Table ============ ----------------------------------------+-----------------------------------+ | Id | Operation | Name | Rows | Bytes | Cost | Time | ----------------------------------------+-----------------------------------+ | 0 | UPDATE STATEMENT | | | | 4 | | | 1 | UPDATE | T_DEADLOCK| | | | | | 2 | TABLE ACCESS FULL | T_DEADLOCK| 6 | 180 | 4 | 00:00:01 | ----------------------------------------+-----------------------------------+ |
说了这么多,对于这种enq: TX – row lock contention 死锁,如果mode=6,那么唯一的解决方法就是kill 会话或者调整业务逻辑。
如果你遇到的enq: TX – row lock contention 死锁的mode为4即使shared mode,那么请检查如下几个方面是否存在问题:
1) ITL争用,建议调整INITRANS
2)涉及unique Index,那么说明index可能涉及到重复键值
3)检查是否用了位图Index。
如果是TM死锁,那么通常是外键缺乏Index导致。
Leave a Reply
You must be logged in to post a comment.