While checking a customer's database, I found a large number of deadlocks. The alert log shows:
Thread 1 advanced to log sequence 257 (LGWR switch)
Current log# 16 seq# 257 mem# 0: /oradata/Oracle/online_log/redo16_01.log
Current log# 16 seq# 257 mem# 1: /oradata/oracle/online_log/redo16_02.log
Tue Jul 03 10:14:53 2018
Archived Log entry 385 added for thread 1 sequence 256 ID 0x59dc8ffa dest 1:
Tue Jul 03 10:14:53 2018
LNS: Standby redo logfile selected for thread 1 sequence 257 for destination LOG_ARCHIVE_DEST_2
Tue Jul 03 10:19:39 2018
opiodr aborting process unknown ospid (23762) as a result of ORA-609
Tue Jul 03 10:51:18 2018
ORA-00060: Deadlock detected. More info in file /u01/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_25846.trc.
Tue Jul 03 10:54:01 2018
ORA-00060: Deadlock detected. More info in file /u01/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_14067.trc.
Tue Jul 03 11:02:28 2018
ORA-00060: Deadlock detected. More info in file /u01/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_20781.trc.
Tue Jul 03 11:21:13 2018
Thread 1 cannot allocate new log, sequence 258
Private strand flush not complete
The trace file orcl_ora_25846.trc contains the following:
Deadlock graph:
                        ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name           process session holds waits  process session holds waits
TX-026e0020-000001a5        147    4468     X            385     241           S
TM-0007fd6c-00000000        385     241     X            147    4468          SX
session 4468: DID 0001-0093-000001FE    session 241: DID 0001-0181-00000014
session 241: DID 0001-0181-00000014     session 4468: DID 0001-0093-000001FE
Rows waited on:
Session 4468: obj - rowid = 0007FD6C - AAAAAAAAAAAAAAAAAA
(dictionary objn - 523628, file - 0, block - 0, slot - 0)
Session 241: no row
----- Information for the OTHER waiting sessions -----
Session 241:
sid: 241 ser: 425 audsid: 24705000 user: 160/FD14
flags: (0x45) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
flags2: (0x40009) -/-/INC
pid: 385 O/S info: user: oracle, term: UNKNOWN, ospid: 20781
image: oracle@dbserver1
client details:
O/S info: user: TL3050, term: TL3050-WZ, ospid: 5300:5348
machine: WORKGROUP\TL3050-WZ program: CWV4.2.8.337_20131204.exe
application name: CWV4.2.8.337_20131204.exe, hash value=580982453
current SQL:
insert into pzd2018
(UNI_NO,ORD,STYPE,STYPE2,SNO,SYEAR,SMONTH,RMONTH,SDAY,SABSTRACT,OPERATOR,J_AMOUNT,D_AMOUNT,SUBJ,SUBJNAME,
OPP_SUBJ,SRC_CODE,ECO_CODE,SRC_PAYTYPE,SRC_BUTYPE,ECO_TYPE,ECO_WARRANT,PRJ_ORDER,PRJ_NAME,OPP_PRJ,CLR_ORDER,
UNIT_CODE,SPECCODE,CONTRACT_NO,CAR_NO,OLPAY_SNO,schedule_date,WB_TYPE,WB_JNUM,WB_DNUM,WB_FACT,NUM_TYPE,
NUM_JNUM,NUM_DNUM,NUM_PRICE,CAP_NO,CAP_ORD,JSFS_CODE,ZPH,BUSS_DATE,OTHER_UNIT,ACNT,BANKNO,ADDRESS1,ADDRESS2,
TNO,ACT_NO,BU_CODE,T_CODE,RESBU_CODE,RESBU_AMT,SPECCODE1,SPECCODE2,SPECCODE3,SPECCODE4,RES_S1,
RES_S2,RES_S3,RES_S4,ASSET_SUBJ,TAX_NO,SRC_NAME,ADDITION,UNI_PRJ_ORDER,Clr_Bu_Code,
Source_Type,Source,SrKey,SMark,Uni_Prj_Name,clrsno,input_name,check_name,attach_act,
attach_act_no,pz_attr,src_type,zj_type,ref_uni_no,charge_sno,charge_name,src_lkx,order_type,c
----- End of information for the OTHER waiting sessions -----
Information for THIS session:
----- Current SQL Statement for this session (sql_id=9ktt36bsngnyx) -----
insert into pz2018
(UNI_NO,STYPE,STYPE2,SNO,SYEAR,SMONTH,RMONTH,SDAY,INPUT_NAME,CHECK_NAME,COMP_NAME,COMP_NAME2,
ADDITION,CHILDNUM,J_AMOUNT,D_AMOUNT,SSTATE,REMARK,PZ_ATTR)
values(:1,:2,:3,:4,:5,:6,:7,:8,:9,:10,:11,:12,:13,:14,:15,:16,:17,:18,:19)
===================================================
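2. Problem analysis
The deadlock graph shows that session 241 holds a TM lock in X mode and, while executing the insert into pzd2018 statement, is waiting for a TX lock in S mode.
Session 4468 holds a TX lock in X mode and, while executing the insert into pz2018 statement, is waiting for a TM lock in SX mode.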
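To make the reading of the deadlock graph concrete, here is a minimal Python sketch that parses the two resource rows and decodes the TM resource name. The regular expression, the `parse_graph` helper and the dictionary keys are illustrative names, not part of any Oracle tool. It relies on the fact that for TM locks the first hex field of the resource name is the object number, which is why 0007fd6c matches the "dictionary objn - 523628" line in the trace.

```python
import re

# The two resource rows from the deadlock graph, columns flattened as in the trace.
GRAPH_ROWS = """\
TX-026e0020-000001a5 147 4468 X 385 241 S
TM-0007fd6c-00000000 385 241 X 147 4468 SX
"""

# resource name, then blocker (process, session, held mode),
# then waiter (process, session, requested mode)
ROW = re.compile(
    r"(?P<type>[A-Z]{2})-(?P<id1>[0-9a-fA-F]{8})-(?P<id2>[0-9a-fA-F]{8})\s+"
    r"(?P<b_pid>\d+)\s+(?P<b_sid>\d+)\s+(?P<held>\w+)\s+"
    r"(?P<w_pid>\d+)\s+(?P<w_sid>\d+)\s+(?P<wanted>\w+)"
)


def parse_graph(text):
    """Return one dict per resource row: who holds it, who waits, in which modes."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m is None:
            continue  # skip anything that is not a resource row
        rows.append({
            "type": m["type"],
            "blocker_sid": int(m["b_sid"]),
            "held": m["held"],
            "waiter_sid": int(m["w_sid"]),
            "wanted": m["wanted"],
            # For TM (table) locks the first hex field is the object number.
            "object_id": int(m["id1"], 16) if m["type"] == "TM" else None,
        })
    return rows


rows = parse_graph(GRAPH_ROWS)
for r in rows:
    print(f"{r['type']}: session {r['blocker_sid']} holds {r['held']}, "
          f"session {r['waiter_sid']} waits for {r['wanted']}, "
          f"object_id={r['object_id']}")
```

The decoded object number can then be looked up in the data dictionary to confirm which table the TM lock protects.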
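After talking with the application team and querying the database, we found the following table-lock statement, and the application team confirmed that it is issued by the application itself: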
lock table pz2018 in exclusive mode
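At this point the problem is clear. The full sequence is as follows:
Session 241 locked the whole pz2018 table in exclusive mode. Session 4468 therefore could not insert into pz2018, because an insert needs a row exclusive (SX) lock on the table, and that lock cannot be granted while another session holds X. Session 4468 started waiting.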
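The reason the insert blocks can be sketched as a small lookup of table (TM) lock mode compatibility. This is an illustrative model only: the `COMPATIBLE` table and the `can_grant` helper are hypothetical names, and the mode abbreviations follow the ones printed in trace files (SS row share, SX row exclusive, S share, SSX share row exclusive, X exclusive).

```python
# Which TM lock modes can be granted to another session while a given mode is held.
COMPATIBLE = {
    "SS":  {"SS", "SX", "S", "SSX"},  # row share
    "SX":  {"SS", "SX"},              # row exclusive (taken by insert/update/delete)
    "S":   {"SS", "S"},               # share
    "SSX": {"SS"},                    # share row exclusive
    "X":   set(),                     # exclusive: compatible with nothing
}


def can_grant(held, requested):
    """True if `requested` can be granted while another session holds `held`."""
    return requested in COMPATIBLE[held]


# lock table ... in exclusive mode holds X, so the SX needed by an insert must wait.
print(can_grant("X", "SX"))
# Two ordinary inserts both take SX and do not block each other at table level.
print(can_grant("SX", "SX"))
```

In this model an explicit exclusive table lock is the only thing that stops concurrent inserts at the table level, which is consistent with session 4468 waiting in SX mode.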
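Before it started waiting, session 4468 had already inserted rows into pzd2018 without committing. Session 241's insert into pzd2018 then collided with those uncommitted rows on a unique constraint, so session 241 began waiting on the TX lock until session 4468 commits or rolls back.
The two waits form a cycle, which is a deadlock.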
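The circular wait can be checked mechanically by treating the sessions as nodes of a wait-for graph. The sketch below is a generic depth-first cycle search, not Oracle's internal deadlock detector; `find_cycle` and `WAITS_FOR` are hypothetical names, and the edges are taken from the deadlock graph in the trace.

```python
def find_cycle(waits_for):
    """Return a list of sessions forming a cycle in the wait-for graph, or None."""
    visiting = []   # sessions on the current DFS path, in order
    done = set()    # sessions already fully explored without finding a cycle

    def visit(node):
        if node in visiting:
            # Reached a session already on the path: the tail of the path is the cycle.
            return visiting[visiting.index(node):] + [node]
        if node in done:
            return None
        visiting.append(node)
        for nxt in waits_for.get(node, ()):
            cycle = visit(nxt)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in waits_for:
        cycle = visit(start)
        if cycle:
            return cycle
    return None


# Edge "a -> b" means session a is waiting for a lock held by session b.
WAITS_FOR = {
    241: [4468],   # 241 waits for the TX lock held by 4468
    4468: [241],   # 4468 waits for the TM lock held by 241
}

cycle = find_cycle(WAITS_FOR)
print(" -> ".join(map(str, cycle)))
```

With only the two edges from this trace, the search returns the path 241, 4468, 241, which is the same loop described above.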
Session wait state at the time the waits occurred: