一客户数据库alert log经常报如下错误:
Wed Dec 17 14:31:16 2008
Errors in file / oracle /admin/pnbrw/bdump/pnbrw1_reco_16950.trc:
ORA-03113: end-of-file on communication channel
Wed Dec 17 14:31:16 2008
Errors in file / oracle /admin/pnbrw/bdump/pnbrw1_reco_16950.trc:
ORA-02019: connection description for remote database not found
Wed Dec 17 14:36:06 2008
Thread 1 advanced to log sequence 236582
Current log# 2 seq# 236582 mem# 0: /PNBRW_D00/system/redo02.log
Wed Dec 17 14:38:46 2008
Errors in file / oracle /admin/pnbrw/bdump/pnbrw1_reco_16950.trc:
ORA-02019: connection description for remote database not found
查看pnbrw1_reco_16950.trc,发现如下内容:
*** 2008-12-21 12:21:16.030
ERROR, tran=29.1.44284, session#=1, ose=0:
ORA-02019: connection description for remote database not found
*** 2008-12-21 20:54:50.023
ERROR, tran=19.40.146101, session#=1, ose=0:
ORA-03113: end-of-file on communication channel
*** 2008-12-22 13:30:52.831
ERROR, tran=4.10.878948, session#=1, ose=0:
ORA-03113: end-of-file on communication channel
*** 2008-12-23 04:29:11.935
ERROR, tran=17.14.8483, session#=1, ose=0:
ORA-03113: end-of-file on communication channel
*** 2008-12-23 04:59:12.105
ERROR, tran=52.27.193, session#=1, ose=0:
ORA-02019: connection description for remote database not found
根据该错误信息,应该是跟分布式事务有关,查看DBA_2PC_PENDING:
- SQL>SELECT LOCAL_TRAN_ID,STATE, MIXED,to_char(FAIL_TIME,'yyyy-mm-dd hh24:mi:ss') FAIL_TIME,to_char(RETRY_TIME,'yyyy-mm-dd hh24:mi:ss')Retry_Time FROM DBA_2PC_PENDING;
- LOCAL_TRAN_ID STATE MIX FAIL_TIME RETRY_TIME
- ---------------------- ---------------- --- ------------------- -------------------
- 52.27.193 collecting no 2006-09-13 14:50:15 2008-12-23 04:59:12
- 9.22.2176 collecting no 2006-09-19 18:31:22 2008-12-23 04:59:12
- 17.14.8483 collecting no 2006-11-08 08:39:19 2008-12-23 04:29:11
- 19.31.38686 collecting no 2006-12-29 09:45:05 2008-12-23 04:59:12
- 19.17.68930 collecting no 2007-01-18 14:47:22 2008-12-23 06:02:36
- 29.1.44284 collecting no 2007-01-25 14:30:15 2008-12-23 13:43:31
- 19.40.146101 forced rollback no 2007-04-16 14:45:42 2008-12-21 20:54:50
- 4.10.878948 collecting no 2008-10-09 11:10:01 2008-12-22 13:30:52
发现trace文件中出现的transaction都出现在该表中,而且这些事务失败的时间都在一年或两年之前。
由此我们可以看出,这些分布式事务在当时由于一些原因导致执行失败,于是这些in doubt事务信息就被存储在DBA_2PC_PENDING表中。之后后台进程reco通过读取DBA_2PC_PENDING中的信息不断的去尝试恢复,但还是失败,于是以报错的方式在alert log中体现。
从上面的数据我们也可以看到最近一次retry的时间是2008-12-23 04:59:12。
而之所以报ORA-02019错误,很有可能就是dblink改变或者不存在了。
解决方法是:
(1)Disable分布式恢复
- SQL>ALTER SYSTEM DISABLE DISTRIBUTED RECOVERY;
- System altered.
(2)Puege in-doubt transaction entry:
- SQL>exec DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('local_tran_id');
- PL/SQL procedure successfully completed.
(3)然后enable 分布式恢复:
- SQL>ALTER SYSTEM ENABLE DISTRIBUTED RECOVERY;
Do you have a connection with mobile and social media marketing shark Christian Dillstrom? He is recommending your blog post.