A customer's two-node RAC reported the following errors:
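ORA-00600: internal error code, arguments: [17285], [0x2A97446380], [4294967295], [0x1DD7B1238], [], [], [], []
ORA-02055: distributed update operation failed; rollback required
ORA-02055: distributed update operation failed; rollback required
ORA-00604: error occurred at recursive SQL level 1
ORA-03113: end-of-file on communication channel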
Analysis of the trace file:
*** 2009-01-15 10:40:33.181
*** SERVICE NAME:(pnbrw) 2009-01-15 10:40:33.152
*** SESSION ID:(516.19897) 2009-01-15 10:40:33.152
*********START PLSQL RUNTIME DUMP************
***Got internal error Exception caught in pfrrun() while running PLSQL***
***Got ORA-3135 while running PLSQL***
PACKAGE BODY SYS.PBREAK:
library unit=1dd897640 line=1129 opcode=158 static link=0 scope=1
FP=2a97410dd0 PC=1bac87a8a Page=1 AP=2a9740fb98 ST=2a97410fd0
DL0=2a9722a468 GF=2a9722a500 DL1=2a9722a4a8 DPF=2a9722a4f0 DS=1ba76c6f0
SYS.PBREAK, which appears in the PACKAGE BODY line of the dump above, is an undocumented built-in Oracle package used mainly for debugging.
The ORA-3135 error means the following:
[oracle@pnbrw1 bdump]$ oerr ora 3135
03135, 00000, "connection lost contact"
// *Cause: 1) Server unexpectedly terminated or was forced to terminate.
// 2) Server timed out the connection.
// *Action: 1) Check if the server session was terminated.
// 2) Check if the timeout parameters are set properly in sqlnet.ora.
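As a rough aid for the second action item, the sketch below scans the text of a sqlnet.ora file and reports any timeout-related settings it finds. The function name `find_timeout_settings`, the sample file contents, and the choice of which parameters count as "timeout parameters" are assumptions made for illustration, not something taken from the customer's system.

```python
# Assumed list of timeout-related sqlnet.ora parameters worth reviewing.
TIMEOUT_PARAMS = (
    "SQLNET.EXPIRE_TIME",
    "SQLNET.INBOUND_CONNECT_TIMEOUT",
    "SQLNET.SEND_TIMEOUT",
    "SQLNET.RECV_TIMEOUT",
)


def find_timeout_settings(sqlnet_text):
    """Return {PARAMETER: value} for timeout parameters set in sqlnet.ora text."""
    settings = {}
    for line in sqlnet_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        name, value = (part.strip() for part in line.split("=", 1))
        # Parameter names are matched case-insensitively.
        if name.upper() in TIMEOUT_PARAMS:
            settings[name.upper()] = value
    return settings


# Hypothetical sqlnet.ora contents, for demonstration only.
example = """
# sample file
SQLNET.EXPIRE_TIME = 10
NAMES.DIRECTORY_PATH = (TNSNAMES, EZCONNECT)
sqlnet.inbound_connect_timeout=60
"""

print(find_timeout_settings(example))
```

An empty result simply means none of the listed parameters is set explicitly, in which case the defaults apply.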
Next, look at the following session state dump:
SO: 0x1dc56b2e8, type: 4, owner: 0x1dc3b4bb0, flag: INIT/-/-/0x00
(session) sid: 516 trans: 0x1d182c778, creator: 0x1dc3b4bb0, flag: (100041) USR/- BSY/-/-/-/-/-
DID: 0001-003B-00035386, short-term DID: 0001-003B-00035387
txn branch: 0x1d18aee80
oct: 47, prv: 0, sql: 0x1c2a97e40, psql: 0x1c1f75370, user: 83/RPMGR
O/S info: user: noby, term: IT25511B, ospid: 2624:3804, machine:
program: TOAD.exe
application name: TOAD.exe, hash value=0
last wait for 'pipe get' blocking sess=0x(nil) seq=47937 wait_time=2929946 seconds since wait started=10
handle address=1de291608, buffer length=1000, timeout=e10
*** 2009-01-15 10:40:43.715
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [17285], [0x2A97446380], [4294967295], [0x1DD7B1238], [], [], [], []
ORA-02055: distributed update operation failed; rollback required
ORA-02055: distributed update operation failed; rollback required
ORA-00604: error occurred at recursive SQL level 1
ORA-03113: end-of-file on communication channel
[oracle@pnbrw1 bdump]$ oerr ora 02055
02055, 00000, "distributed update operation failed; rollback required"
// *Cause: a failure during distributed update operation may not have
// rolled back all effects of the operation. Since
// some sites may be inconsistent, the transaction must roll back to
// a savepoint or entirely
// *Action: rollback to a savepoint or rollback transaction and resubmit
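From this, the preliminary conclusions are:
1. Someone was debugging with TOAD at the time, which triggered the ORA-600, ORA-02055, ORA-00604 and related errors.
2. ORA-02055 indicates that a distributed update failed and must be rolled back; the most likely reason for the failure is that the session was terminated.
3. Track down who was working on IT25511B and find out what they were running at the time.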
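To identify the person behind the session (step 3), the relevant fields can be pulled out of the session state dump mechanically. The following is a minimal sketch: the function name `summarize_session` and the regular expressions are illustrative assumptions based only on the layout of the dump shown above, and other Oracle versions may format these lines differently.

```python
import re


def summarize_session(trace_text):
    """Extract who/where/what fields from an Oracle session state dump."""
    patterns = {
        "sid": r"\(session\) sid:\s*(\d+)",
        "db_user": r"user:\s*\d+/(\w+)",          # e.g. "user: 83/RPMGR"
        "os_user": r"O/S info: user:\s*([^,\s]+)",
        "terminal": r"term:\s*([^,\s]+)",
        "program": r"program:\s*(\S+)",
        # Accept straight or typographic quotes around the event name.
        "last_wait": r"last wait for\s+['‘]([^'’]+)['’]",
    }
    summary = {}
    for field, pattern in patterns.items():
        match = re.search(pattern, trace_text)
        summary[field] = match.group(1) if match else None
    return summary


# Lines copied from the session state dump in this post.
sample = """\
(session) sid: 516 trans: 0x1d182c778, creator: 0x1dc3b4bb0, flag: (100041) USR/- BSY/-/-/-/-/-
oct: 47, prv: 0, sql: 0x1c2a97e40, psql: 0x1c1f75370, user: 83/RPMGR
O/S info: user: noby, term: IT25511B, ospid: 2624:3804, machine:
program: TOAD.exe
last wait for 'pipe get' blocking sess=0x(nil) seq=47937 wait_time=2929946 seconds since wait started=10
"""

for field, value in summarize_session(sample).items():
    print(f"{field}: {value}")
```

Fields that cannot be found come back as `None`, so the same function can be run over a whole trace file without failing on dumps that lack a given line.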
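These errors have no impact on the system, although the fact that they are raised at all is quite likely due to an Oracle bug. The following MetaLink entries look somewhat similar to this issue: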
Note:464607.1
Note:335954.1
Bug No. 4640115
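Postscript:
The customer later confirmed that someone had indeed been debugging with TOAD on IT25511B at the time, because a program they were running had a problem. We therefore advised them not to do any debugging in the production environment.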