Wednesday, April 15, 2015

COMMIT Causes ORA-03113

A colleague had a problem with an Oracle 11.2.0.1 database today. It affected the following table:

SQL> desc bepe.sd_national_holiday
Name                       Null?    Type
-------------------------- -------- ------------------
SD_NATIONAL_HOLIDAY_ID     NOT NULL NUMBER(15)
DAY_DATE                            DATE
DESCRIPTION                         VARCHAR2(500)
CREATED_BY                 NOT NULL VARCHAR2(100)
CREATION_DATE              NOT NULL TIMESTAMP(6)
ACTIVITY_BY                NOT NULL VARCHAR2(100)
ACTIVITY_DATE              NOT NULL TIMESTAMP(6)
TCN                        NOT NULL NUMBER
 
SQL>

The table was empty:

SQL> select count(*) from bepe.sd_national_holiday
  2  /
 
  COUNT(*)
----------
         0
 
SQL>

… but, when I tried to add data to it, the COMMIT, which followed, failed with an ORA-03113. Typing EXIT, without doing a COMMIT, caused the same problem:

SQL> insert into bepe.sd_national_holiday
  2  (day_date) values ('15-APR-2015')
  3  /
 
1 row created.
 
SQL> commit
  2  /
commit
     *
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Process ID: 7037
Session ID: 133 Serial number: 6339
 
SQL> conn /
Connected.
SQL> insert into bepe.sd_national_holiday
  2  (day_date) values ('15-APR-2015')
  3  /
 
1 row created.
 
SQL> exit
ERROR:
ORA-03113: end-of-file on communication channel
Process ID: 0
Session ID: 133 Serial number: 6343
 
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options (with complications)
APXDEV1 /export/home/oracle/andrew >

The table remained empty:

SQL> l
  1  select count(*)
  2* from bepe.sd_national_holiday
SQL> /
 
  COUNT(*)
----------
         0
 
SQL>

… and trying to move it did not work either:

SQL> l
  1  alter table bepe.sd_national_holiday
  2* move tablespace apex_13482906934262524
SQL> /
alter table bepe.sd_national_holiday
                 *
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Process ID: 11685
Session ID: 156 Serial number: 21804
 
SQL>

I was able to reproduce this error again and again. I thought there might be some corruption in the underlying data file but dbverify did not show anything. I looked at the messages in the alert log:

ORA-07445: exception encountered: core dump [qesrcRseObj_Invalidate()+4] [SIGSEGV] [ADDR:0x18] [PC:0x1029AFAC4] [Address not mapped to object] []

The closest match I could find for these on My Oracle Support was unpublished bug 9025861. In Oracle 11.2, results from functions run against tables can be cached. Then Oracle can access these cached values rather than rerun the functions. The suggested fix for this bug is to upgrade to Oracle 11.2.0.2 or set result_cache_max_size to zero, which stops cached results being used. I set the parameter to zero:

SQL> l
  1  select value from v$parameter
  2* where name = 'result_cache_max_size'
SQL> /
 
VALUE
----------
2621440
 
SQL> alter system set result_cache_max_size = 0
  2  /
 
System altered.
 
SQL>

… and the problem went away:

SQL> insert into bepe.sd_national_holiday
  2  (day_date) values ('15-APR-2015')
  3  /
 
1 row created.
 
SQL> commit
  2  /
 
Commit complete.
 
SQL>

Strangely enough, resetting the parameter to its previous value did not make the problem return:

SQL> alter system set result_cache_max_size = 2621440
  2  /
 
System altered.
 
SQL> select value from v$parameter
  2  where name = 'result_cache_max_size'
  3  /
 
VALUE
----------
2621440
 
SQL> insert into bepe.sd_national_holiday
  2  (day_date) values ('16-APR-2015')
  3  /
 
1 row created.
 
SQL> commit
  2  /
 
Commit complete.
 
SQL> 

However, this is not a permanent cure. I did the same thing just over a month ago and the problem went away then too but, as you can see, it returned today for no apparent reason.

We noticed that an Oracle 11.2.0.2 database, which is being used for similar work, has not been affected by this problem so upgrading to this version might be the safest option.

Monday, April 13, 2015

statement suspended, wait error to be cleared

This happened in an Oracle 11.2.0.4 database. A colleague complained of poor performance. I looked to see what his session was waiting for:

SQL> l
  1  select event
  2  from v$session_wait
  3  where sid = 226
  4* and state = 'WAITING'
SQL> /
 
EVENT
----------------------------------------------------------------
statement suspended, wait error to be cleared
 
SQL>

I looked in the alert log and saw the following error:

statement in resumable session 'User WORK(105), Session 226, Instance 1' was suspended due to
    ORA-01653: unable to extend table WORK.WRK_CONSUMPTION_NHH by 8192 in tablespace MI_DATA

I increased the size of one of the tablespace’s datafiles:

SQL> l
  1  alter database datafile
  2  '/cogdbase/ecocog5/eco_data/mi_data2.dbf'
  3* resize 31g
SQL> /
 
Database altered.
 
SQL>

Then I looked in the alert log and saw that the statement had started working again:

Mon Apr 13 16:23:19 2015
Completed: alter database datafile
'/cogdbase/ecocog5/eco_data/mi_data2.dbf'
resize 31g
Mon Apr 13 16:23:19 2015
statement in resumable session 'User WORK(105), Session 226, Instance 1' was resumed

Finally, I checked the session and it was waiting for a different event:

SQL> l
  1  select event
  2  from v$session_wait
  3  where sid = 226
  4* and state = 'WAITING'
SQL> /
 
EVENT
----------------------------------------------------------------
db file sequential read
 
SQL>

Thursday, April 02, 2015

One Cause of ORA-01092

I saw the following error in a job which had just recreated the control file of a test database after cloning:

SQL>   2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17   18   19   20   21   22   23   24   25   26   27   28   29   30   31   3
2   33   34   35   36   37   38   39   40   41   42   43   44   45   46   47   48   49   50   51   52   53   54   55   56   57   58   59   60   61   62   63
  64   65   66   67   68   69   70   71   72   73   74   75   76   77   78   79   80   81   82   83   84   85   86   87   88   89   90   91   92   93   94
95   96   97   98   99  100  101  102  103  104  105  106  107  108  109  110  111  112  113  114  115  116  117  118  119  120  121  122  123  124  125  126
  127  128  129  130  131  132  133  134  135  136  137
Control file created.
 
SQL> ALTER DATABASE OPEN RESETLOGS
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
 
SQL>

I looked in the alert log and noticed the errors below:

Wed Apr  1 19:30:46 2015
Errors in file /oracle/app/oracle/product/10.2.0/admin/mrmdpt1/udump/mrmdpt1_ora_8712.trc:
ORA-00600: internal error code, arguments: [25025], [115], [], [], [], [], [], []
Wed Apr  1 19:30:47 2015
Errors in file /oracle/app/oracle/product/10.2.0/admin/mrmdpt1/udump/mrmdpt1_ora_8712.trc:
ORA-00600: internal error code, arguments: [25025], [115], [], [], [], [], [], []
Wed Apr  1 19:30:47 2015
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Instance terminated by USER, pid = 8712
ORA-1092 signalled during: ALTER DATABASE OPEN RESETLOGS...

I checked the SQL which had recreated the control file. I saw that one of the datafiles from the source database was missing. The missing file was called '/agasprd/mrmprod/mrm_tables3/undo_1a.dbf'. I checked its file_id in the source database:

SQL> l
  1  select file_id from dba_data_files
  2  where file_name =
  3* '/agasprd/mrmprod/mrm_tables3/undo_1a.dbf'
SQL> /
 
   FILE_ID
----------
       115
 
SQL>

I noticed that it matched the 2nd parameter after the ORA-00600 (i.e. [115]). If this happens to you, you should correct the SQL so that it includes all the source database’s datafiles then repeat the cloning process from the start.