I was asked to look at a long running DELETE
statement recently. Once I had solved the problem, I decided to record
it for future reference. I set up data to reproduce the problem in an
Oracle 11.2.0.2.7 test database and ran it as follows:
SQL> conn andrew/reid
Connected.
SQL> select to_char(sysdate,'DD-MON-YY HH24:MI:SS')
2 start_time from dual
3 /
START_TIME
------------------
25-OCT-12 18:38:00
SQL> alter session set sql_trace = true
2 /
Session altered.
SQL> delete t1
2 /
1000000 rows deleted.
SQL> alter session set sql_trace = false
2 /
Session altered.
SQL> select to_char(sysdate,'DD-MON-YY HH24:MI:SS')
2 finish_time from dual
3 /
FINISH_TIME
------------------
26-OCT-12 07:37:07
SQL>
Deleting
1 million rows would not normally take almost 13 hours so there was
something else going on. I ran the trace file through tkprof with sys=no and saw the following in the output file:
********************************************************************************
SQL ID: 9mk1dmrqf9dv8 Plan Hash: 784337829
delete t1
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.00 0 3 0 0
Execute 1 604.16 795.65 3645 1624 6062716 1000000
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 604.17 795.66 3645 1627 6062716 1000000
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 126 (ANDREW)
Number of plan statistics captured: 1
Rows (1st) Rows (avg) Rows (max) Row Source Operation
---------- ---------- ---------- ---------------------------------------------------
0 0 0 DELETE T1 (cr=1561304066 pr=3714 pw=0 time=3797440855 us)
1000000 1000000 1000000 TABLE ACCESS FULL T1 (cr=1562 pr=1467
pw=0 time=18489606 us cost=461 size=13097435 card=1007495)
Rows Execution Plan
------- ---------------------------------------------------
0 DELETE STATEMENT MODE: ALL_ROWS
0 DELETE OF 'T1'
1000000 TABLE ACCESS MODE: ANALYZED (FULL) OF 'T1' (TABLE)
********************************************************************************
According
to this, the deletion had only used 10 minutes of CPU time, which did
not explain the 13 hour run time. At the end of the file I saw this,
which gave me a clue:
OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 120 0.00 0.01 0 0 6 0
Execute 1000129 42414.03 45950.87 60 1561302066 1038509 1000000
Fetch 270 0.01 0.58 9 478 0 147
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 1000519 42414.04 45951.47 69 1561302544 1038515 1000147
Recursive statements are commands run automatically by Oracle so I ran the tkprof again, this time without the sys=no. An extra entry appeared in the output file and it accounted for the excessive run time:
********************************************************************************
SQL ID: dm6vj6u32r27r Plan Hash: 1813935393
delete from "ANDREW"."T2"
where
"COL1" = :1
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1000000 42413.99 45950.84 60 1561302066 1038509 1000000
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 1000001 42413.99 45950.84 60 1561302066 1038509 1000000
Misses in library cache during parse: 1
Misses in library cache during execute: 2
Optimizer mode: CHOOSE
Parsing user id: SYS (recursive depth: 1)
Number of plan statistics captured: 1
Rows (1st) Rows (avg) Rows (max) Row Source Operation
---------- ---------- ---------- ---------------------------------------------------
0 0 0 DELETE T2 (cr=1562 pr=1 pw=0 time=34707 us)
0 0 0 TABLE ACCESS FULL T2 (cr=1562 pr=1 pw=0 time=34700 us)
********************************************************************************
I
had not run this statement so something was causing Oracle to execute
it automatically one million times. I thought it might be some sort of
referential integrity checking so I looked for indexes on table T1:
SQL> l
1 select index_name, column_name, column_position
2 from dba_ind_columns
3 where table_owner = 'ANDREW'
4* and table_name = 'T1'
SQL> /
INDEX_NAME COLUMN_NAME COLUMN_POSITION
--------------- --------------- ---------------
PK1 COL1 1
SQL>
Then I checked to see if any tables were referring to index PK1:
SQL> l
1 select owner, table_name
2 from dba_constraints
3* where r_constraint_name = 'PK1'
SQL> /
OWNER TABLE_NAME
--------------- ---------------
ANDREW T2
SQL>
... and looked up the DDL used to create this table:
SQL> l
1* select dbms_metadata.get_ddl('TABLE','T2','ANDREW') from dual
SQL> /
CREATE TABLE "ANDREW"."T2"
( "COL1" NUMBER,
CONSTRAINT "CHECK_NUMBER1" FOREIGN KEY ("COL1")
REFERENCES "ANDREW"."T1" ("COL1") ON DELETE CASCADE ENABLE
) SEGMENT CREATION IMMEDIATE
PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 NOCOMPRESS LOGGING
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
TABLESPACE "USERS" ;
SQL>
This showed referential integrity checking between tables T1 and T2. Each time a row was deleted from table T1, the corresponding row(s) had to be removed from table T2. This was the reason for the 1 million executions of delete from "ANDREW"."T2" where "COL1" = :1 in the tkprof output above.
I looked for indexes on table T2 but there were none:
SQL> l
1 select index_name
2 from dba_indexes
3 where owner = 'ANDREW'
4* and table_name = 'T2'
SQL> /
no rows selected
SQL>
This explained the full table scans when rows were deleted from table T2. I created an index on table T2 and reran the test. This time the deletion took a little over 3 minutes:
SQL> create index t2_index on t2(col1)
2 /
Index created.
SQL> select to_char(sysdate,'DD-MON-YY HH24:MI:SS')
2 start_time from dual
3 /
START_TIME
------------------
29-OCT-12 12:14:26
SQL> delete t1
2 /
1000000 rows deleted.
SQL> select to_char(sysdate,'DD-MON-YY HH24:MI:SS')
2 finish_time from dual
3 /
FINISH_TIME
------------------
29-OCT-12 12:17:45
SQL>