Tuesday, March 03, 2015

What is an INDEX SKIP SCAN?

I tested this on Oracle 11.2. First I created a table with two columns. The first column, OWNER, had low cardinality i.e. it had only two possible values, PUBLIC and SYS. The second column, OBJECT_NAME, had the names of all objects owned by PUBLIC and/or SYS so it had high cardinality i.e. it had many different values:

SQL> conn system/manager
Connected.
SQL> create table t1 as
  2  select owner, object_name
  3  from dba_objects
  4  where owner in ('SYS', 'PUBLIC')
  5  /
 
Table created.
 
SQL> select owner, count(*)
  2  from t1
  3  group by owner
  4  /
 
OWNER                            COUNT(*)
------------------------------ ----------
PUBLIC                              29041
SYS                                 31907
 
SQL>

I created an index on the table and gathered statistics:

SQL> create index i1 on t1(owner, object_name)
  2  /
 
Index created.
 
SQL> exec sys.dbms_stats.gather_table_stats(-
> ownname=>'SYSTEM',tabname=>'T1');
 
PL/SQL procedure successfully completed.
 
SQL>

Then I counted the number of rows with an OBJECT_NAME of DUAL. There were two, one owned by SYS and the other owned by PUBLIC. You might think that Oracle would not have been able to use the index, as OBJECT_NAME was not the leading column. However, as the leading column in the index, OWNER, had low cardinality, Oracle could SCAN the entries in the index for PUBLIC and see if any had an OBJECT_NAME of DUAL. It could then SKIP to the index entries for SYS and look for an OBJECT_NAME of DUAL there. This is why it is called an INDEX SKIP SCAN:

SQL> set autotrace on
SQL> select count(*) from t1
  2  where object_name = 'DUAL'
  3  /
 
  COUNT(*)
----------
         2
 
 
Execution Plan
----------------------------------------------------------
 
---------------------------------------------------------
| Id  | Operation        | Name | Rows  | Bytes | Cost  |
---------------------------------------------------------
|   0 | SELECT STATEMENT |      |     1 |    25 |     3 |
|   1 |  SORT AGGREGATE  |      |     1 |    25 |       |
|   2 |   INDEX SKIP SCAN| I1   |     2 |    50 |     3 |
---------------------------------------------------------
 
Note
-----
   - 'PLAN_TABLE' is old version
 
 
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          6  consistent gets
          0  physical reads
          0  redo size
        526  bytes sent via SQL*Net to client
        524  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
 
SQL>

I changed the test to include all the rows from DBA_OBJECTS, not just those owned by PUBLIC and/or SYS then reran it. Table T1 had 51 different values in the OWNER column so Oracle did a full table scan instead:

SQL> conn system/manager
Connected.
SQL> create table t1 as
  2  select owner, object_name
  3  from dba_objects
  4  /
 
Table created.
 
SQL> select count(distinct owner) from t1
  2  /
 
COUNT(DISTINCTOWNER)
--------------------
                  51
 
SQL> create index i1 on t1(owner, object_name)
  2  /
 
Index created.
 
SQL> exec sys.dbms_stats.gather_table_stats(-
> ownname=>'SYSTEM',tabname=>'T1');
 
PL/SQL procedure successfully completed.
 
SQL> set autotrace on
SQL> select count(*) from t1
  2  where object_name = 'DUAL'
  3  /
 
  COUNT(*)
----------
         2
 
 
Execution Plan
----------------------------------------------------------
 
-----------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost  |
-----------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    24 |    31 |
|   1 |  SORT AGGREGATE    |      |     1 |    24 |       |
|   2 |   TABLE ACCESS FULL| T1   |     2 |    48 |    31 |
-----------------------------------------------------------
 
Note
-----
   - 'PLAN_TABLE' is old version
 
 
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        362  consistent gets
          0  physical reads
          0  redo size
        526  bytes sent via SQL*Net to client
        524  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
 
SQL>

Using NOT = in a WHERE Clause

I ran this test in Oracle 12.1. First I created a table, added some data to it and made sure that the bytes column was set to zero in every row:
 
SQL> conn system/manager
Connected.
SQL> alter session set optimizer_mode = first_rows
  2  /
 
Session altered.
 
SQL> create table t1
  2  tablespace users
  3  as select segment_name, bytes
  4  from dba_segments
  5  /
 
Table created.
 
SQL> update t1 set bytes = 0
  2  /
 
5757 rows updated.
 
SQL> begin
  2  for a in 1..4 loop
  3  insert into t1 select * from t1;
  4  end loop;
  5  end;
  6  /
 
PL/SQL procedure successfully completed.
 
SQL>
 
Then I added one row with a value greater than zero in the bytes column:
 
SQL> insert into t1 (segment_name, bytes)
  2  values ('ABC123',1000)
  3  /
 
1 row created.
 
SQL>
 
I created an index and gathered statistics on the table:
 
SQL> create index i1 on t1(bytes)
  2  tablespace users
  3  /
 
Index created.
 
SQL> exec sys.dbms_stats.gather_table_stats(-
> ownname=>'SYSTEM',tabname=>'T1');
 
PL/SQL procedure successfully completed.
 
SQL>
 
I tried to find the row with the non-zero value in the bytes column like this. The NOT = (!=) made Oracle choose to do a full table scan instead of using the index:
 
SQL> set autotrace on
SQL> select segment_name from t1 where bytes != 0
  2  /
 
SEGMENT_NAME
------------------------------
ABC123
 
 
Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013
 
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 46057 |  1034K|   104   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   | 46057 |  1034K|   104   (1)| 00:00:01 |
--------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter("BYTES"<>0)
 
 
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        374  consistent gets
        371  physical reads
          0  redo size
        550  bytes sent via SQL*Net to client
        543  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
 
SQL>

I wanted to make Oracle use the index. In situations like this it is sometimes possible to change a query but still get the same results. In this example, you would not expect to see a negative value in the bytes column. So you could write the SQL like this instead:

SQL> select segment_name from t1 where bytes > 0
  2  /
 
SEGMENT_NAME
------------------------------
ABC123
 
 
Execution Plan
----------------------------------------------------------
Plan hash value: 37133972
 
--------------------------------------------------------------------------------------------
| Id  | Operation                           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |      | 46057 |  1034K|   263   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1   | 46057 |  1034K|   263   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | I1   | 46057 |       |    85   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("BYTES">0)
 
 
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          3  consistent gets
          3  physical reads
          0  redo size
        557  bytes sent via SQL*Net to client
        543  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
 
SQL>

This time, Oracle decided to use the index. The observant among you may be wondering why I set optimizer_mode to first_rows at the start. That was because the query refused to use the index at all when it was set to all_rows.