Wednesday, August 24, 2016

SUBSTR Versus LIKE in Oracle 11.2

I was reading an old SQL tuning book which was printed in 2002. It said that a where clause with like could often use an index whereas a similar clause using substr could not. I wondered if this might still be the case in an Oracle 11.2.0.1 database. To find out, I created a table:

SQL> conn andrew/reid
Connected.
SQL> create table tab1 as
  2  select table_name from dba_tables
  3  /

Table created.

SQL>


... and made sure it had plenty of data:

SQL> begin
  2  for a in 1..12 loop
  3  insert into tab1 select * from tab1;
  4  end loop;
  5  end;
  6  /

PL/SQL procedure successfully completed.

SQL> select count(*) from tab1
  2  /

  COUNT(*)
----------
  13348864

SQL>


I added an extra row which I could look for later:

SQL> insert into tab1 values('DAILY_FORECAST')
  2  /

1 row created.

SQL>


...added an index to help find it:

SQL> create index ind1 on tab1(table_name)
  2  /

Index created.

SQL>


...and collected statistics:

SQL> exec dbms_stats.gather_table_stats(-
> ownname=>'andrew', -
> tabname=>'tab1', -
> cascade=>true);

PL/SQL procedure successfully completed.

SQL>


I used like to find the row and it took 0.39 seconds:

SQL> alter session set sql_trace = true
  2  /

Session altered.

SQL> set timing on
SQL> select count(*) from tab1
  2  where table_name like 'DAILY%'
  3  /

  COUNT(*)
----------
         1

Elapsed: 00:00:00.39

SQL>


... but when I used substr, it took 28.79 seconds:

SQL> select count(*) from tab1
  2  where substr(table_name,1,5) = 'DAILY'
  3  /

  COUNT(*)
----------
         1

Elapsed: 00:00:28.79

SQL> set timing off
SQL> alter session set sql_trace = false
  2  /

Session altered.

SQL>


I ran the trace file through tkprof to see how Oracle had executed the SQL. The statement which used substr had done a full table scan:

********************************************************************************

select count(*) from tab1
where substr(table_name,1,5) = 'DAILY'

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      7.16      28.78      38936      38940          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      7.16      28.78      38936      38940          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 8891  (ANDREW)

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=38940 pr=38936 pw=0 time=0 us)
      1   TABLE ACCESS FULL TAB1 (cr=38940 pr=38936 pw=0 time=0 us cost=9169 size=2135824 card=133489)

Rows     Execution Plan
-------  ---------------------------------------------------
      0  SELECT STATEMENT   MODE: ALL_ROWS
      1   SORT (AGGREGATE)
      1    TABLE ACCESS   MODE: ANALYZED (FULL) OF 'TAB1' (TABLE)

********************************************************************************


...but the statement which used like had used the index:

********************************************************************************

select count(*) from tab1
where table_name like 'DAILY%'

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.02       0.01          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.00       0.01          3          3          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.02       0.02          3          3          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 8891  (ANDREW)

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=3 pr=3 pw=0 time=0 us)
      1   INDEX RANGE SCAN IND1 (cr=3 pr=3 pw=0 time=0 us cost=19 size=65792 card=4112)(object id 211183)

Rows     Execution Plan
-------  ---------------------------------------------------
      0  SELECT STATEMENT   MODE: ALL_ROWS
      1   SORT (AGGREGATE)
      1    INDEX   MODE: ANALYZED (RANGE SCAN) OF 'IND1' (INDEX)

********************************************************************************

2 comments:

Marko Sutic said...

Hi Andrew,

you cannot compare performance of substr and like operator because substr values are not indexed.
If you want to compare performance then you should create function index.

create index ind2 on tab1(substr(table_name,1,5));

Best regards,
Marko

Andrew Reid said...

Dear Marko,

That's true.
Thank you for taking the time to comment.
I will take a look at your blog when I get a moment.

Kind Regards,

Andrew