Showing posts with label full table scan. Show all posts
Showing posts with label full table scan. Show all posts

Wednesday, August 24, 2016

SUBSTR Versus LIKE in Oracle 11.2

I was reading an old SQL tuning book which was printed in 2002. It said that a where clause with like could often use an index whereas a similar clause using substr could not. I wondered if this might still be the case in an Oracle 11.2.0.1 database. To find out, I created a table:

SQL> conn andrew/reid
Connected.
SQL> create table tab1 as
  2  select table_name from dba_tables
  3  /

Table created.

SQL>


... and made sure it had plenty of data:

SQL> begin
  2  for a in 1..12 loop
  3  insert into tab1 select * from tab1;
  4  end loop;
  5  end;
  6  /

PL/SQL procedure successfully completed.

SQL> select count(*) from tab1
  2  /

  COUNT(*)
----------
  13348864

SQL>


I added an extra row which I could look for later:

SQL> insert into tab1 values('DAILY_FORECAST')
  2  /

1 row created.

SQL>


...added an index to help find it:

SQL> create index ind1 on tab1(table_name)
  2  /

Index created.

SQL>


...and collected statistics:

SQL> exec dbms_stats.gather_table_stats(-
> ownname=>'andrew', -
> tabname=>'tab1', -
> cascade=>true);

PL/SQL procedure successfully completed.

SQL>


I used like to find the row and it took 0.39 seconds:

SQL> alter session set sql_trace = true
  2  /

Session altered.

SQL> set timing on
SQL> select count(*) from tab1
  2  where table_name like 'DAILY%'
  3  /

  COUNT(*)
----------
         1

Elapsed: 00:00:00.39

SQL>


... but when I used substr, it took 28.79 seconds:

SQL> select count(*) from tab1
  2  where substr(table_name,1,5) = 'DAILY'
  3  /

  COUNT(*)
----------
         1

Elapsed: 00:00:28.79

SQL> set timing off
SQL> alter session set sql_trace = false
  2  /

Session altered.

SQL>


I ran the trace file through tkprof to see how Oracle had executed the SQL. The statement which used substr had done a full table scan:

********************************************************************************

select count(*) from tab1
where substr(table_name,1,5) = 'DAILY'

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      7.16      28.78      38936      38940          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      7.16      28.78      38936      38940          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 8891  (ANDREW)

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=38940 pr=38936 pw=0 time=0 us)
      1   TABLE ACCESS FULL TAB1 (cr=38940 pr=38936 pw=0 time=0 us cost=9169 size=2135824 card=133489)

Rows     Execution Plan
-------  ---------------------------------------------------
      0  SELECT STATEMENT   MODE: ALL_ROWS
      1   SORT (AGGREGATE)
      1    TABLE ACCESS   MODE: ANALYZED (FULL) OF 'TAB1' (TABLE)

********************************************************************************


...but the statement which used like had used the index:

********************************************************************************

select count(*) from tab1
where table_name like 'DAILY%'

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.02       0.01          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.00       0.01          3          3          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.02       0.02          3          3          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 8891  (ANDREW)

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=3 pr=3 pw=0 time=0 us)
      1   INDEX RANGE SCAN IND1 (cr=3 pr=3 pw=0 time=0 us cost=19 size=65792 card=4112)(object id 211183)

Rows     Execution Plan
-------  ---------------------------------------------------
      0  SELECT STATEMENT   MODE: ALL_ROWS
      1   SORT (AGGREGATE)
      1    INDEX   MODE: ANALYZED (RANGE SCAN) OF 'IND1' (INDEX)

********************************************************************************

Monday, June 11, 2012

Analyze versus dbms_stats

Many years ago, if you wanted to collect optimizer statistics for a table, you had to use the analyze command. Then Oracle brought out dbms_stats and said you should use that instead as it produced better statistics. I decided to test this out on an Oracle 10.2.0.3.0 database. I did this by analyzing a table then running a query on it. Then I ran dbms_stats on the table, ran the query again and compared the results. First I set sql_trace to true to give me a trace file to run through tkprof later:

SQL> alter session set sql_trace = true
  2  /

Session altered.

SQL>

Then I created a table, added lots of data to it and counted the rows:

SQL> create table andrews_table
  2  storage (maxextents unlimited)
  3  as select * from dba_tables
  4  /

Table created.

SQL> begin
  2  for a in 1..9 loop
  3  insert into andrews_table
  4  select * from andrews_table;
  5  end loop;
  6  end;
  7  /

PL/SQL procedure successfully completed.

SQL> select count(*) from andrews_table
  2  /

  COUNT(*)
----------
   1074688

SQL>

Then I counted the number of rows with SYSTEM as the owner. The percentage was quite small so using an index to access them should be better than doing a full table scan:

SQL> select owner, count(*)
  2  from andrews_table
  3  where owner = 'SYSTEM'
  4  group by owner
  5  /

OWNER                            COUNT(*)
------------------------------ ----------
SYSTEM                              75264

SQL>

Next I created an index, analyzed the table and checked that both the table and the index had been analyzed:

SQL> create index andrews_index
  2  on andrews_table(owner)
  3  /

Index created.

SQL> analyze table andrews_table
  2  compute statistics
  3  /

Table analyzed.

SQL> select last_analyzed from user_tables
  2  where table_name = 'ANDREWS_TABLE'
  3  /

LAST_ANAL
---------
09-MAY-12

SQL> select last_analyzed from user_indexes
  2  where index_name = 'ANDREWS_INDEX'
  3  /

LAST_ANAL
---------
09-MAY-12

SQL>

Then I queried the rows with SYSTEM as the owner. Sometimes when I run tests like this, I give Oracle a perfectly good index to use but it does a full table scan instead and I never know why so I added a hint to make sure that Oracle used the index:

SQL> select /*+ index(andrews_table andrews_index) */
  2  count(*) after_analyze from andrews_table
  3  where owner = 'SYSTEM'
  4  and pct_used = 40
  5  /

AFTER_ANALYZE
-------------
        50176

SQL> alter session set sql_trace = false
  2  /

Session altered.

SQL>

The next step was to repeat the test using dbms_stats instead:

SQL> alter session set sql_trace = true
  2  /

Session altered.

SQL> create table andrews_table
  2  storage (maxextents unlimited)
  3  as select * from dba_tables
  4  /

Table created.

SQL> begin
  2  for a in 1..9 loop
  3  insert into andrews_table
  4  select * from andrews_table;
  5  end loop;
  6  end;
  7  /

PL/SQL procedure successfully completed.

SQL> select count(*) from andrews_table
  2  /

  COUNT(*)
----------
   1074688

SQL> select owner, count(*)
  2  from andrews_table
  3  where owner = 'SYSTEM'
  4  group by owner
  5  /

OWNER                            COUNT(*)
------------------------------ ----------
SYSTEM                              75264

SQL> create index andrews_index
  2  on andrews_table(owner)
  3  /

Index created.

SQL> exec dbms_stats.gather_table_stats -
> (ownname=>'oracle', -
>  tabname=>'andrews_table', -
>  cascade=>true);

PL/SQL procedure successfully completed.

SQL> select last_analyzed from user_tables
  2  where table_name = 'ANDREWS_TABLE'
  3  /

LAST_ANAL
---------
09-MAY-12

SQL> select last_analyzed from user_indexes
  2  where index_name = 'ANDREWS_INDEX'
  3  /

LAST_ANAL
---------
09-MAY-12

SQL> select /*+ index(andrews_table andrews_index) */
  2  count(*) after_dbms_stats from andrews_table
  3  where owner = 'SYSTEM'
  4  and pct_used = 40
  5  /

AFTER_DBMS_STATS
----------------
           50176

SQL> alter session set sql_trace = false
  2  /

Session altered.

SQL>

Finally I ran the trace file through tkprof and compared the results. CPU time, elapsed time, disk reads and buffer gets were all quite a bit less when dbms_stats was used to collect statistics:

********************************************************************************

select /*+ index(andrews_table andrews_index) */
count(*) after_analyze from andrews_table
where owner = 'SYSTEM'
and pct_used = 40

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      2.54      31.74      12696      13209          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      2.55      31.75      12696      13209          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 32  (ORACLE)

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=13209 pr=12696 pw=0 time=31745246 us)
  50176   TABLE ACCESS BY INDEX ROWID ANDREWS_TABLE (cr=13209 pr=12696 pw=0 time=43170772 us)
  75264    INDEX RANGE SCAN ANDREWS_INDEX (cr=192 pr=191 pw=0 time=633026 us)(object id 120507)

Rows     Execution Plan
-------  ---------------------------------------------------
      0  SELECT STATEMENT   MODE: ALL_ROWS
      1   SORT (AGGREGATE)
  50176    TABLE ACCESS   MODE: ANALYZED (BY INDEX ROWID) OF
               'ANDREWS_TABLE' (TABLE)
  75264     INDEX   MODE: ANALYZED (RANGE SCAN) OF 'ANDREWS_INDEX'
                (INDEX)

********************************************************************************

select /*+ index(andrews_table andrews_index) */
count(*) after_dbms_stats from andrews_table
where owner = 'SYSTEM'
and pct_used = 40

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      2.12      24.39      11777      12711          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      2.12      24.39      11777      12711          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 32  (ORACLE)

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=12711 pr=11777 pw=0 time=24395897 us)
  50176   TABLE ACCESS BY INDEX ROWID ANDREWS_TABLE (cr=12711 pr=11777 pw=0 time=26819905 us)
  75264    INDEX RANGE SCAN ANDREWS_INDEX (cr=192 pr=134 pw=0 time=378268 us)(object id 120510)

Rows     Execution Plan
-------  ---------------------------------------------------
      0  SELECT STATEMENT   MODE: ALL_ROWS
      1   SORT (AGGREGATE)
  50176    TABLE ACCESS   MODE: ANALYZED (BY INDEX ROWID) OF
               'ANDREWS_TABLE' (TABLE)
  75264     INDEX   MODE: ANALYZED (RANGE SCAN) OF 'ANDREWS_INDEX'
                (INDEX)

********************************************************************************

Monday, June 04, 2012

db file scattered read

I tested this example on Oracle 11.2.

I wanted to see whether the TIME_WAITED column in V$SESSION_EVENT is only incremented for events such as db file scattered read and db file sequential read if Oracle has to go to disk to satisfy a read request. I decided to check this out using the db file scattered read event, which is associated with full table scans. Before I started, I created a table called ANDREWS_TABLE. Then I checked the total time my session had waited for the event in question:

SQL> select time_waited
  2  from v$session_event
  3  where sid = (select distinct sid from v$mystat)
  4  and event = 'db file scattered read'
  5  /

TIME_WAITED
-----------
        520

SQL>

Next I counted the rows in the table. This was the first time I had done this so Oracle had to get the rows from disk:

SQL> set timing on
SQL> select count(*) from andrews_table
  2  /

  COUNT(*)
----------
     12308

Elapsed: 00:00:00.18

SQL> set timing off
SQL>

I checked the total time waited for the event and it had increased by 0.15 seconds – so far so good:

SQL> select time_waited
  2  from v$session_event
  3  where sid = (select distinct sid from v$mystat)
  4  and event = 'db file scattered read'
  5  /

TIME_WAITED
-----------
        535

SQL>

I counted the rows again. This time Oracle already had the blocks in memory so it did not have to go to disk:

SQL> set timing on
SQL> select count(*) from andrews_table
  2  /

  COUNT(*)
----------
     12308

Elapsed: 00:00:00.01
SQL> set timing off
SQL>

The elapsed time went down from 0.18 seconds to 0.01 seconds because Oracle already had the rows in memory. Finally I checked the time waited for the event again:

SQL> select time_waited
  2  from v$session_event
  3  where sid = (select distinct sid from v$mystat)
  4  and event = 'db file scattered read'
  5  /

TIME_WAITED
-----------
        535

SQL>

It had not increased, showing that the TIME_WAITED for this event only goes up during a full table scan if Oracle has to fetch blocks from disk.