Saturday, December 18, 2010

Selecting Data at Random from an Oracle Table

This example shows how to select data at random. First I created a table and counted the number of rows it had:

SQL> create table andrew as
  2  select owner, object_name, object_type
  3  from dba_objects
  4  /

Table created.

SQL> select count(*) from andrew
  2  /

  COUNT(*)
----------
     68299

SQL>


Then I used the sample keyword to select 0.01% of the rows at random. In theory this should have produced 6 or 7 rows depending on how the rounding was done but the actual number was slightly different. This is normal:
  
SQL> col object_name format a30
SQL> select object_name from andrew sample (0.01)
  2  /

OBJECT_NAME
------------------------------
DBMS_STREAMS_AUTH
sun/nio/cs/ext/EUC_TW$Encoder
/55848856_BasicTableUIMouseInp
/e081aebc_LLNIFieldDefsRes
sun/awt/PlatformFont
/e3d5b152_DSAKeyPairGenerator
oracle/jms/AQjmsMessages_it
X$PR6UAG759GKO8J6PU11C34F52MCK
/eb2968cb_MlibConvolveRIF
WWV_META_CLEANUP

10 rows selected.

SQL>


I repeated the process and got a different sample. The number of rows shown was not the same either:

SQL> select object_name from andrew sample (0.01)
  2  /

OBJECT_NAME
------------------------------
/54e20e2_MIRSSAHeapVariableMan
/ad0cf592_BasicInternalFrameUI
/a0cd2aae_KDCReq
/ac90f805_DelegateImpl
/6a03f6a7_OraCustomizerErrorsT
SDO_NET_PARTITION
MGMT_CRED_ROW_ARRAY
WWV_FLOW_FLASH_CHARTS_PK
WWV_BIU_FLOW_LOV_VAL_AUDIT

9 rows selected.

SQL>


If the number of rows you see is important, the following method is better. I ran it twice to show that the number of rows returned was consistent. The number of rows selected was not displayed due to the setting of FEEDBACK:

SQL> show feedback
FEEDBACK ON for 6 or more rows
SQL> select object_name from
  2  (select object_name from andrew
  3   order by dbms_random.value)
  4  where rownum < 6
  5  /

OBJECT_NAME
------------------------------
CWM2$LEVELUPD
oracle/sql/CharacterSetJAEUC
DM_SVM_ATTRIBUTE
oracle/xquery/exec/XpathExpr
/b5483d4c_ByteArrayLexOrder

SQL> select object_name from
  2  (select object_name from andrew
  3   order by dbms_random.value)
  4  where rownum < 6
  5  /

OBJECT_NAME
------------------------------
/e28402f8_ObjectClosedExceptio
/a9973a55_BufferedOutputStream
/96361fa8_ShortBufferException
/b40ab093_AssignBitAndExpressi
/f5e722c3_XSLJDWPDebugger

SQL>

2 comments:

Andrew Reid said...

Hay muchos más detalles sobre este asunto en el vínculo arriba.

Andrew Reid said...

Il y a des détails de DBMS_RANDOM etc en français à l’adresse ci-dessus.