Tag Archives: order by

How Oracle handles the distinct clause between 10gR1 and 10g R2

Oracle has changed the algorithm used to perform “select distinct” operations from 10gR1 and earlier version to 10gR2 and new versions. This has caught a lot of custom application and script writers by surprise, when they depended on the ordering of returned values from 10gR1 and earlier versions. The problem is caused by the way in which the Oracle parser handles a “select distinct” operation: From Oracle 8 to 10gR1, the parser used a sort algorithm to determine distinct values and from 10gR2 and later, the parser uses a hash algorithm to determine distinct values. The end result is that 8 to 10gR1 will return “select distinct” operations in sorted order and new version will not. So to resolve this issue you must use the “order by” clause to guarantee values are returned in sorted order. This article shows the plan differences between the two parsed statements in Oracle 10gR1 and 10gR2.

1. Logon to Oracle 10gR1 and execute “select version from v$instance to show the exact version of this release.

SQL> select version from v$instance;

VERSION
—————–
10.1.0.3.0

SQL>

2. Create the table test1 with the following DDL and insert ten numerical values.

create table test1(v_number number(10));

declare

v_ct number:=1;

begin

while v_ct<10
loop
insert into test1(v_number) values(v_ct);

v_ct := v_ct+1;

end loop;
end;
/

3. Select from this table with “select distinct” clause, as you can see the values are returned in sorted order even though we did not specify the order by clause.

select distinct v_number from test1;

SQL> select distinct v_number from test1;

V_NUMBER
———-
1
2
3
4
5
6
7
8
9

9 rows selected.

SQL>

4. Execute explain plan on the previous statement and you can see that the parser is performing a sort to find all distinct values in the table, thus returning the values in sorted order.

SQL> explain plan for select distinct v_number from test1;

Explained.

SQL> @?/rdbms/admin/utlxpls.sql

PLAN_TABLE_OUTPUT
——————————————————————

Plan hash value: 1260548514

——————————————————————–
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
——————————————————————–
| 0 | SELECT STATEMENT | | 9 | 117 | 4 (25)| 00:00:01 |
| 1 | SORT UNIQUE | | 9 | 117 | 4 (25)| 00:00:01 |
| 2 | TABLE ACCESS FULL| TEST1 | 9 | 117 | 3 (0)| 00:00:01 |
——————————————————————–

Note

PLAN_TABLE_OUTPUT
———————————————————————
– dynamic sampling used for this statement

13 rows selected.

SQL>

5. Logon to 10gR2 server and select version from v$instance to show the exact RDBMS version.

SQL> select version from v$instance;

VERSION
—————–
10.2.0.4.0

SQL>

6. Create the table test1 and insert ten numerical values.

SQL> create table test1(v_number number(10));

Table created.

SQL>
SQL> declare
2
3 v_ct number:=1;
4
5 begin
6
7 while v_ct<10
8 loop
9 insert into test1(v_number) values(v_ct);
10
11 v_ct := v_ct+1;
12
13 end loop;
14 end;
15 /

PL/SQL procedure successfully completed.

SQL>

7. Perform select distinct operation on test1 table and note that the values are returned in an unsorted order.

SQL> select distinct v_number from test1;

V_NUMBER
———-
1
6
2
4
5
8
3
7
9

9 rows selected.

SQL>

8. Perform a explain plain on your “select distinct” statement and note that the sort operation has been replaced by a hash operation, thus values will not be returned in a sorted format.

SQL> explain plan for select distinct v_number from test1;

Explained.

SQL> @?/rdbms/admin/utlxpls.sql

PLAN_TABLE_OUTPUT
——————————————————————–

Plan hash value: 255531131

——————————————————————–
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
———————————————————————
| 0 | SELECT STATEMENT | | 9 | 117 | 4 (25)| 00:00:01 |
| 1 | HASH UNIQUE | | 9 | 117 | 4 (25)| 00:00:01 |
| 2 | TABLE ACCESS FULL| TEST1 | 9 | 117 | 3 (0)| 00:00:01 |
———————————————————————-

Note

PLAN_TABLE_OUTPUT
——————————————————————-
– dynamic sampling used for this statement

13 rows selected.

SQL>

9. The reason for the change in the parser’s behavior is speed of execution; the hash algorithm is much more efficient then the sort operation and thus execution time decreases. In order to have a guaranteed sorting of your value, you must us an “order by” clause.

Larry J Catt, OCP 9i, 10g
oracle@allcompute.com
www.allcompute.com