Archive

Posts Tagged ‘Parallelism’

Oracle : Recursive Common Table Expressions and Parallelism

April 14, 2012 3 comments

Sometimes, no matter how hard you search you just can’t find an answer – that was the problem this week. Oracle’s recursive common table expressions (RCTE), or Recursive Sub Query Refactoring to put it in Oracle’s terms were proving to be pretty bad on performance. (Hopefully, the next person searching will now find this answer.)

As feature’s go, this one is should be a relatively well-known feature – it’s part of the ANSI SQL-99 standard and available in a number of RDBMs, with near identical implementation on the syntax.

Even the esteemed Mr Kyte has changed his position on RCTE’s from being more code and harder to understand than the CONNECT BY syntax, to being a somewhat useful feature.

So what was the question to which we could find no answer?

Why does a RCTE seem to ignore parallel hints?

Amazingly, we can’t find anything documented about this against RCTE’s themselves or in the parallelism sections of the documentation. No mention of restrictions of parallelism on RCTE’s appear anywhere.

We have quite a complex example but needed a simple scenario to submit to Oracle to get an answer. Kudos for the krufting of this goes to Phil Miesle – it was his turn to deal with Oracle support.

First, create a numbers table, and fill it with data, we even used a RCTE to do that part.

create table vals (n ,constraint vals_pk primary key (n) ) 
as 
with numbers(n) as (
select 1 as n    
from dual   
union all  
select n+1 as n    
from numbers   
where n < 1000000 
)
select n from numbers;

We now need a data table, that is basically going to act as a hierarchy, for us to test the RCTE against. A simple parent / child table is suffice:

create table pairs (
  parent 
  ,child 
  ,constraint pairs_pk primary key (parent,child) ) 
as 
select 
  'B'||to_char(mod(n,100000)+1) as parent      
  ,'A'||to_char(n) as child   from vals;

Using the numbers table, the table now contains A1 to 1000000 linking to B1 to B100000 – so in effect, every B has 10 A’s linked to it.

This then continues, with every 10 B’s linking to a C:

insert into pairs 
select distinct        
  'C'||to_char(mod(n,10000)+1) as parent      
   ,'B'||to_char(mod(n,100000)+1) as child  
from vals;

And so on, with each successive layer having a 10 to 1 ratio:

insert into pairs 
select distinct       
  'D'||to_char(mod(n,1000)+1) as parent      
  ,'C'||to_char(mod(n,10000)+1) as child  
from vals;

insert into pairs 
select distinct       
  'E'||to_char(mod(n,100)+1) as parent       
  ,'D'||to_char(mod(n,1000)+1) as child  
from vals;

insert into pairs 
select distinct        
  'F'||to_char(mod(n,10)+1) as parent      
  ,'E'||to_char(mod(n,100)+1) as child  
from vals;

insert into pairs 
select distinct       
  'G'||to_char(mod(n,1)+1) as parent      
  ,'F'||to_char(mod(n,10)+1) as child  
from vals;
commit;

And finally we have G1 linking to F1 to F10 – so the structure is clearly a very basic tree.

Gather the stats to make sure the optimizer is going to have a half chance at a decent plan.

begin 
dbms_stats.gather_table_stats(ownname=>user,tabname=>'PAIRS'); 
end; 
/

So for the test cast query, we wished to generate all the possible paths within the tree – which is in effect a non-cyclical directed graph. This is the ideal scenario for connect by / RCTE to perform its magic, I need to recurse the dataset in a single set based statement.

explain plan for
select count(*)
from (
  with e (parent,child) 
  as (
    select parent,child
    from pairs
    union all
    select e.parent,pairs.child
    from e
    join pairs on e.child = pairs.parent
  )
  select * from e
);

Grab the plan output using

select * from table(dbms_xplan.display)

– which I’ve included the important bit below:

---------------------------------------------------- 
| Id  | Operation                                  |
---------------------------------------------------- 
|   0 | SELECT STATEMENT                           |
|   1 |  SORT AGGREGATE                            |
|   2 |   VIEW                                     |
|   3 |    UNION ALL (RECURSIVE WITH) BREADTH FIRST|
|   4 |     TABLE ACCESS FULL                      |
|*  5 |     HASH JOIN                              |
|   6 |      TABLE ACCESS FULL                     |
|   7 |      RECURSIVE WITH PUMP                   |
----------------------------------------------------

There is nothing shocking or unusual about the plan, it is what we would expect to see. So let’s now add some parallelism to the query:

explain plan for
select count(*)
from (
  with e (parent,child) 
  as (
    select /*+ parallel(pairs,4) */
      parent,child
    from pairs
    union all
    select  /*+ parallel(pairs,4) parallel(e,4) */ 
      e.parent,pairs.child
    from e
    join pairs on e.child = pairs.parent
  )
  select * from e
);

The expected effect on the plan should be that we would see parallelism operations against both table accesses.

----------------------------------------------------
| Id  | Operation                                  |
----------------------------------------------------
|   0 | SELECT STATEMENT                           |
|   1 |  SORT AGGREGATE                            |
|   2 |   VIEW                                     |
|   3 |    UNION ALL (RECURSIVE WITH) BREADTH FIRST|
|   4 |     PX COORDINATOR                         |
|   5 |      PX SEND QC (RANDOM)                   |
|   6 |       PX BLOCK ITERATOR                    |
|   7 |        TABLE ACCESS FULL                   |
|*  8 |     HASH JOIN                              |
|   9 |      TABLE ACCESS FULL                     |
|  10 |      RECURSIVE WITH PUMP                   |
----------------------------------------------------

This now shows us the problem, you can see the PX Co-ordinator is present within the anchor clause of the RCTE, but there is no parallelism listed against the recursion. At first we though it might be ignoring the hints for some reason, but the following idea disproved that theory immediately.

explain plan for
select count(*)
from (
  with e (parent,child) 
  as (
    select /*+ parallel(pairs,4) */
      parent,child
    from pairs
    union all
    select  /*+ parallel(pairs,4) parallel(e,4) use_merge(e,pairs) */ 
      e.parent,pairs.child
    from e
    join pairs on e.child = pairs.parent
  )
  select * from e
);

The plan altered to use the merge hint as follows:

---------------------------------------------------- 
| Id  | Operation                                  |
| --------------------------------------------------
|   0 | SELECT STATEMENT                           |
|   1 |  SORT AGGREGATE                            |
|   2 |   VIEW                                     |
|   3 |    UNION ALL (RECURSIVE WITH) BREADTH FIRST|
|   4 |     PX COORDINATOR                         |
|   5 |      PX SEND QC (RANDOM)                   |
|   6 |       PX BLOCK ITERATOR                    |
|   7 |        TABLE ACCESS FULL                   |
|   8 |     MERGE JOIN                             |
|   9 |      SORT JOIN                             |
|  10 |       RECURSIVE WITH PUMP                  |
|* 11 |      SORT JOIN                             |
|  12 |       TABLE ACCESS FULL                    |
----------------------------------------------------

The explain plan is of course a terrible plan – there would be no reason to use a merge join, but the fact it appears in the plan demonstrates the hints on the recursion clause are being read by the query engine and that it chose to discard the parallelism ones.

Given this example – an SR was raised to find out why the performance is so bad, and are we looking at a bug? If it was a bug, then we could look for a fix of some kind.

The test case was accepted and reproduced inside Oracle very efficiently – it was given to the parallel query department to determine what was the problem.

The response back?

This is the expected behavior. Oracle does not parallelize the iterations sub-query of connect-by/recursive-with clause.

That’s the last thing we wanted to hear – ‘by design’. It’s by design that this feature is going to be incredibly slow on larger data sets. That’s not so much as ‘design’ as rendering RCTE’s useless in Oracle unless you have small data sets, or don’t mind waiting around for a long time to get answers back.

We were already close to ditching any use of the RCTE syntax, this fully nailed the coffin shut on that feature.
(The other reason we are still looking to sort out the test case for – but we have witnessed problems with RCTEs contained within a view. When the view is joined to and accessed with a predicate against the view, we have seen the predicate pushed into the recursion – which results in an incorrect answer. The predicate pushing cuts the recursion short in effect. We had worked around this – but it was an annoying bug.)

Oracle stalwarts will consider that we were foolish to use the RCTE’s over oracle connect by syntax – except that we were not. An RCTE can do far more complex recursion than the Connect By can do, and for the specific instance we wanted to use it, that complexity was required

Another reason for trying to go down that route was performance, because the connect by clause is no better at parallelism:

explain plan for 
select count(*) 
from 
(  
  select /*+ parallel(pairs,4) */         
    parent,child    
  from pairs   
  start with parent = 'G1' 
  connect by parent = prior child 
);

-------------------------------------
| Id  | Operation                   | 
-------------------------------------
|   0 | SELECT STATEMENT            |
|   1 |  SORT AGGREGATE             |
|   2 |   VIEW                      |
|*  3 |    CONNECT BY WITH FILTERING|
|*  4 |     INDEX RANGE SCAN        |
|   5 |     NESTED LOOPS            |
|   6 |      CONNECT BY PUMP        |
|*  7 |      INDEX RANGE SCAN       |
-------------------------------------

The plan is no better for using a CONNECT BY – but from a performance perspective  the connect by clause is clearly faster when we ran some comparisons.

So the verdict on Oracle and RCTE / Recursive Sub-Query Refactoring – excellent language feature – unscalable performance – will refuse to parallel the recursion – very useless for those of us in the VLDB world.