Archive
Oracle : ambiguous join statements being accepted and executed
I came across a weird piece of SQL earlier today where my first reaction was “That is just invalid SQL”, except that I was viewing it in OEM so it clearly had been parsed and executed. Perhaps it is not too bizarre or surprising to some, but because the form of the SQL is something I would not write due to the inherent problem within it that I didn’t expect Oracle to behave as it did.
In abstract form the statement is as follows:
select *
from a
full outer join b on a.id = b.id
inner join c on a.id = c.id
Perhaps that seems innocent enough but its ambiguous and non-resolvable. To demonstrate that let’s consider giving some minor data to A,B and C.
A - records with ID = 1,2 and 3.
B - records with ID = 1,2 and 4.
C - records with ID = 1 and 2.
If A joins to C, we resolve down via the inner join to just IDs of 1 and 2 – and then full outer join to B getting to a final result of 1,2,4.
If A joins to B first, we get rows with 1,2,3,4 – and then the inner join to C getting a final result of 1,2.
This is why as soon as I saw the statement I considered it invalid – its ambiguous and if I was to treat SQL correctly as a declarative language, there is no notion within that statement as it stands to know which would be the correct answer.
So that is so far theory crafting – so let’s put this into some real simple examples of code.
with a as
(
select 1 as id from dual
union
select 2 as id from dual
union
select 3 as id from dual
)
, b as (
select 1 as id from dual
union
select 2 as id from dual
union
select 4 as id from dual
)
, c as (
select 1 as id from dual
union
select 2 as id from dual
)
select a.id as a_id, b.id as b_id, c.id as c_id
from a
full outer join b on a.id = b.id
inner join c on a.id = c.id
The results:
>A_ID B_ID C_ID
>---------- ---------- ----------
>1 1 1
>2 2 2
So that looks clear, it performs the outer join, and then the inner join – seems straight forward? but if this is properly being treated as declarative as it should be then we still know that there is an ambiguity in the statement, but Oracle has ignored that and applied a rule to make the decision for you. I already have an issue with that approach since there is no indication that it had to step in and correct an ambiguity in the statement submitted.
Normally in SQL we know that changing the order of the joins should not make a difference in the result – it can change the explain plan for the query, especially with an ordered hint in place, but the results should be the same – so let’s test that concept on this.
with a as
(
select 1 as id from dual
union
select 2 as id from dual
union
select 3 as id from dual
)
, b as (
select 1 as id from dual
union
select 2 as id from dual
union
select 4 as id from dual
)
, c as (
select 1 as id from dual
union
select 2 as id from dual
)
select a.id as a_id, b.id as b_id, c.id as c_id
from a
inner join c on a.id = c.id
full outer join b on a.id = b.id
All that has changed is the inner join is moved above the full outer join – but that is the limit of the difference. The results:
>A_ID B_ID C_ID
>---------- ---------- ----------
>1 1 1
>2 2 2
> 4
Now, in my view of the SQL world, this is just incorrect – the statement was ambiguous to start with and should not of been permitted to be executed, but rejected and an exception thrown. It should be treated as a declarative language and no inference from the order of the joins should be made.
Whats surprising to me is that it accepts, parses and executes the statement with no warnings or declaration of what it is doing, and then to make matters worse it is not consistent in the result, but bases the result on an inference which is going to be somewhat random on the style of the developer writing the SQL. I would normally say that the order of joins on a statement does not affect the final output – evidently that is not entirely true.