Right Deep, Left Deep and Bushy Joins
At UKOUG someone asked me if DB Optimizer’s VST diagrams could deal with left deep verses right deep execution plans. What is right deep verses left deep? Good question. In join trees (not VST) the object on the left is acted upon first then the object on the right. Below are left deep and right deep examples of the same query, showing
- query text
- join tree
- join tree modified to more clearly show actions
- VST showing the same actions
All of this boils down to the point that a right deep HJ can return rows earlier than a left deep HJ. A left deep HJ has to wait for each join to finished completely so that the result set can be hashed before the next step can probe it. On the other hand, in a right deep HJ, it’s not the result sets that are being hashed, but a table at each level, thus each table can be hashed without waiting for intermediary results and once these hashes are complete a probed row can flow all the way through the join tree, from bottom to top, similar to how a nested loop can start giving results early. The Left Deep HJs only have two open work areas at a time where as the Right Deep can have multiple work areas open. One of the key thing to keep i n mind is how much data we have to hash. If the intermediate result sets are large (and/or growing each level) then that represents more work each step of the way.
Normally I don’t think in left deep verses right deep because it doesn’t change the the path through the join tree. On the other hand it does change whether we are hashing a result set or we are hashing a table. If hashing a table then the results can be returned, in theory, more quickly.
For NL joins there are only left deep join. The object on the left always probes the object on the right. The object on the left is always accessed first ( there is no need to modify the object on the left first and probe from the right with NL).
Besides left deep and right deep, there are also bushy joins which Oracle doesn’t do unless forces to through sub-queries or views as in the following:
Finally, the above VST diagrams were modified to be more easily compared to the join tree diagrams. Below are the VST diagrams as displayed by default in Embarcadero’s DB Optimizer. DB Optimizer shows the tables all in a line because the tables are one to one relationships. Keeping the VST diagram layout constant, it is easier to see the differences in execution paths:
Note on hinting and order of tables in explain plan:
For NESTED LOOPS and HASH JOINS the hint takes one argument, which is the “inner” table, ie the one that get’s probed into.
For NESTED LOOPS the argument is the table probed in to (ie the second in order of access)
For HASH JOINS the argument is the table that gets hashed and probed into (the first table in order of access)
The part I find confusing is that the inner table appears in a different order for NESTED LOOPS and HASH JOINS. For NESTED LOOPS the inner table appears second, which makes sense to me. Thus I’d expect the inner table to also be the second table in the execution plan for HASH JOINS but before we can probe into the inner table in a HASH JOIN we have to hash it first, which means the first thing we access is the the inner table , thus the inner table the first thin in the execution plan.
For NESTED LOOPS if order is “LEADING(X,Y) then nested loops hint can only be on Y, ie USE_NL(Y) because the inner table can only be the second table in order of execution.
For HASH JOINS if the order is “LEADING(X,Y) then the hash join hint can only be on X, ie USE_HASH(X) because the inner table has to be accessed first to make a hash of it before we can probe the hash with the outer table.
A couple of good references on USE_NL and USE_HASH
- Use inner tables USE_NL and USE_HASH hints
- Order of execution is set before applying USE_NL and USE_HASH hints
For more information see: