for more information See
- SQL Tuning by Dan Tow (the origins)
- Kyle Hailey on VST
- Karen Morten on VST and Karen’s Blog
- Jonathan Lewis on Tuning Visually and VST
- SQL Tuning with VST by Craig Martin and Craig’s other Presentations
- Embarcadero’s DB Optimizer with automated VST diagrams
SELECT COUNT (*) FROM a, b, c WHERE b.val2 = 100 AND a.val1 = b.id AND b.val1 = c.id;
There are indexes on b.id and c.id. Diagramming the query in DB Optimizer gives
The red lines with crows feet mean that as far as the definitions go, the relations could be many to many.
Question is “what is the optimal execution path for this query?”
One of the best execution plans is to
- start at the most selective table filter
- join to children if possible
- else join to parent
There is one filter in the diagram, represented by the green F on table B. Table B has a filter criteria in the query “b.val2=100″.
Ok, table B is where we start the query. Now where do we go from B? Who is the parent and who is the child? It’s not defined in the constraints nor indexes on these tables so it’s hard for us to know. Guess what ? It’s also hard for Oracle to figure it out. Well, what does Oracle decide to do? This is where the cool part of DB Optimizer comes in.
The super cool thing with DB Optimizer is we can overlay the diagram with the actual execution path (I think this is so awesome)
For the digram we can see Oracle starts with B and joins to A. The result if this is joined to C. Is this the optimal path?
Well, let’s keep the same indexes and just add some constraints:
alter table c add constraint c_pk_con unique (id); alter table b add constraint b_pk_con unique (id);
Now let’s diagram the query with DB Optimizer:
We can now see who the parent and child is, so we can determine the optimal query path which is to start at B, the only filter and join to the child C then to the parent A. Now what does Oracle do with the added constraint info:
Guess what? The execution plan has now changed with the addition of constraints and now Oracle’s execution path goes from a suboptimal plan to the optimal path. Moral of the story is to make sure and define constraint information because it helps the optimizer, but what I wanted to show here was the explain plan overlay on the diagram which makes comparing execution plans much easier. Putting the queries VST diagrams side by side along with the overlay of execution path we can clearly and quickly see the differences:
The dark green nodes are the starting nodes and the red nods are the ending nodes. Inbetween nodes would be light green, but there are none in this diagram.
Here is an example from an article by Jonathan Lewis
The query Jonathan discusses is
SELECT order_line_data FROM customers cus INNER JOIN orders ord ON ord.id_customer = cus.id INNER JOIN order_lines orl ON orl.id_order = ord.id INNER JOIN products prd1 ON prd1.id = orl.id_product INNER JOIN suppliers sup1 ON sup1.id = prd1.id_supplier WHERE cus.location = 'LONDON' AND ord.date_placed BETWEEN '04-JUN-10' AND '11-JUN-10' AND sup1.location = 'LEEDS' AND EXISTS (SELECT NULL FROM alternatives alt INNER JOIN products prd2 ON prd2.id = alt.id_product_sub INNER JOIN suppliers sup2 ON sup2.id = prd2.id_supplier WHERE alt.id_product = prd1.id AND sup2.location != 'LEEDS')
which diagrammed looks like
There are multiple filters, so we need to know which one is the most selective to know where to start, so we ask DB Optimizer to display the statistics as well (blue below a table is the filter %, green above is # of rows in table and numbers on join lines are rows returned by a join of just those two tables)
Now that we can determine a candidate for best optimization path, does Oracle take it?
Can you find the optimization error? (Hint the not exists subquery is sub-optimially accessed)
Dark green is where execution starts. There are two starts: one for the main query body and one for the subquery.
The red is where query execution ends.
( also see this older blog on the designing VST http://dboptimizer.com/2010/08/03/product-design-vst/ )
PS a big part of this work is by the lead developer Matt Vegh. Many thanks to Matt for this awesome work.
PPS another example from Karl Arao
The dark green nodes are starts, so there are 4 separate starts. We can see how the result sets from each start are joined with each successive table join set. The red is the final step.
Creating Diagrams Steps by Craig Martin
- Pick any table in query and draw a node to represent it
- Look for joins to this table’s PK / UK
- Draw down arrow into this node and create node for joined table at top of arrow.
- Look for joins from current table to PK/UK of another table
- Draw down arrow from this node and create node for joined table at bottom of arrow
- Repeat until all tables on diagram
- Fill in Filter and Join Ratios
- Asterisk by filter ratio when always will return 1 row
- Start with table with best (smallest) filter ratio
- Drive through full, unique indexes downward as far as possible, choosing nodes with best filter ratios first
- Only when necessary, drive up diagram links through full, nonunique FK indexes
- All other things being equal, choose path that will get you to lower filter ratios sooner
Somewhat Common Exceptions
- Filter to Exactly 1 Row
- These can be done out of order, cross joining the results
- Detail Join Ratios Near 1
- Don’t have to follow downfirst guideline. Go in whatever direction to get to next best filter ratio
- Filter Ratios Almost Equal
- Drive to a slightly higher filter ratio first if it will lead to a much lower filter ratio
Resources / Additional Reading
- SQL Tuning by Dan Tow http://amzn.com/0596005733
- Articles on Join Types