SingingSQL Presents: Presents When Good Optimizers Make Bad Choices (Sometimes) March, 2005 ©2005 Dan Tow All rights reserved
[email protected] www.singingsql.com
Overview •Introduction – the Nature of the Game •The CBO versus the Human “Player” •Choices Unavailable to the CBO •Objections to Human Tuning, Addressed
Introduction – The Nature of the Game Think of tuning as a game (“Big Blue” versus Kasparov?): •What advantages does the CBO have, versus a human “player” (tuner)? –Ability to crunch lots of numbers, well, to explore many options.
•What disadvantages does the CBO have? –Very little time to optimize. –Limited information. –Fewer options (different, more-limiting rules than apply to a human player). –Less strategic sophistication than a good human tuner.
Overview •Introduction – the Nature of the Game •The CBO versus the Human “Player” •Choices Unavailable to the CBO •Objections to Human Tuning, Addressed
Reasons the CBO Misses Good Choices •CBO is not allowed to execute queries to test alternative plans, or to learn more about selectivities than its best guess from statistics. •CBO searches a tiny fraction of all possible plans, for truly complex SQL, and applies a limited strategy to prune the search. •CBO cost function imperfectly predicts the relative costs of alternative plans. •CBO lacks insight into the freedom to change corner-case behavior.
Unfair Advantages Humans Enjoy CBO is not allowed to execute queries to test alternative plans, or to learn more about selectivities than its best guess from statistics: •While the CBO must try to optimize every query, the human can focus on a couple of dozen queries. •We can test alternatives as long as necessary and only choose alternatives that we prove to be faster. •With proven, tested performance of a chosen alternative, a human need never lose (although we may “tie” with the CBO).
Unfair Advantages Humans Enjoy CBO searches a tiny fraction of all possible plans, for truly complex SQL, and applies a limited strategy to prune the search: •An informed human understanding of the tuning problem is much deeper, enabling more sophisticated strategies to locate the optimum, or a near-optimum, in a huge search space. •Kasparov can (sometimes) beat even the brute force of Big Blue with a superior understanding of strategy, and SQL optimization, so far, favors humans more than chess does.
Unfair Advantages Humans Enjoy CBO cost function imperfectly predicts the relative costs of alternative plans. The cost function really has two parts: •It predicts how many index entries, table rows, and blocks of every object the execution plan will reach during execution – this is the logical cost.
Unfair Advantages Humans Enjoy CBO cost function imperfectly predicts the relative costs of alternative plans. The cost function really has two parts: •From the logical cost, it must estimate a physical cost, which is mostly just the runtime, but weighted, potentially, in favor of plans that avoid bottlenecks that harm other processes.
Unfair Advantages Humans Enjoy CBO cost function imperfectly predicts the relative costs of alternative plans. The cost function really has two parts, logical and physical cost estimation: •A bad logical cost estimate almost guarantees a bad physical cost estimate (often by orders of magnitude!). •However, even a good logical cost estimate does not guarantee a good physical cost estimate.
Unfair Advantages Humans Enjoy •CBO cost function imperfectly predicts the relative costs of alternative plans. Logical costs: •Even with perfect statistics, the selectivity of a complex filter is impossible to predict, and filter selectivities are the main determiner of logical costs. •Humans need not predict – we can measure! •This is a huge advantage!
Selectivity-Error Examples SQL> @exq8 test1 1 SELECT count(Data_Col) FROM DTow_Talk_Child 2* WHERE Date_Col >= TO_DATE('2005/01/01','YYYY/MM/DD')-10 .1 SELECT STATEMENT c=196, R=1 ..2 SORT AGGREGATE c=_, R=1 ...3 TABLE ACCESS BY INDEX ROWID 1*DTOW_TALK_CHILD c=196, R=10009 ....4 INDEX RANGE SCAN DTOW_TALK_CHILD_DATE_COL: Date_Col c=27, R=10009 SQL> @test1 3 ; COUNT(DATA_COL) --------------10000
Selectivity-Error Examples SQL> @exq8 test2 1 SELECT count(Data_Col) FROM DTow_Talk_Child 2 WHERE Date_Col >= TO_DATE('2005/01/01','YYYY/MM/DD')-10 3* AND Code_Col = 'OP' .1 SELECT STATEMENT c=7, R=1 ..2 SORT AGGREGATE c=_, R=1 ...3 TABLE ACCESS BY INDEX ROWID 1*DTOW_TALK_CHILD c=7, R=46 ....4 INDEX RANGE SCAN DTOW_TALK_CHILD_CODE_COL: CODE_COL c=1, R=4600 SQL> @test2 4 ; COUNT(DATA_COL) --------------3700
Selectivity-Error Examples SQL> @exq8 test3 1 SELECT count(Data_Col) FROM DTow_Talk_Child 2 WHERE Date_Col <= TO_DATE('2005/01/01','YYYY/MM/DD')-90 3* AND Code_Col = 'OP' .1 SELECT STATEMENT c=7, R=1 ..2 SORT AGGREGATE c=_, R=1 ...3 TABLE ACCESS BY INDEX ROWID 1*DTOW_TALK_CHILD c=7, R=4195 ....4 INDEX RANGE SCAN DTOW_TALK_CHILD_CODE_COL: CODE_COL c=1, R=4600 SQL> @test3 4 ; COUNT(DATA_COL) --------------110
Unfair Advantages Humans Enjoy •CBO cost function imperfectly predicts the relative costs of alternative plans. Physical costs: •Relative CPU times for different sort of logical I/O vary widely, in hard-to-predict ways. •Hit ratios of the different objects involved in SQL execution vary widely, in hard-to-predict ways. •Humans need not predict – we can measure! We can even measure physical costs without knowing logical costs! •This is a huge advantage!
Unfair Advantages Humans Enjoy CBO cost function imperfectly predicts the relative costs of alternative plans: •Humans need not predict – we can measure! •This is the moral equivalent to being allowed to “take back” moves in a game of chess, as many moves as we want, as often as we want, as we learn the consequences of those moves. •This is a colossal advantage!
Measurement Example 7 … FROM DTow_Talk_Child C, --
8 DTow_Talk_Parent1 P, 9 DTow_Type TC, 10 DTow_Type TP 11 WHERE C.FKey1_ID=P.PKey_ID 12 AND C.Type_ID=TC.Type_ID 13 AND P.Type_ID=TP.Type_ID 14* and TC.Description='VERY_RARE' .1 SELECT STATEMENT c=3023, R=1 ..2 SORT AGGREGATE c=_, R=1 ...3 HASH JOIN c=3023, R=333333 ....4 TABLE ACCESS FULL 4*DTOW_TYPE c=2, R=3 ....4 HASH JOIN c=3020, R=333333 .....5 TABLE ACCESS FULL 2*DTOW_TALK_PARENT1 c=250, R=100000 .....5 HASH JOIN c=2559, R=333333 ......6 TABLE ACCESS BY INDEX ROWID 3*DTOW_TYPE c=2, R=1 .......7 INDEX RANGE SCAN DTOW_TYPE_DESCRIPTION: DESCRIPTION c=1, R=1 ......6 TABLE ACCESS FULL 1*DTOW_TALK_CHILD c=2555, R=1000000
Measurement Example SQL> @mysid9 SESSION_ID Oracle_PID Client_PID ---------- ------------ -----------78 3705 11178 Elapsed: 00:00:00.15 SQL> @test4 15 ; CDATE PDATE TCDESC TPDESC ---------- ---------- ---------- ---------1000 1000 1000 1000 Elapsed: 00:00:01.14 SQL> @reads9 LIO PIO ---------------------------- ---------------------------Logical Reads = 18484 Physical Reads = 0
Measurement Example SQL> @exq8 test6 1 select count(*) from (SELECT /*+ first_rows */ …rownum rn 8 FROM …--<just like test4 SQL> 16* AND TC.Description='VERY_RARE') .1 SELECT STATEMENT c=300252, R=1 ..2 SORT AGGREGATE c=_, R=1 ...3 VIEW 1* c=300252, R=333333 ....4 COUNT c=_, R=_ .....5 TABLE ACCESS BY INDEX ROWID 2*DTOW_TALK_CHILD c=3, R=3 ......6 NESTED LOOPS c=300252, R=333333 .......7 NESTED LOOPS c=252, R=100000 ........8 MERGE JOIN CARTESIAN c=252, R=100000 .........9 TABLE ACCESS BY INDEX ROWID 4*DTOW_TYPE c=2, R=1 ..........10 INDEX RANGE SCAN DTOW_TYPE_DESCRIPTION: DESCRIPTION c=1, R=1 .........9 BUFFER SORT c=250, R=100000 ..........10 TABLE ACCESS FULL 3*DTOW_TALK_PARENT1 c=250, R=100000 ........8 INDEX UNIQUE SCAN DTOW_TYPE_UNQ1: type_id c=_, R=1 .......7 INDEX RANGE SCAN DTOW_TALK_CHILD_FKEY1_ID: fkey1_id c=2, R=10
Measurement Example SQL> @mysid9 SESSION_ID Oracle_PID Client_PID ---------- ------------ -----------78 3705 11178 Elapsed: 00:00:00.14 SQL> @test6 -------FIRST_ROWS plan 17 ; COUNT(*) ---------1000 Elapsed: 00:00:03.60 SQL> @reads9 LIO PIO ---------------------------- ---------------------------Logical Reads = 220516 Physical Reads = 0
Measurement Example SQL> @exq8 test7 1
SELECT /*+ leading(TC) use_nl(C P) index(C DTOW_TALK_CHILD_TYPE_ID)
2 index(P DTOW_TALK_PARENT1_UNQ1) */ … --
Measurement Example SQL> @mysid9 SESSION_ID Oracle_PID Client_PID ---------- ------------ -----------78 3705 11178 Elapsed: 00:00:00.14 SQL> @test7 -----Hand-tuned plan, with hints 16 ; CDATE PDATE TCDESC TPDESC ---------- ---------- ---------- ---------1000 1000 1000 1000 Elapsed: 00:00:00.16 SQL> @reads9 LIO PIO ---------------------------- ---------------------------Logical Reads = 3012 Physical Reads = 0
Unfair Advantages Humans Enjoy CBO lacks insight into the freedom to change corner-case behavior: •SQL constitutes a rigorous, unambiguous spec for what the CBO must deliver, functionally, regardless of cost, without the slightest risk of wrong-rows bugs in even the most obscure corner cases!
Unfair Advantages Humans Enjoy CBO lacks insight into the freedom to change corner-case behavior: •Humans understand which corner cases we can safely ignore. •Humans understand that even for corner cases we cannot ignore, the current corner-case behavior specified by the SQL may be wrong, and should be changed, if it is wrong, and if it prevents access to a fast execution plan.
Corner-Case Example SQL> @exq8 test5 1 SELECT 2 C.PKey_ID CKey_ID, 3 C.Date_Col CDate, 4 G.PKey_ID GKey_ID 5 FROM DTow_Talk_Child C, 6 DTow_Talk_Generic G 7 WHERE C.PKey_ID=G.Generic01 8* AND C.PKey_ID=654321 .1 SELECT STATEMENT c=2170, R=1 ..2 NESTED LOOPS c=2170, R=1 ...3 TABLE ACCESS BY INDEX ROWID 1*DTOW_TALK_CHILD c=3, R=1 ....4 INDEX UNIQUE SCAN DTOW_TALK_CHILD_UNQ1: pkey_id c=2, R=1 ...3 TABLE ACCESS FULL 2*DTOW_TALK_GENERIC c=2167, R=1
Corner-Case Example SQL> @test5 9 ; CKEY_ID CDATE GKEY_ID ---------- --------- ---------654321 21-JAN-04 1654321 Elapsed: 00:00:01.26 SQL> @reads9 LIO PIO ---------------------------- ---------------------------Logical Reads = 14282 Physical Reads = 0
Corner-Case Example SQL> @exq8 test8 1 SELECT 2 C.PKey_ID CKey_ID, 3 C.Date_Col CDate, 4 G.PKey_ID GKey_ID 5 FROM DTow_Talk_Child C, 6 DTow_Talk_Generic G 7 WHERE TO_CHAR(C.PKey_ID)=G.Generic01 8* AND C.PKey_ID=654321 .1 SELECT STATEMENT c=6, R=1 ..2 NESTED LOOPS c=6, R=1 ...3 TABLE ACCESS BY INDEX ROWID 1*DTOW_TALK_CHILD c=3, R=1 ....4 INDEX UNIQUE SCAN DTOW_TALK_CHILD_UNQ1: pkey_id c=2, R=1 ...3 TABLE ACCESS BY INDEX ROWID 2*DTOW_TALK_GENERIC c=3, R=1 ....4 INDEX RANGE SCAN DTOW_TALK_GENERIC_GENERIC01: GENERIC01 c=2, R=1
Corner-Case Example SQL> @test8 9 ; CKEY_ID CDATE GKEY_ID ---------- --------- ---------654321 21-JAN-04 1654321 Elapsed: 00:00:00.15 SQL> @reads9 LIO PIO ---------------------------- ---------------------------Logical Reads = 9 Physical Reads = 0
Unfair Advantages Humans Enjoy CBO lacks insight into the freedom to change corner-case behavior: •View-using queries are the most common example of queries with quirky corner-case behaviors (almost never anticipated by the developers) that trigger expensive execution plans.
Overview •Introduction – the Nature of the Game •The CBO versus the Human “Player” •Choices Unavailable to the CBO •Objections to Human Tuning, Addressed
Choices Unavailable to the CBO The CBO’s game: •Tune fast. •Tune at runtime, without executing SQL against application data. •Take each SQL spec as gospel. •Use only what can be deduced or assumed from dictionary statistics. •Use the database as-is.
Choices Unavailable to the CBO Higher-level games for human tuning: •Take as long as necessary to tune. Verify fixes! •Learn what needs to be learned any way necessary. •Consider altering corner-case behavior of the SQL. •Use existing denormalizations. •Educate end users to avoid unnecessary, hard problems (e.g., search on phone number, not last name.)
Choices Unavailable to the CBO Higher-level games for human tuning: •Consider modifying the application: –Read fewer rows. –Read less frequently, or not at all. –Move load off-hours, or into batch. –Combine multiple statements into one. –Break one statement into multiple statements.
Choices Unavailable to the CBO Higher-level games for human tuning: •Consider modifying the database: –Add or modify indexes. –Denormalize the database (rarely necessary!). –Add constraints that secure corner-case issues. –Alter the database design.
Overview •Introduction – the Nature of the Game •The CBO versus the Human “Player” •Choices Unavailable to the CBO •Objections to Human Tuning, Addressed
Objections to Human Tuning, Addressed “The CBO’s assumptions and approximations are fine! It finds a great plan, or at least a good enough plan, 99% of the time.”
Objections to Human Tuning, Addressed “The CBO’s assumptions and approximations are fine! It finds a great plan, or at least a good enough plan, 99% of the time.” YES!
Objections to Human Tuning, Addressed “The CBO’s assumptions and approximations are fine! It finds a great plan, or at least a good enough plan, 99% of the time.” That’s about right!
Objections to Human Tuning, Addressed “The CBO’s assumptions and approximations are fine! It finds a great plan, or at least a good enough plan, 99% of the time.” Corrollary: It’s a waste of time to just tune SQL at random, or to tune all SQL!
Objections to Human Tuning, Addressed “The CBO’s assumptions and approximations are fine! It finds a great plan, or at least a good enough plan, 99% of the time.” However, that can still leave hundreds of badly tuned queries in a complex application suite!
Objections to Human Tuning, Addressed “The CBO’s assumptions and approximations are fine! It finds a great plan, or at least a good enough plan, 99% of the time.” If you find the right SQL to tune (method R or similar, runtime-based methods), you will find that most of that SQL has ready opportunities for improvement!
This is not a paradox!
Objections to Human Tuning, Addressed “Most developers do not know enough to outsmart the CBO.” OK, if you are among those developers, learn more! –… Or hire someone who already knows how;-)
Objections to Human Tuning, Addressed “The CBO only fails when the database (or application) design is bad.” Even if this was true, so what?! Bad design happens! (Tune in the real world!) We (in this room, today) almost never have the chance to start at, and completely control, the design! Tuning slow SQL is a great way to uncover needed fixes to bad design!
Objections to Human Tuning, Addressed “Even if you can improve the query, today, tuning is a bad idea because hints will overconstrain the CBO in the future.” Less than half of SQL tuning even involves beating the CBO at its own game! –Add indexes, change the application, denormalize,… –These changes do not constrain the future CBO!
Objections to Human Tuning, Addressed “Even if you can improve the query, today, tuning is a bad idea because hints will overconstrain the CBO in the future.” Even when you beat the CBO at its own game, you don’t always need hints to do it! –E.g., convert subqueries to inline views. –E.g., convert view-using queries to queries of simple tables. –These changes do not constrain the future CBO!
Objections to Human Tuning, Addressed “Even if you can improve the query, today, tuning is a bad idea because hints will overconstrain the CBO in the future.” When you do need hints to beat the CBO at its own game, so be it! These are big improvements! –Take the “bird in the hand”! –Use a robust execution plan. –The SQL will probably change, anyway, long before your hint will be even marginally likely to get in the CBO’s way!
Questions?