Existence check
Sometimes we need to check whether a specific datum is in the table in order to decide what to do, e.g. if it exists do “A”, else do “B”.
(Datum: singular. Data: plural.)
Existence check - BAD (1)
    SELECT DISTINCT 1 INTO :HV FROM TABLE WHERE ….
1. If there is no index to satisfy the “WHERE”, DB2 does a tablespace scan and a sort.
2. Even if there is an index, there may be duplicates, so the sort is still invoked.
3. Even if the index is unique today, it may be changed in the future.
Existence check - BAD (2)
    SELECT COUNT(*) INTO :HV FROM TABLE WHERE ….
1. If there is no index to satisfy the “WHERE”, DB2 does a tablespace scan.
2. Even if there is an index, all occurrences are counted. (That is fine only if we were actually asked to count the rows.)
Existence check - GOOD (3)
Steps:
1. DECLARE CURSOR (OPTIMIZE FOR 1 ROW, WITH UR).
2. OPEN CURSOR.
3. FETCH.
4. CLOSE CURSOR.
Drawbacks:
1. Four SQL statements have to be coded.
2. Looks cumbersome.
3. Uses more resources than the next idea.
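Existence check - GOOD (4)
    SELECT 1 INTO :HV FROM TABLE WHERE ….
    IF SQLCODE = +100 THEN “NOT_EXISTS”
    IF SQLCODE = 0 OR SQLCODE = -811 THEN “EXISTS”
1. Needs some documentation in the program, because -811 (more than one row qualifies) is treated here as a normal outcome.
2. Don’t use any data returned by it: after -811 the content of the host variable should not be relied on.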
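The idea shared by the two GOOD variants is to stop at the first qualifying row instead of counting or sorting all of them. The sketch below illustrates only that idea, in Python with SQLite rather than DB2: SQLite has no SQLCODE, so `LIMIT 1` plus `fetchone()` stands in for the single fetch, and the `orders` table, its columns and the `row_exists` helper are hypothetical names chosen for the illustration.

```python
import sqlite3

def row_exists(conn, customer_id):
    """Return True if at least one matching row exists.

    Assumption: LIMIT 1 + fetchone() plays the role of the single FETCH;
    a None result corresponds to SQLCODE +100 in the DB2 version.
    """
    cur = conn.execute(
        "SELECT 1 FROM orders WHERE customer_id = ? LIMIT 1", (customer_id,)
    )
    return cur.fetchone() is not None

# Hypothetical sample data: customer 10 has two rows, customer 99 has none.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer_id INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10), (2, 10), (3, 20)])

if row_exists(conn, 10):
    print("do A")
else:
    print("do B")
```

Note that the duplicate rows for customer 10 do not matter here: the check answers "exists" without looking at the second row, which is the same reason the DB2 version treats -811 as "exists".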
Internal secret
• A singleton select is done internally as a cursor!
• It does not use cross-memory calls as a regular cursor does.
• Its internal code path is shorter than that of a regular cursor.
• DB2 builds a cursor read internally for the singleton select and issues 1 or 2 fetch commands:
  - If the 1st fetch finds nothing, we get SQLCODE = +100.
  - If the 2nd fetch finds something, we get SQLCODE = -811.
  - Otherwise we get the data and SQLCODE = 0.
Sub Select - BAD
    SELECT * FROM T1
    WHERE T1.CODE IN (SELECT T2.CODE FROM T2 WHERE T2.KEY = 'X')
• Will cause a tablespace scan on T1.
• DB2 may change this type of sub select to a join (if possible).
Do it as join - GOOD
    SELECT T1.*
    FROM   T1, T2
    WHERE  T2.KEY = 'X'
    AND    T1.CODE = T2.CODE
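One thing worth checking before rewriting an IN sub select as a join is whether T2 can hold the same code more than once for the given key. The sketch below uses Python with SQLite (not DB2) and made-up sample rows, so it only illustrates the result sets, not the access paths. The `name` column is an assumption added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (code INTEGER, name TEXT);
    CREATE TABLE T2 (code INTEGER, key TEXT);
    INSERT INTO T1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    -- code 2 appears twice for key 'X' (hypothetical duplicate)
    INSERT INTO T2 VALUES (1, 'X'), (2, 'X'), (2, 'X'), (3, 'Y');
""")

# Sub select form: each T1 row is returned at most once.
in_rows = conn.execute(
    "SELECT * FROM T1 WHERE T1.code IN "
    "(SELECT T2.code FROM T2 WHERE T2.key = 'X') ORDER BY code"
).fetchall()

# Join form: a T1 row is returned once per matching T2 row.
join_rows = conn.execute(
    "SELECT T1.* FROM T1, T2 "
    "WHERE T2.key = 'X' AND T1.code = T2.code ORDER BY T1.code"
).fetchall()

print(in_rows)
print(join_rows)
```

With these rows the join returns the row for code 2 twice, while the IN form returns it once. When T2.code is unique for the key, the two forms return the same rows.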
Sub Select - GOOD
    SELECT * FROM T1
    WHERE T1.CODE NOT IN (SELECT T2.CODE FROM T2 WHERE T2.KEY = 'X')
Can’t be done as a join.
Sub Select - GOOD
    SELECT * FROM T1
    WHERE NOT EXISTS (SELECT 1 FROM T2 WHERE T1.CODE = T2.CODE)
Can’t be done as a join.
Sub Select - BAD or GOOD?
    SELECT A1, A2, A3 FROM T1
    WHERE A1 = ?
    AND A2 = (SELECT MAX(A2) FROM T1)
Use a cursor - BAD or GOOD?
    DECLARE CRS1 CURSOR FOR
      SELECT A1, A2, A3 FROM T1
      WHERE A1 = ?
      ORDER BY A2 DESC
      OPTIMIZE FOR 1 ROW
    OPEN CRS1; FETCH CRS1 INTO ….; CLOSE CRS1
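The two forms can be compared side by side on a small table. The sketch below uses Python with SQLite, not DB2, so it says nothing about cost. It assumes the sub-query is meant to be restricted to the same A1 value as the outer query; as written on the slide it takes MAX(A2) over the whole table, which only matches the cursor when that maximum happens to belong to the requested A1. The sample rows, the index name and the helper functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (A1 INTEGER, A2 INTEGER, A3 TEXT);
    CREATE INDEX IX1 ON T1 (A1, A2);   -- the "proper index" assumption
    INSERT INTO T1 VALUES (1, 10, 'p'), (1, 30, 'q'), (1, 20, 'r'), (2, 99, 's');
""")

def top_by_subquery(a1):
    # Assumption: MAX(A2) is computed for the same A1 as the outer query.
    return conn.execute(
        "SELECT A1, A2, A3 FROM T1 WHERE A1 = ? "
        "AND A2 = (SELECT MAX(A2) FROM T1 WHERE A1 = ?)", (a1, a1)
    ).fetchone()

def top_by_cursor(a1):
    # Open, fetch one row, close: the cursor pattern from the slide.
    cur = conn.execute(
        "SELECT A1, A2, A3 FROM T1 WHERE A1 = ? ORDER BY A2 DESC", (a1,)
    )
    row = cur.fetchone()
    cur.close()
    return row

print(top_by_subquery(1), top_by_cursor(1))
```

Both helpers return the row with the highest A2 for the requested A1, and both return nothing when no row has that A1.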
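Statistics (assuming a proper index in both cases!)
              Time      CPU
Cursor        0.00392   0.00341
Sub-Select    0.00625   0.00517
The slide also reports SQL, Sorts, Locks and Rows counters. The extracted values, as Cursor / Sub-Select pairs, are 4 / 24, 4 / 4, 0 / 1 and 7 / 9; which pair belongs to which counter cannot be recovered from this copy.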
Sub-Query vs. Cursor - Conclusion
• Assuming a proper index:
  - If the command is used infrequently we can use the sub-query; otherwise use the cursor.
• If no proper index exists:
  - The cursor will invoke a sort on all the rows that satisfy the search criteria.
  - The sub-query will scan all rows for the max/min value but will not sort.
  - Use the sub-query.

Conclusion
Proper indexes can help.
Real life example (1)
    SELECT *
    FROM MNTB.TVTNSDRA A
    WHERE A.LOT_NUMBER IN
          (SELECT B.LOT_NUMBER
           FROM MNTB.TVTNITUR B
           WHERE UNIT = '638'
           AND B.LOT_NUMBER = A.LOT_NUMBER);
Canceled after 23 minutes elapsed.
Join column is not 1st in the index.
Real life example (2)
    SELECT A.*
    FROM MNTB.TVTNSDRA A, MNTB.TVTNITUR B
    WHERE UNIT = '638'
    AND B.LOT_NUMBER = A.LOT_NUMBER
    WITH UR;
Canceled after 14 minutes elapsed.
Join column is not 1st in the index.
Real life example (3)
    SELECT *
    FROM MNTB.TVTNSDRA A
    WHERE A.LOT_NUMBER IN
          (SELECT DISTINCT B.LOT_NUMBER
           FROM MNTB.TVTNITUR B
           WHERE UNIT = '638')
    WITH UR;
Finished after 14 seconds elapsed.
Join column is not 1st in the index.
Why?
• The 1st example is a correlated sub-query, where the inner query is executed for every row of the outer query.
• The 2nd example is a join that has no suitable index.
• The 3rd example is a non-correlated sub-query, where the inner query is executed only once, the result table is kept sorted in memory and the outer query checks against it.
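The correlated and non-correlated forms differ in how they are executed, not in what they return. The sketch below checks that on a tiny data set using Python with SQLite; it cannot reproduce the DB2 timings above. The table names `outer_t` and `inner_t`, the `descr` column and the sample rows are hypothetical stand-ins for the real tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE outer_t (lot_number INTEGER, descr TEXT);
    CREATE TABLE inner_t (unit TEXT, lot_number INTEGER);
    INSERT INTO outer_t VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO inner_t VALUES ('638', 1), ('638', 1), ('638', 3), ('700', 2);
""")

# Form of example (1): the inner query refers to the outer row (A.lot_number).
correlated = conn.execute("""
    SELECT * FROM outer_t A
    WHERE A.lot_number IN (SELECT B.lot_number FROM inner_t B
                           WHERE B.unit = '638'
                             AND B.lot_number = A.lot_number)
    ORDER BY A.lot_number
""").fetchall()

# Form of example (3): the inner query stands alone and can run once.
non_correlated = conn.execute("""
    SELECT * FROM outer_t A
    WHERE A.lot_number IN (SELECT DISTINCT B.lot_number FROM inner_t B
                           WHERE B.unit = '638')
    ORDER BY A.lot_number
""").fetchall()

print(correlated == non_correlated)
```

Because the extra predicate in the correlated form only repeats the IN comparison, dropping it changes the execution strategy available to the optimizer without changing the answer.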
Need a date?
    SELECT DISTINCT CURRENT DATE FROM TABLE1;
    SELECT CURRENT DATE FROM SYSIBM.SYSDUMMY1;
    EXEC SQL SET :HV = CURRENT DATE;
Sub Select - IN vs. EXISTS (3)
    SELECT A, B, C FROM TAB1                       <- OUTER
    WHERE EXISTS (SELECT 1 FROM TAB2 WHERE ……);    <- INNER
Sub Select - IN vs. EXISTS (4)
• If the “inner” table is big, or if there is a usable index on it, then EXISTS will perform better.
• If the “inner” table is small, or there is no usable index on it, then IN will perform better.
• If only a few rows qualify, the query will be converted to IN (list), which allows a matching index scan.
SELECT *
• Don’t use “SELECT *” unless you really need all columns.
• Each column has to be moved from the DB2 page to the DM (Data Manager), then to the RDS (Relational Data System) and then to the working program.
• This move is done field by field, so every unneeded column adds cost.
ORDER BY
• Include only the columns needed for the sort.
Instead of:
    SELECT A1, B1, C1 FROM TABLE
    WHERE A1 = :HV1
    ORDER BY A1, A2, A3
use:
    SELECT A1, B1, C1 FROM TABLE
    WHERE A1 = :HV1
    ORDER BY A2, A3
A1 is fixed by the WHERE clause, so sorting on it adds nothing.
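That the shorter ORDER BY returns the rows in the same order can be checked on a few sample rows. This is a minimal sketch in Python with SQLite, not DB2; the table name `t`, the sample values and the variable `hv1` (standing in for the host variable) are assumptions for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (A1 INTEGER, A2 INTEGER, A3 INTEGER, B1 TEXT, C1 TEXT);
    INSERT INTO t VALUES
        (1, 2, 1, 'b2',    'c'),
        (1, 1, 2, 'b1b',   'c'),
        (1, 1, 1, 'b1a',   'c'),
        (2, 0, 0, 'other', 'c');
""")

hv1 = 1  # stands in for the host variable :hv1

# Sort key includes A1, which the WHERE clause already pins to one value.
long_sort = conn.execute(
    "SELECT A1, B1, C1 FROM t WHERE A1 = ? ORDER BY A1, A2, A3", (hv1,)
).fetchall()

# Sort key without the redundant column.
short_sort = conn.execute(
    "SELECT A1, B1, C1 FROM t WHERE A1 = ? ORDER BY A2, A3", (hv1,)
).fetchall()

print(long_sort == short_sort)
```

Every qualifying row has the same A1, so A1 can never decide the order between two rows; only A2 and A3 do.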
Cursor within a cursor
• A cursor within a cursor (in program code) means a lot of unnecessary open and close operations of the inner cursor.
• Code it as a join / sub-select / in-list instead.
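The difference is easy to see in program code. The sketch below, in Python with SQLite rather than embedded SQL against DB2, runs the nested-cursor pattern and the equivalent single join over hypothetical `dept` and `emp` tables, and counts how often the inner cursor is opened. All table, column and function names are assumptions for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (dept_id INTEGER, dept_name TEXT);
    CREATE TABLE emp  (emp_id INTEGER, dept_id INTEGER, emp_name TEXT);
    INSERT INTO dept VALUES (1, 'IT'), (2, 'HR');
    INSERT INTO emp  VALUES (10, 1, 'Ann'), (11, 1, 'Bob'), (12, 2, 'Cid');
""")

def nested_cursors():
    """Anti-pattern: the inner cursor is opened and closed once per outer row."""
    result, inner_opens = [], 0
    outer = conn.execute("SELECT dept_id, dept_name FROM dept ORDER BY dept_id")
    for dept_id, dept_name in outer.fetchall():
        inner = conn.execute(
            "SELECT emp_name FROM emp WHERE dept_id = ? ORDER BY emp_id",
            (dept_id,),
        )
        inner_opens += 1
        result.extend((dept_name, emp_name) for (emp_name,) in inner.fetchall())
        inner.close()
    return result, inner_opens

def single_join():
    """Same rows from one statement: one open, one close."""
    return conn.execute(
        "SELECT d.dept_name, e.emp_name FROM dept d, emp e "
        "WHERE d.dept_id = e.dept_id ORDER BY d.dept_id, e.emp_id"
    ).fetchall()

rows, opens = nested_cursors()
print(rows == single_join(), opens)
```

The number of inner opens grows with the number of outer rows, while the join stays at a single statement regardless of table size.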
Divide and conquer
• Teachers table
• Courses table
• Each teacher can teach any number of courses.
• We look for teachers who can teach all courses.
DIVIDE (1)
    CREATE TABLE DIV1
      (KD1 INT     NOT NULL,
       DD1 CHAR(5) NOT NULL);

    CREATE TABLE DIV2
      (KD2 INT NOT NULL,
       KD1 INT NOT NULL);
Bring all records from DIV2 which have all occurrences from DIV1.
DIVIDE (2)
DIV1              DIV2
KD1  DD1          KD2  KD1
1    AAA          100  1
2    BBB          100  2
3    CCC          100  3
                  101  1
                  101  5
                  102  1
                  102  2
                  102  3
                  102  4
                  104  1
Result: 100
DIVIDE (3)
    SELECT A.KD2
    FROM (SELECT DISTINCT DIV2.KD2 AS KD2, DIV2.KD1 AS KD1
          FROM DIV2
          GROUP BY DIV2.KD2, DIV2.KD1) AS A
    GROUP BY A.KD2
    HAVING COUNT(*) = (SELECT COUNT(*) FROM DIV1)
    AND NOT EXISTS
        (SELECT DIV2.KD1
         FROM DIV2
         WHERE A.KD2 = DIV2.KD2
         AND DIV2.KD1 NOT IN (SELECT DIV1.KD1 FROM DIV1));
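The DIVIDE query can be tried against the sample rows from the DIVIDE (2) slide. The sketch below loads them into an in-memory SQLite database from Python; SQLite is an assumption made for the sake of a runnable example, and DB2 may accept or optimize the statement differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DIV1 (KD1 INT NOT NULL, DD1 CHAR(5) NOT NULL);
    CREATE TABLE DIV2 (KD2 INT NOT NULL, KD1 INT NOT NULL);
    INSERT INTO DIV1 VALUES (1, 'AAA'), (2, 'BBB'), (3, 'CCC');
    INSERT INTO DIV2 VALUES
        (100, 1), (100, 2), (100, 3),
        (101, 1), (101, 5),
        (102, 1), (102, 2), (102, 3), (102, 4),
        (104, 1);
""")

# The query from the DIVIDE (3) slide, unchanged apart from layout.
DIVIDE_SQL = """
    SELECT A.KD2
    FROM (SELECT DISTINCT DIV2.KD2 AS KD2, DIV2.KD1 AS KD1
          FROM DIV2
          GROUP BY DIV2.KD2, DIV2.KD1) AS A
    GROUP BY A.KD2
    HAVING COUNT(*) = (SELECT COUNT(*) FROM DIV1)
       AND NOT EXISTS (SELECT DIV2.KD1 FROM DIV2
                       WHERE A.KD2 = DIV2.KD2
                         AND DIV2.KD1 NOT IN (SELECT DIV1.KD1 FROM DIV1))
"""

result = conn.execute(DIVIDE_SQL).fetchall()
print(result)
```

Only 100 qualifies. 102 is paired with every KD1 in DIV1 but also with KD1 = 4, which is not in DIV1, so both the COUNT(*) test and the NOT EXISTS test reject it.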
Find Duplicates
    SELECT   A, B, C, COUNT(*) AS "NUM#"
    FROM     T1
    GROUP BY A, B, C
    HAVING   COUNT(*) > 1
    [ORDER BY 4 DESC]
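A small run shows what the duplicate finder returns. This is a sketch in Python with SQLite, not DB2, and the sample rows are made up: one group occurs three times, one twice, one once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (A INTEGER, B INTEGER, C INTEGER);
    INSERT INTO T1 VALUES
        (1, 1, 1), (1, 1, 1), (1, 1, 1),
        (2, 2, 2),
        (3, 3, 3), (3, 3, 3);
""")

# ORDER BY 4 sorts on the 4th result column, i.e. the count.
dups = conn.execute("""
    SELECT A, B, C, COUNT(*) AS "NUM#"
    FROM T1
    GROUP BY A, B, C
    HAVING COUNT(*) > 1
    ORDER BY 4 DESC
""").fetchall()

print(dups)
```

The group that occurs only once is filtered out by HAVING, and the remaining groups come back with the most frequent duplicate first.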
GROUP BY ON FUNCTIONS (1)
    SELECT DEPT, GROSS_SALARY
    FROM (SELECT DEPT, SALARY + BONUS AS GROSS_SALARY
          FROM EMP
          WHERE RANK >= 30) AS A
    GROUP BY DEPT, GROSS_SALARY
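The pattern here is to compute the expression in a nested table expression, give it a name, and group on that name in the outer query. The sketch below runs the statement on made-up EMP rows using Python with SQLite rather than DB2. The column RANK is quoted only as a precaution for SQLite, and an ORDER BY is added so the output order is deterministic; both are assumptions of the example, not part of the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMP (DEPT TEXT, SALARY INTEGER, BONUS INTEGER, "RANK" INTEGER);
    INSERT INTO EMP VALUES
        ('D1', 100, 10, 30),
        ('D1',  90, 20, 40),   -- same gross salary as the row above
        ('D1', 100, 50, 10),   -- filtered out by RANK >= 30
        ('D2', 200,  0, 35);
""")

gross = conn.execute("""
    SELECT DEPT, GROSS_SALARY
    FROM (SELECT DEPT, SALARY + BONUS AS GROSS_SALARY
          FROM EMP
          WHERE "RANK" >= 30) AS A
    GROUP BY DEPT, GROSS_SALARY
    ORDER BY DEPT, GROSS_SALARY
""").fetchall()

print(gross)
```

The two D1 rows with different SALARY and BONUS but the same sum collapse into one group, which is the effect of grouping on the derived column.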
GROUP BY ON FUNCTIONS (2)
    SELECT SUM(SALARY), MONTH_SAL
    FROM (SELECT SALARY, MONTH(SALARY_DATE) AS MONTH_SAL
          FROM EMP) AS A
    GROUP BY MONTH_SAL
How much (does it cost)?
Statement type          Estimated number of machine instructions
Simple FETCH            3,500 to 9,000
Singleton SELECT        12,000 to 40,000
Update/Delete/Insert    40,000 to 90,000
The END