Basics Of Sql Tuning.ppt

  • Uploaded by: Sagar Deshpande
  • 0
  • 0
  • December 2019
  • PDF

This document was uploaded by user and they confirmed that they have the permission to share it. If you are author or own the copyright of this book, please report to us by using this DMCA report form. Report DMCA


Overview

Download & View Basics Of Sql Tuning.ppt as PDF for free.

More details

  • Words: 5,104
  • Pages: 42
Basics Of SQL Tuning

Introduction to SQL What is SQL SQL is structured Query Language which is a computer language for storing, manipulating and retrieving data stored in relational database. what happen when you fired the SQL Statement  Parsing  Binding  Executing.  Fetching (For Select Statement) SQL Tuning Constitute of Three parts   

SQL Writing (50 %) SQL Tuning (30 %) Advance SQL Tuning(20 %)

Tips on Writing Tuned SQL      

    

Know your indexes Know the required response time Have sufficient data volumes Have the optimizers, statistics and Oracle Parameters the same as production Know how to obtain and interpret an execution plan Know when and when not to use indexes Know what prevents an index from being used Understand how Oracle joins tables and get the Driving Table right Understand the behavior of the Oracle Optimizers Know which hints are available to you and which ones to use Having an architecture and data model that will assist performance

Efficient Sql Writing TipsNever use >SELECT * from Statement: Each query should access only the columns needed for the function that will be performed. But if all the columns are required then, there is no harm in using SELECT *. Never use a Cursor for fetching a single row: Because for using cursors you will have to DECLARE, OPEN, FETCH and CLOSE them. A single SELECT is much faster.

Avoid using DISTINCT: The DISTINCT verb removes the duplicates from the result table. If duplicates are not a problem and if the chance of having a duplicate does not exist, then avoid using it because it adds to the overheads. Use Equivalent Data Types: Use the same data type and lengths when comparing column values to host variables or literals. This eliminates the need for data conversion. For examples, to compare a CHAR (6) column with a CHAR (5) column, Oracle has to do the data conversion and when Oracle has to convert data the available indexes are not used. The same is the case when there is a mismatch in lengths or data type between a column and a host variable that are being compared.

Use BETWEEN instead of <= and >=: The BETWEEN predicate is more efficient that the `greater/less than or equal to’ predicates because the optimizer select the most efficient path for the BETWEEN predicate. Use IN instead of LIKE: Use IN or BETWEEN instead of LIKE whenever possible. If you know that only a certain number of occurrences exist, using IN with the specific list will be much more efficient than LIKE. You can imitate the functionality of LIKE by using a range of values.

Avoid using NOT: Whenever possible avoid the use of NOT (<>). Instead reword the SQL so that the same results are obtained. For example use ‘WHERE sum > 100’ instead of ‘WHERE sum <> 100’. Code the most restrictive predicate first: Choose the condition that will eliminate most of the rows as the first predicate. For example, if you want to get the names of all the person whose name starts with the letter ‘M’ in ‘Mumbai’ city, instead of coding ‘WHERE name BETWEEN ‘NNN’ AND ‘MMM’ and city=’Mumbai’, code city=’Mumbai’ AND name BETWEEN ‘NNN’ AND ‘MMM’.

Use UNION ALL instead of UNION: The UNION operator always results in a sort. So in cases where there is no possibility of duplicates or in case where elimination of duplicates is unnecessary use UNION ALL instead of UNION. Use NOT EXISTS instead of NOT IN: When you are coding a sub query using negation logic, use NOT EXISTS instead of NOT IN to increase the efficiency of the SQL. Use joins instead of sub queries: A join can be efficient than a sub query or a correlated sub query using IN. Join on indexed columns: The efficiency of your programs improves when tables are joined based on indexed columns rather than non-indexed ones.

Limit the columns Grouped: When you are using the GROUP BY clause, specify only those columns that need to be grouped. Use same cases for all the tables, Views accessed in select statements, which will allow Oracle to use shared pool efficiently

Limit updating indexed columns: When columns in indexes are updated, a corresponding update is applied to all indexes in which the column participates. This will reduce the performance considerably because of additional I/O overhead.

Avoid Bitmap Index in OLTP DB: Do not use Bitmap Index on columns which are part of OLTP database ,as these OLTP tables are getting updated, deleted and inserted frequently. As To update one value in Bitmap index, Oracle has to rearrange all the Bitmap index elements Prefer Truncate in place of Delete: As Delete statements doesn't reset the High water mark of a table, it leaves behind empty blocks which may hamper FULL table access of such tables. Truncate deletes all the data and reset the high-water mark ,which results in no empty blocks. Use /*+ APPEND */ Hint while inserting data in Global Temporary Or Temporary tables used for data manipulation, this will insert data faster as data will be stored straight way after the High water mark.

Tips on Writing Tuned SQL      

    

Know your indexes Know the required response time Have sufficient data volumes Have the optimizers, statistics and Oracle Parameters the same as production Know how to obtain and interpret an execution plan Know when and when not to use indexes Know what prevents an index from being used Understand how Oracle joins tables and get the Driving Table right Understand the behavior of the Oracle Optimizers Know which hints are available to you and which ones to use Having an architecture and data model that will assist performance

Indexes 



All developers require a sound knowledge of the indexes on their tables Indexes should be used to avoid FULL Table Scans ONLY when the SQL requires the index to perform acceptably  Occasionally Oracle will prefer FULL Scans and the developer will need to place a hint into the SQL to make it use the correct index  If the developer is required to add an index  Be aware of the impact on other transactions  Be aware of the issues caused by single column indexes, adding one concatenated index in preference to two single column indexes

What is an Index? 

An index is an hierarchically organized structure designed to facilitate the rapid lookup of row identifiers (rowids) based on key values .

Types Of Index 

B-tree indexes This is the standard tree index that Oracle has been using since the earliest releases.



Bitmap indexes Bitmap indexes are used where an index column has a relatively small number of distinct values (low cardinality). These are super-fast for read-only databases, but are not suitable for systems with frequent updates



Bitmap join indexes This is an index structure whereby data columns from other tables appear in a multi-column index of a junction table. This is the only create index syntax to employ a SQL-like from clause and where clause

Inside Oracle indexes 

In order to properly manage the blocks, Oracle controls the allocation of pointers within each data block. As an Oracle tree grows (via inserting rows into the table), Oracle fills the block, and when full it splits, creating new index nodes (data blocks) to manage the symbolic keys within the index.



Hence, an Oracle index block may contain two types of pointers - Pointers to other index nodes (data blocks) - ROWID pointers to specific table rows



Oracle manages the allocation of pointers within index blocks, and this is the reason why we are unable to specify a PCTUSED value (the freelist re-link threshold) for indexes. When we examine an index block structure, we see that the number of entries within each index node is a function of two values 1) The length of the symbolic key 2) The blocksize for the index tablespace









Each data block within the index contains "nodes" in the index tree, with the bottom nodes (leaf blocks), containing pairs of symbolic keys and ROWID values. As an Oracle tree grows (via inserting rows into the table), Oracle fills the block, and when the block is full, it splits, creating new index nodes (data blocks) to manage the symbolic keys within the index. Hence, an Oracle index block may contain pointers to other index nodes or ROWID/Symbolic-key pairs. Why Indexes Aren't Used The presence of an index on a column does not guarantee it will be used. The following is a small list of factors that will prevent an index from being used: The optimizer decides it would be more efficient not to use the index. As a rough rule of thumb, on evenly distributed data an index will be used if it restricts the number rows returned to 5% or less of the total number of rows. In the case of randomly distributed data, an index will be used if it restricts the number of rows returned to 25% or less of the total number of rows. • You perform a function on the indexed column i.e. WHERE UPPER(name) = 'JONES'

• You perform mathematical operations on the indexed column i.e. WHERE salary + 1 = 10001 • You concatenate a column i.e. WHERE firstname || ' ' || lastname = 'JOHN JONES' • The use of 'OR' statements confuses the Cost Based Optimizer (CBO).

Explain plan The EXPLAIN PLAN statement displays execution plans chosen by the Oracle optimizer for SELECT, UPDATE, INSERT, and DELETE statements. A statement's execution plan is the sequence of operations Oracle performs to run the statement. Whenever an SQL query is issued to the Oracle database, Oracle creates a plan for executing the query. This execution plan can be saved in a plan table. The plan table contains the order of the tables referenced by the statement, the method used to access the table, the join method, and data operations. The EXPLAIN PLAN results let you determine whether the optimizer select a particular execution plan, such as, join, Scan Methods. It also helps you to understand the optimizer decisions, such as why the optimizer chose a nested loops join instead of a hash join,

To explain a SQL statement, use the EXPLAIN PLAN FOR clause immediately before the statement.

For example: EXPLAIN PLAN FOR SELECT last_name FROM employees; . You can Display the plan_Table output by using the below sql statement select * from table(dbms_xplan.display());

Reading Explain Plan What is COST ? Cost of the operation as estimated by the optimizer's query approach. Cost is not determined for table access operations. The value of this column does not have any particular unit of measurement; it is merely a weighted value used to compare costs of execution plans. The value of this column is a function of the CPU_COST and IO_COST columns. CPU_COST

CPU cost of the operation as estimated by the query optimizer's approach. The value of this column is proportional to the number of machine cycles required for the operation. For statements that use the rule-based approach, this column is null. IO_COST

I/O cost of the operation as estimated by the query optimizer's approach. The value of this column is proportional to the number of data blocks read by the operation. For statements that use the rule-based approach, this column is null.

What is CARDINALITY ? Estimate by the query optimization approach of the number of rows accessed by the operation. What is BYTES? Estimate by the query optimization approach of the number of bytes accessed by the operation. There are other parameter Like Object Owner which depict the owner of that particular object, Object Name etc.

Significance of Access Paths To choose an access path, the optimizer first determines which access paths are available by examining the conditions in the statement's WHERE clause. The optimizer then generates a set of possible execution plans using available access paths and estimates the cost of each plan using the statistics for the index, columns, and tables accessible to the statement. The optimizer then chooses the execution plan with the lowest estimated cost 





To choose among available access paths, the optimizer considers the Selectivity which is the percentage of rows in the table that the query selects A query that selects a small percentage of a table's rows has good selectivity, while a query that selects a large percentage of rows has poor selectivity. The optimizer is more likely to choose an index scan over a full table scan for a query with good selectivity than for one with poor selectivity. Index scans are usually more efficient than full table scans for queries that access only a small percentage of a table's rows, while full table scans are usually faster for queries that access a large percentage

FULL TABLE SCAN This type of scan reads all rows from a table and filters out those that do not meet the selection criteria When Oracle performs a full table scan, the blocks are read sequentially. Each block is read only once. Why a Full Table Scan Is Faster for Accessing Large Amounts of Data ? This is because full table scans can use larger I/O calls, and making fewer large I/O calls is cheaper than making many smaller calls. When the Optimizer Uses Full Table Scans? The optimizer uses a full table scan in any of the following cases:  Lack of Index  Large Amount of Data  Small Table

Rowid Scans The rowid of a row specifies the datafile and data block containing the row and the location of the row in that block. Locating a row by specifying its rowid is the fastest way to retrieve a single row, because the exact location of the row in the database is specified. To access a table by rowid, Oracle first obtains the rowids of the selected rows, either from the statement's WHERE clause or through an index scan of one or more of the table's indexes. Oracle then locates each selected row in the table based on its rowid. When the Optimizer Uses Rowids? This is generally the second step after retrieving the rowid from an index. The table access might be required for any columns in the statement not present in the index. Access by rowid does not need to follow every index scan. If the index contains all the columns needed for the statement, then table access by rowid might not occur.

INDEX SCAN In this method, a row is retrieved by traversing the index, using the indexed column values specified by the statement. An index scan retrieves data from an index based on the value of one or more columns in the index. To perform an index scan, Oracle searches the index for the indexed column values accessed by the statement. If the statement accesses only columns of the index, then Oracle reads the indexed column values directly from the index, rather than from the table. An index scan can be one of the following types:  Assessing I/O for Blocks, not Rows  Index Unique Scans  Index Range Scans  Index Skip Scans  Full Scans  Fast Full Index Scans  Index Joins  Bitmap Indexes

Assessing I/O for Blocks, not Rows Oracle does I/O by blocks. Therefore, the optimizer's decision to use full table scans is influenced by the percentage of blocks accessed, not rows. This is called the index clustering factor. If blocks contain single rows, then rows accessed and blocks accessed are the same. However, most tables have multiple rows in each block. Consequently, the desired number of rows could be clustered together in a few blocks, or they could be spread out over a larger number of blocks. Although the clustering factor is a property of the index, the clustering factor actually relates to the spread of similar indexed column values within data blocks in the table. A lower clustering factor indicates that the individual rows are concentrated within fewer blocks in the table. Conversely, a high clustering factor indicates that the individual rows are scattered more randomly across blocks in the table. Therefore, a high clustering factor means that it costs more to use a range scan to fetch rows by rowid, because more blocks in the table need to be visited to return the data. shows how the clustering factor can affect cost.

Assume the following situation: There is a table with 9 rows. There is a non unique index on col1 for table. The c1 column currently stores the values A, B, and C. The table only has three Oracle blocks. Case 1: The index clustering factor is low for the rows as they are arranged in the following diagram. Block 1 Block 2 Block 3 -------------------AAA BBB CCC This is because the rows that have the same indexed column values for c1 are located within the same physical blocks in the table. The cost of using a range scan to return all of the rows that have the value A is low, because only one block in the table needs to be read.

Case 2: If the same rows in the table are rearranged so that the index values are scattered across the table blocks, then the index clustering factor is higher. Block 1 ------ABC

Block 2 ------ABC

Block 3 -------ABC

This is because all three blocks in the table must be read in order to retrieve all rows with the value A in col1.

Index Unique Scan : Index unique scans, which occur when the Oracle database engine uses an index to retrieve a specific row from a table. Generally we have the unique index on the column of the table. SELECT * FROM emp e WHERE e.empno = 1234; Explain Plan-SELECT STATEMENT () TABLE ACCESS ( BY INDEX ROWID) EMP INDEX ( UNIQUE SCAN ) UQ_EMP

Index Full Scan : Oracle will choose an index full scan when the CBO statistics that indicate that a full-index scan is going to be more efficient than a full-table scan and a sort of the result set. The full-index scan is normally invoked when the CBO determines that a query will return numerous rows in index order, and a fulltable scan and sort option may cause a disk sort to the TEMP tablespace.

Index Range Scan The most common method of index access in Oracle is the index range scan. An index range scan is used when the SQL statement contains a restrictive clause that requires a sequential range of values that are indexes for the table. SELECT * FROM emp e WHERE e.empno > 1234; Explain Plan SELECT STATEMENT () TABLE ACCESS ( BY INDEX ROWID) INDEX ( RANGE SCAN )

EMP UQ_EMP

Index Skip Scan: Index skip-scan feature enables the optimizer to use a concatenated index even if its leading column is not listed in the WHERE clause. Skip scanning lets a composite index be split logically into smaller subindexes. In skip scanning, the initial column of the composite index is not specified in the query. In other words, it is skipped. Index skip-scans are faster than full scans of the index, requiring fewer reads to be performed. Consider, for example, a table employees (sex, employee_id, address) with a composite index on (sex, employee_id). Splitting this composite index would result in two logical subindexes, one for M and one for F. For this example, suppose you have Created index on sex and employee ID

The index is split logically into the following two subindexes: The first subindex has the keys with the value F. The second subindex has the keys with the value M. The column sex is skipped in the following query >SELECT * FROM employees WHERE employee_id = 101; A complete scan of the index is not performed, but the subindex with the value F is searched first, followed by a search of the subindex with the value M. Fast Full Index Scans Fast full index scans are an alternative to a full table scan when the index contains all the columns that are needed for the query, and at least one column in the index key has the NOT NULL constraint. A fast full scan accesses the data in the index itself, without accessing the table. It cannot be used to eliminate a sort operation, because the data is not ordered by the index key. It reads the entire index using multiblock reads, unlike a full index scan, and can be parallelized. A fast full scan is faster than a normal full index scan in that it can use multiblock I/O and can be parallelized just like a table scan.

When a index contains all of the values required to satisfy the query and table access is not required. The fast full-index scan execution plan will read the entire index with multi-block reads (using db_file_multiblock_read_count) and return the rows in unsorted order The fast full scan usually requires fewer physical I/Os than a full table scan , allowing the query to be resolved faster. There is concatenated index on the columns empno, ename, and deptno. SELECT e.empno,e.ename,e.deptno FROM emp e WHERE e.deptno = 30; ----Explain Plan---0 SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=1 Bytes=5) 1 0 SORT (AGGREGATE) 2 1 INDEX (FAST FULL SCAN) Since all of the columns in the SQL statement are in the index, a fast full scan is available. Index fast full scans are commonly performed during joins in which only the indexed join key columns are queried.

Oracle Joining Methods : Types of Joins:  Nested loop  Sort merge joins  Hash joins  Index joins

Nested loop : Oracle reads the first row from the first row source and then checks the second row source for matches. All matches are then placed in the result set. This continues until all rows in the first row source have been processed. The first row source is often called the outer or driving table, Second row source is called the inner table. It is the fastest methods of receiving the first records back from a join.

Suppose somebody gave you a telephone book and a list of 20 names to look up, and asked you to write down each person’s name and corresponding telephone number. You would probably go down the list of names, looking up each one in the telephone book one at a time. This task would be pretty easy because the telephone book is alphabetized by name. Moreover, somebody looking over your shoulder could begin calling the first few numbers you write down while you are still looking up the rest. This scene describes a NESTED LOOPS join. NESTED LOOPS joins are ideal when the driving row source (the records you are looking for) is small and the joined columns of the inner row source are uniquely indexed or have a highly selective nonunique index. However, NESTED LOOPS joins can be very inefficient if the inner row source (second table accessed) does not have an index on the joined columns or if the index is not highly selective. If the driving row source (the records retrieved from the driving table) is quite large. SELECT /*+ ordered */ e.ename, d.deptno FROM dept d, emp e WHERE d.deptno = e.deptno;

Sort Merge Join Oracle sorts the first row source by its join columns, Oracle sorts the second row source by its join columns Merges the sorted row sources together. As matches are found, they are put into the result set. Suppose two salespeople attend a conference and each collect over 100 business cards from potential new customers. They now each have a pile of cards in random order, and they want to see how many cards are duplicated in both piles. The salespeople alphabetize their piles, and then they call off names one at a time. Because both piles of cards have been sorted, it becomes much easier to find the names that appear in both piles. This example describes a SORT-MERGE join. SORT-MERGE joins can be effective when lack of data selectivity or useful indexes render a NESTED LOOPS join inefficient. SORT-MERGE joins can be used only for equijoins (WHERE D.deptno = E.deptno, as opposed to WHERE D.deptno >= E.deptno).

Hash Join Oracle accesses one table (usually the smaller of the joined results) and builds a hash table on the join key in memory. It then scans the other table in the join (usually the larger one) and probes the hash table for matches to it. Oracle uses a HASH join efficiently only if the parameter PGA_AGGREGATE_TARGET is set to a large enough value. HASH joins work only on equijoins HASH joins can be effective when the lack of a useful index renders NESTED LOOPS joins inefficient. The HASH join might be faster than a SORT-MERGE join, in this case, because only one row source needs to be sorted, and it could possibly be faster than a NESTED LOOPS join because probing a hash table in memory can be faster than traversing a b-tree index. HASH joins use memory resources and can drive up I/O in the temporary tablespace if the sort memory is not sufficient (which can cause this join method to be extremely slow). SELECT /*+ ordered */ e.ename, d.deptno FROM dept d, emp e WHERE d.deptno = e.deptno;

Index Join : If a set of indexes exists that contains all of the information required by the query then the optimizer can choose to generate a sequence of HASH joins between the indexes. Each of the indexes are accessed using a range scan or fast full scan, depending on the conditions. This method is extremely efficient when a table has a large number of columns, but you want to access only a limited number of those columns. The advantage of INDEX joins over fast full scans is that fast full scans have a single index satisfying the entire query. INDEX joins have multiple indexes satisfying the entire query. Two indexes (one on ENAME and one on DEPTNO) have been created prior to the execution of the corresponding query in this next listing. The query does not need to access the table SELECT e.ename,e.deptno FROM emp e WHERE e.deptno = 20 AND e.ename = 'DULLY';

Efficient Sql Writing TipsNever use >SELECT * from Statement: Each query should access only the columns needed for the function that will be performed. But if all the columns are required then, there is no harm in using SELECT *. Never use a Cursor for fetching a single row: Because for using cursors you will have to DECLARE, OPEN, FETCH and CLOSE them. A single SELECT is much faster.

Avoid using DISTINCT: The DISTINCT verb removes the duplicates from the result table. If duplicates are not a problem and if the chance of having a duplicate does not exist, then avoid using it because it adds to the overheads. Use Equivalent Data Types: Use the same data type and lengths when comparing column values to host variables or literals. This eliminates the need for data conversion. For examples, to compare a CHAR (6) column with a CHAR (5) column, Oracle has to do the data conversion and when Oracle has to convert data the available indexes are not used. The same is the case when there is a mismatch in lengths or data type between a column and a host variable that are being compared.

Use BETWEEN instead of <= and >=: The BETWEEN predicate is more efficient that the `greater/less than or equal to’ predicates because the optimizer select the most efficient path for the BETWEEN predicate. Use IN instead of LIKE: Use IN or BETWEEN instead of LIKE whenever possible. If you know that only a certain number of occurrences exist, using IN with the specific list will be much more efficient than LIKE. You can imitate the functionality of LIKE by using a range of values.

Avoid using NOT: Whenever possible avoid the use of NOT (<>). Instead reword the SQL so that the same results are obtained. For example use ‘WHERE sum > 100’ instead of ‘WHERE sum <> 100’. Code the most restrictive predicate first: Choose the condition that will eliminate most of the rows as the first predicate. For example, if you want to get the names of all the person whose name starts with the letter ‘M’ in ‘Mumbai’ city, instead of coding ‘WHERE name BETWEEN ‘NNN’ AND ‘MMM’ and city=’Mumbai’, code city=’Mumbai’ AND name BETWEEN ‘NNN’ AND ‘MMM’.

Use UNION ALL instead of UNION: The UNION operator always results in a sort. So in cases where there is no possibility of duplicates or in case where elimination of duplicates is unnecessary use UNION ALL instead of UNION. Use NOT EXISTS instead of NOT IN: When you are coding a sub query using negation logic, use NOT EXISTS instead of NOT IN to increase the efficiency of the SQL. Use joins instead of sub queries: A join can be efficient than a sub query or a correlated sub query using IN. Join on indexed columns: The efficiency of your programs improves when tables are joined based on indexed columns rather than non-indexed ones.

Limit the columns Grouped: When you are using the GROUP BY clause, specify only those columns that need to be grouped. Use same cases for all the tables, Views accessed in select statements, which will allow Oracle to use shared pool efficiently

Limit updating indexed columns: When columns in indexes are updated, a corresponding update is applied to all indexes in which the column participates. This will reduce the performance considerably because of additional I/O overhead.

Avoid Bitmap Index in OLTP DB: Do not use Bitmap Index on columns which are part of OLTP database ,as these OLTP tables are getting updated, deleted and inserted frequently. As To update one value in Bitmap index, Oracle has to rearrange all the Bitmap index elements Prefer Truncate in place of Delete: As Delete statements doesn't reset the High water mark of a table, it leaves behind empty blocks which may hamper FULL table access of such tables. Truncate deletes all the data and reset the high-water mark ,which results in no empty blocks. Use /*+ APPEND */ Hint while inserting data in Global Temporary Or Temporary tables used for data manipulation, this will insert data faster as data will be stored straight way after the High water mark.

Related Documents

Basics Of Sql Tuning.ppt
December 2019 12
Sql And Sql Plus Basics
November 2019 26
Sql Server 2005 Basics
April 2020 15
Sql
October 2019 20
Sql
June 2020 12
Sql
November 2019 11

More Documents from ""

Basics Of Sql Tuning.ppt
December 2019 12
December 2019 10
A Project Omkar.docx
June 2020 7
Midori Presentation
December 2019 29