Db2

Uploaded by: api-3853979
0
0

July 2020
PDF

This document was uploaded by user and they confirmed that they have the permission to share it. If you are author or own the copyright of this book, please report to us by using this DMCA report form. Report DMCA

Overview

Download & View Db2 as PDF for free.

More details

Words: 7,037
Pages: 80

Preview
Full text

Relational Model

Overview •Introduced by Dr.E.F.Codd in 1970 •Model based on mathematical foundations •Earlier models intrinsically tied to internal representations •Developer had to be aware of navigational principles •Need for a model to be divorced from physical organization •Require independence between physical & logical model The Model •In Oct.85 Dr.E.F.Codd published a two part paper •It introduced rules for Relational model •Rules determined whether a product is fully relational or not Two papers •Is your DBMS really relational - Oct 14,1985 •Does your DBMS run by the rules - Oct 21,1985 The implications were •satisfying rules was a technically feasible proposition •practical benefits if system did satisfy rules A DBMS is said to be fully relational if it supports Codd’s 12 rules, 9 Structural, 3 Integrity and 18 Manipulative features.

Basic Concepts •Relation corresponds to a table, resembles files •Rows of a relation are called Tuples, resemble records •Columns of a relation are called attributes, resemble fields •Entity and relationships are both represented as relations •A relation consists of same kind of tuples •Structure of a relation is defined by Scheme (definition) •An instance of a scheme is called Relational instance Properties of a Relation •Relation is a set of tuples •No two tuples in a relation are identical •Tuples in a relation have no order among themselves •Attribute values are atomic •Attribute values map onto a domain Notion of Keys Superkey Candidate key Primary key

- unique identifier for tuple - minimal superkey - Designated candidate key

A Relational model consists of Structural part relations, domains, etc. Integrity part entity, referential, domain/user defined Manipulative part operators & extensions

Structural Features Relations Mathematical entity to hold data in the relational model. A set of n-tuple, perceived as a two dimensional table where an intersection of a row and a column is a atomic value. Base tables A named and autonomous relation, one that actually holds data. Query tables Relations that result from execution of queries, not named and do not have persistent existance. View tables A named, virtual and derived relation, that is defined in terms of other named relations. Snapshot tables A named, derived and real relation, represented in terms of other relations and also by it’s own materialized data. Attributes Correspond to columns of the table, all attribute values are of the same type and atomic in nature. Domains All possible values from which attribute values are chosen. Primary key A set of attributes, whose value is a minimal unique identifier to a row of the table. A designated candidate key. Foreign keys A set of attributes, whose value is the the primary key of another table

Integrity Features Entity Integrity No component of primary key of base relation is allowed to be NULLS Referential Integrity The database must not contain any unmatched foreign key values Domain defined Integrity Attribute values should be those within the domain that it is mapped onto Manipulative Features •Restrict •Project •Cross product •Union, Intersection, Difference •Join •Divide •Extensions

Twelve Rules Information Rule All information be represented in one and only one way, i.e values in column positions within rows of tables. Guaranteed access Rule Every individual scalar value in database must be logically addressable by specifying table name, column name and primary key value. Systematic treatment of NULL Support representation of missing and inapplicable information that is systematic and distinct from all regular values. Active Online Catalog Support for online, relational catalog accessible to authorised users by means of regular query language. Comprehensive data sublanguage Rule Supports one relational language that has linear syntax, that is used interactively and within application programs, that supports data definition, manipulation, security, integrity and transaction management operations. View updating Rule All views that are technically updatable must be updatable by the system.

Twelve Rules High level Insert, Update, Delete Support for set-at-a-time Insert, Update, Delete operations. Physical Data Independence Changes at Internal level do not affect Conceptual level. Logical Data Independence Changes at Conceptual level do not affect External level. Integrity Independence Integrity constraints specified seperately from application programs, they are stored in catalog, it is possible to change integrity constraints without affecting existing code. Distributive Independence Existing applications should operate sucessfully when distributed version of DBMS is first introduced and when existing data is redistributed around the system. Non subversion Rule If the system provides low-level record-at-a-time interface, then that interface cannot subvert the system, thereby bypassing relational security or integrity constraints.

SQL

SQL •special purpose language for accessing & manipulating data •different from application progamming languages like C,Cobol •uses a combination of relational algebra & calculus constructs History 1970 - Dr.Codd proposed Relational model 1971-79 - SEQUEL implemented in System R 1980 - SEQUEL became SQL 1986 - SQL 86 (ANSI standard) 1989 - Follow on to SQL-86 - SQL-89 1992 - SQL-2 1995 - SQL-3

SQL components Data Definition Language (DDL) •specifies database schema •creates, modifies, deletes database objects (tables, view, index) Data Manipulation Language (DML) •manipulates data using Insert, Modify, Delete operations •accesses data from database for queries Data Control Language (DCL) •grants and revokes authorisation for database access •audits database use •provides transaction management

Data Definition

Base tables •consists of a row of column headings •zero or more rows of data values •each data row contains one scalar value for each column •all values in a column are of same data type •row ordering is irrelevant •order is imposed when rows are retrieved •columns are considered to be ordered from left to right •column ordering has a significance •rows and columns do have a physical ordering as stored version •physical row and column ordering is transparent to user •base table is autonomous, it exists in it’s own right

Table Creation Create Table S ( S# SNAME

Char(5) Char(20)

STATUS

Smallint

CITY

Char(15)

Primary Key (S#) ); Create Table SCOPY Like S ; Table Modification Alter Table S Add DISCOUNT SmallInt ; Table Deletion Drop Table S ;

Not Null, Not Null With Default, Not Null With Default, Not Null With Default,

Index •indexes are created and dropped using SQL •data manipulation statements do not refer to indexes at all •decision to use or not to use index is made by DB2 Index Creation Create [Unique] Index X on T (P,Q Desc, R) ; Index Deletion Drop Index X ; Notes on Data definition •data definition statements can be executed at any time •possible to create a few tables and start using them •subsequently new columns could be added •possible to experiment with effects of indexes •permits one not to get everything right the first time

Data Manipulation

Select Statement reference Select [All/Distinct] <scalar-expr> From Where Group By Having Order By Sample table definitions 1. Supplier

S (S#,SNAME,CITY,STATUS)

2. Part

P (P#,PNAME,COLOR,WEIGHT)

3. Supp-Part

SP (S#,P#)

Simple retrieval Get part names of all parts Select PNAME From P Retrieval with duplicate elimination Get part numbers for all part supplied Select Distinct P# From SP Retrieval of computed values Select P#,’Height’,HEIGHT*250 From P Retrieval of full details Select * From S

Qualified Retrieval Get supplier numbers for suppliers in Bombay with status above 20 Select S# From S Where CITY = ‘Bombay’ and STATUS > 20 Retrieval using ordering Get supplier numbers and status for suppliers in Bombay in descending order of status Select S#,STATUS From S Where CITY = ‘Bombay’ Order By STATUS desc Order by 3rd column Select P#,’Height’,’HEIGHT*250 From P Order By 3,P#

Retrieval using Range of values Get parts whose weight is in the range of 16..19 both limit values inclusive Normal way Select P#,PNAME From P Where WEIGHT >= 16 and WEIGHT <= 19 Using BETWEEN Select P#,PNAME From P Where WEIGHT Between 16 and 19 Similarly Where WEIGHT Not Between 16 and 19

Retrieval using IN Get parts whose weight is any one of the following values 12,16,17 Normal way Select P#,PNAME From P Where WEIGHT = 12 or WEIGHT = 16 or WEIGHT = 17 Using IN Select P#,PNAME From P Where WEIGHT IN (12,16,17) Similarly Where WEIGHT NOT IN (12,16,17)

Retrieval using NULL Since NULL is missing or inapplicable information, normal comparisons won’t work Get supplier numbers for those suppliers for whom STATUS is inapplicable Select S# From S Where STATUS Is Null Similarly Where STATUS Is Not Null

Cartesian product Each row of one table joined with every row of the other table Select S.*,P.* From S,P Equi Join If the join condition comprises of equality operator, then the join is known as Equi Join Select S.*,P.* From S,P Where S.CITY = P.CITY Theta Join If the rows from a cartesian product are eliminated by restrict operation on the basis of any condition, then the join is known as Theta Join. Select S.*,P.* From P,SP Where P.P# = SP.P# And P.RATE < SP.RATE

Natural Join From a cartesian product, choose common fields and compare their values for equality, finally one of the common fields is eliminated from the projection Select S.*,SP.* From S,SP Where S.S# = SP.S# Natural join on 3 tables Select S.*,SP.*,P.* From S,SP,P Where S.S# = SP.S# And P.P# = SP.P# Join table with Self Get employees with their manager names Select

First.E#,First.ENAME,First.M#, Second.ENAME From E First, E Second Where First.M# = Second.E#

Simple Subquery Suppose suppliers supplying part P2 are S1,S2,S3,S4, then the query to get supplier names supplying part P2 could be as follows. Select SNAME From S Where S# In (‘S1’,’S2’,’S3’,’S4’) However we can get S# of suppliers supplying part P2 from the database by the following query Select S# From SP Where P# = ‘P2’ The IN clause of SQL requires a list of values, and a query with one attribute is also a list of values, hence can be substituted in the IN clause Select SNAME From S Where S# In (

Select S# From SP Where P# = ‘P2’)

This is known as subquery or nested query

Query with multiple nesting Get supplier names for suppliers supplying at least one red part Select SNAME From S Where S# In (List of suppliers supplying red part) Select SNAME From S Where S# In (

Seleet SNAME From S Where S# In (

Select S# From SP Where P# In (List of red parts))

Select S# From SP Where P# In (

Select P# From P Where COLOR = ‘RED’))

Subquery & Outer query referring to same table Get supplier number for suppliers who supply atleast one part supplied by supplier S2 Select Distinct S# From SP Where P# In (List of parts supplied by supplier S2) Select Distinct S# From SP Where P# In ( Select P# From SP Where S# = ‘S2’) Subquery with scalar comparisons Get supplier number for suppliers located in the same city as supplier S1 Select S# From S Where CITY = (City of supplier S2) Select S# From S Where CITY = ( Select CITY From S Where S# = ‘S1’)

Query using EXISTS EXISTS is always associated with a subquery. EXISTS tests whether the results of an suquery return zero rows or non zero rows. EXISTS with a subquery is used as a condition in the Where clause of the outer query If the subquery returns one or more rows the EXISTS condition is TRUE, otherwise it is FALSE Similarly NOT EXISTS is negation if EXISTS

Query using EXISTS Get supplier names for suppliers who supply part P2 Select SNAME From S Where Exists ( List of suppliers where S# is the same as that of the outer query and P# is ‘P2’) Select SNAME From S Where Exists ( Select * From SP Where S# = S.S# And P# = ‘P2’) Example with NOT EXISTS Get supplier names for suppliers who do not supply part P2 Select SNAME From S Where Not Exists (

Select * From SP Where S# = S.S# And P# = ‘P2’)

Quantified comparisons Get part names for parts whose height is greater than every blue part List of heights of blue parts Select HEIGHT From P Where COLOR = ‘Blue’ Part names for required parts Select PNAME From P Where HEIGHT > ALL (List of heights of blue parts) Select PNAME From P Where HEIGHT > ALL ( Select HEIGHT From P Where COLOR = ‘Blue’) Similarly Where HEIGHT > ANY ( Subquery

)

Aggregate Functions Aggregate functions operate on a collection of scalar values of one column of a table to produce a single scalar value defined as it’s result Some aggregate functions are as follows: COUNT SUM AVG MAX MIN

- number of rows - sum of values - average of values - largest/maximum value - smallest/minimum value

Examples Get total number of suppliers Select Count(*) From S Get total suppliers supplying parts Select Count(Distinct S#) From SP

Examples Get number of shipments for part P2 Select Count(*) From SP Where P# = ‘P2’ Get total quantity of part P2 supplied Select Sum(QTY) From SP Where P# = ‘P2’ Get average quantity of part P3 supplied Select Sum(QTY)/Count(*) From SP Where P# = ‘P3’ Or Select Avg(QTY) From SP Where P# = ‘P3’

Aggregate functions in subquery Get supplier numbers of suppliers with status less than maximum status Select S# From S Where STATUS < (Maximum status) Select S# From S Where STATUS < (

Select Max(STATUS) From S)

Get supplier number and city for all suppliers whose status is greater than or equal to the average status of their city Select S#,STATUS,CITY From S Where STATUS >= (Average of their city) Select S#,STATUS,CITY From S SX Where STATUS >= ( Select Avg(STATUS) From S SY Where SY.CITY = SX.CITY )

Group By, Having Group By statement •rearranges rows into groups •on the basis of Group By attributes •such that each group has same value for Group By attributes •expressions in Select should be single valued for the group •such as Aggregate functions or Group By attributes Example A part is supplied by more than one supplier at different rates, this information is maintained in SP table, each row contains the part supplied, by a supplier and at specific rate. Get average rate for each part supplied. Note: For each part there would be a group of rows, the average rate for that part would be the average of all values of RATE attribute in that group. Select P#,Avg(RATE) From SP Group By P#

Use of Where and Group By The use of Where clause restricts some rows from participating in groups as specified by Group By clause Get part number, total and maximum quantity for parts excluding those supplied by supplier S1 Select P#,Sum(QTY),Max(QTY) From SP Where S# <> ‘S1’ Group By P# Get part number and average rate supplied by suppliers excluding those of type B Select P#,Avg(RATE) From SP Where TYPE <> ‘B’ Group BY P#

Having The Having clause restricts output of rows that result from the Group By clause, the restriction is based on a condition that usually includes an aggregate function. Get part numbers for all such parts that are supplied by more than one supplier Select P# From SP Group By P# The results of this query are all parts that are supplied, we need to choose from this list only those that are supplied by more than one supplier, this information is available as an aggregate function i.e.Count(*) Select P# From SP Group By P# Having Count(*) > 1

Union The Union operator performs a union of tuples from two tables, the two tables are said to be union-compatible if the number of columns are the same and column types are compatible. Get part number of parts that either weigh more than 16 pounds or are supplied by supplier S2 or both. Parts that weigh more than 16 pounds UNION Parts supplied by supplier S2 Select P# From P Where WEIGHT > 16 Union Select P# From SP Where S# = ‘S2’ Note: Union eliminates redundant duplicates Union All retains duplicates

Union The Output of the union can be ordered by the ORDER BY clause, ORDER BY clause cannot appear in individual SQL statements that participate in the Union. Select P#,’Weight’ From P Where WEIGHT > 16 Union All Select P#,’Supplied’ From SP Where S# = ‘S2’ Order By 2,1 Intersection Intersection operator performs Intersection of tuples from two tables, the output is common tuples from two tables. Difference Difference operator performs Difference of tuples from two tables, the output is tuples from one table that are not there in the other.

SQL Exercises Schema SAILORS RES BOATS

(sid, sname, rating) (sid, bid, date) (bid, bname, color)

1.

Find names of sailors who have reserved boat #2

2.

Find names of sailors who have reserved a red boat

3.

Find names of sailors with rating > 5

4.

Find names of sailors who have reserved boats on 1/1/95

5.

Find color of boats reserved by Ravi

6.

Find names of sailors who have reserved at least one boat

7.

Find names of sailors who have reserved all boats

8.

Find names of sailors who have reserved red or green boats

9.

Find names of sailors who have reserved red and green boats

10.

Find names of sailors who have reserved boats reserved by Ravi

SQL Exercises Schema CUSTOMER PRODUCT SALES

(cno, cname, city, status, category) (pno, pname, type) (cno, pno, date, qty, rate)

1.

A sales report giving sales quantity for products of type ‘P01’, report should have customer and product names, report should be sorted on product type, customer name

2.

A sales report giving total sales quantity for each product, report should have product name

3.

A sales report giving total sales value for each customer, report should have customer name

4.

A sales report giving citywise total sales value, report should contain only type ‘P04’ products, report should consider customer whose status is ‘SMP’ or ‘STP’

5.

A sales report giving maximum & minimum sales quantity for each product, report should contain product name

Embedded SQL

Introduction Any SQL statement that is used as an online query can also be used in an application program as an embedded statement. This is known as dual-mode principle. Embedded SQL statements are prefixed with EXEC SQL for distinction. Executable SQL statements appear wherever an executable host language statement can appear. eg: Procedure division in Cobol program. SQL statements include references to host variables, such references are prefixed with colon ‘:’ to distinguish them from column names. Host variables must be declared in ‘DECLARE’ section. Declaration of host variables must physically preceed use of variable in SQL statement. eg: BEGIN DECLARE SECTION ... END DECLARE SECTION

Introduction In SQL statements, host variables can appear wherever a literal is permitted. Host variables can also be used to place output from SQL statement, they can thus be used as source and target for data in SQL statements. Any tables used in program can optionally be declared by means of EXEC SQL DECLARE TABLE statement, this is to make the program self-documentary. After any SQL statement is executed, a feedback information reflecting the outcome of execution is returned to the program in two special host variables. They are • SQLCODE - a 31 bit signed integer • SQLSTATE - character string of length 5 In principle, every SQL statement should be followed by a test on either SQLCODE or SQLSTATE. A zero value in SQLCODE indicates successful completion, a positive value means the statement executed but some exceptional condition occurred, a negative value means the statement did not execute successfully. SQLSTATE is subdivided into a two character class code and three character subclass code.

Introduction A program can contain EXEC SQL INCLUDE SQLCA This causes the precompiler to insert declaration of SQL communication area, which contains declaration of SQLCODE and SQLSTATE along with other feedback variables. Host variables must have datatype compatibility with SQL datatype of columns that they are to be compared or assigned to or from. SELECT statements usually retrieve multiple rows, and host languages are not equipped to handle more than one row at a time. It is therefore necessary to provide some kind of bridge between set-at-a-time operations of SQL statements and row-at-a-time operations that host language can handle. Cursors provide such a bridge. A cursor is a new kind of object relevant to embedded SQL, interactive SQL has no need for it. It consists of a kind of a pointer that is used to run thru a set of rows, thus providing addressability to rows retrieved by SQL statement, one row at a time.

SQL examples - Not involving Cursors Singleton Select EXEC SQL

Select STATUS, CITY Into :RANK, :CITY From S Where S# = :GIVENS# ;

EXEC SQL

Select STATUS, CITY Into :RANK Indicator :RANKIND, :CITY From S Where S# = :GIVENS# ;

If RANKIND = -1 Then ... End-If

SQL examples - Not involving Cursors INSERT EXEC SQL

Insert Into P (P#, PNAME, WEIGHT) Values (:PNO, :PNAME, :PWT) ;

COLORIND = -1 CITYIND = -1 EXEC SQL

Insert Into P (P#, PNAME, COLOR, CITY) Values (:PNO, :PNAME, :PCOLOR Indicator :COLORIND, :PCITY Indicator :CITYIND) ;

SQL examples - Not involving Cursors UPDATE EXEC SQL

Update S Set STATUS = STATUS + :RAISE Where CITY = ‘LONDON’ ;

RANKIND = -1 EXEC SQL

Update S Set STATUS = :RANK Indicator :RANKIND Where CITY = ‘LONDON’ ;

EXEC SQL

Update S Set STATUS = NULL Where CITY = ‘LONDON’ ;

DELETE EXEC SQL

Delete From SP Where :CITY = ( Select CITY From S Where S.S# = SP.S#) ;

SQL examples - Involving Cursors The DECLARE X CURSOR ... statement defines a cursor called X, and associates itself with a query specified by SELECT statement, which is also a part of Cursor declaration. The query is not executed at this point. The SELECT statement is effectively executed when the cursor is opened using current values of host variables in the procedural part of the program. The FETCH ... INTO statement is used to retrieve rows from the result table, one row at a time. The INTO clause specifies a list of host variables that match the SELECT clause declaration of the cursor. EXEC SQL

EXEC SQL EXEC SQL

Declare X Cursor For Select S#, SNAME From S Where CITY = :Y ; Open X For all rows accessible via X EXEC SQL Fetch X Into :S#, :SNAME ; Close X ;

Since there are multiple rows in the result table, Fetch is placed in a loop. A Fetch would result in SQLCODE as +100 if no more rows exist in the result table, this condition is used to terminate the loop. The cursor is finally closed by CLOSE statement.

Cursor Declaration EXEC SQL

Declare <cursor name> Cursor For Order By For [Fetch Only / Update of Columns] Optimize For Rows

Notes: 1. Union expression is a SELECT expression or a union of SELECT expressions

2. DECLARE cursor is a declarative and not executable statement 3. ORDER BY cannot be specified if UPDATE or DELETE CURRENT needs to be invoked 4. ORDER BY orders rows to be retrieved by FETCH statements 5. Specifying FOR FETCH ONLY performs better, and is the default 6. OPTIMIZE may be specified purely for performance reasons, it causes the optimizer to choose a more efficient access.

Executable Statements EXEC SQL OPEN <cursor name> •Opens or activites a specified cursor •A set of rows are identifed and become active for the cursor •Cursor also identifes a position within that set of rows •Active set of rows is considered to have an order EXEC SQL FETCH <cursor name> INTO : ... •identified cursor must be open •advances cursor to next position •assigns values from that row to host variables •if there is no row then SQLCODE is +100 •fetch next is the only cursor movement operation EXEC SQL CLOSE <cursor name> •deactivates specified cursor, which is currently open •closed cursor can be opened again •when opened again, the active set of rows may be different •values in host variables can be different •changes to host variables while cursor is open is redundant

Executable Statements Update & Delete EXEC SQL

Update Set = <expr>, = <expr> ... Where Current of Cursor

EXEC SQL

Delete From

Where Current of Cursor

Update and Delete using cursor are not permissible if cursor declaration involves Union or ORDER BY clause or if union expression involves non-updatable view Update should have FOR UPDATE clause identifying columns that appear as targets of SET statement.

Relational Integrity

Relational Integrity Rules Need for Integrity rules •Any database consists of some configuration of data values •That configuration is supposed to reflect real world situation •Some configuration of values do not make sense •These do not represent any possible state of real world Rules as Database definition •Database definition needs to be extended to include some rules •rules inform DBMS of certain constraints in the real world •rules can also prevent such impossible configuration of values •Such rules are known as Integrity rules. Since base tables are supposed to reflect reality, all Integrity rules apply to base tables. Integrity rules are specific and general. General Integrity rules are those that concern primary key and foreign key. Specific Integrity rules are those concerning domain & user defined integrity.

Notion of Primary key Primary key •is a unique identifier for a relation •can be composite •is a designated candidate key •no component can be eliminated without destroying uniqueness Notes •Every relation has a primary key, moreso the base relations •Reason for choosing primary key is outside the scope of model •Important for primary key to be really significant •There need not be an index on primary key •Primary key is pre-requisite to foreign key support •Provides tuple level addressing mechanism in relational system

Entity Integrity Entity Integrity Rule No component of primary key of base relation is allowed to be NULLS. Justification Base relations correspond to real world and entities in real world must be distinguishable. Hence their representatives in database must also be distinguishable. Null value for primary key implies that entity in database has no full or partial identity. Primary key is supposed to perform a unique identification function. Null value imples ‘value is unknown’. Primary key with Null implies that a tuple represents an identity we do not know. This means that we do not know how many entities exist in the database or real world. An entity without identity does not exist. Rule In a relational model, we never record information about something we cannot identify.

Foreign keys In a table, a given value of an attribute should be permitted to appear in the database if the same value also appears as a primary key value of some other table in the database. Foreign key value represents a reference to a tuple containing a matching primary key value. The relation containing foreign key is called referencing relation. The relation containing the matching primary key is called the referenced or target relation. The problem ensuring that the database does not include any invalid foreign key values is known as Referential Integrity problem. The foreign key is either wholly NULL or wholly NON NULL. There should exist a base relation with primary key such that each NON NULL foreign key value has a corresponding primary key value.

Notes 1. Foreign key and primary key should be defined on the same underlying domain. 2. Foreign keys need not be component of primary keys. Any attribute can be a foreign key 3. A given relation can be referencing as well as referenced relation. 4. A relation might include a foreign key where it’s values are required to match the primary key value of the same relation. Such relations are known as self-referencing relations. 5. Foreign keys, unlike primary keys, can have NULLS 6. Foreign key to primary key relationship is said to be the glue that holds the database together.

Referential Integrity Rule The database must not contain any unmatched foreign key values An unmatched foreign key value is a NON NULL foreign key value for which there does not exist a matching value of primary key in the relevant target relation. Note 1. Referential integrity requires foreign keys to match primary keys 2. Foreign key and referential integrity are defined in terms of each other 3. Support for referential integrity and support for foreign key mean the same. Foreign key rules Referential integrity rule is framed purely in terms of database states. Any state of database that does not satisfy the rule, is by definition incorrect. These incorrect states can be avoided as follows - system could reject any operation that results in illegal state - system accepts the operation and performs additional compensating operation to guarantee that overall results are a legal state

Some key issues Can a foreign key accept NULLS ? The answer to this question does not depend on the database designer but the policies that are in effect in the real world. What happens to an attempt to delete a target record of a foreign key reference ? Restricted - target record is not deleted if there are any referencing records Cascade - All referencing records are deleted Nullifies - Foreign keys of all referencing records are set to NULLS. What happens on an attempt to update primary key of a target of foreign key reference ? Same as with Delete. Restrict, Cascade, Nullify

Primary key definition in DB2 Create Table SP ( S# Char(5) Not Null, P# Char(6) Not Null, QTY Integer, Primary Key (S#, P#) ); Foreign key definition in DB2 Create Table SP ( S# Char(5) Not Null, P# Char(6) Not Null, QTY Integer, Primary Key (S#, P#), Foreign Key SKF (S#) References S On Delete Cascade, Foreign Key PKF (P#) References P On Delete Restrict );

Views

Introduction View is a named virtual table that is derived from a base table. It does not exist in it’s own right, but appears to the user as if it did. Views do not have their own physical seperate, distinguishable stored data. Instead their definition in terms of other tables is stored in the catalog. An example: Create View GOOD_SUPPLIERS As Select S#, STATUS, CITY From S Where STATUS > 15 ; GOOD_SUPPLIERS is in effect a window into the real table S. Further this window is dynamic in nature, changes to S would automatically and instantaneously be visible thru that window. Likewise changes to GOOD_SUPPLIERS would automatically and instantaneously be applied to real table S. Users can operate on GOOD_SUPPLIERS as if it were a real table. The system handles the operation by converting it into an equivalent operation on the underlying base table.

View Creation Examples Create View REDPARTS (P#, PNAME, WT, CITY) As Select P#, PNAME, WEIGHT, CITY From P Where COLOR = ‘RED’ ; Create View PQ (P#, TOTQTY) As Select P#, Sum(QTY) From SP Group By P# ; Create View SUPPLIER_PARTS (S#, P#, SNAME, PNAME) As Select SP.S#, SP.P#, S.SNAME, P.PNAME From S, SP, P ; Create View LONDON_REDPARTS As Select P#, WT From REDPARTS Where CITY = ‘LONDON’ ; Create View GOOD_SUPPLIERS As Select S#, STATUS, CITY From S Where STATUS > 15 With Check Option ;

Types of Views Column subset Views Create View SCITY As Select S#, CITY From S Create View STATUS_CITY As Select STATUS, CITY From S The above views are created such that they are vertical or column subsets of base tables, hence they are called column-subset views. View SCITY includes the primary key of the base table, whereas view STATUS_CITY does not. For a given record, in the view STATUS_CITY, it would be impossible to identify the corresponding record in the base table. This is because the view does not include the primary key of the base table. Views that include the primary key of the base tables are known as Key preserving views. Column subset views are theoretically updatable if they preserve the primary key of the base table.

Types of Views Row subset Views Create View LONDON_SUPPLIERS As Select S#, SNAME, STATUS, CITY From S Where CITY = ‘LONDON’ ; The above view is created such that it is horizontal or row subset of the base tables, hence it is called row-subset view. The view LONDON_SUPPLIERS includes the primary key of the base table, hence it is a Key preserving view. Row subset views are theoretically updatable if they preserve the primary key of the base table.

Types of Views Join Views Create View SUPPLIER_PART As Select SP.S#, SP.P#, S.SNAME From S,SP Where S.S# = SP.S# ; View SUPPLIER_PART is constructed from join of two tables, these are known as colocated views. Colocated views suffer from all kinds of problems from standpoint of updatability. Statistical Summary Create View PQ (P#, TOTQTY) As Select P#, Sum(QTY) From SP Group By P# ; The view PQ is constructed such that each row of the view is a result of some aggregate function on a set of rows in the base table. Such a view cannot be updated as it would be impossible to know how to distribute the updates in the rows of the base table.

View Updatability Updatable views are those on which Insert, Delete and Update operations can occur. Not all views are Updatable. There are some views that are theoretically updatable but are not updatable in SQL systems. In general Join views cannot be updated, however there are some views that are not joins which cannot be updated. Check Option Create View GOOD_SUPPLIERS As Select S#, STATUS, CITY From S Where STATUS > 15 ; The view is row-column subset, key preserving and updatable. Would it be possible to insert a supplier with STATUS = 10, the CHECK option is designed to deal with such situations. During Insert and Update operations, the view is checked to ensure that all Inserted and Updated rows satisfy the view definition condition. If Check option is not specified, all data would be accepted, but some newly Inserted or Updated rows may disappear from the view.

Logical Data Independence Since application programs are not dependent on the physical structure of stored database, DB2 provides physical data independence. If application programs are also independent of the logical structure of database, the system is said to provide logical data independence. Logical structure of database can change due to two aspects, Growth & Restructuring. Growth •Database grows to incorporate new kinds of information •A table is expanded to include new fields •The database is expanded to include a new table •Growth does not affect application programs in DB2 Restructuring •Database is restructured so that overall information remains same •Placement of information within database changes •Restructuring is undesirable, but unavoidable

Logical data Independence An Example A base table S(S#,SNAME,STATUS,CITY) is split into two tables SX(S#,SNAME,CITY) and SY(S#,STATUS) Application programs that referred to base table S would need to be changed, since they would not have to refer to SX & SY for any database operations. The old table S can be reconstructed as a join of SX & SY. Hence the view can substitute reference to old base table S after it is split. Application programs that referred to base table S would not refer to the view S, hence they need not undergo change. Create View S(S#,SNAME, STATUS,CITY) As Select SX.S#,SX.SNAME,SY.STATUS,SX.CITY From SX,SY Where SX.S# = SY.S# ; Having create the view S, Select operations would continue to work as before, however Update operations would not work. Although such a view is theoretically updatable, DB2 does not allow updates on a view that is defined as a join. Thus application programs performing update operations are not immune to this type of change.

Advantages of Views •Provide certain amount of logical data independence •allow some data to be seen by different users in different ways •simplifies user’ perception •allows focus on data that is of concern and ignore the rest •provides automatic security

SQL Access Guidelines

Never Use SELECT * •Never ask DB2 anything more than required •Query should access only those columns that are needed •Changes to table structure may imply changes to program Singleton SELECT verses Cursor •Singleton Select outperforms Cursor •When a row needs to be retrieved, cursor is preferred •FOR UPDATE clause of cursor ensures integrity •DB2 places an X lock prohibiting concurrent updates Use FOR FETCH ONLY •Enales DB2 to use block fetch •Increases efficiency Avoid using DISTINCT •Distinct eliminates duplicates •Invokes Sort to eliminate duplicates •Code only when duplicate elimination is mandatory Limit the SELECTed data •Select should return minimal but required rows •Do not code generic queries without WHERE clause •More efficient to use WHERE clause to restrict retrieval

Code Predicates on Indexed columns •Requests satisfied more efficiently using an existing index •Not efficient when most rows in a table are to be accessed Multi Column Indexes •Used when high-level column is specified in WHERE clause Several Indexes instead of multicolumn Index •Multiple indexes more efficient than single multicolumn index •Provide better overall performance for all queries •At the expense of individual queries Use ORDER BY when sequence is important •DB2 doesn’t guarantee the order of rows returned •Path of data retrieved may change from each execution •ORDER BY is mandatory when sequence is important Limit columns in ORDER BY clause •For ORDER BY clause, DB2 invokes a Sort •The more columns in ORDER BY, the less efficient •Specify only those columns that are essential Use Equivalent data types •Essential when comparing column values to host variables •Eliminates need for data conversion •Index is not used if data types incompatible

Use Between instead of <= and >= •Between more efficient than combination of <= and >= •Optimizer selects a more efficient path Use IN instead of LIKE •For known list of data occurrences use IN •IN with specific list is more efficient than LIKE Formulate LIKE predicates with care •Avoid % or _ at the begining of comparison string •Avoid using LIKE with host variable Avoid using NOT (except with EXISTS) •Recode queries to avoid use of NOT •By taking advantage of knowledge of data being accessed Code most restrictive predicate first •Place predicate that eliminates greatest number of rows first Use Predicates wisely •Reduce number of predicates •Know your data to reduce predicates Specify number of rows to be returned •Code cursor statement with OPTIMIZE FOR n ROWS •DB2 selects optimal path •Does not prevent program from fetching more rows

Complex SQL Guidelines

UNION versus UNION ALL •Union invokes a sort, UNION ALL does not •Use UNION ALL when retrieved data does not have duplicates Use NOT EXISTS instead of NOT IN •NOT EXISTS verifies non existance •NOT IN must have complete set of rows materialized •Use NOT EXISTS for subquery using negation logic Use constant for existance checking •If subquery tests existance, specify constant in select-list •Select-list of subquery is unimportant, will not return data Predicate transitive closure rules •DB2 optimizer uses the rule of transitivity, which states •If a=b and b=c then a=c •Efficient to code a redundant predicate to exploit transitivity where a.col1 = b.col1 where a.col1 = b.col1 and a.col1 = :hostvar and a.col1 = :hostvar and b.col1 = :hostvar Minimize number of tables in a join •Do not join more than 5 tables •Eliminate unnecessary tables from join statements Denormalize to reduce joining •To minimize the need for joins, consider denormalizing •Would imply redundancy, dual updating & usage of space

Reduce the number of rows to be joined •number of rows participating in join determines response time •to reduce join response time, reduce number of rows •code predicates to minimize number of rows Join using SQL instead of program logic •efficient to join using SQL instead of application code •optimizer has a vast array of tools of optimize performance •application code would not consider equivalent optimizations Use Joins instead of Subqueries •Join is more efficient than a correlated subquery or using IN Join on clustered column •Use clustered columns in join criteria when possible •reduces the need for immediate sorts •might require clustering parent table by primary key •and child table by foreign key Join in Indexed columns •efficient when tables are joined on indexed columns •consider creating indexes for join predicates Avoid Cartesian products •Never use a join without a predicate •A join without a predicate results in a cartesian product •A lot of resources are spent on such a join

Provide adequate search criteria •provide additional search criteria in WHERE clause •these to be in addition to the join criteria •provides DB2 an opportunity for ranking tables to be joined •allows queries to perform adequately Limit columns to be used in GROUP BY •specify only minimum columns to be grouped •DB2 needs to sort retrieved data •the more columns, the more expensive is the sort

Increase the possibility of Stage 1 processing A predicate that is satisfied by Stage 1 processing is evaluated by the Data Manager portion of DB2 rather than the Relational Data System. The Data Manager component is at a closer level to data than the Relational System. Stage 1 predicate is evaluated at earlier stage of data retrieval, thereby avoiding the overhead of passing data from one component to another. The following list shows predicates satisfied by Stage 1 Colname Colname Colname Colname Colname Colname a.Colname

operator value IS NULL BETWEEN val1 AND val2 IN (list) LIKE pattern LIKE hostvariable operator b.Colname

The last item in list refers to two columns in different tables. Predicates formulated with AND, OR, NOT are not at Stage 1

Increase the possibility of Index processing A query that can use an index has more access path options, so it has the capability of being a more efficient query than the query that cannot use an index. DB2 optimizer can use an index or indexes in a variety of ways to speed the retrieval of data from DB2 tables. The following list shows predicates satisfied by using Index Colname Colname Colname Colname Colname Colname a.Colname

operator value IS NULL BETWEEN val1 AND val2 IN (list) LIKE pattern LIKE hostvariable operator b.Colname

The last item in list refers to two columns in different tables. Predicates formulated with AND, OR, NOT are not Indexable

Db2

Overview

More details

Related Documents

Db2

Db2

Db2

Db2

Db2

Db2