A Structured Approach To Sql Query Design

  • Uploaded by: Brendan Furey
  • 0
  • 0
  • May 2020
  • PDF

This document was uploaded by user and they confirmed that they have the permission to share it. If you are author or own the copyright of this book, please report to us by using this DMCA report form. Report DMCA


Overview

Download & View A Structured Approach To Sql Query Design as PDF for free.

More details

  • Words: 3,134
  • Pages: 21
AIM TECHNICAL DESIGN A Structured Approach to SQL Query Design Team:

Technology

Creation Date:

22 May 2009

Created By:

Brendan Furey ([email protected])

Last Updated:

19 August 2009

Control:

21095268.doc

Version:

1.1

Approvals:

Document Control

Change Record Date

Author

Version

Change Reference

22-May-2009

BP Furey

1.0

19-Aug-2009

BP Furey

1.1

Initial Entity Overview: Cardinality reversal (Account); Entity/Subtype Definitions: Added sizes; Join Sequences: Note on outer joins; Other: Minor changes

File Ref: 21095268.doc

Document Control

ii

Contents Document Control............................................................................................... .........ii Change Record........................................................................... ..........................ii Introduction.......................................................................................................... ........4 Technical Overview................................................................................ .....................5 Design Process................................................................................ .....................5 Implementation Notes........................................................................... ................6 Diagramming Tool....................................................................................... ....6 Subtypes............................................................................................... ..........6 Attributes......................................................................................................... 6 Notation................................................................................... .......................6 Additional Advantages of Approach.................................................. ....................7 Performance Tuning................................................................. ......................7 Documentation................................................................................................ 7 Package Design........................................................................................ ......7 Worked Example: COGS No Charge Report......................................... .....................8 Requirement Summary........................................................................... ..............8 Entity Relationship Diagrams.................................................................... ............8 Entity Overview....................................................................... .......................8 Entity/Subtype Listings........................................................................ ................11 Entity/Subtype Structure............................................................. ..................11 Entity/Subtype Definitions........................................................................... ..12 Query Diagrams.................................................................................... ..............13 Query Structure..................................................................... .......................13 Main Query.................................................................................. .................14 Transaction View.......................................................................................... .15 Join Sequences............................................................................................ .......16 Notes................................................................................. ...........................16 Query Code............................................................................................... ..........16 Text................................................................................................ ...............16 Notes................................................................................. ...........................18 Issues............................................................................................. ...........................20 Issues................................................................................................... ...............20 References......................................................................................... .......................21

File Ref: 21095268.doc

Document Control

iii

Introduction SQL is a declarative language for manipulating data stored in a relational database. Oracle’s PL/SQL is a procedural extension intended to implement logic that cannot be performed directly in SQL. It’s generally accepted that developing software procedurally involves greater effort and results in more complex systems than using declarative languages; in the case of SQL, performance is usually also much better when implementing a requirement entirely in SQL. This leads to a Best Practice guideline, sometimes succinctly expressed (eg REF-1), as: ‘Do it in a single SQL statement if at all possible’ Unfortunately, this guideline appears to be followed surprisingly rarely, particularly in ERP environments. Often, in both batch programs and reports, a set of data that could be selected in a single query will instead be selected by a large and complex program with multiple small SQL queries scattered throughout. There are a couple of possible reasons that may explain why this is so: •

With each major release, Oracle increases the power of SQL and its ability to do internally what previously had to be programmed, but the developer community can be slow to keep pace with advances



ERP systems in particular tend to have very complex, highly granular data models, owing to the need for generality. This makes for rather complex SQL, which can be daunting to develop without good design techniques. In practice SQL is hardly ever designed and the temptation is to design a procedural program with simpler embedded SQL statements

The purpose of this document is to describe a structured, graphical approach to the design of SQL queries that may be a useful way of handling the complexity without reverting to procedural design. It focuses on subquery structure and join orders, rather than on other areas such as grouping and aggregation, or design patterns. The author has used it to design complex queries with up to 48 table instances, and the approach is demonstrated using a real (rather simpler) example of a custom report within Oracle’s Order Management and Inventory modules (see REF-2 for Oracle’s table specifications).

File Ref: 21095268.doc

Worked Example: COGS No Charge Report

4

Technical Overview Design Process The approach is based on entity-relationship diagramming, but applied in a different way from its usage in database design. The main steps in the design process are: •





Produce one or more entity-relationship diagrams that include all the physical entities required for the query o

Where necessary, follow a top-down approach using higher level entities to group related entities together, and break them down in secondary diagrams

o

Use subtypes to show the logical structure as well as the physical, based on the query requirements (eg display Ship To and Bill To customers as distinct subtypes for an Invoice Print query)

Tabulate the entity and subtype structures o

Include definition of subtypes

o

Map bottom-level entities to physical tables

Produce a query structure diagram, showing proposed subqueries, including inline views and each section of any unions o







Produce one or more entity-relationship diagrams for each subquery (including the main query) o

Where necessary, follow a top-down approach using higher level entities to group related entities together, and break them down in secondary diagrams

o

Mark which entities are constraining, or possibly constraining

Define a route through each diagram that a query plan could reasonably take, marking with numbered arrows the sequence of entities visited o

Begin with a possible driving table, then pass to entities that are linked to entities already visited, favouring the most constraining entities

o

The sequence represents the order in which the tables will be joined in the code, but need not be that followed by the SQL engine

o

The join sequence will be a good starting point in analysing any performance problems that may occur

Tabulate the complete join sequence o

File Ref: 21095268.doc

Use notes to explain the reasoning behind the structure

Group by subquery, and any entity groupings for convenience

Worked Example: COGS No Charge Report

5

Implementation Notes Diagramming Tool It is important for clarity that that entities and links can be sized and positioned flexibly. This effectively means that diagrams need to be be manually constructed rather than generated, and Oracle Designer and similar tools do not appear suitable. We have used Microsoft Visio. Subtypes Subtypes are often used at the logical phase of database design to represent the partitioning of an entity into a number of subentities, followed by a physical implementation in one of a number of ways: for example, a Party in Oracle's customer model may be one of several types, including Organisation and Person, and this is physically implemented by a party_type column on the table. The concept is used here more generally and more dynamically, to represent a division of an entity into groups of records, according to any data conditions specified in the query. Subtypes are depicted as two or more entities within another entity (but take care to avoid confusing with distinct entities within an entity group, and may be nested. Subtypes within a query diagram normally correspond to distinct table instances. Attributes Attributes do not appear on the diagrams, as they are not necessary for our purposes and cause clutter and distortion of entities, reducing clarity. Notation The following two points refer to both ERDs and query diagrams. •



Entities o

Rounded boxes

o

Broken lines indicate a complex entity containing subentities

o

Entities appearing within another solid-lined entity are subtypes

Relationships o

Straight lines between entities

o

Perpendicular bar denotes referencing end

o

Circle denotes optionality

o

Triple ending denotes many end of many to one relationship

The remaining points apply to query diagrams only •

Constraining entities o



Join sequence o

File Ref: 21095268.doc

Asterisk against the name

Numbered arrows

Worked Example: COGS No Charge Report

6

Additional Advantages of Approach Performance Tuning The design process followed here results in a logical join sequence. When the Cost Based Optimiser fails to find a good execution plan, the starting point in analysis is usually to compare its join order with what the developer would expect, and if necessary hints (such as LEADING) can be added to obtain a better plan. This tends to more of an issue with large queries. Documentation The design process described results in a document that makes large queries much easier to understand for support staff. Package Design The type of ERDs shown here can be used in designing package structures for maintenance of logical entities. For example, where Oracle’s customer model is used, the logical entities are usually at a higher level than Oracle’s physical model and maintenance procedures would correspond to the logical level represented on Entity-Relationship diagrams.

File Ref: 21095268.doc

Worked Example: COGS No Charge Report

7

Worked Example: COGS No Charge Report Requirement Summary This is a custom report based on Oracle Applications (11.5.10) Inventory and Order Management modules (the table definitions can be obtained from REF-2). It lists order lines that are sold at zero price, and includes the Inventory COGS (Cost of Goods Sold) account distributions in two categories (material and overhead). Briefly, the requirements are: •

List order lines with zero unit price, showing COGS Material and Overhead Inventory costs (where they exist), along with Warehouse, Ship To and Item data



Order lines may be of type Configuration or Non-Configuration, and the latter are nonshippable, so do not have Inventory records, in which case print the Order Line records with zero for the COGS costs



Report driven by the dates of a GL period, applied to the Inventory records for Configuration Lines, and the Order Lines for Non-Configuration Lines

Entity Relationship Diagrams Entity Overview The diagram below gives an overview, showing how the main entity groups relate to each other, with the complex entities broken down subsequently. Broken lines denote complex entities (but don’t always come out in Word!).

File Ref: 21095268.doc

Worked Example: COGS No Charge Report

8

GL Period

COGS Overhead

Order Header

Logical

MTL Transaction Account

COGS Material

Physical

Component

MTL Transaction NonConfiguration

Line Type

Other Warehouse

Model

GL Account Configuration

Account

Order Line Line

Ship To Site Use

Inventory Item

Address

Item Category

Customer

Product Line Category

Ship To

File Ref: 21095268.doc

Item

Worked Example: COGS No Charge Report

9

Entity Breakdown

Ship To Site Use

Inventory Item

Customer Site

Customer Account

Item Category

Party Site

Party

Category

Customer

Product Line

Location

Address Ship To

Other Category Set Product Line Category Item

File Ref: 21095268.doc

Worked Example: COGS No Charge Report

10

Entity/Subtype Listings Entity/Subtype Structure The table below shows the entity structure and the subtype structure where applicable. Italics denote complex entities referenced within others. Entity 1 GL Period MTL Transaction Account

Entity 2

Entity 3

Overhead Material Other

Order Line Line Type

Configuration Ship To

Model Component

Non-Configuration

Ship To Site Use

Address

Subtype 2

Logical Physical

MTL Transaction Account GL Account

Warehouse Order Header Line

Subtype 1

Customer Site Party Site Location

Customer

Customer Account Party

Inventory Item Item Category Item

File Ref: 21095268.doc

Product Line Category

Category Category Set

Product Line Other

Worked Example: COGS No Charge Report

11

Entity/Subtype Definitions The table below displays all the bottom level entities, with the tables that implement them, and the subtype conditions where applicable. Entity

Table

GL Period MTL Transaction Account

gl_periods

Size in 1000’s 0.4

mtl_transaction_accounts

54,943

Subtype

Overhead GL Account

gl_code_combinations

117

Material Other Physical

MTL Transaction

mtl_material_transactions

26,782

Warehouse Order Header

mtl_parameters oe_order_headers_all

0.1 822

Logical

oe_order_lines_all

3,230

Line Type Ship To Site Use Customer Site Party Site Location Customer Account Party Inventory Item Item Category Category

oe_transaction_types_tl hz_cust_site_uses_all hz_cust_acct_sites_all hz_party_sites hz_locations hz_cust_accounts hz_parties mtl_system_items_b mtl_item_categories mtl_categories

0.1 489 327 216 145 54 409 1,565 34,294 64

Category Set

mtl_category_sets

0.1

File Ref: 21095268.doc

Condition on column values here and in linked MTL Transaction Account record Condition on column values here and in linked MTL Transaction Account record Other Has child transaction Has parent transaction (these arise from intercompany orders using virtual warehouses)

NonConfiguration Model/Comp onent

Line of type Model with linked Component Lines Line not of type Model or Component Component Lines are linked to a Model Line

Product Line Other

Set name = ‘PLINE’ Other name

Configuration Order Line

Condition

Worked Example: COGS No Charge Report

12

Query Diagrams Query Structure

Configuration

NonConfiguration

Transaction View Main Query

Notes •

The query is driven by two different sets of source records, requiring the inner union



The union goes into an inline view in order to avoid duplicating all the the other tables for each section

File Ref: 21095268.doc

Worked Example: COGS No Charge Report

13

Main Query

2

1

NonConfiguration*

Configuration*

Transaction*

Inventory Item

Order Header

3

7

Ship To Site Use

Warehouse

8

Item Category

Customer Account

Customer Site

4 6

9

12 13

Category

5

Product Line

Party Site

Party

10 Other

Category Set Product Line Category Item

File Ref: 21095268.doc

Customer

11 Location

Address Ship To

Worked Example: COGS No Charge Report

14

Transaction View

GL Period*

GL Period*

1

1

Account Linked

4 MTL Transaction Account

Parent

Configuration Line

5

MTL Transaction

2

Component

6

NonConfiguration Line

2

3 Model*

GL Account Account

7

Line Type*

Configuration*

Line Type*

Non-Configuration* Transaction*

Notes •

Observe that we have shown the GL Account entity without the subtypes that appeared in the ERD. This represents a design decision not to link to separate instances of the table for the subtypes, but instead link to a single instance and use the row-column pivotting technique to obtain the two COGS amounts on a single line. See the notes section after the query code for an explanation



We have used a different subtyping for our MTL Transaction from that in the ERD. The record linked to may be a logical or a physical transaction, and we link from it to its parent, if it exists (logical case)

File Ref: 21095268.doc

Worked Example: COGS No Charge Report

15

Join Sequences The table below shows the (possible) join sequences by subquery Entity 1

Entity 2

Entity 3 Configuration

Table

GL Period Account

MTL Transaction Order Line (Configuration)

gl_periods mtl_transaction_accounts gl_code_combinations mtl_material_transactions mtl_material_transactions (+) oe_order_lines_all

MTL Transaction Account GL Account Any Physical (if previous Logical) Component Model

oe_order_lines_all

Line Type

oe_transaction_types_tl Non-Configuration

GL Period Order Line (Non-Configuration) Line Type

gl_periods oe_order_lines_all oe_transaction_types_tl Main Query

Transaction Inventory Item Item Category

Item

Product Line Category Order Header Ship To

Category Category Set (Product Line)

Customer

mtl_category_sets

Customer Site Party Site

oe_order_headers_all hz_cust_site_uses_all hz_cust_acct_sites_all hz_party_sites

Location

hz_locations

Customer Account Party

hz_cust_accounts hz_parties mtl_parameters

Ship To Site Use Address

(Inline View) mtl_system_items_b mtl_item_categories mtl_categories

Warehouse Notes Outer Joins

Outer joins are a frequent source of errors in SQL, either by the join being incorrectly specified as outer (or inner), or by the outer join syntax being incorrectly implemented (usually by including the (+) on only some of the relevant clauses). ANSI join syntax (see notes on next section) makes it harder to get the implementation wrong. Specification errors will be less likely if the join type is part of the design. Outer joins are indicated by the same symbol as in Oracle native SQL - (+) - in the table above.

Query Code Text SELECT

/*+ LEADING (ilv) USE_NL (ooh) */ war.organization_code

File Ref: 21095268.doc

orgcode,

Worked Example: COGS No Charge Report

16

FROM ( SELECT lty) */

loc.country, ooh.order_number ilv.line_number ilv.line_type, ilv.dept, ilv.reason, ooh.cust_po_number Substr(par.party_name,1,25) ilv.shipped_quantity Nvl (ilv.material*ilv.shipped_quantity, Nvl (ilv.OH*ilv.shipped_quantity, 0) msi.segment1 cat.segment1 ilv.period_name

order_num, line_num,

ponum, cust_name, qty, 0) extd_mat, extd_ovh, partnum, prodline, period

/*+ LEADING (per mta gcc trx trx_p ool_cpt ool_mdl lty) USE_NL (gcc trx trx_p ool_cpt ool_mdl

ool_mdl.header_id, ool_mdl.line_number, ool_mdl.inventory_item_id, ool_mdl.ship_to_org_id, ool_cpt.shipped_quantity, Max (CASE WHEN mta.cost_element_id = 1 OR gcc.segment4 IN ('1360', '1361') THEN mta.rate_or_amount END) material, Max (CASE WHEN mta.cost_element_id = 2 OR gcc.segment4 IN ('1330', '1331') THEN mta.rate_or_amount END) OH, lty.name line_type, ool_mdl.attribute10 dept, ool_mdl.attribute9 reason, mta.organization_id, per.period_name FROM gl_periods per, mtl_transaction_accounts mta, gl_code_combinations gcc, mtl_material_transactions trx, mtl_material_transactions trx_p, oe_order_lines_all ool_cpt, oe_order_lines_all ool_mdl, oe_transaction_types_tl lty WHERE mta.transaction_date BETWEEN per.start_date AND per.end_date AND per.period_name = '&&1' AND per.period_type = '1' AND gcc.code_combination_id = mta.reference_account AND trx.transaction_id = mta.transaction_id AND trx_p.transaction_id (+) = trx.parent_transaction_id AND ool_cpt.line_id = Nvl (trx.trx_source_line_id, trx_p.trx_source_line_id) AND ool_mdl.line_id = ool_cpt.link_to_line_id AND lty.transaction_type_id = ool_mdl.line_type_id AND mta.transaction_source_type_id = 2 AND mta.accounting_line_type = 1 AND (mta.cost_element_id IN (1, 2) OR gcc.segment4 IN ('1360', '1361', '1330', '1331')) AND ool_mdl.unit_selling_price = 0 AND Nvl (ool_mdl.attribute9, '&&2') BETWEEN '&&2' AND '&&3' AND lty.name BETWEEN Nvl ('&&4', 'BO') AND Nvl ('&&5', 'US') AND lty.name <> 'TO' GROUP BY ool_mdl.header_id, ool_mdl.line_number, ool_mdl.inventory_item_id, ool_mdl.ship_to_org_id, ool_cpt.shipped_quantity, lty.name, ool_mdl.attribute10, ool_mdl.attribute9, mta.organization_id, per.period_name UNION SELECT /*+ LEADING (per ool lty) USE_NL (lty) */ ool.header_id, ool.line_number, ool.inventory_item_id, ool.ship_to_org_id, Nvl (ool.ordered_quantity, 0), 0, 0, lty.name, ool.attribute10,

File Ref: 21095268.doc

Worked Example: COGS No Charge Report

17

FROM WHERE AND AND AND AND AND AND AND AND

WHERE AND AND AND AND AND AND AND AND AND AND AND AND AND AND ORDER

ool.attribute9, ool.ship_from_org_id, per.period_name gl_periods oe_order_lines_all oe_transaction_types_tl ool.request_date per.period_name per.period_type lty.transaction_type_id ool.unit_selling_price lty.name lty.name ool.link_to_line_id ( (ool.top_model_line_id ool.top_model_line_id OR (ool.top_model_line_id ) ) mtl_system_items mtl_item_categories mtl_categories mtl_category_sets oe_order_headers_all hz_cust_site_uses_all hz_cust_acct_sites_all hz_party_sites hz_locations hz_parties hz_cust_accounts mtl_parameters msi.inventory_item_id msi.organization_id mic.inventory_item_id mic.organization_id mic.category_set_id mcs.category_set_name cat.category_id ooh.header_id csu.site_use_id sit.cust_acct_site_id pst.party_site_id loc.location_id cus.cust_account_id par.party_id war.organization_id BY 1, 6, 2, 3, 4

per, ool, lty BETWEEN per.start_date AND per.end_date = '&&1' = '1' = ool.line_type_id = 0 BETWEEN Nvl ('&&4', 'BO') AND Nvl ('&&5', 'TO') IN ('BO', 'TO') IS NULL IS NOT NULL AND != ool.line_id) IS NULL) ilv, msi, mic, cat, mcs, ooh, csu, sit, pst, loc, par, cus, war = ilv.inventory_item_id = ilv.organization_id = msi.inventory_item_id = ilv.organization_id = mcs.category_set_id = 'PLINE' = mic.category_id = ilv.header_id = ilv.ship_to_org_id = csu.cust_acct_site_id = sit.party_site_id = pst.location_id = sit.cust_account_id = cus.party_id = ilv.organization_id

Notes Row-Column Pivotting The GL Account is regarded as having three subtypes for the purpose of this query. The query needs to bring back two of them for a given transaction, and display cost values in two columns corresponding to the subtypes, but one or the other may be missing. We prefer to avoid the complications and likely inefficiency of attempting to achieve this by outer-joining to two instances and instead use a row-column pivotting method, which is a useful general purpose technique that goes as follows (let’s say we have n columns, COL_1-COL_n whose values are obtained by expressions EXPRESSION_1- EXPRESSION_n and corresponding Where conditions CONDITION_1CONDITION_n): •

Join the table once for all conditions corresponding to the columns o

WHERE (CONDITION_1 OR … CONDITION_n)



Group by all columns except COL_1-COL_n



Add lines to Select list for i = 1 to n: o

File Ref: 21095268.doc

Max (CASE WHEN CONDITION_i THEN EXPRESSION_i END) COL_i

Worked Example: COGS No Charge Report

18

In Oracle 11g, there is a PIVOT clause native to SQL. Hints The Explain Plan for the query was found to favour hash joins, and poor performance was obtained. As a result a LEADING hint was added to each subquery, giving the preferred join order, following directly from the design sequences, and where necessary USE_NL hints were added to ensure nested loop joins. This gave much improved performance. ANSI SQL We would prefer to use ANSI join syntax, but cannot because the version of Oracle 10g we are using (10g 10.2.0.3.0 ) has a bug that causes some ANSI queries to fail spuriously with ORA-01445

File Ref: 21095268.doc

Worked Example: COGS No Charge Report

19

Issues Issues # 1

Issue ANSI join syntax

File Ref: 21095268.doc

Description Oracle bug, see above

Note if closed

Worked Example: COGS No Charge Report

20

References REF

Document

REF-1

Oracle, Ask Tom, ‘Considering SQL as a Service’

REF-2

Oracle, eTRM, R11.5.10

File Ref: 21095268.doc

Location http://asktom.oracle.com/pls/asktom/f?p= 100:11:0::::P11_QUESTION_ID:6727247 00346558185 https://etrm.oracle.com/pls/trm11510/etr m_search.search

Worked Example: COGS No Charge Report

21

Related Documents


More Documents from ""