Databasemanagementsystems-cs530a-15

  • Uploaded by: Hoi
  • 0
  • 0
  • April 2020
  • PDF

This document was uploaded by user and they confirmed that they have the permission to share it. If you are author or own the copyright of this book, please report to us by using this DMCA report form. Report DMCA


Overview

Download & View Databasemanagementsystems-cs530a-15 as PDF for free.

More details

  • Words: 2,302
  • Pages: 35
Database design with a relational model Functional dependencies Normalization

Functional Dependencies 

Definition:

– An integrity constraint between two sets of attributes

 

Think of it as a relationship between two sets of attributes Generalizes the concept of a key, where both X and Y represent sets of attributes X→Y – X functionally determines Y – Y is functionally determined by X FD {X → Y}

i.e., when two tuples agree on their values for the X domains, they agree on their values for the Y domains {X} {Y} 1,2,a,0 m 1,2,a,0 m

Functional Dependencies 







Legal instances of a relation must satisfy all ICs, including FDs FD are a property of the intension, not the extension, but can be “ruled out” using the extension Primary key is a particular case of a FD (but not all FDs are keys) F+ denotes the closure of F – The set of all functional dependencies implied by a given set of functional dependencies

Functional Dependency

A

B

C

D



A1

B1

C1

D1



A1

B1

C1

D2



A1

B2

C2

D1



A2

B1

C3

D1



Determinant

AB → C Dependent

Functional Dependency Closure 

Set F of Functional Dependencies (given) Relation: – –

EmpProj: SSN, Pnumber, Hours, Ename, Pname, Plocation FDs F:   



{SSN → Ename} Pnumber → {Pname, Plocation} {SSN, Pnumber} → Hours}

Closures: – {SSN}+ = {SSN, Ename} – {Pnumber}+ = {Pnumber, Pname, Plocation}



F+ – {SSN, Pnumber}+ = {SSN, Pnumber, Ename, Pname, Plocation, Hours}

Attribute Closure   

Determine FDs that can easily be identified (given) To see if a given FD is in the closure of a set F of FDs, use Armstrong’s Axioms to infer additional dependencies Armstrong’s Axioms – 3 rules of inference can generate all FDs: 

Reflexivity:

If X ⊇Y, then X → Y



Augmentation:

If X → Y, then XZ → YZ for any Z



Transitivity:

If X → Y and Y → Z, then X → Z



But keep in mind 2 other rules that are useful: 

Union:

If X → Y and X → Z, then X → YZ



Decomposition:

If X → YZ, then X → Y and X → Z

Minimal Set 

Requirements: – Every dependency has a single attribute for its right side – No X → A in F can be replaced with Y → A where Y is a subset of X and still have dependencies equivalent to F – No dependency can be removed and still have a set of dependency equivalent to F



Definition: – If the right-hand (dependent) side of every FD in S involves just one attribute, and – if the left-hand (determinant) side of every FD in S is irreducible (can’t discard another attribute without making a new set not equivalent to S - aka, changing the closure of S+), and – no FD in S can be discarded without changing the closure of S+, then the set S of FDs is irreducible (minimal).

Example 

Given: A → B, ABCD → E, EF → G, EF → H, and ACDF → EG





First rewrite: ACDF → E and ACDF → G Then consider ACDF → G (implied by: A → B, ABCD → E, and EF → G) Delete it (and, in the same way, ACD → E replaces ABCD → E) Minimal Set: A → B, ACD → E, ACDF → G, and ACDF → H. Single attribute on right, Minimize left, Delete redundant FDs

Another example… A = employee number, B = department number, C = manager’s employee number, D = project number for a project unique to a manager, E = department name, F = time specified for a project

Given: A -> BC, B -> E, CD -> EF Determine: Does AD->F hold? A-> BC (given), A->C (decomposition), AD->CD (augmentation), CD->EF (given), AD->EF (transitivity), AD->F (decomposition)

Cover Let S1 and S2 be two sets of FDs.  If every FD implied by S1 is implied by S2 (S1+⊆S2+), S2 is a cover for S1. So, if FDs are enforced for S2, they are enforced for S1. If S1 is also a cover for S2, they are equivalent. 

From another direction 

For P, the following FDs are part of the closure: P# ->Pname, P# -> Color, P# -> Weight, P# -> City The following FDs are not irreducible:   

P# -> {Pname, Color} P# -> Weight P# -> City –

   

{P#, Pname} -> Color P# -> Pname P# -> Weight P# -> City –

    

(right-hand side is not a single attribute)

(left-hand side can be further reduced without changing closure

P# -> P# P# -> Pname P# -> Color P# -> Weight P# -> City –

(first FD can be dropped without changing closure)

Functional Dependency from the SQL perspective

An SQL consideration: If X functionally determines Y, then Y is functionally dependent on X. SQL Translation: In a legal instance with one or more determinant columns and one dependent columns, if you create a SELECT statement with a WHERE clause containing “=“ for some value in the determinant columns, any tuples returned by the dependent column in your FROM clause must always be the same for all tuples. So the SELECT contains the right side and where or group by contains the left side of a Functional Dependency statement SELECT DependentColumn FROM Table WHERE DeterminantColumn1 = ‘value1’…AND DeterminantColumnN = ‘valueN’

Normalization Product Support Coverage ProductNumber

Date

Time

CR76 CR56 CR74 CR56

05/13/0 2 05/13/0 2 05/13/0 2 07/04/0 2

13:30 12:00 12:00 10:30

EmployeeI D SG5 SG5 SG37 SG5

PhoneNumber

PayRate

Withholding

3651101 3651101 3651102 3651102

12 12 12 18

1.80 1.80 1.80 2.40

ProductNumber,Date -> Time, EmployeeID, PhoneNumber EmployeeID, Date, Time -> ProductNumber PhoneNumber, Date, Time -> EmployeeID, ProductNumber EmployeeID, Date -> PhoneNumber Date -> PayRate PayRate -> Withholding

Product Coverage Support

Illustrate FDs

ProductNumb er

D ate

T ime

ProductNumber,Date -> Time, EmployeeID, PhoneNumber EmployeeID, Date, Time -> ProductNumber PhoneNumber, Date, Time -> EmployeeID, ProductNumber EmployeeID, Date -> PhoneNumber Date -> PayRate PayRate -> Withholding

EmployeeI D

PhoneNumbe r

PayRate

Withholding

Decompose Product Support Coverage

1NF

Pay

Coverage

Product Coverage

Wages

Phone Assignment

2NF

Taxes

3NF

BCNF

Sample Set in First Normal Form Ssn

Name

Department

Rank

Wage

Hours

Bonded

123223666

Amsden

Human Resource

8

10

40

Y

231515368

Skaggs

Management Information Systems

8

10

30

131243560

Sider

Marketing & Sales

5

7

30

434263751

Gallow

Marketing & Sales

5

7

32

612674134

Moore

Marketing & Sales

8

10

40

Normal Forms  

   

The key, the whole key, and nothing but the key Assumptions: – FDs exist – A designated primary key can be found Remember, a key is a FD, but a FD is not always a key A property of a relation schema indicating the type of redundancy in the relation schema The goal is to remove redundancy based on dependencies The desire is to minimize additional integrity constraints

Normal Forms 

Considerations: – Relational design by analysis – Normal forms are based on functional dependencies (FDs) – Intuitive, perhaps, but identifying a strictly controlled procedure allows a programmatic process – Should consider 2 additional properties  Lossless join (nonadditive join property) – required 

Dependency preservation property – use when possible

Lossless Joins and Dependency Preservation 

If relation R and FDs F hold over R, then decomposing R into R1 and R2 is lossless if the closure of F contains either: – FD R1 ∩ R2 -> R1 or – FD R1 ∩ R2 -> R2



If the closure of the attributes in R1, independent of those attributes in R2, unioned with the closure of attributes of R2, independent of those attributes in R1, are equivalent to the closure F, then dependency is preserved

2NF to 3NF

PN

PM

D

AD3111

20

A00

MA2100

10

D11

OP1000

10

D11

IF1000

20

A00

PN

PM

AD3111

20

IF1000

20

MA2100

10

OP1000

10

PM

D

20

A00

10

D11

Sample Set Ssn

Name

Department

Rank

Wage

Hours

Bonded

123223666

Amsden

Human Resource

8

10

40

Y

231515368

Skaggs

Management Information Systems

8

10

30

131243560

Sider

Marketing & Sales

5

7

30

434263751

Gallow

Marketing & Sales

5

7

32

612674134

Moore

Marketing & Sales

8

10

40

Normal Forms  

   

The key, the whole key, and nothing but the key Assumptions: – FDs exist – A designated primary key can be found Remember, a key is a FD, but a FD is not always a key A property of a relation schema indicating the type of redundancy in the relation schema The goal is to remove redundancy based on dependencies The desire is to minimize additional integrity constraints

Normal Forms 

Considerations: – Relational design by analysis – Normal forms are based on functional dependencies (FDs) – Intuitive, perhaps, but identifying a strictly controlled procedure allows a programmatic process – Should consider 2 additional properties  Lossless join (nonadditive join property) – required 

Dependency preservation property – use when possible

Lossless Joins and Dependency Preservation 

If relation R and FDs F hold over R, then decomposing R into R1 and R2 is lossless if the closure of F contains either: – FD R1 ∩ R2 -> R1 or – FD R1 ∩ R2 -> R2



If the closure of the attributes in R1, independent of those attributes in R2, unioned with the closure of attributes of R2, independent of those attributes in R1, are equivalent to the closure F, then dependency is preserved

Normal Forms 

Definitions: – Superkey –   

R={A1,A2,…,An} A set of attributes S ⊆ R No 2 tuples (t1,t2) in any legal relation state r of R has t1[S]=t2[S]

– Key –  

A superkey that is disqualified by any attribute being removed, & Must be minimal (can’t remove another attribute without disqualifying as key of R)

– Candidate Key – 

A set of attributes in a relation satisfying key definition

– Primary Key – 

Arbitrarily designated

– Primary Attribute – 

A member of a candidate key

Levels of Normalization

BCNF 3 NF 2 NF 1 NF

BCNF table EmployeeNumb er 100

Employee

HireDate

Salary

Mauser

DepartmentNumb er E21

6/19/1980

26150

110

Roberts

A00

5/16/1958

38170

120

O’Connell

A00

12/5/1963

37950

130

Clark

C01

7/28/1971

26150

140

Ross

C01

12/15/1976

35420

150

Adamson

D11

2/12/1972

30280

160

Clark

D11

10/11/1977

26150

290

Parker

D31

5/30/1980

15340

310

Skaggs

D31

9/12/1964

15900

First Normal Form (1NF) 

Requirements: – 1NF disallows multivalued attributes, or composite attributes, or their combinations, by requiring only single atomic (indivisible) values in the domain of an attribute – In other words, no “relation within relations”

Business Rules Example 1NF 



Staffing hours (S) are on a per project activity (activities within projects) basis - AN Managers (PM) and their departments (D) are assigned to projects (PN) – –

A department is assigned to a project managers A project manager is assigned to projects

PN

AN

PM

D

S

AD3111

20

20

A00

0.8

AD3111

30

20

A00

1.5

AD3111

40

20

A00

1

AD3111

50

20

A00

1.25

AD3111

60

20

A00

0.75

AD3111

70

20

A00

0.35

MA2100

20

10

D11

0.5

MA2100

30

10

D11

1

OP1000

30

10

D11

0.25

IF1000

30

20

A00

1

IF1000

50

20

A00

0.5

IF1000

60

20

A00

0.5

Second Normal Form (2NF) 

  

2NF requires full functional dependency on the whole key If FD X→Y, removal of A eliminates the FD For any attribute A∈X, (X-{A})→Y If A can be removed and FD remains, X→Y is a partial functional dependency (a violation of 2NF)

\

2NF violations 

Partial dependencies – X is a subset of some key K

K

X

A

1NF

Staffing is on a per project activity ( and activities within projects) basis Managers and their departments are assigned to projects

PN

AN

PM

D

S

AD3111

20

20

A00

0.8

AD3111

30

20

A00

AD3111

40

20

AD3111

50

AD3111

{PN,AN→STAFF} PN →{PM,DN}

2NF

PN

AN

AD3111

20

STAF F 0.8

1.5

AD3111

30

1.5

A00

1

AD3111

40

1

20

A00

1.25

AD3111

50

1.25

60

20

A00

0.75

AD3111

60

.75

AD3111

70

20

A00

0.35

AD3111

70

.35

MA2100

20

10

D11

0.5

MA2100

30

10

D11

1

OP1000

30

10

D11

0.25

IF1000

30

20

A00

1

IF1000

50

20

A00

0.5

IF1000

60

20

A00

0.5

PN

PM

D

AD3111

20

MA210 0 OP1000

10

A0 0 D1

IF1000

20

10

1 D1 1 A0 0

MA2100 20

.5

MA2100 30

1

OP1000

30

0.25

IF1000

30

1

IF1000

50

.5

IF1000

60

.5

Third Normal Form (3NF) 

3NF requires elimination of transitive dependency of nonprime (non-key) attributes on another nonprime (non-key) attribute (i.e., dependency is directly on the primary key) {X→Y,Y→Z}⊨X→Z – If a functional dependency X→A holds in R, one of these must be true:   

X is a superkey of R, or A ∈X (i.e., each attribute in A - X is part of some candidate key for R)

3NF violations 

Transitive dependencies – X is not a subset of the key

K

X

A X

2NF

A department is assigned to a project manager A project manager is assigned to projects

PN

PM D

AD311 1 MA210

20

0 OP100 0 IF1000

10 10 20

PN→PM →D

A0 0 D1 1 D1 1 A0 0 PM D 20 10

A0 0 D1 1

3NF

PN

PM

AD311 1 MA210

20

0 OP100 0 IF1000

10

10 20

More Documents from "Hoi"

Ket Qua Vong Loai Bang B
November 2019 29
Ke Hoach Ve Tau Tet
November 2019 23
Card.docx
December 2019 5
The Huy_toi Pham Quoc Te
November 2019 16