Class Note 4

  • April 2020
  • PDF

This document was uploaded by user and they confirmed that they have the permission to share it. If you are author or own the copyright of this book, please report to us by using this DMCA report form. Report DMCA


Overview

Download & View Class Note 4 as PDF for free.

More details

  • Words: 2,922
  • Pages: 12
CLASS NOTE-4 (FUNCTIONAL DEPENDENCY & NORMALIZATION) 1. Q: What is Functional Dependency? Why required? What do you mean by Functional Dependency Diagram? Answer: (First Part) definition: Let R be a relation scheme and X, Y be sets of attributes in R. A functional dependency from X to Y exists if and only if:  For every instance of |R| of R, if two tuples in |R| agree on the values of the attributes in X, then they agree on the values of the attributes in Y We write X → Y and say that X determines Y Example on Student (sid, name, supervisor_id, specialization):  {supervisor_id} → {specialization} means  If two student records have the same supervisor (e.g., Dimitris), then their specialization (e.g., Databases) must be the same  On the other hand, if the supervisors of 2 students are different, we do not care about their specializations (they may be the same or different). Sometimes, I omit the brackets for simplicity:  supervisor_id → specialization

Another example is:

1

(Second Part) Why required? Functional dependency is required to remove the habit of bad database design. Given below is a bad database design-

(Third Part) FD Diagram?

Drawing the diagrams: Functional dependency diagrams are a useful diagrammatic way of showing functional dependencies. They are especially helpful in deducing the closure of an attribute set or of a set of functional dependencies. These diagrams are not used by Silberschatz, Korth, and Sudarshan. They can be found in Date's book, but he does not give a formal definition of the rules for drawing these diagrams. The following are some guidelines for drawing functional dependency diagrams. 1. Each attribute in the relation schema appears only once in the diagram. 2. If the left side of a functional dependency consists of an irreducible set of attributes, these attributes are enclosed in a box from which the arrow for that FD originates. 3. All arrows terminate on single attributes. In other words, apply Armstrong's decomposition rule to turn FDs with multiple attributes on the right-hand side into multiple FDs with only one attribute on the right-hand side. Example: Given the set of functional dependencies

The FD diagram for this set of FDs would be:

2

2. Q: What is a trivial functional dependency? Answer: A functional dependency X → Y is trivial if Y is a subset of X  {name, supervisor_id} → {name}  If two records have the same values on both the name and supervisor_id attributes, then they obviously have the same supervisor_id.  Trivial dependencies hold for all relation instances A functional dependency X → Y is non-trivial if Y∩X = ∅  {supervisor_id} → {specialization}  Non-trivial FDs are given implicitly in the form of constraints when designing a database.  For instance, the specialization of a students must be the same as that of the supervisor.  They constrain the set of legal relation instances. For instance, if I try to insert two students under the same supervisor with different specializations, the insertion will be rejected by the DBMS 3. Q: What are different rules of functional dependency? What are Armstrong’s axioms? Proof that it is sound and complete. Answer: Different rules on functional dependency areLet X, Y, Z,M be subset of the relation scheme of a relation R 1. Reflexivity: If Y⊆X, then X→Y (trivial FDs) Example: {name, supervisor_id}→{name}

2. Augmentation: If X→Y , then X∪Z→Y∪Z Example: If {supervisor_id} →{spesialization} , then {supervisor_id,name}→{spesialization,name}

3. Transitivity: If X→Y and Y→Z, then X→Z Example: If {supervisor_id} →{spesialization} and {spesialization} →{lab}, then {supervisor_id}→{lab}

4. Union:, If X→Y and X→Z , then X→YZ 5. Decomposition: If X→YZ, then X→Y and X→Z 6. Pseudo transitivity: If X→Y, and XZ→M then YZ→M 7. Reflexivity: X→X First Three rules are termed as Armstrong’s Axioms. It is sound and complete. Proof is given belowArmstrong’s axioms are sound (i.e., correct) and complete (i.e., they can produce all possible FDs) Example: Transitivity

3

let X, Y, Z be subsets of the relation R If X→Y and Y→Z, then X→Z Proof of soundness: Assume two tuples T1 and T2 of |R| are such that, for all attributes in X, T1.X = T2.X,  since X→Y then, T1.Y = T2.Y  since Y→Z and T1.Y = T2.Y then T1.Z = T2.Z 4. Q: Define Closure of set of Functional dependency and Closure of Attribute Set. Answer: Closure of Functional Dependency Set: For a set F of functional dependencies, we call the closure of F, noted F+, the set of all the functional dependencies that can be derived from F (by the application of Armstrong’s axioms).  Intuitively, F+ is equivalent to F, but it contains some additional FDs that are only implicit in F. Consider the relation scheme R(A,B,C,D) with F = {{A} →{B},{B,C} →{D}} F+ = { {A} →{A}, {B}→{B}, {C}→{C}, {D}→{D}, {A,B}→{A,B}, […], {A}→{B}, {A,B}→{B}, {A,D}→{B,D}, {A,C}→{B,C}, {A,C,D}→{B,C,D}, {A} →{A,B}, {A,D}→{A,B,D}, {A,C}→{A,B,C}, {A,C,D}→{A,B,C,D}, {B,C} →{D}, […], {A,C} →{D}, […]} Closure of Attribute Set: For a set X of attributes, we call the closure of X (with respect to a set of functional dependencies F), noted X+, the maximum set of attributes such that X→X+ (as a consequence of F) Consider the relation scheme R (A,B,C,D) with functional dependencies {A}→{C} and {B}→{D}.  {A}+ = {A,C}  {B}+ = {B,D}  {C}+={C}  {D}+={D}  {A,B}+ = {A,B,C,D} {A,B} is a superkey because:  It determines all attributes  Is minimal - neither {A} nor {B} alone are candidate keys 5. Q: Write down the algorithm to find out the closure of Attribute Set. Answer:

Input:  R a relation scheme  F a set of functional dependencies  X ⊂ R (the set of attributes for which we want to compute the closure) Output:  X+ the closure of X w.r.t. F X(0) := X Repeat X(i+1) := X(i) ∪ Z, where Z is the set of attributes such that  there exists Y→Z in F, and 4

 Y ⊂ X(i) Until X(i+1) := X(i) Return X(i+1) 6. Q:

Find out the closure of the following FD set.

{ SSN→ENAME, PNUMBER→{PNAME, PLOCATION}, {SSN, PNUMBER}→HOURS } Answer: {PNAME}+ ={SSN, ENAME} {PNUMBER}+ ={PNUMBER, PNAME, PLOCATION} {SSN, PNUMBER}+ = {SSN, PNUMBER, PNAME, PLOCATION, HOURS}

7. Q: Define Irreducible set of functional dependency and Canonical Cover. Answer:

A set of FDs S is irreducible if and only if 1) The right -hand side of each FD in S involves one attribute 2) No attribute in the left-hand side of each FD in S can be removed without changing S+ 3) No FD in S can be removed without changing S+ Example: R = (A, B, C, D) S = {A → BC B→ C A→B AB → C AC → D} The irreducible set A →B B→ C A →D Canonical Cover:

An attribute of a functional dependency is said to be extraneous if we can remove it without changing the closure of the set of functional dependency. The formal definition of extraneous attribute is as follows-Consider a functional dependency X→Y • An attribute A is extraneous in X if A∈ X and F logically implies (F-{X→Y}) U {(X-A) →Y} • An attribute A is extraneous in Y if A ∈ Y and F logically implies (F-{X→Y}) U {X→(Y-A)} A canonical cover for F is a set of dependencies Fc such that  F and Fc,are equivalent  Fc contains no redundancy  Each left side of functional dependency in Fc is unique.  For instance, if we have two FD X→Y, X→Z, we convert them to X→Y∪Z. Example to find out the canonical cover of the following: R = (A, B, C) F = {A → BC 5

B→C A→B AB → C} Combine A → BC and A → B into A → BC  Set is now {A → BC, B → C, AB → C} A is extraneous in AB → C because of B → C.  Set is now {A → BC, B → C} C is extraneous in A → BC because of A → B and B → C. The canonical cover is: A→B B→C 8. Q: What do you mean by normalization? What are Different forms of normalizations? Answer:

The process of normalization was proposed by Codd (1972). Initially Codd proposed three normal forms, which he called first, second and third normal forms. Normalization of data can be taken upon as based on their FDs and primary keys to achieve 1. minimizing redundancy 2. minimizing the insertion, deletion and update anomalies We can also say thatNormalization is the process of decomposing a relation schema R into fragments (i.e., smaller tables) R1, R2,.., Rn. Our goals are:  Lossless decomposition: The fragments should contain the same information as the original table. Otherwise decomposition results in information loss.  Dependency preservation: Dependencies should be preserved within each Ri , i.e., otherwise, checking updates for violation of functional dependencies may require computing joins, which is expensive.  Good form: The fragments Ri should not involve redundancy. Roughly speaking, a table has redundancy if there is a FD where the LHS is not a key (more on this later).

Different forms of normalizations are• First normal form 6

• • • • •

Second normal form Third normal form BCNF Fourth normal form Fifth normal form

9. Q: What is first normal form? How can you convert a relation into first normal form? Answer:

A relation is said to be in First normal form if value of any attribute in a tuple must be a single value from the domain of that attribute. Following relation is not in first normal formCustomer (Customer_ID, First_Name, Surname, TelephoneNo)

How to normalize it into first normal form: First Solution: Expand the relation and make the Customer_ID and TelephoneNo together as primary key.

Second Solution: Separate the TelephoneNo attribute and make another relation taking along with the primary key of the original relation Customer-A (Primary key is- Customer_ID)

Customer-B (Primary key is- Customer_ID,TelephoneNo)

Third Solution: Expand the TelephoneNo attribute into no of columns as the maximum no of possible values of its domain.

10. Q: What is second normal form? How can you convert a relation into second normal form? Answer:

A relation is said to be in second normal form if it is already in first normal form and full functional dependency exists. i.e. all non prime attributes are fully dependent on primary key attribute. 7

Let us consider the following relation-

Here, Employee, skill together are primary key. But CurrentWorkingLocation is partially dependent on Employee. So partial functional dependency exists. So we have to remove this. So after 2NF conversion-

11. Q: What is Third normal form? How can you convert a relation into Third normal form? Answer:

A relation is said to be in second normal form if it is already in second normal form and no non prime attribute is transitively dependent on primary key attribute. Let us consider the following relation-

Here, Area is a non prime attribute and it is transitively dependent on primary key CompanyNo. So this is not in 3NF. It is already in 2NF. So after 3NF conversion the above relation will be-

12. Q: What is Boycee Codd normal form? How can you convert a relation into BCNF? Answer:

A relation is said to be in BCNF form if it is already in Third normal form and whenever a non trivial functional dependency A → B exists A is always a superkey Let us consider the following relation-

This relation is not in BCNF because Area → LtNo. Area is not a superkey. So after BCNF conversion the above relation will be decomposed into-

8

Here, some of the decomposition may be lost but which never loss information.

13. Q: Consider two functional dependencies F & G. Check whether they are equivalent.

The FD set F is{A → C, AC → D, E → A, E → H} The FD set G is{A → CD, E → AH} Answer:

For FD set F1. A → C (given) 2. A → AC (augmentation) and AC → D (given). Hence A → D 3. E → A 4. E → H For FD set F5. A → C (decomposition) 6. A → D (decomposition) 7. E → A (decomposition) 8. E → H (decomposition) So F and G are equivalent. 14. Q: Given a relation r(A,B,C,D) and F {AB → C, B → D, D → B} of functional dependencies, find the

candidate keys of the relation. How many candidate keys are in this relation? What are prime attributes. Answer:

There are two candidate keys AB and AD. AB is a candidate key, because1. AB → A (reflexivity) 2. AB → B (reflexivity) 3. AB → C (given) 4. AB → B (from 2) and B → D (given). So AB → D So, AB determines all the attributes, hence AB is a candidate key. AD is a candidate key, because5. AD → A (reflexivity) 6. AD → D (reflexivity) 7. AD → D (from 2) and D → B (given). So AD → B 8. AD → B (given) and AB → C (given). So, AD → C So, AD determines all the attributes, hence AD is a candidate key. So prime attributes are A, B, D. 9

15. Q: What do you mean by decomposition? What is non-loss decomposition? Answer:

Decomposition is a process for breaking a relation schema that has many attributes into several schema with fewer attributes. Two decompositions• Lossy decomposition: If after decomposition the functional decomposition is not preserved than it is called lossy decomposition. • Lossless decomposition: If after decomposition the functional decomposition is preserved than it is called lossless decomposition. Consider the following relation-

Let us decompose into following two relations-

After Joining again above this two relations-

Here spurious tuples will be obtained so, it is lossy decomposition. 15. Q: What is multivalued dependency? Answer:



  

Let R be a relation and X and Y arbitrary sets of attributes of R; if for any 2 tuples (in any extension) t1 and t2 with the property  t1[X] = t2[X] there exists 2 tuples t3 and t4 such that (Z = R - (X∪Y))  t1[X] = t3[X] = t4[X]  t1[Y] = t3[Y] and t2[Y] = t4[Y]  t1[Z] = t2[Z] and t3[Z] = t4[Z] then X →→ Y For example,

10

 

Then, ENAME →→ PNAME and ENAME →→ DNAME

15. Q: What is trivial MFD? What is fourth normal form? Answer:

R, is a relation; X and Y arbitrary sets of attributes of R  X →→ Y is trivial if and only if  Y ⊆ X or  X∪Y=R A table is in 4NF if and only if, for every one of its non-trivial multivalued dependencies X →→ Y, X is a superkey, that is, X is either a candidate key or a superset thereof. If X →→ Y is trivial then it is in 4NF. Consider the following relation which is not in 4NF.

Now in fourth normal form-

16. Q: What is denormalization? Answer:

The process of talking a normalization schema and making it non-normalization is called denormalization. The normal forms discussed so far required that the given relation R if not in the given normal form be decomposed in two relations to meet the requirements of the normal form. In some rare cases, a relation can have problems like redundant information and update anomalies because of it but cannot be decomposed in two relations to remove the problems. In such cases it may be possible to decompose the relation in three or more relations using the 5NF. The fifth normal form deals with join-dependencies which is a generalisation of the MVD. The aim of fifth normal form is to have relations that cannot be decomposed further. A relation in 5NF cannot be constructed from several smaller relations. 16. Q: what is fifth normal form? Answer:

11

A relation R satisfies join dependency (R1, R2, ..., Rn) if and only if R is equal to the join of R1, R2, ..., Rn where Ri are subsets of the set of attributes of R. A relation R is in 5NF (or project-join normal form, PJNF) if for all join dependencies at least one of the following holds. (a) (R1, R2, ..., Rn) is a trivial join-dependency (that is, one of Ri is R) (b) Every Ri is a candidate key for R.

An example of 5NF can be provided by the example below that deals with departments, subjects and students. dept subject Comp. Sc. CP1000 Mathematics MA1000 Comp. Sc. CP2000 Comp. Sc. CP3000 Physics PH1000 Chemistry CH2000

student John Smith John Smith Arun Kumar Reena Rani Raymond Chew Albert Garcia

The above relation says that Comp. Sc. offers subjects CP1000, CP2000 and CP3000 which are taken by a variety of students. No student takes all the subjects and no subject has all students enrolled in it and therefore all three fields are needed to represent the information. The above relation does not show MVDs since the attributes subject and student are not independent; they are related to each other and the pairings have significant information in them. The relation can therefore not be decomposed in two relations (dept, subject), and (dept, student) without loosing some important information. The relation can however be decomposed in the following three relations (dept, subject), and (dept, student), (subject, student) and now it can be shown that this decomposition is lossless.

12

Related Documents

Class Note 4
April 2020 4
Class Note
August 2019 32
Note 4
November 2019 0
Class 4
November 2019 12
Class 4
November 2019 15