Normalization Normalization is a method for organizing data elements in a database into tables and reduces redundancy. Redundancy is unnecessary repetition of data. It’s based on functional dependencies among the attributes of a relation.The process of normalization was first developed by E.F.CODD in 1972. Normalization Avoids • Duplication of Data – The same data is listed in multiple lines of the database • Insert Anomaly – A record about an entity cannot be inserted into the table without first inserting information about another entity – Cannot enter a customer without a sales order • Delete Anomaly – A record cannot be deleted without deleting a record about a related entity. Cannot delete a sales order without deleting all of the customer’s information. • Update Anomaly – Cannot update information without changing information in many places. To update customer information, it must be updated for each sales order the customer has placed Example of Redundancy and Anomaly:
6
Functional Dependency (FD):
It’s describes the relationship between attributes (column) in a relation. FD must be ONE TO ONE relationship not ONE TO MANY relationship. Example:
Types of Functional Dependencies: 1. Full Dependency 2. Partial Dependency 3. Transitive Dependency 1.
Full Dependency:
2.
Partial Dependency:
3.
Transitive Dependency:
in a relation, the attribute s B is fully functional dependent on A if B is FD on A, but not on any proper subset of A. A condition in which an attribute is dependent on only a portion (subset) of the primary key.
another attribute that is not key.
A condition in which an attribute is dependent on part of the primary
Example:
6
What are anomalies in DBMS and their types? Tables that have redundant data have problems known as anomalies.So data redundancy is a cause of an anomaly. Redundancy is the duplicaion of the data. There are 3 types of anomalies. Insert Anomaly:When you insert a record without having it stored on the related record. Delete Anomaly:When you delete some information and lose valuable related information at the same time. Update Anomaly: Any change made to your data will require you to scan all records to make the changes multiple time.
6
First Normal Form Definition: All attributes are atomic and dependent on the primary key.Atomicity means only one value for each tuple and all the key attributes (Column) are defined. Example:
Before 1 NF:
OrderDate Customer Items 11/30/199 Joe Smith Hammer, 8 Nails
Saw,
OR OrderDate Customer Item1 11/30/199 8
Item2
Joe Smith Hammer Saw
Item3 Nails
After 1NF: Customers CustomerI Name D 1 Joe Smith 2 A Smith 3 Smith Renee
OrderID 1 1 1
Orders Custome rID Hammer 1 Saw 1 Nails 1 Item
OrderDa te 11/30/1998 11/30/1998 11/30/1998
Example 2 (After 1NF):
6
Second Normal Form Definition: 1NF and every non key attribute are fully and functionally dependent on the primary key. Explanation: It includes no partial dependency i.e. create separate table with the functionally dependent data and the
part of the key on which it depends. Example:
Before 2 NF: OrderID 1 1 1
After 2NF OrderID 1 2 3
Orders Item Custome rID Hammer 1 Saw 1 Nails 1 Orders Custome rID 1 1 1
OrderDa te 11/30/1998 11/30/1998 11/30/1998
OrderDa te 11/30/1998 11/30/1998 11/30/1998
OrderDetails OrderID Item 1 Hammer 2 Saw 3 Nails
Example 2 (After 2NF):
6
Third Normal Form Definition: It is in 2NF and no transitive dependencies exist. Explanation:In 2NF and every non-key column is not mutually dependent.
Example:
Before 3 NF: OrderDetails Quant Price ity 2 500 5 2000
Item Hammer Saw Nails
8
50
Tota l 1000 1000 0 400
After 3 NF: OrderI D 1 2 3
OrderDetails Item Quantit Price y Hammer 2 500 Saw 5 2000 Nails 8 50
Example 2:
6
Boyce-Codd Normal Form Definition: A relation is in BCNF if and only if all the determinants are candidate keys. BCNF was to address certain types of anomaly
developed in 1974 by Raymond F. Boyce and Edgar F. Codd not dealt with by 3NF as originally defined
Explanation: The Boyce-Codd Normal Form (BCNF) database normalization methodology is a methodology for
database design used to normalize data beyond the third normal form (3NF). In BCNF, determinant must be a candidate key; if this is not the case then the form of the BCNF. For example, if there was a table used to hold data about employee’s with the attributes: employeeID, firstName, lastName, title. The employeeID field determines the firstName and lastName, this is a superkey and must hold unique data and can not be NULL. Similarly the tuple (firstName, lastName) determines the employeeID, this makes firstName, lastName candidate keys of employeeID. In such a case; as previously explained, these determinants must be candidate keys, however, a relations' candidate keys must have unique sets of values for each row it holds. every database is not
Example:
6
Example 2:
6
Example 3:
Rate Types Rate Type
Today's Bookings
Court Member Flag
Court Start Time End Time Member Flag
1
Yes
1
09:30
10:30
Yes
STANDARD 1
No
1
11:00
12:00
Yes
PREMIUM-A 2
Yes
1
14:00
15:30
No
PREMIUM-B 2
No
2
10:00
11:30
No
2
11:30
13:30
No
2
15:00
16:30
Yes
SAVER
Fourth Normal Form Definition: It is in 3NF and has no multiple sets of multivalve dependencies i.e. multivalve dependencies are removed Example:
Before 4NF
6
After 4NF:
Fifth Normal Form Definition: Any remaining anomalies are removed. In this normal form we isolate semantically related multiple relationships. It’s a renormalizations.
6
SUMMARY:
Other Details Relational data model relation a table in a relational database is called relation in the mathematical language of relational algebra. relations are unordered. attribute column of a table in database table is called attributes. columns or attributes have names. domain set of permissible values for an attribute ( or column) is called domain. tuple a row in the database table is called tuple in the mathematical language of relational algebra. order of tuples in a relation has no significance. database a database is a collection of multiple relations. schema a database design is called schema, alternatively, a schema can refer to namespace within a database. cardinality of a relation number of attributes in a relation is called cardinality of te relation.
6