Database Tables and Normalization Normalization
is a process for assigning attributes to entities. It reduces data redundancies.
Normalization
works through a series of stages called normal forms: First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form (3NF) etc…………..
The
highest level of normalization is not always desirable. 1
Guidelines for Designing Relations
Guideline 1: Design relations so that the the meaning of the data is clear. Do not combine attributes from different entities and relationships into a single relation. Guideline 2: Design relations so that no insertion, deletion or modification PROBLEMS are likely to occur. Guideline 3: As far as possible, avoid including attributes in a relation that have null values in a majority of the tuples. Guideline 4: Design relations that can be JOINed with each other using equality conditions on attributes that are either primary keys or foreign keys.
2
Keys
The role of the key is based on a concept known as determination, which is used in the definition of functional dependence. An
attribute B is functionally dependent on an attribute A if the value of A determines the value of B.
An
attribute that is part of a primary or candidate key is known as a key ( or prime) attribute. Else, it is a nonkey (or non-prime) attribute.
If an attribute B is functionally dependent on a composite key A but not on any subset of A, then B is fully functionally dependent on A; otherwise B is partially dependent on A. 3
Two Other Dependencies Partial Dependency A functional dependency ‘A determines B’ is a partial dependency if the dependency holds even after removing some attribute from A.
Transitive Dependency A transitive dependency occurs when one non-prime attribute depends on another non-prime attribute.
4
Database Tables and Normalization 1NF
Definition
The term First Normal Form (1NF) describes the tabular format in which: All the prime attributes are defined and have non-null values. All table entries are atomic, i.e., no cell of the table contains a group of items. All attributes are dependent on the primary key. 5
Customer
Customer ID 123
First Name Robert
124
Jane
Surname
Tel. No. 1
Tel. No. 2
Ingram
Tel. No. 3
2025456
Wright
555-403-1
659555-776-
Customer Customer ID
First Name
1
Surname
Robert 2
Jane
Ingram Wright
Customer Telephone Number Customer ID 1 2 124
Telephone Number 2025456 555-403-1 659555-7766
Database Tables and Normalization 2NF
Definition
A table is in Second Normal Form (2NF) if: It
is in 1NF, and It includes no partial dependencies, i.e., no attribute is dependent on only a part of a composite candidate key. Note that it is possible for a table in 2NF to exhibit transitive dependency, i.e., a non-prime attribute can be functionally dependent on some other non-prime attribute. 7
Consider a table describing employees' skills: Employee
Skill
Current Work Location
Jones
Typing
Jones
Shorthand
114 Main Street
Jones
Whittling
114 Main Street
Roberts
Light Cleaning
114 Main Street
73 Industrial Way
The table's only candidate key is {Employee, Skill}.The remaining attribute, Current Work Location, is dependent on only part of the candidate key, namely Employee. Therefore the table is not in 2NF. Note the redundancy in the way Current Work Locations are represented: we are told three times that Jones works at 114 Main Street. This redundancy makes the table prone to update problems. 8
A 2NF alternative to this design would represent the same information in two tables: Employees Employee
Jones Roberts
Location 114 Main Street 73 Industrial Way
Current Work
Employees' Skills Employee
Jones Jones Jones Roberts
Skill
Typing Shorthand Whittling Light Cleaning
9
Database Tables and Normalization 3NF
Definition
A
table is in Third Normal Form (3NF) if: It is in 2NF, and It contains no transitive dependencies.
10
11
To
normalize data to achieve third normal form-
• Identify and remove any attribute which is actually dependent on a non-key attribute in the same relation . • Create a new relation for this attribute identified by the relevant key.
12
13
Denormalization Normalization
is only one of many database design
goals. Normalized
(decomposed) tables require additional
processing. Normalization
purity is often difficult to sustain in the modern database environment. The conflict between design efficiency, information requirements, and processing speed are often resolved through compromises that include denormalization. 14
Merging Relations (View Integration) This is required when, on large projects, the work of several sub-teams comes together during logical design. Problems that may arise: Synonyms: Different names, same meaning. Homonyms: Same name, different meanings. 15
Synonyms STUDENT1 (Student_ID, Name) STUDENT2 (SSN, Name, Address) If Student_ID is same as SSN, the two relations can be merged to STUDENT (SSN, Name, Address)
Homonyms STUDENT1 (Student_ID, Name, Address) STUDENT2 (Student_ID, Name, Phone, Address) If one address refers to the campus address and the other to the home address, the relations can be merged to STUDENT (Student_ID, Name, Phone, Campus_addr, Home_addr)
16