Normalization

  • Uploaded by: manubhardwajcs
  • 0
  • 0
  • June 2020
  • PDF

This document was uploaded by user and they confirmed that they have the permission to share it. If you are author or own the copyright of this book, please report to us by using this DMCA report form. Report DMCA


Overview

Download & View Normalization as PDF for free.

More details

  • Words: 1,494
  • Pages: 43
DBS201: Introduction to Normalization

Agenda  Top

Down vs Bottom Up  What is Normalization?  Why Normalization?  Normalization Steps

Top Down vs Bottom Up  Top

Down

Usually provided just a narrative or very high level data requirements  Need to discover entities, attributes, relationships  Result is tables 

Top Down vs Bottom Up  Bottom

Up

Provided with views of data  Views can be screen shots or reports (printouts)  Views contain fields (data)  Need to groups fields together – find fields that are in common  Result is tables 

Agenda  Top

Down vs Bottom Up  What is Normalization?  Why Normalization?  Normalization Steps

What is normalization?  Normalization 

Process for evaluating and correcting table structures to minimize data redundancies 

 Can

helps eliminate data anomalies

be used in conjunction with ER modeling to produce a good database design

What is normalization?  Works

through a series of stages called normal forms:  Normal

form (1NF)  Second normal form (2NF)  Third normal form (3NF)

What is normalization?  2NF  3NF

is better than 1NF is better than 2NF  For most business database design purposes, 3NF is highest we need to go in the normalization process  Highest level of normalization is not always most desirable

Agenda  Top

Down vs Bottom Up  What is Normalization?  Why Normalization?  Normalization Steps

Why normalization?  Example:

company that manages building

projects Charges its clients by billing hours spent on each contract  Hourly billing rate is dependent on employee’s position  Periodically, a report is generated that contains information displayed as in Table 5.1 

A Sample Report Layout

A Table in the Report Format

Why normalization?  Structure

of data set in Figure 5.1 does not handle data very well  The table structure appears to work; report is generated with ease  Unfortunately, the report may yield different results, depending on what data anomaly has occurred

Agenda  Top

Down vs Bottom Up  What is Normalization?  Why Normalization?  Normalization Steps

Conversion to First Normal Form  Relational

table must not contain repeating

groups  Repeating group 

Derives its name from the fact that a group of multiple (related) entries can exist for any single key attribute occurrence

 Normalizing

the table structure will reduce these data redundancies  Normalization is three-step procedure

Step 1: Eliminate the Repeating Groups  Present

data in a tabular format, where each cell has a single value and there are no repeating groups  Eliminate repeating groups by eliminating nulls, making sure that each repeating group attribute contains an appropriate data value

First Normal Form

Step 2: Identify the Primary Key  Primary

key must uniquely identify attribute values (a row)

 Primary

key is PROJ_NUM, EMP_NUM (because the combination of those two uniquely identifies each row of the table)

Step 3: Identify all Dependencies  Dependencies

can be depicted with the help

of a diagram  Dependency diagram: Depicts all dependencies found within a given table structure  Helpful in getting bird’s-eye view of all relationships among a table’s attributes  Use makes it much less likely that an important dependency will be overlooked 

Dependency Diagram 1NF PROJ_NUM PROJ_NAME EMP_NUM EMP_NAME JOB_CLASS CHG_HOUR HOURS

First Normal Form  Tabular

format in which:

All key attributes are defined  There are no repeating groups in the table  All attributes are dependent on primary key 

 All

relational tables satisfy 1NF requirements  Some tables contain partial dependencies Dependencies based on only part of the primary key  Still subject to data redundancies 

EMPLOYEE_PROJECT(PROJ_NUM(PK), EMP_NUM(PK), PROJ_NAME, EMP_NAME, JOB_CLASS, CHG_HOUR, HOURS)

Conversion to Second Normal Form  Relational

database design can be improved by converting the database into second normal form (2NF)  Two steps

Step 1: Identify All Key Components  Determine

which attributes are dependent on which other attributes  Using the dependency diagram, document the partial dependencies: in other words take each part of the primary key and document which attributes are dependent on each part of the primary key

Dependency Diagram 1NF PROJ_NUM PROJ_NAME EMP_NUM EMP_NAME JOB_CLASS CHG_HOUR HOURS

2NF

Step 2: Identify the Dependent Attributes  Write

each key component on separate line, and then write the original (composite) key on the last line  Each component will become the key in a new table PROJECT (PROJ_NUM (PK), PROJ_NAME) EMPLOYEE (EMP_NUM(PK), EMP_NAME, JOB_CLASS, CHG_HOUR) EMPLOYEE_PROJECT(PROJ_NUM(PK), EMP_NUM(PK), HOURS)

Second Normal Form  Table

is in second normal form (2NF) if:

It is in 1NF and  It includes no partial dependencies:  No attribute is dependent on only a portion of the primary key 

Conversion to Third Normal Form  Data

anomalies created are easily eliminated by completing these steps

Step 1: Identify Each New Determinant  For

every transitive dependency, write its determinant as a PK for a new table 

Determinant 

Any attribute whose value determines other values within a row

 Using

the dependency diagram, document the transitive dependencies: in other words identify the attributes dependent on each determinant identified above and identify the dependency

Dependency Diagram 1NF

2NF

3 NF

PROJ_NUM PROJ_NAME EMP_NUM EMP_NAME JOB_CLASS CHG_HOUR HOURS JOB_CLASS is a determinant because it can determine other values within the row. In this case, it’s the CHG_HOUR

Step 2: Name the table  Name

the table to reflect its contents and function

PROJECT (PROJ_NUM (PK), PROJ_NAME) EMPLOYEE (EMP_NUM(PK), EMP_NAME) JOB (JOB_CLASS(PK), CHG_HOUR) EMPLOYEE_PROJECT(PROJ_NUM(PK), EMP_NUM(PK), HOURS)

Third Normal Form A

table is in third normal form (3NF) if:

It is in 2NF and  It contains no transitive dependencies 

Improving the Design  Table

structures are cleaned up to eliminate the troublesome initial partial and transitive dependencies  Normalization cannot, by itself, be relied on to make good designs  It is valuable because its use helps eliminate data redundancies

Improving the Design (continued)  The

following changes should be made:

PK assignment  Naming conventions  Attribute atomicity  Adding attributes  Adding relationships (will define the FKs)  Refining PKs  Eliminate derived attributes 

Business Rules  Business Rules drive the relationships  Look at the data in the table and

understand/interpret what the relationship is between the data

Business Rules A

job class can have more than 1 employee in it; results in a 1:M relationships between JOB and EMPLOYEE  Add the FKs into the appropriate tables

Step 2: Name the table  Name

the table to reflect its contents and function

PROJECT (PROJ_NUM (PK), PROJ_NAME)

New foreign key

EMPLOYEE (EMP_NUM(PK), EMP_NAME, JOB_CLASS(FK)) JOB (JOB_CLASS(PK), CHG_HOUR) EMPLOYEE_PROJECT(PROJ_NUM(PK), EMP_NUM(PK), HOURS) Business rule not necessary between EMPLOYEE and PROJECT. Why?

First Normal Form

Normalization and Database Design  Normalization

should be part of design

process  Make sure that proposed entities meet required normal form before table structures are created  ER diagram Provides the big picture, or macro view, of an organization’s data requirements and operations  Created through an iterative process 

Identifying relevant entities, their attributes and their relationship  Use results to identify additional entities and attributes 

ERD EMPLOYEE JOB _CLASS PK

JOB _CLASS

PK

EMP _NUM

FK 1

EMP _LNAME EMP _FNAME EMP _INITIAL JOB _CLASS

Has assigned

CHG _HOURS

works

EMPLOYEE PROJECT PK

PROJ _NUM

Has working

PK,FK 1 PK,FK 2

_PROJECT

EMP _NUM PROJ _NUM HOURS

PROJ _NAME

Normalization and Database Design (continued)  Normalization

procedures

Focus on the characteristics of specific entities  A micro view of the entities within the ER diagram 

 Difficult

to separate normalization process from ER modeling process  Two techniques should be used concurrently

Summary  Normalization

is a table design technique aimed at minimizing data redundancies  First three normal forms (1NF, 2NF, and 3NF) are most commonly encountered  Normalization is an important part—but only a part—of the design process  Continue the iterative ER process until all entities and their attributes are defined and all equivalent tables are in 3NF

Normalization Exercise STU_NUM

STU_LNAME

STU_MAJOR

DEPT_CODE

DEPT_NAME

DEPT_PHONE

211343Stephanos

Accounting

ACCT

Accounting

4356

200128Smth

Accounting

ACCT

Accounting

4356

199876Jones

Marketing

MKTG

Marketing

4378

199876Ortiz

Marketing

MKTG

Marketing

4378

223456McKulski

Statistics

MATH

Mathematics

3420

Normalize the above table. 1NF – eliminate repeating groups, identify a PK for the table structure 2NF – find partial dependencies 3NF - find transitive dependencies Write final table structures, including all relationships. Make all attributes atomic.

Normalization Exercise Vehicle Num

Model

11Ford

15Chevrolet

Year

Acme Tire Number 1997

1999

Tire Manfctr

Size

KM This Vehicle

1327Goodyear

15R7

43000

1328Goodyear

15R7

43000

1329Goodyear

15R7

25000

1330Goodyear

15R7

24000

2013BF Goodrich

15R7

18000

2014BF Goodrich

15R7

18000

2013BF Goodrich

15R7

29000

2014BF Goodrich

15R7

29000

Normalize the above table. 1NF – eliminate repeating groups, identify a PK for the table structure 2NF – find partial dependencies 3NF - find transitive dependencies Write final table structures, including all relationships. Make all attributes atomic.

Related Documents

Normalization
November 2019 16
Normalization
June 2020 12
Normalization
November 2019 34
Normalization
November 2019 26
Normalization
November 2019 18
Normalization
November 2019 17

More Documents from ""

Normalization
June 2020 12