Database Designing Concepts

  • Uploaded by: ChiranSJ
  • 0
  • 0
  • May 2020
  • PDF

This document was uploaded by user and they confirmed that they have the permission to share it. If you are author or own the copyright of this book, please report to us by using this DMCA report form. Report DMCA


Overview

Download & View Database Designing Concepts as PDF for free.

More details

  • Words: 9,807
  • Pages: 51
Database Designing Concepts Data Base A Data Base is shared collection of logical interrelated data. A Data Base System Every organization needs to record data relevant to their every day activates. The organization can choose from the data collected & stored some of these data in an electronic Data Base. Eg: A university needs to record data to help in the activates of teaching & learning therefore three record what kind of students & lectures they have what are the courses & modules they are running which lecturers are teaching what modules. One the data is entered in to a database it could be utilization to get complete & accrete information’s such as list of students who have enrolled for each modules. this can help to make decision on room utilization .etc.. Data vs. Information The Word data refers to facts concerning things such as people, objects or events. Information is data that have been processed & presented in a from suitable for human interpretation. Disadvantages of manual system 1. A Constant stream inter company paper work 2. Telephone calls or faxes are required to communicate changes & keep the file synchronize. 3. These systems cannot provide “what if” type of questions. 4. Managers cannot easily obtain summarized information required for decision making. 5. Duplicate data can exist throughout organization resulting in lack of consistency. Disadvantages of file processing system 1. There is limited data sharing 2. Inconsistency of data 3. Duplication of data can occur. 4. Excessive program maintenance 5. Poor enforcement of standard. Advantages of Database 1. Better data sharing 2. Better security & integrity 3. Reduced redundancy 4. The program & data are independent 5. Multiple views of data.

Disadvantages of Database 1. New specialized persons are required. 2. Need to get 3. Problems in data sharing explicit backup. Database Approaches Database approach emphasizes the integration & sharing of data. There are 2 designing approaches. 1-process Driven Design Requirement Analysis

Process Design

Data Design

Implementation File processing system a process driven approached was traditional being used with this approche organization process such as order processins, inventory control, pay roll are identified & analyzed. Process and data flows between processed are described using tools such as DED. Designers work backwards from the required outputs of the system to determine the requirement unput. They can use flow chart into outputs. Finally they design data file as a wide product of product of process divan design. 2- Data Driven Design Requirement Analysis

Data design Process Design

Implentation

It forcuses on entities person, Plese, events or concept about which an organization choose to record data. This approche ideenti files the attributes of the properties of these entities & the relation ship among them. It also identities the business rules that speifiese how the entities are managed or used. After creating the suitable module of the data structure and related business rules the designer developes the application reqired to manage the data. Designers now have discovered a balance approche to combine to different applications approprialely suitable. Database development Enterprise Data module

Requirement Analysis

Conceptual model

Logical Data modle

Implementation

Physical Data Model

Conceptual data model This involves building a “real world” model expressed interms of data requirements. The initiaset of information & processing reqirements are gathered from the users. The developments of the large databases conception modeline may consis of view integration (combining the user views into onces schema) The output from this stage is the entity relationship digram. (3) Logical data Model This involves the building of the real world expressed interms of a data model. Eg. Hierarchical, Network ,Relational. This involves determind the concept of the database using the conception model as input & transforming it into the architectural data model.This is the mapping of the ER model.to the relational model & caring the process of normalization. The output of this stage is relational schema up or set of table. Physical data model. This involves building a model of the real world expressed interms of data structures & Acces mechanisum available in a choosen Database management system.

This involved in a transformation of a logical data model suitable for a specific software /Hardware specification. This is usuall expressed in data dominion language & theoutput of this stage is implementation plan. Entity An entity is the thing we modeled & it is something about which we wish to store information we name the entities with the singular noun. An entity can be 1-Aperson – Employer, student, customer etc. 2-A place – It could be a state country , region etc.3-An object – chair,Table, building etc. 4- An Event – sales, Rigistration, Renewal. 5- A concept – Acount ,Course The organization. ER models are usually constructed during the analysis phase of the DB development process. Output of this stage is a coceptval data model expressed in the form of a detailed ER digram. The ER digram consist of entities relationships & attribules.

(4) Basic Systems of an ER digram.

Entity

Relationship

Attributes Registration Primary key Attributes. This is the property for entity.

Primary key

Primary key

Eg: Student Relationshipjhhgvf Rel Student No

Attributes Name Add Tel No

ationship Association between entities. Can be specified using relationship. There are 3 basic kind of relationship. 1- 1:1 (one – to – one) 2.- M (one – to – Many) / (1:*) 3 – M:M (Many – to – Many ) / (*:*) 1:1 1.

2.

(5)

Country

H

Capital

1

Lock

1 H

1 3

1 H

Husband

1:M/ / 1:*

Key

Wife

*

1 B

Invoice

1.

Customer

2.

Mother

1

* H ave

. 1

3 Shop

M: M /*:* PE . Student

Child

Sales Man

has

*

* Sales Man

Regis ter

Cannot allow M:M for introduced “Link Entity” with two 1:M relationship.

(6)

Student

Student OR

1 *** * Enrolment

Course

Course Student

Link Entity

Course

*

* Communi cate

Person

Country

Country C 1

*

Language

* 1 Person

(7) British counsil library Library items – Book / CD – DVD / Enayclopedia Reception Membership Readers Libraraian

British Councial Library

H ave

Reception

Regi ster

Reader

CD DVD Library Item

H ave

books

H ave

Encyclopedia Copy A

D eati

1.* Librararian

Lending Collection

Accomerdation of ER model to a relational Data model. After drawing a ER digram it has to be converted into relational schema it can be done as follows. Convert each entity into a relational table in your relational data model the identifier of the entity becomes primary key of the table, all other attributes become nonkey attributes of the table. (8) Eg. Student (Entity) (RegNo,Name,Add) Primary Key Table Name

1: M Relationships are represented as relational schema by taking the key field of the table at one end gide as a foreign key to the table at the many end side. Eg: Customer Customer (CNo Primary key, Name, Billno,goods-------) Foreign Key Order Order (Order No,Name---------(No)

1:1 relationship can represented as one table A turnary relationship is respresented as 1 table with the compersit primary key which consist of all & identifiers of the 3 tables. 1:M Recursive relationship is represented in1 with a forigenkey which is effectively primary key.

Manage

Employee

** Employee (EmpId , Name, Add.DoB ------

If there are M:M relationship they should be avoided by breaking them into 2, 1:M relationship for that we have to introduced additional entity call link entity. LE is added which cross references between instances of 1 entity & instances of other entity. The many ends of the relationship upper at the LE. Thhe PK of the 2 entities are in corperated into the linkentity. (9)

There can be two types of entities avatable in ER digram 1- Super type entity 2. Sub type entity Super type entity This is an entity that stores attributes commen to 1 or more entity sub types Sub type entities This is an entity that inherites some common attributes from an entity super type.

Attributes

These are the charatristics of entities can be used to disscribe entities and relationship in the ER digram. An entity can have many attributes but should have an identification key. A key is an attribute or group of attributes which can identify an entity uniquely. A key attribute can be underlined in the ER digram. There are single valued attribute is 1 that can have sigle valued attribute Is 1 that can have sigle value. Eg.A persons 1D ,Subject,Mark ,Grade, etc. Multy valued attributes they can have many values. Eg.An office can have different officeno.A person have sevelan qualification. Cardinaler & optionarity Cardinality we already discussed. Relationship partiapation Praticipation in an entity relationship can either optional or mandertary Optional relationship Here the participation is optional if one entity occurrence doesnot require the corresponding entity occurrence, Entity. Mandertary relationship Relationship participation is mandertary (must) if one entity occurrence in a pertionship indicate that the minimum cardinality is one mandertary entity (10)

Relationship degree The relationship degree indicates the no.of associated entities or types with the relationship.There are 3 of relationship 1- unary (Reairsive) This is the relationship between the same entity class or type. Binary Relationship This is a relationship between 2 different entity types.

Ternary Relationship This is a relationship between 3 entity type & it is a simaltion rishipamong 3 entity types at the same time. Find 5 types of unary , Binary, ternary relationship Unary

Person

Child

Talk

Individual

Cons ult

Turnary 1.

Company

Deal s

Customer

11 Note book

pen write

Play

moni tor

person

2.

Student

Text book Study

Teacher

(12) 3.

Company

Neeal

Staff

Customer

4.

Bank

Individual Need

Money

(13)

Binary 1.

2

Person

Individual

* Communicat

* eat

Language

Food

3.

*

Country Have

*

Company

4

Have 5,

*

Student

District

Staff

Subject

Study

Weak Entity Weak entity is one that needs e following conditions. 1- It cannot exist without e entity with witch have e relationship. 2- - It has a primary key that is partially or to fally dependent rom e parent entity relationship. Data models The evaluation of data model. Request for better data management has lot to do with several different models.That attempt to resolve & file system problems. The mager data models are 1- hierarchical model 2- - Network model 3- Relational model (14) Hierchocl model. This was e 1st database model interduced in mid 1960’s & was based on e hierarchical model. Which assumes that all data relationship can be structured as hierarchical. The hierchieal data model uses a simple approche where e relationship between entities are always 1:M, This forms a simple one parent record abcord above it & many child record below it. * Customer

* Invoice

Invoice line

This digram illustrate how can make a hierrchi showing e relationship between customer invoice & Showing e relationship between customer more invoice line A Cuctomer can own one or more invoice lines. In hierarchical database model these file would be tide together to e physical point. Al Pointer is a physical address which identify where a rewrd can be found on a disribed. Disadvantages. - Large amount of data is stored - There is only one path to access data item - Difficult in the real world. - Difficult to perform ad-hos quries Network Model The Network model was created to represent complex data relationship more effectively than e hierarchical model. In e network model any record main have many immediate parent records as well as many child records. The network model is able to model greater number of real world situation. The network model build around e concepts of “Set”. In a network Data base model a relationship is called set (There are 1:M relationship between e owner & e member) Each set is composed of atleaset 2 record type whwre e owner = hierarchical parent record & e member record = The hierarchical child record. The differents between e night inelude a than one set. In other words a member can have several owners. (15)

Supplier m 1 Ubrary

m Staff 1

m Library item

m m

1 Member

1 1

m Fine

m Reterence

Disadvantages - When compare to hierarchical structure the network strueture is more complicated by the set concepts - The hos quries are difficult to execute - The programmer should know the structure as well as set type which makes of the networking before processing is done. Relational Model Relational model user the table from of data collection, which is similler to hles. Tables within the relational model are known as relations. The records in the table are known as “tuples” in a relational model a relational or table consist of a series of rows the columns. A column is known as field where a row is to tuple STUDENT Field= Couumns Primary key

Reg No

St Name

St Add

Tel

00106

Krishi

Col-06

2361742 Rg.No

Record/ Tuple/ Row Stude Course code HND

Payment Amt

Reg No

10,000

00106

Forein key Relations can be related to each other by sharing common entity caracteristies. The common link between student & payment tables enable us to connect the student to the payment table. Even though their details are stored in seperrate tables this is done by the use of payment key – foreign key concept. This relational model contains 3 relation type normally 1-1:1 2-1:M 3- M:M A relational schema is a visul representation of the relational entities if attritutes of those entity & relationship. Advantages of relational model.

Paym

-

Essy to understand the concept users are able to interaet with the relational DB very easily since the familier to everyone. No physical pointers when compared to previous model. Easy setup & change can be done with minimal effort. Logical & physical independent.

Relational Data Base Model Keys 1- Primary Key- A Primary key is a attribute that can uniquely inentity given row. 2- Eg.In student table primary key can be taken as “Reg No” 2.- Composit key – Aprimary key can a single attribute or commpersision of several attributes. Which can be used to uniquely identify a row or tuple. A unique identifier cabn be made up of mony attributes & that is call a composite primarykey. Compsit key STUDENT

Eg: Reg No, Course Code

Course code Reg No 0001 0001 0001

HND Multi C++

3- Foreign key – An attribute (or combination of attributes) in one table whose values match the primary key in another table is called as foreign key. Foreign key is an attribute whose values match primary key values in the related table. STUDENT Reg No

PAYMENT RegNo**

4- Candidate key – We can identify a candidate key an attribute or combination of attributes that cmiquely identify a new in a table. A relation can have more than one candidate key among, among the one will be choosen as the primary key. Eg. Person ID

Project ID

Supervisor ID

Time spent

0001 0001

01 02

0002 0002

50 hrs

Candidate key – 1 – Person ID, Project ID 2- Person ID, supervisor ID Reg no

Name

Dob

001 002 003

Krishi Krishi Renu

22.10.96 19.11.96 22.10.96

Candidate key- 1- Reg No, DOB 2- Name , Reg No Relational Data Base Integrity Rules. 1. Entity Tntegrity Specially all primary key entries are unique & no part of a primary key may be null. This is to gurante that each entity will have a unique identity and ensure that foreign key values can be properly identified . Eg: No invoice can have a duplicate number as well as it cannot be null. All invoices uniquely identified by their invoice number. 11. Referential Integrity A foregn key may have either a null value as long as it is not part of its tables primery key or an entry that matches a primary key value in a table to Which it is related. This is to make. It possible for a attributes not to have a corresponding value but it should be impossible to have invalid entry . Eg: 1 M Teacher Student Teach TEACHER * Techer ID

TName

Qualification

Course

001 002

Mary Ann

Bsc Msc

Dip.Ict HND

STUDENT StReg No

Name

DOB

Course

Teacher ID

PG 0001 PG 0002

Jack Tom

25.02.80 08.11.86

HND Dip.ICT

001

1 1-

*

Employee

Superviser

Supervision

Supervisor

SupID

SName

Add

Department

Tel

001 002

Amal Kamal

Col-3 Col-4

1A 2A

256742 232786

EmployeeNo

EName

Department

SupID

001 002

Rohan Shovon

1A 2A

Null 002

We don’t need details about supervisor ID in a Employee table. 2.

Bank Account

Customer

Have

Bank Account

3-

Bank Type

Intrest Rate

Saving Current

0.05 0.15

Movie

Star Movie

MID 123 456

MName Star Moon

Language Tamil Tamil

Date 1/1/2000 22/2/2000

Star StarID

SName

Salary

MName

MID

001 002

Ananda Nalanda

10,000 150,00

Star Moon

123 456

We Need to identify about the movie ID in a star table iii.- Domain Integrity This concerns the values which are stored in a particular columns of a table. Relational Database operators (Relational Algebra) manupilahng tables and its contents using 8 relational operators. 1- Union 2- Intersect 3- Difference

4- Product 5- Select 6- Project 7- Join 8- Divide Union This combines all the rows from 2 tables. The tables must have the some attributes to be used in the union. Stno

1111 3333 5555

Name

Add

Tel

AA BB CC

Col-3 Col-5 Col-7

234567 543211 2113441

+

Stno

Name

Add

Tell

2222 4444 6666

DD EE FF

Col-4 Col-6 Col-7

2234415 5432167 5762134

StNo

Name

Add

Tel

1111 3333 5555 2222 4444 6666

AA BB CC DD EE FF

Col-3 Col-5 Col-7 Col-4 Col-6 Col-7

234567 543211 21`13441 2234415 5432167 5762134

Intersect Intersct output only the rows apperar in both tables. In this case the tables must be union compatible to produce a valid result. You cannot use intersect if one of the ottributes is numeric & the other is character base. Difference Production Soap Noodles Oil

Production Soap Morgarine Bread

Product Noodles oil Margarine Bread

Product Product outputs all possible pairs of rows from 2 tables. If one table has four rows & the other table has 3 rows a product output is a list of components of 4x3=12 rows. Itcode

Itdes

ItQty X

001 002

AA BB

10 20

ItPrice

Itrecord

OrdNo

100 50

2 5

111 222

Itcode

Itdes

Itprice

Itrecord

Ord No

001 001 002 002

AA AA BB BB

10 10 20 20

100 50 100 50

111 222 111 222

Select Select output values for nall rowsfound in table. Select can be used to list either all of the row values or it can output only. Those row values that match a specified criteria, In other words select outputs a horizontal substraction of a table.

Name

Item ID

Descriptin

Qty

0001 0002 003

AA BB CC

Shoe Noodles Bread

20 30 400

Output will be same as Item Table

Eg: 1- Select * from Item Eg: 2- Select * from Item where ItemID =”0001” ItemID 0001

Name AA

Description Qty Shoe 20

Project Project outputs all values for selected fields. In other words project outputs a vertical substraction of a table. Item 1 ItemID Name Description Qty 0001 AA Shoe 20 0002 BB Noodle 30 0003 CC Bread 40 Item 1 Item ID 0001 0002 0003

Name AA BB CC

Description Shoe Noodle Bread

Qty 20 30 40

Eg: 2- Project ItemID, Name from Item Item ID 0001 0002 0003

Item 2 Name AA BB CC

Divide Divide requires the use of one single column. Table 1 two column table A

B

ITcode

Description

0001 0002 0003 0004

AA BB CC DD

IT ode 0001 0003 0007 0008

Divide A/B

Description AA CC

Eg:2 Code A A B B B

Loc No 5 7 5 6 3

Q Table-1 Reg No 0001 0002 0003 0004

Name AA BB CC DD

Divide

Code A B

Loc No 5

Table - 2 Add Col-3 Col-4 Col-5 Col-6

Tel No 1111 2222 3333 4444

Reg No 0005 0006 0007 0008 0003

Name EF GH HI JK CC

Add Nuge Maha Kirula Katta Col-5

1- Union Reg No 0001 0002 0003 0004 0005 0006 0007 0008 Reg No 0003

Name AA BB CC DD EF EH HI JK 2- Intersect Name Cc 3- Product

Add Col-3 Col-4 Col-5 Col-6 Nuge Maha Kirula Kotta Add Col-5

Tel 3333

Tel No 1111 2222 3333 4444 5555 6666 7777 8888

Tel No 5555 6666 7777 8888 3333

Course Name HND NCC BIT ICT HND

Reg No

Reg No

Name

Add

TelNo

Name

Add

Tel No

0001 0001 0001 0001 0001 0002 0002 0002 0002 0002 0003 0003 0003 0003 0003 0004 0004 0004 0004 0004

AA AA AA AA AA BB BB BB BB BB CC CC CC CC CC DD DD DD DD DD

1111 1111 1111 1111 1111 2222 2222 2222 2222 2222 3333 3333 3333 3333 3333 4444 4444 4444 4444 4444

00005 00006 00007 00008 0003 0005 0006 0007 0008 0003 0005 0006 0007 0008 0003 0005 0006 0007 0008 0003

EF GH HI JK CC EF GH HI JK CC EF GH HI JK CE EF GH HI JK CC

Nuge Maha Kirula Kotta Col-5 Nuge Maha Kirula Katta Col-5 Nuge Maha Kirula Katta Col-5 Nuge Maha Kirula Katta Col-5

5555 6666 7777 8888 3333 5555 6666 7777 8888 3333 5555 6666 7777 8888 3333 5555 6666 7777 8888 3333

4-Difference Reg No 0005 0006 0007 0008

Name EF GH HI JK

Add Nuge Maha Kirula Katta

5-Select Name from Table 2 Name EF GH HI JK

Tel No 5555 6666 7777 8888

Course Name HND NCC BIT ICT HND HND NCC BIT ICT HND HND NCC BIT ICT HND HND NCC BIT ICT HND

6-Select * (All) From Table 2 Where Add ≠ Col-5 Reg.No 0005 0006 0007 0008

Name EF GH HI JK

Add Nuge Maha Kirula Katta

Tel No 5555 6666 7777 8888

7- Project Table over Reg No, Name Reg No 0001 0002 0003 0004

Name AA BB CC DD

Join Join allows us to combine information from 2 or more tables. Join has power for relational Databases alliwing to use of independent tabales to linked by the common attributes. Join is a result of 3 stages. Stage 1- The product operation is applied stage 2- select operation is applied stage 3- A Prosect operation is applied. CUSTOMER Code CName 0001 A 0002 B 0003 C 0005 D

AGENT CAdd Col-3 Col-4 Col-5 Col-6

ACode 2002 1001 2002 3003

A Code 1001 2002 3003

AArea Kirula Moha Kohu

ATel 2222 3333 4444

Product C code 0001 0001 0001 0002

Cname A A A B

CAdd Col-3 Col-3 Col-3 Col-4

Acode 2002 2002 2002 1001

Acode 1001 2002 3003 1001

Aarea Kirula Maha Kohu Kirula

ATel 2222 3333 4444 2222

0002 0002 0003 0003 0003 0004 0004 0004

B B C C C D D D

Col-4 Col-4 Col-5 Col-5 Col-5 Col-6 Col-6 Col-6

1001 1001 2002 2002 2002 3003 3003 3003

2002 3003 1001 2002 3003 1001 2002 3003

Moha Kohu Kirula Maha Kohu Kirula Maha Kohu

3333 4444 2222 3333 4444 2222 3333 4444

Per form the select condition where customer Acode = Agent Acod TABVLE – 1 Ccode 0001 0002 0003 0004

Cname A B C D

CAdd Col-3 Col-4 Col-5 Col-6

Project Table-1 over Acode Ccode CName 0001 A 2222 B 3333 C 4444 D Q. SNo 1111 2222 3333 4444

Product Sno 1111 1111 1111

SName AA BB CC DD

Sname AA AA AA

Acode 2002 1001 2002 3003

CAdd Col-3 Col-4 Col-5 Col-6

Course HND BIT VOW Multi Model

Course HND HND HND

Acode 2002 1001 2002 3003

ACode 2002 1001 2002 3003

CAdd L11 L12 L11 L12

Lecid L11 L11 L11

Area Maha Kirula Maha Kohu

A Add Maha Kirula Maha Kohu

ACode L11 L12 L13 L15

Lecid L11 L12 L13

Atel 3333 2222 3333 4444

A Tel 3333 2222 3333 4444

AAdd AB BC DE FG

ATel A B C D

Lecname AB BC DE

LecQual A B C

1111 2222 2222 2222 2222 3333 3333 3333 3333 4444 4444 4444 4444

AA BB BB BB BB CC CC CC CC DD DD DD DD

HND BIT BIT BIT BIT VOW VOW VOW VOW Multi Mid Multi Mid Multi Mid Multi Mid

L11 L12 L12 L12 L12 L11 L11 L11 L11 L12 L12 L12 L12

L15 L11 L12 L13 L15 L11 L12 L13 L15 L11 L12 L13 L15

FG AB BC DE FG AB BC DE FG AB BC DE FG

D A B C D A B C D A B C D

LECID L11 L12 L11 L12

LECNAME AB BC AB BC

Select Table .LECID = TABLE 2. LECID Table -3 SNO 1111 2222 3333 4444

SNAME AA BB CC DD

COURSE HND BIT VOW Multi Mid

Project Table 3 over LECID Sno Sname 1111 AA 2222 BB 3333 CC 4444 DD

CECID L11 L12 L11 L12

Course HND BIT Vow Multi mid

Lecid L11 L12 L11 L12

Lecname AB BC AB BC

LECQUAC A B A B

LecQual A B A B

Relational Analysis There are 2 ways of describing relational model. One method uses the tearms relation, attributes & tuples. The other method describes the model using more familier tearms that is table, field, record/ row. The relational model describes the data as a set of relations & table. Each relational table has a table name with set a attributes & each attribute having a unique name. The relation attributes are same as table columns or fields. Each relation has set of tuples/ records in a table.

There is a close crospondence between the ER model & Relational model. ER View Employee

Project

*

Employee

Project Assignment

Project

EmpNo

Employee Empno 1111 2222 3333 4444

Name AA BB CC DD

ProNo

Add Col-3 Col-4 Col-5 Col-6

Tel 1234 1235 1236 1237

Project Prono 001 002 003 004 Emp no

Protitle

Duration 6 month 12 Month 8 Month 6 Month Pro no

In relational analysis the notation we would use is as follows. Employee (Empno) Name, Add, Tel No) Project ( Prono, Pro title, Duration, Budget) Project Assignment (prono) ,Empno, Time spent)

Budget 20,000/= 80,000/= 35,000/= 25,000/= Time spent

In this notation each relation is represented by one line. Each relation is represented by one line. Each line starts with the relational name & it is followed by the names of the relational attributes with in the brackets. The under- lined lined attribute 9s) is the relations,s keys (primary keeys) If you can organize data into set of such relation then we can gurantee a good DB design. A Number of normal forms have been defined to eliminate duplications in relations. They are commonly known as - 1st Normal Form (INF) - 2nd Normal Form (2NF) - 3rd Normal Form (3NF) - Boyce code Normal Form (BCNF) Normalization What is Normalization is a formal process for designg which attributes should be grouped together in a relation. Before proceding with the physical design we need a method to validate the logical design to this point. Normalization is a tool to validate & improve a logical design. It needs to satisfy some constrain. Such avoide unnecessary duplicateting data normalizations is a process of decomposing relations with anomalies to produce small well structured relations. Normal form are the ruled for structuring relations these were indroduced by E.F.Cood. Later Dr.Cood proposed another normal form call boyce could normal form (BCNF) Reason for normalization data structure to avoid data redundancy – If a creting fields occurs in DB several times errors can are 3 types of anomalies exist 1- Insertion Anomaly 2- Deletion Anomaly 3- Updation Anomaly Insertion Anomaly Employee Empno E100 E150 E300 E350

Name Smith James Ann David

Add Col-1 Col-2 Col-4 Kandy

-------

Empno C-ID

CName Dura

Price

E100 E100 E300 E100

DICS IAD BSC MCSE

65000 10900 12500 15000

D-100 D-1100 D-25 M-100

5m 1yr 7yr 5yr

Supose that we need to add new course details to EMP_ course table where the PK is EMPNo, C-ID. To insert a new row the user must supply the values for both empNo & C_ID. This is an anomaly since the user should be able toenter the course details without supplying the EMPNo. Deletion Anomaly Suporse that the data for EMPNO E100 is deleted from the table this will result in loosing the information to the course name,IAD ,DICS,MCSE To voide these anomalies we can apply normalization process for the data. Unnrmalization form

Tables with repeating groups Remove repeation 1st normal form Remove partial dependencies 2nd Normal form Remove transitive dependencies 3rd Normal form Remove remaining anomalies

Boyce code

Determinency & Dependency During normalization logical assouations between data items are identified & represented in the DB Design without any file maintenance anomalise. Identitifycation of logical associations between data items is important in normalization & DB design. Such associations between data items are calle determinency & dependency relationship . Eg: If we can difermine a specific valie of data item “V” If we know the data item determinant & Data item V set to be the dependent. Therw are 2 types of deteminency dependency relatitionship 1- Functional dependenty 2- Non functional dependency Functional dependency Normalization is based on the analysis of functional dependency, A functional dependency is a relationship between 2 attributes Eg: - A relation “R” has an attribute A for every value instance of A this is value of a uniquely determines the value of B for a give. Non Funtional dependency For a given data item if we can find several value of other data item then there is a non functional dependency between 2 data items Eg::- A student No can have several subjects related. Candidate key This is a determinant that can uniquy identified non key attribute are functionally dependent on part of the primary key. Eg:- Emp Course (EmpID,Name CName, CDuration) Course name & Duration depends on parts of the primary key that is empid the other part will be linked by CID. Moltivalued dependency This is a type of dependence of exist where there are atleast 3 attributes in a relation Eg: A,B,C,Relation

Each value of a there is a well defined value for B & well defined value for C But the set of values of B indepennent of C Eg:

Subject DBD DBD

Teacher AA BB

Tex Book Text 1 Text 1

Transitive dependency This is funtionnal dependency between 2 or more “non key” “Attributes in a relation. Eg: Consider thye following relation Eg: Consider the following relation Sales (CNo,Name, Sales person, region) In the above relation each sales person is asigened to a unique region We can identify that functional dependence exist in tne sales relation i- Customer No

2- Sales person

Name Sales person Region region

The region is functionally dependent on sales person & sales person is functionally depend on CNo, AS a result there are update anoma lies in the sales relation. Eg: We can,t add new sales person to a new region without a customer. If we delete a customer we,ll loose the sales person information aswell. 3- If a salesperson is transferred to a new region then the several rows must be changed. Rules of Data Normalization. 1- INF- Elemunate repeating groupsMake a Elemunate repreating groupsMake a separate table for each set of related attributes & give each table a primary key. 2- 3NF- Eleminate columns not depend on key- If attribute do not contribute to the description of key remove that to a separate table. 3- BCNF- This is Boyce codd Normal dependency between candidate key attribute sdperte them to different table.

Eliminate repesting Groups. In the original member list,each member name is followed by any database that the mwmber has experience with . some might know many, & others might not know any. To answer the

question, who knows DB2? “we need to perform an awkward DB2?” we need to perform an awkward scan of the list looking for reerence to BB2.This is inefficient & extremely. Untidy way to store information. Moving the known DB into a separate table helps a lot. Sepatating the repeating groups of DBS from the member information results in INF. The member ID in the Database table matches The PK in the 2 tables with a join operation. Now we can answer the question by looking in the database table for “DB2” & getting the list of member.

DataBase Table

Member list 1. Jojn SmithAccess DB2 FoxPro 2. Dave jones dBase,Clipper 3. Mike Beach 4. Jerry Miller DB2, Oracle 5. Benstuary Oracle, Sybase 6. Fred Flint Informix 7. Joe Blow 8. Greg brown Access , Missq Server 9. Doughope

2. Eliminate Redundant Data

Member table MID Name 1. Johnsmith 2. David Jones 3Mile Beach 4.Jerry Miller 5. Benstuart 6. Fred Flint 7. Joe Blow 8. Gre Brows 9. Dougltope

DID MID Database 1 1 Access 2 1 DB2 3 1 Foxpro 4 2 dBase 5 2 clipper 6 4 DB2 7 4 Oracle 8 5 Oracle 9 5 Sybase 10 6 Infomix 11 8 Access 12 8 Mssqlserver

3. In the DB Table, & PK is made up of the member ID & Databade ID. This makes scnse for other attributes like “ where learned” & “Skill level” attributes, since they will be different for every member/ database combination. But the database name depends only on the Database. The same database under different IDs. This is an update anomaly. Or suppose the last member listing a particular database leaves the group. This records will be removed from the system, & the data base will not be stored anywhere . This is a delete anomaly. To avoid these proplems, we need 2NF. To Achieve this ,separate the attributes deperching on both parts of the key from those depending only on the Database ID. This results in 2 tables: “ database” Which gives the name for each database ID , and “Member database” which list the databases for each member. Now we can recassfica database in a single operation: look up The database ID in the ‘ Database’ table & change its name. The result will instantly be available throught the application. Database Table DID 1 2 3 4 5 6 7 8 9 10 11 12

MID Database 1 Access 1 DB2 1 Foxpro 2 dbase 2 Clipper 4 DB2 4 Oracle 5 Oracle 5 Sybase 6 Informax 8 Access 8 Mssqlserren

Member Table MID Name 1 2 3 4 5 6 7 8 9

Database Table DID Database 1 Access 2 DB2 3 FoxPro 5 dBase 6 Clipper 7 8 9

Member Database Table MID DID 1 1 1 2 1 3 2 4 2 5 4 2 4 6 5 6 5 7 6 8 8 1 8 9

3 Elimnate columns not Dependent on key The member table satisfies INF – It contains no repeating groups. It satisfies NF – since it does’t have a multivalued key. But the key is member ID,& the company name & location describe only a company, not a member. To achieve 3 NF, They must be moved in to a separate table. Since they describe a company, company code becomes the key for the new “company” table. The motivation for this is the same for 1NE : we want to avoid update & the IBM were currently stored no members from the previous design, there would be no record to its existence, even though 20 past members were from IBM:

Member Table MID Name Company 1 Johnsmrth ABC 2 Dave Jones MC1 3 Mike Beach IBM 4 Jerry Miller MCI 5 Ben Stuart AIC 6 Fred Flint ABC 7 Joe Blow RuNuts 8 Gerg Brown XYZ 9 Doug Hope IBM

Member Table MID Name 1 2 3 4 5 6 7 8 9

CID 1 2 3 2 4 1 5 6 3

Com Loc Alabama Florida Delaware Flnda Nebraska Alahama Iowa Now York Delawre

Company table CID Name Loc 1 2 3 4 5 6

BCNF Boyce – code Normal 1 form states mathematically that A relation R is said to be in BCNF if whenever X->A holds in R ,and A is not in X, then X is candidate key for R. BCNF covers very specific situations where 3 NF misses inter- dependencies between nonkey (but candidate key) attributes. Typically, any relation that is in 3 NF is also in BCNF if (a) there are multiple candidate keys, (b) the keys are composed of multiple attributes & (c) there are common attributes between the keys. Eg: Classcode 503 540

Class Enrolment Class Discription Mgt Info Maths

Student No 0001 0003 0005 0002 0004

Name AA BB CC DD EE

1- Convert into INF 2- Repealing group exist within this data each class code have any no. of student init, so the student information makes the repeating groups. Data cannot be stored or processed in a database when it this from. What we musthave is one record containing all the data for each student. INF Class code 503 503 503 540 540

Class discription Mgt Info Mgt Info Mgt Info Maths Maths

Student No 0001 0003 0005 0002 0004

Name AA BB CC DD EE

Enrolled in a class, there can be no “gaps” in the data when stored in a file The above table is in the INF no repeating data & no gaps. To take the table to the 2NF we would have to remove partial dependency. Class Information Student Info

Class code 503 540

Class Discription Mgt Info Maths

Class code 503 503 503 540 540

Student No 0001 0003 0005 0002 0004

Name AA BB CC CC EE

Non loss decomposition This is the process of transforming an unnormalized data set in to a fully normalized database. Relational/ Bracketing nitation If we are going to represent all normalized table within the system it’ll become very complexed, to avoid this , we can use the bracketingnotalion which can represent table by on statement. It’ll start with a table name followed by the key (s) attributes are separated with commas as well as primary keys are underlined & the foreign key shown with** Emergence of the DBMS Enter the database management system the simple approohe open data file for input soon became fraught to be addressed. Computer system vendors needed to be able to support the critical needd of a growing & evolving market palce that supported the data processing needs of organization in all fields of human endevaour. Thhe first DBMS appeared during the 1960’s at a time in human history where projects of momentous scale were being contemplated planned engineeed never before had such large data problems on the flow where identified & solutions were researched & develop ofen in real time The BBMS became necessare because the data was for more volatile than had earlier been planed & because there was still major limiting factors in the cost associated with data storage media. Data grew as a detailed transations by transation levels In the 1980’s all the major vendors of hardware system large enough to support the e volving need of evolving computerized . record keeping systems of large oranizations bundle some from of DBMS with their system solution. The 1st DBMS species were thus very much vendor specific. IBM as usual let the field but there were a growing no of competes and cloneds whose DB solution offered varing entry points in to the bandwagon of computerized record keeping system. Through this time the specifc nature of the problems being resolved & were around from the perspective of IT management were evolving with the teahnology & the no of main frame & mini computer hardware vendors in creased & the no of pheriperal types & their vendors also

increased. The bundeling of DB operational sevices such as the ability to perform & Schedule data backup became routine. The IT operational environment at this time was categorized with a collection of system house keeping task with major emhersis on organization produetion DB backup. File recoranization & reindex ation of data were also standard but mainly manual system house keeping system task. Other tasks incude storage media arehival management of document & records or business perticilarly those relste to finance record keeping. MedPix medical image database uses helthy dos of foss MedPix is a sprawling online medicimages DB & diagnostic tool that’s used around The wirld by radiologist, nurses, physicians & medical students & the whole system is powered by linux & open source software. MedPix is hosted by the US federal government’s helth sciences university, the uniformed services university in Bethesda, Maryland. It ‘ s the brainchild of JamesG. Smirniotopoulos, M.D., a USU professor of Radiology, Neurology & Biomedical informatics & clinical sciences chair of it’s Department of Radiology & Radiological sciences. Exploiting the DBMS (Selecting a DBMS) The DSMS is a software application system that used to create, maintain & provide control to create, maintain & provide control to user database Eg: It provides a user interface to the data base system. Data management layer The ANS1 / SPARC architechture A DBMS works as an interface between application progras of the users & the data base. The American National standard institute standard planning & requirements comitte propose a 3 level architecture for this interface such as. External/ User view This level describes The user application programa viwew of the database several users can share the same view. Conceptual/ l0gical view This describes overall data requirementd as a detailed technology ideperdent speafication of the over all structure of a database.Entity- Relatiorship modeling & object – Oriented modeling are 2 diffw\erent graphical notations used for presenting the comceptual notation used for presenting the conceptual mobbed. Physical / Internal view This describes in which the data is stored & data is accessed

The above 3 layer of architecture create independence of data at 2 level. Logical Data independence This refers to the ability of changes to the external Without Eg:- Phys any influence to the external cchema. Physical Data independence This refers to the ability of introduang changes to the physical sohema without any influence to the logical schema Eg: Chaging the storage structure of data without any influence to the wonceptial Schema.

Toolkit Interface

The DBMS Consist of 3 parts. 1. kernel (core- central point0 2. Interface 3. toolket Kernel. The DBMS kernel is the central ergine which handles the rain data management functions. Most DBMS are installed on thetop of some operating systems. Therefore DBMS needs to interact with operating systems to implement & access the data & system calalogece which are usually stored on harddisk. Therefore they interact with some of the operting system elements such as i.- file manager which translate between the data structures manipulated by the DBMS & the files on the harddisk. ii. acces mechanism – the file manager doesnal manage the physical input & output of data directly . It interact with appropriate access mechnisum established for different physical structures.

System buffer – The reading & writing data is normaly stored in the system buffer of the operating system. Functions of DBMS. - the DBMS must be able to create, retrieve, update, delete the data from the data base (CRUD Function) -

Data Integrity & data seairity

-

Data Communication

-

Conarrency cCntrol

-

Data recovery & backups

-

Query optimization

-

Managing transaction

-

Scheduline transaction

-

Data Dictionary

Software components available in DBMS. -

DDL Compilers DDL Compiler

Data dictioraiyment

Query processor & runtime processor User Queries

Query

Runtime Processor

Pre- compiler Application programs

Precompiler

DML Compiler

Host lang

DML Compiler Host Language Compiler Data dictionary manager Data base features Data independence The srperation of data descriptions/ definitions from the applications that uses the data In a relational model the logical independence & physical independence is achieved by allowing changes to the physical storage or the application programs to take place without the user being aware of the changes. Data abstraction Data abstraction is the process by which a data base attempts to represent properties of objects in the real world. A database records. The relevant details required to support some organizational activities & it does not record all the details. The database is an abstraction of the real world. The abstraction level varies according to the different commercial databases. Data Security Data security is the process of protecting the data from external thrects. Data is key resource for organization. Loss of data , Privacy , integrity, availability etc are issues that cost an organization in financial terms. Data Interation

A data base should be a collection of data which at last has no redundant data. The aim is to hare 1 database storing 1 logical item of data in one place which can be accessed by a range of information systems. Eg: Replacing several departmental data base in one company with one data base system that can be nade accessible to several departments. Physical Data base design. The process of physical database design is to translate the logical description of data in to the technical specification for retieving & storing data, the goal is to create the required performance & ensure the database integrity, security & recoverability. Physical database design must be performed carefully since the decisions made during this stage have major impact on data accessibility , response time , security , user friendliness , ect. Physical database design doses not include implementing database but it produice a technical specification that programmers & others involved in information system construction There are 3 major inputs to physical database design Logical database structure that were developed during logical design. This structure may be expressed in itierarachical, Network, relational object oriented & definition for each attribute. User processing requirements that were identified during requirement definition including sign , when data is used & where response time , time , data ,security , backup & recovery , integrity & retention of data. Description of technology used for omplemnting the databases (DBMS0 There are several critical decisions that ‘ll affect the integrity & performance of the applications system. These key decisions are - data volume analysis - transaction / usage analysis - Contro & security analiysis - Destribution analysis - Integrity analysis

Integrity Analysis Tntegrity defines the business rules or constraints that should apply on the data. There are 4 basic types of integrity rules. Entity Integrity This define that each and every entity should have a unique identifier a primary key and it cannot be null.

Referential Tntegrity This refers to the rules concerining the relationship between the 2 entity types (1e) by defining foregn key relational data model. Domain integrity This refers to the eonstraction valid values for attributes. We can achieve the domain integrity for an attribute by specifying the following. - The type of the attribute - The length - Format range - Allowed values - Null support or Not - Check constraints Triggerring Operations All the other business rules that protect the validity of an attribute value defined as a tribber A trigger is a program / proceduire which is aulmatically executed due to an event. An event can be either Insevetion , deletion / updation. User Rule: check with drawal amount
Setn up integrity constant expressed in some data definition language and set of a additional interity constraints expressed in some data interity language A set of distribution strategies A set of defined users A plan for securing data Database Implementation decision 5 options are available to the relationan DB developers in fine tuning the application to increase the performance of the application. Establishing the storage structure the associated mechanisms Adding Indexes Denormalization Exploiting the DBMS Implementing integrity constraints Establishing the storage structure & Access mechanisms. The criteria that are normally important in selecting the storage structure are Fast access for retrieval High throughput for processing transcaction Effective use of storage space Protection from failures Minimizing the need for reorganization Accommodating a growth Securing from unaithorised MSE. The designeres must select the file organization tha provides areasonable balancing among the criteria withing the resource available Mechanisms available for storing data in a relational system are Sequeritial files Hash files Clustering files Indexed sequential files. Adding Index Index is a table or other data structure that is used to determine the location of the row in a table that satisfies the condition. Indeex may be defined on both primary key value or non key attribute value.(Secondary Index) Indexing improves the access performance but if reduces the update performance because every time we add a record to a table, each index on that table has to be update, we always index on primary key & foreign key but sometimes we index on other nonkey attributes of a table when they are extremely important in the running of some reports.

De – Normarisation The main problem with fully normalized database is it usually made up of many tables. To perform queries , such tables have to be reconstructed using join operations. There fore some database developers occasionally choose to implement a relational database that purposely violation the principles of normalization. Their goal is to reduce the no.of physical database tables that must be accessed to retrieve the desired data by needd. But denormalization leade & it increases the change of errors and inconsistercies. Situations in which you should consider to denormalise are 2 Entities with a 1-1 relationship. If one of the entity’s is an optional participant. If the matching entity exist most of the time, then it may be advisable to combine the 2 entities in to one table. Many – many relationshop with a non key attribute – rather than combining 3 tabless to extract the data from the 2 entities in a relationship , if may be advisable to combine the columns from 1 of the entities in to the tables representing the M-M relationship which avoids 1 joins in many queries. Reference Data – If exists in an entity on the 1 side of 1 – M relationship & this entity participates in no other data relationship , then wer can consider of mering these two entities. Integrity constraints There are 3 main ways of implementing constraints. Inherently – We should at leaset be able to specify entity , referentoral & domain integrity. Procedural – Constraints occus outside the data model. Most existing data model. Most existing application programs that interface with database implements the constraints procedurally. Nonm – procedural – constructing the constraints by implementing triggers & stored procedures. Transaction management & concurrency control what is transaction. Transaction management & concurrency control what is transaction. In database term a transaction is any action that read from db or write on db. Transaction may consist of a simple select statement to gwnerate list of table contains. If may consist of a serious of insert statements to add rows to , or moretables or if may consist of combination of select , update , insert statements. Transaction management with SQL Amaerican national standard Institude (ANSI) has defined standards that Govern SQL DB tranactions. The transaction support is provide by the SQL statement commit & roll back.

Commit A commit statement is reached in which case all changes are permanently recored with the DB> The commit statement automatically ends the SQL statement. Roll back A roll back statement is reached in which case all the changes are dborted & DB is rolled back to its previous constant state Trasaction Lock The DBMS uses a transaction lock to keep track of all transaction to update the DB. Information stored in this lock’ll be used by the DBMS for recovery requirements though the roll back statements. 1- A transaction lock stores a record for beginning of transaction 2- Each transaction component 3- The pending of the transaction TRA TRA PRE TRA NEXT OPERA TABLES POW ATTR BE FOR AFTER ID NO TRA TION ID IBUTES Value VALUE 341 354 361 365

101 101 101 101

Null 341 352 303

352 363 365 Null

Start Uptate Update Commit

Start Product 1558 Customer 1001 End of T/A

Product 25 Customer 525

23 615

Concurrency Control The condition of the simultaneous execution of transaction in multiuser DB system is known as concurrency control. The objective of concurrency control is to ensure the serializability of transaction in a multiuser DB environment conairrency control is important because the simultanius execution of trasaction over shared DB can create several integrity & consistency problems. The 3 main problems are last update Uncommitted dependency Inconsistance retrieval. Last update Most common problem that is encounted where multiple users attempt to update DB without proper concurrency control. Transation A Check balance (10,000) Take (2000) Check balance (8000)

Transaction B 10.00 10.05 Check balance (10,000) 10,10 10,20 Take (7000) 10,30 10,40

The effect of transation is lost due to interference during the transation. Uncommitted dependency problem Uncommitted dependency problem arise If 1 transation is allowed to retrieve a record that has been update by another transaction but has not being committed by other transaction. TransationA (Check balance (500)

Transation B 10,00 10.05 transafer 200 10.10 10,15 10,20

Take (1000) Check balance (200)

Transaction A depends on an uncommitted transaction at 10,05 there fore the end result transaction A is incorrect because an uncommitted value may be changed due to some problem in the other transactions. Inconsistency Analyst (inconsistence refrieval) Inconsistency retrieval occur when transaction calculate some summary functions ahile other transactions are updation the data. The problem is that the transaction might read some data before they are changed & another data after they have changed this will result some inconsistency data. Trasaction A Sam = 0

Transation B 10,30

Open A C1 (Sum = sum +500 Open AC 2 (Sum = Sum +100)

10,30 10,36 transfer 50 from AC2 to AC 3

Open AC 3 (Sum = Sum + 200) AC1

AC2

50

100

AC3 250

Trasaction A should end up with an answer as 300 but the produced answer is 35. this is due to transaction A has considered an inconsistent situation of the DB. There fore the transaction has performed an inconsistent analyst. Concurrency control with locking methods Locking mechanisms are the common types of any data that is retrieved by a user for updating must be locked or denied to other users until the update is completed. Types of locks Shared lock (slock/ Read Lock) A transaction should place a shared lock on record who it’ll only read that record. Shared lock allows other transactions to read a record plaung an slock a record prevents another user from plaung an exolusn lock (updating, write) on the record. A

S

B

Exclusne lock (Write/ Lock) Exculusive lock prevents another transaction from reading & there fore updating a record until it is unlocked. A transaction should place an exlusive lock on a record when it is above to update that record. A Take B Chek Money money Data Read Data lock When 2 or more transaction have locked a common resource & each waits for other to unlock that resource. Database security DB security refer to a process of protecting a ab against accidental or international losses or misuse. A database administrator is responsible for the security of the db. The follwing are the different types of threats available for the db. 1- possibilities of the it & frauds – the data related to an organization may be illegally extracted without any permission by a hacker.

2- Possibility of loss of confidentiality an unauthorises person or a hacker try to access the corporate date available on the db. 3- Loss of Privaoy – An organization keeps data related to individual persons & viewing The personal data by others,ll lead to loss of privay. 4- Loss of Integrity – The integrity may be lost due to not implementing the interity rules properly due to hardware or software problems. 5- Loss of availability – The availability of the database may be lost due to different reasons such as matural disasters floads, fire, bomb attack, virus attack. The DBA is responsible for database security they secuirity measures on database can be defined in 2 different ways Computer based measurements Non computer based. Computer based Measures Haring an authorization on the db system by the operating system on which it is running (username, password os + DB) Having an authorization strategy to grant privell ages to the userson the DB tables. - creating views on db tables & giving access to user to those views. -

taking backups of data in DB maintaining done on the db or to trace any changes

-

Applying ercription strategy on more important data.

Non computer based measures. - sperate the duties of the users on the D, - Destroying the oopies of data without discarding them. - Use the fire – pro of storage mechanisms for storing lockeyp oppies - Apply physical access a secure environment. - Eg- locking , Assigind asecirity quard Database Administration It is a technical function that is responsible for physical DB design & for decurity enforcement, DB performance, backup & recovery In a large database system, DBA functions is carried out bythe DB administration or DBA team The DBA have the following core responsibilities. Administration of the DB The main activities of DBA in administrating the database the are physical design of the implementation. -

Data standards & documentation Monitering data usage & tuning DB structures.

-

Data Archiving

-

Data backup & recovery

Administration of the DBMS The following are the main activities of a DBMS - Installation of the DBMS - Configuring the system - Monitering the DBMS usage - Turing the DBMS The DB environment Administration The following activities are invoived with the DB environment administration. -

Access control 1:- This is dealine with or ceting users, assigning passwords, gran giving access to DBMS facilities Impact Assessment Privancy, security, Integrity control Education & training.

Related Documents


More Documents from ""