Process of developing data structures from user requirements •
logical structure
•
physical structure
GOALS I. information content requirements II. natural and easy-to-understand III. processing and performance requirements
Database Design Steps Proj.Scope Requirements Collection and Analysis DB Requirements Conceptual Design Conceptual Schema (in high-level data model Logical Design E.g. ER model)
DBMS-independent
Internal Schema Physical Design Conceptual Schema (in DBMS specific data model e.g. relational model)
DBMS specific
Database Design Steps Requirements Collection and Analysis
Database Design Steps Requirements Collection and Analysis
I.Requirements Analysis Purpose: identify/describe data required by users Input: Functional and Data requirements Output: User specifications
Database Design Steps Proj.Scope Requirements Collection and Analysis DB Requirements Conceptual Design Conceptual Schema (in high-level data model Logical Design E.g. ER model)
DBMS-independent
Internal Schema Physical Design Conceptual Schema (in DBMS specific data model e.g. relational model)
DBMS specific
Database Design Steps
Conceptual Design
Database Design Steps
Conceptual Design
Database Design Steps
Conceptual Design
Database Design Steps
Conceptual Design
Database Design Steps
Conceptual Design
Database Design Steps Conceptual Design
Database Design Steps Conceptual Design
II.Conceptual Design
Purpose: synthesize diff. users' views to global database design Input: User requirements from (I) & functional requirements Output: high-level data model
Database Design Steps Proj.Scope Requirements Collection and Analysis DB Requirements Conceptual Design Conceptual Schema (in high-level data model Logical Design E.g. ER model)
DBMS-independent
Internal Schema Physical Design Conceptual Schema (in DBMS specific data model e.g. relational model)
DBMS specific
Database Design Steps III. Logical Design (implementation design)
Purpose: to map conceptual design into specific DBMS
Logical Design
Database Design Steps Proj.Scope Requirements Collection and Analysis DB Requirements Conceptual Design Conceptual Schema (in high-level data model Logical Design E.g. ER model)
DBMS-independent
Internal Schema Physical Design Conceptual Schema (in DBMS specific data model e.g. relational model)
DBMS specific
Database Design Steps IV. Physical Design
Purpose: Concerned with factors relating to performance Physical Design
Conceptual Modeling Objective: to produce HIGH-LEVEL
DATA MODEL GOALS :
•A complete understanding of the database structure, meaning (semantics), interrelationships, and constraints •A stable description of the database contents •Usually more expressive and general than data models of individual DBMSs
Qualities of Conceptual Data Model Expressiveness
Able to distinguish different types of data, relationships and constraints
Simplicity
Simple enough for non-specialist users to understand and use its concepts
Minimalist
Have a small number of basic concepts that are distinct and non-overlapping in meaning
Formality
Formal, accurate & unambiguous specification of data
Entity/Relationship approach - one of the most well known modeling methods Developed by P.Chen in 1976 - many variations since then
Entity
A real-world object that can be distinctly identified May represent some real physical object
E.g., Goh Chok Tong is a government employee; Zoe Tay is an actress; my car is a Nissan Sunny
May represent some conceptual idea
E.g., SC304 is a course; Semester 1 2001/2002 is a semester
Attribute •Property of an entity •Value from corresponding value set •Simple versus composite
Simple attributes are atomic
E.g. tel#; part-color; basic-salary;
Composite attributes made up of simple attributes E.g., address = (apt#, block#, road name, postal code)
Single valued versus multivalued Single valued: single value associated with an attribute Multivalued: may have more than one values E.g., University degree attribute may Date of Birth contain B.Eng., M.Eng., or Ph.D. Stored versus derived Most attribute values are stored Derived from stored value AGE E.g., age from birthdate
! Don’t confuse Instance with Type
Entity type
A set of entities with same attributes It is a schema or intension for a set of entities sharing the same structure Schema (intension)
Instances (extension)
STUDENT: Matric#, Name, BirthDate
80808080E, J ohn Tan, 1-1-1980
COURSE: Code, Title, Instructor
CHILD: Name Date-of-birth Weight
SC102, Logic Design, Chan S. SC203, Algorithms, Huang S. Y. Alpha, 1st Jan.'96, 25kg Beta, 2nd Feb. '96, 28kg Omega, 3rd Mar.'96, 21kg
Key attribute
An attribute whose value is unique for each entity E.g., Matric# is key of STUDENT entity type Some entity types may have more than one key attributes E.g., STUDENT may be identified by Matric# or Student# or IC# Key can be composite attribute E.g., CarReg# can be (CarPlate#, State)
Relationship
An association among entities E.g., John Lee is registered for SC304; Dewey is a dependent of Donald Duck
Relationship type
A set of relationships of the same type Example of WORKS_FOR relationship between EMPLOYEE entity type and DEPARTMENT entity type
Employee
emp1 emp2 emp3 emp4 emp5
WORKS_FOR r1 r2 r3 r4 r5
Department
d1 d2 d3
Each instance r must relate one employee to one dept.
• Participation constraints – TOTAL PARTICIPATION. • Existence dependency : Entity exists only when it participates in a relationship. EMPLOYEE e1
SUPPORTS
DEPENDENT
e2
r1
d1
e3
r2
d2
e4
r.3 . .
d.3 . .
e5 . . .
Every DEPENDENT must participate in SUPPORTS
• Partial dependency: Entity may or may not participate in a relationship • For example, MANAGES has partial participation of EMPLOYEE and total participation of DEPARTMENT EMPLOYEE e1
MANAGES
DEPARTMENT
e2
r1
d1
e3
r2
d2
e4
r.3 . .
d.3 . .
e5 . . .
Not every Employee is a manager
Every Dept has a manager
ER NOTATION Entity type
Attribute Key attribute
Weak entity type Multivalued attribute Relationship type Derived attribute Identifying Relationship type
Composite attribute
Cardinality ratio N:1 Total participation of EMPLOYEE in WORKS_FOR EMPLOYEE
WORKSFOR
N
1
DEPARTMENT
Cardinality ratio N:M Partial participation of EMPLOYEE in ASSIGNED_TO EMPLOYEE
N
ASSIGNEDTO
M
PROJECT
Cardinality ratio 1:1 EMPLOYEE
MANAGES
1
1
DEPARTMENT
R
EMPLOYEE
(min,max)
(1,1)
(1,N)
E
WORK_ FOR
SUPERVISES
Structural constraint (min, max) on participation of E in R
(5,N)
DEPARTMENT
(0,1)
PROFESSOR
STUDENT
(1,N)
(1,N) TUTORS
• Naming conventions – Singular names for entity types – Upper case for entity types and relationships – Lower case (with upper case first letter) for attribute – Relationship names readable from left to right, top to bottom
• Role name – Role played by entity in relationship instance • E.g., relationship MARRIED between two persons where entity type is PERSON; role names distinguish both persons are husband and wife
Role name wife
child 1
N PERSON
IsMarriedTo 1
IsChildOf 1 parent
husband Ic-no
• Attributes of relationship – Relationships may have attributes • E.g., WORKS_ON relationship between EMPLOYEE and PROJECT may have hour as attribute EMPLOYEE
N
WORK_ ON
M
PROJECT
hour "hour" indicates the no. of hours spent by Employee X on Project Y
• Weak entity – Existence depends on some other entity • E.g., DEPENDENTS are weak entities; dependent on EMPLOYEE
– Has no key attributes; identified by being related (via an identifying relationship) to specific entities from another entity type (called identifying owner)
Partial key: Set of attributes that uniquely identify weak entities related to same owner entity • E.g., dependent name (names of dependents of same employee must be unique)
• Primary key of weak entity: identifying owner’s primary key + partial key Ssn is “social security no.”
Ssn
EMPLOYEE
Name 1
DEPENDENTS_ OF N Key = Ssn+name
DEPENDENT
Name Relationship
Another example: Participation constraint
CUSTOMER
MAKES
Due-date
Weak entity type
RENTAL
Indentifying relationship
RENTS
Rent-date
Payment
Partial key
VIDEOTAPE
A
B
C
D E
Example • Organization made up of various departments, each having a name, identifying no., and an employee A who is the manager. A department may be located in different places.
A
B
C
D E
• Information about employees include name, identification number, birthdate, address, sex, and salary. Each employee is assigned to one department. The date the manager is appointed to a department is also tracked. Employees may be directly supervised by another employee.
B
A
B
C
D E
• Each project within the organisation is controlled by a department. Employees (not necessarily from the controlling dept.) are assigned to projects.
C
A
B
C
D E
•
Information about projects include project name, no., and location. Hours spent by employees on each project are also kept.
D
A
B
C
D E
• Information about employees’ dependents are kept. These include name, sex, birthdate and relationship.
E
Enhanced ER Model (EER) – Superclasses/subclasses, attribute inheritance – Generalization/specialization – Membership constraint – Specialization hierarchies, specialization lattices, multiple inheritance
Superclasses/Subclasses and Attribute Inheritance • An entity type (set) may have additional sub-groups that are meaningful and need be represented explicitly because of their significance to the database application • Example: – Superclass: Employee – Subclass: Administrative Officers, Lecturers, Technicians, Clerical Officers, Cleaners
• Superclass/subclass relationship called an IS-A relationship – E.g., Lecturer IS-A Employee – Notation used is subset symbol ⊂ on connecting line from Superclass to subclass – “d” within circle indicates ‘disjoint’ – double line between Superclass and circle indicates TOTAL participation
• An entity in a subclass inherits all attributes of its Superclass
Surname
FirstName
Birthdate Address
Name ICNo
Sex EMPLOYEE
d
LECTURER HighestQual
TECHNICIAN LabLocated
CLERICALOFFICER TypingSpeed
Specialization and Generalization • Specialization is process of defining a set of subclasses of an entity type – Set of subclasses form a specialization on the basis of some distinguishing characteristic(s) – Define different roles for different entities of same entity type – Associate additional specific attributes to subclass – Establish additional specific relationship types between subclasses and other entity types or subclasses
• Generalization is the reverse process; to generalize a set of entity types into a single Superclass – Identify common features – Generalize into a Superclass; original entity types become subclasses – Example involves two entity types: CAR and TRUCK which are generalized to form a Superclass VEHICLE
NoOfPassengers
Price Colour
CAR
Price
TRUCK Height
Tonnage
LicenseNo
COE
LicenseNo
Price
LicenseNo VEHICLE
Colour
d
TRUCK
CAR NoOfPassengers
Tonnage
COE
Height
Aggregation
• In ER, a relationship type cannot be connected directly to another relationship type • To overcome this, aggregates are used • Example: CUSTOMER
Notice that connection to ARRANGED-BY is with outer rectangle (higher level entity) and not with BORROWS
BORROWS
ARRANGED-BY
BANK-OFFICER
LOAN Higher-level aggregate object
At the end of this chapter, you should be able to: 1. Understand a given ER/EER diagram 2. Draw an ER/EER diagram based on given specifications 3. Be aware of alternatives faced by designer and select the most appropriate one