File And Database Design

  • Uploaded by: api-27124815
  • 0
  • 0
  • November 2019
  • PDF

This document was uploaded by user and they confirmed that they have the permission to share it. If you are author or own the copyright of this book, please report to us by using this DMCA report form. Report DMCA


Overview

Download & View File And Database Design as PDF for free.

More details

  • Words: 3,304
  • Pages: 78
File and Database Design Data modeling with Entity Relationship Diagrams 2. File Access and Organization 1.

CSC 2202 Systems Analysis and Design

1

File and Database Design Objectives Define entity, file, record, and attribute and discuss the various types of keys Explain the concept of entity relationships diagram Define cardinality and its notation. Explain normalization Describe various types of files, including master, transaction, table, work, history, and security. CSC 2202 Systems Analysis and Design

2

File and Database Design Objectives Evaluate methods of file organization, including sequential, direct, and indexed Calculate file sizes and storage requirements Discuss file and database control measures

CSC 2202 Systems Analysis and Design

3

Introduction Data terminology and concepts Relationships among data objects Entity-relationship diagrams (ERDs) Normalization File organization and access methods File types File media, sizing, and controls CSC 2202 Systems Analysis and Design

4

Data Modeling with Entity Relationship Diagrams

CSC 2202 Systems Analysis and Design

5

Key Definitions Data model A formal way of representing the data that are used and created by a business system  Shows the people, places and things about which data is captured and the relationships among them. 

Logical data model 

shows the organization of data without indicating how it is stored, created, or manipulated. CSC 2202 Systems Analysis and Design

6

Key Definitions Physical data model  shows

how the data will actually be stored in databases or files.

Normalization is the process analysts use to validate data models. Data models should balance with process models CSC 2202 Systems Analysis and Design

7

What Is an ERD? A picture showing the information created, stored, and used by a business system. Entities generally represent similar kinds of information Lines drawn between entities show relationships among the data High level business rules are also shown

CSC 2202 Systems Analysis and Design

8

Using the ERD to Show Business Rules Business rules are constraints that are followed when the system is in operation. ERD symbols can show when one instance of an entity must exist for an instance of another to exist A

doctor must exist before appointments the doctor can be made CSC 2202 Systems Analysis and Design

9

Using the ERD to Show Business Rules ERD symbols can show when one instance of an entity can be related to only one or many instances of another entity 

One doctor can have many patients; each patient may have only one primary doctor

ERD symbols show when the existence of an entity instance is optional for a related entity instance 

A patient may or may not have insurance coverage CSC 2202 Systems Analysis and Design

10

An ERD Example

CSC 2202 Systems Analysis and Design

Entity A person, place, event, or thing about which data is collected Must be multiple occurrences to be an entity 

Example: If a firm has only one warehouse, the warehouse is not an entity. However, if the firm has several warehouses, the warehouse could be an entity if the firm wants to store data about each warehouse instance.

CSC 2202 Systems Analysis and Design

12

Entities and Instances

CSC 2202 Systems Analysis and Design

13

Attributes Information captured about an entity Only those used by the organization should be included in the model Attribute names are nouns Sometimes entity name is added at the beginning of the attribute name for clarity CSC 2202 Systems Analysis and Design

14

Identifiers One or more attributes can serve as the entity identifier, uniquely identifying each entity instance Concatenated identifier consists of several attributes An identifier may be ‘artificial,’ such as creating an ID number Identifiers may not be developed until the Design Phase CSC 2202 Systems Analysis and Design

15

Identifier Types

CSC 2202 Systems Analysis and Design

16

Relationships Associations between entities The first entity in the relationship is the parent entity; the second entity in the relationship is the child entity Relationships should have active verb names Relationships go in both directions CSC 2202 Systems Analysis and Design

17

Cardinality Cardinality  refers

to the number of times instances in one entity can be related to instances in another entity One instance in an entity refers to one and only one instance in the related entity (1:1)  One instance in an entity refers to one or more instances in the related entity (1:N)  One or more instances in an entity refer to one or more instances in the related entity (M:N) 

CSC 2202 Systems Analysis and Design

18

Modality Modality Refers

to whether or not an instance of a child entity can exist without a related instance in the parent entity  Not

Null means that an instance in the related entity must exist for an instance in another entity to be valid  Null means that no instance in the related entity is necessary for an instance in another entity to be valid CSC 2202 Systems Analysis and Design

19

The Data Dictionary and Metadata Metadata is information stored about components of the data model Metadata is stored in the data dictionary so it can be shared by developers and users throughout the SDLC A complete, shareable data dictionary helps improve the quality of the system under development CSC 2202 Systems Analysis and Design

20

Metadata Examples An Entity can be described by:

An Attribute can be described by:

A Relationship can be described by:

Name Definition Special Notes User Contact Analyst Contact

Name Description Alias Sample Values Acceptable Values Format Type Special Notes

Verb phrase Parent entity Child entity Definition Cardinality Modality Notes

CSC 2202 Systems Analysis and Design

21

CREATING AN ENTITYRELATIONSHIP DIAGRAM

CSC 2202 Systems Analysis and Design

22

ERD Basics Drawing the ERD is an iterative process of trial and revision ERDs can become quite complex

CSC 2202 Systems Analysis and Design

23

Steps in Building ERDs Identify the entities Add appropriate attributes for each entity Draw the relationships that connect associated entities

CSC 2202 Systems Analysis and Design

24

Identify the Entities Identify major categories of information If available, check the process models for data stores, external entities, and data flows  Check the major inputs and outputs from the use cases 

Verify that there is more than one instance of the entity that occurs in the system

CSC 2202 Systems Analysis and Design

25

Add Appropriate Attributes Identify attributes of the entity that are relevant to the system under development Check the process model repository entries for details on data flows and data stores  Check the data requirements of the requirements definition  Interview knowledgeable users  Perform document analysis on existing forms and reports 

Select the entity’s identifier CSC 2202 Systems Analysis and Design

26

Draw the Relationships Start with an entity and identify all entities with which it shares relationships Describe the relationship with the appropriate verb phrase Determine the cardinality and modality by discussing the business rules with knowledgeable users

CSC 2202 Systems Analysis and Design

27

ERD Building Tips Data stores of the DFD should correspond to entities Only include entities with more than one instance of information Don’t include entities associated with implementation of the system (they will be added later) CSC 2202 Systems Analysis and Design

28

Data Terminology and Concepts

Data design includes  Entities  Fields  Records  Files  Keys

CSC 2202 Systems Analysis and Design

29

Data Terminology and Concepts

Definitions  Entity:

a person, place, thing, or event for which data is collected and maintained

 Field

(attribute): a single characteristic or fact about an entity

 Record:

a collection of fields that describes one instance of an entity

 File:

a set of records that contains data about a specific entity CSC 2202 Systems Analysis and Design

30

Data Terminology and Concepts

Definitions  Key

field: a field used to locate, retrieve, or identify a specific record

 Primary

key: a key that uniquely identifies each record

CSC 2202 Systems Analysis and Design

31

Data Terminology and Concepts

Key fields  Primary

keys

A

field or combination of fields that uniquely and minimally identifies each member of an entity  A primary key composed of more than one field is called a multivalued key  Candidate

keys

 Any

field that could serve as primary key  Any field that is not a primary key or candidate key is called a nonkey field CSC 2202 Systems Analysis and Design

32

Data Terminology and Concepts

Key fields  Foreign A

keys

field in one file that matches a primary key value in another file  Example: the advisor number is a foreign key in the STUDENT file that matches a primary key value in the ADVISOR file  A foreign key need not be unique  A combination of two or more foreign keys can form a unique primary key value  Referential integrity ensures that a foreign key value cannot be entered unless it matches a primary key value in another file CSC 2202 Systems Analysis and Design

33

Data Terminology and Concepts

Key fields

 Secondary

keys

A

field or combination of fields that can be used to access or retrieve records  Secondary keys do not need to be unique

CSC 2202 Systems Analysis and Design

34

The goal of data Modeling To determine the required data contents of the main computer files in the new system OR (more likely) to determine the contents of data base tables i.e. to eliminate redundancy Data model forms the basis of the physical storage CSC 2202 Systems Analysis and Design

35

Entity-Relationship Diagrams Entity-relationship diagrams (ERDs)  An

ERD is a graphical model that shows relationships among system entities  Each entity is a rectangle, labeled with a noun  Each relationship is a diamond, labeled with a verb

CSC 2202 Systems Analysis and Design

36

Entity-Relationship Diagrams Entity-relationship diagrams (ERDs)  Types

of relationships

 One-to-one

(1:1)  One-to-many (1:M)  Many-to-many (M:N) A

full ERD shows all system relationships

CSC 2202 Systems Analysis and Design

37

The need for a data model Relationships and data structures in the system may be very complex 

Need to examine them independently of the processing.

The entity-relationship data model was first proposed by P. Chen (1976) - it has become widely used in database design. The resulting E-R diagram provides a graphical notation for Modeling the data CSC 2202 Systems Analysis and Design

38

Entity-relationship diagram components There are two main components in an E-R diagram:  entities

(object types)  relationships.

Entity is an object about which you want to store data Entities are represented by a rectangle CSC 2202 Systems Analysis and Design

39

Examples of ER Diagrams

CSC 2202 Systems Analysis and Design

40

Examples of ER Diagrams (cont)

CSC 2202 Systems Analysis and Design

41

Data Relationships and ERDs One-to-one (1:1) relationship  Exists

when exactly one of the second entity occurs for each instance of the first entity  Examples  One

office manager heads one office  One vehicle ID number is assigned to one vehicle  One driver drives one delivery truck  One faculty member is chairperson of one department

CSC 2202 Systems Analysis and Design

42

SC Fig 8.5 (1:1)

CSC 2202 Systems Analysis and Design

43

Data Relationships and ERDs One-to-many (1:M) relationship  Exists

when one occurrence of the first entity can be related to many occurrences of the second entity, but each occurrence of the second entity can be associated with only one occurrence of the first entity- Examples:  One

individual owns many automobiles  One customer places many orders  One department employs many employees  One faculty advisor advises many students CSC 2202 Systems Analysis and Design

44

SC Fig 8.6 (1:M)

CSC 2202 Systems Analysis and Design

45

Data Relationships and ERDs Many-to-many (M:N) relationship  Exists

when one instance of the first entity can be related to many instances of the second entity, and one instance of the second entity can be related to many instances of the first  Examples A

student enrolls in one or more classes, and each class has one or more students registered  A passenger buys tickets for one or more flights, and each flight has one or more passengers  An order lists one or more products, and each product is listed on one or more orders CSC 2202 Systems Analysis and Design

46

(M:N) *

CSC 2202 Systems Analysis and Design

47

Data Relationships and ERDs A full ERD shows all system relationships  Examples A

sales rep serves one or more customers, but each customer has only one sales rep  A customer places one or more orders, but each order has only one customer  An order lists one or more products, and each product can be listed in one or more orders  A warehouse stores one or more products, and each product can be stored in one or more warehouses CSC 2202 Systems Analysis and Design

48

SC Fig 8.8 Example

CSC 2202 Systems Analysis and Design

49

Data Relationships and ERDs Cardinality  Describes

how instances of one entity relate

to another  Mandatory vs. optional relationships  Crow’s foot notation is one method of showing cardinality  Most CASE products support the drawing of ERDs

CSC 2202 Systems Analysis and Design

50

SC Fig 8.9 Crow’s foot notation

CSC 2202 Systems Analysis and Design

51

ERD from CASE tool

CSC 2202 Systems Analysis and Design

52

Example E-R diagrams Assignment management system example Alternative method for showing relationships ... replace single and double headed arrows, or 0, 1, “crows foot” with the labels: (1) for one  (M) or (N) for many 

CSC 2202 Systems Analysis and Design

53

Example E-R diagram

CSC 2202 Systems Analysis and Design

54

Data Relationships and ER Diagrams Creating an ERD 1. Identify the entities 2. Determine all significant events or activities for two or more entities 3. Analyze the nature of the interaction 4. Draw the ERD

CSC 2202 Systems Analysis and Design

55

Connection between DFD and E-R Consider the following example Inventory data store on DFD E-R diagram showing how the Inventory entity is related to other entities Same E-R diagram showing attributes (data elements)

CSC 2202 Systems Analysis and Design

56

VALIDATING AN ERD

CSC 2202 Systems Analysis and Design

57

Design Guidelines Best practices rather than rules Entities should have many occurrences Avoid unnecessary attributes Clearly label all components Apply correct cardinality and modality Break attributes into lowest level needed Labels should reflect common business terms Assumptions should be clearly stated CSC 2202 Systems Analysis and Design

58

Normalization Technique used to validate data models Series of rules applied to logical data model to improve its organization Three normalization rules are common CSC 2202 Systems Analysis and Design

59

Normalization Steps

CSC 2202 Systems Analysis and Design

60

Unnormalized Entity Begin with an entity from the logical data model

CSC 2202 Systems Analysis and Design

61

First Normal Form (1NF) Look for repeating groups of attributes and remove them into separate entities

CSC 2202 Systems Analysis and Design

62

Second Normal Form (2NF)

entity has a concatenated identifier, look for attributes that nd only on part of the identifier. If found, remove to new ent

CSC 2202 Systems Analysis and Design

63

Third Normal Form (3NF)

or attributes that depend only on another nonidentifying att nd, remove to new entity. Also remove any calculated attribu

CSC 2202 Systems Analysis and Design

64

ERD after normalisation *

CSC 2202 Systems Analysis and Design

65

Balancing ERDs with DFDs All analysis activities are interrelated Process models contain two data components  Data flows and data stores The DFD data components need to balance the ERD’s data stores (entities) and data elements (attributes) Many CASE tools provide features to check for imbalance Check that all data stores and elements correspond between models  

Data that is not used is unnecessary Data that has been omitted results in an incomplete system

Do not follow thoughtlessly -- check that the models make sense! CSC 2202 Systems Analysis and Design

66

Steps in Database Design Four steps in database design 1. Create the initial ERD 2. Assign all data elements to entities 3. Create 3NF designs for all records, taking care to identify all primary, secondary, and foreign keys 4. Verify all data dictionary entries

CSC 2202 Systems Analysis and Design

67

Database Management Potential problems in a file processing environment  Data

redundancy  Inconsistent data  Inefficiency

Database management design A

database is a structure that can store data about many entities and the relationships among them CSC 2202 Systems Analysis and Design

68

Database Management Elements of database management systems A

database management system (DBMS) is used to create, access, and control a database  Data definition language (DDL)  Data manipulation language (DML)  Query language  Data dictionary  Utility programs CSC 2202 Systems Analysis and Design

69

Database Management Characteristics of database management A

DBMS is more powerful, complex, and expensive than a file processing system  Effective security, backup, and recovery procedures are essential  Data is a company-wide resource  Data mining enables users to extract information from anywhere in the organization CSC 2202 Systems Analysis and Design

70

File Access and Organization Types of files A file is a set of logical records that contains data about an entity  Master

files  Table files  Transaction files  Work files  Security files  History files CSC 2202 Systems Analysis and Design

71

File Access The file access method determines how programs read and write records Two main methods are used  Sequential

access method

 Program

reads all records in order until the end of the file is reached

 Random

access method

 Program

can read any logical record without having to access all preceding records CSC 2202 Systems Analysis and Design

72

File Media Types Technology has enabled many alternatives, each with advantages and disadvantages  Magnetic

tape  Floppy disk  Hard disk (fixed disk, direct access storage device or DASD)  RAID (Redundant Array of Inexpensive Disks)

CSC 2202 Systems Analysis and Design

73

File Media Types Technology has enabled many alternatives, each with advantages and disadvantages  Removable

disk cartridge  Optical disc (CD-ROM)  WORM  Mass storage system

CSC 2202 Systems Analysis and Design

74

File and Database Control Control must include measures to ensure that data is correct, complete, and secure  Passwords  Encryption  Backup

and recovery procedures  Audit trail files  Audit fields

CSC 2202 Systems Analysis and Design

75

Summary The ERD is the most common technique for drawing data models. The building blocks of the ERD are: Entities describe people, places, or things  Attributes capture information about the entity  Relationships associate data across entities 

The ERD must be balanced with the DFD.

CSC 2202 Systems Analysis and Design

76

Summary (cont) E-R Modeling is used extensively in database design. Data analysis is often considered more important than functional analysis - in a Computer Based Information System it is easier to change a program (which represents a process) than to change the underlying data structures (i.e. file or database) CSC 2202 Systems Analysis and Design

77

Review Questions 1. What is an entity-relationship diagram and what is the role of this diagram in the systems development process? 2. “Data is a valuable organisational resource and needs to carefully analysed and stored commensurate with organisational needs.” Discuss this statement. 3. What is normalisation and what role does it play in the data design process? 4. Describe the attributes of a data structure which is in 3rd Normal Form (3NF).

CSC 2202 Systems Analysis and Design

78

Related Documents

File And Database Design
November 2019 12
Database File
November 2019 15
Database Design
April 2020 14
Database Design
November 2019 20
Database Design
November 2019 31
Database Design
May 2020 15