Database Concepts For Cr

  • Uploaded by: api-3810976
  • 0
  • 0
  • November 2019
  • PDF

This document was uploaded by user and they confirmed that they have the permission to share it. If you are author or own the copyright of this book, please report to us by using this DMCA report form. Report DMCA


Overview

Download & View Database Concepts For Cr as PDF for free.

More details

  • Words: 1,239
  • Pages: 32
Database Concepts

What is a Database 

Personal address book in a Word document



Collection of Word documents



Collection of Excel Spreadsheets



Data collected, maintained, & used in airline reservation



Data used to support launch of a space shuttle Structured Set of Data

Models of Reality DML DATABASE SYSTEM

REALITY • structures • processes

DDL

DATABASE



Database is a model of structures of reality



Use of a database reflect processes of reality



Database system is a software system which supports definition & use of a database 

DDL: Data Definition Language



DML: Data Manipulation Language

When does DBMS is used         

Persistent Storage of Data Centralized Control of Data Control of Redundancy Control of Consistency & Integrity Multiple User Support Sharing of Data Data Independence Control of Access & Security Backup & Recovery

Data Modeling REALITY • structures • processes

 



DATABASE SYSTEM data modeling

MODEL

Model represents a perception of structures of reality Data modeling process is to fix a perception of structures of reality & represent this perception In data modeling process we select aspects & we abstract

Database Design purpose of database design is to create a database which 

is a model of structures of reality



supports queries & updates



runs efficiently

Database Terminology 

Data Models



Keys



Integrity



Triggers & Stored Procedures



Null Values



Surrogates



Normalization

Data Model A data model consists of notations for   

expressing:

Data Structures Integrity Constraints Operations

Data Model - Data Structures All data models have notation for defining:  entity types  attribute types  relationship types

FLIGHT-SCHEDULE

DEPT-AIRPORT AIRPORT-CODE

AIRLINE

WEEKDAY

101

delta

mo

156

101

atl

545

american

we

110

912

cph

912

sc&inavian

fr

450

545

lax

mo

231

242

usair

PRICE

FLIGHT#

FLIGHT#

Data Model Constraints Rules that cannot be expressed by Data Structures   

Static constraints apply to database state Dynamic constraints apply to change of database state E.g., “All FLIGHT-SCHEDULE entities must have precisely one DEPT-AIRPORT relationship FLIGHT-SCHEDULE

DEPT-AIRPORT AIRPORT-CODE

AIRLINE

WEEKDAY

101

delta

mo

156

101

atl

545

american

we

110

912

cph

912

sc&inavian

fr

450

545

lax

mo

231

242

bos

242

usair

PRICE

FLIGHT#

FLIGHT#

Data Model Operations Operations support change & retrieval of data: insert FLIGHT-SCHEDULE(97, delta, tu, 258); insert DEPT-AIRPORT(97, atl); select FLIGHT#, WEEKDAY from FLIGHT-SCHEDULE where AIRLINE=‘delta’;

 

FLIGHT-SCHEDULE

DEPT-AIRPORT PRICE

FLIGHT#

AIRPORT-CODE

FLIGHT#

AIRLINE

WEEKDAY

101

delta

mo

156

101

atl

545

american

we

110

912

cph

912

sc&inavian

fr

450

545

lax

242

usair

mo

231

242

bos

97

delta

tu

258

97

atl

Keys Keys are uniqueness constraints 



A key on FLIGHT# in FLIGHT-SCHEDULE will force all FLIGHT#’s to be unique in FLIGHT-SCHEDULE Consider following keys on DEPT-AIRPORT:

FLIGHT-SCHEDULE

DEPT-AIRPORT

FLIGHT#

AIRLINE

WEEKDAY

101

delta

mo

156

101

atl

545

american

we

110

912

cph

912

sc&inavian

fr

450

545

lax

mo

231

242

bos

242

usair

PRICE

FLIGHT#

AIRPORT-CODE

Integrity 





Integrity: – does model reflect reality well? – is model without internal conflicts? a FLIGHT# in FLIGHT-SCHEDULE cannot be null because it models existence of an entity in real world a FLIGHT# in DEPT-AIRPORT must exist in FLIGHT-SCHEDULE because it doesn’t make sense for a non-existing FLIGHT-SCHEDULE entity to have a DEPT-AIRPORT DEPT-AIRPORT

FLIGHT-SCHEDULE

AIRPORT-CODE

AIRLINE

WEEKDAY

101

delta

mo

156

101

atl

545

american

we

110

912

cph

912

sc&inavian

fr

450

545

lax

mo

231

242

bos

242

usair

PRICE

FLIGHT#

FLIGHT#

Triggers & Stored Procedures 

Triggers can be defined to enforce constraints on a database, e.g.,



DEFINE TRIGGER DELETE-FLIGHT-SCHEDULE ON DELETE FROM FLIGHT-SCHEDULE WHERE FLIGHT#=‘X’ ACTION DELETE FROM DEPT-AIRPORT WHERE FLIGHT#=‘X’;

FLIGHT-SCHEDULE

DEPT-AIRPORT

FLIGHT#

AIRLINE

WEEKDAY

101

delta

mo

156

101

atl

545

american

we

110

912

cph

912

sc&inavian

fr

450

545

lax

mo

231

242

bos

242

usair

PRICE

FLIGHT#

AIRPORT-CODE

Null Values CUSTOMER#

123-45-6789 234-56-7890 345-67-8901









NAME

MAIDEN NAME

Lisa Smith Lisa Jones George Foreman inapplicable unknown Mary Blake

DRAFT STATUS

inapplicable drafted inapplicable

Null-value unknown reflects that attribute does apply, but value is currently unknown. That’s ok! Null-value inapplicable indicates that attribute does not apply. That’s bad! Null-value inapplicable results from direct use of “catch all forms” in database design “Catch all forms” are ok in reality, but detrimental in database design

Surrogates reality name

custom#

customer customer custom# name addr

customer



addr

surrogate-based representation

surrogates are system-generated, unique, internal identifiers

DATA MODELS    

ER-Model Hierarchical Model Relational Model Object-Oriented Model

ER Model airport

airport

name

code

visa required dept

domestic

international

time

flight

flight

dept airport

1 airport addr

city

zip

p

weekdays

flight

airport

schedule 1

street

n

arriv

n

airport

1

arriv instance

time

of customer#

date n

customer name

customer

n

reserva-

n

tion

seat#

flight instance

flight#

Hierarchical Model flight-sched flight# flight-inst date customer customer#

dept-airp airport-code

arriv-airp airport-code

customer name

parent-child relationship types (1:n only!!): (flight-sched,flight-inst), (flight-inst,customer)  one record type is root, all or record types is a child of one parent record type only  substantial duplication of customer instances 

Relational Model



Commercial systems include: ORACLE, DB2, SYBASE, INFORMIX, INGRES, SQL Server



Dominates database market on all platforms

Relational Model Data Structures   

domains attributes relations

relation name attribute names flight-schedule flight#: airline:

weekday:

price:

integer

char(2)

dec(6,2)

domain names

char(20)

Relational Model Integrity Constraints   

Primary Keys Entity Integrity Referential Integrity

flight-schedule

customer

flight# p

customer# p reservation flight#

date

customer#

customer name

Relational Model Operations 

Powerful query languages



Procedural; describes how to compute a query; operators like JOIN, SELECT, PROJECT



Declarative; describes desired result, e.g. SQL, QBE



insert, delete, & update capabilities

Object-Oriented Model based on object-oriented paradigm, e.g., Simula, Smalltalk, C++, Java area is in a state of flux  object-oriented commercial systems include GemStone, Ontos, Orion-2, Statice, Versant, O2  object-relational model has relational repository model; adds object-oriented features;  object-relational commercial systems include Starburst, POSTGRES

Clinical Database 

Data from Clinical Trial stored in some computer system – Manual – Electronic





Complex Data Cleaning, Review & Reporting Excel, Access, SAS, Oracle

Clinical Db Design Always follows Protocol Creation  Accuracy  Clarity  Ease & Speed of Data Entry  Creation of Analysis Datasets  Source Data Transfer formats  DBMS Requirements

Normalization 

 

Step-by-step decomposition of complex records into simple records Reduces redundancy Non-loss decomposition – To avoid inconsistencies – Update anomalies

Clinical Db Normalization     



Tall Skinny Vs Short Fat Normalized Vs Non-normalized Fewer Columns & More Rows Efficient Access & Storage Reducing the Size of Data Groupings or Records Does not apply to CRF Image Db

Short Fat

Patient Visit ID

BP_DI A_1

BP_SY S_1

BP_DI A_2

BP_SY S_2

BP_DI A_3

BP_SY S_3

1001

120

72

118

70

117

68

2

Tall Skinny Patien Visi Measureme t ID t nt

BP_DI A

BP_SY S

1001

2

1

120

72

1001

2

2

118

70

1001

2

3

117

68

Short Fat Vs Tall Skinny 

Short Fat – Data Cleaning Checks for variables within a single visit is easy – Missing Values easily detected



Tall Skinny – Easy Creation of Structures & Associated Checks – Data Querying is easier

Thank You

Related Documents