Database Systems Notes.docx

  • Uploaded by: Justin Barwick
  • 0
  • 0
  • April 2020
  • PDF

This document was uploaded by user and they confirmed that they have the permission to share it. If you are author or own the copyright of this book, please report to us by using this DMCA report form. Report DMCA


Overview

Download & View Database Systems Notes.docx as PDF for free.

More details

  • Words: 2,817
  • Pages: 8
Database Systems Notes Database – a self-describing collection of integrated records. A database contains data but also a description of that data known as the system catalog or metadata. One advantage of a database is the employment of data abstraction such that the addition of data structures or the modification of data structures in a database do not impact application programs that interface with the database (provided that the modified elements of the data structure aren’t directly used by the application program). Before a database is designed, the information needs of an organization need to be stated. These needs are identified as one of three terms: entity, attribute, and relationship. An entity is an object that is to be defined or presented in the database. An attribute is a property of an entity worthy of recording. A relationship is an association between entities. These three terms are elements of an Entity-relationship diagram, which graphically represent the structure and design of the database that will hold this information.

The database management system (DBMS) is software that allows the user to create, access, and control databases. Application programs communicate to the database through the DBMS utilizing a data manipulation language (typically SQL)

The DBMS also provides a mechanism known as the view mechanism which allows the user to view only the information relevant to them and displayed in a specific manner. Five major components of a DBMS: 1. Hardware: The database, DBMS and application program must exist on hardware somewhere. Ideally, the database and DBMS exist in a server in a secure location while the application software is distributed to the client. This physical separation of client and server is known as a client-server architecture for a database system. 2. Software: This encompasses the DBMS, the application programs and 3rd/4th generation software utilized to improved DBMS management and application program generation. 3. Data: The data is the bridge between the people/procedures and the hardware/software. The structure of a database is known as the schema. A database schema could consist of 4 files (or tables) each possessing eight fields (or attributes). 4. Procedures: Instructions or rules that govern the design and use of the database. 5. People: Broken down into four types of people who interface with a database: a. Data and database administrator: Data administrator manages the data resource: database planning, development and maintenance of standards, policies and procedures, and conceptual/logical database design. Consult senior management to ensure database supports corporate objectives. Database administrator realizes the database design and implementation, security and integrity control, maintenance and operational system, and satisfactory user acceptance. b. Database designer c. Application developer d. End-user Database design must prioritize the data over the application program. Catering to the application program first typically leads to structurally unsound database systems which lead to failure. Logical database designers focus on identifying data, the relationships between the data and the constraints applied to the data. Logical database designers work in two stages: 1. Conceptual design: target the dbms, application programs, programming languages, and any physical consideration (all independent of implementation) 2. Logical design: targets the specific data model of the database. Physical database designers: realize the logical design plan and implement them as tables, integrity constraints, selecting storage structures, access methods, performance, and security measures. Must know the DBMS extremely well. Two end users: 1. Naïve users: unaware of the DBMS and require the application program to make the complicated decisions while the user performs simple commands 2. Sophisticated users: understands the DBMS and are capable of using high-level programming languages.

Relational DBMS fall short in their limited modeling capabilities of the real world. Third generation DBMS such as object-oriented DBMS (OODBMS) and object-relational DBMS (ORDBMS) are attempting to resolve this issue. 1990s development of XML SAP prevoides an enterprise resource planning (ERP), and application layer built no top of a DBMS designed to integrate all business functions of an organization. Controlled duplication of data within a DBMS may provide performance improvements. DBMS may implement constraints on specific records, tables, or relationships between data to enforce the integrity of the data stored in the DBMS. Security is enforced using username and passwords as well as account credentials. The DML statements available to the user also limit the information observable or actions performable by the user. 2.0 DATABASE ENVIRONMENT The starting point of the database design must be an abstract and general description of the information requirements of the company to be represented by the database. The three-level architecture is a methodology that represents the three ways data items can be described with the purpose of separating the user’s view of the database from the actual representation of the database: 1. External level: The way the user perceives the data. Only includes the entities, attributes and relationships in the “real world” that the user is interested in. 2. Internal level: The way the DBMS and operating system perceives the data. Describes how the data is stored in the database. Focuses on how to optimally store and retrieve data. Concerned with storage space allocation for data and indexes, record descriptions for storage, record placement, data compression and data encryption techniques. 3. Conceptual level: Provides the mapping and the desired independence between the external and internal levels. Describes what data is stored in the database and the relationships among that data. Contains all entities, their attributes, and their relationships, the constraint on the data, the semantic information about the data, and the security and integrity information. Must not contain storage level details. Each user should be able to access the same data in the database but view the data differently. Users should also be able to customize their view of the data without affecting other users. User interactions with the database should not include (or should be independent of) the schema in which the DBMS organizes data (such as indexing or hashing). The administrator should be able to change the database storage structures without altering user views. The administrator should be able to change the conceptual structure without changing the user views. Internal structure should be unaffected by physical changes (such as changing to new storage devices) There are three types of database schema in the three-level architecture:

1. External schema (subschema): Multiple external schema describe the different external-level views of the data. 2. Conceptual schema: Describes all entities, attributes, and relationships together with constraints. Only 1 per database. 3. Internal schema: Complete description of the internal model containing stored record definitions, methods of representation, data fields, and indices of used storage structures. The conceptual/internal mapping enables to DBMS to locate physical data based on the conceptual model of that data. The external/conceptual mapping relates the terminology of the user to the concepts represented in the database. The database schema is the description of the database and not the database itself. The schema is the intension of what the final database description should be. Database instances are snapshots (or states) of what the data in the database is at any moment. Logical data independence: The immunity of the external schema to change in response to changes in the conceptual schema. Physical data independence: The immunity of the conceptual or external schemas to changes in the internal schema. Summary: The desired description of how the information in the database is present to the user should not be affected by additions or manipulations to the concept of how the real life entities are to be described in the DBMS. The ideal concepts should not be affected by the changes performed on the designed organization, security, and indexing of that data on the storage devices. These separations may create slowing database systems but allow for proper improvements/changes of a system layer without impacting other layers of the system. A data model is a collection of concepts that are used to describe the organization’s data, the relationships between that data and the constraints on that data. A data model comprises of 3 parts: 1. A structural part listing a set of rules by which the database can be constructed 2. A manipulative part listing the operations allowed on the data (specifying what parts of the data may be retrieved or manipulated) 3. A set of integrity constraints ensures the accuracy of the data. Applying this to the ANSI-SPARC 3 level architecture yields: 1. An external data model, also called the Universe of Discourse (UoD), is design to represent each user’s view of the data 2. A conceptual data model to represent the logical view of the data independent of the DMBS 3. An internal data model to describe the conceptual schema in a way that the DBMS understands. Object-based data models: A modeling system that uses concepts like entity, attribute, and relationship. The entity is a distinct object in the organization that is to be represented in the database. An attribute is a property of any given entity worth recording in the database. A relationship defines the association between entities. Common object-based data models are:

   

Entity-Relationship models Semantic models Functional models Object-oriented models

MS Access utilizes the relational record-based data model to express the data stored within the database. Data and relationships are represented as tables with distinct field names. Conceptual database design (or conceptual modeling) is the process of constructing a model of the conceptual schema which is the complete and accurate representation of the data requirements of the database system (enterprise) independent of the DBMS and independent of external application programs. The conceptual model should be the heart and source of the database system since the external model stems from the conceptual model and the internal model is the implementation of the conceptual model with the chosen DBMS. Database design should start by establishing the conceptual model. Functions of a DBMS A DBMS should provide the following ten functions and services: 1. Data storage, retrieval, and update. 2. A user-acceptable catalog containing the descriptions of data items stored and which are user accessible. Typically a system catalog contains:  Names, types and sizes of data items  Names of relationships  Integrity constraints on the data  Names of authorized users who may access the data  The data items accessible to each user and the type of accessibility allowed (e.g. insert, update, delete, or read access)  External, conceptual and internal schemas as well as the mapping between each  Usage statistics such as frequency of transactions 3. Transaction support ensuring all transactions are made or none of them, prevent the database to exist in an inconsistent state. 4. Concurrency control services. These services prevent data loss during overlapping transactions that utilize the same data/metadata. 5. Recovery services. The ability to recover the database from a consistent state during failure. 6. Authorization services. Allowing only authorized user to access the database and limiting the viewable data based on the authorization of the user. 7. Support for data communication. User must be able to access the data in the DBMS through communication messages. All messages are handled by a data communication manager (DCM) which communicates with the DBMS directly. 8. Integrity services. The DBMS must ensure that both the data, changes to the data and the database adhere to a set of rules to maintain data integrity (e.g. Staff members may not be assigned to more than 100 properties). 9. Services to promote data independence. A DBMS must support the independence of programs from the actual structure of the database.

10. Utility services. Services to help the DBA effectively use the DBMS such as:  Import and export facilities  Monitoring facilities  Statistical analysis programs to examine performance and usage  Index reorganization facilities to reorganize indexes and their overflow  Garbage collection and reallocation to reallocate space on the physical storage devices 3.0 Database Architectures and the Web File-server architecture consists of system of users connected to a server used as file storage shared between the user computers. This file server would contain files such as documents, spreadsheets, images and databases. This architecture uses a LAN. The file-server holds the files necessary to run the applications and the DBMS. File-server disadvantages: 1. Large network traffic 2. Full copy of DBMS is required on each computer 3. Concurrency, recovery, and integrity control are more complex since multiples DBMS are accessing the same files. Traditional two-tier client-server architecture consists of the client (tier 1) and the database server (tier 2) all within a local area network (LAN). Data-intensive business applications handle presentation of data (user interface) and business and data application logic on the client side and data base access via transaction logic and data validation on the server side. The client typically requests data by issuing an SQL command to the server via user interface actions to the application. The server interprets the request by checking authorization, ensuring integrity, maintaining the system catalog, and performing query and update processing all while providing concurrency and recovery control. Here the server’s responsibility is to only query information from and maintain the database reducing network traffic and refining the functional requirements of the server. This change also allows the client to take on more application processing responsibilities and less database processing responsibilities. Enterprises with extremely large client base pushed the two-tier system to the limit as applications became more complex. Complex applications demanded a larger quantity of high spec. computers for the client increasing cost. This demand lead to the three-tier client server architecture. The three-tier architecture separates the business logic and data processing layer of the application from the client computer to an application server. The client computer would then host just the user interface and simple computations for information display. The application server would be designed to perform complex computations quickly and handles multiple clients at once. This application server would then communicate with the database server containing the database and performing all the data validations and database accessing functions. An application server hosts an application programming interface (API) to expose business logic and business processes for use by another application. Application servers must handle the following issues: 1. Concurrency

2. 3. 4. 5. 6. 7. 8.

Network connection management Providing access to all database servers Database connection pooling Legacy database support Clustering support Load balancing Failover

Typical applications servers are Java Platform, Enterprise Edition (JEE), .NET with Microsoft, and Oracle Application Server. A transaction processing (TP) monitor is a program that controls data transfer between clients and servers in order to provide a consistent environment, particularly for online transaction processing (OLTP). TP monitors assist in the routing of client requests to the appropriate DBMS. TP monitors can allocate request for multiple heterogenous applications, balance the load of multiple clients, establish connection to the DBMS only when requested from the client prevent slow-down for excessively continuous underutilized connections to the DBMS. Here the TP monitor offloads processes typically performed by the DBMS server. Examples are CICS (IBM), Tuxedo (Oracle), and Java Transaction API (JTA). A web service is a software system designed to support interoperable machine-to-machine interaction over a network. Web services share business logic, data, and process through a programmatic interface across a network designed for application interface not user interface. The key to web service approach is to user widely accepted standards:    

XML (eXtensible Markup Language) SOAP (Simple Object Access Protocol) is a communication protocol for exchanging structured information over the internet. It is platform and language independent. WSDL (Web Services Description Language) protocol used to describe and locate a web service. UDDI (Universal Discovery, Description, and Integration) protocol is a platform-independent, XML-based registry for businesses to list themselves on the internet. UDDI protocol is designed to provide WSDL documents.

RESTful web services moves away from SOAP based services and toward Representational State Transfer (REST) based communication. REST services do not require XML, SOAP, WSDL, or UDDI definitions. In the REST architecture style, data, and functionality are considered resources and are access using Uniform Resource Identifiers (URIs), generally links on the web. Resources are acted upon by HTML operations for create, read update and delete: PUT, GET, POST, and DELETE. The communication protocol is typically HTTP.

Service-oriented architecture separates the application for the services performed in the applications. The services are stateless scripts called by the application upon necessity to access the data repository. This allows for the services to be combined in any way the application desires accommodating any business requirement changes that may be required by these services or the applications. This architecture works best with a standard programming language.

A distributed database management system (DDBMS) is a software system where a single logical database is fragmented among computers of network. Each fragment is under control of a separate DBMS and processes requests for that data as well as sends requests for other fragmented data. The distribution of data is invisible to all users making the system appear to be centralized.

Related Documents


More Documents from "Justin Barwick"