Managing Data Resources
Objectives
• Definition of terms
• Explain growth and importance of databases
• Name limitations of conventional file processing
• Identify categories of databases
• Explain advantages of databases
• Identify costs and risks of databases
• List components of database environment
• Describe evolution of database systems
Organizing Data in a Traditional File Environment
File Organization Terms and Concepts
• Bit: Smallest unit of data; binary digit (0 or 1)
• Byte: Group of bits that represents a single character
• Field: Group of characters forming a word, a group of words, or a complete number
• Record: Group of related fields
• File: Group of records of the same type
• Database: Group of related files
• Entity: Person, place, thing, or event about which information must be kept
• Attribute: A piece of information describing a particular entity
• Key field: Field that uniquely identifies every record in a file
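The terms above can be sketched in Python, building up from a bit to a database. This is only an illustration: the student and course records, the field names, and the helper find_by_key are hypothetical and do not come from the slides.

```python
# Illustrative sketch of the data hierarchy; all names and values are hypothetical.

# Byte: a group of bits that represents a single character.
byte = "A".encode("ascii")        # one byte
bits = format(byte[0], "08b")     # the eight bits that make up that byte

# Record: a group of related fields describing one entity (here, a student).
# Each key/value pair is a field; each field describes an attribute.
record = {"student_id": 1001, "name": "Ann Lee", "major": "MIS"}

# File: a group of records of the same type.
student_file = [
    record,
    {"student_id": 1002, "name": "Raj Patel", "major": "Finance"},
]

# Database: a group of related files.
database = {
    "students": student_file,
    "courses": [{"course_id": "IS101", "title": "Intro to Information Systems"}],
}

def find_by_key(file_records, key_field, value):
    """Key field: return the one record whose key field equals value, or None."""
    matches = [r for r in file_records if r[key_field] == value]
    return matches[0] if matches else None

print(bits)
print(find_by_key(student_file, "student_id", 1002)["name"])
```

Here student_id plays the role of the key field because no two student records share the same value.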
Figure: The data hierarchy
Figure: Entities and attributes
Figure: Traditional file processing
Problems with Data Dependency
• Each application programmer must maintain their own data
• Each application program needs to include code for the metadata of each file
• Each application program must have its own processing routines for reading, inserting, updating, and deleting data
• Lack of coordination and central control
• Nonstandard file formats
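A minimal sketch of program-data dependence, assuming a hypothetical fixed-width customer file read by two separate programs. The layout, the function names, and the sample values are invented for illustration.

```python
# Hypothetical fixed-width customer file: id (4 chars), name, city (10 chars).
def make_line(cust_id, name, city, name_width):
    """Build one fixed-width line; name_width is the width of the name field."""
    return cust_id + name.ljust(name_width) + city.ljust(10)

def billing_read(line):
    # The billing program hard-codes the file's metadata (offsets and widths).
    return {"id": line[0:4], "name": line[4:14].strip()}

def shipping_read(line):
    # The shipping program repeats the same metadata in its own code.
    return {"id": line[0:4], "city": line[14:24].strip()}

old_line = make_line("0001", "Ann Lee", "Boston", name_width=10)
print(shipping_read(old_line)["city"])   # read correctly with the old layout

# The name field is widened to 15 characters. The file changed, but the
# programs did not, so the shipping program now reads the wrong columns.
new_line = make_line("0001", "Ann Lee", "Boston", name_width=15)
print(shipping_read(new_line)["city"])   # no longer the full city name
```

Every program that embeds the old offsets has to be found and rewritten after such a change, which is the maintenance burden the bullets above describe.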
Problems with Data Redundancy
• Waste of space to have duplicate data
• Causes more maintenance headaches
• The biggest problem:
– When data changes in one file but not in its duplicates, the copies become inconsistent
– Compromises data integrity
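The inconsistency problem can be sketched with two hypothetical application files that each hold their own copy of a customer's address. The file contents and the helper find_inconsistencies are assumptions made for illustration.

```python
# Two hypothetical application files, each with its own copy of the same data.
billing_file = {"C001": {"name": "Ann Lee", "address": "12 Oak St"}}
shipping_file = {"C001": {"name": "Ann Lee", "address": "12 Oak St"}}

# The customer moves, but only the billing application is updated.
billing_file["C001"]["address"] = "98 Elm Ave"

def find_inconsistencies(file_a, file_b, field):
    """Return the keys whose value for `field` differs between the two files."""
    return [
        key for key in file_a
        if key in file_b and file_a[key][field] != file_b[key][field]
    ]

print(find_inconsistencies(billing_file, shipping_file, "address"))
```

With a single shared copy of the address there would be nothing to reconcile.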
Organizing Data in a Traditional File Environment
Problems with the Traditional File Environment
• Data redundancy
– Different systems/programs have separate copies of the same data
• Program-data dependence
– All programs maintain metadata for each file they use
• Lack of flexibility
– Programmers must design their own file formats
• Poor security, lack of data sharing and availability
– No centralized control of data
• Excessive program maintenance
– 80% of information systems budget
The Database Approach to Data Management
Database Management Systems
Database
• Collection of centralized data
• Controls redundant data
• Data stored so as to appear to users in one location
• Services multiple applications
Definitions
• Database: organized collection of logically related data
• Data: stored representations of meaningful objects and events
– Structured: numbers, text, dates
– Unstructured: images, video, documents
• Information: data processed to increase knowledge in the person using the data
• Metadata: data that describes the properties and context of user data
Data in Context
Context helps users understand data
Graphical displays turn data into useful information that managers can use for decision making and interpretation
Metadata: Descriptions of the properties or characteristics of the data, including data types, field sizes, allowable values, and data context
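As a rough sketch, metadata can be written down as data types, field sizes, and allowable values, then used to check user data. The student fields, the rules, and the validate helper below are hypothetical.

```python
# Hypothetical metadata for a student record: types, sizes, allowable values, context.
metadata = {
    "student_id": {"type": int, "min": 1, "max": 99999,
                   "description": "Unique student number"},
    "name": {"type": str, "max_length": 30,
             "description": "Student full name"},
    "year": {"type": str, "allowed": {"FR", "SO", "JR", "SR"},
             "description": "Class standing"},
}

def validate(record, meta):
    """Return a list of problems found when checking a record against its metadata."""
    problems = []
    for field, rules in meta.items():
        value = record.get(field)
        if not isinstance(value, rules["type"]):
            problems.append(f"{field}: expected {rules['type'].__name__}")
            continue
        if "max_length" in rules and len(value) > rules["max_length"]:
            problems.append(f"{field}: longer than {rules['max_length']} characters")
        if "allowed" in rules and value not in rules["allowed"]:
            problems.append(f"{field}: {value!r} is not an allowable value")
        if "min" in rules and not rules["min"] <= value <= rules["max"]:
            problems.append(f"{field}: out of range")
    return problems

print(validate({"student_id": 1001, "name": "Ann Lee", "year": "JR"}, metadata))
print(validate({"student_id": 1001, "name": "Ann Lee", "year": "XX"}, metadata))
```

The metadata is kept apart from the records themselves, so one description can serve every program that uses the data.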
Figure: The contemporary database environment
Database Management System (DBMS)
A software system that is used to create, maintain, and provide controlled access to user databases
• Creates and maintains databases
• Eliminates requirement for data definition statements in application programs
• Acts as interface between application programs and physical data files
• Separates logical and physical views of data
Three Components of a DBMS
• Data definition language: Formal language programmers use to specify structure of database
• Data manipulation language: For extracting data from database, e.g., SQL
• Data dictionary: Tool for storing and organizing definitions of data elements and data characteristics
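The three components can be sketched with Python's built-in sqlite3 module. SQLite is used here only as a convenient stand-in for a DBMS, and the employee table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # throwaway in-memory database

# Data definition language (DDL): specify the structure of the database.
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " dept TEXT)"
)

# Data manipulation language (DML): add data and extract it with SQL.
conn.executemany(
    "INSERT INTO employee (emp_id, name, dept) VALUES (?, ?, ?)",
    [(1, "Ann Lee", "Sales"), (2, "Raj Patel", "Finance")],
)
sales_names = [
    row[0]
    for row in conn.execute("SELECT name FROM employee WHERE dept = ?", ("Sales",))
]

# Data dictionary: SQLite keeps element definitions in its own catalog.
# PRAGMA table_info reports each column's name (index 1) and declared type (index 2).
dictionary = [
    (row[1], row[2]) for row in conn.execute("PRAGMA table_info(employee)")
]

print(sales_names)
print(dictionary)
```

A full data dictionary would also record ownership, usage, and business definitions. The catalog query above only shows the structural part.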
Figure 7-5: Sample data dictionary report
How a DBMS Solves Problems of a Traditional File Environment
• Reduces data redundancy
• Eliminates data inconsistency
• Uncouples programs from data
• Increases access and availability of data
• Allows central management of data, data use, and security
Advantages of the Database Approach
• Program-data independence
• Minimal data redundancy
• Improved data consistency
• Improved data sharing
• Increased productivity of application development
• Enforcement of standards
• Improved data quality
• Improved data accessibility and responsiveness
• Reduced program maintenance
• Improved decision support
Costs and Risks of the Database Approach
• New, specialized personnel
• Installation and management cost and complexity
• Conversion costs
• Need for explicit backup and recovery
• Organizational conflict
Components of the Database Environment
• CASE Tools – computer-aided software engineering
• Repository – centralized storehouse of metadata
• Database Management System (DBMS) – software for managing the database
• Database – storehouse of the data
• Application Programs – software using the data
• User Interface – text and graphical displays to users
• Data Administrators – personnel responsible for maintaining the database
• System Developers – personnel responsible for designing databases and software
• End Users – people who use the applications and databases
Evolution of Database Systems
• Flat files – 1960s to 1980s
• Hierarchical – 1970s to 1990s
• Network – 1970s to 1990s
• Relational – 1980s to present
• Object-oriented – 1990s to present
• Object-relational – 1990s to present
• Data warehousing – 1980s to present
• Web-enabled – 1990s to present
• Data mining – 2000s to present
Figure 7-7: The three basic operations of a relational DBMS
The Database Approach to Data Management
Types of Databases
Hierarchical DBMS
• Older system presenting data in tree-like structure
• Models one-to-many parent-child relationships
• Found in large legacy systems requiring intensive high-volume transactions: banks, insurance companies
• Example: IBM's IMS
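A minimal sketch of the tree-like structure, assuming a hypothetical employee hierarchy in which every child segment has exactly one parent. The segment names and the path_to helper are invented for illustration and are not taken from IMS.

```python
# Hypothetical hierarchy: each child segment belongs to exactly one parent.
employee = {
    "name": "Employee",
    "children": [
        {"name": "Compensation", "children": [
            {"name": "Salary history", "children": []},
        ]},
        {"name": "Benefits", "children": [
            {"name": "Pension", "children": []},
            {"name": "Health", "children": []},
        ]},
    ],
}

def path_to(node, target, path=()):
    """Walk down from the root and return the parent-to-child path to target."""
    path = path + (node["name"],)
    if node["name"] == target:
        return path
    for child in node["children"]:
        found = path_to(child, target, path)
        if found:
            return found
    return None

print(path_to(employee, "Pension"))
```

Access always starts at the root and follows one path downward, which suits predefined, high-volume lookups but makes ad hoc queries across branches awkward.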
Figure 7-8: A hierarchical database for a human resources system
Network DBMS
• Older logical database model
• Models many-to-many parent-child relationships
• Example: Student-course relationship: each student has many courses; each course has many students
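The student-course example can be sketched as a many-to-many structure that is navigable from either side. The enrollment pairs below are hypothetical.

```python
from collections import defaultdict

# Hypothetical enrollments as (student, course) pairs.
enrollments = [
    ("Ann", "IS101"), ("Ann", "ACC200"),
    ("Raj", "IS101"), ("Raj", "FIN310"),
]

courses_of = defaultdict(set)    # student -> courses taken
students_in = defaultdict(set)   # course  -> students enrolled
for student, course in enrollments:
    courses_of[student].add(course)
    students_in[course].add(student)

print(sorted(courses_of["Ann"]))     # one student, many courses
print(sorted(students_in["IS101"]))  # one course, many students
```

Unlike the hierarchical model, a course here has several "parents" (students) at once, so links run in both directions.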
Figure 7-9: The network data model
Relational DBMS
• Represents data as two-dimensional tables called relations
• Relates data across tables based on common data element
• Examples: DB2, Oracle, MS SQL Server
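A small sketch of two relations tied together by a common data element, using Python's built-in sqlite3 module as a stand-in for the products listed above. The supplier and part tables and their rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE supplier (
        supplier_id   INTEGER PRIMARY KEY,
        supplier_name TEXT
    );
    CREATE TABLE part (
        part_id     INTEGER PRIMARY KEY,
        part_name   TEXT,
        supplier_id INTEGER REFERENCES supplier(supplier_id)
    );
    INSERT INTO supplier VALUES (10, 'Acme'), (20, 'Bolt Co');
    INSERT INTO part VALUES (1, 'Door latch', 10), (2, 'Hinge', 20), (3, 'Lock', 10);
""")

# supplier_id is the common data element that relates the two tables.
rows = conn.execute("""
    SELECT part.part_name, supplier.supplier_name
    FROM part
    JOIN supplier ON part.supplier_id = supplier.supplier_id
    ORDER BY part.part_id
""").fetchall()

print(rows)
```

The supplier name is stored once, in the supplier table, and each part refers to it by supplier_id rather than repeating it.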
Figure: The relational data model
Three Basic Operations in a Relational Database
• Select: Creates subset of rows that meet specific criteria
• Join: Combines relational tables to provide users with information
• Project: Enables users to create new tables containing only relevant information
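The three operations can be sketched in plain Python over tables held as lists of dictionaries. The parts and suppliers data and the function names are hypothetical; a real relational DBMS performs these operations through SQL.

```python
# Hypothetical relations: each table is a list of rows, each row a dictionary.
parts = [
    {"part_id": 1, "part_name": "Door latch", "supplier_id": 10},
    {"part_id": 2, "part_name": "Hinge", "supplier_id": 20},
    {"part_id": 3, "part_name": "Lock", "supplier_id": 10},
]
suppliers = [
    {"supplier_id": 10, "supplier_name": "Acme"},
    {"supplier_id": 20, "supplier_name": "Bolt Co"},
]

def select(rows, predicate):
    """Select: keep only the rows that meet the criteria."""
    return [row for row in rows if predicate(row)]

def join(left, right, key):
    """Join: combine rows from two tables that share the same key value."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]

def project(rows, columns):
    """Project: build a new table containing only the requested columns."""
    return [{c: row[c] for c in columns} for row in rows]

# Select the parts from supplier 10, join them to their supplier,
# then project just the two columns the user wants to see.
result = project(
    join(select(parts, lambda r: r["supplier_id"] == 10), suppliers, "supplier_id"),
    ["part_name", "supplier_name"],
)
print(result)
```

Each operation takes tables in and returns a table, which is why the three can be chained in any order that makes sense for the question being asked.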
Object-Oriented Databases (OODBMS)
• Stores data and procedures as objects
• Better able to handle graphics and recursive data
• Data models more flexible
• Slower than RDBMS
• Hybrid: object-relational DBMS
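A loose sketch of the object idea: data and the procedure that works on it are kept together, and the data is recursive (parts made of parts). The Part class and the dictionary used as an object store are hypothetical; a plain dictionary is not an OODBMS and only stands in for one here.

```python
class Part:
    """An object that bundles data (name, components) with a procedure (count_parts)."""

    def __init__(self, name, components=None):
        self.name = name
        self.components = components or []   # recursive data: parts made of parts

    def count_parts(self):
        # Procedure stored with the data: this part plus all nested parts.
        return 1 + sum(c.count_parts() for c in self.components)

bike = Part("Bicycle", [
    Part("Wheel", [Part("Rim"), Part("Spoke")]),
    Part("Frame"),
])

# Hypothetical object store keyed by object identifier.
object_store = {"bike-1": bike}

retrieved = object_store["bike-1"]
print(retrieved.name, retrieved.count_parts())
```

The nested structure is retrieved whole, along with its behavior, whereas a relational design would spread it across rows that have to be joined back together.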
Data Mining at Fingerhut Inc.
Fingerhut published about 25 different catalogs, but shipped only the general merchandise catalog monthly and tracked customers' buying patterns and behaviors. If a customer bought cookware, Fingerhut would first follow up with the specialized Cooks’ Book and More Houseware & cooking supplies catalogs. Then telemarketers would call to follow up with new products. Through customer data mining, Fingerhut found that customers who had recently changed their residence were likely to triple their purchasing in the 12 weeks after their move, with a peak in buying in the first four weeks. Their selections often followed a pattern: new furniture, telecommunications equipment, and decorations, but seldom jewelry or home electronics. The company used this discovery to tailor a new "mover's catalog" to entice customers who had recently moved.