Introduction to Information Technology Turban, Rainer and Potter Chapter 5 Managing Organizational Data and Information
CHAPTER 5
MANAGING ORGANIZATIONAL DATA AND INFORMATION
Learning Objectives
• Discuss traditional data file organization and its problems
• Explain how a database approach overcomes the problems associated with the traditional file environment, and discuss the advantages of the database approach
• Describe how the three most common data models organize data, and the advantages and disadvantages of each model
• Describe how a multidimensional data model organizes data
• Distinguish between a data warehouse and a data mart
• Discuss the similarities and differences between data mining and text mining
Chapter Overview
Basics of Data Arrangement and Access
• The Data Hierarchy
• Storing and Accessing Records
The Traditional File Environment
• Problems with the File Approach
Databases: The Modern Approach
• Locating Data in Databases
• Creating the Database
Database Management Systems
• Logical versus Physical View
• DBMS Components
Logical Data Models
• Hierarchical Model
• Network Model
• Relational Model
• Advantages and Disadvantages of the Three Models
• Emerging Models
• Other Models
Data Warehouse
• Multidimensional Model
• Data Marts
• Data Mining
• Text Mining
Case: FedEx Pinpoints Profitable Customers
The Problem
• customers are classified as good, bad, or ugly by the cost of doing business with them and the profits they return
• keep the good customers, improve the bad customers, and drop the ugly ones
• easy to identify customers who spend money with them, but difficult to identify customers who are profitable for them
Case (continued…)
The Solution
• use a data warehouse, stocked with customer data, that allows the company to compare the complex mix of marketing and servicing costs that go into retaining each individual customer versus the revenues he, she, or it might bring in
The Results
• “good” customers - expect a phone call if their shipping volumes falter, which can prevent defections before they occur
• “bad” customers - can be turned into profitable customers by charging higher shipping rates
• “ugly” customers - can be ignored
Case (continued…)
What have we learned from this case?
• Organizations can now scrutinize their customers (or other data) very carefully with advanced data management and analysis tools
• Customized strategies can be developed to cut costs, transform the marginal customer into a profitable customer, and permit more profitable pricing structures
• Other types of data can give an organization important feedback about its products, services, markets, and coming trends
Basics of Data Arrangement and Access
The Data Hierarchy
• Field - a logical grouping of characters into a word, a small group of words, or a complete number
• Record - a logical grouping of related fields
• File - a logical grouping of related records
• Database - a logical grouping of related files
• Entity - a person, place, thing, or event about which information is maintained
• Attribute - each characteristic or quality describing a particular entity
• Primary Key - the field that uniquely identifies the record
• Secondary Key - a field that has some identifying information, but typically does not identify the record with complete accuracy
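The hierarchy above can be sketched in a few lines of Python. This is only an illustration: the `StudentRecord` type, its field names, and the sample values are assumptions made for the example, not part of the chapter.

```python
from dataclasses import dataclass


@dataclass
class StudentRecord:
    # Each attribute is a field; together the fields form one record.
    student_id: int   # primary key: unique for every record
    name: str
    major: str        # secondary key: identifying, but not unique


# A file is a logical grouping of related records.
student_file = [
    StudentRecord(1001, "Ana Ruiz", "Finance"),
    StudentRecord(1002, "Ben Cole", "Marketing"),
    StudentRecord(1003, "Cy Dunn", "Finance"),
]


def find_by_primary_key(records, student_id):
    """Return the single record with this primary key, or None."""
    for record in records:
        if record.student_id == student_id:
            return record
    return None


def find_by_secondary_key(records, major):
    """Return every record that shares this secondary-key value."""
    return [record for record in records if record.major == major]
```

A primary-key lookup returns at most one record, while a secondary-key lookup may return several, which is the distinction the two definitions draw.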
Basics of Data Arrangement and Access (continued …)
Storing and Accessing Records
• Indexed Sequential Access Method (ISAM)
» uses an index of key fields to locate individual records
» index - lists the key field of each record and where that record is physically located in storage
» track index - shows the highest value of the key field that can be found on a specific track
• Direct File Access Method
» uses the key field to locate the physical address of a record
» transform algorithm - translates the key field directly into the record’s storage location on disk
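A rough sketch of both access methods, assuming a toy "disk" held in memory. The track contents, the slot count, and the use of a modulo operation as the transform algorithm are all assumptions for illustration; real systems use more elaborate layouts and transforms.

```python
import bisect

# Simulated disk: each track holds records sorted by key field.
tracks = [
    [(101, "Adams"), (105, "Baker"), (110, "Chen")],
    [(115, "Diaz"), (120, "Evans"), (130, "Ford")],
    [(140, "Gray"), (150, "Hall"), (160, "Ito")],
]

# Track index: the highest key value found on each track.
track_index = [track[-1][0] for track in tracks]


def isam_lookup(key):
    """Use the track index to pick a track, then scan only that track."""
    track_no = bisect.bisect_left(track_index, key)
    if track_no == len(tracks):
        return None  # key is larger than anything stored
    for record_key, value in tracks[track_no]:
        if record_key == key:
            return value
    return None


# Direct access: a transform algorithm maps the key to a storage slot.
NUM_SLOTS = 7  # assumed size of the storage area
slots = [[] for _ in range(NUM_SLOTS)]


def transform(key):
    """Assumed transform: key modulo the number of slots."""
    return key % NUM_SLOTS


def direct_store(key, value):
    slots[transform(key)].append((key, value))


def direct_lookup(key):
    """Go straight to the computed slot; no index is consulted."""
    for record_key, value in slots[transform(key)]:
        if record_key == key:
            return value
    return None


for track in tracks:
    for key, value in track:
        direct_store(key, value)
```

The ISAM lookup consults the index first and then reads one track, while the direct lookup computes the location from the key itself.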
Traditional File Environment
• The organization has multiple applications with related data files
• Each application has a specific data file related to it, containing all the data records needed by the application
Each application comes with an associated application-specific data file
Traditional File Environment (continued …)
Problems with the file approach
• data redundancy - the same piece of information could be duplicated in several places
• data inconsistency - the various copies of the data no longer agree
• data isolation - difficulty in accessing data from different applications
• security - hard to enforce, because new applications may be added to the system on an ad hoc basis
• data integrity - data values must often meet integrity constraints, which are hard to enforce across separate files
• application/data independence - the applications and data in computer systems should be independent, but in the file approach each application is tied to its own data file
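Redundancy and inconsistency can be shown with two application files that each keep their own copy of the same student. The file names, field names, and addresses here are hypothetical.

```python
# Two applications each keep their own copy of the same data (redundancy).
registrar_file = {1001: {"name": "Ana Ruiz", "address": "12 Oak St"}}
accounting_file = {1001: {"name": "Ana Ruiz", "address": "12 Oak St"}}


def find_inconsistencies(file_a, file_b, field):
    """Return the keys whose value for `field` differs between two files."""
    shared_keys = file_a.keys() & file_b.keys()
    return sorted(k for k in shared_keys if file_a[k][field] != file_b[k][field])


before = find_inconsistencies(registrar_file, accounting_file, "address")

# Only one application learns about the change of address...
registrar_file[1001]["address"] = "98 Elm Ave"

# ...so the copies no longer agree (inconsistency).
after = find_inconsistencies(registrar_file, accounting_file, "address")
```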
Database : The Modern Approach
• Database Management System provides access to all the data
• Example : University administration
[Figure: the Registrar Office (Class Programs), Accounting Dept. (Accounts Programs), and Athletics Dept. (Sports Programs) all reach the shared data through the Database Management System. Data shown: Academic Info., Team Data, Employee Data, Tuition Data, Financial Aid, Student Data, Course Data, Registration Data.]
Database : The Modern Approach (continued …)
Locating Data in Databases
• Centralized database
» all the related files are in one physical location
» used on large, mainframe computers
» saves the expenses associated with multiple computers
» provides database administrators with the ability to work on a database as a whole at one location
» files are not accessible except via the centralized host computer
» recovery from disasters can be more easily accomplished at a central location
» vulnerable to a single point of failure
» speed problems from transmission delays for remote users
Database : The Modern Approach (continued …)
Locating Data in Databases (cont’)
• Distributed database
» complete copies of a database, or portions of a database, are in more than one location, which is usually close to the user
» replicated database - complete copies of the entire database are delivered to many locations, primarily to alleviate the single-point-of-failure problems of a centralized database as well as to increase user access responsiveness
» partitioned database - the database is subdivided, with a portion of the entire database in each location
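The difference between replication and partitioning can be sketched as two placement functions. The record contents and the rule that assigns a record to a site (key modulo the number of sites) are assumptions for illustration; the site names are the cities used in the chapter's figure.

```python
SITES = ["New York", "Los Angeles", "Chicago", "Kansas City"]
records = {1: "Adams", 2: "Baker", 3: "Chen", 4: "Diaz", 5: "Evans"}


def replicate(data, sites):
    """Replicated database: every site holds a complete copy."""
    return {site: dict(data) for site in sites}


def partition(data, sites):
    """Partitioned database: each record lives at exactly one site."""
    placement = {site: {} for site in sites}
    for key, value in data.items():
        site = sites[key % len(sites)]  # assumed placement rule
        placement[site][key] = value
    return placement


replicated = replicate(records, SITES)
partitioned = partition(records, SITES)
```

Replication multiplies storage and update work but survives the loss of any one site; partitioning stores each record once but makes some records unavailable if their site fails.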
Centralized vs. Distributed Databases
[Figure: two panels. Centralized Database: users in several cities all connect to a single central location. Distributed Database: the database is spread across locations close to the users. Cities shown: New York, Los Angeles, Chicago, Kansas City.]
Database : The Modern Approach (continued …)
Creating a Database
• Conceptual design - an abstract model of the database from the user or business perspective
• Physical design - shows the way a database is actually arranged on storage devices
• Entity-relationship (ER) modeling
» process of planning the database design
» ER diagram - document of the conceptual data model
» Entity classes
» Instance
» Identifiers
» Relationships
• Normalization
» method for analyzing and reducing a relational database to its most streamlined form for minimum redundancy,
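As a small illustration of reducing redundancy, the sketch below splits one flat table, in which customer details repeat on every order, into two tables linked by a key. The table layout and sample data are hypothetical, and this shows only one step of the idea rather than a full normalization procedure.

```python
# Flat table: the customer's name and city repeat on every order row.
flat_orders = [
    {"order_id": 1, "customer_id": 10, "customer_name": "Acme", "city": "Chicago", "item": "Plates"},
    {"order_id": 2, "customer_id": 10, "customer_name": "Acme", "city": "Chicago", "item": "Bowls"},
    {"order_id": 3, "customer_id": 11, "customer_name": "Zenith", "city": "New York", "item": "Plates"},
]


def normalize(rows):
    """Split the flat rows into a customer table and an order table."""
    customers = {}
    orders = []
    for row in rows:
        # Customer facts are stored once, keyed by customer_id.
        customers[row["customer_id"]] = {
            "customer_name": row["customer_name"],
            "city": row["city"],
        }
        # Orders keep only the key that points back to the customer.
        orders.append({
            "order_id": row["order_id"],
            "customer_id": row["customer_id"],
            "item": row["item"],
        })
    return customers, orders


customers, orders = normalize(flat_orders)
```

After the split, a change of city for a customer is made in one place instead of on every order row.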
Database Management Systems
• A software program (or group of programs) that provides access to a database
• Permits an organization to store data in one location, from which it can be updated and retrieved
• Provides access to the stored data by various application programs
• Provides mechanisms for maintaining the integrity of stored information, managing security and user access, recovering information when the system fails, and accessing various database functions from within an application written in a third-generation, fourth-generation, or object-oriented language
DBMS (continued …)
Logical versus Physical View
• Physical view - deals with the actual, physical arrangement and location of data in the direct access storage devices (DASD)
• Logical view - represents data in a format that is meaningful to a user and to the software programs that process that data
DBMS (continued …)
DBMS Components
• Data model
» defines the way data are conceptually structured
• Data definition language (DDL)
» defines what types of information are in the database and how they will be structured
» functions of the DDL
> provide a means for associating related data
> indicate the unique identifiers (or keys) of the records
> set up security access and change restrictions
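The first two DDL functions can be illustrated with SQL issued through Python's built-in `sqlite3` module. The `student` and `registration` tables are hypothetical, and SQLite is used here only because it ships with Python; access restrictions are not shown, since SQLite has no user-permission statements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # ask SQLite to enforce REFERENCES

# DDL: declare what the database holds and how it is structured.
conn.executescript("""
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,   -- unique identifier (key) of the record
    name       TEXT NOT NULL
);
CREATE TABLE registration (
    student_id INTEGER NOT NULL REFERENCES student(student_id),  -- associates related data
    course     TEXT NOT NULL,
    PRIMARY KEY (student_id, course)
);
""")

# List the tables the DDL statements created.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
```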
DBMS (continued …)
DBMS Components (cont’)
• Data manipulation language (DML)
» used with third-generation, fourth-generation, or object-oriented languages to query the contents of the database, store or update information in the database, and develop database applications
» Structured query language (SQL) - most popular relational database language, combining both DML and DDL features
• Data Dictionary
» stores definitions of data elements and data characteristics
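A minimal sketch of DML statements (store, update, query) issued from a host language, again using Python's `sqlite3` module. The `employee` table and its column names are assumptions for the example; the final query reads SQLite's own catalog, which plays a role loosely similar to a data dictionary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT PRIMARY KEY, title TEXT, age INTEGER)")

# Store: INSERT adds new records.
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", [
    ("Smith, A.", "Dir. Accounting", 43),
    ("Jones, W.", "Dir. Total Quality Management", 32),
    ("Stone, L.", "Administrative Asst.", 28),
])

# Update: change a stored value in place.
conn.execute("UPDATE employee SET age = age + 1 WHERE name = ?", ("Stone, L.",))

# Query: retrieve only the records that meet a condition.
over_30 = [row[0] for row in conn.execute(
    "SELECT name FROM employee WHERE age > 30 ORDER BY name")]

# Catalog lookup: column names and types as the DBMS recorded them.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(employee)")]
```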
Logical Data Models
• A manager’s ability to use a database is highly dependent on how the database is structured logically and physically.
• In logically structuring a database, businesses need to consider the characteristics of the data and how the data will be accessed.
• Three common data models : hierarchical, network, and relational
• Using these models, database designers can build a logical or conceptual view of data that can then be physically implemented into virtually any database with any DBMS.
Logical Data Models (continued …)
Hierarchical Database Model
• rigidly structures data into an inverted “tree” in which each record contains two elements
» 1st : a single root or master field, often called a key, which identifies the type, location, or ordering of the records
» 2nd : a variable number of subordinate fields, which define the rest of the data within a record
• all fields have only one “parent”; each parent may have many “children”
• advantage : speed and efficiency
• problem : access to data is predefined before the programs that access the data are written, and each relationship must be explicitly defined when the database is created
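The one-parent rule can be sketched with nested dictionaries. The region and product names follow the chapter's sales example, but placing Plates and Bowls under China, and the helper functions themselves, are assumptions for illustration.

```python
# Inverted tree: every node has exactly one parent, parents may have many children.
sales = {
    "East Coast": {"China": {"Plates": {}, "Bowls": {}}, "Stemware": {}, "Flatware": {}},
    "Midwest": {"China": {"Plates": {}, "Bowls": {}}, "Stemware": {}, "Flatware": {}},
}


def children(tree, path):
    """Follow a predefined path from the root and list the children found there."""
    node = tree
    for name in path:
        node = node[name]
    return sorted(node)


def find_paths(tree, target, prefix=()):
    """Search every branch for a name; needed whenever the path is not known in advance."""
    paths = []
    for name, subtree in tree.items():
        here = prefix + (name,)
        if name == target:
            paths.append(here)
        paths.extend(find_paths(subtree, target, here))
    return paths
```

Following a known path is fast, but because a node cannot have two parents, "Plates" must be stored once under each region, and a question that cuts across regions has to visit every branch.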
Hierarchical Data Model
[Figure: inverted tree rooted at Sales. Region level: East Coast, Midwest, West Coast. Product Category level: China, Stemware, Flatware under each region. Product level: Plates, Bowls.]
Logical Data Models (continued …)
Network Database Model
• creates relationships among data through a linked-list structure in which subordinate records (members) can be linked to more than one data element (owner)
• pointer - explicit link; a storage address that contains the location of a related record
• many-to-many relationships are possible
• complexity : for every set of linked data elements, a pair of pointers must be maintained
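A minimal sketch of owner and member records, with Python object references standing in for storage-address pointers. The `Record` class, the `link` helper, and the sample names are assumptions for illustration.

```python
class Record:
    def __init__(self, name):
        self.name = name
        self.owners = []    # pointers to the owner records of this record
        self.members = []   # pointers to the member records of this record


def link(owner, member):
    """Maintain the pair of pointers that ties one owner to one member."""
    owner.members.append(member)
    member.owners.append(owner)


east, midwest = Record("East Coast"), Record("Midwest")
plates, bowls = Record("Plates"), Record("Bowls")

# A member may have more than one owner, so many-to-many links are possible.
link(east, plates)
link(midwest, plates)
link(east, bowls)


def owner_names(member):
    return sorted(owner.name for owner in member.owners)


def member_names(owner):
    return sorted(member.name for member in owner.members)
```

Every relationship costs two pointer updates, which hints at why this model is the hardest of the three to maintain.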
Logical Data Models (continued …)
Relational Database Model
• based on a simple concept of tables in order to capitalize on characteristics of rows and columns of data
• relations - tables
• tuple - row
• attribute - column
• select operation - creates a subset consisting of all records in the file that meet stated criteria
• join operation - combines relational tables to provide the user with more information than is available in individual tables
• project operation - creates a subset consisting of columns in a table, permitting the user to create new tables that
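The three operations can be sketched over tables held as lists of dictionaries. The employee rows reuse values from the chapter's relational example, but the column names (`name`, `title`, `age`, `division`) and the second table `divisions` with its `plant` column are assumptions made for the example.

```python
employees = [
    {"name": "Smith, A.", "title": "Dir. Accounting", "age": 43, "division": "China"},
    {"name": "Jones, W.", "title": "Dir. Total Quality Management", "age": 32, "division": "Stemware"},
    {"name": "Lee, J.", "title": "Dir. Information Technology", "age": 46, "division": "China"},
    {"name": "Durham, K.", "title": "Manager, Production", "age": 35, "division": "Stemware"},
    {"name": "Stone, L.", "title": "Administrative Asst.", "age": 28, "division": "Flatware"},
]

# Hypothetical second table, used only to demonstrate the join.
divisions = [
    {"division": "China", "plant": "Plant A"},
    {"division": "Stemware", "plant": "Plant B"},
    {"division": "Flatware", "plant": "Plant C"},
]


def select(table, predicate):
    """Select: keep the rows (tuples) that meet the stated criteria."""
    return [row for row in table if predicate(row)]


def project(table, columns):
    """Project: keep only the named columns, dropping duplicate rows."""
    seen, result = set(), []
    for row in table:
        values = tuple(row[c] for c in columns)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(columns, values)))
    return result


def join(left, right, key):
    """Join: combine rows from two tables that share the same key value."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


over_40 = [row["name"] for row in select(employees, lambda row: row["age"] > 40)]
division_list = project(employees, ["division"])
with_plant = join(employees, divisions, "division")
```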
Relational Database Model
Smith, A.     Dir. Accounting                  43   China
Jones, W.     Dir. Total Quality Management    32   Stemware
Lee, J.       Dir. Information Technology      46   China
Durham, K.    Manager, Production              35   Stemware
Stone, L.     Administrative Asst.             28   Flatware
Company Data Models

MODEL : Hierarchical
ADVANTAGES : Speed and efficiency in database search.
DISADVANTAGES : Access to data is predefined by exclusively hierarchical relationships, predetermined by administrator. Limited search/query flexibility. Not all data is naturally hierarchical.

MODEL : Network database
ADVANTAGES : Many more relationships between data elements can be defined. Greater speed and efficiency than relational database models.
DISADVANTAGES : The most complicated model to design, implement, and maintain. Greater query flexibility than hierarchical model, but less than relational model.

MODEL : Relational database
ADVANTAGES : Conceptual simplicity; no predefined relationships among data. High flexibility in ad hoc querying. New data and records can be added easily.
DISADVANTAGES : Lower processing efficiency and speed. Data redundancy is common, requiring additional maintenance.
Logical Data Models (continued …)
Emerging Data Models
• Object-oriented database model - built around objects; an object combines a small amount of data with everything needed to perform an operation on that data
» Object - similar to an entity in that it represents a person, place, or thing, but it also contains all of the data that the object needs in order to perform an operation
» Attributes - characteristics that describe the state of that object
» Method - an operation, action, or behavior the object may undergo
» Messages - messages from other objects activate operations contained within the object
» Class - all the messages to which the object will respond, as
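These terms map closely onto an ordinary class definition. The `Account` and `Teller` classes below are hypothetical and show only the vocabulary, not how an object-oriented database stores objects.

```python
class Account:
    """Class: defines the attributes and methods that its objects share."""

    def __init__(self, owner, balance=0.0):
        # Attributes: describe the state of the object.
        self.owner = owner
        self.balance = balance

    def deposit(self, amount):
        # Method: an operation the object can perform on its own data.
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.balance += amount
        return self.balance


class Teller:
    def process_deposit(self, account, amount):
        # Message: another object asks the account to run one of its methods.
        return account.deposit(amount)


account = Account("Smith, A.", 100.0)
new_balance = Teller().process_deposit(account, 25.0)
```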
Logical Data Models (continued …)
Emerging Data Models (cont’)
• Object-relational database model - adds new object storage capabilities to relational database management systems
• Hypermedia database model - stores chunks of information in the form of nodes connected by links established by the user
Other Database Models
• Geographical information database - contains locational data for overlaying on maps or images
• Knowledge database - stores decision rules used to evaluate situations and help users make decisions like an expert
• Multimedia database - stores data on many media : sounds, video, images, graphics, animation, and text
Data Warehouses
• A data warehouse is a relational and/or multidimensional database management system designed to support management decision making.
• The data in the “warehouse” are stored in a single, agreed-upon format even when underlying operational databases store the data differently.
Data Warehouses Framework and Views
[Figure: data warehouse framework. Sources: legacy systems, OLTP, and external operational systems/data. Data preparation: select, extract, transform, integrate, maintain. Storage: metadata repository, enterprise data warehouse, target database(s) (RDB, MDDB), and data marts for Marketing, Risk Management, and Engineering. APIs/middleware connect these to the access applications: EIS/DSS, custom-built applications (4GL tools), production reporting tools, relational query tools, OLAP/ROLAP, data mining, and Web browsers.]
Data Warehouses (continued ...)
Data Warehouse Offers Many Business Advantages
• It provides business users with a “customer-centric” view of the company’s heterogeneous data by helping to integrate data from sales, service, manufacturing and distribution, and other customer-related business systems.
• It provides added value to the company’s customers by allowing them to access better information when the data warehouse is coupled with Internet technology.
• It consolidates data about individual customers and provides a repository of all customer contacts for segmentation modeling, customer retention planning, and
Data Warehouses (continued ...)
Data Warehouse Advantages (cont’)
• It removes barriers among functional areas by offering a way to reconcile views from multiple sources, thus providing a look at activities that cross functional lines.
• It reports on trends across multidivisional and/or multinational operating units, including trends or relationships in areas such as merchandising, production planning, and so forth.
Data Warehouses (continued ...)
Multidimensional Database Model
• can be the core of data warehouses
• data are stored in arrays
• consists of at least three dimensions
• dimensions are the edges of the cube, and represent the primary “views” of the business data
• the data are intimately related and can be viewed and analyzed from different perspectives, which are called dimensions
• allows for the effective, efficient, and convenient storage and retrieval of large volumes of data
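A three-dimensional cube can be sketched as values keyed by one coordinate per dimension. The choice of dimensions (region, product, period), the sales figures, and the `total` helper are all assumptions for illustration.

```python
# Each cell of the cube is addressed by (region, product, period).
cube = {
    ("East Coast", "China", "Q1"): 120,
    ("East Coast", "Stemware", "Q1"): 80,
    ("Midwest", "China", "Q1"): 60,
    ("Midwest", "China", "Q2"): 90,
    ("West Coast", "Flatware", "Q2"): 50,
}


def total(cube, region=None, product=None, period=None):
    """Sum the cells that match the given dimension values; None means 'all'."""
    return sum(
        value
        for (r, p, t), value in cube.items()
        if (region is None or r == region)
        and (product is None or p == product)
        and (period is None or t == period)
    )


midwest_sales = total(cube, region="Midwest")
china_sales = total(cube, product="China")
q1_sales = total(cube, period="Q1")
```

The same data answers a regional question, a product question, and a time question simply by fixing a different dimension.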
Data Warehouses (continued ...)
Data Marts
• a scaled-down version of a data warehouse that focuses on a particular subject area
• usually designed to support the unique business requirements of a specific department or business process. Example : Marketing data mart
• takes less time to build, costs less, and is less complex than a full data warehouse
• the indiscriminate introduction of multiple data marts with no linkage to each other, or to an enterprise data warehouse, will cause problems
Data Warehouses (continued ...)
Data Mining
• provides a means of extracting previously unknown, predictive information from the base of accessible data in data warehouses
• discovers hidden patterns, correlations, and relationships among organizational data
• predicts future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions
• functions of data mining
» classification
» clustering
» association
» sequencing
» forecasting
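Of the functions listed, association is the simplest to sketch: count how often two items appear in the same transaction. The transactions below are invented, and the support measure shown (share of transactions containing the pair) is one common way to score an association, not necessarily the one the chapter has in mind.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase transactions.
transactions = [
    {"Plates", "Bowls"},
    {"Plates", "Bowls", "Stemware"},
    {"Plates", "Flatware"},
    {"Bowls", "Stemware"},
]


def pair_support(baskets):
    """Return, for each item pair, the fraction of baskets containing both items."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}


support = pair_support(transactions)
```

Pairs with high support, such as plates and bowls in this toy data, are candidates for joint promotions or shelf placement.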
What’s in IT for Me?
For Accounting
• Data gathered about each transaction (business event) in the organization are stored in its databases
For Finance
• Computerized databases external to the organization, such as CompuStat or Dow Jones, provide financial data on organizations in its industry
What’s in IT for Me? (continued …)
For Marketing
• Databases containing customer names, addresses, purchases, amounts, etc., help to plan targeted marketing campaigns and to evaluate the success of previous campaigns.
• Data mining is critical for many marketing efforts to remain competitive.
For Production/Operations Management
• Organizational databases are accessed for determining optimum inventory levels for parts in a production process
• Information in databases is used to know when to perform required service on machines
What’s in IT for Me? (continued …) For Human Resources Management Organizational databases contain extensive data on employees, such as name, address, gender, race, age, salary, hiring date, current job descriptions, past job descriptions, and past performance evaluations
For MIS
• MIS positions range from data entry and data storage management to database management and data analysis