Data Warehousing

  • Uploaded by: Raghav
  • 0
  • 0
  • June 2020
  • PDF

This document was uploaded by user and they confirmed that they have the permission to share it. If you are author or own the copyright of this book, please report to us by using this DMCA report form. Report DMCA


Overview

Download & View Data Warehousing as PDF for free.

More details

  • Words: 1,341
  • Pages: 28
Data warehousing The basic concepts & its applications

What is a Data warehouse?

It means… 









collection of databases with information extracted from other databases. a repository of an organization's electronically stored data model of corporation's data, put together to answers the corporation's business questions. A data warehouse is designed to support strategic decision making. designed to facilitate reporting and analysis.

Comparing a Data Warehouse and an Operational Database Data Warehouse,

Operational Database

subject oriented

application oriented

integrated

multiple diverse sources

time –variant, current

real-time

nonvolatile

updateable

History of DATA WAREHOUSING





concept dates back to the late 1980s when IBM researchers Barry Devlin and Paul Murphy developed "business data warehouse". In essence, data warehousing concept was intended to provide an architectural model for the flow of data from operational systems to decision support environments



In absence of a data warehousing architecture, an enormous amount of redundancy of information was required to support the multiple decision support environments that usually existed.



In larger corporations it was typical for multiple decision support environments to operate independently. Each environment served different users but often required much of the same data.



The process of gathering, cleaning and integrating data from various sources, usually long existing operational systems was replicated for each environment.



the operational systems were frequently reexamined as new decision support requirements emerged.



Based on analogies with real-life warehouses, data warehouses were intended as large-scale collection/storage/staging areas for corporate data.



Data could be retrieved from one central point or data could be distributed to "retail stores" or "data marts" which were tailored for ready access by users.

Types of Data Warehousing Applications  Data

warehousing systems target three different types of applications:

 personal  query

productivity

and reporting

 planning

and analysis

"Personal productivity applications  spreadsheets,

statistical packages and graphics tools,  are useful for manipulating and presenting data on individual PCs.  Developed for a standalone environment, these tools address applications requiring only small volumes of warehouse data.

"Data query and reporting applications  deliver

warehouse-wide data access through simple, list-oriented queries,  the generation of basic reports.  These reports provide a view of historical data  they do not address the enterprise need for in-depth analysis and planning.

"Planning and analysis applications address essential business requirements as       

budgeting, forecasting, product line and customer profitability, sales analysis, financial consolidations manufacturing mix analysis --applications that use historical, projected and derived data.

Benefits of Data Warehousing

  

  

Has a subject area orientation

Data is arranged by subject rather than by application, and is more intuitive for users to navigate. closer in concept to the way decision makers think about their business.

Integrates data from multiple, diverse sources

the integration provided by a data warehouse was seen as very desirable. Allows grouping multiple, diverse sources of data & integrating them into single administrative suite

 



 

Allows for analysis of data over time data snapshots taken at times that are significant to the decision making process make it possible to analyze trends over time. This can readily indicate areas where further analysis is warranted.

Adds ad hoc reporting and enquiry In addition to the points highlighted in the definition, a data warehouse is designed to be accessible with end-user tools, and this allows ad hoc reporting and analysis by end-users.

 



  



Relieves the development burden on IT With data warehouse and users trained in the use of appropriate desktop tools, users can find answers to their own questions. Provides improved performance for complex analytical queries

provides improved performance for complex analytical queries Using denormalization and aggregation. Frequently used aggregates are often precomputed and materialized in views known as summary tables These materialized views provide fast access to integrated data, regardless of the original data sources.





 

Relieves processing burden on transaction oriented databases Provides analysis capabilities to decision makers Allows for a continuous planning process Converts Corporate Data into Strategic Information

Drawbacks of data warehousing

     

1. Time spent in careful analysis of measurable needs 2. Design and implementation effort 3. Hardware costs 4. Software costs 5. On-going support and maintenance 6. Resulting re-engineering effort

Design process of data warehouse

The actual design process for developing data warehouse talk to the users determine their needs in terms that can be measured design a database to support those needs document the data descriptions and other attributes design the logic for translating data from various sources into an integrated data store  Writing code for extracting data from various sources & transforming it into data warehouse, with updates to the metadata  Finally package the procedures to handle scheduling, management and maintenance     

Data Warehouse Architecture  Architecture,

in the context of an organization's data warehousing efforts, is a conceptualization of how the data warehouse is built.

 conceptualization

of data warehouse architecture consists of the following interconnected layers:

 

 

 

 

OPERATIONAL DATABASE LAYER source data for the data warehouse - An organization's EIS systems fall into this layer. INFORMATIONAL ACCESS LAYER data accessed for reporting and analyzing & tools for reporting and analyzing data - Business intelligence tools fall into this layer. DATA ACCESS LAYER The interface between the operational and informational access layer Tools to extract, transform, load data into the warehouse fall into this layer. METADATA LAYER The data directory -This is often usually more detailed than an operational system data directory. There are dictionaries for the entire warehouse and sometimes dictionaries for the data that can be accessed by a particular reporting and analysis tool.

Data Warehouse Components 

LOAD MANAGEMENT   



WAREHOUSE MANAGEMENT  



relates to the collection of info. from disparate internal or external sources. loading process includes summarizing, manipulating and changing data structures into format that lends itself to analytical processing. Actual raw data kept alongside or within data warehouse enabling construction of new and different representations.

relates to day-to-day management of data warehouse. management tasks associated with warehouse includes ensuring its availability, the effective backup of its contents & security.

QUERY MANAGEMENT   

relates to provision of access to contents of warehouse includes partitioning of info. into different areas with different privileges to different users. Access may be provided through custom-built applications, or ad hoc query tools.

Technologies Involved In Data Warehousing            

source system identification: data warehouse design and creation: changed data capture: data acquisition: data cleansing: data aggregation multi-dimensional analysis tools: business intelligence (bi) metadata management data mining tools: data visualization tools: query tools:

APPROACHES TO STORING DATA 

leading approaches to storing data in a data warehouse –

 DIMENSIONAL

APPROACH  NORMALIZED APPROACH.

DIMENSIONAL APPROACH 

transaction data are partitioned into either "facts" (numeric transaction data), or "dimensions", (reference information giving context to facts).

 

Key advantage : easier for the user to understand and to use. the retrieval of data from the data warehouse tends to operate very quickly.



disadvantages



to maintain integrity of facts and dimensions, loading the data warehouse with data from different operational systems is complicated difficult to modify data warehouse structure if organization adopting the dimensional approach changes the way in which it does business.





NORMALIZED APPROACH 

the data in the data warehouse are stored following the Codd normalization rule.



Tables grouped by subject areas that reflect general data categories data on customers, products, finance, etc.)



Key advantage it is straightforward to add information into the database.

    

disadvantage because of no. of tables involved, difficult for users to 1) Join data from different sources into meaningful information 2) Access the information without a precise understanding of sources of data & data structure of the data warehouse.

Related Documents

Data Warehousing
April 2020 35
Data Warehousing
October 2019 40
Data Warehousing
June 2020 23
Data Warehousing
June 2020 24
Data Warehousing
June 2020 33
Data Warehousing
June 2020 17

More Documents from ""

Filters
October 2019 48
The Digital Abstraction
October 2019 25
Data Warehousing
June 2020 23