Data Modelling and Loading
1
Data Modeling and Loading- First Steps • Data Modeling • ERM model • MDM / Star Schema model • BW Extended Star Schema • BW Master Data • InfoObjects • Attributes • Hierarchies • Text • Loading Master Data via Flat Files 2
5. Transactional Data
3
SAP BW Data Model • One “business process” is modeled at a time • Data storage optimized for reporting by a “Star Schema”
Dimension 2 Dimension 1
• Characteristics are structured together in related branches called “Dimensions” • The key figures, KPI's, and other calculations form the “Facts”
Dimension 3
Facts
Dimension n
Dimension 4
• This structure is the same for all sources 4
Example: Sales • Who did we sell to?
Product Dimension
• What did we sell? • Who sold it?
Customer Dimension
• How much did we sell? • Who did we compete against? • When did we sell?
Sales Dimension
Quantities Revenues Costs Rev./Group Time Dimension
Competition Dimension
5
Dimensions • Dimension tables are groupings of related characteristics. • A dimension table contains a generated primary key and characteristics. • The keys of the dimension tables are foreign keys in the fact table. Customer dimension C
Customer #
Region …
13970522
West
...
Product dimension P
Time dimension
Product #
Product group …
2101004
Displays
...
T
Period
Fiscal year
…
10
1999
...
6
Dimensions Example: Sales Customer
Product
Sales
Customer number
Material number
Salesperson
Customer name
Material text
Rep group
Cust Category
Material type
Sales territory
Cust Subcategory
Category
Sales region
Division
Subcategory
Sales district
Industry
Market key
Sales planning group
Revenue Class
MRP Type
Distribution key
Transportation zone
Material group 1
Currency
Planner
VAT #
Forecast model
Legal Status
Valuation class
Regional market
Standard cost
Cust Statistics group
Weight Volume
Incoterms
Storage conditions
Billing schedule
Creation Date
Price group Delivering plan
Competition Nielsen indicator SEC Code Primary competitor Secondary Competitor
Time
ABC Classification
Date
Account assignment group
Week
Address
Month
State
Fiscal Year
Country Region
7
Fact Table • A record of the fact table is uniquely defined by the keys of the dimension tables • A relatively small number of columns (key figures) and a large number of rows is typical for fact tables • A fact table is maintained during transaction data load
Fact table P C T
Quantity
Revenue
Discount Sales overhead
250 500,000 $
50,000 $
280,000 $
50 100,000 $
7,500 $
60,000 $
…
...
…
…
8
Star Schema • The combination of Fact and Dimension Tables is called a Star Schema. Customer dimension C
P C T
Customer # Region
…
13970522
...
Quantity
Revenue
west
Discount Sales overhead
250 500,000 $
50,000 $
280,000 $
50 100,000 $
7,500 $
60,000 $
…
...
…
…
Fact table P Product # 2101004
Product group
…
displays
...
Product dimension
T
Period
Fiscal year
10
1999
Time dimension 9
… ...
Example Star Schema: Sales
Sales
Competition Nielsen indicator
Customer
SEC Code
Customer number
Primary competitor
Customer name Cust. Category Cust. Subcategory
Industry Revenue Class Transportation zone Currency VAT # Legal Status Regional market Cust. Statistics group
Secondary Competitor
Facts
Division
Material Material number Material text Material type Category Subcategory
IncoTerms
Market key
Billing schedule
MRP Type
Price group
Material group 1
Delivering plan
Planner
ABC Classification
Forecast model
Account assignment group
Valuation class
Address
Weight Volume
State
Storage conditions
Country
Creation Date
Standard cost
Customer Material Competition Sales Time Qty sold List price Discounts Invoice price Fixed mfg cost Variable cost Moving average price Standard cost Contribution margin Expected ship date Actual ship date
Sales Salesperson Rep group Sales territory Sales region Time
Sales district
Date
Sales planning group
Week
Distribution key
Month Fiscal Year
Region
10
Extended Star Schema (Functional View) Sales hierarchy Zone 1 District 1 Territory 1
Zone 2 District 2
Territory 2
Territory 3 Territory 4
District 3 Territory 5
Customer master data: Attributes Customer-no
Name
Location
Fact table C P
customer-no
T
District 4
Territory 6
Territory 7
Territory 8
Territory 9
Industry key
territory chain
quantity sold
revenue
office
Sales InfoCube
head office
discount
sales overhead
stock value
Time dimension T
period
Product dimension P
District 5
Sales hierarchy
Customer dimension C
Zone 3
product-no
product group
brand
category
Product master data: Text product-no
language
product description 11
fiscal year
From Data Model to Database Terminology used to discuss the MDM modeling of a business process.
Quantities Revenues Costs Rev./Group
e
Sales Dimension
Customer Dimension
Time Dimension
Competition Dimension
Star Schema (Logical)
Ti m
Customer Dimension
Product Dimension
Product Dimension
InfoCube (Physical) Real data base tables linked together and residing on a BW database server. 12
InfoCube: SAP BW Design • Central data stores for reports and evaluations • Contains two types of data − Key Figures − Characteristics • 1 Fact Table and up to 16 Dimension Tables − 3 Dimensions are predefined by SAP −
Time
−
Unit
−
Info Package 13
Data Granularity • Data Granularity is defined as the “detail” of the database, the characteristics which describe our key figures. − Fundamental atomic level of data to be represented − The “by” words - for example, Sales by customer, by material • It determines how far you can “drill down” on the data. • Example: Time Granularity − Day versus Month − A customer buys the same product 2 to 3 times a month − With time granularity of Day : 2 or 3 fact table entries − With time granularity of Month : 1 record in the fact table but a loss of information (i.e. number of orders on different weekdays). 14
Performance versus Disk Space • The decision on granularity has the biggest impact on space and performance • Reducing granularity means losing information • With ‘normal’ star schemas (i.e. big fact table and small dimension tables) the design of dimensions is primarily guided by analytical needs. • Large dimension tables have a big impact on performance
15
6. Master Data
16
Characteristic InfoObject • BW term for Business Evaluation Object • A unique name containing technical information and business logic • InfoObject components: − Technical Definition (length, format, check routines, etc.) − Master Data, Texts − Attributes − Hierarchies − Compound Information
17
Scenario for New InfoObject • The legacy system and R/3 system have cost center numbers of different lengths • A new InfoObject (COSTC##) is needed with a length of 13 characters. • R/3 data will take the first 3 characters of the system ID as a prefix for identification purposes. BW InfoObject COSTC00 Master Data Table Cost Center#(13) Profit CenterPerson Resp 2930000007890 2940000006123 SAP1000000000 SAP2000000000
LEGACY COSTCENTER TABLE Cost Center#(13 char.) Profit CenterPerson Resp 2930000007890 2940000006123
5454 Joe 6547 Bjorne
5454 6547 32245 65465
Joe Bjorne Maria Ming
R/3 System (SYSTEM NAMESAP…..) = Profit Center Person Resp
Cost Center#(10 char.)
32245 Maria 65465 Ming
1000000000 2000000000 18
Creating a New InfoCube – Already Covered? 7. Activate 6. Choose Key Figures 5. Choose Time Characteristics 4. Assign Characteristics to Dimensions 3. Create Necessary User-Defined Dimensions 2. Choose Characteristics Specified in Data Model
1. Create New InfoCube Name in Selected InfoArea
19