Unicode

SAP TechEd ‘03 Las Vegas

Unicode July 7th 2005

Dr. Christian Hansen, NetWeaver DT Internationalization, SAP AG

Agenda

1. What is Unicode?
2. Who needs Unicode?
3. How to go there?

 SAP AG 2005, Dr. Christian Hansen

© 2003 SAP Labs, LLC


About Code Pages: Conventional Code Pages

Disadvantages of old standard code pages:
  • Each covers only a subset of all characters used
  • Incompatibilities between different code pages
  • Only restricted data exchange possible
  • Too many of them

[Figure: cloud of vendor names (Kyocera, Canon, Apple, HP, IBM, Microsoft) and code page names (ISO-1 to ISO-9, EBCDIC 0500, 0277, 697, Windows 1250 to 1257, ASCII, BIG-5, SJIS)]

SAP: Languages: 41, Characters: 22,378, Code Pages: 390

Solution: Unicode, one Code Page for all Scripts

Japanese, Chinese, Hebrew, Korean, Greek, Taiwanese, Icelandic, Russian, Ukrainian, English, Danish, Dutch, German, Finnish, French, Italian, Norwegian, Portuguese, Spanish, Swedish, Turkish, Thai, Croatian, Czech, Hungarian, Polish, Romanian, Slovakian, Slovene

And more languages can be supported easily without the need for new code pages or other new methods (e.g. Vietnamese!)

Solution: Unicode characters

[Figure: layout of the Unicode code space]
  • 65,000 characters: ASCII, General Scripts, Symbols, CJK Ideographs, Hangul, Compatibility
  • Surrogate Area: additional 1,000,000 characters
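The two round figures on this slide can be checked with a little arithmetic. The following is a minimal sketch, assuming the standard UTF-16 surrogate ranges (D800 to DBFF and DC00 to DFFF); the helper name `to_surrogate_pair` is made up for this illustration and does not come from the slides.

```python
from typing import Tuple

# One 16-bit unit addresses 65,536 code points (the slide rounds to 65,000).
BMP_SIZE = 0x10000
# 1,024 high surrogates combined with 1,024 low surrogates address the rest.
HIGH_SURROGATES = range(0xD800, 0xDC00)
LOW_SURROGATES = range(0xDC00, 0xE000)
SUPPLEMENTARY = len(HIGH_SURROGATES) * len(LOW_SURROGATES)  # 1,048,576


def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a code point above U+FFFF into a (high, low) surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only U+10000..U+10FFFF are encoded as surrogate pairs")
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


high, low = to_surrogate_pair(0x1F600)
print(f"{BMP_SIZE:,} + {SUPPLEMENTARY:,} code points")
print(f"U+1F600 -> {high:04X} {low:04X}")
```

The "additional 1,000,000 characters" are therefore the 1,024 × 1,024 combinations of a high and a low surrogate.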

Representation of Unicode Characters

UTF-16 – Unicode Transformation Format, 16 bit encoding
  • Fixed length, 1 character = 2 bytes (surrogate pairs = 2 + 2 bytes)
  • Platform-dependent byte order (big/little endian)
  • 2 byte alignment restriction

UTF-8 – Unicode Transformation Format, 8 bit encoding
  • Variable length, 1 character = 1...4 bytes
  • Platform independent
  • No alignment restriction
  • 7 bit US ASCII compatible

Character   Unicode scalar value   UTF-16 big endian   UTF-16 little endian   UTF-8
a           U+0061                 00 61               61 00                  61
ä           U+00E4                 00 E4               E4 00                  C3 A4
α           U+03B1                 03 B1               B1 03                  CE B1
㑹          U+3479                 34 79               79 34                  E3 91 B9
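The byte sequences in the table can be reproduced with any Unicode-aware runtime. The sketch below uses Python's built-in codecs; the helper names `hex_bytes` and `encoding_row` are made up for this illustration.

```python
def hex_bytes(text: str, encoding: str) -> str:
    """Encode text and return the bytes as space-separated uppercase hex."""
    return text.encode(encoding).hex(" ").upper()


def encoding_row(ch: str):
    """Return (scalar value, UTF-16 BE, UTF-16 LE, UTF-8) for one character."""
    return (
        f"U+{ord(ch):04X}",
        hex_bytes(ch, "utf-16-be"),
        hex_bytes(ch, "utf-16-le"),
        hex_bytes(ch, "utf-8"),
    )


# The four characters from the table; the last one is U+3479.
for ch in ["a", "ä", "α", "\u3479"]:
    print(ch, *encoding_row(ch), sep=" | ")
```

Note how the two UTF-16 columns differ only in byte order, while the UTF-8 length grows from one to three bytes as the scalar value increases.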

Internationalized Software with Unicode

The Unicode Standard was adopted by IBM and several other companies including Apple, HP, JustSystem, Microsoft®, Oracle, Sun™, Sybase, and Unisys. Unicode is required by modern standards such as XML, Java™, ECMAScript (JavaScript™), LDAP, CORBA 3.0, and WML. Unicode is also the official way to implement ISO/IEC 10646 and is supported in many operating systems and all modern browsers.

Check out http://www.unicode.org/ !


Scenario: System integration and non-Unicode

[Figure: a Web Dynpro front end passes the names "Jörg Müller" and "조희정" to a non-Unicode and a Unicode back end]
  • ABAP, KSC5601 (non-Unicode): 조희정, J#rg M#ller
  • J2EE, Unicode: 조희정, Jörg Müller
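The "J#rg M#ller" effect can be reproduced outside SAP. The following is a rough sketch, assuming Python's `euc_kr` codec as a stand-in for the KSC5601 code page named on the slide; the function `to_code_page` and the `#` marker are choices made for this illustration, not SAP behavior taken from the source.

```python
def to_code_page(text: str, codec: str, marker: str = "#") -> str:
    """Simulate storing text in a single code page.

    Every character the code page cannot represent is replaced by `marker`.
    """
    stored = []
    for ch in text:
        try:
            ch.encode(codec)
            stored.append(ch)
        except UnicodeEncodeError:
            stored.append(marker)
    return "".join(stored)


for name in ["Jörg Müller", "조희정"]:
    print("Korean code page:", to_code_page(name, "euc_kr"))
    print("Unicode (UTF-8): ", to_code_page(name, "utf-8"))
```

The Korean name survives the Korean code page, the German umlauts do not; with a Unicode encoding both names are stored unchanged.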

Scenario: System integration and Unicode

[Figure: a Web Dynpro front end passes the names "Jörg Müller" and "조희정" to two Unicode back ends]
  • ABAP, Unicode: 조희정, Jörg Müller
  • J2EE, Unicode: 조희정, Jörg Müller

SAP NetWeaver™ the integration platform?

[Figure: Evolution of mySAP Technology, the SAP NetWeaver™ stack]
  • People Integration: Multi-Channel Access, Portal, Collaboration
  • Information Integration: Business Intelligence, Knowledge Management, Master Data Management
  • Process Integration: Integration Broker, Business Process Management
  • Application Platform: J2EE, ABAP, DB and OS Abstraction
  • Across all layers: Composite Application Framework, Life Cycle Management
  • Interoperability: .NET, WebSphere

SAP NetWeaver™:
  • Unifies and aligns people, information and business processes
  • Integrates across technologies and organizational boundaries
  • A safe choice with full .NET and J2EE interoperability
  • The business foundation for SAP and partners
  • Powers business-ready solutions that reduce custom integration
  • Its Enterprise Services Architecture increases business process flexibility

SAP NetWeaver™ with non-Unicode ABAP stack

[Figure: the SAP NetWeaver™ stack diagram (Evolution of mySAP Technology) with "non-Unicode ABAP" in the Application Platform, shown in four build steps. Each step after the first adds a "no" marker to the diagram, three in total.]

Only solution for full integration: Unicode

[Figure: the same stack diagram with "Unicode ABAP" in the Application Platform; the markers now read "yes".]

Agenda

2. Who needs Unicode?
  • Everybody making full use of SAP NetWeaver

Old solution for multiple languages: MDMP*

[Figure: the same system shown in a West European view, a Japanese view and a Korean view; current configuration]

* Check your system type with report RSCPINST

End of support with NetWeaver ‘04 (see notes 79991 and 838402).
(As of release NetWeaver 04s and moving forward, MDMP will no longer be supported.)

Only solution for multiple languages: Unicode

Unicode

Agenda

2. Who needs Unicode?
  • Everybody making full use of SAP NetWeaver
  • All Korean customers using more languages than only English and Korean

Oracle Database setting KO16KSC5601

  • The default database character set for SAP systems running on Oracle is WE8DEC.
  • Note 695899 described a way to change this to the Korean character set KO16KSC5601.
  • Several Korean customers used this to make their database transparent for access by non-SAP products (e.g. DB-Link).
  • Due to an incompatible change in the Oracle database this will not be supported in the future (Oracle 10, ERP2005). See SAP note 858869 (still in progress).

The only way to make the database transparent is a Unicode setup.

Agenda

2. Who needs Unicode?
  • Everybody making full use of SAP NetWeaver
  • All Korean customers using more languages than only English and Korean
  • All Korean customers that have been using KO16KSC5601 as Oracle character set
  • …

Unicode Statistics: Current Figures

[Figure: Unicode shares since 6.10 (%); x-axis from 2004-08-11 to 2005-07-13 in four-week steps, y-axis "Shares (%)" from 2.5 to 8.0]

Worldwide, more than 1700 Unicode systems are already running.

How to go there: Unicode availability

Unicode enabled mySAP Components:

SAP Web Application Server:                        WAS 6.20
mySAP Customer Relationship Management (CRM):      mySAP CRM 4.0
mySAP Supply Chain Management (SCM):               mySAP SCM 4.X
mySAP Supplier Relationship Management (SRM):      mySAP SRM 4.0
mySAP Business Intelligence (BW):                  mySAP BW 3.5
mySAP Product Lifecycle Management (PLM):          Ramp up
mySAP Strategic Enterprise Management (SEM):       SEM 4.0
SAP R/3 Enterprise:                                Ext. Set 2.00

SAP Note 79991

How to go there: platform support

SAP supports Unicode systems on the following platforms:

Database system   W2K   Linux(3)   Solaris(1)   HP(1)   Tru64(1)   AIX(1)   AS/400   OS/390
SQL Server        yes   -          -            -       -          -        -        -
Oracle            yes   yes        yes          yes     yes        yes      -        -
DB/2              yes   yes        yes          yes     -          yes      yes      yes
SAP DB            yes   yes        yes          yes     yes        yes      -        -

(1) 64 bit versions only
(2) OS/390 support is planned for Q3/2004 with DB/2 V8.1.
(3) Tentatively, 64 bit version will be available in Q2/2004

There will be no support for Informix.

SAP Note 379940

How to go there: Unicode System installation

Unicode is the default for new installations!

How to go there: Unicode System conversion

Converting existing systems to Unicode needs several steps:
  • Upgrade to a Unicode compliant version of the application (see note 79991)
  • Adapt ABAP, C/C++ programs
  • Convert the database (System Copy)
  • Install Unicode executables
  • Check interfaces (3rd party software might not be ready for Unicode)

Conversion projects need thorough planning and execution!

What to consider I: Hardware Requirements

Based on parallel benchmarking of Unicode / non-Unicode test systems:

CPU             +30%, depending on existing scenario (MDMP, double byte)
RAM             +50%; Application Servers are based on UTF-16 internally
Database size   UTF-8*:  up to +35%
                UTF-8**: up to +10%
                UTF-16:  up to +60..70%
Network Load    UTF-8: almost no change due to efficient compression

* +35% is the observed maximum in growth for small systems (db size < 200GB)
** +10% is the observed maximum for bigger systems (db size > 200GB)

First customer conversions indicate: the DB size increase due to Unicode conversion is outweighed by the size decrease due to DB reorganization, so the DB actually shrinks!
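Why the database figures differ so much between UTF-8 and UTF-16 follows from how each encoding stores a character. The sketch below is only an illustration with made-up sample strings, not a reproduction of the SAP benchmark; `latin-1` and `euc_kr` are assumed as typical legacy code pages.

```python
# Hypothetical sample values with an assumed legacy code page for each.
SAMPLES = {
    "Customer": "latin-1",   # plain ASCII text
    "Müller": "latin-1",     # West European text with one umlaut
    "조희정": "euc_kr",       # Korean text, double byte in the legacy code page
}


def byte_sizes(text: str, legacy_codec: str) -> dict:
    """Return the stored size in bytes for the legacy code page, UTF-8 and UTF-16."""
    return {
        "legacy": len(text.encode(legacy_codec)),
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
    }


for text, codec in SAMPLES.items():
    print(text, byte_sizes(text, codec))
```

ASCII-heavy data keeps its size in UTF-8 but doubles in UTF-16, while double-byte Korean data keeps its size in UTF-16 and grows in UTF-8. Real growth depends on the data mix and on how much of the database is text at all.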

What to consider II: Outside communication

Outside communication: SAPNet quicklink Unicode@sap:
Unicode@sap --> Unicode library --> ABAP and Unicode --> TechEd 2004: CI253 External Unicode Interfaces

Communication: The Ideal Picture

The ideal picture: only Unicode components
  • Conversions are done algorithmically (1:1 relation)
  • No data misinterpretation
  • No data loss
  • All business relevant characters available at the same time

[Figure: Unicode components exchanging data: JAVA Application (Portal), RFC Client (SAP_UC), R/3 Enterprise, mySAP BW, 3rd Party, Files, Internet, ...]
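The "1:1 relation" means that converting between Unicode encoding forms needs no mapping tables and loses nothing. A minimal sketch with Python's built-in codecs (the function names are made up for this illustration):

```python
def utf8_to_utf16(data: bytes) -> bytes:
    """Convert UTF-8 bytes to UTF-16 (little endian) bytes."""
    return data.decode("utf-8").encode("utf-16-le")


def utf16_to_utf8(data: bytes) -> bytes:
    """Convert UTF-16 (little endian) bytes back to UTF-8 bytes."""
    return data.decode("utf-16-le").encode("utf-8")


original = "Jörg Müller 조희정 α".encode("utf-8")
round_trip = utf16_to_utf8(utf8_to_utf16(original))
print(round_trip == original)
```

Because both forms encode the same scalar values, the round trip returns the original bytes for any valid input.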

Communication: Reality

The reality: Unicode and non-Unicode components
  • Conversions between incompatible code pages everywhere
  • Only common subset exchangeable
  • Special rules have to be obeyed to make communication possible

[Figure: mixed landscape: JAVA Application (Portal); R/3 4.6C (ISO8859-1, SJIS); RFC Client (char), SAP_CODEPAGE = 1100; mySAP BW (ISO8859-1); 3rd Party (EBCDIC); R/3 Enterprise; Files; Internet pages declaring charset=iso-8859-1, charset=windows-1257, charset=Shift_JIS and charset=utf-8]
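"Only common subset exchangeable" can be made concrete by checking which characters every participating code page can represent. The sketch below assumes four Python codecs as stand-ins for code pages of the kind shown in the figure (`latin-1` for ISO8859-1, `cp1251`, `shift_jis` for SJIS, `cp500` for EBCDIC); the helper names and the sample text are made up.

```python
CODE_PAGES = ["latin-1", "cp1251", "shift_jis", "cp500"]


def encodable(ch: str, codec: str) -> bool:
    """Return True if the code page can represent the character."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def is_exchangeable(ch: str) -> bool:
    """Return True if every code page in the landscape can represent the character."""
    return all(encodable(ch, codec) for codec in CODE_PAGES)


sample = "Jörg Müller, Москва, 東京, SAP"
common = "".join(ch for ch in sample if is_exchangeable(ch))
print(common)
```

In this landscape only the ASCII characters of the sample survive; umlauts, Cyrillic and Kanji each fail in at least one code page.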

What to consider III: ABAP Programming

ABAP Programming: SAPNet quicklink Unicode@sap:
Unicode@sap --> Unicode library --> ABAP and Unicode --> TechEd 2004: CI252 Making ABAP Programs Unicode enabled

Transparent Unicode Enabling of R/3

Character Expansion Model: separate Unicode and non-Unicode versions of R/3, built from the same ABAP source

Non-Unicode R/3:
  • 1 character = 1 byte (types C, N, D, T, STRING)
  • Non-Unicode kernel
  • Non-Unicode database

Unicode R/3:
  • 1 character = 2 bytes (UTF-16) (types C, N, D, T, STRING)
  • Unicode kernel
  • Unicode database

No explicit Unicode data type in ABAP
Single ABAP source for Unicode and non-Unicode systems

Transparent Unicode Enabling of R/3

Implications:
  • Major part of ABAP coding is ready for Unicode without any changes
  • Challenge: clear distinction between character and byte processing: 1 Character ≠ 1 Byte
  • Minor part of ABAP coding has to be adapted to comply with Unicode restrictions
  • Find the relevant places with transaction UCCHECK (call the transaction today if you are already on SAP_BASIS ≥ 6.10)

Unicode Restrictions – Example

Access To Structures With Offset/Length
  • Structure must begin with characters
  • Offset/length counted in characters
  • Access only allowed within the character type prefix of a structure

Structure stru:   N(6) | C(4) | X(3) | C(5)

Access syntax: +off(len)

… = stru+13(5).   "Unicode error!
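The rule behind the "Unicode error!" can be modeled in a few lines. This is a simplified sketch in Python, not ABAP and not the actual syntax check: it assumes that only the leading run of character-type fields (C, N, D, T) counts as the prefix, and the function names are made up for this illustration.

```python
from typing import List, Tuple

# Character-like flat ABAP types as named on the slides.
CHAR_TYPES = {"C", "N", "D", "T"}


def char_prefix_length(fields: List[Tuple[str, int]]) -> int:
    """Length, in characters, of the leading run of character-type fields."""
    total = 0
    for abap_type, length in fields:
        if abap_type not in CHAR_TYPES:
            break  # the first non-character field ends the prefix
        total += length
    return total


def access_allowed(fields: List[Tuple[str, int]], offset: int, length: int) -> bool:
    """Check whether stru+offset(length) stays inside the character-type prefix."""
    if offset < 0 or length <= 0:
        return False
    return offset + length <= char_prefix_length(fields)


STRU = [("N", 6), ("C", 4), ("X", 3), ("C", 5)]
print(char_prefix_length(STRU))     # 10
print(access_allowed(STRU, 6, 4))   # True: the C(4) field lies inside the prefix
print(access_allowed(STRU, 13, 5))  # False: crosses the X(3) field
```

In the slide's structure the prefix is N(6) plus C(4), so 10 characters. The access `stru+13(5)` starts behind the byte field X(3), whose position in characters is no longer well defined once a character takes two bytes.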

ABAP list layout in Unicode systems

ABAP lists: Difference between memory and display length

Example: the character '한'

              Character units in the memory   Display columns
Non-Unicode   2                               2
Unicode       1                               2

1 Character ≠ 1 Display Column

Example: ABAP list layout in Unicode systems

[Figure: the same ABAP list shown on a non-Unicode system and on a Unicode system]
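The three numbers in the table can be derived for any character. The sketch below is an approximation, assuming the Unicode East Asian Width property as the measure for display columns and `euc_kr` as the non-Unicode Korean code page; it ignores combining characters, and `display_columns` is a made-up helper, not an ABAP function.

```python
import unicodedata


def display_columns(text: str) -> int:
    """Approximate display width: wide and fullwidth characters take 2 columns."""
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


ch = "한"
non_unicode_units = len(ch.encode("euc_kr"))       # bytes in the Korean code page
unicode_units = len(ch.encode("utf-16-le")) // 2   # 16-bit units in UTF-16
print(non_unicode_units, unicode_units, display_columns(ch))
```

In the non-Unicode system the memory length happened to equal the display width; in a Unicode system the two have to be tracked separately when laying out list columns.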

What to consider IV: Database conversion

Database Conversion: SAPNet quicklink Unicode@sap:
Unicode@sap --> Unicode library --> Unicode Conversion Library --> Basic Information --> CI206 Conversion of SAP Systems to Unicode

Conversion Preparation: Concept

Before the database conversion to Unicode is executed, all text data must be assigned a correct code page.
  • Single Code Page Systems / Unambiguous Blended Code Page Systems (ca. 90% of all customer installations): easy
  • MDMP Systems / Ambiguous Blended Code Page Systems (ca. 10% of all customer installations): complex

Why?
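One way to see why the code page assignment matters: the stored bytes carry no information about the code page they were written in, so the same bytes yield different text under different assumptions. A minimal sketch with Python codecs standing in for the code pages (`interpretations` is a made-up helper):

```python
def interpretations(raw: bytes, codecs) -> dict:
    """Decode the same bytes under several code page assumptions."""
    return {codec: raw.decode(codec) for codec in codecs}


# Bytes written by a Korean user in a system with several code pages.
raw = "가".encode("euc_kr")

for codec, text in interpretations(raw, ["euc_kr", "latin-1", "cp1251"]).items():
    print(f"{codec:8} {raw.hex(' ').upper()} -> {text!r}")
```

In a single code page system there is only one candidate, which makes the assignment easy. In an MDMP system the correct code page has to be determined for each piece of text before conversion, otherwise the wrong interpretation is written to the Unicode database permanently.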

Documentation

The following documents are required for the conversion of non-Unicode SAP systems to Unicode:
  • Unicode Conversion Guide; available for Web AS 6.20/6.30/6.40.
  • System Copy Guide; available for Web AS 6.20/6.30/6.40.
  • SAP Note 548016; valid for Web AS 6.20/6.30/6.40.

Single Code Page and MDMP!

References and Contacts

www.service.sap.com/unicode@sap

Unicode Conversion Library

Summary

1. What is Unicode?
  • The state of the art technology for internationalized text processing

2. Who needs Unicode?
  • Everybody making full use of SAP NetWeaver
  • All Korean customers using more languages than only English and Korean
  • All Korean customers that have been using KO16KSC5601 as Oracle character set

3. How to go there?
  • New installation
  • System conversion

Further information

Find more information about Unicode at:
  • service.sap.com/Unicode@sap
  • service.sap.com/Unicode

Unicode Conversion:
  • Unicode@sap --> Unicode library --> Unicode Conversion Library --> Basic Information --> CI206 Conversion of SAP Systems to Unicode
  • Unicode@sap --> Unicode library --> Unicode Conversion Library --> Unicode Conversion Kit 620 --> Unicode Conversion Guide Web AS 6.20/6.30 SP 50

ABAP Unicode programming:
  • Unicode@sap --> Unicode library --> ABAP and Unicode --> TechEd 2004: CI252 Making ABAP Programs Unicode enabled
  • Unicode@sap --> Unicode library --> ABAP and Unicode --> TechEd 2004: CI253 External Unicode Interfaces

Details for further reading:
  • Unicode@sap --> Unicode library --> ABAP and Unicode --> ABAP Programs in Unicode Systems: Requirements
  • Unicode@sap --> Unicode library --> ABAP and Unicode --> ABAP List Layout in Unicode Systems: Development Guide

Also recommended: SAP Unicode learning maps available at quicklink rkt-unicode.
