Data Compression Tech-cs

  • Uploaded by: lipika008
  • 0
  • 0
  • June 2020
  • PDF

This document was uploaded by user and they confirmed that they have the permission to share it. If you are author or own the copyright of this book, please report to us by using this DMCA report form. Report DMCA


Overview

Download & View Data Compression Tech-cs as PDF for free.

More details

  • Words: 2,079
  • Pages: 34
National Institute of Science and Technology

Technical Seminar Presentation 2005

A Review of Data Compression Techniques Presented by Sudeepta Mishra Roll# CS200117052 At

NIST,Berhampur Under the guidance of Mr. Rowdra Ghatak Sudeepta Mishra

CS200117052

[1]

National Institute of Science and Technology

Technical Seminar Presentation 2005

Introduction • Data compression is the process of encoding data so that it takes less storage space or less transmission time than it would if it were not compressed. • Compression is possible because most real-world data is very redundant

Sudeepta Mishra

CS200117052

[2]

National Institute of Science and Technology

Technical Seminar Presentation 2005

Different Compression Techniques • Mainly two types of data Compression techniques are there. – Loss less Compression.

Useful in spreadsheets, text, executable program Compression. – Lossy less Compression. Compression of images, movies and sounds.

Sudeepta Mishra

CS200117052

[3]

National Institute of Science and Technology

Technical Seminar Presentation 2005

Types of Loss less data Compression • Dictionary coders. – Zip (file format). – Lempel Ziv. • Entropy encoding. – Huffman coding (simple entropy coding). • Run-length encoding.

Sudeepta Mishra

CS200117052

[4]

National Institute of Science and Technology

Technical Seminar Presentation 2005

Dictionary-Based Compression • Dictionary-based algorithms do not single symbols as variable-length bit they encode variable-length strings of as single tokens. • The tokens form an index into a dictionary. • If the tokens are smaller than the they replace, compression occurs.

Sudeepta Mishra

encode strings; symbols phrase phrases

CS200117052

[5]

National Institute of Science and Technology

Technical Seminar Presentation 2005

Types of Dictionary • Static Dictionary. • Semi-Adaptive Dictionary. • Adaptive Dictionary. – Lempel Ziv algorithms belong to this category of dictionary coders. The dictionary is being built in a single pass, while at the same time encoding the data. – The decoder can build up the dictionary in the same way as the encoder while decompressing the data.

Sudeepta Mishra

CS200117052

[6]

National Institute of Science and Technology

Technical Seminar Presentation 2005

Dictionary-Based Compression: Example • Using a English Dictionary the string: “A good example of how dictionary based compression works” • Gives : 1/1 822/3 674/4 1343/60 928/75 550/32 173/46 421/2 • Using the dictionary as lookup table, each word is coded as x/y, where, x gives the page no. and y gives the number of the word on that page. If the dictionary has 2,200 pages with less than 256 entries per page: Therefore x requires 12 bits and y requires 8 bits, i.e., 20 bits per word (2.5 bytes per word). Using ASCII coding the above string requires 48 bytes, whereas our encoding requires only 20 (<-2.5 * 8) bytes: 50% compression. Sudeepta Mishra

CS200117052

[7]

National Institute of Science and Technology

Technical Seminar Presentation 2005

Lempel Ziv • It is a family of algorithms, stemming from the two algorithms proposed by Jacob Ziv and Abraham Lempel in their landmark papers in 1977 and 1978.

LZ77

LZ78 LZJ

LZR

LZFG

LZSS

LZH

LZB

Sudeepta Mishra

LZC

LZW LZT

LZMW

CS200117052

[8]

National Institute of Science and Technology

Technical Seminar Presentation 2005

LZW Algorithm • It is An improved version of LZ78 algorithm. • Published by Terry Welch in 1984. • A dictionary that is indexed by “codes” is used. The dictionary is assumed to be initialized with 256 entries (indexed with ASCII codes 0 through 255) representing the ASCII table.

Sudeepta Mishra

CS200117052

[9]

National Institute of Science and Technology

Technical Seminar Presentation 2005

The LZW Algorithm (Compression) W = NIL; while (there is input){ K = next symbol from input; if (WK exists in the dictionary) { W = WK; } else { output (index(W)); add WK to the dictionary; W = K; } } Sudeepta Mishra

CS200117052 [10]

National Institute of Science and Technology

Technical Seminar Presentation 2005

The LZW Algorithm (Compression) Flow Chart START W= NULL YES STOP

IS EOF ?

NO

K=NEXT INPUT YES W=WK

IS WK FOUND? NO

OUTPUT INDEX OF W ADD WK TO DICTIONARY W=K

Sudeepta Mishra

CS200117052 [11]

National Institute of Science and Technology

Technical Seminar Presentation 2005

The LZW Algorithm (Compression) Example • Input string is • The Initial Dictionary contains symbols like a, b, c, d with their index values as 1, 2, 3, 4 respectively. • Now the input string is read from left to right. Starting from a.

Sudeepta Mishra

a b d c a d a c

a b c d

1 2 3 4 CS200117052 [12]

National Institute of Science and Technology

Technical Seminar Presentation 2005

The LZW Algorithm (Compression) Example • W = Null • K=a • WK = a In the dictionary.

a b d c a d a K

a b c d Sudeepta Mishra

c

1 2 3 4 CS200117052 [13]

National Institute of Science and Technology

Technical Seminar Presentation 2005

The LZW Algorithm (Compression) Example • K = b. • WK = ab is not in the dictionary. • Add WK to dictionary • Output code for a. • Set W = b

a b d c a d a c K

1 a 1

ab

5

b 2 c 3 d 4

Sudeepta Mishra

CS200117052 [14]

National Institute of Science and Technology

Technical Seminar Presentation 2005

The LZW Algorithm (Compression) Example • K=d • WK = bd Not in the dictionary. Add bd to dictionary. • Output code b • Set W = d

a b d c a d a c K

1 2 a 1

ab

5

b 2 bd

6

c 3 d 4 Sudeepta Mishra

CS200117052 [15]

National Institute of Science and Technology

Technical Seminar Presentation 2005

The LZW Algorithm (Compression) Example

• K=a • WK = da not in the dictionary. • Add it to dictionary. • Output code d • Set W = a

a b d a b d a c K

1

2 a 1

4 ab

5

b 2 bd c 3 da

6 7

d 4 Sudeepta Mishra

CS200117052 [16]

National Institute of Science and Technology

Technical Seminar Presentation 2005

The LZW Algorithm (Compression) Example • K=b • WK = ab It is in the dictionary.

a b d a b d a c K

1

2 a 1

4 ab

5

b 2 bd c 3 da

6 7

d 4 Sudeepta Mishra

CS200117052 [17]

National Institute of Science and Technology

Technical Seminar Presentation 2005

The LZW Algorithm (Compression) Example

• K=d • WK = abd Not in the dictionary. • Add W to the dictionary. • Output code for W. • Set W = d

Sudeepta Mishra

a b d a b d a c K

1

2 a 1

4

5

ab

5

b 2 bd c 3 da

6 7

d 4 abd

8 CS200117052 [18]

National Institute of Science and Technology

Technical Seminar Presentation 2005

The LZW Algorithm (Compression) Example

• K=a • WK = da In the dictionary.

a b d a b d a c K

1

2 a 1

Sudeepta Mishra

4

5

ab

5

b 2 bd c 3 da

6 7

d 4 abd

8 CS200117052 [19]

National Institute of Science and Technology

Technical Seminar Presentation 2005

The LZW Algorithm (Compression) Example

• K=c • WK = dac Not in the dictionary. • Add WK to the dictionary. • Output code for W. • Set W = c • No input left so output code for W. Sudeepta Mishra

a b d a b d a c K

1

2 a 1

4

5

7

ab

5

b 2 bd c 3 da

6 7

d 4 abd

8

dac

9

CS200117052 [20]

National Institute of Science and Technology

Technical Seminar Presentation 2005

The LZW Algorithm (Compression) Example

• The final output string is 124573 • Stop.

Sudeepta Mishra

a b d a b d a c K

1

2

4

5

7 3

a 1

ab

5

dac

b 2 bd c 3 da

6

d 4 abd

8

9

7

CS200117052 [21]

National Institute of Science and Technology

Technical Seminar Presentation 2005

LZW Decompression Algorithm read a character k; output k; w = k; while ( read a character k ) /* k could be a character or a code. */ {

entry = dictionary entry for k; output entry; add w + entry[0] to dictionary; w = entry;

}

Sudeepta Mishra

CS200117052 [22]

Technical Seminar Presentation 2005 National Institute of Science and Technology

LZW Decompression Algorithm Flow Chart START K=INPUT Output K W=K YES

IS EOF

STOP

?

NO K=NEXT INPUT ENTRY=DICTIONARY INDEX (K) Output ENTRY ADD W+ENTRY[0] TO DICTIONARY

Sudeepta Mishra

W=ENTRY

CS200117052 [23]

National Institute of Science and Technology

Technical Seminar Presentation 2005

The LZW Algorithm (Decompression) Example

• K=1 • Out put K (i.e. a) • W=K

1

2

4

5

7

3

K

a a 1 b 2 c 3 d 4

Sudeepta Mishra

CS200117052 [24]

National Institute of Science and Technology

Technical Seminar Presentation 2005

The LZW Algorithm (Decompression) Example

• • • •

K=2 entry = b Output entry Add W + entry[0] to dictionary • W = entry[0] (i.e. b)

1

2

4

5

7

3

K

a b a 1

ab

5

b 2 c 3 d 4

Sudeepta Mishra

CS200117052 [25]

National Institute of Science and Technology

Technical Seminar Presentation 2005

The LZW Algorithm (Decompression) Example

• • • •

K=4 entry = d Output entry Add W + entry[0] to dictionary • W = entry[0] (i.e. d)

1

2

4

5

7

3

K

a b d a 1

ab

5

b 2 bd c 3

6

d 4

Sudeepta Mishra

CS200117052 [26]

National Institute of Science and Technology

Technical Seminar Presentation 2005

The LZW Algorithm (Decompression) Example

• • • •

K=5 entry = ab Output entry Add W + entry[0] to dictionary • W = entry[0] (i.e. a)

1

2

4

5

7

3

K

a b d a b a 1

ab

5

b 2 bd c 3 da

6 7

d 4

Sudeepta Mishra

CS200117052 [27]

National Institute of Science and Technology

Technical Seminar Presentation 2005

The LZW Algorithm (Decompression) Example

• • • •

K=7 entry = da Output entry Add W + entry[0] to dictionary • W = entry[0] (i.e. d)

Sudeepta Mishra

1

2

4

5

7

3

K

a b d a b d a a 1

ab

5

b 2 bd c 3 da

6

d 4 abd

8

7

CS200117052 [28]

National Institute of Science and Technology

Technical Seminar Presentation 2005

The LZW Algorithm (Decompression) Example

• • • •

K=3 entry = c Output entry Add W + entry[0] to dictionary • W = entry[0] (i.e. c)

Sudeepta Mishra

1

2

4

5

7

3 K

a b d a b d a c a 1

ab

5

b 2 bd c 3 da

6

d 4 abd

8

dac

9

7

CS200117052 [29]

National Institute of Science and Technology

Technical Seminar Presentation 2005

Advantages • As LZW is adaptive dictionary coding no need to transfer the dictionary explicitly. • It will be created at the decoder side. • LZW can be made really fast, it grabs a fixed number of bits from input, so bit parsing is very easy, and table look up is automatic.

Sudeepta Mishra

CS200117052 [30]

National Institute of Science and Technology

Technical Seminar Presentation 2005

Problems with the encoder • What if we run out of space? – Keep track of unused entries and use LRU (Last Recently Used). – Monitor compression performance and flush dictionary when performance is poor.

Sudeepta Mishra

CS200117052 [31]

National Institute of Science and Technology

Technical Seminar Presentation 2005

Conclusion • LZW has given new dimensions for the development of new compression techniques. • It has been implemented in well known compression format like Acrobat PDF and many other types of compression packages. • In combination with other compression techniques many other different compression techniques are developed like LZMS.

Sudeepta Mishra

CS200117052 [32]

Technical Seminar Presentation 2005 National Institute of Science and Technology

REFERENCES [1] http://www.bambooweb.com/articles/d/a/Data_Compression.html [2] http://tuxtina.de/files/seminar/LempelZivReport.pdf [3] BELL, T. C., CLEARY, J. G., AND WITTEN, I. H. Text Compression. Prentice Hall, Upper Sadle River, NJ, 1990. [4] http://www.cs.cf.ac.uk/Dave/Multimedia/node214.html [5] http://download.cdsoft.co.uk/tutorials/rlecompression/RunLength Encoding (RLE) Tutorial.htm [6] David Salomon, Data Compression The Complete Reference, Second Edition. Springer-Verlac, New York, Inc, 2001 reprint. [7] http://www.programmersheaven.com/2/Art_Huffman_p1.htm [8] http://www.programmersheaven.com/2/Art_Huffman_p2.htm [9] Khalid Sayood, Introduction to Data Compression Second Edition, Chapter 5, pp. 137-157, Harcourt India Private Limited. Sudeepta Mishra

CS200117052 [33]

National Institute of Science and Technology

Technical Seminar Presentation 2005

Thank You

Sudeepta Mishra

CS200117052 [34]

Related Documents


More Documents from "lipika008"