Special Topics Data Compression
Instructor Introduction
Instructor: Eyas El-Qawasmeh Email :
[email protected] Office hours : 10-12. Wen and Monday. Office location : D1 Level 0: One floor up from the chemistry department.
7A Objectives • Understand the essential ideas underlying data compression. • Become familiar with the different types of compression algorithm. • Be able to describe the most popular data compression algorithms in use today and know the applications for which each is suitable.
What is compression? Compression, sometimes known as "packing," refers to the creation of a smaller file from a larger file (or group of files). Alternatively, from Webopedia, compression is "storing data in a format that requires less space than usual."
Why compression? Storage - Compressed files take up less space. Speed/efficiency - In many cases smaller files can be executed or read in less time. Bandwidth/transfer time - Smaller files take less time to download/upload.
Different tactics: Lossless and Lossy compression Lossless Compression - No data (in a technical sense) is lost. Therefore, the compression process is reversible. Lossy Compression - Some data is lost during compression. This process is irreversible.
Note Lossy compression generally creates smaller files than lossless compression. However, many types of files cannot undergo lossy compression without losing functionality - once they're compressed, you must to be able to decompress them in order to read them. In these cases, lossless compression can be used to store smaller files, and the file can be expanded when a person needs to be access it.
Data Compression • Looks for repetitive sequences or patterns in data – e.g. the the quick the brown fox the
• We are more repetitive than we think – text often compresses over 50% • Lossless vs. lossy – example: image formats: gif is lossless and jpg is lossy
Data Compression • History: – Not a new concept, dates back to infancy of computers – Perhaps first mainstream use was for transmitting data over phone lines • saved time and money
– Also used for archiving data
Data Compression • Years of research have gone into data compression, but findings can be compressed into a short list: – LZW compression - Lempel-Ziv-Welch – builds substring dictionary (GIF, TIFF, ZIP, gzip, many others) – Huffman Encoding - uses binary tree and symbols (TIFF, bzip2) – Burrows-Wheeler block sort - transform data to make more compressible (bzip2) – RLE - run length encoded (PCX) – others are lossy - JPG, MPG, MP3, WAV - many use FFT
Why Data Compression is needed • Disk space cheap now - 3.3c/MB or 2P/MB – But more drive space is now needed as well. • not so cheap anymore
– Examples: – Bioinformatics regularly deals with databases of information gigabytes in size • 1 DNA string can be 650MB of data
– Images: • Uncompressed pictures, say 800x600 can be 50MBs!!!!
• But time is money, so want fast decompression
Why Data Compression is needed
Size of applications is going from large to larger MP3, MPEG, Tiff, Facsimile (fax), …etc.
Fax has about 4 million dots/page more than 1 minutes over 56Kbps
TV / Motion Pictures uses 30 pictures (frames) / second • 200,000 pixels / frames • Color pictures require 3 bytes for each pixel (RGB) • Each frame has 200,000 * 24 = 4.8 Mbits • 2-hour movie requires 216,000 pictures total bits for such movie = 216,000 * 4.8 Mbits = 1.0368 x 1012 • This is much higher than the capacity of DVDs
Simple Compression Example Assume only uppercase characters are to be sent Assume only uppercase characters are to be sent
ASCII can be used 8-bit * number of characters to sent
Alternatively, a different 5-bit code can be used A: B: C:
00000 00001 00010 : :
Y: Z:
37.5% reduction is achieved
11000 11001
Compression Factor • A good metric for compression is the compression factor (or compression ratio) given by:
• If we have a 100KB file that we compress to 40KB, we have a compression factor of:
Common Applications • Text compression – loss-less, gzip uses Lempel-Ziv coding, 3:1 compression – better than Huffman
• Audio compression – lossy, mpeg 3:1 to 24:1 compression – MPEG = motion picture expert group
• Image compression – lossy, jpeg 3:1 compression – JPEG = Joint photographic expert group
• Video compression – lossy, mpeg 27:1 compression
Example: bzip2 • Source code freely available - current version 1.0.2 • Generally achieves good compression ratio, better than gzip • Can be slow at compression but very fast decompression • Relatively platform independent, patent free
Example: Symbol table compression • This very simplified, because this done would binary representation, instead of character • Example text: What do you do when you see an endangered animal eating an endangered plant? –George Carlin • Compressed (symbol table and text) 1 what 2 do 3 you 4 when 5 see 6 an 7 endangered 8 animal 9 eating 10 plant 11? 12 – 13 George 14 Carlin 1 2 3 2 4 3 5 6 7 8 6 7 10 11 12 13 14
Audio Compression • Sounds can be represented as a vector valued function • At any point in time, a sound is a combination of different frequencies of different strengths • For example, each note on a piano yields a specific frequency. • Also, our ears, like pianos, have cilia that responds to specific frequencies. • Just like sin(x) can be approximated by small number of terms, e.g. x -x^3/3+x^5/120…, so can sound. • Transforming a sound into its “spectrum” is done mathematically by a fourier transform. • The spectrum can be played back, as on computer with sound cards.
Audio • Using many frequencies, as in CDs, yields a good approximation Using few frequencies, as in telephones, a poor approximation • Sampling frequencies yields compression ratios between 6 to 24, depending on sound and quality • High-priced electronic pianos store and reuse “samples” of concert pianos • High filter: removes/reduces high frequencies, a common problem with aging • Low filter: removes/reduces low frequencies • Can use differential methods: – only report change in sounds
Image Compression • with or without loss, mostly with – who cares about what the eye can’t see
• Black and white images can regarded as functions from the plane (R^2) into the reals (R), as in old TVs – positions vary continuous, but our eyes can’t see the discreteness around 100 pixels per inch.
• Color images can be regarded as functions from the plane into R^3, the RGB space. – Colors are vary continuous, but our eyes sample colors with only 3 difference receptors (RGB)
• Mathematical theories yields close approximation – there are spatial analogues to fourier transforms
Image Compression
• faces can be done with eigenfaces – images can be regarded a points in R^(big) – choose good bases and use most important vectors – i.e. approximate with fewer dimensions: – JPEG, MPEG, GIF are compressed images
Video Compression • Uses DCT (discrete cosine transform) – Note: Nice functions can be approximated by • sum of x, x^2,… with appropriate coefficients • sum of sin(x), sin(2x),… with right coefficients • almost any infinite sum of functions – DCT is good because few terms give good results on images. – Differential methods used: • only report changes in video
Revisiting Image compression
Image Compression
Image Representation Images are made up of very small dots (pixels)
Is there really such thing as black & white photos?
Revisiting Image compression • Like file compression • many different formats – common image formats: TIFF, JPG, GIF, and PNG – There are dozens of other formats. – We'll look at 2 different formats • lossless format: GIF • lossy format: JPG
Image Compression
Different video colors may be obtained using combination of Red, Green & Blue (RGB)
8 bits can be used to represent each of the tree colors
A total of 24 bits 224 different combinations
Since human eye cannot distinguish these many colors, we think of it as true color
Image Compression
An alterative to RGB is YIQ, which is based on RGB
YIQ uses 8-bit group for luminance (brightness), and two other 8-bit (each) for chrominance (color) For example, Y=0.30R+0.59G+.11B (luminance) I=0.60R-0.28G-0.32B (chrominance) Q=0.21R-0.52G+0.31B (chrominance)
Human eye is more sensitive to luminance than chrominance; that is some loss in colors through transmission may not be visually detectable
That is useful information when it comes to image compression
Image Compression
Regardless of using RGB or YIQ, there is 3 8-bit groups per pixel
To fill a 640 x 480 computer screen, we need: 640 * 480 * 24 7,372,800 bits
Video usually uses 30 images/sec, and transfer may be done simultaneously to many users
This may easily average to Gbps rate and higher, which is far more than what current technology (as of 2006, when this was written) can handle
Bitmaps • In a bitmap image – the image file has to define the exact color of every pixel in the image. – For example, imagine a typical bitmap on the web that is 400 by 400 pixels. • It would needs 24 bits per pixel for 160,000 pixels, or 480,000 bytes (480 MB). • That would be a huge image file • both the PNG/GIF and JPG formats compress the image in different ways.
GIF Files
Graphics Interchange Format
The number of colors that GIF works with is only 256 (28 in contrast to 224 for JPEG
Stores up to 256 colors in a table and attempt to cover the range of colors in an image as closely as possible
The resulting bit values are subjected to some form of Lempel-Ziv compression
Lossy if the number of colors in an image is more than 256; lossless otherwise
Best suited for graphics that contain relatively few colors and sharply defined boundaries between the colors, such as cartoons, charts, …etc.
Not that suitable for images with lots of variations between colors, such as fullcolor photographic-quality images
GIF format • By default, converting to a GIF format, reduces the number of colors to 256 or 8 bits per pixel • and "runs" of same-color pixels are encoded in a color+numberOfPixels format. – For example, if there are 100 pixels on a line with the color 41, the image file stores the color (41) and the length of the run (100). – This makes a GIF file great for storing drawings that have lots of same-color pixels
• A GIF format is an perfect reproduction of the original (when used at the same pixel depth).
Example with GIF
• This image is 575x194 with 8 bits per pixel – It's about (actually)48KB in a GIF file. • running the number: 575*194*1byte (or 24 bits) = 111550 bytes or 104KB • With the color compression, saved 56KB of space. – Mostly in the white
JPG format • JPEG/JPG is designed for compressing either full-color or gray-scale images of natural, realworld scenes. It works well on photographs, naturalistic artwork, and similar material; not so well on lettering, simple cartoons, or line drawings. • PEG is "lossy," and is designed to exploit known limitations of the human eye, notably the fact that small color changes are perceived less accurately than small changes in brightness.
JPG format (2) • Skipping the complexity of JPG format, you can control the quality of the picture, between 100 being the best (worst compression) and 1 being the worst (but best compression) – Somewhere between 65 to 70 being the "best quality and compression" – You can also choose progressive scan, which speed up rendering or Optimize Huffman codes
JPEG Compression
Joint Photographic Experts Group Formed by ISO, ITU & IEC
Previous methods were examples of lossless compression
JPEG however is lossy
Considering the optical system of a human, that may still be acceptable
JPEG Compression
JPEG is good for grayscale or photo-colored image JPEG compression consists of three phases: Discrete Cosine Transform (DCT), Quantization, and Encoding DCT divides an image into a series of “blocks” (8x8 pixels each block). I.e. 640x480 pixels = 80x60 blocks
Figure 5.8 – JPEG’s Three Phases
JPEG Compression
For example, 800 x 800 image would be divided into 100 x 100 blocks – each block has 8 x 8 pixels
JPEG Compression
If grayscale, then each pixel is represented by an 8bit number Each 8 x 8 pixel block will be represented as 2-D array with 8 rows & 8 columns If color, then each pixel is represented by a 24-bit number Each 8 x 8 pixel block will be represented as three 2-D arrays with 8 rows & 8 columns each
JPEG Compression
DCT Phase Basically, it is a function that takes a 2-D array with 8 rows & 8 columns and produces another 2-D array
P[i][j] is the input array, T[i][j] is the output array
The values inside T[i][j] are called spatial frequencies
Formula page 240
JPEG Compression
DCT Phase
JPEG Compression
Quantization Phase Provides a way of ignoring small difference that may not be perceptible
It produces another 2-D array, call it Q for example
Q[i][j] values are obtained by dividing T[i][j] values by some number and rounding to the nearest integer
The resulting array with have fewer distinct numbers, so it easier to compress
JPEG Compression
Quantization Phase Example: T values are divided by 10 then rounded
JPEG Compression
Quantization Phase How can we obtain T (decompression) from Q then?!
JPEG Compression
Quantization Phase To preserve as much information as possible, define a quantization array, say U, and divide T values by the values of U
Quantization Phase
JPEG Compression
JPEG Compression Encoding Phase
So far, transformation & quantization were done, but nothing about compression!
This phase finally does the compression
The idea is to linearize the content of the Q array and compress it for transmission
Run-length coding can then be used
JPEG Compression
JPEG can use Huffman encoding or Arithmetic encoding for the non-zero values
Since many 2-D must be transmitted for the image, and many of them may not have much differences, Differential encoding can also be applied
In general, JPEG compression ration is about 20:1; that is, the resulted file is 5% of the original
Better ratios are possible but loss of quality may become noticeable
JPEG 2000 is the newest JPEG coding
Mass details of JPEG can be found at www.JPEG.org
Example with JPG
• This image is 575x194 with 24 bits per pixel – The file size is 23KB
MP3
Psychoacoustic Model: • What can human auditory hear? • What can human auditory distinguish?
Auditory Masking: • If two sounds have similar frequencies, but one is weak and the other is high, it is possible that a human cannot hear the weak sound
MP3 fundamental idea is: • • •
Capture an audio signal Determine what cannot be heard and remove it, then Digitize the rest
MP3
Psychoacoustic Model: • What can human auditory hear? • What can human auditory distinguish?
Auditory Masking: • If two sounds have similar frequencies, but one is weak and the other is high, it is possible that a human cannot hear the weak sound
MP3 fundamental idea is: • • •
Capture an audio signal Determine what cannot be heard and remove it, then Digitize the rest
MP3
Figure 5.15 – MP3 Encoding