Faculty of Science and Technology
School of Computing, Engineering and Physical Sciences

ZHENG Zong Bin

Image Compression Using Hopfield Neural Network (EL3990)

Submitted in partial satisfaction of the requirements for the degree of Bachelor of Engineering (with Honors) in Digital Communications

April 2009

I declare that all material contained in this report, including ideas described in the text, computer programs and drawings, is my own work except where explicitly and individually acknowledged.

Signed ……………………. Date ……………………….
Abstract
Image compression technology plays an increasingly vital role all over the world. The aim of this project is to develop an image compression technique using a Hopfield Neural Network. In this dissertation, image compression is based on an algorithm named Block Truncation Coding (BTC), a coding technique that is simple and effective. The next step uses a Hopfield Neural Network to define a new threshold that optimizes the reconstruction quality (HNNBTC). In both BTC and HNNBTC the block size is fixed. Additionally, the Hopfield Neural Network and Block Truncation Coding are used to compress color images in different color spaces, such as RGB, YUV, YIQ, and HSV. Finally, a variable block size is tried in order to achieve more compression and a better quality of the compressed image. The above algorithms are implemented in Matlab.
Acknowledgements
I could not have finished this project without my Project Supervisor, Dr. Martin Ray Varley, who helped me through all stages of this project. Without his valuable advice and considerable patience, I would not have been able to complete this project successfully.
I also thank Dr. Chen Xin for his advice on programming.
Finally, I especially thank my parents and girlfriend for their love, encouragement, and support all the way through my life.
Contents
Abstract
Acknowledgements
Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background
1.2 Project Objectives
1.3 Compression System Model
  1.3.1 Model
  1.3.2 Redundancy Types
  1.3.3 Lossless & Lossy
1.4 Principle of Image Compression
  1.4.1 Encoding
  1.4.2 Decoding
1.5 Principle of Hopfield Neural Network
1.6 Information Measurement
1.7 Outline of Dissertation
1.8 Test Images
Chapter 2 Block Truncation Coding
2.1 Introduction
2.2 Basic BTC Algorithm
2.3 Process
2.4 Program Flowchart
2.5 Result Analysis
  2.5.1 Cameraman
  2.5.2 Pirate
  2.5.3 Woman_darkhair
2.6 Summary
Chapter 3 Hopfield Neural Networks
3.1 A Short Introduction about Neural Network
3.2 Basic Algorithm
3.3 Program Flowchart
3.4 Result Analysis
  3.4.1 Cameraman
  3.4.2 Pirate
  3.4.3 Woman_darkhair
3.5 Summary
Chapter 4 Comparison
4.1 Algorithm Compression
4.2 Image Result Comparisons
4.3 Summary
Chapter 5 Color Image Compression
5.1 Color Image Compression Based on RGB Color Space
  5.1.1 Introduction
  5.1.2 Flowchart
  5.1.3 Result Analysis
5.2 Color Image Compression Based on YUV Color Space
  5.2.1 YUV Color Space
  5.2.2 Flowchart
  5.2.3 Result Analysis
    5.2.3.1 Lena_color
    5.2.3.2 Mandril_color
  5.2.4 A Way to Achieve More Compression in YUV Color Space
5.3 Color Image Compression Based on YIQ Color Space
5.4 Color Image Compression Based on HSV Color Space
5.5 Conclusion & Comparisons
Chapter 6 Variable Block Size
6.1 Principle
6.2 Programming
6.3 Practical Works
Chapter 7 Conclusion and Future Work
7.1 Conclusion
7.2 Future Work
References
Appendix A. Statement of Work (SOW)
Appendix B. Gantt chart
List of Figures
Figure 1.1 Compression System Model
Figure 1.2 Lossless & Lossy
Figure 1.3 Encoding
Figure 1.4 Decoding
Figure 1.5 Three Neurons Hopfield Network
Figure 1.6 Bidirectional Connection Diagram
Figure 1.7 State Transitions to Stable State
Figure 1.8 Cameraman
Figure 1.9 Pirate
Figure 1.10 Woman_darkhair
Figure 1.11 Lena_color
Figure 1.12 Mandril_color
Figure 2.1 Original Image
Figure 2.2 Bitmap
Figure 2.3 Reconstructed Image
Figure 2.4 Difference between Original and Reconstructed image
Figure 2.5 Many Blocks
Figure 2.6 Errors
Figure 2.7 Programming Flowchart
Figure 2.8 Original Image
Figure 2.9 Histogram of Original
Figure 2.10 4*4 block size
Figure 2.11 Histogram of 4*4 block size
Figure 2.12 8*8 Block Size
Figure 2.13 Histogram of 8*8 Block Size
Figure 2.14 16*16 Block Size
Figure 2.15 Histogram of 16*16 Block Size
Figure 2.16 Original Image
Figure 2.17 4*4 Block Size
Figure 2.18 8*8 Block Size
Figure 2.19 16*16 Block Size
Figure 2.20 Original
Figure 2.21 Intensity Profile of Original
Figure 2.22 4*4 Block Size
Figure 2.23 Intensity Profile of 4*4 Block Size
Figure 2.24 8*8 Block Size
Figure 2.25 Intensity Profile of 8*8 Block Size
Figure 2.26 16*16 Block Size
Figure 2.27 Intensity Profile of 16*16 Block Size
Figure 2.28 Original
Figure 2.29 4*4 Block Size
Figure 2.30 8*8 Block Size
Figure 2.31 16*16 Block Size
Figure 2.32 Original
Figure 2.33 4*4 Block Size
Figure 2.34 Errors
Figure 2.35 8*8 Block Size
Figure 2.36 Errors
Figure 2.37 16*16 Block Size
Figure 2.38 Errors
Figure 3.1 Flowchart
Figure 3.2 Original
Figure 3.3 Histogram of Original
Figure 3.4 4*4 Block Size
Figure 3.5 Histogram of 4*4 Block Size
Figure 3.6 8*8 Block Size
Figure 3.7 Histogram of 8*8 Block Size
Figure 3.8 16*16 Block Size
Figure 3.9 Histogram of 16*16 Block Size
Figure 3.10 Original
Figure 3.11 4*4 Block Size
Figure 3.12 8*8 Block Size
Figure 3.13 16*16 Block Size
Figure 3.14 Original
Figure 3.15 Intensity Profile of Original
Figure 3.16 4*4 Block Size
Figure 3.17 Intensity Profile of 4*4 Block Size
Figure 3.18 8*8 Block Size
Figure 3.19 Intensity Profile of 8*8 Block Size
Figure 3.20 16*16 Block Size
Figure 3.21 Intensity Profile of 16*16 Block Size
Figure 3.22 Original
Figure 3.23 4*4 Block Size
Figure 3.24 8*8 Block Size
Figure 3.25 16*16 Block Size
Figure 3.26 4*4 Block Size
Figure 3.27 Errors
Figure 3.28 8*8 Block Size
Figure 3.29 Errors
Figure 3.30 16*16 Block Size
Figure 3.31 Errors
Figure 4.1 Group 1 Cameraman (BTC)
Figure 4.2 Group 2 Cameraman (HNNBTC)
Figure 4.3 Group 1 Pirate (BTC)
Figure 4.4 Group 2 Pirate (HNNBTC)
Figure 4.5 Woman_darkhair (BTC)
Figure 4.6 Woman_darkhair (HNNBTC)
Figure 5.1 RGB
Figure 5.2 RGB Cube
Figure 5.3 RGB 24-bit Color Cube
Figure 5.4 Flowchart
Figure 5.5 Lena
Figure 5.6 Bitmap_RGB
Figure 5.7 Decoding_RGB
Figure 5.8 Reconstructed
Figure 5.9 Differences
Figure 5.10 4*4 Block Size
Figure 5.11 8*8 Block Size
Figure 5.12 16*16 Block Size
Figure 5.13 4*4 Block Size
Figure 5.14 8*8 Block Size
Figure 5.15 16*16 Block Size
Figure 5.16 BTC
Figure 5.17 HNNBTC
Figure 5.18 Flowchart
Figure 5.19 Original Image
Figure 5.20 YUV Color Image
Figure 5.21 Bitmap_YUV
Figure 5.22 Encoding_YUV
Figure 5.23 Reconstructed YUV Color Image
Figure 5.24 Reconstructed Image
Figure 5.25 4*4 Block
Figure 5.26 8*8 Block
Figure 5.27 16*16 Block
Figure 5.28 4*4 Block Size
Figure 5.29 8*8 Block Size
Figure 5.30 16*16 Block Size
Figure 5.31 4*4 Block Size
Figure 5.32 8*8 Block Size
Figure 5.33 16*16 Block Size
Figure 5.34 4*4 Block Size
Figure 5.35 8*8 Block Size
Figure 5.36 16*16 Block Size
Figure 5.37 4*4 Block Size
Figure 5.38 8*8 Block Size
Figure 5.39 16*16 Block Size
Figure 5.40 4*4 Block Size
Figure 5.41 8*8 Block Size
Figure 5.42 16*16 Block Size
Figure 5.43 Bitmap_YUV’
Figure 5.44 Bitmap_YUV
Figure 5.45 Bitmap_RGB
Figure 5.46 Decoding_YUV’
Figure 5.47 Decoding_YGB
Figure 5.48 Decoding_RGB
Figure 5.49 Group 1 Group 2 Group 3
Figure 5.50 Group 1 Group 2 Group 3
Figure 5.51 RGB Image
Figure 5.52 YIQ Image
Figure 5.53 Y Channel
Figure 5.54 I Channel
Figure 5.55 Q Channel
Figure 5.56 Bitmap_YIQ
Figure 5.57 Decoding_YIQ
Figure 5.58 Reconstructed YIQ Image
Figure 5.59 Reconstructed Image
Figure 5.60 BTC
Figure 5.61 HNNBTC
Figure 5.62 BTC
Figure 5.63 HNNBTC
Figure 5.64 HSV Color Space
Figure 5.65 RGB to HIS Conversion
Figure 5.66 RGB Image
Figure 5.67 HSV Image
Figure 5.68 HSV_HSV
Figure 5.69 BTC
Figure 5.70 HNNBTC
Figure 5.71 BTC
Figure 5.72 HNNBTC
Figure 5.73 RGB
Figure 5.74 YUV
Figure 5.75 YIQ
Figure 5.76 HSV
Figure 6.1 Flowchart
Figure 6.2 Original
Figure 6.3 Bitmap
Figure 6.4 Average of two classes about 8*8 & 4*4 Block Size
List of Tables
Table 2.1 Parameter for ‘Cameraman’
Table 2.2 Parameter of ‘Pirate’
Table 2.3 Parameter of ‘Woman_darkhair’
Table 3.1 Parameter of ‘Cameraman’
Table 3.2 Parameter of ‘Pirate’
Table 3.3 Parameter of ‘Woman_darkhair’
Table 4.1 Compare BTC and HNNBTC
Table 4.2 MSE and SNR
Table 5.1 Compare BTC & HNNBTC in RGB Color Space
Table 5.2 Compare BTC & HNNBTC in YUV Color Space
Table 5.3 Comparison
Table 5.4 Compare BTC & HNNBTC in YIQ Color Space
Table 5.5 Compare BTC & HNNBTC in YIQ Color Space
Table 5.6 Compare Color Image Compressions in Different Color Space
Table 5.7 MSE & SNR in Image Compression Using Different Color Space
Chapter 1 Introduction
1.1 Research Background
Digital image processing techniques are used more and more widely in fields such as multimedia, the Internet, television and fax. The objective of image compression is to minimize the amount of data needed to represent a digital image [1]. Suppose an image has 512 x 512 pixels and each pixel is represented by 8 bits (256 different grey levels); the image then occupies over 2,000,000 bits (256 Kbytes). Storing such images obviously takes a great deal of space. Because of the huge amount of data to be stored, image compression has become one of the key techniques in image processing. Transmitting data in an efficient form is the other motivation: if an image is to be transmitted across a channel, 256 Kbytes have to be sent, and cost is a vital aspect that needs to be considered [1]. In the past twenty years, modern techniques based on neural networks, fractal theory and the wavelet transform have been applied successfully to image compression. This project focuses on the application of the Hopfield Neural Network to image compression. The basic compression method is Block Truncation Coding (BTC), which was developed at Purdue University at the end of the 1970s [2] (Chapter 2). The next step combines BTC with a Hopfield Neural Network to implement image compression (Chapter 3); its main purpose is to define an appropriate computational energy function whose stable state determines the threshold [3].
1.2 Project Objectives
The objectives of this project are:
(1) Get an overview of image compression and understand its scope and significance.
(2) Understand the principle of image compression.
(3) Start from the basic algorithm, Block Truncation Coding.
(4) Investigate another algorithm, the Hopfield Neural Network.
(5) Implement and improve the functions using Matlab.
(6) Compare BTC and HNNBTC.
(7) Draw up a complete plan for the whole project, including a risk assessment of each task.
1.3 Compression System Model
1.3.1 Model
Figure 1.1 Compression System Model
A general compression system model is shown above. f(x,y) represents the original image and f’(x,y) is the reconstructed image. The figure illustrates how an original image is compressed: it passes through encoding (a source encoder and a channel encoder) and decoding (a channel decoder and a source decoder). The process essentially transforms the two-dimensional pixel array into a statistically uncorrelated data set [4]. Here, we focus only on the source part, that is, the source encoder and decoder.
1.3.2 Redundancy Types
The aim of source encoding is to remove as much redundancy as possible from the original image. The redundancy may be classified into three different types [4]:
(1) Coding Redundancy
Huffman coding is the most popular technique. The key is to use binary codes of different lengths to represent the pixel values, which greatly reduces coding redundancy.
(2) Inter-pixel Redundancy
If two pixels are adjacent, their values tend to be similar. Because of this high correlation, a pixel can be represented by the difference between its value and that of a neighbor.
(3) Psycho-visual Redundancy (irrelevancy)
For images intended for viewing, less important or unimportant elements can be removed without harming the visual quality. For example, the human eye cannot distinguish an image quantized to 8 bits per pixel from one quantized to 7 bits per pixel, so psycho-visual redundancy is removed in the 7-bit image. Removing psycho-visual redundancy is in fact a lossy compression technique.
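As an illustration of inter-pixel redundancy, the short Python sketch below stores the first pixel of a row and then only the differences between neighbors. The function names and the sample row are assumptions made for this example; the project itself was written in Matlab and does not necessarily use this scheme.

```python
def delta_encode(row):
    """Keep the first pixel; replace every later pixel by its difference from the left neighbor."""
    if not row:
        return []
    return [row[0]] + [row[i] - row[i - 1] for i in range(1, len(row))]


def delta_decode(deltas):
    """Rebuild the original row by accumulating the differences."""
    row = []
    total = 0
    for d in deltas:
        total += d
        row.append(total)
    return row


row = [156, 157, 160, 159, 156, 157, 159, 158]  # hypothetical row of grey levels
deltas = delta_encode(row)                      # small numbers after the first entry
restored = delta_decode(deltas)                 # identical to the original row
```

Because neighboring pixels are similar, the differences cluster around zero and can be represented with fewer bits than the raw values, for instance by a variable-length code.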
1.3.3 Lossless & Lossy
In 1948, C. E. Shannon formulated the concept of a distortion function in “A Mathematical Theory of Communication” [5]. This work states that there is redundancy in any information and that the ratio of redundancy is related to the probability with which each symbol (number, letter or word) occurs. In 1959, rate distortion theory was established, which laid the foundation of source coding theory. The two basic approaches to data compression are derived from it: lossless and lossy [6].
i) Lossless: from an objective point of view, it reduces the amount of data (e.g. redundancy in space and time) required to represent an image while retaining all the information in the image. More importantly, there are no errors between the original and reconstructed images: the reconstructed image is an exact copy of the original image.
ii) Lossy: human perception of the information in an image normally does not involve quantitative analysis of every pixel value. Therefore, unlike the lossless technique, lossy compression eliminates information that can be removed without significantly impairing the perceived quality of the image. However, the lost information cannot be restored.
Figure 1.2 shows the traditional lossless and lossy compression methods for digital images [5].

Figure 1.2 Lossless & Lossy
1.4 Principle of Block Truncation Coding
1.4.1 Encoding
Figure 1.3 Encoding
1.4.2 Decoding
Figure 1.4 Decoding
1.5 Principle of Hopfield Neural Network
The Hopfield Neural Network, a common type of neural network, offers ‘good’ image quality through feedback from the neurons’ outputs to their inputs [7]. A network with 3 neurons is shown below [7] [8].
Figure 1.5 Three Neurons Hopfield Network
This is clearly a single-layer network with feedback. The elements on the left are not neurons; they are merely fan-out elements that allow the diagram to be drawn clearly. The diagram below shows the bidirectional connection between any two neurons.

Figure 1.6 Bidirectional Connection Diagram
Every neuron has an equal probability of attempting to fire. Because there are three neurons, the probability that a given neuron attempts to fire is 1/3. The activation function for the neurons is [8]:

$$a_i = \sum_{j=1}^{n} w_{ij}\,x_j - \theta_i$$

Where:
$x_j$ is the output of neuron j.
$\theta_i$ represents the threshold of neuron i.
$w_{ij}$ is the weight from neuron j to neuron i.
n is the number of neurons in the network.

Accordingly, if a Hopfield neural network contains n neurons, its output is an n-bit binary number, so the energy of $2^n$ states can be calculated. The network shown previously has three neurons, so there are $2^3 = 8$ states. The stored pattern to which the network converges depends on the input pattern and the connection weight matrix. A state transition diagram for the Hopfield network can be drawn as follows, with the states arranged in order of decreasing computational energy. Suppose the energies of the states are ordered as follows: S1<S2<S3<S4<S5<S6<S7<S8
Figure 1.7 State Transitions to Stable State
Where: $S_i$ denotes state i.
The network eventually settles in a stable state corresponding to a minimum of the computational energy, so S1 is the stable state. The computational energy is

$$E = \sum_{i=1}^{n}\sum_{j=1}^{n} F_i F_j\,[x(i)-x(j)]^2 + \sum_{i=1}^{n}\sum_{j=1}^{n} (1-F_i)(1-F_j)\,[x(i)-x(j)]^2$$
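To make the energy function more concrete, the Python sketch below evaluates E for every one of the $2^n$ states of a very small block and reports the states with the lowest energy. It assumes that $F_i$ is the binary state (0 or 1) of neuron i and that x(i) is the value of the i-th pixel in the block; this interpretation, the function names and the sample pixel values are assumptions made for illustration. A real Hopfield network reaches a minimum by iterating neuron updates rather than by exhaustive search, which is practical only for tiny n.

```python
from itertools import product


def energy(states, pixels):
    """E = sum_ij F_i F_j (x_i - x_j)^2 + sum_ij (1 - F_i)(1 - F_j)(x_i - x_j)^2."""
    n = len(pixels)
    total = 0
    for i in range(n):
        for j in range(n):
            d2 = (pixels[i] - pixels[j]) ** 2
            total += states[i] * states[j] * d2              # both pixels in class 1
            total += (1 - states[i]) * (1 - states[j]) * d2  # both pixels in class 0
    return total


def lowest_energy_states(pixels):
    """Evaluate all 2^n states; return the minimum energy and the states that reach it."""
    energies = {s: energy(s, pixels) for s in product((0, 1), repeat=len(pixels))}
    e_min = min(energies.values())
    return e_min, sorted(s for s, e in energies.items() if e == e_min)


pixels = [10, 12, 200, 202]                 # hypothetical 4-pixel block
e_min, best = lowest_energy_states(pixels)  # the dark and bright pixels end up in separate classes
```

For this block the minimum is reached when the two dark pixels share one class and the two bright pixels share the other; the mirrored labelling has the same energy, because the function only penalizes differences inside a class.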
1.6 Information Measurement
The quality of a compressed image depends on the image resolution, which is defined as the smallest number of discernible line pairs per unit distance [9]. The higher the resolution of the reconstruction, the less the image has been compressed: the compression ratio is smaller, the quality is better, the MSE is lower and the SNR is higher.

• Compression Ratio
This quantifies how much the image has been compressed. If the original image takes M bits and the compressed image takes N bits, the compression ratio is defined as

$$C_R = \frac{M}{N}$$

For example, if each pixel is originally represented by 8 bits and by 4 bits after compression, the compression ratio is $C_R$ = 8/4 = 2.

• Mean Square Error (MSE)
Suppose an original image f(x,y) is compressed and a new image f’(x,y) is obtained after reconstruction. In general, there will be a difference (loss of information) between corresponding pixels in the two images. The mean square error (MSE) [10], assuming an image of size M by N pixels, is given by:

$$\mathrm{MSE} = \frac{1}{MN}\sum_{x=0}^{M-1}\sum_{y=0}^{N-1}\left[f(x,y)-f'(x,y)\right]^2$$

A lower MSE means a better reconstruction in a quantitative sense; ideally it is zero.

• Signal to Noise Ratio
The signal-to-noise ratio (SNR) of a reconstructed image treats all the errors introduced by the compression as ‘noise’ and the original image f(x,y) as ‘signal’. Conventionally the SNR is expressed in decibels (dB), and the formula is:

$$\mathrm{SNR} = 10\log_{10}\frac{\sum_{x=0}^{M-1}\sum_{y=0}^{N-1} f(x,y)^2}{\sum_{x=0}^{M-1}\sum_{y=0}^{N-1}\left[f(x,y)-f'(x,y)\right]^2}$$

The higher the SNR, the better the quality of the reconstructed image.

• Peak Signal-to-Noise Ratio
Another measure used is the peak signal-to-noise ratio (PSNR). Here, ‘peak signal’ is the maximum grey level possible at the original resolution. If the resolution of the original image is 8 bits, 255 is the maximum grey level, hence

$$\mathrm{PSNR} = 10\log_{10}\frac{255^2}{\mathrm{MSE}}$$

Where: PSNR is expressed in decibels (dB).
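The measures in this section can be sketched in Python as follows. Images are represented as lists of rows, and the SNR is taken as the energy of the original image divided by the energy of the error, in decibels; that form, the function names and the tiny test images are assumptions made for this illustration and are not the project's Matlab code.

```python
import math


def compression_ratio(original_bits, compressed_bits):
    """Bits before compression divided by bits after compression."""
    return original_bits / compressed_bits


def mse(f, g):
    """Mean square error between two equally sized images (lists of rows)."""
    count = sum(len(row) for row in f)
    return sum((a - b) ** 2 for rf, rg in zip(f, g) for a, b in zip(rf, rg)) / count


def snr_db(f, g):
    """Signal energy of the original f over the error energy, in dB."""
    signal = sum(a * a for row in f for a in row)
    noise = sum((a - b) ** 2 for rf, rg in zip(f, g) for a, b in zip(rf, rg))
    return float("inf") if noise == 0 else 10 * math.log10(signal / noise)


def psnr_db(f, g, peak=255):
    """Peak signal-to-noise ratio in dB for a given maximum grey level."""
    err = mse(f, g)
    return float("inf") if err == 0 else 10 * math.log10(peak ** 2 / err)


original = [[100, 100], [100, 100]]   # hypothetical 2 x 2 image
rebuilt = [[90, 110], [100, 100]]     # hypothetical reconstruction
error = mse(original, rebuilt)        # (100 + 100 + 0 + 0) / 4
snr = snr_db(original, rebuilt)
psnr = psnr_db(original, rebuilt)
```

Identical images give an infinite SNR and PSNR here, which avoids a division by zero in the lossless case.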
The higher the PSNR, the better the quality of the reconstructed image. In this dissertation, I used SNR to measure whether the reconstructed image is good or not.

1.7 Outline of Dissertation
This dissertation is divided into 8 parts.
Chapter 1: Background study. This chapter discusses the development of compression technology, the objectives and significance of this project, and some measurements that define the quality of image compression.
Chapter 2: Block Truncation Coding. It contains the principle of the BTC algorithm, an analysis of the results and a conclusion.
Chapter 3: Hopfield Neural Network based on Block Truncation Coding. This chapter presents the history and principle of the Hopfield Neural Network, analyses the results and gives a conclusion.
Chapter 4: Comparisons. This chapter focuses on the differences between BTC and HNNBTC.
Chapter 5: Color Image Compression. Color images have many different models, such as RGB and YUV. This chapter uses BTC and HNNBTC to compress color images in both the RGB and YUV models.
Chapter 6: Variable Block Size. Previously, the image was cut into fixed-size blocks (4*4, 8*8 or 16*16), and sometimes the reconstructed image could not reach a good quality. In this situation, the image can be cut into blocks of variable size: if an area contains a lot of information, a smaller block size (4*4) is used; if the area contains unimportant information, it is represented by a larger block (8*8).
Chapter 7: Error control. There is no doubt that some errors will occur during transmission.
Chapter 8: Conclusion and Future Work. The main idea is to point out some improvements.

1.8 Test Images
The following images are used to test my programs [11].
Figure 1.8 Cameraman Figure 1.9 Pirate Figure 1.10 Woman_darkhair
Figure 1.11 Lena_color Figure 1.12 Mandril_color
Chapter 2 Block Truncation Coding
2.1 Introduction
Among the various kinds of lossy compression, Block Truncation Coding (BTC) is a simple image coding technique [12]. It has the following advantages:
(1) Easy implementation compared with other block-based compression methods such as transform coding [13] and vector quantization [14];
(2) High quantization;
(3) Good image quality after reconstruction;
(4) Fast computation, five times faster than DCT coding;
(5) Relatively high compression ratios.
In its original form, BTC was designed so that the reconstructed block preserves the first and second moments of the original block [3]. In recent years, several efforts have been made to improve the coding efficiency of the basic BTC technique. Arce and Gallagher, Jr. [15] showed that the truncated block is well approximated by wide-sense Markov statistics, and improved the coding efficiency by using median filter roots. In order to get higher compression ratios, a modified BTC technique that combines BTC with vector quantization was put forward by Udpikar and Raina [16]. Healy and Mitchell [17] used an inter-frame system to reduce the bit rate. Compression ratios in the range of 5:1 to 6:1 have been reported [15][16][17]. Many different BTC methods have been defined by using distinct kinds of quantization and error criteria, such as full-band Absolute Moment Block Truncation Coding (AMBTC) and Sub-band Absolute Moment Block Truncation Coding (SAMBTC). Even though they cannot reach the highest compression ratios, their simple implementation, speed and efficiency have attracted attention [18] [19].
2.2 Basic BTC Algorithm
Block Truncation Coding was first developed by Delp and Mitchell [12] at Purdue University [20]. The technique uses a one-bit (two-level) nonparametric quantizer that adapts to local regions of the image, so it preserves the local statistical characteristics of the image well [20]. In this image compression scheme, the image is first divided into small n x n non-overlapping blocks, which are coded individually. Let M = n x n and let x(1), x(2), ..., x(M) be the pixel values of a block of the original image. The first two sample moments and the sample variance are, respectively,

$$\bar{x} = \frac{1}{M}\sum_{i=1}^{M} x(i), \qquad \overline{x^2} = \frac{1}{M}\sum_{i=1}^{M} x(i)^2, \qquad \sigma^2 = \overline{x^2} - \bar{x}^2$$

where x(i) is the value of the i-th pixel of the block. Next, design the one-bit quantizer with a threshold $x_{th}$ and two reconstruction levels a and b (the outputs):

If $x(i) \ge x_{th}$, output = b
If $x(i) < x_{th}$, output = a
for i = 1, 2, 3, ..., M

Normally $\bar{x}$ is set as the threshold value. Preserving the first two moments then requires

$$M\bar{x} = (M-q)\,a + q\,b$$
$$M\overline{x^2} = (M-q)\,a^2 + q\,b^2$$

where M is n x n and q is the number of pixels that are greater than or equal to the threshold value.

Solving the equations above gives the outputs a and b:

$$a = \bar{x} - \sigma\sqrt{\frac{q}{M-q}}$$
$$b = \bar{x} + \sigma\sqrt{\frac{M-q}{q}}$$
However, in 1985, Udpikar and Raina [21] made an improvement to the BTC technique. Only the first-order statistical information is preserved, namely $\bar{x}_l$, the mean of the pixels that are less than the threshold $x_{th}$, and $\bar{x}_h$, the mean of the pixels that are greater than or equal to the threshold. The new output levels are defined as

$$a = \bar{x}_l = \frac{1}{M-q}\sum_{x(i) < x_{th}} x(i)$$
$$b = \bar{x}_h = \frac{1}{q}\sum_{x(i) \ge x_{th}} x(i)$$

where M is n x n and q is the number of pixels that are greater than or equal to the threshold value [20].
The threshold of the quantizer and the two reconstruction levels change as the statistical character of a block changes. In other words, encoding is a process aimed at a local region. After quantization, the block is represented by an n x n mapping matrix of pixel classifications (the bitmap) together with a representative intensity for each class. Finally, the receiver reconstructs the image block by taking ‘a’ and ‘b’ and placing these values according to the code in the bitmap [22] [23] [24] [25].
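The moment-preserving quantizer of this section can be sketched in Python as follows. The block is passed as a flat list of pixel values, the block mean is used as the threshold, and q counts the pixels at or above it; the function names and the handling of a completely flat block are assumptions made for this sketch rather than details of the project's Matlab program.

```python
import math


def btc_moment_levels(block):
    """Return (bitmap, a, b) for one block using the moment-preserving levels."""
    m = len(block)
    mean = sum(block) / m
    second = sum(x * x for x in block) / m
    sigma = math.sqrt(max(second - mean * mean, 0.0))
    bitmap = [1 if x >= mean else 0 for x in block]
    q = sum(bitmap)
    if q == m:                       # flat block: every pixel equals the mean
        return bitmap, mean, mean
    a = mean - sigma * math.sqrt(q / (m - q))
    b = mean + sigma * math.sqrt((m - q) / q)
    return bitmap, a, b


def btc_decode(bitmap, a, b):
    """Rebuild the block: b where the bitmap holds 1, a where it holds 0."""
    return [b if bit else a for bit in bitmap]


block = [2, 2, 2, 10]                # hypothetical 2 x 2 block, flattened
bitmap, a, b = btc_moment_levels(block)
decoded = btc_decode(bitmap, a, b)
```

For this block the mean is 4 and the variance is 12, so the levels come out at approximately a = 2 and b = 10, and the decoded block has the same mean and second moment as the original.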
2.3 Process
Step 1: Open Image
A grey-level image is stored as a two-dimensional array, whereas a color image is stored as a three-dimensional array. When an image is read, the program uses this to judge whether it is a grey-level or a color image. If ‘cameraman.tif’ is read, the program identifies it as a grey-level image.
Step 2: Determine the block size.
This is a new idea in my programming. A function called ‘range’ is defined, and before encoding the system asks which block size we want. This makes the choice of block size automatic, which means that three programs (one for each block size) can be represented by one function.
Here, choose a 4*4 block.
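Letting one routine serve every block size can be sketched in Python as follows. This is not the ‘range’ function of the project, whose code is not reproduced here; the name split_into_blocks, the list-of-rows image format and the requirement that the image dimensions be multiples of n are all assumptions made for this example.

```python
def split_into_blocks(image, n):
    """Yield (top, left, block) for each n x n tile of an image given as a list of rows.

    Assumes the image height and width are multiples of n.
    """
    height, width = len(image), len(image[0])
    if height % n or width % n:
        raise ValueError("image dimensions must be multiples of the block size")
    for top in range(0, height, n):
        for left in range(0, width, n):
            yield top, left, [row[left:left + n] for row in image[top:top + n]]


image = [[r * 8 + c for c in range(8)] for r in range(8)]  # hypothetical 8 x 8 image
blocks = list(split_into_blocks(image, 4))                 # four 4 x 4 tiles
```

Changing the second argument to 8 or 16 is all that is needed to switch block size, which is the point of using a single parameterized function instead of three separate programs.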
Step 3: Calculate the average of each block.
The first 4*4 block of the ‘cameraman.tif’ image is as follows:

X=

Now, set the average of this matrix as the threshold value: (156+157+160+159+156+157+159+158+158+157+156+156+160+157+154+154)/16 = 157.125.
Step 4: Build a bitmap
A “1” replaces each value that is greater than or equal to the threshold; otherwise a “0” is used. Hence, this 4*4 block’s bitmap (six “1”s and ten “0”s) becomes:

bitmap =

Step 5: Work out the two reconstruction levels.
The two reconstruction levels are simply the average values of the pixels marked “0” and “1” respectively: a=156 and b=159. These values are sent together with the bitmap to the receiver. In the reconstructed block, every pixel marked “1” takes the value b and every pixel marked “0” takes the value a, so the block changes into

X’ =
A lot of data is saved in this way. The original block takes 16 x 8 bits to represent. With BTC, however, it needs only 16 bits (the bitmap) and two 8-bit reconstruction levels (for “1” and “0”). The compression ratio is

$$C_R = \frac{16 \times 8}{16 + 8 + 8} = \frac{128}{32} = 4:1$$

The data rate therefore changes from 8 bits per pixel to 2 bits per pixel. As an example, the process is applied to “cameraman.tif”.
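The worked example can be checked with a few lines of Python. The sketch below thresholds the block at its mean, takes the two class averages as the reconstruction levels, and computes the theoretical compression ratio for the three block sizes used in this project. The pixel values are those listed in Step 3 in the order they appear in the average (the thirteenth is taken as 160, the value consistent with the stated average of 157.125); the function names are assumptions made for illustration.

```python
def btc_class_means(block):
    """Threshold a flat block at its mean; return (bitmap, a, b) with a, b the class means."""
    threshold = sum(block) / len(block)
    bitmap = [1 if x >= threshold else 0 for x in block]
    high = [x for x in block if x >= threshold]
    low = [x for x in block if x < threshold]
    b = sum(high) / len(high)
    a = sum(low) / len(low) if low else b   # flat block: both levels coincide
    return bitmap, a, b


def theoretical_ratio(n, bits_per_pixel=8):
    """Original bits over (bitmap bits + two reconstruction levels) for an n x n block."""
    m = n * n
    return (m * bits_per_pixel) / (m + 2 * bits_per_pixel)


# Pixel values of the first 4*4 block, in the order used in the average of Step 3.
block = [156, 157, 160, 159, 156, 157, 159, 158,
         158, 157, 156, 156, 160, 157, 154, 154]
bitmap, a, b = btc_class_means(block)
ratios = {n: theoretical_ratio(n) for n in (4, 8, 16)}
```

The levels come out as a = 156 and b = 159, as in Step 5, and the ratios for 4*4, 8*8 and 16*16 blocks are 4, 6.4 and about 7.5294. The ratio grows slowly with the block size because the bitmap still costs one bit per pixel, so it can never exceed 8:1 for 8-bit images.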
Figure 2.1 Original Image Figure 2.2 Bitmap

The original image is on the left. The image is cut into 4*4 blocks, and on the right is the bitmap of the original image, which is made up of ‘1’s and ‘0’s.

After that, the bitmap is used to decode the image in order to get the reconstructed image.

Figure 2.3 Reconstructed Image Figure 2.4 Difference between Original and Reconstructed image

These two images seem to be the same, because the images were too large and I made them smaller. However, if the image is zoomed in, it is easy to see that there are many differences between the original and reconstructed images. The differences are most visible where two blocks are adjacent. Meanwhile, edges change sharply, so they no longer look smooth. In the area around the cameraman’s head and the sky, the individual blocks are clearly visible. This contouring artifact is a disadvantage of BTC, which I will discuss later.
Figure 2.5 Many Blocks
Also, if the compressed image is subtracted from the original image, the following figure is obtained.

Figure 2.6 Errors

Figure 2.6 is obtained by subtracting the reconstructed image (Figure 2.3) from the original image (Figure 2.1). Where the pixels are the same the result is black (‘0’), so the darker an area is, the smaller its errors are. Because the coat and the cameraman’s head are black, these areas change very little. The figure shows that most errors occur between the cameraman and the background, that is, at the edges, where the pixel values fluctuate sharply.
2.4 Program Flowchart
The principle of the program is shown graphically in the following figure.

Figure 2.7 Programming Flowchart

2.5 Result Analysis
2.5.1 Cameraman

Figure 2.8 Original Image Figure 2.9 Histogram of Original

Figure 2.10 4*4 block size Figure 2.11 Histogram of 4*4 block size

Figure 2.12 8*8 Block Size Figure 2.13 Histogram of 8*8 Block Size

Figure 2.14 16*16 Block Size Figure 2.15 Histogram of 16*16 Block Size
Conclusion: (1) In the above four figures, some parts of the image become blurred as the block size increases; for example, the cameraman’s hair changes from smooth to frizzled, and the contour of the face loses resolution. To sum up, the smaller the block size, the better the image quality. (2) In the histograms, the x axis is the pixel value, which runs from 0 to 255 for the unsigned 8-bit integer format, and the other axis is the number of pixels at each grey level. The figures show that the distribution of pixel values changes as the block size changes: the distribution for the 16*16 block size is more uneven than that for the 4*4 block size.
Figure 2.16 Original Image Figure 2.17 4*4 Block Size

Figure 2.18 8*8 Block Size Figure 2.19 16*16 Block Size
Conclusion: (1) When the four images are zoomed in by the same factor, the blocks can be seen to become larger and larger. The image becomes blurred because some information is discarded by the BTC technique. The larger the block, the more information is lost, and this kind of loss cannot be restored. In brief, the following table summarizes the results for the ‘cameraman’ image.

| Image | Correlation | Size (KB) | Compression ratio (theoretical) | Compression ratio (actual) | MSE | SNR |
|---|---|---|---|---|---|---|
| Original | 100% | 257 | 1 | 1 | | |
| 4*4 | 99.47% | 65 | 4:1 | 3.9538:1 | 40.8640 | 26.4010 |
| 8*8 | 98.88% | 41 | 6.4:1 | 6.2682:1 | 86.1284 | 23.1629 |
| 16*16 | 98.13% | 35 | 7.5294:1 | 7.3428:1 | 142.5833 | 20.9737 |

Table 2.1 Parameter for ‘Cameraman’
Where: f(x,y) is the original image, f’(x,y) is the reconstructed image
By this formula, if more data are redundant, in other words, f’(x,y) is smaller, this lead to MSE changes to smaller. It is a standard to evaluate the compression radio, or it is computed the loss information between relevant pixels in the two images. MSE is higher, the reconstruction in a quantitative sense will worse. MSE drops down when block size rises, the quality of reconstructed image decrease. This answered to the description of the figures above. b.SNR
34
The lower MSE, the higher SNR for most cases. The higher SNR is corresponding to the better quality of the reconstructed image. 26.4010, 23.1629, 20.9737, SNR of the 4*4 is the highest in three compressions, which means, 4*4 has the better quality in theory. In fact, the better compress quality in human’s eye is 4*4 block size image. 2.5.2 Pirate Figure 2.20 Original Figure 2.21 Intensity Profile of Original
Note: On the right, the figure shows the intensity profile of the red line in the original image on the left side.
Figure 2.22 4*4 Block Size Figure 2.23 Intensity Profile of 4*4 Block Size
Figure 2.24 8*8 Block Size Figure 2.25 Intensity Profile of 8*8 Block Size
Figure 2.26 16*16 Block Size Figure 2.27 Intensity Profile of 16*16 Block Size
Conclusion: Here, I used the Matlab function ‘improfile’, which calculates the intensity values along a line or a multiline path in an image and plots them. In these figures the peak values are nearly the same; however, the valleys become lower and lower as the block size changes. More importantly, the differences between neighboring pixel values become smaller: because the block is larger, more pixels fall into one block. This is why the profiles on the right show the same light and dark levels for some neighboring pixels, with the value staying equal for several pixels in a row. An advantage of these reconstruction levels is that the bias components of neighboring blocks are strongly correlated [26]. Also, because the image is cut into blocks and the pixel values are replaced by the class averages (class ‘0’ and class ‘1’), the number of valleys becomes smaller and smaller while the compression ratio increases. The larger the block used, the more block contours appear in the reconstructed image.
Figure 2.28 Original Figure 2.29 4*4 Block Size
Figure 2.30 8*8 Block Size Figure 2.31 16*16 Block Size
(1) The original image is so clear that the eyeballs can be seen looking sideways, as if the pirate were searching for his prey. The next image (4*4) is acceptable: the eyeballs are still there, but they are not as plain as in the original. However, when the block size reaches 8*8, the pirate’s eyeballs can no longer be seen, and the man seems blind. The same is true of the 16*16 image.
This confirms what I said before: the larger the block, the worse the quality of the reconstructed image. The following table sums up the ‘pirate’ image.

| Image | Correlation | Size (KB) | Compression Ratio (theoretical) | Compression Ratio (actual) | MSE | SNR (dB) |
|---|---|---|---|---|---|---|
| Original | 100% | 257 | 1 | 1 | | |
| 4*4 | 99.10% | 65 | 4:1 | 3.9538:1 | 40.7506 | 25.5814 |
| 8*8 | 98.20% | 41 | 6.4:1 | 6.2682:1 | 80.9846 | 22.5988 |
| 16*16 | 96.93% | 35 | 7.5294:1 | 7.3428:1 | 137.2873 | 20.3065 |

Table 2.2 Parameter of ‘Pirate’
2.5.3 Woman_darkhair
Figure 2.32 Original
Figure 2.33 4*4 Block Size Figure 2.34 Errors
Figure 2.35 8*8 Block Size Figure 2.36 Errors
Figure 2.37 16*16 Block Size Figure 2.38 Errors
Conclusion: The compressed images are on the left, and the errors between the original and the reconstructed image are on the right. Each error image is obtained by subtracting the reconstructed image (Figure 2.33, Figure 2.35, Figure 2.37) from the original image (Figure 2.32). Where both have the same pixel value, the result is ‘0’ (black). The larger the difference in pixel value, the brighter the error pixel. The last image shows the most errors, so more information has been lost and its quality is the worst. I also made a table to compare the reconstructed images.

| Image | Correlation | Size (KB) | Compression Ratio (theoretical) | Compression Ratio (actual) | MSE | SNR (dB) |
|---|---|---|---|---|---|---|
| Original | 100% | 257 | 1 | 1 | | |
| 4*4 | 99.86% | 65 | 4:1 | 3.9538:1 | 10.7125 | 31.5849 |
| 8*8 | 99.63% | 41 | 6.4:1 | 6.2682:1 | 28.6020 | 27.3199 |
| 16*16 | 99.07% | 35 | 7.5294:1 | 7.3428:1 | 70.4786 | 23.4033 |

Table 2.3 Parameter of ‘Woman_darkhair’
2.6 Summary
Block truncation coding is a well-known lossy image compression technique. First, the image is divided into a number of fixed-size blocks. Next, each block is encoded, which involves calculating two representative values (the averages of the two classes) and building a bitmap. The technique is known to preserve a high PSNR but achieves a low compression ratio [26]. Comparing three grey-level images across different block sizes, using histograms, intensity profiles, and the errors between the original and reconstructed images, we can find the following points:
(1) The larger the block used, the worse the compression quality and the higher the compression ratio.
(2) As the block size increases, the contouring artifacts become more obvious.
(3) Computation is fast.
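The encoding and decoding steps summarised above can be sketched for a single block as follows. This is a minimal Python illustration, not the Matlab program used in this project. It assumes the variant described in this report, in which the threshold is the block mean and the two representative values are the averages of the two classes; the function names are hypothetical.

```python
def btc_encode_block(pixels):
    """Encode a flat list of block pixels as (low_level, high_level, bitmap)."""
    mean = sum(pixels) / len(pixels)
    # Bitmap: '1' where the pixel is equal to or greater than the block mean.
    bitmap = [1 if p >= mean else 0 for p in pixels]
    ones = [p for p, b in zip(pixels, bitmap) if b == 1]
    zeros = [p for p, b in zip(pixels, bitmap) if b == 0]
    # 'ones' is never empty, because the largest pixel is always >= the mean.
    high = sum(ones) / len(ones)
    # A completely flat block has no class '0'; reuse the single level.
    low = sum(zeros) / len(zeros) if zeros else high
    return low, high, bitmap


def btc_decode_block(low, high, bitmap):
    """Rebuild the block: every '1' becomes the high level, every '0' the low level."""
    return [high if b else low for b in bitmap]


block = [2, 4, 10, 12]                 # toy 2*2 block, flattened
low, high, bitmap = btc_encode_block(block)
print(low, high, bitmap)
print(btc_decode_block(low, high, bitmap))
```

Only the two levels and the bitmap are stored per block, which is why each reconstructed block contains just two grey levels.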
Chapter 3 Hopfield Neural Networks
3.1 A Short Introduction about Neural Network
A neural network is based on a model of the basic cells of the brain: neurons. In the 1940s, Warren McCulloch and Walter Pitts developed such a model, referred to in these notes as the McCulloch-Pitts (MCP) neuron [27]. In 1949 Donald Hebb showed that a neural network could exhibit learning ability (he proposed a training law for the neural network). In the 1950s and 1960s, researchers developed the first artificial neural network circuits (the most common form of realization nowadays is computer simulation). However, in the late 1960s, Marvin Minsky and Seymour Papert published a book which highlighted the inability of such networks to carry out even simple tasks. This led to disillusionment in the field, and research into neural networks lapsed into obscurity.
In 1982 John Hopfield recognized that the stability of the ‘response’ (the stable states) of a group of neurons could serve as a form of memory. This analysis was based on a definition of ‘energy’ in a network and a proof that the network operates by minimizing this energy and settling into stable states. In 1986, the error backpropagation learning algorithm, which can be widely applied to train multi-layer networks, was proposed by Rumelhart, Hinton and Williams [28] [29]. Since then the field of neural network research has expanded and matured, and neural networks are now used in a wide variety of application areas, such as process control, financial forecasting, image processing and speech processing.
3.2 Basic Algorithm
Neurons in the Hopfield network [30] [31] [32] are highly and selectively interconnected, which gives rise to collective computational properties and creates networks with good computational efficiency. The most important point in using a Hopfield neural network is that these collective computational properties emerge from the existence of an energy function of the states of the neuron outputs, namely the computational energy [33]. Note that the number of neurons is equal to the number of pixels in a block. The computational energy function of the Hopfield network has the following form [5]:
$$E = -\frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} T_{ij} V_i V_j - \sum_{i=1}^{n} I_i V_i$$
where $V_i$, $i = 1, 2, \dots, n$, are the outputs of the network; $T_{ij}$ is the connection strength between neuron i and neuron j; and $I_i$, $i = 1, 2, \dots, n$, is the external input to neuron i. Each neuron sets its output according to
$$V_i = \begin{cases} 1, & U_i > 0 \\ 0, & U_i < 0 \end{cases} \qquad U_i = \sum_{j=1}^{n} T_{ij} V_j + I_i$$
where $U_i$ is the total input to neuron i.
The following formulas relate the connection strengths and the external inputs to the energy function:
$$T_{ij} = -\frac{\partial^2 E}{\partial V_i \, \partial V_j}, \qquad I_i = -\left.\frac{\partial E}{\partial V_i}\right|_{V_j = 0}$$
There is feedback in the Hopfield neural network: the network is updated asynchronously at discrete random times until it reaches a stable state. The energy function E below is defined so that its lowest energy state corresponds to an effective classification of the pixels in the block [30].
$$E = \sum_{i=1}^{n}\sum_{j=1}^{n} F_i F_j \,[x(i) - x(j)]^2 + \sum_{i=1}^{n}\sum_{j=1}^{n} (1 - F_i)(1 - F_j)\,[x(i) - x(j)]^2$$
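Here x(i) is the value of pixel i and $F_i$ is the output of neuron i, that is, the class (‘0’ or ‘1’) of pixel i. From this formula, it is clear that the differences between the pixel values within each class must be small for the energy to be low; otherwise the network has not yet stabilized [34]. The image is cut into blocks for coding, and a Hopfield network with one neuron per pixel is used for each block in turn (n neurons for a block of n pixels, for example n = 16 for a 4*4 block). The synaptic interconnection strengths and the external inputs of the neurons are given by [5]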
$$T_{ij} = -4\,[x(i) - x(j)]^2, \qquad I_i = 2\sum_{j=1}^{n} [x(i) - x(j)]^2$$
The network is iterated until the stable state is reached, that is, until the neuron outputs no longer change. Hence, we can get the quantiser output levels of class ‘0’ and class ‘1’:
$$a = \frac{1}{n - q}\sum_{i=1}^{n} (1 - F_i)\, x(i), \qquad b = \frac{1}{q}\sum_{i=1}^{n} F_i\, x(i)$$
where q represents the number of pixels in class $F_i = 1$.
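The classification of one block can be sketched as follows. This is a hedged Python illustration rather than the Matlab program used in this project. It uses the connection strengths and external inputs given above, and it assumes three things that the text does not fix: the network starts from the ordinary BTC bitmap, the neurons are visited in a fixed order instead of at random times (so that the example is reproducible), and a neuron keeps its state when its total input is exactly zero. The function names and the cap `max_sweeps` are hypothetical.

```python
def hopfield_classify(pixels, max_sweeps=100):
    """Return the stable neuron outputs F (one 0/1 value per pixel of the block)."""
    n = len(pixels)
    d = [[(pixels[i] - pixels[j]) ** 2 for j in range(n)] for i in range(n)]
    T = [[-4 * d[i][j] for j in range(n)] for i in range(n)]   # connection strengths
    I = [2 * sum(d[i]) for i in range(n)]                      # external inputs
    mean = sum(pixels) / n
    F = [1 if p >= mean else 0 for p in pixels]   # assumption: start from the BTC bitmap
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):                        # asynchronous: one neuron at a time
            total_input = sum(T[i][j] * F[j] for j in range(n)) + I[i]
            if total_input > 0:
                new_state = 1
            elif total_input < 0:
                new_state = 0
            else:
                new_state = F[i]                  # assumption: keep the state on a tie
            if new_state != F[i]:
                F[i] = new_state
                changed = True
        if not changed:                           # a full sweep without change: stable
            break
    return F


def output_levels(pixels, F):
    """Class averages: level of class '0' and level of class '1' (None if a class is empty)."""
    q = sum(F)
    n = len(pixels)
    high = sum(p for p, f in zip(pixels, F) if f) / q if q else None
    low = sum(p for p, f in zip(pixels, F) if not f) / (n - q) if n - q else None
    return low, high


block = [0, 0, 0, 0, 6, 20]          # toy block (assumed values)
F = hopfield_classify(block)
print(F, output_levels(block, F))
```

Stopping as soon as a complete sweep leaves every neuron unchanged is one simple way to detect the stable state; the cap on the number of sweeps is only a safety limit.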
3.3 Program Flowchart
Figure 3.1 Flowchart
3.4 Result Analyse
3.4.1 Cameraman
Figure 3.2 Original Figure 3.3 Histogram of Original
Figure 3.4 4*4 Block Size Figure 3.5 Histogram of 4*4 Block Size Figure 3.6 8*8 Block Size Figure 3.7 Histogram of 8*8 Block Size
Figure 3.8 16*16 Block Size Figure 3.9 Histogram of 16*16 Block Size
Figure 3.10 Original Figure 3.11 4*4 Block Size
Figure 3.12 8*8 Block Size Figure 3.13 16*16 Block Size

| Image | Correlation | Size (KB) | Compression Ratio (theoretical) | Compression Ratio (actual) | MSE | SNR (dB) |
|---|---|---|---|---|---|---|
| Original | 100% | 257 | 1 | 1 | | |
| 4*4 | 99.86% | 65 | 4:1 | 3.9538:1 | 38.5849 | 26.6502 |
| 8*8 | 98.85% | 41 | 6.4:1 | 6.2682:1 | 85.9758 | 24.0707 |
| 16*16 | 97.75% | 35 | 7.5294:1 | 7.3428:1 | 138.3457 | 21.1757 |

Table 3.1 Parameter of ‘Cameraman’
Conclusion: Table 3.1 shows that the actual compression ratio is quite close to the theoretical one. However, HNNBTC needs much more computation than BTC. Typically, encoding takes 5 minutes for the 4*4 block size and 30 minutes for the 8*8 block size, and with the 16*16 block size the time exceeds 1 hour. If we choose smaller images, such as 128*128, the computation time is shorter. Alternatively, we can resize the image to half its size, which reduces the running time of the program.
3.4.2 Pirate
Figure 3.14 Original Figure 3.15 Intensity Profile of Original
Figure 3.16 4*4 Block Size Figure 3.17 Intensity Profile of 4*4 Block Size
Figure 3.18 8*8 Block Size Figure 3.19 Intensity Profile of 8*8 Block Size
Figure 3.20 16*16 Block Size Figure 3.21 Intensity Profile of 16*16 Block Size
Figure 3.22 Original Figure 3.23 4*4 Block Size
Figure 3.24 8*8 Block Size Figure 3.25 16*16 Block Size
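| Image | Correlation | Size (KB) | Compression Ratio (theoretical) | Compression Ratio (actual) | MSE | SNR (dB) |
|---|---|---|---|---|---|---|
| Original | 100% | 257 | 1 | 1 | | |
| 4*4 | 99.18% | 65 | 4:1 | 3.9538:1 | 37.1992 | 25.9774 |
| 8*8 | 98.22% | 41 | 6.4:1 | 6.2682:1 | 80.9846 | 22.8592 |
| 16*16 | 96.58% | 35 | 7.5294:1 | 7.3428:1 | 137.2873 | 20.5148 |

Table 3.2 Parameter of ‘Pirate’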
Conclusion: Table 3.2 shows that the MSE increases and the SNR falls as the block size becomes larger. Figure 3.22 to Figure 3.25 zoom in on the same part of the respective images by the same factor; the contouring artifacts become more visible, and the block contours become more obvious.
3.4.3 Woman_darkhair
Figure 3.26 4*4 Block Size Figure 3.27 Errors
Figure 3.28 8*8 Block Size Figure 3.29 Errors
Figure 3.30 16*16 Block Size Figure 3.31 Errors
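| Image | Correlation | Size (KB) | Compression Ratio (theoretical) | Compression Ratio (actual) | MSE | SNR (dB) |
|---|---|---|---|---|---|---|
| Original | 100% | 257 | 1 | 1 | N/A | N/A |
| 4*4 | 99.87% | 65 | 4:1 | 3.9538:1 | 10.1232 | 31.8307 |
| 8*8 | 99.63% | 41 | 6.4:1 | 6.2682:1 | 28.1911 | 27.4399 |
| 16*16 | 98.90% | 35 | 7.5294:1 | 7.3428:1 | 83.1571 | 22.6829 |

Table 3.3 Parameter of ‘Woman_darkhair’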
3.5 Summary
As the block size increases, the correlation between the original and reconstructed images decreases, the compression ratio and the MSE rise, the SNR falls sharply, and the quality becomes worse and worse. Computation also takes more and more time as the block size grows; for example, the 16*16 block size takes more than 1 hour.
Chapter 4 Comparisons
4.1 Algorithm Compression
Most steps of the compression are the same in BTC and HNNBTC. The difference between them is how the threshold is found. BTC uses a threshold value to separate the pixel values in a block: if a value is equal to or greater than the threshold, it is set to ‘1’, otherwise to ‘0’. Normally the threshold is the average value of the block. In HNNBTC the threshold is completely different, because it results from the stable state of the network, and the stable state depends on the energy of the image.
Meanwhile, HNNBTC runs quite slowly when compressing an image. In my program I could not use the toolbox in Matlab, and I did not know how to detect the stable state of the Hopfield neural network automatically. Instead, I set a maximum number of update operations, because once the network has reached its stable state, further updates do not change it. This is why my HNNBTC program takes so much time. The bigger the block size, the longer it takes to reach the stable state; for example, the 8*8 block size takes at least 15 minutes to encode.
4.2 Image Result Comparisons
Figure 4.1 to Figure 4.6 show reconstructed images produced by the different compression techniques. The first group uses BTC and the second group uses HNNBTC. The original images are the same for both methods. Within each group, the order from left to right is 4*4, 8*8 and 16*16 block size.
Figure 4.1 Group 1 Cameraman (BTC)
Figure 4.2 Group 2 Cameraman (HNNBTC)
Figure 4.3 Group 1 Pirate (BTC)
Figure 4.4 Group 2 Pirate (HNNBTC)
Figure 4.5 Woman_darkhair (BTC)
Figure 4.6 Woman_darkhair (HNNBTC)
Now, compare the two compression methods at the same 4*4 block size.

| Image | Compression Ratio (actual), BTC | Compression Ratio (actual), HNNBTC | Correlation (%), BTC | Correlation (%), HNNBTC | MSE, BTC | MSE, HNNBTC | SNR (dB), BTC | SNR (dB), HNNBTC |
|---|---|---|---|---|---|---|---|---|
| Cameraman | 3.9538:1 | 3.9538:1 | 99.47 | 99.86 | 40.8640 | 38.5849 | 26.4010 | 26.6502 |
| Pirate | 3.9538:1 | 3.9538:1 | 99.10 | 99.18 | 40.7506 | 37.1992 | 25.5814 | 25.9774 |
| Woman_darkhair | 3.9538:1 | 3.9538:1 | 99.86 | 99.87 | 10.7125 | 10.1232 | 31.5849 | 31.8307 |

Table 4.1 Compare BTC and HNNBTC
Table 4.1 illustrates that BTC and HNNBTC have the same compression ratio, both in theory and in the actual values.
Table 4.2 MSE and SNR
Table 4.2 illustrates that for ‘cameraman’, ‘pirate’ and ‘woman_darkhair’ alike, the MSE of BTC is always higher than that of HNNBTC, while the SNR of HNNBTC is always greater than that of BTC. That is to say, HNNBTC consistently gives better results than BTC.
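The two quality measures compared here can be sketched in a few lines of Python. This is an illustration only: it assumes that SNR is the ratio of the signal power of the original image to the power of the error, expressed in dB, which may differ in detail from the definition used earlier in this report. The function names and the toy pixel values are hypothetical.

```python
import math


def mse(original, reconstructed):
    """Mean squared error between two equally long lists of pixel values."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)


def snr_db(original, reconstructed):
    """Signal-to-noise ratio in dB (assumed definition: signal power over error power)."""
    signal = sum(o ** 2 for o in original)
    noise = sum((o - r) ** 2 for o, r in zip(original, reconstructed))
    return float("inf") if noise == 0 else 10 * math.log10(signal / noise)


original = [0, 0, 0, 0, 6, 20]                    # toy block (assumed values)
mean_threshold = [0, 0, 0, 0, 13, 13]             # two levels from a mean threshold
other_grouping = [1.2, 1.2, 1.2, 1.2, 1.2, 20]    # two levels from a different grouping

for name, rec in (("mean threshold", mean_threshold), ("other grouping", other_grouping)):
    print(name, round(mse(original, rec), 4), round(snr_db(original, rec), 4))
```

The toy data show the relationship in the tables: the reconstruction with the lower MSE also has the higher SNR, because both are computed from the same error energy.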
4.3 Summary
Whether BTC or HNNBTC is used, the larger the block size chosen, the fewer blocks are produced and the lower the resolution of the reconstructed image. This matters especially for the Hopfield neural network, whose threshold is based on the stable state, which is calculated through repeated update operations in the encoding part.
The Hopfield neural network gives a better image quality and is an optimisation of block truncation coding. However, it still causes contouring artifacts. Each block has only two reconstruction levels, so a lot of data is discarded during compression. As a result, the block contours become clearly visible in the reconstructed image after decoding. There is also another disadvantage shared by both methods: the bit rate is high. For example, when the block size is 4*4, the bit rate = (8+8+16)/16 = 2 bpp. The higher the bit rate, the more expensive the transmission. How to reduce the bit rate is discussed in Chapter 6.
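The bit rate calculation above generalises to the other block sizes. The short Python sketch below assumes, as in the 4*4 example, two 8-bit levels plus one bitmap bit per pixel for every block, and an 8-bit original image; the function names are hypothetical.

```python
def btc_bit_rate(block_size, level_bits=8):
    """Bits per pixel: two representative values plus a one-bit-per-pixel bitmap."""
    pixels = block_size * block_size
    return (2 * level_bits + pixels) / pixels


def compression_ratio(block_size, pixel_bits=8, level_bits=8):
    """Theoretical compression ratio for an image stored with pixel_bits per pixel."""
    return pixel_bits / btc_bit_rate(block_size, level_bits)


for size in (4, 8, 16):
    print(size, btc_bit_rate(size), round(compression_ratio(size), 4))
```

Under these assumptions the ratios for the 4*4, 8*8 and 16*16 block sizes come out as 4:1, 6.4:1 and 7.5294:1, which agrees with the theoretical column of the tables in Chapters 2 and 3.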
Chapter 5 Color Image Compression
5.1 Color Image Compression Base on RGB Color Space
5.1.1 Introduction
As we know, each pixel in a color image has three color components: ‘R’, ‘G’ and ‘B’. Each component takes one byte of storage, so a color pixel needs 3 bytes. So, there are $2^{24} = 16{,}777{,}216$ different pixel colors.
Red, green and blue are called primary colors, and they can be added to produce the secondary colors. For example, red plus blue produces magenta, green plus blue produces cyan, and red plus green produces yellow [35]. These results are shown below.
Figure 5.1 RGB
Meanwhile, the RGB model is based on a Cartesian coordinate system. Figure 5.2 shows the color subspace of interest. The primary colors are at three corners, and the secondary colors are at three other corners. The RGB model is used for display on screens.
Figure 5.2 RGB Cube Figure 5.3 RGB 24‐bit Color Cube
5.1.2 Flowchart
Figure 5.4 Flowchart
Open the ‘Lena_512_Color’, and cut the image into 4*4 blocks.
Figure 5.5 Lena
Encoding, and get the bitmap.
Figure 5.6 Bitmap __ R G B
Decoding
Figure 5.7 Decoding __ R G B
Reconstructed Image
Combine the ‘R’, ‘G’ and ‘B’ matrices.
Figure 5.8 Reconstructed Figure 5.9 Differences Figure 5.5 Original
Figure 5.9 shows the differences between the original and reconstructed image.
5.1.3 Result Analyse
The following images are divided into two groups. The first group uses Block Truncation Coding to compress the image ‘lena_color’ (Figure 1.11) at different block sizes. The second group uses the Hopfield neural network based on Block Truncation Coding.
First Group using BTC:
Figure 5.10 4*4 Block Size Figure 5.11 8*8 Block Size Figure 5.12 16*16 Block Size
Compression Ratio: The original image is 770KB. After compression, Figure 5.10 is 193KB, Figure 5.11 is 121KB and Figure 5.12 is 103KB. The compression ratio is therefore
$$\frac{770}{193} = 3.9896$$
for the 4*4 block size,
$$\frac{770}{121} = 6.3636$$
for the 8*8 block size, and
$$\frac{770}{103} = 7.4757$$
for the 16*16 block size.
Second Group using HNNBTC
Figure 5.13 4*4 Block Size Figure 5.14 8*8 Block Size Figure 5.15 16*16 Block Size
Figure 5.16 BTC
Figure 5.17 HNNBTC

| Image ‘lena’ | Compression Ratio (actual), BTC | Compression Ratio (actual), HNNBTC | MSE, BTC | MSE, HNNBTC | SNR, BTC | SNR, HNNBTC |
|---|---|---|---|---|---|---|
| 4*4 | 3.9896:1 | 3.9896:1 | 31.7350 | 29.2244 | 67.9618 | 69.9618 |
| 8*8 | 6.3636:1 | 6.3636:1 | 64.5222 | 62.7686 | 58.7167 | 59.6671 |
| 16*16 | 7.4757:1 | 7.4757:1 | 116.6808 | 113.9560 | 50.9980 | 51.6106 |

Table 5.1 Compare BTC & HNNBTC in RGB Color Space
From the data in Table 5.1, for the 4*4, 8*8 and 16*16 block sizes alike, the MSE of HNNBTC is always lower than that of BTC and the SNR of HNNBTC is always greater. Again, HNNBTC gives a better reconstructed image quality than BTC when compressing a color image in RGB color space.
5.2 Color Image Compression Base on YUV Color Space
5.2.1 YUV Color Space
YUV color space is commonly used in the transmission of television signals [36]. ‘Y’ represents the black-and-white (luminance) component, and ‘U’ and ‘V’ are the color difference signals. RGB is converted into YUV with the formulas below:
Y = 0.299R + 0.587G + 0.114B
U = -0.147R - 0.289G + 0.436B = 0.492(B-Y)
V = 0.615R - 0.515G - 0.100B = 0.877(R-Y)
To change YUV Color Space back into RGB Color Space:
R = Y + 1.140V
G = Y - 0.395U - 0.581V
B = Y + 2.032U
5.2.2 Flowchart
Figure 5.18 Flowchart
Open ‘Lena_512_Color’, change it from RGB Color Space into YUV Color Space, and cut the image into 4*4 blocks.
Figure 5.19 Original Image Figure 5.20 YUV Color Image
As mentioned before, screens generally display color images in RGB, and so does Matlab. This is why the YUV color image on the right looks the way it does: its Y, U and V values are displayed as if they were R, G and B. Encode the three channels individually.
Figure 5.21 Bitmap_ Y U V
Decoding
Figure 5.22 Encoding_Y U V
‘Y’ represents the luminance, ‘U’ and ‘V’ are the color difference. Combine three channels together, and change YUV back to RGB.
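The conversion to YUV and back can be sketched for a single pixel as follows. This Python illustration uses the forward relations Y = 0.299R + 0.587G + 0.114B, U = 0.492(B-Y) and V = 0.877(R-Y) from Section 5.2.1; the inverse is derived algebraically from them and is an assumption about how the return conversion is done, not a copy of the Matlab program. The function names are hypothetical.

```python
def rgb_to_yuv(r, g, b):
    """Forward conversion with the luminance weights and scale factors of Section 5.2.1."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v


def yuv_to_rgb(y, u, v):
    """Algebraic inverse of rgb_to_yuv (derived here, not taken from the report)."""
    r = y + v / 0.877
    b = y + u / 0.492
    # Solve the luminance equation for G once R and B are known.
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b


pixel = (200.0, 120.0, 30.0)          # toy RGB pixel (assumed values)
y, u, v = rgb_to_yuv(*pixel)
print((y, u, v))
print(yuv_to_rgb(y, u, v))            # should reproduce the pixel up to rounding
```

For a grey pixel (R = G = B) both color differences are zero, which is why all the grayscale information sits in the ‘Y’ channel.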
Figure 5.23 Reconstructed YUV Color Image Figure 5.24 Reconstructed Image
5.2.3 Result Analyse
The following images are divided into two groups. The first group uses Block Truncation Coding to compress the image ‘lena_color’ (Figure 1.11) at different block sizes. The second group uses the Hopfield neural network based on Block Truncation Coding.
5.2.3.1 Lena_color
First Group using BTC:
Figure 5.25 4*4 Block Figure 5.26 8*8 Block Figure 5.27 16*16 Block
Second Group using HNNBTC:
Figure 5.28 4*4 Block Size Figure 5.29 8*8 Block Size Figure 5.30 16*16 Block Size
The first group using BTC:
Figure 5.31 4*4 Block Size Figure 5.32 8*8 Block Size Figure 5.33 16*16 Block Size
Second group using HNNBTC:
Figure 5.34 4*4 Block Size Figure 5.35 8*8 Block Size Figure 5.36 16*16 Block Size
| Image ‘lena’ | Compression Ratio (actual), BTC | Compression Ratio (actual), HNNBTC | MSE, BTC | MSE, HNNBTC | SNR, BTC | SNR, HNNBTC |
|---|---|---|---|---|---|---|
| 4*4 | 3.9896:1 | 3.9896:1 | 36.1131 | 32.9077 | 66.2780 | 67.4891 |
| 8*8 | 6.3636:1 | 6.3636:1 | 72.1813 | 71.5945 | 57.2552 | 58.8914 |
| 16*16 | 7.4757:1 | 7.4757:1 | 130.2377 | 127.7450 | 49.5659 | 50.0693 |

Table 5.2 Compare BTC & HNNBTC in YUV Color Space
Table 5.2 shows that HNNBTC again offers a better image quality than BTC, as seen by comparing the MSE and SNR. More importantly, HNNBTC has the same compression ratio as BTC.
5.2.3.2 Madril_color
Group 1 using BTC
Figure 5.37 4*4 Block Size Figure 5.38 8*8 Block Size Figure 5.39 16*16 Block Size
Group 2 using HNNBTC
Figure 5.40 4*4 Block Size Figure 5.41 8*8 Block Size Figure 5.42 16*16 Block Size
The YUV model is widely used for digital video. In this color space, luminance (brightness or intensity) information is stored as a single component (Y), and chrominance (color) information is stored as two color difference components (U and V). According to the experimental results, HNNBTC performs better than BTC.
5.2.4 A Way to Achieve More Compression in YUV Color Space
Generally speaking, ‘U’ and ‘V’ are subsampled by a factor of two to four in the spatial dimensions. Because the human visual system is much less sensitive to ‘U’ and ‘V’ than to the luminance ‘Y’, we can reduce the ‘U’ and ‘V’ channels to half size and keep the ‘Y’ channel, and the compression ratio will then increase. The most important channel is ‘Y’, which represents the black-and-white (luminance) component; ‘U’ and ‘V’ are just the color differences.
The bitmap (encoding), decoding and reconstructed results are shown next. The original images of all groups are cut into 4*4 blocks.
Bitmap
The first group uses BTC to compress the color image based on the YUV model, with the ‘U’ and ‘V’ channels reduced to half size and the ‘Y’ channel kept at the original size. The second group uses BTC to compress the color image based on the YUV model with all channels kept at the original size. The third group is based on the RGB model, with the ‘R’, ‘G’ and ‘B’ matrices unchanged.
Group 1:
Figure 5.43 Bitmap_YUV’
Group 2:
Figure 5.44 Bitmap_Y U V
Group 3:
Figure 5.45 Bitmap_R G B
Decoding
Group 1:
Figure 5.46 Decoding_Y U V’
Group 2:
Figure 5.47 Decoding_Y U V
Group 3:
Figure 5.48 Decoding_R G B
Reconstructed
Figure 5.49 Group 1 Group 2 Group 3
Errors
Figure 5.50 Group 1 Group 2 Group 3
The contour of ‘lena’ is clearer with the RGB model than with YUV, both in the bitmap and in the decoded channels. Of the compressed images, the one using the RGB model has the best reconstructed quality. Comparing the three error images, it is easy to see that more color data is lost with the YUV model than with the RGB model, which lowers the quality of the reconstructed image. After compression, the reconstructed image of group 1 is 97KB, that of group 2 is 193KB and that of group 3 is 193KB, while the original image is 770KB. It is obvious that as the compression ratio increases, the quality becomes worse.

| Model | Compression Ratio | MSE | SNR |
|---|---|---|---|
| YUV’ | 7.9381:1 | 45.7508 | 63.1960 |
| YUV | 3.9896:1 | 36.1131 | 66.2780 |
| RGB | 3.9896:1 | 31.7350 | 67.9618 |

Table 5.3 Comparison
Notice: YUV’ means group 1, in which the ‘U’ and ‘V’ channels have been reduced to half size.
Table 5.3 shows that the compression ratio becomes larger if the ‘U’ and ‘V’ channels are sampled at only half size.
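The effect of reducing ‘U’ and ‘V’ can be sketched as follows. This Python illustration assumes that "half size" means half the width and half the height (a quarter of the pixels), that the reduction is done by averaging 2*2 neighbourhoods, and that the channel is restored by pixel repetition; the text does not state which method the Matlab program uses, and the function names are hypothetical.

```python
def downsample_half(channel):
    """Halve width and height by averaging each 2*2 neighbourhood (assumed method)."""
    h, w = len(channel), len(channel[0])
    return [[(channel[r][c] + channel[r][c + 1]
              + channel[r + 1][c] + channel[r + 1][c + 1]) / 4
             for c in range(0, w - 1, 2)]
            for r in range(0, h - 1, 2)]


def upsample_double(channel):
    """Restore the original size by repeating every pixel in both directions."""
    out = []
    for row in channel:
        wide = [value for value in row for _ in (0, 1)]
        out.append(wide)
        out.append(list(wide))
    return out


def yuv_subsampled_ratio(block_ratio=4.0):
    """Theoretical ratio when Y is kept and U, V each keep a quarter of their pixels."""
    kept_channels = 1 + 0.25 + 0.25      # Y full size, U and V quarter size
    return block_ratio * 3 / kept_channels


u_channel = [[1, 3], [5, 7]]             # toy 2*2 chrominance patch (assumed values)
small = downsample_half(u_channel)
print(small, upsample_double(small))
print(yuv_subsampled_ratio())            # compare with the measured 7.9381:1
```

Under these assumptions the 4:1 ratio of the 4*4 block size doubles to 8:1, which is close to the measured ratio of group 1 in Table 5.3.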
5.3 Color Image Compression Base on YIQ Color Space
There is another color space called YIQ, which is defined by the National Television Systems Committee (NTSC) [36]. It is widely used in television in the United States, and its color coordinates are derived from the YUV format. A main advantage of this format is that grayscale information is separated from color data, so the same signal can be used for both color and black-and-white sets. In this space, a color pixel consists of three components. ‘Y’ represents the luminance, ‘I’ stands for in-phase, and ‘Q’ stands for quadrature-phase. The ‘Y’ component contains the grayscale information, and the other two components make up the chrominance. The relationship between YIQ and RGB is,
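For reference, the standard NTSC conversion matrix, which is the one used by Matlab's `rgb2ntsc` function, is given below. It is assumed here to be the relationship meant in this section; the luminance row is the same as the ‘Y’ formula in Section 5.2.1.

```latex
\begin{bmatrix} Y \\ I \\ Q \end{bmatrix}
=
\begin{bmatrix}
0.299 & 0.587 & 0.114 \\
0.596 & -0.274 & -0.322 \\
0.211 & -0.523 & 0.312
\end{bmatrix}
\begin{bmatrix} R \\ G \\ B \end{bmatrix}
```

The ‘I’ and ‘Q’ rows each sum to zero, so a grey pixel (R = G = B) has no chrominance.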
Step 1: Open an RGB image and convert it into YIQ color space.
Figure 5.51 RGB Image Figure 5.52 YIQ Image
Figure 5.53 Y Channel Figures 5.54 I Channel Figure 5.55 Q Channel
Step 2: Segment the image into 4*4 blocks and calculate the average pixel value of each block. If a pixel value is greater than or equal to the average, set it to ‘1’; otherwise, set it to ‘0’. This gives a bitmap.
Figure 5.56 Bitmap_Y I Q
Step 3: Decode using the reconstruction levels (the class averages) and the bitmap.
Figure 5.57 Decoding_Y I Q
Step 4: Combine three channels in order to get the reconstructed YIQ image, and then translate back into RGB image.
Figure 5.58 Reconstructed YIQ Image Figure 5.59 Reconstructed Image
Result Analyze: The following images were compressed from the same image with the same block size but by different methods. Figure 5.60 uses BTC, and Figure 5.62 zooms in on the left eye of Figure 5.60. Figure 5.61 was compressed by HNNBTC, and Figure 5.63 is the corresponding zoom of Figure 5.61.
Figure 5.60 BTC Figure 5.61 HNNBTC
Figure 5.62 BTC Figure 5.63 HNNBTC
| Method | Block Size | MSE | SNR |
|---|---|---|---|
| BTC | 4*4 | 36.5491 | 66.1217 |
| HNNBTC | 4*4 | 34.7460 | 66.7808 |

Table 5.4 Compare BTC & HNNBTC in YIQ Color Space
Conclusion: The smaller the MSE, the larger the SNR. BTC has a higher MSE (36.5491) than HNNBTC (34.7460), which means a worse reconstructed quality. The SNR shows the same: 66.7808 (HNNBTC) is higher than 66.1217 (BTC). Color image compression in YIQ color space thus shows again that HNNBTC performs better than BTC.
5.4 Color Image Compression Base on HSV Color Space
Color images can be compressed in different color spaces: not only RGB, YUV and YIQ, but also HSV (hue, saturation, value) [4]. This color space was introduced by Alvy Ray Smith in 1978. It is often used for selecting colors (e.g. of paints or inks) from a color wheel or palette, because it corresponds better to how people experience color than the RGB color space does. The following figure illustrates the HSV color space [36].
Figure 5.64 HSV Color Space
As hue varies from 0 to 1.0, the corresponding colors run from red through yellow, green, cyan, blue and magenta, and back to red, so that red appears at both 0 and 1.0. As saturation varies from 0 to 1.0, the corresponding colors (hues) vary from unsaturated (shades of gray) to fully saturated (no white component). As value, or brightness, varies from 0 to 1.0, the corresponding colors become increasingly brighter. Saturation can be understood as the purity of a color, and value is roughly equivalent to brightness.
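The ordering of the hues described above can be checked with Python's standard `colorsys` module, which also scales hue, saturation and value to the range 0 to 1.0. This is only an illustration; the project itself uses Matlab, and the helper name `describe` is hypothetical.

```python
import colorsys


def describe(rgb):
    """Return (hue, saturation, value) of an RGB triple, rounded to 4 decimals."""
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    return round(h, 4), round(s, 4), round(v, 4)


samples = {
    "red": (1.0, 0.0, 0.0),
    "yellow": (1.0, 1.0, 0.0),
    "green": (0.0, 1.0, 0.0),
    "cyan": (0.0, 1.0, 1.0),
    "blue": (0.0, 0.0, 1.0),
    "magenta": (1.0, 0.0, 1.0),
    "mid gray": (0.5, 0.5, 0.5),
}

for name, rgb in samples.items():
    print(name, describe(rgb))
```

The six pure colors are spaced one sixth apart in hue, all with saturation 1.0, while the gray sample has saturation 0 and only its value carries information.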
Figure 5.65 RGB to HIS Conversion
Figure 5.66 RGB Image Figure 5.67 HSV Image
Figure 5.68 HSV_H S V
Result Analyze: The following images were compressed from the same image with the same block size but by different methods. Figure 5.69 uses BTC, and Figure 5.71 zooms in on the left eye of Figure 5.69. Figure 5.70 was compressed by HNNBTC, and Figure 5.72 is the corresponding zoom of Figure 5.70.
Figure 5.69 BTC Figure 5.70 HNNBTC
Figure 5.71 BTC Figure 5.72 HNNBTC
| Method | Block Size | MSE | SNR |
|---|---|---|---|
| BTC | 4*4 | 33.8781 | 67.1104 |
| HNNBTC | 4*4 | 31.0753 | 68.2355 |

Table 5.5 Compare BTC & HNNBTC in HSV Color Space
Conclusion: For MSE, 31.0753 < 33.8781, and for SNR, 68.2355 > 67.1104, so color image compression using HNNBTC in HSV color space gives a better quality than BTC. This shows again that HNNBTC performs better than BTC.
5.5 Conclusion & Comparisons
Figure 5.73 RGB Figure 5.74 YUV
Figure 5.75 YIQ Figure 5.76 HSV
| Image | Color Space | Block Size | MSE, BTC | MSE, HNNBTC | SNR, BTC | SNR, HNNBTC |
|---|---|---|---|---|---|---|
| Lena_color_512 | RGB | 4*4 Block | 31.7350 | 29.2744 | 67.9618 | 69.9618 |
| Lena_color_512 | YUV | 4*4 Block | 36.1131 | 32.9077 | 66.2780 | 67.4891 |
| Lena_color_512 | YIQ | 4*4 Block | 36.5491 | 34.7460 | 66.1217 | 66.7808 |
| Lena_color_512 | HSV | 4*4 Block | 33.8781 | 31.0753 | 67.1104 | 68.2355 |

Table 5.6 Compare Color Image Compressions in Different Color Space
Table 5.7 MSE & SNR in Image Compression Using Different Color Space
Figure 5.82 illustrates the same image compressed by the different techniques in the different color spaces. In RGB, YUV, YIQ and HSV color space alike, it is easy to see that HNNBTC performs better than BTC. This shows again that HNNBTC is an optimisation of BTC.
However, the image quality of BTC and HNNBTC is quite close in YIQ and HSV color space. In YIQ, the SNR is 66.1217 for BTC and 66.7808 for HNNBTC, so there is not much difference between the two techniques. Meanwhile, with HNNBTC the SNR is 69.9618 in RGB, 67.4891 in YUV, 66.7808 in YIQ and 68.2355 in HSV. Comparing these data, when the same technique compresses the same color image in different color spaces, RGB gives a higher reconstructed image quality than the other color spaces.
Chapter 6 Variable Block Size
6.1 Principle
As discussed before, even though HNNBTC finds an optimised threshold that gives a better quality than BTC, the bit rate stays the same, so transmission is still expensive. Using a variable block size in place of the fixed block size reduces the bit rate and solves this problem [37]. At the same time, because HNNBTC uses only two reconstruction levels per block, some detail in the image cannot be restored. Suppose area A contains much more information than area B and both use the same block size: contouring artifacts appear, especially in area A, because two reconstruction levels are not enough for it. In this situation, a better reconstructed quality is obtained by using a variable block size in place of the fixed one. Where an area contains more information, a smaller block size such as 4*4 is used; otherwise, the normal block size is used.
| 8x8 block | 8x8 block cut into four 4x4 blocks |
|---|---|
| two representative values | two representative values (multiplied by 4) |
| 64-bit bitmap | 16-bit bitmap (multiplied by 4) |
The code for each block should also comprise a block size marker (1 bit), which records details about the block, such as how many n*n blocks it is cut into and what the value of n is. The advantage of the block size marker is that the decoder can distinguish which block size was used. The variable block size approach exploits local image content for improved compression.
6.2 Programming Flow chart
Figure 6.1 Flowchart
The program uses the standard deviation. STD is the standard deviation of the whole image, STD’ is the standard deviation of an 8*8 block, and BSM is the block size marker. If the block STD is less than the STD of the whole image, the block is left unchanged and its two class averages are calculated at the original size. If the block STD is equal to or greater than the STD of the whole image, the block is divided into four blocks of 4*4 each (4*4*4 in this case). The same method is then used to get the bitmap and the two class averages of each sub-block.
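The block size decision can be sketched as follows. This Python illustration is not the Matlab program of this project. It assumes a marker value of 1 for a split block and 0 for an unsplit block, and it uses the population standard deviation (`statistics.pstdev`), whereas Matlab's `std` uses the sample formula by default. The function names and the bit counts (8-bit levels, 1-bit marker) are assumptions for illustration.

```python
import statistics


def split_block_8x8(block):
    """Cut an 8*8 block (list of 8 rows) into four 4*4 sub-blocks, row by row."""
    return [[row[c:c + 4] for row in block[r:r + 4]]
            for r in (0, 4) for c in (0, 4)]


def choose_blocks(block, image_std):
    """Return (block size marker, list of blocks to encode) for one 8*8 block."""
    flat = [p for row in block for p in row]
    if statistics.pstdev(flat) >= image_std:
        return 1, split_block_8x8(block)     # busy block: four 4*4 blocks
    return 0, [block]                        # smooth block: keep the 8*8 block


def block_bits(marker, level_bits=8):
    """Bits needed for one 8*8 area: marker + levels + bitmap(s)."""
    if marker == 1:
        return 1 + 4 * (2 * level_bits + 16)
    return 1 + 2 * level_bits + 64


busy = [[0] * 4 + [100] * 4 for _ in range(8)]   # toy block with a strong edge
smooth = [[10] * 8 for _ in range(8)]            # toy flat block
for name, blk in (("busy", busy), ("smooth", smooth)):
    marker, parts = choose_blocks(blk, image_std=20.0)
    print(name, marker, len(parts), block_bits(marker))
```

With these assumptions a smooth 8*8 area costs 81 bits and a split one 129 bits, so the extra bits are spent only where the image content needs them.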
6.3 Practical Works
Here, I used a 128*128 grey level image called ‘cameraman’.
Figure 6.2 Original Figure 6.3 Bitmap
In my program, the first step is to cut the image into 8*8 blocks. If the standard deviation of a block is greater than the standard deviation of the whole image, the block is divided into four 4*4 blocks, and a block size marker (BSM) is set at the same time. The averages of the two classes for the different block sizes, together with the BSM matrix, confirm that my encoding part is correct. (These data are only part of the image.)
Figure 6.4 Average of two classes about 8*8 & 4*4 Block size
Decoding should be easy; however, all the data are lost in the transmission. I need more time to find out what is wrong in my program, and this becomes one part of my future work.
Chapter 7 Conclusion and Future Work
7.1 Conclusion
This project aims to implement and develop an image compression technique using a Hopfield neural network. The first basic aim was to understand the principle of image compression and then to use Matlab to implement the compression. The work can be divided into 8 objectives.
1. Investigations on the BTC
Investigating Block Truncation Coding was the first objective of this project, as it is the basic algorithm. As presented in Chapter 2, BTC is simple but effective and gives a satisfying reconstructed image quality.
2. Investigations on the Hopfield Neuron Network
As the project title is ‘Image Compression Using Hopfield Neuron Network’, the Hopfield neural network plays an important role here. It performs better than BTC in compression.
3. Investigations of the Matlab Software
Matlab was suggested for implementing the project procedures, so the Matlab software system was investigated. One of its advantages is its function toolboxes: the software contains many image compression and neural network algorithms, and especially color space functions.
4. Design a Matlab Program
After investigating the principles of BTC and HNNBTC, two Matlab programs were written to compress grey-level images, one for BTC and one for HNNBTC. Each program supports the 4*4, 8*8 and 16*16 block sizes; I used a function called ‘range’ so that any one of these three block sizes can be chosen. All the results were produced with Matlab.
5. Improvement of the Project in grey level images
Grey-level images are easy to compress with the two methods, BTC and HNNBTC. MSE, SNR and compression ratio are the standards used to define a ‘good’ compression quality. Details are given in Chapters 2, 3 and 4.
6. Do color images compression in different color space
In RGB, YUV, YIQ and HSV color space alike, I showed that HNNBTC performs better than BTC. The results are presented in Chapter 5.
7. Improvement more compression in YUV color space as asked
‘Y’ represents the luminance, and ‘U’ and ‘V’ are the color differences. Put simply, most of the information is in the Y channel, so more of the U and V channels can be discarded. With my Project Supervisor’s help, I succeeded in raising the compression ratio to 7.9381:1 from the original 3.9896:1. More detail is given in Section 5.2.4.
8. Encoding in variable block size
This was the most difficult part. After I finished the encoding and tried to decode, all the data were lost in the transmission. With the deadline and the exams approaching, I did not have enough time to find my mistake in the decoding.
7.2 Future Work
Based on the progress of this project, the future work described below would form a useful contribution to the further development of the system.
1. Finish the decoding part for variable block size. This should also include experiments on compressing color images.
2. Error control coding. Channel errors will occur during transmission, so error control codes that detect and correct channel errors become very important. A (62, 56) Hamming code can be used, as it is capable of correcting one error in 56 bits.
3. Use fewer bits to represent one pixel, for example 7 bits per pixel instead of 8 bits per pixel. The compression ratio would then increase at the cost of reduced brightness resolution.
4. Combine this technique not only with error control but also with a hardware implementation.
References
[1] Martin, Lik-kwan Shark, "Handout of Digital Image Processing", 2008.
[2] R.J. Clarke, "Digital Compression of Still Images and Video", 1995, pp. 7-17.
[3] S. Cavalieri, A. Di Stefano and O. Mirabella, "Optimal path determination in a graph by Hopfield neural network", Neural Networks, vol. 7, 1994, pp. 397-404.
[4] Rafael C. Gonzalez and Richard E. Woods, "Digital Image Processing", 2002.
[5] Claude E. Shannon, "A Mathematical Theory of Communication", Bell System Technical Journal, 1948, vol. 27, pp. 379-423, 623-656.
[6] T.X. Chen, J.H. Xie and J.M. Huang, "Ensure Structure of Tree Video Coding Benefits Technologies", Conference on Technology and Management, 2000, pp. 387-393.
[7] Martin Roy Varley and Mike Peak, "Handout of Artificial Neural Networks", 2009.
[8] G. Qiu, M.R. Varley and T.J. Terrell, "Improved Block Truncation Coding using Hopfield Neural Network", Electronics Letters, October 1991, vol. 27, no. 21, pp. 1924-1926 (ISSN 0013-5194).
[9] J.J.Y. Huang and P.M. Schultheiss, "Block quantization of correlated Gaussian random variables", IEEE Trans. Comm., Sep. 1963, vol. 11, no. 9, pp. 289-296.
[10] K.K. Ma, "Put Absolute Moment Block Truncation Coding in Perspective", IEEE Trans. Commun., 1997, vol. 45, no. 3, pp. 284-286.
[11] Rafael C. Gonzalez and Richard E. Woods, "Digital Image Processing", Second Edition, October 2007.
[12] E.J. Delp and O.R. Mitchell, "Image Compression Using Block Truncation Coding", IEEE Trans. on Comm., 1979, vol. COM-27, no. 9, pp. 1335-1342.
[13] R.M. Gray, "Vector quantization", IEEE ASSP Magazine, April 1984, vol. 1, pp. 4-29.
[14] R.J. Clarke, "Transform coding of images", Academic Press, London, 1985.
[15] G.R. Arce and N.C. Gallagher, Jr., "BTC image coding using median filter roots", IEEE Trans. on Communications, June 1983, vol. COM-31, no. 6, pp. 784-793.
[16] V.R. Udpikar and J.P. Raina, "BTC image coding using vector quantization", IEEE Trans. Commun., March 1987, vol. COM-35, pp. 352-356.
[17] D.J. Healy and O.R. Mitchell, "Digital video bandwidth compression using BTC", IEEE Trans. Commun., June 1983, vol. COM-31, pp. 784-792.
[18] K.K. Ma, "Sub-band Coding of Digital Video", School of Electrical and Electronic Engineering, Nanyang Technological University, June 2000, pp. 1-3.
[19] M.D. Lema and O.R. Mitchell, "Absolute Moment Block Truncation Coding and its Application to Colour Images", IEEE Trans. on Commun., Oct. 1984, vol. 32, pp. 1148-1157.
[20] R.J. Clarke, "Digital Compression of Still Images and Video", 1995, pp. 7-17.
[21] V. Udpikar and J.P. Raina, "Modified algorithm for block truncation coding of monochrome images", Electronics Letters, 1985, vol. 21, no. 20, pp. 900-902.
[22] Khalid Kamali, "Fractal Video Compression", Faculty of Engineering and Surveying, the University of Southern Queensland, Oct. 2005, pp. 2-15.
[23] M. Ghanbari, "Video Coding: An Introduction to Standard Codecs", The Institution of Electrical Engineers, 1999.
[24] Y.C. Hu, "Improved moment preserving block truncation coding for image compression", Electron. Lett., Sep. 2003, vol. 39, no. 19.
[25] K. Somasundaram and I. Kaspar Raj, "An Image Compression Scheme based on Predictive and Interpolative Absolute Moment Block Truncation Coding", GVIP Journal, Dec. 2006, vol. 6, issue 4, pp. 33-37.
[26] Bibhas Chandra Dhara and Bhabatosh Chanda, "Block truncation coding using pattern fitting", 2004.
[27] R.P. Lippmann, "An introduction to computing with neural nets", IEEE ASSP Mag., 1987, pp. 4-22.
[28] L. Bedini and A. Tonazzini, "Neural network use in maximum entropy image restoration", Image and Vision Computing, 1990, pp. 108-114.
[29] M.W. Roth, "Neural-network technology and its applications", Heuristics, 1989, pp. 46-62.
[30] J.J. Hopfield, "Neural networks and physical systems with emergent collective computational abilities", Proc. Natl. Acad. Sci. USA, 1982, vol. 79, pp. 2554-2558.
[31] J.J. Hopfield, "Neurons with graded response have collective computational properties like those of two-state neurons", Proc. Natl. Acad. Sci. USA, 1984, vol. 81, pp. 3088-3092.
[32] J.J. Hopfield and D.W. Tank, "Neural computation of decisions in optimization problems", Biol. Cybern., 1985, vol. 52, pp. 141-152.
[33] G. Qiu, "An Investigation of Neural Networks for Image Processing Applications", PhD thesis, University of Central Lancashire, 1993.
[34] G. Qiu, M.R. Varley and T.J. Terrell, "Variable Bit Rate Block Truncation Coding for Image Compression using Hopfield Neural Networks", Proc. 3rd International Conference on Artificial Neural Networks, Brighton, May 1993, pp. 233-237 (IEE Conference Publication No. 372).
[35] K.-K. Ma and S.A. Rajala, "Subband Absolute Moment Block Truncation Coding", Optical Engineering, Special Issue on Visual Communications and Image Processing, 1996, vol. 35, no. 1, pp. 213-231.
[36] Help in Matlab.
[37] M.R. Varley and X. Mo, "Error-resilient Image Coding using Hopfield Neural Network Block Truncation Coding Scheme", Proc. IEE Colloquium on Data Compression: Methods and Implementations, Savoy Place, London, November 1999, pp. 7/1-7/7 (ISSN 0963-3308).
Appendix A. Statement of Work (SOW)
University of Central Lancashire
Department of Technology
Image Compression using Hopfield Neural Networks
B.Eng. (Hons.) Digital Communication
Issue 1, Nov 2008
Zongbin Zheng
1. Aim
The aim of this project is to understand what image compression is, what the Hopfield neural network is, and how to apply artificial intelligence to data processing. As it is a year-long project, a good project schedule is absolutely necessary. Like most projects, it should be split into several parts so that each problem is solved step by step, project management is put into practice, and the risks of the project are outlined.
2. Background
Image compression is one kind of information technology application, a field that covers the collection, storage, processing, transmission and use of information. The book [1] 'Digital Image Processing' states that image compression treats the pixels of the image as data and reduces the redundant data to obtain an image of the expected efficiency. Image compression can be lossy or lossless. Lossy means that the compressed image differs from the original image because some data are lost; lossless means that the reconstructed image is unchanged, as required in medical applications, for example. The Hopfield Neural Network is a feedback network. It has a single layer of neurons with binary outputs, and it reaches a stable state when the neurons are updated asynchronously. It has also been applied in many research directions, such as feature detectors, hand design and domain-specific applications. This project uses the Hopfield neural network model for image compression.
3. Work Breakdown Structure
The work flow covers how the project proceeds, the project structure, and the frameworks in detail.
• Statement of work
• Gantt chart
• Group meeting
• Image compression
  - Learning what image compression is and its function
  - Choosing the coding method for programming
  - Using MATLAB 6.5/C++ to program the image compression
• Hopfield neural network
  - Learning what the Hopfield neural network is and its function
  - Using MATLAB 7.0/C++ to program
  - Connecting Excel with MATLAB for the neural networks
• Developing different coding methods for the image compression
• Comparing the efficiency of each coding method
• Reporting
  - Interim report preparation
  - Final report preparation
• Viva
4. Dependencies
The software needed includes MATLAB 6.5 and Visual C++ 6.0.
5. Risk Management
• Analysing information from the Internet. Dealing with so much material is important, because some of it is worthless and some is valuable.
• Recording the process. Because this is a project, a lot of data must be kept and cannot be lost. It is better to save each day's work in a separate document so that the data can be found easily.
• Efficiency. The efficiency of the programming will affect the progress of the project.
• There may be problems with the images such that the results cannot be displayed.
• The most difficult part is the design and flowchart of the software.
• Feedback/development. It takes more time to consider what other methods can perform the function and which is better, and at the same time to find more ways to optimize the project.
6. Deliverables
The deliverables are the following two items:
Item                                  Due date
1. Interim report                     28 Nov 2008
2. Final report (2 copies and CD)     27 Apr 2009
Appendix B Gantt chart