An Approach to Content-Based Image Retrieval using Clustering and Space Transformation

Biren N. Shah
University of Louisiana at Lafayette
CACS, P. O. Box 44330, Lafayette, LA 70504, USA
1-337-482-5243
[email protected]

Vijay V. Raghavan
University of Louisiana at Lafayette
CACS, P. O. Box 44330, Lafayette, LA 70504, USA
1-337-482-6603
[email protected]

ABSTRACT
In this paper, we propose a new approach to content-based image retrieval that uses space transformation methods to transform the original low-level image space into a reduced-dimension, high-level vector space that enables efficient query processing. Our retrieval system consists of two phases: database population and image retrieval. Our approach is generic, as it can be applied to many other application domains in addition to image retrieval. It is also efficient, since we use the inexpensive "estimated" distance, as opposed to the computationally expensive "real" distance, to retrieve the relevant results for a given query image. Our results show that the space transformation methods we compared preserve either the distances or the order of the distances in the transformed space.

Keywords: image retrieval, color, clustering, generic, automatic, efficient, space transformation.

1. INTRODUCTION
Interest in applications involving the search and management of digital images has increased tremendously over the last few years. Image databases exist for storing art collections, satellite images, medical images and images from many other applications. Image databases can be huge, containing hundreds of thousands of images. However, it is not possible to access or make use of this information unless it is organized to allow efficient browsing and retrieval. Content-Based Image Retrieval (CBIR) systems are required to effectively and efficiently access images using information contained in image databases. A CBIR system uses information from the content of images for retrieval and helps the user retrieve database images relevant to the contents of a query image [3]. A generic system is defined as one where the processing steps remain more or less the same (or, standardized) for different choices of image contents. An automatic system requires no manual intervention, while a semi-automatic system may require limited manual intervention. Thus, approaches to CBIR can be semi-automatic and non-generic, semi-automatic and generic, automatic and non-generic, or automatic and generic [3]. The selection of an approach to CBIR is influenced by the image features to be extracted, the level of abstraction to be revealed in the features, and the extent of desired domain independence. The use of low-level (measurable, automatically extractable) features, for example color, texture or shape, in image retrieval makes the approach automatic but not necessarily efficient. CBIR systems that make use of only high-level (composite) or semantic features are domain dependent. Inter-image distances computed using the low-level features are called "real" inter-image distances; inter-image distances computed using the high-level features are called "estimated" inter-image distances. There is an urgent need for an efficient and effective image retrieval technique that not only reduces the complexity associated with computing the inter-image distance using low-level features, but also aids in making the system generic and automatic. The goal is to develop an efficient, generic and fully automated CBIR system.

The image retrieval problem can be defined as follows. Let there be an image database, populated with images O_0, O_1, O_2, ..., O_n. Let Q denote a query image, and let Π denote the real inter-image distance function; the real inter-image distance between two image objects O_i and O_j is denoted by Π(O_i, O_j). The goal is to efficiently and effectively retrieve the best q (q << n) images from the image database.
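The problem statement above can be restated as a short sketch (the function name `pi` and the toy data are ours, not the paper's): brute-force retrieval ranks every database image by its real distance to the query, which is exactly the cost the rest of the paper works to avoid.

```python
# Brute-force content-based retrieval: rank all database objects by a
# hypothetical real inter-image distance function `pi` (the paper's Π)
# and return the q closest objects.
def retrieve(query, database, pi, q):
    ranked = sorted(database, key=lambda obj: pi(query, obj))
    return ranked[:q]

# toy usage with numbers standing in for images and |a - b| for pi
images = [3, 41, 7, 19, 2, 30]
print(retrieve(5, images, lambda a, b: abs(a - b), 3))  # -> [3, 7, 2]
```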

2. BACKGROUND
A review of the literature [2, 5] shows that the use of "real" distances in image retrieval problems is computationally expensive. Chang et al. [2] used color as a discriminating feature for image retrieval. For a given query, the images in the image database are ranked on the basis of the real distance between the query image and the images in the database. When color is used as the feature, the computation of real distances is expensive, since that process should account for the effects of color correlations. Faloutsos et al. [5], while also using low-level features such as color, have tried to make retrieval more efficient by using an estimated distance instead of the real distance. They reduce the search time by eliminating many images from consideration, using an estimated distance chosen so that the lower bound lemma is satisfied. This lemma does not allow the elimination of relevant images, but may retain many non-relevant images. They then perform the search on the basis of real distance in the resulting smaller search space.

CBIR is a type of pattern recognition problem. Goldfarb [6] proposed new ideas that use space transformation to efficiently solve pattern recognition problems. Space transformation methods provide a way of mapping original samples into an efficient and effective domain-independent low-dimensional vector space, preserving the discriminatory information contained in the original sample representation. They also allow the utilization of the analytical tools available in a vector space. Let {v_i}, 0 <= i <= k, be the set of high-level feature vectors of the objects (images) in the transformed space R(n+, n-), where (n+, n-) is the vector signature and d_ij is the real inter-object (e.g., inter-image) distance in the original space. Goldfarb [6] proposed the following two methods to represent the original visual content in a completely new vector space:
1. First Embedding Method: assumes that v_0 = 0 and ||v_i - v_j||^2 = d_ij^2 for all i, j.
2. Main Embedding Method: assumes that the mean vector of a finite pseudo-metric space with respect to the transformation function is the zero vector.
According to Goldfarb's results, distance preservation is guaranteed as long as the real distance function is pseudo-metric. The advantage of the first embedding method, which allows incremental addition of new objects, is not exploited by the proposed approach. According to Zhao [12], the pseudo-metric space used in Goldfarb's methods is more general, although some of the useful properties of a vector space are lost. To overcome that, he proposed another domain-independent space transformation approach called the Similarity-preserving Embedding Method. His method defines similarity in such a way that the ordering of the distances is preserved, i.e., for any distance function, if two objects are closer, then their distance will be smaller and vice versa. The method makes no assumption about the original space. This relaxation of the requirement that distances be preserved makes it possible to map the original space into a normed vector space.
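To make the first embedding method concrete, here is an illustrative NumPy sketch (our own construction, not code from [6]): fixing v_0 = 0 determines every inner product from the distances alone, and an eigendecomposition of the resulting Gram matrix yields vectors whose Euclidean distances reproduce d_ij whenever the original distances are Euclidean-embeddable.

```python
import numpy as np

# Sketch of Goldfarb's first embedding method: given a (pseudo-)metric
# distance matrix D over objects O_0..O_k, fix v_0 = 0 and recover
# vectors with ||v_i - v_j|| = D[i, j].
def first_embedding(D):
    # With v_0 = 0, the inner products are determined by the distances:
    # <v_i, v_j> = (D[i,0]^2 + D[j,0]^2 - D[i,j]^2) / 2
    G = 0.5 * (D[:, [0]] ** 2 + D[[0], :] ** 2 - D ** 2)
    w, U = np.linalg.eigh(G)
    w = np.clip(w, 0.0, None)        # discard numerical-noise negatives
    V = U * np.sqrt(w)               # rows are the high-level vectors
    order = np.argsort(w)[::-1]
    return V[:, order]               # most influential components first

# demo: four 2-D points whose pairwise distances we try to reproduce
P = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 1.0]])
D = np.linalg.norm(P[:, None] - P[None, :], axis=-1)
V = first_embedding(D)
D_hat = np.linalg.norm(V[:, None] - V[None, :], axis=-1)
print(np.allclose(D, D_hat))         # distances are preserved
```

Ordering the components by eigenvalue mirrors the paper's later thresholding step: the "least influential" components are exactly the trailing columns.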
Choubey and Raghavan [3] used the space transformation ideas proposed by Goldfarb [6] to develop a generic and fully automatic CBIR architecture, and were successful in developing an efficient image retrieval system. Their feature space transformation results showed that the dimension of the feature space could be substantially reduced using a threshold value, but with some compromise in distance preservation. Consequently, the retrieval is an approximation of the entire process, and there is a tradeoff between efficiency and effectiveness. Thus, a need exists to investigate this area further and explore its new dimensions.

Our approach to solving the image retrieval problem involves using the space transformation methods proposed by Goldfarb [6] and Zhao [12]. Each image is ultimately represented by a high-level feature vector, and inter-image comparisons are made using the high-level feature vectors of the transformed space. We have used clustering techniques to speed up the search process in the transformed vector space. Our estimated distance is more accurate than the estimated distance used by Faloutsos et al. [5], in the sense that the ordering of images corresponding to the real distance is better preserved by the proposed method for estimating distance. Our approach is applicable to the following two cases:
1. The original feature space represents images as feature vectors.
2. Unlike multi-dimensional scaling, only proximity data (e.g., inter-object distance data) needs to be given, without the corresponding feature vectors.

The potential utility of our approach is not confined to image retrieval.

3. PROPOSED ARCHITECTURE
Our approach requires the selection of a subset of images that will form the initial set. The initial set represents images from all possible classes. For that, the entire collection of images is divided into a number of image clusters, such that images that are quite similar to each other belong to the same cluster. Relative to the number of methods [9, 10, 11] available for the clustering problem in vector space, far fewer methods have been developed for proximity data sets. In our experiments, we have used one such method, the maxmin [10] image-clustering algorithm, to divide the entire image set into a number of image clusters. The algorithm uses a threshold coefficient, which represents the minimum separation needed between cluster centroids, to vary the number of resultant clusters. Since the initial set is composed of the centroid images from all the clusters, its size is the same as the number of clusters.

Our retrieval system involves the following two phases:
1. Database Population (Figure 1)
2. Image Retrieval (Figure 2)
The Database Population Phase (Figure 1) uses low-level image properties, such as color, for computing the real inter-image distances to the images in the initial set. The size of the initial set is much smaller than that of the image database itself. Starting from the real inter-image distances, the high-level image feature vectors for these images are computed using any of the space transformation algorithms. These feature vectors can have significantly fewer features compared to a representation of images based on low-level image properties. The extent to which the real inter-image distances among the images are preserved in the transformed feature space can be controlled. That is, components that have a negligible effect (close to zero) on the overall representation of the vector can be discarded to further reduce the dimension of the feature space and hence the complexity of distance computation during retrieval. Discarded components may, however, affect the integrity of the vector space, so the threshold used for this purpose needs to be fine-tuned. The construction of the transformed space is done once for each of the image clusters and once for the initial set. Thus, in addition to the high-level feature space of the initial set, there will be a high-level feature space for each image cluster, i.e., there will be n+1 feature spaces in total, where n is the number of image clusters.
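The maxmin clustering step used to build the initial set can be sketched as follows; this is our reading of the Tou and Gonzalez [10] algorithm, adapted to proximity data so that only a distance matrix is required, with parameter names of our own choosing.

```python
import numpy as np

# Sketch of maxmin clustering over proximity data: D is the full real
# inter-image distance matrix, and `theta` plays the role of the
# threshold coefficient controlling the number of resultant clusters.
def maxmin_cluster(D, theta=0.5):
    n = D.shape[0]
    centers = [0]                          # seed with an arbitrary image
    while True:
        # for each image, distance to its closest current center
        d_min = D[:, centers].min(axis=1)
        cand = int(d_min.argmax())         # image farthest from all centers
        # typical separation among the current centers
        if len(centers) == 1:
            typical = D[centers[0]].max()
        else:
            pairs = D[np.ix_(centers, centers)]
            typical = pairs.sum() / (len(centers) * (len(centers) - 1))
        if d_min[cand] > theta * typical:  # far enough -> new centroid
            centers.append(cand)
        else:
            break
        if len(centers) == n:
            break
    labels = D[:, centers].argmin(axis=1)  # assign to nearest centroid image
    return centers, labels
```

Raising `theta` demands more separation between centroids and therefore yields fewer clusters, which is how the size of the initial set is controlled.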

Figure 1. Database Population Phase. (Block diagram: image data from an image cluster or the initial set flows through Image Processing and Low-level Feature Extraction; the low-level features feed Real Inter-image Distance Computation; the distance data pass through a Threshold Unit into High-level Feature Extraction, and the reduced-dimension high-level features are stored in the Image Database.)

For a given query image (assumed to be one of the database images), its high-level feature vector is directly obtained from the image database. Image Retrieval (Figure 2) is performed using the estimated distance as an approximation to the corresponding real distance. First, the estimated distance of the query image to each of the images in the initial set is found using the high-level feature vectors. Since similar images belong to the same cluster, a second-level search is done for each cluster corresponding to the relevant images of the initial set, and the estimated distance of the query image to each of the images in the selected image clusters is found. The retrieval results are ranked in order of their estimated distances.

Figure 2. Image Retrieval Phase. (Block diagram: the query's high-level feature vector, obtained via High-level Feature Extraction from the Image Database, drives Estimated Inter-image Distance Computation against the initial set and against each selected cluster i, ..., j; the per-cluster retrieval results are merged and ranked to produce the final retrieval results.)

4. EXPERIMENTS AND RESULTS
A detailed set of experiments was carried out to measure the retrieval efficiency and effectiveness of our proposed approach using estimated distance, and to see the effect of the threshold value on the degree of distance preservation and order preservation for all the image pairs in the initial set. To achieve these goals, three sets of experiments were conducted, using images from the Department of Water Resources (DWR) repository maintained at the University of California, Berkeley [4], to test, respectively, the distance preservation property, the order preservation property, and the retrieval effectiveness of our new proposed approach. Experiments were performed with two image data sets: one with 32 images (DWR32) and the other with 75 images (DWR75). We have used color as a feature to illustrate the principles behind the proposed scheme; the scheme can easily be extended to other image features, such as texture and shape. Images used in the experiments used the RGB color space and were all quantized to 64 color bins using the median cut [8] quantization algorithm for extracting the low-level color features.

We used different measures for evaluating the overall performance of the system by measuring how closely the ranking of images generated by estimated distances matches that of the real distances. Kendall's Tau [7] was used to measure order preservation by comparing the order of the real inter-image distances and the estimated inter-image distances for all the image pairs in the initial set. A value of 1 implies that the degree of association between the two orderings is at its positive extreme, and a value of -1 implies that the strength of association is the worst possible. The L2-Norm was used to check the degree of inter-image distance preservation. A value of 0 implies 100% inter-image distance preservation; as the value increases, distance preservation gets poorer. Rnorm [1] was used to measure the retrieval effectiveness. The Rnorm value will be 1 if the rankings generated by using the real inter-image distance exactly conform to the rankings generated by using the estimated inter-image distance, and 0 if the rankings generated by using the real inter-image distance are exactly inverse of the rankings generated by using the estimated inter-image distance.

Each set of experiments was carried out with different threshold values, which determine the number of high-level image features to be retained. The experiments that used images from DWR32 were performed with two different threshold values (T1 and T2) in addition to using no threshold: T1 indicates that one component (the least influential) of the feature vector was discarded, and T2 indicates that the three least influential components were discarded. The experiments that used images from DWR75 were performed with just one threshold value (T1), besides using no threshold; here T1 indicates that the two least influential components were discarded. Our experimental study evaluates the proposed approach only up to the level of the initial set. This allows us to evaluate our experimental goals relative to the ranking of cluster centers, under the assumption that the query is one of the images in the initial set.

Table 1. Kendall's Tau Values – Order Preservation Results of Experiment I (First Embedding Method)

  DWR32
  Initial Set Size   No Threshold   T1       T2
  8                  1.0000         0.9947   0.6772
  14                 1.0000         0.9912   0.9771
  19                 1.0000         0.9952   0.9875

  DWR75
  Initial Set Size   No Threshold   T1
  20                 1.0000         0.9912
  34                 1.0000         0.9986
  42                 1.0000         0.9988

Table 2. Kendall's Tau Values – Order Preservation Results of Experiment I (Main Embedding Method)

  DWR32
  Initial Set Size   No Threshold   T1       T2
  8                  1.0000         1.0000   0.5661
  14                 1.0000         0.9912   0.9775
  19                 1.0000         0.9955   0.9901

  DWR75
  Initial Set Size   No Threshold   T1
  20                 1.0000         0.9961
  34                 1.0000         0.9987
  42                 1.0000         0.9995

Table 3. Kendall's Tau Values – Order Preservation Results of Experiment I (Similarity-preserving Embedding Method)

  DWR32
  Initial Set Size   No Threshold   T1       T2
  8                  1.0000         0.4180   0.4339
  14                 1.0000         0.2830   0.2850
  19                 1.0000         0.2178   0.2652

  DWR75
  Initial Set Size   No Threshold   T1
  20                 1.0000         0.2207
  34                 1.0000         0.1134
  42                 1.0000         0.0446
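For reference, the low-level color feature pipeline underlying the "real" distances in these experiments can be sketched as follows. Note the substitutions: uniform 4x4x4 RGB quantization stands in for the adaptive median cut [8] algorithm purely to keep the illustration short, and the L2 histogram distance shown is one plausible choice of real distance, not necessarily the paper's exact formulation.

```python
import numpy as np

# Low-level color feature sketch: quantize an RGB image to 64 color
# bins and describe it by a normalized 64-bin color histogram.
def color_histogram(img):                 # img: HxWx3 uint8 array
    levels = (img // 64).astype(int)      # 4 levels per channel -> 64 bins
    bins = levels[..., 0] * 16 + levels[..., 1] * 4 + levels[..., 2]
    hist = np.bincount(bins.ravel(), minlength=64).astype(float)
    return hist / hist.sum()              # normalize by image size

# One simple stand-in for the "real" inter-image distance.
def real_distance(img_a, img_b):
    return float(np.linalg.norm(color_histogram(img_a) - color_histogram(img_b)))

rng = np.random.default_rng(1)
a = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
print(round(real_distance(a, a), 6))      # identical images -> 0.0
```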

4.1 Experiment I (Order Preservation)
This experiment was performed to check the order preservation property of the images in the initial set for each of the space transformation methods. The results are summarized in Tables 1, 2 and 3. They show that order preservation is 100% when no threshold value is used, i.e., when no feature component in the transformed space is discarded. The order preservation for the first and main embedding space transformation methods is quite acceptable with threshold values T1 and T2. The order preservation for the similarity-preserving method is poor with thresholds T1 and T2, which proves that it does not support reducing the dimension of the feature space. We also see that order preservation increases as the size of the initial set increases, although the difference is quite negligible.
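The order preservation measure used above can be sketched as a plain tau-a computation over the two lists of pairwise distances (our own minimal implementation, with no tie correction), which suffices to reproduce the two extremes quoted in the text.

```python
from itertools import combinations

# Kendall's Tau (tau-a) between real and estimated inter-image
# distances, taken over all pairs of list positions.
def kendall_tau(real, est):
    pairs = list(combinations(range(len(real)), 2))
    concordant = discordant = 0
    for a, b in pairs:
        s = (real[a] - real[b]) * (est[a] - est[b])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)

real = [0.10, 0.40, 0.25, 0.90]           # real inter-image distances
print(kendall_tau(real, real))            # identical ordering -> 1.0
print(kendall_tau(real, real[::-1]))      # reversed ordering -> -1.0
```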

4.2 Experiment II (Distance Preservation)
This experiment was performed to check the distance preservation property of the images in the initial set for the first and main space transformation methods. The results are summarized in Tables 4 and 5. They show that distance preservation is 100% when no threshold value is used, i.e., when no feature component in the transformed space is discarded. When compared with the real distance values, distance preservation is quite acceptable with threshold values T1 and T2. Since the similarity-preserving embedding method does not preserve distance [12], experiments on it were not performed.
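The effect of the threshold on this measure can be illustrated with synthetic vectors (the data and the variance-based component ranking below are our own stand-ins for the paper's transformed DWR features): discarding components can only shrink the estimated distances relative to the real ones, so the L2-Norm grows from 0 as more components are dropped.

```python
import numpy as np

# Illustration of Experiment II's measure: the L2-Norm between the real
# and estimated inter-image distance matrices as the least-influential
# components of the high-level vectors are discarded.
rng = np.random.default_rng(0)
V = rng.normal(size=(8, 5)) * np.array([3.0, 2.0, 1.0, 0.5, 0.1])

def dist_matrix(X):
    return np.linalg.norm(X[:, None] - X[None, :], axis=-1)

D_real = dist_matrix(V)
rank = np.argsort(V.var(axis=0))[::-1]        # most influential first
for kept in (5, 4, 2):                        # tighter threshold -> fewer dims
    D_est = dist_matrix(V[:, rank[:kept]])
    l2 = float(np.linalg.norm(D_real - D_est))
    print(kept, "components kept, L2-Norm =", round(l2, 4))
```

With all five components kept the norm is 0 (100% distance preservation), matching the "No Threshold" columns of Tables 4 and 5.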

Table 4. L2-Norm Values – Distance Preservation Results of Experiment II (First Embedding Method)

  DWR32
  Initial Set Size   No Threshold   T1 (x 10^7)   T2 (x 10^7)
  8                  0.0000         0.7851        8.9549
  14                 0.0000         1.2444        2.2777
  19                 0.0000         1.0502        1.6744

  DWR75
  Initial Set Size   No Threshold   T1 (x 10^7)
  20                 0.0000         1.2378
  34                 0.0000         0.3322
  42                 0.0000         0.3585

Table 5. L2-Norm Values – Distance Preservation Results of Experiment II (Main Embedding Method)

  DWR32
  Initial Set Size   No Threshold   T1 (x 10^7)   T2 (x 10^7)
  8                  0.0000         0.0000        11.426
  14                 0.0000         1.1645        2.5296
  19                 0.0000         0.9708        1.6701

  DWR75
  Initial Set Size   No Threshold   T1 (x 10^7)
  20                 0.0000         0.3949
  34                 0.0000         0.3103
  42                 0.0000         0.1712

4.3 Experiment III (Retrieval Effectiveness)
This experiment was performed to measure the image retrieval efficiency and effectiveness of our approach with each of the space transformation algorithms. The experiment was carried out with six different initial sets, three each from DWR32 and DWR75. Every image in the initial set was used as a query image, and all feature components were retained while performing retrieval. The results showed a consistent Rnorm value of 1. The corresponding tables are omitted, as they can be derived from the results of Experiments I and II.

5. CONCLUSION AND FUTURE WORK
We have proposed an automatic, generic and efficient CBIR architecture. Our experiments show that the space transformation methods investigated preserve either the distances or the order of the distances in the transformed space. The similarity-preserving embedding method does not work well if the dimension of the feature space is reduced. Our results from the experiments that use Rnorm as an evaluation measure show that the image retrieval idea works effectively. Some experiments (not reported) that involve the main and the first embedding methods show that the dimension of the feature space can be substantially reduced while preserving the real inter-image distances.

To address the scalability issue, we are currently performing experiments on larger image databases. Our preliminary results indicate that our approach and the space transformation methods will scale up well. The results reported in this paper use only the images in the initial set to populate the database with their high-level feature vectors, and the retrieval process is limited by the requirement that the query image be one of the images in the initial set. Our approach can, however, be extended to perform a second-level search into the image clusters corresponding to the first few relevant images from the initial set, resolving both of those limitations. We plan to evaluate our approach in these contexts in a future study.

6. REFERENCES
[1] P. Bollmann-Sdorra, V. V. Raghavan, "On the Necessity of Term Dependence in a Query Space for Weighted Retrieval," Journal of the American Society for Information Science, 49 (1998), 1161-1168.
[2] S. F. Chang, A. Eleftheriadis, D. Anastassiou, "Development of Columbia's Video on Demand Testbed," International Journal of Image Communication - Signal Processing, 8 (1996), 191-207.
[3] S. K. Choubey, V. V. Raghavan, "Generic and Fully Automatic Content-Based Image Retrieval Using Color," Pattern Recognition Letters, 18 (1997), 1233-1240.
[4] Department of Water Resources, University of California, Berkeley, http://elib.cs.berkeley.edu/photos/tarlist.txt, accessed September 2, 2002.
[5] C. Faloutsos et al., "Efficient and Effective Querying by Image Content," Journal of Intelligent Information Systems, 3 (1994), 231-262.
[6] L. Goldfarb, "A New Approach to Pattern Recognition," in Progress in Machine Intelligence and Pattern Recognition (L. N. Kanal and A. Rosenfeld, eds.) (Amsterdam, Netherlands: North-Holland, 1985), pp. 1-156.
[7] W. L. Hays, P. L. Winkler, "Nonparametric Methods," in Statistics: Probability, Inference, and Decision, vol. II (New York, NY: Holt, Rinehart and Winston, 1970), pp. 182-264.
[8] P. Heckbert, "Color Image Quantization for Frame-Buffer Display," Computer Graphics, 16 (1982), 297-307.
[9] C. Li, E. Chang, H. Garcia-Molina, G. Wiederhold, "Clustering for Approximate Similarity Search in High-Dimensional Spaces," IEEE Transactions on Knowledge and Data Engineering, 14 (2002), 792-808.
[10] J. T. Tou, R. C. Gonzalez, Pattern Recognition Principles (Boston, MA: Addison-Wesley, 1972).
[11] A. Vellaikal, C.-C. J. Kuo, "Hierarchical Clustering Techniques for Image Database Organization and Summarization," in Proceedings of the SPIE Multimedia Storage and Archiving Systems III, Boston, MA, USA (1998), 68-79.
[12] X. Zhao, "Space Transformation and Clustering Methods for Proximity Data Set," http://www.cacs.louisiana.edu/~bns0742/Zhao/Zhao.doc, accessed September 3, 2002.
