Automatic Face Recognition System using P-tree and K-Nearest Neighbor Classifier
Mohammad Kabir Hossain, Abu Ahmed Sayeem Reaz, Rajibul Alam, and Dr.William Perrizo* Department of Computer Science and Engineering, North South University, Dhaka 1217 *Department of Computer Science, North Dakota State University, Fargo, ND 58105, USA Emails:
[email protected],
[email protected],
[email protected],
[email protected]
Abstract: Face recognition has recently received remarkable attention in both authentication and identification systems due to high acceptability and collectability, regardless its lower circumvention and uniqueness than other biometric verification technologies. The basic approach with face recognition commences with feature set construction from the relevant facial traits of the users, termed enrollment [1]. When a user is to be authenticated (i.e. the user's identity is to be verified), his/her facial sample is captured and a feature set is created. This feature set is then compared with the enrollment feature set. But feature set search mechanism is time consuming and sometimes exhaustive. In this paper, a very efficient and time saving search mechanism is proposed that exploits the advantages of Peano Count Tree and K- Nearest Neighbor Search techniques. Keywords: Verification, Identification, Feature set, Image processing, P-tree, bSQ format, Quadrant ID, Euclidean similarity function, Minkwoski similarity function, Manhattan distance, K-nearest neighbor classifier. 1. INTRODUCTION While research towards automatic face recognition began in the late 1960's, progress has been slow. Recently there has been renewed interest in the problem due in part to its numerous security applications ranging from identification of suspects in police databases to identity verification at automatic teller machines [1]. Though automatic face recognition systems exhibit higher FAR (False Acceptance Rate), the probability that a sample falsely matches the presented face identification record or feature sets, and FRR (False Rejection Rate), the probability that a sample of the right person is falsely rejected, than other successful biometric systems like fingerprint or iris recognition, it is attractive because of its wide-spread acceptability, universality and easier acquisition. Recognition is a step by step process and quite timeconsuming in case it has to deal with a large problem domain. In applications like surveillance system, airport and banking security, database search time is significantly huge. Thus, development of real-time application is a challenging task. In this paper our prime concern has been reduction in database search time during face matching phase. In order to achieve that objective we have made use of the techniques of peano count tree and k-nearest neighbor algorithm. The rest of
the paper is organized as follows: first a review of the technology behind such an identification system and then the proposed system. 2. RELATED CONCEPTS Before we go through the identifier we need to know the approaches and problem domains involved in such systems. 2.1 Approaches Automatic face recognition divides into roughly two lines of inquiry: feature based approaches which rely on a feature set small in comparison to the number of pixels, and direct image methods which involve no intermediate feature extraction stage. The later method is substantially susceptible to lighting condition. Whereas in principle, feature-based schemes can be made invariant to scale, rotation and/or illumination variations and thus we are interested in them. 2.2 Problem Domains Face recognition system serves two problem domainsverification and identification. When a user is to be authenticated (i.e. the user's identity is to be verified), samples will be captured from the device and again a feature set is created. This feature set is then compared with the enrollment feature set. If the resulting similarity value is above a predefined threshold, the user is considered to be authenticated. In contrast to the verification use case, with identification the (claimed) identity of the user is not known in advance, but shall be determined based on sample images of the user's face and a set (population) of feature sets with known identities. The identification system takes some samples of the user's face, generates a feature set from them and compares this feature set with each element of this population. The elements yielding the highest comparison values above a certain confidence threshold are candidates for the identity wanted. In this paper, we proposed a system that can serve both purposes. 3. PREVIEWS OF FACE IDENTIFIER There are three main parts of the system. They are, Image processing, Recognition algorithm, and Database searching mechanism.
3.1 Image Processing: Image capture: We can use digital camera to take an individual’s photo. In surveillance system video camera continuously takes photos. The video signal is converted to a series of digital pictures. Face finding: The first problem to be solved before attempting face recognition is to locate the face in the image. A number of algorithms are available for the task. Many authors [2,3,4] have discussed methods of segmenting skin-tone regions from images, particularly for finding faces and hands. There are also several algorithms for face detection based on the intensity pattern of the image. 3.2 Recognition Algorithm: Two steps are involved: Feature extraction and feature matching. 1) Feature extraction: It involves extracting specific facial traits (e.g., relative position of nose, eyes, chinbones, chin, etc.) and then forming a tuple or feature set of those geometric attributes. We might use distances of facial traits instead of positions as features. Also discrete cosine transform coefficients of an image can be used as features. Whatever the value of the features might be we are getting a feature set which is important for the proposed system. 2) Feature matching: The feature set is then matched with the feature sets of faces stored in the database. 3.3 Database Searching: A huge number of feature sets are stored in the database. Database search should not be exhaustive. Efficient search algorithm limits database search space. 4. PROPOSED FACE RECOGNITION SYSTEM 4.1 Introducing P-tree: We’ll try to explain p-tree, based on how it organizes spatial data. A P-tree is a quadrant-based tree. It recursively divides an entire image into quadrants and records the count of 1-bits for each quadrant, thus forming a quadrant count tree. P-trees are somewhat similar in construction to other data structures in the literature (e.g. Quadtrees [5] and HHCodes [6] ).
In this example, 55 is the number of 1's in the entire image. This root level is labeled level 0. The numbers 16, 8, 15, and 16 found at next level (level 1) are the 1bit count for the four major quadrants in raster order, or Z order (upper left, upper right, lower left and lower right). Since the first and last level-1 quadrants are composed entirely of 1-bits (called pure-1 quadrants), sub-trees are not needed, and these branches terminate. Similarly, quadrants composed entirely of 0-bits are called pure-0 quadrants, which also cause termination of tree branches. This pattern is continued recursively using Peano, or Z-ordering (recursive raster ordering), of the four sub-quadrants at each new level. Eventually, every branch terminates (since, at the "leaf" level, all quadrants are pure). If we were to expand all sub-trees, including those for pure quadrants, then the leaf sequence would be the Peano-ordering of the image. Thus we use the name Peano Count Tree [7]. A spatial image can be viewed as a 2-dimensional array of pixels. Associated with each pixel are various descriptive attributes, called “bands”. For example, BMP image has 3 bands- namely, Blue, Green and Red. An image can also be viewed as a relational table where each pixel is a tuple and each band is an attribute. The primary key can be expressed as x-y coordinates. 4.2 Bit Sequential Format: Data in relational table must be mapped to bit Sequencial (bSQ) format for p-tree generation. Suppose in spatial data a reflectance value in a band is a number in the range 0-255 and is represented by a byte. We split each band into eight separate files, one for each bit position. In figure 2 we give a very simple illustrative example of bSQ format with only two bands in a scene having only four pixels (two rows and two columns). P-trees are basically z-order-run-length-compressed, representations of each bSQ file. So, we store the database not in relational tables but in bSQ files.
For example, given an 8-row-8-column image of single bits, its P-tree is as shown in Figure 1. Figure 2. bSQ formats for a two-band 2x2 image 4.3 Basic P-trees:
Figure 1. 8x8 image and its p-tree
We reorganize each bit file of the bSQ format into a basic p-tree. The definition of basic P-tree has been given in [7] as written below:
Definition 1: A basic P-tree Pi,j is a P-tree for the jth bit of the ith band i. 4.4 Logical Operations on P-tree The basic P-trees defined in Section 4.3 can be combined using simple logical operations (AND, OR and COMPLEMENT) to produce P-trees for the original values at any level of precision.
simplicity we wrote Qid as 132 instead of 1.3.2 in figure 3[8]. 4.7 Matching Feature Set: Each feature set can be viewed as a tuple in a relational table where each feature represents a particular attribute. This is just like pixel values of an image where a feature corresponds to a band of the image. Each tuple might have index of form (x, y).
Definition 3: A tuple P-tree P (v1, v2, …, vn), is the Ptree of value vi at band i, for all i from 1 to n [6]. We have, P (v1,v2, …, vn) = P1(v1) AND P2(v2) AND …AND Pn(vn)
Let S = B1 x B2 x … x Bd be a d-dimensional feature space and B1, B2, …, Bd are the dimensions or features of S. We consider each tuple of the table as ddimensional vector v =
. Enrolled feature vectors would correspond to a myriad number of points in S for a large database. During authentication a ddimensional feature vector of the face to be authenticated is created the enrolled feature sets are searched for a match. A distance metric is required to measure the closeness of vectors. Various distance are available. For two data points, X = <x1, x2, x3, …, xn-1> and Y = , the Euclidean similarity function is defined as [8]. In this paper a new matching mechanism has been proposed where k-nearest neighbors are searched that might use any of the distance metrics mentioned above.
4.6 Root-Count and Quadrant ID of a P-tree
Example:
The total number of 1’s in a P-tree is called the root count. The root count of a p-tree indicates the total number of 1’s in the image from where the P-tree was built. We will assume P-trees are coded in a depth-first ordering of the paths to each 1-quadrant. We use a hierarchical quadrant id scheme to identify quadrants as delineated in figure 3. At each level we append a subquadrant id number (0 means upper left, 1 means upper right, 2 means lower left, and 3 means lower right).
This example has been taken from [10]. The following relation table contains 4 features of 4-bit data values.
4.5 Value and Tuple P-trees Definition 2: A value P-tree Pi (v) is the P-tree if value v at band i. Value v can be expressed in 1-bit upto 8-bit precision [7]. Value p-trees can be constructed by ANDing basic Ptrees or their complements. For example, Pi (110) = Pi,1 AND Pi,2 AND Pi,3’
Figure 3. Quadrant id (Qid) For a spatial data set with 2n-row and 2n-column, there is a mapping from raster coordinates (x,y) to Peano coordinates (called quadrant id or Qid). If x and y are expressed as n-bit strings, x1x2…xn and y1 y2…yn then the mapping is (x,y) = (x1x2…xn, y1 y2…yn) → (x1 y1.x2 y2… .xnyn). Thus in an 8 by 8 image, the pixel at (3,6) = (011,110) has quadrant id 01.11.10 = (1.3.2). For
Table 1. Feature sets in relational table Index X Y 0,0 0,1 0,2 0,3 1,0 1,1 1,2 1,3 2,0 2,1 2,2 2,3 3,0 3,1 3,2 3,3
Features B1 0011 0011 0111 0111 0011 0011 0111 0111 0010 0010 1010 1111 0010 1010 1111 1111
B2 0111 0011 0011 0010 0111 0011 0011 0010 1011 1011 1010 1010 1011 1011 1010 1010
B3 1000 1000 0100 0101 1000 1000 0100 0101 1000 1000 0100 0100 1000 1000 0100 0100
B4 1011 1111 1011 1011 1011 1011 1011 1011 1111 1111 1011 1011 1111 1111 1011 1011
This dataset is converted in bSQ format. The feature-1 bit-bands would be like below if we display the bSQ format in 2- dimension:
B11 0000 0000 0011 0111
B12 0011 0011 0001 0011
B13 1111 1111 1111 1111
B14 1111 1111 0001 0011
There is a constraint on bSQ formation. P-tree requires the number of rows and columns in bSQ file be multiple of four. In case, the number of tuples in database is not a multiple of four we need to pad zero vectors at the end; so that it is. Now, for feature-1, basic p-trees are as follows (tree pointers are omitted). P1,1 5 0014 0001
P1,2 7 0403 0111
P1,3 16
P1,4 11 4403 0111
Value P-trees can be constructed from basic p-trees applying a series of AND and COMPLEMENT operations. Value p-tree construction process of P1,0011 has been shown below as an example: P1,0011 = P1,1’ AND P1,2’ AND P1,3 AND P1,4 4 11 9 16 11 4000 4430 4041 4403 1110 1000 0111 From the value p-trees, we can generate tuple p-trees. As an example tuple p-tree of tuple 0010,1011,1000,1111 is: 3 0
0
3 1
1
0 1
0
Figure 4. Tuple p-tree P0010,1011,1000,1111 Here 3 is the root count indicating the relational table has three such tuples. The three leaves with value 1 indicate the quadrants where these tuples are residing. These quadrants can be trivially mapped to indexes of the relational table. In the example, the three leaves with value 1 has Qids 2.0, 2.1 and 2.2 as read from the left to right. The three Qid’s are mapped such way: 2.0 = 10.00 → 10,00 = (2,0) 2.1 = 10.01 → 10,01 = (2,1) 2.2 = 10.10 → 11,00 = (3,0) Based on this idea, to identify a person, the recognition system works as follows: 1) Basic p-trees of the enrolled feature sets are constructed as preprocessing. This is a one-time job.
2) Feature vector of the person to be recognized is extracted. 3) A tuple p-tree of this feature vector is generated based on the basic p-trees. 4) If the root count of this tuple p-tree is more than zero, then we have one or more matches. The indexes of those matched tuples at the relational database can be figured out using Qid to (x,y) co-ordinate conversion technique discussed above. This is a nice way of feature matching, as it doesn’t need to scan the database at all. A few algebraic operations perform the job. Besides, the AND operation which produce the tuple p-trees is fast [11] Thus, we have a framework for real-time recognition system. 4.8 Some Improvements: Feature extraction method might introduce some distortion or addition of noise, which is unavoidable. In such cases, some disparity between extracted and enrolled feature vectors of the same person might occur. The above-mentioned system expects to match feature vectors exactly. If an exact match is not found it fails to recognize. So, we assume the feature extraction process to be perfect or propose some improvement. If root count of the tuple p-tree for target feature vector is zero we know there is no exact match. But it is possible to know what are the nearest matches. We propose K-nearest neighbor classifier to achieve that objective, as it is simple and effective [13]. Given an unknown sample, a k-nearest neighbor classifier searches the pattern space for the k training samples that are closest to the unknown sample. These k training samples are the k “nearest neighbors” of the unknown sample [12]. “Closeness” is defined in terms of Euclidean distance, where Euclidean distance between two feature vectors, X = (x1, x2, …, xn) and Y = (y1, y2, …, yn) is n −1
d 2( X , Y ) =
( xi − yi ) 2
i =1 . It can be generalized to Minkowski similarity function,
dq ( X ,Y ) = q
n −1 i =1
wi | xi − yi |q
Manhattan distance, which is
d1 ( X , Y ) =
n −1 i =1
| xi − yi |
. If q = 1, it gives the
. Any selected distance metric can be used to find the knearest neighbors. More on how lazy classifier works using p-trees is given in [13].
5. CONCLUSION In this paper, we have tried to show, how we can apply p-tree concept in face recognition. Though, framework for a whole system has been developed in this paper, we concentrated only on database search mechanism. The proposed mechanism can be applied not only in face but also in other recognition system successfully for realtime system development. REFERENCES [1] “User guide -face recognition concepts “ Available at: http://www.cognitecsystems.de /documentation/frsdk_public/u_biometrics.html [2] Ingemar L. Cox, Joumana Ghosn, Peter N. Yianilos “Feature-based face recognition system using mixturedistance”, International Conference on Computer Vision and Pattern recognition, IEEE, pp. 209-216, 1996. [3] Jean Christophe Terrillon, Martin David, and Shigeru Akamatsu, “Automatic detection of human faces in natural scene image by use of a skin color model and of invariant moments”, in International Conference on Face and Gesture Recognition, number-3, pp. 112-117. IEEE, April 1998. [4] T. Darrell, G. Gordon, J. Woodfill, and M. Harville, “A virtual mirror interface using real-time robust face tracking”, in International Conference on Face and Gesture Recognition, number-3, pp. 616-621. IEEE, April 1998. [5] H. Samet, “Quadtree and related hierarchical data structure”, ACM Computing Surveys, 16(2): 187-260, June 1984. [6] HH-codes. Available at http://www.statkart.no/nlhdb/iveher/hhtext.htm, 03.10.2000 [7] Qin Ding, Maleq Khan, Amalendu Roy and William Perizzo, “The p-tree algebra”, Proceedings of ACM Symposium on applied Computing (SAC’02), Madrid, Spain, March 2002, pp. 426-431. [8] William Perrizo, Qin Ding, and Anne Denton, “Lazy classifiers using p-trees”, Proceedings of 15th International Conference on Computer Applications in Industry and Engineering (CAINE'02), San Diego, CA, Nov. 2002, pp. 176-179 [9] M. K. Hossain and W. Perrizo, “Automatic fingerprint identification system using p-tree” [10] Qiang ding, William Perrizo, “Cluster analysis of spatial data using peano count tree”, Proceedings of CATA2002, San Francisco, USA, April 4-6, 2002.
[11] “Association rule mining on remotely sensed images using peano count trees,” Pacific-Asia Conference on Knowledge Discovery and Data Mining, pp. 66-79, 2002. [12] J. Han and M. Kamber, “Data mining: concepts and techniques”,Ed., San Francisco, USA: Morgan Kaufmann Publishers, 2001, pp. 314-315. [13] W. Perrizo, Q. Ding, and A. Denton, “Lazy classifiers using p-trees”. Available at: "http://www.citeseer.nj.nec.com/perrizo02lazy.html"