Improved Segmentation through Dynamic Time Warping for Signature Verification using a Neural Network Classifier Wan-Suck Lee, N. Mohankrishnan, & Mark J. Paullik Department of Electrical & Computer Engineering University of Detroit Mercy, Detroit, MI mohank@udmercy. edu
Abstract
stationary characterization of the signature, wherein its spatial evolution was captured by uniformly dividing it into eight even-sized segments, and using an AR model to represent each segment. However, such a scheme does not allow for the natural variations (inconsistencies) in a person’s signature from one sample to the next. In fact, close examination of the incidence of errors revealed that a number of them corresponded to signatures that were inconsistently segmented. It was felt that the incorporation of superior segmentation techniques prior to feature extraction should be an important next step in extending this work. This would make for tighter clustering of similar segments in the feature space and possibly result in enhanced verification performance. The technique of dynamic time warping (D’IW) [5] which provides for a nonlinear alignment of two signatures is very suitable for this purpose and has been employed in this work. This paper thus extends earlier work by presenting new experimental results from a study conducted to evaluate the performance of a neural network classifier. The classifier input consisted of AR coefficients extracted from signature segments obtained through a DTW-based signature segmentation scheme. The remaining sections of the paper are organized as follows. Section 2 details the collection of the signature data base used, Section 3 discusses the preprocessing steps carried out on the signature, while the method used to segment signatures is described in Section 4. The AR modeling of each segment is presented in Section 5 and the neural network architecture used for verification is described in Section 6. Finally, the experimental results are discussed in Section 7.
A segmentation technique based on dynamic time warping is investigated as a means of obtaining improved alignment between multiple signatures from a writer; the segments generated through this approach are represented by an autoregressive model. The signature verification performance is evaluated with a neural-network-based classifier, and shown to be a significant improvement over previous results obtained by the authors that were based on uniform segmentation.
1. Introduction The field of biometric personal identification appears to be on a steady growth path [1,2]. One variant of it, namely signature verification as carried out by human beings, is historically one of the oldest means of identity validation both for the author of a document or the initiator of a financial transaction. Today the electronic collection and storage of signatures at the point of a financial transaction is an increasingly common phenomenon. While the majority of current systems use these signatures for archival purposes, it is conceivable that automated authentication can be incorporated in the near future. In earlier work [3] the authors have developed a nonstationary autoregressive (AR) model representation for the signature, that was used to carry out verification based on conventional distance measures to measure class similarity or dissimilarity. Subsequently, a neural-network-based classifier was incorporated [4] that used these same distances as inputs and yielded average false acceptance (FR) and false rejection (FA) rates of 2.52%and 1.17% respectively. While these results were encouraging, they only evaluated the potential of the neural network classifier for separating clusters established on the basis of distance measures to quantify class separation. A logical follow-up would be to use a neural network classifier directly on the autoregressive parameters (avoiding distances measures entirely). Furthermore, all the earlier work depended on a non-
2. The Data Base The writer population was made up of 16 writers, with each writer providing a tlotal of 150 signatures collected over 15 sessions. The signatures were recorded dynamically using a Wacaim digitizing tablet that provided a uniformly time-spaced (x,y) coordinate sequence of points along the signature at a sampling rate of 205 samples/second and a spatial resolution of 1200 points/inch.
929
0-8186-8821-1198 $10.00 0 1998 IEEE
- t
i
V
Figure 3. Perpendicular distance sequence
4. Segmentation through DTW ~
~
Dynamic time warping is an algorithm which estimates the best geometrical associations between the points of two similar sequences. It does this by evaluating various permitted pairings between the points of the two sequences, and selecting the best alignment path through these points (the allowed pairings form a grid) based on some optimality criteria and search constraints [ 5 ] . A basic version of the DTW algorithm is explained below using Figure 4. In this figure the two axes represent the sequence indices of the signatures. There are many
~~~
Figure 1. Sample digitized signatures In order to avoid the effects of muscle fatigue, no more than 20 signatures were recorded in one sitting. Some sample digitized signatures are shown in Figure 1.
3. Preprocessing In addition to (x,y) coordinates of points traced along the signature, the digitizing tablet also indicates whether each point corresponds to a pendown or penup point. Each signature is considered to be characterized by all the points between the first and last pendowns in the execution of the signature. The (x,y) coordinate sequence corresponding to each signature is converted into a 1-D sequence using the following process. First, the signature contour is resampled using just 256 points that are re-positioned along it at equal-arc-length intervals, so that we have uniform spatial sampling for each signature. The choice of 256 points was based on a power spectral analysis of signature sequences which led to the conclusion that 256 samples were sufficient to represent all salient curve characteristics in the worst case. The best fit straight line is then found for the 256 interpolated samples as shown in Figure 2. Next, a 1-D sequence s(n) is computed by finding the perpendicular distance of each point to the base line resulting in the sequence shown in Figure 3, which was used in subsequent processing.
i
I
Figure 4. Finding the optimal warping function using Dn/v
possible pairings of these indices subject to certain restrictions of monotonicity, continuity, boundary alignment, and search window width, as expressed by the following conditions: a) monotonicity condition
i ( k - 1) I i ( k ) a n d j ( k - 1) I j ( k ) ,
(1)
b) continuity condition
i ( k ) - i ( k - 1) I 1 and j(k) - j ( k - 1) I 1,
930
(2)
c) boundary condition
chosen segmentation points. The optimal warping function that establishes the best correspondence between points of the “misaligned” signature with the “master” signature is indicated. The segmental points of the “misaligned” signature are established by finding points corresponding to the pre-selected segmentation points of the master signature through the optimal warping function. An appropriate master signature is selected for each writer and uniform spatial segmentation is used to define its segment boundaries; these are the sequence points 32, 64,96, etc.
i(1) = 1, j(1) = 1, &
i ( K )= I , j ( K ) = J ,
(3)
d) search window condition
where “r” is the maximum permitted search window width. The cumulative distance g(ij) between the two sequences from the beginning of the signatures to point (ij) is then calculated as
0
256
-
M A
g ( i - 1, j ) + d ( i , j )
S
T E
R
g(i - 1, j - 1) + d ( i , j ) 256
Here d(i,j) is the distance between the ithpoint of the first signature sequence and the jthpoint of the second signature sequence. The initialization of the above recursion is done by
g(1,l) = d(L1).
0
MISALIGNED
Figure 5. Segmentation using the DTW
(6)
The overall distance (also known as distortion) between the two sequences is then given by
5. Feature Extractioin through AR Modeling Second order AR model coefficients are estimated from each segment of the signatures aligned through the DTW procedure described above using the covariance method. The AR model for section ‘i’ may be written as:
The optimal warping function (or path) is then found recursively by starting at point (1,J) and backtracking to the beginning of the signatures. This path is indicated by the dark line in Figure 4. The above description of the algorithm does not include nuances such as choice of the “slope” of the path and a “weighting” function, w(k), which scales d(ij) in equation (5). We have chosen a “ slope” of zero and a “weighting” function given by
w(k)=(i(k)-i(k-l))+(j(k)- j(k-1)).
5 ( n ) = s,; ( n )- ai,
(9)
where ‘p’ denotes the AR imodel order, ai is the local mean, ri(n) is a zero mean sequence obtained by removing ai from si(n),cik are the model coefficients for section ‘i’, and w(n) is a random noise sequence with zero mean and variance.
(8)
For more details on the significance of the above DTW features, readers are referred to [6]. The basic concept of the segmentation technique employing DTW can be understood from Figure 5. Here the vertical and horizontal axes represents the spatial axis of a chosen “master” signature and “misaligned” signature respectively, and black dots on a master signature indicate
6. Verification using,a Neural Network Verification was carried out using a neural network classifier. The topology used was that of a multi-layer
931
perceptron as shown in Figure 6. A single hidden layer made up of 16 neurons was chosen on the basis of trial-anderror runs. The input to the network consists of 16 nodes resulting from the use of 8 signature segments and a 2”d order AR model corresponding to each segment (8 X 2=16). The output layer was made up of 1 neuron corresponding to accepting (target “1 ”) or rejecting (target “0”)the claimed registered writer class.
FROM WRIER”1’
NN(WRITER “I”) 16byl SEGMWTAL
________
mD
WEIGHT AND BIAS
I
CLAIM OF WRITER’I”
Figure 7. Training and testing procedure
7. Results and Discussion
INPUT LAYER
The segmentation performance of uniform spatial and DTW-based methods are presented and compared. The uniform spatial segmentation technique divides the signature into eight even segments, with each segment modeled by an AR model. Examples of these are provided in Figures S(a) and S(b) for two signature samples. In these figures, the black dot denotes the eight points corresponding to the segmentation boundaries of the signatures. As can be seen the segment boundaries are inconsistent between the two signatures.
W
FIRST HIDDEN LAYER
Figure 6. Multilayer perceptron with single hidden layer
The database was divided into three groups (labeled as
A, B, & C) made up of signatures from sessions 1-5,6-10, and 11-15 respectively; each group thus consisted of 800 signatures (50 signaturedwriter X 16 writers). A flow chart explaining the training and testing procedure is shown in Training was carried out using the Figure 7. backpropagation algorithm using 250 signatures per writer. These were made up of 100 authentic signatures and 150 random forgeries (the genuine signatures of other writers) from groups A & B. A total of 16 neural networks are required for the verification task - each writer needs his own network. The training procedure results in a unique set of weights and biases for each writer. The network was subsequently evaluated using 200 test signatures per writer from group C - made up of 50 authentic signatures and 150 random forgeries - that were not utilized in the training process. As shown in Figure 7, the verification process took place as follows. When an identity claim was made, the feature vector of the claimant is input to the neural network corresponding to the claimed identity. If the output of the network is above the threshold of 0.5, the identity claim is accepted, otherwise declared as a forgery. The choice of the threshold determines the relative magnitudes of FA and FR types of errors - it can be altered to tradeoff one for the other depending on the application.
1 0
m
m
m
m
ya)
m
gmo
I”
(a): Uniform spatial segmentation point (AMR 06/01)
1 8
x 10
I
1.m I 1 m
2 m
ym
m
5om
m
7rm
gm
I m 1 m
(b): Uniform spatial segmentation point (AMR 06/02)
932
xm
4
I 2
Table 1. Experimental results
I
-
8
Distances input
I”
t
U am
3x0
segmentation) m 5 x c
eaa
ma
mn
sm
1 -
AR coeff. input to NN (uniform
(c): DTW based segmentation points (AMR 06/02)
Figure 8. Comparison of segmentation results
segmentation)
Now consider the results of DTW-based segmentation using a search window of 20 points. The signatures shown in Figures 8(a) and 8(c) were the “master” and “misaligned” signatures respectively. As can be seen, there is a far superior correlation between the segmentation points of the two signatures as compared to uniform segmentation. A window width of 20 (established on the basis of trial and error runs) was found to work very well in our experiments in aligning genuine signatures over the entire database. The optimal window length is related to the consistency of the writers. Parizeau and Plamondon [7] mentioned that the effect of window length (within reason) is not critical to DTW performance. The verification results obtained are shown in Table 1. In this table the first result is reproduced from earlier work in which conventional distance measures were used as inputs to the neural network for purposes of comparison. The remaining two results pertain to the work discussed in this paper. As can be seen from the first two results (both based on uniform segmentation), the use of AR coefficients as the input to the neural network classifier improves FR error rates from 2.52% to 0.88% and FA error rates from 1.17% to 0.67% respectively. This clearly indicates the advantages of using the AR coefficients directly as a feature set. The last two results while both using AR coefficient inputs to the neural network, are based on uniform and DTW segmentation respectively. When improved alignment between signatures is implemented with the latter technique, the FR and FA error rates decrease even further to 0.25% and 0.25% respectively. It is quite evident that the segmentation technique using DTW has significant benefits.
References [ 11 “Special Issue on Automated Biometric Systems”, Proceeding of the IEEE, September 1997. [2] Benjamin Miller, “Vital Signs of Identity”, IEEE Spectrum, February 1994, pp. 22-30. [3] N. Mohankrishnan,Mark 11. Paulik, and Mohamad Khalil, “On-
Line Signature Verification Using a Nonstationary Autoregressive Model Representation”, Proceedings of the IEEE International Symposium on Circuits and Systems, Chicago, Illinois, May 1993, pp. 2303-2306. [4] N. Mohankrishnan, Wan-Suck Lee, and Mark J. Paulik, “Multi-Layer Neural Network Classification of On-Line Signatures”, Proceedings of the IEEE Midwest Symposium on Circuits and Systems, Ames, Iowa, August 1996. [5] Makoto Yasuhara and Masatomo Oka, “Signature Verification Experiment Based on Nonlinear Time Alignment: A Feasibility Study”, IEEE Transactions on Systems, Man, and Cybemetics, March 1977,pp. 212-216. [6] H. Sakoe and S. Chiba, “Dynamic Programming Algorithm Optimization for Spoken Word Recognition”, IEEE Transactions on Acoustics, Speech, and Signal Processing, February 1978,pp. 43-49. 171 M. Parizeau and R. Plamlondon,“A ComparativeAnalysis of Regional Correlation,Dynamic Time Warping, and SkeletalTree Matching for signature Verification”, IEEE Transactions on Pattern Analysis and Machine Intelligence, July 1990, pp. 710717.
933