International Symposium on Signal Processing and its Applications (ISSPA), Kuala Lumpur, Malaysia, 13 - 16 August, 2001. Organized by the Dept. of Microelectronics and Computer Engineering, UTM, Malaysia and Signal Processing Research Centre, QUT, Australia
ON-LINE SIGNATURE VERIFICATION SYSTEM USING PROBABILISTIC FEATURE MODELLING
G. K Kiran, R. Srinivasa Rao Kunte
J.N.N College of Engineering, Shimoga, India
[email protected] jnnce-ecd@vsnl.com
ABSTRACT
Signature verification is a popular biometric authentication technique. Although other biometrics such as retinal patterns and fingerprints are quite prevalent, they are not simple methods. With the development of very handy digitizer tablets, signatures can be captured on-line and processed for verification. Since signatures of the same person can vary with time and state of mind, it is necessary to develop an efficient signature verification system. In this paper we propose a signature verification system which extracts certain dynamic features derived from the velocity and acceleration of the pen, together with other global parameters such as the total time taken and the number of pen-ups. The features are modeled by fitting probability density functions, i.e., by estimating the mean and variance, which may accommodate the variations in the features of the signatures of the same person with respect to time and state of mind.
1. INTRODUCTION
Signature verification remains an attractive biometric method for person identification and verification. Although the use of other biometrics such as retinal patterns or fingerprint patterns, which are unique to each individual, is on the rise, a few factors have undermined their widespread commercial utility for low-end use. For instance, obtaining the retinal patterns of the individual under test is often seen as intrusive, and the equipment needed to obtain these biometrics may be too complex and expensive for large scale deployment. On the other hand, developments in the field of transducers and general technology have resulted in the availability of rather simple, inexpensive yet highly effective pen based devices such as graphic tablets, which can give the position of the pen in X-Y coordinates, the pressure at the tip of the pen, and sometimes the pen-tilt with respect to the horizontal. These graphic tablets can be easily interfaced with any desktop personal computer. Handwritten signatures, after a little practice, can be put down by users on such graphic tablets and can be processed by the applications running on the PC.
0-7803-6703-0/01/$10.00 ©2001 IEEE
Sudhaker Samuel S.J.College of Engineering Mysore, India.
[email protected]
Furthermore, since a very large percentage of day-to-day financial transactions are carried out on the basis of verification of signatures, there is ample motivation to design a system that verifies handwritten signatures in near real-time and with reasonable confidence. What makes the study of signatures all the more interesting is that, unlike retinal patterns or fingerprints, which are an unmistakable part of one's anatomy, signatures are learnt, practiced and evolve over a period of time, which is one of the reasons why signatures can be duplicated by a forger with a reasonable degree of success. Signature verification systems generally fall into two categories, off-line (static) and on-line (dynamic) [1]. In off-line systems the signature is obtained on an ordinary piece of writing paper, which is then scanned to obtain an image in any one of the standard PC formats. Here the problem of signature verification is translated into a more general problem in image analysis. After some initial preprocessing, many of the techniques and algorithms used in character or shape recognition are employed to verify the signature. The major disadvantage of such a method is that the dynamics of the signature (the signature features that vary as the signature is executed) are not captured, so that it is relatively easy for a person to remember the general shape and style of the strokes of another individual's signature and then to forge it. In on-line verification systems, pen based tablets are used to obtain a sequence of X-Y coordinates and the pressure of the pen (quantised to a certain number of bits) as the signature is made on the surface of the tablet. This sequence of X-Y coordinates is then analysed to obtain the velocity, acceleration, distance traveled and other profiles of the signature.
The signature of a person, which more often than not is the name of the individual written in a manner that is unique to the individual, is a process that is learnt, practiced and perfected over a period of time. The handwritten signature has been thought of as ballistic in nature with the message commands from the central nervous system to the muscles
being played out in a sequence with little or no feedback. That is, it is very rare to find a person pausing in the midst of a signature to look at it and see whether it is coming out well, and then continuing and finishing the signature. Thus the velocity profile of the signature, the acceleration characteristics of the pen and the total time taken from start to finish are all unique characteristics of an individual's signature. Even if the forger takes great pains in remembering the styles and contours of the strokes, it is extremely unlikely that he would be able to match the velocity profile or any other dynamic characteristic of the original signature. Thus, by capturing the dynamics of the signature, on-line signature verification goes one step further in addressing the problem of signature verification. Some of the approaches to on-line verification reported in the literature are as follows. The velocity and pressure waveforms of the pen are treated as continuous functions of time (or a discrete version of a continuous function). Then, techniques such as Dynamic Time Warping (DTW, used extensively in speech recognition), which take care of time alignment, are used to compare the reference template and the test signature [1][2]. The signatures are segmented into words based on the zero velocity points in the velocity curve, and then DTW or other techniques are used to compare the different segments to determine the similarity or otherwise of the test and reference signatures [3]. Hidden Markov Models (HMM, a technique used extensively in speech recognition) have also been applied to the word segments [4]. The velocity and pressure waveforms are reduced to a set of features such as the number of times the velocity becomes zero, or the number of times the first derivative of the velocity (acceleration) crosses the zero axis. These features are combined with other global features such as the total time taken or the total distance traveled. The comparison is then accomplished by using a mathematical norm or other well established techniques such as neural networks [5].
2. SYSTEM METHODOLOGY
In our proposed system, the typical enrolling process of a new user begins with the user putting down the signature above a horizontal reference line in a relaxed manner. About 5 to 10 of his (or her) specimen signatures are obtained, certain features, to be explained later, are extracted from these signatures, and a reference set (typically called a database) is built up. Later, when the same user (or someone claiming to be that particular user) presents himself for authentication, the features extracted from his test signature are compared with the features in the reference set belonging to the person under test. If the signature has indeed come from the original user, the comparison process will yield a higher similarity score, whereas an attempt at forgery will yield a lower similarity score.
The features used for signature verification in our system have been widely studied in the literature: the total time taken in doing the signature, the total distance traveled by the pen, the number of zeros in the velocity and pressure curves, etc. (more details follow in the subsequent sections). A different approach has been proposed and used for the comparison of the reference and the test signatures. A probability density function, Gaussian in this case, is used to model the clustering of the features. That is, a density function is parametrically fitted (or estimated) to each feature using all the specimen signatures of a person. In the process of estimation, the mean and variance of each feature are calculated and used as the representatives of the signature for identification. During the testing phase (or authentication), the probability score obtained by fitting the test feature data to the estimated reference values is accumulated over all the features. A decision about the authenticity of the signature in question is then taken on the basis of this accumulated score.
The graphic tablet used in our experiments comes with a 5" x 5" pad, a cordless pen (stylus) and 256 levels of pressure sensing at the tip of the stylus. Every time the pen moves, the software drivers report the X-Y coordinates and the pressure of the contact of the pen. To enable us to perform the velocity and acceleration calculations we also store the PC system time along with the X-Y coordinates and pressure. Also, the number of pen-ups during the signature is noted.
2.1 Features
We propose to extract the following features from the signature for verification:
1. The number of zeros in the velocity in the X direction.
2. The number of zeros in the velocity in the Y direction. (Instead of exact zeros, values that are a very small percentage of the peak value could be counted.)
3. The number of zero crossings in the acceleration in the X direction.
4. The number of zero crossings in the acceleration in the Y direction. (These zero crossings indicate a change from increasing to decreasing velocity or from decreasing to increasing velocity.)
5. Total time taken in performing the signature.
6. Total distance traveled by the pen. This is the sum of the Euclidean distances between successive points,
$$D = \sum_{i=0}^{N-1} \left\{ (X_i - X_{i+1})^2 + (Y_i - Y_{i+1})^2 \right\}^{1/2} \qquad (1)$$
7. Total number of pen-ups (including or excluding the final pen-up).
8. Total pen-up time, which is the total time for which the pen was up.
9. The number of times the pressure goes above an upper threshold $T_u$.
10. The number of times the pressure falls below a lower threshold $T_l$. (Thresholds can be a certain percentage of the peak value.)
The velocity, which changes as a function of time, can be calculated as the first derivative of displacement:
$$Vx_m = (X_{m+1} - X_m) / (t_{m+1} - t_m) \qquad (2)$$
$$Vy_m = (Y_{m+1} - Y_m) / (t_{m+1} - t_m) \qquad (3)$$
which are the velocities in the X and Y directions respectively at time $t_m$. The acceleration can be calculated as the first derivative of velocity, and thus we have for the acceleration in the X and Y directions at time $t_m$:
$$Ax_m = (Vx_{m+1} - Vx_m) / (t_{m+1} - t_m) \qquad (4)$$
$$Ay_m = (Vy_{m+1} - Vy_m) / (t_{m+1} - t_m) \qquad (5)$$
As an example, Fig. 1 shows a sample signature; the corresponding velocity and acceleration plots are shown in Fig. 2 and Fig. 3 respectively. For each of the 10 features listed above, the mean and variance are estimated over all the specimen signatures, as detailed in the next section.
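The difference quotients for velocity and acceleration, and the counting features built on them, can be sketched in Python as follows. This is a minimal illustration and not the system's actual code: all function names, the dictionary keys, the `fraction` tolerance and the 80%/20% pressure thresholds are assumptions made for the example. The acceleration is taken to be the same forward difference applied to the velocity, and the time stamps are assumed to be strictly increasing.

```python
import math


def forward_difference(values, times):
    """Return (values[m+1] - values[m]) / (times[m+1] - times[m]) for each m."""
    return [
        (values[m + 1] - values[m]) / (times[m + 1] - times[m])
        for m in range(len(values) - 1)
    ]


def count_near_zero(series, fraction=0.0):
    """Count samples whose magnitude is at most `fraction` of the peak magnitude.

    With fraction=0.0 only exact zeros are counted.
    """
    peak = max((abs(v) for v in series), default=0.0)
    return sum(1 for v in series if abs(v) <= fraction * peak)


def count_zero_crossings(series):
    """Count sign changes between consecutive non-zero samples."""
    signs = [1 if v > 0 else -1 for v in series if v != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def total_distance(xs, ys):
    """Sum of Euclidean distances between successive pen positions, Eq. (1)."""
    return sum(
        math.hypot(xs[i + 1] - xs[i], ys[i + 1] - ys[i])
        for i in range(len(xs) - 1)
    )


def count_rises_above(series, threshold):
    """Number of times the series goes from <= threshold to > threshold."""
    return sum(1 for a, b in zip(series, series[1:]) if a <= threshold < b)


def count_falls_below(series, threshold):
    """Number of times the series goes from >= threshold to < threshold."""
    return sum(1 for a, b in zip(series, series[1:]) if a >= threshold > b)


def extract_features(xs, ys, ts, pressure, upper_frac=0.8, lower_frac=0.2):
    """Compute features 1-6, 9 and 10 from one sampled signature.

    upper_frac and lower_frac are hypothetical threshold settings, expressed
    as fractions of the peak pressure.
    """
    vx = forward_difference(xs, ts)
    vy = forward_difference(ys, ts)
    ax = forward_difference(vx, ts[:-1])
    ay = forward_difference(vy, ts[:-1])
    peak_pressure = max(pressure)
    return {
        "vx_zeros": count_near_zero(vx),
        "vy_zeros": count_near_zero(vy),
        "ax_zero_crossings": count_zero_crossings(ax),
        "ay_zero_crossings": count_zero_crossings(ay),
        "total_time": ts[-1] - ts[0],
        "total_distance": total_distance(xs, ys),
        "pressure_above_upper": count_rises_above(pressure, upper_frac * peak_pressure),
        "pressure_below_lower": count_falls_below(pressure, lower_frac * peak_pressure),
    }
```

The pen-up features (7 and 8) are left out of the sketch because they depend on how the tablet driver reports pen-up events, which is not specified here.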
2.2 Feature Modeling:
During the process of enrolling a new user, a number of signatures, typically between 10 and 20 (the more the better), are obtained over a period of time (we refrained from forcing the subjects to do all the signatures in one sitting). The features, as mentioned in the previous section, are obtained for each of the sample signatures. If there are 20 sample signatures, then 20 feature vectors, each containing the 10 proposed features, are extracted. During the course of building our system we noticed that handwritten signatures tend to vary a great deal depending on the mental state of the signer, the surroundings in which he is made to sign, the time period over which he is made to sign and sometimes also the specific reason for which he is to sign. So, most of the features, such as the total time taken, the total distance traveled and the time between pen-ups, tend to vary from one signature to the other of the same person about a mean value, with a certain variance that is characteristic of that individual (that is, the feature values tend to be found clustered about a mean value). It is this observation that prompted us to think about fitting distributions to these features. Of all the mathematical distributions, the Gaussian density function is commonly found to model many of the processes observed in nature well. The mean gives us an idea of the value around which the clustering takes place, while the variance is an indicator of how widely the cluster is spread in feature space (the smaller the variance, the smaller the cluster, and vice versa). Since N signatures give us N values of a particular feature (for example, N values of total time taken), we fit a Gaussian density function to these values and in doing so estimate the mean and variance that parametrically define the distribution. The unbiased estimator for the mean is,
$$\mu_k = \frac{1}{N} \sum_{i=1}^{N} X_i \qquad (6)$$
where the $X_i$'s are the feature values. The unbiased estimator for the variance is,
$$\mathrm{var}_k = \frac{1}{N-1} \sum_{i=1}^{N} (X_i - \mu_k)^2 \qquad (7)$$
where k varies from 1 to the number of features, i.e., 10.
Fig. 1 A sample signature.
Fig. 2 The velocity of the pen tip as a function of time for the sample signature in Fig. 1.
As a simple numerical example, let us consider that there are 4 specimen signatures of an individual and that two features, namely total time taken and number of pen-ups, are extracted from each signature.
Sig 1 - 4.234 sec and 5 pen-ups
Sig 2 - 4.623 sec and 6 pen-ups
Sig 3 - 4.555 sec and 4 pen-ups
Sig 4 - 4.772 sec and 5 pen-ups
The mean of feature 1 (time taken) is 4.546 sec and its variance is 0.051477, while those of feature 2 (number of pen-ups) are 5 and 0.666 respectively. So, for this particular user with 4 specimen signatures the reference template is (4.546, 0.051477), (5, 0.666).
Fig. 3 The acceleration of the pen tip as a function of time for the sample signature in Fig. 1.
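The estimation in Eqs. (6) and (7) can be checked against the four-specimen example with a short Python sketch. The names `fit_feature` and `build_template` are hypothetical and chosen only for this illustration; the code assumes each specimen signature has already been reduced to a tuple of feature values.

```python
def fit_feature(values):
    """Return (mean, unbiased variance) of one feature, as in Eqs. (6) and (7)."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two specimen signatures are needed")
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    return mean, variance


def build_template(feature_vectors):
    """Fit every feature column; feature_vectors holds one tuple per specimen."""
    return [fit_feature(column) for column in zip(*feature_vectors)]


# The four specimens of the example: (total time in seconds, number of pen-ups).
specimens = [(4.234, 5), (4.623, 6), (4.555, 4), (4.772, 5)]
template = build_template(specimens)
for mean, variance in template:
    print(f"mean = {mean:.3f}, variance = {variance:.6f}")
```

For the pen-up feature the computed variance is 2/3, which the text quotes truncated to 0.666.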
Thus we have, for a given individual, a mean and variance $(\mu_k, \mathrm{var}_k)$ that describe the behaviour of a particular feature. For a 10-feature system we have:
$$(\mu_1, \mathrm{var}_1),\ (\mu_2, \mathrm{var}_2),\ (\mu_3, \mathrm{var}_3),\ \ldots,\ (\mu_{10}, \mathrm{var}_{10})$$
This is computed for every user in the system. So, when a user's authenticity has to be verified, the set of mean and variance reference values corresponding to the user under test is used in verifying the claim.
2.3 Verification
When a request is made to authenticate a given user, the claimant is asked to sign above the horizontal reference line. From this signature, the same features as mentioned before are extracted and a feature vector is built up. The values of the features of this test signature are then substituted into the equation defining the Gaussian density function of mean and variance $(\mu_i, \mathrm{var}_i)$, i.e.,
$$P(X_i) = \frac{\exp\left[ -(X_i - \mu_i)^2 / (2\,\mathrm{var}_i) \right]}{\sqrt{2 \pi\, \mathrm{var}_i}} \qquad (8)$$
where $P(X_i)$ is the score that the particular feature $X_i$ generates. (Note that $P(X_i)$ is a density value and not a probability: for a continuous variable the probability of taking any particular value is zero, and a density value can exceed 1.) From the above equation, it is evident that the closer the value of $X_i$ is to the mean, on the scale set by the variance, the greater the score generated. The above equation is evaluated over all the features (the index i runs from 1 to 10) and a probability score PS is generated, given by:
$$PS = \sum_{i=1}^{10} P(X_i) \qquad (9)$$
After the mean and variance of each feature are estimated, a quantity called the threshold is estimated. To estimate the threshold, the specimen signatures are treated as test signatures: their features are substituted into the defining equation (8) and the probability score PS is calculated over all the features. This is repeated over the whole set of sample signatures, and the average of all the probability scores obtained can serve as a good measure of the threshold. If the accumulated score PS of a test signature is equal to or greater than the threshold, the signature is verified as authentic, while if it is less than the threshold it is declared forged. If a more lenient threshold is desired, it can be set at some percentage (below 100%) of the average.
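The scoring of Eqs. (8) and (9) and the threshold rule described above can be sketched in Python as follows. The function names and the `fraction` parameter are hypothetical and serve only this illustration. The template and specimen values are those of the two-feature example in Section 2.2, and the test values (5.2 sec with 6 pen-ups, 4.6 sec with 5 pen-ups) are assumed for the demonstration. Every variance is assumed to be strictly positive; a feature that is identical across all specimens would give a zero variance and make Eq. (8) undefined.

```python
import math


def gaussian_score(x, mean, variance):
    """Gaussian density value of x for the given mean and variance, Eq. (8)."""
    return math.exp(-((x - mean) ** 2) / (2 * variance)) / math.sqrt(
        2 * math.pi * variance
    )


def probability_score(features, template):
    """Accumulated score PS: the sum of per-feature density values, Eq. (9)."""
    return sum(
        gaussian_score(x, mean, variance)
        for x, (mean, variance) in zip(features, template)
    )


def estimate_threshold(specimens, template, fraction=1.0):
    """Average PS of the specimen signatures, optionally scaled by `fraction`."""
    scores = [probability_score(s, template) for s in specimens]
    return fraction * sum(scores) / len(scores)


def is_authentic(features, template, threshold):
    """Accept the signature when its PS reaches the threshold."""
    return probability_score(features, template) >= threshold


# Reference template and specimens of the two-feature example.
template = [(4.546, 0.051477), (5, 0.666)]
specimens = [(4.234, 5), (4.623, 6), (4.555, 4), (4.772, 5)]
threshold = estimate_threshold(specimens, template)

forgery = (5.2, 6)   # assumed test values: 5.2 sec, 6 pen-ups
genuine = (4.6, 5)   # assumed test values: 4.6 sec, 5 pen-ups
print(is_authentic(forgery, template, threshold))
print(is_authentic(genuine, template, threshold))
```

Because the threshold is the average score of the specimens themselves, some genuine specimens necessarily score below it; passing a `fraction` below 1.0 corresponds to the more lenient setting mentioned above.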
As a numerical example, let us consider the reference template of means and variances from the previous example, i.e., (4.546, 0.051477), (5, 0.666). Now, let there be a forgery and an authentic signature, where the forgery has feature values of 5.2 sec total time and 6 pen-ups while the genuine signature has values of 4.6 sec and 5 pen-ups respectively. Using equation (8) with the given means and variances, the forgery yields a score of 0.02758 for the first feature and 0.23089 for the second, for a total score of 0.258477. The genuine signature gives a score of 1.7092 for the first feature and 0.48886 for the second, for a total of 2.1978. The score of the genuine signature will probably be above the set threshold.
3. CONCLUSION
This system was tried on a set of 5 users with about 25 reference signatures for each individual. As no expert forgers (signature analysis experts) were available, we had to make do with the users duplicating each other's signatures. Initial results have proved quite encouraging, in spite of the fact that the users who tested the system knew what features the system was looking at. About 95% verification (rejecting forgeries) was obtained in these cases. It is expected that this system will give much better results if tested on subjects who have no knowledge of the signature features that the system examines. The proposed system promises a very simple yet reliable solution to the problem of signature verification. The only problem might be the need for a large number of reference signatures, but any reasonable individual should be willing to comply with the need to put down a few signatures. However, a lot of testing and refinement of the features needs to be done before the system can be put forward with any reasonable degree of confidence.
5. REFERENCES
[1] R. Plamondon and G. Lorette (1989), "Automatic Signature Verification and Writer Identification: The State of the Art", Pattern Recognition, vol. 22, pp. 107-131.
[2] R. Martens and L. Claesen, "Dynamic Programming Optimisation for On-Line Signature Verification", Proceedings of ICDAR '97, pp. 653-656.
[3] Trevor Hastie, Eyal Kishon and Jason Fan, "A model for signature verification", Technical Report 11214-91071.5-07TM, AT&T Bell Laboratories, July 1991.
[4] R. S. Kashi, J. Hu, W. L. Nelson and W. Turin, "Online Handwritten Signature Verification using Hidden Markov Model features", Proceedings of ICDAR '97, pp. 253-257.
[5] G. K. Gupta and R. C. Joyce (1997), "A Study of Some Pen Motion Features in Dynamic Handwritten Signature Verification", Technical Report, Computer Science Dept, James Cook University of North Queensland.