Different Approach to Off-line Handwritten Automatic Signature Recognition and Verification
Abstract
1. Introduction
Biometrics comprises methods that use physiological characteristics, such as the face, iris and fingerprint, or behavioral traits, such as signature and voice, to verify the identity of an individual. Biometrics is used in security systems in place of passwords because biometric properties are hard to copy, steal, or guess. The signature is a behavioral biometric, not a physiological one. A signature is an authorization given by a person, and every person’s signature is unique. Automatic signature verification is divided into two categories: off-line and on-line signature verification. The main difference between on-line and off-line verification lies in how the data are obtained. On-line systems require the signature to be written on a special digitizer tablet or a similar device. Off-line systems work on an image of the signature (i.e. the person signs with a pen on paper) captured by a scanner or camera, and are useful for automating the verification of signatures on bank checks and documents. (5) (6) (7) (10) (11) Off-line verification has many difficulties: variation within genuine signatures, noise added by the scanning device, variation in pen width, and less discriminative information, because the signature enters the system only as an image.
Off-line systems are therefore more difficult than on-line systems, because they lack dynamic information such as duration, number of strokes and direction of writing. Genuine signatures of the same person may vary slightly, and the differences between a forgery and a genuine signature may be undetectable, which makes off-line signature verification a very challenging problem. An off-line signature is easier to forge, while on-line signatures are more distinctive and more difficult to forge. On-line systems have, in addition to shape information, dynamic features such as speed, pressure and the capture time of each point on the signature trajectory. On-line signature verification is therefore more reliable than off-line signature verification. The results of off-line signature systems are nevertheless comparable with the performance of experienced human examiners. (5) (6) (7) (11) In testing and evaluating the performance of a signature system there are two important factors: the false rejection rate of genuine signatures and the false acceptance rate of forged signatures. No public database of genuine and forged signatures is available, so it is difficult to compare existing signature systems, since each system uses a different database for testing and evaluation. Signature identification is classified into two problems, recognition and verification. Recognition is identifying the person who signed the document or check. Verification is checking whether a signature is genuine or forged. There are three
groups of forgery: random, simple and skilled. A random forgery is produced without any knowledge of the signer’s name or the signature’s shape. A simple forgery is produced knowing the signer’s name but without any knowledge of the signature’s shape. A skilled forgery is produced knowing both the signer’s name and the signature’s shape, so the forger attempts to imitate the original signature as closely as possible. Both the recognition and the verification problems are important for banks and credit card companies. Recognizing and verifying handwritten signatures has been a very active field in recent years, because the signature is the primary means of identifying the authorized person and authenticating transactions in banking processes. Many methods for automated signature examination have been developed, and there are several implementations for signature recognition and verification. In this paper, different methods for off-line handwritten automatic signature recognition and verification are compared: Principal Components Analysis, Neural Network, the Colour Code Algorithm and the Support Vector Machine. The rest of the paper is organized as follows. Section 2 discusses the databases used by the different methods. Section 3 discusses preprocessing. Section 4 discusses the feature extraction used by the different algorithms. Section 5 discusses the different methods for signature verification. Section 6 discusses the different methods for signature recognition. Section 7 discusses the performance and testing of each system.
2. Database
In this section we discuss the databases used by the different methods. For the Neural Network method, the signature images of the test database are divided into 18 sets, each containing 24 genuine signatures and 24 forged signatures. The total size of the database is 400 samples. (5) For Principal Components Analysis, the test database consists of 18 genuine samples (18 persons who signed 24 genuine signatures) and 100 forgery samples (20 persons were asked to imitate the genuine signatures, each person signing 5 signatures for each genuine one).
The total size of the database is 840 samples. (6) The Colour Code Algorithm (CCA) was developed for banking applications. The banking system stores three specimen signatures, and during a transaction the presented signature is compared with these specimen signatures. (7) For the Support Vector Machine, 1320 signatures signed by 70 persons are used. For the training phase, 40 persons each signed 8 signatures, and 30 persons imitated the signatures, signing 4 forged signatures for each person. For the testing phase, 320 genuine signatures and 320 forged signatures are used for the same 40 persons as in the training phase. (9)
3. Preprocessing
The Neural Network method uses a few preprocessing steps to improve the verification performance of the system: size normalization, smoothing of the trajectory and re-sampling. First the signature is scanned at 8 bits and a resolution of 300 dots per inch, and the scanned signature is cropped and sized using image-editing software. The scanned signature is then converted to a gray-scale image using a threshold. The scanned signature consists of black pixels on a white background; the image is then complemented, giving a white signature on a black background. The scanned signature contains noise components such as background noise pixels; a median filter is used to remove this noise.
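The thresholding, complementing and median-filtering steps can be illustrated with a minimal sketch. It is not the implementation of (5): the function name `preprocess`, the default threshold of 128, the 3x3 window and the zero padding at the border are all assumptions chosen for illustration.

```python
import numpy as np

def preprocess(gray, threshold=128):
    """Threshold, complement and median-filter a scanned signature.

    gray: 2-D array of 8-bit intensities, dark ink on a light background.
    threshold and the 3x3 window are illustrative assumptions.
    Returns a 0/1 array in which ink pixels are 1 (signature on black).
    """
    gray = np.asarray(gray)
    # Thresholding and complementing in one step: dark ink becomes 1.
    ink = (gray < threshold).astype(np.uint8)
    # 3x3 median filter: stack the nine shifted copies of the padded image
    # and take the per-pixel median, which removes isolated noise pixels.
    padded = np.pad(ink, 1, mode="constant")
    h, w = ink.shape
    windows = np.stack([padded[r:r + h, c:c + w]
                        for r in range(3) for c in range(3)])
    return np.median(windows, axis=0).astype(np.uint8)
```

With a 3x3 window a single stray pixel is removed, but so are the corners of thin strokes, so the window size is a trade-off between noise removal and stroke preservation.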
The size of the scanned signature is normalized to a fixed default value. The image is normalized according to either its width or its height, not both, because normalizing both would require calculating a ratio, which is a problem.
4. Feature Extraction
Some methods use feature extraction and others do not. In this section we discuss the differences among the methods. There are three types of features: global features, local grid features, and texture features. Global features give information about specific cases concerning the structure of the image, grid features give information about the overall signature appearance, and texture analysis gives information about the signature appearance. Method 1: The Neural Network uses the global features and the grid features. The global features are computed after normalization and skeletonization of the signature image. The global features are as follows (5). Image area, which gives the number of black pixels in the image. Pure width and height, which give the width and height of the image with horizontal and vertical blank spaces removed. Baseline shift, which is measured as the difference between the vertical centres of gravity of the left and the right halves of the image.
Vertical centre of the signature and maximum horizontal projection, which are used to indicate the location and strength of the signature baseline. Horizontal centre of the signature and maximum horizontal projection. Maximum vertical and horizontal projection. Global slant angle: the image is rotated by 30° in the clockwise direction, then rotated in the anti-clockwise direction in steps of 2°, and the horizontal projection is calculated at each step. The rotation stops when the horizontal projection reaches its maximum, and the slant angle is the difference between 30° and the angle at which the horizontal projection is maximal. Number of edge points: the edge points are the pixels which have only one immediate neighbor. Number of cross points: the cross points are the pixels which have three or more neighbors. Number of close points, which describes the amount of complexity that the signature lines involve.
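Several of the global features listed above can be sketched as follows. This is an illustration under stated assumptions, not the code of (5): the function name `global_features` and the dictionary keys are hypothetical, the skeleton is assumed to be a 0/1 array with ink as 1, "pure" width and height are interpreted as the number of columns and rows that contain ink, and neighbors are counted with 8-connectivity.

```python
import numpy as np

def global_features(skeleton):
    """Compute a few global features of a skeletonized signature.

    skeleton: 2-D 0/1 array, ink pixels are 1 (assumed convention).
    """
    img = np.asarray(skeleton).astype(bool)
    row_sums = img.sum(axis=1)   # horizontal projection (ink per row)
    col_sums = img.sum(axis=0)   # vertical projection (ink per column)

    # Count the 8-connected neighbors of every pixel by summing the
    # eight shifted copies of the zero-padded image.
    padded = np.pad(img.astype(int), 1)
    h, w = img.shape
    neighbors = sum(padded[1 + dr:1 + dr + h, 1 + dc:1 + dc + w]
                    for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                    if (dr, dc) != (0, 0))

    return {
        "area": int(img.sum()),
        "pure_width": int((col_sums > 0).sum()),
        "pure_height": int((row_sums > 0).sum()),
        "max_horizontal_projection": int(row_sums.max()),
        "max_vertical_projection": int(col_sums.max()),
        "edge_points": int((img & (neighbors == 1)).sum()),
        "cross_points": int((img & (neighbors >= 3)).sum()),
    }
```

Note that under 8-connectivity the pixels immediately next to a junction also have three or more neighbors, so a single crossing contributes several cross points; a stricter junction test would be a different design choice.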
In the grid information features, the skeleton image is partitioned into 96 rectangular segments and the area of each segment is computed; the results are then normalized so that the lowest value is zero and the highest value is one.
Method 2: Texture analysis is used in Principal Components Analysis. In Principal Components Analysis??? (6)
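For the grid information features of Method 1, the partition into 96 segments followed by min-max normalization can be sketched as below. The 8 x 12 layout of the segments, the function name `grid_features` and the handling of a completely uniform image are assumptions; the source only states the number of segments and the normalization to the range zero to one.

```python
import numpy as np

def grid_features(skeleton, rows=8, cols=12):
    """Area of each grid segment, min-max normalized to [0, 1].

    skeleton: 2-D 0/1 array, ink pixels are 1.
    rows x cols = 96 segments; the 8 x 12 split is an assumed layout.
    """
    skeleton = np.asarray(skeleton)
    # Split into horizontal bands, then each band into segments,
    # and count the ink pixels (the area) in every segment.
    areas = np.array(
        [[segment.sum() for segment in np.array_split(band, cols, axis=1)]
         for band in np.array_split(skeleton, rows, axis=0)],
        dtype=float)
    span = areas.max() - areas.min()
    if span == 0:
        # All segments equal: min-max normalization is undefined,
        # so return zeros (an assumption of this sketch).
        return np.zeros(rows * cols)
    return ((areas - areas.min()) / span).ravel()
```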
Method 3: The Colour Code Algorithm is a morphological and statistical technique. It does not use a feature-based technique but pixel-to-pixel matching. The Colour Code Algorithm for the signature recognition system consists of the following steps: (7)
Method 4: The Support Vector Machine uses global features, mask features and grid features. The global features are the signature area, the signature height-to-width ratio, the maximum horizontal and maximum vertical histogram, the horizontal and vertical centre of the signature, the number of local maxima of the signature and the number of edge points of the signature. The mask features give information about the direction of the lines of the signature, since the angles of signatures differ between persons. The grid features are used to find the densities of the signature parts. (9)
5. Signature Verification
Figure 1.1 summarizes the task to be solved by a signature verification system: given a test signature and a claimed ID, either accept the user as the identity owner or reject the user, based on the degree of dissimilarity between the test signature and the reference set signatures. In any signature verification system, the users are first enrolled by providing reference signature samples. When a user presents a test signature and claims to be a particular individual, the test signature is compared with the reference set signatures of the claimed identity. If the dissimilarity between the test and reference set signatures is above a certain threshold, the user is rejected; otherwise the user is accepted.
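The accept/reject rule described above can be written as a short sketch. The dissimilarity measure is not fixed by the text, so the minimum Euclidean distance between the test feature vector and the reference feature vectors is used here purely as an assumption; the function name `verify` and its parameters are hypothetical.

```python
import numpy as np

def verify(test_features, reference_features, threshold):
    """Accept or reject a claimed identity.

    test_features: feature vector of the test signature.
    reference_features: one feature vector per enrolled reference signature.
    Dissimilarity = smallest Euclidean distance to a reference (assumption).
    Returns (accepted, dissimilarity).
    """
    test = np.asarray(test_features, dtype=float)
    refs = np.asarray(reference_features, dtype=float)
    dissimilarity = float(np.linalg.norm(refs - test, axis=1).min())
    # Rejected when the dissimilarity is above the threshold,
    # accepted otherwise.
    return dissimilarity <= threshold, dissimilarity
```

Raising the threshold accepts more genuine signatures but also more forgeries, so the threshold sets the balance between the two error rates.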
Method 1: ????
Method 2: In Principal Components Analysis, the verification process uses an Artificial Neural Network (ANN) with 28 input variables, 18 hidden neurons and 2 output variables. It is designed to verify one signature at a time, confirming or rejecting a written sample, and is trained with the back-propagation algorithm. (6)
Method 4:
6. Signature Recognition
Method 2: In Principal Components Analysis, the recognition process uses a K Nearest Neighbours (KNN) classifier. The feature vectors (fv) of the training set are given, and the features of the unknown signature (U) are then extracted.
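A nearest-neighbour recognizer over the training feature vectors fv and the unknown signature U can be sketched as follows, using the Euclidean distance between fv and U. The function name `knn_recognize`, the default k = 3 and the majority vote among the k nearest samples are assumptions for illustration and are not taken from (6).

```python
import numpy as np
from collections import Counter

def knn_recognize(unknown, train_vectors, train_writers, k=3):
    """Return the writer whose training samples are closest to `unknown`.

    unknown: feature vector U of the unknown signature.
    train_vectors: feature vectors fv of the training set, one per row.
    train_writers: writer label for each training vector.
    """
    u = np.asarray(unknown, dtype=float)
    fv = np.asarray(train_vectors, dtype=float)
    # Euclidean distance between U and every training vector.
    distances = np.linalg.norm(fv - u, axis=1)
    # Indices of the k smallest distances.
    nearest = np.argsort(distances, kind="stable")[:k]
    # Majority vote among the k nearest training samples.
    votes = Counter(train_writers[i] for i in nearest)
    return votes.most_common(1)[0][0]
```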
The similarity between the feature vectors (fv) and the features of the unknown signature (U) is measured with the Euclidean distance between fv and U, which is computed for the unknown signature. The KNN classifier thus selects the writer of a sample from among a group of writers. (6) Method 3: In the Colour Code Algorithm, the signature is recognized by a morphological approach that is used to obtain a check pattern. The check pattern is used to decide on the validity of the signature according to the values set in the preferences. The preferences hold the radii used to generate the check pattern, the threshold for the intensity normalization operation, the decision thresholds, the threshold for maximum pixel change, and the threshold for maximum rotation angle. In this system, two signatures are overlapped and a check pattern is generated from the two images; the test signature is then processed and its angle of rotation is found; finally, the percentage match for the specific signature is found by counting the pixels lying in the deviation bands of the check pattern.
Each band in the check pattern corresponds to a deviation percentage: the black band represents perfectly matching pixels, the red band indicates 10 percent deviation, the green band indicates 20 percent deviation, and the blue band indicates 30 percent deviation. Pixels lying in the background color have a deviation greater than 30 percent. (7)
7. Performance and Testing
Testing is done in two different ways, for verification and for recognition. Verification is the process of deciding whether a signature is genuine or forged. Recognition is finding the identity of the person who signed the document or check.
Method 2: Testing Principal Components Analysis on the database samples gives a False Rejection Rate of 15% and a False Accept Rate of 17%. (6)
Method 3: In the Colour Code Algorithm system for signature recognition, the accuracy is about 80% to 90%. It cannot reach 100% because irrelevant signatures are rejected by the system. (7)
Method 4: In testing the Support Vector Machine on the database samples, 8 genuine and 8 forged signatures per person are tested in the verification phase. The False Rejection Rate is 0.02 and the False Accept Rate is 0.11. (9)
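The two error rates used throughout this section can be computed from the accept/reject decisions of a verification system as in the sketch below. The function name `error_rates` and the representation of decisions as lists of booleans (True meaning accepted) are assumptions for illustration.

```python
def error_rates(genuine_decisions, forgery_decisions):
    """Compute (false rejection rate, false acceptance rate).

    genuine_decisions: decisions on genuine signatures, True = accepted.
    forgery_decisions: decisions on forged signatures, True = accepted.
    """
    # False rejection: a genuine signature that was rejected.
    false_rejections = sum(1 for accepted in genuine_decisions if not accepted)
    # False acceptance: a forged signature that was accepted.
    false_acceptances = sum(1 for accepted in forgery_decisions if accepted)
    frr = false_rejections / len(genuine_decisions)
    far = false_acceptances / len(forgery_decisions)
    return frr, far
```

For example, one rejected signature out of four genuine ones gives a false rejection rate of 0.25, and one accepted forgery out of five gives a false acceptance rate of 0.2.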