SIFT is among the most robust descriptor algorithms. However, its extraction of scale-invariant image feature points is complex and computationally expensive, so it cannot meet the real-time requirement of license plate character recognition. To solve this problem, this study proposes a simplified SIFT descriptor generation algorithm. Based on the characteristics of license plate characters, the geometric center of each character is taken as the feature point and the main direction is then found with the PCA algorithm, which simplifies feature point detection. Combined with the SVM classification algorithm, a new recognition method for license plate characters is proposed. Experimental results show that the simplified SIFT descriptor largely retains the robustness of the original SIFT descriptor. In addition, the recognition method performs well on slanted, blurred and partly missing license plate characters.
Like ID card numbers for people, license plate characters are used to identify vehicles. With the rapid development of intelligent traffic management systems, license plate recognition systems are now widely applied across traffic management. Character recognition is one of the most important technologies in a license plate recognition system, so it directly influences the final recognition rate.
A robust feature description method and a well-designed classifier are the core of the whole recognition system. Many descriptor design methods have been proposed and Mikolajczyk and Schmid (2005) give a detailed comparative analysis of the properties of the various descriptors. The overall performance of the differential invariant descriptor (Koenderink and van Doorn, 1987) is comparatively poor. Steerable filters (Freeman and Adelson, 1991) and gradient methods are simple in design and fast in matching but their robustness is poor. Moreover, descriptors such as shape contexts (Belongie et al., 2002) and complex filters (Schaffalitzky and Zisserman, 2002) are not sufficiently robust to image changes. To date, the SIFT descriptor (Lowe, 2004) is the descriptor algorithm with the best robustness. However, its extraction of scale-invariant image feature points is complex and computationally expensive. Because the generated descriptor is a 128-dimensional vector, matching is slow and the real-time requirement cannot be satisfied.
To address these problems, the present study simplifies the generation steps of the SIFT descriptor on the basis of the features of license plate characters. The simplified SIFT descriptor directly takes the geometric center of the character image as the feature point. The main direction of the feature point is found with the PCA algorithm (Smith, 2002; Shlens, 2005) and the descriptor of the feature point is then generated, which reduces the computational cost of detecting image feature points. In addition, the PCA algorithm is used to further lower the dimension of the generated SIFT descriptor (Ke and Sukthankar, 2004), producing a low-dimensional feature vector that satisfies the real-time requirement.
Many classification algorithms are available, such as the KNN text classifier (Hao, 2007), the neural network (Paliwal and Kumar, 2009) and the Support Vector Machine (SVM) classifier (Burges, 1998). The KNN text classifier (Hao, 2007), which works on distances in the feature space, is easy to compute but handles noise poorly and has low classification accuracy. The neural network (Smola and Scholkopf, 2004) is a machine learning method for large samples. Its performance depends on the number of training samples and training iterations, so it is not well suited to the recognition of license plate characters. SVM, in contrast, is a machine learning method based on Vapnik's statistical learning theory that targets small-sample problems. It has been widely applied in pattern recognition (Burges, 1998), regression analysis (Smola and Scholkopf, 2004) and probability density estimation (Xiaoyun et al., 2009).
The SIFT algorithm consists of four main steps (Lowe, 2004): (1) scale-space extrema detection; (2) keypoint localization; (3) orientation assignment and (4) keypoint descriptor generation. The first three steps obtain the keypoint and its orientation and are computationally complex, while the fourth step generates the descriptor. License plate characters are small, so a single feature point is enough to describe one image; moreover, the geometric center and the center of gravity of a character differ little. Therefore, in view of these features, the simplified SIFT directly adopts the geometric center of the character as the keypoint and uses the PCA algorithm to find the orientation of the keypoint. This replaces the first three steps and greatly lowers the computational complexity. The descriptor generation in this study is similar to the fourth step of the SIFT algorithm. The effect of the orientation selection by PCA and of the coordinate mapping is shown in Fig. 1.
The image is preprocessed in two steps: (1) The image is smoothed by Gaussian convolution to eliminate noise points; (2) The image is sharpened with the Sobel operator to make the edges of the characters brighter. As a result, the noise points of the image in Fig. 1b are mostly eliminated.
Figure 1d shows the effect of the orientation selection by PCA on the image. The orientation selection by PCA can be summarized in three steps: (1) A coordinate system is built with the geometric center of the image as the origin; this origin is the keypoint. (2) An n×2 matrix is constructed by mapping every pixel whose value exceeds a certain threshold onto this coordinate system. The coordinate mapping formulae are x = (1-C)/2.0 + j and y = (R-1)/2.0 - i, where C is the number of columns of the character image, R the number of rows, j the column index of the current pixel and i its row index. (3) The eigenvectors are calculated from the matrix by PCA and the eigenvector corresponding to the largest eigenvalue is taken as the orientation of the feature point.
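The three steps above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the image is assumed to be a list of rows of gray values and the points are centered on their own mean before the covariance is computed, as in standard PCA. For a 2x2 covariance matrix the direction of the largest eigenvalue has a closed form, so no eigen-solver is needed.

```python
import math

def map_pixels(image, threshold):
    """Map pixels brighter than `threshold` to (x, y) coordinates whose
    origin is the geometric center of the image (the keypoint)."""
    rows, cols = len(image), len(image[0])
    points = []
    for i in range(rows):        # i: row index
        for j in range(cols):    # j: column index
            if image[i][j] > threshold:
                x = (1 - cols) / 2.0 + j
                y = (rows - 1) / 2.0 - i
                points.append((x, y))
    return points

def principal_orientation(points):
    """Return the angle (radians) of the eigenvector that belongs to the
    largest eigenvalue of the 2x2 covariance matrix of `points`."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / n
    syy = sum((p[1] - mean_y) ** 2 for p in points) / n
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / n
    # Closed form for the major axis of a symmetric 2x2 matrix.
    return 0.5 * math.atan2(2.0 * sxy, sxx - syy)
```

The returned angle lies between -π/2 and π/2, so a negative value can occur and has to be handled by whatever rotation step follows.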
Figure 1c shows the character image obtained by rotating the coordinate axes to the orientation. The rotation takes two steps. (1) The coordinate axes are rotated with the formulae x = x′ sin∂ - y′ cos∂ and y = x′ cos∂ + y′ sin∂; when ∂ < 0, ∂ is replaced by π + ∂. Here, ∂ is the inclination angle of the orientation, while x′ and y′ are the coordinates of the pixel along the X and Y axes of the former coordinate system. (2) In the new coordinate system, when the white pixels in the character region clearly outnumber the black pixels, threshold values are set to extract the character. Rotating the coordinate axes to the orientation ensures that the character image is invariant to rotation.
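A short sketch of the rotation in step (1) is given below, using the formulae exactly as they appear in the text; the function name is hypothetical. With these formulae a point lying along the orientation ends up on the positive Y axis of the new coordinate system.

```python
import math

def rotate_to_orientation(x_old, y_old, angle):
    """Rotate a point from the former coordinate system into the one
    aligned with the orientation `angle` (radians)."""
    if angle < 0:
        angle += math.pi  # the text replaces a negative angle by pi + angle
    x_new = x_old * math.sin(angle) - y_old * math.cos(angle)
    y_new = x_old * math.cos(angle) + y_old * math.sin(angle)
    return x_new, y_new
```

For example, the point (1, 1) with an orientation of π/4 maps to approximately (0, 1.414): its distance from the origin is preserved and it now lies on the new Y axis.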
Fig. 1: Effect of the orientation selection by PCA and of the coordinate mapping
In the present study, the process of generating the SIFT descriptor is similar to that of the original SIFT algorithm. However, because license plate character images are not of equal size, they should be normalized to standard characters before the descriptor is generated. In this study, the size of the standard character images is 16×16. The whole process of generating the descriptor can be divided into five steps:
Step 1: Compute the gradient magnitude and direction of each pixel.
In the formulae, P stands for the pixel value. The gradient information of a pixel is described by its neighboring pixels in four directions.
Step 2: Compute the position weight of the pixel.
With the center of the character image as the feature point, the SIFT descriptor divides the image into 4×4 small blocks. Taking (Cxi, Cyj) as the central point of each block, there are 16 central points in total. Sx and Sy stand for the spacing between two neighboring blocks in the x and y directions, respectively. Dxi and Dyj stand for the position weights of the point (x, y) in the x and y directions with respect to the block (Cxi, Cyj), while PWij is the position weight of the point (x, y) in the block (Cxi, Cyj) after the image has undergone the rotation transformation.
Step 3: Compute the Gaussian weight of the pixel.
In this formula, σ is the Gaussian scale parameter and in practice it can be replaced by a constant. Taking a 16×16 character image as an example, σ = 2×Sx = 8 gives acceptable experimental results.
Step 4: Compute the direction weight of the pixel.
The direction weight depends on the gradient direction of the pixel, which ranges from 0 to 2π. Starting from 0, one direction bin is placed every 0.25π, so the gradient direction is divided into 8 bins. θ stands for the gradient direction angle of the pixel, Di for the angle of the i-th direction bin and DWDi for the direction weight in the i-th direction bin. When |θ-Di| < 0.25π, Di is a neighboring bin of θ and the pixel receives a direction weight in that bin.
Step 5: Generate the SIFT descriptor.
The value m(x, y) is the gradient magnitude of the pixel at the point (x, y) and W stands for the descriptor generated by the point (x, y). PW, DW and GW are 128-dimensional vectors. All pixels of the character image are traversed, the above steps are repeated for each pixel and the descriptors generated each time are accumulated into the 128-dimensional SIFT descriptor.
Position weighting in Step 2 keeps the descriptor invariant when images are rotated, Gaussian weighting in Step 3 ensures that the nearer a pixel is to the center, the more it contributes and direction weighting in Step 4 makes the descriptor resistant to noise and illumination changes. These three weighting processes assure the robustness of the SIFT descriptor.
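The five steps can be put together in a compact sketch. The exact weighting formulae are not reproduced in the text, so the forms used here are assumptions borrowed from common SIFT practice: central differences for the gradient, a weight that falls linearly from 1 to 0 over one block spacing (position) or one bin width (direction) and an isotropic Gaussian around the image center with σ = 2×Sx = 8. All names are hypothetical and the one-pixel border is skipped because it lacks four neighbors.

```python
import math

SIZE, BLOCKS, BINS = 16, 4, 8
STEP = SIZE // BLOCKS          # Sx = Sy = 4
SIGMA = 2.0 * STEP             # sigma = 2 * Sx = 8, as in the text
BIN_WIDTH = 0.25 * math.pi

def linear_weight(distance, scale):
    """Assumed weight form: 1 at distance 0, falling linearly to 0 at `scale`."""
    return max(0.0, 1.0 - distance / scale)

def simplified_descriptor(image):
    """Accumulate a 128-dimensional descriptor from a 16x16 gray image."""
    descriptor = [0.0] * (BLOCKS * BLOCKS * BINS)
    center = (SIZE - 1) / 2.0
    block_centers = [(b + 0.5) * STEP - 0.5 for b in range(BLOCKS)]
    for y in range(1, SIZE - 1):
        for x in range(1, SIZE - 1):
            # Step 1: gradient from the four neighbors (central differences).
            dx = image[y][x + 1] - image[y][x - 1]
            dy = image[y + 1][x] - image[y - 1][x]
            magnitude = math.hypot(dx, dy)
            if magnitude == 0.0:
                continue
            theta = math.atan2(dy, dx) % (2.0 * math.pi)
            # Step 3: Gaussian weight by distance to the image center.
            dist_sq = (x - center) ** 2 + (y - center) ** 2
            gauss = math.exp(-dist_sq / (2.0 * SIGMA ** 2))
            for bj, cy in enumerate(block_centers):
                for bi, cx in enumerate(block_centers):
                    # Step 2: position weight PW_ij = Dx_i * Dy_j.
                    pw = (linear_weight(abs(x - cx), STEP)
                          * linear_weight(abs(y - cy), STEP))
                    if pw == 0.0:
                        continue
                    for k in range(BINS):
                        # Step 4: direction weight from the circular angle distance.
                        diff = abs(theta - k * BIN_WIDTH)
                        diff = min(diff, 2.0 * math.pi - diff)
                        dw = linear_weight(diff, BIN_WIDTH)
                        # Step 5: accumulate W = m * GW * PW * DW.
                        index = (bj * BLOCKS + bi) * BINS + k
                        descriptor[index] += magnitude * gauss * pw * dw
    return descriptor
```

With these assumed weights each pixel contributes to at most four blocks and two direction bins, so a small shift or rotation moves energy gradually between neighboring entries instead of switching it abruptly.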
The key idea of the SVM algorithm is to introduce Structural Risk Minimization (SRM) into classification. SVM was developed from the theory of the optimal separating hyperplane in the linearly separable case; in essence, it finds in the training samples the support vectors that define this optimal hyperplane. SVM is formulated as a binary classification problem, whereas character recognition requires multi-class classification. The common methods for realizing multi-class SVM are one-against-one, one-against-the-others and SVM decision trees. For K classes, one-against-the-others constructs only K sub-classifiers, its structure is simple and the training sample of license plate characters is not large, so this study adopts the one-against-the-others method to solve the multi-class problem. Four kernel functions are commonly used: the linear kernel function, the polynomial kernel function of degree d, the Gaussian radial basis kernel function and the neural network kernel function.
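The one-against-the-others decision rule can be illustrated in a few lines. The sketch assumes that K binary sub-classifiers have already been trained and that each one returns a real-valued score that grows with its confidence that the sample belongs to its own class; the function name and the dictionary layout are hypothetical.

```python
def one_vs_rest_predict(sample, scorers):
    """Return the label whose binary sub-classifier scores `sample` highest.

    `scorers` maps each of the K class labels to a callable that returns a
    real-valued decision score for the sample.
    """
    return max(scorers, key=lambda label: scorers[label](sample))
```

Only K scorers are evaluated per sample, compared with K(K-1)/2 pairwise classifiers in the one-against-one scheme.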
In the present study, the SVM algorithm uses the Gaussian radial basis function as the kernel function. To obtain the optimal parameters (c, σ²), where c is the penalty factor, c = 2^i and σ² = 2^j are set with i, j = -8, -7, ..., 7, 8. Every pair of c and σ² is used for training and the pair with the highest recognition rate is taken as the optimal parameter.
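The parameter search described above amounts to an exhaustive grid search over 17 × 17 = 289 pairs. A minimal sketch follows; the callable `recognition_rate` is a hypothetical stand-in for training an SVM with the given pair and measuring its recognition rate, which is not shown here.

```python
import itertools

def parameter_grid(low=-8, high=8):
    """Yield every (c, sigma_sq) pair with c = 2**i and sigma_sq = 2**j."""
    exponents = range(low, high + 1)
    for i, j in itertools.product(exponents, exponents):
        yield 2.0 ** i, 2.0 ** j

def grid_search(recognition_rate):
    """Return the (c, sigma_sq) pair that maximizes `recognition_rate`."""
    return max(parameter_grid(), key=lambda pair: recognition_rate(*pair))
```

If several pairs reach the same rate, `max` returns the first one encountered, so ties are resolved in favor of smaller exponents.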
EXPERIMENTAL STUDY AND ANALYSIS
The SIFT descriptor generates feature vectors from the gradient information of images. Like earlier feature extraction algorithms for character recognition, it works directly on the image information. To test the robustness of the SIFT descriptor, SIFT descriptors are generated for the characters 8 and B and the Chinese characters Yu and Xiang and the similarities among the feature vectors of these four characters are compared using Euclidean distance. Each character type contains 10 images: the original image, images with added Gaussian noise of 0.1, 0.2, 0.3 and 0.4 and images with oblique angles of 5, 15, 25, 35 and 45°.
To test the effectiveness of license plate character recognition by the combination of the SIFT descriptor and the SVM classification algorithm, the SIFT+KNN and SIFT+SVM recognition algorithms are each applied to four types of characters: Chinese characters, English letters, the combination of Arabic numerals and English letters and Arabic numerals. The recognition results are then analyzed.
Furthermore, to test whether the recognition algorithm in this study can meet the real-time requirement of license plate character recognition, the PCA algorithm is used to further lower the dimension of the generated SIFT descriptor and the recognition results at different dimensions are compared. The experimental results show that reducing the dimension has little influence on the recognition rate.
Selection of the experimental sample: The experimental sample contains 700 license plate images photographed for the present study. However, the character images obtained from them by the localization and segmentation algorithms of the license plate recognition system do not fully cover out-of-town license plate characters, so part of the images in the sample were downloaded from the Internet. The images photographed in this study are comparatively blurry.
In the experiment, each type in the training sample set contains 8 character images. In the testing sample set, the Chinese character set includes 500 images, the English letter set 500 images, the mixed set of English letters and Arabic numerals 1000 images and the Arabic numeral set 500 images.
Verification on effectiveness of SIFT descriptor: The forty character images in Fig. 2 are used to verify the validity of the SIFT descriptor. The purpose of research on character recognition is to find features that distinguish one type of license plate character from the other types. In this study, Euclidean distance is adopted to compute the similarity between vectors: the smaller the Euclidean distance between two vectors, the more similar they are. If the Euclidean distance between vectors generated from character images of the same type is smaller than that between vectors from different types, this shows that the SIFT descriptor generation algorithm in this study finds features of license plate characters that distinguish images of different types well and describe images of the same type well.
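The criterion above can be written down directly. The sketch below is an illustration with hypothetical function names; it checks the strictest reading of the criterion, namely that every same-type descriptor is closer to the reference than every descriptor of a different type.

```python
import math

def euclidean_distance(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def same_type_is_closer(reference, same_type, other_type):
    """True if all same-type descriptors lie nearer to `reference`
    than any descriptor of a different type."""
    farthest_same = max(euclidean_distance(reference, d) for d in same_type)
    nearest_other = min(euclidean_distance(reference, d) for d in other_type)
    return farthest_same < nearest_other
```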
Fig. 2: Original character images used to verify the effectiveness of the SIFT descriptor
In Fig. 2, the character 8 is very similar to the character B and the Chinese character Xiang to the Chinese character Yu, so such characters are difficult cases in character recognition. To verify the robustness of the SIFT algorithm, each character type includes 10 images, from left to right: the original image, images with added Gaussian noise of 0.1, 0.2, 0.3 and 0.4 and images with oblique angles of 5, 15, 25, 35 and 45°.
Four curves are shown in Fig. 3 as the experimental results. The first image of 8, B, Xiang and Yu, respectively, is used as the reference against which the similarities of the generated SIFT descriptors are compared. From left to right, the 40 points in Fig. 3 correspond successively to the 40 images in Fig. 2.
First, the first curve, which takes the character 8 as the reference, is analyzed. Except that the distance for the tenth image is slightly higher than that for the eleventh to fifteenth images, the distances for the other 9 images of the character 8 are a little lower than those for images of different types. Second, the second curve, which takes the character B as the reference, shows that the distances for images of the same type B are a little lower than those for images of different types. The third and fourth curves, based on the Chinese characters Xiang and Yu, respectively, lead to the same results.
Moreover, the points in Fig. 3 that correspond to the noise-added images in Fig. 2 show that, on each curve, the images with added Gaussian noise remain very similar to the reference image of the same type, which indicates that the SIFT descriptor is highly robust to noise.
The above analysis validates the effectiveness and robustness of the SIFT descriptor generation algorithm, while the following experiment verifies that the algorithm is highly robust to character images with oblique angles of less than 25°.
Verification on effectiveness of SIFT descriptor and SVM for character recognition: Figure 4 shows the recognition rates for the different types of character images. In each of the four plots, the X axis stands for the testing image sets, whose images are either noise-added or rotated, and the Y axis for the recognition rate. From left to right, the 10 points in each plot stand for the original testing images, testing images with added Gaussian noise of 0.1, 0.2, 0.3 and 0.4 and testing images with oblique angles of 5, 15, 25, 35 and 45°.
Comparing the recognition rates at the first and fifth points shows that the SIFT descriptor is highly robust to noise: noise no stronger than Gaussian noise of 0.4 has little effect on the recognition rate. Comparing the SIFT+KNN algorithm with the SIFT+SVM algorithm shows that the SVM classification algorithm greatly improves the recognition rate.
Fig. 3: Verification on effectiveness of the SIFT descriptor
The advantage of the SVM classification algorithm is especially clear as the oblique angle of the character images becomes larger. Moreover, this classification algorithm has a greater influence on the recognition of Chinese characters (Fig. 4a) than on that of Arabic numerals (Fig. 4b), English letters (Fig. 4c) and the combination of English letters and Arabic numerals (Fig. 4d). Because Chinese characters are very difficult to recognize, this classification algorithm plays an important role in improving the overall performance of license plate character recognition.
Nevertheless, Fig. 4 also shows that when the oblique angle of the images exceeds 25°, the recognition rate drops quickly, especially for Chinese characters. Because of the complex strokes of Chinese characters, images with a very large oblique angle become structurally blurred after the orientation selection, which makes them difficult to recognize.
Verification on effectiveness of SIFT descriptor by dimension reduction: Figure 5 shows the recognition performance of the SIFT descriptor after dimension reduction by the PCA algorithm. A license plate recognition system requires high real-time performance but the SIFT descriptor is 128-dimensional, so it involves much calculation in actual application.
Fig. 4: Recognition rates of images of different types; (a) Chinese characters; (b) Arabic numerals; (c) Letters of English alphabet; (d) Letters of English alphabet + Arabic numerals
Fig. 5: Recognition rate after reducing the dimension of the SIFT descriptor
Therefore, the dimension of the SIFT descriptor is reduced to further satisfy the real-time demand of the license plate recognition system and the experiment in this study aims to validate the effectiveness of this method.
In Fig. 5, the X axis stands for the dimensions 20, 40, 60, 80, 100 and 128 and the Y axis for the recognition rate. The four curves show the recognition rates of Chinese characters, Arabic numerals, English letters and the combination of Arabic numerals and English letters at the different dimensions.
When the dimension of the vectors generated by the SIFT descriptor is lowered to no less than 60, the recognition rate of character images is hardly affected. When the dimension is lowered to 20, the recognition rate for Arabic numerals does not change, that for English letters drops by less than 0.01 and that for Chinese characters and the combination of Arabic numerals and English letters is not greatly influenced, dropping by only about 0.01.
The above analysis shows that it is feasible to lower the dimension of the SIFT descriptor with the PCA algorithm. In practical license plate recognition, the SIFT descriptor can be reduced to a suitable dimension according to the real-time requirement.
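The dimension reduction can be sketched with the standard library only. The text does not say how the principal components are computed, so this sketch assumes power iteration with deflation on the covariance matrix of the training descriptors; any eigen-solver would serve equally well and the function names are hypothetical.

```python
import math
import random

def fit_pca(vectors, k, iterations=100, seed=0):
    """Return (mean, components) holding the top-k principal directions."""
    n, dim = len(vectors), len(vectors[0])
    mean = [sum(v[d] for v in vectors) / n for d in range(dim)]
    centered = [[v[d] - mean[d] for d in range(dim)] for v in vectors]
    cov = [[sum(r[a] * r[b] for r in centered) / n for b in range(dim)]
           for a in range(dim)]
    rng = random.Random(seed)
    components = []
    for _ in range(k):
        # Start from a random unit vector and repeatedly apply the covariance.
        vec = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in vec))
        vec = [c / norm for c in vec]
        for _ in range(iterations):
            nxt = [sum(cov[a][b] * vec[b] for b in range(dim)) for a in range(dim)]
            norm = math.sqrt(sum(c * c for c in nxt))
            if norm == 0.0:
                break
            vec = [c / norm for c in nxt]
        eigenvalue = sum(vec[a] * sum(cov[a][b] * vec[b] for b in range(dim))
                         for a in range(dim))
        components.append(vec)
        # Deflation: remove the direction just found before the next pass.
        cov = [[cov[a][b] - eigenvalue * vec[a] * vec[b] for b in range(dim)]
               for a in range(dim)]
    return mean, components

def reduce_dimension(vector, mean, components):
    """Project one descriptor onto the principal directions."""
    centered = [value - m for value, m in zip(vector, mean)]
    return [sum(c * x for c, x in zip(comp, centered)) for comp in components]
```

In use, `fit_pca` would be run once on the 128-dimensional training descriptors with k set to the target dimension, for instance 60, and `reduce_dimension` applied to each descriptor before classification.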
In the present study, the SIFT descriptor is adapted to the recognition of license plate characters and a new recognition method is proposed. With the geometric center as the feature point, the PCA algorithm is used to select the orientation and the SIFT descriptor is then adopted to extract the feature data. Finally, the characters are classified and recognized by the SVM algorithm. The experiments show that this algorithm is very effective in recognizing characters, especially Chinese characters and the combination of English letters and Arabic numerals. However, the recognition rate of Chinese characters remains a bottleneck. Further study will therefore optimize the algorithm to improve the recognition rate of Chinese characters so as to satisfy practical engineering demands.
This study is supported by the National Natural Science Foundation of China under Grant No. 60975015, the Natural Science Foundation Project of CQ CSTC under Grant No. 2009Bb2364 and the Scientific and Technological Project of Chongqing under Grant No. 2009AC2057. Many thanks also to the anonymous referees for their comments on our work.
- Freeman, W.T. and E.H. Adelson, 1991. The design and use of steerable filters. IEEE Trans. Pattern Anal. Mach. Intell., 13: 891-906.
- Belongie, S., J. Malik and J. Puzicha, 2002. Shape matching and object recognition using shape contexts. IEEE Trans. Pattern Anal. Mach. Intell., 24: 509-522.
- Schaffalitzky, F. and A. Zisserman, 2002. Multi-view Matching for Unordered Image Sets, or How Do I Organize My Holiday Snaps? In: Computer Vision (ECCV'02), Heyden, A. and G. Sparr (Eds.). Springer, Berlin, Heidelberg, ISBN: 978-3-540-43745-1, pp: 414-431.