In this study, a novel algorithm is presented for dealing with the online data based on nonnegative matrix factorization model. Nonnegative Matrix Factorization (NMF) is a promising approach for face recognition in that it is capable of extracting the local features by factorizing the nonnegative matrix into two nonnegative matrices. However, there are two major weaknesses in almost all the existing NMF based methods. The first shortcoming is that the computational cost is high for large matrix factorization, it also needs more memory to save the huge data. The other is that it must conduct repetitive learning when the data samples are updated. To overcome these two limitations, a novel Online Nonnegative Matrix Factorization (ONMF) algorithm for online face recognition is presented in this study. The ONMF algorithm can not only deal with the incremental nonnegative matrix factorization, but also can deal with the decremental nonnegative matrix factorization which has never been considered in other study. Two face databases, namely ORL and FERET face database, are selected for evaluation. Compared with the conventional standard NMF, the method in this study gives the better performance both in computational costs and other aspects.
PDF Abstract XML References Citation
How to cite this article
Nonnegative Matrix Factorization (NMF) can extract local features by factorizing the nonnegative matrix into two nonnegative matrixes (Lee and Seung, 1999, 2000). It has been used for static data analysis and pattern recognition in the past. For the reason that the results of the factorization have good interpretable properties and also because of the sparseness of the factor matrixes, it has been widely used in many fields including face recognition and spectral data analysis (Guillamet and Vitria, 2002, 2003; Lee and Seung, 1999). However, for its low convergence speed and the huge memory demands for the online data processing in conventional NMF, it is impossible to apply the model to online data analysis. The purpose of the online NMF algorithm is to perform rapid NMF analysis so that the data in reality such as recommendation systems or online text mining can be produced in real time. In such applications, systems need to exclude participation of the old samples from their representations and also need to increase influence of the latest samples. For instances, in a real face recognition database, the new samples have to be added into the database and oppositely, the old samples will be excluded from the database.
Nonnegative matrix factorization was first used in face recognition to extract the local features in 1999 (Lee and Seung, 1999), it aims to perform nonnegative matrix decomposition on the training image matrix A such that A≈UVT, where, U, V are the base matrix and the coefficient matrixes, respectively. The local image features are learned and included in U as column vectors. In the latest years, several algorithms for dealing with the incremental learning were developed. In (Wen-Sheng et al., 2008; Pan et al., 2008), a supervised Incremental NMF (INMF) scheme was given base on the class of the new samples, with the orthogonal constraints, the author declares that their factors are unique and their algorithm is convergent , experiments given in their study show that their algorithm have a good performance, but the influence of the new samples only on the class while not the whole database which will affect the local features extraction. Bucak and Gunsel (2009), an incremental subspace learning method was proposed by a weighted cost function which allows controlling the memorylessness of the factorization and incrementally updating its factors by reflecting the influence of each observation, the method was tested on video data sequences features extraction. Cao et al. (2007), a new model was presented by adding some orthogonal constraint and sparseness constraints in the objective function, it also approximate the relation between the old base matrix and the new base matrix. Rebhan et al. (2008), an incremental learning algorithm was given, it can incrementally and continuously adapt to the new data. Several other new algorithms for the incremental NMF algorithm also are given such as by Li et al. (2010) and Bucak and Gunsel (2007). All the above algorithms have taken account into the incremental case of the online data while have not considered the decremental situations which often happens in reality. To the best of our knowledge, there is no studies available dealing the decremental case based on NMF.
In this study, we give an algorithm which will consider the two cases: incremental and decremental cases. The study is organized as follows: first, we derive the algorithms for the two cases and then analysis the computational complexity of the algorithm. After that, experiments are given based two face databases. The conclusions and the future work were given in the end.
The basic model for nonnegative matrix factorization is to factorize the nonnegative matrix A into two nonnegative matrix factors U, V, where, the matrix U is the base matrix and V is the coefficient matrix. Basing on the Euclidean error distance, the NMF problem has the following optimization model:
The updates formula for standard NMF (Lee and Seung, 2000) is:
The factors are U*, V* and we have the approximations A≈U*(V*)T.
Incremental learning: The basic NMF method has been proven to be an effective method for extracting the local features by seeking the nonnegative factorizations. However, the basic NMF method and its available existing variations assume that the data are static. In reality, we need considering the dynamic nature of the online data. In order to apply the NMF based method on online data, we have to repetitively factorize the data when it updates, this procedure is clearly time consuming. Thus, our direction is to aim at improving the basic NMF based method by developing an online version. From the point of the application in face recognition, the task of NMF is to find a set of nonnegative bases to represent the input data by a linear combination, when the new data arrives and the old data is deleted, the data base need to be updated to represent the data. So, we mainly consider the changes in the base matrix, this is the main idea of the online NMF method given in this study.
For the new incremental samples αεRmxk, the coefficient based on the old bases is:
In Eq. 3, the coefficient of the incremental samples is not necessarily nonnegative due to the generalized inverse of matrix U*, so in the iteration, we project the coefficient onto the nonnegative orthant when it is not nonnegative.
Assumptions: The column vectors of matrix U is linearly independent.
This assumption is modest in that the column vectors of the base vectors are often linear independent. For the incremental samples, we have the following model:
where, matrix A is the old data and a is the new samples, under the condition of assumption, the above model then is changed into the following form:
Then, by using the method by Lee and Seung (1999), we can derive the iteration formula of the algorithm. The objective function of the above model is:
For simplicity, we can denote b = a-U(U*TU*)-1U*Ta and then we have:
The lagrangian function of the above model Eq. 5 is:
The KKT conditions of the optimization problem Eq. 8 is:
where, matrixes λ, μ have the similar dimensions as U,V, respectively. And the symbol q denotes elmentwise product of the two matrixes, sum( ) has the same mean with that in Matlab. Matrix λ,μ≥0 mean all the entries in the matrixes are nonnegative.
According to the KKT conditions, we have the following results:
Then, we have:
So, the iterates of the algorithm is:
In order to avoid division by zero, we have a small positive value ε,then we have:
Decremental learning: Suppose matrix a are the old samples that will be excluded, first we rearrange the column of the samples matrix as:
where, matrix P is the permutation matrix, then we have:
Use the same philosophy as the incremental case, we mainly take account into the base matrix of the factorization, then we have:
where, V1 is the new coefficient matrix and (V1T, U+a) has the same size as . With the same division as above, we have the iterations for the decremental case:
In order to avoid division by zero, like what we have done for Eq. 11 , we have a small positive value ε, then we have:
The online NMF algorithm:
|•||Step 1 input the initializations U(0),V(0) the rank r≤min(m,n), ε and the termination criterion for the algorithm, carry out the following iterations until convergent|
|•||Step 2 update U according to the iterates formula Eq. 13 or 18:|
|•||Step 3 update V according to the iterates formula Eq. 13 or 18|
Extra computational complexity of the above algorithm: Compared with the conventional standard NMF updates formula, there are more operations needed to be carried out in advance. In the iterations, we mainly take account into the multiplication and division operations. Generally, for a matrix with kxk size, the operations for the calculations of the inverse matrix are O(k3) based on the Gaussian Elimination methods. Suppose the number of the updating data (incremental/decremental ) is k, which means a∈Rmxk, the base matrix U∈Rmxr, the extra operations of the ONMF algorithms in an iteration compared with the conventional NMF are:
|•||U*TU has nr2 times operations|
|•||(U*TU*)-1 has O(r3) times operations;|
|•||The production of (U*TU)-1UT has mr2 times operations and the total operations for the calculation of (U*)+ are O(r3) + (m+n)r2|
|•||The operations for the (a(U*+a)T)ij is m2r+mr|
|•||the operations for the UU*+aaT (U*+)T is m2r +mr|
In the real experiments, we can calculate VTV and after that we calculate U(VTV) which will save the computational costs because the former operations are O(mnr) and the operations for latter are O(max(m,n)·r2).
Here, Feret and ORL face databases are selected to evaluate the performance of our ONMF method along with conventional NMF method. All the images in the two databases are 112x92 size. The original images are reduced to 28x23 size for the ORL database and 20x20 size for the Feret database by the DCT transformation. We stop the iteration if the stopping condition is met or if it exceeds 1000 times iterations. The standard NMF means the NMF iterations formula used by Lee and Seung (2000), the results with the standard NMF means the results obtained by using the standard NMF algorithm on the samples with incremental/decremental case.
Experiment 1: The database is feret face recognition database, we evaluate r = 20, ε = 1e-006, the number of the samples is 300 and the number of incremental sample is 30. We initialize all the algorithms randomly. The factorization results is in Table 1.
Experiment 2: The database is ORL face recognition database, we evaluate r = 20, ε = 1e-006, the number of the samples is 300 and the number of incremental sample is 30. We initialize all the algorithms randomly. The factorization results is in Table 2.
Experiment 1: The database is feret face database, we evaluate r = 20, ε = 1e-006, the number of the samples is 300 and the number of decremental sample is 30. We initialize all the algorithms randomly. The factorization results is in Table 3.
Experiment 2: The database id ORL face database, we evaluate r = 20, ε = 1e-006, the number of the samples is 300 and the number of decremental sample is 30.
|Table 1:||Numerical results of incremental NMF for feret database|
|Table 2:||Numerical results of incremental NMF for ORL database|
|Table 3:||Numerical results of decremental NMF for feret database|
|Table 4:||Numerical results of decremental NMF for ORL database|
We initialize all the algorithms randomly. The factorization results is in Table 4.
The sparseness of the base matrix U: For a vector uεRn, we define the sparseness as that by Hoyer (2004),
The sparseness of the base matrix is defined as:
The left side iteration number is the number for Incremental NMF and decremental NMF, the right side of the iterations number is the number for the standard NMF. The sparseness is for the base matrix of the factors.
|Table 5:||The sparseness of the base matrix|
The experimental results are given from Table 1-5. From the experimental results, we can find that the ONMF algorithm can approximate the local minimum with less iteration than the conventional standard NMF algorithm. The time costs are a little more than the conventional NMF with the same iterations but our algorithm can achieve the local minimum in several iterations while the conventional NMF cant. With the same iterations for the two algorithms, the error of the objective functions for our algorithm is less. The sparseness of the base matrix of ONMF algorithm is higher than the conventional NMF, which means that it can extract the local features successfully. So, the ONMF algorithm has good performance compared with the conventional algorithm.
CONCLUSIONS AND FUTURE WORK
A novel ONMF algorithm for face recognition based on online updating is proposed in this study. Compared with the conventional NMF based methods, our algorithm have the following properties: (1) It only need storing the base matrix of the former factorization without storing the original data matrix ,which will save a huge storage space; (2) It can deal with not only the incremental online data, but also the decremental case, which have not been considered in other studies. (3) The algorithm have a better convergence speed, it can convergent to the approximate local minimum with fewer iterations. Experimental results on ORL and FERET face database show that the ONMF algorithm has good performance.
The future research is the online update scheme under different error distance criterion such as KL divergence. We also plan to test the effectiveness of our approach on more data sets and more fields such as text mining, image processing, video surveillance and so on.
This study is partly supported by National Nature Science Foundation of China (NNSFC) under the grant No. 10571109; the Nature Science Foundation of Shandong Province under grant No. Y2008A01; the Spring Bud project 2009 of SDUST.
- Guillamet, D. and J. Vitria, 2002. Non-negative Matrix Factorization for Face Recognition. In: Lecture Notes in Computer Science, Escrig-Monferrer, M.T., F. Toledo and E. Golobardes (Eds.). Springer, Berlin, Heidelberg, ISBN: 978-3-540-00011-2. pp: 336-344.
- Guillamet, D. and J. Vitria, 2003. Evaluation of distance metrics for recognition based on non-negative matrix factorization. Pattern Recognition Lett., 4: 1599-1605.
- Cao, B., D. Shen, J.T. Sun, X. Wang, Q. Yang and Z. Chen, 2007. Detect and track latent factors with online nonnegative matrix factorization. Proceedings of the 20th International Joint Conference on Artificial Intelligence, Jan. 6-12, Hyderabad, India, pp: 2689-2694.
- Hoyer, P.O., 2004. Nonnegative matrix factorization with sparseness constraints. J. Machine Learning Res., 5: 1457-1469.