Journal of Applied Sciences

Year: 2007 | Volume: 7 | Issue: 20 | Page No.: 2947-2956
DOI: 10.3923/jas.2007.2947.2956
On The Use of Advanced Correlation Filters for Human Posture Recognition
Nooritawati Md. Tahir, Aini Hussain, Salina Abdul Samad, Hafizah Husain and Andrew Teoh Beng Jin

Abstract: This study presents a method for applying advanced correlation filters to the human posture recognition task. Two types of correlation filters were implemented and their efficacy evaluated: the Minimum Average Correlation Energy (MACE) filter and the Unconstrained Minimum Average Correlation Energy (UMACE) filter. Initial results indicate that correlation filters offer significant potential for posture recognition, with UMACE outperforming the MACE filter. Both filters were subjected to the challenging task of recognizing human posture without any restriction on gender, clothing or posture variation. The UMACE filter performed remarkably well, with an average accuracy of 89%, compared with 42% for the MACE filter.

How to cite this article
Nooritawati Md. Tahir, Aini Hussain, Salina Abdul Samad, Hafizah Husain and Andrew Teoh Beng Jin, 2007. On The Use of Advanced Correlation Filters for Human Posture Recognition. Journal of Applied Sciences, 7: 2947-2956.

Keywords: recognition, posture and Advanced correlation filters

INTRODUCTION

One of the central topics in computer vision is the recognition of human postures. Posture recognition is one step in the global process of analyzing human behaviour. Behaviour analysis is vital to many applications, such as video surveillance systems that detect and track a person and recognize special actions such as theft, as well as activity recognition and pedestrian detection. A further goal is to achieve a more natural means of interaction with computers. In this paper we describe a novel methodology for recognizing four main postures, namely standing, sitting, bending and lying, using advanced correlation filters. Previous methods based on 2D appearance (Haritaoglu et al., 1998, 2000; Wren et al., 1997; Bobick and Davis, 2001; Iwasawa et al., 1997) share a similar approach. First, the principal parts of the body such as the head, hands and feet are detected; based on these detections, the secondary parts of the body such as the shoulders, elbows and knees are located. Haritaoglu et al. (1998, 2000) segmented the silhouette from the background and computed the vertical and horizontal projections of the silhouette to determine the global posture of the person: standing, sitting, crawling-bending or lying.

To perform posture recognition, their system computed the projections of the current silhouette and compared them with projection models built for a set of predefined postures. Alternatively, Iwasawa et al. (1997) proposed a method that first determines the center of gravity of the human silhouette and then computes the orientation of the upper half of the body. Finally, significant points such as the feet, hands, elbows and knees are estimated using a heuristic contour analysis of the silhouette. The principal drawback of these methods is their dependency on the point of view. However, they are not resource demanding since only one camera is used, and they are therefore suited to real-time applications.

Recently, advanced correlation filters have been investigated and have evolved into very effective algorithms for pattern recognition applications, specifically biometric verification (Hennings and Vijaya Kumar, 2004; Jingu et al., 2005; Savvides and Vijaya Kumar, 2003; Venkataramani and Vijaya Kumar, 2004; Chong et al., 2006). For instance, Savvides and Vijaya Kumar (2003) examined the performance of advanced correlation filters for face authentication. Their results are based on the illumination subsets of the CMU PIE (Carnegie Mellon University Pose, Illumination and Expression) database. They also presented methods that reduce the memory requirements of these filters so that they can run on limited computational resources, including computationally efficient methods of synthesizing the filters. Further, they described an online training algorithm, implemented on a face verification system, that synthesizes correlation filters from a video stream to handle pose and scale variations. Their system also uses an efficient scheme to perform face localization within the same framework during the authentication stage. Another related study, by Jingu et al. (2005), evaluated face recognition performance on visual and thermal infrared (IR) face images using advanced correlation filter methods. They used MACE and OTSDF (Optimal Tradeoff Synthetic Discriminant Function) filters and achieved better performance than commercial face recognition algorithms such as FaceIt®. Their study showed that correlation filters perform very well when the face images are of significantly low resolution and could be applied to human identification at a distance (HID). They also described in detail a fully automated method of eyeglass detection and removal in thermal images, which resulted in a significant increase in thermal face recognition performance.
Venkataramani and Vijaya Kumar (2004) also evaluated the performance of composite correlation filters in fingerprint verification for access control applications. They focused on digital live-scan fingerprints obtained from sensors, rather than the inked fingerprints usually used in criminal identification. The NIST Special Database 24, obtained from an optical fingerprint sensor, was used to evaluate the performance of fingerprint verification in the presence of distortion. Their results showed that the unconstrained filter was a good choice since it can be incrementally updated to reduce complexity. The results also demonstrated the distortion tolerance potential of correlation filters, which performed reasonably well even with low resolution images. In addition, Hennings and Vijaya Kumar (2004) introduced the application of correlation filter classifiers to palmprint identification and verification. They described how the extraction of an appropriate region of interest on the palmprint surface could be used to design correlation filters that achieve 100% recognition on a database of 50 persons. Recently, Chong et al. (2006) developed a private biometrics formulation based on the concealment of a random kernel and the iris images to synthesize a MACE filter for iris authentication. The purpose was to provide a private biometrics realization for iris authentication in which the biometric template can be reissued once it has been compromised. The proposed method decreased the computational load by reducing the filter size and thus improved the authentication rate significantly.

In this study, in an attempt to further demonstrate the efficiency and robustness of correlation filters, we investigate their possible use in our posture recognition system. The main advantage of this method over previous approaches to posture recognition (Haritaoglu et al., 1998, 2000; Wren et al., 1997; Bobick and Davis, 2001; Iwasawa et al., 1997) is that no feature extraction stage is involved, since the image pixels themselves are the input domain for training the filters. This property of correlation filters has been shown to be effective in recognition and classification (Sao and Yegnanarayana, 2004; Hennings and Vijaya Kumar, 2004). As such, in this study we aim to illustrate the additional ability of correlation filters to recognize posture images.

MATERIALS AND METHODS

First, four categories of human posture, namely standing, bending, sitting and lying, were chosen for classification. The image database was recorded under normal office lighting in an indoor environment. The system implementation includes background subtraction and silhouette extraction. In this work, we assume a static background, and background subtraction is achieved by thresholding the difference between the current frame and the static background image. In doing so, a human silhouette is extracted. Advanced correlation filters then serve as the posture classification system. Advanced correlation filters, also known as composite filters, are a family of correlation-based classifiers. A template is correlated with a test image and the output plane is searched for a sharp peak that ideally resembles an impulse. Such a peak is the characteristic feature of correlation outputs corresponding to images of the correct posture. In contrast, when an image of a different posture is correlated, the output does not contain a well-defined peak, which indicates that the image belongs to a false or incorrect posture. This concept is shown in Fig. 1. The performance of such correlation templates depends highly on the set of images used for computing the template. Advanced correlation filters have been thoroughly studied in the last two decades and have evolved into very effective algorithms for pattern recognition applications (Gader et al., 2004; Sao and Yegnanarayana, 2004; Hennings and Vijaya Kumar, 2004). They can be designed to accommodate the intrinsic amplitude variability of the images in the training set while being tolerant to noise pervading the images (Jingu et al., 2005; Savvides and Vijaya Kumar, 2003).
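The background subtraction step described above can be sketched as follows. This is a minimal illustration only: it assumes grayscale frames stored as NumPy arrays, and the function name `extract_silhouette` and the threshold value of 30 are hypothetical choices, not values reported in this study.

```python
import numpy as np

def extract_silhouette(frame, background, threshold=30):
    """Return a binary silhouette mask by thresholding |frame - background|."""
    # Cast to a signed type so subtracting uint8 images cannot wrap around.
    diff = np.abs(frame.astype(np.int32) - background.astype(np.int32))
    return (diff > threshold).astype(np.uint8)

# Toy example: a flat static background with a brighter 2x3 foreground region.
background = np.full((6, 6), 100, dtype=np.uint8)
frame = background.copy()
frame[2:4, 1:4] = 200
mask = extract_silhouette(frame, background)
```

In practice the threshold would need to be tuned to the lighting conditions of the recording environment.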

Fig. 1: Application of advanced correlation filter in posture recognition

In the present experiments, 150 images of size 64×64 for each of the four main postures were chosen as the database for evaluating the performance of MACE and UMACE for posture recognition. We first used 3 training images, comprising front and side views of the posture, to synthesize the MACE filter for that posture. The process was repeated with the number of training images increased in steps of 3 until reaching 15 images. To evaluate the performance for each posture, all the images in the dataset were cross-correlated with that posture's MACE filter. With 15 training images, this results in 150-15 = 135 correlation outputs corresponding to the true class posture and 150×3 = 450 corresponding to false class postures. The corresponding Peak-to-Sidelobe Ratios (PSRs) were then measured, recorded and evaluated.

Overview of composite filters: The fundamental motivation for designing correlation filters was distortion-invariant optical pattern recognition. The Synthetic Discriminant Function (SDF) approach was introduced for this purpose in 1980 (Muller and Herbst, 2002). An SDF is a linear combination of matched spatial filters whose weights are chosen so that the correlation outputs corresponding to the training images yield equal correlation peak values at the origin. The basic SDF is also referred to as the Equal Correlation Peak (ECP) SDF. The drawback of the ECP SDF is twofold. First, it cannot tolerate significant input noise. Second, the ECP SDF has no built-in shift-invariance capability; thus, shifts in the input target are replicated in the correlation output. Nevertheless, the SDF method has deeply influenced the current design of advanced correlation filters, although the idea was introduced more than twenty years ago. To achieve robustness to noise, the MVSDF (Minimum Variance Synthetic Discriminant Function) was introduced (Vijaya Kumar et al., 2005). There are two drawbacks to this method. The first is that the MVSDF also controls only one point in the correlation map, just like the ECP SDF. The second is that the noise covariance must be known beforehand in order to design the filter. Moreover, even if the latter is known exactly, the MVSDF is impractical because it requires inverting a large noise covariance matrix (Vijaya Kumar et al., 2005). The MACE (Minimum Average Correlation Energy) filter was an attempt to control the entire correlation plane. It reduces the correlation function levels at all points except the origin of the correlation plane and thereby obtains a very sharp correlation peak (Vijaya Kumar et al., 2005). However, MACE filters often suffer from two main drawbacks. First, there is again no built-in immunity to noise. Second, the MACE filter is often excessively sensitive to intra-class variations. Nevertheless, MACE leads to a useful frequency domain design approach for object recognition. Studies have shown that hard constraints on correlation values at the origin are not only unnecessary but can be counterproductive (Vijaya Kumar et al., 2005). Hence, Unconstrained Correlation (UC) filters were introduced.

In this study, we have implemented one of the special UC filters, called UMACE, short for Unconstrained Minimum Average Correlation Energy. For notation, matrices are denoted by upper case and vectors by bold italic characters. Upper case symbols refer to the frequency domain, while lower case symbols represent quantities in the spatial domain. The symbols T, * and + denote the transpose, complex conjugate and complex conjugate transpose, respectively. Let us consider a set of images, each of size d1 × d2 = d. Each image is represented in a d-dimensional image space by a column vector of the form:

$$x = [x(1), x(2), \ldots, x(d)]^{T} \tag{1}$$

Next, let xi denote the ith training image, where i = 1 to N, and let Xi be the Fourier transform of xi.

Thus

$$X_i = [X_i(1), X_i(2), \ldots, X_i(d)]^{T} \tag{2}$$

and

$$X = [X_1, X_2, \ldots, X_i, \ldots, X_N]$$

Where:

Xi(k) = The kth element of the frequency domain vector Xi,
k = 1, …, d.

Note that X is a d × N matrix with N column vectors Xi. Let the column vector h be the impulse response of the correlation filter and H its frequency response. We then define ci as the correlation function of the ith image xi with h, such that:

$$c_i = x_i \otimes h = \mathrm{IFFT}\{X_i(k) \cdot H(k)\}$$

and

(3)
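The frequency-domain computation of a correlation plane can be sketched with NumPy as below. This is an illustrative sketch, not the implementation used in this study. The function name `correlation_plane` is hypothetical, and conjugating H (so that the product yields a correlation rather than a convolution) is an assumed convention, since conjugate marks may not have survived in the expression above. The example uses the image's own spectrum as a matched filter, for which the peak falls at the origin.

```python
import numpy as np

def correlation_plane(image, H):
    """Correlate a spatial-domain image with a frequency-domain filter H."""
    X = np.fft.fft2(image)
    # Conjugating H turns the frequency-domain product into a correlation
    # (instead of a convolution); this convention is an assumption here.
    return np.real(np.fft.ifft2(X * np.conj(H)))

rng = np.random.default_rng(0)
image = rng.random((8, 8))
H = np.fft.fft2(image)            # matched filter built from the image itself
plane = correlation_plane(image, H)
peak_index = np.unravel_index(np.argmax(plane), plane.shape)
```

With this FFT layout the origin of the plane is the element at index (0, 0); `np.fft.fftshift` can be applied to move it to the center for display.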

MACE (Minimum Average Correlation Energy) filters: Vijaya Kumar et al. (2005) have defined several quantities that are useful for formulating the design approach as an optimization problem. The overall objective is to determine H so that the average energy in the correlation plane is minimized and a sharp peak is obtained at the origin. The quantities are given as follows:

(a) The power spectrum of Xi:

$$D_i(k,k) = |X_i(k)|^{2}, \quad k = 1, \ldots, d \tag{4}$$

where Di is a diagonal matrix of size d × d whose diagonal components are the squared magnitudes of the associated components of Xi. It is also the power spectrum of xi.

(b) The energy of the ith correlation plane:

$$E_i = H^{+} D_i H \tag{5}$$

(c) The correlation peak amplitude at the origin:

$$c_i(0) = X_i^{+} H = u_i \tag{6}$$

In general, ui is user-defined, where i = 1, …, N. All ui belonging to the same posture class are set to 1; otherwise they are set to 0.

(d) The average correlation plane energy Eav:

$$E_{av} = \frac{1}{N}\sum_{i=1}^{N} E_i = \frac{1}{N}\sum_{i=1}^{N} H^{+} D_i H \tag{7}$$

The optimal design problem solved by the MACE filter H can be expressed as:

$$\min_{H} E_{av} = \min_{H} \frac{1}{N} H^{+} D H$$

Where:

(8)

Where:

i = 1,…..N.

It can be shown that, subject to the peak constraints of Eq. 6, the optimal solution of the MACE filter is given by:

$$H = D^{-1} X \left(X^{+} D^{-1} X\right)^{-1} u \tag{9}$$

If we have N training images, each of size d1 × d2 pixels, then X in Eq. 9 is a matrix of size L × N, where L is the total number of pixels in a single training image (L = d1 × d2 = d). X contains along its columns the lexicographically re-ordered 2-D Fourier transforms of the N training images. D is a diagonal matrix of dimension L × L containing the average power spectrum of the training images along its diagonal. The column vector u contains N entries, corresponding to the desired values at the origin of the correlation plane for the training images.
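A minimal NumPy sketch of MACE synthesis from the quantities X, D and u described above is given here for illustration. It uses the standard closed-form MACE solution H = D⁻¹X(X⁺D⁻¹X)⁻¹u. The function name `mace_filter`, the default u of all ones and the small floor on the power spectrum are assumptions made for this sketch and are not taken from this study.

```python
import numpy as np

def mace_filter(images, u=None):
    """Sketch of MACE synthesis: H = D^-1 X (X^+ D^-1 X)^-1 u."""
    # Columns of X are the flattened 2-D Fourier transforms of the images.
    X = np.stack([np.fft.fft2(im).ravel() for im in images], axis=1)  # L x N
    n_train = X.shape[1]
    u = np.ones(n_train) if u is None else np.asarray(u, dtype=float)
    # D is diagonal, so only its diagonal (average power spectrum) is stored.
    d = np.mean(np.abs(X) ** 2, axis=1)
    d = np.maximum(d, 1e-12)          # assumed guard against division by zero
    Dinv_X = X / d[:, None]           # D^-1 X
    A = X.conj().T @ Dinv_X           # X^+ D^-1 X, an N x N matrix
    H = Dinv_X @ np.linalg.solve(A, u)
    return H.reshape(images[0].shape), X

rng = np.random.default_rng(0)
train = [rng.random((8, 8)) for _ in range(3)]
H, X = mace_filter(train)
# The constrained peak values X^+ H should reproduce u = [1, 1, 1].
peaks = X.conj().T @ H.ravel()
```

Only an N × N system is solved here, so the cost of the non-diagonal inversion grows with the number of training images rather than with the number of pixels.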

UMACE (unconstrained minimum average correlation energy) filters: UMACE is another variant of the MACE filter (Savvides and Vijaya Kumar, 2003). Instead of imposing a hard constraint at the origin of the correlation plane, it leaves the height at the origin free to increase according to the test data. The optimized filter is given by $H = D^{-1}m$, where m is a column vector containing the average of the 2D Fourier transforms of the training images. UMACE filters are computationally more attractive as they require the inversion of only a diagonal matrix. Noise tolerance can be built into the filters as described by Savvides and Vijaya Kumar (2003). This is done by substituting D with D′, where $D' = \alpha D + \sqrt{1-\alpha^{2}}\,C$ and C is a diagonal matrix containing the noise power spectral density. For white noise, C is the identity matrix. The parameter α ranges from 0 to 1 and is chosen to trade off noise tolerance against discrimination. Note that α = 1 gives D′ = D, i.e., the minimum average correlation energy design with no added noise tolerance.
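The UMACE expression H = D⁻¹m, together with the noise-tolerant substitution of D by αD + sqrt(1-α²)C, can be sketched as follows. The function name `umace_filter` and its parameters are hypothetical, and white noise (C equal to the identity) is assumed when no noise spectrum is supplied.

```python
import numpy as np

def umace_filter(images, alpha=1.0, noise_psd=None):
    """Sketch of UMACE synthesis: H = D'^-1 m, D' = alpha*D + sqrt(1-alpha^2)*C."""
    X = np.stack([np.fft.fft2(im).ravel() for im in images], axis=1)
    d = np.mean(np.abs(X) ** 2, axis=1)   # diagonal of D (average power spectrum)
    m = np.mean(X, axis=1)                # average training spectrum
    # White noise is assumed by default, so C is the identity matrix.
    c = np.ones_like(d) if noise_psd is None else np.asarray(noise_psd).ravel()
    d_alpha = alpha * d + np.sqrt(1.0 - alpha ** 2) * c
    # Only a diagonal matrix is inverted, i.e., an element-wise division.
    return (m / d_alpha).reshape(images[0].shape)

rng = np.random.default_rng(1)
train = [rng.random((8, 8)) for _ in range(4)]
H_umace = umace_filter(train)               # alpha = 1: no noise term
H_white = umace_filter(train, alpha=0.0)    # alpha = 0 with C = I: H reduces to m
```

The α = 0 case is included only as a sanity check on the formula; an intermediate value would be chosen in practice to balance noise tolerance against peak sharpness.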

Peak-to-Sidelobe Ratio (PSR): The MACE and UMACE methodologies produce a separate filter for each posture class. Given a test image, we wish to ascertain how similar it is to the class represented by a MACE or UMACE filter. However, correlating a test image with a filter produces a complete correlation plane. Savvides and Vijaya Kumar (2003) and Venkataramani and Vijaya Kumar (2004) have suggested the Peak-to-Sidelobe Ratio (PSR) as a summary of the information in each correlation plane. The PSR is therefore used here to evaluate the degree of similarity indicated by a correlation plane. Its significance is that it measures the sharpness of the correlation peak for a test image.
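The text does not give an explicit PSR formula, so the sketch below uses a definition common in the correlation filter literature: the peak value minus the mean of a surrounding sidelobe region, divided by the standard deviation of that region, with a small mask around the peak excluded. The function name `psr` and the window sizes are assumptions and may differ from those used in this study.

```python
import numpy as np

def psr(plane, sidelobe_half=10, mask_half=2):
    """Peak-to-sidelobe ratio: (peak - sidelobe mean) / sidelobe std."""
    plane = np.asarray(plane, dtype=float)
    r, c = np.unravel_index(np.argmax(plane), plane.shape)
    peak = plane[r, c]
    rows, cols = np.indices(plane.shape)
    # Chebyshev distance from the peak, without wrap-around (a simplification).
    dist = np.maximum(np.abs(rows - r), np.abs(cols - c))
    # Sidelobe region: a square window around the peak minus a central mask.
    sidelobe = plane[(dist <= sidelobe_half) & (dist > mask_half)]
    return (peak - sidelobe.mean()) / sidelobe.std()

rng = np.random.default_rng(0)
noise = rng.normal(0.0, 0.01, (32, 32))
sharp = noise.copy()
sharp[16, 16] = 1.0           # impulse-like peak, as expected for a true class
psr_sharp = psr(sharp)
psr_flat = psr(noise)         # no distinct peak, as expected for a false class
```

A plane with a sharp, isolated peak yields a much larger PSR than a plane without one, which is the property the verification threshold relies on.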

RESULTS AND DISCUSSION

The recognition performance of the MACE filter in discriminating all postures is shown in Fig. 2.

Fig. 2: PSR score for each class posture using MACE filter. The black dots indicate the correlation peaks for the training images

Fig. 3: Sample Correlation outputs using MACE filter designed for Standing Posture. Top (right): Input is a standing posture belonging to training set. Top (left): Input is a standing posture from test data. Bottom: Inputs are posture images from false class

Fig. 4: PSR score for each class posture using UMACE filter

Clearly, the MACE filter fails to distinguish between the true class posture and the false class postures for all posture categories. The black dots indicate the correlation peaks for the training images used to synthesize the MACE filter for each posture and, as expected, they yield high PSR values. Fig. 3 shows the correlation outputs for the standing posture using the MACE filter. The sharp correlation peak resulting in a large PSR value corresponds to the training image. The left correlation output shows the response to a test image of the true class posture, followed by the responses to the false class postures, namely sitting, bending and lying. The poor performance of MACE is expected. It is due to variations in the test dataset that were not adequately captured by the training set, since human postures have very flexible, complex and poorly defined shapes. In face recognition (Sao and Yegnanarayana, 2004), fingerprint verification (Venkataramani and Vijaya Kumar, 2004), palmprint recognition (Hennings and Vijaya Kumar, 2004) and iris recognition (Chong et al., 2006) applications, increasing the number of training images helps improve authentication, but in our study adding more training images does not necessarily enhance the PSR values. This is due to the weakness of the MACE filter, which has no built-in immunity to noise and therefore cannot accommodate the intra-class variations. Next, the UMACE filter was applied as an alternative to the MACE filter in order to attain better posture recognition. We repeated the same procedures using UMACE to determine the PSR performance for each posture class. As can be seen, the UMACE filter produces a significant margin that discriminates the true class posture from the false class postures (Fig. 4).

The present experimental results for the UMACE filter showed that increasing the number of training images increased the margin between the true class and the false class postures, thus giving better discriminating ability. It was also observed that the variance of the false class PSR values is reduced as the number of training images is increased. In the verification stage using the UMACE filter, three quantities were used to select the proper threshold, namely the False Acceptance Rate (FAR), the False Rejection Rate (FRR) and the Equal Error Rate (EER). For a given verification PSR threshold θv for a class, the FAR is the fraction of false class images whose PSR exceeds θv and the FRR is the fraction of true class images whose PSR falls below θv.

Fig. 5: FAR vs threshold and FRR vs threshold. EER is obtained at the threshold where FAR equals FRR

Fig. 6: Correlation outputs for each posture using UMACE filter with 12 training images


When FAR and FRR are equal, their common value is referred to as the Equal Error Rate (EER). Next, each posture in the database was verified against every other posture. For a particular UMACE filter, criterion scores were obtained for all of the false class postures (150×3 = 450) and the true class postures (150 minus the number of training images used). Table 1 shows the average EER for each posture using 3 training images, with the number then increased in steps of 3 up to 21 images. It is noted that beyond 12 training images the EER remains almost unchanged even when the number of training images is increased. Therefore we conclude that 12 training images are adequate for our templates.
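The threshold-based error rates can be sketched as follows, using the usual definitions of FAR (false class scores accepted at a threshold) and FRR (true class scores rejected at that threshold). The function names and the PSR scores below are hypothetical and chosen only to illustrate the computation; they are not results from this study.

```python
import numpy as np

def far_frr(true_scores, false_scores, threshold):
    """Return (FAR, FRR) for a given PSR threshold."""
    true_scores = np.asarray(true_scores, dtype=float)
    false_scores = np.asarray(false_scores, dtype=float)
    far = np.mean(false_scores >= threshold)   # false class accepted
    frr = np.mean(true_scores < threshold)     # true class rejected
    return far, frr

def equal_error_rate(true_scores, false_scores):
    """Approximate the EER by scanning the observed scores as thresholds."""
    thresholds = np.unique(np.concatenate([true_scores, false_scores]))
    rates = [far_frr(true_scores, false_scores, t) for t in thresholds]
    # Pick the threshold where FAR and FRR are closest to each other.
    far, frr = min(rates, key=lambda pair: abs(pair[0] - pair[1]))
    return (far + frr) / 2.0

true_psr = [12.0, 15.0, 9.0, 20.0]    # hypothetical true class PSR values
false_psr = [4.0, 6.0, 10.0, 5.0]     # hypothetical false class PSR values
far, frr = far_frr(true_psr, false_psr, threshold=8.0)
eer = equal_error_rate(true_psr, false_psr)
```

With a finite set of scores FAR and FRR may never be exactly equal, which is why the sketch averages the two rates at the closest crossing point.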

Table 1: EER for all 4 UMACE filters synthesized using different number of training images

Subsequently, the threshold value for each posture was determined from the performance curves for all postures using the UMACE filter, as shown in Fig. 5. The lower the EER, the better the overall performance of the verification system. The results show that the UMACE filter performs better and can separate the true class posture from the false class postures, particularly for the standing and lying postures, with recognition results of 96.2 and 93.6%, respectively. Sample correlation outputs for the postures are presented in Fig. 6. A test image that best matches the training images will have a large PSR value. The PSR value thus reflects the UMACE filter's ability to recognize and verify similarity between postures. It can be seen in Fig. 6 that the true class posture has a higher PSR than the false class postures.

CONCLUSIONS

The performance of advanced correlation filters, specifically MACE and UMACE, for posture verification has been presented. Based on the results obtained, the UMACE filter is found to be more robust than MACE with respect to variation in postures. Using 12 training images, the UMACE filters are capable of discriminating over 90% of the true class postures from the false class postures for the standing and lying postures, whereas for the sitting and bending categories they obtained above 80% correct classification. It should be noted that the posture database comprises various positions for both genders without clothing restrictions. This recognition rate outperformed the results of Goldmann et al. (2004), who obtained an average classification rate of 79.77% for classifying the same four main postures. These results are promising and demonstrate the potential of advanced correlation filters as an interesting option for posture recognition. Further work includes evaluating the effect and performance of different image regions and using multiple filters for the training images. Incremental updating of filters could also be considered for memory reduction, since storage is a constraint in most recognition systems; with incremental updating, only a few training images are involved at a time. The performance of other correlation filters, such as the Maximum Average Correlation Height (MACH) filter and the Distance Classifier Correlation Filter (DCCF), could also be investigated for the posture verification system.

ACKNOWLEDGMENTS

This research was supported by MOSTI under the IRPA Grant No: 03-02-02-0017-SR0003/07-03. The authors also acknowledge Prof. Dr. Burhanuddin Yeop Majlis as the Program Head and Universiti Teknologi Mara (UiTM) for the UiTM-JPA SLAB scholarship award.

REFERENCES

  • Bobick, A.F. and J.W. Davis, 2001. The recognition of human movement using temporal templates. IEEE Trans. Pattern Anal. Mach. Intell., 23: 257-267.


  • Chong, S.C.A., T.B. Jin and D.N.C. Ling, 2006. Iris authentication using privatized advanced correlation filter. Proceedings of the International Conference on Advances in Biometrics, January 5-7, 2006, IEEE Xplore, pp: 382-388.


  • Gader, P., J.M. Keller, T. Jones, J. Miramonti and G. Hobson, 2004. MACE Prefiltering for neural network based automatic target recognition neural networks. Proc. IEEE World Cong. Comput. Intel., 6: 4006-4011.


  • Goldmann, L., M. Karaman and T. Sikora, 2004. Human body posture recognition using MPEG-7 descriptors. Proc. SPIE, 5308: 177-188.


  • Haritaoglu, I., D. Harwood and L. Davis, 1998. W4: Real-time surveillance of people and their activities. Proceedings of the 3rd International Conference on Face and Gesture Recognition, April 14-16, 1998, IEEE Xplore, pp: 222-227.


  • Haritaoglu, I., D. Harwood and L.S. Davis, 2000. W4: Real-time surveillance of people and their activities. IEEE Trans. Pattern Anal. Mach. Intell., 22: 809-830.


  • Hennings, P. and B.V.K. Vijaya Kumar, 2004. Palmprint recognition using correlation filter classifiers. Proc. Signals. Syst. Comput., 1: 567-571.


  • Iwasawa, S., K. Ebihara, J. Ohya and S. Morishima, 1997. Real-time estimation of human body posture from monocular thermal images. Proceedings of the International Conference on Computer Vision and Pattern Recognition, June 17-19, 1997, San Juan, Puerto Rico, pp: 15-20.


  • Jingu, H., M. Savvides and B.V.K. Vijaya Kumar, 2005. Performance evaluation of face recognition using visual and thermal imagery with advanced correlation filters. Proceedings of the Computer Society Conference on Computer Vision and Pattern Recognition, June 25, 2005, San Diego, CA., USA., pp: 9-14.


  • Muller, N. and B.M. Herbst, 2002. On the use of SDF-type filters for distortion parameter estimation. IEEE Trans. Pattern Anal. Mach. Intell., 24: 1521-1528.


  • Sao, A.K. and B. Yegnanarayana, 2004. Face verification using correlation filters and autoassociative neural networks. Proceedings of the International Conference on Intelligent Sensing and Information Processing, January 4-7, 2004, IEEE Xplore, pp: 364-367.


  • Savvides, M. and B.V.K. Vijaya Kumar, 2003. Efficient design of advanced correlation filters for robust distortion-tolerant face recognition. Proceedings of the Conference on Advanced Video and Signal Based Surveillance, July 21-22, 2003, Miami, Florida, pp: 45-52.


  • Venkataramani, K. and B.V.K. Vijaya Kumar, 2004. Performance of composite correlation filters for fingerprint verification. J. Optic. Eng., 24: 1820-1827.


  • Vijaya Kumar, B.V.K., A. Mahalanobis and R.D. Juday, 2005. Correlation Pattern Recognition. 1st Edn., Cambridge University Press, Cambridge, UK., pp: 130-243.


  • Wren, C.R., A. Azarbayejani, T. Darrell and A.P. Pentland, 1997. Pfinder: Real-time tracking of the human body. IEEE Trans. Pattern Anal. Mach. Intell., 19: 780-785.
