
Journal of Software Engineering

Year: 2015 | Volume: 9 | Issue: 4 | Page No.: 721-734
DOI: 10.3923/jse.2015.721.734
Dimensionality Reduction for Classification of Blind Steganalysis
Xiuhui Ge and Hao Tian

Abstract: One of the critical problems in steganalysis is reducing the dimension of high-dimensional feature sets. Dimensionality reduction can sharpen the distinction between cover and stego objects and achieve higher classification accuracy. A number of approaches have been proposed for dimensionality reduction, yet many of them are seldom considered in blind steganalysis. This study addresses the question of whether the low-dimensional mappings produced by different dimensionality reduction methods can improve the classification accuracy of blind steganalysis. The suggested approach was tested through a series of experiments designed to evaluate the impact of different DR methods, namely PCA, LDA, Isomap, S-Isomap, LLE and SLLE. Experiments on real data sets demonstrated that some dimensionality reduction methods (such as LLE and LDA) can have more discriminative power than others (such as PCA and Isomap). When Isomap and LLE are compared with S-Isomap and SLLE, the results reveal that the supervised methods perform better than the unsupervised dimensionality reduction methods in blind steganalysis.



Keywords: Dimensionality reduction, PCA, LDA, LLE, SLLE, Isomap, S-Isomap and blind steganalysis

INTRODUCTION

Dimensionality Reduction (DR) has become a core topic in machine learning, pattern recognition, information retrieval and data mining. In many applications it is natural to represent observations in two, three or higher-dimensional spaces, but it is difficult to manipulate data in very high-dimensional spaces. Dimensionality reduction removes redundant or irrelevant information and finds the inherent low-dimensional structure in high-dimensional data, so that fewer dimensions are needed to represent the meaningful content. In high-dimensional spaces, sample points are usually very sparse and the distances between sample points grow large and tend to become equal, so distance-based learning algorithms may fail and the computational overhead increases significantly with the dimension. DR alleviates these problems.

DR combines and transforms variables either linearly (LDR) or nonlinearly (NLDR). Classic linear methods are Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and Independent Component Analysis (ICA). Principal component analysis was proposed by Pearson (1901), who sought a smaller set of variables that describe observations in two, three or higher-dimensional spaces by the best-fitting straight line or plane. PCA (Jolliffe, 1986) first computes the covariance matrix of the sample data, then solves for its eigenvectors, sorts the eigenvalues in descending order, selects the leading eigenvectors as projection vectors and projects the original high-dimensional data onto the resulting subspace. Because PCA is a linear, unsupervised method, it finds the best straight line (or plane) through the set of data points. PCA acts like a pre-processing method that reduces the dimension of the original data while maximizing the variance of the low-dimensional data. Fisher (1936) proposed Linear Discriminant Analysis (LDA) to reduce the dimension of data for later classification. LDA solves for projection vectors and projects the high-dimensional data onto a low-dimensional space; it defines the Fisher criterion function and takes its extremal vectors as the best projection directions. In the projected space, the between-class scatter is maximized and the within-class scatter is minimized, which supports more accurate classification. Comon (1994) proposed Independent Component Analysis (ICA), a linear transformation that expresses the signal or data as a linear combination of statistically independent, non-Gaussian sources. Hyvarinen et al. (2001) note that what distinguishes ICA from other methods is that it looks for components that are both statistically independent and non-Gaussian. In essence, linear DR finds the best linear model under different optimality criteria.

In the last decade, many NLDR methods have been proposed. One class of nonlinear dimensionality reduction is based on kernels. Scholkopf et al. (1998, 1999) first introduced kernels into the DR area and proposed Kernel PCA (KPCA), a nonlinear generalization of PCA that performs PCA in arbitrarily high-dimensional feature spaces. Besides KPCA, researchers proposed many other kernel-based methods, such as KICA (Bach and Jordan, 2002), KFDA (Mika et al., 1999) and KSOM. The kernel approach seeks a function that linearly (or nearly linearly) separates the data in the feature space.

In addition to kernel methods, manifold learning is another important way to achieve nonlinear dimensionality reduction. Seung and Lee (2000) argued that, compared with linear dimensionality reduction, manifold learning can find meaningful low-dimensional structure in nonlinear high-dimensional data. The basic assumption of manifold learning is that the samples are distributed over an underlying manifold, so the low-dimensional manifold structure can be recovered from the high-dimensional data, i.e., a low-dimensional manifold embedded in the high-dimensional space and the corresponding embedding mapping can be found. Although the input space has high dimension, the intrinsic dimension of the manifold is much lower and manifold learning re-expresses the high-dimensional data set in a low-dimensional space. DR methods based on manifold learning belong to the spectral methods, characterized by the use of eigendecompositions to describe the internal structure of the manifold. They mainly include Multidimensional Scaling (MDS) (Cox and Cox, 1994), Isometric Mapping (Isomap) (Tenenbaum et al., 2000), Locally Linear Embedding (LLE) (Roweis and Saul, 2000), Laplacian eigenmaps (Belkin and Niyogi, 2003), Local Tangent Space Alignment (LTSA) (Zhang and Zha, 2005), Maximum Variance Unfolding (MVU) (Van der Maaten et al., 2009) and so on. Compared with linear DR, nonlinear DR can find the essential dimensions of high-dimensional data and therefore supports more effective data analysis.

In image-based steganalysis, image features and classification are the two key technologies. The extracted features should be sensitive to the steganographic embedding process and insensitive to the content of the cover images. Finding key features requires analyzing the underlying structure of the data and, as the dimension increases, the number of required data points grows exponentially. Thus, it is impossible to build a universal steganalyzer in the space of all cover images; the steganalyzer has to be built in a space of smaller dimension, that is, a special feature set has to be obtained. The complete feature set can be written as:

F = {f1, f2,..., fd}∈Rd

DR aims to find a function:

φ: Rd→Rn, n<d

that maps each d-dimensional feature vector to an n-dimensional representation. Through DR, the important characteristics of the data are preserved, which is closely related to feature selection. In the classification problems of steganalysis, objects are represented by feature vectors in a feature subspace; obtaining such a representation requires feature selection, which is usually difficult and domain dependent. Feature selection and DR are not the same, but feature selection can be understood as a special case of DR in which the reduced dimensions are a subset of the original ones. Feature selection tries to find a subset of features {fi1,..., fin}∈Rn, ij∈{1,..., d}, n<d. Miche et al. (2007) pioneered the use of feature selection in steganalysis; they used feature selection to select the features most relevant to the desired classification and pointed out the drawbacks of performing steganalysis in high-dimensional spaces: (1) the need for many data points and (2) the increase of complexity and the lack of interpretability. At present, many DR techniques, such as ICA, MDS, Laplacian eigenmaps, LTSA, LLE, Isomap, SLLE and S-Isomap, are not widely used in steganalysis.

Feature selection aims to reduce dimensionality and can be viewed as a special case of DR. In steganalysis, reducing the dimension of the features can effectively improve the distinction between cover and stego objects and achieve higher classification accuracy.

METHODOLOGY

To carry out this study, a literature review of the existing approaches to dimensionality reduction was conducted. Following this review, we introduce supervised and unsupervised methods such as Isomap, S-Isomap, LLE and SLLE into the steganalysis area and compare them with traditional methods such as PCA and LDA. Extensive experiments were then performed on datasets extracted by different feature-extraction programs, namely CHEN (486), CCHEN (972), CCC300 (4860) and SPAM (686).

PCA for classification: Principal Component Analysis (PCA) (Jolliffe, 1986) is a linear method for identifying patterns in data that uses eigenvalues and eigenvectors to highlight the similarities and differences between data points. It projects the data points from the high-dimensional space into a low-dimensional space while keeping the maximum variance between the projected points. It finds the significant eigenvalues and eigenvectors and forms a new coordinate system defined by the significant eigenvectors.

PCA works as follows:

Step 1: Prepare the dataset and subtract the mean
Step 2: Calculate the covariance matrix and its eigenvectors and eigenvalues; choose the eigenvectors corresponding to the d largest eigenvalues as the projection directions and form the feature vector
Step 3: Project the original high-dimensional data onto the subspace spanned by the projection vectors

PCA is one of the simplest linear dimensionality reduction methods; it preserves the Euclidean distances between data points and ensures maximum variance, but it removes only the linear correlation between the data. Moreover, as the experiments below show, PCA does not necessarily help classification, because it focuses on preserving the fidelity of the original high-dimensional data, i.e., minimizing the projection loss or maximizing the variance.
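
The following minimal NumPy sketch illustrates Steps 1-3; the random matrix X and the number of retained components are placeholders for the real extracted features, not part of the original experiments:

import numpy as np

def pca_project(X, n_components):
    # Step 1: subtract the per-feature mean
    X_centered = X - X.mean(axis=0)
    # Step 2: covariance matrix, eigendecomposition, keep the largest eigenvalues
    cov = np.cov(X_centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)              # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]    # indices of the largest eigenvalues
    W = eigvecs[:, order]                               # projection vectors
    # Step 3: project onto the subspace spanned by the projection vectors
    return X_centered @ W

X = np.random.rand(200, 686)          # placeholder for 200 extracted 686-D feature vectors
Z = pca_project(X, n_components=3)    # 200 x 3 low-dimensional representation
print(Z.shape)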

LDA for classification: Linear Discriminant Analysis (LDA) is a supervised linear dimensionality reduction method, first proposed by Fisher (1936). LDA makes data points within a class as close as possible and data points of different classes as separated as possible, i.e., after dimensionality reduction the within-class distance is as small as possible and the between-class distance is as large as possible. A prerequisite of LDA is that the class labels of the data must be known before dimensionality reduction. Given the labelled data set, the data are projected onto a line (or low-dimensional subspace) so that the points are separated by class. In this way, high-dimensional labelled samples are projected into a low-dimensional space for classification and the feature dimension is compressed. In the new subspace, the data form clusters by class and samples of the same class are closer together.
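
A minimal scikit-learn sketch of LDA used in this way is given below; the synthetic data stand in for real cover/stego feature vectors and are only for illustration:

from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Synthetic stand-in for labelled cover/stego feature vectors (0 = cover, 1 = stego)
X, y = make_classification(n_samples=400, n_features=50, n_informative=10, random_state=0)
lda = LinearDiscriminantAnalysis(n_components=1)   # two classes allow at most one direction
Z = lda.fit_transform(X, y)                        # 1-D projection used for classification
print(Z.shape, "resubstitution accuracy:", lda.score(X, y))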

ISOMAP and S-ISOMAP for classification: Two influential methods have been proposed to solve the NLDR problem, namely Isomap and LLE. This section studies Isomap and its variants. Tenenbaum et al. (2000), in a paper published in Science, proposed the nonlinear dimensionality reduction algorithm Isomap. The authors use local metric information to learn the underlying global geometry of a data set, applying classical metric MDS with the Euclidean distance replaced by the geodesic distance. Isomap differs from PCA and Multi-Dimensional Scaling (MDS) but shares their advantages: a non-iterative, polynomial-time procedure with asymptotic convergence guarantees. PCA projects the input data onto a linear subspace chosen so that the largest possible data variance is retained in lower dimensions. MDS finds an embedding that best preserves the distances between data points. An image is in essence a data set that can be represented in Cartesian coordinates, so image variability can be measured mathematically. Isomap represents the global structure of a dataset in a single coordinate system and is essentially an extension of MDS. Because the low-dimensional representation produced by MDS is centred at the origin, MDS preserves inner products; in other words, the inner products in the low-dimensional space approximate the distances in the high-dimensional space. In classical MDS, the distances are usually Euclidean distances in the high-dimensional space. Isomap keeps the MDS framework but works on manifolds, using the geodesic distance instead of the Euclidean distance.

Isomap works as follows (a short scikit-learn sketch is given after the steps):

Step 1: Calculate the K nearest neighbors of each object on the manifold; the neighborhood relations are represented as a weighted graph
Step 2: Calculate the shortest path between every two points in the graph, i.e., the geodesic distance between the data points, and form the geodesic distance matrix
Step 3: Run MDS on the geodesic distance matrix to obtain the low-dimensional representations of the data points
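
The sketch below uses the scikit-learn implementation of Isomap; the swiss-roll data are a synthetic stand-in for real high-dimensional steganalytic features:

from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap

# Synthetic manifold data standing in for high-dimensional feature vectors
X, _ = make_swiss_roll(n_samples=1000, random_state=0)
# n_neighbors is K in Step 1; the geodesic distances and MDS of Steps 2-3 run internally
Z = Isomap(n_neighbors=10, n_components=2).fit_transform(X)
print(Z.shape)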

Isomap produces good low-dimensional embeddings when the data are sampled uniformly from a single flat manifold. It also needs enough data points to estimate the geodesic distances; with too few points the result is inaccurate. Isomap is sensitive to noise, so with noisy input the result is not stable. Moreover, Isomap is an unsupervised method whose mapping is designed for visualization rather than classification. Many variants (Jenkins and Mataric, 2004; Zha and Zhang, 2003) have therefore been proposed to overcome these weaknesses. Geng et al. (2005) proposed a robust method based on the idea of Isomap, called S-Isomap, which follows the supervised learning scheme and uses class-label information to complete the nonlinear dimensionality reduction. The main difference between supervised and unsupervised learning is whether class information is available for the data samples. The goal of unsupervised DR methods is to minimize the loss of information during dimension reduction, while the goal of supervised DR methods is to maximize the discriminability between classes for classification. Unsupervised algorithms assume no prior information on the input data, such as class labels or pairwise constraints; supervised algorithms, in contrast, have class labels. Isomap is an unsupervised algorithm, whereas S-Isomap is a supervised one. In Supervised Isomap (S-Isomap), the Euclidean distance is replaced by a dissimilarity. The method defines the dissimilarity D(xi, xj) between two sample points xi and xj (Geng et al., 2005) as:

D(xi, xj) = sqrt(1 - exp(-d²(xi, xj)/β))   if li = lj
D(xi, xj) = sqrt(exp(d²(xi, xj)/β)) - d0   if li ≠ lj        (1)

where d(xi, xj) is the Euclidean distance between xi and xj, the parameter β is used to prevent D(xi, xj) from increasing too fast when d(xi, xj) is relatively large, d0 is a constant and li and lj are the class labels. If the dissimilarity is less than 1, xi and xj belong to the same class; otherwise they belong to different classes. Thus the inter-class dissimilarity is larger than the intra-class dissimilarity, which is a very good property for classification. The procedure of S-Isomap is similar to that of Isomap:

Step 1: Calculate the dissimilarity matrix using the class labels in place of the distance matrix
Step 2: Run Isomap on the dissimilarity matrix; the remaining steps are the same as in unsupervised Isomap

To use S-Isomap for classification, first construct an approximate out-of-sample mapping using a related machine learning technique such as a generalized regression network or kernel regression. Second, use KNN or LDA on the embedded data to predict the class labels, i.e., train the classifier, and then test the classifier on new data points. If the accuracy is good enough, stop training and testing; otherwise repeat the training and testing process.
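
The following is a rough, illustrative sketch of this procedure, assuming the Eq. 1 form of the dissimilarity: it builds a kNN graph on the supervised dissimilarities, computes geodesic (shortest-path) distances, embeds them with metric MDS and trains a KNN classifier on the embedded points. The synthetic data, the heuristic choice of β and the value of d0 are assumptions for illustration only and the regression-based out-of-sample mapping is omitted:

import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import shortest_path
from sklearn.datasets import make_classification
from sklearn.manifold import MDS
from sklearn.neighbors import KNeighborsClassifier

def s_isomap(X, y, n_neighbors=10, n_components=2, d0=0.5):
    d2 = squareform(pdist(X, 'sqeuclidean'))          # squared Euclidean distances
    beta = d2.mean()                                  # heuristic scale parameter (assumption)
    same = y[:, None] == y[None, :]
    D = np.where(same,
                 np.sqrt(1.0 - np.exp(-d2 / beta)),   # intra-class dissimilarity, always < 1
                 np.sqrt(np.exp(d2 / beta)) - d0)     # inter-class dissimilarity, pushed higher
    graph = np.full_like(D, np.inf)                   # kNN graph built on the dissimilarities
    for i in range(len(D)):
        nn = np.argsort(D[i])[1:n_neighbors + 1]
        graph[i, nn] = D[i, nn]
    # geodesic distances (Step 2 of Isomap); assumes the kNN graph is connected
    G = shortest_path(graph, method='D', directed=False)
    mds = MDS(n_components=n_components, dissimilarity='precomputed', random_state=0)
    return mds.fit_transform(G)                       # Step 3 of Isomap

X, y = make_classification(n_samples=300, n_features=40, n_informative=8, random_state=0)
Z = s_isomap(X, y)
knn = KNeighborsClassifier(n_neighbors=5).fit(Z, y)   # KNN used to predict class labels
print("resubstitution accuracy:", knn.score(Z, y))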

LLE and SLLE for classification: Locally Linear Embedding (LLE) is, like Isomap, a classic unsupervised learning algorithm. It was proposed by Roweis and Saul (2000). LLE best preserves the local neighborhood of each object and uses the remaining objects to preserve the global distances. The low-dimensional data produced by LLE retain as much of the original topological structure as possible. LLE works as follows (Vlachos et al., 2002); a short scikit-learn sketch is given after the steps:

Step 1: Find the neighbors of every data point
Step 2: Compute the weight matrix that best linearly reconstructs each point from its neighbors
Step 3: Calculate the low-dimensional embedding vectors best reconstructed by the weight matrix: construct the cost matrix from the weight matrix and compute its bottom nonzero eigenvectors
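
The sketch referred to above uses the scikit-learn implementation of LLE, again on synthetic data standing in for real features:

from sklearn.datasets import make_swiss_roll
from sklearn.manifold import LocallyLinearEmbedding

# Synthetic manifold data standing in for high-dimensional feature vectors
X, _ = make_swiss_roll(n_samples=1000, random_state=0)
# n_neighbors corresponds to Step 1; the weights (Step 2) and embedding (Step 3) run internally
lle = LocallyLinearEmbedding(n_neighbors=12, n_components=2, random_state=0)
Z = lle.fit_transform(X)
print(Z.shape, "reconstruction error:", lle.reconstruction_error_)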

LLE has advantages such as not requiring an iterative algorithm and having few parameters to set. Its assumption is that the data are well sampled from a single manifold. To extend LLE to multiple manifolds, De Ridder et al. (2003) proposed supervised LLE (SLLE). In the first step of traditional LLE, the neighbors of each sample point are found using the Euclidean distance; SLLE adds the sample label information to this step.

The idea of SLLE is as follows (a rough sketch is given after the steps):

Step 1: Calculate the affinity (distance) matrix between all samples and use the label information to modify it, giving it an off-diagonal block structure that replaces the distance matrix of LLE and separates the different classes more clearly
Step 2: Solve for the reconstruction weights
Step 3: Compute the embedding from the eigenvectors of the resulting cost matrix
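
The sketch below follows one common formulation of SLLE, in which the pairwise distance is inflated by a fraction alpha of the maximum distance whenever two points have different labels, before the standard LLE steps are run; this exact form and the synthetic data are illustrative assumptions rather than the original implementation:

import numpy as np
from scipy.linalg import eigh
from scipy.spatial.distance import pdist, squareform
from sklearn.datasets import make_classification

def slle(X, y, n_neighbors=10, n_components=2, alpha=0.5, reg=1e-3):
    n = X.shape[0]
    dist = squareform(pdist(X))
    # Step 1: inflate distances between points of different classes, then pick neighbors
    dist = dist + alpha * dist.max() * (y[:, None] != y[None, :])
    neighbors = np.argsort(dist, axis=1)[:, 1:n_neighbors + 1]
    # Step 2: reconstruction weights of each point from its (label-aware) neighbors
    W = np.zeros((n, n))
    for i in range(n):
        Zi = X[neighbors[i]] - X[i]
        C = Zi @ Zi.T
        C += reg * np.trace(C) * np.eye(n_neighbors)   # regularize for numerical stability
        w = np.linalg.solve(C, np.ones(n_neighbors))
        W[i, neighbors[i]] = w / w.sum()
    # Step 3: bottom nonzero eigenvectors of the cost matrix M = (I - W)^T (I - W)
    M = (np.eye(n) - W).T @ (np.eye(n) - W)
    _, vecs = eigh(M)
    return vecs[:, 1:n_components + 1]                 # skip the constant eigenvector

X, y = make_classification(n_samples=300, n_features=30, n_informative=6, random_state=0)
print(slle(X, y, n_neighbors=12).shape)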

There are also other variants, such as Hessian LLE.

Classification in blind steganalysis: Steganalysis is the art and science of detecting messages hidden using steganography. It examines suspected media, such as images, audio and video, to determine whether secret information is present and, if possible, locates the hidden communication, blocks the covert channel and, if necessary, extracts the hidden information. Among the various attack methods in steganalysis, the most challenging is blind steganalysis, in which the hidden media, key, statistics, media distribution and embedding algorithm are all unknown. Blind steganalysis is the ultimate goal of steganalysis: it is independent of both the embedding algorithm and the embedding technology.

The term "blind" means that such methods only use cover sets and stego sets to train the classifier and may then detect any secret message embedded in a cover. In this study, our steganalysis is based on images. A blind detection algorithm is not concerned with the embedded information or with analyzing the camouflage algorithm; it focuses on the natural patterns of digital images. These methods find statistics with discriminative power and use them as feature vector sets; after feature selection, a detection model is built from the experimental data using neural networks, clustering algorithms or other regression tools and an appropriate decision threshold is chosen. The resulting detection model is the cover/stego classifier. Avcibas et al. (2001) first introduced machine learning methods into the field of steganalysis, after which blind steganalysis became a focus of research. The method of Avcibas is completely blind steganalysis. Currently, the semi-blind setting is widespread: features are first extracted from cover and stego images and machine learning methods are then used to train and test the corresponding classifier.

For classification, we want to map the data into a space whose dimensions clearly separate members of different classes. Classification is a key step of steganalysis whose aim is to discover unknown relationships in large datasets, and both LDR and NLDR can serve this goal. The data points in the high-dimensional space are mapped into a low-dimensional space using PCA, LDA, Isomap, S-Isomap, LLE, SLLE and other DR methods. DR differs from direct feature selection, in which the features could be, for example, the first coefficients of a Fourier or wavelet decomposition. These dimensionality reduction methods can be viewed as a secondary feature extraction.

Dataset: The detection results of steganalysis are significantly influenced by the image features, so every experiment should be described in detail. In our experiments, 4000 images taken with digital cameras are used. Different steganography algorithms are applied to these images to form the cover dataset and the stego dataset.

We prepare the training and test datasets from the original cover and stego datasets. For each data set, one half of the data is used for training and the other half for testing; the training and testing sets contain equal numbers of cover and stego images.
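
A minimal sketch of this split, assuming X holds the extracted feature vectors and y the cover/stego labels; the random stand-in data below are placeholders only:

import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.rand(4000, 686)      # placeholder for the extracted feature vectors
y = np.repeat([0, 1], 2000)        # 0 = cover, 1 = stego
# Half for training, half for testing; stratify keeps equal cover/stego counts in each half
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, stratify=y, random_state=0)
print(len(X_train), len(X_test))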

The stego images are formed by different stego tools or algorithms; in our experiments we use S-Tools, Jsteg, Outguess, Steghide (Hetzl and Mutzel, 2005), F5 (Westfeld, 2001) and MBS (Sallee, 2003).

The idea behind using datasets produced by different stego tools is to study the effect of clean images versus stego images for different stego algorithms. At the same time, to form a blind steganalytic detector, we need to use different embedding methods to train and test the classifier.

For blind steganalysis, cover images are sometimes unavailable. In that situation there are two choices. One is to train the classifier with as many stego images as possible, produced by many embedding algorithms, so that the training samples are rich enough for the classifier to make accurate predictions about unknown embedding algorithms. The other is to train the classifier using cover images only; the classifier then learns the feature space of cover images and any distribution that differs from the cover model is regarded as suspicious. The advantage of the latter method is that the training process is simple and the classifier does not need to be retrained when a new embedding method appears. The former is semi-blind steganalysis; the latter is completely blind steganalysis.

Feature extraction: There are two kinds of feature extraction for blind steganalysis. One extracts features directly from the images, then selects or designs the classifier with the selected feature set and finally classifies new data with that classifier. The other first pre-processes the image, for example converting an RGB image to grayscale, JPEG compression, cropping, DCT transform or DWT transform, then performs feature selection and finally trains the classifier. In our experiments, following these two approaches, we use the following features for steganalysis, where the number in parentheses is the feature dimension:

CHEN(486) (Chen and Shi, 2008)
CCHEN(972) (Shi et al., 2006)
CCC300(48600) (Kodovsky and Fridrich, 2009)
SPAM (686) (Kodovsky and Fridrich, 2011)

RESULTS

After feature extraction, we use NLDR and LDR techniques to reduce the dimension; the dimension reduction is viewed as a further feature extraction. Because only 2D and 3D data can be represented graphically, in our experiments we apply classic dimensionality reduction techniques to the data and show the change before and after DR in Fig. 1.


Fig. 1(a-f): Original data and DR data of (a) PCA, (b) LDA, (c) LLE, (d) Isomap, (e) S-Isomap and (f) SLLE

In machine learning, researchers have done a great deal of work on dimensionality reduction. In this study, we introduce dimensionality reduction into the steganalysis field; PCA, LDA, MDS, Isomap, S-Isomap, LLE and SLLE can all be applied to the classification step of blind steganalysis.

Because we need to perform blind steganalysis, we randomly use Jsteg, Outguess, Steghide, F5 and MBS to process images and form the original datasets: dataset1 (200), dataset2 (800), dataset3 (1000) and dataset4 (2000), where the number in parentheses is the number of images. Then we use the separate feature-extraction programs CHEN (486), CCHEN (972), CCC300 (4860) and SPAM (686) to extract features. By running the different extraction programs on the different datasets, we obtain 16 groups of feature data, F1-F16.

These feature sets are viewed as the samples for classification. On each sample set, the different DR methods, PCA, LDA, Isomap, S-Isomap, LLE and SLLE, are run to form the DR data. The data after dimensionality reduction are then compared with the data before dimensionality reduction in terms of classification.

We use the well-known SVM classifier to perform the classification. The parameters of the method are determined empirically through ten-fold cross validation. The accuracy of the ten-fold cross validation of the SVM on the 16 data sets is tabulated in Table 1.
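
A minimal scikit-learn sketch of this evaluation protocol; the grid of C and gamma values and the synthetic feature set are illustrative assumptions rather than the parameters used in the original experiments:

from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for one (possibly dimension-reduced) feature set
X, y = make_classification(n_samples=600, n_features=50, n_informative=10, random_state=0)
# Determine C and gamma empirically, scoring each candidate with ten-fold cross validation
grid = GridSearchCV(make_pipeline(StandardScaler(), SVC(kernel='rbf')),
                    param_grid={'svc__C': [1, 10, 100], 'svc__gamma': ['scale', 0.01, 0.001]},
                    cv=10)
grid.fit(X, y)
print("best parameters:", grid.best_params_)
print("ten-fold CV accuracy:", grid.best_score_)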

To analyze the influence of dimensionality reduction on classification, we use the typical DR methods PCA, LDA, Isomap, LLE, S-Isomap and SLLE in several groups of contrast experiments, to verify whether the low-dimensional data can simplify the classifier, reduce the computational complexity and improve the accuracy of blind steganalysis. Among these methods, PCA and LDA are classical linear dimension reduction methods, Isomap and LLE are typical nonlinear dimension reduction methods and S-Isomap and SLLE are typical supervised nonlinear dimension reduction methods.

Table 1: Accuracy of data before dimensionality reduction

Table 2: Accuracy of data after dimensionality reduction

Applying the six DR methods to F1-F16 respectively, we obtain the reduced data Dij, where i ranges from 1-16 and indexes the 16 groups of original data and j ranges from 1-6 and indexes the six dimensionality reduction methods. The typical classifier (SVM) is then applied to each group Dij.
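
A rough sketch of this experimental loop using the scikit-learn reducers (the S-Isomap and SLLE sketches given earlier could be plugged in the same way); feature_sets below is a hypothetical synthetic stand-in for the 16 extracted feature groups F1-F16, and the reduced dimensions and neighborhood sizes are illustrative assumptions:

from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.manifold import Isomap, LocallyLinearEmbedding
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Hypothetical stand-in for the 16 extracted feature groups F1-F16
feature_sets = {'F%d' % i: make_classification(n_samples=200, n_features=60,
                n_informative=12, random_state=i) for i in range(1, 17)}
reducers = {'PCA': PCA(n_components=10),
            'LDA': LinearDiscriminantAnalysis(n_components=1),
            'Isomap': Isomap(n_neighbors=20, n_components=10),
            'LLE': LocallyLinearEmbedding(n_neighbors=20, n_components=10)}
results = {}
for fi, (X, y) in feature_sets.items():
    for name, reducer in reducers.items():
        # LDA is supervised and needs the labels; the unsupervised reducers ignore them
        Z = reducer.fit_transform(X, y) if name == 'LDA' else reducer.fit_transform(X)
        results[(fi, name)] = cross_val_score(SVC(kernel='rbf'), Z, y, cv=10).mean()
        print(fi, name, round(results[(fi, name)], 3))   # one entry Dij of Table 2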

In Table 2, columns 1 and 2 correspond to the typical linear dimension reduction methods, columns 3 and 4 to the typical nonlinear dimensionality reduction methods and columns 5 and 6 to the supervised nonlinear dimensionality reduction methods.

Table 2 gives the best performance on the 96 reduced data sets obtained from the 16 groups. For the supervised learning methods, several values of K from 15-40 were tested.

DISCUSSION

Fisher (1936) proposed Linear Discriminant Analysis (LDA) to reduce the dimension of data for later classification. Miche et al. (2007) described the drawbacks of performing steganalysis in high-dimensional spaces: (1) the need for many data points and (2) the increase of complexity and the lack of interpretability. Tian et al. (2008) applied PCA to image steganalysis and Qi et al. (2009) applied PCA to audio steganalysis. In this study, we introduce other typical dimensionality reduction methods into the steganalysis field and view them as a secondary feature extraction. Through systematic experiments, the proposed methods, such as LLE, SLLE, Isomap and S-Isomap, are compared with the traditional methods (such as PCA and LDA) in steganalysis.

Experiments on these data sets have been performed systematically and reveal a number of interesting points, as shown in Fig. 2:

PCA, LDA, SLLE, Isomap and S-Isomap all performed better in the reduced subspaces (D2j, D4j) than in the original feature space
In all the dimension reduction experiments, LDA and SLLE consistently perform better than the original feature space. These experiments also show that dimension reduction is especially suitable for some feature spaces
Comparing the methods that give the best performance on different data sets, the DR methods PCA, LDA, Isomap and S-Isomap give the best performance on the two data sets D2j and D4j, where they achieve perfect detection with an accuracy of 1

Fig. 2: Accuracy comparison of original data and DR data for PCA, LDA, LLE, Isomap, S-Isomap and SLLE

CONCLUSION

To deal with the continuing growth of the feature dimension in current steganalysis, we introduce dimensionality reduction methods used in pattern recognition, such as Isomap, S-Isomap, LLE and SLLE, into the steganalysis field. The work presented in this study opens the way to different perspectives. It views dimensionality reduction methods as a secondary feature extraction, with feature selection as a special case of DR. Experiments on real data sets demonstrated that some DR methods (such as LLE and LDA) can have more discriminative power than others (such as PCA and Isomap). Table 2 shows that, among the linear methods, LDA gives labelled data better separability. Comparing Isomap with S-Isomap and LLE with SLLE shows that the supervised methods perform better than the unsupervised DR methods in nonlinear dimension reduction.

The results also show that DR methods can effectively reduce the feature dimension while preserving steganalytic accuracy and can greatly improve the steganalytic efficiency.

Future work will include using other typical DR methods in blind steganalysis; we also plan to construct an automatic testing platform for blind steganalysis using software engineering techniques.

REFERENCES

  • Avcibas, I., N. Memon and B. Sankur, 2001. Steganalysis of watermarking techniques using image quality metrics. Proceedings of the SPIE Conference on Security and Watermarking of Multimedia Contents III, Volume 4314, January 21-26, 2001, San Jose, USA., pp: 525-531.


  • Bach, F.R. and M.I. Jordan, 2002. Kernel independent component analysis. J. Mach. Learn. Res., 3: 1-48.


  • Belkin, M. and P. Niyogi, 2003. Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput., 15: 1373-1396.


  • Comon, P., 1994. Independent component analysis: A new concept? Signal Process., 36: 287-314.


  • Cox, T.F. and M.A.A. Cox, 1994. Multidimensional Scaling. Chapman and Hall, London.


  • Chen, C. and Y.Q. Shi, 2008. JPEG image steganalysis utilizing both intrablock and interblock correlations. Proceedings of the International Symposium on Circuits and Systems, May 18-21, 2008, Washington, DC., USA., pp: 3029-3032.


  • De Ridder, D., O. Kouropteva, O. Okun, M. Pietikainen and R.P.W. Duin, 2003. Supervised locally linear embedding. Proceedings of the Joint International Conference on Artificial Neural Networks and Neural Information Processing, June 26-29, 2003, Istanbul, Turkey, pp: 333-341.


  • Fisher, R.A., 1936. The use of multiple measurements in taxonomic problems. Ann. Eugen., 7: 179-188.


  • Geng, X., D.C. Zhan and Z.H. Zhou, 2005. Supervised nonlinear dimensionality reduction for visualization and classification. IEEE Trans. Syst. Man Cybern. Part B: Cybern., 35: 1098-1107.


  • Hyvarinen, A., J. Karhunen and E. Oja, 2001. Independent Component Analysis. Wiley, USA., ISBN: 978-0-471-40540-5, Pages: 504


  • Qi, Y., Y. Wang and J. Yuan, 2009. Audio steganalysis based on co-occurrence matrix and PCA. Proceedings of the International Conference on Measuring Technology and Mechatronics Automation, April 11-12, 2009, Zhangjiajie, Hunan, pp: 433-436.


  • Scholkopf, B., A.J. Smola and K.R. Muller, 1999. Kernel Principal Component Analysis. In: Advances in Kernel Methods: Support Vector Learning, Scholkopf, B.C., J.C. Burges and A.J. Smola (Eds.). MIT Press, Cambridge, MA., USA., ISBN-13: 9780262194167, pp: 327-352


  • Hetzl, S. and P. Mutzel, 2005. A graph-theoretic approach to steganography. Proceedings of the 9th IFIP TC-6 TC-11 International Conference on Communications and Multimedia Security, September 19-21, 2005, Salzburg, Austria, pp: 119-128.


  • Jolliffe, I.T., 1986. Principal Component Analysis. Springer-Verlag, Berlin, Germany


  • Jenkins, O. and M. Mataric, 2004. A spatio-temporal extension to Isomap nonlinear dimension reduction. Proceedings of the 21st International Conference on Machine Learning, July 4-8, 2004, Banff, Alberta, Canada, pp: 441-448.


  • Kodovsky, J. and J. Fridrich, 2009. Calibration revisited. Proceedings of the 11th ACM Multimedia and Security Workshop, September 7-8, 2009, Princeton, New Jersey, pp: 63-74.


  • Kodovsky, J. and J. Fridrich, 2011. Steganalysis in high dimensions: Fusing classifiers built on random subspaces. Proceedings of the SPIE, Electronic Imaging, Media, Watermarking, Security and Forensics, January 23-26, 2011, San Francisco, CA., USA -.


  • Mika, S., G. Ratsch, J. Weston, B. Scholkopf and K.R. Mullers, 1999. Fisher discriminant analysis with kernels. Proceedings of the IEEE Neural Networks for Signal Processing Workshop, August 23-25, 1999, Madison, WI., pp: 41-48.


  • Pearson, K., 1901. On lines and planes of closest fit to systems of points in space. Lond. Edinburgh Dublin Phil. Maga. J. Sci., 2: 559-572.


  • Roweis, S.T. and L.K. Saul, 2000. Nonlinear dimensionality reduction by locally linear embedding. Science, 290: 2323-2326.


  • Scholkopf, B., A. Smola and K.R. Muller, 1998. Nonlinear component analysis as a kernel eigenvalue problem. Neural Comput., 10: 1299-1319.


  • Seung, H.S. and D.D. Lee, 2000. The manifold ways of perception. Science, 290: 2268-2269.


  • Sallee, P., 2003. Model-based steganography. Proceedings of the 2nd International Workshop Digital Watermarking, October 20-22, 2003, Seoul, Korea, pp: 174-188.


  • Shi, Y.Q., C. Chen and W. Chen, 2006. A Markov process based approach to effective attacking JPEG steganography. Proceedings of the 8th Information Hiding Workshop, July 10-12, 2006, Brittany, France, pp: 249-264.


  • Tenenbaum, J.B., V. de Silva and J.C. Langford, 2000. A global geometric framework for nonlinear dimensionality reduction. Science, 290: 2319-2323.


  • Tian, Y., Y.M. Cheng, Z.X. Qian and Y.L. Wang, 2008. Image steganalysis based on PCA and SVM. J. Graduate School Chin. Acad. Sci., 25: 74-79.


  • Van der Maaten, L.J., E.O. Postma and H.J. van den Herik, 2009. Dimensionality reduction: A comparative review. J. Mach. Learn. Res., 10: 66-71.


  • Vlachos, M., C. Domeniconi, D. Gunopulos, G. Kollios and N. Koudas, 2002. Non-linear dimensionality reduction techniques for classification and visualization. Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, July 23-25, 2002, Edmonton, AB, Canada, pp: 645-651.


  • Westfeld, A., 2001. F5 a steganographic algorithm: High capacity despite better steganalysis. Proceedings of the 4th Information Hiding Workshop, April 25-27, 2001, Pittsburgh, PA., USA., pp: 289-302.


  • Miche, Y., P. Bas, A. Lendasse, C. Jutten and O. Simula, 2007. Advantages of using feature selection techniques on steganalysis schemes. Proceedings of the 9th International Work-Conference on Artificial Neural Networks, June 20-22, 2007, San Sebastian, Spain, pp: 606-613.


  • Zha, H. and Z. Zhang, 2003. Isometric embedding and continuum ISOMAP. Proceedings of the 20th International Conference on Machine Learning, August 21-24, 2003, Washington DC., pp: 864-871.


  • Zhang, Z. and H. Zha, 2005. Principal manifolds and nonlinear dimensionality reduction via tangent space alignment. SIAM J. Sci. Comput., 26: 313-338.
