Texture is an important characteristic in the analysis of many types of images. Texture classification consists of partitioning a set of images into classes such that all images belonging to the same class are homogeneously textured.
The fundamental problem in texture classification is to determine a proper
set of features that can be used to make a distinction between textures. In
the past decades, many textural features have been used in texture analysis
(Zhang and Tan, 2002; Tuceryan and Jain, 1998). The most commonly used are those
derived from co-occurrence matrices, Markov random field models, Gibbs distribution,
autoregressive models, local linear transform, texture spectrum and multichannel/multiresolution methods.
Multichannel spatial filtering techniques have been motivated both by studies of the human visual system and by the suitability of such an analysis for many image processing tasks. The multichannel methods commonly used are the Wigner distribution, Gabor functions and the Wavelet Transform (WT).
Different types of WT were already used in texture classification and applied to many fields such as remote sensing (Zhang et al., 2005; Ferro and Warner, 2002), industrial control (Yang et al., 2004) and medical imaging (Soltanian-Zadeh et al., 2004).
The conventional pyramid-type wavelet transform was first suggested for texture analysis by Mallat (1989). It has since been used successfully in its initial form (Loum, 2006; Salari and Ling, 1995) or in combination with traditional single resolution techniques such as the co-occurrence matrix (Arivazhagan and Ganesan, 2003). A second type of WT, namely the discrete Wavelet Frame Transform (WFT), has also been used in texture classification and segmentation (Laine and Fan, 1996; Unser, 1995). This transform yields a texture description that is invariant with respect to translations of the input image. The performances of the WT and the WFT for texture characterisation were compared by Li and Shawe-Taylor (2005), who found that the features derived from the WFT discriminate better than those derived from the traditional WT.
A multiband extension of the WT and the WFT can be obtained by using wavelet packet or multi-wavelet transforms (Tham et al., 2000; Coifman and Wickerhauser, 1990). The latter have some useful properties for texture analysis such as symmetry, orthogonality, regularity, phase linearity and short support. In the multi-wavelet transform, several scaling functions and mother wavelets are used to decompose the original image, whereas in the Wavelet Packet Transform (WPT), the image is decomposed onto a wavelet packet basis. From the multichannel filtering point of view, the wavelet packet decomposition is applied to the outputs of the high pass filters as well as to those of the low pass filters.
In most WPT-based texture classification methods, the local texture properties
are characterized by texture features extracted from a small window
around each pixel in the channel outputs (Huang and Aviyente, 2006; Rajpoot,
2003; Laine and Fan, 1993). Because the WPT is an overcomplete representation,
some authors have argued that a set of useful channels must be selected for
a better texture characterisation. Chang and Kuo (1993) proposed
that only the channels with the highest energy content be decomposed at
each stage, which leads to an adaptive Tree Structured Wavelet Transform (TSWT).
More recently, Huang and Aviyente (2006) used an information-theoretic
measure, the mutual information, to select the channels.
In this study, we propose a recursive decomposition, namely the Full Wavelet Decomposition (FWD), which can be viewed as a particular case of the WPT. The FWD is obtained by applying the basic two-scale WT or WFT to all the outputs of the filter banks. All channel outputs of the FWD are then preserved for texture classification in order to determine a robust set of features that can be used to distinguish between textures. We assume that each channel contains information about frequency and orientation, both of which are essential for texture characterisation. In the present approach, the multichannel set of absolute values of the wavelet coefficients located at a spatial position (m, n) in the channel outputs is sorted in descending order. This operation provides the order of each channel associated with the position (m, n). The occurrences of a given channel, or of a group of channels, in a particular order are then estimated over a spatial window of size WxW. We assume that these occurrences are specific to the texture and can be used to characterise it. To quantify local texture properties, the occurrences are either recorded in a Channel Order Matrix (COM) or used to construct a Frequency Channel Spectrum (FCS). Finally, features derived from the COM or the FCS are extracted for texture characterisation and classification. The value of the proposed methods for texture classification is demonstrated in this study.
MATERIALS AND METHODS
Full wavelet decomposition: A 2-D discrete Wavelet Transform (WT) can be implemented with a quadrature mirror filter bank (Mallat, 1989). Low pass filter H and high pass filter G are applied to the image in both the horizontal and vertical directions, followed by a down sampling by two of each channel output in order to keep the overall number of samples constant (Fig. 1).
The main difference between the standard WT and the WFT is that no down sampling is applied to the output of the filter banks of WFT. Consequently, WFT provides a translation invariant image representation that is desirable for texture characterisation (Unser, 1995).
In the present study, the two transforms (WT and WFT) are used separately in two different algorithms. At each iteration, these transforms decompose the input image into a set of four frequency channels (low-low, low-high, high-low and high-high), denoted A, O, V and D, respectively.
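A minimal sketch of one such two-scale step may help fix ideas. It assumes the orthonormal Haar pair for H and G, circular boundary handling, and a particular filtering order (rows first, then columns); none of these choices is prescribed by the text, and the function names are hypothetical.

```python
import numpy as np

# Orthonormal Haar pair (assumption for this sketch; any QMF pair could be used).
H = np.array([1.0, 1.0]) / np.sqrt(2.0)   # low pass
G = np.array([1.0, -1.0]) / np.sqrt(2.0)  # high pass

def filter_axis(x, taps, axis, downsample):
    """Circularly filter x along one axis; optionally keep every second sample."""
    y = sum(t * np.roll(x, -k, axis=axis) for k, t in enumerate(taps))
    if downsample:
        y = np.take(y, np.arange(0, x.shape[axis], 2), axis=axis)
    return y

def decompose_once(image, downsample=True):
    """One two-scale step returning the four channels A, O, V and D.

    downsample=True mimics the WT, downsample=False the WFT (no decimation).
    """
    image = np.asarray(image, dtype=float)
    low = filter_axis(image, H, axis=1, downsample=downsample)   # along each row
    high = filter_axis(image, G, axis=1, downsample=downsample)
    return {
        "A": filter_axis(low, H, axis=0, downsample=downsample),   # low-low
        "O": filter_axis(low, G, axis=0, downsample=downsample),   # low-high
        "V": filter_axis(high, H, axis=0, downsample=downsample),  # high-low
        "D": filter_axis(high, G, axis=0, downsample=downsample),  # high-high
    }
```

With `downsample=True` each channel has half the rows and columns of the input, so the total number of samples is unchanged; with `downsample=False` every channel keeps the input size.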
The basic decomposition (Fig. 1) can be repeated several times in order to obtain a multichannel representation of the original image. In the conventional pyramid-type wavelet transform, we leave the high frequency alone and repeat the process from the output of the low frequency channel A, whereas in the Tree-Structured Wavelet Transform (TSWT), the decomposition is iterated only on channels where the energy is significantly large.
The present approach uses all the available channels in the texture classification because even the channels with lower energy content carry some useful information that can help to distinguish between textures. This is why the Full Wavelet Decomposition (FWD) of the textured image is performed using either the WT or the WFT. We expect such an analysis to lead to a robust texture classification.
The full decomposition is obtained by recursively applying the basic decomposition
scheme in Fig. 1 to all outputs of the filter banks. Hence,
a J-level FWD generates a set of $4^J$ frequency channels. By considering
the path of the resulting channels in the well-known quad tree structure, each
channel can be expressed in the general form $X_1 X_2 \ldots X_J$,
where $X_i \in \{A, O, V, D\}$ and $i = 1, \ldots, J$. Channel A provides an
approximation of the original image, whereas channels O, V and D give
horizontal, vertical and diagonal details, respectively. Figure 2 shows
an example of a 2-level FWD.
Fig. 2: FWD: (a) channel decomposition and (b) quad tree representation
The original image is decomposed into 16 channels (AA, AO, ..., DD).
The ability to discriminate between different textures improves as the number of channels in the decomposition increases. However, for the sake of computational efficiency, a 3-level decomposition is chosen.
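The recursion behind the full decomposition can be sketched as follows. This is an illustration only: it assumes Haar filters and circular boundaries, and for the frame (WFT) variant it assumes that the filter taps are spread by a factor of 2 at each level instead of decimating, which is one common way to iterate an undecimated transform. All function names are hypothetical.

```python
import numpy as np

H = np.array([1.0, 1.0]) / np.sqrt(2.0)   # Haar low pass (assumption)
G = np.array([1.0, -1.0]) / np.sqrt(2.0)  # Haar high pass (assumption)

def filt(x, taps, axis, step, downsample):
    """Circular filtering along one axis with taps spaced `step` samples apart."""
    y = sum(t * np.roll(x, -k * step, axis=axis) for k, t in enumerate(taps))
    if downsample:
        y = np.take(y, np.arange(0, x.shape[axis], 2), axis=axis)
    return y

def split(x, step, downsample):
    """Basic two-scale step producing the channels A, O, V and D."""
    lo = filt(x, H, 1, step, downsample)
    hi = filt(x, G, 1, step, downsample)
    return {
        "A": filt(lo, H, 0, step, downsample),
        "O": filt(lo, G, 0, step, downsample),
        "V": filt(hi, H, 0, step, downsample),
        "D": filt(hi, G, 0, step, downsample),
    }

def full_wavelet_decomposition(image, levels=3, frame=False):
    """Apply the basic step to every channel at every level (4**levels channels)."""
    channels = {"": np.asarray(image, dtype=float)}
    for level in range(levels):
        step = 2 ** level if frame else 1  # dilated taps only in the frame case
        channels = {
            path + name: out
            for path, x in channels.items()
            for name, out in split(x, step, downsample=not frame).items()
        }
    return channels
```

The returned dictionary is keyed by the quad tree path of each channel (for example `"VOA"`), and its insertion order follows the paths in the order A, O, V, D at each level.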
At level 3, the FWD provides 64 channels $X_1 X_2 X_3$. Each channel is labelled with a channel number $CN = 4^2 N_1 + 4 N_2 + N_3 + 1$, where $N_i = 0, 1, 2, 3$ if $X_i = A, O, V, D$, respectively. For example, the number CN of channel VOA is equal to $4^2 \times 2 + 4 \times 1 + 0 + 1 = 37$.
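The labelling rule amounts to reading the channel path as a base-4 number and adding one. A small sketch, with hypothetical helper names, is:

```python
LETTER_INDEX = {"A": 0, "O": 1, "V": 2, "D": 3}

def channel_number(path):
    """Channel number CN of a path such as 'VOA' (base-4 value of the path, plus 1)."""
    cn = 0
    for letter in path:
        cn = 4 * cn + LETTER_INDEX[letter]
    return cn + 1

def channel_path(cn, levels=3):
    """Inverse mapping: recover the path from a channel number."""
    n = cn - 1
    letters = []
    for _ in range(levels):
        n, remainder = divmod(n, 4)
        letters.append("AOVD"[remainder])
    return "".join(reversed(letters))

print(channel_number("VOA"))  # 16*2 + 4*1 + 0 + 1 = 37
```

For a 3-level decomposition the numbers run from 1 (channel AAA) to 64 (channel DDD).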
The wavelet coefficients in the channel outputs of the FWD are useful for texture discrimination, but they cannot be used directly as texture features. In the present approach, we propose to characterise texture by the occurrences of the channel output orders determined at each position (m, n) within a window of WxW pixels. These orders are obtained by sorting the channels according to the magnitude of their wavelet coefficients. We assume that the local information specific to a given texture is embedded in the variations of the channel order. The two methods used for texture feature extraction are described in the next section.
Texture characterisation methods:
Channel Order Matrix (COM) method: The first texture characterisation
method is based on the construction of the Channel Order Matrix (COM).
With a given position (m, n) in the channel outputs, we associate the set of 64
values corresponding to the magnitudes (absolute values) of the wavelet coefficients
located at that position. These values are sorted so as to rank the 64 channels
in descending order. This operation provides the order of each channel at the
spatial location (m, n). Each position in the sorted list represents a specific
weight of the channel at the considered position: the closer a channel is to the
top of the list, the more dominant it is.
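The ordering step can be sketched with NumPy as below. The sketch assumes that the 64 channel outputs have the same size and are stacked in channel-number order along the first axis, that order 1 denotes the largest magnitude, and that ties are broken in favour of the lower channel number; the tie rule is an assumption, since the text does not specify one.

```python
import numpy as np

def channel_orders(stack):
    """Order of every channel at every position.

    stack: array of shape (n_channels, rows, cols) holding wavelet coefficients.
    Returns an integer array of the same shape where 1 marks the channel with
    the largest magnitude at that position and n_channels the smallest.
    """
    magnitude = np.abs(stack)
    # Channel indices sorted by descending magnitude at each position.
    sorted_idx = np.argsort(-magnitude, axis=0, kind="stable")
    # The argsort of a permutation is its inverse: channel index -> position in list.
    return np.argsort(sorted_idx, axis=0, kind="stable") + 1

# Three channels, two positions.
stack = np.array([[[1.0, 2.0]], [[-5.0, 0.0]], [[3.0, -1.0]]])
print(channel_orders(stack)[:, 0, 0])  # [3 1 2]
```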
The occurrences of the orders of each channel over a window of W1xW1 pixels can be recorded in a COM with channel order on the row and channel number on the column. This matrix contains local information about texture properties. Several features can be extracted from the COM in order to quantify texture information. In our application, the two features defined below were used (1≤q≤64):
By considering the 64 channels, our textural feature vector has in total 128
components. The weighted feature S(q) gives the final weight of channel q over
the considered window. S(q) takes a higher value if the channel q occupies a
good position in the various sorted channel lists. That would be the case when
the energy of q is significantly large. The variance V(q) is used to measure
the dispersion in the order of the channel q. V(q) takes a low value when the
position of channel q changes frequently.
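Building the matrix itself, with the channel order on the rows and the channel number on the columns, can be sketched as follows. The input is assumed to be an array of per-position channel orders (1 for the most dominant channel) over one window; the function name is hypothetical, and the sketch stops at the matrix and does not compute S(q) or V(q).

```python
import numpy as np

def channel_order_matrix(ranks):
    """Count how often each channel takes each order over a window.

    ranks: integer array (n_channels, W, W) with values in 1..n_channels.
    Returns com with com[order - 1, channel - 1] = number of occurrences.
    """
    n = ranks.shape[0]
    com = np.zeros((n, n), dtype=int)
    channels = np.broadcast_to(np.arange(n).reshape(-1, 1, 1), ranks.shape)
    # np.add.at accumulates correctly when the same cell is hit several times.
    np.add.at(com, (ranks.ravel() - 1, channels.ravel()), 1)
    return com

# Three channels observed at two positions.
ranks = np.array([[[3, 1]], [[1, 3]], [[2, 2]]])
print(channel_order_matrix(ranks))
```

Every column of the matrix sums to the number of positions in the window, since each channel receives exactly one order per position.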
Because of the down sampling in the WT, the COM method requires fairly large original texture images (at least 128x128 pixels). The second algorithm can work with relatively small blocks because it is based on the WFT, in which no down sampling is applied.
Frequency Channel Spectrum (FCS) method: When using the second algorithm, the Full Wavelet Decomposition (FWD) is performed with the WFT. The 64 channels, labelled as described earlier, are subdivided into 8 groups ($g = 1, \ldots, 8$) according to their channel numbers: channels with $CN \in [1, 8]$ in $g = 1$, channels with $CN \in [9, 16]$ in $g = 2$, and so on. It can be observed that each group g of 8 frequency channels corresponds to a specific band of frequencies going from the very low frequencies for group 1 to the very high frequencies for group 8. Each band contains specific texture information in different orientations and frequency ranges.
For a given position (m,n), we consider the 8 absolute values of the group g. The mean of these values is used to binarise each one of them. This operation does not affect the configuration of the channels of the group g but it reduces the complexity of the computation.
We obtain a binary sequence of 8 bits with the most significant one corresponding
to the smallest channel number of the group. In the present approach, the decimal
number $CU_g(m,n)$ corresponding to this binary number represents
the channel unit of the group g associated with the position (m,n). The decimal
number $CU_g(m,n)$ can take $2^8 = 256$ possible values.
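The binarisation and the conversion to a decimal channel unit can be sketched as follows for one group of 8 channels. The sketch assumes that a bit is set when the magnitude is strictly greater than the mean of the 8 magnitudes; whether the comparison is strict is not stated in the text, so this is an assumption, and the function name is hypothetical.

```python
import numpy as np

def channel_units(group):
    """Channel unit of one group at every position.

    group: array (8, rows, cols) with the coefficients of the 8 channels of the
    group, ordered by increasing channel number.
    Returns an integer array (rows, cols) with values in 0..255.
    """
    magnitude = np.abs(group)
    # Binarise each magnitude against the mean of the 8 values at that position.
    bits = (magnitude > magnitude.mean(axis=0, keepdims=True)).astype(int)
    # Most significant bit = smallest channel number of the group.
    weights = 2 ** np.arange(7, -1, -1).reshape(8, 1, 1)
    return (bits * weights).sum(axis=0)

group = np.zeros((8, 1, 2))
group[0, 0, 0] = 8.0    # only the first channel exceeds the mean -> 10000000
group[7, 0, 1] = -8.0   # only the last channel exceeds the mean  -> 00000001
print(channel_units(group))  # [[128   1]]
```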
For each group g, the occurrence distribution of the $CU_g(m,n)$ computed over a block of W2xW2 pixels can be represented on a partial frequency channel spectrum, which describes the local texture aspects in a precise band of frequencies. To cover the whole frequency domain, the 8 partial frequency channel spectra are concatenated in ascending order of g to form a single Frequency Channel Spectrum (FCS). Hence, the FCS records all the channel units $CU_g(m,n)$ of the different groups g ($g = 1, \ldots, 8$), computed over a given block of size W2xW2. The channel unit CU(m,n) associated with the FCS has a total of $8 \times 256 = 2048$ possible values. Since the partial spectra are placed one after another in ascending order of g, the channel unit CU(m,n) corresponding to a value of $CU_g(m,n)$ is given by:

$CU(m,n) = 256\,(g - 1) + CU_g(m,n)$
It is important to note that the computational complexity is significantly reduced by the subdivision of the frequency domain into eight bands. Indeed, rather than a total of $2^{64} \approx 1.845 \times 10^{19}$ possible channel units, the number is limited to 2048. In our application, the textural feature vector is composed of the occurrences of these 2048 channel units.
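Putting the pieces together, a sketch of the FCS of one block is given below. It assumes that the 64 WFT channel outputs are stacked in channel-number order, that bits are set by a strict comparison with the group mean, and that the partial spectrum of group g occupies bins 256(g - 1) to 256g - 1, an offset inferred from the concatenation in ascending order of g. The function name is hypothetical.

```python
import numpy as np

def frequency_channel_spectrum(stack):
    """Histogram of channel units over one block.

    stack: array (64, W, W) of wavelet frame coefficients ordered by channel number.
    Returns an integer vector of length 2048 (8 groups x 256 channel units).
    """
    n_groups = stack.shape[0] // 8
    spectrum = np.zeros(256 * n_groups, dtype=int)
    weights = 2 ** np.arange(7, -1, -1).reshape(8, 1, 1)
    for g in range(n_groups):  # g = 0..7 here, i.e. groups 1..8 in the text
        magnitude = np.abs(stack[8 * g: 8 * (g + 1)])
        bits = magnitude > magnitude.mean(axis=0, keepdims=True)
        cu_g = (bits * weights).sum(axis=0)
        cu = 256 * g + cu_g  # assumed offset of the partial spectrum of this group
        spectrum += np.bincount(cu.ravel(), minlength=spectrum.size)
    return spectrum

rng = np.random.default_rng(0)
fcs = frequency_channel_spectrum(rng.normal(size=(64, 32, 32)))
print(fcs.shape, fcs.sum())  # (2048,) 8192
```

Each group contributes exactly one channel unit per position, so every 256-bin partial spectrum sums to the number of positions in the block.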
The FCS method can be viewed as a multichannel/multifrequency binary version of the Texture Spectrum (TS) method. Recall that TS is a well-known spatial method that describes texture using the eight neighbouring pixels around a central pixel (Wang and He, 1990).
Texture classification: The steps involved in the texture classification
process are shown in Fig. 3. A 3-level FWD of each test block
is implemented using either the WT or the WFT. In the first case, the COM is
constructed from the occurrences of the channel output orders determined on
a block of size W1xW1. In the second case, channel units
of each band of frequencies are computed over a block of W2xW2
positions to form the FCS. Finally, an appropriate textural feature vector is
extracted in each case to complete the texture classification using the minimum
Euclidean distance decision rule.
To evaluate the performance of our characterisation methods, a texture classification was performed on 30 Brodatz texture images (Brodatz, 1968) of size 512x512 with 256 grey levels. The texture images are listed in Table 1.
The mean of each image is removed before the processing. For each textured image 116 blocks are randomly chosen. Sixteen blocks are used in the training module whereas the 100 others are reserved for the classification module. The prototype of each class is estimated by averaging the features associated to the 16 blocks.
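The training and decision steps can be sketched as follows: one prototype per class obtained by averaging the training feature vectors, then assignment of each test vector to the class of the nearest prototype in Euclidean distance. The function names and the toy data are hypothetical.

```python
import numpy as np

def class_prototypes(features, labels):
    """Average the training feature vectors of each class."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    prototypes = np.stack([features[labels == c].mean(axis=0) for c in classes])
    return classes, prototypes

def classify(feature_vectors, classes, prototypes):
    """Minimum Euclidean distance decision rule."""
    feature_vectors = np.asarray(feature_vectors, dtype=float)
    diffs = feature_vectors[:, None, :] - prototypes[None, :, :]
    distances = np.linalg.norm(diffs, axis=2)  # shape (n_samples, n_classes)
    return classes[np.argmin(distances, axis=1)]

# Toy example with two classes and two-dimensional features.
train = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [10.0, 12.0]]
classes, prototypes = class_prototypes(train, ["a", "a", "b", "b"])
print(classify([[1.0, 1.0], [9.0, 9.0]], classes, prototypes))  # ['a' 'b']
```

In the setting described above, `features` would hold the 16 training vectors of each texture and `feature_vectors` the 100 test vectors.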
The blocks are of size 128x128 pixels in the COM method and 32x32 pixels in the FCS method. Because of the down sampling in the WT, the features derived from the COM are computed over a window of W1xW1 = 16x16 pixels, whereas in the FCS case the window is of size W2xW2 = 32x32 pixels.
Experimental results and discussion: In our first experiment, the performances
of the COM and the FCS feature extraction methods are investigated.
Table 1: Textured images used in our experiments
Fig. 4: D21, D77 and D79 (from left to right) and their corresponding COM (second line) and FCS (third line)
Table 2: Classification results of 30 Brodatz textured images using the COM method
Table 3: Classification results of 30 Brodatz textured images using the FCS method
For this purpose, we constructed the COM and the FCS of 3 visually similar
test textures (D21, D77, D79) using the Haar wavelet. Experimental results
are displayed in Fig. 4.
As expected, different textures have different COM and FCS. This suggests that the texture of an image can be characterised by features derived from the COM and the FCS.
When analysing the results of the COM method, we can observe that the dominant
channels (those with the best orders) are not necessarily those corresponding to the
low frequencies (those with low channel numbers). As Chang and Kuo (1993) already
pointed out, the most useful information for discriminating between textures
is not necessarily contained in the low frequency channel.
Fig. 5: Textures used in the classification experiments
Fig. 6: Mosaic textures T1 (a1) and T2 (b1) and their segmentation results (a2 and b2)
It can also be observed that carrying out the full decomposition matters: the less dominant channels are useful for distinguishing between textures (see the COM of textures D21 and D79). Finally, the COM of the texture D77, which has circular patterns, is roughly diagonal.
Most of the previous remarks are also valid for the FCS method. In addition, the FCS of the texture D77 with circular patterns (i.e., without a particular orientation) is almost flat at a low number of occurrences, except for the group g = 1 (channel unit < 256), which corresponds to the low frequency channels (Fig. 4). The FCS of the texture D21 with square patterns has larger amplitudes in the various groups of channels. Finally, the FCS of the texture D79, which has a vertical orientation, shows some modes both in the low frequency channels (g = 1 and 2) and in the high frequency channels (g = 5 and 6) obtained from the decomposition of the channel output V, which gives vertical details. As a result, the FCS may be seen as containing some local information about the orientation of the texture patterns.
The second experiment was carried out for texture classification based on the COM method. We were interested in studying the performance of various wavelet filters: Haar filters and Daubechies 4-tap and 16-tap filters (4-Db and 16-Db) (Daubechies, 1988).
We also compared the performance of our first method (the COM method) with two others. In the first one, a set of energies calculated at the leaves of the most dominant channels of the TSWT is used as the texture feature. This set, known as the energy map, is a multiresolution feature. As described in Chang and Kuo (1993), the TSWT was implemented with Daubechies 16-tap filters (Daubechies, 1988). In the second one, the energy is computed over each channel output of the 3-level FWD to form a textural feature vector of 64 elements. This method is a multichannel extension of Laws' method (MEL) (Laws, 1980). The results of the classification process are given in Table 2.
The first observation is that the results of the first texture classification method depend on the length of the filters: the shorter the filters, the better the classification results. Increasing the filter length does not seem to benefit the classification performance. The results obtained with the short Haar filter are surprisingly good. Other authors, such as Unser (1995) and Yang et al. (2004), have also found that the Haar wavelet achieves the best classification performance.
The second observation concerns the comparative study. The proposed method using Haar filters outperforms the two comparative methods: its correct classification rate is 99.07%, which is higher than those of both the MEL (96.57%) and the TSWT (95.3%) methods. This result suggests that taking all the channels of the full decomposition into account helps to achieve better classification results. Channels with lower energy content carry some useful information for texture characterisation, so their contribution should not be neglected.
The next experiment is concerned with the implementation of the texture classification using the FCS method. Here, the classification performance of the FCS method has been compared to the MEL and the traditional Texture Spectrum (TS) methods.
Table 3 gives the comparative results of the second texture classification method. The FCS method has the best overall performance, followed by the TS method and then the MEL method, which gives the worst performance. The redundancy introduced by the WFT seems to be desirable for feature extraction. The results of Arivazhagan and Ganesan (2003) also show that texture characterisation using the WT is superior to traditional single resolution techniques such as texture energy and texture spectrum.
It is important to note that the best performances of our method are obtained for deterministic textures, in particular when the size of the texture patterns is large (Fig. 5, D52 and D95). This type of texture has a high concentration of localized spatial frequencies. In contrast, the TS method is particularly sensitive to stochastic textures, which exhibit no dominant frequency or orientation characteristics (D4 and D19). We expect that combining the TS and the FCS, which are respectively spatial and spatial-frequency methods, should lead to better results. This will be investigated in further studies.
The last experiment is related to the supervised segmentation of the two images shown in Fig. 6 (a1 and b1). The first image (T1) is a mosaic of D16 and D79 and the second (T2) includes textures D16, D17, D53 and D77. Each image was scanned with a window of size 32x32 in 2-pixel wide steps. The overlap is needed to smooth the transition from one textural region to another. At each step, the FCS was constructed and the Euclidean distance was used to classify a block of 2x2 pixels.
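The scanning procedure can be sketched as below. The sketch takes the feature extractor as a parameter (in the experiment it would return the FCS of the window), assumes that the labelled 2x2 block sits at the centre of the window, and leaves unlabelled border pixels at -1; these choices and all names are assumptions made for illustration.

```python
import numpy as np

def segment(image, feature_fn, classes, prototypes, window=32, step=2):
    """Label step x step blocks by classifying an overlapping sliding window."""
    rows, cols = image.shape
    label_map = np.full((rows, cols), -1, dtype=int)  # -1 = not labelled (borders)
    offset = window // 2 - step // 2                  # top-left of the central block
    for r in range(0, rows - window + 1, step):
        for c in range(0, cols - window + 1, step):
            feature = feature_fn(image[r:r + window, c:c + window])
            nearest = np.argmin(np.linalg.norm(prototypes - feature, axis=1))
            label_map[r + offset:r + offset + step,
                      c + offset:c + offset + step] = classes[nearest]
    return label_map

# Toy usage: two flat regions, with the window mean standing in for the FCS.
image = np.zeros((16, 32))
image[:, 16:] = 10.0
labels = segment(image, lambda w: np.array([w.mean()]),
                 classes=np.array([0, 1]),
                 prototypes=np.array([[0.0], [10.0]]),
                 window=8, step=2)
print(labels[8, 4], labels[8, 27])  # 0 1
```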
From Fig. 6 (a2 and b2) it can be observed that the detected boundaries of different regions agree with those perceived by the human visual system. These results confirm the robustness of the FCS method for texture characterisation. Several other images of mosaic of textures have been segmented with convincing results.
In this study, two feature extraction techniques for texture classification were presented. The order of each channel output of the Full Wavelet Decomposition (FWD) was determined for each position in a given window, and texture was characterised by analysing the channel orders over that window. To quantify texture properties, the texture features were computed from a Channel Order Matrix (COM) or a Frequency Channel Spectrum (FCS). The classification process was tested on real natural textures from Brodatz's album, and our methods performed better than the other methods considered.