INTRODUCTION
Popularity of the Internet provides a great opportunity for transferring large amount of data in networks. However, it also increases the risk of illegal and unauthorized access to data during transmission. Mechanisms should be provided to protect against these attacks.
Steganography is a way for secret communication by using digital media to convey
essential messages. The word steganography derived from Greek and it means cover
writing. It is all about creating a form of secret communication between two
parties and it is a complement to cryptography whose goal is to conceal the
content of a message (AlJaber and Aloqily, 2003). Steganography
uses a media like an image, video, audio or text file to hide information in
such a way that it does not attract any attention and looks like an innocent
medium (Hmood et al., 2010a).
The first use of steganography was in military and government for secret communication
but nowadays, steganography methods become widely used for many purposes. Researchers
propose new methods in this area as well as enhancing the previous approaches
to improve the steganography applications (AlFrajat et
al., 2010).
There are lots of methods used in image steganography. However, they have their
own weaknesses and strengths. Since Least Significant Bit (LSB) insertion method
is one of the simplest data hiding techniques, it has long been a focus for
researchers to propose attacking methods and they are called either steganalytic
or steganalysis attacks. It is proved that sometimes simple LSB method is not
secure at all because some harmful statistics can be exploited to reveal the
existence of the secret data (Lee et al., 2009).
In this study most of the effort is done to get a better imperceptibility of
the stegoimage without losing the embedding capacity.
RELATED WORK
Simple LSB substitution: The word LSB stands for Least Significant Bit.
This method is simple and easy to implement in Steganography area. This method
actually substitutes the LSBs of cover image with secret bits sequentially.
In order to hide messages by this approach at least one bit is stored in each
pixel of cover image. For example by using 8bit gray scale image format with
the size of 512x512 can embed 262, 144 bits (32,768 bytes or pixels) (Hmood
et al., 2010b). By embedding this amount of data both stegoimage
and its respective cover image look the same since human eye cannot distinguish
little changes in the value of pixels. Embedding more than one bit in each pixel
by using pixels in the edge area will make more change in high frequency areas
so it can still be undetectable by human eye as well (Lee
and Chen, 1999). By considering the size of cover image pixels there is
no limitation on embedding rate in this method but the more secret bits we embed,
the less imperceptibility of stegoimage is obtained. So researchers proposed
several methods to improve the weakness of this method.
The optimal LSB method: The simple LSB method can be modified to improve
the quality of stegoimage. The algorithms of such improved schemes are still
based on simple LSB method. In this section one of the improved methods, called
Optimal LSB which applies Optimal Pixel Adjustment Process (OPA) to improve
the stegoimage quality is presented. Three candidates are picked out for the
pixels value and compared to see which one has the closest value to the original
pixel value with the secret data embedded in. The best candidate is then called
the optimal pixel and used to conceal the secret data (Chan
and Cheng, 2004).
The embedding process is described by Wu and Hwang (2007)
as follows:
• 
Let p_{i} be the original pixel value and k bit (s)
be secret data to be embedded 
• 
Embed k bit (s) of secret data into p_{i} by using the LSBs method.
The stegoimage p_{i} can then be obtained 
• 
Generate another two pixel values by adjusting the (k + 1)th bit of p_{i}.
Therefore, p_{} and p_{+} can be calculated as follows: 
obviously, the hidden data in p_{} and p_{+} are identical to p_{i} because the last k bits of them are the same.
The best approximation to the original pixel value, p_{i}, (i.e., the
optimal candidate) is found by the following formula:
Finally, all the optimal candidates for
replace the original pixel values p_{i} and the embedding algorithm
come to its end.
In order to make the algorithm faster and using low complicated calculation,
the pseudocode of the algorithm is given below (Tseng et
al., 2008). The inputs of the algorithm are secret image S and cover
image C and the output is stegoimage C`. M and N are the height and the width
and i and j are the coordinate of the cover image C.
Inputs: Transformed Secret ImageS`, Cover Image C Output: Stegoimage
C` 

The secret image S is rearranged to become S’by the same size as the cover
image C but with the smaller bit plane than the k bit plane of cover image (k
is embedded bit plane). D [j] [i] is the value that subtracts the value of secret
image S’ (S’ [j] [i]) from the same coordinate of cover image C (C
[j] [i]). For instance, if C [j] [i] = 50_{10}= 110010_{2},
S’ ([j] [I]) = 7_{10} = 111_{2} k = 3, then D ([j] [i])
= 50 mod 2^{3}7 = 5 is smaller than 2^{2}. Due to that, the
value of C’ ([j] [i]). So, the difference will be smaller than embedding
like the simple LSB substitution. The difference between cover pixel value and
stegoimage value by OPA algorithm is 3 (5047 = 110010_{2}101111_{2}
= 3_{10})which is smaller than the difference after embedding by the
simple LSB which is 5 (5045 = 110010_{2}110111_{2} = 5_{10}).
This smaller difference will enhance the imperceptibility of the stegoimage
by reducing the value of mean square error.
Wang et al. (2001) optimal LSB substitution
method: Wang et al. (2001) approach is based
on simple LSB substitution and Genetic Algorithm (GA). The process of Wang
et al. (2001) method is shown in Fig. 1. There
are two differences between simple LSB and Wang et al.
(2001) method. The first one is that in the LSB substitution method interceptors
can extract the secret data (secret image) from stegoimage easily, because
the hidden secret image is regularly distributed in stegoimage. To eliminate
this disadvantage, Wang et al. (2001) method
uses a transformation function which is described below, to modify each location
of decomposed image SI to a new location in the meaningless image ESI. Assume
pixel locations in ESI are numbered sequentially from 0 to p1. The transform
function has been used before replacing C’ by ESI^{*}.
The transform function used is f (x) = (k0 + k1xx) mod p and gcd (k1, p) =
1 where k0 and k1 are the key constants dcd (,) means greatest common divisor.
k0 and k1 are needed for recovering the hidden secret image from ESI′. Illegal
interceptors will not be able to gain the secret data without knowing these
two keys and the cipher process.
Second significant difference between Wang et al.
(2001) method and simple LSB replacement is that Wang's replacement is optimal
substitution instead of simple substitution. To achieve this goal in Wang's
method, a substitution matrix S = {S_{ij}} is used to convert each pixel
value i of ESI’ to another value j in ESI^{*} where 0≤i<2^{ki}
and 0≤i<2^{ki}. In the matrix S there is only one element in
each row and column that has the value 1. Thus, there are (2^{k})! possible
substitution matrices and only the best one will be chosen which makes the least
distortion between secret image ESI’ and host image C. A simple example
of matrix S for a 4x4 secret image with k = 2 is presented in Fig.
2.

Fig. 2: 
An Example of optimal substitution process from ESI’
to ESI^{*} by Matrix S 
According to optimal substitution matrix S, the pixel value 0 of ESI’ would be replaced by value 2 in ESI^{*} and the pixel value of 3 in ESI^{*} should be replaced by 0 in ESI^{*}and so on. The substitution matrices must be saved for sending to the receiver. The receiver needs the matrices for extracting the embedded information from the stegoimage. Without these information receiver will not be able to be aware of the hidden information.
Finding the best substitution matrix takes a lot of time if K is a large number.
For example if K = 3 there are (2^{3})! = 40, 320 different matrices
and to find the best one we should check them all and it is too time consuming.
So Wang et al. (2001) method used GA to find
the best matrix to reduce searching time.
Wu et al. (2004) global and local optimal
LSB substitution: In Wang et al. (2001) method,
optimal substitution matrix is derived for the whole image but if we use the
same mechanism for each block of the image, the PSNR value will be increased.
That is the significant difference between Wang et al.
(2001) and Wu et al. (2004) method. In fact
Wang et al. (2001) global transformation idea
is not beneficial for the whole image (Wu et al.,
2004). In Wu et al. (2004) method, the transformation
matrix is calculated according to block characteristics of cover image. So the
quality of stegoimage would be better.
The block diagram of Wu et al. (2004) method
is presented in Fig. 3. In their method the decomposed image
S’ is divided into n blocks, {ES’_{0}, ES’_{1},....ES’_{n1}}.
C’also will be divided into {C’_{0}, C’_{1},....
C’_{n1}}. The next step is to search for the best match between
ES’_{i} and C_{j}. Most similar ones between ES’_{i}
and C_{j} will be selected as a match pair. GA is used to perform this
step.
Furthermore, Wu et al. (2004) proposed two different
strategies. The first strategy is the global optimal substitution strategy and
the second one is the local optimal substitution strategy. The first one uses
the same optimal substitution matrix for all blocks and also one substitution
matrix for blocks mapping. Similar to the Wang et al.
(2001) approach, the matrices must be saved to be sent to the receiver for
extraction process. So the global optimal strategy needs fewer data to be recorded
in comparison with their second strategy.

Fig. 4: 
An example of Global Optimal Substitution method 
The second one uses different substitution matrices for blocks and more matrices
are required so more data need to be recorded although a better stegoimage
quality is provided. The first strategy which is based on global optimal substitution
is called global method and the second strategy which is based on local optimal
substitution matrix is called local method.
A simple example of the global method is presented in Fig. 4. As shown, after finding the best match between ES^{*}_{i} and C_{j}, only one substitution matrix has been used for the whole image. The same strategy is used for each block of cover image in local method.
Figure 5 shows an example of local method and we can see that each block has its own substitution matrix and also a matrix for block mapping.
THE PROPOSED METHOD
As mentioned earlier many algorithms are proposed by researchers to solve the
problems of simple LSB and increase the imperceptibility of stegoimage (Chan
and Cheng, 2004; Wang et al., 2001; Wu
et al., 2004, 2005; Zhang
and Wang, 2005). But the proposed method (Adaptive Optimal Embedding) has
a higher imperceptibility of the stegoimage by using more characteristics of
cover image. The proposed approach searches a pixel to find a match between
original pixel bits and secret bits. In conventional LSB method, the hidden
information is embedded sequentially and starts to embed from the first LSBs
of each pixel. On the other hand, the embedding position in our method depends
on pixel value. If there is a match between secret bits and original cover pixel’s
bits, there is no need to embed and we just have to identify the starting bits
of found match. The starting bits where the secret bits are embedded will be
combined together to form a stream of bits. This stream of bits is used as a
stegokey that needs to be communicated to the receiver for extracting the secret
message.

Fig. 5: 
An example of Local Optimal Substitution method 
Experiment shows that the method produces no image distortion and the imperceptibility
increases significantly.For embedding process, first we have to know the embedding
rate (number of bits we embed per pixel). Since the 8bit grayscale image is
selected as cover image, the embedding rate is simply obtained by dividing number
of secret bits to number of pixels. For example, we have 1, 048, 576 secret
bits and size of cover image is 512x512. So by dividing 1, 048, 576 to 262,
144 the embedding rate would be four which means four bits should be embedded
in each pixel of cover image. Using sample pixels of cover image and sample
secret bits is useful to explain the process of embedding in detail which is
given in Fig. 6. Also the respective flowchart of embedding
process is presented in Fig. 7.
The embedding process starts from the leftmost pixel of the first row and moving
to the right of the same row before continuing to the subsequent rows of the
image. For example, the first secret bits to embed are 1010 and the cover pixel
value is 01101001. We can notice that the four bits starting from third LSB
are the same as the secret bits. Therefore no embedment is required in this
case. Hence, we do not cause any image distortion because we did not change
any original bits. We need to identify that the four bits of this pixel are
secret bits and inform the receiver for extraction process. To embed the next
secret bits (0101) in respective cover pixel, again we need to search for a
match in the corresponding pixel. Since there is no match between secret bits
and cover bits, we used the four LSBs of cover pixel to embed the secret bits
by using OPA algorithm. The stegokey is generated by determining the first
bit position where the leftmost bit of secret bits is embedded. As shown in
Fig. 8, the stegokey for the sample secret bits is Fig.
8 depicts where the secret bits are embedded and shows how the stegokey
is derived.
The receiver extracts the secret bits by using the stegokey and the stegoimage. According to Figure 9, first bits of stegoimage’s pixel are and the first two bits of stegokey are . It means that our secret bits are the four bits after second LSB of stegopixel. So for extracting the secret data we take four bits of the modified pixel, starting from third LSB which are . The second two bits of stegokey are and this means extract the bits from the second pixel by starting first LSB which are . Then repeat the extracting process for the rest of the pixels. The extraction flowchart is given in Fig. 9.

Fig. 6: 
An example of secret bits and cover image 

Fig. 7: 
The desired embedding flowchart of Adaptive Optimal Embedding 

Fig. 8: 
The stegoimage and stegokey after embedding phase 

Fig. 9: 
The desired extraction flowchart of Adaptive Optimal Embedding 
EXPERIMENTAL RESULTS
Four methods that provide high embedding capacity were described in Section
two. These four methods are Simple LSB, Optimal LSB, Wang et al.’s
method and Wu et al.’s method. This section presents the comparison
of these methods and the proposed method. The images used as cover and secret
images are 8 bit grayscale. The standard image Lena with the size of which is
shown in Fig. 10 is used as cover image. Figure
11 shows three secret images Barbara, Peppers and AirplaneF16 which are
used as secret images. The size of each secret image is. All standard images
are collected from (Gonzalez and Woods, 2008).

Fig. 10: 
The cover image Lena with size of 512x512 

Fig. 11: 
Secret Images with the size of 512x256. (a): Barbara (b):
Peppers (c): Airplane F16 

Fig. 12: 
The Stegoimage Lena, embedded with secret image Barbara.
In (a) n = 1 and (b) n = 2 
Imperceptibility of proposed method: To evaluate the imperceptibility
of the stegoimage after embedding and also to compare with previous works,
PSNR (Peak SignaltoNoise Ratio) metric is used. As we know the higher stegoimage
quality, the more imperceptibility of the hidden message. The PSNR is the very
first metric which can judge imperceptibility very well. The formula is as follows:
Whereby:
where, M and N represent the image size. In the formula, P (x ,y) stands for
the original pixel value and P’ (x ,y) represents the pixel value in position
(x ,y) with the secret data already hidden in. A greater PSNR value means a
lower degree of image distortion after embedding process of the secret data.
For example, given a gray scale image as the cover image to hide secret data
in, it is hard for any human being to perceive any difference between the cover
image and the stegoimage if the PNSR value of the stegoimage goes beyond 36
dB (Wu and Hwang, 2007).
Table 1 shows the result of embedding 1,048,576 bits (four
bits per pixel) in cover image Lena by Simple LSB,Optimal Pixel Adjustment,
Wang et al. (2001) method, Wu
et al. (2004) global method, Wu et al.
(2004) local method and Adaptive Optimal Embedding.
Table 1: 
The result of embedding secret images Barbara, Peppers and
Airplane F16 into cover image Lena 

The results show that the value of PSNR in Wu et al.
(2004) local method is significantly better than Wang
et al. (2001) method because more attributes of block characteristics
were explored. However, the best PSNR was obtained by Adaptive Optimal Embedding
and then by OPA. In Wang et al. (2001) and Wu
et al. (2004) method the receiver needs block matching matrix and
optimal substitution matrices for extraction. In Adaptive Optimal Embedding
algorithm n is the number of bits used as stegokey for each pixel. We assume
8 bits of cover pixel’s are presented as C_{1}C_{2}C_{3}C_{4}C_{5}C_{6}C_{7}C_{8}.
If n = 1 the algorithm checks the first two layers of cover pixel (C_{5}C_{6}C_{7}
and C_{4}C_{5}C_{6}C_{7}) to find the match
for secret bits which needs one bit to identify these two layers. When n = 2
the first four layers are checked (C_{5}C_{6}C_{7}C_{8},
C_{4}C_{5}C_{6}C_{7}, C_{2}C_{3}C_{4}C_{5})
Fig. 12.
Table 2: 
The result of embedding different amount of secret data with
different ‘n’ in cover image Lena 

As shown in Table 1, there is a significant improvement of
PSNR value by Adaptive Optimal Embedding method. When n = 2 the probability
of finding the desired match of secret bits in cover bits is two times higher
than when n = 1, so the quality of stegoimage is significantly better by applying
n = 2. However, by using one bit as stegokey for each pixel (n = 1) the results
obtained by Adaptive Optimal Embedding is still better than other methods. It
is noteworthy that in Wang et al. (2001) and
Wu et al. (2004) methods if embedding rate is
more than two, it takes a lot of time to find the best, respective optimal substitution
matrices so they use genetic algorithm to increase the efficiency of finding
these matrices but in Adaptive Optimal Embedding since the embedding process
is simple and even sometimes there is no need to embed any bits (in situations
that match is found) the time of applying our algorithm is nearly the same as
simple LSB and OPA.
Experiment was also conducted to evaluate the performance of different payloads
by embedding two, three and four bits per pixel. As shown in Table
2, the results obtained by Adaptive Optimal Embedding method are still better
than the OPA algorithm. By embedding two bits per pixel, the difference between
Adaptive Optimal Embedding and OPA is more than when we embed three bits per
pixel with the same n. The reason is when the embedding rate is two, the probability
of finding the match is higher than when the embedding rate is three, because
we are looking for a specific series of two bits (as secret bit) in population
of four different values represented by two bits. So the probability of finding
is 1/4. But when we embed three bits, we look for a series of three bits in
eight values (represented by three bits) and the probability of finding this
series is 1/8. The reason of smaller difference in PSNR value between Adaptive
Optimal Embedding and other methods by embedding four bits and three bits is
the same.
As discussed earlier, the best PSNR is obtained by our method and then by OPA. Although in our method the size of stegokey is large, the experiments showed most of the values of stegokey in this method are the same and they are zero. The probability of finding a 4bit match between a bit stream is 1/16 because we are looking for four special bits in a population of four bits which can show sixteen different values. For instance we look for 1101 in four bits and these four bits can show a value of 0000 to 1111. In the proposed algorithm, the more bits we use for searching in cover pixel (n), the more chance to find a match. But since the chance of finding the match is not much, most of the bits are embedded in LSB. Therefore, by using a compression algorithm like Huffman which is so suitable here, the size of stegokey can be decreased significantly.
Security of proposed method: As the opponent side to steganography,
the goal of steganalysis is to detect whether hidden message exists in a media
or not and afterward, detect the hidden data if there is. In the past decades,
many steganalysis methods have been proposed by researchers to achieve this
goal (Xia et al., 2009). One of the steganalysis
methods is called Chisquare attack which is used to prove the security of proposed
method.
The idea of Chisquare attack is simple and it is all based on the probability
of how zeros and ones are distributed all over the medium (stegoimage). If
the bits in secret messages are generated randomly, then the probability that
half of the number of bits is either zero or one is fifty percent (Westfeld
and Pfitzman, 1999). This fact makes the idea of χ2 test (Chisquare)
clearer.
Let X^{128x1} and Y^{128x1} be two vectors such that x_{k}
= frequentlcy (2k) and y_{k} = frequentlcy (2k + 1). Initially, every
entry in 0≤k≤127 x and y is set to 0. Then the algorithm counts the grayscale
values in the test image and increments the corresponding entry in X or Y. The
erotical expected frequency of grayscale values of 2k and 2k + 1 is:
Now suppose that there are m categories. In the case of 8bit grayscale images,
there are 128 categories (256 grayvalues/2) which means also 128 PoVs. So to
check all the possible POVs, 128 categories should be taking into consideration
by the given Equation below. The Chisquare statistic, with m1 degrees of freedom,
is calculated by:

Fig. 13: 
Results for ChiSquare attack on stegoimage Lena, embedded
by simple LSB and tested on (a): fifty percent (131, 072 bits) and (b):
100% (524, 288 bits) of the image 

Fig. 14: 
Results for Chisquare attack on stegoimage Lena, embedded
by Adaptive Optimal Embedding and tested on 1, 048, 576 bits, (four random
bits embedded in per pixel (a): n =1 (b): n =2) 
The next step of Chisquare method is calculating the probability of embedded
information by integrating the density function with x^{2}_{k1 }as
its upper limit:
This probability of embedding is the probability of x^{2}_{m1}under
the condition that x_{i} = z_{i} for all i in Chisquare statistic
Equation (Stanley, 2005).
As described in (Westfeld and Pfitzman, 1999), once
the embedding is done the more bits embedded, the probability is stronger and
more reliable. This means that when 100% of LSBs are embedded by random bits,
it is expected to see the best result of Chisquare attack saying the probability
of embedding is one for the whole 100% of the stegoimage.
Figure 13a shows the result of embedded information in fifty
percent of the cover image when in Fig. 13b, more than one
bit is embedded in all pixels of the cover image. In both of them, Simple LSB
algorithm is applied for embedding process. By taking a simple look at the Fig.
13, it is noticed that the test works out and attacks the very existence
of secret data for these specific payloads. The payload of 1231, 072 bits is
exactly using fifty percent of the stegoimage by embedding one bit per pixel
and it has been detected that the same probability has been embedded with some
secret bits. The payload of 104,576 embeds more than one bit in each respective
pixel of the cover image and the attack is still reliable since Chisquare detects
100% of the zstegoimage has the probability of embedding equal to one.
As the same attack used for the stegoimages embedded by Adaptive Optimal Embedding,
the shapes of diagrams changed. The Chisquare attack is tested on different
amount of data by embedding two, three and four bits per pixel. The test is
done by applying two different n (n = 1 and n = 2). So finally, we show the
result of embedding by proposed approach is strongly robust against Chisquare
attack. As results show in Fig. 14, by embedding more than
one bit in 100% of the cover image, this method is still reliable against Chisquare
attack since there is no change in Chisquare curve that means there is no way
to know the percentage of the embed data in cover image.
There is one point left which worth describing and that is the detected probability is more reliable when a kind of straight line can be seen which is steady or fixed in some area and it drops in some particular points and then goes on the same steady way. But, there is only one change between zero percent and less than five percent of the curves, in Fig. 14. Although there is one change in the Chisquare curve in Fig. 14 but the detected probability in this area for first curve is near to one and for second curve is almost zero (0.05). Also 4 bits is embedded in all pixel of the cover image, so if the Chisquare attack can detect the embedded information, the value of probability of embedding must be one for the whole stegoimage.
CONCLUSION
A new approach is presented to resolve the most important problem of image steganography which is imperceptibility of the stegoimage without losing the embedding capacity. To solve this problem, the proposed method identifies a certain bit stream of the cover image as secret bits. The image distortion is decreased by using OPA technique when bits in the cover image’s pixel are not same as secret bits. To show the robustness of this approach, Chisquare attack was tested on different stegoimages which have been embedded with different payloads. In order to enhance the effectiveness of the proposed method, it is recommended to apply some algorithms either to decrease the size of stegokey such as Huffman or embed the stegokey in another cover image.