
Information Technology Journal

Year: 2010 | Volume: 9 | Issue: 1 | Page No.: 152-157
DOI: 10.3923/itj.2010.152.157
Image Hashing Algorithm Based on Robust Bits Extraction in JPEG Compression Domain
Yuan-Yuan Hu and Xia-Mu Niu

Abstract: Perceptual image hashing based on content-preserving properties has promising applications in image retrieval and identification. By statistical analysis of the low frequency coefficients of quantized DCT blocks in JPEG compressed images, a strong correlation between perceptually similar images is observed. The positions of the most significant low frequency coefficients are found to be invariant under common signal processing operations and even some geometric distortions. Based on these observations, a JPEG-compatible image hashing algorithm is proposed. Our experiments verify that the proposed algorithm not only has strong robustness to JPEG compression (QF = 5) and common signal processing operations, but is also robust to the geometric distortions of cropping and resizing (more than 5%) and rotation (more than 3 degrees), which is absent in other hashing algorithms in the JPEG compression domain. Since a JPEG decoder is not needed when generating the hash, the algorithm is more convenient and efficient for indexing and recognition in large image databases or on the Internet than those operating in the pixel domain.


How to cite this article
Yuan-Yuan Hu and Xia-Mu Niu, 2010. Image Hashing Algorithm Based on Robust Bits Extraction in JPEG Compression Domain. Information Technology Journal, 9: 152-157.

Keywords: robust bits extraction, Joint Photographic Experts Group (JPEG), Discrete Cosine Transform (DCT), image identification, image hashing algorithm

INTRODUCTION

It is well known that cryptographic hashing is very sensitive to its input: even a single bit flip changes the output dramatically. However, multimedia data such as images still look the same to the human eye after many manipulations such as compression, cropping and rotation, so they should have the same hash value. By abstracting the content-preserving properties of images, a perceptual image hash should map perceptually identical images to the same hash value, regardless of the data format and/or the manipulations they have suffered. This content-preserving property gives perceptual image hashing a wide range of applications, such as identification/search of images in large databases and content-based image authentication.

The existing robust image hashing algorithms can be roughly divided into three groups according to the storage format of the analyzed images: bmp, JPEG and JPEG 2000. Hashing algorithms based on bmp extract the content-preserving features in the pixel domain and are not confined to a special storage format. Both JPEG and JPEG 2000 are special storage formats, whose transformations are the DCT and DWT, respectively. Though some algorithms based on the DCT or DWT are not compatible with JPEG or JPEG 2000, we still assign them to the JPEG or JPEG 2000 group.

According to Monga and Evans (2006), bmp-based robust hashing can be further classified into methods based on the properties (mean, variance and higher moments) of the intensity values of image blocks, preservation of a coarse image representation and low-level image feature extraction.

Compared with transformations in the pixel domain, such as the Contourlet transform (Li et al., 2006) and the Gabor transform (Manjunath et al., 1996), which can conveniently extract content-preserving features (edge, texture, etc.) in accordance with the HVS, the transforms in the compressed domain, such as the DCT and DWT, are less effective, although they are the most common in real storage formats. Therefore, research in the compressed domain is much scarcer than in the pixel domain. However, since most images in reality are stored in compressed formats, it is more feasible and urgent to design image hashing algorithms compatible with their storage format for identification/search of images in large databases or on the Internet. Efficient extraction of the features in the respective transform coefficient matrices that remain invariant under content-preserving attacks becomes crucial.

Most of the robust hashing algorithms in the DWT domain, which is the basis of the JPEG 2000 format, depend on structural relations, such as Yang and Chen (2005) and Lu and Liao (2003). In Yang and Chen (2005), a special encoding algorithm, SPIHT, is utilized to design the hash. In Lu and Liao (2003), the general structural relationship between the inter-subbands in the DWT coefficient matrix is studied and applied in their proposed image hashing algorithm.

JPEG is the most common image format used by digital cameras and other photographic image capture devices and is also the most common format for storing and transmitting photographs on the World Wide Web. For the DCT, the hashing algorithms can be roughly classified into two groups: relation-based (Lin and Chang, 2001) and preservation of a coarse image representation (Fridrich, 1999; Fridrich and Goljan, 2000; Yu and Sun, 2006). Lin and Chang (2001) brought out the invariance of the relationships between DCT coefficients at the same position in separate blocks of an image. Their scheme is very robust to JPEG compression, which became the basic principle of most robust hashing and robust watermarking algorithms in the DCT domain. However, it is not so robust to other common signal processing operations and is even fragile to geometric distortions, since geometric distortions destroy the original block structure. Moreover, the relationships are established between the coefficient blocks before quantization. This means that, when identifying or searching/indexing images in a large database or on the Internet, one has to go through entropy decoding and dequantization. Although a special JPEG decoder is not necessarily needed for entropy decoding, it must be applied in dequantization, since the quantization scale has to be extracted. Fridrich (1999) and Fridrich and Goljan (2000) observed that the values of the low frequency DCT coefficients of an image are relatively invariant in the absence of significant content changes. Their experiments divided the test image (256x256) into 16 blocks of 64x64 pixels and extracted 50 bits from each block for statistical analysis; the experimental results proved the observation. However, first, the observation is based on the raw DCT coefficients, which does not consider the influence of quantization at different quality factors; second, robustness under geometric distortions is not tested. Yu and Sun (2006) is based on the invariant character of the DCT signs.

Instead of statistically analyzing the DCT coefficients before quantization, our statistical analysis is based on the quantized coefficients, which has two obvious advantages. First, since the quantized coefficients have already gone through the compression effect once, their robustness under JPEG compression attacks is improved. Second, when generating the hash for indexing, only a common entropy decoder is needed instead of a special JPEG decoder, which makes the search more convenient and effective. The result shows a strong correlation between content-similar images and a weak correlation between content-different images. The positions of the most significant low frequency coefficients are mostly invariant under content-preserving attacks. Based on these observations, our algorithm is proposed. Experiments prove that it has strong robustness under common signal processing operations. Moreover, it is robust to some geometric distortions.

STATISTICAL ANALYSIS

The baseline sequential JPEG encoding and decoding processes are described in Fig. 1. C_{h,w}(u,v) denotes the DCT coefficient at coordinates (u,v) in block (h,w). The quantized DCT coefficients are computed with:

C^{\theta}_{h,w}(u,v) = \mathrm{round}\left( \frac{C_{h,w}(u,v)}{scale \cdot Q(u,v)} \right) \qquad (1)

where Q(u,v) is the entry of the standard quantization table and scale controls the compression ratio: a greater scale means a higher compression ratio and a poorer quality factor. Finally, entropy coding is applied after zigzag ordering to C^{\theta}_{h,w}(q), q = 0, 1, ..., 63 for each 8x8 block. JPEG decoding is the reverse process.
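To make Eq. 1 concrete, here is a minimal Python sketch of the quantization step; it assumes the standard JPEG luminance quantization table, and the function name quantize_block is ours, not from the paper:

import numpy as np

# Standard JPEG luminance quantization table Q(u, v)
Q = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99],
])

def quantize_block(dct_block, scale=1.0):
    """Eq. 1: C_theta(u, v) = round(C(u, v) / (scale * Q(u, v)))."""
    return np.round(dct_block / (scale * Q)).astype(int)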

The open JPEG codec of Lakkundi (2006) is applied in this study. Entropy decoding is performed on a JPEG compressed image J to obtain the zigzag-ordered C^{\theta}_{h,w}(q), q = 0, 1, ..., 63. The first nine low frequency coefficients (excluding the DC coefficient) of each block are extracted to form a sequence.
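A sketch of this extraction step might look as follows; the zigzag helper and function names are ours, and we assume the quantized 8x8 blocks are already available from the entropy decoder:

import numpy as np

def zigzag_indices(n=8):
    """(row, col) pairs of an n x n block in JPEG zigzag scan order."""
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))

ZIGZAG = zigzag_indices()

def low_freq_sequence(quantized_blocks, n_coeffs=9):
    """Concatenate the first n_coeffs low frequency AC coefficients
    (zigzag positions 1..n_coeffs, DC excluded) of every block."""
    seq = []
    for block in quantized_blocks:
        seq.extend(int(block[r, c]) for r, c in ZIGZAG[1:1 + n_coeffs])
    return np.asarray(seq)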

Fig. 1:
Baseline sequential JPEG encoding and decoding processes. C_{h,w}(u,v) is the DCT coefficient at coordinates (u,v) in block (h,w). The quantized DCT coefficients C^{\theta}_{h,w}(u,v) are computed with Eq. 1. Entropy coding is applied after zigzag ordering to C^{\theta}_{h,w}(q), q = 0, 1, ..., 63. JPEG decoding is the reverse process

Then, the correlations are analyzed statistically with Eq. 2 in order to find the robust features in the quantized DCT coefficients:

R(v,v') = \frac{\sum_{i=1}^{N}(v_i-\bar{v})(v'_i-\bar{v}')}{\sqrt{\sum_{i=1}^{N}(v_i-\bar{v})^2}\,\sqrt{\sum_{i=1}^{N}(v'_i-\bar{v}')^2}} \qquad (2)

where v and v' are the coefficient sequences extracted from the two images, \bar{v} and \bar{v}' are their means and N is the sequence length.
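Eq. 2 is the ordinary Pearson correlation coefficient, so a minimal Python sketch is straightforward (the function name is ours):

import numpy as np

def correlation(v, v_prime):
    """Pearson correlation coefficient R(v, v') of Eq. 2."""
    v = np.asarray(v, dtype=float)
    v_prime = np.asarray(v_prime, dtype=float)
    return float(np.corrcoef(v, v_prime)[0, 1])

# e.g., correlation(low_freq_sequence(blocks_a), low_freq_sequence(blocks_b))
# is expected to be near 0 for content-different images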

For the test images in Fig. 2a and b, Lena.jpg (compression ratio 9.5622:1) and Peppers.jpg (compression ratio 9.2255:1), their sequences between positions 4501 and 5000 are illustrated in Fig. 3a and d, respectively. It is clearly shown that they are distinct from each other; their correlation coefficient is -0.0289.

Fig. 2:
Test images of (a) Lena and (b) Peppers. The compression ratio of Lena.jpg is 9.5622:1 and that of Peppers.jpg is 9.2255:1. Both are compressed by the open JPEG codec of Lakkundi (2006)

Figure 3b shows the sequence of the test image Lena.jpg recompressed by JPEG with QF = 10 and Fig. 3c shows the sequence of the test image Lena.jpg rotated by 3 degrees. They are found to be very similar to Fig. 3a; their correlation coefficients are 0.9741 and 0.2995, respectively, which are much greater than -0.0289.

Experiments are performed on the USC-SIPI image database, including 44 images. All the images are JPEG compressed by Lakkundi (2006) with scale = 1. The histogram of the inter-correlation coefficients between the 44 images is shown in Fig. 4. With Th = 0.1, the percentage of pairs with |R(v,v')| ≥ Th is only 0.95%, less than 1%.

The correlation coefficients of the 44 JPEG compressed images under content-preserving attacks are shown in Fig. 5. The x-coordinate is the attack strength. For disk blurring, it is the number of blurring passes, ranging from 1 to 10; for cropping, it is (2, 5, 10, 20, 30, 40, 50, 60 and 70%); for Gaussian white noise, it is the variance (0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5); for histogram equalization (histeq), it is the number of bins (256, 224, 192, 160, 128, 96, 64, 32, 16, 8, 4, 2); for JPEG re-compression, it is the QF (95, 90, 80, 70, 60, 50, 40, 30, 20, 10, 5); for median filtering, it is the number of filtering passes, ranging from 1 to 10; for rotation, it is (1°, 3°, 5°, 10°).

Fig. 3:
Illustration of the sequences of low frequency DCT coefficients between positions 4501 and 5000, from top to bottom: (a) Lena.jpg, (b) recompression of Lena.jpg with QF = 10, (c) 3° rotation of Lena.jpg and (d) Peppers.jpg. These experimental results are obtained with MATLAB programs. The first nine low frequency coefficients (excluding the DC coefficient) of each block are extracted to form a sequence. The correlation coefficients between (a) and (d), (a) and (b) and (a) and (c) are -0.0289, 0.9741 and 0.2995, respectively

It is clearly shown that JPEG recompression, median filtering and histogram equalization have little influence on the correlation of the DCT blocks; though the correlations decrease under stronger disk blurring and Gaussian white noise attacks, their absolute values stay above Th = 0.1; meanwhile, geometric distortions destroy the correlations greatly, and only crops up to 5% and rotations up to 3° stay above Th = 0.1.

Fig. 4:
Histogram of the inter-correlation coefficients between the perceptually different images. Experiments are performed on the USC-SIPI image database, including 44 images. All the images are JPEG compressed by Lakkundi (2006) with scale = 1. These experimental results are also obtained with MATLAB programs

Fig. 5:
Correlation coefficients of the 44 JPEG compressed images under content-preserving attacks. The experimental values are computed with MATLAB programs and the figure is drawn with the same tool. From the statistical analysis, a strong correlation between content-similar images and a weak correlation between content-different images are observed

From the statistical analysis, a strong correlation between content-similar images and a weak correlation between content-different images are observed, which we name observation 1. Furthermore, the regions with bigger values are more stable than those with smaller values, which we name observation 2. These two observations form the basis of our proposed algorithm.

PROPOSED ALGORITHM

It has been verified that blocks with big AC coefficients correspond to complex texture regions. Since complex texture regions are perceptually important to the HVS, by observation 2 only the biggest AC coefficients of the quantized blocks are candidate robust bits. To prevent the biggest and the second biggest values from swapping during recompression, a threshold ThΔ is defined: a coefficient is extracted as a robust bit only if its absolute value is the biggest and exceeds the second biggest absolute value by at least ThΔ. The positions of the robust bits are recorded as the image hash. The scheme of the proposed algorithm is shown in Fig. 6.
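The following is a minimal Python sketch of the robust-bit extraction, under our assumption that the candidates are the first nine zigzag AC coefficients of each quantized block, as in the statistical analysis; all names are ours:

import numpy as np

# Same zigzag helper as in the earlier sketch
ZIGZAG = sorted(((r, c) for r in range(8) for c in range(8)),
                key=lambda rc: (rc[0] + rc[1],
                                rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))

def robust_bit(block, th_delta=2, n_coeffs=9):
    """Return the zigzag position (1..n_coeffs) of the dominant AC
    coefficient, or None for a simple-texture block. The position is
    kept only when the biggest absolute value exceeds the second
    biggest by at least th_delta."""
    mags = np.array([abs(int(block[r, c]))
                     for r, c in ZIGZAG[1:1 + n_coeffs]])
    order = np.argsort(mags)                  # ascending magnitudes
    if mags[order[-1]] - mags[order[-2]] >= th_delta:
        return int(order[-1]) + 1             # 1-based zigzag position
    return None

def image_hash(quantized_blocks, th_delta=2):
    """H(J): the per-block robust-bit positions (None where omitted)."""
    return [robust_bit(b, th_delta) for b in quantized_blocks]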

Fig. 6:
Proposed algorithm. Only the coefficient whose absolute value is the biggest and exceeds the second biggest absolute value by at least ThΔ is extracted as a robust bit. The positions of the robust bits are recorded as the image hash

Fig. 7:
Histogram of S between the perceptually different images. The experimental values are computed with MATLAB programs and the figure is drawn with the same tool. The experimental results show that the similarity coefficient S is low enough for perceptually different images

The similarity of two hashes H(J) is calculated with Eq. 3:

S = \frac{N_{\mathrm{same}}}{N_{\mathrm{total}}} \qquad (3)

The numerator N_same is the number of robust bits with the same position in both hashes; the denominator N_total is the total number of blocks. Since the blocks of simple texture are omitted, S is less than 1 even for identical images. Given a threshold Th: if S ≥ Th, the images are judged perceptually the same; if S < Th, perceptually different.
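A minimal Python sketch of Eq. 3 and the decision rule, operating on the position lists produced by a per-block extraction such as the hypothetical image_hash above:

def similarity(hash_a, hash_b):
    """Eq. 3: matching robust-bit positions over the total block count.
    Blocks omitted as simple texture (None) never count as matches."""
    assert len(hash_a) == len(hash_b), "hashes must cover the same grid"
    n_same = sum(1 for a, b in zip(hash_a, hash_b)
                 if a is not None and a == b)
    return n_same / len(hash_a)

def is_perceptually_same(hash_a, hash_b, th=0.1):
    """Decision rule: S >= Th -> perceptually same images."""
    return similarity(hash_a, hash_b) >= th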

RESULTS

With ThΔ = 2, the histogram of S between perceptually different images is shown in Fig. 7. With Th = 0.1, the percentage of pairs with S ≥ Th was only 0.11%. S of the 44 JPEG compressed images under content-preserving attacks is shown in Fig. 8; the layout of the chart is the same as in Fig. 5. It is clear that, for disk blurring, the algorithm is robust to 4 blurring passes; for cropping, up to 5%; for Gaussian noise, up to a variance of 0.1; for histogram equalization, down to 2 bins; for JPEG compression, down to QF = 5; for median filtering (3x3), up to 10 passes; for rotation, up to 3°. The experiments proved that the proposed algorithm not only has better robustness under common signal processing (Lin and Chang, 2001), but also has fairly good robustness under rotation distortion, which is often absent in most DCT hashing algorithms (Lin and Chang, 2001; Fridrich, 1999; Fridrich and Goljan, 2000; Yu and Sun, 2006).

Fig. 8:
S of the 44 JPEG compressed images under content-preserving attacks. The chart shows the efficiency of our proposed algorithm: it not only has better robustness under common signal processing, but also has fairly good robustness under geometric distortions, which is often absent in most DCT hashing algorithms. The experimental values are computed with MATLAB programs and the figure is drawn with the same tool


As ThΔ increases, more image blocks are deemed simple texture, which causes the number of robust bits in H(J) to decrease. The decrease of H(J) between perceptually different images causes Th to decrease, too. Experiments showed that moderate increases of ThΔ yield robustness similar to ThΔ = 2; however, when ThΔ = 10, the robustness grows weaker.

Next, the three biggest absolute coefficients were extracted as candidate robust bits and only those whose mutual differences exceeded ThΔ = 2 were finally selected. The result was similar to that with ThΔ = 10, but the robustness under malicious content changes became stronger.

CONCLUSION

By statistical analysis of the low frequency coefficients of quantized DCT blocks in JPEG compressed images, a strong correlation between content-similar images and a weak correlation between content-different images are observed. Meanwhile, the regions with bigger values are more stable than those with smaller values. Based on these observations, a robust image hashing algorithm is proposed. Experiments show that the algorithm has good robustness to standard benchmark (e.g., StirMark) attacks, including JPEG compression (QF = 5), the geometric distortions of cropping and resizing (more than 5%) and rotation (more than 3 degrees) and common signal processing operations. Since a JPEG decoder is not needed when generating the hash, it is convenient and efficient for indexing and recognition in large image databases or on the Internet. Authentication based on this statistical analysis will be studied further.

REFERENCES

  • Monga, V. and B.L. Evans, 2006. Perceptual image hashing via feature points: Performance evaluation and tradeoffs. IEEE Trans. Image Process., 15: 3452-3465.


  • Li, H.F., W.W. Song and S.X. Wang, 2006. Robust image watermarking algorithm based on contourlet transform. J. Commun., 4: 87-94.


  • Manjunath, B.S., C. Shekhar and R. Chellappa, 1996. A new approach to image feature detection with applications. Pattern Recognition, 29: 627-640.


  • Lu, C.S. and H.Y.M. Liao, 2003. Structural digital signature for image authentication: An incidental distortion resistant scheme. IEEE Trans. Multimedia, 5: 161-173.


  • Lin, C.Y. and S.F. Chang, 2001. A robust image authentication method distinguishing JPEG compression from malicious manipulation. IEEE Trans. Circuits Syst. Video Technol., 11: 153-168.


  • Fridrich, J., 1999. Robust bit extraction from images. Proceedings of the IEEE International Conference on Multimedia Computing and Systems, Florence, Italy, 2: 536-540.


  • Fridrich, J. and M. Goljan, 2000. Robust hash functions for digital watermarking. Proceedings of the International Conference on Information Technology: Coding and Computing, Mar. 27-29, Las Vegas, NV, USA., pp: 178-183.


  • Yu, L.J. and S.H. Sun, 2006. Image robust hashing based on DCT sign. Proceedings of the International Conference on Intelligent Information Hiding and Multimedia Signal Processing, Dec. 18-20, Pasadena, California, USA., pp: 131-134.


  • Yang, S.H. and C.F. Chen, 2005. Robust image hashing based on SPIHT. Proceedings of the International Conference on Information Technology: Research and Education, Jun. 27-30, Hsinchu, Taiwan, pp: 110-114.


  • Lakkundi, R., 2006. JPEG codec. Matlab Central. http://www.mathworks.com/matlabcentral/fileexchange/10476.
