Research Article
Watermarking System for the Security of Medical Image Databases used in Telemedicine
College of Computer Science and Information Technology (CCSIT), King Faisal University, Saudi Arabia
Presently, medical practitioners in developed countries routinely send medical information from a medical center in one geographical area to another over communication media such as the internet. Diagnosing patients remotely in this way is called telemedicine. Medical information such as medical images and diagnosis reports is exchanged daily between research institutes and medical centers around the world for learning and treatment purposes by doctors and researchers1. Patient information should be kept secure so that only authorized users can access or modify it (i.e., the confidentiality of medical information). Confidentiality and authenticity are the two main issues when patient information is transmitted from one place to another through an unreliable communication channel such as the internet2.
Some researchers have stated that digital image watermarking may be a good solution to these issues, because medical image watermarking allows the Electronic Patient Record (EPR) to be embedded in medical images for authentication and copyright protection3-5. The watermarked medical image can still be used for diagnosis by the medical practitioner. When the watermarked image is transmitted through the internet to a second doctor, that doctor can retrieve the hidden information for further diagnosis. Provided the medical image has not been attacked, the watermarking algorithm can help the doctor recover the image by undoing the watermarking process, restoring it to its pristine state within a few seconds for further diagnosis with confidentiality6. However, a number of medical image watermarking algorithms leave distortions in the medical image after watermarking, which can lead to misdiagnosis7-10. The proposed system in this study reverses the watermarking process and brings the image back to its pristine state. It also aims to give the medical practitioner the original image after the patient information has been extracted from the watermarked image. Consequently, doctors can further diagnose the medical image without any loss of medical information.
Medical image watermarking is an emerging field for the protection of medical images. Liew and Zain11 and Liew et al.12 have developed two reversible block based techniques. In the first technique, the input image is divided into Region of Interest (ROI) and Region of Non-Interest (RONI). The second technique is the same as the first, except that the removed LSBs of the pixels in the ROI blocks are compressed using the Run Length Encoding (RLE) technique before being embedded into the RONI. Memon et al.13 reported a technique in which the input image is first divided into ROI and RONI. Then, a fragile watermark is embedded into the ROI using the LSB method. Later, the RONI is segmented into blocks of user defined size N×N and a location map of embeddable blocks is created. The disadvantage of this technique is that it does not properly specify how the original ROI is recovered when the ROI is tampered with. Al-Qershi and Khoo14 reported a technique in which the input image is divided into three regions: ROI, RONI and border pixels. Finally, the information about the sizes of the watermarks implanted into the ROI and RONI is embedded into the border pixels using the same DWT technique. Two drawbacks of this technique were reported: it uses a compressed form of the ROI as the recovery information for the ROI and it is applicable only to images of size 512×512. From the previous studies overall, it can be concluded that all the techniques except those reported by some investigators9,15,16 are ROI-based, while the techniques reported by others11-14 embed distortion inside the ROI. However, none of these techniques clearly localizes tampering inside the ROI or accurately identifies the tampered blocks. The proposed system overcomes these limitations.
The main objective of this study was to present a medical image authentication and verification system beneficial for communicating medical information for low budget hospitals through the internet in telemedicine environment.
The proposed system is a lossless data hiding scheme that uses data compression to create space for embedding the payload in wavelet coefficients using the LSB method. The compressed coefficients are then decompressed in the extraction phase to get the payload back and to recover the original medical image in its pristine state. An 8-bit gray scale medical image has 8 bit-planes, with pixel values in the range 0-255. Without a strong bias between zeros and ones in a bit-plane, good embedding capacity cannot be achieved in the spatial domain10. Therefore, in the proposed system, a lossless compression technique was used to compress the binary bit-plane in the frequency domain and the watermark was implanted in the saved space. On the extraction side, the implanted watermark was extracted first and then the compressed bit-plane was decompressed. The original image can be recovered because the compression is lossless. Figure 1a shows the original CT scan image and Fig. 1b shows the 4th bit-plane of the RONI area of the input CT scan image after segmentation. In Fig. 1b, the redundancy of black pixels can easily be observed, especially in the corners. This redundancy was exploited to hide the watermark data.
Fig. 1(a-b): | (a) Original image and (b) 4th bit-plane of RONI |
Table 1: | Comparison of embedding capacity in both spatial and frequency domains |
In order to obtain the higher bias through zeros and ones, image transform was applied to medical images. For removing high redundancy and hiding more watermark information while avoiding round off error, the second generation frequency transform such as IWT17 was used. The IWT maps integers to integers and was used along with the lifting scheme CDF(2,2) which gave better imperceptibility after hiding the payload as compared to other lifting schemes as reported18.
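The integer-to-integer property of the CDF(2,2) lifting scheme can be illustrated with a minimal one-dimensional sketch. This is not the authors' implementation: the function names, the mirrored boundary handling and the restriction to even-length signals are assumptions made for illustration. It uses the widely known reversible 5/3 lifting steps (predict, then update) with floor rounding.

```python
import numpy as np

def iwt53_forward(x):
    """One level of the integer 5/3 (CDF(2,2)) lifting transform, 1-D, even length."""
    x = np.asarray(x, dtype=np.int64)
    even, odd = x[0::2].copy(), x[1::2].copy()
    # Predict step: detail = odd sample minus floor of the mean of its even neighbours.
    even_next = np.append(even[1:], even[-1])      # mirror at the right edge (assumption)
    d = odd - ((even + even_next) >> 1)
    # Update step: approximation = even sample plus floor((d_prev + d + 2) / 4).
    d_prev = np.insert(d[:-1], 0, d[0])            # mirror at the left edge (assumption)
    s = even + ((d_prev + d + 2) >> 2)
    return s, d

def iwt53_inverse(s, d):
    """Exactly undo iwt53_forward by reversing the lifting steps."""
    s = np.asarray(s, dtype=np.int64)
    d = np.asarray(d, dtype=np.int64)
    d_prev = np.insert(d[:-1], 0, d[0])
    even = s - ((d_prev + d + 2) >> 2)
    even_next = np.append(even[1:], even[-1])
    odd = d + ((even + even_next) >> 1)
    x = np.empty(even.size + odd.size, dtype=np.int64)
    x[0::2], x[1::2] = even, odd
    return x
```

Because each lifting step only adds or subtracts an integer computed from the other channel, the inverse recovers the input exactly, with no round-off error. A two-dimensional first-level transform would typically apply the same steps along rows and then along columns to produce the approximation and the three detail subbands.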
Arithmetic Encoding (AE) for bit-plane compression: The bit-planes of the wavelet frequency coefficients provide a better bias of zeros and ones than those of the spatial domain19. As shown in Table 1, the 4th bit-plane in the spatial domain achieved 1.82% embedding capacity after compression with the AE technique, while 82.84% embedding capacity could be achieved by applying the AE technique to the 4th bit-plane of the wavelet coefficients obtained by applying the 1st-level IWT to the same medical image (Fig. 1a).
As reported in Memon and Gilani6, embedding the payload in the lower bit-planes gave higher values of Peak Signal to Noise Ratio (PSNR), whereas embedding the payload in higher bit-planes gave lower values of PSNR in the spatial domain.
Proposed watermarking system: A very simple lossless data hiding system based on image segmentation, IWT, AE and the LSB substitution method was proposed. The system comprises a watermark embedding phase and a watermark extraction phase. In the watermark embedding phase, two watermarks were embedded: a fragile one and a robust one. The fragile watermark is the Binary Pattern (BP), whereas the robust watermark is the Composite Watermark (CW). The BP was implanted in the ROI in the spatial domain, while the CW was implanted in the RONI in the frequency domain. The BP was embedded with the simple LSB method: the 1st bit-plane of the ROI was replaced by the BP. To embed the CW in the frequency domain, the IWT was first applied up to the 1st level on the RONI region to get the horizontal (HL1), vertical (LH1) and diagonal (HH1) wavelet frequency coefficients. After this, the AE technique was applied to compress HL1, LH1 and HH1 to make room for data embedding. The CW, together with the compressed HL1, LH1 and HH1 coefficients and some book keeping data required for decompressing the coefficients, was then embedded in the 4th bit-plane of the wavelet frequency coefficients of the RONI using the LSB technique. The BP served to identify tampered areas in the ROI and thereby check the integrity of the ROI of the medical image, while the main purpose of the composite watermark was to prove the authenticity and ownership of the medical image. In the watermark extraction phase, the watermarks were extracted from the watermarked image and compared with their reference counterparts to check the integrity of the content and to prove the authenticity and ownership of the medical image.
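The link between bit-plane bias and embedding capacity can be sketched with a zeroth-order entropy estimate. This is an illustrative assumption, not the measurement procedure behind Table 1: an ideal entropy coder such as AE needs roughly H(p) bits per input bit, so about 1 - H(p) of the plane is freed for payload. The helper names are hypothetical and the data are assumed to be non-negative integers.

```python
import numpy as np

def bit_plane(values, k):
    """Return bit-plane k (k = 1 is the least significant) of non-negative integers."""
    return (np.asarray(values, dtype=np.int64) >> (k - 1)) & 1

def estimated_free_fraction(plane):
    """Rough fraction of a binary plane an ideal zeroth-order entropy coder could free."""
    p1 = float(np.mean(plane))                     # fraction of ones in the plane
    if p1 in (0.0, 1.0):
        return 1.0                                 # a constant plane carries no information
    entropy = -(p1 * np.log2(p1) + (1.0 - p1) * np.log2(1.0 - p1))
    return 1.0 - entropy
```

A plane with equal numbers of zeros and ones yields an estimate near zero, while a strongly biased plane yields an estimate close to one. A practical adaptive coder can differ from this estimate in either direction, since it may exploit spatial context but also carries overhead.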
Fig. 2: | Binary pattern embedded in ROI |
The proposed system also makes it possible to recover the original image in its pristine state after the hidden data has been extracted. It therefore falls under the category of lossless, distortion-free or invertible data hiding schemes.
Generation of watermarks: The procedure for generating both the fragile and robust watermarks is described below. After the watermarks were created, they were implanted in the Region of Interest (ROI) and the Region of Non-Interest (RONI), separately.
Binary Pattern (BP): In order to check the integrity of the medical image, a fragile watermark referred to as the BP, consisting of the letters "KFU", was generated with the Microsoft Paint tool as depicted in Fig. 2.
Composite Watermark (CW): The CW is composed of four different types of watermarks as described below:
• | Patient Record (PR): This contains the name, age, sex and history of the patient and is alphanumeric. To make a binary vector from the PR, each alphanumeric character was converted into its corresponding ASCII code and each ASCII code was then converted into its 8-bit binary representation. A total of 128 characters were used, which produced a binary vector of 1024 bits after conversion |
• | Doctor's ID (DI): This 16-character string was given by the doctor when producing the medical image. The same conversion procedure was adopted as for the PR. The length of the DI was 128 bits |
• | Hospital logo (HG): This icon was used by the hospital for claiming ownership. A sample HG is shown in Fig. 3. The size of the HG was 32×32 bits, which produced 1024 bits after conversion into a binary vector |
• | LSB information (LI): This is 1st bit-plane of ROI. The length of LI is actually adaptive and depends on the size of ROI |
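The character-to-bit conversion described for the PR and DI can be sketched as follows. This is a minimal illustration under stated assumptions: padding short strings with spaces, truncating long ones and ordering bits most significant first are choices made here, not details given in the text, and the sample strings are hypothetical.

```python
def text_to_bits(text, n_chars):
    """Convert text to a fixed-length bit list: n_chars ASCII characters, 8 bits each."""
    padded = text[:n_chars].ljust(n_chars)         # space padding is an assumption
    data = padded.encode("ascii")
    # Emit each byte most significant bit first (bit order is an assumption).
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]

def bits_to_text(bits):
    """Inverse of text_to_bits: regroup bits into bytes and decode as ASCII."""
    values = [int("".join(str(b) for b in bits[i:i + 8]), 2)
              for i in range(0, len(bits), 8)]
    return bytes(values).decode("ascii")

pr_bits = text_to_bits("DOE JOHN, 55, M, no prior history", 128)   # hypothetical record
di_bits = text_to_bits("DR-0001", 16)                              # hypothetical doctor ID
```

With these lengths the PR vector has 128 × 8 = 1024 bits and the DI vector has 16 × 8 = 128 bits, matching the sizes stated in the list above.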
Fig. 3: | Sample hospital logo of size 32×32 pixels |
After creating these four watermarks, all were concatenated into a single binary vector referred to as the CW. The CW was further concatenated with the compressed wavelet coefficients (HL1, LH1 and HH1) along with some book keeping data required for decompressing the wavelet coefficients to produce the vector P, such that P = [p(i), p∈{0, 1}]. In order to increase the security of the watermark, P was further XORed with a pseudo-randomly generated binary vector Q, such that Q = [q(i), q∈{0, 1}], of the same size as P, to obtain the final watermark, i.e., the bitwise XOR of P and Q. The vector Q is generated with a key, k, which may be communicated to the recipient via a private channel. The scheme for creating the final watermark is depicted in Fig. 4. With this technique, even if an attacker knows the algorithm and extracts the watermark, the attacker will still not be able to decode the embedded information without knowing the key, k.
Segmentation of input image into ROI and RONI: In medical image diagnosis applications, the ROI is generally defined by the medical practitioner. ROIs of different shapes were selected by experts as per their requirements, keeping in view the application in hand, as described by some researchers16. For the proposed method, the segmentation algorithm reported in Memon et al.5 was used.
Fig. 4: | Final watermark embedded in RONI |
Watermark embedding procedure: The implantation of the watermarks starts after the input image has been segmented into ROI and RONI. The watermark casting process was applied separately to the ROI and RONI regions. Each step of the embedding procedure is described below:
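The keyed XOR step can be sketched as below. The choice of generator is an assumption: NumPy's `default_rng` is used only to show that the same key reproduces the same mask, it is not cryptographically secure, and the text does not say which generator was used. The function names are hypothetical.

```python
import numpy as np

def keyed_mask(key, n_bits):
    """Pseudo-random binary vector Q derived from an integer key (illustrative only)."""
    rng = np.random.default_rng(key)
    return rng.integers(0, 2, size=n_bits, dtype=np.uint8)

def xor_with_key(bits, key):
    """XOR a binary vector with the keyed mask; applying it twice restores the input."""
    p = np.asarray(bits, dtype=np.uint8)
    return p ^ keyed_mask(key, p.size)
```

Since XOR is its own inverse, the receiver regenerates Q from the shared key and applies the same function to the extracted watermark to get P back. A deployment concerned with real adversaries would presumably replace the generator with a cryptographic one.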
• | Separated the ROI and RONI regions of input medical image, I by applying segmentation process |
• | Generated both watermarks, the BP and the final watermark |
• | Extracted the LSBs of the ROI and stored this information separately |
• | Set the LSBs of the ROI to zero |
• | Embedded the BP into the 1st bit-plane of the ROI by the LSB substitution method to obtain the Watermarked Region of Interest (WROI) |
• | Applied the IWT up to the 1st level to the RONI and compressed the HL1, LH1 and HH1 wavelet coefficients using the AE technique |
• | Cast the final watermark into the 4th bit-plane of the compressed coefficients using the LSB method |
• | Applied Inverse Integer Wavelet Transform (IIWT) to obtain the Watermarked Region of Non-Interest (WRONI) |
• | Finally, combined both the WROI and WRONI to obtain the watermarked image I |
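The ROI part of the steps above (save the LSBs, clear them, substitute the BP) can be sketched in a few lines. This is a simplified illustration that assumes a rectangular ROI array with the same shape as the binary pattern; the function names are hypothetical and real ROIs may have arbitrary shapes.

```python
import numpy as np

def embed_lsb(roi, pattern):
    """Replace the 1st bit-plane of an 8-bit ROI with a binary pattern.

    Returns the watermarked ROI and the original LSBs needed for later recovery.
    """
    roi = np.asarray(roi, dtype=np.uint8)
    pattern = np.asarray(pattern, dtype=np.uint8) & np.uint8(1)
    original_lsbs = roi & np.uint8(1)              # saved separately (the LI watermark)
    cleared = roi & np.uint8(0xFE)                 # set the LSBs to zero
    return cleared | pattern, original_lsbs

def restore_lsb(marked, original_lsbs):
    """Put the saved LSBs back to recover the original ROI exactly."""
    marked = np.asarray(marked, dtype=np.uint8)
    return (marked & np.uint8(0xFE)) | np.asarray(original_lsbs, dtype=np.uint8)
```

Each pixel changes by at most one gray level, which is why the ROI distortion stays small, and the saved LSBs make the change fully reversible.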
Procedure for watermark extraction: Several steps of the extraction procedure are nearly the same as those of the embedding procedure. The extraction procedure was defined as below:
• | Divided the I into WROI and WRONI |
• | Extracted the information of 1st bit-plane from WROI |
• | Applied the IWT up to first level on WRONI to obtain marked HL1, LH1 and HH1 wavelet coefficients |
• | Extracted the final watermark from these marked HL1, LH1 and HH1 coefficients |
Verifying integrity of ROI, authenticity and recovering original medical image: In order to verify the integrity of the ROI, check the authenticity of the received medical image and reconstruct the received image in its pristine state, the following steps were followed:
• | Compared the extracted Binary Pattern (BP) visually with reference BP. If no visible artifacts were found in the BP, the integrity of ROI was retained, otherwise the ROI was tampered during the transmission |
• | Applied the XOR operation to the extracted final watermark with the pseudo random binary vector Q, generated with the same key, k, as used at the time of embedding, to get back the binary vector P |
• | Separated the CW, the wavelet coefficients HL1, LH1 and HH1 and the book keeping data from P |
• | Then divided the CW into PR, DI, HG and LI binary vectors as per their pre-defined lengths |
• | Replaced the LSBs of 1st bit-plane of WROI with the LI information to get back original ROI |
• | Applied the AE technique on the compressed coefficients to decompress them using book keeping data |
• | Applied the IIWT up to the 1st level to WRONI to get the original RONI |
• | Lastly, combined both the ROI and RONI to obtain the original image back to its pristine state |
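The integrity check in the first step above is described as a visual comparison. An automated counterpart can be sketched as a pixel-wise XOR between the extracted 1st bit-plane and the reference BP; this is an illustrative assumption rather than the procedure used in the study and the function name is hypothetical.

```python
import numpy as np

def tamper_map(marked_roi, reference_bp):
    """Return a binary map that is 1 wherever the extracted LSB differs from the BP."""
    extracted = np.asarray(marked_roi, dtype=np.uint8) & np.uint8(1)
    reference = np.asarray(reference_bp, dtype=np.uint8) & np.uint8(1)
    return extracted ^ reference
```

An all-zero map suggests the ROI was not altered, while nonzero entries mark candidate tampered pixels. A modification that happens to leave a pixel's LSB unchanged would not show up in this map, so in practice roughly half of randomly altered pixels are flagged.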
The simulation results of the proposed system are summarized below. First the watermarks were generated and then embedded in the medical images. After that, the watermark extraction procedure was performed.
Data set used in experiments: For the experimental results, CT scan medical image databases were used as a special case. The size of each CT scan image was fixed at 256×256 pixels and all were 8-bit gray level images. The proposed system can be used for any modality, such as MRI, X-ray or ultrasound (US). In that case, the ROI can be defined by the medical practitioner based on the application in hand. Both manual and automatic methods can be used.
Embedding the watermarks: First the watermarks to be embedded were generated as described earlier and then the input image was divided into ROI and RONI. After that, the BP was inserted in the ROI and the final watermark was inserted in the RONI of the input medical image. The first row of Fig. 5 depicts the watermarked image produced after inserting the BP into the CT scan image. A PSNR of 57.5219 dB was found for the watermarked image. The study findings agree with those of Zain and Fauzi16, who reported that a medical image watermarking scheme based on a secret key and a public chaotic mixing algorithm detects tampering and subsequently helps recover the original image. The PSNR was calculated using Eq. 1 and 2. In Eq. 1, MSE is the Mean Square Error and R is the maximum value of gray scale intensity in the input image, whereas in Eq. 2, I1 is the input image and I2 is the watermarked image:
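(1) | $\mathrm{PSNR} = 10 \log_{10}\left(\frac{R^{2}}{\mathrm{MSE}}\right)$ |
(2) | $\mathrm{MSE} = \frac{1}{M N}\sum_{m=1}^{M}\sum_{n=1}^{N}\left[I_1(m,n) - I_2(m,n)\right]^{2}$ |
Here M and N denote the numbers of rows and columns of the images.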
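The PSNR computation, in its conventional form, can be sketched as follows. The default peak value of 255 assumes 8-bit gray scale images, as used in the experiments; the function name is hypothetical.

```python
import numpy as np

def psnr(original, marked, peak=255.0):
    """Peak Signal to Noise Ratio in dB between two images of equal shape."""
    a = np.asarray(original, dtype=np.float64)
    b = np.asarray(marked, dtype=np.float64)
    mse = np.mean((a - b) ** 2)                    # mean square error over all pixels
    if mse == 0:
        return float("inf")                        # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

Converting to floating point before subtracting avoids the wrap-around that would occur when subtracting unsigned 8-bit arrays directly.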
The second row of Fig. 5 shows the difference image obtained by subtracting the original image from the watermarked image. It can easily be observed from this residual image that the watermark was embedded only in the ROI.
Figure 6 shows the watermarked image after inserting the watermark in the RONI area of the original image. The PSNR was calculated as 31.6209 dB on average. The average total time required for embedding both the fragile and robust watermarks was 32.50 sec. The second row of Fig. 6 shows the difference image obtained by subtracting the original image from the watermarked image. It can easily be observed from this residual image that the watermark was cast only into the RONI area.
Table 2 presents the data of a patient aged 55 years. The slice thickness was set to 0.1 mm while capturing the images. A total of 68 slices were captured. The segmentation time, the total area of the ROI, the total time required for embedding the watermark and the PSNR after embedding the watermark are shown for every 5th slice in this table.
Fig. 5: | Binary pattern embedded in ROI |
Fig. 6: | Watermark embedded in RONI |
Table 2: | Simulation results of model patient |
The time taken to segment the CT scan image into ROI and RONI for each slice is shown as a line graph in Fig. 7. It can be observed that the minimum time required for a slice in the image database was 11.75 sec, while the maximum was 15.75 sec. The time required to segment each slice depends on the structure of the image. It was observed that segmentation time does not depend on the size of the ROI.
This is because the segmentation process scans the entire image whether or not an ROI is present in the input slice. Thus, images having no ROI consume the same or less time for segmentation.
The payload capacity of the ROI and RONI varies according to their sizes. Generally, due to the structure of the lung parenchyma of the human body, the ROI in the first and last slices is smaller than in slices containing the middle part of the lung parenchyma. Thus, the first and last slices have low embedding capacity, whereas the middle slices have high embedding capacity. Figure 8 shows the ROI areas for the 68 slices. It can easily be observed from the figure that some of the first slices have zero pixels in their ROI area. For example, slices 1-5 have zero pixels in their ROI. Similarly, in the end part of the lung, slices 54-64 have zero pixels in their respective ROI areas.
The embedding time is the total time required for embedding both watermarks in ROI and RONI.
Fig. 7: | Line graph showing the time required for segmenting |
Fig. 8: | Bar graph showing the ROI in each slice for model patient |
This time depends on the amount of payload embedded in the input image. It can be observed from Table 2 that the image slices with a larger ROI area consumed more time than the image slices with a smaller ROI area. The payload was defined as the number of binary bits embedded per pixel and was therefore measured in bits per pixel (bpp). The simulation results showed that the payload increases with the area of the ROI. This is because the larger the ROI, the more LSB information is embedded in the RONI. The study findings are similar to those of Wu et al.20, who reported two block based methods for tamper detection and recovery of medical images. In the first method, each block is embedded with the authentication message and the recovery information of the other blocks. In the second method, a JPEG bit-string of the selected ROI is created and then segmented into fixed length pieces.
Fig. 9: | Slice No. and the corresponding payload value after watermarking |
Fig. 10: | Slice No. and the corresponding PSNR value after watermarking |
The main drawback of this technique was that it requires more time for calculations to generate the recovery data of ROI.
Figure 9 shows that the payload was low for the early slices, then slowly increased and later decreased again. This behavior is similar to the one shown in Fig. 8 and is due to the increasing and decreasing extent of the lung parenchyma in the human body. The value of the PSNR measure depends on the payload embedded in the input image and varies inversely with it: the lower the payload, the higher the PSNR and, conversely, the higher the payload, the lower the PSNR. Figure 10 shows the PSNR values for each slice after watermarking. The slices containing the start and end areas of the lung parenchyma showed higher PSNR values due to the low embedded payload, whereas the slices containing the middle part of the lung parenchyma showed lower PSNR values due to the high embedded payload.
Watermark extraction: In order to check the system, watermarks were extracted after dividing the watermarked image I into WROI and WRONI areas.
Fig. 11: | Extracted watermark from ROI without attack |
For extracting the watermark from WROI, the 1st bit-plane of WROI was extracted. Figure 11 shows the BP extracted from the ROI when there was no attack on the watermarked image. After that, the final watermark was extracted from WRONI. Reversing the operation done at the time of embedding, the extracted watermark was XORed with the key-generated pseudo random vector Q in order to get P. The CW was then separated from the wavelet coefficients and the book keeping data. The CW was further divided into PR, DI, HG and LI as per their pre-defined lengths. First, the PR was passed through the reverse of the procedure for converting alphanumeric data into binary, as explained earlier, to get back the patient information. After this, the extracted doctor's identification code (DI) was compared with the reference DI using the Normalized Hamming Distance (NHD) metric as defined by some researchers13. The formula for NHD is given in Eq. 3:
(3) | $\mathrm{NHD}(w, w^{*}) = \frac{1}{N}\sum_{i=1}^{N} w(i) \oplus w^{*}(i)$ |
where, w is the input watermark, w* is the extracted watermark and N is the length of the watermark. The range of NHD is between 0 and 1. The lower the value of NHD, the better the quality of the extracted watermark. In this case, zero distance was observed between the reference DI and the extracted DI. This confirmed the authenticity claimed by the doctor who produced the medical image. The extracted hospital logo HG was compared with the reference HG and found to be the same as the logo shown in Fig. 3. In this way the ownership claim on the image was also confirmed. In order to reconstruct the received image in its pristine state, the wavelet coefficients were decompressed using AE to get the original wavelet coefficients.
This was followed by applying the IIWT on WRONI to get back the original RONI. Finally, the ROI and RONI were combined to get the original image back. Similar results were reported by Deng et al.15, who reported that a medical image watermarking technique based on reversible watermarking and quadtree decomposition is useful for medical data security. Also, Kim et al.9 developed a region-based tampering detection and restoring scheme for authentication and integrity verification of medical images based on image homogeneity analysis.
The present study proposed a watermarking system for medical images. The objective of the proposed system was twofold. First, it ensured the authenticity and copyright protection of medical image databases in a real time environment. Second, it guaranteed the safe transfer of medical image databases from one geographical location to another. In order to achieve these two goals, the proposed system implanted two watermarks, one fragile and one robust, in the medical image databases. The robust watermark proved to be robust under intensive experiments with different properties. The fragile watermark led to an efficient system for complete verification of medical images against both lawful and unlawful attacks. By using the proposed system, the medical image database can be accessed only by authorized users. The proposed system can also benefit the public and private sectors by providing security in medical image database archiving and retrieval systems.
The author gratefully thanks the Deanship of Research, King Faisal University, Al-Ahsa for financial support and provision of infrastructure for conducting the study.
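The NHD comparison described above can be sketched as the mean of the bitwise XOR between the reference and extracted binary vectors. The function name is hypothetical and both watermarks are assumed to be given as 0/1 sequences of equal length.

```python
import numpy as np

def normalized_hamming_distance(w, w_star):
    """Fraction of bit positions at which two equal-length binary vectors differ."""
    w = np.asarray(w, dtype=np.uint8)
    w_star = np.asarray(w_star, dtype=np.uint8)
    if w.shape != w_star.shape:
        raise ValueError("watermarks must have the same length")
    return float(np.mean(w ^ w_star))
```

A value of 0 means the extracted watermark matches the reference bit for bit, while a value of 1 means every bit differs.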