This research used characterized Discrete Cosine Transform (DCT) based Gradient Vector Flow (GVF) active contours to segment chromosome spread images. The main advantage of the characterized parameters is that they have already been standardized and can be used to segment any similar class of chromosome spread images. The segmentation error is defined as the difference between the axis length of the chromosome and the axis length of the contour, measured along both the major and the minor axes. Segmentation was performed on 318 images drawn from three independent datasets and the segmentation errors were quantified. A study of the segmentation error and the segmentation method gives insight into the performance of DCT based GVF active contours as a segmentation tool for chromosome spread images and indicates directions for future work to improve the efficiency of the technique.
The chromosome spread images are obtained using the following general procedure. About 5 mL of blood is drawn from the patient. If a fetus is being karyotyped, amniotic fluid is removed from the amniotic sac that surrounds the fetus during development. This is done with the aid of a large syringe and ultrasound imaging. The fluid contains cells that have been shed by the fetus. The white blood cells are separated from the blood, or the living cells are separated from the amniotic fluid. These cells are then cultured in a medium in which they undergo mitosis. Mitosis is stopped at metaphase using chemicals. The cells are then placed onto a slide and spread out. They are viewed under a microscope that is specially adapted with a camera to take a picture of the chromosomes from one of the cells.
As a result, the chromosomes in a spread image vary in shape because of bending, overlaps, illumination and device dependencies. These variations make segmentation difficult, because the segmenting algorithm or technique needs to be retrained often. Therefore, a special class of deformable curves called Gradient Vector Flow active contours is chosen to segment chromosome spread images efficiently despite this variation.
GRADIENT VECTOR FLOW (GVF) ACTIVE CONTOURS
Gradient Vector Flow (GVF) active contours use gradient vector flow fields obtained by solving a vector diffusion equation that diffuses the gradient vectors of a gray-level edge map computed from the image. The external force of the GVF active contour model cannot be written as the negative gradient of a potential function. Hence the model is specified directly from a dynamic force equation, instead of the standard energy minimization framework.
The external forces arising from GVF fields are non-conservative, as they cannot be written as gradients of scalar potential functions. The use of non-conservative forces as external forces gives gradient vector flow active contours improved performance compared to traditional energy minimizing active contours[4,5].
The GVF field points towards the object boundary when very near to the boundary, but varies smoothly over homogeneous image regions extending to the image border. Hence the GVF field can capture an active contour from long range from either side of the object boundary and can force it into the object boundary. The GVF active contour model thus has a large capture range and is insensitive to the initialization of the contour. Hence the contour initialization is flexible.
The gradient vectors are normal to the boundary, but the vectors obtained by combining the Laplacian and the gradient are not, in general, normal to the boundary. As a result, the GVF field yields vectors that point into boundary concavities, so that the active contour is driven into the concavities. The GVF active contour model does not need to be told whether the initial contour should expand or contract. The GVF is very useful when there are boundary gaps, because it preserves the perceptual edge property of active contours[5,6].
The GVF field is defined as the equilibrium solution to the following vector diffusion equation:
$$u_t = g(|\nabla f|)\,\nabla^2 u - h(|\nabla f|)\,(u - \nabla f) \qquad (1)$$
where u_t denotes the partial derivative of u(x,t) with respect to t, ∇² is the Laplacian operator (applied to each spatial component of u separately) and f is an edge map that has a higher value at the desired object boundary.
The functions g and h control the amount of diffusion in GVF. In Eq. 1, g(|∇f|)∇²u produces a smoothly varying vector field and is hence called the smoothing term, while h(|∇f|)(u-∇f) encourages the vector field u to be close to ∇f computed from the image data and is hence called the data term. The weighting functions g(·) and h(·) apply to the smoothing and data terms respectively and are chosen as g(|∇f|) = μ and h(|∇f|) = |∇f|². g(·) is constant here, so smoothing occurs everywhere, while h(·) grows larger near strong edges and dominates at boundaries.
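Hence, the Gradient Vector Flow field is defined as the vector field $\mathbf{v}(x,y) = [u(x,y), v(x,y)]$ that minimizes the energy functional:
$$E = \iint \mu\,(u_x^2 + u_y^2 + v_x^2 + v_y^2) + |\nabla f|^2\,|\mathbf{v} - \nabla f|^2 \; dx\,dy \qquad (2)$$
Here u and v denote the two scalar components of the field, and the subscripts x and y denote partial derivatives.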
The effect of this variational formulation is that the result is made smooth when there is no data.
When the gradient of the edge map is large, the second term of the integrand dominates and keeps the external field nearly equal to the gradient. In homogeneous regions, where the gradient of the edge map is small, the first term dominates and keeps the field slowly varying. This matters because the vectors of ∇f point toward the edges and are normal to them at the edges, but have significant magnitude only in the immediate vicinity of the edges. In homogeneous regions ∇f is nearly zero. μ is a regularization parameter that governs the tradeoff between the first and the second term in the integrand of Eq. 2. Eq. 2 can be solved using the calculus of variations and then, by treating u and v as functions of time, by solving the resulting generalized diffusion equations.
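The diffusion described above can be sketched numerically. The following is a minimal illustration, not the authors' implementation: it assumes the choices g = μ and h = |∇f|², an edge map f scaled to [0, 1], unit grid spacing and a simple explicit time step. The function names `laplacian` and `gvf`, the step size `dt` and the iteration count are assumptions introduced here.

```python
import numpy as np

def laplacian(a):
    """Five-point Laplacian with replicated (zero-flux) borders."""
    p = np.pad(a, 1, mode="edge")
    return p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4.0 * a

def gvf(f, mu=0.075, iterations=200, dt=0.5):
    """Explicitly iterate u_t = mu * lap(u) - |grad f|^2 * (u - f_x), and likewise for v.

    Assumes f is scaled to [0, 1]; the explicit scheme is only stable
    when dt * (8 * mu + max |grad f|^2) <= 2.
    """
    f = np.asarray(f, dtype=float)
    fy, fx = np.gradient(f)        # derivatives along rows (y) and columns (x)
    mag2 = fx ** 2 + fy ** 2       # h(|grad f|) = |grad f|^2
    u, v = fx.copy(), fy.copy()    # start from the plain edge-map gradient
    for _ in range(iterations):
        u += dt * (mu * laplacian(u) - mag2 * (u - fx))
        v += dt * (mu * laplacian(v) - mag2 * (v - fy))
    return u, v

# Hypothetical edge map: a single vertical edge in an otherwise flat image.
f = np.zeros((41, 41))
f[:, 20] = 1.0
u, v = gvf(f)
# Away from the edge the plain gradient is zero, but the diffused field
# is non-zero and points toward the edge from both sides.
print(u[20, 15] > 0, u[20, 25] < 0)
```

The example illustrates the capture-range argument: the smoothing term carries edge information into homogeneous regions where ∇f itself vanishes.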
DISCRETE COSINE TRANSFORM (DCT) BASED GVF ACTIVE CONTOURS
The transform of an image yields more insight into the properties of the image. The Discrete Cosine Transform has excellent energy compaction and hence promises a better description of the image properties. The Discrete Cosine Transform is therefore embedded into the GVF active contours. When the image properties are only weakly described, the energy compaction property of the DCT helps the contour model to give significantly better performance.
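The energy compaction property can be illustrated with a short sketch. It uses the orthonormal DCT-II, which is an assumption and may differ in normalization from the definition used in this work. The helper names `dct_matrix`, `dct2` and `energy_fraction` and the ramp-shaped test block are hypothetical.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]   # frequency index
    i = np.arange(n)[None, :]   # sample index
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def dct2(block):
    """Separable 2D DCT: transform the rows, then the columns."""
    block = np.asarray(block, dtype=float)
    return dct_matrix(block.shape[0]) @ block @ dct_matrix(block.shape[1]).T

def energy_fraction(block, keep=2):
    """Share of total energy held by the keep x keep lowest-frequency coefficients."""
    coeffs = dct2(block)
    return np.sum(coeffs[:keep, :keep] ** 2) / np.sum(coeffs ** 2)

# Hypothetical 8 x 8 block with a smooth intensity ramp.
y, x = np.mgrid[0:8, 0:8]
block = 100.0 + 5.0 * x + 3.0 * y
print(energy_fraction(block, keep=2))
```

For smooth blocks such as this ramp, nearly all of the energy sits in a handful of low-frequency coefficients, which is what makes a coefficient-based contrast measure practical.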
The 2D DCT is defined as :
The local contrast of the Image at the given pixel location (k,l) is given by
Here, w_t denotes the weights used to select the DCT coefficients. The local contrast P(k,l) is then used to generate a DCT contrast enhanced image, which is then subjected to selective segmentation by the energy compact gradient vector flow active contour model using Eq. 2.
MATERIALS AND METHODS
The chromosome metaphase image (at 72 pixels per inch resolution) provided by Prof. Ken Castleman and Prof. Qiang Wu (Advanced Digital Imaging Research, Texas) was taken and preprocessed. Insignificant and unnecessary regions in the image were removed interactively.
Interactive selection of the chromosome of interest was done by selecting a few points around the chromosome that formed the vertices of a polygon. On constructing the perimeter of the polygon, seed points for the initial contour were determined automatically by periodically selecting every third pixel along the perimeter of the polygon.
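The seed-point selection can be sketched as follows. This is an illustrative sketch only: the source does not specify how the polygon perimeter is rasterized, so simple linear interpolation between vertices is assumed, and the function names `polygon_perimeter_pixels` and `seed_points` are hypothetical.

```python
def polygon_perimeter_pixels(vertices):
    """Trace the closed polygon and return integer pixel coordinates along it."""
    pixels = []
    n = len(vertices)
    for i in range(n):
        (x0, y0), (x1, y1) = vertices[i], vertices[(i + 1) % n]
        steps = int(max(abs(x1 - x0), abs(y1 - y0)))
        # The end vertex is skipped because the next edge starts there.
        for s in range(steps):
            t = s / steps
            pixels.append((round(x0 + t * (x1 - x0)), round(y0 + t * (y1 - y0))))
    return pixels

def seed_points(vertices, step=3):
    """Keep every `step`-th perimeter pixel as an initial contour point."""
    return polygon_perimeter_pixels(vertices)[::step]

# Hypothetical polygon clicked around a chromosome.
square = [(0, 0), (9, 0), (9, 9), (0, 9)]
print(seed_points(square))
```

Taking every third pixel keeps the initial contour dense enough to follow the polygon while limiting the number of points that must be evolved.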
The GVF deformable curve was then allowed to deform until it converged to the chromosome boundary. The optimum parameters for the deformable curve with respect to the Chromosome images were determined by tabulated studies.
The image was made to undergo minimal preprocessing so as to achieve the goal of boundary mapping in chromosome images with very weak edges.
The DCT based GVF active contour is governed by the parameters σ, μ, α, β and κ. σ determines the Gaussian filtering that is applied to the image to generate the external field. A larger value of σ causes the boundaries to become blurry and distorted and can also shift the boundary location. However, large values of σ are necessary to increase the capture range of the active contour. μ is a regularization parameter and requires a higher value in the presence of noise in the image. α determines the tension of the active contour and β determines its rigidity. The tension keeps the active contour contracted and the rigidity keeps it smooth. α and β may also take the value zero, in which case the respective tension and rigidity terms have no influence on the contour evolution. κ is the external force weight that determines the strength of the external field that is applied. The number of iterations was set suitably.
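The roles of α, β and κ can be illustrated with a sketch of the contour update. It assumes the standard GVF active contour force balance x_t = α x'' − β x'''' + κ v(x), which is not written out in this paper, and uses a plain explicit step (practical implementations usually use a semi-implicit scheme). The names `bilinear` and `evolve`, the step size `dt` and the synthetic field are assumptions for illustration.

```python
import numpy as np

def bilinear(field, x, y):
    """Sample a 2D field at fractional (x, y) positions by bilinear interpolation."""
    h, w = field.shape
    x = np.clip(x, 0, w - 1.001)
    y = np.clip(y, 0, h - 1.001)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    dx, dy = x - x0, y - y0
    return ((1 - dx) * (1 - dy) * field[y0, x0] + dx * (1 - dy) * field[y0, x0 + 1]
            + (1 - dx) * dy * field[y0 + 1, x0] + dx * dy * field[y0 + 1, x0 + 1])

def second_difference(a):
    """Second difference along a closed contour (periodic indexing)."""
    return np.roll(a, -1) - 2.0 * a + np.roll(a, 1)

def evolve(x, y, u, v, alpha=0.0, beta=0.0, kappa=0.625, iterations=100, dt=1.0):
    """Move contour points under tension, rigidity and the external field (u, v)."""
    x = np.asarray(x, dtype=float).copy()
    y = np.asarray(y, dtype=float).copy()
    for _ in range(iterations):
        d2x, d2y = second_difference(x), second_difference(y)
        d4x, d4y = second_difference(d2x), second_difference(d2y)
        fx, fy = bilinear(u, x, y), bilinear(v, x, y)
        # With alpha = beta = 0 only the external force term remains.
        x = x + dt * (alpha * d2x - beta * d4x + kappa * fx)
        y = y + dt * (alpha * d2y - beta * d4y + kappa * fy)
    return x, y

# Hypothetical external field that pulls every point toward the column x = 20.
Y, X = np.mgrid[0:41, 0:41].astype(float)
u = (20.0 - X) / 20.0
v = np.zeros_like(u)
x_end, y_end = evolve([5.0, 35.0, 10.0], [10.0, 20.0, 30.0], u, v)
print(x_end, y_end)
```

With α = β = 0 the internal forces drop out and each point simply follows the external field scaled by κ, which is why the quality of the field dominates the result in that setting.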
RESULTS AND DISCUSSION
Characterization of the parameters of the DCT based GVF active contour segmentation scheme has yielded σ = 0.25, μ = 0.075, α = 0, β = 0 and κ = 0.625 as the characterized parameter values(1). Standardization experiments have established that the characterized parameter values are indeed standardized and can be used for segmenting chromosome spread images from any dataset(2). Therefore, three hundred and eighteen chromosome images from three independent datasets were segmented using DCT based GVF active contours. A sample image, its DCT based GVF field and its corresponding segmented image are shown in Fig. 1a-c.
Fig. 1a: Sample chromosome image
Fig. 1b: DCT based GVF field of Fig. 1a
Fig. 1c: Segmented output image of Fig. 1a
Though the segmentation appears very accurate to the naked eye, a very small segmentation error remains. It can be measured as the difference between the lengths of the corresponding axes in the original image and the contoured image, along the major and minor axes. Table 1 lists the segmentation error for the 318 samples.
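The error measure can be sketched as below. The paper does not state how the axis lengths are extracted, so this sketch assumes that they are the extents of the boundary points along their principal directions. The function names `axis_lengths` and `segmentation_error` and the two ellipses are hypothetical.

```python
import numpy as np

def axis_lengths(points):
    """Return (major, minor) extents of a point set along its principal axes."""
    pts = np.asarray(points, dtype=float)
    centered = pts - pts.mean(axis=0)
    # Eigenvectors of the covariance matrix give the principal directions;
    # eigh returns them ordered by ascending eigenvalue.
    _, vecs = np.linalg.eigh(np.cov(centered.T))
    proj = centered @ vecs
    extents = proj.max(axis=0) - proj.min(axis=0)
    return extents[1], extents[0]

def segmentation_error(reference_points, contour_points):
    """Absolute difference in major and minor axis length."""
    ref_major, ref_minor = axis_lengths(reference_points)
    con_major, con_minor = axis_lengths(contour_points)
    return abs(ref_major - con_major), abs(ref_minor - con_minor)

# Hypothetical boundaries: a reference ellipse and a slightly smaller contour.
theta = np.linspace(0.0, 2.0 * np.pi, 360, endpoint=False)
reference = np.column_stack([20.0 * np.cos(theta), 8.0 * np.sin(theta)])
contour = np.column_stack([19.5 * np.cos(theta), 7.75 * np.sin(theta)])
print(segmentation_error(reference, contour))
```

Reporting the two axes separately, as done in this study, makes it possible to see whether the contour misses the boundary more along the length or across the width of the chromosome.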
Table 1: Segmentation error for 318 samples from 3 datasets
Fig. 2: Segmentation errors obtained while segmenting 318 sample chromosome images from 3 independent datasets
From Fig. 2, it is observed that the major axis error oscillates more than the minor axis error. This may be due to the fact that the maximum radius along the major axis of the chromosome is 64.873797 and the minimum radius along the major axis is 11.489264, while the maximum and minimum lengths of the chromosome along the minor axis are only 25.821964 and 7.354900. The ratio of the maximum to the minimum along the major axis is approximately 5.646471, while the same ratio along the minor axis is only 3.510851. Hence the segmentation error along the minor axis may seem to oscillate less, but given the smaller ratio of maximum to minimum length (3.510851 compared to 5.646471 along the major axis), it can be inferred that the error oscillates proportionally along the major and the minor axes. This supports the view that the tension and rigidity parameters, which assume a constant value in the DCT based GVF active contour formulation, enforce a circular iterative effect on the initial contour, forcing it to converge circularly on the chromosome image irrespective of the size of the chromosome or the ratio of the major axis length to the minor axis length.
Also, Fig. 2 shows that the error varies only between 0.055934 and 1.963482 when the images are segmented with DCT based GVF active contours with parameter values of σ = 0.25, μ = 0.075, α = 0, β = 0 and κ = 0.625 and subjected to limited preprocessing and enhancement using only median filtering, adaptive histogram equalization, averaging and gray-level adjustment with built-in MATLAB functions.
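The two ratios quoted above can be checked directly from the reported extreme values. The variable names below are introduced only for this check.

```python
# Extreme axis values reported in the text.
major_max, major_min = 64.873797, 11.489264
minor_max, minor_min = 25.821964, 7.354900

major_ratio = major_max / major_min
minor_ratio = minor_max / minor_min

# Both ratios agree with the quoted 5.646471 and 3.510851 to about six digits.
print(round(major_ratio, 4), round(minor_ratio, 4))
```

The major axis spans a range roughly 1.6 times wider than the minor axis in relative terms, which is consistent with the larger error oscillation seen along the major axis.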
The characterization and the preprocessing and enhancement requirements are found to play a vital role in the successful segmentation of chromosome spread images using DCT based GVF active contours. If the technique is to be extended to universal segmentation of chromosome spread images, retraining by repeated characterization would defeat the aim of universal segmentation itself. Since preprocessing and enhancement also play a vital role, effort can instead be concentrated on them to reduce the segmentation error to the level of less than one pixel that is found in characterized segmented chromosome image samples. An error of less than one pixel is acceptable because the contour iterative step size is one pixel and the contour is one pixel thick. Too much enhancement and preprocessing would destroy the gradient between the background and the image boundary, which is essential for the effective convergence of the DCT based GVF active contour. Therefore, choosing the enhancement and preprocessing by trial and error is sufficient for good convergence of the DCT based GVF active contour with an acceptable segmentation error of less than one pixel.
This makes the DCT based GVF active contour, accompanied by optimum preprocessing and enhancement, an efficient segmentation tool for any chromosome, provided that, for best results, it is characterized with chromosomes that are the same as or similar to the chromosome of interest.
The DCT based GVF active contours are hence established as an efficient tool for chromosome image segmentation, independent of the dataset from which the images are obtained.
The authors extend their heartfelt thanks to Dr. Michael Difilippantonio, Staff Scientist at the Section of Cancer Genomics, Genetics Branch/CCR/NCI/NIH, Bethesda MD; Prof. Ekaterina Detcheva at the Artificial Intelligence Department, Institute of Mathematics and Informatics, Sofia, Bulgaria; Prof. Ken Castleman and Prof. Qiang Wu, from Advanced Digital Imaging Research, Texas; and the Wisconsin State Laboratory of Hygiene (http://worms.zoology.wisc.edu/zooweb/Phelps/karyotype.html) for their help in providing chromosome spread images.