Efficient Multi-resolution Detection of Binary Object

Fang, Xianyong; Wu, Hao

ABSTRACT

This study presents an efficient multi-resolution method for detecting binary objects directly. Both intensity and geometry differences are used to measure the similarity between the source object template and the target objects in the test image. For the geometric difference, the histogram of oriented gradients is used to measure the similarity between the edge images of the source object template and the target object. Based on these two measurements, a multi-resolution strategy locates the object position in a coarse-to-fine way, with the intensity difference and then the geometry difference applied consecutively in each layer. Experimental results demonstrate the efficiency of the proposed method.

INTRODUCTION

Binary objects have at least two merits for object recognition: (1) they are not affected by illumination, which may generate different textures, and (2) their simple two-value structure makes them faster to process, with less workload, than other popular image types such as gray images (Bouzenada et al., 2007) or color images (Liao and Chu, 2009). Binary object detection has been widely applied to manufacturing, inspection, target recognition, etc. In this study, we propose a new multi-resolution method that accurately locates a binary object by matching the object template inside the test image in both intensity and geometry.

While many studies of object detection and recognition address various types of gray and color images, few deal specifically with binary object detection. Among them, Murase and Nayar (1995), Krumm (1997) and Hong and Zhang (2001) proposed learning-based methods that first compute the object features by training. However, learning-based methods require a laborious training process and are not flexible across object types. Ye (2005) proposed another interesting method that compares the geometric similarity between the template and the test image directly. This method creates a tree structure for the template and for the image, based on branch points in the contour. It can detect different objects conveniently because it does not need the special training set used by learning-based methods, but the tree structure relies heavily on the branch property of the object contour. Nevertheless, we prefer this direct style of method over learning-based methods for binary object detection.

In this paper, we do not use the tree structure but instead propose a novel direct detection method. It borrows the idea of the histogram of oriented gradients from gray and color object detection (Lowe, 2004; Dalal and Triggs, 2005; Watanabe et al., 2009) and applies it to the edges of the object as the geometry similarity measure. In addition, the pixel difference inside each object is adopted as the intensity similarity measure to improve the detection performance. Furthermore, we use a multi-resolution strategy. By creating pyramids for both the template and the test image, this coarse-to-fine strategy first locates potentially similar objects in the coarser test image, which avoids a tedious exhaustive search in the fine image and thus improves the speed.

THE BINARY OBJECT DETECTION METHOD

The pipeline of the method: Figure 1 shows the detection process. A template image containing the source object T_O and a test image I_O are input first. Then the edges of objects in the template and the test image are labeled by the seed fill algorithm as edge images T_E and I_E, respectively. After these initialization steps, a multi-resolution detection strategy is used to locate the target object in a coarse-to-fine way.
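The edge labeling step can be illustrated with a minimal sketch. It assumes a 4-connected seed (flood) fill over foreground pixels of value 1 and marks as edge any object pixel that touches the background or the image border. The function name `label_edges`, the connectivity and the edge definition are assumptions made for illustration; the text does not specify them.

```python
from collections import deque

import numpy as np


def label_edges(binary):
    """Label 4-connected foreground objects by seed fill and mark their edges.

    Returns (labels, edges): labels holds 0 for background and 1..K for the
    K objects; edges is 1 where an object pixel touches background or the
    image border in its 4-neighborhood.
    """
    binary = np.asarray(binary).astype(bool)
    h, w = binary.shape
    labels = np.zeros((h, w), dtype=np.int32)
    edges = np.zeros((h, w), dtype=np.uint8)
    neighbors = ((-1, 0), (1, 0), (0, -1), (0, 1))
    current = 0
    for sy in range(h):
        for sx in range(w):
            if not binary[sy, sx] or labels[sy, sx]:
                continue  # background, or already filled from another seed
            current += 1
            labels[sy, sx] = current
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in neighbors:
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < h and 0 <= nx < w
                    if not inside or not binary[ny, nx]:
                        edges[y, x] = 1  # pixel borders background
                    elif labels[ny, nx] == 0:
                        labels[ny, nx] = current
                        queue.append((ny, nx))
    return labels, edges
```

Applied to T_O and I_O, the `edges` output would play the role of T_E and I_E under these assumptions.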

In this strategy, the test image is matched against the template under different rotation angles so that the orientation of the target object can be determined from the rotation angle.


Fig. 1: The binary object detection pipeline of our proposed method

Therefore, the rotation angle a is first initialized to the start angle a_start. The progressive multi-resolution detection of the test image under this angle, described in the following paragraphs, then starts.

First, I_O is rotated by a degrees. Then the L-layer image pyramids P_TO and P_TE are created for the template images T_O and T_E, respectively, with layer L as the coarsest layer. The pyramids P_IO and P_IE for I_O and I_E are created in the same way. The interested object set O_is is initialized to contain all objects in I_E, and dissimilar ones are discarded in the following steps. Layer-by-layer refinement is then used to find the matched object positions in I_O. In the current layer l, the intensity similarity between PTO_l and PIO_l is checked first to refine O_is: objects with a large intensity difference between PTO_l and PIO_l are removed. Then the geometry similarity between PTE_l and PIE_l is checked for each object remaining in O_is, and objects that are geometrically dissimilar to the template PTE_l are removed.

The above multi-layer refinement is repeated until all layers are processed, at which point O_is contains all the objects that match the template under angle a. O_is is then added to the final output set O_f, and another round of progressive search is run with the increased rotation angle a = a + a_step. After all angles have been tested (a > a_end), the algorithm stops with O_f containing the final matched objects under the different angles.
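The control flow described above can be sketched as follows. This is only a structural outline: `detect`, `initial_objects`, `intensity_ok` and `geometry_ok` are hypothetical names, the two predicates stand in for the similarity tests, and the sketch tracks candidate identities only, without the position refinement that a real implementation would perform between layers.

```python
def detect(angles, num_layers, initial_objects, intensity_ok, geometry_ok):
    """Coarse-to-fine search over rotation angles.

    intensity_ok(obj, layer, angle) and geometry_ok(obj, layer, angle) are
    hypothetical predicates for the two similarity tests. Layers run from
    num_layers (coarsest) down to 1 (finest).
    """
    final = []  # plays the role of O_f
    for angle in angles:
        interested = list(initial_objects(angle))  # plays the role of O_is
        for layer in range(num_layers, 0, -1):
            # Intensity test first, then the geometry test on the survivors.
            interested = [o for o in interested if intensity_ok(o, layer, angle)]
            interested = [o for o in interested if geometry_ok(o, layer, angle)]
            if not interested:
                break  # nothing left to refine under this angle
        final.extend((angle, o) for o in interested)
    return final
```

Running the intensity test before the geometry test follows the order given in the text; each candidate that fails the first test never reaches the second.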

We now turn to discuss the details of the two core steps in the multi-resolution scheme, the refinement of O_is by the intensity similarity and the geometry similarity.

REFINEMENT OF INTERESTED OBJECTS IN EACH LAYER

In this refinement process, the intensity similarity is used first to remove the objects in O_is that are dissimilar to the template. Assuming that layer l of the template and of the test image are PTO_l and PIO_l, the acceptance of each interested object in O_is, Accept_I, can be formulated as:

(1)

where a_i is the intensity similarity threshold and the intensity similarity Dist_i is measured as the ratio of the number of pixel positions (x, y) whose values differ between the two images to the sum of pixel values in the template PTO_l.
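One plausible reading of this measure can be written in a few lines of NumPy. The sketch assumes foreground pixels are 1 and background pixels are 0, that the candidate window has the same size as the template, and that an object is accepted when Dist_i does not exceed a_i. The function names and the direction of the inequality are assumptions, since Eq. 1 itself is not reproduced here.

```python
import numpy as np


def intensity_distance(template, window):
    """Dist_i: number of differing pixels divided by the template's pixel sum."""
    template = np.asarray(template).astype(bool)
    window = np.asarray(window).astype(bool)
    if template.shape != window.shape:
        raise ValueError("template and window must have the same shape")
    total = int(template.sum())
    if total == 0:
        raise ValueError("template contains no foreground pixels")
    differing = int(np.count_nonzero(template != window))
    return differing / total


def accept_intensity(template, window, a_i=0.2):
    """Assumed reading of Eq. 1: accept when Dist_i does not exceed a_i."""
    return intensity_distance(template, window) <= a_i
```

For a template with 10 foreground pixels, a window that differs in one position gives Dist_i = 0.1 and is accepted at a_i = 0.2, while a window differing in three positions gives 0.3 and is rejected.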

Objects rejected by the intensity similarity refinement are removed from O_is, and those left in O_is are refined again by the geometry similarity, discussed next.

Assuming the histograms of oriented gradients for the edge images of the template and the test image are H_TEl and H_IEl, respectively, the acceptance of each interested object in O_is, Accept_G, can be formulated as:

(2)

In Eq. 2, a_g is the geometry similarity threshold and the geometry similarity Dist_g is measured by accumulating the bin differences between the histograms of oriented gradients H_TEl and H_IEl along the edges.
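As a rough illustration of this measure, the sketch below builds a single magnitude-weighted histogram of unsigned gradient orientations for an edge image and accumulates the absolute bin differences between two such histograms. The number of bins, the normalization, the use of one global histogram instead of per-cell histograms, and the function names are all assumptions; the text does not give these details.

```python
import numpy as np


def orientation_histogram(edge_image, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations."""
    img = np.asarray(edge_image, dtype=float)
    gy, gx = np.gradient(img)  # derivatives along rows, then along columns
    magnitude = np.hypot(gx, gy)
    angle = np.mod(np.arctan2(gy, gx), np.pi)  # fold orientations into [0, pi)
    hist, _ = np.histogram(angle, bins=bins, range=(0.0, np.pi),
                           weights=magnitude)
    total = hist.sum()
    return hist / total if total > 0 else hist


def geometry_distance(template_edges, window_edges, bins=9):
    """Dist_g as the accumulated absolute bin difference (assumed L1 form)."""
    h_t = orientation_histogram(template_edges, bins)
    h_i = orientation_histogram(window_edges, bins)
    return float(np.abs(h_t - h_i).sum())
```

With normalized histograms this distance lies between 0 (identical orientation distributions) and 2 (no overlapping bins), so a horizontal line compared with a vertical line reaches the maximum.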

Outlier removal: Many similar objects that partially overlap the best match are detected because both the intensity and the geometry similarity are checked exhaustively, block by block and angle by angle. We call those that are not the best match to the template outliers and adopt a center-based method to remove them, i.e., we first calculate the center of every putative object and then, among centers that cluster together, remove all but the one with the highest match score to the template. The match score S is defined as:

(3)

where λ is a coefficient that weights the final contribution of the intensity similarity and the geometry similarity.
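The center-based outlier removal can be sketched as a greedy pass over candidates sorted by score. The sketch takes the match score S as a precomputed input, because Eq. 3 is not reproduced here, and treats two centers as clustered when they lie within `radius` pixels of each other. Both `radius` and the function name `remove_outliers` are assumptions made for illustration.

```python
def remove_outliers(candidates, radius):
    """Keep only the best-scoring candidate in each cluster of nearby centers.

    candidates: iterable of (cx, cy, score) tuples, where a higher score
    means a better match. radius is an assumed clustering distance in pixels.
    """
    ordered = sorted(candidates, key=lambda c: c[2], reverse=True)
    kept = []
    for cx, cy, score in ordered:
        # Discard a candidate whose center lies close to a better one.
        close = any((cx - kx) ** 2 + (cy - ky) ** 2 <= radius ** 2
                    for kx, ky, _ in kept)
        if not close:
            kept.append((cx, cy, score))
    return kept
```

Because candidates are visited from best to worst, the first one kept in each neighborhood is the highest-scoring one, and every later candidate nearby is dropped.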

EXPERIMENTAL RESULTS

Experiments were carried out in MATLAB to demonstrate the performance of the proposed method. In our experiments, each pyramid is a Gaussian pyramid and the search step in layer l is set to max(T_height, T_width)/100, where T_height and T_width are the height and width of the template image in layer l, respectively. In the top (coarsest) layer, however, the step is set to 1 for better detection accuracy. The angle step a_step is set to 1. The thresholds of Eq. 1 and 2, a_i and a_g, are set to 0.2 and 1.2, respectively. We also set the coefficient λ in Eq. 3 to 0.5 to emphasize the importance of the geometrical difference, since the geometrical difference is about 2 to 4 times higher than the intensity difference.
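The pyramid construction and search step can be sketched in Python with NumPy (the experiments themselves used MATLAB). The sketch assumes a 5-tap binomial kernel as the Gaussian filter, downsampling by a factor of 2 per layer, and a step that is rounded and never smaller than one pixel. These choices and the function names are assumptions, not settings reported in the text.

```python
import numpy as np

_KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0  # 5-tap binomial filter


def _blur(img):
    """Separable blur with edge replication at the image border."""
    padded = np.pad(img, 2, mode="edge")
    rows = sum(k * padded[:, i:i + img.shape[1]] for i, k in enumerate(_KERNEL))
    return sum(k * rows[i:i + img.shape[0], :] for i, k in enumerate(_KERNEL))


def gaussian_pyramid(image, layers):
    """Return [layer 1 (finest), ..., layer `layers` (coarsest)]."""
    pyramid = [np.asarray(image, dtype=float)]
    for _ in range(layers - 1):
        pyramid.append(_blur(pyramid[-1])[::2, ::2])  # blur, then halve
    return pyramid


def search_step(template_layer, is_coarsest):
    """Step of max(height, width)/100, at least 1; always 1 at the coarsest layer."""
    if is_coarsest:
        return 1
    h, w = template_layer.shape
    return max(1, round(max(h, w) / 100))
```

Under these assumptions a 20x16 template yields a step of 1 in every layer, since max(20, 16)/100 is below one pixel, so the rule only produces larger steps for templates bigger than about 150 pixels on a side.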

Figure 2 shows the dent detection test on a gear image. Figure 2a shows the gear image and the dent pattern. The gear image is 320x480, but the surrounding white area has been cropped for a tighter display in this study.

Fig. 2: Detection of a dent pattern in a gear image. (a) The gear image and the dent pattern, shown in the red rectangle. The gear image is 320x240 while the pattern is 20x16. The surrounding blank (white) area of the gear image has been cropped for a tighter display in this study, (b) the edges of the gear image and of the dent pattern, shown in the red rectangle. They are detected with the seed fill algorithm, (c) the detection result before outlier removal. Each putative object is shown in a red rectangle. About 8 putative objects similar in intensity and geometry are detected around every dent. The details inside the blue rectangle are shown in Fig. 2e for performance comparison, (d) the detection result after outlier removal. Each object is shown in a red rectangle and, in contrast to Fig. 2c, only the most similar one around each dent is kept. The details in the blue rectangle are shown in Fig. 2e for performance comparison and (e) the performance of the outlier removal. The enlarged sub-images inside the blue rectangles of Fig. 2c and d are shown on the left and the right, respectively

Fig. 3: Detection of a spade pattern in a poker icon image, (a) the poker icon image and the spade pattern, shown in the red rectangle. The icon image is 640x480 while the pattern is 107x111, (b) the edges of the poker icon image and of the spade pattern, shown in the red rectangle. They are detected with the seed fill algorithm and (c) the detection result after outlier removal, shown at a larger size than Fig. 3a and b. Each detected object is shown in a red rectangle. The number in the center of each icon gives its rank by similarity score S to the spade pattern, where 1 is the most similar

Figure 2b shows the edges of the image as well as the dent pattern obtained with the seed fill algorithm. The binary images and their edge images are passed to the multi-resolution detection process for a 360 degree search. Two-layer pyramids are created for each rotated gear image and for the dent pattern, because 3 or more layers would make the pattern (only 20x16) too ambiguous. In total, 233 putative objects are obtained in layer 1, and 186 remain after the geometry match. Figure 2c shows these 186 objects before outlier removal. Clearly, they are heavily clustered. Only 24 objects are left after outlier removal (Fig. 2d), with all dents correctly located. Figure 2e gives a close view of the performance of the outlier removal, where the two sub-images are the blue rectangles cut out from Fig. 2c and d, respectively. After the outlier removal, the heavy cluster in the left sub-image is reduced to the single best match shown in the right sub-image.

Our method has also been tested successfully on many other binary images; Figure 3 gives one interesting example. Figure 3a shows the test image containing poker icons and the spade pattern to be detected, and Fig. 3b shows the edges of the icon image and of the pattern obtained with the seed fill algorithm. In this example, a three-layer pyramid is created for the icon image and for its edge image, since the pattern is rather large (107x111). The angle range is again 0 to 360 degrees. Figure 3c shows the final result after outlier removal, with each icon's rank displayed to show its similarity to the spade pattern. Referring to the icons by rank, we obtain two groups: group 1 containing icons 1, 2 and 3 and group 2 containing icons 4, 5, 6 and 7. While the differences between icons in group 1 and icons in group 2 are relatively large, the differences between icons within each group are relatively small. The order of icons within each group is decided by their scores S, although visually it is hard to judge whether one of them is more similar to the pattern than the others.

Our method can also be applied to gray object detection. In this case, the test image and the pattern are first thresholded into binary images. Then our method is applied to detect the objects.
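The thresholding step might look like the following sketch. It assumes a single global threshold that defaults to the image mean when none is given; the text does not state how the threshold is chosen, so both the default and the function name `to_binary` are assumptions.

```python
import numpy as np


def to_binary(gray, threshold=None):
    """Threshold a gray image into 0/1; defaults to the image mean (assumed)."""
    gray = np.asarray(gray, dtype=float)
    if threshold is None:
        threshold = gray.mean()
    return (gray > threshold).astype(np.uint8)
```

The same threshold would normally be applied to both the test image and the pattern so that corresponding regions map to the same binary value.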

Fig. 4: Semiconductor example, (a) the gray semiconductor image and its clipped pattern, shown in the red rectangle. The image is 640x480 while the pattern is 24x50. The surrounding blank (white) area of the image has been cropped for a tighter display in this study, (b) the thresholding result. The red rectangular area is the pattern and (c) the detection result, shown at a larger size than Fig. 4a and b. Each object is shown in a red rectangle. The number denotes the similarity rank, as in Fig. 3c. We manually display the 28 most similar objects to demonstrate the robustness of our method for gray object detection

Figure 4 shows such an example, where a semiconductor image is searched for its sub-pattern (Fig. 4a). Figure 4b shows the binary image obtained for our method and Fig. 4c shows the final detection result, with the number showing the similarity rank of each object to the pattern. Only the 28 objects most similar to the pattern are shown, selected manually, since the 29th and later ones are not located as accurately as we would wish. These 28 objects nevertheless demonstrate the applicability of our method to gray images: the original pattern is correctly located as the No. 1 object, with other similar ones following it closely.

CONCLUSIONS

This study presents a new multi-resolution binary object detection method. It applies first the intensity similarity and then the geometry similarity in a coarse-to-fine process to accelerate the detection and accurately locate the target objects. Experiments on both binary and gray images show the efficiency of our method.

Our method has been successfully applied to detect patterns in binary images with the proposed multi-layer pipeline. As demonstrated in Fig. 4, we are also trying to extend the idea to pattern detection in gray images. However, as Fig. 4 shows, it is sometimes hard to robustly detect all visually similar patterns in a gray image. This is partly because our thresholding step discards much useful texture information in the gray image. In this case, we may introduce more robust cues, such as curvature or surface gradients, into our pipeline to improve performance.

REFERENCES

Dalal, N. and B. Triggs, 2005. Histograms of oriented gradients for human detection. IEEE Comput. Soc. Conf. Comput. Vision Pattern Recognition, 1: 886-893.
Hong, L. and Y. Zhang, 2001. Two value image pattern recognizing technology of ANN. Infrared Laser Eng., 30: 432-437.
Krumm, J., 1997. Object detection with vector quantized binary features. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, (CSCCVPR'97), Washington, DC, USA., pp: 179-185.
Lowe, D.G., 2004. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vision, 60: 91-110.
Murase, H. and S.K. Nayar, 1995. Visual learning and recognition of 3-d objects from appearance. Int. J. Comput. Vision, 14: 5-24.
Watanabe, T., S. Ito and K. Yokoi, 2009. Co-occurrence histograms of oriented gradients for pedestrian detection. Adv. Image Video Technol., 5414: 37-47.
Ye, Q., 2005. A branch moment invariant extraction algorithm for binary image. Comput. Eng. Appli., 41: 78-80.
Liao, H.C. and P.T. Chu, 2009. A novel visual tracking approach incorporating global positioning system in a ubiquitous camera environment. Inform. Technol. J., 8: 465-475.
Bouzenada, M., M.C. Batouche and Z. Telli, 2007. Neural network for object tracking. Inform. Technol. J., 6: 526-533.

Information Technology Journal

Research Article
