Research Article
 

Temporally Consistent Depth Maps Recovery from Stereo Videos



Li Yao, Dong-Xiao Li and Ming Zhang
 
ABSTRACT

Dense depth maps provide significant geometry information for 3D video and free viewpoint video systems. Traditional stereo correspondence methods usually deal with each stereo image pair separately; as a result, the generated depth sequence is temporally inconsistent. This paper presents a novel approach to recover spatio-temporally consistent depth maps. The proposed method first applies a sequential belief propagation algorithm to achieve an approximate minimum of the spatial energy on Markov random fields. Then, in the temporal domain, a smoothness cost along the optical flow is incorporated between consecutive frames. The combined cost which determines the disparity value is passed forward, and temporal consistency is enforced during the process. In addition, the streamlined implementation is time and memory efficient. In the experimental validation, quantitative evaluation as well as subjective assessment is performed on several test datasets. The results show that the proposed method yields temporally consistent depth sequences and reduces flickering artifacts in the synthesized view while maintaining visual quality.


 
  How to cite this article:

Li Yao, Dong-Xiao Li and Ming Zhang, 2012. Temporally Consistent Depth Maps Recovery from Stereo Videos. Information Technology Journal, 11: 30-39.

DOI: 10.3923/itj.2012.30.39

URL: https://scialert.net/abstract/?doi=itj.2012.30.39
 
Received: May 22, 2011; Accepted: September 25, 2011; Published: November 22, 2011



INTRODUCTION

The concept of three-dimensional display has attracted people for several decades. Seeking a realistic impression of the natural world, researchers have made many attempts to exploit stereopsis in 3D displays (Konrad and Halle, 2007). Glasses-based stereoscopic displays use filters or shutters to provide a virtual 3D visual experience (Urey et al., 2011). Obviously, it is inconvenient for viewers to wear glasses or other special devices. Autostereoscopic techniques, which present stereoscopic images directly, appear more attractive (Dodgson, 2005). In autostereoscopic displays, more than two views are required so that viewers can observe the corresponding scene from different positions (Zwicker et al., 2007). However, the vast raw data of multi-view video conflicts severely with the available transmission bandwidth (Meesters et al., 2004).

Recently, Depth Image-Based Rendering (DIBR) has been considered one of the most significant technologies for 3DTV (three-dimensional television). The 3D content is typically represented by a regular 2D video and an associated gray-scale depth map (Fehn, 2004). On the other hand, stereo correspondence is a fundamental issue in computer vision (Brown et al., 2003; Zigh and Belbachir, 2010; Yu et al., 2011; Shuchun et al., 2011). A large number of methods have been proposed to solve this ill-posed problem, which suffers from image noise, border mismatch, textureless regions and occlusions. Scharstein and Szeliski (2002) presented a taxonomy and categorization of dense stereo correspondence algorithms. The existing techniques roughly fall into two categories: local methods and global methods. Local methods determine the disparity value of a pixel from a local surrounding area, e.g., the block-based method (Bae et al., 2008) and the variable windows approach (Veksler, 2003). They have very efficient implementations but lead to mismatches in boundary regions. In contrast, global methods make an explicit smoothness assumption and solve the optimization problem in a global framework. The main distinction among these methods is the optimization procedure used, such as simulated annealing (Barbu and Zhu, 2005), graph cuts (Boykov et al., 2001) and belief propagation (Yang et al., 2006). In general, these methods deal with each single image pair and disregard the correlation between consecutive frames in a video sequence (Scharstein and Szeliski, 2002). They do not explicitly distinguish moving objects from the static background. As a result, the depth values of static objects and the background may fluctuate over time. This causes critical flickering artifacts in the synthesized virtual view and discomforts viewers, because human vision is sensitive to frequent flickering (Lee and Ho, 2010).

Motivated by the demand for enhanced temporal consistency, several improvements to the stereo correspondence procedure have been proposed in current research. Bleyer and Gelautz (2009) applied a smoothing operation on the disparity map sequence to decrease flickering artifacts. Smirnov et al. (2010) proposed a filtering method to achieve high-quality, temporally consistent depth maps. These methods commonly enforce temporal coherence with filters, so the results often get blurred at object boundaries. Tao et al. (2001) addressed the problem of extracting depth information of non-rigid dynamic 3D scenes from multiple synchronized video streams, improving temporal consistency in the process. Zhang et al. (2009) proposed a bundle optimization framework to incorporate a geometric coherence constraint over multiple frames of a video. A general form of scene flow can also be utilized to estimate depth maps (Vedula et al., 2005), but it increases the computation cost dramatically and is infeasible for real-time applications.

In this paper, a novel method is proposed to recover consistent depth maps from stereo video sequences. The framework of traditional stereo correspondence based on Markov Random Fields (MRF) is extended to the whole video sequence, with a temporal smoothness function applied along the motion path determined by optical flow. In contrast with the 3D MRF model which treats all frames simultaneously (Zhaozheng and Collins, 2007), this framework adopts a streamlined implementation. First, in the spatial domain, the Belief Propagation (BP) algorithm is applied to each frame to find a minimum of the MRF energy. Then, in the temporal domain, a recursive function combines all frames and computes the total energy in a linear structure. After the BP-based matching of each frame is completed, the belief messages can be released; only an aggregated cost is carried forward to the next frame. Temporal consistency is thereby enforced during the process, and the implementation is time and memory efficient.

MATERIALS AND METHODS

Overview of framework: To facilitate stereo correspondence, several preprocessing steps are required. First, the epipolar geometry constraint is imposed. Unlike motion estimation in video compression, which only exploits data redundancy, a disparity vector in stereo correspondence relates a pair of pixels that project from exactly the same position in the 3D scene. In order to reduce the number of potential correspondences and increase matching reliability, the original stereo video is rectified so that corresponding points lie on the same epipolar line (Hartley and Zisserman, 2004). Under epipolar geometry, the search range of disparity is limited to a horizontal scan-line, and the disparity value can be easily transformed to a depth value when the camera parameters are known. Second, the image quality of the stereo video needs to be corrected: intrinsic properties of the images, such as color and intensity, influence the similarity measure and thus the computation of the matching cost.

In general, such methods are formulated in an energy-minimization framework based on Markov Random Fields (Geman and Geman, 1984). The MRF model provides a convenient and powerful foundation for many intractable problems in early vision involving gridded image-like data (Bai et al., 2008; Izabatene and Rabahi, 2010). A typical algorithm is designed in terms of Maximum A Posteriori (MAP) estimation on a Bayesian network. A typical MRF model with an N4 neighborhood system is shown in Fig. 1. The white circles f (i, j) denote the unknowns to be inferred, while the dark circles f' (i, j) denote the input data. The black boxes d (i, j) denote elemental data penalty terms and s (i, j) denote interaction potentials between connected nodes in the random field. The data term and the smoothness term together make up the total energy function. In the stereo correspondence problem, the goal is to find corresponding points between two rectified images. The label of each pixel is a discrete disparity value. The data term measures how well a pixel in the source image matches the corresponding one in the reference image, while the smoothness term imposed by the MRF model encourages pixels in neighboring areas to have similar disparity values (Boykov et al., 2001). Altogether, the energy function of the stereo correspondence problem can be defined as:

E = Ed+λES
(1)

where, Ed is the data energy and ES is the smoothness energy. λ gives the relative weight of the smoothness penalty.
Fig. 1: Graphic model for an N4 neighborhood Markov random field. Note: The white circles denote the unknowns f (i, j) while the dark circles denote the input data f' (i, j). The black boxes d (i, j) denote data penalty and s (i, j) denote interaction potentials between adjacent nodes

The overall flow of the proposed method is as follows. After the input stereo video stream is rectified (Du et al., 2004), the raw matching cost is computed first. A robust measure, the truncated AD (absolute difference) of intensity, is used as specified in Eq. 3, where p and p' are the corresponding pixels under the disparity label lp and S is the set of pixels in the image. A Disparity Space Image (DSI) is created to store the raw costs over all possible disparities. Then, the BP-based stereo correspondence algorithm is applied to each single frame. If the current frame is the first of the video sequence, the disparity map can be obtained immediately. Otherwise, temporal smoothing is enforced by calculating an aggregated cost along the motion path. The temporally accumulated energy and the spatial energy together form the total energy. A Winner-Take-All (WTA) strategy then determines the output depth map. The raw cost and temporary memory storage are released before the next round starts.
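As a concrete illustration of the raw-cost stage, the following sketch builds a DSI of truncated-AD costs and applies a WTA decision. The function names, the truncation default and the border handling are our own assumptions, not the paper's implementation.

```python
import numpy as np

def disparity_space_image(left, right, max_disp, tau_d=20.0):
    """Raw matching cost: truncated absolute difference of intensity.

    Builds a Disparity Space Image of shape (H, W, max_disp+1), where
    left pixel (y, x) with label d is matched against right pixel (y, x-d).
    Pixels whose match falls outside the image keep the truncation value.
    """
    h, w = left.shape
    dsi = np.full((h, w, max_disp + 1), tau_d, dtype=np.float64)
    for d in range(max_disp + 1):
        if d < w:
            diff = np.abs(left[:, d:] - right[:, :w - d])
            dsi[:, d:, d] = np.minimum(diff, tau_d)
    return dsi

def winner_take_all(cost_volume):
    """Pick, per pixel, the disparity label with minimum aggregated cost."""
    return np.argmin(cost_volume, axis=2)
```

On a synthetic pair where the right view is the left view shifted by a constant disparity, the WTA output recovers that disparity away from the image borders.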

To sum up, the modified energy function from Eq. 1 of each frame can be defined as:

E = Ed+λES+Et
(2)

Ed = Σ_{p∈S} Dp(lp),  Dp(lp) = min(|I(p) − I′(p′)|, τd)
(3)

ES = Σ_{(p,q)∈N} VS(lp, lq)
(4)

where, (p, q) denotes a pair of neighboring pixels and Et is the temporal aggregation energy, which will be specified in the following section.

The smoothness energy term still needs to be detailed. The simple Potts model (Boykov et al., 2001) assumes that the labeling should be piece-wise constant. It considers only two conditions: for equal labels the cost is zero, and for different labels the cost is a constant. This is unsuitable for more complicated situations. Here, a more general truncated linear smoothness cost function is applied (Felzenszwalb and Huttenlocher, 2004), in which different pairs of adjacent labels can lead to different costs:

VS(lp, lq) = min(ρS·|lp − lq|, τS)
(5)

where, ρS controls the rate of increase in the cost and τS is the truncation value.

Spatial optimization: There are two basic assumptions for producing a dense disparity map: uniqueness and continuity. That is, the disparity map has a unique value per pixel and is continuous almost everywhere. In addition, the MRF model adopts spatial and contextual constraints which are necessary and significant in low-level vision.

Belief propagation, which is employed in this method, is an effective way of solving inference problems in pair-wise MRF models. The main advantage of BP is that it solves the MAP labeling problem on the MRF model while reducing the computation time from exponential to linear in the number of nodes. The original standard BP proposed by Pearl (1988) passes local messages between nodes along edges and guarantees convergence for any tree-structured graphical model. It has since been extended to loopy belief propagation so that it can deal with graphs containing loops (Ihler et al., 2005).

The loopy BP algorithm works by passing messages to neighboring nodes along the four-connected image grid. Each message is a vector whose dimension equals the number of possible labels; it is initialized to zero, in the form of negative log probabilities. Each node uses the messages received from its neighbors to compute new messages for its other neighbors. Let mpq (lq) be the message that node p sends to a neighboring node q about label lq (Fig. 2a). It is updated in the following way:

mpq(lq) = min_{lp} [Dp(lp) + VS(lp, lq) + Σ_{s∈N(p)\q} msp(lp)]
(6)

where, the data cost Dp (lp) and the smoothness cost VS (lp, lq) are defined in Eq. 3 and 5, respectively.

The term Σ_{s∈N(p)\q} msp(lp) denotes the sum of messages that node p has received from its neighbors other than node q.

After several iterations of message passing in all directions, the final belief of node q is calculated as:

bq(lq) = Dq(lq) + Σ_{p∈N(q)} mpq(lq)
(7)

Fig. 2(a-b): Graphic illustration of the proposed BP algorithm. (a) Message computed from node p to q and (b) Messages propagate in forward scan-line direction

An important issue during BP message passing is how to arrange the updating schedule (Tappen and Freeman, 2003). The message updating schedule determines when a node uses the messages received from its neighbors to compute new messages. In a parallel implementation, messages are passed along rows and then along columns, and are only used to compute the next round of messages: passing starts at the first node and proceeds in one direction until the end is reached, then the messages are passed backward in a similar way.

However, the convergence time of the parallel schedule is usually quite long. An alternative is to propagate messages in one direction and update each node sequentially: when a node sends a message to its neighbor, the neighbor immediately uses it to compute the message for the next node. Our sequential implementation is inspired by the TRW-S algorithm (Kolmogorov, 2006). It processes nodes in scan-line order, with forward and backward passes (Fig. 2b). Such an asynchronous updating scheme allows messages to propagate much more quickly across the image, so it is preferred in this framework.
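The sequential schedule can be illustrated on a single scan line. The sketch below performs one forward min-sum sweep in which each node immediately reuses the message it has just received; it is a simplified one-dimensional illustration under assumed parameter names, not the full TRW-S implementation.

```python
import numpy as np

def truncated_linear(l_p, l_q, rho=1.0, tau=4.0):
    """Truncated linear smoothness cost: min(rho*|l_p - l_q|, tau)."""
    return np.minimum(rho * np.abs(l_p - l_q), tau)

def forward_pass_messages(data_cost, rho=1.0, tau=4.0):
    """One forward (left-to-right) min-sum sweep on a single scan line.

    data_cost: (W, L) array of data terms D_p(l_p) per pixel and label.
    Returns msgs: (W, L), where msgs[x] is the message pixel x-1 sends
    to pixel x. Each new message reuses the message just received
    (sequential schedule), so information crosses the line in one sweep.
    """
    w, nl = data_cost.shape
    labels = np.arange(nl)
    V = truncated_linear(labels[:, None], labels[None, :], rho, tau)  # (L, L)
    msgs = np.zeros((w, nl))
    for x in range(1, w):
        # m(l_q) = min_{l_p} ( D(l_p) + incoming message + V(l_p, l_q) )
        h = data_cost[x - 1] + msgs[x - 1]
        msgs[x] = np.min(h[:, None] + V, axis=0)
        msgs[x] -= msgs[x].min()  # normalize to keep costs bounded
    return msgs
```

With a data cost that strongly prefers one label at the first pixel only, the preference propagates to the far end of the line in a single sweep, which is exactly the advantage of the sequential schedule.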

Temporal optimization: Video processing and analysis take into account not only the pixel values in a single static frame but also the temporal relations between frames. Like the intra-frame continuity, a temporal smoothness constraint is exploited in the framework. A direct way is to connect the same locations of consecutive frames in a 6-neighborhood 3D grid model, but the basic assumption of temporal continuity is violated when an object is moving. Therefore, motion information is exploited in this method. Motion manifests as temporal variations in image sequences (Stiller and Konrad, 1999; Xu and Zi-Shu, 2011): the 3D motion of an object induces 2D motion on the image plane. To make the problem tractable, it is assumed that the motion of objects is finite and that the disparity value varies continuously along the motion path between consecutive frames.

Motion estimation has found various applications in motion-compensated video compression as well as video summarization and video stabilization, because the temporal correlation of intensity is strong and useful (Korah and Perinbam, 2006; Ren et al., 2010; Iffa et al., 2011). Here, the temporal constraint is applied to the stereo correspondence problem for a video sequence. To estimate motion explicitly at each pixel, a general form of optical flow is used. Estimating the optical flow requires inferring a dense field of displacement vectors which map all points in the first image to the corresponding locations in the second. The basic technique for the computation of optical flow is based on the optical flow constraint equation:

Ix·u + Iy·v + It = 0
(8)

It relates the spatio-temporal intensity changes to the velocity (u, v). However, the solution of Eq. 8 cannot be determined directly because there are two unknown variables in one linear equation. A smoothness constraint was introduced by Horn and Schunck (1981) to form a global objective function:

E = ∬ [(Ix·u + Iy·v + It)² + α²(|∇u|² + |∇v|²)] dx dy
(9)

The optical flow is computed by a non-linear diffusion algorithm (Proesmans et al., 1994). Each pixel of the current frame then obtains a motion vector pointing to the previous frame. The computational flow is illustrated in Fig. 3a. The final cost is decided by both the cost of the current frame and the consistency cost between consecutive frames. It is calculated as follows for node q in frame n:

Cnq(lq) = bnq(lq) + λt·min_{lq′} [Vt(lq′, lq) + Cn−1nor,q′(lq′)]
(10)

where,

Vt(lq′, lq) = min(ρt·|lq′ − lq|, τt)
(11)

Cnq (lq) consists of three components. Vt (lq', lq) is the temporal smoothness cost function, which has a definition similar to Eq. 5; bnq (lq) denotes the spatial belief cost; and Cn-1nor, q' (lq') denotes the normalized aggregated cost of frame n-1 at node q', which points to node q along the motion path. λt is the temporal weighting coefficient. This function can also be interpreted as a pair-wise dynamic programming procedure (Fig. 3b): the optimal path between consecutive frames with minimum cost is computed to generate the final cost, and the label l which minimizes the final cost is selected as the output disparity value at that pixel. The aggregation is unidirectional, so later frames are never used for reference. Note that the enforcement of temporal consistency may deteriorate the results when optical flow estimation fails. The temporal weighting coefficient λt is therefore defined adaptively in Eq. 12: when the brightness constancy is violated, λt is decreased to reduce the effect of such errors:

λt = f(|In(q) − In−1(q′)|; k, γ), a function decreasing in the brightness-constancy error
(12)

where, k and γ are the parameters determined empirically.

In the end, the cost is normalized as follows:

Cnnor,q(lq) = Cnq(lq) − min_{l} Cnq(l)
(13)
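The temporal aggregation of Eq. 10-13 can be sketched as follows. The helper name, the nearest-neighbor rounding of flow vectors and the truncated-linear form of Vt are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def temporal_aggregate(belief, prev_cost_nor, flow, lambda_t,
                       rho_t=1.0, tau_t=4.0):
    """Combine spatial belief with the motion-compensated cost of frame n-1.

    belief        : (H, W, L) spatial belief b^n_q(l_q) from BP
    prev_cost_nor : (H, W, L) normalized aggregated cost of frame n-1
    flow          : (H, W, 2) backward flow (dx, dy); q' = q + flow[q]
    lambda_t      : (H, W) adaptive temporal weight

    Returns the normalized aggregated cost of frame n (min per pixel is 0).
    """
    h, w, nl = belief.shape
    labels = np.arange(nl)
    # Truncated-linear temporal smoothness V_t(l', l), mirroring Eq. 5
    Vt = np.minimum(rho_t * np.abs(labels[:, None] - labels[None, :]), tau_t)

    # Fetch the previous-frame cost at the motion-compensated position q'
    ys, xs = np.mgrid[0:h, 0:w]
    yp = np.clip((ys + flow[..., 1]).round().astype(int), 0, h - 1)
    xp = np.clip((xs + flow[..., 0]).round().astype(int), 0, w - 1)
    prev = prev_cost_nor[yp, xp]                      # (H, W, L) at q'

    # Add lambda_t * min_{l'} ( V_t(l', l) + C^{n-1}_nor(l') ) to the belief
    cost = belief + lambda_t[..., None] * np.min(prev[..., :, None] + Vt,
                                                 axis=2)
    return cost - cost.min(axis=2, keepdims=True)     # normalize
```

With zero flow and a previous-frame cost that favors one label, the aggregated cost is pulled toward that label, which is the intended temporal smoothing effect.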

Fig. 3 (a-b): The final cost computation in the spatio-temporal domain. (a) Computational flow; (b) Equivalent pair-wise dynamic programming process for solving Eq. 10. Note: In (b), for each label lq of pixel q, the optimal path is determined by the lq' which minimizes Vt (lq', lq)+Cn-1nor, q' (lq'). Then the term Cnq (lq) can be updated recursively

EXPERIMENTAL RESULTS AND DISCUSSION

To demonstrate the performance of the proposed method, experiments are conducted on a PC (personal computer) platform with an Intel Core2 Duo 3.16 GHz CPU (central processing unit). Several stereo video test sequences from the FhG-HHI (Fraunhofer Institute for Telecommunications, Heinrich-Hertz-Institut) 3DV (three-dimensional video) database (FhG-HHI, 2011) are used, including 'Book Arrival', 'Alt Moabit', 'Door Flowers' and 'Leaving Laptop'. Some sequences are re-sampled to a lower resolution to reduce the number of candidate disparity values and enhance reliability. In the traditional stereo correspondence problem, quantitative measures of the quality of the computed result are of interest. Two general approaches are mentioned by Scharstein and Szeliski (2002): first, compare the resultant disparity map with ground truth data; second, evaluate the synthetic image rendered from the original source image and the computed disparity map.

Additional measures are required to evaluate quality in the temporal domain. In fact, ground-truth maps for most of the test sequences are unavailable. Meanwhile, ground truth data should not be the only criterion when considering the whole 3DTV system: the disparity map can be regarded as an intermediate product which is discarded after the virtual view is synthesized. Not only the intrinsic quality of each single depth map but also the intra- and inter-frame consistency is critical for the DIBR process; a smooth depth map improves the synthesized view and leaves fewer hole regions. Accordingly, an overall assessment of the experimental results is made in the following two aspects.

Subjective evaluation: First, a comparative result on the 'Book Arrival' sequence is shown in Fig. 4. Figure 4a shows the source images of five consecutive frames which contain apparent object motion. The result of the proposed method (Fig. 4d) is compared with existing methods from the related literature. Figure 4b is obtained from the Hierarchical BP algorithm (Yang et al., 2006), which has no temporal improvement. Another local method based on adaptive windows (Veksler, 2003), temporally improved by a smoothing filter (Smirnov et al., 2010), is shown in Fig. 4c.

Several conclusions can be drawn from the comparison. The regions close to the left border of the depth map can be ignored since they are invisible in the reference view. The observed results show that the proposed method has greater stability than the BP method without temporal improvement. The results in Fig. 4d are also conspicuously insusceptible to fluctuation in static scenes, where the disparity value should not vary. The proposed method also outperforms the filter-based temporal method: no blur can be detected, and moving objects still maintain their outlines.

Fig. 4 (a-d): Disparity map results of different methods of frame 27 to 31 (from top to bottom) on the 'Book Arrival' sequence. (a) Source sequence; (b) Hierarchical BP method without temporal improvement; (c) Adaptive window-based method with temporal smoothing filter and (d) Proposed method

Since it is difficult to present visual results of the whole sequence directly, only a few frames are selected to show the improvement in the temporal domain (Fig. 4). In the proposed method, smooth disparity map sequences and rendered virtual view sequences can be observed clearly during playback. On the contrary, the result of Hierarchical BP (Yang et al., 2006), which deals with each frame independently, has fluctuating disparity values and finally causes flickering artifacts.

The improvement gained by adopting motion information in the proposed method is also illustrated. In the regions of moving objects between adjacent frames, the method based on the regular 3D grid model is prone to mismatches (Fig. 5a). After incorporating the motion information from optical flow, the matching errors are reduced (Fig. 5b).

Quantitative evaluation: Here, two basic measures are introduced for the objective evaluation of the depth map sequences. First, the temporal consistency between consecutive frames is checked. Second, the quality of each individual disparity map is tested.

To check the temporal consistency, moving objects need to be separated from the test sequence. Evidently, only the background and static objects maintain their disparity values, and viewers mostly perceive flickering artifacts in these regions.

Fig. 5 (a-b): Improvement of disparity map at the boundary of motion objects. (a) Result of regular 3D grid model and (b) Result of the proposed model with motion compensation

Fig. 6 (a-d): Comparative results of temporal consistency. (a) Book arrival, (b) Alt moabit, (c) Door flowers and (d) Leaving laptop. The pixel whose disparity value is inconsistent with the one in the next frame is identified as an error pixel. The total error percentage of each frame is compared among different methods marked by different colors

The static regions are manually segmented for each test sequence. Each disparity map is then compared with that of the following frame. If the difference between a pair of pixels in the static regions exceeds a threshold, the pixel is marked as inconsistent. The percentage of total error pixels is calculated and the results are compared with other methods in Fig. 6:

Perr = (100/|Ω|)·Σ_{p∈Ω} [|dn(p) − dn+1(p)| > δ]
(14)

where, Ω is the set of pixels in the manually segmented static regions, δ is the consistency threshold and [·] equals 1 when the condition holds and 0 otherwise.
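This flicker measure can be sketched as a simple per-frame computation; the function name and the thresholding convention are illustrative assumptions.

```python
import numpy as np

def temporal_error_percentage(disp_n, disp_next, static_mask, threshold=1):
    """Percentage of static-region pixels whose disparity changes by more
    than a threshold between frame n and frame n+1 (a flicker measure).

    disp_n, disp_next : integer disparity maps of consecutive frames
    static_mask       : boolean mask of manually segmented static regions
    """
    diff = np.abs(disp_n.astype(np.int64) - disp_next.astype(np.int64))
    errors = (diff > threshold) & static_mask
    return 100.0 * errors.sum() / static_mask.sum()
```

A sequence-level score is then simply this percentage averaged over all consecutive frame pairs.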

Fig. 7 (a-d): Comparative results of PSNR of synthesized virtual views. (a) Book arrival, (b) Alt moabit, (c) Door flowers and (d) Leaving laptop. Note: The virtual view is synthesized by the source view and the disparity map. PSNR of the Y component is computed by comparing it with the original view taken at that location. The PSNR result of each frame is compared among different methods marked by different colors

On the other hand, an accurate disparity map directly yields a good-quality synthesized image. Since the ground truth maps for these test sequences are unavailable, the experiments evaluate the synthesized view. The left source view is projected to the corresponding position in the right view using the disparity map. The obtained view is then measured via the PSNR of the Y component against the original view taken at that location. Pixels in exposed regions with void values are set to 255. The regions close to the right border of the image are ignored since they are invisible in the source view. Figure 7(a-d) shows the comparative results of the different methods:

PSNR = 10·log10(255²/MSE)
(15)
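The view-synthesis evaluation can be sketched as a forward warp followed by PSNR. The occlusion handling here is a minimal assumption (later-written pixels overwrite earlier ones) rather than the paper's actual renderer, and the hole value of 255 follows the protocol stated above.

```python
import numpy as np

def warp_left_to_right(left, disp):
    """Forward-warp the left view to the right-view position via disparity.

    A left pixel at column x maps to right-view column x - disp. Holes
    (never-written pixels) are set to 255; when several source pixels map
    to the same target, the last write wins (a crude occlusion rule).
    """
    h, w = left.shape
    out = np.full((h, w), 255.0)
    xs = np.arange(w)
    for y in range(h):
        xr = xs - disp[y]                      # target columns in right view
        valid = (xr >= 0) & (xr < w)
        out[y, xr[valid].astype(int)] = left[y, valid]
    return out

def psnr(ref, test, peak=255.0):
    """PSNR in dB: 10*log10(peak^2 / MSE), applied to the Y component."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(peak**2 / mse)
```

With a perfect disparity map the warped view matches the original and the PSNR is infinite; any mismatch lowers it according to the mean squared error.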

The results of the objective measures are analyzed as follows. The figures of the first experiment distinctly illustrate the temporal consistency on the different sequences (Fig. 6a-d). The error percentage on static regions is about 5% for the proposed method, while the "spatial BP" and "Hierarchical BP" methods present more than 10% errors. By employing temporal coherence in the BP optimization framework, the proposed method achieves a lower error percentage on most of the datasets than the BP methods without temporal improvement. The temporal smoothing method with a filter (Smirnov et al., 2010) also has a relatively low error percentage, between 5% and 10%. However, it fails to achieve high performance in the image PSNR experiment (Fig. 7): its result is 3 to 6 dB lower than the proposed method, because smoothing filters such as the Gaussian filter deteriorate the intrinsic features of the images. The proposed method is also compared with the spatial BP method without the temporal component in the same framework; as expected, they have comparable PSNR results. The proposed method also performs better than the Hierarchical BP method on the 'Book Arrival' and 'Alt Moabit' sequences in the second experiment, and they achieve similar PSNR results on the remaining sequences. As shown in Fig. 7, the average PSNR of the proposed method is about 21.24, 25.81, 21.20 and 21.50 dB on the respective test sequences, while the Hierarchical BP method achieves about 20.89, 19.27, 21.35 and 21.52 dB.

In brief, the experimental results demonstrate that temporal consistency can be enhanced with no loss of image quality and little additional computation. The generated depth maps are promising for depth-image-based rendering in 3DTV (three-dimensional television) systems as well as other applications that require depth information for three-dimensional reconstruction.

CONCLUSIONS

Depth estimation plays a crucial role in three-dimensional video systems. This paper has presented a novel method for generating dense depth maps from stereo video sequences, addressing not only the quality of each separate depth map but also the temporal consistency. The proposed method first applies the BP algorithm to each frame; the obtained belief messages are utilized as matching costs. Then, optical flow is used to connect consecutive frames and the correspondence problem is addressed with a joint spatio-temporal function, defined as a recursive process. The cost is accumulated forward along the time axis so that all unnecessary information from former frames can be released. The experimental results demonstrate that the proposed method produces temporally consistent disparity maps without degradation of synthesized image quality.

ACKNOWLEDGMENTS

This study was supported in part by the National Natural Science Foundation of China (Grant No. 60802013, 61072081), the National Science and Technology Major Project of the Ministry of Science and Technology of China (Grant No. 2009ZX01033-001-007), the Key Science and Technology Innovation Team of Zhejiang Province, China (Grant No. 2009R50003) and the China Postdoctoral Science Foundation (Grant No. 20110491804).

REFERENCES

1:  Bae, K.H., J.H. Ko and J.S. Lee, 2008. Errata: Stereo image reconstruction using regularized adaptive disparity estimation. J. Electron. Imag.

2:  Bai, L., F. Chen and X. Zeng, 2008. Application of markov random field in depth information estimation of microscope defocus image. Inform. Technol. J., 7: 808-813.

3:  Barbu, A. and S.C. Zhu, 2005. Generalizing Swendsen-Wang to sampling arbitrary posterior probabilities. IEEE Trans. Pattern Anal. Mach. Intell., 27: 1239-1253.

4:  Bleyer, M. and M. Gelautz, 2009. Temporally consistent disparity maps from uncalibrated stereo videos. Proceedings of 6th International Symposium on Image and Signal Processing and Analysis, Sep.16-18, 2009, Salzburg, pp: 383-387

5:  Boykov, Y., O. Veksler and R. Zabih, 2001. Fast approximate energy minimization via graph cuts. IEEE Trans. Pattern Anal. Mach. Intell., 23: 1222-1239.

6:  Brown, M.Z., D. Burschka and G.D. Hager, 2003. Advances in computational stereo. IEEE Trans. Pattern Anal. Mach. Intell., 25: 993-1008.

7:  Dodgson, N.A., 2005. Autostereoscopic 3-D displays. Computer, 38: 31-36.

8:  Du, X., H.D. Li and W.K. Gu, 2004. A simple rectification method for linear multi-baseline stereovision system. J. Zhejiang Univ. Sci., 5: 567-571.

9:  Fehn, C., 2004. Depth-image-based rendering (DIBR), compression and transmission for a new approach on 3D-TV. Proc. SPIE Stereoscopic Displays Virtual Reality Syst. XI, 5291: 93-104.

10:  FhG-HHI, 2011. Mobile 3DTV content delivery optimization over DVB-H system. Mobile 3DTV, Solideyesight, http://sp.cs.tut.fi/mobile3dtv/stereo-video/.

11:  Izabatene, H.F. and R. Rabahi, 2010. Classification of remote sensing data with markov random field. J. Appl. Sci., 10: 636-643.

12:  Geman, S. and D. Geman, 1984. Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images. IEEE Trans. Pattern Anal. Mach. Intell., 6: 721-741.

13:  Hartley, R.I. and A. Zisserman, 2004. Multiple View Geometry. Cambridge University Press, Cambridge, UK

14:  Horn, B.K.P. and B.G. Schunck, 1981. Determining optical flow. Artif. Intell., 17: 185-203.

15:  Iffa, E.D., A.R.A. Aziz and A.S. Malik, 2011. Gas flame temperature measurement using background oriented schlieren. J. Applied Sci., 11: 1658-1662.

16:  Ihler, A.T., J.W. Fischer and A.S. Willsky, 2005. Loopy belief propagation: Convergence and effects of message errors. J. Mach. Learn. Res., 6: 905-936.

17:  Kolmogorov, V., 2006. Convergent tree-reweighted message passing for energy minimization. IEEE Trans. Pattern Anal. Mach. Intell., 28: 1568-1583.

18:  Konrad, J. and M. Halle, 2007. 3-D displays and signal processing. IEEE Signal Process Mag., 24: 97-111.

19:  Korah, R. and J.R.P. Perinbam, 2006. A novel coarse-to-fine search motion estimator. Inform. Technol. J., 5: 1073-1077.

20:  Lee, S.B. and Y.S. Ho, 2010. View-consistent multi-view depth estimation for three-dimensional video generation. Proceedings of IEEE 3DTV-Conference: The True Vision-Capture, Transmission and Display of 3D Video, June 7-9, 2010, IEEE Xplore, pp: 1-4.

21:  Meesters, L.M.J., W.A. IJsselsteijn and P.J.H. Seuntiens, 2004. A survey of perceptual evaluations and requirements of three-dimensional TV. IEEE Trans. Circuits Syst. Video Technol., 14: 381-391.

22:  Pearl, J., 1988. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. 1st Edn., Morgan Kauffmann Public Inc., San Francisco, CA., USA., ISBN: 0-934613-73-7.

23:  Ren, G., P. Li and G. Wang, 2010. A novel hybrid coarse-to-fine digital image stabilization algorithm. Inform. Technol. J., 9: 1390-1396.

24:  Proesmans, M., L. Van Gool, E. Pauwels and A. Oosterlinck, 1994. Determination of optical flow and its discontinuities using non-linear diffusion. Comput. Vision, 801: 294-304.

25:  Scharstein, D. and R. Szeliski, 2002. A taxonomy and evaluation of dense two-frame stereo correspondence algorithms. Int. J. Comput. Vision, 47: 7-42.

26:  Smirnov, S., A. Gotchev and K. Egiazarian, 2010. A memory-efficient and time-consistent filtering of depth map sequences. Proc. SPIE, Vol. 7532.

27:  Stiller, C. and J. Konrad, 1999. Estimating motion in image sequences. IEEE Signal Process. Mag., 16: 70-91.

28:  Tao, H., H.S. Sawhney and R. Kumar, 2001. Dynamic depth recovery from multiple synchronized video streams. Proc. IEEE Comput. Soc. Conf. Comput. Vision Pattern Recog., 1: I-118-I-124.

29:  Tappen, M.F. and W.T. Freeman, 2003. Comparison of graph cuts with belief propagation for stereo, using identical MRF parameters. IEEE Conf. Computer Vision, 2: 900-906.

30:  Urey, H., K.V. Chellappan, E. Erden and P. Surman, 2011. State of the art in stereoscopic and autostereoscopic displays. Proc. IEEE, 99: 540-555.

31:  Vedula, S., P. Rander, R. Collins and T. Kanade, 2005. Three-dimensional scene flow. IEEE Trans. Pattern Anal. Mach. Intell., 27: 475-480.

32:  Veksler, O., 2003. Fast variable window for stereo correspondence using integral images. Proc. IEEE Comput. Soc. Conf. Comput. Vision Pattern Recog., 1: I-556-I-561.

33:  Xu, W. and H. Zi-Shu, 2011. Target motion analysis in three-sensor TDOA location system. Inform. Technol. J., 10: 1150-1160.

34:  Yang, Q., L. Wang, R. Yang, S. Wang, M. Liao and D. Nister, 2006. Real-time global stereo matching using hierarchical belief propagation. Proceedings of the British Machine Vision Conference, Volume 3, September 4-7, 2006, Edinburgh, UK., pp: 989-998.

35:  Yu, S., D. Yan, Y. Dong, H. Tian, Y. Wang and X. Yu, 2011. Stereo matching algorithm based on aligning genomic. Inform. Technol. J., 10: 675-680.

36:  Shuchun, Y., Y. Xiaoyang, S. Lina, Z. Yuping and S. Yongbin et al., 2011. A reconstruction method for disparity image based on region segmentation and RBF neural network. Inform. Technol. J., 10: 1050-1055.

37:  Zhaozheng, Y. and R. Collins, 2007. Belief propagation in a 3D spatio-temporal MRF for moving object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, June 17-22, 2007, Minneapolis, MN., pp: 1-8.

38:  Zhang, G., J. Jia, T.T. Wong and H. Bao, 2009. Consistent depth maps recovery from a video sequence. IEEE Trans. Pattern Anal. Mach. Intell., 31: 974-988.

39:  Zwicker, M., A. Vetro, S. Yea, W. Matusik, H. Pfister and F. Durand, 2007. Resampling, antialiasing and compression in multiview 3-D displays. IEEE Signal Process Mag., 24: 88-96.

40:  Zigh, E. and M.F. Belbachir, 2010. A neural method based on new constraints for stereo matching of urban high-resolution satellite imagery. J. Applied Sci., 10: 2010-2018.

41:  Felzenszwalb, P.F. and D.P. Huttenlocher, 2004. Efficient belief propagation for early vision. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Volume 1, June 27-July 2, 2004, Washington, DC., USA., pp: 261-268.
