Reconstruction of 3D scenes from stereo pairs relies on matching corresponding points in the left and right images. Generally, reconstruction is an ill-posed inverse optical problem because many optical surfaces may produce the same stereo image pair due to homogeneous texture, partial occlusions and optical distortions. To regularise the problem and obtain a unique solution close to human visual perception, specific constraints on surfaces need to be imposed. Almost all existing stereo reconstruction algorithms search for a single optical surface that yields the best correspondence between the images subject to constraints on surface continuity, smoothness, and visibility. Typically, most of the constraints are ‘soft’, i.e. allow for deviations, and the matching score is an ad hoc linear combination of individual criteria of signal similarity, surface smoothness, and surface visibility (or occlusions), with empirically chosen weights for each criterion. The resulting complex optimisation problem is solved using various exact or approximate techniques, e.g. dynamic programming, belief propagation or graph min-cut algorithms. However, the heuristic choice of the weights in the matching score strongly influences the reconstruction accuracy. In addition, natural stereo pairs contain many admissible matches, so the ‘best’ matching that optimises the score may not lead to correct decisions. Moreover, real scenes very rarely consist of a single surface, so the single-surface assumption is also too restrictive.
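
As an illustration only, a conventional matching score of the kind described above can be written in the following generic form. The symbols are notation assumed here for the sketch, not taken from the thesis: d denotes a candidate disparity map, the E terms measure signal dissimilarity, surface roughness and occlusion cost, and the w terms are the empirically chosen weights.

```latex
E(d) = w_{\mathrm{sim}}\, E_{\mathrm{sim}}(d)
     + w_{\mathrm{smooth}}\, E_{\mathrm{smooth}}(d)
     + w_{\mathrm{vis}}\, E_{\mathrm{vis}}(d),
\qquad
d^{*} = \arg\min_{d} E(d)
```

Changing any of the weights can change the minimiser d*, which is why the heuristic choice of weights affects the reconstructed surface.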

The thesis develops an alternative approach to 3D stereo reconstruction called Noise-driven Concurrent Stereo Matching (NCSM). Algorithms that implement the NCSM paradigm clearly separate image matching from the subsequent search for optical surfaces. First, a hidden noise model, which allows for mutual photometric distortions of the images and for matching outliers, is estimated and then used to find candidate volumes by detecting all likely image matches. Selecting the 3D candidate volumes by image-to-image matching at a set of fixed depth (or disparity) values abandons the conventional assumption that a single best match has to be found. Then, the reconstruction proceeds from the most likely foreground surfaces to the background ones, accounting for occlusions in the process: it enlarges the corresponding background volumes at the expense of occluded portions and selects consistent optical surfaces that exhibit high point-wise signal similarity. The NCSM-based algorithms demonstrate high-quality 3D reconstruction from various stereo pairs. Detailed analyses and comparisons show that the NCSM framework yields results competitive with those of the best-performing conventional algorithms on test stereo pairs with no contrast deviations, and notably outperforms these algorithms in the presence of large contrast deviations.
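
The candidate-volume idea can be illustrated with a minimal sketch. It is not the thesis implementation: the function name `candidate_volume`, the fixed `noise_threshold` (a stand-in for the estimated noise model) and the plain absolute-difference test are all assumptions made for the example. For each fixed disparity, every pixel whose left and right signals agree within the noise bound is kept, rather than only the single best match.

```python
def candidate_volume(left, right, max_disparity, noise_threshold):
    """Mark every (row, column, disparity) whose signals agree within a noise bound.

    left, right: rectified greyscale images as lists of equal-length rows.
    noise_threshold: hypothetical fixed bound standing in for an estimated noise model.
    Returns a dict mapping each disparity d to a binary mask, where 1 means that
    left pixel (y, x) is an admissible match for right pixel (y, x - d).
    """
    height, width = len(left), len(left[0])
    volume = {}
    for d in range(max_disparity + 1):
        mask = [[0] * width for _ in range(height)]
        for y in range(height):
            # Columns x < d have no counterpart in the right image at this disparity.
            for x in range(d, width):
                if abs(left[y][x] - right[y][x - d]) <= noise_threshold:
                    mask[y][x] = 1
        volume[d] = mask
    return volume


# One-row toy pair: the bright segment is shifted by one pixel between views.
left_row = [[10, 10, 50, 50, 10, 10]]
right_row = [[10, 50, 50, 10, 10, 10]]
volume = candidate_volume(left_row, right_row, max_disparity=1, noise_threshold=5)
```

In this toy pair several pixels are admissible at both disparities, which is the ambiguity that a later surface-selection stage, not shown here, would have to resolve.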

Acknowledgments

First of all I would like to express my gratitude to my supervisor, Associate Professor Georgy Gimel'farb, for all the invaluable help, support, commitment and enthusiasm he has given me throughout my PhD study. I have learned much about computer vision from his vast knowledge of the field. Without his support the completion of this project would not have been possible and it would certainly have been less enjoyable.

I would like to thank Associate Professor John Morris and Dr. Patrice Jean Delmas for sharing their expertise, for always being available to discuss our results, and for providing interesting suggestions.

Thanks must also go to my family and my wife Jingyi Li. They have provided unconditional support and encouragement during the last few years.

I would like to thank all my friends and fellow students for their help, input and motivation.

Copyright by Jiang Liu. Contact: jliu001@gmail.com