Research in Stereo

Starting from 2007: Real-time stereo matching (together with AP John Morris and Dr. Patrice Delmas)

See

Intelligent Vision Systems / 3D Vision lab web page
Photogrammetry Lab web page

Starting from 2004: Noise-driven concurrent stereo matching (NCSM)

Most current work on computational stereo vision follows a conventional strategy: search for a single optical surface that yields the best pixel-to-pixel correspondence between the images of a stereo pair. The search involves specific constraints on surface continuity, smoothness, and visibility (occlusions) embedded into a matching score, typically an ad hoc linear combination of distinctly different criteria of signal similarity and surface properties. The coefficients, or weighting factors, have to be selected empirically because they dramatically affect the accuracy of stereo matching. Only quite simplistic similarity scores, such as the sum of squared or absolute differences between the corresponding signals, lead to more or less computationally feasible algorithms of exact or approximate constrained optimisation that search for the best stereo correspondence. The single-surface assumption is also too restrictive: few real scenes have only one surface to be reconstructed.
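To make the notion of such a matching score concrete, here is a minimal Python sketch for a single scanline. It is only an illustration under stated assumptions, not the score used in any particular published algorithm: the function names, the absolute-difference data term, the absolute-difference smoothness term, and the convention that pixel x in the left row matches pixel x - d in the right row are all hypothetical choices.

```python
def sad(left_window, right_window):
    """Sum of absolute differences between two equally sized signal windows."""
    return sum(abs(a - b) for a, b in zip(left_window, right_window))


def ssd(left_window, right_window):
    """Sum of squared differences between two equally sized signal windows."""
    return sum((a - b) ** 2 for a, b in zip(left_window, right_window))


def matching_score(left_row, right_row, disparities, smooth_weight=1.0):
    """Linear combination of signal dissimilarity and a smoothness penalty.

    Assumed convention (hypothetical): left_row[x] corresponds to
    right_row[x - disparities[x]]; the caller must keep x - d in range.
    Lower scores are better.
    """
    n = len(left_row)
    data = 0
    for x, d in enumerate(disparities):
        xr = x - d
        if not 0 <= xr < n:
            raise ValueError("disparity maps pixel outside the right row")
        data += abs(left_row[x] - right_row[xr])
    # Smoothness term: penalise disparity jumps between neighbouring pixels.
    smooth = sum(abs(disparities[x] - disparities[x - 1]) for x in range(1, n))
    return data + smooth_weight * smooth
```

The single parameter smooth_weight plays the role of the empirically chosen weighting factor: changing it can change which disparity profile scores best for the same pair of rows.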

A new paradigm of concurrent stereo was introduced in 2004 - 2005 (see my publications) to circumvent these problems in part by separating image matching from the choice of the 3D surfaces. Concurrent stereo matching first detects all likely matching 3D volumes instead of single best matches. Then, starting in the foreground, the volumes are explored, selecting mutually consistent optical surfaces that exhibit high point-wise signal similarity. Local, rather than global, surface continuity and visibility constraints are applied.
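The two stages can be pictured with a deliberately crude Python sketch for one scanline. It is not the NCSM algorithm from the publications: the fixed noise threshold, the function names, the convention that left pixel x matches right pixel x - d, and the assumption that a larger disparity means a nearer surface are all hypothetical simplifications, and the second stage applies no continuity or visibility checks.

```python
def candidate_volume(left_row, right_row, d_min, d_max, noise_threshold):
    """Stage 1: mark every (x, d) whose signals agree to within the threshold.

    Returns a dict mapping each disparity d to a list of booleans over x.
    All likely matches are kept, not just the single best one per pixel.
    """
    n = len(left_row)
    volume = {}
    for d in range(d_min, d_max + 1):
        mask = []
        for x in range(n):
            xr = x - d  # assumed correspondence convention
            ok = 0 <= xr < n and abs(left_row[x] - right_row[xr]) <= noise_threshold
            mask.append(ok)
        volume[d] = mask
    return volume


def foreground_first(volume):
    """Stage 2 (crude stand-in): scan from the assumed foreground backwards.

    For each pixel, take the largest candidate disparity, or None if the
    pixel has no candidate at all.
    """
    n = len(next(iter(volume.values())))
    result = [None] * n
    for d in sorted(volume, reverse=True):  # nearest candidates first
        for x in range(n):
            if result[x] is None and volume[d][x]:
                result[x] = d
    return result
```

Separating the two functions mirrors the separation described above: the first depends only on signal similarity, while surface selection happens entirely in the second.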

On-going results in developing and testing various versions of the NCSM can be found (in addition to the aforementioned publications) in the PhD thesis of Dr. Jiang Liu (2007).

1999 - 2003: SDPS-based software package "SIP Stereo Image Processing"

Take a look at the picture showing an output of the package.

The left range image displays grey-coded elevations (the darker, the farther) for every pixel of the right ortho image of the well-known Pentagon. The range image represents a dense 3D digital parallax model (DPM) reconstructed by SIP from the initial 512 × 512 stereo pair. Each (x,y)-point of the DPM gives the x-disparity, or parallax, between the corresponding pixels in the stereo images.

The reconstruction is done by a symmetric dynamic programming stereo (SDPS) algorithm, which differs in many respects from other DPS algorithms, including my own pioneering ones first published in 1972 - 1979. You may find technical details in my publications.

The algorithm is not complex: my former Compaq Armada M700 laptop PC with a Pentium II processor took less than 8 seconds to reconstruct this 512 × 512 DPM within the x-disparity range of [-10, 10]. Although it ranks below today's best-performing counterparts in accuracy, in reconstruction speed this algorithm outperforms most of them (they may spend dozens of minutes or even hours on a powerful workstation to obtain similar or slightly better results).
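For readers unfamiliar with dynamic programming stereo in general, the following Python sketch shows a generic scanline search over a fixed disparity range. It is not SDPS: it is asymmetric, has no occlusion or photometric distortion model, and its cost terms, parameter names, and the convention that left pixel x matches right pixel x - d are assumptions made only for illustration.

```python
def dp_scanline(left_row, right_row, d_min, d_max,
                smooth_weight=1.0, invalid_cost=1e6):
    """Minimum-cost disparity profile for one scanline (Viterbi-style DP).

    Cost of a profile = sum of absolute signal differences
                      + smooth_weight * sum of absolute disparity jumps.
    Pixels mapped outside the right row receive invalid_cost.
    """
    n = len(left_row)
    ds = list(range(d_min, d_max + 1))

    def data_cost(x, d):
        xr = x - d  # assumed correspondence convention
        if 0 <= xr < n:
            return abs(left_row[x] - right_row[xr])
        return invalid_cost

    # Forward pass: best accumulated cost for each disparity at each pixel.
    cost = [data_cost(0, d) for d in ds]
    back = []
    for x in range(1, n):
        new_cost, pointers = [], []
        for d in ds:
            best_j = min(range(len(ds)),
                         key=lambda j: cost[j] + smooth_weight * abs(d - ds[j]))
            step = smooth_weight * abs(d - ds[best_j])
            new_cost.append(cost[best_j] + step + data_cost(x, d))
            pointers.append(best_j)
        cost = new_cost
        back.append(pointers)

    # Backward pass: trace the optimal profile from the last pixel.
    i = min(range(len(ds)), key=lambda j: cost[j])
    path = [ds[i]]
    for pointers in reversed(back):
        i = pointers[i]
        path.append(ds[i])
    path.reverse()
    return path
```

Each scanline costs on the order of n × D² elementary steps for n pixels and D candidate disparities, so a narrow range such as [-10, 10] keeps the search cheap; rows are processed independently in this sketch.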

True enough, such a simple reconstruction was obtained only after studying stereo problems for more than 30 years. I first used dynamic programming to search for stereo correspondence in 1969 - 1971 and continued to investigate these algorithms later. But the publications were in Russian and appeared well before computational stereo became one of the central problems in Computer Vision. They did not attract much attention afterwards either, because their titles do not mention dynamic programming (I considered it merely a tool for finding the desired correspondence and was more interested in mathematical models of stereo images). Thus only a few professionals know about their existence, but I still like these early papers, which recall those naive but exciting years of my youth when computers that could see, talk, and understand the world seemed to be "just around the corner":

Gimel'farb, G. L., Marchenko, V. B., and Rybak, V. I.: An algorithm for automatic identification of identical sections on stereo pairs of photographs. Kibernetika, no. 2, 1972, pp. 118 - 129. (English translation: Cybernetics, vol. 8, no. 2, 1972, pp. 311 - 322. Consultants Bureau, N.Y.).
Gimel'farb, G. L., Marchenko, V. B., and Rybak, V. I.: Automatic identification of identical points on stereoscopic photographs with due regard for irregular photometric distortions of the signals. Kibernetika, no. 4, 1976, pp. 107 - 112. (English translation: Cybernetics, vol. 12, no. 4, 1976. Consultants Bureau, N.Y.).
Gimel'farb, G. L.: Symmetrical approach to the problem of automating stereoscopic measurements in photogrammetry. Kibernetika, no. 2, 1979, pp. 73 - 82. (English translation: Cybernetics, vol. 15, no. 2, 1979, pp. 235 - 247. Consultants Bureau, N.Y.).

These algorithms have been refined for many years to take into account the ill-posedness of the inverse photometric problem of binocular or trinocular stereo and to regularise the reconstruction with respect to basic salient features of the problem such as partial occlusions, relative photometric distortions of stereo images, and uniform colouring of some terrain areas.

The x-disparities are inversely proportional to elevations (heights) of the terrain points. Therefore you may agree that, in spite of minor deviations, the final DPM "Pentagon" is fairly close to visual expectations. Of course, human vision produces more precise elevation models, and today's best performers, based on approximate graph-cut or belief propagation optimisation, produce less erroneous models, too. But the rate of processing also matters, and the SDPS has potential for further development, in particular in the above-mentioned NCSM framework...

Some results of using the SIP package are presented below:

Reconstruction of the "Pentagon" scene.
Reconstruction of the "City in Spain" scene.
Reconstruction of the "Mountain in Spain - 1" scene.
Reconstruction of the "Mountain in Spain - 2" scene.
Reconstruction of the "Ramat-Hen" scene.