Theoretical Foundations of Computer Vision

Reinhard Klette, Walter G. Kropatsch, Franc Solina

(editors)


Dagstuhl-Seminar-Report

14.-18.3.1994 (Seminar 9411)


free copies:


	Geschäftsstelle Schloss Dagstuhl
	Universität des Saarlandes
	Postfach 15 11 50
	D-66041 Saarbrücken
	Germany
	e-mail: office@dag.uni-sb.de

Preface


This workshop is the seventh on this topic (Weissig 1982, 1984, 1988, Mirow/Fleeth 1986, Mägdesprung 1990, Buckow/Märkische Schweiz 1992). Typically, there were no restrictions on content as long as theoretical issues were selected for presentation. Some previous workshops had a specified main topic such as "Digital Geometry" or "AI and Vision".
This seventh workshop covered a broad range of fields: Active Vision, Shape Reconstruction, Segmentation, Invariance, Models, Morphology, Digital Theory, Image Processing and Applications. Still, it is surprising that during the workshop all participants showed an intense interest in these diverse fields. Moreover, several issues were identified as being of essential interest, e.g. the need for nonlinear operators, the transition between analog and discrete representations, and the integration of visual processing with decision making. These issues were addressed in the oral presentations and were intensively discussed during the week in one of the comfortable rooms of the castle.
Last but not least, an important result of this workshop was the bringing together of scientists from Western and Eastern Europe as well as the US, which helped participants to understand the different views in the vision community.

Reinhard Klette, Franc Solina, Walter G. Kropatsch


What is Life after Active Vision?



Ruzena Bajcsy
University of Pennsylvania

Vision/Perception is NOT "l'art pour l'art"; that is, it is not for its own sake. Vision 
serves a PURPOSE/TASK.
Typically we consider the following tasks:
1. Vision for Manipulation
2. Vision for Mobility
3. Vision for Recognition
4. Vision for Communication.
Tasks (1, 2) are denoted as WHERE questions and tasks (3, 4) as WHAT questions.
We consider the question of Representation to be the most important problem in 
Visual Perception. In turn, Representation implies selection and construction of 
MODELS.
Models must be on many different levels:
1. Sensory level: models of transduction mechanism;
models of radiometric effects that result from interaction between 
the observer, light and the scene;
geometric model of the optics;
2. Signal level: filters linear and non-linear;
3. Topological and geometric level of the objects and scene;
4. Material properties as they can be extracted from color and motion;
5. Kinematic properties, such as movable parts;
6. Identification of Dynamic Systems, such as fluids, flexible materials;
7. Models of Functionality;

Open Problems:
1. What is observable from Vision only?
If we can answer this question, it will imply what we must assume or measure in 
order to have a completely identifiable system.

2. The world is continuous with natural discontinuities.
An open question is how to identify these discontinuities.
This is the classical problem of SIGNAL to SYMBOL conversion.
The added difficulty is that this conversion must not be static but be able to 
dynamically change modulo task and context.

3. Biological systems are redundant, non-orthogonal, partially overlapped in their 
functionality, partially dependent and correlated.
The engineering systems on the other hand are orthogonal, independent and 
uncorrelated.

What is needed is a new calculus of non-orthogonal partially dependent systems.


Filtering for Feature Extraction


Tatjana Belikova
Russian Academy of Sciences

The task of extracting and locating objects on a complex background is under 
consideration. Several models of objects known up to their random parameters are 
proposed. They were used to develop linear filters that are optimal by the criteria of 
least mean square error and maximum signal-to-noise (s/n) ratio, in order to extract 
objects on the complex background or to improve the s/n ratio. The output of the 
latter filter was used to discriminate points with object location. Parametric and 
nonparametric estimates of the signal values were used for this purpose.
For parametric estimation we used maximum likelihood estimation to distinguish 
pixel values belonging to two different components that have different mean and 
deviation values. In nonparametric estimation we used an analysis of order statistics 
(rank-ordered local gray values) to estimate the mean value of each component and 
to reduce the deviation of the object and background signals. These methods were 
helpful in extracting and locating microcalcifications for early breast cancer diagnosis.


GLDH Based Analysis of Texture Anisotropy and Symmetry: an Experimental Study


Dimitry Chetverikov
Hungarian Academy of Sciences

Recently, growing attention has been paid to the investigation of oriented 
(anisotropic) textures. This interest has been supported by the discovery of the 
important role played by a few dominant high-level texture features, including 
directionality, in attentive perception of texture patterns by humans. Co-occurrence 
probability matrix (CPM) and gray-level difference histogram (GLDH) based 
features have been traditionally viewed as powerful texture analysis tools that are, 
however, less suitable for detailed anisotropy analysis because for small discrete 
spacing one cannot set fine angular resolution necessary for detailed directionality 
analysis. We propose a straightforward and computationally efficient extension of 
CPM and GLDH to arbitrary angle and spacing and apply the extended GLDH 
features to the analysis of texture anisotropy. Furthermore, we consider the 
possibility of investigating the symmetry of a texture pattern via the symmetry 
properties of a polar diagram (anisotropy indicatrix) describing the anisotropy of the 
pattern. Results of pilot experiments with real-world textures are shown and 
directions of further research discussed.
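
The idea of evaluating a gray-level difference histogram at an arbitrary angle and spacing can be illustrated with a minimal sketch. The abstract does not say how the off-grid gray values are obtained; the bilinear interpolation used here, the function names `bilinear`, `gldh` and `mean_abs_diff`, and the choice of the mean absolute difference as the anisotropy feature are all assumptions made for illustration only.

```python
import math

def bilinear(img, x, y):
    """Gray value at a real-valued position (x, y); img is a list of rows."""
    h, w = len(img), len(img[0])
    x0 = min(int(math.floor(x)), w - 2)
    y0 = min(int(math.floor(y)), h - 2)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * img[y0][x0] + fx * img[y0][x0 + 1]
    bot = (1 - fx) * img[y0 + 1][x0] + fx * img[y0 + 1][x0 + 1]
    return (1 - fy) * top + fy * bot

def gldh(img, spacing, angle_deg, levels=256):
    """Normalized histogram of |I(p) - I(p + d)| for displacement d = (spacing, angle)."""
    h, w = len(img), len(img[0])
    # Rounding removes tiny floating-point residues such as cos(90 deg) = 6e-17.
    dx = round(spacing * math.cos(math.radians(angle_deg)), 9)
    dy = round(spacing * math.sin(math.radians(angle_deg)), 9)
    hist = [0] * levels
    count = 0
    for y in range(h):
        for x in range(w):
            xs, ys = x + dx, y + dy
            if 0 <= xs <= w - 1 and 0 <= ys <= h - 1:
                diff = int(round(abs(img[y][x] - bilinear(img, xs, ys))))
                hist[diff] += 1
                count += 1
    return [c / count for c in hist] if count else hist

def mean_abs_diff(hist):
    """One simple GLDH feature: the expected absolute gray-level difference."""
    return sum(i * p for i, p in enumerate(hist))

# Vertical stripes: strong differences along the rows, none along the columns.
stripes = [[10 * (x % 2) for x in range(8)] for _ in range(8)]
indicatrix = {a: mean_abs_diff(gldh(stripes, 1, a)) for a in range(0, 180, 15)}
```

Plotting `indicatrix` over the angle gives a polar diagram of the kind the abstract calls an anisotropy indicatrix; for the striped test image it peaks at 0 degrees and vanishes at 90 degrees.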


Issues in attentive visual motion processing


Konstantinos Daniilidis
Christian-Albrechts University Kiel

Attentive vision encompasses selective sensing in space, time, and resolution. 
Decreasing space and time complexity by selection arises as a practical necessity in 
building vision systems able to sense and act in real time. Attention does not only 
mean the control of the degrees of freedom of the sensorial apparatus. It necessitates 
selection of the appropriate representation as well as of the proper state subspace in 
order to accomplish a specific task in a more efficient and robust manner.
We do not discuss here how attention is achieved: what to select and how to design 
an oculomotor control loop. Our interest is on the benefits of attention concerning 
the accomplishment of a motion related task. We concentrate on two aspects of 
attention regarding motion: fixation and space-variant polar and log-polar 
representation. Overcoming the field of view and bounding the retinal velocity of a 
moving object are obvious advantages of holding the gaze fixated on a moving 
object. We show that fixation enables an object-centered representation for the 
solution of the structure from motion problem. This representation stabilizes the 
estimation of lateral object translation that is confounded with the rotation in a 
camera-centered representation. Furthermore, fixation enables the use of scaled 
orthography for a distant object, leading, thus, to an affine motion field. Building 
upon existing methods we show how the direction of translation can be obtained 
from the oculomotor control inputs (camera rotation), which is supported by theories 
on efference copy and positive feedback. The introduction of the log-polar 
representation decouples the translation along from the rotation about the optical 
axis. We show - in contrast to existing results - that an already known function of the 
local motion parallax depends on the local slope of the surface. Furthermore, it turns 
out that the advantages regarding motion estimation are not in the logarithmic but in 
the polar nature of the space-variant representation. However, a log-polar 
transformation of the motion field facilitates independent motion detection if the 
observer is frontally translating.
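
The decoupling property of the log-polar representation can be checked numerically with a small sketch. It relies on a simplifying assumption that is not taken from the abstract: the image effect of a translation along the optical axis is modeled as a uniform scaling about the image center, which holds exactly only for a fronto-parallel plane. The function names are illustrative.

```python
import math

def to_log_polar(x, y):
    """Map an image point (relative to the image center) to (log r, theta)."""
    return math.log(math.hypot(x, y)), math.atan2(y, x)

def rotate_and_scale(x, y, angle, scale):
    """Rotation about the optical axis followed by a uniform scaling."""
    c, s = math.cos(angle), math.sin(angle)
    return scale * (c * x - s * y), scale * (s * x + c * y)

rho0, theta0 = to_log_polar(3.0, 4.0)
rho1, theta1 = to_log_polar(*rotate_and_scale(3.0, 4.0, angle=0.3, scale=2.0))

# Scaling shifts only the log-radius, rotation shifts only the angle.
shift_rho = rho1 - rho0        # equals log(2) up to rounding
shift_theta = theta1 - theta0  # equals 0.3 up to rounding
```

In this coordinate system the two motions act on separate axes, which is the sense in which the representation decouples them.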


Thin Sets


Ulrich Eckhardt*, Longin Latecki* and Albrecht Hübler**
* University Hamburg; ** Wolfsburg

There are mainly three reasons for dealing with thin subsets of the digital plane Z^2:
- Such sets are generated by algorithms for thinning binary images,
- Thin sets are discrete analogs of curves in the plane,
- In order to understand 3-D structures and algorithms it becomes necessary to 
revisit critically the known 2D theory.
First we classify digital sets which are considered to be "thin" in some sense. There 
are 8- and 4-curves, contours (oriented boundaries of digital sets) and so-called graph 
sets. It could be shown quite recently (Latecki, Eckhardt, Rosenfeld, 1994) that 
under rather mild conditions each digital set can be reduced to a topologically 
equivalent graph set.
It is also attractive to investigate families of thin sets. This is important for studying 
digital analogs of circles (or equivalently, rotations of the digital plane) for defining 
niveau lines in gray-scale pictures and for morphological operations. Specifically 
one may ask under which conditions erosion is the inverse operation to dilation.
One result of these investigations is a complete classification of simple and of 
nonsingular coverings of the digital plane by 8- (or 4-) curves and also a 
classification of singular points with respect to morphological operations (Eckhardt, 
Hübler, 1993). These latter points lead to a "morphological skeleton" of a digital set 
which has the property of exact reconstructability but has generally not the same 
topological properties as the original set.
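
The question of when erosion inverts dilation can be made concrete with a minimal sketch on finite subsets of the digital plane. The set-based representation and the function names are assumptions for illustration; the classification results cited in the abstract are not reproduced here.

```python
def dilate(A, B):
    """Minkowski sum of the digital set A with the structuring element B."""
    return {(ax + bx, ay + by) for (ax, ay) in A for (bx, by) in B}

def erode(A, B):
    """All points p such that the translate of B by p lies inside A."""
    if not A or not B:
        return set()
    b0x, b0y = next(iter(B))
    candidates = {(ax - b0x, ay - b0y) for (ax, ay) in A}
    return {(px, py) for (px, py) in candidates
            if all((px + bx, py + by) in A for (bx, by) in B)}

B = {(-1, 0), (0, 0), (1, 0)}      # horizontal 3-pixel structuring element

single = {(0, 0)}
gapped = {(0, 0), (2, 0)}          # two pixels separated by a one-pixel gap

restored_single = erode(dilate(single, B), B)   # erosion undoes dilation here
restored_gapped = erode(dilate(gapped, B), B)   # the gap pixel (1, 0) is filled in
```

Erosion after dilation (the closing) always contains the original set, so erosion acts as the inverse of dilation exactly for those sets that the closing leaves unchanged.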


Moment-Based Features for Description and Recognition of Blurred Images


Jan Flusser, Tomas Suk and Stanislav Saic
Academy of Sciences of the Czech Republic

The paper is devoted to the feature-based recognition of blurred images acquired by 
a linear shift-invariant imaging system against an image database. The proposed 
approach consists of describing images by features which are invariant with respect 
to blur (that is, with respect to the system PSF) and recognizing images in the 
feature space. In comparison with the complicated and time-consuming 
"blind-restoration" approach, we do not need PSF identification and image restoration. 
Thanks to this, our approach is much more efficient.
Two sets of invariants based on image moments are introduced in this paper - one 
set for symmetric blur, the other one for linear motion blur. The derivation of the 
invariants is a major theoretical result of the paper.
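
A small numerical sketch can show why moment-based blur invariants exist at all. It checks one elementary fact: for a centrally symmetric PSF that sums to one, the third-order central moments of an image are unchanged by the blur, whereas second-order moments are not. This is only the simplest instance of such an invariant and is not claimed to be one of the two sets derived in the paper; the helper names and test data are made up.

```python
def convolve_full(f, h):
    """Full 2D convolution, so that no image energy is cut off at the border."""
    fh, fw, hh, hw = len(f), len(f[0]), len(h), len(h[0])
    g = [[0.0] * (fw + hw - 1) for _ in range(fh + hh - 1)]
    for y in range(fh):
        for x in range(fw):
            for j in range(hh):
                for i in range(hw):
                    g[y + j][x + i] += f[y][x] * h[j][i]
    return g

def central_moment(img, p, q):
    """Central moment mu_pq of a gray-level image given as a list of rows."""
    m00 = sum(sum(row) for row in img)
    cx = sum(x * v for row in img for x, v in enumerate(row)) / m00
    cy = sum(y * v for y, row in enumerate(img) for v in row) / m00
    return sum((x - cx) ** p * (y - cy) ** q * v
               for y, row in enumerate(img) for x, v in enumerate(row))

image = [[1, 2, 0, 0],
         [0, 5, 1, 0],
         [3, 0, 0, 2]]

# Centrally symmetric PSF, h(-x, -y) = h(x, y), normalized to sum 1.
psf = [[v / 16.0 for v in row] for row in [[0, 1, 2], [3, 4, 3], [2, 1, 0]]]

blurred = convolve_full(image, psf)
third_order = [(3, 0), (2, 1), (1, 2), (0, 3)]
before = [central_moment(image, p, q) for p, q in third_order]
after = [central_moment(blurred, p, q) for p, q in third_order]
```

The lists `before` and `after` agree up to rounding, so a classifier working on such features does not need to estimate the PSF first.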


3D Scene reconstruction using a regional Approach


Andre Gagalowicz
INRIA-Rocquencourt

We discuss the problem of 3D indoor scene interpretation from an a priori given 
stereo pair of images. We stress the importance of the existence of an a priori given 
model of the 3D space and only study the case of a global model of this space. The 
proposed method consists in the use of a cooperative analysis/synthesis technique: 
an analysis (vision) task proposes a 3D complete model of the 3D scene 
incorporating geometric and photometric information. A synthesis algorithm is run 
afterwards, using the 3D complete model as input, and produces a synthetic stereo 
pair of the portion of this model possibly seen by the left and right camera. The 
difference between the natural and synthetic stereo pairs is used to produce a better 
"complete" model. We first consider the "learning" phase, in which we incorporate a 
model to interactively and visually construct and control the global model of the 3D 
space. In the analysis part, we discuss the construction of a pipeline involving 
image segmentation, region matching, stereo reconstruction, geometric and 
photometric interpretation of the "scene" leading to the construction of a "good" 
complete model of the part of 3D scene available in the stereo pair. An extension to 
the case of local vision problem involving an active procedure is briefly discussed 
as a conclusion.


Symmetric Bi- and Trinocular Stereo


Georgy Gimel'farb
Academy of Sciences of the Ukraine

Tradeoffs between theoretically justified and heuristic sides of the symmetric 
approach to intensity-based computational stereo are discussed. Under this approach 
a desired continuous optical surface is reconstructed from given stereo images as a 
bunch of epipolar profiles, each profile being obtained by maximizing a measure of 
similarity between intensities in the images and ortho-image (estimated coloring) of 
reconstructed surface points using dynamic programming (DP) techniques. In our 
previous papers this measure was deduced primarily under a simple Bayesian 
maximal-a-posteriori-probability (MAP) decision using probability models relating 
the profile coloring to corresponding intensities with due account of symmetries 
between the stereo images, independent allowable distortions of the images, 
possible discontinuities in each image because of partial occlusions of the surface, 
etc.
Computational stereo belongs to the domain of ill-posed inverse photometric 
problems because, in principle, many surfaces can give rise to the same stereo pair 
or triple of images. So it is impossible to reconstruct precisely the real surface which 
produced the obtained stereo images. Nonetheless some theoretical models and 
heuristics can be introduced to bring the reconstructed surface close enough to the 
one perceived visually from the sole stereo pair or triple (or, what is the same, to 
approach human vision accuracy under this very restrictive condition).
The theoretical basis of computational stereo can be refined by modeling the profile 
geometry to describe more or less probable surface shapes, deducing compound 
Bayesian decisions that are more adequate for solving stereo problems than the 
traditional MAP decision and are realized by similar DP techniques, and introducing a 
unified scheme of symmetric bi- and trinocular stereo.
But to cope with discontinuities in the images, some suitable heuristics for 
estimating coloring in the monocularly visible points of the surface and defining 
signal similarity for them are necessary. Rather good experimental results for 
real stereo pairs have been obtained with the similarity measure being a weighted 
linear combination of two similar measures: one between the intensities in the stereo 
images and the estimated surface coloring, and one between the two rectified stereo 
images themselves.


Decision Algorithms for Model-Based Vision Problems


Gregory Hager
Yale University

Many vision problems reduce to the problem of making a decision expressed as 
inequality constraints on an appropriate parametric model. This talk formalizes this 
class of problems and then presents an algorithm that is correct and complete for 
them. This algorithm is then extended to cover problems involving segmentation 
and also to address unstructured problems. Finally, it is observed that the use of 
low-level spatial organization processes is crucial for the effective use of these 
algorithms.


Improvement of the Curvature Computation


Vaclav Hlavac, Tomas Pajdla, Milos Sommer
Czech Technical University

An improvement of the curvature computation for digitized curves was presented. 
The standard scheme, i.e. computing curvature by convolution with the 
truncated Gaussian, was studied. First, we show that the systematic bias caused by 
curvature smoothing can be removed. Second, we demonstrate that about 25% of 
the error has its roots in other phenomena (i.e. anisotropy of the raster, limited size of 
the Gaussian, numerical integration of the convolution, and discretization).


Information technologies for image processing in real-time


Volodymyr Hrytsyk
Academy of Sciences of the Ukraine

An important unsolved problem in the analysis of complex scenes with motion is 
real-time implementation. An approach to solving the problem, based on mathematical 
models and a method of fast feature calculation and control, is proposed for dynamic 
images and complex scenes. Theorems which determine a constructive method 
for the synthesis of neuron-like and systolic computing structures are given, allowing 
the real-time implementation of recursive and parallel algorithms for image 
processing and scene analysis. A high efficiency of recursive-parallel systems for 
image processing is demonstrated.


Computation of mosaic images using an approximate 3D model


Pascal Jaillon, Annick Montanvert
TIMC-IMAG Grenoble

Image mosaicking consists in fusing images acquired from different places to build a 
global view of a scene. Diverse techniques provide mosaic images for satellite 
applications or painting reconstructions. We propose a method to mosaic images 
lying on three-dimensional surfaces, avoiding the computation of a 3D model of the 
surface. A coarse model of the surface and the parameters of the projection 
(acquisition view point, optical axis) are used to flatten the images. Then the images are 
merged with a 2D mosaicking technique.
Finally the resulting 2D image is mapped onto the evaluation of the 3D surface. This 
allows visualization from any view point.
Such an approach can take into account the perspective distortion, so that 
discontinuities along the junction line are reduced.
Depending on the application, we propose to apply corrections to the original images or 
to Laplacian images.
This mosaicking strategy is applied to satellite images, to paintings on vaults, and 
in microscopy.


Shape Reconstruction for Central Projection


Reinhard Klette
Technical University Berlin

The talk deals with the geometric models used for shape reconstruction where a 
rotating disc is used in front of a pinhole camera. For camera calibration the method 
by [Tsai 1986] was implemented and optimized (e.g. with respect to the number of 
calibration planes and points in each plane). It is described how to use the 
calibration results for shape reconstruction for the case of objects on the rotating 
disc where motion vectors are used as input.
Three approaches were studied. For the point-based approach the computation of 
accurate dense motion fields is the critical issue. An extensive evaluation of 
differential methods for optical flow computation was performed. On the other 
hand, shape reconstruction based on (assumed) accurate optical flow fields could be 
realized very precisely.
For the feature-based approach (e.g. edges by the Canny operator) the epipolar 
constraint of stereo cameras was modified for the rotating disc. Depth may be 
reconstructed at traced feature points, and even the rotation angle may be calculated 
by tracing one (!) point in two consecutive images using the calibration results. For 
the region-based approach, integrative constraints for shape reconstruction proved 
to be numerically quite unstable. Several theoretical results (e.g. shape from area and 
centroids of corresponding regions) were derived for future implementation.


Color Vision for Stereo Correspondence


Andreas Koschan
Technical University Berlin

Problem solving in digital image processing without color information is sometimes 
difficult or even impossible, as for example in the following cases: highlight 
detection, correspondence analysis in stereo images, image segmentation, etc. On 
the other hand, the necessity for color research often arises directly from the 
application (e.g., identification of color codes on resistors, food analysis, traffic sign 
recognition, etc.). In this paper it is shown that stereo matching results can be 
considerably improved when using color information. The Block Matching 
technique has been extended to the so-called Chromatic Block Matching technique 
because of its efficiency already shown for gray value images. Furthermore, it has 
been shown that results can be further improved when employing the I1I2I3 color 
space instead of the RGB space. No significant influence of the color measures on 
the results has been found yet. In summary, we believe that precise 
dense depth maps can be obtained more easily when applying this Chromatic Block 
Matching technique to color stereo images.
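
A minimal sketch of block matching on color images may help to fix the idea. The cost function (sum of absolute channel differences), the search along one image row, and all function names are assumptions for illustration, not the measures evaluated in the paper. The I1I2I3 conversion uses the commonly cited definition of the Ohta color features.

```python
def rgb_to_i1i2i3(pixel):
    """Ohta color features: I1 = (R+G+B)/3, I2 = (R-B)/2, I3 = (2G-R-B)/4."""
    r, g, b = pixel
    return ((r + g + b) / 3.0, (r - b) / 2.0, (2 * g - r - b) / 4.0)

def block_cost(left, right, x, y, d, half):
    """Sum of absolute channel differences between two (2*half+1)^2 blocks."""
    cost = 0.0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            a = left[y + dy][x + dx]
            b = right[y + dy][x + dx - d]
            cost += sum(abs(u - v) for u, v in zip(a, b))
    return cost

def best_disparity(left, right, x, y, max_d, half=1):
    """Disparity in 0..max_d with the lowest block cost along the same row."""
    candidates = [d for d in range(max_d + 1) if x - half - d >= 0]
    return min(candidates, key=lambda d: block_cost(left, right, x, y, d, half))

def convert(image, transform):
    """Apply a per-pixel color transform to a whole image."""
    return [[transform(p) for p in row] for row in image]

# Synthetic test pair: the left image is the right image shifted by 2 pixels.
right = [[((37 * x + 11 * y) % 256, (91 * x + 5 * y * y) % 256,
           (13 * x * x + 7 * y) % 256) for x in range(12)] for y in range(5)]
left = [[right[y][max(x - 2, 0)] for x in range(12)] for y in range(5)]

d_rgb = best_disparity(left, right, x=5, y=2, max_d=4)
d_ohta = best_disparity(convert(left, rgb_to_i1i2i3),
                        convert(right, rgb_to_i1i2i3), x=5, y=2, max_d=4)
```

Switching the color space only requires converting both images before matching, which keeps the matching loop itself unchanged.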


Optimal Statistical Filtering of gray-level Images


Vladimir Kovalevsky
Technische Fachhochschule Berlin

A method for eliminating random noise is suggested, which works as follows. 
The distribution of the gray values in a sliding window is 
approximated by up to four normal distributions. Parameters of the distributions and 
the probability P_k(g) that a gray value g belongs to distribution k are 
estimated by an iterative method suggested by Schlesinger many years ago. Then 
the gray value of the central pixel of the window in the output image is set equal to 
the mean value of one of the normal distributions. This distribution is selected 
according to the maximum a posteriori probability P_k(g). The results of the 
filtering are compared with those of the sigma-filter.
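
The filtering step described above can be sketched for a single window as follows. The parameter estimation is written here as a plain expectation-maximization iteration for a one-dimensional Gaussian mixture; that this matches the iterative method referred to in the abstract is an assumption, as are the initialization, the variance floor `min_var`, and the function names.

```python
import math

def gaussian(g, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(g - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def fit_mixture(values, k=2, iterations=50, min_var=0.25):
    """Estimate weights, means and variances of k normal components."""
    lo, hi = min(values), max(values)
    means = [lo + (hi - lo) * i / max(k - 1, 1) for i in range(k)]
    overall = sum(values) / len(values)
    var0 = max(sum((v - overall) ** 2 for v in values) / len(values), min_var)
    variances, weights = [var0] * k, [1.0 / k] * k
    for _ in range(iterations):
        # E-step: posterior probability of each component for each gray value.
        resp = []
        for g in values:
            p = [weights[j] * gaussian(g, means[j], variances[j]) for j in range(k)]
            total = sum(p)
            resp.append([pj / total if total > 0 else 1.0 / k for pj in p])
        # M-step: re-estimate the parameters from the weighted samples.
        for j in range(k):
            n_j = sum(r[j] for r in resp)
            if n_j < 1e-9:
                continue  # empty component: keep its previous parameters
            means[j] = sum(r[j] * g for r, g in zip(resp, values)) / n_j
            spread = sum(r[j] * (g - means[j]) ** 2 for r, g in zip(resp, values)) / n_j
            variances[j] = max(spread, min_var)
            weights[j] = n_j / len(values)
    return weights, means, variances

def filter_center(window, center_value, k=2):
    """Replace the central gray value by the mean of its most probable component."""
    weights, means, variances = fit_mixture(window, k)
    post = [weights[j] * gaussian(center_value, means[j], variances[j]) for j in range(k)]
    return means[post.index(max(post))]

window = [10, 11, 9, 10, 50, 51, 49, 10, 12]   # a window straddling an edge
smoothed = filter_center(window, center_value=50)
```

Because the output is the mean of one component rather than the mean of the whole window, the noise is averaged out without mixing the gray values from the two sides of an edge.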


Properties of Pyramidal Representations


Walter G. Kropatsch
Technical University of Vienna

The categorization of different components generalizes the classical concept of 
image pyramids and provides a powerful tool for efficient image analysis. Three 
different components of image pyramids are distinguished: their structure, the 
contents of their cells and the processes that operate on them. Different applications 
impose different requirements on the processing of the data. There are several 
engineering decisions to be made. The properties of the three different components 
of a pyramidal system are discussed and illustrated by examples. New results and 
research trends give an overview of the current state of the art.


Robust Recovery of Structures in Images


Ales Leonardis
University of Ljubljana

The significance of detecting geometric parametric structures has long been realized 
in the vision community. In this paper, a reliable and efficient method for extracting 
geometric parametric structures is presented. The method consists of two intertwined 
procedures, namely model-recovery and model-selection. The first procedure 
systematically recovers parametric models in an image creating a redundant set of 
possible descriptions, while the model-selection procedure searches among them to 
produce an optimal result in terms of the objective function.
The reliability of the recovery procedure which builds the parametric models is 
ensured by an iterative procedure through simultaneous performance of data 
classification and parameter estimation. The overall relative insensitivity to noise 
and minor changes in the input data is achieved by considering many competitive 
solutions and selecting those that produce the simplest description. The selection 
procedure is defined as a quadratic Boolean problem, and the solution is sought by 
the WTA (winner-takes-all) technique.
The presented method is efficient for two reasons: firstly, it is designed as a search 
which utilizes intermediate results as a guidance toward the final result, and 
secondly, it combines model recovery and model selection in a computationally 
efficient procedure. The proposed method proved to be successful for recovering 
parametric surface models and volumetric models (superquadrics) in range images 
and parametric curve models in edge images.


A structure-probabilistic approach to edge detection and adaptive filtering


Roman M. Palenichka
Academy of Sciences of the Ukraine

An edge detection operator can be efficiently used as a control mechanism for image 
segmentation and filtering. For this purpose an approach based on direct estimation of 
edge probability is proposed. The approach considers a two-stage detection procedure. 
At the first stage an image segment is tested for uniformity by evaluating the 
probability of a uniform (smooth) segment. If this segment is non-smooth, the second 
test should be applied, during which the edge probability is evaluated. The edge 
position can be selected as the point of maximal computed probability in a 
given neighborhood. This method can be successfully used for binary segmentation 
as a thresholding procedure with a floating threshold. The updated threshold value at 
each point depends on its value at the previous point as well as on 
the edge probability at this point. This approach is based on a structural 
mathematical image model composed of two components. The first one is the 
intensity trend and the second component represents fluctuations. For fast 
implementation of this method a fast recursive algorithm is proposed to calculate 
such local image features as mean value, median and variance.
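
The recursive computation of local features can be illustrated for the mean and the variance of a one-dimensional sliding window. This is a generic running-sum sketch under assumed names, not the algorithm of the paper, and it does not cover the median.

```python
def running_mean_var(signal, width):
    """Mean and (population) variance of every window of the given width.

    Each window is obtained from the previous one by adding the incoming
    sample and removing the outgoing one, so the cost per position does
    not depend on the window width.
    """
    if width <= 0 or width > len(signal):
        raise ValueError("width must be between 1 and len(signal)")
    s = sum(signal[:width])
    s2 = sum(v * v for v in signal[:width])
    out = []
    for i in range(len(signal) - width + 1):
        if i > 0:
            old, new = signal[i - 1], signal[i + width - 1]
            s += new - old
            s2 += new * new - old * old
        mean = s / width
        out.append((mean, s2 / width - mean * mean))
    return out

features = running_mean_var([3, 1, 4, 1, 5, 9, 2, 6], width=3)
```

For floating-point data the difference `s2 / width - mean * mean` can lose precision when the variance is small compared with the mean, so a practical implementation may prefer a numerically more careful update.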


An Incremental Learning System for Interpretation of Images


Petra Perner*, Walter Pätzold**
*HWTK Leipzig, **KWT Dresden

Defect classification by image-based techniques is an important issue in quality 
assurance and nondestructive testing. The solution of the problem is usually 
complex and context dependent. A domain-specific interpretation of the problem is 
required. Thus, the acquisition, representation and use of the problem-specific 
knowledge in combination with image processing facilities is a central point. The 
main problem in defect classification arises since generalized knowledge is mostly 
lacking. Therefore knowledge-based techniques are necessary which can work 
on the basis of single instances of the problem domain and learn new instances during the 
use of the system. This leads to the case-based reasoning paradigm. In the paper, we 
propose an architecture for a case-based reasoning system for image interpretation. 
We discuss our approach for the problem domain of ultrasonic image interpretation. 
The application is characterized by structural representation. The signal-to-symbol 
transformation for spatial knowledge is described as well as the case representation. 
For the determination of similarity between two structural representations we 
propose structural similarity and describe the algorithm for calculating the similarity. 
For the case base, we chose to use a hierarchical representation. Therefore, we 
developed an algorithm which can incrementally learn this hierarchical representation.


Computer Vision and Mathematical Morphology


Jos Roerdink
University of Groningen

An extension of mathematical morphology is investigated incorporating symmetry 
and invariance concepts essential for computer vision applications.
Classical morphology uses image transformations which are translation invariant. In 
many applications other forms of symmetry are involved, for example polar 
symmetry or invariance under perspective transformations. We extend morphology 
to such cases by considering any homogeneous space (G, X), where X is a set and G a 
group acting transitively on X. Morphological transformations can then be 
constructed as mappings of the Boolean algebra P(X) (the power set of X) to itself, 
which are invariant under the group G. Examples are the plane with the Euclidean 
motion group or the sphere with the rotation group. We discuss how this might be 
used for introducing projective invariance in morphology. Finally, a sketch is given 
of a preliminary attempt at morphological description of 3D surfaces by using 
concepts from differential geometry.


Globally convergent nonlinear diffusion networks for early vision


Christoph Schnörr, Rainer Sprengel
University of Hamburg

A class of minimization problems is considered to model nonlinear, transition 
preserving data-reduction processes for early vision. These problems are 
non-discretely formulated and always have a unique minimizing solution that 
continuously depends on the data. Approximate solutions based on the Galerkin 
method converge as the discretization parameter tends to zero.
The computation of approximate solutions can be done by a globally convergent 
and highly parallel relaxation procedure or, in principle, by a globally, 
asymptotically stable analog network.
The relationship of a prototype minimization problem to the nonlinear diffusion 
approach of Perona and Malik and its variational formulation due to Nordström is 
discussed.
It is shown that the parametrization of our diffusion coefficient can be used to 
control the trade-off between smoothing and preserving data transitions. The 
stability of the localization of data transitions in parameter space is demonstrated, 
and a criterion to select these transitions is presented.
Finally, the application of the general principle to locally computed motion data is 
considered, and two corresponding functionals and corresponding numerical results 
are discussed.
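
For orientation, the Perona-Malik type of nonlinear diffusion to which the abstract relates its prototype problem can be sketched in one dimension. This is a plain explicit scheme with the diffusivity g(s) = 1 / (1 + (s/k)^2); it is not the authors' functional, their diffusion coefficient, or their relaxation procedure, and the parameter names `k` and `dt` are assumptions.

```python
def diffusivity(gradient, k):
    """Edge-stopping function: close to 1 in flat regions, small at large jumps."""
    return 1.0 / (1.0 + (gradient / k) ** 2)

def diffusion_step(u, k, dt=0.25):
    """One explicit step of 1D nonlinear diffusion with zero-flux boundaries."""
    n = len(u)
    new = list(u)
    for i in range(n):
        right = u[i + 1] - u[i] if i + 1 < n else 0.0
        left = u[i] - u[i - 1] if i > 0 else 0.0
        # Net flux into cell i; dt <= 0.5 keeps the explicit scheme stable.
        new[i] = u[i] + dt * (diffusivity(right, k) * right
                              - diffusivity(left, k) * left)
    return new

def diffuse(u, k, steps, dt=0.25):
    """Apply several diffusion steps to a list of gray values."""
    u = [float(v) for v in u]
    for _ in range(steps):
        u = diffusion_step(u, k, dt)
    return u

step_edge = [0, 0, 0, 0, 10, 10, 10, 10]
smoothed = diffuse(step_edge, k=1.0, steps=20)
```

The parameter `k` plays the role of the smoothing/preservation trade-off mentioned in the abstract: differences much larger than `k` are treated as transitions and are largely preserved, while smaller ones are smoothed away.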


Banach Constructor and Image Compression


Wladyslaw Skarbak
Polish Academy of Sciences

The Banach constructor is defined as a concept unifying special cases of 
deterministic fractal modeling. The fractal compression of digital images is 
presented as a Banach constructor defined by a patchwork. The patchwork concept 
is a formal mathematical model which allowed for a compact definition of the 
fractal operator, specification of a condition for its contractivity (for all u-norms, 1 <=
u <= infinity), and formulation of conditions ensuring the required fidelity of the 
reconstructed image. The fast fractal compression algorithm (FFC) is based on 
patchworks which are affine (with contrast and scaling fixed), sparse, and local. 
While the known fractal compression schemes (Jacquin, Jacobs et al., Barnsley) 
require an encoding time 100-1000 times greater than the decoding time, FFC gives 
high-quality images of natural scenes with this ratio not exceeding 10.
Formulas for best-fit, affine, contrast-fixed transforms, which perform the best fit of 
two digital patches, are given for Minkowski u-norms with u = 1, u = 2, and u = infinity. 
Experiments confirm the superiority of the quadratic norm in the quality-time tradeoff.
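
With the contrast fixed, fitting one patch to another reduces to choosing a single gray-level offset, and for the three norms named above this has well-known closed forms. The sketch below uses those standard results (median, mean, midrange of the residuals); the abstract does not state its formulas, so this is an illustration under that assumption, with hypothetical function and parameter names.

```python
import math
import statistics

def best_offset(domain_patch, range_patch, contrast, u):
    """Offset o minimizing || range - (contrast * domain + o) ||_u.

    Patches are flat lists of gray values of equal length; u is 1, 2 or math.inf.
    """
    residuals = [r - contrast * d for r, d in zip(range_patch, domain_patch)]
    if u == 1:
        return statistics.median(residuals)              # least absolute deviation
    if u == 2:
        return statistics.mean(residuals)                # least squares
    if u == math.inf:
        return (min(residuals) + max(residuals)) / 2.0   # minimax (midrange)
    raise ValueError("only u = 1, 2 and infinity are handled in this sketch")

domain_patch = [4, 8, 0]
range_patch = [1, 2, 10]
offsets = {u: best_offset(domain_patch, range_patch, contrast=0.5, u=u)
           for u in (1, 2, math.inf)}
```

All three fits need only one pass over the patch or a single median selection, which is consistent with a fixed-contrast scheme being fast to encode.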


On Separability Problems in Computational Geometry and their Applications


F. Sloboda, B. Zatko
Slovak Academy of Sciences

Properties of the external and internal shortest path of a simple polygon were 
described. Further properties of the shortest path in a polygonally bounded compact 
set were described, and the separability problem of two disjoint polygons was 
investigated.


Segmentation with Volumetric Models


Franc Solina
University of Ljubljana

A new approach to reliable and efficient recovery of part-level descriptions from 
range images is presented. It is shown that a set of superquadric volumetric models 
can be directly recovered from unsegmented range data. Superquadric models are an 
extension of ellipsoids that cover a continuum of shapes including parallelepipeds 
and cylinders as well. The approach is based on the recover-and-select paradigm by 
Leonardis that consists of two intertwined processes: model recovery and model 
selection. In the model recovery process a redundant set of superquadrics is initiated 
in the image and allowed to grow. Recovered models are selected using an MDL-like 
criterion which results in the simplest overall description.
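
How a single superquadric spans ellipsoids, box-like shapes and cylinders can be seen from its inside-outside function. The sketch uses the form commonly given in the superquadric literature, in the object-centered coordinate system; the parameter names `a1, a2, a3, e1, e2` are conventional but are not taken from the abstract, and the recovery and selection procedures themselves are not shown.

```python
def inside_outside(x, y, z, a1, a2, a3, e1, e2):
    """Superquadric inside-outside function in canonical position.

    F < 1: point inside, F == 1: on the surface, F > 1: outside.
    a1, a2, a3 are the sizes along the axes; e1 and e2 control the squareness.
    """
    xy = (abs(x / a1) ** (2.0 / e2) + abs(y / a2) ** (2.0 / e2)) ** (e2 / e1)
    return xy + abs(z / a3) ** (2.0 / e1)

corner = (0.9, 0.9, 0.9)

ellipsoid = inside_outside(*corner, 1, 1, 1, e1=1.0, e2=1.0)   # sphere: outside
box_like = inside_outside(*corner, 1, 1, 1, e1=0.1, e2=0.1)    # near-cube: inside
cylinder = inside_outside(0.6, 0.6, 0.9, 1, 1, 1, e1=0.1, e2=1.0)
```

Since the shape changes continuously with `e1` and `e2`, a fitting procedure can adjust them together with size and pose, which is what makes the model usable on unsegmented range data.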


Steerable Filters for Attentive Visual Processing


Gerald Sommer* and Markus Michaelis**
*Christian-Albrechts-Universität Kiel
**GSF-MEDIS-Institut, Neuherberg

Junctions of lines or edges are important visual cues in various fields of computer 
vision. They are characterized by the existence of more than one orientation at one 
single point, the so called keypoint. In this work we investigate the performance of 
highly orientation selective functions to detect multiple orientations and to 
characterize junctions. A quadrate pair of functions. A quadrature pair of functions 
is used to detect lines as well as edges and to distinguish between them. An 
associated one-sided function with an angular periodicity of 360¡ can distinguish 
between terminating and non-terminating lines and edges which constitute the 
junctions. To calculate the response of these functions in a continuum of 
orientations and scales a method is used that was introduced recently by P. Perona 
[8].
These functions are called steerable filters. They seem to be the natural kind of 
operators that can be adapted in their degrees of freedom in the attention 
stage under the control of a task. It is shown that their response to local structures 
can be used to make the recognized structures explicit. This is the route for 
knowledge-based computer vision. In behavior-based systems (attentive systems) 
these responses are used as implicit representations, which constitute the input to 
associative memories that fuse several local hints into a global recognition of structure.
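
The idea of steering can be sketched with the simplest textbook case: the first derivative of a Gaussian, whose response at any orientation is a linear combination of two basis responses. This is not the method of [8] and not the highly orientation-selective functions of the abstract, which need more basis functions; it only illustrates the principle. All function names are assumptions made for this sketch.

```python
import math


def gauss(x, y):
    return math.exp(-(x * x + y * y) / 2.0)


def g1(x, y, theta):
    """First derivative of a Gaussian taken along direction theta."""
    return -(x * math.cos(theta) + y * math.sin(theta)) * gauss(x, y)


def convolve_at_center(patch, kernel, radius):
    """Convolve a (2*radius+1)^2 patch with kernel(x, y) at the patch center.

    patch[row][col] holds the image value at x = col - radius, y = row - radius.
    """
    total = 0.0
    for y in range(-radius, radius + 1):
        for x in range(-radius, radius + 1):
            total += kernel(x, y) * patch[radius - y][radius - x]
    return total


def basis_responses(patch, radius):
    """Responses of the two basis filters (orientations 0 and 90 degrees)."""
    rx = convolve_at_center(patch, lambda x, y: g1(x, y, 0.0), radius)
    ry = convolve_at_center(patch, lambda x, y: g1(x, y, math.pi / 2), radius)
    return rx, ry


def steer(rx, ry, theta):
    """Response at orientation theta, obtained without filtering again."""
    return math.cos(theta) * rx + math.sin(theta) * ry


def dominant_orientation(rx, ry):
    """Orientation at which the steered response is maximal."""
    return math.atan2(ry, rx)
```

For a patch containing a linear ramp, `dominant_orientation` recovers the direction of the ramp, and `steer` agrees with filtering the patch directly with `g1` at the same angle.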


Theoretical Foundations of Anisotropic Diffusion in Image Processing


J. Weickert
Universität Kaiserslautern

A frequent problem in low-level vision consists of eliminating noise and small-scale 
details from an image while still preserving or even enhancing the edge structure. 
Nonlinear anisotropic diffusion filtering using an adapted diffusion tensor offers one 
possibility to achieve these goals. We sketch the essential ideas of this technique 
and demonstrate its advantages compared to isotropic and nonlinear diffusion. 
Although exhibiting an edge enhancing potential, the proposed method provides a 
scale-space fulfilling several architectural, information reducing and invariance 
properties. Furthermore, it leads to well-pronounced edges with stable locations 
across a wide range of scales. It is shown that most of the restoration and scale-
space properties carry over from the continuous to the discrete case. Applications 
are presented ranging from preprocessing of medical images and postprocessing of 
numerical results containing fluctuations to visualizing quality-relevant features for 
the grading of wood surfaces and fleece.
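
The trade-off between smoothing and edge preservation can be illustrated in one dimension, where no anisotropy arises. The sketch below is not the tensor-based method of the abstract; it is an explicit scheme for nonlinear diffusion with a scalar diffusivity of Perona-Malik type, compared with linear diffusion. The diffusivity, the contrast parameter k, the step size dt and the function names are assumptions chosen for this example.

```python
def diffusivity(s, k):
    """Scalar diffusivity: close to 1 for small gradients, small for large ones."""
    return 1.0 / (1.0 + (s / k) ** 2)


def diffuse_1d(signal, k, dt=0.25, steps=20, nonlinear=True):
    """Explicit 1-D diffusion with zero-flux (reflecting) boundaries.

    With nonlinear=False the diffusivity is fixed to 1 (linear diffusion).
    """
    u = [float(v) for v in signal]
    n = len(u)
    for _ in range(steps):
        # flux[i] is the flux between samples i and i + 1
        flux = []
        for i in range(n - 1):
            d = u[i + 1] - u[i]
            g = diffusivity(abs(d), k) if nonlinear else 1.0
            flux.append(g * d)
        new_u = u[:]
        for i in range(n):
            right = flux[i] if i < n - 1 else 0.0
            left = flux[i - 1] if i > 0 else 0.0
            new_u[i] = u[i] + dt * (right - left)
        u = new_u
    return u


# A step edge of height 10 with small fluctuations on both sides.
noisy_step = [0, 0, 0.1, 0, 0, 10, 10, 10.1, 10, 10]
smoothed_nonlinear = diffuse_1d(noisy_step, k=1.0)
smoothed_linear = diffuse_1d(noisy_step, k=1.0, nonlinear=False)
```

Because the scheme is written in flux form, the average gray value is preserved; the nonlinear variant keeps most of the step, whereas linear diffusion blurs it.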


Stability and Likelihood of Views of Three Dimensional Objects


Daphna Weinshall, Michael Werman and Naftali Tishby
The Hebrew University of Jerusalem

Can we say anything general about the distribution of two dimensional views of 
general three dimensional objects? In this paper we present a first formal analysis of 
the stability and likelihood of two dimensional views (under weak perspective 
projection) of three dimensional objects. This analysis is useful for various aspects 
of object recognition and database indexing. Examples are Bayesian recognition and 
image interpretation; indexing to a three dimensional database by invariants of two 
dimensional images; the selection of "good" templates that may reduce the 
complexity of correspondence between images and three dimensional objects; and 
ambiguity resolution using generic views.
We show the following results: (1) Both the stability and likelihood of views do not 
depend on the particular distribution of points inside the object; they both depend on 
only three numbers, the three second moments of the object. (2) The most stable and 
the most likely views are the same view, which is the "flattest" view of the object; 
moreover, there is no other view which is even locally the most stable or the most 
likely view. Under orthographic projection, we also show: (3) the distance from 
one image to another does not depend on the position of its viewpoint with respect 
to the object; it depends only on the (geodesic) distance between the viewpoints on 
the viewing sphere. We demonstrate these results with real and simulated data.


Model-free texture segmentation based on distances between first-order statistics


Piero Zamperoni
Technische Universität Braunschweig

This contribution focuses on the key role of the gray value distribution (first-order 
statistics), taken as a whole, for characterizing homogeneous regions of an image 
and for detecting region borders for image segmentation purposes. Where the 
discriminating power of the first-order statistics is not sufficient, information 
on the spatial relationships of the gray values must also be extracted from the 
second-order statistics.
The aim of this study is to develop efficient methods for measuring a "degree of 
diversity" between pairs of symmetrical and equal-sized subwindows of the 
observation window, centered on the current pixel P. The maximum degree of 
diversity, measured among 4 subwindow pairs with different orientations, is the 
edge value attributed to P. Repeating this procedure for all the pixels, one obtains an 
edge map, which represents the first and the most critical step in a segmentation 
process.
For measuring the "degree of diversity", several approaches have been investigated, 
all based upon well-known pattern recognition and statistical methods, for instance:

- Distances (Minkowski, Canberra, Tanimoto, scalar product, and other 
specially developed distances) between the vectors of the rank-ordered gray values 
of the two subwindows;
- Distances (Kolmogorov, Bhattacharyya, Patrick-Fischer) between the estimated 
gray value density functions of the two subwindows;
- Measures of cluster concentration in the gray value space and in the 2-dimensional 
space obtained by considering also a spatial distribution feature;
- Nonparametric Wilcoxon-type two-population tests;
- Distances between the estimated locations and between the estimated scales in the 
paired subwindows.

Experimental results obtained with all the approaches mentioned above, on natural 
images featuring problematic textures (remote sensing, radar, ultrasonic, nuclear 
medicine) and on synthetic images, are shown and illustrated.
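
The edge-value procedure described above can be sketched as follows, using the Minkowski distance between rank-ordered gray values as the "degree of diversity". The abstract does not specify the subwindow geometry; here each of the 4 orientations splits a square observation window into two halves and drops the pixels on the dividing line, which is an assumption of this sketch, as are all function names and the parameters half and p.

```python
# One direction per subwindow pair: horizontal, vertical, two diagonals.
DIRECTIONS = [(0, 1), (1, 0), (1, 1), (1, -1)]


def minkowski_rank_distance(a, b, p=1):
    """Minkowski distance between the rank-ordered (sorted) gray values."""
    return sum(abs(u - v) ** p
               for u, v in zip(sorted(a), sorted(b))) ** (1.0 / p)


def edge_value(image, r, c, half=2, p=1):
    """Maximum degree of diversity over the 4 subwindow pairs around (r, c)."""
    best = 0.0
    for dr, dc in DIRECTIONS:
        side_a, side_b = [], []
        for i in range(-half, half + 1):
            for j in range(-half, half + 1):
                s = i * dr + j * dc  # signed side of the dividing line
                if s > 0:
                    side_a.append(image[r + i][c + j])
                elif s < 0:
                    side_b.append(image[r + i][c + j])
        best = max(best, minkowski_rank_distance(side_a, side_b, p))
    return best


def edge_map(image, half=2, p=1):
    """Edge value for every pixel whose window fits inside the image."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(half, rows - half):
        for c in range(half, cols - half):
            out[r][c] = edge_value(image, r, c, half, p)
    return out
```

Any of the other listed measures could be substituted for `minkowski_rank_distance` without changing the surrounding procedure.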


CITR: last update: 22 April 1998