Theoretical Foundations of Computer Vision

Reinhard Klette, Walter G. Kropatsch, Franc Solina

(editors)

Dagstuhl-Seminar-Report

14.-18.3.1994 (Seminar 9411)

free copies:


	Geschäftsstelle Schloss Dagstuhl
	Universität des Saarlandes
	Postfach 15 11 50
	D-66041 Saarbrücken
	Germany
	e-mail: office@dag.uni-sb.de

Preface

This workshop is the seventh on this topic (Weissig 1982, 1984, 1988; Mirow/Fleeth 1986; Mägdesprung 1990; Buckow/Märkische Schweiz 1992). As usual, there were no restrictions on content as long as theoretical issues were selected for presentation. Some previous workshops had a specified main topic such as "Digital Geometry" or "AI and Vision".
This seventh workshop covered a broad range of fields: Active Vision, Shape Reconstruction, Segmentation, Invariance, Models, Morphology, Digital Theory, Image Processing and Applications. Even so, all participants showed an intense interest in these diverse fields throughout the workshop. Several issues were identified as being of essential interest, e.g. the need for nonlinear operators, the transition between analog and discrete representations, and the integration of visual processing with decision making. These issues were addressed in the oral presentations and were intensively discussed during the week in one of the comfortable rooms of the castle.
Last but not least, the workshop brought together scientists from Western and Eastern Europe as well as the US, which helped participants to understand the different views in the vision community.

Reinhard Klette, Franc Solina, Walter G. Kropatsch

What is Life after Active Vision?


Ruzena Bajcsy
University of Pennsylvania

Vision/Perception is NOT "l'art pour l'art"; that is, it is not for its own sake. Vision
serves a PURPOSE/TASK.
Typically we consider the following tasks:
1. Vision for Manipulation
2. Vision for Mobility
3. Vision for Recognition
4. Vision for Communication.
Tasks 1 and 2 are denoted as WHERE questions and tasks 3 and 4 as WHAT questions.
We consider the question of Representation to be the most important problem in
Visual Perception. In turn, Representation implies the selection and construction of
MODELS.
Models must be on many different levels:
1. Sensory level: models of transduction mechanism;
models of radiometric effects that result from interaction between 
the observer, light and the scene;
geometric model of the optics;
2. Signal level: filters linear and non-linear;
3. Topological and geometric level of the objects and scene;
4. Material properties as they can be extracted from color and motion;
5. Kinematic properties, such as movable parts;
6. Identification of Dynamic Systems, such as fluids and flexible materials;
7. Models of Functionality;

Open Problems:
1. What is observable from Vision only?
If we can answer this question, it will imply what we must assume or measure in 
order to have a completely identifiable system.

2. The world is continuous with natural discontinuities.
An open question is how to identify these discontinuities.
This is the classical problem of SIGNAL to SYMBOL conversion.
The added difficulty is that this conversion must not be static but be able to 
dynamically change modulo task and context.

3. Biological systems are redundant, non-orthogonal, partially overlapped in their 
functionality, partially dependent and correlated.
Engineering systems, on the other hand, are orthogonal, independent and
uncorrelated.

What is needed is a new calculus of non-orthogonal partially dependent systems.

Filtering for Feature Extraction

Tatjana Belikova
Russian Academy of Sciences

The task of extracting and locating objects on a complex background is
considered. Several models of objects known up to their random parameters are
proposed. They were used to develop linear filters that are optimal by the criteria of
least mean square error and maximum signal-to-noise (s/n) ratio, in order to extract
objects on a complex background or to improve the s/n ratio. The output of the latter
filter was used to discriminate the points where objects are located. Parametric and
nonparametric estimates of the signal values were used for this purpose.
For parametric estimation we used maximum likelihood estimation to distinguish pixel
values belonging to two different components that have different mean and deviation
values. In nonparametric estimation we used an analysis of order statistics
(rank-ordered local gray values) to estimate the mean value of each component and
to reduce the deviation of the object and background signals. These methods were
helpful in extracting and locating microcalcifications for early breast cancer diagnosis.

GLDH Based Analysis of Texture Anisotropy and Symmetry: an Experimental Study

Dimitry Chetverikov
Hungarian Academy of Sciences

Recently, growing attention has been paid to the investigation of oriented 
(anisotropic) textures. This interest has been supported by the discovery of the
important role played by a few dominant high-level texture features, including
directionality, in attentive perception of texture patterns by humans. Co-occurrence
probability matrix (CPM) and gray-level difference histogram (GLDH) based 
features have been traditionally viewed as powerful texture analysis tools that are, 
however, less suitable for detailed anisotropy analysis because for small discrete 
spacing one cannot set fine angular resolution necessary for detailed directionality 
analysis. We propose a straightforward and computationally efficient extension of 
CPM and GLDH to arbitrary angle and spacing and apply the extended GLDH 
features to the analysis of texture anisotropy. Furthermore, we consider the 
possibility of investigating the symmetry of a texture pattern via the symmetry 
properties of a polar diagram (anisotropy indicatrix) describing the anisotropy of the 
pattern. Results of pilot experiments with real-world textures are shown and 
directions of further research discussed.
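
As a rough illustration of the classical feature this abstract starts from, the following sketch computes a gray-level difference histogram for a single integer displacement (dx, dy). It does not implement the proposed extension to arbitrary angle and spacing; the function name `gldh` and its parameters are illustrative assumptions, not taken from the talk.

```python
import numpy as np

def gldh(image, dx, dy, levels=256):
    """Normalized histogram of |I(x, y) - I(x + dx, y + dy)| over all valid pixel pairs."""
    img = np.asarray(image, dtype=np.int64)
    h, w = img.shape
    # Crop two views so that a[i, j] and b[i, j] are displaced by (dx, dy).
    y0, y1 = max(0, -dy), min(h, h - dy)
    x0, x1 = max(0, -dx), min(w, w - dx)
    a = img[y0:y1, x0:x1]
    b = img[y0 + dy:y1 + dy, x0 + dx:x1 + dx]
    diff = np.abs(a - b).ravel()
    hist = np.bincount(diff, minlength=levels).astype(float)
    return hist / hist.sum()
```

For a strongly oriented pattern such as vertical stripes, the histogram concentrates at zero for displacements along the stripes and at the stripe contrast for displacements across them, which is the kind of direction dependence an anisotropy analysis exploits.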

Issues in attentive visual motion processing

Konstantinos Daniilidis
Christian-Albrechts University Kiel

Attentive vision encompasses selective sensing in space, time, and resolution. 
Decreasing space and time complexity by selection arises as a practical necessity in
building vision systems able to sense and act in real time. Attention does not only 
mean the control of the degrees of freedom of the sensorial apparatus. It necessitates 
selection of the appropriate representation as well as of the proper state subspace in 
order to accomplish a specific task in a more efficient and robust manner.
We do not discuss here how attention is achieved: what to select and how to design 
an oculomotor control loop. Our interest is in the benefits of attention concerning
the accomplishment of a motion related task. We concentrate on two aspects of 
attention regarding motion: fixation and space-variant polar and log-polar 
representation. Overcoming the field of view and bounding the retinal velocity of a 
moving object are obvious advantages of holding the gaze fixated on a moving 
object. We show that fixation enables an object-centered representation for the 
solution of the structure from motion problem. This representation stabilizes the
estimation of lateral object translation that is confounded with the rotation in a 
camera-centered representation. Furthermore, fixation enables the use of scaled 
orthography for a distant object, leading, thus, to an affine motion field. Building 
upon existing methods we show how the direction of translation can be obtained 
from the oculomotor control inputs (camera rotation), which is supported by theories
on efference copy and positive feedback. The introduction of the log-polar 
representation decouples the translation along from the rotation about the optical 
axis. We show - in contrast to existing results - that an already known function of the
local motion parallax depends on the local slope of the surface. Furthermore, it turns 
out that the advantages regarding motion estimation are not in the logarithmic but in 
the polar nature of the space-variant representation. However, a log-polar 
transformation of the motion field facilitates independent motion detection if the 
observer is frontally translating.
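
The decoupling mentioned above can be illustrated with a minimal sketch, assuming an idealized image plane with the optical axis at the origin: a pure scaling of image coordinates changes only the log-radius, and a pure rotation about the axis changes only the angle. The function names are illustrative assumptions.

```python
import math

def to_log_polar(x, y):
    """Map an image point (x, y) != (0, 0) to (log r, theta)."""
    return math.log(math.hypot(x, y)), math.atan2(y, x)

def scale_and_rotate(x, y, s, phi):
    """Scale by s > 0 and rotate by phi about the origin (the optical axis)."""
    c, sn = math.cos(phi), math.sin(phi)
    return s * (c * x - sn * y), s * (sn * x + c * y)
```

In this representation the two transformations act as independent shifts along the two coordinate axes, apart from the wrap-around of the angle at plus or minus pi.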

Thin Sets

Ulrich Eckhardt*, Longin Latecki* and Albrecht Hübler**
* University Hamburg; ** Wolfsburg

There are mainly three reasons for dealing with thin subsets of the digital plane Z^2:
- Such sets are generated by algorithms for thinning binary images,
- Thin sets are discrete analogs of curves in the plane,
- In order to understand 3-D structures and algorithms it becomes necessary to
revisit critically the known 2D theory.
First we classify digital sets which are considered to be "thin" in some sense. There
are 8- and 4-curves, contours (oriented boundaries of digital sets) and so-called graph
sets. It could be shown quite recently (Latecki, Eckhardt, Rosenfeld, 1994) that
under rather mild conditions each digital set can be reduced to a topologically
equivalent graph set.
It is also attractive to investigate families of thin sets. This is important for studying
digital analogs of circles (or equivalently, rotations of the digital plane), for defining
level lines in gray-scale pictures, and for morphological operations. Specifically,
one may ask under which conditions erosion is the inverse operation to dilation.
One result of these investigations is a complete classification of simple and of
nonsingular coverings of the digital plane by 8- (or 4-) curves and also a
classification of singular points with respect to morphological operations (Eckhardt,
Hübler, 1993). These latter points lead to a "morphological skeleton" of a digital set
which has the property of exact reconstructability but generally does not have the same
topological properties as the original set.

Moment-Based Features for Description and Recognition of Blurred Images

Jan Flusser, Tomas Suk and Stanislav Saic
Academy of Sciences of the Czech Republic

The paper is devoted to the feature-based recognition of blurred images, acquired by
a linear shift-invariant imaging system, against an image database. The proposed
approach consists of describing images by features which are invariant with respect
to blur (that is, with respect to the system PSF) and recognizing images in the
feature space. In contrast to the complicated and time-consuming "blind
restoration" approach, we do not need PSF identification and image restoration.
Thanks to this, our approach is much more effective.
Two sets of invariants based on image moments are introduced in this paper: one
set for symmetric blur, the other for linear motion blur. The derivation of the
invariants is a major theoretical result of the paper.
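
The idea of blur-invariant moment features can be checked numerically on one elementary case. The sketch below is not the invariant set derived in the paper; it only assumes the simplest setting, a centrosymmetric PSF that sums to one, in which the third-order central moments of an image are unchanged by the blur while the second-order ones are not. All function names are illustrative.

```python
import numpy as np

def central_moment(img, p, q):
    """Central moment mu_pq of a gray-level image (x = column index, y = row index)."""
    img = np.asarray(img, dtype=float)
    ys, xs = np.mgrid[:img.shape[0], :img.shape[1]]
    m00 = img.sum()
    xc = (xs * img).sum() / m00
    yc = (ys * img).sum() / m00
    return (((xs - xc) ** p) * ((ys - yc) ** q) * img).sum()

def convolve_full(img, psf):
    """Full 2D convolution (output grows by the PSF size), written with plain loops."""
    img = np.asarray(img, dtype=float)
    psf = np.asarray(psf, dtype=float)
    out = np.zeros((img.shape[0] + psf.shape[0] - 1,
                    img.shape[1] + psf.shape[1] - 1))
    for i in range(psf.shape[0]):
        for j in range(psf.shape[1]):
            out[i:i + img.shape[0], j:j + img.shape[1]] += psf[i, j] * img
    return out
```

Because such features can be compared directly in feature space, no estimate of the PSF is needed, which is the point made in the abstract.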

3D Scene reconstruction using a regional Approach

Andre Gagalowicz
INRIA-Rocquencourt

We discuss the problem of 3D indoor scene interpretation from an a priori given
stereo pair of images. We stress the importance of the existence of an a priori given
model of the 3D space and only study the case of a global model of this space. The
proposed method consists in the use of a cooperative analysis/synthesis technique:
an analysis (vision) task proposes a 3D complete model of the 3D scene
incorporating geometric and photometric information. A synthesis algorithm is run
afterwards, using the 3D complete model as input, and produces a synthetic stereo
pair of the portion of this model possibly seen by the left and right camera. The
difference between the natural and synthetic stereo pair is used to produce a better
"complete" model. We first consider the "learning" phase, in which we incorporate a
model to interactively and visually construct and control the global model of the
3D space. In the analysis part, we discuss the construction of a pipeline involving
image segmentation, region matching, stereo reconstruction, and geometric and
photometric interpretation of the "scene", leading to the construction of a "good"
complete model of the part of the 3D scene available in the stereo pair. An extension to
the case of a local vision problem involving an active procedure is briefly discussed
as a conclusion.

Symmetric Bi- and Trinocular Stereo

Georgy Gimel'farb
Academy of Sciences of the Ukraine

Tradeoffs between theoretically justified and heuristic sides of the symmetric
approach to intensity-based computational stereo are discussed. Under this approach
a desired continuous optical surface is reconstructed from given stereo images as a
bunch of epipolar profiles, each profile being obtained by maximizing a measure of
similarity between intensities in the images and ortho-image (estimated coloring) of
reconstructed surface points using dynamic programming (DP) techniques. In our
previous papers this measure was deduced primarily under a simple Bayesian
maximal-a-posteriori-probability (MAP) decision using probability models relating
the profile coloring to corresponding intensities with due account of symmetries
between the stereo images, independent allowable distortions of the images,
possible discontinuities in each image because of partial occlusions of the surface,
etc.
Computational stereo belongs to the domain of ill-posed inverse photometric
problems because, in principle, many surfaces are consistent with the same stereo pair
or triple of images. So it is impossible to reconstruct precisely the real surface which
produced the obtained stereo images. Nonetheless, some theoretical models and
heuristics can be introduced to bring the reconstructed surface close enough to the
one perceived visually from the sole stereo pair or triple (or, equivalently, to
approach human vision accuracy under this very restrictive condition).
The theoretical base of computational stereo can be refined by modeling the profile
geometry to describe more or less probable surface shapes, by deducing compound
Bayesian decisions that are more adequate for solving stereo problems than the
traditional MAP decision and can be realized by similar DP techniques, and by
introducing a unified scheme of symmetric bi- and trinocular stereo.
But to cope with discontinuities in the images, some suitable heuristics are necessary
for estimating the coloring at the monocularly visible points of the surface and for
defining signal similarity for them. Rather good experimental results for
real stereo pairs have been obtained with the similarity measure being a weighted
linear combination of two similar measures: one between the intensities in the stereo
images and the estimated surface coloring, and one between the two rectified stereo
images themselves.

Decision Algorithms for Model-Based Vision Problems

Gregory Hager
Yale University

Many vision problems reduce to the problem of making a decision expressed as
inequality constraints on an appropriate parametric model. This talk formalizes this
class of problems and then presents an algorithm that is correct and complete for
them. This algorithm is then extended to cover problems involving segmentation
and to address unstructured problems. Finally, it is observed that the use of
low-level spatial organization processes is crucial for the effective use of these
algorithms.

Improvement of the Curvature Computation

Vaclav Hlavac, Tomas Pajdla, Milos Sommer
Czech Technical University

An improvement in the computation of curvature for digitized curves was presented.
The standard scheme, i.e. computing curvature by convolution with a
truncated Gaussian, was studied. First, we show that the systematic bias caused by
curvature smoothing can be removed. Second, we demonstrate that about 25% of
the error has its roots in other phenomena (namely anisotropy of the raster, limited
size of the Gaussian, numerical integration of the convolution, and discretization).
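
The standard scheme and its systematic bias can be reproduced with a minimal sketch, assuming a closed curve given as sampled (not grid-quantized) coordinates and a periodic convolution with a Gaussian truncated at three standard deviations. The correction proposed in the talk is not implemented here; the function name and the truncation radius are illustrative assumptions.

```python
import numpy as np

def smoothed_curvature(x, y, sigma):
    """Curvature of a closed sampled curve after periodic smoothing with a truncated Gaussian."""
    radius = int(3 * sigma)
    offsets = np.arange(-radius, radius + 1)
    weights = np.exp(-0.5 * (offsets / sigma) ** 2)
    weights /= weights.sum()

    def smooth(v):
        # Periodic convolution: the curve is closed, so samples wrap around.
        return sum(w * np.roll(v, int(k)) for w, k in zip(weights, offsets))

    xs = smooth(np.asarray(x, dtype=float))
    ys = smooth(np.asarray(y, dtype=float))
    # Central differences with respect to the sample index.
    dx = (np.roll(xs, -1) - np.roll(xs, 1)) / 2.0
    dy = (np.roll(ys, -1) - np.roll(ys, 1)) / 2.0
    ddx = np.roll(xs, -1) - 2.0 * xs + np.roll(xs, 1)
    ddy = np.roll(ys, -1) - 2.0 * ys + np.roll(ys, 1)
    return (dx * ddy - dy * ddx) / (dx * dx + dy * dy) ** 1.5
```

For a circle of radius R the smoothing shrinks the curve slightly, so the estimated curvature comes out a little larger than 1/R; this is one simple instance of a smoothing-induced bias.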

Information technologies for image processing in real-time

Volodymyr Hrytsyk
Academy of Sciences of the Ukraine

An important unsolved problem in the analysis of complex scenes with motion is
real-time implementation. An approach to this problem, based on mathematical
models, a method of fast feature calculation, and control, is proposed for dynamic
images and complex scenes. Theorems are given which determine a constructive method
for the synthesis of neuron-like and systolic computing structures, allowing
the real-time implementation of recursive and parallel algorithms for image
processing and scene analysis. The high efficiency of recursive-parallel systems for
image processing is demonstrated.

Computation of mosaic images using an approximate 3D model

Pascal Jaillon, Annick Montanvert
TIMC-IMAG Grenoble

Image mosaicking consists in fusing images acquired from different places to build a
global view of a scene. Diverse techniques provide mosaic images for satellite
applications or painting reconstructions. We propose a method to mosaic images
lying on three-dimensional surfaces, avoiding the computation of a 3D model of the
surface. A coarse model of the surface and the parameters of the projection
(acquisition view point, optical axis) are used to flatten the images. Then the images
are merged with a 2D mosaicking technique.
Finally, the resulting 2D image is mapped onto the estimate of the 3D surface. This
allows visualization from any view point.
Such an approach can take the perspective distortion into account, so that
discontinuities along the junction line are reduced.
Depending on the application, we propose to apply corrections to the original images or
to Laplacian images.
This mosaicking strategy is applied to satellite images, to paintings on vaults, and
in microscopy.

Shape Reconstruction for Central Projection

Reinhard Klette
Technical University Berlin

The talk deals with the geometric models used for shape reconstruction where a
rotating disc is used in front of a pinhole camera. For camera calibration the method
by [Tsai 1986] was implemented and optimized (e.g. with respect to the number of
calibration planes and points in each plane). It is described how to use the
calibration results for shape reconstruction for the case of objects on the rotating
disc where motion vectors are used as input.
Three approaches were studied. For the point-based approach, the computation of
accurate dense motion fields is the critical issue. An extensive evaluation of
differential methods for optical flow computation was performed. On the other
hand, shape reconstruction based on (assumed) accurate optical flow fields could be
realized very precisely.
For the feature-based approach (e.g. edges by the Canny operator), the epipolar
constraint of stereo cameras was modified for the rotating disc. Depth may be
reconstructed at traced feature points, and even the rotation angle may be calculated
by tracing one (!) point in two consecutive images using the calibration results. For
the region-based approach, integrative constraints for shape reconstruction proved
to be numerically quite unstable. Several theoretical results (e.g. shape from area and
centroids of corresponding regions) were derived for future implementation.

Color Vision for Stereo Correspondence

Andreas Koschan
Technical University Berlin

Problem solving in digital image processing without color information is sometimes
difficult or even impossible, for example in the following cases: highlight
detection, correspondence analysis in stereo images, image segmentation, etc. On
the other hand, the necessity for color research often arises directly from the
application (e.g., identification of color codes on resistors, food analysis, traffic sign
recognition, etc.). In this paper it is shown that stereo matching results can be
considerably improved when using color information. The Block Matching
technique has been extended to the so-called Chromatic Block Matching technique
because of its efficiency already shown for gray value images. Furthermore, it has
been shown that results can be further improved when employing the I1I2I3 color
space instead of the RGB solid. No significant influence of the color
measures on the results has been found yet. In summary, we believe that precise
dense depth maps can be obtained more easily when applying this Chromatic Block
Matching technique to color stereo images.
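
A minimal sketch of color block matching follows. The abstract does not define the I1I2I3 space; the sketch assumes the commonly cited definition by Ohta et al., I1 = (R+G+B)/3, I2 = (R-B)/2, I3 = (2G-R-B)/4. The block cost (mean squared color distance) and the function names are illustrative assumptions and need not coincide with the color measures used in the paper.

```python
import numpy as np

def rgb_to_i1i2i3(rgb):
    """Convert an (..., 3) RGB array to I1I2I3 (assumed Ohta definition)."""
    rgb = np.asarray(rgb, dtype=float)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return np.stack([(r + g + b) / 3.0, (r - b) / 2.0, (2.0 * g - r - b) / 4.0], axis=-1)

def block_cost(block_a, block_b):
    """Mean squared color distance between two equally sized color blocks."""
    d = np.asarray(block_a, dtype=float) - np.asarray(block_b, dtype=float)
    return float(np.mean(np.sum(d * d, axis=-1)))

def best_disparity(left, right, row, col, half, max_disp):
    """Disparity d in [0, max_disp] whose right block at column col - d best matches the left block."""
    ref = left[row - half:row + half + 1, col - half:col + half + 1]
    costs = []
    for d in range(max_disp + 1):
        c = col - d
        if c - half < 0:  # candidate block would leave the image
            break
        cand = right[row - half:row + half + 1, c - half:c + half + 1]
        costs.append(block_cost(ref, cand))
    return int(np.argmin(costs))
```

The same search runs unchanged on RGB or on I1I2I3 images, so the effect of the color space can be compared by converting both images before matching.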

Optimal Statistical Filtering of gray-level Images

Vladimir Kovalevsky
Technische Fachhochschule Berlin

A method for eliminating random noise is suggested which works as
follows. The distribution of the gray values in a sliding window is
approximated by up to four normal distributions. Parameters of the distributions and
the probability Pk(g) that a gray value g belongs to distribution k are
estimated by an iteration method suggested by Schlesinger many years ago. Then
the gray value of the central pixel of the window in the output image is set equal to
the mean value of one of the normal distributions. This distribution is selected
according to the maximum a posteriori probability Pk(g). The results of the
filtering are compared with those of the sigma filter.

Properties of Pyramidal Representations

Walter G. Kropatsch
Technical University of Vienna

The categorization of different components generalizes the classical concept of
image pyramids and provides a powerful tool for efficient image analysis. Three
different components of image pyramids are distinguished: their structure, the
contents of their cells and the processes that operate on them. Different applications
impose different requirements on the processing of the data. There are several
engineering decisions to be made. The properties of the three different components
of a pyramidal system are discussed and illustrated by examples. New results and
research trends give an overview of the current state of the art.

Robust Recovery of Structures in Images

Ales Leonardis
University of Ljubljana

The significance of detecting geometric parametric structures has long been realized
in the vision community. In this paper, a reliable and efficient method for extracting
geometric parametric structures is presented. The method consists of two intertwined
procedures, namely model recovery and model selection. The first procedure
systematically recovers parametric models in an image, creating a redundant set of
possible descriptions, while the model-selection procedure searches among them to
produce an optimal result in terms of the objective function.
The reliability of the recovery procedure which builds the parametric models is
ensured by an iterative procedure through simultaneous performance of data
classification and parameter estimation. The overall relative insensitivity to noise
and minor changes in the input data is achieved by considering many competitive
solutions and selecting those that produce the simplest description. The selection
procedure is defined as a quadratic Boolean problem, and the solution is sought by
the WTA (winner-takes-all) technique.
The presented method is efficient for two reasons: firstly, it is designed as a search
which utilizes intermediate results as a guidance toward the final result, and
secondly, it combines model recovery and model selection in a computationally
efficient procedure. The proposed method proved to be successful for recovering
parametric surface models and volumetric models (superquadrics) in range images
and parametric curve models in edge images.

A structure-probabilistic approach to edge detection
and adaptive filtering

Roman M. Palenichka
Academy of Sciences of the Ukraine

An edge detection operator can be efficiently used to control image segmentation and
filtering. For this purpose an approach based on direct estimation of
edge probability is proposed. The approach considers a two-stage detection procedure.
At the first stage an image segment is tested for uniformity by evaluating the
probability that the segment is uniform (smooth). If this segment is non-smooth, the
second test should be applied, during which the edge probability is evaluated. The edge
position can be selected as the point of maximal computed probability in a
given neighborhood. This method can be successfully used for binary segmentation
as a thresholding procedure with a floating threshold. The updated threshold value at
each point depends on its value at the previous point as well as on
the edge probability at this point. This approach is based on a structural
mathematical image model composed of two components. The first one is the
intensity trend and the second component represents fluctuations. For fast
implementation of this method, a fast recursive algorithm is proposed to calculate
such local image features as mean value, median and variance.
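
The idea of recursive computation of local features can be sketched in one dimension: when the window slides by one sample, the running sums are updated with the entering and leaving samples instead of being recomputed. This is a generic running-sum scheme under that assumption, not the algorithm of the talk, and the median is not covered; the function name is illustrative.

```python
def running_mean_var(signal, window):
    """Mean and population variance of every full window, updated recursively."""
    if window < 1 or window > len(signal):
        raise ValueError("window must be between 1 and len(signal)")
    s = sum(signal[:window])                    # running sum
    s2 = sum(v * v for v in signal[:window])    # running sum of squares
    means, variances = [], []
    for i in range(window, len(signal) + 1):
        mean = s / window
        means.append(mean)
        variances.append(max(s2 / window - mean * mean, 0.0))
        if i < len(signal):
            # Slide the window: add the entering sample, drop the leaving one.
            s += signal[i] - signal[i - window]
            s2 += signal[i] ** 2 - signal[i - window] ** 2
    return means, variances
```

Each window position costs a constant number of operations regardless of the window size, which is what makes such local features cheap enough for a floating threshold.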

An Incremental Learning System for Interpretation of Images

Petra Perner*, Walter Pätzold**
*HWTK Leipzig, **KWT Dresden

Defect classification by image-based techniques is an important issue in quality
assurance and nondestructive testing. The solution of the problem is usually
complex and context dependent. A domain-specific interpretation of the problem is
required. Thus, the acquisition, representation and use of problem-specific
knowledge in combination with image processing facilities is a central point. The
main problem in defect classification arises because generalized knowledge is mostly
lacking. Therefore knowledge-based techniques are necessary which can work
on single instances of the problem domain and learn new instances during the
use of the system. This leads to the case-based reasoning paradigm. In the paper, we
propose an architecture for a case-based reasoning system for image interpretation.
We discuss our approach for the problem domain of ultrasonic image interpretation.
The application is characterized by structural representation. The signal-to-symbol
transformation for spatial knowledge is described as well as the case representation.
For the determination of similarity between two structural representations we
propose structural similarity and describe the algorithm for calculating it.
For the case base, we chose to use a hierarchical representation. Therefore, we
developed an algorithm which can incrementally learn this hierarchical representation.

Computer Vision and Mathematical Morphology

Jos Roerdink
University of Groningen

An extension of mathematical morphology is investigated incorporating symmetry
and invariance concepts essential for computer vision applications.
Classical morphology uses image transformations which are translation invariant. In
many applications other forms of symmetry are involved, for example polar
symmetry or invariance under perspective transformations. We extend morphology
to such cases by considering any homogeneous space (G, X), where X is a set and G a
group acting transitively on X. Morphological transformations can then be
constructed as mappings of the Boolean algebra P(X) (the power set of X) to itself,
which are invariant under the group G. Examples are the plane with the Euclidean
motion group or the sphere with the rotation group. We discuss how this might be
used for introducing projective invariance in morphology. Finally, a sketch is given
of a preliminary attempt at morphological description of 3D surfaces by using
concepts from differential geometry.

Globally convergent nonlinear diffusion networks for early vision

Christoph Schnörr, Rainer Sprengel
University of Hamburg

A class of minimization problems is considered to model nonlinear, transition
preserving data-reduction processes for early vision. These problems are
non-discretely formulated and always have a unique minimizing solution that
continuously depends on the data. Approximate solutions based on the Galerkin
method converge as the discretization parameter tends to zero.
The computation of approximate solutions can be done by a globally convergent
and highly parallel relaxation procedure or, in principle, by a globally,
asymptotically stable analog network.
The relationship of a prototype minimization problem to the nonlinear diffusion
approach of Perona and Malik and its variational formulation due to Nordström is
discussed.
It is shown that the parametrization of our diffusion coefficient can be used to
control the trade-off between smoothing and preserving data transitions. The
stability of the localization of data transitions in parameter space is demonstrated,
and a criterion to select these transitions is presented.
Finally, the application of the general principle to locally computed motion data is
considered, and two corresponding functionals and corresponding numerical results
are discussed.
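
For orientation, the nonlinear diffusion approach of Perona and Malik referred to above can be sketched in one dimension with an explicit scheme. This is the standard Perona-Malik iteration with diffusivity g(s) = 1 / (1 + (s/k)^2), not the globally convergent method of the talk; the parameter names `k` and `dt` and the function name are illustrative assumptions.

```python
import numpy as np

def perona_malik_1d(u, n_steps, k, dt=0.2):
    """Explicit 1D Perona-Malik diffusion with zero-flux boundaries."""
    u = np.asarray(u, dtype=float).copy()
    for _ in range(n_steps):
        grad = np.diff(u)                        # forward differences u[i+1] - u[i]
        flux = grad / (1.0 + (grad / k) ** 2)    # diffusivity times gradient
        u[:-1] += dt * flux                      # each sample gains from its right neighbor
        u[1:] -= dt * flux                       # and the neighbor loses the same amount
    return u
```

The contrast parameter k plays the role of the trade-off discussed in the abstract: differences well below k are smoothed like noise, while differences well above k are treated as transitions and largely preserved.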

Banach Constructor and Image Compression

Wladyslaw Skarbak
Polish Academy of Sciences

The Banach constructor is defined as a concept unifying special cases of
deterministic fractal modeling. The fractal compression of digital images is
presented as a Banach constructor defined by a patchwork. The patchwork concept
is a formal mathematical model which allows for a compact definition of the
fractal operator, specification of a condition for its contractivity (for all u norms,
1 <= u <= infinity), and formulation of conditions ensuring the required fidelity of the
reconstructed image. The fast fractal compression algorithm (FFC) is based on
patchworks which are affine (with contrast and scaling fixed), sparse, and local.
While the known fractal compression schemes (Jacquin, Jacobs et al., Barnsley)
require an encoding time 100-1000 times greater than the decoding time, FFC gives
high quality images of natural scenes with this ratio not exceeding 10.
Formulas for affine, contrast-fixed transforms which perform the best fit of
two digital patches are given for Minkowski u norms with u = 1, u = 2, and u = infinity.
Experiments confirm the superiority of the quadratic norm in the quality-time tradeoff.

On Separability Problems in Computational Geometry
and their Applications

F. Sloboda, B. Zatko
Slovak Academy of Sciences

Properties of the external and internal shortest path of a simple polygon were
described. Further properties of the shortest path in a polygonally bounded compact
set were described, and the separability problem of two disjoint polygons was
investigated.

Segmentation with Volumetric Models

Franc Solina
University of Ljubljana

A new approach to reliable and efficient recovery of part-level descriptions from
range images is presented. It is shown that a set of superquadric volumetric models
can be directly recovered from unsegmented range data. Superquadric models are an
extension of ellipsoids that cover a continuum of shapes including parallelepipeds
and cylinders as well. The approach is based on the recover-and-select paradigm by
Leonardis that consists of two intertwined processes: model recovery and model
selection. In the model recovery process a redundant set of superquadrics is initiated
in the image and allowed to grow. Recovered models are selected using an MDL-like
criterion which results in the simplest overall description.

Steerable Filters for Attentive Visual Processing

Gerald Sommer* and Markus Michaelis**
*Christian-Albrechts-Universität Kiel
**GSF-MEDIS-Institut, Neuherberg

Junctions of lines or edges are important visual cues in various fields of computer
vision. They are characterized by the existence of more than one orientation at one
single point, the so called keypoint. In this work we investigate the performance of
highly orientation selective functions to detect multiple orientations and to
characterize junctions. A quadrate pair of functions. A quadrature pair of functions
is used to detect lines as well as edges and to distinguish between them. An
associated one-sided function with an angular periodicity of 360� can distinguish
between terminating and non-terminating lines and edges which constitute the
junctions. To calculate the response of these functions in a continuum of
orientations and scales a method is used that was introduced recently by P. Perona
[8].
These functions are called steerable filters. They seem to be the natural kind of
operators that can be adapted in their degrees of freedom in the attention
stage under the control of a task. It is shown that their response to local structures
can be used to make the recognized structures explicit. This is the route taken in
knowledge based computer vision. In behavior based systems (attentive systems)
these responses are used as implicit representations which constitute the input to
associative memories to fuse several local hints to a global recognition of structure.
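
The idea of steering can be illustrated with a much simpler filter than the
junction detectors described above. The sketch below uses the classical
first derivative of a Gaussian, which can be synthesized at any orientation from
two fixed basis kernels. It is not the authors' filter family and not Perona's
method; all function names and parameter values are hypothetical.

```python
import numpy as np

def gaussian_derivative_basis(size=9, sigma=1.5):
    """Return the x- and y-derivative-of-Gaussian kernels plus the sampling grid."""
    r = size // 2
    y, x = np.mgrid[-r:r + 1, -r:r + 1].astype(float)
    g = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    gx = -x / sigma**2 * g  # derivative along x
    gy = -y / sigma**2 * g  # derivative along y
    return gx, gy, x, y, g

def steer(gx, gy, theta):
    """Synthesize the kernel at angle theta from the two basis kernels."""
    return np.cos(theta) * gx + np.sin(theta) * gy

def directional_kernel(x, y, g, sigma, theta):
    """Directly sample the derivative of the Gaussian along direction theta."""
    u = x * np.cos(theta) + y * np.sin(theta)
    return -u / sigma**2 * g

gx, gy, x, y, g = gaussian_derivative_basis(size=9, sigma=1.5)
theta = np.deg2rad(30.0)
print(np.allclose(steer(gx, gy, theta), directional_kernel(x, y, g, 1.5, theta)))
```

Because filtering is linear, the same weights cos(theta) and sin(theta) can be
applied to the two basis responses of an image, so a continuum of orientations
is obtained from only two filtering passes in this simple case.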

Theoretical Foundations of Anisotropic Diffusion
in Image Processing

J. Weickert
Universität Kaiserslautern

A frequent problem in low level vision consists of eliminating noise and small-scale
details from an image while still preserving or even enhancing the edge structure.
Nonlinear anisotropic diffusion filtering using an adapted diffusion tensor offers one
possibility to achieve these goals. We sketch the essential ideas of this technique
and demonstrate its advantages compared to isotropic and nonlinear diffusion.
Although exhibiting an edge enhancing potential, the proposed method provides a
scale-space fulfilling several architectural, information reducing and invariance
properties. Furthermore, it leads to well-pronounced edges with stable locations
across a wide range of scales. It is shown that most of the restoration and scale-
space properties carry over from the continuous to the discrete case. Applications
are presented ranging from preprocessing of medical images and postprocessing of
numerical results containing fluctuations to visualizing quality relevant features for
the grading of wood surfaces and fleece.
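
For orientation, diffusion filters of this kind are commonly written as an
evolution equation in divergence form. The notation below (original image f,
evolving image u, Gaussian-smoothed image u_sigma, diffusion tensor D) is an
assumption introduced here for illustration and is not taken from the abstract.

```latex
\[
  \partial_t u = \operatorname{div}\bigl( D(\nabla u_\sigma)\, \nabla u \bigr),
  \qquad u(x, 0) = f(x)
\]
```

If D is a scalar diffusivity times the identity matrix, the filter is isotropic:
linear when the diffusivity is constant, nonlinear when it depends on the image.
A genuinely matrix-valued, positive definite D makes the filter anisotropic, so
that smoothing across an edge can be reduced while smoothing along it continues.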

Stability and Likelihood of Views of Three Dimensional Objects

Daphna Weinshall, Michael Werman and Naftali Tishby
The Hebrew University of Jerusalem

Can we say anything general about the distribution of two dimensional views of
general three dimensional objects? In this paper we present a first formal analysis of
the stability and likelihood of two dimensional views (under weak perspective
projection) of three dimensional objects. This analysis is useful for various aspects
of object recognition and database indexing. Examples are Bayesian recognition and
image interpretation; indexing to a three dimensional database by invariants of two
dimensional images; the selection of "good" templates that may reduce the
complexity of correspondence between images and three dimensional objects; and
ambiguity resolution using generic views.
We show the following results: (1) Both the stability and likelihood of views do not
depend on the particular distribution of points inside the object; they both depend on
only three numbers, the three second moments of the object. (2) The most stable and
the most likely views are the same view, which is the "flattest" view of the object;
moreover, there is no other view which is even locally the most stable or the most
likely view. Under orthographic projection, we also show: (3) the distance between
one image and another does not depend on the position of its viewpoint with respect
to the object; it depends only on the (geodesic) distance between the viewpoints on
the viewing sphere. We demonstrate these results with real and simulated data.

Model-free texture segmentation based on distances
between first-order statistics

Piero Zamperoni
Technische Universität Braunschweig

This contribution focuses on the key role of the gray value distribution (first-order
statistics), taken as a whole, for characterizing homogeneous regions of an image
and for detecting region borders for image segmentation purposes. Where the
discriminating power of the first-order statistics is not sufficient, information
on the gray values' spatial relationships must also be extracted from the second-order
statistics.
The aim of this study is to develop efficient methods for measuring a "degree of
diversity" between pairs of symmetrical and equal-sized subwindows of the
observation window, centered on the current pixel P. The maximum degree of
diversity, measured among 4 subwindow pairs with different orientations, is the
edge value attributed to P. Repeating this procedure for all the pixels, one obtains an
edge map, which represents the first and the most critical step in a segmentation
process.
For measuring the "degree of diversity", several approaches have been investigated,
all based upon well-known pattern recognition and statistical methods, for instance:

- Distances (Minkowski, Canberra, Tanimoto-distance, scalar product, and other
specially developed distances) between the vectors of the rank-ordered gray values
of the two subwindows;
- Distances (Kolmogorov, Bhattacharyya, Patrick-Fisher) between the estimated
gray value density functions of the two subwindows;
- Measures of cluster concentration in the gray value space and in the 2-dimensional
space obtained by considering also a spatial distribution feature;
- Nonparametric Wilcoxon-type two-population tests;
- Distance between the estimated locations and between the estimated scales in the
paired subwindows.

Experimental results obtained with all the approaches mentioned above, on natural
images featuring problematic textures (remote sensing, radar, ultrasonic, nuclear
medicine) and on synthetic images, are shown and illustrated.
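
The edge-value procedure described above can be sketched as follows, using one of
the listed measures: a Minkowski distance between the rank-ordered gray values of
the two subwindows. The subwindow geometry (half-windows split by a horizontal,
vertical or diagonal line through P), the window radius and all function names are
assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def edge_value(image, row, col, radius=3, p=1):
    """Maximum diversity over 4 symmetric subwindow pairs centered on (row, col)."""
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    window = image[row - radius:row + radius + 1,
                   col - radius:col + radius + 1].astype(float)
    best = 0.0
    # Each linear form d splits the window into two equal-sized halves
    # (d < 0 and d > 0); pixels on the dividing line (d == 0) are ignored.
    for d in (xs, ys, xs + ys, xs - ys):
        a = np.sort(window[d < 0])  # rank-ordered gray values, subwindow A
        b = np.sort(window[d > 0])  # rank-ordered gray values, subwindow B
        dist = np.sum(np.abs(a - b) ** p) ** (1.0 / p)  # Minkowski distance
        best = max(best, dist)
    return best

def edge_map(image, radius=3, p=1):
    """Edge value for every pixel whose window fits inside the image."""
    h, w = image.shape
    out = np.zeros((h, w))
    for r in range(radius, h - radius):
        for c in range(radius, w - radius):
            out[r, c] = edge_value(image, r, c, radius, p)
    return out

# Synthetic test image: a vertical step edge between columns 7 and 8.
img = np.zeros((16, 16))
img[:, 8:] = 10.0
edges = edge_map(img)
print(edges[8, 8] > edges[8, 3])
```

Because only the sorted gray values enter the distance, the measure depends on the
first-order statistics of each subwindow and not on the spatial arrangement of the
pixels inside it.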

CITR:
last update: 22 April 1998