Concept for a book to be based on the motion workshop
Processing Visual Motion in the Real World
A Survey of Computational, Neural, and Ecological Constraints
We would like to consider three types of constraints which limit the performance
of biological motion detection systems.
(i) Computational: In extracting relevant information from two-dimensional
images, whether on egomotion, on the three-dimensional layout of the
environment, or on moving objects, the visual system has to cope with
highly ambiguous data. Examples of such ambiguities are the so-called
'aperture' and 'correspondence' problems, which arise at the elementary
level of motion detection. They demonstrate that the basic computational
problems in motion vision are mathematically ill-posed. We would like
to discuss what implications such computational constraints have in complex
stimulus situations where different sources of spatiotemporal dynamics
are difficult to separate.
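To make the ambiguity concrete, here is a minimal sketch of the aperture problem. It assumes the standard gradient (brightness-constancy) constraint Ix*u + Iy*v + It = 0, which is one common formalisation and is not prescribed by this concept; the function names are purely illustrative. A single local measurement yields one equation for the two unknowns (u, v), so only the velocity component along the image gradient can be recovered.

```python
def temporal_derivative(ix, iy, u, v):
    """Temporal intensity change predicted by the gradient constraint
    ix*u + iy*v + it = 0 for image velocity (u, v)."""
    return -(ix * u + iy * v)


def normal_flow(ix, iy, it):
    """Velocity component along the spatial gradient (ix, iy).

    This is the only part of the image velocity that one local
    measurement determines; the component along the contour is lost.
    """
    g2 = ix * ix + iy * iy
    if g2 == 0.0:
        raise ValueError("no spatial gradient: motion is unconstrained")
    return (-it * ix / g2, -it * iy / g2)


# A vertical edge (gradient purely horizontal) seen through a small aperture.
ix, iy = 2.0, 0.0
it_a = temporal_derivative(ix, iy, 1.0, 0.0)  # true motion: rightwards
it_b = temporal_derivative(ix, iy, 1.0, 5.0)  # true motion: right and up
```

Both velocities produce the same temporal derivative and hence the same normal flow, which is why additional assumptions or spatial integration are needed to resolve the local measurement.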
(ii) Neural: Biological systems compute with neurons, which
impose a number of severe processing limitations. Examples
are the small dynamic range in coding intensities and temporal changes
with analogue neuronal signals or spike trains, the imperfections in
approximating exact mathematical operations, or the abundance of internal
noise in signal transmission and information processing. It is also possible
that constraints on the level of the architecture and the connectivity
of neural tissues have had an influence on the evolution of processing
strategies. We suggest looking at structure and function in the visual
processing areas of invertebrates and vertebrates in terms of computational
demands and neural implementation.
(iii) Ecological: Visual systems operate in a finite and often very
specific world, with a more or less restricted spectrum of visual features.
Given that motion processing mechanisms have evolved under selective
pressure in specific environments and the context of specific lifestyles,
the systematic analysis of visual environments and visual tasks should
help us to understand the natural operating conditions of motion vision
and the neural coding strategies involved. In particular, the question
arises to what degree neuronal properties such as adaptation or dynamic range
reflect computational, neural or ecological constraints.
We suggest five topics, corresponding to five parts of the book, for discussing
the significance of these constraints for motion vision.
(1) Elementary motion detection processes: It is clear by now that
the output of biological elementary motion detectors (EMDs) does not
faithfully represent the local magnitude and direction of image motion vectors.
The reason is that each EMD has a directional characteristic depending
strongly on the spatial layout of the stimulus, and that the EMD's speed
tuning depends on pattern features such as contrast or spatial frequency.
Which aspects of image motion EMDs can relay to higher processing stages
also depends on their response dynamics. Furthermore, neurons and synapses
introduce additional limitations to efficient motion processing. What
neuronal properties and connections can approximate the operations that
are needed in motion detection? How do neurons and synapses limit performance
in motion detection beyond the fundamental computational constraints? We
are interested in tracing these computational and neural constraints
from the elementary motion detection level through to higher processing
stages by discussing the relevance of two fundamental classes of operations
that combine local motion information: image segmentation and spatio-temporal
integration.
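The dependence of EMD output on pattern features can be illustrated with a minimal sketch. It assumes a correlation-type (Reichardt-style) detector with a first-order low-pass filter as the delay stage; this is only one candidate EMD model, and all parameter names, units and default values below are hypothetical choices for illustration.

```python
import math


def reichardt_mean_response(velocity, spatial_freq, contrast,
                            spacing=1.0, tau=0.05, dt=0.001, duration=4.0):
    """Time-averaged output of an opponent correlation-type detector
    viewing a drifting sine grating.

    velocity      grating speed (deg/s); positive = from input a to input b
    spatial_freq  grating spatial frequency (cycles/deg)
    contrast      grating amplitude (arbitrary units)
    spacing       separation of the two inputs (deg)          [assumed]
    tau           time constant of the low-pass 'delay' (s)   [assumed]
    """
    omega = 2.0 * math.pi * spatial_freq * velocity       # temporal frequency
    phase_shift = 2.0 * math.pi * spatial_freq * spacing  # phase lag at input b
    alpha = dt / (tau + dt)        # discrete first-order low-pass coefficient
    lp_a = lp_b = 0.0
    total, count = 0.0, 0
    n_steps = int(duration / dt)
    for k in range(n_steps):
        t = k * dt
        a = contrast * math.sin(omega * t)
        b = contrast * math.sin(omega * t - phase_shift)
        lp_a += alpha * (a - lp_a)   # delayed copy of input a
        lp_b += alpha * (b - lp_b)   # delayed copy of input b
        out = lp_a * b - lp_b * a    # opponent subtraction of mirror half-detectors
        if k >= n_steps // 2:        # discard the initial transient
            total += out
            count += 1
    return total / count
```

Under these assumptions, calling the function with the same velocity but a different spatial frequency or contrast returns a different value, so the output confounds speed with pattern properties, while reversing the sign of the velocity reverses the sign of the response.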
(2) The segmentation of visual scenes: Spatial and/or temporal integration
is needed to extract relevant signals from the local EMD level, because
the local signal is unreliable. Integration, on the other hand, limits
spatial and temporal acuity in perception and motor control. In order
to separate objects from background, and to recognise them, the visual
system has to differentiate between local motion signals. Given the need
for integration, the high acuity with which scenes can be segmented into
different areas and objects with the help of motion cues is quite surprising.
What are the strategies to cope with this trade-off between acuity and
reliability? We would like to discuss the role of spatial interactions
between local motion signals by focussing on two striking examples: motion
transparency, i.e. the fact that two different motion signals can be
perceived simultaneously within the same region of the visual field,
and second-order motion, i.e. the use of cues other than luminance
in motion processing by the human visual system.
(3) Spatio-temporal integration and the use of optic flow: Many tasks
that rely on visual motion processing are related to the control of locomotion
and require extraction of cues from optic flow. The velocity vector field
is usually assumed to serve as input for higher-level operators, but the
signals provided by local EMDs are highly ambiguous with respect to direction
and speed. Real neurons introduce further imperfections in optic flow
representations. The extraction of flow-field information is therefore
thought to be based on integration and interaction across large parts
of the visual field. We would like to discuss the balance between local
and global motion processing and the functional significance of matched
neural filters. How limited and how robust are pragmatic solutions? To
what extent do fundamental computational and neuronal constraints affect
the estimation of egomotion parameters, the detection and recognition
of independently moving objects, and the extraction of depth information
from optic flow?
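One simple reading of a 'matched filter' is a template flow field onto which the local motion signals are projected. The sketch below assumes planar image-plane templates for expansion and rotation on a small square grid; these templates, the grid and all function names are illustrative simplifications and are not taken from any particular neural system.

```python
def grid(half_width=2):
    """Sample positions on a square grid centred on the origin."""
    r = range(-half_width, half_width + 1)
    return [(float(x), float(y)) for x in r for y in r]


def expansion_template(points):
    """Flow vectors pointing radially away from the origin."""
    return [(x, y) for x, y in points]


def rotation_template(points):
    """Flow vectors circling the origin anticlockwise."""
    return [(-y, x) for x, y in points]


def matched_filter_response(flow, template):
    """Least-squares amplitude of `template` contained in `flow`:
    the sum of local dot products, normalised by the template energy."""
    num = sum(fx * tx + fy * ty
              for (fx, fy), (tx, ty) in zip(flow, template))
    den = sum(tx * tx + ty * ty for tx, ty in template)
    if den == 0.0:
        raise ValueError("template has zero energy")
    return num / den
```

Because the two templates are orthogonal on a symmetric grid, a flow field made up of a weighted sum of expansion and rotation is decomposed into its two weights. Such a global projection also averages over local errors, at the cost of committing to a fixed set of expected flow patterns.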
(4) Vision and action: Image motion cues are used to guide behaviour.
There are three prominent classes of visual tasks which require reliable
motion detection: (a) Extracting egomotion parameters; (b) Object detection
and recognition; (c) Stabilising eye movements and tracking. We suggest
discussing how computational strategies and neuronal implementations
determine the way an animal operates, and vice versa. For instance,
the reliability of coding is determined both by the quality of the
motion information provided by EMDs and by the properties of real neurons,
which have only a limited number of possible states. In what ways do
these constraints limit performance in a decision task? And how are computational
strategies affected by structured locomotion or object-directed action?
(5) Natural operating conditions and neural coding: In the real world,
the environmental, neural and computational constraints for motion vision
are likely to depend strongly on lifestyle. Equally, visual habitats
differ both in the spatial distribution of contrast and in the spectral
composition of scenes and will therefore provide different conditions
for motion vision. Two aspects of natural operating conditions are of
immediate interest: one is the spatial and temporal distribution of signals
in a given visual habitat; the second is the structure of locomotion
which to a large degree determines the pattern of motion signals an observer
experiences. From what we know about natural operating conditions (environment
and behaviour), can we decide whether the coding strategies of neurons reflect
ecological or other constraints? What do we need to know about lifestyle,
environment, neural and computational constraints to understand neural
representation?
Book Format:
We envisage a book of about ten chapters with author combinations emerging
from the workshop. Suggested deadline for manuscripts: 31 January 1998.
We should agree on an electronic format that is freely interchangeable
and ideally also on a format for the figures. We would like to have an
author index and a subject index. We want to discuss further details, the final
structure of contributions, and refereeing procedures at the end of the
workshop.