2nd AVA Natural Images Meeting at University of Bristol

15 Sep 1999 archived

Abstracts of Presentations:

9.00 Registration and Coffee

10.00 Invited Lecture:

The functions of ultraviolet vision in birds
Professor Innes Cuthill, University of Bristol.

11.00 Contrast normalization and coding efficiency,
Nuala Brady, University of Manchester.

11.30 How can we measure the effectiveness of simple-cell coding schemes,
Ben Willmore, David Tolhurst, University of Cambridge.

12.00 Exploring ways of improving complex extrafoveal visual processing,
Dean Melmoth, University of Cardiff.

12.30 Colorimetric uniqueness in natural images,
Steve Westland, University of Keele, Mitch Thomson, University of Aston

13.00 – 14.00 Lunch

14.00 Judgements of eye level in outdoor scenes
Helen Ross, University of Stirling.

14.30 Are natural scenes statistically scale-invariant?
Mitch Thomson, University of Aston.

15.00 Primate trichromacy and natural images
Daniel Osorio, University of Sussex.

15.30 – 15.45 Tea

15.45 Are we optimised to perceive natural images?
Tom Troscianko, Alejandro Parraga, University of Bristol, David Tolhurst, University of Cambridge.

16.15 Psychophysics of eye-direction judgements: what your eye tells me, and how your eyebrows try to stop it
Roger Watt, University of Stirling.

16.45 – 17.15 Open contributions and general discussion.

17.15 – 18.00 Buffet and wine reception.

18.00 Close of meeting.

The functions of ultraviolet vision in birds

Professor Innes C. Cuthill & Friends,

Co-editor (Old World) Behavioral Ecology,
School of Biological Sciences,
University of Bristol,
Woodland Road,
Bristol BS8 1UG, U.K.

Birds can see ultraviolet (UV) light because, unlike humans, their lenses and other ocular media transmit UV, and they possess a class of photoreceptor which is maximally sensitive to violet or ultraviolet light, depending on the species. Birds retain what appears to be the ancestral tetrapod, perhaps vertebrate, system of a single class of rod, subserving scotopic vision, and four spectrally distinct cone types, used for colour vision under photopic conditions. Current evidence is consistent with the idea that birds have a tetrachromatic colour space, as compared to the trichromacy of humans, so will see a range of hues we cannot imagine. Birds, along with some reptiles and fish, also possess double cones in large numbers, a cone class the function of which is still far from clear. We will review a range of behavioural experiments by our group, which show that UV information is utilised in behavioural decisions, notably in foraging and signalling. Hidden sex differences in coloration have been found in species which are more-or-less monomorphic to humans, and so the extent of chromatic variation, both within and between species, may have been underestimated in the past. It is also significant that removal of UV wavelengths affects mate choice even in species which are colourful to us. These studies emphasise that avian and human colour perceptions are different and that use of human colour standards, and even artificial lighting, may produce misleading results. However, genuinely objective measures of ‘colour’ are available, as are, importantly, models for mapping the measured spectra into an avian colour space. We will discuss the implications for future studies of evolutionary hypotheses that make predictions about colour variation in the natural world.

Contrast normalization and coding efficiency

Nuala Brady1 and David J. Field2

1Department of Psychology, University of Manchester, Manchester M13 9PL, UK
2Department of Psychology, Cornell University, Ithaca, NY 14853, USA

The visual system employs a gain control mechanism in the cortical coding of contrast, whereby the response of each cell is normalised by the integrated activity of neighbouring cells. The normalization pool is broadly tuned for spatial frequency and orientation, so that a cell's response is adapted by stimuli which fall outside its ‘classical’ receptive field. Various functions have been attributed to divisive gain control; particularly popular is the notion that normalization serves to match a cell's limited dynamic range to the distribution of contrasts in a scene, thereby increasing differential sensitivity. Here we consider an alternative proposal that contrast normalization, coupled with thresholding, serves to reduce the flow of information to higher levels of processing, thereby increasing the sparseness of the visual code. Forty-six natural scenes were analysed using oriented, frequency-tuned filters and contrast response distributions were compared before and after normalization. The distribution of contrasts in natural images is highly kurtotic, peaking at low values and having a long exponential tail characteristic of many natural signals. There is considerable variability in local contrast both within and between images. This variability is reduced after implementing contrast normalization, and the distribution of response activity shifts towards the Gaussian shape associated with an efficient transfer of information in cells whose capacity is limited by noise. When normalization is combined with thresholding, the sparseness of the visual code is considerably increased.
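The normalization-plus-thresholding idea in this abstract can be illustrated with a minimal sketch. It is not the authors' model: the pool used here (the RMS response of all filters at one position), the constant `sigma`, the `threshold` value, and the synthetic input data are all assumptions made for illustration.

```python
import numpy as np


def excess_kurtosis(x):
    """Excess kurtosis of all values in x (0 for a Gaussian)."""
    x = np.asarray(x, dtype=float).ravel()
    z = (x - x.mean()) / x.std()
    return float(np.mean(z ** 4) - 3.0)


def normalize_and_threshold(responses, sigma=0.1, threshold=0.0):
    """Divisively normalize filter responses, then zero sub-threshold values.

    responses: array of shape (n_filters, n_positions).
    Assumption: the pool is the RMS response of all filters at one position.
    """
    responses = np.asarray(responses, dtype=float)
    pool = np.sqrt(sigma ** 2 + np.mean(responses ** 2, axis=0, keepdims=True))
    normalized = responses / pool
    return np.where(np.abs(normalized) > threshold, normalized, 0.0)


# Synthetic stand-in for filter outputs: Gaussian responses scaled by a
# local contrast that varies from position to position (not real image data).
rng = np.random.default_rng(0)
local_contrast = rng.lognormal(mean=0.0, sigma=0.5, size=(1, 5000))
raw = local_contrast * rng.standard_normal((16, 5000))

normalized = normalize_and_threshold(raw)
sparse = normalize_and_threshold(raw, threshold=1.0)

print("kurtosis before:", excess_kurtosis(raw))
print("kurtosis after: ", excess_kurtosis(normalized))
print("fraction active after thresholding:", np.mean(sparse != 0))
```

In this toy setting, dividing by the pooled activity removes most of the position-to-position contrast variation, so the response distribution moves towards a Gaussian shape; the threshold then silences the majority of units.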

How can we measure the effectiveness of simple-cell coding schemes?

Ben Willmore, David Tolhurst

Department of Physiology, University of Cambridge, Downing Street, Cambridge CB2 3EG, UK

There is a growing number of rival explanations for the goal of the coding performed by cortical simple cells. In order to discuss objectively whether one scheme is "better" than another, it is important to rely on quantitative measures of the merits and weaknesses of each scheme. The kurtosis of the response distributions of individual model simple cells is often used as a criterion of sparseness. However, this probably says as much about the properties of the coded images as about the effectiveness of the code. We argue that the within-image kurtosis is better correlated with direct measurements of sparseness. We need a measure of distributedness to distinguish a compact code (e.g. PCA) where the same few basis functions do all the work in all images, from a distributed code where different small subsets of basis functions code for different images. Lastly, for non-orthogonal codes we must have some measure of completeness.
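The two kurtosis measures contrasted in this abstract differ only in the axis along which the statistic is taken. The sketch below assumes a hypothetical response matrix with one row per model cell and one column per image; the function names and the synthetic Laplacian data are illustrative and do not come from the authors.

```python
import numpy as np


def excess_kurtosis(x, axis):
    """Excess kurtosis along one axis (0 for a Gaussian)."""
    x = np.asarray(x, dtype=float)
    mu = x.mean(axis=axis, keepdims=True)
    sd = x.std(axis=axis, keepdims=True)
    return np.mean(((x - mu) / sd) ** 4, axis=axis) - 3.0


def per_cell_kurtosis(responses):
    """One value per cell: kurtosis of that cell's responses across images."""
    return excess_kurtosis(responses, axis=1)


def within_image_kurtosis(responses):
    """One value per image: kurtosis across all cells for that image."""
    return excess_kurtosis(responses, axis=0)


# Hypothetical responses: 64 model cells (rows) by 2000 images (columns).
rng = np.random.default_rng(1)
responses = rng.laplace(size=(64, 2000))

print("mean per-cell kurtosis:    ", per_cell_kurtosis(responses).mean())
print("mean within-image kurtosis:", within_image_kurtosis(responses).mean())
```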

Exploring ways of improving complex extrafoveal visual processing.

Dean Melmoth

University of Cardiff

It has been proposed that the human visual system is designed so that the periphery is responsible for the detection of movement and gross detail, with the ability to perform detailed analysis resting solely with the fovea. The implications of such a qualitative difference would be that despite spatial scaling (i.e. stimulus magnification) to compensate for the extrafoveal decrease in processing resources, we could never hope to achieve the same level of performance in complex tasks using peripheral vision as at the fovea. We show, however, that by considering an additional dimension to size alone (in our experiments, contrast), even complex tasks may be performed in extrafoveal vision. Thus, the periphery may be considered as qualitatively similar to the fovea for an increased range of tasks. The implications for individuals with central visual field loss are clearly enormous as visual aids designed with these results in mind could greatly improve their ability to perform complex everyday tasks. In addition, any task in which peripheral vision is important (piloting, driving, use of visual displays etc.) stands to benefit greatly from these results. We can also speculate on other stimulus dimensions which may be appropriately scaled to improve extrafoveal vision.

Colorimetric uniqueness in natural images

Stephen Westland and Mitchell GA Thomson*

MacKay Institute of Communication & Neuroscience
School of Life Sciences, Keele University, Staffordshire, ST5 5BG, UK
*Department of Vision Sciences, Aston University, Birmingham, B4 7ET, UK

It has been postulated that the spectral reflectance functions of natural surfaces are highly constrained and that such spectra can be represented by linear models with a small number of (although more than three) components (Maloney, JOSA A, 1986, p. 1673). It has also been claimed that natural metamers do not exist under incandescent light and natural daylight (Lennie & D’Zmura, Critical Reviews in Neurobiology, 1988, p333). We test these claims using several new sets of natural (flora and human skin) and man-made (paint samples and building materials) reflectance spectra.

Our analyses of natural reflectance data broadly confirm the earlier findings of Maloney and support the claim that metamerism in the natural world is extremely rare. Comparative analyses of man-made reflectance spectra suggest that the spectral smoothness of man-made surfaces is almost identical to that of natural surfaces. We claim that the incidence of metamerism is only higher in the man-made world compared to the natural world because of specific design of man-made surfaces and is not the result of any fundamental difference in surface spectral statistics. The fact that natural metamerism does not occur implies that distinct classes of natural surfaces elicit unique cone-excitation ratios in the human visual system. This colorimetric uniqueness might allow the visual system to recover colour information about the natural world from cone responses without the necessity for computational processes that can recover the reflectance spectra of surfaces in the world.

Judgements of eye level in outdoor scenes

Helen E Ross, Shazia Nawaz, Robert P O'Shea1

Department of Psychology, University of Stirling,
Stirling FK9 4LA, Scotland
1Department of Psychology, University of Otago, PO Box 56, Dunedin, New Zealand

Illusions of apparent pitch and height are often reported in mountainous scenery, implying changes in visually perceived eye level (VPEL). We measured VPEL in various environments. Experiment 1 on level ground showed a VPEL of + 0.05 deg, with no significant effect of viewing distance (15, 30 or 45 m) or of locations (an indoor corridor and an outdoor playing field): large illusions are therefore not caused by a baseline bias. Experiment 2 in varied terrain showed that VPEL followed the dominant pitch by about 25% of true pitch; when the same terrain was viewed from a 30 m high building, VPEL was raised by about + 0.4 deg. In Experiment 3 observers viewed flat terrain and distant mountains from a 100 m cliff, and gave a mean VPEL of + 0.7 deg. The positive error may be caused by the perspective raising of flat ground when viewed from a height. In Experiment 4 observers looked down dense woodland slopes of 6.5 and 15 deg, and showed VPELs of -3.63 deg and -0.42 deg respectively; and when looking over an 8.3 deg open mountain slope to a facing mountain, VPEL was -1.09 deg. Steep downhill slopes are less effective than moderate slopes, because they are out of sight when looking straight ahead: such scenes differ from pitchroom experiments, where downhill information is given by the sidewalls. Men were less susceptible than women to downhill errors. Experience also reduces errors. Most VPEL errors can be explained by the dominant perspective slope of the foreground, combined with some background distractor effects.

Are natural scenes statistically scale-invariant?

Mitch Thomson

Neurosciences Research Institute, Aston University, Aston Triangle, Birmingham B4 7ET, UK

The orientation-averaged power spectra of natural images appear to fall off as a power-law function of spatial frequency, and it has been argued that by choosing logarithmic bandwidths for spatial-frequency-selective filters, the output energy of such filters can be made, for natural-image inputs, independent of their modal spatial frequencies. Filter-banks with these properties have been described as ‘contrast scale-invariant’ or ‘second-order scale-invariant’. The present work considers whether or not the higher-order statistics of natural scenes could also be considered scale-invariant, and what consequences this would have for visual coding. I will argue that there is no simple definition of scale-invariance for higher-order statistics, since the (multidimensional) domains of higher-order spectra can be projected onto the (two-dimensional) Fourier spectra of images in a number of different ways. Nonetheless, I will show data to support the assertions that (a) finite, discrete natural images are not higher-order scale invariant (although spatially infinite natural scenery may be); (b) the visual system may still be able to produce a neural output distribution whose higher-order statistics are largely independent of scale.
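The second-order property that this abstract starts from, an orientation-averaged power spectrum that falls off as a power law, can be estimated with a short sketch. This is an illustration only: it assumes a square greyscale image, bins frequencies by rounding their radius to the nearest integer, and tests the estimate on synthetic noise rather than on natural images. The function names are hypothetical.

```python
import numpy as np


def radial_power_spectrum(image):
    """Orientation-averaged power spectrum of a square image.

    Returns integer radial frequencies (cycles per image) and the mean
    power in the ring of Fourier coefficients around each one.
    """
    image = np.asarray(image, dtype=float)
    n = image.shape[0]
    spectrum = np.fft.fftshift(np.fft.fft2(image - image.mean()))
    power = np.abs(spectrum) ** 2
    rows, cols = np.indices(power.shape)
    radius = np.hypot(rows - n // 2, cols - n // 2).round().astype(int)
    total = np.bincount(radius.ravel(), weights=power.ravel())
    counts = np.bincount(radius.ravel())
    freqs = np.arange(1, n // 2)  # skip DC, stop below the Nyquist ring
    return freqs, total[freqs] / counts[freqs]


def spectral_slope(image):
    """Slope of log power against log frequency (about -2 for 1/f amplitude)."""
    freqs, power = radial_power_spectrum(image)
    slope, _ = np.polyfit(np.log(freqs), np.log(power), 1)
    return float(slope)


def power_law_noise(n, alpha, rng):
    """Synthetic n-by-n noise whose amplitude spectrum falls as f**(-alpha)."""
    fy = np.fft.fftfreq(n)[:, None]
    fx = np.fft.fftfreq(n)[None, :]
    f = np.hypot(fy, fx)
    f[0, 0] = 1.0  # avoid dividing by zero at DC
    white = np.fft.fft2(rng.standard_normal((n, n)))
    return np.fft.ifft2(white * f ** (-alpha)).real


rng = np.random.default_rng(0)
noise = power_law_noise(128, alpha=1.0, rng=rng)
print("estimated power-spectrum slope:", spectral_slope(noise))
```

A fitted slope of this kind summarises second-order statistics only; it says nothing about the higher-order spectra that the abstract is concerned with.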

Primate trichromacy and natural images

Daniel Osorio

School of Biological Sciences, University of Sussex, BN1 9QG
Email: d.osorio@sussex.ac.uk

In many primates trichromacy is selectively favoured over dichromacy, and the spectral tuning of the red and green cone photopigments is fixed, with sensitivity maxima near 533 and 565 nm respectively. To understand the advantages of trichromacy and the spectral tuning of the red and green cones it is necessary to know what primates look at. This talk reviews recent work on spectral and spatial coding of natural scenes, based on hyperspectral images obtained with TW Cronin. There are three main conclusions. First, the power in the red-green signal is very low, <1% of the luminance signal, so a trichromat’s eye affords little additional statistical information over a dichromat’s. Second, because both red and green cones contribute to luminance mechanisms, it is probable that luminance signals are corrupted by chromatic ‘noise’ owing to the differing spectral sensitivities of the inputs. We estimate (Osorio et al. 1998) that for natural images chromatic noise is equivalent in amplitude to a luminance contrast (in magnocellular ganglion cells) of about 0.5% – 1%. Third, luminance, yellow-blue, and red-green signals are uncorrelated in natural images. In addition, Ruderman et al. (1998) derived the 27 principal components (eigenvectors) needed to represent 3×3 pixel patches of a set of natural images. These predict the optimal set of spatio-chromatic neural (e.g. retinal ganglion cell) receptive fields. The 27 principal components fall into three classes: they are achromatic, summing cone signals; they compare blue to red + green cone outputs; or they compare red and green cone outputs. There are no ‘unusual’ receptive fields, for example red + blue vs. green.

References:
Osorio D, Ruderman DL, Cronin TW (1998) Estimation of errors in luminance signals encoded by primate retina resulting from sampling of natural images with red and green cones. J Opt Soc Am A, 15, 16-22.
Ruderman DL, Cronin TW, Chiao CC (1998) Statistics of cone responses to natural images: implications for visual coding. J Opt Soc Am A, 15, 2036-2045.

Are we optimised to perceive natural images?

Tom Troscianko1, Alejandro Parraga1, David Tolhurst2

1Dept of Psychology, University of Bristol
2Department of Physiology, University of Cambridge, Downing Street, Cambridge CB2 3EG, UK

It is frequently assumed that we are optimised to perceive natural images, but this assumption has rarely been put to the test. We have developed a technique which involves the detection of small changes in the shape of an object such as a car or a human face, by morphing it gradually towards a different object and measuring discrimination thresholds for detecting the morphed change. We manipulated the second-order statistics of the images by steepening or flattening the slope of the 1/f function relating amplitude and spatial frequency. We found that discrimination thresholds tended to be lowest for images which had "natural" values of the 1/f slope, thus confirming the original hypothesis.

The question then arose as to how the human visual system was performing this discrimination. We developed a computational model of local contrast discrimination based on the observer’s contrast sensitivity and contrast discrimination functions – in other words, using data from grating detection and discrimination tasks to predict performance in a task based on complex, natural, objects. The model predicts performance surprisingly well. We then asked whether there were particular assumptions of the model (such as spatial frequency bandwidth, csf shape, or contrast discrimination function) which were particularly powerful in making the model fit the psychophysical data. The result was that there was no overwhelming need to be precise in the specification of these functions. The implication may be that there could be many types of visual system which would be optimal for perceiving "natural" images.
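The manipulation of second-order statistics described above, steepening or flattening the slope of the amplitude spectrum, can be sketched as a filter in the Fourier domain. This is a minimal illustration rather than the authors' procedure: the function name and the `delta` parameter are hypothetical, phase is left untouched, and no contrast renormalisation is applied afterwards.

```python
import numpy as np


def change_spectral_slope(image, delta):
    """Multiply the amplitude spectrum by f**(-delta), keeping phase fixed.

    delta > 0 steepens the fall-off (a blurrier image), delta < 0 flattens
    it (a 'whiter' image), and delta = 0 returns the image unchanged.
    """
    image = np.asarray(image, dtype=float)
    fy = np.fft.fftfreq(image.shape[0])[:, None]
    fx = np.fft.fftfreq(image.shape[1])[None, :]
    f = np.hypot(fy, fx)
    f[0, 0] = 1.0  # gain of 1 at DC, so mean luminance is preserved
    gain = f ** (-delta)
    return np.fft.ifft2(np.fft.fft2(image) * gain).real


# Stand-in image: random values in place of a real photograph.
rng = np.random.default_rng(2)
image = rng.random((64, 64))

steeper = change_spectral_slope(image, 0.4)
flatter = change_spectral_slope(image, -0.4)
print("mean luminance:", image.mean(), steeper.mean(), flatter.mean())
```

Because the gain depends only on radial frequency, the operation is reversible: applying `-delta` to the output restores the original image up to rounding error.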

Psychophysics of eye-direction judgements: what your eye tells me, and how your eyebrows try to stop it

Roger Watt

University of Stirling

Humans foveate objects of interest in the scene around them. Frequently this is the only external sign of that interest, although it may sometimes be followed later by an orienting of the head and body. The foveation is a relatively fast and transient action, with the object of interest being fixated for perhaps less than a second. Foveation acquires an especial importance when the object is another human, and it is not surprising that we are all keen to be sensitive to when someone is looking at us and equally keen to escape detection when we choose to look at someone.

I have measured sensitivity to being looked at in a psychophysical manner. Subjects see a sequence of images of people who are looking a little to left or right of straight ahead and are asked to say, for each, which direction the person is looking in. By varying the eye direction in the stimulus, it is possible to measure the minimum angular rotation of the eyes from straight ahead that subjects can detect reliably. There are three interesting features of these results:

1). The minimum detectable rotation of the eye (i.e. the distal cue) does not vary with viewing distance over a range from 0.5 m to 16 m, but does decrease rapidly beyond that point. This behaviour is unusual for a spatial judgement – normally performance is inversely related to viewing distance and therefore expressed as a visual angle (the proximal cue).

2). At brief exposures (20 msec), the distance over which this high sensitivity is found is about halved, but the sensitivity itself is not markedly reduced.

3). With brief exposures, the maximum distance for highest sensitivity is further halved when the image is of a face with lowered eyebrows.

Putting these together, the picture that results is of a critical "receptive field" within which a person’s eye-direction can be detected by an observer. The range of this field increases over time out to a maximum distance of about 16 m, but is affected by the eyebrows. Thus, to avoid being detected whilst looking at someone else, a brief glance with eyebrows down is recommended. If you wish to be detected, then the opposite is appropriate.
