Spatial and spatial-frequency analysis in visual optics



This article is corrected by:

  1. Errata: Spatial and spatial-frequency analysis in visual optics Volume 32, Issue 5, 441, Article first published online: 13 August 2012

Gerald Westheimer


Citation information: Westheimer G. Spatial and spatial-frequency analysis in visual optics. Ophthalmic Physiol Opt 2012, 32, 271–281. doi: 10.1111/j.1475-1313.2012.00913.x


Background:  In the specification of visual targets and their transmission through the eye’s optics to form retinal images, the spatial distribution of energy and its Fourier transform, the spatial-frequency spectrum, are equivalent, so long as linearity constraints are obeyed. The power spectrum, in which phase has been discarded, is an insufficient descriptor; it does not enable the original object to be reconstituted.

Procedure:  Not so well known, and explored here, are joint representations in the space and spatial-frequency dimensions. Their properties are outlined for some sample targets and for transforms of the Gabor, Difference-of-Gaussians and Wigner types. A related approach is one in which other kernel functions, such as the Gaussian or its derivative, are substituted for the cosines in the Fourier transform; here also graphs can be generated which jointly display properties both of the target and of its point-by-point representation in a size-tuned domain.

Applications:  This kind of study has application in matching the performance characteristics of optical devices to the eye’s, in optical superresolution, and in the analysis of the demands placed on neural processing in, for example, visual hyperacuity.


When describing visual targets and the changes they undergo in transmission through the eye’s optics to become retinal images and then in their further processing by the retinal and neural stages of vision, the time-honored way is to depict the light intensity or excitation level in transverse planes. A more recent alternate is the spatial frequency or Fourier transform representation, Equation 1. Provided that linearity constraints are met and that the phase of the Fourier components is not neglected, the two are equivalent.1 Because spatial frequency and spatial extent are reciprocal measures, they tend to emphasize different aspects of any given situation. Feature size and shape are seen best in the spatial coordinates, whereas the presence and resolution of fine details are emphasized in the high spatial-frequency Fourier components.


Equation 1 Fourier transform of I(x), the light distribution in one dimension in a lateral plane, giving the distribution of amplitude I(ν) and phase φν of the cosinusoidal patterns of spatial frequency ν that describe the light distribution equivalently.

In Figure 1 there are four examples of targets: a sinusoidal grating, a wide bar, a line, and finally, a line pair whose separation can be discriminated with hyperacuity precision.2 Illustrated are the frontal view of the patterns in their lateral planes normal to the optical axis in the left column, the light distribution I(x) along the x axis in the middle column, and, in the right column, I(ν), the Fourier counterparts of I(x), where amplitudes alone are given because, for simplicity’s sake, targets have been selected for which phase does not enter. To emphasize the salient aspects, some simplifying assumptions have been made: the grating in the first row and the black background in the other rows are assumed to extend to infinity, and the lines in row D are assumed very narrow.

Figure 1.

 Four light patterns presented in different formats; from top to bottom: (a) sinusoidal grating; (b) a wide bar; (c) a narrow line; (d) a line pair, discrimination of whose separation is a hyperacuity task. Left column: Frontal view of object plane featuring the patterns, the grating in (a) and the backgrounds in (b–d) are assumed to extend to infinity. Middle column: Cross-sectional light intensity distribution along a horizontal line across the patterns. Right column: Spatial frequency (Fourier) spectrum, normalized. Somewhat schematic.

If the entire light distribution along the x axis is used as the basis for Fourier transformation, it suffices to plot the spatial frequency representation along a single line orthogonal to the x axis, but if phase is involved, this will not be just a vertical ordinate at points along this ν axis, but a vector with amplitude and phase, represented by the orientation angle in the plane parallel to the ν = 0 plane. Even if the pattern is symmetrical, phase changes will be introduced if the analysis is not centered on an axis of symmetry. For example, if the line in row C of Figure 1 is shifted laterally, the amplitudes of the vectors at all ν values will not change. But because any lateral shift in x-distance becomes an increasing phase angle with increasing spatial frequency of the cosinusoidal components, the phase angles increase sequentially. This is illustrated in Figure 2, which shows the spatial-frequency locus for the thin line when it has been laterally shifted by 6 arcmin. The phase, whose angle is proportional to the frequency, results in the spiral appearance of a three-dimensional amplitude/phase/spatial-frequency representation.

Figure 2.

 Spatial frequency representation of the thin line, row C of Figure 1, when it has been laterally shifted by 6 arcmin. (a) A three-dimensional depiction with the amplitude as the distance from the ν axis and the phase angle increasing linearly with frequency. The spatial sinusoidal component of 6 arcmin period (10 cycles per degree) is now shifted in phase by one full revolution of 360 degrees. The amplitude (b) and phase (c) components are also shown separately. See also Figure S2a in the online supporting information.
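The behavior of the shifted line can be checked numerically. The following is a minimal sketch, assuming an idealized thin line (a delta function) and the common sign convention in which a shift x0 multiplies the spectrum by exp(−2πiνx0); the function name `line_spectrum` and the chosen frequencies are illustrative only.

```python
import numpy as np

def line_spectrum(shift_deg, freqs_cpd):
    """Fourier components of an idealized thin line located at x = shift_deg."""
    # Transform of delta(x - x0): unit amplitude, phase proportional to frequency.
    return np.exp(-2j * np.pi * freqs_cpd * shift_deg)

freqs = np.array([0.0, 2.5, 5.0, 10.0])   # spatial frequency, cycles per degree
shift = 6.0 / 60.0                        # 6 arcmin expressed in degrees
spectrum = line_spectrum(shift, freqs)

amplitude = np.abs(spectrum)              # unchanged by the shift
phase_turns = freqs * shift               # phase shift in full revolutions
```

The amplitudes stay at unity for every frequency, while the phase shift grows in proportion to frequency and reaches one full revolution at 10 cycles per degree, the component whose period equals the 6 arcmin shift.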

It follows that the power spectrum, based on only the length of the vectors from the ν axis to each point on the spatial-frequency locus, thus omitting phase information, is an incomplete representation. The actual target cannot be reconstructed from it.3 It has been demonstrated that natural images have mostly similar power spectra.4 Since for survival we depend on recognizing differences in natural scenes, and these differences reside in the phase information, the study of spatial vision necessarily has to transcend stimulus power spectra.
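That the power spectrum cannot identify a target may be illustrated with a small numerical sketch. The asymmetric test pattern below (a bar followed by a bright line) is a made-up example, and the discrete Fourier transform from NumPy stands in for the continuous transform.

```python
import numpy as np

target = np.zeros(64)
target[10:14] = 1.0            # a wide bar ...
target[20] = 3.0               # ... followed by a bright thin line (asymmetric pattern)
mirrored = target[::-1]        # the same pattern, left-right reversed

power_a = np.abs(np.fft.fft(target)) ** 2
power_b = np.abs(np.fft.fft(mirrored)) ** 2

same_power = np.allclose(power_a, power_b)        # power spectra coincide
same_target = np.array_equal(target, mirrored)    # yet the patterns differ
```

The two patterns are distinguishable only through the phases of their Fourier components, which the power spectrum discards.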

Box 1 Fourier description: amplitude, phase, power

For I(x), a distribution in the spatial dimension x, the related distribution F(ν) along ν, the spatial frequency dimension, is made up of the cosine and sine components

$$C(\nu) = \int I(x)\cos(2\pi\nu x)\,dx, \qquad S(\nu) = \int I(x)\sin(2\pi\nu x)\,dx$$

For functions with even symmetry, i.e. for which the condition I(x) = I(−x) holds, the sin components vanish and F(ν) = ∫ I(x) cos(2πνx) dx. Otherwise the elements at the spatial frequency ν of F(ν), the transform of I(x), can be regarded as vectors of amplitude A_ν = √(C(ν)² + S(ν)²) and phase angle φ_ν = tan⁻¹(S(ν)/C(ν)).

The meaning of this statement is the following. To reconstitute the original function I(x) from a knowledge of F(ν), for each frequency ν the functions cos(2πνx) and sin(2πνx) are multiplied by C(ν) and S(ν) respectively and the products summed. Alternatively, to reconstitute the original function I(x) from the distribution of amplitudes A_ν and phases φ_ν, the contributions A_ν cos(2πνx − φ_ν) are summed over all values of ν.
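The two reconstruction recipes can be tried out on sampled data. This is a sketch under stated assumptions: a discrete grid of 32 samples replaces the continuous integrals, the cosine and sine components are taken as sums of I(x)cos(2πνx) and I(x)sin(2πνx), and the phase is computed as atan2(S, C), for which the amplitude/phase form reads A cos(2πνx − φ). The off-center Gaussian test pattern is illustrative.

```python
import numpy as np

n = np.arange(32)                                  # sample positions x
I = np.exp(-0.5 * ((n - 12) / 2.0) ** 2)           # off-center blob, so phase matters
k = np.arange(32)[:, None]                         # frequency index (cycles per 32 samples)
theta = 2 * np.pi * k * n / 32                     # 2*pi*nu*x for every (nu, x) pair

C = (I * np.cos(theta)).sum(axis=1)                # cosine components
S = (I * np.sin(theta)).sum(axis=1)                # sine components
A = np.hypot(C, S)                                 # amplitude of each vector
phi = np.arctan2(S, C)                             # phase angle of each vector

# Reconstruction from cosine/sine components, then from amplitude and phase.
recon_cs = (C[:, None] * np.cos(theta) + S[:, None] * np.sin(theta)).sum(axis=0) / 32
recon_ap = (A[:, None] * np.cos(theta - phi[:, None])).sum(axis=0) / 32

power = A ** 2                                     # phase-free power spectrum
```

Both `recon_cs` and `recon_ap` return the original samples, whereas `power` alone retains no record of where the blob was located.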

The power spectrum is given by the squared amplitudes, A_ν² = C(ν)² + S(ν)².


Combined space and spatial-frequency representations

Not so well-known is a mode of representation which combines both the space and spatial-frequency domains. It is more easily studied by confining consideration to a single lateral image dimension which now becomes one of the axes of a two-dimensional distribution, the other, spatial frequency along the ν axis, being drawn orthogonally. This suffices for light patterns with even symmetry for which all phases are zero. An example of the concept is provided by Figure 3, a 3-dimensional configuration with the light distribution (here a one-dimensional Gabor blob) drawn along the x-axis and its spatial frequency spectrum along the orthogonal ν axis.

Figure 3.

 The light distribution of a one-dimensional Gabor patch superimposed on a uniform light field, a typical stimulus used in vision experiments, is plotted along the x-axis. In its center, drawn at right angles, is its spatial frequency spectrum. This permits the display in a single view of the two representations of this stimulus pattern. Two simplifications allow this kind of analytical display: a stimulus pattern that varies in only one dimension, and, also, one that has even symmetry so that the phase components of the spatial frequency vectors remain zero throughout and the spectrum can be contained in the graph’s vertical plane. See also Figure S3 in the online supporting information.

Often there is an advantage in concentrating on the frequency content not of the whole of the distribution but of those segments in which there are rapid changes, as for example in the vicinity of borders. Analysis of acoustic signals is an area where this problem is acute. Though the longer the segment of a sound recording that is being subjected to probing, the more it satisfies the criteria of strict Fourier theory, what usually prompts study are local differences in the temporal stream, such as one musical note following another, or the vowels preceding consonants.

Restricting the signal train that is being subjected to frequency analysis is best managed not by chopping it into segments but rather by multiplying it with an overlaid windowing function and performing a Fourier analysis over the narrowed regions of the test distribution. The operation is repeated, shifting the windowing function over the test distribution sequentially point by point. This yields a picture of the, so to speak, ‘local’ Fourier content of the distribution being analyzed and it is linked to its source by being graphed orthogonally at each position of the window, resulting in a surface.

The formula used in this approach is given in Equation 2, where g(x) is the windowing function extending over a range no wider than the source distribution and usually much less. At each point x one computes a temporary subsidiary distribution, consisting of a segment of the full distribution limited to the segment x ± tm multiplied by the windowing function, and performs a Fourier analysis over this temporary subsidiary distribution. The result is plotted along the ν axis orthogonal to the distribution’s axis at the point x, giving a surface when carried out all along the source distribution. Of specific interest is the height along the ν = 0 line. Then the cosine factor in Equation 1 is 1.00, leaving, as the remaining computation, the sum of the product of the heights of the distribution and of the windowing function centered on each point x over the region from x−tm to x+tm.


Equation 2 Generic equation for generating the two-dimensional function G(x,ν), which has the convolution of the target distribution I(x) with a windowing function g(x) along the x axis, and the Fourier transformation of this convolution over the limited domain x ± tm along the ν axis.
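The point-by-point procedure just described can be sketched in a few lines. This is one plausible discrete reading of the description, not the author's formula: at each x the segment I(x+t) for |t| ≤ tm is multiplied by g(t) and Fourier transformed over t. The function name `windowed_transform`, the sampling grid and the Gaussian window exp(−(kt)²) with k = 1 per arcmin are assumptions made for illustration.

```python
import numpy as np

def windowed_transform(I, x, window, t_m, freqs):
    """G(x, nu): local Fourier content of samples I on the uniform grid x.

    window is a callable g(t), t_m the half-width of the analysed segment,
    freqs the spatial frequencies in cycles per unit of x.
    """
    dx = x[1] - x[0]
    half = int(round(t_m / dx))
    t = np.arange(-half, half + 1) * dx                   # offsets within the window
    padded = np.pad(I, half)                              # dark surround beyond the grid
    basis = np.exp(-2j * np.pi * np.outer(freqs, t))      # cosine and sine components
    G = np.empty((len(x), len(freqs)), dtype=complex)
    for i in range(len(x)):
        segment = padded[i:i + 2 * half + 1] * window(t)  # I(x + t) * g(t)
        G[i] = basis @ segment * dx
    return G

x = np.linspace(-10.0, 10.0, 401)                 # arcmin, 0.05 arcmin steps
dx = x[1] - x[0]
line = np.zeros_like(x)
line[200] = 1.0 / dx                              # thin line of unit area at x = 0
k = 1.0                                           # assumed width parameter, per arcmin
gabor_window = lambda t: np.exp(-(k * t) ** 2)
freqs = np.linspace(0.0, 1.0, 21)                 # cycles per arcmin
G = windowed_transform(line, x, gabor_window, 2.0 / k, freqs)
```

For the thin line the amplitude of `G` is the same at all frequencies and follows the Gaussian window across x, vanishing once the line lies outside the ± tm range.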

What has been achieved is a 2-dimensional representation, simultaneous in the space and spatial frequency domains, of two aspects of a single object or image dimension. Plotted along the ν = 0 line is no longer the original distribution but its convolution with the particular windowing function that is required to reach a compromise between one so narrow as to be meaningless for Fourier purposes and one so wide that significant local differentiations within the test distribution have been washed out. For that reason in some of the following graphs the originating distribution has been superimposed, as a reminder of what had been started with. Nothing changes in principle for extension from simple symmetrical distributions to cases when the phase matters and each point in the x,ν plane has as its z coordinate not just an amplitude but a complex number representing amplitude and phase.

Some widely used exemplars of the windowing function can be cited. The first is the one introduced by Gabor, in which g(x) is the Gaussian function exp(−(xk)²). Here k is a free parameter, an inverse measure of the width, which must be specified; g(x) falls to 0.02 at ± 2/k and thus governs the range of integration ± tm. Gabor transforms of three of the targets in Figure 1 are shown in Figure 4.

Figure 4.

 Gabor transforms of the target patterns in the first three rows of Figure 1. The light distributions in the middle column of Figure 1 are first multiplied, point by point, with a Gaussian of standard deviation one arcmin (falling to 2% of the central value at 2 arcmin either side), and the Fourier transforms of these subsidiary distributions drawn point by point along the direction orthogonal to the x axis. In b and c the physical object limits have been sketched in the foreground. The Gaussian parameter has been chosen here to approximate the spread of light in a typical eye. The figures thus help visualize the situation that applies to the processing of light by retinal receptors. See also Figure S4a-c in the online supporting information.

The concept of windowing functions for purposes of limiting the region within which Fourier transformation takes place can be extended into the field of visual processing but, as soon as neural components enter, linearity can no longer be taken for granted and rigor may be compromised. Nevertheless the approach has value in allowing visualization of the changes that signals undergo in their passage from object to retinal image and then subsequently to the sensorium. The eye’s point-spread function, these days capable of good specification by means of wavefront technology, is unobjectionable on this count, but more elaborate excitation spread functions, which might include both optical light spread in the eye media and retinal center/surround antagonism, need to be treated with care. To model the center/surround antagonism of retinal functioning, a difference-of-Gaussians (DoG) windowing function can be employed. The resulting distributions (Figure 5) highlight the band-pass operation thus performed. This process has been utilized in, for example, deconstructing the geometrical-optical illusion known as the shifted-chessboard or café-wall patterns.5

Figure 5.

 As in Figure 4, but a difference-of-Gaussians (DoG) function, 1.3 exp(−0.9x²) − 0.4 exp(−0.18x²), is employed for windowing. The parameters chosen can serve to visualize the spatial distribution of excitation at some foveal retinal layer incorporating light spread and center/surround antagonism. See also Figure S5a-c in the online supporting information.
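The band-pass character of such a window can be verified directly. The sketch below assumes that both exponentials decay and that the broader surround term is subtracted from the center term, i.e. 1.3 exp(−0.9x²) − 0.4 exp(−0.18x²), with x taken to be in arcmin as in Figure 4; the grid and frequency range are arbitrary choices.

```python
import numpy as np

def dog(x):
    """Difference-of-Gaussians window: narrow center minus broad surround."""
    return 1.3 * np.exp(-0.9 * x**2) - 0.4 * np.exp(-0.18 * x**2)

x = np.linspace(-15.0, 15.0, 3001)        # wide enough for both Gaussians to die away
dx = x[1] - x[0]
freqs = np.linspace(0.0, 1.0, 201)        # cycles per unit of x

# Cosine transform (the window has even symmetry, so no sine terms are needed).
spectrum = (dog(x) * np.cos(2 * np.pi * np.outer(freqs, x))).sum(axis=1) * dx
peak_freq = freqs[np.argmax(spectrum)]
```

The spectrum is larger at `peak_freq`, a non-zero frequency, than at ν = 0: the surround term suppresses low spatial frequencies and the center term limits the high ones.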

A different approach is taken in the so-called Wigner distribution function (Equation 3), originating in theoretical physics to define simultaneously the position and momentum of a particle. It has found traction in modern optics6,7 where its manifest orthogonality matches that inherent in a function and its Fourier transform, as realized in the relationship between the distribution of coherent light in the transverse planes containing a distant object and the system’s aperture. The spatial basis for Fourier transform computation now is the distribution’s autocorrelation, and this has the virtue of having the distribution itself govern the windowing range instead of it being imposed by the investigator, as is the case with the Gabor and DoG transforms. The Wigner distribution function has some special properties: Its integral gives the total energy that is being transferred and it undergoes defined transformations in the process of free-space propagation and imaging by lenses.6 Wigner distribution functions for three of the four targets in Figure 1 are shown in Figure 6.

Equation 3 The Wigner distribution function W(x,ν) is a specific implementation of the general transformation distribution in Equation 2 in which the windowing process is accomplished by means of the autocorrelation of the original target distribution I(x). The domain used for the Fourier transformation is now the whole distribution.

Figure 6.

 Wigner distribution functions for the top three targets in Figure 1. As distinct from the Gabor transforms in Figure 4 or the DoG transforms in Figure 5, no arbitrary parameters are involved. See also Figure S6a-c in the online supporting information.
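A discrete version of the Wigner distribution function can be sketched as follows, assuming the standard form W(x,ν) = ∫ I(x + t/2) I(x − t/2) cos(2πνt) dt for a real distribution. The function name `wigner`, the 0.1 arcmin grid and the two-line test target are illustrative assumptions.

```python
import numpy as np

def wigner(I, dx, freqs):
    """Discrete Wigner distribution W[x, nu] of a real 1-D array I (zero outside)."""
    n_pts = len(I)
    W = np.zeros((n_pts, len(freqs)))
    for n in range(n_pts):
        m_max = min(n, n_pts - 1 - n)            # keep n+m and n-m inside the array
        m = np.arange(-m_max, m_max + 1)
        product = I[n + m] * I[n - m]            # local autocorrelation at offset t = 2*m*dx
        # The product is even in m for real I, so only cosine terms survive.
        W[n] = np.cos(2 * np.pi * np.outer(freqs, 2 * m * dx)) @ product * 2 * dx
    return W

dx = 0.1                                         # arcmin per sample (assumed grid)
I = np.zeros(161)
I[60] = I[100] = 1.0                             # two thin lines 4 arcmin apart, centered on sample 80
freqs = np.linspace(0.0, 0.5, 51)                # cycles per arcmin
W = wigner(I, dx, freqs)
```

No window parameter had to be chosen. For this target `W` is constant along ν at each line position, zero between a line and the midpoint, and oscillates along ν at the midpoint between the two lines, where the target itself is dark.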

For practical purposes, these representations offer the opportunity to survey the spatial frequency content at points on the x axis, and thus to gauge the resolution needed for reconstruction of specific regions of the target.

Another appropriate windowing function is the cross-sectional light distribution of the Airy disk, (J₁(x)/x)², which defines the effect purely of diffraction in optical imagery in monochromatic light of wavelength λ and a circular pupil of diameter a. Its Fourier transform is the optical transfer function.1 When the computation is performed with sufficient rigor, it will demonstrate that under these circumstances there is a cut-off spatial frequency ν = a/λ beyond which all values in the x,ν plane are zero.
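As a worked example of the cut-off relation, the short sketch below converts a/λ from cycles per radian to cycles per degree. The 2 mm pupil and 555 nm wavelength are illustrative values chosen here, not figures taken from the text.

```python
import math

def cutoff_cycles_per_degree(pupil_mm, wavelength_nm):
    """Diffraction cut-off a/lambda, converted from cycles/radian to cycles/degree."""
    cycles_per_radian = (pupil_mm * 1e-3) / (wavelength_nm * 1e-9)
    return cycles_per_radian * math.pi / 180.0

cutoff_2mm = cutoff_cycles_per_degree(2.0, 555.0)   # roughly 63 cycles per degree
```

Because the cut-off is proportional to the pupil diameter, doubling the aperture doubles the highest spatial frequency that diffraction admits.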

A related but somewhat different approach is the wavelet methodology,8 which does not feature a single windowing function, like the Gaussian in the Gabor and DoG transforms, or the one that is determined by the distribution itself in the Wigner function, but one whose width varies inversely with the Fourier frequency. Though this makes the computational basis uniform across the frequency spectrum, by the same token it fails to concentrate singularly on specific locations within the distribution.

These distributions have the interesting property that whereas in a given situation the total footprint remains the same, their shape can vary.9 A specific example is pure magnification, where any stretch in the x dimension is accompanied by a concomitant reciprocal contraction of the spatial frequency scale. The mode of analysis thus can help in the design of display situations. Aperture and wavelength determine the maximum spatial frequency transmission of the system. The magnification can then be adjusted so that the target’s spatial frequency demands are met by the instrument’s optical transfer capabilities, necessarily of course at the expense of target extent.


To illustrate many of the concepts that have been introduced, an example will be drawn from the topic of visual hyperacuity, the normal human observer’s ability to make spatial localizations with precision better, by almost an order of magnitude, than the eye’s resolution limit and the image binning imposed by retinal receptor packing. Vernier alignment thresholds of a few arcsecs are an example of what is involved, but for purposes of this discussion a related threshold will be used, the discrimination of the separation of two edges or lines. Most observers can distinguish between line pairs 4 arcmin and 3.9 arcmin apart, whereas the best optical resolution thresholds, and also the diameter of foveal cones, are between 0.5 and 1.0 arcmin. These targets, quite typical of hyperacuity patterns, have the advantage that, unlike misaligned vernier stimuli, their full description and analysis can be carried out in a single dimension and that, because they have even symmetry, the phase of their Fourier components can be ignored, i.e. purely cosine transformation suffices.10

The basic conformations of the line-pair pattern in the space, spatial frequency, Gabor, DoG and Wigner representations are shown in Figure 7. The just distinguishably-different pattern, 3.9 arcmin apart, will have its scales differ from the above by 2½%, diminished in the space and augmented in the Fourier domains. A standard deviation of 1 arcmin has been chosen for the Gabor transform, a value that is representative of the image spread, and the values for the DoG adequately model space constants for foveal visual processing.

Figure 7.

 Five representations of the standard display in a hyperacuity task. Observers can distinguish the separation of a line pair 4 arcmin apart (a) from that of a pair 3.9 arcmin apart. The difference is five to 10 times better than the eye’s resolution limit and the foveal cone spacing. (b) Fourier spectrum of a, (c) Gabor transform of a, (d) DoG transform of a, (e) Wigner distribution function of a. The figures for the just distinguishably-different pattern would be identical except for a scale change, 2½% contracted along the x axes and 2½% expanded along the ν axis. See also Figure S7c,d,e in the online supporting information.
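The reciprocal scale change can be made concrete with a small calculation, sketched here for two idealized, infinitely thin lines analysed about their midpoint, for which the normalized Fourier spectrum is cos(πνs) at separation s. The function names are illustrative.

```python
import numpy as np

def line_pair_spectrum(separation_arcmin, freqs_cpd):
    """Normalized Fourier spectrum of two thin lines, centered on their midpoint."""
    s_deg = separation_arcmin / 60.0
    return np.cos(np.pi * freqs_cpd * s_deg)

def first_zero_cpd(separation_arcmin):
    """Lowest spatial frequency (cycles/degree) at which the spectrum passes through zero."""
    return 1.0 / (2.0 * separation_arcmin / 60.0)

zero_standard = first_zero_cpd(4.0)                  # 7.5 cycles/degree
zero_comparison = first_zero_cpd(3.9)                # about 7.69 cycles/degree
expansion = zero_comparison / zero_standard - 1.0    # about 0.026
```

Reducing the separation from 4 to 3.9 arcmin moves every feature of the spectrum outward along the ν axis by the same small fraction, which is the scale change referred to in the caption.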

Each display shows a different facet of the situation. The Fourier transformation (Figure 7b) covers the entire pattern and can also serve as a reminder of the two slits/sinusoidal fringe space/spatial frequency duality that is familiar in Young’s interference fringe experiment of wave optics. The Gabor transform (Figure 7c), which might be an approximate representation of the energy in the optical retinal image or perhaps the pattern of retinal receptor excitation, highlights the spatial regions in which processing places the most demand on higher spatial frequencies. It allows, for example, the gauging of the effect on the pattern of curtailing the spatial-frequency spectrum by, e.g. pupil size reduction or optical blur, something that would not be obvious by just looking at the other panels. The DoG transform (Figure 7d) is a representation of the activity of an array of retinal ganglion cells which has the center/surround antagonism incorporated.5 Finally, the Wigner distribution function (Figure 7e) is a richer source of information than the others. The strips along the ν axis intersecting the x axis at ± 2 arcmin, the location of the two target lines, are closely related to what is contained in the other transforms, but in addition there is a great deal of activity in the center. The distribution along the ν axis at x = 0 is actually equivalent to what is shown in Figure 7b and constitutes the spatial frequency demands in the sensorium as a whole of the double-line target. If image analysis were entirely in the Fourier domain, something that is not the case in human vision, this is where the emphasis would be.

In a hyperacuity determination, the observer makes the distinction between, say, 4′ and 3′54″ line-pairs that are presented not superimposed but individually, either side by side or one after the other. Because it matters in what order the differencing and transform operations are performed – the difference between the Wigner transforms of two signals is not equal to the Wigner transform of the difference between the signals – it needs consideration which of the two sequences might be involved in neural processing. By the nature of the task, an instantaneous difference signal is an unlikely event. More probably, the representations of the two targets are examined separately and then compared. The neural apparatus used for such discrimination is as yet unclear. Figure 8, containing an assembly of the differences in the targets, and in their Fourier, Gabor, DoG and Wigner transforms, highlights the locations within these distributions where differences are most pronounced. The magnitude of the effects, as compared with the data in Figure 7, is much less.

Figure 8.

 Differences for the just-distinguishable patterns in the two-line hyperacuity target of Figure 7, shown for five different representations: (a) Spatial pattern (b) Fourier transform (c) Gabor transform (d) DoG transform (e) Wigner distribution function. See also Figure S8c,d,e in the online supporting information.
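Why the order of differencing and transformation matters for the Wigner function, but not for a windowed transform, can be seen from a short sketch. It compares only the local autocorrelation I(x + t/2) I(x − t/2), since the Fourier transformation that follows is linear and cannot restore an equality already lost. The 0.05 arcmin sampling, the helper names and the Gaussian window are assumptions for illustration.

```python
import numpy as np

def wigner_kernel(I):
    """Local autocorrelation I[n+m]*I[n-m]; its Fourier transform over m is the Wigner function."""
    n_pts = len(I)
    padded = np.pad(I, n_pts)                       # zeros outside the sampled range
    n = np.arange(n_pts)[:, None] + n_pts
    m = np.arange(-n_pts + 1, n_pts)[None, :]
    return padded[n + m] * padded[n - m]

def line_pair(half_separation, n_pts=201):
    """Two thin lines placed symmetrically about the central sample."""
    I = np.zeros(n_pts)
    center = n_pts // 2
    I[center - half_separation] = I[center + half_separation] = 1.0
    return I

wide, narrow = line_pair(40), line_pair(39)         # 4.0 and 3.9 arcmin at 0.05 arcmin per sample

difference_of_kernels = wigner_kernel(wide) - wigner_kernel(narrow)
kernel_of_difference = wigner_kernel(wide - narrow)
order_matters = not np.allclose(difference_of_kernels, kernel_of_difference)

# A Gabor-type window acts linearly, so here the order is immaterial.
window = np.exp(-np.linspace(-2.0, 2.0, 81) ** 2)
linear_case = np.allclose(np.convolve(wide, window) - np.convolve(narrow, window),
                          np.convolve(wide - narrow, window))
```

The Wigner function is quadratic in the target distribution, which is why `order_matters` comes out true while `linear_case` confirms that windowing by a fixed function commutes with differencing.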

The small values of hyperacuity thresholds raise the question of how they relate to the ultimate limits of information transfer through optical devices mandated by diffraction theory. The minimum spread of a point object into Airy disks, whose diameter depends directly on the wavelength and inversely on the system aperture, enables visualization of the resolution difficulties as they are encountered when such images of two closely-adjacent objects overlap. But a more succinct expression is embodied in the cut-off spatial frequency which specifies the spatial frequency band that can be admitted to form an image. For wavelength λ and aperture a (in the same units, say, mm) it is a/λ cycles/radian visual angle at the entrance pupil. In spite of much current writing on superresolution,7,11,12 none of the phenomena subsumed under this term actually ‘break the diffraction limit’ if by that is meant that they violate the Uncertainty Principle of quantum theory, which limits the simultaneous measurement of position and momentum of elementary particles – here photons whose direction of propagation becomes uncertain as the width of the aperture through which they pass is restricted. In superresolution as currently understood – of which there are many examples, starting early-on in microscopy with dark-field illumination – target details beyond the conventional diffraction limit are passed through the optical system by exchanging or multiplexing spatial-frequency bandwidth. In the same vein, no hyperacuity phenomena ‘break the diffraction limit;’ rather they optimize information provided within it, usually in a way that might be described in terms of Bayesian inference. Distinctions are made on the basis of differences within the cut-off spatial frequency and, by virtue of prior understanding, on the tacit assumption that nothing beyond it enters in the decision.

The basic approach can be judged from Figure 7e. Because the target is spatially delimited, the Wigner distribution function of the 2-line target pattern can be extended arbitrarily far into the Fourier domain. However, in actuality it would be truncated at the cut-off spatial frequency of the eye’s optics, <60 cycles per degree in the natural state. In practice this means what is available to the observer contains an uncertainty of what may have been beyond that truncation line. However, in any hyperacuity test – and usually in equivalent practical situations as well – prior constraints have been set on the range of possible targets; the decision involves the image structure, the likelihoods that it arose from that range of physical targets and the probabilities that such targets had actually been shown.

Non-Fourier kernel functions

Equation 1 is merely a special case of the more general class of kernel analysis in which the target distribution is being sifted by cosines of a range of frequencies to yield a description in terms of their amplitudes and phases. Functions other than the trigonometrical ones that are the basis for the Fourier approach can be used but mostly they lack the advantage of permitting a reversal of the process that allows a unique reconstruction of the source distribution from the transformation. The prominence of the Fourier method is rooted in the firm foundation it has in diffraction theory, which remains an obligatory ingredient in optical imagery. When it comes to visual sensory mechanisms, however, whose initial processing is essentially local, it is attractive to consider the use of spatially more restricted kernels. Mathematically, this is simply expressed by substituting for the cosine term in Equation 1 some other function whose parameter can also take a range of values, equivalent to spatial frequency and also with increasing spatial compactness with increasing values. Two specific examples are the Gaussian and the Gaussian derivative, illustrated in Figure 9a and b, respectively, where width is governed by a single parameter, reciprocally related to spread.

Figure 9.

 Gaussian (a) and Gaussian derivative (b) functions with increasing parameter k and hence decreasing width, that can be used as kernel functions in place of the cosinusoids with increasing spatial frequency that have been employed in the other figures.

The difference between the Gabor transform of Equation 2 and the distribution resulting from sifting a target light pattern through a series of Gaussian kernels is important. The former results when the target is sifted by a single Gaussian and then subjected to a Fourier analysis. What is being studied in this section, on the other hand, has nothing to do with sinusoids. It is the output when a target distribution is analyzed at each of its points through a series of Gaussians of gradually diminishing widths.

This procedure, using the Gaussian derivative function of Figure 9b, has been performed for a visual field, half of which is uniformly illuminated, the other half dark: a light step. The graph in Figure 10 can be interpreted as the level of stimulation that might be experienced at each location x by a hypothetical array of edge detectors with a range of accepting functions with decreasing widths and shows those that are most affected in the various locations along the target’s x dimension. Based on empirical evidence showing segregation into on and off streams13 and separate processing of edges with opposing polarity,14,15 the functions illustrated in Figure 9 should each be considered in two flavors, the Gaussian also going negative, and the Gaussian derivative also with reversed edge polarity.

Figure 10.

 A light-step target distribution (solid line in foreground) has been analyzed point by point, using Gaussian derivative functions (see Figure 9b). They operate as edge detectors and are activated only at the light/dark border at levels that decrease with narrowing of their tuning width, i.e. with increasing distance along the k axis. The graph, therefore, allows visualizing of the stimulation level of the active subset of such detectors that tile (presumably with overlap) the x–k plane.
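The sifting of a light step through a family of Gaussian-derivative kernels can be sketched numerically. The kernel normalization is not stated in the text; the sketch assumes kernels scaled to unit peak height, and the particular values of k, the grid and the function name are illustrative.

```python
import numpy as np

def gaussian_derivative(t, k):
    """Derivative of exp(-(k*t)**2), scaled to unit peak height (assumed normalization)."""
    raw = -2.0 * k**2 * t * np.exp(-(k * t) ** 2)
    peak = np.sqrt(2.0) * k * np.exp(-0.5)          # largest magnitude of the raw derivative
    return raw / peak

x = np.linspace(-5.0, 5.0, 501)                     # position, 0.02 per sample
dx = x[1] - x[0]
step = (x >= 0).astype(float)                       # dark on the left, light on the right
ks = np.array([0.5, 1.0, 2.0, 4.0])                 # larger k means a narrower kernel

# Extend the dark and light halves outward so the grid ends do not act as extra edges.
padded = np.pad(step, 250, mode="edge")
response = np.empty((len(ks), len(x)))
for row, k in enumerate(ks):
    kernel = gaussian_derivative(x, k)              # kernel sampled about its own center
    response[row] = np.convolve(padded, kernel, mode="valid") * dx

peak_levels = response.max(axis=1)
peak_positions = x[response.argmax(axis=1)]
```

Each row of `response` is confined to the neighborhood of the light/dark border and, with this normalization, its peak level falls as the kernel narrows, so the array traces out a ridge along the k axis above the edge.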

The graphical depiction in Figure 10 provides an accessible and easily visualized framework for some size-tuned sensory mechanisms that have been proposed over the years.16,17 Most such formulations assume that the distribution of detector widths is not continuous, but concentrated into clusters, called channels. Such a discrete tiling of the acceptance zone would be accommodated by compartments in the x–k plane of the manifold in Figure 10 presumably with some overlap and changes in scale from fovea into the retinal periphery. The sensory response to a given target would then be determined by how well the stimulus distribution is matched by the topography of the detection apparatus.


The author gratefully acknowledges helpful suggestions from John Robson and Alex Zlotnik in the presentation of the material.



Gerald Westheimer completed his optometric training at the Sydney Technical College in 1943 and practiced optometry till leaving Sydney in 1951 to enroll as a Ph.D. student at Ohio State with Glenn Fry. His thesis on oculomotor responses utilized a systems-theoretical approach based on his further exposure to programs in mathematics, physics and physiology leading to the B.Sc. at Sydney University and the Fellowship Diploma of the Sydney Technical College, the first higher optometric qualification awarded in Australia. Faculty positions in a succession of optometry schools, Houston, Ohio State, California at Berkeley, led to his current academic affiliation with the Division of Neurobiology at Berkeley which he founded and headed between 1987 and 1992.

Westheimer’s research ranges over a gamut of eye-related topics, in particular the eye’s optics and spatial vision where he spearheaded the analysis of thresholds that transcend the diffraction limit, now called hyperacuity. Recognitions include membership in the Order of Australia, election to Fellowship of the Royal Society, the American Academy of Arts and Sciences, the Optical Society of America, award of the Collin Research and Bicentennial Medals of the Optometrists Association of America, the Prentice Medal of the American Academy of Optometry, the Tillyer Medal of the Optical Society, the Ferrier lectureship of the Royal Society and several honorary degrees.