Aperture synthesis radar imaging has been used to investigate coherent backscatter from ionospheric plasma irregularities at Jicamarca and elsewhere for several years. Phenomena of interest include equatorial spread F, 150-km echoes, the equatorial electrojet, range-spread meteor trails, and mesospheric echoes. The sought-after images are related to spaced-receiver data mathematically through an integral transform, but direct inversion is generally impractical or suboptimal. We instead turn to statistical inverse theory, endeavoring to utilize fully all available information in the data inversion. The imaging algorithm used at Jicamarca is based on an implementation of the MaxEnt method developed for radio astronomy. Its strategy is to limit the space of candidate images to those that are positive definite; consistent with data to the degree required by experimental confidence limits; smooth (in some sense); and most representative of the class of possible solutions. The algorithm was improved recently by (1) incorporating the antenna radiation pattern in the prior probability and (2) estimating and including the full error covariance matrix in the constraints. The revised algorithm is evaluated using new 28-baseline electrojet data from Jicamarca.
 Coherent radar backscatter from field-aligned plasma irregularities can be used to assess the stability of ionospheric regions from the ground and study the instability processes at work. Radars provide relatively unambiguous information about the range to and Doppler shift of the irregularities. Information about the spatial distribution of the irregularities in the transverse directions is more ambiguous; even steerable radars rely on the stationarity of the target to some extent to construct images of regional irregularity structure, and finite beam width effects introduce additional ambiguity, particularly when the targets are spatially intermittent and exhibit high dynamic range. Many radars use fixed beams, and the pseudoimages they produce (so-called “range time intensity” images) are only accurate representations to the extent that the flow being observed is uniform, frozen in, and lacks important details at scale sizes comparable to or smaller than the scattering volume. It is generally not possible to assess the validity of these assumptions a priori, calling the practice into question.
 Radar interferometry makes it possible to discern the spatial distribution of scatterers within the radar illuminated volume [Farley et al., 1981; Kudeki et al., 1981]. Interferometry with two spaced antenna receivers (a single baseline) yields three moments of the distribution. A powerful generalization of interferometry involves using more receivers and baselines to yield more moments, a sufficient number of moments specifying an image of the scatterers in the illuminated volume [Woodman, 1997]. The first true images of ionospheric irregularities were formed this way by Kudeki and Sürücü  observing irregularities in the equatorial electrojet over Jicamarca. A few years later, Hysell  and Hysell and Woodman  produced images of plasma irregularities in equatorial spread F with higher definition by incorporating statistical inverse methods in the data inversion. The same basic algorithm has since been applied to studies of large-scale waves in the daytime and nighttime electrojet [Hysell and Chau, 2002; Chau and Hysell, 2004], bottom-type spread F layers [Hysell et al., 2004a], quasiperiodic echoes from midlatitude sporadic E layers [Hysell et al., 2002, 2004b], and the radar aurora [Bahcivan et al., 2005]. Satisfactory performance of the original imaging algorithm relegated low priority to further development, but some improvements have recently emerged.
 In this paper, we review the algorithm and describe refinements that move it closer to an optimal solution to the imaging problem. The improvements allow the incorporation of additional information. Specifically, we utilize the radar radiation pattern and the full error covariance matrix.
 The data underlying radar imaging are complex interferometric cross correlation or cross-spectral measurements derived from spaced receivers separated by the vector baseline d. We take for granted that incoherent integration is involved in the measurements. The relationship between such measurements, called visibilities, and the scattered power density as a function of bearing, called the brightness, is given by [Thompson, 1986]
which has the form of a continuous Fourier transform between baseline and bearing space, σ being a unit vector in the direction of a place in the sky being mapped. Here, V is the visibility, B is the brightness, k is the wave number, and AN is the normalized two-way antenna effective area. In the context of radio astronomy, AN represents the characteristics of the receiving antennas. In radar imaging, the antennas used for reception are usually much smaller than those used for transmission, and AN is consequently dominated by the characteristics of the transmitting antenna array. Together, the product ANB is the effective brightness distribution, Beff, which represents the angular distribution of the received signals. It is this function we are interested in recovering from the data, the antenna radiation patterns being known. The only time the radiation pattern need explicitly be included in such integrals is in cases when dissimilar antennas are used for different spaced receivers (see below).
 The visibility data in (1) need not come from a filled array but could equally well come from a sparse array of spaced receivers. What matters is the number of independent interferometry baselines and the length of the baselines rather than the total antenna area or number of antenna array elements. This is the meaning of the term “aperture synthesis.”
where η and ξ are the direction cosines of the bearing σ with respect to x and y coordinates, which can be arbitrarily oriented. (The z direction can but need not be the pointing direction of the instrument.) If the field of view of the sky being mapped is sufficiently restricted, the radical in the denominator of the integrand may be treated like a constant. In that event, and in cases where the spaced receivers are coplanar so that dz can always be made to be zero, (2) retains the form of a two-dimensional Fourier transform. Since Beff is band limited by the finite width of the radar radiation pattern, the visibilities can be completely represented by a discrete set of periodic measurements, that is, a Fourier series.
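For orientation, the two integral relations discussed above can be written out as follows. This is a reconstruction from the definitions in the text and from the standard radio-astronomy form of the visibility integral, not a transcription. The sign of the exponent (chosen to agree with the point spread function exp(jkdθ) used in the discrete formulation) and the omission of any normalization constant are assumptions.

```latex
% (1): visibility for the vector baseline d; sigma is a unit bearing vector
V(k\mathbf{d}) = \int A_N(\boldsymbol{\sigma})\, B(\boldsymbol{\sigma})\,
  e^{\,jk\,\mathbf{d}\cdot\boldsymbol{\sigma}}\, d\Omega

% (2): the same relation in direction cosines (eta, xi), with B_eff = A_N B
V(kd_x, kd_y, kd_z) = \iint
  \frac{B_{\mathrm{eff}}(\eta,\xi)}{\sqrt{1-\eta^2-\xi^2}}\,
  \exp\!\left[\, jk\left( d_x\eta + d_y\xi + d_z\sqrt{1-\eta^2-\xi^2}\,\right)\right]
  d\eta\, d\xi
```

With d_z = 0 and the radical treated as a constant, the second expression reduces to a two-dimensional Fourier transform of B_eff, as stated in the text.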
 In practice, visibility samples are nonuniformly spaced, sparse, and incomplete, making inversion of the integral transform in (2) with a discrete Fourier transform impractical. Adaptive beamforming techniques such as the method introduced by Capon  may be used instead. However, these make no provision for statistical fluctuations in the data and the possibility that the problem is ill conditioned. Nor do they offer obvious means of incorporating prior information. The preferred methods of inversion are therefore indirect (probabilistic, model-based) rather than direct [see, e.g., Bertero and Boccacci, 1998]. The algorithm described here derives from the MaxEnt spectral analysis method, a Bayesian method based on maximizing the Shannon entropy of the candidate spectrum [Shannon and Weaver, 1949]. The method should not be confused with the maximum entropy method (MEM) or other autoregressive models, with which it has only a remote connection [Jaynes, 1982]. MaxEnt was first applied to spaced receiver image reconstruction by Gull and Daniell  (and to a more general image reconstruction problem by Wernecke and D'Addario ). Variations and refinements to the technique were published soon after by Wu , Skilling and Bryan , and Cornwell and Evans . Our algorithm is based on one developed by Wilczek and Drapatz  (hereinafter referred to as WD85). Rationales for MaxEnt have been advanced by Ables , Jaynes [1982, 1985], Skilling , and Daniell , among many others.
 In the following analysis, we simplify the notation by working in one dimension, θi, the discretized zenith angle in the equatorial plane. (One dimensional imaging is sufficient for observing field-aligned irregularities at the magnetic equator. The two-dimensional generalization is meanwhile trivial.) The real valued brightness is represented by the symbol fi = f(θi). The visibility data come from normalized cross-correlation estimates
where we presume the absence of extraneous common signals due to interference or jamming. Here, the v1,2 represent quadrature voltage samples from two receivers spaced by a distance dj in the direction of the magnetic equator, and the N1,2 are the corresponding noise estimates. The angle brackets above represent a time average associated with incoherent integration of the data. We represent the visibility data with the symbol gj = g(kdj) and assign two real values for each baseline; one each for the real and imaginary part of (3). Given M interferometry baselines with nonzero length, there are therefore a total of 2M + 1 distinct visibility data. (The visibility for the zero baseline is identically unity.) Using this notation and invoking the Einstein summation convention, (2) becomes
where hij is either the real or imaginary part of the point spread function exp(jkdjθi), depending on whether gj is the real or imaginary part of (3), and the ej are the corresponding random error terms arising from the finite number of samples used to compute (3). The sum is over the zenith angles in the image to be constructed. We can form images with arbitrarily narrow and numerous zenith angle bins, but the attainable granularity is ultimately limited by the data properties and longest baselines available. In practice, resolution is increased until new details cease emerging.
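The discrete forward problem described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `point_spread_matrix`, the angle grid, the baseline lengths (given in wavelengths), and the Gaussian test brightness are all hypothetical choices, and the error terms ej are set to zero.

```python
import numpy as np

def point_spread_matrix(theta, baselines_wl):
    """Real-valued h[i, j] for zenith angles theta (rad).

    Column 0 is the zero baseline (all ones); each nonzero baseline then
    contributes the real and imaginary parts of exp(j k d theta).
    Baselines are expressed in wavelengths, so k d = 2 pi d.
    """
    theta = np.asarray(theta, dtype=float)
    cols = [np.ones_like(theta)]
    for d in baselines_wl:
        phase = 2.0 * np.pi * d * theta
        cols.append(np.cos(phase))   # pairs with the real part of the visibility
        cols.append(np.sin(phase))   # pairs with the imaginary part
    return np.column_stack(cols)

# Hypothetical geometry: 64 zenith-angle bins and M = 3 nonzero baselines.
theta = np.linspace(-0.1, 0.1, 64)
baselines_wl = [2.0, 5.0, 9.0]
h = point_spread_matrix(theta, baselines_wl)

# Hypothetical brightness with unit total power (F = 1).
f = np.exp(-0.5 * ((theta - 0.02) / 0.02) ** 2)
f /= f.sum()

# Error-free visibilities: g_j = f_i h_ij (sum over zenith angles i).
g = f @ h
```

The matrix has 2M + 1 columns, and the first visibility equals the total brightness, mirroring the statement that the zero-baseline visibility is identically unity for normalized data.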
 Statistical inverse theory asks, “given a set of measured visibilities and their error bounds, what is the probability that a given brightness distribution produced them?” The answer comes from Bayes' theorem, which states that the conditional probability P(f∣g) is proportional to the product of P(g∣f) and P(f). The conditional probability or “likelihood” that the visibilities arose from the brightness distribution, P(g∣f), can be derived from the forward problem. The marginal or prior probability of the brightness, P(f), is an expression of other information about the image independent of the data. The image that maximizes P(f∣g), the posterior probability, is what we seek.
 MaxEnt methods associate P(f) with the Shannon entropy of the brightness distribution, S = −fi ln (fi/F). Here, F = Iifi = g0 is the total image brightness, Ii being a vector of ones. Of all distributions, the uniform one has the highest entropy. In that sense, entropy is a smoothness metric. The entropy of an image is also related to the likelihood of occurrence in a random assembly process. All things being equal, a high-entropy distribution should be favored over a low-entropy one. The former represents a broadly accessible class of solutions, while the latter represents an unlikely outcome that should only be considered if the data demand it. Finally, only positive definite brightness distributions are allowed by S. In incorporating it, we reject the vast majority of candidate images in favor of a small subclass of physically obtainable (positive definite) ones.
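The claim that entropy acts as a smoothness metric can be checked numerically. The sketch below evaluates S = −fi ln(fi/F) for a uniform and a sharply peaked distribution of equal total brightness; the function name and the two test distributions are illustrative assumptions.

```python
import numpy as np

def shannon_entropy(f):
    """S = -sum_i f_i ln(f_i / F), with F the total brightness."""
    f = np.asarray(f, dtype=float)
    F = f.sum()
    nz = f > 0.0                      # treat 0 ln 0 as 0
    return -np.sum(f[nz] * np.log(f[nz] / F))

n = 32
uniform = np.full(n, 1.0 / n)

# Nearly all of the brightness concentrated in a single bin.
peaked = np.full(n, 0.01 / (n - 1))
peaked[n // 2] = 0.99

s_uniform = shannon_entropy(uniform)  # equals ln(n) for unit total brightness
s_peaked = shannon_entropy(peaked)
```

The uniform distribution attains the maximum value ln(n), while the peaked one has much lower entropy and would be favored only if the data demanded it.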
 Neglecting the error terms for the moment, the brightness distribution that maximizes S while being constrained by (4) is the extremum of the functional:
where the λj are Lagrange multipliers introduced to enforce the constraints by the principles of variational mechanics and L is another Lagrange multiplier enforcing the normalization of the brightness. Maximizing (5) with respect to the fi and to L yields a model for the brightness, parameterized by the λj:
Note how Z plays the role of Gibbs' partition function here. This is no accident; the same derivation lies at the foundation of statistical mechanics.
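For reference, the constraint, functional, and brightness model referred to as (4) through (7) can be written out as below. This is a reconstruction from the definitions in the text (S = −fi ln(fi/F), F = Ii fi, summation over repeated indices), so the sign convention for the λj and the grouping of terms are assumptions rather than a transcription.

```latex
% (4): discrete forward problem with error terms
g_j + e_j = f_i h_{ij}

% (5): entropy with the data constraints and the normalization constraint
E(f(\lambda, L)) = S + \lambda_j \left( g_j - f_i h_{ij} \right)
                 + L \left( I_i f_i - F \right)

% (6): brightness model obtained by maximizing (5) over f_i and L
f_i = \frac{F\, e^{-\lambda_j h_{ij}}}{Z}

% (7): partition function
Z = I_i\, e^{-\lambda_j h_{ij}}
```

Setting the derivative of (5) with respect to fi to zero gives ln(fi/F) = −λj hij + const, and the normalization constraint fixes the constant through Z.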
 An obvious strategy for incorporating the data and errors in the imaging problem at this point would be to write an expression for the likelihood, which is related to χ2, multiply this by the prior probability (substituting (6) into S), and maximize the resulting posterior probability. Like the original Gull-Daniell algorithm, WD85 depart from this slightly by adapting (5) so as to enforce a constraint on the expectation of χ2 rather than minimize it. The constraint is included with the addition of another Lagrange multiplier (Λ):
where the last step was accomplished by substituting (6) and (7) into S. The Σ term constrains the error norm, calculated in terms of theoretical error variances σj². Rather than finding the brightness which deviates minimally from the data while also having high entropy, WD85 find the brightness which deviates from the data in a prescribed way so as to have the highest possible entropy consistent with experimental uncertainties.
 Maximizing (8) with respect to the Lagrange multipliers yields 2M + 1 algebraic equations:
which merely restates (4). Maximizing with respect to the error terms ej yields equations relating them to the λj:
(no sum implied). Maximizing with respect to Λ yields one more equation relating that term to the others:
The resulting system of 2M + 1 coupled, nonlinear equations for the Lagrange multipliers can be solved numerically using a hybrid method [Powell, 1970]. Finally, (6) yields the desired image. The algorithm is robust and virtually always converges in practice given uncontaminated data. An analytic form of the required Jacobian matrix can readily be derived from (9) and used to optimize the performance of the numerical solver.
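A simplified version of the numerical solution can be sketched with SciPy, whose `root(..., method="hybr")` wraps the MINPACK implementation of a modified Powell hybrid method. The sketch makes several assumptions that depart from the full algorithm: the error terms and the Σ constraint are dropped (error-free data), the zero baseline is handled through the normalization by Z rather than through its own multiplier, and the geometry and test brightness are hypothetical. It does show how an analytic Jacobian can be supplied to the solver.

```python
import numpy as np
from scipy.optimize import root

def psf_columns(theta, baselines_wl):
    """Re and Im parts of exp(j k d theta) for nonzero baselines (in wavelengths)."""
    phase = 2.0 * np.pi * np.outer(theta, baselines_wl)    # shape (N, M)
    return np.hstack([np.cos(phase), np.sin(phase)])        # shape (N, 2M)

def brightness(lam, h, F=1.0):
    """Model f_i = F exp(-lam_j h_ij) / Z, evaluated stably."""
    a = -(h @ lam)
    w = np.exp(a - a.max())
    return F * w / w.sum()

def residual(lam, h, g, F=1.0):
    """Constraint mismatch f_i h_ij - g_j (error terms omitted in this sketch)."""
    return brightness(lam, h, F) @ h - g

def jacobian(lam, h, g, F=1.0):
    """Analytic d(residual_j)/d(lam_k): minus F times the f-weighted covariance of h."""
    f = brightness(lam, h, F)
    mean = f @ h
    return -(h.T * f) @ h + np.outer(mean, mean) / F

# Hypothetical test case: 64 angle bins, three baselines, error-free data.
theta = np.linspace(-0.1, 0.1, 64)
h = psf_columns(theta, np.array([2.0, 5.0, 9.0]))
f_true = 0.05 + np.exp(-0.5 * ((theta - 0.02) / 0.04) ** 2)
f_true /= f_true.sum()
g = f_true @ h

sol = root(residual, np.zeros(h.shape[1]), args=(h, g),
           jac=jacobian, method="hybr")
f_maxent = brightness(sol.x, h)
```

The recovered `f_maxent` reproduces the visibilities but need not equal `f_true`; it is the highest-entropy image consistent with the six data values. Convergence on other data sets is not guaranteed by this sketch.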
 The algorithm is conceptually related to regularization, whereby inversion is performed by minimizing a cost function that includes χ2 along with some other metric, the regularization parameter, that favors smoothness or some other desirable property. The “negentropy” (−S) could be considered a suitable regularization parameter. The difference between WD85 and regularization is that χ2 is constrained rather than minimized here in an attempt to find solutions with the highest entropy consistent with the data and their experimental uncertainties. To simply minimize χ2 would be to seek solutions with statistical errors potentially much smaller than expected, a seemingly suboptimal use of information. That said, debate exists in the literature regarding the optimal value for Σ (see Gull  for discussion). In practice, smaller values produce images with greater detail at the expense of increased artifacts. The choice of Σ balances desire for the former against aversion to the latter but should in no case depart drastically from expectation.
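The constrained functional and the stationarity conditions referred to as (8) through (11) can be written out as follows. These expressions are reconstructed by carrying the brightness model through the derivation described in the text; the placement of signs and the use of σj² for the theoretical error variances are assumptions consistent with that derivation, not a transcription.

```latex
% (8): functional with the error-norm constraint
E(f(e, \lambda, \Lambda))
  = S + \lambda_j \left( g_j + e_j - f_i h_{ij} \right)
      + \Lambda \left( e_j^2 \sigma_j^{-2} - \Sigma \right)
  = \lambda_j \left( g_j + e_j \right) + F \ln Z
      + \Lambda \left( e_j^2 \sigma_j^{-2} - \Sigma \right)

% (9): stationarity with respect to the lambda_j
g_j + e_j - f_i h_{ij} = 0

% (10): stationarity with respect to the e_j (no sum on j)
\lambda_j + \frac{2\Lambda}{\sigma_j^2}\, e_j = 0

% (11): stationarity with respect to Lambda, after eliminating e_j with (10)
\Lambda^2 - \frac{\lambda_j^2 \sigma_j^2}{4\Sigma} = 0
```

Eliminating ej and Λ with (10) and (11) leaves a closed system of equations in the λj alone.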
2.1. Radiation Pattern
 The original WD85 algorithm has been improved recently with the incorporation of two pieces of information. One is the two-way antenna radiation pattern. At Jicamarca, imaging experiments are usually performed using the north and south quarters of the main antenna for transmission and the modules (64ths) for reception. The quarters are excited with a phase taper intended to broaden the pattern and widen the imaging field of view, which is effectively about 10° wide given the 20–30 dB usable dynamic range of the images. Figure 1 shows the two-way pattern, which is relatively flat and free of sidelobes but which exhibits a subtle valley to the west of zenith. The pattern is also slightly asymmetric and extends farther to the east than to the west of zenith.
 The radiation pattern can obviously be divided from the Beff curves produced by imaging to yield something closer to the true brightness. The practical merits of this procedure are questionable, however, as it involves dividing small numbers by other small numbers at the image periphery. Features lying outside the main radiation pattern cannot be recovered in practice, and we do not attempt to do so. However, the boundary can be clarified and sharpened by incorporating the beam shape in the prior probability. Doing so reduces the solution space to images free of artificial features in excluded regions.
 The prescription is to modify the entropy expression. If Shannon's expression favors a uniform brightness distribution, the expression that favors distributions that resemble the beam shape can be shown to be
where pi is the two-way radiation power pattern. Propagating this expression through the preceding analysis alters only the brightness model:
where (12) implies no sum on i. The remaining formalism is unchanged. The only restriction is that pi should be positive definite. In practice, the effect of the modification is to suppress the brightness outside the main beam and enhance it slightly within. Examples are shown in the next section of the paper.
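One form of the modified entropy and of the resulting brightness model that is consistent with the description above is given below. It is a reconstruction based on the usual relative-entropy generalization of Shannon's expression, so the exact normalization of pi and the equation numbering in the original are not asserted here.

```latex
% Entropy measured relative to the two-way power pattern p_i
S = -f_i \ln\!\left( \frac{f_i}{p_i F} \right)

% Brightness model that follows (no sum on i in the numerator)
f_i = \frac{F\, p_i\, e^{-\lambda_j h_{ij}}}{Z},
\qquad
Z = p_i\, e^{-\lambda_j h_{ij}}
```

With all λj = 0, the model reduces to fi proportional to pi, which is the sense in which the modified entropy favors images resembling the beam shape.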
 In the event that antennas with different configurations are used for reception, the prescription is as follows. The pi in (12) can be set to match the radiation power pattern of the transmitting antenna array, which has a common effect on all the signals received. The radiation patterns of the receiving antennas should then be explicitly incorporated in the expressions for the effective brightness, Beff, associated with each baseline. In view of (1) and (4), this can be accomplished by modifying the point spread function for the given baseline j such that hij → hij℘1i℘2i (no sum implied), where ℘1,2i are the radiation amplitude patterns for the antennas at either end of the baseline. In view of the principle of pattern multiplication, characteristics of the radiation pattern common to all the receiving antennas can be incorporated in pi instead of ℘1,2i so as to better constrain the image.
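The two prescriptions above can be illustrated with a short sketch. It assumes a brightness model of the form fi ∝ pi exp(−λj hij), which is one reading of the modified model, and the function names, the Gaussian power pattern, and the receiving amplitude patterns are all hypothetical.

```python
import numpy as np

def brightness_with_prior(lam, h, p, F=1.0):
    """Assumed model: f_i = F p_i exp(-lam_j h_ij) / Z, with p_i > 0."""
    a = np.log(p) - h @ lam
    w = np.exp(a - a.max())
    return F * w / w.sum()

def weight_psf_column(h_col, amp1, amp2):
    """Apply h_ij -> h_ij * amp1_i * amp2_i for one baseline (no sum on i)."""
    return h_col * amp1 * amp2

theta = np.linspace(-0.1, 0.1, 64)

# Hypothetical two-way power pattern shared by all signals (the prior p_i).
p = np.exp(-0.5 * (theta / 0.05) ** 2)

# One baseline, 5 wavelengths long, with dissimilar receiving antennas.
phase = 2.0 * np.pi * 5.0 * theta
h = np.column_stack([np.cos(phase), np.sin(phase)])
amp1 = np.ones_like(theta)                     # first antenna: isotropic
amp2 = np.exp(-0.25 * (theta / 0.08) ** 2)     # second antenna: tapered
h_weighted = np.column_stack(
    [weight_psf_column(h[:, k], amp1, amp2) for k in range(h.shape[1])]
)

# With no data constraints (lam = 0), the image follows the beam shape.
f0 = brightness_with_prior(np.zeros(2), h_weighted, p)
```

Keeping the common transmit pattern in `p` and only the antenna-specific factors in the point spread function follows the division of labor suggested in the text.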
2.2. Error Covariances
 A significant shortcoming of the WD85 algorithm for radar imaging is its utilization of error variances alone. Off-diagonal terms in the error covariance matrix arising from correlated errors are neglected. This practice is widespread in inverse theory but may not always be justified. The problem is addressed below.
 The complete error covariance matrix for interferometric cross correlation or cross-spectral estimates is derived in Appendix A and summarized in (A19), (A16), and the discussion leading to (A14). The former pair gives the error covariance for the cross-products
where er12 stands for the error in the estimate of the real part of the correlation of the signals from spaced receivers 1 and 2, for example, and where the indices may be repeated depending on the interferometry baselines in question. The latter extends the results to the finite signal-to-noise ratio (SNR) case and must be applied to the terms with repeated indices. Taken together, these formulas show that the error covariance matrix is diagonally dominant only in the low-SNR case or in cases where the coherence is small. These limits seldom apply to coherent scatter, however. Even the longest interferometry baseline at Jicamarca, nearly 100 wavelengths long, can exhibit high coherence. Even the low-power, portable imaging radar used to study midlatitude and high-latitude plasma irregularities runs in the high-SNR limit [Hysell et al., 2002, 2004b]. The error covariances are not diagonally dominant in general, and discarding the off-diagonal terms misrepresents statistical confidence in the data. Some detail may be lost in the image as a result.
 Our procedure is to evaluate the full error covariance matrix according to the formulas in Appendix A. We then invert the matrix and calculate the eigenvectors and eigenvalues of the inverse. The inverse covariance matrix is real symmetric, and its eigenvalues are real. They are positive except in cases where the condition number of the error covariance matrix is too high to permit reliable inversion. In these rare events, the new algorithm fails, and the original one must be used.
 The transformation matrix Tjk whose rows are the eigenvectors defines a space in which the error covariances are diagonal. We can transform the system into this space through a similarity transformation and still utilize most of the original formalism. Reciprocals of the eigenvalues become the new error variance estimates, that is, σj → σ′j. The measured visibilities must also be transformed, that is, g′j = Tjkgk. Likewise, we transform the point spread function in (9), that is,
No further changes need be made to the algorithm. The additional computational burden is modest compared to the iterative solution of the coupled equations.
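The diagonalization procedure can be sketched with NumPy as follows. The covariance matrix, point spread matrix, and brightness used here are random placeholders rather than radar data, and the function name `diagonalize_errors` is hypothetical; the steps (invert, eigendecompose, transform the visibilities and the point spread function) follow the description above.

```python
import numpy as np

def diagonalize_errors(C, g, h):
    """Rotate data and point spread function into the eigenbasis of inv(C).

    Returns T (rows are eigenvectors), the new variances (reciprocals of the
    eigenvalues of inv(C)), g' = T g, and h'_ij = T_jk h_ik.
    """
    C_inv = np.linalg.inv(C)
    eigvals, eigvecs = np.linalg.eigh(C_inv)      # real symmetric input
    if np.any(eigvals <= 0.0):
        raise ValueError("covariance too ill conditioned; fall back to variances only")
    T = eigvecs.T
    return T, 1.0 / eigvals, T @ g, h @ T.T

# Placeholder problem: J data values, N image pixels.
rng = np.random.default_rng(0)
J, N = 6, 32
A = rng.standard_normal((J, J))
C = A @ A.T / J + 0.1 * np.eye(J)     # symmetric positive definite
h = rng.standard_normal((N, J))
f = rng.random(N)
g = f @ h

T, var_new, g_new, h_new = diagonalize_errors(C, g, h)
```

In the rotated space the covariance `T @ C @ T.T` is diagonal with entries `var_new`, and the forward relation still holds in the form `f @ h_new == g_new`, which is why the rest of the formalism carries over unchanged.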
 Prior to similarity transformation, we find that the autocovariances (the diagonal elements of the error covariance matrix) are relatively uniform, with values of the order of (2m)⁻¹, where m is the number of statistically independent samples entering into the estimated visibilities. The off-diagonal terms, meanwhile, are smaller but of the same order. After diagonalization, however, the variances are more widely distributed, with values both much smaller and much larger than (2m)⁻¹. The explanation is discussed in Appendix A, where it is pointed out that normalized interferometric coherence measurements are much more accurate than phase measurements, particularly when the coherence is large. The large phase errors distribute themselves about evenly between the real and imaginary parts of the visibilities given arbitrary phase angles, which is why the error variances are uniform prior to diagonalization, and the real and imaginary errors are highly correlated in general, contributing to the magnitude of the off-diagonal terms in the covariance matrix. Diagonalization is tantamount to rotating the phase plane so as to align an axis with the interferometric phase angle. In this space, the errors decorrelate. Errors for parameters aligned with (normal to) the phase angle become smaller (larger) than the original autocovariances. The effect is only important for data with high coherence in the large signal-to-noise ratio limit.
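The geometric picture of rotating the phase plane can be made concrete with a two-by-two example for a single baseline. The variances and the 45° phase angle below are illustrative numbers only. For simplicity the covariance matrix itself is decomposed; its eigenvectors are the same as those of its inverse.

```python
import numpy as np

phi = np.deg2rad(45.0)                 # hypothetical interferometric phase angle
var_along, var_normal = 1e-4, 1e-2     # small coherence error, large phase error

u = np.array([np.cos(phi), np.sin(phi)])     # direction of the phase angle
n = np.array([-np.sin(phi), np.cos(phi)])    # direction normal to it

# Covariance of the (real, imaginary) errors in the original coordinates.
C = var_along * np.outer(u, u) + var_normal * np.outer(n, n)

# eigh returns eigenvalues in ascending order with eigenvectors as columns.
eigvals, eigvecs = np.linalg.eigh(C)
```

In the original coordinates the two diagonal entries of `C` are equal and the off-diagonal entry is of comparable magnitude, while the eigenvalues recover the small and large variances and the first eigenvector lines up with the phase direction `u`.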
 Overall, the effect of considering correlated errors is to produce more detailed radar images. The small error variances arising from diagonalization permit MaxEnt to produce fine structure that is pointed to by the data but that would otherwise have too low entropy to be considered. At the same time, diagonalization suppresses some presumably artificial features in the images that can no longer be supported in view of the large error variances now associated with a number of the visibility data. These effects are illustrated in the examples in the following section.
Figure 2 shows example radar images of plasma irregularities in the equatorial electrojet observed at Jicamarca on July 26, 2005. The image space is in the plane of the magnetic equator. The waveform used to collect the images was an uncoded 3 μs pulse with an interpulse period of 2.5 ms. The incoherent integration time was 4 s. A low-power transmitter was used for the experiments, and reception was performed using 8 approximately collinear antenna modules yielding 28 distinct baselines. The distances between the modules, projected on the line of the magnetic equator, ranged from 13.2 m to 570.8 m. While the imaging field of view is consequently about 26°, the antenna illuminated a much narrower range of angles (see Figure 1), and so only half the actual field of view reconstructed through imaging is shown. The antenna configuration used was primarily designed for experiments with spread F, where significant backscatter can sometimes be received from very large zenith angles.
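The numbers quoted above can be cross-checked with a short calculation. The operating frequency of 49.92 MHz is an assumption (it is not stated in this section), and taking the unambiguous field of view as λ/d for the shortest baseline and the finest angular scale as λ/d for the longest baseline is a small-angle rule of thumb rather than a statement of the authors' method.

```python
import math
from itertools import combinations

C_LIGHT = 299_792_458.0       # speed of light, m/s
FREQ_HZ = 49.92e6             # assumed radar frequency
wavelength = C_LIGHT / FREQ_HZ

# Eight receiving modules give one baseline per unordered pair.
n_modules = 8
n_baselines = len(list(combinations(range(n_modules), 2)))

d_min, d_max = 13.2, 570.8    # projected module separations from the text, m

# Small-angle estimates of the field of view and the finest angular scale.
fov_deg = math.degrees(wavelength / d_min)
resolution_deg = math.degrees(wavelength / d_max)
```

Under these assumptions the pair count is 28, the field of view comes out close to 26°, and the finest angular scale is a fraction of a degree.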
 The images depict the backscattered characteristics versus range and zenith angle in the equatorial plane. We process the signals into four spectral bins and generate images for the lowest three frequency components. The red- and blue-shifted Doppler bins are assigned red and blue colors, and the remaining, zero-shifted bin is assigned green. The intensity of the colors plotted is proportional to the SNR, on a logarithmic scale from 20–40 dB. By combining the colors in the images this way, the brightness, hue, and saturation of the pixels come to represent the scattered power, Doppler shift, and spectral width in an intuitive way.
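The color-compositing scheme can be sketched as below. The function names are hypothetical, and clipping values outside the 20–40 dB window is an assumption of this sketch; the mapping of the red-shifted, zero-shifted, and blue-shifted bins to the red, green, and blue channels follows the description above.

```python
import numpy as np

def snr_to_channel(snr_db, lo=20.0, hi=40.0):
    """Map SNR in dB linearly onto [0, 1] over the lo..hi window, with clipping."""
    scaled = (np.asarray(snr_db, dtype=float) - lo) / (hi - lo)
    return np.clip(scaled, 0.0, 1.0)

def compose_rgb(snr_red_db, snr_zero_db, snr_blue_db):
    """Stack three Doppler-bin images into an RGB image (last axis = channel)."""
    return np.stack(
        [snr_to_channel(snr_red_db),    # red-shifted bin -> red
         snr_to_channel(snr_zero_db),   # zero-shifted bin -> green
         snr_to_channel(snr_blue_db)],  # blue-shifted bin -> blue
        axis=-1,
    )

# One-pixel example: mid-range red, below-threshold green, saturated blue.
rgb = compose_rgb(np.array([[30.0]]), np.array([[10.0]]), np.array([[50.0]]))
```

A pixel with comparable power in all three bins appears bright and unsaturated (a broad spectrum), whereas power in a single bin gives a saturated hue, which is how hue and saturation come to encode Doppler shift and spectral width.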
 The bands evident in Figure 2 are large-scale gradient drift waves that predominate in the daytime electrojet [Kudeki et al., 1982]. Not only are the intensities modulated by the bands, but so are the Doppler shifts, signifying the reversal of the electron E × B drift in opposite phases of the electrostatic waves. Inspection of the Doppler spectra shows that type I echo signatures are present, signifying that the polarization electric fields produced by the gradient waves were strong enough to excite Farley-Buneman instabilities. The large-scale wave wavelengths are kilometric, and they propagate westward during the day with phase speeds of up to about 200 m/s. They exhibit considerable shear, propagating fastest at high altitudes. The shear causes the waves to twist, break, and collapse. The waves then reform, exhibiting a nonlinear relaxation behavior that has been described and simulated by Ronchi et al. . This behavior can be seen clearly in animated sequences of images like those in Figure 2. A detailed analysis of properties of the waves deduced from imaging was presented by Hysell and Chau .
Figure 3 shows the same plasma irregularities as Figure 2, only this time computed without specifying the radar radiation pattern in the prior probability. These images have a higher tendency to indicate the presence of backscatter outside the radar illuminated volume, something that is almost certainly artificial. The fault is not grave, but correction is desirable. The inclusion of the radiation pattern in the entropy expression appears to be an effective remedy. Since the total power in the images is conserved, the darker peripheries in Figure 2 are accompanied by slightly brighter images in the radar illuminated volume. This could be useful for accurately quantifying signal strengths.
Figure 4 shows the same radar images, this time computed using the radiation pattern in the prior probability but neglecting the off-diagonal terms in the error covariance estimator. While the effect is arguably subtle when presented in log intensity format, there is more detail in Figure 2 than in Figure 4, and the former are sharper than the latter. Blank space appears in gaps between some of the large-scale waves in Figure 2 whereas there are almost no gaps in Figure 4. The generally sharper images in the middle panels arise from a number of smaller error variance estimates. At the same time, the features in Figure 2 seem to be better confined to the zenith angles illuminated by the radar.
 Aperture synthesis radar imaging is emerging as a powerful tool for studying the spatial structure of plasma irregularities in the Earth's ionosphere. The resolution of the technique is limited by the longest interferometry baselines which can, in practice, be much longer than the dimensions of the main radar antenna array. Because the data tend to be sparsely and incompletely sampled and because of the poorly conditioned nature of the problem, imaging generally takes the form of a constrained optimization problem. Statistical inverse theory in general and MaxEnt in particular constitute an optimal solution to the problem, incorporating all available prior information and constraining the solution with the data in terms of complete and precise confidence limits. MaxEnt recovers much more detail than adaptive beamforming approaches such as Capon's method, although the latter reproduces broad morphological features accurately and may still be useful in real time applications.
 We have improved the WD85 MaxEnt algorithm by incorporating the two-way antenna pattern in the prior probability and by utilizing the full error covariance matrix in the constraint. The former modification suppresses artificial image features outside the sector illuminated by the radar. The latter gives rise to somewhat sharper images than would otherwise be possible by exploiting the high degree of accuracy associated with normalized coherence estimates in interferometry. A thorough analysis of errors in radar imaging is presented in Appendix A that may be useful in a broad class of signal processing problems.
 The benefits of modifying the WD85 algorithm are not drastic as the faults they remedy were not grave. The impact on the kind of qualitative studies of ionospheric irregularities undertaken to date will probably not be very significant. However, as quantitative applications for radar imaging appear, corrections such as these will take on added significance. Convection electric field estimates from radar images of auroral echoes are an example of an application that is highly intolerant of artifacts and that can benefit from this analysis [Bahcivan et al., 2005].
Appendix A: Error Analysis
 Here, we derive a general expression for the error covariance for normalized cross-spectral estimates in the presence of noise. The approach mirrors that outlined by Farley . Our results reproduce those presented by Hysell and Woodman  but are more general and compact. The expressions derived should have a wide range of applicability outside the context of radar interferometry in view of the fact that the samples in question could equally come from different times as from different antennas, as they are understood to here.
 Consider four complex signals which can be regarded as the quadrature voltages present at the output from four receivers, each one attached to a different antenna receiving radar backscatter. Let v1i represent the ith sample of the voltage from receiver number 1, for example, which is taken to be a Gaussian random variable with zero mean. If ρ12 is defined as the normalized cross correlation of the signals from receivers 1 and 2, then an obvious estimator of ρ12 is
where the numerator and denominator are computed from the same m statistically independent, concurrent samples. The estimators A12 and B12 that make up the estimate of ρ12 deviate from their expectations because of the finite number of samples involved, giving rise to errors that we express as
where A12 is an unbiased estimator and 〈A12〉 ≡ Sρ12, S being the signal power. Note however that B12 is a biased estimator because of possible correlations in the signals from the two receivers involved and that 〈B12〉 ≈ S²(1 + ∣ρ12∣²).
 Assuming that m is large and that the relative errors are small, we then have
The first-order error terms in (A3) are the dominant contributors to the error covariance, and only they will be retained in what follows. Higher-order terms introduce additional biases in the estimate of ρ12 but will be neglected.
 While the samples are random, they are correlated, and consequently so are the errors in correlation functions estimated from them. The error correlation can be expressed as
where we do not differentiate between ρij and 〈Aij〉/√〈Bij〉 in the term multiplying the angle brackets, here or elsewhere in the appendix.
 What remains is the computation of the four quadratic terms inside the angle brackets in (A4) which are readily determined with the application of the fourth moment theorem for Gaussian random variables and its generalizations [e.g., Reed, 1962]. For example, we may write
Making use of the last result and with a little rearranging of (A5), an expression for the corresponding quadratic term in (A4) can be derived:
 Likewise, we may write
where terms with leading factors smaller than m⁻¹ have been neglected. With the appropriate identifications, this may be expressed as
With some further rearranging, we have
To order m⁻¹, this leads to the result
 Similar calculations can be performed to yield the remaining two quadratic terms
Finally, incorporating (A6)–(A9) into (A4) yields the complete expression for the error covariances:
Several comments about this result are in order. First, in the case of small correlations, the quadratic term in (A10) dominates, and the expression reduces to the one derived by Farley . Second, in the case where the correlations tend toward unity, the covariance vanishes, even when the number of samples m is small. This property can permit very accurate interferometric measurements to be made even given relatively short integration times [Farley and Hysell, 1996]. Third, (A10) is a general one that applies even in the cases of repeated indices. For example, if the subscripts 3 and 4 are replaced by 1 and 2, respectively, (A10) reduces to the expression for the variance derived by Farley . Note that any correlation term that appears with repeated indices (e.g., ρii) is merely unity.
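The first two comments can be checked with a small Monte Carlo experiment, sketched below. It does not evaluate (A10) itself. It simulates noise-free, zero-mean complex Gaussian signals with a prescribed real correlation, forms the normalized estimator from m samples, and compares the scatter of the estimates at low and high correlation. The sample count, trial count, correlation values, and function name are arbitrary choices for illustration.

```python
import numpy as np

def simulate_estimates(rho, m, trials, rng):
    """Normalized cross-correlation estimates from m samples, repeated `trials` times."""
    shape = (trials, m)
    n1 = (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2.0)
    n2 = (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2.0)
    v1 = n1
    v2 = rho * n1 + np.sqrt(1.0 - rho ** 2) * n2     # <v1 v2*> = rho for real rho
    A = np.mean(v1 * np.conj(v2), axis=1)
    B = np.mean(np.abs(v1) ** 2, axis=1) * np.mean(np.abs(v2) ** 2, axis=1)
    return A / np.sqrt(B)

rng = np.random.default_rng(0)
m, trials = 100, 4000

est_lo = simulate_estimates(0.0, m, trials, rng)     # uncorrelated signals
est_hi = simulate_estimates(0.95, m, trials, rng)    # highly coherent signals

var_re_lo = np.var(est_lo.real)
var_re_hi = np.var(est_hi.real)   # error along the phase direction (coherence)
var_im_hi = np.var(est_hi.imag)   # error normal to it (phase)
```

For uncorrelated signals the variance of the real part is close to 1/(2m). At a correlation of 0.95 the variance along the phase direction collapses by roughly two orders of magnitude, while the variance normal to it shrinks far less, in line with the remark that coherence is measured more accurately than phase.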
 Finally, (A10) applies to the case of no noise but can easily be generalized to encompass the effects of added noise. In what follows, we assume that the noise has been estimated from a large number of samples and consequently neglect errors introduced by the computation of the noise estimate.
 The appropriate estimator for the correlation function in the presence of noise is
where $S$ and $N$ are the signal and noise powers associated with the given voltage samples, respectively; $\rho_S$ and $\rho_N$ are the signal and noise correlation functions, defined such that $(S + N)\rho = S\rho_S + N\rho_N$; $\langle A_{12}\rangle = S\rho_S + N\rho_N$; and $S + N$ replaces $S$ in $\langle B_{12}\rangle$. With some rearranging and the expansion of its denominator, (A11) can be written as
 At this point, we limit the discussion to estimator errors for correlation functions with nonzero lags and assume that the antennas on which the different signals are received are sufficiently far apart that the noise correlation $\rho_N$ vanishes. In that case, $\rho_S = \rho\,(S + N)/S$, and (A12) adopts the form of (A3), except with additional factors of the ratio $(S + N)/S$. Propagating those factors through the calculations performed earlier leads to the following expression for the covariances:
In addition, it is convenient to replace all the $\rho$ factors in (A13) with the corresponding $\rho_S$ factors, each multiplied by a factor of $S/(S + N)$. Doing so returns the expression to the form of (A10), only with the correlation functions now representing the correlations of the signal components only ($\rho \rightarrow \rho_S$). The exceptions to this rule are the correlation terms with repeated indices, $\rho_{ii}$, which should be replaced by unity. The consequence of all this is that the normalized correlation function error covariances for signals in the presence of noise are ultimately given by (A10), only substituting the factor
wherever correlation terms with repeated indices appear.
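The relationship between the measured and signal-only correlations used above, $\rho_S = \rho\,(S + N)/S$ for uncorrelated noise, can be illustrated with synthetic data. This is a sketch under stated assumptions: circular complex Gaussian signals and noise, noise that is independent between the two antennas, and known powers. The helper names `signal_correlation` and `cgauss`, and the values of S, N, and the signal correlation, are hypothetical.

```python
import numpy as np

def signal_correlation(rho_measured, signal_power, noise_power):
    """Rescale a measured nonzero-lag correlation to its signal-only value."""
    return rho_measured * (signal_power + noise_power) / signal_power

rng = np.random.default_rng(1)
n = 200_000                  # number of voltage samples (arbitrary choice)
S, N = 2.0, 1.0              # assumed signal and noise powers
rho_s = 0.8 * np.exp(0.5j)   # assumed signal correlation

def cgauss(power):
    """Zero-mean circular complex Gaussian samples with the given power."""
    return np.sqrt(power / 2.0) * (rng.standard_normal(n)
                                   + 1j * rng.standard_normal(n))

# Correlated signal components with <s1 s2*> = S * rho_s.
s1 = cgauss(S)
s2 = np.conj(rho_s) * s1 + np.sqrt(1.0 - abs(rho_s) ** 2) * cgauss(S)

# Independent noise on each antenna, so the noise correlation vanishes.
v1 = s1 + cgauss(N)
v2 = s2 + cgauss(N)

rho_measured = np.mean(v1 * np.conj(v2)) / np.sqrt(
    np.mean(abs(v1) ** 2) * np.mean(abs(v2) ** 2))
rho_recovered = signal_correlation(rho_measured, S, N)

print(abs(rho_measured), abs(rho_recovered), abs(rho_s))
```

The measured coherence is reduced by the factor S/(S + N), while the phase is unaffected; rescaling restores the signal coherence to within statistical error.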
 Last, when considering error terms associated with zero lag estimators, the noise correlation terms in (A12) must be retained, and $\rho_S = \rho = 1$ must be used for that lag. Doing so and carrying the results through the calculations leading to (A13), it is easily shown that the covariance terms involving the zero lag estimator are all identically zero. This result should be obvious because the normalized correlation function for the zero lag and its estimator are identically unity and cannot suffer from statistical errors.
 The preceding analysis yielded the complex error covariances for complex visibility estimates. We can be more explicit and derive the error covariances for the real and imaginary parts of the visibilities treated as separate entities. The results are the most complete description of the experimental error bounds possible. The additional information they contain can be used to constrain the imaging problem more precisely.
 We return to (A10), the complex covariance for complex correlation functions, and express it as
where the $e_{r,i}$ terms stand for the real and imaginary parts of the errors for the correlation estimate for the given antenna pair. Regrouping terms, we have
Consider next the related quantity
which is like (A4) only without the complex conjugate. Following the procedure outlined above and borrowing from the existing formalism, it can be shown that
Furthermore, we note that
Therefore the variances and covariances of the real and imaginary parts of the errors are contained in the real and imaginary parts of the sum and difference of $\delta^2$ and $\delta'^2$:
The effects of a finite signal-to-noise ratio are accounted for by the prescriptions given in (A14).
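The bookkeeping behind the sum-and-difference rule can be sketched in a few lines of Python. The sketch assumes that $\delta^2$ denotes the average of one complex error times the conjugate of the other and that $\delta'^2$ denotes the same average without the conjugate, as described above; the function name `real_imag_covariances` and the dictionary keys are hypothetical, and the synthetic errors are arbitrary numbers used only to confirm the algebra.

```python
import numpy as np

def real_imag_covariances(delta2, delta2_prime):
    """Split <e_a e_b*> and <e_a e_b> into real/imaginary covariances.

    Keys name the parts of (e_a, e_b): "ri" is <Re(e_a) Im(e_b)>, and so on.
    """
    return {
        "rr": 0.5 * np.real(delta2 + delta2_prime),
        "ii": 0.5 * np.real(delta2 - delta2_prime),
        "ri": 0.5 * np.imag(delta2_prime - delta2),
        "ir": 0.5 * np.imag(delta2_prime + delta2),
    }

# Arbitrary correlated complex "errors" for two antenna pairs.
rng = np.random.default_rng(2)
e_a = rng.standard_normal(1000) + 1j * rng.standard_normal(1000)
e_b = 0.3 * e_a + rng.standard_normal(1000) + 1j * rng.standard_normal(1000)

delta2 = np.mean(e_a * np.conj(e_b))   # with the complex conjugate
delta2_prime = np.mean(e_a * e_b)      # without the complex conjugate
cov = real_imag_covariances(delta2, delta2_prime)

# Direct sample averages for comparison.
direct_rr = np.mean(e_a.real * e_b.real)
direct_ii = np.mean(e_a.imag * e_b.imag)
direct_ri = np.mean(e_a.real * e_b.imag)
direct_ir = np.mean(e_a.imag * e_b.real)
```

Because the decomposition is an algebraic identity, the entries of `cov` agree with the direct sample averages to rounding error for any input.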
 For example, (A17) and (A18) can be used to derive the autocovariances for the real and imaginary parts of a certain correlation function estimate in the large signal-to-noise ratio limit. In this case, $\delta^2 = (1/m)(1 - 3|\rho|^2/2 + |\rho|^4/2)$, $\delta'^2 = (1/m)(-\rho^2/2 + \rho^2|\rho|^2/2)$, and
 These expressions illustrate important properties of normalized cross-correlation estimates (see Figure A1). The first is that the errors become small as the coherence ∣ρ∣ approaches unity. The second is that coherence errors are generally much smaller than phase errors, particularly when the coherence is high. This is due to the tendency for normalization to reduce (actually cancel) coherence errors. Errors are smallest along the phase-plane axis aligned with the phase angle of ρ, and the real and imaginary errors are highly correlated when ρ is aligned with neither axis. These properties suggest that the real and imaginary parts of the visibility are somewhat unnatural parameters for imaging and that the problem would perhaps be better formulated in terms of coherence and phase.
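As a worked illustration of these properties, the large signal-to-noise expressions for $\delta^2$ and $\delta'^2$ quoted above can be separated into real and imaginary parts by hand. The sketch assumes that the error for a single antenna pair is written $e = e_r + i e_i$, so that $\delta^2 = \langle e e^* \rangle$ and $\delta'^2 = \langle e e \rangle$, and that $\rho = \rho_r + i\rho_i$. This notation is introduced here for illustration, and the results are not claimed to reproduce the exact form of the numbered equations.

```latex
\begin{aligned}
\langle e_r^2 \rangle &= \tfrac{1}{2}\,\mathrm{Re}\!\left(\delta^2 + \delta'^2\right)
  = \frac{1}{2m}\left(1 - |\rho|^2\right)\left(1 - \rho_r^2\right), \\
\langle e_i^2 \rangle &= \tfrac{1}{2}\,\mathrm{Re}\!\left(\delta^2 - \delta'^2\right)
  = \frac{1}{2m}\left(1 - |\rho|^2\right)\left(1 - \rho_i^2\right), \\
\langle e_r e_i \rangle &= \tfrac{1}{2}\,\mathrm{Im}\,\delta'^2
  = -\frac{1}{2m}\left(1 - |\rho|^2\right)\rho_r \rho_i .
\end{aligned}
```

All three terms carry the factor $(1 - |\rho|^2)$, so the errors vanish as the coherence approaches unity. The cross term vanishes only when $\rho_r$ or $\rho_i$ is zero, that is, when $\rho$ lies along one of the axes.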
 The real and imaginary errors become uncorrelated when the phase angle of the correlation is aligned with the real or imaginary axis. Diagonalization of the error covariance matrix therefore amounts to rotating into a coordinate system where alignment occurs. In such a space, we expect half the errors (corresponding to the aligned axes) to be small and the other half (corresponding to the axes at right angles to the correlation phase) to be large. In effect, M complex, normalized visibility measurements imply M accurately measured parameters related to coherence and another M less accurately measured parameters related to phase angle.
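The rotation described above can be illustrated for a single visibility. The sketch below builds the 2 × 2 covariance matrix of the real and imaginary errors from the large signal-to-noise expressions for $\delta^2$ and $\delta'^2$ given earlier, assuming $\delta^2 = \langle e e^* \rangle$ and $\delta'^2 = \langle e e \rangle$ for a single pair, and then diagonalizes it. The function name `error_covariance_2x2` and the values of the correlation and m are hypothetical.

```python
import numpy as np

def error_covariance_2x2(rho, m):
    """Covariance of (Re e, Im e) for one estimate in the high-SNR limit."""
    a2 = abs(rho) ** 2
    delta2 = (1.0 - 1.5 * a2 + 0.5 * a2 ** 2) / m
    delta2_prime = (-0.5 * rho ** 2 + 0.5 * rho ** 2 * a2) / m
    var_r = 0.5 * np.real(delta2 + delta2_prime)
    var_i = 0.5 * np.real(delta2 - delta2_prime)
    cov_ri = 0.5 * np.imag(delta2_prime)
    return np.array([[var_r, cov_ri], [cov_ri, var_i]])

rho = 0.9 * np.exp(1j * np.pi / 5)   # assumed coherence 0.9, phase 36 degrees
m = 100                              # assumed number of samples

# eigh returns eigenvalues in ascending order, eigenvectors as columns.
eigvals, eigvecs = np.linalg.eigh(error_covariance_2x2(rho, m))

small_axis = eigvecs[:, 0]                         # direction of smallest error
rho_dir = np.array([rho.real, rho.imag]) / abs(rho)
alignment = abs(small_axis @ rho_dir)              # 1 when the axes coincide

print(eigvals, alignment)
```

In this sketch the smaller eigenvalue belongs to the axis parallel to the correlation and plays the role of a coherence error, while the larger one belongs to the perpendicular axis; dividing the latter by $|\rho|^2$ gives a small-angle estimate of the phase variance.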
 For the sake of completeness, we point out that (A19) and (A20) lead automatically to expressions for the variances of measurements of normalized coherence and phase:
 In closing, we note that the probability density functions for the cross spectra of highly coherent signals can depart significantly from Gaussian distributions, undermining the validity of the preceding analysis to some degree. The expressions derived here provide the leading behavior, and an analysis of the conditions under which they fail is left as a topic for future work.
 The authors wish to thank Donald Farley and Ronald Woodman for valuable discussions. This work was supported by the National Science Foundation through NSF grant ATM-0225686 to Cornell University. The Jicamarca Radio Observatory is a facility of the Instituto Geofísico del Perú and is operated with support from the NSF Cooperative Agreement ATM-0432565 through Cornell University. The help of the staff was much appreciated.