Radio Science

Using complex independent component analysis to extract weak returns in MARSIS radar data and their possible relation to a subsurface reflector on Mars



[1] Mars Advanced Radar for Subsurface and Ionosphere Sounding (MARSIS) radar signals returned from the surface and subsurface of the northern lowlands of Mars are mostly characterized as a prominent surface pulse followed by a tail of weak signals. We applied the method of blind source separation (BSS) using complex independent component analysis (CICA) to examine if there is any subsurface information hidden in the apparent clutter and noise in the tails. In this paper we describe the application of the method with a case study in a selected region in the western part of Utopia Planitia (88°E–91°E, 45°N–51°N) where an unnamed large (∼160 km in diameter), semifilled crater is located. Applying CICA, we find a particularly significant signal from one of the blind sources. This signal is interpreted as representing a subsurface reflection interface, based on analysis of characteristics of the separated sources and their relation to surface roughness inside and outside the crater. A numerical simulation is conducted to examine the conditions under which the method tends to be effective. We conclude that CICA is helpful for recognizing subsurface signals above the noise level but obscured by clutter and thermal noise, provided the surface is relatively smooth.

1. Introduction

[2] One of the scientific objectives of MARSIS (Mars Advanced Radar for Subsurface and Ionosphere Sounding) on board the European Space Agency's Mars Express orbiter (MEX) is to study the geologic structures of Mars to a depth of a few kilometers [Picardi et al., 2004]. In this respect there have been striking achievements, for example, the detection of what appear to be buried craters in the northern lowlands, near-surface layered structures in the polar and some near-equator regions of Mars [Picardi et al., 2005; Watters et al., 2006; Plaut et al., 2007; Watters et al., 2007; Farrell et al., 2008], and constraints on possible inferences about the material properties in central Elysium Planitia and other regions [Boisson et al., 2009; Zhang et al., 2008]. However, it appears that MARSIS has revealed fewer subsurface features of Mars than was expected, especially in the northern lowlands where the surface is relatively smooth and the impact of surface clutter on the radar signals is relatively easy to track. The aforementioned findings are restricted to a few specific locations, while over most of the vast lowlands no subsurface information has been extracted from MARSIS observations. On the other hand, the Mars Orbiter Laser Altimeter (MOLA) data have brought to light numerous “quasi-circular depressions” (QCDs), which are interpreted to be buried impact basins [Smith et al., 2003; Frey et al., 2002]. This means that if the top crustal materials are sufficiently low loss (i.e., allowing a high degree of signal penetration), more subsurface features such as “crater basin floors” might be observed by MARSIS. Farrell et al. [2009] theoretically predict that a strong reflecting interface 1–2 km below the surface can be detected by MARSIS if the overlying layer has a loss tangent (ratio between conductivity and permittivity times wave angular frequency) below, roughly, 0.008.
Other simulations, taking into account surface roughness and material type variables, also predict the visibility of subsurface interfaces, if any, within a few kilometers' depth by MARSIS [e.g., Ciarletti et al., 2003; Xu et al., 2006]. Covering most of the northern lowlands is the geological unit called the Vastitas Borealis Formation (VBF) underlain by the Hesperian volcanic flows [e.g., Tanaka et al., 2003, 2005]. The thickness of VBF was roughly estimated to be a few hundred meters to 2 km [e.g., Head et al., 2002; Fairén and Dohm, 2004], while opinions regarding the material types of VBF are highly controversial, for example, lava flows or eolian products [Greeley and Guest, 1987], fine-grained residue ocean sediments [Kreslavsky and Head, 2002], frozen deposits of sediment-laden water from outflow channels [Carr and Head, 2003], and resurfacing/deformational products [Tanaka et al., 2005]. However, a general perspective may be that in the northern lowlands the thickness and materials of the top layer are changeable from place to place. There are places like the one in Chryse Planitia where the top layer is low loss and a reflecting interface 2–2.5 km under surface may be observed by MARSIS [Watters et al., 2007]. There may be places where no subsurface reflecting interfaces exist, or above an interface the top layer is too lossy and/or too thick to let the radar signals go through twice before entirely absorbed. In this case, MARSIS can see nothing from the subsurface. Also possible is that there are “intermediate” places where subsurface information may be included in the observed signals but neglected due to lack of further processing of the signals. 
It was anticipated before, and has been seen since MARSIS sent data, that there are difficulties in recognizing underground features from MARSIS subsurface sounding data, due to the fact that clutter and other noise are competing with possibly weak subsurface signals [Campbell and Shepard, 2003; Safaeinili et al., 2003; Nouvel et al., 2004; Farrell et al., 2009; D. W. Beaty et al., Analysis of the potential of a Mars orbital ground-penetrating radar instrument, white paper, 2001, available at]. Clutter/noise reduction using any effective method is therefore very important to help recognize possible subsurface signals from the MARSIS subsurface sounding data.

[3] In this paper we use the method of complex-valued independent component analysis (complex ICA, or CICA) to separate possible subsurface reflections from clutter and thermal noise. CICA is a signal processing technique (dealing with complex-valued signals) widely applied where blind source separation (BSS) problems are concerned [Fiori, 2001; Pham and Cardoso, 2001; Cichocki and Amari, 2003]. BSS means that the observed signals are supposed to be mixtures of a set of unknown, statistically independent source signals, and the original sources are recovered based on information from their mixtures. In this paper we suppose the MARSIS signal to be a linear, instantaneous mixture of several independent sources including surface clutter, thermal noise, and sometimes subsurface reflections, based on the notion that any received radar signal is a vector sum of echoes from multiple scatterers [Curlander and McDonough, 1991]. The recognition of subsurface signals is thus taken as a BSS problem.

[4] In the northern lowlands where the surface is relatively smooth, the observed MARSIS signals are dominantly characterized as a prominent surface pulse followed by a series of lower intensity peaks above the noise level, typically lasting up to about 100 μs, which is called a “long tail” and is believed to be predominantly composed of surface clutter [Farrell et al., 2009]. Our present study is centered on the “long tails”. We hereafter call them “tails” for simplicity. The tails can vary in shape between places even where the surface roughness looks similar. For example, Figure 1 shows two representative signals (frames 230 and 240 on orbit 3675, where a frame is a single observation recorded by MARSIS). From Figure 1 we see the tails of the two signals are different: the tail of frame 240 looks “fatter” than that of frame 230. (We use modulus, or magnitude, to express the intensity of a complex-valued signal. The modulus of a complex number z is defined as $|z| = [(\mathrm{Re}\,z)^2 + (\mathrm{Im}\,z)^2]^{1/2}$.) Such a difference is hard to discern in a radargram (signal intensity distribution on the spacecraft position versus time delay plane) after coherent processing (i.e., Doppler filtering, range compression, and ionospheric correction) only. We will see that the difference can be enhanced by CICA, thus enabling further interpretation.

Figure 1.

Two examples (frames) of MARSIS subsurface sounding signals showing the prominent surface pulse followed by a tail in each signal (a frame is a single observation recorded by MARSIS). The vertical axis is modulus normalized to 0–1. Only a portion of the 512-sample MARSIS sampling window is shown in the horizontal axis. The geographical position (latitude and east longitude) of each frame is indicated.

[5] The geographical positions of the above two frames are shown in Figure 2. The two locations are of different geological backgrounds: frame 240 is one of the frames located in a semifilled large crater (centered at 89.50°E, 47.75°N, and about 160 km in diameter), whereas frame 230 is one of those outside the crater. Is there an essential relation between the difference in geology and the difference in shape of the tails? We will see that CICA can be helpful in answering this question.

Figure 2.

Geographical positions of the MARSIS signals shown in Figure 1. The Mars Express orbiter ground track of orbit 3675 is plotted on a topographic map derived from the 128 pixel/degree Mars Orbiter Laser Altimeter (MOLA) data [Smith et al., 2003] (not projected). A cross on the orbit track indicates a frame position. Some of the frame numbers are tagged at the right of the frame positions. The dashed curve outlines a large degraded crater that appears as a quasi-circular topographic depression of ∼300 m depth and ∼160 km diameter. Also indicated are some newer smaller craters on the north part of the large crater. On the right side is the scale of the topographic map (in the latitudinal direction the scale is uniform; in the longitudinal direction the scale varies with latitude).

[6] In section 2 we first have a brief description of the CICA technique we employ, and in section 3 we explain how it is applied in MARSIS signal processing. A case study of the region shown in Figure 2 is demonstrated in section 4, showing that a possible subsurface signal mixed in clutter and other noise may be detected with the help of CICA. In section 5 we describe a numerical simulation which is intended to further examine the conditions in which the method tends to be effective. Concluding remarks are included in section 6.

2. Complex ICA

[7] Normally we think of a radar signal as the sum of all the reflections/scatterings from the surface/subsurface in the field of view. It is however possible to consider the radar signal as composed of signals from n > 1 different physical sources, where each physical source gives a specific “source signal”. As the radar moves along its orbit, any m ≥ n successive observations may be taken as simultaneous measurements (by m sensors) of different mixtures of the same n source signals. (Obviously, this approximation requires that n and m not be very large.) The aim of CICA is to estimate (separate) the n source signals based on the m measurements. The estimated source signals can then be used to infer the characteristics of the physical sources, e.g., existence/nonexistence of a subsurface target, taking into account the physical context involved.

2.1. Algorithm

[8] Suppose we have m observed complex-valued signals in discrete time domain with d samples, each denoted by zi(t), where i = 1,2, …, m, and t = t1,t2,…,td stands for time positions. In CICA the signals are taken as random vectors, and we denote zi = zi(t) as a random vector. The m vectors comprise an m × d data matrix, denoted by z,

$$ z = (z_1, z_2, \ldots, z_m)^{T} \tag{1} $$

where “T” stands for transpose. In order to keep comparability between the zi values, also for mathematical convenience, zi is centered; i.e., each of the d elements in zi is replaced by a new value equal to its original value minus the average of the d elements. The average value is defined as

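$$ E(z_i) = \frac{1}{d}\sum_{l=1}^{d} \mathrm{Re}\, z_i(t_l) + j\,\frac{1}{d}\sum_{l=1}^{d} \mathrm{Im}\, z_i(t_l) \tag{2} $$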

where j is the imaginary unit, and E( ) means averaging. Hereafter we take the data z as already centered.

[9] Suppose each zi is a linear mixture of n independent, unknown source signals, which are also d dimensional random vectors, denoted by sk = sk(t), k = 1,2, …, n, n ≤ m. The n source signals comprise an n × d matrix, denoted by s:

$$ s = (s_1, s_2, \ldots, s_n)^{T} \tag{3} $$

Then z and s are related by

$$ z = A s \tag{4} $$

where A represents an unknown, complex-valued m × n mixing matrix. In practice, for simplicity, m = n is often adopted; A is thus square (n × n) and is assumed to be full rank. The aim of CICA is to find out a separating matrix W(z) (as a function of data z) such that

$$ \hat{s} = W(z)\,z = a\,s \tag{5} $$

where $\hat{s}$ represents an estimate of the original source signals s, and a is any n × n ambiguity matrix, defined as a complex matrix with only one nonzero value in each row and in each column [Ollila et al., 2008; Ollila and Koivunen, 2009].

[10] Hereafter for brevity we will call a “source signal” a source, which is distinguishable from a “signal source” or a physical source of signals.

[11] Corresponding to different assumptions about the sources, there are different algorithms to compute W from data z. We use the generalized uncorrelating transform (GUT) algorithm, which is proposed and explained in mathematical detail by Ollila and Koivunen [2009]. Based on the descriptions in that paper, we rewrite the algorithm as the following four steps (suppose z is already centered).

[12] Step 1 is to calculate the covariance matrix of data z, denoted by C(z), which is an n × n complex-valued, Hermitian matrix containing the covariance between zi and zk (i,k = 1,2, …, n) as elements. The covariance between zi and zk is a complex value defined as $\mathrm{cov}(z_i, z_k) = E(z_i z_k^{H})$, where H means conjugate transpose. We denote in matrix form

$$ C(z) = E(z z^{H}) \tag{6} $$

[13] Step 2 is to calculate the pseudocovariance matrix of z, denoted by P(z), which is also an n × n symmetric complex matrix containing the pseudocovariance between zi and zk as elements. The pseudocovariance between zi and zk is a complex value defined as $\mathrm{pcov}(z_i, z_k) = E(z_i z_k^{T})$. We denote in matrix form

$$ P(z) = E(z z^{T}) \tag{7} $$

[14] In step 3, let

$$ D(z) = E(z)\,E(z)^{*}, \qquad E(z) = C(z)^{-1} P(z) \tag{8} $$

where the superscript −1 stands for inverse and the asterisk stands for conjugate. Conduct singular value decomposition (SVD) [Strang, 1988] of D(z). That is, find out a unitary matrix U(z) and a real-valued, nonnegative diagonal matrix Λ(z) such that

equation image

[15] Then U(z) contains the eigenvectors (independent components, ICs) of D(z) as rows, and Λ(z) contains the eigenvalues of D(z) as diagonal elements. The eigenvalues, λi(i = 1,2, …, n), in Λ(z) are rearranged in decreasing order (and the rows in U(z) are accordingly reordered). We conduct SVD using an efficient iterative method described by Killingbeck et al. [2004].

[16] In step 4, set W = U(z)*, and then calculate $\hat{s}_i$ using equation (5).
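The four steps can be sketched in Python with NumPy as follows. This is a minimal illustration under stated assumptions, not the implementation used for the results in this paper. The function name cica_gut, the synthetic three-source test data, and the unit-variance normalization are assumptions made here for illustration, and a general eigendecomposition (np.linalg.eig) stands in for the iterative method of Killingbeck et al. [2004]. NumPy returns eigenvectors as the columns of V, so the separating matrix is formed as the conjugate transpose of V, which corresponds to W = U* when the eigenvectors are stored as the rows of U.

```python
import numpy as np

def cica_gut(z):
    """Sketch of steps 1-4 for an (n, d) complex data matrix z (m = n)."""
    z = z - z.mean(axis=1, keepdims=True)   # center each observed signal
    d = z.shape[1]
    C = z @ z.conj().T / d                  # step 1: covariance, E(z z^H)
    P = z @ z.T / d                         # step 2: pseudocovariance, E(z z^T)
    E = np.linalg.solve(C, P)               # C^{-1} P without an explicit inverse
    D = E @ E.conj()                        # step 3: D = E E^*
    lam, V = np.linalg.eig(D)               # eigenvectors are the columns of V
    order = np.argsort(lam.real)[::-1]      # decreasing eigenvalue order
    lam, V = lam.real[order], V[:, order]
    W = V.conj().T                          # step 4: separating matrix
    s_hat = W @ z
    # Assumption: fix the scaling ambiguity by giving each source unit variance.
    s_hat /= np.sqrt(np.mean(np.abs(s_hat) ** 2, axis=1, keepdims=True))
    return s_hat, W, lam

# Synthetic check with three noncircular sources (illustrative values only).
rng = np.random.default_rng(0)
d = 20_000
imag_scale = np.array([0.1, 0.5, 0.8])[:, None]   # controls noncircularity
s = rng.standard_normal((3, d)) + 1j * imag_scale * rng.standard_normal((3, d))
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
s_hat, W, lam = cica_gut(A @ s)
G = np.abs(W @ A)   # close to a scaled permutation matrix if separation worked
```

In this synthetic case each row of G should be dominated by a single entry, which is the scaling and permutation ambiguity expressed by the matrix a in equation (5). Because the eigenvalues are sorted in decreasing order, the most noncircular source is expected to come first.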

2.2. Remarks

[17] Let us look into the matrix $E(z) = C(z)^{-1}P(z)$ defined in equation (8) in order to understand the optimization principle behind the above algorithm. The covariance matrix C(z) is a description of correlations between the data vectors (z), so $C(z)^{-1}$ represents “independence” between the data vectors. The pseudocovariance matrix P(z) describes circularities of independent vectors. A random vector z is called circular if z and $z e^{j\theta}$ have the same probability distribution for any real number θ [Ollila and Koivunen, 2009]. Roughly, if the sample points of z are distributed in a circular area centered on the origin of the complex plane (note that z is centered), then z is circular; otherwise it is noncircular. If zi and zk (k ≠ i) are circular then pcov(zi,zk) = 0 [Ollila and Koivunen, 2009]. Thus, bigger elements in P(z) imply larger noncircularities of the vectors. Therefore, E(z), as well as $D(z) = E(z)E(z)^{*}$ (equation (8)), is a measurement of the degree to which the data vectors are uncorrelated and noncircular. Since the separating matrix W is derived by SVD of D(z), the optimization criterion for the above described CICA algorithm is that the separated sources are maximally independent of each other and maximally noncircular.
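The notion of circularity can be illustrated numerically. The sketch below compares a circular and a noncircular random vector through the ratio of the modulus of the pseudovariance to the variance. The helper name circularity_coefficient and the synthetic Gaussian samples are assumptions introduced here for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 50_000

# Circular: real and imaginary parts independent with equal variance.
circ = rng.standard_normal(d) + 1j * rng.standard_normal(d)
# Noncircular: the imaginary part is much weaker than the real part.
noncirc = rng.standard_normal(d) + 0.2j * rng.standard_normal(d)

def circularity_coefficient(v):
    """|E(v v)| / E(|v|^2) for a centered complex vector v (illustrative helper)."""
    v = v - v.mean()
    return abs(np.mean(v * v)) / np.mean(np.abs(v) ** 2)

k_circ = circularity_coefficient(circ)        # expected to be close to 0
k_noncirc = circularity_coefficient(noncirc)  # expected to be close to 1
```

A value near zero means the pseudovariance carries almost no information, which is why circular components contribute little to P(z) and end up with small eigenvalues.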

[18] The estimates ($\hat{s}_i$) contain ambiguity with respect to the real sources (si, which are unknown) in scaling and permutation. Scaling ambiguity means that $\hat{s}_i$ is a result of si multiplied by some unidentified complex constants, which are the unknown nonzero elements of a in equation (5). This implies that circular sources cannot be reliably separated from each other by the method, because the probability distribution of a circular random vector is the same as that of another circular random vector multiplied by a certain complex number. Permutation ambiguity means that, although each $\hat{s}_i$ corresponds to a unique sk, we do not know which corresponds to which, because a is unknown. Permutation ambiguity necessitates interpretation of the separated sources incorporating a priori knowledge about the physical sources.

[19] The ambiguities are mathematically unavoidable [Eriksson and Koivunen, 2006]. However, they do not amount to an obstacle to the use of the method, if our interest is focused on the “waveforms” of the sources rather than their absolute amplitudes. By waveform we mean the shape of the signal function z = z(t) in a given time domain (e.g., the sampling window of MARSIS). For example, suppose si = s(t) is a signal containing a short pulse and sk = s(t + Δt) is another signal different from si by only a nonzero time shift Δt. We still say si and sk are of different waveforms. Two sources of different waveforms can be separated by CICA, because there is no time shift ambiguity introduced in the algorithm of CICA.

[20] A subsurface echo hidden in the tail is likely to be separated by the algorithm under some conditions; e.g., the echo is from a flat interface. In this case the energy of the echo tends to concentrate on a specific time position across multiple observations, thus tending to have larger pseudocovariance between adjacent observations (i.e., noncircularity) and smaller covariance with other sources that have different concentrating time positions (i.e., independence).

[21] The eigenvalues λi (diagonal elements of Λ(z)) can be taken as measurements of the statistical significances of the corresponding sources. Statistical significance means the degree to which the source is independent and noncircular, and it also indicates how large a fraction of covariation in the data (z) is explained by the corresponding source.

3. Application of CICA in MARSIS Data Processing

[22] Here we describe the general scenario we employ to apply CICA to help recognize possible subsurface signals obscured by clutter and other noise in MARSIS data. The whole scenario contains five steps, including a data preprocessing step before the CICA algorithm is applied, as follows.

3.1. Data Decompressing and Coherent Processing

[23] Data decompressing and coherent processing is the preprocessing step which must be done before CICA is applied. We use the level 1b “FRM_SS3_TRK_CMP_EDR_*.DAT” data files, which contain radar echoes in a compressed form resulting from an algorithm called “maximum exponent normalization” [Orosei et al., 2008]. Data decompressing restores the data in the records from 1 byte integers to 4 byte real numbers.

[24] Coherent processing incorporates Doppler filtering, range compressing with ionospheric correction, in order to convert the returned echoes of the transmitted long chirp into short pulses, thus making it possible to recognize the spatial positions of the targets based on the time delays of the pulses. Doppler filtering is done onboard [Orosei et al., 2008], so the data in the data files we used were already Doppler filtered. For range compressing and ionospheric correction, we employ a synthetic pulse compressing filter including neutral collision frequency to process the received signals. The filter realizes optimal compressing of a signal via maximizing the signal-to-noise-ratio (SNR) of the processed signal, and it is shown to be able to get higher SNR than that obtained using filters omitting neutral collision frequency. The method is described in detail by Zhang et al. [2009].

3.2. Time Aligning and Truncation

[25] As described in section 2, CICA retrieves n > 1 sources from n observed signals, so n signals are analyzed together whenever the algorithm is applied. Due to topography of the Mars surface and the uneven delaying effects of the ionosphere, the time positions of the observed signals can vary within the MARSIS sampling window. The observed signals should be aligned to an identical position before CICA is applied. This is done by simply shifting every one of the signals so that their “surface peaks” in the time domain are aligned to a certain, say the first, sample position. This aligning operation implies that the detection of subsurface signals by CICA will be sensitive to a subsurface interface parallel with the surface.

[26] Truncation means that the surface pulses are removed from the signals before CICA is applied. This is necessary for two reasons. First, the surface pulse is practically an outlier with respect to the whole signal because it is usually very short and always far stronger than the rest of the signal. It is obvious that the surface pulse, if not removed, always contributes the most to the covariation between multiple observed signals relative to other parts of the signals, so any information included in the tail may be shadowed by the surface pulse. Second, we are interested in information carried by the tail rather than in the surface pulse, which is already clear. We simply cut off the first 10 samples of the signals (after alignment) in the time domain, where 10 is a rough estimate of the time width (sample number) of the surface peak. This straightforward truncation keeps the remaining part of the observed signal intact, and thus information is maximally preserved. Note that the sampling window of MARSIS is ∼183 μs including 512 samples, so the time interval between adjacent samples is about 0.36 μs. This time interval corresponds to a spatial distance of 20–50 m in a medium, if we assume a permittivity of the medium, ɛr, to be 4–16 (the distance, l, can be calculated using the formula $l = c\,\Delta t\,\varepsilon_r^{-1/2}$, where $c = 2.9979 \times 10^{8}$ m/s is light speed in vacuum, and Δt the time interval). Therefore, when 10 samples including the surface pulse are truncated, a depth of 200–500 m (depending on ɛr) is removed from the top of the data.
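The sample interval and the corresponding distance per sample can be checked with a few lines of Python. This is a small arithmetic sketch that applies the formula for l exactly as stated above. The function name sample_length_m is an assumption for illustration, and the exact outputs differ slightly from the rounded ranges quoted in the text.

```python
C_LIGHT = 2.9979e8      # speed of light in vacuum, m/s
WINDOW_S = 183e-6       # approximate MARSIS sampling window, s
N_SAMPLES = 512         # samples per window

dt = WINDOW_S / N_SAMPLES   # time interval between adjacent samples, s

def sample_length_m(eps_r, dt=dt):
    """Distance per sample in a medium of relative permittivity eps_r."""
    return C_LIGHT * dt / eps_r ** 0.5

# Evaluate the formula at the two ends of the assumed permittivity range.
per_sample = {eps: sample_length_m(eps) for eps in (4, 16)}
# Length removed when the first 10 samples are truncated.
removed = {eps: 10 * length for eps, length in per_sample.items()}
```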

[27] Truncation can also include removal of some samples from the latest part (after the tail) of an aligned signal. This is plausible because the latest part of the signal is known to be noise dominating, and it is preferable for saving computational resources without harm to our objective to study the tails. We remove 211 samples (equivalent to a length of 5–10 km, depending on ɛr) from the end of each observed signal, so the signals processed by CICA have finally 291 samples each (equivalent to a depth range of about 10 km, 350 m below surface, if we assume the permittivity to be 9), from samples 10 to 300, with the tail as a large part (about half) in each.
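The aligning and truncation operations can be sketched as follows. This is an illustrative sketch, not the processing code used for this study. The function name align_and_truncate is hypothetical, the surface peak is assumed to be the strongest sample of each frame, and a circular shift (np.roll) is an assumed way to handle the samples that precede the surface peak. The synthetic frames are made up for the demonstration.

```python
import numpy as np

def align_and_truncate(frames, n_head=10, n_tail=211):
    """frames: (n_frames, n_samples) complex array of range-compressed echoes."""
    out = []
    for f in frames:
        peak = int(np.argmax(np.abs(f)))   # assumption: surface peak is the strongest sample
        aligned = np.roll(f, -peak)        # surface peak moves to sample 0; earlier
                                           # samples wrap around to the end of the frame
        out.append(aligned[n_head:f.size - n_tail])   # keep samples 10 to 300
    return np.asarray(out)

# Synthetic frames: weak noise, a surface pulse, and a later echo 50 samples after it.
rng = np.random.default_rng(1)
frames = 0.01 * (rng.standard_normal((5, 512)) + 1j * rng.standard_normal((5, 512)))
for i, p in enumerate([60, 72, 65, 80, 58]):
    frames[i, p] += 1.0
    frames[i, p + 50] += 0.3
tails = align_and_truncate(frames)
```

After this step every frame has 291 samples, and an echo at a fixed delay after the surface pulse falls at the same sample index in all frames (index 40 in this synthetic case), which is what makes an interface parallel with the surface stand out across successive observations.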

3.3. Choosing the Number of Sources

[28] To satisfy the need of CICA, a source number (n) should be assumed. We assume n = 5, considering the tail may be generally taken as a mixture of five different components in physical sense, namely, continuous clutter, deterministic clutter, subsurface reflection, “pure sidelobes,” and thermal noise. (Note that these components may be understood as “types” of signals or noise, but they can also be taken as specific source signals, because we can consider, for example, the whole surface of the radar view area as one target.) We will here have a brief description of the components and define the terms for future reference.

[29] 1. Continuous clutter, also called random clutter, arises from random small-scale topographic roughness of the off-nadir surface, and appears closely following the surface pulse [Safaeinili et al., 2003; Campbell and Shepard, 2003]. For a stochastic Gaussian surface, the relation between the specific radar cross section (σH) of the surface to incident angle (θ) of a radar wave can be described by a monotonically decreasing function σH(θ) (the Hagfors backscatter law) [Evans and Hagfors, 1968; Sultan-Salem and Tyler, 2006],

$$ \sigma_H(\theta) = \frac{\rho C}{2}\left(\cos^{4}\theta + C\sin^{2}\theta\right)^{-3/2} \tag{10} $$

where ρ is the Fresnel normal power reflectivity and C is called the width factor related to radar wavelength and surface roughness. For MARSIS, θ is positively related to the time delay of the surface backscatters, so we can say that (average) continuous clutter decreases with the time delay in a return. This clutter usually contributes much to the tail so that the tail takes a similar decreasing waveform. Due to its stable decreasing waveform, continuous clutter can be detected by CICA as a significant independent source.
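The decreasing behavior of the Hagfors law can be illustrated numerically. The sketch below assumes the classical form of the law, σH(θ) = (ρC/2)(cos⁴θ + C sin²θ)^(−3/2). The function name hagfors_sigma and the values of ρ and C are illustrative assumptions, not parameters derived in this study.

```python
import numpy as np

def hagfors_sigma(theta, rho, C):
    """Classical Hagfors backscatter law (assumed form); theta in radians."""
    return 0.5 * rho * C * (np.cos(theta) ** 4 + C * np.sin(theta) ** 2) ** -1.5

rho = 0.1      # illustrative Fresnel normal power reflectivity
C = 100.0      # illustrative width factor (larger C means a smoother surface)
theta = np.linspace(0.0, np.pi / 2, 200)
sigma = hagfors_sigma(theta, rho, C)   # largest at nadir, falling off with angle
```

For width factors larger than 2 the function decreases monotonically from its nadir value ρC/2, which is consistent with the decreasing waveform attributed to continuous clutter.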

[30] 2. Deterministic clutter, also called specific target clutter, is conventionally defined as that associated with specific geographic features with relatively high backscatter cross section [Safaeinili et al., 2003; Campbell and Shepard, 2003]. In a single frame, a deterministic clutter shows a “pulse waveform”. By pulse waveform we mean a waveform with one or more relatively prominent intensity peaks concentrated at some time position(s) apart from the surface peak.

[31] The above notions of continuous clutter and specific target clutter are conventional. We need to redefine these two terms here for convenient use in application of CICA. We call a surface clutter a specific target clutter if it has a pulse waveform. We call a surface clutter continuous clutter if it has a generally decreasing waveform. With these new definitions we mean that where the intensity fluctuations of a surface clutter form an intensity high strong enough to breach the “generally decreasing” property of a continuous clutter, we also call the high a specific target clutter, although we cannot or need not specify which topographic feature causes the high. An essential difference between the new definitions and the traditional notions is that an “abnormal” part of the traditional continuous clutter is reclassified as specific target clutter, so in the new definitions these two types of clutter differ from each other by waveform. Note that we do not quantify here “prominent peak” and “general decrease”; instead we continue using these qualitative descriptions, because we will see (in section 4) that most waveforms of important sources obtained by CICA can be judged clearly on a visual basis, which can serve our main purposes.

[32] With its new definition, specific target clutter can possibly (though not highly probably) exist even in areas with only small-scale random roughness. In rougher areas, this clutter can be more abundant. It is specific target clutter that amounts to the most considerable hindrance, compared to other noise components, to the separation of any subsurface signal from the tail by CICA, because this clutter possesses a waveform similar to that of a subsurface signal.

[33] 3. Subsurface reflection, if it exists, may also appear as a pulse waveform.

[34] 4. “Pure sidelobes” are the intrinsic sidelobes arising from pulse compression. Any chirp becomes a wave with intensity obeying a sin(x)/x form (where x represents the time distance to the chirp center position) after pulse compression. When near the surface, pure sidelobes of the surface pulse are relatively strong and decrease (oscillatingly) with time, so they are likely to mix with continuous clutter and the mixture becomes an independent source detected by CICA. In other parts of the tail, pure sidelobes are weak and may sum up to approach a kind of Gaussian nature thus not to be distinguishable from thermal noise.

[35] 5. Thermal noise is the cosmic/instrumental internal noise. Thermal noise appears to be a stationary zero-mean Gaussian process and exists everywhere in an observed signal, but it dominates the parts after the tail and before the surface pulse. By “stationary” we indicate that the mean values and deviations of the real and imaginary parts of the signal do not change with time. After the signal is centered (see section 2.1), thermal noise in the data may become nonzero mean, hence noncircular. Therefore, thermal noise is likely to be detected by CICA as a significant source.

[36] The source number n = 5 seems to be a proper choice because it coincides with the number of possible physical components. A smaller n adopted means a higher possibility that the separated sources are still complicated mixtures, whereas a bigger n may lead to too many sources. In both cases the difficulty in interpretation of the sources may increase. In addition, a bigger n means a decreased “azimuth resolution” of the CICA applications in our approach, as will be seen in section 3.4.

[37] It is important to notice that n = 5 does not mean the separated sources correspond to the physical components strictly one to one. CICA is responsible only for estimating mathematically (statistically) independent source signals. An estimated source can still be a mixture of signals in the physical sense, but probably only one or two physical sources are dominating in it, thus different from the original data. For example, an estimated source dominated by thermal noise appears as a stationary waveform. However, a fraction of surface clutter may be mixed into this source and increase the intensity fluctuations of this source in its near-surface part. (In this case we call the waveform “wide-sense stationary,” meaning that the deviation of the random vector may be changeable with time while its mean value remains constant.) Our main purpose is to find out if there is a subsurface signal hidden in the data. If there is really a subsurface signal hidden in the data and it is strong enough to be detected by a significant source, this subsurface signal can be expected to be dominating, or at least “visible”, in a pulse-waveform source. Therefore, our main interest is in significant pulse-waveform sources. It is also possible that a subsurface signal really exists but, after CICA is conducted, it is mixed into an estimated source that does not have a pulse waveform. In this case we say that the subsurface signal is undetectable by CICA. A subsurface signal is undetectable by CICA because the subsurface signal is not strong enough relative to noise or clutter at the time position where the subsurface signal has an intensity peak.

3.4. Applying CICA Along an Orbit

[38] On an orbit of MEX there are typically around 1000 successive observation records (frames) of MARSIS saved as a data file. We can choose from an orbit any segment containing more than five frames. Suppose the data are already processed by the steps described in sections 3.1–3.3. The CICA algorithm (steps 1–4 in section 2.1) is applied sequentially to each group of five successive frames in the segment. Adjacent groups differ by one frame only (while four frames overlap) so that the frame group is a moving window along the orbit (and the length of the window, the number n = 5, determines the “azimuth resolution” of this CICA approach, as mentioned in section 3.3). For each window, we will obtain five sources, which we call S1, S2, S3, S4, and S5, respectively (they correspond to the eigenvalues λ1, …, λ5 arranged in decreasing order). The five observed signals in a window are different linear mixtures of the same five sources, and the horizontal (azimuth) position of all five sources in a window is represented by the window center. Thus, for the orbit segment, we will obtain five new radargrams, corresponding to S1, …, S5. (Note that the “edge frames,” the first two and last two frames in a data file, if included in the selected orbit segment, will be ignored in the new radargrams because they cannot be window centers.)
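The moving-window procedure can be sketched as below. The names sliding_windows and source_radargrams are hypothetical, and separate is a placeholder for any routine that returns n ordered sources for an n-frame window. In the demonstration an identity function stands in for the CICA separation, so the sketch shows only the windowing and bookkeeping, not the separation itself.

```python
import numpy as np

def sliding_windows(tails, n=5):
    """Yield (center_index, window) for every group of n successive frames."""
    half = n // 2
    for start in range(tails.shape[0] - n + 1):
        yield start + half, tails[start:start + n]

def source_radargrams(tails, separate, n=5):
    """Stack the moduli of the n separated sources of each window into n radargrams."""
    n_frames, d = tails.shape
    grams = np.full((n, n_frames, d), np.nan)   # edge frames remain NaN
    for center, window in sliding_windows(tails, n):
        s_hat = separate(window)                # expected shape (n, d), ordered S1..Sn
        grams[:, center, :] = np.abs(s_hat)
    return grams

# Demonstration on a dummy 12-frame segment with 4 samples per frame.
tails = np.arange(12 * 4, dtype=complex).reshape(12, 4)
centers = [c for c, _ in sliding_windows(tails)]
grams = source_radargrams(tails, separate=lambda w: w)   # identity stand-in, not CICA
```

With 12 frames and n = 5 there are eight windows, centered on frames 2 to 9, and the first two and last two frames have no entry in the resulting radargrams.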

3.5. Interpretation of Separated Source Signals

3.5.1. Interpretation Criteria

[39] Interpretation means evaluating what the separated sources may physically represent in each single window and identifying subsurface signals, if any, in the orbit segment. The physical meaning of a source in a window is evaluated by comparing the waveform of this source with those of the five source “types” described in section 3.3, taking into account the statistical significance of the source. Subsurface signals may be identified on the basis of a comparative investigation of the significant sources in multiple successive windows, taking into account the surface roughness and geology of the observed area. We can summarize the interpretation criteria as follows:

[40] Criterion 1 is that if a source is statistically insignificant (i.e., it corresponds to a too-small eigenvalue), the source is ignored.

[41] Criterion 2 is that if a source has a (wide-sense) stationary waveform, it indicates mainly thermal noise (plus weak, mixed pure sidelobes and fractions of surface clutter); if a source has a decreasing waveform, it indicates mainly continuous clutter (mixed with near-surface pure sidelobes of the surface pulse).

[42] Criterion 3 is that if a source has a pulse waveform, its intensity peak(s) apart from the leading sample may indicate either a subsurface signal or a specific target clutter. Whether it is a subsurface signal is further evaluated by criterion 4. If a prominent intensity peak is also present around the leading sample of the source, this leading peak is taken as continuous clutter.

[43] Criterion 4 is that in multiple successive windows, if the surface is smooth and the intensity peaks of the pulse-waveform sources appear at a common time position, then the sources may indicate a subsurface feature; otherwise we cannot judge whether they are subsurface reflections or specific target clutter.
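
The time-position part of criterion 4 can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the tolerance `tol` (in samples), and the number of leading samples skipped (`skip`) are hypothetical choices, and the surface-roughness condition of criterion 4 is not checked here.

```python
import numpy as np


def peak_positions(moduli, skip=1):
    """Sample index of the strongest peak in each window.

    moduli: 2-D array (windows x samples) of source moduli.
    skip:   number of leading samples excluded from the search
            (assumed value; the leading peak is treated as clutter).
    """
    moduli = np.asarray(moduli, dtype=float)
    return np.argmax(moduli[:, skip:], axis=1) + skip


def longest_aligned_run(peaks, tol=1):
    """Longest run of successive windows whose peaks stay within
    tol samples of the peak of the first window in the run.

    Returns (start_window_index, run_length).
    """
    best_start, best_len = 0, 0
    start = 0
    for i in range(len(peaks)):
        if abs(int(peaks[i]) - int(peaks[start])) > tol:
            start = i  # alignment broken: begin a new run here
        length = i - start + 1
        if length > best_len:
            best_start, best_len = start, length
    return best_start, best_len
```

A long run found in a smooth area would be a candidate subsurface feature in the sense of criterion 4; a run found in a rough area would remain ambiguous.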

3.5.2. Remarks

[44] Criterion 1 means that we consider only statistically significant sources (those that are independent, are noncircular, and contribute relatively much to the covariation of the received signals in a window), among which we look for any subsurface signals. We can expect that some of the separated sources must be insignificant, because the observed signals analyzed are natural mixtures of scatterings from numerous scatterers and thermal noise, and can contain circular components. While independent and noncircular components are separated out as significant sources, the circular components are also separated but as insignificant sources. Since the five sources in a window are arranged in descending order of significance, we can even directly take the last two sources as insignificant as a rule. Here a question may arise: Since in every window two separated sources are considered “useless”, should we choose the source number n = 3, instead of 5, before separation? The answer is that 5 is better than 3, because the “extra” two may accommodate circular fractions of all the possible sources; thus the other three sources may tend to be “purer”. Moreover, the statistical approach involved (e.g., calculation of covariance) favors a larger n.

[45] Criteria 2 and 3 are based on descriptions about the possible source signals (section 3.3). Note that in order to judge a possible subsurface signal, we do not use a source of (wide-sense) stationary waveform because such a source represents mainly thermal noise. Instead, only significant sources of pulse waveforms are considered. These sources are usually S2 and/or S3. Due to permutation ambiguity of CICA (see section 2.2), S2 and S3 may be “complementary” to each other. In other words, if there is a subsurface target, it may be detected by either S2 or S3, at the same (sample) time position but different frame positions. If the subsurface signal is relatively strong, it may be detected by S2; if it is weak (but still strong enough), it may be detected by S3.

[46] Criterion 4 implies that in rough regions the method is not usable. In rough regions strong specific target clutter may thwart the separation of a subsurface signal as a source, and bring about false signals, because dense off-nadir clutter over an orbit segment may enable statistically independent components (estimated sources) to be extracted from them in multiple windows. The peaks of such components may possibly occur at a common time position over multiple windows, so that a false feature may appear in estimated sources.

[47] In smooth regions, however, a subsurface interface can be identified provided the interface is of some large scale (e.g., stretching at least a few tens of kilometers along the orbit track) and gives a reflection strong enough to constitute a significant source over a number of successive windows. The interface should be flat or a slope parallel to the orbit track and facing the radar. An interface oblique in other directions or a point reflector is unlikely to be detected by the method, because, for instance, a point reflector cannot produce reflection peaks at the same time position in multiple windows. A point reflector is unlikely to be detected even in a single window as an independent source, because it does not produce reflection peaks at the same time position in multiple successive frames.

[48] Regarding criterion 4, another important question is whether, even if the sources are identified as subsurface signals, we can be sure that they come from an identical hidden feature across the multiple windows. The answer is almost definitely “yes,” because first we have limited the detectable subsurface target to, for example, a large flat feature, and second we use the moving window approach. In any two adjacent windows the data used are mostly the same (four frames repeat while one frame is changed). Therefore, if the two pulse-waveform sources in the two adjacent windows show intensity peaks at the same time position, we can say that the two source signals are possibly from a common subsurface target. The same inference holds for every two adjacent windows in an orbit segment containing a number of successive windows. If in many successive windows the peaks appear at the same time position, the probability is high that the peaks represent the same target. The longer the orbit segment is, the higher the probability, because more different data are used in a longer orbit segment: the probability is low that different subsurface targets happen to give reflection peaks at the same time position in many different observations. Of course, geological information available about the observed area is very important for identifying or characterizing the signal source(s).

[49] Criterion 4 requires considering surface roughness in a quantitative way in order to define smooth/rough regions. This will be discussed in section 4.

4. A Case Study

4.1. Study Region and Surface Roughness

[50] The study region is described in section 1 and is shown in Figure 2. This region is chosen because of its special geology. In the region is a large quasi-circular depression (QCD), which is interpreted as a buried impact crater [Frey et al., 2002], of ∼160 km in diameter and a generally flat floor (except in its northern part where some newer smaller craters occur; see Figure 2). The present floor of the large crater has an average elevation of about −4500 m while the surrounding region is generally lower than −4200 m. The original depth of this large crater may be up to a few kilometers, because the depth of a new or lightly degraded crater is positively related to its diameter by a statistically estimated power law of depth = 0.25 × diameter^0.25 (depth and diameter are in kilometers) [Garvin and Frawley, 1998]. The present shallowness (∼300 m) of the crater may indicate a large amount of infilling. The wall of the crater may have been buried by the filling sediments which form a gentle slope (∼1/150) toward the center of the crater. The absence of an upraised rim around the crater may be due to erosion.

[51] The radargram for the orbit segment across the region (orbit 3675, frames 223–273) is shown in Figure 3 (top). From Figure 3 we see that the surface pulse is prominent and is followed by a tail. However, more information is hard to decipher from the radargram. In Figure 3 (middle) the same radargram is displayed with a contrast-stretched color code (the lower 20% (0–0.2) of the normalized modulus is shown in full contrast so that the tail is highlighted). We see that the tails are complex at every frame position on the orbit track (a topographic profile along the orbit track is given in Figure 3 (bottom)). We conducted CICA to separate blind sources from the tails on the orbit segment in order to find out whether any subsurface information is hidden in the tails. Since interpretation of the separated sources requires considering surface roughness, we first describe the surface roughness along the orbit track.

Figure 3.

(top) The 3 MHz band radargram for the orbit segment shown in Figure 2. Modulus normalization (to 0–1) is done separately for each frame. (middle) A contrast-stretched version of the radargram shown in Figure 3 (top). (bottom) MOLA elevation profile along the orbit ground track with the position of the large crater indicated (the right direction is north).

[52] In order to quantitatively represent surface roughness, we compute the root-mean-square (RMS) slopes, based on the 128 pixel/degree MOLA data [Smith et al., 2003]. RMS slope, s(Δx), is defined as [Shepard et al., 2001; Orosei et al., 2003]

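$$ s(\Delta x) = \frac{1}{\Delta x}\left\{ \frac{1}{n}\sum_{i=1}^{n}\left[ z(x_i) - z(x_i + \Delta x) \right]^{2} \right\}^{1/2} $$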

where Δx is a horizontal distance, z(xi) and z(xi + Δx) denote elevations at horizontal positions xi and xi + Δx, respectively, and n is the number of point pairs in the area of interest whose two points are Δx apart. We let Δx = 500 m. In practical calculations, however, we allow Δx to be a range of 500 ± 100 m in order to obtain a large number of point pairs in the area. Note that the MOLA data grid is ∼460 × 327 m at latitude 45°N gradually changing northward to ∼460 × 291 m at 51°N, so with Δx = 500 ± 100 m a large number of point pairs can be obtained in almost all directions. We calculate s(Δx) in circular areas of 20 km diameter centered at every nadir point on the orbit track. Meanwhile, we calculate for each of the circular areas a confidence interval of s(Δx) at the confidence level of 0.001 (confidence level is a probability at which the data-derived value deviates from the “real” value by an amount greater than half the confidence interval). The variation of s(Δx) and its confidence interval along the orbit track is illustrated in Figure 4 (top). Since MARSIS is a nadir-looking radar with circular footprints [Orosei et al., 2008], we do not discriminate the direction of Δx. An RMS slope calculated in this way can be said to represent the 0.5-km-scale adirectional roughness. From Figure 4 (top) we see that the surface roughness along the orbit track is generally low (with the maximum RMS slope of 0.0195 at frame 259). The confidence interval of s(Δx) at the 0.001 confidence level is small (∼0.001), indicating a high accuracy of the calculated RMS slopes. Kreslavsky and Head [2000] studied roughness of the Mars surface on different scales based on MOLA data, using “differential slope.” From their results, we find that our study region coincides with one of their relatively smooth regions (see Kreslavsky and Head [2000], plate 1). 
Our result (Figure 4, top) is largely consistent with theirs, although we use RMS slope instead of differential slope. Differential slope removes larger scale tilt from the elevation difference when roughness is calculated at a given scale, in order to better characterize the intrinsic roughness of geological units [Kreslavsky and Head, 2000]. However, we choose RMS slope here because we think larger scale tilt can also affect the radar returns.
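
As an illustration of the adirectional RMS slope calculation, the following sketch pairs up elevation points whose horizontal separation lies within 500 ± 100 m, regardless of direction. The function name and arguments are hypothetical. Dividing each elevation difference by the actual pair separation (rather than by the nominal Δx) is an assumption, and the brute-force pairing is only practical for small patches; the confidence interval is not computed.

```python
import numpy as np


def rms_slope(x, y, z, dx=500.0, tol=100.0):
    """Adirectional RMS slope from scattered elevation points.

    x, y, z: 1-D arrays of horizontal coordinates and elevations (meters).
    dx, tol: nominal horizontal separation and its allowed deviation.
    Returns (rms_slope, number_of_point_pairs).
    """
    x, y, z = (np.asarray(a, dtype=float) for a in (x, y, z))
    i, j = np.triu_indices(len(x), k=1)        # each unordered pair once
    dist = np.hypot(x[i] - x[j], y[i] - y[j])  # horizontal separation
    keep = np.abs(dist - dx) <= tol            # pairs about dx apart
    if not np.any(keep):
        raise ValueError("no point pairs at the requested separation")
    # Assumption: slope of a pair = elevation difference / actual separation.
    slopes = (z[i][keep] - z[j][keep]) / dist[keep]
    return float(np.sqrt(np.mean(slopes ** 2))), int(np.count_nonzero(keep))
```

In practice the points passed to such a function would be the MOLA grid nodes falling inside the 20 km circular area around a nadir point.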

Figure 4.

(top) The 0.5-km-scale RMS slope and its confidence interval (at confidence level 0.001) along orbit 3675 ground track. (middle) Wavelength-scale (100 m scale) RMS slopes corresponding to different assumptions about the Hurst exponent (H). (bottom) MOLA elevation profile along the orbit ground track with the position of the large crater indicated (the right direction is north).

[53] Radar signals are most strongly affected by wavelength-scale roughness, s(λ) (where λ stands for radar wavelength), as s(λ) determines whether the returned echoes are mainly coherent or incoherent [Campbell and Shepard, 2003]. s(λ) cannot be derived directly from the elevation data due to the data resolution limitation, but it may be derived from s(Δx). We can assume the spatial variation of the topography to be self-affine. (That is, the roughness increases at a fixed rate with an increasing horizontal scale at which the roughness is measured. Self-affinity is observed widely in natural topography including that of Mars [Shepard et al., 2001; Orosei et al., 2003].) s(λ) is then computed using the relation

equation image

where 0 ≤ H ≤ 1 is called the Hurst exponent [Shepard et al., 2001; Orosei et al., 2003]. In practical calculations with equation (12), we use the upper limit of the confidence interval of s(Δx) (Figure 4, top) in place of s(Δx) itself, so the calculated s(λ) is also an upper limit. (This is expected to lead to safer or more conservative inferences based on s(λ) in sections 4.2.2, 4.2.3, and 5.2.) For MARSIS, λ ≈ 100 m corresponding to the 3 MHz band. We calculated the upper limits of s(100) corresponding to assumptions of H = 0.25, 0.5, 0.7, and 1, as shown in Figure 4 (middle). Note that in Figure 4 (middle) we have converted the RMS slopes s(100) to degrees by calculating (180/π) arctan s(100). This is for convenient comparison with the results of Campbell and Shepard [2003], who, on the basis of a theoretical investigation, suggest that in MARSIS observations coherent echoes may be significant if s(λ) < 1°, 3°, and 5° corresponding to H = 1, 0.5, and 0.25, respectively. In light of these suggested guidelines, we can see from Figure 4 (middle and bottom) that our study region is generally smooth with respect to MARSIS: the wavelength-scale RMS slopes are generally much lower than the upper limits prescribed in the guidelines, except at frames 251 and 261, where some newer smaller craters occur (cf. Figure 2) and the surface is rougher. Hereafter we generally call the areas near frames 251 and 261, roughly frames 248–263, rough areas, and call the rest of the region along the orbit segment track smooth areas. In smooth areas prominent intensity peaks in a separated source may represent coherent echoes, either a specific target clutter or a subsurface signal, because coherent echoes may dominate the analyzed observations [Campbell and Shepard, 2003].
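
The conversion to degrees and the comparison with the quoted guidelines can be written as a small sketch. The function names and the lookup table are hypothetical; the table holds only the three (H, limit) pairs quoted in the text, so other Hurst exponents (e.g., H = 0.7) are deliberately not covered.

```python
import math

# Upper limits on s(lambda), in degrees, as quoted in the text from
# Campbell and Shepard [2003], keyed by the Hurst exponent H.
COHERENT_LIMIT_DEG = {1.0: 1.0, 0.5: 3.0, 0.25: 5.0}


def slope_to_degrees(s):
    """Convert a dimensionless RMS slope to degrees: (180 / pi) * arctan(s)."""
    return math.degrees(math.atan(s))


def below_coherent_limit(s_lambda, hurst):
    """True if the wavelength-scale RMS slope is below the quoted limit.

    Raises KeyError for a Hurst exponent that has no quoted limit.
    """
    return slope_to_degrees(s_lambda) < COHERENT_LIMIT_DEG[hurst]
```

For example, an RMS slope of 0.0140 corresponds to about 0.8°, which is below all three quoted limits.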

4.2. Results and Discussions

4.2.1. Separated Sources in Individual Frame Windows

[54] Figure 5 displays the five separated sources in the window centered at frame 240 (only the first 150 of the 291 samples are shown). At this frame position the surface is smooth (Figure 4). We see that S1 is largely wide-sense stationary, so it represents mainly thermal noise. In S1 the first several tens of samples exhibit larger variations than the rest, indicating that S1 contains fractions of surface clutter. Also noticeable is that S1 is low at the two leading samples. This occurs because continuous clutter, which is strongest at the leading samples, is separated out as an independent source, S3.

Figure 5.

Five separated sources (S1, …, S5) for the frame window centered at frame 240. Numbers in parentheses are eigenvalues corresponding to the sources. Only the first 150 of the 291 samples for each source are shown.

[55] S2 is of a pulse waveform with prominent peaks at sample 12, sample 20, and other positions. Whether there is a subsurface signal among the peaks in S2 is to be further analyzed later by taking into account adjacent windows.

[56] S3 is of a decreasing waveform, so it represents mainly continuous clutter and pure sidelobes of the surface pulse.

[57] S4 and S5 possess more complicated waveforms and probably contain mainly specific target clutter. These two sources correspond to very small eigenvalues, so they are insignificant.

[58] As shown in Figure 6, the sources have different circularities: S1 and S2 are noncircular (data points are distributed in an elongated region, or not centered at the origin of the complex plane), S4 and S5 are circular, and S3 is weakly noncircular (data distribution appears to have a weak elongated trend in ∼22° or ∼248° direction on the complex plane). Thus S4 and S5 are less reliable than the other three sources and are ignored in further analysis.
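
A simple second-order measure of circularity can complement the visual inspection of scatter diagrams. The sketch below computes the ratio |E[s²]| / E[|s|²], which is close to 0 for a circular source and reaches 1 for a source confined to a line through the origin. This is a standard statistic offered only as an illustration; it is an assumption that it reflects the qualitative judgment used here, and it is not claimed to equal the eigenvalues reported with the sources. By default the mean is not removed, so an off-center cloud also scores as noncircular, in line with the description above.

```python
import numpy as np


def circularity_quotient(s, remove_mean=False):
    """Ratio |E[s^2]| / E[|s|^2] for a complex-valued source s.

    Values near 0 suggest a circular source; values near 1 suggest a
    strongly noncircular one. remove_mean is a hypothetical option that
    restricts the measure to the shape of the cloud about its centroid.
    """
    s = np.asarray(s, dtype=complex)
    if remove_mean:
        s = s - s.mean()
    power = np.mean(np.abs(s) ** 2)
    if power == 0:
        raise ValueError("source has zero power")
    return float(abs(np.mean(s ** 2)) / power)
```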

Figure 6.

Polar scatter diagrams for the five sources (S1, …, S5) for the frame window centered at frame 240. The center of a circle is the origin of the complex plane; the horizontal and vertical directions are real (rightward positive) and imaginary (upward positive) axes, respectively. Numbers in parentheses are eigenvalues corresponding to the sources.

4.2.2. Along-Orbit Variation of Waveforms of the Sources

[59] S1, S2, and S3 are jointly plotted in Figure 7, where we see that in the whole region S1 is generally wide-sense stationary, indicating mainly thermal noise mixed with surface clutter in near-surface portions, resembling the situation of the window at frame 240 as described in section 4.2.1. Intensity variations across frames in S1 show different noise levels in the data for different frames. (The noise level in the preprocessed data can change from frame to frame. Why this happens is not yet known, although we think it may result from ionospheric effects.)

Figure 7.

Radargrams for the separated sources (S1, S2, and S3), frames 223–273, orbit 3675. Modulus normalization is done across all frames (except the edge frames 223, 224, 272, 273) for each source. Sample numbers 0–9 for each source are truncated, so the intensity is blank. “F” indicates a feature discussed in the text. MOLA elevation profile along the orbit ground track with the position of the large crater indicated (the right direction is north).

[60] S2 has different behaviors in smooth and rough areas (cf. Figure 4). In smooth areas, S2 displays mainly decreasing waveforms outside the crater (frames 223–233 and 263–273), indicating continuous clutter, and displays mainly pulse waveforms inside the crater (frames 234–262), indicating either specific target clutter or possibly subsurface signals. In rough areas (frames 248–263), S2 shows mainly pulse waveforms.

[61] S3 appears in most cases to be “complementary” to S2; i.e., when S2 is of a decreasing waveform S3 tends to be of a pulse waveform, and when S2 is of a pulse waveform S3 is of a decreasing one. This is clear at frames 228, 234–249, 251, 254–255, 259–260, and 260–263. On only one occasion is neither S3 nor S2 of a decreasing waveform (frame 252); in this case, however, S1 has a peak at the leading sample, suggesting that continuous clutter happens to enter S1.

[62] To sum up, S1, S2, and S3 in this orbit segment largely cover the five physical sources described in section 3.3. S1 with wide-sense stationary waveforms represents mainly thermal noise mixed with weak pure sidelobes and circular fractions of surface clutter. S2 and S3 with decreasing waveforms represent mainly continuous clutter mixed with pure sidelobes of the surface pulse. S2 and S3 with pulse waveforms represent either specific target clutter or subsurface signals.

4.2.3. A Subsurface Signal?

[63] In order to find out if there is any subsurface signal in the sources, we apply interpretation criterion 4 (which has been defined and discussed in section 3.5). In light of this criterion, we examine S2 in smooth areas. We find in the south part of the crater (frames 234–242) that the intensity peaks of S2 can be linked along the orbit to show a distinct flat feature about 30 km long, 500–1000 m under the surface (assuming that the top layer permittivity is 4–16), as marked in Figure 7 by “F”. Hereafter for convenience we call this specific feature F and this specific area “the F area”. We suggest that F represents a subsurface signal, because it satisfies criterion 4. Specifically, F is best explained as a flat, large subsurface interface because, in this smooth area, specific target clutter is unlikely to produce clear intensity peaks that appear at the same time position (∼10 μs) across nine successive windows (each window has five frames, so F involves 13 frames) and are detected by CICA as one significant independent source (S2). Since the F area is not the smoothest among all the smooth areas in the region (cf. Figure 4), there may still be a doubt that F could be caused by surface clutter. Regarding this doubt, in addition to criterion 4, we can give the following reasons why F is a subsurface signal rather than surface clutter. At frames 225–227 and 231–232, south of the crater, the surface is smooth but somewhat rougher than that of the F area. At frames 265–267, outside the crater to the north, the surface roughness is equivalent to that of the F area. In these two places no feature like F occurs (cf. Figures 4 and 7). In the F area, the surface is not equally smooth: from frame 234 to 242, the 0.5-km-scale RMS slope changes from ∼0.009 to ∼0.006, a decrease of about 30%. F does not seem to be affected by this change in roughness. This suggests that F is probably not caused by surface roughness.

[64] It is noticeable that F is restricted to frames 234–242 and does not continue northward (Figure 7). However, this does not mean that a subsurface interface must stop at frame 242. Northward beyond frame 242 the increased roughness of the surface hinders CICA detection of a subsurface signal, and we cannot judge whether the pulses in S2 or S3 in the rough regions are clutter or subsurface signals.

[65] In order to further examine the conditions under which CICA may detect a possible subsurface interface like F, we conducted a numerical simulation, as described in section 5.

5. A Numerical Simulation

5.1. Simulation Models and Setups

[66] This simulation is aimed at an investigation of the conditions (surface roughness, material properties) in which it is possible to recognize a subsurface signal hidden in the tails of the MARSIS observations using CICA. Whether CICA can recognize the subsurface signal depends essentially on whether it can separate from a frame window a significant source that represents the subsurface signal. The simulation is therefore focused on a single frame window. In order to deal with scatterings from surface/subsurface of randomly distributed roughness, we use a Monte Carlo approach [Guan et al., 2009] to carry out the simulation. The simulation involves surface digital elevation models (DEMs) as well as surface clutter, subsurface signal, and thermal noise models. They are generated and handled in the following eight steps.

[67] Step 1 is to generate a self-affine DEM. A self-affine DEM can be generated in various ways and we use the simple method described by Finlay and Blanton [1994]. Figure 8 shows such a DEM with 128 × 128 cells (small rectangles constituting the surface). The extents of the generated model in x (azimuth), y (cross track), and z (vertical) directions are all set to unity. The practical dimensions of the model can be assumed by adopting scaling factors, denoted by fx, fy, and fz for the three directions, respectively. We set fx = fy = 128 × 1000, corresponding to 1 × 1 km² horizontal cell size. We let fz match the wavelength-scale RMS slope, s(λ), so that the elevation variation of the DEM approaches consistency (in a statistical sense) with that derived from MOLA data. Using equation (12), we obtain fz = 1000 s(λ) × 10^(1−H). The surface model is generated from an initial rectangle of which the four vertexes are set to 0 in elevation. The rectangle is then partitioned into four new rectangles of equal size. Partition is carried out by interpolating new vertexes recursively for each existing rectangle until the cell size reaches 1/128 of the initial rectangle. For each new vertex generated at the kth recursion step, the elevation is a random number drawn from a Gaussian distribution, of which the mean value is the average elevation of the neighboring vertexes generated in the previous recursion (k − 1) step, and the deviation decreases with cell size. We set the decreasing rate (ratio of deviation at recursion step k − 1 to that at step k) to 0.8, corresponding to a Hurst exponent of H ≈ 0.7 or a fractal dimension of D ≈ 2.3 for the surface. (H and D can be obtained using the relations Gk ∝ rk^(1−H) and D = 3 − H, where Gk and rk represent elevation deviation and cell size at step k [Shepard et al., 2001].) We can adjust factor fz to get different s(λ) values to simulate regions of different roughnesses. 
MOLA data are not directly used in this step because MOLA data also need interpolation to construct the wavelength-scale roughness distribution.
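
A recursive midpoint-displacement scheme of the kind outlined in step 1 can be sketched as follows. This is a generic diamond-square variant written for illustration and is not claimed to reproduce the cited method in detail; the function names and the parameters `levels`, `ratio`, and `sigma0` are hypothetical. The sketch leaves the elevations unscaled (no normalization to unit extent and no scaling by fz). The helper `hurst_from_ratio` assumes that the deviation scales with cell size as Gk ∝ rk^(1−H) and that the cell size halves at every recursion step.

```python
import math

import numpy as np


def hurst_from_ratio(ratio):
    """Hurst exponent implied by the per-step deviation ratio, assuming
    G_k ~ r_k**(1 - H) with the cell size halving at each step."""
    return 1.0 + math.log2(ratio)


def self_affine_dem(levels=7, ratio=0.8, sigma0=1.0, seed=None):
    """Diamond-square surface with (2**levels + 1)**2 vertexes.

    The four corner vertexes start (and stay) at zero elevation; every new
    vertex is the mean of its neighbours plus Gaussian noise whose
    deviation is multiplied by `ratio` after each recursion step.
    """
    rng = np.random.default_rng(seed)
    size = 2 ** levels + 1
    z = np.zeros((size, size))
    step, sigma = size - 1, sigma0
    while step > 1:
        half = step // 2
        # Square step: centre of each cell from its four corner vertexes.
        for i in range(half, size, step):
            for j in range(half, size, step):
                mean = (z[i - half, j - half] + z[i - half, j + half]
                        + z[i + half, j - half] + z[i + half, j + half]) / 4.0
                z[i, j] = mean + rng.normal(0.0, sigma)
        # Diamond step: edge midpoints from the neighbours that exist.
        for i in range(0, size, half):
            for j in range((i + half) % step, size, step):
                nbrs = []
                if i >= half:
                    nbrs.append(z[i - half, j])
                if i + half < size:
                    nbrs.append(z[i + half, j])
                if j >= half:
                    nbrs.append(z[i, j - half])
                if j + half < size:
                    nbrs.append(z[i, j + half])
                z[i, j] = np.mean(nbrs) + rng.normal(0.0, sigma)
        step, sigma = half, sigma * ratio
    return z
```

With `levels=7` the surface has 129 × 129 vertexes, i.e., 128 × 128 cells, and a ratio of 0.8 gives H ≈ 0.68 under the stated assumption.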

Figure 8.

A self-affine digital elevation model (DEM) with the Hurst exponent of 0.7.

[68] Step 2 is to generate a surface clutter corresponding to the DEM. By surface clutter here we indicate all backscatters from the surface, including the surface pulse. Corresponding to the nadir-looking feature of MARSIS, we suppose the spacecraft is positioned above the center point of the DEM. The height of the spacecraft, h, is set to 300 km. The received surface clutter is a sum of backscatter from all the cells. Each cell is assumed to be generally flat with small-scale roughness, so that the incident angle of the EM wave, θi, at the ith cell is related to the position of the cell center (xi,yi,zi) by

equation image

with (0,0,0) being the center point of the DEM. The distance (range) from the cell to the radar is

equation image

The received power back from the ith cell can be approximated using equation (10), i.e.,

equation image

The Fresnel normal power reflectivity ρs in equation (15) is given by

equation image

where ɛ1 = ɛ0ɛr1 − jσ1/ω is the complex permittivity of the surface material, ɛr1 is the relative permittivity, σ1 (in S/m) is the conductivity of the material, and ω = 2πc/λ is the angular frequency of the wave (in rad/s); ɛ0 = 8.8542 × 10^−12 F/m is the vacuum permittivity. In equation (15) the width factor C is related to the wavelength-scale RMS slope by [Sultan-Salem and Tyler, 2006]

equation image

The received wave back from the ith cell can then be approximated as

equation image

Let the whole signal from the DEM be a time series denoted by v = (v1,v2,…,vd). Let Rmax and Rmin be the maximum and minimum ranges for the DEM, and let

equation image

be the sample interval. Then vk is given by

equation image

where N is the total number of cells and f(a,b) is a selection function defined as

equation image
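
The accumulation of per-cell returns into a sampled time series can be illustrated with a short sketch. Several points are assumptions, since the corresponding equations are not reproduced in this text: the sample interval is taken as (Rmax − Rmin)/d, the selection function is taken to assign each cell to the single range bin containing its range, and the function names are hypothetical. The range itself follows from the stated geometry (radar at height h above the DEM center).

```python
import numpy as np


def cell_range(x, y, z, h=300e3):
    """Distance (m) from a cell centre (x, y, z) to a radar at (0, 0, h)."""
    x, y, z = (np.asarray(a, dtype=float) for a in (x, y, z))
    return np.sqrt(x ** 2 + y ** 2 + (h - z) ** 2)


def bin_by_range(ranges, waves, d):
    """Sum complex per-cell returns into d range bins.

    ranges: range of each cell (m); waves: complex return of each cell.
    Assumed sample interval: (R_max - R_min) / d.
    """
    ranges = np.asarray(ranges, dtype=float)
    waves = np.asarray(waves, dtype=complex)
    r_min, r_max = ranges.min(), ranges.max()
    v = np.zeros(d, dtype=complex)
    if r_max == r_min:                      # all cells at the same range
        v[0] = waves.sum()
        return v
    dr = (r_max - r_min) / d
    # Bin index of each cell; the farthest cell is put in the last bin.
    k = np.minimum(((ranges - r_min) / dr).astype(int), d - 1)
    np.add.at(v, k, waves)                  # accumulate repeated indices
    return v
```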

[69] Step 3 is to generate a subsurface signal corresponding to the DEM. Suppose a dielectric interface is present at a depth of L below the surface. (We set L = 700 m in practice.) The interface is assumed to be also self-affine in elevation variation and generally parallel to the upper surface. The material of the top layer (the layer between the interface and the upper surface) is assumed to be homogeneous. The subsurface signal is denoted by u = (u1,u2,…,ud). The elements uk (k = 1,2,…,d) are calculated as follows.

[70] The angle of refraction, γi, at the ith cell of the upper surface is related to θi by

equation image

The refractivity at this cell is given by (only the parallel polarized wave is considered here)

equation image

The refracted wave propagates in the top layer until it reaches the subsurface interface where the wave is backscattered. The backscattered power at position i on the interface is again evaluated using equation (10):

equation image

The Fresnel normal power reflectivity on the interface, ρss, is given by

equation image

where ɛ2 = ɛ0ɛr2 − jσ2/ω is the complex permittivity of the bottom half space (beneath the subsurface interface) and ɛr2 is the relative permittivity of the half space. The backscattered subsurface signal propagates through the top layer back to the upper surface, where it is refracted into the air. The angle of refraction is equal to θi, and the incident angle is γi. The refractivity is

equation image

During propagation in the top layer, the wave is attenuated and changed in phase by a factor of

equation image

where μ0 is the permeability of vacuum (we assume the permeability of the layers to be 1). The received signal back from the ith position on the subsurface interface is then

equation image

This signal is delayed by (in terms of range)

equation image

uk is then given by

equation image

[71] Step 4 is to generate a noise vector. Let 0 < Pn < 1 be a percentage value indicating a noise level and |v|max be the maximum modulus in vector v (the surface clutter). The noise vector, denoted by n = (n1,n2,…,nd), is generated by creating random numbers that obey a zero-mean Gaussian distribution with a deviation of Pn|v|max/√2. Such random numbers are taken as the real and imaginary parts of nk (k = 1,2,…,d) so that the deviation of the modulus of nk will be Pn|v|max. We set Pn = 0.01 corresponding to 20 dB of the surface peak relative to thermal noise, which is typical of MARSIS observations (as exemplified in Figure 1).
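
Step 4 can be sketched as follows. The function name and arguments are hypothetical. The per-component deviation is taken as Pn|v|max/√2, an assumption that follows from the requirement that the root-mean-square modulus of the noise equal Pn|v|max when the real and imaginary parts are independent and identically distributed.

```python
import numpy as np


def thermal_noise(v, p_n=0.01, seed=None):
    """Complex Gaussian noise vector with the same shape as v.

    v:   complex surface-clutter vector.
    p_n: noise level relative to the maximum modulus of v.
    """
    v = np.asarray(v, dtype=complex)
    rng = np.random.default_rng(seed)
    # Deviation of the real and of the imaginary part (assumed sqrt(2) split).
    sigma = p_n * np.max(np.abs(v)) / np.sqrt(2.0)
    real = rng.normal(0.0, sigma, v.shape)
    imag = rng.normal(0.0, sigma, v.shape)
    return real + 1j * imag
```

The mixed signal of step 5 would then be formed as `v + u + thermal_noise(v)`, with `u` the simulated subsurface signal.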

[72] Step 5 is to generate a mixed signal, which is v + u + n. The mixed signal is intended to simulate an observation of MARSIS (suppose a subsurface signal exists in the tail). We think the five sources as described in section 3.3 are largely represented by this mixed signal. The pure sidelobes are ignored because, as we have mentioned (section 3.3), they are not distinguishable from clutter and thermal noise.

[73] Step 6 is to repeat steps 1–5 five times to produce five mixed signals. These mixed signals constitute a 5 × d complex matrix, denoted as z.

[74] Step 7 is to remove 10 samples from the head of z. Apply the CICA algorithm (described in section 2.1) to z to produce five sources. As usual, we denote the sources as S1, …, S5.
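
Steps 6 and 7 amount to stacking the mixed signals and dropping the leading samples, which can be sketched as below. The function names are hypothetical, and the separation itself (the CICA algorithm of section 2.1) is not reproduced here.

```python
import numpy as np


def build_observation_matrix(mixed_signals, n_drop=10):
    """Stack equal-length mixed signals into a matrix and remove the
    first n_drop samples (columns) of every signal."""
    z = np.vstack([np.asarray(s, dtype=complex) for s in mixed_signals])
    return z[:, n_drop:]


def truncated_index(sample, n_drop=10):
    """Sample number after truncation, e.g., 44 becomes 34 for n_drop = 10."""
    return sample - n_drop
```

Because of the truncation, a peak located at sample k of a mixed signal appears at sample k − 10 of the separated sources.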

[75] Step 8 is to examine the sources to find out if the subsurface signal can be discernible in one of the significant sources. Change parameters and repeat steps 1–8 to see how the visibility of the subsurface signal changes with the parameters. For this purpose we define a “local SNR” denoted by Sk in decibels,

equation image

where sk is the kth element in the signal and k is the sample number at which the modulus peak of a subsurface signal occurs. We let Ss,k and Sm,k denote Sk for separated sources and mixed signals, respectively. We can expect that a large deviation of Sm,k may reduce the average Ss,k, because a large deviation of Sm,k may mean a smaller covariation between the mixed signals. Ss,k can be used to evaluate the visibility of a subsurface signal in a source, because, after generation, a subsurface signal does not change in the time position (sample number) of its modulus peak during mixing and CICA. If Ss,k for a significant source is quite large (e.g., >0.5), we say that in this source the subsurface signal is visible. For mixed signals, if the values of Sm,k are all large for the n = 5 signals, we say the subsurface signal is visible in the mixed signals; otherwise it is invisible. Considering the permutation ambiguity of CICA (section 2.2) and relative significance of the sources, we always let Ss,k = max (SS2,k, SS3,k), where SS2,k and SS3,k stand for Sk for S2 and S3, respectively. In the calculation of Sm,k, the five mixed signals are always considered together.

5.2. Simulation Results

5.2.1. General Results

[76] Figure 9 illustrates an example of generated surface clutter, subsurface signal, noise, and mixed signal. The moduli are normalized to 0–1 for visual effect. Before they are normalized, the noise and the subsurface signal have maximum moduli of about 1% and 3.8% that of the surface peak, respectively. Therefore, the mixed signal looks similar to the surface clutter, with a prominent surface pulse followed by a tail. The modulus peak of the hypothetical subsurface signal occurs at sample 44. At the same sample position a peak is not clear in the mixed signal, meaning that the subsurface signal is obscured by clutter and noise.

Figure 9.

An example of simulated surface clutter, subsurface signal, noise, and mixed signal. Parameters assumed are the following: wavelength-scale RMS slope s(λ) = 0.0140; top layer permittivity and conductivity ɛr1 = 4 and σ1 = 10⁻⁵ S/m; bottom half-space permittivity and conductivity ɛr2 = 9 and σ2 = 2 × 10⁻⁵ S/m. The small value of Sm,k (0.1020 dB) suggests the subsurface signal is obscured by clutter and noise.

[77] Figure 10 illustrates a set of five sources separated from five mixed signals, including the one shown in Figure 9. In Figure 10, S1 is generally wide-sense stationary and S2 (roughly) decreases with time delay. Therefore, continuous clutter and noise are embodied in the significant sources, resembling the situation with true MARSIS data, as described in section 4.2.1 (Figure 5, where S3 rather than S2 has the decreasing waveform). (We believe this difference results from a relatively strong subsurface signal in the situation of Figure 5.) In Figure 10 we see the subsurface signal is detected by S3 with Ss,k = 2.4260 dB. (Note the peak occurs at sample 34 instead of 44 because 10 leading samples are truncated from the mixed signals before CICA is applied.) This shows that in this situation CICA can detect a weak subsurface signal in one of the separated sources.

Figure 10.

An example of five separated sources from five mixed signals including the one shown in Figure 9. Parameters are the same as those in Figure 9. The relatively large value of Ss,k (2.4206 dB, large relative to Sm,k as indicated in Figure 9) suggests the subsurface signal is visible in S3.

5.2.2. Effects of Surface Roughness on the Visibility of a Subsurface Signal

[78] Figure 11 shows a relation of visibility (Ss,k and Sm,k) of generated subsurface signals versus wavelength-scale RMS slope (s(λ)), for two sets of dielectric parameters. Each panel results from statistics of 50 simulations for every s(λ) value displayed.
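The per-roughness statistics behind a plot of this kind can be gathered with a small harness such as the Python sketch below. Everything here is hypothetical scaffolding: `simulate` stands for one run of the simulation returning a single Sk value in dB, `toy_simulate` is a random placeholder with no physical meaning, and using the sample standard deviation for "deviation" is an assumption.

```python
import random
import statistics
from typing import Callable, Dict, Sequence


def summarize_runs(simulate: Callable[[float, random.Random], float],
                   slopes: Sequence[float],
                   n_runs: int = 50,
                   seed: int = 0) -> Dict[float, Dict[str, float]]:
    """Run `simulate` n_runs times per RMS slope and summarize the Sk values."""
    rng = random.Random(seed)  # fixed seed so the summary is repeatable
    summary: Dict[float, Dict[str, float]] = {}
    for slope in slopes:
        values = [simulate(slope, rng) for _ in range(n_runs)]
        summary[slope] = {
            "mean": statistics.mean(values),
            "std": statistics.stdev(values),  # sample standard deviation
            "min": min(values),
            "max": max(values),
        }
    return summary


def toy_simulate(slope: float, rng: random.Random) -> float:
    """Placeholder only: returns random numbers, NOT the paper's simulation."""
    return rng.gauss(0.0, 1.0)


# Two of the s(lambda) values mentioned in the text, 50 runs each.
stats = summarize_runs(toy_simulate, slopes=[0.0078, 0.0134], n_runs=50)
```

Replacing `toy_simulate` with a function that performs steps 1–8 and returns Ss,k or Sm,k would yield the mean, deviation, and extreme values for each roughness level.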

Figure 11.

Relation of visibility of a subsurface signal in mixed signals and separated sources (Sm,k and Ss,k) to surface roughness (wavelength-scale RMS slope s(λ)), based on 50 simulations. The roughness range of “the F area” discussed in section 4.2.3 is indicated.

[79] In Figure 11 we see that Sm,k tends to have large deviations and that its mean values are below zero except at very low roughness (s(λ) < 0.0078), meaning that the subsurface signal is invisible in the mixed signals except in very smooth regions. Ss,k is almost always larger than Sm,k for the two sets of dielectric properties, suggesting that the visibility of the subsurface signal in S2 or S3 tends to be higher than in the mixed signals. This, however, depends on roughness. In Figure 11 (top) it appears that s(λ) ≈ 0.0134 is a roughness boundary above which the average values of Sm,k and Ss,k tend to be stable and their deviations decrease to a smaller, stable value. In Figure 11 (bottom), the boundary appears to be higher (s(λ) ≈ 0.0248), and similar trends in Sm,k and Ss,k appear. These trends can be explained as follows: in rougher regions fluctuations of the mixed signals become stronger and peaks are more frequent (more densely distributed), leading to more stable Sm,k values. In this case the subsurface signal tends to be overwhelmed by clutter. Meanwhile, it becomes more likely that peaks exist at the same position k in the mixed signals. Thus, a peak at k may become more likely to appear in a significant source even though the peak does not represent a subsurface signal. In other words, when the deviation of Sm,k approaches a smaller, relatively stable level with increasing roughness (with other parameters fixed), CICA may begin to extract false signals. Therefore, we think s(λ) ≈ 0.0134 can be taken as a roughness boundary to distinguish between smooth and rough regions. With a smaller σ1 value and/or a larger ɛr2/ɛr1 contrast (Figure 11, bottom), the boundary may be higher because a smaller σ1 value and/or a larger contrast may enhance the visibility of the subsurface target; thus a higher level of clutter is needed in order to overwhelm the subsurface signal.
Thus, in smooth regions CICA tends to be able to detect a subsurface signal, while in rough regions CICA is not useful. In Figure 11 the roughness range of the F area (discussed in section 4.2.3) is indicated. The F area has s(λ) values between 0.0089 and 0.0141 (calculated from degrees as shown in Figure 4 corresponding to H = 0.7). It belongs to a smooth region if judged by the boundary of s(λ) ≈ 0.0248, while most (∼86%) of it belongs to a smooth region if judged by the boundary s(λ) ≈ 0.0134.
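One simple way to arrive at a figure close to the quoted ∼86% is to ask what fraction of the F area's s(λ) range lies below each boundary. The Python sketch below does this under the assumption, not stated in the text, that s(λ) values are spread uniformly over the range; the function name is hypothetical.

```python
def fraction_below(lo: float, hi: float, boundary: float) -> float:
    """Fraction of the interval [lo, hi] that lies below `boundary`.

    Assumes values are uniformly distributed over the interval.
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    clipped = min(max(boundary, lo), hi)  # keep the boundary inside [lo, hi]
    return (clipped - lo) / (hi - lo)


# s(lambda) range of the F area and the two roughness boundaries from the text.
F_LO, F_HI = 0.0089, 0.0141
frac_0134 = fraction_below(F_LO, F_HI, 0.0134)  # boundary from Figure 11 (top)
frac_0248 = fraction_below(F_LO, F_HI, 0.0248)  # boundary from Figure 11 (bottom)
```

With these numbers the lower boundary leaves roughly 0.0045/0.0052 of the range on the smooth side, while the higher boundary places the whole range there.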

5.2.3. Impact of Dielectric Constants of Materials on Visibility of a Subsurface Signal

[80] Figure 12 illustrates the relation of the visibility of a hypothetical subsurface signal in mixed signals and sources to possible dielectric constants of the two layers. From Figure 12 we see the following (given the parameters of s(λ), σ1, and σ2 as shown): (1) The subsurface signal is invisible in the mixed signals (the deviation of Sm,k is large). (2) The average and maximum values of Sm,k increase with permittivity contrast (ɛr2/ɛr1), meaning that the subsurface signal is not totally overwhelmed by clutter or noise (clutter and noise should not increase with permittivity contrast). (3) The average Ss,k generally decreases with increasing maximum value and deviation of Sm,k (Figure 12, top and middle), because a large deviation of Sm,k can reduce Ss,k. (4) If ɛr1 is low (ɛr1 = 4; Figure 12, top), the average Ss,k is positive whatever the value of ɛr2, meaning that the subsurface signal tends to be detected by S2 or S3. (5) When ɛr1 is increased to 6 (Figure 12, middle), the average Ss,k is positive only when ɛr2 is less than 10, possibly because the average Ss,k is lowered by the increased deviation of Sm,k. (6) When ɛr1 is increased to 8 (Figure 12, bottom), the average Ss,k is negative whatever the value of ɛr2, meaning that CICA does not tend to be capable of detecting the subsurface signal by S2 or S3. This can be explained as follows: when ɛr1 is quite large (e.g., 8), the subsurface signal may be delayed to a time position far from the surface pulse and approach the noise level, where both the noise and the clutter may prevail and the subsurface signal tends to be overwhelmed. In other words, a subsurface signal should be well above the noise level in order to be detected by CICA. To fulfill this condition, the permittivity of the top layer should be quite small, e.g., ɛr1 < 6 (given the parameters of s(λ), σ1, and σ2 as shown in Figure 12).
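The dependence of echo delay and interface reflectivity on permittivity can be illustrated with standard lossless relations, as in the Python sketch below. It is not the simulation used here: it ignores conductivity, assumes normal incidence, and uses a hypothetical interface depth of 1 km chosen only for the example.

```python
import math

C_VACUUM = 299_792_458.0  # speed of light in vacuum, m/s


def two_way_delay_s(depth_m: float, eps_r: float) -> float:
    """Two-way travel time through a lossless layer of relative permittivity eps_r."""
    return 2.0 * depth_m * math.sqrt(eps_r) / C_VACUUM


def normal_incidence_reflection(eps_r1: float, eps_r2: float) -> float:
    """Fresnel amplitude reflection coefficient at normal incidence (lossless media)."""
    n1, n2 = math.sqrt(eps_r1), math.sqrt(eps_r2)
    return (n1 - n2) / (n1 + n2)


DEPTH_M = 1000.0  # hypothetical depth of the interface, for illustration only

# Delay of the subsurface echo after the surface pulse for the three
# top-layer permittivities considered in Figure 12.
delays_us = {eps: two_way_delay_s(DEPTH_M, eps) * 1e6 for eps in (4.0, 6.0, 8.0)}

# Reflection coefficient for the parameter set used in Figure 9 (4 over 9).
r_4_9 = normal_incidence_reflection(4.0, 9.0)
```

For a fixed depth the delay grows as the square root of ɛr1, which is consistent with the argument that a larger ɛr1 pushes the subsurface echo farther into the tail, toward the noise level.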

Figure 12.

Relation of visibility of a subsurface signal in mixed signals and separated sources (Sm,k and Ss,k) to permittivity of the top layer (ɛr1) and the bottom half space (ɛr2), based on 50 simulations.

5.2.4. Impact of Conductivity of Materials on Visibility of a Subsurface Signal

[81] Figure 13 illustrates the relation of the visibility of a hypothetical subsurface signal in mixed signals and sources to possible conductivity values of the two layers. From Figure 13 we see that if the conductivity of the top layer (σ1) is quite low (<6 × 10⁻⁸ S/m) and the two layers have a quite large conductivity contrast (σ2/σ1 > 50/6), then both Sm,k and Ss,k have large values and their minimal values are above zero; i.e., the subsurface signal is visible in both the mixed signals and the sources. In other words, if a subsurface interface is directly observed by the radar, a CICA of the observations (although unnecessary in practice) retains the visibility. Figure 14 is an example showing that a subsurface signal is visible in mixed signals and in source S2.

Figure 13.

Relation of visibility of a subsurface signal in mixed signals and separated sources (Sm,k and Ss,k) to conductivity of the top layer (σ1), based on 50 simulations.

Figure 14.

An example showing a subsurface signal visible in mixed signals and in one of the separated sources (S2). Parameters assumed are the following: wavelength-scale RMS slope s(λ) = 0.0140; top layer permittivity and conductivity ɛr1 = 4 and σ1 = 10⁻⁸ S/m; bottom half-space permittivity and conductivity ɛr2 = 9 and σ2 = 5 × 10⁻⁷ S/m.

[82] Figure 13 shows that, with decreasing conductivity contrast, the average Sm,k decreases. At σ1 ≈ 1 × 10⁻⁷ S/m, i.e., where the contrast is about 1/5, the average Sm,k drops below zero, meaning that the subsurface signal becomes invisible in the mixed signals. A further increase in σ1 until the contrast approaches 1/2 may reduce the visibility slightly, because a higher σ1 value tends to shield a subsurface signal from being seen in the mixed signals. With further increased σ1 the contrast approaches 1/1 and the visibility becomes stable, because the conductivity contrast becomes less important and the strength of the subsurface signal is controlled mainly by the permittivity contrast (ɛr2/ɛr1 = 9/4). In contrast, the average Ss,k remains above 1 and the minimal values of Ss,k remain near positive, suggesting that the subsurface signal tends to be visible in S2 or S3 even when the conductivities vary this widely.
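The shielding effect of a higher σ1 can be related to the loss tangent defined in the introduction (conductivity divided by the product of permittivity and angular frequency). The Python sketch below evaluates it for the top-layer parameters of Figures 9 and 14. The sounding frequency is not stated in this section, so the 4 MHz value is an assumption made only for illustration, and the function name is hypothetical.

```python
import math

EPS_0 = 8.8541878128e-12  # vacuum permittivity, F/m


def loss_tangent(sigma_s_per_m: float, eps_r: float, freq_hz: float) -> float:
    """Loss tangent = sigma / (omega * eps_0 * eps_r), with omega = 2*pi*f."""
    omega = 2.0 * math.pi * freq_hz
    return sigma_s_per_m / (omega * EPS_0 * eps_r)


FREQ_HZ = 4.0e6  # ASSUMED sounding frequency; not given in this section

# Top layer of Figure 9: eps_r1 = 4, sigma_1 = 1e-5 S/m.
tan_delta_fig9 = loss_tangent(1e-5, 4.0, FREQ_HZ)

# Top layer of Figure 14: eps_r1 = 4, sigma_1 = 1e-8 S/m.
tan_delta_fig14 = loss_tangent(1e-8, 4.0, FREQ_HZ)
```

Because the loss tangent scales linearly with σ1 at fixed permittivity and frequency, the two cases differ by three orders of magnitude, which is in line with the subsurface signal being directly visible in the mixed signals only for the low-conductivity case.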

5.3. Summary and Discussion

[83] The simulation described above is restricted to a single frame window (five observations). Since many variables are involved and their mutual relations are complex, the simulation results are far from a complete description of the behavior of signals received from a region of variable surface roughness that include subsurface echoes. However, these initial results may shed light on the interpretation of “F” as a subsurface dielectric interface rather than as clutter, as discussed in section 4.2.3. First, in addition to the discussions in section 4.1, this simulation supports treating the F area as a smooth area, as mentioned in section 5.2.2. In this respect we have assumed an ɛr1 value of 4 or 5, an ɛr2 value of 9 or 15, and σ1 and σ2 values of 5 × 10⁻⁶ to 10⁻⁵ S/m, which are believed to be possible values for the Martian upper crust materials [Heggy et al., 2006; O. Bombaci et al., Mars Express mission: Subsurface and ionospheric sounding processing architecture for MARSIS instrument, paper presented at the MARSIS Co-I Workshop, Rome, Italy, 2006]. Second, the simulation suggests that a subsurface signal should be higher than the noise level in order to be detected by CICA, based on an examination of the relation between the visibility of a subsurface signal in a source and the permittivity of the top layer (section 5.2.3). F represents a series of intensity peaks at a time position 30 sample intervals from the surface pulses (Figure 7). At this time position the tail is clearly higher than the noise level, as exemplified in Figure 1. This means that if a subsurface signal exists at such a time position it is likely to be higher than the noise level. We have calculated Ss,k considering both S2 and S3 in the simulation, while F is in S2 only. This may mean that the subsurface signal represented by F is quite strong.

[84] The simulation may also provide some clues about the material properties of the two layers. If F represents a subsurface interface, then the top layer permittivity should be smaller than 6 and the permittivity contrast (ɛr2/ɛr1) should probably be lower than 10/6, assuming s(λ), σ1, and σ2 to be 0.0140, 10⁻⁵ S/m, and 2 × 10⁻⁵ S/m, respectively (Figure 12), because otherwise the interface tends to be undetectable by CICA. If we assume a smaller conductivity for the bottom layer (σ2 = 5 × 10⁻⁷ S/m), then σ1 should be greater than about 6 × 10⁻⁸ S/m (Figure 14) because otherwise the subsurface signals may be easily discernible in the original signals. Considering these clues, we may suppose, for example, that the top layer and the bottom layer are frozen deposits [Carr and Head, 2003] with different, low proportions of ice (with the deeper layer having less ice), because such materials may match the dielectric parameters [Heggy et al., 2006]. However, this supposition is rough because it is based on a broad “trend” of the visibility of the signals in the sources. Further research is needed to better constrain the material properties in this way, and what the materials of the two layers may be geologically remains to be studied.

6. Concluding Remarks

[85] We have applied the method of CICA to examine the tails of observed MARSIS signals in the northern lowlands of Mars, in order to find possible subsurface signals weakened by clutter and other noise in the tails. In this paper we describe the application scenario of the method, with a case study of a selected region in the west part of Utopia Planitia (88°E–91°E, 45°N–51°N), where an unnamed large, semifilled crater is located. We find a clear feature in the crater represented by a significant independent source signal separated by CICA. We interpret the feature as representing a subsurface interface based on an analysis of the characteristics of the separated sources and their relations to surface roughness inside and outside the crater. We also conduct a numerical simulation to further examine the conditions under which the method tends to be effective. We conclude that CICA can be helpful for recognizing subsurface signals of MARSIS with intensity above the noise level but obscured by clutter and noise, provided the surface is smooth. In order to further exploit MARSIS data to reveal structures of the upper crust of the northern lowlands of Mars, our future work will involve extending the application of the method to processing of MARSIS data in other regions of the northern lowlands and incorporating more data sets such as the Mars Global Surveyor gravity data, which have been used to assess the crustal details of impact basins on Mars [Potts et al., 2004].


[86] MARSIS was built and is jointly managed by the Italian Space Agency and NASA. Mars Express was built and is operated by the European Space Agency. Zhenfei Zhang thanks NSFC (project 40874092) for financial support.