Several magnetic field models of Mars have been constructed since the Mars Global Surveyor data became available. Three distinct schemes, formulated through spherical harmonic functions, discrete equivalent dipoles, and continuous magnetic field kernels, have yielded results that are broadly compatible but differ considerably in detail. Models of the internal potential function in terms of spherical harmonics tend to yield divergent high-degree Mauersberger-Lowes spectra, whereas crustal magnetization models exhibit flat but still significant spectra up to high degrees. Previous efforts appear to have been dominated by the pursuit of a better fit to the observed data and have yielded fine details with wavelengths shorter than the lateral track spacing. In this study, the variance-reduction versus model-variance tradeoff analysis is invoked to determine the appropriate regularization. Taking advantage of the recently developed multiscale inversion, we conservatively retain only the model components that are robustly constrained by the data rather than unilaterally pushing for a higher degree of fitting. With a variance reduction around 82%, which gives a reasonably fair data fit without high model variance, the high-degree power spectra of our preferred model exhibit an obvious decaying trend. This implies that much of the short-wavelength energy embedded within established models is either not robustly resolvable, is of external origin, or simply reflects the nonuniform distribution of sampling at short scales. The greater high-degree power of models based on spherical harmonics is attributed to spectral leakage due to the truncated representation.
 A magnificent magnetic field variation in the southern hemisphere of Mars has been discovered owing to the compilation of the Mars Global Surveyor (MGS) data [Acuna et al., 1999]. There is, however, no significant field intensity observed for the northern lowlands although there are as many observations for the northern hemisphere as for the south. The significant magnetic signature demarcating an extensive part in the south has been attributed to ancient magnetization of the Martian crust [e.g., Hood et al., 2005]. Consequently, there have been considerable efforts to construct models of the Martian crustal magnetic field, not only for delineating potential tectonic features but also to systematically summarize the robust information of the precious MGS data as completely as possible.
 The nature and quality of the magnetic observations, as well as features of the main phases of MGS, have been documented previously [e.g., Albee et al., 2001]. Essentially, the MGS magnetic data consist of three-component vector field observations at different altitudes, from below 200 km to 367–435 km, during different mission phases. The three-component data set used in this study is the same set previously used to construct the degree-90 spherical harmonic internal potential model [Cain et al., 2003] as well as the spatially continuous magnetization model [Whaler and Purucker, 2005]. In total there are three-component measurements at 111,274 points, with altitudes from 102 to 426 km, composing the 333,822 field intensity data. The specific parameters and the adopted coordinate system of the data are described by Cain et al. [2003].
 The consensus established from previous works attributes the major present-day contribution to the Martian magnetic field to the lithospheric magnetization of a layer about 40 km thick, and holds that there is presently no dominant dipole field for the planet [e.g., Voorhies et al., 2002; Langlais et al., 2004; Whaler and Purucker, 2005]. One school of approach is to find the scalar internal potential function at the surface of Mars in terms of spherical harmonics such that the gradient vectors of the model fit the observations [Connerney et al., 2001; Arkani-Hamed, 2001, 2002, 2004; Cain et al., 2003]. Discussions about the effects of the variation of the altitudes and the lateral sampling have been raised. Interestingly, although these studies assume an internal origin of the magnetic field, the Mauersberger-Lowes power spectra [Backus et al., 1996], which diverge toward high degrees, imply significant contributions from external sources. Other studies assume a continuously varying magnetization vector function M(r), where r stands for the position vector, such that the theoretical magnetic field intensity vector observed at $\mathbf{r}_{obs}$,
fits the field observation. This linear data rule states that the data functional has the form of an inner product between the model function and the data kernel. To evaluate expression (1) for a given magnetization model, a numerical scheme based on parameterizing the model function must be implemented. Langlais et al. [2004] use the equivalent source dipole technique, which attributes the magnetic field to the contribution from 4840 dipoles with spatially varying magnetization intensity and direction, uniformly distributed across the globe and 20 km below the Martian surface. Whaler and Purucker [2005], on the other hand, expand the model function in terms of the data kernels. One advantage of this expansion is that it automatically avoids annihilators [Parker, 1994], since any component expressible in terms of the data kernels will not be orthogonal to all data kernels. That is, there will be no model components of this form that make no contribution to the data. One of the major disadvantages, however, is that the resulting Gram matrix is very large and thus computationally demanding, although the matrix is usually sparse. Whaler and Purucker [2005] take advantage of the sparseness and indicate that an effective computation can usually be performed with only the 0.21% largest elements of the Gram matrix retained. Both studies obtain models whose power spectra are similar to those of the former studies below degree 40. The higher-degree power spectra become considerably lower but are still significant. There have been concerns that crustal magnetic features with wavelengths shorter than the altitude of the observation might not be robustly resolvable [Connerney et al., 2001; Arkani-Hamed, 2002]. 
Noticeably, since the north-south trending track spacing of the MGS has a width of ∼2°–5°, that is, ∼100–300 km at the equator [Arkani-Hamed, 2001], it has been argued that the highest harmonics degree corresponding to twice the lateral resolvable wavelength is thus about 65 [Arkani-Hamed, 2004]. In spite of these discussions, recent models tend to have significant power spectra contributions from much higher degrees.
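For orientation, the linear data rule described above can be written in a standard textbook form. The following is a sketch in assumed notation, not necessarily the arrangement used in equation (1): $\mu_0$ is the vacuum permeability, $V$ the magnetized shell, $\mathbf{I}$ the identity tensor, and $\mathbf{K}$ the dipole data kernel.

```latex
\mathbf{B}(\mathbf{r}_{obs}) = \int_V \mathbf{K}(\mathbf{r}_{obs},\mathbf{r})\,\mathbf{M}(\mathbf{r})\,dV,
\qquad
\mathbf{K}(\mathbf{r}_{obs},\mathbf{r}) = \frac{\mu_0}{4\pi}
\left[\frac{3\,\boldsymbol{\rho}\,\boldsymbol{\rho}^{T}}{\rho^{5}} - \frac{\mathbf{I}}{\rho^{3}}\right],
\qquad
\boldsymbol{\rho} = \mathbf{r}_{obs}-\mathbf{r},\quad \rho = |\boldsymbol{\rho}|
```

Each field component at each observation point is one datum, and the corresponding row of $\mathbf{K}$ plays the role of its data kernel.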
 We basically follow the approach that inverts for the spatial variation of the equivalent source crustal magnetization. We build the spherical tessellation starting from a spherical icosahedron. Midpoints on the edges of each of the 20 spherical triangles are connected to form four child triangles. This refinement of the spherical mesh is executed successively until we have 10242 nodes marking the vertices of the 20480 (= 20 × $4^5$) triangular faces. The integrand of equation (1), evaluated at a finite number of Gaussian integration points [e.g., Zienkiewicz and Taylor, 1991] within each triangle, is then summed to numerically approximate the inner product of the data rule. Let m be the vector of M (= 3 × 10242) magnetization model components; then the N (= 333,822) dimensional data vector d is constrained by
 $$\mathbf{d} = \mathbf{G}\mathbf{m}. \tag{2}$$
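The node and face counts quoted for the refined icosahedral mesh can be checked with a few lines of arithmetic. This is a minimal sketch; the function name is illustrative and the counts rely only on the midpoint subdivision rule and Euler's formula for a closed triangulated sphere.

```python
def icosahedral_mesh_counts(levels):
    """Return (vertices, edges, faces) after `levels` midpoint subdivisions."""
    faces = 20 * 4 ** levels        # every triangle splits into four children
    edges = 3 * faces // 2          # each edge is shared by two triangles
    vertices = edges - faces + 2    # Euler's formula V - E + F = 2 on a sphere
    return vertices, edges, faces


for level in range(6):
    v, e, f = icosahedral_mesh_counts(level)
    print(f"level {level}: {v} vertices, {e} edges, {f} faces")

# Three magnetization components per node at refinement level 5
n_model_parameters = 3 * icosahedral_mesh_counts(5)[0]
```

Five refinement levels reproduce the 10242 nodes and 20480 faces used in the text, giving 3 × 10242 = 30726 model parameters.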
 Notice that in the current formulation, the number of degrees of freedom of the model, 3 × 10242, is more than twice that of the previous formulation based on the equivalent source dipoles, 3 × 4840 in the work of Langlais et al. [2004]. The parameterization of Langlais et al. [2004] assumes a finite number of equivalent dipoles located on the vertices of the spherical triangular meshes. We, on the other hand, assume that the magnetization varies linearly within each of the 20480 triangles such that the magnetization is a globally continuous vector function. This further enables a much better capability of resolving short-wavelength features. Elements of each row of the sensitivity matrix G specify the dependency of a particular datum upon the M dimensional model vector. An example of the spatial variations of selected observations reveals the localized constraints and the effects of the distinct altitudes (Figure 1). Conventionally, the model estimates, $\hat{\mathbf{m}}$, can then be obtained by the damped least squares (DLS) [e.g., Lawson and Hanson, 1974] algorithm,
 $$\hat{\mathbf{m}} = (\mathbf{G}^{T}\mathbf{G} + \theta^2\mathbf{I})^{-1}\mathbf{G}^{T}\mathbf{d}. \tag{3}$$
 The value of the nonnegative damping factor $\theta^2$ controls the strictness of the imposed preference of the minimum model norm. It is also a knob for tuning the variance reduction (vr) versus model variance (σm) tradeoff. Briefly, the variance reduction is defined to indicate the capability of a model (m) to reconstruct the observed data (d). It can be calculated by
 On the other hand, the model variance is a measure of the uncertainty of a model arising from noise contaminating the data; it is computed [Paige and Saunders, 1982] by
 Heavier damping, set by a larger value of $\theta^2$, usually leads to a more robust model (lower σm) but sacrifices data fitting (lower vr) at the same time. We will show in the following how the tradeoff between model robustness and data fitting helps to determine an appropriate value of $\theta^2$ and an optimal model.
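The tradeoff described above can be illustrated on a small dense toy problem. This is a minimal sketch under stated assumptions: the variance reduction is taken as one minus the ratio of squared residual norm to squared data norm, and the model variance as the trace of the model covariance for uncorrelated unit-variance data noise. Both are common conventions and may differ from the definitions used in this study. The random G, the noise level, and the list of damping values are purely illustrative.

```python
import numpy as np


def damped_least_squares(G, d, theta2):
    """Minimize ||d - G m||^2 + theta2 * ||m||^2 via the normal equations."""
    n_model = G.shape[1]
    return np.linalg.solve(G.T @ G + theta2 * np.eye(n_model), G.T @ d)


def variance_reduction(G, d, m):
    """Fraction of the data power reproduced by the model (assumed definition)."""
    residual = d - G @ m
    return 1.0 - (residual @ residual) / (d @ d)


def model_variance(G, theta2, sigma_d=1.0):
    """Trace of the model covariance for uncorrelated data noise (assumed measure)."""
    n_model = G.shape[1]
    g_inv = np.linalg.solve(G.T @ G + theta2 * np.eye(n_model), G.T)
    return sigma_d ** 2 * np.trace(g_inv @ g_inv.T)


rng = np.random.default_rng(0)
G = rng.normal(size=(200, 50))            # toy sensitivity matrix
m_true = rng.normal(size=50)
d = G @ m_true + 0.5 * rng.normal(size=200)

tradeoff = []
for theta2 in [0.1, 10.0, 1000.0, 1e5]:
    m_hat = damped_least_squares(G, d, theta2)
    tradeoff.append((theta2, variance_reduction(G, d, m_hat), model_variance(G, theta2)))

for theta2, vr, var_m in tradeoff:
    print(f"theta2={theta2:g}  vr={vr:.4f}  model variance={var_m:.4e}")
```

Plotting vr against the model variance for a range of damping values gives a tradeoff curve of the kind discussed in the text. For a problem of the size treated here, an iterative sparse solver would be needed in place of the dense normal equations.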
3. Multiscale Inversion Based on the Spherical Wavelet Basis
 It has been pointed out that minimum norm solutions obtained from DLS generally lack interpolation capabilities into sparsely sampled areas and tend to yield fragmented and fractured models [e.g., Chiao and Liang, 2003]. Regularizations based on enforcing model smoothness or penalizing roughness have also been conventional practices in handling geophysical inverse problems [e.g., Menke, 1984; Delprat-Jannaud and Lailly, 1993]. The implementations, however, usually presume that the model smoothness [e.g., Meyerholtz et al., 1989], or the intrinsic model correlation length [Tarantola and Nercessian, 1984], is spatially uniform or stationary. It has been shown that this is not a realistic presumption, which has led to the development of multiscale regularization based on wavelet representations of models, such that spatially nonstationary smoothness enhancement is automatically invoked depending on the in situ density of model constraints offered by the data [Chiao and Kuo, 2001; Chiao and Liang, 2003]. We follow the same rationale and use the aforementioned spherical meshes as the framework on which to build spherical wavelet bases.
 To briefly summarize the algorithm, a single triangle is taken as a simplified example (Figure 2). To describe a function f(x) discretely across the interior of the triangle, we can specify its values at uniformly distributed nodes, such as $f_1 = f(\mathbf{r}_1)$, $f_2 = f(\mathbf{r}_2)$, $f_3$, ..., where $\mathbf{r}_1$, $\mathbf{r}_2$ are position vectors at the internal nodes 1, 2 (Figure 2). These nodes are vertices of the internal triangles produced by successive levels of refinement of the original triangle. That is, by connecting the midpoints of its edges, the parent triangle Δ123 is subdivided into four child triangles Δ456, Δ536, Δ146, and Δ425 (Figure 2). Each of the resulting triangles can be further subdivided in the same way. Now, instead of representing f(x) by $[f_1, f_2, f_3, f_4, \ldots, f_9, \ldots]$ distributed uniformly throughout the triangle, there are ways to build hierarchical representations of f(x). A naïve example is cast in the following sense:
 That is, on the fundamental level, level 1, there are 3 degrees of freedom, $h_i^1 = f_i$, $i = 1, 2, 3$, to be specified, where the upper index marks the refinement level and the lower index the location of the node. On the next refinement level there are 6 degrees of freedom, $h_i^2$, $i = 1, \ldots, 6$. As specified in equation (6), the first 3 degrees of freedom, which characterize the large-scale variation, are inherited from the lower level of representation, whereas the additional 3 degrees of freedom are the in situ deviations of f(x) from the values predicted by linear interpolation of the larger-scale variation at each midpoint, for example, $h_4^2 = f_4 - (h_1^1 + h_2^1)/2$. That is, the original in situ variations, $f_4 = (h_1^1 + h_2^1)/2 + h_4^2$, are replaced by the combination of a low-passed portion (the contribution interpolated from the larger scale) and a high-passed detail. Fast wavelet transforms [e.g., Mallat, 1998] are efficient schemes that accomplish the transformation W in equation (6), which maps the strictly spatial representation $f_i$ to a localized hierarchical representation $h_i^l$ of this sort. In addition, lifting schemes [Sweldens, 1996] can be incorporated to further improve the quality of the multiresolution representation. In this study, we transform the representation based on the original spherical mesh into an expansion utilizing spherical wavelet bases [see also Chiao and Kuo, 2001]. This is why our construction subdivides the edges of each spherical triangle of the icosahedron into $2^5$ segments rather than an arbitrary integer number as in the work of Langlais et al. [2004]. In fact, starting from the formulation (1) based on the direct spatial representation, we apply a bi-orthogonal wavelet transform [Cohen et al., 1992] directly to each row of the coefficient matrix G, that is, GW*, such that the model vector to be solved for is automatically the wavelet representation of the original spatial function for the crustal magnetization. 
That is, equation (2) is replaced by
where $\mathbf{W}^{-1}$ is the inverse wavelet transform that reverses the operation of the forward wavelet transform W. The advantage of solving for m in the wavelet domain is that, with the same number of degrees of freedom, parameters in the wavelet representation are grouped into a natural hierarchy of local scales such that the damping regularization acts to sort through successive scales depending on the local data constraints. In short, sites with dense constraints are capable of resolving more details robustly, whereas robust long-wavelength components are still available for sparsely constrained areas.
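The naive two-level hierarchy on a single triangle, and the way it enters the data rule, can be sketched as follows. This is an illustration only: it implements plain midpoint interpolation without lifting, not the bi-orthogonal spherical wavelets used in this study. The midpoint parents of nodes 5 and 6 are inferred from the child triangles listed in the text (only node 4 is given explicitly), and the 4 × 6 matrix G is a random placeholder.

```python
import numpy as np

# Node order: vertices 1, 2, 3, then edge midpoints 4, 5, 6 (0-based 0..5).
# Assumed midpoint parents: node 4 between 1 and 2, node 5 between 2 and 3,
# node 6 between 1 and 3 (inferred from child triangles 146, 425, 536).
PARENTS = {3: (0, 1), 4: (1, 2), 5: (0, 2)}


def forward_matrix():
    """Matrix W mapping nodal values f to hierarchical coefficients h."""
    W = np.eye(6)
    for child, (a, b) in PARENTS.items():
        W[child, a] = -0.5   # subtract the linear prediction from the parents
        W[child, b] = -0.5
    return W


W = forward_matrix()
W_inv = np.linalg.inv(W)

f = np.array([1.0, 3.0, 5.0, 2.5, 4.0, 2.0])   # nodal values f_1..f_6
h = W @ f                                      # coarse values plus midpoint details

# Data rule in both representations: d = G f = (G W^-1)(W f)
rng = np.random.default_rng(0)
G = rng.normal(size=(4, 6))                    # placeholder sensitivity rows
G_wavelet = G @ W_inv
d_spatial = G @ f
d_wavelet = G_wavelet @ h

print("h =", h)
```

Solving a damped problem for h with the transformed matrix and then applying the inverse transform gives a model in the original nodal representation. The damping then acts on scale-ordered coefficients rather than on nodal values.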
 We execute two different groups of inversions, based on the simple damping scheme (equation (3)) and the multiscale inversion (equation (8)), each with several different values of the damping factor $\theta^2$. The variance-reduction (vr) versus model-variance (σm) tradeoff curves (Figure 3) clearly indicate how an appropriate model might be selected. We first notice that, at comparable variance reduction, the results obtained via the multiscale inversions (solid triangles on Figure 3) have model variances that are in general an order of magnitude lower than the simple damping results (open circles on Figure 3). As mentioned in the previous section, this is because the multiscale inversion assembles the model variation through the hierarchy of scales, starting from the longer wavelengths that have more accumulated constraints. For both the simple damping and the multiscale inversion results, high model variances are associated with the solutions that best fit the observational data (group 3 on Figure 3, located at the high-variance-reduction extreme of the tradeoff curves), implying that such solutions embed significant unreliable components, poorly constrained by the data, in order to reach a high data fit. In other words, these lightly damped solutions overinterpret the information content of the data. On the other hand, solutions approaching the knees of the tradeoff curves (group 2 on Figure 3), which exhibit nearly the same variance reductions (over 92%), bear considerably lower model variances. Continuing the trend of decreasing model variance, conservative solutions with variance reduction around 82% (group 1, around the knee of the tradeoff curves) reduce the model variance even more. 
Further model variance decreasing (along the reversed horizontal axis on Figure 3), however, sacrifices too much variance reduction to gain just barely significant decreases of model variance, and is thus underfitting the precious observational data.
 For the reasons discussed above, we believe that the solutions worth exploring, which reveal robust model structure without sacrificing a significant amount of data information, are located between group 2 and group 1. In fact, we prefer the conservative group 1 multiscale inversion solution (referred to as solution_1 hereafter), which can be characterized as the most reliable model with a reasonably low data misfit. The simple damping group 2 solution (referred to as solution_2 hereafter), on the other hand, can be treated as a reference conventional model that might be slightly on the overinterpreting side. The overall patterns of the crustal magnetization revealed in these two solutions are similar (Figures 4 and 5). In fact, the general features are quite similar to previous works such as those obtained by Langlais et al. [2004] and Whaler and Purucker [2005]. However, the conservative multiscale solution, solution_1, is dominated by long-wavelength structures at some places. Notice that this smoothing is not applied in a stationary sense; that is, the model is not the result of a uniform low-pass filter as in conventional regularizations that enforce smoothness [Chiao and Kuo, 2001]. The relatively smooth model, solution_1, can fit the MGS data reasonably well (see also Figures 6, 7, and 8), although there are notable short-scale deviations from the observations. It is also worth pointing out that in Whaler and Purucker's model, to build the minimum RMS magnetization model, short-scale features are required to enforce null magnetization within data gaps. These short-scale features have very little effect on the data misfit or the variance reduction. Our solutions have considerably lower, decaying high-degree power spectra but still retain a reasonable data fit. The reference simple damping solution, solution_2, has Mauersberger-Lowes spectra very similar to those of Whaler and Purucker's model up to degree 75. 
However, our preferred robust multiscale solution, solution_1, has similar power spectra to almost all previous models only up to degree 40 and then starts to dive. We will show in the next section, through inversion experiments executed on data generated from a synthetic model, why such a conservative choice to pick a reliable solution is important.
 The sampling geometry of a particular data set, such as the distribution of track spacing and the observing altitudes, inevitably imposes natural limits on the shortest resolvable model wavelengths. Although the general consensus is to formulate inverse problems with enough model degrees of freedom to avoid potential aliasing, the actual resolvable model components are intrinsically determined by the sampling geometry and are usually far fewer than implied by the resolution presumed in the formulation. The variance reduction versus model variance tradeoff analysis helps to locate the optimal model resolution by indicating the appropriate strictness of regularization or damping. In principle, formulations based on data kernels [e.g., Whaler and Purucker, 2005] are intrinsically free from the concern of nonuniqueness since no annihilators are embedded. There is, however, always the problem of noise contamination or observation errors. In other words, proper regularization is still essential to avoid overinterpreting the data. Unlike other previous works that pursue only the best data fit, Whaler and Purucker [2005] as well as Langlais et al. [2004] invoke the minimization of the RMS magnetization to regularize the inverse problem. However, the model with the least data misfit still seems to be the choice for the preferred model (e.g., Table 2 of Whaler and Purucker [2005]).
 Our solutions that fit the data reasonably well have considerably lower, decaying high-degree power spectra (Figure 9), although our spherical mesh, with a mean spacing of about 1.4°, is fully capable of resolving fine details beyond these higher degrees. These solutions are selected by locating the optimal area around the knee of the variance-reduction versus model-variance tradeoff curve. That is, the decaying high-degree power spectra are a consequence of having low model variance while retaining a reasonable data fit. In other words, fine details corresponding to those high-degree power spectra are relatively less robustly constrained by the data.
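For readers comparing spectra such as those in Figure 9, the Mauersberger-Lowes spectrum of an internal potential model follows directly from its Gauss coefficients. The sketch below assumes Schmidt semi-normalized coefficients stored in square arrays indexed [n, m]. The function name and storage layout are illustrative and not taken from any of the cited models.

```python
import numpy as np


def lowes_spectrum(g, h, a, r):
    """Mean-square field per degree, R_n = (n+1) (a/r)^(2n+4) sum_m (g_nm^2 + h_nm^2).

    g, h : arrays of shape (N+1, N+1) holding Schmidt semi-normalized Gauss
           coefficients at [n, m]; entries with m > n are ignored.
    a    : reference radius of the coefficients; r : evaluation radius.
    Returns an array with R_1 .. R_N.
    """
    n_max = g.shape[0] - 1
    spectrum = np.zeros(n_max)
    for n in range(1, n_max + 1):
        power = np.sum(g[n, : n + 1] ** 2 + h[n, : n + 1] ** 2)
        spectrum[n - 1] = (n + 1) * (a / r) ** (2 * n + 4) * power
    return spectrum


# Toy example: a pure axial dipole term of unit amplitude
g = np.zeros((3, 3))
h = np.zeros((3, 3))
g[1, 0] = 1.0
at_reference = lowes_spectrum(g, h, a=1.0, r=1.0)
at_twice_radius = lowes_spectrum(g, h, a=1.0, r=2.0)
print(at_reference, at_twice_radius)
```

The factor (a/r)^(2n+4) shows why high-degree power evaluated at the surface is sensitive to small coefficient errors when the data are collected at altitude.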
 Notice that there are external as well as internal field contributions to the data. An inverse problem formulated following equation (1) results in an equivalent source magnetization model that extracts crustal signals as far as is permissible. Arkani-Hamed [2004] used only the radial component of the mapping phase data, which is believed to be least contaminated by the external field, together with covariance analysis and a comparison between models derived from two subsampled data sets, to suppress the time-varying and noncrustal parts of the models induced by the external field. Although he concluded conservatively that degree ∼62 is likely an optimum upper limit of the harmonic degrees of the crustal magnetic field that can be resolved by the high-altitude mapping phase MGS data, it is interesting to note that the resulting model nevertheless has higher high-degree power than Cain et al.'s [2003] model (Figure 9), which is based on a data set including all three components from the AB, SPO, and MO phases. In other words, external field contamination does not seem to be the main factor responsible for the differences in their high-degree power.
 We believe there are two major factors that result in the apparent discrepancies among the models established so far. The first factor, which differentiates results based on the crustal magnetization model (CM), whether discrete in nature such as the GSFC model [Langlais et al., 2004] or continuous such as the WP model [Whaler and Purucker, 2005] and the model of this study, from those based directly on spherical harmonics (SH), can very likely be attributed to the effect of spectral leakage [Trampert and Snieder, 1996; Chiao and Kuo, 2001]. This effect is similar to the aliasing that occurs when a truncated Fourier series is adopted to expand a function with high-degree energy. The high-degree energy that cannot be represented at its actual degrees owing to the truncated expansion piles up near the truncation degree and distorts the spectra, especially close to that degree. Spectral leakage is very similar in essence, except that, instead of arising in the direct decomposition of a function or a time series, it occurs when too few degrees of freedom are adopted for the model parameterization of an inverse problem. It is obvious from this as well as previous studies that even higher degrees are needed to obtain a strictly numerically better fit to the MGS data, and that degree n = 90 is just an arbitrary level of truncation. In other words, when an SH representation truncated at n = 90 is adopted to fit the MGS data, spectral leakage onto the high degrees close to the truncation degree is inevitable. The CM models, however, are truncated differently. In fact, one needs much higher degrees to completely represent these models in terms of a spherical harmonic expansion. That is, there is still significant power beyond n = 90 for CM models, whereas the power of SH models is cut off entirely beyond n = 90. 
We believe that this is the main reason that makes the SH models have higher power around n = 90 than the CM models.
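The leakage mechanism can be illustrated with a one-dimensional Fourier analogue. This is a minimal sketch, not a reproduction of the spherical harmonic case: an irregularly sampled signal with energy at degrees 3 and 12 is fitted by least squares with a basis truncated at degree 8 and, for comparison, with a basis complete to degree 12. The sample count, degrees, and random seed are arbitrary choices.

```python
import numpy as np


def fourier_design(x, max_degree):
    """Design matrix with columns 1, cos(kx), sin(kx) for k = 1..max_degree."""
    cols = [np.ones_like(x)]
    for k in range(1, max_degree + 1):
        cols += [np.cos(k * x), np.sin(k * x)]
    return np.column_stack(cols)


def degree_power(coeffs, max_degree):
    """Power per degree k = 1..max_degree (squared cos plus sin amplitudes)."""
    pairs = coeffs[1:].reshape(max_degree, 2)
    return np.sum(pairs ** 2, axis=1)


rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0.0, 2.0 * np.pi, size=80))   # irregular sampling
signal = np.cos(3 * x) + np.cos(12 * x)               # true energy at k = 3 and 12

full, *_ = np.linalg.lstsq(fourier_design(x, 12), signal, rcond=None)
trunc, *_ = np.linalg.lstsq(fourier_design(x, 8), signal, rcond=None)

p_full = degree_power(full, 12)
p_trunc = degree_power(trunc, 8)

# Power assigned to degrees that carry no true energy
spurious_full = p_full.sum() - p_full[2] - p_full[11]
spurious_trunc = p_trunc.sum() - p_trunc[2]
print("spurious power, complete basis :", spurious_full)
print("spurious power, truncated basis:", spurious_trunc)
```

With the complete basis the fit recovers the two true degrees and nothing else. With the truncated basis the unrepresented degree-12 energy is mapped into the retained coefficients. The sketch shows only that leakage occurs under irregular sampling; it says nothing about how the leaked power is distributed near the truncation degree in the spherical case.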
 The second factor, which gives some models lower high-degree power than others within the same group, is regularization, the key issue discussed in the current study. To ensure that the solution is reliable, we suggest adjusting the strictness of the imposed regularization or damping. Enforced regularization shaves off poorly constrained model components while sacrificing some goodness of fit. That is, we have reasons other than pursuing a better fit for choosing a preferred model whose high-degree power is even lower than that of other CM models. On the other hand, we believe that the reason the FSU model [Cain et al., 2003] has much lower high-degree power than the MG model [Arkani-Hamed, 2004] and the NASA model [Connerney et al., 2001] is that the FSU model is based on a more complete data set, which reduces the degree of nonuniqueness of the inverse problem. That is, for the same number of degrees of freedom to be modeled, more data constraints act similarly to regularization, suppressing relatively poorly constrained components and resulting in less high-degree power. Further comparison of the spatial patterns of the surface potential among our preferred models and those established previously demonstrates fundamental differences that might result from the two factors mentioned above (Figure 10). Notice that the FSU model (Figure 10b), which is constrained with more data than the MG model (Figure 10a), appears to be much simpler and has much lower high-degree power (Figure 9). That is, it is very likely that a significant portion of the short-scale complexities with high-degree power in the MG model are not robustly resolvable model components. The WP model (Figure 10d) is in fact constrained by the same data set as the FSU model. So the reason why the WP model bears even less complicated structures than the FSU model is very likely its distinct formulation, which avoids null space model components from the outset. 
It is interesting to note that our solution_2 model (Figure 10e) is very similar to the WP model. Whereas our solution_2 model reaches a variance reduction over 92%, the intrinsic model structure is much simpler than those previously established SH models (Figures 10a and 10b). Furthermore, we have reasons to believe that the even simpler structures in the conservative solution_1 model (Figure 10f) might be more robust. It is also worth mentioning that although the difference between the surface potential models from solution_2 and solution_1 seems to be subtle (Figures 10e and 10f), their manifestations on the crustal magnetization models are in fact significant (Figures 4 and 5).
 To further verify the interpretation of the finer detail features discussed above, we execute recovery experiments with a known implanted synthetic magnetization model. A circular crustal model with constant 20 km depth and alternating positive and negative magnetization in the radial component, Mr, is implanted around the equator (Figure 11a). There are no assumed lateral, Mθ and Mϕ, components. The same sampling geometry of the MGS data set is invoked as the observations. That is, the three-component magnetic field intensity data observed at different altitudes across Mars are replaced by synthetic data generated from the implanted magnetization structure. A considerable amount of uncorrelated noise with peak amplitude as high as 60% of the peak amplitude of the model generated data is then randomly blended in the synthetic data. Since the sampling geometry is the same, there is no need to carry out a new tradeoff analysis for the synthetic data set. Damping factors for the three groups of solutions marked on the tradeoff curves on Figure 3 are tested to obtain corresponding solutions. Not surprisingly, the recovered, underdamped simple damping solution (solution_3 on the tradeoff diagram shown by Figure 3) is significantly corrupted by manifestation from the uncorrelated noise added to the data (Figure 11b). The corruption reduces considerably toward solution_2, but it is still significant and interferes with the correct interpretation of the recovered model (Figure 11c). On the other hand, the noise corruption upon the recovered model that corresponds to the multiscale inversion solution_1 is obviously much lower (Figure 11d) and reasonable. What is worth cautioning is the significant aliasing effects onto the Mθ and the Mϕ components that are not implanted (Figures 11e and 11f). This is, however, inevitable for any formulation based on equation (1) and is simply unresolvable by data constraints alone.
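The logic of such a recovery experiment can be sketched on a one-dimensional toy problem. Everything here is an assumption made for illustration: the Gaussian smoothing kernel standing in for the sensitivity matrix, the alternating ±1 source pattern, the uniform noise distribution, and the two damping values. Only the noise scaling, with peak amplitude set to 60% of the peak noise-free datum, follows the description in the text.

```python
import numpy as np


def damped_least_squares(G, d, theta2):
    """Minimize ||d - G m||^2 + theta2 * ||m||^2 via the normal equations."""
    return np.linalg.solve(G.T @ G + theta2 * np.eye(G.shape[1]), G.T @ d)


rng = np.random.default_rng(2)
x_obs = np.linspace(0.0, 1.0, 60)     # observation positions
x_src = np.linspace(0.0, 1.0, 40)     # source positions

# Smooth kernel: each datum averages nearby sources (ill-conditioned by design)
G = np.exp(-((x_obs[:, None] - x_src[None, :]) ** 2) / (2 * 0.1 ** 2))

# Implanted model with alternating positive and negative blocks
m_true = np.where(np.sin(8 * np.pi * x_src) >= 0.0, 1.0, -1.0)
d_clean = G @ m_true

# Uncorrelated noise with peak amplitude equal to 60% of the peak clean datum
noise = rng.uniform(-1.0, 1.0, size=d_clean.size)
noise *= 0.6 * np.max(np.abs(d_clean)) / np.max(np.abs(noise))
d_noisy = d_clean + noise

m_light = damped_least_squares(G, d_noisy, 1e-8)   # underdamped
m_heavy = damped_least_squares(G, d_noisy, 1.0)    # conservatively damped

err_light = np.linalg.norm(m_light - m_true)
err_heavy = np.linalg.norm(m_heavy - m_true)
print("recovery error, light damping:", err_light)
print("recovery error, heavy damping:", err_heavy)
```

In this toy setting the underdamped solution is dominated by amplified noise, whereas the heavily damped one stays bounded at the cost of a poorer data fit, which mirrors the qualitative behavior reported for Figure 11.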
 In summary, the reason to carry out tradeoff analysis is to serve as an effective way of picking the right degree of regularization and thus the appropriate model components that are robustly constrained by the data. The quality of the actual MGS data is probably much better than the tested synthetic data such that the potential corruption might not be as serious as what is demonstrated in Figure 11. However, overinterpretation or overfitting data with unreliable model components is prone to misleading results that can be avoided by giving up a small fraction of the relatively less reliable data information.
 We wish to acknowledge K. A. Whaler and M. E. Purucker for unselfishly sharing all supporting material of their model on the Web: http://planetary-mag.net/jgr_mars_whaler/ [Whaler and Purucker, 2005]. Constructive comments from Joseph Cain and anonymous reviewers have been more than helpful in making significant improvements. All graphs have been created using the Generic Mapping Tools package [Wessel and Smith, 1991]. This study is supported by the National Science Council of ROC under the contracts NSC 95-2611-M-002-004 and NSC 94-2116-M-002-023. See the auxiliary material for the computer files used.