Basal friction beneath ice sheets remains poorly characterized and yet is a fundamental control on ice mechanics. Here we use a complete map of surface velocity of the Antarctic Ice Sheet to infer the basal friction over the entire continent by combining these observations with a three-dimensional, thermomechanical, higher-order ice sheet numerical model from the Ice Sheet System Model open source software. We demonstrate that inverse methods can be readily applied at the continental scale with appropriate selections of cost function and of scheme of regularization, at a spatial resolution as high as 3 km along the coastline. We compare the convergence of two descent algorithms with the exact and incomplete adjoints to show that the incomplete adjoint is an excellent approximation. The results reveal that the driving stress is almost entirely balanced by the basal shear stress over 80% of the ice sheet. The basal friction coefficient, which relates basal friction to basal velocity, is, however, significantly heterogeneous: it is low on fast moving ice and high near topographic divides. Areas with low values extend far out into the interior, along glacier and ice stream tributaries, almost to the flanks of topographic divides, suggesting that basal sliding is widespread beneath the Antarctic Ice Sheet.
1 Introduction
 Realistic modeling of the Antarctic Ice Sheet is essential to improve projections of its past, present, and future contributions to sea level rise in a warming climate [IPCC-AR4, 2007]. Boundary conditions are required inputs for ice sheet numerical models. Among these boundary conditions, basal friction is one of the main controls of ice sheet mechanics and it is also one of the most poorly known variables because it cannot be observed directly. Inverse methods that combine ice sheet modeling and surface observations provide a viable alternative to constrain basal conditions. This approach has been applied to simplified two-dimensional ice sheet models [MacAyeal, 1992] and extended to higher-order and full-Stokes models [Morlighem et al., 2010; Seroussi et al., 2011; Jay-Allemand et al., 2011]. Larour et al.  and Gillet-Chaulet et al.  applied this approach to the Greenland Ice Sheet using different ice flow models, but inversion of basal friction has never been attempted at the scale of the Antarctic continent, which is 7 times larger than Greenland. Pollard and DeConto  recently used a simplified approach to infer basal friction beneath the Antarctic Ice Sheet at a resolution of 40 km by tuning basal friction to best match observations of ice sheet surface elevation.
 Here, we present and apply an inverse method to the entire Antarctic Ice Sheet using a three-dimensional, thermomechanical, higher-order ice flow model combined with high-resolution (300 m) ice motion data. To apply this method to the entire continent, the approach needs to be scalable, and the cost function must accommodate flow regimes spanning from near-stagnant ice in the interior (cm/yr) to fast-flowing ice along the periphery (km/yr), a difference in speed of about 5 orders of magnitude.
 Inverting for basal friction requires the construction of an adjoint model. A common approximation is to neglect the nonlinearity of ice viscosity (e.g., MacAyeal ). The impact of this incomplete adjoint approximation on the performance of the inversion has not been fully established. Goldberg and Sergienko  showed that for a hybrid model [Schoof and Hindmarsh, 2010; Goldberg, 2011], the exact adjoint may be advantageous in some cases to minimize the cost function. Here, we address this issue by deriving the exact solution of the adjoint model and by comparing the results to those obtained with the incomplete adjoint. We also compare the performance of two descent algorithms. Finally, we analyze and discuss the inferred pattern of basal friction in Antarctica and the implications of the results for ice sheet modeling.
2 Ice Flow Model Equations
2.1 Field Equations
 Most ice sheet numerical models of the Antarctic Ice Sheet rely on the Shallow-Ice Approximation [Hutter, 1982], the Shelfy-Stream Approximation [MacAyeal, 1989], or a combination of both (e.g., Pollard and DeConto , Pattyn  or Martin et al. ). These simplified ice flow models are more computationally manageable than higher-order or full-Stokes models.
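 Here we employ a higher-order model [Blatter, 1995; Pattyn, 2003], which accounts for both vertical shear and membrane stresses. This model is derived from the full-Stokes equations by making two assumptions: (1) horizontal gradients of vertical velocities are negligible compared to the vertical gradients of horizontal velocities, and (2) bridging effects [van der Veen and Whillans, 1989] are negligible. The horizontal velocity is a solution of
$$
\begin{aligned}
\frac{\partial}{\partial x}\left(4\mu\frac{\partial v_x}{\partial x}+2\mu\frac{\partial v_y}{\partial y}\right)+\frac{\partial}{\partial y}\left(\mu\frac{\partial v_x}{\partial y}+\mu\frac{\partial v_y}{\partial x}\right)+\frac{\partial}{\partial z}\left(\mu\frac{\partial v_x}{\partial z}\right)&=\rho g\frac{\partial s}{\partial x}\\
\frac{\partial}{\partial x}\left(\mu\frac{\partial v_x}{\partial y}+\mu\frac{\partial v_y}{\partial x}\right)+\frac{\partial}{\partial y}\left(4\mu\frac{\partial v_y}{\partial y}+2\mu\frac{\partial v_x}{\partial x}\right)+\frac{\partial}{\partial z}\left(\mu\frac{\partial v_y}{\partial z}\right)&=\rho g\frac{\partial s}{\partial y}
\end{aligned}
$$
where (vx,vy,vz) are the three components of velocity in a Cartesian coordinate system (x,y,z), with z the vertical axis, ρ the ice density, g the norm of the acceleration due to gravity, and s the ice upper surface elevation. The ice viscosity, μ, is assumed to be isotropic and follows Glen's flow law [Glen, 1955]
$$
\mu=\frac{B}{2\,\dot{\varepsilon}_e^{\frac{n-1}{n}}}
$$
where $\dot{\varepsilon}_e$ is the effective strain rate, n the exponent of Glen's flow law, taken as n=3, and B the ice rigidity. B is mainly temperature dependent. We use the temperature dependence of Cuffey and Paterson  to convert ice temperature to ice rigidity.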
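To make the rheology concrete, the following minimal Python sketch evaluates an isotropic Glen-type viscosity, assuming the common form μ = B / (2 ε̇e^((n−1)/n)). The function name `glen_viscosity`, the floor `min_strain_rate` (a guard against division by zero at vanishing strain rate), and the sample values are illustrative assumptions and are not taken from ISSM.

```python
def glen_viscosity(rigidity, eff_strain_rate, n=3.0, min_strain_rate=1e-30):
    """Viscosity mu = B / (2 * e**((n - 1) / n)) for Glen's flow law.

    rigidity        -- ice rigidity B (Pa s^(1/n))
    eff_strain_rate -- effective strain rate (1/s)
    n               -- Glen's flow law exponent (3 in the text)
    min_strain_rate -- hypothetical floor avoiding a division by zero
    """
    e = max(eff_strain_rate, min_strain_rate)
    return rigidity / (2.0 * e ** ((n - 1.0) / n))


# Shear thinning: faster deformation lowers the viscosity when n > 1.
slow = glen_viscosity(1.0e8, 1.0e-12)
fast = glen_viscosity(1.0e8, 1.0e-9)
```

With n = 1 the function reduces to a constant (Newtonian) viscosity B/2, which is the situation assumed when the dependence of viscosity on velocity is neglected.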
 It has been shown [Morlighem et al., 2010] that this ice flow model is valid over almost the entire ice sheet, except in spatially limited regions where bridging effects cannot be neglected.
2.2 Boundary Conditions
 Let Ω define the ice domain and let ∂Ω be its boundary. ∂Ω is the union of three interfaces: ice in contact with the atmosphere (Γs), the bedrock (Γb), and the ocean (Γw), such that ∂Ω=Γs∪Γb∪Γw.
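 At the surface of the ice sheet, Γs, we assume a stress-free boundary condition. For a higher-order model, this boundary condition reads
$$
2\mu\,\dot{\boldsymbol{\varepsilon}}_1\cdot\mathbf{n}=0,\qquad 2\mu\,\dot{\boldsymbol{\varepsilon}}_2\cdot\mathbf{n}=0
$$
where $\mathbf{n}$ is the outward-pointing unit normal vector and $\dot{\boldsymbol{\varepsilon}}_1$, $\dot{\boldsymbol{\varepsilon}}_2$ are the two strain rate vectors of the higher-order model,
$$
\dot{\boldsymbol{\varepsilon}}_1=\begin{pmatrix}2\dfrac{\partial v_x}{\partial x}+\dfrac{\partial v_y}{\partial y}\\[2mm]\dfrac{1}{2}\left(\dfrac{\partial v_x}{\partial y}+\dfrac{\partial v_y}{\partial x}\right)\\[2mm]\dfrac{1}{2}\dfrac{\partial v_x}{\partial z}\end{pmatrix},\qquad
\dot{\boldsymbol{\varepsilon}}_2=\begin{pmatrix}\dfrac{1}{2}\left(\dfrac{\partial v_x}{\partial y}+\dfrac{\partial v_y}{\partial x}\right)\\[2mm]\dfrac{\partial v_x}{\partial x}+2\dfrac{\partial v_y}{\partial y}\\[2mm]\dfrac{1}{2}\dfrac{\partial v_y}{\partial z}\end{pmatrix}
$$
so that the field equations read $\nabla\cdot(2\mu\,\dot{\boldsymbol{\varepsilon}}_1)=\rho g\,\partial s/\partial x$ and $\nabla\cdot(2\mu\,\dot{\boldsymbol{\varepsilon}}_2)=\rho g\,\partial s/\partial y$. A viscous friction law is applied at the base of the ice sheet, Γb. Following Perego et al. , we write this boundary condition as
$$
2\mu\,\dot{\boldsymbol{\varepsilon}}_1\cdot\mathbf{n}=-\alpha^2v_x,\qquad 2\mu\,\dot{\boldsymbol{\varepsilon}}_2\cdot\mathbf{n}=-\alpha^2v_y \tag{5}
$$
where α is the basal friction coefficient. Basal friction, sometimes referred to as basal shear stress, basal drag, or basal traction, is defined as
$$
\boldsymbol{\tau}_b=-\alpha^2\,\mathbf{v}
$$
where v=(vx,vy) is the ice velocity vector in the horizontal plane. Along the ice front, Γw, water pressure is applied:
$$
2\mu\,\dot{\boldsymbol{\varepsilon}}_1\cdot\mathbf{n}=\left(\rho g(s-z)+\rho_wg\min(z,0)\right)n_x,\qquad 2\mu\,\dot{\boldsymbol{\varepsilon}}_2\cdot\mathbf{n}=\left(\rho g(s-z)+\rho_wg\min(z,0)\right)n_y
$$
where ρw is the ocean water density and (nx,ny) are the horizontal components of $\mathbf{n}$. To simplify this expression, we write fw=ρg(s−z)+ρwg min(z,0).
2.3 Weak Formulation
 We present the weak formulation of the model, i.e., a formulation that accounts for both the model equations and boundary conditions in a single equation. We will see in the next section that it is useful to introduce this formulation for variational data assimilation methods.
 Let $\mathcal{V}=H^1(\Omega)^2$ be the Hilbert space of kinematically admissible fields and (φx,φy) the components of a field φ in $\mathcal{V}$. The space $H^1(\Omega)$ denotes the space of square-integrable functions whose first derivatives are also square integrable on Ω.
 The weak formulation of the model is: for all $\boldsymbol{\varphi}\in\mathcal{V}$, find $\mathbf{v}\in\mathcal{V}$ such that
$$
\int_\Omega2\mu\left(\dot{\boldsymbol{\varepsilon}}_1\cdot\nabla\varphi_x+\dot{\boldsymbol{\varepsilon}}_2\cdot\nabla\varphi_y\right)d\Omega+\int_{\Gamma_b}\alpha^2\,\mathbf{v}\cdot\boldsymbol{\varphi}\,d\Gamma-\int_{\Gamma_w}f_w\left(\varphi_xn_x+\varphi_yn_y\right)d\Gamma+\int_\Omega\rho g\left(\frac{\partial s}{\partial x}\varphi_x+\frac{\partial s}{\partial y}\varphi_y\right)d\Omega=0 \tag{8}
$$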
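The two boundary loads of Section 2.2 are simple enough to evaluate pointwise. The sketch below is an illustration only: it assumes the viscous law τb = −α²v (consistent with α being expressed in (Pa yr/m)^(1/2) later in the text) and the ice-front load fw = ρg(s−z) + ρw g min(z,0). The function names and the default densities (917 and 1023 kg/m³) are assumed typical values and are not quoted from the paper.

```python
def ice_front_load(s, z, rho_ice=917.0, rho_water=1023.0, g=9.81):
    """f_w = rho*g*(s - z) + rho_w*g*min(z, 0), with sea level at z = 0 (Pa)."""
    return rho_ice * g * (s - z) + rho_water * g * min(z, 0.0)


def basal_friction(alpha, vx, vy):
    """Basal friction vector tau_b = -alpha**2 * v for the viscous friction law."""
    return (-alpha ** 2 * vx, -alpha ** 2 * vy)


# Above sea level only the ice overburden acts; below it, the water pressure
# partly compensates the ice pressure.
load_above = ice_front_load(s=100.0, z=0.0)
load_below = ice_front_load(s=100.0, z=-500.0)
tau_b = basal_friction(alpha=10.0, vx=3.0, vy=-4.0)
```

Because the friction enters through α², the sign of α is irrelevant and the basal friction always opposes the basal velocity.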
3 Basal Friction Inversion
 We employ a variational inversion, also called a control method [MacAyeal, 1992], which is derived from optimal control theory. It consists of determining the pattern of basal friction coefficient that minimizes the misfit between modeled and measured surface velocities, by relying on a gradient descent algorithm. The gradient is retrieved by a Lagrange multiplier method.
3.1 Cost Function
 We define a cost function, or objective function, to measure the misfit between modeled, v=(vx,vy), and measured, $\mathbf{v}^{obs}=(v_x^{obs},v_y^{obs})$, surface velocities. This cost function is
$$
\begin{aligned}
\mathcal{J}(\mathbf{v},\alpha)={}&\gamma_1\int_{\Gamma_s}\frac{1}{2}\left(\left(v_x-v_x^{obs}\right)^2+\left(v_y-v_y^{obs}\right)^2\right)d\Gamma\\
&+\gamma_2\int_{\Gamma_s}\frac{1}{2}\ln\left(\frac{\|\mathbf{v}\|+\epsilon}{\|\mathbf{v}^{obs}\|+\epsilon}\right)^2d\Gamma\\
&+\gamma_t\int_{\Gamma_b}\frac{1}{2}\|\nabla\alpha\|^2\,d\Gamma
\end{aligned}
\tag{9}
$$
where ϵ is a minimum velocity used to avoid singularities. The first term in this cost function is the classical misfit, which measures the square of the difference between model and observation. This term is standard and particularly efficient for the treatment of fast-flowing regions. The second term measures the square of the logarithmic difference between model and observations [Morlighem et al., 2010]. Since ice flow speed varies by several orders of magnitude across the ice sheet, this term accounts for slower ice flow more efficiently. We combine the advantages of these two terms by using them simultaneously. The last term is a Tikhonov regularization term, which penalizes uncontrolled oscillations of α and stabilizes the inversion.
 These independent cost functions are weighted by nondimensionalizing constants: γ1, γ2, and γt. We define J as the cost function when ice velocities satisfy the model equations
$$
J(\alpha)=\mathcal{J}(\mathbf{v}(\alpha),\alpha)
$$
where v(α) is the solution of the model equation for a given α field.
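As an illustration of how the three terms of the cost function combine, here is a minimal discrete sketch in pure Python. It assumes the integrals are approximated by sums over sample points with associated areas, and that each term carries a factor 1/2; the function name, the argument layout, and the use of the same sample points for the surface and basal terms are simplifying assumptions made for this example only.

```python
import math


def cost_function(v_mod, v_obs, grad_alpha, areas,
                  gamma1=1.0, gamma2=0.0, gamma_t=0.0, eps=1e-3):
    """Discrete analogue of the absolute, logarithmic, and Tikhonov terms.

    v_mod, v_obs -- lists of (vx, vy) modeled and observed velocities
    grad_alpha   -- list of (d(alpha)/dx, d(alpha)/dy) at the same points
    areas        -- area weight attached to each point
    eps          -- minimum velocity avoiding log(0)
    """
    j_abs = j_log = j_reg = 0.0
    for (vx, vy), (ox, oy), (ax, ay), area in zip(v_mod, v_obs, grad_alpha, areas):
        j_abs += 0.5 * ((vx - ox) ** 2 + (vy - oy) ** 2) * area
        ratio = (math.hypot(vx, vy) + eps) / (math.hypot(ox, oy) + eps)
        j_log += 0.5 * math.log(ratio) ** 2 * area
        j_reg += 0.5 * (ax ** 2 + ay ** 2) * area
    return gamma1 * j_abs + gamma2 * j_log + gamma_t * j_reg


# A 1 m/yr error costs the same in the absolute term everywhere, but the
# logarithmic term weights it far more heavily where the ice is slow.
flat = [(0.0, 0.0)]
slow = cost_function([(2.0, 0.0)], [(1.0, 0.0)], flat, [1.0], gamma1=0.0, gamma2=1.0)
fast = cost_function([(1001.0, 0.0)], [(1000.0, 0.0)], flat, [1.0], gamma1=0.0, gamma2=1.0)
```

The comparison of `slow` and `fast` shows why the logarithmic term helps in the interior, where speeds are small.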
 To calculate the gradient of this cost function with respect to the basal friction coefficient, we use the method of Lagrange multipliers. To simplify the equations, we assume that γ1=1 and γ2=γt=0. The extension to the general case is straightforward. The Lagrangian, ℒ, of this optimization problem is
where λ=(λx,λy) are the Lagrange multipliers. If we integrate by parts the first term of the second and third lines, we retrieve the boundary conditions. This gives a form of the Lagrangian that is similar to the model weak formulation (equation (8)), where the Lagrange multipliers play the role of a kinematically admissible field:
If the velocity v is a solution of the model equations, only the first term is nonzero. We integrate by parts a second time the terms of the second line of the Lagrangian (equation (12)):
where and are defined as
These three forms of the Lagrangian (equations (11), (12), (13)) are useful to derive the adjoint equations and the derivative of the cost function with respect to the basal friction coefficient, α.
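As a hedged sketch, the three forms of the Lagrangian described above can be written as follows for γ1=1 and γ2=γt=0. The sign convention (multipliers applied to the field equations written as left-hand side minus right-hand side), the strain rate vectors $\dot{\boldsymbol{\varepsilon}}_1$, $\dot{\boldsymbol{\varepsilon}}_2$, and their multiplier counterparts $\dot{\boldsymbol{\varepsilon}}_{\lambda1}$, $\dot{\boldsymbol{\varepsilon}}_{\lambda2}$ are assumptions made for this reconstruction; the original displayed equations may use a different notation or sign.

```latex
% Assumed shorthand: field equations written as div(2 mu eps_i) = rho g ds/dx_i, with
%   eps_1(v) = ( 2 dv_x/dx + dv_y/dy, (dv_x/dy + dv_y/dx)/2, (dv_x/dz)/2 )^T
%   eps_2(v) = ( (dv_x/dy + dv_y/dx)/2, dv_x/dx + 2 dv_y/dy, (dv_y/dz)/2 )^T
% and eps_{lambda i} = eps_i evaluated on the multipliers lambda instead of v.

% First form: cost function plus the field equations weighted by the multipliers
\begin{aligned}
\mathcal{L}(\mathbf{v},\boldsymbol{\lambda},\alpha)
 ={}& \int_{\Gamma_s}\tfrac12\,\|\mathbf{v}-\mathbf{v}^{obs}\|^2\,d\Gamma\\
 &+\int_\Omega \lambda_x\Big(\nabla\cdot(2\mu\,\dot{\boldsymbol{\varepsilon}}_1)-\rho g\,\tfrac{\partial s}{\partial x}\Big)\,d\Omega\\
 &+\int_\Omega \lambda_y\Big(\nabla\cdot(2\mu\,\dot{\boldsymbol{\varepsilon}}_2)-\rho g\,\tfrac{\partial s}{\partial y}\Big)\,d\Omega
\end{aligned}

% Second form: one integration by parts, boundary conditions substituted
\begin{aligned}
\mathcal{L}
 ={}& \int_{\Gamma_s}\tfrac12\,\|\mathbf{v}-\mathbf{v}^{obs}\|^2\,d\Gamma\\
 &-\int_\Omega 2\mu\big(\dot{\boldsymbol{\varepsilon}}_1\cdot\nabla\lambda_x
   +\dot{\boldsymbol{\varepsilon}}_2\cdot\nabla\lambda_y\big)\,d\Omega\\
 &-\int_\Omega \rho g\,\nabla s\cdot\boldsymbol{\lambda}\,d\Omega
  -\int_{\Gamma_b}\alpha^2\,\mathbf{v}\cdot\boldsymbol{\lambda}\,d\Gamma
  +\int_{\Gamma_w} f_w\,\boldsymbol{\lambda}\cdot\mathbf{n}\,d\Gamma
\end{aligned}

% Third form: the viscous term is symmetric in (v, lambda) at fixed mu,
% so it can be integrated by parts a second time onto lambda
\begin{aligned}
\mathcal{L}
 ={}& \int_{\Gamma_s}\tfrac12\,\|\mathbf{v}-\mathbf{v}^{obs}\|^2\,d\Gamma\\
 &+\int_\Omega \big(v_x\,\nabla\cdot(2\mu\,\dot{\boldsymbol{\varepsilon}}_{\lambda1})
   +v_y\,\nabla\cdot(2\mu\,\dot{\boldsymbol{\varepsilon}}_{\lambda2})\big)\,d\Omega
  -\int_{\partial\Omega} 2\mu\big(v_x\,\dot{\boldsymbol{\varepsilon}}_{\lambda1}
   +v_y\,\dot{\boldsymbol{\varepsilon}}_{\lambda2}\big)\cdot\mathbf{n}\,d\Gamma\\
 &-\int_\Omega \rho g\,\nabla s\cdot\boldsymbol{\lambda}\,d\Omega
  -\int_{\Gamma_b}\alpha^2\,\mathbf{v}\cdot\boldsymbol{\lambda}\,d\Gamma
  +\int_{\Gamma_w} f_w\,\boldsymbol{\lambda}\cdot\mathbf{n}\,d\Gamma
\end{aligned}
```

Under these assumptions, the second form is the negative of the weak formulation tested with λ, plus the misfit, which is why only the misfit survives when v solves the model equations. The third form is the convenient one for differentiating with respect to v when the viscosity is held fixed.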
3.3 Cost Function Derivative
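 We wish to retrieve the Gâteaux derivative of the cost function, J(α), with respect to the basal friction coefficient field, α. The Gâteaux derivative of J in a direction δα is defined as
$$
\langle D_\alpha J(\alpha),\delta\alpha\rangle=\lim_{h\to0}\frac{J(\alpha+h\,\delta\alpha)-J(\alpha)}{h}
$$
If v is the solution of the model equations, then it is a solution of the weak formulation (equation (8)). Equation (12) becomes
$$
J(\alpha)=\mathcal{L}(\mathbf{v},\boldsymbol{\lambda},\alpha)
$$
If we take the Gâteaux derivative of this equation with respect to the basal friction coefficient and apply the chain rule, we obtain
$$
\langle D_\alpha J,\delta\alpha\rangle=\big\langle D_{\mathbf{v}}\mathcal{L},\langle D_\alpha\mathbf{v},\delta\alpha\rangle\big\rangle+\big\langle D_{\boldsymbol{\lambda}}\mathcal{L},\langle D_\alpha\boldsymbol{\lambda},\delta\alpha\rangle\big\rangle+\langle D_\alpha\mathcal{L},\delta\alpha\rangle \tag{17}
$$
Since v is a solution of the model equations, equation (11) imposes that the derivative of the Lagrangian with respect to the adjoint state vanishes: for every admissible direction w,
$$
\langle D_{\boldsymbol{\lambda}}\mathcal{L},\mathbf{w}\rangle=0
$$
If we choose λ such that the derivative of the Lagrangian with respect to the model state v vanishes, equation (17) becomes
$$
\langle D_\alpha J,\delta\alpha\rangle=\langle D_\alpha\mathcal{L},\delta\alpha\rangle
$$
This expression of the cost function derivative is convenient because it makes it easy to choose a direction that minimizes the cost function. In the case of a steepest-descent algorithm, for example, the fastest way to decrease the cost function is to follow a direction collinear with, and opposite to, the derivative of the cost function by choosing
$$
\alpha_{new}=\alpha_{old}-\beta\,D_\alpha J(\alpha_{old})
$$
where β is a positive scalar coefficient, and αold and αnew are the previous and the updated basal friction coefficient patterns.
 To compute the derivative of the cost function, it is therefore advantageous to choose λ such that, for every admissible direction w,
$$
\langle D_{\mathbf{v}}\mathcal{L}(\mathbf{v},\boldsymbol{\lambda},\alpha),\mathbf{w}\rangle=0 \tag{21}
$$
This equation defines the adjoint equations, and its solution defines the adjoint state.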
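The logic of the control method can be followed end to end on a toy problem. The sketch below is not the ice sheet model: it assumes a zero-dimensional balance α²v = τd between a hypothetical driving stress and the viscous friction law, a misfit J = ½(v − v_obs)², and the Lagrangian ℒ = J + λ(α²v − τd). All names and numerical values are illustrative assumptions.

```python
import math

TAU_D = 1.0e5   # hypothetical driving stress (Pa)
V_OBS = 100.0   # hypothetical observed speed (m/yr)


def forward(alpha):
    """Toy forward model: the velocity satisfying alpha**2 * v = TAU_D."""
    return TAU_D / alpha ** 2


def cost(alpha):
    """Misfit J = 0.5 * (v(alpha) - V_OBS)**2."""
    return 0.5 * (forward(alpha) - V_OBS) ** 2


def gradient(alpha):
    """dJ/dalpha obtained from the Lagrangian L = J + lam * (alpha**2 * v - TAU_D)."""
    v = forward(alpha)
    lam = -(v - V_OBS) / alpha ** 2   # adjoint equation: dL/dv = 0
    return 2.0 * alpha * v * lam      # dL/dalpha at fixed v and lam


def steepest_descent(alpha0, beta=0.01, iterations=1000):
    """Iterate alpha_new = alpha_old - beta * dJ/dalpha."""
    alpha = alpha0
    for _ in range(iterations):
        alpha -= beta * gradient(alpha)
    return alpha


# Start from a constant guess of 100 (Pa yr/m)^(1/2).
alpha_opt = steepest_descent(100.0)
```

One forward solve and one adjoint solve give the full gradient, whatever the number of unknowns in α; this is what makes the approach tractable at the continental scale. The fixed step β is chosen here by hand for this toy problem; a practical implementation would use a line search.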
3.4 Incomplete and Exact Adjoint Equations
 To compute the adjoint state, it is necessary to solve equation (21). We now write this expression explicitly with and without the incomplete adjoint approximation. If we assume that the viscosity is linear (the viscosity does not depend on velocity), we can derive the incomplete adjoint equations using equation (13):
We retrieve the field equations of the incomplete adjoint
and its boundary conditions
These equations are similar to the forward problem, which makes their numerical implementation straightforward.
 We now derive the exact adjoint equations by taking into account the dependence of ice viscosity, μ, on ice velocity. Using equation (12), the derivative of the Lagrangian with respect to the velocity becomes
 The Gâteaux derivative of the viscosity, μ, with respect to the ice velocity is decomposed as follows:
The two terms of this derivative are
The equations of the exact adjoint are therefore ,
where μ′ is defined as
The last term of equation (29) was neglected in the incomplete adjoint (equation (22)). We cannot derive the local equations and we compute the adjoint state by solving this weak formulation directly.
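A hedged sketch of the term that distinguishes the exact adjoint from the incomplete one is given below. It assumes Glen's flow law written as a function of the squared effective strain rate, the higher-order strain rate vectors $\dot{\boldsymbol{\varepsilon}}_1$, $\dot{\boldsymbol{\varepsilon}}_2$ such that the field equations read $\nabla\cdot(2\mu\,\dot{\boldsymbol{\varepsilon}}_i)=\rho g\,\partial s/\partial x_i$, and a Lagrangian whose viscous term is $-\int_\Omega 2\mu(\dot{\boldsymbol{\varepsilon}}_1\cdot\nabla\lambda_x+\dot{\boldsymbol{\varepsilon}}_2\cdot\nabla\lambda_y)\,d\Omega$. The definition of μ′ used here (derivative with respect to $\dot{\varepsilon}_e^2$) is an assumption; the original may define it differently.

```latex
% Glen's flow law as a function of the squared effective strain rate
\mu=\frac{B}{2}\left(\dot{\varepsilon}_e^{2}\right)^{\frac{1-n}{2n}},
\qquad
\dot{\varepsilon}_e^{2}=
 \left(\frac{\partial v_x}{\partial x}\right)^{2}
 +\left(\frac{\partial v_y}{\partial y}\right)^{2}
 +\frac{\partial v_x}{\partial x}\frac{\partial v_y}{\partial y}
 +\frac14\left(\frac{\partial v_x}{\partial y}+\frac{\partial v_y}{\partial x}\right)^{2}
 +\frac14\left(\frac{\partial v_x}{\partial z}\right)^{2}
 +\frac14\left(\frac{\partial v_y}{\partial z}\right)^{2}

% Chain rule for the Gateaux derivative of the viscosity in a direction w
\langle D_{\mathbf{v}}\mu,\mathbf{w}\rangle
 =\frac{\partial\mu}{\partial\dot{\varepsilon}_e^{2}}\,
  \langle D_{\mathbf{v}}\dot{\varepsilon}_e^{2},\mathbf{w}\rangle

% The two factors
\mu'=\frac{\partial\mu}{\partial\dot{\varepsilon}_e^{2}}
    =\frac{1-n}{2n}\,\frac{\mu}{\dot{\varepsilon}_e^{2}},
\qquad
\langle D_{\mathbf{v}}\dot{\varepsilon}_e^{2},\mathbf{w}\rangle
 =\dot{\boldsymbol{\varepsilon}}_1(\mathbf{v})\cdot\nabla w_x
 +\dot{\boldsymbol{\varepsilon}}_2(\mathbf{v})\cdot\nabla w_y

% Exact adjoint in weak form: for every admissible w
\begin{aligned}
0={}&\int_{\Gamma_s}(\mathbf{v}-\mathbf{v}^{obs})\cdot\mathbf{w}\,d\Gamma
 -\int_\Omega 2\mu\big(\dot{\boldsymbol{\varepsilon}}_1(\mathbf{w})\cdot\nabla\lambda_x
   +\dot{\boldsymbol{\varepsilon}}_2(\mathbf{w})\cdot\nabla\lambda_y\big)\,d\Omega
 -\int_{\Gamma_b}\alpha^{2}\,\mathbf{w}\cdot\boldsymbol{\lambda}\,d\Gamma\\
 &-\int_\Omega 2\mu'\,
   \big(\dot{\boldsymbol{\varepsilon}}_1(\mathbf{v})\cdot\nabla w_x
       +\dot{\boldsymbol{\varepsilon}}_2(\mathbf{v})\cdot\nabla w_y\big)
   \big(\dot{\boldsymbol{\varepsilon}}_1(\mathbf{v})\cdot\nabla\lambda_x
       +\dot{\boldsymbol{\varepsilon}}_2(\mathbf{v})\cdot\nabla\lambda_y\big)\,d\Omega
\end{aligned}
```

Under these assumptions, the last integral is the contribution dropped by the incomplete adjoint. For n=3, μ′ = −μ/(3 ε̇e²), so this term is of the same order as the retained viscous term wherever the strain rates of v and λ are aligned; it vanishes identically for a Newtonian fluid (n=1).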
4 Application to the Antarctic Ice Sheet
 We employ the SeaRISE data set to initialize our model of the Antarctic Ice Sheet. The surface elevation is from Bamber et al. ; bed topography merges BEDMAP1 [Lythe and Vaughan, 2001] and the AGASEA UT/BAS ice thickness data from 2004 [Vaughan et al., 2006; Holt et al., 2006]; and ice shelf thickness is from Griggs and Bamber . For the thermal regime, we employ the surface temperatures from Comiso  and the geothermal heat flux from Maule et al. . Surface velocities are from Rignot et al. .
 To constrain the ice rigidity, we calculate the thermal regime of the ice sheet assuming thermal steady state and translate the corresponding temperature field into an ice rigidity field using Cuffey and Paterson . On ice shelves, we use the surface velocities to infer ice rigidity—basal friction is zero—using a model inversion [Rommelaere and MacAyeal, 1997; Morlighem et al., 2010]. All the numerical modeling is carried out using the open source Ice Sheet System Model (ISSM) software [Larour et al., 2012].
 In order to accurately capture both fast narrow ice streams and slower regions, while maintaining a reasonable computational cost, we rely on an anisotropic mesh refinement [Morlighem et al., 2010] to minimize the interpolation error of surface velocity and ice thickness and the total number of mesh elements. The resulting mesh comprises 64,000 triangles, with a resolution of 3 km along the coast. This two-dimensional mesh is vertically extruded into 14 horizontal layers forming a three-dimensional mesh of about 825,000 elements. We initialize the model with the data set described above and solve the inverse problem on National Aeronautics and Space Administration's (NASA) Pleiades supercomputer.
 We use γ1=1 and then choose γ2 such that the first two terms of the cost function have about the same order of magnitude, which gives here γ2=100. The Tikhonov regularization parameter, γt, is calibrated with an L-curve analysis [Hansen, 2000; Jay-Allemand et al., 2011]. The Tikhonov parameter must be large enough to prevent the formation of wiggles in the solution but small enough so that the model fits the observations. The L-curve is a tradeoff curve between the two quantities that should both be controlled: the misfit between model and observation (first two terms of equation (9)) and the regularizing term (last term of equation (9)). The L-curve analysis consists of calculating the misfit and the regularizing term for different values of γt. The results are displayed on a log-log plot (Figure 1). We choose γt=1×10⁻⁷. We tested the algorithm with different initial guesses for α (α0=10, 50, and 100 (Pa yr/m)^(1/2)) and found that the solution was not sensitive to the initial guess. We present here the results for α0=100 (Pa yr/m)^(1/2).
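One common way to automate the reading of an L-curve is to pick the point of maximum curvature in log-log space. The sketch below illustrates that idea with a discrete (Menger) curvature; it is an assumed selection rule for illustration, not necessarily how the value of γt was chosen here, and the numbers are synthetic rather than those behind Figure 1.

```python
import math


def l_curve_corner(gammas, misfits, regularizations):
    """Return the gamma at the point of maximum discrete curvature in log-log space."""
    pts = [(math.log10(m), math.log10(r)) for m, r in zip(misfits, regularizations)]
    best_gamma, best_curvature = None, -1.0
    for i in range(1, len(pts) - 1):
        a, b, c = pts[i - 1], pts[i], pts[i + 1]
        # Menger curvature: 4 * triangle area / product of the three side lengths.
        twice_area = abs((b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0]))
        sides = math.dist(a, b) * math.dist(b, c) * math.dist(c, a)
        curvature = 2.0 * twice_area / sides if sides > 0.0 else 0.0
        if curvature > best_curvature:
            best_gamma, best_curvature = gammas[i], curvature
    return best_gamma


# Synthetic L-shaped data: regularization drops quickly, then the misfit grows.
gammas = [0.001, 0.01, 0.1, 1.0, 10.0, 100.0, 1000.0]
misfits = [1.0, 1.05, 1.1, 1.2, 10.0, 100.0, 1000.0]
regularizations = [1.0e6, 1.0e4, 1.0e2, 1.0, 0.5, 0.3, 0.2]
corner = l_curve_corner(gammas, misfits, regularizations)
```

In this synthetic case the corner falls at the fourth sample, where further regularization starts to degrade the fit without making the solution much smoother.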
 We first compare the convergence of the inversion using the incomplete adjoint approximation and the exact adjoint (Figure 2) of our model. These two approaches have been compared for a hybrid model combining the Shallow-Ice Approximation and the Shallow-Shelf Approximation by Goldberg and Sergienko . We find that there is not much difference between the performance of the exact adjoint and the incomplete adjoint, which suggests that the incomplete adjoint is a satisfactory approximation of the exact adjoint for basal friction inversion (Figure 2). The difference between the inferred patterns of basal friction coefficient is less than 4%.
 We also compare the convergence of a simple steepest-descent algorithm and a limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) algorithm [Nocedal, 1980], a quasi-Newton method, for both the exact and incomplete adjoints. To do so, we interface ISSM with the Toolkit for Advanced Optimization [Munson et al., 2012]. As shown in Figure 2, the convergence is faster with the quasi-Newton method, which builds an approximation of the curvature of the cost function from successive gradients.
 All these algorithms lead to similar patterns of basal friction. We analyze here the results of the exact adjoint with the quasi-Newton algorithm, which achieves the best convergence. The misfit between modeled surface velocity (Figure 3a) and observations (Figure 3b) is less than 10 m/yr on average and less than 70 m/yr on areas of fast ice motion, i.e., where ice speed is larger than 500 m/yr. Ice flow is reproduced with great fidelity on both grounded and floating ice. Gaps in observation do not have a detectable impact on the continuity of the model solution. Results obtained for a different initial guess of α are similar, i.e., the inversion method is robust.
 The inferred basal friction and driving stress are shown side by side in Figure 4. Basal friction and driving stress are almost indistinguishable over the majority of the ice sheet: they are within 15% of each other over 80% of the domain. Both quantities are small near ice divides, where surface slope is low, and large near the coast, where surface slope is higher. Basal friction is high along mountainous regions, e.g., the Transantarctic Mountains and the Antarctic Peninsula plateau, which are also regions where uncertainties in ice thickness are high.
 We use a constant value of α in equation (5) as the initial guess. The initial pattern of surface velocity is therefore different from the observations. To speed up the inversion, the optimization may instead be initialized with a basal friction equal to the driving stress, assuming that the basal velocity is equal to the observed surface velocity. In the case of a viscous friction law (equation (5)), this yields
$$
\alpha=\sqrt{\frac{\rho g H\,\|\nabla s\|}{\|\mathbf{v}^{obs}\|+\varepsilon_v}}
$$
where H is the ice thickness and εv=0.1 m/yr. Because this initial guess is close to the expected solution, the convergence is faster and we reduce the risk of converging to a local minimum.
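The alternative initialization described above can be sketched in a few lines. The code assumes that the driving stress is τd = ρgH‖∇s‖ with H the ice thickness, that the viscous law gives α² = τd/‖v‖, and that εv is added to the observed speed to avoid a division by zero; the function name and the default density are illustrative assumptions.

```python
import math


def initial_friction_coefficient(thickness, surface_slope, obs_speed,
                                 rho_ice=917.0, g=9.81, eps_v=0.1):
    """Initial alpha in (Pa yr/m)^(1/2), balancing driving stress and basal friction.

    thickness     -- ice thickness H (m)
    surface_slope -- norm of the surface gradient (dimensionless)
    obs_speed     -- observed surface speed (m/yr)
    eps_v         -- small velocity (m/yr) keeping alpha finite on stagnant ice
    """
    driving_stress = rho_ice * g * thickness * surface_slope   # Pa
    return math.sqrt(driving_stress / (obs_speed + eps_v))


# Same geometry, different observed speeds: fast ice gets a low coefficient.
alpha_fast = initial_friction_coefficient(1000.0, 0.001, 500.0)
alpha_slow = initial_friction_coefficient(1000.0, 0.001, 1.0)
alpha_stagnant = initial_friction_coefficient(1000.0, 0.001, 0.0)
```

For identical driving stress, this guess already reproduces the expected contrast between low coefficients on fast ice and high coefficients on slow ice.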
 The inferred basal friction coefficient (Figure 5) is smooth but heterogeneous. The smoothness is due to the Tikhonov regularization, which stabilizes the inversion by preventing the basal friction coefficient from varying significantly over short distances. The addition of this regularization does not significantly increase the misfit between model and observations since it is calibrated with an L-curve analysis, but it ensures that no artificial small-scale feature is introduced in the results. The heterogeneous appearance of the basal friction coefficient reflects the presence of areas of fast ice motion surrounded by areas of slower ice motion with only small differences in driving stress between them. Areas of low basal friction coefficient represent areas of significant sliding (Figure 6). The basal friction coefficient, α, is high in the interior, where the driving stress is low but ice speed is also low. Low values of α are found in areas of fast motion and over a rather vast portion of the domain, suggesting that basal sliding is widespread beneath the ice sheet (Figure 6).
 The agreement between driving stress and basal friction is expected, because lateral drag is nonexistent in interior regions and longitudinal drag is only significant in the proximity of ice sheet grounding lines as shown in Figure 4c. The great similarity between these two quantities explains why basal friction can be correctly inferred by ice flow models that are, a priori, not necessarily valid in some regions. For instance, in Morlighem et al. , the basal friction inferred by a Shallow-Shelf model is similar to the full-Stokes basal friction even though the Shelfy-Stream assumptions are valid only on the ice stream, where basal sliding is significant and vertical shear is negligible. Indeed, this Shallow-Shelf model will also balance driving stress and basal friction, leading to a similar pattern of basal friction to the one shown here.
 Our results for the basal friction coefficient are consistent with the regional details presented by Joughin et al.  over the Amundsen Sea sector using a two-dimensional shelfy-stream model and a similar sliding law or the results of Morlighem et al.  for Pine Island Glacier using a variety of models including higher-order and full-Stokes with the same sliding law. These regional models are based on a finer resolution (∼1 km), and the good agreement with our inferred basal friction suggests that our results are not sensitive to grid spacing at the current resolution. Our pattern of basal friction coefficient is more difficult to compare quantitatively to that of Pollard and DeConto , who use a different sliding law. The regions of low friction in our model and high sliding coefficient in their model are in good agreement for Siple Coast, Thwaites, and Pine Island for example. The patterns tend to differ in East Antarctica, in the region of Totten, where there are gaps in velocity coverage.
 Overall, we detect low values of α, or a high level of basal sliding, over a large fraction of the continent, i.e., beneath ice streams but also far inland, along the flanks of topographic divides (Figures 5 and 6). For instance, fast sliding extends far into the drainage basins of Recovery, Slessor, Pine Island, Thwaites, and Siple Coast, but also Byrd and Totten glaciers. The results therefore suggest that fast sliding is not restricted to fast-flowing features but is widespread on the continent, as basal velocities exceed 30% of the surface velocities over more than 80% of the ice sheet (Figure 6).
 The detection of fast sliding over such a large fraction of Antarctica has strong implications for ice sheet modeling. In particular, it means that only a few parts of the ice sheet can be frozen to the bed. This is consistent with recent modeling of the geothermal regime of the ice sheet [Pattyn, 2010], which finds ice frozen at the bed only in a few regions: mountainous regions, e.g., between South Pole and Amundsen Glacier in the Transantarctic Mountains, and along the coast of Queen Maud Land. According to our results, basal sliding is a major participant in controlling ice flow in Antarctica; hence, the inversion for the basal friction coefficient is essential to correctly initialize ice flow models.
 In this inversion, we rely on a higher-order ice flow model, assuming thermal steady state. Although this assumption is a viable approximation, the history of atmospheric conditions is not captured by our model. Warmer ice would lead to more vertical shear, smaller basal velocities, and a higher basal friction coefficient to match the surface observations. Colder ice would lead to a lower basal friction coefficient and larger basal velocities, as vertical shear would be reduced. Nevertheless, the balance between driving stress and basal friction should not be altered, as we do not expect longitudinal and lateral drag to change significantly in areas where the driving stress is already fully balanced by the basal friction. The same conclusion is expected for different sliding laws: the inferred basal friction should be identical, but the pattern of basal friction coefficient will change depending on the type of sliding law.
 Errors in bed topography might also alter our results; however, Pollard and DeConto  noted that only widespread errors of more than 400 m changed the large-scale pattern of basal friction for their algorithm. Our model is also limited by the mesh resolution of 3 km along fast ice streams. A resolution similar to one ice thickness, i.e., < 1 km, would be preferable for a detailed analysis at the basin level but would be more computationally intensive.
 Finally, the method presented here is applied to a purely viscous friction law but can be easily extended to a variety of friction laws, provided that they remain differentiable with respect to the velocity. In addition, we have not considered the case of time-dependent velocities in this study, to determine how the basal friction coefficient changes with time. This latter aspect is of course critical for time-dependent studies.
 We present a method to infer basal friction in Antarctica with both an incomplete and an exact adjoint. The model yields a pattern of basal friction coefficient, at a relatively high spatial resolution, which minimizes the misfit between observed and modeled ice velocity and provides a description of the spatial distribution and magnitude of basal friction. We show that the incomplete adjoint approximation, which is easy to implement, does not affect the convergence of the inversion significantly and hence is an excellent approximation for the exact adjoint solution for basal friction inversion. Overall, the inferred pattern of basal friction is similar to the driving stress, but the pattern of basal friction coefficient is highly heterogeneous. The results provide an important observational constraint for initializing ice sheet models and suggest that rapid sliding is significant over a large fraction of the Antarctic continent.
 This work was performed at the Department of Earth System Science, University of California, Irvine and at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration, Cryospheric Sciences Program, grant NNX12AB86G. Hélène Seroussi was supported by an appointment to the NASA Postdoctoral Program at the Jet Propulsion Laboratory, administered by Oak Ridge Associated Universities through a contract with NASA. We thank the two anonymous reviewers, the Associate Editor J. Bassis, and the Editor B. Hubbard for their helpful and insightful comments.