It is well known that slip rate estimates from geodetic data are nonunique because they depend on model assumptions and parameters that are often not known a priori. Estimates of fault slip rate on the Mojave segment of the San Andreas fault system derived from elastic block models and GPS data are significantly lower than estimates from geologic data. To determine the extent to which the slip rate discrepancy might be due to the oversimplified models of the rheology of the lithosphere, we develop a two-dimensional linear Maxwell viscoelastic earthquake cycle model and simultaneously estimate fault slip rates and lithosphere viscosity structure in the Mojave region. The model consists of episodic earthquakes in an elastic crust overlying layers with different viscosities that represent the lower crust, uppermost mantle, and upper mantle. We use GPS measurements of postseismic relaxation following the 1992 Landers earthquake, triangulation measurements spanning 1932–1977, GPS measurements of the contemporary velocity field, and paleoseismic data along the San Andreas fault. We infer lower crustal (15–30 km depth) viscosity of ∼10¹⁹–10²⁰ Pa s, uppermost mantle (30–60 km) viscosity of ∼10²⁰–10²² Pa s, and underlying upper mantle viscosity of ∼10¹⁸–10¹⁹ Pa s, consistent with inferences from laboratory experiments of relatively high-viscosity lithospheric mantle and lower-viscosity lower crust and underlying asthenospheric mantle. We infer a 20–30 mm/yr slip rate on the San Andreas fault, in agreement with the lower end of geologic estimates. Inversions of geodetic data with models that do not incorporate layered viscosity structure may significantly misestimate slip rates.
 Estimates of slip rate on the Mojave segment of the San Andreas fault system inferred from elastic dislocation models and GPS data are inconsistent with estimates using geologic data. According to geologic estimates, the San Andreas fault slips 25–35 mm/yr along this segment [Sieh and Jahns, 1984; Weldon et al., 2004] while elastic block models predict lower slip rates of about 15 mm/yr [Becker et al., 2004; Meade and Hager, 2005]. Elastic block models incorporate steady, long-term rigid block motion and interseismic elastic strain accumulation due to locking of faults in the upper seismogenic crust modeled with dislocations in an elastic half-space [e.g., Savage and Burford, 1973; McCaffrey, 2002]. It is well known that slip rate estimates from geodetic data depend strongly on model assumptions about rheology of the crust and mantle, and therefore any discrepancies between estimates using geologic and geodetic data might be attributed to model assumptions. For example, it is well documented that estimates of fault slip rates from surface deformation using models that incorporate viscous flow below the elastic crust are dependent on the assumed viscosity which is not often known a priori. Savage and Prescott  demonstrated this with an earthquake cycle model consisting of a fault with periodic, sudden slip events in an elastic crust (schizosphere) overlying a Maxwell viscoelastic lower crust and mantle (plastosphere). Here we are adopting the nomenclature of schizosphere, which refers to the portion of the lithosphere that deforms elastically during interseismic periods, and plastosphere, which refers to the portion of the lithosphere that flows [e.g., Scholtz, 1990]. In this model, the surface velocity field depends on the fault slip rate, the thickness of the schizosphere, the time since the last earthquake, the average repeat time of earthquakes, and the viscosity of the plastosphere. 
A consequence of the time dependence of the surface velocity field is that models that ignore this effect may lead to either significant underestimates of fault slip rate using data late in an earthquake cycle or significant overestimates of fault slip rate using data early in an earthquake cycle [e.g., Dixon et al., 2002]. Dixon et al.  showed that estimates of slip rate in the Owens Valley fault zone in eastern California can vary by as much as a factor of three, depending on the choice of plastosphere viscosity. They showed that elastic block models, which are appropriate for the condition that relaxation time is greater than the average recurrence interval of earthquakes, predict higher slip rates from geodetic data than estimated from geologic data, while viscoelastic earthquake cycle models with plastosphere relaxation times less than the average recurrence interval predict slip rates similar to those estimated from geological data.
 A challenge with the application of viscoelastic earthquake cycle models to many tectonic regions is that the viscosity structure of the plastosphere is often not known or disparate data sets have been used individually to infer different viscosities. While the viscosity distribution in the western United States is relatively well studied, there are a number of apparent discrepancies in inferred viscosity distributions. Table 1 summarizes estimates of viscosity structure of the lower crust and upper mantle in the western United States using a variety of data sets and methods. Studies of the average viscosity structure over decadal timescales using geodetic data infer average plastosphere viscosities that are inconsistent with estimates of average viscosity structure over longer timescales. For example, Segall and Johnson and Segall [2004b] infer average plastosphere viscosities of 10¹⁹–10²⁰ Pa s using earthquake cycle models and geodetic data along the San Andreas fault. However, studies of transient isostatic adjustment associated with lake loads in the western United States suggest plastosphere viscosities of less than 10¹⁹ Pa s [Bills et al., 1994; Kaufmann and Amelung, 2000] (see review of results by Dixon et al.). Plastosphere viscosities of 10¹⁸–10¹⁹ Pa s were inferred for southwest Montana from models of viscoelastic relaxation following the 1959 Hebgen Lake earthquake constrained by historical geodetic measurements of surface deformation [Nishimura and Thatcher, 2003]. This discrepancy may be due to model assumptions about viscosity structure. Segall and Johnson and Segall [2004b] assumed a uniform linear viscosity for the plastosphere while the others assumed a layered linear viscoelastic structure. It is possible that the apparent discrepancy is a reflection of lateral variations in plastosphere viscosity structure across the western United States; however, the data are too sparse to draw firm conclusions about this.
 Laboratory creep experiments of lower crustal and upper mantle materials [e.g., Kohlstedt et al., 1995] have led some to suspect that the viscosity of the lower crust is lower than the viscosity of the underlying mantle. In the western United States there is evidence to support this hypothesis and other evidence to refute it. Deng et al. inferred flow in the lower crust with a low viscosity of 10¹⁸ Pa s in the Mojave Desert following the 1992 Landers earthquake, consistent with results from a geodynamic model of deformation in the eastern Mojave region [Kaufman and Royden, 1994]. Bokelmann and Beroza also infer a relatively low viscosity (<10¹⁹ Pa s) lower crust below the San Andreas fault from focal mechanism orientations. However, Freed and Bürgmann and Pollitz et al. inferred higher viscosities in the lower crust than in the upper mantle from models of postseismic deformation following the 1992 Landers and 1999 Hector Mine earthquakes in the Mojave Desert. Quite different lower crustal viscosity estimates have been inferred in central Nevada using different measurements of deformation associated with postseismic relaxation following recent large earthquakes in the Central Nevada Seismic Belt. Using similar models of an elastic plate over two viscous layers representing the lower crust and uppermost mantle, Hetland and Hager used GPS data to infer a lower crust viscosity in the range 5–50 × 10¹⁸ Pa s while Gourmelen and Amelung used InSAR data to infer lower crustal viscosity of greater than 10²⁰ Pa s.
 A conclusion from the review of these studies is that the different models and deformation measurements lead to very different viscosity estimates. Lower crust viscosity estimates vary over two orders of magnitude and estimates of upper mantle viscosities vary over three orders of magnitude. Postseismic GPS time series record rapid relaxation processes that occur over the months and years following an earthquake. The current GPS velocity field in California records decadal timescale deformation processes associated with the earthquake cycle. The paleo-lake shoreline data record relaxation processes that take place over thousands of years. All of these studies investigate relaxation processes in the plastosphere following a tectonic event, but it is difficult to compare or integrate their results. Various simplifications of viscosity layering were assumed for these studies, and the various measurements over different time periods may reflect time-dependent relaxation processes that might occur at different depths within the plastosphere.
 In an attempt to resolve these discrepancies, we develop a model that enables us to integrate various data sources covering a broad range of time periods into a joint estimate of fault slip and viscosity structure. The Mojave region is ideal for this study because of the abundance of geodetic and paleoseismic data. We have GPS time series data of postseismic relaxation following the 1992 Landers earthquake in the Mojave Desert (http://quake.usgs.gov/research/deformation/gps/auto/LandersPro/), triangulation data spanning 1932–1977 [National Geodetic Survey (NGS), 2004], and GPS measurements of the contemporary velocity field (http://epicenter.usc.edu/cmm3/; Figure 1). In addition, a detailed history of past earthquakes is beginning to emerge (Figure 2) with continued analysis of paleoseismic excavations along the San Andreas fault [Weldon et al., 2004; G. E. Hilley and J. J. Young, Determining event timing, recurrence, and correlation from paleoseismic excavation data along the central and southern San Andreas fault, California, submitted to Bulletin of the Seismology Society of America, 2006, hereinafter referred to as Hilley and Young, submitted manuscript, 2006a].
 Our model is an extended version of the Savage and Prescott  earthquake cycle model. The model consists of three linear Maxwell viscoelastic layers to represent the lower crust, uppermost mantle, and upper mantle. We identify the viscosity structure and the slip history on the San Andreas fault that is consistent with geodetic measurements of surface deformation, paleoseismic data on timing of past earthquakes, geologic estimates of fault slip rates, and inferences of relaxation times associated with isostatic adjustments.
2. Lithosphere Rheology
 Laboratory creep experiments show that lower crustal and upper mantle materials display nonlinear, thermally activated, power law creep behavior with effective viscosity
$$\eta_{\mathrm{eff}} = \frac{\sigma^{1-n}}{2C} \qquad (1)$$
where σ is differential stress, C is a material constant and is a function of temperature and activation energy, and n is typically in the range 2–4. However, without specific knowledge of stress, temperature, lithology, grain size, and water abundance at depth, experiments place essentially no constraints on effective viscosities in the lithosphere. Figure 3 shows theoretical effective viscosities for a range of shear stress values and crustal and mantle materials with varying laboratory values for C summarized by Freed and Bürgmann . Temperatures are assumed to increase linearly with depth down to the top of the asthenospheric upper mantle. The temperature is assumed constant with depth in the upper mantle as evidenced by seismic inversions for temperature in the western United States [Goes and van der Lee, 2002]. The hypothetical effective viscosities range over 10 orders of magnitude.
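To make the stress sensitivity concrete, the following minimal sketch evaluates a power law effective viscosity for a few stress levels. It assumes a flow law of the form strain rate = C σⁿ and the convention η_eff = σ/(2 × strain rate); the prefactor convention, the function name, and the value of C are illustrative assumptions, not values taken from the laboratory studies cited here.

```python
import numpy as np

def effective_viscosity(sigma_pa, c, n):
    """Effective viscosity (Pa s) for an assumed flow law strain_rate = c * sigma**n.

    Uses eta_eff = sigma / (2 * strain_rate) = sigma**(1 - n) / (2 * c).
    """
    sigma_pa = np.asarray(sigma_pa, dtype=float)
    return sigma_pa ** (1.0 - n) / (2.0 * c)

# Hypothetical material constant, chosen only to give readable numbers.
C_HYPOTHETICAL = 1e-32   # units of Pa**(-n) / s for n = 3 (assumed)
N_EXPONENT = 3.0         # within the 2-4 range quoted in the text

stresses = np.array([0.1e6, 1.0e6, 10.0e6])  # differential stress, Pa
etas = effective_viscosity(stresses, C_HYPOTHETICAL, N_EXPONENT)
for s, eta in zip(stresses, etas):
    print(f"sigma = {s / 1e6:5.1f} MPa -> eta_eff = {eta:.1e} Pa s")
```

With n = 3, each factor of 10 in stress changes the effective viscosity by a factor of 100, which is one reason laboratory flow laws alone leave the in situ viscosity so weakly constrained.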
3. Earthquake Record in the Mojave Region
 Geologic and paleoseismic studies indicate that the San Andreas fault slips 25–35 mm/yr along the Mojave segment [e.g., Sieh and Jahns, 1984; Weldon et al., 2004]. The remaining 15–25 mm/yr needed to keep up with the 50 mm/yr of total shift across the plate boundary is presumed to occur on neighboring faults, mostly within the Eastern California Shear Zone [e.g., Meade and Hager, 2005].
 The paleoseismic record provides some detail on the earthquake history of the San Andreas fault. Estimates of earthquake timing are available from nine trenching sites along the southern and central San Andreas fault. The most complete record of past earthquakes is recorded at the Wrightwood site on the Mojave segment where 14 events in the last 1600 years have been identified [e.g., Fumal et al., 2002; Weldon et al., 2004]. Hilley and Young (submitted manuscript, 2006a) reanalyzed the paleoseismic data and calculated event probabilities using a Bayesian formulation and a Monte Carlo sampling algorithm. Correlations between sites reveal a complex rupture history in which recurrence times and rupture lengths vary with time. Weldon et al.  and Hilley and Young (submitted manuscript, 2006a) show that various scenarios for the segmentation of rupture on the San Andreas fault are possible given the data constraints. Because we are using two-dimensional (2-D) models, we cannot address the various rupture segmentation scenarios that might be inferred from the paleoseismic data. We will assume events recorded at the Wrightwood site (Figure 2), which displays the most complete record of paleo-earthquakes, can be modeled as earthquakes that rupture the entire Mojave segment.
 We know the exact timing of the two most recent large earthquakes on the Mojave segment of the San Andreas fault. The 1857 Fort Tejon earthquake produced slip of about 7–9 m along the Carrizo section [Sieh, 1978; Liu et al., 2004] and 3–6 m along the Mojave segment [Sieh, 1978; Salyards et al., 1992]. Salyards et al.  suggest that the two earthquakes before the 1857 event produced 5.5 and 6.25 meters of offset at Pallett Creek (Figure 1).
 Geodetic studies indicate that about 25% of the plate motion across the diffuse Pacific/North American plate boundary occurs within the Eastern California Shear Zone [e.g., Meade and Hager, 2005]. The nature and history of earthquake behavior in this zone are topics of current research and are not very well understood. Paleoseismic data in this region are too sparse at this time to make many direct comparisons with models of geodetic data.
4. Mojave Region Geodetic Data
 As our analysis will demonstrate, measurements that sample different periods of an earthquake cycle are required to resolve lithosphere viscosity structure. To obtain broad temporal data coverage, we use several different geodetic data sets: (1) GPS measurements of the contemporary velocity field, (2) triangulation measurements spanning 1932 to 1977, and (3) GPS measurements of postseismic deformation following the 1992 Landers earthquake. The locations of the measurements are displayed in Figure 1.
 The contemporary GPS velocity field in the Mojave region (Figure 1) is taken from the SCEC Crustal Motion Map, version 3.0 (http://epicenter.usc.edu/cmm3/). The crustal motion map is constructed on the basis of GPS data since 1986, United States Geological Survey trilateration data spanning 1970–1992, and VLBI data collected by the NASA Crustal Dynamics Program (1980–1994). The GPS velocity components parallel to the San Andreas fault at sites within the dashed lines in Figure 1 are plotted in Figure 4a as a projection on a profile perpendicular to the San Andreas fault.
 Triangulation data were obtained from the NGS. We selected triangulation measurements within 10 km of the San Andreas fault and calculated the average shear strain rate across the 20-km-wide zone spanning the fault. We calculated the average shear strain rate during each time period using the method of Frank, although, as done by Thatcher, we generalize Frank's method, which calculates strain rates from three angles at the centroid of triangles, to strain rate estimation using any number of measurements. The shear strain rates are plotted in Figure 4b. Vertical bars show 2σ uncertainties and horizontal bars denote the time interval over which the strain rate is averaged. The average shear strain rate during each time period is obtained by differencing angle measurements at the beginning and end of the time period. The number of angle measurements used for each calculation ranges from 24 to 223. We omitted calculations that span the 1952 Kern County earthquake just north of the Mojave region and the 1971 San Fernando earthquake to minimize influences of deformation distinct from the San Andreas fault.
 GPS time series of postseismic displacements following the 1992 Landers earthquake are also plotted in Figure 4c. We show the four sites located farthest from the fault, although we used time series data from nine sites in our inversions. We disregarded measurements during the first two years after the earthquake to avoid the most rapid postseismic velocities that may be attributed to nonlinear flow in the upper mantle [e.g., Freed and Bürgmann, 2004] or rapid afterslip [e.g., Shen et al., 1994; Savage and Svarc, 1997], as these mechanisms are neglected in our model.
 Probabilities on timing of paleo-earthquakes on the Mojave segment of the San Andreas fault are shown in Figure 2 (estimates taken from Hilley and Young (submitted manuscript, 2006a)). Hilley and Young (submitted manuscript, 2006a) used published data to reanalyze the timing of ancient earthquakes along the central and southern San Andreas fault. In their study, they augmented radiocarbon age estimates with geologic information, such as relative ordering of strata and the accumulation of peat within the stratigraphy, to refine the timing of earthquake events (G. E. Hilley and J. J. Young, Evaluation of layer ages, earthquakes, and their recurrence. I: Development and evaluation of new Bayesian Markov-chain Monte Carlo simulation methods applied to excavations with continuous peat growth, submitted to Bulletin of the Seismology Society of America, 2006, hereinafter referred to as Hilley and Young, submitted manuscript, 2006b). This study, built on the work of Biasi and Weldon  and Biasi et al. , allows the timing of ancient earthquakes observed in the paleoseismic record to be cast in terms of probability densities that may be directly used in the analyses such as the current study. Importantly, Hilley and Young (submitted manuscript, 2006a) highlight that significant differences may arise from application of different types of geologic information when estimating the timing of ancient earthquakes. For example, while use of observed peat accumulation may reduce uncertainties in the timing of ancient earthquakes, systematic differences in earthquake ages exist when using peat accumulation constraints versus stratigraphic ordering constraints [Hilley and Young, submitted manuscript, 2006a]. 
In addition, details of the solution method used to estimate earthquake ages using peat accumulation constraints may also produce significant differences in earthquake timing estimates [Hilley and Young, submitted manuscript, 2006b], and so it is unclear whether the use of peat accumulation constraints decreases the true uncertainties in earthquake ages. For this reason, Hilley and Young (submitted manuscript, 2006b) suggest that the more conservative stratigraphic ordering constraints be used in addition to peat accumulation constraints in any study that seeks to estimate the age of ancient earthquakes. In keeping with this, we consider both a scenario in which only stratigraphic ordering constraints are used to estimate timing of ancient earthquakes (Figure 2b) and a scenario in which peat accumulation is used to further improve earthquake age estimates (Figure 2a).
5. Layered Viscoelastic Structure
 We build an earthquake cycle model incorporating multiple linear viscoelastic layers. In this section we discuss the construction of this model and illustrate the influence of layered viscoelastic structure on predicted interseismic surface velocities.
5.1. Model Construction
 Our earthquake cycle model consists of an infinitely long strike-slip fault in an elastic crust overlying two Maxwell viscoelastic layers and a Maxwell viscoelastic half-space (Figure 3). The viscoelastic regions represent the lower crust, uppermost mantle, and upper mantle. We consider a distinct uppermost mantle layer to approximate a possible relatively viscous lithospheric mantle lid separating a weaker lower crust and asthenospheric upper mantle. The 1-D analog of a Maxwell viscoelastic solid is a spring connected to a dashpot, which is a plunger in a cylinder filled with a Newtonian viscous fluid (n = 1, equation (1)).
 The multilayer model is an extension of the earthquake cycle model concept introduced by Savage and Prescott in which the far-field steady velocity field is obtained with the superposition of an infinite sequence of earthquakes on the fault. In the Savage-Prescott model, a fault is embedded in an elastic plate overlying a Maxwell viscoelastic half-space with uniform viscosity. The characteristic relaxation time, tR, of the viscoelastic half-space is 2η/μ where η is viscosity and μ is the elastic shear modulus. In this paper, we assume a uniform shear modulus of μ = 3 × 10¹⁰ Pa. Earthquakes are modeled as sudden uniform dislocations on a vertical fault. Earthquakes are imposed at a regular recurrence interval and a steady far-field velocity is achieved after an infinite sequence of periodic earthquakes. Meade and Hager and Hetland and Hager elaborated on this model to allow for nonperiodic earthquakes, and Hetland and Hager further extended the Savage-Prescott model to a general linear viscoelastic rheology. Our model differs from these earthquake cycle models in that we incorporate a layered viscoelastic lower crust and mantle.
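As a quick numerical illustration of the relaxation time defined above, the sketch below converts tR = 2η/μ to years for a range of viscosities, using the shear modulus μ = 3 × 10¹⁰ Pa assumed in this paper. The function name and the choice of a Julian year for unit conversion are assumptions made here for illustration.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year, assumed for unit conversion
SHEAR_MODULUS_PA = 3e10                # uniform shear modulus used in the text

def maxwell_relaxation_time_years(viscosity_pa_s, shear_modulus_pa=SHEAR_MODULUS_PA):
    """Characteristic relaxation time t_R = 2 * eta / mu, converted to years."""
    return 2.0 * viscosity_pa_s / shear_modulus_pa / SECONDS_PER_YEAR

for exponent in range(18, 23):
    eta = 10.0 ** exponent
    t_r = maxwell_relaxation_time_years(eta)
    print(f"eta = 1e{exponent} Pa s -> t_R = {t_r:.3g} yr")
```

Under these assumptions a viscosity of 10¹⁹ Pa s corresponds to a relaxation time of roughly 21 years, and 10²⁰ Pa s to roughly 210 years, so viscosities in this range bracket the recurrence intervals of a few hundred years considered in the forward models.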
 We obtain the solution for a single earthquake in an elastic layer overlying two viscoelastic layers and a viscoelastic half-space with propagator matrix methods [e.g., Ward, 1985]. The theory is linear so we can use superposition to sum solutions for single earthquakes to obtain a sequence of earthquakes. The steady far-field velocity is achieved by summing an infinite sequence of earthquakes. We obviously cannot specify the timing and slip for all earthquakes in the infinite sequence, so we break it into a finite sequence extending from the present to some time, t0, in the past, and an infinite sequence extending from time −∞ to time t0. All of the earthquakes in the infinite sequence have the same slip magnitude and recurrence interval. In the finite sequence, slip and recurrence time is allowed to vary. The solution is quasi-analytical and numerically efficient so that complete inversions for posterior probability distributions of model parameters are tractable. The only numerical step is an inverse Fourier transform from wave number space to physical space, and this is computed efficiently using the Fast Fourier Transform.
5.2. Surface Velocity Profiles
Figure 5 shows interseismic velocity profiles at four different times assuming the illustrated lithosphere viscosity structure and a sequence of earthquakes with 200-year recurrence intervals. In each model, the viscosity is varied in the lower crust, uppermost mantle, and upper mantle. Also shown for reference are the velocity profiles predicted by the Savage-Prescott model with the uniform viscosity set to the upper mantle viscosity of the multilayer model and with all other parameters the same.
 The effect of a relatively low-viscosity lower crust is shown in Figure 5a. Compared to the uniform viscosity model, the low-viscosity lower-crust model produces significantly lower velocities and shear strain rates within about 100 km of the fault for the time periods shown. This is because the relatively low-viscosity lower crust quickly relaxes the large stresses produced near the bottom edge of the fault after an earthquake through viscous flow. The flow is rapid and localized near the fault in the early stages after the earthquake, generating high shear strain rates in the elastic layer, and slow and diffuse in the later stages, producing low shear strain rates.
Figure 5b illustrates the effect of a relatively high-viscosity lower crust and uppermost mantle overlying a lower-viscosity mantle. The relatively high-viscosity lower crust and uppermost mantle localize the deformation near the fault with relatively steady shear strain rate within 50 km of the fault. The relatively high-viscosity lower crust introduces sustained localized flow below the fault that generates localized deformation in the elastic crust. The low-viscosity mantle relaxes relatively quickly, producing rapid deep flow and corresponding long-wavelength elastic flexure early in the cycle and slower flow and lower rates of long-wavelength deformation later in the cycle. The effect is similar to flexure of an elastic plate loaded under shear and then cracked part way through the plate at the top. The cracked plate partly releases the load, but the plate below the crack supports some of the load and localizes deformation near the bottom tip of the crack.
Figure 5c shows interseismic velocities for a model similar to the previous model, but including a lower-viscosity lower crust. The shear strain rate is lower near the fault than in Figure 5b, but the general pattern is quite similar.
5.3. Synthetic Inversions
 A notable feature of each of the forward models in the previous section is that the far-field velocities ∼300 km from the fault are significantly lower than the long-term plate velocities during much of the interseismic period. The Savage and Burford buried elastic dislocation model would require a lower slip rate to reproduce the velocities in Figure 5, suggesting that inversions of geodetic data for slip rates using elastic block models might underestimate the true fault slip rate.
 To investigate how severely the elastic block models and the Savage-Prescott earthquake cycle model might underpredict slip rates, we construct a synthetic data set using the model illustrated in Figure 5c by imposing the most recent earthquake 150 years ago (t = 150 profile). We invert the synthetic data for slip rates using the Savage-Prescott model and the Savage and Burford buried dislocation model. To simulate data across the San Andreas fault, we assume a 30 mm/yr slip rate, select 30 equally spaced samples of the velocity profile, and add normally distributed error with standard deviation of 1.5 mm/yr. Figure 6a is a contour map of confidence intervals for the joint posterior probability distribution for slip rate and locking depth using the Savage-Burford elastic dislocation model. The elastic model predicts slip rates of 16.5–22.5 mm/yr, which is 55–75% of the true slip rate of 30 mm/yr. Figure 6b shows a contour map of confidence intervals for the joint probability distribution of elastic plate thickness and viscosity for the Savage and Prescott earthquake cycle model. Here we have fixed the time since the last earthquake and the recurrence time and we found the least squares estimate of slip rate for each value of elastic thickness and viscosity. There is a strong correlation between elastic plate thickness and viscosity for elastic plate thicknesses less than 20 km. For elastic plate thicknesses between about 20 and 50 km, the data are fit satisfactorily for any viscosity value greater than 10²⁰ Pa s. Figure 6c shows the optimized slip rates within the 95% confidence bounds. The range in slip rates is similar to that obtained from the elastic block model.
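The mechanics of the elastic part of such a synthetic inversion can be sketched with a simple grid search. The example assumes the standard buried screw dislocation profile v(x) = (s/π) arctan(x/D) for slip rate s and locking depth D. Because the multilayer viscoelastic forward model is not reproduced here, the synthetic data are generated from the elastic model itself with a hypothetical 15 km locking depth, so this stand-in recovers the true rate rather than reproducing the underestimate reported above. Grid ranges and the random seed are also assumptions.

```python
import numpy as np

def savage_burford_velocity(x_km, slip_rate, locking_depth_km):
    """Fault-parallel velocity for a buried screw dislocation in a half-space."""
    return slip_rate / np.pi * np.arctan(x_km / locking_depth_km)

rng = np.random.default_rng(0)           # fixed seed, for repeatability only
x = np.linspace(-300.0, 300.0, 30)       # 30 equally spaced samples (km)
sigma = 1.5                              # data standard deviation (mm/yr)
true_rate, true_depth = 30.0, 15.0       # mm/yr from the text; depth is hypothetical
data = savage_burford_velocity(x, true_rate, true_depth)
data = data + rng.normal(0.0, sigma, x.size)

# Evaluate the Gaussian misfit on a regular grid of slip rate and locking depth.
rates = np.linspace(10.0, 50.0, 201)
depths = np.linspace(2.0, 60.0, 117)
R, D = np.meshgrid(rates, depths, indexing="ij")
pred = savage_burford_velocity(x[None, None, :], R[..., None], D[..., None])
chi2 = np.sum(((data - pred) / sigma) ** 2, axis=-1)

# With uniform priors on the grid, the posterior is proportional to the likelihood.
posterior = np.exp(-0.5 * (chi2 - chi2.min()))
posterior /= posterior.sum()

i, j = np.unravel_index(np.argmax(posterior), posterior.shape)
best_rate, best_depth = rates[i], depths[j]
rate_marginal = posterior.sum(axis=1)    # marginal distribution of slip rate
print(f"most probable slip rate: {best_rate:.1f} mm/yr, depth: {best_depth:.1f} km")
```

Contouring `posterior` over the grid gives the kind of joint confidence map described for Figure 6a; replacing the synthetic data with a profile from a layered viscoelastic model is what would expose the bias in the elastic estimate.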
 Our synthetic inversion demonstrates that the Savage-Burford buried elastic dislocation model and the Savage-Prescott earthquake cycle model may both underestimate slip rates on faults if the true mantle viscosity is lower than the viscosity of the uppermost mantle and lower crust.
6. Mojave Model
 We model the geodetic data in the Mojave region of the San Andreas fault system with the multilayer episodic earthquake cycle model discussed above (Figure 3). The timing of past earthquakes is constrained by paleoseismic data (Figure 2). We take t0, the beginning of the nonperiodic earthquake sequence, to be the time of event EQ5 at the Wrightwood site, about 1000 A.D. (Figure 2). Offsets and timing of earlier events are not resolved by our model.
 The sensitivity of surface velocities to the timing and magnitude of past earthquakes is illustrated in Figure 7. We compare a reference surface velocity profile with velocity profiles generated with a random distribution of slip magnitude and timing of past earthquakes. The reference model has the rheological structure illustrated in Figure 7 with event times of the last seven earthquakes taken from the peak of the probability distributions for Wrightwood in Figure 2a, and coseismic slip of 4.3 m for each of these most recent events (which corresponds to 30 mm/yr average slip rate). Before event EQ5, it is assumed that earthquakes occur every 200 years with slip rate of 30 mm/yr. We then randomize the slip magnitude and timing of the seven most recent earthquakes by varying the timing within ±50 years and slip magnitude within ±5 m of the reference model. The current surface velocity profiles (year 2006) from the randomized models are subtracted from the reference surface velocity profile. The upper and lower bounds on the distribution of the differenced velocities are shown in Figure 7. We show three different results: randomized slip and timing for (1) the 1812 and 1857 earthquakes only, (2) earthquakes 1–5 only, and (3) earthquakes 6–11 only (numbering shown in Figure 2). Also plotted are the 2σ error bars for each of the GPS measurements as a function of distance from the San Andreas fault. We see that the surface velocities are most sensitive to slip and timing for the 1812 and 1857 earthquakes. The modeled velocities within 100 km of the fault are marginally sensitive to the timing and slip for earthquakes 1–5 given the uncertainty in the measurements. However, only the long-wavelength component of the modeled velocity profile is sensitive to timing and slip for earthquakes 6–11 and the variation is completely within the 2σ error. 
Apparently, the model and GPS data will not resolve slip and timing of the older events in the paleoseismic record and will only marginally resolve events EQ 1–5.
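The relation between coseismic slip, recurrence interval, and average slip rate used in the reference model is simple arithmetic, sketched here for convenience. The function names are hypothetical; the numerical inputs (4.3 m, 30 mm/yr, 200 years) are taken from the text.

```python
def slip_per_event_m(slip_rate_mm_yr, interval_yr):
    """Coseismic slip (m) needed to keep pace with a slip rate over one interval."""
    return slip_rate_mm_yr * interval_yr / 1000.0

def implied_interval_yr(slip_m, slip_rate_mm_yr):
    """Average interval (yr) implied by a slip per event and a long-term slip rate."""
    return slip_m * 1000.0 / slip_rate_mm_yr

# Periodic events before EQ5: 200-year recurrence at 30 mm/yr.
print(slip_per_event_m(30.0, 200.0))

# Recent events: 4.3 m per event at 30 mm/yr average slip rate.
print(round(implied_interval_yr(4.3, 30.0), 1))
```

Under these numbers the periodic sequence carries 6 m of slip per event, while 4.3 m per event at 30 mm/yr implies an average interval of about 143 years for the most recent events.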
 We model the major faults in the Mojave region as infinitely long, parallel strike-slip faults, although in reality the faults are neither infinitely long nor parallel. This is an adequate approximation to obtain first-order estimates of lithosphere viscosity structure and fault slip rates. As discussed previously, we project GPS data within the dashed lines in Figure 1 onto a profile perpendicular to the trend of the Mojave segment of the San Andreas fault. The velocity profile is plotted in Figure 4. A line of GPS data across the Homestead and Emerson Valley faults that ruptured in the 1992 Landers earthquake is also projected onto a profile perpendicular to the faults.
 Following Meade and Hager, from west to east, we model the Hosgri fault, San Gabriel fault, San Andreas fault and two faults in the Eastern California Shear Zone (Figure 1). Although there are a number of active fault strands in the Eastern California Shear Zone, we model the entire zone with two parallel faults because we are not concerned with the details of deformation in the zone. Because we have a detailed earthquake history only along the San Andreas fault, this is the only fault for which we model interseismic viscoelastic cycle effects. Interseismic deformation due to slip on the other faults is modeled with buried screw dislocations [Savage and Burford, 1973]. The post-Landers data are modeled with our multilayer viscoelastic model, but we impose a single earthquake rather than an infinite sequence of earthquakes. It is reasonable to assume the cumulative interseismic displacements near Landers are negligible compared to postseismic displacements during this relatively short time period. We assign 5 mm/yr of slip on the San Gabriel fault following the results of Meade and Hager, and we solve for the slip rate on the other faults. The locking depth for each of these buried dislocations is set at 15 km. This approximation using buried screw dislocations is reasonable because, for any velocity profile produced by our earthquake cycle model, there is a buried dislocation model that approximately reproduces the velocity profile (except during the earliest period of the cycle where local velocities can exceed the far-field velocity). The advantage of this approximation is that there are fewer parameters to estimate in the buried dislocation model. The disadvantage is that the slip rate estimates on faults modeled with the buried dislocation are not reliable because we ignore time-varying viscous effects.
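The contribution of the faults treated as buried screw dislocations can be sketched as a sum of arctangent profiles with a common 15 km locking depth. In the example below, the fault positions along the profile and all slip rates other than the 5 mm/yr assigned to the San Gabriel fault are hypothetical placeholders, and the San Andreas fault is deliberately omitted because it is handled by the viscoelastic cycle model rather than by a dislocation.

```python
import numpy as np

LOCKING_DEPTH_KM = 15.0  # locking depth used for the buried dislocations in the text

def multi_fault_velocity(x_km, positions_km, slip_rates_mm_yr,
                         locking_depth_km=LOCKING_DEPTH_KM):
    """Sum of buried screw dislocation profiles for parallel strike-slip faults."""
    x = np.atleast_1d(np.asarray(x_km, dtype=float))[:, None]
    pos = np.asarray(positions_km, dtype=float)[None, :]
    rates = np.asarray(slip_rates_mm_yr, dtype=float)[None, :]
    return np.sum(rates / np.pi * np.arctan((x - pos) / locking_depth_km), axis=1)

# Hypothetical geometry (km from the San Andreas fault) and placeholder rates.
fault_names = ["Hosgri", "San Gabriel", "ECSZ west", "ECSZ east"]
positions = [-150.0, -40.0, 60.0, 100.0]   # assumed, for illustration only
slip_rates = [3.0, 5.0, 6.0, 6.0]          # only the 5 mm/yr value is from the text

profile_x = np.linspace(-300.0, 300.0, 121)
velocity = multi_fault_velocity(profile_x, positions, slip_rates)
print(f"velocity change across profile: {velocity[-1] - velocity[0]:.1f} mm/yr")
```

Each dislocation adds two parameters at most (slip rate and, if not fixed, locking depth), which is the economy of parameters noted above; the cost is that these rates absorb any unmodeled time-dependent viscous effects.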
7. Inversion Scheme
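 To incorporate prior information from geology on timing of past earthquakes, we formulate a Bayesian inverse problem. In a Bayesian formulation, the posterior distribution of the model parameters, m, given the data, d, is
$$p(\mathbf{m} \mid \mathbf{d}) = k \, p(\mathbf{m}) \, p(\mathbf{d} \mid \mathbf{m}),$$
where k is a constant, p(m) is the prior distribution of the model parameters, and p(d∣m) is the distribution obtained from the data and data errors [e.g., Mosegaard and Tarantola, 2002]. Assuming the model relationship d = g(m) + e with normally distributed (Gaussian) errors e ∼ N(0, Σd),
$$p(\mathbf{d} \mid \mathbf{m}) \propto \exp\left[ -\tfrac{1}{2} \left( \mathbf{d} - g(\mathbf{m}) \right)^{T} \Sigma_d^{-1} \left( \mathbf{d} - g(\mathbf{m}) \right) \right].$$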
 For this study, the parameters, m, include slip rates on all the faults, timing of and slip during past earthquakes, elastic crust and uppermost mantle thickness, and viscosities of the lower crust, uppermost mantle, and upper mantle. The prior distribution on the model parameters, p(m), is a quantitative estimate of the model parameters obtained independently of the geodetic data, as for example, the prior probability distributions on timing of past earthquakes (Figure 2). In this formulation, the target posterior distribution, p(m∣d), can be thought of as a refinement of the prior distribution p(m) through the introduction of geodetic data and the model. If the geodetic data and model do not provide any further constraints on the model parameters, then the posterior distribution will be equal to the prior distribution.
 If the relationship between the data and model were linear and the priors were Gaussian, we could obtain the posterior distribution with least squares. However, in this problem the relationship between many of the model parameters and the data is nonlinear and the prior distributions are non-Gaussian, so we cannot obtain a closed-form expression for the posterior distribution, p(m∣d). Instead, we build a discrete representation of the posterior distribution by sampling with a Monte Carlo-Metropolis method, as explained briefly by Johnson and Segall [2004a] and in more detail by Mosegaard and Tarantola [2002] (see Hilley et al. for another application).
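The sampling idea can be illustrated with a random-walk Metropolis sketch in Python. This is not the implementation used in the study: the forward model here is a toy buried screw dislocation with only two parameters (slip rate and locking depth), the priors are boxcars, the data covariance is assumed diagonal with a single standard deviation, and every numerical value is hypothetical.

```python
import math
import random

def forward(m, xs):
    """Toy forward model g(m): buried screw dislocation, m = (slip rate, locking depth)."""
    slip_rate, depth = m
    return [(slip_rate / math.pi) * math.atan(x / depth) for x in xs]

def log_posterior(m, xs, data, sigma, bounds):
    """Unnormalized log posterior: boxcar priors plus Gaussian likelihood."""
    if any(not (lo <= v <= hi) for v, (lo, hi) in zip(m, bounds)):
        return -math.inf  # zero prior probability outside the bounds
    misfit = sum(((d - g) / sigma) ** 2 for d, g in zip(data, forward(m, xs)))
    return -0.5 * misfit

def metropolis(log_post, start, step_sizes, n_samples, rng):
    """Random-walk Metropolis; `start` must lie inside the prior bounds."""
    current = list(start)
    current_lp = log_post(current)
    samples, accepted = [], 0
    for _ in range(n_samples):
        proposal = [v + rng.gauss(0.0, s) for v, s in zip(current, step_sizes)]
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, p_new / p_old); 1 - random() lies in (0, 1].
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
            accepted += 1
        samples.append(tuple(current))
    return samples, accepted / n_samples

# Synthetic, noise-free data from a hypothetical "true" model (25 mm/yr, 15 km).
xs = [10.0 * i for i in range(-20, 21)]          # profile positions in km
data = forward((25.0, 15.0), xs)
bounds = [(0.0, 50.0), (1.0, 40.0)]              # boxcar priors on slip rate, depth
rng = random.Random(0)
samples, acceptance = metropolis(
    lambda m: log_posterior(m, xs, data, 1.0, bounds),
    start=(20.0, 10.0), step_sizes=(0.5, 1.0), n_samples=20000, rng=rng)
kept = samples[5000:]                            # discard burn-in
mean_slip = sum(s for s, _ in kept) / len(kept)
mean_depth = sum(d for _, d in kept) / len(kept)

if __name__ == "__main__":
    print(f"acceptance rate {acceptance:.2f}")
    print(f"posterior mean slip rate {mean_slip:.1f} mm/yr, depth {mean_depth:.1f} km")
```

Histograms of the retained samples approximate the marginal posterior distributions, and percentiles of those samples give confidence intervals of the kind reported for each parameter.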
 We estimate viscosities of the lower crust, uppermost mantle, and upper mantle as well as the thickness of the elastic crust and the depth to the top of the upper mantle. The depth to the bottom of the lower crust is fixed at 30 km, which is the average Moho depth in the Mojave region [e.g., Zhu and Kanamori, 2000]. We estimate the timing of EQ 1–5 (shaded events in Figure 2), slip for EQ 1–4 and 1812 and 1857, and the average recurrence interval and long-term slip rate prior to EQ 5 (note that slip is not estimated for EQ 5 because it is the final earthquake in the infinite periodic sequence assigned uniform slip). The earthquakes leading up to EQ 5 in Figure 2 are modeled as an infinite periodic sequence, and the recurrence time and slip rate are estimated in the inversion. All of the unknown parameters are listed in Table 2.
T is average recurrence time before 900 A.D.; is average slip rate on each of the modeled faults; H1,3 and η1–3 are defined in Figure 3; si is coseismic slip for each earthquake, and savg is the average coseismic slip magnitude before EQ 5; ti are the dates of earthquakes; SAF is San Andreas fault; ECSZ is Eastern California Shear Zone; HOS is Hosgri fault; and SG is San Gabriel fault.
Type of prior: u, uninformative (i.e., boxcar distribution); g, Gaussian; and p, paleoseismology.
Upper and lower bounds on uninformative prior and 95% confidence intervals on Gaussian priors.
The 95% confidence intervals on posterior distributions.
Result assuming order-only constraints on Wrightwood paleoseismic data.
Buried dislocation result is not meaningful in context of the viscoelastic cycle model.
 The priors assumed for all parameters are listed in Table 2 and plotted in Figure 8. The priors on timing of earthquakes are the paleoseismic probability density functions from the Wrightwood site constructed by Hilley and Young (submitted manuscript, 2006a) (Figure 2). We perform two inversions. One uses the Wrightwood probability distributions constructed using constraints from peat accumulation rates, and the other uses the distributions that are not constrained by peat accumulation rates (order only). The prior distributions on coseismic slip are based loosely on studies by Sieh and Salyards et al. We assumed uniform (boxcar) priors for the 1812 earthquake and EQ 1 because the paleoseismic data indicate that these earthquakes may not have ruptured the entire Mojave segment, and therefore we did not want to put any prior weight on the slip magnitudes except to limit the upper value to 10 m (Figure 2). We further assume that the average slip rate over the last six earthquakes is comparable to the average long-term slip rate prior to EQ 5 by requiring the current slip deficit on the San Andreas fault (the amount of slip in the next earthquake needed to keep up with the long-term slip rate) to be between 0 and 10 m.
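The prior types and the slip deficit constraint can be written compactly as log-probability terms. The Python sketch below is one plausible reading, not the formulation used in the study: it assumes the slip deficit is the slip accumulated at the long-term rate since a reference date minus the summed coseismic slip since that date, and it converts a 95% interval to a Gaussian standard deviation by dividing the half-width by 1.96. All dates, slips, and rates in the example are hypothetical.

```python
import math

def log_boxcar(value, lower, upper):
    """Uninformative (boxcar) prior: constant inside the bounds, zero outside."""
    return 0.0 if lower <= value <= upper else -math.inf

def log_gaussian_from_ci(value, lower95, upper95):
    """Gaussian prior specified by its 95% interval (up to a constant)."""
    mean = 0.5 * (lower95 + upper95)
    sigma = (upper95 - lower95) / (2.0 * 1.96)
    return -0.5 * ((value - mean) / sigma) ** 2

def slip_deficit(slip_rate_m_per_yr, reference_year, current_year, slips_m):
    """Assumed definition: accumulated slip minus released slip since the reference year."""
    return slip_rate_m_per_yr * (current_year - reference_year) - sum(slips_m)

def log_deficit_constraint(deficit_m, lower=0.0, upper=10.0):
    """Require the current slip deficit to lie between 0 and 10 m, as in the text."""
    return log_boxcar(deficit_m, lower, upper)

# Hypothetical example: 25 mm/yr since year 1000, six earthquakes since then.
example_slips = [4.0, 4.0, 4.0, 4.0, 2.0, 5.0]
example_deficit = slip_deficit(0.025, 1000.0, 2006.0, example_slips)

if __name__ == "__main__":
    print(f"slip deficit {example_deficit:.2f} m, "
          f"log prior term {log_deficit_constraint(example_deficit)}")
```

A sampled model whose deficit falls outside the 0 to 10 m window receives zero prior probability and is always rejected, which ties the slips of the most recent earthquakes to the long-term rate.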
Table 2 and Figures 4 and 8 summarize the inversion results using the prior probability distributions with full constraints for the Wrightwood site shown in Figure 2a. The inferred long-term slip rate on the San Andreas fault is 20–28 mm/yr, at the lower end of the 25–35 mm/yr estimate from geologic data. The prior distributions on slip magnitude in the five earthquakes before the 1857 event are broad and the inversion significantly refines the distribution only for the 1812 and 1857 earthquakes. The inversion suggests that the 1812 earthquake was relatively small with only 1–3 m of average slip. This may reflect the likely scenario that the 1812 earthquake ruptured only part of the Mojave segment. The inversion refines the estimate of slip in the 1857 earthquake; the posterior probability distribution is shifted to the lower end of the prior distribution. Figure 8 shows that the prior and posterior distributions on timing are nearly identical, and so the inversion does not further refine the timing of earthquakes. This is also the case for the inversion using order-only constraints for Wrightwood (Figure 2b). We do not show inversion results for the second inversion that uses the more conservative order-only constraints at Wrightwood since the results are nearly identical to the results using the prior with full constraints. The long-term slip rate estimate for the San Andreas fault using the order-only prior is 22–30 mm/yr.
 The 95% confidence intervals on lithosphere layer thicknesses and viscosities are plotted in Figure 3. It is remarkable that, given no prior constraints on viscosity, the viscosities are resolved to within 1–2 orders of magnitude in each layer. It is also interesting that the viscosity distribution with depth follows the general pattern expected from laboratory measurements: the upper mantle viscosity is lower than the uppermost mantle and lower crustal viscosities, and the lower crustal viscosity is lower than the uppermost mantle viscosity. Also, values of 52–66 km for H3, the depth to the top of the low-viscosity upper mantle asthenosphere, are consistent with independent evidence for the lack of a thick lithospheric mantle lid in the western US [Goes and van der Lee, 2002; Freed and Bürgmann, 2004].
9.1. Comparison With Elastic Models
 Our slip rate estimate of 20–30 mm/yr is higher than estimates from 3-D elastic block models [Becker et al., 2004; Meade and Hager, 2005]. However, our model differs from the 3-D block models in that we assume infinitely long strike-slip faults and we model viscous flow below the elastic crust. We also consider deformation data early and late in the earthquake cycle, as well as information on past earthquake occurrence. To investigate whether the difference in slip rate estimates is due to the different assumptions about rheology or to different assumptions about fault geometry (e.g., finite versus infinite), we inverted the contemporary GPS data using a Savage and Burford [1973] buried fault model, which is the 2-D equivalent of the 3-D block models. Assuming no prior information on slip rate, the inversion yields a slip rate of 17.5–21 mm/yr with locking depth of 18–24 km for the Mojave segment of the San Andreas fault (both 95% confidence limits). Meade and Hager [2005] report slip rate estimates of 13–15.5 mm/yr with locking depth of 15 km, and Becker et al. [2004] report slip rates of 10–25 mm/yr with 15 km locking depth. If we also fix the locking depth to 15 km in our buried dislocation inversion, we obtain slip rates of 16.6–18.2 mm/yr for the San Andreas fault. The 2-D and 3-D elastic dislocation models both produce slip rate estimates that are much lower than geologic estimates. This analysis suggests that the difference in slip rate estimates between our 2-D multilayer model and the 3-D block or 2-D dislocation models is largely due to differences in assumed rheology, rather than differences in fault geometry.
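The drop in estimated slip rate when the locking depth is fixed at a shallower value follows directly from the arctangent form of the buried dislocation model. Because the model is linear in slip rate once the depth is fixed, the least squares slip rate has a closed form. The Python sketch below uses synthetic, noise-free data from a hypothetical model (19 mm/yr, 20 km locking depth, chosen only to fall within the ranges quoted above) and refits it with the depth fixed at 15 km.

```python
import math

def arctan_kernel(x_km, depth_km):
    """Velocity per unit slip rate for a buried screw dislocation."""
    return math.atan(x_km / depth_km) / math.pi

def best_fit_slip_rate(xs, velocities, depth_km):
    """Least squares slip rate for a fixed locking depth (model is linear in slip rate)."""
    kernel = [arctan_kernel(x, depth_km) for x in xs]
    numerator = sum(v * a for v, a in zip(velocities, kernel))
    denominator = sum(a * a for a in kernel)
    return numerator / denominator

# Hypothetical "true" model used to generate synthetic data.
TRUE_SLIP_RATE = 19.0   # mm/yr
TRUE_DEPTH_KM = 20.0

xs = [10.0 * i for i in range(-20, 21)]   # profile positions in km
velocities = [TRUE_SLIP_RATE * arctan_kernel(x, TRUE_DEPTH_KM) for x in xs]

slip_at_true_depth = best_fit_slip_rate(xs, velocities, TRUE_DEPTH_KM)
slip_at_15_km = best_fit_slip_rate(xs, velocities, 15.0)

if __name__ == "__main__":
    print(f"depth fixed at 20 km: {slip_at_true_depth:.2f} mm/yr")
    print(f"depth fixed at 15 km: {slip_at_15_km:.2f} mm/yr")
```

A shallower assumed locking depth concentrates the predicted velocity gradient near the fault, so a smaller slip rate is needed to match the same profile. This is the sense of the change reported above when the depth is fixed to 15 km.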
9.2. Importance of Combining Observations From Different Time Periods
 The use of multiple data sets sampling different time periods of the earthquake cycle allows us to resolve the distribution of viscosity with depth. To demonstrate the importance of temporal data coverage on the viscosity estimates, we show the fit to the data for viscosities outside of the 95% confidence limits. Figure 4c illustrates that the constraints on mantle viscosity come largely from the post-Landers data. The dashed and solid curves show the modeled time series assuming a mantle viscosity of 10¹⁷ Pa s and 10²⁰ Pa s, respectively. The low-viscosity mantle relaxes too quickly and the high-viscosity mantle relaxes too slowly. The nearly steady shear strain rate across the San Andreas fault since about 1930 places constraints on the lower-crust and uppermost mantle viscosities. The dashed curve in Figure 4b shows the best fitting model with the lower-crust and uppermost mantle viscosity fixed to 10¹⁹ Pa s. The shear strain rate in the model is too high around 1940 and there is more variation in shear strain rate with time than the data suggest. A low-viscosity channel in the lower crust is inconsistent with the data, as illustrated in Figure 4a. The dashed curve in Figure 4a, which shows the best fitting model when the viscosity of the lower crust is fixed to 10¹⁸ Pa s, does not fit the data. The post-Landers GPS data also place upper bounds on the lower-crust and uppermost mantle viscosity. We found unbounded estimates of viscosity when the post-Landers data were not used in the inversion. This follows from the results of our synthetic inversion (Figure 6), where we saw that the contemporary velocity field alone cannot constrain the upper bounds on viscosity.
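A rough sense of why the two trial mantle viscosities relax too quickly and too slowly comes from the Maxwell relaxation time, η/μ. The short Python calculation below assumes a shear modulus of 30 GPa, which is a typical value and is not taken from this study; some authors define the relaxation time as 2η/μ, and the time scale seen at the surface also depends on layer geometry, so the numbers are order-of-magnitude guides only.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
SHEAR_MODULUS_PA = 30e9  # assumed typical value, not from the study

def maxwell_relaxation_time_years(viscosity_pa_s, shear_modulus_pa=SHEAR_MODULUS_PA):
    """Maxwell relaxation time eta / mu, converted from seconds to years."""
    return viscosity_pa_s / shear_modulus_pa / SECONDS_PER_YEAR

if __name__ == "__main__":
    for eta in (1e17, 1e18, 1e19, 1e20):
        years = maxwell_relaxation_time_years(eta)
        print(f"eta = {eta:.0e} Pa s  ->  relaxation time ~ {years:.2f} yr")
```

Under these assumptions a viscosity of 10¹⁷ Pa s relaxes in roughly a tenth of a year and 10²⁰ Pa s in roughly a century, so only intermediate values produce transients on the time scale of a decade of postseismic GPS observations.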
9.3. Limitations of Model Assumptions
 Two simplifying assumptions are worth further discussion. First, we discarded the first two years of post-Landers GPS data to minimize the possible influence of nonlinear viscosity or afterslip on our results. We assumed that the postseismic transient after the first two years is due to relaxation of a layered linear viscous plastosphere, but any afterslip continuing beyond the first two years could contribute to the surface velocity field. For example, Johnson and Segall [2004a] showed that there could be detectable amounts of afterslip for as many as ten years following an earthquake. Second, we assumed Newtonian viscosity (n = 1 in equation (1)) so that the effective viscosity is independent of stress and constant with time. However, laboratory experiments suggest that plastosphere viscosity is non-Newtonian and there is some evidence to confirm this from models of postseismic deformation [e.g., Pollitz, 2003; Freed and Bürgmann, 2004]. To examine the potential pitfalls of the Newtonian assumption, we develop a simplified 1-D model to approximate the coupling of elastic deformation in the crust with viscous, power law flow in the upper mantle (Figure 9).
 The 1-D problem is cast in terms of thickness-averaged stresses and displacements, following Elsasser. Force balance on an element of the elastic plate (Figure 9a) requires that the thickness-averaged mantle shear stress, σ, is proportional to the gradient in thickness-averaged shear stress in the elastic plate, τ,
where He is elastic thickness and x is lateral position. Assuming linear elasticity for the plate and power law viscosity for the channel,
where μ is elastic shear modulus, Hv is the viscous channel thickness, and C and n are defined in equation (1). In order to remove the dependence of the solution on x, we approximate u(x, t) with a form that guarantees that the thickness-averaged shear stress in the plate and mantle is independent of position,
where A(t) is the amplitude of the displacement profile that decays with time after an earthquake and W is a length scale over which there is significant straining of the elastic plate (Figure 9a). The displacements are plotted in Figure 9b for W = 200 km and coseismic slip of 5 m. Substituting equation (6) into equation (5),
We solve the first-order ordinary differential equation for A(t) numerically using a MATLAB Runge-Kutta solver.
 We load the lithosphere with 5 m of sudden displacement at x = 0 every 200 years until a cycle-invariant state is reached. We vary μ/He and C · Hv and plot the evolution of effective viscosity in Figure 9c. In each case, n = 2. The gray curves assume μ/He = 30 GPa/km and the black curves assume μ/He = 3 GPa/km. The effective viscosity decreases immediately after the earthquake when the shear stress is high and increases with time as the strain rate and shear stress are reduced [e.g., Freed and Bürgmann, 2004]. Higher values of n cause the effective viscosity to evolve more quickly after an earthquake, but otherwise the effect is similar.
 This approximate solution shows that the effective viscosity of the mantle could vary by as much as three orders of magnitude over a 200 year earthquake cycle. If flow in the lower crust and mantle is better approximated by a power law rheology than a linear Maxwell rheology, then the physical meaning of our layer viscosity estimates is less clear. An analysis of these data with simple 2-D earthquake cycle models incorporating power law viscous flow would provide some insight into the effect of the nonlinear viscosity.
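The growth of effective viscosity through the cycle can be illustrated with a much simpler system than the 1-D plate model: a single spring in series with a power law dashpot, loaded by a sudden stress step every 200 years. The Python sketch below is a zero-dimensional analog and not the equation solved in the text. It assumes a power law of the form strain rate = C σⁿ, so that the stress relaxes as dσ/dt = −μ C σⁿ and the effective viscosity scales as 1/(C σⁿ⁻¹). The product μC and the stress step are hypothetical, and a hand-written fourth-order Runge-Kutta step stands in for the MATLAB solver.

```python
N = 2                 # stress exponent, as in the text
K = 1.0               # mu * C in 1/(MPa yr) for n = 2; hypothetical
STRESS_STEP = 1.0     # stress added by each earthquake in MPa; hypothetical
CYCLE_YEARS = 200.0   # repeat time, as in the text
DT = 0.05             # time step in years

def relax_rate(stress):
    """d(stress)/dt for a spring in series with a power law dashpot."""
    return -K * stress ** N

def rk4_step(f, y, dt):
    """One classical fourth-order Runge-Kutta step for dy/dt = f(y)."""
    k1 = f(y)
    k2 = f(y + 0.5 * dt * k1)
    k3 = f(y + 0.5 * dt * k2)
    k4 = f(y + dt * k3)
    return y + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def run_cycles(n_cycles=10):
    """Repeat earthquake cycles; return stress just after and just before an event."""
    steps = int(round(CYCLE_YEARS / DT))
    stress = 0.0
    start = 0.0
    for _ in range(n_cycles):
        stress += STRESS_STEP      # sudden coseismic stress step
        start = stress
        for _ in range(steps):     # relaxation until the next event
            stress = rk4_step(relax_rate, stress, DT)
    return start, stress

def relative_effective_viscosity(stress):
    """Effective viscosity in units of 1/C: stress divided by viscous strain rate."""
    return 1.0 / stress ** (N - 1)

stress_start, stress_end = run_cycles()
viscosity_ratio = (relative_effective_viscosity(stress_end)
                   / relative_effective_viscosity(stress_start))

if __name__ == "__main__":
    print(f"stress after event {stress_start:.4f} MPa, before next {stress_end:.4f} MPa")
    print(f"effective viscosity grows by a factor of about {viscosity_ratio:.0f}")
```

With these hypothetical parameters the effective viscosity rises by about two orders of magnitude between one event and the next. The size of the change depends on the assumed stress step and on μC, so the sketch supports only the qualitative point that a single Newtonian viscosity cannot describe the whole cycle.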
 We have demonstrated that the estimate of slip rate on the Mojave segment of the San Andreas fault from geodetic data can be reconciled with estimates from geologic data using a model that accounts for vertical variations in viscosity. We obtain slip rates of 20–30 mm/yr for the San Andreas fault. Neglecting the first-order stratification of viscosity in the lower crust and upper mantle leads to systematic underestimates of the San Andreas fault slip rate. The estimated viscosities for the lower crust, uppermost mantle, and upper mantle are on the order of 10¹⁹–10²⁰ Pa s, 10²⁰–10²² Pa s, and 10¹⁸–10¹⁹ Pa s, respectively. The relatively high-viscosity lower crust (>10¹⁹ Pa s) is consistent with studies of postearthquake deformation in the Mojave Desert [e.g., Pollitz et al., 2000; Pollitz, 2003]. The relatively low-viscosity upper mantle is consistent with isostatic rebound studies invoking layered viscosity in which the viscosity of the underlying mantle is generally lower than the overlying uppermost mantle and lower crust (see Table 1). (The model is quite consistent with the lithospheric rheology structure inferred from recent studies of postearthquake deformation in the Mojave Desert, as well as the relatively low (<10¹⁹ Pa s) western United States mantle viscosities inferred from isostatic rebound studies.) This model reproduces the contemporary velocity field across the Mojave region, shear strain rates across the San Andreas fault from 1932 to 1977, and postseismic GPS time series following the 1992 Landers earthquake.