We introduce new forward and inverse methods for inferring long-term fault slip rates, interseismic fault creep rates, and distribution of locked and creeping patches on faults using geodetic and geologic data. The forward model consists of fault-bounded blocks in an elastic crust overlying a Maxwell viscoelastic mantle. Interseismic elastic distortion of the blocks is modeled due to periodic locking and unlocking of faults throughout the earthquake cycle. Patches on the fault are assumed to be either locked during the interseismic period or creeping at constant shear stress. We utilize a Bayesian, probabilistic inversion method to infer the posterior probability distribution of long-term interseismic fault slip rates, distribution of locked and creeping patches, and relative weighting of multiple data sets. We illustrate the method with an inversion of a synthetic data set. We apply the method to estimate fault slip rates and the distribution of interseismic creep on faults in the San Francisco Bay Area, CA, using GPS-derived velocities and geologic measurements of fault slip rates. We show that the inferred fault slip rates and areas of the locked regions of faults are sensitive to the assumed viscosity of the upper mantle and the timing of past earthquakes and can be significantly different from values inferred from elastic models that do not include viscous flow. Considering models with different viscosities, inferred fault slip rates on major Bay Area faults can differ by factors of 1.5–4.0 and the inferred moment accumulation rate can differ by factors of 2–13.
 Quantifying the potential for large earthquakes on faults is an objective of many studies utilizing geodetic measurements of surface motions in deforming plate boundary settings. Computing the rate of accumulation of seismic moment on faults requires estimates of the long-term fault slip rate and the area of the fault that is locked. These parameters are routinely estimated using GPS-derived velocities and elastic dislocation models. The traditional approach to mapping the distribution of locked asperities and creeping regions of faults using geodetic data is to employ a back slip model [e.g., Savage, 1983]. In these models, the interseismic reduction of the long-term fault slip rate is modeled by superimposing backward slip on forward slip at the long-term rate, using the solution for a dislocation in an elastic half-space. Through an inversion of geodetic data, one estimates the spatial distribution of backward slip rate (slip deficit rate) or a coupling ratio (0–1, ranging from creep at the long-term rate to zero creep) that quantifies the rate of back slip relative to the forward slip rate [e.g., Mazzotti et al., 2000; McCaffrey et al., 2008; Meade and Loveless, 2009]. The distribution of slip deficit rate or coupling ratio is constrained to vary smoothly along the fault.
 Often studies assume a ‘locking depth’ above which the fault is completely locked and below which the fault creeps. The notion of a ‘locking depth’ arises from an idealized conceptual model of the kinematics of fault slip that consists of a locked ‘seismogenic’ portion of the fault at shallow depths and temperatures below about 350°C and a ductile, creeping portion of the fault at greater depths and higher temperatures. However, deviations from this simple model are evident in a number of locations where faults appear to display along-strike variations in slip behavior at seismogenic depths [e.g., Schmidt et al., 2005; Murray and Segall, 2005; McCaffrey et al., 2008] and consequently the transition from locking to creeping occurs at temperatures well below 350°C. A conceptual model more nearly consistent with observations is one consisting of numerous isolated locked asperities surrounded by fault creep [e.g., Seno, 2003]. Indeed, such a picture of plate interface locking is emerging for the Sumatra subduction interface from geodetic and seismic studies of the recent sequence of subduction zone earthquakes [Konca et al., 2008].
 The slip deficit rate or the coupling ratio, which are kinematic quantities, are often interpreted in terms of physical properties of the fault, although the interpretation of coupling is not unique [e.g., Wang and Dixon, 2004; Lay and Schwartz, 2004]. Regions of the fault with inferred slip deficits equal to the long-term rate (or coupling ratio of 1) might be interpreted as locked asperities that slip unstably during earthquakes. Regions of the fault where the slip deficit rate is inferred to be zero (coupling ratio of 0) might be considered to be stable sliding regions that do not rupture in earthquakes. However, the interpretation of slip deficit rates between zero and the long-term rate is ambiguous. These regions may be creeping areas that are pinned from slipping at the long-term rate by neighboring fully locked portions of faults, or a spatial average of creep surrounding many small locked asperities. Additionally, regions with coupling ratio between 0 and 1 could represent areas of the fault that creep interseismically and also rupture during earthquakes. This ambiguity has been investigated with numerical models by Hetland and Simons  and with repeating earthquake observations by Igarashi et al. . The difficulty with resolving entirely locked regions of faults from regions with low creep rate may be highlighted by several studies that show the inferred region of high coupling on subduction zones is larger than the area of known rupture in past earthquakes [e.g., Nishimura et al., 2004; Suwa et al., 2006; Chlieh et al., 2008]. Furthermore, it is possible that earthquake ruptures may not be confined to areas of the fault that are locked during the interseismic period as suggested by King and Wesnousky  and Hillers and Wesnousky .
 Confounding the problem of inferring moment accumulation rate on locked parts of faults is the recognition that inferred fault slip rates and locking depths are dependent on model assumptions. For example, Dixon et al.  and Johnson et al.  showed that fault slip rates inferred from GPS-derived velocities and elastic dislocation models can be different from slip rates inferred from viscoelastic earthquake cycle models that incorporate elastic distortion across locked faults and distributed viscous flow in the lower crust and/or uppermost mantle. Savage and Lisowski  suggested that inferred locking depths from geodetic data could be biased to deep values using the conventional elastic models of interseismic strain accumulation if there is in fact a significant amount of broad viscous relaxation in the lower crust and/or uppermost mantle. Savage and Lisowski  suggested that a viscoelastic cycle model that accounts for deep viscous flow could explain broad strain patterns across the Mojave segment of the San Andreas fault system with locking depths of 10–15 km, consistent with typical rupture depths of large strike-slip earthquakes, whereas the conventional elastic model of strain accumulation requires locking depths of about 25 km to explain the strain observations.
 The purpose of this paper is to present a method for using geodetic data to estimate fault slip rates and the distribution of interseismic locking and creeping on faults while addressing some of the possible limitations of previous approaches. As in the popular elastic block models described by McCaffrey  and Meade and Loveless , we simultaneously estimate the long-term motion of fault-bounded crustal blocks and the distribution of interseismic creep on bounding faults. Blocks are modeled in an elastic plate overlying a viscoelastic substrate to represent faulting in the effectively elastic crust overlying the viscously flowing upper mantle. The method presented in this paper avoids use of the ambiguous coupling ratio by incorporating simple physical conditions on fault slip by following Bürgmann et al.  and assuming patches on the fault are either completely locked (no slip) or creep at constant resistive shear stress.
 This paper is organized as follows. We first describe the model setup, including the formulation of the long-term, steady state motions and the formulation of the interseismic perturbation due to faults locking and unlocking. We then describe a Bayesian inversion method for estimating fault slip rates and the spatial distribution of interseismic fault creep. We illustrate the forward and inverse method with an inversion of synthetic data for a simplified fault geometry and data distribution. Finally, we apply the method to GPS and geologic data in the San Francisco Bay Area, CA, to estimate fault slip rates and the distribution of interseismic creep.
2. Model Construction
 The general concept of our model is analogous to the 2-D concept originated by Savage and Prescott  and Savage . The principal idea of the Savage back slip models is that the interseismic velocity field can be decomposed into: (1) a steady, long-term velocity field in which the faults slide at the long-term slip rate and (2) a transient perturbation to this steady state due to locking of faults which is modeled with backward fault slip to cancel the long-term velocity discontinuity. This concept is illustrated in Figure 1. Savage and Prescott  and Savage  assumed no steady state strain in the fault bounded blocks. The solution for a dislocation in an elastic half-space or in an elastic plate overlying a viscoelastic substrate was adopted by Savage for the back slip part of the solution. The 3-D elastic half-space versions of this model [e.g., McCaffrey, 2002; Meade and Hager, 2005] are directly analogous to the 2-D elastic model of Savage and Burford . In the 3-D models, the long-term, steady state motion of blocks is described by rigid-body rotations of blocks about Euler poles. Interseismic elastic strain is introduced with backward slip on dislocations in a flat elastic half-space.
 Following these previous studies, we also construct the interseismic deformation field as a superposition of a steady state, long-term velocity field (with no fault locking) and an interseismic perturbation to this steady state due to locking of faults. We assume faults in an elastic plate (crust) overlying a Maxwell viscoelastic substrate (upper mantle).
2.1. Steady State Model
 A kinematic steady state velocity field is constructed as an extension of elastic half-space block models developed by McCaffrey  and Meade and Hager  in which the steady state velocity field is defined by rigid body rotations of blocks about Euler poles resulting in a purely horizontal velocity field. We modify the block motion to account for vertical motion across dipping faults and long-term distortion of blocks due to non planar fault geometry and bending of the lithosphere associated with dip-slip motion on faults. Our steady state velocity field satisfies the following slip conditions on faults: (1) the fault-normal component of velocity discontinuities across faults is zero and (2) the dip component of slip on faults is δV/cosθ where δV is the fault-trace-perpendicular component of the horizontal velocity discontinuity across the fault trace and θ is the fault dip. The first condition guarantees that fault surfaces do not open or interpenetrate. The second condition assures that the horizontal component of the velocity discontinuity across dipping faults is equal to δV, but is not appropriate for very steep faults because δV/cosθ approaches infinity as dip approaches vertical.
 The steady state surface velocity field is obtained by summing three velocity fields as illustrated in Figure 2a. We begin with rotational velocity fields about Euler poles for each block. Fault-normal components of velocity discontinuities across faults are canceled by adding the velocity field generated by steady opening or closing of the faults in an elastic plate overlying a viscous substrate. Finally, the contribution to the steady state velocity field due to dip-slip motion on faults is added. Figure 2b illustrates the model in the vicinity of a dipping fault. We assume the dip-component of the slip rate is δV/cosθ where δV is the horizontal component of the fault-normal velocity discontinuity and θ is the dip of the fault. This assures that the horizontal component of the velocity discontinuity across dipping faults is equal to δV. After canceling the fault-normal component of the velocity discontinuity across faults, there remains a fault-parallel component of block motion, δVcosθ. We therefore prescribe the full slip rate of δV/cosθ by imposing steady slip on the fault of amount δV/cosθ − δVcosθ. Unlike previous elastic block formulations [e.g., McCaffrey, 2002; Meade and Loveless, 2009], this formulation introduces some long-term, steady state internal block strain due to convergence across faults.
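The dip-slip bookkeeping described above can be checked numerically. The following minimal Python sketch evaluates the full dip-slip rate δV/cosθ, the fault-parallel part already supplied by block motion, δVcosθ, and the additional steady slip that must be imposed; the difference simplifies to δV sinθ tanθ. The function name and the example values are illustrative assumptions, not part of the model implementation.

```python
import math

def dip_slip_rates(delta_v, dip_deg):
    """Return (full dip-slip rate, part supplied by block motion, imposed extra slip).

    delta_v : horizontal, fault-trace-perpendicular velocity discontinuity
    dip_deg : fault dip in degrees (must be less than 90)
    """
    theta = math.radians(dip_deg)
    full = delta_v / math.cos(theta)        # prescribed dip-slip rate
    block_part = delta_v * math.cos(theta)  # fault-parallel component of block motion
    imposed = full - block_part             # steady slip added on the fault
    return full, block_part, imposed

# Hypothetical example: 10 mm/yr of convergence across a fault dipping 60 degrees.
full, block_part, imposed = dip_slip_rates(10.0, 60.0)
theta = math.radians(60.0)
identity = 10.0 * math.sin(theta) * math.tan(theta)  # delta_v * sin(theta) * tan(theta)
print(full, block_part, imposed, identity)
```

Because the imposed slip grows as sinθ tanθ, it becomes very large for steep faults, which is the same limitation noted for the δV/cosθ condition.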
 For a mathematical formulation of the solution for dislocations in an elastic plate over a viscoelastic half-space using the method of propagator matrices, we refer the reader to Fukahata and Matsu'ura and to Matsu'ura and Sato. Our formulation of the propagator matrix solution is essentially identical to theirs, including a surface approximation for the effect of gravity. As shown by Matsu'ura and Sato and Pollitz et al., the surface velocities resulting from a steadily sliding dislocation source in an elastic plate over a viscoelastic half-space are proportional to the infinite-time response to a suddenly imposed (step-function) dislocation and are independent of viscosity.
 In an Earth-centered Cartesian coordinate system, the velocity at coordinate X on a spherical cap rotating on the surface of Earth about an Euler pole, Ω, is the cross product Ω × r(X), where r(X) is the vector from the center of the Earth to point X. For n blocks we define Ω = (Ω1, Ω2, …, Ωn). Writing the cross product in matrix form, the Euler poles are related linearly to the steady state surface velocities, dss, through a matrix, Gss,
$$\mathbf{d}_{ss} = \mathbf{G}_{ss}\,\boldsymbol{\Omega}. \tag{1}$$
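The matrix form of the rotation can be illustrated with a short Python sketch. It assumes the standard convention v = Ω × r and builds the 3 × 3 block that maps one rotation vector to the velocity at one point; a full Gss would stack such blocks for every observation point and add the fault-opening and dip-slip contributions, which are not shown. The function names and numerical values are hypothetical.

```python
def rotation_block(r):
    """3x3 matrix A such that A @ omega equals the cross product omega x r."""
    x, y, z = r
    return [[0.0,   z,  -y],
            [ -z, 0.0,   x],
            [  y,  -x, 0.0]]

def matvec(A, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def cross(a, b):
    """Cross product a x b of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

# Hypothetical example: a point on the equator at zero longitude (meters)
# and a rotation of 1e-8 rad/yr about the z (polar) axis.
r = [6371.0e3, 0.0, 0.0]
omega = [0.0, 0.0, 1.0e-8]

v_matrix = matvec(rotation_block(r), omega)  # linear (matrix) form
v_cross = cross(omega, r)                    # direct cross product
print(v_matrix, v_cross)
```

Writing the velocity as a matrix acting on Ω is what makes the Euler pole components linear unknowns in the inversion.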
2.2. Interseismic Model: Earthquake Cycle
 To motivate the formulation of our interseismic model, we begin by reviewing several related 2-D models of the earthquake cycle as illustrated in Figure 3a. Perhaps the simplest earthquake cycle model is Savage and Burford's elastic half-space model in which interseismic elastic deformation near a locked fault is modeled with a buried screw dislocation. This model assumes deep interseismic fault creep is steady in time and neglects rapid postseismic afterslip and mantle flow. Savage and Prescott's viscoelastic coupling model (and the analogous dip-slip version [Savage, 1983]) explicitly incorporates mantle flow in a Maxwell viscoelastic half-space underlying a dislocated elastic plate. The Savage coupling models generate a time-varying surface velocity field by imposing periodic earthquakes on the fault. Figure 3b shows velocity profiles predicted by Savage and Prescott's model at various times in an earthquake cycle for a range of parameters. Here T is the earthquake recurrence time, t is the observation time (or time since the last earthquake), and tR is the viscoelastic relaxation time (tR = 2η/μ, where η is viscosity and μ is elastic shear modulus). For a recurrence time of 250 years and μ = 30 GPa, the ratios T/tR = 100, 10, 2, and 0.1 correspond to viscosities of 10¹⁸, 10¹⁹, 5 × 10¹⁹, and 10²¹ Pa s, respectively.
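The correspondence between T/tR and viscosity follows directly from tR = 2η/μ. The short Python sketch below reproduces the arithmetic; the function name and the use of a 365.25 day year are assumptions made for illustration.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed year length in seconds

def viscosity(T_years, ratio, mu=30e9):
    """Viscosity (Pa s) implied by a given T/tR, using tR = 2*eta/mu."""
    t_R = T_years * SECONDS_PER_YEAR / ratio  # relaxation time in seconds
    return 0.5 * mu * t_R

for ratio in (100, 10, 2, 0.1):
    print(f"T/tR = {ratio:>5}: eta = {viscosity(250.0, ratio):.2e} Pa s")
```

With T = 250 years and μ = 30 GPa, this gives roughly 1.2 × 10¹⁸, 1.2 × 10¹⁹, 5.9 × 10¹⁹, and 1.2 × 10²¹ Pa s, consistent with the rounded values quoted above.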
 The surface velocity patterns vary widely with the ratio, T/tR, as shown in Figure 3b. Early in an earthquake cycle the surface velocities may exceed the long-term plate rate, and late in the cycle the velocities are everywhere lower than the plate rate. This temporal variation is stronger for larger T/tR ratios. For T/tR < 1 (recurrence time shorter than relaxation time), the surface velocities are nearly steady in time and indistinguishable from the velocity profile predicted by Savage and Burford's  elastic model. The creep rate below the locking depth is assumed to be constant and equal to the long-term slip rate in Savage and Prescott's  model. Johnson and Segall  incorporated stress-driven creep in the coupling model that produces time-variable creep rates. In this work, we build earthquake cycles into our 3-D model in a manner similar to those of Savage and Prescott , Savage , and Johnson and Segall .
 Our model for interseismic deformation is constructed by discretizing the faults into many rectangular dislocation patches that are slipped backward to reduce the long-term slip rate to the interseismic slip rate. This is achieved through a boundary element formulation in which patches are either locked (no slip) or creeping at constant resistive shear stress. As discussed in more detail below, a simple slip history is assumed for both locked and creeping patches, which results in an approximation to the condition of creep at constant resistive shear stress. The slip history is illustrated in Figure 4. Following Savage and Prescott, backslip on locked patches is imposed to completely cancel the long-term slip rate, $\dot{s}$, during the interseismic period, and periodic sudden slip events of amount $\dot{s}T$ (earthquakes) are imposed at a regular recurrence interval, T. On creeping patches, backslip is imposed at a lower rate than the long-term slip rate to achieve an interseismic creep rate, $\dot{s}_c$, with $\dot{s}_c < \dot{s}$, and periodic slip of amount $(\dot{s} - \dot{s}_c)T$ is imposed. As discussed below, $\dot{s}_c$ is computed to achieve zero instantaneous stressing rate on the fault at the time of observation.
 We first discuss the simplest case in which the relaxation time of the viscoelastic half-space is long relative to the earthquake recurrence times and flow can be considered to be effectively steady in time and independent of earthquake timing. We then discuss the more general case that flow is time variable and earthquake timing cannot be ignored.
2.2.1. Steady Interseismic Deformation
 Distributed interseismic slip can be incorporated into the model relatively easily if we assume that interseismic deformation is steady in time. As illustrated in Figure 3b, it is known that in the case that the ratio of recurrence time to relaxation time is small, T/tR < 1, the surface velocities are nearly steady in time [e.g., Savage and Prescott, 1978] because viscous flow is nearly steady in time. Figure 5 illustrates that the steady velocity field for T/tR = 0.1 can be approximated with the velocity field obtained from summing steady block motion and back slip on a dislocation in an elastic half-space extending from the surface to the depth of the bottom of the elastic plate [see also Meade and Hager, 2005]. Therefore, for relatively large tR (high viscosity), the perturbation due to back slip can be approximated using a dislocation in an elastic half-space.
 For this relatively high viscosity case, the stressing rate on the creeping fault will also be relatively steady in time and therefore creep under constant resistive stress will be nearly steady in time. This is illustrated in Figure 6a using the model of Johnson and Segall  who solved for the creep rate below the locking depth of an infinitely long strike-slip fault in an elastic plate over a viscoelastic half-space. Periodic uniform slip events are imposed on the fault above the locking depth and the fault creeps at constant resistive stress below (freely slipping). Figure 6a shows the slip rate on the fault below the locking depth at different times in an earthquake cycle for two different values of T/tR. For the higher viscosity case, T/tR = 0.5, the slip rate is nearly uniform in time for most of the earthquake cycle. Figure 6b shows the predicted surface velocities at different times.
 In our 3-D model, we follow Johnson and Segall and assume creeping patches on the fault slide at constant resistive shear stress (equivalent to zero shear stress, or zero stressing rate) during the interseismic period. Figure 7 illustrates the concept. We assume that the long-term slip determined by the steady state model described previously occurs at some constant (in time) shear stress (i.e., zero stressing rate, $\dot{\sigma} = 0$). Fault locking is imposed by slipping locked sections of the fault backward at the long-term slip rate to completely cancel slip. Creeping areas surrounding the locked parts of the fault continue to slide at constant shear stress and therefore will also slip backward to satisfy this condition. The back slip distribution and the long-term slip rate distribution are added together to get the interseismic slip rate distribution.
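 This formulation for interseismic slip works well in the block model approach because for a given set of locked patches, there is a linear relationship between observed surface velocities, d, and Euler poles, Ω,
$$\mathbf{d} = \mathbf{d}_{ss} + \mathbf{d}_{bslip} = \left(\mathbf{G}_{ss} + \mathbf{G}_{bslip}\right)\boldsymbol{\Omega}, \tag{2}$$
where dss and dbslip are the steady state and back slip contributions, respectively, the matrix Gbslip is constructed from the solution for a rectangular dislocation in an elastic half-space [Okada, 1992], and Gss is the contribution from the steady state model defined in (1). To derive Gbslip, we first need to define some other matrices. Let there be Np total fault patches, Nc creeping patches, Nℓ = Np − Nc locked patches, and Ne Euler poles. Let Vℓ and Vc be matrices of dimensions Nℓ × 3Ne and Nc × 3Ne, respectively, that give the long-term, steady back slip rates on locked or creeping patches for given Euler pole coordinates,
$$\dot{\mathbf{s}}_{\ell} = \mathbf{V}_{\ell}\,\boldsymbol{\Omega}, \qquad \dot{\mathbf{s}}_{c}^{\,ss} = \mathbf{V}_{c}\,\boldsymbol{\Omega}, \tag{3}$$
where the superscript ss marks the long-term rate on creeping patches, as distinct from the induced back slip rate $\dot{\mathbf{s}}_c$ introduced below.
We define an Np × Np matrix Gσ that relates back slip rate on all fault patches, $\dot{\mathbf{s}}$, to the shear stressing rate on all patches, $\dot{\boldsymbol{\sigma}}$,
$$\dot{\boldsymbol{\sigma}} = \mathbf{G}_{\sigma}\,\dot{\mathbf{s}}. \tag{4}$$
To compute induced backslip on creeping patches due to imposed backslip on locked patches, we define the Nc × Nc matrix G′σ that is obtained by removing the rows and columns of Gσ that correspond to the locked fault patches. The shear stressing rate on creeping patches due to back slip on creeping patches is
$$\dot{\boldsymbol{\sigma}}_{c}^{(c)} = \mathbf{G}'_{\sigma}\,\dot{\mathbf{s}}_{c}.$$
We define the Nc × Nℓ matrix G″σ that relates the stressing rate on creeping patches to backslip on locked patches,
$$\dot{\boldsymbol{\sigma}}_{c}^{(\ell)} = \mathbf{G}''_{\sigma}\,\dot{\mathbf{s}}_{\ell}.$$
We define the matrices Gc and Gℓ that relate the back slip rate on creeping and locked patches to surface velocities,
$$\mathbf{d}_{c} = \mathbf{G}_{c}\,\dot{\mathbf{s}}_{c}, \qquad \mathbf{d}_{\ell} = \mathbf{G}_{\ell}\,\dot{\mathbf{s}}_{\ell}.$$
The back slip rate on creeping patches induced by backslip on locked patches is the slip rate distribution, $\dot{\mathbf{s}}_c$, that satisfies the condition of zero net stressing rate on the creeping patches,
$$\mathbf{G}'_{\sigma}\,\dot{\mathbf{s}}_{c} + \mathbf{G}''_{\sigma}\,\dot{\mathbf{s}}_{\ell} = \mathbf{0}, \qquad \text{so that} \qquad \dot{\mathbf{s}}_{c} = -\mathbf{G}'^{-1}_{\sigma}\,\mathbf{G}''_{\sigma}\,\dot{\mathbf{s}}_{\ell}.$$
Then $\mathbf{d}_{bslip} = \mathbf{d}_c + \mathbf{d}_\ell = \mathbf{G}_c\dot{\mathbf{s}}_c + \mathbf{G}_\ell\dot{\mathbf{s}}_\ell$ can be written in terms of the Euler poles,
$$\mathbf{d}_{bslip} = \left(\mathbf{G}_{\ell} - \mathbf{G}_{c}\,\mathbf{G}'^{-1}_{\sigma}\,\mathbf{G}''_{\sigma}\right)\mathbf{V}_{\ell}\,\boldsymbol{\Omega}.$$
Therefore, the second matrix in equation (2) that relates interseismic surface velocities to Euler poles is
$$\mathbf{G}_{bslip} = \left(\mathbf{G}_{\ell} - \mathbf{G}_{c}\,\mathbf{G}'^{-1}_{\sigma}\,\mathbf{G}''_{\sigma}\right)\mathbf{V}_{\ell}.$$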
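The partition of Gσ into G′σ and G″σ, and the solve for the back slip induced on creeping patches, can be illustrated with a toy example. The Python sketch below is not the authors' implementation: the 3 × 3 stress kernel, the choice of locked patch, and the back slip rate are hypothetical values chosen only to show the mechanics of requiring zero net stressing rate on creeping patches.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def creeping_backslip(G_sigma, locked, s_locked):
    """Back slip on creeping patches that zeroes their stressing rate.

    G_sigma  : Np x Np kernel (stressing rate per unit back slip rate)
    locked   : indices of locked patches
    s_locked : imposed back slip rates on the locked patches
    """
    creeping = [i for i in range(len(G_sigma)) if i not in locked]
    G1 = [[G_sigma[i][j] for j in creeping] for i in creeping]  # G'_sigma
    G2 = [[G_sigma[i][j] for j in locked] for i in creeping]    # G''_sigma
    rhs = [-sum(g * s for g, s in zip(row, s_locked)) for row in G2]
    return creeping, solve(G1, rhs)

# Hypothetical kernel: slip on a patch unloads itself and loads its neighbors.
G_sigma = [[-2.0, 1.0, 0.2],
           [1.0, -2.0, 1.0],
           [0.2, 1.0, -2.0]]
creeping, s_c = creeping_backslip(G_sigma, locked=[0], s_locked=[-10.0])
print(creeping, s_c)
```

In this toy case the induced back slip is largest on the creeping patch adjacent to the locked patch and smaller farther away, so creep is pinned most strongly next to the locked region.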
 This method is related to a method for analyzing volcano deformation developed by Yun et al.  for determining the distribution of sill/dike opening assuming a uniform overpressure. Yun et al.  conducted a nonlinear optimization for the distribution of locked and opened patches. Similar forward models have been applied to fault creep including Bürgmann et al. .
2.2.2. Nonsteady Creep at Constant Resistive Stress
 To the extent that the stressing rate on creeping areas of faults is modulated by time-variable flow in the asthenosphere, the fault creep rate must also vary with time. For example, we expect rapid creeping during the postseismic phase following large earthquakes and slower creep rates later in the earthquake cycle, as illustrated by Johnson and Segall's  model in Figure 6. Figure 8a shows more results of Johnson and Segall's  model for creep rate below a fault locked down to 10 km depth in an elastic plate of thickness 20 km for different ratios of recurrence time to asthenosphere relaxation time, T/tR. As the ratio T/tR is increased, the creep rates vary more significantly throughout an earthquake cycle.
 It would be computationally expensive to implement Johnson and Segall's stress-driven creep method in our 3-D earthquake cycle model because this would require discretizing the slip in space and time and integrating over multiple earthquake recurrence times. Instead, we now develop a method to approximate time-varying fault creep. We first formulate the 2-D problem for an infinitely long strike-slip fault. The creep rate on the fault is a function of the stress on the fault at any given time, which depends on the history of slip on the fault and the associated viscous flow. Let T be the recurrence time of earthquakes, let $\dot{s}$ be the long-term fault slip rate, and let $\dot{s}_c(t, z)$ be the instantaneous creep rate at time t and depth z on the fault. Following Johnson and Segall, we require that the total stress is zero on the creeping part of the fault at some time, t, after initiation of sliding at t = −∞,
where $t_{eq}$ is the time of the last earthquake, $g_e^U(t - t_{eq}, z, T)$ gives the stress at depth z and time t due to imposed periodic earthquakes above the locking depth, and $g_c^L(t - t_0, z_0, z)$ gives the stress at depth z and time t due to slip below the locking depth at depth $z_0$ and time $t_0$. In this notation, the subscripts on g denote either creep (c) or earthquake slip (e) and the superscript denotes slip on either the upper (U) or lower (L) part of the fault (above or below the locking depth). The Green's functions, g, are computed for a point dislocation in an elastic plate over a Maxwell viscoelastic half-space. Johnson and Segall solved this equation with a boundary element approach by expressing fault slip with discrete values in space and time. The approach is tractable for a single two-dimensional fault and a single recurrence time, but would be prohibitively costly to compute for many three-dimensional faults with different earthquake recurrence times.
 In this work, instead of solving equation (15) at all times as in the study by Johnson and Segall, we adopt an efficient approximate solution that is computed using the stressing rate only at the current observation time. The approach is to assume a simple slip history, illustrated in Figure 9, in which the fault slips during earthquakes uniformly from the surface to the locking depth by amount $\dot{s}T$ and then tapers from the locking depth down to zero at the bottom of the plate (the tapering is not imposed, but is solved for in the boundary element calculation). Between earthquakes, the fault creeps at a constant rate, $\dot{s}_c(z)$. The interseismic creep rate, $\dot{s}_c(z)$, is computed by solving the equation below, which assumes zero instantaneous stressing rate on the fault at the time of observation, (t − teq),
where the dot denotes time derivative, $g_e^U$ is the same as above, $\dot{g}_e^L(t - t_{eq}, \xi, z, T)$ is the stressing rate at depth z and time t − teq due to periodic coseismic slip at depth ξ below the locking depth every T years, $\dot{g}_c^L(\xi, z)$ is the instantaneous stressing rate at depth z due to steady creep below the locking depth at depth ξ, and $\dot{s}_c(\xi)$ is the yet-to-be-determined interseismic creep rate. The first integral in equation (16) is the contribution to the instantaneous stressing rate due to steady creep below the locking depth, and the second integral in equation (16) is the instantaneous stressing rate due to periodic coseismic slip below the locking depth of amount $(\dot{s} - \dot{s}_c(\xi))T$.
 We discretize the fault below the locking depth into N patches of equal size with centers, $z_i$, and uniform slip rate, $\dot{s}_c(z_i)$. Then from equation (16), the stressing rate on the jth patch is zero,
Let $\mathbf{G}_e^U$ be a vector of stressing rates, $\dot{g}_e^U$, on all N patches, and let $\mathbf{G}_c^L$ and $\mathbf{G}_e^L$ be N × N matrices of stressing rates on the patches due to slip on all patches. Then,
where $\dot{\mathbf{s}}_c$ is an N × 1 vector of creep rates on all patches, and $\dot{\mathbf{s}}$ is an N × 1 vector with each entry equal to $\dot{s}$. The vector of creep rates is then,
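For orientation, one way the discretized zero-stressing-rate condition and its solution might be written is sketched below. This is a reconstruction under stated assumptions, not the published form: it assumes that the entries of $\mathbf{G}_e^U$ and $\mathbf{G}_e^L$ are stressing rates per unit coseismic slip, that the entries of $\mathbf{G}_c^L$ are stressing rates per unit creep rate, and that $\dot{\mathbf{s}}_c$ and $\dot{\mathbf{s}}$ denote the N × 1 vectors of creep rates and long-term slip rates. The normalization and sign conventions of the original equations may differ.

```latex
\begin{align}
\mathbf{0} &= \dot{s}\,T\,\mathbf{G}_e^{U}
  + \mathbf{G}_c^{L}\,\dot{\mathbf{s}}_c
  + T\,\mathbf{G}_e^{L}\left(\dot{\mathbf{s}} - \dot{\mathbf{s}}_c\right), \\
\dot{\mathbf{s}}_c &= -\left(\mathbf{G}_c^{L} - T\,\mathbf{G}_e^{L}\right)^{-1}
  \left(\dot{s}\,T\,\mathbf{G}_e^{U} + T\,\mathbf{G}_e^{L}\,\dot{\mathbf{s}}\right).
\end{align}
```

The three terms in the first line correspond, in order, to periodic earthquakes above the locking depth, steady creep below it, and periodic coseismic slip below it; the second line follows by collecting the terms that multiply the unknown creep rates.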
 The above creep approximation can be incorporated into a 3-D model analogously to the steady-creep formulation in section 2.2.1. The only difference is now the matrix Gσ in equation (4) relates stressing rates on patches to the history of steady backslip and periodic forward slip on patches, and Gσ depends on earthquake timing, T, teq, and relaxation time, tR. The 3-D solution for periodic earthquakes on a dislocation in an elastic plate over a Maxwell viscoelastic substrate is obtained using propagator matrix methods similar to those of Matsu'ura and Sato  and Pollitz et al. .
3. Inversion Method
 The inversion method uses the mixed linear-nonlinear Bayesian formulation described in the study by Fukuda and Johnson . Here we give an overview of the formulation of the inverse method for the particular problem discussed in this paper. In general, we could have many data sets with different error structures. For the San Francisco Bay Area problem we address later in this paper, we have GPS measurements of surface velocities, long-term fault slip rate measurements, and surface creep rate measurements, all of which are related linearly to the Euler pole coordinates. Each fault patch has an unknown binary parameter that indicates whether a patch is creeping or locked. The binary locking parameters are related nonlinearly to the surface observations. The problem is therefore a mixed linear-nonlinear inversion. Earthquake timing and viscosity are assumed to be known for this problem. Here we show briefly how to set up the inversion but we leave the details and derivations to Fukuda and Johnson .
3.1. Observation Equation
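 Let dgps be a vector of GPS velocities, let dslip be a vector of long-term slip measurements, and let dsc be a vector of surface creep rate measurements. Let m be the vector with entries of −1 and +1 that indicate whether a patch is creeping or locked, respectively. The observation equation is given by
$$\begin{bmatrix} \mathbf{d}_{gps} \\ \mathbf{d}_{slip} \\ \mathbf{d}_{sc} \end{bmatrix} = \begin{bmatrix} \mathbf{G}_{gps}(\mathbf{m}) \\ \mathbf{G}_{slip}(\mathbf{m}) \\ \mathbf{G}_{sc}(\mathbf{m}) \end{bmatrix} \boldsymbol{\Omega} + \begin{bmatrix} \boldsymbol{\varepsilon}_{gps} \\ \boldsymbol{\varepsilon}_{slip} \\ \boldsymbol{\varepsilon}_{sc} \end{bmatrix}, \tag{20}$$
where the G are matrices constructed from (2), Ω is a vector of Euler pole coordinates, and the ε are vectors of observation error, which are assumed to follow Gaussian distributions with zero mean and covariance matrices $\sigma^2_{gps}\Sigma_{gps}$, $\sigma^2_{slip}\Sigma_{slip}$, and $\sigma^2_{sc}\Sigma_{sc}$. Here $\sigma^2_{gps}$, $\sigma^2_{slip}$, and $\sigma^2_{sc}$ are unknown scale factors of the covariance matrices that account for the relative weighting of the data sets and unknown model variance. Matrix Ggps is constructed directly from equation (2), and Gsc is constructed from equation (2) by differencing the surface velocities at points immediately on either side of the fault.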
3.2. Bayesian Formulation
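 We adopt a Bayesian approach to estimate m, Ω, $\sigma^2_{gps}$, $\sigma^2_{slip}$, and $\sigma^2_{sc}$. We denote a vector that contains all unknown variances by $\sigma^2 = [\sigma^2_{gps}, \sigma^2_{slip}, \sigma^2_{sc}]^T$ and a vector that contains all the data sets by $d = [d^T_{gps}, d^T_{slip}, d^T_{sc}]^T$. We simplify notation and write equation (20) as
$$\mathbf{d} = \mathbf{G}(\mathbf{m})\,\boldsymbol{\Omega} + \boldsymbol{\varepsilon}. \tag{21}$$
 In the Bayesian approach, the solution to the inverse problem is the joint posterior probability density of the unknown parameters given data, p(m, Ω, σ2∣d). Bayes' theorem states that the posterior probability density function (PDF) is
$$p(\mathbf{m}, \boldsymbol{\Omega}, \sigma^2 \mid \mathbf{d}) = \frac{p(\mathbf{d} \mid \mathbf{m}, \boldsymbol{\Omega}, \sigma^2)\, p(\mathbf{m}, \boldsymbol{\Omega}, \sigma^2)}{p(\mathbf{d})}, \tag{22}$$
where p(d∣m,Ω,σ2) is the PDF of data given the model parameters, which accounts for the theoretical data-parameter relationship, p(m,Ω,σ2) is the prior PDF of the model parameters, and the denominator is a constant that normalizes p(m,Ω,σ2∣d) and is independent of m, Ω, and σ2.
 The linear and the nonlinear parameters in the joint posterior distribution can be separated using an identity for joint probability,
$$p(\mathbf{m}, \boldsymbol{\Omega}, \sigma^2 \mid \mathbf{d}) = p(\boldsymbol{\Omega} \mid \mathbf{d}, \mathbf{m}, \sigma^2)\; p(\mathbf{m}, \sigma^2 \mid \mathbf{d}). \tag{23}$$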
The first distribution on the right hand side, p(Ω∣d,m,σ2), is a Gaussian distribution because m and σ2 are specified. The mean and variance of the first distribution can be obtained with least squares. The second distribution of the right hand side, p(m,σ2∣d), is a non-Gaussian distribution and consequently there are no analytical expressions for the mean and variance. We employ a Markov Chain Monte Carlo method to generate samples from the second distribution.
3.3. Probabilistic Models for Data and Priors
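 We construct the matrix of covariances
$$\boldsymbol{\Sigma}_d = \begin{bmatrix} \sigma^2_{gps}\boldsymbol{\Sigma}_{gps} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \sigma^2_{slip}\boldsymbol{\Sigma}_{slip} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \sigma^2_{sc}\boldsymbol{\Sigma}_{sc} \end{bmatrix}. \tag{24}$$
Then equation (21) and the assumption of Gaussian errors indicate that the probability density of d given m, Ω, and σ2 is a Gaussian distribution with mean G(m)Ω and covariance matrix Σd:
$$p(\mathbf{d} \mid \mathbf{m}, \boldsymbol{\Omega}, \sigma^2) = (2\pi)^{-N_d/2}\, \lvert\boldsymbol{\Sigma}_d\rvert^{-1/2} \exp\!\left[-\tfrac{1}{2}\left(\mathbf{d} - \mathbf{G}(\mathbf{m})\boldsymbol{\Omega}\right)^{T} \boldsymbol{\Sigma}_d^{-1} \left(\mathbf{d} - \mathbf{G}(\mathbf{m})\boldsymbol{\Omega}\right)\right], \tag{25}$$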
where Nd is the length of d.
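 There is no reason to expect the priors on model parameters to be correlated with priors on data weights, so we assume
$$p(\mathbf{m}, \boldsymbol{\Omega}, \sigma^2) = p(\mathbf{m})\, p(\boldsymbol{\Omega})\, p(\sigma^2).$$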
Usually, we do not have any prior knowledge on σ2, so we assume uniform prior distributions for these parameters. We have no prior information for Ω in the following application, so for simplicity we assume a uniform prior. We introduce prior information that the distribution of locked patches is smooth to some degree,
where k is a normalizing constant and
where Np is the number of fault patches, j denotes the indices of the patches neighboring the ith patch, and β is a weighting parameter. The parameter β is specified in the inversion, not solved for. Because m is binary, with entries of −1 and +1, Σ_j m_i m_j is maximum when the signs of all m_j are the same as the sign of m_i. The prior favors models where neighboring patches are all locked or all unlocked, and therefore we refer to β as the “connectivity parameter”.
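The effect of the connectivity parameter can be illustrated with a short sketch. It assumes an Ising-like log prior, log p(m) = β Σ_i Σ_j m_i m_j + constant, on a rectangular grid of patches with four-connected neighbors; the exact functional form and any normalization by Np used in the inversion may differ, and the function names are hypothetical.

```python
import numpy as np


def neighbor_sum(m_grid):
    """Sum over patches i and their 4-connected neighbors j of m_i * m_j.

    Each neighboring pair is counted twice (once from each side),
    matching a double sum over i and j.
    """
    horizontal = np.sum(m_grid[:, :-1] * m_grid[:, 1:])
    vertical = np.sum(m_grid[:-1, :] * m_grid[1:, :])
    return 2.0 * (horizontal + vertical)


def log_connectivity_prior(m_grid, beta):
    """Unnormalized log prior under the assumed Ising-like form."""
    return beta * neighbor_sum(m_grid)
```

With β = 0 every locking pattern has the same prior probability; as β increases, uniform blocks of locked or creeping patches are favored over checkerboard-like patterns.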
3.4. Monte Carlo Sampling of p(m,σ²∣d)
 Consider the second distribution on the right-hand side of (23). From Bayes’ theorem, the following proportionality can be stated,
$$p(m,\sigma^2 \mid d) \propto p(d \mid m,\sigma^2)\,p(m,\sigma^2).$$
Assuming independent priors and a uniform prior distribution for σ²,
where Nd is the length of d, and Ω* is the least squares solution that minimizes
$$(d - G(m)\Omega)^T\,\Sigma_d^{-1}\,(d - G(m)\Omega).$$
We build a discrete representation of the distribution p(m,σ²∣d) with a collection of samples using a Monte Carlo-Metropolis algorithm described step-by-step in the study by Fukuda and Johnson .
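To make the sampling idea concrete, the following is a generic Metropolis sketch for a binary locking vector and positive variance weights. It is not the step-by-step algorithm of Fukuda and Johnson; the move types (flip one randomly chosen patch, or take a Gaussian random-walk step in σ²), the step size, and the user-supplied `log_target` function are all assumptions made for illustration.

```python
import numpy as np


def metropolis_binary(log_target, m0, sigma2_0, n_steps, step=0.1, seed=0):
    """Sample (m, sigma2) from an unnormalized log density log_target(m, sigma2).

    m has entries of -1 (creeping) or +1 (locked); sigma2 must stay positive.
    Both proposals are symmetric, so the plain Metropolis ratio applies.
    """
    rng = np.random.default_rng(seed)
    m = np.array(m0, dtype=int)
    s2 = np.array(sigma2_0, dtype=float)
    logp = log_target(m, s2)
    samples_m, samples_s2 = [], []
    for _ in range(n_steps):
        m_new, s2_new = m.copy(), s2.copy()
        if rng.random() < 0.5:
            i = rng.integers(m.size)          # pick one patch and flip its state
            m_new[i] = -m_new[i]
        else:
            s2_new = s2 + step * rng.standard_normal(s2.size)
        if np.all(s2_new > 0):                # proposals outside the support are rejected
            logp_new = log_target(m_new, s2_new)
            if np.log(rng.random()) < logp_new - logp:
                m, s2, logp = m_new, s2_new, logp_new
        samples_m.append(m.copy())
        samples_s2.append(s2.copy())
    return np.array(samples_m), np.array(samples_s2)
```

In an application, `log_target` would combine the log prior on m with the log of the data PDF after Ω has been handled by least squares; early samples would normally be discarded as burn-in.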
3.5. Least Squares Estimation of p(Ω∣d,m,σ²)
 Consider the first distribution on the right-hand side of (23). From Bayes’ theorem, the posterior distribution p(Ω∣d,m,σ²) is written as
$$p(\Omega \mid d,m,\sigma^2) \propto p(d \mid \Omega,m,\sigma^2)\,p(\Omega).$$
Since p(d∣Ω,m,σ²) is given by (25), as long as p(Ω) is uniform or Gaussian, p(Ω∣d,m,σ²) is a Gaussian distribution and the mean and variance can be computed using standard least squares methods.
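For the uniform-prior case, the least squares step can be sketched as follows. The posterior mean is the weighted least squares solution and the posterior covariance is the inverse of Gᵀ Σd⁻¹ G; the function name `least_squares_posterior` is hypothetical, and G is taken as already evaluated for a fixed m.

```python
import numpy as np


def least_squares_posterior(d, G, cov):
    """Mean and covariance of the Gaussian p(Omega | d, m, sigma^2),
    assuming a uniform prior on Omega and a symmetric data covariance."""
    cov_inv_G = np.linalg.solve(cov, G)        # Sigma_d^{-1} G
    precision = G.T @ cov_inv_G                # G^T Sigma_d^{-1} G
    post_cov = np.linalg.inv(precision)
    post_mean = post_cov @ (cov_inv_G.T @ d)   # (G^T S^-1 G)^-1 G^T S^-1 d
    return post_mean, post_cov
```

A Gaussian prior on Ω would add its inverse covariance to `precision` and a corresponding term to the right-hand side; that extension is omitted here.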
3.6. Marginal Posterior Distributions and Slip Rate Estimate
 Because we obtain a discrete representation of p(m,σ²∣d), with each sample having probability 1/Ns (Ns is the number of samples), the full posterior PDF, p(Ω∣m,d,σ²), is approximated as a collection of Ns continuous Gaussian distributions.
 The marginal posterior probability distribution of Euler poles is
$$p(\Omega \mid d) \approx \frac{1}{N_s}\sum_{i=1}^{N_s} p(\Omega \mid d, m_i, (\sigma^2)_i),$$
where i denotes the ith sample of p(m,σ²∣d). Thus p(Ω∣d) is a sum of continuous Gaussian distributions, p(Ω∣d,m_i,(σ²)_i).
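Because the marginal is an equally weighted sum of Gaussians, its overall mean and covariance follow from the component means and covariances. The sketch below shows that bookkeeping; the function name and array layout (one row per Monte Carlo sample) are assumptions.

```python
import numpy as np


def mixture_mean_cov(means, covs):
    """Mean and covariance of an equally weighted Gaussian mixture.

    means: array of shape (Ns, n); covs: array of shape (Ns, n, n).
    """
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)
    mean = means.mean(axis=0)
    dev = means - mean
    # within-component covariance plus the spread of the component means
    cov = covs.mean(axis=0) + np.einsum("ni,nj->ij", dev, dev) / means.shape[0]
    return mean, cov
```

The second term matters whenever different locking patterns imply different Euler poles; ignoring it would understate the posterior uncertainty.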
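 Similarly, we can construct the posterior marginal distribution of fault slip rates, p(ṡ∣d), writing ṡ for the vector of fault slip rates,
$$p(\dot{s} \mid d) \approx \frac{1}{N_s}\sum_{i=1}^{N_s} p(\dot{s} \mid d, m_i, (\sigma^2)_i),$$
where the Gaussian distribution, p(ṡ∣d,m_i,(σ²)_i), is obtained by linear error propagation of p(Ω∣d,m_i,(σ²)_i) through the relationship defined in (3).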
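Linear error propagation for one sample can be sketched as below. The matrix `A` is a hypothetical stand-in for the linear relationship in (3) between Euler pole parameters and fault slip rates; its actual entries depend on the fault geometry and are not reproduced here.

```python
import numpy as np


def propagate_linear(A, omega_mean, omega_cov):
    """Map a Gaussian in Omega to a Gaussian in slip rate through s = A @ Omega."""
    A = np.asarray(A, dtype=float)
    slip_mean = A @ omega_mean
    slip_cov = A @ omega_cov @ A.T
    return slip_mean, slip_cov
```

Applying this to every sample and combining the results as an equally weighted mixture gives a discrete approximation to the marginal distribution of slip rates.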
4. Synthetic Validation
 We now conduct an inversion of synthetic data to illustrate and validate the inversion method outlined in the previous section. Figure 10 shows a synthetic data set generated by computing velocities at regularly spaced points with a forward model consisting of three blocks bounded by three faults with slip rates and locking distribution illustrated in the figure. For simplicity, the steady interseismic model is used here so that there is no need to introduce viscosity and earthquake timing parameters. In fact, all inversions in this paper are computed assuming known viscosity and earthquake timing parameters so that Green's functions need not be computed many times (this is the computationally expensive part of the model). Noise selected from a Gaussian distribution is added to the computed velocities. We illustrate a joint inversion of two data sets. We add uncorrelated noise with standard deviation of 0.5 mm/yr to data set 1 (Figure 10). We add uncorrelated noise with standard deviation of 1.5 mm/yr to data set 2, which is more densely spaced but has more limited spatial coverage than data set 1.
 To conduct the inversion, we assume nothing is known a priori about the relative uncertainties of the two data sets; we assign uncorrelated data errors with standard deviation of 1 mm/yr to both data sets, and the inversion is working correctly if data weights of σ₁² = (0.5)² and σ₂² = (1.5)² are recovered. We invert for the unknown data variances, locations of locked and creeping patches, and the Euler pole coordinates (fault slip rates) using the mixed linear-nonlinear Monte Carlo method described in the previous section. We conduct the inversion for three connectivity conditions (see section 3.3): no connectivity (β = 0), moderate connectivity (β = 0.5), and high connectivity (β = 1.0).
Figure 11 shows the posterior probability distributions of data weights (expressed as the square root of variance) and fault slip rates with the true values marked by vertical lines for the β = 0.5 case. The true values are generally recovered by the inversion, with the exception that the estimated slip rate for fault 3 is about 2 mm/yr higher than the true value and the mean of the distribution for σ₂ is shifted slightly above the true value. The slight bias in the estimates of some parameters seems to be a result of adding noise to finite-sized data sets. We have conducted this inversion many times with different noise drawn from the same distributions; the inversion always results in slightly biased values for some parameters, but which parameters are biased, and by how much, depends on the particular noise realization.
Figure 12 shows the recovered locking distributions for the three smoothing conditions. The mean distribution is the average of binary values for all of the samples collected in the Monte Carlo inversion (−1 is creeping, +1 is locked). A mean near −1 indicates high probability that the patch is creeping, a mean near +1 indicates high probability that the patch is locked, and a mean near zero indicates that the probability the patch is locked is equal to the probability that the patch is creeping and therefore the locking parameter is poorly resolved. The “most probable” solution is shown in the right column of Figure 12. The “most probable” model is computed from the marginal distribution of the locking parameter for each patch; patches with a mean locking parameter greater than zero are shown as locked and patches with a mean value less than zero are shown as creeping. The corresponding mean and most probable interseismic slip distributions are shown in Figure 13. The mean and most probable slip distributions are not the same because the posterior distributions are non-Gaussian.
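The two summaries described above can be computed directly from the Monte Carlo samples of m, as in this sketch. The function name is hypothetical, and assigning a patch with a mean of exactly zero to the creeping state is an arbitrary tie-breaking assumption, since the text does not specify that case.

```python
import numpy as np


def summarize_locking(samples_m):
    """Summaries of binary locking samples (rows = samples, columns = patches).

    Returns the mean locking parameter, the implied probability of locking,
    and the per-patch "most probable" state (+1 locked, -1 creeping).
    """
    samples_m = np.asarray(samples_m, dtype=float)
    mean_locking = samples_m.mean(axis=0)
    prob_locked = 0.5 * (mean_locking + 1.0)   # mean of +/-1 values mapped to [0, 1]
    most_probable = np.where(mean_locking > 0, 1, -1)
    return mean_locking, prob_locked, most_probable
```

A mean of 0.4, for example, corresponds to a 70% probability that the patch is locked.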
 The results of the synthetic inversion illustrate an important result regarding resolution of the distribution of creeping and locked patches. In the case of no smoothing or moderate smoothing, the locking parameter on patches that are locked in the true model is not well resolved (the mean of the locking parameter is between 0 and 0.4), but the locking parameter on patches that creep in the true model is well resolved, especially near the surface (the mean of the locking parameter in creeping areas is near −1). Locking parameters on locked patches are poorly resolved, while those on creeping patches are well resolved, because a single locked patch reduces the creep rate over an area much larger than a single patch. Thus, a large, effectively locked region can be produced with a relatively small number of locked patches distributed across the locked zone, and there are many possibilities for the distribution of the small number of locked patches. Conversely, a single locked patch in an otherwise creeping region reduces slip over a large area and therefore the creep rate is highly sensitive to the location of locked patches.
5. Application to San Francisco Bay Area
 The portion of the Pacific-North American plate boundary zone in the San Francisco Bay Area, CA, is a complex zone of primarily strike-slip faults as shown in Figure 14a. To the north of San Francisco are the subparallel San Andreas, Maacama-Rodgers Creek, and Concord-Green Valley faults. Within the Bay Area, slip is distributed among a number of nonparallel strike-slip faults. South of the Bay Area most of the slip occurs on the creeping section of the San Andreas fault. Estimates of strike-slip rates summarized by Working Group on California Earthquake Probabilities (WGCEP)  are shown in Figure 14b. The fault zone in the San Francisco Bay Area is thought to account for about 40 mm/yr of the 50 mm/yr of total relative transverse motion between the Pacific and North American plates.
 Many Bay Area faults display surface creep [e.g., McFarland et al., 2007]. Faults that are known to creep significantly at the surface are shown in Figure 14b with a dashed red line, while faults that do not exhibit surface creep are shown with dashed yellow lines. Surface creep rate measurements summarized by McFarland et al.  are shown in Figure 14c.
 In this study we constrain long-term fault slip rates and interseismic locking distribution with GPS-derived velocities, surface creep rate measurements, and long-term slip rate measurements.
 We use the BAVU GPS-derived velocities in the study by d’Alessio et al.  and shown in Figure 14a. The velocities are in a North-America-fixed reference frame. Following d’Alessio et al. , we incorporate GPS-derived velocities from throughout the Pacific (PA) and North American (NA) plates to constrain the total relative motion across Bay Area faults (see d'Alessio et al. [2005, Figure 1] for the distribution of global stations used in the analysis).
 We use estimates of long-term fault slip rates as summarized by WGCEP  and shown in Figure 14b. Although the slip rate ranges shown in Figure 14b are typically regarded as strict bounds, we assume the mean and range represent the mean and standard deviation of a Gaussian distribution. This is a conservative use of the data in the sense that parameter values are permitted outside of the bounds in our inversions.
 The surface creep rate measurements shown in Figure 14c were compiled by McFarland et al.  from two decades of theodolite measurements across Bay Area faults. The geometry of block-bounding faults is adopted from d’Alessio et al. , who established fault geometry using mapped surface traces of faults, relocated microseismicity, topographic lineaments, and interpreted geologic cross sections. The surface traces of the vertical strike-slip model faults are shown in Figure 14. Unlike the study by d'Alessio et al. , our block geometry does not include a boundary along the Eastern California Shear Zone (ECSZ) that separates the Sierra Nevada/Great Valley block (as established by Argus and Gordon ) and the Basin and Range; we assume that the interseismic deformation associated with the Basin and Range and ECSZ, which are 300 km or more from Bay Area faults, is not recorded in the BAVU network.
 All faults are discretized into 4.2 × 4.2 km patches that extend from the surface to 25 km depth, the base of the elastic plate. We solve for the distribution of locked and creeping patches, data weights, and Euler pole coordinates (which are used to compute the long-term and interseismic slip rates on patches). The timing of past earthquakes is assumed to be known and the values for recurrence time and date of the last earthquake are shown in Figure 14b. These values are taken from UCERF 2, the Uniform California Earthquake Rupture Forecast, Version 2 [Field et al., 2007].
5.3.1. Connectivity Parameter
 To decide on a preferred connectivity parameter, β (equation (27)), we compare the inversion results for a range of values of β for the elastic half-space case. The locking distribution and interseismic creep rates are shown in Figure 15 for three different connectivity conditions. The locking distributions and interseismic creep rates inferred from the three inversions are similar for the top row of patches at the ground surface. However, the locking distributions and creep rates below the first row (depths greater than 4.2 km) vary in detail. There are a number of salient features common to all three connectivity conditions: (1) the peninsula segment of the SAF is largely locked from the surface to a depth of about 8.4 km (the bottom of the second row of patches), (2) the southern SAF is creeping nearly everywhere, (3) the Hayward fault is largely creeping at the southernmost end but there are several locked patches along the central and northern sections, (4) the southern Calaveras (south of the junction with the Hayward) is largely creeping, (5) the Green Valley fault is largely locked apart from two shallow creeping zones, and (6) the Valley Margin and San Gregorio faults are largely locked at all depths. It should be noted that we are not using any GPS data north of Point Reyes (PR on Figure 14), so the locking distribution on the northern San Andreas and the Rodgers Creek faults is unreliable. We have no objective means to select a preferred connectivity parameter. However, from a close comparison of the three mean locking distributions in Figure 15, one can see that this method is unable to resolve locked and creeping regions smaller than several patch lengths (patches are 4.2 × 4.2 km); locking on the very large patches in the high-connectivity case is well resolved, but locking on patches the size of individual cells is not well resolved in the low-connectivity case. Therefore, we consider the locking distribution in the moderate-connectivity case to be the most reliable estimate on the basis that it is not obviously oversmoothed or undersmoothed, as indicated by the resolution of locked and creeping regions.
5.3.2. Locking Distribution and Interseismic Creep Rate
 We now consider only the moderate connectivity case (β = 0.5) and vary the asthenosphere relaxation time. Figure 16 shows the mean locking distribution for relaxation times of 5, 25, 75, 125, and ∞ (elastic) years, corresponding to viscosities of 2 × 10¹⁸, 10¹⁹, 3 × 10¹⁹, 5 × 10¹⁹ Pa s, and ∞, respectively. Figure 17 shows the interseismic creep rates for corresponding models shown in Figure 16.
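The relaxation times and viscosities listed above share a constant ratio of 4 × 10¹⁷ Pa s per year of relaxation time, which the following sketch uses to convert between them. In SI units that ratio is about 1.27 × 10¹⁰ Pa, which would correspond to a shear modulus of about 25 GPa under the convention tR = 2η/μ or about 13 GPa under tR = η/μ; which convention applies is not stated in this section, so the code relies only on the tabulated pairs. The constant and function names are hypothetical.

```python
import math

# Ratio implied by the listed pairs, e.g. 2e18 Pa s / 5 yr = 4e17 Pa s per year.
VISCOSITY_PER_YEAR = 4.0e17


def viscosity_from_relaxation_time(t_r_years):
    """Viscosity (Pa s) for a relaxation time in years; infinite for the elastic case."""
    if math.isinf(t_r_years):
        return math.inf
    return VISCOSITY_PER_YEAR * t_r_years


def relaxation_time_from_viscosity(viscosity_pa_s):
    """Relaxation time (years) for a viscosity in Pa s."""
    return viscosity_pa_s / VISCOSITY_PER_YEAR
```

For instance, the upper bound of 4 × 10¹⁸ Pa s discussed later in the paper corresponds to a relaxation time of 10 years under this ratio.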
 A comparison of results for different relaxation times shows that as the relaxation time is decreased, the inferred locking depth of faults decreases and the inferred interseismic creep rate increases. This is particularly clear on the Peninsula San Andreas, the Rodgers Creek/Hayward fault, and the Greenville/Calaveras fault, but less clear on the San Gregorio and Valley Margin faults where the data are sparse and the slip rates are lower. The contrast between the elastic case and the tR = 5 year case is stark; locked regions extend down to at least 12 km on average in the elastic case but only about 4 km in the tR = 5 year case.
 This relation between relaxation time and locking depth could have been expected from an examination of 2-D model predictions illustrated in Figure 3b. At mid to late times in the earthquake cycle, and given the same locking depth, the horizontal velocity profile is broader for shorter relaxation times. As a consequence, to produce the localized velocity profile associated with a long relaxation time, a model with a shorter relaxation time requires a shallower locking depth. The interseismic creep rates are higher for shorter relaxation times because the area of the creeping parts of the fault is larger. The correlation between locking depth and relaxation time was recognized by Savage and Lisowski .
Figure 18 shows the residual GPS velocities (data minus model) for several different relaxation times. For clarity, the error ellipses are not shown, but the formal standard deviations are 1.5 mm/yr on average. The residual velocities are largely less than 5 mm/yr, suggesting that the model explains most of the data. The residuals are largest immediately adjacent to the creeping Hayward fault, suggesting that the model does not capture well the details of the shallow creep distribution. The residual vectors for the different relaxation times are quite similar over most of the region except along a fault-perpendicular profile near Point Reyes and extending across the San Andreas, Rodgers Creek, and Green Valley faults. Here the residuals for the tR = 5 year case are systematically larger than for the other models and the tR = 5 year case underpredicts the amount of right-lateral motion across the region. This implies that the tR = 5 year model overpredicts the amount of post-1906 relaxation.
 The model fit to the surface creep rates is shown in Figure 19 for the tR = 25 years and elastic cases. The creep measurements (blue) largely fall within the 2σ range of the predicted surface creep rates (gray) for both models.
5.3.3. Slip Rates
 The posterior distributions of long-term fault slip rates are plotted with distance along faults in Figure 20. Table 1 compares results from this study with 3-D elastic block model results from d’Alessio et al.  and various 2-D elastic models and 2-D viscoelastic cycle models.
Slip rates are listed as a 2σ range and mean in mm/yr. Slip rates are grouped by model type. The average of results from Freymueller et al. , Savage et al. , and Prescott et al.  is reported for 2-D elastic models. Results of d'Alessio et al.  are reported for 3-D elastic model. Average of results from Segall  and Johnson and Segall  is reported for 2-D viscoelastic models. Results of this study are listed under 3-D viscoelastic model. The “sum” columns are the sum of mean slip rates across the northern Bay Area (first “sum” column, SAF N + RC + Cn/GV) and the southern Bay Area (second “sum” column, SG + SAF Pen + Hwd + Cal N + Gville). SAF N, northern section of San Andreas; RC, Rodgers Creek; Cn/GV, Concord-Green Valley; SG, San Gregorio; SAF pen, peninsula segment of San Andreas; Hwd, Hayward; Cal N and Cal S, northern and southern segments of Calaveras; Gville, Greenville.
 The slip rate estimates from 2-D and 3-D cycle models for the northern San Andreas fault are systematically higher than the slip rate estimates from 2-D and 3-D elastic models by about 5 mm/yr. The elastic models predict slip rates of about 16–22 mm/yr for the northern San Andreas fault, which is at or below the low end adopted by WGCEP . The cycle models predict slip rates of about 20–28 mm/yr for the northern San Andreas fault, which is similar to the 21–27 mm/yr adopted by WGCEP .
 For the Rodgers Creek fault, the 3-D elastic and cycle models predict slip rates of 5–10 mm/yr, similar to the 7–11 mm/yr adopted by WGCEP . Elastic and cycle 2-D models predict higher slip rates of 8–15 mm/yr.
 Our 3-D elastic and cycle model estimates of slip rate on the Concord/Green Valley segments are higher than all previous 2-D and 3-D model estimates. Previous studies infer slip rates of 5–10 mm/yr for the Concord/Green Valley segments, somewhat higher than the 2–8 mm/yr adopted by WGCEP . However, we infer higher slip rates of 6–16 mm/yr. It is not clear to us why our estimates are systematically higher than previous slip rate estimates for these segments.
 All models predict slip rates of 1–7 mm/yr for the San Gregorio fault, consistent with the 1–5 mm/yr adopted by WGCEP . For the peninsula segment of the San Andreas fault, the 3-D elastic models and the 3-D cycle models with relaxation times of 75 and 125 years predict slip rates of 16–19 mm/yr, consistent with the 13–21 mm/yr adopted by WGCEP . The 3-D cycle models with shorter relaxation times of 5 and 25 years predict higher slip rates of 20–24 mm/yr.
 All models predict slip rates of 6–10 mm/yr for the Hayward fault and 9–16 mm/yr for the Calaveras segments which are similar to the 7–11 and 12–18 mm/yr adopted by WGCEP . All models predict slip rates of 2–7 mm/yr for the Greenville fault which is somewhat higher than the 1–3 mm/yr adopted by WGCEP .
Table 1 also tabulates the sum of mean slip rates across subparallel faults in the northern and southern Bay Area. The first “sum” column is the sum of slip rates across the northern Bay Area along the northern San Andreas, Rodgers Creek, and Concord/Green Valley faults. We infer 42–46 mm/yr of total slip across the northern Bay Area, less than the ∼50 mm/yr of total relative motion between the Pacific Plate and North America. Hammond and Thatcher  inferred ∼7 mm/yr of total right-lateral offset rate across the northern Walker Lane transition zone between the Basin and Range and Sierra Nevada, which when added to the 42–46 mm/yr of right-lateral motion we infer across the Bay Area gives the full plate rate of ∼50 mm/yr. The sum of our predicted slip rates across faults in the southern Bay Area, including the San Gregorio fault, the peninsula San Andreas fault, the Hayward fault, the northern Calaveras fault, and the Greenville fault, is 39–46 mm/yr, similar to the northern Bay Area.
5.3.4. Moment Accumulation Rates
 Because we estimate the locked surface area of faults and we estimate the long-term fault slip rates, we can compute moment accumulation rate. In Figure 21 we have converted this moment accumulation rate to moment magnitude as a function of recurrence time. The range of inferred recurrence times from trenching studies is shown with a gray box in Figure 21 for segments that have sufficient paleoseismic data [WGCEP, 2002; Field et al., 2007]. There is a wide range in estimated moment magnitudes. Considering the two standard deviation uncertainties, for all segments except the Rodgers Creek/Hayward segment, the range in allowable moment magnitude is at least half a magnitude interval. There is a tendency, although not an exact correlation, for the predicted magnitudes to increase with increasing relaxation time (increasing viscosity) because the locking area increases with relaxation time.
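The conversion from moment accumulation rate to moment magnitude can be sketched as follows. It assumes that all moment accumulated over one recurrence interval is released in a single earthquake, uses the standard relation Mw = (2/3)(log₁₀ M₀ − 9.1) with M₀ in N m, and takes a shear modulus of 30 GPa as a default; that modulus, the function names, and the example numbers are assumptions rather than values from this study.

```python
import math


def moment_rate(locked_area_m2, slip_rate_m_per_yr, shear_modulus_pa=3.0e10):
    """Moment accumulation rate in N m per year: modulus x locked area x slip rate."""
    return shear_modulus_pa * locked_area_m2 * slip_rate_m_per_yr


def moment_magnitude(moment_newton_m):
    """Moment magnitude from seismic moment in N m."""
    return (2.0 / 3.0) * (math.log10(moment_newton_m) - 9.1)


def magnitude_for_recurrence(locked_area_m2, slip_rate_m_per_yr, recurrence_yr,
                             shear_modulus_pa=3.0e10):
    """Magnitude of a single event releasing the moment accumulated over recurrence_yr."""
    accumulated = moment_rate(locked_area_m2, slip_rate_m_per_yr,
                              shear_modulus_pa) * recurrence_yr
    return moment_magnitude(accumulated)


# Hypothetical example: a 100 km x 10 km locked area slipping at 20 mm/yr
# in the long term, with a 200 year recurrence time.
example_mw = magnitude_for_recurrence(100e3 * 10e3, 0.020, 200.0)
```

Because magnitude depends on the logarithm of moment, a factor of two change in locked area or slip rate shifts the magnitude by about 0.2 units, and a factor of 13 by about 0.74 units.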
6. Implications for the Rheology of the Upper Mantle
 The results presented above show that our estimates of fault slip rates, locked areas, and moment accumulation rates depend strongly on the assumed upper mantle relaxation time. The models with different relaxation times fit the GPS and slip rate data nearly equally well, except that the lowest viscosity model (tR = 5 years) underpredicts the amount of right-lateral shift across the northernmost Bay Area. It is not surprising that the current GPS velocity field cannot be used to constrain the viscosity of the upper mantle because these data provide no information on how deformation rates vary over time. Therefore, we re-examine the postseismic deformation transient following the 1906 San Francisco earthquake inferred from historical triangulation measurements. The strain rates computed from the triangulation data are shown in Figure 22 and are taken from Kenner and Segall . Kenner and Segall , Segall , and Johnson and Segall  modeled these data with 2-D models assuming an infinitely long strike-slip fault and found that the data could be explained with an earthquake on a fault in an elastic plate overlying a linear viscoelastic half-space with a short relaxation time of about 5–15 years (viscosity of 2–6 × 10¹⁸ Pa s) or on a fault in an elastic plate or elastic half-space extending into a linear viscous shear zone with viscosity per unit thickness of about 10¹⁷ Pa s/m. In particular, the studies of Kenner and Segall  and Johnson and Segall  suggest that upper mantle viscosity cannot be inferred from these data because the strain rate transient can be explained as entirely due to afterslip in a thin linear viscous shear zone. However, recent studies suggest that coseismic stresses are relaxed by afterslip more quickly than could be explained by a linear viscous shear zone [e.g., Hearn et al., 2002; Perfettini and Avouac, 2004; Johnson et al., 2006]. Recent studies show that afterslip evolution is consistent with slip on a fault obeying a rate-strengthening friction law that relates shear stress, τ, to slip rate, V,
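A standard steady-state form of a rate-strengthening law, consistent with the parameter combination σ(a − b) that appears with equation (40), is given below. The reference friction coefficient μ* and reference slip rate V* are notation assumed here, and the exact form used in the inversion may differ in detail.

```latex
% Assumed steady-state rate-strengthening friction law (a - b > 0):
\tau = \sigma \left[ \mu_* + (a - b)\,\ln\!\left(\frac{V}{V_*}\right) \right]
% Solving for slip rate shows the exponential sensitivity to stress:
V = V_* \exp\!\left[ \frac{\tau - \sigma \mu_*}{\sigma\,(a - b)} \right]
```

In this form the product σ(a − b) sets how strongly the slip rate responds to a stress change: a stress increase equal to σ(a − b) raises V by a factor of e.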
 We re-examine the post-1906 triangulation data assuming afterslip occurs on a fault in an elastic crust overlying a Maxwell viscoelastic mantle. The fault is an infinitely long strike-slip fault and creeps at depth according to the rate-strengthening friction law (40) with σ(a − b) = 0.7 MPa. Six meters of coseismic slip is imposed down to 13 km depth based on inversion results by Song et al. , and the fault creeps below this depth down to the base of the elastic crust which is fixed at 25 km depth [e.g., Lin et al., 2010]. We show the fit to the strain rate data in Figure 22 for a range of mantle viscosities. The data require a mantle viscosity of no greater than 4 × 10¹⁸ Pa s. Examining the elastic half-space model (dashed curve), the afterslip decays rapidly during the first 10 years after the earthquake and afterslip alone cannot account for the continued decay in strain rate for many decades. In this model, the continued decay in strain rate after 10 years requires significant relaxation of the mantle.
 The maximum Maxwell viscosity of 4 × 10¹⁸ Pa s inferred from the postseismic analysis is entirely consistent with mantle viscosities inferred from various geodetic studies of postseismic or interseismic deformation in the western US. Most studies infer lower crustal viscosities of 10¹⁹–10²¹ Pa s and mantle viscosities of 10¹⁸–10¹⁹ Pa s [e.g., Hammond et al., 2009; Thatcher and Pollitz, 2008; Johnson et al., 2007].
 While the post-1906 data are best explained with a mantle viscosity less than 4 × 10¹⁸ Pa s, results of our 3-D cycle models suggest that this viscosity is inconsistent with present-day motions in the Bay Area. As we showed in Figure 18, the model with relaxation time of tR = 5 years (viscosity of 2 × 10¹⁸ Pa s) underpredicts the present-day right-lateral motion across the northern San Francisco Bay Area, indicating that this model generates too much relaxation of the upper mantle 100 years after the 1906 earthquake. Furthermore, as shown in Figure 16, the tR = 5 year model requires very shallow locking depths of 4–5 km for the entire region, which is probably not correct because large strike-slip earthquakes typically rupture to depths of about 15 km.
 Together, the inferences of low mantle viscosities of 4 × 10¹⁸ Pa s during the decades following the 1906 earthquake and the inference of higher viscosities from the present-day data indicate that the average effective viscosity of the mantle may increase with time following earthquakes. This could be explained with either a nonlinear, power law viscous rheology [e.g., Freed and Bürgmann, 2004] or a Burgers body, transient viscosity [e.g., Pollitz, 2003]. This response might also be explained by temperature-dependent mantle viscosity with effective viscosity decreasing with depth [e.g., Riva and Govers, 2009]. The rapid postseismic mantle flow inferred from the post-1906 triangulation data might reflect rapid flow of the lower viscosity mantle at depth while the current deformation rates measured with GPS data reflect flow at lower, more nearly steady rates in the uppermost mantle just below the elastic crust. A model like this was proposed by Johnson et al.  for southern California.
 We have presented forward and inverse methods for inferring long-term fault slip rates, interseismic fault creep rates, and distribution of locked and creeping patches on faults using geodetic and geologic data and a 3-D viscoelastic earthquake cycle model. The forward model consists of fault-bounded blocks in an elastic crust overlying a Maxwell viscoelastic upper mantle. Long-term motions of the blocks are imposed to produce a steady velocity field. Interseismic elastic distortion of the blocks is modeled due to periodic locking and unlocking of faults throughout the earthquake cycle. Periodic earthquakes are imposed on patches that are locked during the interseismic period. An approximation for interseismic creep at constant resistive shear stress was introduced. The approximation obviates the need to integrate the stresses over the history of sliding to compute slip rate by assuming a simple slip history in which the fault slides throughout the earthquake cycle at a steady interseismic rate proportional to the stressing rate at the time of observation. Rapid afterslip during the early phase of the earthquake cycle is lumped into coseismic slip. The inversion method utilizes a Bayesian, probabilistic inversion formulation together with Monte Carlo Markov Chain methods to infer the posterior probability distribution of long-term interseismic fault slip rates, distribution of locked and creeping patches, and relative weighting of multiple data sets.
 We apply the method to the San Francisco Bay Area, CA, using GPS-derived velocities and geologic measurements of fault slip rates. We show that the inferred fault slip rates and areas of the locked regions of faults are sensitive to the assumed viscosity of the upper mantle and the timing of past earthquakes, and can be significantly different from values inferred from elastic models that do not include viscous flow. Considering models with different viscosities, inferred fault slip rates on major Bay Area faults can differ by factors of 1.5–4.0 and the inferred moment accumulation rate can differ by factors of 2 to 13. Some of the slip rate estimates are systematically different from previous 2-D elastic and viscoelastic model estimates and previous 3-D elastic model estimates. Nearly all Bay Area faults display significant along-strike variations in the distribution of locked and creeping patches. The inferred locking depth varies from 0 km (creep at the surface) to 25 km (bottom of elastic crust).
 We have re-examined the post-1906 strain rate transient inferred from triangulation data and find that a model incorporating rate-strengthening afterslip can fit the data only with relatively low mantle viscosity (less than 4 × 10¹⁸ Pa s). Interseismic models with such low mantle viscosity produce systematic misfits to GPS velocities across the northern San Francisco Bay Area, suggesting that a single average mantle viscosity may not be able to fit data at different time periods.