A Comparison of Plasticity Regularization Approaches for Geodynamic Modeling

The emergence, geometry and activation of faults are intrinsically linked to frictional rheology. The latter is thus a central element in geodynamic simulations which aim at modeling the generation and evolution of fault zones and plate boundaries. However, resolving frictional strain localization in geodynamic models is problematic. In simulations, equilibrium cannot always be attained and results can depend on mesh resolution. Spatial and temporal regularization techniques have been developed to alleviate these issues. Herein, we investigate three popular regularization techniques, namely viscoplasticity, gradient plasticity and the use of a Cosserat continuum. These techniques have been implemented in a single framework based on an accelerated pseudo‐transient solution strategy. The latter allows to explore the effects of regularization on shear banding using the same code and model configuration. We have used model configurations that involve three levels of complexity: from the emergence of a single isolated shear band to the visco‐elasto‐plastic stress buildup of a crust. All considered approaches allow to resolve shear banding, provide convergence upon mesh refinement and satisfaction of equilibrium. Viscoplastic regularization is straightforward to implement in geodynamic codes. Nevertheless, more stable shear banding patterns and strength estimates are achieved with computationally more expensive gradient and Cosserat‐type regularizations. We discuss the relative benefits of these techniques and their combinations for geodynamic modeling. Emphasis is put on the potential of Cosserat‐type media for geodynamic applications.

straightforward and the approach has been used in the context of geodynamic modeling Jacquey & Cacace, 2020). However, during dynamic deformation viscoplasticity alone is insufficient to remove mesh sensitivity (Stathas & Stefanou, 2021). Hence, there is a need to investigate the relative benefits of other, in particular spatial regularization techniques for modeling shear banding.
Non-local plasticity is one of the earliest regularization approaches (Bažant & Lin, 1988). It involves the spatial averaging of the plastic strain, which is used for in hardening/softening laws. A characteristic length scale, proportional to the area over which strain is averaged is introduced. With this scheme, the sparsity of the linearized discrete operators is reduced as the length scale is increased. Since this scheme is based on strain averaging, it is a candidate for regularizing mesh sensitivity which arises from material strain softening. Gradient plasticity involves the spatial gradients of the plastic strain, typically second-order gradients (i.e., Laplacian), in the yield function (de Borst & Mühlhaus, 1992). This implies the existence of a multiplier (K g ) in unit Pa ⋅ m 2 and introduces a length scale in the constitutive model. Due to this modification, the return-mapping procedure involves the solution of a partial differential equation and the plastic multiplier rate becomes an additional degree of freedom (DOF).
Spatial regularization can also achieved by employing a Cosserat, or more generally a micro-polar continuum. This approach relies on the inclusion of micro-rotations and can capture the micro-structure of granular or blocky materials. The constitutive relation which relates micro-curvatures and the conjugate couple-stresses introduces a length scale (Mühlhaus & Vardoulakis, 1987;Sabet & de Borst, 2019;Stefanou et al., 2017). Different from viscoplasticity and gradient plasticity, the use of a Cosserat continuum for regularization does not require modification of the yield function, but the second and third stress invariants have to be changed. In contrast, tensor invariants are augmented by the including contributions from micro-curvatures and couple-stresses. Another important difference is the fact that, when using a Cosserat continuum, length scale effects do not only affect the plastic strains, but also their elastic and viscous counterparts. However, the use of a Cosserat continuum is only effective for shear localization, since the rotations are mobilized only in mode-II.
The use of gradient and Cosserat plasticity has so far mostly been limited to engineering and geomechanical modeling (de Borst et al., 1993;Mühlhaus & Vardoulakis, 1987;Sabet and de Borst, 2019;Stefanou et al., 2017). The benefits and drawbacks of each approach have been discussed in independent studies, which involve different model configurations and different simulation tools. While being insightful, comparison of regularization techniques using similar tools and similar model configurations are rare (de Borst et al., 1993).
Here, we investigate the effects of the three main regularization techniques for non-associated frictional plastic deformation (viscoplasticity, second-order gradient, and Cosserat plasticity) in a similar context. We use the finite difference method (FDM) and an accelerated pseudo-transient (PT) solution strategy (Räss et al., 2022), which provides a simple and unified framework to study coupled non-linear systems (Duretz, Räss, et al., 2018;Räss et al., 2019). We also investigate the combination of spatial and temporal regularization approaches. This has been applied in damage mechanics, for example, through the phase-field approach (e.g., Miehe et al., 2010), and has also been proposed for the study of rate-and-state frictional slip (e.g., Pranger et al., 2022) and compaction banding (e.g., Leuthold et al., 2021).

Governing Equations
We consider steady-state deformations of a compressible medium in two-dimensions (2D plane strain). The balance of linear momentum takes the form of: where body force components are products of the density (ρ) and the gravity acceleration vector components (g i ).
The total stress is expressed as σ ij = −pδ ij + τ ij + ϵ ijk R k which contains contributions from the symmetric part of the deviatoric stress tensor (τ ij ), the pressure the pressure (p) and the force-conjugate of the micro-rotation rate . The latter contributions is antisymmetric contribution and is non-zero only in the case of the Cosserat model.
The symbol ϵ ijk represents the Levi-Civita tensor. In the Cosserat model, since the stress tensor is not symmetric, the balance of angular momentum also needs to be considered: where m ij are components of the couple-stress tensor.
The conservation of mass is formulated as where are velocity vector components. The terms e and p correspond to elastic and plastic divergence rates, respectively. In the following, we will solve Equations 1-3 to obtain the velocity vector components, the pressure and the micro-rotation-rate vector , given initial and boundary conditions, as well as the constitutive relationships described below. In the following computations, we will consider fixed normal velocity components to model boundaries, zero shear rate tangential to boundaries and zero micro-rotation rate components at the boundaries.

Constitutive Relationships
In order to capture long-term creep of rocks, strain localization due to frictional plastic deformation and rock elasticity, our rheological model involves contributions from viscosity (v), elasticity (e) and plasticity (p). The deviatoric rheology is based on an additive decomposition of the deviatoric strain rate tensor (̇) : where η is the creep viscosity, G is the shear modulus, ̇ is the rate of the plastic multiplier and Q is the plastic flow potential. For the volumetric rheology, we consider elastic and plastic deformations: where K is the bulk modulus. For Cosserat continuum, the following decompositions are applied as well: where are the curvature rates, l c is the Cosserat length scale, and η c and G c are the Cosserat viscosity and the shear modulus, respectively. ̇ is the rotation rate which contains contributions from the spin (̇) and micro-rotation rate (̇) . The kinematic and semi-discrete constitutive relationships are explicitly given in Appendices A and B.
We use a Drucker-Prager model, and the yield function and the plastic potential are expressed as, respectively: where ϕ is the friction angle, C is the cohesion, ψ is the dilatancy angle and τ II is the square root of the second invariant of the deviatoric stress tensor. The progressive strain-induced decohesion is modeled by applying linear strain softening:̇= where ̇ is the rate of the plastic multiplier and h is a softening modulus. Non-associated plastic flow and material softening can both lead to strain localization including the formation of shear bands (e.g., Rudnicki & Rice, 1975). However, the absence of an internal length or time scale in a standard, rate-independent continuum renders the numerical simulation of plastic shear banding mesh sensitive and causes severe equilibrium convergence issues (Duretz, Souche, et al., 2018;Hageman et al., 2021;Spiegelman et al., 2016). In an attempt to remove these effects, which affect both solution patterns and load-bearing capacities (i.e., effective strength), we investigate three different types of regularization.

Viscoplasticity
Viscoplasticity can be used as a temporal regularization. It relies on the parallel assembly of a frictional (slider) and viscous (dashpot) rheological element; so-called Kelvin element. We use a consistency viscoplastic model for which the yield function is modified in the following manner (Heeres et al., 2002) where η vp is the viscosity of the Kelvin element (unit Pa ⋅ s). The latter quantity is the main parameter used for viscoplastic regularization and the product ̇v p represents the viscoplastic overstress. In the limit, where η vp → 0, F vp → F, and the effects of viscoplasticity vanish.

Gradient Plasticity
The second-order gradient approach can be used as a spatial regularization approach. To this end the yield function expressed as (de Borst & Mühlhaus, 1992): where ɛ p is the accumulated plastic strain and K g (unit Pa ⋅ m 2 ) is the main parameter controlling the second-order gradient regularization. The non-regularized Drucker-Prager yield function (F) is recovered in the limit where K g → 0. It is important to notice that, due to the Laplacian operator, the plastic multiplier rate is a global variable. The plastic multiplier rate is thus an additional global DOF which is obtained by solving the boundary value problem defined by Equation 11. To this end, it is necessary to apply boundary conditions at the boundary of the plastic domain. This is in contrast with the non-regularized (F) and viscoplastic (F vp ) plasticity models in which ̇ is defined locally.

Cosserat Plasticity
With the Cosserat model, there is no need to modify the yield function but for a redefinition of τ II . The continuum model contains a characteristic length scale, which we can exploit for spatial regularization. The yield function is simply expressed as: where τ II is now defined as: Herein, we have taken h 1 = h 2 = h 3 = 1/2 for convenience (de Borst, 1991). In practical cases, however, these parameters should be calibrated based on laboratory experiments .
Like the gradient model, the Cosserat model involves additional DOFs. The latter are the components of the micro-rotation vector (ω i ) and the number of additional DOFs depends on the dimension considered: 1 in 2D, 3 in 3D. It is important to note that the Cosserat continuum model is naturally equipped with an internal length-scale (l c ) that influences the rheological behavior in all deformation stages (elastic, viscous, plastic). Most implementations of the Cosserat model discussed in the literature are formulated in the context of total stress, displacement-based finite element method (FEM). For the purpose of this study we have expressed the Cosserat model using a velocity-pressure formulation (see Appendices A and B) and the resulting equations were discretized using the FDM.

Pseudo-Transient (PT) Solving Strategy
The above equations were discretized with the FDM using a staggered-grid spatial discretization and a backward-Euler temporal discretization. The resulting set of non-linear equations was solved using the accelerated PT method (Räss et al., 2022) which relies on an explicit second-order integration of the non-linear equations in pseudo-time. This approach is particularly well-suited for the solution of coupled non-linear equations (Duretz, Räss, et al., 2018;Räss et al., 2017Räss et al., , 2022 and its flexibility allows to easily incorporate additional equations, combine new elements and explore the effects of coupled physical processes (e.g., Schmalholz et al., 2020). In an accelerated PT solve, damping is applied to each equation that include a Laplace-type operator (Räss et al., 2022).
In the present case, damping was thus applied to the linear momentum balance, the angular momentum balance (for Cosserat continuum) and the return-mapping equation (for gradient regularization). Besides damping, a continuation was applied to the plastic multiplier. This allows to progressively relax the non-linearity due to plasticity throughout the iterative process in a similarly fashion to treatment of power-law viscosity in, for example, Duretz, Räss, et al. (2018). The values of all involved numerical parameters are given in the appended code.
We implement the discretized governing equations using the Julia language (Bezanson et al., 2017). We use the Parallelstencil.jl (Omlin & Räss, 2021b) and ImplicitGlobalGrid.jl (Omlin & Räss, 2021a) Julia packages which permit to write backend-agnostic high-level code for high-performance distributed stencil computations on xPUs, that is, both central processing units and graphical processing units (GPUs). This approach conveniently addresses the two-language problem allowing for the development of a single code that can be used both for prototyping and production purpose. The accelerated PT method provides a fully local and iterative algorithm optimally leverages the processing capabilities of many-core hardware such as GPUs.

Model Configuration
Next, we investigate shear banding using three different model configurations. The first case focuses on the development of a single shear band. A 14.1 × 10.4 km 2 model domain containing a weak seed in the southwest corner (radius R and shear modulus G/4) is subjected to background pure shear rate (̇B G ) . The shear stress and the micro-rotation rate are zero on the domain boundary. The gradient of the plastic strain at the boundary (i.e., variable in space and time) is set to zero. We do not aim at resolving the complete elastic loading, and for this reason the models are pre-stressed. The initial pressure is set to the confining pressure (P conf = 250 MPa), the horizontal deviatoric stress is set to −P conf /2, and the horizontal deviatoric stress is set to P conf /2. For models with a Cosserat continuum, the initial couple-stress components and bending stress are set to zero.
The second case involves a domain size of 14.0 × 10 km 2 . The boundary conditions are similar to the previous case. The initial condition accounts for a random cohesion field which gives rise to a non-trivial shear banding pattern. All models were run with the same initial random noise, which was interpolated to meshes of different resolution. The initial cohesion is set to 100 MPa. In order to introduce a spatial variation, a read noise is applied to the cohesion field. The latter is isotropic and is characterized by a zero mean, a constant variance and no preferred period. Diffusion is applied to the initial cohesion for a total time of 7.1 Kyr and with a diffusivity of 10 −6 m 2 ⋅ s −1 resulting in minimum and maximum cohesion values of 95.6 and 103.8 MPa, respectively. In crustal-scale models, the domain has dimensions of 50 × 30 km 2 . The vertical component acceleration of gravitational acceleration (g y ) and density are respectively set to −9.81 m/s 2 and 2,700 kg/m 3 . The initial pressure is set to lithostatic pressure (P = −∫ρg y dy) and the top surface is a free surface. All initial stresses are set to zero. The temperature (T) increases with depth with a gradient of 15°C/km and the surface temperature is 20°C.
Viscous creep is activated and the viscosity follows a power-law relation, II , using creep parameters of Westerly granite (Hansen and Carter (1983): A = 3.1623 × 10 −26 Pa −n .s −1 , Q a = 186.5 kJ/mol and n = 3.3) and the gas constant R (= 8.314 J/kg/mol). The seed has constant viscosity of 10 21 Pa ⋅ s. For models with a Cosserat continuum, the bending shear modulus and shear viscosity were kept equal to that used for the evaluation of deviatoric stress (G c = G and η c = η). The remaining material parameters are listed in Table 1.

Verification of Cosserat Model Implementation Using FEM Simulations
The implementation of a Cosserat continuum in an existing code implies some fundamental changes (see Appendices A and B). We here verify our FDM implementation by benchmarking against FEM-based simulations.
DURETZ ET AL. The reference FEM solution has been generated using a standard displacement-rotation framework (Hageman et al., 2021). The spatial discretization was performed with cubic B-splines (Hughes et al., 2005), using 94 × 62 elements (element size 150 × 168 m). This resolution corresponds to that used for the FDM calculations in this section. We used a backward Euler scheme for the temporal discretization. One difference of note between the FDM implementation presented in this paper, and the reference FEM implementation is that the finite element solution required significantly smaller time increments to localize accurately: We have used 40 versus 400 time steps for the FDM and the reference FEM scheme, respectively.
The results depicted in Figure 1 were computed using the single shear band model configuration. We report the temporal evolution of the average of the second deviatoric stress invariant. The results were obtained for three different length scales ranging from 100 to 300 m, with and without cohesion softening. In all cases, we obtained good agreement between FDM and FEM results. The only noticeable mismatch was obtained for a length scale of 100 m including softening. Under these conditions, the shear band width reached a value which was too close to the spatial resolution of either (FEM or FDM) model.

Shear Banding With Viscoplasticity, Gradient Plasticity and in Cosserat Continuum
We investigated the single shear band model configuration using η vp = 10 18 Pa ⋅ s for viscoplastic regularization, K g = 5.10 12 Pa ⋅ m 2 for gradient plasticity and l c = 80 m for the Cosserat continuum. The simulations have been run for 40 time steps up to a final time of ≈1.1 Kyr. The plastic strain maps exhibit a finite width band of increased strain localization for each regularization approach (Figures 2a-2c). For the parameters considered, the widths of the bands are comparable. In comparison with gradient and Cosserat implementations, the strain gradient appears sharper in the case of viscoplasticity. By analyzing the spatial distribution of τ II (Figures 2d-2f), we observe that viscoplastic regularization exhibits small-scale shear band splitting. It should be noted the gradient implementation also exhibits a shear band splitting (Figure 2e), which is not observed in the case of a Cosserat regularization (Figure 2f). The Cosserat model clearly exhibits high stress regions on each side of the shear band ( Figure 2f).

Table 1 List of Material Parameters Employed in This Study
of the overstress. The gradient plasticity exhibits the steepest softening. Each implementation shows convergence of the stress-time curve upon mesh refinement. The Cosserat implementation clearly outperforms the other approaches since almost no difference is noticeable between the results of the lowest and the highest tested resolution.
Profiles of the plastic strain (ɛ p , Figure 4a), the stress invariant (τ II , Figure 4b) and the pressure (P, Figure 4c) were probed across the shear bands. As expected from the 2D maps, viscoplasticity provides a plug flow profile for the plastic strain while the profiles resulting from the gradient and Cosserat models are more gradual (Figure 4a). All regularization approaches show a decreased value of the stress and the pressure inside the shear bands (Figures 4b  and 4c). For viscoplasticity and gradient plasticity, the development of multiples are noticeable on both stress and pressure profiles (Figures 4b and 4c). For Cosserat, the high stress rims observed in the 2D stress maps (Figure 2f) are now clearly observable along the stress profile ( Figure 4b).
In order to verify that the latter effect is not an artifact of mesh resolution, we have run a grid convergence test. The resolution was increased from 46 × 30 up to 382 × 254 cells. The results clearly show convergence upon mesh refinement ( Figure 5). The stress (τ II ) in the rims saturates at ≈140 MPa with increasing resolution while the center of the shear band and the far-field reach values of 117 and 125 MPa, respectively. Moreover, independently computed FEM solutions (Figure 1) exhibited the same behavior.
Further differences between regularization schemes arise when considering the effect of dilatancy. Using the same model configuration, we have studies the effect of increasing the dilation angle ( Figure 6). For all regularization,  non-zero dilatancy tends to smear out strain localization. The intensity of plastic strain thus decreases within shear bands. Nevertheless, gradient and Cosserat models manage to maintained strain localization within shear bands (Figures 6f and 6i). In the case of viscoplastic regularization, plastic strain also occurs outside the shear bands. With increasing dilation angle, the plastic strain gradient reduces and localization an thus less pronounced (Figure 6c).
We studied the width of shear bands for each regularization approach as a function of the main regularization parameter. Widths (W) were extracted from the plastic strain profiles and correspond to the bandwidth of the best Gaussian fit. For all approaches, shear band widths of several 100 of meters were obtained; they are thus numerically resolved. For viscoplasticity, the characteristic width is proportional to the inverse of the Kelvin element viscosity (η vp , Figure 7a). For gradient plasticity, shear band width scales with the square root of the gradient parameter (K g , Figure 7b). For the Cosserat model, the width depends linearly on the length scale l c (Figure 7c). This is another benefit of the Cosserat model in comparison to the other approaches as it allows for a straightforward control of shear band widths.

Shear Band Networks Arising From Initial Random Cohesion
We carried out simulations with a random initial cohesion for a viscoplasticity, gradient plasticity and Cosserat continuum.
We report the evolution of averaged stress in Figure 8a. Similar to the single shear band case, gradient plasticity provides the most abrupt stress drop while viscoplasticity provides the most gradual stress drop. For the considered set of parameters, the gradient and Cosserat models result in approximately the same residual load, ≈140 MPa, while the viscoplastic model reaches ≈155 MPa, which reflects the overstress. For each regularization approach, the shear banding pattern consists of one main thick shear band that reflects on the four model boundaries (Figure 8b). The location of reflection points differs in all cases. Gradient plasticity yields the narrowest band, while the viscoplasticity and Cosserat models are characterized by a similar width. In the case of viscoplasticity, the generation of multiple bands can clearly be observed, while gradient and Cosserat models remain localized in a single band.
The shear banding patterns obtained with the three approaches are remarkably stable upon mesh refinement ( Figure 9). The viscoplastic model is the most affected by a variation of the model's resolution, however, this only impacts the details of the shear band multiples.
We additionally explored the role of the governing parameters for each regularization approach ( Figure 10). As in the single shear band case (e.g., Figure 7), the shear band broadens with an increase of either η vp , K g , or l c , while the intensity of the plastic strain decreases with increasing shear band width. For viscoplasticity we observe more prominent multiples for an increasing Kelvin-element viscosity. Minor multiples were also obtained with of gradient plasticity, but only for the lowest value of K g . The Cosserat models are characterized by robust shear banding pattern, which is merely affected by a change of resolution or governing parameter and which is devoid of any multiple shear banding.

Crustal Scale
We have carried out simulations of shear banding at the crustal scale using the three regularization techniques. The results are depicted in Figure 11, which depicts maps of second strain rate invariant after 1.07 Myr. The models undergo horizontal compression and include a brittle-ductile transition as well as a free surface. We focused on shear banding that occurs during visco-elasto-plastic stress build up. Therefore, the effect of finite strain was not included and the model geometries did not evolve with time.
As for the previous test (Figure 10), we can clearly observe the effect of the regularization. For each approach, strain localization loses intensity with increasing the regularization parameter. In particular, we observe that main shear bands broaden and that surrounding shear band multiples smear out. Shear banding patterns can even exhibit striking similarities (e.g., Figures 10c and 10i). Again, multiples are much more present using viscoplastic regularization which exhibits high frequency shear bands at low values of η vp (Figure 10a) and which resembles patterns obtained using non regularized frictional plasticity (i.e., local and rate-independent).

Combining Benefits of Temporal and Spatial Regularization
In some cases, it may be beneficial to combine the effects of several regularization schemes. In particular when both effects of temporal and spatial regularization are desired. Such combination is easily achievable in the context of the employed PT solving strategy. We have investigated the combination of gradient plasticity and viscoplasticity as well as Cosserat continuum with viscoplasticity, using the models based on the single shear band configuration. The stress-time evolution curves indicated the effect of viscoplastic overstress on the gradient and Cosserat model results (Figure 12a). The latter is minimized by setting low values of regularization viscosity (η vp < 5.10 17 Pa ⋅ s). As observed for direct iterative schemes (e.g., Duretz et al., 2019), the inclusion of rate dependence also facilitates convergence to equilibrium. This is particularly remarkable for gradient plasticity ( Figure 12b) where even small value of η vp can lead to 50% reduction in the number of iterations. The effect of viscoplasticity is also noticeable for the Cosserat model but the improvement is less than 25% (Figure 12c).

Discussion: The Benefits and Limitations of Regularization Approaches
For the configurations considered, each regularization approach delivers satisfactory results. They allow for convergence regarding mesh refinement, satisfaction of equilibrium, and result in comparable shear band widths. Nevertheless, each approach influences the solution in its own way.
The inclusion of viscoplasticity induces overstress and large values of the viscosity parameter, η vp , can lead to an overshoot of the material strength. Nevertheless, for moderate overstress, fairly accurate strength estimates can be achieved while keeping the regularization benefits of viscoplasticity . A power-law model can also be used to limit the effects of growing overstress with increasing strain rate (Duretz et al., 2021). Another drawback of viscoplasticity is the occurrence of multiple shear bands and a possible delocalization of the strain. The latter is particularly well expressed when non-zero dilatancy is taken into account. This effect was vastly reduced with gradient plasticity and was not observed with models based on a Cosserat continuum. In contrast with viscoplasticity and gradient plasticity, in the Cosserat continuum, shear bands exhibit rims which of high differential stress, which likely contributes to the focusing of the strain inside the band. A comparison with stress distributions around natural shear/fault zones could help to validate this. In general the Cosserat model provides the most robust shear banding patterns and is the simplest way to control shear band width. At this point, we only intended to use Cosserat as a numerical regularization technique. We have made simplifying assumptions in the choice of the extra parameters involved in the Cosserat model. Notably, the definition of invariants involves extra material parameters. By assuming h 1 = h 2 = h 3 , we have restricted our models to isotropic viscous and plastic flow. Accurate constitutive modeling using Cosserat medium can also be achieved (e.g., Stefanou et al., 2017). Such models often imply mechanical anisotropy as reflected by estimated values h 1 , h 2 , and h 3 for different materials.

Implications and Perspectives for Geodynamic Modeling
The considered regularization approaches all show attractive properties for geodynamic models which typically involve the development of fault zones, shear zones and plate boundaries. It should be noted that the considered regularizations do not alter the angle of shear bands arising from shear instabilities using Drucker-Prager rheology. Spatial regularizations (gradient and Cosserat continuum) provide the most robust shear banding patterns. They manage to maintain plastic deformation within the shear bands, exhibit excellent mesh convergence and simple control over shear band width.
Although we have used a Cosserat continuum as a regularization, with an appropriate description of the material structure, a Cosserat model can adequately reflect the mechanical behavior a material (e.g., Stefanou et al., 2017). Moreover, Cosserat continua are general because, besides plasticity, they also introduce a length scale in elasticity (e.g., Lakes, 1995) or viscosity (Riihimäki, 1978). The latter could have interesting implications in geophysics such as in seismology (Abreu et al., 2017). Upon adequate upscaling of rocks microstructure, the use of Cosserat media could provide further insights in the patterns of interseismic deformation, which are typically studied using classical elasticity and viscosity (Savage & Burford, 1973;Traoré et al., 2014). The role of couple-stresses was shown to be beneficial for the simulation of anisotropic multilayer folding (Muhlhaus et al., 2002). Additionally, Cosserat media may constitute an appealing alternative for modeling ductile strain localization using a constitutive model that contains no inherent length scale or length scales that are typically non-resolvable in a typical geodynamic simulation (e.g., grain size evolution, Rozel et al., (2011)).

Implications for Geodynamic Simulation Tools
Viscoplasticity is straightforward to implement and is already used in codes such as LaMEM (Moulas et al., 2022), M2Di  and MDoodz (Duretz et al., 2021;Yamato et al., 2022). Nevertheless, a finer control on shear banding patterns is achievable using spatial regularizations such as gradient plasticity or Cosserat continuum. In particular, the use of a Cosserat continuum allows for a direct control of shear band widths (e.g., (Mühlhaus & Vardoulakis, 1987), see also Figure 7c). However the implementation of spatial regularization requires major changes in the structure and solving strategies of existing codes. With gradient regularization, the return mapping procedure involves the Laplacian of the plastic multiplier rate. This involves the solution of an additional coupled partial differential equation with boundary conditions defined at the boundary of the plastic domain. The plastic multiplier rate thus becomes an additional global DOF (in 2D or 3D). In the Cosserat continuum, the components of the micro-rotation rate vector are additional DOFs. This corresponds to one and three additional global DOF, and associated partial differential equation, in 2D and 3D, respectively.
The PT solving strategy enabled us to analyze the different regularization in a single framework. Nevertheless, the inclusion of spatial regularization techniques may also be easily achieved with other solving strategies that are designed to automatically discretize and solve coupled non-linear systems of equations (Davies et al., 2022;Wilson et al., 2017). Alternatively, it may also be possible to include effects of spatial regularization by decoupling the equations (operator split) and using staggered schemes such as usually applied in damage modeling (Miehe et al., 2010).

Conclusions
Frictional strain localization is an important feature of geodynamic simulations. However, due to the lack of inherent length and time scales, the numerical treatment of frictional plasticity in geodynamic model is problematic. For this reason, we have investigated the effects of three temporal and spatial regularization techniques: viscoplasticity, gradient plasticity and the use of a Cosserat continuum. The three techniques were implemented in a single code based on the accelerated PT method combined to a FDM discretization. To conform to geodynamic modeling standards, we expressed the Cosserat model for velocity-pressure formulations and benchmarked it using state-of-the-art FEM models. The three regularization strategies provide attractive properties for modeling strain localization in frictional material equipped (or not) with a material strain softening parametrization. All regularization strategies allow, to some extent, for controlling shear band widths, they deliver convergence upon mesh refinement and satisfaction of the force balance. All regularization techniques were successful at modeling crustal scale shear banding during visco-elasto-plastic build-up. In practice, viscoplasticity is straightforward to include in existing codes. Gradient and Cosserat implementations require extra degrees of freedom but both provide more robust shear banding patterns and more precise strength estimates. Besides regularization, the use of Cosserat-type media to upscale rock fabrics brings exciting perspectives for future geodynamic modeling.

Appendix A: Splitting of the Shear Stress Components
In the Cosserat model, the total strain rate tensor is not symmetric as the shear components include effect of micro-rotation rates:̇ Assuming a viscous rheology, the total shear stress component σ xy ≠ σ yx and is expressed as: Upon expansion, σ xy may be reformulated as: or after using the symmetrized shear strain rate = 1 2 ( + ) and the spin ̇= 1 2 ( − ) = 2 − 2 c − 2 c (A4) Using the identities = 2 and that = −2 c(̇+̇) , and carrying out a similar derivation for σ yx , one obtains: These expressions are convenient to work with in geodynamic code since the deviatoric shear stress remains symmetric. The only addition is that one needs to keep track of the additional (antisymmetric) components of shear stress caused by the rotation (R z ) in order to calculate the total shear (σ xy ) and subsequently evaluate the linear and angular momentum balances.
where the notation ′ indicate the symmetrized deviatoric strain rate tensor components. Plane-strain conditions are assumed. The constitutive relationships are expressed are given in the continuous form in Section 2.2. In the following, we employ a semi-discrete form for practical purposes. The rates are discretized using backward Euler time integration. We assume small elastic strains, such that terms related to stress tensor advection, rotation and stretch are negligible. The updates of deviatoric stress, force-conjugate of micro-rotation rate and couple-stress may be expressed as: where F trial is the yield function evaluated for trial deviatoric stress invariant and trial pressure. Furthermore, if one assumes that ve c = ve , the deviatoric stress invariant becomes proportionally to the effective strain rate invariant (see e.g., Kaus et al., 2016;Moresi et al., 2003) and an effective viscosity can be expressed (e.g., Duretz et al., 2021).

Data Availability Statement
All the scripts are available in the repository (Räss & Duretz, 2023), the initial cohesion field is available in this repository (Duretz, 2023). The scripts are written in the open source Julia language and the scripts' latest version is accessible at https://github.com/PTsolvers/PlasticityRegularisations_G3.