Segmented diffusion imaging with iterative motion‐corrected reconstruction (SEDIMENT) for brain echo‐planar imaging

Multi‐shot techniques offer improved resolution and signal‐to‐noise ratio for diffusion‐weighted imaging, but make the acquisition vulnerable to shot‐specific phase variations and inter‐shot macroscopic motion. Several model‐based reconstruction approaches with iterative phase correction have been proposed, but robust macroscopic motion estimation is still challenging. Segmented diffusion imaging with iterative motion‐corrected reconstruction (SEDIMENT) uses iteratively refined data‐driven shot navigators based on sensitivity encoding to cure phase and rigid in‐plane motion artifacts. The iterative scheme is compared in simulations and in vivo with a non‐iterative reference algorithm for echo‐planar imaging with up to sixfold segmentation. The SEDIMENT framework supports partial Fourier acquisitions and furthermore includes options for data rejection and learning‐based modules to improve robustness and convergence.

Navigator-free methods 33,34 completely renounce any navigator information. The two latter reconstruction approaches are described as data driven, as they are purely based on the imaging data itself.
For multi-shot DWI, several data-driven algorithms have been proposed to correct for motion-induced artifacts. A detailed review of these algorithms is postponed to the next section. As a brief example, POCS-ICE 29 iteratively corrects for physiological motion by phase correction, recovering decent images for high segmentations. Conversely, AMUSE 30 corrects for both physiological and inter-shot rigid in-plane motion in a non-iterative scheme. This extension nicely inhibits gross motion artifacts, but the one-time SENSE navigators are prone to noise propagation.
Iterative rigid motion estimations have achieved improved image quality for non-diffusion multi-shot applications by enforcing SENSE-based data consistency, [35][36][37][38][39][40] but the iterative integration of both physiological and macroscopic motion for DWI has still been challenging.
In this work, we integrate physiological and macroscopic in-plane motion correction into an iterative reconstruction scheme called SEDIMENT.
The motion parameters are estimated from SENSE-based shot navigators, which are iteratively improved by sharing consistent multi-shot information. The method is presented for brain EPI. Moreover, the framework enables partial Fourier (PF) reconstructions 41 to harvest SNR by reducing the echo times for EPI.

Multi-shot diffusion reconstruction
SENSE-based multi-shot reconstruction combines two basic models to describe the joint signal formation, namely the SENSE encoding model 6,7 and a model for the shot relationships. 16 Phased arrays provide additional encoding information by their respective spatial sensitivities, which can be exploited to resolve fold-over artifacts from undersampled trajectories using SENSE. Each single interleave can be interpreted as a conventional undersampled SENSE problem that can be solved individually, but that becomes increasingly ill conditioned for higher segmentation.
The second model describes the relationships between the shots to exploit their common features and increase the problem conditioning. The modeling of the shot-specific variations is application dependent and determines the tractability of the joint problem.
The SENSE forward model 7 for each shot contains sensitivity and Fourier encoding. For a discrete realization, the complex shot image vector $x_i \in \mathbb{C}^{N}$ of shot $i$ is related to the corresponding shot data $d_i \in \mathbb{C}^{N_K}$ by a linear model, where $N$ and $N_K$ are the number of pixels and k-space samples, respectively. The SENSE functional $J_i$ penalizes the Euclidean distance between the model and the data:

$$J_i(x_i) = \lVert \mathcal{F}_i S x_i - d_i \rVert_2^2, \quad i \in \mathcal{I}_0. \tag{1}$$

Herein, $S \in \mathbb{C}^{N_C N \times N}$ is the SENSE operator with $N_C$ coil sensitivity maps (CSMs) and $\mathcal{F}_i \in \mathbb{C}^{N_K \times N_C N}$ is the masked Fourier operator of the $i$th undersampled shot trajectory. The set $\mathcal{I}_0 = \{1, 2, \ldots, N_I\}$ contains all $N_I$ shot indices. This work evaluates Cartesian interleaved EPI ($N_K = N_C N$) samples with optional PF acquisition, 41 as shown in Figure 1A.
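As a minimal sketch of the shot-wise SENSE model described above, the following NumPy code applies sensitivity encoding, Fourier encoding and an interleaved Cartesian shot mask, and evaluates the squared data distance. The function names, the zero-filled data layout and the random test sensitivities are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def shot_mask(n_pe, n_ro, n_shots, shot):
    """Binary k-space mask of one interleaved Cartesian shot (every n_shots-th line)."""
    mask = np.zeros((n_pe, n_ro), dtype=bool)
    mask[shot::n_shots, :] = True
    return mask

def sense_forward(x, csm, mask):
    """Sensitivity encoding, 2D FFT and shot mask; returns zero-filled shot data."""
    coil_images = csm * x[None, :, :]                # shape (n_coils, n_pe, n_ro)
    kspace = np.fft.fft2(coil_images, norm="ortho")  # Fourier encoding per coil
    return kspace * mask[None, :, :]                 # keep only the sampled lines

def sense_functional(x, csm, mask, data):
    """Squared Euclidean distance between the model prediction and the shot data."""
    return np.sum(np.abs(sense_forward(x, csm, mask) - data) ** 2)

# Toy example with hypothetical sizes and random (non-physical) sensitivities.
rng = np.random.default_rng(0)
n_coils, n_pe, n_ro, n_shots = 4, 12, 8, 3
x_true = rng.standard_normal((n_pe, n_ro)) + 1j * rng.standard_normal((n_pe, n_ro))
csm = rng.standard_normal((n_coils, n_pe, n_ro)) + 1j * rng.standard_normal((n_coils, n_pe, n_ro))
mask = shot_mask(n_pe, n_ro, n_shots, shot=1)
data = sense_forward(x_true, csm, mask)
```

The functional vanishes for the image that generated the data and is positive for a mismatching image, which is the property the shot-wise data consistency relies on.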
For multi-shot echo-planar DWI, we consider two shot variations that affect the joint image underlying all shots. 30 First, each shot image $x_i$ contains a shot-specific phase that originates from tiny motion during the diffusion encoding. Second, the shot images are prone to in-plane macroscopic motion between the shots. The shot variations are visualized in Figure 1B. The macroscopic motion is included using matrix formulations. 42 Hence, each shot image is related to the underlying joint image $\rho \in \mathbb{C}^{N}$ by a set of joint image constraints:

$$x_i = \Omega_i \Phi_i \rho, \quad i \in \mathcal{I}_0. \tag{2}$$

The shot-specific phase operator $\Phi_i \in \mathbb{C}^{N \times N}$ is a diagonal matrix comprising the complex-valued phase of each pixel, whereas $\Omega_i \in \mathbb{C}^{N \times N}$ denotes the macroscopic motion operator. The structure of $\Omega_i$ depends on the selected motion transformation. The order of the operators can be reversed.

FIGURE 1 Example of shot-specific DWI variations for interleaved EPI. A, The three shot trajectories jointly yield full k-space coverage. PF acquisition allows for trajectory shortening, indicated by dotted lines. B, The three shot images contain strong macroscopic inter-shot motion and motion-induced phase variations. Conventional SENSE (R = 1) 6 reconstructs the fully stacked k-space of the complementary shots neglecting shot variations. The resulting image quality is unacceptable.

The joint image constraints in Equation (2) provide a set of equality constraints that connect the shot-wise SENSE functionals from Equation (1) to form the constrained joint diffusion problem:

$$\min_{\rho,\, x_i,\, \Omega_i,\, \Phi_i} \; \sum_{i \in \mathcal{I}_0} J_i(x_i) \quad \text{s.t.} \quad x_i = \Omega_i \Phi_i \rho \quad \forall\, i \in \mathcal{I}_0. \tag{3}$$

Commonly, further constraints and regularizations are added, including smoothness assumptions on the phase variations, Tikhonov regularizations and potential sparsity constraints. 43

In many publications, the equality constraints are plugged into the objective function to eliminate the shot images, yielding an unconstrained problem:

$$\min_{\rho,\, \Omega_i,\, \Phi_i} \; \sum_{i \in \mathcal{I}_0} \lVert \mathcal{F}_i S\, \Omega_i \Phi_i \rho - d_i \rVert_2^2. \tag{4}$$

Comprising unknown macroscopic and physiological motion operators, the joint problem represents a non-convex optimization, which becomes computationally intractable and susceptible to local minima. 44

In the case of known motion operators $\Omega_i$ and $\Phi_i$, the multi-shot encoding is interpretable as looking at the spin densities with $N_I N_C$ sensitivity profiles, which contain the normal set of $N_C$ coils modulated with a shot-specific phase and macroscopic transformation. Mathematically, this results in an extended linear SENSE problem with $N_I N_C$ so-called composite sensitivity profiles, 21 which is unconstrained and convex. Extra- and self-navigated methods are therefore used to estimate the unknown motion-related operators and circumvent non-convex joint optimization.

Related data-driven algorithms
The literature provides a variety of data-driven algorithms solving the joint problem in Equation (3) to achieve robust multi-shot diffusion reconstructions. The algorithms can be jointly assigned to the field of model-based image reconstruction, 16 but they employ different strategies to solve the optimization problems. In general, all algorithms correct for phase variations from physiological motion due to its severe and unavoidable nature in multi-shot diffusion. Conversely, macroscopic motion is mainly neglected. Table 1 provides an overview of recent data-driven multi-shot DWI algorithms, which we compiled to the best of our knowledge. Except for PR-SENSE, 31 all EPI-based methods support PF reconstructions.
The majority of algorithms focus on physiological motion correction by phase estimation using SENSE navigation. MUSE 24 and SENSE+CG 26 employ a two-step approach to obtain the phase maps once and then reconstruct the joint image. First, shot images are reconstructed using SENSE, which makes the shot-specific phase variations accessible by filtering. Second, the unconstrained joint problem in Equation (4) with the just estimated phase operator is solved to obtain the joint image. These non-iterative algorithms are limited to relatively low segmentations, because the SENSE-based shot reconstructions become increasingly ill conditioned for higher segmentation. The g-factor noise 6 thus propagates into the shot images, deteriorating the one-time phase estimates. SF-MUSE, 27 POCSMUSE, 28 POCS-ICE 29 and PR-SENSE 31 basically use an iterative structure to estimate the phase operator $\Phi_i$ and the joint image $\rho$. The iterative feedback thereby successively improves the shot-wise SENSE conditioning, enabling higher segmentations.
In contrast to the SENSE-navigated algorithms, MUSSELS 33 and Shot-LLR 34 include the phase variations by iteratively posing low-rank constraints. These approaches perform navigator-free optimizations of the non-convex joint problem, circumventing SENSE navigation.
Furthermore, they omit the explicit definition of smoothness-enforcing filters to obtain Φ i . AMUSE 30 extends the two-step MUSE framework 24 by including macroscopic motion corrections in the non-iterative scheme. The algorithm obtains the shot-wise macroscopic motion states once by rigid registration and also performs diffusion contrast corrections, which arise from the change in the effective diffusion directions according to the shot-specific rotations. AMUSE uses the rotation parameters from registration and a diffusion tensor estimate to equalize the shot contrasts. The phase maps are determined using a total variation (TV) filter. Analogous to the non-iterative phase-corrected reconstruction, the motion estimation is sensitive to noise propagation effects within the shot navigators, which could be enhanced by iteratively reinforcing data consistency.
This work focuses on the joint iterative improvement of the physiological and macroscopic shot motion estimates Ω i and Φ i for multi-shot DWI.
AMUSE furthermore embeds the multi-shot framework into the higher-level diffusion tensor imaging (DTI). 30 The proposed contrast corrections have been shown to improve the DTI results, but the robust estimation of the initial diffusion tensor poses another challenging problem for higher segmentations apart from the motion estimates. Therefore, we excluded the diffusion contrast corrections from this work.

SEDIMENT framework
SEDIMENT adopts SENSE navigation to estimate both motion-induced phase variations and macroscopic motion for multi-shot DWI reconstruction and embeds it into an iterative scheme. Inspired by POCS-ICE, 29 the algorithm alternates between the shot-wise data consistency in k-space (Equation (1)) and the joint image constraints in image space (Equation (2)), but extends this scheme by continuous macroscopic motion corrections and optionally includes PF acquisitions and data rejection. SEDIMENT and the reference algorithm MC-SENSE+CG (macroscopic motion-corrected SENSE+CG) are shown in Figure 2. We dropped iteration superscripts to provide uncluttered notation. The operations are always performed on the up-to-date estimates.

Initialization
SEDIMENT guesses first shot images x init i using initial CG-SENSE. 7 The conjugate gradient (CG) method includes an intrinsic regularization attenuating low singular values. 45 The iteration number controls the regularization and thus balances aliasing and noise propagation. 46
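The early-stopping regularization of CG-SENSE can be sketched as a conjugate gradient solver on the normal equations of the shot-wise SENSE model. This is a minimal illustration under assumptions: the operator helpers `forward` and `adjoint`, the stopping rule relative to the norm of the right-hand side, and the default values (12 iterations and a tolerance of 1e-4, taken from the single-shot settings in the implementation details) are illustrative and not the authors' code.

```python
import numpy as np

def forward(x, csm, mask):
    """Shot-wise SENSE encoding: coil weighting, 2D FFT, shot mask."""
    return np.fft.fft2(csm * x[None], norm="ortho") * mask[None]

def adjoint(d, csm, mask):
    """Adjoint encoding: mask, inverse 2D FFT, conjugate coil combination."""
    return np.sum(np.conj(csm) * np.fft.ifft2(d * mask[None], norm="ortho"), axis=0)

def cg_sense(data, csm, mask, max_iter=12, tol=1e-4):
    """Conjugate gradients on E^H E x = E^H d; a low max_iter acts as regularization."""
    b = adjoint(data, csm, mask)
    x = np.zeros_like(b)
    r = b.copy()                      # residual of the normal equations
    p = r.copy()                      # search direction
    rs_old = np.vdot(r, r).real
    b_norm = np.sqrt(rs_old)
    for _ in range(max_iter):
        if np.sqrt(rs_old) <= tol * b_norm:
            break                     # residual norm criterion reached
        q = adjoint(forward(p, csm, mask), csm, mask)
        alpha = rs_old / np.vdot(p, q).real
        x = x + alpha * p
        r = r - alpha * q
        rs_new = np.vdot(r, r).real
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x
```

With few iterations the solver mainly recovers components belonging to large singular values, whereas many iterations approach the unregularized SENSE solution and amplify g-factor noise.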

Symmetry Projection
PF techniques assume that the spatial frequency content of the object's phase is limited in the phase-encoding direction. 41 The reconstruction therefore imposes another low-resolution constraint on the phase, using a conventional phase projection operator. 8 The signal phase is thereby substituted by the constrained low-resolution phase $\psi_i$:

$$x_i = \mathrm{abs}(x_i^{\mathrm{init}}) \circ \psi_i, \qquad \psi_i = \frac{\mathcal{F}_{pe}^H V \mathcal{F}_{pe}\, x_i^{\mathrm{init}}}{\mathrm{abs}\!\left(\mathcal{F}_{pe}^H V \mathcal{F}_{pe}\, x_i^{\mathrm{init}}\right)}.$$

The operator $\mathcal{F}_{pe} \in \mathbb{C}^{N \times N}$ denotes the 1D-FFT (fast Fourier transform) in the phase-encoding direction and $V \in \mathbb{R}^{N \times N}$ is a 2D window with 1D Hann shape in the phase-encoding direction. The abs function takes the element-wise absolute value (the division is likewise element-wise), $\circ$ is the (element-wise) Hadamard product and $x_i$ is the $i$th shot image estimate.

FIGURE 2 Schematic diagrams of the presented motion-corrected multi-shot DWI algorithms. A, SEDIMENT. B, MC-SENSE+CG (non-iterative reference algorithm). Both schemes estimate complex-valued shot images from the data by initial CG-SENSE. The shots navigate macroscopic and physiological motion estimations yielding macroscopic parameters and phase maps. SEDIMENT optionally includes symmetry projections for PF acquisition and data rejection before shot combination and feedback to the data projection. MC-SENSE+CG employs a final multi-shot CG to solve for the joint image.

In the case of full Fourier acquisition, this step is skipped by setting $x_i = x_i^{\mathrm{init}}$. Note that the PF phase constraint also limits the maximum frequency content of the previous motion-induced phase estimate $\phi_i$.
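The symmetry projection can be illustrated with a short NumPy sketch: the magnitude of the shot image is kept, and its phase is replaced by the phase of a copy that was low-pass filtered along the phase-encoding axis with a Hann window. The function names, the choice of axis 0 as phase-encoding direction and the odd window width are assumptions made for this example.

```python
import numpy as np

def hann_window_pe(n_pe, n_ro, width):
    """2D window with a 1D Hann shape (odd `width`) centered on the k-space origin along axis 0."""
    win = np.zeros(n_pe)
    start = n_pe // 2 - width // 2          # center the window on index n_pe // 2
    win[start:start + width] = np.hanning(width)
    return np.repeat(win[:, None], n_ro, axis=1)

def symmetry_projection(x_init, window):
    """Keep the magnitude of x_init and substitute its phase by a low-resolution phase."""
    k = np.fft.fft(x_init, axis=0)                        # 1D FFT along phase encoding
    k_low = k * np.fft.ifftshift(window, axes=0)          # window moved to FFT ordering
    low = np.fft.ifft(k_low, axis=0)                      # low-resolution complex image
    return np.abs(x_init) * np.exp(1j * np.angle(low))    # magnitude times unit phasor
```

For a partial Fourier acquisition, `width` would be chosen to match the symmetrically sampled region around the k-space center.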

Motion Estimation
The shot-image guesses navigate macroscopic and physiological motion estimations. The shot-wise operator for macroscopic motion $\Omega_i$ allows us to include various motion models, such as rigid, affine or elastic transformations. In the context of brain imaging, this work performs rigid registration of the shot magnitude images as presented in AMUSE. 30 The shot with the highest total correlation to all other shots is chosen as the registration reference with index $i_{\mathrm{ref}}$ once in advance. The shots are then aligned by applying the estimated motion transformation.

Next, the physiological motion correction requires the estimation of the motion-induced shot phase variations $\phi_i$. Assuming spatial smoothness, the phase maps are extracted from low-resolution data using a 2D triangular k-space window 29:

$$\phi_i = \exp\!\left(j\, \measuredangle\!\left(\mathcal{F}_{2d}^H W \mathcal{F}_{2d}\, \Omega_i^H x_i\right)\right).$$

Here, $\Omega_i^H x_i$ is the aligned shot image, $\mathcal{F}_{2d} \in \mathbb{C}^{N \times N}$ the 2D-FFT operator and $W \in \mathbb{R}^{N \times N}$ the window function. $\measuredangle$ extracts the element-wise phase, the superscript $H$ denotes the Hermitian of an operator and $j$ is the imaginary number. The phase operator is constructed by $\Phi_i = \mathrm{diag}(\phi_i)$. After correcting for macroscopic and physiological motion, a joint image estimate $\rho_i = \Phi_i^H \Omega_i^H x_i$ can be obtained for each shot.
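The phase estimation step can be sketched as follows, assuming the shot image has already been aligned. The separable (outer product) construction of the 2D triangular window and the function names are assumptions for illustration; the text only specifies a 2D triangular k-space window.

```python
import numpy as np

def triangular_window(n, half_width):
    """1D triangular window in FFT ordering, peaking at the zero frequency."""
    k = np.fft.fftfreq(n, d=1.0 / n)          # integer frequency indices
    return np.clip(1.0 - np.abs(k) / half_width, 0.0, None)

def estimate_phase(x_aligned, half_width):
    """Unit-magnitude phase map from a low-resolution copy of the aligned shot image."""
    w_pe = triangular_window(x_aligned.shape[0], half_width[0])
    w_ro = triangular_window(x_aligned.shape[1], half_width[1])
    low = np.fft.ifft2(np.fft.fft2(x_aligned) * np.outer(w_pe, w_ro))
    return np.exp(1j * np.angle(low))

def phase_corrected(x_aligned, phi):
    """Joint image estimate of one shot after removing the estimated phase."""
    return np.conj(phi) * x_aligned
```

Because only the phasor of the filtered image is used, the magnitude of the shot image is untouched and no phase unwrapping is needed.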

Optional Data Rejection
The initial set $\mathcal{I}_0$ provides options to make the reconstruction robust against unmodeled artifacts by data rejection. Besides shot mismatches due to through-plane motion or failed registration, patient motion can deteriorate MR signal evolution and the diffusion encoding in various ways, 47 leading to severely corrupted shot data. By exclusion from the current set $\mathcal{I}$, corrupted shots can be rejected before shot combination by analyzing k-space peak broadenings, 17,27 correlation measures 48 or other means.
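In this work, the normalized root-mean-square error (nRMSE) of shot $i$ with respect to the registration reference $i_{\mathrm{ref}}$ is evaluated in every iteration:

$$\mathrm{nRMSE}_i = \frac{\lVert \rho_i - \rho_{i_{\mathrm{ref}}} \rVert_2}{\lVert \rho_{i_{\mathrm{ref}}} \rVert_2}.$$

A shot index is rejected from $\mathcal{I}$ when $\mathrm{nRMSE}_i$ exceeds a given threshold. This measure includes both residual magnitude and phase variations.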
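A minimal sketch of such an nRMSE-based rejection rule is given below. It assumes that the motion-corrected complex shot estimates are available as arrays; the function names and the normalization by the reference norm are illustrative assumptions.

```python
import numpy as np

def nrmse(rho_i, rho_ref):
    """Normalized root-mean-square error of a complex shot estimate w.r.t. the reference."""
    return np.linalg.norm(rho_i - rho_ref) / np.linalg.norm(rho_ref)

def accepted_shots(rho_shots, i_ref, threshold):
    """Indices of shots whose nRMSE relative to the reference stays within the threshold."""
    rho_ref = rho_shots[i_ref]
    return [i for i, rho_i in enumerate(rho_shots) if nrmse(rho_i, rho_ref) <= threshold]
```

Since the error is computed on complex images, a shot is rejected for residual phase mismatch as well as for magnitude mismatch; the reference shot itself always has an error of zero and is never rejected.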

Shot Combination
The joint image is obtained by complex averaging of the motion-corrected shot images:

$$\rho = \frac{1}{|\mathcal{I}|} \sum_{i \in \mathcal{I}} \Phi_i^H \Omega_i^H x_i.$$

The operator $|\mathcal{I}|$ is the cardinality of the set $\mathcal{I}$, representing the number of averaged shots. For known motion operators, the joint image constraints in Equation (3) can be interpreted as consensus constraints for the unconstrained joint problem in Equation (4). This is related to an average projection operator. 49 Apart from the projection, the joint image could also be retrieved by solving the extended SENSE problem in Equation (4) using CG as in AMUSE, 30 but the complex averaging involves relatively low computational loads for use in an iterative scheme and reuses the previous shot guesses $x_i$.
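The following sketch illustrates why the phase correction has to precede the complex averaging. For brevity it assumes that the shots are already aligned (identity macroscopic motion operator); the function name and the toy phases are assumptions for this example.

```python
import numpy as np

def combine_shots(shots, phases, accepted):
    """Complex average of phase-corrected shot images over the accepted shot indices."""
    corrected = [np.conj(phases[i]) * shots[i] for i in accepted]
    return sum(corrected) / len(accepted)

# Toy example: three shots of the same image with different constant phases.
rng = np.random.default_rng(0)
rho = rng.standard_normal((8, 8)) + 1j * rng.standard_normal((8, 8))
angles = [0.0, 2 * np.pi / 3, 4 * np.pi / 3]
phases = [np.full(rho.shape, np.exp(1j * a)) for a in angles]
shots = [p * rho for p in phases]

joint = combine_shots(shots, phases, accepted=[0, 1, 2])  # recovers rho
naive = sum(shots) / len(shots)                           # phases cancel the signal
```

In this constructed case the uncorrected average cancels completely, whereas the phase-corrected average reproduces the underlying image for any subset of accepted shots.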

Shot Data Projection
The shot data projection recovers shot images $x_i$ from the joint image and reinforces SENSE-based data consistency for each shot. 29 For Cartesian trajectories, the data projection just substitutes the estimated data points by actually measured ones, whereas unsampled k-space positions remain unchanged 8:

$$x_i = \Omega_i \Phi_i \rho + S^H \mathcal{F}_i^H \left( d_i - \mathcal{F}_i S\, \Omega_i \Phi_i \rho \right).$$

Hence, the shot estimates are updated by the joint image and fed back into the iterative scheme.
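The data substitution for Cartesian sampling can be sketched in a few lines. The sketch assumes normalized coil sensitivities (the sum of squared magnitudes over coils equals one in every pixel) so that the conjugate coil combination inverts the sensitivity encoding; the function name and array layout are illustrative assumptions.

```python
import numpy as np

def data_projection(x_est, csm, mask, data):
    """Replace predicted k-space samples by measured ones and return the updated shot image.

    x_est: current shot image estimate, shape (n_pe, n_ro)
    csm:   coil sensitivities with sum_c |csm_c|^2 = 1, shape (n_coils, n_pe, n_ro)
    mask:  boolean sampling mask of the shot, shape (n_pe, n_ro)
    data:  zero-filled measured shot data, shape (n_coils, n_pe, n_ro)
    """
    k_est = np.fft.fft2(csm * x_est[None], norm="ortho")   # predicted multi-coil k-space
    k_new = np.where(mask[None], data, k_est)              # substitute measured samples
    coil_images = np.fft.ifft2(k_new, norm="ortho")
    return np.sum(np.conj(csm) * coil_images, axis=0)      # conjugate coil combination
```

An estimate that already agrees with the measured samples is a fixed point of this projection, and unsampled k-space positions keep the values predicted from the joint image.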

Acceleration strategies
The SEDIMENT scheme is furthermore accelerated using coil compression [50][51][52] and the point spread function (PSF) of regularly undersampled Cartesian EPI. 6,38 As the SENSE matrix size scales linearly with the number of coils N C , the computational expense of the SENSE updates can be reduced by coil compression.
Moreover, the PSF of Cartesian trajectories can be used to substitute the k-space undersampling 8,29 by pure image space operations. In general, Cartesian undersampling corresponds to a convolution with a shah-shaped PSF in image space. 6 Regarding multiple uniformly undersampled EPI interleaves, the trajectories are equal, except for a shot-specific shift from the k-space origin.
The k-space undersampling thus involves three concatenated image space operations. First, the trajectory is shifted back to the origin by a shot-specific linear phase ramp in image space. Second, the k-space undersampling of the centered trajectory is applied by a PSF convolution represented by $P \in \mathbb{C}^{N_C N \times N_C N}$. Finally, the trajectory offset is reversed by the conjugate phase ramp to obtain the original shifted trajectory for each shot. This PSF formulation is not an approximation but an exact identity for substituting the FFTs. Note that the image space operations require that the number of pixels in the phase-encoding direction is divisible by the reduction factor. Furthermore, PF acquisition excludes this acceleration as the truncated sampling affects the Cartesian PSF.
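The three image space operations can be checked numerically against the FFT-based undersampling. The sketch below is a minimal illustration for a single coil image with the phase-encoding direction along axis 0; the function names are assumptions, and the PSF convolution is written as an average of circularly shifted copies, which is what a convolution with a comb of `n_shots` equidistant peaks amounts to.

```python
import numpy as np

def undersample_fft(x, n_shots, shot):
    """Reference: FFT along axis 0, keep lines shot, shot + n_shots, ..., inverse FFT."""
    k = np.fft.fft(x, axis=0)
    mask = np.zeros(x.shape[0], dtype=bool)
    mask[shot::n_shots] = True
    return np.fft.ifft(k * mask[:, None], axis=0)

def undersample_psf(x, n_shots, shot):
    """Same operation in image space: phase ramp, comb-shaped PSF, conjugate phase ramp."""
    n = x.shape[0]
    if n % n_shots:
        raise ValueError("number of phase-encoding lines must be divisible by n_shots")
    # Step 1: shift the shot trajectory back to the k-space origin.
    ramp = np.exp(-2j * np.pi * shot * np.arange(n) / n)[:, None]
    centered = ramp * x
    # Step 2: convolution with the comb PSF = average of n_shots shifted copies.
    fov_shift = n // n_shots
    aliased = sum(np.roll(centered, q * fov_shift, axis=0) for q in range(n_shots)) / n_shots
    # Step 3: restore the trajectory offset with the conjugate ramp.
    return np.conj(ramp) * aliased
```

Both functions agree to floating-point precision, while the image space variant replaces two FFTs by `n_shots` shifted additions and two pointwise multiplications.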

Reference algorithms
The proposed algorithm was compared with the MC-SENSE+CG reference scheme and another variant of the SEDIMENT framework. All algorithms start with the CG-SENSE initialization, followed by macroscopic motion estimation, shot alignment and physiological motion estimation as shown in Figure 2.
MC-SENSE+CG is a non-iterative algorithm that solves the joint multi-shot diffusion problem in Equation (4) with a final multi-shot CG, using the motion estimates obtained once from the initial shot images.

Prior-MC SEDIMENT (prior macroscopic motion-corrected SEDIMENT) adapts the iterative procedure of SEDIMENT, but skips the macroscopic motion estimation after the initial estimate. This variant was implemented to evaluate the necessity of combined iterative physiological and macroscopic motion correction. The repetitive registration should yield performance gains to justify the increased computational load.

Numerical simulations
The algorithms were evaluated both in simulations and in vivo. The simulations were performed using the BrainWeb phantom 53 from the $T_1$-weighted normal brain database and 1 × 1 × 1 mm³ resolution. The phantom was padded to a matrix size of $256 \times \lfloor 256/N_I \rfloor N_I$ in the readout and phase-encoding directions, respectively, ensuring equal but shifted trajectories for all EPI shots.
The simulation data was prepared in five steps according to the forward model. First, the motion-induced phase variations were created and applied for each shot as random functions of second spatial order as presented by Hu et al. 31 Second, shot-wise rigid in-plane motion was uniformly sampled from a range of ±5 pix (±5 mm) and ±10° and the shot data was transformed accordingly. Third, 12 2D Gaussian sensitivity maps were arranged circularly around the image center. The disturbed shot data was multiplied by the CSMs to obtain multi-shot multi-coil data.
Fourth, complex Gaussian noise with zero mean and equal variance for the real and imaginary parts was added in image space according to the predefined SNR. Finally, the data was undersampled in k-space according to the shot trajectories (optionally including PF acquisition).
The BrainWeb phantom was prepared for {2, 3, 4, 5, 6} shots and SNRs of {5, 10, 15, 20} without PF trajectories. The simulation data was reconstructed by the three algorithms without data rejection. The nRMSE and reconstruction time were used to measure performance. Total performance was measured as the average over 10 random simulation cases for each shot-SNR pair.

In vivo experiments
The in vivo experiments were executed on a 3 Tesla Philips Ingenia Scanner (Philips Healthcare, Best, The Netherlands) using a head coil with 13 channels. The data was obtained from six healthy volunteers. Informed consent was obtained according to the rules of the institution.
The multi-shot echo-planar brain DWI experiments were performed using conventional Stejskal-Tanner diffusion encoding within a spin echo sequence 54 and magnetization-prepared fat suppression. The DWI data was obtained in both full and partial Fourier acquisitions with four and six shots for a b-value of 1000 s/mm² in three orthogonal directions. The subjects were asked to perform random in-plane motion from shot to shot within the head coil. DTI experiments were executed with four and five shots for a b-value of 1000 s/mm² in 15 diffusion directions. Here, both static and gross motion-corrupted data was acquired. CSMs were acquired by precalibration. Relevant parameter settings are listed in Table 2.

Implementation details
The reconstructions were conducted using Python 3.6.5 on a system with a 2.7 GHz Intel Core i7 4-core CPU and 16 GB RAM. The preparations of the CSMs included masking and coil compression. The sensitivities were masked 6 by a threshold in advance of the presented reconstructions.
The threshold was set to 10% of the body coil magnitude maximum value. In addition, binary closing (10 iterations) and binary dilations (5 iterations) were performed using the SciPy library. In simulations, the phantom was used as a reference image for thresholding after smoothing with a 2D Gaussian filter (σ = 15). Coil compression was performed by principal component analysis with a 99% threshold using the singular value decomposition in NumPy. This resulted in a reduction from 13 to 7 coils.
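A PCA-based coil compression via the singular value decomposition in NumPy could look like the sketch below. It interprets the 99% threshold as the fraction of retained signal energy (cumulative squared singular values), which is an assumption; the function name and the choice to derive the compression matrix from the sensitivity maps are likewise illustrative.

```python
import numpy as np

def compress_coils(csm, energy=0.99):
    """Compress coil sensitivities to the fewest virtual coils retaining `energy` of the signal.

    csm: complex array of shape (n_coils, n_pe, n_ro).
    Returns the compressed maps and the (n_virtual, n_coils) compression matrix.
    """
    n_coils = csm.shape[0]
    flat = csm.reshape(n_coils, -1)                       # coils x pixels
    u, s, _ = np.linalg.svd(flat, full_matrices=False)    # principal coil directions in u
    cumulative = np.cumsum(s ** 2) / np.sum(s ** 2)
    n_virtual = min(int(np.searchsorted(cumulative, energy)) + 1, n_coils)
    compression = np.conj(u[:, :n_virtual]).T             # orthonormal rows
    compressed = (compression @ flat).reshape((n_virtual,) + csm.shape[1:])
    return compressed, compression
```

The same compression matrix would also have to be applied to the multi-coil k-space data so that data and sensitivities stay consistent.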
Non-uniform sampling on EPI ramps was adjusted by gridding in advance. Fourier transforms were performed using the FFT of the NumPy library. The PSF-based undersampling was used whenever possible (not for PF acquisition) to avoid FFTs including data projections and CG gradient computations.
The CG calculations were stopped by either the residual norm criterion 7 or a maximum number of iterations. The residual norm tolerance was set to $10^{-4}$. The maximum iteration count was empirically set to 12 and 10 for single- and multi-shot reconstructions, respectively. Coil sensitivity normalization 7 was applied for all CG methods.
The macroscopic motion was estimated using the rigid registration described in the fast elastic image registration 56 framework. After an exhaustive presearch, the registration performs a Gauss-Newton scheme with Armijo's step size rule. A normalized gradient field 57 metric was used to stabilize the registration against intensity variations in the g-factor areas. 6 To stabilize convergence, the shots were aligned at their joint average location after each registration by subtracting the mean rigid parameters of all included shots (in $\mathcal{I}$). 39,58 Furthermore, registration parameters below 0.01 pix (about 10 μm) and 0.01° were ignored and set to zero.
Demanding registration accuracies finer than these thresholds was considered excessive, as it hampers convergence.
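The two stabilization steps, centering the rigid parameters on their joint average and ignoring very small values, can be sketched as follows. The parameter layout (two shifts in pixels and one angle in degrees per shot), the order of the two steps and the function name are assumptions for illustration.

```python
import numpy as np

def stabilize_rigid_parameters(params, accepted, min_shift=0.01, min_angle=0.01):
    """Center rigid parameters of the accepted shots and zero out negligible values.

    params: array of shape (n_shots, 3) with columns
            (shift_pe [pix], shift_ro [pix], rotation angle [deg]).
    accepted: list of shot indices currently included in the reconstruction.
    """
    params = np.asarray(params, dtype=float).copy()
    # Align the shots at their joint average location.
    params[accepted] -= params[accepted].mean(axis=0)
    # Ignore parameters below the assumed precision limits.
    limits = np.array([min_shift, min_shift, min_angle])
    params[np.abs(params) < limits] = 0.0
    return params
```

Centering removes the ambiguity of a common rigid offset shared by all shots, so the joint image does not drift from one iteration to the next.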
The rigid shot alignment with Ω i was performed using a k-space formulation that avoids gridding. 39 Translations were applied in k-space by multiplying phase ramps according to the Fourier shift theorem. Rotations were implemented as a concatenation of three shears applied in k-space. 59 The action of Ω i was thus implemented by subsequent rotational and translational operators.
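The translational part of the rigid alignment follows directly from the Fourier shift theorem and can be sketched as below. The function name and the sign convention (positive shifts move the image content towards higher indices) are assumptions; the shear-based rotation is not shown.

```python
import numpy as np

def translate(image, shift_pe, shift_ro):
    """Shift an image by (possibly sub-pixel) offsets using linear phase ramps in k-space."""
    n_pe, n_ro = image.shape
    k_pe = np.fft.fftfreq(n_pe)[:, None]    # spatial frequencies in cycles per pixel
    k_ro = np.fft.fftfreq(n_ro)[None, :]
    ramp = np.exp(-2j * np.pi * (k_pe * shift_pe + k_ro * shift_ro))
    return np.fft.ifft2(np.fft.fft2(image) * ramp)
```

For integer shifts the result coincides with a circular shift of the pixel grid, and applying the negated shifts inverts the operation exactly because the phase ramp has unit magnitude.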
The SEDIMENT phase filters were k-space window functions avoiding phase unwrapping. For physiological motion estimation, the shot phases $\phi_i$ were smoothed by a 2D triangular window in k-space using 2D-FFTs. 29 For full Fourier acquisition, the window size was scaled to half the image size. For PF acquisitions, the range in the phase-encoding direction was limited to the symmetric area. MC-SENSE+CG used the phase unwrapper and 2D median filter from the SciPy library. The median filter kernel was set to 9 × 9 pixels applied on unwrapped full resolution phases. Phase estimation was disabled for non-DWI datasets ($b_0 = 0$ s/mm²). PF projection phases $\psi_i$ were obtained using a 2D window with 1D Hann shape in the phase-encoding direction scaled to the size of the symmetric sampling area in the k-space center.
The iterative algorithms were stopped either by a convergence criterion or by a maximum iteration count. The mean-square error $\epsilon^{(k)} = \lVert \rho^{(k)} - \rho^{(k-1)} \rVert_2^2 / \lVert \rho^{(k-1)} \rVert_2^2$ of subsequent iterations was used as the convergence criterion, 29 where $k$ is the iteration number. In this work, convergence was assumed when $\epsilon^{(k)}$ dropped below a tolerance of $10^{-6}$ or when a number of 200 iterations was exceeded.
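The stopping rule can be sketched as a generic iteration loop. The callable `update`, which stands for one full pass of the iterative scheme and maps the current joint image to the next one, is a hypothetical placeholder, as are the function names.

```python
import numpy as np

def relative_update(rho_new, rho_old):
    """Squared norm of the change between iterations, normalized by the previous iterate."""
    return np.linalg.norm(rho_new - rho_old) ** 2 / np.linalg.norm(rho_old) ** 2

def iterate_until_converged(update, rho_init, tol=1e-6, max_iter=200):
    """Repeat `update` until the relative update drops below `tol` or max_iter is reached."""
    rho = rho_init
    k = 0
    for k in range(1, max_iter + 1):
        rho_new = update(rho)
        err = relative_update(rho_new, rho)
        rho = rho_new
        if err < tol:
            break
    return rho, k
```

A simple contraction such as `update = lambda r: 0.5 * r + 1.0` can be used to try the loop; it stops after a handful of iterations close to the fixed point 2.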

Simulation results
The outcomes of the multi-shot DWI BrainWeb simulations are compiled in Figure 3. The bar plots in Figure 3A and 3B show the nRMSE and the durations over varying segmentations for all three methods, namely MC-SENSE+CG, Prior-MC SEDIMENT and SEDIMENT. The algorithms perform similarly for low segmentations, whereas for more than three shots the iterative methods achieve significantly lower errors than MC-SENSE+CG.

The BrainWeb simulations emphasize the importance of iterative reconstruction, especially for high segmentations. Exceeding three shots, the SENSE-induced g-factor penalty 6 increasingly deteriorates the shot navigators, resulting in corrupted final images. The iterative phase estimation realized by Prior-MC SEDIMENT effectively cures ghosting and shading artifacts compared with MC-SENSE+CG, but in the presence of rigid motion Prior-MC SEDIMENT relies on the initial macroscopic motion guess, which has occasionally proven to be insufficient.
SEDIMENT's repeated gross motion estimation refines the associated parameters at the cost of increased computational load. The convergence with in-plane rigid motion estimation is more unsteady due to its non-convex nature, but nevertheless SEDIMENT decently reconstructs even severely corrupted datasets. Besides, there are cases in which the present motion is not completely corrected by SEDIMENT. Convergence was stabilized by averaging the shot motion parameters 58 and limiting the registration precision to ±0.01 pix and ±0.01°.

In vivo results
The in vivo performance of the three algorithms for full and partial Fourier multi-shot DWI is compared for six-shot datasets in Figure 4. The full Fourier reconstructions in Figure 4A show the final shot and joint magnitude images of MC-SENSE+CG, Prior-MC SEDIMENT and SEDIMENT.
The dataset contains only minor inter-shot gross motion. MC-SENSE+CG uses the initial CG-SENSE shot images and performs no further shot updates. These shot images thus also represent the shot initializations of the other two methods. The individual shots contain strong noise propagation artifacts in areas where the signal is suppressed due to the CG regularization by early stopping. 45 The MC-SENSE+CG joint image is strongly corrupted by ghosting and signal dropout, whereas the anatomical structures, such as the interhemispheric fissure, appear unblurred.
This suggests sufficiently accurate rigid motion estimation in the first step, but deficient phase estimation. The iterative recoveries appear artifact free and are hardly distinguishable. The convergence in Figure 4C is almost congruent.
The PF dataset in Figure 4B appears severely affected by inter-shot gross motion. The MC-SENSE+CG shot images contain similar artifacts as in Figure 4A as well as slight blurring artifacts in the phase-encoding direction from uncorrected PF acquisition. The MC-SENSE+CG joint image is severely deteriorated. Prior-MC SEDIMENT cured the shot phase-related artifacts by iterating over the phase, but the images still contain strong blurring from inadequate registration. SEDIMENT yields decent final images.
The PF convergence criterion in Figure 4D drops unsteadily for both iterative methods. SEDIMENT needed fewer iterations (but more time) due to the enhanced consistency by repeated image registration. The joint image of SEDIMENT at iteration 50 was inserted to analyze the

Conversely, SEDIMENT has been shown to recover decent images even for inaccurate initial gross motion guesses, as demonstrated in Figure 4B.
The SENSE-enabled registration apparently casts the algorithm into a sufficiently accurate (but still possibly local) minimum.

In vivo data rejection
For some datasets, the motion-corrected SEDIMENT reconstruction failed due to the non-convex nature of the problem. In these cases, the SNR of the shot images normally impedes the estimation of sufficiently accurate rigid motion parameters. Figure 5 shows a final joint image example and the convergence criteria, respectively, of a six-shot SEDIMENT reconstruction with and without data rejection. The SEDIMENT reconstruction without rejection is blurred by rigid shot motion mismatches. Iterative shot data rejection with a tolerance of 0.46 results in an uncorrupted image. The convergence of SEDIMENT with shot rejection contains two strong spikes compared with the relatively continuous evolution without rejection. The black dotted line indicates the number of rejected shots in each iteration.
The iterative data rejection of SEDIMENT initially excludes four shots by the nRMSE consistency measure and combines just the two remaining shots. These two shots contribute to the joint reconstruction and together enhance SNR. This, in turn, improves the conditions for image registration in subsequent iterations so that more and more shots can be included by consistency, until after about 25 iterations all shots are included.

In vivo DTI results
For the in vivo DTI data, the multi-shot datasets for each diffusion direction were reconstructed with SEDIMENT. Using Dipy, 55 the resulting joint images per direction were aligned by affine registration to subsequently estimate the tensors. The resulting FA maps for a four-shot and a five-shot case from two different subjects are presented in Figure 6A.  Figure 6B shows the rigid motion parameters estimated by SEDIMENT over the full data acquisition. The shot index indicates the chronological excitation number in the experiment and therefore represents a surrogate of time.
Subject 1 moved just slightly during the scan (about ±2° and ±2 mm), producing blurring and kinks from the rigid mismatch, which propagate through the direction-wise reconstructions into the tensor estimates (pink arrow). The motion correction greatly reduces these artifacts, except for minor blurring and noise-like structures (red arrows), and recovers important white matter structures. In contrast, Subject 2 performed large movements during the scan (about ±10° and ±5 mm). The strong misalignment led to heavy gross motion artifacts in the direction-wise reconstructions, which render the FA maps unusable. The iterative motion correction is able to align the shot data sufficiently to restore the overall structures and the SNR of the motion-corrupted data compared with the static reference.

FIGURE 6 The gross motion correction and the data rejection were disabled for the SEDIMENT reconstructions in the center column. Subject 1 only performed small gross motion during the scan, which produced blurring and kinks in the FA maps (pink arrow). Subject 2 performed strong motion during the scan (about ±10° and ±5 mm). The uncorrected mismatch severely deteriorated the SEDIMENT reconstructions, resulting in a poor FA map. The iterative rigid correction is able to mitigate the motion-induced artifacts, except for minor blurring and noise-like artifacts (red arrows). Note that the static and motion cases of Subject 1 are slightly different slices. B, Estimated rigid motion parameters for the motion case over the full data acquisition (shot index is a surrogate of time).

DISCUSSION
SEDIMENT provides decent multi-shot DWI reconstructions in the presence of physiological and inter-shot in-plane macroscopic motion. The model-based algorithm successfully combines multiple segments and augments the coverage of motion corruption scenarios shown in Figure 1.
The iterative scheme achieves superior image quality compared with non-iterative methods such as AMUSE, 30 making higher EPI segmentations feasible, although the achievable segmentation is still limited for two reasons. First, the g-factor penalty affects the SENSE-based shot motion navigators, which impedes data consistency for high segmentations and impairs robustness. Second, the computational load increases with the number of shots.
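To illustrate the g-factor penalty mentioned above, the sketch below evaluates the standard SENSE g-factor for one group of aliased pixels. It assumes white, uncorrelated coil noise (no noise covariance matrix) and no regularization; the function name `sense_g_factor` and the toy sensitivities are hypothetical and do not come from the SEDIMENT implementation.

```python
import numpy as np

def sense_g_factor(S):
    """SENSE g-factor for one set of R pixels that alias onto each other.

    S: complex or real array of shape (n_coils, R) holding the coil
       sensitivities at the R aliased positions.
    Assumes identity noise covariance (an assumption of this sketch).
    Returns an array of length R with g >= 1.
    """
    S = np.asarray(S)
    gram = S.conj().T @ S                 # S^H S, shape (R, R)
    gram_inv = np.linalg.inv(gram)
    return np.sqrt(np.real(np.diag(gram_inv)) * np.real(np.diag(gram)))

# Orthogonal sensitivities: unfolding is perfectly conditioned, g = 1.
g_orthogonal = sense_g_factor(np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]))

# Nearly parallel sensitivities: unfolding amplifies noise, g >> 1.
g_similar = sense_g_factor(np.array([[1.0, 1.0], [0.0, 0.1]]))
```

Each shot of an N-fold segmented acquisition is N-fold undersampled, so the shot navigators are unfolded with R equal to the number of shots, and the conditioning of `gram` typically worsens as R grows.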
The initial CG-SENSE shot-image guesses crucially affect the robustness in dealing with the non-convex macroscopic motion estimation.
SEDIMENT generally uses convex optimization strategies, so it can only produce acceptable results for sufficiently precise initial guesses. Herein, SENSE provides a reliable technique to resolve the undersampling and thereby empowers the algorithm to leverage non-convex motion effects.
SEDIMENT has even shown convergent behavior with acceptable results for visibly inaccurate initial motion estimates. In general, the condition of the individual SENSE-based shot problems determines both convergence and the feasibility itself.
SENSE-based shot navigators generally benefit from enhanced coil orthogonality, more channels or proper regularization. 5 Moreover, the trajectory and its undersampling pattern affect the g-factor characteristics. Spirals, 29 variable-density spirals 21 or Cartesian trajectories with extra lines for self-navigation 22,23 provide crucial low-resolution signal for motion estimation. In contrast, the EPI interleaves acquire varying amounts of central k-space energy, resulting in different degradations of the shot reconstructions. As an example, the single shots 0 to 5 in Figure 4B contain about 25, 9, 8, 5, 26 and 27% of the total signal energy, which qualitatively matches the visual impression. Moreover, randomized sampling schemes enable supportive sparsity-enforcing regularization 43 of the joint problem in Equation (3), but this is difficult to realize for EPI.
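The uneven energy distribution over EPI interleaves can be made concrete with a small sketch. It assumes a simple interleaving in which shot s acquires the phase-encoding lines s, s + N, s + 2N, and so on, and it uses a synthetic Gaussian object, so the numbers it produces are not those of Figure 4B. The function name `shot_energy_fractions` is hypothetical.

```python
import numpy as np

def shot_energy_fractions(kspace, n_shots):
    """Fraction of total k-space energy acquired by each interleaved shot.

    kspace: 2D complex array with phase encoding along axis 0.
    Assumes shot s samples lines s, s + n_shots, s + 2 * n_shots, ...
    """
    energy = np.abs(kspace) ** 2
    total = energy.sum()
    return np.array([energy[s::n_shots].sum() / total for s in range(n_shots)])

# Synthetic smooth object (Gaussian blob) as a stand-in for a brain slice.
n = 96
x = np.linspace(-1.0, 1.0, n, endpoint=False)
xx, yy = np.meshgrid(x, x, indexing="ij")
image = np.exp(-(xx**2 + yy**2) / (2 * 0.3**2))

# Centered k-space: the DC line sits at index n // 2 = 48 along axis 0.
kspace = np.fft.fftshift(np.fft.fft2(image))
fractions = shot_energy_fractions(kspace, n_shots=6)
```

In this toy case the shot containing the central line (index 48, which belongs to shot 0 for six shots) carries most of the energy, while shots whose lines lie farther from the k-space center receive little low-resolution signal for navigation.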
In addition, PF acquisition allows for significant echo time reductions, which, at the same time, improves the feasibility of T2-critical applications for abdominal diffusion. The rapid T2 signal decay experienced, for example, in prostate DWI complicates image recovery for conventional EPI trajectories. PF techniques shorten the k-space trajectory so that the echo top is reached earlier. The echo time reduction can help to increase SNR and reduces T2 imprinting issues in DWI. Nevertheless, PF reconstructions require the signal phase to be slowly varying in the phase-encoding direction. This assumption might be unjustified due to, for example, susceptibility variations, strong field inhomogeneities or flow, 41 and must be reconsidered for each application scenario.
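A rough back-of-the-envelope sketch shows how PF shortens the readout before the echo top. It assumes a simplified timing model in which each shot acquires (PF - 0.5) of its share of the phase-encoding lines before reaching the k-space center and in which the echo time can be reduced by exactly the time saved; the matrix size, echo spacing and T2 value are illustrative assumptions, not protocol parameters from this work.

```python
import math

def echoes_before_center(n_pe, n_shots, pf):
    """Number of EPI echoes acquired before the k-space center in one shot.

    n_pe: number of phase-encoding lines of the full matrix.
    n_shots: segmentation factor (lines are shared among the shots).
    pf: partial Fourier factor, 0.5 < pf <= 1.0.
    Simplified model: the omitted lines all precede the k-space center.
    """
    if not 0.5 < pf <= 1.0:
        raise ValueError("pf must lie in (0.5, 1.0]")
    return int(round((pf - 0.5) * n_pe / n_shots))

def relative_signal_gain(delta_te_ms, t2_ms):
    """Signal ratio exp(delta_TE / T2) obtained by shortening TE by delta_TE."""
    return math.exp(delta_te_ms / t2_ms)

echo_spacing_ms = 0.8                                   # assumed value
full = echoes_before_center(192, 4, pf=1.0)             # full Fourier
partial = echoes_before_center(192, 4, pf=0.75)         # 6/8 partial Fourier
delta_te_ms = (full - partial) * echo_spacing_ms
gain = relative_signal_gain(delta_te_ms, t2_ms=80.0)    # assumed T2
```

Under these assumptions the saved readout time scales inversely with the number of shots, so the additional benefit of PF is largest for low segmentation factors and short-T2 tissue.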
Iterative data rejection is a crucial reconstruction element to improve robustness by including only consistent shot data. As an example, the shot combination might become unfavorable for failed registrations, through-plane motion (see Figure S1 in supporting material) or contrast variations 47 when intra-shot motion occurs during the diffusion encoding process. For this purpose, the iterative SEDIMENT approach allows one to selectively include datasets for consistent reconstruction, improving convergence and reliability.
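One possible way to decide which shots are consistent is sketched below. It is a hypothetical outlier rule, not the rejection criterion implemented in SEDIMENT: each shot is assumed to come with a normalized data-consistency residual (for instance the relative norm of the difference between measured and predicted shot data), and shots whose residual exceeds the median by more than `k` robust standard deviations are excluded.

```python
import numpy as np

def select_consistent_shots(residuals, k=3.0):
    """Boolean mask of shots to keep, based on a median/MAD outlier rule.

    residuals: one normalized data-consistency residual per shot
               (how these are computed is outside this sketch).
    k: rejection threshold in robust standard deviations (assumed value).
    If all residuals are identical, every shot is kept.
    """
    r = np.asarray(residuals, dtype=float)
    med = np.median(r)
    # 1.4826 * MAD estimates the standard deviation for Gaussian data.
    mad = 1.4826 * np.median(np.abs(r - med))
    return r <= med + k * mad

# Toy residuals: shot 4 is inconsistent, e.g. after a failed registration.
residuals = [0.10, 0.11, 0.09, 0.10, 0.45, 0.12]
keep = select_consistent_shots(residuals)
```

In an iterative scheme such a mask could be re-evaluated after every motion update, so that a shot rejected early on can re-enter the reconstruction once its motion estimate has improved.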
Compared with AMUSE, 30 which is mimicked by MC-SENSE+CG in the present work, the current SEDIMENT implementation for multi-shot DWI neglects the implications of rotational motion on the diffusion contrast and is thus only valid for small rotations under the assumption of a smooth q-space. Considering the integration of SEDIMENT into a DTI framework, the refined motion estimates could be leveraged to achieve more accurate tensor estimates. However, the initial tensor estimation becomes more challenging for higher segmentations as well. Using the initial SENSE-based shot estimates, the g-factor penalty complicates appropriate initial tensor guesses. In contrast, the consistency-based integration into the iterative framework further increases the computational load and raises questions about the numerical stability of the tensor model. Hence, the integration of SEDIMENT into a DTI framework is subject to future research.
Currently, SEDIMENT neglects macroscopic motion-induced variations of the CSMs by making several assumptions. The use of equal CSMs first implies that the electromagnetic properties at each point in space remain unchanged after small macroscopic object motion. The coil setup is thereby assumed to be at rest and independent of the body, in contrast to setups that are attached. Second, macroscopic motion disturbance of the low-resolution SENSE reference scan is neglected. Third, the coarse SENSE reference scans are assumed to cover the whole motion range of the subject within the coil. As an alternative approach, sensitivity estimation could be incorporated into the model-based reconstruction. 60 The presented retrospective algorithms could further be fruitfully fused with prospective motion correction approaches. 61 Strong subject motion can corrupt the DWI data in various ways, stressing the implemented motion models. Prospectively navigated acquisitions reduce the prevalent artifacts in the data, providing enhanced conditions for retrospective corrections.
Possible future extensions to SEDIMENT involve learning-based modules for rigid motion estimation, phase denoising and data rejection.
Hitherto, the rigid registration and the phase estimation ignore the SENSE-related g-factor distribution over the image, which could be used to suppress areas of high noise propagation. Moreover, their implementation normally includes filters, thresholds and weighting factors that are manually determined and that are sensitive to SNR and, thus, the segmentation.
Feature-based deep learning modules could provide enhanced motion parameter estimates and rejection tools, which are separable from the joint image recovery. As an example, Bilgic et al 44 proposed to jointly denoise the magnitudes of segmented multi-echo reconstructions using a neural network and to use the denoised magnitudes to regularize phase estimation using phase cycling. 62 The enhanced phase maps are then used for conventional reconstruction. In this way, the influence of neural networks is restricted to the estimation of motion parameters, and erroneous estimates appear with well defined artifact shapes in the conventional physics-based reconstruction. This modular inclusion opposes fully end-to-end approaches such as Variational Networks 63 or AUTOMAP. 64

CONCLUSION
SEDIMENT is a SENSE-navigated iterative scheme that improves state-of-the-art multi-shot DWI reconstruction, correcting for motion-induced phase variations and rigid inter-shot in-plane motion. The continuously refined shot motion estimates enable the consistent combination of multiple shot datasets, thereby making high EPI segmentations feasible in the presence of macroscopic motion. The algorithm supports PF reconstructions and strategies for shot data rejection to boost SNR and robustness, paving the way for further data-driven multi-shot DWI applications. The presented scheme provides an adjustable modular framework, enhancing reconstruction quality and speed to ease clinical adoption of the method in DWI applications.