Resolving degeneracy in diffusion MRI biophysical model parameter estimation using double diffusion encoding

Purpose Biophysical tissue models are increasingly used in the interpretation of diffusion MRI (dMRI) data, with the potential to provide specific biomarkers of brain microstructural changes. However, it has been shown recently that, in the general Standard Model, parameter estimation from dMRI data is ill‐conditioned even when very high b‐values are applied. We analyze this issue for the Neurite Orientation Dispersion and Density Imaging with Diffusivity Assessment (NODDIDA) model and demonstrate that its extension from single diffusion encoding (SDE) to double diffusion encoding (DDE) resolves the ill‐posedness for intermediate diffusion weightings, producing an increase in accuracy and precision of the parameter estimation. Methods We analyze theoretically the cumulant expansion up to fourth order in b of SDE and DDE signals. Additionally, we perform in silico experiments to compare SDE and DDE capabilities under similar noise conditions. Results We prove analytically that DDE provides invariant information non‐accessible from SDE, which makes the NODDIDA parameter estimation injective. The in silico experiments show that DDE reduces the bias and mean square error of the estimation along the whole feasible region of 5D model parameter space. Conclusions DDE adds additional information for estimating the model parameters, unexplored by SDE. We show, as an example, that this is sufficient to solve the previously reported degeneracies in the NODDIDA model parameter estimation.


| INTRODUCTION
Diffusion MRI (dMRI) has been established as an invaluable tool for characterizing brain microstructure in vivo and non-invasively. Diffusion weighted images (DWIs) are sensitive to the random displacement of water molecules within a voxel, 1 probing tissue on scales considerably lower than image resolution. 2 Diffusion MRI provides the aggregate signal from the distribution of components within a voxel. By measuring across multiple diffusion orientations and weightings, information about the underlying tissue architecture can be unravelled. The ability to detect small alterations in brain tissue is a key factor when developing biomarkers for early stages of neurodegenerative diseases. 3 Various approaches to derive information from Diffusion Weighted Images (DWI) have been proposed in the literature. [4][5][6][7][8] Most direct approaches, such as Diffusion Tensor Imaging (DTI), 4 are just aimed at describing the main MRI signal characteristics (signal representations, 9 ). However, the quest for specific information on tissue microstructural integrity inspired the development of biophysical tissue models. [10][11][12][13] By assuming certain characteristics for the tissue, such as the type of constituents, their geometry and physical properties, these models may allow the extraction of more specific microstructural information than signal representations, as long as these assumptions are at least approximately satisfied by the tissue. Nevertheless, the validity of these results relies on how accurate the model is for the tissue under study. The widely used Neurite Orientation Dispersion and Density Imaging (NODDI) 14 model fixes the diffusivity values of the compartments present in the voxel to specific values. NODDI's assumptions have been shown to be incompatible with data from spherical tensor encoding (STE) in Lampinen et al 15 and it has been argued to introduce bias in the estimation of the remaining model parameters. 16 To overcome this limitation, Jelescu et al 17 extended the model by adding the diffusivities to the estimation routine, and removing the CSF compartment. They dubbed it NODDIDA (NODDI with Diffusivity Assessment). While this approach eliminated some flawed assumptions made by NODDI, this led to multiple possible solutions that describe the signal equally well. This reflects that the estimation problem is ill-posed or, at least, illconditioned, and is usually stated as the existence of degenerated model parameter sets. Recent work by Novikov et al showed that this degeneracy is intrinsic to the so-called Standard Model (SM), 18 of which NODDIDA is a special case. They show that choosing the correct solution is challenging even with the use of high b-value data, although Jespersen et al 19 obtained stable estimations in ex-vivo brain tissue using extremely high b-values (15 ms/μm 2 ). Reisert et al 20 proposed a supervised machine learning approach trained with the expected value of the Bayesian posterior, which, by definition, disregards the possible multimodality of the distribution. Furthermore, it was trained on simulated data with the prior assumption of similar traces for the intra-and extra-axonal diffusivities.
Most of the dMRI techniques have been developed for an acquisition performed within a Single Diffusion Encoding (SDE) framework. Since Stejskal and Tanner developed the Pulsed Gradient Spin Echo (PGSE) sequence, 21 there have been many works aimed at maximizing the information that can be obtained from a dMRI experiment by exploring different acquisition protocols. 22,23 One of the many modifications proposed to the magnetic gradient waveforms involves the addition of multiple gradient pairs. Particularly, a scheme that has lately gained popularity is termed double diffusion encoding (DDE), 24 first proposed by Cory et al. 25 The term DDE refers to any sequence consisting of two consecutive diffusion encodings. It has been shown that DDE, as well as other multiple encoding schemes, has the potential to provide new information that is not immediately accessible with SDE. 26 Many groups focused on developing methods for extracting microstructural information based on this scheme. [27][28][29][30] Jespersen et al 31 showed that in the low-diffusion-weighting limit, the information extracted from single and multiple diffusion encodings is the same. A recently developed dMRI framework based on q-space trajectory encoding (i.e. multidimensional diffusion encoding) was proposed to probe tissue in ways not accessible by SDE. 32 For tissues comprised of multiple Gaussian compartments (MGCs) any q-space trajectory is equivalent to a second order b-tensor, which generalizes the concept of b-value. In such systems SDE and DDE are fully specified by b-tensors, with one and two non-zero eigenvalues, respectively, and are also called linear tensor encoding (LTE) and planar tensor encoding (PTE), in case of DDE with perpendicular directions. Lampinen et al 15 have analyzed the advantages of a multidimensional encoding over SDE NODDI. They proved that extending the acquisition increases the accuracy in quantifying microscopic anisotropy. However, it has not been fully explored, from the point of view of fitting a biophysical model to noisy measurements, if single or multiple encodings can provide us with more precise model parameter estimates (cf. 29,30  linear with planar or spherical tensor encoding to address the degeneracy and increase the precision of parameter estimation have been investigated [33][34][35] in both in silico and/or in vivo experiments. Their results show that the estimation precision is increased by the addition of these orthogonal measurements. However, a theoretical background of why this happens is still missing. This paper extends NODDIDA to a DDE scheme and assesses the accuracy of estimators based on SDE and DDE measurements. This extension adds more degrees of freedom to the data acquisition (i.e. two diffusion encoding periods must be chosen). We hypothesized that DDE acquisition protocols containing both parallel and perpendicular direction pairs might outperform SDE protocols in informing biophysical models. We investigated analytically the different information provided by DDE and SDE in terms of their fourth order cumulant expansions. We examine the ill-posed nature of the parameter estimation from SDE and present a theoretical explanation of why DDE resolves the degeneracy (except for the completely isotropic case κ = 0) without requiring extremely high diffusion weightings (e.g. b > 4 ms∕μm 2 ). Additionally, we generated in silico dMRI measurements for acquisitions with different DDE configurations from a wide range of model parameter values covering the biologically feasible region of the 5D parameter space. Under similar experimental conditions, higher accuracy and precision is obtained for DDE combining parallel and perpendicular direction pairs, outperforming SDE in most scenarios.

| Biophysical model assumptions
A general assumption among multi-compartment models representing tissue microstructure is that water exchange between compartments is negligible for typical experimental time scales. The total signal is the weighted contribution from each compartment. The two-compartment model dubbed Standard Model is the most general version of the typical models used for diffusion in neural tissue (see Ref. 36 ). The stick compartment (sometimes referred as intra-neurite) represents axons, which are expected to be the main contributors to the restricted diffusion signal, and, possibly, dendrites and glial processes. 19 The inclusion of dendrites and glial processes is open to discussion 36 and implies the assumption that in certain regimes (depending on e.g. diffusion time) they have similar diffusivity and T 2 relaxation properties and directional distribution, a question which still has not been fully addressed (see partial discussion in Lampinen et al 37 ). Sticks are zero-radius cylinders and model fibers in which diffusion is assumed to occur only along the fiber's main direction as it was first proposed for water in neurites in Jespersen et al. 13 Later, Nilsson et al 38 showed theoretically that typical axonal diameters cannot be resolved with SDE and gradient amplitudes available on clinical scanners and thus, are indistinguishable from sticks. This was also confirmed experimentally in Veraart et al. 39 The second compartment represents the extra-neurite space where diffusion is hindered and is modeled as Gaussian anisotropic 19 (zeppelin compartment). A fiber segment is defined as the local bundle of aligned sticks with the extra-neurite space surrounding them. Voxels are composed of a large number of fiber segments. The SM consists of the fiber segment signal model (i.e. kernel) with the diffusivities and water fraction as free parameters, together with a general fiber orientation distribution function (ODF), which could be represented by its spherical harmonics decomposition. One limitation of this model is that each fascicle within a voxel is assumed to have identical diffusion properties, leading to identical microstructural parameters.
Some other works consider a third compartment that represents the contribution from stationary water. 11,40 However, recent works 41 have concluded that the signal arising from this compartment can be neglected in most structures for the diffusion times used in the clinic and should only be considered in the cerebellum. 42 Additionally, in its original version, NODDI included an isotropic diffusion compartment to account for the presence of cerebrospinal fluid (CSF). This compartment was removed from NODDIDA for the sake of simplicity. 17 Considering a general fiber ODF involves a large set of parameters, which can hinder their unambiguous estimation from the dMRI signal. The NODDIDA model, 17 is essentially the SM with the constraint that the fiber ODF must be a Watson spherical distribution (û) = f (û|̂ , ), with concentration parameter κ and main direction ̂ (see Figure 1). This cylindrically symmetric ODF is usually considered a sufficiently good and parsimonious model, 43 especially for white matter regions without crossing fibers. Although being a simplified version of SM, NODDIDA still presents some degeneracy problems. Thus, in this work, we focus our analysis on the NODIDDA model.

| NODDIDA model with SDE
For a general SM, the signal from a SDE experiment, where the diffusion weighting b (i.e. b-value) is applied in the diffusion encoding direction n = [n x , n y , n z ] t , is given by the convolution over the unit sphere 18 is the response signal (kernel) from a fiber segment oriented along direction û. Here, f is the (mainly) T 2 -weighted stick volume fraction, D a the intra-neurite axial diffusivity, and Δ e = D ‖ e − D ⟂ e , with D ‖ e , D ⟂ e the extra-neurite diffusivities parallel and perpendicular to the fiber-segment axis. 36 These scalar kernel parameters (f, D a , D ‖ e , and D ⟂ e ) provide important tissue microstructural information, and have shown potential clinical relevance as they are sensitive to specific disease processes such as demyelination, axonal loss or inflammation. [44][45][46] It has been recently shown that the parameter estimation is challenging under normal experimental conditions. 17 There are two issues here. The first one is that fitting these models to noisy measurements is generally a non-convex optimization problem, potentially having several local minima of the objective function, requiring appropriate optimization algorithms. However, the existence of multiple local minima opens the door to a second, more serious, issue: the objective function can present multiple minima with equal or very similar values. In the presence of noise these minima are perturbed, making unstable which one becomes the global minimum. Jelescu et al 17 evidenced this ill-posedness issue for clinically feasible dMRI acquisitions in two particular cases. They showed that the estimated parameters from a collection of independently simulated dMRI measurements follow a bi-modal distribution, despite being simulated from a single ground truth, and the presence of practically indistinguishable spurious minima in the objective function.

| Parameter estimation from SDE: An ill-posed problem
A recent work by Novikov et al 18 analyzed in detail this inverse problem for the unconstrained SM by reparametrizing it into its rotational invariants. They concluded that without any constraints on the ODF shape, it was not possible to estimate the kernel parameters with an acquisition sensitive up to order (b 2 ). However, in this work we are interested in studying NODDIDA, where the ODF is given by a Watson distribution.
For intermediate diffusion weightings (i.e. b < 2.5 ms∕μm 2 ) the dMRI signal is accurately represented by its 4 th -order cumulant expansion 47  where F denotes the Dawson function. 50 Using these equations, we can derive the relations between the BP and DK parameters that fully describe this axially symmetric environment, as done in Hansen et al 51 for fully aligned fibers, but here for an arbitrary value of κ: where D = (2D ⟂ + D ‖ )∕3. Taking the limit for κ→∞ we recover the system of equations for parallel fibers presented in Hansen et al 51 (Equation 12). In Hansen et al 48 the equivalent to the system in Equation 6 is solved reaching two alternative equations for κ,  ± ( ) = 0, each giving possible solutions. This suggested that, in general, there should be two solutions, one for each branch. However, this is not always the case, as illustrated in Table 1. We derive here an alternative expression of the solution in one equation only. First, Equation 6 can be reparametrized as: After this substitution, Equation 6 can be expressed as a linear system of five equations for the 5 unknowns α, β, γ, δ and ε, decoupled into two independent smaller systems: Observe that the coefficients of matrices L and M depend on κ. We will ignore for the moment that the five unknowns are not independent. The solution is unique as long as matrices L and M are invertible. This is the case when κ ≠ 0, since det L = p 2 and det M = − 1 2 p 2 p 4 . In the limit of a fully isotropic medium (κ = 0) the system has only two independent equations, not allowing the recovering of the kernel parameters without additional information. By solving the two systems in Equation 8 we find expressions for α, β, γ, δ and ε that only depend on κ and the DK parameters (see Appendix A for solution). Those variables are actually defined from only four kernel parameters (Equation 7), resulting in the coupling equation T A B L E 1 Illustration of sets of biophysical (BP) parameter values resulting in the same diffusion-kurtosis (DK) parameters By plugging the expressions for α, β, γ, δ and ε as functions of κ into Equation 9, we obtain a nonlinear equation for κ with potentially multiple solutions. Each solution for κ gives a single solution for α, β, γ, δ and ε, which in turn, gives a single solution for the kernel parameters:

DK parameters Branch BP parameters C new invariants
Thus, the number of solutions to Equation 9 corresponds to the number of BP parameter sets that have the same DK parameters. Table 1 presents cases with up to four solutions. We computed the number of solutions for 10k random points in the BP parameter space. Most present two solutions (70.2%), some only one (29.3%), and only a small proportion have four solutions (0.5%). This gives rise to the previously discussed degeneracy in model parameter estimation from noisy measurements. 17 In contrast with the claim in Hansen et al 51 even in the extreme case of parallel fibers leaving only four unknowns, the five equations in Equation 6 are independent. This is possible due to the nonlinear nature of the system. If κ is known and not zero (including the limiting case κ → ∞ of parallel fibers), the full-system is invertible as long as f is not 0 or 1, and D ⟂ e is not null. In that case, each point in the DK parameter space (signal profile) corresponds to a single set of BP parameters. However, this is not the case for an arbitrary unknown κ. Here, the full-system has five independent equations with five unknowns, but, depending on the parameter values, it can have only one or multiple solutions. This latter case makes the inverse mapping an ill-posed problem.
Using very high b-values might be considered an option to solve this problem, as it will add higher order terms in Equation 3. However, it is still challenging due to very low associated signal-to-noise ratio (SNR) and is also unfeasible in most clinical scanners, although bespoke systems with ultra-strong gradients may provide leverage in this regard. 52 Another solution that does not require powerful gradients is to seek for independent measurements providing new information.

| Model extension to DDE
DDE adds an extra dimension to the dMRI acquisition, unexplored by SDE experiments. For a general multidimensional acquisition, 32,53 due to the assumption of impermeable compartments, within each of which the diffusion displacement profile is assumed to be Gaussian, the signal can be written as: with the kernel for b = tr(B). The b-tensor of a DDE acquisition is B = b 1n1 ⊗n 1 + b 2n2 ⊗n 2 , defined from the pair of gradient directions, n 1 , n 2 , and their individual diffusion weightings, b 1 , b 2 . It has in general two non-zero eigenvalues, viz. PTE. In contrast, the SDE's b-tensor, B = bn ⊗n, has only one non-zero eigenvalue, viz. LTE. Hence, for this model a SDE acquisition is a subset of the DDE acquisitions (SDE = DDE ‖ ⊂ DDE), for which n 1 =n 2 (parallel direction pair).

| DDE information gain
DDE can, in principle, provide independent complementary information. This could transform the inverse mapping of recovering BP parameters from diffusion-weighted measurements into a well-posed problem. The fourth order cumulant expansion for the dMRI signal arising from a DDE experiment is Here, C is the second cumulant tensor of the dMRI signal expansion in terms of the b-tensor and satisfies minor and major symmetries: but it is not totally symmetric. Its totally symmetric part is proportional to the kurtosis tensor: For MGCs or DDE with long mixing times, 31 D and C can be written as where f and D ( ) ij denote the fraction and diffusion tensor of compartment α, including in this summation the integral over the unit sphere with the ODF (cf. Equation 1). This motivated naming C as the diffusion tensor covariance. 31,32 Our definition of C coincides with the one in Westin et al 32 and for long mixing times it is also proportional to the Z tensor (C = Z∕(4Δ 2 )), earlier introduced in Jespersen. 31 The Z tensor is defined more generally, i.e. not restricted to MGCs, as a cumulant of the DDE signal.
In the case of a Watson ODF, W and C are transversely isotropic fourth order tensors, i.e. they have cylindrical symmetry. Hence, instead of having 15 and 21 independent components they only have three and five, respectively. We can write both tensors as a function of coordinate independent tensor forms (for full derivation see Appendix B), like it is done for W in Hansen et al 51 (Equation 6): where C was written separating its fully symmetric part from the remaining part, 54 and where ij is the Kronecker delta and ̂ the Watson distribution main direction. Equation 17 shows explicitly that C contains two extra degrees of freedom independent of W. Observe that the fully symmetric part of R and J vanishes, so that the information encoded in 1 and 2 is not accessible from a SDE experiment. 28 We can isolate the new non-symmetric components by the antisymmetrization Considering a coordinate frame with the z-axis parallel to the fibers main direction ̂ , we can identify Similarly to Equation 6 we can relate the elements of C to the biophysical parameters like it was done for W. For the SM, including NODDIDA, D and C are given by where For NODDIDA we get h 2 ( , ) = H (2) ij n i n j and h 4 ( , ) = H (4) ijk n i n j n k n , with =̂ ⋅n. The cross-terms of C present new information not accessible from SDE. This makes the DDE signal able to resolve the degeneracy. To make this explicit, we can write the components isolated in Equation 20 in the adapted coordinate frame in terms of BP parameters: Those two equations are independent to the ones in Equation 6, adding complementary information to the mapping between DK and BP spaces (see last column in Table 1). Using the same variables defined in Equation 7 we get These two equations enlarge the system in Equation 8. Following the derivation in Appendix C, we demonstrate that they determine a single solution for κ: since the left-hand side is a strictly monotone increasing function on κ. This agrees with recent work by Cotaar et al, 55 who showed that combining different b-tensor shapes can determine robustly fiber dispersion. Observe that the cases f = 0 or f = 1 reflect only an apparent degeneracy, as the different sets of parameters represent the same physical model. In contrast, the case of κ = 0 presents a proper degeneracy of the model due to lack of information, where different model instances have identical D and C tensors.

| Signal generation
All synthetic measurements were generated from substrates composed of 1 μm diameter cylinders to simultaneously assess our stick approximation. We found this difference was below the noise level. We computed the signal attenuation in the cylinder's perpendicular plane with the Gaussian phase approximation for both SDE 56 and DDE. 30 Since there is no closed analytical solution for the integral on the sphere in Equation 11, we computed the spherical convolution using Lebedev's quadrature 57 : where w i are the quadrature weights of each grid point û i across the unit sphere. For all configurations of SDE and DDE we used 1,202 quadrature points, which guarantee an exact result up to a 59 th order spherical harmonics decomposition of the ODF. No practical differences were found between the results from our SDE implementation and the analytic summation for SDE in Zhang et al. 43 Finally, Rician noise was added to the synthetic signals, normalizing it to obtain a SNR = 50 for the b 0 measurements, like in Jelescu et al. 17 (17)

| Parameter estimation algorithm
Parameter estimation was based on a nonlinear least squares estimator. This was justified due to the relatively high SNR considered for the experiments, where Rician noise can be approximated as Gaussian. 58 17 For all configurations, this optimization procedure was repeated using 30 independent random initializations for the model parameters to avoid local minima of the five-dimensional cost function. The local solution with the lowest residue was the global optimum.

| SDE and DDE tested configurations
Five encoding configurations were considered: DDE 60 + 0 , DDE 40 + 20 , DDE 30 + 30 , DDE 20 + 40 , and DDE 0 + 60 , with progressively increasing proportions of perpendicular direction pairs, b, with respect to parallel direction pairs, a, denoted as DDE a + b . Observe that DDE 60 + 0 is equivalent to SDE if MGCs are assumed. We compared the SDE protocol used in Jelescu et al 17 against different DDE acquisitions with the same number of measurements that can be measured in a similar experimental time. The SDE measurement protocol had two shells with b-values of 1 and 2 ms∕μm 2 with 30 directions each. 17 These directions were generated using the Sparse and Optimal Acquisition (SOA) scheme. 59 DDE configurations were also divided in two shells with the same b-values as above and both directions in each pair had equal individual diffusion weightings, b 1 = b 2 = 1 2 b. Thus, perpendicular direction pairs define axially symmetric planar b-tensors, uniquely defined by their normal vector. We generated homogeneously distributed normal vectors using the same algorithm used for the SDE directions. The DDE 30 + 30 acquisition had 30 parallel direction pairs and 30 perpendicular direction pairs with normal vectors coinciding with the parallel pairs 33 (see middle diagram in Figure 2). The DDE 0 + 60 protocol had only perpendicular directions pairs (right diagram in Figure 2). Configuration DDE 40 + 20 had two parallel per each perpendicular directions pair, and DDE 20 + 40 two perpendicular per each parallel directions pair. All acquisitions had five non diffusion-weighted measurements (i.e. b 0 measurements).

| Experiments
We performed two in silico experiments to assess whether the addition of DDE measurements can enhance the parameter estimation in the presence of typical noise in the measurements.
In the first experiment, we considered two possible instances of NODDIDA parameter values for a voxel in the posterior limb of the internal capsule (PLIC) taken from Jelescu et al 17 (see Table 2), for which SDE estimates showed a bimodal distribution. We explored in detail whether DDE solves the degeneracy between these particular cases. Only SDE and DDE 30 + 30 acquisition configurations were considered for this experiment. Two thousand and five hundred independent realizations of Rician noise were added to the synthetic SDE and DDE signals.
The second experiment aims to compare the accuracy and precision provided by SDE and the different DDE configurations extensively along the feasible region of the full five-dimensional (5D) space of parameters (diffusivities between 0 and 3 μm 2 ∕ms, fraction between 0 and 1, and κ positive). This allows exploring whether there are subregions presenting different behaviors. A 5D grid was generated by all the combinations of

| RESULTS
Histograms of the estimated model parameters from the first experiment (Figure 3) show an increase in the accuracy and precision of the estimates with the DDE scheme. The bimodal distribution of the estimated parameters is evident with the SDE acquisition, confirming that it is not possible to differentiate true and spurious minima. This effect is removed when using the DDE sequence.
We analyzed the shapes of the SDE and DDE objective functions from the synthetic measurements of SET A (sum of squared differences: F A ( )). To facilitate the visualization of these 5D functions, we performed a 1D cut through a straight line joining the true and spurious minima of SDE. This was parametrized with the scalar variable t:  Figure 4 shows the behavior of F A ( ) along this cut as a function of t. It can be observed that although the DDE objective function is still bimodal, the spurious and true minima have significantly different absolute values (due to the contribution of the tensor C to the DDE signal). This enables us to distinguish both peaks in conditions where SDE cannot (i.e. b max = 2 ms∕μm 2 ). Adding more directions to the SDE acquisition would not help to differentiate the peaks, as even in the noiseless case these two sets produce the same signal. Only by increasing the SDE diffusion weighting the spurious minimum could be differentiated from the true one.
For each point in the 5D grid of parameters, the Root Mean Square Error (RMSE, for definition see for instance 60 ) of each parameter has been computed from 50 independent noise realizations. The distributions of RMSE of the parameter estimates from this second experiment are displayed in Figure 5 with violin plots (similar to box plots but showing also estimated probability density 61 ). The summary statistics of the RMSE distributions are shown in Table 3. On average, DDE 40 + 20 and DDE 30 + 30 are the most accurate configurations for estimating all parameters. This suggests that the incorporation of even a small proportion of DDE measurements can remove the degeneracy, leading to an increase in accuracy and precision.
To compare the performance of SDE and DDE in different regions of the parameter space, we projected the 5D RMSE map onto different 3D sub-spaces. Figures 6 and 7 show two different 3D projections, over (D ‖ e , D ⟂ e , c 2 ( )) and over (f ,D a , c 2 ( )), of the RMSE of f and D a , respectively. The highest improvement of DDE with respect to SDE is associated with low c 2 values, i.e. high orientation dispersion. Additionally, for highly aligned voxels the performances of both schemes becomes similar.

| DISCUSSION
Our work shows that modifying the diffusion MRI pulse sequence can mitigate the degeneracy on NODDIDA's parameter estimation. Our proposal circumvents the need of presetting diffusivities to a priori values as in NODDI. We showed that estimating the NODDIDA model through SDE is generally an ill-posed problem. Depending on the specific combination of model parameters, multiple parameter sets may produce the same signal profile. We show analytically that DDE makes parameter estimation well-posed, and illustrate for a particular model instance the better behaved cost function obtained with DDE. In silico experiments over a wide range of model parameter combinations confirmed that extending the acquisition to DDE makes the inverse problem well-posed and solves the degeneracy in the parameter estimation. Combining DDE parallel (i.e. LTE) and perpendicular (i.e. PTE) direction pairs not only provides more stable parameter estimates but also increases their accuracy and precision. In Section 2, we showed that considering a noise-free scenario, in the case of fibers following a Watson ODF with known (nonzero) concentration parameter κ (including the case of parallel fibers), the inverse problem of recovering biophysical parameters from SDE measurements is wellposed. This is not the case for arbitrary unknown concentration κ, where Jelescu et al 17 first showed experimentally that the parameter estimation from SDE with intermediate b-values was degenerated. This was analyzed in Jespersen et al 49 showing that there were two nonlinear equations providing possible solutions. We demonstrated that in the absence of noise the number of BP parameter sets that describe the signal equally well up to (b 2 ) can be either 1, 2, or 4. In contrast, we showed analytically that the C tensor includes non-symmetric independent components that are accessible by DDE, but not by SDE, allowing the complete inverse mapping between the cumulant signal representation and BP parameter space. Consistently, the first experiment showed that in both of the PLIC synthetic voxels, DDE leads to more accurate and precise parameter estimations. This is clearly seen when analyzing the optimization cost-function which shows that although DDE also presents multiple local minima, the global minimum is substantially deeper, unlike SDE, thus it can be picked in typical noise levels. However, two points in the 5D model parameter space are insufficient to draw more general conclusions. Therefore, the second F I G U R E 3 Histograms of the estimated model parameters for SDE (top row) and DDE 30 + 30 (bottom row) schemes in the first experiment for 2,500 independent noise realizations (SNR = 50). The ground truth represents two possible solutions of the NODDIDA model applied to a voxel in the PLIC (Table 2). These values are shown in blue lines and correspond to set A (upper two rows), and set B (lower two rows) experiment swept the parameter space extensively using a regular grid. Mean results (see Table 3) showed the minimum RMSE for an acquisition consisting of both linear and planar b-tensors, suggesting that the optimal combination for the scenario considered is between DDE 40 + 20 and DDE 30 + 30 configurations.
Increasing the total number of measurements and SNR will have a larger impact in enhancing DDE parameter estimation than with SDE, since the bimodality present in SDE implies a non-zero lower bound for the achievable MSE even without noise. Results from 34 show that the addition of STE data also leads to an increase in the precision of D a and f in in vivo experiments. In our synthetic experiments the addition of PTE data reduces the RMSE in all the parameter estimates (to a lesser extent in f and D ⟂ e ). Recently, Dhital et al 35 showed through in silico experiments that incorporating PTE data to LTE data enabled us to discriminate spurious solutions in the cost-function. This latter result is explained by our theoretical analysis in Section 5 where we derive the independent equations provided by DDE that make the inverse problem well-posed. While finalizing this paper, a preprint 62 appeared, reaching similar conclusions.
Biophysical models are promising for extracting microstructure-specific information but care must be taken when applying them in dMRI. Some assumptions are more meaningful than others and hence their impact on parameter estimation must be assessed. 9 Invalid assumptions in the model will likely produce bias in the resulting microstructural information, which is epistemic and thus such biases cannot be removed simply by removing the degeneracy. Releasing the diffusivities in the typical two-compartment model eliminates an invalid assumption, reduces possible biases in the estimated parameters, and provides extra information amenable  to be used as a biomarker of microstructural integrity and sensitive to specific disease processes. 44,45,63 In this work, we have focused on analyzing the estimability of the model under different acquisition settings. The validation against complementary real data is an independent problem. Both should be addressed further to bring biophysical models to the clinic.
Jespersen et al 19 showed the estimation of the SM was stable and without degeneracy when using extremely high bvalues (15 ms∕μm 2 ) on ex vivo data. Recent work by Novikov et al 18 64 This latter work goes in a similar direction to our work here, i.e. adding extra dimensions to the experiment and changing the objective function to avoid ill-posedness. However, measuring multiple directions while varying the TE implies increasing the acquisition time and TE, affecting the SNR. However, this approach can be combined with DDE leading to a DDE acquisition with multiple TEs. Recently, Lampinen et al 15 showed that by acquiring data with linear and spherical tensor encodings the accuracy in estimating the microstructural anisotropy was increased compared to that derived from NODDI's parameters. Additionally, Dhital et al 41 measured the intracellular diffusivity using isotropic encoding. These two works point in a similar direction than ours, i.e. extending the acquisition to combine different shapes of b-tensors to maximize accuracy and precision. Future work will study the generalization of the model to a multidimensional diffusion acquisition, since the C tensor can be fully sampled using different combinations of b-tensor shapes, not only by LTE + PTE. Also, a detailed analysis of the impact of noise will be performed, further assessing the practical identifiability of the model parameters.
This work's aim was to demonstrate that it is possible to solve the intrinsic degeneracy of the SM with a Watson fODF using DDE. Although a cylindrically symmetric fODF might be insufficient to model crossing fibers, it may provide a reasonable approximation in the spinal cord and certain other white matter fiber bundles, 65 or in highly dispersed tissues like gray matter. Work by Tariq et al has extended the initial NODDI model to a Bingham ODF. 66 Additionally, Novikov et al 18 proposed the unconstrained SM with ODF to be described by a series of spherical harmonics. We plan to extend the analysis in this paper to general ODFs. The extension of biophysical models to multidimensional dMRI acquisitions 32 should be further explored. The comparisons made in this work between SDE and DDE protocols do not consider the optimization of the diffusion directions in DDE, just taking four arbitrary chosen DDE protocols extrapolated from an optimized SDE. We expect that further optimization of the DDE acquisition protocol may also lead to larger improvements. Finally, the largest errors in the parameter estimates occur for κ → 0. This might mean that for highly dispersed tissue (i.e. grey matter) many measurements might be needed to accurately estimate model parameters.

| CONCLUSIONS
The potential increase in sensitivity and specificity in detecting brain microstructural changes is a major driving force for developing biophysical models. However, non-linear parameter estimation of these models is not necessarily well-posed and can lead to unreliable parameter values. In this work, we not only extended the NODDIDA biophysical model from SDE to DDE schemes, but also demonstrated theoretically the advantages this latter approach has. We illustrated how DDE resolves the degeneracy issue intrinsic to this model estimation from SDE. We prove theoretically that DDE provides complementary information that makes the parameter estimation wellposed. Additionally, this extension leads to an increase in the accuracy and precision in the model parameter estimates in the presence of noise. The combination of parallel and perpendicular measurements for optimal parameter estimation as function of SNR and measurement time remains to be investigated. Our approach does not require high diffusion weightings to make the inverse problem well-posed and it can be further developed for the unconstrained SM.

ACKNOWLEDGMENTS
This work has been supported by the OCEAN project (EP/ M006328/1) and MedIAN Network (EP/N026993/1) both funded by the Engineering and Physical Sciences Research Council (EPSRC) and the European Commission FP7 Project VPH-DARE@IT (FP7-ICT-2011-9-601055). DKJ is supported by a Wellcome Trust Investigator Award (096646/Z/11/Z) and a Wellcome Trust Strategic Award (104943/Z/14/Z). 1 These equations are presented in Ref. [49] (Equation 9). However, in their expression, Δ e incorrectly appears squared also in the second term of the equation for W( )D 2 . This was a typing error.