### Abstract

Quantitative water/fat separation in MRI requires careful modeling of the acquired signal. Multiple signal models have been proposed in recent years, but their relative performance has not yet been established. This article presents a comparative study of 12 signal models for quantitative water/fat separation. These models were selected according to three main criteria: magnitude or complex fitting, use of single-peak or multipeak fat spectrum, and modeling of *T*_{2}^{*} decay. The models were compared based on an analysis of the bias and standard deviation of their resulting estimates. Results from theoretical analysis, simulation, phantom experiments, and in vivo data were in good agreement. These results show that (a) complex fitting is uniformly superior to magnitude fitting, (b) multipeak fat modeling is able to remove the bias present in single-peak fat modeling, and (c) a single-*T*_{2}^{*} model performs best over a range of clinically relevant signal-to-noise ratios (SNRs) and water/fat ratios. Magn Reson Med, 2010. © 2010 Wiley-Liss, Inc.

The ability to quantitatively measure fat content in tissues has multiple important applications in MRI, including studies of bone marrow (1), breast (2), muscle (3), brain (4), liver (5, 6), and heart (7–9). In recent years, chemical shift–encoded water/fat separation methods have become increasingly popular for quantitative fat measurement. This popularity is largely due to the ability of chemical shift–encoded methods to overcome the limitations of alternative techniques: lack of spatial information in single-voxel spectroscopy, sensitivity to amplitude of static field (*B*_{0}) and amplitude of radiofrequency field inhomogeneities in conventional fat saturation, or loss of SNR and inherent *T*_{1}-weighting in short-tau inversion recovery (10–12).

There are four key issues with chemical shift–encoded water/fat separation. First, the presence of large *B*_{0} magnetic field inhomogeneities can result in large errors in water/fat separation if the *B*_{0} effects are not adequately addressed (13, 14). Second, the commonly used spoiled gradient echo sequences may result in considerable residual *T*_{1} weighting, typically leading to bias (overestimation) in the estimated fat component, which has a shorter *T*_{1} than the water component (15). Third, noise also results in bias in the estimation of the minority component of the signal (whether it is water or fat), particularly in cases where the minority component is very small compared to the majority component, i.e., fat fractions (FFs) close to 0% or 100% (15). Fourth, inaccurate modeling of the acquired chemical shift–encoded signal also results in considerable bias in fat quantification (6, 16, 17).

Complications due to *B*_{0} field inhomogeneities, *T*_{1} bias, and noise have been thoroughly addressed in the literature. Field inhomogeneities can be corrected by region-growing or regularized estimation methods (14, 18–22). *T*_{1} bias in spoiled gradient echo acquisitions can be avoided by using a small flip angle or corrected by using a dual-flip-angle acquisition (15). Noise bias also can be corrected effectively by using magnitude discrimination or phase-constrained reconstruction (15), or by using a look-up table bias correction over a region of interest (23). However, signal modeling for quantitative water/fat separation remains largely an unresolved issue. Specifically, there are three key decisions to make when modeling the acquired signal: use of magnitude or complex fitting, use of single-peak or multipeak fat modeling, and modeling of the signal *T*_{2}^{*} decay. These alternatives can be summarized as follows:

*Magnitude vs. complex fitting.* Fitting the magnitude of the signal has been proposed as a means of simplifying the estimation since it removes the effects of field inhomogeneity (16, 24, 25). However, magnitude fitting has several well-known drawbacks, such as the non-Gaussian distribution of the noise in magnitude MR images, and an inability to correctly detect FFs above 50%.

*Single peak vs. multipeak fat models.* The basic single-peak signal model ignores the presence of multiple spectral peaks in the fat signal, which leads to bias in quantification. This bias can be overcome by using a more sophisticated, multipeak fat model, where the relative amplitudes of the different fat peaks can either be precalibrated or autocalibrated (6, 17, 26).

*Modeling of* *T*_{2}^{*} *decay.* In general, the amplitudes of the water and fat components of the signal will decrease with echo time (TE) due to *T*_{2}^{*} decay. It has been shown that ignoring this decay may result in considerable bias, and a number of groups have developed methods for including *T*_{2}^{*} in the model. In general, water and fat will have different *T*_{2}^{*} decays (and even the different fat peaks will have different decays, although this is typically ignored as it would result in significant complication in the estimation (12)), so these should be estimated separately, adding two nonlinear parameters to the estimation (26–28). As a simplification of this general model, a single *T*_{2}^{*} has been proposed for both water and fat (29, 30). Intermediate models have also been proposed, where the decay rates of water and fat are different, but the difference is assumed known (16).
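The three modeling choices above can be illustrated with a minimal Python sketch of a chemical shift–encoded signal. This is only an illustration under stated assumptions, not the implementation used in this study: the function name `signal`, the echo times, and the single-peak fat frequency of −210 Hz are hypothetical. Setting both decay rates to zero gives a no-decay model, setting them equal gives a one-decay model, and letting them differ gives a two-decay model.

```python
import numpy as np

def signal(te, water, fat, peaks_hz, peak_amps,
           r2s_w=0.0, r2s_f=0.0, fieldmap_hz=0.0):
    """Complex water/fat signal at echo times te (in seconds).

    peaks_hz / peak_amps describe the fat spectrum (one entry = single peak).
    r2s_w and r2s_f are the water and fat decay rates in 1/s.
    """
    te = np.asarray(te, dtype=float)
    amps = np.asarray(peak_amps, dtype=float)
    amps = amps / amps.sum()  # relative peak amplitudes sum to 1
    # Weighted sum of complex exponentials, one per fat peak.
    fat_spectrum = (amps * np.exp(2j * np.pi * np.outer(te, peaks_hz))).sum(axis=1)
    water_term = water * np.exp(-r2s_w * te)
    fat_term = fat * fat_spectrum * np.exp(-r2s_f * te)
    # A B0 offset multiplies the whole signal by a common phase term.
    return (water_term + fat_term) * np.exp(2j * np.pi * fieldmap_hz * te)

# Hypothetical acquisition: eight echoes, single fat peak at -210 Hz.
te = 1.2e-3 + 1.6e-3 * np.arange(8)
single_peak = dict(peaks_hz=[-210.0], peak_amps=[1.0])

# Swapping water and fat changes the complex signal but not its magnitude.
s_a = signal(te, water=0.8, fat=0.2, **single_peak)
s_b = signal(te, water=0.2, fat=0.8, **single_peak)
print(np.allclose(np.abs(s_a), np.abs(s_b)))  # magnitudes coincide
print(np.allclose(s_a, s_b))                  # complex signals differ
```

In this simplified single-peak, no-decay case, a 20% and an 80% fat fraction produce identical magnitude signals, which is one way to see why magnitude fitting cannot by itself distinguish FFs above 50% from those below.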

In this article, we present a comparative analysis of multiple models based on the alternatives described above. The analysis focuses on two key properties of the estimates for each model: bias and standard deviation. These properties capture the behavior of different models regarding model mismatch (bias) and noise sensitivity (standard deviation). The analysis is based on theoretical properties of the different models, simulations, and phantom data. Additionally, the conclusions derived from this analysis are verified qualitatively with an in vivo dataset.

### RESULTS

Figure 1 shows the phantom setup used in this work, including an in-phase image, as well as separated water and fat images. The average estimated relaxation parameter values in the water component (water-only vial) were *T*_{1,W} = 953 ms and *T*_{2,W} = 82 ms; in the fat component (fat-only vial), these values were *T*_{1,F} = 207 ms and *T*_{2,F} = 43 ms. It must be noted that *T*_{1,W} seemed to decrease in the mixed vials (e.g., it was measured to be 813 ms in the vial containing 50% fat) (41). However, this range of values does not affect the results of bias and standard deviation comparison as the sequence parameters were chosen to avoid *T*_{1} weighting.

Figure 3 shows similar results, but comparing the CRLB predictions with values based on the measured standard deviation for fat amplitude estimation in the actual phantom experiments. Note that the phantom results closely follow the simulations (shown in Fig. 2), with the largest difference arising in the magnitude fitting using a single peak and no decay, where the phantom estimates often converged to zero at low FFs, thus showing very low standard deviation (and very high NSA). Aside from that effect, magnitude-fitting models result in lower NSA than their complex-fitting counterparts, both in theory (CRLB) and in practice (simulations and phantom data).

Figures 4 and 5 show the standard deviation σ_{F} and the RMSE for fat amplitude estimation using the 12 models, both for the simulation (Fig. 4) and for the phantom data (Fig. 5). Note the close correspondence of simulation and phantom results for most models. Several of the magnitude-fitting models present a larger discrepancy between simulation and phantom data. We suggest that this discrepancy might be due to residual model mismatches in the phantom case. A more detailed discussion of this effect will be deferred to the description of FF estimation results. The simpler models (e.g., without accounting for *R*_{2}^{*} or multipeak fat) produce significant bias in the estimation of fat amplitudes, resulting in RMSE much higher than σ_{F}. For these models, the bias dominates the errors. Therefore, an analysis of these models based only on CRLB (or standard deviations) will not give an accurate assessment of the quality of the estimates.
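The relationship between RMSE, bias, and standard deviation used in this comparison can be checked numerically. The following Python sketch is illustrative only: the helper `error_components` and the simulated estimates (a true amplitude of 100 estimated with a mean of 80) are hypothetical and do not correspond to any model or dataset in this study.

```python
import numpy as np

def error_components(estimates, truth):
    """Return (bias, standard deviation, RMSE) of repeated estimates."""
    estimates = np.asarray(estimates, dtype=float)
    bias = estimates.mean() - truth
    std = estimates.std()  # population standard deviation (ddof=0)
    rmse = np.sqrt(np.mean((estimates - truth) ** 2))
    return bias, std, rmse

rng = np.random.default_rng(0)
truth = 100.0
# Hypothetical biased estimator: low noise, but centered far from the truth.
estimates = rng.normal(loc=80.0, scale=2.0, size=10_000)

bias, std, rmse = error_components(estimates, truth)
# RMSE^2 = bias^2 + std^2, so a large bias dominates a small std.
print(np.isclose(rmse**2, bias**2 + std**2))
print(abs(bias) > std)
```

For an estimator like this one, the standard deviation alone (or a bound on it such as the CRLB) would suggest a much smaller error than the RMSE actually shows.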

Figures 6 and 7 show FF results (mean ± standard deviation) for simulated and phantom data for a range of true FFs between 0% and 100%. All single-peak models result in considerable bias. For the multipeak, no-decay model, the bias in fat amplitude estimation seems to be approximately compensated by the bias in water amplitude estimation, resulting in good estimates except at very low or very high FFs. Generally, complex-fitting models perform significantly better (smaller bias and standard deviation) than their magnitude-fitting counterparts. Furthermore, complex-fitting phantom results show better agreement with simulation results. Magnitude-fitting phantom results show somewhat different behavior (most notably an increased bias) with respect to the simulations. We hypothesize that the cause is the sensitivity of magnitude fitting to model mismatches. To test this hypothesis, we generated a second set of simulated data, where the multipeak (six-peak) fat model is not exactly correct, but instead the peaks at −175 Hz and −119 Hz were each split into two peaks separated by 10 Hz, with the same amplitude as the original peak. Noise was added to the resulting simulated data, as described in the “Materials and Methods,” and the resulting signals were fitted using all 12 models, where the multipeak model still consisted of the original six peaks. The resulting FF plots are shown in Fig. 8. The complex-fitting results are similar to the ones shown in Fig. 6. However, the magnitude-fitting results have increased bias and standard deviation due to the model mismatch. These results correspond well with the observed phantom results (Fig. 7).

In vivo liver imaging results are shown in Fig. 9. The SNR was approximately 20. The FF maps shown in Fig. 9 are provided to illustrate the differences in bias and standard deviation for the various signal models used for fat and water fitting. The low SNR of the fat in the liver region leads to a noise bias (15). Estimates of FF were calculated from the mean values of fat and water signal intensities within a circular region of interest rather than from the FF map, which is noisier. Furthermore, the complex fat images were filtered to improve the SNR. Using a 7 × 7 filter yielded an SNR for the fat signal of approximately 5 for the complex-fitting, multipeak, one-decay estimates, which results in a noise bias error under 5%. All signal models are affected similarly by noise bias, and correcting it was not the objective of this work. It must be noted that we do not have a ground truth for the in vivo data, but rather compare only the relative estimates of the different models. The single-peak models (with the exception of the complex-fitting, single-peak, two-decay model) result in lower FF estimates relative to the multipeak models. This is in good agreement with simulation and phantom results. Additionally, the two-decay estimates are noisier compared with the no-decay and one-decay models (with the exception of the magnitude-fitting, multipeak, one-decay model, which produces unstable results due to model mismatch).
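The region-of-interest FF calculation described above can be sketched in Python as follows. This is a hedged illustration rather than the processing actually used: the helper names `circular_roi` and `roi_fat_fraction`, the image size, the ROI position, and the noise level are all hypothetical, and the inputs are assumed to be magnitude water and fat images.

```python
import numpy as np

def circular_roi(shape, center, radius):
    """Boolean mask of a circular region of interest."""
    rows, cols = np.ogrid[:shape[0], :shape[1]]
    return (rows - center[0]) ** 2 + (cols - center[1]) ** 2 <= radius ** 2

def roi_fat_fraction(water_img, fat_img, mask):
    """FF computed from the mean water and fat intensities inside the ROI."""
    w = water_img[mask].mean()
    f = fat_img[mask].mean()
    return f / (w + f)

# Hypothetical magnitude images with a true FF of 10% plus a little noise.
rng = np.random.default_rng(1)
shape = (32, 32)
water_img = 0.9 + rng.normal(scale=0.01, size=shape)
fat_img = 0.1 + rng.normal(scale=0.01, size=shape)

mask = circular_roi(shape, center=(16, 16), radius=5)
ff = roi_fat_fraction(water_img, fat_img, mask)
print(f"ROI fat fraction: {100 * ff:.1f}%")
```

Averaging the water and fat intensities before forming the ratio avoids taking the ratio of two noisy values voxel by voxel, which is the motivation given in the text for not reading the FF directly from the FF map.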

Based on these results, we can highlight the following key observations (arrows are marked in the figures with the corresponding observation number):

1. Despite the model mismatch, the CRLB provides a useful approximation of the standard deviation obtained with the different models. However, the CRLB does not take model mismatch–related bias into account.

2. The bias component of the RMSE can be significantly larger than the standard deviation component.

3. The relative importance of the bias component with respect to the standard deviation component is a function of the SNR. This is shown in Fig. 10, where complex-fitting, multipeak fat models are compared. For low SNRs, the standard deviation component of the error, which is larger in the two-decay model, dominates the (approximately constant with SNR) bias component of the error, which is larger in the one-decay model.

4. Complex fitting results in better estimates than magnitude fitting. This is true for the standard deviation (as shown by the CRLB, simulation, and phantom results), as well as for the bias (as shown by the simulation and phantom results). Additionally, complex fitting is less sensitive to model mismatch.

5. Multipeak fat modeling results in significantly reduced bias compared to single-peak fat modeling. Furthermore, single-peak models perform worse when there is more fat.

6. The no-decay models result in very large bias for fat amplitude estimation. For single-peak fat modeling, the two-decay model is needed in order to approximately account for the multipeak nature of the fat signal.

7. For multipeak fat modeling, the two-decay model typically results in lower bias than the one-decay model, but the increased standard deviation results in higher errors except at high SNR and FFs close to 50%. For SNRs <30, the increased standard deviation in the two-decay model dominates the improvement in bias with respect to the one-decay model. This is in good agreement with Chebrolu et al. (28) and is demonstrated in Fig. 10 with simulation and phantom results for a range of SNR and FF values.
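The SNR dependence in observations 3 and 7 can be illustrated with a toy calculation. The Python sketch below assumes, as a simplification, that each model has a bias that is constant with SNR and a standard deviation inversely proportional to SNR. The bias and noise-gain values are hypothetical placeholders chosen only to show the crossover behavior; they are not fitted to the results of this study.

```python
import numpy as np

def rmse(snr, bias, noise_gain):
    """RMSE for a constant bias and a standard deviation of noise_gain / snr."""
    return np.sqrt(bias**2 + (noise_gain / snr) ** 2)

def crossover_snr(low_noise, low_bias):
    """SNR at which the low-noise and the low-bias model have equal RMSE."""
    gain_diff = low_bias["noise_gain"] ** 2 - low_noise["noise_gain"] ** 2
    bias_diff = low_noise["bias"] ** 2 - low_bias["bias"] ** 2
    return np.sqrt(gain_diff / bias_diff)

# Hypothetical values: one-decay has more bias, two-decay has more noise.
one_decay = dict(bias=0.02, noise_gain=0.5)
two_decay = dict(bias=0.005, noise_gain=1.0)

snr_star = crossover_snr(one_decay, two_decay)
for snr in (10.0, snr_star, 100.0):
    print(f"SNR {snr:6.1f}: one-decay {rmse(snr, **one_decay):.4f}, "
          f"two-decay {rmse(snr, **two_decay):.4f}")
```

Below the crossover SNR the lower-noise (one-decay) model has the smaller RMSE, and above it the lower-bias (two-decay) model does, which mirrors the qualitative behavior described for Fig. 10.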

### DISCUSSION

We have performed a systematic comparison of signal models for water/fat separation from chemical shift–encoded acquisitions. The analysis was based on comparing the bias and standard deviation resulting from the different models. This study can be viewed as an extension of previous work, e.g., where the standard deviation was studied for different acquisition strategies using the CRLB (31), or different sets of models were compared empirically (6, 16).

The present study has several limitations. First, the study assumes that the signal phase is reliable. Under these conditions, complex fitting is uniformly superior to magnitude fitting. In the presence of phase distortions (e.g., due to eddy currents), magnitude fitting (16) or a mixed approach (40) may become more attractive. However, phase distortions were not found to be significant in our experimental data. Similarly, ghosting due to motion may complicate the fitting, but it was not observed in our in vivo data. Second, the study assumes that a suitable calibration is available for multipeak fat models. Third, in order to limit the complexity of the study, we fixed several parameters such as the choice of TEs. The present set of eight TEs allowed stable application of even the more complicated, two-decay models. Using fewer TEs (e.g., four) is expected to result in increased noise sensitivity, particularly in the more sophisticated two-decay models. This choice was made to approximately follow the usual sets of TEs in recent fat quantification literature (16). Fourth, this study does not take computation time into account. Generally, increasing the number of parameters (especially nonlinear parameters) in a model will result in increased computation. For instance, computation times to process 1024 voxels with the three complex multipeak models (no-decay, one-decay, and two-decay models) were 8.9, 9.6, and 16.4 sec, respectively, in our nonoptimized MATLAB (MathWorks) implementation.

Multipeak fat modeling has been shown in this and previous work to result in reduced bias in fat quantification relative to single-peak fat modeling (6). However, the present results seem to indicate that even the six-peak fat model with separate *R*_{2}^{*} decays for water and fat does not completely describe the fat signal. This residual model mismatch appears in two ways: (a) the multiple fat peaks are not all in phase in the calibration, and (b) magnitude fitting contains significant bias. However, incorporating more peaks into the model results in more difficult calibration due to the complication of calibrating peaks with very similar resonant frequencies.

The decay constant *R*_{2}^{*} for each species can be approximated as a combination of an intrinsic component due to spin-spin interactions and an extrinsic component due to field inhomogeneities and susceptibility effects: *R*_{2,W}^{*} = *R*_{2,W} + *R*_{2}' and *R*_{2,F}^{*} = *R*_{2,F} + *R*_{2}', where *R*_{2,W} = 1/*T*_{2,W}, *R*_{2,F} = 1/*T*_{2,F}, *R*_{2}' ∼ γ Δ*B*, and Δ*B* is the amount of *B*_{0} field variation within the voxel (42, 43). Thus, *R*_{2,W}^{*} and *R*_{2,F}^{*} will generally be different, which is observed in the phantom data, using a multipeak, two-decay model, where the estimated difference was *R*_{2,F}^{*} − *R*_{2,W}^{*} ≈ 12 sec^{−1}. This is in good agreement with the *T*_{2} relaxation parameters measured in the phantom (using a spin-echo sequence with varying TEs), where *T*_{2,W} ≈ 82 ms, and *T*_{2,F} ≈ 43 ms, resulting in *R*_{2,F} − *R*_{2,W} ≈ 11 sec^{−1}. Furthermore, according to this approximation, the difference *R*_{2,F}^{*} − *R*_{2,W}^{*} = *R*_{2,F} − *R*_{2,W} can be approximately known a priori if *T*_{2,W} and *T*_{2,F} are assumed known. However, it has been suggested that *R*_{2,W}^{*} and *R*_{2,F}^{*} may behave differently, e.g., as a function of iron concentration (28). If a single-peak fat model is used, the apparent *R*_{2,F}^{*} will be higher as it has to account for the dephasing due to interference between multiple fat peaks at frequencies near the main peak. Moreover, assuming that all the fat peaks share a single *R*_{2,F} (or *R*_{2,F}^{*}) is also an approximation, but estimating independent decay constants for each fat peak would result in greatly increased computational complexity and noise sensitivity, likely making it impractical. Furthermore, if the relative differences between the decay rates of the different fat peaks can be assumed known a priori, this information can also be incorporated into the model.
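The arithmetic behind the expected decay-rate difference can be reproduced with a few lines of Python. The sketch uses the phantom *T*_{2} values quoted above; the helper `r2_star` and the shared extrinsic rate of 30 sec^{−1} are hypothetical and serve only to show that a common extrinsic term cancels in the difference.

```python
def r2_star(t2_seconds, r2_prime):
    """Approximate R2* as the intrinsic rate 1/T2 plus a shared extrinsic rate."""
    return 1.0 / t2_seconds + r2_prime

t2_w = 82e-3  # water T2 measured in the phantom (s)
t2_f = 43e-3  # fat T2 measured in the phantom (s)

# Difference of the intrinsic rates, in 1/s.
r2_diff = 1.0 / t2_f - 1.0 / t2_w
print(f"R2,F - R2,W = {r2_diff:.1f} 1/s")

# Any extrinsic rate shared by water and fat cancels in the R2* difference.
r2_prime = 30.0  # hypothetical value, 1/s
r2s_diff = r2_star(t2_f, r2_prime) - r2_star(t2_w, r2_prime)
print(abs(r2s_diff - r2_diff) < 1e-9)
```

The intrinsic difference comes out at about 11 sec^{−1}, consistent with the value stated in the text and close to the 12 sec^{−1} estimated with the multipeak, two-decay model.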

According to our results, finding the optimal model for water/fat separation reduces to a choice between complex, multipeak fitting including either two decays (*R*_{2,W}^{*}, *R*_{2,F}^{*}) or a single decay rate *R*_{2}^{*}. This choice presents a clear tradeoff of bias and standard deviation: the two-decay model can represent the acquired signal more accurately (reduced bias), but the estimation of an additional decay rate increases the noise sensitivity (increased standard deviation). This increased standard deviation is particularly significant in the estimates of the “minority” component of the signal: in the one-decay model, the minority component “gets to share” the *R*_{2}^{*} parameter of the majority component, resulting in very stable (although somewhat biased) estimates of the minority component. In the two-decay model, estimation of the decay parameter for the minority component must be done independently, resulting in noisy decay rate estimates and in turn noisy amplitude estimates. As shown in Fig. 10, the choice between one or two decays depends on the SNR and the (expected) true FF. In several important applications, low FFs (e.g., 0–20%) are expected (5, 6, 9, 28), which makes the one-decay model preferable unless very high SNR can be achieved.