FatSegNet: A fully automated deep learning pipeline for adipose tissue segmentation on abdominal Dixon MRI

Introduce and validate a novel, fast, and fully automated deep learning pipeline (FatSegNet) to accurately identify, segment, and quantify visceral and subcutaneous adipose tissue (VAT and SAT) within a consistent, anatomically defined abdominal region on Dixon MRI scans.


| INTRODUCTION
Excess body fat is an increasingly important public health issue worldwide and a major risk factor for the development of metabolic disorders and reduced quality of life. 1,2 While the body mass index (BMI) is a widely used indicator of adipose tissue accumulation in the body, it does not provide information on fat distribution, 3 either with respect to different fat tissue types or with respect to deposit location. Different compartments of adipose tissue are associated with different physiopathological effects. 4,5 Abdominal adipose tissue (AAT), composed of subcutaneous and visceral adipose tissue (SAT and VAT), has long been associated with an increased risk of chronic cardiovascular diseases, glucose impairment, and dyslipidemia. 6,7 Recently, several studies have indicated that the accumulation of VAT is more strongly related to an adverse metabolic and inflammatory profile than the accumulation of SAT. 8,9 Therefore, an accurate and independent measurement of VAT and SAT volumes (VAT-V and SAT-V) is of significant clinical and research interest.
Currently, the gold standard for measuring VAT-V and SAT-V is the manual segmentation of abdominal fat images from Dixon magnetic resonance (MR) scans, a very expensive and time-consuming process. Thus, especially for large studies, automatic segmentation methods are required. However, achieving good accuracy is challenging due to complex AAT structures, a wide variety of VAT shapes, large anatomical differences across subjects, and the inherent properties of the Dixon images: low intensity contrast between adipose tissue classes, inhomogeneous signals, and potential organ motion. So far, these limitations have impeded the widespread implementation of automatic and semi-automatic techniques based on intensity and shape features, such as fuzzy clustering, 10 k-means clustering, 11 graph cut, 12,13 active contour methods, 14 and statistical shape models. 15 Recently, fully convolutional neural networks (F-CNNs) 16,17 have been widely adopted in the computer vision community for pixel/voxel-wise image segmentation in an end-to-end fashion to overcome the above-mentioned challenges. With these methods there is no need to extract manual features, divide images into patches, or implement sliding window techniques. F-CNNs can automatically extract intrinsic features and integrate global context to resolve local ambiguities, thereby improving the results of the predicted models. 17 Langer et al 18 proposed a three-channel UNet for AAT segmentation, which is a conventional architecture for 2D medical image segmentation. 19 While this method showed promising results, we demonstrate that our network architecture outperforms the traditional UNet for segmenting AAT on our images with a wide range of anatomical variation. More recent architectures such as the SD-Net 20 and Dense-UNet, a densely connected network, 21 have the potential to improve generalizability and robustness by encouraging feature re-usability and strengthening information propagation across the network. 21
In prior work, we introduced a competitive dense fully convolutional network (CDFNet) 22 as a new 2D F-CNN architecture that promotes feature selectivity within a network by introducing maximum attention through a maxout activation unit. 23 The maxout boosts performance by allowing the creation of specialized sub-networks that target a specific structure during training. 24 Therefore, this approach facilitates the learning of more complex structures 22,24 with the added benefit of reducing the number of training parameters relative to the aforementioned networks.
In this paper, we propose FatSegNet, a novel fully automated deep learning pipeline based on our CDFNet architecture to localize and segment VAT and SAT on abdominal Dixon MR images from the Rhineland Study, an ongoing large population-based cohort study. 25,26 To constrain AAT segmentations to a consistent anatomically defined region, the proposed pipeline consists of three stages: 1. Localization of the abdominal region using a semantic segmentation approach by implementing CDFNet models on sagittal and coronal planes; we use the lumbar vertebrae positions as reference points for selecting the region of interest. 2. Segmentation of VAT and SAT within the abdominal region through 2D CDFNet models on three different planes (axial, sagittal, and coronal). 3. A view aggregation stage where the previously generated label maps are combined to generate a final 3D segmentation.
We initially evaluate and compare the individual stages of the pipeline with other deep learning approaches in a sixfold cross-validation. We show that the proposed network architecture (CDFNet) improves segmentation performance and simultaneously reduces the number of required training parameters in steps 1 and 2. After asserting segmentation accuracy, we evaluate the whole pipeline (FatSegNet) with respect to robustness and reliability against two independent test sets: a manually edited set and a test-retest set. Finally, we present a case study on unseen data comparing the VAT-V and SAT-V calculated from the FatSegNet segmentations against BMI to replicate age and sex effects on these volumes in a large cohort.

| Datasets
The Rhineland Study is an ongoing population-based prospective cohort (https://www.rheinland-studie.de/) which enrolls participants aged 30 years and above at baseline from Bonn, Germany. The study is carried out in accordance with the recommendations of the International Council for Harmonisation (ICH) Good Clinical Practice (GCP) standards (ICH-GCP). Written informed consent was obtained from all participants in accordance with the Declaration of Helsinki.
The first 641 subjects from the Rhineland Study with BMI and abdominal MR Dixon scans are included. The sample has a mean age of 54.2 years (range 30 to 95) and 55.2% of the subjects are women. The BMI of the participants ranges from 17.2 to 47.7 kg/m² with a mean of 25.2 kg/m². Subjects were stratified into two subsets: 38 scans were manually annotated for training and testing; the remaining 603 subjects were segmented using the proposed pipeline. After visual inspection, 16 subjects were excluded due to poor image quality or extreme motion artifacts (e.g. potentially caused by breathing). Thus, 587 participants were used for the case study analysis, and a subset of 50 subjects was randomly selected for manual correction of the predicted label maps. This manually edited set and an independent test-retest set of 17 healthy young volunteers were used to assess the reliability of the automated segmentation and volume estimates.

Ground truth data
A total of 38 subjects were randomly selected from sex and BMI strata to ensure a balanced population distribution. These scans were manually annotated by two trained raters without any semi-automated support such as thresholding, which can reduce accuracy in the ground truth and lead to overestimation of the performance of the proposed automated method.
Specific label schemes were created for each individual task of the pipeline. For localizing the abdominal region, raters divided the scans into three different blocks defined by the location of the vertebrae as follows: the abdominal region (from the lower bound of the twelfth thoracic vertebra (Th12) to the lower bound of L5), the thoracic region (everything above the lower bound of Th12), and the pelvic region (everything below the lower bound of L5), as illustrated in Figure 1E. For AAT segmentation, 60 slices per subject were manually labeled into three classes: SAT, VAT, and bone with neighbouring tissues. The bone was labeled to prevent bone marrow from being misclassified as adipose tissue. In order to improve spatial context and prevent misclassification of the arms, the dataset was complemented by a synthetic class defined as "other tissue" that was composed of any soft tissue inside the abdominal cavity that is not VAT or SAT. The manual annotations are illustrated in Figure 1B,C. Furthermore, four subjects were labeled by both raters to evaluate the inter-rater variability.

Test-retest data
Seventeen additional subjects were recruited with the exclusive purpose of measuring the reliability of the acquisition protocol. The group has a mean age of 25.5 years (range: 20 to 31) and 65.0% of the participants are women; all of them have a normal BMI (BMI < 25 kg/m²). Subjects were scanned in two consecutive sessions. Before starting the second session, subjects were removed from the scanner and re-positioned.

| FatSegNet pipeline
The FatSegNet is to be deployed as a post-processing adipose analysis pipeline for the abdominal Dixon MR images acquired in the Rhineland Study. Therefore, it should meet the following requirements: (1) be fully automated, (2) segment the different adipose tissue types within the anatomically defined abdominal region, and (3) be robust to body type variations and generalizable in the presence of high population heterogeneity. Following these conditions, we designed FatSegNet as a fully automated deep learning pipeline for adipose segmentation (Figure 2). The proposed pipeline consists of three stages: (1) The abdominal region is localized by averaging bounding boxes from two abdominal segmentation maps generated by CDFNets on the sagittal and coronal views. For each view, a bounding box is set to the full image width. The height is extracted by localizing the highest and lowest slice with at least 85% of non-background voxels classified as abdominal region. The highest and lowest slice positions are averaged across the views. (2) Afterward, adipose tissue is segmented within the abdominal region by three CDFNets on different views (axial, coronal, and sagittal) with standardized input sizes (zero padding). (3) Finally, a view aggregation network merges the predicted label maps from the previous stage into a final segmentation; the implemented multi-view scheme is designed to improve segmentation of structures that are not clearly visible due to poor lateral resolution. This 2.5D strategy produces a fully automated pipeline to accurately segment adipose tissue inside a consistent anatomically defined abdominal region.
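The height rule of stage (1) can be sketched in a few lines of NumPy. This is a minimal illustration, not the released implementation: it assumes each view's prediction has already been assembled into a 3D label volume whose first axis runs from superior to inferior, and the label ids (0 = background, 2 = abdominal region) and function names are hypothetical.

```python
import numpy as np

ABDOMEN_LABEL = 2  # hypothetical label id; 0 is assumed to be background


def height_bounds(label_map, abdomen_label=ABDOMEN_LABEL, min_fraction=0.85):
    """Return (highest, lowest) slice index along axis 0 in which at least
    `min_fraction` of the non-background voxels are labeled as abdominal."""
    flat = label_map.reshape(label_map.shape[0], -1)
    non_background = (flat != 0).sum(axis=1)
    abdomen = (flat == abdomen_label).sum(axis=1)
    # Fraction of abdominal voxels per slice; empty slices get fraction 0.
    fraction = np.divide(
        abdomen,
        non_background,
        out=np.zeros(abdomen.shape, dtype=float),
        where=non_background > 0,
    )
    slices = np.flatnonzero(fraction >= min_fraction)
    if slices.size == 0:
        raise ValueError("no slice reaches the required abdominal fraction")
    return int(slices[0]), int(slices[-1])


def average_bounds(sagittal_map, coronal_map):
    """Average the slice bounds obtained from the two view-specific maps."""
    bounds = np.array(
        [height_bounds(sagittal_map), height_bounds(coronal_map)], dtype=float
    )
    top, bottom = np.rint(bounds.mean(axis=0)).astype(int)
    return int(top), int(bottom)
```

The bounding box width is not computed here because, as stated above, it is simply set to the full image width.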

Competitive dense fully convolutional network (CDFNet)
For the segmentation task, we introduce the CDFNet architecture due to its robustness and generalizability properties. The proposed network improves feature selectivity and, thus, boosts the learning of fine-grained anatomies without increasing the number of learned parameters. 22 We implemented the CDFNet by suitably adapting the Dense-UNet architecture proposed by Roy et al 27 and extending it toward competitive learning via maxout activations. 24 The Dense-UNet proposed in 27 follows the usual dumbbell-like architecture with four dense-block encoders, four dense-block decoders, and one bottleneck layer. Each dense block is based on short-range skip connections between convolutional layers as introduced for densely connected neural networks 28 ; the dense connection approach stacks multiple convolutional layers in sequence, and the input of a layer is iteratively concatenated with the outputs of the previous layers. This type of connectivity improves feature reusability, increases information propagation, and alleviates vanishing gradients. 28 The architecture additionally incorporates the traditional long-range skip connections between all encoder and decoder blocks of the same spatial resolution as introduced by Ronneberger et al, 19 which improve gradient flow and spatial information recovery.
Within the Dense-UNet, the information aggregation through these connections is performed by concatenation layers. Such a design increases the size of the output feature map along the feature channels, which in turn results in the need to learn filters with a higher number of parameters. Goodfellow et al introduced the idea of competitive learning through maxout activations, 23 which was adapted by Liao and Carneiro 24 for competitive pooling of multi-scale filter outputs. Both 23 and 24 showed that the use of maxout competitive units boosts performance by creating a large number of dedicated sub-networks within a network that learn to target specific sub-tasks, and that it significantly reduces the number of required parameters, which in turn can prevent over-fitting.

FIGURE 2 Proposed FatSegNet pipeline for segmenting AAT. The pipeline is divided into three stages: first, localization of the abdominal region; then, tissue segmentation on the abdominal region; and finally, view aggregation. Both local and global volume estimates of individual structures are calculated on the final prediction.
The maxout is a simple feed-forward activation function that chooses the maximum value from its inputs. 23 Within a CNN, a maxout feature map is constructed by taking the maximum across multiple input feature maps for a particular spatial location. The proposed CDFNet uses competitive layers (maxout activation) instead of concatenation layers. Our preliminary results 22 demonstrate that these competitive units promote the formation of dedicated local sub-networks in each of the densely connected blocks within the encoder and the decoder paths. This encourages sub-modularity through a network-in-network design that can learn more efficiently. Toward this, we propose two novel architectural elements targeted at introducing competition within the short- and long-range connections, as follows:

Local Competition-Competitive Dense Block (CDB):
By introducing maxout activations within the short-range skip connections of each of the densely connected convolutional layers (at the same resolution), we encourage local competition during learning of filters. The multiple convolution layers in each block prevent filter co-adaptation.

Global Competition-Competitive Un-pooling Block (CUB):
We introduce a maxout activation between a long-range skip connection from the encoder and the features up-sampled from the prior lower-resolution decoder block. This promotes competition between finer feature maps with smaller receptive fields (skip connections) and coarser feature maps from the decoder path that span much wider receptive fields encompassing higher contextual information.
In brief, the proposed CDFNet comprises a sequence of four CDBs constituting the encoder path (down-sampling block) and four CDBs constituting the decoder path (up-sampling block), which are joined via a bottleneck layer. The bottleneck consists of a 2D convolutional layer followed by a Batch Normalization. The skip connections from each of the encoder blocks feed into the CUB, which subsequently forwards features into the corresponding decoder block of the same resolution, as illustrated in Figure 3.
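To make the difference between concatenation and competitive (maxout) fusion concrete, the following NumPy sketch contrasts the two operations on channel-last feature maps and counts the weights of a subsequent convolution. The function names and the channel counts used in the usage note are illustrative assumptions and are not taken from the CDFNet implementation.

```python
import numpy as np


def concat_fusion(*feature_maps):
    """Dense-UNet style fusion: stack inputs along the channel axis,
    so the number of channels grows with every fused input."""
    return np.concatenate(feature_maps, axis=-1)


def maxout_fusion(*feature_maps):
    """Competitive fusion: element-wise maximum across inputs of equal shape,
    so the number of channels stays constant."""
    return np.stack(feature_maps, axis=0).max(axis=0)


def conv_params(kernel_size, in_channels, out_channels):
    """Weights plus biases of a 2D convolution with a square kernel."""
    return kernel_size * kernel_size * in_channels * out_channels + out_channels
```

For two hypothetical 64-channel inputs, a following 3 × 3 convolution with 64 filters sees 128 input channels after concatenation (`conv_params(3, 128, 64)`) but only 64 after maxout (`conv_params(3, 64, 64)`), which roughly halves the weights of that layer.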

View aggregation network
The proposed view aggregation network is designed to regularize the prediction for a given voxel by considering spatial information from the coronal, axial, and sagittal views. The network, therefore, merges the probability maps of the three different CDFNets from the previous stage by applying a (3 × 3 × 3) 3D convolution (30 filters) followed by a Batch Normalization. Then a (1 × 1 × 1) 3D convolution is employed to reduce the feature maps to the desired number of classes (n = 5). The final prediction probabilities are obtained via a concluding softmax layer (as illustrated in Supporting Information Figure S1). Our approach learns to weigh each view differently on a voxel level, in contrast to standard hard-coded global view aggregation schemes. Such hard-coded weighting schemes can be suboptimal when working with anisotropic voxel sizes (e.g., here 2 mm × 2 mm × 5 mm), as resolution differences impose a challenge when combining the spatial information from the finer (within-plane) and coarser (across-slice) resolutions. Additionally, in the presence of high variance in abdominal body shapes across subjects, segmentation benefits from data-driven approaches that can flexibly adapt weights to individual situations and even spatial locations, which is not possible if hard-coded global weights are used.
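The layer sequence described above can be traced with a plain NumPy forward pass. This is a rough sketch under several assumptions that are not stated in the text: the three probability maps are concatenated along the channel axis (3 views × 5 classes = 15 input channels), the 3D convolution uses zero padding so the output keeps the input size, Batch Normalization is applied in inference mode with stored statistics, and no nonlinearity is placed between the two convolutions. All argument names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def softmax(x, axis=-1):
    """Numerically stable softmax along one axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def view_aggregation_forward(p_axial, p_coronal, p_sagittal, w3, b3, w1, b1,
                             bn_mean=0.0, bn_var=1.0, gamma=1.0, beta=0.0,
                             eps=1e-3):
    """Forward pass: 3x3x3 conv -> batch norm -> 1x1x1 conv -> softmax.

    p_*: per-view probability maps, shape (D, H, W, n_classes)
    w3:  (3, 3, 3, 3 * n_classes, n_filters), b3: (n_filters,)
    w1:  (n_filters, n_classes),              b1: (n_classes,)
    """
    # Assumed input layout: views stacked along the channel axis.
    x = np.concatenate([p_axial, p_coronal, p_sagittal], axis=-1)
    # Zero-pad the three spatial axes so the output keeps shape (D, H, W).
    x = np.pad(x, ((1, 1), (1, 1), (1, 1), (0, 0)))
    # windows has shape (D, H, W, C_in, 3, 3, 3).
    windows = sliding_window_view(x, (3, 3, 3), axis=(0, 1, 2))
    h = np.einsum("dhwcijk,ijkcf->dhwf", windows, w3) + b3
    # Batch normalization in inference mode (stored mean and variance).
    h = gamma * (h - bn_mean) / np.sqrt(bn_var + eps) + beta
    # A 1x1x1 convolution is a per-voxel linear map over the channels.
    logits = h @ w1 + b1
    return softmax(logits, axis=-1)
```

With 30 filters and five classes, `w3` has shape (3, 3, 3, 15, 30) and `w1` has shape (30, 5); the output has one probability vector per voxel.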

| Experimental setup
For training and testing the pipeline, we perform a sixfold cross-validation subject-space split on the ground truth dataset.

| Baselines and comparative methods
We validate the FatSegNet by comparing the performance of each stage of the pipeline against the cross-validation test sets using the Dice score index (DSC) to measure similarity between the prediction and the ground truth. Let M (ground truth) and P (prediction) denote the binary segmentations of a label; the Dice score index is defined as

$$\mathrm{DSC}(M, P) = \frac{2\,|M \cap P|}{|M| + |P|},$$

where |M| and |P| represent the number of elements in each segmentation and |M ∩ P| the number of common elements. Therefore, the DSC ranges from 0 to 1, and a higher DSC represents a better agreement between segmentations. Additionally, we benchmark the proposed CDFNet models for abdominal region localization and AAT delineation against state-of-the-art segmentation F-CNNs such as UNet, 19 SD-Net, 20 and Dense-UNet. 27 We use the probability maps generated from the aforementioned networks to train the view aggregation model and measure performance with and without view aggregation. The performance of the proposed view aggregation for each F-CNN is compared against two non-data-driven (hard-coded) methods: equally balanced weights for all views and axial focus weights (accounting for higher in-plane resolution, axial = 0.5, coronal = 0.25, sagittal = 0.25). Finally, to permit a fair comparison, all benchmark networks follow the same architecture of four encoder blocks, four decoder blocks, and one bottleneck layer as illustrated in Figure 3, with an input image size of 224 × 256. Note that significant differences between our proposed methods and comparative baselines are evaluated by a Wilcoxon signed-rank test 29 after multiple comparisons correction using a one-sided adaptive FDR. 30 The aforementioned models are implemented in Keras 31 with a TensorFlow back-end using an NVIDIA Titan Xp GPU with 12 GB RAM and the following parameters: batch size of 8, momentum set to 0.9, constant weight decay of 10⁻⁶, and an initial learning rate of 0.01 decreased by a factor of 10 every 20 epochs. The models are trained for 60 epochs with an early-stopping criterion (no relevant changes in the validation loss over the last 8 epochs; convergence was observed around 50 epochs). A composite loss function of median frequency balanced logistic loss and Dice loss 20 is used. This loss function emphasizes the boundaries between classes and supports learning of unbalanced classes such as VAT. Finally, online data augmentation (translation, rotation, and global scaling) is performed to increase the training set size and improve the networks' generalizability. Note that the FatSegNet implementation is available at https://github.com/reuter-lab/FatSegNet.
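As a small illustration of the evaluation setup, the sketch below computes the Dice score for a pair of binary masks and applies the two hard-coded view aggregation baselines (equal weights and the axial focus weights 0.5/0.25/0.25) to per-view probability maps. The function names are hypothetical, and returning 1.0 when both masks are empty is a convention chosen here, not one stated in the text.

```python
import numpy as np


def dice_score(ground_truth, prediction):
    """Dice score index 2|M ∩ P| / (|M| + |P|) for two binary masks."""
    m = np.asarray(ground_truth, dtype=bool)
    p = np.asarray(prediction, dtype=bool)
    denominator = m.sum() + p.sum()
    if denominator == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return float(2.0 * np.logical_and(m, p).sum() / denominator)


def hard_coded_aggregation(p_axial, p_coronal, p_sagittal,
                           weights=(0.5, 0.25, 0.25)):
    """Fuse per-view probability maps (classes on the last axis) with fixed
    global weights and return the arg-max label map."""
    w_axial, w_coronal, w_sagittal = weights
    fused = w_axial * p_axial + w_coronal * p_coronal + w_sagittal * p_sagittal
    return fused.argmax(axis=-1)
```

Passing `weights=(1/3, 1/3, 1/3)` gives the equally balanced baseline; the default corresponds to the axial focus weights. Because the weights are global, every voxel of every subject is fused in the same way.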

| Pipeline reliability
We assess the FatSegNet reliability by comparing the difference of VAT-V and SAT-V across sessions for each subject of the test-retest and manually edited sets. Given a predicted label map, let N_i(l) denote the number of voxels classified as l (VAT or SAT) in session i (test-retest, or manual-automated). The absolute percent difference (APD(l)) of a label volume measures variability across sessions. It is defined as

$$\mathrm{APD}(l) = \frac{|N_1(l) - N_2(l)|}{\left(N_1(l) + N_2(l)\right)/2} \times 100.$$

Additionally, we calculate the agreement of total VAT-V and SAT-V between sessions by an intra-class correlation (ICC) using a two-way fixed, absolute agreement, single measures ICC(A,1). 32
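Both reliability measures are straightforward to compute from per-session voxel counts. The sketch below assumes that the absolute percent difference is taken relative to the mean of the two sessions, and it uses the standard mean-squares form of the two-way, absolute agreement, single measures ICC(A,1); the function names are hypothetical.

```python
import numpy as np


def absolute_percent_difference(n_session_1, n_session_2):
    """APD of a label volume, assumed relative to the mean of both sessions."""
    n1, n2 = float(n_session_1), float(n_session_2)
    return 100.0 * abs(n1 - n2) / ((n1 + n2) / 2.0)


def icc_a1(measurements):
    """Two-way, absolute agreement, single measures ICC(A,1).

    measurements: array of shape (n_subjects, k_sessions).
    """
    y = np.asarray(measurements, dtype=float)
    n, k = y.shape
    grand_mean = y.mean()
    subject_means = y.mean(axis=1)
    session_means = y.mean(axis=0)
    # Mean squares of the two-way ANOVA decomposition.
    ms_rows = k * ((subject_means - grand_mean) ** 2).sum() / (n - 1)
    ms_cols = n * ((session_means - grand_mean) ** 2).sum() / (k - 1)
    residual = y - subject_means[:, None] - session_means[None, :] + grand_mean
    ms_error = (residual ** 2).sum() / ((n - 1) * (k - 1))
    return float(
        (ms_rows - ms_error)
        / (ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n)
    )
```

Unlike a consistency ICC, the absolute agreement form penalizes a systematic offset between sessions: a constant shift of all second-session volumes lowers ICC(A,1) even when the subject ranking is unchanged.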

Rhineland study
We compare the volumes of abdominal adipose tissue (AAT-V, SAT-V, and VAT-V) generated from FatSegNet with BMI on the unseen dataset. A fast quality control is performed to identify drastic failure cases. The differences among BMI groups are evaluated with a one-way analysis of variance (ANOVA) with subsequent Tukey's honest significant difference (HSD) post hoc comparisons. The associations of volumes of abdominal adipose tissue and BMI are assessed using partial correlation and linear regression after accounting for age, sex, and height of the abdominal region. Separate linear regression analyses are performed to explore the effect of age on SAT-V and VAT-V in men and women. All the statistical analyses are performed in R. 33
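The analyses themselves are run in R; purely as an illustration of what the partial correlation measures, the following NumPy sketch regresses the covariates (for example age, sex, and abdominal region height) out of both variables and correlates the residuals. The function name and the data layout are assumptions made for this example.

```python
import numpy as np


def partial_correlation(x, y, covariates):
    """Pearson correlation of x and y after linearly removing the covariates.

    x, y:       1D arrays of length n (e.g. BMI and a fat volume)
    covariates: array of shape (n,) or (n, p)
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix with an intercept column followed by the covariates.
    design = np.column_stack([np.ones(len(x)), np.asarray(covariates, dtype=float)])
    # Residuals of the least-squares fits of x and y on the covariates.
    residual_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    residual_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return float(
        residual_x @ residual_y
        / np.sqrt((residual_x @ residual_x) * (residual_y @ residual_y))
    )
```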

| Localization of abdominal region
To assess the performance of abdominal region detection after creation of an average bounding box from the coronal and sagittal views, the average Dice overlap (sixfold cross-validation) was calculated, as illustrated in Supporting Information Figure S2. We observe that all models perform extremely well on the relatively easy task of localizing the desired abdominal region (DSC > 0.96). There is no significant difference between the models; however, we use our CDFNet because it requires substantially fewer parameters (see Table 1) compared to the UNet and Dense-UNet.

| Segmentation of AAT
In Table 1, we present the average Dice score (sixfold cross-validation) for VAT and SAT for each individual view as well as for the view aggregation model. Here, we observe that all methods work extremely well for SAT segmentation. Nevertheless, our proposed CDFNet outperforms the UNet and SD-Net on all single-view models and, when compared with the Dense-UNet, there is significant improvement in the sagittal and coronal views. For the more challenging task of VAT recognition, which is a more fine-grained compartment with large shape variation, the proposed CDFNet outperforms the SD-Net on all single planes; when compared with Dense-UNet and UNet, there is only significant improvement in the axial and coronal planes. Nonetheless, CDFNet achieves this performance with ∼30% (Dense-UNet) and ∼80% (UNet) fewer parameters, demonstrating that the proposed architecture improves feature selectivity and simplifies network learning. Furthermore, fewer parameters can help decrease overfitting error, especially when training with limited annotated data, and thus improve generalizability. Note that Dice scores increase and the difference in pairwise comparisons is slightly reduced after the view aggregation (Table 1), showing that this step helps all individual networks to reach a better performance by introducing spatial information from multiple views and regularizing the prediction maps. The proposed data-driven aggregation scheme outperforms (DSC) the hard-coded models for SAT and, with statistical significance, for VAT, as shown in Table 2. Furthermore, learned weights are spatially varying and can adjust to subject-specific anatomy, which in turn can improve generalizability. We empirically observe that the aggregation model smooths the label maps slightly, resulting in visually more appealing boundaries. It also significantly reduces misclassification of the arms as adipose tissue, which can otherwise be observed in different views, especially in overweight and obese subjects, where arms are located closer to the abdominal cavity, as seen in Supporting Information Figure S3.
Finally, it should be highlighted that all single-view models and the view aggregation models achieve similarly excellent results on the SAT segmentation compared to inter-rater variability and outperform the manual raters for the more challenging VAT segmentation by a margin.
Table 3 presents the reliability metrics evaluated on the test-retest and the manually edited test sets. The proposed pipeline presents only a small absolute percent volume difference (APD) for VAT and SAT, and excellent agreement between the predicted and corrected segmentation maps. It must be noted that APD is larger for both tissue types in the test-retest setting, as it also includes variance from acquisition noise (e.g. motion artefacts, non-linearities based on different positioning) in addition to potential variance of the processing pipeline. Nevertheless, we observe excellent agreement (ICC) between sessions for the test-retest dataset for both adipose tissue types.

| The characteristics of the study population
After visual quality inspection, 16 scans were flagged due to image artefacts, such as motion or low contrast (see Figure 4C,D for two examples). The characteristics of the remaining 587 participants with valid data on BMI and volumes of abdominal adipose tissue are presented in Supporting Information Table S1. The mean (SD) age of the subjects is 54.2 (13.3) years, and 54.7% are women. 311 (53.0%) subjects are normal weight, 209 (35.6%) overweight, and 67 (11.4%) obese. We observed a BMI increase with age (β = 0.03, P = .007) and a borderline significance of age difference among BMI groups (P = .052, ANOVA). Obvious differences are observed in AAT-V, VAT-V, and SAT-V across BMI groups (P < .001, ANOVA). VAT-V to SAT-V ratio is higher in overweight and obese participants compared to those with normal weight (P < .001), but there is no difference between overweight and obese (P = .505).

| The association between abdominal adipose tissue volumes and BMI
BMI shows a strong positive correlation with AAT-V and SAT-V (AAT-V: r = .88, P < .001; SAT-V: r = .85, P < .001), but only a moderate correlation with VAT-V (r = .65, P < .001) after adjusting for age, sex, and abdominal region height. As illustrated in Figure 5, both SAT-V and VAT-V are positively associated with BMI after accounting for age, sex, and abdominal region height (P < .001). The accumulation of SAT-V is higher than VAT-V as BMI increases.

TABLE 2 Mean (and standard deviation) Dice scores (cross-validation) of hard-coded balanced weights, hard-coded axial focus weights, and the proposed view aggregation for abdominal adipose tissue segmentation

VAT-V and SAT-V
The influence of age and sex on VAT-V and SAT-V follows different patterns (as illustrated in Figure 6). Men tend to have lower SAT and higher VAT compared to women (P < .001).
VAT-V significantly increases with age in both men and women. Conversely, SAT-V is weakly associated with age in women (β = 0.02, P = .012), but not in men (β = −0.01, P = .337).

| DISCUSSION
In our study, we established, validated, and implemented a novel deep learning pipeline to segment and quantify the components of abdominal adipose tissue, namely, VAT-V, SAT-V, and AAT-V, on a fast acquisition abdominal Dixon MR protocol for subjects from the Rhineland Study, a large population-based cohort. The proposed pipeline is fully automated and requires approximately 1 minute for analyzing a subject's whole volume. Moreover, since the pipeline is based on deep learning models, it can be easily updated and retrained as the study progresses and new manual data are generated, which can further improve overall pipeline robustness and generalizability, providing a pragmatic solution for a population-based study. The proposed pipeline, termed FatSegNet, implements a three-stage design with the CDFNet architecture at the core for localizing the abdominal region and segmenting the AAT. The introduction of our CDFNet inside the pipeline boosts the competition among filters to improve feature selectivity within the networks. CDFNet introduces competition at a local scale by substituting concatenation layers with maxout activations that prevent filter co-adaptation and reduce the overall network complexity. It also induces competition at a global scale through competitive unpooling. This network design, in turn, can learn more efficiently.
For the first stage of the pipeline, i.e. localization of the abdominal region, all F-CNNs can successfully determine the upper and lower limit of the abdominal region from a segmentation prediction map. However, our CDFNet requires significantly fewer parameters compared to the traditional UNet and Dense-UNet. Furthermore, the localization block is able to identify the abdominal region correctly even in cases with scoliosis (curved spine) as illustrated in Figure 7F. For the more challenging task of segmenting AAT, we demonstrate that CDFNet recovers VAT significantly better than traditional deep learning variants that rely on concatenation layers.
The selection of an inhomogeneous BMI testing set ensures that our method is evaluated for different body types and avoids biases, as better segmentation performance can be achieved on subjects with high content of AAT compared to lean subjects. 34,35 Moreover, images from individuals with high AAT could be accompanied by other types of issues, such as fat shadowing (Figure 7D), or arms located in close proximity to the abdominal cavity (Figure 7A,D,E). These issues are mitigated by our view aggregation model, which regularizes the predicted segmentation by combining the spatial context from different views, ultimately improving segmentation of tissue boundaries. Moreover, this approach automatically prevents misclassification of arms, whereas previous deep learning AAT segmentation methods required manual removal of the upper extremities in a pre-processing step. 18 Note that we prefer the 2D over a full 3D approach in this work. A full 3D network architecture has more parameters, requiring significantly more expert-annotated training data (full 3D cases) and/or artificial data augmentation, which could increase the chance of overfitting, in addition to increased GPU memory requirements.
As demonstrated on the Rhineland Study data, the proposed pipeline exhibits high robustness and generalizability across a wide range of ages, BMI values, and body shapes, as seen in Figures 7 and 4A,B. FatSegNet successfully identifies the AAT despite differences in abdomen morphology, spine curvature, adipose shadowing, arm positioning, or intensity inhomogeneities. Furthermore, the pipeline has a high test-retest reliability between the calculated volumes of VAT and SAT without the need for any image pre-processing (bias correction, image registration, etc.) or manual selection of a slice or region. In addition, the manually edited test set demonstrates a high similarity of automated and manual labels and excellent agreement of volume estimates. However, as is usual with any automated method, segmentation reliability decreases when input images have low quality, as illustrated in Figure 4C,D, where the scans present severe motion/breathing artifacts or very low image contrast. In order to detect these problematic images in large studies, an automated or manual quality control protocol should be implemented before passing images to automated pipelines.
In accordance with previous studies on smaller data sets, 13,36 our data showed a lower correlation of BMI with VAT-V than with AAT-V and SAT-V. We also observed a sex difference of the SAT-V and VAT-V accumulation as previously reported 37,38 : men were more likely to have higher VAT-V and lower SAT-V compared to women. Moreover, we further explored the association between age with SAT-V and VAT-V and found an obvious age effect on the accumulation of VAT-V in both men and women, and a weak age effect on SAT-V in women but not in men. This discrepancy was previously observed by Machann et al, 37 who assessed the body composition using MRI in 150 healthy volunteers aged 19 to 69 years. They reported a strong correlation between VAT-V and age both in men and women, whereas SAT-V only slightly increased with age in women. The fact that our results replicate these previous findings on a large unseen dataset corroborates stability and sensitivity of our pipeline.
In conclusion, we have developed a fully automated post-processing pipeline for adipose tissue segmentation on abdominal Dixon MRI based on deep learning methods. While reducing the number of required parameters, the pipeline outperforms other deep learning architectures and demonstrates high reliability. Furthermore, the proposed method was successfully deployed in a large population-based cohort, where it replicated well-known SAT-V and VAT-V age and sex associations and demonstrated generalizability across a large range of anatomical differences, both with respect to body shape and fat distribution.