A Progressive Approach for Uncertainty Visualization in Diffusion Tensor Imaging

Diffusion Tensor Imaging (DTI) is a non-invasive magnetic resonance imaging technique that, combined with fiber tracking algorithms, allows the characterization and visualization of white matter structures in the brain. The resulting fiber tracts are used, for example, in tumor surgery to evaluate the potential functional damage to the brain caused by tumor resection. The DTI processing pipeline from image acquisition to the final visualization is rather complex, introducing undesirable uncertainties into the final results. Most DTI visualization techniques do not provide any information regarding the presence of uncertainty. When planning surgery, a fixed safety margin around the fiber tracts is often used; however, it cannot capture the local variability and distribution of the uncertainty, thereby limiting informed decision-making. Stochastic techniques offer a way to estimate uncertainty for the DTI pipeline. However, they have high computational and memory requirements that make them infeasible in a clinical setting, and the resulting delay in visualization hinders the workflow. We propose a progressive approach that relies on a combination of wild-bootstrapping and fiber tracking to be used within the progressive visual analytics paradigm. We present a local bootstrapping strategy, which reduces the computational and memory costs, and provides fiber-tracking results in a progressive manner. We have also implemented a progressive aggregation technique that computes the distances in the fiber ensemble during progressive bootstrap computations. We present experiments with different scenarios to highlight the benefits of using our progressive visual analytics pipeline in a clinical workflow, along with a use case and an analysis based on discussions with our collaborators.


Introduction
Diffusion Tensor Imaging (DTI) is a non-invasive technique that allows the reconstruction of anatomical connections in the brain, i.e., white matter. This process, known as fiber tractography or fiber tracking [BPP*00], has proven to be a useful technique for the interpretation of brain anatomy [Laz10; MZ06; NGH*05]. Fiber tracking has gained popularity in research on brain diseases such as multiple sclerosis, stroke, autism, dementia, and schizophrenia [AP08; Cat06; fC05; HSO*01] and is gaining traction in clinical practice, for example, in planning for brain tumor resection surgery [RDM*09]. Despite the potential of these methods, several downsides limit their widespread use. One of the main ones is the uncertainty present in the results. The acquired data has to go through a complex transformation and visualization pipeline, shown in Figure 1, which accumulates the uncertainties introduced at each step. The MRI-based acquisition suffers from artifacts such as noise, image distortion, motion artifacts, and partial volume effects (PVE) [BWJ*03]. The modeling stage involves the estimation of the second-order tensor using fitting techniques or higher-order regression models, adding further variation to the final results [KCA*06]. During fiber tracking, several user-defined parameters significantly affect the resulting fibers [BVPtH09], and different integration schemes can produce different outcomes. Finally, the visualization stage may also introduce uncertainty due to the use of different illumination models or simplification of the fiber geometry. Schultz et al. [SVBK14] review several sources of uncertainty involved in the fiber tracking pipeline and discuss strategies that provide reliable interpretation of the results. All of these uncertainties add variations to the resulting visualization, influencing the decision-making process.
In this work, we focus on the first two stages of the pipeline, i.e., uncertainty that arises due to acquisition noise and errors in diffusion modeling. In clinical applications, the visualization of uncertainty is often ignored, hampering the user's ability to make effective decisions. In the absence of uncertainty information, neurosurgeons may consider safety margins around critical brain structures [WHK*08] based on experience and prior knowledge (see Figure 2a). Such safety margins assume that the uncertainty is homogeneously distributed, which is not the case (see Figure 2b). Uncertainty information becomes even more critical when fiber tracking is used in pathological anatomy, e.g., when fiber tracts are displaced or infiltrated by a tumor. In such cases, the surgeon's experience and anatomical knowledge become even less effective for estimating the uncertainty. Figure 2 shows a fiber bundle affected by a tumor in its vicinity. As can be seen, deterministic fiber tracking as used in the clinical workflow could not show the fibers going towards the frontal area of the tumor (Figure 2b). Overlooking the possibility that fibers are present in the frontal area of the tumor can lead to inadvertent damage to the tracts during surgery.
Wild bootstrapping is a stochastic method used to approximate uncertainty in DTI. It approximates regular bootstrapping, in which multiple scans are acquired [WTW*08] to model uncertainty. Wild bootstrapping requires only a single scan and simulates multiple acquisitions using probability distributions from the residuals that remain after fitting the diffusion tensors to the data. Computing a large number of such simulated scans makes it possible to approximate a distribution, from which the output, together with its uncertainty, can be derived. This procedure, however, incurs substantial computational costs and is difficult to use in an interactive fiber tracking process, where parameters are defined through exploratory trial and error, which is the current clinical workflow for the definition of fiber tracts in surgery planning.
We propose a progressive approach that allows interactive estimation and exploration of fiber tracts and their corresponding uncertainty without pre-processing the data. Wild bootstrapping provides an ensemble of fiber tracts (i.e., polylines) that cannot be effectively visualized directly. Aggregation strategies are used to effectively visualize this kind of data [BPtHV13; WMK13]. However, these methods are not efficient when ensemble members are progressively generated. We modified an existing solution by Brecheisen et al. [BPtHV13] to allow progressive fiber generation and synchronized visualization. The main contribution of this paper is a progressive visual analytics framework for stochastic uncertainty visualization in DTI fiber tracking. The main aspects of this contribution are listed below:
• We have developed a progressive visual analytics (PVA) pipeline for local calculation of tensor bootstrap samples combined with simultaneous fiber tracking.
• We have adapted the ensemble-based fiber tract aggregation and visualization to work in a progressive framework.
The framework enables interactive generation and visual analysis of fiber tracts with uncertainties. We analyse the computational benefits through experiments and illustrate the potential of the framework with a set of use cases.

Requirement Analysis
This work has been carried out in collaboration with clinical partners who want to incorporate uncertainty into their current tractography workflow for neurosurgery planning. We base our visualization pipeline on the methods used in their current workflow, which incorporates diffusion tensor imaging (DTI), combined with deterministic streamline generation of fiber tracts. While more sophisticated methods exist for modeling the diffusion, as well as fiber tracking, our proposed solution must work with the diffusion tensor model as well as deterministic fiber tract generation to maximize compatibility with the current clinical workflow. After acquisition and pre-processing of the data, radiologists define a region of interest (ROI) and generate the corresponding fiber bundle. Further, the proposed solution should minimize the time between acquisition and analysis. Therefore, we propose a progressive visual analytics (PVA) approach to generating the underlying bootstrap samples, as well as deriving fiber tracts. The proposed PVA pipeline is designed to allow the interactive estimation of uncertainty in the tractography and enables clinicians to define the regions of interest and explore the results interactively.

Related work
Several approaches are available in the literature that characterize, represent, and visualize uncertainties due to noise and modeling errors in fiber tracking, each with their own pros and cons. In this section, we present related work according to uncertainty estimation methods (Section 3.1) for fiber tracking and corresponding uncertainty visualization techniques (Section 3.2).

Fiber Tracking Uncertainty Estimation
Various techniques have been proposed to quantify the uncertainties in fiber tracking. These methods can roughly be divided into two main categories: Analytical methods and Stochastic methods. Analytical methods rely on explicit mathematical modeling [KHR*06] and are mostly based on the Bayesian framework, which was first introduced by Behrens et al. [BWJ*03] in the DTI domain.
Alternatively, uncertainties can be estimated by bootstrapping, i.e., from multiple sample data sets of the same subject. For DTI, the most straightforward approach is to scan the subject multiple times. Due to random noise, each scan will be slightly different. With enough scans, the uncertainty in the data can be estimated. However, this would require scanning time and costs that are not affordable in a clinical setting. Several stochastic techniques have been proposed to mitigate these limitations [CLH06]. One of the most widely used stochastic techniques is wild bootstrapping [WTW*08]. This technique requires only a single scan, based on which a large number of samples can be generated. A variation of this technique is the residual bootstrap, where the residuals are assigned randomly along the gradient directions [DH97]. Several authors have used these techniques to model uncertainties in DTI [Jon03; LA05; PB03; VHN*16]. Even though these techniques provide uncertainty information throughout the complete DTI pipeline, they have high computational and memory costs, limiting their use in clinical and exploratory settings.
Fiber tracking is used to reconstruct brain white matter connections based on diffusion-weighted imaging information. There has been extensive research in the development of fiber-tracking algorithms. These techniques can be categorized into Deterministic approaches [CLC*99; LWT*03; MCCV99] and Probabilistic approaches [BBJ*07; HTJ*03; KNH02]. In our work, we adopt the wild bootstrapping method combined with deterministic streamline fiber tracking [BPtHV13; WTW*08] to estimate the fiber tract uncertainty corresponding to data acquisition and modeling. Note that the framework would also accommodate other approaches with similar characteristics. As discussed, the computational and memory costs involved in generating multiple wild-bootstrapping tensor fields and the corresponding ensemble of fiber tract samples are very high. We introduce the concept of local wild bootstrapping driven by fiber tracking, which helps to reduce the associated costs and provides progressive updates.

Visualization
Effective uncertainty visualization is essential to provide information about the reliability of data to the end user. There has been considerable work on uncertainty visualization for scalar, vector, and tensor fields. For a general overview of uncertainty visualization, we refer to one of the various surveys in recent literature [GS06; JS03; PWL*97; Riv07].
The most straightforward approach to visualize an ensemble of fibers is to render the resulting fiber samples directly in a so-called spaghetti plot [BBKW02; CFJ*06; Jon08]. However, a spaghetti plot has several shortcomings. Most importantly, it suffers from clutter and occlusion, making it difficult to distinguish between areas with a single fiber sample and areas with a number of densely distributed ones. To reduce clutter, Enders et al. [ESM*05] presented a technique to wrap the fiber bundles within a surface hull. Similar techniques have been used by Merhof et al. [MMB*09] and Chen et al. [CZCE08], who cluster the fibers according to their proximity and generate hulls enclosing the resulting fibers. These techniques resolve the cluttering issues through summarization; however, they cannot easily handle complex fiber shapes. Illustrative techniques have also been proposed to visualize complex fiber ensembles [BPtHV13; OVV10].
To represent the error distribution within a fiber ensemble, statistical measures such as the mean or a confidence interval are of interest. However, these measures are not as well defined for curves as they are for scalar values. Several approaches exist to compute such statistics for ensembles of curves. Whitaker et al. [WMK13] and Mirzargar et al. [MWK14] use the concept of band depth to compute the centrality within the set of curves and estimate the variations. Enders et al. [ESM*05] compute the average of the curves in a bundle, resulting in the central fiber. Instead of computing the mean of the fibers, Brecheisen et al. [BPtHV13] compute the median and confidence interval of the curves by calculating the distances among fiber pairs based on a chosen measure. This approach enables the visualization of the complex fiber structure along with the uncertainty information. Here, we adopt a similar technique to calculate the most representative fiber and the percentile of variation and modify it such that it can be incorporated in a progressive visualization framework.
In our work, we address computational cost and latency issues that are part of the uncertainty visualization pipeline for DTI fiber tracking. We base our approach on wild bootstrapping and streamline fiber tracking, which we adapt to the Progressive Visual Analytics (PVA) paradigm [FP16]. PVA provides intermediate results that help users to understand the evolution of a lengthy computation, such as the wild bootstrapping simulation, so that exploration of the data can start during the computation without waiting for the simulation to end. Further, PVA allows steering the computations, similar to interactive program steering [GVS94] and computational steering [JP94] approaches. Here, we introduce progressive generation and aggregation of the fiber samples combined with immediate, interactive uncertainty visualization. To the best of our knowledge, there is no related work that proposes using a progressive strategy for the purpose of uncertainty visualization in DTI fiber tracking.

Stochastic Modeling
Various approaches have been used to model the uncertainties due to noise and modeling errors, as discussed in Section 3.1. In our work we focus on stochastic methods that simulate sample variations and facilitate the propagation through the pipeline. We have chosen wild bootstrapping for the presented framework to estimate and propagate the uncertainty in the data acquisition and diffusion modeling steps from a single DTI scan.
Wild bootstrapping generates multiple samples based on the residuals that arise from the fitting errors. Jones [Jon08] showed that the results obtained with wild bootstrapping are comparable to those from regular bootstrapping and discussed its applicability. Wild bootstrapping has been described by various authors [Liu*88; WTW*08] but was first combined with fiber tracking by Jones [Jon08]. The general principle behind wild bootstrapping is based on fitting the DWI signal $s(g_i)$, measured in the acquisition directions $g_i$, to a tensor model [Jon08]. The six unique elements of the diffusion tensor, $D$, can be estimated by using the ordinary least squares method. Once we have $D$, we can compute the model-predicted signal value $\hat{s}(g_i)$ that corresponds to the fitted tensor. A residual value $r(g_i)$ is calculated with $r(g_i) = \hat{s}(g_i) - s(g_i)$. A new signal per orientation at each voxel, $s^*(g_i)$, is stochastically generated according to $s^*(g_i) = s(g_i) + \mathrm{sign}(r(g_i))$, where the sign()-function randomly multiplies the residual by 1 or −1. A new sample of the diffusion tensor, $D^*$, is estimated for each voxel independently multiple times by fitting a tensor to the generated $s^*(g_i)$ signals. By perturbing residuals randomly, each tensor fit will be different from the previous one. This repetitive estimation of the tensor is carried out for all voxels of the tensor volume multiple times, resulting in an ensemble of tensor volumes. The concepts we present are general to any local stochastic uncertainty model. We have chosen wild bootstrapping for our framework given its use and demonstrated similarity to bootstrapping in the DTI context [Jon08].
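The per-voxel resampling step can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in our framework: it assumes a log-linear tensor model with a known unweighted signal, so that the response for direction g_i is y_i = ln(s(g_i)/s_0) = −b g_i^T D g_i, and all function names are hypothetical. The sketch adds the randomly signed residual to the model-predicted value, which is the usual wild-bootstrap form; for a linear least-squares fit, adding it to the measured value instead yields the same refitted tensor, because both responses have the same least-squares projection.

```python
import random

def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def design_matrix(gradients, b_value):
    """One row per unit gradient g: -b * (gx^2, gy^2, gz^2, 2gxgy, 2gxgz, 2gygz)."""
    rows = []
    for gx, gy, gz in gradients:
        rows.append([-b_value * v for v in
                     (gx * gx, gy * gy, gz * gz, 2 * gx * gy, 2 * gx * gz, 2 * gy * gz)])
    return rows

def fit_tensor(X, y):
    """Ordinary least squares via the normal equations (X^T X) d = X^T y.
    Returns d = [Dxx, Dyy, Dzz, Dxy, Dxz, Dyz]."""
    m = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(m)] for i in range(m)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(m)]
    return solve(XtX, Xty)

def predict(X, d):
    """Model-predicted log-signal ratio for every gradient direction."""
    return [sum(a * c for a, c in zip(row, d)) for row in X]

def random_signs(n, rng=random):
    """Draw +1 or -1 independently for each gradient direction."""
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]

def wild_bootstrap_tensor(X, y, signs):
    """One wild-bootstrap tensor sample for a single voxel."""
    d = fit_tensor(X, y)
    y_hat = predict(X, d)
    residuals = [yh - yi for yh, yi in zip(y_hat, y)]
    # Assumed (common) form: perturb the fitted value with the randomly signed residual.
    y_star = [yh + s * r for yh, s, r in zip(y_hat, signs, residuals)]
    return fit_tensor(X, y_star)
```

A tensor sample for one voxel would then be drawn with `wild_bootstrap_tensor(X, y, random_signs(len(y)))`, where `X` is built once per acquisition scheme and reused for every voxel and every bootstrap iteration.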

Towards a Progressive Visual Analytics Pipeline
As described in the previous section, wild bootstrapping generates ensembles of tensor volumes. The naive pipeline is based on precomputing the ensemble of tensor volumes followed by deterministic fiber tracking, i.e., streamline generation (see Figure 3). For each tensor volume sample and seed point, a new fiber sample is generated. Once all the fiber samples are tracked, we obtain a fiber ensemble to be visualized. This process is able to show the variations in the obtained fiber tracts. However, the pre-computation of the whole ensemble of tensor volumes requires long computation times and a large memory footprint.
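To give a rough sense of the memory footprint, the following back-of-the-envelope sketch estimates the size of a pre-computed ensemble. It uses the volume dimensions of the data sets described in Section 7 and assumes, purely for illustration, that each voxel stores the six unique tensor elements as 32-bit floats and that 500 bootstrap samples are kept; the function name is hypothetical.

```python
def ensemble_bytes(dims, n_samples, elems_per_tensor=6, bytes_per_elem=4):
    """Bytes needed to hold n_samples full tensor volumes in memory.
    The storage layout (6 elements, 4 bytes each) is an assumption."""
    voxels = dims[0] * dims[1] * dims[2]
    return voxels * elems_per_tensor * bytes_per_elem * n_samples

per_volume = ensemble_bytes((112, 112, 70), 1)
ensemble = ensemble_bytes((112, 112, 70), 500)
print(f"{per_volume / 2**20:.1f} MiB per volume, {ensemble / 2**30:.1f} GiB for 500 samples")
```

Under these assumptions, a single tensor volume occupies about 20 MiB, while the full ensemble approaches 10 GiB, before any fiber is tracked.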
Accessibility within the clinical workflow is a major limitation for the use and evaluation of uncertainty information in practice. The lack of available tools and the complexity of achieving a visualization of the uncertainty are among the main bottlenecks for its clinical use, despite its enormous potential. In neurosurgical applications, for example, in pre-surgical planning, generating the fiber pathways is often a trial-and-error process. It needs several iterations and requires constant tuning of the fiber tracking regions of interest and parameters to meet the expectations of the clinician. The long pre-processing times of uncertainty modeling add latency to the visualization, which breaks the clinical workflow. We introduce a progressive approach that reduces the latency between acquisition and visualization and allows users to explore and interact with fiber tracking parameters and their uncertainties directly during the computations. We identify the bottlenecks present in the clinical workflow and provide a first step towards making uncertainty visualization more accessible to the user. We do realise that our proposed approach needs clinical validation and evaluation, just as tractography does in general. However, we consider a clinical evaluation as future work.
In uncertainty modeling through bootstrapping, it is not known in advance how many bootstrap samples will lead to an accurate enough result. Many factors, including the shape of the bundles themselves, the area of the brain, and the level of noise and artifacts, introduce variation in the number of required samples. Modeling uncertainty for one bundle may require fewer bootstrap iterations than for another. Computing a predefined number of iterations N therefore either misrepresents the uncertainty (if N is too small) or wastes resources and time (if N is too large).
To circumvent this problem, a progressive visualization approach allows the user to see intermediate results, observe the evolution of the uncertainty simulation, and ultimately identify on the fly when the results are stable enough, saving valuable time.
In the following, we start our discussion with a naive progressive visual analytics pipeline, identify its drawbacks, and proceed to our proposed local bootstrapping and fiber tracking approach.

Naive Progressive Approach
The first step towards a progressive visual analytics pipeline is to visualize the fiber samples during the wild bootstrapping calculations without the need to pre-compute all tensor volumes. For this purpose, the bootstrap sample calculation and fiber tracking stages are combined. Figure 4 illustrates the pipeline of the naive progressive bootstrapping and fiber tracking approach for a given seed point. In the progressive approach, a tensor volume is generated at each iteration by using the wild bootstrap technique. Based on the newly created sample, fiber tracking from a given seed point is performed. Each iteration results in a unique fiber sample, which can directly be visualized. The variations in the fiber samples represent the effect of the noise and modeling errors. The bootstrap iterations repeat continuously, which increases the reliability of the uncertainty estimate. The user can start evaluating the data immediately and decide when to stop based on the perceived visual stability of the results.
The progressive approach reduces the memory footprint of the wild-bootstrapping method, as no pre-computed tensor volumes need to be stored. However, computing each complete diffusion tensor volume takes on the order of several seconds, which is still too long for an interactive system, making this progressive pipeline impractical.

Local Bootstrapping and Fiber tracking
In most applications of fiber tracking, the users are mostly interested in a specific fiber bundle or a particular region of the brain. In these cases, calculating the bootstrap sample for the whole volume is a large waste of computational resources, since just a small portion will be used. However, the precise region of interest for the tracing is not known in advance and cannot be computed at pre-processing time. Taking this into consideration, we propose a novel approach for accelerating the computations of the progressive bootstrap method presented in the previous section. Here, we combine wild bootstrapping with fiber tracking, and the computations are performed only for those cells that are necessary for the currently tracked fiber. The pipeline for the local bootstrapping and fiber tracking is illustrated in Figure 5.
The streamline algorithm is initiated with specific seed points. During the numerical integration of the corresponding streamlines, we need to obtain the diffusion tensor that defines the vector field at a specific position in the volume. We use tensor component-wise trilinear interpolation [CFJ*05] for the estimation of the diffusion tensor at any point in the volume. For trilinear interpolation, we need the diffusion tensors at the eight voxels of the cell containing the current position. These voxel tensor values are determined by performing wild bootstrapping calculations at the specific voxels as described in Section 4. While tracing the streamlines in a single bootstrap sample, we keep track of the voxels that have already been computed and store the corresponding wild-bootstrapping tensor sample. Every time a previously computed voxel is required, it is fetched without the need for re-computation. This ensures coherence through the streamline integration steps. We also reuse the stored voxel tensors within a wild-bootstrapping iteration when fiber samples from different seed points are traced. This preserves the coherence between fibers and produces the same results as the naive pipeline. The resulting fiber samples (shown by the red lines in Figure 5) are then progressively visualized after being calculated. The bootstrap iterations repeat, resulting in multiple fiber samples for each seed point.
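The lazy, per-iteration voxel cache described above can be sketched as follows. This is a simplified illustration rather than our C++ implementation: positions are assumed to be given in voxel units, no bounds checking is performed, and `sample_voxel` is a hypothetical callback standing in for the wild-bootstrap refit of a single voxel.

```python
import math

class LocalBootstrapField:
    """Tensor field for one bootstrap iteration; voxels are sampled lazily and cached."""

    def __init__(self, sample_voxel):
        # sample_voxel(ijk) -> list of 6 tensor elements (hypothetical callback that
        # would run the wild-bootstrap fit for that single voxel).
        self.sample_voxel = sample_voxel
        self.cache = {}

    def voxel(self, ijk):
        """Return the tensor sample of a voxel, computing it only on first access."""
        if ijk not in self.cache:
            self.cache[ijk] = self.sample_voxel(ijk)
        return self.cache[ijk]

    def tensor_at(self, pos):
        """Component-wise trilinear interpolation at a continuous position."""
        base = tuple(math.floor(p) for p in pos)
        frac = tuple(p - b for p, b in zip(pos, base))
        out = [0.0] * 6
        for dx in (0, 1):
            for dy in (0, 1):
                for dz in (0, 1):
                    weight = ((frac[0] if dx else 1.0 - frac[0])
                              * (frac[1] if dy else 1.0 - frac[1])
                              * (frac[2] if dz else 1.0 - frac[2]))
                    t = self.voxel((base[0] + dx, base[1] + dy, base[2] + dz))
                    for c in range(6):
                        out[c] += weight * t[c]
        return out

    def start_new_iteration(self):
        """Drop all cached voxels so the next bootstrap iteration draws fresh samples."""
        self.cache.clear()
```

Within one iteration, every streamline, regardless of its seed point, queries the same `LocalBootstrapField`, so a voxel visited by several fibers is sampled once; calling `start_new_iteration()` before the next bootstrap iteration discards the stored samples.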

Uncertainty Visualization
So far, we have discussed a method to progressively generate bootstrap fiber samples. This progressiveness is of no use if we cannot visualize the resulting fiber ensemble in an effective and progressive manner. Directly rendering thousands of fiber samples in a spaghetti plot causes clutter and occlusion, making it difficult to extract relevant information. We developed a progressive aggregation method to indicate the relevant uncertainty information, and an interactive visualization approach for effective exploration of the uncertainties. Brecheisen et al. [BPtHV13] proposed to determine a representative fiber using pairwise distances between all fiber samples. The representative fiber is the fiber with the minimum accumulated distance to all the other fibers in the ensemble and as such can be seen as the most central fiber. In addition, all other fibers are ordered according to their accumulated distances such that intervals of uncertainty can be defined.

Progressive Fiber Aggregation
To calculate the pairwise distances, Brecheisen et al. [BPtHV13] used the mean of the closest point distance [MVV05]. We modify their approach for use within a progressive setting as follows.
We assign a distance score $S_i$ to each fiber sample $F_i$, which is the accumulated distance of $F_i$ to all other available fiber samples:
$S_i = \sum_{j \neq i} d(F_i, F_j)$ (1)
where $d$ defines a distance measure between fibers, in our case the closest point distance. With each bootstrap iteration, a new fiber sample $F_k$ is generated and added. As the distance score $S_i$ is a simple sum, it can be updated easily. We only have to compute $d(F_i, F_k)$ for each already computed fiber sample $F_i$ and add it to the corresponding existing distance score $S_i$. Additionally, $S_k$ is computed by summation of all newly computed $d(F_i, F_k)$ using Equation 1. We keep the scores in a sorted table such that the lowest score, corresponding to the sample that has the minimum distance to all the others, is selected as the representative fiber. Higher scores indicate that the samples are further away from the rest and can be interpreted as having higher uncertainty. Despite being of linear order, the computation of $d(F_i, F_k)$ is computationally costly. Therefore, we progressively update the existing distance scores, as well as the new distance score $S_k$, in order of the distance score of the fiber samples. Furthermore, to avoid unnecessary distance computations, if the distance $d(F_i, F_k)$ is smaller than a pre-defined threshold, we assume that the fiber samples $F_i$ and $F_k$ are similar enough to not need a higher precision in the distance calculation. By keeping track of all computed distances, we can avoid the costly distance calculations $d(F_j, F_k)$ for the remaining samples $F_j$ by simply using the existing distance $d(F_j, F_i)$.
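The incremental score update and the threshold-based reuse of stored distances can be sketched as follows. This is a simplified illustration under stated assumptions: fibers are lists of 3D points, the closest point distance is taken as the symmetrized mean of per-point minimum distances, the table is re-sorted on demand rather than kept sorted, similar fibers are not merged into a shared table entry, and all class and parameter names are hypothetical.

```python
import math

def mean_closest_point_distance(a, b):
    """Symmetrized mean closest-point distance between two polylines (lists of 3D points)."""
    def one_way(p, q):
        return sum(min(math.dist(x, y) for y in q) for x in p) / len(p)
    return 0.5 * (one_way(a, b) + one_way(b, a))

class ProgressiveScores:
    """Accumulated distance scores S_i, updated as fiber samples arrive one by one."""

    def __init__(self, distance, threshold=0.0):
        self.distance = distance      # callable computing d(F_i, F_k)
        self.threshold = threshold    # similarity threshold (hypothetical parameter)
        self.fibers = []
        self.pair = []                # pair[i][j] holds the stored d(F_i, F_j)
        self.scores = []              # scores[i] holds S_i

    def add(self, fiber):
        """Insert a new sample F_k, update all scores, and return its index k."""
        k = len(self.fibers)
        row = [0.0] * k
        twin = None
        # Visit the existing samples in order of increasing distance score.
        for i in sorted(range(k), key=lambda i: self.scores[i]):
            if twin is None:
                row[i] = self.distance(self.fibers[i], fiber)
                if row[i] < self.threshold:
                    twin = i          # F_k is considered similar to F_twin
            else:
                # Reuse the stored d(F_i, F_twin) in place of d(F_i, F_k).
                row[i] = self.pair[twin][i]
        for i in range(k):
            self.scores[i] += row[i]
            self.pair[i].append(row[i])
        self.pair.append(row + [0.0])
        self.scores.append(sum(row))
        self.fibers.append(fiber)
        return k

    def representative(self):
        """Index of the sample with the lowest accumulated distance."""
        return min(range(len(self.scores)), key=lambda i: self.scores[i])
```

With `threshold=0.0` every distance is computed exactly; a positive threshold trades accuracy of the scores for fewer distance evaluations once a similar sample has been found.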
We illustrate the progressive updates of the distance score table in Figure 6. After the third iteration (N = 3, Figure 6a), three fiber samples are present with the second sample as the center line. During the fourth iteration (N = 4, Figure 6b), a new sample is added to the existing ones. The distance scores are re-computed and the representative fiber is updated accordingly. At N = 5 (Figure 6c), another sample is added whose distance to the existing fiber sample 3 is less than the predetermined threshold. In this case, the distance score table is updated according to the distances of the similar fiber and the new fiber is added to the same table entry as sample 3. Notice that the more fiber samples we calculate, the higher the cost of maintaining the score table, but also the higher the chance of finding a similar fiber sample. An evaluation of the performance gain and the accuracy is presented in Section 7.

Progressive Rendering
Once the representative fiber and the aggregations have been determined, an effective visualization is needed. We draw the representative fibers as red tubes and the remaining fiber samples, representing the ensemble variation, as illuminated polylines in orange. We use multi-layered rendering to avoid occlusion of the representative fibers by the other fiber samples. We first render the fiber samples, followed by a second pass to render the representative fibers on top, as shown in Figure 7. In this way, the representative fibers are always visible, regardless of occlusion by other fiber samples. As a result, the depth perception of those samples in relation to the representative fibers is less clear. However, we deem the visibility of the representative fibers more important, while the fiber samples provide context.
As the simulation progresses, changes of the representative fiber and fiber samples can be observed by the user in the progressively updating visualization. For further exploration of the fiber aggregation, intervals can be specified, similar to the work by Brecheisen et al. [BPtHV13], to show variation from the representative fiber. An interval can be expressed as a percentage range of the distance score table, e.g., the closest 0-50% of the fiber samples. The selected fiber samples are rendered in blue. We have chosen the color scheme for the representative fiber, fiber samples, and selection using the red-to-blue diverging color map from ColorBrewer [HB03]. By using colors from one end of the color map for the representative fiber and fiber samples, we indicate their connection, while using the other end for the selection provides a clear highlight.
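Selecting an interval from the distance score table amounts to slicing the samples in ascending score order. The sketch below illustrates this; the function name and the rounding of the interval bounds are assumptions made for illustration.

```python
import math

def select_interval(scores, lo_pct, hi_pct):
    """Indices of the samples whose rank in ascending score order falls within
    the percentage range [lo_pct, hi_pct] of the distance score table."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n = len(order)
    lo = int(n * lo_pct / 100.0)
    hi = int(math.ceil(n * hi_pct / 100.0))
    return order[lo:hi]
```

For instance, `select_interval(scores, 0, 50)` would return the half of the ensemble closest to the rest, starting with the representative fiber.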
To draw the selection in the multi-layered approach, described above, we use a third layer, between the complete set of fiber samples and the representative fibers. As a result, selected fibers are shown on top of the complete set but may be occluded by the representative fibers, as shown in Figure 7.

Linked Distance Score Histogram
Selecting fibers based on the interval ranges, as described in Section 6.2, allows the user to gain an impression of the variance of the samples from the representative fiber, for example, to identify outliers. To provide further insight into the distribution of the generated fiber samples, we compute the distance of each sample from its corresponding representative fiber and show these in a histogram (inset, Figure 7). The left part of the histogram depicts the fiber samples closer to the representative fibers, while the extreme right part denotes the fibers further away.
We further allow the user to select intervals visually by brushing in the histogram view, providing an intuitive way to understand the uncertainty distribution. We use the same color scheme as for the 3D representation (Section 6.2) to indicate fiber samples and selections in the histogram. Together with the stability of the actual fiber visualization, the continuously updating histogram is an indicator of the stability of the estimated uncertainty. Over time, the histogram is expected to fluctuate less, indicating that the addition of samples has less influence on the final uncertainty estimation. To aid the evaluation of the stability of the uncertainty estimation beyond animation, we calculate the earth mover's distance (EMD) [RTG98] between the histograms of consecutive bootstrap iterations. The EMD quantifies the difference between the distributions of two consecutive histograms. As the histogram becomes more stable over the iterations, the distance between consecutive histograms decreases, reflecting the stability of the simulation. We show these values in an optional, on-demand line plot, shown in the bottom right corner of Figure 7.
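The stability measure can be sketched as follows. For two normalized one-dimensional histograms over the same bins, with the ground distance between bins taken as the difference of their indices, the EMD reduces to the sum of absolute differences between the cumulative distributions. The fixed histogram range `max_value` and the function names are assumptions made for illustration.

```python
def histogram(values, n_bins, max_value):
    """Normalized histogram of values over [0, max_value] with equal-width bins."""
    counts = [0] * n_bins
    if not values:
        return [0.0] * n_bins
    for v in values:
        b = min(int(v / max_value * n_bins), n_bins - 1)
        counts[b] += 1
    total = float(len(values))
    return [c / total for c in counts]

def emd_1d(h1, h2):
    """Earth mover's distance between two normalized histograms on the same bins,
    using |i - j| as the ground distance between bins i and j."""
    emd, carry = 0.0, 0.0
    for a, b in zip(h1, h2):
        carry += a - b          # mass that still has to be moved to later bins
        emd += abs(carry)
    return emd
```

Plotting `emd_1d(previous_histogram, current_histogram)` after every bootstrap iteration gives a curve of the kind shown in the stability plot; values approaching zero indicate that new samples no longer change the distance distribution.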

Results
In this section, we evaluate the developed framework and discuss the interactivity of the progressive simulation, uncertainty estimation, and rendering. We used two DW-MRI data sets, one from a healthy subject and one from a patient with a brain tumor, provided by our collaborators. During separate sessions with two clinical collaborators, we extracted several fiber tract bundles (i.e., Inferior Fronto-Occipital Fasciculus (IFOF), Corticospinal Tract (CST), Arcuate Fasciculus (AF), Optic Radiation (OR)) from these datasets using our tool. The original volume datasets comprise 112 × 112 × 70 voxels, with a resolution of 2 × 2 × 2 mm³, a b-value of 1,000, and 56 gradient directions. All computations were performed using an Intel (R) Core i7-4820K CPU at 2.6 GHz. Our framework is implemented in C++, as a plug-in for the open-source medical image processing and visualization framework 3D Slicer [FBK*12].

Progressive Simulation
In clinical applications, generating a specific fiber tract requires constant tuning of the parameters, especially the regions of interest (ROIs). Our framework allowed our clinical partners to generate and manipulate the fiber tracking regions of interest during the progressive generation of fiber samples. They found the interactions provided in our framework useful for creating the fiber tracts. Figure 8 illustrates the progressive computation and visualization of a representative fiber and the corresponding variation for a single seed point. The fiber is part of the Arcuate Fasciculus (AF) bundle as defined by our collaborators. At each bootstrap iteration, a new fiber sample is generated from the seed point, which in turn updates the distance score table and, subsequently, the representative fiber. The variation in the representative fiber and the inclusion of the new fiber samples can be seen in Figures 8b-f, illustrating the result after 5, 10, 100, 350, and 500 iterations, respectively. In Figures 8a-d, it can be seen that the 3D representation and the histogram change significantly between the different snapshots. As the number of similar fiber samples increases within an ensemble, the representative fiber updates accordingly. After an adequate number of bootstrap iterations, the overall structure of the fiber tract, along with its variations, becomes stable, as indicated in Figures 8e and f. There are no major changes in the fiber structure and histograms, and the simulation can be considered converged. The stability of the histogram can also be analyzed with the histogram stability plot, as discussed in Section 6.3. With an increasing number of iterations, the histogram becomes stable and, consequently, the distance between consecutive histograms diminishes, as shown in the line plot in Figure 8g.
However, it should be noted that the stability plot alone is not an indicator of the convergence of the simulation; it only depicts how stable the histogram is. Convergence of the simulation is always assessed by analyzing the 3D shape of the fiber structure in combination with the stability of the histogram. Figure 9 shows the convergence behavior when using a seed region instead of a single seed point in the AF tract. As the simulation proceeds, the inclusion of more fiber samples stabilizes the distribution. Since the bundle is rather compact and is not strongly affected by noise, the fiber structure updates consistently from the beginning of the simulation. The distance distribution and the fiber samples show no major changes even as early as 50 iterations. The structure of the fiber tract samples seems stable from the early stage of the simulation, and hence, one can estimate that only a few bootstrap iterations are required for estimating the uncertainty in this case. Figure 9d shows the histogram stability plot, which further confirms the consistency of the histogram: the histogram distributions remain homogeneous from the early stage of the simulation. Figure 10 shows a second example, using region-based seeding to generate the CST tract as defined by our collaborators. Here, a tumor causes displacement of the CST tract. Initially, the distance distribution and the representative fibers update more rapidly because too few fiber samples are present. This can be seen from the samples and histogram in Figures 10a-c. The histogram stability plot, shown in Figure 10e, further shows the large distance between consecutive histogram distributions in the early stage of the simulation. As the simulation proceeds, the inclusion of more fiber samples stabilizes the distribution.
The examples shown indicate that convergence, as judged by our partners, is reached after a different number of iterations in each case. With our progressive framework, the users do not need to define the total number of iterations in advance and wait for the results. They can directly analyze the progressive results according to their stability, estimated from the 3D visualization and the histogram.

Expert Feedback
We have conducted multiple feedback sessions with our collaborators, including two radiology operators and a surgeon in training. Our collaborators generated the core fiber bundles presented in this paper using our tool and provided feedback, both informally and through a questionnaire. They noted that using our framework improved their understanding of the uncertainty present in the data and the extracted bundles. The stability of the simulations was identified by analyzing histogram stability and the 3D shape of the bundle. Our collaborators were attracted by the interactive definition of bundles. However, they also remarked that they would need more experience with such methods, and with uncertainty in general, to be able to provide reliable evidence of the benefits. We also received feedback from a collaborating neurosurgeon on the visualization of the uncertainty, discussed in more detail in Section 7.2.

Interactive Uncertainty Exploration
Similar to the strategies presented by [BPtHV13], we provide the possibility to specify confidence intervals of fiber samples to be visualized. The interval can be selected based on a percentage of fiber samples closest to their corresponding representatives, similar to quantiles in scalar value distributions. Figure 11c shows the 0–50% interval of the fiber samples closest to their representative for the AF tract. The selection is highlighted in blue. Figure 11d illustrates the 90–100% interval, showing the 10% of fibers that are farthest away from their representative fiber. The interval selection through the percentage of closest fibers has a direct interpretation as the chance to track a fiber within the region.
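The percentage-based selection can be sketched as a rank-based filter over the distances to the representative fiber. The sketch below assumes one scalar distance per fiber sample and percentage bounds that are rounded outward to whole samples; the function name `select_interval` and the rounding rule are illustrative assumptions.

```python
import math


def select_interval(distances, lower_pct, upper_pct):
    """Return indices of fiber samples ranked within [lower_pct, upper_pct].

    `distances[i]` is the distance of fiber sample i to its representative.
    Samples are ranked from closest (0%) to farthest (100%).
    """
    if not 0 <= lower_pct <= upper_pct <= 100:
        raise ValueError("percentages must satisfy 0 <= lower <= upper <= 100")
    # Indices of the samples, sorted by increasing distance.
    order = sorted(range(len(distances)), key=distances.__getitem__)
    n = len(order)
    start = math.floor(n * lower_pct / 100)
    stop = math.ceil(n * upper_pct / 100)
    return order[start:stop]
```

For example, `select_interval(d, 0, 50)` returns the half of the ensemble closest to the representative, and `select_interval(d, 90, 100)` the 10% farthest away, corresponding to the two selections described above.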
As explained in Section 6.3, our framework also allows the user to select an interval based on the histogram distribution of the distances to the representative fiber. Figures 11a and 11b illustrate the selection using the distance score histogram. By observing the histogram, one can identify the branch present in the fiber ensemble. The selection in Figure 11a corresponds to the fiber samples that are closer to the representative, while the selection in Figure 11b corresponds to the branch that is further away. As illustrated, it is possible to identify deviations from a unimodal distribution. Our representative-fiber calculation assumes that the distribution of fibers originating from a seed point is unimodal. However, this does not always hold, in which case a single calculated representative fiber is not adequate. The histogram is likely to show multiple peaks when this is the case (see Figure 7). According to our collaborators, this interaction with the histogram helps in understanding the variations present in the bundle and in identifying the outliers.
A fixed safety margin around the fiber tracts, as illustrated in Figure 2a, is a common clinical practice used to determine the area of risk. The margin is equally distributed along the fibers; however, it is not reliable, as can be seen in Figure 2b. Our collaborating neurosurgeons indicated that false negatives, such as the ones missed in Figure 2a, are especially dangerous, as neurosurgeons may inadvertently damage tracts and induce neurological deficits.
We further use the histogram for interval selection to analyze uncertainties. Figure 12 illustrates the interval selection in the case where the CST bundle is affected by a tumor. Figure 12a illustrates the selection of the 40% of fibers that are closest to their representatives, and Figure 12b shows the 70% interval. As can be seen in the figure, the closest 40% of fibers lie in the back area; however, on increasing the interval to 70%, the branch towards the frontal area can be observed. As indicated by our collaborators, the interval selection helps define the risk area for planning tumor resection surgery.

Computational Analysis
To analyze the computational cost and the acceleration achieved with our progressive approach, we have generated three fiber ensembles, corresponding to different anatomical regions defined with our collaborators. First, we compare the number of voxels necessary to compute each fiber tract sample. As the computation per voxel is identical between methods, we decided to use the number of voxels, instead of the computation time, for comparison. It should be noted that the naive approach can easily be parallelized, even for a single fiber and ensemble member. However, this advantage can be offset by computing multiple fibers in parallel with our approach. As discussed earlier, the naive bootstrap method computes the ensemble of the complete diffusion tensor field; hence, the number of voxels corresponds to all voxels comprising the volume (approximately 1.5 million). Our local bootstrap strategy only computes bootstrap samples for voxels along the fiber of interest. We provide an overview of the required mean (µ) number of voxels and the standard deviation (σ) per iteration of the same tract in Table 1. The mean and standard deviation are computed over 100 bootstrap iterations. In summary, our local bootstrap strategy significantly reduces the number of bootstrap computations required for each iteration. Consequently, the computation time for the required voxels, as well as the required memory, is significantly lower compared to the naive approach.
In typical applications, users are interested in fiber bundles consisting of multiple fiber tracts. Typically, those fiber tracts share a significant number of voxels, meaning that as we increase the number of fiber tracts computed for the same seed regions, these shared voxels can be re-used. Figure 13 shows the number of voxels that need to be computed for the Corticospinal Tract (CST) and Arcuate Fasciculus (AF), with increasing seeding density. As can be seen in Figures 9 and 10, the fibers are much less spread in the AF compared to the CST. Consequently, the number of voxels needed for computing the bundle flattens out much more quickly for the AF than for the CST (Figure 13). Nonetheless, in both cases, we can observe a flattening of the curves, indicating that the performance gains are even bigger for bundles than for individual fibers.
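The voxel re-use discussed above can be sketched as a lazily evaluated field with a per-iteration cache: a voxel is resampled only when a fiber first visits it, and later fibers of the same iteration read the cached value. The class and function names below are hypothetical, the cache is assumed to be scoped to one bootstrap iteration, and `placeholder_sample` merely stands in for the actual per-voxel wild-bootstrap resampling and tensor fit, which are not reproduced here.

```python
import random


class LocalBootstrapField:
    """One bootstrap realization of the tensor field, evaluated on demand."""

    def __init__(self, sample_voxel, seed=None):
        # `sample_voxel(voxel, rng)` is a hypothetical callback that produces
        # the bootstrap sample for a single voxel.
        self._sample_voxel = sample_voxel
        self._rng = random.Random(seed)
        self._cache = {}

    def tensor_at(self, voxel):
        """Return the bootstrap sample at `voxel`, computing it at most once."""
        if voxel not in self._cache:
            self._cache[voxel] = self._sample_voxel(voxel, self._rng)
        return self._cache[voxel]

    @property
    def voxels_computed(self):
        """Number of voxels that actually required a bootstrap computation."""
        return len(self._cache)


def visit_path(field, voxel_path):
    """Stand-in for fiber tracking: query the field along a fixed voxel path."""
    return [field.tensor_at(voxel) for voxel in voxel_path]


def placeholder_sample(voxel, rng):
    """Placeholder for the per-voxel resampling; returns a random scalar."""
    return rng.gauss(0.0, 1.0)
```

With two voxel paths of five voxels each that share their first three voxels, `voxels_computed` is 7 rather than 10 after both paths have been visited. This is the effect that makes the curves in Figure 13 flatten as the seeding density increases.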
Our framework consists of two major stages: progressive bootstrap and fiber tracking, where fibers are generated, and progressive fiber aggregation, where derived data, such as the representative fiber, are computed. The computation time for the bootstrap and fiber-tracking stage varies only with variations in the data, i.e., tracts taking longer or shorter paths in the current iteration. The computation time for the progressive fiber aggregation stage, however, increases with each iteration, i.e., at iteration n, n−1 pairwise distances need to be computed. While this sums up to the same N²/2 distances that need to be computed without the progressive approach, distributing the computations over the iterations reduces the wait time for the visualization significantly. Further, introducing the similarity threshold can drastically cut computation times. We illustrate the correlation between the cutoff threshold and the computation time in Figure 14. We performed the progressive aggregation for 1,000 bootstrap iterations with increasing threshold values. As can be seen, the computation time (green) drastically decreases, even for small threshold values. At the same time, the error (red line, Figure 14) compared to the exact computation without a threshold increases with larger threshold values. The error is calculated as the mean of the closest point distance between the exact representative fiber and the representative fiber computed with the given threshold value. Since the fiber ensemble is computed using a stochastic process, we repeated the calculations 100 times; the resulting variation is represented by the error bars in Figure 14. Given the curves in Figure 14, we estimate that a small cutoff can provide a significant speed-up with no noticeable reduction in accuracy.
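The incremental cost of the aggregation stage can be illustrated with a minimal sketch that maintains a distance score per fiber and updates it whenever a new sample arrives. It assumes a symmetric form of the mean closest point distance and takes the representative to be the fiber with the smallest summed distance to all others; the similarity threshold is deliberately omitted. The names `ProgressiveAggregator` and `mean_closest_point_distance` are illustrative, not those of our implementation.

```python
import math


def mean_closest_point_distance(fiber_a, fiber_b):
    """Symmetric mean closest point distance between two polylines.

    Assumption: the symmetric variant averages both directed distances.
    """
    def directed(source, target):
        return sum(min(math.dist(p, q) for q in target) for p in source) / len(source)

    return 0.5 * (directed(fiber_a, fiber_b) + directed(fiber_b, fiber_a))


class ProgressiveAggregator:
    """Keeps summed pairwise distances up to date as fiber samples arrive."""

    def __init__(self, distance=mean_closest_point_distance):
        self.fibers = []
        self.scores = []  # scores[i]: summed distance from fiber i to all others
        self.distance_evaluations = 0
        self._distance = distance

    def add(self, fiber):
        """Add one fiber sample; costs one distance per stored fiber."""
        new_score = 0.0
        for i, other in enumerate(self.fibers):
            d = self._distance(fiber, other)
            self.distance_evaluations += 1
            self.scores[i] += d
            new_score += d
        self.fibers.append(fiber)
        self.scores.append(new_score)
        return self.representative_index()

    def representative_index(self):
        """Index of the fiber with the smallest summed distance."""
        return min(range(len(self.scores)), key=self.scores.__getitem__)
```

Adding the n-th fiber evaluates n−1 distances, so N fibers require N(N−1)/2 evaluations in total, in line with the roughly N²/2 stated above. The representative is nevertheless available after every single `add` call, which is what allows the visualization to update immediately.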

Conclusion and Future Work
In this work, we have presented a progressive visual analytics strategy for uncertainty visualization in DTI fiber tracking, based on stochastic modeling. We have modified the wild-bootstrapping and fiber-tracking pipeline to enable a progressive approach. In particular, we have designed a local wild-bootstrapping approach, integrated into and driven by interactive fiber tracking. The presented pipeline can be implemented with other stochastic strategies and fiber tracking approaches that rely only on local information, such as tensor deflection (TEND) [Laz10]. Fiber tracking methods that require global properties of the volume to reconstruct the fiber tract, such as geodesic-based fiber tracking [HWF11], would, however, not directly benefit from the presented approach. Although we developed our progressive pipeline for DTI, the concept is extensible to HARDI and other models, as long as a local simulation method for uncertainty estimation is present, such as [CDW11]. Furthermore, we have described a progressive approach to aggregate the fiber ensemble during computation, for immediate progressive visualization. We have adapted previous work by Brecheisen et al. [BPtHV13] such that it can work progressively. As shown in Section 7.3, our proposed progressive update of the representative fiber effectively reduces the computational cost. While we use the mean of the closest point distances between fiber pairs, as proposed in the original work, our progressive aggregation would also work with other distance measures [JPS*10; MVV05]. The main idea behind our aggregated visualization is to provide a less cluttered uncertainty representation by showing the variations around an identified representative fiber. Our definition of representative fiber assumes a unimodal distribution of fibers originating from a seed point. However, this may not always be the case.
If the distribution is multimodal, e.g., bimodal, then the representative fiber loses its meaning, for example appearing in between the two modes. Other summarization and aggregation methods would need to be explored to be able to address this issue. We are using deterministic fiber tracking in combination with the bootstrap method, hence, the limitations of deterministic fiber tracking still persist in our pipeline and other approaches could be explored as future work.
Our clinical collaborators stress the relevance of adding uncertainty to their existing workflow. However, the lack of access to tools that show uncertainty makes it difficult to demonstrate its real benefit in practice. Our proposed progressive approach is a first step towards reducing the clinical bottleneck, making uncertainty visualization more accessible to clinicians. As a future direction, we want to integrate the progressive uncertainty visualization into their workflow and evaluate whether and how uncertainty influences the decision-making process. A progressive pipeline provides the possibility of immediate analysis, at the risk of evaluating premature results. In this paper, we rely on the stability shown by the animation of the visualized results to indicate the reliability of the results. However, more research is needed to evaluate the implications of the progressive pipeline. Despite positive anecdotal feedback from our clinical partners, the acceptance of our progressive framework by clinical users cannot be assumed, as the users are unaccustomed to uncertainty visualization and it requires some experience to adopt it in a routine workflow.
The focus of this work was on the progressive computation rather than the visual representation. More sophisticated visual representations, integrated with the progressive aggregation method, are an interesting avenue for future work. Furthermore, we have explored a limited number of sources of uncertainty. The progressive framework can be extended to accommodate other sources of uncertainty coming from other stages of the pipeline.