Vascular diseases can be diagnosed and characterized by abnormalities in blood vessel morphology observed with three-dimensional medical imaging techniques such as magnetic resonance angiography (MRA). For example, the severity of hypertension correlates with the tortuosity, or twistedness, of arteries (Hiroki et al., 2002). Currently, nearly all medical evaluation of 3D images is performed qualitatively through visual assessment by specialists. Quantitative assessment of vessel morphology by computer software, including radius, length, and tortuosity measurements, would make it possible to compare measurements across medical centers, track changes over time, and screen automatically for vascular disease. Quantitative assessment of artery morphology can be made from centerlines of arteries (O'Flynn et al., 2007; Lesage et al., 2009). Arterial centerlines have also been used to measure the tortuosity of blood vessels (Bullitt et al., 2003). Centerlines can be used to measure artery lengths and radii. Changes of radius along arterial centerlines can potentially detect stenoses and aneurysms (Frangi et al., 1999; Kang et al., 2009; Lesage et al., 2009). Centerlines can therefore be used for many tasks involving the quantitative analysis of blood vessels.
Stable and accurate centerline algorithms are needed to quantitatively measure and investigate the blood vessels and the effects of disease on blood vessels. Stability of the centerline is the ability of an algorithm to create the same centerline for the same image data with different input parameters, primarily the starting point of the centerline tree. Accuracy refers to how close an extracted centerline is to an ideal centerline for a numeric phantom. Centerline accuracy and stability measurement methods are needed to select the best algorithms for generating centerlines for a quantitative task. Accuracy and stability visualization methods are needed to know where centerlines are accurate or inaccurate and stable or unstable. Different studies will have different areas of interest; the researcher will want to know if the centerline is accurate and stable in the area of interest. For example, intracranial aneurysms typically occur in the circle of Willis arteries, which experience higher blood pressure and pressure variations than the peripheral intracranial arteries (Arimura et al., 2004). Thus, for the purpose of aneurysm detection, it is more important that the circle of Willis arterial centerlines are stable, whereas the stability of the peripheral arterial centerlines is less important to aneurysm assessment.
The purpose of this study is to develop methods for measuring and visualizing the accuracy and stability of centerline algorithms and select the best available algorithm for creating centerlines in central arteries of human brain MRA images. Arterial centerlines have the potential for developing diagnostic and descriptive measures of vascular diseases. The methods developed here may also be used to quantify tubular structures in any three-dimensional image.
MATERIALS AND METHODS
In this study, we collected phantom and human subject images including: a computer generated helical and straight line phantom (Fig. 1A,B), two computer generated branching phantoms with background noise from Aylward (Aylward and Bullitt, 2002) (Fig. 2A,B), eight human brain Time of Flight (TOF)-MRA images (Fig. 3), and four numerical helix phantoms for testing the length measurement using the best performing centerline algorithm (Fig. 4). The helical phantoms were generated by calculating points on a helix (using the equation x(t) = cos(t), y(t) = sin(t), z(t) = t) and then rolling a ball with a six voxel radius along the points. The eight MRA image data sets were selected from our ongoing intracranial aneurysm study approved by the University of Utah Institutional Review Board. Additional information on the data set is given in the Supporting Information S1.
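The phantom construction described above can be illustrated with a minimal sketch. It is not the authors' Java/ImageJ code: the function names, the sampling step, and the `radius` and `pitch` scale factors are assumptions added for illustration (the source gives only the unit helix equation and the six voxel ball radius), and the ball is stamped at each helix point after rounding to the nearest voxel.

```python
import math

def helix_points(t_max, step=0.05, radius=1.0, pitch=1.0):
    """Sample x = radius*cos(t), y = radius*sin(t), z = pitch*t for t = 0, step, 2*step, ... up to t_max."""
    n = int(t_max / step)
    return [(radius * math.cos(i * step), radius * math.sin(i * step), pitch * i * step)
            for i in range(n + 1)]

def roll_ball(points, ball_radius):
    """Return the set of integer voxels covered by a ball centered on each point (rounded to a voxel)."""
    r = int(math.ceil(ball_radius))
    # Precompute the voxel offsets that fall inside the ball.
    offsets = [(dx, dy, dz)
               for dx in range(-r, r + 1)
               for dy in range(-r, r + 1)
               for dz in range(-r, r + 1)
               if dx * dx + dy * dy + dz * dz <= ball_radius ** 2]
    voxels = set()
    for x, y, z in points:
        cx, cy, cz = round(x), round(y), round(z)
        for dx, dy, dz in offsets:
            voxels.add((cx + dx, cy + dy, cz + dz))
    return voxels

# Hypothetical scale: one helix turn of radius 20 voxels, ball radius 6 voxels.
phantom = roll_ball(helix_points(2 * math.pi, step=0.1, radius=20.0, pitch=5.0), 6)
```

The sampled points double as the known centerline of the phantom, which is what makes an accuracy comparison against extracted centerlines possible.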
Image processing tools for this study were developed in Java with the ImageJ toolkit (Rasband, 1997; Burger and Burge, 2007). Results were stored in the MySQL database (available at http://www.mysql.com/). Graphing results and statistical analysis were performed with R (R Development Core Team, 2009).
The unsegmented computer generated branching phantoms and the eight human brain MRA images were segmented from the background noise and brain tissue, leaving the simulated arterial tissue (Fig. 5A,B) or the human brain arterial tissue (Fig. 6), with a z-buffer segmentation (ZBS) algorithm (Parker et al., 2000; Chapman et al., 2004). A three-dimensional movie of the segmented arteries is available online in Supporting Information S3.
The point where the three branches of the branching phantoms meet is narrow so that the region growing threshold employed during the ZBS segmentation had to be lowered in order to keep the segmented branches all in one connected component (Fig. 5B). Additional details of the segmentation process are covered in the Supporting Information S2. The result of segmentation is the extracted arterial tree.
Cost Function Segmentation Preprocessing
To generate a centerline through the segmented arteries, a cost was assigned to every voxel (a three-dimensional pixel) in the extracted arterial tree. Four different cost functions were compared: MDFEi, COM, BT-COM, and BT-MDFEi.
The MDFEi cost function calculates a value for each voxel that is higher for voxels closer to the edge of the arteries and lower for voxels closer to the middle of the arteries. It first calculates the distance from edge (DFE), then modifies the DFE (MDFE) to break ties, and finally inverts the MDFE so that the costs are higher on the outside and lower on the inside (Zhang et al., 2005) (Fig. 7A).
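The idea of a distance-from-edge cost and its inversion can be sketched as follows. This is an illustration under stated assumptions, not the method of Zhang et al. (2005): the DFE here is a city-block distance computed by breadth-first search over 6-connected neighbors, the tie-breaking modification (MDFE) is not reproduced, and the inversion `max - DFE + 1` is a hypothetical choice. The segmentation is assumed to be a set of integer voxel tuples.

```python
from collections import deque

NEIGHBORS_6 = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def distance_from_edge(voxels):
    """City-block DFE: 1 for voxels touching background, increasing toward the middle."""
    dfe = {}
    queue = deque()
    # Seed the search with every voxel that has at least one background neighbor.
    for v in voxels:
        if any((v[0] + dx, v[1] + dy, v[2] + dz) not in voxels for dx, dy, dz in NEIGHBORS_6):
            dfe[v] = 1
            queue.append(v)
    # Grow inward one layer at a time.
    while queue:
        x, y, z = queue.popleft()
        for dx, dy, dz in NEIGHBORS_6:
            n = (x + dx, y + dy, z + dz)
            if n in voxels and n not in dfe:
                dfe[n] = dfe[(x, y, z)] + 1
                queue.append(n)
    return dfe

def inverse_cost(dfe):
    """Invert the DFE so edge voxels are expensive and central voxels are cheap (assumed form)."""
    top = max(dfe.values())
    return {v: top - d + 1 for v, d in dfe.items()}
```

On a one-voxel-wide skeleton every voxel touches background, so this DFE is one everywhere and the inverted cost is constant.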
The center of mass (COM) cost function loops through every voxel in the segmentation and calculates the average X, Y, and Z positions of each voxel and up to 26 three-dimensional neighbors as the COM, recording each voxel's COM and cumulative distance moved from the original position to each subsequent COM through multiple iterations. At each iteration, the COM calculation depends on the positions of the previous iteration. The COM calculation is repeated until all voxels have been moved a minimum of 30 times. Increasing iterations increased stability only minimally after 30 iterations. The cumulative distances moved are divided by the minimum nonzero distance moved in the entire segmentation and the result is cubed. Voxels at the segmentation edge begin moving with the earliest iterations and tend to move the farthest, generating high-cost scores. Voxels near the center move with later iterations and for short distances, generating low-cost scores (Fig. 7B).
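A simplified sketch of the COM cost is given below. Several details are assumptions rather than the authors' implementation: neighbors are fixed by the original 26-connected grid adjacency, a fixed number of iterations replaces the "moved a minimum of 30 times" stopping rule, and moves smaller than a small tolerance are ignored so that floating-point rounding is not counted as movement. The function name `com_costs` is hypothetical.

```python
import math
from itertools import product

OFFSETS_26 = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]

def com_costs(voxels, iterations=30, tolerance=1e-9):
    """Cost per voxel: (cumulative distance moved / smallest nonzero cumulative distance) cubed."""
    neighbors = {}
    for v in voxels:
        candidates = [(v[0] + dx, v[1] + dy, v[2] + dz) for dx, dy, dz in OFFSETS_26]
        neighbors[v] = [n for n in candidates if n in voxels]
    pos = {v: (float(v[0]), float(v[1]), float(v[2])) for v in voxels}
    moved = {v: 0.0 for v in voxels}
    for _ in range(iterations):
        new_pos = {}
        for v in voxels:
            # Center of mass of the voxel and its neighbors, using last iteration's positions.
            group = [pos[v]] + [pos[n] for n in neighbors[v]]
            com = tuple(sum(p[i] for p in group) / len(group) for i in range(3))
            step = math.dist(pos[v], com)
            if step > tolerance:  # ignore rounding noise (assumed guard)
                moved[v] += step
            new_pos[v] = com
        pos = new_pos
    nonzero = [d for d in moved.values() if d > 0]
    if not nonzero:
        return {v: 0.0 for v in voxels}
    unit = min(nonzero)
    return {v: (d / unit) ** 3 for v, d in moved.items()}
```

Cubing the normalized distance widens the gap between edge and central voxels, which pushes a lowest-cost path toward the middle of the largest mass.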
The BT centerline algorithm (Homman, 2007) eroded the segmentations to single voxel-width skeletons (Figs. 8A,B and 9). The brain artery skeletons are close to centerlines but have short segments running across wide arteries (Fig. 9). The skeletons were used as inputs into the MDFEi or COM cost functions to utilize the existing software program developed for the inverse MDFE and COM centerlines. The DFE will always be one for every voxel and not change the skeleton. The voxels will typically only have 1, 2, or 3 neighbors in the COM reducing the amount of movement compared with the earlier algorithms.
The result of preprocessing is a cost value for every voxel in the extracted arterial tree. The costs are higher at the edge and lower in the middle. The binary-thinned arteries have only one cost value. The centerline of the arteries is the lowest cost path through the cost function.
To calculate the centerline, the precomputed arterial tree costs were input into the Dijkstra shortest paths algorithm (Dijkstra, 1959). Dijkstra's algorithm calculated the lowest cost centerlines from every voxel back to a selected starting root voxel. The paths less than 30 voxels long were then removed, leaving a skeleton centerline of the arterial tree (Zhang et al., 2005). The root of the MDFEi-based centerline tree was the maximum MDFE, the thickest point in the arterial tree (Zhang et al., 2005). The root of the BT-MDFEi-based centerline tree was the most central voxel in the arterial tree. The roots of the COM-based and binary thinning-center of mass (BT-COM)-based centerline trees were the voxels with the lowest COM score. In the event of tied starting root points, the root closest to the center of the image was selected. The centerlines tend toward the lower cost middle voxels of the preprocessed segmentations (Fig. 10A,B). Shortest paths centerline generation on the binary-thinned input has the effect of pruning off short branches that run across artery widths (Fig. 11). The results of the centerline algorithms were single voxel width centerline skeletons of the arterial trees.
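The shortest paths step can be sketched with the standard library priority queue. This is a minimal illustration, not the authors' software: it assumes 26-connected voxels and charges the cost of each voxel entered, and it omits the selection of branch paths and the removal of paths shorter than 30 voxels. The names `lowest_cost_paths` and `path_to_root` are hypothetical.

```python
import heapq
from itertools import product

OFFSETS_26 = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]

def lowest_cost_paths(costs, root):
    """Dijkstra over the voxels in `costs`; returns each voxel's predecessor toward the root."""
    dist = {root: 0.0}
    prev = {}
    heap = [(0.0, root)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue  # stale queue entry
        for dx, dy, dz in OFFSETS_26:
            n = (v[0] + dx, v[1] + dy, v[2] + dz)
            if n not in costs:
                continue
            nd = d + costs[n]  # assumed edge weight: cost of the voxel being entered
            if nd < dist.get(n, float("inf")):
                dist[n] = nd
                prev[n] = v
                heapq.heappush(heap, (nd, n))
    return prev

def path_to_root(prev, voxel):
    """Follow predecessors from a voxel back to the root."""
    path = [voxel]
    while path[-1] in prev:
        path.append(prev[path[-1]])
    return path
```

Because every voxel has exactly one predecessor, the union of these paths is a tree, which is consistent with the nonlooping branches reported for the MDFEi- and COM-based centerlines.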
Stability of the centerline was measured by generating the centerlines for the same segmentation starting at different root points. The first centerline tree is initiated from the root as described for the centerline algorithms. The arterial tree endpoints of the largest connected centerline tree were used as roots for a second round of centerlines. Smaller centerline trees of segmented arteries not connected to the largest section were discarded.
To measure and identify stable and unstable centerlines, the first round and all second round centerlines were accumulated in one image. The most stable centerline points occur in the same voxel for all N centerline root points. The stability measure for an image was the percentage of centerline voxels in the accumulated image called centerline for all of the centerline roots.
To visualize the accumulated centerlines the inverse of the accumulation was plotted in 3D with the surrounding segmentation. This makes unstable voxels that are called centerline by fewer than N roots brighter than their neighbors and therefore easily visible (Fig. 12A,C,E,F).
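The stability measure and its inverse visualization can be expressed compactly. The sketch below assumes each centerline tree is given as a collection of voxel tuples, one per root, and interprets "the inverse of the accumulation" as N minus the per-voxel count; both the data layout and that interpretation are assumptions, and the function names are hypothetical.

```python
from collections import Counter

def accumulate(centerlines):
    """Count, for every voxel, how many of the centerline trees include it."""
    return Counter(v for line in centerlines for v in set(line))

def stability_percent(centerlines):
    """Percentage of accumulated centerline voxels that appear in the trees of all roots."""
    counts = accumulate(centerlines)
    if not counts:
        return 0.0
    n_roots = len(centerlines)
    stable = sum(1 for c in counts.values() if c == n_roots)
    return 100.0 * stable / len(counts)

def instability_image(centerlines):
    """Inverse accumulation: 0 for fully stable voxels, larger (brighter) for less stable ones."""
    counts = accumulate(centerlines)
    n_roots = len(centerlines)
    return {v: n_roots - c for v, c in counts.items()}
```

For example, three trees that share two of four accumulated voxels would score 50%.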
Accuracy of the centerlines of the phantoms is measured by the root mean square error (RMSE) of Euclidean distances from the algorithm generated centerline points to the nearest known centerline point for a phantom. The centerline points used to generate the helix line phantom were known. The known centerline points for the helix line phantom were sparse (Fig. 1B); the algorithm-derived centerlines had more points because the centerlines extend out to the last voxel at the end, while the known center points used to generate the phantom stop at one radius distance from the edge of the phantom, as seen in Fig. 1B. Therefore, the RMSE was computed only over the set of known points and their nearest algorithm-derived centerline points. Aylward and Bullitt (Aylward and Bullitt, 2002) provided ideal subvoxel accuracy positive control centerline coordinates, available in a text file, for the branching phantoms with noise. The RMSE of these phantoms was calculated between each centerline point determined by the Dijkstra algorithm and the closest subvoxel positive control point. The accuracy of the helix line phantom was visualized by plotting each algorithm centerline coordinate in red, accumulating for each starting root, and plotting the positive centerline control points in green. The red and green colors together made yellow, showing where the algorithm and positive control points were the same and where they differed (Fig. 12B,D,F,H).
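The nearest-point RMSE can be written in a few lines. This is a brute-force sketch with a hypothetical function name; it measures from every point in the first argument to its closest point in the second, so the direction used for the helix line phantom (known points to nearest extracted points) is obtained by swapping the arguments.

```python
import math

def rmse_to_nearest(points, targets):
    """RMSE of the Euclidean distance from each point to its nearest target point."""
    squared = [min(math.dist(p, q) for q in targets) ** 2 for p in points]
    return math.sqrt(sum(squared) / len(squared))

# Branching phantoms: extracted centerline -> subvoxel control points.
# Helix line phantom: known generation points -> extracted centerline.
```

The search is quadratic in the number of points, which is adequate for single centerlines; a spatial index would be a reasonable substitute for larger trees.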
The lengths of four numerical helix phantoms of decreasing pitch were measured by four different methods: COM centerline, smoothed COM centerline, generation points, and analytical. The COM centerline, smoothed COM centerline, and generation points methods summed the Euclidean distances between consecutive points. The smoothed COM centerline method first replaced each centerline point with the mean x, y, and z coordinates of the point and its one or two neighbors on the centerline. The generation points method generated points of the line x(t) = cos(t), y(t) = sin(t), z(t) = t with a step size of 1.0 × 10−5 mm for values of t and summed the distances between them. The analytical lengths were calculated from the arc length equation of the helix. Two times the radius was added to the generation points and analytical lengths because the lines do not extend to the ends of the objects.
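The length measurements can be sketched as follows. The summed-distance and neighbor-mean smoothing functions follow the description above; the analytic function uses the standard arc length of a helix x = r cos(t), y = r sin(t), z = c t, which is sqrt(r^2 + c^2) times the range of t. The parameters `radius` and `pitch` are assumptions for illustration, since the equation used in the study is not reproduced here, and all function names are hypothetical.

```python
import math

def polyline_length(points):
    """Sum of Euclidean distances between consecutive points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def smooth(points):
    """Replace each point by the mean of itself and its one or two neighbors on the line."""
    out = []
    for i in range(len(points)):
        group = points[max(i - 1, 0): i + 2]
        out.append(tuple(sum(p[k] for p in group) / len(group) for k in range(3)))
    return out

def helix_length_analytic(t_max, radius=1.0, pitch=1.0):
    """Arc length of (radius*cos t, radius*sin t, pitch*t) for 0 <= t <= t_max."""
    return math.sqrt(radius ** 2 + pitch ** 2) * t_max

# A stair-stepped voxel path is longer than its smoothed version.
stairs = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (2, 1, 0), (2, 2, 0)]
raw_length = polyline_length(stairs)
smoothed_length = polyline_length(smooth(stairs))
```

Note that this smoothing also pulls the two end points inward, so it shortens the line slightly at the ends as well as along stair steps.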
Phantom Centerline Stability and Accuracy
The centerline tree of the first helix line phantom (Fig. 1B) had seven ends. The first starting point at the thickest point of the phantom, followed by starting points at the seven ends, made a total of eight centerline trees for the stability analysis. In this case, the inverse MDFEi cost function was the least stable with the lowest stability score (Table 1). Instability occurred throughout the helix, at bifurcations, and at line ends (Fig. 12A). The binary thinning skeleton left only one possible, highly stable centerline, with some instability occurring at ends when the skeleton was passed to either the inverse MDFEi or COM programs. The COM-based centerline was more stable (higher stability score) than the inverse MDFEi-based centerline (Table 1).
Table 1. Helix line phantom stability and accuracy. Columns: RMSE of accuracy. The MDFEi algorithm produced the most accurate centerline in the helix line phantom and was the least stable.
The MDFEi cost function was the most accurate (lowest RMSE) despite being the most unstable (lowest stability score) (Table 1). The accuracy visualization shows the positive control green, algorithm red and overlapping yellow centerlines (Fig. 12B,D,F,H). The inaccuracies occur at ends and bifurcations and in the helix portion of the phantom. The locations of the inaccuracies are similar to the locations of instability (Fig. 12). The COM-based centerline lost more accuracy than the other algorithms bending around bifurcations as seen by the green color in Fig. 12D. The binary thinning algorithms were frequently a few voxels off as seen by the green in the helix (Fig. 12F,H) accounting for the high RMSE of accuracy.
Stability and Accuracy on Phantoms with Noise
The MDFEi-based centerline algorithm had the lowest stability and best accuracy (lower RMSE) for the lower-noise SD 10 phantom. For the higher-noise SD 20 phantom, the COM and binary thinning paired with COM had lower RMSE. Binary thinning followed by the COM-based algorithm consistently outperformed binary thinning followed by the MDFEi-based algorithm, with higher stability scores and lower RMSE of accuracy. Therefore, the rest of the trials on MRA data used the binary thinning followed only by the COM-based algorithm for centerline generation. The number of primary (one) plus secondary starting root points was three for most algorithms and four for the binary thinning followed by the COM-based algorithm, because the initial start point for the first round centerline was near an end of the branching object for the tests with three starting roots. The binary thinning followed by COM had the highest stability despite having the extra centerline tree (Table 2). The bright end of the lower right branch in Fig. 13B shows that much of the COM-based algorithm instability happens at the end of the branch because one of the starting roots occurred here, shortening the centerline. When the centerline was rooted at another branch end, the centerline extended longer at this branch end. This was an example of how the starting point alters the centerline tree.
Table 2. Comparison of algorithm stability and accuracy on phantoms. Columns: number of trees, RMSE of accuracy. Stability and accuracy comparison of the MDFEi, COM, BT-MDFEi, and BT-COM algorithms on branching tubular phantoms with standard deviation (SD) 10 and 20 distributed Gaussian noise.
Artery Centerline Stability
The running time for calculating COM costs for arterial trees was under 60 sec for a total time of 3–5 min to generate the centerlines for all the centerline algorithms. Stability images from the segmentation are shown in Fig. 14. A region of instability was seen in the left internal carotid artery (ICA) in Fig. 14A. In the vessel segmentation, the ICA siphon frequently loops back and touches itself, creating a shortcut for the centerline to pass through. The MDFEi-based centerlines pass through the kissing point (Figs. 15A and 16A,B). Dijkstra's shortest paths on the segmentation cost functions of the MDFEi- and COM-based algorithms produce only nonlooping branches. In the MDFEi-based algorithm, the first centerline passes through the kissing point and subsequently two centerlines extend out from the kissing point to end on the distant edges of the siphon arterial wall (Figs. 14A,B and 16B). The COM-based algorithm produced high scores near the kissing point, even though the scores in the kissing point are low (Fig. 15B), causing the first centerline to run around the siphon loop (Figs. 14C,D and 16B). Shorter centerlines are subsequently generated by Dijkstra's algorithm (Dijkstra, 1959) from the kissing point to end at the longer centerline. However, these shorter centerlines fall below the 30 voxel length threshold and are removed, leaving the final centerline tree (Fig. 16B). The BT-COM-based centerline consistently forms a loop with one part of the centerline passing through the narrow kissing point (Figs. 14E,F and 16C). Failing to pass a centerline through the ICA siphon loop is a common centerline failure and is used as a measure of centerline accuracy in the MRI images, since there is no gold standard centerline for computing the RMSE as with the earlier phantoms. Three-dimensional stability images of the artery are available online for the MDFEi algorithm in Supporting Information S4 and for the COM algorithm in Supporting Information S5.
Supporting Information S6 and S7 contain movies of a maximum intensity projection (MIP) along the Z axis of each centerline tree from the different root points for the DFE (Supporting Information S6) and COM (Supporting Information S7) algorithms.
As with the phantoms, the COM- and BT-COM-based centerlines had higher stability than the MDFEi-based centerlines. We recorded when the centerline succeeded and failed to pass through the ICA siphon (Table 3). Success meant that the centerline passed through the ICA without touching the edge of the segmentation for all roots; any failure of one tree was counted as a failure. The MDFEi-based centerline would frequently pass through the ICA siphon correctly for some starting root locations but not others, leading to its lower mean stability measure.
Table 3. Comparison of centerline algorithms on MRA brain images. Columns: ICA siphons accurate, portion ICA siphons correct, both ICA correct in image, mean number of trees, standard deviation of trees, standard deviation stability. Comparison of centerline stability, number of centerline tree roots, and correctness of the centerline through the ICA siphon between centerline algorithms on eight brain artery MRA images.
A one-way analysis of variance (ANOVA), using the lm command from R (R Development Core Team, 2009), of the centerline accuracy in the 16 ICA siphons by algorithm (MDFEi, COM, BT-COM) showed a significant difference P-value = 3.62e-04. The COM was significantly more accurate than BT-COM, P-value = 1.26e-02. BT-COM was more accurate than MDFEi but not significantly different, P-value = 9.01e-02.
One-way ANOVA showed that the algorithm significantly affects the stability, P-value = 1.63e-06. As the centerline stability box and whiskers plot shows, the BT-COM-based algorithm and COM algorithm are very close in stability, P-value = 0.846, indicating no significant difference. The MDFEi algorithm produces a significantly less stable centerline than the BT-COM-based algorithm, P-value < 0.0001 (Fig. 17A and Table 4).
Table 4. Means by algorithm. Columns: mean number of trees. The MDFEi algorithm was less stable than the COM and BT-COM algorithms while generating a similar number of centerline trees for the stability measure.
The first run of different centerline algorithms can produce differing numbers of tree ends for roots of following centerline trees generated to measure stability. More ends and more centerline trees create more opportunity for instability. The centerline algorithms did not produce significantly different numbers of tree ends (P = 0.862). The number of tree ends used as roots does not account for the instability of the MDFEi centerline algorithm as seen in Fig. 17B and Table 4. Table 4 summarizes the centerline stability data by algorithm recorded in Table 5.
Table 5. Centerline stability and the number of trees used to generate the stability measure were collected for eight human brain data sets.
Numerical Helix Phantoms Length Measure
Extracted centerline lengths measured from the COM and smoothed COM centerlines were greater than the known centerline lengths, measured from the generation points and analytically, by no more than 2% (Table 6). The generation points are shown in Fig. 18A and the extracted centerline is shown in Fig. 18B.
Table 6. The lengths of numerical phantom helixes with voxel size 1.0 × 1.0 × 1.0 mm3 were measured from the COM centerlines, the smoothed COM centerlines, and the ends of the phantoms through the control points. The increase of the smoothed COM centerline length over the control length and the percent increase of the smoothed COM centerline length over the control length were calculated.
Our study showed the COM-based centerline algorithm was more stable and correctly calculated more centerlines around the kissing ICA loops than the other algorithms tested using the newly developed centerline stability measure and visualization of ICA loop centerline. The stability measurement strategy demonstrated the consistency of the COM-based centerline algorithm in the kissing ICA loops. The stability measurement strategy of starting the centerline tree at different points can be reused to test other centerline algorithms for use on any tubular structure. The stability, and phantom accuracy, visualization methods developed here identified where inaccuracy and instability were occurring. These methods are also usable for a wide range of centerline algorithms and applications of centerlines to anatomical studies.
This study is a first attempt to address the problem of kissing vessels. Kissing blood vessels are common in the segmentations of the brain and other anatomy. The method of having the centerline follow the middle of the mass of the artery solved the kissing vessel problem in the ICA loop in this study. Using mass in the centerline algorithm will be useful in any anatomical case where the true anatomical centerline is in the largest mass and noise creates smaller adjacent structures to the vessel.
The measure of stability from multiple starting points was able to determine the COM-based algorithm handled the ICA kissing siphon from all starting ends ensuring the stability of the algorithm. It would be computationally impractical to test stability by starting the stability measure from every point in the segmentation or even from every point in the first centerline tree. By starting the centerline trees from all ends of the first centerline tree the centerline algorithm will approach the regions of instability such as kissing ICA loop and all bifurcations from all directions thus testing the algorithm from all directions. The visualization of stability and accuracy allowed us to see that the instabilities and inaccuracies are mainly occurring at kissing vessel points and bifurcations.
The COM-based centerline algorithm generates correct centerlines in cases where the artery is much larger than the kissing points and is resistant to adjacent segmentation noise as long as the noise is smaller than the artery. The COM-based centerline gravitates toward the center of the largest mass. In the case of the ICA siphon, the largest mass is in the loop, and the kissing section is smaller and contributes less to the center of the mass.
All the cost functions had inaccuracies in the loop of the helix. The curve of the helix in the phantom is approximated to the nearest voxel and the centerline algorithms also have to approximate voxel positions causing the inaccuracies and instability frequently seen in curving centerlines. Curved centerlines often have a stair-stepped appearance. Some applications of the centerline may require subvoxel smoothing of the centerline to obtain smooth curves.
A limitation of comparing centerline algorithms by stability is that the most accurate centerline algorithm was not always the most stable algorithm. The BT algorithm was inherently stable because it erodes the segmentation from all outside points simultaneously to a single skeleton line. The BT cost functions, paired with MDFEi or COM, had consistently high stability while not having the highest accuracy in the phantom or in the count of correct ICA siphon centerlines in the brain images. The BT algorithm removed most voxels of the segmentation in the skeletonization step leaving a limited path for the centerline generation from the skeleton allowing the BT algorithms to maintain high stability.
Highest stability also did not correspond to highest accuracy in the no-noise and low-noise phantoms, where the MDFEi-based algorithm was least stable but most accurate. In the brain images, the COM-based algorithm was clearly most accurate in the ICA siphon and was not significantly less stable than the most stable BT-COM-based algorithm. While having the lowest stability in all phantoms and brain images, the MDFEi-based centerlines were the most accurate in the helix line phantom with no added noise (Fig. 1A,B) and in the low-noise SD 10 phantom (Fig. 2A). The RMSE of accuracy of the MDFEi centerlines increased from 0.393 in the SD 10 noise phantom to 0.674 in the SD 20 noise phantom (Table 2), a greater increase than for the other, more stable COM, BT-MDFEi, and BT-COM-based algorithms. It makes sense that the least stable centerline would lose accuracy the fastest as noise increases. As noise increases, the stability of the algorithm becomes increasingly important to maintaining accuracy. The MDFEi-based centerlines were the least accurate in the ICA siphons of the brain images, which contain the noise of the MRI.
The extracted centerline lengths were larger than the known centerline lengths, most likely due to discretization effects around the true centerline. The smoothing of the centerline reduced the discretization effects and thus the difference between the extracted centerline length and the true length. Other groups have observed errors of the same order when comparing centerline lengths to known values. One study measured a water-filled glass tube phantom within 2% of the true length using vertical projections of MRI data (Roberts et al., 1991). Another study used the inverse geometry of x-rays to measure phantoms within 1% of the true length (Tomkowiak et al., 2011).
The current study tested a limited number of centerline algorithms both internally and externally developed. The stability measure and visualization of inverse stability are usable by researchers testing algorithms for particular centerline extraction applications of tubular anatomy. There may not be an ultimate singular centerline algorithm suitable for all applications. The COM-based algorithm which was best in this study for extracting the ICA siphon loop is prone to missing small dim arteries near larger brighter arteries. An application looking at a small dim artery would have to use another algorithm making the availability of comparison methods important. Tubular structures occur frequently in anatomy. In addition to the arteries studied here other anatomical structures studied with centerlines include veins, lung bronchioles, large and small intestine, nerves, bones, and any other tubular anatomical structures.
Centerlines can be used to measure features of tubular anatomical structures. This study expands the range of structures that can have a centerline calculated. The kissing ICA siphon loops could not have a centerline made with the existing MDFEi (Zhang et al., 2005) and BT (Homman, 2007) algorithms. The COM algorithm developed here made a centerline possible in the ICA siphon loop. The centerline stability measure showed that the COM algorithm handled the kissing ICA siphon starting from any direction showing that the COM algorithm is stable in this case. Increased smoothing of the COM centerline may further reduce length errors compared with the true values. The stability measure can be reused to test centerline algorithms when evaluating centerline algorithms for other tubular anatomical structures.
The authors greatly appreciate the help of the staff at the Utah Center for Advanced Imaging Research in supporting this research. The branching phantoms used in this article were generated and made available by the CASILab at The University of North Carolina at Chapel Hill and were distributed by the MIDAS Data Server at Kitware, Inc.