REAVER: A program for improved analysis of high‐resolution vascular network images

Abstract Alterations in vascular networks, including angiogenesis and capillary regression, play key roles in disease, wound healing, and development. The spatial structures of blood vessels can be captured through imaging, but effective characterization of network architecture requires both metrics for quantification and software to carry out the analysis in a high‐throughput and unbiased fashion. We present Rapid Editable Analysis of Vessel Elements Routine (REAVER), an open‐source tool that researchers can use to analyze high‐resolution 2D fluorescent images of blood vessel networks, and assess its performance compared to alternative image analysis programs. Using a dataset of manually analyzed images from a variety of murine tissues as a ground‐truth, REAVER exhibited high accuracy and precision for all vessel architecture metrics quantified, including vessel length density, vessel area fraction, mean vessel diameter, and branchpoint count, along with the highest pixel‐by‐pixel accuracy for the segmentation of the blood vessel network. In instances where REAVER's automated segmentation is inaccurate, we show that combining manual curation with automated analysis improves the accuracy of vessel architecture metrics. REAVER can be used to quantify differences in blood vessel architectures, making it useful in experiments designed to evaluate the effects of different external perturbations (eg, drugs or disease states).

pathological state. Examples include vessel diameter as an indicator of vasodilation, vasoconstriction, or arteriogenesis, 2 as well as vascular length density as an indicator of altered levels of tissue oxygenation 3 or tissue regeneration. 4 Since the structural architecture of microvessel networks is closely intertwined with function, changes in microvessel architecture can therefore be used to assess cellular and tissue level responses to disease and treatments. Confocal imaging of intact microvascular networks labeled with fluorescent tags yields images with a high signal-to-noise ratio 5 and serves as a gold standard method for visualizing the structure of microvascular networks. 6 Several image-processing programs have been previously developed to quantify fluorescent images of microvessel architecture in an automated manner, including AngioQuant, 7 AngioTool, 8 and RAVE. 9 While these programs have been used in various studies, they are estimated to have a low degree of adoption by the research community relative to the multitude of studies that have quantified microvascular architecture using a manual approach. 2 Furthermore, the publications that introduce these tools for automation lack a common method for evaluating performance and provide nonstandard forms of metrics that make comparison between them difficult. 2 For image segmentation, manual analysis through visual inspection remains the gold standard technique, [10][11][12] defined as the method accepted to yield results closest to the true segmentation. Using manual analysis as an approximation of ground-truth 13 can serve as a basis to compare performance between automated analysis methods by classifying disagreement from ground-truth as error, as done previously in other applications. 14

In this paper, we establish and validate a new open-source tool, named REAVER, for quantifying various aspects of vessel architecture in fluorescent images of microvascular networks (Figure 1A). It uses simple image processing algorithms to automatically segment and quantify vascular networks, while offering the option for manual user curation (Figure 1B). We use a benchmark dataset of fluorescently labeled images from a variety of tissues that exhibit a broad range of vascular architectures as a means of assessing our program's general ability to automatically analyze vessel structure and of minimizing the possibility of bias resulting from examining any single tissue. The error of REAVER's output relative to ground-truth for various output metrics, including vessel length density, vessel area fraction, vessel tortuosity, and branchpoint count, is compared to the other vascular image analysis programs listed above. The accuracy of the output metrics, defined as the closeness of a measured value to ground-truth, 15 is measured based on absolute error. [16][17][18] Precision, related to the random errors caused by statistical variability, is measured by comparing the variance of error between different programs. REAVER's effectiveness is highlighted by its greater accuracy and precision compared to all other programs. Given the ubiquity of high-resolution fluorescent microscopy and the established need for automated, rigorous, and unbiased methods to quantify vessel architectural features, we present REAVER as an image analysis tool to further microvascular research.

Figure S1) and for the retinal location dataset (Figure S3) to examine vascular heterogeneity within a single biological sample.

| Immunohistochemistry and confocal imaging of retinas
Retinas were labeled with IB4 Lectin Alexa Fluor 647 (ThermoFisher Scientific I32450) and imaged using a 20× objective (530 μm field of view) and a 60× objective (212 μm field of view) on a Nikon TE-2000E point scanning confocal microscope. A total of 36 2D images were obtained from z-stacks through maximum intensity projection from six different tissues and used as a benchmark dataset for segmenting vessels and quantifying metrics of vessel architecture. To establish ground-truth, all images were manually analyzed in ImageJ. 19

| REAVER algorithm
Rapid Editable Analysis of Vessel Elements Routine's algorithm was implemented in MATLAB and designed to process the image in two separate stages: (a) segmentation based on intensity over local background and (b) skeletonization and refinement. Segmented vasculature is identified through a combination of filtering, thresholding, and binary morphological operations. The image is first blurred with a light averaging filter with an 8-pixel neighborhood, and then an image of the background (low-frequency features) is calculated with a larger, user-defined averaging filter (default: 128 pixels in length, yielding 40 µm for 20× images and 27 µm for 60× images). To create a background-subtracted image, the heavily blurred background image is subtracted from the lightly blurred image. The background-subtracted image is thresholded by a user-defined scalar (default: 0.045) to generate an initial segmentation.
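This local-background-subtraction stage can be sketched as follows. This is an illustrative Python/NumPy approximation of the MATLAB pipeline, not REAVER's actual code: the function name, the use of `scipy.ndimage.uniform_filter` for the averaging filters, and the demo filter sizes (smaller than REAVER's defaults, to suit the toy image) are all our own assumptions.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def initial_segmentation(img, light_size=8, background_size=128, threshold=0.045):
    """Sketch of the first stage: threshold intensity over local background.

    Defaults mirror the values quoted in the text (8-pixel light blur,
    128-pixel background filter, 0.045 threshold).
    """
    img = img.astype(float)
    smoothed = uniform_filter(img, size=light_size)          # light blur
    background = uniform_filter(img, size=background_size)   # low-frequency background
    return (smoothed - background) > threshold               # initial binary mask

# A bright horizontal "vessel" on a dark background
img = np.zeros((64, 64))
img[30:34, :] = 1.0
mask = initial_segmentation(img, light_size=3, background_size=31)
print(mask[32, 32], mask[5, 5])
```

Because the threshold is applied to the background-subtracted image, the same scalar works across fields with uneven illumination.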
Next, the segmentation border is smoothed and extraneous pixels are removed with an 8-neighborhood convolution filter that is thresholded such that only pixels with at least 4 neighbors are kept.
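The neighbor-count rule described above can be illustrated with a small sketch (a Python stand-in for the MATLAB convolution; the function name is illustrative):

```python
import numpy as np
from scipy.ndimage import convolve

def keep_well_supported(mask, min_neighbors=4):
    """Keep only pixels with at least `min_neighbors` true pixels among
    their 8 neighbors, smoothing the border and dropping stray pixels."""
    kernel = np.ones((3, 3), dtype=int)
    kernel[1, 1] = 0                      # count neighbors only, not the pixel itself
    neighbor_count = convolve(mask.astype(int), kernel, mode='constant')
    return mask & (neighbor_count >= min_neighbors)

mask = np.zeros((7, 7), dtype=bool)
mask[2:5, 1:6] = True     # interior of a solid block survives
mask[0, 0] = True         # an isolated pixel is removed
cleaned = keep_well_supported(mask)
print(cleaned[3, 3], cleaned[0, 0])
```

Pixels on sharp corners lose support and are shaved off, which is what smooths the segmentation border.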
Leveraging the domain-specific knowledge that vessel networks are composed of large connected components, components with area less than a user-defined value are removed (default: 1600 pixels, yielding 155 µm² for 20× and 69 µm² for 60×). To further smooth segmentation borders, the complement of the segmented image is convolved with an 11-pixel square averaging filter (side length of 3.4 µm for 20× and 2.3 µm for 60×), and values are thresholded above 0.5. To fill in holes within segmented vasculature, connected components of the complemented segmentation with less than 800 pixels (area of 77 µm² for 20×, 34 µm² for 60×) are set to true. The images are then thinned to compensate for a net dilation of the segmentation from earlier processing steps. Finally, connected components of size less than a user-set value (default: 1600 pixels, yielding 155 µm² for 20× and 69 µm² for 60×) are removed again to generate the final segmented image.
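The component-size filtering and hole filling might be sketched as follows, assuming standard connected-component labeling. This is an illustrative Python approximation, not REAVER's actual code; the demo uses the default thresholds from the text on toy masks.

```python
import numpy as np
from scipy.ndimage import label

def remove_small_components(mask, min_area):
    """Drop connected components smaller than min_area pixels
    (4-connectivity, scipy's default labeling)."""
    labels, _ = label(mask)
    counts = np.bincount(labels.ravel())
    keep = counts >= min_area
    keep[0] = False                     # label 0 is background, never kept
    return keep[labels]

def fill_small_holes(mask, max_area):
    """Fill holes: components of the mask's complement with area
    <= max_area are set to true."""
    return ~remove_small_components(~mask, max_area + 1)

# Demo: the 800-pixel hole threshold fills a 4-pixel hole; a 4-pixel
# speck falls below the 1600-pixel component threshold and is removed.
vessels = np.ones((20, 20), dtype=bool)
vessels[8:10, 8:10] = False
speck = np.zeros((20, 20), dtype=bool)
speck[0:2, 0:2] = True
print(fill_small_holes(vessels, 800)[8, 8], remove_small_components(speck, 1600).any())
```

Filtering the complement with the same labeling primitive is what lets one helper serve both speck removal and hole filling.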
To generate the vessel centerline, the segmented image border is further smoothed with eight iterative applications of a 3-pixel square true convolution kernel thresholded such that pixels with at least 4 neighbors are set to true. To fill in small holes and further clean the segmentation edge, the MATLAB binary morphological operations "bridge" and "fill" are applied in that order four times, along with an application of a 3-pixel square majority filter in which every pixel needs 5 or more true pixels in the square to pass. Connected components in the complement of the segmentation with pixel area less than 80 pixels (area of 7.7 µm² for 20× and 3.4 µm² for 60×) are set to true in order to fill in holes within segmented vessels. The initial vessel centerline is identified by applying the binary morphological "thin" operation an infinite number of times (ie, until convergence) to the segmentation with replication padding applied; otherwise, thinned centerlines would not extend to the edge of the image.
To filter out centerlines for segments that are too thin, a

We note that while this algorithm was tested with a benchmark image dataset that included a practical range of resolutions with the default image processing set of parameters, the parameters are resolution-dependent to some degree. We argue that the resolution range we used, images acquired at 20× and 60× magnification, represents the most relevant range of modalities for probing complete microvascular structures. Magnification below 20× lacked sufficient resolution to discern the structure of the smallest vessels of the microvascular network, while magnification over 60× sampled such small areas of vasculature that estimates of various metrics of vascular structure would be unreliable. Using resolutions far outside this range would require changing the default image processing parameters.

FIGURE 3 REAVER demonstrates higher accuracy and precision across metrics compared to alternative blood vessel image analysis programs. To evaluate accuracy, absolute error of A, vessel length density (mm/mm²), C, vessel area fraction, E, vessel diameter (µm), and G, branchpoint count compared to manual results (two-tailed paired t tests with Bonferroni correction, 6 comparisons, α = 0.05, N = 36 images). For analysis of precision, the absolute value of residual error to each group's median error for B, vessel length density (mm/mm²), D, vessel area fraction, F, vessel diameter (µm), and H, branchpoint count (two-tailed paired t tests with Bonferroni correction, 6 comparisons, α = 0.05, N = 36 images). For the annotations above each plot, significant pairwise comparisons between groups with Bonferroni-adjusted p-values (letters). Groups are annotated when there is no evidence of nonzero bias with error, as determined by the origin falling within the bounds of the 95% confidence interval of the mean with Bonferroni adjustment of 4 comparisons (pound sign). Vessel metric diagrams were modified from Ref. 2

| Manual analysis of benchmark dataset
To make the time demands for establishing ground-truth manageable, a mixed-manual analysis approach was used to analyze the benchmark dataset, where a simple set of ImageJ macros provided an initial guess for thresholding and segmenting blood vessels, and the user then used the paintbrush tool to draw in the required changes. The initial automated guess was used to save time, but there is a possibility that it biased the ground-truth data to unfairly favor REAVER's results. To check whether bias in ground-truth could alter statistical outcomes, a completely manual segmentation was compared to the mixed-manual method in a subset of images from the benchmark dataset (N = 6 images, one from each tissue type, Figure S5A-D). The completely manual analysis was conducted by a different user with no cross-training between them to represent the worst-case estimation of disagreement between the two methods.
The disagreement of four output metrics (vessel length density, vessel area fraction, vessel diameter, and branchpoints) was examined via Bland-Altman plots, and no metric had evidence of bias (N = 6 images, P-values displayed in each chart, Figure S5E-H, no multiple comparisons correction applied for conservative interpretation). The width of the confidence intervals of the mean was calculated based on the 6 sample images (normality approximation, Figure S5, ObsW CI 95 ). Since the confidence interval is based on the standard error (which decreases as 1/√n), the confidence intervals for the entire benchmark dataset are estimated by increasing the sample size from 6 to 36 images with the sample standard deviation held fixed (Figure S5, EstW CI 95 ). We found these estimated confidence intervals were minor in size compared to the effect sizes observed with the mean absolute error of the automated segmentation between the programs tested (Figure S5, columns labeled AngioQuant - REAVER).
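The 1/√n extrapolation of the confidence-interval width can be written out explicitly. This illustrative helper holds the sample standard deviation fixed and, as a simplification, ignores the accompanying change in the t critical value with sample size; the example numbers are hypothetical.

```python
import math

def estimate_ci_width(observed_width, n_obs, n_target):
    """Scale a confidence-interval width measured from n_obs samples to a
    hypothetical n_target samples, holding the sample standard deviation
    fixed (width is proportional to 1/sqrt(n))."""
    return observed_width * math.sqrt(n_obs / n_target)

# e.g. a CI width measured from 6 images, extrapolated to all 36 images
print(round(estimate_ci_width(1.2, 6, 36), 3))
```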
FIGURE 4 REAVER exhibits higher sensitivity and specificity with vessel segmentation, along with lower execution time compared to alternatives. Using the test dataset of images with manual analysis as ground-truth, the A, accuracy, B, sensitivity, and C, specificity of the segmentation for each program, along with D, execution time for each image (two-tailed paired t tests with Bonferroni correction; 6 comparisons, α = 0.05, N = 36 images). For the annotations above each plot, significant pairwise comparisons between groups with Bonferroni-adjusted p-values below the significance level (letters).

The mixed-manual analysis used for ground-truth for the benchmark dataset was acquired through manual curation of an initial automated threshold, using macros in ImageJ to provide an initial guess of what structures in the image were considered vessels. Each image was loaded into ImageJ and an initial segmentation was calculated as a basis for manual curation. The image was segmented using a macro that removed high-frequency features, applied local thresholding using the Phansalkar method, 20 decreased noise with the despeckle function, removed binary objects of pixel area less than 100 pixels, morphologically opened the image (erosion followed by dilation), applied a median filter on the adjacent four-pixel neighborhood, and finally enhanced the brightness of the image for visibility.
Following this initial segmentation, trained editors used the paintbrush tool to correct errors in the segmentation. The total time to correct the segmentation was recorded. After the segmentation was adjusted to satisfaction, another ImageJ macro was run to generate a preliminary skeleton of the image. This script applied a median filter of radius 9 and the ImageJ Skeletonize operation. Once again, the curator used the paintbrush tool to correct the automatically generated skeleton. Special care was taken to ensure the skeleton had a width of only one pixel. The total time to correct the skeleton was recorded. The segmentation was run through the same analysis code that the other automated methods were analyzed with. The curator then tagged each branchpoint in the skeleton and recorded the total count and locations. These data were used as ground-truth to compare the automated analysis of several vessel architecture image processing pipelines.

| Image quantification of benchmark dataset
Each software package provided different collections of metrics calculated in different ways. To fairly evaluate program performance in an unbiased fashion, a collection of four metrics was selected that could be calculated from the output data supplied by each program: specifically, the segmented vasculature image and the vessel centerline image. These output images were collected from each program and then analyzed with the same code to quantify the vessel length density, vessel area fraction, mean vessel diameter, and number of branchpoints. If these output images were not available in the program, we either inserted code to export them to disk or captured them from the program's graphical display. Some of the programs had adjustable settings that altered the image analysis process; default image processing settings were used for all programs as a test of general performance in quantifying vascular architecture from fluorescently labeled images.
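As a sketch, the four shared metrics can be computed from a binary segmentation and a one-pixel-wide centerline roughly as follows. This is an illustrative simplification, not the study's shared analysis code, which may differ (eg, in how centerline pixel length and vessel diameter are estimated).

```python
import numpy as np
from scipy.ndimage import convolve

def vessel_metrics(seg, centerline, um_per_px):
    """Crude versions of the four shared metrics from a binary
    segmentation and a one-pixel-wide binary centerline."""
    px_area_mm2 = (um_per_px / 1000.0) ** 2
    field_area_mm2 = seg.size * px_area_mm2

    length_mm = centerline.sum() * um_per_px / 1000.0   # 1 centerline px ~ 1 px of length
    length_density = length_mm / field_area_mm2         # mm / mm^2
    area_fraction = seg.mean()
    mean_diameter_um = seg.sum() * um_per_px ** 2 / (length_mm * 1000.0)  # area / length

    kernel = np.ones((3, 3), dtype=int)
    kernel[1, 1] = 0
    neighbors = convolve(centerline.astype(int), kernel, mode='constant')
    branchpoints = int(((neighbors >= 3) & centerline).sum())  # >= 3 skeleton neighbors
    return length_density, area_fraction, mean_diameter_um, branchpoints

# Toy case: one straight vessel, 4 px wide, spanning a 100 x 100 px field at 1 um/px
seg = np.zeros((100, 100), dtype=bool); seg[48:52, :] = True
cl = np.zeros((100, 100), dtype=bool); cl[50, :] = True
ld, af, d, bp = vessel_metrics(seg, cl, um_per_px=1.0)
print(ld, af, d, bp)
```

Deriving all four numbers from only the two output images is what makes the cross-program comparison fair: every program is scored by the same code on the same intermediate representation.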
AngioTool is an open-source package written in Java. We could not successfully recompile the program to access and export the output images directly and therefore had to use indirect means to obtain them. An image was imported and processed with default settings (vessel diameter: 20, vessel intensity: [15, 255], no removal of small particles, and no filling of holes). Images of the segmen-

FIGURE 5 Curation of automatic image segmentation can enhance accuracy of output metrics. Comparison of error with A, vessel length density (mm/mm²), B, vessel area fraction, C, vessel diameter (µm), and D, branchpoint count from automated analysis using default parameters before and after manual curation of image segmentation. Comparison of error with E, vessel length density, F, vessel area fraction, G, vessel diameter, and H, branchpoint count from automated analysis using degraded parameters before and after manual curation of image segmentation (for each of the two datasets, two-tailed paired t tests with Bonferroni correction, 4 comparisons, α = 0.05, N = 36 images).

| Image processing execution time
The processing times for the manual data were recorded using a stopwatch while the curator was editing the segmentation and skeleton images in ImageJ. The processing times for AngioQuant, RAVE, and REAVER were all collected by adding tic/toc statements, which log execution time, into their MATLAB code immediately before processing began and immediately after processing finished. This generated timing measurements for each program, which were recorded.
Since AngioTool was provided as an executable file and the source code could not be successfully compiled without editing the code for dependency issues, reorganizing the file structure, and downloading external required libraries, its processing times were collected differently than for the other three programs. The third-party application "Auto Screen Capture" (https://sourceforge.net/p/autoscreen/wiki/Home/) was used to capture images of the AngioTool application's progress bar approximately every 15 ms, starting from before the start of processing to after it finished. The screenshots were automatically named with the exact time they were taken, at a resolution of 1 ms. The collection of screenshots was inspected to identify the start time for processing, taken as the mean time of the final screenshot before the progress bar changed and the one immediately after. The end time for processing was determined by taking the mean time between the final image before the progress bar completed and the image immediately after. The difference between these two mean times was taken as the total processing time. The total measurement error from collecting processing times in this way works out to be <3% of the total processing time.
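The midpoint bookkeeping for the screenshot timestamps amounts to the following (an illustrative helper; the timestamps in the example are hypothetical):

```python
def processing_time(last_before_start, first_after_start,
                    last_before_end, first_after_end):
    """Estimate processing time (seconds) from screenshot timestamps:
    the start and end are taken as midpoints of the screenshots that
    bracket each progress-bar transition."""
    start = (last_before_start + first_after_start) / 2.0
    end = (last_before_end + first_after_end) / 2.0
    return end - start

# Hypothetical timestamps ~15 ms apart around the true start and end
print(round(processing_time(0.000, 0.015, 4.980, 4.995), 4))
```

Taking midpoints bounds the timing error at half a screenshot interval per transition, which is the source of the <3% figure quoted above.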
All processing times were gathered on a computer with 32GB of DDR4-2666 RAM with CAS Latency of 15, an Intel i7-8700K 3.7 GHz 6-Core Processor, and a GeForce GTX 1080 graphics card with 8GB of VRAM. No overclocking, parallel processing or GPU processing was used.

| REAVER curation analysis
Rapid Editable Analysis of Vessel Elements Routine's code was modified to include a timer object which triggered every 20 seconds to save data to disk in the same manner as when manually

| Program evaluation metrics
The accuracy of the vessel structure metrics, defined as the closeness of a measured value to a ground-truth, 15 was examined with absolute error [16][17][18] (Figure 3A,C,E,G). Let Y_i,j be the value of a given vessel structure metric (vessel length density, vessel area fraction, branchpoint count, and vessel diameter) from the i-th image and j-th program, and G_i,j be the corresponding ground-truth value derived from manual analysis. We define error, E_i,j, as the difference between a measurement and its corresponding ground-truth and assess accuracy with the absolute error, A_i,j:

E_i,j = Y_i,j - G_i,j,    A_i,j = |E_i,j|

Measurements with low absolute error are considered highly accurate. We define the precision 21 P_i,j of the j-th program for the i-th image to be

P_i,j = |E_i,j - Ẽ_j|

where Ẽ_j is the median of E_i,j across images, with i = 1, …, 36 images, using the variable transform from the Brown-Forsythe test of variance 21 (Figure 3B,D,F,H).
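These definitions translate directly to code; a minimal sketch for one program's measurements across images (the values in the demo are hypothetical):

```python
import numpy as np

def accuracy_and_precision(Y, G):
    """Per-image error metrics for one program, following the definitions
    in the text: E = Y - G, A = |E|, and P = |E - median(E)|, the
    Brown-Forsythe transform used for the precision comparison."""
    E = Y - G                       # signed error per image
    A = np.abs(E)                   # absolute error (accuracy)
    P = np.abs(E - np.median(E))    # residual to the median error (precision)
    return E, A, P

# Hypothetical measurements and ground-truth for four images
Y = np.array([10.0, 12.0, 9.0, 11.0])
G = np.array([10.0, 10.0, 10.0, 10.0])
E, A, P = accuracy_and_precision(Y, G)
print(E.tolist(), A.tolist(), P.tolist())
```

Note that A and P deliberately answer different questions: A penalizes any deviation from ground-truth, while P penalizes only scatter around the program's typical (median) error.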
Additionally, we proposed metrics that quantify the agreement between each program's vessel segmentation and the ground-truth segmentation.

Error was also examined before and after user curation with a different set of internal image processing parameters set to substandard values (Figure 5E-H). Let Y^B,S_i,r denote the value of a given vessel structure metric before any user curation (superscript B) using substandard internal image processing parameters (superscript S) from REAVER (program index j set to r, the index for REAVER). The absolute error A^B,S_i,r is

A^B,S_i,r = |Y^B,S_i,r - G_i,r|
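The agreement between a predicted segmentation and the ground-truth mask is conventionally summarized with pixelwise accuracy, sensitivity, and specificity; a sketch of that computation follows (a standard formulation, which may differ in detail from the study's implementation; the demo masks are hypothetical).

```python
import numpy as np

def segmentation_scores(pred, truth):
    """Pixelwise accuracy, sensitivity, and specificity of a binary
    segmentation against a ground-truth mask."""
    tp = np.sum(pred & truth)       # vessel pixels correctly labeled
    tn = np.sum(~pred & ~truth)     # background pixels correctly labeled
    fp = np.sum(pred & ~truth)
    fn = np.sum(~pred & truth)
    accuracy = (tp + tn) / pred.size
    sensitivity = tp / (tp + fn)    # fraction of vessel pixels recovered
    specificity = tn / (tn + fp)    # fraction of background pixels kept clear
    return accuracy, sensitivity, specificity

# Toy masks: prediction misses the rightmost fifth of the true vessel
truth = np.zeros((10, 10), dtype=bool); truth[:, :5] = True
pred = np.zeros((10, 10), dtype=bool); pred[:, :4] = True
acc, sens, spec = segmentation_scores(pred, truth)
print(acc, sens, spec)
```

Reporting sensitivity and specificity separately matters for vessel images, where background pixels usually dominate and can mask under-segmentation in the accuracy score alone.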

TABLE 2 Metrics for examining user curation

                        User curation
Parameter set           Before (B)        Following (F)
Default (D)             A^B,D_i,r         A^F,D_i,r
Substandard (S)         A^B,S_i,r         A^F,S_i,r

Let Y^F,S_i,r denote the value of a given vessel structure metric following user curation (superscript F) using substandard internal image processing parameters (superscript S) from REAVER and not any other program (with j set to r, the program index for REAVER). The absolute error A^F,S_i,r is

A^F,S_i,r = |Y^F,S_i,r - G_i,r|

| Summary of metric classes
Metrics used in this study are split into two main classes (Table 1).
Vessel structure metrics are the measures that describe architectural features of the vessel networks, while program evaluation metrics assess segmentation and analysis performance. To clarify the notation used for examining error before and after manual user curation, conventions are illustrated in Table 2.

| Statistical analysis
To probe how the programs performed relative to one another, programs with lower execution times (Figure 4D) were considered preferable, as were programs with higher segmentation accuracy, specificity, and sensitivity (Figure 4A-C).
In addition to testing accuracy and specificity, we tested whether each program had zero bias or, equivalently, whether the mean error term equals zero, via a two-tailed t test (Figure 3A,C,E,G).

| REAVER demonstrates higher accuracy and precision across metrics
When the accuracy of vessel length density measurements was examined across the different automated image analysis tools (Figure 3A), REAVER had the lowest mean absolute error that was different from all other programs (76.5% reduction with P = 6.57e-3 compared to AngioTool, the next lowest program, two-tailed paired t tests with Bonferroni adjustment). All programs except AngioQuant had evidence of a nonzero bias revealed through individual two-tailed t tests for a mean of zero (P < .05). When the precision of vessel length density measurements was examined, REAVER had the lowest random error that was different from all other programs (84.6% reduction with P = 1.61e-3 from AngioTool, the next lowest program, two-tailed paired t tests with Bonferroni adjustment) (Figure 3B).

Rapid Editable Analysis of Vessel Elements Routine also had the highest accuracy in quantifying vessel area fraction and the lowest mean absolute error that was significantly different from all the other programs (75.8% reduction of the error with P = 6.16e-8 from AngioTool, the next lowest program, two-tailed paired t tests with Bonferroni adjustment) (Figure 3C). All programs except REAVER had a nonzero bias, revealed through the associated two-sided t tests for a mean of zero (P < .05). When the precision of vessel area fraction was examined, REAVER and RAVE had the lowest random error that was different from all other programs (53.3% reduction with P = 8.62e-3 from AngioTool, the next lowest program after RAVE, two-tailed paired t tests with Bonferroni adjustment) (Figure 3D).

Rapid Editable Analysis of Vessel Elements Routine had the lowest absolute error in vessel diameter that was different from all other programs (83.9% reduction with P = 8.29e-7 from AngioTool, the next lowest program, two-tailed paired t tests with Bonferroni adjustment) (Figure 3E). All programs, including REAVER, exhibited evidence of nonzero bias revealed through two-tailed t tests of mean zero (P < .05) for each individual program. In terms of the precision of the vessel diameter measurement, REAVER had the lowest random error that was different from all other programs (72.3% reduction from AngioQuant, the next lowest program, with P = 1.66e-3, two-tailed paired t tests with Bonferroni adjustment) (Figure 3F).
In terms of the accuracy of the branchpoint count measurement, REAVER had the lowest mean absolute error that was different from all other programs (94.6% reduction with P = 4.43e-5 from AngioTool, the next lowest program, two-tailed paired t tests with Bonferroni adjustment) (Figure 3G). All programs except REAVER had a nonzero bias, revealed through individual two-tailed t tests for a mean of zero (P < .05). REAVER had the lowest random error that was different from all other programs (93.2% reduction with P = 4.70e-5 from AngioTool, the next lowest program by mean, two-tailed paired t tests with Bonferroni adjustment) (Figure 3H).

| REAVER exhibits higher segmentation accuracy and sensitivity with faster execution time
The error in the automated vessel segmentation was examined across all images in the benchmark dataset relative to the segmentation from manual analysis. 27 REAVER had the highest mean accuracy that was different from all other programs (6.4% increase from AngioTool, the next highest program, P = 1.73e-7, two-tailed paired t tests with Bonferroni adjustment) ( Figure 4A). In terms of sensitivity, REAVER had the highest mean sensitivity that was different from all other programs (34.1% increase from AngioTool, the next highest program, with P = 1.00e-15, two-tailed paired t tests with Bonferroni adjustment) ( Figure 4B). In terms of specificity, RAVE and AngioQuant had higher mean specificity than the other two programs (0.4% increase from AngioTool, the next highest group, with P = 4.39e-2, two-tailed paired t tests with Bonferroni adjustment) ( Figure 4C). With regard to execution time, REAVER had the fastest mean execution time that was different from all other programs (36.4% reduction from AngioTool, the next lowest program, with P = 1.8e-16, two-tailed paired t tests with Bonferroni adjustment) ( Figure 4D). All automated program execution times were <1% of the time required for manual analysis (3089 ± 1355 seconds per image, not displayed due to orders of magnitude difference), highlighting a major benefit of automated techniques.

| Blinded manual segmentation curation can improve accuracy of metrics
The errors for each of the output metrics relative to the manual analysis were compared for: (a) metrics obtained by REAVER using purely automated analysis, and (b) metrics obtained by using a combination of automation paired with manual curation of the image segmentation. Using the same images and internal image processing parameters (as used in Figure 2), the absolute error across all images was compared before and after manual curation, where the user was blinded to the group each image belonged to. The absolute error for vessel length density was reduced 45% (P = 6.4e-5, paired two-tailed t test with Bonferroni adjustment, Figure 5A), while there was no change to the vessel area fraction error (P = 1, paired two-tailed t test with Bonferroni adjustment, Figure 5B).
Absolute error in vessel diameter measurements had a decreasing trend, with a 25.0% reduction in absolute error (P = .188, paired two-tailed t test with Bonferroni adjustment, Figure 5C), and absolute error in branchpoint count measurements showed a similar decreasing trend with a 17.7% reduction (P = .112, paired two-tailed t test with Bonferroni adjustment, Figure 5D).
Since REAVER demonstrated superior performance with this image dataset compared to the other programs, the error for many of the metrics was small, consequently lowering the potential effect size that manual curation may provide. To test whether manual curation is useful for lower quality results that could benefit more from manual curation, REAVER's internal image processing parameters were intentionally set to extreme values to produce a heavily flawed segmentation. Using the same dataset of images, user curation increased the accuracy for all of the metrics: the absolute error for vessel length density was reduced by 75.9% (P = 1.64e-11, paired two-tailed t test with Bonferroni adjustment, Figure 5E), the vessel area fraction absolute error was reduced 57.5% (P = 9.99e-6, paired two-tailed t test with Bonferroni adjustment, Figure 5F), vessel diameter absolute error was reduced 44.5% (P = 4.79e-3, paired two-tailed t test with Bonferroni adjustment, Figure 5G) and branchpoints absolute error was reduced by 73.2% (P = 1.36e-6, paired two-tailed t test with Bonferroni adjustment, Figure 5H).

| REAVER reveals differences in microvascular architectures across spatial locations in murine retina
An effective microvascular image analysis program can discriminate between groups of images with known differences in microvascular architecture. The blood vessels of the murine retina are a well-characterized microvascular network that exhibits extensive heterogeneity of vessel architecture depending on location in the tissue, 2 both with radial distance from the optic disk and with each of the three discrete layers of vascular beds: the deep plexus, intermediate plexus, and superficial capillary plexus. 28 With a dataset of images separated by two radial distances from the center of the retina, at each of the three vascular layers (Figure S3A,B), REAVER could discern unique vessel architectural features across the metrics quantified (Figure S3D-L). These metrics achieved a partial linear separation between retina locations with the first two components of a principal component analysis (Figure S3C).
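The projection onto the first two principal components can be sketched with a plain SVD of the centered metric matrix (illustrative Python; the metric values below are hypothetical stand-ins for per-image metric vectors, not data from the study):

```python
import numpy as np

def first_two_pcs(X):
    """Project the rows of X (images x metrics) onto the first two
    principal components via SVD of the centered data matrix."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:2].T            # scores on PC1 and PC2

# Hypothetical metric vectors (length density, area fraction) for six
# images drawn from two retinal locations
X = np.array([[10.2, 0.31], [10.0, 0.30], [10.1, 0.32],
              [14.8, 0.45], [15.1, 0.44], [15.0, 0.46]])
scores = first_two_pcs(X)
# The two groups land on opposite sides of the origin along PC1
print(scores.shape, scores[:3, 0].mean() * scores[3:, 0].mean() < 0)
```

When group structure dominates the variance, the first component captures the between-location separation, which is what yields the partial linear separation seen in the score plot.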

| D ISCUSS I ON
We present a novel software package, REAVER, for quantifying metrics of vessel architecture in fluorescent images of microvascular networks. To reduce the possibility that the program we developed had an unfair advantage with our dataset, REAVER was developed using a separate dataset of images with a different labeling technique from the benchmark dataset used for evaluation in this study. 29 To further minimize this bias, the benchmark dataset was specially designed to include a variety of mouse tissues with very diverse structural features, so efficacy was examined across tissues instead of focusing on a single tissue type. Performance of image analysis programs can also be examined with Bland-Altman analysis (Figure S4A-P).

Using a pilot study of a small dataset of images comparing both automated results and automation with curation to ground-truth will reveal to a researcher whether curation is worth the time investment for a particular application. Furthermore, manual curation of automated segmentation represents a promising technique for efficiently generating ground-truth analysis of images that requires much less time than purely manual techniques. It is important to note that we only investigated each program's ability to automatically segment the vasculature: many of the programs include several manually adjustable image processing settings (although none offer the option for direct manual curation), and there is a possibility that one of the other programs would perform better than REAVER with optimal parameters.
Testing performance with manual adjustments would be a complex undertaking reserved for future research, requiring not only a fair method for identifying optimal parameters for each image and program under realistic use cases, but also evaluating how effective a user can be at identifying the optimal parameters and obtaining the optimal segmentation.
While our comparison of the precision and accuracy of four different automated image analysis programs was achieved by performing a separate comparison for each metric, in the future it would be beneficial to compare program performance across all metrics simultaneously. This could require a method of weighting based on a metric's ability to discern alterations in a relevant biological dataset, while accounting for covariance and dependence between metrics (such as vessel length density being closely correlated with vessel area fraction for vessel networks with nearly uniform vessel diameters). The evaluation of trueness or bias, defined as the average distance between an output metric across images and the ground-truth values, 15 is not included in this study because no method exists to statistically compare trueness between study groups, since the distributions must be compared both to each other and to zero simultaneously. The development of such a technique would be required for discerning differences in trueness and would lead to a more complete characterization of error and performance of the programs examined. Furthermore, our representation of ground-truth could be improved by having multiple users manually analyze the images to generate a gold standard from the consensus, as done previously with image object classification. 35

In summary, we introduce REAVER, a new software tool for analyzing architectural features in two-dimensional images of microvascular networks, which exhibited the highest accuracy and precision for all structural metrics quantified in our study. We present REAVER as an image analysis tool to analyze high-resolution fluorescence images of blood vessel networks that can be used to further microvascular research.

| PER S PEC TIVE S
Microvascular research often requires characterizing changes in the structure of blood vessel networks, yet there is a lack of software programs to carry out these analyses. We present an open-source software package, REAVER, to analyze and quantify various aspects of high-resolution fluorescent images of blood vessel networks. REAVER is shown to outperform other vessel architecture image analysis programs on a benchmark dataset of manually analyzed images, suggesting it as a useful tool to further microvascular research.

ACK N OWLED G M ENTS
We thank Dr. Gustavo Rohde of the University of Virginia for providing key feedback on improvements to the manuscript.