Comparison of open-source three-dimensional reconstruction pipelines for maize-root phenotyping

Understanding three-dimensional (3D) root traits is essential to improving water uptake, nitrogen capture, and carbon sequestration from the atmosphere. However, quantifying 3D root traits by reconstructing 3D root models for deeper field-grown roots remains a challenge due to the unknown tradeoff between 3D root-model quality and 3D root-trait accuracy. Therefore, we performed two computational experiments. We first compared the 3D model quality generated by five state-of-the-art open-source 3D reconstruction pipelines on 12 contrasting genotypes of field-grown maize roots. These pipelines included COLMAP, COLMAP + PMVS (Patch-based Multi-View Stereo), VisualSFM, Meshroom, and OpenMVG + MVE (Multi-View Environment).


INTRODUCTION
Root phenotyping is essential in research projects aiming to improve water uptake, nitrogen capture, and carbon sequestration (Ault, 2020; Lynch, 2019; Lynch & Wojciechowski, 2015; Paustian et al., 1997; Smith et al., 2007). However, advanced methods are required to measure and quantify complex root architectures in the field environment.
With the development of computer vision techniques, image-based root phenotyping with consumer cameras has emerged as a cost-efficient, highly scalable, and accessible alternative to expensive high-end imaging devices. Established 2D image-based root-phenotyping methods, such as digital imaging of root traits (DIRT) (Bucksch et al., 2014), archiDART (Delory et al., 2016), EZ-Root-VIS (Shahzad et al., 2018), GiA Roots (Galkovskyi et al., 2012), and RhizoVision (Seethepalli et al., 2020), provide highly accurate trait measurements. However, 2D imaging approaches can only capture partial information from dense and highly occluded 3D maize-root architectures. Quantifying important traits, such as crown root number, whorl number, and the distances between whorls, remains challenging.
The use of 3D imaging techniques in root phenotyping is promising because of their ability to leverage multiple views of a given scene to resolve occlusion in dense root architectures (Bucksch, 2014; Clark et al., 2011; Dowd et al., 2022; Topp et al., 2013). However, 3D imaging methods, such as X-ray computed tomography (CT; Shao et al., 2021) or magnetic resonance imaging (MRI; van Dusschoten et al., 2016), are 100-1000 times more expensive than multicamera systems (Liu et al., 2021) and do not meet the needs of large-scale field studies because of high operating costs and the difficulty of deployment in the field environment. Moreover, these methods incur labor costs for highly trained staff and require custom-shielded rooms for operation. Therefore, X-ray CT and MRI are unsuitable for capturing root architecture with high throughput. In contrast, multicamera systems can scale at a fraction of the cost of these 3D imaging methods and require neither highly trained staff nor custom facilities for their operation.
Despite the availability of highly developed reconstruction pipelines, to our knowledge, performance on 3D root reconstruction tasks has not previously been quantified. Until now, relationships between model quality, image count, and computation time have not been thoroughly explored, and it has remained unclear whether model quality sufficient for detailed analysis of root traits is possible without significant performance penalties.

Core Ideas
• 3D reconstruction quality faces a trade-off between the number of images and computation time.
• Increasing computation time reduces the human factor in root-trait measurement.
• Optical 3D root phenotyping is an economical, highly scalable, and automated high-throughput solution.
To answer the question above, we performed two computational experiments. In the first, we compared the 3D model quality generated by the five open-source 3D reconstruction pipelines listed above on 12 samples from 12 contrasting genotypes of field-grown maize roots. The 3D model quality comparison included visual quality, the number of points and surface density of each 3D point cloud model, and computation time. The COLMAP pipeline achieved the best 3D model quality, representing the best observed tradeoff among the point cloud metrics (number of points and surface density), image count, and runtime. In the second computational experiment, we implemented COLMAP in the 3D reconstruction pipeline of DIRT/3D (Liu et al., 2021) and compared the accuracy of 3D root traits generated by the DIRT/3D (COLMAP-based 3D reconstruction) pipeline with our current DIRT/3D (VisualSFM-based 3D reconstruction) pipeline on the same dataset, comprising 12 genotypes with 5-10 replicates per genotype (Liu et al., 2021). For brevity, we refer to the former as DIRT/3D (COLMAP) and the latter as DIRT/3D (VisualSFM).
In the following sections, we describe the methodology, including root-sample collection, the two computational experiments, and the statistical analysis. We then visually and quantitatively assess the quality of trait measurements from the reconstructed models of the maize genotypes.

Root sample collection
We used the same root samples as described in Liu et al. (2021). The image collection was conducted in an imaging-chamber prototype built for Pennsylvania State University (Shi et al., 2019). Since this imaging chamber was equipped with a higher-resolution camera than the cameras described in Liu et al. (2021), we selected 12 roots from all the samples, with each root representing one genotype. We captured images of each root using this prototype imaging chamber, as conceptually introduced in Shi et al. (2019). The images were captured using 10 cameras (The Imaging Source DFK 33UX183 USB 3.0, 12-mm focal length V1228-MPY2 12-megapixel machine vision lens) arrayed around a central focal point. Image capture was synchronized using a cluster of 10 Raspberry Pi 4s with a server-client design. For each sample, approximately 360 images with a resolution of 5,472 × 3,648 pixels were taken. The number of images was chosen after evaluating the scanning geometry of this 3D scanner following the method described in Liu et al. (2009), which yields an upper bound on the number of images needed to avoid under-sampling of the multi-view scanning setup. Blurred images, caused by motion blur due to the rotation of the cameras, were manually removed from the datasets. We captured 12 sets of images and then computed 60 3D root models with the five 3D reconstruction pipelines (COLMAP, COLMAP+PMVS, VisualSFM, Meshroom, and OpenMVG+MVE). The 3D root-trait measurement was conducted using the same pipeline as described in Liu et al. (2021).

Computational environment
We conducted the first computational experiment on a Dell OptiPlex workstation.

Computational Experiment 2: Image capture and computational methods
For each root sample, we used around 600 images captured with the 3D scanner described in Liu et al. (2021). Each image has a resolution of 3,856 × 2,764 pixels, about half the resolution used in Computational Experiment 1. We evaluated the scanning geometry of the 3D scanner to estimate the number of images needed to avoid under-sampling of the multi-view scanning setup, as described in Liu et al. (2009). As a result, we chose 600 images as a practical and feasible number. Blurred images, caused by motion blur due to the rotation of the cameras, were manually removed from the datasets.
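Although blurred frames were removed manually here, a simple sharpness score could assist such screening. The following sketch is an illustration only, not part of the DIRT/3D pipelines: it scores a grayscale image by the variance of a discrete Laplacian, a common blur heuristic.

```python
import numpy as np

def sharpness(gray):
    """Variance of a discrete 5-point Laplacian; low values flag blur.
    `gray` is a 2D float array (a common heuristic, not the authors' method)."""
    lap = (-4 * gray[1:-1, 1:-1]
           + gray[:-2, 1:-1] + gray[2:, 1:-1]
           + gray[1:-1, :-2] + gray[1:-1, 2:])
    return lap.var()

# Toy check: smoothing a random image lowers its sharpness score
rng = np.random.default_rng(1)
img = rng.random((64, 64))
blurred = (img + np.roll(img, 1, 0) + np.roll(img, -1, 0)
           + np.roll(img, 1, 1) + np.roll(img, -1, 1)) / 5
```

In practice, a per-dataset threshold on this score would separate sharp frames from motion-blurred ones.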
We computed 80 3D root models for the 12 genotypes with DIRT/3D (COLMAP). The 3D trait measurement of the roots was conducted using the same pipeline as described in Liu et al. (2021).

Computational environment
The computation was conducted on the high-performance computing (HPC) resource Sapelo2 at the Georgia Advanced Computing Resource Center (GACRC). We ran DIRT/3D (COLMAP) in a Singularity container and recorded the running time for each execution of the container. The container images were retrieved from https://hub.docker.com/r/computationalplantscience/dirt3d-reconstruction and https://hub.docker.com/r/computationalplantscience/dirt3d-traits. All the scripts for this computational experiment are available on GitHub (https://github.com/Computational-Plant-Science/3D_review_scripts/tree/master, folder Computational_test_2).

Three-dimensional model quality computation
We used CloudCompare v2.12.alpha (Girardeau-Montaut, 2016) to compute the number of points in each 3D point cloud model and to estimate the surface density of the point clouds in Computational Experiments 1 and 2. We loaded each point cloud model into CloudCompare v2.12.alpha using the software's graphical user interface and retrieved the number of points via the "Properties" tab.
The surface density S was estimated by counting the number of neighbors N inside a sphere of radius R around each point. The surface density S is then defined as

S = N / (πR²),

that is, the number of neighbors divided by the neighborhood surface area.

FIGURE 4 Comparison of computation times per 3D reconstruction pipeline. The five tested pipelines are color labeled, and the lengths of the bars represent the time needed to compute the point cloud model. Although COLMAP generated the best 3D models among all the pipelines regarding number of points and surface density, it also took the most computation time, on average. Meshroom was the second most time-consuming pipeline. COLMAP + PMVS took the least computation time, on average.
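The per-point estimate S = N/(πR²) can be reproduced outside CloudCompare. A minimal sketch, in which the function name and radius are illustrative and SciPy's k-d tree stands in for CloudCompare's neighborhood search:

```python
import numpy as np
from scipy.spatial import cKDTree

def surface_density(points, radius):
    """Per-point surface density S = N / (pi * R^2), where N counts
    neighbors inside a sphere of radius R (excluding the point itself)."""
    tree = cKDTree(points)
    counts = np.array([len(tree.query_ball_point(p, radius)) - 1
                       for p in points])
    return counts / (np.pi * radius ** 2)

# Toy example: 1,000 points on a flat unit square, so the expected
# density is roughly 1,000 points per unit area (lower near the edges)
rng = np.random.default_rng(0)
pts = np.column_stack([rng.random(1000), rng.random(1000), np.zeros(1000)])
mean_density = surface_density(pts, radius=0.05).mean()
```

Averaging the per-point values gives the single surface-density figure reported per model.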

Statistical analysis
The CORREL function in Microsoft Excel (Microsoft 365 A3) and the Analysis ToolPak add-in for Excel were used to compute correlation coefficients between the two sets of trait measurements derived from DIRT/3D (COLMAP) and DIRT/3D (VisualSFM). In addition, the built-in R-squared formula in Microsoft Excel (Microsoft 365 A3) was used to compute the R-squared value in the regression analysis.
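The same statistics can be reproduced outside Excel, which may help when scaling beyond spreadsheet workflows. A minimal sketch with hypothetical trait values; note that for a simple linear regression, the R-squared value equals the squared Pearson correlation:

```python
import numpy as np

# Hypothetical trait values: manual ground truth vs. pipeline output
manual   = np.array([10.1, 12.4, 9.8, 14.2, 11.0, 13.5])
computed = np.array([10.4, 12.1, 9.5, 14.6, 11.3, 13.2])

# Pearson correlation coefficient (what Excel's CORREL computes)
r = np.corrcoef(manual, computed)[0, 1]

# R-squared of the least-squares line manual = a * computed + b
a, b = np.polyfit(computed, manual, 1)
residuals = manual - (a * computed + b)
r_squared = 1 - residuals.var() / manual.var()
```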

Visual assessment
We selected 12 field-grown maize roots from 12 different genotypes to compute 3D point clouds using all five pipelines.For each root sample, around 360 images were captured, and overall, 60 3D point cloud models were generated.
Figure 1 provides a visual comparison of the 3D models computed with COLMAP, COLMAP+PMVS, VisualSFM, Meshroom, and OpenMVG+MVE (Fuhrmann et al., 2014). Among all the tested 3D reconstruction pipelines, COLMAP and COLMAP+PMVS achieved good visual results in terms of model completeness, with no obvious disconnections or missing parts of the root system. With the reduced image dataset, VisualSFM tended to omit fine details, such as brace roots at the margins of the point cloud. Meshroom produced models with large interior gaps. OpenMVG+MVE displayed finer details than VisualSFM but did not provide color information per point.

Quantitative assessment
We assessed two point cloud metrics (number of points and surface density) for each 3D point cloud model. "Number of points" is the total number of 3D points generated by a reconstruction pipeline. "Surface density" is the average density across the surface of the root architecture. The comparison results for the number of points are illustrated in Figure 2. The COLMAP pipeline produced the largest number of points, achieving, on average, 94 times the number of points of Meshroom, which generated the fewest points. In general, COLMAP outperformed all the other tested pipelines regarding the number of points per root system (Figure 2). In addition to the number of points, we compared the surface density of the point clouds generated by each pipeline. Higher surface density values are desirable. The comparison of surface density in Figure 3 reveals that COLMAP and OpenMVG+MVE produced the models with the largest surface density in most of the root samples. The surface density generated by COLMAP was, on average, 94 times the surface density generated by VisualSFM, whereas OpenMVG+MVE produced 31 times the surface density of VisualSFM. VisualSFM generated models with the lowest surface density among all the pipelines.

In Computational Experiment 1, COLMAP produced the best 3D model results in terms of the number of points and surface density for most of the root samples. However, we noticed exceptions, such as genotype B112. This exception can be explained by the physical arrangement of roots within the root system acting as a discriminating factor between genotypes: different arrangements lead to different degrees of occlusion. Therefore, these results are likely to show a genotype dependence.
On average, COLMAP took almost 29 times longer to compute than the quickest pipeline (OpenMVG+MVE) and five times longer than Meshroom. COLMAP+PMVS was significantly faster than COLMAP and needed, on average, three times longer than OpenMVG+MVE to produce the 3D model. COLMAP+PMVS required a similar time to compute the 3D point cloud as VisualSFM, as illustrated in Figure 4.

Computational Experiment 2: Comparison of three-dimensional trait accuracy with maximal surface density three-dimensional models
The results of the first computational experiment showed that COLMAP produced higher surface density than the other tested pipelines, using around 360 images per sample. In the second computational experiment, we tested whether the increased surface density enabled similar trait-measurement accuracy using COLMAP with around 600 images, compared with the original version of DIRT/3D (Liu et al., 2021) using 3,600 images with VisualSFM. Therefore, we implemented COLMAP in the 3D reconstruction pipeline of DIRT/3D (Liu et al., 2021) and named it DIRT/3D (COLMAP) to distinguish it from the previous DIRT/3D (VisualSFM). In this computational experiment, we used the maize-root image dataset described in Liu et al. (2021). The full dataset includes 12 genotypes with 5-10 replicates per genotype. For each root sample, we uniformly sampled 600 images from the around 3,600 available images. A visualization of 12 example 3D root models and their skeletal curves is provided in Figure 5.
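The uniform sampling step can be sketched as picking evenly spaced frames from the ordered capture sequence. The helper and file names below are illustrative, not the DIRT/3D code:

```python
import numpy as np

def uniform_subsample(paths, target=600):
    """Select `target` items at evenly spaced indices from an ordered
    sequence; returns the sequence unchanged if it is already short enough."""
    if len(paths) <= target:
        return list(paths)
    idx = np.linspace(0, len(paths) - 1, target).round().astype(int)
    return [paths[i] for i in idx]

# 3,600 captured frames reduced to 600 evenly spaced ones
all_images = [f"frame_{i:04d}.png" for i in range(3600)]
subset = uniform_subsample(all_images, target=600)
```

Evenly spaced indices preserve the angular coverage of the rotating camera array, which is why uniform sampling is preferable to taking the first 600 frames.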
We also compared the average number of points per genotype across all samples within each genotype (Figure 6). The comparison revealed that the number of points generated by DIRT/3D (COLMAP) increased, on average, six times compared with DIRT/3D (VisualSFM). The largest increase was observed for genotype B101: the average number of points generated by DIRT/3D (COLMAP) was approximately 11 times that of the 3D models generated by DIRT/3D (VisualSFM). The smallest increase was observed for genotype LH59 x PHG29, for which the average number of points increased about three times when changing from DIRT/3D (VisualSFM) to DIRT/3D (COLMAP). In addition to the average number of points, we further compared the surface density of the 3D models generated by DIRT/3D (VisualSFM) and DIRT/3D (COLMAP). The results in Figure 7 reveal a significant improvement in model surface density between DIRT/3D (VisualSFM) and DIRT/3D (COLMAP): the surface density of the 3D models generated by DIRT/3D (COLMAP) increased, on average, 11 times compared with DIRT/3D (VisualSFM).
The largest increase in surface density was observed for genotype B101, which increased 22 times from the DIRT/3D (VisualSFM) implementation to the DIRT/3D (COLMAP) implementation. The smallest increase in surface density was found for genotype B112, which still increased five times. In our accuracy analysis of trait measurements, we used manual measurements of ten 3D root traits and compared their correlations with the DIRT/3D (VisualSFM) and DIRT/3D (COLMAP) trait measurements. The correlation analysis of these traits revealed R² > 0.80 and p < 0.001 (Figure 8). These traits included complete root crown traits and individual root traits, as described in Liu et al. (2021).
Regarding the complete root crown traits, we observed improved trait-measurement correlations for most traits, except root-system diameter and the whorl distance between the second- and third-youngest nodal roots (Figure 8A). For individual root traits, we observed improved correlations for the youngest and second-youngest nodal-root diameters (Figure 8B). This improvement likely benefits from the surface-density increase of DIRT/3D (COLMAP) over DIRT/3D (VisualSFM). Of note, DIRT/3D (VisualSFM) already achieved R² > 0.8 on the original datasets, with around 3,600 images per sample root; DIRT/3D (COLMAP), however, required only around 600 images per sample root, with the correlations improving despite the reduced number of input images.

DISCUSSION AND CONCLUSIONS
The first computational experiment evaluated the 3D reconstruction quality of five open-source pipelines on field-grown maize root systems. All five pipelines produced point clouds as output. We quantified reconstruction quality as the point count and surface density, and the efficiency of a pipeline as its total computing time. In our evaluation, COLMAP and COLMAP+PMVS generated the largest number of points and the highest surface density.
Although the computation time of COLMAP was around 12 times longer than that of the VisualSFM implementation used in the original DIRT/3D paper (Liu et al., 2021), COLMAP achieved 10 times the number of points and a 94 times higher surface point density.
More important than the measurable quality metrics of the point cloud is the accuracy of the trait measurements themselves. As such, we asked: "Does an improvement in 3D model quality benefit the accuracy of root-trait measurement?" The marked increase in surface point density when using COLMAP in Computational Experiment 1 motivated the hypothesis that a drastically reduced input image set can produce equally good trait measurements compared with the 3,600 input images used in the original DIRT/3D (VisualSFM). We tested this hypothesis in our second computational experiment by using around 600 images as input to DIRT/3D (COLMAP). We then computed the correlation of the DIRT/3D (COLMAP) traits against the same manual ground truth used for DIRT/3D (VisualSFM). We observed slightly better correlations for the traits extracted with DIRT/3D (COLMAP) than with DIRT/3D (VisualSFM). Beyond the coefficients of determination, we can visually identify highly similar patterns in the correlation with the manual ground truth for DIRT/3D based on VisualSFM and COLMAP (Figure 8).
Overall, our experiments indicate that it is possible to reduce the image-capturing time and trade it against increased computation time without significantly compromising the accuracy of the trait measurement. This finding suggests potential reductions in imaging time and effort, which typically require a trained staff member to place each root in the 3D scanning device manually and wait for the scanning process to finish before placing the next root (Liu et al., 2021). Reducing the number of images from around 3,600 to around 600 reduces scanning time from seven minutes to four minutes per root and can reduce data-transfer times from the scanner to online storage at CyVerse (Devisetty et al., 2016) from 15 min to 6 min (based on our second experiment). This outcome promises to streamline the most labor-intensive step of the root-phenotyping process.
We found that reductions in scanning and computation time are possible without excessively diminishing model quality.Our results highlight the need for further exploration of tradeoffs in root-image processing and demonstrate that neither customized operating rooms nor highly trained staff are necessary to operate a high-throughput root-imaging system.Our 3D imaging system promises to excel in high-throughput applications as an inexpensive and scalable 3D scanning solution for 3D root phenotyping.

AUTHOR CONTRIBUTIONS
Suxing Liu: Conceptualization, data curation, formal analysis, software, validation, visualization, writing - original draft. Wesley Paul Bonelli: Methodology, software, writing - review and editing. Peter Pietrzyk: Data curation, methodology, resources, writing - review and editing. Alexander Bucksch: Conceptualization, funding acquisition, investigation, methodology, project administration, supervision, writing - review and editing.

ACKNOWLEDGMENTS
This research was supported by NSF CAREER Award No. 1845760 and USDOE ARPA-E ROOTS Award No. DE-AR0000821 to Alexander Bucksch. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect those of the funders. The work was performed during a transition period from the University of Georgia to the University of Arizona.

FIGURE 1 Visual comparison of three reconstructed maize genotypes. The 3D root models in each row compare the same genotype across different 3D reconstruction pipelines. The 3D root models in each column compare a 3D reconstruction pipeline across different genotypes.

FIGURE 2 Comparison of the number of points in 3D models. The five pipelines are color labeled, and the lengths of the bars represent the number of points in each 3D point cloud model.

FIGURE 3 Comparison of surface density of 3D models. COLMAP produced models with the highest surface density among all the pipelines, whereas VisualSFM produced the lowest. The five tested pipelines are color labeled, and the lengths of the bars represent the surface density of each 3D point cloud model.

FIGURE 5 Twelve sample 3D root models and their computed root structures from DIRT/3D (COLMAP). The rows named "3D models" provide examples of the computed 3D root point clouds per genotype. The rows named "Structure" illustrate the computed root-architecture representation as skeletal curves.
FIGURE 6 Comparison of the number of points in 3D models generated by DIRT/3D (COLMAP) and DIRT/3D (VisualSFM). The average number of points for each genotype is rendered as a colored bar, with the length of the bar representing the number of points. Black lines represent the standard deviation.

FIGURE 7 Comparison of surface density in 3D models generated from DIRT/3D (COLMAP) and DIRT/3D (VisualSFM). Average surface density for each genotype is rendered as a colored bar, with the length of the bar representing the surface density. Standard deviation is represented as black lines. Surface density is defined as the number of neighbors divided by the neighborhood surface area.
FIGURE 8 Comparison of correlation analyses of 10 traits computed by DIRT/3D (COLMAP) and DIRT/3D (VisualSFM). The y-axis represents the manual measurement values, whereas the x-axis represents the DIRT/3D (COLMAP) or DIRT/3D (VisualSFM) computed values. R² represents the R-squared value of the regression analysis. The dotted blue lines represent the linear trend lines of the correlation.
FIGURE 8 (Continued) (B) Traits of individual roots (IRs).