FIELDimageR: An R package to analyze orthomosaic images from agricultural field trials

Remote sensing is revolutionizing the phenotyping of agricultural field trials, but for many researchers, the extraction of plot‐level results is a bottleneck. We have developed the R package FIELDimageR as a user‐friendly tool to analyze orthomosaic images containing many plots. The basic workflow involves cropping and rotating the image, followed by the creation of a shapefile based on the experimental design. The package includes functions to calculate the number of plants per plot, canopy cover percentage, vegetation indices, and plant height. FIELDimageR is publicly available as a GitHub repository (https://github.com/filipematias23/FIELDimageR).


INTRODUCTION
High-throughput phenotyping platforms (HTPPs) are fast, non-invasive tools for phenotyping plant populations under field conditions (Shi, Thomasson, Murray, Pugh, & Rooney, 2016). The most common HTPPs in agriculture are ground-wheeled or aerial vehicles equipped with multiple sensors for imaging based on geographic information systems (GIS) (Araus & Cairns, 2014). The reflected light captured in these images can be used to draw inference about many traits, including plant structure (e.g., plant height, leaf area index), nutrient status, and the presence of abiotic and biotic stresses (Yang, Liu, Zhao, Li, & Huang, 2017).
To realize the potential benefits of remote sensing, a number of software tools are needed. After collecting overlapping images, tools such as Agisoft PhotoScan (2019) and Pix4Dmapper (Pix4D, 2019) can be used to stitch them together into a single orthomosaic image. Whereas the construction of the orthomosaic is a largely automated process, the delineation of experimental units (i.e., plots) and extraction of plot-level features is more idiosyncratic. At present, many agricultural scientists are relying on (typically unpublished) scripts for use with specialized GIS software programs, such as QGIS (QGIS Development Team, 2019) or ArcGIS (ESRI, 2019). Our objective was to develop a more user-friendly, R-based pipeline to accomplish this task, building on existing image analysis packages, such as raster (Hijmans, 2019), sp (Bivand, Pebesma, & Gómez-Rubio, 2013), and EBImage (Pau, Fuchs, Sklyar, Boutros, & Huber, 2010). The package, called FIELDimageR, is being distributed under the GNU General Public License 2 at https://github.com/filipematias23/FIELDimageR. In this Science Note, we illustrate key features of the software using images from the University of Wisconsin-Madison potato (Solanum tuberosum L.) breeding program. A more in-depth tutorial is available for download with the package.

Core Ideas
• Developed user-friendly software to extract plot-level results from orthomosaic images.
• Plant stand counts can be estimated based on the watershed transformation.
• Potato maturity was estimated more reliably by remote sensing than by visual rating.

Preparing the image for data extraction
The workflow in FIELDimageR (Figures 1 and 3) begins with an orthomosaic image and ends with a table of plot-level features. The fieldCrop function (Figure 3, Line 2) is used to crop the image to the area of interest by clicking the four corners in the plotting window of RStudio (Supplemental Figure S1). The exact corners of the trial do not need to be selected at this time, as further refinement of the trial region is accomplished later in the pipeline. Although not necessary, at this point one may wish to create a mask over the soil with the fieldMask function, based on one of the available vegetation indices (Supplemental Table S1), or the user can define a new index (myIndex).
When creating a shapefile to facilitate extraction of plot-level features, the goal is to overlay the field design on the orthomosaic. FIELDimageR assumes a row-column design, and for convenience the grid is constructed to be aligned with the plotting window. As a result, the user first needs to rotate the image using the fieldRotate function (Figure 3, Line 3). The rotation angle is determined by clicking on two points along the edge of the trial (Figure 1b), and the user indicates whether the line segment should be horizontal or vertical in the rotated image. Next, using the fieldShape function (Figure 3, Line 5), the user selects the four corners of the region that will be subdivided evenly based on the number of rows and columns in the design (Figure 1d). The grid is constructed solely from geometric calculations, without regard to the presence of alleys or other features in the image (Figure 1e). By default, plot numbers are assigned from left to right, starting in the upper left corner of the grid. Alternatively, the user can provide a matrix of plot numbers or labels. When multiple orthomosaics (e.g., collected over time) are available with the same coordinate reference system (e.g., by using ground control points or RTK GPS), a shapefile created from one image can be used to extract information from all of them.
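The geometry behind the rotation and grid steps can be illustrated with a short, language-agnostic sketch. The Python code below is not the FIELDimageR implementation. The function names (rotation_angle, plot_grid), the sign convention (counter-clockwise positive, y axis pointing up), and the use of bilinear interpolation between the four corners are all assumptions made for illustration.

```python
import math


def rotation_angle(p1, p2, target="horizontal"):
    """Rotation in degrees (counter-clockwise positive) that would make the
    line through p1 and p2 horizontal or vertical. Points are (x, y), y up."""
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    theta = math.degrees(math.atan2(dy, dx))
    # A line has no direction, so fold its angle into (-90, 90].
    if theta > 90:
        theta -= 180
    elif theta <= -90:
        theta += 180
    if target == "horizontal":
        return -theta
    # Vertical target: take the smaller of the two possible rotations.
    return 90 - theta if theta > 0 else -90 - theta


def plot_grid(corners, n_rows, n_cols, plot_ids=None):
    """Split the region given by its upper-left, upper-right, lower-right and
    lower-left corners into n_rows x n_cols equal plots.
    Returns {plot_id: [four (x, y) corners]}."""
    ul, ur, lr, ll = corners

    def point(u, v):
        # Bilinear interpolation: u runs left to right, v top to bottom (0..1).
        top = (ul[0] + u * (ur[0] - ul[0]), ul[1] + u * (ur[1] - ul[1]))
        bottom = (ll[0] + u * (lr[0] - ll[0]), ll[1] + u * (lr[1] - ll[1]))
        return (top[0] + v * (bottom[0] - top[0]),
                top[1] + v * (bottom[1] - top[1]))

    grid = {}
    for i in range(n_rows):
        for j in range(n_cols):
            # Default numbering: left to right, starting in the upper-left
            # corner. A user-supplied matrix of labels overrides it.
            pid = plot_ids[i][j] if plot_ids else i * n_cols + j + 1
            u0, u1 = j / n_cols, (j + 1) / n_cols
            v0, v1 = i / n_rows, (i + 1) / n_rows
            grid[pid] = [point(u0, v0), point(u1, v0),
                         point(u1, v1), point(u0, v1)]
    return grid
```

As in the text, the grid is purely geometric: alleys or gaps inside the selected region are divided among the neighboring plots rather than being detected.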

Counting plants
The number of plants per plot, or stand count, is an important measure of seed quality and early vigor and can be used to adjust yield calculations. The fieldCount function (Figure 3, Line 6) uses image segmentation routines from the EBImage package (Pau et al., 2010) to identify distinct plants. First, the distance map transformation is applied to the binary mask created by fieldMask, generating a grayscale image of the minimum distance to each background pixel. Next, this grayscale image is interpreted as a topographic map, and the watershed transformation is used to identify the boundaries between the watershed basins (Soille, 2013), which are interpreted as different plants. As shown in Figure 2, fieldCount has some ability to identify distinct plants even when they are touching, with accuracy varying from 78 to 100% (Supplemental Table S2). When there are differences in emergence time and early vigor between genotypes (as in Figure 2), it is unlikely that images collected on a single date can be used to estimate stand count accurately for all plots; multiple collection dates will be needed.
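The two-step idea (distance map, then watershed) can be sketched in a few lines of plain Python. This is a simplified illustration, not the EBImage routines used by fieldCount. It assumes a city-block distance, 4-connected neighbors, and a level-by-level flooding scheme, and it omits any merging of shallow basins. The function names distance_map and count_objects are hypothetical.

```python
from collections import deque

NEIGHBORS = ((-1, 0), (1, 0), (0, -1), (0, 1))


def distance_map(mask):
    """City-block distance from each foreground pixel (1) to the nearest
    background pixel (0); pixels outside the image count as background."""
    rows, cols = len(mask), len(mask[0])
    dist = [[0] * cols for _ in range(rows)]
    queue = deque()
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            for dr, dc in NEIGHBORS:
                rr, cc = r + dr, c + dc
                if not (0 <= rr < rows and 0 <= cc < cols) or not mask[rr][cc]:
                    dist[r][c] = 1  # pixel touches the background
                    queue.append((r, c))
                    break
    while queue:  # breadth-first search inward from the object edges
        r, c = queue.popleft()
        for dr, dc in NEIGHBORS:
            rr, cc = r + dr, c + dc
            if (0 <= rr < rows and 0 <= cc < cols
                    and mask[rr][cc] and dist[rr][cc] == 0):
                dist[rr][cc] = dist[r][c] + 1
                queue.append((rr, cc))
    return dist


def count_objects(mask):
    """Count watershed basins of the distance map, flooding from the highest
    distance (object centers) down to the object edges."""
    dist = distance_map(mask)
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    levels = {}
    for r in range(rows):
        for c in range(cols):
            if dist[r][c] > 0:
                levels.setdefault(dist[r][c], []).append((r, c))

    def spread(queue, level):
        # Grow labels through unlabeled pixels of the current level only.
        while queue:
            r, c = queue.popleft()
            for dr, dc in NEIGHBORS:
                rr, cc = r + dr, c + dc
                if (0 <= rr < rows and 0 <= cc < cols
                        and dist[rr][cc] == level and labels[rr][cc] == 0):
                    labels[rr][cc] = labels[r][c]
                    queue.append((rr, cc))

    n_basins = 0
    for level in sorted(levels, reverse=True):
        queue = deque()
        for r, c in levels[level]:  # 1) extend basins found at higher levels
            for dr, dc in NEIGHBORS:
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols and labels[rr][cc] > 0:
                    labels[r][c] = labels[rr][cc]
                    queue.append((r, c))
                    break
        spread(queue, level)
        for r, c in levels[level]:  # 2) untouched pixels start a new basin
            if labels[r][c] == 0:
                n_basins += 1
                labels[r][c] = n_basins
                spread(deque([(r, c)]), level)
    return n_basins
```

Two square canopies joined by a one-pixel bridge form a single connected region, yet the distance map has two separate peaks, so this sketch counts them as two objects. Without a tolerance for shallow basins, irregular canopy outlines would be over-segmented, which is one reason a production implementation needs additional tuning.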

Vegetation indices
Vegetation indices (VIs) are designed to relate spectral differences in reflectance to plant characteristics. The fieldIndex function (Figure 3, Line 8) was developed to calculate the specified VI for every pixel in the image. The fieldInfo function (Figure 3, Line 9) calculates the average VI for each plot based on the shapefile. One of the most commonly used VIs is the normalized difference vegetation index (NDVI) (Rouse, Haas, Schell, & Deering, 1973), which is a good predictor of aboveground biomass but tends to saturate as canopy cover exceeds 90% (Barnes, Clarke, Richards, Colaizzi, & Haberland, 2000). The normalized difference red edge index (NDRE) has been proposed as more sensitive to plant N status under these conditions (Jain, Ray, Singh, & Panigrahy, 2007; Nigon, Mulla, Rosen, Cohen, & Alchanatis, 2015). In the UW-Madison potato breeding program, plant maturity has traditionally been assessed based on a visual rating (1-9) of senescence around 100 days after planting (DAP). Table 1 compares the broad-sense heritability (H²) of the visual rating with that of NDVI and NDRE for three different trials in 2019, representing the major market classes in the United States. In two trials, both NDVI and NDRE had higher H² than the visual rating. In the third trial, only NDRE was more heritable. These results suggest remote sensing can improve the reliability of maturity assessment for the potato breeding program, as has been observed for other traits in maize (Zea mays L.) (Anderson, Murray, Malambo, Ratcliff, & Popescu, 2019; Niu, Zhang, Zhang, Han, & Peng, 2019) and sorghum [Sorghum bicolor (L.) Moench] (Guo, Zheng, Potgieter, Diot, & Watanabe, 2018; Hu, Chapman, Wang, Potgieter, & Duan, 2018).
Figure 3. FIELDimageR command line pipeline.
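Both NDVI and NDRE are normalized differences of two bands: NDVI = (NIR - Red) / (NIR + Red) and NDRE = (NIR - RedEdge) / (NIR + RedEdge). The Python sketch below illustrates the per-pixel calculation followed by plot averaging. It is not the fieldIndex or fieldInfo code. The function names, the convention that plot label 0 marks pixels outside every plot, and the use of NaN for pixels without a valid index value are assumptions made for illustration.

```python
import math


def normalized_difference(a, b):
    """(a - b) / (a + b); NaN when both bands are zero."""
    total = a + b
    return (a - b) / total if total != 0 else math.nan


def index_image(band_a, band_b):
    """Per-pixel normalized difference of two equally sized bands.
    NDVI: band_a = NIR, band_b = red. NDRE: band_a = NIR, band_b = red edge."""
    return [[normalized_difference(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(band_a, band_b)]


def plot_means(values, plot_ids):
    """Average the index within each plot. plot_ids has the same shape as
    values; label 0 (outside all plots) and NaN pixels are skipped."""
    sums, counts = {}, {}
    for row_v, row_p in zip(values, plot_ids):
        for v, p in zip(row_v, row_p):
            if p == 0 or math.isnan(v):
                continue
            sums[p] = sums.get(p, 0.0) + v
            counts[p] = counts.get(p, 0) + 1
    return {p: sums[p] / counts[p] for p in sums}
```

For example, a pixel with NIR reflectance 0.75 and red reflectance 0.25 has NDVI = 0.5 / 1.0 = 0.5.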

Plant height
Plant height is strongly correlated with life span, seed mass, time to maturity, and ability to compete for light (Moles, Warton, Warman, Swenson, & Laffan, 2009). Although simple to measure, plant height can be labor- and time-consuming to collect for research programs with large fields. An alternative way to estimate plant height is through image analysis with the Canopy Height Model (CHM) (Anderson et al., 2019). This model is the difference between the Digital Surface Model (DSM) at any moment in the crop growth cycle and the DSM of the soil base (before sprouting). The DSM files are one output product from the orthomosaic step (Agisoft PhotoScan, 2019; Pix4D, 2019). This methodology was used with FIELDimageR to estimate plant height for the three UW-Madison potato trials mentioned above, with H² estimates ranging from 0.70 to 0.85.
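The CHM calculation is a pixel-wise subtraction of two aligned elevation layers, followed by a per-plot summary. The Python sketch below illustrates this under stated assumptions: both DSMs share the same grid and units, the plot summary is the mean, and plot label 0 marks pixels outside every plot. The function names are hypothetical and the code is not taken from FIELDimageR.

```python
def canopy_height_model(dsm_crop, dsm_soil):
    """Pixel-wise difference between the DSM during the growth cycle and the
    DSM of the bare soil. Both layers must be aligned and equally sized."""
    if len(dsm_crop) != len(dsm_soil) or any(
            len(rc) != len(rs) for rc, rs in zip(dsm_crop, dsm_soil)):
        raise ValueError("DSM layers must have the same dimensions")
    return [[crop - soil for crop, soil in zip(rc, rs)]
            for rc, rs in zip(dsm_crop, dsm_soil)]


def plot_heights(chm, plot_ids):
    """Mean canopy height per plot; label 0 means outside all plots."""
    sums, counts = {}, {}
    for row_h, row_p in zip(chm, plot_ids):
        for h, p in zip(row_h, row_p):
            if p == 0:
                continue
            sums[p] = sums.get(p, 0.0) + h
            counts[p] = counts.get(p, 0) + 1
    return {p: sums[p] / counts[p] for p in sums}
```

Because the two DSMs are subtracted pixel by pixel, they need a common coordinate reference system, which is the same requirement noted earlier for reusing one shapefile across several orthomosaics.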

Processing time
Image size and resolution can drastically impact the processing time. To accelerate the analysis, we recommend two approaches: (a) reduce the image resolution using the aggregate function from package raster, and (b) use multiple cores by setting the parameter n.core (available for fieldInfo, fieldCount, and fieldArea). For example, Figure 1 has a resolution of 1.7 by 1.7 cm and required 115 s to extract and compute plot averages with fieldInfo (MacOS 10.14.5, 3.2 GHz Intel Core i5 with 32 GB RAM, four physical cores). Reducing the resolution to 3.5 by 3.5 cm decreased the time to 45 s without appreciable impact on the results (correlation was 0.99). The processing time can be reduced even further by using multiple cores: 25 and 16 s using two and four cores, respectively.
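Reducing the resolution amounts to replacing each block of neighboring pixels with a single summary value. The Python sketch below illustrates block averaging with an integer factor. It is a stand-in written for illustration, not the raster package's aggregate function, and the name aggregate_mean and the handling of partial blocks at the image edge are assumptions.

```python
def aggregate_mean(image, fact):
    """Coarsen a 2-D grid by averaging fact x fact blocks of pixels.
    Blocks at the right and bottom edges may be smaller than fact x fact."""
    if fact < 1:
        raise ValueError("fact must be a positive integer")
    rows, cols = len(image), len(image[0])
    out = []
    for r0 in range(0, rows, fact):
        out_row = []
        for c0 in range(0, cols, fact):
            block = [image[r][c]
                     for r in range(r0, min(r0 + fact, rows))
                     for c in range(c0, min(c0 + fact, cols))]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out
```

A factor of 2 roughly corresponds to the change from 1.7 to 3.5 cm pixels described above and leaves about one quarter as many pixels to process.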

CONCLUSION
The FIELDimageR package offers plant scientists a convenient set of utilities to extract remote-sensing phenotypes from orthomosaic images of field trials. The vignette distributed with the package illustrates the functionality of the software in more detail. Features not addressed in this Science Note include estimation of plant canopy percentage (fieldArea function), support for non-grid shapefiles (fieldPolygon function), and spatial visualization of traits (fieldPlot function).

ACKNOWLEDGMENTS
We thank Lin Song for assisting with the literature review of the vegetation indices.

AUTHOR CONTRIBUTIONS
FIM designed the study. FIM and MVCH collected and analyzed the data. FIM, MVCH, and JBE drafted the manuscript. JBE supervised the whole study. All authors read and approved the final version of the manuscript for publication.