Moisture content estimation and senescence phenotyping of novel Miscanthus hybrids combining UAV‐based remote sensing and machine learning

Miscanthus is a leading perennial biomass crop that can produce high yields on marginal lands. Moisture content is a highly relevant biomass quality trait with multiple impacts on efficiencies of harvest, transport, and storage. The dynamics of moisture content during senescence and overwinter ripening are determined by genotype × environment interactions. In this paper, unmanned aerial vehicle (UAV)‐based remote sensing was used for high‐throughput plant phenotyping (HTPP) of the moisture content dynamics during autumn and winter senescence of 14 contrasting hybrid types (progeny of M. sinensis x M. sinensis [M. sin x M. sin, eight types] and M. sinensis x M. sacchariflorus [M. sin x M. sac, six types]). The time series of moisture content was estimated using machine learning (ML) models and a range of vegetation indices (VIs) derived from UAV‐based remote sensing. The most important VIs for moisture content estimation were selected by the recursive feature elimination (RFE) algorithm and were BNDVI, GDVI, and PSRI. The ML model transferability was high only when the moisture content was above 30%. The best ML model accuracy was achieved by combining VIs and categorical variables (5.6% of RMSE). This model was used for phenotyping senescence dynamics and identifying the stay‐green (SG) trait of Miscanthus hybrids using the generalized additive model (GAM). Combining ML and GAM modeling, applied to time series of moisture content values estimated from VIs derived from multiple UAV flights, proved to be a powerful tool for HTPP.


| INTRODUCTION
Miscanthus is a promising perennial crop that can achieve high biomass production on marginal lands (van der Cruijsen et al., 2021; Pancaldi & Trindade, 2020; Shepherd et al., 2020). Due to its perennial nature, Miscanthus has limited input requirements and is cultivated under a no-tillage regime, which leads to the provision of multiple ecosystem services (Agostini et al., 2021; Ferrarini et al., 2016, 2021; Martani et al., 2021). Most of the research on Miscanthus has been conducted on Miscanthus x giganteus (Heaton et al., 2010), which is a naturally occurring sterile triploid hybrid of Miscanthus sacchariflorus (M. sac) and Miscanthus sinensis (M. sin) (Hodkinson et al., 2002). New Miscanthus hybrids (Clifton-Brown et al., 2018; Hastings et al., 2017) have recently been obtained from several breeding programs. In Europe, rhizome- and seed-based Miscanthus hybrids are available at a technology readiness level that can enable the plantation of thousands of hectares per year (Clifton-Brown et al., 2018). These novel Miscanthus hybrids are being tested in multiple environments within the EU-BBI project GRACE.
Plant senescence is a key trait for perennial plants as it limits biomass yield, modifies moisture content, and affects nutrient translocation (Boersma et al., 2015; Jensen et al., 2016; Malinowska et al., 2016; Sarath et al., 2014; Yang & Udvardi, 2017). Moisture content at harvest is the most important biomass quality trait (Robson et al., 2011; Styks et al., 2020). Monitoring the dynamics of crop senescence and moisture content can support the choice of an optimal harvest time that improves biomass quality and the logistics of the biomass supply chain. Lewandowski et al. (2016) found that the moisture content of different genotypes can vary due to morphological differences and senescence patterns, but it is primarily determined by harvest date. Several studies have shown that late senescence (stay-green, SG) maximizes biomass yield (Clifton-Brown et al., 2001), while early senescence increases biomass quality. SG is determined by complex physiological control (e.g., chlorophyll efficiency, nitrogen contents, nutrient remobilization, and source-sink balance) (Munaiz et al., 2020; Thomas & Howarth, 2000), and traditional phenotyping methods for evaluating SG and delayed senescence are time-consuming (Furbank & Tester, 2011). Nondestructive methods are based on visual greenness scores (Bogard et al., 2011) and SPAD measurements (Lopes & Reynolds, 2012; Xie et al., 2016), for the estimation of the green leaf area and relative chlorophyll content, respectively. These methods can be used to monitor field trials but are not effective in monitoring senescence dynamics at commercial scale. New sensing technologies have contributed to a substantial improvement in the monitoring of SG in different crops (Cerrudo et al., 2017; Kipp et al., 2014; Liedtke et al., 2020; Lopes & Reynolds, 2012).
High-throughput plant phenotyping (HTPP) with remote sensing is a rapid and nondestructive technology that can be used to monitor the senescence of numerous genotypes, thus supporting breeding programs (Anderegg et al., 2020; Hassan et al., 2018). Remote sensing technologies use different types of sensors, such as Red-Green-Blue (RGB), multispectral, hyperspectral, and thermal cameras, installed on satellites and on unmanned aerial vehicles (UAVs) (Xie & Yang, 2020). Spectral data can be used to calculate vegetation indices (VIs), which can be used to estimate crop parameters related to the SG trait: normalized difference vegetation index (NDVI) for green biomass (Cabrera-Bosquet et al., 2011), enhanced vegetation index (EVI) for leaf area index (LAI) (Alexandridis et al., 2019), and modified chlorophyll absorption in reflectance index (MCARI) for chlorophyll content (Haboudane et al., 2002). Other VIs, such as the plant senescence reflectance index (PSRI) (Merzlyak et al., 1999) or the structure insensitive pigment index (SIPI) (Peñuelas et al., 1995), were specifically developed to study crop senescence; they are based on the chlorophyll/carotenoid ratio, because the decomposition rates of these pigments are affected during senescence. The normalized difference water index (NDWI) (Gao, 1996), calculated using near-infrared (NIR) and shortwave-infrared (SWIR) spectral bands, has been proposed as a powerful direct water-sensitive VI, which can be used for the remote sensing of canopy water content (CWC) (Jackson et al., 2004). However, NDWI is rarely calculated from UAV data because it requires costly sensors that are equipped with the SWIR band. Zhang and Zhou (2019) compared direct water-sensitive VIs against indirect ones (which do not include the SWIR band), such as NDVI, NDRE, CIgreen, and CIred-edge, and found that the indirect VIs were as strongly correlated with the CWC as the direct VIs.
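As an illustration of how such indices are obtained from band reflectances, the following minimal Python sketch computes NDVI, PSRI, and NDWI. The wavelength-specific forms follow the commonly cited original definitions (Merzlyak et al., 1999 for PSRI; Gao, 1996 for NDWI); the function names and the reflectance values are hypothetical, and the broadband formulations actually used for the 54 VIs of this study (Table S2) may differ.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized difference vegetation index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def psri(r678: float, r500: float, r750: float) -> float:
    """Plant senescence reflectance index, (R678 - R500) / R750."""
    return (r678 - r500) / r750


def ndwi(r860: float, r1240: float) -> float:
    """Normalized difference water index, needs a SWIR band near 1240 nm."""
    return (r860 - r1240) / (r860 + r1240)


# Hypothetical reflectances for a green and a senescent canopy (assumed values).
green_canopy = {"nir": 0.50, "red": 0.05}
senescent_canopy = {"nir": 0.30, "red": 0.15}

ndvi_green = ndvi(**green_canopy)
ndvi_senescent = ndvi(**senescent_canopy)
print(f"NDVI green: {ndvi_green:.2f}, NDVI senescent: {ndvi_senescent:.2f}")
```

With these assumed reflectances, NDVI is lower for the senescent canopy, which is the behavior that makes such indices useful for tracking senescence.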
Field trials carried out with small plots cannot be monitored using satellite data; for this reason, HTPP using UAV-based multispectral images is best suited to breeding programs where numerous genotypes are compared (Gracia-Romero et al., 2019; Ostos-Garrido et al., 2019; Su et al., 2019; Varela et al., 2021; Zhou et al., 2019). UAV-based multispectral images were used in many studies to compare genotypes on the basis of VIs linked to LAI (Potgieter et al., 2017), green LAI (Blancon et al., 2019), canopy cover (Makanza et al., 2018), crop biomass and yield (Johansen et al., 2020; Wang et al., 2019), and senescence dynamics (Hassan et al., 2018). However, many VIs show nonlinear relationships with their associated crop parameters (Verrelst et al., 2015). Machine learning (ML) regression algorithms have increasingly been used in HTPP to recognize nonlinear and nonparametric relationships. ML is used to combine multiple VIs for estimating crop parameters from a sequence of UAV remote sensing acquisitions. ML models use two main datasets: a training set on which the best model is trained to fit the measured parameters and a test set used to assess the performance of the model (Kuhn & Johnson, 2013). In addition to the VI data, ML methods can include numerous other types of data in the analysis (Verrelst et al., 2018), such as categorical variables (e.g., genotype, crop type, locations, agronomic treatments) (Im et al., 2009; Meroni et al., 2021; Wolanin et al., 2020). An ML method commonly used in many remote sensing analyses is random forest (RF) (Belgiu & Drăguţ, 2016; Holloway & Mengersen, 2018), which can estimate crop biomass (Han et al., 2019) and yield (Johansen et al., 2020) from UAV multispectral images. A main limit of the RF model is its transferability to environments, cropping systems, or growing seasons different from those used for training the model (Vuolo et al., 2013).
Another limitation is in the training set size (Millard & Richardson, 2015) and the unreliability of predictions made beyond the range of values of the parameters present in the training set (Shah et al., 2019). In addition, Schauberger et al. (2020) reported that 52% of the studies on ML do not validate the models' performance with independent test sets. Overall, the quality of training data for developing robust ML models is the key for successfully transferring the trained model and its knowledge to other target domains/tasks. For these reasons, new studies are needed to assess the transferability of ML models for UAV applications in agricultural sciences (Johansen et al., 2020).
However, to date, HTPP has used only time series of VIs derived from UAV data, and not crop parameters estimated by ML models. A set of known models is normally fitted to VI time series to characterize plant growth/status associated with different phenological phases. Specifically for senescence, logistic functions (Christopher et al., 2014) and the Gompertz model (Anderegg et al., 2020) are the two most used models. Another potential approach to fit VI data is the generalized additive model (GAM) (Nolè et al., 2018). Antonucci et al. (2021), for example, successfully used the GAM approach for HTPP of whole-canopy photosynthesis and transpiration.
Although remote sensing applications that support these approaches exist and have been already tested successfully for field crops (Alam et al., 2012;Kavats et al., 2019;Yang, 2011;Zhang et al., 2021), no remote sensing application for estimating moisture content of Miscanthus is reported in scientific literature.
As a first-time testbed for phenotyping Miscanthus with UAV remote sensing, two locations, where 14 contrasting Miscanthus hybrids were compared in a completely randomized block design, were monitored regularly with moisture content measurements and UAV flights and senescence dynamics were assessed during two growing seasons. The objectives of this study were (1) to evaluate the performances and transferability of RF models in estimating the moisture content of Miscanthus biomass and (2) to phenotype the dynamics of senescence and identify SG trait of contrasting Miscanthus hybrids using GAM applied to moisture content time series.

| Experimental design
This study is part of the EU-BBI funded project "GRowing Advanced industrial Crops on marginal lands for biorEfineries" (GRACE) that aims to prove the feasibility of large-scale Miscanthus cultivation on marginal land. Two of the eight plot scale (PS) trials conducted within the GRACE project were selected for this study. The two sites were located in the province of Piacenza (NW Italy): PAC 1 located in San Bonico (45°00′11.70″N, 9°42′35.39″E) and PAC 2 located in Chiulano (44°50′40.32″N, 9°35′04.93″E) (Figure 1). Former land use was arable land with cereal crop rotation in PAC 1 and permanent meadow in PAC 2. The climate in both locations is temperate. The sites differ in soil type and elevation (Figure 1). Meteorological data were collected from automatic weather stations located at each experimental site (Table 1). The experimental layout was a complete randomized block design with 14 Miscanthus hybrids (Table 2) with n = 4 replicates for a total of n = 56 plots. Plot size was 6 m × 7 m. The 14 hybrids, coded from GRC 1 to GRC 15 (except GRC 12), were grouped into three main genotypes: M. x giganteus as control genotype, and interspecific (M. sin x M. sac) and intraspecific (M. sin x M. sin) hybrid genotypes. Both PS trials were established in April 2018 after winter ploughing and spring seed bed preparation (power harrowing). Plugs and rhizomes were manually transplanted, while mechanical weeding during the first years was performed three times. Neither irrigation nor fertilization was applied. Measurements for this study were carried out in the second and third growing seasons during senescence.

| Crop measurements
Senescence was tracked visually following the scoring method proposed by Robson et al. (2011), which is based on a scale from 1 to 9, where 1 indicates the lowest level of "greenness" of the whole visible aerial parts of the plant and 9 is the score attributed when no visible leaf senescence occurs. Scores were acquired from August to February (until harvest) for a total of 10 events in PAC 1 and 9 in PAC 2. Besides scoring senescence, at each measurement event, whole stem samples randomly selected for each plot (20 for M. sin x M. sin and 10 for M. sin x M. sac hybrids, respectively) were sampled to calculate plant moisture content. Samples were weighed immediately after harvest and again after having been oven-dried at 105°C, and then, the percentage of moisture content was calculated (Samuelsson et al., 2006).
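The moisture content calculation from the two weighings can be sketched as follows. This is a minimal Python example that assumes a wet-basis definition (water mass relative to fresh mass), which is the usual convention for biomass but is not spelled out in the text; the function name and the sample weights are hypothetical.

```python
def moisture_content_pct(fresh_weight_g: float, dry_weight_g: float) -> float:
    """Wet-basis moisture content (%) from fresh and oven-dry weights.

    Assumption: moisture is expressed relative to the fresh weight.
    """
    if fresh_weight_g <= 0:
        raise ValueError("fresh weight must be positive")
    if not 0 <= dry_weight_g <= fresh_weight_g:
        raise ValueError("dry weight must lie between 0 and the fresh weight")
    return 100.0 * (fresh_weight_g - dry_weight_g) / fresh_weight_g


# Hypothetical stem sample: 200 g fresh, 110 g after oven-drying at 105 °C.
sample_moisture = moisture_content_pct(200.0, 110.0)
print(f"Moisture content: {sample_moisture:.1f}%")
```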

| UAV multispectral data and vegetation indices
The unmanned aerial vehicle (UAV) used in the experiment was a four-rotor DJI Matrice 210 RTK (SZ DJI Technology Co.) combined with an RTK (Real-Time Kinematic) GPS positioning system. At each visual scoring event, a UAV multispectral data acquisition was performed; in addition, 10 supplementary flight missions were carried out on PAC 1 and five on PAC 2 to increase the frequency of senescence tracking. Ten flights were performed over PAC 1 in both seasons, while in PAC 2, 6 and 8 flights were realized in the first and second seasons, respectively (Table S1). The UAV was equipped with a MicaSense RedEdge-Mx multispectral camera (MicaSense). The RedEdge-Mx camera acquires images in five spectral bands: blue (475 nm center, 32 nm bandwidth), green (560 nm center, 27 nm bandwidth), red (668 nm center, 14 nm bandwidth), red edge (717 nm center, 12 nm bandwidth), and near-infrared (840 nm center, 57 nm bandwidth). All the flights were performed between 11:00 and 15:00. The flight altitude above ground level (AGL) was 40-50 m in PAC 1 and 80-100 m in PAC 2. The forward overlap of the images was set at 80% and the lateral overlap at 75%. The flight speed was set at 3 m/s. The ground sampling distance (GSD) was 2.78-3.47 cm and 5.56-6.94 cm in PAC 1 and PAC 2, respectively. The flights were performed in automatic mode with waypoint routes, as the presence of a GPS navigation system enables a more accurate image acquisition. The DJI Pilot software (SZ DJI Technology Co.) was used for flight planning and automatic mission control. For the radiometric calibration of the images, the reflectance of a spectral panel (MicaSense) with reflectance values provided by MicaSense was captured before each flight. In addition, a light sensor that automatically adjusts the readings to ambient light was mounted at the top of the UAV to minimize error during image capture.
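The reported GSD values scale linearly with flight altitude, as expected for a nadir-looking frame camera. The short Python sketch below reproduces them from the relation GSD = altitude × pixel pitch / focal length. The pixel pitch (3.75 µm) and focal length (5.4 mm) are assumed nominal values for this camera model and are not stated in the text; the function and constant names are hypothetical.

```python
# Assumed nominal sensor parameters (not given in the text).
PIXEL_PITCH_M = 3.75e-6   # pixel pitch in meters
FOCAL_LENGTH_M = 5.4e-3   # focal length in meters


def gsd_cm(altitude_agl_m: float) -> float:
    """Ground sampling distance in cm/pixel for a nadir image at the given altitude."""
    return altitude_agl_m * PIXEL_PITCH_M / FOCAL_LENGTH_M * 100.0


# Altitude ranges reported for PAC 1 (40-50 m) and PAC 2 (80-100 m).
for altitude in (40, 50, 80, 100):
    print(f"{altitude:>3} m AGL -> {gsd_cm(altitude):.2f} cm/pixel")
```

Under these assumptions the computed values match the GSD ranges reported for the two sites, which suggests that the reported ranges simply reflect the altitude ranges flown.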
The radiometric calibration, image mosaicking, and orthomosaic generation were done using the Pix4D mapper (Pix4D, S.A.). The orthomosaic in reflectance values generated from the software was used for the calculation of 54 vegetation indices (VIs) as shown in Table S2. To extract the spectral information of each experimental plot, the polygons of the experimental design were drafted in AutoCAD (Autodesk) and georeferenced based on the UAV multispectral images by using QGIS software (QGIS Development Team, 2021).
The recursive feature elimination (RFE) algorithm (Feng et al., 2020; Yue et al., 2018) was initially applied to solve the multicollinearity problem among VIs by selecting the most important VIs for moisture content estimation. Inputs for the RFE algorithm were the predictor variables (the 54 VIs calculated from UAV multispectral images) and the corresponding target values (the measured plant moisture content). In the RFE algorithm, the random forest (RF) model was used to minimize the root mean square error (RMSE). The RFE results were combined with the "pickSizeTolerance" function to select a model containing fewer predictor variables within the bounds of a user-defined threshold metric (Parmley et al., 2019). The RMSE metric and the 0%, 1%, and 5% tolerance thresholds were utilized to identify models with acceptable performance but with fewer predictor variables. On the selected VIs, RF was then used to estimate the moisture content of Miscanthus hybrids. The RF model is an ensemble learning model where the output averages the result of multiple regression trees (Kamir et al., 2020). The RF models were created using the caret R package (Kuhn, 2008). Two steps in RF modeling were adopted: firstly, RF was trained and tested on the VIs selected from the RFE algorithm at the tolerance threshold of 1%; secondly, the three categorical crop variables (material, hybrid code, and genotype, Table 2) and their combinations were added in RF modeling to check for improvement in the model's performance.
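The tolerance-based selection of the number of predictors can be illustrated with a small Python sketch. It mimics the kind of rule applied by a tolerance function such as "pickSizeTolerance": among all subset sizes whose RMSE is within a given percentage of the best RMSE, the smallest subset is retained. The function name and the RMSE profile are hypothetical; the values were chosen only to resemble the pattern of an RFE profile and are not the data of this study.

```python
def pick_size_tolerance(rmse_by_size: dict[int, float], tol_pct: float) -> int:
    """Smallest subset size whose RMSE is within tol_pct percent of the best RMSE."""
    best = min(rmse_by_size.values())
    candidates = [
        size
        for size, rmse in rmse_by_size.items()
        if (rmse - best) / best * 100.0 <= tol_pct
    ]
    return min(candidates)


# Hypothetical RFE profile: number of VIs -> cross-validated RMSE (%).
profile = {1: 10.00, 2: 9.00, 4: 8.20, 6: 7.75, 14: 7.47, 20: 7.42, 30: 7.40, 54: 7.41}

for tol in (0, 1, 5):
    print(f"tolerance {tol}% -> {pick_size_tolerance(profile, tol)} VIs")
```

A larger tolerance trades a small loss in accuracy for a more parsimonious set of predictors, which also reduces redundancy among correlated VIs.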
For the RF modeling, the optimal size of the variable subset ("mtry") was obtained by grid-searching method using repeated k-fold cross-validation. The repeated k-fold cross-validation consists of dividing the data into k independent folds of the same size, training the algorithm on (k−1) folds, and testing its accuracy on the remaining fold based on the error between predicted and target values several times (Kamir et al., 2020). In our study, we used a tenfold cross-validation, which was repeated five times. This procedure was used to estimate the moisture content and to evaluate the transferability of the models tested on five subset test datasets: four specific season and location datasets (two locations x two growing seasons) and one reference dataset, as a comparison. The reference dataset was created by using a stratified random sampling method (Han et al., 2019): data from both locations and seasons were split into 70/30 between training and testing based on moisture content distribution. To include the categorical variables into the models (second step), a one-hot-encoded approach was used to encode categorical variables into numbers, assigning the value 1 when the condition is satisfied and 0 when it is not satisfied.
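The repeated k-fold resampling scheme described above can be sketched in plain Python as follows. The study itself relied on the caret R package; this standalone generator is only an illustration, with a hypothetical function name and seed, of how tenfold cross-validation repeated five times yields 50 train/test partitions.

```python
import random
from typing import Iterator


def repeated_kfold(
    n_samples: int, k: int = 10, repeats: int = 5, seed: int = 0
) -> Iterator[tuple[list[int], list[int]]]:
    """Yield (train_indices, test_indices) for repeated k-fold cross-validation."""
    rng = random.Random(seed)
    for _ in range(repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # a new random partition for every repeat
        folds = [indices[i::k] for i in range(k)]  # k folds of near-equal size
        for held_out in folds:
            held_out_set = set(held_out)
            train = [i for i in indices if i not in held_out_set]
            yield train, held_out


# Example with 56 observations (one per plot, used here only as an illustrative size).
splits = list(repeated_kfold(56, k=10, repeats=5, seed=1))
print(f"{len(splits)} resamples; first test fold has {len(splits[0][1])} samples")
```

Within each repeat every observation is held out exactly once, so the averaged error is less dependent on a single random partition.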
RF models' performances were evaluated by calculating the root mean square error (RMSE) and the normalized root mean square error (NRMSE) as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - y_i\right)^2}$$
$$\mathrm{NRMSE} = \frac{\mathrm{RMSE}}{\bar{y}} \times 100$$
where $n$ is the sample number, $x_i$ and $y_i$ are the estimated and measured moisture content, and $\bar{y}$ is the mean of the measured values. The performance metrics were also calculated for different intervals of moisture content and for each Miscanthus hybrid. The moisture content intervals investigated were lower than 30%, between 30% and 60%, higher than 60%, and finally between 10% and 80%. The set size used for each training dataset was reported to compare the metrics of the models. For each model, the RMSE and NRMSE were calculated for each genotype and for the different moisture content intervals to evaluate the models.
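A minimal Python sketch of the two metrics is given here for illustration. It follows the symbol definitions in the text (estimated values, measured values, and the mean of the measured values) and assumes that NRMSE is the RMSE expressed as a percentage of that mean; the function names and the four data points are hypothetical.

```python
import math
from typing import Sequence


def rmse(estimated: Sequence[float], measured: Sequence[float]) -> float:
    """Root mean square error between estimated and measured values."""
    if len(estimated) != len(measured) or len(measured) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = ((x - y) ** 2 for x, y in zip(estimated, measured))
    return math.sqrt(sum(squared_errors) / len(measured))


def nrmse(estimated: Sequence[float], measured: Sequence[float]) -> float:
    """RMSE as a percentage of the mean measured value (assumed normalization)."""
    mean_measured = sum(measured) / len(measured)
    return 100.0 * rmse(estimated, measured) / mean_measured


# Hypothetical moisture content values (%), not taken from the study.
estimated_mc = [52.0, 38.0, 61.0, 47.0]
measured_mc = [50.0, 40.0, 60.0, 50.0]
print(f"RMSE = {rmse(estimated_mc, measured_mc):.2f}%")
print(f"NRMSE = {nrmse(estimated_mc, measured_mc):.2f}%")
```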

senescence dynamics
The moisture content during senescence was estimated from spectral data acquired by UAV using the validated RF model: This approach was selected to add supplementary flights to the dataset without field

Miscanthus biomass
The frequency distribution of measured moisture content and its variation during the two senescence seasons at two locations are shown in Figure 2 and Figure S1. Overall, the peak of the frequency distribution of moisture content values was recorded for all genotypes within the interval between 30% and 60% (Figure 2). M. sin x M. sin showed a left-skewed distribution with relatively high-frequency values for moisture content below 30% (Figure 2). Moisture content loss started at the beginning of December at both locations and for all genotypes (Figure S1). M. sin x M. sac hybrids showed a higher moisture content (+18% and +6%) than M. sin x M. sin hybrids and M. x giganteus, respectively, in both locations from December until harvest in late winter (Figure 2 and Figure S1). On average, M. sin x M. sac hybrids and M. x giganteus were harvested at 45% and 37% moisture content, respectively (Figure 2 and Figure S1). M. sin x M. sin hybrids had an average moisture content at winter harvest of 22%. The dynamics of moisture content during senescence were confirmed by the visual recording of the senescence score based on plant greenness (Figure S2). For all genotypes, the correlation between senescence score and moisture content indicated that moisture content loss starts when senescence score values of 4 are recorded.

| Recursive feature elimination of vegetation indices
The optimal number of vegetation indices (VIs) included in the models to minimize RMSE in the estimation of moisture content was obtained by the recursive feature elimination (RFE) algorithm with repeated cross-validation (Figure 3). RFE analysis showed that using four or fewer VIs led to a moisture content estimation with RMSE values higher than 8% (Figure 3a). With the 0% tolerance threshold, the minimum RMSE (7.4%) was achieved with 30 VIs. However, the use of 20 or more VIs led to a moisture content estimation with a mean RMSE value of 7.4%. On the contrary, with the tolerance thresholds of 1% and 5%, the optimal number of VIs was 14 (RMSE = 7.5%) and 6 (RMSE = 7.8%), respectively (Figure 3a). The tolerance threshold of 1% was chosen as the threshold that maximizes the model's performance with the minimum number of VIs. According to the importance ranking (Figure 3b), 14 VIs were selected for RF model training among the 54 VIs calculated. The 14 VIs were BNDVI, GDVI, PSRI, MCARI/MTVI2, GOSAVI, NGBDI, NLI, GBNDVI, GLI, MCARI/OSAVI2, SIPI, MCARI2, OSAVI2, and GI. The six most important VIs to reach 5% tolerance (RMSE <7.8%) were BNDVI, GDVI, PSRI, MCARI/MTVI2, GOSAVI, and NGBDI (Figure 3b).

| RF model performance and transferability
The performances (RMSE and NRMSE) of the random forest (RF) models were compared among the season-specific datasets of the two locations and against one reference dataset (split into 70/30 training/test) (Table 3). When all the genotypes and all moisture content intervals were considered, the RF model of the reference dataset was the most accurate one among the five models considered in estimating Miscanthus moisture content (RMSE = 6.9% and NRMSE = 14%). The other models achieved lower accuracy values with RMSE ranging from 9.2% to 10.6% and NRMSE from 20.1% to 22.1%. The accuracy of the RF models trained with the season-location-specific datasets and for the intervals of moisture content of 30%-60% and >60% was on average similar (RMSE = 8.5%) to the accuracy of the RF model trained with the reference dataset for the same intervals (RMSE = 6.3%). On the contrary, for the interval of moisture content <30%, the accuracy of the RF models for the season-location-specific datasets (RMSE = 16.4%) was lower than that of the RF model trained with the reference dataset (RMSE = 10.7% for this interval, compared with 6.3% for the higher intervals). The addition of categorical variables (material, hybrid code, and genotype of Table 2) to the reference dataset model of VIs improved the accuracy of moisture content estimation (Figure 4). The single addition of material, hybrid code, or genotype in the model (Figure 4b-d) decreased the RMSE from 6.9% (model with only VIs) to 6.8%, 6.4%, and 5.7%, respectively. The simultaneous addition of three categorical variables to the model achieved the best performance with an RMSE = 5.6% and NRMSE = 11.4% (Figure 4e). Finally, the RMSE of all models was evaluated for each genotype (Figure 4f). The addition of categorical variables decreased the RMSE value with respect to the model with only VIs from 7.6% to 5.6% for the M. x giganteus genotype, from 6.9% to 4.7% for the interspecific M. sin x M. sac genotype hybrids, and from 6.8% to 6.1% for the intraspecific M. sin x M. sin genotype hybrids.

senescence dynamics with multiple UAV flights
The RF model trained with the VIs and the three categorical variables was used to estimate moisture content of Miscanthus hybrids from spectral data of multiple UAV flights at two locations. The generalized additive model (GAM) was applied to time series of moisture content data estimated from the RF model, with M. x giganteus (GRC 9) as reference for estimating significant differences among the hybrids during senescence. M. sin x M. sin hybrids (GRC 1-8) from DOY 280 (mid-early October) showed a constant and significantly lower moisture content than the M. x giganteus hybrid (Figure 5). The first genotype showing a significant difference in moisture content compared to GRC 9 was GRC 5, at DOY 260 (mid-September), while the last was GRC 1, at DOY 312 (mid-early November). Intraspecies M. sin x M. sin hybrids showed the highest variability in moisture content loss during senescence compared to interspecies M. sin x M. sac hybrids. The estimated difference of moisture content at harvest varied from 10.2% for GRC 1 to 14.5% for GRC 6. On the contrary, constant negative differences compared to GRC 9 occurred later in the season (early November) for interspecific M. sin x M. sac hybrids (GRC 10-15). The difference is statistically significant approximately from DOY 295 (mid-late October) for the GRC 10 hybrid and from DOY 314 (mid-early November) for the GRC 13 hybrid. At harvest, the estimated moisture content difference varied from −9.2% for GRC 11 to −10% for GRC 14. The rhizome-based GRC 15 hybrid, a M. sin x M. sac genotype, showed moisture content dynamics similar to those of the other rhizome-based hybrid (GRC 9).
FIGURE 2 Frequency distribution of the moisture content of different Miscanthus genotypes during the two seasons and at two locations
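The day of year from which a hybrid differs persistently from the reference can be read from the confidence band of an estimated difference curve. The Python sketch below illustrates one possible rule, assumed here for illustration only: the onset is the first date after which the pointwise confidence interval of the difference excludes zero on every remaining date. It does not fit a GAM; the function name, the dates, and the interval bounds are hypothetical.

```python
from typing import Optional, Sequence


def onset_of_persistent_difference(
    doys: Sequence[int], lower: Sequence[float], upper: Sequence[float]
) -> Optional[int]:
    """First DOY from which the interval [lower, upper] excludes zero until the end."""
    onset = None
    # Walk backward from harvest and stop at the first date whose interval contains zero.
    for doy, lo, hi in zip(reversed(doys), reversed(lower), reversed(upper)):
        if lo > 0 or hi < 0:
            onset = doy
        else:
            break
    return onset


# Hypothetical difference curve (hybrid minus reference, in moisture content %).
doys = [250, 265, 280, 295, 310, 325]
ci_lower = [-4.0, -6.0, -9.0, -12.0, -14.0, -15.0]
ci_upper = [3.0, 1.0, -1.0, -4.0, -6.0, -7.0]

onset = onset_of_persistent_difference(doys, ci_lower, ci_upper)
print(f"Persistent difference from DOY {onset}")
```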

| DISCUSSION
The characterization of moisture content dynamics of Miscanthus biomass is important for determining the harvest time and selecting the most suitable genotypes in each environment. This study estimated the moisture content of 14 contrasting Miscanthus hybrids by combining unmanned aerial vehicle (UAV) remote sensing and machine learning. The random forest (RF) model was trained with moisture content values measured directly in each trial plot, UAV multispectral data (the vegetation indices), and categorical variables of Miscanthus hybrids (material, hybrid code, and genotype). The time series of the moisture content values estimated by the RF model from VIs derived from multiple UAV flights was used for phenotyping senescence dynamics and identifying the stay-green (SG) trait of Miscanthus hybrids using the generalized additive model (GAM).

| Selection of multispectral vegetation indices for Miscanthus moisture content estimation
Increasing the number of VIs from 1 to 14 improved the RF model's accuracy and decreased the RMSE from 10% to 7.5% (Figure 3a). Generally, the estimation of crop parameters via multiple VIs is affected by data redundancy and multicollinearity among some vegetation indices (VIs) (Yue et al., 2018). The use of the recursive feature elimination (RFE) algorithm proved to be a suitable approach to minimize RMSE while reducing the noise effect caused by data redundancy and multicollinearity, as suggested by Han et al. (2019) and Anderegg et al. (2020). This study showed that the three most important VIs for estimating moisture content were VIs based on blue (BNDVI), green (GDVI), and red-edge (PSRI) spectral bands (Figure 3b). Zhu et al. (2019) found that the blue band is sensitive to changes in carotenoid content and the green and red-edge bands are sensitive to changes in chlorophyll content. VIs based on these spectral bands have indeed been used to study crop senescence dynamics (Anderegg et al., 2020; Peñuelas & Inoue, 1999). The blue band proved to be the most important variable for predicting harvest date (pod maturity) in soybean (Yu et al., 2016). Anderegg et al. (2020) reported that the time series of PSRI could accurately track senescence dynamics of the wheat canopy and replace visual scorings. Furthermore, the SIPI was strongly correlated with relative water content (RWC) and can indirectly evaluate leaf water stress (Peñuelas & Inoue, 1999). This study also confirmed that the VIs selected by the RFE algorithm and used in the RF model were sensitive to changes in the chlorophyll/carotenoid ratio during senescence. Finally, although no VIs based on the SWIR band were used in this study, the combination of multiple VIs based on VIS-NIR images was shown to compensate for the lack of the SWIR band, which is known to predict crop moisture content well when integrated into VIs such as NDWI (Zhang & Zhou, 2019).
Reference dataset is composed of 30% of the initial dataset (both locations and seasons) used for the validation and 70% used for RF model training.

| Moisture content estimation with a machine learning algorithm
This study estimated the moisture content with the RF model, trained with a wide range of genotypes, across two senescence seasons and at two different locations, differing strongly in soils and slightly in climate. These differences, as suggested by Maxwell et al. (2018), help to assess the RF model transferability. The transferability of the moisture content estimation models was evaluated by splitting the moisture content dataset into five test datasets. The performance metrics of the RF models showed that a good accuracy (6.9% of RMSE and 14.0% of NRMSE) was achieved when all the genotypes and all moisture content intervals were considered in the models (Table 3). Similar results were reported by Li et al. (2021), who estimated the moisture content of three tree species and achieved an NRMSE between 8.6% and 13.9%. The models evaluated to estimate the moisture content might be affected by estimation errors in some moisture content intervals due to limits in the range of data used to train the model (Shah et al., 2019). Indeed, small increases in RF model performance were found when the models were trained with the specific season and location datasets. This difference is due to the different accuracy of the models when the moisture content is <30%. During the two seasons, many hybrids did not reach such low moisture content, and thus, the training set size for this interval was smaller.
To assess the performance of the models in identifying the optimal harvest dates based on moisture content at different endpoints of drying, the moisture content dataset was divided into different intervals (<30%, 30%-60%, >60%, and 10%-80%). The optimal moisture content for the Miscanthus winter harvest is considered to be at or below 20% (Lewandowski et al., 2016) in order to avoid self-ignition of biomass, minimize transport costs, and increase combustion efficiency (Robson et al., 2011). In this study, the novel interspecies seed-based M. sin x M. sac hybrids in particular rarely reached a moisture content lower than 30% at harvest (Figure 2), while M. sin x M. sin in some cases dried down to 10%. In the low moisture content interval (<30%), a large difference in RMSE was found between the model trained with the reference dataset and those trained on the season-location-specific datasets (Table 3). These results indicate that the tested models cannot be transferred with good accuracy to locations and/or growing seasons where the biomass of these genotypes dried to a moisture content <30%. The low transferability of RF beyond the extreme values of the training data range confirmed that this is one of the main limits of the RF model (Johansen et al., 2020; Vuolo et al., 2013). On the contrary, the RF models were transferable to different locations and growing seasons for moisture content values ranging between 30% and 60% (Table 3). The training set size and the moisture content distribution during senescence were confirmed to be the most important dataset characteristics for achieving good model performance (Millard & Richardson, 2015) and transferability (Johansen et al., 2020).
The addition of categorical variables to the RF model improved the estimation of moisture content. Introducing three categorical variables (material, hybrid, and genotype) decreased the RMSE more than adding material type alone (Figure 4b,e). The M. sin x M. sac and M. x giganteus genotypes showed the largest improvement in RMSE from the addition of these categorical variables (Figure 4f). The data imbalance in the "hybrid" categorical variable among control M. x giganteus (n = 1), interspecies (n = 4), and intraspecies (n = 8) hybrid genotypes could have caused these differences in model performance.
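One way to picture how categorical variables can sit alongside VIs in a model input is one-hot encoding, sketched below. This is an assumption for illustration: the section does not say how the categorical variables were encoded, and some RF implementations handle factors natively. The level names, VI values, and function names are hypothetical placeholders.

```python
def one_hot(value, levels):
    """Encode one categorical value as a 0/1 list, one position per known level."""
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    return [1.0 if value == level else 0.0 for level in levels]


# Level names are hypothetical placeholders, not the labels used in the study.
LEVELS = {
    "material": ["rhizome", "seed"],
    "hybrid": ["giganteus", "interspecies", "intraspecies"],
    "genotype": ["G1", "G2", "G3"],
}
VI_NAMES = ["BNDVI", "GDVI", "PSRI"]


def build_features(sample, categorical=("material", "hybrid", "genotype")):
    """Concatenate the VI values with one-hot codes for the chosen categoricals."""
    row = [float(sample[name]) for name in VI_NAMES]
    for name in categorical:
        row.extend(one_hot(sample[name], LEVELS[name]))
    return row


# Hypothetical plot-level sample, for illustration only.
sample = {"BNDVI": 0.42, "GDVI": 0.18, "PSRI": 0.07,
          "material": "seed", "hybrid": "interspecies", "genotype": "G2"}

vis_only = build_features(sample, categorical=())
material_only = build_features(sample, categorical=("material",))
all_three = build_features(sample)
print(len(vis_only), len(material_only), len(all_three))  # 3 5 11
```

The three feature rows mirror the comparison in the text: VIs alone, VIs plus material type, and VIs plus all three categorical variables.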
Another limitation of the RF model developed in this study lies in the fact that it relies on multiple VIs calculated from specific multispectral bands. Our RF model might therefore not reach the same accuracy if the same VIs are calculated from spectral data acquired with different multispectral cameras operating within different band intervals. This calls for the development of algorithms able to overcome these differences in the spectral data through advanced normalization and calculation procedures of VIs from different sensors (Emilien et al., 2021; Hoque & Phinn, 2018).
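The sensor dependence described above can be illustrated with BNDVI, one of the VIs selected in this study, using its standard definition (NIR - Blue) / (NIR + Blue). The two cameras and their reflectance values below are hypothetical; they only show how small band-specific differences in reflectance shift the index value that would be passed to a trained model.

```python
def bndvi(nir, blue):
    """Blue normalized difference vegetation index: (NIR - Blue) / (NIR + Blue)."""
    total = nir + blue
    if total == 0:
        raise ValueError("NIR + Blue must be non-zero")
    return (nir - blue) / total


# Hypothetical reflectances of the same canopy recorded by two cameras whose
# blue and NIR bands cover slightly different wavelength intervals.
camera_a = {"nir": 0.45, "blue": 0.05}
camera_b = {"nir": 0.41, "blue": 0.07}

index_a = bndvi(**camera_a)
index_b = bndvi(**camera_b)
print(round(index_a, 3), round(index_b, 3), round(index_a - index_b, 3))
```

A model trained on values from one camera would receive systematically shifted inputs from the other, which is why cross-sensor normalization is needed before the model is reused.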

| Phenotyping stay-green trait via UAV remote sensing to capture genotypic variation during senescence
This study demonstrated that high-throughput plant phenotyping (HTPP) of contrasting Miscanthus hybrids is possible by combining multiple UAV flights and GAM modeling. Stay-green (SG) is an important phenotypic trait when evaluating the senescence of novel Miscanthus hybrids. The goal of plant breeders is to obtain high-yielding plants with high biomass quality. In Miscanthus, delayed senescence is expected to increase yields, while early senescence is expected to increase biomass quality (Robson et al., 2011). In our environments, senescence of M. sin x M. sin hybrids led to drier biomass (22% mean moisture content in late February) than that of commercially available rhizome-based hybrids (GRC 9-15 with 37%), while M. sin x M. sac hybrids showed an SG trait, with an average moisture content of 45% until harvest. These findings confirm that biomass with low moisture content at harvest is usually related to early senescence in Miscanthus, as found by Robson et al. (2011). However, results opposite to ours were reported by Nunn et al. (2017), who observed no relationship between early senescence and low moisture content at harvest in different locations across Europe.
Mild cold conditions during the autumn-winter period affected the start of senescence and the dynamics of moisture loss until the late winter harvest in all Miscanthus hybrids. The overwintering conditions (e.g., number and frequency of killing frosts) between the start of senescence and harvest time have a greater effect on moisture content than senescence itself (Sarath et al., 2014). That was the case in our two southern European locations, where a reduced frequency of killing frost days and the absence of prolonged freezing periods in late autumn and early winter in the 2019-2020 seasons (Table 1) might not have induced complete senescence in the M. sin x M. sac hybrids, leading to a higher moisture content at harvest. During the first years after establishment, Miscanthus might show reduced senescence (Clifton-Brown & Lewandowski, 2000) due to changes in the source-sink dynamics of young Miscanthus plants (Boersma et al., 2015). However, stand age did not affect the delayed senescence observed in our case, since measurements were made on mature plantations in their second and third year.
Genotypic variations in flowering and senescence times are instead two key explanatory factors of the SG trait observed in perennial crops. The relationship between flowering and senescence in Miscanthus has been proposed to promote nutrient remobilization, and hence biomass quality improvement (Jensen et al., 2016). GAM applied to the moisture content values estimated by the RF model from multiple UAV flights helped us to capture differences in senescence dynamics among contrasting Miscanthus hybrids (Figure 5; Nunn et al., 2017). Jensen et al. (2016) found, for similar contrasting hybrids, that the rate of nitrogen and phosphorus remobilization to underground rhizomes followed the same trend as the moisture content loss observed in our study. The absence or delay of flowering in M. sin x M. sac and rhizome-based hybrids, respectively, may have caused the delayed senescence that was observed as the SG trait in this study. As a consequence, these genotypes were harvested at a higher moisture content (Figure S1) and likely a higher nutrient content than M. sin x M. sin hybrids. The high variability among Miscanthus hybrids in moisture content loss dynamics during senescence (Figure 5) might be further explained by the wider geographical distribution of M. sinensis than of M. sacchariflorus (Clifton-Brown & Hastings, 2015). This may have produced a higher genetic variation of the phenotypic traits due to the hybridization among M. sinensis species (Robson et al., 2011).

FIGURE 5 Senescence dynamics of the different Miscanthus hybrids, expressed as the difference in estimated moisture content from the reference hybrid M. x giganteus-GRC 9 (dashed black line). The moisture content time series was estimated using a GAM. Solid and dashed coloured lines denote, respectively, significant (p < 0.05) and non-significant differences between the corresponding hybrid and the reference hybrid.
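One common way to formulate a GAM that compares each hybrid with a reference hybrid is sketched below. The exact model specification used in this study is not given in this section, so the structure and every symbol are assumptions introduced for illustration: $MC_{h,i}$ is the estimated moisture content of hybrid $h$ at flight date $t_i$, $f_{\mathrm{ref}}$ is the smooth senescence curve of the reference hybrid, and $\beta_h$ and $f_h$ are the hybrid-specific offset and smooth deviation (both zero for the reference).

```latex
\begin{align}
MC_{h,i} &= \beta_0 + f_{\mathrm{ref}}(t_i) + \beta_h + f_h(t_i) + \varepsilon_{h,i},
\qquad \varepsilon_{h,i} \sim \mathcal{N}(0, \sigma^{2}) \\
\Delta_h(t) &= \beta_h + f_h(t)
\end{align}
```

In such a formulation, $\Delta_h(t)$ is the difference curve between hybrid $h$ and the reference, and a difference would be judged significant at times where its confidence band excludes zero.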
Additionally, the cold resistance trait likely depends on the origin and in situ environmental characteristics of the genetic accession of the Miscanthus species. In fact, results opposite to ours have been reported, with different M. sin x M. sin hybrids showing delayed senescence with respect to M. x giganteus and M. sacchariflorus hybrids.
In conclusion, this study demonstrated that the moisture content of Miscanthus can be accurately estimated via a machine learning algorithm applied to multiple VIs calculated from UAV-based VIS-NIR images. The RF model developed on different genotypes showed good transferability to multiple locations and seasons when moisture content ranged from 30% to 60%. Further training datasets are required to extend the transferability and confirm the same performance of the RF model at lower moisture content values (10%-30%). For the first time, we showed that the combination of machine learning (ML) and GAM, applied to time series of moisture content values estimated from VIs derived from multiple UAV flights, is a powerful tool for high-throughput plant phenotyping. Remote sensing can be used for phenotyping in future advanced breeding programs of Miscanthus. The possibility of distinguishing the SG trait of novel Miscanthus hybrids via remote sensing can deepen our understanding of the key factors mediating the induction of early or delayed senescence. Our study focused on the use of ML algorithms to estimate moisture content during Miscanthus senescence, but we believe that the same methodological approach can be used to estimate other phenological traits or yield components in similar and/or different crops. This is particularly relevant for upscaling models from experimental plot to field scale by using satellites. Satellites can collect data from many fields simultaneously and with a larger number of spectral bands, such as the SWIR band, which could ultimately support moisture content and yield estimation with high precision and resolution. ML algorithms could be applied in remote sensing to develop satellite and UAV applications beneficial to sustainable crop management, for example, in the case of Miscanthus, to identify the optimal harvest date or to predict commercial yield (quantity and quality).
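The harvest-date application mentioned above could, for instance, be sketched as a threshold search over a time series of estimated moisture content. This is an illustration rather than a method from the study: the 20% threshold follows the value cited earlier from Lewandowski et al. (2016), while the function name, the linear interpolation between flight dates, and the two example series are assumptions.

```python
import math
from datetime import date, timedelta


def first_date_below(dates, moisture, threshold=20.0):
    """Return the first date on which moisture is <= threshold, or None.

    Moisture between consecutive flights is linearly interpolated (an
    assumption) and the crossing is rounded up to the next whole day.
    """
    if len(dates) != len(moisture) or not dates:
        raise ValueError("dates and moisture must be non-empty and equal length")
    if moisture[0] <= threshold:
        return dates[0]
    for i in range(1, len(dates)):
        m0, m1 = moisture[i - 1], moisture[i]
        if m0 > threshold >= m1:
            fraction = (m0 - threshold) / (m0 - m1)
            days = (dates[i] - dates[i - 1]).days * fraction
            return dates[i - 1] + timedelta(days=math.ceil(days))
    return None


# Hypothetical flight dates and estimated moisture contents (%).
flights = [date(2020, 1, 1), date(2020, 1, 21), date(2020, 2, 10)]
early_senescing = [40.0, 30.0, 15.0]
stay_green = [55.0, 48.0, 45.0]

print(first_date_below(flights, early_senescing))  # 2020-02-04
print(first_date_below(flights, stay_green))       # None
```

A return value of None corresponds to a hybrid that never dries to the target within the monitored period, as would be expected for a stay-green type.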

ACKNOWLEDGMENTS
This study is part of the GRACE project which has received funding from the Bio-based Industries Joint Undertaking (JU) under the European Union's Horizon 2020 research and innovation programme under grant agreement No 745012. The JU receives support from the European Union's Horizon 2020 research and innovation programme and the Bio-based Industries Consortium. We especially thank "Fondazione Eugenio e Germana Parizzi" for their financial support.

CONFLICTS OF INTEREST
The authors declare no conflicts of interest.

DATA AVAILABILITY STATEMENT
The data that support the findings of this work are openly available in the Zenodo data repository (https://zenodo.org/) at https://doi.org/10.5281/zenodo.6053788.