In‐field soil spectroscopy in Vis–NIR range for fast and reliable soil analysis: A review

In‐field soil spectroscopy represents a promising opportunity for fast soil analysis, allowing the prediction of several soil properties from one spectral reading representing one soil sample. This facilitates data acquisition from large amounts of samples through its rapidity and the absence of required chemical processing. This is of particular interest in agriculture, where the chance to retrieve information from soils directly in the field is very appealing. This review is focused on in‐field visible to near infrared (Vis–NIR) spectroscopy (350–2500 nm), aimed at analysing soils directly in the field through proximal sensing. The main scope was to explore the available knowledge to identify existing gaps limiting the reliability and robustness of in‐field measurement, to foster future research and help transition towards the practical application of this technology. For this purpose, a literature review was performed, and surveyed information encompassed sensor range, carrier platforms in use, sensor type, distance to the soil sample, measurement methodology, measured soil properties and soil management, among many others. From this, we derived a list of tools in use with their spectral measurement properties, including the potential cross‐calibration with soil spectral libraries from laboratory spectroscopy of soil samples and potential measured target soil properties. Different instruments and sensors used to measure at varying wavelength ranges and with different spectral qualities are available for a large range of prices. The most frequently analysed soil properties included soil carbon contents (soil organic carbon, soil organic matter, total carbon), texture (clay, silt, sand), total nitrogen, pH and cation exchange capacity. Future perspectives comprise the implementation of larger databases, including different instruments and cropping systems as well as methodologies combining existing knowledge regarding laboratory spectroscopy with in‐field methods. 
The authors highlight the need for a broadly accepted measurement protocol for in‐field soil spectroscopy, fostering harmonization and standardization and consequently a more robust application in practice.


| INTRODUCTION
Capturing the spatial variability of soil properties across landscapes is necessary for the efficient management of natural resources in applications such as precision agriculture, non-point source pollution modelling and planning of resource use (Waiser et al., 2007). Current tiered investigation approaches and sampling strategies can be improved by using proximal sensing. Proximal soil sensing refers to field-based methods that sense the soil in proximity to the ground. These range from methods operating outside the soil, that is, within a maximum distance of two metres (Viscarra Rossel et al., 2011) and with a probe that does not enter the measured soil volume, to methods in close contact with the soil, including measurements facilitated by the shank of a mobile platform (Christy, 2008) and contact probes (Metzger et al., 2023). Both contact and distance measurements can be made in static or mobile mode, moving over or through the soil, which affects the measurement settings and the resulting data quality. The use of proximal soil sensing techniques could increase the number of measured soil samples to the level necessary for an adequate characterization of soil heterogeneity at field scale. Thus, proximal sensing directly in the field, although challenging, has gained interest due to its potential advantages (Kuang et al., 2012). These technologies involve either on-the-go sensors mounted on agricultural vehicles or hand-held instruments, both of which can be used for site-specific management (Christy, 2008; Metzger et al., 2023). Due to the high sampling density they allow, these sensors are considered more effective in capturing field variability, hence addressing the problem of selecting the correct soil sampling strategy to ensure representative soil samples (Figure 1).
In this framework, soil spectroscopy offers a promising option for soil analysis, with advantages such as the prediction of several soil properties from just one spectral measurement, facilitating data acquisition from large numbers of samples through its rapidity and the absence of required chemicals or extractions (Metzger et al., 2023). In addition, handling is simple, sample presentation is flexible with modern instrumentation and measurements can be performed in a totally non-destructive way: without contact if required, or with minimum invasion when taking soil cores or changing sample presentation on-site without taking samples away (Minasny & McBratney, 2008; Viscarra Rossel & Behrens, 2010). Spectral measurements therefore offer a fast and efficient option to identify certain properties of objects and materials, here soil, in a non-destructive or minimally invasive way. This is of particular interest in an agricultural context, where information from soils and plants is needed regularly to support management decisions. Usually, this involves soil sampling and laboratory analyses that require laboratory equipment and consumables. The option to do such measurements in the field and to retrieve information shortly after, if not instantly, is therefore very intriguing.

| IN-FIELD SOIL SPECTROSCOPY
Highlights
• An overview of recent in-field soil spectroscopy studies is given, identifying knowledge gaps
• Vis-NIR was confirmed to have a very good potential for in-field measurements
• A summary of used tools, measuring practices and predicted soil properties is provided
• Future challenges such as using soil spectroscopy in different management systems and the combination of laboratory and in-field methods are highlighted

Soil spectroscopy works due to energy-matter interactions: a material can reflect, absorb, scatter and emit electromagnetic radiation in a characteristic manner depending on its molecular composition and structure, resulting in a unique spectral signature (Shaw & Burke, 2003). Infrared diffuse reflectance spectroscopy is based on the principle that radiation containing all relevant frequencies in a certain range is directed to the sample. The radiation will cause individual molecular bonds in soil constituents to vibrate, either by bending or stretching, and energy will thereby be absorbed. A specific bond in a specific chemical context will absorb a specific energy quantum. As the energy quantum is directly related to frequency (and inversely related to wavelength), energy at different spectral bands will be absorbed to various degrees depending on the soil composition, which in turn will result in a corresponding reflectance spectrum (Horta et al., 2015; Miller, 2001; Stenberg et al., 2010). Reflectance is typically measured in relation to a reference material and will, in addition to what is absorbed, be affected by scatter, which is a function of soil composition, for example, mineralogy, texture and structure (Nocita et al., 2015). Since a rapid and cost-effective evaluation of soil properties is essential for monitoring soil conditions, and conventional laboratory measurements are costly and time-consuming, the latter cannot be considered appropriate for large datasets. In recent decades, the use of soil reflectance spectroscopy, particularly in the infrared (IR) and visible-near infrared (Vis-NIR) ranges, has become a powerful technique to simplify soil studies (Barra et al., 2021). Soil spectroscopy is considered a rapid, cost-effective, quantitative and environmentally friendly technique, which can provide hyperspectral data with numerous wavebands and various waveband widths, both in the laboratory and in the field. It has been evaluated as a possible additional method to laboratory analysis for monitoring soil parameters, to address the need for continuous information about soils, while reducing the cost of soil analyses (Li et al., 2022). The non-destructive nature of such a technique allows simultaneous and repeatable measurements, representing a significant advantage over conventional laboratory measurements (Pasquini, 2018). In situ soil reflectance spectroscopy requires proper environmental conditions, such as non-rainy days to work in the field, and various pretreatment methods to mitigate the effect of soil moisture content, soil roughness and vegetation cover (Gehl & Rice, 2007).
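The statement that reflectance is measured in relation to a reference material can be illustrated with a minimal sketch. It assumes raw detector counts for the soil sample and for a reference panel, and it optionally subtracts a dark reading; the dark correction, the function name and all numbers are assumptions made for illustration and are not taken from the reviewed studies.

```python
def relative_reflectance(sample, reference, dark=None, reference_reflectance=1.0):
    """Convert raw counts to reflectance relative to a reference panel.

    sample, reference, dark: sequences of raw counts, one value per band.
    dark is an optional dark-signal reading (an assumption of this sketch).
    reference_reflectance: known reflectance of the panel (1.0 = ideal white).
    """
    if dark is None:
        dark = [0.0] * len(sample)
    if not (len(sample) == len(reference) == len(dark)):
        raise ValueError("all readings must cover the same bands")
    reflectance = []
    for s, r, d in zip(sample, reference, dark):
        denominator = r - d
        if denominator <= 0:
            raise ValueError("reference signal must exceed the dark signal")
        reflectance.append(reference_reflectance * (s - d) / denominator)
    return reflectance


# Hypothetical counts for three bands (illustrative values only)
sample_counts = [1200.0, 2100.0, 3100.0]
white_counts = [4100.0, 4100.0, 4100.0]
dark_counts = [100.0, 100.0, 100.0]
print(relative_reflectance(sample_counts, white_counts, dark_counts))
```

Because every band is divided by the reference signal recorded under the same illumination, changes in lamp or sunlight intensity between reference and sample readings translate directly into reflectance errors, which is one reason why stable lighting matters in the field.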
Different instruments and sensors are available for spectral measurements; consequently, different wavelength or frequency ranges, channel bandwidths and channel numbers are used (Pandey et al., 2020). Wavelength ranges in use cover the gamma range, ultraviolet, visible, NIR, short wave infrared (SWIR), medium infrared (MIR), IR and microwaves or frequencies in the radar range. The exact definitions and use of these ranges differ in the literature (Pandey et al., 2020; Thenkabail & Lyon, 2011). For the sake of simplicity, and because it is used very often, we will focus on Vis-NIR, ranging from 350 to 2500 nm and consisting of the visible (350-750 nm), NIR (750-1100 nm) and SWIR (1100-2500 nm; Ng et al., 2022; Rodrigues et al., 2022) ranges, referred to from here onwards as full range. Depending on the number of single channels available, sensors are defined as multispectral (3-40) or hyperspectral (>40). Vis-NIR spectroscopy, a rapid and easy-to-use technique, has a very good potential for in-field measurements, due to its simplicity, robustness and flexibility, allowing the estimation of several soil chemical and physical properties, such as texture (especially the clay fraction), organic and total carbon, carbonate and cation exchange capacity (CEC; Stenberg et al., 2010). To make full use of the advantages of Vis-NIR spectroscopy, measurements directly in the field should be strived for, as time-consuming efforts like sampling, packing and marking samples that need to be transported to a laboratory for further pretreatments before analysis can be avoided. Robust field instruments are now available and are becoming more affordable and user-friendly (Gullifa et al., 2023). Achieving successful field measurements would be of great benefit to agriculture, as it could facilitate denser sampling and an improved spatial resolution, supporting the delineation of management zones and the application of variable rate inputs (Sarkhot et al., 2011).
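The range boundaries and channel-count thresholds used in this review can be written down as a small lookup, sketched here under two assumptions: shared boundaries (750 and 1100 nm) are assigned to the longer-wavelength region, and the function names are hypothetical. Since the definitions differ in the literature, the constants should be adapted to the convention in use.

```python
# Region boundaries in nm as used in this review; other sources differ.
REGIONS = [("visible", 350, 750), ("NIR", 750, 1100), ("SWIR", 1100, 2500)]
FULL_RANGE_END = 2500


def spectral_region(wavelength_nm):
    """Return the name of the region that contains a wavelength."""
    for name, lower, upper in REGIONS:
        # Upper bounds are exclusive (assumption), except the end of the full range.
        if lower <= wavelength_nm < upper:
            return name
        if wavelength_nm == upper == FULL_RANGE_END:
            return name
    return "outside full range"


def sensor_class(n_channels):
    """Classify a sensor by its number of channels (3-40 vs. more than 40)."""
    if n_channels > 40:
        return "hyperspectral"
    if n_channels >= 3:
        return "multispectral"
    return "fewer than three channels"


print(spectral_region(550), spectral_region(900), spectral_region(2200))
print(sensor_class(3), sensor_class(2151))
```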
Sensors can be relatively small, and even smaller sensor heads can be attached to the actual sensor by fibre optics. There are also several commercial and semi-commercial instruments available for point and on-the-go analyses (Ben-Dor et al., 2017; Wetterlind et al., 2015). Despite this, most of the studies and applications focus on stationary sampling and laboratory analyses of dried and sieved samples. During the last decades, much focus has been put on the development of large regional, national or even transnational soil spectral libraries (SSLs; Nocita et al., 2015; Orgiazzi et al., 2018). A representative database, such as an SSL, with soil spectra and known properties analysed by a reference method, is required to calibrate the spectra to the properties of interest such as soil organic carbon, as shown by a mapping approach using the LUCAS topsoil database (Castaldi et al., 2019; Nocita et al., 2014). However, these lab-derived SSLs are entirely based on spectra measured under very controlled conditions from dried and sieved soils. As spectra are influenced by both water content (Lobell & Asner, 2002) and structure (Udelhoven et al., 2003), the use of SSLs and already calibrated prediction models cannot be expected to work without correcting for the discrepancy between field and laboratory spectra. In addition, moist or wet samples for calibration typically result in reduced prediction performance (Knadel et al., 2014; Stenberg, 2010). One reason for this is probably that moisture influences the entire spectrum to some extent due to scattering, and more strongly in specific spectral regions due to absorption. This tends to override less pronounced spectral features of other soil constituents (Knadel et al., 2022; Lobell & Asner, 2002; Stenberg, 2010). As moisture content, and thus also the moisture effects, will vary between samples, an additional dimension is added for moist samples, making calibration work overwhelming. Despite the challenges involved, applying SSL-based calibrations to field-sampled spectra appears to be the most realistic and reasonable way forward; several large SSLs are already built or under development (Dangal et al., 2019; Demattê et al., 2022). However, the challenges of dealing with discrepancies between field and laboratory spectra need to be addressed. Two paths can be followed to harmonize field and laboratory spectra. First, measures in the field can be taken to produce spectra that resemble the corresponding laboratory spectra to the largest possible degree. Second, mathematical algorithms and procedures can be adopted to reduce the influence of moisture and texture. Algorithms addressing moisture effects were recently reviewed by Knadel et al. (2022).
This literature review intends to give an overview of current measurement techniques and studies, to gain a better understanding of past and ongoing research and relevant debates about in-field soil spectroscopy. The review aimed to explore the knowledge available to identify gaps limiting in-field measurement accuracy and robustness, including the potential cross-calibration with SSLs from lab spectroscopy of soil samples. Therefore, this review contains a list of used tools with their spectral and measured properties. Moreover, it emphasizes current in-field application methods, existing studies and commercial applications using this technology for agricultural purposes, including an overview of target properties measurable with soil spectroscopy in the field, potential applications and use cases.

| Criteria for literature search and information extraction
A literature search was carried out from January to March 2022 on Scopus, ScienceDirect and Web of Science databases, then updated in May, June and July 2023. The terms used for the queries in Title and Keywords fields were the following: Soil proximal sensing AND in situ AND NIR AND Soil spectroscopy. Other additional keywords were soil, Vis-NIR, field, lab, sample preparation and subsampling. Duplicates were removed, and two papers were discarded after being considered non-relevant. The final list of publications resulting from the search included 103 references. Five of the analysed references were reviews, while 13 papers dealt only with dried and sieved samples. Before performing analyses on the dataset, the references dealing with laboratory analyses only were taken out and were only used for the discussion and understanding of the methodologies. The final number of references considered was 90; the complete list is provided as Supplementary Material (S1).
For the selected literature items, we collected information on five levels (Figure 2):
1. 'Bibliographic' is the bibliographic information about each article.
2. 'Main topic of the study' is whether the study was about in-field or lab measurements, if a comparison is performed, if it is a research paper or a review, if it is a multi-sensor study, and so forth. To cover all the available information, the 'other' categories were included.
3. 'Target soil property' is the property (or properties) studied, number and type.
4. 'Sensor characteristics' are the characteristics of the sensor used and of the scanning method.
5. 'Conclusions' are the main conclusions of the papers summarized.
In Table 1, the categories used for analysing the literature are reported. The complete list of extracted information, including the discussed categories but also much more related information, can be found in the Supplementary Material (S2).
To clarify the statistical terms used here: 'reliability' refers to the consistency of a measure. One can consider three types of consistency: over time (repeatability), across items (internal consistency) and across different researchers (inter-rater reliability). In statistics, a measure is said to have a high reliability if it produces similar results under consistent conditions. In practical terms, it is the degree to which data, and the insights gleaned from them, can be trusted and used for effective decision-making. Reliability is the quality of being dependable, trustworthy or of performing consistently well. Reliability requires working as expected in normal, well-known circumstances. Robustness is the capability of performing without failure under a wide range of possible conditions (Jones, 2021).

| Grouping of papers
For this study, the selection of publications about soil spectroscopy comprised studies ranging from 2006 to 2023. The maximum number of articles per year (16) was in 2022, followed by 2015 (10) (Figure 3). The published articles were spread over a total of 35 academic journals and 1 workshop proceeding. It can be assumed that the number of publications per year will grow further after 2023, reflecting the topicality and importance of research in the field of soil spectroscopy. The covered studies stemmed from 23 countries spanning all continents and one continental region (Antarctica), with the highest number of papers coming from China (14), Germany (11) and the United States (8).
Most of the investigated studies focused on in-field measurements only (28), but other studies also compared field and lab measurements (23), integrating acquisition methods, sample preparation and algorithms for analysis.

TABLE 1 Overview of the grouping categories used for the data analysis.

Only five papers were literature reviews; the others were original research papers. The investigated soil properties were grouped into different classes, of which soil organic carbon/soil organic matter (SOC/SOM) estimation was by far the most investigated subject (31 papers; Figure 4). Chemical, physical and hydrological properties were also studied by means of soil spectroscopy, often to support soil classification and mapping purposes. With lower occurrences, the other investigated properties were pH, total N, nutrients (macro and micro), texture and granulometric fractions, clay content, hydraulic features, carbonates, CEC, bulk density, electrical conductivity, contaminants and heavy metals and mineralogy. Soil classes, soil respiration, microbial biomass and root density were each addressed in only one article. Prediction accuracy was also a target of several studies, as well as the effect of moisture on prediction accuracy. Three papers dealt with soil salinity assessment. One paper focused on archaeological soil characterization. Other papers compared multiple sensors and their capacity to be used for the prediction of soil parameters, and examined the possibilities of using existing SSLs to predict from field-collected spectra or of developing new devices to scan soils with Vis-NIR (Figure 5). There are many ways of grouping or clustering large numbers of papers, and with this review, we tried to analyse contrasting perspectives, which comprise the location of the study and the position of the sensor with respect to the soil target, sensors used and analytical approaches, among others.

| Sensors and instruments
Various Vis-NIR spectrometers were used in the reviewed studies, varying in the spectral range, spectral resolution, and so forth, reflecting a rapid development of portable spectral devices, with a likely increasing potential for in-field use. The spectral resolution (bandwidth) of the hyperspectral devices ranged from 3 to 16 nm depending on the instrument and wavelength region, but spectra are often produced with a resampled resolution of 1 or 2 nm, which in the case of full range spectrometers resampled to 1 nm results in more than 2000 data points. For multispectral devices, the number of bands can range from only three in the visible range (Red, Green, Blue: RGB) up to 40 bands including the red edge and NIR range (Biney et al., 2023; Bockholt, 2020; Fitzgerald et al., 2006).
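The number of data points that follows from resampling can be checked with a short sketch. It assumes simple linear interpolation onto a regular grid, which is an illustrative choice and not necessarily what instrument software does; the function name and the synthetic spectrum are hypothetical. For the full range of 350-2500 nm, a 1 nm grid gives 2151 points and a 2 nm grid gives 1076 points.

```python
def resample_linear(wavelengths, values, start, stop, step):
    """Linearly interpolate a spectrum onto a regular grid (start and stop inclusive).

    wavelengths must be sorted in ascending order and contain at least two points.
    """
    n_points = int(round((stop - start) / step)) + 1
    grid = [start + i * step for i in range(n_points)]
    resampled = []
    j = 0  # index of the left neighbour in the native sampling
    for w in grid:
        if w < wavelengths[0] or w > wavelengths[-1]:
            raise ValueError("grid extends beyond the measured range")
        while j < len(wavelengths) - 2 and wavelengths[j + 1] < w:
            j += 1
        w0, w1 = wavelengths[j], wavelengths[j + 1]
        t = (w - w0) / (w1 - w0)
        resampled.append(values[j] + t * (values[j + 1] - values[j]))
    return grid, resampled


# Synthetic spectrum sampled every 10 nm (illustrative values only)
native_wavelengths = list(range(350, 2501, 10))
native_values = [0.1 + 0.0001 * (w - 350) for w in native_wavelengths]

grid_1nm, spectrum_1nm = resample_linear(native_wavelengths, native_values, 350, 2500, 1)
grid_2nm, spectrum_2nm = resample_linear(native_wavelengths, native_values, 350, 2500, 2)
print(len(grid_1nm), len(grid_2nm))
```

Resampling increases the number of values per spectrum but does not add information beyond the native bandwidth of the instrument, so neighbouring points in a resampled spectrum are strongly correlated.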
Most studies used at least one full-range spectrometer (51) or multiple spectrometer ranges (26), either supplementing each other-different spectrometers for different spectral regions-or comparing full-range and reduced range spectrometers (Figure 5).Some studies also used sensors in the visible and NIR range (350-1000 nm, n = 4) as well as the SWIR range (1000-2500 nm, n = 1), while two studies used mid-infrared instruments.
Most of the sensors reflect 'traditional' sensor types, as seen above.The most used spectrometer instrument brand was by far ASD (Malvern Panalytical, Malvern, UK) with a total of 62 mentions: 46 mentions for ASD FieldSpec devices, 8 for ASD AgriSpec, 7 for ASD LabSpec and 1 for ASD QualitySpec Trek.Other brands included Veris (10), Bruker (10), Agilent (9) and Spectral Evolution (4).

| Measurements and platforms
In several studies, the soils were measured with a contact probe (32). Contact probes are usually connected to the spectrometer via a fibre optic cable. They are placed in contact with the soil sample to exclude ambient light and use an internal light source to guarantee constant light intensity. The constant lighting makes the contact probe particularly suitable for in-field use, as the operation does not depend on the natural lighting conditions. In 18 publications, multiple sensor types were used, mostly involving contact probes in field studies and bare fibre measurements in top-down field measurements simulating remote sensing or lab measurements making use of an external light source (Figure 6). Measurements without a contact probe were mostly bare fibre measurements relying on a passive light source such as sunlight (n = 15) or measurements with an active light source, as in laboratory measurements for comparisons (n = 14). In eight publications, the sensors were included in a shank for soil tillage, which is dragged horizontally through the soil, or in a probe that is inserted vertically, as part of a mobile, on-the-go measurement platform.
Depending on the use case and the sensor type (Figure 6), there are different platforms on which the spectrometers are used (Figure 7). Apart from using multiple platform types (35), the most used type was portable by humans, either in a case or an adapted backpack for ease of use. In 15 papers, the transport on tractors was specified, allowing static and on-the-go measurements, and in 4 papers, benchtop mounted instruments, including portable instruments, were used in benchtop configuration in the lab. Only four publications considered in this study mentioned airborne moving platforms.
The distance between the sensor and the soil sample in the considered papers varied widely (Table 2). The most used distance was 0 cm, mostly due to the use of contact probes. Using or evaluating different distances was reported 15 times, followed by measuring 10-200 cm away from the sample surface (with bare fibre, n = 10), more than 200 cm (n = 8) and 0-10 cm (n = 6).
There were many possibilities to collect the spectra involving various degrees of sample handling. In most of the studies (25), only the soil surface was scanned; in 14 studies, the soil was scanned at the surface and additionally in the laboratory. Often the soil samples for the laboratory were then taken from a topsoil volume (usually top 10 or 20 cm). Sometimes the sampled cores or vertical profiles were measured spectrally. On-the-go measurements from moving platforms were used in 11 studies (Figure 8).

| Predicted soil properties and management
Soil spectroscopy has been used to determine several different soil properties (Figure 9). Most of the studies focused on SOC/SOM and on chemical and physical soil properties such as total element content or soil texture parameters, respectively. However, many other soil properties were studied, indicating the large potential of soil spectroscopy for detecting multiple parameters.
The soil use and management type (Table 3) shows that most studies focused on conventional tillage (24) or did not mention soil management although indicating the cropping system (18). A total of 17 studies compared different soil management systems, among them conventional, reduced or zero tillage as well as grassland. The soil management systems reflected by cropping systems were vineyards and orchards, paddy soils and sugar cane. A total of 16 of the evaluated studies did not give any information about the soil management or crop and vegetation system investigated.

| Measurement purpose
There are multiple purposes for which Vis-NIR spectroscopy is applied in the analysis of soils and cropping systems. In our review, we found two main purposes. First, it is used as a rapid and cost-effective method to assess soil characteristics as an alternative to wet chemical analyses in the laboratory for agricultural soil assessments (Barra et al., 2021; Biney et al., 2020; Metzger et al., 2023). For this purpose, usually, a subsample of the field is taken and measured, whose value is then attributed to the field or a section of the field. This would be, in fact, only an extension of traditional laboratory analysis with a new (faster and cheaper) method. Second, the technique can be used as a ground-truthing or calibration method for air and spaceborne sensing or other imaging approaches, where the entire surface is recorded (Ben-Dor et al., 2017; Hong et al., 2020; Pandey et al., 2020). Both methods are increasingly combined for mapping purposes at field scale for decision support in an agricultural context (Yuzugullu et al., 2020) or on regional scale for mapping and monitoring of soil properties (Castaldi et al., 2019; Yuzugullu et al., 2024). Depending on the purpose of the measurements, the position where the spectrum is measured can vary in the soil: while for agriculture very often soil information down to a certain depth is required (i.e., rooting or tillage depth), ground-truthing for satellites requires information on the undisturbed soil surface. In some cases, it might also be advantageous to take the fresh soil out of the field and analyse it ex situ, whereas measurement directly in the field promises the least effort. The cropping systems ranged from traditional soil tillage with intensive ploughing to soil-conserving methods, with no-till as the extreme. Grassland use also reflects different soil management, with temporary grassland as part of a crop rotation, which remains undisturbed for one to several years. Intensive soil disturbance by conventional tillage produces a well-mixed layer in the topsoil, which can be measured as a representative sample by soil spectroscopy. In less or non-disturbed systems, such as strip-till, no-till or grassland, a gradient of soil properties with soil depth usually establishes over time. This is often well reflected by SOC/SOM and nutrients, but also by soil density, and so forth. Often plant residues are much more prominent and living plant roots are present in the subsurface. This is most pronounced in grassland systems. These differences have significant consequences for spectral measurements and subsequently for analysis and evaluation and/or correction methodology. While well-mixed topsoil allows for a topsoil measurement procedure, soils with established gradients likely need that gradient to be represented in the spectral measurements with depth to give representative data.
Although soil spectroscopy may not be as precise per individual measurement compared to laboratory analysis, it is more cost-efficient (Debaene et al., 2014; Li et al., 2022; Viscarra Rossel & Brus, 2018), providing a balance between accuracy and cost. Since this technique is cheaper, simpler and more practical to use, many more measurements can be made across space (laterally and vertically) and time, so that, as an ensemble, the data are more informative (England & Viscarra-Rossel, 2018). It is evident that the accuracy and robustness of the measurements, predicted soil properties and even resulting maps are better when local or regional soil samples are combined with laboratory analysis to adjust and support the calibration and parameter prediction (Castaldi et al., 2019). When the remote sensing component is also used to strategically identify the optimal sampling position according to spatial distribution and spread of soil heterogeneity, it is often called precision or support sampling (Yuzugullu et al., 2020; Yuzugullu et al., 2024).
FIGURE 8 Approaches to soil scanning.

| Choice of sensor and platform
Each type of sensor has its advantages and disadvantages. For example, using a contact probe eliminates the influence of external light and its variability, but only a small fraction of the surface (about 2 cm²) is measured. In contrast, when passive sampling with the bare fibre (and field-of-view lenses) is used, a larger portion of the soil surface can be measured, giving a more integral view (Debaene et al., 2023). In the case of bare fibre measurements, sometimes lenses are used to better define the field of view (e.g., 4°; Debaene et al., 2023). This can also be scaled up to using sensors on UAVs and satellites, where the spatial resolution, as well as the spectral resolution, becomes lower, but in turn bigger areas can be scanned (thus also enhancing the influence of distorting factors, unfortunately). These different sensor types must be taken into consideration in conjunction with the measurement purpose, and the appropriate setup decided upon. This makes the task of defining one best practice for in situ Vis-NIR spectroscopy rather challenging and calls for a differentiated view on the topic.
Most of the papers evaluated here emphasized the fact that models with laboratory spectra are generally more robust (Sleep et al., 2022), but some papers underlined that the method is suitable for in situ measurements. Yet, a procedure to link field and laboratory samples and measurements, so that SSLs can be used with new field samples, must be defined (Biney et al., 2020; Yin et al., 2023).

| Linking in-field to laboratory spectroscopy
The literature on the subject is scarce and relatively recent, starting with two papers in 2009 (Morgan et al., 2009; Viscarra Rossel et al., 2009), to 21 papers in the present […] method, to obtain similar results to the dry spectra models. However, it is noteworthy that it is difficult to evaluate such an amount of literature since the results are not black and white. For instance, many papers present different definitions of what field spectra are (surface, core, contact probe, etc.). Moreover, laboratory spectra often have different meanings between studies, too (laboratory wet, dry, dry and sieved, etc.). As underlined before, some methods can be employed to diminish those external effects, for example, the external parameter orthogonalization (EPO) method to correct for soil moisture or spectra pretreatment to correct for soil roughness, among others. Mostly, the differences between spectra are related to soil moisture and environmental factors related to physical soil properties (Knadel et al., 2022).

| Data quality control
Reflectance spectroscopy may achieve the same analytical quality as the traditional laboratory methods for some properties. Several studies demonstrated that Vis-NIR reflectance spectroscopy can be used to accurately determine important soil constituents, such as organic carbon, clay, sand and CEC (Demattê et al., 2019). Soriano-Disla et al. (2014) found that soil water content, texture, SOC, CEC, exchangeable Ca and Mg, total N, pH, concentration of metals or metalloids, microbial size and activity could be successfully predicted. Generally, MIR spectroscopy produced better predictions than Vis-NIR, but Vis-NIR still outperformed MIR for several properties (e.g., biological ones). An advantage of Vis-NIR is the instrument portability. In-field predictions for clay, water, total organic carbon, extractable phosphorus, and total N appear to be similar to laboratory methods, but there are issues regarding, for example, sample heterogeneity, moisture content and surface roughness (Soriano-Disla et al., 2014).
The type and range of spectral sensors influence the quantification of soil properties, but Romero et al. (2018) observed only a small difference between sensor measurements. Taking into consideration these small differences, caused by geometry and equipment variation, Ben-Dor et al. (2015) determined a protocol to standardize measurements between sensors. The accuracy of a sensor can be assessed by means of measurement repeatability at the same time and place, and by the correlation with reference measurements of soil properties (Sinfield et al., 2010). Demattê et al. (2019) found that predictions of soil properties using different sensors showed high reproducibility, which is associated with the analytical capacity of the reflectance spectroscopy technique.
Laboratory soil spectroscopy measurements are prone to errors due to differences in equipment and procedures used (Ge et al., 2011; Knadel et al., 2013; Pimstein et al., 2011), soil preparation and sub-sampling (Ben-Dor et al., 2015) as well as the temperature and humidity differences of the lab environments (Chabrillat et al., 2019). In-field measurements are additionally affected by the in situ conditions and by differences in measurement procedures, which are usually more pronounced than among laboratory approaches. For instance, static and mobile measurements yield different data quality. A reduced integration time over a specific sampling point for mobile, moving-platform approaches leads to a lower signal-to-noise ratio, reducing the accuracy and robustness of a subsequent prediction. However, for mobile approaches, this decreased single-point measurement quality is traded for an increased spatial sampling density, which allows for a higher accuracy of the overall prediction in a field and likely better mapping results.
In situ soil conditions such as soil moisture, structure, stone content, coarse organic residues, smearing and small-scale heterogeneity, mottles and redox features affect the overall performance of field-based soil reflectance spectroscopy. Such applications therefore require suitable environmental conditions and various pretreatment methods to mitigate the effects of moisture content, soil roughness and vegetation cover (Gehl & Rice, 2007; Stenberg, 2010). There is, however, some discussion of the extent to which factors such as soil aggregation and heterogeneity affect the reliability of in situ derived soil prediction for Vis-NIR. For example, Waiser et al. (2007) found that clay contents predicted from Vis-NIR scans of dried in situ soil are more accurate than predictions from measurements of field-moist in situ soil. Spectral measurements of dried and ground soil resulted in the most accurate predictions of clay content: in this case, soil moisture appears to have a stronger attenuating effect than soil structure or aggregation (Waiser et al., 2007). Efforts to correct for the effect of soil moisture have received growing attention in the literature: for example, air drying the sample reduces the intensity of bands that are related to water, so that signals associated with other soil properties are not masked or hidden. Moreover, EPO (Roger et al., 2003) has been used as a Vis-NIR preprocessing step to remove the effect of soil moisture from spectra (Minasny et al., 2011). Although bare soil conditions are most favourable for in-field measurements, in real field conditions, the presence of either green vegetation or straw and mulch is very common, and may lead to overestimation of SOC (Bartholomeus et al., 2011).
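The EPO step mentioned above can be illustrated with a minimal sketch. It assumes paired spectra of the same samples scanned dry and field-moist, builds the difference matrix, and projects spectra onto the subspace orthogonal to its leading singular vectors. The function name `epo_projection`, the synthetic data and the choice of one component are illustrative assumptions, not the settings of any reviewed study.

```python
import numpy as np


def epo_projection(x_dry, x_moist, n_components=2):
    """Return the EPO projection matrix P = I - V V^T.

    V holds the leading right singular vectors of the difference
    matrix D = x_moist - x_dry (rows: samples, columns: bands).
    """
    d = x_moist - x_dry
    _, _, vt = np.linalg.svd(d, full_matrices=False)
    v = vt[:n_components].T  # shape: (bands, n_components)
    return np.eye(d.shape[1]) - v @ v.T


# Synthetic example (assumption): moisture adds one fixed spectral
# signature whose strength differs from sample to sample.
rng = np.random.default_rng(0)
n_samples, n_bands = 30, 50
x_dry = rng.normal(size=(n_samples, n_bands))
moisture_shape = np.sin(np.linspace(0.0, np.pi, n_bands))
moisture_level = rng.uniform(0.5, 2.0, size=(n_samples, 1))
x_moist = x_dry + moisture_level * moisture_shape

p = epo_projection(x_dry, x_moist, n_components=1)
x_dry_epo = x_dry @ p      # projected dry spectra
x_moist_epo = x_moist @ p  # projected moist spectra

# After projection, the moist and dry spectra of a sample coincide.
print(np.abs(x_moist_epo - x_dry_epo).max())
```

In practice, the number of components is a tuning choice, and calibration models are fitted on the projected spectra rather than on the raw ones.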

| Processing and use of in-field spectroscopy data
The analysis of the collected publications (Supplementary Table S1) reflects an immense range of possibilities to mathematically pre-process the spectra to extract the maximum information, as well as many mathematical modelling approaches to relate the information contained in the spectra to target properties (e.g., laboratory results). The mathematical pre-processing algorithms involve smoothing the spectrum over a defined number of points, using simple or more advanced smoothing algorithms such as the Savitzky-Golay algorithm (Savitzky & Golay, 1964), baseline corrections, continuum removal (Crucil et al., 2019), standard normal variate (SNV; Barnes et al., 1989), multiplicative scatter correction (MSC; Geladi et al., 1985) and first and second derivatives combined with Savitzky-Golay smoothing, to name the most often used ones. Other pre-processing algorithms dealing with the negative influence of soil moisture, such as EPO and direct standardization (DS), are described in more detail by Knadel et al. (2022). It is very common to test several algorithms on a particular dataset, including combining multiple algorithms to obtain the best modelling results. For example, Gras et al. (2014) tested 42 combinations of pre-processing algorithms, and Cambou et al. (2016) took the same approach with 35, from which the best combination could be selected. So far, there are few suggestions on how to choose the right combination of preprocessing methods. We found only one data mining tool supporting decisions on the use of algorithms, PARACUDA II ® (Carmon & Ben-Dor, 2017; Gholizadeh et al., 2018).
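Three of the pre-processing steps named above can be sketched in a few lines. The example below is illustrative only: the helper names `snv`, `msc` and `sg_first_derivative`, the window length and polynomial order, and the synthetic spectra are assumptions chosen for demonstration, and it relies on `scipy.signal.savgol_filter` for the Savitzky-Golay part.

```python
import numpy as np
from scipy.signal import savgol_filter


def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row)."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference spectrum.

    Each spectrum is regressed on the reference (default: mean spectrum)
    and corrected with the fitted intercept and slope.
    """
    ref = spectra.mean(axis=0) if reference is None else reference
    corrected = np.empty_like(spectra, dtype=float)
    for i, row in enumerate(spectra):
        slope, intercept = np.polyfit(ref, row, deg=1)
        corrected[i] = (row - intercept) / slope
    return corrected


def sg_first_derivative(spectra, window_length=11, polyorder=2):
    """First derivative along the band axis with Savitzky-Golay smoothing."""
    return savgol_filter(spectra, window_length=window_length,
                         polyorder=polyorder, deriv=1, axis=1)


# Synthetic spectra (assumption): one base curve with sample-specific
# multiplicative and additive scatter effects.
wavelengths = np.linspace(350, 2500, 216)
base = 0.3 + 0.1 * np.sin(wavelengths / 300.0)
gain = np.array([[0.8], [1.0], [1.3]])
offset = np.array([[0.02], [0.0], [-0.01]])
scattered = gain * base + offset

snv_out = snv(scattered)
msc_out = msc(scattered)
deriv = sg_first_derivative(scattered)
```

On this synthetic data, both SNV and MSC collapse the three scattered curves onto a single curve, which is the behaviour these corrections are designed for when differences between spectra are purely additive and multiplicative.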
The problem of algorithm selection gets more complex when different modelling techniques are also taken into consideration. Methods for Vis-NIR predictions of soil properties are purely empirical. The first step is the collection of an SSL containing the ideal duo of target soil properties and corresponding soil spectra. Then, the routine of spectral preprocessing, followed by fitting of any number of calibration model types (Soriano-Disla et al., 2014), is performed.
Commonly used algorithms include partial least squares regression (PLSR; Hutengs et al., 2019), random forest (RF), support vector machines (SVMs) and M5Rules (CUBIST; Munnaf & Mouazen, 2022) or neural network models (Tiwari et al., 2015; Wang et al., 2019). The chosen model validation approach represents an additional source of variability. Here, the methods include splitting of a dataset into a calibration and validation set to test the model performance on unknown samples (Breure et al., 2022) Based on the plethora of possibilities to set up the spectral prediction models, it can be assumed that the choice of modelling approach can also be a source of error. Apart from the model choice, the predictive quality also depends on the precision and accuracy of the reference laboratory measurements (Rayment et al., 2012; Reeves III, 2010). Limitations of soil spectroscopy are often caused by the shortage of calibration services and the lack of harmonized standard operating procedures (Benedetti & van Egmond, 2021). To overcome these obstacles, the Global Soil Spectral Calibration Library and Estimation Service was proposed (Shepherd et al., 2022). The selection of reference spectra is one of the important steps dealing with calibration and bias. The calibration-related error and bias can likely be reduced using standardization and best practice protocols when data stem from different sources, reflecting different measurement processes and use of methods as described above.
Missing data and information also represent a large constraint in many studies, hindering cross-comparison or use of studies for meta-analysis. For example, the 16 studies about soil use and management reviewed here did not give any information on the background of the considered soils. Since soil management can have a large impact on the needed calibration and consequently on the output accuracy, such missing information will likely be reflected in larger error terms (Greenberg et al., 2022; Karapetsas et al., 2022; Pei et al., 2019; Yin et al., 2023).

| Applications and limits of proximal soil sensing
As derived from the discussion above, soil spectroscopy holds great potential for practical applications. Applications may range from decision support in agriculture, partly replacing laboratory analysis or providing additional information, to support for soil mapping (Minasny et al., 2009), to a higher sampling resolution and a larger amount of information because of its theoretically lower cost per unit of measurement. In this range, one can also see the development of measurement devices and analysis software, advancing user-friendly access to soil spectroscopy technology.
However, the potential for lower costs or the realization of more measurements per area, as well as good data quality, depend strongly on automation and standardization of processes and methods. Such processes and methods range from soil preparation through the actual in-field measurement to the prediction of a value, ideally a good prediction of soil properties that includes error information. The reliability and robustness of these final values, which are interpreted for decisions in agricultural management or integrated in mapping approaches as a point in spatial models, will drive the application of soil spectroscopy in practical terms. As derived from our study, automation rarely takes place during measurement and analysis, and even in laboratories today. For many of the processes and methods involved, standardization is also lacking. Therefore, labour and technology costs for sampling, and also for establishing the sampling, prediction and interpretation pipeline, still represent a limit of soil spectroscopy that needs to be overcome for efficient application and use.
In many situations, the availability of spectral libraries, and thus of calibration and validation data, limits a broader use of soil spectroscopy. Thus, the establishment and extension of SSLs are a major driver to advance soil spectroscopy towards a standard tool. With existing libraries, there is still a strong demand for optimization of prediction and validation methodology and its standardization, as mentioned above. To obtain more spectra and to quality control measurements today, the support sampling approach is well accepted as a way to get more sampling points for calibration and validation.
In terms of soil properties, it is evident that spectroscopy can be used to predict several properties at the same time from the same spectra. As discussed above, the prediction accuracy and robustness vary between properties depending on the spectral features available for prediction, but also on the sensors used, the measurement procedure, the availability of calibration data and the analytical methods, among others. Because the final prediction depends on the processes and methods used, we emphasize that their standardization is a major step toward improving our understanding of which properties can be predicted with high or low accuracy, reliability and robustness by in-field soil spectroscopy. Therefore, future work should initially focus on standardization, followed by a broader and better evaluation of property prediction power and potential, supported by the integration of technical possibilities for automation.

| Questions and tasks for the future
Several questions remain to be answered in the future, especially:
1. How can the hardware and software technology be advanced towards robust use in agricultural practice, to be applicable in regular farming and soil mapping operations?
2. Beyond the potential of Vis-NIR spectroscopy to be 'a fast and cheap alternative for laboratory analyses', there are very few publications that evaluate the trade-off between speed/price and the reduced accuracy due to model-inherent errors, with special focus on affordable instruments (which are more likely to be used in farm operations).
3. In-field spectral measurements need to be harmonized as much as possible to ensure robust prediction of soil properties. This could be facilitated by a commonly accepted measurement protocol setting standards.

| CONCLUSIONS
This paper gives an overview of recent in-field soil spectroscopy studies, clearly underlining the large potential of the technique and highlighting ways forward to implementation into practical farming or soil mapping operations, also pinpointing actual limits and needs for research and practical applications.
The intrinsic ability of multi-parameter soil property detection bears the potential for better information and thus decision support products on managed soils and soil-borne production systems, respectively. However, the tools, approaches, processes, methods and cropping systems in which spectroscopy is applied and evaluated differ largely between the considered studies. This increases the number of input variables for robust property prediction as well as the prediction errors, and makes cross-comparisons between monitored situations, and thus method harmonization, difficult.
Thus, ways forward for advancing in-field spectroscopy and its practical application comprise larger databases including different instruments and cropping systems, as well as methodologies combining existing knowledge on lab spectroscopy with in-field spectroscopy calibration methods. A best practice protocol for in-field spectroscopy application seems to be central to improving the reliability of soil property measurements and, finally, the practical application of soil spectroscopy. Based on standardized and harmonized methods and data analysis, it seems realistic that future research can identify which soil properties can be predicted reliably and robustly.
Efforts to raise awareness of the existing knowledge in industry and of the available technology and subsequently its potential in practice will likely foster the implementation of spectroscopy in real farm operations.

FIGURE 1 Overview of proximal soil sensing as distinguished from remote sensing (left) and a close-up of sensing methodologies used in the field and lab (middle), and examples of surface sample preparation (right). Symbols from Booysen et al. (2019).
Newer sensor types, such as microelectromechanical systems (MEMS), offer cheaper and more flexible development of measurement devices in a multispectral range, but they are much less represented in current publications, since they represent a relatively novel development.Recently, Ng et al. (2020), Tang et al. (2020) and Metzger et al. (2023) compared a MEMS-based spectrometer to a full range spectrometer.

FIGURE 3 Number of soil spectroscopy papers per year evaluated in this review.
FIGURE 4 Aggregation of study topics covered in this review.
FIGURE 5 Overview of the sensor ranges utilized in the reviewed studies.

FIGURE 6 Overview of the types of sensors.
FIGURE 7 Number of different platforms used as sensor carrier in the investigated studies.
TABLE 2 Distances between the sensor and the soil sample.

FIGURE 9 Main groups of soil properties being predicted in the reviewed publications. A detailed overview of the soil properties is given in the Supplementary Material.
TABLE 3 Overview of the management types covered by the studies.

Management | Count | Description
Conventional | 24 | Intensive soil tillage, usually ploughing (crops: wheat, barley, grain and silage maize, lettuce, soybean, cotton, alfalfa, potato, taro, green cover crop, bare soil or no info on crop)
No/reduced tillage | 3 | Arable cropping with no (direct seeding) or reduced (shallow harrowing, strip tillage, etc.
 | | or reduced tillage, cultivation of woody or shrubby species
Sugar cane | 2 | Perennial crop for 4-8 years with soil tillage before planting and reduced tillage during cultivation
Undefined soil management | 18 | Agricultural land use but no clear statement on soil and/or crop management
Paddy soil | 4 | Regular flooding without tillage before sampling

different soil management types contain: conventional (12), grassland (9), reduced tillage (3), no-tillage (4), paddy soil (1), orchard (3), forest (1), agroforest (2), mining region soil (

review (Supplementary Material). About 70% of these papers are more recent than 2018. Therefore, it can clearly be seen as a new research focus in the field of soil spectroscopy. Currently, most papers deal only with comparing prediction results obtained using laboratory samples versus field samples. Even though more than 70% of papers present field spectra and the corresponding laboratory spectra, no link between those spectra is provided. Mostly, a prediction model comparison is given. However, we found few attempts truly linking lab-derived databases with in-field spectroscopy with the aim of taking advantage of the existing knowledge. Viscarra Rossel et al. (2009) successfully employed field spectra to spike an existing SSL. Franceschini et al. (2018) and Guo et al. (2019) tried to correct field moist spectra with the external parameter orthogonalization (EPO; Roger et al., 2003)

, the use of different datasets from different sources to construct a larger, eventually more representative dataset and the use of different cross-validation techniques. The cross-validation methods vary with (i) the number of samples left out in each run, for example, Bricklemyer and Brown (2010) used leave-one-out cross-validation, while Chen et al. (2019) used a five-fold cross-validation; (ii) the method with which the dataset is split, for example, Christy (2008) used a fuzzy c-means algorithm, and Sleep et al. (2022) split the dataset with the Kennard-Stone algorithm (Ramirez-Lopez et al., 2014); and (iii) the number of times the model run is repeated, for example, Hutengs et al. (2019) used a 100 times repeated double cross-validation.
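As an illustration of one of the dataset splitting methods mentioned above, the following is a minimal sketch of Kennard-Stone selection in its usual formulation: the two most distant samples are chosen first, and further samples are added one at a time, each being the one whose nearest already-selected sample is farthest away. The function name `kennard_stone`, the use of Euclidean distance on the raw values, and the toy data are assumptions for demonstration; the sketch expects `n_select` to be at least 2.

```python
import numpy as np


def kennard_stone(x, n_select):
    """Split sample indices into a selected (calibration) and remaining set."""
    x = np.asarray(x, dtype=float)
    # Pairwise Euclidean distances between all samples.
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2)

    # Start with the two samples that are farthest apart.
    first, second = np.unravel_index(np.argmax(dist), dist.shape)
    selected = [int(first), int(second)]
    remaining = [i for i in range(len(x)) if i not in selected]

    # Repeatedly add the sample with the largest minimum distance
    # to the samples selected so far.
    while len(selected) < n_select:
        min_dist = dist[np.ix_(remaining, selected)].min(axis=1)
        pick = remaining[int(np.argmax(min_dist))]
        selected.append(pick)
        remaining.remove(pick)
    return selected, remaining


# Toy example (assumption): five one-dimensional "spectra".
points = np.array([[0.0], [1.0], [2.0], [3.0], [10.0]])
calibration_idx, validation_idx = kennard_stone(points, n_select=3)
print(calibration_idx, validation_idx)
```

In spectroscopy applications the distances are often computed on principal component scores rather than on raw spectra; that variant only changes the input passed as `x`.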