Increasing the scope of the resazurin‐resorufin smart tracer system in hydrologic and biogeochemical sciences: The effects of storage duration and temperature on preservation

The resazurin (raz)‐resorufin (rru) smart tracer system has led to significant advances in hydrologic and biogeochemical sciences, but its application has been limited in scope and geography by the short duration for which it has been assumed samples can be reliably stored before analysis. For the first time, we quantify the effects of sample storage duration and temperature on measured raz and rru concentrations in order to identify robust storage protocols. Raz/rru concentrations equivalent to those typically used in field and laboratory applications were prepared in three different mediums (deionized water, groundwater, and surface water). Samples were stored at 20°C, 5°C, and −20°C, representing average room, refrigerator, and freezer temperatures. Analyses of raz/rru concentration changes were conducted systematically during an 84‐day period, with higher frequency of analysis in the first 7 d. For samples stored at room temperature, changes of up to 22.6% were observed over the initial 24 h. In contrast, for samples stored in the refrigerator and freezer, changes of up to 8.4% and 6.9% were observed in the same period, respectively. Freezing samples provided the best preservation, yielding a maximum of ~ 10% change after 14 d compared to a maximum of ~ 30% change when cooled. It is generally recommended that raz/rru samples be stored cooled for up to 48 h and frozen for up to 14 d. This offers exciting opportunities to broaden the application of the raz‐rru system to undertake measurements in wider geographical (remote) locations and at increased sampling frequencies.

The resazurin (raz)-resorufin (rru) smart tracer system has been increasingly adopted in the hydrological sciences since its introduction to the field by Haggerty et al. (2008). It has been applied to a range of experimental designs including hydrological tracer injections (Lemke et al. 2013), flume experiments, and column and batch experiments (Comer-Warner et al. 2018). The adoption of the technique has significantly accelerated scientific advancement in several sub-disciplines and contributed to improving our understanding of important ecosystem controls such as metabolically active groundwater-surface water interactions and system metabolism in rivers (González-Pinzón et al. 2015), lakes (Baranov et al. 2016), and wetlands (Riveros-Iregui et al. 2018). For more information about these applications and contributions to the literature, readers should refer to the recent review by Knapp et al. (2018).
The raz-rru system is capable of providing information about the microbial and biogeochemical conditions and functioning of the environment(s) it is exposed to by its function as a selective binary nano-switch (Knapp et al. 2018). Raz is irreversibly transformed to rru in mildly reducing conditions and mostly in the presence of living microorganisms, but it is effectively conservative in oxic environments such as most surface waters (Haggerty et al. 2008). Both raz and rru are fluorescent, with optimal excitation/emission wavelength pairs at sufficient distance to separate signals. This allows detection on both traditional benchtop fluorometers (such as Varian Cary Eclipse, Agilent Technologies, Inc.) as well as through online field fluorometers (such as GGUN FL30, Albillia Sarl) of concentrations < 1 μg L⁻¹, depending on the specific instrument employed and the medium (i.e., water type) in which the compounds exist (Lemke et al. 2013; Blaen et al. 2017; Blaen et al. 2018).
Best practice guidelines on how to conduct raz-rru experiments have been developed collaboratively and consecutively by the hydrological community. More recently, these have been refined into standard procedures for both field and laboratory applications (Blaen et al. 2017; Knapp et al. 2018). These guidelines include notes on solute preparation, sample extraction and processing (i.e., filtering and buffering pH), instrument calibration and correction, and data processing (e.g., correcting for signal overlap). Some guidelines exist on sample storage, for example samples should be stored in the dark (Haggerty et al. 2008), at temperatures of around 4°C, and may be frozen (Knapp et al. 2018). However, to date, no systematic analysis has been conducted that would explicitly quantify the effects of storage duration, medium, and preservation method (i.e., temperature control) on the stability or potential decay of raz and rru. In fact, many applications do not specifically report sample storage time or preservation method, which has been identified as a challenge for intercomparison of experimental observations (Lemke et al. 2013).
This lack of standardization, and paucity of understanding of the impacts of storage duration and preservation method, may lead to overly conservative application which may partly explain the limitation of most previous applications to systems where faster sample analysis is possible, restricting the experimental complexity and geographical scope in which the raz-rru system is applied. Furthermore, not knowing for how long samples can reliably be stored, or by which preservation method, reduces the number of samples which can be collected as most researchers aim to analyze all samples within 24 h. Increasing the time in which samples can be analyzed reliably would allow more samples to be collected and therefore facilitate more complex experimental designs, for example those which aim for both high temporal and spatial sample frequency. Whilst the introduction of online fluorometers has facilitated similar designs, these have been mostly restricted to surface water analysis and have been challenged by instrumental error, such as drift and interference (Blaen et al. 2017). To date, the application of raz/rru in the hydrological sciences is limited almost exclusively to North America and Europe, and then most of these are in easily-accessible locations (Knapp et al. 2018). Allowing samples to be analyzed following longer storage duration would allow time for samples to be transported to distant labs, therefore facilitating the use of the raz-rru system in remote locations. Furthermore, acceptable long-term storage would allow users to analyze samples from long-running or related experiments in the same batch, reducing the significant challenges associated with correcting for instrumental drift or error.
In this study, we investigated the effects of three storage temperatures (room temperature, refrigerator, and freezer) and storage duration (holding time) (periodically between 1 and 84 d) on the concentration of raz and rru in two mixtures of different concentrations (low and high) and in three mediums (deionized water [DIW], groundwater [GW], and surface water [SW]) representing tracer mixtures and mediums of typical applications. The aim of this work is to provide evidence-based practical guidance on best practice protocols for the storage of raz/rru samples. The critical point at which samples have changed so much as to be deemed unusable is determined by the data quality objectives (DQO) (i.e., tolerable error levels) defined in a research project (Wright 2004). Projects may propose different DQOs depending on project goals and data application (e.g., for use in decision making). Readers should interpret these results to define their own guidelines (e.g., acceptance criteria or thresholds) and inform necessary judgments with regard to DQOs in individual applications. However, general guidelines are proposed which should avoid unacceptable error in most applications. These guidelines are based on a threshold of 10% change, above which is assumed to be unacceptable, which is similar to thresholds proposed for other analytes, for example for phosphate, nitrate, and silicate (Chapman and Mostert 1990) and DOC (Cook et al. 2016).
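The 10% criterion described above can be expressed as a small calculation. The following is a minimal sketch, assuming that change is measured relative to the starting concentration and that a violation means the absolute percentage change exceeds the threshold; the function names and the example series are hypothetical and are not data from this study.

```python
def percent_change(start, measured):
    """Relative change from the starting concentration, in percent."""
    if start == 0:
        raise ValueError("starting concentration must be non-zero")
    return 100.0 * (measured - start) / start


def first_violation_day(start, series, threshold_pct=10.0):
    """Return the first storage day whose |% change| exceeds the threshold.

    series: iterable of (day, measured_concentration) pairs, in any order.
    Returns None if the threshold is never exceeded.
    """
    for day, conc in sorted(series):
        if abs(percent_change(start, conc)) > threshold_pct:
            return day
    return None


# Hypothetical rru series (ug/L) with a 10 ug/L starting concentration.
example = [(1, 10.3), (2, 10.6), (3, 10.9), (6, 11.4), (7, 11.8)]
print(first_violation_day(10.0, example))  # 6
```

A project with a stricter or looser DQO would only need to change `threshold_pct`.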

Materials and procedures
Samples were prepared in three different mediums, representing a range of potential environmental sample conditions, to analyze potential interference of other water solutes (especially organic carbon) with the stability or decay of raz/rru (Lemke et al. 2014). Ultrapure water (18.2 MΩ cm⁻¹) (DIW) was prepared in the lab in a Millipore A10. GW was extracted from a 140-m-deep borehole in a Permo-Triassic sandstone aquifer at Ecolaboratory (University of Birmingham, UK). SW was extracted from the Bournbrook, a small urban river in Birmingham, UK. The GW and SW were filtered on the day of collection using 0.45-μm quantitative ashless filter paper (Whatman-grade no. 42). Concentrations of nutrients (Table 1) (NO₃, NO₂, NH₃, PO₄) were analyzed on a continuous flow analyzer (San++, Skalar), with a limit of detection and precision of 0.001% ± 1% mg NO₃-N L⁻¹, 0.05% ± 3% mg NO₂-N L⁻¹, 0.01% ± 5% mg NH₄-N L⁻¹, and 0.05% ± 5% mg PO₄-P L⁻¹. Concentrations of dissolved organic carbon (DOC) (Table 1) were analyzed on a Shimadzu TOC-L analyzer (Shimadzu Corporation) with a limit of detection of 4 μg L⁻¹.
Treatment solutions were prepared using raz sodium salt (Apollo Scientific Ltd., lot: AS456027) and rru sodium salt (Sigma-Aldrich, lot: MKBQ2285V). High-concentration stock solutions were prepared in DIW as suggested by Knapp et al. (2018). Two mixtures of raz and rru were then prepared in different mediums, with a low target concentration mixture (30 μg L⁻¹ raz, 10 μg L⁻¹ rru) and high target concentration mixture (150 μg L⁻¹ raz, 50 μg L⁻¹ rru). These mixtures were selected because they are representative of the concentrations one might expect in hydrological applications, such as field tracer tests (Blaen et al. 2017; Knapp et al. 2018). The actual starting concentrations vary in different mediums (e.g., higher than expected in the GW), which may be explained by variation in background fluorescence which was not accounted for, or by human error in producing the mixtures. For each condition and replicate, 10 mL of sample was transferred to a prelabelled 15 mL polypropylene centrifuge tube (Falcon 352196).
Samples were transferred to the respective storage method within 2 h of preparation. In all sample preservation methods, samples were stored in the dark and were undisturbed between analysis periods, and temperatures were monitored at 10-min intervals using a Tinytag Aquatic 2 temperature logger (Gemini Data Loggers Ltd.). Samples were stored at three different temperatures: room temperature (mean 20.49°C, min 18.43°C, max 24.37°C); refrigerator (mean 5.33°C, min 2.36°C, max 9.05°C); freezer (mean −20.40°C, min −22.3°C, max −15.85°C). Averages and ranges were calculated whilst omitting the first 48 h of the storage period, when loggers were acclimatizing. Whilst storage temperatures fluctuated (see Supplementary Fig. 1), the average temperatures in each storage method are distinctly different, and represent typical lab storage conditions. Samples stored in the freezer were removed 24 h ahead of their analysis to defrost in the refrigerator. Therefore, samples stored in the freezer and the refrigerator were analyzed at the temperature of the refrigerator. Samples stored at room temperature were analyzed at room temperature. The temperature of the sample during analysis is likely to affect fluorescence intensity (Smart and Laidlaw 1977; Haggerty et al. 2008), but this approach reflects practicality and normal lab practice.
Table 1. Average (n = 5) concentrations of non-purgeable organic carbon, nitrate, nitrite, ammonium, and phosphate (all presented in mg L⁻¹) prior to addition of tracers for three mediums: DIW, GW, and SW. Standard deviation is reported in brackets.
Samples were analyzed on a Varian Cary Eclipse benchtop fluorescence spectrophotometer (Agilent Technologies, Inc.). Excitation/emission wavelength pairs of 600/617.97 nm for raz and 570/586 nm for rru were used, which are within the ranges suggested by Haggerty et al. (2008). The fluorescence spectrophotometer was calibrated with 150 μg L⁻¹ raz and 30 μg L⁻¹ rru and a mixture of 30 : 10 μg L⁻¹ raz : rru. Signal overlap was corrected using the MATLAB code provided by Knapp et al. (2018). Whilst fresh calibration standard solutions were prepared and analyzed at each analysis period, all fluorescence intensities were converted to concentrations using the calibration equations established from standards data acquired on the same day the samples were prepared. This was informed by the comparison of concentrations calculated using calibration equations established for each analysis period with those calculated using one calibration equation, where it was concluded that more error is likely introduced in the preparation of new standards at every period than by instrumental error between periods. This comparison is presented in Supplementary Section 3, including all calibration equations (Supplementary Table S1) and a graph of calculated concentrations (Supplementary Fig. S3). Furthermore, only standards prepared in DIW were used to calibrate the instrument because it is often impractical to calibrate using the sample medium when samples are extracted from different environments (e.g., river water and different depths in the hyporheic zone) and because the focus of this experiment is the effect of storage method and duration.
Therefore, in this study, we did not consider the implications of different calibration medium on raz/rru concentration changes. More detail is provided in Supplementary Section 2 and the effect of calibration standard solution medium is presented in Supplementary Fig. S2. Samples were buffered with 1 M sodium hydroxide (Fisher Scientific, lot: 1991898) to pH > 8 to ensure raz was present in its anionic form and therefore exhibits maximum fluorescence intensity (Bueno et al. 2002). Samples were shaken for 30 s before analysis to re-oxygenate waters and ensure dihydroresorufin (a daughter compound of rru) is transformed back to rru (Knapp et al. 2018). Drift standards were included in every six samples to observe instrument drift, which was insignificant over the analysis periods. For each sampling interval, triplicates of each medium and each treatment concentration were analyzed. Samples were analyzed on the day of preparation to confirm the starting concentrations. One set of samples was removed from the freezer as soon as they had fully frozen and was allowed to thaw in the refrigerator before being analyzed 24 h after preparation. This was to examine changes to the spectrophotometric properties of raz/rru in different mediums after the freeze/thaw process, which has resulted in unpredictable changes to fluorescent peak positions and intensities in other analytes like DOC (Spencer et al. 2007; Cook et al. 2016). On Days 2, 3, 6, and 7, only samples stored at room temperature and in the refrigerator were analyzed as frozen samples would have spent a substantial proportion of the total storage time thawing. Subsequently, samples stored by each method were analyzed at 14, 28, 42, and 84 d after preparation.
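To illustrate why signal overlap correction requires both tracers to be quantifiable, the following sketch treats the two wavelength pairs as two channels whose blank-corrected intensities are linear in both concentrations, and solves the resulting 2 × 2 system. This is a generic linear unmixing under that stated assumption, not the MATLAB code of Knapp et al. (2018); the function name and the response coefficients are hypothetical.

```python
def unmix(f_raz_channel, f_rru_channel, k, blank=(0.0, 0.0)):
    """Solve F = K * C + blank for C = (raz, rru) using Cramer's rule.

    k[i][j] is the (hypothetical) fluorescence response of channel i
    (0: raz wavelength pair, 1: rru wavelength pair) per unit
    concentration of tracer j (0: raz, 1: rru).
    """
    f1 = f_raz_channel - blank[0]
    f2 = f_rru_channel - blank[1]
    det = k[0][0] * k[1][1] - k[0][1] * k[1][0]
    if abs(det) < 1e-12:
        raise ValueError("channels are not independent; cannot separate tracers")
    raz = (f1 * k[1][1] - k[0][1] * f2) / det
    rru = (k[0][0] * f2 - k[1][0] * f1) / det
    return raz, rru


# Hypothetical response matrix (intensity units per ug/L).
K = [[2.0, 0.5],
     [0.1, 8.0]]

# Intensities generated from 30 ug/L raz and 10 ug/L rru with the matrix above.
raz, rru = unmix(65.0, 83.0, K)
print(round(raz, 6), round(rru, 6))  # 30.0 10.0
```

Because each channel contains a contribution from both tracers, an unusable reading in one channel (for example, above the quantification limit) leaves the system underdetermined, so neither concentration can be recovered.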

Four samples were identified as outliers, where measured concentrations differed from the other two replicates at least by a factor of 3, and were omitted from analysis. Statistical analysis was conducted to (1) identify the storage duration at which tracer concentrations became significantly different from starting concentrations when subject to different storage methods, and (2) investigate if the storage method resulted in significantly different recovery of tracers. Analysis was conducted for the low concentration samples only because rru concentrations exceeded the upper limit of quantification (ULQ) in 62% of high concentration samples. Without rru concentrations, raz concentrations could not be calculated, precluding the analysis. Data from low concentration samples of all three mediums were amalgamated (for both raz and rru separately) to conduct the statistical analysis on the effect of storage method and duration, but mediums are plotted and commented on separately in the text. Analysis for (1) was conducted using concentration data in μg L⁻¹ and for (2) using percentage change from starting concentrations. Shapiro-Wilk tests were used to test for normality. Where data exhibited a normal distribution, a Student's t-test (T) and a one-way ANOVA with Tukey post-hoc test (Tukey) were conducted. Where data were not normally distributed, Wilcoxon signed-rank tests (WSR) and Kruskal-Wallis one-way ANOVA tests (KW) were conducted. Statistical tests were performed in R version 4.1.0 (R Core Team 2021). The statistical test employed for each result is reported using the abbreviations above and the level of significance is 0.05 for all tests.
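The outlier criterion above can be read in more than one way. The sketch below shows one possible interpretation, assuming positive concentrations and flagging a replicate only when it is at least three times larger, or at least three times smaller, than both of the other replicates; the function name and example values are hypothetical.

```python
def flag_outliers(replicates, factor=3.0):
    """Flag replicates that differ from all others by at least `factor`.

    Assumes positive concentrations. A value is flagged when it is
    >= factor times every other replicate, or <= 1/factor of every
    other replicate.
    """
    flags = []
    for i, value in enumerate(replicates):
        others = [v for j, v in enumerate(replicates) if j != i]
        much_higher = all(value >= factor * o for o in others)
        much_lower = all(value * factor <= o for o in others)
        flags.append(much_higher or much_lower)
    return flags


# Hypothetical triplicates (ug/L).
print(flag_outliers([30.1, 29.4, 95.0]))  # [False, False, True]
print(flag_outliers([10.0, 3.0, 9.8]))    # [False, True, False]
print(flag_outliers([30.1, 29.4, 31.0]))  # [False, False, False]
```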

Assessment
Results are presented separately for the low and high concentrations of raz/rru. Samples stored at room temperature are termed as ROOM, samples stored in the refrigerator are termed as COOLED, and samples stored in the freezer are termed as FROZEN. The medium is indicated by DIW for deionized water, GW for groundwater, and SW for surface water, which is in subscript following the storage method where appropriate. Relevant results of statistical tests are reported in the text and the outputs of all statistical tests are reported in Supplementary Tables S4-S6.

Low concentrations of raz/rru
In ROOM samples, concentrations of raz mostly remained within the 10% change threshold on Days 1-3 (Fig. 2A) but on Day 6 a significant difference to starting concentration had been realized for the medium amalgamated results (WSR, V = 36, p < 0.05) (Fig. 1A-C). Rru changed more quickly (up to 22.6% after 1 d) and concentrations were significantly different to starting concentrations on Day 3 of analysis (WSR, V = 0, p < 0.01) (Fig. 2D). COOLED samples exhibited a similar pattern but concentrations changed more slowly. A maximum of 1.3% and 2.7% change was observed for raz and rru concentrations, respectively, on Day 1 of analysis (Fig. 2B,E). The 10% threshold was first violated on Day 7 for raz for all mediums, and Day 2 for rru in SW and Day 7 for rru in DIW and GW. A significant difference in concentration was first realized on Day 6 for raz (WSR, V = 33, p < 0.05) and Day 14 for rru (WSR, V = 0, p < 0.05) for the medium amalgamated results. FROZEN samples were preserved best (i.e., least deviated from starting concentrations), where a significant difference in concentration was first realized on Day 42 for raz (WSR, V = 36, p < 0.05), but not at all for rru, even after 84 d (WSR, V = 6, p > 0.05). Concentrations of raz and rru in FROZEN samples were mostly within the 10% threshold on Day 14, except for GW samples.
Rru concentrations were significantly better preserved in COOLED samples than ROOM samples on Day 1 (ANOVA, F = 4.375, p < 0.05; Tukey, p < 0.05), but no difference was observed between FROZEN samples and the other storage methods. On Day 14, statistical differences between storage methods were also observed (ANOVA, F = 21.57, p < 0.05), where rru was significantly better preserved in FROZEN samples than in ROOM samples (Tukey, p < 0.05) and in COOLED samples than in ROOM samples (Tukey, p < 0.05), but no significant difference between COOLED and FROZEN was observed. The same patterns were observed for the remaining analysis periods (Days 28, 42, and 84). Raz concentrations were preserved similarly in all storage methods on Days 1, 2, and 3, but on Day 6, concentrations were significantly better preserved in COOLED samples than ROOM samples (T, t = 2.9923, p < 0.05). On Day 28, COOLED and FROZEN samples were better preserved than ROOM samples (ANOVA, F = 6.37, p < 0.01; Tukey, p < 0.05), but FROZEN and COOLED storage methods performed similarly.
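For readers unfamiliar with the V statistic reported alongside the WSR results, the following sketch computes it as the sum of the ranks of the positive paired differences, with zero differences dropped and tied absolute differences given their average rank. This is the standard definition of the statistic that R labels V; the function name and the example values are hypothetical and the p-value calculation is not reproduced.

```python
def signed_rank_v(x, y):
    """Wilcoxon signed-rank V: sum of ranks of positive differences x - y."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # drop zero differences
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        # Find the run of tied absolute differences starting at position i.
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = average_rank
        i = j + 1
    return sum(r for r, d in zip(ranks, diffs) if d > 0)


# Hypothetical paired values: every stored value exceeds its starting value.
start = [10, 10, 10, 10, 10, 10, 10, 10]
stored = [11, 12, 13, 14, 15, 16, 17, 18]
print(signed_rank_v(stored, start))  # 36.0
print(signed_rank_v(start, stored))  # 0
```

With n non-zero pairs, V ranges from 0 (all differences negative) to n(n + 1)/2 (all differences positive), so values at either extreme indicate that every pair changed in the same direction.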
Raz/rru preservation appeared to be affected by the medium in which samples were prepared, especially for rru. In general, COOLED samples of DIW and GW were still well preserved on Day 42, whereas in SW samples changes to the concentration of rru consistently violated the 10% threshold on Day 2 and after (see Supplementary Table S2). Similarly, FROZEN samples of DIW and GW were generally well preserved throughout the experimental period (i.e., on Day 84) but in SW samples changes to the concentration of rru consistently violated the 10% threshold on Day 28 and after. Rru in DIW was better preserved than in the GW and SW samples on Day 3 of analysis, exhibiting a 7.7% change compared to a 54.7% change in GW and a 50.6% change in SW (Fig. 2D-F). On Day 3 and after, however, both tracers seemed to behave similarly regardless of medium. This is with the exception of raz in GW, which increased on Day 6 and after for ROOM samples, with high variance in replicates. This increase in raz, along with the limited increase in rru, may be explained by the transformation of raz, or the decay of rru, to unquantifiable products that fluoresce at a higher intensity but a similar wavelength to raz (Haggerty et al. 2008). The limited increase in rru in the GW (Fig. 1E), as well as the decrease in rru in the ROOM SW samples on Day 28 and after (Fig. 1F), may also be explained by the decay of existing rru which is occurring simultaneously to the production of new rru, potentially dampening both signals. In the ROOM SW samples, the concentration of raz decreases quickly, by ~ 20.9 μg L⁻¹ (−67.1%) on Day 28, where it remained for the duration of the study, suggesting all the raz had been removed (Fig. 1C). The remaining fluorescence signal most likely represents background fluorescence, which has not been corrected for in the calibration, as discussed in Materials and procedures.
FROZEN SW samples exhibited minor decreases in raz but not in an obvious trend, for example decreasing by ~ 2.1 μg L⁻¹ (−12.1%) on Day 14 and by ~ 2.5 μg L⁻¹ (−8.0%) on Day 84, which could suggest some unpredictability in how samples respond to storage or otherwise be explained by variance.

High concentration of raz/rru
In general, similar patterns were observed in the high concentration samples. In some samples the concentration of rru quickly increased to exceed the ULQ of the fluorescence spectrophotometer, for example in the high concentration ROOM DIW samples at Day 6 (Fig. 3D) and ROOM SW after 1 d (Fig. 3F). With the concentration of rru in the mixtures unmeasurable, the fluorescence signals cannot be corrected for signal overlap and so the concentration of raz cannot be calculated. Therefore, these too have been omitted, which can be seen for the samples mentioned above in the corresponding raz graphs (Fig. 3A,C). High tracer concentration data are not plotted in relative change graphs (as in Fig. 2) because missing data could lead to misleading presentation, but data are presented in Supplementary Table S3.
Concentrations of rru had exceeded the ULQ at Day 1 in the ROOM SW samples. COOLED SW samples were better preserved with raz/rru concentrations measurable up to Day 6. Raz/rru in FROZEN SW samples were not measurable at Day 14 but may have been well preserved in intermediate timeframes not analyzed here.
Raz and rru in DIW and GW were better preserved than in SW. In ROOM DIW, both raz and rru were well preserved on Day 3 of analysis, exhibiting < 5% concentration change. Large losses of both raz and rru on Days 6 and 7 in the ROOM DIW samples are probably not representative of overall patterns and measured concentrations remain within standard deviation of starting concentrations (Fig. 3A,D). In FROZEN DIW samples, both raz and rru were very well preserved on Day 84 (the end of the experimental period), exhibiting −5.5% and −9.8% change, respectively.
Lower starting concentrations of raz and rru in the GW samples prevented violation of the ULQ, allowing changes to be observed for almost the entire experimental period, excluding only Day 84 for ROOM GW samples (Fig. 3B,C). In ROOM GW samples, raz decreased by 15.5% on Day 6, before steadily recovering to starting concentrations on Day 14 and continuing to increase until the ULQ is violated on Day 84. This recovery and increase in raz concentration is as yet unexplained and remains unpredictable. These patterns could be explained by the transformation or decay of either tracer to some unquantifiable product which fluoresces at a wavelength spectrum similar enough to interfere with raz. They could also be attributable to changes in background fluorescence (which was not quantified here), for example due to cycling of DOC. Rru seemed to increase linearly in the first 14 d, increasing by 12.4% on Day 3 and 55.8% on Day 14. The rate of rru increase slowed after Day 14, by which point most of the raz has likely been transformed to rru. COOLED GW samples were better preserved, exhibiting a maximum of 6.3% change in raz on Day 14, except a spike in concentration on Day 1 by 8.4%, which could more likely be explained by some mechanism of fluorescence intensity rather than an actual difference in the absolute concentration of raz. A similar pattern, but of lower magnitude, in the concentration of raz (initial decrease followed by an increase) was also observed in the COOLED GW samples. Rru was also well preserved, changing by < 10% at every period until Day 42. In FROZEN GW samples, both raz and rru were very well preserved for the entire experimental period, only deviating from the starting concentrations within the range of standard deviation for most samples, exhibiting −0.6% and 3.9% change on Day 84, respectively.

Discussion
Samples containing raz and rru have typically been analyzed quickly (within 24 h of extraction) because of uncertainties about how well sample integrity is preserved. Here, for the first time, the magnitude and rate at which the concentrations of raz and rru change after extraction is quantified. In this discussion, we offer mechanistic explanation of results and outline the implications of these results for experimental designs and the practical analysis of samples.

Effect of storage temperature
The storage method (i.e., temperature) significantly impacted the preservation of concentrations of both raz and rru. Even after 24 h, samples in some mediums (SW) stored at room temperature had changed by an error level which is unacceptable for most applications. Concentrations in refrigerated samples were preserved much better, which is likely attributable to a decrease in biological activity induced by the lower temperature (Adams et al. 2010). Tracer concentrations changed almost three times faster when samples were stored at room temperature compared to in the refrigerator. As such, samples should be cooled as soon as possible (within 10 h to avoid changes of >10%), and much sooner if further storage is required before sample analysis. If possible, samples should be cooled at collection, for example using a portable cooler, especially where air temperature is high.
The temperature in the refrigerator fluctuated significantly, reaching a maximum temperature of >9°C, which is high enough to stimulate significant microbial metabolic activity (Adams et al. 2010). Whilst this should not occur in laboratory-grade refrigerators, these occurrences are probably not unusual in typical research labs (which are subject to power outages, for example). Refrigeration, therefore, may be a riskier method of storage than freezing, especially for longer durations.
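The figures quoted in this paragraph are of the order obtained from the maximum 24 h changes reported earlier (22.6% at room temperature, 8.4% refrigerated). The sketch below reproduces that arithmetic under the assumption, which is ours and is not stated in the source, that change accumulates linearly over the first day; the function and constant names are hypothetical.

```python
THRESHOLD_PCT = 10.0
ROOM_MAX_24H = 22.6    # maximum % change after 24 h at room temperature
COOLED_MAX_24H = 8.4   # maximum % change after 24 h in the refrigerator


def hours_to_threshold(change_pct_24h, threshold_pct=THRESHOLD_PCT):
    """Hours until the threshold is reached, assuming linear change."""
    if change_pct_24h <= 0:
        raise ValueError("24 h change must be positive")
    return 24.0 * threshold_pct / change_pct_24h


rate_ratio = ROOM_MAX_24H / COOLED_MAX_24H
print(round(rate_ratio, 2))                           # 2.69
print(round(hours_to_threshold(ROOM_MAX_24H), 1))     # 10.6
print(round(hours_to_threshold(COOLED_MAX_24H), 1))   # 28.6
```

Because the inputs are maxima across tracers and mediums, these values describe the least stable case rather than typical behaviour.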
The freezing and thawing process appears to have little or no effect on the measured concentration of raz and rru. Frozen samples of DIW and GW were well preserved (< 10% change) for 84 d (the end of the experimental period). SW samples, however, do exhibit notable changes in the concentration of rru after 28 d. It is unlikely that this is explained by the microbiologically facilitated transformation of raz to rru because most microorganisms do not metabolize at temperatures this low (Adams et al. 2010). Furthermore, the increase in the concentration of rru is not reflected in a corresponding decrease in the concentration of raz. This highlights the need to consider other potential mechanisms for the change in concentration or fluorescence intensity. Some of the observed changes in concentration in the frozen samples could be attributed to microbiologically facilitated transformation as the temperature of samples increased during defrosting in the refrigerator for 24 h. Defrosting samples more quickly, for example in a water bath, may be a suitable solution to reduce this impact, followed by more immediate analysis (Chapman and Mostert 1990).

Variability of effects in different sample medium
The medium of the sample impacts the fluorescence intensity of raz and rru, and their preservation. Here, medium is characterized by water chemistry (i.e., the concentration of nutrients and DOC) only. Microbiological communities were not characterized but likely differed in abundance, composition, and diversity in different mediums, which could have affected fluorescence and raz/rru transformations.
This work again highlights the necessity of calibrating fluorometry instruments with standards prepared in the same medium as the samples, as identified in Blaen et al. (2017) and Knapp et al. (2018). However, this is often not practical or possible, for example when samples are extracted from many locations with different waters, like different depths in the hyporheic zone. In this instance, data should be interpreted with care, for example directly comparing concentrations of samples from a similar medium (e.g., just the SW) but not those extracted from different waters (e.g., deep in the hyporheic zone and in the SW). SW samples exhibited the highest changes in concentration of raz and rru, in both magnitude and rate, in all storage methods. This may be explained by the higher concentrations of DOC and nutrients which control microbial metabolism (Brailsford et al. 2019). It may also be explained by SW containing oxygen-producing cyanobacteria, which may maintain aerobic conditions in the samples for longer thus facilitating the aerobic respiration of raz. GW samples exhibited unexpected and unpredictable changes in concentration which do not appear to be explained by water chemistry but may be attributable to microorganism community assemblage and dynamics.

Implications for experimental designs
Raz is less stable than rru, exhibiting both faster rates and higher magnitude of change. This may be evidence of the transformation of raz to as yet unidentified products other than rru (O'Brien et al. 2000). However, this may also be explained by the relatively weak fluorescence emission of raz compared to rru, which tends to be characterized by shorter fluorescence lifetimes (Berezin and Achilefu 2010). In applications where only the concentration of rru is required, for example in closed system experiments (e.g., Comer-Warner et al. 2018), the acceptable storage time of samples may be higher. Furthermore, lower concentrations of raz and rru are more suitable for storage. The concentration mixtures used here were selected to represent typical mixtures expected in stream tracer experiments (Blaen et al. 2017), some of which quickly violated the ULQ as concentrations increased. Whilst some fluorometers may be capable of analyzing much higher concentrations (Lemke et al. 2013), this work may influence the target concentrations users select, especially if sample storage is required. On top of this, lower variance was observed between repeats of low concentration mixtures compared to high concentration mixtures. Sample dilution may be appropriate, but this should be conducted before storage if possible (i.e., using a dilution factor appropriate for the estimated concentration based on the target concentration), and using the same medium to avoid interfering with fluorescence intensity, for example using stream water from before the tracer injection (which does not contain raz or rru). Otherwise, users may reduce target concentrations to well below the ULQ. Readers are encouraged to first establish the ULQ of their instrument (of tracers in different mediums) in order to inform target concentrations.
For other analytes, correction factors have been established to correct for the effect of storage duration and temperature on concentration (e.g., Peacock et al. 2015; Zeng et al. 2020). These data demonstrate that this is not feasible for the raz-rru system in environmental waters due to the unpredictability of concentration change and to the several interacting factors that may influence the rate, magnitude, and direction of this change, such as water chemistry, microbiology, and the starting concentration of both tracers. Nor is it possible to use knowledge of the medium and raz/rru composition of samples to reliably predict the amount of change of raz/rru. Furthermore, users must consider that the concentrations of raz and rru may not change similarly in different samples. Therefore, the maximum storage time of all samples should be limited to the time at which the most volatile samples are expected to change beyond acceptable bounds, most likely SW samples. Users may consider analyzing more volatile samples first but this could introduce unacceptable bias.

Necessary future research
The dominant mechanism driving the changes in concentration in raz and rru is the irreversible transformation of the former to the latter. Less attention has been afforded to other potentially significant drivers of concentration change, such as decay and transformation to "unquantifiable by-products," which has been alluded to but largely ignored (Haggerty et al. 2008). Any changes in the actual concentration of raz and rru in samples containing both are in fact a combination of many processes occurring simultaneously, and sometimes antagonistically, for example the decay of existing rru and the production of new rru. Furthermore, as we have seen here and in other works, the measured concentration of both raz and rru may be affected by a variety of interacting factors that are difficult to separate and correct for. As such, the relationship between the concentrations of raz and rru may be more complex than often assumed. Future work should investigate the mass balance of raz and rru in closed systems of different conditions and seek to identify and characterize alternative decay products and by-products. As well as implications for sample storage, this work highlights the sensitivity of the raz-rru smart tracer technique to environmental, procedural, and practical factors.

Recommendations
These recommendations are provided to further extend the guidelines outlined by other authors (e.g., Blaen et al. 2017;Knapp et al. 2018). As stated, readers should interpret this work considering relevant DQOs to identify appropriate storage duration and method for individual applications.
• Samples should be refrigerated or frozen after extraction as soon as possible, within 10 h, to avoid risking >10% change in the concentration of tracers.
• If possible, samples should be analyzed within 24 h, until which they should be stored in a laboratory-grade refrigerator.
• If storage duration must exceed 48 h, samples should be frozen. In this instance, users should expect a change in concentration in raz and rru (from the concentration at the point of freezing) of around 10% after 14 d.
• Samples should be stored at low concentrations of raz and rru, at most half the ULQ of the fluorometer in use. This could be achieved by lowering target concentrations or by dilution before storage.
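The recommendations above can be summarized as a simple lookup. The sketch below is only an illustration of the general guidance (cooled for up to 48 h, frozen for up to 14 d, stored concentrations read here as a ceiling of half the ULQ); the function names and returned strings are hypothetical, and project-specific DQOs should take precedence over these defaults.

```python
def storage_advice(hours_until_analysis):
    """Map the planned holding time to the general storage guidance."""
    if hours_until_analysis < 0:
        raise ValueError("holding time cannot be negative")
    if hours_until_analysis <= 48:
        return "refrigerate"
    if hours_until_analysis <= 14 * 24:
        return "freeze"
    return "freeze (beyond 14 d; check against project DQOs)"


def max_stored_concentration(ulq):
    """Ceiling for stored tracer concentration: half the instrument ULQ."""
    if ulq <= 0:
        raise ValueError("ULQ must be positive")
    return 0.5 * ulq


print(storage_advice(24))              # refrigerate
print(storage_advice(7 * 24))          # freeze
print(max_stored_concentration(200))   # 100.0
```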