Implications of macroinvertebrate taxonomic resolution for freshwater assessments using functional traits: The Paute River Basin (Ecuador) case

This study aimed at investigating the taxonomic resolutions (TRs) of benthic macroinvertebrates for freshwater assessments in the scope of the functional trait approach (FTA).


| INTRODUC TI ON
Traditionally, taxonomic-based metrics have been used as indicators of aquatic ecosystem integrity and conservation status (Bellucci et al., 2013;Bücker et al., 2010;Sotomayor et al., 2020). However, the functional trait approach (FTA) has emerged in the last decades as one of the most promising tools for bioassessment of freshwater ecosystems (Bruno et al., 2016). Functional traits are morphological, physiological, phenological or behavioural characteristics of organisms that influence their performance or fitness (Nock et al., 2016).
Traits are usually divided into two categories (Menezes et al., 2010), that is biological (e.g. maximum body size, life span, feeding and reproductive strategies, mobility) and ecological (related to habitat preferences, such as pH and temperature tolerances, organic pollution tolerance). The ecological functions of species can be described by a multitude of biological traits that reflect their adaptation to environmental disturbances (Townsend & Hildrew, 1994). For example, the maximal size, body shape, life span, reproductive cycle and feeding strategies are typical traits reflecting the adaptation of species to more or less disturbed environments . As these traits are comparable across taxonomic groups, they provide a unified measure through communities differing in taxonomic composition. For bioassessment, this unified measure is relevant because the FTA implies that environmental filters select taxa with suitable traits to coexist under similar environmental conditions or types of disturbances (Poff, 1997;Storey et al., 1991).
Benthic macroinvertebrates play an important role in the stream ecosystem functions, that is, influence on nutrient cycles, primary productivity, decomposition and translocation of materials (Wallace & Webster, 1996). Despite the fact that several significant studies have been carried out on benthic macroinvertebrates and their functional traits (Bêche & Resh, 2007;Chará-Serna et al., 2012;Forio et al., 2018;Príncipe et al., 2010;Reynaga et al., 2020), there exist substantial debate among biologists on the most appropriate approaches for sampling design, field sampling, laboratory analyses and data assessments. A point of disagreement in freshwater bioassessment using benthic macroinvertebrates has been the level at which they should be identified, that is the relevant taxonomic resolution (Bailey et al., 2001;Dolédec et al., 2000;Gayraud et al., 2003;Jones, 2008;Mueller et al., 2013;Vieira et al., 2006). Taxonomic resolution (TR) varies depending on the goals of a study as the level of identification should match the need for information (Ferraro & Cole, 1995). Hence, under ideal conditions the default taxonomic level for bioassessments should be species; however, under realworld constraints (i.e. financial, logistical, lack of local taxonomic knowledge) detailed high TRs (i.e. at genus or species level) are not an option for many regions in the world.
In areas where coarser TRs (i.e. at family level) are the only possibility, certain aspects have to be considered, including (a) how well finer TR data (e.g. species or genus composition) are represented by coarser TR data? and (b) how much loss in accuracy of freshwater quality assessments is produced by the availability of coarser TR data instead of finer TR data? Arguing that taxonomic composition provides little information on ecosystem functioning, many researchers have been working instead on the FTA, in which site-by-taxa data sets are re-defined, using functional or biological trait information, into site-by-trait data sets (Jones, 2008).
In the FTA, the generated trait databases are defined either on finer or on coarser TR, as it is also the case in Latin America (Forio, 2017;Ramírez & Gutiérrez-Fonseca, 2014;Tomanova & Usseglio-Polatera, 2007).
The main goal of this research was to study the TR of benthic macroinvertebrates using functional traits for freshwater assessments. Hence, the specific goals were to (a) quantify the difference between families and genera in terms of functional description; (b) define and apply a functional diversity index for site characterization using families and genera; and (c) compare sampling site grouping on the basis of functional diversity as per TR (i.e. families or genera), using an appropriate cluster algorithm. While related research has been performed in other regions (e.g. Dolédec et al., 2000;Gayraud et al., 2003;Mueller et al., 2013), it has hardly considered study sites with pronounced elevational and climatic gradients as the ones that characterize the current study site and, definitively, has not used the methodology that has been implemented herein. Indeed, the present study is the first South American attempt to investigate the effects of taxonomic resolution of benthic macroinvertebrates on freshwater bioassessment using functional traits. Particularly, very few studies have considered how taxonomic resolution influences F I G U R E 1 (a) Paute River Basin in continental Ecuador, its two largest cities (Cuenca and Azogues) and the location of the 22 sampling sites; and (b) the digital elevational model of the basin functional diversity measures (Schmera et al., 2017), which emphasizes the international relevance of this study.

| Study area
The Paute River Basin (PRB) is located in the south of Ecuador ( Figure 1). It is of high importance owing to its significant hydroelectric potential (CONELEC, 2011;Salazar & Rudnick, 2008). The basin has an area of 6,442 km 2 including the eastern lower portion towards the Amazon plateau. Its elevational ranges between 500 and 4,250 m above sea level (a.s.l.), and slopes vary between 25% and 50%. The climatology of the basin is highly variable. Its average air temperature varies between 4.4°C and 18.6°C. This wide range of variation of the average air temperature depicts the high variability of this meteorological variable in response to the significant elevational gradient of the study site. The lower air temperatures correspond to the western Andes range with an average of 6°C (Páramo), while the warmest areas are situated in the valleys and subtropical zones (Amazonia region), with an interannual average fluctuating between 22°C and 26°C. Due to the wide elevational range, rainfall oscillates in intensity and duration, with maximum annual averages between 2,500 and 3,000 mm at higher elevations and minimum annual averages between 600 and 800 mm in the valleys (Celleri et al., 2007). Two major cities, namely Cuenca and Azogues, are located in the basin with approximately 500,000 and 33,850 inhabitants, respectively (2010 census). Pollution in the study basin comes from point and non-point sources, including domestic wastewaters, agricultural runoff, animal husbandry and industrial effluents (Da Ros, 1995). Degraded streams are characterized by high values of faecal coliforms, chlorides, total hardness, nitrate, total alkalinity, biochemical oxygen demand, turbidity, water temperature and low dissolved oxygen values. On the contrary, pristine sites are associated with good plans of land use (i.e. protected areas) and adequate habitat conditions (i.e. riparian vegetation, streambed heterogeneity) (Sotomayor, 2016;Sotomayor et al., 2018;Sotomayor et al., 2020).

| Sampling and analysis of benthic macroinvertebrates
Macroinvertebrate data were collected from 22 sampling sites located in the study basin within three years (2010, 2011 and 2012) repeatedly, by the former Ecuadorian National Secretary of Water (SENAGUA)-Santiago River Hydrographic Demarcation (DHS). At each sampling site, a 20-m-long reach was selected. Three evenly spaced transects were delineated throughout the 20-m-long reach (Von Ellenrieder, 2007). Macroinvertebrate samples were collected by a standard "kick-sampling" using a 25 × 25 cm 2 nylon handnet (mesh size: 0.5 mm) placed on the stream bottom, in front of which (upstream) the substratum was vigorously stirred by kicking it (Jacobsen et al., 1997). Sampling was carried out for three minutes, encompassing all existing microhabitats characterized by different depths, substrates and water velocities. The three samples from each transect were pooled together, and sampling continued by visually inspecting (for about 20 min) the substrate and aquatic vegetation to collect the tightly clinging taxa (e.g. Blephariceridae or Elmidae) that may have not been dislodged by kick-sampling (Roldán, 2003). Macroinvertebrate samples were preserved in 70% ethanol and sorted with the use of a stereomicroscope.
On average, the sampling sites were visited four times per year. Some were sampled more frequently, because they were located either at highly impacted sites (e.g. sites B2, B8 y B13 in Figure 1) or, on the contrary, at unaltered environmental (i.e. reference) locations (e.g. sites SB4, COL1, PIN5 in Figure 1). As a result, a benthic macroinvertebrate database was developed for n sp = 22 sampling sites, n rep = 230 sampling replicates that overall contain n gen = 79 genera, n fam = 44 families and n ord = 17 superior taxonomic groups (in its great majority orders).

| Allocation of biological traits
Seven biological trait categories were selected ( Table 1). They are well documented as good indicators for capturing compositional changes due to ecohydrological stress (Forio, 2017;Reynaga et al., 2020). Information on individual traits was gathered from literature about research applied on Neotropical zones, that is Tomanova et al. (2006Tomanova et al. ( , 2007; Chará-Serna et al. (2012); Ramírez and Gutiérrez-Fonseca (2014) and Longo and Blanco (2014). Biological trait databases at the genus (g) level (Db g-trait ) were constructed ( Figure 2) on the basis of the affinity of each taxon for trait categories using affinity scores within a fuzzy coding procedure (Chevenet et al., 1994) upon literature information. A score value of 0 indicates that there is no affinity of the taxon for the trait category, 1 low affinity, 2 medium affinity and 3 strong affinity.

| Trait data processing
Two levels of TRs, namely, genus (g) and family (f), were used in the scope of the FTA (Figure 2). The following protocol, adapted from Gayraud et al. (2003), was used for data preprocessing to provide the same weight to each taxon and every biological trait, prior to further analyses: 1. For all n rep , the affinity scores of each genus from Db g-trait were standardized so that the sum of affinity scores to a given trait equalled 1 (z g-trait ).
2. The standardized affinity scores of each genus from z g-trait were used to derive corresponding scores at family level by averaging the affinity scores of all genera belonging to a given family (Db f-trait ). The latter were then rescaled so that their sum, for a given trait, equalled 1 (z f-trait ). In this way, affinity scores have the same scale for genera and families (z g-trait and z f-trait ). Only families with n gen > 1 were considered to derive scores at family level.
3. Trait category proportions were calculated by multiplying the standardized affinity scores (either z g-trait or z f-trait ) of each taxon with its correspondent log(x + 1)-transformed abundance (Latli et al., 2017;Schmera et al., 2014). A log(x + 1) transformation was used instead of a ln(x) transformation owing to the fact that for several taxa, an abundance of 1, leading to a ln value equal to 0, was observed. The resulting trait proportions were rescaled (rsc) so that their sum equalled 1 for a given trait and river reach. As a result, two databases (Db g-rsc and Db f-rsc ) with standardized information of both TRs were derived.

| Comparison of taxonomic resolutions
To quantify the difference between genera (Db g-rsc ) and families (Db f-rsc ) in terms of their functional descriptions (i.e. using data of trait standardized abundance at genus and family levels), the differences in trait structure (DTS) were characterized through the Nash and Sutcliffe (1970) coefficient (NSE) and the Pearson-type correlation coefficient (r 2 ). The NSE is defined as: where Txg i is the rescaled i-th functional description of a genus from Db g-rsc , Txf i is the rescaled i-th functional description of the linked family (simulated from the genera that it encompasses), Txg is the rescaled mean of the functional description of the corresponding genus, and n is the total number of observations. NSE is dimensionless and varies in the interval −∞ < NSE ≤ 1. NSE = 1 indicates a perfect coincidence of the two data sets being compared, while NSE ≤0 characterizes completely dissimilar data sets. The higher the NSE value in the interval [0, 1], the more similar are the two data sets (Ritter & Muñoz-Carpena, 2013;Vázquez et al., 2002;Wu et al., 2006). The r 2 index can range from 0 (no correlation) to 1 (perfect correlation). Higher values indicate higher correlation between the two data sets being compared. Despite some pitfalls of r 2 for characterizing the similarity of two data sets (Legates & McCabe, 1999;Vázquez et al., 2002), it has been used in this study as a complement to the information provided by the NSE, particularly on the correlation between both data sets. Depending on the context of its use, typically, values greater than 0.7 are considered to reflect an acceptable correspondence between two data sets (Santhi et al., 2001;Schober & Schwarte, 2018;Vázquez, 2003).
In the context of (numerical) modelling, where observed and simulated data sets are compared with each other, some authors have adopted scales of agreement such as (Moriasi et al., 2007;Parajuli et al., 2009): excellent (r 2 and/or NSE > 0.90), very good (r 2 and/or NSE in the range 0.75-0.89), good (r 2 and/or NSE in the range 0.50-0.74), fair (r 2 and/or NSE in the range 0.25-0.49) and poor (r 2 and/or NSE in the range 0-0.24). Hence, a threshold value of 0.75, for either index, has been adopted herein to depict an acceptable similarity between two data sets. The empirical exceedance probability distribution of both indices was calculated (Figure 3a) on the basis of the Weibull plotting position method (Makkonen, 2006;Vázquez, 2003).

| Calculating functional diversity
Functional diversity (FD) can be summarized using indices based on trait values and taxa dominance in the community, that is abundance (Pla et al., 2012). Rao's quadratic diversity index (Q) (Rao, 1982) is one of the most important FD metrics (Mouchet et al., 2010) that has been widely used to measure the effects of anthropogenic activities on the FD of aquatic organisms. In general, the observed pattern is a reduction in Q values for impacted areas (Desrosiers et al., 2019;Magliozzi et al., 2020;Meng et al., 2021).
Nevertheless, Q suffers from some pitfalls (Chen et al., 2018;Liu & Rao, 1995;Pons & Petit, 1996), among which the fact that its range of values depends on the scale of the dissimilarities between the compared data sets is one of the most relevant. Consequently, the rRao variation of Q that overcomes this problem (Pavoine et al., 2005) by standardizing Q with respect to its maximum value (Champely & Chessel, 2002) was used. rRao is determined from the computation of the Euclidean distance (d ij ), between taxa i and j in the trait space, as a measure of dissimilarity between i and j (Chen et al., 2002): where N is the number of taxa, w i is the relative frequency of taxa i, w j is the relative frequency of taxa j, and ij = √ 2d ij. Herein, the suggested rRao was calculated for all of the replicates considering both data sets, Db g-rsc and Db f-rsc , resulting respectively in data sets rRao g-rep and rRao f-rep . rRao varies in the interval [0, 1], with 0 representing a total loss of FD, that is, a heavily impacted stream ecosystem.
Being dimensionless, rRao enables a direct comparison between data of both TRs.

| Data preprocessing
Firstly, the Kolmogorov-Smirnov test was used to compare the distributions of the two TRs, that is, rRao g-rep and rRao f-rep , by computing the maximum distance between the cumulative distributions of both TRs (Thas, 2010). Secondly, the Shapiro-Wilk test (Shapiro & Wilk, 1965) was used to evaluate whether the distributions of rRao g-rep and rRao f-rep data sets were normal by considering all of the available data.
This test showed with a 95% confidence level that neither of the two data sets is normally distributed (p = .0001 < α = 0.05), exhibiting Flow chart of the implemented study in both cases a positive skew. Therefore, it was decided (Sotomayor et al., 2018) to aggregate the rRao g-rep and rRao f-rep data for every one of the sampling sites (i.e. using central tendency values). Thus, individual Shapiro-Wilk tests per site were run for either TR considering all of the data available per site. For a particular site, if the test indicated normality, the average rRao value was used for that site, for the respective TR being analysed; otherwise, the median central tendency value was assigned (Anderson & Finn, 1996;Helsel et al., 2020). As a result, two sets of aggregated data were obtained, rRao g-samp and rRao f-samp , each of them containing 22 data in correspondence with the 22 sampling sites.

| Cluster analysis
Both data sets rRao g-samp and rRao f-samp were used in the context of cluster analysis (CA) through the K-means algorithm (MacQueen, 1967), which is a non-hierarchical and simple partitional clustering algorithm that attempts to find K non-overlapping clusters. Thus, K is the number of clusters, that is, in the current research, the number of FD levels for both TRs, and is defined in advance.
Every point in the data is then assigned to the closest cluster centroid, forming a cluster. The centroid of each cluster is then updated based on the points assigned to that particular cluster. This process is repeated until no centroid changes are needed (Wu, 2012). The optimal K-means clustering in one dimension via dynamic programming (Wang & Song, 2011) was used herein. K was varied between 2 and 5 for each K-means run on both data sets. Lower K values are associated with higher rRao values and vice versa. This K range was fixed in correspondence with what is commonly reported as the number of surface water quality classes (Howladar et al., 2018;Moyel, 2014;Shakhari et al., 2019;Sotomayor et al., 2018).
Once all of the K-means runs with K varying in the [2, 5] interval were carried out, a comparative analysis was performed on the two outputs associated with every K value by using the Youden index (ψ; Youden, 1950), the non-error rate (NER; Ballabio et al., 2018) and a measure of concordance (MoC; Pfitzner et al., 2009) to characterize their similarity. ψ is a commonly used measure to compare diagnostic abilities of two tests (Biggerstaff, 2000). It is a measure related to single classes; thus, the average (ψ) for all of the classes was calculated to characterize the whole clustering output. ψ = Sn -(1 -Sp), with Sn being the sensitivity and Sp being the specificity that, in turn, are classification parameters of the h-th class and are related to the confusion matrix (CM). Sn represents the ability of a given classifier to correctly identify the samples of the h-th class. The Sp represents the capability of a given classifier to reject samples of the other classes. It is calculated as the ratio of samples not belonging to the h-th class, which were not classified in the h-th class over the total number of samples not belonging to the h-th class (Sokolova et al., 2006). The NER is the average of Sn and provides an overall evaluation of the classification quality. ψ and NER have values between 0 (no class discrimination) and 1 (perfect class discrimination) (Frank & Todeschini, 1994).
Finally, the MoC was used for characterizing the extent to which two cluster outputs agree with each other. Hence, given that there are I clusters in K-means g-K and J clusters in K-means f-K , and that each individual cluster in K-means g-K is referred to as g i , and each cluster in K-means f-K as f j , for i ∈ {1, 2, …, I} and j ∈ {1, 2, …, J}, then any cluster g i from K-means g-K can be subdivided into smaller subclusters or fragments. A fragment (m ij ) consists of those elements of g i that have also been allocated to a single cluster f j ; m ij is therefore the intersection between g i and f j . If cluster g i contains the fragment m ij , then cluster f j also contains the same fragment. The term m 2 ij / g i f j provides a symmetric measure of mutual concordance between the two clusters g i and f j . The maximum value m 2 ij / g i f j = 1 is attained only when g i = f j . The MoC is calculated as follows and varies between 0 (no concordance) and 1 (perfect concordance).
At last, the threshold cluster number (K th ) was chosen as the K value after which genus-and family-based bioassessments are dissimilar as suggested by the ψ, NER and MoC similarity indices. The rRao was calculated with the aid of the "ade4" R package (Dray & Dufour, 2007), while ψ and NER were calculated using the Classification toolbox version 5.0 (Ballabio & Consonni, 2013) in MATLAB ® (Araghinejad, 2014). The K-means was implemented with the "Ckmeans.1d.dp" R package (Wang & Song, 2011), and the MoC index was calculated with Microsoft Excel ® Spreadsheet.

| Sampling and analysis of benthic macroinvertebrates
A total of 31,588 benthic macroinvertebrate individuals were collected; 27,292 individuals were identified to the genus level. The taxonomic analysis resulted in 79 genera, 44 families and 17 superior groups (in its great majority orders). Individuals belonging to the Oligochaetes could be identified only to order level. Baetodes were the most dominant genus, accounting for 50.3% of all individuals, followed by Leptohyphes (5.6%), Gigantodax (5.3%), Camelobaetidius (4.2%), Helicopsyche (3.3%) and Limonicola (3.2%). The rest of the genera accounted for 28.1% of the total abundance.

| Comparison of taxonomic resolutions (TRs)
The analysis suggested an exceedance probability of about 82% for having higher values than 0.75, for both NSE and r 2 (Figure 3a). The evolutions of the exceedance probability curves associated with NSE and r 2 were steadily similar (Figure 3a). However, not all taxa there is an acceptable similarity ("very good" according to the scale suggested by Moriasi et al., 2007;Parajuli et al., 2009) between the two rescaled functional trait databases Db g-rsc and Db f-rsc encourages using either TRs in freshwater quality bioassessments that include, for instance, functional diversity indices.

| Calculating functional diversity
The main statistical descriptors of the rRao distributions for both TRs are almost identical (Figure 4a). Hence, no significant differences between rRao from both TRs were observed (Figure 4b).
This trend was emphasized by the Kolmogorov-Smirnov test; that is, there was not a statistically significant difference between distributions of rRao g-rep and rRao f-rep at the 95.0% confidence level (p value = .08 > α = 0.05).

| Cluster analysis
Significant differences between the outputs of the cluster analysis were noticeable from K = 4 onwards ( Table 2) as a function of TR.
K-means outputs for K = 2 and K = 3 were very much similar in terms of both TRs, as the only difference in both cases occurred for the allocation of the MAG1 and PB2 sites for K = 2 and K = 3, respectively.
This was confirmed by the values adopted by the ψ, NER and MoC similarity indices ( Figure 5) that dropped significantly after K th = 3, characterizing as such dissimilar genus and family cluster outputs, which would imply dissimilar genus-and family-based bioassessments. Although a K = 6 value is generally not used in bioassessments and, as such, was not included in Figure 5, it was numerically inspected to cross-check on a consistent decrease in the similarity index values, which was indeed the case, emphasizing as such that in the current case, K th = 3. It should be emphasized that one of the main aims of the current research was quantifying the difference between families and genera in terms of functional description. Thus, the goal was not to assess the ecological pertinence of the outcomes of the clustering algorithm used.

F I G U R E 4 (a)
Main statistical properties of Rao index relative to maximum (rRao) for all of the genera and families; and (b) rRao frequency histograms for genera and families F I G U R E 3 (a) Empirical distribution of exceedance probability of Nash and Sutcliffe coefficient (NSE) (P(NSE ≥ nse)) and r 2 (P(R 2 ≥ r 2 )) for the whole set of traits of family and genus; and scatter plots of mean NSE and r 2 calculated for the standardized traits of family and genus (Db f-rsc and Db g-rsc ) at every sampling site for representative (b) high-agreement, (c) medium-agreement and (d) low-agreement conditions. X is either NSE or r 2

| D ISCUSS I ON
One of the specific objectives of this study was defining and applying a functional diversity (FD) index for site characterization using families and genera. To find an accurate estimation of FD, the primary inputs (i.e. the trait information) are crucial. Some studies (Chará-Serna et al., 2012;Reynaga et al., 2020;Tomanova et al., 2006Tomanova et al., , 2007 assign trait scores at family and/or genus levels, and generate sitespecific information through gut content analyses, mouthpart observations, size measurements or skeleton hardness measurements, among others. However, in this study, the generation of this sitespecific data was not an option due to time and resource limitations. The alternative, considered in this research, was using published trait information. Considering the first law of geography-near things are more similar than distant ones (Tobler, 1970) However, for some of our genera (12.7%), no trait data were available in these DBs. Thus, the missing data were completed with macroinvertebrate trait data from non-Neotropical areas but that also are situated in mountainous regions, in an attempt to reduce the use of non-representative trait information.
Many FD indices have been proposed aiming at measuring functional-richness, evenness and divergence-aspects that some F I G U R E 5 Distribution of similarity indices, namely, measure of concordance (MoC), non-error rate (NER) and average Youden index (ψ), evaluated on the outputs of the cluster analysis TA B L E 2 Outputs of the cluster analysis as a function of the number of water quality (WQ) classes (K) and the taxonomic resolution (TRs, i.e. genus and family) authors argue are the components of FD (Villéger et al., 2008) in the context of a multifaceted framework. It is also argued that this multifaceted framework of FD helps understanding how biodiversity interacts with the environment. Hence, not all of them provide the same information because not all of them are measuring the same facet; as such, there is a lack of consensus about which the most appropriate one is (Mouchet et al., 2010). Further, other authors put a lot of criticism over the FD multifaceted framework (i.e. Laliberté & Legendre, 2010). Rao's quadratic entropy (Q) is the most frequently used index to quantify the FD of freshwater macroinvertebrate assemblages (Schmera et al., 2017) and in general functional ecology. One possible reason for that is due to the fact that Q represents a mixture between functional aspectsrichness and divergence (Mouchet et al., 2010). Nonetheless, when computing Q for comparisons between data sets (i.e. under a comparative framework) with very different numbers of taxa, it is statistically biased (Casanoves et al., 2011;Chen et al., 2018) because its final value depends, among other aspects, on the scale of the dissimilarities between the compared data sets. Consequently, under these conditions of comparison, the use of Q is not adequate (Pavoine et al., 2005). Hereafter, for carrying out the current comparison between family and genus TRs, it was used the modification of the Rao index by Pavoine et al. (2005), that is Rao relative to a maximum (rRao), that is Q standardized with respect to its maximum as a statistical simple way of overcoming the referred bias pitfall. Further, the latter is in line with what has been done in other related ecohydrological studies (Koperski, 2019;Teresa et al., 2015).
The study showed that both differences in trait structure (DTS), characterized by the NSE and the r 2 indices, and FD, measured through the rRao index, exhibited no significant differences as a function of the TR. However, some authors argue that species-level identification is generally required when using diversity indices (Legendre & Legendre, 2012), such as the rRao index, for example when assessing freshwater pollution (Dalu et al., 2017;Guerold, 2000;Lenat & Resh, 2001). Others have defended, however, the use of diversity indices at family level (Bo et al., 2020;Corbi & Trivinho-Strixino, 2006;Ferraro & Cole, 1995;Mueller et al., 2013), which is in line with the results of this study (Figure 4). Taxa should be identified to a sufficient level to permit the detection of changes in stressed macroinvertebrate assemblages (Terlizzi et al., 2003). Current results suggested that family-level identification was detailed enough to obtain similar rRao values for either TR. Furthermore, when using cluster analysis upon the rRao values, for a number of water quality (WQ) classes K = 2 and K = 3, nearly all of the same sampling sites are in the same clusters for both TRs, emphasizing that family-level identification could be sufficient until K th = 3.
The different K-means outputs are consistently grouping the same sites for either the best or the worse WQ conditions, independently of the considered K value and TR. This is congruent with previous findings in the Paute River Basin where sites belonging to the Burgay sub-basin (i.e. B2, B8 and B13; see Table 2) were associated with deteriorated WQ. On the contrary, the Collay sub-basin (i.e. COL1; see Table 2) was considered as a clean system (Sotomayor, 2016;Sotomayor et al., 2018Sotomayor et al., , 2020. In the study, the trait information of a given family was derived through the use of the trait data of all of the genera belonging to that family (through the averaging of z g-trait for obtaining the family scores). This approach was aimed at, firstly, achieving standardized information for both TRs that would enable a direct comparison between genus-based and family-based results; and, secondly, capturing the genus, as well as feasible, variability occurring within a given family. Particularly, the latter aimed at mitigating as much as possible the limitation of using coarse taxonomic resolutions (when no finer information is available beforehand), which would normally imply larger errors if this in-family genus variability is neglected (Dalu et al., 2017;Wu, 1982) or roughly approximated by using information upon more common or dominant genera.
Dissimilarities between the K-means outputs as a function of the TRs started from K = 4 onwards, which was assessed through the ψ, NER and MoC indices ( Figure 5). ψ and NER are based on the confusion matrix (CM) assessment, considering the same K for either TR. This principle of using the same K for either cluster analysis is the so-called principle of distributional equivalence (Govaert & Nadif, 2018). An alternative technique for comparing the outputs of K-means as a function of the TRs is based on the mutual information (MI) approach, represented herein by the MoC index, which expresses the amount of information that the cluster outputs have in common (Pfitzner et al., 2009 The joint application of the CM and MI indices led to a congruent assessment of the K th value. To the best of our knowledge, it has rarely been used a combination of CM and MI indices in the context of a multiobjective bioassessment study (Ficko & Bončina, 2013).
Dealing with different TRs raises the question on the precision of correct identification of macroinvertebrates. Identification accuracy and precision are better for coarse taxonomic resolutions than for finer ones. Identification to genus or species level requires a significant investment in quality assurance, namely, a comprehensive library of macroinvertebrate taxonomic keys, a reference collection and the frequent consultation of taxonomic specialists (Jones, 2008). For example, Sondermann (2013) reported that for the Colombian elmid fauna, around 42% of the reported genera and 15% of the overall genus records turned out to be presumably erroneous. Similar findings were reported by Moulton et al. (2000) and Ochieng et al. (2019). Hence, working on family identification level assures more reliable databases, which can serve as input data to further bioassessment analyses.
This study focused on developing an alternative in the framework of FTA, which is normally applied on genus/species level, so that it could be implemented on a family level, with the potential of cost-reduced long-term monitoring for WQ management purposes.
This research recommends the establishment of an initial period of monitoring and detailed taxonomic identification, to a species/genus level, upon which the proposed methodology could be applied to define, as in the current case, a similarity level of functional traits for finer and coarser TRs. The methodology also identifies a threshold level of WQ class discretization, until which working with coarser TRs is acceptable. By doing these, the proposed methodology aims at using in the future coarser TRs, with the consequent reduction in time and monetary expenses, as well as the need of high-level taxonomic expertise. The length of this initial period when a detailed TR is used would depend on several factors, among which the objectives of the study beforehand, the size of the study site and the frequency of monitoring are very influential. These factors are likely to affect as well the similarity level of functional traits for finer and coarser TRs that could be reached in the local application of the proposed methodology. Further, it is up to the decision-makers in charge of WQ management to decide on whether the reached level of functional trait data similarity is sufficient for WQ management purposes and whether the defined threshold level of WQ class discretization is acceptable.

| CON CLUS IONS
The functional trait approach (FTA) was applied as a way of studying the taxonomic resolution (TR) of benthic macroinvertebrates for freshwater assessments. For achieving this purpose, the differences between families and genera were quantified in terms of functional description, which led to the conclusion that in the study site a functional trait data similarity of about 82% exists for both TRs.
Site characterization was carried out by using a functional diversity index (rRao) for families and genera with no significant differences between both TRs. Sampling site clustering on the basis of functional diversity was compared using the K-means algorithm for both TRs, based on two different frameworks, namely, the confusion matrix (CM) and the mutual information (MI) assessments. These frameworks led to a congruent assessment of the K value (K th = 3 classes) until which family-level identification in the study basin was suitable in detecting changes of macroinvertebrate assemblages. The study proposes an alternative FTA methodology for WQ management purposes so that coarser TRs could be used in the future with the consequent reduction in monitoring time and costs and the need of high-level taxonomic expertise.

ACK N OWLED G EM ENTS
The authors would like to express their gratitude to the former Ecuadorian National Secretariat of Water (SENAGUA)-DHS for making the raw data accessible to the current study and to Pablo E. Gutiérrez-Fonseca, Fernando Casanoves and Julio Di Rienzo for their helpful suggestions. This research was carried out in the context of the research Project "Efectos del estrés hídrico sobre la biodiversidad en ríos abastecedores de la ciudad de Cuenca," financed by the Research Directorate of the Universidad de Cuenca (DIUC).

CO N FLI C T O F I NTE R E S T
The authors declare that they have not known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

DATA AVA I L A B I L I T Y S TAT E M E N T
The database belongs to the Ecuadorian government, concretely to the former National Secretary of Water (SENAGUA), which was recently absorbed by the Environmental, Water and Ecological Transition Ministry (MAATE). To use this database, the first author subscribed with the Ecuadorian government a specific agreement that entitles him using the data for his PhD research. This signed agreement explicitly forbids him sharing the information with a third party without the specific authorisation of the government, which definitively might be almost impossible to get in a reasonable mid-term.
Thus, the co-authors would prefer not to publicly share this database.