Phylogeography of a widespread Australian freshwater fish, western carp gudgeon (Eleotridae: Hypseleotris klunzingeri): Cryptic species, hybrid zones, and strong intra‐specific divergences

Abstract Despite belonging to the most abundant and widespread genus of freshwater fishes in the region, the carp gudgeons of eastern Australia (genus Hypseleotris) have proved taxonomically and ecologically problematic to science since the 19th century. Several molecular studies and a recent taxonomic revision have now shed light on the complex biology and evolutionary history that underlies this group. These studies have demonstrated that carp gudgeons include a sexual/unisexual complex (five sexual species plus an assortment of hemiclonal lineages), many members of which also co‐occur with an independent sexual relative, the western carp gudgeon (H. klunzingeri). Here, we fill yet another knowledge gap for this important group by presenting a detailed molecular phylogeographic assessment of the western carp gudgeon across its entire and extensive geographic range. We use a suite of nuclear genetic markers (SNPs and allozymes) plus a matrilineal genealogy (cytb) to demonstrate that H. klunzingeri s.l. also displays considerable taxonomic and phylogeographic complexity. All molecular datasets concur in recognizing the presence of multiple candidate species, two instances of historic between‐species admixture, and the existence of a natural hybrid zone between two of the three candidate species found in the Murray–Darling Basin. We also discuss the major phylogeographic patterns evident within each taxon. Together, these analyses provide a robust molecular, taxonomic, and distributional framework to underpin future morphological and ecological investigations on this prominent member of regional freshwater ecosystems in eastern Australia.


| INTRODUC TI ON
Phylogeographic assessments of freshwater fishes have commonly revealed cryptic species (e.g., Adams et al., 2014;Baumsteiger et al., 2012;Pinacho-Pinacho et al., 2018), thereby improving knowledge of species richness and diversity in aquatic ecosystems globally (Seehausen & Wagner, 2014).In Australia, for example, the number of recognized (but not necessarily described) species increased by 39 between 2002 and 2013 (Allen et al., 2002;Unmack, 2013), with many of the newly defined species revealed by molecular phylogenetic and phylogeographic evidence (Adams et al., 2023).Furthermore, phylogeographic studies have indicated that historical, geological, and/or climatic processes can be determinants of contemporary patterns of biodiversity and distribution across riverine landscapes (e.g., Buckley et al., 2021;Shelley et al., 2020;Waters et al., 2020).Consequently, the phylogeographic assessment of freshwater fishes provides foundational diversity and biogeographic knowledge upon which valid ecological and environmental management studies are dependent (Page et al., 2017).
In this study, we present a molecular phylogeographic assessment of the western carp gudgeon (Eleotridae: Hypseleotris klunzingeri) throughout its entire Australian range.This small-bodied freshwater fish (approximate maximum size is 60 mm TL) is one of the most abundant and widely distributed species in eastern Australia (Pusey et al., 2004;Unmack, 2000).Here, it occurs as a disjunct northern population in the Burdekin Basin in central-northern Queensland, then in coastal drainages from just north of the Fitzroy Basin to the Clarence Basin, as three disjunct southern populations in the Macleay, Hunter, and Shoalhaven basins in central New South Wales, inland throughout the Murray-Darling Basin (MDB), and finally west to the Bulloo River and Cooper Creek in the Lake Eyre Basin (Figure 1, Table 1).
Hypseleotris klunzingeri is often found in sympatry with other carp gudgeons, all belonging to a species complex comprising a suite of other sexual and hemiclonal congeners, but is reproductively isolated from them (Bertozzi et al., 2000;Schmidt et al., 2011;Unmack et al., 2019).
As reviewed by Pusey et al. (2004), H. klunzingeri occurs in a wide range of lotic and lentic habitats, with aquatic plants and leaf-litter beds, and slow-flowing water (i.e., <0.2 m/s) the preferred (but not exclusively used) habitat features.The species is primarily benthic, with a diet dominated by microcrustaceans and macroinvertebrates (e.g., chironomids, ephemeropterans, and trichotperans).Hypseleotris klunzingeri has a wide range of water quality tolerances but is only known from upper estuarine areas downstream of tidal barrages that prevent upstream movement, indicating that estuarine habitats are not naturally used by the species.Spawning can occur most of the year (excluding winter), but peaks in spring and early summer in response to increasing water temperature and daylight length, and possibly flow events, with flow events reported to catalyze the onset of mass migration of H. klunzingeri (Pusey et al., 2004).
As with many other Australian freshwater fishes, the taxonomy and distribution of H. klunzingeri has historically been equivocal, having first been mistaken by Klunzinger (1872Klunzinger ( , 1880) ) as H. cyprinoides (which does not occur in Australia), then described as Carrassiops klunzingeri by Ogilby (1898), to be later reassigned to the genus Hypseleotris.Its common name of western carp gudgeon relates to the early thought that the taxon primarily occurred west of the Great Dividing Range compared with the more easterly H. galii (e.g., Anderson et al., 1971), whereas today we know that representatives of both lineages naturally occur both east and west of the Great Dividing Range (Thacker, Geiger, & Unmack, 2022).While the current nomenclature suggests a robust taxonomic status, matrilineal phylogeographic data presented by Thacker et al. (2007) indicated several divergent lineages within H. klunzingeri that to date have not been closely re-assessed.Furthermore, their phylogeographic analyses suggest the low elevation divide between the coastal Burnett River and the Condamine River in the northern MDB is the pathway by which H. klunzingeri was exchanged between these basins (see also Unmack, 2013).
The purpose of this study is to present a thorough molecular phylogeographic assessment of H. klunzingeri, using comprehensive sampling and an expanded suite of molecular datasets.Regarding the latter, our choice of two independent sets of co-dominant nuclear markers (single nucleotide polymorphisms [SNPs] and allozyme loci), plus matrilineal sequence data (the mtDNA gene Cytochrome b [cytb]), ensures that this study is ideally placed to help resolve the above-described taxonomic uncertainty and re-evaluate biogeographic patterns in this prominent Australian freshwater fish.We sampled throughout range of H. klunzingeri for a total of 120 sites (Figure 1, Table 1).All river names used herein are cross-referenced to site numbers in Table 1.Each molecular dataset has a different number of populations and individuals but overlap between marker types was high.The SNP dataset consisted of 106 sites and 204 individuals, allozymes had 80 sites and 233 individuals, and cytb included 101 sites and 293 individuals (Table 1).The cytb dataset includes all H. klunzingeri sequences from Thacker et al. (2007).Two versions of the cytb dataset were constructed, one for the whole gene with 267 individuals, and a second one for the first 601 base F I G U R E 1 Composite map showing the distribution of Hypseleotris klunzingeri and location of all sites surveyed.Each site is represented by a taxon symbol, as per the legend provided and numbered in Table 1.Maps were generated using QGIS v3.8.2 software.Major drainage divisions are outlined with a black border.Photograph shows a male from site 2.

| Sample collection
TA B L E 1 Site, taxon, and sampling details for all Hypseleotris klunzingeri examined in this study.1 with an asterisk) that provided additional phylogeographic insights within the MDB using these larger sample sizes.
For ease of presentation, individuals are identified throughout by their final taxon assignment, namely, pure taxa KN, KE, KS, and KW (reflecting their various compass orientations), plus assorted hybrid/ admixed combinations thereof (KE m = KE from the MDB, KW m = KW from the MDB, and KSxKE = hybrids between KS and KE).

| SNP genotyping
DNA was extracted by Diversity Arrays Technologies (DArT Pty Ltd, Canberra, Australia, www.diver sitya rrays.com) using a NucleoMag 96 Tissue Kit (Macherey-Nagel) coupled with NucleoMag SEP to allow automated separation of high-quality DNA on a Freedom Evo robotic liquid handler (TERAN Pty Ltd).
Sequencing for SNP genotyping was done using DArTseq™ (DArT Pty Ltd), which uses a combination of complexity reduction using restriction enzymes, implicit fragment size selection, and next-generation sequencing (Sansaloni et al., 2011), as described in detail by Kilian et al. (2012).The technique is similar to double-digest restriction associated DNA sequencing (ddRAD) (Peterson et al., 2012), but has the advantages of accepting lower quantities of DNA, greater tolerance of lower quality DNA, and higher call rates (Sansaloni et al., 2011).The restriction enzyme combination of PstI (recognition sequence 5′-CTGCA|G-3′) and SphI (5′-GCATG|C-3′) was used for the double digestion.
The PstI-compatible adapter included the Illumina flowcell attachment sequence, a sequencing primer sequence, a barcode region of variable length (see Elshire et al., 2011), and the PstI-
Sequences generated from each lane were processed using proprietary DArT analytical pipelines as outlined by Georges et al. (2018) to yield repeatable SNP markers.In addition, DArT processes approximately one-third of the samples twice from DNA to allelic calls as technical replicates and scoring consistency (repeatability) was used as the main selection criterion for high quality/low error rate markers.

| SNP filtering
After receiving the SNP data from DArT Pty Ltd, the SNP data and associated metadata were read into a genlight object as implemented in R package adegenet (Jombart, 2008) to facilitate subsequent processing with R package dartR (Gruber et al., 2018).
We created two different datasets based upon different filtering of the initial 19,903 polymorphic SNP loci, one for the phylogenetic analysis ('phylo' dataset), and the other for the PCoA and fixed difference analyses ('PCoA' dataset).The phylo dataset was initially filtered to remove any obviously introgressed individuals within the MDB (identified using the PCoA dataset), as reticulation events are not compatible with bifurcating trees.The next step retained only loci for which repeatability was greater than 0.99 and all loci with a callrate above 0.6.The PCoA dataset included all individuals and was first filtered for repeatability to include values >0.99.The second filtering step removed all secondary loci (loci found within the same sequenced fragment) with the locus retained having the higher polymorphism information content (PIC) value.Finally, loci with a callrate above 0.9 were retained.The additional filtering steps were undertaken on the PCoA dataset for the two analyses that are sensitive to the presence of too many missing values and/or tightly linked loci (ordination and the calculation of fixed differences).The data remaining after these primary filtering steps for both datasets are regarded as highly reliable.
The PCoA dataset was used for each of the additional (stepwise) PCoA analyses based on a subset of individuals being compared, with additional filtering applied to remove any loci that become monomorphic in such subsets.

| SNP analyses
Genetic similarity among individuals and populations was visualized using ordination (Principal Coordinates Analysis [PCoA]; Gower, 1966), using individuals as entities and loci as attributes and implemented by the gl.pcoa and gl.pcoa.plotfunctions of dartR.For our phylogenetic analysis, we constructed SNP genotypes for each individual by concatenating only the variable bases from each SNP locus into a single partition.A few loci had the SNP removed with the adaptor, because of chance matching of the adaptor sequence to the terminal region containing the SNP.These loci were removed prior to concatenation.Heterozygous SNP positions were represented by the standard ambiguity codes.We generated a phylogenetic tree using maximum likelihood (ML) applied to concatenated sequences.ML analyses were conducted using RAxML 8.2.12 (Stamatakis, 2014) on the CIPRES cluster (Miller et al., 2010) using the model GTRCAT and searching for the best-scoring ML tree using the model GTRGAMMA in a single program run, with bootstrapping set to finish based on the autoMRE majority rule criterion.The tree was imported to Mega 7.0.18(Kumar et al., 2016), formatted, and mid-point rooted.To assist with identifying potential introgressed individuals, heterozygosity was calculated in R using the command "het <-rowMeans(as.matrix(gl)==1,na.rm=T)" followed by "write.
csv (het, file="het.csv")." The diagnosability of lineages and candidate species was assessed by calculating the number of pairwise fixed differences

| MtDNA genotyping and analyses
The mitochondrial cytb gene was sequenced following the PCR protocols in Hammer et al. (2014), except that samples were amplified with the following primer pairs: Glu18 TAACC AGG ACT AAT GRC TTGAA with Hd.alt GGRTT GTT GGA GCC TGT TTCAT or Hd.Hyps GGGTT GTT GGA GCC SGT TTCGT and midg.496GGCGG CTT TTC RGT AGATAA with Eleo.Thr.40GATTT TAA CCT CCT GCG TCCG.
Sequences were edited using Chromas 2.6.5 (Technelysium) and imported into BioEdit 7.2.5 (Hall, 1999).Sequences were aligned by eye and checked via amino acid coding in MEGA to test for unexpected frame shift errors or stop codons.Data were analyzed phylogenetically using RAxML as per for the SNP analysis but using the model GTRGAMMA.Strong support for the presence of four primary taxa (KN, KE, KS, and KW) plus three KS-admixed groupings in the MDB (KE m , KSxKE, and KW m ) is also evident in the fixed difference counts (Table 2).These data clearly demonstrate that the primary taxa are all readily diagnosable at multiple loci (range 38-432 absolute fixed differences; 64-525 near-fixed differences), whereas KSxKE displays no fixed differences from either parental taxon (as expected for recently admixed populations) and both KE m and KW m display fewer fixed differences from KS than do their pure parental taxa (as typically found for historic admixture/introgression).Additional support is presented by the observed heterozygosity counts for each group (Table 2), which show all pure taxa display comparatively low levels of heterozygosity (range 0.0098-0.0218)when compared to KE m (almost double that of pure KE), KW m (more than double that of pure KW), and most notably KSxKE (more than fourfold higher than either parent).

| Phylogenetic analysis of SNPs
Our initial 'phylo' dataset comprised all individuals except for the 16 hybrid KSxKE fish.Surprisingly, the presence of the nine admixed KE m individuals caused considerable distortion of the relationships among populations and taxa in the resultant ML tree (Figure 3a), reflecting the reality that reticulate evolution often leads to data that do not exhibit tree-like behavior (Unmack et al., 2022).Removal of these KE m individuals followed by re-fil- with others from the same or adjacent river and there is obvious phylogeographic structure at the regional level in all taxa except KS.

| Allozyme analyses
The results of our allozyme analyses closely mirrored those obtained for the SNPs dataset, both in terms of primary genetic lineages and identifying pure versus admixed populations.A series of PCoA analyses of the allozyme data (Figure 4) displayed a near-identical association between individuals as depicted for the SNPs (Figure 2), and together supported the presence of the same seven primary groupings, namely, KN, pure KE, KE m (close to pure KE but slightly displaced toward KS), KSxKE, pure KS, KW m , and pure KW.The same assignment of sites into primary groupings is shown in an unrooted NJ tree (Figure 5).These findings are further validated by the fixed difference and observed heterozygosity counts for each grouping (Table 3), which show the same patterns of diagnosability and comparative levels of heterozygosity as found for the SNPs.Together, our two nuclear datasets fully support the presence of four primary taxa in the western carp gudgeon, namely, KN, KE+ (KE + KE m ), KS, and KW+ (KW + KW m ), all readily diagnosable by numerous SNP and allozyme loci (Table 4).TA B L E 2 Pairwise number of diagnostic SNP loci between the primary genetic groups identified by PCoA (Figure 2).

| MtDNA analyses
The whole and half cytb datasets consisted of 1141 and 601 base pairs for 267 and 293 individuals, respectively.Maximum likelihood (ML) was run on the two cytb datasets with RAxML producing trees with likelihood scores of −3145.296216and −1595.307571,and the rapid bootstrap search for both analyses finishing at 650 and 500 replicates, respectively (simplified tree for whole cytb in Figure 6; detailed trees for both whole and half cytb in Figures S3 and S4, respectively).
For the full cytb dataset, the deeper phylogenetic relationships were moderately well-supported and largely similar to those found for SNPs.The KN clade was sister to the KE clade and both were sister to a composite KS/KW clade, comprising three distinctive sublineages referable to taxon KS plus well-supported, sister sublineages for KW and KW m (Figure 6a; Figure S3).Consistent with their hybrid status, KSxKE individuals displayed either KSderived or KE-derived haplotypes (ratio 14:15, Figure S4).For the non-hybrid taxa, an individual's mtDNA clade membership was concordant with their SNP/allozyme primary lineage identification in all instances apart from six KS fish, all from three northern MDB rivers (sites 65, 66, and 68; Gwydir, Namoi, and Macquarie

Rivers).
Mapping the distribution of the major cytb lineages (Figure 6) clearly demonstrates that KE-derived haplotypes have intruded into the KSxKE hybrid zones as identified using our nuclear datasets (Figure 1 and shaded in Figure 6) plus are present in the pure KS populations from some upper MDB rivers.It also reveals that haplotypes from two rare KS-lineages, otherwise characteristic of the Border Rivers in the upper MDB (sites 59, 62, and 63), have also spread across the drainage divide into the upper Clarence River (site 41) and far downstream into the KSxKE hybrid zone in the lower Murray.

| Origin and dynamics of KSxKE hybrid zones
Our molecular data provide several additional perspectives on the KSxKE hybrid zones.Consistent with geographic expectations, stepwise PCoA of the SNPs dataset consistently identified the Condamine (i.e., taxon KE m ) as the most likely source population for the KE+ parent (Figure 3b; Figure S1).Second, individuals displayed varying degrees of admixture between their two parental taxa (Figure S1), indicating that F 1 hybrids must be sufficiently fertile to at least produce F 2 and/or backcross offspring.Finally, there was no obvious correlation between the distance from the KE m source and the extent of introgression for KE-derived alleles (Figure 1; Figure S1).

| Within-taxon phylogeographic structure
Both our SNP and mtDNA datasets contain sufficient genetic insights to explore phylogeographic trends within the four primary taxa.In addition to the well-supported dichotomy between KW and KW m , the SNPs identified a shallow split in KW between Bulloo and Cooper populations (sites 111-114 vs. 115-120; Figure 3), They also revealed additional albeit more complicated phylogeographic structure for the other three taxa (Figures 3 and 4), herein further explored using taxon-specific PCoAs (Figure 7).
For KN, there was a primary dichotomy between the three northern and four southern sites, which reflects latitude rather than river basin membership (Figure 7a).Most notably, the Burdekin KN was closely allied with those in the Fitzroy River (Figures 3 and 7) despite being somewhat of a northern outlier (Figure 1).
A more complex pattern is evident within KE+.All relevant PCoAs inferring a source population.Although the mtDNA data are also relatively homogeneous for KS, the Border Rivers harbored two distinctive cytb lineages at high frequency that were absent elsewhere in pure KS (Figure 6).

| DISCUSS ION
Building on the work of Thacker et al. (2007), we present three comprehensive molecular datasets that together identify four primary taxa (KN, KE+, KS, and KW+) plus several examples of historic (KE m , KW m ) and relatively recent admixture (KSxKE) within the parent species H. klunzingeri.All primary taxa, lineages, and admixed zones are fully diagnosable by numerous independent genetic markers, and our intensive sampling provides an ideal starting point for future field surveys to plug apparent distributional gaps, ecological assessments of the H. klunzingeri complex, or formal taxonomic revision.
Regarding the latter, allocating the nominal form of H. klunzingeri s.s.(Ogilby, 1898) to a specific taxon may prove problematic, given the type locality (the Murray River in South Australia) is apparently, at least the time of collecting for genetic evaluation, a hybrid zone (KSxKE).

| Candidate species
The genetic resolution of candidate species has often relied on identifying lineages using either gene genealogies (e.g., mtDNA or nDNA gene trees) or multilocus population trees (for allozymes or genomic data).However, as such tree-only approaches

TA B L E 3
Pairwise number of diagnostic allozyme loci between the primary genetic groups identified by PCoA.
detect genetic structure rather than candidate species per se (Sukumaran & Knowles, 2017;Unmack et al., 2022), lineages delineated in this manner need not directly equate to biological or evolutionary species but may instead reflect major phylogeographic breaks within a species or a composite of two or more species plus admixed individuals.
While there is no simple formula for deciding whether two genetically distinctive allopatric populations are conspecific or represent different species, we have recently advocated a six-step approach to assist in this task (Unmack et al., 2022).These steps are as follows: identify lineages, hybrids, and introgressed populations using a combination of ordination of individuals (step 1) plus phylogenetic methods (step 2), followed by pairwise assessments of lineage diagnosability (step 3), comparative geographic distribution (step 4), and sampling intensity (step 5), and concluding with a review of any other biologi- Note: Diagnostic molecular markers: SNPs = number of absolute fixed differences/number of fixed differences (5% tolerance); all values but one are highly statistically significant (p < .001;*p = .026):Alloz = number of fixed differences (10% tolerance): Cytb = +++ unequivocally diagnosable, numerous fixed nucleotide differences; ++ = unequivocally diagnosable, some fixed nucleotide differences; + distinct primary clades but not unequivocally diagnostic.Terminology for comparative geographic distribution follows Figure S5 (see also Unmack et al., 2022).Sampling intensity: the extent to which each lineage has been geographically sampled (all pairwise comparisons reflect intense genomic sampling).taxa (e.g., Lintermans, 2007;Meredith et al., 2003), now known to comprise a complex of sexual species and 'unisexual' (hybridogenetic) lineages (Thacker et al., 2022b;Unmack et al., 2019).We hope that a recent taxonomic revision by Thacker, Geiger, and Unmack (2022) for this hemiclonal species complex, which includes five sexual species and multiple unisexual combinations, will help establish a more robust taxonomic framework for identifying individuals to their correct sexual group and hence facilitate the documentation of comparative biological information for all sexual forms of Hypseleotris, including those referable to the H. klunzingeri complex.
Table 4 summarizes the outcomes of applying steps 3-5 to the primary taxa identified for H. klunzingeri using steps 1 and 2. As shown, there is strong evidence that KN, KE+, and KS are all valid candidate species, being unequivocally or effectively diagnosable from each other at hundreds of unlinked genes and displaying distributional patterns that are inconsistent with being phylogeographic lineages within a single species (Table 4; Figure S5).
Given their comparatively low number of diagnostic differences, the decision as to whether the allopatric taxa KS and KW+ are conspecific or represent distinct evolutionary species remains the only taxonomic question not fully resolved by our stand-alone genetic datasets.However, as the number of molecular characters that diagnose KS from KW+ greatly exceeds the nine partially diagnostic morphological characters that delineate other co-occurring species of Hypseleotris (Thacker, Geiger, & Unmack, 2022), we have concluded that KW+ 'probably' represents a fourth

| Broad patterns within and between candidate species
Taxon KN has an unusual distribution and genetic pattern.(Lintermans, 2007), while the turtle Emydura macquarii has crossed from the Bulloo River into Paroo (Georges et al., 2018).
The most likely spot for faunal exchange is via the Bindegolly portal (Georges et al., 2018).
There and C. amniculus (Darling Hardyhead) and several Galaxias species in the mountain galaxiid complex, which are closely related (Lintermans, 2007;Raadik, 2014).Some introgression has been recorded in both fish groups (Adams et al., 2011(Adams et al., , 2014)).There are also three congeneric species groups present in the MDB that are known to produce hybrids from the genera Maccullochella (Douglas et al., 1995) and Philypnodon (Hammer et al., 2019).
In other groups, there are examples of introgression such as in the genus Melanotaenia between three species (P.J. Unmack, M. Adams, unpublished data), along with an admixture zone in the genus Retropinna (Hammer et al., 2007;Unmack et al., 2022).In addition, there is the hemiclonal complex of Hypseleotris carp gudgeons, which have hybrid origins (Unmack et al., 2019).Given that even distantly related fish species are known to readily hybridize (Vespoor & Hammar, 1991), the MDB provides considerable opportunities for mixing gene pools from different colonizations and reinvasions of the basin from surrounding river basins over evolutionary time frames across a range of species with different levels of genetic divergence.It is also likely that opportunities for hybridization have increased as natural habitats in the MDB have been anthropogenically altered or degraded (Lintermans, 2007;Scribner et al., 2001).

| Cryptic biodiversity in Australian freshwater fishes
The resolution of cryptic species diversity within H. klunzingeri ac-

| CON CLUS IONS
This study has revealed that carp gudgeons are even more speciose than previously thought, adding several additional candidates to the existing six sexual species plus their various hemiclonal relatives (Thacker, Geiger, & Unmack, 2022, Thacker, Shelley, et al., 2022;Unmack et al., 2019).Even ignoring the complication of sympatric hemiclones, many river basins contain at least three or more sexual species, with the geographically extensive MDB notably harboring six sexual taxa (plus multiple hemiclones).Moreover, the observed partial mismatch between geographic and phylogenetic patterns, plus the presence of a natural hybrid zone in the lower Murray and in at least two Darling tributaries add yet more layers of complication.Given such complexity, future taxonomic and field identification efforts will be particularly challenging and ideally require the involvement of a molecular identification technology (with SNPs providing the gold standard of unequivocal identification all sexual and unisexual forms) as part of a coordinated accumulation of companion morphological exemplars of each morphotype at each site surveyed.Our own research group has already adopted this strategy where resources permit.
The dynamic boom and bust nature of many Australian freshwater ecosystems highlights the need for monitoring spatial genetic patterns for all resident species over time, particularly after major climate events such as have impacted eastern Australia over the past decade (Hughes et al., 2009;Legge et al., 2022;Lintermans, 2013).
Our study provides future researchers with a framework to pursue such an endeavor for the Hypseleotris of eastern Australia.As a major component of the biodiversity and ecology of these ecosystems, carp gudgeons also offer great potential for environmental monitoring, provided researchers can identify individuals to their correct taxon.In the past, carp gudgeons have often been lumped into one composite 'taxon' (Hypseleotris spp.) when included in ecological surveys (Lintermans, 2007), a custom that precludes any genuine assessment of whether Hypseleotris alpha diversity has declined or shifted (e.g., hemiclone ratio/presence) at such sites.Finally, this study further underlines the point that active conservation and management practices for freshwater fishes, including both the intended (i.e., wrong genetic lineage used) and unintended (i.e., where carp gudgeons or other non-target species unknowingly contaminate the hatchery release event) consequences of fish stocking programs (Lintermans, 2004), need to be mindful of the existence of both un-

ACK N OWLED G M ENTS
We thank the many researchers who assisted on field trips, in the laboratory, or supplied tissues.We also gratefully acknowledge the All sampling was approved by the University of Canberra Committee for Ethics in Animal Experimentation (approval codes CEAE 13-06, 15-06 and 20180442) and undertaken under the following state wildlife collecting licenses: New South Wales P07/0007-5.0 and P18/0027-1.1;Queensland 168221, 191126 and 212524; South Australia S115 and ME9902959; Victoria RP1146 and RP1344.Fish were ethically euthanized using either AQUI-STM or dilute clove oil, snap-frozen in liquid nitrogen, and deposited in the SA Museum's Australian Biological Tissues Collection.
Our allozyme dataset comprised the same 54 putative allozyme loci as employed byUnmack et al. (2019) and was generated according to the principles and procedures presented inRichardson et al. (1986) andHammer et al. (2007).We used PCoA, coupled with assessments of diagnosability (fixed differences, allowing a 10% cumulative tolerance for shared alleles at a locus as advocated byAdams et al., 2014, for allozyme markers) and admixture (intermediate positioning between parental taxa for PCoA, higher levels of heterozygosity, and lack of fixed differences at otherwise diagnostic loci), to explore the broader taxonomic and phylogeographic patterns evident in this dataset.The rationale and methods for these analyses followAdams et al. (2014) andUnmack et al. (2022).

|
FigureS1), it became evident across all other analyses that the resulted in a final 'phylo' dataset of 179 individuals for 7419 polymorphic SNP loci.ML recovered one tree with a −ln score of −72522.995738and the rapid bootstrap search finished at 450 replicates (Figure 3b; full tree in Figure S2).Support across most of the deeper nodes of the tree was strong, with once again four principal lineages recognized, namely, KN and KE (as sister clades), with both being sister to clades KS and KW.Taxon KN represents northern coastal populations from the Burdekin, Fitzroy, and the coastal Boyne rivers (note there are two Boyne rivers in our study, the second being a tributary to the Burnett River).Apart from its presence in the Condamine and Darling (as lineage KE m ), taxon KE consists of populations from eastern coastal rivers from Baffle Creek south to the Clarence River, plus a recently introduced population in the Barron River in far north Queensland (site 1).Taxon KS contains individuals from three coastal New South Wales rivers (Macleay, Hunter, and Shoalhaven) along with non-introgressed populations from the MDB, except those in the Warrego and Paroo rivers (lineage KW m ), which, along with Bulloo River and Cooper Creek populations, are referable to taxon KW.Outside of the MDB, most individuals tend to group closely

F
RAxML trees for SNP dataset.(a) Preliminary tree skeleton, showing how the inclusion of KE m individuals distorts relationships within KE and blurs the distinctiveness of KE and KS.(b) Final RAxML tree, rooted at the mid-point.Branches are color coded by primary taxon and major clades identified by the symbols used following Figure 1.Early-branching nodes with bootstrap values of 95% or higher are asterisked.Minor clades within taxa are labeled with their corresponding site codes.(# = site 16, the only Burnett fish not aligning with the other Burnett sites).

(
Figures2b,d and 4c; FigureS1) consistently found that Condamine fish (KE m ) were genetically most similar to those in the Burnett, while two geographic outliers (sites 1 and 16) and a number of regional population clusters were present for pure KE in the RAxML tree (Figure3b).A KE-specific PCoA revealed a similar pattern of diversity, including the site 1 and 16 outliers (discussed separately), but was also able to detect a primary phylogeographic split between sites from the Maroochy River northward and west to the Burnett (sites 10-29, excluding site 16) versus sites south of and including the Pine and Brisbane Rivers (sites 28-48).The northern outlier (site 1, Barron River) clearly clusters with sites 19-24 (Burrum and Mary Rivers), supporting its likely status as an introduced population from that part of KE's range.Intriguingly, both individuals from site 16 (Burnett River) are anomalously placed, one intermediate between the two primary clusters (and showing elevated heterozygosity levels) and the other clustering with the southern phylogroup.This same pattern is displayed in the allozyme data (PCoA not shown) and in the mtDNA tree (Figure S3), with one individual from site 16 clustering with Brisbane River haplotypes (southern phylogroup) and the other with Burnett River haplotypes (northern phylogroup).With respect to KS, most sites are relatively homogeneous, with only modest structure relating to geographic outlying populations in two of the Border Rivers (Severn and McIntyre Rivers; sites 60, 61), the Macleay River (site 49), and the Shoalhaven River (site 52), the latter clustering with one of the sites in the adjacent drainage divide (site 83, Murrumbidgee River) and therefore F I G U R E 4 Scatterplots of ordination scores in the first two dimensions for the initial PCoA and two follow-up PCoAs for the allozyme dataset.(a) initial PCoA of all 233 individuals; (b) PCoA of the 92 individuals referable to KS, KW, or KW m ; (c) PCoA of the 151 individuals referable to KE, KE m , or KS.Symbols and presentation as for Figure 2.

F
I G U R E 5 Neighbor-joining tree based on pairwise Nei Distances among all sites surveyed in the allozyme study.

F I G U R E 6
Summary of mtDNA analyses for Hypseleotris klunzingeri.(a) Condensed gene tree for full cytb sequences.Early-branching nodes with bootstrap values of 95% or higher are asterisked.MtDNA clade symbols match those used in (b).Detailed tree presented in Figure S3.(b) Map of the major cytb lineages in the MDB and adjacent drainages.Shading represents the geographic distribution identified in this study for candidate species KS (yellow) and KE (red), and for the two KSxKE hybrid zones (gray).

F
Results of PCoA for the refiltered SNP datasets for pure populations of taxa KN, KE, and KS.Axes are scaled according to the relative contribution of each dimension (in brackets).Clusters are labeled with site codes.(a) KN (n = 15); (b) KE (not including KE m individuals; n = 72); (c) KS (n = 66).candidatespecies.A full resolution of its taxonomic status will require additional targeted assessments of any morphological and other biological differences between KS and KW+, and must include exemplars of pure KS, pure KW, and KW m .The scenario of sister Cooper versus MDB candidate taxa is also evident in another co-occurring freshwater fish (Australian smelt, Retropinna spp.;Unmack et al., 2022).
m and KW m ), plus a relatively recent and possibly ongoing hybrid zone (KSxKE) between upstream KE+ and its downstream congener KS.Within KS, there is no strong pattern of phylogeographic structure.These populations are primarily found in the Murray subcatchment upstream of the Darling River confluence, along with the Macquarie, Gwydir, Namoi, and Macintyre subcatchments.Taxon KE+ is restricted to the Condamine-Balonne subcatchment.These KE m lineage individuals have obvious genetic affinities to populations from the Burnett River, a common pattern for those MDB species that are also found in coastal river basins in southeastern Queensland (Unmack, 2013).Fish with the KSxKE genetic profile are present in the Darling River south into the lower Murray River, plus in the Castlereagh and Bogan subcatchments.These are likely a result of KE m fish from the Balonne River dispersing further downstream, but also managing to push upstream into nearby tributaries like the Bogan and Castlereagh.Contemporary patterns in the Darling River are likely to vary over time as drought eliminates populations (due to excess water extraction), with recolonization either coming via floodwaters from either the Balonne (introducing more KE m fish) or Macintyre/Barwon rivers (taxon KS fish), carrying different genotypes into the lower Darling River.The western portion of the Murray-Darling Basin in the Warrego and Paroo subcatchments has admixed populations between KS and KW, represented by KW m .These populations share a long drainage basin boundary with both Bulloo River and Cooper Creek (which contains pure KW).One other fish species has crossed from the Lake Eyre Basin rivers into the Paroo and Warrego subcatchments, Melanotaenia splendida have been four invasions into east coast river basins of KS from the MDB.There first is located in the upper Maryvale River in the upper Clarence Basin (site 41), which has a mitochondrial haplotype identical to those adjacent in the Border Rivers subcatchment (upper Macintyre River), along with a similar relationship based on SNPs (FiguresS2 and S3).The second invasion occurred in the upper Macleay River in Salisbury Waters (site 49), which is adjacent to the Gwydir subcatchment.The fish from Salisbury Waters are most similar to those from the Border Rivers subcatchment in the upper Macintyre River for SNPs, while for mitochondrial DNA from the upper Macintyre and the Gwydir, subcatchments were similar.The third transfer occurred with the Hunter Basin.This population has long been considered likely native as they are known to be widespread, although patchy in occurrence, and several other fishes are shared with the Hunter, but not in surrounding coastal basins (e.g., Craterocephalus amniculus, Mogurnda adspersa), or they are also present in some additional surrounding basins (e.g., Tandanus tandanus).Hunter Basin KS (sites 50, 51) had a clear genetic affinity with fish from the Murrumbidgee subcatchment rather than the adjacent Macquarie and Namoi subcatchments.The fourth coastal basin population was found in the Shoalhaven Basin (site 52).This population was genetically closest to fish primarily from the adjacent Murrumbidgee subcatchment.It is tempting to speculate on whether these four geographic outliers represent native populations or human-mediated introductions.While many such introductions likely remain undocumented, over 50 Australian freshwater fishes are already known or presumed to have been either deliberately or accidentally introduced into catchments outside their native range(Lintermans, 2004).If these four coastal KS populations were native, we would expect each to show the greatest genetic similarity to its adjacent MDB population.This is the case for three of these comparisons, with the Hunter Basin being the exception.In addition, we might expect native occurrences to have broader distributions within coastal basins provided they have had a large period of time to disperse.Instead, all but the Hunter appear to only harbor localized populations, which have not dispersed far.Introductions could come from nearby populations, thus mimicking a native pattern, or they could be from distant populations, if accidentally introduced along with deliberately stocked, hatchery-reared sportfish.Unfortunately, each of these basins lacks early historical records, a common situation for Australia's freshwater fishes.At this stage, we consider these four populations are likely introduced, although we acknowledge the evidence is equivocal.The presence of three out of the four candidate species in the MDB is an unusual distributional pattern.The majority of species of native fish known to occur in the MDB do not share the Basin with widely distributed and truly sibling congeners, the exceptions being Craterocephalus fluviatilis(Murray Hardyhead) described candidate species, and deep phylogeographic structure within all species, to avoid undertaking or facilitating translocations or mixing of distinct genetic lineages.Conceptualization (equal); formal analysis (equal); funding acquisition (equal); investigation (equal); methodology (equal); project administration (equal); resources (equal); writing -original draft (equal); writing -review and editing (equal).Benjamin D. Cook: Conceptualization (supporting); writing -original draft (equal); writing -review and editing (supporting).Jerald B. Johnson: Funding acquisition (supporting); project administration (supporting); resources (supporting); writing -review and editing (supporting).Michael P. Hammer: Conceptualization (supporting); investigation (supporting); resources (supporting); writing -review and editing (equal).Mark Adams: Conceptualization (equal); data curation (equal); formal analysis (equal); funding acquisition (equal); investigation (equal); methodology (equal); resources (equal); writing -original draft (equal); writing -review and editing (equal).

South
Australian Museum for access to their tissue collection.Open access publishing facilitated by The University of Adelaide, as part of the Wiley -The University of Adelaide agreement via the Council of Australian University Librarians.
both absolute and allowing a 5% tolerance for shared alleles at each Unmack et al., 2022ciated probabilities that such values could arise through sampling error alone (dartR command gl.fixed.diff;parametertloc= 0 or tloc = 0.05; seeUnmack et al., 2022for rationale and methods involved).
Unmack et al. (2022)from applying the framework recommended byUnmack et al. (2022)to assess which lineages of H. klunzingeri sensu lato are also candidate species.
information that might indicate that lineages are not conspecific (step 6).Unfortunately, observations relevant to this final step are largely unavailable in the literature, since many ecological studies of Hypseleotris in eastern Australia have not attempted to reliably distinguish H. klunzingeri from a suite of congeneric and often co-occurring TA B L E 4 The southern portion of the Fitzroy River Basin is quite different to the northern portion, a pattern not replicated in any species examined so far.The species is unknown from coastal basins between the Fitzroy and Burdekin Basins except for the small Herbert Creek catchment (tissues were not available for this study).This absence appears to be real, with moderately intensive sampling only finding its congener H. bucephala at many sites.Within the Burdekin Basin, KN has a narrow distribution, being found primarily in Lake Dalrymple and in the Burdekin River upstream of the dam until the vicinity of Charters Towers.It is absent from the arid Belyando River, the main southern tributary to Lake Dalrymple and containing habitats that are otherwise commonly inhabited by H. klunzingeri.Again, Complex geographic and genetic patterns are found for H. klunzingeri populations within the MDB.Befitting its extensive geographic coverage and low-relief topography, the basin harbors the pure taxa KS, KE+, and KW+, the latter two showing evidence of historic admixture with KS (as lineages KE Unmack et al., 2022).Such data can not only reveal the presence of cryptic species but can also uncover nuanced evidence for admixture and introgression that was often not detectable prior to the advent of detailed genomic datasets and the coupling of ordination and tree-based approaches.The application of this modern molecular approach to taxonomic uncertainty in H. klunzingeri has identified the presence of additional candidate species in this nominal taxon.Moreover, as in our companion study of Australian smelt (Retropinna spp.;Unmack et al., 2022), additional morphological, phenotypic, or ecological data are not required in most cases to validate the identified candidate species per se (the exceptions being KW+ vs. KS in Hypseleotris, and COO vs. MTV in Retropinna).However, such data remain the foundation for formal description and naming of candidate species and are of course valuable for addressing other questions about the biology of individual species, whether formally named or not.