A novel target‐enriched multilocus assay for sponges (Porifera): Red Sea Haplosclerida (Demospongiae) as a test case

With declining biodiversity worldwide, a better understanding of species diversity and their relationships is imperative for conservation and management efforts. Marine sponges are species‐rich ecological key players on coral reefs, but their species diversity is still poorly understood. This is particularly true for the demosponge order Haplosclerida, whose systematic relationships are contentious due to the incongruencies between morphological and molecular phylogenetic hypotheses. The single gene markers applied in previous studies did not resolve these discrepancies. Hence, there is a high need for a genome‐wide approach to derive a phylogenetically robust classification and understand this group's evolutionary relationships. To this end, we developed a target enrichment‐based multilocus probe assay for the order Haplosclerida using transcriptomic data. This probe assay consists of 20,000 enrichment probes targeting 2956 ultraconserved elements in coding (i.e. exon) regions across the genome and was tested on 26 haplosclerid specimens from the Red Sea. Our target‐enrichment approach correctly placed our samples in a well‐supported phylogeny, in agreement with previous haplosclerid molecular phylogenies. Our results demonstrate the applicability of high‐resolution genomic methods in a systematically complex marine invertebrate group and provide a promising approach for robust phylogenies of Haplosclerida. Subsequently, this will lead to biologically unambiguous taxonomic revisions, better interpretations of biological and ecological observations and new avenues for applied research, conservation and managing declining marine diversity.


| INTRODUCTION
Worldwide biodiversity is declining at an unprecedented rate; thus, there is an urgent need for novel approaches to characterise biodiversity and tools for faster biomonitoring that could aid conservation efforts (Formenti et al., 2022). Sponges (Phylum Porifera) are a diverse animal taxon, abundant in almost all aquatic habitats, where they fulfil a multitude of ecological functions, such as providing nutrients to higher trophic levels and contributing to the habitat heterogeneity of coral reefs (Bell, 2008; De Goeij et al., 2013). Despite their ecological importance, the biodiversity of sponges is still poorly understood, although knowledge of it is essential for a better interpretation of biological and ecological observations.
Here, we develop a multilocus probe assay based on target enrichment of ultraconserved elements (UCEs) for the first time for sponges. This assay can capture genome-wide markers for sponges to allow phylogenetic reconstructions at different taxonomic levels and has high potential to aid in (rapid) species identification and discovery in biodiversity surveys. UCEs are highly conserved regions in the genome that can easily be 'captured' using complementary synthetic DNA or RNA baits and can provide phylogenetically informative sites at the flanking regions of the UCEs but also within the UCEs (Faircloth et al., 2012). These informative sites can be used to infer the phylogenetic histories of systematically challenging groups across shallow and deep timescales (Faircloth et al., 2012; Faircloth et al., 2013; McCormack et al., 2012; Quattrini et al., 2018).
UCEs have successfully resolved molecular phylogenetic relationships in taxonomically complex marine invertebrate taxa, such as molluscs (Goulding et al., 2023; Moles & Giribet, 2020), anthozoans (Cowman et al., 2020; Quattrini et al., 2018) and echinoderms (Hugall et al., 2015). The phylogenetic relationships of various groups within the phylum Porifera, particularly those belonging to the order Haplosclerida (Class Demospongiae), are still poorly understood due to the discrepancy between the morphological and molecular hypotheses proposed for this group (McCormack et al., 2002; Redmond et al., 2011, 2013). For taxonomically complex groups such as Haplosclerida, UCEs are potentially more informative for reconstructing molecular phylogenies than a single or handful of gene markers and are a promising tool to resolve the intra-order phylogenetic relationships. In addition, UCEs provide enough information for further downstream processing and can be leveraged for unambiguous species identification, delimitation and discovery in biodiversity surveys (Erickson et al., 2021). For complex holobionts like sponges, UCEs are preferred over other methods, such as restriction-site associated DNA (RAD) sequencing. This is due to the high abundance of sponge-associated microbes and other commensal DNA co-extracted with the sponge's DNA (Vargas et al., 2012), where the enzymatically produced DNA fragments are often of unknown identity. Importantly, in contrast to RAD-based methods, target capturing provides access to non-anonymous genomic markers of known homology.
We test our newly developed multilocus probe assay, specifically designed to capture the broad diversity of sponges belonging to the demosponge order Haplosclerida, with specimens collected from the Red Sea. To date, very little has been documented on the sponge diversity of the Red Sea, which limits our understanding of the sponge communities present, their ecology, and how they will adapt under future climate conditions (Wooster et al., 2019).
Haplosclerida is one of the most speciose sponge orders, with >800 species described to date (Van Soest et al., 2012). Furthermore, they are highly abundant in many tropical coral reef systems (e.g. Erpenbeck et al., 2016), which are under increasing pressure due to anthropogenic stressors and climate change. Currently, haplosclerids form a yet unresolved taxonomic conundrum, one of the largest and most challenging in poriferan taxonomy and systematics (Van Soest et al., 2012). This is, first, because the lack of diagnostic characters makes unambiguous species identification and distinction very challenging in Haplosclerida (McCormack et al., 2002), and, second, because molecular phylogenies demonstrate para- or polyphyly of all morphologically defined families and most genera (McCormack et al., 2002; Redmond et al., 2011, 2013), and in a few cases also the non-monophyly of species (Redmond et al., 2011, 2013). However, although incongruent with supra-specific morphological definitions, these molecular phylogenies converge on robust clades named A-E (Redmond et al., 2011, 2013).
With access to a broad range of taxa belonging to four families within Red Sea haplosclerids, we have a test case to assess the order-wide applicability of our newly developed multilocus probe assay. We demonstrate that the multilocus probe assay successfully captures genome-wide loci of the Red Sea sponges and that those loci have sufficient phylogenetic resolution to recover the clades previously established in molecular phylogenies of the order Haplosclerida. Furthermore, we illustrate the broad applicability of the designed probe assay, as it also captured sufficient loci of other demosponges to construct a phylogeny reflecting the ordinal relationships of this diverse group.

| Sampling and identification of Red Sea haplosclerids
In the scope of the Red Sea Biodiversity Project (Senckenberg Research Institute; King Abdulaziz University (KAU), Jeddah, Saudi Arabia) and sampling campaigns led by the King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia, sponge specimens were collected in the northern, central and southern regions of Saudi Arabia's Red Sea area between 2012 and 2017 (Figure 1; Table 1). The sponges were collected between 1 and 43 m depth using SCUBA or by hand-picking in shallow waters, preliminarily sorted, photographed and preserved in 95% EtOH until further processing (Erpenbeck et al., 2016).
Because the sponge samples include commensal DNA, and the primers used are eukaryotic universal primers, the retrieved OTUs were pre-sorted for each of the haplosclerid specimens, followed by pre-screening the five representative sequences of the most abundant OTUs as queries in a BLAST search (blastn, http://www.ncbi.nlm.nih.gov/BLAST/) against NCBI GenBank. The most abundant 18S OTUs were blasted against the non-redundant nucleotide database, and the taxonomy of the most similar sequence to the query was used as a taxonomic proxy for the query sequence. Eventually, 26 haplosclerid specimens were selected and morphologically identified (up to species level, if possible). Briefly, the tissue and spicules of the specimens were prepared for light and scanning electron microscopy (SEM), respectively, using the methods described in Van Soest & Hooper (2005). Spicule measurements were carried out (n = 25) and the range and mean of the spicules' length and width were calculated. The systematic assignment of the haplosclerid specimens followed the 'Systema Porifera' (Hooper & Van Soest, 2002). In addition, haplosclerid species descriptions of the Red Sea were consulted (Helmy & Van Soest, 2005; Ilan et al., 2004; Keller, 1889; Lévi, 1965; Vacelet et al., 2001).

FIGURE 1
Localities (triangles) from which the 26 haplosclerid specimens were collected between 2012 and 2017 in the Red Sea.

TABLE 1
Specimens included for the in silico test and phylogenetic analysis of this study.

| Preparation of transcriptome data prior to bait design
Raw transcriptome sequence data were obtained from public databases (Table 2). These data were quality-filtered and assembled using TransPi (Rivera-Vicéns et al., 2022). The transcriptome assemblies were ranked based on the number of contigs, mean contig length (bp), N50 value (bp) and completeness as judged by searches against the Benchmarking Universal Single-Copy Orthologs (BUSCOs) using rnaQuast v2.0.1 and BUSCO v3 and v4 respectively. In addition, to ensure that the assemblies fell within the correct demosponge order, the full-length 18S rRNA gene was searched for in each transcriptome and used to infer a maximum likelihood (ML) phylogeny (Figure S1) using RAxML 8.2.12 (Stamatakis, 2014). com/PalMuc/HaploTC). The final number of retrieved probes depends on the selected base transcriptome and the number of taxa in which candidate loci can be found (e.g. Gustafson et al., 2019).
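One of the ranking criteria named above, the N50 value, can be illustrated with a short Python sketch. This is only an illustration of the standard N50 definition, not the code used by rnaQuast or TransPi; the function name and the example contig lengths are hypothetical.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the contig length L such that contigs of length >= L
    together contain at least half of all assembled bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contigs given")


# Made-up contig lengths (bp), for illustration only.
example_lengths = [100, 200, 300, 400, 500]
print(n50(example_lengths))
```

A higher N50 indicates that more of the assembly sits in long contigs, which is why it is commonly combined with contig counts and BUSCO completeness when comparing assemblies.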

| Conserved locus identification and bait design
Therefore, each design was considered a separate bait set and given a version number to keep track of the specificity of the retrieved probes during downstream analysis (Figure S2; Table S2).
This resulted in a total of 137 bait sets, including 48 bait sets
To search for ultraconserved element (UCE) loci across different subsets of transcriptomes, 100-bp paired reads were simulated for each taxon using ART_ILLUMINA (Huang et al., 2012) and aligned to a reference or 'base' transcriptome using stampy version 1.0.32 (Lunter & Goodson, 2011) with a sequence divergence of ≤5. The resulting BAM output files were filtered to remove unmapped reads using samtools (Li et al., 2009). The BAM files were then converted into BED format using bedtools (Quinlan & Hall, 2010) and sorted by contig and position along that contig, and proximate or overlapping putative conserved regions (of the simulated reads) were merged.
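The merging of proximate or overlapping intervals described above can be sketched in a few lines of Python. This is a simplified stand-in for what an interval-merging tool does, not the bedtools implementation; the `max_gap` parameter and the example coordinates are assumptions made for illustration.

```python
def merge_intervals(intervals, max_gap=0):
    """Merge BED-like (contig, start, end) intervals.

    Intervals on the same contig are merged when they overlap or lie
    within max_gap bases of each other (max_gap is a hypothetical knob).
    """
    merged = []
    for contig, start, end in sorted(intervals):
        if merged and merged[-1][0] == contig and start <= merged[-1][2] + max_gap:
            prev = merged[-1]
            # Extend the previous interval instead of starting a new one.
            merged[-1] = (contig, prev[1], max(prev[2], end))
        else:
            merged.append((contig, start, end))
    return merged


# Made-up mapped-read intervals on two contigs.
reads = [("c1", 0, 100), ("c1", 90, 180), ("c1", 300, 400), ("c2", 50, 150)]
print(merge_intervals(reads))
```

Sorting first by contig and then by start position is what makes a single left-to-right pass sufficient.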
Any putative conserved intervals shared between the taxa and the selected base transcriptome and/or repetitive regions were removed. Also, intervals that were shorter than 80 bp, had >25% of their bases masked, and/or contained ambiguous (N or X) bases were removed using the command phyluce_probe_strip_masked_loci_from_set.
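The three interval filters above (minimum length, maximum masked fraction, no ambiguous bases) can be expressed as a single predicate. The sketch below is illustrative only and is not the PHYLUCE code; it assumes, as a labelled simplification, that masked bases are written in lower case (soft-masking), and the function name is hypothetical.

```python
def keep_locus(seq, min_len=80, max_masked=0.25):
    """Return True if a candidate locus passes the three filters.

    Assumption: masked bases are lower case (soft-masked sequence).
    """
    if len(seq) < min_len:
        return False  # too short to tile a bait on
    if any(base in "NnXx" for base in seq):
        return False  # contains ambiguous bases
    masked = sum(1 for base in seq if base.islower())
    return masked / len(seq) <= max_masked


# Made-up sequences for illustration.
print(keep_locus("ACGT" * 25))               # 100 bp, unmasked
print(keep_locus("acgt" * 10 + "ACGT" * 15)) # 40% masked
```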
The identified UCEs were extracted with a buffer region of 120-bp using phyluce_probe_get_genome_sequences_from_bed, and temporary bait sets targeting the respective UCEs were designed with phyluce_probe_get_tiled_probes with the tiling density set to two.
Potentially problematic baits (those with a >25% repeat content and a GC content >70% or <30%) and duplicates (i.e. sequences that were ≥50% identical over ≥50% of their length) were removed from the temporary bait set using phyluce_probe_easy_lastz followed by phyluce_probe_remove_duplicate_hits_from_probes_using_lastz.
To confirm whether the UCEs of the temporary bait sets could be located in the exemplar taxa, the baits were aligned against each of the transcriptomes using phyluce_probe_run_multiple_lastzs_sqlite, with an identity value of 70% and a default value for minimum coverage of 83% (Quattrini et al., 2018). Baits matching multiple contigs were removed. The resulting regions were extracted from each transcriptome sequence, buffering each locus to 140-bp, using phyluce_probe_slice_sequence_from_genomes.
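The logic of this step (keep hits above the identity and coverage thresholds, then discard baits that match more than one contig) can be sketched as follows. The tuple layout of the hits and the function name are hypothetical and do not reflect the lastz or PHYLUCE output formats; only the 70% and 83% thresholds come from the text.

```python
from collections import defaultdict


def single_copy_baits(hits, min_identity=70.0, min_coverage=83.0):
    """Map each bait to its contig, keeping only unambiguous baits.

    hits: iterable of (bait, contig, identity_percent, coverage_percent)
    tuples (a hypothetical, simplified hit table).
    """
    contigs_per_bait = defaultdict(set)
    for bait, contig, identity, coverage in hits:
        if identity >= min_identity and coverage >= min_coverage:
            contigs_per_bait[bait].add(contig)
    # Baits hitting more than one contig are dropped as ambiguous.
    return {
        bait: next(iter(contigs))
        for bait, contigs in contigs_per_bait.items()
        if len(contigs) == 1
    }


# Made-up hits for illustration.
hits = [
    ("bait1", "contig1", 95.0, 100.0),
    ("bait2", "contig1", 80.0, 90.0),
    ("bait2", "contig7", 75.0, 85.0),  # second contig: bait2 is ambiguous
    ("bait3", "contig2", 60.0, 100.0), # below the identity threshold
]
print(single_copy_baits(hits))
```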
The final bait set to target haplosclerid UCEs was designed using phyluce_probe_get_tiled_probe_from_multiple_inputs. The baits were designed to be 80-bp long with a two-times tiling density and 40-bp overlap in the middle (two baits), a GC content between 30% and 70% and <25% masked bases. Duplicate baits (i.e. those that are ≥50% identical over ≥50% of their length) were removed using phyluce_probe_easy_lastz and phyluce_probe_remove_duplicate_hits_from_probes_using_lastz and assigned a UCE-code to uniquely identify the bait sets generated by the base transcriptome used and taxonomic specificity (i.e. haplo-only versus haplo+outgroup). The probes retrieved from the different bait set versions were merged, followed by filtering for duplicates. Eventually, from this final bait set, 20,000 baits were selected randomly for synthesis (from here on, further referred to as 'multilocus probe assay').
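The bait geometry described above (two 80-bp baits overlapping by 40 bp, so that together they span 120 bp around the middle of a locus) and the GC filter can be illustrated with a small sketch. This is a simplified reading of the tiling scheme under stated assumptions, not the PHYLUCE algorithm; all function names and the example locus are hypothetical.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def tile_two_baits(locus, bait_len=80, overlap=40):
    """Return two overlapping baits centred on the locus midpoint."""
    span = 2 * bait_len - overlap  # 120 bp covered by both baits
    if len(locus) < span:
        raise ValueError("locus shorter than the tiled span")
    start = (len(locus) - span) // 2
    step = bait_len - overlap
    return [locus[start + i * step: start + i * step + bait_len] for i in range(2)]


def passes_gc(bait, low=0.30, high=0.70):
    """Apply the 30-70% GC content window."""
    return low <= gc_fraction(bait) <= high


locus = "ACGT" * 40  # made-up 160-bp locus
baits = tile_two_baits(locus)
print([len(b) for b in baits], [passes_gc(b) for b in baits])
```

With these defaults the last 40 bases of the first bait are identical to the first 40 bases of the second, which is what a two-times tiling density means at the centre of the locus.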

| In silico test of the bait set
The designed baits were mapped against 39 transcriptomes from various demosponges (Table S3) to assess whether the multilocus probe assay could successfully and consistently capture haplosclerid loci. We used phyluce_assembly_match_contigs_to_probes with the minimum coverage and identity set to 85%. This was followed by extracting the UCE loci using phyluce_assembly_get_match_counts to create the initial list of targeted loci. The targeted loci were then aligned with MAFFT (Katoh et al., 2002)

| In vitro test of the bait set
Following the in silico test, biotinylated RNA baits were synthe-
We sequenced up to 32 libraries (i.e. four target capture pools) per high-throughput MiniSeq run.

| Analysis of the sequenced enriched libraries
Upon sequencing, the demultiplexed Illumina raw reads were checked for adapter contamination and low-quality bases and trimmed using fastp (Chen et al., 2018). The cleaned reads were then further processed using a modified workflow of PHYLUCE: Tutorial I: UCE Phylogenomics (https://phyluce.readthedocs.io/en/latest/tutorials/tutorial-1.html) (Faircloth, 2016; Faircloth et al., 2012). The data were assembled using the phyluce_assembly_assemblo_spades program, followed by looking for matches between the assembled contigs and UCE bait sequences (85% identity, 85% coverage) using phyluce_assembly_match_contigs_to_probes. Then, the loci were extracted using phyluce_assembly_get_match_counts, followed by phyluce_assembly_get_fastas_from_match_counts. The UCE loci were aligned using phyluce_align_seqcap_align, followed by Gblocks internal trimming of the alignments using phyluce_align_get_gblocks_trimmed_alignments_from_untrimmed with default parameters. The alignments were cleaned using phyluce_align_remove_locus_name_from_files, and 35% complete data matrices were created (i.e. each retained locus was present in at least 35% of the specimens) for the entire data set and the individual clades A, B and C. The alignments were concatenated for further downstream processing.
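The idea behind a 35% complete data matrix can be sketched as a simple occupancy filter: a locus is kept only if it was recovered from a sufficient share of the taxa. The sketch below reads the threshold as "at least 35% of taxa, rounded up", which is an assumption; the actual tool may round differently. Function and locus names are hypothetical.

```python
def filter_by_occupancy(locus_to_taxa, n_taxa, min_percent=35):
    """Return the loci present in at least min_percent of n_taxa taxa.

    locus_to_taxa: dict mapping a locus name to the set of taxa in
    which it was recovered (hypothetical input structure).
    """
    # Integer ceiling division avoids floating-point rounding surprises.
    min_taxa = (min_percent * n_taxa + 99) // 100
    return sorted(
        locus for locus, taxa in locus_to_taxa.items() if len(taxa) >= min_taxa
    )


# Made-up recovery table for 10 taxa: at least 4 taxa are needed at 35%.
recovered = {
    "uce-1": {"t1", "t2", "t3", "t4"},
    "uce-2": {"t1", "t2", "t3"},
    "uce-3": {f"t{i}" for i in range(1, 11)},
}
print(filter_by_occupancy(recovered, n_taxa=10))
```

Lowering the percentage retains more loci at the cost of more missing data per locus, which is the trade-off behind choosing a 35% matrix.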
The number of phylogenetically informative sites was calculated using PAUP* 4.0a169 (Swofford & Sullivan, 2009). Maximum likelihood (ML) inference was performed with RAxML version 8.2.12, using the options for rapid bootstrapping (model GTRGAMMAX, 1000 bootstrap replicates). A Bayesian inference (200,000 generations, 25% burn-in) was conducted using RevBayes version 1.2.1 with the same model settings as the ML analysis.
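As an illustration of what counting informative sites involves, the sketch below applies the common parsimony-informative criterion: a column counts if at least two different nucleotides each occur in at least two sequences. It is assumed here, not confirmed by the text, that this is the criterion reported by PAUP*; gaps and ambiguity codes are simply ignored, and the toy alignment is made up.

```python
from collections import Counter


def count_informative_sites(alignment):
    """Count parsimony-informative columns in an aligned set of sequences."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("sequences must be non-empty and of equal length")
    informative = 0
    for column in zip(*alignment):
        # Only unambiguous nucleotides are counted; gaps are skipped.
        counts = Counter(b for b in (c.upper() for c in column) if b in "ACGT")
        states_seen_twice = sum(1 for n in counts.values() if n >= 2)
        if states_seen_twice >= 2:
            informative += 1
    return informative


toy_alignment = ["AAGT", "AAGA", "ACTA", "ACT-"]
print(count_informative_sites(toy_alignment))
```

In the toy alignment only the second and third columns qualify, since each contains two states that are both shared by two sequences.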

| Multilocus probe assay and targeted loci
Depending on the bait set and the universality of the selected UCEs, the number of recovered probes varied between 195 and 476 for n taxa = 7 and between 5974 and 12,002 for n taxa = 2 for the haplo-only bait sets. These baits target 14 to 1790 loci (Table 3a). For the haplo+outgroup bait sets, the number of recovered probes varied between 16 and 270 (opting for the most stringent option) and targeted one to 17 loci (Table 3b). After merging the probes retrieved from the different bait sets and filtering for duplicates, we obtained 21,463 unique baits. For the multilocus probe assay, we selected
For these bait sets, the second most stringent option (i.e. n = 7) was opted to avoid an output of 0.

| Target-enrichment multilocus phylogeny of Red Sea Haplosclerida
The total number of reads obtained from the target-cap-
Most nodes had moderate to high bootstrap and posterior probability values at shallow and deeper nodes (>70 and ≥0.99, respectively), except for three nodes for which bootstrap values ranging between 61 and 68 were found (Figure 2; Figure S3). Both analyses recovered the four widely accepted clades within the Haplosclerida with high support (Figure 2). The 26 specimens, morphologically identified to genus or species level, covered nine different genera belonging to the families Chalinidae, Callyspongiidae, Niphatidae and Petrosiidae (Supplementary Information II). Accordingly, in the phylogeny, the species' transcriptomes used for bait design fell in the expected clades A, B, C and E, as shown by previous studies. Furthermore, we observed that morphologically identical specimens had sister relationships in the tree.

| Target-capture approach captures sufficient loci across Demospongiae
The designed probes 'enriched' between 234 and 798 loci in haplosclerids (572 ± 185 SD) and between 46 and 195 loci in the other demosponges (132 ± 32 SD) (
Note: The percentage matrix value (%) equals the percentage of the total number of specimens (# taxa) that occupy a specific locus. The phylogenetically informative (PI) sites were calculated for the different data sets.
length of 86,118 bp and 45,642 (53%) phylogenetically informative (PI) sites (Table 4). The ML phylogeny generated from the alignment had overall well-supported nodes. For the Red Sea haplosclerids, in particular, bootstrap values were moderate to high (74-100) (Figure 3). The Bayesian phylogeny was mostly congruent with the ML phylogeny, except for the placement of Aaptos sp., which had a sister relationship with Tentorium papillatum instead of forming a clade together with Halichondria (Halichondria) panicea and Pseudospongosorites suberitoides (Suberitida) (Figure S4). The transcriptome of Aaptos sp. had the lowest number of contigs and the lowest complete and single-copy BUSCO values (10.4% and 8.4% respectively) compared to the other transcriptome data used for the analysis (Table S3). The posterior probability values in the Bayesian phylogeny indicated high support (>0.99). In both phylogenies, the haplosclerid transcriptomes grouped in clades A, B, C and E, congruent with the earlier single-marker phylogenies. In the demosponge phylogeny, the two Tedania species were not a sister group. However, both fell within the order Poecilosclerida. Overall, we recovered a phylogeny that is mainly consistent with accepted relationships of Demospongiae (Erpenbeck et al., 2004; Redmond et al., 2013; Thacker et al., 2013).

TABLE 5
The number of UCE loci recovered from the targeted 20,000 baits from the 39 demosponge transcriptomes.
Note: Additionally shown are the total number of contigs, the number of UCEs removed for matching multiple contigs and the number of contigs removed for matching multiple UCEs.

| DISCUSSION
To our knowledge, this is the first time a multilocus probe assay has been designed to capture hundreds of genome-wide loci from one of the systematically most challenging groups of non-model organisms, namely sponges (Phylum Porifera). Here, we demonstrate the applicability of target-capture enrichment using ultraconserved element (UCE) loci as an effective and robust approach for resolving phylogenetic relationships on both deep and shallow evolutionary timescales from species across one of the most diverse demosponge orders, the Haplosclerida. Thus far, target-capturing of UCEs has been successfully applied to other marine invertebrates, such as echinoderms (Hugall et al., 2015), anthozoans (Cowman et al., 2020; Quattrini et al., 2018), arthropods (Ballesteros et al., 2021), annelids (Petersen et al., 2022) and recently molluscs (Goulding et al., 2023; Moles & Giribet, 2020), where they successfully resolved phylogenetic relationships at both high (class) and low (species) taxonomic levels, and provided insights into evolutionary dynamics between species.
Using our custom-made multilocus probe assay, we reconstructed a supra-specific phylogeny of a broad range of Red Sea Haplosclerida based on the target-captured loci and additional transcriptome data of 11 taxa. This phylogeny was well-resolved, congruent with previous results (McCormack et al., 2002; Raleigh et al., 2007; Redmond et al., 2011, 2013) and had high branch support on both deeper and shallow nodes; thus, we demonstrate the high potential of the probe assay for demosponge taxonomy, systematics and phylogeny. We propose future studies to tailor the application of this target-capture enrichment approach for (rapid) species identification to specific sponge taxa to benefit the development of biomonitoring tools to aid biodiversity surveys. To this end, Erickson et al. (2021) demonstrated that sufficient SNPs could be extracted from the UCE loci to differentiate between populations of corals. A similar approach would be highly desirable for a broad variety of sponge taxa.
The performance of a bait set is inherently dependent on the quality of the input data used for its design. In the case of sponges, obtaining high-quality transcriptomes or genomes that can be used for probe design is challenging because of the potential inhibition of enzymatic reactions during library production by co-extracted biochemical compounds produced by the sponge holobiont (Chelossi et al., 2004) or co-amplification of commensal organisms that reside in sponges (Vargas et al., 2012). At the time of probe design, only one haplosclerid genome (Amphimedon queenslandica) was published (Srivastava et al., 2010), which constrained us to work mainly with transcriptomic data and, as a result, the designed baits all fall in coding (exon) regions. Although highly conserved elements are more likely to be found in coding regions and are also expected to capture loci across species more efficiently, they are argued to be more suitable for resolving relationships at low to moderate phylogenetic distances (Bi et al., 2012). However, when combined with target capture, previous studies have demonstrated the utility of exon-based loci for inferring and resolving phylogenetic relationships at both deeper and shallow nodes (Ballesteros et al., 2021; Quattrini et al., 2018). Transcriptomes of the haplosclerids Petrosia (Petrosia) ficiformis, Haliclona (Gellius) amboinensis and Xestospongia testudinaria that were available to us were not included in our bait design due to transcriptome incompleteness and questionable identification of the species due to their placement in the 18S rDNA phylogeny (Figure S1). However, these species were placed within the Haplosclerida in our in silico and in vitro tests based on the loci retrieved from these transcriptomes. Hence, despite the ambiguous placement of these haplosclerids using the 18S rDNA marker, our probes captured sufficient information to place them accurately in the UCE-based phylogeny, which is also in agreement with previous results (Guzman & Conaco, 2016; McCormack et al., 2002; Redmond et al., 2011, 2013). In addition, we observed inconsistencies in the transcriptomic data in the 18S-based phylogeny that were not present in UCE-based phylogenies. First, some of the other (non-haplosclerid) demosponges (Cliona varians, Stelligera sp., Tedania (Tedania) anhelans and Kirkpatrickia variolosa) were incorrectly placed in the 18S phylogeny, while they were correctly placed in our UCE-based phylogeny (Figure 3). Second, we observed long branches for some of the species (e.g. Cliona varians) or entire clades (e.g. Mycale (Carmia) cecilia and Halichondria (Halichondria) panicea) in the 18S phylogeny, which were, in turn, not detected in our UCE-based phylogenies. Although we cannot exclude the possibility of artificial errors occurring in the 18S sequences as a result of multiplexing PCR or the selected sequencing approach, we argue that the choice of 18S as a single marker is potentially to blame for the misplacement of certain species and the possible long branch attraction (LBA) observed. Another reason for such artefacts could be high substitution rates in the rRNA evolution of Haplosclerida (Lavrov et al., 2008; Simion et al., 2017). Here, we demonstrate that our UCE-based approach is likely less prone to these limitations and results in more reliable phylogenies compared to 18S or, generally, single marker-based phylogenetic reconstructions.
Between the different bait set versions, we observed considerable differences in the number of retrieved loci from essentially the same input data (Table 2). For example, we obtained a relatively high number of probes from bait sets with lower stringency levels
Ambiguities regarding the placement of species in the phylogenetic trees were only noticed in a few cases, which is likely the result of the incompleteness of the transcriptome data used for the analysis (Table S3). For example, when comparing the haplosclerid and the demosponge phylogenies, the placement of Xestospongia testudinaria remained ambiguous. Namely, in the Red Sea analysis, X. testudinaria is nested within Clade A, while in the demosponge phylogeny, it was found to be a sister group to Clade B. The alternating position of X. testudinaria between trees with different taxon samplings could be explained by the lower number of haplosclerid taxa in the demosponge phylogeny, which is also based on fewer retrieved loci compared to the phylogeny containing largely haplosclerid taxa.
Another reason for the different placement could be incomplete taxon sampling (see discussion in Quattrini et al., 2018; Wiens, 2005) because our Red Sea phylogeny contains a more reduced taxon set compared with both McCormack et al. (2002) and our larger demosponge phylogeny. Adding more haplosclerid representatives, and in particular type material that functions as a taxonomic reference point, will certainly lead to a more accurate placement of the species and genera in this group and an improved phylogeny-based classification of Haplosclerida.
The target-capture enrichment approach uses relatively short bait sequences and, therefore, is suitable for generating genome-scale data from older and possibly poorly preserved or formalin-fixed museum material (Agne et al., 2022; McCormack et al., 2016). Thus, it can be leveraged for 'museomics' to serve as the baseline for re-eval-
This study demonstrates that target-capture enrichment combined with our newly designed multilocus probe assay is a valuable genomic resource for understanding evolutionary relationships among demosponges, particularly haplosclerid sponges. We confirmed the broad applicability of the bait set as ordinal relationships of species were revealed across the class Demospongiae. Future studies are recommended to test and possibly broaden the applicability of the bait set to other demosponge groups. For example, studies could select the markers of our multilocus probe assay that successfully captured loci from these groups and enrich the subset
sised using the myBaits® Custom (1-20,000) target capture kit by Daicel Arbor BioSciences (Ann Arbor, MI, USA). Prior to the in vitro test, the initial concentration of each sample was measured with a Qubit 2.0 fluorometer, and libraries were prepared using a modified protocol of the xGen™ ssDNA & Low-Input DNA Prep (96 rxn) library kit (Supplementary Information I). Briefly, we used half the reaction mix for the adaptase, extension, ligation and dual indexing PCR steps of the library preparation protocol. Following the manufacturer's protocol, we quantified and quality-controlled the libraries using a Qubit 2.0 fluorometer and a Bioanalyzer 2100 in High Sensitivity DNA mode and amplified the genomic DNA libraries to obtain the required 250 ng for hybridisation. Hybridisation and target capture enrichment were performed using pools of eight libraries and the myBaits target enrichment standard protocol. The target-enriched libraries were 150PE sequenced in an Illumina MiniSeq in high-throughput mode.

TABLE 3
The number of recovered probes (left) and targeted loci (right) for (a) the haplo-only bait sets depending on the selected base transcriptome and the stringency level during the bait design (i.e. the number of taxa aligned against the base transcriptome) and (b) the haplo+outgroup bait sets including a demosponge outgroup in the bait design. Note that for the haplo+outgroup bait set, only the most stringent option (i.e. n taxa = 8) was opted, except for two bait sets, namely the bait set with Haliclona (Haliclona) oculata as base transcriptome and Crella (Crella) elegans as demosponge outgroup and the bait set with Neopetrosia compacta with Vaceletia sp. as demosponge outgroup, for which the second most stringent option was selected (i.e

FIGURE 2
Maximum-likelihood phylogeny based on a 35% concatenated alignment matrix of the 446 retrieved loci (n taxa = 40, n char = 118,775 bp). Branch labels denote bootstrap (BS) values/posterior probabilities (PP) from Bayesian analyses, with BS = 100% and PP = 1.0 indicated with an asterisk (*). The Red Sea samples are indicated with a GW-ID number. The other haplosclerids are derived from transcriptome data. The tethyid Tethya wilhelma, the poecilosclerid Latrunculia (Latrunculia) apicalis and scopalinid Scopalina sp. were used as outgroups. The scale is 0.08 substitutions/site.
(e.g. level 2-5) and with H. (Rhizoniera) viscosa set as the base transcriptome, while for the most stringent options (e.g. level 6-7) more probes were retrieved when A. queenslandica was set as the base transcriptome. This is not trivial since the specificity of the bait set likely depends on the quality and completeness of the input data and the phylogenetic distance between the input species. Therefore, future studies focussing on specific families or clades could optimise our bait set to maximise the enrichment probability of a subset of loci in the taxon of interest. Although the ratio of probes may not be directly proportional to the different clades within the Haplosclerida, our bait set successfully enriched numerous loci from our Red Sea haplosclerids and resulted in a phylogeny that recovered almost all previously molecularly defined clades (i.e. Clade A-C, and E) of the Haplosclerida.
uating systematics and taxonomy in sponges and other groups. In this vein, a new phylogenetic classification of Haplosclerida should implement a 'bottom-up strategy', applying target-capture enrichment to each haplosclerid genus type if possible or to taxonomically validated material if types are not available to derive a phylogenetically consistent classification of this order.

Family Species ID SMF ID Latitude Longitude Region Location Chalinidae
Note: Ten specimens were collected in the framework of the Red Sea Biodiversity Project (Senckenberg Research Institute; King Abdulaziz University, KAU, Saudi Arabia) and are part of the collection of Senckenberg Research Institute in Frankfurt (SMF), Germany. The other 16 specimens were collected under the campaigns led by the King Abdullah University of Science and Technology (KAUST) Thuwal, Saudi Arabia, in the Red Sea. Samples were collected between 2012 and 2017.

Table 5). The concatenated 35% occupancy alignment matrix consisted of 72 conserved loci and had a trimmed mean locus length of 1196 bp ± 846 SD, a total

TABLE 4
Alignment matrix statistics for the different data sets.