Travel with your kin ship! Insights from genetic sibship among settlers of a coral damselfish

Abstract Coral reef fish larvae are tiny, exceedingly numerous, and hard to track. They are also highly capable, equipped with swimming and sensory abilities that may influence their dispersal trajectories. Despite the importance of larval input to the dynamics of a population, we remain reliant on indirect insights to the processes influencing larval behavior and transport. Here, we used genetic data (300 independent single nucleotide polymorphisms) derived from a light trap sample of a single recruitment event of Dascyllus abudafur in the Red Sea (N = 168 settlers). We analyzed the genetic composition of the larvae and assessed whether kinship among these was significantly different from random as evidence for cohesive dispersal during the larval phase. We used Monte Carlo simulations of similar‐sized recruitment cohorts to compare the expected kinship composition relative to our empirical data. The high number of siblings within the empirical cohort strongly suggests cohesive dispersal among larvae. This work highlights the utility of kinship analysis as a means of inferring dynamics during the pelagic larval phase.

Studies on the dispersal strategies of early life histories of coral reef fishes have focused on later development stages using genetic kinship analyses and/or abundance surveys of settled juveniles (Almany et al., 2013;Avise & Shapiro, 1986;Bernardi, Beldade, Holbrook, & Schmitt, 2012;Berumen et al., 2012;Herrera et al., 2016;Jones, Planes, & Thorrold, 2005;Nanninga, Saenz-Agudelo, Zhan, Hoteit, & Berumen, 2015;Schunter, Garza, Macpherson, & Pascual, 2013). Interestingly, most of these studies found evidence for the presence of siblings and high genetic relatedness between some of the juveniles studied. For instance, Bernardi et al. (2012) were the first to detect that siblings of a damselfish were arriving together at their new home, as they found that related individuals mostly recruited to the same or nearby anemones on the same night (Bernardi et al., 2012). However, whether this was just an anomalous result due to high recruitment success of a small recruiting source to that particular reef or for that species, or if it was evidence for some intrinsic behavioral trait of a fish larva to stay close to its kin during the PLD, still remains unanswered.
Our study focuses on Dascyllus abudafur, a small zooplanktivorous coral reef damselfish, which was recently taxonomically "resurrected" from its sister species, D. aruanus (Borsa, Sembiring, Fauvelot, & Chen, 2014). These fishes form colonies in which all individuals differ in their total body length, likely indicating that they are of different age. Nonoverlap of ages may serve as a mechanism to decrease foraging competition and inbreeding among putative siblings of the same recruiting cohort (among D. aruanus (Gillespie, 2009) and in D. abudafur, from personal measurements). After the completion of a PLD of 23.09 (±1.92 SD) days (Robitzch, Lozano-Cortés, Kandler, Salas, & Berumen, 2015) they reside in branching corals for their entire life span (among D. aruanus from Fricke & Holzberg, 1974;and Coates, 1982). Dascyllus abudafur is omnipresent in the Red Sea and the Western Indian Ocean (Borsa et al., 2014) while D. aruanus is abundant and well-studied throughout the Indo-Pacific. Therefore, ample information is available on D. aruanus' biology and ecology, which we use as proxy for the biological data implemented in our study for statistical inferences (Appendix S1). For D. abudafur, there is no information available on recruitment behavior. However, D.
aruanus tends to have seasonal breeding peaks and preferentially recruits a couple of days after the new and full moon and around the first and third quarter moons, a pattern that is common among damselfishes (Mizushima, Nakashima, & Kuwamura, 2000). In our study, we assessed the genetic composition and relatedness of a subset of 168 new recruits captured in one single night during the new to first quarter moon, in one single reef in the north central Saudi Arabian Red Sea, which we refer to as the recruiting cohort sample (RCS). This RCS provides a snapshot of the geneflow for a specific reef during one recruitment event and allowed us to ask crucial questions about larval behavior at very early life stages. We were also provided some insight on realized connectivity between populations . Since it is nearly impossible to follow and observe a larva from the time it hatches to the time it settles, inference derived from sibship among the larvae of our RCS allows us to propose strategies of dispersal and putative cohesive routing during the PLD of this species. The main questions of this study were: (a) Are larvae nonrandomly recruiting together with their kin? and (b) does the RCS reflect the genetic diversity of the adult population at the settlement destination?

| Study site and sample collection of Dascyllus abudafur
In the Saudi Arabian Red Sea near the city of Rabigh at Al-Karrah Bellamare). The light trap was deployed with a 1 kg weight attached to its cod end, from a 35 m research vessel (RV Thuwal) on the side of the reef sheltered from the predominant wave activity. The light trap remained at a depth of ~1 m and a bottom depth of ~15 m from 17.00 hr (before sunset) until 06.00 hr (before sunrise) of the next day. This sample is referred to as the recruiting cohort sample (RCS).
Additionally, 53 adult specimens (referred to as the adult population sample; APS) were collected the same day at about 15 m depth from four coral colonies at AKA using clove oil, tweezers, hand nets, and a 1 m × 1.5 m plastic bag to cover the targeted colonies (before applying clove oil). All specimens were immediately stored in 96% ethanol and transported to KAUST facilities for further analysis.

| Single nucleotide polymorphisms (SNPs) library preparation and sequencing
Genomic DNA was extracted from preserved fin or gill tissue (from adult specimens) or the head of the recruiting larvae using a Nucleospin-96 Tissue Kit (Macherey-Nagel, Düren, Germany).
Double digest restriction associated DNA (ddRAD) libraries were prepared using the restriction enzymes SphI and MluCI (NEB) and following the protocol described by Peterson, Weber, Kay, Fisher, and Hoekstra (2012) (Peterson et al., 2012)
Individual reads with phred-scores below 30 (averaged on a sliding window of 10%) or with ambiguous barcodes were discarded. After this, 13 individuals with less than 500,000 reads were discarded.
For the remaining 211 specimens, RADSeq loci were assembled de novo using the 'denovo_map.pl' pipeline in STACKS. Different parameter combinations were evaluated, which resulted in different numbers of loci but gave similar results in genetic comparisons (genetic clustering and F ST between sites). For the main analyses presented herein, we used a parameter combination similar to the one recommended by Mastretta-Yanes et al. (2014). Further data filtering was performed using the population component of STACKS (details in Appendix S3). The resulting vcf file was converted to other program-specific input files using PGDSPIDER v2.0.5.2 (Lischer & Excoffier, 2012). The data set was evaluated for linkage disequilibrium (LD) and Hardy-Weinberg equilibrium (HWE), as implemented in Genepop 4.2 (Raymond & Rousset, 1995;Rousset, 2008), to choose a set of SNPs that were statistically independent to each other, met kinship analysis expectations, and was thus analytically highly powerful with a lower numbers of SNPs (for faster and accurate computational inferences). LD and HWE assessments were solely based on the genotypes of the 38 adult individuals (APS) to prevent false results of LD and HWE deviations in case of potentially higher relatedness or selective pressure among young recruits.
P-values were tested for false-discovery-rate using qvalue in R (package from BioLite). GenAlEx 6.501 (Peakall & Smouse, 2006 was used to calculate the multilocus probability identity, which provides an estimate of the average probability that two unrelated individuals drawn from the same population will have the same multilocus genotype. Using this information, 300 loci (max. probability identity of 7.11 × 10 -36 ) with the most equally distributed allele frequencies (minor allele frequency > 0.33) were chosen as a highly informative and computational economic final data set, comprising 193 individuals with less than 15% missing data.

| Population structure and sibship assignment
In order to identify potential genetic differentiation between the APS and the RCS, a Bayesian clustering analysis was performed in STRUCTURE v. 2.3.4 (Pritchard, Stephens, & Donnelly, 2000). The analyses were carried out with 300,000 iterations discarded as burn-in and 500,000 iterations retained, averaged for ten runs for a number of populations set from K = 1 to 3. A maximum K of 3 was used to represent one step above the assumed putative maximum number of K's (K = 2): one for the adult population and one for the recruiting cohort. CLUMPAK (Kopelman, Mayzel, Jakobsson, Rosenberg, & Mayrose, 2015) was used to choose and graph the optimal K. Additionally, a PCA was performed using the dudi.pca function in the R package poppr to see whether the two samples clustered separately or not depending on their genotype.
COLONY was used to identify sibling groups within the RCS.
Sibling pair assignments were accepted upon a posterior probability exceeding 0.75 (as used for other coral reef fishes (Herrera et al., 2016;Schunter et al., 2013)). The parameters used were the following: polygamous diploid dioecious species, no inbreeding, unknown population allele frequencies, no information on the maternal or paternal mean sibship size, and using the FLPL method: a combination of full likelihood (FL) and the pairwise-likelihood (PL) method, with medium precision in computation, faster than the FL method, but with a similar accuracy specially for high number of markers and not too large sibship sizes. These parameters were used on the 300 loci data set, using three different random seed numbers.

| Estimation of the theoretical total recruiting cohort (TRC) of Dascyllus abudafur at Al-Karrah Reef (AKA, Red Sea) for statistical inferences
To investigate whether the percentage of sibling pairs (calculated by COLONY) within our empirical RCS at AKA (N = 168, after qualityfiltering of SNPs data) were evidence of corecruitment with kin in D. abudafur, or whether similar values could also be achieved by random association, we estimated different putative theoretical total recruiting cohorts (TRCs). We define the TRC as the total number of recruiting larvae that were theoretically arriving at AKA on the night we took our sample of 168 recruits (the empirical RCS) with a light trap. The inferred TRCs were used to subsample, randomly and computationally, multiple recruiting cohort samples (computed RCSs, i.e., cRCSs) of equal size (N = 168) to our empirical RCS; and compare the sibship composition of both (the cRCS versus. our empirical RCS). Hence, we tested whether the sibship assignments within the empirical RCS were significantly higher than the average sibship composition of the permutated random cRCSs. A scheme of our experimental design can be found in Figure 2.
As a first step to estimate the size of and kin within a theoretical TRC, data on the reproductive biology, ecology, abundance, and habitat of D. abudafur was gathered from literature, personal observations, and discussions with expert colleagues in the field (Appendix S1 and Table 2). One problem with assessing whether siblings are more common than expected is that we do not know the area from which the recruiting cohort could have been sourced. Since the size and genetic composition of the true TRC is unknown, to have a wider insight into what kind of cRCSs different TRCs could deliver, the aforementioned information was firstly used to infer two theoretical TRCs of different source areas (sizes/radii) around our study site.
The first one, "TRC r25, " represented the scenario that larvae hatching from reefs around a radius of 25 km (r25) from AKA can recruit to AKA after completing their pelagic larval duration (PLD). The second, "TRC r50, " represented the scenario that the AKA recruits have hatched at reefs from a radius of maximum 50 km (r50) around AKA.
For the calculations of the two TRC r s (TRC r25 and TRC r50 ), the following parameters were required: (a) the daily number of larvae produced by an average colony of D. abudafur that will survive the F I G U R E 2 Scheme of the types data used in this study for Dascyllus abudafur from a north central Red Sea reef, Al Karrah (black triangle in the map, upper left corner). Yellow fish represent recruiting larvae of D. abudafur, for which "TRC" is the total recruiting cohort, represented by the different blue figures. The hexagonal blue figure is the TRC the study targets to estimate, while the two smaller round blue figures represent theoretical TRCs under the scenario that the radii for these TRCs are either 25 km (r25) or 50 km (r50). The gray figures represent a subsample of the recruiting cohort (i.e., recruiting cohort sample, "RCS," either empirical or Monte Carlo simulated). The cylindrical gray figure symbolizes the light trap which captured the recruiting larvae in this study (i.e., our RCS of N = 168), for which ddRAD sequencing SNPs data was generated to estimate sibship assignments using the program COLONY (i.e., blue/male, pink/female, and yellow/progeny fishes). Based on these results, the maximum number of colonies that provided our RCS at AKA (Cmax), the theoretical radius around AKA containing Cmax (r Cmax ), and the resulting TRC for that Cmax (TRC Cmax ) were calculated. Additionally, computational subsampling of the two theoretical TRCs (for r25 and r50) provided computed RCSs ("cRCS," N = 168), with which the maximum source radius for a putative TRC containing sibship assignments similar to those within our empirical RCS (r Rp ) was estimated to jointly infer: the TRC from the maximum possible radius to achieve sibship assignments similar to those within our empirical RCS by random (TRC rRp ) and the TRC of the maximum number of colonies that can be source of the RCS in order to achieve with 95% confidence a similar sibship assignment to that found within the empirical RCS (TRC riRCS ). For more details see

A B C D ) 2 x FS + 2 x HS
PLD and successfully recruit to a new home reef (i.e., be part of the daily TRC of a reef during the reproductive season; R Cd ); (b) the number of D. abudafur colonies per area of habitat in the Red Sea (C A ); (c) the percentage of simultaneously reproductive active females in the population (F R% ); and (d) the area of habitat available in a chosen radius of (putative) recruitment sources (A r ). This resulted in the following formula: TRC r = R Cd × C A × F R% × A r . The same formula was used for both radii since we did not incorporate a decay in dispersal with distance among D. abudafur in the Red Sea due to the lack of evidence for isolation by distance at a scale relevant to our study (Raitsos et al., 2017;Saenz-Agudelo et al., 2015). The values for all formulae used can be found in Table 2, and details on the calculations of (a) to (d), in Appendix S4.

| Statistical significance of kinship results: hypothesis testing
With the calculated TRC r25 and TRC r50 , our study aimed to test the hypothesis (null hypothesis) that the presence of a given number of sibling pair assignments or sibling groups in our sample can be the result of random events rather than active larval behavior. Under this hypothesis, we assume that a pool of N recruits (here: N = 168) taken from the TRC at a certain reef is a random sample of a completely mixed recruitment cohort at the focal home reef. The alternative hypothesis is that those N recruits comprise groups of siblings because larvae stick together among their kin (and are not randomly mixed), while traveling in the pelagic environment (during their PLD), and before recruiting (as we believe is the case for our empirical RCS).
To test our hypothesis, a script was coded in PYTHON (https:// github.com/HexTr ee/fish-sibli ngs/blob/maste r/fish_sim.py) for Monte Carlo simulations to randomly draw one million times a subsample of 168 recruits (= cRCSs) out of the theoretical TRC r s (TRC r25 (1) and TRC r50 (2)) and compare the average sibling composition of the random draws (average sibling composition of cRCS r25 s vs. that of cRCS r50 s) to calculate the rate (Rp) at which the likelihood to withdraw sibling pairs (SPs) in a cRCS (N = 168) decreases as the radius is halved. Rp was calculated using the following two assump-  Table 2).

| Population genetic statistics and Bayesian clustering analysis
Using 300 independent SNP loci, the expected heterozygosity was high and similar for both analyzed samples, the RCS and APS

| Sibship assignment
Siblings within the RCS were identified using COLONY.

| Statistical significance of kinship results and data mining
Monte Carlo simulations were run for various TRC r s. The different radii associated to each of the following TRCs are drawn in Figure 5 for scale. Random draws of 168 individuals (cRCS) out of a TRC con- The COLONY results also indicate that a maximum of 120 colonies (= C max ) provide our RCS. Assuming that each one of these colonies produces an average R Cd of 6, and if every single colony that built our TRC was represented within our light trap sample (RCS), then the TRC (coming from only 120 colonies) would be of R Cd × C max = 6 × 120 = 720 (= TRC Cmax ), which is much lower than the TRC rRp (from above) and equivalent to an area of A Cmax = TRC Cmax /(R Cd × C A × F R% ) = 720/ (6 × (102/km 2 ) × 0.33) = 3.57 km 2 (for: TRC Cmax = R Cd × C A × F R% × A Cmax ) and a radius of r Cmax = √(A Cmax /π) = 1.066 km.
The simulations to pick random samples of 168 and assess the sibling pairs were also run for TRC rRp and TRC Cmax , for which the former gave a SP Rp of 11, 0.41 triplets, and 0.01 quadruplets; and for TRC Cmax : 34 SP Cmax , 13.71 triplets, and 3.08 quadruplets, 0.37 quintets, and 0.02 sextets. With the second programmed tool, our data was given a score value of 142 and after 100,000 random picks that score was only observed in 0.015% of the random picks within TRC rRp . This latter result indicates how rare our empirical sibship results were even within a relatively small TRC. Furthermore, the tool computed that the TRC riRCP would have to be as little as 750 recruits only (i.e., coming from a maximum of 125 colonies and a radius of 1.09 km (r iRCP ), Table 2) to have a 95% chance to find sibship assignments similar to those given by our data when picking 168 individuals randomly. Altogether, our results highlight the significance of our data and its potential to reject the null hypothesis that the recruiting sample travels as a randomly mixed cohort of larvae and not as groups of close kin. what we know about connectivity in the Red Sea (Nanninga, 2013;Nanninga et al., 2015;Raitsos et al., 2017), such a tiny source area seems unlikely (see Fig. 6 for scale). Hence, even if most assumptions implemented in our computations are unrealistic, we provide an example of how to approach robust predictions when still lacking much data in a nonmodel organism. Based on the evidence, we assume our observations to be the result of an innate behavioral trait of schooling among kin (in D. abudafur) rather than a simple stochastic occurrence, although hydrodynamic processes such as packet entrainment may also be acting upon the dispersal of larvae. We further have evidence of potential "sneaker" males and/or the presence of more than one reproducing male per colony, as the genetic kinship among the recruiting quadruplets and quintuplets indicated that these were the progeny of at least two males (as opposed to the expected haremic half siblings from different mothers but the same father, see Figure 4). In terms of genetic composition, the new recruits of our study were providing nearly the same genetic diversity contained in the adult population sample. Both samples (that of the recruiting cohort and the adult population samples) had high heterozygosity, which suggests a large, well-mixed, and well-established D. abudafur population in the north central Red Sea, rather than the presence of strong selective sweeps and restricted geneflow. Hence, even though larvae may travel together with their siblings, the ultimate genetic exchange between sources of recruiting fishes seems to be large and stable.

F I G U R E 4 Theoretical (panel a) and empirical (panel b) sibship
assignments among quadruplets (4 × HS) and quintuplets (5 × HS) of Dascyllus abudafur from the north central Red Sea. Panel a exemplifies the typical hierarchical reproductive structure found in the species, where one dominant single male (blue fish silhouette) produces progeny (yellow fish silhouettes) with multiple females (pink fish silhouettes) in a single colony. Panel b shows the empirical results we found using the program COLONY to infer sibship assignments, which essentially deviate from the expected hierarchical reproductive structure and rather suggest the presence of at least two reproducing males in a single colony. Orange arrows represent the pedigree of the progeny (yellow fish) and the black connecting arrows between the progeny represent the half sibling assignment. The dashed lines (in panels a and b) split the sibship results among either four (4×) or five (5×) half siblings (HS), which was the highest sibship assignment found among our data Assumed progeny in hierarchical reproductive structure: Results from COLONY sibship assessment:

F I G U R E 5
Map of the coastal area around our sampling site, Al Karrah reef (AKA, north central Saudi Arabian Red Sea), near the city of Rabigh, displaying the area of putative radii providing larvae of Dascyllus abudafur that may recruit to AKA. The area of each radius (A r ) used to estimate the number of colonies providing aforementioned recruits, solely included potential habitat which is based on bathymetry and therefore only includes the coastal sections in the two lighter blue colors given in the figure's legend. The legend further indicates the radii corresponding to the different colored arrows and circles around AKA (marked with a yellow cross). Their exact sizes are, from smallest to largest: 1.066 km (r Cmax , in black), 1.09 km (r iRCS , in green), 4.31 km (r Rp , in gray), and 25 km and 50 km (r 25/50 , in pink). Details to their calculations can be found in Table 2 x AKA r = 5 0 k m r = 2 5 k m Rabigh 5 km The daily number of larvae produced by an average colony of D. abudafur that will survive the pelagic larval duration (PLD) and putatively, successfully recruit to a new home reef.

| Why are closely related larvae traveling together?
Little is known about schooling in coral reef fish larvae. Based on studies of migratory fishes from temperate regions, one could infer that schooling may have evolved to aid navigation through collective movement, especially when it comes to finding a way back from the pelagic to the restricted reef habitat. In particular among coral reef fish larvae, schooling emerges as a vital innate behavior to succeed when relying on finding targets as small as coral reefs. Schooling among your kin could further increase the ability to find suitable food as well as chemical and physical cues for navigation (for example, the 'many wrongs principle'; Codling, Pitchford, & Simpson, 2007;Larkin & Walton, 1969;Simons, 2004). It could also decrease risk of predation (Handegard et al., 2012).
Moreover, traveling in schools comprised of their closest kin may have additional advantages for larvae. Collectively, they have higher chances of recognizing dangerous predators, identifying food sources and cues (Handegard et al., 2012), and navigating to a suitable habitat (Berdahl, Westley, Levin, Couzin, & Quinn, 2016;Couzin, 2007;Torney, Berdahl, & Couzin, 2011;Torney, Neufeld, & Couzin, 2009). In schools of fish, it has also been shown that the ability to sense odor cues increases with group size (Hall, Burton, Margrey, & Graves, 1982) and coral reef fish larvae particularly rely on such cues when recruiting home (Paris et al., 2013). Hence, schooling with your siblings may not only be advantageous but also the easiest alternative, because these are the nearest kin to be found right after hatching. The likelihood of sibling larvae to school could therefore also differ between reproductive strategies (broadcast spawners vs. benthic brooders) and among environmental conditions (such as current strength and food availability). Schooling has also been found to have hydrodynamic benefits (Herskin & Steffensen, 1998;Ross, Backman, & Limburg, 1992)

| What are the consequences of schooling with kin during the pelagic larval duration (PLD) for population dynamics?
Assuming that only a small proportion of the available gene pool is transported to the population by one recruitment event ("sweepstakes effect"; Hedgecock, 1994), a single cohort is expected to have less genetic diversity than that found among cohorts (Hedgecock, Barber, & Edmands, 2007 (Hellberg, Burton, Neigel, & Palumbi, 2002;Thorrold et al., 2002). Even though we find a scenario in which siblings of D. abudafur are actively schooling together, a single recruitment event was able to provide as much of the genetic richness present in the adult population sample. This can be interpreted as a highly panmictic population, in which a large number of colonies are reproducing together and providing high connectivity between reefs, despite cohesive dispersal. We further propose that while it is a good strategy to codisperse, the larvae are not settling together with their kin into a single colony, countering interbreeding. But the exact implications our results may have on the genetic structure of the D. abudafur population in the Red Sea still remains to be investigated. Sensitivity analyses combining theoretical and empirical values related to larval dispersal (e.g., dispersal kernels, in situ hydrodynamic measurements for particle movement modeling, mortality rates, and reproductive cycles) and including data from large-and small-scale genetic studies (such as those in Saenz-Agudelo et al., 2015;Schunter et al., 2019), would improve our approach and give further insight into the population dynamics of the species. Moreover, it would be interesting to investigate whether cohesive dispersal with siblings is also a distinguishing behavioral larval trait among other coral reef fishes and maybe another crucial factor determining dispersal, population structure, and biogeographic ranges. While it is hard to define whether our finding of cohesive dispersal of siblings among coral reef fishes is an exceptional or a common behavioral larval trait, we advocate for future studies to use similar suitable genetic markers to identify relationships among conspecific corecruiting larvae (sampled temporally punctual and in large numbers) and test this behavior among other coral reef fish species. The genetic relatedness among new recruits of coral reef fishes within a settling cohort will provide essential knowledge of mechanistic features of larval dispersal (Buston et al., 2009;Planes, 2002;Selkoe et al., 2006), crucial for the management of marine populations.

ACK N OWLED G M ENTS
For logistic and fieldwork support, we thank the Coastal and

CO N FLI C T O F I NTE R E S T
The authors declare no competing interests.

DATA AVA I L A B I L I T Y S TAT E M E N T
-Script for Monte Carlo simulations: https://github.com/HexTr ee/fish-sibli ngs/blob/maste r/fish_sim.py -Sampling coordinates as given in the text (Figure 1 and