Estimating the age of Hb G‐Coushatta [β22(B4)Glu→Ala] mutation by haplotypes of β‐globin gene cluster in Denizli, Turkey

Abstract Background The Hb G‐Coushatta variant has been reported from populations in various parts of the world, including Thailand, Korea, Algeria, China, Japan and Turkey. In our study, we aimed to discuss the possible historical relationships of the Hb G‐Coushatta mutation with the possible migration routes of the world. For this purpose, associated haplotypes were determined using polymorphic loci in the beta globin gene cluster of the Hb G‐Coushatta and normal populations in Denizli, Turkey. Methods We performed statistical analyses of both populations in the Arlequin ver. 3.5 software program: haplotype analysis, Hardy–Weinberg equilibrium tests, measurement of genetic diversity and population differentiation parameters, analysis of molecular variance using F‐statistics, historical‐demographic analyses and mismatch distribution analysis. Results The diversity of haplotypes indicated different genetic origins for the two populations. However, AMOVA results, molecular diversity parameters and population demographic expansion times showed that the Hb G‐Coushatta mutation developed on the normal population gene pool. Our estimated τ values showed that the average time since the demographic expansion for the normal and Hb G‐Coushatta populations was approximately 42,000 and 38,000 ybp, respectively. Conclusion Our data suggest that the Hb G‐Coushatta population originated in the normal population in Denizli, Turkey. These results support the hypothesis of a multiple origin of Hb G‐Coushatta and indicate that the mutation may have triggered the formation of new variants on beta globin haplotypes.

Beta globin gene cluster haplotypes are frequently encountered in population surveys. The human β-globin gene cluster is located on chromosome 11 (5′-ε-Gγ-Aγ-ψβ-δ-β-3′). The haplotypes obtained by testing seven polymorphic restriction sites in this region of the beta globin gene cluster provide important data on the structure of populations, their origins and their possible associations with mutations (Alcantara et al., 2003; Chen, Easteal, Board, & Kirk, 1990; Currat et al., 2002; De Lugo, Rodriguez-Larralde, & De Guerra, 2003; Mattevi et al., 2000). Although worldwide data on β-globin gene cluster haplotypes are limited, four different genetic origins have been suggested for the haplotypes reported in association with Hb G-Coushatta cases: American Indian [− + − − + − ?], Chinese [− + + − + + ?], Kocaeli-Turkey [− + + − − − +] and Denizli-Turkey [− + − + + + +] (Li et al., 1999; Ozturk et al., 2007). In our previous studies based on haplotype analysis of the abnormal hemoglobins detected in Denizli, Turkey, the average time since the demographic expansion was calculated as approximately 38,000 ybp (95% CI, 18,500-62,000) for the Hb D-Los Angeles population, 26,000 ybp (95% CI, 11,000-36,000) for the Hb S population and 42,000 ybp (95% CI, 25,000-58,000) for the normal population (Ozturk, Arikan, Atalay, & Atalay, 2016, 2017). In our study, we aimed to discuss the possible historical relationships of the Hb G-Coushatta mutation with the possible migration routes of the world. For this purpose, we compared the haplotype data of the Hb G-Coushatta population and the normal population in the Denizli region using the statistical software program Arlequin ver. 3.5 (Excoffier, Laval, & Schneider, 2005; Excoffier & Lischer, 2010). Associated haplotypes were determined using polymorphic loci in the β-globin gene cluster of both populations.

Sample collection
We studied 15 unrelated patients with abnormal Hb G-Coushatta and 59 unrelated normal DNA samples. As reported previously, the normal haplotype data were obtained from the DNA samples of 59 unrelated healthy subjects (Ozturk et al., 2016). Normal and Hb G-Coushatta DNA samples were obtained as anonymous samples from the DNA Bank of the Department of Biophysics, Medical Faculty, Pamukkale University (Denizli, Turkey). Written informed consent had already been obtained from the individuals and/or their parents for further anonymous DNA analysis.

Haplotype identification and statistical analysis
In the first step of our study, the PCR-RFLP (polymerase chain reaction-restriction fragment length polymorphism) method was applied to seven polymorphic restriction sites (HincII 5′ to ε, HindIII 5′ to Gγ, HindIII in the IVS-II 5′ to Aγ, HincII in ψβ, HincII 3′ to ψβ, AvaII in β, HinfI 3′ to β) in the β-globin gene cluster as previously reported (Ozturk et al., 2016). Associated haplotypes for the normal population samples and the patients with Hb G-Coushatta were determined from the RFLP results. We performed statistical analyses of both populations in the Arlequin 3.5 software program, treating the gametic phase as unknown: haplotype analysis (Excoffier et al., 2005; Falchi et al., 2005), Hardy-Weinberg equilibrium tests (Excoffier & Lischer, 2010; Excoffier et al., 2005), measurement of genetic diversity and population differentiation parameters, analysis of molecular variance (AMOVA) using F-statistics (FST, FIT, FIS) (Mantel, 1967; Schneider, Roessli, & Excoffier, 2000; Slatkin, 1995; Wright, 1965), historical-demographic analyses (Tajima's D and Fu's Fs tests) (Fu, 1997; Tajima, 1989a), and mismatch distribution analysis, including tau (τ), initial theta, the sum of squared deviations (SSD), Harpending's raggedness index (Hri) and p-values of SSD (Excoffier, 2004; Harpending, 1994; Ray, Currat, & Excoffier, 2003; Rogers, 1995; Rogers & Harpending, 1992; Slatkin & Hudson, 1991), as previously reported (Ozturk et al., 2016). The Rogers and Harpending (1992) model was used to calculate the time elapsed since the population expansion by estimating τ, θ0 and θ1 from the mismatch distribution outputs of Arlequin. Historic demographic expansions were also investigated by examining the frequency distribution of pairwise differences between sequences (mismatch distribution), which is based on three parameters: θ0 and θ1 (θ before and after the population growth) and τ (time since expansion expressed in units of mutational time).
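The mismatch distribution and raggedness index named above can be illustrated with a minimal Python sketch. It assumes each chromosome is encoded as a string of seven '+'/'-' characters (presence or absence of each restriction site). The function names and the four-chromosome toy sample are hypothetical and are not the study data, and the sketch does not reproduce Arlequin's handling of unknown gametic phase or its bootstrap tests.

```python
from collections import Counter
from itertools import combinations


def pairwise_differences(a: str, b: str) -> int:
    """Count the restriction sites at which two haplotypes differ."""
    if len(a) != len(b):
        raise ValueError("haplotypes must cover the same number of sites")
    return sum(x != y for x, y in zip(a, b))


def mismatch_distribution(haplotypes: list[str]) -> list[float]:
    """Relative frequency of chromosome pairs differing at 0, 1, ..., L sites."""
    n_sites = len(haplotypes[0])
    counts = Counter(
        pairwise_differences(a, b) for a, b in combinations(haplotypes, 2)
    )
    n_pairs = len(haplotypes) * (len(haplotypes) - 1) // 2
    return [counts.get(i, 0) / n_pairs for i in range(n_sites + 1)]


def mean_pairwise_differences(dist: list[float]) -> float:
    """Average number of pairwise differences (k) implied by the distribution."""
    return sum(i * freq for i, freq in enumerate(dist))


def raggedness(dist: list[float]) -> float:
    """Harpending's raggedness: squared steps between successive classes."""
    padded = dist + [0.0]  # the class after the last one has frequency 0
    return sum((padded[i] - padded[i - 1]) ** 2 for i in range(1, len(padded)))


# Hypothetical toy sample of four chromosomes (illustration only).
toy_sample = ["+----++", "++-++++", "-+-++++", "+----+-"]
toy_dist = mismatch_distribution(toy_sample)
toy_k = mean_pairwise_differences(toy_dist)
toy_rg = raggedness(toy_dist)
```

A smooth, unimodal distribution gives a small raggedness value, whereas a multimodal one gives a larger value; the significance tests reported by Arlequin additionally compare the observed values with simulated ones.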

RESULTS
Tables 1 and 2 summarize the haplotypes and their frequencies in the Hb G-Coushatta and normal populations, respectively. In the normal population, the haplotype with the highest frequency is Mediterranean haplotype I [+ − − − − + +] (14%). In the Hb G-Coushatta population, however, Mediterranean haplotype I [+ − − − − + +] was not observed.
Table 4 summarizes the molecular diversity parameters for each population, Figure 1 shows the statistical demographic parameters of the two populations together with the mismatch distribution graphs, and Table 5 gives the results of testing the shape of these graphs: Harpending's raggedness index and the p-values of the sum of squared deviations (SSD).
In terms of time estimation, the values of τ and of the historical population parameters θ (θ0 and θ1) also show a similar historical growth period for the two populations (Table 4). Based on the estimated values of τ, the mean population ages of the normal and Hb G-Coushatta populations in Denizli were dated to approximately 42,000 ybp (95% CI, 11,000-55,000) and 38,000 ybp (95% CI, 10,000-62,000), respectively (Table 4). The results in Table 6 show that the normal population was in Hardy-Weinberg equilibrium (p > .05) at each of the seven polymorphic loci, and that the Hb G-Coushatta population was in Hardy-Weinberg equilibrium at all loci except the fourth.
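The idea behind a per-locus Hardy-Weinberg test can be sketched in Python as follows. This is a simplified chi-square goodness-of-fit version for one biallelic restriction site, not the exact test of Guo and Thompson (1992) that Arlequin applies; with samples as small as 15 individuals the chi-square approximation is unreliable and is shown only to make the logic concrete. The function name and the genotype counts are hypothetical.

```python
import math


def hwe_chi_square(n_pp: int, n_pm: int, n_mm: int) -> tuple[float, float]:
    """Chi-square test (1 df) of Hardy-Weinberg proportions at one site.

    n_pp, n_pm, n_mm are the observed counts of +/+, +/- and -/- genotypes.
    Returns (chi-square statistic, p-value).
    """
    n = n_pp + n_pm + n_mm
    if n == 0:
        raise ValueError("at least one genotype must be observed")
    p = (2 * n_pp + n_pm) / (2 * n)  # frequency of the '+' allele
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_pp, n_pm, n_mm)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts (illustration only, not the study data).
chi2_fit, p_fit = hwe_chi_square(25, 50, 25)       # matches HWE proportions
chi2_dev, p_dev = hwe_chi_square(40, 20, 40)       # heterozygote deficit
```

A p-value above .05 means the locus shows no detectable departure from equilibrium, which is the criterion used for Table 6.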

DISCUSSION
TABLE 1 β-globin gene cluster haplotypes for the seven loci in association with the Hb G-Coushatta [β22(B4)Glu→Ala] (HGVS name: HBB:c.68A>C) population in Denizli, Turkey

TABLE 2 β-globin gene cluster haplotypes for the seven loci in association with the normal population in Denizli, Turkey

No.  Haplotype        Frequency  SD
1    + − − − − + +    0.144068   0.032465
2    + + − + + + +    0.127119   0.030796
3    − + − + + + +    0.084746   0.025748
4    + − − − − + −    0.076271   0.024539
5    − − − − − − −    0.067797   0.023242
6    − − − − − − +    0.059322   0.021839
7    − − − − − + +    0.050847   0.020310
8    + + + − + + +    0.050847   0.020310
9    − + + − + + +    0.033898   0.016730
10   + − − − − − −    0.033898   0.016730
11   − − − − − + −

Haplotype studies related to Hb G-Coushatta in American, Chinese, Thai and Turkish individuals suggest a multiple origin for this variant (Itchayanan et al., 1999; Li et al., 1999; Ozturk et al., 2007). Previously published β-globin gene cluster haplotype data in association with the Hb G-Coushatta cases (American Indian [− + − − + − ?], Chinese [− + + − + + ?], Kocaeli-Turkey [− + + − − − +] and Denizli-Turkey [− + − + + + +]) support the prediction that this variant has a multi-centric origin (Li et al., 1999; Ozturk et al., 2007). Interestingly, while the haplotype [− + − + + + +] (Ozturk et al., 2007) obtained by pedigree analysis in the Denizli region was found in 4th place with 6% frequency (Table 1), the haplotype [− + + − − − +] obtained by pedigree analysis of samples from the Kocaeli region is not included in the list of haplotypes obtained from the Denizli region Hb G-Coushatta population in this study (Table 1). Similarly, the American Indian [− + − − + − ?] and Chinese [− + + − + + ?] types are not on the list of haplotypes associated with the Hb G-Coushatta mutation in the Denizli region (Table 1). The fact that the American, Chinese and Kocaeli-Turkish type haplotypes are not among the haplotypes associated with the Hb G-Coushatta mutation in the Denizli region supports the literature view on the independent and multicentric origin of this mutation. The haplotype diversity and frequency percentages in Tables 1 and 2 show that the Hb G-Coushatta mutation most probably developed on the normal population gene pool in Denizli, Turkey. The haplotypes representing the different genetic origins of the Hb G-Coushatta populations of Kocaeli and Denizli, which are geographically close regions, support the view that the Hb G-Coushatta mutation is independent of the historical migration routes. Table 3 summarizes the results of the AMOVA test statistic, which measures the degree of genetic differentiation between the normal and Hb G-Coushatta populations in the Denizli region. These results indicate negligible genetic differentiation (5.96%) between the two populations (FST: 0.05964, p = 0.01369 ± 0.00367) (Table 3). This low and statistically significant (p < .05) genetic difference showed that the two populations have not diverged through the effect of migration on the gene pool.
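To make the meaning of an FST value such as 0.05964 (that is, 5.96% of the variation lying between populations) more concrete, the following Python sketch computes a fixation index from haplotype frequencies. It is deliberately simpler than the AMOVA run in Arlequin: it uses a frequency-based estimator in the style of Nei's GST with equal population weights, not AMOVA variance components or permutation testing. The function names and the example frequencies are hypothetical and are not taken from Tables 1 and 2.

```python
def expected_heterozygosity(freqs: dict[str, float]) -> float:
    """Probability that two randomly drawn haplotypes differ: 1 - sum(p_i^2)."""
    return 1.0 - sum(f * f for f in freqs.values())


def fst_from_frequencies(pop_a: dict[str, float], pop_b: dict[str, float]) -> float:
    """Frequency-based fixation index (H_T - H_S) / H_T for two populations.

    Assumption: both populations are weighted equally, unlike AMOVA,
    which accounts for sample sizes and molecular distances.
    """
    haplotypes = set(pop_a) | set(pop_b)
    pooled = {
        h: (pop_a.get(h, 0.0) + pop_b.get(h, 0.0)) / 2.0 for h in haplotypes
    }
    h_s = (expected_heterozygosity(pop_a) + expected_heterozygosity(pop_b)) / 2.0
    h_t = expected_heterozygosity(pooled)
    if h_t == 0.0:
        return 0.0  # no variation at all, so no differentiation to measure
    return (h_t - h_s) / h_t


# Hypothetical haplotype frequencies (illustration only).
example_a = {"hapX": 0.6, "hapY": 0.4}
example_b = {"hapX": 0.4, "hapY": 0.6}
example_fst = fst_from_frequencies(example_a, example_b)
```

Values near 0 indicate that almost all variation is found within populations, while values near 1 indicate populations fixed for different haplotypes.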
The difference in haplotype diversity between the two populations may be the possible effect of mutation formation on polymorphic loci in the β-globin gene cluster. Table 4 showed high and similar haplotype diversity (h), low nucleotide diversity (π) and a similar average number of pairwise nucleotide differences (k) in the two populations (Li, 1997; Nei, 1987). Additionally, Fu's Fs statistic showed a significant negative value for both populations, indicating similar population expansion throughout history. Tajima's D values were not significant (p > .05) for either population, suggesting that these populations are at neutral equilibrium (Table 4). These findings indicate that the molecular diversity of both populations developed and expanded in a genetically similar way over the historical period (Fu, 1997; Tajima, 1989a, 1989b, 1993). The mismatch distribution parameters in Table 4 were used in mismatch distribution analysis to estimate the demographic development of the two populations (Harpending, 1994). According to the graphs obtained from the calculated distribution parameters, the normal population appeared to be unimodal (unimodal distribution; the exponential growth is smooth), while the Hb G-Coushatta population departs from the unimodal distribution (Figure 1; gray bar). The reason for this difference in distribution is that the Hb G-Coushatta population departs from Hardy-Weinberg equilibrium (HWE) at the fourth locus, as presented in Table 6 (Guo & Thompson, 1992).

TABLE 3 AMOVA F-statistics calculated for seven loci: differentiation between the Normal and Hb G-Coushatta populations

Hb G-Coushatta 0.01343 .160 0.05294 .290
SSD, sum of squared deviations; rg, Harpending's raggedness. p(SSD) is the probability of observing by chance a less good fit between the observed and the mismatch distribution for a demographic history of the population defined by the estimated parameters τ, θ0 and θ1.
HWE (p > .05) means that there will be no change in allelic or genotypic frequencies from one generation to the next. However, the difference between the observed and expected pairwise differences may have arisen through the possible effect of the Hb G-Coushatta mutation on the fourth locus. Mismatch distribution results were supported by the level of Harpending's raggedness index and the p-values of SSD in Table 5 (Harpending, 1994; Ozturk et al., 2016). Our results suggest that the Hb G-Coushatta population in Denizli province may have originated in the Mediterranean area, separately from the Hb G-Coushatta population in the geographically close Kocaeli region and from other populations, rather than from recent Asiatic migrations. Our estimated values of τ show that the average time since the demographic expansion for the normal and Hb G-Coushatta populations was approximately 42,000 ybp (95% CI, 11,000-55,000) and 38,000 ybp (95% CI, 10,000-62,000), respectively. Historic demographic expansions were investigated by examining the frequency distribution of pairwise differences between sequences (mismatch distribution), which is based on three parameters: θ0 and θ1 (θ before and after the population growth) and τ (time since expansion expressed in units of mutational time) (Table 4) (Rogers & Harpending, 1992). In our published studies, the average time since the demographic expansion of the Hb S and Hb D-Los Angeles populations in Denizli was calculated as approximately 26,000 ybp (95% CI, 11,000-36,000) and 38,000 ybp (95% CI, 18,500-62,000), respectively (Ozturk et al., 2016, 2017).
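Because τ is expressed in units of mutational time, converting it to years requires a mutation rate and a generation time. Under the Rogers and Harpending (1992) model, τ = 2ut, where u is the mutation rate of the whole haplotype per generation and t is the number of generations since the expansion. The Python sketch below shows this conversion. The function name and all numerical inputs (τ, u and the generation length) are hypothetical values chosen for illustration; the text above does not state the values used in the study.

```python
def expansion_time_years(tau: float, mu_per_haplotype: float,
                         generation_years: float) -> float:
    """Convert tau to years using tau = 2 * u * t (Rogers & Harpending, 1992).

    tau              -- expansion time in mutational units
    mu_per_haplotype -- assumed mutation rate of the whole region per generation
    generation_years -- assumed length of one generation in years
    """
    if mu_per_haplotype <= 0 or generation_years <= 0:
        raise ValueError("mutation rate and generation time must be positive")
    generations = tau / (2.0 * mu_per_haplotype)
    return generations * generation_years


# Hypothetical inputs (illustration only, not the study's estimates).
example_tau = 2.0
example_mu = 5e-4          # assumed mutations per haplotype per generation
example_generation = 25.0  # assumed years per generation
example_years = expansion_time_years(example_tau, example_mu, example_generation)

# The same conversion applied to hypothetical lower and upper bounds of tau
# yields a confidence interval in years.
example_interval = tuple(
    expansion_time_years(bound, example_mu, example_generation)
    for bound in (1.0, 3.0)
)
```

Since the result scales inversely with u, any uncertainty in the assumed mutation rate carries over directly into the dated expansion time.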

TABLE 6 Hardy-Weinberg equilibrium (HWE) test for all loci in the Normal and Hb G-Coushatta populations
According to Klein's results, Homo sapiens neanderthalensis (HN) constitutes a group of hominids whose particular morphology developed in Europe during the last 350,000 years under the effect of selection and genetic drift, reaching its final form approximately 130,000 ybp (Klein, 2003). This subgroup of hominids populated Europe and Western Asia until the arrival, approximately 45,000 ybp, of Homo sapiens sapiens (HS), the first modern humans (Mellars, 1992; Parker, 1993). The available data on European mtDNA diversity indeed support this view. Most European populations present a signal of Paleolithic demographic expansion from a small population, which could be dated to about 40,000 ybp. The entrance of Homo sapiens into Europe took place between 50,000 and 46,000 ybp. Today most Europeans can trace their ancestry through mtDNA lines that appeared between 50,000 and 13,000 ybp (Oppenheimer, 2012).
Our results indicate that the Hb G-Coushatta population was not introduced into the Anatolian gene pool by migration from Asia or any other geographical region, which is compatible with the published dating results. Since Asiatic tribal migrations were recent events (about 2,000 ybp), we would have expected to observe genetic drift in our data; however, we did not observe genetic drift over the time course from about 40,000 ybp to the present.
In conclusion, these findings further suggest that the Hb G-Coushatta population originated in the normal population in Denizli, Turkey. Although the two populations share a common genetic origin, they have different haplotype variations. We think that the reason for these variations is the departure from Hardy-Weinberg equilibrium at the fourth locus (Table 6). The possible effect of the Hb G-Coushatta mutation on polymorphic loci in the β-globin gene cluster may cause this haplotypic variation between the normal and Hb G-Coushatta populations.

CONFLICT OF INTERESTS
Authors Ozturk, Arikan, Atalay and O. Atalay declare that they have no conflicts of interest.

AUTHOR CONTRIBUTIONS
SA provided support in the data generation of the laboratory results, sample collection and preparation. AA and EOA supervised the study and supported the data interpretation and manuscript preparation.