First insight into oral microbiome diversity in Papua New Guineans reveals a specific regional signature

The oral microbiota is a highly complex and diversified part of the human microbiome. Being located at the interface between the human body and the exterior environment, this microbiota can deepen our understanding of the environmental impacts on the global status of human health. This research topic has been well addressed in Westernized populations, but these populations only represent a fraction of human diversity. Papua New Guinea hosts very diverse environments and one of the most unique human biological diversities worldwide. In this study we performed the first known characterization of the oral microbiome in 85 Papua New Guinean individuals living in different environments, using a qualitative and quantitative approach. We found a significant geographical structure of the Papua New Guineans oral microbiome, especially in the groups most isolated from urban spaces. In comparison to other global populations, two bacterial genera related to iron absorption were significantly more abundant in Papua New Guineans and Aboriginal Australians, which suggests a shared oral microbiome signature. Further studies will be needed to confirm and explore this possible regional‐specific oral microbiome profile.

Westernized populations, but these populations only represent a fraction of human diversity. Papua New Guinea hosts very diverse environments and one of the most unique human biological diversities worldwide. In this study we performed the first known characterization of the oral microbiome in 85 Papua New Guinean individuals living in different environments, using a qualitative and quantitative approach. We found a significant geographical structure of the Papua New Guineans oral microbiome, especially in the groups most isolated from urban spaces. In comparison to other global populations, two bacterial genera related to iron absorption were significantly more abundant in Papua New Guineans and Aboriginal Australians, which suggests a shared oral microbiome signature. Further studies will be needed to confirm and explore this possible regional-specific oral microbiome profile.

| INTRODUC TI ON
In recent years the field of host-associated microbes has greatly increased due to a desire to understand how the microbiome impacts a broad array of hosts, such as animals, plants and algae (Parfrey et al., 2018). Microbes have been shown to be involved in host immunity, digestion and nutrient scavenging activities, facilitate metabolism and cellular growth, regulate behaviour, and release microbial toxins, among others (Miller et al., 2021). However, the first step in understanding host-microbiota dynamics is to identify the composition of the microbiome, and for that unique microbiomes across diverse host populations must be studied (Parfrey et al., 2018).
The human oral microbiota is one of the most complex microbial communities in the human body, colonized by around 700 different types of microorganisms (Lamont et al., 2018). Awareness of the role of oral microbes in health and disease has been increasing, not only in its obvious relation to oral diseases, such as periodontal disease (Costalonga & Herzberg, 2014) and oral cancer (Wang & Ganly, 2014), but also in how it relates to systemic (Graves et al., 2019), cardiovascular (Bryan et al., 2017) and gastrointestinal diseases (Flemer et al., 2018). The oral microbiome is formed mainly by bacteria, but it also contains lower amounts of fungi, viruses, archaea, and protozoa. The top bacterial phyla detected in the oral microbiome are Firmicutes, Fusobacteria, Proteobacteria, Actinobacteria, Bacteroidetes and Spirochaetes (Deo & Deshmukh, 2019), but abundance of these taxa differs between populations. Several studies on oral microbiome have been performed for Westernized populations (Burcham et al., 2020;Nasidze et al., 2009), which have been associated with a specific Western diet, low in fibre and fruit and high in refined sugars and fat (Cordain et al., 2005), but also in more remote populations, such as Native South Americans (Contreras et al., 2010), native Alaskans (Li et al., 2014), Philippines hunter-gatherers (Lassalle et al., 2018) and Batwa Pygmies (Nasidze et al., 2011). These studies revealed a high diversity in oral microbiome within and between the different populations. Different environments, lifestyles, eating habits, or genetic background are possible reasons for these variations (Li et al., 2014;Mousa et al., 2022).
Papua New Guinea (PNG) is one of the most culturally diverse and least urbanized countries in the world, with over 800 spoken languages and thousands of ethnic groups (Pawley et al., 2005).
It also hosts a very high human genetic diversity (Brucato et al., 2021), including the highest proportion of archaic introgressions from Denisovans (~4%) in the world (Jacobs et al., 2019;Reich et al., 2011). The majority of Papua New Guineans live in a semi-traditional way with an agriculture-based diet, including sago, sweet potato, taro, cassava, yam, rice, maize, breadfruit, various nuts, sugarcane, pig, chicken and fruits, like coconut, banana and pineapple (Schapper, 2017). The PNG populations have a history of low levels of protein consumption, which has been improving in the last years with an increase in the importation of poultry and fish (Schmidt & Fang, 2021). However, with the importation of animalsourced foods there has also been an increased importation of ultraprocessed foods, and meals containing higher saturated fat (Schmidt & Fang, 2021). PNG has poor general health due to low socioeconomic status and high burden of infectious diseases, such as pneumonia, malaria, tuberculosis and meningitis (Riley, 2009). While the gut microbiota of different regions within PNG showed higher bacterial diversity and different abundance profiles in comparison with Westernized populations (Martinez et al., 2015), a comprehensive analysis of the oral microbiome for diverse ethnic groups and regions in PNG has not been performed.
In this work we combined the results from two different methodologies ( Figure 1) to characterize the microbial community from DNA extracted from PNG saliva samples: (1) Qualitative results from a microbial detection array, which contains specific probes to the different microbial species, and (2) quantitative from whole genome sequencing (WGS) reads unmapped to the human genome, which were aligned to a microbial database. The array methodology is cost-effective, highly sensitive and accurate (Thissen et al., 2014), and easy to analyse (Gardner et al., 2010;McLoughlin, 2011), being the ideal technique to validate an off-target microbiome inference from the human WGS. In fact, the recovery of off-target reads obtained from human WGS provides researchers with excellent secondary opportunities to study data beyond the intended target regions (Cavadas et al., 2020;Gomes et al., 2019;Samuels et al., 2013). This technique permits users to detect a larger number of microorganisms (Ranjan et al., 2016) and to obtain the abundance of the identified microbial communities. However, it is more expensive than the array methodology and involves much more complex bioinformatic analysis.
The quantitative oral microbiome data allowed us to assess differences between diverse PNG populations in light of diet and/or environmental factors, and against other worldwide populations.
Our working hypothesis was that a specific oral microbiome signature might be found in these populations due to adaptation to their specific biodiversity, cultural lifestyle or genetic background, since their 50kya-establishment in the ancient Sahul region (New Guinea/ Australia).   Table S1). The saliva samples were collected with the Oragene DNA OG-500 collection kit (DNA Genotek Inc.), and the nucleic acids were extracted following the manufacturer's instructions. All samples were collected from healthy unrelated adult donors, all of whom provided written informed consent. In each sampling location, a full presentation of the project was made, followed by discussion with each donor to ensure that they fully understood the project. Participants were surveyed for language affiliation(s), current residence, date and place of birth, and a short genealogy up to three generations to establish regional ancestry.

| Microbiome inference from Axiom Microbiome Array
From the DNA extracted, we performed microbiome diversity characterization with the Axiom Microbial Detection Array (MDA) from Thermo Fisher Scientific. First, cDNA synthesis was performed using the SuperScript VILO cDNA synthesis kit from Invitrogen. Afterwards, microbiome diversity was characterized with the MDA, which can rapidly identify the microbial community at the species or strain level, giving a profile of which microorganisms are present in the analysed samples. The Microbiome array harbours a total of 1.38 million probes that detect over 12,513 microbial species including archaea, bacteria, fungi, protozoa, and viruses, and the probes were designed against both conserved regions (for the detection of higher taxonomic ranks) and unique regions (species/strain specific) of the microbial genomes with a sensitivity of 100-1000 genome copies and achieving 100% accuracy in species identification. The 96-well array plate version of the Microbiome array was used to run all the samples on the GeneTitan instrument (Thermo Fisher Scientific). A volume of 20 μl of DNA from all samples was used as input into the MDA, including a no-template control and a positive control for quality control. The standard manufacturer's protocol was followed for all steps of the workflow. Quality control (QC) was performed using the QC option in the Axiom Microbial Detection Analysis Software (MiDAS), which analyses specific probes included in the Microbiome array to target regions of the human genome that are nonpolymorphic (sites that do not vary in sequence from one individual to the next). This QC analysis generates a metric termed Dish QC (DQC) and applies a threshold of DQC above 0.95 for the samples to be considered as having passed QC. Six samples were removed from the analysis after QC: PNG237 and PNG238 (East Sepik province), PNG128, PNG514 and PNG525 (Western province), and PNG205 (East highlanders living in Port Moresby). The remaining samples (n = 86) were then further analysed with MiDAS, which analyses all targets considered positive F I G U R E 1 Schematic representation of the experimental strategy (sample preparation, sample analysis, data analysis and output) for the two microbial detection techniques. and reaches a consensus for the microbial community present in the samples. Positive targets are the probes with signal intensity above the 99th percentile of the random control probe intensities and with more than 20% of target-specific probes. These probe sequences are available from Thermo Fisher Scientific as part of the MiDAS software. To note, this technology is unable to provide a view of the varying prevalence of the microorganisms present in analysed samples.

| Microbiome inference from human whole genome sequences
For 68 individuals, also characterized by Microbiome array, we used reads extracted from WGS from (1) previously published studies (Brucato et al., 2021), and (2) from newly unpublished whole genome sequences data (Table S1). The sequencing procedure has been described previously (Brucato et al., 2021). Briefly, F I G U R E 2 (a) Locations of the five sampled groups in PNG. (b) Box-plot representing the concordance in percentage of genera inferred in both methods by applying a threshold of abundance to the WGS results of 1%, 0.5%, 0.1% and 0.01% (Table S3). (c) Correlation of alpha-diversity values (Chao1 index) obtained for each individual oral microbiome characterized by both microbiome array and WGS methodologies. A regression line for the linear model (in blue) and the confidence interval (grey shadow) was added as well as the R and p-value for the model. sequencing libraries were prepared using the TruSeq DNA PCR-Free HT kit. About 150-bp paired-end sequencing was performed on the Illumina HiSeq X5 sequencer. We aligned the fastq files against the human reference GRCh38 (GCA 000001405.15) using minimap2-2.23 (Li, 2018) and the unmapped reads were retrieved. For taxonomic assignment of the unmapped sequencing reads we resorted to kraken 2 tool (Wood et al., 2019) using the standard plus protozoa and fungi database (PlusPF; includes archaea, bacteria, viral, plasmid, human, UniVec_Core, protozoa and fungi; updated on 27 January 2021). We then used Bracken (Lu et al., 2017; Bayesian Re-estimation of Abundance after Classification with KRAKEN) to estimate abundance at the genus level, using the taxonomic assignments made by Kraken. Since these PNG populations are understudied, we are unable to consider possible unidentified microbiota species that may be specific to them. We therefore performed all analysis on genus level to reduce the bias caused by using the Microbiome array and the WGS databases.
We applied Bracken's default threshold of 10 reads as the minimum number of reads required for a classification at the specified rank. To remove low background noise on the WGS results due to minor contaminants, we applied a threshold for which all genera below a given percentage would be removed. This threshold was determined by calculating the best correlation of alpha-diversity values obtained from the Chao1 index between each sample using the qualitative and quantitative results, applying several thresholds, from 1% to 0.0001% (1%, 0.5%, 0.1%, 0.05%, 0.01%, 0.005%, 0.001%, 0.0005%, 0.0001%). Applying a threshold of 0.01% to the WGS results, the Chao1 values obtained from both methodologies presented the best correlation from all the tested thresholds.

| Microbial composition
We created phylogenetic trees and calculated the respective abundance of the genera found in the five groups from the qualitative results (Microbiome array). The abundance in this case refers to the percentage of samples in the group in which the specific taxon was detected. We used r studio software (R Studio Team, 2020) to perform pairwise chi-squared tests and correction with Holm's method to see which genera were significantly different. Venn diagrams with the percentage of bacteria and virus were created to help to visualize the similarities and differences between the five groups resorting to the webtool Bioinformatics and Evolutionary Genomics (https://bioin forma tics.psb.ugent.be/webto ols/Venn/). Discrimination between the five different groups using quantitative data was performed with the linear discriminant analysis effect size (LEfSe) algorithm (Segata et al., 2011) with the calculation of a linear discriminant analysis (LDA) score, available at the Huttenhower laboratory Galaxy server (http://hutte nhower.sph.harva rd.edu/galax y/) (Afgan et al., 2018). Statistical differences were found under a Kruskal-Wallis test which identified the features with significant variation across the five groups. Then, pairwise Wilcoxon tests were performed between each pair of groups, using the "one-against-all" option for multiclass analysis, which only requires that one of the pairwise Wilcoxon tests to be significant in order to keep the feature in the list of significantly discriminating features (Segata et al., 2011).
Only taxa with a p-value <.05 and with a LDA value >3 were considered enriched.

| Alpha-diversity
The alpha-diversity of each sample was calculated by estimating the genus and species richness, evenness, and diversity using the phyloseq package version 1.23.1 (McMurdie & Holmes, 2013). Alphadiversity measures were calculated differently for qualitative and quantitative results. For the qualitative results (from the Microbiome array data) we calculated Chao1 (Hughes et al., 2001), which measures the richness of the samples and is more suitable for presence/ absence data. We created a ggplot for each alpha-diversity metric and performed pairwise comparisons using t tests and correction with Holm's method using rstudio (R Studio Team, 2020). After alphadiversity analysis, one sample was considered an outlier, PNG54 (Port Moresby), as its Chao1 index value falls out of the mean ± 3SD for all samples ( Figure S1), so it was removed from further analyses.

| Beta diversity and unrooted neighbourjoining tree
The phylogenetic information for the genera present in our samples was retrieved (on 20 October 2021) from the Time Tree website (http://www.timet ree.org/). This information was used to calculate Jaccard (Koleff et al., 2003), unweighted and weighted UniFrac distances (Lozupone et al., 2011), beta diversity measures that analyse the similarities and differences among groups. The Jaccard and the unweighted UniFrac distances were applied to the qualitative data (Microbiome array) as they do not consider abundances in the calculations and the weighted UniFrac distances were used on the quantitative data (WGS results) The unrooted neighbour-joining tree was used to represent distances between the five groups. A permutation multivariate analysis (1000 permutations) of variance (PERMANOVA; Anderson, 2017) of the beta diversity distances (to reflect the phylogenetic diversity between genera) was also performed and plotted in a nonmetric multidimensional scale (nMDS).
The null hypothesis tested by PERMANOVA was that the centroids of each group are equivalent, under the assumption of exchangeability of the samples among groups. For these, we retrieved the values of abundance for the core microbiome that were described and available in their publications.

| Comparative analysis of oral microbiome data from other populations
Previous research has shown that microbiome results might differ quite substantially depending on the underlying method used (Durazzi et al., 2021;Hillmann et al., 2018), but a high correlation has been found between these different methods regarding the most abundant species in the samples. As such, we performed comparative analysis of only the most abundant taxon at genus level to limit inter-study variation due to the comparison between results from different methodologies (WGS and 16S).  Table S1). After QC, 81 microbial genera were detected in the cohort. We controlled for substructure due to age or gender, and found no correlation (for age, p-value = .36; Figure S2A; for gender, p-value = .4; Figure S2B), confirming previous results (Nasidze et al., 2009).

| RE SULTS
A subset of the cohort was also characterized using available WGS to obtain quantitative results (n = 65): Port Moresby (n = 13); Simbu province (n = 18); East Sepik (n = 9); Western province (n = 18); and East highlanders living in Port Moresby (n = 7) (Table S1). Reads from WGS data not mapping to the human genome were analysed with KRAKEN 2 (Wood et al., 2019), resulting in an average of 18,652,535 reads matching microbial genomes per individual (ranging from 1,483,445-44,976,955 reads). As expected, the WGS data identified a much higher number of genera (1757 genera; Table S2) than with the Microbiome array data. In fact, the BRAKEN method used in the WGS inference limits the number of reads to 10, while the Microbiome array has a sensitivity of 100-1000 genome copies. Hence, when comparing the concordance in inferred genera between the two methods, these values are high for the most abundant genera (Figure 2b; Table S3): an average of 92% match filtering to the genera with abundance above 1% (28 out of the 65 samples achieving 100% match); 84% when filtering to the genera with abundance above 0.5% (11 out of 65 samples reaching a 100% match); 58% when lowering the threshold to 0.1%; and 30% for threshold 0.01%. Even so, the overall diversity (alpha-diversity, Chao1) per sample is highly correlated between the two methods when using all genera above 0.01% of presence (R = .64; p-value = 1.3 × 10 −08 ; Figure 2c). Further analyses based on WGS were done for this threshold.

| Core oral microbiome in PNG
Our qualitative analysis (Microbiome array) of the PNG oral microbiome detected 75 bacterial genera and nine viral genera (Table S4).
Although the array includes probes for RNA viruses and we performed RNA to cDNA reaction, we were not able to identify any RNA viruses in the PNG samples. This is most probably due to the storage of samples at room temperature, leading to the instability of RNA molecules. Around 43% of the detected bacteria genera are common to all the groups (Figure 3a). Of these, there are 12 genera that were present in more than 50% of the samples in all groups, namely Veilonella, Streptococcus, Prevotella, Fusobacterium, Rothia, Haemophilus, Neisseria, Porphyromonas, Treponema, Leptotrichia, Actinomyces and Selenomonas ( Figure S3).

| Oral microbiome differences between PNG populations
We detected significant differences between PNG groups at the global level of oral microbiome, comparing their alpha-diversity (Chao1), based on qualitative data (Microbiome array; Figure 3b).
The East Sepik group, the most remote (as in difficult to reach by transportation, and with less access to health care services, electricity, education, processed food, etc.) community of our sampling, has a significantly lower alpha-diversity value than all other PNG groups (p < .05), except the Port Moresby group (Figure 3b). The four other PNG groups show relative similar alpha-diversity (p > .05, Figure 3b).
These results mirror the results of beta diversities (Jaccard and unweighted UniFrac; Figure S4), although the PERMANOVA test was not significant for the East Sepik differentiation (p > .05). The East Sepik has one of the highest percentages of unique genera (8%; Figure 3a) and only shares 54% of genera with the four other groups, while the values are higher for the other groups: 57% for F I G U R E 3 Diversity of the oral microbiome samples using qualitative results (microbiome array). (a) Venn diagram representing the percentage of genus in common between the five groups: Port Moresby (in red); Simbu province (in yellow); East Sepik province (in green); Western province (in blue); and east highlanders living in Port Moresby (in pink); (b) alpha-diversity measure of the oral microbiome with Chao1 index between the different groups; statistical analysis was performed using pairwise t tests and correction with Holm's method (*p < .05; **p < .01); and (c) phylogenetic tree of genera found in the oral microbiome of the distinct groups, with emphasis in the genera significantly different. The values are the percentage of samples in each group in which that genus was detected with the MDA, and the size is proportional to that value. Statistical analysis was performed using chi-squared tests with multiple testing correction (*p < .05; **p < .01; ***p < .001).
Port Moresby; 67% for Simbu Province; 70% for Western Province; and 62% for East highlanders living in Port Moresby. This regional particularity is partly explained by the significantly higher frequency  (Figure 3b). This lack of significant differences might be due to the incorporation of both microbiotas (initial oral microbiota from the highlands and the F I G U R E 4 Profile of microbial composition of oral microbiome based on quantitative data (WGS results) with the taxa discriminating between the five distinct groups: Port Moresby (in red); Simbu province (in yellow); East Sepik province (in green); Western province (in blue); and east highlanders living in Port Moresby (in pink). We applied the LEfSe algorithm and LDA calculation in a "oneagainst-all" multi-class analysis, requiring only one pairwise Wilcoxon tests to be significant. Only discriminating features with p < .05 and LDA score over 3 are presented.
acquired microbiota from the coast) in the group of East Highlanders living in Port Moresby.

| Oral microbiome of PNG and other populations
To have a broader perspective of the oral microbiome of PNG population samples, we merged the WGS results from our PNG populations with the available WGS samples from Philippines hunter-gatherers, Philippines traditional farmers and Western controls (Lassalle et al., 2018) and processed their data set with our workflow. We generated an unrooted neighbour-joining for the eight populations ( Figure 5a) which showed that the PNG populations are closer to one another and separated from the Philippines and the Western controls.
We also compared the abundance of the PNG core microbiome Australian Aboriginals and Torres Strait Islanders (14%) than in the remaining groups (9.58%). This enrichment of Rothia and Neisseria appear to be limited to the New Guinea-Australia region (Figure 5b), which might indicate a specific trait of these genera on the oral microbiome, potentially influenced by environmental conditions found in this geographic region.
Veilonella appears in higher abundances in the Western controls (11.86%), the Philippines traditional farmers (11.75%), in the Port Moresby group (9.05%) and the East Highlanders living in Port Moresby group (9.07%). This genus has been associated with groups having a Westernized diet (Clemente et al., 2015;Lassalle et al., 2018). These studies also reported an enrichment of Haemophilus in hunter-gatherers, which is not apparent in our results (Table S5), as the Western controls have a higher abundance of Haemophilus than the Hunter-gatherer groups. (Lassalle et al., 2018).

| DISCUSS ION
The core oral microbiome of PNG agrees with the normal microbiota in the oral cavity obtained in other studies (Aas et al., 2005;Deo & Deshmukh, 2019;Dewhirst et al., 2010). Bacterial genera such as Prevotella, Neisseria and Streptococcus represent most of the human oral microbiome diversity. However, as seen within our PNG populations, the abundances of each taxon might vary, potentially due to differential lifestyles, diets or host genetics (Li et al., 2014;Mousa et al., 2022). Both quantitative (WGS) and qualitative (Microbiome array) results show significant differences between the East Sepik group and the other PNG groups. This might be explained by the relative geographic isolation and a more traditional form of subsistence, with limited access to Westernized food, contrary to the other four groups (Pawley et al., 2005;Roscoe, 2005). Similar results were described in the oral microbiomes of isolated groups across the globe, such as the Native American Yanomami population (Clemente et al., 2015), the African Batwa (Li et al., 2014), and in the Philippines hunter-gatherer group (Lassalle et al., 2018), An interesting pattern of genera enrichment was found in the populations from New Guinea and Australia (PNG groups and Australian Aboriginals and Torres Strait Islanders). We detected higher abundances for the Neisseria and Rothia genera than other populations of the world. New Guinean and Aboriginal Australian populations share a common history, since the early settlement of these territories 50,000 years ago (Allen & O'Connell, 2020;Brucato et al., 2021;Pedro et al., 2020), then connected into the ancient continent Sahul.
A secondary gene flow linked to the Austronesian dispersal 3 kya led to a gradient of Asian genetic ancestry in coastal groups of the region (Brucato et al., 2021;Mallick et al., 2016). However, our limited data set (no group from New Guinea with a predominant Austronesian genetic ancestry) and current knowledge on microbiome research, do not support a strong link between a population genetic ancestry (e.g., Austronesian) and a specific oral microbiome signature, despite some encouraging new studies (Weyrich, 2021). It is, nevertheless, remarkable that our study revealed a specific pattern in the oral microbiome shared by New Guineans and Aboriginal Australian groups, that are genetically separated for at least 30,000 years (Brucato et al., 2021;Malaspinas et al., 2016). This specific pattern may have been accentuated by limited gene flow between southeast coastal New Guinean lowlanders and northeast Indigenous Australians until at least 10 kya, and during the Austronesian period of the last 3 kya (Bellwood, 2007;Brucato et al., 2021;Karmin et al., 2022;Lipson et al., 2014;Pedro et al., 2020). This may suggest a Sahul-specific oral microbiome diversity, caused by a shared environment and/or a shared ancient ancestry. Among the Sahul-specific oral microbiome, the Rothia genus has been shown to have positive effects on the host's iron pool, by the production of the iron-scavenging enterobactin (Uranga et al., 2020). Enterobactin is a siderophore protein released by bacteria into the surrounding environment to bind to available iron, which is then used in several biochemical pathways of the bacteria and the host (Raymond et al., 2003). Both Rothia and Neisseria genera appear to be depleted in the saliva of iron-deficient individuals, compared to healthy controls, which establishes their importance in the iron homeostasis (Xi et al., 2019). Iron deficiency is an important health matter in PNG and Indigenous Australian communities, associated with increased morbidity and mortality (Leonard et al., 2019;Manning et al., 2012). These genera might be more abundant in the human oral microbiome of populations from the ancient Sahul region to, in part, compensate a potential iron deficiency. Other worldwide studies of oral microbiome in healthy individuals did not report this Rothia and Neisseria pattern, except for Nasidze et al. (2009) that show a higher abundance of the Rothia genus in the German population, but no interpretation was provided. Additionally, in the gut microbiome of Papua New Guineans individuals, the Lactobacillus genus, which is able to sense iron deprivation and communicate this iron need to the host (Das et al., 2020), was found highly enriched when compared to US individuals (Martinez et al., 2015). Iron plays an important role in infection and immunity, for both pathogens and host cells (Nairz & Weiss, 2020). While it is well-known that pathogens and diet are motors for human adaptation, understudied populations still have the potential to shed light on new adaptations (Pedro et al., 2021).
In this case, the environmental pressure (e.g., high pathogenic pressure, specific diet and plant use) observed in New Guinea and north Australia (Horwood et al., 2019) might have led to the development of a Sahul-specific oral microbiome.
This unexpected pattern of the PNG oral microbiome clearly needs further exploration. As this data was collected opportunistically, potential environmental and cultural cofactors were not controlled for. In particular, betel nut chewing (Hernandez et al., 2017;Zhong et al., 2021) has shown to be highly associated with variation in the oral microbiome in other populations and is a common habit to all the PNG locations. The impact of this specific cultural behaviour would presumably result in an alteration of the global oral microbiome in all PNG populations, which would not greatly affect our results. Nevertheless, the effect of betel nut chewing in the oral microbiome is an important question and should be further explored in the PNG populations. The time of sample collection might also influence the results obtained for the oral microbiome, as seasonality in sample collection should be considered in further studies, and might increase the power in discriminating between the oral microbiomes of the PNG populations (Clemente et al., 2015). Oral health is also one of the factors that influence oral microbiota (Chukkapalli et al., 2015;Costalonga & Herzberg, 2014) and is common to all PNG populations. We have no information available concerning oral hygiene habits of the participants, but the influence of dental hygiene should not be ignored. All these external factors should be integrated in future analyses to evaluate their impact on our results.
However, the convergence of our results in PNG with those from another study on Australian Aboriginals and Torres Strait Islanders (Handsley-Davis et al., 2021) suggests that the influence of these potential cofactors is unlikely to be major.
The detection of a possible Sahul-specific oral microbiome diversity may be potentially related to adaptation to their specific biodiversity, cultural lifestyle or genetic background. The results obtained in this study highlight the value of studying the microbiomes of more isolated groups to fully encompass the ecology and biogeography of the host-microbiota interactions. As the microbiota can affect the host's phenotypic traits, host-microbe relationships can evolve in selective environments, conferring fitness advantages to the host.
The influence of the microbiome in host ecology and evolution is more easily studied in nonhuman mammals, which are subjected to variation in nutritional intake, due to the seasonal availability of foods, changes in their natural habitat and higher disease burden (Amato, 2016;Clayton et al., 2018). In fact, Amato et al. (2015) reported this relationship in black howler monkeys (Alouatta pigra). The researchers showed that the gut microbiota adapts its composition based on the higher consumption of available fruit versus a higher consumption of leaves. In most modern human populations, which have access to a broad availability of food and medicine year-round, there is limited selective pressure that could affect the microbiome of the host (Amato, 2016;Colquhoun & Lyon, 2001). However, human populations residing in rural or remote areas (as for PNG populations, especially the East Sepik province) are still subjected to the pressure of seasonal food availability. A potential study including more samples and more relevant data from these populations will be developed to further study, validate and understand how ecology, ancestry and human lifestyle impacts microbiome structure. Only by studying the different microbiotas in the ecosystem of diverse hosts, can we truly highlight the unique characteristics that defines the symbiotic relationship between microbe and host (Parfrey et al., 2018).

ACK N OWLED G EM ENTS
We especially thank all of our study participants. We also thank

CO N FLI C T O F I NTE R E S T
The authors have no conflicts of interest to declare.

DATA AVA I L A B I L I T Y S TAT E M E N T
The MiDAS output files obtained from the Array methodology and the KRAKEN output files obtained from the WGS unmapped reads have been made available as Supporting Information files. The reads mapping to the microbial community (after removal of human reads) were submitted to ENA (www.ebi.ac.uk/ena), for the study number ERP139829 under the BioSample accession numbers ERS12516930 -ERS12516994.

O PE N R E S E A RCH BA D G E S
This article has earned an Open Data Badge for making publicly available the digitally-shareable data necessary to reproduce the reported results. The MiDAS output files obtained from the Array methodology and the KRAKEN output files obtained from the WGS unmapped reads are available as supplementary files. The reads mapping to the microbial community (after removal of human reads) were submitted to ENA (www.ebi.ac.uk/ena), for the study number ERP139829 under the BioSample Accession numbers ERS12516930 -ERS12516994.

B EN EFIT-S H A R I N G
Benefits generated: Benefits from this research accrue from the sharing of our data and results on public databases as described above.