Social Network Analysis of the Genetic Structure of Pacific Islanders

Authors


Corresponding author: John Edward Terrell, Department of Anthropology, Field Museum of Natural History, Chicago, Illinois, United States of America. Fax: 312-665-7193; E-mail: terrell@fieldmuseum.org

Summary

Social network analysis (SNA) is a body of theory and a set of relatively new computer-aided techniques used in the analysis and study of relational data. Recent studies of autosomal markers from over 40 human populations in the south-western Pacific have further documented the remarkable degree of genetic diversity in this part of the world. I report additional analysis using SNA methods contributing new controlled observations on the structuring of genetic diversity among these islanders. These SNA mappings are then compared with model-based network expectations derived from the geographic distances among the same populations. Previous studies found that genetic divergence among island Melanesian populations is organised by island, island size/topography, and position (coastal vs. inland), and that similarities observed correlate only weakly with an isolation-by-distance model. Using SNA methods, however, improves the resolution of among-population comparisons, and suggests that isolation by distance constrained by social networks together with position (coastal/inland) accounts for much of the population structuring observed. The multilocus data now available are also in accord with current thinking on the impact of major biogeographical transformations on prehistoric colonisation and post-settlement human interaction in Oceania.

Introduction

Many features and processes of natural and artificial systems can be well represented by networks of interacting elements (Proulx et al., 2005). Social network analysis (SNA) is a body of theory and a set of relatively new techniques used mainly in the analysis and study of relational data about contacts, ties, or connections linking individuals or groups with one another, which are not characteristics that the individuals or groups would have or display on their own (Scott, 2000). Examples of such relational data would be information on who is married to whom in a community, how much oil is being shipped between Iraq and the United States, and how scientists collaborate with one another when writing scientific papers. Here I extend the use of SNA techniques to explore historical relationships of contemporary human populations in the south-western Pacific as evidenced by their autosomal genetic relationships.

Recently obtained high-quality data on autosomal variation among local human populations in island Melanesia have been the focus of two previous analyses (Friedlaender et al., 2008; Hunley et al., 2008). I report additional analysis using SNA methods contributing new controlled observations on the structuring of genetic diversity among these islanders. These findings are in accord with current thinking about the impact of major biogeographical transformations on prehistoric colonisation and post-settlement human interaction in Oceania. They also illustrate some of the advantages of including SNA methods in the geneticist's toolkit.

Previous Analyses

According to Friedlaender et al. (2008), inadequate sampling particularly of populations resident in Melanesia has made it difficult to determine the origins and phylogenetic relationships of Pacific Islanders. They report, however, that their genome scan of autosomal markers (687 microsatellite and 203 insertion/deletion polymorphisms) on 952 individuals from 41 Pacific populations (for specific locations, see: Friedlaender et al., 2008: Fig. 8) primarily in the Bismarck Archipelago and the northern Solomon Islands (but also including select sample sets from New Guinea, Aboriginal Taiwan, Micronesia, and Polynesia) provides the basis for understanding the remarkable nature of Melanesian genetic variation. Specifically, they find that variation within individual Pacific populations is very low—especially in Melanesian populations, where high levels of homogeneity within populations exaggerate the estimated levels of differentiation among populations in analyses of molecular variance (AMOVA) as well as computed pairwise FST genetic distances. Nonetheless, they find that genetic divergence among island Melanesian populations is evidently organised by island, island size/topography, and position (coastal vs. inland). Additionally, while Polynesians are narrowly represented in their dataset, and the Polynesians included (especially the Maori New Zealanders) are genetically quite distinctive (Friedlaender et al., 2008: Fig. S3B), these islanders appear to cluster with contemporary Micronesians, Taiwan Aborigines and East Asians rather than with Melanesians.

Figure 8.

Pacific Islanders using mean STRUCTURE assignment probabilities when K = 10, colour-coded by linguistic affiliation. Spring-embedding network mapping of only the Pacific Island populations in the genome scan (Taiwan included) derived from mean STRUCTURE assignment probabilities when K = 10 reported by Friedlaender et al. (2008), when the threshold distance value is set at >0.0, colour-coded by linguistic affiliation (green= Austronesian; orange= Papuan; blue= Austronesian languages whose speakers have a significant (>0.04) ‘Oceanic’ Austronesian signature).

Hunley et al. (2008) make use of a subset of 751 autosomal microsatellite loci from the same genetic dataset typed for 776 individuals to evaluate two models of genetic and linguistic coevolution in island Melanesia. Both models anticipate that statistically significant correspondence between genetic and linguistic variation may be observable in high-quality genetic and linguistic datasets for this region. The first model predicts that such correspondence began to develop early in the range expansion of Homo sapiens sapiens out from Asia into the Pacific through a settlement or colonisation process of population splits and subsequent local isolation. The second model, one they consider analogous to genetic models of isolation by distance, predicts that correspondence has formed instead over time through ongoing genetic and linguistic exchanges among neighbouring populations. Given these differing model expectations, Hunley and his colleagues ask two questions. Do genetic and linguistic variation covary in this region? If so, which model best accounts for the coevolution observed?

They evaluate the robustness of their two alternative modelling premises by comparing observed and simulated patterns of genetic variation, genetic and linguistic trees, and matrices of genetic, linguistic, and geographic distances. They find that linguistic and genetic exchanges among the populations represented in the dataset used have erased evidence of splitting and isolation that may have happened early in the settlement history of the region. The correlation patterns found are also inconsistent with the predictions of their isolation-by-distance model for this area as a whole, although there is evidence that coevolution has occurred in the rugged interior of the large island of New Britain (for which they report some of the strongest correlations among genetic, linguistic, and geographic distances in any region worldwide).

Social Network Analysis (SNA)

One of the most useful ideas in the social sciences is the observation that individuals are embedded in thick webs of interaction and social relations (Borgatti et al., 2009). Modern social network theory is a conceptually rigorous and analytically exacting way of studying, exploring, and reporting on how individuals create effective social groupings—some as informal as friendships, breakfast clubs, and hip-hop groups, and others as influential and globally consequential as the World Court, the United Nations, and multi-national corporations—not just ideally, but in actual practice.

Grist for the mill of SNA can be evidence as direct and obvious as field observations on who is seen talking with whom and who attends the same academic conferences, or as indirect and seemingly trivial as linguistic information on regional accents in spoken American English or Russian. Similarly, although its rapidly advancing insights may be less often viewed in this light, molecular genetics today can be a rich source of information on network relationships, past and present, among populations, both human and non-human (e.g., Proulx et al., 2005; Brohée et al., 2008; Wolf & Trillmich, 2008; Kasper & Voelkl, 2009; McDonald, 2009).

In reporting their analysis, Friedlaender et al. (2008) used a number of well-established ways of conveying information, including standard cartographic maps, data tables, matrices, pie charts, Cartesian graphs, bar graphs, and neighbour-joining trees. As Rosenberg (2004) observes, however, most graphical strategies cannot effectively display population subgrouping (K) membership estimates when K > 3. In an effort to overcome this limitation, Friedlaender et al. (2008) used Rosenberg's (2004) computer program DISTRUCT, which creates bar graphs by employing different colours to show the values of subgroup membership coefficients for each individual in a genome scan as a fixed-length line segment partitioned into ≤ K coloured components. However, as K increases and the number of individuals or groups represented grows, reading the assembled bar graphs becomes increasingly impressionistic. Making formal comparisons among the entities diagrammed (individuals, or population means) can be even more challenging and impressionistic.

Social network analysis is a collection of analytical tools derived from mathematical graph theory. As Kasper & Voelkl (2009) note, the call to analyse the dynamics of large networks such as the Internet, the World Wide Web, and global trade has led to the development of new analytical concepts and algorithms. Fortunately, the development of faster and more powerful computers has also made it possible today to use algorithms that previously would have required months or years of processing time. Tools drawn from SNA offer alternatives to DISTRUCT visualisation that are more effective at representing multidimensional comparative relationships.

Research Issues

Inadequate population sampling is only one reason biological relationships among Pacific Islanders remain poorly defined. Insufficient knowledge about the evolving conditions under which these islanders lived in the past has also limited the value and reliability of research observations and conclusions (Terrell, 1986a). Social network analysis is used here to map estimated genetic similarities among these populations derived in two ways (see Data), first using the mean STRUCTURE population assignments at K = 10 (the level of K subdivision at which individuals in the genome scan identified as “European” are finally set apart as a separate cluster), and second, using the matrix of pairwise FST co-ancestry distance values published by Friedlaender et al. (2008: Tables S6–S7). These network mappings are then compared with model-based social network expectations derived from the geographic distances among these same populations on the modelling premise that the likelihood of interaction and exchanges between any two populations is partially a function of their pairwise geographic distance apart (Manica et al., 2005; Templeton, 2006). I use SNA to consider three specific issues in the human biogeography of the Pacific Islands (Terrell, 2006).

1. Was New Guinea a vicariant barrier to human movement in the south-western Pacific during the Pleistocene?

Today New Guinea, the second largest island in the world with an area of 808,000 km², is ∼2,400 km long and 650 km wide at its broadest point (Fig. 1). The northern coastline of this island lies on the leading edge of the Australian tectonic plate, which is in geologically rapid oblique collision with a number of micro-plates in the Bismarck Sea. Convergence at the North Bismarck and Australian plates is ∼70 mm/yr, and lateral shear between them is ∼100 mm/yr; major earthquakes occur often in the zone of deformation between these plates (Tregoning et al., 1998).

Figure 1.

Map of the study area.

World sea levels were 10 m or more below present sea levels (BPL) for 11,000 of the past 17,000 years (Voris, 2000). With sea levels as little as 10 m BPL, southern New Guinea was attached to Australia by a land bridge across the Torres Strait (Voris, 2000; Lambeck & Chappell, 2001). Because of the steep shoreline gradient and the rapid post-glacial rise in sea levels (over 1 cm/yr until around 6,000 years ago) (Chappell, 1974; Fairbanks, 1989; Chappell & Polach, 1991; Lambeck & Chappell, 2001), it is probable that the northern coastline then was rocky and steep until about 7,000 years ago when the global rise in sea levels slowed and coastal sedimentary environments had a chance to develop. During the Pleistocene and early Holocene, as a consequence, there were probably few stable, productive lowland areas along this island's lengthy northern coast suitable for human settlement and land use (Pope & Terrell, 2008).

By conservative estimates, New Guinea was initially colonised by people ∼40,000–45,000 years ago (Specht, 2005). It is probable, however, that due to global environmental and eustatic conditions during the Pleistocene, as well as the specific geomorphologic characteristics of this island's northern coastline just described, New Guinea was more of a vicariant barrier than a land bridge for people travelling between Asia and the Pacific Islands—an inference that has been called the “sleeping giant” hypothesis (Terrell, 2006). Due to the limiting effect of geographic distance on rates of gene flow in most species, genetic similarity between populations decreases with increasing distance. If the “sleeping giant” hypothesis is correct, it is likely that genome scans of contemporary populations on nearby islands both east and west of New Guinea will find that they are often genetically quite distinct from New Guineans (at least from inland populations on this island) to a degree exceeding what might be expected on the basis of isolation by geographic distance alone.

2. Did the stabilisation of world sea levels near their present stand ∼6,000–7,000 years ago, which led to the growth of coastal resources favourable to human settlement, change the scope and character of interaction and exchanges among populations in the south-western Pacific?

Stabilisation of world coastlines following the Pleistocene together with the development of modern climatic conditions led to the growth of widespread coastal habitats favourable to human subsistence and coastal settlement on a global scale that had not been matched since the previous interglacial climax ∼120,000 years ago (Bailey & Parkington, 1988). The lower reaches of the short river systems currently found along the northern coastline of New Guinea carry significant amounts of sediment to the sea from the nearby mountains. These rivers have wide braided beds due to the sudden sharp decrease in gradient when they emerge from the mountains (McSaveney et al., 2000). Once sea levels had risen to within a few metres of their present stand ∼6,000–7,000 years ago (Bailey & Parkington, 1988; McSaveney et al., 2000; Voris, 2000; Dickinson, 2001; Specht, 2005), relatively stable bay, lagoon, and estuarine ecosystems began to form along this coastline as water-borne sediments began to accrete where favourable configurations of shoreline, local subsidence, and offshore islands trapped sediment in sandbars, lagoons, and small river deltas. It may be inferred that by the mid Holocene, therefore, New Guinea was no longer a vicariant barrier restricting trade and travel between island Southeast Asia and the Pacific Islands.

Additionally, although the lowland tectonic basin of the Sepik and Ramu rivers of northern New Guinea is currently filled with swamps and broad flood plains, formerly this basin was a large inland sea that during the Holocene attained its fullest extent ∼7,500–6,300 years ago (Swadling, 1997; Chappell, 2005; Swadling & Hide, 2005; Swadling et al., 2008). Judging by current rates of sediment discharge from the Sepik and Ramu catchments, it may have taken ∼3,000 years to infill this former inland sea. Considering its size and geographic location, this sea and the large freshwater lagoons that temporarily replaced it must have played a prominent role in shaping the scale, range, and character of prehistoric human interaction and exchanges within and beyond northern New Guinea for thousands of years (Swadling & Hide, 2005). Thus it seems probable that by the mid Holocene, New Guinea's isolation may have given way to new commerce and intercourse as people in the growing coastal villages began to travel and trade with one another and with people to the east and the west in Indonesia and the Bismarck Archipelago, an inference called the “ancient lagoons” hypothesis (Terrell, 2002, 2004, 2006). If so, then it is likely that genome scans of contemporary populations on nearby islands located east and west of New Guinea will find that while the populations resident in these two regions are genetically distinct from one another and from most New Guineans (in keeping with the “sleeping giant” hypothesis), they nonetheless share characteristics attributable to gene flow between both areas during the last 6,000–7,000 years.

3. If there has now been more than 6,000 years of interaction and exchanges among local and more distant coastal communities in the south-western Pacific, is genetic variation among human populations today in this part of the world still informative about the Pleistocene era and more recent human history?

To learn whether the remarkable degree of genetic diversity among some contemporary populations in the south-western Pacific preserves evidence of the early human history of this region, a suitable null, or baseline, hypothesis to explore would be that isolation by geographic distance among these populations constrained (or channelled) by social networks sufficiently accounts for the differences seen—in this instance, for differences in the genetic distances observed.

Expectations

Friedlaender et al. (2008) ask whether there is clear organisation of genetic variation among groups in Melanesia by language, by island, or by distance from major dispersal centres. Here I ask how strongly genetic similarities derived from autosomal markers among people in the south-western Pacific have been determined by evolving social networks among these populations. SNA methods are used here to examine and expand on several expectations derived from the findings previously reported by Friedlaender, Hunley, and their colleagues.

1. Geography

According to Friedlaender et al. (2008), genetic differentiation among Melanesian populations varies between islands, as well as by island size and topographic ruggedness. Hunley et al. (2008) similarly report that allelic identities are relatively high between populations on the same island, and relatively low and uniform between populations on different islands. New Britain, the largest and, in their estimation, the most rugged island represented in their genome scan, is assigned by STRUCTURE analysis when K = 10 with five different clusters; Bougainville Island in the northern Solomon Islands is given two common cluster assignments; New Ireland is given only one (Friedlaender et al., 2008: Fig. 8). It is not reported which measures of island size and topographic complexity or ruggedness, if any, were used to arrive at the observation that these two interrelated dimensions of environmental heterogeneity partially structure genetic diversity among these Pacific Islanders. Friedlaender et al. (2008) say only that “the larger and more rugged the island, the greater the differentiation among populations”. Nor do they say how much of the variance among populations can be attributed to island ruggedness/complexity rather than island size alone. Do SNA mappings of these populations, nonetheless, support these inferences?

2. Position

Additionally, Friedlaender et al. (2008) and Hunley et al. (2008) find that genetic divergence among these Melanesian populations is structured by geographic position (coastal vs. inland). Do SNA mappings confirm this observation?

3. Language

It is conventional in historical linguistics to subdivide the languages spoken in the Pacific into two primary categories, Austronesian (in the Pacific, these are also called Oceanic) and non-Austronesian (or Papuan). Friedlaender et al. (2008) report finding only a modest degree of association between these two groupings and genetic variation—apportioning the molecular variance in this way only accounts for 0.2% of the total—perhaps, they suggest, because the language populations in their genome scan assigned by linguists to these two categories are scattered across most of the islands surveyed, and intermarriage has blurred original differences (confirming yet again earlier findings: Terrell & Fagan, 1975). Hunley et al. (2008) add that anthropologists generally doubt that language and genetics evolve in tandem since the prolonged isolation required for the formation of stable genetic and linguistic correspondences seems unlikely. They found that significant genetic exchange has occurred between local populations within islands regardless of whether they belong to the same major language group or not, but that genetic exchange between islands may have been relatively restricted for some time. They conclude that there is a pervasive pattern of closer genetic than linguistic proximity between populations on the same island. Do SNA mappings help clarify how language and genetic variation may covary in this region?

4. Social networks

If the probability of interaction and exchanges between any two populations is partially a function of how near to one another they are geographically (Manica et al., 2005; Templeton, 2006), then how closely do expected social networks modelled using as input data the geodesic distances between every pair of populations represented in the genome scan correspond to networks modelled using instead genetic distances? Unlike Friedlaender et al. (2008), Hunley et al. (2008) explored the possibility that similarities among the populations in the genome scan might correlate with geographic distance. However, they assumed that there will be decreasing allelic identity with increasing geographic distance regardless of the actual geographic locations of the populations sampled. They did not directly consider, for instance, whether the distances involved are over land or sea. Not surprisingly, they found little correlation between distance and observed allelic identities (Hunley et al., 2008: Table 5; and see below). Here it is assumed instead that isolation by distance is constrained by geographic location, and that direct interaction and exchanges are most probable between nearest neighbours, i.e., the effect of distance will be qualitative as well as quantitative (for further discussion, see Results: Social networks).
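As a point of reference for the purely quantitative comparison described above, the following minimal sketch correlates the off-diagonal entries of a geographic distance matrix with those of a genetic distance matrix. It is an illustration only, not the procedure used by Hunley et al. (2008) or in this paper: the function names and all numeric values are hypothetical, and a proper significance test (for example, a permutation-based Mantel test) is not shown.

```python
from itertools import combinations
from math import sqrt


def upper_triangle(matrix):
    """Return the values above the diagonal of a square symmetric matrix."""
    n = len(matrix)
    return [matrix[i][j] for i, j in combinations(range(n), 2)]


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def matrix_correlation(a, b):
    """Correlate the pairwise entries of two distance matrices."""
    return pearson(upper_triangle(a), upper_triangle(b))


# Hypothetical values for four populations (not data from the genome scan).
geo_km = [
    [0, 50, 300, 320],
    [50, 0, 280, 310],
    [300, 280, 0, 40],
    [320, 310, 40, 0],
]
fst = [
    [0.00, 0.02, 0.09, 0.10],
    [0.02, 0.00, 0.08, 0.11],
    [0.09, 0.08, 0.00, 0.01],
    [0.10, 0.11, 0.01, 0.00],
]

r = matrix_correlation(geo_km, fst)
print(round(r, 3))
```

A single coefficient of this kind treats every pair of populations alike, whether the distance between them is over land or sea. That is the limitation the nearest-neighbour approach adopted here is meant to address.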

Materials and Methods

Most of Polynesia, Micronesia, and all of southern Melanesia (Fiji, Vanuatu, New Caledonia, and other places immediately east and southeast of the Solomon Islands), Indonesia, and the Philippines are not represented in the Friedlaender et al. (2008) dataset. Therefore, the analyses I report refer mainly to populations in the Bismarck Archipelago and the northern Solomon Islands (Fig. 1). Currently little is known in detail about the prehistory of these populations (Specht, 2007). A matrix of the geodesic distances between every pair of populations represented is used here as a proxy measure of the likelihood of their interaction with one another to model the probable social networks among nearest neighbours.
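One way to turn such a distance matrix into a nearest-neighbour network is to find the smallest cut-off distance at which every population is linked into a single network, which is the criterion described in the caption to Figure 9. The sketch below does this with a union-find pass over the pairs sorted by distance. The site labels and distances are hypothetical, and the code illustrates the idea rather than reproducing how the packages used in this study compute it.

```python
from itertools import combinations


def min_connecting_threshold(dist):
    """Smallest cut-off d such that linking every pair with distance <= d
    joins all nodes into one connected network (None if fewer than 2 nodes)."""
    n = len(dist)
    parent = list(range(n))

    def find(x):
        # Follow parent pointers to the root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    pairs = sorted(combinations(range(n), 2), key=lambda p: dist[p[0]][p[1]])
    components = n
    for i, j in pairs:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            components -= 1
            if components == 1:
                return dist[i][j]
    return None


def edges_within(names, dist, cutoff):
    """Undirected ties between all pairs no farther apart than the cut-off."""
    return [
        (names[i], names[j])
        for i, j in combinations(range(len(names)), 2)
        if dist[i][j] <= cutoff
    ]


# Hypothetical sites and distances in km (not the study's coordinates).
sites = ["A", "B", "C", "D"]
km = [
    [0, 50, 300, 320],
    [50, 0, 280, 310],
    [300, 280, 0, 40],
    [320, 310, 40, 0],
]

cutoff = min_connecting_threshold(km)
print(cutoff, edges_within(sites, km, cutoff))
```

In this toy example the two close pairs (A and B, C and D) are joined into one network only once the cut-off reaches the shortest distance bridging them.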

Network Analysis

Network analysis using mathematical graph theory is one of the fundamentals of discrete mathematics. This approach is now widely used in the social sciences to study social relations among groups, or sets, of actors (Newman, 2001; Krause et al., 2007; Mizoguchi, 2009). As Newman explains:

A network is a set of items … (called) vertices or sometimes nodes, with connections between them, called edges. Systems taking the form of networks (also called ‘graphs’ in much of the mathematical literature) abound in the world. Examples include the Internet, the World Wide Web, social networks of acquaintance or other connections between individuals, organizational networks and networks of business relations between companies, neural networks, metabolic networks, food webs, distribution networks such as blood vessels or postal delivery routes, networks of citations between papers, and many others. (Newman, 2003: pp. 168–169)

While not always labelled as such, network-based ideas and methods have been a staple of anthropological research since the 1960s (Terrell, 1986b), and they are now widely used in the social sciences, as well as in physics, biology, and other fields (e.g., Proulx et al., 2005) with practical applications in health care, law enforcement, national security, and business management (Borgatti et al., 2009).

In the social sciences, a primary focus of network research is on how individuals and groups interact to maintain enduring associations, and on the social, economic, and practical consequences of alternative network configurations. Here the focus, however, is historical. I use social network analysis to evaluate the null, or baseline, hypothesis that isolation by geographic distance among these populations constrained (or channelled) by social networks linking them sufficiently accounts for differences in the outcomes observed—in this instance, for the differences seen in the genetic distances.

For the analyses reported here, I used the network software packages Netdraw 2.083, Netminer 3.3.1, and Ucinet 6.207 (Borgatti, 2002; Borgatti et al., 2002; Netminer, 2008) to examine how genetic variation among Pacific Island populations is patterned. In the accompanying figures, the vertices or nodes are all of one type. They represent populations in the dataset studied by Friedlaender et al. (2008) and Hunley et al. (2008). The edges in the mappings illustrated in Figures 2–8 and 12 are derived from the STRUCTURE and FST values reported in Friedlaender et al. (2008): Tables S6–S7; those in Figures 9–11 are based on geodesic distances calculated from the latitude/longitude coordinates of the populations represented using the Vincenty formula (Vincenty, 1975). All of the edges (i.e., lines between places) shown are undirected rather than directed, i.e., the nodes (populations) in any given pair are considered to be equally similar to each other.
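For readers who wish to build a comparable distance matrix from latitude/longitude coordinates, the sketch below uses the haversine great-circle formula. This is a deliberately simpler, spherical approximation swapped in for illustration; it is not the ellipsoidal Vincenty formula used for the analyses reported here, and results will differ slightly. The mean Earth radius, function names, and coordinates are assumptions.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of the spherical model


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    # Guard against rounding pushing sqrt(a) marginally above 1.
    return 2 * EARTH_RADIUS_KM * asin(min(1.0, sqrt(a)))


def distance_matrix(coords):
    """Symmetric matrix of pairwise distances for a list of (lat, lon) pairs."""
    return [[haversine_km(*p, *q) for q in coords] for p in coords]


# Hypothetical coordinates (decimal degrees), not the sampled locations.
coords = [(-2.5, 150.8), (-4.2, 152.2), (-6.2, 155.6)]
matrix = distance_matrix(coords)
for row in matrix:
    print([round(d, 1) for d in row])
```

Because the edges in these mappings are undirected, only the values above (or below) the diagonal of such a matrix are needed.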

Figure 2.

Three elementary network mappings derived from reported linguistic distances among seven populations in the Bismarck Archipelago (data from Hunley et al., 2008: Table S1). Each node (vertex) in these figures represents a population; each line or tie (edge) represents a linkage or connection between them. (a) The seven nodes have been arrayed at equal distances in a circular pattern revealing that each node is linked directly with all of the others in the network depicted. (b) As in many real life situations, however, each of the actual human populations represented by these seven nodes is only strongly linked with some of the other six nodes, not with all of them. In this second mapping, only linkages equal to, or greater than, a selected strength (i.e., of a chosen threshold value, or cut-off point) are shown. However, the lengths of the lines (edges) drawn are purely arbitrary, and do not indicate which nodes are actually “nearer” or “more similar” to one another. Yet it is clear that relationships among these seven populations are structured for some reason, or reasons, possibly worthy of closer investigation. (c) A third mapping of these same seven nodes drawn using spring-embedding (see text) suggesting by their spatial placement relative to one another and their distance apart how similar or dissimilar each node is to the others in the network.

Figure 3.

Spring-embedding network mapping of the populations in the genome scan (for definition, see Mapping networks). Mapping derived from the mean STRUCTURE assignment probabilities when K = 10 reported by Friedlaender et al. (2008) colour-coded by geographic location. Blue-white= Asia; blue= Taiwan; black= Europe; red= Polynesia; pink= Micronesia; yellow= New Britain; purple= New Guinea; dark green= North Solomons; green= New Ireland; light green= New Hanover; pale green= Mussau.

Figure 4.

Spring-embedding network mapping derived from the matrix of pairwise FST co-ancestry distance values. Spring-embedding network mapping of the populations in the genome scan derived from the matrix of pairwise FST co-ancestry distance values reported by Friedlaender et al. (2008) when the threshold distance value is set at ≤0.04, the minimum needed to link all but two of the nodes (populations) included in the array with at least one other node in the set. The nodes are colour-coded to indicate the Girvan-Newman (2002) cluster assignments for K = 6. Note these assignments basically partition these populations into two major clusters comprising most of the Asians and Melanesians in the dataset.

Figure 5.

Spring-embedding network mapping colour-coded for the Girvan-Newman clusters when K = 6, the optimal subdivision, when the threshold distance value is set at >0.0. Spring-embedding network mapping of the populations in the genome scan as shown in Figure 3 colour-coded for the Girvan-Newman (2002) clusters when K = 6, the optimal computed subdivision.

Figure 6.

Pacific Islanders colour-coded for position (blue = coastal; orange = inland; blue squares = intermediate). Spring-embedding network mapping only of Pacific Islanders (Taiwan included) derived from the mean STRUCTURE assignment probabilities when K = 10 colour-coded for position (blue = coastal; orange = inland; blue squares = populations classified by Hunley et al. (2008) as living in positions intermediate between inland and sea), when the threshold distance value is set at >0.0.

Figure 7.

Pacific Islanders using mean STRUCTURE assignment probabilities when K = 10, colour-coded by geographic location. Spring-embedding network mapping of only the Pacific Island populations in the genome scan (Taiwan included) derived from mean STRUCTURE assignment probabilities when K = 10 reported by Friedlaender et al. (2008), when the threshold distance value is set at >0.0, colour-coded by geographic location. Blue= Taiwan; orange= Polynesia; light orange= Micronesia; purple= New Guinea; yellow= New Britain; dark green= North Solomons; green= New Ireland; light green= New Hanover; pale green= Mussau.

Figure 12.

Ego network (Hanneman & Riddle, 2005) for Mussau Island based on the mean STRUCTURE assignments when the threshold similarity value is set at ≥0.05. Orange= Papuan speakers; green= Austronesian speakers; blue= Mussau, an Austronesian-speaking population.

Figure 9.

Nearest-neighbour structuring of interaction among the populations represented in the genome scan, colour-coded by island. Spring-embedding network array of populations when the threshold geographic distance is 118 km or less, the minimum distance linking all of the nodes (places and populations) into a single network, colour-coded by island with populations sampled on New Britain subdivided into western (black), central (gray), and eastern (purple), and New Ireland (blue) coloured to include the small off-shore islands along its eastern shoreline.

Figure 10.

Girvan-Newman clustering of populations when K = 7, the optimal subdivision of the network shown, colour-coded by cluster assignment.

Figure 11.

Nearest-neighbour structuring of interaction among the populations colour-coded as in Table 1. Nearest-neighbour structuring of interaction among the populations represented in the genome scan colour-coded to show the genetic clustering shown in Table 1, right column (blue nodes represent locations not represented in the genetic scan).

Mapping Networks

Mapping is often used in SNA research to examine patterning within network data (Freeman, 2005; in the classroom, I sometimes refer informally to this analytical work as “mapping data matrices”). Figure 2 shows three alternative ways of mapping a simple network made up of the same seven nodes, which in this illustrative example represent seven human populations in Melanesia (the data used to construct this model network were taken from a matrix of linguistic distances available in Hunley et al., 2008: Table S1).

In Figure 2a, these seven nodes are arrayed at equal distances in a circular pattern. As basic as this mapping is, it is apparent that each node in this network is linked directly with all of the others. However, as in many real life situations, each of the actual populations represented in this illustration is strongly linked with only some of the other six nodes, not with all of them. This fact of life is mapped in Figure 2b where only linkages equal to, or greater than, a selected strength (i.e., of a chosen threshold value, or cut-off point) are drawn. It is now obvious that the relationships among these seven populations are structured for some reason, or reasons—an observation which may be worth further investigation (below, and Fig. 2c).

Note that the placement of nodes in Figure 2b is arbitrary. These seven are shown precisely where they were placed in Figure 2a. Consequently, the lengths of the lines drawn between nodes are also purely arbitrary, and should not be seen as inherently meaningful. In other words, tie length in this figure is not showing us which nodes may be “closer” or “more similar” to one another (in this instance, more similar linguistically).
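The thresholding step behind Figure 2b can be sketched in a few lines. The following is a minimal illustration only: the function name `threshold_edges` and the four-node similarity matrix are hypothetical and are not taken from Hunley et al. (2008). It assumes a symmetric matrix in which larger values mean stronger ties.

```python
def threshold_edges(names, similarity, cutoff):
    """Return the ties whose strength is at or above the chosen cutoff."""
    edges = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if similarity[i][j] >= cutoff:
                edges.append((names[i], names[j], similarity[i][j]))
    return edges


# Hypothetical similarity values, invented for illustration only.
names = ["A", "B", "C", "D"]
similarity = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.6, 0.3],
    [0.2, 0.6, 1.0, 0.8],
    [0.1, 0.3, 0.8, 1.0],
]

# With no effective cutoff every node is tied to every other (as in Figure 2a).
all_ties = threshold_edges(names, similarity, 0.0)
# With a cutoff only the stronger ties are drawn (as in Figure 2b).
strong_ties = threshold_edges(names, similarity, 0.5)
```

For distance data (such as FST or kilometres) the comparison would be reversed, so that only ties at or below the cutoff are kept.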

Several computational techniques are often used in SNA for placing nodes relative to one another in multidimensional (usually 2- or 3-dimensional) space such that the placement and distances among nodes may be interpretable. These techniques include metric- and non-metric multidimensional scaling, principal components analysis, spring-embedding, and other strategies (Hanneman & Riddle, 2005).

“Spring-embedding” refers to one way of making network mappings as visually intelligible as possible by mathematically treating the edges as if they were metal springs and using energy minimisation to place the nodes relative to one another (Fruchterman & Reingold, 1991). As Golbeck and Mutton explain, the effect of spring-embedding is to:

distribute nodes in a two-dimensional plane with some separation, while attempting to keep connected nodes reasonably close together. The spring embedder graph drawing process considers the graph model as a force system that must be simulated. Each node in the graph is modeled as a charged particle, thereby causing a repulsive force between every pair of nodes. Each edge is modeled as a spring that exerts an attractive force between the pair of nodes it connects. The graph is laid out by repeated iterations of a procedure that calculates the repulsive and attractive forces acting on all nodes in the graph. At the end of each iteration, all nodes are moved according to the resultant forces acting on them. (Golbeck & Mutton, 2005: p. 173)

Since the placement of nodes achieved using a method such as spring-embedding is the outcome of an iterative process of computing intermediate approximations, it must be emphasised that there is usually no single “correct” placement when using such SNA techniques (Hanneman & Riddle, 2005). A mapping of the seven nodes in our model example such as that shown in Figure 2c, which was generated using spring-embedding, is best viewed as exploratory and suggestive, not conclusive (in the case of the language information used to generate Figure 2, for example, further investigation suggests that the linkages among the nodes shown in Figure 2c, as well as their spatial placement relative to one another, correspond in most cases to their relative geographic distances apart; Donohue et al., 2010).
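The force system described by Golbeck and Mutton can be sketched as follows. This is a simplified illustration in the spirit of Fruchterman and Reingold (1991), not the implementation used to produce the figures; the function name `spring_layout`, the starting temperature, and the linear cooling schedule are all assumptions made for the example.

```python
import math
import random


def spring_layout(nodes, edges, iterations=200, seed=0):
    """Simplified force-directed placement of nodes in two dimensions.

    Every pair of nodes repels; every edge pulls its two endpoints together.
    Illustrative sketch only; parameter choices here are assumptions.
    """
    rng = random.Random(seed)
    pos = {n: [rng.random(), rng.random()] for n in nodes}  # random start
    k = math.sqrt(1.0 / len(nodes))   # ideal spacing between nodes
    temperature = 0.1                 # largest move allowed per iteration
    cooling = temperature / (iterations + 1)
    for _ in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}
        # Repulsive force between every pair of nodes.
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                dist = max(math.hypot(dx, dy), 1e-9)
                force = k * k / dist
                disp[u][0] += dx / dist * force
                disp[u][1] += dy / dist * force
                disp[v][0] -= dx / dist * force
                disp[v][1] -= dy / dist * force
        # Attractive force along each edge.
        for u, v in edges:
            dx = pos[u][0] - pos[v][0]
            dy = pos[u][1] - pos[v][1]
            dist = max(math.hypot(dx, dy), 1e-9)
            force = dist * dist / k
            disp[u][0] -= dx / dist * force
            disp[u][1] -= dy / dist * force
            disp[v][0] += dx / dist * force
            disp[v][1] += dy / dist * force
        # Move each node along its resultant force, capped by the temperature.
        for n in nodes:
            length = max(math.hypot(*disp[n]), 1e-9)
            step = min(length, temperature)
            pos[n][0] += disp[n][0] / length * step
            pos[n][1] += disp[n][1] / length * step
        temperature -= cooling
    return pos
```

Because the starting positions are random, running the sketch with a different `seed` generally yields a different placement of the same network, which is one reason such mappings are best read as exploratory.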

Once a relational mapping of a set of nodes has been done using selected dimensions of variation (here genetic and geographic distance), it is then possible to examine the attributes, or properties, of each node (Scott, 2000: pp. 2–5) to see if any of those properties may be informing the structuring and characteristics of the network under consideration, e.g., their geographic location (Figs. 3–4, 7), geographic position (Fig. 6), language affiliation (Fig. 8), or autosomal cluster assignment (Fig. 11).

Data

As previously noted, SNA is a set of methods for visually mapping and exploring the relational aspects of social structures (Scott, 2000: p. 38). Molecular genetic data today is proving to be a primary source of information on the local, regional, and global structure and history of populations (e.g., Hill et al., 2007), as well as the interaction between patterns of genetic structure and relatedness and the dynamics of social systems (e.g., McDonald, 2009).

The computer program STRUCTURE used by Friedlaender et al. (2008) draws on multilocus genotype data to divide people into a requested number of subgroups (K) by assigning probabilities of membership in each of K subgroups to each individual represented in the input dataset. Here I use the mean population subgroup assignments when K = 10 (Friedlaender et al., 2008: Table S6), computed as Pearson's correlation coefficients, as input data for the mappings shown in Figures 3, 5–8, and 12. Although these data were obtained from a relatively restricted geographic area, Friedlaender et al. (2008) find that the range of the pairwise FST co-ancestry values (Reynolds et al., 1983) for the same Melanesian populations in the genome scan is extremely large. For comparative purposes, FST values for these populations (Friedlaender et al., 2008: Table S7) are used to assemble the SNA mapping shown in Figure 4.
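One plausible reading of this step is that each population's vector of mean assignment probabilities is correlated with every other population's vector, giving a population-by-population similarity matrix. The sketch below illustrates that reading under stated assumptions: the population labels and the three-subgroup profiles are invented (the real data use K = 10), and the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length vectors.

    Assumes neither vector is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def similarity_matrix(profiles):
    """Correlate every population's assignment profile with every other's."""
    names = sorted(profiles)
    matrix = [[pearson(profiles[a], profiles[b]) for b in names] for a in names]
    return names, matrix


# Hypothetical mean assignment probabilities for three subgroups (K = 3),
# invented for illustration; they are not values from Table S6.
profiles = {
    "PopA": [0.7, 0.2, 0.1],
    "PopB": [0.6, 0.3, 0.1],
    "PopC": [0.1, 0.1, 0.8],
}
names, matrix = similarity_matrix(profiles)
```

The resulting matrix is symmetric with values between -1 and 1, and it can then be thresholded to decide which ties are drawn.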

Results

In their original description of the model-based clustering method called STRUCTURE, Pritchard and his colleagues (Pritchard et al., 2000: p. 945) remark:

The definition of populations is typically subjective, based, for example, on linguistic, cultural, or physical characters, as well as the geographic location of sampled individuals. This subjective approach is usually a sensible way of incorporating diverse types of information. However, it may be difficult to know whether a given assignment of individuals to populations based on these subjective criteria represents a natural assignment in genetic terms, and it would be useful to be able to confirm that subjective classifications are consistent with genetic information and hence appropriate for studying the questions of interest.

These comments could be read as implying that human populations are objective rather than subjective taxonomic units, and that population assignments made using non-genetic traits may be unable to identify biologically meaningful populations. Although such suppositions are problematic (Terrell & Stewart, 1996; Tëmkin & Eldredge, 2007), the issue here is whether mapping the genetic structure of Pacific Islanders can help resolve questions of interest about their biogeography and history. Judging whether the fine-grained genetic information under consideration can be helpful critically depends on how successfully cluster assignments given to individuals and populations derived from this dataset can be matched with relevant non-biological information about these islanders, their biogeography, and their known history.

1. Geography

Friedlaender et al. (2008) report that the cluster assignments of the populations in their genome scan when K = 10 show that Melanesian genetic differentiation varies not only between islands, but also by island size and topographic complexity. The spring-embedding network mapping in Figure 3 confirms that the apportioning of mean STRUCTURE assignments among these populations is geographically patterned. Asians and Melanesians are clearly differentiated from one another, with Polynesians and Micronesians intermediate between these two geographic clusters, although Micronesians (primarily individuals from Belau) and Samoans have more numerous linkages with Asians than with Melanesians. Note also that the three neighbouring Baining populations on New Britain appear to form a tightly associated and isolated cluster on their own.

It is less certain, however, whether there is clear organisation of genetic variation among Melanesian populations by island, island size, and topographic complexity or ruggedness. While populations on New Britain evidently comprise at least five distinct clusters, those on the islands of Mussau, New Hanover, New Ireland, and in the northern Solomon Islands (hereafter MHIS islands) do not. Instead, they cluster together. It is also apparent that while the populations representing Highland New Guineans and people in the Sepik River area of northern New Guinea form a cluster apart from New Britain and the MHIS islands cluster, the node representing the Sepik River (as might be expected given the “ancient lagoons” hypothesis) is linked with populations in island Melanesia. However, their association is seemingly stronger with the MHIS islands cluster than with populations on New Britain, even though New Britain lies geographically between the Sepik coast and the MHIS islands.

Figure 4 is a comparable network mapping based on the pairwise FST co-ancestry distance values reported by Friedlaender et al. (2008) for the microsatellite data in their genome scan. Here again Asians cluster unambiguously apart from Melanesians, with Polynesians, Micronesians, and the three Baining populations on New Britain intermediate between both. Polynesians and Micronesians also appear to have closer evident ties with Asians than with Melanesians. However, two of the populations in the scan (one on Taiwan, the other on New Britain) cannot be integrated into this mapping given the threshold distance value used to generate this network, which was set at ≤0.04, a distance that integrates all but these two into a single network (the Baining outlier joins this network at an FST distance of 0.0541 or less; Taruko, the Taiwan outlier, at 0.0413).

The colour-coding in this figure shows the cluster assignments generated using the SNA Girvan-Newman algorithm for detecting communities in complex systems (Girvan & Newman, 2002) when K = 6, the optimal subdivision of this dataset calculated by the algorithm. Note that while several populations on New Britain are still strongly divergent from the others in the scan, this FST co-ancestry mapping coalesces some of the New Britain populations with most of the other Melanesians represented. For comparison, Figure 5 similarly shows Girvan-Newman cluster assignments when K = 6 using instead the mean STRUCTURE probabilities, as in Figure 3. While these two mappings differ, what is apparent is that the primary subdivision in both figures is drawn between Asians and Melanesians—an observation in accord with the “sleeping giant” hypothesis—and that the Baining populations are clustered uniquely on their own.
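The core of the Girvan-Newman procedure is to remove, repeatedly, the tie that lies on the largest number of shortest paths until the network falls apart into the requested number of groups. The sketch below illustrates that idea for an unweighted network. It takes the number of clusters `k` as given and does not show how an optimal K would be selected; the function names and the example edges are hypothetical, and this is not the software used for Figures 4 and 5.

```python
from collections import deque


def edge_betweenness(adj):
    """Shortest-path betweenness of every edge in an unweighted network."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for source in adj:
        # Breadth-first search from the source, counting shortest paths.
        dist, paths = {source: 0}, {source: 1.0}
        preds = {n: [] for n in adj}
        order, queue = [], deque([source])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    paths[v] = 0.0
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    paths[v] += paths[u]
                    preds[v].append(u)
        # Pass credit back from the farthest nodes towards the source.
        credit = {n: 0.0 for n in order}
        for v in reversed(order):
            for u in preds[v]:
                share = paths[u] / paths[v] * (1.0 + credit[v])
                score[frozenset((u, v))] += share
                credit[u] += share
    return score


def components(adj):
    """Return the connected groups of nodes as a list of sets."""
    seen, groups = set(), []
    for start in adj:
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:
            u = stack.pop()
            if u not in group:
                group.add(u)
                stack.extend(adj[u] - group)
        seen |= group
        groups.append(group)
    return groups


def girvan_newman(edges, k):
    """Remove the highest-betweenness edge until k groups remain."""
    adj = {}
    for u, v in edges:  # assumes no self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    while len(components(adj)) < k and any(adj.values()):
        score = edge_betweenness(adj)
        u, v = max(score, key=score.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)


# Hypothetical network: two triangles joined by a single bridging tie.
example_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"),
                 ("d", "e"), ("e", "f"), ("d", "f")]
clusters = girvan_newman(example_edges, 2)
```

In the example the bridging tie between the two triangles carries every shortest path between them, so it is removed first and the two triangles emerge as separate clusters.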

2. Position

Friedlaender et al. (2008) report that inland groups in their scan are the most differentiated, while shore-dwelling populations are more intermixed. Hunley et al. (2008) were able to identify the same coastal/inland residential distinction. Figure 6 is a spring-embedding network mapping of the mean STRUCTURE assignment probabilities when K = 10 only for the Pacific Islanders (Taiwan included). Except for the Nasioi, Sepik, Anem, and Kol, there appears to be good fit between genetic differentiation and the inland/coastal assignments made by Friedlaender et al. (2008). It should be noted, however, that the Nasioi are both coastal and inland (Friedlaender, 1975), and although these investigators classify Sepik River individuals as living inland, this river basin (as discussed above) was formerly an inland sea joining the Bismarck Sea where the mouths of the Sepik and Ramu rivers are presently located.

3. Language

Friedlaender et al. (2008) find that microsatellite variation among the Oceanic-speaking populations in their scan is significant, although it is greater among the Papuan-speaking populations represented, many of which are located inland on the larger islands. They also find what they consider to be a small but very clear “East Asian/Polynesian” genetic signature, which they label as an “Austronesian” signature, in 8 out of the 19 Oceanic-speaking populations in their scan (at a frequency >0.04). This signature is not found in the Papuan-speaking groups represented at this threshold level or greater (this signature occurs in all of the populations represented in the genome scan, but usually at a level of <0.01; Friedlaender et al., 2008: Supplementary table S8).

Figure 7 is a spring-embedding network mapping again only for Pacific Islanders and Taiwan colour-coded once more by geographic location. Note that the populations on the MHIS islands again cluster together, while those on New Britain do not. Figure 8 shows these populations colour-coded instead by language (green = Austronesian; orange = Papuan; blue = Austronesian-speaking Oceanic populations with an “Austronesian” assignment >0.04). When Figure 8 is compared with Figure 6, it is clear that Austronesian (Oceanic) speakers live both inland and on the coasts, and that populations displaying the conjectured “Austronesian” genetic signature at a frequency >0.04 are widely dispersed in both the MHIS islands area and on New Britain.

Friedlaender et al. (2008) interpret this signature as a clear sign that ancestors of the Polynesians moved through Melanesia from a homeland located somewhere west of New Guinea relatively rapidly, only intermixing to a very modest degree with local populations while on their way east to Polynesia. However, as just noted, Figure 8 shows that populations with this signature today are distributed throughout this part of Melanesia from the northern Solomon Islands to Mussau, and from there to west New Britain. Furthermore, judging by currently available evidence, this signature is no more prominent among Taiwanese populations than it is in Melanesia. Thus there would appear to be little evidence for concluding, as Friedlaender et al. (2008) do, that ancestors of the Polynesians migrated from the general vicinity of Taiwan to the central Pacific or that they did so without extensive contact with Melanesian populations along the way.

4. Social networks

In ecology and other environmental sciences, isolation by distance as a formative evolutionary process usually refers to the impact that geographic distance has on the probability of interaction between organisms and species (Segelbacher et al., 2003; Telles & Diniz-Filho, 2005). This phrase rarely has only this meaning in the social sciences (Terrell, 1986b). Instead, these words commonly refer to various measures of how directly or indirectly people are embedded in evolving social networks (Relethford, 2004; Hanneman & Riddle, 2005; Jun & Kim, 2007), although it should be noted, too, that assessing habitat connectivity, for example, has been shown to be particularly useful in conservation biology and in studies of genetic differentiation (Segelbacher et al., 2003).

Nearest neighbour methods are commonly used in geography to model expected patterns of connectivity among people in different places in lieu of more direct measures (e.g., using the actual number of telephone calls, or trucking shipments between two locations; Hanneman & Riddle, 2005). The spring-embedding networks illustrated in Figures 9–11 are based on the spatial distances among all of the locations shown as determined from their geographic coordinates.
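The text does not specify how spatial distances were derived from the coordinates. One common choice, assumed here purely for illustration, is the great-circle (haversine) distance on a spherical Earth. The function name and the radius constant are assumptions, not details taken from the analysis.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometres between two points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
```

Applying such a function to every pair of locations yields the distance matrix from which nearest-neighbour ties can be drawn.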

While ethnographic information on the movement of people locally from place to place in this region of the Pacific is limited, available data indicate that the small islands off the east coast of New Ireland have been instrumental in the flow of information and goods (Kaplan, 1976; Terrell, 1976). Seven offshore small island locations, therefore, have been added to those in the previous figures based on this information to better approximate the geographic structuring of interactions among the populations sampled genetically.

The network mappings illustrated in the previous figures give the distinct impression that genetic diversity on New Britain is exceptional, at least in comparison with other islands included in the genome scan. However, while attending to whether people live inland or by the sea is evidently one way to explain some of the observed variation (Fig. 6), it is not clear that differences in the topographic complexity or ruggedness of these islands contribute much to the patterning of variation seen. New Britain is more than four times larger than New Ireland, and nearly four times larger than Bougainville, the other two major islands immediately east of New Guinea. Moreover, the populations represented in the genome scan were not randomly selected. Hence how to generalise from the autosomal information now available is not self-evident. Island ruggedness and even the “islandness” of islands may have little to do with the patterning of variation seen. Figures 9–11 illustrate instead how, in addition to settlement position (coastal vs. inland), it is evidently important to pay attention in this part of the Pacific to how social networks linking people and populations are structured.

Figure 9 shows a mapping of the expected linkages among these populations when the threshold for the geographic ties (edges) is set at a geographic distance of 118 km or less, the minimum geodesic distance permitting all of the nodes (places and populations) represented to be joined into a single network. Nodes are colour-coded by geographic locale. Because New Britain is so large, the nodes representing populations on this island are coded to show where these populations are located: in eastern New Britain, western New Britain, or towards the east-west centre of the island.
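The smallest cutoff that joins every node into a single network can be found by adding ties from shortest to longest until only one connected group remains. The sketch below illustrates this with a union-find structure; the function name, the node labels, and the distances are hypothetical and chosen only to show the procedure.

```python
def minimum_linking_distance(nodes, distances):
    """Smallest cutoff d such that ties of length <= d connect all nodes.

    `distances` maps (node_a, node_b) pairs to a distance. Returns None if
    the ties supplied can never join every node into one network.
    """
    if len(nodes) <= 1:
        return 0.0
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent links to the representative of n's group.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    groups = len(nodes)
    for (a, b), d in sorted(distances.items(), key=lambda item: item[1]):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b  # this tie merges two groups
            groups -= 1
            if groups == 1:
                return d
    return None


# Hypothetical distances in kilometres, invented for illustration.
places = ["P", "Q", "R", "S"]
km = {("P", "Q"): 30.0, ("Q", "R"): 55.0, ("P", "R"): 70.0,
      ("R", "S"): 95.0, ("Q", "S"): 120.0, ("P", "S"): 160.0}
cutoff = minimum_linking_distance(places, km)
```

In this invented example the most isolated place, S, sets the cutoff, because the network is not fully connected until its shortest tie is admitted.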

Observe that this mapping implies that these populations ought to fall into possibly 8–10 distinguishable genetic clusters. For comparison, Figure 10 shows the probable clusters in the same network based on the Girvan-Newman algorithm when K = 7, the optimal likely number of subdivisions as determined by the algorithm. Note that based on geographic distance alone, these alternative mappings assign one cluster to Bougainville, two to New Ireland, and 3–4 to New Britain. While these numbers are not identical to the cluster allocations given to them using mean STRUCTURE assignments (see Expectations: Geography), it is clear that geographic distance contributes to genetic differentiation in this region.

Table 1 summarises the node clustering shown in these figures. Note that there are nine possible clusters based on geographic distance alone as the sorting criterion (left column), but only seven clusters using the Girvan-Newman algorithm (centre column). The names in bold in this column are the nodes merged by Girvan-Newman clustering from one of the purely geographic clusters given in the left hand column. The right hand column lists the clusters identified based on genetic distances (using the mean STRUCTURE assignments) when the similarity threshold for cluster membership is set at >0.80 (the clusters begin to break down when the threshold value is higher). Observe that there are now only six clusters listed. Colour-coding for these six is shown along the right edge of the table. This coding is used in Figure 11 to identify node genetic cluster membership on the same geographic network previously illustrated in Figures 9–10 (note: in this figure, the blue-coloured nodes have no genetic information associated with them in the genome scan, and therefore, it cannot be said that these nodes form a genetic cluster). The fourth genetic cluster in the right-hand column (colour-coded brown) comprises inland populations—confirming again that the effect of geographic distance may be greater over land than on the coast.

Table 1. Comparison of node clustering using different approaches.*

In light of this table and Figure 11, it is clear that although geographic distance may be contributing much of the story of genetic diversity in this part of Melanesia, geography is not the whole story. Table 1 indicates there is actually less diversity in this region than predicted using geographic distance solely as the criterion. Nevertheless, while it may be its sheer island size rather than something striking about its topography that favours genetic diversity on New Britain, the network analyses presented here, as well as previous analyses by others, indicate that the three Baining populations sampled on New Britain stand out as the most divergent population cluster in this genome scan. Hunley et al. (2008) suggest that socio-cultural and ecological variation may be involved. Perhaps this may be the case, but if: (a) the commonsense assumption is made when modelling the impact of distance on genetic diversity that geographic isolation between communities has probably varied systematically across the linkages shown in the baseline social network depicted in Figures 9–11, and (b) if so, the transport costs—the “resistance” or “friction” of travel by foot—have probably been greater inland than those of travel on foot or by canoe between coastal communities, then (c) the distinctiveness of the three Baining populations becomes understandable (see Supplementary Text S1 and Fig. S1).

Even so, something more than just geographic distance alone is influencing the structuring of genetic similarities among the populations in this genome scan. Mussau is the most geographically isolated Melanesian population in the scan (excluding the three populations sampled for New Guinea). This island is 118 km by sea from the nearest neighbouring island population in the scan, Lavongai North on New Hanover. SNA analysis shows, however, that this island is the most genetically linked population in the scan (Fig. 12). Furthermore, Mussau occupies a similarly prominent position in this region's archaeological record (Specht, 2007). Isolation by distance constrained by social networks may be structuring genetic diversity in this part of the world, but it is not solely determining it.

Discussion

1. Geography

For the island Melanesian populations in their genome scan, Friedlaender et al. (2008) found that their AMOVA analysis of microsatellites suggests that the larger and more rugged the island, the greater the genetic differentiation among populations. Therefore, they inferred that diversity among these populations is primarily organised by island size and topographic complexity. However, unlike Hunley et al. (2008), they did not consider that diversity may be structured also by isolation by geographic distance. The latter attribute the higher genetic and linguistic distances identified for interior/Papuan-speaking populations to restrictions on movement in the rugged highland interiors, as well as, they add, the longer tenure of Papuan-speaking populations in the region. SNA analysis suggests instead that while topographic complexity and island size may be intuitively plausible determinants of genetic (and cultural) diversity, both are not required to account for variability observed in the scan. Isolation by distance constrained by social networks reasonably accounts for much of the population structuring seen. It is not always appropriate to invoke Occam's razor that “entities must not be multiplied beyond necessity,” but in this instance, such would seem appropriate.

2. Position

Friedlaender et al. (2008) also report that in addition to island size and topographic ruggedness, population diversity is structured by geographic position: coastal vs. inland. Hunley et al. (2008) concur. SNA mapping also supports this observation (Fig. 11). It should be added, however, that drawing a distinction between “inland” and “coastal” is a simple way of saying that the strength, or effect, of distance on isolation may be greater moving over land rather than along the coast, or by sea, and it is evident from Figure 6 that only on the very large islands of New Guinea and New Britain are absolute distances sizable enough for this over land effect to be generally noticeable.

3. Language

Although Friedlaender et al. (2008) found little association between language diversity (Austronesian vs. Papuan) and genetic variation, they report that there is an identifiable, although weak, “Austronesian” genetic signature present in at least some Austronesian-speaking populations in Melanesia—a signature, however, seen most prominently in Polynesia and Micronesia. (Hunley et al. (2008) also refer to this signature, which they identify as a “Taiwan Aboriginal signal”.) They attribute this signature to “at least one powerful new impulse of influence” around 3,300 years ago “from Austronesian speaking migrants from Island Southeast Asia” (Friedlaender et al., 2008: p. 1). As noted, they conclude: “Our analysis indicates the ancestors of Polynesians moved through Melanesia relatively rapidly and only intermixed to a very modest degree with the indigenous populations there”.

However, as also noted earlier, all of Indonesia, the Philippines, and southern Melanesia (Fiji, Vanuatu, New Caledonia, and other places immediately east and southeast of the Solomon Islands) as well as most of Polynesia and Micronesia are not represented in their genome scan. It would seem premature, therefore, to favour such a specific historical scenario. Friedlaender et al. (2008) remark, for instance, that while the Polynesians in their analysis are similar to Taiwan Aborigines and East Asians, they might be even closer to populations in Indonesia, the Philippines, or Southeast Asia not covered in their scan. It should be added, however, that there is no reason at present to also exclude from consideration populations in southern Melanesia with whom Polynesians, in particular, have notably close archaeological ties (Sand, 2007).

Additionally, there are at least two other immediate reasons for not accepting the narrowly constrained historical interpretation of the genetic signature favoured by Friedlaender, Hunley, and their colleagues. First, this signature is expressed at cluster assignment probabilities greater than 20% (Friedlaender et al., 2008: Table S8) only among Polynesians (they report 0.65 as the mean proportion for Samoans; 0.86 for Maori New Zealanders), and Micronesians (for whom, primarily from Belau, the mean proportion is 0.49). In contrast, the proportion is less than 20% on Taiwan (for the Ami, the mean proportion is 0.16; for the Taruko, it is 0.08), and in island Melanesia (where it is an assignment given to less than half of the Melanesian Austronesian populations in the genome scan). Therefore, on current evidence, caution suggests this signature should not be seen as a generic “Austronesian” or “Taiwan Aboriginal signal” hallmark, but should instead probably be called an “Oceanic” signature. While the absence of this signature among most of the Melanesian Austronesian-speaking populations in their genome scan might be seen, as Friedlaender et al. (2008) propose, as evidence implying that ancestors of Polynesians moved rapidly through Melanesia and only intermixed to a modest degree with local populations there, the same evidence may also be an indication that this signature is not an Austronesian hallmark, and should be seen instead as evidence implying that this signature may have originated somewhere in the Bismarck Archipelago sometime prior to the stabilisation of world sea levels ∼6,000–7,000 years ago and was only carried westward to some parts of island Southeast Asia during the mid or late Holocene, an inference in accord with the “ancient lagoons” hypothesis previously discussed (a similar explanation could account for the unexpectedly high frequencies of the so-called “Polynesian” mitochondrial motif among Papuan speakers on Bougainville Island and elsewhere; see Friedlaender, 2005: map 3; Friedlaender et al., 2007).

Second, it is likely that the visibility of this genetic signature among Polynesians and Micronesians (as represented chiefly by Belauans, who speak a non-Oceanic Austronesian language) is historically misleading. Its prominence on islands east of Melanesia is in accord with the long-standing founder effect hypothesis (Terrell, 1986a; Hill et al., 2007; Excoffier & Ray, 2008) that the genetic distinctiveness of Polynesians (as well as many Micronesians) is largely an accident of history—a sign that the first human settlement of Fiji and western Polynesia was achieved by a small, non-random (i.e., biologically related) group of people who came from a genetically heterogeneous source population resident somewhere to the west in Melanesia.

Furthermore, it seems premature to interpret the low visibility of this signature today in Melanesia or on Taiwan as a sign of how much—or how rarely—people from Southeast Asia and island Melanesia met and interacted with one another in the past. Hunley et al. (2008) note that previous analyses of autosomal microsatellites and Y-chromosome data suggest that Papuan-speaking groups have contributed more genetically to Oceanic-speaking groups than vice versa over the last three millennia. Considering the state of current knowledge, there would seem to be no compelling reason to infer that (a) genetic exchanges between people living east and west of New Guinea were limited solely to a chronologically restricted moment in time ∼3,300 years ago, (b) the movement of people was only in a single direction (i.e., from on or near Taiwan), or that (c) people were on the move solely for a particular reason (i.e., voyages of discovery, say, rather than commerce). It should be added that there is now a growing body of archaeological data pointing to large-scale interaction and exchanges back-and-forth in the Pacific region long before 3,300 years ago (Torrence et al., 2009).

4. Social Networks

Hunley et al. (2008) found that local genetic and linguistic exchanges in island Melanesia over time have obscured, but not completely erased, evidence in datasets of both kinds that is useful for reconstructing the early history of Pacific Islanders. They also found a weak but significant matrix correlation between genetics and distance (r = 0.31, p < 0.005; Hunley et al., 2008: Table 5). They conclude that a larger sample would show an even more robust isolation-by-distance pattern. Using SNA methods shows, however, that the structuring of interaction and exchanges by geographic distance has played a readily discernible role in patterning genetic similarities and differences among the populations considered. On the other hand, as previously noted, these autosomal data were not obtained using a probability sampling design. Therefore, caution must be used when interpreting findings based on this information since the inferred number of clusters present in a STRUCTURE analysis can be strongly influenced by the sampling scheme used (Pritchard et al., 2000: p. 956). Furthermore, it is evident from Table 1 that there is less noticeable genetic clustering in this part of the Pacific than is predicted using geographic distance as the principal structuring criterion.
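A matrix correlation of the kind reported by Hunley et al. (2008) is commonly assessed with a Mantel-style permutation test, in which the rows and columns of one distance matrix are shuffled together many times to see how often a correlation as large as the observed one arises by chance. The sketch below assumes that approach, which may differ in detail from the procedure those authors used; the function names and default settings are hypothetical.

```python
import math
import random


def matrix_correlation(a, b):
    """Pearson correlation over the upper-triangle entries of two matrices."""
    n = len(a)
    xs = [a[i][j] for i in range(n) for j in range(i + 1, n)]
    ys = [b[i][j] for i in range(n) for j in range(i + 1, n)]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def mantel_test(a, b, permutations=999, seed=0):
    """Return the observed correlation and a one-sided permutation p-value."""
    rng = random.Random(seed)
    n = len(a)
    observed = matrix_correlation(a, b)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)
        # Relabel the populations of b, permuting rows and columns together.
        shuffled = [[b[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if matrix_correlation(a, shuffled) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)
```

Called with a genetic distance matrix and a geographic distance matrix for the same populations in the same order, the function returns a correlation and p-value of the same general form as the r and p quoted above.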

Conclusions

The clarity and resolution of using fine-grained information on autosomal variation to explore population structure and genetic relationships can be improved by using methods developed in social network analysis (SNA). I have applied techniques drawn from SNA to map the structuring of genetic diversity among islanders in the south-western Pacific. These mappings lead to several observations and inferences about the history and genetic relationships of these islanders.

Sampling. As Friedlaender et al. (2008) have remarked, inadequate sampling, particularly of populations in Melanesia, has made it difficult to determine the origins and phylogenetic relationships of Pacific Islanders. However, the populations represented in their genome scan were not randomly selected, nor are they representative of the Pacific region as a whole. While Friedlaender et al. (2008) say their genome scan was both intensive and comprehensive in its coverage of populations across a comparatively small geographic area, they report sampling only “Papuan-speaking populations and their immediate Oceanic-speaking neighbors from the islands immediately to the east of New Guinea in what is called Northern Island Melanesia, consisting of the Bismarck and Solomon Archipelagos”. Thus, how best to interpret their autosomal data is not self-evident. For example, ethnographic information indicates that the small islands off the east coast of New Ireland have been instrumental in the flow of information and goods, yet these islands are not represented in their scan (Fig. 11). Consequently, while the genome scan assembled by Friedlaender and his colleagues is unquestionably far more extensive than what was previously available for study, there remains considerable uncertainty about human genetic relationships in this part of the Pacific due to current sampling limitations.

Population structure. Although Hunley et al. (2008) report patterns of gene-language co-evolution inconsistent with predictions of their isolation-by-distance model for this region as a whole, it is reasonable to assume that isolation by distance is constrained by the geographic location of the populations being sampled genetically (contra Hunley et al., 2008: p. 8), and that direct interaction and exchanges are most probable between neighbouring communities. The SNA mappings (Figs. 9–11) presented here indicate that geographic distance has clearly influenced the structuring of genetic relationships among populations in the south-western Pacific. Yet these SNA mappings (including Fig. 12) also show that genetic relationships are often closer than geographic distance alone would lead us to expect.

History. While Friedlaender et al. (2008) conclude that genetic exchanges between islands may have been relatively restricted for some time, the clustering together of populations resident on Mussau, New Hanover, New Ireland, and in the northern Solomon Islands does not accord with this inference. Friedlaender et al. (2008) and Hunley et al. (2008) also say that the ancestors of Polynesians moved through Melanesia quickly and only intermixed genetically to a modest degree with local populations already in residence there. However, the genetic signature they use to support this inference is geographically distributed widely throughout this part of the Pacific from the northern Solomon Islands to Mussau, and from there to west New Britain. Judging by what is presently known, moreover, this signature is also no more prominent among Taiwanese populations than it is among Melanesians. Accordingly, there is currently insufficient evidence to conclude that ancestors of the Polynesians migrated from the general vicinity of Taiwan to the central Pacific, that they did so without extensive contact with Melanesian populations along the way, or that “the sailing capabilities of the ancestors of the Polynesians transformed the nature of their Diaspora and kept them relatively homogeneous” (Friedlaender et al., 2008: p. 15).

“Sleeping giant” hypothesis. It is not unexpected that autosomal variation in island Melanesia would substantiate the differentiation of Asians and Melanesians long viewed as self-evident on the witness of phenotypic characters alone. The “sleeping giant” hypothesis that New Guinea during the Pleistocene and early Holocene served more as a vicariant barrier than as a land bridge between island Southeast Asia and Oceania is a logical way to account for the striking physical divergence of these two neighbouring—yet morphologically dissimilar—regional populations. Unfortunately, however, Taiwan and the Sepik River, the two closest localities in these two regions represented in the genome scan under consideration here, are ∼4,000 km apart. The alternative baseline (or “null”) hypothesis that isolation by distance constrained by social networks rather than Pleistocene living conditions may adequately account for differences observed between Asians and Melanesians cannot yet be evaluated using autosomal evidence. Furthermore, due to the under-representation of New Guineans in this scan, it is also moot whether island Melanesians and most New Guineans differ genetically from one another at a level of divergence striking enough to be attributable to Pleistocene biogeographical isolation rather than more recent isolation by distance on a local scale during the Holocene. Therefore, strategically collected and comparably fine-grained genetic information on human variation in Indonesia and on the island of New Guinea is critically needed to evaluate further the “sleeping giant” hypothesis about the history of human diversity in the Pacific.

“Ancient lagoons” hypothesis. Similarly, what Friedlaender et al. (2008) label as an “Austronesian” signature (but which should probably be called an “Oceanic” signature instead) is in agreement with the “ancient lagoons” hypothesis. This hypothesis holds that the stabilisation of world sea levels near their present stand ∼6,000–7,000 years ago, which led to the growth of coastal resources favourable to human settlement, may have increased the range and intensified the impact of interaction and exchanges among populations resident on islands throughout the south-western Pacific. It is less demanding of what is still quite limited genetic data than the alternative hypothesis favoured by Friedlaender et al. (2008) and Hunley et al. (2008). Their alternative treats this genetic signature, however labelled, as support for the far more specific historical reconstruction that modern Polynesians and Micronesians are alike descended from a powerful new impulse of influence around 3,300 years ago, personified by Austronesian-speaking migrants from island Southeast Asia who swept through Melanesia relatively rapidly on their way to Polynesia, only intermixing to a very modest degree with local people (Friedlaender et al., 2008). Given how geographically widely distributed this signature is in the Bismarck Archipelago and the northern Solomons, it may also be older in this part of Melanesia than hypothesised by Friedlaender, Hunley, and their colleagues.

Acknowledgements

I thank Jonathan Friedlaender and Keith Hunley for their comments on the first draft of this report. John Hart, New York State Museum, and James Koeppl, Field Museum of Natural History, have read the manuscript in its several iterations, and have offered wise advice and support. Mark Golitko prepared the base map for Figure 1. The views and conclusions, of course, are my own, and my acknowledging these individuals should not be taken as implying they necessarily agree with what I have to report.
