Post by Admin on Jun 9, 2020 20:08:15 GMT
A Highly Unstable Recent Mutation in Human mtDNA
Ana T. Duggan and Mark Stoneking

Abstract
An A-to-G transition at position 16247 in the human mtDNA genome denotes haplogroup B4a1a1a and its sublineages. Informally known as the “Polynesian motif,” this haplogroup has been widely used as a marker in Oceania of genetic affiliation with the Austronesian expansion. The 16247G allele has arisen only once in the human mtDNA phylogeny, about 7,000 years ago, and is nearly fixed in Remote Oceania. We analyzed 536 complete mtDNA genome sequences from the Solomon Islands from haplogroup B4a1a1 and associated subhaplogroups and found multiple independent back mutations from 16247G to 16247A. We also find elevated levels of heteroplasmy at this position in samples with the 16247G allele, suggesting the ongoing occurrence of somatic back-mutations and/or transmission of heteroplasmy. Moreover, the G allele is predicted to introduce a novel stem-loop structure in the DNA sequence that may be structurally unfavorable, thereby accounting for the remarkable number of back-mutations observed at the 16247G allele in this short evolutionary time span. More generally, haplogroup-calling scripts result in inaccurate haplogroup calls involving the back-mutation and need to be supplemented with other types of analyses; this may be true for other mtDNA lineages because no other lineage has been investigated to the same extent (over 500 complete mtDNA sequences).

Main Text
Although other potential origins have been suggested,1–3 it is likely that the Austronesian expansion spread from Taiwan beginning about 5,000 years ago.4–7 The Austronesian expansion resulted in the spread not only of a new language family and culture but also the mtDNA B4a1 lineage.8,9 As the B4a1 lineage expanded across Island Southeast Asia, Near Oceania, and Remote Oceania, various subhaplogroups appeared along the migration route, often with signatures of specific homelands.
For example, haplogroup B4a1a1a and the immediately descendent haplogroup B4a1a1a1 are found at very high frequency in Near Oceania and approach fixation in Remote Oceania.10–12 These two haplogroups are thought to have arisen in the Bismarck Archipelago due to the high diversity of lineages found in that region and their general scarcity further west.13 Haplogroup B4a1a1a is defined by a series of diagnostic mutations, many in the mtDNA coding region. The mutation that defines haplogroup B4a1a1a from its immediate predecessor B4a1a1 is an A-to-G transition at position 16247 in the noncoding control region (Figure 1). Because this position was an obvious marker for early mtDNA studies that sequenced only the hypervariable segments of the control region, and because B4a1a1a and its descendent haplogroups achieve near fixation in Remote Oceania, this mutation became known as the “Polynesian motif.”10 The derived 16247G allele occurs only once in the entire human mtDNA phylogeny (Phylotree Build 14) and has been previously estimated to have arisen 5,000–7,000 years ago.13,14 It is well known that some positions in the mtDNA genome are hypermutable, exhibiting independent forward and back-mutations across multiple lineages in the mtDNA phylogeny.15 We report here a type of mutational instability in that the derived 16247G allele has undergone multiple independent back-mutations to the ancestral allele since its single origin in the entire mtDNA phylogeny. Moreover, we detect high levels of heteroplasmy at this position, suggesting ongoing back-mutations within samples with the 16247G allele. We also infer that the 16247G allele induces a novel stem-loop structure that may account for the instability associated with this mutation. 
Figure 1 The sequencing of 536 complete mtDNA genomes from haplogroup B4a1a1 and associated subhaplogroups was undertaken as part of a larger ongoing population genetics study of the Solomon Islands.16 Samples were collected with written informed consent in the Solomon Islands in 2004 with the approval of the Solomon Islands Ministry for Educational Training and the Ministry of Health and Medical services. Ethical approval for sample collection and study was obtained from the Ethics Commission of the University of Leipzig Medical Faculty. Libraries were prepared for multiplex sequencing following the protocol of Meyer and Kircher17 with modifications for mtDNA target enrichment by in-solution capture after Maricic et al.18 Libraries were sequenced on the Illumina Genome Analyzer IIx platform with single-end, 76 base-pair reads to an average coverage of 430× per sample. The haplogroup of each sample was determined with a custom script by comparison to PhyloTree Build 14.19 Most of the sequences belonged to haplogroups B4a1a1a (47%) and B4a1a1a1 (39%). An unusual trend was observed while verifying the haplogroup calls for some B4a1a1a1 sequences. Approximately 20% of the sequences that contained the diagnostic 6905G allele, indicating they belonged to haplogroup B4a1a1a1, had the 16247A allele instead of the expected 16247G allele that is diagnostic for the immediately ancestral haplogroup B4a1a1a (Figure 1). A network of all putative B4a1a1a1 samples (Figure 2) indicated that although the majority of samples with the ancestral A allele at position 16247 appeared to have arisen from a single back-mutation event, there were other samples that fell outside of this clade, indicating multiple independent back-mutations at 16247. 
These back-mutations at position 16247, which we denote as B4a1a1a1+16247!, appear to be widely prevalent because they are present in 14 of the 18 populations examined and account for up to 43% of the observed B4a1a1a1 haplotypes in these populations (Table 1).
Post by Admin on Jun 10, 2020 5:22:20 GMT
Figure 2 To see whether other mtDNA sequences classified in the B4a1a1 lineage are also influenced by back-mutations from 16247G, we produced a network of all 536 sequences that fell in the B4a1a1 lineage and sublineages thereof (Figure 3). Indeed, a group of 16 sequences that are all from Polynesian Outlier populations and had been assigned to the B4a1a1 haplogroup instead fell as a terminal node on a long branch of the B4a1a1a haplogroup (Figure 3). This suggests that these sequences mark an additional back-mutation at position 16247 from G to A involving the B4a1a1a haplogroup (Figure 1), making them indistinguishable from B4a1a1 with respect to haplotyping algorithms. Figure 3 In order to determine the time span over which these back-mutations have occurred, we dated the age of the mutations of the B4a1a1a haplogroup to be approximately 7,200 years old, and the younger B4a1a1a1 haplogroup (from which most of the back-mutations arose) to be approximately 6,350 years old (Table 2). We also generated several trees both through maximum likelihood and Bayesian analyses. A recurrent observation in these trees is that many of the haplogroups do not appear as monophyletic clades, with individual or small groups of B4a1a1 or B4a1a1a1+16247! sequences falling within the larger B4a1a1a or B4a1a1a1 haplogroups (see Figure S1 available online). The B4a1a1 lineage network provides an explanation because all of the sequences (with 16247A) that are scattered across the tree are involved in reticulations with a second haplogroup with the 16247G allele (Figure 3). This raises the possibility that these sites of reticulation are yet further independent back-mutation events at position 16247, which were not obvious from the network as a result of their connection with two haplogroups. We then searched the literature for other whole mtDNA genome sequences that might harbor the B4a1a1a1 + 16247! motif and identified three such sequences. 
All are identified as B4a1a1a1; one comes from the Bismarck Archipelago,13 one from Bougainville,20,21 and the third from an individual from coastal Papua New Guinea.22 This back-mutation pattern therefore appears to be of high prevalence in the Solomon Islands, which may reflect the high incidence of the B4a1a1a lineage in the region and/or the fact that our current study is the most in-depth study of complete mtDNA genomes in Oceania. Further study of other regions where these haplogroups are highly prevalent, i.e., Remote Oceania, would be of interest. These analyses have demonstrated multiple back mutations from the 16247G allele, which has arisen only once during human evolution. To investigate how unusual this is, we checked the mtDNA phylogeny for other occurrences of repeated back-mutations from a mutation that arose only once during human evolution. There are only three other positions that, like position 16247, have mutations that arose only once and then subsequently back-mutated on a terminal branch of the mtDNA phylogeny. However in all of these cases (forward mutations: position G92A defines haplogroup Q1, position C1703T defines haplogroup N1b, and position C3780T defines haplogroup M49, PhyloTree Build 14) both the forward and back-mutation events are much older than position 16247, with the forward mutations dated to 37.5, 20, and 30 Kya respectively.14 The back-mutations in these cases are 15, 9, and 7 Kya, respectively, and define haplogroups Q1b, N1b1a, and M49b14 (PhyloTree Build 14). The ages of these forward mutations and their respective back-mutations would suggest that they are far more stable than the mutational pattern we have observed at position 16247: the forward mutations were stable for at least 10,000 years before the back-mutations arose. In contrast, the forward mutation at position 16247 and the multiple subsequent back-mutations have all occurred within the last 7,200 years (Table 2). 
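The stability comparison above is simple arithmetic on the quoted ages. A minimal sketch follows, using only the forward and back-mutation dates given in the text (in thousands of years); the dictionary layout and variable names are illustrative and do not come from the study.

```python
# Ages in thousands of years (ky), as quoted in the text from PhyloTree Build 14.
# Each entry: (age of forward mutation, age of back-mutation).
events = {
    "G92A (Q1, back-mutation defines Q1b)": (37.5, 15.0),
    "C1703T (N1b, back-mutation defines N1b1a)": (20.0, 9.0),
    "C3780T (M49, back-mutation defines M49b)": (30.0, 7.0),
}

# Time the derived allele persisted before the back-mutation arose.
gaps = {name: forward - back for name, (forward, back) in events.items()}

# Upper bound for position 16247: the forward mutation and all of its
# back-mutations fall within the estimated age of haplogroup B4a1a1a.
age_16247_ky = 7.2

for name, gap in gaps.items():
    print(f"{name}: stable for about {gap:.1f} ky")
print(f"16247: forward and all back-mutations within {age_16247_ky} ky")
```

The shortest gap among the three comparison positions is 11 ky, which is the basis for the statement that the forward mutations were stable for at least 10,000 years.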
Because our study sequenced the lineages within one haplogroup to such great depth and numbers, it has enabled us to discover these independent back-mutation events. Further sequencing may identify other back mutation events within the Q1, N1b, and M49 lineages or within other lineages. Nevertheless, the instability associated with position 16247 (namely, a single forward mutation followed by multiple independent back mutations within a few thousand years) appears to be unique in the human mtDNA genome. Figure 4 Sequences inferred to have back-mutations were examined for evidence of contamination by inspecting the aligned sequence reads from the GAIIx; instead of contamination, we found evidence for potential heteroplasmy in some samples. We then examined the sequence reads for potential heteroplasmies at position 16247 and all other diagnostic positions for the B4a1 lineage in all 536 samples. Based on the criterion of requiring at least three forward and three reverse reads for each allele at position 16247, we identified 54 samples which were potentially heteroplasmic; additionally, requiring the minor allele to have a frequency of at least 20% reduced this to 11 potentially heteroplasmic samples.23 No other diagnostic positions showed an equivalent level of potential heteroplasmy, and many positions showed none at all, which suggests that the potential heteroplasmies observed are not the result of sample contamination (Table S1). To further investigate this, we examined the trace files from the mtDNA HVR-1 Sanger sequencing that had been previously performed on these samples16 and found that six of the nine samples of interest who had trace files of sufficient quality also clearly exhibited heteroplasmy at position 16247 (Figure S2). Thus, there seems to be an excess of heteroplasmy at position 16247 in samples with predominantly the 16247G allele, suggesting that there may be ongoing somatic mutations and/or transmission of heteroplasmy at this position. 
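The two heteroplasmy criteria described above (at least three forward and three reverse reads per allele, then a minor allele frequency of at least 20%) can be sketched as a small filter. This is an illustration under stated assumptions, not the authors' pipeline: the `AlleleCounts` class, the function names, and the returned labels are hypothetical, and per-strand read counts are assumed to have been extracted from the alignments already.

```python
from dataclasses import dataclass


@dataclass
class AlleleCounts:
    """Read counts for one allele at one position, split by strand (hypothetical container)."""
    forward: int
    reverse: int

    @property
    def total(self) -> int:
        return self.forward + self.reverse


def passes_strand_filter(a: AlleleCounts, g: AlleleCounts, min_per_strand: int = 3) -> bool:
    # Both alleles must be supported by enough reads on both strands.
    return all(
        c.forward >= min_per_strand and c.reverse >= min_per_strand
        for c in (a, g)
    )


def minor_allele_frequency(a: AlleleCounts, g: AlleleCounts) -> float:
    depth = a.total + g.total
    if depth == 0:
        return 0.0
    return min(a.total, g.total) / depth


def classify_16247(a: AlleleCounts, g: AlleleCounts,
                   min_per_strand: int = 3, min_maf: float = 0.20) -> str:
    """Apply the relaxed criterion first, then the stricter frequency cut-off."""
    if not passes_strand_filter(a, g, min_per_strand):
        return "no heteroplasmy call"
    if minor_allele_frequency(a, g) >= min_maf:
        return "potentially heteroplasmic (strict)"
    return "potentially heteroplasmic (relaxed)"
```

For instance, a sample with 50 forward and 50 reverse reads for A against 150 forward and 150 reverse reads for G has a minor allele frequency of 0.25 and passes both criteria, whereas 3 and 3 reads for A against 200 and 200 for G passes only the strand criterion.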
We next investigated potential reasons for the apparent instability of the derived 16247G nucleotide. This position is located in the noncoding control region and therefore would not have any deleterious effect on any mitochondrial genes; moreover, a search of the literature does not indicate any role for this position in replication, transcription, or other regulatory processes. However, the 16247G nucleotide appears to have an effect on DNA secondary structure, with the derived G nucleotide associated with a 10 bp stem-loop structure, beginning at position 16247, that is not predicted to occur with the ancestral A nucleotide at position 16247 (Figure 4). Given the repeated and rapid back-mutation events associated with the derived G nucleotide at this position, we hypothesize that this stem-loop structure increases the instability of the mtDNA genome and/or influences the rate of replication or transcription. Previously, the observation of a back-mutation at position 16247 in one sequence reported from Papua New Guinea22 was suggested to reflect recombination associated with heteroplasmy maintained over several generations.24 The multiple back-mutation events, identified because they occurred after several other mutations were acquired in various sequences, render recombination highly unlikely, because it is then not clear why recombination would influence only position 16247 and not any of the other mutations in these sequences. This instability in position 16247 creates uncertainty for calling haplogroups within the B4a1 lineage. If the 16247G allele is present, haplogroup calling can simply follow established procedures (Figure 1). Indeed, even the back-mutation to the 16247A allele on the background of haplotype B4a1a1a1 is easy to classify, provided that whole mitochondrial genome sequencing has been completed and the state at position 6905 can be assessed. 
However, given the multiple independent back-mutation events at this position, classifying all of these as a single haplogroup (e.g., B4a1a1a1+16247!) has the undesirable effect of creating a paraphyletic haplogroup. Moreover, even more difficulty with haplogroup assignment arises with those sequences that appear to have back-mutated to 16247A from a background of B4a1a1a, because these sequences are indistinguishable from B4a1a1. For example, one heteroplasmic individual (VL08) falls into haplogroup B4a1a1 with no private mutations (Figure 3; Figure S2); without the existence of heteroplasmy it would be impossible to distinguish this sample from either B4a1a1 or B4a1a1a and indeed we cannot determine the direction of mutation in this sample (Figure 3; Figure S2). The “Polynesian motif” is therefore still a useful marker, but its tendency to revert to the ancestral nucleotide underscores the importance of whole genome sequencing in place of control-region sequencing, and the further importance of investigating the relationship among individual sequences, for example in a network or tree, to identify back-mutation events that otherwise may be missed by haplotyping algorithms. This study also illustrates the value of high coverage sequencing of whole mtDNA genomes for detecting heteroplasmy and the benefits of sequencing large numbers of samples from single haplogroups for discovering unexpected and unusual mutation events. In particular, here we observed a different, and to our knowledge unique, phenomenon of a recent mutation that has occurred only once in the human mtDNA phylogeny and has subsequently undergone multiple independent back-mutations to the ancestral state.
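The haplogroup-calling ambiguity described above can be made concrete with a minimal sketch. It assumes the sample has already been placed in the B4a1a1 lineage on the basis of its other diagnostic positions, and it uses only the two markers named in the text (16247 and 6905). The function name and the wording of the returned labels are hypothetical; this is not the custom script used in the study.

```python
def call_b4a1a1_sublineage(allele_16247: str, allele_6905: str) -> str:
    """Assign a sublineage within B4a1a1 from the alleles at positions 16247 and 6905.

    Assumption: any allele other than G at either position is treated as ancestral.
    """
    has_16247g = allele_16247.upper() == "G"
    has_6905g = allele_6905.upper() == "G"

    if has_6905g and has_16247g:
        return "B4a1a1a1"
    if has_6905g:
        # 6905G places the sample in B4a1a1a1, so 16247A must be a back-mutation.
        return "B4a1a1a1+16247!"
    if has_16247g:
        return "B4a1a1a"
    # Without 6905G, a 16247A sample cannot be told apart from a B4a1a1a
    # sequence that has reverted; a network or tree is needed to decide.
    return "B4a1a1 (or undetected 16247 back-mutation from B4a1a1a)"
```

The last branch is the case a marker-based script cannot resolve on its own, which is why the text recommends inspecting the relationships among sequences in a network or tree.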
Post by Admin on Jun 26, 2020 6:45:16 GMT
Human genetics of the Kula Ring: Y-chromosome and mitochondrial DNA variation in the Massim of Papua New Guinea European Journal of Human Genetics volume 22, 1393–1403(2014) Abstract The island region at the southeastern-most tip of New Guinea and its inhabitants known as Massim are well known for a unique traditional inter-island trading system, called Kula or Kula Ring. To characterize the Massim genetically, and to evaluate the influence of the Kula Ring on patterns of human genetic variation, we analyzed paternally inherited Y-chromosome (NRY) and maternally inherited mitochondrial (mt) DNA polymorphisms in >400 individuals from this region. We found that the nearly exclusively Austronesian-speaking Massim people harbor genetic ancestry components of both Asian (AS) and Near Oceanian (NO) origin, with a proportionally larger NO NRY component versus a larger AS mtDNA component. This is similar to previous observations in other Austronesian-speaking populations from Near and Remote Oceania and suggests sex-biased genetic admixture between Asians and Near Oceanians before the occupation of Remote Oceania, in line with the Slow Boat from Asia hypothesis on the expansion of Austronesians into the Pacific. Contrary to linguistic expectations, Rossel Islanders, the only Papuan speakers of the Massim, showed a lower amount of NO genetic ancestry than their Austronesian-speaking Massim neighbors. For the islands traditionally involved in the Kula Ring, a significant correlation between inter-island travelling distances and genetic distances was observed for mtDNA, but not for NRY, suggesting more male- than female-mediated gene flow. As traditionally only males take part in the Kula voyages, this finding may indicate a genetic signature of the Kula Ring, serving as another example of how cultural tradition has shaped human genetic diversity. 
Introduction
Oceania represents a vast geographic area with a complex human settlement history.1, 2 Previous studies have addressed human genetic variation in Oceania, particularly Polynesia,3, 4, 5, 6, 7, 8 mainland New Guinea9, 10, 11, 12, 13 and Island Melanesia.14, 15, 16, 17, 18 One area of Near Oceania (NO) yet understudied from a human genetic perspective is the island region off the southeastern tip of New Guinea (Figure 1). Administratively designated the Milne Bay Province of Papua New Guinea (PNG), this area encompasses the D'Entrecasteaux Islands (ie, Normanby, Fergusson, Dobu and Goodenough), the Trobriand Islands, the Woodlark group (ie, Gawa, Woodlark and the Laughlan Islands), the Louisiade Archipelago (ie, Misima, Sudest, Rossel and the islands of the Calvados chain), as well as a portion of the nearby PNG mainland. The inhabitants of this region have been designated as Massim,19, 20, 21 a term that has since been used to refer not only to the people but also to the geocultural region inhabited by them.22 From the many islands of the Massim, only the Trobriand Islands were included in previous human genetic studies.4, 7, 11, 16, 23 Hence, a human genetic description of the Massim is lacking so far, despite the major attention they have received in the cultural anthropology literature, in particular with respect to the Kula, a traditional inter-island trading system described in more detail below.
Figure 1 Human settlement of mainland New Guinea goes back at least 40–50 thousand years: the Ivane Valley in eastern PNG was occupied 43–49 thousand years ago (kya).24 The Solomon Islands (ie, the archipelago east of the Milne Bay area) were occupied by at least 28 kya.25 Archeological findings in the Massim are scant26 and there is currently no evidence of long-term human occupation before 2 kya.27, 28 Given the lower sea levels during the Pleistocene, many of the current Massim islands may have been connected by land bridges,29 potentially facilitating human migration between them. The languages spoken in the Massim belong to the Papuan Tip cluster within the Western Oceanic branch of the Austronesian language family,30 with the exception of Rossel Island, the easternmost island of the Louisiade Archipelago, where a non-Austronesian (Papuan) language is spoken.31 Among a number of the islands in the Milne Bay Province an extensive trading system, referred to as Kula or Kula Ring has traditionally developed.32, 33, 34, 35, 36, 37 This Kula exchange system was extensively described by anthropologist Bronislaw Malinowski in his classic work ‘Argonauts of the western Pacific’33 and has since become an oft-cited anthropological example of balanced reciprocity. This particular trading system assures that items only available on some islands, for instance, because of unequal geographic distribution of natural resources, but vitally needed on other islands, are shared among people from different islands. The Kula is centered around the exchange of two valuables in a ring-like manner: necklaces, called soulava, are moved in clockwise direction through the island world, whereas armshells, called mwali, do so in the opposite direction (Figure 1). Notably, only men participate in the Kula; they sail to fixed trading partners on other Kula islands and it is not uncommon for them to stay away from home for several months. 
The islands of the southeastern Massim are not extensively involved in the Kula. Only Misima is sometimes mentioned as a Kula partner, but trading with Misima is probably much less intense as compared with the northern and western Massim.35, 38 The presence of the Kula trading system, involving regular migrations between islands with lengthy stays away from home, raises the question whether besides the economic exchange also genetic exchange takes place. If the male-specific migration of the Kula indeed left detectable signatures in the genomes of the contemporary populations from the region, this should be evident by studying the paternally inherited non-recombining portion of the Y-chromosome (NRY), in comparison with the maternally inherited mitochondrial (mt) DNA. If true, the Kula would serve as another example of the impact of human culture on genetic variation as has been observed before, for example, for residence patterns39 and social stratification.40 Furthermore, it is of interest to investigate to what extent the inhabitants of Rossel Island, given that they are the only non-Austronesian (ie, Papuan) speakers of the Massim, differ genetically from the other, Austronesian-speaking people of the Massim, in particular their direct neighbors from Sudest. Although human contacts between Rossel and Sudest certainly existed—geographic distance between the two islands is small and sea-crossing from one to the other is feasible—the Rossel society has been described as endogamous,41 suggesting limited or no genetic exchange with Sudest, a situation that could have promoted genetic divergence between the two populations. With these questions in mind, we analyzed NRY and mtDNA polymorphisms in individuals from across the Massim area (Figure 1). 
In reconstructing the demographic history of Oceania, our results not only fill an important gap between previous genetic studies on the mainland of New Guinea in the west,9, 10, 11, 12, 13 the Bismarck Archipelago in the north14, 15, 16 and the Solomon Islands in the east,17, 18 but also provide further insights into how human culture impacts on human genetics.
Post by Admin on Jun 26, 2020 21:06:57 GMT
Results and Discussion NRY and mtDNA haplogroups in the Massim In total, 13 NRY haplo-/paragroups were detected, all falling within the major clades C, M, S, O and K* (Table 1). In particular, seven haplo-/paragroups encompass the bulk (97.7%) of NRY diversity (Figure 2a). Y-haplogroups O-M119*(xM110), O-M110 and O-M324*(xM7,M134) are of Asian/Austronesian (AS) origin, whereas haplogroups C-M208*(xP33,P54), M-P34*(xM83) and S-M254*(xM226) are of NO origin, as previously described.7, 16 The K-M9*(xP79,M4,M353,P117,M214,M74,M230) paragroup is particularly frequent in the Massim (overall: 33.4%) and probably includes several yet undefined sublineages of K-M9. A Y-STR haplotype network analysis for K-M9* (Supplementary Figure S1) indeed showed that its haplotypes are quite diverse. Given that K-M9* was not found in East Asia and only sporadically in Southeast Asia in our data set, whereas it is rather frequent in parts of NO such as in the Admiralty Islands (27.0%), a NO origin for the Massim K-M9* Y-chromosomes observed here seems most likely. NRY haplogroups were not homogeneously distributed throughout the Massim (Figure 2a). The NO haplogroup M-P34* and the AS haplogroup O-M324* were more prominent in the northern and western Massim, whereas the NO haplogroup S-M254* was more frequent in the southeastern Massim. Furthermore, Rossel stood out from its local neighbors because of its high frequency of AS haplogroup O-M110 and complete absence of NO haplogroup C-M208*. The Wanigela group from the Collingwood Bay showed a remarkably low haplogroup diversity (Supplementary Table S5) with over 90% belonging to M-P34*. Figure 2 Geographic distribution of (a) the seven major Y-chromosome haplogroups and (b) the nine major mtDNA haplogroups observed in the Massim area (see Tables 1 and 2 for complete haplogroup frequency data). 
For comparison, the gross frequencies of these haplogroups as observed in other regions of Asia/Oceania, pooled from different population samples, are included as well using previously published data.7, 10, 16, 18, 42 For practical reasons, the PNG Highlands group includes Kapuna, a Papuan-speaking group from the Gulf Province of PNG with an assumed origin in the highlands, while the PNG Coast group includes Bereina, an Austronesian-speaking group from the southern coastal area of PNG. We distinguished 12 different mtDNA haplo-/paragroups in the Massim, all falling within the mtDNA clades P, Q, E, B4, B5, F and R23 (Table 2). In particular, nine haplo-/paragroups accounted for 96.5% of the mtDNA pool (Figure 2b). Haplogroups F1a, B4-16261*(xB4a1a1a) and B4a1a1a (also known as the ‘Polynesian motif’) are of (ultimate) AS origin,51, 52 although B4a1a1a may also have originated in descendants of East Asians residing in Nusa Tenggara53, 54, 55 or in the Bismarck Archipelago.8 Haplogroups E*(xE2) and E2 are likely of Taiwanese/Island Southeast AS origin56 (here also classified as AS), whereas haplogroups P*(xP1), P1, Q1 and Q2 have a NO origin.57 Like the NRY haplogroups, also the mtDNA haplogroups were not homogeneously distributed throughout the Massim (Figure 2b). NO haplogroup P1 was much more frequent in the southeastern Massim, while AS haplogroup B4-16261* was almost absent there, and NO haplogroups Q1 and Q2 were almost only found in the western Massim. Rossel again stood out from its local neighbors, because of its major component of AS haplogroup E* and low frequency of NO haplogroup P*. Interestingly, Sudest showed a relatively high frequency of the AS haplogroup F1a, which was otherwise only detected sporadically (single individuals) in the eastern Calvados and in Gawa but not in any of the other Massim groups studied. 
The presence of haplogroup R23 in a single individual from the western Calvados was unexpected, as this haplogroup has so far only been observed much more westward in Nusa Tenggara58 and among Cham from Vietnam.59

Table 2 Observed mtDNA haplogroup frequencies (%) in the Massim area
Population n Q*(a) Q1(a) Q2(a) P*(a) P1(a) B4*(b) B4-16261*(b) B4a1a1a(b) B5b(b) F1a(b) E*(b) E2(b) R23(b)
Collingwood Bay
Wanigela 23 4.3 34.8 8.7 26.1 4.3 21.7
Airara 25 4.0 24.0 32.0 24.0 4.0 12.0
Western Massim
Fergusson 45 6.7 6.7 20.0 6.7 4.4 8.9 37.8 2.2 2.2 4.4
Normanby 22 4.5 27.3 4.5 63.6
Mainland eastern tip 31 3.2 19.4 3.2 9.7 9.7 12.9 29.0 3.2 9.7
Northern Massim
Trobriand Islands 47 6.4 2.1 23.4 4.3 25.5 27.7 10.6
Gawa 16 37.5 37.5 12.5 6.3 6.3
Woodlark 19 10.5 36.8 26.3 26.3
Laughlan Islands 6 16.7 16.7 16.7 16.7 33.3
Southeastern Massim
Misima 18 27.8 55.6 16.7
Western Calvados 14 28.6 35.7 28.6 7.1
Eastern Calvados 17 58.8 29.4 5.9 5.9
Sudest 39 48.7 12.8 7.7 2.6 28.2
Rossel 110 5.5 35.5 6.4 46.4 6.4
(a) Assigned a Near Oceanian origin following Kayser M2 and Friedlaender et al.57
(b) Assigned an Asian origin following Kayser M2; Friedlaender et al51; Trejaut et al52; Soares et al56; Hill et al58 and Peng et al.59
Post by Admin on Jun 27, 2020 5:33:32 GMT
AS versus NO genetic ancestry in the Massim To quantify the relative contributions of NO versus AS paternal and maternal ancestors to the gene pool of the Massim people, we assigned, based on previous knowledge, the most probable ancestral origin to each of the observed NRY and mtDNA haplogroups (Table 3). Overall, the proportion of AS mtDNA haplogroups in the Massim (52.3%) was more than two times higher than that of AS NRY haplogroups (23.9%). Such an excess of AS mtDNAs compared with AS Y-chromosomes was previously also observed in Admiralty Islanders north of the PNG mainland (60.7% AS mtDNA versus 18.2% AS NRY),16 in Solomon Islanders (excluding Polynesian outliers; 77.7% AS mtDNA versus 27.5% AS NRY)18 and in Polynesians from Remote Oceania (96.4% AS mtDNA versus 34.6% AS NRY).7, 16 This pattern is consistent with a historical admixture scenario that involved mainly AS women and mainly NO men in NO, perhaps the Bismarck Archipelago, before the occupation of Remote Oceania, in line with the previously suggested Slow Boat from Asia hypothesis of Polynesian origin in particular and the Austronesian origin in general.4, 7, 16, 60 Within the Massim, the people from the northern islands carried the highest amount of AS ancestry, while relatively low amounts of AS ancestry were seen in the southeastern Massim. However, the non-Austronesian-speaking Rossel Islanders formed an exception, showing a larger AS proportion both for NRY and mtDNA than their direct Austronesian-speaking neighbors, a finding that contradicts the expectation based on linguistics (for more details on Rossel see below). 
Population | NRY n | NRY NO ancestry (%) | NRY AS ancestry (%) | NRY unknown ancestry (%) | mtDNA n | mtDNA NO ancestry (%) | mtDNA AS ancestry (%) | mtDNA unknown ancestry (%)
Collingwood Bay
Wanigela | 21 | 100.0 | — | — | 23 | 73.9 | 26.1 | —
Airara | 32 | 93.8 | 6.3 | — | 25 | 60.0 | 40.0 | —
Western Massim
Fergusson | 35 | 77.1 | 22.9 | — | 45 | 44.4 | 55.6 | —
Normanby | 27 | 66.7 | 33.3 | — | 22 | 36.4 | 63.6 | —
Mainland eastern tip | 24 | 79.2 | 20.8 | — | 31 | 45.2 | 54.8 | —
Northern Massim
Trobriand Islands | 60 | 63.3 | 36.7 | — | 47 | 31.9 | 68.1 | —
Gawa | 10 | 40.0 | 60.0 | — | 16 | 37.5 | 62.5 | —
Woodlark | 13 | 46.2 | 53.8 | — | 19 | 10.5 | 89.5 | —
Laughlan Islands | 5 | 80.0 | 20.0 | — | 6 | 16.7 | 83.3 | —
Southeastern Massim
Misima | 14 | 92.9 | 7.1 | — | 18 | 83.3 | 16.7 | —
Western Calvados | 19 | 73.7 | 26.3 | — | 14 | 64.3 | 35.7 | —
Eastern Calvados | 11 | 90.9 | 9.1 | — | 17 | 88.2 | 11.8 | —
Sudest | 38 | 94.7 | 5.3 | — | 39 | 61.5 | 38.5 | —
Rossel | 80 | 70.0 | 30.0 | — | 110 | 40.9 | 59.1 | —
Total | 389 | 76.1 | 23.9 | — | 432 | 47.7 | 52.3 | —
East Asia (a) | 113 | — | 92.0 | 8.0 | 121 | — | 79.3 | 20.7
Southeast Asia (a) | 205 | 9.8 | 82.4 | 7.8 | 199 | — | 80.4 | 19.6
Nusa Tenggara (a,b) | 373 | 85.5 | 13.7 | 0.8 | 31 | 9.7 | 83.9 | 6.5
WNG Lowland (a,c) | 90 | 100.0 | — | — | 121 | 84.3 | 3.3 | 12.4
WNG Highlands (a,c) | 95 | 98.9 | 1.1 | — | 107 | 92.5 | — | 7.5
PNG Highlands (a) | 73 | 98.6 | 1.4 | — | 72 | 91.7 | 1.4 | 6.9
PNG Coast (a) | 65 | 81.5 | 18.5 | — | 80 | 32.5 | 67.5 | —
Admiralty Islands (d) | 148 | 81.8 | 18.2 | — | 145 | 37.9 | 60.7 | 1.4
Solomon Islands (e) | 712 | 72.5 | 27.5 | — | 703 | 20.9 | 77.7 | 1.4
Polynesia (a) | 315 | 63.2 | 34.6 | 2.2 | 306 | 3.6 | 96.4 | —
Abbreviations: AS, Asian; NO, Near Oceanian.
(a) Kayser et al.7 (b) Mona et al.42 (c) Tommaseo-Ponzetta et al.10 (d) Kayser et al.16 (e) Delfin et al.18

Massim genetic population substructure and the Kula
Genetic distances between the Massim groups calculated from NRY and mtDNA haplogroup/haplotype data (Supplementary Table S6) were visualized in MDS plots (Figure 3). Wanigela appears as a strong outlier in both NRY-based plots, which can be explained by its exceptionally high frequency of haplogroup M-P34* and the complete lack of AS NRY haplogroups.
Airara takes an outlier position only in the Y-STR-based plot, but not in the NRY-haplogroup-based plot, which can be explained by the fact that Airara's K-M9* Y-STR haplotypes are quite distinct from the K-M9* haplotypes in other Massim groups (Supplementary Figure S1). In contrast, neither Wanigela nor Airara are outliers in the mtDNA-based plots. Both groups come from the coast of mainland PNG (Collingwood Bay), on the border with the Massim area, and were included in this study because Goodenough, one of the islands in the western Massim, is visible from Collingwood Bay, and therefore people from Wanigela and Airara may be involved in admixture processes with the Massim. Moreover, archeology has revealed the presence of prehistoric pottery in the Trobriand Islands that originated from the Collingwood Bay61, 62 (modern pottery in the Trobriands comes mostly from the Amphlett Islands). Nearly all NO haplogroups found in the Massim are also found in the Collingwood Bay, hence our results do not exclude the possibility that the Massim have ancestral ties in the Collingwood Bay, in line with the archeological evidence. However, a pairwise haplotype-sharing analysis (Table 4) did not reveal increased haplotype sharing between the Collingwood groups and the nearest sampled group of Fergusson, nor the Trobriand Islanders, suggesting that genetic exchange between these groups was rather limited. The other genetic outlier particularly for Y-STR and mtDNA haplotypes (less so for NRY/mtDNA haplogroups) is Rossel (Figure 3) (for further details on Rossel see below). Figure 3 Apart from the outliers, the positioning of the sampled groups is in good agreement with geography. When we repeated the MDS analysis without the Airara, Wanigela and Rossel groups (Supplementary Figure S2), a strong north-south correlation with geography along the first dimension was seen in all four plots. 
Notably, Seligmann21 and Malinowski33 had provisionally subdivided the Massim, on ethnographic grounds, into a northern and a southern portion. We considered and tested several alternative subdivisions by means of AMOVA, while leaving out Wanigela and Airara for the reason explained above (Supplementary Table S7). The grouping that explained the largest proportion of among-group variation for both NRY and mtDNA data was a division into three groups: (1) the western plus northern Massim, (2) the southeastern Massim excluding Rossel and (3) Rossel. This grouping explained 15.98% (P<0.001) of the among-group variation for mtDNA haplotypes, 14.78% (P<0.001) for mtDNA haplogroups, 10.18% (P<0.001) for Y-STR haplotypes and 7.91% (P<0.001) for NRY haplogroups. When performing an AMOVA with the whole Massim as one group (again excluding Wanigela and Airara), the among-populations percentage was 6.47% (P<0.001) for NRY haplogroups and 7.37% (P<0.001) for Y-STR haplotypes, whereas for mtDNA haplogroups and haplotypes this was 17.99% (P<0.001) and 18.33% (P<0.001), respectively. Although the Massim Y-STR value (7.37%) is lower than that obtained for the Admiralties (10.31%) and Solomons (ex-Polynesian outliers) (11.09%), the Massim mtDNA haplotype value (18.33%) is high compared with that of the Admiralties (12.3%) and Solomons (ex-Polynesian outliers) (13.1%) (Supplementary Table S8). This comparative result suggests that the Massim are more structured mtDNA-wise than NRY-wise, and more so than other regions of NO studied so far.

We furthermore investigated the putative effect of the Kula trading system on genetic population substructure of the Massim. As the Kula trade occurs between islands in a circular manner in both clockwise and counter-clockwise direction depending on the objects traded, we modeled the Kula system as a ring of participating trading partners that can exchange goods with adjacent partners (Figure 1). 
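The ring model can be illustrated with a short Python sketch. It assumes that the "circular trading distance" between two partners is the smaller number of exchange steps around the ring in either direction; the quoted text does not state the exact definition, and the partner labels below are placeholders rather than the actual Kula communities.

```python
def ring_distance(i, j, n):
    """Fewest exchange steps between ring positions i and j among n partners.

    Trade can run clockwise or counter-clockwise, so the shorter of the two
    paths around the ring is used (an assumption made for this sketch).
    """
    if n <= 0:
        raise ValueError("the ring needs at least one partner")
    step = abs(i - j) % n
    return min(step, n - step)


def ring_distance_matrix(partners):
    """Symmetric matrix of ring distances for partners listed in ring order."""
    n = len(partners)
    return [[ring_distance(i, j, n) for j in range(n)] for i in range(n)]


# Placeholder labels, listed in their order around the hypothetical ring.
partners = ["A", "B", "C", "D", "E", "F"]
dist = ring_distance_matrix(partners)
for name, row in zip(partners, dist):
    print(name, row)
```

With six partners, neighbours are one step apart and the partner on the opposite side of the ring is three steps away, regardless of the direction of travel.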
Notably, however, not all Massim islands participate in the Kula. Of the populations included in this study, the Calvados chain islands, Sudest and Rossel as well as the PNG mainland populations of Airara and Wanigela are not known to be involved in the Kula33 and were therefore excluded from the model. Furthermore, the Laughlan Islanders, who may be only marginally involved in the Kula,37 were excluded because of small sample size. Genetic distances appropriate for the marker type (Supplementary Table S6) were compared via Mantel testing with circular trading distances. For NRY (both at the haplogroup level and at the haplotype level) no statistically significant correlation was observed, whereas for mtDNA a significant correlation was observed both for haplotype data (0.42; P=0.029) and for haplogroup data (0.47; P=0.019). This result can be explained by predominantly male-mediated gene flow between islands involved in the Kula, which would have a homogenizing effect on Y-chromosome diversity but not on the mtDNA pool. As only men traditionally participate in the Kula voyages, this finding may indicate a genetic signature of the Kula Ring. However, the mtDNA-based correlation decreased and became nonsignificant when Misima, which is reported to be less intensively involved in the Kula,35,38 was excluded.
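For readers unfamiliar with Mantel testing, the following is a minimal pure-Python sketch of a permutation-based Mantel test between a genetic distance matrix and a trading distance matrix. It is not the authors' software or settings: the Pearson correlation, the one-sided test, the number of permutations and the toy matrices are all assumptions chosen for illustration.

```python
import math
import random


def upper_triangle(matrix, order=None):
    """Flatten the upper triangle, optionally relabelling rows and columns."""
    n = len(matrix)
    idx = order if order is not None else list(range(n))
    return [matrix[idx[i]][idx[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("distances must not all be identical")
    return sxy / math.sqrt(sxx * syy)


def mantel(dist_a, dist_b, permutations=999, seed=0):
    """One-sided Mantel test: returns (observed r, permutation P value)."""
    rng = random.Random(seed)
    fixed = upper_triangle(dist_b)
    observed = pearson(upper_triangle(dist_a), fixed)
    order = list(range(len(dist_a)))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)  # permute rows and columns of dist_a together
        if pearson(upper_triangle(dist_a, order), fixed) >= observed - 1e-12:
            hits += 1
    # The observed arrangement counts as one of the permutations.
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical toy matrices for four groups (not data from the study).
genetic = [
    [0.0, 0.1, 0.3, 0.2],
    [0.1, 0.0, 0.2, 0.3],
    [0.3, 0.2, 0.0, 0.1],
    [0.2, 0.3, 0.1, 0.0],
]
trade = [
    [0, 1, 2, 1],
    [1, 0, 1, 2],
    [2, 1, 0, 1],
    [1, 2, 1, 0],
]
r, p = mantel(genetic, trade)
print(f"r = {r:.3f}, P = {p:.3f}")
```

Rows and columns are permuted jointly because the entries of a distance matrix are not independent observations; shuffling the group labels keeps each matrix internally consistent while breaking any association between the two. With only four groups there are just 24 distinct relabellings, so the P value from this toy example is necessarily coarse.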