|
Post by Admin on Apr 15, 2018 18:36:07 GMT
Almost all Europeans speak Indo-European (IE) languages, and certainly not by chance. However, the place of origin of the first IE speakers has not been identified, nor has any consensus emerged about the time and the mechanisms by which IE languages spread over Western Eurasia. Based on the linguistic and archaeological evidence, the earliest speakers of a proto-IE language have often been identified with people living in the Pontic steppes about 6000 years ago [1], with their subsequent westward diffusion occurring in parallel with that of the Kurgan [2], or the Yamna [3] pastoral cultures. Conversely, genetic evidence of demic diffusion from the Near East into Europe [4] led Renfrew [5,6] to propose an earlier spread of Indo-European from Anatolia (9500–8000 years ago), through a single expansion carrying Neolithic technologies, genes and languages into much of Europe. In principle, language change does not need to be accompanied by migration, because cultural contacts, or a combination of cultural and demographic changes, may also lead to changes at the linguistic level. A common feature of the Kurgan and Anatolian models is that both postulate a migration fuelled by cultural innovations, horse riding or farming, respectively, which in turn facilitated language spread. This means that the spread of cultural novelty, documented by archaeological and linguistic evidence, has entailed demographic changes, which in principle have left their signature at the genomic level. The majority of linguists currently support the hypothesis of a late, Pontic spread, here referred to as the Kurgan hypothesis [7], although Bayesian analyses of linguistic variation [8,9] seem to be easier to reconcile with an early diffusion of IE languages from Anatolia. However this inference is highly sensitive to prior assumptions, and when a different set of plausible prior assumptions is used, the same methods support the chronology suggested by the Kurgan hypothesis [10]. Recently, genetic data have provided strong new evidence relevant to this debate. There is ample genetic evidence that extensive migration accompanied the European spread of Neolithic technologies from the Near East [11–14]. Neolithic farmers came to occupy territories once inhabited by Mesolithic hunters and gatherers [15–20] which, for the sake of simplicity, we shall jointly consider here as hunter-gatherers. However, the hunter-gatherers did not go extinct, as data from Central Europe and Spain suggest a resurgence of a genomic component associated with them, during the Middle and late Neolithic [17,18,20]. There is now genetic evidence of population movements from the Russian steppes into Central Europe in the Bronze Age [20,21] and Iron Age [22]. These processes may or may not have had large-scale consequences at the demographic and linguistic level, but the later expansion would be consistent with a spread of languages associated with the Kurgan hypothesis [18,21]. Figure 1. Geographical location and timescale of the ancient individuals. Sampling locations and ages for the ancient samples. The colours and the symbols for each population are the same in all the analyses. In bold, the samples included in the optimized dataset. See also electronic supplementary material, table S2. In the great majority of ancient DNA studies, migration and admixture processes were not explicitly modelled, but instead inferred from levels of genetic resemblance among samples, as shown by principal component analysis (PCA) plots, clustering, and fitting of admixture graphs and clade tests based on f3- and f4-statistics (e.g. [18,20,21]). While giving a general overview of the data, these exploratory methods do not allow for formal comparison of alternative models, nor do they estimate parameters such as migration rates and population sizes. For that purpose, and to obtain insight into the origins of genomic variation in Middle Neolithic Central Europe, we collected and typed samples of 17 individuals from the Megalithic barrow of Kierzkowo (Poland), which is archaeologically assigned to the Globular Amphorae culture (GAC). The GAC is documented in Central and Eastern Europe, from the Elbe to the middle Dnieper, around 5400–4800 BP. It plays a crucial role in this debate because it has been argued to be associated with the first Indo-European migrations based on its burial rituals, including burial of livestock, usage of domestic horse, and presence of amber sun-disks [1]. Gimbutas [23] argued that when the Kurgan culture expanded from its homeland in the steppe and forest-steppe of Ukraine and South Russia, it did so in three waves, thus leading to the diffusion of the IE languages (see electronic supplementary material, figure S1). The GAC people are regarded by Gimbutas as part of the first wave, associated with the spread of the Yamna culture from the Pontic region to the Danube basin and the Balkans, between 5100 and 4900 BP [1]. If Gimbutas' theory is correct, the people of the GAC should have Yamna related admixture, as well as genetic affinity to the populations associated with the later, Bell Beaker culture, documented in many areas of Europe 4800 to 3800 BP
|
|
|
Post by Admin on Apr 16, 2018 18:41:41 GMT
(i) Samples and sequencing Seventeen specimens, bones and teeth, were selected (electronic supplementary material, table S4) and their mtDNAs were analysed independently in two different laboratories (see Mitochondrial data in the electronic supplementary material and electronic supplementary material, table S5). We collected from the literature ancient mtDNA data from the same populations described in the nuclear data section (figure 1 and electronic supplementary material, table S6). The sequences of 213 samples available in FASTQ format were analysed applying the same pipeline described in the ‘Mitochondrial DNA sequence pre-processing and mapping’ section. A subset of samples, namely 56 individuals belonging to five populations, was extracted from the initial dataset and used together with nine GAC samples for the coalescent simulations and additional exploratory analysis (electronic supplementary material, table S7). Phylogenetic networks, based on nucleotide variation in the two mtDNA datasets, were constructed using the median joining algorithm [34] implemented in Network 5.0 program ( . Figure 2. Scheme summarizing the five alternative models compared via ABC random forest. We generated by coalescent simulation mtDNA sequences under five models, differing as to the number of migration events considered. The coloured lines represent the ancient samples included in the analysis, namely Unetice (yellow line), Bell Beaker (purple line), Corded Ware (green line) and Globular Amphorae (red line) from Central Europe, Yamnaya (light blue line) and Srubnaya (brown line) from Eastern Europe. The arrows refer to the three waves of migration tested. Model NOMIG was the simplest one, in which the six populations did not have any genetic exchanges; models MIG1, MIG2 and MIG1, 2 differed from NOMIG in that they included the migration events number 1, 2 (from Eastern to Central Europe, respectively before and after the onset of the GAC), or both. Model MIG2, 3 represents a modification of MIG2 model also including a back migration from Central to Eastern Europe after the development of the Corded Ware culture. (ii) Analysis of demographic models We compared different demographic models, each characterized by a different set of migration events, via approximate Bayesian computation via random forest (ABC-rf) (see electronic supplementary material for details and electronic supplementary material, table S8). In all cases, we assumed some level of genetic continuity across three geographical regions, namely Eastern Europe, Central Europe and the Near East; the models then differed as for the presence of one or two migration events, connecting different regions. The analysis required two main steps; in the first one we estimated the number and the extent of migration waves from the Pontic steppes (arrows 1 and 2 in figure 2); in the second one we quantified the degree of resemblance among Corded Ware people and Early Bronze Age individuals from Eastern Europe, considering the possibility of a third, eastward migration (arrow 3 in figure 2). Other details about the models are in the electronic supplementary material. To compare these models, we applied the ABC-rf considering 50 000 simulations per model and 500 trees in the forest, using the abcrf and the predict functions provided in the abcrf R package. To evaluate the ability of the ABC-rf procedure to distinguish among the models tested, we calculated the classification error using as PODs each dataset of our reference table. The ability of the selected models to actually generate the observed variation was assessed using linear discriminant analysis (LDA) and PCA. To estimate the models' parameters we selected the best 5000 simulations out of 1 million for each model selected. Posterior probabilities for models and parameters were calculated using R scripts from code.google.com/p/popabc/source/browse/#svn%2Ftrunk%2Fscripts, modified by S.G.
|
|
|
Post by Admin on Apr 17, 2018 18:57:28 GMT
Figure 3. (a) Principal component analysis on genomic diversity in ancient and modern individuals. (b) K = 3,4 ADMIXTURE analysis based only on ancient variation. (a) Principal component analysis of 777 modern West Eurasian samples with 199 ancient samples. Only transversions considered in the PCA (to avoid confounding effects of post-mortem damage). We represented modern individuals as grey dots, and used coloured and labelled symbols to represent the ancient individuals. (b) Admixture plots at K = 3 and K = 4 of the analysis conducted only considering the ancient individuals. The full plot is shown in electronic supplementary material, figure S7. The ancient populations are sorted by a temporal scale from Pleistocene to Iron Age. The GAC samples of this study are displayed in the box on the right. To explore the genetic affinities among ancient and modern-day individuals, we examined a PCA plot of the AP dataset. The newly reported GAC individuals fell within a cluster comprising most Early and Middle Neolithic individuals (figure 3a and electronic supplementary material, figure S4). As previously observed [20], a clear separation is apparent between hunter-gatherers and samples of more recent periods, with the Bronze Age individuals at the top of the plot, the Late Neolithic samples in a central position and the Early and Middle Neolithic samples at the bottom. We found again a Europe–Near East cline along the principal component 1 in modern populations, and the clustering of early farmers across Europe with present-day Sardinians [18,20,35] (electronic supplementary material, figure S5). Clustering by ADMIXTURE [31] of the genotypes in our ancient, A, dataset (figure 3b; see electronic supplementary material, figure S7 and figure S8 for the complete analysis of both datasets AP and A), revealed three ancestral components, clearly separating three groups. The orange component is found at high frequencies in hunter-gathering populations such as the Holocene samples from Hungary and in the Motala samples from Sweden. All Early Neolithic populations are characterized by having a large proportion of their genotypes represented by a yellow component, which, strikingly, is also strongly represented in Chalcolithic samples from Iberia consistent with a common origin from first farmers of Anatolia [20]. By contrast, the individuals belonging to the Yamna, Corded Ware, Afanasievo and Andronovo populations, showed a high frequency of a third, blue component, which is consistent with the scenario of a common ancestry in the Pontic steppes followed by westward migration. Except for one of our Kierzkowo samples (in which it represented about 15% of the genotype), the Steppe-related component was absent altogether in the GAC population. All other GAC samples showed, instead, a mix between a major Early Neolithic component (up to 83%), and the component found at high frequencies among hunter-gatherers (up to 30%). When the ADMIXTURE is asked to cluster the samples into four rather than three groups, the Early Neolithic cluster fissions into two, and the new component (green) is present in the GAC, as well as in several other Western European populations. This second Neolithic component, here referred to as Western Europe Neolithic, accounts for a large share of the ancestry of individuals such as those from Iberia (Iberia_CA), La Mina (LaMina_MN) and Els Trocs (Els_Trocs_EN). The ADMIXTURE analysis on the AP dataset (electronic supplementary material, figure S8) confirms the pattern of relationship between the studied populations. Additional genomic components become apparent, an expected consequence of demographic changes occurred in later prehistoric and in historic periods. However, considering six groups, we could reproduce the clusters previously described in the ancient samples: Early Neolithic (red), hunter-gatherer (yellow), Steppe-related (green) and Western Europe Neolithic (orange). As was the case in the analysis conducted only on ancient samples, the GAC population showed both the Western Europe Neolithic and the Early Neolithic component, with negligible, if any, the presence of the component so strongly associated with the Kurgan migration.
|
|
|
Post by Admin on Apr 18, 2018 18:50:48 GMT
Moving from individual to population comparisons, we summarized levels of shared genetic ancestry between pairs of populations since their divergence from an African outgroup calculating sets of f3-statistics, in the form (X, GAC; Mbuti), where X represents, in turn, each ancient population in our dataset. Once again, the GAC people appeared to have more in common with the other Middle Neolithic samples, in particular from Hungary, Iberia and Sweden, than with geographically closer samples (figure 4). A genetic link with the Loschbour sample is also apparent, supporting the hypothesis that, around the Middle Neolithic, farmers of Near-Eastern origin, after a first phase of expansion without admixture [4], began to incorporate in their communities the residual western hunters and gatherers [18]. Remarkably, we did not find evidence of any GAC clear genetic link with the Yamna sample, as well as with any other populations related with the Kurgan Migration Hypothesis. We then calculated the same statistics in the form of f3 (X, Corded Ware; Mbuti), to verify whether we could detect signals of introgression from the Pontic steppes in the later, Corded Ware population, as already observed by Haak et al. [18] and Allentoft et al. [21]; a clear genetic link became apparent. Other samples showing high similarity with Corded Ware individuals are the Afanasievo and Andronovo samples. Finally, we confirmed that in the Late Neolithic there is an increased similarity between farmers and hunter-gatherers, a likely consequence of the assimilation of the latter into the former, shown by the high level of drift shared by the Corded Ware sample and the hunter-gatherers from Sweden and Russia. Figure 4. Outgroup f3 statistics. (a) Test in the form f3 (X, Globular Amphorae; Mbuti). (b) Test in the form f3 (X, Corded Ware; Mbuti), where X is all other ancient populations. Black error bars represent two standard errors. ABC-rf ([39]; see prior distributions in electronic supplementary material, table S8) gave the strongest support to the MIG2 model (posterior probability = 0.40; the number of votes associated with each model are reported in electronic supplementary material, table S9), involving a single migration from the Pontic steppes into Central Europe, just before the onset of the Corded Ware culture, but after the moment at which our GAC samples are dated. Given the low discrimination power resulting from this four-model comparison (electronic supplementary material, table S10), and to better investigate the relationship between the two models receiving the highest number of votes (i.e. MIG2 and MIG1,2), we also performed a direct comparison between MIG2 and MIG1,2. The classification error was lower (electronic supplementary material, table S11), and the probability associated with the selected model, MIG2, was 0.62 (electronic supplementary material, table S12). The proportion of Corded Ware lineages actually derived from Yamna people was estimated to be 0.33 (median value) or 0.43 (modal value) (electronic supplementary material, table S13). As a second step, we compared the MIG2 model with a model also including a back migration to Eastern Europe (MIG2,3, figure 2), thus exploring the possibility that the Sintashta are derived directly from an Eastward migration of Corded Ware people, as proposed by Allentoft et al. [21]. Because we had no high-quality mitochondrial data for the Sintashta, we chose as a proxy the Srubnaya, which appeared very similar in previous analyses of nuclear variation. The comparison of these two models via ABC-rf marginally favoured MIG2,3, but only with a posterior probability of 0.53 (electronic supplementary material, tables S14 and S15). The estimated median proportions of Yamna related lineages contributing to Corded Ware lineages, and of Corded Ware related lineages contributing to the Srubnaya lineages, were respectively 0.31 and 0.27 (electronic supplementary material, table S16). Note that the value estimated for the proportion of Corded Ware lineages coming from Yamna is consistent between the two models, whereas the estimate obtained for the admixture with Yamna in the Srubnaya mitochondrial genome has low R2 value, indicating that we might not have enough power to quantify its extent. Both the MIG2 and the MIG2,3 models provided a good ability to reproduce the observed data, as it is shown by the LDA and the PCA plots in electronic supplementary material, figure S15.
|
|
|
Post by Admin on Apr 19, 2018 18:49:47 GMT
Discussion In its classical formulation, the Kurgan hypothesis, i.e. a late Neolithic spread of proto-Indo-European languages from the Pontic steppes, regards the GAC people as largely descended from Late Neolithic ancestors from the East, most likely representing the Yamna culture; these populations then continued their Westward movement, giving rise to the later Corded Ware and Bell Beaker cultures. Gimbutas [23] suggested that the spread of Indo-European languages involved conflict, with eastern populations spreading their languages and customs to previously established European groups, which implies some degree of demographic change in the areas affected by the process. The genomic variation observed in GAC individuals from Kierzkowo, Poland, does not seem to agree with this view. Indeed, at the nuclear level, the GAC people show minor genetic affinities with the other populations related with the Kurgan Hypothesis, including the Yamna. On the contrary, they are similar to Early-Middle Neolithic populations, even geographically distant ones, from Iberia or Sweden. As already found for other Late Neolithic populations [18], in the GAC people's genome there is a component related to those of much earlier hunting-gathering communities, probably a sign of admixture with them. At the nuclear level, there is a recognizable genealogical continuity from Yamna to Corded Ware. However, the view that the GAC people represented an intermediate phase in this large-scale migration finds no support in bi-dimensional representations of genome diversity (PCA and MDS), ADMIXTURE graphs, or in the set of estimated f3-statistics. For a formal test of these findings, mtDNA data, with its absence of recombination, have valuable properties and allow explicit modelling. Analysis of mitochondrial DNA also benefits from the extensive mathematical methodology that has been developed for studying population history based on such data. Population relationships inferred from mitochondrial data closely resembled those inferred from nuclear data, and so it seemed unlikely that the two datasets may reflect very different demographic events. The models including no migration (NOMIG), or a migration from the Pontic steppes before the onset of the GAC (MIG1) found very limited support in the ABC analysis. The best fit in the first round of simulations was obtained including only a later migration from Yamna, i.e. one not affecting the GAC individuals for which we have information (MIG2), followed by the model in which two subsequent migrations are considered (MIG1,2). Adding, in a second round of simulations, a further episode of gene flow, this time eastwards (MIG2,3), improved the fit, but not dramatically so. The proportion of mtDNA lineages possibly derived from those of the Yamna people is less than 50%. In short, simulation-based tests on mtDNA variation do not suggest that the GAC people of this study have special links with migrants from the Pontic steppes, but show a direct connection between the Yamna and later Central Europe cultures (Corded Ware and Bell beaker), who derived almost half of their mitochondrial variation from them. The emerging picture is thus one in which migrations from the Pontic steppes into Central Europe left a trace in the genomes of the Corded Ware culture, but not in those of the GAC. At this stage, it is hard to say whether and to what extent the finding that migrations from the Pontic steppes had little or no demographic impact on the GAC also has implications for the Kurgan hypothesis. There is little doubt that in the Late Neolithic there were indeed migration processes from the Pontic steppes into Central Europe, documented by the archaeological and genetic links between the Yamna and Corded Ware cultures. However, depending on the number of people involved, migrations may or may not leave a recognizable trace in the genetic makeup of a population. The data we provide show that such a trace is not apparent in the genomes of the GAC people. Rather, evolutionary connections are evident between the GAC and other European groups, both at the nuclear and mitochondrial level. In short not all population relationships in the Central European late Neolithic correspond to those proposed in the original Kurgan model. Of course, there is also the possibility that GAC sites other than those investigated in this study might show different genomic features, but at this stage, this is only a matter of speculation. Therefore, either the GAC people pre-existed and were extraneous to the Pontic Steppe migration process envisaged by Gimbutas, or the Pontic steppes migrants' contribution was represented by few individuals, too few indeed to leave a trace in the genetic makeup of the GAC population. To get a deeper insight into the linguistic changes prompted by Neolithic migration, one should have an idea of the languages spoken in that period, which is currently out of reach. Therefore, our work leaves open the possibility that the GAC was in contact with Yamna, but in this case, the interaction was mostly at the cultural level, entailing very limited migration, if any, contrary to the predictions of Gimbutas' Kurgan hypothesis. One alternative is that that the Pontic steppe migration did not profoundly affect mitochondrial variation, as implied by several studies showing an excess of migrating males in expanding pastoral economies (see Saag et al. unpublished data [40] and Kristiansen et al. [41], and references therein). However, in this case, the substantial contribution of males should at least result in some degree of similarity between GAC and Yamnaya at the nuclear level, which did not emerge in this study. In both cases, a transmission of cultural traits from the Pontic steppes to the GAC, and later further West, is conceivable and not ruled out by our data; further archaeological work, including studies of other GAC sites, may shed additional light on this. Published 22 November 2017.DOI: 10.1098/rspb.2017.1540
|
|