Post by Admin on Jan 8, 2022 23:34:01 GMT
Massive migration from the steppe is a source for Indo-European languages in Europe
Abstract
We generated genome-wide data from 69 Europeans who lived between 8,000-3,000 years ago by enriching ancient DNA libraries for a target set of almost four hundred thousand polymorphisms. Enrichment of these positions decreases the sequencing required for genome-wide ancient DNA analysis by a median of around 250-fold, allowing us to study an order of magnitude more individuals than previous studies1–8 and to obtain new insights about the past. We show that the populations of western and far eastern Europe followed opposite trajectories between 8,000-5,000 years ago. At the beginning of the Neolithic period in Europe, ~8,000-7,000 years ago, closely related groups of early farmers appeared in Germany, Hungary, and Spain, different from indigenous hunter-gatherers, whereas Russia was inhabited by a distinctive population of hunter-gatherers with high affinity to a ~24,000 year old Siberian6. By ~6,000-5,000 years ago, a resurgence of hunter-gatherer ancestry had occurred throughout much of Europe, but in Russia, the Yamnaya steppe herders of this time were descended not only from the preceding eastern European hunter-gatherers, but also from a population of Near Eastern ancestry. Western and Eastern Europe came into contact ~4,500 years ago, as the Late Neolithic Corded Ware people from Germany traced ~3/4 of their ancestry to the Yamnaya, documenting a massive migration into the heartland of Europe from its eastern periphery. This steppe ancestry persisted in all sampled central Europeans until at least ~3,000 years ago, and is ubiquitous in present-day Europeans. These results provide support for the theory of a steppe origin9 of at least some of the Indo-European languages of Europe.
Genome-wide analysis of ancient DNA has emerged as a transformative technology for studying prehistory, providing information that is comparable in power to archaeology and linguistics. Realizing its promise, however, requires collecting genome-wide data from an adequate number of individuals to characterize population changes over time, which means not only sampling a succession of archaeological cultures2, but also multiple individuals per culture. To make analysis of large numbers of ancient DNA samples practical, we used in-solution hybridization capture10,11 to enrich next-generation sequencing libraries for a target set of 394,577 single nucleotide polymorphisms (SNPs) (“390k capture”), 354,212 of which are autosomal SNPs that have also been genotyped using the Affymetrix Human Origins array in 2,345 humans from 203 populations4,12. This reduces the amount of sequencing required to obtain genome-wide data by a minimum of 45-fold and a median of 262-fold (Online Table 1). This strategy allows us to report genomic-scale data on more than twice as many ancient Eurasians as the entire preceding literature1–8 (Extended Data Table 1).
Extended Data Table 1: Number of ancient Eurasian modern human samples screened in genome-wide studies to date. Only studies that produced at least one sample at ≥0.05× coverage are listed.
We used this technology to study population transformations in Europe. We began by preparing 212 DNA libraries from 119 ancient samples in dedicated clean rooms, and testing these by light shotgun sequencing and mitochondrial genome capture (SI1, Online Table 1). We retained only libraries with molecular signatures of authentic ancient DNA (elevated damage in the terminal nucleotide), negligible evidence of contamination based on mismatches to the mitochondrial consensus13, and, where available, a mitochondrial DNA haplogroup that matched previous results using PCR4,14,15 (SI2). For 123 libraries prepared in the presence of Uracil-DNA-glycosylase16 to reduce errors due to ancient DNA damage17, we performed 390k capture, carried out paired-end sequencing, and mapped to the human genome. We restricted analysis to 95 libraries from 69 samples that had at least 0.06-fold average target coverage (average of 3.8-fold), and used majority rule to call an allele at each SNP covered at least once (Online Table 1). After combining our data (SI3) with 25 ancient samples from the literature (three Upper Paleolithic samples from Russia1,7,6, seven people of European hunter-gatherer ancestry4,5,8,2, and fifteen European farmers2,8,4,3), we had data from 94 ancient Europeans. Geographically, these came from Germany (n=41), Spain (n=10), Russia (n=14), Sweden (n=12), Hungary (n=15), Italy (n=1) and Luxembourg (n=1) (Extended Data Table 2). Following the central European chronology, these included 19 hunter-gatherers (>5,500 BCE), 28 Early Neolithic farmers (EN: ~6,000-4,000 BCE), 11 Middle Neolithic farmers (MN: ~4,000-3,000 BCE) including the Tyrolean Iceman3, 9 Late Copper/Early Bronze Age individuals (Yamnaya: ~3,300-2,700 BCE), 15 Late Neolithic individuals (LN: ~2,500-2,200 BCE), 9 Early Bronze Age individuals (~2,200-1,500 BCE), two Late Bronze Age individuals (~1,200-1,100 BCE) and one Iron Age individual (~900 BCE).
Two individuals were excluded from analyses as they were related to others from the same population. The average number of SNPs covered at least once was 212,375 and the minimum was 22,869 (Fig. 1).
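The majority-rule allele call mentioned above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `majority_call`, the toy `pileup` dictionary, and the handling of ties (a seeded random pick among the tied alleles) are all assumptions made for illustration, since the text quoted here does not say how ties are resolved.

```python
import random
from collections import Counter


def majority_call(observed_alleles, rng=None):
    """Return the most frequent allele among the reads covering one SNP.

    Returns None when the SNP has no coverage. Ties are broken by a random
    pick among the tied alleles (an assumption of this sketch).
    """
    if not observed_alleles:
        return None
    counts = Counter(observed_alleles)
    top = max(counts.values())
    # Sort the tied alleles so the random pick is reproducible for a given seed.
    tied = sorted(allele for allele, n in counts.items() if n == top)
    if len(tied) == 1:
        return tied[0]
    rng = rng or random.Random(0)
    return rng.choice(tied)


# Hypothetical pileup: SNP identifier -> alleles seen in reads at that position.
pileup = {
    "snp_a": ["A", "A", "G"],  # clear majority
    "snp_b": ["C"],            # covered exactly once
    "snp_c": [],               # not covered, so no call is made
    "snp_d": ["T", "C"],       # tie between two alleles
}

calls = {snp: majority_call(reads) for snp, reads in pileup.items()}
covered = sum(call is not None for call in calls.values())
print(calls)
print("SNPs covered at least once:", covered)
```

Each sample ends up with a single allele per covered SNP, so the per-sample count of covered SNPs (as in the average and minimum quoted above) is simply the number of positions where a call could be made.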