Post by Admin on Apr 27, 2018 18:38:06 GMT
Genome-wide analysis of ancient DNA has emerged as a transformative technology for studying prehistory, providing information that is comparable in power to archaeology and linguistics. Realizing its promise, however, requires collecting genome-wide data from an adequate number of individuals to characterize population changes over time, which means not only sampling a succession of archaeological cultures2, but also multiple individuals per culture. To make analysis of large numbers of ancient DNA samples practical, we used in-solution hybridization capture10,11 to enrich next generation sequencing libraries for a target set of 394,577 single nucleotide polymorphisms (SNPs) (‘390k capture’), 354,212 of which are autosomal SNPs that have also been genotyped using the Affymetrix Human Origins array in 2,345 humans from 203 populations4,12. This reduces the amount of sequencing required to obtain genome-wide data by a minimum of 45-fold and a median of 262-fold (Supplementary Data 1). This strategy allows us to report genomic scale data on more than twice the number of ancient Eurasians as has been presented in the entire preceding literature1–8 (Extended Data Table 1).
Figure 1
Location and SNP coverage of samples included in this study
(a) Geographic location and time-scale (central European chronology) of the 69 newly typed ancient individuals from this study (black outline) and 25 from the literature for which shotgun sequencing data was available (no outline). (b) Number of SNPs covered at least once in the analysis dataset of 94 individuals.
We used this technology to study population transformations in Europe. We began by preparing 212 DNA libraries from 119 ancient samples in dedicated clean rooms, and testing these by light shotgun sequencing and mitochondrial genome capture (Supplementary Information section 1, Supplementary Data 1). We restricted the analysis to libraries with molecular signatures of authentic ancient DNA (elevated damage in the terminal nucleotide), negligible evidence of contamination based on mismatches to the mitochondrial consensus13 and, where available, a mitochondrial DNA haplogroup that matched previous results using PCR4,14,15 (Supplementary Information section 2). For 123 libraries prepared in the presence of uracil-DNA-glycosylase16 to reduce errors due to ancient DNA damage17, we performed 390k capture, carried out paired-end sequencing and mapped the data to the human genome. We restricted analysis to 94 libraries from 69 samples that had at least 0.06-fold average target coverage (average of 3.8-fold) and used majority rule to call an allele at each SNP covered at least once (Supplementary Data 1). After combining our data (Supplementary Information section 3) with 25 ancient samples from the literature (three Upper Paleolithic samples from Russia1,6,7, seven people of European hunter-gatherer ancestry2,4,5,8, and fifteen European farmers2,3,4,8), we had data from 94 ancient Europeans. Geographically, these came from Germany (n=41), Spain (n=10), Russia (n=14), Sweden (n=12), Hungary (n=15), Italy (n=1) and Luxembourg (n=1) (Extended Data Table 2).
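The majority-rule allele call mentioned above can be illustrated with a small sketch. This is not the authors' pipeline: the function name `majority_call`, the input format (a list of bases observed in reads covering one SNP), and the random tie-breaking are all assumptions made for illustration.

```python
from collections import Counter
import random


def majority_call(observed_bases, rng=None):
    """Return the base seen most often at one SNP, or None if no read covers it.

    Assumption: ties between equally frequent bases are broken at random.
    The source text does not say how ties are handled.
    """
    if not observed_bases:
        return None  # SNP not covered, so no call is made
    counts = Counter(observed_bases)
    top = max(counts.values())
    # Sort the tied candidates so the outcome depends only on the RNG state.
    candidates = sorted(base for base, n in counts.items() if n == top)
    if len(candidates) == 1:
        return candidates[0]
    rng = rng or random
    return rng.choice(candidates)


if __name__ == "__main__":
    print(majority_call(["A", "A", "G"]))  # the majority base
    print(majority_call([]))               # uncovered SNP
```

Because a single base is reported per SNP, each individual is effectively represented by a pseudo-haploid genotype, which is why a SNP needs to be covered only once to be usable.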
Following the central European chronology, these included 19 hunter-gatherers (∼43,000–2,600 BC), 28 Early Neolithic farmers (∼6,000–4,000 BC), 11 Middle Neolithic farmers (∼4,000–3,000 BC) including the Tyrolean Iceman3, 9 Late Copper/Early Bronze Age individuals (Yamnaya: ∼3,300–2,700 BC), 15 Late Neolithic individuals (∼2,500–2,200 BC), 9 Early Bronze Age individuals (∼2,200–1,500 BC), two Late Bronze Age individuals (∼1,200–1,100 BC) and one Iron Age individual (∼900 BC). Two individuals were excluded from analyses as they were related to others from the same population. The average number of SNPs covered at least once was 212,375 and the minimum was 22,869 (Fig. 1).
We determined that 34 of the 69 newly analysed individuals were male and used 2,258 Y chromosome SNP targets included in the capture to obtain high resolution Y chromosome haplogroup calls (Supplementary Information section 4). Outside Russia, and before the Late Neolithic period, only a single R1b individual was found (early Neolithic Spain) in the combined literature (n=70). By contrast, haplogroups R1a and R1b were found in 60% of Late Neolithic/Bronze Age Europeans outside Russia (n=10), and in 100% of the samples from European Russia from all periods (7,500–2,700 BC; n=9). R1a and R1b are the most common haplogroups in many European populations today18,19, and our results suggest that they spread into Europe from the East after 3,000 BC. Two hunter-gatherers from Russia included in our study belonged to R1a (Karelia) and R1b (Samara), the earliest documented ancient samples of either haplogroup discovered to date. These two hunter-gatherers did not belong to the derived lineages M417 within R1a and M269 within R1b that are predominant in Europeans today18,19, but all 7 Yamnaya males did belong to the M269 subclade18 of haplogroup R1b.
Principal components analysis (PCA) of all ancient individuals along with 777 present-day West Eurasians4 (Fig. 2a, Supplementary Information section 5) replicates the positioning of present-day Europeans between the Near East and European hunter-gatherers4,20, and the clustering of early farmers from across Europe with present-day Sardinians3,4, suggesting that farming expansions across the Mediterranean to Spain and via the Danubian route to Hungary and Germany descended from a common stock. By adding samples from later periods and additional locations, we also observe several new patterns.
Figure 2
Population transformations in Europe
(a) PCA analysis, (b) ADMIXTURE analysis. The full ADMIXTURE analysis including present-day humans is shown in Supplementary Information section 6.
All samples from Russia have affinity to the ∼24,000-year-old MA1 (ref. 6), the type specimen for the Ancient North Eurasians (ANE) who contributed to both Europeans4 and Native Americans4,6,8. The two hunter-gatherers from Russia (Karelia in the northwest of the country and Samara on the steppe near the Urals) form an ‘eastern European hunter-gatherer’ (EHG) cluster at one end of a hunter-gatherer cline across Europe; people of hunter-gatherer ancestry from Luxembourg, Spain, and Hungary sit at the opposite ‘western European hunter-gatherer’4 (WHG) end, while the hunter-gatherers from Sweden4,8 (SHG) are intermediate. Against this background of differentiated European hunter-gatherers and homogeneous early farmers, multiple population turnovers transpired in all parts of Europe included in our study. Middle Neolithic Europeans from Germany, Spain, Hungary, and Sweden from the period ∼4,000–3,000 BC are intermediate between the earlier farmers and the WHG, suggesting an increase of WHG ancestry throughout much of Europe. By contrast, in Russia, the later Yamnaya steppe herders of ∼3,000 BC plot between the EHG and the present-day Near East/Caucasus, suggesting a decrease of EHG ancestry during the same time period. The Late Neolithic and Bronze Age samples from Germany and Hungary2 are distinct from the preceding Middle Neolithic and plot between them and the Yamnaya. This pattern is also seen in ADMIXTURE analysis (Fig. 2b, Supplementary Information section 6), which implies that the Yamnaya have ancestry from populations related to the Caucasus and South Asia that is largely absent in 38 Early or Middle Neolithic farmers but present in all 25 Late Neolithic or Bronze Age individuals. This ancestry appears in Central Europe for the first time in our series with the Corded Ware around 2,500 BC (Supplementary Information section 6, Fig. 2b).
The Corded Ware shared elements of material culture with steppe groups such as the Yamnaya, although whether this reflects movements of people has been contentious21. Our genetic data provide direct evidence of migration and suggest that it was relatively sudden. The Corded Ware are genetically closest to the Yamnaya ∼2,600 km away, as inferred both from PCA and ADMIXTURE (Fig. 2) and FST (0.011±0.002) (Extended Data Table 3). If continuous gene flow from the east, rather than migration, had occurred, we would expect successive cultures in Europe to become increasingly differentiated from the Middle Neolithic, but instead, the Corded Ware are both the earliest and most strongly differentiated from the Middle Neolithic population. ‘Outgroup’ f3 statistics6 (Supplementary Information section 7), which measure shared genetic drift between a pair of populations (Extended Data Fig. 1), support the clustering of hunter-gatherers, Early/Middle Neolithic, and Late Neolithic/Bronze Age populations into different groups as in the PCA (Fig. 2a). We also analysed f4 statistics, which allow us to test whether pairs of populations are consistent with descent from common ancestral populations, and to assess significance using a normally distributed Z score. Early European farmers from the Early and Middle Neolithic were closely related but not identical. This is reflected in the fact that Loschbour, a WHG individual from Luxembourg4, shared more alleles with post-4,000 BC European farmers from Germany, Spain, Hungary, Sweden and Italy than with early farmers of Germany, Spain, and Hungary, documenting an increase of hunter-gatherer ancestry in multiple regions of Europe during the course of the Neolithic.
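To make the f4 statistics and their Z scores more concrete, here is a minimal sketch under stated assumptions. It computes f4(A, B; C, D) as the average over SNPs of (a − b)(c − d), where a, b, c, d are allele frequencies in the four populations, and estimates a standard error with a delete-one-block jackknife over equal-sized contiguous blocks. The function names, the input layout and the equal-block simplification are assumptions for illustration, not the procedure documented in the Supplementary Information.

```python
import math


def f4(freqs):
    """Average of (a - b) * (c - d) over SNPs.

    freqs: list of (a, b, c, d) allele-frequency tuples, one per SNP.
    """
    return sum((a - b) * (c - d) for a, b, c, d in freqs) / len(freqs)


def f4_zscore(freqs, n_blocks=10):
    """Return (estimate, standard error, Z) from a delete-one-block jackknife.

    Simplification (assumption): SNPs are split into n_blocks contiguous,
    equal-sized blocks; leftover SNPs at the end are dropped.
    """
    size = len(freqs) // n_blocks
    if size == 0:
        raise ValueError("need at least one SNP per block")
    used = freqs[: size * n_blocks]
    estimate = f4(used)
    # Recompute the statistic with each block removed in turn.
    leave_one_out = [
        f4(used[: i * size] + used[(i + 1) * size:]) for i in range(n_blocks)
    ]
    mean_loo = sum(leave_one_out) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum(
        (x - mean_loo) ** 2 for x in leave_one_out
    )
    se = math.sqrt(variance)
    z = estimate / se if se > 0 else float("nan")
    return estimate, se, z
```

If (A, B) and (C, D) form clades with respect to each other, the statistic is expected to be zero, so a large |Z| indicates that the pairs are not consistent with descent from common ancestral populations.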
The two EHG form a clade with respect to all other present-day and ancient populations (|Z|<1.9), and MA1 shares more alleles with them (|Z|>4.7) than with other ancient or modern populations, suggesting that they may be a source for the ANE ancestry in present Europeans4,12,22 as they are geographically and temporally more proximate than Upper Paleolithic Siberians.
The Yamnaya differ from the EHG by sharing fewer alleles with MA1 (|Z|=6.7), suggesting a dilution of ANE ancestry between 5,000–3,000 BC on the European steppe. This was likely due to admixture of EHG with a population related to present-day Near Easterners, as the most negative f3 statistic in the Yamnaya (giving unambiguous evidence of admixture) is observed when we model them as a mixture of EHG and present-day Near Eastern populations like Armenians (Z=-6.3; Supplementary Information section 7). The Late Neolithic/Bronze Age groups of central Europe share more alleles with Yamnaya than the Middle Neolithic populations do (|Z|=12.4) and more alleles with the Middle Neolithic than the Yamnaya do (|Z|=12.5), and have a negative f3 statistic with the Middle Neolithic and Yamnaya as references (Z=-20.7), indicating that they were descended from a mixture of the local European populations and new migrants from the east. Moreover, the Yamnaya share more alleles with the Corded Ware (|Z|≥3.6) than with any other Late Neolithic/Early Bronze Age group with at least two individuals (Supplementary Information section 7), indicating that they had more eastern ancestry, consistent with the PCA and ADMIXTURE patterns (Fig. 2). Modelling of the ancient samples shows that while Karelia is genetically intermediate between Loschbour and MA1, the topology that considers Karelia as a mixture of these two elements is not the only one that can fit the data (Supplementary Information section 8). To avoid biasing our inferences by fitting an incorrect model, we developed new statistical methods that are substantial extensions of a previously reported approach4, which allow us to obtain precise estimates of the proportion of mixture in later Europeans without requiring a formal model for the relationship among the ancestral populations.
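The reasoning behind a negative f3 statistic as evidence of admixture can be seen in a toy calculation. The sketch below assumes the plain form f3(Test; Ref1, Ref2) = average over SNPs of (t − r1)(t − r2) on population allele frequencies, without the finite-sample bias correction a real analysis would need; the function names and the made-up frequencies are purely illustrative.

```python
def f3(freqs):
    """Average of (t - r1) * (t - r2) over SNPs.

    freqs: list of (t, r1, r2) allele-frequency tuples, one per SNP,
    where t is the Test population and r1, r2 are the two references.
    """
    return sum((t - r1) * (t - r2) for t, r1, r2 in freqs) / len(freqs)


def mix(r1, r2, alpha):
    """Allele frequency of a population with a fraction alpha of r1 ancestry."""
    return alpha * r1 + (1 - alpha) * r2


# Hypothetical reference frequencies at three SNPs (not real data).
refs = [(0.1, 0.9), (0.8, 0.2), (0.5, 0.5)]

# A Test population that is a 50/50 mixture lies between the references at
# every SNP, so each product (t - r1) * (t - r2) is zero or negative.
admixed = [(mix(r1, r2, 0.5), r1, r2) for r1, r2 in refs]

if __name__ == "__main__":
    print(f3(admixed) < 0)  # True for this toy mixture
```

In the absence of admixture, drift specific to the Test population pushes the statistic upward, which is why a significantly negative value is read as unambiguous evidence of mixture while a positive value is uninformative.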
The method (Supplementary Information section 9) is based on the idea that if a Test population has ancestry related to reference populations Ref1, Ref2, …, RefN in proportions α1, α2, …, αN, and the references are themselves differentially related to a triple of outgroup populations A, B, C, then: