|
Post by Admin on Feb 4, 2022 19:12:14 GMT
Reconstructing the spatiotemporal patterns of admixture during the European Holocene using a novel genomic dating method
Manjusha Chintalapati1#, Nick Patterson2#, Priya Moorjani1,3
Abstract

Recent studies have shown that gene flow or admixture has been pervasive throughout human history. While several methods exist for dating admixture in contemporary populations, they are not suitable for sparse, low-coverage data available from ancient specimens. To overcome this limitation, we developed DATES, which leverages ancestry covariance patterns across the genome of a single individual to infer the timing of admixture. By performing simulations, we show that DATES provides reliable results under a range of demographic scenarios and outperforms available methods for ancient DNA applications. We apply DATES to ~1,100 ancient genomes to reconstruct gene flow events during the European Holocene. Present-day Europeans derive ancestry from three distinct groups: local Mesolithic hunter-gatherers, Anatolian farmers, and Yamnaya Steppe pastoralists. These ancestral groups were themselves admixed. By studying the formation of Anatolian farmers, we infer that the gene flow related to Iranian Neolithic farmers occurred no later than 9,600 BCE, predating agriculture in Anatolia. We estimate the early Steppe pastoralist groups genetically formed more than a millennium before the start of steppe pastoralism, providing new insights about the history of proto-Yamnaya cultures and the origin of Indo-European languages. Using ancient genomes across sixteen regions in Europe, we provide a detailed chronology of the Neolithization across Europe that occurred from ~6,400–4,300 BCE. This movement was followed by a rapid spread of steppe ancestry from ~3,200–2,500 BCE. Our analyses highlight the power of genomic dating methods to elucidate the legacy of human migrations, providing insights complementary to archaeological and linguistic evidence.
Keywords

genomic clocks, admixture, ancient DNA, European Holocene, molecular clock, migration, Neolithic, Bronze Age, Yamnaya

Significance

The European continent was subject to two major migrations during the Holocene: the movement of Near Eastern farmers during the Neolithic and the migration of Steppe pastoralists during the Bronze Age. To understand the timing and dynamics of these movements, we developed DATES, which leverages ancestry covariance patterns across the genome of a single individual to infer the timing of admixture. Using ~1,100 ancient genomes spanning ~8,000–350 BCE, we reconstruct the chronology of the formation of the ancestral populations and the fine-scale details of the spread of Neolithic farming and Steppe pastoralist-related ancestry to Europe. Our analysis demonstrates the power of genomic dating methods to provide an independent and complementary timeline of population origins and movements using genetic data.
|
|
|
Post by Admin on Feb 4, 2022 20:28:56 GMT
Introduction

Recent studies have shown that population mixture (or “admixture”) is pervasive throughout human history, including mixture between the ancestors of modern humans and archaic hominins (i.e., Neanderthals and Denisovans), as well as in the history of many contemporary human groups such as African Americans, South Asians and Europeans (1, 2). Many admixed groups are formed due to population movements involving ancient migrations that pre-date historical records. The recent availability of genomic data for a large number of present-day and ancient genomes provides an unprecedented opportunity to reconstruct population events using genetic data, providing evidence complementary to linguistics and archaeology. Understanding the timing and signatures of admixture offers insights into the historical context in which the mixture occurred and enables the characterization of the evolutionary and functional impact of the gene flow.

To characterize patterns of admixture, genetic methods use the insight that the genome of an admixed individual is a mosaic of chromosomal segments inherited from distinct ancestral populations (3). Due to recombination, these ancestral segments get shuffled in each generation and become smaller and smaller over time. The length of the segments is inversely proportional to the time elapsed since the mixture (3, 4). Several genetic approaches, namely ROLLOFF (4), ALDER (5), Globetrotter (2), and Tracts (6), have been developed that use this insight by characterizing patterns of admixture linkage disequilibrium (LD) or haplotype lengths across the genome to infer the timing of mixture. Haplotype-based methods perform chromosome painting or local ancestry inference at each locus in the genome and characterize the distribution of ancestry tract lengths to estimate the time of mixture (2, 6).
This requires accurate phasing and inference of local ancestry, which is often difficult when the admixture events are old (as ancestry blocks become smaller over time) or when reference data from ancestral populations is unavailable. Admixture LD-based methods, on the other hand, measure the extent of the allelic correlation across markers to infer the time of admixture (4, 5). They do not require phased data from the target or reference populations and work reliably for dating older admixture events (>100 generations). However, they tend to be less efficient in characterizing admixture events between closely related ancestral groups.

While highly accurate for dating admixture events using data from present-day samples, current methods do not work reliably for dating admixture events using ancient genomes. Ancient DNA samples often have high rates of DNA degradation, contamination (from human and other sources) and low sequencing depth, leading to a large proportion of missing variants and uneven coverage across the genome. Additionally, most studies generate pseudo-homozygous genotype calls (consisting of a single allele call at each diploid site) that can lead to some issues in the inference. In such sparse datasets, estimating admixture LD can be noisy and biased (see Simulations below). Moreover, haplotype-based methods require phased data from both admixed and reference populations, which remains challenging for ancient DNA specimens.

An extension of admixture LD-based methods, recently introduced by Moorjani et al. (2016), leverages ancestry covariance patterns that can be measured in a single sample using low coverage data. This approach measures the allelic correlation across neighboring sites, but instead of measuring admixture LD across multiple samples, it integrates data across markers within a single diploid genome.
Using a set of ascertained markers that are informative for Neanderthal ancestry (where sub-Saharan Africans are fixed for the ancestral alleles and Neanderthals have a derived allele), Moorjani et al. (2016) inferred the timing of Neanderthal gene flow in Upper Paleolithic Eurasian samples and showed the approach works accurately in ancient DNA samples (1). However, this approach is inapplicable for dating admixture events within modern human populations, as there are very few fixed differences across populations (7).

Motivated by the single sample statistic in Moorjani et al. (2016), we developed DATES (Distribution of Ancestry Tracts of Evolutionary Signals), which measures the ancestry covariance across the genome in a single admixed individual, weighted by the allele frequency difference between two ancestral populations. This method was first introduced in Narasimhan et al. (2019), where it was used to infer the date of gene flow between groups related to Ancient Ancestral South Indians, Iranian farmers, and Steppe pastoralists in ancient South and Central Asian populations (8). In this study, we evaluate the performance of DATES by performing extensive simulations for a range of demographic scenarios and compare the approach to other published genomic dating methods. We then apply DATES to infer the chronology of the genetic formation of the ancestral populations of Europeans and the spatiotemporal patterns of admixture during the European Holocene using data from ~1,100 ancient DNA specimens spanning ~8,000–350 BCE.
|
|
|
Post by Admin on Feb 4, 2022 22:16:18 GMT
Results

Overview of DATES: Model and simulations

DATES estimates the time of admixture by measuring the weighted ancestry covariance across the genome using data from a single diploid genome and two reference populations (representing the ancestral source populations). Like haplotype-based methods, DATES can date admixture in a single genome, whereas admixture LD-based methods by definition require multiple genomes to be co-analyzed; unlike haplotype-based methods, it does not require local ancestry inference, which makes it more flexible. There are three main steps in DATES: we start by learning the genome-wide ancestry proportions by performing a simple regression analysis to model the observed genotypes in an admixed individual as a linear mix of allele frequencies from the two reference populations. For each marker, we then compute the likelihood of the observed genotype in the admixed individual using the estimated ancestry proportions and allele frequencies in each reference population (this is similar in spirit to local ancestry inference). This information is, in turn, used to compute the joint likelihood for two neighboring markers to test if they derive ancestry from the same ancestral group, accounting for the probability of recombination between the two markers. Finally, we compute the covariance across pairs of markers located at a particular genetic distance, weighted by the allele frequency differences in the reference populations (Note S1).

Following (1), we bin the markers that occur at a similar genetic distance across the genome, rather than estimating admixture LD for each pair of markers, and compute the covariance across increasing genetic distance between markers.
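As a rough illustration of the first of the three steps above (the regression for genome-wide ancestry proportions), the sketch below fits a single mixing fraction by least squares. The function name, the toy frequencies, and the exact model form (half the genotype modeled as alpha * freq_a + (1 - alpha) * freq_b) are assumptions made for illustration only; the regression actually used in DATES is specified in Note S1 and may differ.

```python
def estimate_ancestry_proportion(genotypes, freq_a, freq_b):
    """Least-squares estimate of the ancestry fraction from reference A.

    genotypes: alternate-allele counts (0, 1, 2), or None where missing.
    freq_a, freq_b: alternate-allele frequencies in the two references.
    Assumed model for this sketch: g / 2 ~ alpha * freq_a + (1 - alpha) * freq_b.
    """
    num = 0.0
    den = 0.0
    for g, pa, pb in zip(genotypes, freq_a, freq_b):
        if g is None:  # skip missing genotype calls
            continue
        diff = pa - pb
        num += (g / 2.0 - pb) * diff
        den += diff * diff
    if den == 0.0:
        raise ValueError("references do not differ at any usable marker")
    # Clamp to the valid range of a proportion.
    return min(1.0, max(0.0, num / den))


# Toy inputs (hypothetical values, not taken from the paper).
freq_a = [0.9, 0.1, 0.7, 0.2]
freq_b = [0.1, 0.8, 0.3, 0.6]
genotypes = [1, 2, None, 1]
alpha = estimate_ancestry_proportion(genotypes, freq_a, freq_b)
```

Markers where the two references have similar frequencies contribute little to the estimate, which is the same intuition behind weighting the covariance by allele frequency differences.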
The estimated covariance is expected to decay exponentially with genetic distance, and the rate of decay is informative of the time of the mixture (4). Assuming the gene flow occurred instantaneously, we infer the average date of gene flow by fitting an exponential distribution to the decay pattern (Methods). In cases where data for multiple individuals is available, we compute the likelihood by summing over all individuals. To make DATES computationally tractable, we implemented the fast Fourier transform (FFT) for calculating ancestry covariance as described in ALDER (5). This provides a speedup from O(n^2) to O(n log n), which reduces the typical runtimes from hours to seconds (Note S1).

To assess the reliability of DATES, we performed simulations where we constructed ten admixed diploid genomes by randomly sampling haplotypes from two source populations (Note S2). Briefly, we simulated individual genomes with 20% European and 80% African ancestry by using phased haplotypes of Northern Europeans (Utah European Americans, CEU) and west Africans (Yoruba from Nigeria, YRI) from the 1000 Genomes Project respectively (7). As reference populations in DATES, we used closely related surrogate populations of French and Yoruba respectively, from the Human Genome Diversity Panel (9). We first investigated the accuracy of DATES by varying the time of admixture between 10–300 generations. For comparison, we also applied ALDER (5) to these simulations. Both methods reliably recovered the time of admixture up to 200 generations or ~5,600 years ago, assuming a generation time of 28 years (1), though DATES was more precise than ALDER for older admixture events (>100 generations) (Table S2.4). Further, DATES shows accurate results even for single samples (Figure 1).

Next, we tested DATES for features such as varying admixture proportions and use of surrogate populations as reference groups.
By varying the European ancestry proportion between ~1–50% (the rest derived from west Africans), we observed that DATES accurately estimated the timing in all cases (Figure S2.2A). However, the inferred admixture proportion was overestimated for lower admixture proportions (<10%) (Figure S2.2B). Thus, we caution against using DATES for estimating ancestry proportions and recommend other methods based on f-statistics (10). Using reference populations that are divergent from the true admixing source, we found that the inferred dates were accurate even when we used Khomani San instead of Yoruba as the reference population (FST ~ 0.1) (Figure S2.5). We also found that using the admixed samples themselves as one of the reference populations works reliably, as in ALDER (i.e., single reference setup) (5).

An important feature of DATES is that it does not require phased data and is applicable to datasets with small sample sizes, making it in principle useful for ancient DNA applications. To test the reliability of DATES for ancient genomes, we simulated data mimicking the relevant features of ancient genomes, namely large proportions of missing genotypes (between 10–60%) and pseudo-homozygous genotype calls (instead of diploid genotype calls). DATES showed reliable results in both cases, even when only a single admixed individual was available (Figure S2.7). In contrast, admixture LD-based methods require more than one sample and do not work reliably with missing data. For example, ALDER estimates were very unstable for simulations with >40% missing data. For older dates (>100 generations), we observed slight bias even with >10% missing genotypes (Figure S2.17). As LD calculations leverage shared patterns across samples, variable missingness of genotypes across individuals leads to substantial loss of data, leading to unstable and noisy inference.
This highlights a major advantage of DATES for ancient DNA studies as it provides reliable results even in sparse datasets (Note S2.5).

DATES assumes a model of instantaneous gene flow with a single pulse of mixture between two source populations. However, many human populations have a history of multiple pulses of gene flow. To test the performance of DATES for multi-way admixture events, we generated admixed individuals with ancestry from three sources (East Asians, Africans, and Europeans) where the gene flow occurred at two distinct time points (Note S2, Figure S2.10). By applying DATES with pairs of reference populations at a time and fitting a single exponential to the ancestry covariance patterns, we observed that DATES recovered both admixture times in case of equal ancestry proportion from the three ancestral groups when the associated reference groups were used for dating (Figure S2.11). In the case of unequal admixture proportions from three ancestral groups, DATES inferred the timing of the recent admixture event in most cases, though some confounding was observed, especially when the ancestry proportion of the recent event was low (Figure S2.12). However, if the reference populations were set up to match the model of gene flow, we observed that we could reliably recover the time of the recent gene flow event. For example, there is limited confounding if the two references used in DATES include (i) the source population for the recent event and (ii) either the pooled ancestral populations contributing to the first (or earlier) event or the intermediate admixed group formed after the first event (Table S2.1). This highlights how the choice of reference populations can help to tune the method to infer the timing of specific admixture events reliably.
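The exponential fit mentioned earlier in this overview (covariance decaying with genetic distance, with the decay rate giving the time since mixture) can be illustrated with a deliberately simplified sketch. It assumes a noise-free curve of the form A * exp(-t * d), with d in Morgans and t in generations, and recovers t by ordinary least squares on the log of the covariance. The function name and the synthetic inputs are hypothetical; the fitting procedure actually used by DATES is described in the Methods and is not reproduced here.

```python
import math


def fit_decay_rate(distances_morgans, covariances):
    """Estimate t (generations) from covariance ~ A * exp(-t * d).

    Uses a straight-line fit of log(covariance) against distance;
    non-positive covariance values are skipped because their log is undefined.
    """
    pairs = [(d, math.log(c))
             for d, c in zip(distances_morgans, covariances) if c > 0]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two positive covariance values")
    mean_d = sum(d for d, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((d - mean_d) * (y - mean_y) for d, y in pairs)
    sxx = sum((d - mean_d) ** 2 for d, _ in pairs)
    return -sxy / sxx  # decay rate = generations since admixture


# Synthetic decay curve (illustrative values only): bins from 0.1 to 20 cM.
distances = [0.001 * k for k in range(1, 201)]
true_generations = 200
curve = [0.05 * math.exp(-true_generations * d) for d in distances]
estimated_generations = fit_decay_rate(distances, curve)
```

On real data the binned covariance is noisy and can be negative at long distances, which is one reason a direct nonlinear fit is generally preferable to this log-linear shortcut.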
Finally, we explored the impact of more complex demographic events, including continuous admixture and founder events using coalescent simulations (Note S2). In the case of continuous admixture, DATES inferred an intermediate timing between the start and the end of the gene flow period, similar to other methods like ALDER and Globetrotter (2, 5) (Table S2.2). In the case of populations with founder events, we inferred unbiased dates of admixture in most cases except when the founder event was extreme (Ne ~ 10) or the population had maintained a low population size (Ne < 100) until present (i.e., no recovery bottleneck) (Figure S2.13, Table S2.3). In humans, few populations have such extreme founder events, and thus, in most other cases, our inferred admixture dates should be robust to founder events (11). We note that while DATES is not a formal test of admixture, in simulations, we find that in the absence of gene flow, the method does not infer significant dates of admixture even when the target has a complex demographic history (Figure S2.15, S2.16).

Comparison to other methods

We assessed the reliability of DATES in real data by comparing our results with published methods: Globetrotter, ALDER, and ROLLOFF. These methods are designed for the analysis of present-day samples that typically have high-quality data with limited missing variants. In addition, Globetrotter uses phased data, which is challenging for ancient DNA samples. Thus, instead of rerunning other methods, we took advantage of the published results for contemporary samples presented in Hellenthal et al. (2014) (2). Following (2), we created a merged dataset including individuals from Human Genome Diversity Panel (9), Behar et al. (2010) (12), and Henn et al. (2012) (13) (Methods). We applied DATES and ALDER to 29 target groups using the reference populations reported in Hellenthal et al.
2014 (Table S12), excluding one group where the population label was unclear. Interestingly, the majority of these groups (25/29) failed ALDER’s formal test of admixture, either because the results of the single reference and two reference analyses yielded inconsistent estimates or because the target had long-range shared LD with one of the reference populations (Table S4.1). Using DATES, we inferred significant dates of admixture in 20 groups, and 14 of those were consistent with estimates based on Globetrotter. In most remaining cases, recent studies suggest the target populations may have ancestry from multiple gene flow events, either involving the same source populations or additional ancestral groups. The estimated admixture timing based on DATES, ROLLOFF, and ALDER (assuming two-way admixture regardless of the formal test results) were found to be highly concordant (Table S4.1).

Fine-scale patterns of population mixtures in ancient Europe

Recent ancient DNA studies have shown that present-day Europeans derive ancestry from three distinct sources: (a) hunter-gatherer-related ancestry that is closely related to Mesolithic hunter-gatherers (HG) from Europe; (b) Anatolian farmer-related ancestry related to Neolithic farmers from the Near East and associated with the spread of farming to Europe; and (c) Steppe pastoralist-related ancestry that is related to the Yamnaya pastoralists from Russia and Ukraine (16–19). Many open questions remain about the timing and dynamics of these population interactions, in particular related to the formation of the ancestral groups (which were themselves admixed) and their expansion across Europe.
To characterize the spatial and temporal patterns of mixtures in Europe in the past 10,000 years, we used 1,096 ancient European samples from 152 groups from the publicly available Allen Ancient DNA Resource (AADR) spanning a time range of ~8,000–350 BCE (Methods, Table SA). Using DATES, we characterized the timing of the various gene flow events, and below, we describe the key events in chronological order focusing on three main periods.
|
|
|
Post by Admin on Feb 5, 2022 19:26:35 GMT
Holocene to Mesolithic: Pre-Neolithic Europe was inhabited by hunter-gatherers until the arrival of the first farmers from the Near East (20, 21). There was large diversity among hunter-gatherers with four main groups: western hunter-gatherers (WHG) that were related to the Villabruna cluster in central Europe, eastern hunter-gatherers (EHG) from Russia and Ukraine related to the Upper Paleolithic group of Ancestral North Eurasians (ANE) ancestry, Caucasus hunter-gatherers (CHG) from Georgia associated with the first farmers from Iran, and the GoyetQ2-cluster associated with the Magdalenian culture in Spain and Portugal (18, 22–25).
Most Mesolithic HGs fall on two main clines of relatedness: one cline that extends from Scandinavia to central Europe showing variable WHG–EHG ancestry, and the other in southern Europe with WHG–GoyetQ2 ancestry (23). This ancestry is already present in the 17,000 BCE El Mirón individual from Spain, suggesting that the GoyetQ2-related gene flow occurred well before the Holocene. However, the WHG–EHG cline was formed more recently during the Mesolithic period, though the precise timing of the spread of EHG ancestry remains less well understood.

To characterize the formation of the WHG–EHG cline, we used genomic data from 16 ancient HG groups (n=101) with estimated ages of ~7,500–3,600 BCE. We first verified the ancestry of each HG group using qpAdm, which compares the allele frequency correlations between the target and a set of source populations to formally test the model of admixture and then infer the ancestry proportions for the best fitted model (16). For each target population, we chose the most parsimonious model, i.e., fitting the data with the minimum number of source populations. Consistent with previous studies, our qpAdm analysis showed that most HGs from Scandinavia, the Baltic Sea region, and central Europe could be modeled as a two-way mixture of WHG and EHG-related ancestry (Table S5.1, Note S5). To confirm that the target populations do not harbor Anatolian farmer-related ancestry (that could lead to some confounding in estimated admixture dates), we applied D-statistics of the form D(Mbuti, target, WHG, Anatolian farmers) where target = Mesolithic HGs. We observed that none of the target groups had a stronger affinity to Anatolian farmers than WHG, suggesting that the mixtures we date below reflect pre-Neolithic contacts between the HGs (Table S5.2).
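For readers unfamiliar with the D-statistic used in the check above, the following is a minimal sketch of the commonly used allele-frequency form, D(W, X; Y, Z) = sum[(w - x)(y - z)] / sum[(w + x - 2wx)(y + z - 2yz)]. The function name and toy frequencies are hypothetical. Under the sign convention assumed here, a positive value for D(Mbuti, target, WHG, Anatolian farmers) would indicate that the target shares more alleles with Anatolian farmers than with WHG. Standard errors (normally obtained with a block jackknife) are omitted.

```python
def d_statistic(freq_w, freq_x, freq_y, freq_z):
    """Allele-frequency D-statistic D(W, X; Y, Z) summed over markers.

    Each argument is a list of allele frequencies (0..1), one per marker,
    for the same allele in populations W, X, Y and Z.
    """
    num = 0.0
    den = 0.0
    for w, x, y, z in zip(freq_w, freq_x, freq_y, freq_z):
        num += (w - x) * (y - z)
        den += (w + x - 2 * w * x) * (y + z - 2 * y * z)
    if den == 0.0:
        raise ValueError("no informative markers")
    return num / den


# Toy example (hypothetical frequencies): the target matches Y, not Z.
outgroup = [0.0, 0.0]
target = [1.0, 0.0]
pop_y = [1.0, 0.0]
pop_z = [0.0, 1.0]
d_toward_y = d_statistic(outgroup, target, pop_y, pop_z)
d_swapped = d_statistic(outgroup, target, pop_z, pop_y)
```

Swapping the last two populations flips the sign, so only the ordering of Y and Z determines which affinity a positive value represents.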
To infer the timing of the mixtures in the history of Mesolithic European HGs, we applied DATES to hunter-gatherers from Scandinavia, the Baltic regions, and central Europe. We inferred that the earliest admixture occurred in Scandinavian HGs from Norway and Sweden around ~80–113 generations before the samples lived (Figure SB). Accounting for the average sampling age of the specimens and the mean human generation time of 28 years (1), this translates to a timing of admixture of ~10,200 BCE for Norway and Sweden Mesolithic individuals, though dates are more recent (~8,000 BCE) in the Motala HGs. In the Baltic region, we inferred admixture dates of ~8,700–6,000 BCE in Latvia and Lithuania HGs, postdating the mixture in Scandinavia (Figure 3). In southeast Europe, the Iron Gates region of the Danube Basin shows widespread evidence of mixtures between hunter-gatherer groups and, in the case of some outliers, mixture of hunter-gatherers and Anatolian farmer-related ancestry as early as the Mesolithic period (26). Further, these groups showed strong affinity to the WHG-related ancestry in Anatolian populations, suggesting ancient interactions with Near Eastern populations (26). We applied qpAdm to test the model of admixture in Iron Gates HG and found that the parsimonious model with WHG and EHG provides a good fit to the data. Further, when we tested the model with Anatolian-related ancestry using Anatolian HG (AHG) as an additional source population, AHG was not required as the AHG ancestry proportion was not significant (Table S5.1.1 and S5.1.2). Applying DATES to Iron Gates HG with WHG and EHG as reference populations, we inferred this group was genetically formed in ~10,000–8,400 BCE. Our samples of the Iron Gates HGs include a wide range of C14 dates between 8,800–5,700 BCE.
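The conversion described above, from generations before the sample lived to a calendar date, is simple arithmetic: multiply the number of generations by the 28-year generation time and add the result to the sample's age. The sketch below uses a hypothetical function name and hypothetical inputs (a sample from 6,000 BCE with admixture 100 generations earlier); it does not reproduce any specific estimate from the paper.

```python
GENERATION_YEARS = 28  # mean human generation time used in the text


def admixture_date_bce(generations_before_sample, sample_age_bce):
    """Calendar date of admixture in years BCE.

    generations_before_sample: inferred generations between admixture
        and the time the sampled individual lived.
    sample_age_bce: age of the sample in years BCE (e.g. from C14 dating).
    """
    return sample_age_bce + generations_before_sample * GENERATION_YEARS


# Hypothetical inputs for illustration: 6000 + 100 * 28 years.
example_date = admixture_date_bce(100, 6000)
```

Because older samples sit closer in time to the admixture event, the same calendar date corresponds to fewer generations in an older sample than in a younger one.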
We confirmed our dates were robust to the sampling age of the individuals as we obtained statistically consistent dates when all samples were combined as one group or when subsets of samples were grouped in bins of 500 years (Figure SA). The most recent dates of ~7,500 BCE were inferred in eastern Europe in Ukraine HGs, highlighting how the WHG–EHG cline was formed over a period of ~2,000–3,000 years (Figure 3, Table SC).

Early to middle Neolithic: Neolithic farming began in the Near East (the Levant, Anatolia, and Iran) and spread to Europe and other parts of the world (18, 20, 27). The first farmers of Europe were related to Anatolian farmers, whose origin remains unclear. The early Neolithic Anatolian farmers (Aceramic Anatolian farmers) had majority ancestry from AHG with some gene flow from the first farmers from Iran (26). AHG, in turn, had ancestry from Levant HG (Natufians) and some mysterious hunter-gatherer group related to the ancestors of WHG individuals from central Europe, a gene flow event that likely occurred in the late Pleistocene (26). Using qpAdm, we confirmed that early Anatolian farmers could be modeled as a mixture of AHG and Iran Neolithic farmer-related groups (Note S5). To learn about the timing of the genetic formation of early Anatolian farmers, we applied DATES using one reference group as a set of pooled individuals of WHG-related and Levant Neolithic farmers-related individuals as a proxy of AHG ancestry and the second reference group containing pooled Iran Neolithic farmer-related individuals. We note that the application of DATES to three-way admixed groups can lead to intermediate dates between the first and second pulse of gene flow unless the reference populations are chosen carefully (Table S2.1).
Our setup for early Anatolian farmers should have limited confounding and should recover the timing of the most recent event (in this case, the gene flow from CHG or Iran Neolithic-related groups) reliably (Table S2.1). We infer the Iran Neolithic farmer-related gene flow occurred ~10,900 BCE (12,200–9,600 BCE), predating the origin of farming in Anatolia (28). During the subsequent millennia, these early farmers further admixed with Levant Neolithic groups to form Anatolian Neolithic farmers who spread towards the west to Europe and in the east to mix with Iran Neolithic farmers, forming the Chalcolithic groups of Seh Gabi and Hajji Firuz. Using DATES, we inferred the Chalcolithic groups were genetically formed in ~7,600–5,700 BCE (Table SC).

In Europe, the Anatolian Neolithic farmers mixed with the local indigenous hunter-gatherers, replacing between ~3–50% ancestry of Neolithic Europeans. To elucidate the fine-scale patterns and regional dynamics of these mixtures, we applied DATES to time transect samples from 94 groups (n=657) sampled from sixteen regions in Europe, ranging from ~6,000–1,900 BCE and encompassing individuals from the early Neolithic to Chalcolithic periods (Table SB). Using qpAdm, we first confirmed that the Neolithic Europeans could be modeled as a mixture of European hunter-gatherer-related ancestry and Anatolian farmer-related ancestry and inferred their ancestry proportions (Table SD). For most target populations (~80%), we found the model of gene
|
|
|
Post by Admin on Feb 5, 2022 21:31:06 GMT
Figure 2: Genetic formation of early Anatolian farmers and early Bronze Age Steppe pastoralists.

flow between Anatolian farmer-related and WHG-related ancestry provided a good fit to the data (p-value > 0.05). In some populations, we found variation in the source of the HG-related ancestry and including either EHG or GoyetQ2 improved the fit of the model. In five groups, none of the models fit, despite excluding outlier individuals whose ancestry profile differed from the majority of the individuals in the group (Table SD, Table SE). To confirm that the target populations do not harbor Steppe pastoralist-related ancestry, we applied D-statistics of the form D(Mbuti, target, Anatolian farmers, Steppe pastoralists) where target = Neolithic European groups. We observed that four groups had a stronger affinity to Steppe pastoralists compared to Anatolian farmers, and hence we excluded these from further analysis (Table SF). After filtering, we applied DATES to 86 European Neolithic groups using WHG-related individuals and Anatolian farmers as reference populations.

Earlier analysis has suggested that farming spread along two main routes in Europe, from southeast to central Europe (‘continental route’) and along the Mediterranean coastline to Iberia (‘coastal route’) (23, 29, 30). Consistent with this, we inferred one of the earliest timings of gene flow in the Balkans around 6,400 BCE. Using the most comprehensive time-transect in Hungary with 19 groups (n=63) spanning from middle Neolithic to late Chalcolithic, we inferred that the admixture occurred between ~6,100–4,500 BCE. Under a model of a single shared gene flow event in the common ancestors of all individuals, we would expect to obtain similar dates of admixture (before present) after accounting for the age of the ancient specimens. Similar to Lipson et al.
(2017), we observed that the estimated dates in middle Neolithic individuals were substantially older than those inferred in late Neolithic or Chalcolithic individuals (Figure 3). This would be expected if the underlying model of gene flow involved multiple pulses of gene flow, such that the timing in the middle Neolithic samples reflects the initial two-way mixture and the timing in the Chalcolithic samples captures both recent and older events. Interestingly, Lipson et al. (2017) and other recent studies have documented increasing HG ancestry from ~3–15% from the Neolithic to Chalcolithic period (16, 23, 31), suggesting that there was additional HG gene flow after the initial mixture. This highlights that the interactions between local hunter-gatherers and incoming Anatolian farmers were complex with multiple gene flow events between these two groups, which explains the increasing HG ancestry and more recent dates in Chalcolithic individuals (Table SD).

Mirroring the pattern in Hungary, we documented the resurgence of HG ancestry in the Czech Republic, France, Germany, and southern Europe. In central Europe, we inferred that the Anatolian farmer-related gene flow occurred ~5,600–5,000 BCE, with some exceptions. In the Blätterhöhle site from Germany, we inferred the gene flow occurred more recently (~4,000 BCE), consistent with the occupation of both hunter-gatherers and farmers in this region until the late Neolithic (31). In eastern Europe, using samples related to the Funnel Beaker culture (TRB; from German Trichterbecher) from Poland, we dated the Anatolian farmer-related gene flow to ~5,300–4,200 BCE. Following the TRB decline, the Baden culture and the Globular Amphora culture appeared in many areas of Poland and Ukraine (25).
These cultures had close contacts with the Corded Ware complex and steppe societies, though we did not find any evidence of Steppe pastoralist-related ancestry in the Globular Amphora culture (GAC) individuals (Table SD). Applying DATES, we inferred the Anatolian farmer-related and HG mixture occurred ~5,200–3,100 BCE, predating the spread of Steppe pastoralists to eastern Europe (16, 19).

Along the Mediterranean route, we characterized Anatolian farmer-related gene flow in Italy, Iberia, France, and the British Isles. Using samples from five groups in Italy, we inferred the earliest dates of Anatolian farmer-related gene flow of ~6,100 BCE, and within the millennium, the ancestry spread from Sardinia to Sicily (Figure 3). In Iberia, the Anatolian farmer-related mixture occurred ~6,000–3,400 BCE and showed evidence for an increase in HG ancestry from ~9–20% after the initial gene flow. In France, previous studies have shown that Anatolian farmer-related ancestry came from both routes, along the Danubian in the north and along the Mediterranean in the south (23). This is reflected in the source of the HG ancestry, which is predominantly EHG- and WHG-related in the north and includes WHG and GoyetQ2 ancestry in the south (23). Consistently, we also observed that the admixture dates in France were structured along these routes, with the median estimate of ~5,100 BCE in the east and much older ~5,500 BCE in the south (Table SC). In Scandinavia, we inferred markedly more recent dates of admixture of ~4,300 BCE using samples from Sweden associated with the TRB culture and Ansarve Megalithic tombs, consistent with a late introduction of farming to Scandinavia (33).

Finally, we inferred recent dates of admixture in Neolithic samples from the British Isles (England, Scotland, and Ireland) with the median timing of ~5,000 BCE across the three regions.
Interestingly, unlike in western and southern Europe, there was no resurgence in HG ancestry during the Neolithic in Britain (34). This suggests our dates can be interpreted as the time of the main mixture of HGs and Anatolian farmers in this region, implying that the farmer-related ancestry reached Britain a millennium after its arrival in continental Europe. By 4,300 BCE, we find that Anatolian farmer-related ancestry is present in nearly all regions in Europe.
|
|