Post by Admin on Mar 27, 2021 0:34:50 GMT
Admixture modeling of IA steppe populations
Genetic ancestry modeling of the IA groups performed with qpWave and qpAdm confirmed that the steppe_MLBA groups adequately approximate the western Eurasian ancestry source in IA Scythians while the preceding steppe_EBA groups (e.g., Yamnaya and Afanasievo) do not (data file S4). As an eastern Eurasian proxy, we chose LBA herders from Khovsgol in northern Mongolia based on their geographic and temporal proximity. Other eastern proxies fail the model because of a lack or an excess of affinity toward the Ancient North Eurasian (ANE) lineage (25). However, this two-way admixture model of Khovsgol + steppe_MLBA does not fully explain the genetic compositions of the Scythian gene pools (data file S4). We find that the missing piece matches well with a small contribution from a source related to ancient populations living in the southern regions of the Caucasus/Iran or Turan [we use the term “Turan” for consistency with (7), only in its geographical meaning, designating the southern part of Central Asia; Fig. 3A]. The proportions of this ancestry increase through time and space: a negligible amount in the most northeastern Aldy_Bel_700BCE group, ~6% in the early Tasmola_650BCE, ~12% in Pazyryk_Berel_50BCE, ~10% in Sargat_300BCE, ~13% in Saka_TianShan_600BCE, and ~20% in Saka_TianShan_400BCE (Fig. 3A), in line with f4-statistics (table S2). Sarmatians also require 15 to 20% Iranian ancestry while carrying substantially less Khovsgol and more steppe_MLBA-related ancestry than the eastern Scythian groups.
Fig. 3 Bar plots showing the ancestry proportions and SEs obtained from qpWave/qpAdm modeling.
(A) Fitting models for the main IA groups using LBA sources, and the major genetic shift with the “new” East Asian influx (DevilsCave_N-like) observed in the Middle IA outliers and Korgantas. (B) Fitting models for the post-IA groups using IA groups as sources. A transparency factor is added to the models presenting poor fits (P < 0.05; only Konyr_Tobe_300CE). The color legend for the sources tested is shown at the top. (C) Summary of the admixture dates obtained with DATES for the main groups studied. The y axis is the temporal scale from BCE (negative) to CE (positive) dates. The x axis represents the results for the different target groups reported in the legends in each box using the two-way sources reported at the bottom of the three panels formed along the x axis (e.g., source1 + source2). The colored bars represent the date ranges of the culture, while the filled symbols show the admixture dates ± SEs obtained from DATES and converted into dates considering 29 years per generation starting from the median point of the culture’s age. The three sets of sources reported correspond to the summary of the main admixture events described in the text from left to right: the LBA formation of the Scythian gene pools; the BMAC-related influx increasing through time in the Tian Shan Sakas; and the new eastern influx starting in the IA and continuing throughout the centuries. A number-based key (the white numbers from 1 to 6 inside the black circles) connects different tests and analyses shown in the figure with the corresponding arrow in Fig. 4.
For Sarmatians and later Tian Shan Sakas, only the groups from Turan (i.e., Turan_ChL, BMAC, and postBMAC) match as sources, while groups from Iran and Caucasus fail; we chose BMAC and postBMAC as the representative proxies (Fig. 3A and data file S4). The extra eastern Eurasian influx in the outliers (Tasmola_Birlik_640BCE, Korgantas_300BCE, and Pazyryk_Berel_50BCE_o) is not sourced from the same eastern proxies as the previous groups (i.e., Khovsgol); instead, it can only be modeled with an ancient northeast Asian (ANA) lineage, represented by the early Neolithic groups from the Devil’s Gate Cave site in the Russian Far East (DevilsCave_N) (Fig. 3A and data file S4).
Post-IA genetic turnovers in the Kazakh Steppe
We observe an intensification of the new eastern Eurasian influx described above among the individuals from the early 1st millennium CE (“Xianbei_Hun_Berel_300CE”) as well as the later 7th- to 11th-century CE individuals (“Karakaba_830CE” and “Kayalyk_950CE”). They are scattered along PC1 from the main IA Tasmola/Pazyryk cluster toward the ANA groups (Fig. 2C). The two individuals associated with Hun elite burials dated from the third century CE, one from the site of Kurayly in the Aktobe region in western Kazakhstan and the other from Budapest, Hungary (“Hun_elite_350CE”), cluster closely together along this cline (Fig. 2C and figs. S1 to S3).
The individuals from the ancient city of Otyrar Oasis in southern Kazakhstan show a quite distinct genetic profile. Three of five individuals (“Konyr_Tobe_300CE”) fall close to the published Kangju_250CE individuals from a similar time period and region (11), between Sarmatians and BMAC (Fig. 2C). KNT005 is shifted toward BMAC in PCA (Fig. 2C and fig. S1). Furthermore, KNT005 is the only one carrying a South Asian Y haplogroup, L1a2 (data file S1), and showing a South Asian genetic component in ADMIXTURE (Fig. 2D and fig. S2). KNT004 is shifted in PC1 toward East Asians (figs. S1 to S3). Admixture models including ~10% South Asian and ~50% eastern Eurasian influx adequately explain KNT005 and KNT004, respectively (data file S4). In contrast, the individuals from the site of Alai Nura (Alai_Nura_300CE) in the Tian Shan mountains (~200 km east of the Konyr Tobe site) still lie along the IA cline of the Tian Shan Saka, with four individuals falling closer to Konyr_Tobe_300CE and four closer to the Tasmola/Pazyryk cloud (Fig. 2C and figs. S1 to S3).
Dating ancient admixture
Admixture dating with the DATES program reveals an early formation of the main Scythian gene pools during 1000 to 1500 BCE (Fig. 3C and fig. S4). DATES is designed to model only two-way admixture, so to account for the estimated three-way models obtained with qpWave and qpAdm, we independently tested the three pairwise comparisons among steppe_MLBA, BMAC, and Khovsgol. DATES was successful in fitting exponential decays for the two western + eastern Eurasian pairs, steppe_MLBA + Khovsgol and BMAC + Khovsgol, while failing in the western + western Eurasian pair (steppe_MLBA + BMAC) (fig. S4 and table S3). For each target, steppe_MLBA + Khovsgol and BMAC + Khovsgol yielded nearly identical admixture date estimates (table S3). We believe that our estimates mostly reflect an average date between the genetically distinguishable eastern (Khovsgol) and western (steppe_MLBA + BMAC) ancestries, weighted by the relative contribution from the two western sources, rather than reflecting a true simultaneous three-way admixture. It is noteworthy that DATES found increasingly younger admixture dates in the Tian Shan Saka groups as the BMAC-related ancestry increases: from Saka_TianShan_600BCE to the Saka_TianShan_400BCE and especially in the later Alai_Nura_300CE as well as for Pazyryk_Berel_50BCE and Sargat_300BCE with respect to the date of Tasmola_650BCE (~1100 to 900 BCE with respect to ~1300 to 1400 BCE; Fig. 3C). A small-scale gene flow from a BMAC-related source continuing over the IA may explain both the increase in the BMAC-related ancestry proportion and increasingly younger admixture dates (Fig. 3A). Again, the inferred dates reflect an average over the IA admixture with a BMAC-related source and the LBA one with steppe_MLBA; therefore, they are likely shifted toward older time periods than the actual time of the IA gene flow.
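The averaging effect described above can be illustrated with a toy calculation. The sketch below is not the DATES algorithm; it only assumes that the ancestry covariance from a single admixture pulse decays as exp(-t * d), with t in generations and d the genetic distance in Morgans, and that two pulses add up as a weighted sum. The pulse ages, weights, distance grid, and function names are all hypothetical choices made for illustration.

```python
import math

def decay_curve(distances_morgans, pulses):
    """Sum of exponential decays, one per pulse.

    pulses is a list of (generations_ago, weight) pairs (hypothetical inputs).
    """
    return [
        sum(weight * math.exp(-gens * d) for gens, weight in pulses)
        for d in distances_morgans
    ]

def fit_single_pulse(distances_morgans, values):
    """Fit one exponential by ordinary least squares on log(values).

    Returns the implied admixture age in generations (minus the slope).
    """
    logs = [math.log(v) for v in values]
    n = len(distances_morgans)
    mean_x = sum(distances_morgans) / n
    mean_y = sum(logs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(distances_morgans, logs))
    sxx = sum((x - mean_x) ** 2 for x in distances_morgans)
    return -sxy / sxx

# Assumed grid: 0.05 cM to 2 cM, expressed in Morgans.
distances = [0.0005 * i for i in range(1, 41)]

# Assumed scenario: an older pulse (100 generations) plus a younger one (40).
two_pulse = decay_curve(distances, [(100.0, 0.7), (40.0, 0.3)])
estimate = fit_single_pulse(distances, two_pulse)
print(f"single-pulse estimate: {estimate:.1f} generations")
```

Under these assumptions the single-pulse estimate always falls between the two true ages, so it is older than the younger pulse, which is the direction of bias the text describes for the IA gene flow.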
Confirming the results from qpAdm, the admixed individuals from Tasmola_Birlik_640BCE and Korgantas_300BCE (“admixed_Eastern_out_IA”) show very recent admixture dates (Fig. 3C, fig. S4, and table S3). The later groups of Xianbei_Hun_Berel_300CE, Hun_elite_350CE, and Karakaba_830CE further corroborate this trend of recent dates of admixture, revealing that this new eastern influx likely started in the IA and continued at least during the first centuries of the first millennium CE (Fig. 3C, fig. S4, and table S3).
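The conversion from generations to calendar dates mentioned in the Fig. 3C caption (29 years per generation, counted back from the median point of the culture's age) can be sketched as follows. The function names and the example numbers are hypothetical and are not taken from table S3; years are signed as on the figure's y axis (negative for BCE), and the absence of a year zero is ignored.

```python
YEARS_PER_GENERATION = 29  # value stated in the Fig. 3C caption

def admixture_year(culture_median_year, generations, se_generations):
    """Convert a date in generations into a signed calendar year.

    culture_median_year: median of the culture's date range (negative = BCE).
    Returns (year, standard error in years).
    """
    year = culture_median_year - generations * YEARS_PER_GENERATION
    se_years = se_generations * YEARS_PER_GENERATION
    return year, se_years

def format_year(year):
    """Render a signed year as 'N BCE' or 'N CE'."""
    rounded = round(year)
    return f"{-rounded} BCE" if rounded < 0 else f"{rounded} CE"

# Hypothetical example: a group with a median date of 650 BCE and an
# admixture estimate of 25 ± 3 generations before that point.
year, se = admixture_year(-650, 25, 3)
print(format_year(year), "±", se, "years")
```

Because the offset is counted from the culture's median date, an identical estimate in generations maps to a younger calendar date for a later group.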
Present-day Kazakhs
PCA, ADMIXTURE, and CHROMOPAINTER/fineSTRUCTURE fine-scale haplotype-based analyses performed on present-day Kazakhs reveal a tight clustering and absence of detectable substructure among Kazakhs regardless of the geographic location or Zhuz affiliation (Fig. 2 and fig. S5). We still grouped the Kazakh individuals according to their Zhuz affiliations (which roughly reflects their geographic origin) and ran Globetrotter analyses following the pipeline in (26) as independent replicates to identify the different ancestry sources contributing to the gene pool of Kazakhs and date admixture events. Globetrotter analyses confirmed that the three groups have the same source composition and admixture dates and are a result of a complex mixture of different western, southern, and eastern Eurasian ancestries (table S4). The dates of admixture identified by Globetrotter highlight a narrow and recent time range for the formation of the present-day Kazakh gene pool, between 1341 and 1544 CE (table S5).