Post by Admin on Jul 7, 2021 21:07:44 GMT
Genetic origins and complexity of BA Xinjiang populations
To determine the genetic differences and affinities among the BA Xinjiang groups, we first conducted a discriminant analysis of principal components (DAPC) based on haplogroup frequency. PC1 explains population variation from east to west geographically, and PC2 explains the variation from north to south (Fig. 2A and fig. S5). In general, all the populations were divided into four main clusters: northeastern Asian (NEA: Siberian and northern East Asian), southeastern Asian (SEA: southern East Asian and Southeast Asian), central Steppe, and European (Turan and European). All the ancient Xinjiang samples lie on a cline extending from the NEA populations to the central Steppe and European clusters (Fig. 2A and fig. S5), suggesting that these ancient Xinjiang populations had varying degrees of relatedness to NEA, central Steppe, and European populations.
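The DAPC above takes per-population haplogroup frequencies as input. As a minimal sketch of how such a frequency table could be built from individual haplogroup calls, the snippet below tallies calls per population and normalizes them. The function name `haplogroup_frequencies`, the population labels, and the toy calls are all hypothetical and are not taken from the study's data or pipeline.

```python
from collections import Counter


def haplogroup_frequencies(assignments):
    """Return {population: {haplogroup: frequency}} from (population, haplogroup) pairs."""
    counts = {}
    for population, haplogroup in assignments:
        counts.setdefault(population, Counter())[haplogroup] += 1
    return {
        pop: {hg: n / sum(c.values()) for hg, n in c.items()}
        for pop, c in counts.items()
    }


# Hypothetical toy calls, for illustration only (not the study's samples).
toy_calls = [
    ("PopA", "U"), ("PopA", "U"), ("PopA", "H"), ("PopA", "D"),
    ("PopB", "D"), ("PopB", "D"), ("PopB", "C"), ("PopB", "U"),
]
freqs = haplogroup_frequencies(toy_calls)
for pop, table in freqs.items():
    print(pop, table)
```

Each row of the resulting table sums to 1, so populations with different sample sizes can be compared on the same scale before any ordination step.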
Fig. 2 The genetic results for the ancient Xinjiang samples and other ancient and present-day Eurasians.
(A) DAPC based on haplogroup frequency; the eigenvalues of the first and second PCs are shown on the top right. The different colors on the plot represent different groups made under unsupervised classification. The shapes with black frames represent the ancient Xinjiang samples, with the BA samples in dark red and the IA/HE samples in dark blue. The populations plotted as triangles are ancient populations, and the circles are present-day humans. NEA, northern East Asian (upward pointing triangles), and Siberia (triangles pointing to the right); SEA, southeastern Asia; Euro, Europe; CS, central Steppe populations in BA (CS_BA; upward pointing triangles) and IA (CS_IA; triangles pointing to the right); pdCA/XJ, present-day populations in and around Xinjiang. (B) The genetic distance (FST) heatmap plot of ancient Xinjiang samples and ancient Eurasians. The different labels represent genetically distinctive groups corresponding to those in the DAPC. Values with FST ≈ 0.00 are in white, representing a close genetic relationship. SEXiaohe_BA was removed from the FST heatmap because of the large genetic distances (FST > 0.10) between it and the other groups.
We find that the EMBA Xinjiang individuals from northern and western Xinjiang, associated with the Afanasievo culture (NWAfana_EMBA), are surrounded by western Steppe–related populations (WSteppe_EMBA and WSteppe_MLBA) (Fig. 2A and fig. S5). In contrast, the EMBA individuals associated with the Chemurchek culture (NChemur_EMBA) and individuals from the Songshugou site in northern Xinjiang (NSSG_EMBA) form their own separate cluster surrounded by other central Steppe populations (Fig. 2A and fig. S5). High proportions of U, H, and R haplogroups are observed, which were reported primarily in BA Steppe populations (table S1). Although there is no significant genetic differentiation among these EMBA individuals (FST ≈ 0.00, P > 0.05) (Fig. 2B, fig. S6, and table S2), we find that only NChemur_EMBA shows significant genetic differentiation from both western Steppe populations (WSteppe_EMBA: FST ≈ 0.045, P ≈ 0.005; WSteppe_MLBA: FST ≈ 0.042, P ≈ 0.006) (Fig. 2B and table S2). NChemur_EMBA also shows significant genetic differentiation from CSteppe_MLBA (FST ≈ 0.057, P ≈ 0.002) but not from CSteppe_EMBA (FST ≈ 0.029, P ≈ 0.16) (Fig. 2B and table S2), consistent with its position on the DAPC plot (fig. S5C). Moreover, in the median-joining networks, the WSteppe_EMBA individuals cluster with NWAfana_EMBA in haplogroups U4, U5, and H15 (Fig. 3D and fig. S7B); with NChemur_EMBA in U4, U5, H2, H6a, and W3 (Fig. 3D and fig. S7, B and D); and with NSSG_EMBA in U4 (Fig. 3D and fig. S7B) (39, 40). The two haplogroups H2 and H5 are present in the western Steppe–related populations, and H6a appears in the populations related to the Okunevo culture present in the Altai region (fig. S7B and table S1) (40). 
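The FST values and P values quoted above come from pairwise population comparisons. To illustrate the general idea, here is a rough sketch of a haplogroup-based differentiation statistic with a permutation test. It uses a simple Nei-style estimator, FST = (H_T − H_S) / H_T, computed on haplogroup labels, which is an assumption made for illustration and not necessarily the estimator or software used in the study. The function names and toy samples are hypothetical.

```python
import random
from collections import Counter


def gene_diversity(labels):
    """Gene diversity H = 1 - sum(p_i^2) over haplogroup frequencies."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def pairwise_fst(pop1, pop2):
    """Nei-style FST = (H_T - H_S) / H_T with sample-size weighting."""
    n1, n2 = len(pop1), len(pop2)
    h_s = (n1 * gene_diversity(pop1) + n2 * gene_diversity(pop2)) / (n1 + n2)
    h_t = gene_diversity(pop1 + pop2)
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t


def permutation_p_value(pop1, pop2, n_perm=1000, seed=0):
    """Share of label shuffles whose FST is at least the observed FST."""
    rng = random.Random(seed)
    observed = pairwise_fst(pop1, pop2)
    pooled = list(pop1) + list(pop2)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if pairwise_fst(pooled[:len(pop1)], pooled[len(pop1):]) >= observed:
            hits += 1
    # Add-one correction keeps the estimated P value above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy samples (haplogroup labels per individual).
same_a = ["U", "U", "H", "H"]
same_b = ["U", "U", "H", "H"]
distinct_a = ["U"] * 4
distinct_b = ["D"] * 4

fst_same = pairwise_fst(same_a, same_b)
fst_distinct, p_distinct = permutation_p_value(distinct_a, distinct_b)
print(fst_same, fst_distinct, p_distinct)
```

Under this estimator, two samples with identical haplogroup composition give an FST of 0 and two samples sharing no haplogroups give 1. The permutation P value asks how often random relabeling of individuals produces differentiation at least as large as that observed.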
We also find an NEA connection (Baikal_EBA) with NChemur_EMBA supported by the appearance of haplogroup D4j (table S1), which differed in these two populations by only four mutations in the network analysis and appeared in northern East Asia, including the southern Siberian region (Fig. 3E and fig. S7A) (41). The one EMBA sample from HBH has the haplogroup U5 (table S1), suggesting a more western Steppe–related connection. Therefore, we demonstrate that both western and northern Xinjiang populations have considerable western Steppe–related ancestry during the EMBA.
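Statements such as "differed by only four mutations" refer to the number of mutational steps separating haplotypes in a network. As a simplified sketch, the number of differing sites between two aligned sequences can be counted as below. The sequences and the function name `mutation_steps` are hypothetical, and a real median-joining analysis works on full haplotype alignments and may infer intermediate nodes, which this sketch does not attempt.

```python
def mutation_steps(seq_a, seq_b):
    """Count differing sites between two aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


# Hypothetical aligned fragments, for illustration only.
hap_x = "ACGTACGTAC"
hap_y = "ACGAACGTTC"
steps = mutation_steps(hap_x, hap_y)
print(steps)
```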
Fig. 3 Median-joining haplogroup networks.
The median-joining network of the haplogroups C4 (A), R1b (B), HV (C), U4 (D), and D4 (E) related to ancient northeastern Asian (NEA), Botai/Dali, European (Euro), Turan, and BA and IA populations from the central Steppe (CS_BA and CS_IA). The Euro group consists mainly of western Steppe–related individuals, and the CS_IA group contains the Saka, Hun, and Nomad populations. The size of the circles represents the proportion of each haplotype.
We find that eastern groups (E_BA and E_LBA) cluster separately from the EMBA individuals in northern and western Xinjiang. Both of the eastern groups cluster with ancient and present-day NEA in the DAPC (Fig. 2A and fig. S5). E_BA and E_LBA harbor high proportions of the haplogroup D (~36.70 and 32.00%, respectively), which is a common lineage in ancient and present-day NEA populations (42, 43) including northern Chinese (18.20 to 44.80%) and ancient Mongolians (31.20%) (Figs. 1A and 3E and table S3). E_LBA also shows nonsignificant genetic distances to some of these NEA populations, specifically two ancient Gan-Qing populations (GQQijia_BA and GQKayue_LBA; FST < 0.05, P > 0.05) and four present-day populations (Japanese, Mongolian, Tu, and Oroqen; FST < 0.03, P > 0.05) (fig. S6, B and C, and table S2). Although both E_BA and E_LBA have the western Steppe–related haplogroup U, they show a higher proportion of lineages from NEA than from Europe, with more European lineages appearing in later samples (20% in E_BA and 36% in E_LBA) (table S3). This pattern is consistent with DAPC in which E_LBA plots closer to West Eurasians compared to E_BA (Fig. 2A). Moreover, haplogroup D4b2b4 is found in both the Xiongnu and E_LBA (Fig. 3E), which suggests a direct relationship between E_LBA and Xiongnu populations due to the presence of shared NEA ancestry. Thus, E_BA and E_LBA populations show more NEA connections, but the presence of western Steppe–related lineages (U, 16.7% in E_BA and 8% in E_LBA) also supports additional connections to the western Steppe–related populations (table S3).
Although SEXiaohe_BA clusters with the NEA groups in the DAPC, as E_BA and E_LBA do, it shows more affinity for populations with ancient and present-day Siberian ancestry (Fig. 2A and fig. S5). SEXiaohe_BA has a high proportion of the C4 haplogroup (six of seven individuals), which is present in ancient and present-day Siberian populations, including NEA and Shamanka populations from near the Lake Baikal region of South Siberia (Fig. 3A and fig. S7A). This population is unique in yielding significant genetic distances from all other ancient and present-day populations (FST > 0.11, most P values < 0.04), including other BA Xinjiang groups; its lowest genetic distances are to three present-day populations from Siberia (Even, Evenk, and Yakut: FST < 0.13) (table S2). These results are consistent with previous studies on Xiaohe (19, 20).
We also find the mtDNA haplogroup R1b in BA Xinjiang samples (NChemur_EMBA, n = 2; NWAfana_EMBA, n = 1; NSSG_EMBA, n = 1; SEXiaohe_BA, n = 1) and in IA and HE populations from eastern and western Xinjiang (E_IA, n = 1; W_IA, n = 1; W_HE, n = 2) (table S1). This haplogroup was reported not only in East European Hunter-Gatherers (Karelia) (44) but also in Botai (40) and Dali (28) individuals from Kazakhstan. Moreover, the haplogroup K1b2 was shared among the Botai (40) and western Steppe–related populations as well as our LBA samples from eastern Xinjiang (E_LBA) (table S1). The R1b median-joining network shows that the EMBA sample (3012 to 2890 cal BCE) from northern and western Xinjiang, associated with the Afanasievo culture (NWAfana_EMBA), plots in the center of the network and is separated from Botai by only a single mutation (Fig. 3B). This branch, in turn, is associated with NSSG_EMBA and another branch that includes an individual from the Dali site (Fig. 3B). This may suggest either a deep ancestry connection with an Ancient North Eurasian (ANE) population or some genetic connections with geographically proximal populations from Kazakhstan (Dali and Botai) (28, 40). We also find the R1b haplogroup in one of the individuals from the Xiaohe population (Fig. 3B), which may also suggest a North Xinjiang connection with Xiaohe people (11). Thus, during the BA, the northwestern Xinjiang populations showed a high genetic affinity for western Steppe–related cultures, such as the Afanasievo and Chemurchek, and the southeastern Xinjiang populations for NEA and South Siberian populations (Fig. 4A), suggesting a scenario of complex interactions with the neighboring populations and communities of diverse cultural backgrounds.