Post by Admin on Mar 10, 2022 19:58:23 GMT
Ancient genomes from the Himalayas illuminate the genetic history of Tibetans and their Tibeto-Burman speaking neighbors
Abstract
Present-day Tibetans have adapted both genetically and culturally to the high altitude environment of the Tibetan Plateau, but fundamental questions about their origins remain unanswered. Recent archaeological and genetic research suggests the presence of an early population on the Plateau within the past 40 thousand years, followed by the arrival of subsequent groups within the past 10 thousand years. Here, we obtain new genome-wide data for 33 ancient individuals from high elevation sites on the southern fringe of the Tibetan Plateau in Nepal, who we show are most closely related to present-day Tibetans. They derive most of their ancestry from groups related to Late Neolithic populations at the northeastern edge of the Tibetan Plateau but also harbor a minor genetic component from a distinct and deep Paleolithic Eurasian ancestry. In contrast to their Tibetan neighbors, present-day non-Tibetan Tibeto-Burman speakers living at mid-elevations along the southern and eastern margins of the Plateau form a genetic cline that reflects a distinct genetic history. Finally, a comparison between ancient and present-day highlanders confirms ongoing positive selection of high altitude adaptive alleles.
Nature Communications volume 13, Article number: 1203 (2022)
Post by Admin on Mar 10, 2022 20:35:04 GMT
Introduction
The Tibetan Plateau is characterized by hypobaric conditions, rough terrain, cold temperatures, and a relatively low biological productivity. Despite these constraints, ethnic Tibetans have successfully adapted to this environment and have lived on the plateau for millennia1. Understanding their genetic and cultural adaptations to this challenging hypoxic environment is of great archaeological, anthropological, genetic, and physiological interest2. To fully do so requires answering many fundamental questions regarding the origins of present-day Tibetan populations, including the source populations and initial movements of peoples onto the Plateau, the timing of the establishment of permanent Plateau populations, and the establishment of the gene pools ancestral to the present-day Tibetans.
Although archaeological data relating to early population movements onto the Plateau are sparse, the Baishiya Karst Cave site (3280 masl, meters above sea level) on the extreme northeastern edge of the Tibetan Plateau suggests the presence of Denisovan-related peoples between 160 and 60 thousand years ago (kya)3,4,5. Dates at the Nwya Devu site (4600 masl) on the central Plateau suggest a modern human presence between 30 and 40 kya6. Whether either of these sites reflects a permanent settlement of humans on the Plateau is unknown. Meyer et al.7 propose an initial permanent occupation of the central Plateau at Chusang (4270 masl) by hunter-gatherers between 7.4 and 12.7 kya. In contrast, Chen et al.8 and others have argued that a permanent population on the central Plateau was not possible until the advent of barley-based agriculture around 3.6 kya. The latter model generally presumes that agriculture was introduced onto the Plateau by migrants from lower elevation sites (<2500 masl) along the northeastern margins of the Plateau; these migrants are proposed to have contributed substantially to the gene pool of present-day Tibetans9.
However, evidence for more complex, multiple origins of present-day Tibetans is also supported by genetic data. Densely sampled uniparental markers can be traced for the most part to lineages present in northern East Asia since the early Holocene, but older haplogroups such as mitochondrial M16 and Y chromosomal D-M174, originating from a deep Eurasian lineage, are also uniquely present among present-day Tibetans10,11,12. The idea of an ancient Paleolithic contribution to the Tibetan gene pool has also been proposed based on whole genome sequence data. A study comparing present-day Tibetan genomes to those of ancient Siberians and archaic hominins inferred a contribution from a mixture of ancient ancestries—archaic and non-archaic—among the hypothesized early peoples on the Plateau13. This proposal is consistent with the finding of a haplotype at the EPAS1 (Endothelial PAS Domain Protein 1) locus that introgressed from a Denisovan-like population into the present-day Tibetan gene pool, conferring a selective advantage in high altitude environments14,15,16,17.
Taken together, current genetic data suggest a multi-stage settlement of the Plateau: movements of Pleistocene-era populations with some level of archaic admixture onto the Plateau followed by Holocene-era migrations from the northeastern edges of the Plateau. Although the identity and origins of the Pleistocene-era population remain unknown, a recent analysis has identified a clear east-west cline of genetic variation within present-day geographically dispersed Tibetan populations18. This cline may reflect Neolithic population movements, such as those that might have been associated with the spread of barley agriculture. Prior to the spread into the Plateau, barley agriculture was practiced by Late Neolithic and Early Bronze Age populations in the Gansu-Qinghai region, such as those associated with the Qijia culture (ca. 2300–1800 BCE)19,20. This cline may also have been established or reinforced by later historical events, such as the expansion of the Tibetan empire since the 7th century CE, or by a prolonged process of gene flow between nearby populations in an isolation-by-distance manner that did not involve long-range migrations.
Ancient DNA (aDNA) data has the potential to resolve these questions, in part because genetic inferences from ancient populations are not confounded by recent historical events. Previous aDNA studies of individuals from three high elevation Himalayan sites in the Mustang district of north-central Nepal dating to 800 BCE–650 CE showed that these sites were inhabited by populations of clear East Asian ancestry who had likely migrated from the Tibetan Plateau21.
Here we obtain aDNA data from additional individuals from these and four additional Himalayan sites in the Mustang and Manang districts (MMD), increasing the temporal coverage by more than 600 years, from ca. 1420 BCE–650 CE, and providing the earliest genetic evidence to date for Plateau populations. We show that these ancient Himalayan populations genetically cluster with present-day Tibetans and that they represent an early branch within the Tibetan lineage, making them particularly informative for inferring the history of the Tibetan gene pool, its origins, and its current distribution among the present-day Tibetans and their neighbors.
Post by Admin on Mar 10, 2022 22:12:35 GMT
Results
Ancient genomes from the Himalayas
Here we analyze genome-wide data of 38 ancient individuals from seven sites in the MMD region, Nepal (Fig. 1; Supplementary Data 1–3): Suila (n = 1; 1494–1317 BCE), Lubrak (n = 2; 1269–1123 BCE), Chokhopani (n = 3; 801–770 BCE), Rhirhi (n = 4; 805–767 BCE), Kyang (n = 7; 695–206 BCE), Mebrak (n = 9; 500 BCE–1 CE), and Samdzong (n = 12; 450–650 CE). Of these 38 individuals, 31 are newly reported in this study and seven were previously reported in a prior study of the region21. We also produced new data for two of the previously published individuals, resulting in new genome-wide data for 33 individuals (Supplementary Data 1, 2). All data were generated from human dental material. Due to disturbance of mortuary contexts, some teeth were initially assumed to be from distinct individuals but were later identified as replicate samples based on genetic data, resulting in nine individuals with data from multiple teeth (Supplementary Data 4). Data from multiple teeth and libraries belonging to a single individual were pooled accordingly prior to downstream analyses. Among the seven archaeological sites, Suila, Lubrak, Rhirhi, and Kyang have not been previously described (Supplementary Text 1). After initial genetic screening, 13/33 individuals were whole genome sequenced to low coverage (0.5-6.6x per individual; Supplementary Data 1). We additionally applied capture-enrichment methods to target two sets of single nucleotide polymorphisms (SNPs): (1) a set of “1240K” variants, designed to intersect with markers on the Affymetrix Human Origins and the Illumina genotyping arrays22 and here captured for all 33 individuals; and (2) an additional set of 50K variants, selected and curated from selection scan and phenotype association signals in present-day Tibetan populations23 and captured for 21 individuals (Supplementary Data 1).
The combined per-individual data satisfied standard quality control measures for ancient genomic data (Supplementary Data 2). For downstream analysis, we assembled two reference datasets derived primarily from published genome-wide genotype data produced on the Affymetrix Human Origins (“HO”; ~500K SNPs) and the Illumina (“Illumina”; ~220K SNPs) genotyping arrays (Supplementary Data 5, 6). We augmented these datasets with published ancient genomes as well as genomes of present-day Sherpa and Tibetan individuals from Nepal (Supplementary Data 5, 6). Whereas we focused most of our analyses on the HO set for its higher SNP density, we also used the Illumina set for in-depth analysis of diverse Himalayan populations across Nepal, Bhutan, India, and Tibet Autonomous Region24.
Fig. 1: Geographic locations for ancient groups and present-day Tibeto-Burman speakers. Circles represent ancient groups and are colored by archaeological periods; squares represent present-day populations of Tibeto-Burman speakers. Left inset: an enlarged view of the seven aMMD sites. Lower left inset: an enlarged view of the present-day Nepalese and Bhutanese populations: 1. Bahing; 2. Bantawa; 3. Baram; 4. Brokkat; 5. Brokpa; 6. Bumthang; 7. Chali; 8. Chamling; 9. Chantyal; 10. Chepang; 11. Chetri; 12. Dakpa; 13. Damai; 14. Dhimal; 15. Dumi; 16. Dzala; 17. Gongduk; 18. Gurung; 19. Khengpa; 20. Kulung; 21. Kurtop; 22. Lakha; 23. Layap; 24. Limbu; 25. Lower_Mustang; 26. Magar; 27. Majhi; 28. Mangde; 29. Monpa; 30. Nachiring; 31. Newar; 32. Ngalop; 33. Nubri; 34. Nup; 35. Puma; 36. Sampang; 37. Sarki; 38. Sherpa; 39. Sherpa_Khumbu; 40. Sonar; 41. Sunwar; 42. Tamang; 43. Thakali; 44. Tshangla; 45. Tsum; 46. Upper_Mustang; 47. Wambule. The base map was created in R v4.0.0 using publicly available map and altitude information from the mapdata v2.3.0 and elevatr v0.3.4 packages.
Post by Admin on Mar 11, 2022 1:15:34 GMT
The genetic structure of high altitude East Asians and their neighbors
To describe the genetic profile of the ancient individuals from Nepal (aMMD) in the context of world-wide human diversity, we first performed principal component analysis (PCA)25. After confirming that they cluster with other East Asian individuals (Supplementary Fig. 1), we projected the aMMD individuals onto the first two PCs calculated for present-day Eastern Eurasian individuals (Fig. 2; Supplementary Data 4). The present-day populations form a structure with three spurs representing, respectively, clines of ancestry corresponding to southern Chinese and southeast Asians (SC-SEA), northeast Asians, and Tibeto-Burman populations. The Ami of Taiwan, Ulchi of the Lower Amur River basin in the Russian Far East, and Sherpa of Nepal form the distal ends of the three spurs, respectively. The Tibeto-Burman spur matches the east-west genetic cline of present-day Tibetans reported in a previous study18. Consistent with our previous results21, all aMMD individuals, including those from the newly investigated sites of Suila, Lubrak, Rhirhi and Kyang, cluster together with present-day Tibetan populations. The genetic profiles obtained from the unsupervised model-based clustering method ADMIXTURE are consistent with those from the PCA, with aMMD individuals sharing unique ancestral components with mid and high altitude present-day populations (Supplementary Fig. 2). Likewise, outgroup-f3 statistics26 indicate that the aMMD individuals have the highest level of shared genetic drift with each other, followed by present-day Sherpa and Tibetans, and then by low-altitude Tibeto-Burman speakers such as Naxi, Yi, and Nagaland populations in India (Supplementary Figs. 3, 4).
Fig. 2: aMMD individuals on the top two PCs of present-day Asian individuals. We calculated PCs from 486 present-day Asian individuals in the HO dataset and projected aMMD individuals onto the top PCs. Gray dots represent present-day individuals we used to calculate PCs. Circles represent median positions of present-day groups colored by their language families along with their respective group abbreviations. Red capital letters “U, L, C, R, K, M, S” represent projected aMMD individuals.
The uniparental haplogroups of the aMMD individuals also support their close genetic relationship with present-day Sherpa/Tibetans (Supplementary Data 2). We assigned Y haplogroups for 14 aMMD individuals. We observed little diversity, with 13 of the 14 males having derived markers of the Y-haplogroup O-M117, and 12 males carrying derived markers of its sublineage Oα1c1b-CTS5308 (Supplementary Fig. 5)27. Among present-day populations, this sublineage is found primarily among Tibetans and Sherpa on the Plateau, in contrast to its sister lineage Oα1c1b-Z25929, which is today mainly found in Southern China and Northeast India27. A rapid radiation of all extant O-M117 lineages is estimated to have occurred 7000–5000 BP and has been interpreted as reflecting the spread of Sino-Tibetan languages, likely originating from northern China27. Notably, the Y-haplogroup O-M117 has been found also in ancient individuals from the upper Yellow River Neolithic Yangshao and Late Neolithic Qijia cultures28, providing evidence for the majority of male aMMD lineages tracing back to this region. One aMMD male individual (S41) belonged to a different Y-haplogroup, D1a, which is another common haplogroup on the Tibetan Plateau today10. The mitochondrial haplogroups of the aMMD individuals, while more diverse, are also prevalent among present-day Tibetans (Supplementary Data 2).
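For readers unfamiliar with projecting low-coverage ancient samples onto PCs computed from present-day individuals, here is a minimal sketch in Python with NumPy. It is not the authors' pipeline. It assumes a least-squares projection over the SNPs that are observed in the ancient sample, and it runs on simulated genotypes only. The function names `fit_pcs` and `project` are hypothetical.

```python
import numpy as np

def fit_pcs(ref_geno, n_pcs=2):
    """Compute PC axes from reference genotypes (individuals x SNPs, coded 0/1/2)."""
    mean = ref_geno.mean(axis=0)
    p = mean / 2.0
    scale = np.sqrt(p * (1.0 - p))
    scale[scale == 0] = 1.0              # monomorphic SNPs: avoid division by zero
    z = (ref_geno - mean) / scale        # centre and normalise each SNP
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    loadings = vt[:n_pcs].T              # SNPs x PCs, orthonormal columns
    scores = u[:, :n_pcs] * s[:n_pcs]    # reference coordinates on the PCs
    return mean, scale, loadings, scores

def project(sample, mean, scale, loadings):
    """Least-squares projection using only the SNPs observed in the sample (NaN = missing)."""
    ok = ~np.isnan(sample)
    z = (sample[ok] - mean[ok]) / scale[ok]
    coords, *_ = np.linalg.lstsq(loadings[ok], z, rcond=None)
    return coords

# Simulated data only (assumption): 40 reference individuals, 500 SNPs.
rng = np.random.default_rng(0)
freqs = rng.uniform(0.1, 0.9, size=500)
ref = rng.binomial(2, freqs, size=(40, 500)).astype(float)
mean, scale, loadings, scores = fit_pcs(ref)

# A mock low-coverage sample: copy one individual and mask about 60% of its SNPs.
ancient = ref[0].copy()
ancient[rng.random(500) < 0.6] = np.nan
coords = project(ancient, mean, scale, loadings)
```

Because the PC axes are estimated from the reference panel alone, the projected sample does not influence them, which is the reason for projecting rather than including sparse ancient data in the PCA itself.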
Post by Admin on Mar 11, 2022 19:19:35 GMT
The genetic relationship between ancient and present-day high altitude East Asians
While most closely related to each other (Supplementary Figs. 3, 4), the aMMD individuals show subtle differences in their genetic affinity that may suggest a fine-scale genetic heterogeneity among them (Supplementary Fig. 3). Most prominently, all aMMD groups have the highest outgroup-f3 statistic with Lubrak, while having the lowest value with Chokhopani. Indeed, all the other aMMD groups, including the earliest Suila, are significantly closer to individuals at Lubrak than Chokhopani, as measured by f4 (Mbuti, aMMD; Chokhopani, Lubrak) (>+4.4 SEM, standard error measure). The same pattern is also observed for present-day Nepalese Sherpa/Tibetans (>+2.7 SEM), while lowland East Asian populations are symmetrically related to Chokhopani and Lubrak (Supplementary Data 7). Using qpWave, we formally compared the two topologies ((Lubrak, aMMD), Chokhopani) and ((Chokhopani, aMMD), Lubrak). We show that Suila, Rhirhi, Mebrak, and Samdzong are cladal to Lubrak (i.e., the former topology holds) within the limits of our resolution (p > 0.192), and Kyang only slightly differentiates from Lubrak (p = 0.027; Supplementary Table 1). In contrast, modeling the aMMD groups as a sister group of Chokhopani uniformly failed and thus the latter of the two topologies can be rejected (p < 1.38 × 10^-4). A combination of Lubrak with a minor contribution from a South Asian group (e.g., Pulliyar) adequately fits all four groups, with an estimated South Asian ancestry contribution of only 1.9–5.1% (p > 0.179; Supplementary Table 2). For Chokhopani, neither Lubrak + South Asian nor Lubrak + Naxi/Yi/Naga fits (p < 3.67 × 10^-4); however, Suila + Naxi/Yi/Naga fits with a substantial lowlander contribution (31–40%; Supplementary Table 3).
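To make the f4 test above concrete: f4(A, B; C, D) is the average over SNPs of (a - b)(c - d), where a, b, c, d are allele frequencies, and its significance is usually judged by a block jackknife. The sketch below is a simplified illustration on simulated frequencies, not the software used in the study. It uses an unweighted delete-one jackknife over equal-sized SNP blocks, whereas real tools typically define blocks along the genome. All variable names are hypothetical.

```python
import numpy as np

def f4(a, b, c, d):
    """f4(A, B; C, D): mean over SNPs of (a - b) * (c - d)."""
    return np.mean((a - b) * (c - d))

def f4_block_jackknife(a, b, c, d, n_blocks=50):
    """Return (estimate, standard error, Z) from a delete-one block jackknife."""
    per_snp = (a - b) * (c - d)
    blocks = np.array_split(per_snp, n_blocks)
    sizes = np.array([blk.size for blk in blocks])
    sums = np.array([blk.sum() for blk in blocks])
    est = per_snp.mean()
    # Estimate recomputed with each block left out in turn.
    loo = (per_snp.sum() - sums) / (per_snp.size - sizes)
    se = np.sqrt((n_blocks - 1) / n_blocks * np.sum((loo - loo.mean()) ** 2))
    return est, se, est / se

# Simulated frequencies only (assumption): `test` and `pop_l` share drift
# that `pop_c` lacks, mimicking f4(outgroup, test; pop_c, pop_l) > 0.
rng = np.random.default_rng(1)
n = 20000
anc = rng.uniform(0.3, 0.7, size=n)
shared = rng.normal(0, 0.05, size=n)
noise = lambda: rng.normal(0, 0.05, size=n)
outgroup = np.clip(anc + noise(), 0, 1)
test = np.clip(anc + shared + noise(), 0, 1)
pop_l = np.clip(anc + shared + noise(), 0, 1)
pop_c = np.clip(anc + noise(), 0, 1)

est, se, z = f4_block_jackknife(outgroup, test, pop_c, pop_l)
```

With this argument order, a positive Z means the test population shares more drift with the fourth population than with the third, which is how the ">+4.4 SEM" result for Lubrak versus Chokhopani is read.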
We also detect a significant signal of admixture in Chokhopani using DATES, which infers an admixture time in Chokhopani of 46 ± 11 generations before the time of Chokhopani, placing it at ca. 1500-2800 BCE (for mean ± 2 SEM; Supplementary Fig. 6). This implies gene flow must have occurred between Chokhopani and the ancestors of these low/middle altitude populations prior to 800 BCE, and plausibly before 1500 BCE.
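The conversion from generations to calendar dates can be checked with a few lines of arithmetic. This sketch assumes a generation time of 29 years and uses the midpoint of the Chokhopani date range (801–770 BCE) as the sample date. Both are assumptions made for illustration, since this excerpt does not state the values the authors used.

```python
GEN_YEARS = 29.0                      # assumed years per generation
sample_date_bce = (801 + 770) / 2     # midpoint of the Chokhopani range (assumption)

mean_gen, sem_gen = 46, 11            # DATES estimate quoted in the text

def admixture_date_bce(generations):
    """Calendar date (BCE) for an event `generations` before the sample date."""
    return sample_date_bce + generations * GEN_YEARS

central = admixture_date_bce(mean_gen)
recent = admixture_date_bce(mean_gen - 2 * sem_gen)   # mean - 2 SEM
old = admixture_date_bce(mean_gen + 2 * sem_gen)      # mean + 2 SEM

print(f"central: {central:.0f} BCE, interval: {recent:.0f}-{old:.0f} BCE")
```

Under these assumptions the interval comes out at roughly 1480–2760 BCE, in line with the "ca. 1500-2800 BCE" quoted above.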
Like aMMD groups, present-day Sherpa/Tibetan groups from the MMD region and the nearby Gorkha/Solukhumbu districts in Nepal23, as well as Tibetans from more distant locations, are genetically closest to Lubrak and then to each other among the ancient and present-day East Asians (Supplementary Fig. 7; Supplementary Data 8). The earliest aMMD group from Suila, as well as later aMMD groups, are also among the top outgroup-f3 signals of present-day Sherpa/Tibetans. Chokhopani shows smaller outgroup-f3 values as expected from its admixture signal with lowlanders. Therefore, we conclude that Lubrak/Suila are so far the earliest known representatives of a gene pool that is most enriched in the high altitude populations in the Tibetan Plateau and the Himalayas; we refer to this gene pool as the “Tibetan” lineage in this study.
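The outgroup-f3 ranking used here can be illustrated as follows. f3(O; A, B) is the average over SNPs of (o - a)(o - b) and grows with the amount of drift A and B share after splitting from the outgroup O. This is a bare allele-frequency sketch on simulated data, with no sample-size correction and no standard errors. The population labels are placeholders, not real samples.

```python
import numpy as np

def outgroup_f3(o, a, b):
    """f3(O; A, B): mean over SNPs of (o - a) * (o - b)."""
    return np.mean((o - a) * (o - b))

# Simulated frequencies only (assumption): two "highlander" groups share
# drift that the "lowlander" group lacks.
rng = np.random.default_rng(2)
n = 20000
anc = rng.uniform(0.3, 0.7, size=n)
shared = rng.normal(0, 0.05, size=n)
noise = lambda: rng.normal(0, 0.05, size=n)

outgroup = np.clip(anc + noise(), 0, 1)
highlander_1 = np.clip(anc + shared + noise(), 0, 1)
highlander_2 = np.clip(anc + shared + noise(), 0, 1)
lowlander = np.clip(anc + noise(), 0, 1)

f3_high = outgroup_f3(outgroup, highlander_1, highlander_2)
f3_low = outgroup_f3(outgroup, highlander_1, lowlander)
print(f"f3(O; H1, H2) = {f3_high:.5f}, f3(O; H1, L) = {f3_low:.5f}")
```

The pair with more shared drift gets the larger value, which is the sense in which Lubrak and Suila are "top outgroup-f3 signals" for present-day Sherpa/Tibetans.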
A dual genetic origin of high altitude East Asians
Archaeological data suggest that Neolithic populations of the Upper/Middle Yellow River basin exerted a major cultural influence on the spread of farming onto the Plateau8. This region has also been proposed as the likely homeland of the Sino-Tibetan language family29,30. Interestingly, among ancient lowland East Asians28,31,32,33, Middle/Late Neolithic groups from the Upper Yellow River region and its periphery (Fig. 1) show the closest genetic affinity to the aMMD groups (Supplementary Fig. 8). These include Late Neolithic individuals from the Jinchankou and Lajia sites in the Upper Yellow River region belonging to the Qijia culture (ca. 2300-1800 BCE; Upper_YR_LN), individuals from the Late Neolithic Shimao site of Shengedaliang in Shaanxi province (ca. 2250-1950 BCE; Shimao_LN), and those from the Middle Neolithic Miaozigou site in Inner Mongolia (ca. 3550-3050 BCE; Miaozigou_MN). These three groups have a similar genetic profile, deriving ~80% of their ancestry from a gene pool related to the Middle Neolithic individuals of the Yangshao culture sites of Wanggou and Xiaowu in the Central Plain (ca. 4000-3000 BCE; YR_MN) and the remaining ~20% from the Ancient Northeast Asian (ANA) gene pool related to Neolithic-era hunter-gatherers from the Devil’s Gate Cave site of the Russian Far East (“DevilsCave_EN”)28,32. Taking Upper_YR_LN and YR_MN as representatives of lowland gene pools, we modeled the relationship between aMMD and Upper_YR_LN/YR_MN via a graph-based approach using qpGraph34. YR_MN fails to mimic the primary source of the aMMD groups and present-day Sherpa/Tibetans, mainly due to the extra affinity of aMMD to the ANA gene pool (Supplementary Table 4). In contrast, Upper_YR_LN, having a stronger genetic affinity to ANA, is consistently chosen as their primary genetic source in the best-scored graphs (Fig. 3).
Together with their geographic and temporal proximity with early farmers on the Plateau, our results support a major genetic link between Plateau populations and the predecessors of early barley farmers on the northeastern fringe of the Plateau. However, we note that this genetic link was already established in the earliest aMMD groups dating to 1494–1317 cal. BCE at the far southern end of the Plateau (Supplementary Data 3). This date is only ~200 years after the proposed onset of the ca. 1650 BCE barley farmer expansion from the northeastern fringe of the Plateau8. A rapid population expansion from the Yellow River across the entire Plateau, a distance of more than 1800 km across rough terrain, would need to be invoked to explain these findings. Hence, substantial genetic exchange with lowlanders likely occurred prior to the barley expansion.
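The timing argument can be made explicit with a short back-of-the-envelope calculation. It uses only figures quoted in the text (onset ca. 1650 BCE, Suila at 1494–1317 cal. BCE, more than 1800 km) and treats the straight-line distance as a lower bound, so the resulting rates are minimum values rather than estimates from the study.

```python
onset_bce = 1650                  # proposed onset of the barley farmer expansion
suila_range_bce = (1494, 1317)    # calibrated date range of the Suila individual
distance_km = 1800                # "more than 1800 km", so a lower bound

# Years available between the onset and each end of the Suila range.
gaps = [onset_bce - date for date in suila_range_bce]

# Minimum average rate of spread needed to cover the distance in that time.
rates = [distance_km / gap for gap in gaps]

for gap, rate in zip(gaps, rates):
    print(f"{gap} years available -> at least {rate:.1f} km per year")
```

The gap works out to 156–333 years, bracketing the "~200 years" in the text, and the implied spread is at least about 5 to 12 km per year sustained across rough, high elevation terrain.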