Post by Admin on Mar 17, 2022 21:57:52 GMT
Results

Genetic Affinities of TIB in the Context of Global Populations

A comparison of genetic differences measured by unbiased FST [19] between TIB and 256 worldwide populations [17] (Tables S3 and S4) showed that TIB is most closely related to East Asian populations (Figures 1A and 1B), followed by Central Asian and Siberian populations (Figure 1C). These relationships remained consistent in separate analyses of TBN and SHP samples (Figures S2 and S3). The results were also confirmed by PCA (Figure 2 and Figures S4 and S5) and outgroup f3 statistics [21] (Figure S6). Interestingly, TBN and SHP are distinguishable in a two-dimensional PC plot (Figure 2D), indicating that they are two distinct groups whose genetic difference (FST = 0.010) is slightly smaller than that between TBN and HAN (FST = 0.011). The overall genetic makeup of TIB is closer to surrounding populations living in the plateau, such as Tu (FST = 0.006), Yizu (FST = 0.007), and Naxi (FST = 0.009), than to any other population worldwide (Figure 1B and Tables S7 and S8), most likely as a result of direct ancestry sharing or reciprocal gene flow between these populations.

Figure 1. Genetic Affinities of TIB in the Context of Worldwide Populations
(A) A fan-like chart showing genetic differences (FST) between TIB and worldwide populations. Each branch represents a comparison between TIB and 1 of the 256 populations; lengths are proportional to the FST values, indicated by gray circles. The populations are classified by geographical region and indicated with the colors shown in the legend. The populations in each region are presented in clockwise order according to great-circle distance to Tibet.
(B) A fan-like chart showing FST between TIB and East Asian populations.
(C) A fan-like chart showing FST between TIB and Central Asian and Siberian populations.
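The FST values quoted above come from a bias-corrected estimator computed on allele frequencies. The post does not say which formula the cited "unbiased FST" uses, so the sketch below swaps in Hudson's estimator (combined across SNPs as a ratio of averages) purely as an illustration. The function name `hudson_fst`, the equal-sample-size simplification, and any inputs you feed it are assumptions, not the authors' pipeline.

```python
def hudson_fst(p1, p2, n1, n2):
    """Hudson's FST (ratio of averages) from per-SNP allele frequencies.

    p1, p2: lists of allele frequencies in populations 1 and 2 (same SNP order).
    n1, n2: number of sampled chromosomes in each population.
    Simplifying assumption: every SNP has the same sample size.
    """
    if len(p1) != len(p2):
        raise ValueError("frequency lists must have the same length")
    num = 0.0
    den = 0.0
    for a, b in zip(p1, p2):
        # Squared frequency difference minus the sampling-noise correction
        num += (a - b) ** 2 - a * (1 - a) / (n1 - 1) - b * (1 - b) / (n2 - 1)
        # Expected heterozygosity between the two populations
        den += a * (1 - b) + b * (1 - a)
    if den == 0:
        raise ValueError("no SNP is polymorphic across the two populations")
    return num / den
```

Because of the sampling correction, two samples drawn from the same population give an estimate near zero that can be slightly negative, which is why small values such as 0.006 to 0.011 are only interpretable with a bias-corrected estimator.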
Figure 2. Analysis of the First Two PCs of TIB Individuals and Other Population Samples
Geographical regions where the individuals are located are indicated with the colors shown in the legend. Numbers in brackets denote the variance explained by each PC. The HAN samples are classified into East Asian populations and are not highlighted in this illustration.
(A) In this two-dimensional PC plot, TIB samples are located within the East Asian cluster and share similar PC coordinates with some populations in South Asia and in Central Asia and Siberia.
(B) PCA of TIB individuals and Eurasian samples (western Eurasian samples were excluded).
(C) PCA of TIB individuals and some East Asian samples.
(D) PCA of TBN and SHP individuals and samples of some closely related surrounding populations (Tu, Yizu, and Naxi).

We caution that the relationship revealed by an overall analysis of the genome-wide data reflects only a limited aspect of the origins of the populations studied. The results should not be directly interpreted as the genetic origins and history of the Tibetans; rather, they reflect the population's present-day genetic makeup, which could be substantially influenced by recent gene flow from surrounding populations. This highlights the need to account for admixture when inferring population history.
Post by Admin on Mar 18, 2022 18:45:07 GMT
Genetic Relationship and Divergence between TIB and HAN

Among the lowland populations, the data indicated that Han Chinese are most closely related to TIB (FST = 0.011) (Figure 1B and Table S7); the two populations had an overall strong correlation in the allele-frequency spectrum (Figure S7), but Y chromosome and mtDNA data showed substantial differences between them (Tables S5 and S6). Y chromosome haplogroup D-M174 is an East-Asian-specific male lineage and is rare in populations from regions bordering East Asia (Central Asia, North Asia, and the Middle East), usually less than 5% [6, 46, 47, 48]. In our data, D-M174 lineages were not observed in the 39 HAN samples (frequency was 0) but had a high frequency in TIB (66.6%; Table S5), indicating a very large difference in Y DNA lineages between the two groups. Such a large difference was not observed in mtDNA, but some haplogroups also showed considerable differences between groups. For example, the frequency of B5 in TBN (0%) was quite different from that in HAN (10.26%) (Table S6). Some haplogroups, such as A and M5, also showed substantial differences between SHP and TBN; however, because sample sizes in our data were small, especially that of SHP, these frequency estimates are not reliable.

TIB showed an overall lower genetic diversity than HAN, including a lower level of heterozygosity, a higher level of runs of homozygosity (Figure S8), and a consistently smaller effective population size over the last ∼15,000 years (Figures 3 and 4A), most likely because TIB is isolated on the Tibetan Plateau, whereas HAN has experienced recent population expansion. The average time of divergence between TBN and HAN was estimated to be ∼15,000–9,000 YBP on the basis of a sequentially Markovian coalescent analysis [24, 25] of high-coverage genomes. This divergence time between Tibetans and Han Chinese is thus much earlier than the estimate of 2,750 years ago by a previous study [4]. By comparison, the estimated divergence time between SHP and HAN was ∼16,000–11,000 YBP, and that between TBN and SHP was ∼11,000–7,000 YBP (Figure 4B). The divergence between TIB and HAN most likely resulted from migration to the Tibetan Plateau after the Last Glacial Maximum (LGM), a period of intense cold from ∼26,500 to 19,000 YBP [49]. Subsequent gene flow between TBN and HAN or continuous migrations to the plateau most likely caused the divergence time between TBN and HAN to be slightly more recent than that between SHP and HAN.

Figure 3. Estimated Changes in Effective Population Size over Time
The estimation was based on PSMC analysis of single genomes from three modern human groups with deeply sequenced genomes in this study (33 TBN, 5 SHP, and 39 HAN). Because the true value of the mutation rate is uncertain, which makes date estimates uncertain, we scaled time in units of 2μT (where μ is the mutation rate and T is time in generations). We also provide an absolute estimation of time (top) under the assumption of a fast mutation rate of 1.0 × 10⁻⁹ or a slow mutation rate of 0.5 × 10⁻⁹ per site per year.

Figure 4. Estimated Changes in Historical Effective Population Size and Divergence Time between Populations
(A) The estimation was based on MSMC analysis of four genomes (eight haploid genomes) randomly selected from each of the three modern human groups with deeply sequenced genomes in this study (TBN, SHP, and HAN). The analysis was repeated twice with different combinations of individuals, and two curves are displayed for each group in the plot with different colors. Curves with dashed lines denote the estimation based on the same set of genomes but with non-modern human sequences removed (AMH only).
(B) The estimation was based on the MSMC algorithm. Two individual genomes were randomly selected from each group in the group pairs (TBN versus HAN, SHP versus HAN, or SHP versus TBN), and in total four genomes (eight haploid genomes) were used for each MSMC analysis. The analysis was repeated eight times, and eight curves are displayed for each group pair in the plot with different colors. Here, we show the results based on absolute estimation of time under the assumption of a slow mutation rate of 0.5 × 10⁻⁹ per site per year.
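The Figure 3 caption scales time as 2μT and then reports absolute dates under two per-year mutation rates. A minimal sketch of that conversion follows, assuming the per-year rates are applied in the usual way: if μ per generation equals the per-year rate times the generation time g, then years = T × g = (scaled time) / (2 × per-year rate), and g cancels. The function name and constants are illustrative; only the two rate values come from the caption.

```python
FAST_RATE = 1.0e-9  # per site per year, from the Figure 3 caption
SLOW_RATE = 0.5e-9  # per site per year, from the Figure 3 caption


def scaled_time_to_years(scaled_time, mu_per_site_per_year):
    """Convert a scaled time (2 * mu * T) into years.

    Assumes mu per generation = mu_per_site_per_year * g and T in generations,
    so years = T * g = scaled_time / (2 * mu_per_site_per_year).
    """
    if mu_per_site_per_year <= 0:
        raise ValueError("mutation rate must be positive")
    return scaled_time / (2.0 * mu_per_site_per_year)
```

For instance, a scaled time of 2 × 10⁻⁵ corresponds to about 10,000 years under the fast rate and about 20,000 years under the slow rate, which is why halving the assumed mutation rate doubles every date on the absolute axis.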
Post by Admin on Mar 18, 2022 20:28:48 GMT
Ancestral Makeup of TIB

Ancestry analysis with ADMIXTURE [32] suggested that present-day Tibetans share the majority of their ancestry makeup with populations from East Asia (∼82%), Central Asia and Siberia (∼11%), and South Asia (∼6%) and have minor ancestral relationships with western Eurasian (<1%) and Oceanian (<0.5%) populations (Figure 5 and Figures S9–S11). In contrast, HAN share much less ancestry with Siberian (∼7%), South Asian (<0.5%), and Oceanian (∼0%) populations but higher ancestry with East Asian populations (>90%).

Figure 5. Summary Plot of Genetic Admixture
Individual admixture proportions were estimated from 592,799 autosomal SNPs with genotype data available for 38 TIB, 39 HAN, and 2,345 HuOrigin samples (African samples were not included). Each individual is represented by a single line broken into K = 7 colored segments with lengths proportional to that individual's ancestry in each of the K = 7 inferred clusters. The population IDs are presented outside of the circle of the plot. The results of population-level admixture of TIB and HAN are summarized and displayed in the two pie charts in the center of the circle plot; admixture proportions are denoted as percentages and with different colors.

We applied outgroup f3 statistics [21] and f4 statistics [17, 34] to assess ancient ancestral contributions to TIB by analyzing available ancient genomes, including those of an Altai Neanderthal [14], a Denisovan [15], and Ust'-Ishim, a 45,000-year-old anatomically modern human from Siberia [16] (see Material and Methods). TIB was found to share slightly more alleles with ancient genomes than many worldwide populations. The exceptions were Oceanian populations, which showed significantly higher shared archaic (both Denisovan- and Neanderthal-like) ancestry than TIB, and East Asian populations (especially those surrounding TIB: Yizu, Tu, and Naxi), which had similar or even higher levels of shared archaic ancestry than TIB (Tables S9–S15 and Figures S12–S26). In particular, no modern human population shared more alleles with Ust'-Ishim than TIB (Table S13 and Figure S14). Although the exact estimates varied and the absolute values are not comparable, these genome-wide analyses indicate that TIB shares consistently higher ancestry with ancient non-AMH genomes than HAN does, HAN being the lowland population that has the closest modern relationship with TIB.
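The allele-sharing comparisons above rely on f statistics, which are averages of products of allele-frequency differences. The sketch below shows only the basic point estimates, under loud assumptions: population frequencies are given as plain lists over the same SNPs, and the finite-sample correction and block-jackknife standard errors used in real analyses are omitted. The function names are illustrative and do not correspond to any particular software package.

```python
def _mean(values):
    values = list(values)
    if not values:
        raise ValueError("need at least one SNP")
    return sum(values) / len(values)


def f4(p_a, p_b, p_c, p_d):
    """f4(A, B; C, D): mean over SNPs of (pA - pB) * (pC - pD).

    A value near zero is expected when (A, B) and (C, D) form separate clades;
    a consistent deviation suggests excess allele sharing across the pairs.
    """
    return _mean((a - b) * (c - d) for a, b, c, d in zip(p_a, p_b, p_c, p_d))


def outgroup_f3(p_o, p_a, p_b):
    """Outgroup f3(O; A, B): mean over SNPs of (pO - pA) * (pO - pB).

    Larger values indicate more genetic drift shared by A and B after
    their split from the outgroup O. No sampling correction is applied here.
    """
    return _mean((o - a) * (o - b) for o, a, b in zip(p_o, p_a, p_b))
```

In this framing, a statement such as "TIB shares more alleles with an ancient genome than HAN does" corresponds to an f4 of the form f4(TIB, HAN; ancient, outgroup) that deviates from zero in a consistent direction.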
Post by Admin on Mar 19, 2022 1:42:25 GMT
Individual Archaic Ancestry Is Higher in TIB Than in HAN

Genomic segments of non-AMH origins were identified with the S* statistic [38], as well as two methods developed in this study (see Material and Methods). The total amount of non-AMH sequences in the TIB gene pool (6.17%) was significantly higher than that in the HAN gene pool (5.86%) (p < 10⁻⁵) (Figure 6A and Figure S27A). When we restricted the comparison to Neanderthal-like and Denisovan-like sequences, we did not observe any significant difference between TIB and HAN in the percentage of Neanderthal-like sequences per individual (Figure 6B), although we did observe a marginally significant difference when we restricted the comparison to Neanderthal-specific sequences (p = 0.01; Figure S27B). By contrast, significant differences were observed between TIB and HAN on the basis of Denisovan-like sequences per individual (p = 0.001; Figure 6C), and this difference was more pronounced when the comparison was restricted to Denisovan-specific sequences (p < 10⁻⁵; Figure S27C). Together, these analyses suggest that the major differences between TIB and HAN resulted from the contribution of Denisovan-like and unknown non-AMH sequences (Figure 6 and Figure S27).

Figure 6. Distribution of Non-modern Human Sequences among Individuals and Correlations with Altitude
In (A)–(C), T, S, and H on the horizontal axis denote Tibetan, Sherpa, and Han Chinese, respectively.
(A) A comparison of the percentage of non-modern human sequences per individual between TIB (6.17 ± 0.10) and HAN (5.86 ± 0.07) demonstrates significant differences (p < 10⁻⁵).
(B) A comparison of the percentage of Neanderthal-like sequences per individual between TIB (1.04 ± 0.04) and HAN (1.02 ± 0.04) shows no significant differences (p = 0.379).
(C) A comparison of the percentage of Denisovan-like sequences per individual between TIB (0.42 ± 0.02) and HAN (0.40 ± 0.02) shows slightly significant differences (p = 0.001); the statistical significance (p value) was obtained by permutation tests repeated 100,000 times.
(D) Correlation between the average proportion of non-modern human sequences in regional Tibetan populations and altitude (Pearson R² = 0.855; p = 0.0083).
(E) Correlation between the average proportion of Neanderthal-like sequences per individual and altitude (Pearson R² = 0.216; p = 0.354).
(F) Correlation between the average proportion of Denisovan-like sequences per individual and altitude (Pearson R² = 0.412; p = 0.170).

Altitudinal Correlation of Archaic Ancestry

Interestingly, when the Tibetan samples were grouped into six regional populations according to geographical location (see Material and Methods), a strong correlation was observed between the average proportion of non-AMH sequences present in Tibetan individuals and altitude (Pearson R² = 0.855; p = 0.0083; Figure 6D), but no significant correlation was observed in relation to Neanderthal-like or Denisovan-like sequences (Figures 6E and 6F). These results indicate that non-AMH sequences have some association with altitude, possibly contributing to the adaptation of Tibetans, but the main contribution seems to originate from unknown non-AMH ancestry (Figure S27D) rather than from Neanderthal-like or Denisovan-like sequences. A similar pattern and conclusion were apparent when the analysis was restricted to Neanderthal-specific and Denisovan-specific sequences (Figures S27E and S27F).

A Highly Differentiated Genomic Region with Elevated Archaic Ancestry in TIB

The above results reveal that non-AMH-derived sequences not only exist in the Tibetan genomes but are also very common. However, it is not clear whether these non-AMH sequences derived from recent migrations to Tibet with both AMH and non-AMH ancestries or were inherited from the ancient colonization of the Tibetan Plateau. The overall spatial distributions of non-AMH-derived sequences are similar between TIB and HAN (Figure S28), and the TMRCA for non-AMH sequences in both TIB and HAN was estimated to be >40,000 years (see Material and Methods). These results might not be surprising, given that TIB and HAN share recent genetic drift (Figures 1 and 2) and the majority of their genetic makeup (Figure 5). Nonetheless, when frequency was taken into account, a ∼300 kb region was found to be considerably different between TIB and HAN, such that both Denisovan-like and Neanderthal-like ancestries were significantly elevated in TIB (Figure S29). This ∼300 kb region is located on chromosome 2, encompasses eight genes (EPAS1, LOC101805491, TMEM247, ATP6V1E2, RHOQ [MIM: 605857], LOC100506142, PIGF [MIM: 600153], and CRIPT [MIM: 604594]), and has extreme differences between TIB and HAN (FST > 0.65) for SNVs with a high frequency of derived alleles. These differences are in sharp contrast to the genome-wide average (FST = 0.011).
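The Figure 6 caption states that p values for the TIB versus HAN comparisons were obtained by permutation tests. The post gives no further detail, so the following is only a generic sketch of a two-sided permutation test on the difference in group means. The function name, the default number of permutations, the seed, and the add-one p value convention are assumptions rather than the authors' procedure.

```python
import random


def permutation_test_mean_diff(x, y, n_perm=10000, seed=0):
    """Two-sided permutation p value for the difference in group means.

    Group labels are shuffled n_perm times; the p value is the share of
    shuffles whose absolute mean difference is at least the observed one
    (with the common add-one adjustment so the estimate is never zero).
    """
    if not x or not y:
        raise ValueError("both groups must be non-empty")
    rng = random.Random(seed)
    observed = abs(sum(x) / len(x) - sum(y) / len(y))
    pooled = list(x) + list(y)
    n_x = len(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        group_x = pooled[:n_x]
        group_y = pooled[n_x:]
        diff = abs(sum(group_x) / len(group_x) - sum(group_y) / len(group_y))
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

With this convention the smallest reportable p value is roughly 1 / (n_perm + 1), so a result quoted as p < 10⁻⁵ needs on the order of 100,000 permutations, consistent with the number given in the caption.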
Post by Admin on Mar 19, 2022 19:08:13 GMT
Entangled Ancestries and Their Ancient Origins in the ∼300 kb Region

The ancestral pattern in the ∼300 kb region is extremely complicated: it contains a mix of Denisovan, Neanderthal, ancient Siberian, and unknown archaic ancestries, all of which are elevated in TIB (Table S16 and Figure 7). The unusually high frequency of archaic sequences and the substantial differences between TIB and HAN, as well as other worldwide populations, cannot be explained by recent gene flow or incomplete lineage sorting (Figure 7 and Table S17). These highly differentiated sequences harbored in the genomes of present-day Tibetan highlanders were most likely "inherited" from earlier settlers rather than "introgressed" by later arrivals. The complicated ancestral architecture in this region has many implications for Tibetan origins and pre-history. The surviving archaic sequences in the ∼300 kb region trace their ancestries back to ∼62,000–38,000 years ago (Table S17), pre-dating the LGM [49] and indicating that the human colonization of Tibet is much more ancient than previously thought. The mixed archaic ancestries in this ∼300 kb region, together with simulation studies, indicate that Tibet has been a human melting pot since the Paleolithic age despite its inhospitable environment and that interbreeding occurred among different hominin groups before the post-LGM arrivals. We suggest that SUNDer, an unknown ancient group of multiple ancestral origins, introduced the ancient haplotypes that are unique to present-day Tibetan highlanders (Figure 8).

Figure 7. Local-Ancestry Inference of a ∼300 kb Region in TIB and Eight Other Populations
The local ancestry was jointly inferred with ArchaiSeeker and ChromoPainter [45] (see Material and Methods). The upper panel shows genes located in the ∼300 kb region (chr2: 46,577,796–46,870,806). The two vertical arrows indicate the two previously identified Tibetan-specific variant tags: the five-SNP motif [7] and the Tibetan-enriched deletion (TED) [50]. The positions of ancestry-informative markers (AIMs) for the Denisovan (DEN AIM), the Neanderthal (NEA AIM), and Ust'-Ishim (UST AIM) are displayed in the rows below. For the AIMs of ancient samples identified here (DEN, NEA, and UST), we required the frequency of the ancient sample's alternative allele (with respect to reference genome GRCh37) to be >0.5 in TIB and <0.1 in other worldwide populations. The lower panels show the inferred ancestry of haplotypes in TIB, HAN, CHB (Han Chinese in Beijing, China; 1000 Genomes), CHS (Southern Han Chinese; 1000 Genomes), JPT (Japanese in Tokyo, Japan; 1000 Genomes), PAP (Papuan) [51], SIB (Siberians) [51, 52, 53], SSIP (Singapore Indians) [54], and SSMP (Singapore Malay) [55]. Each row represents a haplotype with ancestry derived from Neanderthals [14] (red), Ust'-Ishim [16] (orange), Denisovans [15] (blue), or Han Chinese (green) or with uncertain ancestry (including unknown archaic and uncertain modern human; gray).

Figure 8. A Model for Reconstructing the Evolutionary History of SUNDer
This model was constructed on the basis of the observed ancestry information and haplotype pattern in the ∼300 kb region (see Figure 7). The ancient group represented by SUNDer was generated by two ancient admixture events. The first occurred among a Denisovan-like group, a Neanderthal-like archaic group, and one or more unknown archaic hominin groups and resulted in an admixed group (UNDer). The second occurred between UNDer and an ancient Siberian (modern human) group represented by Ust'-Ishim and eventually resulted in SUNDer. We assume that these two admixture events occurred before the LGM and contributed ancient ancestral components, including Neanderthal-like (red), Denisovan-like (blue), unknown archaic (gray), and Ust'-Ishim-like Siberian (orange) ancestry, to the Tibetans' gene pool. The post-LGM admixture occurred between SUNDer and lowland modern human groups represented by Han Chinese and introduced the majority of the ancestry (green) into the Tibetans' gene pool. Finally, this evolutionary history is reflected in EPAS1, which lies within the ∼300 kb region, and its downstream region in the Tibetan genome. Two major haplotypes are found in present-day Tibetans: the old one is SUNDer-like (frequency: 17/76), and the new one is the SUNDer haplotype with a Han Chinese component (frequency: 23/76).
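The Figure 7 caption defines ancestry-informative markers by a frequency rule: the ancient sample's alternative allele must exceed 0.5 in TIB and stay below 0.1 in other worldwide populations. A minimal sketch of that filter is given below. The record layout (keys `pos`, `tib`, `others`) is hypothetical, and reading "other worldwide populations" as "each other population individually" rather than a pooled frequency is an assumption.

```python
def select_aims(sites, tib_min=0.5, other_max=0.1):
    """Return positions passing the Figure 7 AIM frequency rule.

    sites: iterable of dicts with hypothetical keys
      'pos'    - genomic position of the site,
      'tib'    - frequency in TIB of the ancient sample's alternative allele,
      'others' - dict mapping population label -> frequency of that allele.
    Assumption: the <0.1 threshold must hold in every other population.
    """
    selected = []
    for site in sites:
        common_in_tib = site["tib"] > tib_min
        rare_elsewhere = all(freq < other_max for freq in site["others"].values())
        if common_in_tib and rare_elsewhere:
            selected.append(site["pos"])
    return selected
```

Applied separately to sites carrying the Denisovan, Neanderthal, or Ust'-Ishim alternative allele, a filter of this kind would yield the DEN, NEA, and UST marker rows described in the caption.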