|
Post by Admin on Dec 21, 2022 5:04:43 GMT
Fig 2. Distribution of common-to-high-frequency aSNPs falling within CREs across 18 tissues. Patterns of enrichment across the 18 different tissue types in Roadmap Epigenomics for the set of common-to-high-frequency aSNPs annotated within chromatin states associated with CREs, relative to the matched background set of naSNPs. Histograms on top indicate the mean number of variants calculated across all cell types belonging to each tissue. Asterisks indicate BH-corrected Fisher’s exact test p-values: * = p ≤ 0.05; ** = p ≤ 0.01; *** = p ≤ 0.001; **** = p ≤ 0.0001 (for full statistical results see S7 Table). doi.org/10.1371/journal.pgen.1010470.g002
As above, repeating this analysis in Neanderthal aSNPs segregating within Europeans also identified significant enrichments of Neanderthal aSNPs within CREs active in mesenchymal and blood & T cells (S6(B) Fig). Interestingly, these findings fully recapitulate previously reported observations in Europeans [11, 12]. Thus, in line with previous studies [11, 22, 50–52], our observation of a strong enrichment of Denisovan aSNPs within CREs active in immune cell types would suggest a potential contribution of these archaic variants to regulatory processes happening within immune-related cells, and by extension, to immune-related traits.
As a final test, we intersected these variants with significant cis-eQTLs (qval < 0.05) from GTEx v8 [35]. We found that 59 Denisovan, 117 Neanderthal and 388 non-archaic variants (0.15%, 0.47% and 0.58% of all common-to-high-frequency SNPs, respectively) were potential cis-eQTLs in at least one tissue in GTEx. We expanded the analysis to the entire set of variants, regardless of their allele frequency, and found 132 Denisovan, 242 Neanderthal and 688 non-archaic variants (0.09%, 0.27% and 0.3%, respectively) overlapping with cis-eQTLs.
Given the well-recognised bias towards individuals of European ancestry within the GTEx dataset, these findings both support the need to include under-represented populations in genetic surveys, and reaffirm that the majority of our introgression assignments are correct and, as expected, that our variants are geographically restricted.
Assessing the impact of aSNPs on transcription factor binding sites
We next sought to characterise whether aSNPs might alter gene regulation and the functional mechanisms by which this may occur. We thus examined the potential of aSNPs annotated within CREs to disrupt known transcription factor binding sites (TFBS) by assessing their impact on 690 different position weight matrices (PWMs) drawn from the HOCOMOCO [40] and Jaspar [39] databases. After consolidating variants predicted to disrupt the same motif across the two databases (see Methods), we found that 16,048 Denisovan aSNPs, 10,032 Neanderthal aSNPs and 28,370 non-archaic naSNPs (respectively 40.5%, 40.7% and 42.3% of the tested variants) were predicted to disrupt at least one TFBS. To avoid redundancy due to similarities between predicted motifs across closely related TFs, we took advantage of recent work by Vierstra et al. [41], which clusters HOCOMOCO and Jaspar PWMs into 286 distinct clusters on the basis of sequence similarity, and considered only those instances included in these clusters. We then tested whether any of the motif clusters contained a significant excess of TFBS-disrupting aSNPs relative to naSNPs. Overall, we did not find any significant differences between the number of TFBS-disrupting archaic and non-archaic SNPs across the set of motif clusters (p = 0.26 and 0.14 for Denisovan and Neanderthal aSNPs, respectively). This suggests that, on a genome-wide scale, aSNPs are not disrupting any specific family of DNA motifs.
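The enrichment comparisons in this analysis rest on count tables tested with Fisher's exact test and corrected with the Benjamini-Hochberg (BH) procedure, as stated in the Fig 2 caption. The following is a minimal Python sketch of that kind of test, not the authors' pipeline. The one-sided alternative, the function names and the counts in the usage lines are all assumptions made for illustration.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for enrichment in a 2x2 table.

    Rows are (aSNPs, naSNPs); columns are (inside CRE, outside CRE).
    Returns P(X >= a) under the hypergeometric null with fixed margins.
    The one-sided alternative is an assumption of this sketch.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    tail = sum(
        comb(row1, k) * comb(n - row1, col1 - k)
        for k in range(a, min(row1, col1) + 1)
    )
    return tail / comb(n, col1)


def benjamini_hochberg(pvals):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def stars(p_adj):
    """Map an adjusted p-value to the asterisk notation of the Fig 2 caption."""
    for threshold, label in ((1e-4, "****"), (1e-3, "***"), (1e-2, "**"), (5e-2, "*")):
        if p_adj <= threshold:
            return label
    return ""


# Made-up counts for illustration only: (aSNPs in CRE, aSNPs outside,
# naSNPs in CRE, naSNPs outside) for three hypothetical tissues.
tables = [(30, 70, 10, 90), (12, 88, 10, 90), (20, 80, 19, 81)]
raw = [fisher_exact_greater(*t) for t in tables]
for table, p_adj in zip(tables, benjamini_hochberg(raw)):
    print(table, f"{p_adj:.3g}", stars(p_adj))
```

Correcting across tissues (or motif clusters) matters because many tables are tested at once; the BH step controls the expected share of false positives among the results that receive asterisks.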
To quantify the impact of aSNPs on the relevant PWM, for each of these TFBS-disrupting variants we then calculated a Δ PWM score, defined as the difference in the PWM score between the archaic and non-archaic alleles in the case of aSNPs, or between the derived and ancestral alleles in the case of naSNPs. While naSNPs are on the whole predicted to be significantly more disruptive than aSNPs, this difference is likely caused by the number of TFBS-disrupting variants reported in each category, as Δ PWM scores remain highly comparable across ancestries (S8(B) Fig). Together, these results indicate that despite a lack of specific gene regulatory networks preferentially altered by aSNPs, a non-negligible fraction of our identified cis-regulatory variants might affect gene expression by altering the affinity between TFs and their cognate DNA sequences.
High levels of population differentiation of Denisovan aSNPs within CREs active in immune cells
Recent studies have shown the existence of genetic structure characterising the Indonesian archipelago, with Papuan (and thus Denisovan) ancestry showing a marked west to east cline [6, 53]. We therefore investigated whether our subset of variants, particularly those within CREs active in immune-related cells, also showed similar levels of population differentiation across the region. To this end, we defined aSNPs and naSNPs in haplotype data from 63 individuals from Western Indonesia, also part of [6]. From the total set of 14,117 Denisovan, 8,662 Neanderthal and 22,916 non-archaic variants annotated within CREs active in immune-related cells, we found that 4,732 Neanderthal and 9,682 non-archaic variants (54.6% and 42.2%, respectively) segregate within Neanderthal and non-archaic haplotypes in western Indonesians (Fig 3A).
However, in keeping with the expectation of little to no Papuan genetic ancestry in Western Indonesia, only 1,535 Denisovan variants (10.8%) in our Papuan sample are also segregating within Denisovan-like haplotypes in this additional data set (Fig 3A). Indeed, we found significantly higher Fst values for the set of aSNPs annotated within CREs, especially those of Denisovan ancestry (Wilcoxon p < 2.2 × 10^−16 and 0.0013, respectively for Denisovan and Neanderthal), compared to naSNPs (Fig 3B).
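To make the Fst comparison concrete, here is a minimal Python sketch of a per-variant Fst computed from two population allele frequencies. The excerpt does not say which Fst estimator was used, so the simple heterozygosity-based form below (equal population weights, no sample-size correction) is an assumption, and the function name `wright_fst` is hypothetical.

```python
def wright_fst(p1, p2):
    """Per-variant Fst from allele frequencies in two populations.

    Uses Fst = (H_T - H_S) / H_T with equal population weights, where
    H_T is the expected heterozygosity of the pooled population and
    H_S is the mean within-population expected heterozygosity.
    This estimator choice is an assumption of the sketch.
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return 0.0  # variant is monomorphic in both populations
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


# Illustrative frequencies: an allele at 0.3 in one population and absent
# from the other, versus an allele at similar frequency in both.
print(wright_fst(0.3, 0.0))
print(wright_fst(0.3, 0.25))
```

An allele that is common in one population and absent from the other gives a much larger value than one shared at similar frequencies, which is the pattern the Fst comparison between aSNPs and naSNPs is designed to capture.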
|
|
|
Post by Admin on Dec 22, 2022 20:29:10 GMT
Fig 3. Levels of population differentiation for the set of cis-regulatory SNPs between western Indonesian and New Guinean populations. A) Proportion of variants shared across the Indonesian archipelago segregating within the corresponding haplotypes. B) Distribution of the Fst values between western Indonesia and PNG for the three groups of variants. doi.org/10.1371/journal.pgen.1010470.g003
|
|
|
Post by Admin on Dec 25, 2022 4:06:16 GMT
Denisovan alleles are associated with gene expression differences in immune cells
Given the observed enrichment of aSNPs, especially of Denisovan ancestry, within the CREs of immune-related cells, we used GREAT [54] to link these to their nearest genes in an attempt to identify possible targets. There was limited overlap across ancestries in the sets of putative target genes. For instance, out of a total of 9,504 putative target genes, only 303 (3.2%) are predicted to be targeted by all three ancestries (Fig 4A). Conversely, we found that 795 (8.3%) and 573 (6.0%) genes are uniquely targeted by Denisovan and Neanderthal variants, respectively (Fig 4A).
Fig 4. Biological processes putatively affected by SNPs within immune-related CREs. The figure shows A) the overlap between the putative sets of target genes for each ancestry; B) the semantic similarity score estimates between the sets of significantly enriched GO terms. doi.org/10.1371/journal.pgen.1010470.g004
We next performed a Gene Ontology (GO) enrichment analysis on the set of genes associated with either archaic ancestry, using the set of genes associated with naSNPs as a background set. Denisovan aSNPs are associated with genes strongly involved in multiple immune-related processes (S8 Table). Genes targeted by Neanderthal variants are instead enriched for more general biological processes, although terms related to neutrophil/granulocyte migration and chemotaxis were also observed for Neanderthal variants (S8 Table).
We then computed the semantic similarity between the significantly enriched terms (GREAT-reported BH-adjusted hypergeometric p ≤ 0.01) using a method that incorporates both the locations of the terms in the GO graph and their relations with their ancestor terms [55], and found relatively low levels of semantic similarity between archaic and non-archaic enriched GO terms (Fig 4B), suggesting the sets of variants might indeed affect the regulation of different biological processes within immune-related cells. Finally, for each ancestry we manually examined the set of genes associated with the significant GO terms. Denisovan cis-regulatory variants were predicted to regulate genes such as TNFAIP3, OAS2 and OAS3, all of which have been repeatedly identified as harbouring archaic hominin contributions that impact immune responses to pathogens [51, 52, 56]. In particular, we found 20 Denisovan variants associated with OAS2 and OAS3, eight of which (rs368816473, rs372433785, rs139804868, rs146859513, rs143462183, rs370655920, rs375463218 and rs372139279) were also predicted to significantly alter the affinity between TFs and their cognate DNA sequences by our analyses. Notably, in all cases the archaic allele segregated at frequencies between 0.2 and 0.4 in Papuans but was absent from Western Indonesia. Similarly, a comparison with 6 different non-human primate genomes indicated that for 7 of these 8 Denisovan aSNPs the introgressed allele is derived. The OAS locus has been repeatedly found to harbour signals of both Neanderthal and Denisovan adaptive introgression, with archaic alleles predicted to alter gene regulation [52, 56]. Our previous work has shown that both OAS2 and OAS3 are differentially expressed in whole blood between the people of Mentawai, a small barrier island off the coast of Sumatra, in West Indonesia, and the Korowai, a genetically Papuan group living on the Indonesian side of New Guinea island (Fig 5B) [57].
To understand whether Denisovan introgression might contribute to these differences, we tested the regulatory activity of five of the eight Denisovan variants described above (rs372433785, rs139804868, rs146859513, rs143462183 and rs370655920), all of which were predicted to disrupt the binding sites of TFs active in human immune cells, using a plasmid reporter experiment (see Methods).
Fig 5. Functional validation of the regulatory impact of Denisovan variants near OAS2 and OAS3. A) The genomic region encompassing the eight Denisovan variants associated with OAS2 and OAS3. The top two tracks display patterns of DNase Hypersensitivity sites in Blood and immune T cells as well as in HSC and B cells [34]. Tested variants are shown along with their calculated Δ PWM. Bottom tracks display the chromatin state information for the same tissues; B) the distribution of the log2 RNA-seq counts per million in whole blood for OAS2 and OAS3 between the Korowai and the people of Mentawai, from [57]; C) the relative expression changes between Denisovan and non-archaic alleles, or between the alternative and the reference allele for positive control (rs9283753) [45], in two Papuan LCLs. Asterisks mark significant differences from 1, BH-corrected p < 0.05: *** = 0.0001 < p ≤ 0.001; ** = 0.001 < p ≤ 0.01; * = 0.01 < p ≤ 0.05. doi.org/10.1371/journal.pgen.1010470.g005
We tested all five alleles across two different lymphoblastoid cell lines (LCLs) established from Papuan donors. In all cases, we found the direction of effect to be consistent across biological and technical replicates. In particular, two Denisovan alleles (rs139804868:A>G and rs146859513:C>G) consistently showed significantly lower transcriptional rates compared to their non-archaic counterpart. rs139804868:A>G is predicted to disrupt a motif bound by BHLHA15 and TAL1, which have been respectively found to be involved in immune B cell differentiation [58] and hematopoiesis [59].
rs146859513:C>G is predicted to disrupt a binding site for NFKB2, which is known to regulate the expression of cytokine- and chemokine-related genes [60] (Fig 5C). While confirming the validity of our approach, these findings, taken together, point to a substantial contribution of Denisovan variants to immune-related processes in present-day Papuans, one chiefly mediated through the regulation of active immune responses mounted against pathogenic infections.
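The Δ PWM score used throughout these results is defined in the text as the PWM score of the archaic allele minus that of the non-archaic allele. A minimal Python sketch of that definition follows. The toy matrix, the additive log-odds scoring and all function and parameter names are assumptions for illustration; they are not taken from HOCOMOCO, Jaspar or the authors' code.

```python
def pwm_score(pwm, seq):
    """Sum the per-position weights of `seq` under a position weight matrix.

    `pwm` is a list with one dict per motif position, mapping each base
    to its (assumed log-odds) weight.
    """
    if len(seq) != len(pwm):
        raise ValueError("sequence length must equal motif length")
    return sum(column[base] for column, base in zip(pwm, seq))


def delta_pwm(pwm, non_archaic_seq, offset, archaic_base):
    """Delta PWM = score(archaic allele) - score(non-archaic allele).

    `offset` is the 0-based position of the variant within the motif window.
    Negative values mean the archaic allele weakens the predicted site.
    """
    archaic_seq = non_archaic_seq[:offset] + archaic_base + non_archaic_seq[offset + 1:]
    return pwm_score(pwm, archaic_seq) - pwm_score(pwm, non_archaic_seq)


# Hypothetical 3-bp motif with made-up weights, for illustration only.
toy_pwm = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -2.0, "C": 2.0, "G": 0.0, "T": -2.0},
    {"A": -1.0, "C": -1.0, "G": 1.5, "T": -1.0},
]

# A C-to-G change at the middle position of the window "ACG".
print(delta_pwm(toy_pwm, "ACG", 1, "G"))
```

In a real analysis the window would be slid across every motif position overlapping the variant, and both strands would be scored; this sketch covers a single window on one strand.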
|
|
|
Post by Admin on Dec 26, 2022 19:31:16 GMT
Discussion
There is significant interest in understanding the functional consequences of archaic introgression. Evidence indicates that both Denisovan and Neanderthal aSNPs, especially those within protein coding and conserved non-coding elements, are mostly deleterious and negatively selected in modern humans [14, 61]. Similar findings have been recently reported for highly pleiotropic enhancers, where aSNPs are depleted likely as a consequence of their potential to perturb gene expression across multiple tissues [12]. Nevertheless, out of the substantial number of archaic variants still segregating within present-day populations, a large fraction falls within genomic regions that show strong evidence of functional activity across a variety of cell types. Indeed, studies conducted primarily on Neanderthal introgressed DNA have suggested a non-negligible contribution to gene expression variation in modern humans [10, 11, 22], with repeated examples of Neanderthal archaic variants falling within regulatory elements or the seed region of mature micro-RNAs predicted to affect transcriptional and post-transcriptional regulatory processes [11, 62].
In this study, we have taken advantage of a recently published dataset [6] to investigate the landscape of archaic introgression in individuals of Papuan genetic ancestry, the functional consequences of which remained poorly understood. This has allowed us to characterise the putative contribution of Denisovan DNA, which is known to account for up to 5% of the genome of present-day Papuans [4, 63], while also comparing it to that of Neanderthal DNA and non-archaic variants that arose following the Out of Africa event. We specifically analysed all of our variants across multiple cell types and functional chromatin states aiming to account for the strong dependency on the cells’ chromatin landscapes [34] when assessing the potential activities of introgressed alleles. While our analyses focus on a small subset of introgressed variants identified by [6], our filtering criteria ensure that we only consider variants with a high likelihood of being truly introgressed. In addition, comparing Denisovan and Neanderthal signals within the same samples serves as an internal control for difference in linkage disequilibrium and background selection between ancestries, hence providing stronger support for any eventual Denisovan- and/or Neanderthal-specific contribution to phenotypic variation in contemporary Papuans.
First, we confirm previous reports that aSNPs mostly occur within non-coding sequences [9, 14, 64], with a large number falling within highly constitutive and/or functionally inert regions, which might be the result of weaker purifying selection acting on these elements. aSNPs are significantly depleted from other inactive chromatin states but over-represented within states putatively containing elements involved in gene regulatory processes. Applying our approach to Neanderthal aSNPs segregating within present-day West Eurasians [31] replicates most of our findings, as well as previously reported signals including a depletion of Neanderthal variants within enhancer-like elements [12]. Previous studies have shown that, while polygenic risk scores [65] and eQTLs [66] have limited cross-population portability, biochemically active functional regulatory elements tend to be under evolutionary constraint within the human lineage [67, 68]. Thus, while our approach is contingent on functional data generated in a population that is not of Papuan ancestry, our focus on functional analyses is likely to be more robust to differences across populations in linkage disequilibrium. Nevertheless, our set of SNPs, and our conclusions, represent a first step that will benefit from functional validation at scale.
Second, we find notable differences between the two archaic ancestries in their patterns of introgression. Such differences are likely a consequence of the tissue-specific nature of CREs, and of enhancers in particular [69]. Indeed, we report that variants annotated in these CRE-associated states tend to lie within elements active in a restricted number of cell types, suggesting that the effects of archaic introgression might be highly cell-type-specific. In addition, admixture with Neanderthals might have similar consequences even across distantly related human populations, with Neanderthal aSNPs strongly enriched within elements active within blood & immune T cells and mesenchymal cells both in Papuans and Europeans, replicating previously reported observations [11, 12]. While we report a similar enrichment for Denisovan alleles within these tissues, we further note a significant Denisovan-specific enrichment within CREs active within HSC & B cells, suggesting a more pervasive contribution of Denisovan introgression to gene regulatory processes happening within immune-related cells. In line with this, we further show that Denisovan and Neanderthal variants annotated within immune-related CREs might target different sets of genes with limited overlap in the biological processes they are involved in. Indeed, only genes predicted to be regulated by Denisovan aSNPs are strongly involved in active immune responses.
Third, we have characterised the impact of introgressed variants on transcription factor binding sites. A substantial proportion of aSNPs within CREs are predicted to modify the interactions between TFs and their cognate binding sites, although we do not detect an overall higher impact of aSNPs relative to naSNPs. Our data suggest that, despite aSNPs possibly disrupting individual DNA motifs, at a genome-wide level admixture with archaic hominins is unlikely to have resulted in large rewiring of transcription factor regulatory networks. While this might contrast with the findings of previous studies [70], in our study we focused on families of DNA motifs (as clustered by [41]) rather than considering enrichment within individual PWMs, an approach that we believe is better suited to controlling for the typical redundancy of PWMs. This, however, does not exclude the possibility that aSNPs disrupt specific DNA motifs. Indeed, our approach also identifies Neanderthal aSNPs disrupting the PWM of TFs such as CREB1, USF1, MYB and STAT6, similarly to what has been reported by [70]. We also note that there is a significant (hypergeometric p < 2.2 × 10^−16) overlap between genes targeted by N-eQTLs in [70] and our own set of possible target genes (S9 Fig).
Finally, we performed a plasmid reporter assay experiment to quantify the molecular impact of Denisovan TFBS-disrupting variants annotated within CREs active in immune-related cells and predicted to affect the expression level of OAS2 and OAS3. These genes belong to a family of pattern-recognition receptors involved in innate immune responses against both RNA and DNA viruses, with OAS3 considered to be essential in reducing viral titer during Chikungunya, Sindbis, influenza or vaccinia viral infections [71]. At least two previous studies have shown the presence of both Neanderthal and Denisovan archaic haplotypes segregating at this locus respectively within European [52] and Papuan [56] individuals. Sams et al. found two variants (rs10774671, rs1557866) within these Neanderthal haplotypes which are respectively associated with the codification of different OAS1 splicing isoforms and with a reduction in OAS3 expression levels, the latter only upon viral infection [52]. Our previous work has found that both OAS2 and OAS3 are differentially expressed between western Indonesians and Papuans [57]. Here we report a set of eight SNPs (rs368816473, rs372433785, rs139804868, rs146859513, rs143462183, rs370655920, rs375463218 and rs372139279), located roughly 41 kb upstream of OAS2 and OAS3, all of which are predicted to strongly alter the ability of different transcription factors, including IRF4, NFKB2 and TAL1, to bind to their underlying DNA motifs. In all cases, the reference allele is fixed within western Indonesian populations, whereas the archaic alleles segregate at frequencies between 0.2 and 0.4 in Papuans. 
We show that at least 5 of these variants lie within sequences that can regulate expression in reporter gene plasmid assays, and that in two of these variants (rs139804868 and rs146859513) the Denisovan allele is associated with significantly lower transcriptional activity compared to the non-archaic allele in immune cells from two different Papuan donors, again suggesting that these SNPs are of biological importance.
Recent work by [72] has shown that variants which arose in the human lineage following its split from the Neanderthal-Denisovan common ancestor, and which have reached (near) fixation in modern humans since then, can exhibit significant differences in regulatory activity relative to the ancestral state. Similarly, Jagoda et al. have shown a large impact on gene expression associated with the presence of Neanderthal aSNPs in vitro [73]. Our results suggest that Denisovan alleles segregating within modern human populations are also likely to actively participate in gene regulatory processes, especially those that take place within immune-related cells. This agrees with recent findings from a study that analysed the genomes of present-day people of Taiwan, the Philippines, the Solomon Islands and Vanuatu [50]. While further experimental validation of our observations is necessary in order to characterise the genome-wide impact of archaic introgression, the results presented in this study argue for a potential contribution of Denisovan variants to immune-related phenotypes amongst early modern humans in the region, potentially favouring adaptation to the local environment [15, 22].
|
|
|
Post by Admin on May 3, 2024 22:41:11 GMT
Papua New Guinea (PNG) has a wide range of environments, each presenting unique challenges to human survival. Highlanders and lowlanders of PNG are striking examples of populations facing distinct environmental stress. Whereas the highlanders encounter low oxygen availability due to altitude, the lowlanders are exposed to specific pathogens that are absent in the highlands, such as malaria. Despite these strong environmental pressures, the specific adaptations of these populations have remained overlooked. A new study published in Nature Communications sheds light on the genetic adaptations of Papua New Guineans in response to their unique environmental pressures. The findings rely on new whole-genome sequences from highlanders and lowlanders from Papua New Guinea. The data were collected by the Papuan Past project, which brings together researchers from the universities of Tartu (Estonia), Toulouse (France), and Papua New Guinea. "We explored the signatures of selection in newly sequenced whole genomes of 54 PNG highlanders from Mt. Wilhelm (Chimbu Province) and 74 PNG lowlanders from Daru Island (Western Province). We hypothesized that the genomes of both populations have been shaped differently to mitigate the detrimental effects of their respective environments," explains Dr. François-Xavier Ricaut, CNRS researcher at the Centre de Recherche sur la Biodiversité et l'Environnement (University of Toulouse, France), the project leader and corresponding author. "The genetic variants under selection identified in our study show associations with blood-related phenotypes," says Dr. Mathilde André, the lead author from the Institute of Genomics (University of Tartu, Estonia). One of these genetic variants under selection in Papua New Guinean highlanders might impact the red blood cell count. A higher red blood cell count helps highlanders adapt to the lower oxygen availability in the highlands.
In contrast, the selected variant in the lowlanders is associated with the percentage of white blood cells. "This supports the idea that hypoxia might have been the main driving force of selection that has acted on Papua New Guinean highlanders. However, specific pathogens might have shaped the genome of lowlanders through selection," adds Dr. André. Dr. Nicolas Brucato, a co-author from the University of Toulouse, continues, "Interestingly, both the variants also affect the heart rate of individuals with those mutations. This multiplicity highlights the complexity of interpreting the role of genetic mutations. One mutation can affect multiple phenotypes altogether." Dr. Mayukh Mondal from the Institute of Genomics, who co-led the project, adds, "Interestingly, one of the top candidates for selection in lowlanders has a non-human origin." Denisovans were one of the archaic hominin populations living in Asia before modern humans settled in Papua New Guinea around 50,000 years ago. Although Denisovans went extinct around that time, they interbred with Papua New Guinean ancestors and left their legacy in the genome of modern Papua New Guineans. This study suggests that a genetic mutation in Denisovans that impacts a specific protein structure has been directly passed to Papua New Guinean genomes. "It looks like the altered protein is beneficial for the lowlanders to survive in their environment. Although we do not know the exact cause of this selection, this mutation might help the lowlanders overcome malaria," concludes Dr. Mondal. This new insight into how local adaptation has shaped the genomes and phenotypes of Papua New Guinean highlanders and lowlanders differently highlights the need to investigate populations with diverse backgrounds to shed light on key aspects of human biology. More information: Positive selection in the genomes of two Papua New Guinean populations at distinct altitude levels, Nature Communications (2024).
DOI: 10.1038/s41467-024-47735-1 www.nature.com/articles/s41467-024-47735-1
|
|