Post by Admin on Oct 12, 2018 18:20:40 GMT
Figure S1. Definition of Introgressed Segments, Related to Figure 1

Results

VIPs and Introgression Data

We focused on 4,534 VIPs (∼20% of the human proteome; Table S1) that engage in defined physical interactions with many viruses, including 20 human viruses known to interact with at least 10 VIPs each (Table S1; STAR Methods). VIPs were annotated based on interactions with modern viruses, but these can be thought of as proxies for related viruses in ancient populations. This extension is supported by the fact that related viruses tend to use similar host VIPs (Enard et al., 2016). For example, VIPs interacting with HIV are also likely to interact with other lentiviruses. Thus, if an enrichment of adaptive introgression at HIV-interacting VIPs is observed, it is evidence of past adaptation to a lentivirus rather than to HIV itself.

To estimate enrichment of introgression at VIPs, we used the introgressed segments (IS) of Neanderthal ancestry in East Asian and European modern human genomes identified by Sankararaman et al. (2014). These authors used a conditional random field (CRF) approach to estimate the frequencies and lengths of IS, and here we simply reuse these estimates (STAR Methods). In brief, for each position in the genome that is marked by a SNP, the CRF model provides a posterior probability that any randomly sampled modern haplotype contains an allele of Neanderthal origin. Smoothed over a set of contiguous SNPs, this also provides a regional estimate of the frequency of Neanderthal ancestry (Figures 1 and S1; STAR Methods). The method also generates a list of high-confidence Neanderthal haplotypes present in some individuals, which we also used in this paper. We analyzed East Asian and European modern human populations separately because they had distinct histories of interbreeding with Neanderthals (Kim and Lohmueller, 2015; Vernot and Akey, 2015).
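To make the smoothing step concrete, here is a minimal sketch that averages per-SNP posterior probabilities over a sliding window of contiguous SNPs. The function name, the window size, and the plain unweighted mean are assumptions made for illustration; they are not the procedure of Sankararaman et al. (2014).

```python
from statistics import fmean

def regional_ancestry(posteriors, window=5):
    """Smooth per-SNP Neanderthal posteriors into a regional frequency estimate.

    posteriors: posterior probabilities, one per SNP, in genomic order.
    window: number of contiguous SNPs averaged around each SNP (assumed value).
    """
    half = window // 2
    smoothed = []
    for i in range(len(posteriors)):
        lo = max(0, i - half)                      # clip the window at the left edge
        hi = min(len(posteriors), i + half + 1)    # clip the window at the right edge
        smoothed.append(fmean(posteriors[lo:hi]))
    return smoothed
```

A run of SNPs with high smoothed values would then correspond to a candidate introgressed segment, with its mean value serving as a rough estimate of the segment's frequency.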
Figure 2. Confounding Factors Included in the Bootstrap Test

To determine whether VIPs are enriched in segments introgressed between modern humans and Neanderthals, it is important to first define which other factors, in addition to the levels of constraint, affect the occurrence of IS along the genome independently of interactions with viruses. Factors that affect the occurrence of IS should differ inside compared to outside IS. We must therefore match VIPs and non-VIPs for genomic factors that (1) differ inside versus outside IS and (2) also differ between VIPs and non-VIPs (Figure 2A). We identified all the genomic factors that differed between IS and non-IS regions in both directions, including GC content, the number of human protein-protein interactions, and multiple parameters controlling for levels of deleterious variants (i.e., Tajima’s D, FUNSEQ score, and densities of coding, regulatory, and conserved elements) (Figures 2 and S2; Table S2; STAR Methods). Because all of these genomic parameters also varied between VIPs and non-VIPs (Table S2), we used a bootstrap test that first matches the VIPs with control non-VIPs for all relevant factors (Figures 2B and 2C; Tables S3 and S4; STAR Methods).

We also systematically matched VIPs and non-VIPs with similar recombination rates in the bootstrap test (STAR Methods), and we assessed whether the enrichment of VIPs in the IS becomes more pronounced in regions of higher recombination rate (Hinch et al., 2011). We did this to further confirm that adaptive introgression, rather than heterosis, explains our results. Indeed, Kim et al. (2017) have recently shown that heterosis can mimic adaptive introgression in regions of low recombination, but does so to a smaller extent in regions of high recombination.
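The matching logic of such a bootstrap test can be sketched as follows. This is an illustrative outline only: the dictionary layout, the relative tolerance `tol`, the number of iterations, and the add-one p-value convention are assumptions, not the authors' implementation (see STAR Methods for that). The requirement of at least three matched controls per VIP follows the text of the paper.

```python
import random

def matched_controls(vip, non_vips, factors, tol=0.2):
    """Non-VIPs whose confounding factors all lie within a relative tolerance of the VIP's."""
    return [g for g in non_vips
            if all(abs(g[f] - vip[f]) <= tol * max(abs(vip[f]), 1e-9) for f in factors)]

def bootstrap_test(vips, non_vips, factors, n_iter=1000, min_controls=3, tol=0.2, seed=0):
    """Compare the number of VIPs in IS with matched control sets.

    Each gene is a dict holding the factor values plus 'in_is' (1 if it overlaps an IS).
    Returns (observed VIP count, mean control count, one-sided p-value).
    """
    rng = random.Random(seed)
    pairs = [(v, matched_controls(v, non_vips, factors, tol)) for v in vips]
    # Keep only VIPs with enough matched controls (assumed threshold, default 3).
    pairs = [(v, c) for v, c in pairs if len(c) >= min_controls]
    observed = sum(v["in_is"] for v, _ in pairs)
    # Each iteration draws one matched control per VIP and counts overlaps with IS.
    null_counts = [sum(rng.choice(c)["in_is"] for _, c in pairs) for _ in range(n_iter)]
    expected = sum(null_counts) / n_iter
    p_value = (1 + sum(n >= observed for n in null_counts)) / (1 + n_iter)
    return observed, expected, p_value
```

Here `factors` would list keys such as GC content or recombination rate; adding recombination rate to that list is how the sketch would mirror the systematic matching on recombination described above.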
Post by Admin on Oct 13, 2018 18:07:18 GMT
Figure S2. Genomic Factors inside and outside Introgressed Segments, Related to Figure 2

The model of positive directional selection of Neanderthal IS predicts an enrichment of Neanderthal ancestry at VIPs. More specifically, positive directional selection should have left Neanderthal IS at VIPs that are longer and at higher frequencies than Neanderthal segments that overlap non-VIPs. Long IS, in particular, are expected at VIPs if positive directional selection occurred not too long after interbreeding. We first used the bootstrap test to show that significantly more Neanderthal IS overlap VIPs than non-VIPs both in East Asia (169 segments overlapping VIPs versus 136 overlapping matched non-VIPs on average, bootstrap test p < 10⁻³) and in Europe (154 segments overlapping VIPs versus 128 overlapping matched non-VIPs on average, bootstrap test p = 0.003). We further used the hypergeometric test to detect a strong and highly significant excess of long and frequent Neanderthal IS encompassing VIPs in both East Asian and European populations (Figure S3). Specifically, the excess of VIPs in the long IS (≥100 kb) is significantly higher than in all (>0 kb) IS both in East Asians (Figure S3A, hypergeometric one-tailed test p = 1.2 × 10⁻⁵) and Europeans (Figure S3B, p = 0.007). Likewise, the excess of VIPs in the IS at frequencies >15% is higher than that in all IS both in East Asians (p = 0.05) and Europeans (p = 0.025). Most importantly, the excess of VIPs in the long (≥100 kb) segments at frequencies >15% is significantly higher than that in all segments (>0 kb) at frequencies higher than 15% both in East Asians (Figure S3A, p = 2.5 × 10⁻⁴) and Europeans (Figure S3B, p = 0.034). This significant excess of long and frequent Neanderthal IS is the hallmark of directional selection.
These patterns remained when we restricted the analysis to only very high confidence segments of Neanderthal ancestry (Figures S3C and S3D; STAR Methods) and are also robust to variations in the definition of IS (STAR Methods).

Figure S3. Hypergeometric Test Results for the Excess of Long and Frequent Neanderthal Segments, Related to Figure 3

The hypergeometric test we implemented required fixing arbitrary thresholds of length and frequency of the IS. We thus further verified whether we could observe a more general trend toward an increase in Neanderthal ancestry at VIPs as we increased the length and frequency of IS across a wide range of thresholds.

Figure 3. Excess of Introgression from Neanderthals to Modern Humans at VIPs

Figure 3A shows that the excess of Neanderthal ancestry at VIPs does tend to increase progressively with larger length thresholds as well as with larger frequency thresholds (see also Figures S4A and S4B). Moreover, the excess of Neanderthal ancestry at VIPs is significantly greater in high-recombination regions of the genome (hypergeometric test using IS larger than 100 kb and at frequencies higher than 15%; East Asia p = 0.016, Europe p = 0.039) (Figure 3B), as expected under the adaptive introgression model. These patterns remained when (1) we restricted the analysis to LT-VIPs (Figure S4C), (2) we used a different recombination map (Kong et al., 2010) (Figures S4D and S4E), or (3) we added a control for background selection (McVicker et al., 2009) (Figures S4F and S4G). Furthermore, VIPs and control non-VIPs have very similar numbers of segregating variants (241 segregating variants on average in VIPs and 239 in non-VIPs in East Asia, p = 0.32; 247 in VIPs and 243 in non-VIPs in Europe, p = 0.2), indicating that VIPs and control non-VIPs have similar amounts of highly constrained sites.
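For readers unfamiliar with the one-tailed hypergeometric test mentioned above, here is a minimal standard-library sketch of its upper-tail probability. The mapping of counts to arguments described after the code is one plausible setup and an assumption; the exact construction used in the paper is described in its STAR Methods.

```python
from math import comb

def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) for a hypergeometric variable.

    population: total number of items (e.g. all IS considered).
    successes:  items of the type of interest (e.g. IS overlapping VIPs).
    draws:      size of the subset examined (e.g. long IS, >=100 kb).
    k:          observed number of successes in that subset.
    """
    k = max(k, 0)
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(comb(successes, i) * comb(population - successes, draws - i)
               for i in range(k, upper + 1))
    return tail / total
```

Under this assumed setup, a small tail probability means that VIP-overlapping segments are over-represented among the long (or frequent) IS relative to their share among all IS. Repeating the call over a grid of length and frequency thresholds would give the kind of trend summarized in Figure 3A.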
Post by Admin on Oct 14, 2018 18:29:48 GMT
Figure S4. Additional Controls for the VIPs versus Non-VIPs Comparison, Related to Figure 3

Overall, the enrichment of Neanderthal ancestry, and specifically the strong enrichment of long and frequent IS at VIPs, suggests that viruses frequently drove adaptive introgression after interbreeding between Neanderthals and modern humans. It is important to note, however, that so far we have not used information on adaptive introgression at the level of specific loci. Several scans previously identified multiple loci with locus-specific evidence of adaptive introgression (Gittelman et al., 2016; Jagoda et al., 2017; Racimo et al., 2017). If the overall enrichment of long and frequent IS reflects the impact of adaptive introgression at VIPs, then VIPs should be particularly strongly enriched in loci previously shown to have undergone adaptive introgression. Here, we used the loci identified by these three scans and estimated their enrichment at VIPs. In line with the overall enrichment of Neanderthal ancestry at VIPs being due to adaptive introgression, we found a very strong excess of adaptive IS at VIPs compared to non-VIPs (Figure S4H). As expected, the excess is very pronounced for long and frequent adaptive IS (Figure S4I). Thus, these results further show that adaptive introgression had a substantial impact at VIPs after interbreeding.

Estimating the Proportion of Adaptive Introgressed Segments

The excess of long and frequent IS at VIPs can be used to estimate the rate of adaptive introgression. The number of long and frequent IS at VIPs above the number expected from matched non-VIPs is a lower bound for the number of adaptive IS. For example, if there were 50 long and frequent IS at VIPs versus 20 at control non-VIPs, we would estimate that the 30 additional segments at VIPs were due to adaptive introgression.
Overall, we identified 121 (versus 66 expected) segments longer than 100 kb overlapping VIPs in East Asia (bootstrap test p < 10⁻³) and 103 (versus 68 expected) in Europe (p < 10⁻³). For the introgressions that are long (≥100 kb) and at high frequency (≥15%), and thus more likely to be adaptive, the absolute counts are smaller but the enrichment is even more pronounced: 36 (versus 11 expected) segments in East Asia (p < 10⁻³) and 19 (versus 6 expected) in Europe (p < 10⁻³). Based on these numbers, we estimated that out of all long and high-frequency IS from Neanderthals to modern humans, 15% to 32% (25 to 54 of 171) in East Asians and 12% to 25% (13 to 27 of 105) in Europeans have been positively selected in response to viruses.

In total, there are 171 and 105 long and high-frequency IS overlapping genes in East Asians and Europeans, respectively. In East Asians, a total of 1,702 VIPs matched three or more control non-VIPs in the bootstrap test. These 1,702 VIPs overlap the 36 IS (versus 11 expected) used to measure enrichments (Figure 3; STAR Methods), leaving us with ∼25 adaptive IS. An additional 42 IS overlapping VIPs were not used because the VIPs matched fewer than three control non-VIPs in the bootstrap test (STAR Methods). If we assume that the same proportion was adaptive among the unmatched VIPs, we obtain a total of ∼54 (54.17: 25 of 36 matched plus ∼29 of 42 unmatched) positively selected IS, or 32% of all the 171 long, high-frequency IS in East Asians. Using the same extrapolation, we estimated that a total of ∼27, or ∼25%, of all the 105 long, high-frequency IS in Europeans were positively selected in response to viruses. We could also use these enrichments to estimate false discovery rates (FDR) of adaptive introgression for individual VIPs. VIPs with FDR below 50% are listed in Table S5.
Interestingly, several previously published candidate VIP loci for adaptive introgression have low FDR, including the OAS gene cluster (FDR = 0.22 in Europe) (Mendez et al., 2013) and the TLR1/6/10 gene cluster (FDR = 0.17 in Europe) (Dannemann et al., 2016).
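The East Asian extrapolation above can be reproduced with a few lines of arithmetic. The function and variable names are hypothetical; the inputs (36 observed, 11 expected, 42 unmatched, 171 total) are the figures quoted in the text. The European count of unmatched IS is not given in this excerpt, so only the East Asian case is worked through.

```python
def adaptive_fraction(observed, expected, unmatched, total):
    """Lower and upper estimates of the fraction of adaptive IS.

    observed:  long, high-frequency IS at matched VIPs.
    expected:  count expected from matched control non-VIPs.
    unmatched: IS at VIPs that could not be matched to enough controls.
    total:     all long, high-frequency IS overlapping genes.
    """
    excess = observed - expected              # adaptive IS among matched VIPs
    rate = excess / observed                  # proportion adaptive, assumed to carry over
    extrapolated = excess + rate * unmatched  # add the unmatched VIPs at the same rate
    return excess / total, extrapolated / total, extrapolated

low, high, n_adaptive = adaptive_fraction(observed=36, expected=11, unmatched=42, total=171)
print(f"{n_adaptive:.2f} adaptive IS, {low:.0%} to {high:.0%} of all long, frequent IS")
```

The lower bound counts only the excess at matched VIPs (25 of 171), while the upper figure assumes that unmatched VIPs carry adaptive IS at the same rate as matched ones.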
Post by Admin on Oct 15, 2018 18:45:28 GMT
We next tested for an excess of introgressions from modern humans to Neanderthals, using the data on introgressed genomic regions in a single Altai Neanderthal individual (Kuhlwilm et al., 2016). Because adaptive IS are expected to be longer than neutral ones, we estimated the excess of segments of modern human ancestry in the single Altai Neanderthal individual genome at VIPs as a function of their size. We found a large excess of long segments of modern human ancestry at VIPs (Figure 4A). Furthermore, as predicted, the excess is more pronounced in high-recombination regions of the genome (Figure 4B). We confirmed that this excess was also detected using only high-quality LT-VIPs (Figure S5).

Figure 4. Excess of Introgression from Modern Humans to Neanderthals at VIPs

We next asked if it is possible to identify which ancient viruses might be responsible for the observed enrichments. While such an analysis in the direction from modern humans to Neanderthals is severely underpowered with only 19 VIPs found in IS over 100 kb in the Altai Neanderthal, the number is much larger in modern humans, with 152 VIPs found in long (≥100 kb) and frequent (≥15%) Neanderthal IS. We used the 20 modern human viruses that interact with ten or more VIPs as proxies for the ancient related viruses that infected humans at the time of interbreeding (Table S1). These 20 viruses are evenly distributed between RNA viruses (2,684 VIPs) and DNA viruses (2,547 VIPs) (Table S1). Of the 2,684 RNA VIPs, 1,563 interact with only RNA viruses, while out of 2,547 DNA VIPs, 1,426 interact with only DNA viruses.

Figure S5. Excess of Introgression from Modern Humans to Neanderthals at LT-VIPs, Related to Figure 4

We first asked if ancient RNA or DNA viruses were more likely to have been involved, with the expectation that RNA viruses should be more likely to drive adaptive introgression because they are more likely to jump from one species to another (Geoghegan et al., 2017; Kreuder Johnson et al., 2015).
To determine whether introgression was skewed toward either RNA or DNA viruses, we used the bootstrap test to compare the number of IS at VIPs that interact with only one RNA virus with the number of IS at VIPs that interact with only one DNA virus and are located far from any RNA VIP (≥500 kb) (STAR Methods). We did not detect any significant skew in favor of RNA-virus VIPs in East Asia (Figure 5A). By contrast, in Europe, we detected a strong bias toward RNA-virus VIPs in long, high-frequency IS (Figure 5A). This pattern was more pronounced for introgression in regions of high recombination (Figure 5B). The enrichment of Neanderthal ancestry at RNA VIPs became even more pronounced (Figures S6A and S6B) when we repeated the comparison after excluding genes known to interact with bacteria or Plasmodium (Ebel et al., 2017) and immune genes annotated as such by the Gene Ontology database (The Gene Ontology Consortium, 2017). Thus, other pathogens appear unable to explain the signal at RNA VIPs. The enrichment was also more pronounced when using only adaptive IS (Gittelman et al., 2016; Jagoda et al., 2017; Racimo et al., 2017) (Figures S6C and S6D). Furthermore, the slightly stronger background selection at RNA VIPs than at control DNA VIPs both in East Asia and Europe (7% stronger in both cases, p < 10⁻³) makes the comparison conservative. RNA VIPs also have slightly fewer segregating variants (9% fewer in Europe, p < 10⁻³) and thus slightly more sites under strong purifying selection than control DNA VIPs, which is again conservative. The enrichment at RNA VIPs was further confirmed using only LT-VIPs (Figures S6E and S6F) or a different recombination map (Figures S6G and S6H).
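The selection of the two groups compared above (VIPs interacting with only one RNA virus, versus VIPs interacting with only one DNA virus and lying at least 500 kb from any RNA VIP) can be sketched as below. The record layout (`chrom`, `start`, `end`, `viruses`) and the function names are hypothetical, and distances are measured between gene boundaries, which is an assumption; the paper's exact definition is in its STAR Methods.

```python
def far_from_rna(gene, rna_vips, min_dist=500_000):
    """True if no RNA VIP on the same chromosome lies within min_dist bp of the gene."""
    for r in rna_vips:
        if r["chrom"] != gene["chrom"]:
            continue
        # Gap between the two intervals; 0 if they overlap.
        gap = max(r["start"] - gene["end"], gene["start"] - r["end"], 0)
        if gap < min_dist:
            return False
    return True

def select_groups(vips, virus_type, min_dist=500_000):
    """Split VIPs into single-RNA-virus VIPs and isolated single-DNA-virus VIPs.

    vips: list of dicts with 'chrom', 'start', 'end', 'viruses' (list of virus names).
    virus_type: dict mapping each virus name to 'RNA' or 'DNA'.
    """
    # Any VIP that interacts with at least one RNA virus counts as an RNA VIP.
    rna_vips = [v for v in vips if any(virus_type[x] == "RNA" for x in v["viruses"])]
    single = [v for v in vips if len(v["viruses"]) == 1]
    rna_only = [v for v in single if virus_type[v["viruses"][0]] == "RNA"]
    dna_only = [v for v in single
                if virus_type[v["viruses"][0]] == "DNA"
                and far_from_rna(v, rna_vips, min_dist)]
    return rna_only, dna_only
```

The distance filter keeps a long IS that was driven by an RNA VIP from being counted toward a neighboring DNA VIP, which would otherwise blur the contrast between the two groups.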
Post by Admin on Oct 16, 2018 18:15:17 GMT
Figure 5. Excess of Introgression from Neanderthals to Modern Humans at RNA VIPs versus DNA VIPs

We next tried to identify which families of ancient RNA viruses might explain the observed skew toward RNA VIPs in Europeans. Of the 11 RNA viruses included in this analysis (Table S1), HIV (a lentivirus), influenza A virus (IAV, an orthomyxovirus), and hepatitis C virus (HCV, a flavivirus) have by far the highest numbers of VIPs. Both HIV-only and IAV-only VIPs were associated with a large excess of high-frequency, long adaptive IS in European modern humans compared to VIPs that interact with only one DNA virus (Figures 6A–6D). The excess was particularly strong for HIV-only and IAV-only VIPs within high-recombination regions (Figures 6B and 6D). Specifically, we found seven (versus 0.29 expected) high-frequency (≥15%) IS overlapping IAV-only VIPs (p < 10⁻³) and eight (versus 0.83 expected) overlapping HIV-only VIPs (p < 10⁻³). Table S5 lists the specific VIPs found in these IS.

Figure S6. Additional Controls for the RNA versus DNA VIPs and Specific Virus Comparisons, Related to Figures 5 and 6

Figure 6. Excess of Introgression from Neanderthals to Modern Humans at IAV-Only, HIV-Only, and HCV-Only VIPs

While these results were robust when restricting to HIV-only LT-VIPs (Figures S6I and S6J), we did not detect a significant enrichment at IAV-only LT-VIPs. It is possible that the smaller number of IAV-only LT-VIPs (56 overall, versus 195 for HIV, and only 15 in high-recombination regions) did not provide sufficient power to detect a significant excess of introgression. Indeed, subsampling of HIV-only LT-VIPs to the number of IAV-only LT-VIPs reduced the power enough to eliminate statistical significance (bootstrap test p > 0.05 for IAV-only LT-VIPs and for all ten random subsamples, across all introgression lengths and frequencies).
Although we detected no significant enrichment of IS at HCV-only VIPs (Figures 6E and 6F), this might also be an issue of statistical power because HCV has far fewer unique VIPs than HIV and IAV (157 versus 405 and 490, respectively). Indeed, subsampling HIV-only and IAV-only VIPs down to the small number of HCV-only VIPs leaves insufficient power to detect any excess (Figure S6K), leaving open the possibility that HCV-like viruses might also have been involved.
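The subsampling argument can be outlined in code as follows. This is a generic sketch: `test_fn` stands for whatever enrichment test returns a p-value for a set of VIPs (for example, a bootstrap test) and is supplied by the caller; the function name and defaults are assumptions, with ten subsamples chosen to echo the ten random subsamples mentioned in the text.

```python
import random

def subsample_power(items, size, test_fn, n_subsamples=10, alpha=0.05, seed=0):
    """Fraction of random subsamples of a given size that remain significant.

    items:   the larger set (e.g. HIV-only VIPs).
    size:    the size of the smaller set being compared (e.g. number of HCV-only VIPs).
    test_fn: callable taking a list of items and returning a p-value (caller-supplied).
    """
    rng = random.Random(seed)
    significant = 0
    for _ in range(n_subsamples):
        subset = rng.sample(items, size)   # draw without replacement
        if test_fn(subset) < alpha:
            significant += 1
    return significant / n_subsamples
```

If a set with a known strong signal loses significance in most or all subsamples of the smaller size, a non-significant result for the smaller set cannot be read as evidence of absence.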