Post by Admin on Feb 20, 2024 20:16:55 GMT
(A and B) The y axis represents Z-scores of covA coefficients, for covA computed on candidate regions (CR) of 5 kilobases as in Figure 2. The x axes represent genome-wide (GW) covA estimated coefficients: we report beta effect sizes for continuous traits in (A) and odds ratios for categorical traits in (B). Independent models are run for different covAs. Colors label the ancestry tested, while inner and outer color intensity represent the significance of CR covA Z-scores and GW covA coefficients, respectively. Pastel colors indicate non-significant results at Benjamini-Hochberg FDR = 0.05 (two-sided Z-test p value for CR covA Z-scores or two-sided coefficient p value for GW covA coefficients). Labels indicate selected outlying ancestry-trait associations.
Selection signatures at candidate regions with ancestry-trait association
So far, we have only explored associations between a given trait and a local or genome-wide excess of a given ancestry. The observed local admixture imbalance points to a role of that ancient contribution in explaining a given phenotype. However, these results alone do not show whether, after the admixture event, the incoming genetic material also underwent a selective sweep within the recipient population, altering population-wide allele frequencies as investigated in Mathieson et al., 2015 [6]. In other words, the local admixture imbalances we detected so far are not necessarily transferred to the whole population. We therefore independently asked whether the phenotype-associated regions above also exhibit signs of recent natural selection. We applied CLUES [22] to the list of GWAS hits used as index for our candidate regions to obtain per-SNP evidence of recent (up to 500 generations ago) natural selection and to see which phenotypes show enrichment in SNPs with strong selection signals compared to a random set of GWAS hits.
Out of the genomic regions responsible for ancestry-trait association shown in Figure 2, pigmentation-related SNPs (eye and hair color) showed extremely high CLUES logLR values (Figure 4A; Figure S4), in accordance with previous results [6,9,23], as did SNPs related to BMI and cholesterol, pointing to ongoing or recent selection at these loci. Diastolic blood pressure (DBP) and sleep-related SNPs also showed the same extreme signature, but the candidate regions encompassing them did not reach significance in ancestry-trait association.
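The figure caption above reports significance from a two-sided Z-test on the candidate-region covA Z-scores. As a minimal sketch of that conversion, assuming the Z-scores are approximately standard normal under the null, the snippet below turns a Z-score into a two-sided p value. The function name and the example Z-scores are hypothetical and are not taken from the study.

```python
import math


def two_sided_p_from_z(z: float) -> float:
    """Two-sided p value for a Z-score under a standard-normal null.

    Uses the identity P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    """
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical Z-scores, for illustration only (not values from the study).
example_z = {"trait_a": 0.5, "trait_b": 1.96, "trait_c": -3.2}
p_values = {name: two_sided_p_from_z(z) for name, z in example_z.items()}

for name, p in p_values.items():
    print(f"{name}: z = {example_z[name]:+.2f}, two-sided p = {p:.4g}")
```

Because the test is two-sided, a Z-score and its negative yield the same p value, so the direction of an association has to be read from the sign of the Z-score itself.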
Post by Admin on Feb 22, 2024 16:27:24 GMT
(A) Distribution of CLUES log likelihood ratio (logLR) values for GWAS hits for six selected phenotypes. For each phenotype, at most the 100 top SNPs with the highest logLR values and the corresponding ranks from the random GWAS hits distribution are shown. Gray dots show mean values for each rank in the background distribution, while the whiskers show the 5th to 95th percentile range. The logLR values for tested SNPs are shown in red or blue depending on whether the value lies above the 95th percentile of the values from the background distribution with a given rank. The number of tested SNPs for each phenotype is shown in the panel titles. Sleep indicates SNPs connected to all sleep traits as indicated in Table S3. See also Figure S4. (B) Maximum likelihood estimates of derived allele frequency trajectories for the top three SNPs with the highest logLR values for each phenotype. When more than one SNP comes from the same locus, only the top-scoring SNP is shown.
The recent and putatively ongoing nature of the inferred selective pressure on the six traits shown in Figure 4A is further exemplified by the steep increase in derived allele frequencies over time inferred for the top three SNPs of each trait and shown in Figure 4B. These include some loci previously shown to be selected in West Eurasians (rs4988235 at MCM6/LCT [24], pigmentation-related SNPs at HERC2/OCA2, TYRP1, TYR, TPCN2 [9,23,25], and rs653178 at ATXN2 [26]) and some others yet to be explored. In particular, rs17630235, associated with BMI and DBP, is an expression quantitative trait locus (QTL) in several epithelial tissues [27] for ALDH2, an aldehyde dehydrogenase known for its role in alcohol metabolism [28]. Although this selective signal might be due to the proximity of rs17630235 to ATXN2, it is tempting to speculate about the changed role of ALDH2 in a post-neolithic society, which made several fermentable substrates available. Other selected SNPs include rs74555583 and rs11539148, both associated with sleep patterns (chronotype).
Most notably, the latter is a missense variant in the catalytic domain of QARS1, for which it also functions as a splicing QTL [27]. QARS1 itself encodes an enzyme involved in glutaminyl-tRNA synthesis and, when mutated, leads to microcephaly, cerebral-cerebellar atrophy, and seizures [29].
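The rank-wise comparison described in the caption of panel (A) can be sketched as follows. This is an illustration only: it assumes the background consists of several random draws of GWAS hits, each with its own logLR values, which the text does not spell out. All function names and the toy numbers are hypothetical.

```python
def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of an ascending list."""
    if not sorted_vals:
        raise ValueError("percentile of an empty list is undefined")
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac


def rank_envelope(background_draws, top_n=100):
    """Per-rank mean and 5th/95th percentiles across background draws.

    Each draw is sorted in descending order, so rank 0 is its highest logLR.
    """
    ranked = [sorted(draw, reverse=True)[:top_n] for draw in background_draws]
    n_ranks = min(len(r) for r in ranked)
    envelope = []
    for rank in range(n_ranks):
        vals = sorted(r[rank] for r in ranked)
        envelope.append({
            "mean": sum(vals) / len(vals),
            "p5": percentile(vals, 5),
            "p95": percentile(vals, 95),
        })
    return envelope


def flag_above_background(tested, background_draws, top_n=100):
    """Pair each ranked tested logLR with True if it exceeds the rank's p95."""
    ranked_tested = sorted(tested, reverse=True)[:top_n]
    envelope = rank_envelope(background_draws, top_n)
    return [(value, value > env["p95"])
            for value, env in zip(ranked_tested, envelope)]


# Toy data, for illustration only (not values from the study).
toy_background = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
toy_tested = [10, 1, 3.0]
flags = flag_above_background(toy_tested, toy_background)
```

Comparing rank against rank, rather than each SNP against the pooled background, takes into account that the top value of any set of SNPs is expected to be high by chance alone.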
Post by Admin on Feb 23, 2024 21:19:37 GMT
Discussion
Here we combined existing knowledge on genotype-phenotype associations and the availability of ancient genomes to assess the impact of ancient migrations on the phenotypic landscape of contemporary Europeans. We leveraged traits measured in living individuals, complementing previous works where phenotypes were instead inferred for ancient genomes. As a whole, the most affected traits include pigmentation and anthropometric traits together with blood cholesterol levels, caffeine consumption, heart rate, and age at menarche.
Importantly, while our genome-wide results highlight an overall excess of an ancestry in the carriers of a given phenotype, this is not necessarily mirrored at the genetic loci for which the genotype-phenotype association is ascertained in the literature. A genome-wide excess can completely explain a regional signal, leading to non-significant Z-scores, and can even indicate a different direction for the same ancestry. While the first scenario can be due to the extreme polygenicity of a trait, possibly coupled with an inaccurate tagging of the actual functional regions by the GWAS catalog hits, the second might indicate an incomplete correction of non-genetic factors in the genome-wide analysis. Indeed, it is possible that place of residence and educational attainment alone cannot fully account for confounding environmental effects such as socioeconomic status. Conversely, candidate region Z-scores are disentangled from background confounders and virtually free from collinearity issues when they also agree with the relevant ICs. In this light, we chose to report and discuss results showing region-specific significance for covAs and matching ICs (as reported in Table S4), thus refraining from making inferences on traits such as eye pigmentation in Yamnaya, among others.
WHG ancestry in present-day individuals is linked to lower cholesterol levels and higher BMI, and putatively contributed brown hair and light eye color to the contemporary Estonian population. This last association has been previously described based on the HERC2/OCA2 haplotypes found in ancient WHG samples [5,23]. In addition, loci associated with these features also appear to have undergone selection in Estonians. Other region-specific associations for this ancestry include decreased hip circumference and increased caffeine consumption and heart rate.
An enriched Yamnaya ancestry is linked to a strong build, with tall stature (in agreement with previous studies [6,8]) and increased hip and waist circumferences, both at genome-wide and region-specific levels, but also to black hair and high cholesterol concentrations when focusing on candidate regions. The associations of Yamnaya and WHG ancestries with higher and lower cholesterol levels, respectively, together with the observed signatures of selection at loci connected to cholesterol and BMI, add a new component to our understanding of post-neolithic dietary adaptation [7,30,31], with potential implications for disease risk and outcomes in present-day populations.
Anatolia_N enrichment in trait-related genomic regions is connected with a reduced BMI-corrected waist-to-hip ratio, reduced BMI, light (but not green) eyes and fair hair, increased age at menarche, and reduced heart rate. Notably, covA(i,Anatolia_N) has a substantial weight only in IC2, the single IC that reaches significance when predicting heart rate, suggesting a prominent role for this ancestry in determining this trait.
Finally, the Siberian ancestry is connected with dark hair pigmentation, higher heart rate, lower caffeine consumption, and, most prominently, green eye color and lower age at menarche.
Importantly, while the results connected to the Siberian ancestry are not broadly applicable to all European populations, covA(i,Siberia) and the corresponding ICs received effect sizes with mixed significance in all the previous traits except for age at menarche and pigmentation, suggesting that other ancestries might have a larger impact. In other words, we do not find other phenotypes that can be best explained by similarity with Siberia, implying that the presence of this ancestry in the Estonian genome does not significantly affect the inference based on the other, pan-European ancient components.
A general caveat about the significance levels observed in this study is that, as we refrain from reducing interdependent traits by arbitrary choices, even testing multiple alternatives of the same trait, we expose ourselves to inflated false negatives. We deemed it best to acknowledge and control this risk by avoiding overly stringent multiple testing corrections such as Bonferroni and adopting the Benjamini-Hochberg procedure to control the FDR instead. In addition, as highly significant traits tend to have higher heritability, it is likely that our analysis does not have enough statistical power for poorly heritable traits. Nevertheless, as we are able to highlight ancestry-trait associations for caffeine consumption (h² = 0.087 ± 0.009), brown hair color (h² = 0.079 ± 0.009), and even heart rate (h² = 0.044 ± 0.009), this limitation should apply only to the very few traits exhibiting lower heritabilities.
Importantly, our inferences are applicable to contemporary individuals of European ancestry, where the phenotypes were collected. Conversely, using them to extrapolate features of ancient populations, although tempting, should be done with caution due to the interaction of their genetic legacy with a radically different lifestyle and environment.
Furthermore, when seeking a biological interpretation of our results, it should be kept in mind that certain narrowly defined, contemporary phenotypes such as caffeine consumption may point to broader biological pathways.
Taken together, our results show that the ancient components that form the contemporary European landscape were differentiated enough at a functional level to contribute ancestry-specific signatures to the phenotypic variability displayed by contemporary individuals, regardless of which target population one may examine. In particular, when looking at Estonians, for 11 out of the 27 traits surveyed here we could confirm a significant relationship between the presence of a given ancestry in genetic regions associated with a given phenotype and how this phenotype is expressed by contemporary individuals. While showing that both autochthonous (WHG) and incoming groups contributed genetic material that shapes the phenotype landscape observed today, we also demonstrated that a subset of these loci further underwent positive selection in the last 500 generations. Although they do not determine whether the selected alleles (and phenotypes) were predominantly contributed by the autochthonous or incoming groups, our results, by connecting genotypic ancestry and complex traits measured in a large dataset, reveal both neutral and adaptive consequences of the post-neolithic admixture events on the European phenotype landscape.
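The multiple-testing caveat raised in the discussion, preferring Benjamini-Hochberg FDR control over Bonferroni, can be illustrated with a small sketch. The code implements the standard textbook forms of both procedures and is not the study's own pipeline. The function names and the example p values are hypothetical.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject H0 where p <= alpha / m (controls the family-wise error rate)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg_reject(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (controls the FDR at level alpha).

    Finds the largest k such that the k-th smallest p value is <= k/m * alpha
    and rejects the k hypotheses with the smallest p values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_k = 0
    for k, idx in enumerate(order, start=1):
        if p_values[idx] <= k / m * alpha:
            max_k = k
    reject = [False] * m
    for idx in order[:max_k]:
        reject[idx] = True
    return reject


# Hypothetical p values, for illustration only (not values from the study).
example_p = [0.001, 0.012, 0.02, 0.03, 0.2]
print("Bonferroni:        ", bonferroni_reject(example_p))
print("Benjamini-Hochberg:", benjamini_hochberg_reject(example_p))
```

On these toy values Bonferroni keeps only the smallest p value, whereas Benjamini-Hochberg keeps the four smallest. This is the kind of gain in sensitivity the discussion refers to when many interdependent traits are tested.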