How East Asians Adapted to Historical Coronavirus Outbreaks

Admin
Administrator

Posts: 72,776

Post by Admin on Jul 11, 2021 5:12:19 GMT

Meta-analyses of COVID-19
Overall, the COVID-19 Host Genetics Initiative combined genetic data
from 49,562 cases and two million controls across 46 distinct studies (Fig. 1).
The data included studies from populations of different genetic ancestries,
including European, Admixed American, African,
Middle Eastern, South Asian and East Asian individuals (Supplementary Table 1).
An overview of the study design is provided in Extended Data Figure 1.
We performed case-control meta-analyses in three main
categories of COVID-19 disease according to predefined and partially
overlapping phenotypic criteria. These were (1) critically ill COVID-19
cases defined as those who required respiratory support in hospital
or who were deceased due to the disease, (2) cases with moderate or
severe COVID-19 defined as those hospitalized due to symptoms associated
with the infection, and (3) all cases with reported SARS-CoV-2
infection regardless of symptoms (Methods). Controls for all three
analyses were selected as genetically ancestry-matched samples without
known SARS-CoV-2 infection, if that information was available
(Methods). The average age of COVID-19 cases across studies was 55
years (Supplementary Table 1). We report quantile-quantile plots as
Supplementary Figure 1 and ancestry principal component plots for
contributing studies in Extended Data Figure 2.
Across our three analyses, we reported a total of 13 independent
genome-wide significant loci associated with COVID-19 (P < 1.67 × 10^-8,
a threshold adjusted for multiple trait testing) (Supplementary Table 2),
most of which were shared between two or more COVID-19 phenotypes. Two of
these loci are in very close proximity within the 3p21.31
region, which was previously reported as one single locus associated
with COVID-19 severity [13–17] (Extended Data Figure 3). Overall, we find
six genome-wide significant associations for critical illness due to
COVID-19, using data for 6,179 cases and 1,483,780 controls from 16
studies (Extended Data Figure 4). Nine genome-wide significant loci
were detected for moderate to severe hospitalized COVID-19 (including
five of the six critical illness loci), from an analysis of 13,641 COVID-19
cases and 2,070,709 controls, across 29 studies (Fig. 2a top panel).
Finally, seven loci reached genome-wide significance in the analysis
using data for all available 49,562 reported cases of SARS-CoV-2 infection
and 1,770,206 controls, using data from a total of 44 studies (Fig. 2a
bottom panel). The proportion of cases with non-European genetic
ancestry for each of the three analyses was 23%, 29% and 22%, respectively.
We report the results for the lead variants at the 13 loci in different
ancestry-group meta-analyses in Supplementary Table 3. We note that
two loci, tagged by lead variants rs1886814 and rs72711165, had higher
allele frequencies in South East Asian (rs1886814, 15%) and East Asian
genetic ancestry (rs72711165, 8%) whilst the minor allele frequencies
in European populations were < 3%. This highlights the value of including
data from diverse populations for genetic discovery. We discuss
replication of previous findings and the new discoveries from these
three analyses in our Supplementary Note.
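The excerpt does not say how per-study results were pooled. As a minimal sketch, assuming a fixed-effect inverse-variance-weighted combination of per-study log odds ratios, the snippet below pools four hypothetical studies and compares the pooled P-value with the multiple-trait threshold, read here as 5 × 10^-8 divided by the three phenotypes (about 1.67 × 10^-8). The study values and the function names are illustrative assumptions, not figures from the paper.

```python
import math

# Conventional genome-wide level divided by the number of phenotypes analysed.
GENOME_WIDE_ALPHA = 5e-8
N_PHENOTYPES = 3
THRESHOLD = GENOME_WIDE_ALPHA / N_PHENOTYPES  # about 1.67e-8


def two_sided_p(z):
    """Two-sided P-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance-weighted pooling of log odds ratios."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    return beta, se, two_sided_p(beta / se)


# Hypothetical per-study log odds ratios and standard errors for one variant.
study_betas = [0.15, 0.12, 0.18, 0.14]
study_ses = [0.04, 0.05, 0.06, 0.03]

beta, se, p = ivw_meta(study_betas, study_ses)
print(f"pooled OR = {math.exp(beta):.3f}, SE = {se:.4f}, P = {p:.2e}")
print("passes multiple-trait threshold:", p < THRESHOLD)
```

Larger studies have smaller standard errors and therefore dominate the pooled estimate, which is why the pooled standard error is smaller than that of any single study.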

Variant effects on severity vs. susceptibility
We found no genome-wide significant sex-specific effects at the 13 loci.
However, we did identify significant heterogeneous effects (P <0.004)
across studies for 3 out of the 13 loci (Methods), likely reflecting
differential ascertainment of cases (Supplementary Table 2). There was
minor sample overlap (n = 8,380 EUR; n = 745 EAS) between controls
from the GenOMICC and the UK Biobank studies, but leave-one-out
sensitivity analyses did not reveal any bias in the corresponding effect
sizes or P-values (Supplementary Information, Extended Data Figure 5).
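The heterogeneity test itself is described in the Methods, which are not part of this excerpt. As a hedged illustration, the sketch below assumes Cochran's Q, a common choice for testing whether per-study effects differ more than sampling error allows. The 0.05/13 cut-off is one reading of the P < 0.004 threshold and is an assumption, as are the per-study numbers and function names.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        # Even df: finite Poisson-type sum.
        return math.exp(-y) * sum(y ** i / math.factorial(i) for i in range(df // 2))
    # Odd df: start from the df = 1 tail and apply the gamma recurrence.
    tail = math.erfc(math.sqrt(y))
    for i in range(1, (df - 1) // 2 + 1):
        tail += math.exp(-y) * y ** (i - 0.5) / math.gamma(i + 0.5)
    return tail


def cochran_q(betas, ses):
    """Cochran's Q statistic, its degrees of freedom and P-value."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    return q, df, chi2_sf(q, df)


# Assumed Bonferroni-style cut-off for 13 loci (about 0.0038).
HET_THRESHOLD = 0.05 / 13

# Hypothetical log odds ratios from four studies, equal standard errors.
consistent = cochran_q([0.10, 0.11, 0.09, 0.10], [0.03] * 4)
discordant = cochran_q([0.30, 0.05, 0.10, -0.05], [0.03] * 4)

for label, (q, df, p) in (("consistent", consistent), ("discordant", discordant)):
    print(f"{label}: Q = {q:.2f}, df = {df}, P = {p:.3g}, "
          f"heterogeneous = {p < HET_THRESHOLD}")
```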
We next wanted to better understand whether the 13 significant loci
were acting through mechanisms increasing susceptibility to infection
or by affecting the progression of symptoms towards more severe
disease. For all 13 loci, we compared the lead variant (strongest
association P-value) odds ratios (ORs) for the risk-increasing allele across
our different COVID-19 phenotype definitions.
Focusing on the two better-powered analyses (all cases with reported
infection and all cases hospitalized due to COVID-19), we find that four of the
loci have similar odds ratios between these two analyses (Methods)
(Supplementary Table 2). Such consistency suggests a stronger link to
susceptibility to SARS-CoV-2 infection rather than to the development
of severe COVID-19. The strongest susceptibility signal was the previously
reported ABO locus (rs912805253) [13,14,16,17]. Interestingly, and in
agreement with the report by Robert and colleagues [16], we also report a
locus within the 3p21.31 region that was more strongly associated with
susceptibility to SARS-CoV-2 than progression to more severe COVID-19
phenotypes. The variant rs2271616 showed a stronger association with reported
infection (P = 1.79 × 10^-34; OR [95%CI] = 1.15 [1.13, 1.18]) than hospitalization
(P = 1.05 × 10^-5; OR [95%CI] = 1.12 [1.06, 1.19]). For this locus, which
contains additional independent signals, the linkage-disequilibrium
pattern is discordant with the P-value expectation (Supplementary
Note; Extended Data Figure 6), pointing to a key missing causal variant or
to a potentially undiscovered multi-allelic or structural variant
in this locus.
In contrast, nine out of the 13 loci were associated with increased
risk of severe symptoms with significantly larger ORs for hospitalized
COVID-19 compared to the mildest phenotype of reported infection
(eight loci passed the P < 0.004 threshold in a test for effect size difference,
and the lead variant rs10774671 additionally showed a clear increase in ORs
despite not passing this threshold) (Supplementary Table 2). We further
compared the ORs for these nine loci for critical illness due to COVID-19
vs. hospitalized due to COVID-19, and found that these loci exhibited a
general increase in effect risk for critical illness (Methods) (Extended
Data Figure 7a, Supplementary Table 4), but the lower power for association
analysis of critically ill COVID-19 means that these results should
be considered as suggestive. Overall, these results indicated that these
nine loci were more likely associated with progression of the disease
and worse outcome from SARS-CoV-2 infection compared to being
associated with susceptibility to SARS-CoV-2 infection.
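To make the comparison of odds ratios across phenotypes concrete, here is a minimal sketch of one way to test for an effect size difference: a z-test on the difference of log odds ratios, with standard errors recovered from the reported 95% confidence intervals. It uses the rs2271616 figures quoted above. The test treats the two estimates as independent, which is a simplifying assumption because hospitalized cases are a subset of reported infections; the actual test is specified in the Methods and may differ.

```python
import math

Z95 = 1.959963984540054  # two-sided 95% normal quantile


def log_or_and_se(odds_ratio, ci_low, ci_high):
    """Log odds ratio and its standard error recovered from a 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z95)
    return math.log(odds_ratio), se


def effect_difference_test(first, second):
    """z-test for a difference between two log odds ratios.

    Assumes the two estimates are independent (a simplification here).
    """
    beta_a, se_a = log_or_and_se(*first)
    beta_b, se_b = log_or_and_se(*second)
    z = (beta_a - beta_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# rs2271616 as quoted in the text: (OR, lower CI bound, upper CI bound).
reported_infection = (1.15, 1.13, 1.18)
hospitalization = (1.12, 1.06, 1.19)

z, p = effect_difference_test(reported_infection, hospitalization)
print(f"z = {z:.2f}, P = {p:.2f}, differs at P < 0.004: {p < 0.004}")
```

Under these assumptions the two odds ratios are not distinguishable, which fits the classification of this locus as a susceptibility rather than a severity signal.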
For some of these analyses, the controls were simply existing population
controls without knowledge of SARS-CoV-2 infection or COVID-19
status, which may bias effect size estimates as some of these individuals
may have either become infected with SARS-CoV-2 or developed
COVID-19. We perform several sensitivity analyses (Supplementary
Note; Extended Data Figure 7b; Supplementary Table 4) showing that
using population controls can be a valid and powerful strategy for host
genetic discovery of infectious disease, and particularly those that are
widespread and with rare severe outcomes.

Last Edit: Jul 11, 2021 5:41:38 GMT by Admin

Post by Admin on Jul 11, 2021 22:21:44 GMT

Gene prioritization and PheWAS
To better understand the potential biological mechanism of each locus,
we applied several approaches to prioritize candidate causal genes and
explore additional associations with other complex diseases and traits.
Of the 13 genome-wide significant loci, we found nine loci to implicate
biologically plausible genes (Supplementary Table 2, Supplementary
Table 5). Protein-altering variants in LD with lead variants implicated
genes at six loci, including TYK2 (19p13.2) and PPP1R15A (19q13.33). The
COVID-19 lead variant rs74956615:T>A in TYK2, which confers risk for
critical illness (OR [95%CI] = 1.43 [1.29, 1.59]; P = 9.71 × 10^-12) and hospitalization
due to COVID-19 (OR [95%CI] = 1.27 [1.18, 1.36]; P = 5.05 × 10^-10),
is correlated with the missense variant rs34536443:G>C (p.Pro1104Ala;
r2 = 0.82). This is consistent with the primary immunodeficiency described
with complete TYK2 loss of function [3], as this variant is known to reduce
function [19,20]. In contrast, this missense variant was previously reported
to be protective against autoimmune diseases (Extended Data Figure 8;
Supplementary Table 6), including rheumatoid arthritis (OR = 0.74; P =
3.0 × 10^-8; UKB SAIGE), and hypothyroidism (OR = 0.84; P = 1.8 × 10^-10;
UK Biobank). At the 19q13.33 locus, the lead variant rs4801778, which
was significantly associated with reported infection (OR [95%CI] = 0.95
[0.93, 0.96]; P = 2.1 × 10^-8), is in LD (r2 = 0.93) with a missense variant
rs11541192:G>A (p.Gly312Ser) in PPP1R15A.

Lung-specific cis-eQTL from GTEx v8 [21] (n = 515) and the Lung eQTL
Consortium [22] (n = 1,103) provided further support for a subset of loci
(Supplementary Table 7), including FOXP4 (6p21.1), ABO (9q34.2),
OAS1/OAS3/OAS2 (12q24.13), and IFNAR2/IL10RB (21q22.11), where the
COVID-19 associated variants modify gene expression in lung. Furthermore,
our PheWAS analysis (Supplementary Table 6) implicated
three additional loci related to lung function, with modest lung eQTL
evidence, i.e. the lead variant was not fine-mapped but significantly
associated. An intronic variant rs2109069:G>A in DPP9 (19p13.3)
was previously reported to be risk-increasing for interstitial lung disease (tag lead variant
rs12610495:A>G [p.Leu8Pro], OR = 1.29, P = 2.0 × 10^-12) [5]. The COVID-19
lead variant rs1886814:A>C in the FOXP4 locus is correlated (r2 = 0.64)
with a lead variant of lung adenocarcinoma (tag variant = rs7741164;
OR = 1.2, P = 6.0 × 10^-13) [6,23] and similarly with a lead variant reported in
subclinical interstitial lung disease [24]. In severe COVID-19, lung cancer and
ILD, the minor, expression-increasing allele is associated with increased
risk. We also found that intronic variants (1q22) and rs1819040:T>A
in KANSL1 (17q21.31), associated protectively against hospitalization
due to COVID-19, were previously reported for reduced lung function
(e.g. tag lead variant rs141942982:G>T, OR [95%CI] = 0.96 [0.95, 0.97],
P = 1.00 × 10^-20) [7]. Notably, the 17q21.31 locus is a well-known locus for
structural variants containing a megabase inversion polymorphism
(H1 and inverted H2 forms) and complex copy-number variations,
where the inverted H2 forms were shown to be positively selected in
Europeans [25].
Lastly, there are two loci in the 3p21.31 region with varying genes
prioritized by different methods for different independent signals.
For the severity lead variant rs10490770:T>C, we prioritized CXCR6
with the Variant2Gene (V2G) algorithm [27], although LZTFL1 is the closest gene.
CXCR6 plays a role in chemokine signaling [28], and LZTFL1
has been implicated in lung cancer [29]. The variant rs2271616:G>T, associated with
susceptibility, tags a complex region including several independent
signals (Supplementary Note), all located within the gene body of SLC6A20,
which is known to functionally interact with the SARS-CoV-2 receptor
ACE2 [30]. However, none of the lead variants in the 3p21.31 region has
been previously associated with other traits or diseases in our PheWAS
analysis. While these results provide supporting in-silico evidence for
candidate causal gene prioritization, further functional characterization
is strongly needed. Detailed locus descriptions and LocusZoom plots are provided in Supplementary Figure 2.

Last Edit: Jul 12, 2021 4:41:06 GMT by Admin

Post by Admin on Jul 12, 2021 3:12:07 GMT

Polygenic architecture of COVID-19
To further investigate the genetic architecture of COVID-19, we used
results from meta-analyses including samples from European ancestries
(sample sizes described in Methods and Supplementary Table 1)
to estimate SNP heritability, i.e. proportion of variation in the two
phenotypes that was attributable to common genetic variants, and to
determine whether heritability for COVID-19 phenotypes was enriched
in genes specifically expressed in certain tissues [31] from the GTEx dataset [32].
We detected a low, but significant heritability across all three analyses
(<1% on the observed scale, all P-values < 0.0001, LDSC intercept range
1.0024-1.0137; Supplementary Table 8). The values are low compared
to previously published studies [15] but may be explained by differences
in reported estimate scale (observed vs. liability), the specific method
used, disease prevalence estimates, phenotypic differences between
patient cohorts or ascertainment of controls. Despite the low reported
values, we found that heritability for reported infection was significantly
enriched in genes specifically expressed in the lung (P = 5.0 × 10^-4)
(Supplementary Table 9). These findings, together with genome-wide
significant loci identified in the meta-analyses, suggest that there is a
significant polygenic architecture that can be better leveraged with
future, larger, sample sizes.
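The observed versus liability scale distinction mentioned above can change a heritability estimate considerably. The sketch below applies a widely used observed-to-liability transformation for case-control data, which depends on the population prevalence and the proportion of cases in the sample. All input values and the function name are hypothetical and chosen only to illustrate the direction and size of the rescaling; they are not estimates from the paper.

```python
from statistics import NormalDist


def observed_to_liability(h2_observed, population_prevalence, sample_prevalence):
    """Rescale observed-scale SNP heritability to the liability scale.

    population_prevalence: assumed fraction of the population affected (K)
    sample_prevalence: fraction of cases in the analysed sample (P)
    """
    normal = NormalDist()
    k = population_prevalence
    p = sample_prevalence
    # Liability threshold above which an individual is a case.
    threshold = normal.inv_cdf(1.0 - k)
    # Height of the standard normal density at that threshold.
    density = normal.pdf(threshold)
    return h2_observed * (k * (1.0 - k)) ** 2 / (density ** 2 * p * (1.0 - p))


# Hypothetical inputs: 0.5% observed-scale heritability, 1% population
# prevalence, 2% cases in the sample.
h2_liability = observed_to_liability(0.005, 0.01, 0.02)
print(f"liability-scale h2 = {h2_liability:.4f}")
```

Because the result is sensitive to the assumed prevalence, two studies of the same trait can report quite different values without disagreeing about the underlying signal.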

Genetic correlation and Mendelian randomization
Genetic correlations (rg) between the three COVID-19 phenotypes
were high, though lower correlations were observed between hospitalized
COVID-19 and reported infection (critical illness vs. hospitalized:
rg [95%CI] = 1.37 [1.08, 1.65], P = 2.9 × 10^-21; critical illness vs. reported
infection: rg [95%CI] = 0.96 [0.71, 1.20], P = 1.1 × 10^-14; hospitalized vs.
reported infection: rg [95%CI] = 0.85 [0.68, 1.02], P = 1.1 × 10^-22). To better
understand which traits are genetically correlated and/or potentially
causally associated with COVID-19 hospitalization, critical illness and
SARS-CoV-2 reported infection, we chose a set of 38 disease, health
and neuropsychiatric phenotypes as potential COVID-19 risk factors
based on their clinical correlation with disease susceptibility, severity,
or mortality (Supplementary Table 10).
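As a quick consistency check on the genetic correlations quoted above, a standard error can be recovered from each 95% confidence interval and turned into a Wald P-value. This is a sketch assuming a symmetric normal interval; the helper name is illustrative, and because the published bounds are rounded the recomputed P-values should only be expected to agree with the reported ones in order of magnitude.

```python
import math

Z95 = 1.959963984540054  # two-sided 95% normal quantile


def wald_from_ci(estimate, ci_low, ci_high):
    """Standard error, z statistic and two-sided P-value from a 95% CI."""
    se = (ci_high - ci_low) / (2 * Z95)
    z = estimate / se
    return se, z, math.erfc(abs(z) / math.sqrt(2))


# (rg, lower bound, upper bound, reported P) as quoted in the text.
pairs = {
    "critical illness vs. hospitalized": (1.37, 1.08, 1.65, 2.9e-21),
    "critical illness vs. reported infection": (0.96, 0.71, 1.20, 1.1e-14),
    "hospitalized vs. reported infection": (0.85, 0.68, 1.02, 1.1e-22),
}

recomputed = {}
for label, (rg, low, high, reported_p) in pairs.items():
    se, z, p = wald_from_ci(rg, low, high)
    recomputed[label] = p
    print(f"{label}: SE = {se:.3f}, z = {z:.2f}, "
          f"P = {p:.1e} (reported {reported_p:.1e})")
```

Point estimates above 1, as for critical illness vs. hospitalized, can arise because the estimator is not constrained to the range of a correlation.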
We found evidence (FDR<0.05) of significant genetic correlations
between 9 traits and hospitalized COVID-19 and SARS-CoV-2 reported
infection (Fig. 3; Extended Data Figure 9; Supplementary Table 11).
Notably, genetic liability to ischemic stroke was
significantly positively correlated only with critical illness or
hospitalization due to COVID-19, but not with a higher likelihood of reported
SARS-CoV-2 infection (infection rg = 0.019 vs. hospitalization rg = 0.41,
z = 2.7, P = 0.006; infection rg = 0.019 vs. critical illness rg = 0.40, z =
2.49, P = 0.013).
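The FDR < 0.05 criterion used for the genetic correlations is not tied to a named procedure in this excerpt. As a minimal sketch, assuming the Benjamini-Hochberg step-up procedure, the function below marks which of a set of P-values would be declared significant. The P-values are made up for illustration and do not come from Supplementary Table 11.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans: True where the test is declared significant."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank whose P-value is below its step-up threshold.
    max_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= fdr * rank / m:
            max_rank = rank
    # Everything ranked at or below that position is rejected.
    rejected = [False] * m
    for rank, index in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[index] = True
    return rejected


# Hypothetical P-values for eight trait pairs.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
significant = benjamini_hochberg(example_p)
print(sum(significant), "of", len(example_p), "pass FDR < 0.05")
```

The step-up rule means a test can be rejected even if its own P-value exceeds its rank threshold, as long as a higher-ranked test passes.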
We next used two-sample Mendelian randomization (MR) to infer
potentially causal relationships between these traits. After correcting
for multiple testing (FDR < 0.05), 8 exposure-COVID-19 trait pairs
showed suggestive evidence of a causal association (Fig. 3; Supplementary
Table 12; Extended Data Figure 10; Supplementary Figure 3).
Five of these associations were robust to potential violations of the
underlying assumptions of MR. Corroborating our genetic correlation
results and evidence from epidemiological studies, genetically predicted higher
BMI (OR [95%CI] = 1.4 [1.3, 1.6], P = 8.5 × 10^-11) and smoking (OR [95%CI] =
1.9 [1.3, 2.8], P = 0.0012) were associated with increased
risk of COVID-19 hospitalization, with BMI also being associated with
increased risk of SARS-CoV-2 infection (OR [95%CI] = 1.1 [1.1, 1.2], P =
4.8 × 10^-7). Genetically predicted increased height (OR [95%CI] = 1.1
[1, 1.1], P = 8.9 × 10^-4) was associated with an increased risk of reported
infection, and genetically predicted higher red blood cell count (OR
[95%CI] = 0.93 [0.89, 0.96], P = 5.7 × 10^-5) with a reduced risk of reported
infection. Despite the evidence of genetic correlation between type II
diabetes and COVID-19 outcomes, there was no evidence of a causal
association in the MR analyses, suggesting that the observed genetic
correlations are due to pleiotropic effects between BMI and type II
diabetes. Further sensitivity analyses relating to sample overlap are
discussed in Supplementary Information.
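The excerpt does not state which Mendelian randomization estimator produced these odds ratios. As a hedged illustration of the two-sample idea, the sketch below assumes the fixed-effect inverse-variance-weighted (IVW) estimator, which regresses SNP-outcome effects on SNP-exposure effects through the origin. The per-SNP effects are invented for the example, and the function name is an assumption.

```python
import math


def ivw_mr(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate of the exposure-outcome effect (log OR scale)."""
    numerator = 0.0
    denominator = 0.0
    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome):
        weight = 1.0 / se ** 2
        numerator += weight * bx * by
        denominator += weight * bx * bx
    estimate = numerator / denominator
    standard_error = math.sqrt(1.0 / denominator)
    z = estimate / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return estimate, standard_error, p_value


# Hypothetical instruments: SNP effects on the exposure (e.g. BMI, in SD units)
# and on the outcome (log odds of hospitalization) with standard errors.
snp_exposure = [0.08, 0.05, 0.11, 0.07, 0.09]
snp_outcome = [0.030, 0.012, 0.041, 0.020, 0.035]
snp_outcome_se = [0.010, 0.012, 0.011, 0.009, 0.010]

estimate, standard_error, p_value = ivw_mr(snp_exposure, snp_outcome, snp_outcome_se)
odds_ratio = math.exp(estimate)
ci = (math.exp(estimate - 1.96 * standard_error),
      math.exp(estimate + 1.96 * standard_error))
print(f"OR per unit exposure = {odds_ratio:.2f} "
      f"[{ci[0]:.2f}, {ci[1]:.2f}], P = {p_value:.2e}")
```

IVW is only unbiased if every instrument affects the outcome solely through the exposure, which is why the text distinguishes associations that remain robust to violations of the MR assumptions.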

Post by Admin on Jul 12, 2021 22:45:17 GMT

Discussion
The COVID-19 Host Genetics Initiative has brought together investigators
from across the world to advance genetic discovery for SARS-CoV-2
infection and severe COVID-19 disease. We report 13 genome-wide
significant loci associated with some aspect of SARS-CoV-2 infection
or COVID-19. Many of these loci overlap with previously reported
associations with lung-related phenotypes or autoimmune/inflammatory
diseases, but some loci have no obvious candidate gene.
Four out of the 13 genome-wide significant loci showed similar effects
in the reported infection analysis (a proxy for disease susceptibility)
and all-hospitalized COVID-19 (a proxy for disease severity). Of these,
one locus was in close proximity to, yet independent of, the major
genetic signal for COVID-19 severity at 3p21.31. Surprisingly, this locus
was associated with COVID-19 susceptibility rather than severity. The
locus overlaps SLC6A20, which encodes an amino acid transporter
that interacts with ACE2. Nonetheless, we caution that more data are
needed to resolve the nature of the relationship between genetic
variation and COVID-19 at this locus, particularly as the physical proximity,
linkage disequilibrium structure and patterns of association suggest
that untagged genetic variation might drive the association signal in
the region. Our findings support the notion that some genetic variants,
most notably at ABO and PPP1R15A loci, in addition to the aforementioned
SLC6A20, might indeed impact susceptibility to infection rather
than progression to severe COVID-19 once infected.
Several of the loci reported here, as noted in previous publications [13,15],
intersect with well-known genetic variants that have established genetic
associations. Examples of these include variants at DPP9 and FOXP4
which show prior evidence of increasing risk for interstitial lung
disease [5], and missense variants within TYK2 that show a protective
effect on several autoimmune-related diseases [33–36]. Together with the
heritability enrichment observed in genes expressed in lung tissues,
these results highlight the involvement of lung-related biological
pathways in developing severe COVID-19. Several other loci show no prior
documented genome-wide significant associations, despite the
high significance and attractive candidate genes for COVID-19 (e.g.,
CXCR6, LZTFL1, IFNAR2 and OAS1/2/3 loci). The previously reported
association between the strongest COVID-19 severity signal at
3p21.31 and monocyte count is likely to be due to proximity and not
a true co-localization.
Increasing the global representation in genetic studies enhances
the ability to detect novel associations. Two of the loci affecting
disease severity were only discovered by including the four studies
of individuals with East Asian ancestry. One of these loci, close to
FOXP4, is common particularly in East Asian (32%) as well as Admixed
American in the Americas (20%) and Middle Eastern samples (7%), but
has a low frequency in most European ancestries (2-3%) in our data.
Although we cannot be certain of the mechanism of action, the FOXP4
association is an attractive biological target, as it is expressed in the
proximal and distal airway epithelium [37] and has been shown to play a
role in controlling epithelial cell fate during lung development [38]. The
COVID-19 Host Genetics Initiative continues to pursue expansion of the
datasets included in the consortium’s analyses to
underrepresented populations in upcoming data releases. We plan to
release ancestry-specific results in full once the sample sizes allow for
a well-powered meta-analysis.
Care should be taken when interpreting the results from a
meta-analysis because of challenges with case and control ascertainment
and collider bias (see Supplementary Note for a more detailed
discussion on study limitations). Drawing a comprehensive and reproducible
map of the host genetic factors associated with COVID-19
severity and SARS-CoV-2 infection requires a sustained international effort to
include diverse ancestries and study designs. To accelerate downstream
research and therapeutic discovery, the COVID-19 Host Genetics Initiative
regularly publishes meta-analysis results from periodic data
freezes on the website www.covid19hg.org and provides an interactive
explorer where researchers can browse the results and the genomic
loci in more detail. Future work will be required to better understand
the biological and clinical value of these findings. Continued efforts to
collect more samples and detailed phenotypic data should be endorsed
globally, allowing for more thorough investigation of variable, heritable
symptoms [39,40], particularly in the light of newly emerging strains of
SARS-CoV-2 virus, which may provoke different host responses leading to disease.

Last Edit: Jul 12, 2021 22:45:45 GMT by Admin

Post by Admin on Nov 10, 2021 21:38:15 GMT

Paleolithic to Bronze Age Siberians Reveal Connections with First Americans and across Eurasia

Summary
Modern humans have inhabited the Lake Baikal region since the Upper Paleolithic, though the precise history of its peoples over this long time span is still largely unknown. Here, we report genome-wide data from 19 Upper Paleolithic to Early Bronze Age individuals from this Siberian region. An Upper Paleolithic genome shows a direct link with the First Americans by sharing the admixed ancestry that gave rise to all non-Arctic Native Americans. We also demonstrate the formation of Early Neolithic and Bronze Age Baikal populations as the result of prolonged admixture throughout the eighth to sixth millennium BP. Moreover, we detect genetic interactions with western Eurasian steppe populations and reconstruct Yersinia pestis genomes from two Early Bronze Age individuals without western Eurasian ancestry. Overall, our study demonstrates the most deeply divergent connection between Upper Paleolithic Siberians and the First Americans and reveals human and pathogen mobility across Eurasia during the Bronze Age.

Introduction
The Lake Baikal region in Siberia has been inhabited by modern humans since the Upper Paleolithic and has a rich archaeological record (Katzenberg and Weber, 1999; Weber, 1995). In the past 5 years, ancient genomic studies have revealed multiple genetic turnovers and admixture events in this region. The 24,000-year-old individual (MA1) from the Mal’ta site represents an ancestry referred to as “Ancient North Eurasian (ANE),” which was widespread across Siberia during the Paleolithic (Fu et al., 2016; Raghavan et al., 2014a; Sikora et al., 2019) and contributed to the genetic profile of a vast number of present-day Eurasian populations as well as Native Americans (Haak et al., 2015; Lazaridis et al., 2014; Lazaridis et al., 2016; Raghavan et al., 2015). ANE ancestry was suggested to have been largely replaced in the Lake Baikal region during the Early Neolithic by a gene pool related to present-day northeast Asians, with a limited resurgence of ANE ancestry by the Early Bronze Age (Damgaard et al., 2018a).

Siberia has also been proposed as a source for multiple waves of dispersals into the Americas, the first of which was shown to be driven by a founding population estimated to have formed around 25,000–20,000 years before the present (BP) (Raghavan et al., 2015). The so-called Ancient Beringian ancestry represented by an 11,500-year-old Alaskan individual (USR1) was shown to be part of this founding population, estimated to have split from other Native Americans around 23,000 BP (Moreno-Mayar et al., 2018). In addition, the recently published 9,800-year-old Kolyma genome from northeastern Siberia was suggested to represent the closest relative to Native American populations outside of the Americas (Sikora et al., 2019). Moreover, the Paleo-Eskimo ancestry represented by a 4,000-year-old Saqqaq individual from Greenland was also estimated to have split from northeastern Siberian groups and migrated to Arctic America around 6,000–5,000 BP (Flegontov et al., 2019; Raghavan et al., 2014b; Rasmussen et al., 2010). Although these waves of migration are generally linked to ancient Siberian populations, their origins in the context of the Siberian genetic history remain poorly understood. Further studies of the Siberian population history using ancient genomes are, therefore, critical for a better understanding of the formation of Native American populations.

Furthermore, the Neolithic to Bronze Age transition in Eurasia was marked by complex cultural and genetic changes facilitated by extensive population movements, though their impact in the Lake Baikal region is still unclear. Looking to the west, the Early Bronze Age groups from the Pontic-Caspian steppe associated with the Yamnaya complex spread both east and west along with their distinct genetic profile often referred to as “Steppe ancestry” (Allentoft et al., 2015; Haak et al., 2015). The eastward expansion of this group is considered to be associated with the Early Bronze Age Afanasievo culture. However, the later Middle Bronze Age Okunevo-related population from the central steppe as well as the Late Bronze Age Khövsgöl-related population from the eastern steppe harbor only a limited proportion of Steppe ancestry (Jeong et al., 2018; Jeong et al., 2019). Therefore, the effect of steppe migrations in eastern Eurasia, particularly the interactions of Bronze Age Baikal hunter-gatherers with the contemporaneous and geographically proximal Afanasievo population, is still largely unexplored.

In this study, we report 19 newly sequenced ancient hunter-gatherers from Lake Baikal and its surrounding regions, spanning from the Upper Paleolithic to the Early Bronze Age. Their analyses alongside published data reveal the most deeply divergent ancestry that links Upper Paleolithic Siberians and the First Peoples of the Americas, and more clearly delineate the complex transition between Early Neolithic and Early Bronze Age populations in the Lake Baikal region. We also provide both human and pathogen genomic evidence demonstrating the influence of western Eurasian steppe populations in this region during the Early Bronze Age and discuss the genetic contribution of Lake Baikal hunter-gatherers to Siberian populations through time.