Post by Admin on Jul 11, 2021 5:12:19 GMT
Meta-analyses of COVID-19
Overall, the COVID-19 Host Genetics Initiative combined genetic data
from 49,562 cases and two million controls across 46 distinct studies (Fig. 1).
The data included studies from populations of different genetic ancestries,
including European, Admixed American, African,
Middle Eastern, South Asian and East Asian individuals (Supplementary Table 1).
An overview of the study design is provided in Extended Data Figure 1.
We performed case-control meta-analyses in three main
categories of COVID-19 disease according to predefined and partially
overlapping phenotypic criteria. These were (1) critically ill COVID-19
cases defined as those who required respiratory support in hospital
or who were deceased due to the disease, (2) cases with moderate or
severe COVID-19 defined as those hospitalized due to symptoms associated
with the infection, and (3) all cases with reported SARS-CoV-2
infection regardless of symptoms (Methods). Controls for all three
analyses were selected as genetically ancestry-matched samples without
known SARS-CoV-2 infection, if that information was available
(Methods). The average age of COVID-19 cases across studies was 55
years (Supplementary Table 1). We report quantile-quantile plots as
Supplementary Figure 1 and ancestry principal component plots for
contributing studies in Extended Data Figure 2.
Across our three analyses, we report a total of 13 independent
genome-wide significant loci associated with COVID-19 (P < 1.67 × 10⁻⁸,
a threshold adjusted for multiple trait testing) (Supplementary Table 2),
most of which were shared between two or more COVID-19 phenotypes. Two of
these loci are in very close proximity within the 3p21.31
region, which was previously reported as one single locus associated
with COVID-19 severity [13–17] (Extended Data Figure 3). Overall, we find
six genome-wide significant associations for critical illness due to
COVID-19, using data for 6,179 cases and 1,483,780 controls from 16
studies (Extended Data Figure 4). Nine genome-wide significant loci
were detected for moderate to severe hospitalized COVID-19 (including
five of the six critical illness loci), from an analysis of 13,641 COVID-19
cases and 2,070,709 controls, across 29 studies (Fig. 2a top panel).
Finally, seven loci reached genome-wide significance in the analysis
using data for all available 49,562 reported cases of SARS-CoV-2 infection
and 1,770,206 controls, using data from a total of 44 studies (Fig. 2a
bottom panel). The proportion of cases with non-European genetic
ancestry for each of the three analyses was 23%, 29% and 22%, respectively.
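The adjusted significance threshold quoted above can be reproduced with a short calculation. This is a minimal sketch that assumes the starting point is the conventional single-trait genome-wide level of 5 × 10⁻⁸ (the text gives only the adjusted value) and that the adjustment is a simple division by the three phenotype analyses; the names below are illustrative only.

```python
# Conventional single-trait genome-wide significance level.
# Assumption: the text states only the adjusted threshold, not this value.
GENOME_WIDE_ALPHA = 5e-8

# The three meta-analyses: critical illness, hospitalization, reported infection.
N_PHENOTYPES = 3


def adjusted_threshold(alpha: float, n_tests: int) -> float:
    """Bonferroni-style adjustment: divide alpha by the number of tests."""
    return alpha / n_tests


threshold = adjusted_threshold(GENOME_WIDE_ALPHA, N_PHENOTYPES)
print(f"{threshold:.2e}")  # 1.67e-08
```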
We report the results for the lead variants at the 13 loci in different
ancestry-group meta-analyses in Supplementary Table 3. We note that
two loci, tagged by lead variants rs1886814 and rs72711165, had higher
allele frequencies in South East Asian (rs1886814, 15%) and East Asian
genetic ancestry (rs72711165, 8%) whilst the minor allele frequencies
in European populations were < 3%. This highlights the value of including
data from diverse populations for genetic discovery. We discuss
replication of previous findings and the new discoveries from these
three analyses in our Supplementary Note.
Variant effects on severity vs. susceptibility
We found no genome-wide significant sex-specific effects at the 13 loci.
However, we did identify significant heterogeneous effects (P < 0.004)
across studies for 3 out of the 13 loci (Methods), likely reflecting
differential ascertainment of cases (Supplementary Table 2). There was
minor sample overlap (n = 8,380 EUR; n = 745 EAS) between controls
from the GenOMICC and the UK Biobank studies, but leave-one-out
sensitivity analyses did not reveal any bias in the corresponding effect
sizes or P-values (Supplementary Information, Extended Data Figure 5).
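To make the heterogeneity check concrete, here is a minimal sketch of a test for between-study heterogeneity at a single locus. The text does not name the test used; Cochran's Q on inverse-variance weights is a common choice and is assumed here. The per-study effect sizes are hypothetical, and the 0.05/13 threshold is an assumption that happens to round to the quoted 0.004.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square distribution with integer df >= 1."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    # Closed forms for df = 1 and df = 2, then the standard recurrence
    # Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    if df % 2 == 1:
        a, sf = 0.5, math.erfc(math.sqrt(z))
    else:
        a, sf = 1.0, math.exp(-z)
    while a < df / 2.0:
        sf += math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
        a += 1.0
    return sf


def cochran_q(betas, ses):
    """Return (pooled log OR, Q statistic, P-value) for per-study estimates."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    return pooled, q, chi2_sf(q, len(betas) - 1)


# Assumed threshold: 0.05 divided by the 13 loci (about 0.0038).
HET_ALPHA = 0.05 / 13

# Hypothetical log odds ratios and standard errors from four studies;
# the third study is deliberately discordant.
betas = [0.10, 0.12, 0.45, 0.08]
ses = [0.05, 0.06, 0.07, 0.04]

pooled, q, p = cochran_q(betas, ses)
print(f"pooled log OR = {pooled:.3f}, Q = {q:.2f}, P = {p:.2e}")
print("heterogeneous" if p < HET_ALPHA else "no evidence of heterogeneity")
```

A locus flagged this way is not necessarily a false positive; as the text notes, differences in how cases were ascertained across studies are a likely explanation.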
We next wanted to better understand whether the 13 significant loci
were acting through mechanisms increasing susceptibility to infection
or by affecting the progression of symptoms towards more severe
disease. For all 13 loci, we compared the lead variant (strongest
association P-value) odds ratios (ORs) for the risk-increasing allele across
our different COVID-19 phenotype definitions.
Focusing on the two better-powered analyses (all cases with reported
infection and all cases hospitalized due to COVID-19), we find that four of the
loci have similar odds ratios between these two analyses (Methods)
(Supplementary Table 2). Such consistency suggests a stronger link to
susceptibility to SARS-CoV-2 infection rather than to the development
of severe COVID-19. The strongest susceptibility signal was the previously
reported ABO locus (rs912805253) [13,14,16,17]. Interestingly, and in
agreement with the report by Roberts and colleagues [16], we also report a
locus within the 3p21.31 region that was more strongly associated with
susceptibility to SARS-CoV-2 than progression to more severe COVID-19
phenotypes. The variant rs2271616 showed a stronger association with reported
infection (P = 1.79 × 10⁻³⁴; OR [95% CI] = 1.15 [1.13–1.18]) than with hospitalization
(P = 1.05 × 10⁻⁵; OR [95% CI] = 1.12 [1.06–1.19]). For this locus, which
contains additional independent signals, the linkage-disequilibrium
pattern is discordant with the P-value expectation (Supplementary
Note; Extended Data Figure 6), pointing to a key missing causal variant or
to a potentially undiscovered multi-allelic or structural variant
in this locus.
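The comparison of odds ratios between phenotypes can be illustrated with the rs2271616 figures quoted above. The following is a rough sketch, not the procedure from the Methods: it recovers each standard error from the reported 95% confidence interval and applies a simple z-test to the difference in log odds ratios. That test treats the two estimates as independent, which they are not (hospitalized cases are also counted among reported infections), so the result is only indicative.

```python
import math

# Two-sided 95% normal quantile.
Z95 = 1.959963984540054


def log_or_and_se(odds_ratio: float, ci_low: float, ci_high: float):
    """Recover the log odds ratio and its standard error from a 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * Z95)
    return math.log(odds_ratio), se


def difference_test(est_a, est_b):
    """z-test for a difference between two log ORs (independence assumed)."""
    (beta_a, se_a), (beta_b, se_b) = est_a, est_b
    z = (beta_a - beta_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal P-value
    return z, p


# Values for rs2271616 as quoted in the text.
infection = log_or_and_se(1.15, 1.13, 1.18)
hospitalization = log_or_and_se(1.12, 1.06, 1.19)

z, p = difference_test(infection, hospitalization)
print(f"z = {z:.2f}, P = {p:.2f}")
```

Under these assumptions the difference is well within sampling error, which fits the reading that this variant has similar odds ratios for reported infection and hospitalization and so tracks susceptibility rather than severity.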
In contrast, nine out of the 13 loci were associated with increased
risk of severe symptoms with significantly larger ORs for hospitalized
COVID-19 compared to the mildest phenotype of reported infection
(eight loci fell below the P < 0.004 threshold in a test for effect size difference,
and the lead variant rs10774671 additionally showed a clear increase in OR
despite not passing this threshold) (Supplementary Table 2). We further
compared the ORs for these nine loci for critical illness due to COVID-19
vs. hospitalized due to COVID-19, and found that these loci exhibited a
general increase in effect size for critical illness (Methods) (Extended
Data Figure 7a, Supplementary Table 4), but the lower power for association
analysis of critically ill COVID-19 means that these results should
be considered as suggestive. Overall, these results indicate that these
nine loci are more likely associated with progression of the disease
and worse outcome from SARS-CoV-2 infection than with
susceptibility to SARS-CoV-2 infection.
For some of these analyses, the controls were simply existing population
controls without knowledge of SARS-CoV-2 infection or COVID-19
status, which may bias effect size estimates as some of these individuals
may have either become infected with SARS-CoV-2 or developed
COVID-19. We performed several sensitivity analyses (Supplementary
Note; Extended Data Figure 7b; Supplementary Table 4) showing that
using population controls can be a valid and powerful strategy for host
genetic discovery of infectious diseases, particularly those that are
widespread and have rare severe outcomes.
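The reasoning about population controls can be illustrated with a small deterministic calculation. This is an illustrative model, not one of the sensitivity analyses referred to above: it assumes a simple allelic odds ratio, a risk-allele frequency among true controls, and a fraction of the control group who are in fact unrecognized cases. All parameter values are hypothetical.

```python
def odds(p: float) -> float:
    """Convert a probability (here an allele frequency) to odds."""
    return p / (1.0 - p)


def case_frequency(control_freq: float, true_or: float) -> float:
    """Risk-allele frequency in cases implied by the true allelic OR."""
    case_odds = true_or * odds(control_freq)
    return case_odds / (1.0 + case_odds)


def observed_or(control_freq: float, true_or: float, contamination: float) -> float:
    """Allelic OR seen when a fraction of 'controls' are unrecognized cases."""
    p_case = case_frequency(control_freq, true_or)
    # The control group becomes a mixture of true controls and hidden cases.
    p_mixed = (1.0 - contamination) * control_freq + contamination * p_case
    return odds(p_case) / odds(p_mixed)


# Hypothetical variant: 20% risk-allele frequency, true OR of 1.5.
for fraction in (0.0, 0.01, 0.05, 0.20):
    print(f"{fraction:.0%} hidden cases -> observed OR {observed_or(0.2, 1.5, fraction):.3f}")
```

In this model the observed odds ratio is pulled towards 1 as the hidden-case fraction grows, but the attenuation is small when that fraction is low, which is the situation expected for rare severe outcomes.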