Post by Admin on Oct 7, 2020 20:07:05 GMT
Figure 3
Changes in energy of the S-protein:ACE2 complex for animals that can be infected by SARS-CoV-2. Boxplots of ΔΔG values calculated by protocol 2 mCSM-PPI2 are shown. Infected: 32 animals that have in vivo, in vitro or real-world evidence of infection. Not infected: 9 animals that have been experimentally tested but show no infection. All: 212 animals that were included in this study. The one-sided P value is reported from a Mann–Whitney test of the hypothesis that ΔΔG values from infected animals are lower than those from not infected animals, against the null hypothesis that there is no difference between the two distributions.
In general we see a high infection risk for most mammals, with the notable exception of non-placental mammals. ΔΔG values calculated by P(2)-PPI2 correlate well with the infection phenotypes (Table 1). ΔΔG values are significantly lower for animals that can be infected by SARS-CoV-2 than for animals for which there is no evidence of infection (Fig. 3; Mann–Whitney one-sided P = 4.1 × 10⁻⁵). Two animals are outliers in the infected boxplot: horseshoe bat (ΔΔG = 3.723) and marmoset (ΔΔG = 3.438). To be cautious, since in vivo experiments have shown that marmosets can be infected, and in vitro experiments have shown that horseshoe bats can be infected2,20,21,22 (Table 1), we consider animals with ΔΔG values less than or equal to the horseshoe bat value (ΔΔG ≈ 3.7) to be at risk. Additionally, there is a clear sampling bias in the set of animals that have so far been experimentally characterised: all but chicken and duck are mammals. As more non-mammals are tested, the median ΔΔG value for non-infection is likely to increase. In further support of these predictions we analysed the 41 animals having experimental evidence using an orthogonal method, HADDOCK47, and found ~95% agreement between the two independent approaches for animals predicted to be at risk (see Supplementary Results 3 and Supplementary Fig. 6).
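As a rough illustration of the test behind Fig. 3, the sketch below computes a one-sided Mann–Whitney U statistic in plain Python, using the normal approximation with tie and continuity corrections. The function name and the two ΔΔG lists are hypothetical placeholders, not values from Table 1, and the study may well have used an exact test or a statistics package, so this is not expected to reproduce the reported P value.

```python
import math


def mann_whitney_less(x, y):
    """One-sided Mann-Whitney U test.

    Alternative hypothesis: values in x tend to be lower than values in y.
    Returns (U statistic for x, approximate one-sided p-value).
    """
    n1, n2 = len(x), len(y)
    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y], key=lambda t: t[0])

    ranks = [0.0] * len(pooled)
    tie_term = 0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of the ranks i+1 .. j (1-based).
        midrank = (i + 1 + j) / 2
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t ** 3 - t
        i = j

    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2

    n = n1 + n2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    # Continuity-corrected z score; a small U for x gives a small p-value.
    z = (u_x - mean_u + 0.5) / math.sqrt(var_u)
    p = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return u_x, p


# Hypothetical ΔΔG values, for illustration only.
infected = [0.4, 0.9, 1.3, 1.8, 2.2, 3.4]
not_infected = [3.9, 4.4, 5.1, 6.0]
u, p = mann_whitney_less(infected, not_infected)
```

If SciPy is available, `scipy.stats.mannwhitneyu(infected, not_infected, alternative="less")` performs the same comparison and can use an exact method for small samples.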
As shown in previous studies, and supported by experimental data, many primates are predicted to be at high risk19,41,42. In agricultural settings, camels, cows, sheep, goats and horses also have relatively low ΔΔG values, suggesting binding affinities comparable to those of humans, in agreement with experimental data20,21,48. In domestic settings, dogs18, cats18, hamsters20 and rabbits20,21,22,49 also have ΔΔG values suggesting risk, again in agreement with experimental data (Table 1). Zoological animals that come into contact with humans, such as pandas, leopards and bears, are also at risk of infection, as shown experimentally20 and suggested by our calculated ΔΔG values. Importantly, mice and rats do not appear to be susceptible (ΔΔG values > 3.7), so hamsters and ferrets are being used as model organisms for human COVID-19. Of the 35 birds tested, only a handful, including the blue tit, show an infection risk. Similarly, of the 72 fish in this study, relatively few show a low change in energy of the complex, suggesting that most are not susceptible to infection. Those predicted to be susceptible include the common carp, turbot and Nile tilapia. Of the 14 reptiles and amphibians investigated, only turtle and crocodile show any risk.
In predicting susceptibility, we have chosen thresholds supported by in vivo or in vitro experimental data. Previous work contrasted the binding energy of the S-protein of SARS-CoV and SARS-CoV-2 with human ACE2 protein24,25,32. SARS-CoV is able to infect humans despite a ~20-fold lower binding affinity24,25,32, suggesting that even where mutations in different animal species make the interfaces less compatible for SARS-CoV-2, a considerably decreased binding affinity may still be sufficient to enable infection. By applying this threshold we correctly predict all 32 animals in our dataset that have experimental evidence of infection to be at risk (Table 1).
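The threshold rule can be sketched as a simple cutoff on ΔΔG. In this minimal example the names `RISK_THRESHOLD`, `at_risk` and `sensitivity` are hypothetical, the cutoff is assumed to be the unrounded horseshoe bat value (3.723) rather than the rounded 3.7, and the "example rodent" entry is an invented placeholder; the other ΔΔG values are those quoted in the text.

```python
# Assumption: the cutoff is the unrounded horseshoe bat ΔΔG quoted in the text.
RISK_THRESHOLD = 3.723


def at_risk(ddg, threshold=RISK_THRESHOLD):
    """Return True when ΔΔG is at or below the cutoff (predicted at risk)."""
    return ddg <= threshold


def sensitivity(ddg_by_animal, infected, threshold=RISK_THRESHOLD):
    """Fraction of experimentally infected animals that are predicted at risk."""
    hits = sum(at_risk(ddg_by_animal[name], threshold) for name in infected)
    return hits / len(infected)


ddg_by_animal = {
    "horseshoe bat": 3.723,  # quoted in the text
    "marmoset": 3.438,       # quoted in the text
    "donkey": 1.3,           # quoted in the text
    "horse": 1.3,            # the text gives horse the same ΔΔG as donkey
    "example rodent": 4.5,   # hypothetical value above the cutoff
}
infected = ["horseshoe bat", "marmoset", "horse"]

predictions = {name: at_risk(ddg) for name, ddg in ddg_by_animal.items()}
recall = sensitivity(ddg_by_animal, infected)
```

Using the unrounded value matters here: with a cutoff of exactly 3.7, the horseshoe bat itself (3.723) would fall outside the at-risk set.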
However, for a few animals that we predict to be at risk using this threshold, in vitro experimental studies to date have not shown infection. For example, donkeys are predicted to be at risk of infection (ΔΔG = 1.3), but no infections were observed in vitro for these animals21. However, infection has been observed in vitro for horse21, and horse and donkey have identical DCEX residues and the same ΔΔG. Amongst New World monkeys, marmosets have been experimentally infected19. We predict that the closely related capuchin and squirrel monkey are also at risk, although they have not been shown to be infected using functional assays20. We performed detailed structural analyses to characterise the key residues contributing to binding energy changes and to consider these discrepancies further. Our analyses reveal that the interfaces in both capuchin and squirrel monkey are similar to that of marmoset, suggesting that these two New World monkeys are also likely to be at risk even though there are no current experimental data supporting this20 (Supplementary Results 4). Furthermore, all these monkeys have high global sequence similarity to human. For capuchin and squirrel monkey this is above ~90%, and their DCEX residues are identical to those of human, further supporting risk. In marmoset, which has experimental evidence of infection, the sequence identity is 89% globally and 93% over the DCEX residues.
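To make the distinction between global identity and identity over a residue subset such as DCEX concrete, here is a minimal sketch. It assumes the two sequences are already aligned to equal length, skips any column containing a gap, and uses invented toy sequences and positions; `percent_identity` is a hypothetical helper, not part of the study's pipeline.

```python
def percent_identity(ref, other, positions=None):
    """Percent identity between two pre-aligned sequences.

    positions: optional iterable of 0-based alignment columns to restrict
    the comparison to (for example, an interface residue subset).
    Columns where either sequence has a gap ('-') are ignored.
    """
    if len(ref) != len(other):
        raise ValueError("sequences must be aligned to the same length")
    columns = range(len(ref)) if positions is None else positions
    compared = [i for i in columns if ref[i] != "-" and other[i] != "-"]
    if not compared:
        raise ValueError("no comparable columns")
    matches = sum(ref[i] == other[i] for i in compared)
    return 100.0 * matches / len(compared)


# Toy aligned sequences and a toy residue subset, for illustration only.
human = "ACDEFGHIKL"
ortholog = "ACDEYGHIKV"
subset = [0, 1, 4]

global_identity = percent_identity(human, ortholog)
subset_identity = percent_identity(human, ortholog, subset)
```

A subset can score higher or lower than the whole sequence, which is why the text reports both figures for marmoset.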
Additionally, we compared changes in energy of the S-protein:ACE2 complex for SARS-CoV-2 and SARS-CoV and found similar changes, suggesting that the range of susceptible animals is likely to be similar for the two viruses (Supplementary Results 5).
Conservation of TMPRSS2 and its role in SARS-CoV-2 infection
ACE2 and TMPRSS2 are key factors in the SARS-CoV-2 infection process. Both are highly co-expressed in susceptible cell types, such as type II pneumocytes in the lungs, ileal absorptive enterocytes in the gut, and nasal goblet secretory cells50. Since both proteins are required for infection of host cells, and since our analyses clearly support suggestions of conserved binding of S-protein:ACE2 across animal species, we decided to analyse whether TMPRSS2 was similarly conserved. There is no known structure of TMPRSS2, so we built a high-quality model (nDOPE = −0.78) from a template structure (PDB ID 5I25). Since TMPRSS2 is a serine protease, and the key catalytic residues are known, we used FunFams51 to identify highly conserved residues in the active site and the cleavage site that are likely to be involved in substrate binding. This resulted in two sets of residues that we analysed: the active site and cleavage site residues (ASCS), and the active site and cleavage site residues plus residues within 8 Å of catalytic residues that are highly conserved in the FunFam (ASCSEX). The sum of Grantham scores for mutations in the active site and cleavage site of TMPRSS2 is zero or consistently lower than the corresponding sum for ACE2 in all organisms under consideration, for both ASCS and ASCSEX residues (Fig. 4). This means that the mutations in TMPRSS2 involve more conservative changes.
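The two steps described above, extending a residue set by an 8 Å distance cutoff and summing Grantham scores over the substituted positions, can be sketched as follows. This is only an illustration under stated assumptions: `GRANTHAM_EXCERPT` holds a handful of pairs from the Grantham (1974) matrix and should be checked against the full published table before any real use; the coordinates, sequences and function names are hypothetical; and the FunFam conservation filter is reduced to an optional set of positions supplied by the caller.

```python
import math

# Excerpt of Grantham distances (assumption: verify against the full matrix).
GRANTHAM_EXCERPT = {
    frozenset("LI"): 5,
    frozenset("FY"): 22,
    frozenset("RK"): 26,
    frozenset("DE"): 45,
    frozenset("ST"): 58,
    frozenset("CW"): 215,
}


def grantham(a, b):
    """Grantham distance between two residues; identical residues score 0."""
    if a == b:
        return 0
    return GRANTHAM_EXCERPT[frozenset((a, b))]


def within_cutoff(coords, catalytic, cutoff=8.0, conserved=None):
    """Positions whose representative atom lies within `cutoff` Å of any
    catalytic residue, optionally restricted to a set of conserved positions."""
    selected = [
        pos for pos, xyz in coords.items()
        if any(math.dist(xyz, coords[c]) <= cutoff for c in catalytic)
    ]
    if conserved is not None:
        selected = [pos for pos in selected if pos in conserved]
    return sorted(selected)


def grantham_sum(human, ortholog, positions):
    """Sum of Grantham distances over the given 0-based positions."""
    return sum(grantham(human[i], ortholog[i]) for i in positions)


# Hypothetical toy data: one representative atom per residue, in Å.
coords = {0: (0.0, 0.0, 0.0), 1: (3.0, 4.0, 0.0), 2: (20.0, 0.0, 0.0),
          3: (0.0, 7.9, 0.0), 4: (0.0, 0.0, 8.1), 5: (1.0, 1.0, 1.0)}
catalytic = [0]
human_seq = "LRDFCS"
ortholog_seq = "IKDFWT"

extended = within_cutoff(coords, catalytic)
score = grantham_sum(human_seq, ortholog_seq, extended)
```

A sum of zero means every selected position is unchanged relative to human; larger sums indicate more radical substitutions.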
Figure 4
Mutations in DCEX residues seem to have a more disruptive effect in ACE2 than in TMPRSS2. Whilst we expect orthologues from organisms that are close to humans to be conserved and have lower Grantham scores, we observed some residue substitutions with high Grantham scores in primates such as capuchin, marmoset and mouse lemur. In addition, primates such as Coquerel's sifaka, greater bamboo lemur and Bolivian squirrel monkey have mutations in DCEX residues with high Grantham scores. Mutations in TMPRSS2 may render these animals less susceptible to infection by SARS-CoV-2.