|
Post by Admin on May 19, 2020 0:40:55 GMT
Results In March and July of 2019, we detected Betacoronaviruses in three individuals from two sets of smuggled Malayan pangolins (Manis javanica) (n = 27) that were intercepted by Guangdong customs [10]. All three animals suffered from serious respiratory disease and failed to be rescued by the Guangdong Wildlife Rescue Center [10] (S1 Table). Through metagenomic sequencing and de novo assembling, we recovered 38 contigs ranging from 380 to 3,377 nucleotides, and the nucleotide sequence identity among the contigs from these three samples were 99.54%. Thus, we pooled sequences from three samples and assembled the draft genome of this pangolin origin coronavirus. After that, gap filling with amplicon sequencing was conducted to obtain a nearly full genome sequence. This pangolin-CoV-2020 genome (Genbank No.: MT121216) was found to be comprised of 29,521 nucleotides. Strikingly, genomic analyses suggested the pangolin-CoV-2020 has a high identity with both SARS-CoV-2 and Bat-CoV-RaTG13, the proposed origin of SARS-CoV-2 (Fig 1A, S2 Table). The nucleotide sequence identity between pangolin-CoV-2020 and SARS-CoV-2 was 90.32%, whereas the protein sequence identity for individual proteins can be up to 100% (Table 1; Table 2). The nucleotide sequence identity between pangolin-CoV-2020 and Bat-CoV-RaTG13 was 90.24%, while that for the corresponding regions between SARS-CoV-2 and Bat-CoV-RaTG13 was 96.18% (Table 1, S1 Table). Fig 1. Genomic comparison of pangolin-CoV-2020, SARS-CoV-2, and other coronaviruses. A) Similarity plot based on the full-length genome sequence of SARS-CoV-2. Full-length genome sequences of Bat-CoV-RaTG13, Bat-CoV-ZXC21, SARS-CoV, Bat-CoV-ZC45, MERS-CoV, and pangolin-CoV-2020 were used as subject sequences. 
The green line indicates MERS-CoV, the dark blue line indicates SARS-CoV, the grey line indicates Bat-CoV-ZXC21, the yellow line indicates Bat-CoV-ZC45, the orange line indicates pangolin-CoV-2020, while the light blue line indicates Bat-CoV-RaTG13; B) Phylogenetic analyses of whole genome sequences depicting the evolutionary relationship among SARS-CoV-2, pangolin-CoV-2020, and other coronaviruses from different hosts. The phylogenies were estimated using the MrBayes approach employing the GTR+I+G nucleotide substitution model. The nucleotide sequence identities among the S protein genes were 93.15% between the Bat-CoV-RaTG13 and SARS-CoV-2, 84.52% between pangolin-CoV-2020 and SARS-CoV-2, as well as 73.43% between pangolin-CoV-2020 and SARS-CoV, respectively (Table 1). Further analyses suggested the S gene was relatively more genetically diverse in the S1 region than the S2 region (Fig 2A, S3 Table). Compared with their nucleotide sequences, the S proteins of pangolin-CoV-2020 and SARS-CoV-2 were more conserved, with a sequence identity of 90.18% (Table 2). Fig 2. Genetic analyses of the spike (S) surface glycoprotein of pangolin-CoV-2020, SARS-CoV-2, and other coronaviruses. A) Similarity plot based on the spike surface glycoprotein amino acid and nucleotide sequence of SARS-CoV-2. Bat-CoV-RaTG13, Bat-CoV-ZXC21, Bat-CoV-ZC45, SARS-CoV, and pangolin-CoV-2020 were used as subject sequences. The green lines indicate SARS-CoV, the grey lines indicate Bat-CoV-ZXC21, the yellow lines indicate Bat-CoV-ZC45, the orange lines indicate pangolin-CoV-2020, while the light blue lines indicate Bat-CoV-RaTG13; B) Phylogenetic analysis of S gene sequences depicting the evolutionary relationship among SARS-CoV-2, pangolin-CoV-2020, and other coronaviruses from different hosts. The phylogenies were estimated using MrBayes approach employing the GTR+I+G nucleotide substitution model. 
The receptor binding domain (RBD) of the S protein was highly conserved between pangolin-CoV-2020 and SARS-CoV-2: the nucleotide and amino acid sequence identities of the RBD between them (86.64% and 96.80%, respectively) were the highest among the comparisons between pangolin-CoV-2020 and other SARS-like coronaviruses (Table 1, Table 2). Pangolin-CoV-2020 and SARS-CoV-2 also shared a very conserved receptor binding motif (RBM) (98.6%), which was more conserved than in Bat-CoV-RaTG13 (76.4%) (Fig 3). These results support the view that pangolin-CoV-2020 and SARS-CoV-2 use the same angiotensin-converting enzyme 2 (ACE2) receptor. Further analyses suggested that there was one variation (Gln498) between the RBM of pangolin-CoV-2020 and that of SARS-CoV-2, whereas all other key residues associated with receptor binding (Gly482, Val483, Glu484, Gly485, Phe486, Gln493, Leu455, Asn501) were conserved, suggesting a potential binding affinity between pangolin-CoV-2020 and the human ACE2 receptor (Fig 3).
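The identity percentages quoted above come from pairwise comparisons of aligned sequences. A minimal sketch of such a calculation follows, assuming the two sequences are already aligned to equal length and that columns containing a gap are skipped. The post does not say how the authors treated gaps, so that choice and the function name `percent_identity` are illustrative assumptions. The toy sequences are made up and are not taken from either genome.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: the inputs are rows of one pairwise alignment (equal length).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip gap columns (an assumption, see lead-in)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Made-up 10-column alignment with a single mismatch.
toy_ref = "ATGGCTAGCT"
toy_qry = "ATGGCTAGGT"
toy_identity = percent_identity(toy_ref, toy_qry)
```

The same function works for amino acid alignments, which is how a nucleotide identity and a protein identity for the same region can differ: synonymous nucleotide changes lower the first figure but not the second.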
|
|
|
Post by Admin on May 19, 2020 6:23:12 GMT
Fig 3. Amino acid sequence alignment of the spike (S) surface glycoprotein of pangolin-CoV-2020 with SARS-CoV-2 and Bat-CoV-RaTG13. Previously identified critical ACE2-binding residues are in the blue box. An arginine in the core structure that interacts with glycan is displayed within the red box.
On the other hand, unlike the RBD, the N-terminal domain (NTD) showed nucleotide and amino acid sequence identities of only 66.2% and 63.1% between pangolin-CoV-2020 and SARS-CoV-2. However, the residue Arg408 in the RBD core of SARS-CoV-2, which can form a hydrogen bond with human ACE2, was conserved in pangolin-CoV-2020 (Fig 3). Both pangolin-CoV-2020 and Bat-CoV-RaTG13 lack the S1/S2 cleavage site (~680–690 aa) that SARS-CoV-2 possesses (Fig 3). Genomic analyses suggested that sequence similarities were not homogeneous across the S genes of pangolin-CoV-2020, SARS-CoV-2, Bat-CoV-ZXC21 and Bat-CoV-ZC45. For example, the first S region (i.e., nucleotides 1–1200) of pangolin-CoV-2020 has a higher nucleotide identity to two bat viruses (Bat-CoV-ZXC21 and Bat-CoV-ZC45) than to SARS-CoV-2 and Bat-CoV-RaTG13, whereas the opposite holds for the remainder of the S gene (Fig 2A). These results suggest that a recombination event could have occurred during the evolution of these coronaviruses. Phylogenetic analyses suggested that the S genes of pangolin-CoV-2020, SARS-CoV-2 and three bat origin coronaviruses (Bat-CoV-RaTG13, Bat-CoV-ZXC21, and Bat-CoV-ZC45) were genetically more similar to each other than to other viruses in the same family (Fig 2B). The S gene of Bat-CoV-RaTG13 was genetically closer to pangolin-CoV-2020 than Bat-CoV-ZXC21 and Bat-CoV-ZC45. Similar tree topologies were observed for the ORFs encoding RNA-dependent RNA polymerase (RdRp gene) and other genes (S1–S3 Figs). At the genomic level, SARS-CoV-2 was also genetically closer to Bat-CoV-RaTG13 than to pangolin-CoV-2020 (Fig 1B).
Discussion In this study, we assembled the genomes of coronaviruses identified in sick pangolins, and our results showed that pangolin-CoV-2020 is genetically associated with both SARS-CoV-2 and a group of bat coronaviruses. There is a high sequence identity between pangolin-CoV-2020 and SARS-CoV-2. However, phylogenetic analyses and a special amino acid sequence in the S gene of SARS-CoV-2 did not support the hypothesis that SARS-CoV-2 arose directly from pangolin-CoV-2020. It is of interest that the genomic sequences of coronaviruses detected in two batches of smuggled pangolins, intercepted by different customs at different dates, were all associated with bat coronaviruses. In addition, the genetic identity of the coronavirus contigs assembled from each animal was extremely high (99.54%). The reads from the third pangolin, acquired in July 2019, were relatively less abundant than those from the two pangolin samples acquired in March 2019. Although it is unclear whether the coronaviruses in these two batches of smuggled pangolins had the same origin, our results indicated that pangolins can be a natural host for Betacoronaviruses, which could be enzootic in pangolins. All three exotic pangolins in which Betacoronaviruses were detected were sick with serious respiratory disease and could not be rescued. However, these pangolins were under severe stress in the transport freight when they were intercepted by customs. It is unclear whether this coronavirus is part of the common viral flora in the respiratory tract of pangolins. Nevertheless, the pathogenesis of this coronavirus in pangolins remains to be elucidated. Phylogenetic trees suggested that Bat-CoV-RaTG13 was genetically closer to SARS-CoV-2 than was the pangolin-CoV-2020 genome assembled in this study, at both the individual gene and genomic sequence levels.
Recombination analysis showed that the S gene of pangolin-CoV-2020 might have been assembled from a fragment of Bat-CoV-ZC45 or Bat-CoV-ZXC21 and a fragment of Bat-CoV-RaTG13. Interestingly, the cleavage site between S1 and S2 in SARS-CoV-2 had an insertion of multiple amino acids (i.e., PRRA) compared with those of Bat-CoV-RaTG13 and pangolin-CoV-2020, which may have resulted from an additional recombination event. A new study reported a novel bat-derived coronavirus (RmYN02) identified from a metagenomics analysis of samples from 227 bats collected in Yunnan province, China, between May and October of 2019. Although RmYN02 showed a relatively low nucleotide sequence identity (93.3%) to SARS-CoV-2, it carried a similar insertion of multiple amino acids at the junction of the S1 and S2 subunits of the S protein, providing strong evidence that such insertion events can occur in nature [11]. Thus, these data suggest that SARS-CoV-2 originated from multiple naturally occurring recombination events among viruses present in bats and other wildlife species. The S protein of coronaviruses binds to host receptors via RBDs and plays an essential role in initiating viral infection and determining host tropism [2]. A prior study suggested that SARS-CoV-2 and SARS-CoV bind to the same ACE2 receptor [9]. Our analyses showed that the RBD of pangolin-CoV-2020 is far more similar to those of these viruses than to that of MERS-CoV, suggesting that pangolin-CoV is very likely to use ACE2 as its receptor as well. A comparative analysis of the interaction of the S proteins of coronaviruses with ACE2 proteins of humans and pangolins showed that the S proteins of SARS-CoV-2 and pangolin-CoV can potentially recognize ACE2 in both humans and pangolins [12]. A recent study found that a human ACE2-binding ridge in the SARS-CoV-2 RBD takes a more compact conformation compared with the SARS-CoV RBD; moreover, several residue changes in the SARS-CoV-2 RBD may also enhance its human ACE2-binding affinity [13].
The core residues in the RBM that may be related to a higher human ACE2-binding affinity than that of SARS-CoV are 100% identical between SARS-CoV-2 and pangolin-CoV-2020. Therefore, pangolin-CoV-2020 (CoV-pangolin/GD) potentially recognizes human ACE2 better than SARS-CoV does. In addition to the RBD, the NTD is also important in recognizing acetylated sialic acids on glycosylated cell-surface receptors [14]. It is reported that SARS-CoV-2 can bind to human ACE2 via the viral CTD (the same as the RBD), but not the NTD, and that the glycan attached to Asn90 of human ACE2 forms a hydrogen bond with Arg408 in the RBD core [15]. This glycan-interacting arginine is conserved between SARS-CoV-2 and pangolin-CoV-2020. Therefore, there is structural similarity in glycan binding between SARS-CoV-2 and pangolin-CoV-2020. On the other hand, the ACE2 receptor is present in pangolins, with high sequence conservation relative to its human homolog. However, the zoonotic potential of pangolin-CoV-2020 remains unclear. Coronaviruses have been shown to have a wide range of hosts, and some of them can infect humans [16]. Thus, it is critical to determine the natural reservoir and the host tropisms of these coronaviruses, especially their potential to cause zoonosis. In the last two decades, apart from SARS-CoV-2, SARS and MERS have caused serious outbreaks in humans, leading to thousands of deaths [3, 4, 17, 18]. Although these three zoonotic coronaviruses were shown to be of bat origin, they seemed to use different intermediate hosts. For example, farmed palm civets were suggested to be an intermediate host for SARS-CoV, although the details of the link from bats to farmed palm civets remain unclear [19–21]. Most recently, dromedary camels in Saudi Arabia were shown to harbor three different coronaviruses, including the dominant MERS-CoV lineage that was responsible for the outbreaks in the Middle East and South Korea during 2015 [22].
Although the present study does not support pangolins being intermediate hosts for the emergence of SARS-CoV-2, our results do not exclude the possibility that other CoVs could be circulating in pangolins. Thus, surveillance of coronaviruses in pangolins could improve our understanding of the spectrum of coronaviruses in pangolins. In addition to the conservation of wildlife, minimizing human exposure to wildlife will be important to reduce the risk of coronavirus spillover from wild animals to humans. In summary, we suggest that pangolins could be natural hosts of Betacoronaviruses with an unknown potential to infect humans. However, our study does not support the idea that SARS-CoV-2 evolved directly from the pangolin-CoV.
|
|
|
Post by Admin on Jun 3, 2020 0:39:06 GMT
SARS-CoV-2 may have emerged through recombination between a bat and a pangolin coronavirus and purifying selection, a new study has found. Previous studies reported that SARS-CoV-2 appears to be genetically most similar to a coronavirus isolated from a bat in Yunnan in 2013, called RaTG13, but also that some parts of the virus resemble coronaviruses found among Malayan pangolins. Some research groups have concluded that pangolins may have served as an intermediate host for the emerging SARS-CoV-2, but other teams consider a scenario of direct evolution unlikely. Researchers led by Duke University Medical Center's Feng Gao analyzed the SARS-CoV-2 genome and compared it to other members of the Betacoronavirus family, including SARS-CoVs, RaTG13, and pangolin SARS-like CoVs. As they reported on Friday in Science Advances, they found that recombination and purifying selection were likely involved in the evolution of SARS-CoV-2. They also suggested that cross-species infections may have fueled its development by enabling bat RaTG13-like viruses to recombine with viruses similar to pangolin SARS-like CoVs. "We hypothesize that this, and/or other ancestral recombination events between viruses infecting bats and pangolins, may have played a key role in the evolution of the strain that led to the introduction of SARS-CoV-2 into humans," Gao and his colleagues wrote in their paper. The researchers analyzed 43 complete coronavirus genome sequences, including those from bats, pangolins, and humans, to confirm that overall, the bat RaTG13 virus is most closely related to SARS-CoV-2, followed by the pangolin viruses Pan_SL-CoV_GD from Guangdong and Pan_SL-CoV_GX from Guangxi. While the bat RaTG13 virus is broadly the most similar virus to SARS-CoV-2, there are two regions of the SARS-CoV-2 genome where they diverge, namely the ORF1a gene and the part of the spike glycoprotein gene that encodes the ACE2 receptor binding motif (RBM). 
SARS-CoV-2, the researchers noted, relies on the RBM for its entry into host cells. At the ORF1a gene, SARS-CoV-2 is more similar to the otherwise divergent bat ZXC21 and ZC45 coronaviruses, while at the RBM, it is more similar to the pangolin Pan_SL-CoV_GD virus. By comparing the various coronaviruses, the researchers uncovered signs of recombination breakpoints before and after the ACE2 receptor binding motif within SARS-CoV-2, suggesting it was acquired through recombination. The Pan_SL-CoV_GD RBM differs from that of SARS-CoV-2 by one amino acid, which falls at the edge of the ACE2 contact interface. This suggests that a RaTG13-like virus may have obtained this RBM from a Pan_SL-CoV_GD-like virus through recombination, the researchers noted. This change may have enabled the virus to be better able to infect human cells, as a recent study reported RaTG13 pseudoviruses were less efficient than SARS-CoV-2 pseudoviruses in their ability to use ACE2 to infect cells. Through their comparison of coronavirus genomes, the researchers also uncovered indicators of strong purifying selection. The S2 subunit of the S gene, for instance, is highly conserved among SARS-CoV-2, RaTG13, and Pan_SL-CoV, and the researchers wrote that given its important role in cell entry, purifying selection at this site is not surprising. They further noted signs of selection in other regions of the coronavirus genome, such as the E and M genes. Some viruses exhibited signs of selective pressure at the same genes, while other viruses had increased pressure at other genes, suggesting that some but not all hosts exerted similar evolutionary constraints. Still, the researchers said those similar pressures might make cross-species transmissions easier. Continuous surveillance of coronaviruses in their natural hosts and in humans will be needed to stem new outbreaks, they suggested. 
"While the direct reservoir of SARS-CoV-2 is still being sought, one thing is clear: reducing or eliminating direct human contact with wild animals is critical to preventing new coronavirus zoonosis in the future," they wrote.
|
|
|
Post by Admin on Jun 3, 2020 7:46:35 GMT
Emergence of SARS-CoV-2 through recombination and strong purifying selection Xiaojun Li1,†, Elena E. Giorgi2,†, Manukumar Honnayakanahalli Marichannegowda1, Brian Foley2, Chuan Xiao3, Xiang-Peng Kong4, Yue Chen1, S. Gnanakaran2, Bette Korber2,5 and Feng Gao1,6,* See all authors and affiliations
Science Advances 29 May 2020: eabb9153 DOI: 10.1126/sciadv.abb9153
Abstract COVID-19 has become a global pandemic caused by the novel coronavirus SARS-CoV-2. Understanding the origins of SARS-CoV-2 is critical for deterring future zoonosis, discovering new drugs, and developing a vaccine. We show evidence of strong purifying selection around the receptor binding motif (RBM) in the spike and other genes among bat, pangolin, and human coronaviruses, suggesting similar evolutionary constraints in different host species. We also demonstrate that SARS-CoV-2’s entire RBM was introduced through recombination with coronaviruses from pangolins, possibly a critical step in the evolution of SARS-CoV-2’s ability to infect humans. Similar purifying selection in different host species, together with frequent recombination among coronaviruses, suggest a common evolutionary mechanism that could lead to new emerging human coronaviruses.
Introduction The severe respiratory disease COVID-19 was first noticed in late December 2019 (1). It rapidly became epidemic in China, devastating public health and economy. At the beginning of May, COVID-19 had spread to ~150 countries and infected over 3.3 million people (2). On March 11, 2020, the World Health Organization (WHO) officially declared it a pandemic.
The etiological agent of COVID-19 (3), severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) (4), was identified as a new member of the genus Betacoronavirus, which includes a diverse reservoir of coronaviruses (CoVs) isolated from bats (5–7). While genetically distinct from the betacoronaviruses that cause SARS and MERS in humans (8, 9), SARS-CoV-2 shares the highest level of genetic similarity (96.3%) with CoV RaTG13, sampled from a bat in Yunnan in 2013 (8). Recently, CoV sequences closely related to SARS-CoV-2 were obtained from confiscated Malayan pangolins in two separate studies (10, 11). These pangolin SARS-like CoVs (Pan_SL-CoV) form two distinct clades corresponding to their locations of origin: the first clade, Pan_SL-CoV_GD, sampled from Guangdong (GD) province in China, is genetically more similar to SARS-CoV-2 (91.2%) than the second clade, Pan_SL-CoV_GX, sampled from Guangxi (GX) province (85.4%).
Understanding the origin of SARS-CoV-2 may help develop strategies to deter future cross-species transmissions and to establish appropriate animal models. Recombination plays an important role in the evolution of coronaviruses (12, 13). Viral sequences nearly identical to SARS and MERS viruses were found in civets and domestic camels, respectively (14, 15), demonstrating that they originated from zoonotic transmissions with intermediate host species between the bat reservoirs and humans, a common pattern leading to CoV zoonosis (5–7). However, non-human viruses nearly identical to SARS-CoV-2 have not yet been found. In this paper we demonstrate, through localized genomic analysis, a complex pattern of evolutionary recombination and strong purifying selection between CoVs from distinct host species, together with cross-species infections that likely gave rise to SARS-CoV-2.
|
|
|
Post by Admin on Jun 3, 2020 20:07:34 GMT
Results Acquisition of receptor binding motif through recombination Phylogenetic analysis of 43 complete genome sequences from three clades (SARS-CoVs and bat_SL-CoVs in clade 3; SARS-CoV-2, bat_SL-CoVs and pan_SL-CoVs in clade 2; and two divergent bat_SL-CoVs in clade 1) within the Sarbecovirus group (9) confirms that RaTG13 is overall the closest sequence to SARS-CoV-2 (Fig. S1). Pan_SL-CoV_GD are the next closest viruses, followed by Pan_SL-CoV_GX. Among the bat-CoV sequences in clade 2 (Fig. S1), ZXC21 and ZC45, sampled from bats in 2005 in Zhoushan, Zhejiang, China, are the most divergent, with the exception of the beginning of the ORF1a gene (region 1, Fig. 1A). All other Bat_SL-CoV and SARS-CoV sequences form a separate clade 3, while clade 1 comprises BtKY72 and BM48-31, the two most divergent Bat_SL-CoV sequences in the Sarbecovirus group (Fig. S1). Recombination in the first SARS-CoV-2 sequence (Wuhan-Hu-1) with other divergent CoVs has been previously noted (3). Here, to better understand the role of recombination in the origin of SARS-CoV-2 among these genetically similar CoVs, we compare Wuhan-Hu-1 to six representative Bat_SL-CoVs, one SARS-CoV, and the two Pan_SL-CoV_GD sequences using SimPlot analysis (16). RaTG13 has the highest similarity across the genome (8), with two notable exceptions where a switch occurs (Fig. 1A). In phylogenetic reconstructions, SARS-CoV-2 clusters closer to ZXC21 and ZC45 than RaTG13 at the beginning of the ORF1a gene (region 1, Fig. 1B), and, as previously reported (10, 17), to a Pan_SL-CoV_GD in region 2 (Figs. 1C and S2), which spans the receptor angiotensin-converting enzyme 2 (ACE2) binding site in the spike (S) glycoprotein gene. 
When comparing Wuhan-Hu-1 to Pan_SL-CoV_GD and RaTG13, as representative of distinct host-species branches in the evolutionary history of SARS-CoV-2, using the recombination detection tool RIP (18), we find significant recombination breakpoints before and after the ACE2 receptor binding motif (RBM) (19, 20) (Fig. S2A). This suggests that SARS-CoV-2 carries a history of cross-species recombination between the bat and the pangolin CoVs.
Fig. 1 SARS-CoV-2 recombination with Pan_SL-CoV and Bat_SL-CoV. (A) SimPlot genetic similarity plot between SARS-CoV-2 Wuhan-Hu-1 and representative CoV sequences, using a 400-bp window at a 50-bp step and the Kimura 2-parameter model. Phylogenetic trees of regions of disproportional similarities, showing high similarities between SARS-CoV-2 and ZXC21 (B) or GD/P1La (C), high genetic divergences of all Pan_SL-CoV sequences (D), and high similarities between GD/P1La and divergent bat_SL-CoV sequences (E). All positions are relative to Wuhan-Hu-1. In Fig. 1A we use the ORF1a and ORF1b nomenclature consistent with the original publication of the Wuhan virus (3); however, the NCBI betacoronavirus reference sequences (see SARS-CoV-2, NC_045512.2, for an example) designate a single longer stretch called ORF1ab (from 266 to 21,555) that spans both 1a and 1b.
Pan_SL-CoV sequences are generally more similar to SARS-CoV-2 than other CoV sequences, with the exception of RaTG13 and ZXC21, but are more divergent from SARS-CoV-2 at two regions in particular: the beginning of the ORF1b gene and the highly divergent N terminus of the S gene (regions 3 and 4, respectively, Fig. 1A). Within-region phylogenetic reconstructions show that Pan_SL-CoV sequences become as divergent as BtKY72 and BM48-31 in region 3 (Fig. 1D), while less divergent in region 4, where Pan_SL-CoV_GD clusters with ZXC21 and ZC45 (Fig. 1E).
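The Fig. 1A caption describes a sliding-window similarity scan (400-bp window, 50-bp step, Kimura 2-parameter model). The sketch below shows the general idea only and is not SimPlot's implementation. It assumes the query and subject are rows of one alignment, that similarity is reported as 1 minus the Kimura 2-parameter distance, and that windows with no usable columns or with saturated distances are dropped. The function names `k2p_distance` and `sliding_similarity` are hypothetical.

```python
import math

PURINES = {"A", "G"}
BASES = {"A", "C", "G", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance, d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q)).

    P and Q are the proportions of transitions and transversions among
    columns where both sequences carry an unambiguous base.
    Returns None if no column is usable or the distance is undefined.
    """
    n = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in BASES or b not in BASES:
            continue  # skip gaps and ambiguity codes
        n += 1
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine<->pyrimidine
    if n == 0:
        return None
    p, q = transitions / n, transversions / n
    term1, term2 = 1 - 2 * p - q, 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        return None  # too divergent for the model
    return -0.5 * math.log(term1 * math.sqrt(term2))


def sliding_similarity(query, subject, window=400, step=50):
    """Return (window midpoint, similarity) pairs along an alignment."""
    points = []
    for start in range(0, len(query) - window + 1, step):
        d = k2p_distance(query[start:start + window],
                         subject[start:start + window])
        if d is not None:
            points.append((start + window // 2, 1.0 - d))
    return points
```

Plotting one such curve per subject sequence against the same query gives a figure of the kind described: a recombination signal shows up where the subject with the highest curve switches from one virus to another.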
Together, these observations suggest ancestral cross-species recombination between pangolin and bat CoVs in the evolution of SARS-CoV-2 at the ORF1a and S genes. Furthermore, the discordant phylogenetic clustering at various regions of the genome among clade 2 CoVs also supports extensive recombination among these viruses isolated from bats and pangolins.
The SARS-CoV-2 S glycoprotein mediates viral entry into host cells and therefore represents a prime target for drug and vaccine development (12, 19). While SARS-CoV-2 sequences share the greatest overall genetic similarity with RaTG13, this is no longer the case in parts of the S gene. Specifically, amino acid sequences of the RBM in the S1 subunit are nearly identical to those in two Pan_SL-CoV_GD viruses, with only one amino acid difference (Q498H), although the RBM region has not been fully sequenced in one of the Guangdong pangolin viruses (Pan_SL-CoV_GD/P2S) (Fig. 2A). Pangolin CoVs from Guangxi are much more divergent. Phylogenetic analysis based on the amino acid sequences of this region shows three distinct clusters of SARS-CoV, SARS-CoV-2 and bat-CoV only viruses, respectively (Fig. 2B). Interestingly, while SARS-CoV and SARS-CoV-2 viruses use ACE2 for viral entry, all CoVs in the third cluster have a 5-aa deletion and a 13–14-aa deletion in the RBM (Fig. 2A) and do not infect human target cells (5, 21, 22).
Fig. 2 Impact of SARS-CoV-2 recombination on coreceptor binding. (A) Amino acid sequences of the receptor binding motif (RBM) in the spike (S) gene among Sarbecovirus CoVs compared to Wuhan-Hu-1 (top). Dashes indicate identical amino acids, dots indicate deletions. ACE2 critical contact sites highlighted in blue, two large deletions in green. (B) Phylogenetic tree analysis of amino acid sequences of the RBM. Viruses with the ability to bind ACE2 form two distinct clusters (one including SARS_CoVs and the other including SARS_CoV-2s). Bat-SL-CoVs with large deletions form another distinct cluster.
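The Fig. 2A caption uses a compact display convention: each sequence is shown relative to the reference, with dashes for identical residues and dots for deletions, and differences are named in the form Q498H. A minimal sketch of both conventions follows, assuming the rows come from one alignment in which `-` marks a gap. The helper names and the five-residue toy fragment are hypothetical and serve only to show the notation; the fragment is numbered so that its third residue falls at position 498.

```python
def render_against_reference(ref: str, seq: str, gap: str = "-") -> str:
    """Show seq relative to ref: '-' for identical, '.' for deleted residues."""
    out = []
    for r, s in zip(ref, seq):
        if s == gap:
            out.append(".")   # deletion relative to the reference
        elif s == r:
            out.append("-")   # identical residue
        else:
            out.append(s)     # substituted (or inserted) residue
    return "".join(out)


def substitutions(ref: str, seq: str, first_position: int, gap: str = "-"):
    """List substitutions as <ref residue><reference position><new residue>."""
    found = []
    position = first_position
    for r, s in zip(ref, seq):
        if r == gap:
            continue  # insertion relative to ref: no reference position used
        if s != gap and s != r:
            found.append(f"{r}{position}{s}")
        position += 1
    return found


# Toy aligned fragment (illustrative only), starting at position 496.
toy_reference = "GFQPT"
toy_variant = "GFHPT"
toy_row = render_against_reference(toy_reference, toy_variant)
toy_changes = substitutions(toy_reference, toy_variant, first_position=496)
```

Numbering by reference position, rather than by alignment column, keeps names such as Q498H stable even when other sequences in the alignment carry insertions.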
Although both SARS-CoV and SARS-CoV-2 use human ACE2 as their receptor (8, 23), they show a high level of genetic divergence (Figs. 1 and S1). However, the structures of the S1 unit of the S protein from both viruses are highly similar (20, 24–26), with the exception of a loop that bends differently (Fig. 3A). The root-mean-square deviation (RMSD) between the two S proteins is 1.2 Å over 174 Cα residues (24). This suggests that conformational similarity of the binding motif enables viral entry through molecular recognition of ACE2. These structural studies also thoroughly analyzed the contact residues between the S protein and human ACE2 (20, 24). Previous structural and mutagenesis studies have identified two hot spots, K31 and K353, at the S/ACE2 interface in SARS-CoV. In SARS-CoV-2, these two hot spots were slightly weakened due to different residues on its S protein, but the loop that takes a different conformation from that in SARS-CoV provides additional contacts that strengthen the interaction (26). Among 17 distinct amino acids between SARS-CoV-2 and RaTG13 in the RBM region (Fig. 2A), five contact sites based on the structural studies (24) are different, likely impacting RaTG13’s binding to ACE2 (Fig. 3B and Table S1). The single amino acid difference at position 498 (Q or H) between SARS-CoV-2 and Pan_SL-CoV_GD is at the edge of the ACE2 contact interface; neither Q nor H at this position forms hydrogen bonds with ACE2 residues (Fig. 3C). Thus, a functional RBM nearly identical to the one in SARS-CoV-2 is naturally present in Pan_SL-CoV_GD viruses. The very distinctive RaTG13 RBM suggests that this virus is unlikely to infect human cells efficiently. Indeed, a recent study showed that the RaTG13 pseudovirus is much less efficient than SARS-CoV-2 pseudoviruses in using ACE2 to infect cells, and this is most likely due to the L486F and Y493Q substitutions, which result in lower ACE2 binding in RaTG13 (26).
Therefore, it is likely that the acquisition of a complete functional RBM by a RaTG13-like CoV through a recombination event with a Pan_SL-CoV_GD-like virus enabled it to more efficiently use ACE2 for human infection.
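The structural comparison above is summarized by a root-mean-square deviation (1.2 Å over 174 Cα residues). A minimal sketch of the quantity itself follows, assuming the two structures have already been superposed and their Cα atoms paired one-to-one. Published RMSD values normally involve an optimal superposition step, which is omitted here. The function name `rmsd` and the coordinates are made up for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired 3D points.

    RMSD = sqrt( (1/N) * sum_i |a_i - b_i|^2 ), in the units of the input.
    Assumption: the point sets are already superposed and in matching order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))


# Two made-up atoms, each displaced by 1.2 along z.
toy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
toy_b = [(0.0, 0.0, 1.2), (1.0, 0.0, 1.2)]
toy_rmsd = rmsd(toy_a, toy_b)
```

Because the deviations are squared before averaging, a single loop that bends differently can dominate the value even when the rest of the fold overlays closely.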
|
|