Post by Admin on Feb 16, 2021 3:24:09 GMT
Correlating trimeric duplex stability with amino acid coding properties
Inspection of the data in Table 1 and Scheme 1 reveals eight exceptionally stable all-GC codons, as defined by the relative stability of the trimeric duplex each forms with its antiparallel, complementary codon. By the same criterion, a second, less well-resolved group of significantly stable codons can be identified, of the form GCX, CGX, GGX, and CCX, where X is either T or A. It is noteworthy that, collectively, the GCX, CGX, GGX, and CCX families of stable codons code for Ala, Arg, Gly, and Pro, which are among the most abundant amino acids and, save for Arg, are also considered ancient amino acids (Trifonov, 2000). Furthermore, when X = A or T, the complementary codons to this second group are XGC, XCG, XCC, and XGG, respectively. Except for Trp (TGG) and Arg (AGG), these codons code for the amino acids Cys, Ser, and Thr, which, like Ala, Gly, and Pro noted above, are all defined as ancient amino acids (Trifonov, 2000). This empirical correspondence between the stabilities of codons and the abundance and age of the amino acids for which they code raises the intriguing possibility of a stability-modulated, evolutionary shaping of the code.
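The family assignments in this paragraph can be checked mechanically. The following is a minimal Python sketch, assuming the standard genetic code written in DNA notation (T rather than U); the names `CODON_TABLE`, `antiparallel_complement`, and `second_group` are illustrative and do not come from the original paper. It only tabulates coding assignments and says nothing about the measured stabilities in Table 1.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one-letter amino acids, codons enumerated in TCAG order
# (first base varies slowest, third base fastest); '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antiparallel_complement(codon):
    """Return the complementary codon read 5'->3' (the reverse complement)."""
    return codon.translate(COMPLEMENT)[::-1]


# Second group of stable codons: GCX, CGX, GGX, CCX with X = A or T.
second_group = [stem + x for stem in ("GC", "CG", "GG", "CC") for x in "AT"]
complements = [antiparallel_complement(c) for c in second_group]

group_amino_acids = sorted({CODON_TABLE[c] for c in second_group})
complement_amino_acids = sorted({CODON_TABLE[c] for c in complements})

# Print each codon, its amino acid, and its antiparallel partner.
for codon, partner in zip(second_group, complements):
    print(codon, CODON_TABLE[codon], "<->", partner, CODON_TABLE[partner])
```

Under the standard table, the complementary set covers Cys, Ser, Thr, and Trp, and also includes Arg through the codon AGG.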
These stable codon groups, and their corresponding antiparallel codons, each occupy three positions within each of the eight cubes that make up the hypercube scaffolding that interconnects all 64 codons. This set of 24 ‘high-stability’ codon groupings may have been energetically favored ‘prebiotically’. As such, it is intriguing to note that all of the other 40 codons can be generated (‘evolved’) from these most stable codons by, at most, three transition mutations, again suggestive of a stability-modulated, evolutionary shaping of the code.
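The counting argument here can be sketched in Python. The sketch assumes one reading of the text, namely that the 24 ‘high-stability’ codons are the eight all-GC codons, the eight codons GCX, CGX, GGX, and CCX (X = A or T), and the eight antiparallel complements of the latter. Transitions are taken to be A<->G and C<->T. All function and variable names are illustrative.

```python
from itertools import product

TRANSITION = {"A": "G", "G": "A", "C": "T", "T": "C"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antiparallel_complement(codon):
    """Return the complementary codon read 5'->3' (the reverse complement)."""
    return codon.translate(COMPLEMENT)[::-1]


def transition_distance(source, target):
    """Count the transitions that turn source into target.

    Returns None if any position would need a transversion instead.
    """
    steps = 0
    for a, b in zip(source, target):
        if a == b:
            continue
        if TRANSITION[a] != b:
            return None
        steps += 1
    return steps


all_codons = ["".join(c) for c in product("ACGT", repeat=3)]
all_gc = ["".join(c) for c in product("GC", repeat=3)]
second_group = [stem + x for stem in ("GC", "CG", "GG", "CC") for x in "AT"]

# Assumed composition of the 24-codon 'high-stability' set (see lead-in).
stable = set(all_gc) | set(second_group) | {antiparallel_complement(c) for c in second_group}
others = [c for c in all_codons if c not in stable]


def min_transitions(codon):
    """Fewest transitions needed to reach codon from any member of the stable set."""
    distances = [transition_distance(s, codon) for s in stable]
    return min(d for d in distances if d is not None)


max_steps = max(min_transitions(c) for c in others)
print(len(stable), len(others), max_steps)
```

Because a transition never changes whether a base is a purine or a pyrimidine, each codon can be reached by transitions alone only from the stable codons that share its purine/pyrimidine pattern, and an all-GC codon with that pattern always exists. That is why the minimum is always defined and can never exceed three.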
To test the robustness of our conclusions, we conducted the same analyses for a dataset in which we reversed the polarities of the codon–complementary codon interactions to yield parallel codon couplets. This assessment yielded an altered energy spectrum, as well as changes in the stability rank order for the complementary codon couplets, particularly for the RRR/YYY and the YYR/RRY families of trimeric duplexes (columns 1 and 3 in Scheme 1). Further comparison of the antiparallel and parallel datasets also revealed differences in the energy changes associated with the sequential transition and transversion mutations. In the aggregate, these differential outcomes underscore the robustness of the correlations noted here between the stabilities of the antiparallel couplets and the shaping of the genetic code.
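To make the polarity reversal concrete, the following minimal Python sketch writes out the partner strand of a codon for both alignments and collapses a codon to its purine (R) / pyrimidine (Y) pattern. It assumes that in a parallel couplet each base pairs with the base at the same 5'->3' position, so the partner read 5'->3' is the base-by-base complement without reversal. The construction of the parallel dataset is not detailed here, so this is an illustration of the idea, not the authors' procedure; `partner` and `ry_pattern` are hypothetical helper names.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def partner(codon, polarity="antiparallel"):
    """Watson-Crick partner strand written 5'->3' for the given strand polarity."""
    paired = codon.translate(COMPLEMENT)  # base-by-base partner, aligned to the codon
    return paired[::-1] if polarity == "antiparallel" else paired


def ry_pattern(codon):
    """Collapse a codon to its purine (R) / pyrimidine (Y) pattern."""
    return "".join("R" if base in "AG" else "Y" for base in codon)


# Compare the two alignments for an RRR codon and a YYR codon.
for codon in ("GGA", "CCG"):
    anti = partner(codon, "antiparallel")
    para = partner(codon, "parallel")
    print(codon, ry_pattern(codon), "antiparallel:", anti, "parallel:", para)
```

The same bases are paired in both cases; what changes is the strand direction of the partner, and with it the order in which neighboring base pairs stack.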
Correlations between larger domain DNA energy profiles and higher-order biological functions
One long-term goal is to define correlations between functional domains of the genome and the energy profiles of such domains. Some initial trends, which require further validation, include the suggestion by Klump and coworkers that protein-coding sequences predominantly consist of codon domains of relatively uniform stability (Klump and Maeder, 1991). By contrast, Klump proposes that signal sequences exhibit less uniform and less stable domains, while also being more sensitive to local changes in cellular and sequence/structural environments, thereby allowing them to amplify a perturbation in a localized sequence. This biophysical behavior is what one would expect for a biological signal transducer. Coding sequences, by contrast, are less sensitive to environmental conditions, which is consistent with their biological function of faithfully coding for a protein. Much more research is required to establish a robust biophysical map of genomes to test such hypotheses. However, these intriguing early correlations should motivate such efforts, including a parallel analysis for RNA.
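The notion of a domain of ‘relatively uniform stability’ can be illustrated with a sliding-window profile. The Python sketch below is only a toy: it uses the G+C count of each codon as a crude stand-in for stability, which is an assumption made here for illustration and not the free-energy data of Table 1, and the example sequence is made up. A real profile would substitute measured or nearest-neighbor free energies for `codon_scores`.

```python
from statistics import pstdev


def codon_scores(sequence):
    """Split an in-frame sequence into codons and score each by its G+C count.

    The G+C count is a crude, assumed proxy for codon stability.
    """
    usable = len(sequence) - len(sequence) % 3  # drop any trailing partial codon
    codons = [sequence[i:i + 3] for i in range(0, usable, 3)]
    return [sum(base in "GC" for base in codon) for codon in codons]


def window_spread(scores, window=4):
    """Population standard deviation of the proxy score in each sliding window."""
    return [pstdev(scores[i:i + window]) for i in range(len(scores) - window + 1)]


example = "GCGGCCGGCGCAATATTTGCGAAA"  # made-up sequence, eight codons
scores = codon_scores(example)
spread = window_spread(scores)
print(scores)
print(spread)
```

A window with zero spread corresponds to a run of codons with identical proxy scores; larger values flag the less uniform stretches that the text associates with signal sequences.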
Concluding remarks
We have reviewed, presented, and integrated evidence that the iconic chemical genetic code also can be viewed as a differential energy code that influences biological outcomes. This perspective includes implications for differential, energy-based, molecular contributions to classic Darwinian evolutionary theory. In short, evolutionary pressures may well derive from the optimization of fundamental biophysical properties, as well as from the classic perspective of being driven to yield a functionally adaptive advantage for either a biopolymer or an organism.
Darwinian evolution, when proceeding over sufficiently long timeframes, leaves only the evolutionary ‘winners’ behind. This reality makes it difficult, if not impossible, to deduce with any certitude the precursors or evolutionary pathways that ultimately culminated in the current, evolved ‘winners’. However, by invoking the laws of thermodynamics, it becomes possible, via considerations of thermodynamic selection and linkages, as illustrated in the hypercube of Fig. 2, to speculate on what may have preceded these ‘left behind winners’. It is precisely the beauty of thermodynamics, which follows universal laws under all conditions and at all times, that it allows one to extrapolate backward from the ‘left behind winners’ and make informed speculations as to how these winners may have evolved from earlier remnants (molecular fossils).
Consistent with this perspective is our hypothesis that the evolution of the genetic code was shaped and modulated by the differential stabilities of complementary codons, a feature reflective of ‘molecular Darwinism’. As thermodynamic fingerprints of such an evolutionary influence, we found correlations between the free energies of formation of antiparallel, complementary DNA trimers and their codon usage frequency. We also noted correlations between the stabilities of complementary codon couplets and those that code for ‘ancient’ amino acids. Collectively, our observations are consistent with a scenario in which the genetic code, driven by differential codon stabilities, evolved under the influence and regulation of a series of interlocking thermodynamic cycles. We proposed that these coupled energy cycles controlled the transition and transversion mutations of a group of the 24 most ancient (‘prebiotic’) and stable codon pairs, ultimately yielding the complete 64-codon code via a form of ‘thermodynamic selection’. As such, we suggested that the evolution of the genetic code exhibits contributions from both stability-driven ‘molecular’/genotypic Darwinism and the more traditional, phenotypic Darwinism. As we stated in the Abstract, and it is worth repeating, it is not surprising that evolution of the code was influenced by differential energetics, as thermodynamics is the most general and universal branch of science, operating over all time and length scales.
Going forward
Given the correlative examples noted here, it seems justified to create a comprehensive energy map of the human genome or, for that matter, of the genome of any organism of interest. The differential energy domains so characterized may correlate with known functionalities, or may yield insights into regions of as yet undefined function. Such profiling to create an ‘energy genome’ would yield a thermodynamic bridge between sequence, structure, and biological function.
Postscript: Shortly after the beginning of the 20th century, Albert Einstein (Schilpp and Einstein, 1949) declared:
‘A theory is the more impressive the greater the simplicity of its premises, the more different kinds of things it relates, and the more extended its area of applicability. Therefore the deep impression that classical thermodynamics made upon me. It is the only physical theory of universal content which I am convinced will never be overthrown, within the framework of applicability of its basic concepts’.
As biophysical chemists, the authors consider the ultimate exemplar/test of this assertion to be the demonstration that the molecular language and complexities of biology embedded in the genetic code can be rationalized in terms of fundamental thermodynamic principles.
Financial support
This research was supported by grants from the NIH GM23509, GM34469, and CA47995 (to K.J.B.) and NRF (Pretoria, RSA) grant GUN 61103 to H.H.K.