The mere fact that changes in energy and entropy are fundamentally correlated is not unexpected; after all, their temperature dependences are akin and are dictated by the corresponding change in heat capacity (eqn. 1), i.e., their mean fluctuation (eqn. 2). A relationship between free energy and the temperature at which it vanishes is not astonishing either. Both ΔGT and Tm are commonly interpreted as representations of 'thermodynamic stability': the former is expressed in energy units and depends on ΔCp(T), whereas the latter borrows its unit from the temperature scale and is unaffected by any T-dependence of ΔCp. However, we were unable to find in the literature any systematic study that demonstrates this particular linearity from experimental data, its strong dependence on the similarity of congeners, or its highest quality at T = Tmedian. The distinct linear grouping of the theoretically calculated molecules (which are chemically very different from proteins or nucleic acids) is significant at least inasmuch as their thermodynamic parameters are derived independently from partition functions rather than from experimental enthalpies or experimental equilibrium constants, and it emerges in spite of the not entirely exact calculation of S (owing to the harmonic oscillator approximation).
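The shared dependence of enthalpy and entropy on the heat capacity change can be illustrated with a minimal sketch. It assumes a temperature-independent ΔCp and uses the textbook Kirchhoff relations, which are not necessarily identical in form to equation 1; all function names and numerical values (kJ/mol, kJ/(mol·K), K) are hypothetical and chosen only for illustration. Tm is located as the temperature at which ΔG vanishes.

```python
import math

def delta_h(T, dH_ref, dCp, T_ref):
    """Enthalpy change at T for a constant heat capacity change dCp."""
    return dH_ref + dCp * (T - T_ref)

def delta_s(T, dS_ref, dCp, T_ref):
    """Entropy change at T for a constant heat capacity change dCp."""
    return dS_ref + dCp * math.log(T / T_ref)

def delta_g(T, dH_ref, dS_ref, dCp, T_ref):
    """Free energy change at T: dG = dH - T*dS."""
    return delta_h(T, dH_ref, dCp, T_ref) - T * delta_s(T, dS_ref, dCp, T_ref)

def melting_temperature(dH_ref, dS_ref, dCp, T_ref, lo, hi, tol=1e-9):
    """Bisection for the root of dG(T) inside the bracket [lo, hi].

    The bracket must contain exactly one sign change of dG; with dCp > 0
    a second (cold) root can exist below the bracket and is ignored here.
    """
    g_lo = delta_g(lo, dH_ref, dS_ref, dCp, T_ref)
    g_hi = delta_g(hi, dH_ref, dS_ref, dCp, T_ref)
    if g_lo * g_hi > 0:
        raise ValueError("dG does not change sign inside the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = delta_g(mid, dH_ref, dS_ref, dCp, T_ref)
        if g_lo * g_mid <= 0:
            hi = mid
        else:
            lo, g_lo = mid, g_mid
    return 0.5 * (lo + hi)

# Hypothetical reference values at T_ref = 298.15 K (assumptions, not data).
T_REF, DH_REF, DS_REF = 298.15, 200.0, 0.6

tm_no_cp = melting_temperature(DH_REF, DS_REF, 0.0, T_REF, 298.15, 400.0)
tm_with_cp = melting_temperature(DH_REF, DS_REF, 5.0, T_REF, 298.15, 400.0)
print(tm_no_cp, tm_with_cp)
```

In this sketch ΔG depends on ΔCp at every temperature, whereas Tm is a single number on the temperature scale; changing the assumed ΔCp shifts both ΔH(T) and ΔS(T) together, which is the coupling referred to above.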
Taken together, the similarity-dependent linearity of ΔGT vs. Tm, quantified through the regression coefficient rT, seems to be as general as the whole theory of thermodynamics. It may thus be that the origin of this linearity lies at least in part in the mathematical structure of thermodynamics, and not entirely in the physics that thermodynamics was designed to describe. We therefore proceed to derive general consequences with respect to physics, such as the entanglement of the First and Second Laws for groups of similar objects, as mentioned in the introduction. We continue with the mathematical and geometrical analysis of a function generated by combining equations 3, 8 (right-hand sides of both), 4 and 5 and eliminating ΔGT. This gives equations 9 and 10, i.e., the fundamental energy-entropy relationship and the mathematical basis for the 5D parameter space . Equation 9 is a simplified version, for ΔCp = 0 (for clarity), of the general form shown in equation 10. Both equations can be analytically solved for (eqn. 26 [see additional file 1]).
(9)
(10)
The above functions are variants of the well-known quadric x = y·z, which has the shape of a hyperbolic paraboloid (where x = , y = and z = T). It has a single saddle point at the origin {x = 0; y = 0; z = 0}, from which the S4-symmetric surface spreads with negative Gaussian curvature everywhere (Figure 1). Any temperature dependence of ΔCp(T) is consistent with the hyperbolic paraboloid (eqn. 9), as shown in equation 10. For ΔCp = 0 (eqn. 9 with hT = h0 + h1·T and sT = s0 + s1·T from the van't Hoff datasets) the basic shape of the function does not change when compared to x = y·z, although the surface may be quite heavily 'distorted' (not shown). However, for ΔCp = const. ≠ 0 (eqn. 9 with hT = h0 + h1·T + h2·T·lnT and sT = s0 + s1·T + s2·T·lnT) the group constants h0-2 and s0-2 obtained from the experimental calorimetric datasets produced shapes of the eyebrow-raising kind. Figure 2 shows four views of the same 3D projection of the thermodynamic 5D parameter space, ΔHT versus ΔST and T, for one particular but representative calorimetrically measured protein mutant group (mutants of Staphylococcal Nuclease). Figure 3 depicts one to two views of three different 3D projections for the same mutant group. Both Figures 2 and 3 focus on the zone that contains the experimental data (yellow dots). The interested reader is welcome to copy any set of experimental group constants h0-2 and s0-2 [additional file 2], plot equation 9 at any scale (best solved for to suppress a maximum of asymptotic planes in certain 3D projections) and enjoy the shapes and wormholes created by the T·lnT terms. A more comprehensive study of the characteristics of this function shall be published elsewhere.
The yellow line in Figure 3d, i.e., the experimental isotherm at T = Tmedian, lies in a 'valley' at Tmedian = 320.2 K created by the saddle of this particular hyperbolic paraboloid. It seems that this isotherm is the best defined of all T and therefore produces the best linear regression coefficient rT. Each straight line in ΔGT versus Tm that represents a structurally similar group is, in geometric terms, a geodesic on the hyperbolic paraboloid. The corresponding group functions or , as expressed through equations 9 and 10, are therefore also geodesics. Geometric considerations indicate that the datapoints produce the best rT values in ΔGT vs. Tm when they are closest to the maximal negative curvature, that is, to the saddle point of the hyperbolic paraboloid (cf. Figure 3d). Flatter curvature, found in the steeper surface regions of the hyperbolic paraboloid farther away from the saddle point (cf. Figure 1), allows a greater dispersal of the datapoints owing to idiosyncratic ΔCp values, which leads to lower regression coefficients rT.
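The curvature argument can be checked on the unscaled model surface x = y·z. The following minimal sketch uses the standard formula for the Gaussian curvature of a graph x = f(y, z), K = (f_yy·f_zz − f_yz²)/(1 + f_y² + f_z²)², which for f = y·z reduces to K = −1/(1 + y² + z²)². The function name is hypothetical, and the sketch deliberately ignores the group constants h and s that distort the experimental surfaces.

```python
def gaussian_curvature(y, z):
    """Gaussian curvature of the graph x = y*z at the point (y, z)."""
    # First and second partial derivatives of f(y, z) = y*z.
    f_y, f_z = z, y
    f_yy, f_zz, f_yz = 0.0, 0.0, 1.0
    # Standard formula for the Gaussian curvature of a graph x = f(y, z).
    return (f_yy * f_zz - f_yz ** 2) / (1.0 + f_y ** 2 + f_z ** 2) ** 2

# The curvature is negative everywhere and largest in magnitude at the saddle.
for point in [(0.0, 0.0), (0.5, 0.5), (1.0, 2.0), (5.0, 5.0)]:
    print(point, gaussian_curvature(*point))
```

On this idealised surface the curvature equals −1 at the saddle point and tends to zero with increasing distance from it, consistent with the description of the flatter, steeper regions far from the origin.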
Independently of geometric considerations, we interpret this consistently observed linearity as a (physically) 'minimal expense' or (mathematically) 'minimal action' effect: the appearance or evolution of small structural changes within the same group, i.e., without touching the essential framework structure, can only result in constantly proportional, and therefore unevolving, free energy changes that are 'linear' with respect to their equilibrium temperature changes. A thermodynamic interpretation of this linear relationship would be that incremental irreversible changes within a group of reversibly dynamic, similar but distinctly different structures behave just as reversible changes do: they are virtually uncoupled, and therefore additive and independent of the path taken in between, as is the prerequisite for obeying the Gibbs-Helmholtz equation and synonymous with ΔG and ΔF being state functions.
One might argue that the linearity of equation 8 is a simplified manifestation of the Taylor series expansion of any mathematical function, f(x) = f(x0) + (df/dx)·(x – x0) + (1/2!)·(d²f/dx²)·(x – x0)² + (1/3!)·(d³f/dx³)·(x – x0)³ + ..., with all derivatives evaluated at x0. Any slowly varying function f(x) becomes approximately linear sufficiently close to the reference point x0 (Tm or in this case). In performing the linear correlations of ΔGT versus Tm at T = Tmedian, we do not explicitly claim that the linear relation holds at all temperatures. We do claim, however, that a correlation between ΔGT and Tm at any temperature T using a polynomial of higher than first (linear) degree, as generalised in the above Taylor series expansion, will lead to an analytically solvable relationship for or . We did not prove the generality of this claim but solved ΔH – T·ΔS = hT – [(ΔH/ΔS)·s1,T + (ΔH/ΔS)²·s2,T + (ΔH/ΔS)³·s3,T], which is a Taylor series-expanded version of equation 8 (where ΔCp = 0), for ΔH and ΔS, respectively. The expanded nonlinear variants with s3,T = 0 (quadratic) and s3,T ≠ 0 (cubic) each yielded at least one non-complex analytical solution for ΔH(ΔS) and ΔS(ΔH), albeit with a more complicated mathematical structure (not shown). In other words, we claim that a fundamental relationship between energy and entropy for a group of similar objects results from any analytically solvable relationship between ΔGT and Tm. We opt for the simplest, a linear solution: ΔGT and Tm are proportional over a reasonably large temperature range.
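The general Taylor argument, that a smooth function is nearly linear close to its reference point, can be illustrated numerically. The sketch below does not reproduce equation 8; it uses the textbook Gibbs-Helmholtz expression for ΔG(T) with a constant ΔCp, referenced to Tm, and compares it with its first-order expansion around Tm. Function names and parameter values (kJ/mol, kJ/(mol·K), K) are hypothetical.

```python
import math

def delta_g_exact(T, Tm, dHm, dCp):
    """Gibbs-Helmholtz free energy with constant dCp, referenced to Tm."""
    return dHm * (1.0 - T / Tm) + dCp * ((T - Tm) - T * math.log(T / Tm))

def delta_g_linear(T, Tm, dHm):
    """First-order Taylor expansion around Tm; the slope is -dHm/Tm."""
    return -(dHm / Tm) * (T - Tm)

def linearisation_error(offset, Tm=330.0, dHm=300.0, dCp=8.0):
    """Exact minus linearised dG at T = Tm + offset (illustrative values)."""
    T = Tm + offset
    return delta_g_exact(T, Tm, dHm, dCp) - delta_g_linear(T, Tm, dHm)

# The error is dominated by the quadratic term -(dCp / (2*Tm)) * offset**2,
# so halving the distance to Tm reduces it roughly fourfold.
for offset in (-20.0, -10.0, -5.0, 5.0, 10.0, 20.0):
    print(offset, linearisation_error(offset))
```

Under these assumptions the deviation from linearity grows quadratically with the distance from the reference point and is governed by ΔCp, which is why the linear term alone suffices only within a limited range around it.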
Most important for physics is the fact that group-specific thermodynamic parameter spaces depict the only possible values that can be realised by a particular group of similar objects. The rest is void, terra incognita for the group members, unless an object changes its characteristics (structure, composition, etc.), that is, unless it 'dissimilarises' from 'its' group, most likely to join some other one. The definition of a group, that is, how to determine whether a number of individuals belong to the same group or not, seems at first sight worrying or at least not clearly resolved. However, when we think of individuals as being more or less similar to one another, we see that a clear distinction between different groups is not a fundamental issue. Similarity does exist; in the microscopic and macroscopic world it is often a matter of judgement according to some objective, statistically relevant technical signal (at the highest available resolution) or at least a subjective physiological 'measurement' ("I know it when I see it", cf. Graphical Abstract). For microscopic objects such as molecules, one should never be tempted to define a group through a good linear regression coefficient alone; independent knowledge and/or studies are mandatory. For instance, the advantage of studying mutant protein families is not only that a large number of families, and sometimes many congeners within one family, can be analysed. Most importantly, we are also certain that single or even multiple site mutants of the same protein do indeed belong to the same structural group: the mutants are undoubtedly similar to one another. Other molecular systems, such as synthetic host-guest complexes or water clusters, may be less evident in this respect. Still other objects might be even more readily grouped than mutant proteins (cf. Conclusion).
The concept of similarity is intrinsically not readily quantifiable because, intuitively, it seems to be a not very objective 'measurement', at least down to Planckian scales: how similar, and with respect to what exactly?
We are free to group similar objects essentially at will. For example, we can group one set of RNA hairpins into two families, one that bears various all-Watson-Crick pairs and one that contains various single-mismatched base pairs at different positions in the stem, the stem length and loop sequence being the same in both families [6]. We can overlook this subtle difference and treat those hairpins as one group that consists of the same loop sequence and stem length irrespective of single mismatches being present or absent in the stem. The outcome will be a slightly lower linear regression coefficient for this group. It can then be compared to another group of RNA hairpins showing, for example, the same stem length and stem sequence variations but a different loop sequence. We can treat protein mutant families with the same varied degrees of precision/resolution. We could define all known proteins as belonging to the same group and compare it to a more drastically different group of compounds (objects). Nothing prevents us from grouping objects at still lower resolution; the obvious trade-off will be increasingly lower linear regression coefficients. As a matter of fact, there is no a priori objection that we can think of to grouping the entire universe and comparing it to some other one, if it were observable. In principle, one would have to agree upon a set of observables (like energy, entropy and temperature), measure them on a statistically representative number of individual members of what we decide, through some hopefully objective criterion, to call a group, and determine the corresponding group parameters. One would then gain easier access to more members of the same group and also obtain an objective means of comparing this group to another one. In practice, of course, as we embrace more and more dissimilar objects, we will probably obtain increasingly unacceptable linear regression coefficients.
Where this limit of a meaningful group analysis lies remains to be seen.
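The trade-off between group resolution and regression quality can be sketched with a toy calculation. The (Tm, ΔGT) pairs below are invented for illustration and are not taken from any dataset in this work; the offset between the two hypothetical families is deliberately exaggerated so that the effect is visible with only four members each. The helper `pearson_r` is an ordinary Pearson correlation coefficient written out in plain Python.

```python
import math

def pearson_r(xs, ys):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Invented values: Tm in K, free energy in kJ/mol, for two hypothetical families.
tm_a = [320.0, 325.0, 330.0, 335.0]
dg_a = [10.0, 12.6, 14.9, 17.5]
tm_b = [320.0, 325.0, 330.0, 335.0]
dg_b = [20.1, 21.0, 22.2, 22.9]

r_a = pearson_r(tm_a, dg_a)
r_b = pearson_r(tm_b, dg_b)
# Treating both families as one group, i.e. grouping at lower resolution.
r_merged = pearson_r(tm_a + tm_b, dg_a + dg_b)

print(r_a, r_b, r_merged)
```

Each invented family is nearly linear on its own, while the merged set gives a markedly lower coefficient. With real, more closely related families the decrease would be expected to be smaller, in line with the 'slightly lower' coefficient described for the merged RNA hairpin families.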