The model
We have constructed a formal model of the experimental RNA system and used the Gillespie algorithm to simulate its dynamical behavior. An earlier, simpler model was described previously [15]. Here we include more chemical realism in the model and perform a more detailed analysis. To fully specify the model, we first need to define the molecular species, the possible chemical reactions, which molecules catalyze which reactions, and the corresponding reaction rates. We then describe the simulation algorithm in detail.
Molecules
There are three main groups of molecule types. First, there are the RNA fragments (first group). These constitute the “food set” in the context of autocatalytic sets. Second, these RNA fragments associate spontaneously into non-covalent ribozymes (second group). And third, the non-covalent ribozymes are transformed (through catalyzed reactions) into the corresponding covalent ribozymes (third group).
For the purposes of the model, in the first group of molecule types (the RNA fragments), we ignore the specific junction at which the RNA fragments are covalently recombined, and only consider the particular combination of nucleotides in the IGS and tag sequences of the resulting (non-covalent) ribozyme to be important. Since the RNA fragments are oriented (left to right, or 5´ to 3´), we assume there is a “left” (l) fragment and a “right” (r) fragment that associate, each of which can have any of the four possible nucleotides in the IGS/tag sequence. Consequently, there are four possible “left” fragments and four possible “right” fragments, i.e., eight RNA fragments constituting the first group of molecule types (the food set). We label these lM and rN, M,N∈{A,C,G,U}.
Given an association of such a “left” fragment and a “right” fragment, there can be 16 possible resulting non-covalent ribozymes (second group of molecule types), labeled IMN, M,N∈{A,C,G,U}, depending on the relevant nucleotides M and N that are combined.
Finally, there are the 16 corresponding covalent ribozymes (third group of molecule types), labeled EMN, M,N∈{A,C,G,U}. Thus, in total there are 8 (first group) + 16 (second group) + 16 (third group) = 40 molecule types in the model.
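As an illustration, the 40 molecule types can be enumerated in a few lines of Python. This is a minimal sketch; the string labels (for example "lA", "ICU", "EAG") simply mirror the notation above, and the variable names are our own choice for the example.

```python
from itertools import product

NUCLEOTIDES = "ACGU"

# First group (food set): four "left" and four "right" RNA fragments.
left_fragments = [f"l{m}" for m in NUCLEOTIDES]
right_fragments = [f"r{n}" for n in NUCLEOTIDES]
food_set = left_fragments + right_fragments

# Second and third groups: one non-covalent (I) and one covalent (E)
# ribozyme for every combination of nucleotides M and N.
noncovalent = [f"I{m}{n}" for m, n in product(NUCLEOTIDES, repeat=2)]
covalent = [f"E{m}{n}" for m, n in product(NUCLEOTIDES, repeat=2)]

molecule_types = food_set + noncovalent + covalent
print(len(food_set), len(noncovalent), len(covalent), len(molecule_types))
```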
Reactions
There are two main groups of reactions. First, there are the (spontaneous) association reactions transforming two RNA fragments into a non-covalent ribozyme: lM + rN → IMN. There are 16 such reactions (one for each of the 16 possible non-covalent ribozymes). We also include the reverse (dissociation) reactions in the model.
Second, there are the catalyzed recombination reactions that convert a non-covalent ribozyme into a covalent one: IMN → EMN. There are 16 such reactions (one for each of the 16 possible ribozymes), and they can be catalyzed by either a non-covalent or a covalent ribozyme (see below). We also include the reverse reactions in the model. Thus, in total there are 16 + 16 = 32 (bi-directional) reactions in the model.
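The reaction set can be written down in the same way. The following sketch represents each bi-directional reaction as a (reactants, products) pair, with the reverse direction left implicit; this representation is an assumption made for the example only.

```python
from itertools import product

NUCLEOTIDES = "ACGU"
pairs = list(product(NUCLEOTIDES, repeat=2))

# Spontaneous association: lM + rN -> IMN (reverse: dissociation).
association = [((f"l{m}", f"r{n}"), (f"I{m}{n}",)) for m, n in pairs]

# Catalyzed recombination: IMN -> EMN (reverse reaction implied).
transformation = [((f"I{m}{n}",), (f"E{m}{n}",)) for m, n in pairs]

reactions = association + transformation
print(len(association), len(transformation), len(reactions))
```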
Catalysis
The ribozymes (both non-covalent and covalent) catalyze each other’s transformation from non-covalent to covalent. In particular, if a ribozyme EMN (or IMN) has a nucleotide M in its guide sequence that is the base-pair complement of the variable nucleotide N' in the tag sequence of another ribozyme IM'N', then the first ribozyme EMN (IMN) can catalyze the non-covalent to covalent transformation of the second ribozyme IM'N'. So, each ribozyme catalyzes the transformation reaction of four other ribozymes (or possibly of three others and that of itself). There is a difference in rates depending on whether the catalyst is a non-covalent (IMN) or covalent (EMN) ribozyme (see below).
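The catalysis rule can be made concrete with a short sketch. The function name catalyzed_targets is hypothetical; it takes the guide nucleotide M of a catalyst and returns the four non-covalent ribozymes IM'N' whose tag nucleotide N' is the Watson-Crick complement of M.

```python
NUCLEOTIDES = "ACGU"
COMPLEMENT = {"A": "U", "U": "A", "C": "G", "G": "C"}

def catalyzed_targets(guide: str) -> list:
    """Non-covalent ribozymes whose transformation is catalyzed by a
    ribozyme (IMN or EMN) with guide nucleotide M = guide."""
    tag = COMPLEMENT[guide]
    # M' is unconstrained, so there are always four targets.
    return [f"I{m_prime}{tag}" for m_prime in NUCLEOTIDES]

# A catalyst with guide nucleotide A (e.g. EAG) acts on all I?U ribozymes.
print(catalyzed_targets("A"))
```

For a catalyst such as EAU, one of the four targets (IAU) is its own non-covalent form, which corresponds to the "three others and that of itself" case.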
The RNA fragment association reactions (forming non-covalent ribozymes) are spontaneous. In the formal autocatalytic sets framework, spontaneous (non-catalyzed) reactions can by definition never be part of an autocatalytic set. This is because, usually, spontaneous reactions happen at significantly lower rates than catalyzed ones, often too low to be relevant. However, in the RNA system the spontaneous RNA-RNA association reactions actually happen at very high rates because these aggregations are driven by favorable base-pairing and tertiary interactions, which are numerous in the Azoarcus ribozyme [22]. In the formal framework, we can incorporate such spontaneous but high-rate reactions by assuming that they are actually catalyzed by a (fictional) “generic catalyst” which is also part of the food set. Here, this assumption is made implicitly; the generic catalyst is not explicitly included in the model.
Rates
Relative rates for the transformation reactions were obtained experimentally by steady-state kinetic analyses of representative matching (i.e., Watson-Crick) and non-matching IGS-tag relationships in RNA fragments [10, 21]. These rates were then rescaled to make one time unit in the simulation correspond roughly to one hour in the real experiments. In these experiments, at the end of each transfer step (one hour), close to 20% of the solution has been transformed into covalent ribozymes [10]. Using this as a target, the experimentally obtained rates were rescaled (while maintaining their relative ratios) such that after one time unit in the simulation, close to 20% of the solution also consists of covalent (EMN) ribozymes.
The transformation reaction rates depend on the particular base-pair combination of the relevant nucleotide in the IGS of the catalyst and that of the tag sequence in the reactant. The following (rescaled) rates were obtained for the four possible Watson-Crick base-pair combinations (ordered from high to low rates):
AU: 0.00613, CG: 0.00541, UA: 0.00517, and GC: 0.00445.
For example, the transformation reaction ICU → ECU catalyzed by the covalent ribozyme EAG has a rate of 0.00613 (catalyst/reactant base-pair combination AU).
The rates of non-covalent ribozyme catalyzed transformation reactions are set to half the rates of the corresponding covalent ribozyme catalyzed reactions, as observed in the laboratory experiments [10]. The rates of the spontaneous association reactions in the simulation model are set to 0.00006 (same rate for each of the 16 possible association reactions). Finally, the rates for all reverse reactions were set to 1/10th the rate of their corresponding forward reactions.
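The rate rules above can be collected into a small lookup function. This is a sketch only: the function name and the reverse flag are hypothetical, and returning a rate of zero for non-Watson-Crick catalyst/reactant pairs is our assumption, based on the catalysis rule stated earlier rather than on a listed rate.

```python
# Rescaled rates for covalent catalysts, keyed by
# (catalyst guide nucleotide) + (reactant tag nucleotide).
COVALENT_RATE = {"AU": 0.00613, "CG": 0.00541, "UA": 0.00517, "GC": 0.00445}
ASSOCIATION_RATE = 0.00006   # same for all 16 association reactions
REVERSE_FACTOR = 0.1         # reverse rate = 1/10 of the forward rate

def transformation_rate(catalyst: str, reactant: str, reverse: bool = False) -> float:
    """Rate of reactant (IM'N') -> EM'N' catalyzed by catalyst (IMN or EMN)."""
    pair = catalyst[1] + reactant[2]          # guide M + tag N'
    rate = COVALENT_RATE.get(pair, 0.0)       # assumption: no catalysis otherwise
    if catalyst[0] == "I":                    # non-covalent catalyst: half rate
        rate *= 0.5
    if reverse:
        rate *= REVERSE_FACTOR
    return rate

print(transformation_rate("EAG", "ICU"))   # covalent catalyst, AU pair
print(transformation_rate("IAG", "ICU"))   # non-covalent catalyst, AU pair
```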
Simulation
To simulate the dynamics of the above chemical reaction system, we used the Gillespie algorithm [23, 24] assuming a closed reaction vessel (with volume 1). Starting with an initial amount of 2000 of each of the 8 food molecule types (i.e., a total amount of 16,000 molecules), the algorithm is run for a given number of time units t. We performed two types of simulations: (i) with and (ii) without transfer steps.
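For readers unfamiliar with the method, the following is a minimal sketch of the Gillespie direct method, not the simulation code used for the results reported here. It assumes mass-action propensities with volume 1, reactions given as (reactants, products, rate) tuples with distinct reactant species, and a catalyst modeled as a species that appears on both sides of a reaction; all function and variable names are hypothetical. The toy example uses a single association reaction and its reverse.

```python
import random

def propensity(counts, reactants, rate):
    """Mass-action propensity (volume 1, distinct reactant species)."""
    a = rate
    for species in reactants:
        a *= counts[species]
    return a

def simulate(counts, reactions, t_end, rng):
    """Run the Gillespie direct method until time t_end."""
    t = 0.0
    while True:
        props = [propensity(counts, r, k) for r, _, k in reactions]
        total = sum(props)
        if total == 0:
            break                              # nothing can fire
        t += rng.expovariate(total)            # exponential waiting time
        if t > t_end:
            break                              # next event falls after t_end
        # Pick one reaction with probability proportional to its propensity.
        reactants, products, _ = rng.choices(reactions, weights=props)[0]
        for s in reactants:
            counts[s] -= 1
        for s in products:
            counts[s] += 1
    return counts

rng = random.Random(0)
counts = {"lA": 2000, "rU": 2000, "IAU": 0}
reactions = [
    (("lA", "rU"), ("IAU",), 0.00006),     # association
    (("IAU",), ("lA", "rU"), 0.000006),    # dissociation (1/10 of forward)
]
simulate(counts, reactions, t_end=1.0, rng=rng)
print(counts)
```

A catalyzed step would be written, under the same assumptions, as (("ICU", "EAG"), ("ECU", "EAG"), 0.00613), so that the propensity scales with the number of catalyst molecules while the catalyst itself is conserved.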
In a simulation without transfer steps, the Gillespie algorithm is simply run for the given number of time units. In a simulation with transfer steps, the solution is first diluted to 10% at the end of each time unit. Rather than simply reducing the concentration of each molecule type to 10%, this dilution is done by randomly drawing molecules (without replacement) from the solution with probabilities according to their relative current concentrations, until 10% of the total concentration is reached. Next, from this diluted solution a random sample (with replacement) of 75 molecules is taken from among the 16 EMN types. Each EMN type that occurs twice or more in this random sample is reported. Finally, the diluted solution is replenished with the 8 food molecule types (in equal concentrations) until the initial total concentration of 16,000 is reached again. These transfer steps are then repeated for the given number of time units. This provides a detailed simulation of the original laboratory serial transfer experiments [10], which can be repeated an arbitrary number of times.
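A single transfer step as described above might be sketched as follows. Several details are assumptions made for illustration: the function name and parameters are hypothetical, totals are counted in molecules (a ribozyme counts as one molecule), and the 75-molecule sample is weighted by the current counts of the EMN types in the diluted solution.

```python
import random
from collections import Counter

def transfer_step(counts, food, rng, keep=0.10, sample_size=75, total=16000):
    # 1. Dilution: draw 10% of all molecules without replacement.
    pool = [s for s, n in counts.items() for _ in range(n)]
    kept = Counter(rng.sample(pool, round(keep * len(pool))))

    # 2. Sample 75 covalent ribozymes with replacement; report the
    #    EMN types that occur at least twice.
    covalent = [s for s in kept if s.startswith("E")]
    reported = []
    if covalent:
        weights = [kept[s] for s in covalent]
        sample = Counter(rng.choices(covalent, weights=weights, k=sample_size))
        reported = sorted(s for s, n in sample.items() if n >= 2)

    # 3. Replenish with the food types in (near-)equal amounts.
    per_type, extra = divmod(total - sum(kept.values()), len(food))
    for i, s in enumerate(food):
        kept[s] += per_type + (1 if i < extra else 0)
    return dict(kept), reported

food = [f"{side}{n}" for side in "lr" for n in "ACGU"]
counts = {s: 1500 for s in food}
counts.update({"EAU": 3000, "ECG": 1000})      # toy state, 16,000 molecules
new_counts, reported = transfer_step(counts, food, random.Random(1))
print(sum(new_counts.values()), reported)
```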
As a consequence of computational limitations, the total number of molecules used (16,000) is relatively small, and clearly does not compare to the micromolar concentrations in the real experiments. However, to get an idea of how the system’s behavior scales with larger numbers of molecules, we also performed a number of runs of the same simulation but with two orders of magnitude more food molecules (starting with 200,000 molecules of each food type, or a total of 1.6E6 molecules, and appropriately adjusting the time units to maintain a close to 20% conversion to EMN ribozymes after one time unit). Even though this is still a small number compared to the laboratory experiments, we argue that in an origin-of-life context the actual concentrations were most likely significantly lower than micromolar, giving more relevance to the results presented here.