 Research article
 Open Access
Autocatalytic sets extended: Dynamics, inhibition, and a generalization
Journal of Systems Chemistry volume 3, Article number: 5 (2012)
Abstract
Background
Autocatalytic sets are often considered a necessary (but not sufficient) condition for the origin and early evolution of life. Although the idea of autocatalytic sets was already conceived of many years ago, only recently have they gained more interest, following advances in creating them experimentally in the laboratory. In our own work, we have studied autocatalytic sets extensively from a computational and theoretical point of view.
Results
We present results from an initial study of the dynamics of self-sustaining autocatalytic sets (RAFs). In particular, simulations of molecular flow on autocatalytic sets are performed, to illustrate the kinds of dynamics that can occur. Next, we present an extension of our (previously introduced) algorithm for finding autocatalytic sets in general reaction networks, which can also handle inhibition. We show that in this case detecting autocatalytic sets is fixed-parameter tractable. Finally, we formulate a generalized version of the algorithm that can also be applied outside the context of chemistry and origin of life, which we illustrate with a toy example from economics.
Conclusions
Having shown theoretically (in previous work) that autocatalytic sets are highly likely to exist, we conclude here that also in terms of dynamics such sets are viable and outcompete non-autocatalytic sets. Furthermore, our dynamical results confirm arguments made earlier about how autocatalytic subsets can enable their own growth or give rise to other such subsets coming into existence. Finally, our algorithmic extension and generalization show that more realistic scenarios (e.g., including inhibition) can also be dealt with within our framework, and that it can even be applied to areas outside of chemistry, such as economics.
Background
The idea of collectively autocatalytic sets has been introduced more or less independently several times[1–3], and was subsequently used in a number of origin of life models[4–7]. Recent experimental advances in creating such sets in the laboratory[8–11] have generated a renewed interest in autocatalytic sets. Moreover, there is growing evidence that simple autocatalytic cycles may indeed have been at the core of the origin of life[12].
In our own work, we have studied autocatalytic sets extensively from a computational and theoretical point of view[13–19]. We briefly review some of the main definitions and results here. First, we define a chemical reaction system (CRS) as a tuple $\mathcal{Q}=\{X,\mathcal{R},C\}$ consisting of a set of molecule types X, a set of reactions $\mathcal{R}$ (transforming reactants to products), and a catalysis set C indicating which molecule types catalyze which reactions. We also include the notion of a food set F ⊂ X of molecule types assumed to be freely available from the environment. In a particular model of a CRS, known as the binary polymer model[1, 20, 21], molecule types are represented as bit strings up to a certain length n, reactions are simply ligation and cleavage, and catalysis is assigned at random according to some parameter p (the probability that a given molecule type catalyzes a given reaction). The food set consists of all molecule types up to a certain length t ≪ n.
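The binary polymer model described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used in the cited work: the function name `binary_polymer_model`, the tuple layout of a reaction, and the use of a seeded random generator are all hypothetical choices, and each ligation is taken to stand for its reverse cleavage as well.

```python
import itertools
import random

def binary_polymer_model(n, t, p, seed=0):
    """Build molecule types, food set, ligation reactions, and random catalysis."""
    rng = random.Random(seed)
    # All bit strings of length 1..n.
    molecules = ["".join(bits) for length in range(1, n + 1)
                 for bits in itertools.product("01", repeat=length)]
    # Food set: all molecule types of length at most t.
    food = {m for m in molecules if len(m) <= t}
    # One ligation (left, right, product) per way of splitting a product in two;
    # the corresponding cleavage is the same reaction run backwards.
    reactions = [(m[:i], m[i:], m) for m in molecules for i in range(1, len(m))]
    # Each (molecule, reaction) pair becomes a catalysis assignment with probability p.
    catalysis = {(x, r) for x in molecules for r in reactions if rng.random() < p}
    return molecules, food, reactions, catalysis

molecules, food, reactions, catalysis = binary_polymer_model(n=3, t=2, p=0.1)
print(len(molecules), len(food), len(reactions))  # 14 6 20
```

For n = 3 there are 2 + 4 + 8 = 14 molecule types, and 4·1 + 8·2 = 20 ligation reactions, which is what the final line prints.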
Informally, an autocatalytic set that is self-sustaining (or an RAF set, in our terminology) is now defined as a subset $\mathcal{R}'\subseteq\mathcal{R}$ of reactions (and associated molecule types) in which:

1. each reaction $r\in\mathcal{R}'$ is catalyzed by at least one molecule type involved in $\mathcal{R}'$;
2. all reactants in $\mathcal{R}'$ can be produced from the food set F by using a series of reactions only from $\mathcal{R}'$ itself.
A formal definition is provided in[14, 17], where we also introduced a polynomial-time (in the size of the reaction set $\mathcal{R}$) algorithm for finding RAF sets in a general CRS. Note that our framework is somewhat different from that of[22], for which it was shown that maximizing the output flow and recognizing autocatalysis is NP-complete.
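To make the two conditions concrete, the following sketch shows one common way such a reduction can be organized: repeatedly compute what is producible from the food set, then discard reactions that lack an available reactant or an available catalyst. It is a hedged illustration, assuming a reaction is a tuple `(reactants, products, catalysts)`; the names `closure` and `max_raf` are hypothetical, and the details of the algorithm in[14, 17] may differ.

```python
def closure(food, reactions):
    """Molecules producible from the food set using only the given reactions."""
    available = set(food)
    changed = True
    while changed:
        changed = False
        for reactants, products, _ in reactions:
            if set(reactants) <= available and not set(products) <= available:
                available |= set(products)
                changed = True
    return available

def max_raf(food, reactions):
    """Return the maximal RAF subset of `reactions` (empty list if none exists)."""
    current = list(reactions)
    while True:
        available = closure(food, current)
        # Keep reactions whose reactants are all available and that have
        # at least one available catalyst.
        kept = [r for r in current
                if set(r[0]) <= available and any(c in available for c in r[2])]
        if len(kept) == len(current):
            return kept
        current = kept

food = {"00", "01", "10", "11"}
r1 = (("00", "01"), ("0001",), ("1011",))   # catalyzed by 1011
r2 = (("10", "11"), ("1011",), ("0001",))   # catalyzed by 0001
r3 = (("00", "11"), ("0011",), ())          # uncatalyzed
r4 = (("01", "10"), ("0110",), ())          # uncatalyzed
print(max_raf(food, [r1, r2, r3, r4]) == [r1, r2])  # True
```

Each pass either removes at least one reaction or terminates, so the number of passes is bounded by the number of reactions.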
One of our main results is that autocatalytic sets are highly likely to exist, even at very moderate levels of catalysis. For example, in the binary polymer model, each molecule needs to catalyze only between one and two reactions, on average, for RAF sets to emerge with high probability[14, 15]. Also, more realistic assumptions, such as template-based catalysis (as opposed to merely random catalysis) can be built into the framework easily. In this case, a molecule can only act as a catalyst if it matches (somewhere along its length) a template made up of several bits around the reaction site (which actually prevents the smallest molecules from being catalysts). However, this restriction does not significantly change the main results[17]. In fact, required levels of catalysis for RAF sets to form in the template-based model can be predicted analytically from the (known) required levels in the base (random) model[18]. And finally, RAF sets can often be decomposed into smaller RAF subsets (possibly even exponentially many), which can provide a mechanism for the evolvability of autocatalytic sets[19, 23].
Here, we continue our studies of autocatalytic sets with various extensions of our framework. First, we investigate actual dynamics of autocatalytic sets. We present some initial but insightful results from simulating molecular flow on RAF sets. Next, we present an extension of our algorithm for detecting autocatalytic sets when inhibition is also considered, i.e., molecules that can potentially prevent a reaction from happening. In an earlier paper we proved that the general problem of detecting autocatalytic sets when inhibition is present is NP-complete[15]. However, here we show that the problem is actually fixed-parameter tractable, i.e., if the number of inhibiting molecules is not too large, autocatalytic sets (or their absence) can still be determined in polynomial time. Finally, in a recent paper we speculated about a generalized theory of autocatalytic sets beyond the context of chemistry and origin of life[19]. Here, we make a first concrete step in this direction by formulating a generalized version of our RAF algorithm which does not depend on the specifics of chemistry (i.e., molecules and reactions), and can be applied in a more general setting. These results are presented, in three parts, in the following section.
Results and discussion
Part I: Dynamics
In our work so far, we have mostly looked at autocatalytic sets in terms of their graph theoretical properties. However, this has ignored dynamics, i.e., actual molecular flow on autocatalytic sets. Here, we fill this gap by presenting initial results on studying the dynamics of RAF sets. In particular, we provide two examples, a constructed one and a realistic one, to show several aspects of the molecular flow that (can) occur. To a large degree, these dynamical results confirm what had already been analyzed, concluded, and speculated in our earlier (structural) studies, but they also shed some new light on autocatalytic sets and their behavior. Note that a related dynamical study was reported recently[24], although here we focus more directly on the actual molecular flow on RAF sets themselves.
A constructed example
Consider the simple chemical reaction system (CRS) $\mathcal{Q}=\{X,\mathcal{R},C\}$ within the binary polymer model, of which the reaction graph is shown in Figure 1, and with a food set F = {00,01,10,11}. This CRS consists of four reactions, each one being a bidirectional ligation/cleavage reaction, either combining two food molecules into a unique molecule of length four (in the “forward”, or ligation reaction), or splitting up a molecule of length four into two food molecules (in the “backward”, or cleavage reaction). The two reactions at the top are mutually catalyzed by each other's ligation product, and form a 2-reaction autocatalytic (RAF) set. The two bottom reactions are not catalyzed, and are thus not part of any RAF set. However, these two sets of reactions (the top RAF one and the bottom non-RAF one) compete with each other for the food molecules.
Using the Gillespie algorithm[25, 26], we simulate the flow of molecules on this constructed reaction graph. Food molecules are assumed to be always available, and are kept at a minimum concentration of five molecules each (i.e., if after one of the ligation reactions the concentration of a food molecule has dropped below five, it is immediately replenished). One rationale for this is that the reaction system can be assumed to be “contained” inside some compartment, for example a lipid layer[27] or simply naturally occurring cavities in the soil[28, 29]. So, even though the food molecules are in “unlimited” supply in the environment, they still need to be taken up and brought inside the compartment to be used as reactants.
The presence of a catalyst increases the probability that a reaction will happen in direct proportion to the catalyst’s current concentration. However, with this constructed example we are specifically interested in the effects of autocatalysis, and we ignore the fact that a catalyst normally also increases the basic reaction rate. So, for this example, the reaction rates of catalyzed and uncatalyzed reactions are kept equal (at k = 1, in arbitrary units) for all reactions (we relax this assumption again in the more realistic example in the next subsection). The volume is also set to V = 1 (arbitrary units).
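The setup described so far can be sketched as a small stochastic simulation. The following is a hedged illustration, not the authors' implementation: it treats only the ligation direction, holds each food molecule at five copies (a simplification of the replenishment rule), and assumes a propensity of k·a·b·(1 + c)/V for a catalyzed reaction with catalyst count c, where the "+1" is an added assumption so that the first ligation can occur before any catalyst exists. The function name and its parameters are hypothetical.

```python
import random

def gillespie_ligation(events, k=1.0, volume=1.0, food_level=5, seed=1):
    """Ligation-only stochastic simulation of the four-reaction example."""
    rng = random.Random(seed)
    # (product, catalyst); catalyst None means the reaction is uncatalyzed.
    reactions = [("0001", "1011"), ("1011", "0001"), ("0011", None), ("0110", None)]
    counts = {product: 0 for product, _ in reactions}
    time = 0.0
    for _ in range(events):
        # Food is replenished to `food_level`, so both reactants sit at that count.
        base = k * food_level * food_level / volume
        # Assumed rate law: the catalyst count scales the propensity.
        props = [base * (1 + counts[cat]) if cat else base for _, cat in reactions]
        total = sum(props)
        time += rng.expovariate(total)  # exponential waiting time to the next event
        pick = rng.choices(range(len(reactions)), weights=props)[0]
        counts[reactions[pick][0]] += 1
    return time, counts

elapsed, counts = gillespie_ligation(2000)
print(elapsed, counts)
```

Under these assumptions the two mutually catalyzed products should quickly dominate the counts, while the uncatalyzed products accumulate at a roughly constant rate in simulated time.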
To confirm that the simulation produces correct results, we first consider the reactions as unidirectional ligation reactions only. In this case, we expect a linear growth rate over time in the concentrations of the products 0011 and 0110 of the bottom two (non-RAF) reactions, but an exponential growth rate in the concentrations of the products of the top two (RAF) reactions, given that they form an autocatalytic set. Figure 2 shows the results, and indeed confirms this expectation (note that the y-axis is on a log-scale, so the exponential growth shows as a straight line). Since this is a simple model setting, the time units (x-axis) are arbitrary.
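The expected linear versus exponential growth can be seen from a simple deterministic (mass-action) approximation. This is a sketch under assumptions that are not stated in the text: food concentrations are held at a constant value $\phi$, the rate of a catalyzed ligation is proportional to the catalyst concentration, and the symbols $\phi$, $x$, $y_1$, $y_2$ and $s$ are introduced here for illustration only. For a non-RAF product with concentration $x$,

$$\frac{dx}{dt}=k\phi^2\quad\Rightarrow\quad x(t)=x(0)+k\phi^2t,$$

which is linear in time. For the two mutually catalyzed RAF products with concentrations $y_1$ and $y_2$,

$$\frac{dy_1}{dt}=k\phi^2y_2,\qquad\frac{dy_2}{dt}=k\phi^2y_1,$$

so that the sum $s=y_1+y_2$ satisfies

$$\frac{ds}{dt}=k\phi^2s\quad\Rightarrow\quad s(t)=s(0)\,e^{k\phi^2t},$$

provided $s(0)>0$. On a logarithmic axis, $\log s(t)=\log s(0)+k\phi^2t$ is a straight line.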
Next, we consider the full system, including the “backward” (cleavage) reactions. In this case, the molecule concentrations cannot grow without limit, as they start breaking down at a rate proportional to their concentration. So, one would expect them to reach some equilibrium distribution. Figure 3 shows the result (simulating 10,000 reaction events). As expected, the molecular concentrations do indeed seem to reach an equilibrium distribution (instead of unlimited growth as with the unidirectional reactions in Figure 2). However, the two reactions forming an RAF set still have a large advantage over the two non-RAF reactions. The growth rate in concentrations of the molecules 0001 and 1011 (red and green lines) is much higher (until it levels off) than that of the molecules 0011 and 0110 (blue and purple lines). Also, the RAF set is able to maintain a much higher concentration of its ligation products than the non-RAF set. The light blue line shows the concentration of one of the food molecules over time (for reference). The concentrations of the other food molecules are similar due to the symmetry in the system.
This result clearly shows that the advantage of RAF sets over non-RAF sets is due to the particular, catalytically closed, structure of an RAF set. Even if uncatalyzed reactions have the same (basic) reaction rate as catalyzed reactions, as in this simulation, RAF sets still outcompete non-RAF sets due to the self-reinforcing autocatalytic feedback. However, the equilibrium distribution that is reached does depend largely on the ratio of the reaction rates between the ligation and the cleavage reactions. If this ratio is large enough, the concentrations of the product molecules can be maintained at a high level, as in Figure 3. However, reducing this ratio causes the level of the equilibrium concentrations to drop, until at some point there is no advantage anymore for the RAF set over the non-RAF set. Figure 4 shows such a situation (again simulating 10,000 reaction events, but setting V = 5, which effectively reduces the mentioned reaction rate ratio by a factor of 5).
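The effect of the volume on this ratio follows from the usual volume scaling of stochastic reaction propensities. As a sketch, assuming standard scaling and ignoring the catalyst factor (the symbols $a_{\text{lig}}$, $a_{\text{cl}}$, $n_a$, $n_b$ and $n_p$ are introduced here for illustration), a bimolecular ligation of reactants with molecule counts $n_a$ and $n_b$, and a unimolecular cleavage of a product with count $n_p$, have propensities

$$a_{\text{lig}}=\frac{k\,n_a n_b}{V},\qquad a_{\text{cl}}=k\,n_p,\qquad\text{so that}\qquad\frac{a_{\text{lig}}}{a_{\text{cl}}}=\frac{n_a n_b}{V\,n_p}.$$

At fixed molecule counts, changing V from 1 to 5 therefore divides the ligation-to-cleavage ratio by 5.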
A realistic example
Next, we consider an example of an actual autocatalytic (RAF) set that was found by our RAF algorithm in an instance of the binary polymer model with n = 5, t = 2, and p = 0.0045 (with these parameter values, there is a probability of $P_n=0.5$ that a model instance contains an RAF set). Figure 5 shows this RAF set, which consists of eight bidirectional (ligation/cleavage) reactions. The food set is F = {0,1,00,01,10,11}.
This maximal RAF set actually consists of several RAF subsets (in[19] we show formally how RAF sets can be decomposed into, possibly exponentially many, RAF subsets). First there are two simple (1-reaction) irreducible RAF sets contained inside the yellow and purple boxes, respectively. Given that their reactants and catalysts are all food molecules, these subRAFs will always be present. Then there is the 3-reaction subRAF contained inside the red box. This subRAF actually includes the purple (1-reaction) irrRAF, but can only “grow” into the full 3-reaction red subRAF once molecule type 1010 is present. This molecule type catalyzes its own ligation from two instances of the food molecule 10, so this reaction will have to happen spontaneously (uncatalyzed) first, before the red subRAF can come into full self-sustaining existence (in fact, this reaction is actually an irrRAF in itself, but for the purposes of the dynamical analysis here, we do not consider it separately as such, as it immediately gives rise to the full red subRAF as soon as it comes into existence). Next, there is the 3-reaction irreducible RAF set contained inside the blue box. This blue subRAF also needs to be seeded, by one of the three reactions happening uncatalyzed (or one of the required molecules coming from elsewhere). Finally, there is the reaction contained in the green box, which strictly speaking is not an RAF by itself, but once molecule type 111 (produced by the blue subRAF) is available, it can become an “extension” of the blue subRAF. However, since the green reaction is catalyzed by its own product, it also needs to happen spontaneously at least once, before it can maintain its own existence autocatalytically.
Using the Gillespie algorithm again, we now study the molecular flow on this maximal RAF of Figure 5. In this simulation, we do distinguish between the reaction rates of catalyzed and uncatalyzed reactions, to show the effect of some of the subRAFs needing to be seeded by spontaneous reactions. In particular, if for a given reaction the reactants are present but not the catalyst, the reaction can still go ahead uncatalyzed, but at a reduced rate. For the sake of the simulation, we used a small reduction factor of 20 (i.e., k = 0.05 for uncatalyzed reactions and k = 1, as before, for catalyzed reactions). A higher, more realistic, factor is of course possible, but does not change the qualitative results, and simply means we need somewhat larger timescales to observe similar behavior. Figure 6 shows the concentrations over time (simulating 25,000 reaction events this time) for the (ligation) products of the eight reactions making up the maximal RAF set.
The dynamics of the molecular concentrations are a direct reflection of the particular structure of the maximal RAF set in terms of its subRAFs. First of all, the concentrations of the products of the two 1-reaction irrRAFs (indicated, as in Figure 5, with yellow and purple lines, respectively), immediately start growing at a steady rate (although not exponentially, as they are catalyzed by food molecules, which remain in relatively low concentrations). However, the other subRAFs all need to be seeded by a spontaneous reaction. The first such event happens around time 0.3, when one of the reactions in the blue subRAF happens uncatalyzed. But once this has happened, the blue subRAF as a whole can come into existence and grow in concentration. Note that the two product types 010 (solid blue line) and 11100 (dashed blue line) immediately grow rapidly in concentration, but 111 (dotted blue line) has a damped growth, as it is also used again as a reactant.
The next spontaneous event happens around time 0.5. Recall that around time 0.3 the molecule type 111 came into existence, but for the green reaction to become an extension of the blue subRAF, it will still need to happen uncatalyzed at least once (given that it is catalyzed by its own product). However, when this happens (around time 0.5), the concentration of its product type 01111 (green line), supported by a product of the blue subRAF, immediately starts to grow rapidly. Finally, a last required spontaneous event happens around time 0.55, when molecule type 1010 is created, which then gives rise to the red subRAF coming into full existence (given that the purple irrRAF it contains was already present).
Some additional observations can be made about these dynamics. First, molecule type 00100 (dashed red line), a product of the red subRAF, was actually already present before the full red subRAF came into existence, as a result of spontaneous (uncatalyzed) reactions. However, its concentration only really starts growing once molecule type 1010 (its catalyst; solid red line) is present. Next, the concentration of the product of the purple irrRAF (100, purple line) starts decreasing again as soon as the red subRAF comes into existence, as this molecule type is used as a reactant within the red subRAF. And finally, note that the three molecule types that seem to grow in concentration without limit (00100, 11100, and 01111) are the ones that actually have a nonfood molecule as one of their building blocks (reactants). Food molecules remain present in relatively low concentrations (although they are replenished when they fall below a concentration of five), but nonfood molecules reach higher concentrations, and thus increase, in direct proportion to their concentration, the rate at which reactions that use them as reactants will happen. However, at some point the growth of these three molecule types also levels off, because of the backward (cleavage) reactions happening more and more often as well (similar to what happens in Figure 3); for readability of the graph, though, concentrations above 100 molecules are not shown in Figure 6.
The reason we have used a stochastic dynamical simulation here (instead of solving a set of ODEs) is that we are specifically interested in the transient behavior of the system, i.e., how subRAFs come into existence and (sometimes) depend on each other. Looking at the equilibrium distribution resulting from the corresponding ODEs does not provide this information. Furthermore, we have shown only one particular instance (realization) of the simulation model in Figure 6. Other realizations show very similar behaviors overall, except that the waiting times and order in which the various subRAFs come into existence may differ between simulation runs (due to their stochastic nature). However, averaging the concentrations over many runs would not show these specific behaviors of interest, so we have chosen to show one particular instance as a representative for a whole set of simulations.
These initial results are, of course, only a first step towards a more complete study of the dynamics of autocatalytic sets. However, they already provide some very useful and interesting insights into the kinds of dynamics one can observe in RAF sets, and also confirm some of the claims made recently on how subRAFs can enable their own growth and each other's coming into existence[19]. Moreover, there are many directions in which such a dynamical analysis can be extended. For example, one can consider having autocatalytic (sub)sets enclosed in different compartments, able to grow and reproduce (once a threshold concentration of certain molecule types is reached). Variation can then be introduced by only passing on a (perhaps random) subset of the molecules from the parent to the offspring, i.e., offspring compartments can possibly have different combinations of existent subRAFs, enabling an evolutionary process to happen[23]. As another example, one can ask what will happen if there are inhibitors present in the system, i.e., molecules that can actually prevent a reaction from happening. In the next section, we describe an extension of our RAF algorithm for dealing with such a situation.
Part II: Inhibition
Given a chemical reaction system $\mathcal{Q}=(X,\mathcal{R},C)$ with food set F, suppose we have a collection $(X_1,\mathcal{R}_1),(X_2,\mathcal{R}_2),\dots,(X_k,\mathcal{R}_k)$ where $X_i\subset X$ and $\mathcal{R}_i\subset\mathcal{R}$. The interpretation of the pair $(X_i,\mathcal{R}_i)$ is that every molecule $x\in X_i$ inhibits every reaction $r\in\mathcal{R}_i$. Notice that any pattern of inhibition can be represented this way, for example by numbering the reactions, and taking $\mathcal{R}_i=\{r_i\}$ and $X_i$ to be the set of molecules that inhibit $r_i$ (or we may number the molecules, and take $X_i=\{x_i\}$ and $\mathcal{R}_i$ to be the set of reactions inhibited by $x_i$). We wish, however, to consider ‘types’ of molecules that will inhibit ‘types’ of reactions so that k can be chosen to be not too large.
We say that a subset $\mathcal{R}'\subseteq\mathcal{R}$ forms an uninhibited RAF, or more briefly a uRAF, if $\mathcal{R}'$ is an RAF (in the usual sense) and $\mathcal{R}'$ contains no reaction that is inhibited by any molecule that is involved in $\mathcal{R}'$. For a more formal definition, let $\text{supp}(\mathcal{R}')$ denote the support of $\mathcal{R}'$, that is, the set of molecules that are either reactants or products of reactions in $\mathcal{R}'$ (this is the same as the union of the set of molecules in F that are reactants of reactions in $\mathcal{R}'$, and the set of products of reactions in $\mathcal{R}'$). Uninhibited RAFs are now defined more formally as follows.
Definition
Given a chemical reaction system $\mathcal{Q}=(X,\mathcal{R},C)$ with food set F, a subset $\mathcal{R}'$ of $\mathcal{R}$ is a uRAF if
(u1) $\mathcal{R}'$ is an RAF.
(u2) $\mathcal{R}'\cap\mathcal{R}_i\ne\varnothing\Rightarrow\text{supp}(\mathcal{R}')\cap X_i=\varnothing.$
Note that if a set of reactions $\mathcal{R}'$ satisfies (u2), then so does every subset of $\mathcal{R}'$ (a subset can only meet fewer of the sets $\mathcal{R}_i$, and its support is contained in $\text{supp}(\mathcal{R}')$); this implies that any subset of a uRAF that is an RAF is also a uRAF. □
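Condition (u2) is easy to check directly. The sketch below is a hedged illustration that assumes a reaction is a hashable tuple `(reactants, products, catalysts)` and that each inhibition pair is given as `(X_i, R_i)` with both entries as Python sets; the function names are hypothetical. It checks only (u2), not the RAF property (u1).

```python
def support(reactions):
    """Molecules that are reactants or products of the given reactions."""
    return {m for reactants, products, _ in reactions for m in (*reactants, *products)}

def satisfies_u2(subset, inhibition_pairs):
    """Check (u2): if the subset meets R_i, its support must avoid X_i."""
    supp = support(subset)
    for molecules, inhibited in inhibition_pairs:
        meets_inhibited = any(r in inhibited for r in subset)
        if meets_inhibited and supp & set(molecules):
            return False
    return True

r1 = (("00", "01"), ("0001",), ("00",))
r2 = (("10", "11"), ("1011",), ("10",))
pairs = [({"1011"}, {r1})]               # molecule 1011 inhibits reaction r1
print(satisfies_u2([r1, r2], pairs))     # False: r1 is present and 1011 is produced
print(satisfies_u2([r1], pairs))         # True: the inhibitor 1011 is not in the support
```

The second call illustrates the note above: a subset of a set of reactions can satisfy (u2) even when the larger set does not, while the converse direction (subsets of a set satisfying (u2)) always holds.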
Determining whether a CRS contains a uRAF was shown to be an NP-complete problem in[15]. However, here we show that the problem is fixed-parameter tractable in the parameter k. So, provided k is not too large, we can still find uRAFs in a CRS efficiently (or determine that a uRAF does not exist).
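We first require some additional definitions. Let $[k]:=\{1,\dots,k\}$, and for any subset J of [k], let
$$\mathcal{R}_J:=\mathcal{R}-\bigcup_{j\in J}\mathcal{R}_j,\qquad(1)$$
and let
$$\mathcal{R}^J:=\{r\in\mathcal{R}:\text{supp}(r)\cap X_i=\varnothing\ \text{for all}\ i\in[k]-J\}.\qquad(2)$$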
In the following theorem, the set $\mathcal{R}^J\cap\mathcal{R}_J$ plays a prominent role (where J is a subset of [k]); this is precisely the set of reactions r in $\mathcal{R}$ for which (i) r does not belong to $\mathcal{R}_j$ for any $j\in J$ and (ii) if $r\in\mathcal{R}_{j'}$ (for some $j'\notin J$) then none of the molecules in the support of r lie in $X_{j'}$. Recall from[19] that for any subset $\mathcal{R}^\ast$ of reactions in $\mathcal{R}$, $s(\mathcal{R}^\ast)$ is the maximal subRAF contained within $\mathcal{R}^\ast$ (as computed by our RAF algorithm) or the empty set if no such subRAF of $\mathcal{R}^\ast$ exists. We can now state our first theorem.
Theorem 1
Given a chemical reaction system $\mathcal{Q}=(X,\mathcal{R},C)$ with food set F, the following assertions hold:
(i) For any subset J of [k], if $s(\mathcal{R}^J\cap\mathcal{R}_J)$ is nonempty, then it is a uRAF.
(ii) If $\mathcal{R}'\subseteq\mathcal{R}$ is a uRAF, then $\mathcal{R}'\subseteq\mathcal{R}^J\cap\mathcal{R}_J$ where
$$J:=\{j\in[k]:\text{supp}(\mathcal{R}')\cap X_j\ne\varnothing\}.\qquad(3)$$
(iii) The set of maximal uRAFs is precisely the collection of all nonempty subsets of $\mathcal{R}$ of the form $s(\mathcal{R}^J\cap\mathcal{R}_J)$ as J ranges over subsets of [k].
Proof
For part (i) we know that if $s(\mathcal{R}^J\cap\mathcal{R}_J)$ is nonempty, then it is an RAF (from[14]); thus it suffices to verify property (u2) in the definition of a uRAF above for the set $\mathcal{R}^\ast:=\mathcal{R}^J\cap\mathcal{R}_J$, which implies that $s(\mathcal{R}^\ast)$ will also satisfy property (u2), since it is a subset of $\mathcal{R}^\ast$.
Suppose, to the contrary, that property (u2) in the definition of a uRAF is violated by $\mathcal{R}^\ast$; then we can derive a contradiction as follows. For some $i\in[k]$ we must have:
$$\mathcal{R}^\ast\cap\mathcal{R}_i\ne\varnothing\quad\text{and}\quad\text{supp}(\mathcal{R}^\ast)\cap X_i\ne\varnothing.\qquad(4)$$
In particular, there exists a reaction, say $r_1$, in $\mathcal{R}^\ast\cap\mathcal{R}_i$. Moreover, since $\text{supp}(\mathcal{R}^\ast)=\bigcup_{r\in\mathcal{R}^\ast}\text{supp}(r)$, the second part of Eqn. (4) implies that there also exists a reaction, say $r_2$, in $\mathcal{R}^\ast$ for which $\text{supp}(r_2)\cap X_i\ne\varnothing$. Now, since $r_1\in\mathcal{R}_J$ and $r_1\in\mathcal{R}_i$ it follows, by the definition of $\mathcal{R}_J$, that i cannot be in J. Now consider $r_2$. This reaction is in $\mathcal{R}^J$ and so, since i does not lie in J, we must have $\text{supp}(r_2)\cap X_i=\varnothing$. But this contradicts the choice of $r_2$. This establishes part (i).
For part (ii), suppose that $\mathcal{R}'\subseteq\mathcal{R}$ is a uRAF. It suffices to show that $\mathcal{R}'\subseteq\mathcal{R}^J$ and that $\mathcal{R}'\subseteq\mathcal{R}_J$ for the set J described in Eqn. (3); it follows that $\mathcal{R}'$ will be contained in the intersection of these two sets.
Observe that, for the set J as described in Eqn. (3), $\mathcal{R}_J$ is the set of reactions in $\mathcal{R}$ which do not lie in $\mathcal{R}_j$ for any j for which $\text{supp}(\mathcal{R}')\cap X_j\ne\varnothing$. Now, if $\mathcal{R}'$ is a uRAF then by condition (u2) in its definition, any reaction $r\in\mathcal{R}'$ must belong to $\mathcal{R}_J$. Similarly, for the choice of J as described, $\mathcal{R}^J$ is the set of reactions r in $\mathcal{R}$ for which $\text{supp}(r)\cap X_i$ is empty for all i for which $\text{supp}(\mathcal{R}')\cap X_i=\varnothing$, and so any reaction $r\in\mathcal{R}'$ must also lie in $\mathcal{R}^J$ (since $\text{supp}(r)\subseteq\text{supp}(\mathcal{R}')$). This establishes the required two containments, and so part (ii).
For part (iii), we have shown by part (i) that nonempty sets of the form $s(\mathcal{R}^J\cap\mathcal{R}_J)$ are uRAFs, so we need to check that all maximal uRAFs are of this form. Suppose that $\mathcal{R}'$ is a maximal uRAF. Then by part (ii) we know that $\mathcal{R}'\subseteq\mathcal{R}^J\cap\mathcal{R}_J$ for the choice of J given by Eqn. (3). Now, $s(\mathcal{R}')=\mathcal{R}'$, so $s(\mathcal{R}^J\cap\mathcal{R}_J)$ contains $\mathcal{R}'$ and, by part (i), is a uRAF; since $\mathcal{R}'$ is assumed maximal, these two uRAFs must coincide. Part (iii) now follows. □
Corollary 1
Given a chemical reaction system $\mathcal{Q}=(X,\mathcal{R},C)$ with food set F, together with a family $\{(X_i,\mathcal{R}_i):i\in[k]\}$ of inhibition pairs, there is an algorithm for constructing one (or all) maximal uRAFs (or determining that no uRAF exists) in time $2^kp(n)$ where p is a polynomial in the size n of $\mathcal{Q}$.
Proof
Simply apply the RAF algorithm to compute $s(\mathcal{R}^J\cap\mathcal{R}_J)$ for all $2^k$ subsets J of [k]. □
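The enumeration in this proof can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: reactions are hashable tuples `(reactants, products, catalysts)`, `pairs[i]` holds the inhibition pair `(X_i, R_i)`, the set $\mathcal{R}_J$ is taken to be the reactions lying in no $\mathcal{R}_j$ with $j\in J$, the set $\mathcal{R}^J$ is taken to be the reactions whose support avoids $X_i$ for every $i\notin J$, and `max_raf` is a hypothetical stand-in for the RAF algorithm. The final filter, which keeps only inclusion-maximal results, is an added safeguard.

```python
from itertools import combinations

def closure(food, reactions):
    """Molecules producible from the food set using only the given reactions."""
    available, changed = set(food), True
    while changed:
        changed = False
        for reactants, products, _ in reactions:
            if set(reactants) <= available and not set(products) <= available:
                available |= set(products)
                changed = True
    return available

def max_raf(food, reactions):
    """Maximal RAF subset of `reactions`, or an empty list."""
    current = list(reactions)
    while True:
        avail = closure(food, current)
        kept = [r for r in current
                if set(r[0]) <= avail and any(c in avail for c in r[2])]
        if len(kept) == len(current):
            return kept
        current = kept

def supp(r):
    """Support of a single reaction: its reactants and products."""
    return set(r[0]) | set(r[1])

def maximal_urafs(food, reactions, pairs):
    """Compute s(R^J ∩ R_J) for every subset J of the k inhibition pairs."""
    k, found = len(pairs), []
    for size in range(k + 1):
        for J in combinations(range(k), size):
            candidate = [
                r for r in reactions
                if all(r not in pairs[j][1] for j in J)            # r in R_J
                and all(not (supp(r) & set(pairs[i][0]))           # r in R^J
                        for i in range(k) if i not in J)
            ]
            raf = max_raf(food, candidate)
            if raf and raf not in found:
                found.append(raf)
    # Keep only results not strictly contained in another result.
    return [a for a in found if not any(set(a) < set(b) for b in found)]

food = {"00", "01", "10", "11"}
r1 = (("00", "01"), ("0001",), ("00",))
r2 = (("10", "11"), ("1011",), ("10",))
pairs = [({"1011"}, {r1})]                 # 1011 inhibits r1
print(maximal_urafs(food, [r1, r2], pairs) == [[r1], [r2]])  # True
```

In this small example the two reactions cannot coexist in a uRAF, so the sketch returns two separate maximal uRAFs, one per subset J.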
Remark
In contrast to ordinary RAFs, uRAFs need not be closed under union, i.e., if $\mathcal{R}'$ and $\mathcal{R}''$ are two uRAFs then $\mathcal{R}'\cup\mathcal{R}''$ may fail to be a uRAF. Thus, in general, a CRS may have several maximal uRAFs, while there is always a unique maximal RAF.
So, this extension of our algorithm shows that, even though the general problem of finding RAF sets under inhibition is NP-complete, we can still deal with specific situations (such as when the number of inhibitors is limited) in a relatively efficient way. In the next section, we formulate another extension, or rather a generalization, of our RAF algorithm, which indicates that it can also be applied to problems outside of the context of chemistry and origin of life.
Part III: A generalization
The original RAF algorithm is specifically formulated in the context of chemical reaction systems. However, it is also possible to state the algorithm in a more generalized form. This may be useful for (i) understanding its relationship to other algorithms, and (ii) extending it in further directions, both within the context of chemical reaction systems as well as for other applications (e.g., in economics, as already speculated in[19]).
Suppose we have arbitrary (finite or infinite) sets Y, W, where W has a partial order ≤ (for example, take W to be the set of subsets of some set, partially ordered by set inclusion; as discussed later, this applies in the RAF setting), and functions
$$f:2^Y\to W\quad\text{and}\quad g:Y\to W$$
(here $2^Y$ refers to the set of all subsets of Y). Consider the function $\psi:2^Y\to2^Y$ which is determined by f and g according to the following rule:
$$\psi(A):=\{y\in A:g(y)\le f(A)\}$$
for each subset A of Y. Note that $\psi(A)\subseteq A$ for all $A\in2^Y$.
Definition
We say that a subset A of Y is gf-compatible if it is nonempty and satisfies the property that $g(y)\le f(A)$ for all $y\in A$.
For a subset A of Y, and $k\ge1$, define $\psi^{(k)}(A)$ to be the result of applying function ψ iteratively k times starting with A. Thus, $\psi^{(1)}(A)=\psi(A)$ and for $k\ge1$, $\psi^{(k+1)}(A)=\psi(\psi^{(k)}(A))$. Notice that the sequence $(\psi^{(k)}(A),k\ge1)$ is a nested, decreasing sequence of subsets of Y, and so we may define
$$\overline{\psi}(A):=\bigcap_{k\ge1}\psi^{(k)}(A),$$
which is a (possibly empty) subset of Y. Moreover, if Y is finite, then $\overline{\psi}(Y)=\psi^{(k)}(Y)$ for some $k\le|Y|$.
To state the main result of this section, we recall two more standard definitions. A set A∈2^{Y} is a fixed point of ψ if ψ(A) = A; and f is monotone if it satisfies the property A_{1} ⊆ A_{2} ⇒ f(A_{1}) ≤ f(A_{2}).
Theorem 2
Given sets Y and W, where W is partially ordered, together with functions $f:2^Y\to W$ and $g:Y\to W$, the following hold:
(i) The gf-compatible subsets of Y are precisely the nonempty subsets of Y that are fixed points of ψ;
(ii) $\overline{\psi}(Y)$ is gf-compatible, provided it is nonempty; moreover, it contains all gf-compatible subsets of Y provided that f is monotone. In particular, when f is monotone, there exists a gf-compatible subset of Y if and only if $\overline{\psi}(Y)$ is nonempty.
Proof
If a subset A of Y is nonempty and $A=\psi(A)$ then $A=\{y\in A:g(y)\le f(A)\}$ and so A is gf-compatible. Conversely, if $A\ne\psi(A)$ then, since $\psi(A)\subset A$, there exists $y\in A$ so that g(y) is not dominated by f(A) in the partial order. Thus A is not gf-compatible. This establishes part (i).
For part (ii), let $B=\overline{\psi}(Y)$. Then $\psi(B)=\psi(\overline{\psi}(Y))=\overline{\psi}(Y)=B$, so $\overline{\psi}(Y)$ is a fixed point of ψ, and so, by part (i), is gf-compatible provided B is nonempty. Also, if f is monotone, and $A_1\subseteq A_2$, then $\psi(A_1)$ equals
$$\{y\in A_1:g(y)\le f(A_1)\}\subseteq\{y\in A_2:g(y)\le f(A_1)\}\subseteq\{y\in A_2:g(y)\le f(A_2)\},$$
and this last set is $\psi(A_2)$, so ψ is monotone as a function from $2^Y$ to the set $2^Y$ partially ordered under set inclusion. Thus, if $B'$ is any gf-compatible set then, by part (i), $B'$ is a fixed point of ψ and so, since $B'\subseteq Y$, we have $B'=\psi(B')\subseteq\psi(Y)$ and, by iteration of ψ, $B'\subseteq\overline{\psi}(Y)$, as claimed. The remaining claim in part (ii) now follows directly. □
An algorithm
Theorem 2 has the following immediate consequence when Y is finite, and f is monotone. In this case, consider the following ‘gf−algorithm’. Starting with Y, compute the sequence ψ^{(k)}(Y) until it stabilizes. If this set is empty, then report that no gf−compatible subset of Y exists; otherwise output the stable set $\overline{\psi}(Y)$, which is the unique maximal gf−compatible subset of Y. Provided that for each subset A of Y, and element y∈Y, the values f(A) and g(y) can be calculated in time polynomial in |Y|, this algorithm runs in time polynomial in |Y|. Notice that the algorithm begins with the set Y and iteratively removes subsets of elements, until eventually arriving at a nonempty set $\overline{\psi}(Y)$ from which nothing further can be removed, or until all the elements of Y are eliminated.
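The iteration just described can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the names `psi`, `gf_algorithm` and `leq` (the comparison in the partial order on W) are our own, and the toy choice f(A) = |A|, g(y) = y on a set of integers is a hypothetical example picked because f is then clearly monotone.

```python
from typing import Callable, FrozenSet, Iterable


def psi(A: FrozenSet, f: Callable, g: Callable, leq: Callable) -> FrozenSet:
    """One application of psi: keep the y in A with g(y) <= f(A)."""
    fA = f(A)
    return frozenset(y for y in A if leq(g(y), fA))


def gf_algorithm(Y: Iterable, f: Callable, g: Callable, leq: Callable) -> FrozenSet:
    """Iterate psi from Y until the set stops shrinking.

    Returns the stable set; an empty result means that no
    gf-compatible subset exists (assuming f is monotone).
    """
    current = frozenset(Y)
    while current:
        nxt = psi(current, f, g, leq)
        if nxt == current:  # fixed point of psi reached
            return current
        current = nxt
    return current  # empty set


# Hypothetical toy instance: W = integers with the usual order,
# f(A) = |A| (monotone under inclusion), g(y) = y.
result = gf_algorithm({1, 2, 4, 5}, f=len, g=lambda y: y, leq=lambda a, b: a <= b)
print(sorted(result))  # [1, 2]
```

In the toy run, 5 is removed first (5 > 4), then 4 (4 > 3), after which {1, 2} is a fixed point of ψ.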
Relationship to the original RAF algorithm
First a simple observation: If a reaction r is catalyzed by k ≥ 1 molecules, then we can replace it (formally) by k copies of this reaction, each of which is catalyzed by just one of the k molecules. This way we get a set of reactions, each of which is catalyzed by exactly one molecule. We can thus think of this catalyst as an additional reactant and so the reaction proceeds precisely if all the ‘reactants’ are present – formally this is cleaner than saying “all the reactants and at least one catalyst are present”. In fact, the implementation of our RAF algorithm is actually based on this idea. We call this ‘cleaner’ version the expanded CRS, and the catalyst chosen for any given reaction the nominated catalyst. In this expanded CRS, given a reaction r, let ρ(r) denote the set of reactants plus the nominated catalyst of this reaction. We now describe how Theorem 2 and the gf−algorithm apply.
Given a CRS $(X,\mathcal{R},C)$ and food set F ⊆ X, take Y to be the set of all reactions in the expanded CRS, take M = X, the set of all molecules, and take W = 2^{M}, partially ordered under set inclusion. For our choice of the function f we set f(A) = cl_{A}(F), where cl_{A}(F) is the closure of the food set F under a subset A of reactions in the expanded CRS; this is the set of all molecules in X that can be constructed from F by repeatedly applying just those reactions that lie in A (and allowing any reaction in A to proceed even if the nominated catalyst is not present). Finally, we set g(r) = ρ(r) (in the expanded model, so ρ(r) includes the nominated catalyst). Then the gf−compatible subsets of Y correspond exactly to the RAFs in the expanded CRS under the recent modified definition of RAF[17], and $\overline{\psi}(Y)$ is just what we call s(Y) (the maxRAF for the expansion Y of $\mathcal{R}$). Theorem 2(ii) asserts this maxRAF can be found by the gf−algorithm, which is just the modified RAF algorithm[17] applied in the expanded CRS, and the fact this RAF is the unique maximal RAF follows from the fact that the function A↦cl_{A}(F) is monotone in A.
The connection described assumes that we are working within the expanded CRS setting. However, we can easily relate this back to the original CRS setting by noting that if A is a set of reactions, and A^{′} is the expanded version (replacing each reaction by k copies each with a unique nominated catalyst) then cl_{ A }(F) (in the original setting) coincides with${\text{cl}}_{{A}^{\prime}}\left(F\right)$ (in the expanded setting). Moreover, (i) for any RAF A in the original setting, in the expansion of A there is a subset (selecting an appropriate nominated catalyst for each reaction) that is an RAF in the expanded CRS, and (ii) for any RAF A^{′} in the expanded CRS, replacing the nominated catalyst of each reaction by its full complement of catalysts returns an RAF A in the original CRS.
Notice that, apart from the monotonicity of the function f(A) = cl_{A}(F), a major factor that helps in guaranteeing a polynomial-time algorithm in the RAF setting is that f(A) can be computed efficiently.
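To make the correspondence concrete, the following Python sketch instantiates the gf−algorithm with f(A) = cl_{A}(F) and g(r) = ρ(r) on an expanded CRS. It is a minimal illustration under our own assumptions, not the authors' RAF code: the `Reaction` type, the function names `expand`, `closure` and `max_raf`, and the three-reaction toy network are all hypothetical.

```python
from typing import FrozenSet, Iterable, List, NamedTuple, Set


class Reaction(NamedTuple):
    name: str
    reactants: FrozenSet[str]
    products: FrozenSet[str]
    catalysts: FrozenSet[str]


def expand(reactions: Iterable[Reaction]) -> List[Reaction]:
    """Expanded CRS: one copy of each reaction per (nominated) catalyst."""
    return [Reaction(f"{r.name}|{c}", r.reactants, r.products, frozenset({c}))
            for r in reactions for c in sorted(r.catalysts)]


def closure(food: Iterable[str], reactions: Iterable[Reaction]) -> Set[str]:
    """cl_A(F): molecules constructible from the food set, ignoring catalysts."""
    reactions = list(reactions)
    molecules = set(food)
    changed = True
    while changed:
        changed = False
        for r in reactions:
            if r.reactants <= molecules and not r.products <= molecules:
                molecules |= r.products
                changed = True
    return molecules


def max_raf(food: Iterable[str], reactions: Iterable[Reaction]) -> Set[Reaction]:
    """gf-algorithm with f(A) = cl_A(F) and g(r) = reactants + nominated catalyst."""
    current = set(expand(reactions))
    while current:
        available = closure(food, current)
        kept = {r for r in current if (r.reactants | r.catalysts) <= available}
        if kept == current:
            break
        current = kept
    return current


# Hypothetical toy network: r3 needs a catalyst that can never be produced.
food = {"a", "b"}
network = [
    Reaction("r1", frozenset({"a", "b"}), frozenset({"ab"}), frozenset({"abb"})),
    Reaction("r2", frozenset({"ab", "b"}), frozenset({"abb"}), frozenset({"ab"})),
    Reaction("r3", frozenset({"a"}), frozenset({"aa"}), frozenset({"bbb"})),
]
print(sorted(r.name for r in max_raf(food, network)))  # ['r1|abb', 'r2|ab']
```

The closure is recomputed from scratch in each round, which keeps the sketch short; since each round removes at least one reaction, the number of rounds is at most the number of reactions in the expanded CRS.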
Novel and alternative applications
We now present a simple application of Theorem 2 in a toy economic setting. Suppose Y is a collection of individuals, each of whom produces or consumes different types of “goods”, labeled 1, 2, …, k. For an individual y∈Y, let g_{i}(y) be the maximum price individual y is able to pay for good i and let ${f}_{i}^{\prime}(y)$ be the minimal price for which individual y is willing to produce good i. To allow greater generality, if individual y does not need good i we can just set g_{i}(y) = 0 and if individual y does not produce good i we can just set ${f}_{i}^{\prime}(y)=\infty$. We assume that individuals can produce and sell as many goods as they wish (i.e. the individuals who are buying are not competing for a fixed number of items from any one seller).
We define a subset A of Y as viable if (i) it is nonempty, and (ii) every individual in A can afford to buy each good they need from at least one individual in A. We can formalize this as a gf−compatibility condition as follows.
Let $W=(\mathbb{R}\cup\{\infty\})^{k}$ (i.e., k-dimensional Euclidean space with infinity added to each coordinate) partially ordered in the usual way: (x_{1},…,x_{k}) ≤ (y_{1},…,y_{k}) if and only if x_{i} ≤ y_{i} for all i. Note that in this example W is not a collection of subsets of a set (as in the RAF setting). Further, let g(y):=(g_{1}(y),…,g_{k}(y))∈W, and for a set A of individuals (i.e. A∈2^{Y}) let
Then a subset A of Y is viable precisely if for each i and each y∈A, $g_{i}(y)\le \max\{f_{j}^{\prime}(A): j=1,\dots,k\}$, which is equivalent to g(y) ≤ f(A) for all y ∈ A. In other words, A is viable if and only if A is gf−compatible. Moreover, notice that f is monotone, and so Theorem 2(ii) applies: if there is a viable set, then there is a unique maximal one, and it can be found in polynomial time in the size of the population by using the gf−algorithm.
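As a rough illustration of the toy economy, the sketch below works directly from the verbal definition of viability (every individual in the set can afford each good they need from at least one individual in the set) rather than from the vector-valued f and g. Several things here are our own assumptions and not part of the formal setup: goods an individual does not need or does not produce are simply omitted from the dictionaries, an individual may buy from themselves, and the names `can_afford`, `max_viable`, `needs`, `offers` and the four individuals are hypothetical.

```python
import math
from typing import Dict, Set

Prices = Dict[str, Dict[str, float]]


def can_afford(y: str, group: Set[str], needs: Prices, offers: Prices) -> bool:
    """True if y can buy every good it needs from some seller in `group`."""
    return all(
        any(offers.get(seller, {}).get(good, math.inf) <= budget for seller in group)
        for good, budget in needs.get(y, {}).items()
    )


def max_viable(needs: Prices, offers: Prices) -> Set[str]:
    """Repeatedly remove individuals who cannot afford some needed good."""
    current = set(needs) | set(offers)
    while current:
        kept = {y for y in current if can_afford(y, current, needs, offers)}
        if kept == current:
            break
        current = kept
    return current


# Hypothetical population. needs[y][good] is the most y can pay for a good;
# offers[y][good] is the least y will accept for producing it.
needs = {
    "ann": {"fish": 5.0},
    "bob": {"bread": 3.0},
    "cy": {"bread": 1.0},
    "dee": {"wine": 2.0},
}
offers = {
    "ann": {"bread": 2.0},
    "bob": {"fish": 4.0},
    "cy": {"wine": 1.0},
}
print(sorted(max_viable(needs, offers)))  # ['ann', 'bob']
```

In this run cy is removed first (bread costs 2 but cy can pay only 1), and dee is removed in the next round because the only wine seller has gone, which mirrors the way the gf−algorithm removes elements in successive rounds.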
This provides a (simple) example of how the gf−algorithm can be applied in other contexts, such as economics. This is a first concrete step towards a generalized theory of autocatalytic sets, as we recently proposed[19].
As a further, and rather different, application we point out that the gf−algorithm also provides a polynomial-time solution to HORNSAT, which is a basic problem in propositional logic, of deciding whether a given conjunction of Horn clauses is satisfiable[30]. Recall that a Horn clause is a clause with at most one positive literal, and any number of negative literals (a literal being a boolean variable which can be either ‘true’ or ‘false’). HORNSAT is of interest as it is ‘P-complete’ (i.e. not only is it in the complexity class P of problems having polynomial-time solutions, but every problem in the complexity class P can be reduced to HORNSAT).
Suppose then, that we have an instance of HORNSAT consisting of a conjunction of a set $\mathcal{H}$ of n Horn clauses. Without loss of generality we will assume that not all the clauses in $\mathcal{H}$ contain a positive literal: the excluded case, in which every clause contains a positive literal, is equivalent to the condition that assigning each literal the truth value ‘true’ satisfies every clause in $\mathcal{H}$, and this can be easily checked. We indicate this restriction by saying that $\mathcal{H}$ is a proper instance of HORNSAT. Now we define the sets and functions we will use in the generalized RAF setup. We take $W={2}^{\mathcal{H}}$ with the usual partial order on subsets. Let Y denote the set of all literals appearing in at least one clause in $\mathcal{H}$ (as a positive or negative literal). For a subset A of Y let f(A) be the set of clauses in $\mathcal{H}$ that contain at least one element of A as a negative literal. For y∈Y, let g(y) be the set of clauses in $\mathcal{H}$ which either contain y as a positive literal or else do not contain any positive literals. The following connection with gf−compatibility is established in the Appendix.
Lemma 1
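For a proper instance $\mathcal{H}$ of HORNSAT, a subset A of Y is gf−compatible if and only if the following truth assignment satisfies every clause in $\mathcal{H}$:
$$y=\begin{cases}\text{‘true’}, & \text{if } y\in Y-A;\\ \text{‘false’}, & \text{if } y\in A.\end{cases}\qquad (5)$$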
By Lemma 1, and the fact that f is monotone, we can invoke Theorem 2(ii) and deduce that the gf−algorithm determines whether or not a proper instance of HORNSAT has a satisfying assignment, and if it does, it will construct the truth assignment that has a minimal set of literals set to ‘true’. This may all seem rather technical and irrelevant to chemistry, but it actually shows that a very specific algorithm that was inspired by and constructed for solving a chemical problem in the context of the origin of life (finding autocatalytic sets in chemical reaction systems), turns out to be capable (in its generalized form) of solving any problem that is within the problem class P. This is a surprising and interesting result from an algorithmic point of view, and could perhaps lead to another application of molecular computation[31].
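The HORNSAT construction can be sketched as follows. This is an illustrative sketch only: the `HornClause` representation and the function name `horn_sat` are our own assumptions, clauses are identified by their index in the input list, and the improper case (every clause has a positive literal) is handled separately by the all-‘true’ assignment, as in the text.

```python
from typing import Dict, FrozenSet, List, NamedTuple, Optional, Set


class HornClause(NamedTuple):
    negative: FrozenSet[str]   # variables that occur negated in the clause
    positive: Optional[str]    # the single positive literal, or None


def horn_sat(clauses: List[HornClause]) -> Optional[Dict[str, bool]]:
    """Return a satisfying assignment found via gf-compatibility, or None."""
    variables: Set[str] = set()
    for c in clauses:
        variables |= c.negative
        if c.positive is not None:
            variables.add(c.positive)

    # Improper instance: every clause has a positive literal, so all-true works.
    if all(c.positive is not None for c in clauses):
        return {v: True for v in variables}

    def f(A: Set[str]) -> Set[int]:
        # Clauses containing at least one element of A as a negative literal.
        return {i for i, c in enumerate(clauses) if c.negative & A}

    def g(y: str) -> Set[int]:
        # Clauses with y as positive literal, or with no positive literal.
        return {i for i, c in enumerate(clauses)
                if c.positive == y or c.positive is None}

    current = set(variables)
    while current:
        fA = f(current)
        kept = {y for y in current if g(y) <= fA}
        if kept == current:
            break
        current = kept

    if not current:
        return None  # no gf-compatible subset, hence unsatisfiable
    # Variables in the maximal gf-compatible set are 'false', the rest 'true'.
    return {v: v not in current for v in variables}


# Hypothetical instance: (a) AND (NOT a OR b) AND (NOT b OR NOT c).
example = [
    HornClause(frozenset(), "a"),
    HornClause(frozenset({"a"}), "b"),
    HornClause(frozenset({"b", "c"}), None),
]
print(horn_sat(example))
```

For the example instance the iteration removes a, then b, and stabilizes at {c}, so c is set to ‘false’ and a and b to ‘true’.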
Conclusions
In our previous work, we already showed (both computationally and theoretically) that autocatalytic (RAF) sets are highly likely to exist. However, most of these results were based on graph-theoretical properties of RAF sets. Here, we have shown that, also in terms of dynamics, such sets are indeed self-sustainable and can outcompete non-autocatalytic sets. Furthermore, these dynamical results confirm arguments made previously[19] about how RAF subsets can enable their own growth or give rise to other such subsets coming into existence.
Next, the extension described here of our RAF algorithm shows that more realistic scenarios (such as including inhibition) can also be dealt with within our framework. Despite the fact that the general problem of finding RAF sets when inhibition is present is NP-complete, in specific cases (such as when the number of inhibitors is not too large) it is still possible to detect RAF sets efficiently, due to our proof of this problem being fixed parameter tractable.
Finally, the generalization of our RAF algorithm shows that it can even be applied to areas outside of chemistry and origin of life, such as economics. This is an important first step towards a generalized theory of autocatalytic sets, as proposed in[19]. And, perhaps, it could lead to another application of molecular computation.
Of course there are still many further extensions possible. In terms of dynamics, a next step could be to consider multiple, possibly competing, compartments each having some (different) combination of subRAFs existent within them. This could then give rise to an evolutionary process along the lines of[23]. Also, it would be interesting to find further applications of the gf−algorithm outside of chemistry. We hope to work on some of these further extensions and generalizations in the future.
Appendix: Proof of Lemma 1
First suppose that A is gf−compatible, and the truth assignment is as specified. Consider a clause $c\in \mathcal{H}$. There are three possibilities:
1. If c contains a positive literal that is not in A then c is satisfied, since that positive literal is assigned the value ‘true’ under (5).
2. If c contains a positive literal y in A then c ∈ f(A) (since c ∈ g(y) and g(y) ⊆ f(A), as y ∈ A). Hence c contains a negated literal from A, which is assigned ‘false’ under (5), and so c is satisfied.
3. If c contains no positive literal, then c is contained in g(y) for any y ∈ A (and there exists at least one such y since A is nonempty), and so the condition g(y) ⊆ f(A) (for y ∈ A) implies, once again, that c lies in f(A), and so c is satisfied under (5).
Thus all clauses in $\mathcal{H}$ are satisfied.
Conversely, suppose the truth assignment determined by some set A according to (5) satisfies every clause in $\mathcal{H}$. Then A cannot be the empty set: otherwise every literal would be set to ‘true’, so every clause in $\mathcal{H}$ would have to contain a positive literal, and $\mathcal{H}$ would not be proper. We wish to show that g(y)⊆f(A) for all y∈A. Consider a clause c∈g(y). Then, by definition of g, either (i) c has no positive literal, or (ii) c has a positive literal and it is y, which lies in A. In case (i), the assumption that c is satisfied implies that at least one of the negated literals in c is set to ‘false’, which means one of these literals must be in the set A. Consequently c∈f(A). Similarly, in case (ii), since the positive literal y ∈ A is set to ‘false’, at least one of the negated literals in c must be set to ‘false’, which again requires this literal to lie in A, and hence c∈f(A). Thus g(y)⊆f(A) for all y∈A, as required.
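The statement of Lemma 1 can also be checked exhaustively on a small instance. The Python sketch below is only a sanity check on one hypothetical proper instance, not a proof: it enumerates every subset A of the variables and compares gf−compatibility with satisfaction under the assignment that sets the variables in A to ‘false’ and all others to ‘true’. All names in it are our own.

```python
from itertools import combinations

# Hypothetical proper instance: (a) AND (NOT a OR b) AND (NOT b OR NOT c).
# Each clause is (set of negated variables, positive variable or None).
clauses = [
    (frozenset(), "a"),
    (frozenset({"a"}), "b"),
    (frozenset({"b", "c"}), None),
]
variables = sorted({v for neg, _ in clauses for v in neg}
                   | {pos for _, pos in clauses if pos is not None})


def f(A):
    """Indices of clauses containing some element of A as a negative literal."""
    return {i for i, (neg, _) in enumerate(clauses) if neg & A}


def g(y):
    """Indices of clauses with y as positive literal or with no positive literal."""
    return {i for i, (_, pos) in enumerate(clauses) if pos == y or pos is None}


def is_gf_compatible(A):
    return bool(A) and all(g(y) <= f(A) for y in A)


def satisfies(A):
    """Assignment: variables in A are 'false', all other variables are 'true'."""
    return all((pos is not None and pos not in A) or bool(neg & A)
               for neg, pos in clauses)


subsets = [frozenset(c) for k in range(len(variables) + 1)
           for c in combinations(variables, k)]
mismatches = [A for A in subsets if is_gf_compatible(A) != satisfies(A)]
compatible = [A for A in subsets if is_gf_compatible(A)]
print(mismatches, compatible)
```

On this instance the two conditions agree on all eight subsets, and the only gf−compatible subset is {c}, matching the unique satisfying assignment a = b = ‘true’, c = ‘false’.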
References
1. Kauffman SA: Cellular homeostasis, epigenesis and replication in randomly aggregated macromolecular systems. J Cybernetics 1971, 1: 71–96. 10.1080/01969727108545830
2. Eigen M, Schuster P: The hypercycle: a principle of natural self-organization. Part A: Emergence of the hypercycle. Naturwissenschaften 1977, 64: 541–565. 10.1007/BF00450633
3. Dyson FJ: A model for the origin of life. J Mol Evolution 1982, 18: 344–350. 10.1007/BF01733901
4. Wächtershäuser G: Evolution of the first metabolic cycles. PNAS 1990, 87: 200–204. 10.1073/pnas.87.1.200
5. Gánti T: Biogenesis itself. J Theor Biol 1997, 187: 583–593. 10.1006/jtbi.1996.0391
6. Rosen R: Life Itself. Columbia University Press, New York; 1991.
7. Letelier JC, Soto-Andrade J, Abarzúa FG, Cornish-Bowden A, Cárdenas ML: Organizational invariance and metabolic closure: Analysis in terms of (M;R) systems. J Theor Biol 2006, 238: 949–961. 10.1016/j.jtbi.2005.07.007
8. Sievers D, von Kiedrowski G: Self-replication of complementary nucleotide-based oligomers. Nature 1994, 369: 221–224. 10.1038/369221a0
9. Ashkenasy G, Jagasia R, Yadav M, Ghadiri MR: Design of a directed molecular network. PNAS 2004, 101(30): 10872–10877. 10.1073/pnas.0402674101
10. Hayden EJ, von Kiedrowski G, Lehman N: Systems chemistry on ribozyme self-construction: Evidence for anabolic autocatalysis in a recombination network. Angew Chem Int Ed 2008, 120: 8552–8556. 10.1002/ange.200802177
11. Taran O, Thoennessen O, Achilles K, von Kiedrowski G: Synthesis of information-carrying polymers of mixed sequences from double stranded short deoxynucleotides. J Syst Chem 2010, 1: 9. 10.1186/1759-2208-1-9
12. Braakman R, Smith E: The emergence and early evolution of biological carbon-fixation. PLoS Comput Biol 2012, 8(4): e1002455. 10.1371/journal.pcbi.1002455
13. Steel M: The emergence of a self-catalysing structure in abstract origin-of-life models. Appl Mathematics Lett 2000, 3: 91–95.
14. Hordijk W, Steel M: Detecting autocatalytic, self-sustaining sets in chemical reaction systems. J Theor Biol 2004, 227(4): 451–461. 10.1016/j.jtbi.2003.11.020
15. Mossel E, Steel M: Random biochemical networks: The probability of self-sustaining autocatalysis. J Theor Biol 2005, 233(3): 327–336. 10.1016/j.jtbi.2004.10.011
16. Hordijk W, Hein J, Steel M: Autocatalytic sets and the origin of life. Entropy 2010, 12(7): 1733–1742. 10.3390/e12071733
17. Hordijk W, Kauffman SA, Steel M: Required levels of catalysis for emergence of autocatalytic sets in models of chemical reaction systems. Int J Molecular Sciences 2011, 12(5): 3085–3101. 10.3390/ijms12053085
18. Hordijk W, Steel M: Predicting template-based catalysis rates in a simple catalytic reaction model. J Theor Biol 2012, 295: 132–138.
19. Hordijk W, Steel M, Kauffman S: The structure of autocatalytic sets: Evolvability, enablement, and emergence. Acta Biotheoretica 2012, 60(4): 379–392. 10.1007/s10441-012-9165-1
20. Kauffman SA: Autocatalytic sets of proteins. J Theor Biol 1986, 119: 1–24. 10.1016/S0022-5193(86)80047-9
21. Kauffman SA: The Origins of Order. Oxford University Press, New York; 1993.
22. Andersen JL, Flamm C, Merkle D, Stadler PF: Maximizing output and recognizing autocatalysis in chemical reaction networks is NP-complete. J Syst Chem 2012, 3: 1. 10.1186/1759-2208-3-1
23. Vasas V, Fernando C, Santos M, Kauffman S, Szathmáry E: Evolution before genes. Biol Direct 2012, 7: 1. 10.1186/1745-6150-7-1
24. Filisetti A, Graudenzi A, Serra R, Villani M, Füchslin RM, Kauffman SA, Packard N, Poli I, De Lucrezia D: A stochastic model of the emergence of autocatalytic cycles. J Syst Chem 2011, 2: 2. 10.1186/1759-2208-2-2
25. Gillespie DT: A general method for numerically simulating the stochastic time evolution of coupled chemical reactions. J Comput Phys 1976, 22: 403–434. 10.1016/0021-9991(76)90041-3
26. Gillespie DT: Exact stochastic simulation of coupled chemical reactions. J Physical Chem 1977, 81(25): 2340–2361. 10.1021/j100540a008
27. Segré D, Ben-Eli D, Deamer DW, Lancet D: The lipid world. Origin of Life and Evol Biospheres 2001, 31(1–2): 119–145.
28. Martin W, Russell MJ: On the origins of cells: A hypothesis for the evolutionary transition from abiotic geochemistry to chemoautotrophic prokaryotes, and from prokaryotes to nucleated cells. Philos Trans R Soc B 2003, 358: 59–85. 10.1098/rstb.2002.1183
29. Martin W, Russell MJ: On the origin of biochemistry at an alkaline hydrothermal vent. Philos Trans R Soc B 2007, 362: 1887–1925. 10.1098/rstb.2006.1881
30. Papadimitriou CH: Computational complexity. Addison Wesley, Boston; 1994.
31. Adleman LM: Molecular computation of solutions to combinatorial problems. Science 1994, 266(5187): 1021–1024. 10.1126/science.7973651
Acknowledgements
MS thanks the Royal Society of New Zealand for funding support. We thank Stuart Kauffman for helpful and stimulating discussions.
Additional information
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
WH implemented the algorithms and models (RAF, Gillespie, and binary polymer model) and performed the simulations and analysis. MS formulated and proved the mathematical theorems and algorithm extension and generalization. Both authors wrote the paper and approved the final version.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Keywords
 Truth Assignment
 Horn Clause
 Molecule Type
 Uncatalyzed Reaction
 Reaction Graph